
A Brief Survey of Machine Learning Methods
Abstract—This paper provides a brief survey of the basic concepts and algorithms used for Machine Learning and its applications. We begin with a broader definition of machine learning and then introduce various learning modalities, including supervised and unsupervised methods and deep learning paradigms. In the rest of the paper, we discuss applications of machine learning algorithms in various fields including pattern recognition, sensor networks, anomaly detection, the Internet of Things (IoT) and health monitoring. In the final sections, we present some of the software tools and an extensive bibliography.
I. INTRODUCTION

Machine Learning [1-10,89], as described by Arthur Samuel in 1959 [11], is a "field of study that gives computers the ability to learn without being explicitly programmed." In 1997, Tom Mitchell [12] gave a more formal definition, namely: "A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."

Although the term machine learning has its origins in computer science, several vector quantization methods [106] were developed in telecommunications and signal processing for coding and compression [105]. In computer and data science, learning is accomplished based on examples (data samples) and experience. A basic signal/data processing [86-88,90] framework that includes pre-processing, noise removal and segmentation is shown in Figure 1, where the signal is acquired from the sensor and then processed, typically in a frame-by-frame or batch mode [94]. Noise removal and feature extraction follow next, and finally the classification stage, which provides either an estimate or a decision, completes the process.
Figure 1: Basic signal processing framework including pre-processing, feature extraction and classification.

Typically, the feature extraction stage extracts compact, information-bearing parameters that characterize the data. The classification stage then has to be trained by a machine learning algorithm to recognize and classify the collection of features. The field of machine learning is vast, and applications are expanding rapidly, especially with the emergence of fast mobile devices that also have access to cloud computing [108]. Compressing and extracting information from sensors and big data have recently elevated interest in the area. Smart city projects, mobile health monitoring, networked security, manufacturing, self-driven automobiles, surveillance, and intelligent border control: every application has its idiosyncrasies and requires customized features, adaptive learning, and data fusion. Data compression and statistical signal and data analysis play a large role in transmitting and interpreting data and in producing meaningful analytics. Machine learning algorithms can be broadly classified into three categories based on their properties, style of learning, and the way data are used [13]: supervised, unsupervised and semi-supervised algorithms. This type of classification is important in identifying the role of the input data and the utility of the algorithms and learning models relative to the applications.

II. SUPERVISED LEARNING

In supervised learning, "true" or "correct" labels of the input dataset are available. The algorithm is "trained" using the labeled input dataset (training data), which means ground truth samples are available for training. In the training process, the algorithm makes predictions on the input data and improves its estimates using the ground truth, reiterating until it reaches a desired level of accuracy. In almost all machine learning algorithms, we optimize a cost function or an objective function. The cost function is typically a measure of the error between the ground truth and the algorithm estimates. By minimizing the cost function, we train our model to produce estimates that are close to the correct values (ground truth). Minimization of the cost function is usually achieved using the gradient descent technique [116-118,121,122]. Variants of gradient descent, such as stochastic gradient descent on a minibatch, momentum-based gradient descent [123,124], and Nesterov accelerated gradient descent [119], have been used in many applications.

Suppose we have 'm' training examples, each of which is labelled and can be represented as a pair (x, y), where x represents the input data and y represents the class label. The input data x can be n-dimensional, where each dimension corresponds to a feature or variable. Supervised learning methods are used in various fields including the identification of phytoplankton species [14], mapping rainfall-induced landslides [15], and classification of biomedical data [16]. In [91], a machine learning algorithm is integrated on an embedded sensor system for IoT applications. In the following sub-sections, we present supervised learning algorithms.
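To make the training loop concrete, the following minimal Python/NumPy sketch (an illustrative addition of ours, not code from the paper; the learning rate and iteration count are arbitrary assumptions) minimizes a mean-squared-error cost with batch gradient descent:

    import numpy as np

    def gradient_descent(X, y, lr=0.01, iters=1000):
        # Minimize the cost J(w) = (1/2m) * ||Xw - y||^2 over the weights w.
        m, n = X.shape
        w = np.zeros(n)                   # initial estimate of the weights
        for _ in range(iters):
            grad = X.T @ (X @ w - y) / m  # gradient of the cost function
            w -= lr * grad                # step opposite the gradient
        return w

Stochastic and minibatch variants replace the full-batch gradient with an estimate computed from a subset of the m training examples.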
A. Linear Regression

Regression [17-19] is a statistical technique for estimating the relationship between input and output variables. It maps the input variables to a continuous function. A simple univariate linear regression [20-22,24] model is shown in Figure 2.

Figure 2: A simple linear regression example with one feature/variable.

The training dataset consists of 'm' labelled training pairs (x, y) ∈ R^(n+1), where x is the independent variable and y is the dependent variable. The linear regression model assumes that the relationship between the independent and dependent variables is linear and fits a straight line to the data points. This relationship is expressed by a hypothesis (or prediction) function:

    h(x) = w0 + w1x1 + w2x2 + ... + wnxn    (1)

where x1, x2, ..., xn are the features and w0, w1, w2, ..., wn are the weights of the model. Equation (1) is for a multivariate linear regression model: the output is the linear sum of the weighted input features. The weights are typically learned by a weighted least squares minimization process. As shown in [142], an FIR filtering approach can also be used to perform linear regression through slope filtering. We can also make use of quadratic, cubic or higher polynomial [144,145] terms to obtain completely different hypothesis functions which fit quadratic [143], cubic or polynomial curves, respectively, rather than a simple straight line. Multivariate linear regression is used for several applications, including activity recognition and classification [23] and steady state visual evoked potential (SSVEP) recognition for BCI data [25,26].
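As a small illustration of learning the weights in equation (1) by least squares (our sketch, assuming a NumPy environment; the data values are made up):

    import numpy as np

    # m = 5 training pairs (x, y) with a single feature, as in Figure 2
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

    # Design matrix with a leading column of ones for the bias weight w0
    X = np.column_stack([np.ones_like(x), x])

    # Least squares solution of h(x) = w0 + w1*x
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("w0 = %.3f, w1 = %.3f" % (w[0], w[1]))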
B. Logistic Regression

The objective of a multivariate regression model is to determine a hypothesis function which outputs a continuous value. We now present another class of supervised learning algorithms, classification, in which the objective is to obtain a discrete output. Logistic regression [30,31] is a statistical way of modelling a binomial outcome. As before, the input can have one or more features (or variables). For binary logistic regression, the outcome can be 0 or 1, which performs a binary classification of the positive class versus the negative class. Logistic regression uses the sigmoid curve shown in Figure 3 to output a probability value and thus performs the classification. The hypothesis function for logistic regression is given by

    h(x) = S(w0 + w1x1 + w2x2 + ... + wnxn)    (2)

where S(·) is the sigmoid function given by

    S(z) = 1 / (1 + e^(-z))    (3)

The output of the sigmoid function is a value between 0 and 1. All values below 0.5 belong to the negative class, and values greater than or equal to 0.5 belong to the positive class. Applications of logistic regression are seen in various fields including evaluating trauma care [27], patient severity assessment [28], determining the risk of heart disease [29], early detection and recognition of glaucoma in ocular thermographs [32], and computer vision and adaptive object tracking [33]. For a multiclass classification problem, we can use a one-vs-all implementation.

Figure 3: Sigmoid curve with output bounded between 0 and 1.
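The following fragment (a sketch of ours, not from the paper) implements the hypothesis of equations (2)-(3) and the 0.5 decision threshold; the weights w are assumed to have already been learned:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))      # equation (3)

    def predict(X, w):
        # Binary logistic prediction: 1 if S(w.x) >= 0.5, else 0.
        p = sigmoid(X @ w)                   # equation (2); X includes a ones column
        return (p >= 0.5).astype(int)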
C. Support Vector Machines (SVM)

Support Vector Machines [1-4,34,35,37] are among the most popular supervised learning models, used for binary as well as multi-class classification. SVM maps the input data as points in an n-dimensional space and draws an (n−1)-dimensional hyperplane to separate the data points into two groups. This can be visualized easily for two-dimensional data points, as shown in Figure 4. From the labeled dataset, the SVM algorithm tries to divide the points into two separate groups by a hyperplane, in this case a line, such that the width of separation between the two groups is maximized. In Figure 4, 'B' is a line which just separates the two classes; however, line 'A' gives the maximum separation between the classes. The data points closest to the hyperplane (line 'A') are called support vectors.

Figure 4: Maximum margin intuition; hyperplane A has maximum separation.

The maximum margin formulation was proposed by Vapnik in 1963, and the SVM algorithm was introduced in 1992 [36]. Vapnik et al. also proposed a technique, known as the "kernel trick", to generate a non-linear hyperplane when the data is not linearly separable. The kernel trick transforms the non-linearly separable input data to a higher-dimensional (Hilbert) space, where the transformed data becomes linearly separable; the linear hyperplane is drawn in this space and transformed back into the original feature space. Many types of kernels are used in practice, including Gaussian kernels [130-134], the radial basis function [120], and the polynomial kernel [125-128]. In 1995, Vapnik and Cortes proposed the soft-margin approach [38], where the maximum margin constraint is relaxed by introducing slack variables which allow outliers of either class to be present on the other side of the hyperplane. A major advantage of SVM is that it avoids overfitting and is non-probabilistic. SVM can also be used for regression analysis as well as clustering [39-41]. The SVM algorithm is used in several applications including simple binary classification [135], text categorization [136-138], handwritten digit recognition [139-141], novelty, anomaly or outlier detection [42,43], intrusion detection [51], emotion recognition [67], stress detection [69], and noise-robust speech recognition [129]. Different variations of SVM have also been proposed, including the least squares SVM (LS-SVM) [44], the one-class SVM for anomaly detection [45-50,85], and the adaptive SVM [53].
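A minimal soft-margin SVM example using scikit-learn [114,179] (our sketch; the RBF kernel choice, the toy data, and C=1.0, which controls the slack relaxation of [38], are illustrative assumptions):

    from sklearn import svm

    # Four 2-D points with binary labels, in the spirit of Figure 4
    X = [[0, 0], [1, 1], [0, 1], [1, 0]]
    y = [0, 1, 1, 0]

    clf = svm.SVC(kernel='rbf', C=1.0)    # Gaussian/RBF kernel trick, soft margin
    clf.fit(X, y)
    print(clf.support_vectors_)           # the points closest to the hyperplane
    print(clf.predict([[0.9, 0.9]]))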
D. Naïve Bayes

Naïve Bayes [68] classifiers are simple probabilistic classifiers. The term "naïve" is used because of the strong assumption of the algorithm that all the input features are independent of each other and that no correlation exists between them. Naïve Bayes is based on Bayes' theorem. Being a probabilistic model, Naïve Bayes outputs a posterior probability of belonging to a class given the input features:

    p(ωc|x) = p(ωc|x1, x2, x3, ..., xn)    (4)

    p(ωc|x) = p(x|ωc) p(ωc) / p(x)    (5)

for each of the C possible outcomes or classes. Here, p(ωc|x) is the posterior probability that a given feature vector x belongs to the c-th class ωc; p(ωc) is the prior probability of the class ωc, independent of the data; p(x|ωc) is the likelihood, which is the probability of the predictor given the class; and p(x) is the prior probability of the predictor, which acts as a normalizing factor. There are many variations of Naïve Bayes, some of which tackle its poor assumptions [54,55,56]. The Naïve Bayes algorithm is used for text classification [57], credit scoring [58], emotion classification and recognition [67], and detection of epileptic seizures from EEG signals [146].
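As a sketch of Bayes' rule in equation (5) (our illustration; Gaussian class-conditional likelihoods are one common Naïve Bayes variant, and the data values are made up):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    X = np.array([[1.0, 2.1], [1.2, 1.9], [7.8, 8.2], [8.1, 7.9]])
    y = np.array([0, 0, 1, 1])              # two classes, C = 2

    clf = GaussianNB().fit(X, y)
    # predict_proba returns the posterior p(wc|x) of equation (5)
    print(clf.predict_proba([[1.1, 2.0]]))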
E. k-Nearest Neighbors

The k-Nearest Neighbors (k-NN) algorithm [1,60,61,65] is one of the simplest supervised machine learning algorithms. k-NN can be used for classification of input points to discrete outcomes; a simple k-NN model is shown in Figure 5. k-NN can also be used for regression analysis [64,147], where the outcome of a dependent variable is predicted from the input independent variables. In Figure 5, for k = 3, the test point (star) is classified as belonging to class B, and for k = 6, the point is classified as belonging to class A. k-NN is a non-probabilistic and non-parametric model [62,63,93] and hence is a natural first choice for classification studies when there is no prior knowledge about the distribution of the data. k-NN stores all the labelled input points in order to classify any unknown sample, and this makes it computationally expensive. The classification is based on a similarity measure (a distance metric): any unknown sample is classified by the majority vote of its k nearest neighbors. The complexity increases as the dimensionality increases, and hence dimensionality reduction techniques [164] are applied before using k-NN to avoid the effects of the curse of dimensionality [66]. The k-NN classifier is used for stress detection using physiological signals in [69] and for detection of epileptic seizures [146].

Figure 5: A simple k-NN model for different values of k.
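The majority-vote rule can be written in a few lines (a NumPy sketch of ours, assuming the Euclidean metric discussed above):

    import numpy as np
    from collections import Counter

    def knn_classify(X_train, y_train, x_test, k=3):
        # Majority vote among the k nearest stored points (Euclidean metric).
        dists = np.linalg.norm(X_train - x_test, axis=1)
        nearest = np.argsort(dists)[:k]          # indices of the k closest samples
        votes = Counter(np.asarray(y_train)[nearest])
        return votes.most_common(1)[0][0]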

III. UNSUPERVISED LEARNING

In the case of unsupervised algorithms [70,71], there are no explicit labels associated with the training dataset. The objective is to draw inferences from the input data and then model the hidden or underlying structure and distribution in the data, in order to learn more about it. Clustering is the most common example of an unsupervised algorithm and is detailed below.

A. Clustering

Clustering [75,81,82] deals with finding a structure or pattern in a collection of unlabeled data. For a given dataset, a clustering algorithm groups the data into K clusters such that data points within each cluster are similar to each other, while data points from different clusters are dissimilar. As in the k-NN algorithm, we make use of a similarity or distance metric; different distance metrics such as the Euclidean, Mahalanobis, cosine and Minkowski distances are used. Although the Euclidean distance metric is used most often, it is shown in [74] that it is not a suitable metric for capturing the quality of the clustering. The K-means algorithm is one of the simplest clustering algorithms and is an intuitive, iterative algorithm. It clusters the data by separating them into K groups of equal variance, minimizing the inertia or within-cluster sum-of-squares. However, the algorithm requires the number of clusters to be specified before it is run. Each observation or data point is assigned to the cluster with the nearest mean μ(j), which is also referred to as the centroid of that cluster; thus, the K clusters can be specified by the K centroids. After the random initialization of the K centroids, the algorithm's inner loop iterates over the following two steps:

(i) Assign each observation x(i) (the i-th sample point, i = 1, 2, ..., m for a total of m observations) to the closest cluster centroid μ(j), j = 1, 2, ..., K;
(ii) Update each cluster's centroid to the mean of the points assigned to it.

The inertia or within-cluster sum-of-squares is given by

    Σ_{i=1}^{m} min_{μ(j) ∈ C} ||x(i) − μ(j)||²    (6)

where the minimizing μ(j) is the centroid closest to the i-th sample. The K-means clustering algorithm leads to a Voronoi tessellation, and its iterations stop (converge) when there is no further change in the values of the cluster means. In Figure 6, a converged K-means result is shown. Clustering has several applications in many fields. In biology, clustering has been used to determine groups of genes that have similar functions [77-79]; it is also used for detection of brain tumors [76], cardiogram data clustering [80], business and e-commerce analysis [83], information retrieval [92], image segmentation [72] and compression [84], the study of quantitative resolutions of nanoparticles [95], fault detection in solar PV panels [101,187,188], and speech recognition [148].
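The two-step loop above can be written compactly as follows (a NumPy sketch of ours; the random initialization and fixed iteration cap are simplifying assumptions, and it assumes no cluster becomes empty):

    import numpy as np

    def kmeans(X, K, iters=100, seed=0):
        rng = np.random.default_rng(seed)
        mu = X[rng.choice(len(X), K, replace=False)]   # random initial centroids
        for _ in range(iters):
            # Step (i): assign each x(i) to its closest centroid mu(j).
            labels = ((X[:, None] - mu[None]) ** 2).sum(-1).argmin(1)
            # Step (ii): move each centroid to the mean of its assigned points.
            new_mu = np.array([X[labels == j].mean(0) for j in range(K)])
            if np.allclose(new_mu, mu):                # converged: means unchanged
                break
            mu = new_mu
        inertia = ((X - mu[labels]) ** 2).sum()        # equation (6)
        return labels, mu, inertia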
Figure 6: A converged K-means result and the cluster centroids.

B. Vector Quantization

In its simplest form, vector quantization [102,103,106] organizes data in vectors and represents them by their centroids. It typically uses a K-means clustering algorithm to train the quantizer. The centroids form codewords, and all the codewords are stored in a codebook. Vector quantization is a lossy compression method and is used in several coding applications. As a result, the compressed data has errors that are inversely proportional to the data density. This property is shown in Figure 8 and compared with uniform quantization in Figure 7.

Figure 7: Uniform quantization of 2-dimensional data. Figure 8: Vector quantization of 2-dimensional data.

The vector quantization technique is used in various speech applications including speech coding [103,107], emotion recognition [104] and audio compression [105], as well as in large-scale image classification [149] and image compression [150].
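Vector quantization as described above can be sketched with a K-means codebook (our illustration; scikit-learn's KMeans is one assumed implementation, and the Gaussian source data and codebook size of 16 are arbitrary choices):

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    data = rng.normal(size=(1000, 2))            # 2-D source vectors, as in Figure 8

    kmeans = KMeans(n_clusters=16, n_init=10).fit(data)
    codebook = kmeans.cluster_centers_            # 16 codewords
    indices = kmeans.predict(data)                # encode: index of nearest codeword
    reconstruction = codebook[indices]            # decode: lossy reproduction
    mse = np.mean((data - reconstruction) ** 2)   # distortion of the lossy code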
IV. DEEP LEARNING

In this section, a brief introduction to the field of artificial neural networks is provided, with a focus on deep learning [151,153,161] methodologies and their applications. Artificial neural networks are widely used in the areas of image classification and pattern recognition; they have proved to be among the most successful methods and achieve superior results in various fields including signal processing [163,168,171], computer vision [157], speech processing [162,165,166] and natural language processing [158,186]. Deep learning is a branch of machine learning that has gained popularity quite recently and is capable of learning multiple levels of abstraction. Although the inception of neural networks dates back to around 1960 [156], deep learning has gained much of its popularity since 2012 [155] because of great advancements in GPUs [99] and the availability of large labeled datasets. In Figure 9, a simple artificial neural network with four hidden layers is shown. The last layer, namely the output layer, performs classification. The term "deep learning" [159] refers to the several layers used to learn multiple levels of representation [152,154,170]. Each successive layer takes the output of the previous layer and feeds the result to the next layer.

Figure 9: Artificial Neural Network with four hidden layers.

Typical artificial neural network challenges include initialization of the network parameters, overfitting, and long training times. We now have various techniques to address these problems. Batch normalization [182], normalization propagation [183], weight normalization [184] and layer normalization [185] all help in accelerating the training of deep neural networks, and dropout [160] helps in reducing overfitting. There are several network architectures, including the one shown in Figure 9, which consists of dot product layers (fully connected layers). A convolutional layer [167] processes a volume of activations rather than a vector and produces feature maps. It also makes use of a subsampling or max-pooling layer to reduce the size of the feature maps. Figure 10 shows an example of a convolutional neural network (CNN). Networks whose output depends on present and past inputs, namely recurrent neural networks (RNNs) [169,172,173], have also been used in several applications.

Figure 10: A CNN with 3 convolutional and 2 subsampling layers.
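The layer-by-layer flow described above can be sketched in a few lines (our NumPy illustration; the layer sizes are arbitrary and the random weights are stand-ins for trained parameters):

    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [8, 16, 16, 16, 16, 3]           # input, four hidden layers, output
    Ws = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        # Each layer feeds its output to the next layer (Figure 9).
        for W in Ws[:-1]:
            x = np.maximum(0, x @ W)         # dense layer + ReLU activation
        z = x @ Ws[-1]                       # output layer
        return np.exp(z) / np.exp(z).sum()   # softmax for classification

    print(forward(rng.normal(size=8)))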
V. SENSOR AND IOT APPLICATIONS

The Internet of Things (IoT) [189] is a system of connected physical devices, smart machines or objects that have unique identifiers. The devices typically consist of electronics, software, sensors and radios, enabling these objects to continuously collect and transfer data. Sensors consist of a transducer that converts some form of physical process into an electrical signal; examples include microphones, cameras, accelerometers, thermometers and pressure sensors. A mobile phone is perhaps a good example of a connected device that embeds several heterogeneous sensors, including microphone arrays, at least two cameras, magnetometers and accelerometers. First generation smart phones, for example, typically included six sensors; these days, a Galaxy S5 has 26 sensors, including microphones, cameras, magnetometer, accelerometer, proximity, IR, pressure, humidity and gyro sensors. Accelerometers and magnetometers (Fig. 11) have been used in many applications, including machine monitoring, structural monitoring, human activity, and healthcare [190-193]. Other areas of collaborative sensing and machine learning include localization [199-201].

Figure 11: A magnetometer can help align with the earth's field.

Clever entertainment and information exchange systems such as smart speakers combine multiple technologies such as circular microphone arrays (Fig. 12), local and cloud-based machine learning, and information retrieval algorithms. The Amazon Echo represents a recent example of an IoT device that has a circular microphone array along with voice recognition capabilities. Local and cloud computing allow this device to interface with various other systems, exchange information, provide e-services, play back music and news on demand, and provide a human-to-machine interface for a smart home.

Figure 12: Microphone array on the Amazon Echo™ (from [202]).

The interconnection of IoT smart devices is also enabling advanced large-scale applications such as smart cities [194,195], large-scale smart networks and radios, and smart campus systems [196-198]. The field of sensors and IoT applications is vast, and large-scale applications are beginning to emerge; these include several smart and connected health and community systems.
VI. IMPLEMENTATION AND SOFTWARE TOOLS

This section introduces some of the machine learning tools. All the algorithms explained in Sections II and III can be implemented on various platforms and in various libraries, e.g., the R [110,113] and Python [180] languages. Python is one of the most utilized environments for machine learning, and a number of libraries are available, such as SciKit-Learn [114,179] and NumPy [177,178]. TensorFlow [115,181] is an open source software library for numerical computation using data flow graphs and is very popular in deep learning and computer vision. The Azure Machine Learning Studio [111,112,176] is a drag-and-drop tool for analytics. IBM Bluemix [174,175] is a cloud platform that supports several programming languages as well as integrated DevOps.
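As a small end-to-end illustration of these tools (our sketch, assuming a Python environment with SciKit-Learn [114,179] installed; the iris dataset and split ratio are arbitrary choices):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = SVC(kernel='rbf').fit(X_tr, y_tr)      # the Section II-C classifier
    print("test accuracy:", clf.score(X_te, y_te))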
CONCLUSION

This machine learning short survey paper supported the tutorial session of IISA 2017. The paper covered supervised and unsupervised learning models. We also provided a brief introduction to current deep learning methodologies and outlined several applications including pattern recognition, anomaly detection, computer vision, speech processing, and IoT applications. The paper provides an extensive bibliography of machine learning algorithms and their applications.

ACKNOWLEDGMENT

This work was supported in part by the NSF I/UCRC award 1540040, IUSE award 1525716, NXP, and the ASU SenSIP Center.

REFERENCES

[1] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). New York: Springer-Verlag, 2008.
[2] R. O. Duda, D. G. Stork, and P. E. Hart, Pattern Classification, 2nd ed. New York: John Wiley & Sons, 2000.
[3] S. Marsland, Machine Learning: An Algorithmic Perspective. Boca Raton: Chapman & Hall/CRC, 2009.
[4] Y. Kodratoff, Introduction to Machine Learning. Morgan Kaufmann, 1993.
[5] R. S. Michalski et al., Machine Learning: An Artificial Intelligence Approach. Berlin, Heidelberg: Springer, 1983.
[6] J. Friedman, T. Hastie, and R. Tibshirani, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer, 2009.
[7] M. Kubat, An Introduction to Machine Learning. Springer International Publishing, ISBN 978-3-319-20009-5, 2015.
[8] "Machine learning," in Wikipedia, Wikimedia Foundation, 2016. [Online]. Available: https://en.wikipedia.org/wiki/Machine_learning.
[9] E. Alpaydin, Introduction to Machine Learning. MIT Press, 2010.
[10] A. J. Smola and S. Vishwanathan, Introduction to Machine Learning. Cambridge University Press, 2008.
[11] A. L. Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of R&D, vol. 3, no. 3, pp. 210-229, Jul. 1959.
[12] T. M. Mitchell, Machine Learning, 7th ed. NY: McGraw Hill, 1997.
[13] J. L. Berral-García, "A quick view on current techniques and machine learning algorithms for big data analytics," ICTON, Trento, pp. 1-4, 2016.
[14] T. Phan et al., "Comparative study on supervised learning methods for identifying phytoplankton species," ICCE, Vietnam, pp. 283-288, 2016.
[15] S. Heleno, M. Silveira, M. Matias, and P. Pina, "Assessment of supervised methods for mapping rainfall induced landslides in VHR images," IGARSS, Milan, pp. 850-853, 2015.
[16] P. Drotar and Z. Smekal, "Comparative study of machine learning techniques for supervised classification of biomedical data," Acta Electrotechnica et Informatica, vol. 14, pp. 5-10, Sep. 2014.
[17] F. Galton, Natural Inheritance, Proc. Royal Soc. of London, 1889.
[18] F. Galton, "Anthropological Miscellanea: Regression towards mediocrity in hereditary stature," The Journal of the Anthropological Institute of Great Britain and Ireland, pp. 246-263, 1886.
[19] F. Galton, "Co-relations and their measurement, chiefly from anthropometric data," Proc. Royal Soc. of London, pp. 135-145, 1888.
[20] G. A. F. Seber, A. J. Lee, and R. A. Lee, Linear Regression Analysis, 2nd ed. New York: Wiley, 2003.
[21] D. C. Montgomery, E. A. Peck, and G. G. Vining, Introduction to Linear Regression Analysis, 5th ed. Oxford: Wiley-Blackwell, 2012.
[22] H. Motulsky and A. Christopoulos, Fitting Models to Biological Data Using Linear and Nonlinear Regression: A Practical Guide to Curve Fitting. New York: Oxford University Press, 2004.
[23] S. Gayathri et al., "Multivariate linear regression based activity recognition and classification," ICICES, Chennai, pp. 1-6, 2014.
[24] P. Chandler et al., "Constrained linear regression for flight control system failure identification," ACC, San Francisco, p. 3141, 1993.
[25] H. Wang et al., "SSVEP recognition using multivariate linear regression for brain computer interface," ICCC, Chengdu, pp. 176-180, 2015.
[26] H. Wang et al., "Discriminative feature extraction via multivariate linear regression for SSVEP-based BCI," IEEE Trans. on Neural Systems and Rehabilitation Eng., vol. 24, no. 5, pp. 532-541, May 2016.
[27] C. R. Boyd et al., "Evaluating trauma care," The Journal of Trauma: Injury, Infection, and Critical Care, vol. 27, pp. 370-378, Apr. 1987.
[28] J. R. Le Gall, "A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study," JAMA, vol. 270, no. 24, pp. 2957-2963, Dec. 1993.
[29] J. Truett, J. Cornfield, and W. Kannel, "A multivariate analysis of the risk of coronary heart disease in Framingham," Journal of Chronic Diseases, vol. 20, no. 7, pp. 511-524, Jul. 1967.
[30] J. M. Hilbe, Logistic Regression Models. Boca Raton: Chapman and Hall/CRC, 2016.
[31] F. C. Pampel, Logistic Regression: A Primer. Thousand Oaks, CA: Sage Publications, 2000.
[32] Harshvardhan G. et al., "Assessment of glaucoma with ocular thermal images using GLCM techniques and logistic regression classifier," WiSPNET, Chennai, India, pp. 1534-1537, 2016.
[33] J. Song and B. Fan, "Adaptive object tracking with logistic regression," CCDC, Yinchuan, pp. 5403-5408, 2016.
[34] N. Cristianini et al., An Introduction to Support Vector Machines: And Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
[35] I. Steinwart and A. Christmann, Support Vector Machines. New York: Springer-Verlag, 2008.
[36] B. E. Boser et al., "A training algorithm for optimal margin classifiers," Proceedings of the Fifth Annual Workshop on COLT, p. 144, 1992.
[37] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Knowledge Discovery and Data Mining, 2(2), 1998.
[38] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, Sep. 1995.
[39] V. Vapnik et al., "Support vector clustering," Journal of Machine Learning Research, vol. 2, pp. 125-137, 2001.
[40] A. Ben-Hur, "Support vector clustering," Scholarpedia, 3, p. 5187, 2008.
[41] H. Xiao et al., "Indicative support vector clustering with its application on anomaly detection," ICMLA, Miami, FL, pp. 273-276, 2013.
[42] D. Huang, J. H. Lai, and C. D. Wang, "Incremental support vector clustering with outlier detection," ICPR, 2012.
[43] F. de Morsier et al., "Unsupervised change detection via hierarchical support vector clustering," PRRS, 2012.
[44] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, 9(3), pp. 293-300, 1999.
[45] Bernhard S. et al., SVM method for novelty detection, MIT Press, 2000.
[46] B. Scholkopf et al., "Single class support vector machines," Unsupervised Learning, Dagstuhl seminar report, pp. 19-20, 1999.
[47] D. M. J. Tax and R. P. W. Duin, "Data description by support vectors," in M. Verleysen, ed., Proceedings ESANN, Brussels, pp. 251-256, 1999.
[48] X. Peng et al., "Efficient support vector data descriptions for novelty detection," Neural Comp. and App., vol. 21, pp. 2023-2032, May 2011.
[49] S. Wang et al., "A modified support vector data description based novelty detection approach for machinery components," Applied Soft Computing, vol. 13, no. 2, pp. 1193-1205, Feb. 2013.
[50] M. Yao and H. Wang, "One-class support vector machine for functional data novelty detection," 3rd Cong. Int. Sys., Wuhan, p. 172, 2012.
[51] Zhou Guangping, "The study of the application in intrusion detection based on SVM," Journal of Conv. Inf. Tech., vol. 8, p. 11, Mar. 2013.
[52] N. Chand et al., "A comparative analysis of SVM and its stacking with other classification algorithms," ICACCA, Dehradun, pp. 1-6, 2016.
[53] M. A. Oskoei et al., "Adaptive schemes applied to online SVM for BCI data classification," IEEE EMBS, Minneapolis, pp. 2600-2603, 2009.
[54] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Prentice Hall, 1994.
[55] I. Rish, "An empirical study of the naive Bayes classifier," IJCAI Workshop on Empirical Methods in AI, 2001.
[56] J. Rennie, L. Shih, J. Teevan, and D. Karger, "Tackling the poor assumptions of naive Bayes classifiers," ICML, 2003.
[57] K. Chai, H. T. Ng, and H. L. Chieu, "Bayesian online classifiers for text classification and filtering," ACM SIGIR, pp. 97-104, August 2002.
[58] R. Vedala et al., "An application of naive Bayes classification for credit scoring in e-lending platform," ICDSE, pp. 81-84, 2012.
[59] D. D. Lewis, "Naive (Bayes) at forty: The independence assumption in information retrieval," Proceedings of ECML, 1998.
[60] StatSoft, Inc., "K-nearest neighbors," 2016. [Online]. Available: http://www.statsoft.com/textbook/k-nearest-neighbors.
[61] T. M. Cover and P. E. Hart, "Nearest neighbour pattern classification," IEEE Trans. Inform. Theory, vol. IT-13, pp. 21-27, Jan. 1967.
[62] L. Peterson, "K-nearest neighbor," Scholarpedia, vol. 4, p. 1883, 2009.
[63] Y. Lifshits, "Nearest neighbor search," SIGSPATIAL, vol. 2, p. 12, 2010.
[64] N. S. Altman, "An introduction to kernel and nearest-neighbor nonparametric regression," The Amer. Stat., vol. 46, p. 175, 1992.
[65] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.
[66] K. Beyer et al., "When is 'nearest neighbor' meaningful?" Database Theory: ICDT, pp. 217-235, 1999.
[67] E. H. Jang, B. J. Park, S. H. Kim, Y. Eum, and J. H. Sohn, "A study on analysis of bio-signals for basic emotions classification: Recognition using machine learning algorithms," 2014 ICISA, Seoul, pp. 1-4, 2014.
[68] W. Wu et al., "Bayesian machine learning: EEG/MEG signal processing measurements," IEEE Signal Processing Magazine, vol. 33, no. 1, pp. 14-36, Jan. 2016.
[69] A. Ghaderi et al., "Machine learning-based signal processing using physiological signals for stress detection," ICBME, Tehran, 2015.
[70] M. Khanum et al., "A survey on unsupervised machine learning algorithms for automation, classification and maintenance," IJCA, vol. 119, no. 13, pp. 34-39, Jun. 2015.
[71] M. E. Celebi and K. Aydin, eds., Unsupervised Learning Algorithms, 1st ed. Switzerland: Springer International Publishing, 2016.
[72] A. Albiol et al., "An unsupervised color image segmentation algorithm for face detection applications," ICIP, Thessaloniki, pp. 681-684, 2001.
[73] C. K. Lee, P. F. Sum, and K. S. Tan, "An unsupervised learning algorithm for character recognition," IJCNN, 1992.
[74] N. Bouhmala, "How good is the Euclidean distance metric for the clustering problem," IIAI-AAI, Kumamoto, pp. 312-315, 2016.
[75] A. Bindal and A. Pathak, "A survey on k-means clustering and web-text mining," IJSR, vol. 5, no. 4, pp. 1049-1052, Apr. 2016.
[76] A. A. Mandwe and A. Anjum, "Detection of brain tumor using k-means clustering," IJSR, vol. 5, no. 6, pp. 420-423, Jun. 2016.
[77] K. Dhiraj and S. K. Rath, "Gene expression analysis using clustering," Int. Journal of Comp. and Elec. Engg., pp. 155-164, 2009.
[78] A. Bhattacharya and R. De, "Bi-correlation clustering algorithm for determining a set of co-regulated genes," Bioinformatics, vol. 25, p. 2795, 2009.
[79] E. Zeng, C. Yang, T. Li, and G. Narasimhan, "Clustering genes using heterogeneous data sources," IJKDB, vol. 1, no. 2, pp. 12-28, 2010.
[80] C. Sundar, "An analysis on the performance of k-means clustering algorithm for cardiotocogram clustering," IJCSA, vol. 2, p. 11, Oct. 2012.
[81] J. Sun, "Clustering algorithms research," J. Software, vol. 19, Jun. 2008.
[82] G. Gan, C. Ma, and J. Wu, Data Clustering: Theory, Algorithms, and Applications. Philadelphia: SIAM, 2007.
[83] X. Huang and Z. Song, "Clustering analysis on e-commerce transaction based on k-means clustering," J. Networks, vol. 9, Feb. 2014.
[84] C. W. Wang and J. H. Jeng, "Image compression using PCA with clustering," ISPACS, New Taipei, pp. 458-462, 2012.
[85] Kunlun Li and Guifa Teng, "Unsupervised SVM based on p-kernels for anomaly detection," ICICIC, Beijing, pp. 59-62, 2006.
[86] N. Kovvali, M. Banavar, and A. Spanias, An Introduction to Kalman Filtering with MATLAB Examples, Synthesis Lect. Signal Proc., Morgan & Claypool Publ., ed. J. Mura, vol. 6, Sep. 2013.
[87] B. Widrow and S. Stearns, Adaptive Signal Processing, Prentice Hall, 1985.
[88] J. Foutz, A. Spanias, and M. Banavar, Narrowband Direction of Arrival Estimation for Antenna Arrays, Synthesis Lectures on Antennas, Morgan & Claypool Publishers, ISBN-13: 978-1598296501, Aug. 2008.
[89] S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective, 1st ed., Academic Press, December 2015.
[90] A. Spanias, Digital Signal Processing: An Interactive Approach, 2nd ed., ISBN 978-1-4675-9892-7, Lulu Press, May 2014.
[91] J. Lee, M. Stanley, A. Spanias, and C. Tepedelenlioglu, "Integrating machine learning in embedded sensor systems for Internet-of-Things applications," IEEE ISSPIT, Limassol, Cyprus, Dec. 2016.
[92] J. Thiagarajan, K. Ramamurthy, P. Turaga, and A. Spanias, Image Understanding Using Sparse Representations, Synth. Lect. on Image, Video, and Multimedia Proc., Morgan & Claypool Publ., April 2014.
[93] V. Berisha, A. Wisler, A. Hero, and A. Spanias, "Empirically estimable classification bounds based on a nonparametric divergence measure," IEEE Trans. on Signal Processing, vol. 64, pp. 580-591, Feb. 2016.
[94] G. Wichern, J. Xue, H. Thornburg, B. Mechtley, and A. Spanias, "Segmentation, indexing, and retrieval for environmental and natural sounds," IEEE Trans. on ASLP, vol. 18, no. 3, pp. 688-707, 2010.
[95] X. Bi, S. Lee, J. F. Ranville, P. Sattigeri, A. Spanias, P. Herckes, and P. Westerhoff, "Quantitative resolution of nanoparticle sizes using single particle inductively coupled plasma mass spectrometry with the K-means clustering algorithm," J. Analy. Atomic Spectr., 29, p. 1630, 2014.
[96] H. Braun, P. Turaga, and A. Spanias, "Direct tracking from compressive imagers: A proof of concept," IEEE ICASSP 2014, Florence, 2014.
[97] J. J. Thiagarajan, K. N. Ramamurthy, P. Sattigeri, and A. Spanias, "Supervised local sparse coding of sub-image features for image retrieval," IEEE ICIP 2012, Orlando, Sept. 2012.
[98] P. Sattigeri, J. J. Thiagarajan, M. Shah, K. N. Ramamurthy, and A. Spanias, "A scalable feature learning and tag prediction framework for natural environment sounds," 48th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, pp. 1779-1783, 2014.
[99] P. Sattigeri, J. J. Thiagarajan, K. N. Ramamurthy, and A. Spanias, "Implementation of a fast image coding and retrieval system using a GPU," 2012 IEEE ESPA, Las Vegas, NV, pp. 5-8, 2012.
[100] P. Sattigeri, J. J. Thiagarajan, K. N. Ramamurthy, A. Spanias, M. Goryll, and T. Thornton, "Robust PSD features for ion-channel signals," SSPD, London, UK, 27-29 September 2011.
[101] A. Spanias, C. Tepedelenlioglu, E. Kyriakides, D. Ramirez, S. Rao, H. Braun, J. Lee, D. Srinivasan, J. Frye, S. Koizumi, and Y. Morimoto, "An 18 kW solar array research facility for fault detection experiments," Proc. 18th MELECON, Cyprus, April 2016.
[102] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, 6th ed. Boston, MA: Kluwer Academic Publishers, 1991.
[103] J. Makhoul et al., "Vector quantization in speech coding," Proceedings of the IEEE, vol. 73, no. 11, pp. 1551-1588, Nov. 1985.
[104] M. Shah, C. Chakrabarti, and A. Spanias, "Within and cross-corpus speech emotion recognition using latent topic model-based features," EURASIP J. Audio, Speech, and Music Processing, 2015:4, Jan. 2015.
[105] A. Spanias, T. Painter, and V. Atti, Audio Signal Processing and Coding, Wiley, March 2007.
[106] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., COM-28, no. 1, pp. 84-95, Jan. 1980.
[107] A. S. Spanias, "Speech coding: A tutorial review," Proceedings of the IEEE, vol. 82, no. 10, pp. 1541-1582, October 1994.
[108] E. G. Ularu et al., "Mobile computing and cloud maturity - introducing machine learning for ERP configuration automation," Informatica Economica, vol. 17, no. 1/2013, pp. 40-52, Mar. 2013.
[109] I. H. Witten et al., Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. USA: Morgan Kaufmann Publishers, 2011.
[110] R. Schumacker, Understanding Statistics Using R, S. Tomek, ed., Springer Publishing Company, Inc., 2013.
[111] G. Webber-Cross, Learning Microsoft Azure: A Comprehensive Guide to Cloud Application Development Using MS Azure. UK: Packt Publishing, 2014.
[112] V. Fontama et al., Predictive Analytics with MS Azure Machine Learning: Build and Deploy Solutions in Minutes. Apress, 2014.
[113] R Development Core Team (2008), R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
[114] F. Pedregosa et al., "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research 12, p. 2825, 2011. https://scikit-learn.org
[115] M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. URL https://tensorflow.org
[116] S. Amari, "Backpropagation and stochastic gradient descent method," Neurocomputing, vol. 5, no. 4-5, pp. 185-196, 1993.
[117] H. Blockeel, Machine Learning and Knowledge Discovery in Databases. Berlin: Springer, 2013.
[118] V. J. Mathews and Z. Xie, "A stochastic gradient adaptive filter with gradient adaptive step size," IEEE Trans. SP, vol. 41, p. 2075, Jun. 1993.
[119] G. Qu and N. Li, "Accelerated distributed Nesterov gradient descent for smooth and strongly convex functions," 54th Annual Allerton Conf. on Comm., Control, and Computing, Monticello, IL, pp. 209-216, 2016.
[120] Y. Wong, "How Gaussian radial basis functions work," IJCNN-91 Int. Joint Conf. on Neural Networks, Seattle, WA, pp. 133-138, 1991.
[121] J. A. Flanagan and T. Novosad, "Maximizing WCDMA network packet traffic performance: Multi-parameter optimization by gradient descent minimization of a cost function," IEEE PIMRC, vol. 1, pp. 311-315, 2003.
[122] F. F. Lubis et al., "Gradient descent and normal equations on cost function minimization for online predictive using linear regression with multiple variables," 2014 ICISS, Bandung, pp. 202-205, 2014.
[123] S. K. Lenka et al., "Gradient descent with momentum based neural network pattern classification for the prediction of soil moisture content in precision agriculture," 2015 IEEE iNIS, Indore, p. 63, 2015.
[124] M. Tivnan et al., "A modified gradient descent reconstruction algorithm for breast cancer detection using microwave radar and digital breast tomosynthesis," 2016 10th EuCAP, Davos, pp. 1-4, 2016.
[125] D. Chen et al., "Similarity learning on an explicit kernel feature map for person re-identification," IEEE CVPR, Boston, p. 1565, 2015.
[126] P. Sahoo et al., "On the study of GRBF and polynomial kernel based support vector machine in web logs," 2013 1st Int. Conf. on Emerging Trends and Applications in Computer Science, Shillong, pp. 1-5, 2013.
[127] P. Panavaranan and Y. Wongsawat, "EEG-based pain estimation via fuzzy logic and polynomial kernel support vector machine," 6th 2013 Biomedical Engg. Int. Conf., Amphur Muang, pp. 1-4, 2013.
[128] S. Yaman et al., "Using polynomial kernel SVM for speaker verification," IEEE Sig. Process. Lett., vol. 20, pp. 901-904, Sept. 2013.
[129] J. Bai et al., "Application of SVM with modified Gaussian kernel in a noise-robust speech recognition system," IEEE Int. Symp. on Knowledge Acquisition and Modeling Workshop, pp. 502-505, 2008.
[130] P. Baldi and S. Brunak, "Gaussian processes, kernel methods, and SVM," in Bioinformatics: The Machine Learning Approach, MIT Press, p. 387, 2001.
[131] M. Varewyck et al., "A practical approach to model selection for support vector machines with a Gaussian kernel," IEEE Trans. on SMC, Part B (Cybernetics), vol. 41, no. 2, pp. 330-340, April 2011.
[132] D. Zhang et al., "Time series classification using SVM with Gaussian elastic metric kernel," ICPR, Istanbul, pp. 29-32, 2010.
[133] J. Tian and L. Zhao, "Weighted Gaussian kernel with multiple widths and support vector classifications," Int. Symposium on Info. Engg. and Electronic Commerce, Ternopil, pp. 379-382, 2009.
[134] Yaohua Tang et al., "Efficient model selection for SVM with Gaussian kernel function," 2009 IEEE CIDM, Nashville, TN, pp. 40-45, 2009.
[135] A. Betancourt et al., "Filtering SVM frame-by-frame binary classification in a detection framework," ICIP, Quebec, p. 2552, 2015.
[136] Chen Donghui and Liu Zhijing, "A new text categorization method based on HMM and SVM," 2010 2nd Int. Conf. on Computer Engineering and Technology, Chengdu, pp. V7-383-V7-386, 2010.
[137] M. Kumar and M. Gopal, "An investigation on linear SVM and its variants," Int. Conf. Mach. Learn. and Comp., Bangalore, p. 27, 2010.
[138] Z. Wang and X. Qian, "Text categorization based on LDA and SVM," 2008 Int. Conf. on Comp. Sci. and Soft. Eng., Hubei, p. 674, 2008.
[139] A. Sharma, "Handwritten digit recognition using SVM," eprint arXiv:1203.3847, 2012.
[140] D. Gorgevik et al., "Handwritten digit recognition by combining support vector machines using rule-based reasoning," Proc. 23rd Int. Conf. on Info. Tech. Interfaces, vol. 1, pp. 139-144, 2001.
[141] E. Tuba et al., "Handwritten digit recognition by SVM optimized by bat algorithm," 24th Int. Conf. WSCG, 2016.
[142] C. S. Turner, "Slope filtering: An FIR approach to linear regression [DSP Tips&Tricks]," IEEE Sig. Proc. Mag., pp. 159-163, Nov. 2008.
[143] Y. T. Chang and K. Cheng, "Sensorless position estimation of switched reluctance motor at startup using quadratic polynomial regression," IET Electric Power Applications, vol. 7, pp. 618-626, Aug. 2013.
[144] E. Masry, "Multivariate regression estimation of continuous-time processes from sampled data: Local polynomial fitting approach," IEEE Trans. on Info. Theory, vol. 45, no. 6, pp. 1939-1953, Sep. 1999.
[145] T. Banerjee et al., "PERD: Polynomial-based event region detection in wireless sensor networks," 2007 IEEE ICC, pp. 3307-3312, 2007.
[146] A. Sharmila and P. Geethanjali, "DWT based detection of epileptic seizure," IEEE Access, vol. 4, pp. 7716-7727, 2016.
[147] V. Agrawal et al., "Application of K-NN regression for predicting coal mill related variables," 2016 ICCPCT, India, pp. 1-9, 2016.
[148] X. Li et al., "Speech recognition based on k-means clustering and NN ensembles," 7th Int. Conf. on Natural Comp., Shanghai, p. 614, 2011.
[149] E. C. Ozan et al., "A vector quantization based k-NN approach for large-scale image classification," IPTA, Oulu, pp. 1-6, 2016.
[150] D. Valsesia and P. Boufounos, "Multispectral image compression using vector quantization," IEEE ITW, Cambridge, p. 151, 2016.
[151] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 1st ed. Cambridge, Mass: The MIT Press, 2017.
[152] Y. Bengio et al., "Representation learning: A review and new perspectives," IEEE Transactions on PAMI, vol. 35, p. 1798, Aug. 2013.
[153] I. Arel et al., "Deep machine learning - a new frontier in artificial intelligence research [Research Frontier]," IEEE Computational Intelligence Magazine, vol. 5, no. 4, pp. 13-18, Nov. 2010.
[154] Y. LeCun et al., "Deep learning," Nature 521.7553, 2015.
[155] A. Krizhevsky et al., "ImageNet classification with deep convolutional NN," Adv. Neural Info. Process. Sys., vol. 25, pp. 1090-1098, 2012.
[156] F. Rosenblatt, "The perceptron: A probabilistic model for information storage in the brain," Psychological Review, vol. 65, pp. 386-408, 1958.
[157] K. Kavukcuoglu et al., "Learning convolutional feature hierarchies for visual recognition," Advances in Neural Info. Process. Sys., 2010.
[158] T. Mikolov et al., "Distributed representations of words and phrases and their compositionality," NIPS, 2013.
[159] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks 61 (2015): 85-117.
[160] N. Srivastava et al., "Dropout: A simple way to prevent neural networks from overfitting," J. Machine Learning Res., 15.1 (2014): 1929-1958.
[161] L. Deng, "A tutorial survey of architectures, algorithms, and applications for deep learning," APSIPA Trans. on Signal and Info. Process., 3, 2014.
[162] G. Hinton et al., "Deep NN for acoustic modeling in speech recognition," IEEE Signal Process. Magazine, vol. 29, no. 6, pp. 82-97, 2012.
[163] D. Yu and L. Deng, "Deep learning and its applications to signal and information processing," IEEE Sig. Proc. Mag., 28, no. 1, pp. 145-154, 2011.
[164] G. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science 313, no. 5786, pp. 504-507, 2006.
[165] D. Yu and L. Deng, Automatic Speech Recognition: A Deep Learning Approach. Springer, 2014.
[166] O. Abdel-Hamid and H. Jiang, "Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code," Proc. IEEE ICASSP 2013, Vancouver, 2013.
[167] C. Szegedy et al., "Going deeper with convolutions," Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition, 2015.
[168] H. Song et al., "Auto-context modeling using multiple kernel learning," 2016 IEEE ICIP, Phoenix, pp. 1868-1872, Sep. 2016.
[169] Y. Bengio et al., "Advances in optimizing recurrent networks," Proc. IEEE ICASSP 2013, Vancouver, 2013.
[170] R. Salakhutdinov et al., "Deep Boltzmann machines," Proceedings of the Int. Conf. on AI and Statistics, vol. 5, Cambridge: MIT Press, 2009.
[171] H. Song, J. Jayaraman, and A. Spanias, "A deep learning approach to multiple kernel fusion," Proc. ICASSP 2017, New Orleans.
[172] T. Mikolov et al., "Recurrent neural network based language model," Proc. IEEE ICASSP, 2010, pp. 1045-1048.
[173] G. Mesnil et al., "Investigation of RNN architectures and learning methods for spoken language understanding," Proc. Interspeech, 2013.
[174] K. Kobylinski et al., "Enterprise application development in the cloud with IBM Bluemix," Proc. 24th Conf. Comp. Sc. Soft. Eng., IBM, 2014.
[175] A. Gheith et al., "IBM Bluemix mobile cloud services," IBM Journal of Research and Development, 60.2-3 (2016): 7-1.
[176] S. Klein, "Azure Machine Learning," in IoT Solutions in Microsoft's Azure IoT Suite. Apress, 2017, pp. 227-252.
[177] S. van der Walt et al., "The NumPy array: A structure for efficient numerical computation," Comp. in Science & Eng., 13.2 (2011): 22-30.
[178] W. McKinney, Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.
[179] G. Hackeling, Mastering ML with scikit-learn. Packt Publishing, 2014.
[180] G. Van Rossum, "Python programming language," USENIX Annual Technical Conf., vol. 41, 2007.
[181] F. Chollet, "Keras: Deep learning library for Theano and TensorFlow," URL: https://keras.io (2015).
[182] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," arXiv:1502.03167 (2015).
[183] D. Arpit et al., "Normalization propagation: A parametric technique for removing covariate shift in deep networks," arXiv preprint, 2016.
[184] T. Salimans et al., "Weight normalization: A reparameterization to accelerate training of neural networks," Adv. Neural Info. Process. Sys., 2016.
[185] J. Ba et al., "Layer normalization," arXiv:1607.06450 (2016).
[186] P. Loizou and A. Spanias, "High performance alphabet recognition," IEEE Trans. on Speech and Audio, vol. 4, pp. 439-445, Nov. 1996.
[187] S. Rao, S. Katoch, P. Turaga, A. Spanias, C. Tepedelenlioglu, R. Ayyanar, H. Braun, J. Lee, U. Shanthamallu, M. Banavar, and D. Srinivasan, "A cyber-physical system approach for photovoltaic array monitoring and control," Proc. 8th International Conference on Information, Intelligence, Systems and Applications (IEEE IISA 2017), Larnaca, August 2017.
[188] A. Spanias, "Solar energy management as an Internet of Things (IoT) application," Proc. 8th International Conference on Information, Intelligence, Systems and Applications (IEEE IISA 2017), Larnaca, August 2017.
[189] J. Gubbi et al., "Internet of Things (IoT): A vision, architectural elements, and future directions," Future Generation Computer Systems, vol. 29, no. 7, pp. 1645-1660, 2013.
[190] C. Aldrich and L. Auret, Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods, Springer, 2013.
[191] X. Long, B. Yin, and R. M. Aarts, "Single-accelerometer-based daily physical activity classification," Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2009.
[192] D. Rajan, A. Spanias, S. Ranganath, M. Banavar, and P. Spanias, "Health monitoring laboratories by interfacing physiological sensors to mobile Android devices," IEEE FIE, 2013.
[193] J. P. Lynch, "A summary review of wireless sensors and sensor networks for structural health monitoring," The Shock and Vibration Digest, vol. 38, no. 2, pp. 91-128, 2006.
[194] J.-S. Hwang and Y. H. Choe, "Smart Cities Seoul: A case study," ITU-T Technology Watch, February 2013. Retrieved 23 October 2016.
[195] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, "Internet of Things for smart cities," IEEE Internet of Things Journal, 1(1): 22-32, February 2014. Retrieved 26 June 2015.
[196] "Sensor networks and the smart campus," 2014. [Online]. Available: https://beaverworks.ll.mit.edu/CMS/bw/smartcampusfuture. Accessed: Dec. 12, 2016.
[197] P. Bellavista et al., "Convergence of MANET and WSN in IoT urban scenarios," IEEE Sens. J., vol. 13, no. 10, pp. 3558-3567, Oct. 2013.
[198] A. Zanella et al., "Internet of Things for smart cities," IEEE Internet of Things Journal, vol. 1, no. 1, February 2014.
[199] S. Miller, X. Zhang, and A. Spanias, Multipath Effects in GPS Receivers, Synthesis Lectures on Communications, Morgan & Claypool Publishers, ISBN 978-1627059312, ed. W. Tranter, no. 1, Dec. 2015.
[200] X. Zhang, C. Tepedelenlioglu, M. Banavar, and A. Spanias, Node Localization in Wireless Sensor Networks, Synthesis Lectures on Communications, Morgan & Claypool Publ., ISBN: 9781627054850, ed. W. Tranter, Dec. 2016.
[201] Quoc-Huy Phan and Su-Lim Tan, "Mitigation of GPS periodic multipath using nonlinear regression," 19th European Signal Processing Conference, Barcelona, 2011.
[202] www.ifixit.com/Teardown/Amazon+Echo+Teardown/
