AI Notes V
NOTES
Principal Component Analysis
This method was introduced by Karl Pearson. It works on the condition that when data in a higher-dimensional space is mapped to a lower-dimensional space, the variance of the data in the lower-dimensional space should be maximal.
It involves the following steps:
• Construct the covariance matrix of the data.
• Compute the eigenvectors of this matrix.
• Eigenvectors corresponding to the largest eigenvalues are used to reconstruct a large fraction of the variance of the original data.
Hence, we are left with a smaller number of eigenvectors, and there might have been some data loss in the process. However, the most important variance should be retained by the remaining eigenvectors.
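As an illustration of these steps, a minimal NumPy sketch (the function pca and its interface are mine for illustration, not from any library):

import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)            # PCA assumes zero-mean data
    cov = np.cov(X_centered, rowvar=False)     # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigendecomposition of a symmetric matrix
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues in descending order
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components             # coordinates in the reduced space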
There are a lot of machine learning problems which are nonlinear, and the use of nonlinear feature mappings can help to produce new features which make prediction problems linear. In this section we will discuss the following idea: transformation of the dataset to a new higher-dimensional (in some cases infinite-dimensional) feature space and the use of PCA in that space in order to produce uncorrelated features. Such a method is called Kernel Principal Component Analysis, or KPCA.
Let $\phi : \mathbb{R}^d \to \mathcal{F}$ be a feature mapping, and let $\Phi$ denote the $n \times M$ matrix whose $i$-th row is $\phi(x_i)^T$. We will consider that the dimensionality of the feature space equals $M$ (possibly very large, or infinite). The Gram matrix of the mapped data is
$$K = \Phi \Phi^T, \qquad K_{ij} = \phi(x_i)^T \phi(x_j).$$
The eigendecomposition of $K$ is given by
$$K u_j = \lambda_j u_j, \qquad j = 1, \dots, n.$$
By the definition of $K$,
$$\Phi \Phi^T u_j = \lambda_j u_j,$$
and therefore, multiplying both sides by $\frac{1}{n} \Phi^T$,
$$\left( \frac{1}{n} \Phi^T \Phi \right) \left( \Phi^T u_j \right) = \frac{\lambda_j}{n} \left( \Phi^T u_j \right),$$
so $v_j = \frac{1}{\sqrt{\lambda_j}} \Phi^T u_j$ is a unit-norm eigenvector of the feature-space covariance matrix $C = \frac{1}{n} \Phi^T \Phi$.
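A quick NumPy check of this relation (a sketch in my own notation; phi here is an explicit toy feature map, which KPCA will later avoid):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                  # 5 samples, 2 input features

# Toy explicit feature map: (x1, x2) -> (x1, x2, x1*x2)
Phi = np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

K = Phi @ Phi.T                              # Gram matrix (n x n)
lam, U = np.linalg.eigh(K)                   # eigenpairs of K
lam, U = lam[::-1], U[:, ::-1]               # descending order

C = Phi.T @ Phi / len(X)                     # feature-space covariance (M x M)
v = Phi.T @ U[:, 0] / np.sqrt(lam[0])        # candidate top eigenvector of C

# C v should equal (lambda / n) v
print(np.allclose(C @ v, (lam[0] / len(X)) * v))   # True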
So far, we have assumed that the mapping $\phi$ is known. From the equations above, we can see that the only thing we need for the data transformation is the eigendecomposition of the Gram matrix $K$. The dot products which are its elements can be defined without any explicit definition of $\phi$. A function defining such dot products in some Hilbert space is called a kernel, $k(x, y) = \phi(x)^T \phi(y)$; valid kernels are characterized by Mercer's theorem. There are many different types of kernels; several popular ones are:
1. Linear: $k(x, y) = x^T y$;
2. Gaussian (RBF): $k(x, y) = \exp\left(-\gamma \|x - y\|^2\right)$;
3. Polynomial: $k(x, y) = (x^T y + c)^d$.
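These three kernels are easy to write down directly; a small sketch (the parameter names gamma, c, and degree follow common convention and are chosen here for illustration):

import numpy as np

def linear_kernel(x, y):
    return x @ y

def gaussian_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def polynomial_kernel(x, y, c=1.0, degree=3):
    return (x @ y + c) ** degree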
Using a kernel function we can write a new equation for the projection of some data item $x$ onto the $j$-th eigenvector:
$$\phi(x)^T v_j = \frac{1}{\sqrt{\lambda_j}} \phi(x)^T \Phi^T u_j = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^{n} u_{ji}\, k(x, x_i).$$
So far, we have assumed that the columns of $\Phi$ have zero mean. Using $\mathbf{1}_n$ to denote the $n \times n$ matrix with all entries equal to $1/n$, the Gram matrix of the centered data can be obtained directly from $K$:
$$\tilde{K} = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n.$$
Summary: Now we are ready to write the whole sequence of steps to perform KPCA:
1. Calculate $K$, where $K_{ij} = k(x_i, x_j)$.
2. Calculate $\tilde{K} = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n$.
3. Find the eigenvectors $u_j$ of $\tilde{K}$ corresponding to nonzero eigenvalues $\lambda_j$ and normalize them: $u_j \leftarrow u_j / \sqrt{\lambda_j}$.
4. Sort the found eigenvectors in descending order of the corresponding eigenvalues.
5. Perform projections onto the given subset of eigenvectors (see the sketch below).
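Putting these steps together, a minimal NumPy sketch of KPCA with the Gaussian kernel (the function kernel_pca and its parameters are mine for illustration; scikit-learn's sklearn.decomposition.KernelPCA offers a ready-made equivalent):

import numpy as np

def kernel_pca(X, n_components, gamma=1.0):
    """KPCA with a Gaussian kernel; returns the projected data (n_samples x n_components)."""
    n = X.shape[0]
    # Step 1: Gram matrix K_ij = k(x_i, x_j)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)
    # Step 2: center the Gram matrix
    one_n = np.full((n, n), 1.0 / n)
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Steps 3-4: eigendecomposition, descending order, keep the top components
    lam, U = np.linalg.eigh(K_c)
    lam, U = lam[::-1][:n_components], U[:, ::-1][:, :n_components]
    # Normalize so that the feature-space eigenvectors have unit norm
    U = U / np.sqrt(lam)
    # Step 5: projections of the training data onto the eigenvectors
    return K_c @ U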
The method described above requires one to define the number of components, the kernel, and its parameters. It should be noted that the number of nonlinear principal components in the general case is infinite, but since we are computing the eigenvectors of an $n \times n$ matrix $\tilde{K}$, at maximum we can calculate $n$ nonlinear principal components.
SUPPORT VECTOR MACHINES
Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data for classification (classification means knowing what belongs to what, e.g. 'apple' belongs to class 'fruit' while 'dog' belongs to class 'animals'; see fig. 1).
In support vector machines, the decision boundary looks somewhat like a line which separates the blue balls from the red.
SVM is a classifier formally defined by a separating hyperplane. A hyperplane is a subspace of one dimension less than its ambient space. The dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates (x, y, z axes) needed to specify any point (like each blue and red point) within it, while an ambient space is the space surrounding a mathematical object.
Therefore the hyperplane of the two-dimensional space below (fig. 2) is a one-dimensional line dividing the red and blue dots.
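As a sketch of how this looks in practice, a linear SVM on a toy two-dimensional dataset with scikit-learn (the points and labels below are invented for illustration):

import numpy as np
from sklearn.svm import SVC

# Toy 2-D points: two loose groups ("red" = 0, "blue" = 1)
X = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 1.0],
              [5.0, 6.0], [6.0, 5.5], [5.5, 6.5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")       # linear kernel: the boundary is a hyperplane
clf.fit(X, y)

# coef_ and intercept_ describe the separating hyperplane w . x + b = 0
print(clf.coef_, clf.intercept_)
print(clf.predict([[2.0, 2.0], [5.0, 5.0]]))   # -> [0 1]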
Introduction to clustering
As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a training dataset. Instead, the models themselves find the hidden patterns and insights in the given data. It can be compared to the learning which takes place in the human brain while learning new things. It can be defined as:
"Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision."
Below are some main reasons which describe the importance of unsupervised learning:
o Unsupervised learning is helpful for finding useful insights from the data.
Once a suitable algorithm is applied, the algorithm divides the data objects into groups according to the similarities and differences between the objects.
Unsupervised learning can be further categorized into types of problems; one of the main types is clustering:
Clustering: Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have few or no similarities with the objects of another group. Cluster analysis finds the commonalities between the data objects and categorizes them according to the presence and absence of those commonalities. Popular unsupervised learning algorithms include:
o Hierarchical clustering
o Anomaly detection
o Neural networks
o Principal Component Analysis
o Apriori algorithm
One of the most used clustering algorithms is k-means. It allows one to group the data into k clusters according to the existing similarities among them, where k is given as input to the algorithm. I'll start with a simple example.
Let's imagine we have 5 objects (say 5 people), and for each of them we know two features (height and weight). We want to group them into k=2 clusters.
As you probably already know, I'm using Python libraries to analyze my data. The k-means algorithm is implemented in the scikit-learn package. To use it, you will just need the following line in your script:
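from sklearn.cluster import KMeans

Continuing the example, a minimal sketch (the five height/weight pairs below are invented for illustration):

import numpy as np
from sklearn.cluster import KMeans

# 5 people, two features each: (height in cm, weight in kg)
X = np.array([[185.0, 72.0], [170.0, 56.0], [168.0, 60.0],
              [179.0, 68.0], [182.0, 72.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # cluster index assigned to each person
print(kmeans.cluster_centers_)   # the two centroids (height, weight)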