ML Assignment No. 3
R (2)  C (4)  V (2)  T (2)  Total (10)  Dated Sign
3.1 Title
In the following diagram, let blue circles indicate positive examples and orange squares indicate negative
examples. We want to use the k-NN algorithm to classify the points. If k = 3, find the class of the point
(6, 6). Extend the same example to Distance-Weighted k-NN and Locally Weighted Averaging.
3.3 Prerequisite:
Understand how to apply KNN classification to classify positive and negative points in the given example.
3.7 Outcomes:
After completion of this assignment, students will be able to implement KNN classification to classify
positive and negative points in the given example, and to extend the same example to Distance-Weighted
k-NN and Locally Weighted Averaging.
3.8.1 Motivation
The dichotomy is pretty obvious here: there is a non-existent or minimal training phase but a costly
testing phase. The cost is in terms of both time and memory. More time may be needed because, in the
worst case, all data points take part in the decision. More memory is needed because we have to store all
the training data.
The KNN algorithm is based on feature similarity: how closely the features of an out-of-sample point
resemble our training set determines how we classify that point.
Example of k-NN classification: the test sample (at the center of the circles) should be classified either
into the first class of blue squares or into the second class of red triangles. If k = 3 (the inner circle) it is
assigned to the second class, because there are 2 triangles and only 1 square inside the inner circle. If
k = 5 (the outer circle) it is assigned to the first class (3 squares vs. 2 triangles inside the outer circle).
KNN can be used for classification: the output is a class membership (it predicts a class, i.e. a discrete
value). An object is classified by a majority vote of its neighbors, with the object being assigned to the
class most common among its k nearest neighbors. It can also be used for regression: the output is a value
for the object (it predicts a continuous value), namely the average (or median) of the values of its k
nearest neighbors.
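As a small illustration, the sketch below (assuming scikit-learn is available; the data points are made up) uses KNeighborsClassifier for the voting case and KNeighborsRegressor for the averaging case.

```python
# Minimal sketch: k-NN for classification (majority vote) and for
# regression (average of neighbor values), on small made-up data.
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Hypothetical training data: 2-D feature vectors.
X = [[1, 2], [2, 3], [3, 3], [6, 5], [7, 7], [8, 6]]
y_class = [0, 0, 0, 1, 1, 1]                # discrete class labels
y_value = [1.0, 1.2, 1.1, 4.8, 5.2, 5.0]    # continuous target values

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)

query = [[5, 5]]
print(clf.predict(query))   # class chosen by majority vote of the 3 nearest neighbors
print(reg.predict(query))   # mean of the 3 nearest neighbors' values
```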
Assumptions in KNN
KNN assumes that the data lies in a feature space; more exactly, the data points lie in a metric space. The
data can be scalars or multidimensional vectors. Since the points are in a feature space, they have a notion
of distance. This need not be Euclidean distance, although that is the one most commonly used.
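A minimal sketch of such a distance function, using Euclidean distance (any other metric could be substituted):

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two feature vectors p and q."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Example: distance between two 2-D points.
print(euclidean_distance((1, 2), (4, 6)))   # 5.0
```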
Each training example consists of a feature vector and a class label associated with that vector. In the
simplest case the label is either + or - (for the positive or negative class), but KNN works equally well
with an arbitrary number of classes.
We are also given a single number "k". This number decides how many neighbors (where neighbors are
defined by the distance metric) influence the classification. It is usually an odd number when the number
of classes is 2. If k = 1, the algorithm is simply called the nearest neighbor algorithm.
Let us see how to use KNN for classification. In this case, we are given some labeled data points for
training and a new unlabeled data point for testing. Our aim is to find the class label for the new point.
The algorithm behaves differently depending on k.
The simplest scenario is k = 1. Let x be the point to be labeled. Find the point closest to x and call it y.
The nearest neighbor rule then assigns the label of y to x. This seems too simplistic and sometimes even
counter-intuitive. If you feel that this procedure will result in a huge error, you are right, but there is a
catch: this reasoning holds only when the number of data points is not very large.
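A minimal sketch of this nearest neighbor rule, on made-up labeled points:

```python
import math

def nearest_neighbor_label(x, training_data):
    """1-NN rule: return the label of the single closest training point.

    training_data is a list of (point, label) pairs.
    """
    _, y_label = min(training_data, key=lambda pair: math.dist(x, pair[0]))
    return y_label

# Hypothetical labeled points: two negative (-) and two positive (+) examples.
data = [((1, 1), '-'), ((2, 2), '-'), ((6, 7), '+'), ((7, 6), '+')]
print(nearest_neighbor_label((6, 6), data))   # '+', copied from the closest point (6, 7)
```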
If the number of data points is very large, then there is a very high chance that the labels of x and y are the
same. An example might help: suppose you have a (potentially) biased coin. You toss it one million times
and get heads 900,000 times. Then your next call will most likely be heads. We can use a similar argument
here.
Let me try an informal argument. Assume all points lie in a D-dimensional space and the number of points
is reasonably large. This means that the density of points is fairly high everywhere; in other words, within
any subspace there is an adequate number of points. Consider a point x in such a subspace, which
therefore has many neighbors, and let y be its nearest neighbor. If x and y are sufficiently close, then we
can assume that the probability of x and y belonging to the same class is fairly high, and by decision
theory x and y then receive the same class.
The book "Pattern Classification" by Duda and Hart has an excellent discussion about this Nearest
Neighbor rule. One of their striking results is to obtain a fairly tight error bound to the Nearest Neighbor
rule. The bound is
Where is the Bayes error rate, c is the number of classes and P is the error rate of Nearest Neighbor. The
result is indeed very striking (atleast to me) because it says that if the number of points is fairly large then
the error rate of Nearest Neighbor is less that twice the Bayes error rate. Pretty cool for a simple algorithm
like KNN.
This is a straightforward extension of 1NN: we find the k nearest neighbors and take a majority vote.
Typically k is odd when the number of classes is 2. Let us say k = 5, and among the 5 nearest neighbors
there are 3 instances of class C1 and 2 instances of class C2. In this case, KNN says that the new point has
to be labeled C1, as C1 forms the majority. We follow a similar argument when there are multiple classes.
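A minimal from-scratch sketch of this majority-vote rule, mirroring the k = 5 example above on made-up data:

```python
import math
from collections import Counter

def knn_classify(x, training_data, k=3):
    """Classify x by a majority vote among its k nearest training points."""
    neighbors = sorted(training_data, key=lambda pair: math.dist(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical data: 3 points of class C1 near the query, 2 of class C2 further away.
data = [((5, 5), 'C1'), ((6, 7), 'C1'), ((7, 6), 'C1'),
        ((1, 1), 'C2'), ((2, 2), 'C2')]
print(knn_classify((6, 6), data, k=5))   # 'C1': 3 votes vs. 2
```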
One straightforward extension is not to give every neighbor an equal vote of 1. A very common approach
is weighted kNN, where each point has a weight that is typically computed from its distance. For example,
under inverse distance weighting, each point has a weight equal to the inverse of its distance to the point
being classified. This means that nearby points get a higher vote than farther points.
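A minimal sketch of inverse distance weighting; the small epsilon is an assumed implementation detail that guards against division by zero when the query coincides with a training point:

```python
import math
from collections import defaultdict

def weighted_knn_classify(x, training_data, k=3, eps=1e-9):
    """Distance-weighted k-NN: each of the k neighbors votes with weight 1/distance."""
    neighbors = sorted(training_data, key=lambda pair: math.dist(x, pair[0]))[:k]
    weights = defaultdict(float)
    for point, label in neighbors:
        weights[label] += 1.0 / (math.dist(x, point) + eps)   # eps avoids division by zero
    return max(weights, key=weights.get)

# Hypothetical data: the C1 points are closer to the query, so they get larger votes.
data = [((5, 5), 'C1'), ((6, 7), 'C1'), ((1, 1), 'C2'), ((2, 2), 'C2')]
print(weighted_knn_classify((6, 6), data, k=3))   # 'C1'
```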
It is quite obvious that the accuracy *might* increase when you increase k, but the computational cost also
increases.
There are also some nice techniques, such as condensing, search trees, and partial distance, that try to
reduce the time taken to find the k nearest neighbors.
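As one illustration of the search-tree idea, a k-d tree (here via SciPy's cKDTree, assuming SciPy is installed) indexes the training points once so that the nearest neighbors can be found without a full linear scan:

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical training points; the k-d tree is built once, up front.
points = np.array([[1, 1], [2, 2], [5, 5], [6, 7], [7, 6], [8, 8]])
tree = cKDTree(points)

# Find the 3 nearest neighbors of (6, 6) without scanning every point.
distances, indices = tree.query([6, 6], k=3)
print(indices, distances)   # indices of the 3 closest points and their distances
```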
Applications of KNN
KNN is a versatile algorithm and is used in a huge number of fields. Let us take a look at a few uncommon
and non-trivial applications.
1. Nearest Neighbor based Content Retrieval
This is one of the fascinating applications of KNN. We can use it in Computer Vision for many tasks: you
can consider handwriting detection as a rudimentary nearest neighbor problem. The problem becomes
more fascinating if the content is a video: given a video, find the video closest to the query in the database.
Although this sounds abstract, it has a lot of practical applications. For example, consider ASL (American
Sign Language), in which communication is done using hand gestures. Say we want to prepare a dictionary
for ASL so that a user can query it by performing a gesture. The problem then reduces to finding the
(possibly k) closest gesture(s) stored in the database and showing them to the user. At its heart it is nothing
but a KNN problem. One of the professors from my department, Vassilis Athitsos, does research in this
interesting topic; see Nearest Neighbor Retrieval and Classification for more details.
2. Gene Expression
This is another cool area where many a time, KNN performs better than other state of the art techniques . In
fact a combination of KNN-SVM is one of the most popular techniques there. This is a huge topic on its
own and hence I will refrain from talking much more about it.
Pros:
KNN is simple to implement, has no explicit training phase, works with an arbitrary number of classes,
and can be used for both classification and regression.
Cons:
The testing phase is costly in both time and memory, since all training data must be stored and, in the
worst case, every point takes part in the decision; the result also depends on the choice of k and of the
distance metric.
3.10 Algorithm
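In outline, the procedure for the problem statement is: (1) compute the distance from the query point (6, 6) to every training point; (2) select the k = 3 nearest points; (3) for plain k-NN, take a majority vote of their labels; (4) for Distance-Weighted k-NN, let each neighbor vote with weight 1/distance; (5) for Locally Weighted Averaging, take the distance-weighted average of the neighbors' values (+1 for positive, -1 for negative) and read the class off its sign. The sketch below follows these steps. Since the diagram is not reproduced in this manual, the coordinates used are placeholder assumptions only and must be replaced with the actual blue-circle and orange-square points from the figure.

```python
import math
from collections import Counter, defaultdict

# NOTE: the real coordinates come from the assignment diagram, which is not
# reproduced here. The points below are placeholders only; replace them with
# the actual blue-circle (positive) and orange-square (negative) points.
positive = [(5, 5), (6, 7), (7, 6)]    # hypothetical blue circles (+1)
negative = [(1, 2), (2, 1), (3, 3)]    # hypothetical orange squares (-1)
training = [(p, +1) for p in positive] + [(p, -1) for p in negative]

query, k = (6, 6), 3
neighbors = sorted(training, key=lambda pair: math.dist(query, pair[0]))[:k]

# Step 3: plain k-NN, a simple majority vote among the k nearest points.
votes = Counter(label for _, label in neighbors)
print("k-NN class:", votes.most_common(1)[0][0])

# Step 4: distance-weighted k-NN, each neighbor votes with weight 1/distance.
weights = defaultdict(float)
for point, label in neighbors:
    weights[label] += 1.0 / (math.dist(query, point) + 1e-9)
print("Distance-weighted k-NN class:", max(weights, key=weights.get))

# Step 5: locally weighted averaging, the distance-weighted mean of the labels
# (+1 / -1); the sign of the average gives the predicted class.
num = sum(label / (math.dist(query, point) + 1e-9) for point, label in neighbors)
den = sum(1.0 / (math.dist(query, point) + 1e-9) for point, label in neighbors)
avg = num / den
print("Locally weighted average:", avg, "->", "positive" if avg >= 0 else "negative")
```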
3.11 Conclusion
In this way, we have learned to use KNN classification, both plain and distance-weighted, to predict
whether a given data point is positive or negative.