Aitee (Notes) KNN
KNN:
Introduction
The K-nearest neighbours (KNN) algorithm is a supervised ML algorithm that can be used for both
classification and regression problems. In industry, however, it is mainly used for classification.
The following two properties characterize KNN well −
Lazy learning algorithm − KNN is a lazy learning algorithm because it has no specialized training
phase; it simply stores the training data and uses all of it at classification time.
Non-parametric learning algorithm − KNN is also a non-parametric learning algorithm because it doesn't
assume anything about the distribution of the underlying data.
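The lazy-learning property can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the class name `LazyKNNRegressor` is hypothetical, Euclidean distance is assumed, and averaging the targets of the K nearest points is assumed as the regression rule (a common choice that these notes do not spell out). Note that `fit` does no computation at all; all the work happens in `predict`.

```python
import math


class LazyKNNRegressor:
    """Hypothetical minimal KNN regressor, used only to illustrate lazy learning."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, rows, targets):
        # "Training" only stores the data; no model parameters are estimated.
        self.rows = list(rows)
        self.targets = list(targets)
        return self

    def predict(self, query):
        # All the work is deferred to prediction time:
        # rank every stored row by its Euclidean distance to the query.
        ranked = sorted(
            zip(self.rows, self.targets),
            key=lambda pair: math.dist(pair[0], query),
        )
        nearest_targets = [target for _, target in ranked[: self.k]]
        # Assumed regression rule: plain average of the K nearest targets.
        return sum(nearest_targets) / len(nearest_targets)


# Toy, made-up data with one feature per row.
model = LazyKNNRegressor(k=2).fit(
    [(1.0,), (2.0,), (3.0,), (10.0,)],
    [10.0, 20.0, 30.0, 100.0],
)
print(model.predict((2.4,)))  # average of the targets of the rows 2.0 and 3.0
```

Because nothing is learned in advance, every prediction has to scan the whole stored training set.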
The K-nearest neighbours (KNN) algorithm uses 'feature similarity' to predict the values of new data points,
which means that a new data point is assigned a value based on how closely it matches the
points in the training set. We can understand its working with the help of the following steps −
Step 1 − Any algorithm needs a dataset, so in the first step of KNN we must
load the training as well as the test data.
Step 2 − Next, we need to choose the value of K, i.e. the number of nearest data points to consider. K can be any positive integer.
Step 3 − For each point in the test data, do the following −
3.1 − Calculate the distance between the test point and each row of the training data using one of
the following methods: Euclidean, Manhattan or Hamming distance. The most commonly used
method to calculate distance is Euclidean.
3.2 − Now, sort the rows in ascending order of distance.
3.3 − Next, choose the top K rows from the sorted array.
3.4 − Now, assign a class to the test point based on the most frequent class among these rows.
Step 4 − End
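The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names (`euclidean`, `manhattan`, `hamming`, `knn_classify`) and the toy data are hypothetical, and ties in the vote are resolved by whichever tied class appears first among the nearest rows.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two numeric feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def knn_classify(train_rows, train_labels, query, k=3, distance=euclidean):
    # Step 3.1: distance from the query point to every training row.
    distances = [
        (distance(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    # Step 3.2: sort in ascending order of distance.
    distances.sort(key=lambda pair: pair[0])
    # Step 3.3: keep the top K rows.
    top_k = distances[:k]
    # Step 3.4: assign the most frequent class among those K rows.
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]


# Steps 1 and 2: load (here, made-up) training data and choose K.
train_rows = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
train_labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(train_rows, train_labels, (2, 2), k=3))
print(knn_classify(train_rows, train_labels, (7, 7), k=3, distance=manhattan))
```

Passing the distance as a parameter keeps step 3.1 interchangeable; choosing an odd K is a common way to avoid tied votes in two-class problems.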
Class: EEE D2 Semester: VII
Course: AI Techniques in Electrical Engineering Course Code: 20EE E43
Example
The following is an example to understand the concept of K and the working of the KNN algorithm −
Now, we need to classify the new data point shown as a black dot (at point 60,60) into the blue or the red class. We
assume K = 3, i.e. the algorithm finds the three nearest data points. This is shown in the next diagram −
In the above diagram we can see the three nearest neighbours of the data point marked with the black dot. Among
those three, two lie in the red class, hence the black dot is also assigned to the red class.
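The same vote can be reproduced numerically. Since the diagram's actual coordinates are not available in these notes, the points below are hypothetical values chosen only so that two of the three nearest neighbours of (60, 60) are red; only the query point and K = 3 come from the example itself.

```python
import math
from collections import Counter

# Hypothetical labelled points (NOT the coordinates from the diagram).
points = [
    ((55, 62), "red"),
    ((63, 57), "red"),
    ((58, 66), "blue"),
    ((20, 35), "blue"),
    ((30, 25), "blue"),
    ((85, 80), "red"),
]

query = (60, 60)  # the black dot
k = 3

# Sort by Euclidean distance to the query and keep the K nearest points.
nearest = sorted(points, key=lambda p: math.dist(p[0], query))[:k]

# Majority vote among the K nearest labels.
votes = Counter(label for _, label in nearest)
predicted = votes.most_common(1)[0][0]

for coords, label in nearest:
    print(coords, label, round(math.dist(coords, query), 2))
print("Predicted class:", predicted)
```

With these made-up points the three nearest neighbours are two red points and one blue point, so the black dot is assigned to the red class, matching the reasoning in the example.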
Pros and Cons of KNN
Pros
It is very useful for nonlinear data because the algorithm makes no assumption about the data.
It can achieve relatively high accuracy, although there are supervised learning models that perform much better than KNN.
Cons
It is a computationally somewhat expensive algorithm because it stores all the training data and must compute the distance to every training point at prediction time.
Applications of KNN
The following are some of the areas in which KNN can be applied successfully −
Banking System
KNN can be used in a banking system to predict whether an individual is fit for loan approval, i.e. whether that
individual has characteristics similar to those of defaulters.
KNN algorithms can also be used to find an individual's credit rating by comparing the individual with persons having
similar traits.
Politics
With the help of KNN algorithms, we can classify a potential voter into various classes such as “Will
Vote”, “Will not Vote”, “Will Vote to Party ‘Congress’” and “Will Vote to Party ‘BJP’”.
Other areas in which the KNN algorithm can be used are speech recognition, handwriting detection,
image recognition and video recognition.
KNN
Video Link: https://www.youtube.com/watch?v=HVXime0nQeI&t=78s