14 K-Nearest Neighbours
K-nearest neighbors
● K-Nearest Neighbours is commonly abbreviated as KNN.
● KNN is one of the simplest supervised machine learning algorithms.
● KNN stores all the available data and classifies new data points based on similarity.
● It can be used for both regression and classification tasks, but it is most often used for classification.
● It is applied in pattern recognition, intrusion detection and data mining.
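The idea above can be sketched from scratch in a few lines of Python. The training points and labels here are made-up toy data, and `knn_predict` is a hypothetical helper name; this is a minimal sketch of the store-everything, majority-vote approach, not a production implementation.

```python
import math
from collections import Counter

# Toy training data: (feature vector, class label). Points are hypothetical.
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
         ((5.0, 5.0), "B"), ((6.0, 5.5), "B")]

def euclidean(p, q):
    # Straight-line distance between two points of equal dimension.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_predict(point, k=3):
    # Sort the stored points by distance to the query, take the k nearest,
    # and return the majority class among them.
    nearest = sorted(train, key=lambda item: euclidean(item[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn_predict((1.2, 1.5)))  # near the "A" cluster -> "A"
print(knn_predict((5.5, 5.0)))  # near the "B" cluster -> "B"
```

Note that "training" here is just storing the data; all the work happens at prediction time.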
Why do we need K-NN?
K-Nearest Neighbours is a simple algorithm that stores all available data and classifies new cases based on a similarity measure: new data points are labelled according to their distance from the stored points.
For example:
○ A very small value such as K=1, K=2 or K=3 makes the model sensitive to outliers and noisy data.
○ A large value can smooth out noise, but it may also cause difficulties by pulling in points from other classes.
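The effect of K on outliers can be shown with a small sketch. The points below are made up so that one mislabelled "B" point sits inside the "A" cluster; with K=1 that single outlier decides the prediction, while K=3 lets the majority outvote it. This is an illustrative toy, not a rule for choosing K.

```python
import math
from collections import Counter

# Hypothetical 2-D points; the last "B" is a noisy outlier sitting
# inside the "A" cluster.
train = [((1.0, 1.0), "A"), ((1.2, 1.1), "A"), ((0.9, 1.3), "A"),
         ((5.0, 5.0), "B"), ((5.5, 5.2), "B"), ((1.1, 1.2), "B")]

def predict(point, k):
    # math.dist (Python 3.8+) computes the Euclidean distance.
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

query = (1.05, 1.15)
print(predict(query, k=1))  # "B": the single nearest point is the outlier
print(predict(query, k=3))  # "A": the majority of 3 neighbours outvotes it
```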
Advantages of K-NN.
● It is a versatile algorithm that can be used for both regression and classification tasks.
● It is simple to implement and intuitive to learn.
● It has a single hyperparameter (K), which makes the model easy to tune.
● There is no training time: classification simply tags a new data entry based on the stored historical data.
● There is a variety of distance criteria to choose from, such as Euclidean, Manhattan and Minkowski distance.
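The three distance criteria listed above are related: Manhattan (p=1) and Euclidean (p=2) are both special cases of the Minkowski distance. A minimal sketch, using an illustrative pair of points:

```python
# Minkowski distance: p = 1 gives Manhattan, p = 2 gives Euclidean.
def minkowski(a, b, p):
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a, b = (0.0, 0.0), (3.0, 4.0)
print(minkowski(a, b, 1))  # Manhattan: |3| + |4| = 7.0
print(minkowski(a, b, 2))  # Euclidean: sqrt(9 + 16) = 5.0
```

Which metric works best depends on the data, which is why having several to choose from is an advantage.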
Disadvantages of K-NN.
● KNN does not work well with large datasets, because computing the distance to every stored point is very costly.
● It does not work well with high-dimensional data.
● It is very sensitive to noisy data and missing values.
● Data needs to be normalized and standardized properly, because features on larger scales dominate the distance.
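The last point can be sketched with min-max scaling, which maps each feature to [0, 1]. The data below is a hypothetical age/income table where the raw income column would otherwise swamp age in any distance calculation; `min_max_scale` is an assumed helper name, not a library function.

```python
# Two features on very different scales: income (in currency units)
# dominates age in a raw Euclidean distance, so each column is
# min-max scaled to the range [0, 1] before running KNN.
data = [(25, 50000), (30, 52000), (60, 51000)]

def min_max_scale(rows):
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) for v, l, h in zip(row, lo, hi))
            for row in rows]

scaled = min_max_scale(data)
print(scaled)  # every feature now lies between 0.0 and 1.0
```

After scaling, both features contribute comparably to the distance; standardization (zero mean, unit variance) is a common alternative.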