3 KNN
Supervised Learning
Algorithms
• K-nearest neighbors (KNN)
• Decision Trees
• Linear Regression
• Logistic Regression
• Random Forest
• Gradient Boosting
Nearest Neighbors
Suppose we’re given a novel input vector x we’d like to classify.
The idea: find the nearest input vector to x in the training set and copy
its label.
Can formalize “nearest” in terms of Euclidean distance:

\|x^{(a)} - x^{(b)}\|_2 = \sqrt{\sum_{j=1}^{d} \left(x_j^{(a)} - x_j^{(b)}\right)^2}
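For concreteness, a minimal NumPy sketch of this distance (the function name euclidean_distance is just for illustration):

```python
import numpy as np

def euclidean_distance(x_a, x_b):
    # square root of the sum of squared coordinate-wise differences
    return np.sqrt(np.sum((x_a - x_b) ** 2))

# Example: the distance between (1, 2, 3) and (4, 6, 3) is 5.
print(euclidean_distance(np.array([1.0, 2.0, 3.0]),
                         np.array([4.0, 6.0, 3.0])))
```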
Algorithm:
1. Find the example (x^*, t^*) (from the stored training set) closest to x. That is:
   x^* = \operatorname{argmin}_{x^{(i)} \in \text{train. set}} \operatorname{distance}(x^{(i)}, x)
2. Output y = t^* (see the sketch below)
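A minimal sketch of this nearest-neighbor rule in NumPy (the array names X_train and t_train are assumptions for illustration):

```python
import numpy as np

def nn_classify(X_train, t_train, x):
    """Return the label of the stored training example closest to x (1-NN)."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every training row
    return t_train[np.argmin(dists)]             # copy the label of the closest example
```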
Algorithm (kNN):
1. Find the k examples {x^{(i)}, t^{(i)}} closest to the test instance x
2. Classification output is the majority class (see the sketch below):
   y = \operatorname{argmax}_{t^{(z)}} \sum_{i=1}^{k} \mathbb{I}\big(t^{(z)} = t^{(i)}\big)
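A corresponding sketch of the kNN majority vote, again assuming the training set is stored as arrays X_train and t_train (ties are broken arbitrarily here):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, t_train, x, k=3):
    """Majority vote over the k training examples closest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training example
    nearest = np.argsort(dists)[:k]               # indices of the k closest examples
    votes = Counter(t_train[i] for i in nearest)  # count the labels among them
    return votes.most_common(1)[0][0]             # most frequent label wins
```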
Tradeoffs in choosing k?
Small k
• Good at capturing fine-grained patterns
• May overfit, i.e. be sensitive to random idiosyncrasies in the training data
Large k
• Makes stable predictions by averaging over lots of examples
• May underfit, i.e. fail to capture important regularities
Balancing k
• The optimal choice of k depends on the number of data points n.
• Nice theoretical properties if k → ∞ and k/n → 0.
• Rule of thumb: choose k < √n.
• We can choose k using a validation set (next slides); a sketch follows below.
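One way to pick k with a held-out validation set, sketched below (this reuses the knn_classify sketch above; the grid of candidate k values is an assumption):

```python
import numpy as np

def choose_k(X_train, t_train, X_val, t_val, k_values):
    """Return the k with the highest accuracy on the validation set."""
    best_k, best_acc = None, -1.0
    for k in k_values:
        preds = np.array([knn_classify(X_train, t_train, x, k) for x in X_val])
        acc = np.mean(preds == t_val)             # fraction of validation labels matched
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k

# e.g. choose_k(X_train, t_train, X_val, t_val, k_values=range(1, 20, 2))
```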
KNN: Computational Cost
• STATLOG project
• Four heat-map images: two in the visible spectrum and two in the infrared, for an area of agricultural land in Australia.
• Labels = {red soil, cotton, vegetation stubble, mixture, gray soil, damp gray soil, very damp gray soil}
• Task: classify the land usage at a pixel, based on the information in the four spectral bands
[Figure: Spectral Band 4, Land Usage, Predicted Land Usage]
KNN: Example (1)
STATLOG results
• Among many methods, including LVQ, CART, neural networks, discriminant analysis and many others, k-nearest-neighbors performed best on this task.
• DANN is a variant of k-nearest neighbors, using an adaptive metric.
[Figure: STATLOG test error rates (roughly 0.0–0.15) by method, including NewID, C4.5, CART, ALLOC80, neural networks, K-NN and DANN, with K-NN and DANN among the lowest.]
Tangent distance
• The transformed images of a “3” trace out a one-dimensional curve in ℝ^256 (256-dimensional image space).
• The red line is the tangent line to the curve at the original image, with some “3”s on this tangent line (at α = −0.2, −0.1, 0, 0.1, 0.2) and its equation shown.
• Rather than using the usual Euclidean distance between the two images, we use the shortest distance between the two curves; the tangent distance approximates this by the distance between their tangent lines.
[Figure: transformations of x_i; Euclidean distance between transformed x_i and x_i′ vs. tangent distance.]
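As a rough illustration of the idea (a sketch, not the exact procedure from the slides): suppose we already have tangent-direction matrices T_i and T_j whose columns span the tangent lines/planes at x_i and x_j (obtaining them, e.g. by numerically differentiating the image with respect to small rotations, is assumed). The shortest distance between the two tangent spaces is then a linear least-squares problem:

```python
import numpy as np

def tangent_distance(x_i, x_j, T_i, T_j):
    """Shortest Euclidean distance between the affine tangent spaces
    {x_i + T_i a} and {x_j + T_j b}; T_i, T_j have one column per transformation."""
    A = np.hstack([T_i, -T_j])                        # unknowns are the coefficients (a, b)
    rhs = x_j - x_i
    coeffs, *_ = np.linalg.lstsq(A, rhs, rcond=None)  # minimise ||A c - rhs|| over c = (a, b)
    return np.linalg.norm(A @ coeffs - rhs)           # residual norm = tangent distance
```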