Lec03 Classifiers KNN+DT
Classifiers
Aug 2024
Vineeth N Balasubramanian
Classification Methods
• k-Nearest Neighbors
• Decision Trees
• Naïve Bayes
• Support Vector Machines
• Logistic Regression
• Neural Networks
• Ensemble Methods (Boosting, Random Forests)
[Figure: k-NN classification of a new test record. Compute the distance from the test record to every training record; with K = 1 the single nearest neighbor is blue, with K = 3 the majority of the three nearest neighbors is green.]
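A minimal sketch of the computation behind this figure, assuming NumPy; the function name knn_predict and the toy points are hypothetical, chosen so that K = 1 and K = 3 give different answers, as in the figure.

import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the label of x_new by majority vote among its k nearest training records."""
    # Euclidean distance from the test record to every training record
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training records
    nearest = np.argsort(dists)[:k]
    # Majority vote among the neighbors' labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D data (hypothetical): the single nearest point is blue,
# but two of the three nearest points are green
X_train = np.array([[0.5, 0.0], [3.0, 3.0], [0.0, 0.8], [0.8, 0.0]])
y_train = np.array(["blue", "blue", "green", "green"])
x_new = np.array([0.0, 0.0])

print(knn_predict(X_train, y_train, x_new, k=1))  # -> blue
print(knn_predict(X_train, y_train, x_new, k=3))  # -> green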
Possible to show that, as the size of the training data set approaches infinity, the one-nearest-neighbor classifier guarantees an error rate no worse than twice the Bayes error rate (the minimum achievable error rate given the distribution of the data). We will see this later.

Assume the nearest neighbor x_1 of x is close enough that P(Y | x_1) ≈ P(Y | x), and let y* = argmax_y Pr(y | x). Then the asymptotic 1-NN error rate is

1 − Pr(y* | x)² − Σ_{y' ≠ y*} Pr(Y = y' | x)²  ≤  2(1 − Pr(y* | x))  =  2 × (Bayes optimal error rate)
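As an illustrative numerical check (numbers assumed, not from the slides): in a binary problem where Pr(y* | x) = 0.9 for every x, the Bayes error rate is 0.1, and the asymptotic 1-NN error is 1 − 0.9² − 0.1² = 0.18, which is indeed at most 2 × 0.1 = 0.2.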
Decision Trees
• An efficient nonparametric method
• A hierarchical model
• A divide-and-conquer strategy
[Figure: a decision tree with internal decision nodes and leaf nodes]
Source: Ethem Alpaydin, Introduction to Machine Learning, 3rd Edition (Slides)
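A compact Python sketch of the hierarchical, divide-and-conquer idea, assuming a node either tests one feature against a threshold or acts as a leaf holding a class label; the Node class, predict function, and the toy tree are hypothetical illustrations, not the slides' implementation.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Internal node: (feature, threshold, left, right); leaf node: label only
    feature: Optional[int] = None      # index of the feature tested at this node
    threshold: Optional[float] = None  # split point for that feature
    left: Optional["Node"] = None      # subtree for feature value <= threshold
    right: Optional["Node"] = None     # subtree for feature value > threshold
    label: Optional[str] = None        # class label if this node is a leaf

def predict(node: Node, x) -> str:
    # Divide and conquer: each test routes the record into exactly one subtree
    if node.label is not None:         # reached a leaf
        return node.label
    if x[node.feature] <= node.threshold:
        return predict(node.left, x)
    return predict(node.right, x)

# Hypothetical two-level tree
tree = Node(feature=0, threshold=2.5,
            left=Node(label="blue"),
            right=Node(feature=1, threshold=1.0,
                       left=Node(label="green"),
                       right=Node(label="blue")))
print(predict(tree, [3.0, 0.5]))       # -> green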
Entropy in information theory specifies the average (expected) amount of information derived from observing an event.
Source: Ethem Alpaydin, Introduction to Machine Learning, 3rd Edition (Slides)
Total impurity after splitting node m into branches j = 1, …, n, where N_mj of the N_m instances at node m take branch j and p^i_mj = N^i_mj / N_mj is the fraction of those belonging to class C_i:

I′_m = − Σ_{j=1}^{n} (N_mj / N_m) Σ_{i=1}^{K} p^i_mj log₂ p^i_mj
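A short Python sketch, assuming NumPy, of how the entropy of a node's class distribution and the split impurity above could be computed; the function names and the toy counts are hypothetical.

import numpy as np

def entropy(p):
    # Entropy -sum_i p_i log2 p_i of a probability vector (0 log 0 treated as 0)
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def split_impurity(counts):
    # counts[j][i] = number of class-i instances that take branch j of the split
    counts = np.asarray(counts, dtype=float)
    N_m = counts.sum()                        # instances reaching node m
    weights = counts.sum(axis=1) / N_m        # N_mj / N_m for each branch j
    child_entropies = np.array([entropy(row / row.sum()) for row in counts])
    return float(np.sum(weights * child_entropies))

print(entropy([0.5, 0.5]))               # 1.0 bit: maximally impure two-class node
print(entropy([1.0, 0.0]))               # 0.0: pure node
print(split_impurity([[8, 2], [1, 9]]))  # impurity after a hypothetical binary split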