T07 IDS - Classification

Uploaded by

Tanjimul Hasan Sohan


Classification Algorithms

• k-Nearest Neighbour (KNN)
• Decision Tree
• Naïve Bayes

Introduction to classification: k-Nearest Neighbour

• Mainly used when all attribute values are continuous

• It can be modified to deal with categorical attributes

• The idea is to estimate the classification of an unseen instance using the classification of the instance or instances that are closest to it, in some sense that we need to define (that is, new cases are classified based on a similarity measure)
Nearest Neighbour

What should its classification be? Even without knowing what the six attributes
represent, it seems intuitively obvious that the unseen instance is nearer to the
first instance than to the second.
Nearest Neighbour

• A training set with 20 instances, each giving the values of two attributes and an associated classification
• How can we estimate the classification for an ‘unseen’ instance where the first and second attributes are 9.1 and 11.0, respectively?
Nearest Neighbour

• A circle has been added to enclose the five nearest neighbours of the unseen instance, which is shown as a small circle close to the centre of the larger one.
• The five nearest neighbours are labelled with three + signs and two − signs.
• So a basic 5-NN classifier would classify the unseen instance as ‘positive’ by a form of majority voting.
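
The majority vote described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `knn_classify` and the toy training data are hypothetical (they are not the 20-instance training set from the slides), and Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter

def knn_classify(training, unseen, k=5):
    """Classify `unseen` by majority vote among its k nearest training instances.

    `training` is a list of (attributes, label) pairs. Euclidean distance is
    assumed here; math.dist needs Python 3.8 or later.
    """
    # Sort by distance to the unseen instance and keep the k closest.
    nearest = sorted(training, key=lambda item: math.dist(item[0], unseen))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common(1) returns the label with the most votes.
    return votes.most_common(1)[0][0]

# Hypothetical toy data, chosen so that the five nearest neighbours of
# (9.1, 11.0) are three '+' and two '-' instances.
training = [
    ((8.8, 10.5), "+"), ((9.4, 11.3), "+"), ((9.0, 11.6), "+"),
    ((9.9, 10.6), "-"), ((8.5, 11.2), "-"),
    ((2.0, 3.0), "-"), ((15.0, 4.0), "-"), ((3.0, 18.0), "+"),
]

print(knn_classify(training, (9.1, 11.0), k=5))  # +
```

Choosing an odd k for a two-class problem avoids tied votes; with more classes a tie-breaking rule would have to be decided separately.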
Distance Measures: Euclidean Distance

• If we denote an instance in the training set by (a1, a2) and the unseen instance by (b1, b2), the length of the straight line joining the points is

$$\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

• If there are two points (a1, a2, a3) and (b1, b2, b3) in a three-dimensional space, the corresponding formula is

$$\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2}$$

• The formula for Euclidean distance between points (a1, a2, . . . , an) and (b1, b2, . . . , bn) in n-dimensional space is a generalisation of these two results. The Euclidean distance is given by the formula

$$\sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$$
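
As a small illustration of the n-dimensional Euclidean distance, here is a minimal Python sketch. The function name `euclidean_distance` is an assumption made for this example.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of attributes")
    # Sum the squared differences attribute by attribute, then take the root.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

print(euclidean_distance((0, 0), (3, 4)))        # 5.0
print(euclidean_distance((1, 2, 2), (0, 0, 0)))  # 3.0
```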
Estimating the Predictive Accuracy of a Classifier

• Any algorithm which assigns a classification to unseen instances is called a classifier.

• Predictive accuracy: the proportion of a set of unseen instances that the classifier correctly classifies.
Estimating the Predictive Accuracy of a Classifier

Three main strategies (training/test):

• Dividing the data into training and test set
• K-fold cross validation
• N-fold (leave-one-out) cross-validation
Method 1: Separate Training and Test Sets

The data is split into two parts, called a training set and a test set.

The training set is used to construct a classifier.

The classifier is then used to predict the classification of the instances in the test set.
If the test set contains N instances, of which C are correctly classified, the predictive accuracy is P = C/N.
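
The calculation P = C/N can be sketched as follows. The function name `predictive_accuracy` and the label lists are hypothetical and serve only to illustrate the arithmetic.

```python
def predictive_accuracy(predicted, actual):
    """Return P = C / N for a test set of N instances, C of them correct."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))  # C
    return correct / len(actual)                              # C / N

# Hypothetical test-set labels: 4 of the 5 predictions match.
actual = ["+", "-", "+", "+", "-"]
predicted = ["+", "-", "-", "+", "-"]
print(predictive_accuracy(predicted, actual))  # 0.8
```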
Method 2: K-fold Cross Validation

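The dataset comprises N instances.

It is divided into k equal parts, k typically being a small number such as 5 or 10.

Each of the k parts in turn is used as a test set, and the other k − 1 parts are used as a training set.
The total number of instances correctly classified over all k runs, divided by N, gives the overall estimate of predictive accuracy.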
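
A minimal sketch of k-fold cross-validation is shown below. The names `k_fold_accuracy` and `one_nn` are assumptions for this example, the 1-nearest-neighbour classifier is only a stand-in for whatever classifier is being evaluated, and the data is a made-up toy set that is assumed to be already shuffled.

```python
import math

def k_fold_accuracy(data, k, train_and_predict):
    """Estimate predictive accuracy by k-fold cross-validation.

    `data` is a list of (attributes, label) pairs and
    `train_and_predict(train, attrs)` returns a predicted label.
    """
    correct = 0
    for fold in range(k):
        # Every k-th instance starting at `fold` forms the test set;
        # the remaining instances form the training set.
        test = data[fold::k]
        train = [item for i, item in enumerate(data) if i % k != fold]
        correct += sum(train_and_predict(train, attrs) == label
                       for attrs, label in test)
    # Total correct over all k runs, divided by N.
    return correct / len(data)

def one_nn(train, attrs):
    """Stand-in classifier: label of the single nearest training instance."""
    return min(train, key=lambda item: math.dist(item[0], attrs))[1]

# Hypothetical, well-separated one-attribute data.
data = ([((float(x),), "low") for x in range(5)]
        + [((float(x),), "high") for x in range(10, 15)])
print(k_fold_accuracy(data, 5, one_nn))  # 1.0
```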
Method 3: N-fold Cross Validation

• N-fold cross-validation is an extreme case of k-fold cross-validation, with k = N.

• It is often known as ‘leave-one-out’ cross-validation or jack-knifing.

• The dataset is divided into as many parts as there are instances, each instance effectively forming a test set of one.

• This gives N individual results e1, e2, …, eN; take their average.

• It is computationally expensive, but every instance is used for testing exactly once, with all the other instances used for training.
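
Leave-one-out cross-validation can be sketched as below. The names `leave_one_out_accuracy` and `one_nn` are assumptions for this example, the 1-nearest-neighbour classifier is only a stand-in, and the data is a made-up toy set in which one instance sits closer to the other class.

```python
import math

def leave_one_out_accuracy(data, train_and_predict):
    """N-fold cross-validation: each instance in turn is a test set of one."""
    outcomes = []
    for i, (attrs, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # all instances except the i-th
        # Record 1 for a correct classification, 0 otherwise.
        outcomes.append(1 if train_and_predict(train, attrs) == label else 0)
    return sum(outcomes) / len(outcomes)  # average of the N outcomes

def one_nn(train, attrs):
    """Stand-in classifier: label of the single nearest training instance."""
    return min(train, key=lambda item: math.dist(item[0], attrs))[1]

# Hypothetical one-attribute data; the instance at 6.0 is labelled "high"
# but its nearest neighbour (4.0) is "low", so it is misclassified.
data = ([((float(x),), "low") for x in range(5)]
        + [((6.0,), "high")]
        + [((float(x),), "high") for x in range(10, 15)])
print(leave_one_out_accuracy(data, one_nn))  # 10 of 11 correct
```

The classifier is rebuilt N times, which is why the method becomes costly for large datasets.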

Experimental Results - I

• Predictive accuracy of classifiers generated for four datasets.

• All the results in this section were obtained using the TDIDT tree induction algorithm, with information gain used for attribute selection.
Experimental Results - I

• The results below were obtained using 10-fold and N-fold cross-validation for the four datasets.
Confusion Matrix

• As well as the overall predictive accuracy on unseen instances it is often helpful to see a breakdown of the
classifier’s performance, i.e. how frequently instances of class X were correctly classified as class X or
misclassified as some other class.
• This information is given in a confusion matrix.
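
A confusion matrix can be built by counting (actual, predicted) pairs, as in this minimal sketch. The function name `confusion_matrix` and the label lists are hypothetical; rows are taken to be the actual class and columns the predicted class, which is an assumption about layout.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Return (labels, matrix) with rows = actual class, columns = predicted class."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    # A Counter returns 0 for pairs that never occurred.
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix

# Hypothetical results for six unseen instances.
actual = ["X", "X", "X", "Y", "Y", "Y"]
predicted = ["X", "X", "Y", "Y", "Y", "X"]
labels, matrix = confusion_matrix(actual, predicted)
print(labels)  # ['X', 'Y']
print(matrix)  # [[2, 1], [1, 2]]
```

The entries on the main diagonal are the correctly classified instances, so their sum divided by the total number of instances gives the predictive accuracy (here 4/6).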
