Lecture 10
Evaluation - II
[Evaluation Metrics]
Arpit Rana
12th August 2024
Experimental Evaluation of Learning Algorithms
(Diagram: a learner 𝚪: S → h searches the hypothesis space 𝓗 and outputs a final hypothesis or model h, which is then assessed with evaluation metrics.)
Common Measures
● Error
● Accuracy
● Precision/Recall
Measures for Regression Problems
Considerations when choosing a regression measure:
● Non-differentiability (the absolute value in MAE is not differentiable at zero)
● Robustness (sensitivity to outliers: squaring the errors penalizes large deviations heavily)
● Units (MSE is expressed in the squared units of the target variable)
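For reference, the standard definitions of these measures, assuming the usual notation with $y_i$ the true target value, $\hat{y}_i$ the model's prediction, and $n$ the number of test examples:

\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\, y_i - \hat{y}_i \,\right|

The absolute value in MAE is the source of the non-differentiability noted above, while the squaring in MSE is what makes it sensitive to outliers and expresses it in squared units.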
Measures for Classification Problems
Confusion Matrix                            True Class (Actual)
                                  Positive                Negative                Total
Hypothesized Class   Positive     True Positive (TP)      False Positive (FP)     P’
(Predicted)          Negative     False Negative (FN)     True Negative (TN)      N’
                     Total        P                       N                       P + N
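From these counts, the common measures listed earlier have the standard definitions below (using the totals from the table, where P = TP + FN and N = FP + TN):

\mathrm{Accuracy} = \frac{TP + TN}{P + N}, \qquad
\mathrm{Error} = \frac{FP + FN}{P + N} = 1 - \mathrm{Accuracy}

\mathrm{Precision} = \frac{TP}{TP + FP} = \frac{TP}{P'}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN} = \frac{TP}{P}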
The weighted F measure combines precision and recall with parameters ⍺ ∈ [0, 1] and 𝛽 ∈ [0, ∞]. For ⍺ = ½ (equivalently 𝛽 = 1), the F measure is balanced and is known as the F1 measure.
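In terms of precision and recall, the usual parameterisation, with $\beta^2 = (1-\alpha)/\alpha$ (an assumption consistent with the statement above that ⍺ = ½ corresponds to 𝛽 = 1), is:

F = \frac{1}{\alpha\,\frac{1}{\mathrm{Precision}} + (1-\alpha)\,\frac{1}{\mathrm{Recall}}}
  = \frac{(1+\beta^2)\,\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^2\,\mathrm{Precision} + \mathrm{Recall}}

F_1 = \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}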
Measures for Classification Problems
What metric would you use to measure the performance of the following classifiers?
Precision/Recall Trade-off
● Images are ranked by the score of the classifier (which predicts whether or not an image is a 5).
● The higher the threshold, the lower the recall, but (in general) the higher the precision, as illustrated in the sketch below.
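A minimal sketch of this trade-off, assuming scikit-learn is available; the toy data, the SGDClassifier, and the variable names are illustrative stand-ins for the 5-vs-not-5 example, not the lecture's own code:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_curve

# Toy, imbalanced data standing in for the "is this image a 5?" labels.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=42)

clf = SGDClassifier(random_state=42)
# Cross-validated decision scores: each example is scored by a model
# that did not see it during training.
y_scores = cross_val_predict(clf, X, y, cv=3, method="decision_function")

precisions, recalls, thresholds = precision_recall_curve(y, y_scores)

# Raising the threshold generally raises precision and lowers recall.
for t in np.percentile(thresholds, [10, 50, 90]):
    pred = y_scores >= t
    tp = np.sum(pred & (y == 1))
    print(f"threshold={t:+.2f}  "
          f"precision={tp / max(pred.sum(), 1):.2f}  "
          f"recall={tp / (y == 1).sum():.2f}")
```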
The ROC Curve
● The receiver operating characteristic (ROC) curve is another common tool used with binary classifiers.
○ The ROC curve plots the true positive rate (TPR, another name for recall or sensitivity) against the false positive rate (FPR, i.e. 1 − specificity).
● One way to compare classifiers is to measure the area under the curve (AUC).
● A perfect classifier will have a ROC AUC equal to 1, whereas a purely random classifier
will have a ROC AUC equal to 0.5.
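A minimal sketch of computing and plotting these quantities with scikit-learn, reusing the y and y_scores from the precision/recall example above (an assumption) and matplotlib for the plot:

```python
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# FPR and TPR at every score threshold, plus the overall area under the curve.
fpr, tpr, thresholds = roc_curve(y, y_scores)
print("ROC AUC:", roc_auc_score(y, y_scores))  # 1.0 = perfect, 0.5 = random

plt.plot(fpr, tpr, label="classifier")
plt.plot([0, 1], [0, 1], "k--", label="purely random (AUC = 0.5)")
plt.xlabel("False positive rate (1 - specificity)")
plt.ylabel("True positive rate (recall / sensitivity)")
plt.legend()
plt.show()
```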
Note: As a rule of thumb, you should prefer the PR (precision-recall) curve whenever the
positive class is rare or when you care more about the false positives than the false negatives,
and the ROC curve otherwise.
Next lecture: Loss Functions
13th August 2024