2. Performance Measures
CSE
MACHINE LEARNING
21CS2226F
Topic: Performance metrics
Session - 02
AIM OF THE SESSION
To understand the performance metrics used to evaluate machine learning models on both classification and regression tasks, so that accurate and efficient models can be built.
INSTRUCTIONAL OBJECTIVES
LEARNING OUTCOMES
Performance metrics
Regression metrics
Regression metrics
• Mean Absolute Error (MAE)
• Mean Absolute Error is the average of the absolute differences between the ground-truth values and the predicted values.
• Mathematically, it is represented as:
  MAE = (1/N) Σ |y_i − ŷ_i|, where y_i is the ground-truth value, ŷ_i is the predicted value, and N is the number of samples.
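For illustration, a minimal NumPy sketch of MAE; the y_true and y_pred arrays below are made-up example values, not data from the session:

    import numpy as np

    y_true = np.array([3.0, -0.5, 2.0, 7.0])   # ground-truth values (example data)
    y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions (example data)

    # MAE: mean of the absolute differences
    mae = np.mean(np.abs(y_true - y_pred))
    print(mae)  # 0.5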
Regression metrics
• Mean Squared Error (MSE)
• Mean Squared Error is the average of the squared differences between the ground-truth values and the values predicted by the regression model.
• Mathematically, it is represented as:
  MSE = (1/N) Σ (y_i − ŷ_i)²
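A minimal sketch of MSE on the same made-up arrays (scikit-learn's mean_squared_error would give the same result):

    import numpy as np

    y_true = np.array([3.0, -0.5, 2.0, 7.0])   # ground-truth values (example data)
    y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions (example data)

    # MSE: mean of the squared differences
    mse = np.mean((y_true - y_pred) ** 2)
    print(mse)  # 0.375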
Regression metrics
• Root Mean Squared Error (RMSE)
• Root Mean Squared Error is the square root of the average of the squared differences between the ground-truth values and the values predicted by the regression model.
• Mathematically, it is represented as:
  RMSE = √MSE = √[ (1/N) Σ (y_i − ŷ_i)² ]
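A minimal sketch of RMSE, again on made-up example values, taking the square root of the MSE:

    import numpy as np

    y_true = np.array([3.0, -0.5, 2.0, 7.0])   # ground-truth values (example data)
    y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions (example data)

    # RMSE: square root of the mean squared error
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    print(rmse)  # ~0.612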
Regression metrics
• R-Squared (R²)
• The R-squared metric enables comparison of the model with a constant baseline (always predicting the mean of the target) to determine the performance of the regression model.
• Mathematically, it is represented as:
  R² = 1 − SS_res / SS_tot = 1 − Σ (y_i − ŷ_i)² / Σ (y_i − ȳ)²
• If the sum of squared errors of the regression line (SS_res) is small, R² will be close to 1 (ideal), meaning the regression was able to capture almost all of the variance in the target variable.
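A minimal sketch of R² computed against the constant-mean baseline, on made-up example values (scikit-learn's r2_score gives the same result):

    import numpy as np

    y_true = np.array([3.0, -0.5, 2.0, 7.0])   # ground-truth values (example data)
    y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions (example data)

    ss_res = np.sum((y_true - y_pred) ** 2)          # sum of squared residuals
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares (constant-mean baseline)
    r2 = 1 - ss_res / ss_tot
    print(r2)  # ~0.949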
Regression metrics
• Adjusted R²
• R² never decreases when more independent variables are added, so an overfitted model can show an R² close to 100% even though no real learning has happened. To overcome this problem, R² is adjusted for the number of independent variables.
• Mathematically, it is represented as:
  Adjusted R² = 1 − (1 − R²) (n − 1) / (n − k − 1)
  n = number of observations, k = number of independent variables.
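A small sketch of the adjusted-R² formula above; the function name adjusted_r2 and the numbers passed to it are assumptions chosen only for illustration:

    # Adjusted R² from R², given n observations and k independent variables
    def adjusted_r2(r2, n, k):
        return 1 - (1 - r2) * (n - 1) / (n - k - 1)

    print(adjusted_r2(r2=0.95, n=100, k=5))  # ~0.947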
Classification metrics
Classification metrics
• Confusion Matrix
• The Confusion Matrix is the easiest way to measure the performance of a classification model.
• TP (True Positive) signifies how many positive class samples your model predicted correctly.
• TN (True Negative) signifies how many negative class samples your model predicted correctly.
• FP (False Positive) signifies how many negative class samples your model predicted incorrectly as positive.
• FN (False Negative) signifies how many positive class samples your model predicted incorrectly as negative.

                             True value
                             1                       0
  Predicted value     1      True Positive (TP)      False Positive (FP)
                      0      False Negative (FN)     True Negative (TN)
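A minimal sketch using scikit-learn's confusion_matrix on made-up labels; note that scikit-learn orders the matrix by label value, so ravel() yields TN, FP, FN, TP:

    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels (example data)
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # predicted labels (example data)

    # ravel() flattens the 2x2 matrix [[TN, FP], [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(tp, tn, fp, fn)  # 3 3 1 1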
Classification metrics
• Precision and Recall
• Precision is the fraction of samples predicted as positive that are actually positive: Precision = TP / (TP + FP).
• Recall (the True Positive Rate) is the fraction of actual positive samples that the model predicted correctly: Recall = TP / (TP + FN).
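A minimal sketch with scikit-learn's precision_score and recall_score, reusing the same made-up labels as in the confusion-matrix example:

    from sklearn.metrics import precision_score, recall_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels (example data)
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # predicted labels (example data)

    print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 3 / 4 = 0.75
    print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 3 / 4 = 0.75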
Classification metrics
• F1-score
• The F1-score is the harmonic mean of precision and recall, giving a single score that balances the two: F1 = 2 × (Precision × Recall) / (Precision + Recall).
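A minimal sketch with scikit-learn's f1_score on the same made-up labels:

    from sklearn.metrics import f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels (example data)
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # predicted labels (example data)

    # harmonic mean of precision and recall
    print(f1_score(y_true, y_pred))  # 2 * (0.75 * 0.75) / (0.75 + 0.75) = 0.75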
Classification metrics
• AU-ROC (Area Under the Receiver Operating Characteristic Curve)
• AU-ROC makes use of the True Positive Rate (TPR) and the False Positive Rate (FPR) at different classification thresholds to visualize the performance of the classification model.
• Mathematically, the two rates are represented as:
  TPR = TP / (TP + FN),  FPR = FP / (FP + TN)
• A high AU-ROC means that a randomly chosen positive example is very likely to receive a higher predicted score than a randomly chosen negative example.
• ROC curves aren't a good choice when your problem has a huge class imbalance.
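A minimal sketch with scikit-learn's roc_auc_score; the labels and scores are made-up example values:

    from sklearn.metrics import roc_auc_score

    y_true  = [0, 0, 1, 1]            # actual labels (example data)
    y_score = [0.1, 0.4, 0.35, 0.8]   # predicted probabilities for the positive class

    # area under the ROC curve built from TPR vs FPR across thresholds
    print(roc_auc_score(y_true, y_score))  # 0.75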
Classification metrics
• Accuracy
• Accuracy is the fraction of all predictions that the model got right: Accuracy = (TP + TN) / (TP + TN + FP + FN).
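A minimal sketch with scikit-learn's accuracy_score on the same made-up labels as above:

    from sklearn.metrics import accuracy_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels (example data)
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # predicted labels (example data)

    # (TP + TN) / total = (3 + 3) / 8
    print(accuracy_score(y_true, y_pred))  # 0.75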
Self-Assessment Questions
1. Which among the following evaluation metrics would you NOT use to measure the
performance of a classification model?
(a) Precision
(b) Recall
(c) Mean Squared Error
(d) F1-score
(a) Recall
(b) Accuracy
(c) Precision
(d) F1-score
Self-Assessment Questions
(a) Precision
(b) Recall
(c) Mean Squared Error
(d) F1-score
4. What is the average squared difference between the predicted output and the actual output called?
REFERENCES FOR FURTHER LEARNING OF THE SESSION
Text Books:
1. Tom Mitchell, "Machine Learning", McGraw-Hill, New York, NY (1997). ISBN: 9780070428072.
2. David MacKay, "Information Theory, Inference, and Learning Algorithms", Cambridge University Press, Cambridge, UK (2003). ISBN: 9780521642989.
Reference Books:
3. Ethem Alpaydin, "Introduction to Machine Learning", The MIT Press (2010).
4. Stephen Marsland, "Machine Learning: An Algorithmic Perspective", CRC Press (2009).
THANK YOU