Chapter 3: Model Evaluation
Why Evaluate?
Multiple methods are available to classify or predict, so we need a way to compare models and choose the one that performs best on unseen data.
Misclassification error
Naïve Rule
How to evaluate DM models?
During training:
Give the model both X_train and Y_train. After training, you have a fitted model.
During testing:
The goal is to assess the performance of your model by giving it only X_test. The model then predicts the value of Y_test.
Finally, we compare the Y_test predicted by the model against the true Y_test that we hid from the model.
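A minimal sketch of this train/test workflow in Python with scikit-learn (the dataset and classifier here are illustrative assumptions, not part of the slides):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Illustrative dataset; any labeled classification data works the same way.
X, Y = load_breast_cancer(return_X_y=True)

# Hold out part of the data so the model is tested on examples it never saw.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, Y_train)            # training: model sees X_train and Y_train

Y_pred = model.predict(X_test)         # testing: model sees only X_test
print(accuracy_score(Y_test, Y_pred))  # compare predictions with the hidden Y_test
```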
Partitioning the dataset
Partitioning the dataset cont…
b) K-fold cross validation (most commonly used)
Example: when K = 8, the data is split into 8 equal folds; each fold serves once as the test set while the remaining 7 folds are used for training.
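A brief sketch of K-fold cross validation with scikit-learn, assuming the same illustrative dataset and classifier as above:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, Y = load_breast_cancer(return_X_y=True)

# Evaluate on 8 different train/test splits (K = 8): each fold is used
# once as the test set while the other 7 folds train the model.
scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, Y, cv=8)
print(scores)         # one accuracy score per fold
print(scores.mean())  # average accuracy across the 8 folds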
Confusion Matrix
A confusion matrix is a table that is often used to describe the performance
of a classification model (or "classifier") on a set of test data for which the
true values are known.
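A minimal sketch of building a confusion matrix with scikit-learn; the labels below are hand-written illustrative values:

```python
from sklearn.metrics import confusion_matrix

# True test labels and the model's predictions (illustrative values).
Y_test = [1, 0, 1, 1, 0, 0, 1, 0]
Y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes (labels sorted: 0, 1).
print(confusion_matrix(Y_test, Y_pred))
```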
True Positive (TP):
Cases where the model claims that something has happened and it actually has happened, e.g. the patient has cancer and the model also predicts cancer.
True Negative (TN):
Cases where the model claims that nothing has happened and nothing actually has happened, e.g. the patient doesn’t have cancer and the model doesn’t predict cancer.
False Positive (FP, Type-1 error):
Cases where the model claims that something has happened when it actually hasn’t, e.g. the patient doesn’t have cancer but the model predicts cancer.
False Negative (FN, Type-2 error):
Cases where the model claims nothing has happened when something actually has, e.g. the patient has cancer but the model doesn’t predict cancer.
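For binary labels, the four cells can be read directly off the matrix; a small sketch, reusing the illustrative labels above:

```python
from sklearn.metrics import confusion_matrix

Y_test = [1, 0, 1, 1, 0, 0, 1, 0]
Y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() returns the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(Y_test, Y_pred).ravel()
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```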
Commonly Used Classification Model Evaluation Metrics
Accuracy
Accuracy is the fraction of predictions our model got right:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision
Precision quantifies, out of the total predicted positive values, how many were actually positive:
Precision = TP / (TP + FP)
Take the use case of spam detection: suppose a mail is not spam (0), but the model predicts it as spam (1), which is an FP. In this scenario, one can miss an important mail. So here we should focus on reducing FPs and must consider precision.
Recall
Recall quantifies, out of the total actual positive values, how many were correctly predicted as positive:
Recall = TP / (TP + FN)
In the cancer example above, a missed cancer case is an FN; when FNs are this costly, we should focus on reducing them and must consider recall.
F-1 Score
Use the F-1 score when FP and FN are both equally important. It allows the model to be judged on both precision and recall using a single score:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
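A short sketch computing all four metrics with scikit-learn, reusing the illustrative labels from the confusion-matrix example:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

Y_test = [1, 0, 1, 1, 0, 0, 1, 0]
Y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(Y_test, Y_pred))
print("Precision:", precision_score(Y_test, Y_pred))
print("Recall   :", recall_score(Y_test, Y_pred))
print("F1 score :", f1_score(Y_test, Y_pred))
```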
Exercise 1: How do you read the following confusion matrix?

Classification Confusion Matrix

                   Predicted Class
Actual Class       1        0
1                  201      85
0                  25       2689
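Reading the cells (assuming class 1 is the positive class): TP = 201, FN = 85, FP = 25, TN = 2689. A quick check of the metrics in Python:

```python
tp, fn, fp, tn = 201, 85, 25, 2689

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 2890 / 3000
precision = tp / (tp + fp)                  # 201 / 226
recall = tp / (tp + fn)                     # 201 / 286
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)      # ≈ 0.963, 0.889, 0.703, 0.785
```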
Exercise 2: You build a data mining model for e-mail spam detection. The results of your model are summarized in the confusion matrix below. Calculate the accuracy, precision, recall, and F1-score of your model, and write a short description of your findings.
[Confusion matrix figure]
Problem
Imbalanced classification is the problem of classification when there is an unequal distribution of classes in the training dataset.
E.g. there are 3000 HIV-negative samples and only 200 positive samples in your training data.
Accuracy will fail as a metric when there are imbalanced class distributions.
This kind of problem arises in tasks such as fault detection or fraud detection (and other rare events); see the sketch below.
Many machine learning models are designed around the assumption of a balanced class distribution and often learn simple rules (explicit or otherwise) like "always predict the majority class", causing them to achieve an accuracy of 99 percent while in practice performing no better than an unskilled majority-class classifier.
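A minimal sketch of this failure mode, reusing the 3000-negative / 200-positive counts from the HIV example above (the DummyClassifier baseline is an illustrative stand-in for a model that always predicts the majority class):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# 3000 negative samples and 200 positive samples, as in the HIV example.
Y = np.array([0] * 3000 + [1] * 200)
X = np.zeros((3200, 1))  # features don't matter for a majority-class baseline

# A "model" that always predicts the majority class.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X, Y)
Y_pred = baseline.predict(X)

print(accuracy_score(Y, Y_pred))  # ≈ 0.94 -- looks good...
print(recall_score(Y, Y_pred))    # 0.0 -- but it never detects a positive case
```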
Exercise 3: The confusion matrix below depicts the performance of a cyber-attack detection model (a kind of antivirus) based on data mining. Calculate the accuracy of this AV and state your findings. Note: Category A means not a virus; Category B implies virus.
[Confusion matrix figure]
Solutions to the class imbalance problem
Re-sampling techniques: over-sample the minority class or under-sample the majority class so the training distribution is more balanced.
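A minimal sketch of one re-sampling technique, random over-sampling of the minority class, using scikit-learn's resample utility (the data and counts here are illustrative; libraries such as imbalanced-learn offer dedicated tools):

```python
import numpy as np
from sklearn.utils import resample

# Illustrative imbalanced data: 3000 negatives, 200 positives.
X = np.random.rand(3200, 5)
Y = np.array([0] * 3000 + [1] * 200)

X_minority, Y_minority = X[Y == 1], Y[Y == 1]
X_majority, Y_majority = X[Y == 0], Y[Y == 0]

# Randomly over-sample the minority class (with replacement) to 3000 rows.
X_min_up, Y_min_up = resample(X_minority, Y_minority,
                              replace=True, n_samples=3000, random_state=42)

X_balanced = np.vstack([X_majority, X_min_up])
Y_balanced = np.concatenate([Y_majority, Y_min_up])
print(np.bincount(Y_balanced))  # [3000 3000] -- now balanced
```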
Reporting performance results in your thesis or paper
Demo