Chapter 3: Model Evaluation

Uploaded by

Lusi ሉሲ
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
53 views30 pages

Chapter 3 Model Evaluation Final

Uploaded by

Lusi ሉሲ
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 30

Evaluating Classification & Predictive Performance of Data Mining Models

Data Mining and Warehousing

By Gaddisa Olani (Ph.D)
1
Why Evaluate?

 Multiple methods are available to classify or predict.
 For each method, multiple choices are available for settings.
 To choose the best model, we need to assess each model’s performance.
2
Misclassification error

 Error = classifying a record as belonging to one class when it actually belongs to another class.
3
Naïve Rule

 Naïve rule: classify all records as belonging to the most prevalent class.
 Often used as a benchmark: we hope to do better than that.
 Ignore your model if it’s not better than the naïve rule.

4
How to evaluate DM models?

We need:

 Train dataset: used to fit the machine learning model (X_train, Y_train).
 Test dataset: used to evaluate the fitted machine learning model (X_test, Y_test).
5
How to evaluate DM models?

 During training:
 Give both X_train and Y_train to the algorithm. After training you will have a model.
6
How to evaluate DM models?

 During testing:
 The goal is to assess the performance of your model by giving it only X_test. The model then predicts the value of Y_test.
 Finally, we compare the Y_test predicted by the model with the true Y_test that we hid from the model.
7
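A minimal sketch of this train-then-test loop, assuming scikit-learn; the decision tree is only an example classifier (the slides do not prescribe a specific model), and X_train, Y_train, X_test, Y_test are assumed to come from the split described on the next slide:

# Minimal sketch of the training/testing loop (assumes scikit-learn).
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, Y_train)            # training: the model sees both X_train and Y_train

Y_pred = model.predict(X_test)         # testing: the model sees only X_test
print(accuracy_score(Y_test, Y_pred))  # compare predictions with the held-out Y_test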
Partitioning the dataset

a) Old-fashioned train-test split

 Divide your dataset into an 80/20 split:
 80% of your data is used for training, and
 20% of your data is used for testing, hence called test data (a dataset that is independent of the training dataset).
8
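For example, an 80/20 split can be produced with scikit-learn's train_test_split; here X is the feature matrix and y the label vector, both assumed to be loaded already:

# Example 80/20 hold-out split with scikit-learn.
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    X, y,               # features and labels (assumed already loaded)
    test_size=0.2,      # 20% of the records are held out as the test set
    random_state=42,    # fixed seed so the split is reproducible
    stratify=y,         # keep class proportions similar in both parts
)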
Partitioning the dataset cont…

b) K-fold cross validation (most commonly used)
 The data is split into K equal folds; each fold is used once as the test set while the remaining K-1 folds are used for training, and the results are averaged.
 Example when the value of K = 8.

9
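One common way to run k-fold cross-validation in code, assuming scikit-learn and k = 8 to match the example above (the classifier is again only illustrative):

# Sketch of 8-fold cross-validation with scikit-learn.
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(random_state=0)
scores = cross_val_score(model, X, y, cv=8)  # 8 folds: train on 7, test on the held-out one, rotate

print(scores)         # score on each of the 8 held-out folds
print(scores.mean())  # average performance across the folds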
Confusion Matrix

 A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
11
Confusion Matrix
True Positive (TP):
 Cases where the model claims that something has happened and it actually has happened, e.g. the patient has cancer and the model also predicts cancer.

12
Confusion Matrix
True Negative (TN):
 Cases where the model claims that nothing has happened and nothing actually has happened, e.g. the patient doesn’t have cancer and the model also doesn’t predict cancer.

13
Confusion Matrix
False Positive (Type-1 error):
 Cases where the model claims that something has happened when it actually hasn’t, e.g. the patient doesn’t have cancer but the model predicts cancer.

14
Confusion Matrix
False Negative (Type-2 error):
 Cases where the model claims that nothing has happened when something actually has, e.g. the patient has cancer but the model doesn’t predict cancer.

15
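In scikit-learn these four counts can be read off with confusion_matrix; the label vectors below are made-up values purely for illustration (1 = has cancer, 0 = no cancer):

# Illustrative confusion matrix for a binary problem.
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical model predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)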
Commonly used Classification Model Evaluation Metrics

16
Accuracy
Accuracy is the fraction of predictions our model got right.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Accuracy should be considered when TP and TN are more important and the dataset is balanced, because in that case the model will not be biased by the class distribution. But in real-life classification problems, imbalanced class distributions are common.
17
PRECISION
Precision quantifies, out of the total predicted positive values, how many were actually positive.

Precision = TP / (TP + FP)

Take the use case of spam detection: suppose a mail is not spam (0), but the model predicts it as spam (1), which is an FP. In this scenario, one can miss an important mail. So here we should focus on reducing FP and must consider precision.

18
RECALL
Recall quantifies, out of the total actual positive values, how many were correctly predicted as positive.

Recall = TP / (TP + FN)

When should recall be considered?

 In cancer detection, suppose a person has cancer (1), but the model does not predict it (0), which is an FN. This could be a disaster. So in this scenario, we should focus on reducing FN and must consider recall.

19
F-1 Score
 Use the F-1 score when FP and FN are both equally important. It allows the model to be judged on precision and recall equally, using a single score.

F1-score = 2 * (Precision * Recall) / (Precision + Recall)

20
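The four metrics can be computed directly from the confusion-matrix counts; the counts below are hypothetical, chosen only to show the formulas in code:

# Accuracy, precision, recall and F1 from hypothetical confusion-matrix counts.
tp, tn, fp, fn = 80, 900, 20, 40

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)

print(f"Accuracy:  {accuracy:.3f}")   # 0.942
print(f"Precision: {precision:.3f}")  # 0.800
print(f"Recall:    {recall:.3f}")     # 0.667
print(f"F1-score:  {f1:.3f}")         # 0.727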
Exercise 1: How do you read the following confusion matrix?

Classification Confusion Matrix
                    Predicted Class
Actual Class        1          0
1                   201        85
0                   25         2689

 201 1’s correctly classified as “1”
 85 1’s incorrectly classified as “0”
 25 0’s incorrectly classified as “1”
 2689 0’s correctly classified as “0”

21
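For instance, from these counts the overall accuracy would be (201 + 2689) / (201 + 85 + 25 + 2689) = 2890 / 3000 ≈ 96.3%, while recall for class “1” would be 201 / (201 + 85) ≈ 70.3%.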
Exercise

22
Exercise 2: You build a data mining model for e-mail spam detection. The result of your model is summarized in the confusion matrix below. Calculate the Accuracy, Precision, Recall and F1-score of your model, and write a short description of your findings.

23
Problem
 Imbalanced classification is the problem of classification when there is an unequal distribution of classes in the training dataset.

 E.g. there are 3000 HIV-negative samples and 200 positive samples in your training data.

24
Problem
 Accuracy will fail as a metric when there is an imbalanced class distribution.

 This happens in problems such as fault detection or fraud detection (and other rare events).

 Many machine learning models are designed around the assumption of a balanced class distribution, and often learn simple rules (explicit or otherwise) such as always predicting the majority class. This can give them an accuracy of, say, 99 percent, while in practice they perform no better than an unskilled majority-class classifier.

25
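For instance, with the earlier example of 3000 HIV-negative and 200 HIV-positive samples, a model that always predicts “negative” already reaches 3000 / 3200 ≈ 93.8% accuracy while detecting none of the positive cases (recall = 0).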
Exercise 3: The confusion matrix summarized below depicts the performance of a Cyber Attack Detection Model (a kind of antivirus) based on data mining. Calculate the Accuracy of this AV and state your findings. Note: Category A means not a virus, Category B means a virus.

26
Solutions to deal with the class imbalance problem

 Re-sampling techniques: over-sample the minority class and/or under-sample the majority class.

27
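One simple re-sampling approach is random over-sampling of the minority class. Below is a sketch using scikit-learn's resample utility; the DataFrame df and its binary "label" column are assumptions made only for illustration:

# Random over-sampling of the minority class with sklearn.utils.resample.
import pandas as pd
from sklearn.utils import resample

# df is assumed to be a DataFrame with a binary "label" column (1 = minority class).
minority = df[df["label"] == 1]
majority = df[df["label"] == 0]

minority_upsampled = resample(
    minority,
    replace=True,              # sample with replacement
    n_samples=len(majority),   # grow the minority class to the majority class size
    random_state=42,
)

balanced_df = pd.concat([majority, minority_upsampled])
print(balanced_df["label"].value_counts())   # classes are now balanced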
Reporting performance results in your thesis or paper

 Compare your work using different models
 Compare your work with other works

28
Reporting performance results in your thesis or paper

29
Demo

30
