
Ch-17 EVALUATION

Problem Scoping → Data Acquisition → Data Exploration → Modelling → Evaluation
● Evaluation is the final stage in the AI Project Cycle, carried out once a model has been built and trained.
● Evaluation is the process of understanding the reliability of an AI model: a test dataset is fed into the model and the model's outputs are compared with the actual answers. Different evaluation techniques exist, depending on the type and purpose of the model. The model must go through proper testing so that its efficiency and performance can be calculated. Hence, the model is tested with the help of Testing Data.
Why do we need evaluation?
During modelling, we build different types of models, and then a decision has to be taken about which model is better than the others. Proper testing and evaluation are therefore needed to calculate the efficiency and performance of each model.
An efficient evaluation method proves helpful in selecting the most suitable modelling approach for representing our data.
We must keep in mind that it is not advisable to use the data that we used to create the model to evaluate it. Why?
Or
What type of data should be fed to a model for evaluation?

Ans: Training data must not be used for evaluation because a model simply memorises the whole of the training data, and therefore always predicts the correct output for any point in the training set whenever that data is fed to it again. However, it gives very wrong answers when a new dataset is introduced to it. This situation is known as overfitting.

Evaluation is basically based on two things:

1. Prediction: the output given by the machine after it has been trained and tested on the data (the output of the machine).
2. Reality: the real situation or real scenario about which the prediction has been made by the machine (the truth).
We will consider several scenarios for evaluation. What, then, is a scenario?
Consider an AI-based prediction model deployed to identify a football (soccer ball).
The objective is to find out whether a given image is of a football. Based on the two factors discussed above (Prediction and Reality), four cases arise:
CASE 1: The image really is a football, and the model's prediction is Yes, meaning it is a football. The Prediction matches the Reality. Hence, this condition is termed True Positive.

CASE 2: The image is not of a football, so the reality is No. In this case, the machine has correctly predicted No as well. Therefore, this condition is termed True Negative.

CASE 3: The reality is that it is not a football, but the machine has incorrectly predicted that it is a football. This case is termed False Positive. Another example: you predicted that India would win the cricket series against England, but they lost.

CASE 4: The football appears in an unusual form, because of which the Reality is Yes, but the machine has incorrectly predicted No, meaning it predicts that it is not a football. Therefore, this case becomes a False Negative.

What is a Confusion Matrix?

A Confusion Matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
A 2×2 matrix denoting the right and wrong predictions helps us analyse the rate of success. This matrix is termed the Confusion Matrix.
● It is a table of two rows and two columns that reports the number of true positives, true negatives, false positives and false negatives.
● It provides an insightful picture: not only the overall performance of the model, but also which particular cases are predicted correctly or incorrectly, and what kinds of errors the model makes.
● The confusion matrix allows us to understand the prediction results. It is not an evaluation metric itself, but a record that helps in evaluation.
● Prediction and Reality can be easily mapped together with the help of the confusion matrix, as laid out below.

                  Reality: Yes           Reality: No
Prediction: Yes   True Positive (TP)     False Positive (FP)
Prediction: No    False Negative (FN)    True Negative (TN)
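As an illustration, here is a minimal Python sketch (assuming scikit-learn is available; the two label lists are invented for the example) that computes a confusion matrix from predictions and actual values:

    # A minimal sketch using scikit-learn; the two label lists are invented for illustration.
    from sklearn.metrics import confusion_matrix

    reality    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = football, 0 = not a football
    prediction = [1, 0, 0, 1, 1, 0, 1, 0]

    # With labels=[1, 0], rows are Reality (Yes, No) and columns are Prediction (Yes, No):
    # [[TP FN]
    #  [FP TN]]
    matrix = confusion_matrix(reality, prediction, labels=[1, 0])
    print(matrix)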
Q. Write the terms used for the following cases of severe fire prediction.

Case    Prediction    Reality    Term
1       Yes           Yes        TP
2       No            No         TN
3       Yes           No         FP
4       No            Yes        FN
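The mapping in this table is easy to express in code. Here is a small Python sketch (the helper function and its name are our own) that labels each case:

    # A small helper (our own naming) that labels a (prediction, reality) pair.
    def outcome(prediction: bool, reality: bool) -> str:
        if prediction and reality:
            return "TP"   # predicted Yes, actually Yes
        if not prediction and not reality:
            return "TN"   # predicted No, actually No
        if prediction and not reality:
            return "FP"   # predicted Yes, actually No
        return "FN"       # predicted No, actually Yes

    # Reproduces the four cases in the table above.
    for pred, real in [(True, True), (False, False), (True, False), (False, True)]:
        print(pred, real, outcome(pred, real))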

Evaluation Methods
Now as we have gone through all the possible combinations of Prediction and Reality,
let us see how we can use these conditions to evaluate the model.

● Accuracy
Accuracy is defined as the percentage of correct predictions out of all the observations. A prediction is correct if it matches the reality. There are two conditions in which the Prediction matches the Reality: True Positive and True Negative. Hence, the formula for Accuracy becomes:

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
● Precision
Precision is defined as the percentage of true positive cases out of all the cases where the prediction is positive (true positives plus false positives):

Precision = TP / (TP + FP) × 100%

● Recall
Recall is another parameter for evaluating the model's performance. It is defined as the fraction of positive cases that are correctly identified out of all the cases that are actually positive (true positives plus false negatives):

Recall = TP / (TP + FN)

Note: The numerator of both Precision and Recall is the same (TP); the difference lies only in the denominator, where Precision counts the False Positives and Recall takes the False Negatives into consideration.
Which metric is important?

Choosing between Precision and Recall depends on the condition in which the model
has been deployed. In a case like Forest Fire, a False Negative can cost us a lot and is
risky too. Imagine no alert being given even when there is a Forest Fire. The whole
forest might burn down.
Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a
deadly virus has started spreading and the model which is supposed to predict a viral
outbreak does not detect it. The virus might spread widely and infect a lot of people.
On the other hand, there can be cases in which the False Positive condition costs us
more than False Negatives.
One such case is Mining. Imagine a model telling you that there exists treasure at a
point and you keep on digging there but it turns out that it is a false alarm. Here, False
Positive case (predicting there is treasure but there is no treasure) can be very costly.
Similarly, consider a model that predicts whether a mail is spam or not. If the model always predicts that a mail is spam, people would not look at their mail and eventually might lose important information. Here also, the False Positive condition (predicting a mail as spam when it is not spam) has a high cost.
● F1 Score

The F1 Score is a measure of the balance between Precision and Recall:

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

An ideal situation is when both Precision and Recall have a value of 1 (that is, 100%). In that case, the F1 Score would also be an ideal 1 (100%), known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1.
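Putting the four formulas together, here is a minimal Python sketch (plain arithmetic, no libraries; the four counts are placeholders) that computes Accuracy, Precision, Recall and F1 Score from the confusion-matrix counts:

    # Plain-Python sketch of the four metrics; the counts below are placeholders.
    tp, tn, fp, fn = 60, 25, 5, 10

    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # correct predictions out of all predictions
    precision = tp / (tp + fp)                    # of all "Yes" predictions, how many were right
    recall    = tp / (tp + fn)                    # of all actual "Yes" cases, how many were caught
    f1        = 2 * precision * recall / (precision + recall)

    print(f"Accuracy:  {accuracy:.2%}")
    print(f"Precision: {precision:.2%}")
    print(f"Recall:    {recall:.2%}")
    print(f"F1 Score:  {f1:.2%}")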

Q.1 A social media company has developed an AI model to predict which users are likely to churn (cancel their account). During testing, the AI model came up with the following predictions.

(i) Calculate the Precision, Recall and F1 Score for churn prediction.
(ii) How many total tests were performed in the above scenario?

Q.2 Calculate the Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Heart Attack Risk.

Q.3 Draw the confusion matrix for the following data:
Number of True Positives  = 100
Number of True Negatives  = 47
Number of False Positives = 62
Number of False Negatives = 290
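As a hint for Q.3, the given counts only need to be arranged in the 2×2 layout shown earlier. A short Python sketch that prints the matrix:

    # Arranging the Q.3 counts in the layout used above (rows = Prediction, columns = Reality).
    tp, tn, fp, fn = 100, 47, 62, 290

    print("                 Reality: Yes   Reality: No")
    print(f"Prediction: Yes  {tp:>12}   {fp:>11}")
    print(f"Prediction: No   {fn:>12}   {tn:>11}")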

Q.4 Calculate the Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Water Shortage in Schools. Also suggest which metric would not be a good evaluation parameter here, and why.
