CH EVALUATION
Problem Scoping ----> Data Acquisition ----> Data Exploration ----> Modelling ----> Evaluation
● Evaluation is the final stage in the AI Project Cycle. It is carried out once a model has been
built and trained.
● Evaluation is the process of understanding the reliability of an AI model by feeding a test
dataset into the model and comparing its outputs with the actual answers. Different evaluation
techniques can be used, depending on the type and purpose of the model. A model needs to go
through proper testing so that its efficiency and performance can be calculated. Hence,
the model is tested with the help of Testing Data.
Why do we need evaluation?
During modelling, we build different types of models. We then have to decide which model is better
than the others. Proper testing and evaluation are needed for this, so that the efficiency and
performance of each model can be calculated.
A sound evaluation helps in selecting the modelling method that best represents our data.
We must keep in mind that it is not advisable to use the data that we used to create the model to
evaluate it. Why?
Or
What type of data should be fed to an evaluation model?
Ans- Training data must not be used for evaluation because a model can simply remember the
whole of the training data. It then always predicts the correct output for any point in the training set
whenever the training data is fed in again, but gives very wrong answers if a new dataset is introduced.
This situation is known as overfitting. Hence, a separate testing dataset that the model has not seen
during training should be used for evaluation.
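The effect described in the answer above can be sketched in a few lines of Python. This is only an illustration: the data points, the feature meanings (temperature, humidity) and the `memorising_model` function are hypothetical and stand in for a model that has memorised its training set.

```python
# Hypothetical toy data: (temperature, humidity) -> is there a fire?
train = {(40, 10): "Yes", (22, 80): "No", (38, 15): "Yes", (18, 90): "No"}
test = {(41, 12): "Yes", (20, 85): "No", (39, 9): "Yes"}


def memorising_model(features):
    """Return the remembered answer; fall back to "No" for unseen inputs."""
    return train.get(features, "No")


def accuracy(dataset):
    """Fraction of points in `dataset` that the model predicts correctly."""
    correct = sum(memorising_model(x) == y for x, y in dataset.items())
    return correct / len(dataset)


train_accuracy = accuracy(train)  # every training point was memorised
test_accuracy = accuracy(test)    # unseen points are mostly answered wrongly

print("Accuracy on training data:", train_accuracy)
print("Accuracy on testing data:", test_accuracy)
```

Evaluating on the training data reports a perfect score, which says nothing about how the model behaves on new data. Only the score on the separate test set exposes the problem.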
Q. Write the terms used for the following cases of severe fire prediction.
Case   Prediction   Reality   Term
1      Yes          Yes       TP (True Positive)
2      No           No        TN (True Negative)
3      Yes          No        FP (False Positive)
4      No           Yes       FN (False Negative)
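The four cases can also be expressed as a small rule. The sketch below assumes each case is a pair of "Yes"/"No" values given as (Prediction, Reality); the function name `outcome` is chosen here only for illustration.

```python
def outcome(prediction, reality):
    """Name the confusion-matrix term for one (Prediction, Reality) pair."""
    if prediction == reality:
        # The prediction matches reality, so it is a "True" case.
        return "TP" if prediction == "Yes" else "TN"
    # The prediction does not match reality, so it is a "False" case.
    return "FP" if prediction == "Yes" else "FN"


cases = [("Yes", "Yes"), ("No", "No"), ("Yes", "No"), ("No", "Yes")]
labels = [outcome(prediction, reality) for prediction, reality in cases]
print(labels)
```

The first word (True/False) says whether the prediction matched reality, and the second word (Positive/Negative) is the prediction itself.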
Evaluation Methods
Now that we have gone through all the possible combinations of Prediction and Reality,
let us see how we can use these conditions to evaluate the model.
● Accuracy
Accuracy is defined as the percentage of correct predictions out of all the observations.
A prediction is said to be correct if it matches reality. There are two conditions
in which the Prediction matches the Reality: True Positive and True Negative.
Hence, the formula for Accuracy becomes:
Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
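As a quick check of the definition, accuracy can be computed from the four counts of a confusion matrix. The counts below are hypothetical and are used only to show the arithmetic.

```python
def accuracy(tp, tn, fp, fn):
    """Percentage of predictions that match reality (TP and TN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no observations to evaluate")
    return 100 * (tp + tn) / total


# Hypothetical counts: 40 TP, 45 TN, 10 FP, 5 FN out of 100 observations.
acc = accuracy(tp=40, tn=45, fp=10, fn=5)
print(f"Accuracy = {acc}%")
```

Here 85 of the 100 predictions match reality, so the accuracy is 85%.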
● Precision
Precision is defined as the percentage of true positive cases out of all the cases where
the prediction is positive, that is, True Positives and False Positives taken together.
Precision = TP / (TP + FP) × 100%
● Recall
Recall is another parameter for evaluating the model’s performance. It can be
defined as the fraction of actual positive cases that are correctly identified.
Recall = TP / (TP + FN)
Note: The numerator of both Precision and Recall is the same (True Positives). They
differ only in the denominator, where Precision counts the False Positives and Recall
takes the False Negatives into consideration.
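The difference in the denominators can be seen side by side in a short sketch. The counts are hypothetical, and both values are returned as fractions between 0 and 1 (multiply by 100 for a percentage).

```python
def precision(tp, fp):
    """TP divided by all cases predicted positive (TP + FP)."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp, fn):
    """TP divided by all cases that are actually positive (TP + FN)."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts: same numerator (TP), different denominators.
tp, fp, fn = 40, 10, 5
p = precision(tp, fp)  # 40 / 50
r = recall(tp, fn)     # 40 / 45
print(f"Precision = {p:.2f}, Recall = {r:.2f}")
```

With these counts the model raises 10 false alarms but misses only 5 real cases, so Recall comes out higher than Precision.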
Which metric is important?
Choosing between Precision and Recall depends on the condition in which the model
has been deployed. In a case like Forest Fire, a False Negative can cost us a lot and is
risky too. Imagine no alert being given even when there is a Forest Fire. The whole
forest might burn down.
Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a
deadly virus has started spreading and the model which is supposed to predict a viral
outbreak does not detect it. The virus might spread widely and infect a lot of people.
On the other hand, there can be cases in which the False Positive condition costs us
more than False Negatives.
One such case is Mining. Imagine a model telling you that there is treasure at a
point, and you keep on digging there, but it turns out to be a false alarm. Here, the False
Positive case (predicting there is treasure when there is none) can be very costly.
Similarly, consider a model that predicts whether a mail is spam or not. If the model
wrongly predicts that a genuine mail is spam, people would not look at it and might
eventually lose important information. Here also the False Positive condition (predicting
the mail as spam while the mail is not spam) would have a high cost.
● F1 Score
F1 Score can be defined as the measure of balance between Precision and Recall:
F1 Score = 2 × Precision × Recall / (Precision + Recall)
An ideal situation would be when we have a value of 1 (that is, 100%) for both Precision
and Recall. In that case, the F1 Score would also be an ideal 1 (100%). This is known as
the perfect value for F1 Score. As the values of both Precision and Recall range
from 0 to 1, the F1 Score also ranges from 0 to 1.
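The behaviour of the F1 Score can be checked with a short sketch. It assumes the usual definition, F1 = 2 × Precision × Recall / (Precision + Recall), and the input values are hypothetical.

```python
def f1_score(precision, recall):
    """Combine Precision and Recall (both between 0 and 1) into one value."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both are 0
    return 2 * precision * recall / (precision + recall)


perfect = f1_score(1.0, 1.0)   # both ideal, so F1 is also ideal
lopsided = f1_score(1.0, 0.1)  # high Precision cannot hide a low Recall

print(f"Perfect F1 = {perfect}")
print(f"Lopsided F1 = {lopsided:.2f}")
```

The F1 Score is high only when both Precision and Recall are high. If either one is low, the F1 Score is pulled down towards the lower value.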
Q.1 A social media company has developed an AI model to predict which users are
likely to churn (cancel their account). During testing, the AI model came up with the
following predictions.
Q.2 Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix
on Heart Attack Risk.
Q.3 Draw the confusion matrix for the following data:
Number of True Positive = 100
Number of True Negative = 47
Number of False Positive = 62
Number of False Negative = 290
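One possible way to lay out the matrix for Q.3 is sketched below. The layout (rows for Prediction, columns for Reality) is an assumption. Some books swap the rows and columns, so follow the convention used in class.

```python
# Counts given in Q.3
tp, tn, fp, fn = 100, 47, 62, 290

# Assumed layout: rows are the Prediction, columns are the Reality.
matrix = [
    ["", "Reality: Yes", "Reality: No"],
    ["Prediction: Yes", f"TP = {tp}", f"FP = {fp}"],
    ["Prediction: No", f"FN = {fn}", f"TN = {tn}"],
]

# Print each row with cells padded to the same width.
for row in matrix:
    print("".join(cell.ljust(18) for cell in row))

total = tp + tn + fp + fn
print("Total observations:", total)
```

The four cells should always add up to the total number of observations, which is a quick way to check a hand-drawn matrix.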
Q.4 Calculate Accuracy, Precision, Recall and F1 Score for the following Confusion
Matrix on Water Shortage in Schools. Also suggest which metric would not be a good
evaluation parameter here, and why.