
Ch-17 EVALUATION

Problem Scoping → Data Acquisition → Data Exploration → Modelling → Evaluation
● Evaluation is the final stage in the AI Project Cycle, carried out once a model has been built and trained.
● Evaluation is the process of understanding the reliability of an AI model: a test dataset is fed into the model and the model's outputs are compared with the actual answers. Different evaluation techniques exist, depending on the type and purpose of the model. The model must go through proper testing so that its efficiency and performance can be calculated. Hence, the model is tested with the help of Testing Data.
Why do we need evaluation?
During modelling, we build different types of models, and then a decision has to be taken about which model is better than the others. Proper testing and evaluation are therefore needed to calculate the efficiency and performance of each model.
An efficient evaluation method proves helpful in selecting the most suitable modelling approach for representing our data.
We must keep in mind that it is not advisable to use the data that we used to create the model to evaluate it. Why?
Or
What type of data should be fed to a model for evaluation?

Ans: Training data must not be used for evaluation because a model simply memorises the whole of the training data, and therefore always predicts the correct output for any point in the training set whenever that data is fed to it again. However, it gives very wrong answers when a new dataset is introduced to it. This situation is known as overfitting.

Evaluation is basically based on two things:

1. Prediction: the output given by the machine after it has been trained and tested on the data (the output of the machine).
2. Reality: the real situation or real scenario about which the prediction has been made by the machine (the truth).
We will consider several scenarios for evaluation. What, then, is a scenario?
Consider an AI-based prediction model deployed to identify a football (soccer ball).
The objective is to find out whether a given image is of a football. Based on the two factors discussed above (Prediction and Reality), four cases arise:
CASE 1: The image really is a football, and the model's prediction is Yes, meaning it is a football. The Prediction matches the Reality. Hence, this condition is termed True Positive.

CASE 2: The image is not of a football, so the reality is No. In this case, the machine has correctly predicted No as well. Therefore, this condition is termed True Negative.

CASE 3: The reality is that it is not a football, but the machine has incorrectly predicted that it is a football. This case is termed False Positive. Another example: you predicted that India would win the cricket series against England, but they lost.

CASE 4: The football appears in an unusual form, because of which the Reality is Yes, but the machine has incorrectly predicted No, meaning it predicts that it is not a football. Therefore, this case becomes a False Negative.

What is a Confusion Matrix?

A Confusion Matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
A 2×2 matrix denoting the right and wrong predictions helps us analyse the rate of success. This matrix is termed the Confusion Matrix.
● It is a table of two rows and two columns that reports the number of true positives, true negatives, false positives and false negatives.
● It provides an insightful picture: not only the overall performance of the model, but also which particular cases are predicted correctly or incorrectly, and what kinds of errors the model makes.
● The confusion matrix allows us to understand the prediction results. It is not an evaluation metric itself, but a record that helps in evaluation.
● Prediction and Reality can be easily mapped together with the help of the confusion matrix, as laid out below.

                  Reality: Yes           Reality: No
Prediction: Yes   True Positive (TP)     False Positive (FP)
Prediction: No    False Negative (FN)    True Negative (TN)
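As an illustration, here is a minimal Python sketch (assuming scikit-learn is available; the two label lists are invented for the example) that computes a confusion matrix from predictions and actual values:

    # A minimal sketch using scikit-learn; the two label lists are invented for illustration.
    from sklearn.metrics import confusion_matrix

    reality    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = football, 0 = not a football
    prediction = [1, 0, 0, 1, 1, 0, 1, 0]

    # With labels=[1, 0], rows are Reality (Yes, No) and columns are Prediction (Yes, No):
    # [[TP FN]
    #  [FP TN]]
    matrix = confusion_matrix(reality, prediction, labels=[1, 0])
    print(matrix)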
Q. Write the terms used for the following cases of severe fire prediction.

Case    Prediction    Reality    Term
1       Yes           Yes        TP
2       No            No         TN
3       Yes           No         FP
4       No            Yes        FN
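The mapping in this table is easy to express in code. Here is a small Python sketch (the helper function and its name are our own) that labels each case:

    # A small helper (our own naming) that labels a (prediction, reality) pair.
    def outcome(prediction: bool, reality: bool) -> str:
        if prediction and reality:
            return "TP"   # predicted Yes, actually Yes
        if not prediction and not reality:
            return "TN"   # predicted No, actually No
        if prediction and not reality:
            return "FP"   # predicted Yes, actually No
        return "FN"       # predicted No, actually Yes

    # Reproduces the four cases in the table above.
    for pred, real in [(True, True), (False, False), (True, False), (False, True)]:
        print(pred, real, outcome(pred, real))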

Evaluation Methods
Now as we have gone through all the possible combinations of Prediction and Reality,
let us see how we can use these conditions to evaluate the model.

● Accuracy
Accuracy is defined as the percentage of correct predictions out of all the observations. A prediction is correct if it matches the reality. There are two conditions in which the Prediction matches the Reality: True Positive and True Negative. Hence, the formula for Accuracy becomes:

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
● Precision
Precision is defined as the percentage of true positive cases out of all the cases where the prediction is positive (true positives plus false positives):

Precision = TP / (TP + FP) × 100%

● Recall
Recall is another parameter for evaluating the model's performance. It is defined as the fraction of positive cases that are correctly identified out of all the cases that are actually positive (true positives plus false negatives):

Recall = TP / (TP + FN)

Note: The numerator of both Precision and Recall is the same (TP); the difference lies only in the denominator, where Precision counts the False Positives and Recall takes the False Negatives into consideration.
Which metric is important?

Choosing between Precision and Recall depends on the condition in which the model
has been deployed. In a case like Forest Fire, a False Negative can cost us a lot and is
risky too. Imagine no alert being given even when there is a Forest Fire. The whole
forest might burn down.
Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a
deadly virus has started spreading and the model which is supposed to predict a viral
outbreak does not detect it. The virus might spread widely and infect a lot of people.
On the other hand, there can be cases in which the False Positive condition costs us
more than False Negatives.
One such case is Mining. Imagine a model telling you that there exists treasure at a
point and you keep on digging there but it turns out that it is a false alarm. Here, False
Positive case (predicting there is treasure but there is no treasure) can be very costly.
Similarly, consider a model that predicts whether a mail is spam or not. If the model always predicts that a mail is spam, people would not look at their mail and eventually might lose important information. Here also, the False Positive condition (predicting a mail as spam when it is not spam) has a high cost.
● F1 Score

The F1 Score is a measure of the balance between Precision and Recall:

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

An ideal situation is when both Precision and Recall have a value of 1 (that is, 100%). In that case, the F1 Score would also be an ideal 1 (100%), known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1.
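Putting the four formulas together, here is a minimal Python sketch (plain arithmetic, no libraries; the four counts are placeholders) that computes Accuracy, Precision, Recall and F1 Score from the confusion-matrix counts:

    # Plain-Python sketch of the four metrics; the counts below are placeholders.
    tp, tn, fp, fn = 60, 25, 5, 10

    accuracy  = (tp + tn) / (tp + tn + fp + fn)   # correct predictions out of all predictions
    precision = tp / (tp + fp)                    # of all "Yes" predictions, how many were right
    recall    = tp / (tp + fn)                    # of all actual "Yes" cases, how many were caught
    f1        = 2 * precision * recall / (precision + recall)

    print(f"Accuracy:  {accuracy:.2%}")
    print(f"Precision: {precision:.2%}")
    print(f"Recall:    {recall:.2%}")
    print(f"F1 Score:  {f1:.2%}")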

Q.1 A social media company has developed an AI model to predict which users are likely to churn (cancel their account). During testing, the AI model came up with the following predictions.

(i) Calculate the Precision, Recall and F1 Score for churn prediction.
(ii) How many total tests were performed in the above scenario?

Q.2 Calculate the Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Heart Attack Risk.

Q.3 Draw the confusion matrix for the following data:
Number of True Positives  = 100
Number of True Negatives  = 47
Number of False Positives = 62
Number of False Negatives = 290
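As a hint for Q.3, the given counts only need to be arranged in the 2×2 layout shown earlier. A short Python sketch that prints the matrix:

    # Arranging the Q.3 counts in the layout used above (rows = Prediction, columns = Reality).
    tp, tn, fp, fn = 100, 47, 62, 290

    print("                 Reality: Yes   Reality: No")
    print(f"Prediction: Yes  {tp:>12}   {fp:>11}")
    print(f"Prediction: No   {fn:>12}   {tn:>11}")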

Q.4 Calculate the Accuracy, Precision, Recall and F1 Score for the following Confusion Matrix on Water Shortage in Schools. Also suggest which metric would not be a good evaluation parameter here, and why.
