
confusion_matrix

A confusion matrix is a table that evaluates the performance of a classification model by comparing predicted results to actual outcomes, categorizing predictions into true positives, true negatives, false positives, and false negatives. It aids in calculating key metrics like accuracy, precision, recall, and F1-score, which are essential for understanding model performance, especially with imbalanced datasets. The document also includes a practical implementation of a confusion matrix using Python.


What is a Confusion Matrix?

A confusion matrix is a simple table that shows how well a classification model is performing by comparing its predictions to the actual results. It breaks down the predictions into four categories: correct predictions for both classes (true positives and true negatives) and incorrect predictions (false positives and false negatives). This helps you understand where the model is making mistakes, so you can improve it. The matrix displays the count of test-set instances that fall into each of these four categories.
• True Positive (TP): The model correctly predicted a positive outcome (the actual outcome was positive).
• True Negative (TN): The model correctly predicted a negative outcome (the actual outcome was negative).
• False Positive (FP): The model incorrectly predicted a positive outcome (the actual outcome was negative). Also known as a Type I error.
• False Negative (FN): The model incorrectly predicted a negative outcome (the actual outcome was positive). Also known as a Type II error.
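These four counts can be tallied directly from paired actual and predicted labels. Below is a minimal sketch in plain Python; the 'positive'/'negative' label lists are made-up illustration data, not part of the original example:

# Minimal sketch: count TP, TN, FP and FN from example label lists
actual_labels    = ['positive', 'positive', 'negative', 'negative', 'positive']
predicted_labels = ['positive', 'negative', 'negative', 'positive', 'positive']
tp = sum(a == 'positive' and p == 'positive' for a, p in zip(actual_labels, predicted_labels))
tn = sum(a == 'negative' and p == 'negative' for a, p in zip(actual_labels, predicted_labels))
fp = sum(a == 'negative' and p == 'positive' for a, p in zip(actual_labels, predicted_labels))
fn = sum(a == 'positive' and p == 'negative' for a, p in zip(actual_labels, predicted_labels))
print(tp, tn, fp, fn)  # expected output: 2 1 1 1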
Why do we need a Confusion Matrix?
A confusion matrix helps you see how well a model is working by
showing correct and incorrect predictions. It also helps calculate key
measures like accuracy, precision, and recall, which give a better idea
of performance, especially when the data is imbalanced.
Metrics based on Confusion Matrix Data
1. Accuracy
Accuracy measures how often the model’s predictions are correct overall.
It gives a general idea of how well the model is performing. However,
accuracy can be misleading, especially with imbalanced datasets where
one class dominates. For example, a model that predicts the majority
class correctly most of the time might have high accuracy but still fail to
capture important details about other classes.

Accuracy = (TP + TN) / (TP + TN + FP + FN)
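For illustration, using the Dog vs. Not Dog counts from the worked example later in this document (TP = 5, TN = 3, FP = 1, FN = 1): Accuracy = (5 + 3) / (5 + 3 + 1 + 1) = 8 / 10 = 0.8.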
2. Precision
Precision focuses on the quality of the model’s positive predictions. It tells
us how many of the instances predicted as positive are actually positive.
Precision is important in situations where false positives need to be
minimized, such as detecting spam emails or fraud.

Precision = TP / (TP + FP)
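With the same counts (TP = 5, FP = 1): Precision = 5 / (5 + 1) ≈ 0.83.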
3. Recall
Recall measures how well the model identifies all actual positive cases. It
shows the proportion of true positives detected out of all the actual
positive instances. High recall is essential when missing positive cases
has significant consequences, such as in medical diagnoses.

Recall = TP / (TP + FN)
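With the same counts (TP = 5, FN = 1): Recall = 5 / (5 + 1) ≈ 0.83.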
4. F1-Score
F1-score combines precision and recall into a single metric to balance
their trade-off. It provides a better sense of a model’s overall performance,
particularly for imbalanced datasets. The F1 score is helpful when both
false positives and false negatives are important, though it assumes
precision and recall are equally significant, which might not always align
with the use case.
F1-Score = (2 ⋅ Precision ⋅ Recall) / (Precision + Recall)
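With the same counts, Precision ≈ 0.83 and Recall ≈ 0.83, so F1-Score = (2 ⋅ 0.83 ⋅ 0.83) / (0.83 + 0.83) ≈ 0.83.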
5. Specificity
Specificity is another important metric in the evaluation of classification
models, particularly in binary classification. It measures the ability of a
model to correctly identify negative instances. Specificity is also known as
the True Negative Rate. Formula is given by:
Specificity = TN / (TN + FP)
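With the same counts (TN = 3, FP = 1): Specificity = 3 / (3 + 1) = 0.75.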
6. Type 1 and Type 2 error
• Type 1 Error: A Type 1 Error occurs when the model incorrectly predicts a positive instance, but the actual instance is negative. This is also known as a false positive. Type 1 Errors affect the precision of a model, which measures the accuracy of positive predictions.

Type 1 Error = FP / (TN + FP)

• Type 2 Error: A Type 2 Error occurs when the model fails to predict a positive instance, even though it is actually positive. This is also known as a false negative. Type 2 Errors impact the recall of a model, which measures how well the model identifies all actual positive cases.

Type 2 Error = FN / (TP + FN)

• Example:
  o Scenario: A diagnostic test is used to detect a particular disease in patients.
  o Type 1 Error (False Positive): The test predicts a patient has the disease (positive result), but the patient is actually healthy (negative case).
  o Type 2 Error (False Negative): The test predicts the patient is healthy (negative result), but the patient actually has the disease (positive case).
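For illustration, with the Dog vs. Not Dog counts from the example below (TP = 5, FP = 1, TN = 3, FN = 1), the Type 1 error rate is 1 / (1 + 3) = 0.25 and the Type 2 error rate is 1 / (1 + 5) ≈ 0.17.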
Confusion Matrix for Binary Classification
A 2x2 confusion matrix is shown below for image recognition with a Dog image versus a Not Dog image:

                   Predicted Dog            Predicted Not Dog
Actual Dog         True Positive (TP)       False Negative (FN)
Actual Not Dog     False Positive (FP)      True Negative (TN)

• True Positive (TP): The total count of cases where both the predicted and actual values are Dog.
• True Negative (TN): The total count of cases where both the predicted and actual values are Not Dog.
• False Positive (FP): The total count of cases where the prediction is Dog while the actual value is Not Dog.
• False Negative (FN): The total count of cases where the prediction is Not Dog while the actual value is Dog.
Example: Confusion Matrix for Dog Image Recognition with Numbers

Index      1    2        3    4        5    6        7    8    9        10
Actual     Dog  Dog      Dog  Not Dog  Dog  Not Dog  Dog  Dog  Not Dog  Not Dog
Predicted  Dog  Not Dog  Dog  Not Dog  Dog  Dog      Dog  Dog  Not Dog  Not Dog
Result     TP   FN       TP   TN       TP   FP       TP   TP   TN       TN

• Actual Dog Counts = 6
• Actual Not Dog Counts = 4
• True Positive Counts = 5
• False Positive Counts = 1
• True Negative Counts = 3
• False Negative Counts = 1
                   Predicted Dog              Predicted Not Dog
Actual Dog         True Positive (TP = 5)     False Negative (FN = 1)
Actual Not Dog     False Positive (FP = 1)    True Negative (TN = 3)

Implementation of Confusion Matrix for Binary Classification using Python
Step 1: Import the necessary libraries
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt
Step 2: Create the NumPy array for actual and predicted labels
actual = np.array(['Dog', 'Dog', 'Dog', 'Not Dog', 'Dog', 'Not Dog',
                   'Dog', 'Dog', 'Not Dog', 'Not Dog'])
predicted = np.array(['Dog', 'Not Dog', 'Dog', 'Not Dog', 'Dog', 'Dog',
                      'Dog', 'Dog', 'Not Dog', 'Not Dog'])
Step 3: Compute the confusion matrix
cm = confusion_matrix(actual, predicted)
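Note that scikit-learn's confusion_matrix orders the classes alphabetically when no labels argument is passed, so the first row and column here correspond to 'Dog' and the second to 'Not Dog'. With the arrays from Step 2, print(cm) should show 5 true positives and 1 false negative in the first row, and 1 false positive and 3 true negatives in the second row.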
Step 4: Plot the confusion matrix with the help of the seaborn heatmap
sns.heatmap(cm,
            annot=True,
            fmt='g',
            xticklabels=['Dog', 'Not Dog'],
            yticklabels=['Dog', 'Not Dog'])
plt.ylabel('Actual', fontsize=13)
plt.title('Confusion Matrix', fontsize=17, pad=20)
plt.gca().xaxis.set_label_position('top')
plt.xlabel('Prediction', fontsize=13)
plt.gca().xaxis.tick_top()
plt.gca().figure.subplots_adjust(bottom=0.2)
plt.gca().figure.text(0.5, 0.05, 'Prediction', ha='center', fontsize=13)
plt.show()
Output: a 2x2 heatmap of the confusion matrix, with TP = 5 and FN = 1 in the Dog row and FP = 1 and TN = 3 in the Not Dog row.
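As a small optional follow-up (not part of the original steps above), the classification_report function already imported in Step 1 can summarise the per-class precision, recall and F1-score from the same arrays:

# Optional: per-class precision, recall, F1-score and support
print(classification_report(actual, predicted))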
