8 Normalization Methods

Uploaded by saharsh0812

NORMALIZATION METHODS

Normalization

• Many machine learning algorithms attempt to find trends in the data by comparing features of data points.

• However, there is an issue when the features are on drastically different scales.

• When the algorithm compares data points, the feature with the larger scale will completely dominate the other.
• Problem Statement: Predict which house would be best for you.

• The age of a house in years contributes more than its number of rooms because it is on a larger scale.

• The goal of normalization is to put every feature on the same scale so that each feature is equally important.
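
The house example can be sketched in a few lines of Python. The two houses and their values are made up for illustration, and Euclidean distance is assumed as the way the algorithm compares points. The sketch shows how much of the squared distance comes from the age feature alone.

```python
import math

# Hypothetical houses as (years_old, number_of_rooms); values are made up.
house_x = (5, 2)
house_y = (60, 8)

# Squared difference per feature: index 0 is years old, index 1 is rooms.
sq_diffs = [(a - b) ** 2 for a, b in zip(house_x, house_y)]

# Euclidean distance between the two houses on the raw, un-normalized data.
distance = math.sqrt(sum(sq_diffs))

# Fraction of the squared distance that is due to the age feature alone.
years_share = sq_diffs[0] / sum(sq_diffs)

print(f"distance = {distance:.2f}")
print(f"share from years old = {years_share:.1%}")
```

With these numbers, nearly all of the distance comes from the age difference, even though a gap of six rooms is large for a house.
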
Min-Max Normalization

• For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a decimal between 0 and 1.

• Min-max normalization has one fairly significant downside: it does not handle outliers very well. For example, if you have 99 values between 0 and 40, and one value is 100, then the 99 values will all be transformed to a value between 0 and 0.4.
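
The transformation described above is commonly written as (value - min) / (max - min). A minimal Python sketch follows; the function name `min_max_normalize`, the sample room counts, and the handling of a constant feature are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the formula would divide by zero.
        # Returning zeros is one possible convention, chosen here for illustration.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

rooms = [2, 3, 4, 6, 10]  # made-up room counts
scaled = min_max_normalize(rooms)
print(scaled)
```
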

Now if we were to compare these points, the y-axis would dominate; the y-axis can differ by 1, but the x-axis can only differ by 0.4.
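
The outlier case can be checked numerically. The sketch below assumes 99 evenly spaced values from 0 to 40 (the source only says they lie between 0 and 40) plus a single value of 100, and applies the min-max formula directly.

```python
# 99 made-up values spread evenly from 0 to 40, plus a single outlier at 100.
typical = [40 * i / 98 for i in range(99)]
values = typical + [100]

lo, hi = min(values), max(values)
scaled = [(v - lo) / (hi - lo) for v in values]

# The first 99 entries are the typical values; the last entry is the outlier.
largest_typical = max(scaled[:99])
outlier_scaled = scaled[-1]

print(f"largest typical value after scaling: {largest_typical:.2f}")
print(f"outlier after scaling: {outlier_scaled:.2f}")
```

The outlier takes the value 1, so the remaining 99 points are squeezed into the lower 40% of the range.
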
Z-Score Normalization

• Z-score normalization is a strategy for normalizing data that mitigates this outlier issue.

• Each value x is transformed to z = (x - μ) / σ. Here, μ is the mean value of the feature and σ is the standard deviation of the feature.
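
A minimal Python sketch of z-score normalization, using only the standard library. The function name `z_score_normalize` and the sample ages are assumptions, as is the use of the population standard deviation (`statistics.pstdev`); the sample standard deviation is also used in practice.

```python
import statistics

def z_score_normalize(values):
    """Subtract the feature mean from each value and divide by the feature's standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant feature has no spread; returning zeros is an illustrative convention.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

years_old = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up ages with mean 5 and standard deviation 2
z = z_score_normalize(years_old)
print(z)
```
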
Z-Score Normalization

• If a value is exactly equal to the mean of all the values of the feature, it will be normalized to 0. If it is below the mean, it will be a negative number, and if it is above the mean, it will be a positive number.

• The size of those negative and positive numbers is determined by the standard deviation of the original feature. If the un-normalized data had a large standard deviation, the normalized values will be closer to 0.
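
These properties can be illustrated with two made-up features that share the same mean but differ in spread. The helper `z_score` and both lists are hypothetical, and the population standard deviation is assumed.

```python
import statistics

def z_score(value, feature_values):
    """Z-score of `value` relative to the mean and spread of `feature_values`."""
    mu = statistics.fmean(feature_values)
    sigma = statistics.pstdev(feature_values)  # population standard deviation (an assumption)
    return (value - mu) / sigma

tight = [48, 49, 50, 51, 52]  # mean 50, small standard deviation
wide = [10, 30, 50, 70, 90]   # mean 50, large standard deviation

at_mean = z_score(50, tight)         # equal to the mean
below_mean = z_score(48, tight)      # below the mean
above_mean = z_score(52, tight)      # above the mean
same_value_wide = z_score(52, wide)  # same raw value, feature with larger spread

print(at_mean, below_mean, above_mean, same_value_wide)
```

The raw value 52 sits two units above the mean in both features, but its z-score is much closer to 0 in the feature with the larger standard deviation.
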
• While the data still looks squished, notice that the points are now on roughly the same scale for both features: almost all points are between -2 and 2 on both the x-axis and the y-axis. The only potential downside is that the features are not on exactly the same scale.
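
The downside can be seen by comparing the range of two z-scored features. Both features below are made up, one evenly spread and one containing an outlier, and the population standard deviation is assumed.

```python
import statistics

def z_scores(values):
    """Z-score every value in a feature using the population standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

feature_x = [1, 2, 3, 4, 5]                    # evenly spread
feature_y = [1, 1, 1, 1, 1, 1, 1, 1, 1, 11]    # nine typical values and one outlier

zx = z_scores(feature_x)
zy = z_scores(feature_y)

# Unlike min-max normalization, the resulting ranges are not identical.
range_x = (min(zx), max(zx))
range_y = (min(zy), max(zy))

print("x range:", range_x)
print("y range:", range_y)
```

Both features end up centered on 0 with a standard deviation of 1, but their minimum and maximum values differ.
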
Summary

• Min-max normalization: Guarantees all features will have the exact same scale but does not handle outliers well.

• Z-score normalization: Handles outliers better, but does not produce normalized data with the exact same scale.
