Feature Scaling
In a given dataset, different columns can span very different ranges. For example, one column may be measured in units of distance and another in units of currency. These two columns will have starkly different ranges, making it difficult for many machine learning models to reach an optimal state.
Many machine learning algorithms perform better or converge faster when features are on a relatively similar scale and/or close to normally distributed. Examples of such algorithm families include gradient descent-based algorithms (e.g., linear regression, logistic regression, neural networks) and distance-based algorithms (e.g., KNN, K-means, SVM), both of which are discussed later in this section.
In more technical terms, if gradient descent is used as the optimizer, it will take longer to converge, since it has to traverse feature dimensions whose ranges are far apart. This is demonstrated in the figure below.
The diagram on the left shows scaled features. The features are brought down to values that are comparable with one another, so the optimization routine does not have to take large leaps to reach the optimal point. Scaling is not necessary for algorithms that are not distance-based (such as decision trees). Distance-based models, however, must have scaled features without exception.
Algorithms that do not require normalization/scaling are the ones that rely on rules; they are not affected by monotonic transformations of the variables, and scaling is a monotonic transformation. Examples in this category are the tree-based algorithms: (1) CART, (2) Random Forests, and (3) Gradient Boosted Decision Trees. These algorithms use rules (series of inequalities) and do not require normalization. Algorithms such as (4) Linear Discriminant Analysis (LDA) and (5) Naive Bayes are by design equipped to handle differing ranges and weight the features accordingly, so performing feature scaling on them may not have much effect.
A machine learning algorithm just sees numbers. If there is a vast difference in range, say some features in the thousands and others in the tens, the algorithm implicitly assumes that the larger numbers have some kind of superiority, so the features with larger values start playing a more decisive role while training the model.
Example: if an algorithm does not use feature scaling, it can consider the value 3000 (meters) to be greater than 5 (km), which is not actually true, and in that case the algorithm will give wrong predictions. We therefore use feature scaling to bring all values to comparable magnitudes and tackle this issue.
The machine learning algorithm works on numbers and does not know what a number represents. A weight of 10 grams and a price of 10 dollars represent two completely different things, which is a no-brainer for humans, but to a model both features look the same.
Suppose we have two features, weight and price, where the weight values (say, in grams) are in the thousands and the price values (in dollars) are in the tens. "Weight" cannot be meaningfully compared with "Price", yet the algorithm assumes that because the "Weight" values are larger than the "Price" values, "Weight" is more important than "Price". The larger numbers then play a more decisive role while training the model. Feature scaling is needed to put every feature on the same footing, without giving any of them upfront importance. Interestingly, if we convert the weight to kilograms, "Price" becomes the dominant feature.
Another reason feature scaling is applied is that some algorithms, such as neural networks trained with gradient descent, converge much faster with feature scaling than without it.
a. Min-Max Scaler: the min-max scaler shrinks the feature values into any range of choice, for example between 0 and 5.
b. Standard Scaler: the standard scaler assumes that the variable is normally distributed and scales it so that the standard deviation is 1 and the distribution is centered at 0. Deep learning algorithms often call for zero mean and unit variance. Regression-type algorithms also benefit from normally distributed data when sample sizes are small.
c. Robust Scaler: robust scaler works best when there are outliers in the dataset. It scales the
data with respect to the inter-quartile range after removing the median. RobustScaler
transforms the feature vector by subtracting the median and then dividing by the interquartile
range (75% value — 25% value). Use RobustScaler if you want to reduce the effects of outliers,
relative to MinMaxScaler.
d. Max-Abs Scaler: similar to min-max scaler, but instead of a given range, the feature is scaled
to its maximum absolute value. The sparsity of the data is preserved since it does not center
the data.
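To make the differences concrete, here is a minimal sketch comparing the four scalers above with scikit-learn; the one-column dataset (including the outlier value 1000) is made up purely for illustration.

import numpy as np
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler, RobustScaler, StandardScaler

# One feature with an outlier (1000) to show how each scaler reacts.
X = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

scalers = {
    "Min-Max (0 to 5)": MinMaxScaler(feature_range=(0, 5)),
    "Standard": StandardScaler(),
    "Robust": RobustScaler(),
    "Max-Abs": MaxAbsScaler(),
}

for name, scaler in scalers.items():
    # fit_transform learns the scaling statistics and applies them in one step
    print(name, scaler.fit_transform(X).ravel().round(3))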
Standardizing Data
Standardizing data helps us transform attributes with a Gaussian distribution of differing means
and of differing standard deviations into a standard Gaussian distribution with a mean of 0 and a
standard deviation of 1. Standardization of data is done using scikit-learn with the StandardScaler
class.
The Standard Scaler assumes the data is normally distributed within each feature and scales it so that the distribution is centered around 0 with a standard deviation of 1. Centering and scaling happen independently on each feature, by computing the relevant statistics on the samples in the training set. If the data is not normally distributed, this is not the best scaler to use.
Z-score standardization is one of the most popular ways to standardize data. We rescale the original variable to have a mean of zero and a standard deviation of one: the scaled value is obtained by subtracting the mean of the original variable from the raw value and then dividing by the standard deviation of the original variable, i.e. z = (x − mean) / standard deviation.
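As a minimal sketch of z-score standardization with scikit-learn's StandardScaler (the column of values is made up for illustration):

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[50.0], [60.0], [70.0], [80.0], [90.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)      # (x - mean) / std, computed per column

print(X_scaled.ravel())                 # mean 0, standard deviation 1
print(scaler.mean_, scaler.scale_)      # the learned mean and std of the column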
Normalization is a scaling technique in which values are shifted and rescaled so that they end up ranging between 0 and 1. It is also known as Min-Max scaling: features are transformed by scaling each feature to a given range. The estimator scales and translates each feature individually so that it lies in the given range on the training set, e.g., between zero and one. The range can be set to [0, 1], [0, 5], [-1, 1], and so on; with a range such as [-1, 1], the scaled data can also take negative values. This scaler responds well when the standard deviation is small and the distribution is not Gaussian, but it is sensitive to outliers. The transformation is X' = (X − Xmin) / (Xmax − Xmin), so:
• When the value of X is the minimum value in the column, the numerator will be 0, and hence
X’ is 0
• On the other hand, when the value of X is the maximum value in the column, the numerator
is equal to the denominator and thus the value of X’ is 1
• If the value of X is between the minimum and the maximum value, then the value of X’ is
between 0 and 1
For each value in a feature, MinMaxScaler subtracts the minimum value in the feature and then
divides by the range. The range is the difference between the original maximum and original
minimum. MinMaxScaler preserves the shape of the original distribution. It doesn’t meaningfully
change the information embedded in the original data. Note that MinMaxScaler doesn’t reduce the
importance of outliers. The default range for the feature returned by MinMaxScaler is 0 to 1.
MinMaxScaler isn’t a bad place to start, unless you know you want your feature to have a normal
distribution or you have outliers and you want them to have reduced influence.
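A minimal Min-Max scaling sketch with scikit-learn's MinMaxScaler; the numbers are illustrative only.

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

scaler = MinMaxScaler()                    # default feature_range is (0, 1)
X_scaled = scaler.fit_transform(X)         # (x - min) / (max - min)

print(X_scaled.ravel())                    # [0.   0.25 0.5  0.75 1.  ]
print(scaler.data_min_, scaler.data_max_)  # learned per-column min and max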
Binarizing Data
In this method, all the values that are above the threshold are transformed into 1 and those equal to
or below the threshold are transformed into 0. This method is useful when we deal with probabilities
and need to convert the data into crisp values. Binarizing is done using the Binarizer class.
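A minimal sketch of binarization with scikit-learn's Binarizer; the probability-like scores and the 0.5 threshold are chosen only for illustration.

import numpy as np
from sklearn.preprocessing import Binarizer

scores = np.array([[0.10], [0.40], [0.50], [0.80], [0.95]])

binarizer = Binarizer(threshold=0.5)            # values > 0.5 become 1, values <= 0.5 become 0
print(binarizer.fit_transform(scores).ravel())  # [0. 0. 0. 1. 1.]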
Robust Scaler:
As the name suggests, this Scaler is robust to outliers. If our data contains many outliers, scaling
using the mean and standard deviation of the data won’t work well.
This Scaler removes the median and scales the data according to the quantile range (defaults to
IQR: Interquartile Range). The IQR is the range between the 1st quartile (25th quantile) and the 3rd
quartile (75th quantile). The centering and scaling statistics of this scaler are based on percentiles and are therefore not influenced by a small number of very large marginal outliers. Note that the outliers themselves are still present in the transformed data; if separate outlier clipping is desired, a non-linear transformation is required.
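A minimal sketch of RobustScaler on data containing an outlier; the values are made up for illustration.

import numpy as np
from sklearn.preprocessing import RobustScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])   # 100 is an outlier

scaler = RobustScaler()                  # (x - median) / IQR by default
X_scaled = scaler.fit_transform(X)

print(X_scaled.ravel())                  # the outlier is still present, but the bulk of
                                         # the data is scaled using the median and IQR
print(scaler.center_, scaler.scale_)     # learned median and IQR of the column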
The power transformer is a family of parametric, monotonic transformations that are applied to
make data more Gaussian-like. This is useful for modeling issues related to the variability of a
variable that is unequal across the range (heteroscedasticity) or situations where normality is
desired.
The power transform finds the optimal parameter for stabilizing variance and minimizing skewness through maximum likelihood estimation. Currently, the scikit-learn implementation, PowerTransformer, supports the Box-Cox transform and the Yeo-Johnson transform. Box-Cox requires the input data to be strictly positive, while Yeo-Johnson supports both positive and negative data.
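A minimal sketch of PowerTransformer on right-skewed, made-up data; the exponential sample and the choice of Box-Cox are assumptions for illustration.

import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(1000, 1))   # right-skewed and strictly positive

# Box-Cox needs strictly positive input; Yeo-Johnson (the default method) also
# accepts zero and negative values.
pt = PowerTransformer(method="box-cox")
X_gaussian = pt.fit_transform(X)

print(pt.lambdas_)                               # lambda estimated by maximum likelihood
print(X_gaussian.mean(), X_gaussian.std())       # roughly 0 and 1 (standardize=True by default)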
Scaling Transformation
Standardization and Normalization are both scaling techniques. Standardization rescales the data based on the Z-score, using the formula (x − mean) / standard deviation, which typically compresses most of the data into roughly the range -3 to 3. Normalization, performed with the Min-Max Scaler, rescales the data using the formula (x − min) / (max − min), which compresses the data into the range 0 to 1.
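A small worked example (with made-up values) contrasting the two formulas:

import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

standardized = (x - x.mean()) / x.std()            # z-score: mean 0, std 1
normalized = (x - x.min()) / (x.max() - x.min())   # min-max: range [0, 1]

print(standardized.round(3))   # [-1.414 -0.707  0.     0.707  1.414]
print(normalized)              # [0.   0.25 0.5  0.75 1.  ]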
Some machine learning algorithms are sensitive to feature scaling while others are virtually invariant
to it.
Gradient Descent Based Algorithms: Machine learning algorithms like linear regression, logistic
regression, neural network, etc. that use gradient descent as an optimization technique require data
to be scaled. Take a look at the formula for gradient descent below:
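For reference, a standard form of the update rule, assuming a linear model with hypothesis h_θ, learning rate α, and m training samples (a generic illustration rather than a specific figure), is:

θ_j := θ_j − α · (1/m) · Σ_i ( h_θ(x^(i)) − y^(i) ) · x_j^(i)

where x_j^(i) is the value of feature j for the i-th training sample.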
The presence of the feature value X in the formula means that the range of a feature affects the step size of gradient descent: features with different ranges lead to different step sizes for each feature. To ensure that gradient descent moves smoothly towards the minima and that the steps are updated at the same rate for all features, we scale the data before feeding it to the model. Having features on a similar scale helps gradient descent converge more quickly towards the minima.
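As a rough sketch of how this is usually wired up, the scaler and a gradient-descent-based model can be combined in a scikit-learn pipeline so the scaling statistics are learned on the training split only; the dataset and model choice here are assumptions made for illustration.

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear classifier trained with stochastic gradient descent, preceded by scaling.
model = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))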
Distance-Based Algorithms: Distance algorithms like KNN, K-means, and SVM are most affected
by the range of features. This is because behind the scenes they are using distances between data
points to determine their similarity.
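A hedged sketch of the same point for a distance-based model: KNN fitted with and without scaling on a dataset whose features have very different ranges (the dataset and settings are illustrative).

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)     # feature ranges differ by orders of magnitude
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

raw = KNeighborsClassifier().fit(X_train, y_train)
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier()).fit(X_train, y_train)

# Without scaling, the wide-range features dominate the Euclidean distance.
print("KNN without scaling:", raw.score(X_test, y_test))
print("KNN with scaling:   ", scaled.score(X_test, y_test))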
Tree-Based Algorithms: tree-based algorithms, on the other hand, are fairly insensitive to the scale of the features. A decision tree only splits a node on a single feature, choosing the feature that increases the homogeneity of the node, and that split is not influenced by the other features. So the remaining features have virtually no effect on the split, which is what makes tree-based models invariant to the scale of the features!
Normalizer is also a normalization technique; the difference is the way it computes the normalized values. By default it uses the L2 norm of the row values, i.e., each element of a row is divided by the square root of the sum of squared values of all elements in that row. It is useful in text classification, where the dot product of two TF-IDF vectors gives the cosine similarity between the different sentences/documents in the dataset.
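A minimal sketch of Normalizer, which scales each row to unit L2 norm (the two-row matrix is made up for illustration):

import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0],
              [1.0, 0.0]])

normalizer = Normalizer(norm="l2")        # per-row scaling, not per-column
X_unit = normalizer.fit_transform(X)

print(X_unit)                             # [[0.6 0.8]
                                          #  [1.  0. ]]
print(np.linalg.norm(X_unit, axis=1))     # every row now has L2 norm 1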
For distance-based models, standardization is performed to prevent features with wider ranges from dominating the distance metric. Beyond that, standardization is commonly applied in the following situations:
• BEFORE PCA:
In Principal Component Analysis, features with high variances/wide ranges get more weight than those with low variance and consequently end up illegitimately dominating the first principal components (the components with maximum variance).
• BEFORE CLUSTERING:
Clustering models are distance-based algorithms: in order to measure similarities between observations and form clusters, they use a distance metric. Features with high ranges will therefore have a bigger influence on the clustering, and standardization is required before building a clustering model.
• BEFORE KNN:
k-nearest neighbor is a distance-based classifier that classifies new observations based on
similarity measures (e.g., distance metrics) with labeled observations of the training set.
Standardization makes all variables contribute equally to the similarity measures.
• BEFORE SVM
Support Vector Machine tries to maximize the distance between the separating plane and the
support vectors. If one feature has very large values, it will dominate over other features when
calculating the distance. So Standardization gives all features the same influence on the
distance metric.
• BEFORE MEASURING VARIABLE IMPORTANCE IN REGRESSION MODELS
You can measure variable importance in regression analysis by fitting a regression model on the standardized independent variables and comparing the absolute values of their standardized coefficients. If the independent variables are not standardized, comparing their coefficients becomes meaningless.
• BEFORE LASSO AND RIDGE REGRESSION
LASSO and Ridge regression place a penalty on the magnitude of the coefficient associated with each variable, and the scale of a variable affects how much its coefficient is penalized: variables with large variance have small coefficients and are therefore penalized less. Standardization is therefore required before fitting either regression.
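As a sketch of the first two cases in the list above (standardizing before PCA and before clustering), the usual pattern is again a pipeline; the dataset and parameters are assumptions chosen only for demonstration.

from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)

# Without scaling, the features with the widest ranges dominate the principal components.
pca_raw = PCA(n_components=2).fit(X)
pca_scaled = make_pipeline(StandardScaler(), PCA(n_components=2)).fit(X)
print("explained variance ratio (raw):   ", pca_raw.explained_variance_ratio_)
print("explained variance ratio (scaled):", pca_scaled.named_steps["pca"].explained_variance_ratio_)

# The same idea applies before clustering: distances are computed on scaled features.
kmeans = make_pipeline(StandardScaler(), KMeans(n_clusters=3, n_init=10, random_state=0))
print(kmeans.fit_predict(X)[:10])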
Tree-based algorithms such as decision trees, random forests, and gradient boosting, in contrast, remain fairly insensitive to the scale of the features, for the reasons discussed above: each split considers only a single feature and is not influenced by the others.