Dimensionality Reduction
What is Predictive Modeling: Predictive modeling is a probabilistic process that allows
us to forecast outcomes on the basis of some predictors. These predictors are essentially
the features that come into play when deciding the final result, i.e., the outcome of the model.
Dimensionality reduction is the process of reducing the number of features (or
dimensions) in a dataset while retaining as much information as possible. This can be
done for a variety of reasons, such as to reduce the complexity of a model, to improve
the performance of a learning algorithm, or to make it easier to visualize the data. There
are several techniques for dimensionality reduction, including principal component
analysis (PCA), singular value decomposition (SVD), and linear discriminant analysis
(LDA). Each technique uses a different method to project the data onto a lower-
dimensional space while preserving important information.
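To make this concrete, here is a minimal sketch of dimensionality reduction with PCA in scikit-learn. The synthetic 10-feature dataset and the choice of 3 components are illustrative assumptions, not values tied to any particular application.

import numpy as np
from sklearn.decomposition import PCA

# Illustrative synthetic dataset: 100 samples, 10 features.
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(100, 10))

# Project the 10-dimensional data onto the 3 directions of
# maximum variance.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (100, 3)
print(pca.explained_variance_ratio_)  # variance captured per component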
Feature Extraction:
Feature extraction involves creating new features by combining or transforming the
original features. The goal is to create a set of features that captures the essence of the
original data in a lower-dimensional space. There are several methods for feature
extraction, including principal component analysis (PCA), linear discriminant analysis
(LDA), and t-distributed stochastic neighbor embedding (t-SNE). PCA is a popular
technique that projects the original features onto a lower-dimensional space while
preserving as much of the variance as possible.
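As a small sketch of feature extraction for visualization, the snippet below embeds scikit-learn's bundled handwritten-digits dataset into two dimensions with t-SNE; the dataset and the perplexity value are assumptions chosen only for illustration.

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 1797 samples, 64 features (8x8 pixel images).
X, y = load_digits(return_X_y=True)

# Embed the 64-dimensional data into 2 dimensions; perplexity
# controls the effective neighborhood size (30 is the default).
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_2d = tsne.fit_transform(X)

print(X_2d.shape)  # (1797, 2), ready to scatter-plot, colored by y

Unlike PCA, t-SNE is a non-linear technique aimed mainly at visualization; it preserves local neighborhood structure rather than global variance.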
Why is Dimensionality Reduction important in Machine Learning and Predictive
Modeling?
An intuitive example of dimensionality reduction can be discussed through a simple e-
mail classification problem, where we need to classify whether the e-mail is spam or
not. This can involve a large number of features, such as whether or not the e-mail has a
generic title, the content of the e-mail, whether the e-mail uses a template, etc.
However, some of these features may overlap. As another example, a classification
problem that relies on both humidity and rainfall can be collapsed into just one
underlying feature, since the two are highly correlated.
Hence, we can reduce the number of features in such problems. A 3-D classification
problem can be hard to visualize, whereas a 2-D one can be mapped to a simple 2-
dimensional plane, and a 1-D problem to a simple line. Following the same idea, a 3-D
feature space can be split into two 2-D feature spaces, and, if the features in one of
them turn out to be correlated, the number of features can be reduced even further.
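A small sketch of this collapsing step, assuming synthetic humidity and rainfall measurements generated to be highly correlated (the numbers are made up purely for illustration):

import numpy as np
from sklearn.decomposition import PCA

# Synthetic weather data where rainfall tracks humidity closely.
rng = np.random.default_rng(seed=1)
humidity = rng.uniform(40, 100, size=200)
rainfall = 0.8 * humidity + rng.normal(scale=3.0, size=200)
X = np.column_stack([humidity, rainfall])

print(np.corrcoef(humidity, rainfall)[0, 1])  # close to 1.0

# Because the two features are highly correlated, a single
# principal component captures almost all of the variance.
pca = PCA(n_components=1)
X_1d = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # e.g. [0.99...]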
Components of Dimensionality Reduction
There are two components of dimensionality reduction:
Feature selection: In this, we try to find a subset of the original set of variables,
or features, that is sufficient to model the problem (a sketch of one such approach
follows this list). It usually involves one of three approaches:
1. Filter
2. Wrapper
3. Embedded
Feature extraction: This transforms the data in a high-dimensional space into a
space with fewer dimensions.
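As one illustrative sketch of the filter approach, the snippet below scores each feature independently against the target with an ANOVA F-test and keeps the two best; the Iris dataset and k=2 are assumptions chosen only to keep the example small.

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

# Iris: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Filter method: score every feature against the target on its
# own, then keep the k highest-scoring features.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)        # per-feature F-scores
print(selector.get_support())  # boolean mask of the kept features
print(X_selected.shape)        # (150, 2)

Wrapper methods would instead evaluate candidate feature subsets by training a model on each (e.g., recursive feature elimination), while embedded methods let the model select features as part of training (e.g., L1-regularized models).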
Methods of Dimensionality Reduction
The various methods used for dimensionality reduction include principal component
analysis (PCA), singular value decomposition (SVD), linear discriminant analysis (LDA),
and t-distributed stochastic neighbor embedding (t-SNE), as introduced above.
Disadvantages of Dimensionality Reduction
It may lead to some amount of data loss.
PCA tends to find linear correlations between variables, which is sometimes
undesirable.
PCA fails in cases where the mean and covariance are not enough to characterize
the dataset.
We may not know how many principal components to keep; in practice, rules of
thumb such as retaining a fixed fraction of the total variance are applied (see the
sketch after this list).
Interpretability: The reduced dimensions may not be easily interpretable, and it
may be difficult to understand the relationship between the original features
and the reduced dimensions.
Overfitting: In some cases, dimensionality reduction may lead to overfitting,
especially when the number of components is chosen based on the training
data.
Sensitivity to outliers: Some dimensionality reduction techniques are sensitive
to outliers, which can result in a biased representation of the data.
Computational complexity: Some dimensionality reduction techniques, such as
manifold learning, can be computationally intensive, especially when dealing
with large datasets.
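On the question of how many components to keep, one widely used rule of thumb is to retain just enough components to explain a chosen fraction of the total variance. The sketch below assumes a 95% threshold and uses scikit-learn's digits dataset; both choices are illustrative.

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)

# Passing a float in (0, 1) tells scikit-learn's PCA to keep the
# smallest number of components whose cumulative explained
# variance reaches that fraction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(pca.n_components_)                    # components actually kept
print(pca.explained_variance_ratio_.sum())  # >= 0.95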