
Linear Discriminant Analysis (LDA)

Introduction:

Linear Discriminant Analysis (LDA) uses linear combinations of the independent
variables to predict the class of the response variable.
It searches for the linear combination of predictor variables that best
separates the classes of the target variable.
Assumptions:
LDA assumes that the independent variables are normally distributed and that
the classes have equal variance/covariance.
When these assumptions are satisfied, LDA creates a linear decision boundary.
Note: Based on many research studies, it is observed that LDA performs well
even when these assumptions are violated.

Concept:
From the ‘Introduction’ section, we understand that LDA has 2 tasks:
1) Use a technique that separates the classes well, so that misclassification
error is reduced
2) Do this separation using a linear combination of the independent
variables

Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Let’s look at how it tries to achieve the 1st objective: separating the classes.

Data Description
Let’s consider data with 2 independent variables, ‘BMI’ and ‘Glucose’, used
to predict whether a patient is ‘Diabetic’ or not.


Separate the Classes

If we want good separation between the 2 classes, we can achieve it by:

1) Maximizing the distance between the class means, i.e., increasing the
between-class variance, AND
2) Restricting the within-class variance, i.e., reducing the within-class
variance
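These two objectives are usually combined into a single ratio (Fisher’s criterion): between-class variance divided by within-class variance along a chosen direction. As a rough sketch, with made-up ‘BMI’/‘Glucose’-style data (the numbers are illustrative, not from this document):

```python
import numpy as np

def fisher_criterion(X, y, w):
    """Ratio of between-class to within-class variance of the data
    projected onto direction w (two classes, labels 0 and 1)."""
    w = w / np.linalg.norm(w)
    z = X @ w                        # 1-D projection of every point
    z0, z1 = z[y == 0], z[y == 1]
    between = (z0.mean() - z1.mean()) ** 2
    within = z0.var() + z1.var()
    return between / within

# toy data: non-diabetic around (22, 90), diabetic around (30, 140)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([22, 90], 3, (50, 2)),
               rng.normal([30, 140], 3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

good = fisher_criterion(X, y, np.array([1.0, 1.0]))
bad = fisher_criterion(X, y, np.array([1.0, -1.0]))
print(good > bad)  # a direction aligned with the class shift scores higher
```

LDA can be viewed as picking the direction w that maximizes exactly this ratio.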

*** You might have heard a similar explanation while learning ‘ANOVA’, a
hypothesis-testing technique used to check whether there is a significant
difference between the group means, considering the within-group and
between-group variance.

Now let’s look at how it tries to achieve the 2nd objective, a linear
combination of the independent variables, using the idea of objective 1:


Maybe we can find a linear combination of the independent variables which
separates the 2 classes, as shown in the figure above.
Here, we are considering 2 independent variables and hence a 2-dimensional
space, where a line can do the job.
However, as the number of independent variables increases, e.g. with
3 independent variables, the scatter will be across 3 dimensions and we will
need a plane to separate the classes.
As the number of independent variables increases further, the complexity
increases further.
So, to take care of this, we use a dimension-reduction technique.

*** PCA is the dimension-reduction technique you learnt earlier, and we will
use a similar approach. However, while the objective of PCA is to find the
directions of maximal variance, the objective of LDA is to maximize the
separability between the classes, thus helping in classification.
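The contrast can be seen on synthetic data constructed so that the high-variance direction and the class-separating direction disagree (a sketch with the real sklearn API; the data is invented for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# classes are separated along axis 0, but most variance lies along axis 1
X = np.vstack([rng.normal([0, 0], [1, 10], (100, 2)),
               rng.normal([4, 0], [1, 10], (100, 2))])
y = np.array([0] * 100 + [1] * 100)

pca_dir = PCA(n_components=1).fit(X).components_[0]
lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y)
lda_dir = lda.scalings_[:, 0]

# PCA picks the high-variance axis; LDA picks the class-separating one
print(abs(pca_dir[1]) > abs(pca_dir[0]))
print(abs(lda_dir[0]) > abs(lda_dir[1]))
```

With K classes, LDA can produce at most K − 1 such discriminant directions, which is why it doubles as a supervised dimension-reduction technique.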

In the above figure, we see that the data points are projected onto the best
possible line, i.e. a line is chosen such that the distance between the class
means is maximized and the within-class variance is minimized, thus reducing
the chance of misclassification.

This best possible line is called the ‘Discriminant Function’, and its value
for a data point, a linear combination of the independent variables, is the
‘Discriminant Score’.
The discriminant function has the form D = β0 + β1X1 + β2X2 + …, where
β0, β1, β2, … are called the coefficients. If the original data is standardized
(mean zero and standard deviation 1), these are called standardized
coefficients. The larger the standardized coefficient, the larger is the
respective variable's unique contribution to the discrimination specified by
the respective discriminant function (Note: this interpretation is reliable for
continuous predictors).
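A quick sketch of this interpretation, on invented data where ‘Glucose’ (the second column) is built to be the stronger discriminator:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
# class means differ by 1 unit in column 0 but 50 units in column 1
X = np.vstack([rng.normal([25, 90], 5, (80, 2)),
               rng.normal([26, 140], 5, (80, 2))])
y = np.array([0] * 80 + [1] * 80)

# standardize first, so the coefficients become comparable
Xs = StandardScaler().fit_transform(X)
lda = LinearDiscriminantAnalysis().fit(Xs, y)
b = lda.coef_[0]
print(abs(b[1]) > abs(b[0]))  # the stronger discriminator gets the larger |coef|
```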

An alternative explanation of LDA is provided by Bayes’ Theorem:

The posterior probability, which is what interests us, is computed with the
help of the prior probability of each class and a multivariate Gaussian
distribution assumed for the predictor variables.
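Written out in the standard notation (the symbols below are the conventional ones, not taken from this document’s lost figure), with prior probability π_k for class k and class-conditional density f_k:

```latex
P(Y = k \mid X = x) \;=\; \frac{\pi_k \, f_k(x)}{\sum_{l=1}^{K} \pi_l \, f_l(x)}
```

where each f_k(x) is a multivariate Gaussian density with class mean μ_k and a covariance matrix Σ shared across classes; LDA predicts the class k with the highest posterior probability.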

Whether we take the ‘Discriminant Function’ approach or the ‘Bayes’ Theorem’
approach, it is observed that the results are the same.
LDA in Python (the sklearn library) applies Bayes’ Theorem to arrive at the
classification results. However, the fitted LDA model object also provides the
intercept and coefficients needed to form the discriminant function, and gives
the corresponding discriminant scores.
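A minimal sketch of both views with sklearn (the data is made up; the API calls — `predict_proba`, `coef_`, `intercept_`, `decision_function` — are real sklearn attributes):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([22, 90], 4, (60, 2)),    # non-diabetic
               rng.normal([31, 150], 4, (60, 2))])  # diabetic
y = np.array([0] * 60 + [1] * 60)

lda = LinearDiscriminantAnalysis().fit(X, y)

# Bayes' Theorem view: posterior class probabilities
posteriors = lda.predict_proba(X[:2])

# Discriminant-function view: intercept + coefficients give the score
scores = X[:2] @ lda.coef_.T + lda.intercept_
print(np.allclose(scores.ravel(), lda.decision_function(X[:2])))
```

Both views lead to the same predictions, which is the point made above.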

Logistic Regression vs. LDA:

 Logistic regression is primarily considered for binary classification
problems, although some of its extensions, like ‘one-vs-rest’, can be used
for multi-class classification problems. LDA is considered a better
choice whenever a multi-class classification problem has to be worked on.
 When the classes are well separated, the parameter estimates of logistic
regression can be unstable. LDA performs better in such
conditions.
 LDA is preferred when the dataset is small. Logistic regression uses the
maximum likelihood estimation procedure, which is only unbiased with
large datasets. When the dataset is small, it can produce overly confident
predictions, with coefficients having inflated absolute values.
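One way to compare the two on the same data is cross-validation; the sketch below uses invented data, and which model wins in practice depends on the dataset, so we only look at the scores side by side:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = np.vstack([rng.normal([22, 90], 6, (40, 2)),
               rng.normal([30, 140], 6, (40, 2))])
y = np.array([0] * 40 + [1] * 40)

logreg_acc = cross_val_score(LogisticRegression(), X, y, cv=5).mean()
lda_acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
print(logreg_acc, lda_acc)  # both should be high on well-separated data
```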

Applications:
 Binary Classification:
o Classification or prediction of which disease a patient has, based on
the data collected about the symptoms.
o Classification of a company’s creditworthiness based on its financial
performance.

 Multiclass Classification:
o Identification or classification of different objects in an image. E.g.
in an image captured at a traffic signal, LDA can be used to classify
children, cars, traffic signals, other vehicles, etc. In face recognition,
LDA allows objective evaluation of different features of the face for
identifying a human face.

