3.2 PCA

PCA is an algorithm used to reduce dimensionality in datasets by transforming variables into a set of orthogonal principal components. It works by computing the covariance matrix of the dataset and calculating its eigenvectors and eigenvalues. The principal components with the highest eigenvalues contain the most information about the dataset and are used to reduce its dimensionality.


The Problem of Dimensionality

 Handling high-dimensional data is difficult in practice; this difficulty is commonly known as the curse of dimensionality.
 As the dimensionality of the input dataset increases, machine learning algorithms and models become more complex.
 As the number of features increases, the number of samples needed to cover the feature space grows proportionally, and the chance of overfitting increases.
 A machine learning model trained on high-dimensional data therefore tends to overfit and perform poorly.
 Hence it is often necessary to reduce the number of features, which can be done with dimensionality reduction.
Principal Component Analysis (PCA)

 Principal Component Analysis is an unsupervised learning algorithm used for dimensionality reduction in machine learning.
 It is a technique for drawing out strong patterns in a dataset by reducing the number of variables while retaining as much of the variance as possible.
 It is a feature extraction technique: it constructs new variables (the principal components), keeps the most important ones, and drops the least important ones.
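As a minimal sketch of PCA used as a feature-extraction step (this example is not part of the original notes; the random dataset and the choice of 2 components are illustrative assumptions):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))          # 100 samples, 4 features (made up)

    pca = PCA(n_components=2)              # keep the 2 strongest components
    X_reduced = pca.fit_transform(X)       # new, orthogonal features

    print(X_reduced.shape)                 # (100, 2)
    print(pca.explained_variance_ratio_)   # fraction of variance each PC keeps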
How PCA Works: Key Terms
 Dimensionality: the number of features, variables, or columns present in the dataset.

 Correlation: signifies how strongly two variables are related to each other, i.e., if one changes, the other also changes.

 Orthogonal: indicates that variables are uncorrelated with each other, so the correlation between each pair of variables is zero.

 Eigenvectors: represent the directions along which the principal components are aligned.

 Eigenvalues: represent the amount of variance (spread) captured by each principal component.

 Covariance Matrix: a matrix containing the covariances between each pair of variables.
Procedure for Performing Principal Component Analysis

Step-by-Step Computation of PCA

1. Getting the dataset

2. Representing the data in a structure

3. Standardization of the data

4. Computing the covariance matrix

5. Calculating the eigenvectors and eigenvalues, and sorting the eigenvectors

6. Computing the principal components

7. Reducing the dimensions of the dataset (removing the less important features from the new dataset); a NumPy sketch of steps 3-7 follows this list
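The following NumPy sketch walks through steps 3-7 on a made-up 5 × 4 dataset (the numbers are assumptions for illustration, not from the original notes):

    import numpy as np

    X = np.array([[2.5, 2.4, 0.5, 0.7],
                  [0.5, 0.7, 2.2, 2.9],
                  [2.2, 2.9, 1.9, 2.2],
                  [1.9, 2.2, 3.1, 3.0],
                  [3.1, 3.0, 2.3, 2.7]])       # 5 samples, 4 features

    # Step 3: standardize each feature to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    # Step 4: covariance matrix of the standardized data (4x4).
    C = np.cov(Z, rowvar=False)

    # Step 5: eigenvalues/eigenvectors, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(C)        # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Steps 6-7: keep the top-k eigenvectors and project the data.
    k = 2
    W = eigvecs[:, :k]                          # the feature matrix
    X_reduced = Z @ W                           # 5x2 reduced dataset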
1. Getting the dataset
 First, we take the input dataset.
2. Representing the data in a structure
 Consider a dataset with 4 features and a total of 5 training examples.
 Each row corresponds to a data item (sample), and each column corresponds to a feature.
 The number of columns is the dimensionality of the dataset.
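Concretely (the values below are made up for illustration), such a dataset can be stored as a 5 × 4 matrix:

    import numpy as np

    # 5 training examples (rows) x 4 features (columns); values made up.
    X = np.array([[2.5, 2.4, 0.5, 0.7],
                  [0.5, 0.7, 2.2, 2.9],
                  [2.2, 2.9, 1.9, 2.2],
                  [1.9, 2.2, 3.1, 3.0],
                  [3.1, 3.0, 2.3, 2.7]])

    print(X.shape)   # (5, 4): the dimensionality of the dataset is 4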
3. Standardization of the data

• Suppose we have 2 variables in our dataset:
 one has values ranging between 10 and 100, and
 the other has values between 1000 and 5000.
• The output would be biased, since the variable with the larger range would have a disproportionate impact on the outcome.
• Therefore, standardizing the data into a comparable range is very important.
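A short sketch of z-score standardization for the two-variable example above (the ranges mirror the text; the individual values are assumptions):

    import numpy as np

    X = np.array([[ 10., 1000.],
                  [ 55., 3000.],
                  [100., 5000.]])

    # z-score: subtract each column's mean, divide by its std deviation
    Z = (X - X.mean(axis=0)) / X.std(axis=0)

    print(Z.mean(axis=0))   # approximately 0 for both columns
    print(Z.std(axis=0))    # exactly 1 for both columns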
4. Computing the covariance matrix

 PCA helps to identify the correlations and dependencies among the features in a dataset.
 A covariance matrix expresses the correlation between the different variables in the dataset.
 It is important to identify heavily dependent variables, because they carry redundant information that can reduce the overall performance of the model.

E.g., for a 2-dimensional dataset with variables a and b, the covariance matrix is the 2×2 matrix shown below.
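Written out (a standard reconstruction, since the matrix itself does not survive in these notes):

    C = \begin{pmatrix}
          \operatorname{cov}(a,a) & \operatorname{cov}(a,b) \\
          \operatorname{cov}(b,a) & \operatorname{cov}(b,b)
        \end{pmatrix},
    \qquad
    \operatorname{cov}(a,b) = \frac{1}{n-1} \sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b})

Note that cov(a, a) = var(a) and cov(a, b) = cov(b, a), so the matrix is symmetric with the variances on its diagonal.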
5. Calculating the eigenvectors and eigenvalues

 Eigenvectors
 The covariance matrix is used to find the directions in which the data has the most variance.
 Since more variance denotes more information about the data, the eigenvectors are used to identify and compute the principal components.
 Eigenvalues simply denote the scalars of the respective eigenvectors; each eigenvalue measures how much variance is captured along its eigenvector.
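A sketch of this eigen-decomposition in NumPy (the 2 × 2 covariance matrix is an assumed example):

    import numpy as np

    C = np.array([[2.0, 0.8],
                  [0.8, 0.6]])             # assumed covariance matrix

    # eigh is appropriate for symmetric matrices such as C
    eigvals, eigvecs = np.linalg.eigh(C)

    print(eigvals)    # variance captured along each eigenvector
    print(eigvecs)    # columns are the (mutually orthogonal) directions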
6. Computing the principal components

 The eigenvector with the highest eigenvalue is the most significant and forms the first principal component.

 The principal components of lesser significance can then be removed in order to reduce the dimensions of the data.

 The final step in computing the principal components is to form a matrix, known as the feature matrix, whose columns are the significant eigenvectors; it captures the maximum information about the data.
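Continuing the assumed 2 × 2 example, ranking the eigenvectors by eigenvalue and keeping the top one yields the feature matrix:

    import numpy as np

    C = np.array([[2.0, 0.8],
                  [0.8, 0.6]])
    eigvals, eigvecs = np.linalg.eigh(C)

    order = np.argsort(eigvals)[::-1]    # largest eigenvalue first
    eigvecs = eigvecs[:, order]

    k = 1                                # keep only the first PC
    W = eigvecs[:, :k]                   # the feature matrix (2 x 1)
    print(W)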
7. Reducing the dimensions of the dataset (removing the less important features from the new dataset)

 The last step in performing PCA is to project the original (standardized) data onto the final principal components, which represent the maximum and most significant information of the dataset.
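As a sketch of this final projection (Z and W follow the earlier examples; the data here are randomly generated stand-ins):

    import numpy as np

    rng = np.random.default_rng(0)
    Z = rng.normal(size=(5, 4))            # stand-in for standardized data

    C = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]   # top-2 feature matrix

    X_reduced = Z @ W      # 5 x 2: each row re-expressed in PC coordinates
    print(X_reduced.shape)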
Advantages of PCA

 Easy to compute.
 Speeds up other machine learning algorithms.
 Counteracts the issues of high-dimensional data.
Disadvantages of PCA

 Low interpretability of principal components.
 The trade-off between information loss and dimensionality reduction.
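One common way to manage this trade-off is to keep just enough components to retain a target fraction of the variance; a sketch with scikit-learn (the 95% threshold and random data are illustrative assumptions):

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))           # made-up dataset

    pca = PCA().fit(X)                       # fit with all components
    cum = np.cumsum(pca.explained_variance_ratio_)
    k = int(np.searchsorted(cum, 0.95)) + 1  # smallest k retaining ~95%
    print(k, cum[k - 1])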
Applications of PCA in Machine Learning

 Visualizing multidimensional data.
 Reducing the number of dimensions in healthcare data.
 Resizing (compressing) images.
 Analyzing stock data and forecasting returns in finance.
 Finding patterns in high-dimensional datasets.
Summary

 Consider several points plotted on a 2-D plane.
 There are two principal components.
 PC1 is the primary principal component and explains the maximum variance in the data.
 PC2 is another principal component, orthogonal to PC1.
 Each principal component is a straight line that captures part of the variance of the data; it has a direction and a magnitude.
 Principal components are orthogonal (perpendicular) projections of the data onto a lower-dimensional space.
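The picture described above can be reproduced with a short matplotlib sketch (randomly generated 2-D data; the covariance values are assumptions):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=200)

    # PC directions from the covariance matrix, largest variance first
    C = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    plt.scatter(X[:, 0], X[:, 1], s=10, alpha=0.5)
    for val, vec in zip(eigvals, eigvecs.T):      # PC1, then PC2
        dx, dy = vec * 2.0 * np.sqrt(val)         # scale arrow by spread
        plt.arrow(0, 0, dx, dy, width=0.03, color="red")
    plt.axis("equal")
    plt.title("PC1 and PC2 as orthogonal directions of maximum variance")
    plt.show()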
