Basic Theory
- Through the picture above we can see the same two camels, but from different perspectives (different information axes) we receive information in different directions and may draw different conclusions. This is the intuitive picture behind the core idea of PCA, made precise below.
2.2. Concept:
- Principal component analysis (PCA) is a method frequently used when statistical analysts face data sets with many dimensions (big data). Its goal is to reduce the dimensionality of the data without losing the information needed to build models. PCA is a statistical algorithm that uses an orthogonal transformation to map a data set from a high-dimensional space to a new space with far fewer dimensions (often 2 or 3) while best preserving the variability of the data.
Each of the A original dimensions of the data X carries some degree of importance, so no direction can simply be omitted. Instead, a transformation is needed that rotates the dimensions of X until B of the new dimensions receive the largest share of the variance. Since the total variance of X is constant, the remaining (A − B) dimensions then carry very little importance, and the data can be represented on the new basis with the least possible "loss" in a space with fewer dimensions.
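As an illustration of this rotation, the following is a minimal sketch of PCA with NumPy. The data matrix `X`, the number of retained dimensions `B`, and all variable names are hypothetical, chosen only for this example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: N = 200 points in A = 3 dimensions.
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.5, 0.2, 0.1]])

# Center the data around its expectation (mean vector).
X_centered = X - X.mean(axis=0)

# Covariance matrix S of the centered data (N - 1 in the denominator).
S = X_centered.T @ X_centered / (X.shape[0] - 1)

# Eigendecomposition of S; eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)

# Keep the B directions with the largest variance (largest eigenvalues).
B = 2
components = eigvecs[:, ::-1][:, :B]   # columns = principal directions

# Rotate (project) the data onto the new B-dimensional basis.
X_reduced = X_centered @ components
print(X_reduced.shape)                 # (200, 2)
```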
2.3. Characteristics:
- Helps reduce the dimensionality of the data.
- Instead of keeping the coordinate axes of the old space, PCA builds a new space with fewer dimensions that represents the data as well as the old one, i.e., it preserves the variability of the data along each new dimension.
- The coordinate axes of the new space are linear combinations of those of the old space, so semantically PCA builds new features from the observed features; the good thing is that these new features still represent the original data well (see the sketch after this list).
- In the new space, latent associations in the data can be discovered that would be harder to detect in the old space, or would not be evident at all.
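For instance, each new feature is just a weighted sum of the original features. A tiny sketch with made-up weights (the direction `u1` and the observation `x` are hypothetical, with `u1` roughly unit length):

```python
import numpy as np

# Hypothetical principal direction in the old 3-D feature space.
u1 = np.array([0.94, 0.33, 0.08])
x  = np.array([2.0, 1.0, 3.0])   # one observation in old coordinates

z1 = u1 @ x                      # new feature = linear combination of old ones
print(z1)                        # 2.45
```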
2.4. Mathematical basis:
- Expectation (mean): simply the average of all the values. Given N values x_1, x_2, …, x_N:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
- Variance: the average of the squared distance from each point to the expectation. The smaller the variance, the closer the data points are to the expectation and the more similar they are to one another; the larger the variance, the more spread out the data is.

$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$
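A quick numeric check of these two formulas (the sample values below are made up for illustration):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
N = len(x)

mean = x.sum() / N                          # expectation: (1/N) * sum(x_i)
var = ((x - mean) ** 2).sum() / (N - 1)     # variance with N - 1, as above

print(mean)                                 # 5.0
print(var)                                  # 4.5714...
print(np.var(x, ddof=1))                    # same result via NumPy's ddof=1
```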
- Covariance: a measure of how two random variables vary together (as distinct from variance, which measures the variation of a single variable). If the two variables tend to vary together (that is, when one is above its expected value, the other tends to be above its expected value too), the covariance is positive. If, on the other hand, one tends to be above its expected value when the other is below its expected value, the covariance is negative. If the two variables are independent of each other, the covariance is 0.
$$\mathrm{COV}(X, Y) = \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$$
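A small sketch of this formula in code (the two sequences are made-up values; note the denominator here is N, matching the formula above):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 4.0, 6.0, 8.0])   # varies together with X
N = len(X)

cov_xy = ((X - X.mean()) * (Y - Y.mean())).sum() / N
print(cov_xy)                        # 2.5 > 0: X and Y vary together
```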
- Covariance matrix:
Given N data points represented by column vectors x_1, x_2, …, x_N, the expectation vector and covariance matrix of the entire data set are defined as:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i; \qquad S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T$$
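These definitions translate directly into code. A minimal sketch, with hypothetical data points stacked as the rows of a matrix:

```python
import numpy as np

# Hypothetical data: each row is one data point x_i (N = 5 points, 2 dims).
X = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [3.0, 1.0],
              [1.0, 2.0],
              [4.0, 2.0]])
N = X.shape[0]

x_bar = X.sum(axis=0) / N            # expectation vector
D = X - x_bar                        # centered data
S = D.T @ D / (N - 1)                # covariance matrix

print(S)
print(np.cov(X, rowvar=False))       # matches NumPy's built-in result
```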
To find the first principal component, we look for a unit vector u_1 that maximizes the variance of the data projected onto it, which equals $u_1^T S u_1$. That is, we maximize $u_1^T S u_1$ subject to the condition $u_1^T u_1 = 1$.
Using the Lagrange multiplier method of multivariable calculus, we form the Lagrange function

$$L(u_1, \lambda_1) = u_1^T S u_1 + \lambda_1 (1 - u_1^T u_1)$$

Setting the derivative with respect to u_1 to zero gives $S u_1 = \lambda_1 u_1$, i.e., $\lambda_1$ is an eigenvalue of S and $u_1$ is the eigenvector of S corresponding to that eigenvalue. Moreover, the variance attained is $u_1^T S u_1 = \lambda_1 u_1^T u_1 = \lambda_1$.
In short, the maximum value of the variance equals the largest eigenvalue $\lambda_1$, achieved when $u_1$ is chosen as the corresponding eigenvector.
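This conclusion can be checked numerically: the unit vector maximizing $u_1^T S u_1$ is the top eigenvector of S, and the maximum equals its eigenvalue. A sketch, reusing a small symmetric covariance matrix like the one above:

```python
import numpy as np

S = np.array([[2.5, 0.5],
              [0.5, 0.6]])               # a symmetric covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)     # eigenvalues in ascending order
lam1 = eigvals[-1]                       # largest eigenvalue
u1 = eigvecs[:, -1]                      # corresponding unit eigenvector

print(u1 @ S @ u1, lam1)                 # equal: max variance = lam1

# No other unit vector does better:
rng = np.random.default_rng(1)
u = rng.normal(size=(2,))
u /= np.linalg.norm(u)
print(u @ S @ u <= lam1 + 1e-12)         # True
```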