
CSC 411: Lecture 14: Principal Components Analysis & Autoencoders

Raquel Urtasun & Rich Zemel

University of Toronto

Nov 4, 2015

Today

Dimensionality Reduction
PCA
Autoencoders

Mixture models and Distributed Representations

One problem with mixture models: each observation is assumed to come from one of K prototypes
The constraint that only one component is active (responsibilities sum to one) limits representational power
Alternative: a distributed representation, with several latent variables relevant to each observation
These can be several binary/discrete variables, or continuous

Example: continuous underlying variables

What are the intrinsic latent dimensions in these two datasets?

How can we find these dimensions from the data?

Principal Components Analysis

PCA: most popular instance of the second main class of unsupervised learning methods, projection methods, aka dimensionality-reduction methods
Aim: find a small number of "directions" in input space that explain variation in the input data; re-represent data by projecting along those directions
Important assumption: variation contains information
Data is assumed to be continuous:
- linear relationship between data and learned representation

PCA: Common tool

Handles high-dimensional data
- if data has thousands of dimensions, it can be difficult for a classifier to deal with
Often the data can be described by a much lower dimensional representation
Useful for:
- Visualization
- Preprocessing
- Modeling: prior for new data
- Compression

PCA: Intuition
Assume we start with N data vectors of dimensionality D
Aim to reduce dimensionality:
- linearly project (multiply by a matrix) to a much lower dimensional space, M << D
Search for orthogonal directions in space with highest variance
- project data onto this subspace
Structure of the data vectors is encoded in the sample covariance

Finding principal components

To find the principal component directions, we center the data (subtract the sample mean from each variable)
Calculate the empirical covariance matrix:

$$C = \frac{1}{N}\sum_{n=1}^{N} (x^{(n)} - \bar{x})(x^{(n)} - \bar{x})^T$$

with $\bar{x}$ the mean

What's the dimensionality of x?
Find the M eigenvectors with the largest eigenvalues of C: these are the principal components
Assemble these eigenvectors into a D × M matrix U
We can now express D-dimensional vectors x by projecting them to the M-dimensional z

$$z = U^T x$$
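A minimal numpy sketch of this procedure (the names pca_fit, pca_project, X, and M are illustrative, not from the lecture): center the data, form the sample covariance, keep the eigenvectors with the largest eigenvalues, and project onto them.

```python
import numpy as np

def pca_fit(X, M):
    """Fit PCA by eigendecomposition of the sample covariance.

    X: (N, D) data matrix, one D-dimensional example per row.
    M: number of principal components to keep (M << D).
    Returns the sample mean (D,), the D x M matrix U of top eigenvectors,
    and the corresponding eigenvalues (M,).
    """
    x_bar = X.mean(axis=0)                 # sample mean
    Xc = X - x_bar                         # center the data
    C = Xc.T @ Xc / X.shape[0]             # D x D covariance: (1/N) sum (x - x_bar)(x - x_bar)^T
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh: C is symmetric; eigenvalues come back ascending
    order = np.argsort(eigvals)[::-1][:M]  # indices of the M largest eigenvalues
    return x_bar, eigvecs[:, order], eigvals[order]

def pca_project(X, x_bar, U):
    """Project D-dimensional rows of X to M-dimensional codes z = U^T (x - x_bar)."""
    return (X - x_bar) @ U

# Toy usage: 500 points in D = 5 dimensions, keep M = 2 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
x_bar, U, lam = pca_fit(X, M=2)
Z = pca_project(X, x_bar, U)               # (500, 2) latent representation
print(Z.shape, lam)
```
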
Standard PCA

Algorithm: to find M components underlying D-dimensional data

1. Select the top M eigenvectors of C (the data covariance matrix):

$$C = \frac{1}{N}\sum_{n=1}^{N} (x^{(n)} - \bar{x})(x^{(n)} - \bar{x})^T = U \Sigma U^T \approx U_{1:M} \Sigma_{1:M} U_{1:M}^T$$

where U is orthogonal, its columns being the unit-length eigenvectors,

$$U^T U = U U^T = I$$

and Σ is the diagonal matrix of eigenvalues, each giving the variance in the direction of its eigenvector

2. Project each input vector x into this subspace, e.g.,

$$z_j = u_j^T x; \qquad z = U_{1:M}^T x$$

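As a sanity check (an illustrative sketch, not part of the lecture, reusing the hypothetical pca_fit from the previous sketch): the same directions can be obtained from an SVD of the centered data matrix. They agree up to a sign flip per component, and the eigenvalues of C equal the squared singular values divided by N.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
x_bar, U, lam = pca_fit(X, M=2)                    # eigendecomposition-based components

Xc = X - x_bar
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt: unit-length principal directions
U_svd = Vt[:2].T                                   # top M = 2 directions as a D x M matrix

signs = np.sign(np.sum(U * U_svd, axis=0))         # resolve the per-component sign ambiguity
assert np.allclose(U, U_svd * signs)
assert np.allclose(lam, S[:2] ** 2 / Xc.shape[0])  # eigenvalues of C = squared singular values / N
```
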
Two Derivations of PCA

Two views/derivations (illustrated by a figure on the original slide):
- Maximize variance (the scatter of the green projected points)
- Minimize error (the red-green distance per datapoint)

PCA: Minimizing Reconstruction Error

We can think of PCA as projecting the data onto a lower-dimensional subspace
One derivation is that we want to find the projection such that the best linear reconstruction of the data is as close as possible to the original data

$$J = \sum_n \|x^{(n)} - \tilde{x}^{(n)}\|^2$$

where

$$\tilde{x}^{(n)} = \sum_{j=1}^{M} z_j^{(n)} u_j + \sum_{j=M+1}^{D} b_j u_j$$

The objective is minimized when the first M components are the eigenvectors with the maximal eigenvalues

$$z_j^{(n)} = (x^{(n)})^T u_j; \qquad b_j = \bar{x}^T u_j$$

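A short sketch of this reconstruction (reusing the illustrative pca_fit/pca_project helpers from above; the toy data and names are not from the lecture). With C defined with a 1/N factor as above, the minimized error J equals N times the sum of the discarded eigenvalues, which the final check confirms numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
M = 2
x_bar, U, lam = pca_fit(X, M)
Z = pca_project(X, x_bar, U)

# x_tilde = U z + x_bar, which equals the slide's sum over kept components
# plus the mean's contribution b_j = x_bar^T u_j along the discarded ones.
X_tilde = Z @ U.T + x_bar
J = np.sum((X - X_tilde) ** 2)            # J = sum_n ||x^(n) - x_tilde^(n)||^2

# The error equals N times the sum of the D - M discarded eigenvalues of C.
C = (X - x_bar).T @ (X - x_bar) / X.shape[0]
all_lam = np.sort(np.linalg.eigvalsh(C))[::-1]
assert np.allclose(J, X.shape[0] * all_lam[M:].sum())
print(J)
```
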
Applying PCA to faces

Run PCA on 2429 19x19 grayscale images (CBCL data)
Compresses the data: can get good reconstructions with only 3 components
PCA for pre-processing: can apply a classifier to the latent representation (see the sketch below)
- PCA with 3 components obtains 79% accuracy on face/non-face discrimination in test data vs. 76.8% for a mixture of Gaussians with 84 states
Can also be good for visualization

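A sketch of that pre-processing pipeline, assuming the face/non-face images are already loaded as flattened rows of X_train/X_test with labels y_train/y_test. These names, the random placeholder arrays, and the choice of scikit-learn's PCA and LogisticRegression are illustrative; the lecture does not specify which classifier produced the numbers above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: X_train/X_test hold flattened 19x19 grayscale images
# (shape (n_images, 361)), y_train/y_test hold face / non-face labels.
# Random placeholder arrays are used here just so the sketch runs end to end.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 361)), rng.integers(0, 2, size=200)
X_test, y_test = rng.normal(size=(100, 361)), rng.integers(0, 2, size=100)

# Reduce each image to a 3-dimensional latent code, then classify in that space.
pca = PCA(n_components=3).fit(X_train)
Z_train = pca.transform(X_train)
Z_test = pca.transform(X_test)

clf = LogisticRegression(max_iter=1000).fit(Z_train, y_train)
print("test accuracy:", clf.score(Z_test, y_test))
```
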
Applying PCA to faces: Learned basis

Applying PCA to digits

Relation to Neural Networks

PCA is closely related to a particular form of neural network
An autoencoder is a neural network trained to reproduce its own inputs at its outputs
The goal is to minimize reconstruction error

Autoencoders

Define

$$z = f(Wx); \qquad \hat{x} = g(Vz)$$

Goal:

$$\min_{W,V} \frac{1}{2N} \sum_{n=1}^{N} \|x^{(n)} - \hat{x}^{(n)}\|^2$$

If g and f are linear,

$$\min_{W,V} \frac{1}{2N} \sum_{n=1}^{N} \|x^{(n)} - V W x^{(n)}\|^2$$

In other words, the optimal solution is PCA.

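A minimal gradient-descent sketch of the linear case (all names, the toy data, and the learning-rate/step choices are illustrative, not from the lecture). With f and g the identity, minimizing the objective above drives the columns of V to span the same subspace as the top-M principal components, though V itself need not equal U.

```python
import numpy as np

# Linear autoencoder z = W x, x_hat = V z, trained by gradient descent on
# (1/2N) sum_n ||x^(n) - V W x^(n)||^2. Illustrative sketch on toy data.
rng = np.random.default_rng(0)
N, D, M = 500, 5, 2
X = rng.normal(size=(N, D)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])  # distinct variances per dimension
X = X - X.mean(axis=0)                         # centered data, as in PCA

W = 0.01 * rng.normal(size=(M, D))             # encoder weights
V = 0.01 * rng.normal(size=(D, M))             # decoder weights
lr = 0.02
for step in range(10000):
    Z = X @ W.T                                # codes z^(n), shape (N, M)
    X_hat = Z @ V.T                            # reconstructions, shape (N, D)
    R = X_hat - X                              # residuals
    grad_V = R.T @ Z / N                       # gradient of the objective w.r.t. V
    grad_W = V.T @ R.T @ X / N                 # gradient of the objective w.r.t. W
    V -= lr * grad_V
    W -= lr * grad_W

# Compare the learned subspace (column space of V) with the top-M principal subspace.
C = X.T @ X / N
U = np.linalg.eigh(C)[1][:, ::-1][:, :M]       # top-M eigenvectors of the covariance
P_pca = U @ U.T                                # projector onto the principal subspace
Q, _ = np.linalg.qr(V)                         # orthonormal basis for the autoencoder's subspace
print("subspace difference:", np.linalg.norm(P_pca - Q @ Q.T))  # should be close to zero
```
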
Autoencoders: Nonlinear PCA

What if g(·) is not linear?
Then we are basically doing nonlinear PCA
There are some subtleties, but in general this is an accurate description

Comparing Reconstructions

