
1

BIA B350F Applied Multivariate Analysis for Business

Unit 4 Principal Component Analysis


2

Principal Component Analysis


Sometimes data are collected on a large number of variables from a single population. With
a large number of variables, the corresponding covariance matrix may be too large to
analyze and interpret.

Thus, there is a need to reduce the number of variables to a few significant linear
combinations of the data that are easier to interpret and analyze. In this context, each
linear combination will correspond to a principal component.

Principal component analysis is a technique that is used to simplify a data set. It can be
used to reduce dimensionality by eliminating principal components that are considered to
be relatively less important.

In general, the aim of a principal component analysis is to explain the variance–covariance
structure of a set of variables through a few significant linear combinations of these variables.
3

Principal Component Analysis


If there are k original variables involved, normally k principal components are
needed to reproduce the variability of the data collected. However, it is often
the case that most of the total variability can be accounted for by p principal
components with p < k. In such cases, the p principal components are capable
of replacing the initial k variables. Principal component analysis is often
treated as an intermediate tool to reduce the complexity of data. Its outputs
could become inputs to multiple regression analysis, factor analysis and so on.
4

Framework of Principal Components


Given a random vector x and the population covariance matrix Σ of x, with eigenvalues
λ1 ≥ λ2 ≥ λ3 ≥ … ≥ λk ≥ 0 and corresponding normalized eigenvectors e1, e2, e3, …, ek:

x = (X1, X2, X3, …, Xk)'

Var(x) = Σ = [ σ11  σ12  …  σ1k
               σ21  σ22  …  σ2k
                ⋮    ⋮        ⋮
               σk1  σk2  …  σkk ]

Consider the linear combinations:

Y1 = a1'x = a11 X1 + a12 X2 + … + a1k Xk
Y2 = a2'x = a21 X1 + a22 X2 + … + a2k Xk
⋮
Yk = ak'x = ak1 X1 + ak2 X2 + … + akk Xk

with

Var(Yi) = Var(ai'x) = ai'Σai,  i = 1, 2, …, k
Cov(Yi, Yj) = ai'Σaj,  i, j = 1, 2, …, k

5

The principal components are those uncorrelated linear combinations Y1, Y2, …, Yk whose
variances are as large as possible.

In other words, the k principal components are the uncorrelated linear combinations with the
k largest variances Var(Yi) = ai'Σai that can be generated from different choices of the
coefficient vectors ai, provided that ai'ai = 1.

Y1 = first principal component = the linear combination a1'x that maximizes Var(a1'x)
subject to a1'a1 = 1.

Y2 = second principal component = the linear combination a2'x that maximizes Var(a2'x)
subject to a2'a2 = 1 and Cov(a1'x, a2'x) = 0.

Yi = i-th principal component = the linear combination ai'x that maximizes Var(ai'x)
subject to ai'ai = 1 and Cov(ai'x, aj'x) = 0 for j < i.
6

Determining the coefficient vector ai for the i-th principal component

How do we find the coefficient vector ai for the i-th principal component?
The solution involves the eigenvalues and eigenvectors of the covariance matrix Σ.
Recall that the population covariance matrix Σ of the random vector x has eigenvalues
λ1 ≥ λ2 ≥ λ3 ≥ … ≥ λk ≥ 0 and corresponding normalized eigenvectors e1, e2, e3, …, ek.

It turns out that the i-th eigenvector is the coefficient vector of the i-th principal
component, and that the variance of the i-th principal component is equal to the i-th
eigenvalue. The resulting principal components are uncorrelated with one another.

Yi = ei'x = ei1 X1 + ei2 X2 + … + eik Xk,  where ei' = (ei1, ei2, …, eik)

Var(Yi) = Var(ei'x) = λi
Cov(Yi, Yj) = 0 for i ≠ j
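
As an illustration in base R (a minimal sketch, assuming a small made-up covariance matrix), eigen() returns exactly these quantities: the eigenvalues in decreasing order are the component variances, and the columns of the eigenvector matrix are the coefficient vectors ei.

# minimal sketch with an assumed 2 x 2 covariance matrix
Sigma <- matrix(c(2.0, 0.8,
                  0.8, 1.0), nrow = 2, byrow = TRUE)
eig <- eigen(Sigma)
eig$values    # lambda_1 >= lambda_2: the variances of Y1 and Y2
eig$vectors   # columns e_1, e_2: the coefficient vectors of the principal components
# check: Var(Y1) = e_1' Sigma e_1 equals the first eigenvalue
drop(t(eig$vectors[, 1]) %*% Sigma %*% eig$vectors[, 1])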
7

The Spectral Decomposition

Let A be a k × k symmetric matrix. Then A can be expressed in terms of its k
eigenvalue–eigenvector pairs (λi, ei) as

A = λ1 e1 e1' + λ2 e2 e2' + … + λk ek ek'   (the sum of λi ei ei' over i = 1, …, k)

This implies that any symmetric matrix can be reconstructed from its eigenvalues
and eigenvectors.
8

The Spectral Decomposition of the Covariance Matrix Σ

The population covariance matrix Σ of the random vector x is symmetric.
Hence, Σ can be decomposed as:

Σ = λ1 e1 e1' + λ2 e2 e2' + … + λk ek ek'
  ≈ λ1 e1 e1' + λ2 e2 e2' + … + λp ep ep'   (keeping only the first p terms)

The second expression is a useful approximation of Σ if λp+1, λp+2, …, λk are
relatively small.
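
A minimal base-R sketch of this decomposition, again assuming a small illustrative symmetric matrix: summing λi ei ei' over all pairs recovers Σ, and keeping only the leading terms gives the low-rank approximation.

Sigma <- matrix(c(2.0, 0.8,
                  0.8, 1.0), nrow = 2, byrow = TRUE)
eig <- eigen(Sigma)
# full reconstruction: the sum of lambda_i * e_i e_i' recovers Sigma
recon <- Reduce(`+`, lapply(seq_along(eig$values),
                            function(i) eig$values[i] * tcrossprod(eig$vectors[, i])))
all.equal(recon, Sigma)                            # TRUE up to numerical tolerance
# rank-1 approximation: keep only the leading eigenvalue-eigenvector pair
Sigma_approx <- eig$values[1] * tcrossprod(eig$vectors[, 1])
Sigma_approx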
9

Total Variance of the Principal Components


Let x = (X1, X2, …, Xk)' have covariance matrix Σ with eigenvalue–eigenvector pairs
(λ1, e1), (λ2, e2), …, (λk, ek), where λ1 ≥ λ2 ≥ … ≥ λk ≥ 0.

Let Y1 = e1'x, Y2 = e2'x, …, Yk = ek'x be the principal components. Then

σ11 + σ22 + … + σkk = Var(X1) + … + Var(Xk) = λ1 + λ2 + … + λk = Var(Y1) + … + Var(Yk)

That is, total population variance = σ11 + σ22 + … + σkk = λ1 + λ2 + … + λk.

Hence, the proportion of total variance due to (explained by) the i-th principal
component is given by:

λi / (λ1 + λ2 + … + λk),  i = 1, 2, …, k
10

Correlation Coefficient between Yi and Xj


Each entry of the eigenvector ei = (ei1, ei2, …, eij, …, eik)' has a meaningful
interpretation: the magnitude of eij measures the contribution of the j-th variable
to the i-th principal component. In particular, eij is proportional to the correlation
coefficient between Yi and Xj.

For the principal components Y1 = e1'x, Y2 = e2'x, …, Yk = ek'x with respect to the
covariance matrix Σ, the correlation coefficient between the principal component Yi
and the variable Xj is given by

Corr(Yi, Xj) = eij √λi / √σjj

These simple correlations between the principal components and the original variables
are also called loadings.
11

Example 4.1
Suppose the random variables X1, X2, and X3 have the covariance matrix Σ:

Σ = [ 1   0   0
      0   4  −2
      0  −2   2 ]

with eigenvalues and normalized eigenvectors

λ1 = 5.236,  e1' = (0, −0.8507, 0.5257)
λ2 = 1.000,  e2' = (1, 0, 0)
λ3 = 0.764,  e3' = (0, 0.5257, 0.8507)

Therefore, the principal components of Σ are:

Y1 = e1'x = −0.8507 X2 + 0.5257 X3
Y2 = e2'x = X1
Y3 = e3'x = 0.5257 X2 + 0.8507 X3

Var(Y1) = Var(−0.8507 X2 + 0.5257 X3)
        = (−0.8507)² Var(X2) + (0.5257)² Var(X3) + 2(−0.8507)(0.5257) Cov(X2, X3)
        = (−0.8507)²(4) + (0.5257)²(2) + 2(−0.8507)(0.5257)(−2) = 5.236 = λ1
12

Example 4.1

Recall Σ, its eigenvalues λ1 = 5.236, λ2 = 1.000, λ3 = 0.764, and the principal
components Y1, Y2, Y3 from the previous page.

Var(Y2) = Var(X1) = 1 = λ2

Var(Y3) = Var(0.5257 X2 + 0.8507 X3)
        = (0.5257)² Var(X2) + (0.8507)² Var(X3) + 2(0.5257)(0.8507) Cov(X2, X3)
        = (0.5257)²(4) + (0.8507)²(2) + 2(0.5257)(0.8507)(−2) = 0.764 = λ3

Total population variance = σ11 + σ22 + σ33 = 1 + 4 + 2 = λ1 + λ2 + λ3 = 5.236 + 1 + 0.764 = 7

The proportion of total variance accounted for by the first principal component is:
λ1 / (λ1 + λ2 + λ3) = 5.236 / (5.236 + 1 + 0.764) = 0.748

The proportion of total variance accounted for by the first two principal components is:
(λ1 + λ2) / (λ1 + λ2 + λ3) = (5.236 + 1) / (5.236 + 1 + 0.764) = 0.8909

Thus, the principal components Y1 and Y2 could replace the original three random variables with
negligible loss of information.
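
The quantities quoted in Example 4.1 can be reproduced with a few lines of base R; the sketch below is only a verification aid (note that eigen() may return eigenvectors with flipped signs).

Sigma <- matrix(c(1,  0,  0,
                  0,  4, -2,
                  0, -2,  2), nrow = 3, byrow = TRUE)
eig <- eigen(Sigma)
eig$values                            # 5.236  1.000  0.764
eig$vectors                           # columns e1, e2, e3 (signs may differ from the slides)
sum(diag(Sigma)); sum(eig$values)     # both equal 7, the total population variance
eig$values / sum(eig$values)          # 0.748  0.143  0.109
cumsum(eig$values) / sum(eig$values)  # 0.748  0.891  1.000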
13

Example 4.1

Recall e1' = (0, −0.8507, 0.5257) with λ1 = 5.236, and σ11 = 1, σ22 = 4, σ33 = 2.

Corr(Y1, X1) = e11 √λ1 / √σ11 = 0 × √5.236 / √1 = 0

Corr(Y1, X2) = e12 √λ1 / √σ22 = −0.8507 × √5.236 / √4 = −0.9733

Corr(Y1, X3) = e13 √λ1 / √σ33 = 0.5257 × √5.236 / √2 = 0.8506

Consider the first principal component Y1: the coefficient of variable X2 is −0.8507, which is the
largest weight in Y1, while the coefficient of variable X3 is 0.5257. This indicates that X2
contributes more than X3 to the formation of Y1. X1 does not contribute to Y1, hence their
correlation is zero.

The correlations between Y1 and X2 and between Y1 and X3 are −0.9733 and 0.8506 respectively.
Although the coefficient of X3 is much smaller than that of X2, both correlations are large in
magnitude, so both variables are important to Y1.
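
These correlations can be verified by plugging the Example 4.1 quantities into Corr(Yi, Xj) = eij√λi/√σjj; a short base-R check:

lambda1  <- 5.236
e1       <- c(0, -0.8507, 0.5257)
sigma_jj <- c(1, 4, 2)                        # diagonal elements of Sigma
e1 * sqrt(lambda1) / sqrt(sigma_jj)           # 0.000  -0.973  0.851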
14

Example 4.1

Recall e2' = (1, 0, 0) with λ2 = 1.000, and σ11 = 1, σ22 = 4, σ33 = 2.

Corr(Y2, X1) = e21 √λ2 / √σ11 = 1 × √1 / √1 = 1

Corr(Y2, X2) = e22 √λ2 / √σ22 = 0 × √1 / √4 = 0

Corr(Y2, X3) = e23 √λ2 / √σ33 = 0 × √1 / √2 = 0

Since Y2 equals X1, the correlation between Y2 and X1 is 1.

Since the third principal component is not important, its correlations could be ignored.
15

Desired Number of Principal Components to Keep


There is no clear-cut rule about how many principal components should be retained.
However, you can use the following guidelines to help you determine the desired
number of retained components:
1. Divide the variance of each principal component (the size of its eigenvalue) by the total
variance, and choose the first n principal components that together explain most of the
total variance.
2. Cattell’s Scree Test and Horn’s Parallel Analysis (explained in more detail later).
3. Apply the Kaiser rule (only applicable when the principal component analysis is performed
on the correlation matrix).
4. The interpretations of the components with respect to the original given scenario.
16

Principal Components of Sample Data

With reference to the framework of principal components established for
population data, the principal components using sample data could be determined
in the same way.

Let x1, x2, x3, …, xn be n independent drawings from a k-dimensional population with
mean vector µ and covariance matrix Σ. The sample mean vector and sample
covariance matrix are x̄ and S respectively.
17

Principal Components of Sample Data


If the sample covariance matrix S has eigenvalues ˆ1  ˆ2    and
ˆ 0
k

corresponding normalized eigenvectors eˆ1 , eˆ2 ,with


 , eˆxk be any observations on the

variables X1, X2, …, Xk, then the ith sample principal component is given by

yˆ i  eˆi x  eˆi1 x1  eˆi 2 x2    eˆik xk , i  1,2,  , k


such that

Sample Variance  yˆ i   ˆi , i  1,2,  , k ; Sample Covarianceyˆ i , yˆ j   0, i j


Total sample variance  s11  s22    skk  ˆ1  ˆ2    ˆk
eˆij ˆi
Corr yˆ i , x j   , i, j  1,2,  , k
s jj
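
A minimal base-R sketch of these sample quantities, assuming a hypothetical data matrix X (the data below are randomly generated purely for illustration):

set.seed(1)
X <- matrix(rnorm(100 * 4), nrow = 100)       # hypothetical data: n = 100, k = 4
S   <- cov(X)                                 # sample covariance matrix
eig <- eigen(S)
eig$values                                    # lambda_hat_1 >= ... >= lambda_hat_k
sum(diag(S)); sum(eig$values)                 # total sample variance, computed two ways
# sample principal component scores (data mean-centred here, which is common practice)
scores <- scale(X, center = TRUE, scale = FALSE) %*% eig$vectors
round(cov(scores), 10)                        # diagonal ~ eigenvalues, off-diagonals ~ 0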
18

Example 4.2: Places Rated


In the Places Rated Almanac, Boyer and Savageau rated 329 communities in the United
States according to the following nine criteria:
1. Climate and Terrain
2. Housing
3. Health Care & the Environment
4. Crime
5. Transportation
6. Education
7. The Arts
8. Recreation
9. Economics

For housing and crime, the lower the score the better. For the rest of the variables, the
higher the score the better.

With nine variables, the covariance matrix may be too large to analyze and interpret in a
proper manner: there would be too many pairwise covariances between the variables to study.
A graphical display of the data also may not be very helpful if the data set is large. To
interpret the data in a more meaningful way, it is therefore necessary to reduce the number of
variables to a few interpretable linear combinations, i.e. principal components, of the data.
19

Example 4.2: Places Rated


Extracted R output: Principal Components Analysis, Observations = 329, Variables = 9

Covariance Matrix
           Climate         Housing         Health          Crime           Trans           Educate         Arts            Recreate        Econ
Climate    0.0128923499    0.0032677528    0.0054792649    0.0043741176    0.0003857247    0.0004415009    0.0106885887    0.0025732595   –0.0009661793
Housing    0.0032677528    0.0111161410    0.0145962100    0.0024830608    0.0052785799    0.0010695852    0.0292263029    0.0091269830    0.0026458304
Health     0.0054792649    0.0145962100    0.1027278915    0.0099549524    0.0211534636    0.0074778111    0.1184843654    0.0152994310    0.0014633998
Crime      0.0043741176    0.0024830608    0.0099549524    0.0286107020    0.0072989317    0.0004713186    0.0319465684    0.0092846815    0.0039464274
Trans      0.0003857247    0.0052785799    0.0211534636    0.0072989317    0.0248288688    0.0024618893    0.0470407089    0.0115674940    0.0008343588
Educate    0.0004415009    0.0010695852    0.0074778111    0.0004713186    0.0024618893    0.0025199764    0.0095204087    0.0008772470    0.0005464533
Arts       0.0106885887    0.0292263029    0.1184843654    0.0319465684    0.0470407089    0.0095204087    0.2971731520    0.0508599879    0.0062060281
Recreate   0.0025732595    0.0091269830    0.0152994310    0.0092846815    0.0115674940    0.0008772470    0.0508599879    0.0353078256    0.0027924140
Econ      –0.0009661793    0.0026458304    0.0014633998    0.0039464274    0.0008343588    0.0005464533    0.0062060281    0.0027924140    0.0071365383

Use sum(diag(R)) to obtain the total variance: 0.5223134457
20

Example 4.2: Places Rated


If you sum up all terms in “SS loadings”, it gives
you 0.5223, which is the total variance

PC1 PC2 PC3 PC4 PC5


SS loadings 0.3774624 0.05105221 0.02791958 0.02296708 0.01677125
Proportion Var 0.7226740 0.09774248 0.05345370 0.04397184 0.03210956
Cumulative Var 0.7226740 0.82041652 0.87387021 0.91784205 0.94995161 Table 4.1
Proportion Explained 0.7607483 0.10289207 0.05626992 0.04628850 0.03380125 Eigenvalues of the
Cumulative Proportion 0.7607483 0.86364033 0.91991024 0.96619875 1.00000000
covariance matrix

PC1 PC2 PC3 PC4 PC5


Climate 0.03507 0.008878 0.140875 0.152745 -0.39751
Housing 0.09335 0.009231 0.128850 -0.178382 -0.17531
Health 0.40776 -0.858532 0.276058 -0.035161 -0.05032 Table 4.2
Crime 0.10045 0.220424 0.592688 0.723663 0.01346 Normalized eigenvectors
Trans 0.15010 0.059201 0.220898 -0.126205 0.86997
Educate 0.03215 -0.060589 0.008145 -0.005197 0.04780
Arts 0.87434 0.303806 -0.363287 0.081116 -0.05507
Recreate 0.15900 0.333993 0.583626 -0.628226 -0.21329
Econ 0.01949 0.056101 0.120853 0.052170 -0.02965
21

Example 4.2: Places Rated


PC1 PC2 PC3 PC4 PC5
SS loadings 0.3774624 0.05105221 0.02791958 0.02296708 0.01677125
Proportion Var 0.7226740 0.09774248 0.05345370 0.04397184 0.03210956
Cumulative Var 0.7226740 0.82041652 0.87387021 0.91784205 0.94995161
Proportion Explained 0.7607483 0.10289207 0.05626992 0.04628850 0.03380125
Cumulative Proportion 0.7607483 0.86364033 0.91991024 0.96619875 1.00000000

“SS loadings” represents the eigenvalue of each PC; the sum of all SS loadings is 0.5223.
The proportion of variation explained by each eigenvalue is given in the “Proportion Var”
row. For example, 0.377462 divided by 0.5223 equals 0.7227, i.e. about 72% of the total
variation is explained by the first eigenvalue.

The cumulative percentage explained is obtained by adding the successive proportions of


variation explained to obtain the running total. For instance, 0.7227 plus 0.0977 equals 0.8204,
and so forth. Therefore, about 82% of the variation is explained by the first two eigenvalues
together.
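
The proportions and running totals in Table 4.1 can be recomputed directly from the “SS loadings” row; the sketch below uses only the five eigenvalues shown together with the quoted total variance 0.5223:

ss  <- c(0.3774624, 0.05105221, 0.02791958, 0.02296708, 0.01677125)   # first five eigenvalues
tot <- 0.5223134457                                                   # total variance (sum of all nine)
round(ss / tot, 4)             # Proportion Var:  0.7227 0.0977 0.0535 0.0440 0.0321
round(cumsum(ss) / tot, 4)     # Cumulative Var:  0.7227 0.8204 0.8739 0.9178 0.9500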
22

Example 4.2: Places Rated


PC1 PC2 PC3 PC4 PC5
SS loadings 0.3774624 0.05105221 0.02791958 0.02296708 0.01677125
Proportion Var 0.7226740 0.09774248 0.05345370 0.04397184 0.03210956
Cumulative Var 0.7226740 0.82041652 0.87387021 0.91784205 0.94995161
Proportion Explained 0.7607483 0.10289207 0.05626992 0.04628850 0.03380125
Cumulative Proportion 0.7607483 0.86364033 0.91991024 0.96619875 1.00000000

If you compute the differences of SS loadings between adjacent PCs, you can see that the
magnitude of the difference is decreasing.
Subtracting the second eigenvalue, 0.051, from the first eigenvalue, 0.377, gives a difference
of 0.326. The difference between the second and third eigenvalues is 0.0231; the next difference
is 0.0049. Subsequent differences are even smaller. A sharp drop from one eigenvalue to the
next may serve as another indicator of how many eigenvalues to consider.
The first three principal components explain 87% of the variation, which is an acceptably
large percentage.
23

Example 4.2: Places Rated


Scree Plot
Another way to determine the number of principal components to employ is to look at a scree
plot. With the eigenvalues ordered from largest to smallest, a scree plot is the plot of
λi versus i. The number of components is determined at the point beyond which the remaining
eigenvalues are all relatively small and of comparable size.

The scree plot is also used in parallel analysis. A parallel analysis simulates a random dataset
(or resampled dataset) with the same sample size as the input dataset. Then, eigenvalues are
computed from this simulated (or resampled) dataset. Finally, these simulated eigenvalues are
plotted along with the eigenvalues from the input dataset on the same scree plot.
Simulated eigenvalues are plotted as dashed lines.
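
As a sketch of the basic idea (the parallel-analysis version for this example is produced later with fa.parallel()), a plain scree plot can be drawn in base R from any vector of ordered eigenvalues; the vector below is purely hypothetical:

lambda <- c(5.1, 1.3, 0.6, 0.3, 0.2, 0.1)     # hypothetical ordered eigenvalues
plot(seq_along(lambda), lambda, type = "b",
     xlab = "Component number i", ylab = expression(lambda[i]), main = "Scree plot")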
24

Example 4.2: Places Rated

Kaiser rule
Kaiser (1960) recommends that we should retain only eigenvalues that are at least
equal to one. This is also known as the “eigenvalue > 1” rule.

However, scholars have argued that the Kaiser rule is only valid when the correlation
matrix is used to compute the PCs. Therefore, the Kaiser rule does not apply to
Example 4.2.
(Source: https://www.rasch.org/rmt/rmt191h.htm)
25

Example 4.2: Places Rated


Cattell’s Scree Test and Horn’s Parallel Analysis
Cattell (1966)’s scree test is performed by searching for a “bend” or “elbow” in the plot,
i.e. an abrupt transition from large to small eigenvalues. In the scree plot on the right,
the sharp break appears at the 2nd component. Therefore, using Cattell’s suggestion,
only the 1st PC should be retained.

Horn (1965)’s parallel analysis is an equally compelling procedure: each point of the actual
data that lies above the simulated-data or resampled-data line corresponds to a component
to extract. Again, only the 1st PC should be retained.

(Source: https://www.rasch.org/rmt/rmt191h.htm)
26

Example 4.2: Places Rated


Interpretation of the Principal Components

To interpret each component, the correlations between each original variable and each
principal component are computed and used to interpret the principal components.
Note that the principal components themselves are mutually uncorrelated.
27

First Principal Component Analysis


Traditionally, researchers have used a loading of 0.5 or above as the cutoff point.

                Principal Component
Variable          1       2       3
Climate           0.190   0.017   0.207
Housing           0.544   0.020   0.204
Health            0.782  -0.605   0.144
Crime             0.365   0.294   0.585
Transportation    0.585   0.085   0.234
Education         0.394  -0.273   0.027
Arts              0.985   0.126  -0.111
Recreation        0.520   0.402   0.519
Economy           0.142   0.150   0.239

The first principal component is strongly correlated with five of the original variables:
it increases as the scores in Arts, Health, Transportation, Housing and Recreation increase.
This suggests that these five criteria vary together. This component can be viewed as a
measure of the quality of Arts, Health, Transportation, and Recreation, and the lack of
quality in Housing.

Based on the largest correlation of 0.985, it could also be stated that this principal
component is primarily a measure of the Arts. Communities with high overall scores would
tend to have a lot of arts available, in terms of theaters, orchestras, etc., whereas
communities with small values would have very few of these types of facilities.
28

Second Principal Component Analysis


The second principal component is driven by only one of the variables: it increases as
Health decreases (loadings in the table above). This component can be viewed as a measure
of how unhealthy the location is in terms of available health care, including doctors,
hospitals, etc.

Third Principal Component Analysis

The third principal component increases with increasing Crime and Recreation. This
suggests that places with high scores in crime also tend to have better recreation
facilities.
29

Computation of Principal Component Scores


The scores of the principal components can be computed using the elements of the
eigenvectors.

          PC1      PC2        PC3        PC4        PC5
Climate   0.03507  0.008878   0.140875   0.152745  -0.39751
Housing   0.09335  0.009231   0.128850  -0.178382  -0.17531
Health    0.40776  -0.858532  0.276058  -0.035161  -0.05032
Crime     0.10045  0.220424   0.592688   0.723663   0.01346
Trans     0.15010  0.059201   0.220898  -0.126205   0.86997
Educate   0.03215  -0.060589  0.008145  -0.005197   0.04780
Arts      0.87434  0.303806  -0.363287   0.081116  -0.05507
Recreate  0.15900  0.333993   0.583626  -0.628226  -0.21329
Econ      0.01949  0.056101   0.120853   0.052170  -0.02965

For example, the first two principal component scores for an individual community of interest
can be computed using the elements of the eigenvectors and the values of that community for
each of the nine variables:

ŷ1 = 0.03507(Climate) + 0.09335(Housing) + 0.40776(Health) + … + 0.01949(Econ)

ŷ2 = 0.00888(Climate) + 0.00923(Housing) − 0.85853(Health) + … + 0.05610(Econ)
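
A minimal R sketch of the same score computation; the object names places (a data frame holding the nine variables in the order of Table 4.2) and ev (the matrix of eigenvectors with the PCs as columns) are hypothetical:

# 'places' and 'ev' are hypothetical objects, not created in these slides
X      <- as.matrix(places)
scores <- X %*% ev                   # column 1 = y_hat_1, column 2 = y_hat_2, ...
head(scores[, 1:2])                  # first two principal component scores per community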


30

Example 4.2: Places Rated - Scatter plots of the Principal Components


The scatter plot plots the first principal component against the second principal
component. Each dot in the plot represents one community, located according to its
scores on the two components.

Community #213 has a very high score for the first principal component, so it is expected
that this community possesses high values for the Arts, Health, Housing, Transportation
and Recreation.

Community #195 has a very high value for the second component. One can expect that this
community would be bad for Health.

Conversely, #255 at the bottom represents a community that would be expected to have high
values for Health.
31

Example 4.3: Using R to analyze U.S lawyers ratings


Here, we use an R data set that contains the raw data of lawyers’ ratings of 43 US judges
on the following 13 variables.

Variable Meaning of the variable


JUDGE Record identification variable
CONT Number of contacts of lawyer with judge
INTG Judicial integrity
DMNR Demeanor
DILG Diligence
CFMG Case flow managing
DECI Prompt decisions
PREP Preparation for trial
FAMI Familiarity with law
ORAL Sound oral rulings
WRIT Sound written rulings
PHYS Physical ability
RTEN Worthy of retention
32

Example 4.3: Using R to analyze U.S lawyers ratings


The R program of the principal component analysis (PCA) is given below.

# read the data
dat <- read.csv("judgeratings.csv")
dat2 <- dat[, 3:13]                      # extract only the numerical parts

# use the package "psych" to build the PCA model
library(psych)
out <- principal(dat2, nfactors = 11, rotate = "none", covar = TRUE, cor = "cov")
out                                      # summarize the output

## extract the eigenvalues and eigenvectors
# eigenvalues and eigenvectors are stored explicitly; call them out if needed
eigen.values  <- out$values
eigen.vectors <- out$loadings

# use "fa.parallel" to generate a scree plot
fa.parallel(dat2, fa = "pc", n.iter = 100, cor = "cov", show.legend = FALSE,
            main = "Scree plot with parallel analysis")
abline(h = 1)
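
As an optional cross-check, the “SS loadings” reported by principal() on the covariance matrix should agree with a direct eigendecomposition in base R:

eigen(cov(dat2))$values    # should match the "SS loadings" row reported by principal()
sum(diag(cov(dat2)))       # total sample variance = sum of the eigenvalues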
33

Example 4.3: Using R to analyze U.S lawyers ratings


Output from the PCA model - eigenvalues
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11
SS loadings 9.180 0.383 0.226 0.076 0.030 0.017 0.014 0.008 0.005 0.003 0.002
Proportion Var 0.923 0.039 0.023 0.008 0.003 0.002 0.001 0.001 0.001 0.000 0.000
Cumulative Var 0.923 0.962 0.984 0.992 0.995 0.997 0.998 0.999 0.999 1.000 1.000
Proportion Explained 0.923 0.039 0.023 0.008 0.003 0.002 0.001 0.001 0.001 0.000 0.000
Cumulative Proportion 0.923 0.962 0.984 0.992 0.995 0.997 0.998 0.999 0.999 1.000 1.000

The row “SS loadings” represents the eigenvalues of the components: the first
eigenvalue = 9.180, the second eigenvalue = 0.383, and so on.

Since the first principal component accounts for 92.3% of the variance, which is most of
it, and the second principal component accounts for only 3.9%, which is less than 5%,
only the first principal component will be retained.
34

Example 4.3: Using R to analyze U.S lawyers ratings


output from the PCA model - eigenvectors
Unstandardized loadings (pattern matrix) based upon covariance matrix
PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 h2
INTG 0.71 -0.26 0.09 0.00 0.08 0.06 0.03 0.04 0.02 0.01 0.00 0.59
DMNR 1.05 -0.43 -0.04 0.06 -0.06 -0.06 -0.01 0.01 0.00 0.00 0.00 1.31
DILG 0.87 0.12 0.14 0.09 0.10 -0.04 -0.03 0.00 -0.03 0.00 0.00 0.81

Standardized loadings (pattern matrix)


item PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 PC11 h2
INTG 1 0.92 -0.34 0.11 0.00 0.10 0.07 0.04 0.05 0.02 0.01 0.00 1
DMNR 2 0.92 -0.38 -0.03 0.05 -0.05 -0.05 -0.01 0.01 0.00 0.00 0.00 1
DILG 3 0.96 0.13 0.16 0.10 0.11 -0.05 -0.04 0.00 -0.03 0.00 0.00 1

R labels these quantities as “loadings”. The unstandardized loadings are the eigenvectors
scaled by the square roots of the eigenvalues (i.e., êij√λ̂i), expressed in the original units
of the variables, and the standardized loadings are the correlations between the principal
components and the variables (i.e., êij√λ̂i/√sjj). Hence, the sum of squares of the
unstandardized loadings for each principal component gives its eigenvalue (i.e., λ̂i).
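
These relationships can be checked numerically from the fitted object; the sketch below assumes out and dat2 from the earlier program and that out$loadings stores the unstandardized (covariance-metric) loadings, as the printed output suggests:

L <- unclass(out$loadings)            # unstandardized loadings (assumed covariance metric)
colSums(L^2)                          # sum of squares per component = the eigenvalues
L[, 1] / sqrt(diag(cov(dat2)))        # standardized loadings of PC1 = Corr(y_hat_1, x_j)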
35

Example 4.3: Using R to analyze U.S lawyers ratings


Output from the scree plot using fa.parallel()

[Scree plot with parallel analysis: the eigenvalues of the principal components are plotted
against the component number, with separate lines for the actual data (PC Actual Data),
the simulated data (PC Simulated Data) and the resampled data (PC Resampled Data).]

The scree plot shows a sharp break at the second component. This suggests that one PC
should be retained.

In addition, each point on the actual-data line (i.e., the observed eigenvalues) that lies
above the simulated-data or resampled-data line is a component to extract.
36

Example 4.3: Using R to analyze U.S lawyers ratings


Use the following R program to interpret the retained principal components.

## What is the correlation between the first and the second principal components?
# load the R package "ltm" to conduct a Pearson correlation test between the first 2 PCs
comp <- out$scores[, 1:2]
library(ltm)
rcor.test(cbind(comp, dat2))

## b. What does the first principal component represent?
# this biplot helps you visualize the relationship between the first 2 PCs
biplot.psych(out, choose = c(1, 2), main = "Pattern of the first two components")
37

Example 4.3: Using R to analyze U.S lawyers ratings


Output from rcor.test
PC1 PC2 INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS
PC1 ***** -0.000 0.923 0.921 0.965 0.959 0.956 0.982 0.974 0.996 0.990 0.895
PC2 >0.999 ***** -0.337 -0.377 0.129 0.210 0.227 0.118 0.123 0.019 0.034 0.121
INTG <0.001 0.027 ***** 0.965 0.872 0.814 0.803 0.878 0.869 0.911 0.909 0.742
DMNR <0.001 0.013 <0.001 ***** 0.837 0.813 0.804 0.856 0.841 0.907 0.893 0.789
DILG <0.001 0.411 <0.001 <0.001 ***** 0.959 0.956 0.979 0.957 0.954 0.959 0.813
CFMG <0.001 0.178 <0.001 <0.001 <0.001 ***** 0.981 0.958 0.935 0.951 0.942 0.879
DECI <0.001 0.144 <0.001 <0.001 <0.001 <0.001 ***** 0.957 0.943 0.948 0.946 0.872
PREP <0.001 0.450 <0.001 <0.001 <0.001 <0.001 <0.001 ***** 0.990 0.983 0.987 0.849
FAMI <0.001 0.432 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 ***** 0.981 0.991 0.844
ORAL <0.001 0.903 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 ***** 0.993 0.891

The correlation between the first two PCs is zero; in fact, this is true for any two PCs.
The correlation figures in this table are identical to the standardized loadings on p.34.

The loadings of the first principal component are all greater than 0.5, so the first
principal component is highly positively correlated with (has high positive loadings on)
all variables. This component seems to reflect all aspects of the judge, since it shows
positive loadings with all variables. It could be used as an index for rating the judges.
38

Example 4.3: Using R to analyze U.S lawyers ratings


INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS
PC1 0.923 0.921 0.965 0.959 0.956 0.982 0.974 0.996 0.990 0.895
PC2 -0.337 -0.377 0.129 0.210 0.227 0.118 0.123 0.019 0.034 0.121

[Screenshot: the loadings above (labelled “Loadings”) together with the PCA scores of the
sample judges (labelled “Scores”).]

Workbook – Unit 4 Q1
