02 Linear Algebra

Introduction to Data Science

Linear Algebra

Arijit Mondal
Dept. of Computer Science & Engineering
Indian Institute of Technology Patna
arijit@iitp.ac.in

1 CS244
Matrix representation
• Matrices are everywhere!
• Data — a dataset can be represented as an n × m matrix
• Each row represents an example
• Each column represents a distinct feature / dimension
• Geometric point set — an n × m matrix can denote n points in m-dimensional space
• Systems of equations — an equation like y = c0 + c1 x1 + · · · + cm−1 xm−1 can be modeled as one row of an n × m coefficient matrix
• Graphs & Networks — city networks, chemical structures, etc.
• Rearrangements — permutations of a given set of elements
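The data-as-matrix view can be sketched in a few lines of NumPy (the feature names and values here are made up for illustration):

```python
import numpy as np

# n = 3 examples (rows), m = 2 features (columns): say, height (m) and weight (kg)
X = np.array([[1.70, 65.0],
              [1.80, 80.0],
              [1.65, 55.0]])

n, m = X.shape          # n = 3 examples, m = 2 features
first_example = X[0]    # one row  = one example
heights = X[:, 0]       # one column = one feature across all examples
```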
Geometry & Vectors
• Vector — a 1 × d matrix. In the geometric sense, a ray from the origin through a given point in d-dimensional space
• Normalization — in many scenarios vectors are normalized to have unit norm
• Dot product —
• Useful for reducing a pair of vectors to a scalar
• Can be used to measure the angle between vectors
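A minimal sketch of normalization and of the dot product as an angle measure (vectors chosen so the angle comes out to 45°):

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([1.0, 1.0])

dot = np.dot(x, y)                        # reduces two vectors to a scalar
x_hat = x / np.linalg.norm(x)             # normalization: unit norm
y_hat = y / np.linalg.norm(y)
cos_theta = np.dot(x_hat, y_hat)          # cosine of the angle between them
theta = np.degrees(np.arccos(cos_theta))  # 45 degrees for these two vectors
```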
Matrix operations
• Addition: C = A + B, Cij = Aij + Bij
• Scalar multiplication: A′ = cA, A′ij = c · Aij
• Linear combination: αA + (1 − α)B
(image source: Data Science Design Manual)


Matrix Transpose
• Let M be a matrix and Mᵀ its transpose; then (Mᵀ)ij = Mji
• (Aᵀ)ᵀ = A
• Let C = A + Aᵀ; then Cij = Aij + Aji = Cji, so C is symmetric
(image source: Data Science Design Manual)

Matrix multiplication
• It is an aggregated version of the vector dot (inner) product
• x · y = Σᵢ xᵢ yᵢ
• For row vectors X and Y, the matrix product XYᵀ is a 1 × 1 matrix containing the dot product X · Y
• C = AB, Cij = Σₖ Aik Bkj
• It does not commute: usually AB ≠ BA
• It is associative: A(BC) = (AB)C
• Consider the following matrices: A1×n, Bn×n, Cn×n, Dn×1. Which of the following is better — (AB)(CD) or (A(BC))D?
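The parenthesization question above can be explored by counting scalar multiplications: multiplying a p × q matrix by a q × r matrix costs p·q·r multiplications, so the two orders differ dramatically even though they give the same answer.

```python
# Scalar-multiplication cost of multiplying a p×q matrix by a q×r matrix.
def cost(p, q, r):
    return p * q * r

n = 100
# A: 1×n, B: n×n, C: n×n, D: n×1
ab_cd = cost(1, n, n) + cost(n, n, 1) + cost(1, n, 1)   # (AB)(CD): O(n^2)
a_bc_d = cost(n, n, n) + cost(1, n, n) + cost(1, n, 1)  # (A(BC))D: O(n^3)
```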
Covariance matrix
• Multiplication by the transpose is common, i.e. A · Aᵀ
• Both A · Aᵀ and Aᵀ · A are compatible for multiplication
• Let An×d be a feature matrix: each row represents an item and each column denotes a feature
• C = AAᵀ is an n × n matrix of dot products
• Cij measures how similar item i is to item j (how "in sync" they are)
• D = AᵀA is a d × d matrix of dot products measuring agreement among the features
• Dij represents the similarity between feature i and feature j
• Covariance formula: Cov(X, Y) = Σᵢ₌₁ⁿ (Xi − X̄)(Yi − Ȳ)
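A small sketch of both transpose products, using an illustrative 3-item × 2-feature matrix; centering the columns first makes Aᵀ·A match the slide's covariance formula (which omits the 1/n factor), and dividing by n − 1 recovers NumPy's sample covariance:

```python
import numpy as np

# Hypothetical 3-item × 2-feature matrix
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

C = A @ A.T     # 3×3: item-item dot products
D = A.T @ A     # 2×2: feature-feature dot products

# Center each column, then A^T A gives the (unnormalized) covariance matrix
Ac = A - A.mean(axis=0)
cov = Ac.T @ Ac
```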
Covariance matrix (contd)
• A, A · Aᵀ, Aᵀ · A
(image source: Data Science Design Manual)

Matrix multiplication & Paths
• Square matrices can be multiplied without transposition
• A matrix can represent the connectivity of nodes in a given network
• Let An×n be the adjacency matrix of the network
• A²ij = Σₖ₌₁ⁿ Aik Akj counts the paths of length two from node i to node j
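The path-counting interpretation can be checked on a tiny graph; here a 3-node path graph 0 – 1 – 2:

```python
import numpy as np

# Adjacency matrix of the path graph 0 – 1 – 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = A @ A
# A2[i, j] counts length-2 paths from i to j:
# exactly one path 0 -> 1 -> 2, and two length-2 round trips from node 1
```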
Matrix multiplication & Permutations
• Multiplication is often used to rearrange the order of the elements in a matrix
• Multiplication with the identity matrix (I) does not rearrange anything
• I contains exactly one non-zero element in each row and each column
• Any matrix with this property is known as a permutation matrix
• For example, multiplication with P(2431):

P(2431) = [ 0 0 1 0 ; 1 0 0 0 ; 0 0 0 1 ; 0 1 0 0 ]

M = [ 11 12 13 14 ; 21 22 23 24 ; 31 32 33 34 ; 41 42 43 44 ]

PM = [ 31 32 33 34 ; 11 12 13 14 ; 41 42 43 44 ; 21 22 23 24 ]
Permutations Example
(image source: Data Science Design Manual)
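The P(2431) example from the slide can be reproduced directly; each row of P picks out one row of M:

```python
import numpy as np

P = np.array([[0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])
M = np.array([[11, 12, 13, 14],
              [21, 22, 23, 24],
              [31, 32, 33, 34],
              [41, 42, 43, 44]])

PM = P @ M   # rows of M reordered: 3rd, 1st, 4th, 2nd
```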


Linear transformation

A = [ 1 3 ; 2 1 ], x = [ 2 ; 1 ]

Ax = [ 1 ; 2 ] × 2 + [ 3 ; 1 ] × 1 = [ 5 ; 5 ]

(figure: x and Ax plotted in the x1–x2 plane)
Rotating points in space
• Multiplying by the right matrix can rotate a set of points about the origin by an angle θ

Rθ = [ cos(θ) −sin(θ) ; sin(θ) cos(θ) ]

• [ x′ ; y′ ] = Rθ [ x ; y ] = [ x cos(θ) − y sin(θ) ; x sin(θ) + y cos(θ) ]
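A sketch of the rotation matrix in action; rotating the point (1, 0) by 90° lands it at (0, 1), and composing a rotation with its inverse rotation gives the identity:

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation(np.pi / 2)   # rotate 90 degrees about the origin
p = np.array([1.0, 0.0])
p_rot = R @ p             # ends up at (0, 1)
```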
Identity matrix
• Identity plays a big role in algebraic structure
• 0 is the identity element for the addition operation
• 1 is the identity element for the multiplication operation
• An inverse operation takes an element x back to the identity
• For addition, the inverse of x is −x
• For multiplication, the inverse of x is 1/x
• For a matrix, we say A⁻¹ is the multiplicative inverse if A · A⁻¹ = I
• For a 2 × 2 matrix, A⁻¹ = [ a b ; c d ]⁻¹ = (1 / (ad − bc)) [ d −b ; −c a ]
• A matrix that is not invertible is known as a singular matrix
• Gaussian elimination can be used to find the inverse
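The 2 × 2 inverse formula is easy to implement and check against NumPy; a zero determinant signals a singular matrix:

```python
import numpy as np

def inv2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix: no inverse")
    return np.array([[d, -b],
                     [-c, a]]) / det

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])
check = A @ inv2x2(A)   # should be the 2x2 identity
```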
Inversion Example
• Inverse of the Lincoln image and M · M⁻¹
(image source: Data Science Design Manual)


Linear Systems, Matrix Rank
• Linear systems
• Consider the linear equation y = c0 + c1 x1 + · · · + cm−1 xm−1
• The coefficients of n such linear equations can be represented as a matrix C of size n × m
• CX = Y ⇒ X = C⁻¹Y
• What happens if the inverse does not exist?
• Matrix Rank
• The rank of a matrix is the number of linearly independent rows
• Rank can be determined using Gaussian elimination
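A small system illustrates both ideas; `np.linalg.solve` avoids forming C⁻¹ explicitly, and `matrix_rank` exposes when a system's rows are not independent:

```python
import numpy as np

# Two equations in two unknowns: x1 + 3*x2 = 5 and 2*x1 + x2 = 5
C = np.array([[1.0, 3.0],
              [2.0, 1.0]])
y = np.array([5.0, 5.0])

x = np.linalg.solve(C, y)           # preferred over computing C^-1 directly
rank = np.linalg.matrix_rank(C)     # 2: the rows are linearly independent
```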
Factoring matrices
• Factoring a matrix A into matrices B and C represents a particular notion of division
• A non-singular matrix has an inverse: I = M · M⁻¹
• Matrix factorization is an important abstraction in data science, leading to compact feature representations
• Suppose A can be factored as A ≈ BC, where A is n × m, B is n × k, and C is k × m, with k < min(n, m)
(image source: Data Science Design Manual)

Eigenvalues & Eigenvectors
• Multiplying a vector U by a matrix A can have the same effect as multiplying it by a scalar λ

[ −5 2 ; 2 −2 ] · [ 2 ; −1 ] = −6 [ 2 ; −1 ],   [ −5 2 ; 2 −2 ] · [ 1 ; 2 ] = −1 [ 1 ; 2 ]

• λ is an eigenvalue and U is an eigenvector
• Together, the eigenvectors and eigenvalues encode a lot of information about the matrix A
• Properties
• Each eigenvalue has an associated eigenvector
• There are in general n eigenvector-eigenvalue pairs for every full-rank n × n matrix
• Every pair of eigenvectors of a symmetric matrix is mutually orthogonal
• Two vectors are orthogonal if their dot product is 0
• The eigenvectors can play the role of dimensions or bases in an n-dimensional space
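The eigenpairs of the example matrix above can be recovered with `np.linalg.eig`; since the matrix is symmetric, the two eigenvectors also come out orthogonal:

```python
import numpy as np

A = np.array([[-5.0, 2.0],
              [2.0, -2.0]])

lam, U = np.linalg.eig(A)   # columns of U are the eigenvectors

# Each pair satisfies A @ u = lambda * u
for i in range(2):
    assert np.allclose(A @ U[:, i], lam[i] * U[:, i])

ortho = np.dot(U[:, 0], U[:, 1])   # ~0: symmetric matrix, orthogonal eigenvectors
```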
Example " #
y 1.25 0.75
A=
0.75 1.25
CS244

19
Example " #
y 1.25 0.75
A=
0.75 1.25
CS244

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" #
1.414
Av1 =
1.414

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" #
1.414
Av1 =
1.414

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" #
1.414
Av1 =
1.414

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" #
1.414
Av1 =
1.414

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 =
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 =
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Example " #
y 1.25 0.75
A=
0.75 1.25
" #
0.707
v1 = , λ1 = 2.0
0.707
" #
−0.707
v2 = , λ2 = 0.5
CS244

0.707
x
" ##
−0.354
1.414
Av21 = ∥x∥ = 1, find Ax
Given
1.414
0.354

19
Eigenvalue decomposition
• Any n × n symmetric matrix M can be decomposed into the sum of its n eigenvector products
• Let (λi, Ui) be the eigenpairs, i = 1, . . . , n, and assume λi ≥ λi+1
• Each eigenvector Ui is an n × 1 matrix; multiplying it by its transpose yields the n × n matrix Ui Uiᵀ, of the same dimension as M
• A linear combination of these matrices weighted by the corresponding eigenvalues gives the original matrix: M = Σᵢ₌₁ⁿ λi Ui Uiᵀ
• It holds for symmetric matrices
• Can be applied to the covariance matrix
• Using only the vectors associated with the largest eigenvalues, a good approximation of the matrix can be made
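A sketch of the decomposition on a small randomly generated symmetric matrix: summing all n rank-1 terms λi Ui Uiᵀ rebuilds M exactly, and keeping only the largest-magnitude eigenvalues gives an approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
M = B + B.T                        # an arbitrary symmetric matrix

lam, U = np.linalg.eigh(M)         # eigh: eigenpairs of a symmetric matrix

# Full sum of rank-1 eigenvector products rebuilds M exactly
M_rebuilt = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(4))

# Keeping only the two largest-|lambda| pairs gives an approximation
order = np.argsort(-np.abs(lam))
M_top2 = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in order[:2])
```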
Example
• Covariance of the Lincoln image & M − U1 U1ᵀ
(image source: Data Science Design Manual)

Error plot
• Reconstructing the Lincoln memorial from one, five, and fifty eigenvectors
(image source: Data Science Design Manual)

Singular Value Decomposition
• Eigenvalue decomposition is good, but it works only for symmetric matrices
• Singular value decomposition is a more general matrix factorization approach
• The SVD of an n × m matrix M factors it into three matrices Un×n, Dn×m, Vm×m, i.e. M = UDVᵀ, where D is a diagonal matrix
• The product U · D has the effect of multiplying Uij by Djj
• The relative importance of each column of U is provided by D
• DVᵀ provides the relative importance of each row of Vᵀ
• The weights in D are known as the singular values of M
Singular Value Decomposition
• Let X and Y be vectors of size n × 1 and 1 × m; then the matrix outer product P = X ⊗ Y is an n × m matrix with Pjk = Xj Yk
• Traditional matrix multiplication can be expressed as C = A · B = Σₖ Aₖ ⊗ Bᵀₖ
• Aₖ — the kth column of A, Bᵀₖ — the kth row of B
• M can be expressed as the sum of outer products of the vectors resulting from SVD, namely (UD)ₖ and Vᵀₖ
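The outer-product view of the SVD can be sketched on a small random matrix: summing all the rank-1 terms sₖ Uₖ ⊗ Vᵀₖ recovers M, and truncating the sum gives a low-rank approximation:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 4))

U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Full sum of outer products (UD)_k (V^T)_k recovers M exactly
M_full = sum(s[k] * np.outer(U[:, k], Vt[k]) for k in range(len(s)))

# Truncating to the top k singular values gives a rank-k approximation
k = 2
M_k = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(k))
```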
Example
• Vectors associated with the first 50 singular values; MSE of reconstruction
(image source: Data Science Design Manual)

Example
• Reconstruction with k = 5, 50, and error for k = 50
(image source: Data Science Design Manual)
