Linear Algebra
Arijit Mondal
Dept. of Computer Science & Engineering
Indian Institute of Technology Patna
arijit@iitp.ac.in
CS244
Matrix representation
• Matrices are everywhere!
• Data — a dataset can be represented as an n × m matrix
• A row represents an example
• Each column represents a distinct feature / dimension
• Geometric point set — an n × m matrix can denote n points in m-dimensional space
• Systems of equations — equations like y = c0 + c1x1 + · · · + cm−1xm−1 can be modeled as an n × m matrix
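As a small sketch of the data-matrix view (the feature names and values below are invented purely for illustration), rows are examples and columns are features:

```python
import numpy as np

# n = 4 examples, m = 3 features (hypothetical columns: height_cm, weight_kg, age_yr)
A = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
])

n, m = A.shape          # n examples, m features
example_2 = A[1]        # one row = one example
feature_age = A[:, 2]   # one column = one feature across all examples
```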
Geometry & Vectors
• Vectors — a 1 × d matrix. In a geometric sense, a ray from the origin through the given point in d-dimensional space
• Normalization — in many scenarios the vectors are normalized to have unit norm
• Dot product —
• Useful to reduce a pair of vectors to a scalar
• Can be used to measure the angle between two vectors
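A minimal numpy sketch of normalization and of the dot product as an angle measure (the two vectors are arbitrary examples):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([4.0, 3.0])

# Normalization: scale a vector to unit norm
u_hat = u / np.linalg.norm(u)

# Dot product reduces a pair of vectors to a scalar
d = np.dot(u, v)

# Angle via the dot product: cos(theta) = u.v / (|u| |v|)
cos_theta = d / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(cos_theta)
```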
Matrix operations
• Addition: C = A + B, Cij = Aij + Bij
• Scalar multiplication: A′ = cA, A′ij = c · Aij
• Linear combination: αA + (1 − α)B
• (Aᵀ)ᵀ = A
• Let C = A + Aᵀ; then Cij = Aij + Aji = Cji, so C is symmetric
Covariance matrix
• Multiplication by the transpose matrix is common, i.e. A · Aᵀ
• Both A · Aᵀ and Aᵀ · A are compatible for multiplication
• Let An×d be a feature matrix; each row represents an item and each column denotes a feature
• C = AAᵀ is an n × n matrix of dot products
• Cij is a measure of how similar item i is to item j (how in sync they are)
• D = AᵀA is a d × d matrix of dot products measuring how in sync the features are
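Both product matrices can be formed directly; a toy sketch with an invented 3 × 2 feature matrix (3 items, 2 features):

```python
import numpy as np

# Each row is an item, each column a feature (values are illustrative)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

C = A @ A.T   # n x n: dot products between items (item similarity)
D = A.T @ A   # d x d: dot products between features (feature similarity)
```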
Covariance matrix (contd)
• A, A · Aᵀ, Aᵀ · A
Matrix multiplication & Permutations
• Multiplication is often used to rearrange the order of the elements in a particular matrix
• Multiplication with the identity matrix (I) does not rearrange anything
• I contains exactly one non-zero element in each row and each column
• A matrix with this property is known as a permutation matrix
• For example, multiplication with P(2431):

          | 0 0 1 0 |       | 11 12 13 14 |        | 31 32 33 34 |
P(2431) = | 1 0 0 0 | , M = | 21 22 23 24 | , PM = | 11 12 13 14 |
          | 0 0 0 1 |       | 31 32 33 34 |        | 41 42 43 44 |
          | 0 1 0 0 |       | 41 42 43 44 |        | 21 22 23 24 |
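The P(2431) product can be checked in numpy; left-multiplication by a permutation matrix reorders the rows of M:

```python
import numpy as np

# Permutation matrix P(2431): exactly one 1 in each row and column
P = np.array([[0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])

M = np.array([[11, 12, 13, 14],
              [21, 22, 23, 24],
              [31, 32, 33, 34],
              [41, 42, 43, 44]])

PM = P @ M   # left-multiplying by P reorders the rows of M
```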
Permutations Example
• (figure)

Linear transformation

A = | 1 3 |    x = | 2 |
    | 2 1 |        | 1 |

Ax = | 1 | × 2 + | 3 | × 1 = | 5 |
     | 2 |       | 1 |       | 5 |
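The column-combination view of Ax can be verified with a short numpy sketch:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [2.0, 1.0]])
x = np.array([2.0, 1.0])

# Ax is a combination of A's columns weighted by the entries of x
Ax = A @ x
by_columns = x[0] * A[:, 0] + x[1] * A[:, 1]
```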
Rotating points in space
• Multiplying with the right matrix can rotate a set of points about the origin by angle θ

Rθ = | cos(θ) −sin(θ) |
     | sin(θ)  cos(θ) |

| x′ | = Rθ | x | = | x cos(θ) − y sin(θ) |
| y′ |      | y |   | x sin(θ) + y cos(θ) |
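A quick sketch of Rθ in numpy, rotating the point (1, 0) by 90° counter-clockwise:

```python
import numpy as np

def rotation(theta):
    # 2x2 rotation about the origin by angle theta (counter-clockwise)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

p = np.array([1.0, 0.0])
p_rot = rotation(np.pi / 2) @ p   # (1, 0) rotated 90 degrees
```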
Identity matrix
• Identity plays a big role in algebraic structure
• 0 is the identity element for the addition operation
• 1 is the identity element for the multiplication operation
• The inverse operation takes an element x back to its identity
• For the addition operation, the inverse of x is −x
• For the multiplication operation, the inverse of x is 1/x
• For a matrix, we say A⁻¹ is the multiplicative inverse if A · A⁻¹ = I
• For a 2 × 2 matrix:

  A⁻¹ = | a b |⁻¹ = 1/(ad − bc) |  d −b |
        | c d |                 | −c  a |

• A matrix that is not invertible is known as a singular matrix
• Gaussian elimination can be used to find the inverse
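A sketch checking the 2 × 2 closed form against numpy's general inverse (the matrix itself is just an arbitrary invertible example):

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

# Closed form for a 2x2 inverse: (1 / (ad - bc)) * [[d, -b], [-c, a]]
a, b, c, d = A.ravel()
A_inv = (1.0 / (a * d - b * c)) * np.array([[d, -b],
                                            [-c, a]])

# Multiplying by the inverse recovers the identity
I = A @ A_inv
```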
Inversion Example
• Inverse of the Lincoln image and M · M⁻¹
Factoring matrices
• Factoring a matrix A into matrices B and C represents a particular aspect of division
• A non-singular matrix has an inverse: I = M · M⁻¹
• Matrix factorization is an important abstraction in data science, leading to feature representation in a compact way
• Suppose matrix A can be factored as A ≈ BC, where A is n × m, B is n × k, and C is k × m, with k < min(n, m)
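One common way to obtain such a rank-k factorization is a truncated SVD; the sketch below uses random data, and SVD is only one of several possible factorizations:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))   # n = 6, m = 5

# Truncated SVD: keep only the top k singular triples
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
B = U[:, :k] * s[:k]   # n x k
C = Vt[:k, :]          # k x m
A_approx = B @ C       # rank-k approximation of A
```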
Example

A = | 1.25 0.75 |
    | 0.75 1.25 |

v1 = | 0.707 | , λ1 = 2.0      v2 = | −0.707 | , λ2 = 0.5
     | 0.707 |                      |  0.707 |

Av1 = | 1.414 |       Av2 = | −0.354 |
      | 1.414 |             |  0.354 |

Given ∥x∥ = 1, find Ax
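The eigenpairs of the example can be verified numerically: Av should equal λv for each pair (the 0.707 entries are rounded values of 1/√2):

```python
import numpy as np

A = np.array([[1.25, 0.75],
              [0.75, 1.25]])

v1 = np.array([0.707, 0.707])    # eigenvector for lambda1 = 2.0
v2 = np.array([-0.707, 0.707])   # eigenvector for lambda2 = 0.5

Av1 = A @ v1   # should be 2.0 * v1
Av2 = A @ v2   # should be 0.5 * v2
```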
Eigenvalue decomposition
• Any n × n symmetric matrix M can be decomposed into the sum of its n eigenvector products
• Let (λi, Ui) be the eigenpairs, i = 1, . . . , n, and assume λi ≥ λi+1
• Each eigenvector Ui is an n × 1 matrix; multiplying it by its transpose yields an n × n matrix, so the product Ui Uiᵀ has the same dimensions as M
• A linear combination of these matrices weighted by the corresponding eigenvalues gives the original matrix: M = Σ_{i=1}^{n} λi Ui Uiᵀ
• It holds for symmetric matrices
• Can be applied to a covariance matrix
• Using only the vectors associated with the largest eigenvalues, a good approximation of the matrix can be made
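A sketch of rebuilding a symmetric matrix from its eigenpairs, using an arbitrary 2 × 2 symmetric example:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric

# eigh is specialized for symmetric matrices; columns of U are orthonormal eigenvectors
lam, U = np.linalg.eigh(M)

# Rebuild M as the eigenvalue-weighted sum of rank-1 outer products Ui Ui^T
M_rebuilt = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(len(lam)))
```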
Example
• Covariance of Lincoln & M − U1 U1ᵀ
Singular Value Decomposition
• Let X and Y be vectors of size n × 1 and 1 × m; then the matrix outer product P = X ⊗ Y is an n × m matrix with Pjk = Xj Yk
• Traditional matrix multiplication can be expressed as C = A · B = Σ_k Ak ⊗ Bkᵀ
• Ak — kth column of A, Bkᵀ — kth row of B
• M can be expressed as the sum of outer products of the vectors resulting from SVD, namely (UD)k and Vkᵀ
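The outer-product view of the SVD can be checked numerically; the matrix below is random, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 4))

U, d, Vt = np.linalg.svd(M, full_matrices=False)
UD = U * d   # column k of UD is (UD)_k

# M as the sum of outer products of (UD)_k with the kth row of V^T
M_sum = sum(np.outer(UD[:, k], Vt[k, :]) for k in range(len(d)))
```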
Example
• Vectors associated with the first 50 singular values, MSE of reconstruction