
Lecture Notes on Computational Mathematics

Dr. K. Manjunatha Prasad
Professor of Mathematics,
Department of Data Science, PSPH
Manipal Academy of Higher Education, Manipal, Karnataka-576 104
kmprasad63@gmail.com, km.prasad@manipal.edu

M.Sc Data Science/Biostatistics/Digital Epidemiology (I sem), Batch 2021-2022

Contents

1 Decomposition of Matrices, Generalized inverses
  1.1 Rank
  1.2 Determinants
  1.3 Eigenvalues, Positive Definite Matrices and Decompositions
  1.4 Diagonalization
      1.4.1 Characteristics of a diagonalizable matrix
  1.5 Positive Definite Matrices
      1.5.1 Properties of PD and PSD matrices
  1.6 LU Decomposition
  1.7 Cholesky Decomposition
  1.8 Spectral Decomposition Theorem
  1.9 Singular Values
  1.10 The Singular Value Decomposition
  1.11 Generalized Inverses and Applications
  1.12 Construction of generalized inverse
  1.13 Minimum Norm, Least Squares g-inverse and Moore-Penrose inverse
      1.13.1 Construction of Moore-Penrose inverse

1. Decomposition of Matrices, Generalized inverses

1.1 Rank
Column space of a matrix: Given an m × n matrix A, each column of A is a vector in R^m, and the subspace spanned by those columns is known as the 'column space' of the matrix A (denoted by C(A)). The dimension of the column space of A is known as the 'column rank' of A.

Row space of a matrix: Given an m × n matrix A, each row of A is a vector in R^n, and the subspace spanned by those rows is known as the 'row space' of the matrix A (denoted by R(A)). The dimension of the row space of A is known as the 'row rank' of A.

Note: For any matrix A, we have

Row Rank(A) = Column Rank(A)

Definition 1.1 (Rank). The rank of an m × n matrix A is the Column Rank(A), which is the same as the Row Rank(A).

Theorem 1.1. Let A, B be matrices such that AB is defined. Then

rank(AB) ≤ min{rank(A), rank(B)}

Proof. A vector in C(AB) is of the form ABx for some vector x, and therefore it belongs to C(A). Therefore C(AB) ⊆ C(A) and hence rank(AB) ≤ rank A. Similarly, we observe that R(AB) ⊆ R(B) and therefore rank(AB) ≤ rank B.

Theorem 1.2. Let A be an m × n matrix of rank r, r ≠ 0. There exist matrices B, C of order m × r, r × n respectively such that rank B = rank C = r and A = BC. This decomposition of the matrix A is called a rank factorization of A.

Proof. Consider a basis for the column space of A, say b_1, b_2, ..., b_r. Construct an m × r matrix B = (b_1 · · · b_r). Since each column of A is a linear combination of the columns of B, there exists an r × n matrix C such that A = BC. From the definition of B, it is trivial that rank B = r. Since r = rank A ≤ rank C and C is of size r × n, we obtain rank C = r.

Exercise 1.1. Let A be an m × n matrix. Then N(A) = (C(A^T))^⊥.

Exercise 1.2. Let A be an n × n matrix. Then the following conditions are equivalent.

(i) A is nonsingular, i.e., rank A = n.

(ii) For any b ∈ R^n, Ax = b has a unique solution.

(iii) There exists a unique matrix B such that AB = BA = I.

Theorem 1.3. Let A, B be m × n matrices. Then rank(A + B) ≤ rank A + rank B.

Proof. Let A = XY, B = UV be rank factorizations of A, B. Then

A + B = XY + UV = \begin{pmatrix} X & U \end{pmatrix} \begin{pmatrix} Y \\ V \end{pmatrix}.

So, rank(A + B) ≤ rank(X U). Clearly, dim(C((X U))) ≤ dim(C(X)) + dim(C(U)) = rank A + rank B. This proves the theorem.

Exercise 1.3. Let A be an m × n matrix and let M and N be invertible matrices of size m × m and n × n, respectively. Then prove that

(i) rank(MA) = rank A;

(ii) rank(AN) = rank A.

Theorem 1.4. Given an m × n matrix A of rank r, there exist invertible matrices M, N of order m × m, n × n respectively such that

MAN = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.

Example 1. Obtain the canonical form of the following matrix and give two different rank factorizations:

A = \begin{pmatrix} 3 & 6 & 6 \\ 1 & 2 & 2 \end{pmatrix}.

Solution. The canonical form of A is given by

A = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q

where P and Q are invertible matrices and r is the rank of the matrix A.

Consider the augmented matrix [I : A] to find the invertible matrix P. Now

[I : A] = \begin{pmatrix} 1 & 0 & : & 3 & 6 & 6 \\ 0 & 1 & : & 1 & 2 & 2 \end{pmatrix}
\overset{R_1 \leftrightarrow R_2}{\simeq} \begin{pmatrix} 0 & 1 & : & 1 & 2 & 2 \\ 1 & 0 & : & 3 & 6 & 6 \end{pmatrix}
\overset{R_2 \to R_2 + (-3)R_1}{\simeq} \begin{pmatrix} 0 & 1 & : & 1 & 2 & 2 \\ 1 & -3 & : & 0 & 0 & 0 \end{pmatrix}

Take M = \begin{pmatrix} 0 & 1 \\ 1 & -3 \end{pmatrix}; then MA = \begin{pmatrix} 1 & 2 & 2 \\ 0 & 0 & 0 \end{pmatrix} and P = M^{-1} = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix}.

Consider the augmented matrix \begin{pmatrix} MA \\ I_3 \end{pmatrix} to find the invertible matrix Q. Now

\begin{pmatrix} MA \\ I_3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 2 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\overset{C_2 \to C_2 + (-2)C_1}{\simeq} \begin{pmatrix} 1 & 0 & 2 \\ 0 & 0 & 0 \\ 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\overset{C_3 \to C_3 + (-2)C_1}{\simeq} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & -2 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

Take N = \begin{pmatrix} 1 & -2 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}; then MAN = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} and Q = N^{-1} = \begin{pmatrix} 1 & 2 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

Therefore

A = M^{-1} \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} N^{-1} = P \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} Q = \begin{pmatrix} 3 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.

A rank factorization of A is A = \begin{pmatrix} 3 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 2 \end{pmatrix}. Now

A = \begin{pmatrix} 3 \\ 1 \end{pmatrix} I_1 \begin{pmatrix} 1 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} (2)(2)^{-1} \begin{pmatrix} 1 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \end{pmatrix} \begin{pmatrix} \tfrac{1}{2} & 1 & 1 \end{pmatrix}

is another rank factorization.
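As a quick numerical sanity check, here is a small Python/numpy sketch (my own illustration, not part of the original notes) confirming that both factorizations above reproduce A with factors of rank one:

import numpy as np

A = np.array([[3, 6, 6],
              [1, 2, 2]])

# the two rank factorizations found above
B1, C1 = np.array([[3.], [1.]]), np.array([[1., 2., 2.]])
B2, C2 = np.array([[6.], [2.]]), np.array([[0.5, 1., 1.]])

for B, C in [(B1, C1), (B2, C2)]:
    assert np.allclose(B @ C, A)  # A = BC
    assert np.linalg.matrix_rank(B) == np.linalg.matrix_rank(C) == 1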

Exercise 1.4. Obtain the canonical form of the following matrices and give two different rank factorizations:

\begin{pmatrix} 2 & 1 & -2 \\ 1 & 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 & -1 \\ -1 & 1 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.

Exercise 1.5. Let A be an n × n matrix of rank r. Then there exists an n × n matrix Z of rank n − r such
that A + Z is nonsingular.

Theorem 1.5 (Frobenius Inequality). Let A, B be n × n matrices. Then

rank(AB) ≥ rank A + rank B − n.

Proof. Let the rank of A be r. If A = M^{-1} \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} N^{-1}, then for Z = M^{-1} \begin{pmatrix} 0 & 0 \\ 0 & I_{n-r} \end{pmatrix} N^{-1} we get that A + Z = M^{-1}N^{-1} is an invertible matrix. Further,

rank B = rank((A + Z)B) = rank(AB + ZB) ≤ rank(AB) + rank(ZB)
≤ rank(AB) + rank(Z) = rank(AB) + n − r = rank(AB) + n − rank A.

This proves the Frobenius inequality.
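The three rank inequalities established so far are easy to test numerically. A short sketch (mine, with randomly chosen low-rank matrices, not from the notes):

import numpy as np

rng = np.random.default_rng(0)
n, r, s = 6, 3, 4
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank 3
B = rng.standard_normal((n, s)) @ rng.standard_normal((s, n))  # rank 4
rk = np.linalg.matrix_rank

assert rk(A @ B) <= min(rk(A), rk(B))   # Theorem 1.1
assert rk(A + B) <= rk(A) + rk(B)       # Theorem 1.3
assert rk(A @ B) >= rk(A) + rk(B) - n   # Theorem 1.5 (Frobenius inequality)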

1.2 Determinants
Consider Rn×n the set of all n × n matrices over R.
A mapping D : Rn×n → R is said to be n-linear if for each i, 1 ≤ i ≤ n, D is linear function of i-th row
when other (n − 1) rows are held fixed.


Example 2. A function D : R^{n×n} → R defined by D(A) = a_{11}a_{22} \cdots a_{nn}, the product of the diagonal elements of the matrix A, is n-linear.

Exercise 1.6. A linear combination of n-linear functions is n-linear.

Alternating n-linear Function

A mapping D : R^{n×n} → R is said to be an alternating n-linear function if

(i) D is n-linear;

(ii) if B is a matrix obtained by interchanging two rows of A, then D(A) = −D(B).

When D : R^{n×n} → R satisfies condition (i) above, condition (ii) can be replaced by
(ii)′ D(A) = 0 if two rows are equal. (Exercise)

Theorem 1.6. If a mapping D : R^{n×n} → R is n-linear, then the following are equivalent:

(i) If B is a matrix obtained by interchanging two rows of A, then D(A) = −D(B).

(ii) D(A) = 0 if any two rows are equal.

(iii) If B is a matrix obtained by interchanging two adjacent rows of A, then D(A) = −D(B).

(iv) D(A) = 0 if any two adjacent rows are equal.

Proof. (i) =⇒ (ii): Consider a matrix A such that its i-th row A_i and j-th row A_j are the same for some i ≠ j. Now

D(A) = D(A_1, ..., A_i, ..., A_j, ..., A_n) = D(A_1, ..., A_j, ..., A_i, ..., A_n).

But from (i) we get

D(A_1, ..., A_i, ..., A_j, ..., A_n) = −D(A_1, ..., A_j, ..., A_i, ..., A_n).

Therefore D(A) = 0.

(ii) =⇒ (i): Consider a matrix B whose k-th row B_k is the same as A_k, the k-th row of A, for all k ≠ i, j, and B_i = A_j, B_j = A_i (i < j). Now obtain a matrix C whose k-th row C_k is the same as A_k for all k ≠ i, j, and C_i = C_j = A_i + A_j. From (ii), D(C) = 0 and we get

0 = D(C_1, ..., C_i, ..., C_j, ..., C_n)
  = D(A_1, ..., (A_i + A_j), ..., (A_i + A_j), ..., A_n)
  = D(A_1, ..., A_i, ..., A_i, ..., A_n) + D(A_1, ..., A_i, ..., A_j, ..., A_n)
    + D(A_1, ..., A_j, ..., A_i, ..., A_n) + D(A_1, ..., A_j, ..., A_j, ..., A_n)
  = D(A_1, ..., A_i, ..., A_j, ..., A_n) + D(A_1, ..., A_j, ..., A_i, ..., A_n)
  = D(A) + D(B).

Therefore D(B) = −D(A).


(i) =⇒ (iii) is trivial.


(iii) =⇒ (i): Consider the sequence of rows

A_1, A_2, ..., A_{i−1}, A_i, A_{i+1}, ..., A_{j−1}, A_j, A_{j+1}, ..., A_n, where i < j.

Now interchange the row A_i with A_{i+1} and continue until we get the sequence in the order

A_1, A_2, ..., A_{i−1}, A_{i+1}, ..., A_{j−1}, A_j, A_i, A_{j+1}, ..., A_n.

This requires k = j − i interchanges of adjacent rows. Further, to get A_j into the i-th position we require k − 1 such interchanges of adjacent rows. So, if B is the matrix obtained by interchanging the i-th and j-th rows of A, we get B from A after 2k − 1 interchanges of adjacent rows. So, D(B) = (−1)^{2k−1} D(A) = −D(A).

(ii) =⇒ (iv) is trivial; (iv) =⇒ (ii) follows from (iii) =⇒ (i).

Alternating n-linear Function, n = 2 case

Now consider an alternating 2-linear function D : R^{2×2} → R, where the n of the above discussion is 2. Let e_1 = (1, 0) and e_2 = (0, 1), the first and second rows of I respectively. For any matrix A of size 2 × 2, the first row is given by a_{11}e_1 + a_{12}e_2 and similarly the second row is a_{21}e_1 + a_{22}e_2. Therefore,

D(A) = D(a_{11}e_1 + a_{12}e_2, a_{21}e_1 + a_{22}e_2)
     = D(a_{11}e_1, a_{21}e_1 + a_{22}e_2) + D(a_{12}e_2, a_{21}e_1 + a_{22}e_2)
     = D(a_{11}e_1, a_{21}e_1) + D(a_{11}e_1, a_{22}e_2) + D(a_{12}e_2, a_{21}e_1) + D(a_{12}e_2, a_{22}e_2)
     = a_{11}a_{21}D(e_1, e_1) + a_{11}a_{22}D(e_1, e_2) + a_{12}a_{21}D(e_2, e_1) + a_{12}a_{22}D(e_2, e_2).

Now employing (ii) or (ii)′ conveniently, the above equals

a_{11}a_{22}D(e_1, e_2) + a_{12}a_{21}D(e_2, e_1) = (a_{11}a_{22} − a_{12}a_{21})D(e_1, e_2).    (1.1)

Determinant Function
A determinant function on R^{n×n} is a mapping D : R^{n×n} → R such that

(i) D is n-linear;

(ii) D(A) = 0 if two rows are equal;

(iii) D(I) = 1 for the identity matrix.

If D satisfies property (iii) above, then (1.1) reduces to D(A) = a_{11}a_{22} − a_{12}a_{21}.
Now for any alternating n-linear function D : R^{n×n} → R, we shall consider e_i, the i-th row of an identity matrix. Now

D(A) = D\left( \sum_{i=1}^{n} a_{1i}e_i, \sum_{i=1}^{n} a_{2i}e_i, \ldots, \sum_{i=1}^{n} a_{ni}e_i \right)    (1.2)

Also, consider the set of all permutations of degree n:

P_n = {σ = (σ_1, σ_2, ..., σ_n) : 1 ≤ σ_1, σ_2, ..., σ_n ≤ n and distinct}


Now applying the n-linear property and D(e_{k_1}, e_{k_2}, ..., e_{k_n}) = 0 for any repeated k_i, we get from (1.2)

D(A) = \sum_{σ} a_{1σ_1} a_{2σ_2} \cdots a_{nσ_n} D(e_{σ_1}, e_{σ_2}, ..., e_{σ_n}).

Now by the alternating property of D, we get D(e_{σ_1}, e_{σ_2}, ..., e_{σ_n}) = sgn(σ)D(e_1, e_2, ..., e_n), where sgn(σ) is (−1)^k, k being the number of interchanges of elements in (σ_1, σ_2, ..., σ_n) required to get (1, 2, ..., n). A permutation is said to be an odd permutation if the k above is odd; otherwise it is an even permutation. Hence, we get

D(A) = \sum_{σ} sgn(σ) a_{1σ_1} a_{2σ_2} \cdots a_{nσ_n} D(e_1, e_2, ..., e_n) = \sum_{σ} sgn(σ) a_{1σ_1} a_{2σ_2} \cdots a_{nσ_n} D(I).

Further, if D satisfies (iii) of the determinant function, then D(A) is uniquely determined by the entries of A and

D(A) = \sum_{σ} sgn(σ) a_{1σ_1} a_{2σ_2} \cdots a_{nσ_n}.

So the determinant function on R^{n×n} exists and is unique.
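The formula just derived can be turned directly into code. The following sketch (my illustration, not part of the notes; hopelessly slow for large n since it sums over all n! permutations) computes det(A) from the permutation expansion:

from itertools import permutations
import numpy as np

def sgn(p):
    # sign of the permutation p, computed by counting inversions
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def det_by_permutations(A):
    n = len(A)
    return sum(sgn(p) * np.prod([A[i][p[i]] for i in range(n)])
               for p in permutations(range(n)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(det_by_permutations(A), np.linalg.det(np.array(A)))  # both give -3 (up to rounding)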


If D is a determinant function on R^{n×n}, then D(A) is called the determinant of the matrix A and is denoted by det(A).
Given an n × n matrix A and 1 ≤ i, j ≤ n (n > 1), A(i | j) is the submatrix of A of size (n − 1) × (n − 1) obtained by removing the i-th row and j-th column. If D is an alternating (n − 1)-linear function, then D_{ij}(A) = D(A(i | j)).
If a_{ij} is the (i, j)-th element of the matrix A ∈ R^{n×n}, then the cofactor of a_{ij} is c_{ij} = (−1)^{i+j} det(A(i | j)).
We mention the following result without any proof.

Theorem 1.7. Consider an n × n matrix A and D an alternating (n − 1)-linear function. For each j, 1 ≤ j ≤ n, E_j defined by

E_j(A) = \sum_{i=1}^{n} (−1)^{i+j} a_{ij} D_{ij}(A)

is an alternating n-linear function. Further, if D is a determinant function, then so is E_j.

The following are some basic properties of the determinant which are useful:

(i) The determinant is a linear function of any row when all the other rows are held fixed.

(ii) The determinant changes sign if two rows are interchanged.

(iii) The determinant is unchanged if a constant multiple of one row is added to another row.

(iv) det(AB) = det(A) det(B).

(v) det(A) = det(A^T).

(vi) If A and B are square matrices and D = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}, then

det(D) = det(A) det(B).


Remarks (Laplace expansion): The determinant can be evaluated by expansion along a row or a column.

• The determinant expansion for a real matrix A of size n × n about the j-th column is

det(A) = \sum_{i=1}^{n} (−1)^{i+j} a_{ij} det(A(i | j)).

• The determinant expansion for a real matrix A of size n × n about the i-th row is

det(A) = \sum_{j=1}^{n} (−1)^{i+j} a_{ij} det(A(i | j)).

Exercise 1.7. For a square matrix A ∈ R^{n×n}, the adjoint matrix adj(A) is the matrix where

(adj(A))_{ji} = (−1)^{i+j} det(A(i | j)).

Prove that A · adj(A) = det(A)I. Hence prove Cramer's rule, i.e., that the j-th coordinate x_j of the solution to Ax = b is given by x_j = |B_j| / |A|, where B_j is the matrix obtained by replacing the j-th column of A by b.
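Cramer's rule is short to implement. A sketch (mine, using numpy's determinant routine); system (i) of Exercise 1.10 below is used as a test case:

import numpy as np

def cramer(A, b):
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("Cramer's rule requires det(A) != 0")
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b  # replace the j-th column of A by b
        x[j] = np.linalg.det(Bj) / d
    return x

A = np.array([[3., 1., 2.], [2., -3., -1.], [1., 2., 1.]])
b = np.array([3., -3., 4.])
print(cramer(A, b))  # agrees with np.linalg.solve(A, b)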

Exercise 1.8. For a square matrix A ∈ R^{n×n}, A has an inverse if and only if det(A) is nonzero.

Exercise 1.9. Find the inverse of the following matrices using the adjoint method:

(i) \begin{pmatrix} 1 & 2 & -1 \\ -1 & 1 & 2 \\ 2 & -1 & 1 \end{pmatrix}

(ii) \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}

Exercise 1.10. Solve by Cramer’s rule, if possible:

(i) 3x + y + 2z = 3, 2x − 3y − z = −3, x + 2y + z = 4

(ii) x + y + z = 11, 2x − 6y − z = 0, 3x + 4y + 2z = 0

(iii) 3x − 2y = 7, 3y − 2z = 6, 3z − 2x = −1
Exercise 1.11. Solve for x: \begin{vmatrix} x & 2 & -1 \\ 2 & 5 & x \\ -1 & 2 & x \end{vmatrix} = 0.

Exercise 1.12. If ω is an imaginary cube root of unity, evaluate \begin{vmatrix} 1 & ω & ω^2 \\ ω & ω^2 & 1 \\ ω^2 & 1 & ω \end{vmatrix}.

Exercise 1.13. Prove that \begin{vmatrix} 1+a & b & c \\ a & 1+b & c \\ a & b & 1+c \end{vmatrix} = 1 + a + b + c.


Exercise 1.14. Show that \begin{vmatrix} a+b+2c & a & b \\ c & b+c+2a & b \\ c & a & c+a+2b \end{vmatrix} = 2(a+b+c)^3.

Exercise 1.15. Evaluate \begin{vmatrix} 91 & 92 & 93 \\ 94 & 95 & 96 \\ 97 & 98 & 99 \end{vmatrix}.

Exercise 1.16. Solve for x: \begin{vmatrix} x+3 & 5 & 7 \\ 3 & x+5 & 7 \\ 3 & 5 & x+7 \end{vmatrix} = 0.

Exercise 1.17. If A = \begin{pmatrix} 4 & 10 & 11 \\ 7 & 6 & 2 \\ 1 & 5 & 4 \end{pmatrix} and B = \begin{pmatrix} 1 & -2 & 3 \\ 0 & 2 & 1 \\ -4 & 5 & 2 \end{pmatrix}, find |A·B|.

Exercise 1.18. If A = \begin{pmatrix} 102 & 105 \\ 100 & 100 \end{pmatrix} and B = \begin{pmatrix} 160 & 150 \\ 150 & 150 \end{pmatrix}, find |A·B|.

Exercise 1.19. If A = \begin{pmatrix} 3 & 2 & x \\ 4 & 1 & -1 \\ 0 & 3 & 4 \end{pmatrix} is a singular matrix, find x.

Exercise 1.20. If x = −9 is a root of \begin{vmatrix} x & 3 & 7 \\ 2 & x & 2 \\ 7 & 6 & x \end{vmatrix} = 0, find the other two roots.

Exercise 1.21. Decide whether the determinant of the following matrix A is even or odd, without evaluating it explicitly: A = \begin{pmatrix} 387 & 456 & 589 & 238 \\ 488 & 455 & 677 & 382 \\ 440 & 982 & 654 & 651 \\ 892 & 564 & 786 & 442 \end{pmatrix}.

Exercise 1.22. If A, B are n × n matrices, show that \begin{vmatrix} A+B & A \\ A & A \end{vmatrix} = |A||B|.

Exercise 1.23. Evaluate the determinant of an n × n matrix A where a_{ij} = ij if i ≠ j and a_{ij} = 1 + ij if i = j.

1.3 Eigenvalues, Positive Definite Matrices and Decompositions

Characteristic Polynomial: Given an n × n matrix A over the field R (or C), the characteristic polynomial is the polynomial of degree n in a variable x given by det(xI − A). If P(x) = det(xI − A), the characteristic equation is given by P(x) = 0, i.e., the characteristic equation of A is det(xI − A) = 0.


Note: In the characteristic polynomial, the coefficient of x^n is 1, the coefficient of x^{n−1} is (−1)^1 Trace(A), ..., the coefficient of x^{n−i} is (−1)^i s_{i×i}, where s_{i×i} is the sum of all i × i principal minors of A, ..., and the constant term is (−1)^n det(A).

Definition 1.2. The roots of the characteristic equation of a square matrix A are called the eigenvalues (characteristic values) of A; in other words, λ is said to be an eigenvalue of A if there exists a vector x ≠ 0 such that Ax = λx. Such a vector x is called an eigenvector of A corresponding to the eigenvalue λ.

Note that the set of vectors {x : Ax = λx} is the nullspace of A − λI. This nullspace is called the eigenspace of A corresponding to the eigenvalue λ, and its dimension is called the geometric multiplicity of λ.
The eigenvalues may not all be distinct. The number of times an eigenvalue occurs as a root of the characteristic equation is called the algebraic multiplicity of the eigenvalue.

Cayley-Hamilton theorem: Given an n × n matrix A with characteristic polynomial P(x) = det(xI − A), we have P(A) = 0.
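A small numerical check (my illustration, not from the notes) of the coefficient pattern and of the Cayley-Hamilton theorem, using matrix (i) of Exercise 1.25 below:

import numpy as np

A = np.array([[2., -3.], [5., 1.]])
c = np.poly(A)  # coefficients of det(xI - A), highest degree first
print(c)        # [1, -3, 17]: -Trace(A) = -3 and det(A) = 17

# Cayley-Hamilton: P(A) = A^2 - 3A + 17I = 0
P_of_A = c[0] * A @ A + c[1] * A + c[2] * np.eye(2)
assert np.allclose(P_of_A, 0)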

Exercise 1.24. Two similar matrices have the same characteristic polynomial.

Exercise 1.25. Find the characteristic polynomials of the following matrices:

(i) \begin{pmatrix} 2 & -3 \\ 5 & 1 \end{pmatrix}

(ii) \begin{pmatrix} 1 & 3 & 0 \\ -2 & 2 & -1 \\ 4 & 0 & 2 \end{pmatrix}

Exercise 1.26. Find the characteristic polynomial of a 2 × 2 matrix whose trace and determinant are 7
and 6 respectively.

Exercise 1.27. Show that a matrix A and its transpose A^T have the same characteristic polynomial.

Exercise 1.28. Suppose M = \begin{pmatrix} A_1 & B \\ 0 & A_2 \end{pmatrix}, where A_1 and A_2 are square matrices. Show that the characteristic polynomial of M is the product of the characteristic polynomials of A_1 and A_2.

Exercise 1.29. Find the characteristic polynomials of the following matrices:

(i) \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 2 & 8 & -6 \\ 0 & 0 & 3 & -5 \\ 0 & 0 & 0 & 4 \end{pmatrix}

(ii) \begin{pmatrix} 2 & 5 & 7 & -9 \\ 1 & 4 & -6 & 4 \\ 0 & 0 & 6 & -5 \\ 0 & 0 & 2 & 3 \end{pmatrix}

(iii) \begin{pmatrix} 5 & 8 & -1 & 0 \\ 0 & 3 & 6 & 7 \\ 0 & -3 & 5 & -4 \\ 0 & 0 & 0 & 7 \end{pmatrix}

Exercise 1.30. Find all the eigenvalues and the eigenvectors corresponding to each of the eigenvalues of the following matrices:

(i) \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}

(ii) \begin{pmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix}

(iii) \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}

Exercise 1.31. Show that the eigenvectors corresponding to distinct eigenvalues of a matrix are linearly
independent.

1.4 Diagonalization
Definition 1.3. Two n × n matrices A and B are said to be similar if there exists an invertible matrix P such that P^{-1}AP = B.

The product P^{-1}AP is called a similarity transformation on A.


A fundamental problem is the following: given a square matrix A, can we reduce it to a simplest possible form by means of a similarity transformation?
Note that the diagonal matrices have the simplest form. So the question reduces to the following: is every square matrix similar to a diagonal matrix?
The answer is no. For example, let A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. Note that A^2 = 0. If there exists a nonsingular matrix P such that P^{-1}AP = D, then D^2 = P^{-1}APP^{-1}AP = P^{-1}A^2P = 0 ⇒ D = 0 ⇒ A = 0, which is not true. So A is not similar to a diagonal matrix. Hence not all square matrices are similar to a diagonal matrix.

Definition 1.4. A square matrix A is said to be diagonalizable if A is similar to a diagonal matrix.


1.4.1. Characteristics of a diagonalizable matrix

We examine the following equation:

P^{-1} A_{n×n} P = D = \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix},

which implies

AP = PD ⇒ A[P_1 | P_2 | \cdots | P_n] = [P_1 | P_2 | \cdots | P_n] diag(λ_1, λ_2, ..., λ_n).

Equivalently,

[AP_1 | AP_2 | \cdots | AP_n] = [λ_1 P_1 | λ_2 P_2 | \cdots | λ_n P_n].

Hence AP_j = λ_j P_j, i.e., (λ_j, P_j) is an eigenpair for A; so P must be a matrix whose columns constitute n linearly independent eigenvectors, and D is a diagonal matrix whose diagonal entries are the corresponding eigenvalues. Conversely, if there exists a linearly independent set of n eigenvectors that are used as columns to build a nonsingular matrix P, and if D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues, then P^{-1}AP = D.
A complete set of eigenvectors for A_{n×n} is any set of n linearly independent eigenvectors for A. By the above discussion it follows that A is diagonalizable if and only if it has a complete set of eigenvectors. Hence A is diagonalizable if and only if the algebraic multiplicity equals the geometric multiplicity for each eigenvalue.
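A minimal numpy sketch (mine, not part of the notes) of the criterion above: build P from a complete set of eigenvectors and verify that P^{-1}AP is diagonal, using matrix (i) of Exercise 1.30:

import numpy as np

A = np.array([[1., 4.], [2., 3.]])
lam, P = np.linalg.eig(A)             # columns of P are eigenvectors
assert np.linalg.matrix_rank(P) == 2  # a complete set of eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))                # diagonal, with the eigenvalues 5 and -1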

Exercise 1.32. If possible, diagonalize the following matrix with a similarity transformation:

A = \begin{pmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{pmatrix}

Exercise 1.33. If possible, diagonalize the following matrices with a similarity transformation. Otherwise give reasons why they are not diagonalizable:

1. A = \begin{pmatrix} 0 & 1 \\ -8 & 4 \end{pmatrix}

2. A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}


 
3. A = \begin{pmatrix} 5 & 4 & 2 & 1 \\ 0 & 1 & -1 & -1 \\ -1 & -1 & 3 & 0 \\ 1 & 1 & -1 & 2 \end{pmatrix}

4. A = \begin{pmatrix} 5 & -6 & -6 \\ -1 & 4 & 2 \\ 3 & -6 & -4 \end{pmatrix}

1.5 Positive Definite Matrices


Definition 1.5. A principal submatrix of a square matrix is a submatrix formed by a set of rows and
the corresponding set of columns. A principal minor of a square matrix is the determinant of a principal
submatrix.

Definition 1.6. An n × n matrix A is said to be symmetric if A = A^T. An n × n matrix A is said to be positive definite if it is a symmetric matrix and if x^T Ax > 0 for every nonzero vector x.

Definition 1.7. An n × n matrix A is said to be positive semidefinite if it is a symmetric matrix and if x^T Ax ≥ 0 for all x.

Example 3. An identity matrix is trivially a positive definite matrix.

The matrices \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} and \begin{pmatrix} 1 & -2 \\ -2 & 5 \end{pmatrix} are positive definite. The matrix \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} is neither positive definite nor positive semidefinite. \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} is positive semidefinite.

1.5.1. Properties of PD and PSD matrices

1. If A is positive definite then it is nonsingular.

2. If A, B are positive definite and if α, β ≥ 0, with α + β > 0, then α A + βB is positive definite.

3. If A is positive definite then | A | > 0.

4. If A is positive definite then any principal submatrix of A is positive definite.

5. Let A be a symmetric n × n matrix. Then A is positive definite if and only if the eigenvalues of
A are all positive. Similarly, A is positive semidefinite if and only if the eigenvalues of A are all
nonnegative.

6. Let A be a symmetric n × n matrix. Then A is positive definite if and only if all the principal minors of A are positive. (Similarly, A is positive semidefinite if and only if all the principal minors of A are nonnegative.)


7. Let A be a symmetric n × n matrix. Then A is positive definite if and only if all leading principal
minors of A are positive.

Note: (i) If A is a symmetric matrix, then the eigenvalues of A are all real.
(ii) If v, w are eigenvectors of a symmetric matrix corresponding to distinct eigenvalues α, β, then v and w are mutually orthogonal.
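A quick sketch (my own, not from the notes) of properties 5 and 7 for the first matrix of Example 3: positive definiteness can be tested via eigenvalues or via leading principal minors:

import numpy as np

A = np.array([[2., 1.], [1., 3.]])

eigs = np.linalg.eigvalsh(A)  # eigenvalues of the symmetric matrix A
print(np.all(eigs > 0))       # True (property 5)

minors = [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]
print(all(m > 0 for m in minors))  # True (property 7): 2 > 0 and 5 > 0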

Definition 1.8. A square matrix P is said to be orthogonal if P^{-1} = P^T; that is to say, if PP^T = P^T P = I.

1.6 LU Decomposition
Theorem 1.8. Let

A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}

be a non-singular matrix. Then A can be factorized into the form LU, where

L = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix} and U = \begin{pmatrix} 1 & u_{12} & \cdots & u_{1n} \\ 0 & 1 & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},

if

a_{11} ≠ 0, \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} ≠ 0, \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} ≠ 0, and so on.

Such a factorization, whenever it exists, is unique.

Similarly, the factorization A = LU where

L = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ l_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & 1 \end{pmatrix} and U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{pmatrix}

is also a unique factorization.

Example 4. Consider a positive definite matrix A ∈ R^{3×3}. Write

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = LU = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix},

i.e.,

A = \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ l_{21}u_{11} & l_{21}u_{12} + u_{22} & l_{21}u_{13} + u_{23} \\ l_{31}u_{11} & l_{31}u_{12} + l_{32}u_{22} & l_{31}u_{13} + l_{32}u_{23} + u_{33} \end{pmatrix}.

Equating the corresponding entries, we get

u_{11} = a_{11}, u_{12} = a_{12}, u_{13} = a_{13},
l_{21} = a_{21}/a_{11}, l_{31} = a_{31}/a_{11},
u_{22} = a_{22} − (a_{21}/a_{11})a_{12}, u_{23} = a_{23} − (a_{21}/a_{11})a_{13},
l_{32} = (a_{32} − (a_{31}/a_{11})a_{12}) / u_{22},

from which u_{33} can be computed.

Note: We follow this systematic procedure to evaluate the elements of L and U (where L is unit lower triangular and U is upper triangular).
Step I: Determine the first row of U and the first column of L.
Step II: Determine the second row of U and the second column of L.
Step III: Determine the third row of U.
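The step-by-step procedure above translates into a short routine. The following is a sketch (mine, not part of the notes; no pivoting, so it assumes the leading principal minors are nonzero, as in Theorem 1.8) of the LU factorization with L unit lower triangular:

import numpy as np

def lu_doolittle(A):
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]                    # k-th row of U
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]  # k-th column of L
    return L, U

A = np.array([[2., 3., 1.], [1., 2., 3.], [3., 1., 2.]])  # the matrix of Exercise 1.34
L, U = lu_doolittle(A)
assert np.allclose(L @ U, A)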

Exercise 1.34. Factorize the matrix A = \begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} into the LU form.

Exercise 1.35. Factorize the matrix A = \begin{pmatrix} 4 & 3 & -1 \\ 1 & 1 & 1 \\ 3 & 5 & 3 \end{pmatrix} into the LU form.

Exercise 1.36. Factorize the matrix A = \begin{pmatrix} 5 & -2 & 1 \\ 7 & 1 & -5 \\ 3 & 7 & 4 \end{pmatrix} into the LU form. Hence solve the system Ax = b where b = [4 8 10]^T.

Exercise 1.37. Factorize the matrix A = \begin{pmatrix} 4 & 3 & 2 \\ 2 & 3 & 4 \\ 1 & 2 & 1 \end{pmatrix} into the LU form.


1.7 Cholesky Decomposition

Note that a positive real number can be decomposed into identical factors using the square root operation; for example, 16 = 4 × 4 and 2 = √2 × √2. Similarly, for matrices we have the following.

Theorem 1.9. A positive definite matrix A can be factorized into a product A = LL^T, where L is a lower-triangular matrix with positive diagonal entries:

\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} & \cdots & l_{n1} \\ 0 & l_{22} & \cdots & l_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & l_{nn} \end{pmatrix}.

L is called the Cholesky factor of A, and L is unique.

Example 5. Consider a positive definite matrix A ∈ R^{3×3}. Write

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = LL^T = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{pmatrix},

i.e.,

A = \begin{pmatrix} l_{11}^2 & l_{21}l_{11} & l_{31}l_{11} \\ l_{21}l_{11} & l_{21}^2 + l_{22}^2 & l_{21}l_{31} + l_{22}l_{32} \\ l_{31}l_{11} & l_{21}l_{31} + l_{22}l_{32} & l_{31}^2 + l_{32}^2 + l_{33}^2 \end{pmatrix}.

Equating the first columns of the two matrices, we get

l_{11} = √a_{11}, l_{21} = a_{21}/l_{11}, l_{31} = a_{31}/l_{11}.

Equating the second and third columns, we get

l_{22} = √(a_{22} − l_{21}^2), l_{32} = (a_{32} − l_{31}l_{21})/l_{22}, and l_{33} = √(a_{33} − (l_{31}^2 + l_{32}^2)).

Note: If the matrix is positive semidefinite, instead of positive definite, then it still has a decomposition of the form A = LL^T, where the diagonal entries of L are allowed to be zero. However, this decomposition is not unique.
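A sketch (mine, not from the notes) of the column-by-column recipe of Example 5, checked against a positive definite test matrix of my own choosing:

import numpy as np

def cholesky(A):
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(n):
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])              # diagonal entry
        L[j+1:, j] = (A[j+1:, j] - L[j+1:, :j] @ L[j, :j]) / L[j, j]  # below the diagonal
    return L

A = np.array([[4., 2., 2.], [2., 5., 3.], [2., 3., 6.]])  # positive definite
L = cholesky(A)
assert np.allclose(L @ L.T, A)
assert np.allclose(L, np.linalg.cholesky(A))              # matches numpy's factor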

Exercise 1.38. Solve the system

\begin{pmatrix} 5 & 0 & 1 \\ 0 & -2 & 0 \\ 1 & 0 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ 16 \end{pmatrix}

by Cholesky's method.


1.8 Spectral Decomposition Theorem


Theorem 1.10. Given a symmetric matrix A, there exists an orthogonal matrix U such that

A = U diag(λ_1, λ_2, ..., λ_r, 0, ..., 0) U^T,

where λ_1, λ_2, ..., λ_r are the nonzero eigenvalues of A and the first r columns of U are the unit eigenvectors corresponding to these eigenvalues.

Lemma 1.1. If A is a symmetric matrix of size n × n and v is a unit eigenvector corresponding to the eigenvalue α, then B = αvv^T is a matrix satisfying the following properties:

(i) B is a matrix of rank one such that AB = BA = B^2.

(ii) Rank(A) = Rank(B) + Rank(C), where C = A − B.

Proof. Since v is an eigenvector of A, we have Av = αv and therefore AB = BA = B^2. From the definition of B, Range(B) ⊆ Range(A) and therefore Range(C) ⊆ Range(A). Since A and B are symmetric, so is C = A − B. Therefore BC = CB = 0 implies Range(B) ⊥ Range(C), and hence Range(B) ⊕ Range(C) = Range(A). So, Rank(A) = Rank(B) + Rank(C).

Proof. The proof is by induction on the rank of the given symmetric matrix A. If the rank of A is one with eigenvalue λ_1 and eigenvector v, then A is of the form

λ_1 u_1 u_1^T,

where u_1 = v/||v||. Now extend {u_1} to an orthonormal basis {u_1, u_2, ..., u_n} of R^n and construct a matrix U by taking u_1, u_2, ..., u_n as columns in the same order. Clearly, UU^T = U^T U = I and

A = U \begin{pmatrix} λ_1 & 0 \\ 0 & 0 \end{pmatrix} U^T.

So, the theorem holds when the rank of A is one. Suppose that the theorem holds for all matrices of rank r − 1. If A is a symmetric matrix of rank r, choose an eigenvalue λ_1 of A with corresponding unit eigenvector u_1, as in the earlier lemma. Now construct B = λ_1 u_1 u_1^T satisfying Rank(A) = Rank(B) + Rank(C) and BC = CB = 0. Since C is a symmetric matrix and Rank(C) = r − 1, by induction there exists an orthogonal matrix V such that

C = V diag(λ_2, ..., λ_r, 0, ..., 0) V^T,

where λ_2, ..., λ_r are the eigenvalues of C and the first r − 1 columns u_2, ..., u_r of V are corresponding eigenvectors with unit norm. Since BC = CB = 0, the vectors u_2, ..., u_r are eigenvectors corresponding to the eigenvalues λ_2, ..., λ_r of C as well as of A. For the same reason, the vector u_1 together with the first r − 1 columns {u_2, ..., u_r} of V forms a set of orthonormal vectors. Extend this set of orthonormal vectors to form an orthonormal basis {u_1, u_2, ..., u_r, u_{r+1}, ..., u_n} of R^n. Now construct an orthogonal matrix U by using these orthonormal vectors; from the definition of B and C, it is clear that

A = U diag(λ_1, λ_2, ..., λ_r, 0, ..., 0) U^T.

 
Example 6. Obtain the spectral decomposition of A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}.

Solution. The spectral decomposition of a symmetric matrix is given by

A = QDQ^T.

The first step is to find the eigenvalues of A. To get the eigenvalues we solve the characteristic equation |A − λI| = 0:

|A − λI| = \begin{vmatrix} 2−λ & 1 \\ 1 & 2−λ \end{vmatrix} = 0.

Solving, we get λ = 3, 1.

The eigenvector corresponding to the eigenvalue λ = 3 is given by

\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

Solving, we get (x_1, x_2)^T = (1, 1)^T. The normalized eigenvector corresponding to λ = 3 is (1/√2, 1/√2)^T.

The eigenvector corresponding to the eigenvalue λ = 1 is given by

\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

Solving, we get (x_1, x_2)^T = (−1, 1)^T. The normalized eigenvector corresponding to λ = 1 is (−1/√2, 1/√2)^T.

The spectral decomposition of A is given as

A = \begin{pmatrix} 1/√2 & −1/√2 \\ 1/√2 & 1/√2 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1/√2 & 1/√2 \\ −1/√2 & 1/√2 \end{pmatrix}.
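A numpy cross-check (mine, not from the notes) of Example 6: for a symmetric matrix, eigh returns orthonormal eigenvectors, so A = QDQ^T:

import numpy as np

A = np.array([[2., 1.], [1., 2.]])
lam, Q = np.linalg.eigh(A)              # ascending eigenvalues [1, 3]
assert np.allclose(Q @ np.diag(lam) @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(2))  # Q is orthogonal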

Exercise 1.39. If A is a positive semidefinite matrix, then there exists a unique positive semidefinite matrix B such that B^2 = A. (The matrix B is called the square root of A and is denoted by A^{1/2}.)


 
Exercise 1.40. Obtain the spectral decomposition of \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix}.

Exercise 1.41. Obtain the spectral decomposition of the matrices \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}, \begin{pmatrix} 2 & 5 \\ 5 & 3 \end{pmatrix}.

1.9 Singular Values


Let A be an n × n matrix. The singular values of A are defined to be the eigenvalues of (AA^T)^{1/2}. The singular values are always nonnegative, since AA^T is a positive semidefinite matrix, and we denote them by

σ_1(A) ≥ · · · ≥ σ_n(A)

or simply by

σ_1 ≥ · · · ≥ σ_n.

Suppose A is an m × n matrix with m < n. Augment A by n − m zero rows to get a square matrix, say B. Then the singular values of A are defined to be the singular values of B. Suppose m > n; then a similar definition can be given by augmenting A with zero columns instead of zero rows.

(i) The singular values of A and P AQ are identical for any orthogonal matrices P,Q.

(ii) The rank of a matrix equals the number of nonzero singular values of the matrix.

(iii) If A is symmetric then the singular values of A are the absolute values of its eigenvalues. If A is
positive semidefinite then the singular values are the same as eigenvalues.
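A short numerical illustration (mine, not from the notes) of assertions (ii) and (iii), using the symmetric matrix of Exercise 1.40:

import numpy as np

A = np.array([[1., 3.], [3., 1.]])                     # eigenvalues 4 and -2
sv = np.linalg.svd(A, compute_uv=False)
print(sv)                                              # [4, 2]: the absolute eigenvalues
print(np.linalg.matrix_rank(A) == np.sum(sv > 1e-12))  # True: rank = number of nonzero singular values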

1.10 The Singular Value Decomposition


Theorem 1.11. Given an m × n matrix A, there exist orthogonal matrices U and V such that

A = U diag(σ_1, σ_2, ..., σ_r, 0, ..., 0) V^T,

where σ_1, σ_2, ..., σ_r are the nonzero singular values of A.

Proof. If A is a matrix of size m × n and rank r, it is clear that the matrix AA^T is of size m × m with rank r. Further, the eigenvalues of AA^T are non-negative; let the positive eigenvalues be σ_1^2, σ_2^2, ..., σ_r^2. Consider the orthogonal unit eigenvectors u_1, u_2, ..., u_r of AA^T corresponding to the eigenvalues σ_1^2, σ_2^2, ..., σ_r^2. From the spectral decomposition theorem, we have

AA^T = P diag(σ_1^2, σ_2^2, ..., σ_r^2) P^T,

where P is the matrix obtained by taking the orthogonal unit eigenvectors u_1, u_2, ..., u_r as its columns. Note that P^T P = I and PP^T AA^T = AA^T, which implies PP^T A = A. Now write v_i = (1/σ_i) A^T u_i. Observe that

A^T A v_i = A^T A ((1/σ_i) A^T u_i) = (1/σ_i) A^T (AA^T u_i) = (1/σ_i) A^T (σ_i^2 u_i) = σ_i^2 v_i,

and therefore the v_i are eigenvectors of A^T A. Further, the v_i are orthogonal unit vectors, and therefore {v_1, v_2, ..., v_r} is a set of orthonormal vectors. For Q obtained by taking {v_1, v_2, ..., v_r} as its columns, we get

A^T A = Q D^2 Q^T,

where D^2 = diag(σ_1^2, σ_2^2, ..., σ_r^2). From the definition of the v_i, it is clear that

Q = A^T P D^{-1},

and therefore

P D Q^T = P D D^{-1} P^T A = P P^T A = A.

Now extend the matrices P and Q to orthogonal matrices U and V, respectively, to get

A = U \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} V^T.

Remark: u_i is an eigenvector of AA^T and v_i is an eigenvector of A^T A corresponding to the same eigenvalue σ_i^2. These vectors are called the singular vectors of A.

Exercise 1.42. Find the singular value decomposition of the matrix

A = \begin{pmatrix} 1 & -2 & 2 \\ -1 & 2 & -2 \end{pmatrix}.

1.11 Generalized Inverses and Applications


Consider the linear system
Ax = y (1.3)

where A is an m × n matrix and y ∈ R(A), the range space of A. If the matrix A is nonsingular, then x = A^{-1}y will be the solution to the system (1.3). Suppose the matrix A is singular or m ≠ n; then we need a right candidate G of order n × m such that

AGy = y.    (1.4)

That is, Gy is a solution to the linear system (1.3). Equivalently, G is of order n × m such that

AGA = A.    (1.5)

Hence we can define the generalized inverse as follows.


Definition 1.9. Given an m × n matrix A, an n × m matrix G is said to be a generalized inverse of A if

AGA = A.

G is also known as a g-inverse, {1}-inverse, pseudo inverse, or partial inverse by many authors in the literature. We denote an arbitrary generalized inverse by A^−. The set of all generalized inverses is denoted by {A^−}.
Remark: If A is square and nonsingular, then A^{-1} is the unique g-inverse of A.

Lemma 1.2. If G is a g-inverse of A, then rank(A) = rank(AG) = rank(GA).

Proof. Since AGA = A and rank(AB) ≤ min{rank(A), rank(B)} for any two matrices A, B, we have

rank(A) = rank(AGA) ≤ rank(AG) ≤ rank(A).

Also,

rank(A) = rank(AGA) ≤ rank(GA) ≤ rank(A).

Hence rank(A) = rank(AG) = rank(GA).

Example 7. Let A be a matrix and let G be a g-inverse of A. Show that the class of all g-inverses of A is given by

G + (I − GA)U + V(I − AG),

where U, V are arbitrary.

Solution. Consider any matrix G + (I − GA)U + V(I − AG), where U and V are arbitrary matrices and G is a g-inverse of A. We have AGA = A. Now,

A(G + (I − GA)U + V(I − AG))A = AGA + A(I − GA)UA + AV(I − AG)A
= AGA + AUA − AGAUA + AVA − AVAGA
= A + AUA − AUA + AVA − AVA
= A,

so G + (I − GA)U + V(I − AG) is a g-inverse of A.

Conversely, consider any H ∈ {A^−}. If G is a g-inverse of A, write H = G + (H − G) = G + W, where W = H − G and AWA = 0. Then

W = (I − GA)W + GAW.

Since AWA = 0, we get GAWA = 0 and hence GAW = GAW(I − AG). So

W = (I − GA)W + GAW(I − AG).

Therefore H = G + (I − GA)U + V(I − AG) with U = W = H − G and V = GAW.

∴ every g-inverse H of A belongs to the class {G + (I − GA)U + V(I − AG) : U, V arbitrary}.
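A numerical sketch (mine, not part of the notes) of Example 7: starting from one g-inverse G, the formula produces further g-inverses for any choice of U and V:

import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1., 2.], [2., 4.]])  # singular, rank 1
G = np.linalg.pinv(A)               # one particular g-inverse
U = rng.standard_normal((2, 2))
V = rng.standard_normal((2, 2))

I = np.eye(2)
H = G + (I - G @ A) @ U + V @ (I - A @ G)
assert np.allclose(A @ H @ A, A)    # AHA = A, so H is again a g-inverse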


1.12 Construction of generalized inverse


The following characterizations are easy to verify.

1. If A = BC is a rank factorization, then

G = C_r^− B_l^−

is a g-inverse of A, where C_r^− is a right inverse of C and B_l^− is a left inverse of B.

Proof. Note that B has a left inverse B_l^− and C has a right inverse C_r^−. Now set G = C_r^− B_l^−; then

AGA = BC C_r^− B_l^− BC = B I_r I_r C = BC = A.

Hence, G is a g-inverse of A.


 
2. If A = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q for any non-singular matrices P and Q, then

G = Q^{-1} \begin{pmatrix} I_r & U \\ W & V \end{pmatrix} P^{-1}

is a generalized inverse of A for arbitrary U, V and W.

Proof. Note that \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} is a g-inverse of itself, since

\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.

Further, for any U, V, W of appropriate sizes, \begin{pmatrix} I_r & U \\ W & V \end{pmatrix} is a g-inverse of \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}, since

\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I_r & U \\ W & V \end{pmatrix} \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.

Now set G = Q^{-1} \begin{pmatrix} I_r & U \\ W & V \end{pmatrix} P^{-1}. Since

AGA = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q Q^{-1} \begin{pmatrix} I_r & U \\ W & V \end{pmatrix} P^{-1} P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q = P \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} Q = A,

G is a g-inverse of A.


This also shows that any matrix which is not a square, nonsingular matrix admits infinitely many g-inverses.

3. Let A be of rank r. Since A is a matrix of rank r, there exists an r × r submatrix whose determinant is nonzero. Without loss of generality, let

A = \begin{pmatrix} B & C \\ D & E \end{pmatrix},

where B_{r×r} is the non-singular submatrix of A. Also, since rank A = r, there exists a matrix X such that C = BX, E = DX. Then

G = \begin{pmatrix} B^{-1} & 0 \\ 0 & 0 \end{pmatrix}

is a g-inverse of A. (Easy to verify.)


 
Example 8. Find two different g-inverses of A = \begin{pmatrix} 1 & 0 & -1 & 2 \\ 2 & 0 & -2 & 4 \\ -1 & 1 & 1 & 3 \\ -2 & 2 & 2 & 6 \end{pmatrix}.

Solution. Note that the matrix is of rank 2, since the echelon form of the matrix is

\begin{pmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.

Now note that A_1 = \begin{pmatrix} 2 & 0 \\ -1 & 1 \end{pmatrix} (rows 2, 3 and columns 1, 2 of A) is a 2 × 2 nonsingular submatrix. Fitting the inverse of A_1 into the appropriate place, we get

G_1 = \begin{pmatrix} 0 & 1/2 & 0 & 0 \\ 0 & 1/2 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.

Similarly, A_2 = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} (rows 1, 3 and columns 1, 2 of A), which gives A_2^{-1} = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}. And the g-inverse is

G_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
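A quick numerical check (mine, not from the notes) that the two matrices just constructed indeed satisfy AGA = A:

import numpy as np

A = np.array([[ 1., 0., -1., 2.],
              [ 2., 0., -2., 4.],
              [-1., 1.,  1., 3.],
              [-2., 2.,  2., 6.]])

G1 = np.zeros((4, 4)); G1[0, 1], G1[1, 1], G1[1, 2] = 0.5, 0.5, 1.0
G2 = np.zeros((4, 4)); G2[0, 0], G2[1, 0], G2[1, 2] = 1.0, 1.0, 1.0

for G in (G1, G2):
    assert np.allclose(A @ G @ A, A)  # both are g-inverses of A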


Definition 1.10. A g-inverse G of A is called a reflexive g-inverse if it satisfies

GAG = G.

Note that if G is any g-inverse of A, then GAG is a reflexive g-inverse of A.

Theorem 1.12. Let G be a g-inverse of A. Then

rank A ≤ rank G.

Furthermore, equality holds if and only if G is reflexive.

Proof. From Lemma 1.2, rank(A) = rank(GA) ≤ rank(G). If G is a reflexive g-inverse of A, then A is a g-inverse of G and hence rank(G) ≤ rank(A), so equality holds.
Conversely, suppose rank(A) = rank(G). First observe that C(GA) ⊆ C(G). By Lemma 1.2, rank(G) = rank(GA) and hence C(G) = C(GA). Therefore G = GAX for some X. Now GAG = GAGAX = GAX = G, and hence G is reflexive.

1.13 Minimum Norm, Least Squares g-inverse and Moore-Penrose inverse
Definition 1.11 (minimum norm g-inverse). A g-inverse G of A is said to be a minimum norm g-inverse if, in addition to AGA = A, it satisfies (GA)^T = GA.

Definition 1.12 (least squares g-inverse). A g-inverse G of A is said to be a least squares g-inverse if, in addition to AGA = A, it satisfies (AG)^T = AG.

Definition 1.13 (Moore-Penrose inverse). If G is a reflexive g-inverse of A which is both minimum norm and least squares, then it is called a Moore-Penrose inverse of A. In other words, G is said to be a Moore-Penrose inverse of A if it satisfies

AGA = A, GAG = G, (AG)^T = AG and (GA)^T = GA.

The Moore-Penrose inverse is denoted by A^+.

Lemma 1.3. Let A be a matrix of order m × n. Then the Moore-Penrose inverse of A exists and is unique.

Proof. Let A = BC be a rank factorization. Then

B^+ = (B^T B)^{-1} B^T, C^+ = C^T(CC^T)^{-1},

and we claim that

A^+ = C^+ B^+.

Verification:


(i)

AA^+A = BC C^T(CC^T)^{-1}(B^T B)^{-1}B^T BC = BC = A    (∵ CC^T(CC^T)^{-1} = I, (B^T B)^{-1}B^T B = I)

(ii)

A^+AA^+ = C^T(CC^T)^{-1}(B^T B)^{-1}B^T BC C^T(CC^T)^{-1}(B^T B)^{-1}B^T = C^T(CC^T)^{-1}(B^T B)^{-1}B^T = A^+

(iii)

AA^+ = BC C^T(CC^T)^{-1}(B^T B)^{-1}B^T = B(B^T B)^{-1}B^T,
(AA^+)^T = (B(B^T B)^{-1}B^T)^T = B(B^T B)^{-1}B^T,
∴ (AA^+)^T = AA^+.

(iv)

A^+A = C^T(CC^T)^{-1}(B^T B)^{-1}B^T BC = C^T(CC^T)^{-1}C,
(A^+A)^T = (C^T(CC^T)^{-1}C)^T = C^T(CC^T)^{-1}C,
∴ (A^+A)^T = A^+A.

Since all four conditions of the Moore-Penrose inverse are satisfied, A^+ = C^+B^+ is a Moore-Penrose inverse of A; hence the existence. To prove uniqueness, let G_1 and G_2 be two Moore-Penrose inverses of A. Then

G_1 = G_1AG_1 = G_1G_1^T A^T       (∵ AG_1 = (AG_1)^T)
    = G_1G_1^T A^T G_2^T A^T       (∵ AG_2A = A)
    = G_1G_1^T A^T AG_2            (∵ AG_2 = (AG_2)^T)
    = G_1AG_1AG_2                  (∵ AG_1 = (AG_1)^T)
    = G_1AG_2AG_2                  (∵ AG_1A = A, G_2AG_2 = G_2)
    = G_1AA^T G_2^T G_2            (∵ G_2A = (G_2A)^T)
    = A^T G_1^T A^T G_2^T G_2      (∵ G_1A = (G_1A)^T)
    = A^T G_2^T G_2                (∵ AG_1A = A)
    = G_2AG_2 = G_2                (∵ G_2A = (G_2A)^T, G_2AG_2 = G_2).

Hence, whenever the Moore-Penrose inverse exists, it is unique.
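The existence proof is constructive, so it can be traced in code. A sketch (mine, not part of the notes) computing A^+ = C^+B^+ from the rank factorization of Example 1 and comparing with numpy's pinv:

import numpy as np

B = np.array([[3.], [1.]])             # rank factorization A = BC from Example 1
C = np.array([[1., 2., 2.]])
A = B @ C

B_plus = np.linalg.inv(B.T @ B) @ B.T  # B+ = (B^T B)^{-1} B^T
C_plus = C.T @ np.linalg.inv(C @ C.T)  # C+ = C^T (C C^T)^{-1}
A_plus = C_plus @ B_plus

assert np.allclose(A_plus, np.linalg.pinv(A))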

1.13.1. Construction of Moore-Penrose inverse

(i) If A is a rank 1 matrix, then A^+ = (1/α) A^T is the Moore-Penrose inverse of A, where α = Trace(A^T A). (Proof as exercise.)

Exercise 1.43. Find the Moore-Penrose inverse of A = \begin{pmatrix} 2 & 1 & 3 \\ 4 & 2 & 6 \end{pmatrix}.

Here rank(A) = 1,

A^T A = \begin{pmatrix} 20 & 10 & 30 \\ 10 & 5 & 15 \\ 30 & 15 & 45 \end{pmatrix}, Trace(A^T A) = 70,

and so

A^+ = (1/α) A^T = \begin{pmatrix} 2/70 & 4/70 \\ 1/70 & 2/70 \\ 3/70 & 6/70 \end{pmatrix}, where α = Trace(A^T A).

(ii) Let A be an m × n matrix. The singular value decomposition of A is given by A_{(m×n)} = U_{(m×m)} Σ_{(m×n)} V^T_{(n×n)}, where U and V are orthogonal matrices and Σ is a block diagonal matrix consisting of the singular values of A and zeros.

By using the singular value decomposition, the Moore-Penrose inverse of A is given by A^+_{(n×m)} = V_{(n×n)} Σ^+_{(n×m)} U^T_{(m×m)}, where Σ^+ is a block diagonal matrix consisting of the reciprocals of the nonzero singular values of A and zeros. (Proof left as an exercise.)
 
1 −1
 
Example 9. Find Moore-Penrose inverse of A =  −2 2  using S.V.D.

2 −2
 
1 −1
 
Solution. Given A = 
−2 2.
2 −2
P T
Formula for SVD is A m×n = Um×m m× n Vn× n
where U, V are orthogonal matrices.
In this case formula for calculating Moore-Penrose inverse is A + = V + U T .
P

First step is to find the singular values of A. To get singular values we have to find the eigenvalues
of A T A by solving | A T A − λ I | = 0.
Now  
  1
 −1
1 −2 2 9 −9
AT A = 
 
 −2 2  =  .
−1 2 −2 −9 9
 
2 −2
Therefore ¯ ¯
9 λ 9
¯ ¯
− −
| A T A − λ I | = 0 ⇒ ¯¯
¯ ¯
¯ = 0;
¯ −9 9 − λ¯
¯

Lecture Notes 26
1.13 Minimum Norm, Least Squares g-inverse and Moore-Penrose inverse

∴ λ = 18, 0.
p
Thus the singular values are σ = 18, 0.
Finding eigenvector corresponding to λ = 18, by solving the matrix equation

(A T A − λ I)X = 0
    
−9 −9 x1 0
i. e.,    =  .
−9 −9 x2 0
 
 
−1 x1
Solving we get   =   = v.
x2 1
 
−1
p
Normalizing v we get v1 =  2 .
p1
2
We require unit vector v
2 which
 is orthogonal to v1 .
y1
To find v2 we write v2 =   such that y2 + y2 = 1,
1 2
y2

v1T v2 = 0
1
i.e., p (− y1 + y2 ) = 0
2
   
y1 p1
∴ v2 =   =  12  .
y2 p
2
 
−1
p p1
∴V = 2 2 .
p1 p1
2 2
Next is to find U3×3 .
  
−1
1
−1  −1  3
p
Let u 1 = σ11 Av1 = p1    2  =  2 .
   
18 
− 2 2  p1 3
2 −2
2 −2 3
We now extend the set { u 1 } to form an orthonormal basis for R3 . We need two orthonormal vectors
which are orthogonal to u 1 , satisfying u 1T x = 0 i.e., − x1 + 2x2 − 2x3 = 0.
       
x1 2x2 − 2x3 2 −2
       
∴  x2  =  x2  = x2 1 + x3  0 
      
.
x3 x3 0 1

A basis for the solution set is given by


   
2 −2
   
w1 = 
1 , w2 =  0  .
  

0 1

Applying Gram-Schmidt process to {w1 , w2 } we obtain

Lecture Notes 27
1.13 Minimum Norm, Least Squares g-inverse and Moore-Penrose inverse

   
p2 p−2
 5  45 
u 2 =  p1  , u 3 =  p4 .
   
 5  45 
0 p5
 45
−1 p2 p−2
 3 5 45 
2 p1 p4 .
∴U =
 
 3 5 45 
−2
3 0 p5
45 

−1
p
p2 p−2

3 18
0  −1 
 5 45 
 p p1
2 1 4  2 2  is the required singular value decomposition of A.

∴ A= 00
 p p 
 
3  p1
 5 45   p1
−2
3 0 p5 0 0 2 2
45
Hence the Moore-Penrose inverse of A is 
   −1 2 −2  
−1 1 1 3 3 3  1 −1 1
p p p 0 0 
A+ = V + U T =  2 2   18  p25 p1 0  =  18 9 9 
P  
p1 p1 0 0 0  5  −1 1 −1
18 9 9
2 2 p−2 p4 p5
45 45 45
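A numerical cross-check (mine, not from the notes) of Example 9 via numpy's SVD, building Σ^+ by reciprocating the single nonzero singular value:

import numpy as np

A = np.array([[1., -1.], [-2., 2.], [2., -2.]])
U, s, Vt = np.linalg.svd(A)  # s = [sqrt(18), 0]

S_plus = np.zeros((2, 3))
S_plus[0, 0] = 1.0 / s[0]    # reciprocal of the nonzero singular value

A_plus = Vt.T @ S_plus @ U.T
print(A_plus)                # [[1/18, -1/9, 1/9], [-1/18, 1/9, -1/9]]
assert np.allclose(A_plus, np.linalg.pinv(A))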
We now obtain some characterizations of the Moore-Penrose inverse in terms of volume. For example, we will show that if A is an n × n matrix then A^+ is a g-inverse of A with minimum volume. First we prove some preliminary results. It is easily seen that A^+ can be determined from the singular value decomposition of A. A more general result is proved next.

Theorem 1.13. Let A be an n × n matrix of rank r and let

A = P \begin{pmatrix} Σ & 0 \\ 0 & 0 \end{pmatrix} Q

be the singular value decomposition of A, where P, Q are orthogonal and Σ = diag(σ_1, ..., σ_r). Then the class of g-inverses of A is given by

G = Q^T \begin{pmatrix} Σ^{-1} & X \\ Y & Z \end{pmatrix} P^T    (1.6)

where X, Y, Z are arbitrary matrices of appropriate dimensions. The class of reflexive g-inverses G of A is given by (1.6) with the additional condition that Z = YΣX. The class of least squares g-inverses G of A is given by (1.6) with X = 0. The class of minimum norm g-inverses G of A is given by (1.6) with Y = 0. Finally, the Moore-Penrose inverse of A is given by (1.6) with X, Y, Z all being zero.

