
Introduction to Linear Algebra

Lectures on YouTube:
https://www.youtube.com/@mathtalent

Seongjai Kim

Department of Mathematics and Statistics

Mississippi State University

Mississippi State, MS 39762 USA

Email: skim@math.msstate.edu

Updated: November 20, 2022


Seongjai Kim, Professor of Mathematics, Department of Mathematics and Statistics, Mississippi
State University, Mississippi State, MS 39762 USA. Email: skim@math.msstate.edu.
Prologue
This lecture note is organized following the sections in [1]. More examples and applications will be added over time.

Seongjai Kim
November 20, 2022

Contents

Title
Prologue

1. Linear Equations
   1.1. Systems of Linear Equations
        1.1.1. Solving a linear system
        1.1.2. Existence and uniqueness questions
   1.2. Row Reduction and Echelon Forms
        1.2.1. Pivot positions
        1.2.2. The row reduction algorithm
        1.2.3. Solutions of linear systems
   1.3. Vector Equations
        1.3.1. Vectors in R^2
        1.3.2. Vectors in R^n
        1.3.3. Linear Combinations
   1.4. Programming with Matlab/Octave
   1.5. Matrix Equation Ax = b
   1.6. Solutions Sets of Linear Systems
        1.6.1. Solutions of homogeneous systems
        1.6.2. Solutions of nonhomogeneous systems
   1.8. Linear Independence
   1.9. Linear Transformations
   1.10. The Matrix of a Linear Transformation
        1.10.1. The Standard Matrix
        1.10.2. Existence and Uniqueness Questions

2. Matrix Algebra
   2.1. Matrix Operations
   2.2. The Inverse of a Matrix
   2.3. Characterizations of Invertible Matrices
   2.5. Matrix Factorizations
   2.8. Subspaces of R^n
   2.9. Dimension and Rank
        2.9.1. Coordinate systems
        2.9.2. Dimension of a subspace

3. Determinants
   3.1. Introduction to Determinants
   3.2. Properties of Determinants

4. Vector Spaces
   4.1. Vector Spaces and Subspaces
   4.2. Null Spaces, Column Spaces, and Linear Transformations

5. Eigenvalues and Eigenvectors
   5.1. Eigenvectors and Eigenvalues
   5.2. The Characteristic Equation
   5.3. Diagonalization
   5.5. Complex Eigenvalues
   5.7. Applications to Differential Equations

6. Orthogonality and Least Squares
   6.1. Inner Product, Length, and Orthogonality
   6.2. Orthogonal Sets
        6.2.1. An orthogonal projection
        6.2.2. Orthonormal sets
   6.3. Orthogonal Projections
        Project #1: Orthogonal projection
   6.4. The Gram-Schmidt Process
   6.5. Least-Squares Problems
   6.6. Applications to Linear Models
        6.6.1. Regression line
        6.6.2. Least-squares fitting of other curves

Bibliography
Index
Chapter 1. Linear Equations

Why Linear Algebra, and What It Is For


Real-world systems can be approximated as systems of linear equations
$$
A\mathbf{x} = \mathbf{b}, \qquad
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \in \mathbb{R}^{m \times n},
\tag{1.1}
$$
where b is the source (input) and x is the solution (output).
Its solution can formally be written as
$$
\mathbf{x} = A^{-1}\mathbf{b}. \tag{1.2}
$$

What you would learn, from Linear Algebra:


• How to solve systems of linear equations
• Matrix algebra & Properties of algebraic systems
• Determinants & Vector spaces
• Eigenvalues and eigenvectors
• Applications: Least squares problems

In this first chapter, we study the basics, including


• Systems of linear equations
• Three elementary row operations
• Linear transformations


1.1. Systems of Linear Equations

Definition 1.1. A linear equation in the variables x1 , x2 , · · · , xn is an


equation that can be written in the form

a1 x1 + a2 x2 + · · · + an xn = b, (1.3)

where b and the coefficients a1 , a2 , · · · , an are real or complex numbers.

A system of linear equations (or a linear system) is a collection of one


or more linear equations involving the same variables – say, x1 , x2 , · · · , xn .
Example 1.2.

(a) 4x1 − x2 = 3
    2x1 + 3x2 = 5

(b) 2x + 3y − 4z = 2
     x − 2y + z = 1
    3x + y − 2z = −1

• Solution: A solution of the system is a list (s1 , s2 , · · · , sn ) of numbers


that makes each equation a true statement, when the values s1 , s2 , · · · , sn
are substituted for x1 , x2 , · · · , xn , respectively.
• Solution Set: The set of all possible solutions is called the solution set
of the linear system.
• Equivalent System: Two linear systems are called equivalent if they
have the same solution set.

• For example, system (a) above is equivalent to

      2x1 − 4x2 = −2     (R1 ← R1 − R2)
      2x1 + 3x2 = 5

Remark 1.3. Linear systems may have
• no solution (an inconsistent system), or
• exactly one (unique) solution or infinitely many solutions (a consistent system).

Example 1.4. Consider the case of two equations in two unknowns.

(a) −x + y = 1      (b) x + y = 1      (c) −2x + y = 2
    −x + y = 3          x − y = 2          −4x + 2y = 4

1.1.1. Solving a linear system


Consider a simple system of 2 linear equations:
$$
\begin{aligned}
-2x_1 + 3x_2 &= -1 \\
x_1 + 2x_2 &= 4
\end{aligned} \tag{1.4}
$$
Such a system of linear equations can be treated much more conveniently and efficiently in matrix form. In matrix form, (1.4) reads
$$
\underbrace{\begin{bmatrix} -2 & 3 \\ 1 & 2 \end{bmatrix}}_{\text{coefficient matrix}}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -1 \\ 4 \end{bmatrix}. \tag{1.5}
$$
The essential information of the system can be recorded compactly in a rectangular array called an augmented matrix:
$$
\begin{bmatrix} -2 & 3 & -1 \\ 1 & 2 & 4 \end{bmatrix}
\quad\text{or}\quad
\left[\begin{array}{cc|c} -2 & 3 & -1 \\ 1 & 2 & 4 \end{array}\right]. \tag{1.6}
$$

Solving (1.4): the system of linear equations and its augmented matrix, side by side.

      −2x1 + 3x2 = −1        [ −2  3 | −1 ]
        x1 + 2x2 =  4        [  1  2 |  4 ]

R1 ↔ R2 (interchange):

        x1 + 2x2 =  4        [  1  2 |  4 ]
      −2x1 + 3x2 = −1        [ −2  3 | −1 ]

R2 ← R2 + 2·R1 (replacement):

        x1 + 2x2 =  4        [  1  2 |  4 ]
             7x2 =  7        [  0  7 |  7 ]

R2 ← R2/7 (scaling):

        x1 + 2x2 =  4        [  1  2 |  4 ]
              x2 =  1        [  0  1 |  1 ]

R1 ← R1 − 2·R2 (replacement):

        x1       =  2        [  1  0 |  2 ]
              x2 =  1        [  0  1 |  1 ]

At the last step, the left-hand side displays the solution (x1, x2) = (2, 1), while the coefficient part of the augmented matrix has become the identity I.
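The same sequence of EROs can be reproduced numerically. The following is a minimal Octave/Matlab sketch (not part of the original worked example), operating directly on the augmented matrix [A | b]:

    Ab = [-2 3 -1;
           1 2  4];
    Ab([1 2],:) = Ab([2 1],:);      % interchange:  R1 <-> R2
    Ab(2,:) = Ab(2,:) + 2*Ab(1,:);  % replacement:  R2 <- R2 + 2*R1
    Ab(2,:) = Ab(2,:)/7;            % scaling:      R2 <- R2/7
    Ab(1,:) = Ab(1,:) - 2*Ab(2,:);  % replacement:  R1 <- R1 - 2*R2
    disp(Ab)                        % [1 0 2; 0 1 1]: the solution is x = [2; 1]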

Tools 1.5. Three Elementary Row Operations (ERO):

• Replacement: Replace one row by the sum of itself and a multiple of another row:
  Ri ← Ri + k · Rj, j ≠ i.
• Interchange: Interchange two rows:
  Ri ↔ Rj, j ≠ i.
• Scaling: Multiply all entries in a row by a nonzero constant:
  Ri ← k · Ri, k ≠ 0.

Definition 1.6. Two matrices are row equivalent if there is a sequence of EROs that transforms one matrix to the other.

1.1.2. Existence and uniqueness questions

Two Fundamental Questions about a Linear System:


1. (Existence): Is the system consistent; that is, does at least one
solution exist?
2. (Uniqueness): If a solution exists, is it the only one; that is, is the
solution unique?

Remark 1.7. Let's consider a system Ax = b, with b being an input. Then, writing Ã^{-1} : b ↦ x for the inverse correspondence,
• Existence ⇔ Ã^{-1} : b ↦ x is a map
• Uniqueness ⇔ Ã^{-1} : b ↦ x is a function

That is, if the system can produce an output as a map or function, then we call it consistent.

Most real-world systems are consistent (existence), and they produce the same output for the same input (uniqueness).

Example 1.8. Solve the following system of linear equations, using the 3
EROs. Then, determine if the system is consistent.

x2 − 4x3 = 8
2x1 − 3x2 + 2x3 = 1
4x1 − 8x2 + 12x3 = 1

Solution.

Ans: The system is inconsistent; geometrically, there is no point at which the three planes meet.
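A quick numerical cross-check of Example 1.8, a sketch using Matlab/Octave's built-in rref:

    Ab = [0  1 -4  8;
          2 -3  2  1;
          4 -8 12  1];
    rref(Ab)    % last row is [0 0 0 1], i.e., 0 = 1: the system is inconsistent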
Example 1.9. Determine the values of h such that the given system is a
consistent linear system
x + h y = −5
2x − 8y = 6

Solution.

Ans: h ≠ −4

True-or-False 1.10.
a. Every elementary row operation is reversible.
b. Elementary row operations on an augmented matrix never change the
solution of the associated linear system.
c. Two linear systems are equivalent if they have the same solution set.
d. Two matrices are row equivalent if they have the same number of rows.
Solution.

Ans: T,T,T,F

Report your homework with your work shown for each problem. You can scan your solutions and answers with a scanner or your phone, then collect them in a single file, either doc/docx or pdf.

Exercises 1.1
1. Consider the augmented matrix of a linear system. State in words the next two elementary row operations that should be performed in the process of solving the system.
$$
\begin{bmatrix}
1 & 6 & -4 & 0 & 1 \\
0 & 1 & 7 & 0 & -4 \\
0 & 0 & -1 & 2 & 3 \\
0 & 0 & 2 & 1 & -6
\end{bmatrix}
$$
2. The augmented matrix of a linear system has been reduced by row operations to the form shown. Continue the appropriate row operations and describe the solution set.
$$
\begin{bmatrix}
1 & 1 & 0 & 0 & 4 \\
0 & -1 & 3 & 0 & -7 \\
0 & 0 & 1 & -3 & 1 \\
0 & 0 & 0 & 2 & 4
\end{bmatrix}
$$
3. Solve the systems or determine if they are inconsistent.

(a) −x2 − 4x3 = 5
    x1 + 3x2 + 5x3 = −2
    3x1 + 7x2 + 7x3 = 6

(b) x1 + 3x3 = 2
    x2 − 3x4 = 3
    −2x2 + 3x3 + 2x4 = 1
    3x1 + 7x4 = −5

4. Determine the value of h such that the matrix is the augmented matrix of a consistent linear system.
$$
\begin{bmatrix} 2 & -3 & h \\ -4 & 6 & -5 \end{bmatrix}
$$
Ans: h = 5/2

5. An important concern in the study of heat transfer is to determine the steady-state temperature distribution of a thin plate when the temperature around the boundary is known. Assume the plate shown in the figure represents a cross section of a metal beam, with negligible heat flow in the direction perpendicular to the plate. Let T1, T2, T3, T4 denote the temperatures at the four interior nodes of the mesh in the figure (Figure 1.1). The temperature at a node is approximately equal to the average of the four nearest nodes. For example, T1 = (10 + 20 + T2 + T4)/4, or 4T1 = 10 + 20 + T2 + T4. Write a system of four equations whose solution gives estimates for the temperatures T1, T2, T3, T4, and solve it.

1.2. Row Reduction and Echelon Forms


Some terminologies
• A nonzero row in a matrix is a row with at least one nonzero entry.
• A leading entry of a row is the leftmost nonzero entry in a nonzero row.
• A leading 1 is a leading entry whose value is 1.

Definition 1.11. Echelon form: A rectangular matrix is in echelon form if it has the following properties.
1. All nonzero rows are above any rows of all zeros.
2. Each leading entry in a row is in a column to the right of the leading entry of the row above it.
3. All entries below a leading entry in a column are zeros.

Reduced echelon form (row reduced echelon form): If a matrix in echelon form also satisfies properties 4 and 5, then it is in reduced echelon form.
4. The leading entry in each nonzero row is 1.
5. Each leading 1 is the only nonzero entry in its column.

Example 1.12. Check if the following matrix is in echelon form. If not, put it in echelon form.
$$
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
1 & 2 & 0 & 0 & 1 \\
0 & 0 & 3 & 0 & 4
\end{bmatrix}
$$

Example 1.13. Verify whether the following matrices are in echelon form or in row reduced echelon form.
$$
1)\ \begin{bmatrix} 1 & 2 & 0 & 0 & 1 \\ 0 & 0 & 3 & 0 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
2)\ \begin{bmatrix} 2 & 0 & 0 & 5 \\ 0 & 1 & 0 & 6 \\ 0 & 0 & 0 & 9 \end{bmatrix}
\qquad
3)\ \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
$$
$$
4)\ \begin{bmatrix} 1 & 1 & 2 & 2 & 3 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}
\qquad
5)\ \begin{bmatrix} 1 & 0 & 0 & 5 \\ 0 & 1 & 0 & 6 \\ 0 & 0 & 0 & 9 \end{bmatrix}
\qquad
6)\ \begin{bmatrix} 0 & 1 & 0 & 5 \\ 0 & 0 & 0 & 6 \\ 0 & 0 & 1 & 2 \end{bmatrix}
$$

Uniqueness of the Reduced Echelon Form


Theorem 1.14. Each matrix is row equivalent to one and only one
reduced echelon form.

1.2.1. Pivot positions


Terminologies
1) Pivot position in a matrix A is a location in A that corresponds to a
leading 1 in the reduced echelon form of A.
2) Pivot column is a column of A that contains a pivot position.

Example 1.15. The matrix A is given with its reduced echelon form. Find the pivot positions and pivot columns of A.
$$
A = \begin{bmatrix} 1 & 1 & 0 & 2 & 0 \\ 1 & 1 & 1 & 3 & 0 \\ 1 & 1 & 0 & 2 & 4 \end{bmatrix}
\xrightarrow{\text{R.E.F.}}
\begin{bmatrix} 1 & 1 & 0 & 2 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
$$
Solution.

Note: Once a matrix is in an echelon form, further row operations do


not change the positions of leading entries. Thus, the leading entries
become the leading 1’s in the reduced echelon form.

Terminologies
3) Basic variables: In the system Ax = b, the variables that correspond to pivot columns (in [A : b]) are basic variables.
4) Free variables: In the system Ax = b, the variables that correspond to non-pivotal columns are free variables.

Example 1.16. For the system of linear equations, identify its basic variables and free variables.
$$
\begin{aligned}
-x_1 - 2x_2 &= -3 \\
2x_3 &= 4 \\
3x_3 &= 6
\end{aligned}
$$
Solution. Hint: You may start with its augmented matrix and apply row operations.

Ans: basic variables: {x1, x3}; free variable: {x2}

1.2.2. The row reduction algorithm


Steps to reduce a matrix to reduced echelon form
1. Start with the leftmost nonzero column. This is a pivot column; the pivot position is at the top.
2. Choose a nonzero entry in the pivot column as a pivot. If necessary, interchange rows to move a nonzero entry into the pivot position.
3. Use row replacement operations to create zeros in all positions below the pivot.
4. Ignore the row and column containing the pivot and ignore all rows above it. Apply Steps 1–3 to the remaining submatrix. Repeat until there are no more rows to modify.
5. Start with the rightmost pivot and work upward and to the left to create zeros above each pivot. If a pivot is not 1, make it 1 by a scaling operation.

Example 1.17. Row reduce the matrix into reduced echelon form.
$$
A = \begin{bmatrix} 0 & -3 & -6 & 4 & 9 \\ -2 & -3 & 0 & 3 & -1 \\ 1 & 4 & 5 & -9 & -7 \end{bmatrix}
$$

The combination of Steps 1–4 is called the forward phase of the row reduction, while Step 5 is called the backward phase.
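In practice, Steps 1–5 are exactly what Matlab/Octave's built-in rref performs. A minimal sketch applying it to the matrix A of Example 1.17; the commented result is what the algorithm above produces:

    A = [ 0 -3 -6  4  9;
         -2 -3  0  3 -1;
          1  4  5 -9 -7];
    R = rref(A)   % R = [1 0 -3 0 5; 0 1 2 0 -3; 0 0 0 1 0]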

1.2.3. Solutions of linear systems

1) For example, suppose the R.E.F. of an augmented matrix is given as
$$
\left[\begin{array}{ccc|c} 1 & 0 & -5 & 1 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right] \tag{1.7}
$$

2) Then the associated system of equations reads
$$
\begin{aligned}
x_1 - 5x_3 &= 1 \\
x_2 + x_3 &= 4 \\
0 &= 0
\end{aligned} \tag{1.8}
$$
where {x1, x2} are basic variables (∵ pivots) and x3 is a free variable.

3) Rewrite the above as
$$
\begin{aligned}
x_1 &= 1 + 5x_3 \\
x_2 &= 4 - x_3 \\
x_3 &\ \text{is free}
\end{aligned} \tag{1.9}
$$

4) Thus the solution of (1.8) can be written as
$$
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix}
+ x_3 \begin{bmatrix} 5 \\ -1 \\ 1 \end{bmatrix}, \tag{1.10}
$$
in which you are free to choose any value for x3. (That is why it is called a "free variable".)

• The description in (1.10) is called a parametric description of the solution set; the free variable x3 acts as a parameter.
• The solution in (1.10) represents all the solutions of the system (1.7). It is called the general solution of the system.

Example 1.18. Find the general solution of the system whose augmented matrix is
$$
[A\,|\,\mathbf{b}] = \begin{bmatrix}
1 & 0 & -5 & 0 & -8 & 3 \\
0 & 1 & 4 & -1 & 0 & 6 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$
Solution. Hint: You should first row reduce it to reduced echelon form.

Example 1.19. Find the general solution of the system whose augmented matrix is
$$
[A\,|\,\mathbf{b}] = \begin{bmatrix}
0 & 0 & 0 & 1 & 7 \\
0 & 1 & 3 & 0 & -1 \\
0 & 0 & 0 & 1 & 1 \\
1 & 0 & -9 & 0 & 4
\end{bmatrix}
$$
Solution.

Properties
1) Any nonzero matrix may be row reduced (i.e., transformed by elementary row operations) into more than one matrix in echelon form, using different sequences of row operations.
2) Each matrix is row equivalent to one and only one reduced echelon matrix (Theorem 1.14).
3) A linear system is consistent if and only if the rightmost column of the augmented matrix is not a pivot column, i.e., if and only if an echelon form of the augmented matrix has no row of the form [0 · · · 0 b] with b nonzero.
4) If a linear system is consistent, then the solution set contains either
   (a) a unique solution, when there are no free variables, or
   (b) infinitely many solutions, when there is at least one free variable.

Example 1.20. Choose h and k such that the system has
(a) no solution, (b) a unique solution, (c) many solutions.
$$
\begin{aligned}
x_1 - 3x_2 &= 1 \\
2x_1 + hx_2 &= k
\end{aligned}
$$
Solution.

Ans: (a) h = −6, k ≠ 2

True-or-False 1.21.
a. The row reduction algorithm applies only to augmented matrices for a linear system.
b. If one row in an echelon form of an augmented matrix is [0 0 0 0 2 0], then the associated linear system is inconsistent.
c. The pivot positions in a matrix depend on whether or not row interchanges are used in the row reduction process.
d. Reducing a matrix to echelon form is called the forward phase of the row reduction process.
Solution.

Ans: F,F,F,T

Exercises 1.2
1. Row reduce the matrices to reduced echelon form. Circle the pivot positions in the final matrix and in the original matrix, and list the pivot columns.
$$
(a)\ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 4 & 5 & 6 & 7 \\ 6 & 7 & 8 & 9 \end{bmatrix}
\qquad
(b)\ \begin{bmatrix} 1 & 3 & 5 & 7 \\ 3 & 5 & 7 & 9 \\ 5 & 7 & 9 & 1 \end{bmatrix}
$$

2. Find the general solutions of the systems (in parametric vector form) whose augmented matrices are given as
$$
(a)\ \begin{bmatrix} 1 & -7 & 0 & 6 & 5 \\ 0 & 0 & 1 & -2 & -3 \\ -1 & 7 & -4 & 2 & 7 \end{bmatrix}
\qquad
(b)\ \begin{bmatrix} 1 & 2 & -5 & -6 & 0 & -5 \\ 0 & 1 & -6 & -3 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
$$
Ans: (a) x = [5, 0, −3, 0]^T + x2 [7, 1, 0, 0]^T + x4 [−6, 0, 2, 1]^T;
Ans: (b) x = [−9, 2, 0, 0, 0]^T + x3 [−7, 6, 1, 0, 0]^T + x4 [0, 3, 0, 1, 0]^T ¹
3. In the following, we use the notation for matrices in echelon form: leading entries are marked with $\blacksquare$, and any values (including zero) with ∗. Suppose each matrix represents the augmented matrix for a system of linear equations. In each case, determine if the system is consistent. If the system is consistent, determine if the solution is unique.
$$
(a)\ \begin{bmatrix} \blacksquare & * & * & * \\ 0 & \blacksquare & * & * \\ 0 & 0 & \blacksquare & * \end{bmatrix}
\quad
(b)\ \begin{bmatrix} 0 & \blacksquare & * & * & * \\ 0 & 0 & \blacksquare & * & * \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\quad
(c)\ \begin{bmatrix} \blacksquare & * & * & * & * \\ 0 & 0 & \blacksquare & * & * \\ 0 & 0 & 0 & \blacksquare & * \end{bmatrix}
$$

4. Choose h and k such that the system has (a) no solution, (b) a unique solution, and (c)
many solutions.
x1 + hx2 = 2
4x1 + 8x2 = k
5. Suppose the coefficient matrix of a system of linear equations has a pivot position in
every row. Explain why the system is consistent.

 
¹ The superscript T denotes the transpose; for example, [a, b, c]^T is the column vector with entries a, b, c.

1.3. Vector Equations

Definition 1.22. A matrix with only one column is called a column vector, or simply a vector.

For example,
$$
\begin{bmatrix} 3 \\ -2 \end{bmatrix} \in \mathbb{R}^2, \qquad
\begin{bmatrix} 1 \\ -5 \\ 4 \end{bmatrix} \in \mathbb{R}^3.
$$

1.3.1. Vectors in R²

We can identify a point (a, b) with the column vector [a, b]^T, its position vector.

Figure 1.2: Vectors in R² as points.
Figure 1.3: Vectors in R² with arrows.

Note: Vectors are mathematical objects having direction and length.


We may try to (1) compare them, (2) add or subtract them, (3) multiply
them by a scalar, (4) measure their length, and (5) apply other operations
to get information related to angles.
1) Equality of vectors: Two vectors u = [u1, u2]^T and v = [v1, v2]^T are equal if and only if the corresponding entries are equal, i.e., ui = vi, i = 1, 2.

2) Addition: Let u = [u1, u2]^T and v = [v1, v2]^T. Then
$$
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}.
$$

3) Scalar multiple: Let c ∈ R, a scalar. Then
$$
c\mathbf{v} = c\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} cv_1 \\ cv_2 \end{bmatrix}.
$$

Theorem 1.23. (Parallelogram Rule for Addition) If u and v are represented as points in the plane, then u + v corresponds to the fourth vertex of the parallelogram whose other vertices are 0, u, and v.

Figure 1.4: The parallelogram rule.
" # " #
2 1
Example 1.24. Let u = and v = .
1 −3

(a) Find u + 2v and 3u − 2v.


(b) Display them on a graph.
Solution.
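A quick numerical check of (a), a sketch using the Octave/Matlab vector operations of Section 1.4 (the graph in (b) can be made with plot):

    u = [2; 1];  v = [1; -3];
    u + 2*v      % = [ 4; -5]
    3*u - 2*v    % = [ 4;  9]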

1.3.2. Vectors in Rⁿ
Note: The above vector operations, including the parallelogram rule, are also applicable to vectors in R³ and, in general, Rⁿ.

Algebraic Properties of Rⁿ
For u, v, w ∈ Rⁿ and scalars c and d:
1) u + v = v + u
2) (u + v) + w = u + (v + w)
3) u + 0 = 0 + u = u
4) u + (−u) = (−u) + u = 0, where −u = (−1)u
5) c(u + v) = cu + cv
6) (c + d)u = cu + du
7) c(du) = (cd)u
8) 1u = u

1.3.3. Linear Combinations


Definition 1.25. Given vectors v1 , v2 , · · · , vp in Rn and scalars
c1 , c2 , · · · , cp , the vector y ∈ Rn defined by

y = c1 v1 + c2 v2 + · · · + cp vp (1.11)

is called the linear combination of v1 , v2 , · · · , vp with weights


c1 , c2 , · · · , cp .

In R², given v1 and v2: (figure illustrating several linear combinations of v1 and v2 omitted)

Definition 1.26. A vector equation is of the form


x1 a1 + x2 a2 + · · · + xp ap = b, (1.12)

where a1 , a2 , · · · , ap , and b are vectors and x1 , x2 , · · · , xp are weights.


       
Example 1.27. Let a1 = [0, 4, −1]^T, a2 = [1, 6, 3]^T, a3 = [5, −1, 8]^T, and b = [0, 2, 3]^T. Determine whether or not b can be generated as a linear combination of a1, a2, and a3.

Solution. Hint: We should determine whether weights x1, x2, x3 exist such that x1 a1 + x2 a2 + x3 a3 = b, which reads [a1 a2 a3] [x1, x2, x3]^T = b.
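A numerical sketch of the hint (assuming Octave/Matlab): row reduce the augmented matrix [a1 a2 a3 : b] and look for an inconsistent row.

    A = [ 0  1  5;
          4  6 -1;
         -1  3  8];       % columns a1, a2, a3
    b = [0; 2; 3];
    rref([A b])           % no row [0 0 0 | nonzero], so weights x1, x2, x3 exist
    x = A\b               % the weights themselves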

Note:
1) The vector equation x1 a1 + x2 a2 + · · · + xp ap = b has the same solution set as the linear system whose augmented matrix is [a1 a2 · · · ap : b].
2) b can be generated as a linear combination of a1, a2, · · · , ap if and only if the linear system with augmented matrix [a1 a2 · · · ap : b] is consistent.

Definition 1.28. Let v1, v2, · · · , vp be p vectors in Rⁿ. Then Span{v1, v2, · · · , vp} is the collection of all linear combinations of v1, v2, · · · , vp, i.e., of all vectors that can be written in the form c1 v1 + c2 v2 + · · · + cp vp, where c1, c2, · · · , cp are weights. That is,

Span{v1, v2, · · · , vp} = {y | y = c1 v1 + c2 v2 + · · · + cp vp}    (1.13)

Figure 1.5: A line: Span{u} in R³.
Figure 1.6: A plane: Span{u, v} in R³.


Example 1.29. Determine if b = [2, −1, 6]^T is a linear combination of the columns of the matrix
$$
\begin{bmatrix} 1 & 0 & 5 \\ -2 & 1 & -6 \\ 0 & 2 & 8 \end{bmatrix}.
$$
(That is, determine if b is in the span of the columns of the matrix.)
Solution.
Example 1.30. Find h so that [2, 1, h]^T lies in the plane spanned by a1 = [1, 2, 3]^T and a2 = [2, −1, 1]^T.
Solution.

$$
\begin{aligned}
\mathbf{b} \in \text{Span}\{\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_p\}
&\Leftrightarrow\ x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_p\mathbf{v}_p = \mathbf{b}\ \text{has a solution} \\
&\Leftrightarrow\ \text{the system with augmented matrix}\ [\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_p : \mathbf{b}]\ \text{has a solution}
\end{aligned} \tag{1.14}
$$

True-or-False 1.31.
a. Another notation for the vector [1, −2]^T is [1 −2].
b. The set Span{u, v} is always visualized as a plane through the origin.
c. When u and v are nonzero vectors, Span{u, v} contains the line through u and the origin.
Solution.

Ans: F,F,T

Exercises 1.3
1. Write a system of equations that is equivalent to the given vector equation; write a vector equation that is equivalent to the given system of equations.
$$
(a)\ x_1\begin{bmatrix} 6 \\ -1 \\ 5 \end{bmatrix} + x_2\begin{bmatrix} -3 \\ 4 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -7 \\ -5 \end{bmatrix}
\qquad
(b)\ \begin{aligned} x_2 + 5x_3 &= 0 \\ 4x_1 + 6x_2 - x_3 &= 0 \\ -x_1 + 3x_2 - 8x_3 &= 0 \end{aligned}
$$

2. Determine if b is a linear combination of a1, a2, and a3.
   a1 = [1, −2, 0]^T, a2 = [0, 1, 2]^T, a3 = [5, −6, 8]^T, b = [2, −1, 6]^T.
Ans: Yes
3. Determine if b is a linear combination of the vectors formed from the columns of the matrix A.
$$
(a)\ A = \begin{bmatrix} 1 & -4 & 2 \\ 0 & 3 & 5 \\ -2 & 8 & -4 \end{bmatrix},\ \mathbf{b} = \begin{bmatrix} 3 \\ -7 \\ -3 \end{bmatrix}
\qquad
(b)\ A = \begin{bmatrix} 1 & -2 & -6 \\ 0 & 3 & 7 \\ 1 & -2 & 5 \end{bmatrix},\ \mathbf{b} = \begin{bmatrix} 11 \\ -5 \\ 9 \end{bmatrix}
$$
     
4. Let a1 = [1, 4, −2]^T, a2 = [−2, −3, 7]^T, and b = [4, 1, h]^T. For what value(s) of h is b in the plane spanned by a1 and a2? Ans: h = −17
5. Construct a 3 × 3 matrix A, with nonzero entries, and a vector b in R3 such that b is
not in the set spanned by the columns of A. Hint : Construct a 3 × 4 augmented matrix in
echelon form that corresponds to an inconsistent system.
6. A mining company has two mines. One day's operation at mine #1 produces ore that contains 20 metric tons of copper and 550 kilograms of silver, while one day's operation at mine #2 produces ore that contains 30 metric tons of copper and 500 kilograms of silver. Let v1 = [20, 550]^T and v2 = [30, 500]^T. Then v1 and v2 represent the "output per day" of mine #1 and mine #2, respectively.
(a) What physical interpretation can be given to the vector 5v1?
(b) Suppose the company operates mine #1 for x1 days and mine #2 for x2 days. Write a vector equation whose solution gives the number of days each mine should operate in order to produce 150 tons of copper and 2825 kilograms of silver.
(c) M² Solve the equation in (b).

² The mark M indicates that you have to solve the problem using one of Matlab, Maple, and Mathematica. You may also try "Octave" as a free alternative to Matlab. Attach a hard copy of your code.

1.4. Programming with Matlab/Octave

Note: In computer programming, important things are


• How to deal with objects (variables, arrays, functions)
• How to deal with repetition effectively
• How to make the program reusable

Vectors and matrices

The most basic thing you will need Vectors and Matrices
to do is to enter vectors and matri- 1 >> u = [1; 2; 3] % column vector
2 u=
ces. You would enter commands to 3 1
Matlab or Octave at a prompt that 4 2
looks like >>. 5 3
6 >> v = [4; 5; 6];
7 >> u + 2*v
• Rows are separated by semi- 8 ans =
colons (;) or Enter . 9 9
10 12
• Entries in a row are separated 11 15
by commas (,) or space Space . 12 >> w = [5, 6, 7, 8] % row vector
13 w=
For example, 14 5 6 7 8
15 >> A = [2 1; 1 2]; % matrix
16 >> B = [-2, 5
17 1, 2]
18 B=
19 -2 5
20 1 2
21 >> C = A*B % matrix multiplication
22 C=
23 -3 12
24 0 9

You can save the commands in a file to run and get the same results.

tutorial1_vectors.m
    u = [1; 2; 3]
    v = [4; 5; 6];
    u + 2*v
    w = [5, 6, 7, 8]
    A = [2 1; 1 2];
    B = [-2, 5
          1, 2]
    C = A*B

Solving equations

Let A = [1 −4 2; 0 3 5; 2 8 −4] and b = [3, −7, −3]^T. Then Ax = b can be numerically solved by implementing a code as follows.

tutorial2_solve.m
    A = [1 -4 2; 0 3 5; 2 8 -4];
    b = [3; -7; -3];
    x = A\b

Result
    x =
       0.75000
      -0.97115
      -0.81731

Graphics with Matlab


In Matlab, the most popular graphic command is plot, which creates a 2D
line plot of the data in Y versus the corresponding values in X. A general
syntax for the command is
plot(X1,Y1,LineSpec1,...,Xn,Yn,LineSpecn)

tutorial3_plot.m
     1  close all
     2
     3  %% a curve
     4  X1 = linspace(0,2*pi,10);  % n=10
     5  Y1 = cos(X1);
     6
     7  %% another curve
     8  X2 = linspace(0,2*pi,20);  Y2 = sin(X2);
     9
    10  %% plot together
    11  plot(X1,Y1,'-or',X2,Y2,'--b','linewidth',3);
    12  legend({'y=cos(x)','y=sin(x)'},'location','best',...
    13         'FontSize',16,'textcolor','blue')
    14  print -dpng 'fig_cos_sin.png'

Figure 1.7: fig_cos_sin.png: plot of y = cos x and y = sin x.

The above tutorial3_plot.m is a typical M-file for producing figures with plot.

• Line 1: It closes all figures currently open.
• Lines 3, 4, 7, and 10 (comments): When the percent sign (%) appears, the rest of the line is ignored by Matlab.
• Lines 4 and 8: The command linspace(x1,x2,n) returns a row vector of n evenly spaced points between x1 and x2.
• Line 11: Its result is the figure shown in Figure 1.7.
• Line 14: It saves the figure in png format, named fig_cos_sin.png.

Repetition: iteration loops

Note: In scientific computation, one of the most frequently occurring events is repetition. Each repetition of the process is also called an iteration. It is the act of repeating a process, to generate a (possibly unbounded) sequence of outcomes, with the aim of approaching a desired goal, target, or result. Thus,
• an iteration must start with an initialization (starting point), and
• it performs a step-by-step marching in which the results of one iteration are used as the starting point for the next iteration.

In the context of mathematics or computer science, iteration (along with the related technique of recursion) is a very basic building block in programming. Matlab provides various types of loops: while loops, for loops, and nested loops.

while loop
The syntax of a while loop in Matlab is as follows.

    while <expression>
        <statements>
    end

An expression is true when the result is nonempty and contains all nonzero elements (logical or real numeric); otherwise the expression is false. Here is an example of the while loop.

    n1=11; n2=20;
    sum=n1;
    while n1<n2
        n1 = n1+1; sum = sum+n1;
    end
    fprintf('while loop: sum=%d\n',sum);

When the code above is executed, the result will be:

    while loop: sum=155
1.4. Programming with Matlab/Octave 33

for loop
A for loop is a repetition control structure that allows you to efficiently write a loop that needs to execute a specific number of times. The syntax of a for loop in Matlab is as follows.

    for index = values
        <program statements>
    end

Here is an example of the for loop.

    n1=11; n2=20;
    sum=0;
    for i=n1:n2
        sum = sum+i;
    end
    fprintf('for loop: sum=%d\n',sum);

When the code above is executed, the result will be:

    for loop: sum=155

Functions: Enhancing reusability

Program scripts can be saved for convenient reuse later. For example, the script for the summation of integers from n1 to n2 can be saved in the form of a function.

mysum.m
    function s = mysum(n1,n2)
    % sum of integers from n1 to n2

    s=0;
    for i=n1:n2
        s = s+i;
    end

Now you can call it with, e.g., mysum(11,20). Then the result reads ans = 155.

1.5. Matrix Equation Ax = b


A fundamental idea in linear algebra is to view a linear combination of
vectors as a product of a matrix and a vector.
Definition 1.32. Let A = [a1 a2 · · · an ] be an m × n matrix and x ∈ Rn ,
then the product of A and x denoted by A x is the linear combination
of columns of A using the corresponding entries of x as weights, i.e.,
 
x1
 
 x2 
 ...  = x1 a1 + x2 a2 + · · · + xn an .
A x = [a1 a2 · · · an ]   (1.15)
 
xn

A matrix equation is of the form A x = b, where b is a column vector of


size m × 1.

Example 1.33.

Matrix equation:
$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} $$
Vector equation:
$$ x_1\begin{bmatrix} 1 \\ 3 \end{bmatrix} + x_2\begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} $$
Linear system:
$$ \begin{aligned} x_1 + 2x_2 &= 1 \\ 3x_1 + 4x_2 &= -1 \end{aligned} $$

Theorem 1.34. Let A = [a1 a2 · · · an ] be an m × n matrix, x ∈ Rn , and


b ∈ Rm . Then the matrix equation
Ax = b (1.16)

has the same solution set as the vector equation


x1 a1 + x2 a2 + · · · + xn an = b, (1.17)

which, in turn, has the same solution set as the system with augmented
matrix
[a1 a2 · · · an : b]. (1.18)

Theorem 1.35. (Existence of solutions): Let A be an m × n matrix.


Then the following statements are logically equivalent (all true, or all
false).
a. For each b in Rm , the equation Ax = b has a solution.
b. Each b in Rm is a linear combination of columns of A.
c. The columns of A span Rm .
d. A has a pivot position in every row.
(Note that A is the coefficient matrix.)
     
Example 1.36. Let v1 = [0, 0, 3]^T, v2 = [0, −3, 9]^T, and v3 = [4, −2, −6]^T. Does {v1, v2, v3} span R³? Why or why not?
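A sketch (Octave/Matlab) of the pivot-counting criterion of Theorem 1.35, applied to Example 1.36:

    A = [0  0  4;
         0 -3 -2;
         3  9 -6];   % columns v1, v2, v3
    rref(A)          % the 3x3 identity: a pivot in every row, so {v1,v2,v3} spans R^3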
Example 1.37. Do the vectors [1, 0, 1, −2]^T, [3, 1, 2, −8]^T, and [−2, 1, −3, 2]^T span R⁴?
Solution.

True-or-False 1.38.
a. The equation Ax = b is referred to as a vector equation.
b. Each entry in Ax is the result of a dot product.
c. If A ∈ Rm×n and if Ax = b is inconsistent for some b ∈ Rm , then A can-
not have a pivot position in every row.
d. If the augmented matrix [A b] has a pivot position in every row, then
the equation Ax = b is inconsistent.
Solution.

Ans: F,T,T,F

Exercises 1.5
1. Write the system first as a vector equation and then as a matrix equation.
      3x1 + x2 − 5x3 = 9
           x2 + 4x3 = 0

2. Let u = [0, 0, 4]^T and
$$
A = \begin{bmatrix} 3 & -5 \\ -2 & 6 \\ 1 & 1 \end{bmatrix}.
$$
Is u in the plane in R³ spanned by the columns of A? (See the figure.) Why or why not?

Figure 1.8

3. The problems refer to the matrices A and B below. Make appropriate calculations that justify your answers and mention an appropriate theorem.
$$
A = \begin{bmatrix} 1 & 3 & 0 & 3 \\ -1 & -1 & -1 & 1 \\ 0 & -4 & 2 & -8 \\ 2 & 0 & 3 & -1 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 3 & -2 & 2 \\ 0 & 1 & 1 & -5 \\ 1 & 2 & -3 & 7 \\ -2 & -8 & 2 & -1 \end{bmatrix}
$$
(a) How many rows of A contain a pivot position? Does the equation Ax = b have a solution for each b in R⁴?
(b) Can each vector in R⁴ be written as a linear combination of the columns of the matrix A above? Do the columns of A span R⁴?
(c) Can each vector in R⁴ be written as a linear combination of the columns of the matrix B above? Do the columns of B span R⁴?
Ans: (a) 3; (b) Theorem 1.35 (d) is not true
     
4. Let v1 = [0, 0, −2]^T, v2 = [0, −3, 8]^T, v3 = [4, −1, −5]^T. Does {v1, v2, v3} span R³? Why or why not?
Ans: Yes; the matrix [v1 v2 v3] has a pivot position in each row.
5. Could a set of three vectors in R4 span all of R4 ? Explain. What about n vectors in Rm
when n < m?
6. Suppose A is a 4 × 3 matrix and b ∈ R4 with the property that Ax = b has a unique
solution. What can you say about the reduced echelon form of A? Justify your answer.
Hint : How many pivot columns does A have?

1.6. Solutions Sets of Linear Systems

Linear Systems:
1. Homogeneous linear systems:
      Ax = 0;  A ∈ R^{m×n}, x ∈ Rⁿ, 0 ∈ R^m.   (1.19)
   (a) It always has at least one solution: x = 0 (the trivial solution).
   (b) Any nonzero solution is called a nontrivial solution.
2. Nonhomogeneous linear systems:
      Ax = b;  A ∈ R^{m×n}, x ∈ Rⁿ, b ∈ R^m, b ≠ 0.   (1.20)

Note: Ax = 0 has a nontrivial solution if and only if the system has at least one free variable.

1.6.1. Solutions of homogeneous systems


Example 1.39. Determine if the following homogeneous system has a
nontrivial solution. Then describe the solution set.
x1 − 2x2 + 3x3 = 0
−2x1 − 3x2 − 4x3 = 0
2x1 − 4x2 + 9x3 = 0

Definition 1.40. If the solutions of Ax = 0 can be written in the form
      x = c1 u1 + c2 u2 + · · · + cr ur,   (1.21)
where c1, c2, · · · , cr are scalars and u1, u2, · · · , ur are vectors of the same size as x, then they are said to be in parametric vector form.

Note: When the solutions of Ax = 0 are in the form of (1.21), we may say
      {The solution set of Ax = 0} = Span{u1, u2, · · · , ur}.   (1.22)

Example 1.41. Solve the system and write the solution in parametric
vector form.
x1 + 2x2 − 3x3 = 0
2x1 + x2 − 3x3 = 0
−x1 + x2 = 0
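A numerical sketch for Example 1.41 (assuming Matlab's null(A,'r'), which returns a "rational" basis of the solution set; in Octave, rref(A) exposes the same free-variable structure):

    A = [ 1  2 -3;
          2  1 -3;
         -1  1  0];
    null(A, 'r')   % = [1; 1; 1]: the solutions are x = x3*[1; 1; 1]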

Example 1.42. Describe all solutions of Ax = 0 in parametric vector form, where A is row equivalent to the matrix
$$
\begin{bmatrix}
1 & -2 & 3 & -6 & 5 & 0 \\
0 & 0 & 0 & 1 & 4 & -6 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$
Hint: You should first row reduce it to reduced echelon form.
Solution.

A single equation can be treated as a simple linear system.

Example 1.43. Solve the equation of 3 variables and write the solution in parametric vector form.
      x1 − 2x2 + 3x3 = 0

Solution. Hint: x1 is the only basic variable. Thus your solution will be of the form x = x2 v1 + x3 v2, which is a parametric vector equation of a plane.

1.6.2. Solutions of nonhomogeneous systems


Example 1.44. Describe all solutions of Ax = b, where
$$
A = \begin{bmatrix} 3 & 5 & -4 \\ -3 & -2 & 4 \\ 6 & 1 & -8 \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} 7 \\ -1 \\ -4 \end{bmatrix}.
$$
Solution.

Ans: x = [−1, 2, 0]^T + x3 [4/3, 0, 1]^T

The solution of Example 1.44 is of the form
      x = p + x3 v (= p + t v),   (1.23)
where t is a general parameter. Note that (1.23) is the equation of the line through p parallel to v.

In the previous example, the solution of Ax = b is x = p + t v.

Question: What is "t v"?
Solution. First of all,
      Ax = A(p + t v) = Ap + A(t v) = b.   (1.24)
Note that x = p + t v is a solution of Ax = b even when t = 0. Thus,
      A(p + t v)|_{t=0} = Ap = b.   (1.25)
It follows from (1.24) and (1.25) that
      A(t v) = 0,   (1.26)
which implies that "t v" is a solution of the homogeneous equation Ax = 0.

Theorem 1.45. Suppose the equation Ax = b is consistent for some given b, and let p be a solution. Then the solution set of Ax = b is the set of all vectors of the form {w = p + u_h}, where u_h is any solution of the homogeneous equation Ax = 0.

Self-study 1.46. Let Ax = b have a solution. Prove that the solution is


unique if and only if Ax = 0 has only the trivial solution.

True-or-False 1.47.
a. The solution set of Ax = b is the set of all vectors of the form {w =
p + uh }, where uh is the solution of the homogeneous equation Ax = 0.
(Compare with Theorem 1.45, p.43.)
b. The equation Ax = b is homogeneous if the zero vector is a solution.
c. The solution set of Ax = b is obtained by translating the solution of
Ax = 0.
Solution.

Ans: F,T,F

Exercises 1.6
1. Determine if the system has a nontrivial solution. Try to use as few row operations as possible.

(a) 2x1 − 5x2 + 8x3 = 0
    2x1 − 7x2 + x3 = 0
    4x1 − 12x2 + 9x3 = 0

(b) 3x1 + 5x2 − 7x3 = 0
    6x1 + 7x2 + x3 = 0

Hint: x3 is a free variable for both (a) and (b).


2. Describe all solutions of Ax = 0 in parametric vector form, where A is row equivalent to the given matrix.
$$
(a)\ \begin{bmatrix} 1 & 3 & -3 & 7 \\ 0 & 1 & -4 & 5 \end{bmatrix}
\qquad
(b)\ \begin{bmatrix} 1 & -4 & -2 & 0 & 3 & -5 \\ 0 & 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & -4 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
$$
Hint: (b) x2, x4, and x6 are free variables.


3. Describe and compare the solution sets of x1 − 3x2 + 5x3 = 0 and x1 − 3x2 + 5x3 = 4. Hint :
You must solve two problems each of which has a single equation, which in turn represents a
plane. For both, only x1 is the basic variable.
4. Suppose Ax = b has a solution. Explain why the solution is unique precisely when
Ax = 0 has only the trivial solution.
5. (1) Does the equation Ax = 0 have a nontrivial solution and (2) does the equation
Ax = b have at least one solution for every possible b?

(a) A is a 3 × 3 matrix with three pivot positions.


(b) A is a 3 × 3 matrix with two pivot positions.

1.8. Linear Independence

Definition 1.48. A set of vectors {v1 , v2 , · · · , vp } in Rn is said to be


linearly independent, if the vector equation

x1 v1 + x2 v2 + · · · + xp vp = 0 (1.27)

has only the trivial solution (i.e., x1 = x2 = · · · = xp = 0). The set of


vectors {v1 , v2 , · · · , vp } is said to be linearly dependent, if there exist
weights c1 , c2 , · · · , cp , not all zero, such that

c1 v1 + c2 v2 + · · · + cp vp = 0. (1.28)

Example 1.49. Determine if the set {v1, v2} is linearly independent.
1) v1 = [3, 0]^T, v2 = [0, 5]^T
2) v1 = [3, 0]^T, v2 = [1, 0]^T

Remark 1.50. Let A = [v1 , v2 , · · · , vp ]. The matrix equation Ax = 0 is


equivalent to x1 v1 + x2 v2 + · · · + xp vp = 0.
1. Columns of A are linearly independent if and only if Ax = 0 has
only the trivial solution. ( ⇔ Ax = 0 has no free variable ⇔ Every
column in A is a pivot column.)
2. Columns of A are linearly dependent if and only if Ax = 0 has
nontrivial solution. ( ⇔ Ax = 0 has at least one free variable ⇔ A
has at least one non-pivot column.)

Example 1.51. Determine if the vectors are linearly independent.
1) [1, 0, 0]^T, [0, 1, 0]^T
2) [1, 0, 0]^T, [0, 1, 0]^T, [0, 0, 1]^T, [2, 1, 2]^T

Solution.

Example 1.52. Determine if the vectors are linearly independent.
      [0, 2, 3]^T, [0, 0, −8]^T, [−1, 3, 1]^T
Solution.

Example 1.53. Determine if the vectors are linearly independent.
      [1, −2, 0]^T, [−2, 4, 1]^T, [3, −6, −1]^T, [2, 2, 3]^T
Solution.

Note: In the above example, the vectors are in Rⁿ with n = 3, and the number of vectors is p = 4. Like this, if p > n then the vectors must be linearly dependent.

Theorem 1.54. The set of vectors {v1 , v2 , · · · , vp } ⊂ Rn is linearly


dependent, if p > n.

Proof. Let A = [v1 v2 · · · vp] ∈ R^{n×p}. Then Ax = 0 has n equations with p unknowns. When p > n, there are more variables than equations; this implies there is at least one free variable, which in turn means that there is a nontrivial solution.

Example 1.55. Find the value of h so that the vectors are linearly independent.
      [3, −6, 1]^T, [−6, 4, −3]^T, [9, h, 3]^T

Example 1.56. (Revision of Example 1.55): Find the value of h so that c is in Span{a, b}.
      a = [3, −6, 1]^T, b = [−6, 4, −3]^T, c = [9, h, 3]^T
Solution.

Example 1.57. Determine by inspection if the vectors are linearly dependent.
1) [1, 0]^T, [0, 1]^T, [1, 2]^T
2) [1, 0, −1]^T, [0, 2, 3]^T, [0, 0, 0]^T
3) [1, 2, 3]^T, [3, 6, 9]^T

Note: Let S = {v1, v2, · · · , vp}. If S contains the zero vector, then it is always linearly dependent. Also, some vector in S is a linear combination of the other vectors in S if and only if S is linearly dependent.

True-or-False 1.58.
a. The columns of any 3 × 4 matrix are linearly dependent.
b. If u and v are linearly independent, and if {u, v, w} is linearly depen-
dent, then w ∈ Span{u, v}.
c. Two vectors are linearly dependent if and only if they lie on a line
through the origin.
d. The columns of a matrix A are linearly independent, if the equation
Ax = 0 has the trivial solution.
Solution.

Ans: T,T,T,F

Exercises 1.8
1. Determine if the columns of the matrix form a linearly independent set. Justify each answer.
$$
(a)\ \begin{bmatrix} -4 & -3 & 0 \\ 0 & -1 & 4 \\ 1 & 0 & 3 \\ 5 & 4 & 6 \end{bmatrix}
\qquad
(b)\ \begin{bmatrix} 1 & -3 & 3 & -2 \\ -3 & 7 & -1 & 2 \\ 0 & 1 & -4 & 3 \end{bmatrix}
$$

2. Find the value(s) of h for which the vectors are linearly dependent. Justify each answer.
(a) [1, −1, 4]^T, [3, −5, 7]^T, [−1, 5, h]^T
(b) [1, 5, −3]^T, [−2, −9, 6]^T, [3, h, −9]^T

3. (a) For what values of h is v3 in Span{v1, v2}, and (b) for what values of h is {v1, v2, v3} linearly dependent? Justify each answer.
      v1 = [1, −3, 2]^T, v2 = [−3, 9, −6]^T, v3 = [5, −7, h]^T.
Ans: (a) No h; (b) All h
4. Describe the possible echelon forms of the matrix. Use the notation of Exercise 3 in
Section 1.2, p. 19.

(a) A is a 3 × 3 matrix with linearly independent columns.


(b) A is a 2 × 2 matrix with linearly dependent columns.
(c) A is a 4 × 2 matrix, A = [a1 , a2 ] and a2 is not a multiple of a1 .

1.9. Linear Transformations


Note: Let A ∈ Rm×n and x ∈ Rn . Then Ax a new vector in Rm , the result
of action of A on x. For example, let
 
" # 1
4 −3 1
A= , x = 1 ∈ R3 .
 
2 0 5
1

Then "
 
# 1 " #
4 −3 1   2
Ax = 1 = .
2 0 5 7
1 |{z}
a new vector in R2
That is, A transforms vectors to another space.

Definition 1.59. Transformation (function or mapping): A transformation T from Rⁿ to R^m is a rule that assigns to each vector x ∈ Rⁿ a vector T(x) ∈ R^m. In this case, we write
      T : Rⁿ → R^m,  x ↦ T(x),   (1.29)
where Rⁿ is the domain of T, R^m is the codomain of T, and T(x) denotes the image of x under T. The set of all images is called the range of T:
      Range(T) = {T(x) | x ∈ Rⁿ}

Figure 1.9: Transformation T : Rⁿ → R^m.


Definition 1.60. A transformation associated with matrix multiplication is a matrix transformation. That is, for each x ∈ Rⁿ, T(x) = Ax, where A is an m × n matrix. We may denote the matrix transformation as
      T : Rⁿ → R^m,  x ↦ Ax.   (1.30)
Here the range of T is the set of all linear combinations of the columns of A:
      Range(T) = Span{columns of A}.

" #
1 3
Example 1.61. Let A = . The transformation T : R2 → R2 defined
0 1
by T (x) = Ax is called a shear transformation. Determine the image of a
square [0, 2] × [0, 2] under T .
Solution. Hint : Matrix transformations is an affine mapping, which means that they
map line segments into line segments (and corners to corners).
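A small sketch (Octave/Matlab): since T maps corners to corners, the image of the square is determined by the images of its four corners.

    A = [1 3; 0 1];
    corners = [0 2 2 0;
               0 0 2 2];   % each column is a corner of [0,2]x[0,2]
    A * corners            % columns [0;0], [2;0], [8;2], [6;2]: a parallelogram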
     
1 −3 " # 3 3
2
Example 1.62. Let A =  3 5, u = , b =  2, c = 2, and
     
−1
−1 7 −5 5
2 3
define a transformation T : R → R by T (x) = Ax.
a. Find T (u), the image of u under the transformation T .
b. Find an x ∈ R2 whose image under T is b.
c. Is there more than one x whose image under T is b?
d. Determine if c is in the range of the transformation T .
Solution.

" #
1.5
Ans: b. x = ; c. no; d. no
−0.5

Definition 1.63. (Linear Transformations): A transformation T is linear if:
(i) T(u + v) = T(u) + T(v), for all u, v in the domain of T;
(ii) T(cu) = cT(u), for all scalars c and all u in the domain of T.

Remark 1.64. If T is a linear transformation, then

T (0) = 0 (1.31)

and
T (c u + d v) = c T (u) + d T (v) , (1.32)
for all vectors u, v in the domain of T and all scalars c, d.
We can easily prove that if T satisfies (1.32), then T is linear.

Example 1.65. Prove that a matrix transformation T (x) = Ax is linear.


Proof. It is easy to see that

T (c u + d v) = A(c u + d v) = c Au + d Av = c T (u) + d T (v),

which completes the proof, satisfying (1.32).

Remark 1.66. Repeated application of (1.32) produces a useful generalization:
      T(c1 v1 + c2 v2 + · · · + cp vp) = c1 T(v1) + c2 T(v2) + · · · + cp T(vp).   (1.33)

In engineering physics, (1.33) is referred to as a superposition principle.
1.9. Linear Transformations 57

Example 1.67. Let θ be the angle measured counterclockwise from the positive x-axis. Then the rotation can be defined as
$$
R[\theta] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{1.34}
$$
1) Describe R[π/2] explicitly.
2) What are the images of [1, 0]^T and [0, 1]^T under R[π/2]?
3) Is R[θ] a linear transformation?
Solution.

For example, a yaw is a counterclockwise rotation of ψ about the z-axis. The rotation matrix reads
$$
R_z[\psi] = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
Figure 1.10: Euler angles (roll, pitch, yaw) in aerodynamics.
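A quick numerical sketch (Octave/Matlab) of item 2), up to floating-point roundoff:

    theta = pi/2;
    R = [cos(theta) -sin(theta);
         sin(theta)  cos(theta)];
    R*[1; 0]    % ~ [0; 1]:  e1 rotates to e2
    R*[0; 1]    % ~ [-1; 0]: e2 rotates to -e1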

True-or-False 1.68.
a. If A ∈ R3×5 and T is a transformation defined by T (x) = Ax, then the
domain of T is R3 .
b. A linear transformation is a special type of function.
c. The superposition principle is a physical description of a linear trans-
formation.
d. Every matrix transformation is a linear transformation.
e. Every linear transformation is a matrix transformation. (If it is false,
can you find an example that is linear but of no matrix description?)
Solution.

Ans: F,T,T,T,F

Exercises 1.9
1. With T defined by T(x) = Ax, find a vector x whose image under T is b, and determine whether x is unique.
$$
A = \begin{bmatrix} 1 & -3 & 2 \\ 0 & 1 & -4 \\ 3 & -5 & -9 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} 6 \\ -7 \\ -9 \end{bmatrix}
$$
Ans: x = [−5, −3, 1]^T, unique
2. Answer the following.

(a) Let A be a 6 × 5 matrix. What must a and b be in order to define T : R^a → R^b by T(x) = Ax?
(b) How many rows and columns must a matrix A have in order to define a mapping from R⁴ into R⁵ by the rule T(x) = Ax?

Ans: (a) a = 5; b = 6
   
3. Let b = [−1, 1, 0]^T and
$$
A = \begin{bmatrix} 1 & -4 & 7 & -5 \\ 0 & 1 & -4 & 3 \\ 2 & -6 & 6 & -4 \end{bmatrix}.
$$
Is b in the range of the linear transformation x ↦ Ax? Why or why not?
Ans: yes
" # " #
5 −2
4. Use a rectangular coordinate system to plot u = , u= , and their images under
2 4
the given transformation T . (Make a separate and reasonably large sketch.) Describe
2
geometrically
" what
#"T does
# to each vector x in R .
−1 0 x1
T (x) =
0 −1 x2
5. Show that the transformation T defined by T (x1 , x2 ) = (2x1 − 3x2 , x1 + 4, 5x2 ) is not linear.
Hint : T (0, 0) = 0?
6. Let T : R3 → R3 be the transformation that projects each vector x = (x1 , x2 , x3 ) onto the
plane x2 = 0, so T (x) = (x1 , 0, x3 ). Show that T is a linear transformation.
Hint : Try to verify (1.32): T (cx + dy) = T (cx1 + dy1 , cx2 + dy2 , cx3 + dy3 ) = · · · = cT (x) + dT (y).

1.10. The Matrix of a Linear Transformation


Note: In Example 1.65, we proved that every matrix transformation is linear. The reverse is not always true. However, a linear transformation defined on Rⁿ is a matrix transformation.

Here in this section, we will try to find matrices for linear transformations
defined in Rn . Let’s begin with an example.
Example 1.69. Suppose T : R² → R³ is a linear transformation such that
$$
T(\mathbf{e}_1) = \begin{bmatrix} 5 \\ -7 \\ 2 \end{bmatrix}, \qquad
T(\mathbf{e}_2) = \begin{bmatrix} -3 \\ 8 \\ 0 \end{bmatrix},
\qquad\text{where } \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\ \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
$$
Find a matrix A such that T(x) = Ax for all x ∈ R².

Solution. What we should do is to find a matrix A ∈ R^{3×2} such that
$$
T(\mathbf{e}_1) = A\mathbf{e}_1 = \begin{bmatrix} 5 \\ -7 \\ 2 \end{bmatrix}, \qquad
T(\mathbf{e}_2) = A\mathbf{e}_2 = \begin{bmatrix} -3 \\ 8 \\ 0 \end{bmatrix}. \tag{1.35}
$$
Let x = [x1, x2]^T ∈ R². Then
$$
\mathbf{x} = x_1\begin{bmatrix} 1 \\ 0 \end{bmatrix} + x_2\begin{bmatrix} 0 \\ 1 \end{bmatrix} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2. \tag{1.36}
$$
It follows from the linearity of T that
$$
T(\mathbf{x}) = T(x_1\mathbf{e}_1 + x_2\mathbf{e}_2) = x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2)
= [T(\mathbf{e}_1)\ T(\mathbf{e}_2)]\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \underbrace{\begin{bmatrix} 5 & -3 \\ -7 & 8 \\ 2 & 0 \end{bmatrix}}_{A}\mathbf{x}. \tag{1.37}
$$

Now you can easily check that A satisfies (1.35).

Remark 1.70. The matrix of a linear transformation is decided by its


action on the standard basis.

1.10.1. The Standard Matrix

Theorem 1.71. Let T : Rn → Rm be a linear transformation. Then


there exists a unique matrix A ∈ Rm×n such that

T (x) = Ax, for all x ∈ Rn .

In fact, with ej denoting the j-th standard unit vector in Rn ,

A = [T (e1 ) T (e2 ) · · · T (en )] . (1.38)

The matrix A is called the standard matrix of the transformation.

Note: Standard unit vectors in Rⁿ and the standard matrix:
$$
\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \cdots, \quad
\mathbf{e}_n = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}. \tag{1.39}
$$
Any x ∈ Rⁿ can be written as
$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n.
$$
Thus
$$
T(\mathbf{x}) = T(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n)
= x_1 T(\mathbf{e}_1) + x_2 T(\mathbf{e}_2) + \cdots + x_n T(\mathbf{e}_n)
= [T(\mathbf{e}_1)\ T(\mathbf{e}_2)\ \cdots\ T(\mathbf{e}_n)]\,\mathbf{x}, \tag{1.40}
$$
and therefore the standard matrix reads
$$
A = [T(\mathbf{e}_1)\ T(\mathbf{e}_2)\ \cdots\ T(\mathbf{e}_n)]. \tag{1.41}
$$

Example 1.72. Let T : R² → R² be the horizontal shear transformation that leaves e1 unchanged and maps e2 into e2 + 2e1. Write the standard matrix of T.
Solution.

Example 1.73. Write the standard matrix for the linear transformation
T : R2 → R4 given by
T (x1 , x2 ) = (x1 + 4x2 , 0, x1 − 3x2 , x1 ).

Solution.
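A sketch (Octave/Matlab) of Theorem 1.71 applied to Example 1.73: build the standard matrix column-by-column from the images of the standard unit vectors.

    T = @(x) [x(1) + 4*x(2); 0; x(1) - 3*x(2); x(1)];
    A = [T([1; 0]), T([0; 1])]   % = [1 4; 0 0; 1 -3; 1 0]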

Geometric Linear Transformations of R²

Example 1.74. Find the standard matrices for the following reflections in R².
1) The reflection through the x1-axis, defined as R1([x1, x2]^T) = [x1, −x2]^T.
2) The reflection through the line x1 = x2, defined as R2([x1, x2]^T) = [x2, x1]^T.
3) The reflection through the line x1 = −x2. (Define R3 first.)
Solution.

Ans: 3)
$$
A_3 = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}
$$

1.10.2. Existence and Uniqueness Questions

Definition 1.75.

1) A mapping T : Rn → Rm is said to be surjective (onto Rm ) if each


b in Rm is the image of at least one x in Rn .
2) A mapping T : Rn → Rm is said to be injective (one-to-one) if each
b in Rm is the image of at most one x in Rn .

Figure 1.11: Surjective?: Is the range of T all of Rm ?

Figure 1.12: Injective?: Is each b ∈ Rm the image of one and only one x in Rn ?

Note: For solutions of Ax = b with A ∈ R^{m×n}, existence is related to surjectiveness, while uniqueness is granted for injective mappings.

Example 1.76. Let T : R⁴ → R³ be the linear transformation whose standard matrix is
$$
A = \begin{bmatrix} 1 & -4 & 0 & 1 \\ 0 & 2 & -1 & 3 \\ 0 & 0 & 0 & -1 \end{bmatrix}.
$$
Is T onto? Is T one-to-one?
Solution.

Ans: onto, but not one-to-one

Theorem 1.77. Let T : Rn → Rm be a linear transformation with stan-


dard matrix A. Then,
(a) T maps Rn onto Rm if and only if the columns of A span Rm .
( ⇔ every row has a pivot position
⇔ Ax = b has a solution for all b ∈ Rm )
(b) T is one-to-one if and only if the columns of A are linearly indepen-
dent.
( ⇔ every column of A is a pivot column
⇔ Ax = 0 has “only" the trivial solution)
Example 1.78. Let
$$
T(\mathbf{x}) = \begin{bmatrix} 1 & 3 & 0 \\ 0 & 3 & 4 \\ 0 & 0 & 4 \end{bmatrix}\mathbf{x}.
$$
Is T one-to-one? Does T map R³ onto R³?
Solution.

 
Example 1.79. Let
$$
T(\mathbf{x}) = \begin{bmatrix} 1 & 4 \\ 0 & 0 \\ 1 & -3 \\ 1 & 0 \end{bmatrix}\mathbf{x}.
$$
Is T one-to-one (1–1)? Is T onto?
Solution.
Example 1.80. Let
$$
T(\mathbf{x}) = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 2 & 4 & 6 \end{bmatrix}\mathbf{x}.
$$
Is T 1–1? Is T onto?
Solution.

Example 1.81. Let T : R4 → R3 be the linear transformation given by

T (x1 , x2 , x3 , x4 ) = (x1 − 4x2 + 8x3 + x4 , 2x2 − 8x3 + 3x4 , 5x4 ).

Is T 1–1? Is T onto?
Solution.
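A numerical sketch of Theorem 1.77 for Example 1.81 (assuming the built-in rank, which counts pivot positions):

    A = [1 -4  8 1;
         0  2 -8 3;
         0  0  0 5];        % standard matrix of T
    rank(A) == size(A,1)    % true (3 = 3): a pivot in every row, so T is onto
    rank(A) == size(A,2)    % false (3 < 4): a non-pivot column exists, so T is not 1-1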

True-or-False 1.82.
a. A mapping T : Rⁿ → R^m is one-to-one if each vector in Rⁿ maps onto a unique vector in R^m.
b. If A is a 3 × 2 matrix, then the transformation x ↦ Ax cannot map R² onto R³.
c. If A is a 3 × 2 matrix, then the transformation x ↦ Ax cannot be one-to-one. (See Theorem 1.77.)
d. A linear transformation T : Rⁿ → R^m is completely determined by its action on the columns of the n × n identity matrix.
Solution.

Ans: F,T,F,T

Exercises 1.10
1. Assume that T is a linear transformation. Find the standard matrix of T .

(a) T : R2 → R4 , T (e1 ) = (3, 1, 3, 1) and T (e2 ) = (5, 2, 0, 0), where e1 = (1, 0) and e2 =
(0, 1).
(b) T : R2 → R2 first performs a horizontal shear that transforms e2 into e2 − 2e1 (leav-
ing e1 unchanged) and then reflects points through the line x2 = −x1 .
" # " # " #
1 −2 0 −1 0 −1
Ans: (b) shear: and reflection: ; it becomes .
0 1 −1 0 −1 2
2. Show that T is a linear transformation by finding a matrix that implements the map-
ping. Note that x1 , x2 , · · · are not vectors but are entries in vectors.

(a) T (x1 , x2 , x3 , x4 ) = (0, x1 + x2 , x2 + x3 , x2 + x4 )


(b) T (x1 , x2 , x3 ) = (x1 − 5x2 + 4x3 , x2 − 6x3 )
(c) T (x1 , x2 , x3 , x4 ) = 2x1 + 3x3 − 4x4

3. Let T : R2 → R2 be a linear transformation such that T (x1 , x2 ) = (x1 +x2 , 4x1 +5x2 ). Find
x such that T (x) = (3, 8).
Ans: x = [7; −4]
4. Determine if the specified linear transformation is (1) one-to-one and (2) onto. Justify
each answer.

(a) The transformation in Exercise 2(a).


(b) The transformation in Exercise 2(b).
Ans: (a) Not 1-1, not onto; (b) Not 1-1, but onto
5. Describe the possible echelon forms of the standard matrix for a linear transformation
T , where T : R4 → R3 is onto. Use the notation of Exercise 3 in Section 1.2.
Hint : The matrix should have a pivot position in each row. Thus there are 4 different possible echelon forms.
Chapter 2
Matrix Algebra

From elementary school, you have learned about numbers and operations such as addition, subtraction, multiplication, division, and factorization. Matrices are also a kind of mathematical object, so you may define matrix operations, much as is done for numbers. Matrix algebra is the study of such matrix operations and related applications. The algorithms and techniques you will learn in this chapter are fundamental and important for further development in application tasks.
For the problems in Chapter 1, the major tool-set was the “3 elementary row operations”; so it will be for this chapter.


2.1. Matrix Operations

Let A be an m × n matrix.
Let aij denote the entry in row i and
column j. Then, we write A = [aij ].

Figure 2.1: Matrix A ∈ Rm×n .

Terminologies
• If m = n, A is called a square matrix.
• If A is an n × n matrix, then the entries a11 , a22 , · · · , ann are called
diagonal entries.
• A diagonal matrix is a square matrix (say n × n) whose non-
diagonal entries are zero.
Ex: Identity matrix In .

Sums and Scalar Multiples:

1) Equality: Two matrices A and B of the same size (say m × n) are equal
if and only if the corresponding entries in A and B are equal.
Ex:

2) Sum: The sum of two matrices A = [aij ] and B = [bij ] of the same size
is the matrix A + B = [aij + bij ].
Ex:

3) Scalar multiplication: Let r be any scalar. Then, rA = r[aij ] = [raij ].


Ex:

Example 2.1. Let


" # " # " #
4 0 5 1 1 1 2 −3
A= , B= , C=
−1 3 2 3 5 7 0 1

a) A + B, A + C?
b) A − 2B
Solution.

Definition 2.2. Matrix multiplication: If A is an m × n matrix and B


is an n × p matrix with columns b1 , b2 , · · · , bp , then the matrix prod-
uct AB is the m × p matrix with columns Ab1 , Ab2 , · · · , Abp . That is,

AB = A[b1 b2 · · · bp ] = [Ab1 Ab2 · · · Abp ]. (2.1)

Matrix multiplication as a composition of two linear transforma-


tions:
AB : x ∈ Rp ↦ Bx ∈ Rn ↦ A(Bx) ∈ Rm     (2.2)

Figure 2.2

Let A ∈ Rm×n , B ∈ Rn×p , and x = [x1 , x2 , · · · , xp ]T ∈ Rp . Then,

Bx = [b1 b2 · · · bp ] x = x1 b1 + x2 b2 + · · · + xp bp , bi ∈ Rn
⇒ A(Bx) = A(x1 b1 + x2 b2 + · · · + xp bp )
= x1 Ab1 + x2 Ab2 + · · · + xp Abp
= [Ab1 Ab2 · · · Abp ] x
⇒ AB = [Ab1 Ab2 · · · Abp ] ∈ Rm×p
(2.3)

Example 2.3. Let T : R2 → R3 and S : R2 → R2 be such that T (x) = Ax


 
where A = [4 −3; −3 5; 0 1] and S(x) = Bx where B = [1 4; 3 −2]. Compute the
standard matrix of T ◦ S.
Solution. Hint : (T ◦ S) (x) = T (S(x)) = T (Bx) = A(Bx) = (AB)x.

Row-Column rule for Matrix Multiplication


If the product AB is defined (i.e. number of columns in A = number of
rows in B) and A ∈ Rm×n , then the entry in row i and column j of AB
is the sum of products of corresponding entries from row i of A and
column j of B. That is,

(AB)ij = ai1 b1j + ai2 b2j + · · · + ain bnj . (2.4)

The sum of products is also called the dot product.
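
Note: (Matlab/Octave) Entry-wise, rule (2.4) is the dot product of row i of A with column j of B. A minimal sketch, assuming the matrices of Example 2.4 below:

    A = [2 3; 1 -5];  B = [4 3 6; 1 -2 3];
    i = 1;  j = 2;
    entry = A(i,:) * B(:,j)     % row i of A times column j of B, per (2.4)
    C = A*B;  disp(C(i,j))      % matches the (i,j) entry of AB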



Example 2.4. Compute AB if


" # " #
2 3 4 3 6
A= , B= .
1 −5 1 −2 3
Solution.

Example 2.5. Find all columns of matrix B if


" # " #
1 2 8 7
A= , AB = .
−1 3 7 −2
Solution.

Example 2.6. Find the first column of B if


   
A = [−2 3; 3 3; 5 −3] and AB = [−11 10; 9 0; 23 −16].
Solution.

Remark 2.7.
1. (Commutativity) Suppose both AB and BA are defined. Then, in general, AB ≠ BA.
Ex: A = [3 −6; −1 2], B = [−1 1; 3 4]. Then AB = [−21 −21; 7 7], BA = [−4 8; 5 −10].

2. (Cancellation law) If AB = AC, then B = C need not be true. (Consider the case A = 0; see also determinants and invertibility.)
3. (Powers of a matrix) If A is an n × n matrix and if k is a positive
integer, then Ak denotes the product of k copies of A:

Ak = A · A · · · A   (k times)

If k = 0, then Ak is identified with the identity matrix, In .



Transpose of a Matrix:
Definition 2.8. Given an m × n matrix A, the transpose of A is the
matrix, denoted by AT , whose columns are formed from the correspond-
ing rows of A.
 
Example 2.9. If A = [1 −4 8 1; 0 2 −1 3; 0 0 0 5], then AT =

Theorem 2.10.
a. (AT )T = A
b. (A + B)T = AT + B T
c. (rA)T = r AT , for any scalar r
d. (AB)T = B T AT

Note: The transpose of a product of matrices equals the product of their


transposes in the reverse order: (ABC)T = C T B T AT

True-or-False 2.11.
a. Each column of AB is a linear combination of the columns of B using
weights from the corresponding column of A.
b. The second row of AB is the second row of A multiplied on the right by
B.
c. The transpose of a sum of matrices equals the sum of their transposes.
Solution.
Ans: F(T if A ↔ B),T,T

Challenge 2.12.

a. Show that if the columns of B are linearly dependent, then so are the
columns of AB.
b. Suppose CA = In (the n × n identity matrix). Show that the equation
Ax = 0 has only the trivial solution.

Hint : a. The condition means that Bx = 0 has a nontrivial solution.



Exercises 2.1
1. Compute the product AB in two ways: (a) by the definition, where Ab1 and Ab2 are
computed separately, and (b) by the row-column rule for computing AB.
A = [−1 2; 5 4; 2 −3] and B = [3 2; 2 1].
2. If a matrix A is 5 × 3 and the product AB is 5 × 7, what is the size of B?
" # " # " #
2 −3 8 4 5 −2
3. Let A = , B = , and C = . Verify that AB = AC and yet
−4 6 5 5 3 1
B 6= C.
" # " #
1 −2 −1 2 −1
4. If A = and AB = , determine the first and second columns of
−2 5 6 −9 3
B. " # " #
7 −8
Ans: b1 = , b2 =
4 −5

5. Give a formula for (ABx)T , where x is a vector and A and B are matrices of appropriate
sizes.
   
6. Let u = [−2; 3; −4] and v = [a; b; c]. Compute uT v, vT u, u vT and v uT .
Ans: uT v = −2a + 3b − 4c and u vT = [−2a −2b −2c; 3a 3b 3c; −4a −4b −4c].



2.2. The Inverse of a Matrix


Definition 2.13. An n × n matrix A is said to be invertible (nonsin-
gular) if there is an n × n matrix B such that AB = In = BA, where In
is the identity matrix.

Note: In this case, B is the unique inverse of A denoted by A−1 .


(Thus AA−1 = In = A−1 A.)
" # " #
2 5 −7 −5
Example 2.14. If A = and B = . Find AB and BA.
−3 −7 3 2
Solution.

Theorem 2.15. (Inverse of an n × n matrix, n ≥ 2) An n × n matrix


A is invertible if and only if A is row equivalent to In and in this case any
sequence of elementary row operations that reduces A into In will also
reduce In to A−1 .

Algorithm 2.16. Algorithm to find A−1 :


1) Row reduce the augmented matrix [A : In ]
2) If A is row equivalent to In , then [A : In ] is row equivalent to [In :
A−1 ]. Otherwise A does not have any inverse.
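
Note: (Matlab/Octave) A minimal sketch of Algorithm 2.16, assuming the matrix of Example 2.17 below; rref is used here only as a quick check, not as a numerically robust inversion method:

    A = [3 2; 8 5];
    n = size(A, 1);
    M = rref([A eye(n)]);        % row reduce the augmented matrix [A : I]
    if isequal(M(:, 1:n), eye(n))
        Ainv = M(:, n+1:end)     % [A : I] ~ [I : A^{-1}]
    else
        disp('A is not invertible')
    end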
" #
3 2
Example 2.17. Find the inverse of A = .
8 5
Solution." You may begin
# with
3 2 1 0
[A : I2 ] =
8 5 0 1

 
Example 2.18. Find the inverse of A = [0 1 0; 1 0 3; 4 −3 8], if it exists.
Solution.
 
Example 2.19. Find the inverse of A = [1 2 −1; −4 −7 3; −2 −6 4], if it exists.
Solution.

Theorem 2.20.
" #
a b
a. (Inverse of a 2 × 2 matrix) Let A = . If ad − bc 6= 0, then A
c d
is invertible and " #
1 d −b
A−1 = (2.5)
ad − bc −c a

b. If A is an invertible matrix, then A−1 is also invertible and


(A−1 )−1 = A.
c. If A and B are n × n invertible matrices then AB is also invertible
and (AB)−1 = B −1 A−1 .
d. If A is invertible, then AT is also invertible and (AT )−1 = (A−1 )T .
e. If A is an n × n invertible matrix, then for each b ∈ Rn , the equation
Ax = b has a unique solution x = A−1 b.
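
Note: (Matlab/Octave) The 2 × 2 formula (2.5) in part (a) is easy to verify numerically; a minimal sketch with an arbitrary matrix:

    A = [3 2; 8 5];
    detA = A(1,1)*A(2,2) - A(1,2)*A(2,1);                 % ad - bc = -1, nonzero
    Ainv = (1/detA) * [A(2,2) -A(1,2); -A(2,1) A(1,1)];   % formula (2.5)
    disp(A * Ainv)                                        % prints the 2x2 identity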

Example 2.21. When A, B, C, and D are n × n invertible matrices, solve


for X if C −1 (A + X)B −1 = D.
Solution.

Example 2.22. Explain why the columns of an n × n matrix A are linearly


independent when A is invertible.
Solution. Hint : Let x1 a1 + x2 a2 + · · · + xn an = 0. Then, show that x = [x1 , x2 , · · · , xn ]T = 0.

Remark 2.23. (Another view of matrix inversion) For an invertible


matrix A, we have AA−1 = In . Let A−1 = [x1 x2 · · · xn ]. Then

AA−1 = A[x1 x2 · · · xn ] = [e1 e2 · · · en ]. (2.6)

Thus the j-th column of A−1 is the solution of

Axj = ej , j = 1, 2, · · · , n.
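
Note: (Matlab/Octave) A minimal sketch of this column-by-column view, assuming the matrix of Example 2.17:

    A = [3 2; 8 5];  n = size(A,1);  I = eye(n);
    Ainv = zeros(n);
    for j = 1:n
        Ainv(:,j) = A \ I(:,j);   % j-th column of A^{-1} solves A x_j = e_j
    end
    disp(Ainv * A)                % the identity, up to roundoff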

True-or-False 2.24.
a. In order for a matrix B to be the inverse of A, both equations AB = In
and BA = In must be true.
" #
a b
b. if A = and ad = bc, then A is not invertible.
c d
c. If A is invertible, then elementary row operations that reduce A to the
identity In also reduce A−1 to In .
Solution.
Ans: T,T,F

Exercises 2.2
 
"
# 1 −2 1
3 −4
1. Find the inverses of the matrices, if exist: A = and B =  4 −7 3
 
7 −8
−2 6 −4
Ans: B is not invertible.
2. Use matrix algebra to show that if A is invertible and D satisfies AD = I, then D = A−1 .
Hint : You may start with AD = I and then multiply A−1 .
3. Solve the equation AB + C = BC for A, assuming that A, B, and C are square and B is
invertible.
4. Explain why the columns of an n × n matrix A span Rn when A is invertible. Hint : If A
is invertible, then Ax = b has a solution for all b in Rn .
5. Suppose A is n × n and the equation Ax = 0 has only the trivial solution. Explain why
A is row equivalent to In . Hint : A has n pivot columns.
6. Suppose A is n × n and the equation Ax = b has a solution for each b in Rn . Explain
why A must be invertible. Hint : A has n pivot columns.

2.3. Characterizations of Invertible Matrices

Theorem 2.25. (Invertible Matrix Theorem) Let A be an n × n ma-


trix. Then the following are equivalent.
a. A is an invertible matrix.
b. A is row equivalent to the n × n identity matrix.
c. A has n pivot positions.
d. The equation Ax = 0 has only the trivial solution x = 0.
e. The columns of A are linearly independent.
f. The linear transformation x ↦ Ax is one-to-one.
g. The equation Ax = b has a unique solution for each b ∈ Rn .
h. The columns of A span Rn .
i. The linear transformation x ↦ Ax maps Rn onto Rn .
j. There is a matrix C ∈ Rn×n such that CA = I
k. There is a matrix D ∈ Rn×n such that AD = I
l. AT is invertible and (AT )−1 = (A−1 )T .
More statements will be added in the coming sections.

Note: Let A and B be square matrices. If AB = I, then A and B are both


invertible, with B = A−1 and A = B −1 .

Example 2.26. Use the Invertible Matrix Theorem to decide if A is invert-


ible: A = [1 0 −2; 3 1 −2; −5 −1 9].
Solution.

Example 2.27. Can a square matrix with two identical columns be invert-
ible?

Example 2.28. An n × n upper triangular matrix is one whose entries


below the main diagonal are zeros. When is a square upper triangular ma-
trix invertible?

Theorem 2.29. (Invertible linear transformations)


1. A linear transformation T : Rn → Rn is said to be invertible if
there exists S : Rn → Rn such that S ◦ T (x) = T ◦ S(x) = x for all
x ∈ Rn . In this case, S = T −1 .
2. Also, if A is the standard matrix for T , then A−1 is the standard
matrix for T −1 .

Example 2.30. Let T : R2 → R2 be a linear transformation such that


" # " #
x1 −5x1 + 9x2
T = . Find a formula for T −1 .
x2 4x1 − 7x2
Solution.

Example 2.31. Let T : Rn → Rn be one-to-one. What can you say about


T?
Solution. Hint : T : 1-1 ⇔ Columns of A is linearly independent

True-or-False 2.32. Let A be an n × n matrix.


a. If the equation Ax = 0 has only the trivial solution, then A is row
equivalent to the n × n identity matrix.
b. If the columns of A span Rn , then the columns are linearly independent.
c. If A is an n × n matrix, then the equation Ax = b has at least one
solution for each b ∈ Rn .
d. If the equation Ax = 0 has a nontrivial solution, then A has fewer than
n pivot positions.
e. If AT is not invertible, then A is not invertible.
f. If the equation Ax = b has at least one solution for each b ∈ Rn , then
the solution is unique for each b.
Solution.
Ans: T,T,F,T,T,T

Exercises 2.3
1. An m × n lower triangular matrix is one whose entries above the main diagonal are 0’s. When is a square lower triangular matrix invertible? Justify your answer. Hint : See Example 2.28.
2. Is it possible for a 5 × 5 matrix to be invertible when its columns do not span R5 ? Why
or why not?
Ans: No
3. If A is invertible, then the columns of A−1 are linearly independent. Explain why.
4. If C is 6 × 6 and the equation Cx = v is consistent for every v ∈ R6 , is it possible that
for some v, the equation Cx = v has more than one solution? Why or why not?
Ans: No
5. If the equation Gx = y has more than one solution for some y ∈ Rn , can the columns of
G span Rn ? Why or why not?
Ans: No
6. Let T : R2 → R2 be a linear transformation such that T (x1 , x2 ) = (6x1 − 8x2 , −5x1 + 7x2 ).
Show that T is invertible and find a formula for T −1 . Hint : See Example 2.30.

2.5. Matrix Factorizations


The LU Factorization/Decomposition:
The LU factorization is motivated by the fairly common industrial and busi-
ness problem of solving a sequence of equations, all with the same coeffi-
cient matrix:
Ax = b1 , Ax = b2 , · · · , Ax = bp . (2.7)

Definition 2.33. Let A ∈ Rm×n . The LU factorization of A is A = LU ,


where L ∈ Rm×m is a unit lower triangular matrix and U ∈ Rm×n is an
echelon form of A (upper triangular matrix):

Figure 2.3

Remark 2.34. Suppose Ax = b is to be solved. Then Ax = LU x = b, and it can be solved in two stages:

    Ly = b,  U x = y,     (2.8)

where each triangular system can be solved effectively via substitution.
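
Note: (Matlab/Octave) A minimal sketch of the two triangular solves in (2.8); the backslash operator detects triangular matrices and applies substitution. The factors used here happen to be those of Exercise 1 at the end of this section:

    L = [1 0 0; -1 1 0; 2 0 1];      % unit lower triangular
    U = [4 3 -5; 0 -2 2; 0 0 2];     % echelon form (upper triangular)
    b = [2; -4; 6];
    y = L \ b;     % forward substitution:  Ly = b
    x = U \ y      % back substitution:     Ux = y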

Definition 2.35. Every elementary row operation can be expressed as


a matrix to be left-multiplied. Such a matrix is called an elementary
matrix. Every elementary matrix is invertible.
 
Example 2.36. Let A = [1 −2 1; 2 −2 3; −3 2 0].

a) Reduce it to an echelon matrix, using replacement operations.


b) Express the replacement operations as elementary matrices.
c) Find their inverse.

Algorithm 2.37. (An LU Factorization Algorithm) The following derivation produces an LU factorization: Let A ∈ Rm×n . Then

    A = Im A
      = E1−1 E1 A
      = E1−1 E2−1 E2 E1 A = (E2 E1 )−1 E2 E1 A
      ⋮                                                                  (2.9)
      = E1−1 E2−1 · · · Ep−1 Ep · · · E2 E1 A = (Ep · · · E2 E1 )−1 (Ep · · · E2 E1 A),

where U = Ep · · · E2 E1 A is an echelon form and L = (Ep · · · E2 E1 )−1 .

Remark 2.38. Let E1 and E2 be elementary matrices that correspond


to replacement operations occurring in the LU Factorization Algorithm.
Then E1 E2 = E2 E1 and E1−1 E2−1 = E2−1 E1−1 .

Example 2.39. Find the LU factorization of


 
A = [4 3 −5; −4 −5 7; 8 8 −8].
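
Note: (Matlab/Octave) A minimal sketch of Algorithm 2.37 for the matrix above, assuming no row interchanges are needed (each pivot is nonzero); the built-in [L,U,P] = lu(A) performs partial pivoting, so its factors may differ from the hand computation:

    A = [4 3 -5; -4 -5 7; 8 8 -8];
    n = size(A,1);  L = eye(n);  U = A;
    for j = 1:n-1
        for i = j+1:n
            L(i,j) = U(i,j) / U(j,j);            % multiplier, stored in L
            U(i,:) = U(i,:) - L(i,j)*U(j,:);     % replacement operation on U
        end
    end
    disp(L); disp(U); disp(norm(A - L*U))        % A = LU, residual ~ 0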

Exercises 2.5
1. Solve the equation Ax = b by using the LU factorization given for A.

   A = [4 3 −5; −4 −5 7; 8 6 −8] = [1 0 0; −1 1 0; 2 0 1][4 3 −5; 0 −2 2; 0 0 2] and b = [2; −4; 6].

   Here Ly = b requires replacement operations forward, while U x = y requires replacement and scaling operations backward.
   Ans: x = [1/4; 2; 1]
2. M Use matlab or octave to find the LU factorization of matrices¹

   A = [1 3 −5 −3; −1 −5 8 4; 4 2 −5 −7; −2 −4 7 5] and B = [2 −4 4 −2; 6 −9 7 −3; −1 −4 8 0]

3. When A is invertible, MATLAB finds A−1 by factoring A = LU , inverting L and U ,


and then computing U −1 L−1 . You will use this method to compute the inverse of A in
Exercise 1.

(a) Find U −1 , starting from [U I], reduce it to [I U −1 ].


(b) Find L−1 , starting from [L I], reduce it to [I L−1 ].
(c) Compute U −1 L−1 .
 
Ans: A−1 = [1/8 3/8 1/4; −3/2 −1/2 1/2; −1 0 1/2].

¹ All problems marked by M will have a very high credit. For implementation issues, try to figure them out yourself; for example, you may google search for “matlab LU decomposition".

2.8. Subspaces of Rn

Definition 2.40. A subspace of Rn is any set H in Rn that has three


properties:
a) The zero vector is in H.
b) For each u and v in H, the sum u + v is in H.
c) For each u in H and each scalar c, the vector cu is in H.
That is, H is closed under linear combinations.

Example 2.41.

1. A line through the origin in R2 is a subspace of R2 .


2. Any plane through the origin in R3 .

Figure 2.4: Span{v1 , v2 } as a plane through the origin.

3. Let v1 , v2 , · · · , vp ∈ Rn . Then Span{v1 , v2 , · · · , vp } is a subspace of Rn .

Definition 2.42. Let A be an m × n matrix. The column space of A


is the set Col A of all linear combinations of columns of A. That is, if
A = [a1 a2 · · · an ], then

Col A = {u | u = c1 a1 + c2 a2 + · · · + cn an }, (2.10)

where c1 , c2 , · · · , cn are scalars. Col A is a subspace of Rm .


   
Example 2.43. Let A = [1 −3 −4; −4 6 −2; −3 7 6] and b = [3; 3; −4]. Determine whether b is in the column space of A, Col A.
Solution. Clue: ① b ∈ Col A
⇔ ② b is a linear combination of columns of A
⇔ ③ Ax = b is consistent
⇔ ④ [A b] has a solution

Definition 2.44. Let A be an m × n matrix. The null space of A, Nul A,


is the set of all solutions of the homogeneous system Ax = 0.

Theorem 2.45. Nul A is a subspace of Rn .

Proof.

Basis for a Subspace


Definition 2.46. A basis for a subspace H in Rn is a set of vectors that
1. is linearly independent, and
2. spans H.

Remark 2.47.

1. {[1; 0], [0; 2]} is a basis for R2 .
2. Let e1 = [1; 0; …; 0], e2 = [0; 1; …; 0], …, en = [0; 0; …; 1]. Then {e1 , e2 , · · · , en } is called the standard basis for Rn .

Basis for Nul A


Example 2.48. Find a basis for the null space of the matrix

    A = [−3 6 −1 1; 1 −2 2 3; 2 −4 5 8].

Solution. [A 0] ∼ [1 −2 0 −1 0; 0 0 1 2 0; 0 0 0 0 0]

Theorem 2.49. Basis for Nul A can be obtained from the parametric
vector form of solutions of Ax = 0. That is, suppose that the solutions of
Ax = 0 reads
x = x1 u1 + x2 u2 + · · · + xk uk ,
where x1 , x2 , · · · , xk correspond to free variables. Then, a basis for Nul A
is {u1 , u2 , · · · , uk }.
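
Note: (Matlab/Octave) A minimal sketch, assuming the matrix of Example 2.48. (In Matlab, null(A,'r') returns the "rational" basis read off from the reduced echelon form, matching the hand computation; null(A), used here, returns an orthonormal basis spanning the same subspace.)

    A = [-3 6 -1 1; 1 -2 2 3; 2 -4 5 8];
    N = null(A);       % columns form a basis for Nul A
    disp(A * N)        % ~ 0: every column solves Ax = 0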

Basis for Col A


Example 2.50. Find a basis for the column space of the matrix
 
1 0 −3 5 0
 
0 1 2 −1 0
B=0 0 0 0 1.

 
0 0 0 0 0
Solution. Observation: b3 = −3b1 + 2b2 and b4 = 5b1 − b2 .

Theorem 2.51. In general, non-pivot columns are linear combinations


of pivot columns. Thus the pivot columns of a matrix A form a basis for
Col A.

Example 2.52. Matrix A and its echelon form are given. Find a basis for Col A and a basis for Nul A.

    A = [3 −6 9 0; 2 −4 7 2; 3 −6 6 −6] ∼ [1 −2 3 0; 0 0 1 2; 0 0 0 0]
Solution.

True-or-False 2.53.
a. If v1 , v2 , · · · , vp are in Rn , then Span{v1 , v2 , · · · , vp } = Col [v1 v2 · · · vp ].
b. The columns of an invertible n × n matrix form a basis for Rn .
c. Row operations do not affect linear dependence relations among the
columns of a matrix.
d. The column space of a matrix A is the set of solutions of Ax = b.
e. If B is an echelon form of a matrix A, then the pivot columns of B form
a basis for Col A.
Solution.
Ans: T,T,T,F,F

Exercises 2.8
       
1. Let v1 = [1; −2; 4; 3], v2 = [4; −7; 9; 7], v3 = [5; −8; 6; 5], and u = [−4; 10; −7; −5]. Determine if u is in the subspace of R4 generated by {v1 , v2 , v3 }.
Ans: No
       
2. Let v1 = [−3; 0; 6], v2 = [−2; 2; 3], v3 = [0; −6; 3], and p = [1; 14; −9]. Determine if p is in Col A, where A = [v1 v2 v3 ].
Ans: Yes
3. Give integers p and q such that Nul A is a subspace of Rp and Col A is a subspace of Rq .
 
  1 2 3
3 2 1 −5  4 5 7
A = −9 −4 1 7 B = 
   

−5 −1 0
9 2 −5 1
2 7 11
4. Determine which sets are bases for R3 . Justify each answer.
             
0 5 6 1 3 −2 0
a)  1, −7, 3 b) −6, −4,  7,  1
             

−2 4 5 −7 7 5 −2
Ans: a) Yes
5. Matrix A and its echelon form are given. Find a basis for Col A and a basis for Nul A.

   A = [1 4 8 −3 −7; −1 2 7 3 4; −2 2 9 5 5; 3 6 9 −5 −2] ∼ [1 4 8 0 5; 0 2 5 0 −1; 0 0 0 1 4; 0 0 0 0 0]

   Hint : For a basis for Col A, you can just recognize pivot columns, while you should find the solutions of Ax = 0 for Nul A.
6. a) Suppose F is a 5 × 5 matrix whose column space is not equal to R5 . What can you
say about Nul F ?
b) If R is a 6 × 6 matrix and Nul R is not the zero subspace, what can you say about
Col R?
c) If Q is a 4 × 4 matrix and Col Q = R4 , what can you say about solutions of equations
of the form Qx = b for b in R4 ?
Ans: b) Col R ≠ R6 . Why? c) It always has a unique solution

2.9. Dimension and Rank


2.9.1. Coordinate systems

Remark 2.54. The main reason for selecting a basis for a subspace H,
instead of merely a spanning set, is that each vector in H can be written
in only one way as a linear combination of the basis vectors.

Example 2.55. Let B = {b1 , b2 , · · · , bp } be a basis of H and

x = c1 b1 + c2 b2 + · · · + cp bp ; x = d1 b1 + d2 b2 + · · · + dp bp , x ∈ H.

Show that c1 = d1 , c2 = d2 , · · · , cp = dp .
Solution. Hint : A property of a basis is that basis vectors are linearly independent

Definition 2.56. Suppose the set B = {b1 , b2 , · · · , bp } is a basis for a


subspace H. For each x ∈ H, the coordinates of x relative to the ba-
sis B are the weights c1 , c2 , · · · , cp such that x = c1 b1 + c2 b2 + · · · + cp bp ,
and the vector in Rp

    [x]B = [c1; …; cp]
is called the coordinate vector of x (relative to B) or the B-
coordinate vector of x.
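
Note: (Matlab/Octave) Finding [x]B is just solving a linear system: the weights satisfy [b1 · · · bp ] c = x. A minimal sketch, assuming the data of Example 2.57 below:

    v1 = [3; 6; 2];  v2 = [-1; 0; 1];  x = [3; 12; 7];
    B = [v1 v2];
    c = B \ x          % least-squares solve; exact here, since x is in Span{v1, v2}
    disp(B*c - x)      % ~ 0 confirms x is in H, with [x]_B = c = [2; 3]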
     
Example 2.57. Let v1 = [3; 6; 2], v2 = [−1; 0; 1], x = [3; 12; 7], and B = {v1 , v2 }. Then B is a basis for H = Span{v1 , v2 }, because v1 and v2 are linearly independent. Determine if x is in H, and if it is, find the coordinate vector of x relative to B.
Solution.

Figure 2.5: A coordinate system on a plane H ⊂ R3 .

Remark 2.58. The grid on the plane in Figure 2.5 makes H “look"
like R2 . The correspondence x ↦ [x]B is a one-to-one correspondence
between H and R2 that preserves linear combinations. We call such a
correspondence an isomorphism, and we say that H is isomorphic to
R2 .

2.9.2. Dimension of a subspace

Definition 2.59. The dimension of a nonzero subspace H (dim H) is


the number of vectors in any basis for H. The dimension of zero subspace
{0} is defined to be zero.

Example 2.60.

a) dim Rn = n.
b) Let H be as in Example 2.57. What is dim H?
c) G = Span{u}. What is dim G?
Solution.

Remark 2.61.
1) Dimension of Col A:

dim Col A = The number of pivot columns in A

which is called the rank of A, rank A.


2) Dimension of Nul A:

dim Nul A = The number of free variables in A


= The number of non-pivot columns in A

Theorem 2.62. (Rank Theorem) Let A ∈ Rm×n . Then

dim Col A + dim Nul A = rank A + dim Nul A = n


= (the number of columns in A)
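
Note: (Matlab/Octave) A minimal sketch of the Rank Theorem, assuming the matrix of Example 2.63 below:

    A = [1 -2 -1 5 4; 2 -1 1 5 6; -2 0 -2 1 -6; 3 1 4 1 5];
    r = rank(A);                  % dim Col A
    k = size(null(A), 2);         % dim Nul A = number of basis vectors returned
    disp([r k r+k size(A,2)])     % r + k equals the number of columns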

Example 2.63. A matrix and its echelon form are given. Find the bases
for Col A and Nul A and also state the dimensions of these subspaces.

    A = [1 −2 −1 5 4; 2 −1 1 5 6; −2 0 −2 1 −6; 3 1 4 1 5] ∼ [1 0 1 0 0; 0 1 1 0 0; 0 0 0 1 0; 0 0 0 0 1]
Solution.

Example 2.64. Find a basis for the subspace spanned by the given vectors.
What is the dimension of the subspace?

    [1; −1; −2; 3], [2; −3; −1; 4], [0; −1; 3; 2], [−1; 4; −7; 7], [3; −7; 6; −9]
Solution.

Theorem 2.65. (The Basis Theorem) Let H be a p-dimensional sub-


space of Rn . Then
a) Any linearly independent set of exactly p elements in H is automati-
cally a basis for H
b) Any set of p elements of H that spans H is automatically a basis for
H.

Theorem 2.66. (Invertible Matrix Theorem; continued from The-


orem 2.25, p.86) Let A be n × n square matrix. Then the following are
equivalent.
m. The columns of A form a basis of Rn
n. Col A = Rn
o. dim Col A = n
p. rank A = n
q. Nul A = {0}
r. dim Nul A = 0

Example 2.67.

a) If the rank of a 9 × 8 matrix A is 7, what is the dimension of solution


space of Ax = 0?
b) If A is a 4 × 3 matrix and the mapping x 7→ Ax is one-to-one, then what
is dim Nul A?
Solution.

True-or-False 2.68.
a. Each line in Rn is a one-dimensional subspace of Rn .
b. The dimension of Col A is the number of pivot columns of A.
c. The dimensions of Col A and Nul A add up to the number of columns of
A.
d. If B = {v1 , v2 , · · · , vp } is a basis for a subspace H of Rn , then the corre-
spondence x ↦ [x]B makes H look and act the same as Rp .
Hint : See Remark 2.58.

e. The dimension of Nul A is the number of variables in the equation


Ax = 0.
Solution.

Ans: F,T,T(Rank Theorem),T,F

Exercises 2.9
1. A matrix and its echelon form are given. Find the bases for Col A and Nul A, and then state the dimensions of these subspaces.

   A = [1 −3 2 −4; −3 9 −1 5; 2 −6 4 −3; −4 12 2 7] ∼ [1 −3 2 −4; 0 0 5 −7; 0 0 0 5; 0 0 0 0]
2. Use the Rank Theorem to justify each answer, or perform construction.

(a) If the subspace of all solutions of Ax = 0 has a basis consisting of three vectors and
if A is a 5 × 7 matrix, what is the rank of A?
Ans: 4
(b) What is the rank of a 4 × 5 matrix whose null space is three-dimensional?
(c) If the rank of a 7 × 6 matrix A is 4, what is the dimension of the solution space of
Ax = 0?
(d) Construct a 4 × 3 matrix with rank 1.
Chapter 3
Determinants

In linear algebra, the determinant is a scalar value that can be computed


for a square matrix. Geometrically, it can be viewed as a volume scaling
factor of the linear transformation described by the matrix, x ↦ Ax.
For example, let A ∈ R2×2 and
" # " #
1 0
u1 = A , u2 = A
0 1

Then the determinant of A (in modulus) is the same as the area of the par-
allelogram generated by u1 and u2 .
In this chapter, you will study the determinant and its properties.


3.1. Introduction to Determinants


Definition 3.1. Let A be an n × n square matrix. Then the determinant is a scalar value denoted by det A or |A|.

1) Let A = [a] ∈ R1×1 . Then det A = a.
2) Let A = [a b; c d] ∈ R2×2 . Then det A = ad − bc.

Example 3.2. Let A = [2 1; 0 3]. Consider a linear transformation T : R2 → R2 defined by T (x) = Ax.

1) Find the determinant of A.


2) Determine the image of a rectangle R = [0, 2] × [0, 1] under T .
3) Find the area of the image.
4) Figure out how det A, the area of the rectangle (= 2), and the area of the
image are related.
Solution.

Ans: 3) 12

Note: The determinant can be viewed as a volume scaling factor.



Definition 3.3. Let Aij be the submatrix of A obtained by deleting row


i and column j of A. Then the (i, j)-cofactor of A = [aij ] is the scalar Cij ,
given by
Cij = (−1)i+j det Aij . (3.1)

Definition 3.4. For n ≥ 2, the determinant of an n × n matrix A = [aij ]


is given by the following formulas:
1. The cofactor expansion across the first row:

det A = a11 C11 + a12 C12 + · · · + a1n C1n (3.2)

2. The cofactor expansion across the row i:

det A = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin (3.3)

3. The cofactor expansion down the column j:

det A = a1j C1j + a2j C2j + · · · + anj Cnj (3.4)
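
Note: (Matlab/Octave) Definition 3.4 translates directly into a small recursive function (named cofactor_det here for illustration); this minimal sketch expands across the first row, as in (3.2). Cofactor expansion costs O(n!) operations, so it is fine for small matrices but is not how determinants are computed in practice.

    function d = cofactor_det(A)
      % determinant by cofactor expansion across the first row, per (3.2)
      n = size(A, 1);
      if n == 1
        d = A(1,1);
      else
        d = 0;
        for j = 1:n
          A1j = A(2:end, [1:j-1, j+1:n]);   % delete row 1 and column j of A
          d = d + (-1)^(1+j) * A(1,j) * cofactor_det(A1j);
        end
      end
    end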


 
Example 3.5. Find the determinant of A = [1 5 0; 2 4 −1; 0 −2 0], by expanding across the first row and down column 3.
Solution.

Ans: −2
 
Example 3.6. Compute the determinant of A = [1 −2 5 2; 2 −6 −7 5; 0 0 3 0; 5 0 4 4] by a cofactor expansion.
Solution.

Note: If A is triangular (upper or lower) matrix, then det A is the product


of entries on the main diagonal of A.
 
Example 3.7. Compute the determinant of A = [1 −2 5 2; 0 −6 −7 5; 0 0 3 0; 0 0 0 4].
Solution.

True-or-False 3.8.
a. An n × n determinant is defined by determinants of (n − 1) × (n − 1)
submatrices.
b. The (i, j)-cofactor of a matrix A is the matrix Aij obtained by deleting
from A its i-th row and j-th column.
c. The cofactor expansion of det A down a column is equal to the cofactor
expansion along a row.
d. The determinant of a triangular matrix is the sum of the entries on the
main diagonal.
Solution.
Ans: T,F,T,F

Exercises 3.1
1. Compute the determinants using a cofactor expansion across the first row and down the second column.

   a) [3 0 4; 2 3 2; 0 5 −1]     b) [2 3 −3; 4 0 3; 6 1 5]

   Ans: a) 1, b) −24
2. Compute the determinants by cofactor expansions. At each step, choose a row or column
that involves the least amount of computation.
   
a) [3 5 −6 4; 0 −2 3 −3; 0 0 1 5; 0 0 0 3]     b) [4 0 −7 3 −5; 0 0 2 0 0; 7 3 −6 4 −8; 5 0 5 2 −3; 0 0 9 −1 2]

Ans: a) −18, b) 6
3. The expansion of a 3 × 3 determinant can be remembered by the following device. Write
a second copy of the first two columns to the right of the matrix, and compute the deter-
minant by multiplying entries on six diagonals:

Figure 3.1

Then, add the downward diagonal products and subtract the upward products. Use this
method to compute the determinants for the matrices in Exercise 1. Warning: This
trick does not generalize in any reasonable way to 4 × 4 or larger matrices.
4. Explore the effect of an elementary row operation on the determinant of a matrix. In
each case, state the row operation and describe how it affects the determinant.
" # " # " # " #
a b c d a b a + kc b + kd
a) , b) ,
c d a b c d c d
Ans: b) Replacement does not change the determinant
" #
3 1
5. Let A = . Write 5A. Is det (5A) = 5det A?
4 2

3.2. Properties of Determinants

Theorem 3.9. Let A be an n × n square matrix.

a) (Replacement): If B is obtained from A by a row replacement, then det B = det A.
   Ex: A = [1 3; 2 1], B = [1 3; 0 −5].
b) (Interchange): If two rows of A are interchanged to form B, then det B = −det A.
   Ex: A = [1 3; 2 1], B = [2 1; 1 3].
c) (Scaling): If one row of A is multiplied by k (≠ 0) to form B, then det B = k · det A.
   Ex: A = [1 3; 2 1], B = [1 3; −4 −2].


Example 3.10. Compute det A, where A = [1 −4 2; −2 8 −9; −1 7 0], after applying a couple of steps of replacement operations.
Solution.

Ans: 15

Theorem 3.11. A square matrix A is invertible if and only if det A ≠ 0.

Remark 3.12. Let A and B be n × n matrices.

a) det AT = det A.
   Ex: A = [1 3; 2 1], AT = [1 2; 3 1].
b) det (AB) = det A · det B.
   Ex: A = [1 3; 2 1], B = [1 1; 4 2]; then AB = [13 7; 6 4].
c) If A is invertible, then det A−1 = 1/det A. (∵ det In = 1.)
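
Note: (Matlab/Octave) These identities are easy to spot-check numerically; a minimal sketch with the 2 × 2 matrices from parts (a)–(b):

    A = [1 3; 2 1];  B = [1 1; 4 2];
    disp([det(A') det(A)])             % equal
    disp([det(A*B) det(A)*det(B)])     % equal
    disp(det(inv(A)) * det(A))         % ~ 1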

Example 3.13. Suppose the 5 × 5 matrices A, A1 , A2 , and A3 are related by the following elementary row operations:

    A −−(R2 ← R2 − 3R1 )−→ A1 −−(R3 ← (1/5)R3 )−→ A2 −−(R4 ↔ R5 )−→ A3

Find det A, if A3 = [1 2 3 4 1; 0 −2 1 −1 1; 0 0 3 0 1; 0 0 0 −1 1; 0 0 0 0 1].
Solution.

Ans: −30

A Linearity Property of the Determinant Function


Note: Let A ∈ Rn×n , A = [a1 · · · aj · · · an ]. Suppose that the j-th col-
umn of A is allowed to vary, and write

A = [a1 · · · aj−1 x aj+1 · · · an ].

Define a transformation T from Rn to R by

T (x) = det [a1 · · · aj−1 x aj+1 · · · an ]. (3.5)

Then,
T (cx) = cT (x),  T (u + v) = T (u) + T (v).     (3.6)
This (multi-) linearity property of the determinant turns out to have
many useful consequences that are studied in more advanced courses.

True-or-False 3.14.
a. If the columns of A are linearly dependent, then det A = 0.
b. det (A + B) = det A + det B.
c. If three row interchanges are made in succession, then the new deter-
minant equals the old determinant.
d. The determinant of A is the product of the diagonal entries in A.
e. If det A is zero, then two rows or two columns are the same, or a row or
a column is zero.
Solution.

Ans: T,F,F,F,F

Exercises 3.2
1. Find the determinant by row reduction to echelon form.
 
[1 3 0 2; −2 −5 7 4; 3 5 2 1; 1 −1 2 −3]
Ans: 0
2. Use determinants to find out if the matrix is invertible.

   [2 0 0 6; 1 −7 −5 0; 3 8 6 0; 0 7 5 4]
Ans: Invertible
3. Use determinants to decide if the set of vectors is linearly independent.

   [7; −4; −6], [−8; 5; 7], [7; 0; −5]
Ans: linearly independent
 
4. Compute det B^6 , where B = [1 0 1; 1 1 2; 1 2 1].
Ans: 64
5. Show or answer with justification.

a) Let A and P be square matrices, with P invertible. Show that det (P AP −1 ) = det A.
b) Suppose that A is a square matrix such that det A3 = 0. Can A can be invertible?
Ans: No
c) Let U be a square matrix such that U T U = I. Show that det U = ±1.

6. Compute AB and verify that det AB = det A · det B.
   A = [3 0; 6 1], B = [2 0; 5 4]
Chapter 4
Vector Spaces

A vector space (also called a linear space) is a nonempty set of objects,


called vectors, which is closed under “addition" and “scalar multiplication".
In this chapter, we will study basic concepts of such vector spaces and their
subspaces. Students will also learn how to prove mathematical statements.


4.1. Vector Spaces and Subspaces

Definition 4.1. A vector space is a nonempty set V of objects, called


vectors, on which are defined two operations, called addition and multi-
plication by scalars (real numbers), subject to the ten axioms (or rules)
listed below. The axioms must hold for all vectors u, v, w ∈ V and for all
scalars c and d.
1. u + v ∈ V
2. u + v = v + u
3. (u + v) + w = u + (v + w)
4. There is a zero vector 0 ∈ V such that u + 0 = u
5. For each u ∈ V , there is a vector −u ∈ V such that u + (−u) = 0
6. cu ∈ V
7. c(u + v) = cu + cv
8. (c + d)u = cu + du
9. c(du) = (cd)u
10. 1u = u

Example 4.2. Examples of Vector Spaces:

a) Rn , n ≥ 1, are the premier examples of vector spaces.


b) Let Pn = {p(t) = a0 +a1 t1 +· · ·+an tn }, n ≥ 1. For p(t) = a0 +a1 t1 +· · ·+an tn
and q(t) = b0 + b1 t1 + · · · + bn tn , define

(p + q)(t) = p(t) + q(t) = (a0 + b0 ) + (a1 + b1 )t1 + · · · + (an + bn )tn


(cp)(t) = cp(t) = ca0 + ca1 t1 + · · · + can tn

Then Pn is a vector space, with the addition and scalar multiplication.


c) Let V = {all real-valued functions defined on a set D}. Then, V is a vec-
tor space, with the usual function addition and scalar multiplication.

Definition 4.3. A subspace of a vector space V is a subset H of V that


has three properties:
a) 0 ∈ H, where 0 is the zero vector of V
b) H is closed under vector addition: for each u, v ∈ H, u + v ∈ H
c) H is closed under scalar multiplication: for each u ∈ H and each
scalar c, cu ∈ H

Example 4.4. Examples of Subspaces:

a) H = {0}: the zero subspace


b) Let P = {all polynomials with real coefficients defined on R}. Then, P
is a subspace of the space {all real-valued functions defined on R}.
c) The vector space R2 is not a subspace of R3 , because R2 ⊄ R3 .
d) Let H = {[s; t; 0] : s, t ∈ R}. Then H is a subspace of R3 .
Example 4.5. Determine if the given set is a subspace of Pn for an appro-


priate value of n.

a) {at2 | a ∈ R}     c) {p ∈ P3 with integer coefficients}
b) {a + t2 | a ∈ R}     d) {p ∈ Pn | p(0) = 0}

Solution.

Ans: a) Yes, b) No, c) No, d) Yes



A Subspace Spanned by a Set


Example 4.6. Let v1 , v2 ∈ V , a vector space. Prove that H = Span{v1 , v2 }
is a subspace of V .
Solution.

a) 0 ∈ H, where 0 is the zero vector of V

b) For each u, w ∈ H, u + w ∈ H

c) For each u ∈ H and each scalar c, cu ∈ H

Theorem 4.7. If v1 , v2 , · · · , vp are in a vector space V , then


Span{v1 , v2 , · · · , vp } is a subspace of V .

Example 4.8. Let H = {(a − 3b, b − a, a, b) | a, b ∈ R}. Show that H is a


subspace of R4 .
Solution.

Self-study 4.9. Let H and K be subspaces of V . Define the sum of H and


K as
H + K = {u + v | u ∈ H, v ∈ K}.
Prove that H + K is a subspace of V .
Solution.

True-or-False 4.10.
a. A vector is an arrow in three-dimensional space.
b. A subspace is also a vector space.
c. R2 is a subspace of R3 .
d. A subset H of a vector space V is a subspace of V if the following con-
ditions are satisfied: (i) the zero vector of V is in H, (ii) u, v, and u + v
are in H, and (iii) c is a scalar and cu is in H.
Solution.

Ans: F,T,F,F (In (ii), there is no statement that u and v represent all possible elements of H)

Exercises 4.1
You may use Definition 4.3, p. 119, or Theorem 4.7, p. 120.
(" # )
x
1. Let V be the first quadrant in the xy-plane; that is, let V = : x ≥ 0, y ≥ 0 .
y

a) If u and v are in V , is u + v in V ? Why?


b) Find a specific vector u in V and a specific scalar c such that cu is not in V . (This is
enough to show that V is not a vector space.)
 
2. Let H be the set of all vectors of the form [s; 3s; 2s].

   a) Find a vector v in R3 such that H = Span{v}.
   b) Why does this show that H is a subspace of R3 ?

   Ans: a) v = [1; 3; 2]
3. Let W be the set of all vectors of the form [5b + 2c; b; c].

   a) Find vectors u and v in R3 such that W = Span{u, v}.
   b) Why does this show that W is a subspace of R3 ?
b) Why does this show that W is a subspace of R3 ?
 
4. Let W be the set of all vectors of the form [s + 3t; s − t; 2s − t; 4t]. Show that W is a subspace of R4 .
For fixed positive integers m and n, the set Mm×n of all m × n matrices is a vector space, under the usual operations of addition of matrices and multiplication by real scalars.

5. Determine if the set H of all matrices of the form [a b; 0 d] is a subspace of M2×2 .

4.2. Null Spaces, Column Spaces, and Linear Transformations
Note: In applications of linear algebra, subspaces of Rn usually arise in
one of two ways: (1) as the set of all solutions to a system of homogeneous
linear equations or (2) as the set of all linear combinations of certain
specified vectors. In this section, we compare and contrast these two
descriptions of subspaces, allowing us to practice using the concept of a
subspace.

The Null Space of a Matrix


Definition 4.11. The null space of an m × n matrix A, written as
Nul A, is the set of all solutions of the homogeneous equation Ax = 0. In
set notation,
Nul A = {x | x ∈ Rn and Ax = 0}
 
" # 5
1 −3 −2
Example 4.12. Let A = and v =  3. Determine if v
 
−5 9 1
−2
belongs to the null space of A.
Solution.
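
Note: (Matlab/Octave) Membership in Nul A is a direct computation; a minimal sketch with the data above:

    A = [1 -3 -2; -5 9 1];  v = [5; 3; -2];
    disp(A * v)     % the zero vector, so v is in Nul A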

Theorem 4.13. The null space of an m × n matrix A is a subspace


of Rn . Equivalently, the set of all solutions to a system Ax = 0 of m
homogeneous linear equations in n unknowns is a subspace of Rn .

Proof.

Example 4.14. Let H be the set of all vectors in R4 whose coordinates a, b, c, d satisfy the equations a − 2b + 5c = d and c − a = b. Show that H is a subspace of R4 .
Solution. Rewrite the above equations as

    a − 2b + 5c − d = 0,
    −a − b + c = 0.

Then [a; b; c; d] is a solution of [1 −2 5 −1; −1 −1 1 0] x = 0. Thus the collection of these vectors is the null space of a matrix, hence a subspace.

An Explicit Description of Nul A


Example 4.15. Find a spanning set for the null space of the matrix
 
A = [−3 6 −1 1 −7; 1 −2 2 3 −1; 2 −4 5 8 −4]

Solution. [A 0] ∼ [1 −2 0 −1 3 0; 0 0 1 2 −2 0; 0 0 0 0 0 0] (R.E.F.)

The Column Space of a Matrix


Definition 4.16. The column space of an m × n matrix A, written
as Col A, is the set of all linear combinations of the columns of A. If
A = [a1 a2 · · · an ], then

Col A = Span{a1 , a2 , · · · , an } (4.1)

Theorem 4.17. The column space of an m × n matrix A is a subspace


of Rm .

Example 4.18. Find a matrix A such that W = Col A, where

    W = {[6a − b; a + b; −7a] : a, b ∈ R}.

Solution.

Remark 4.19. The column space of an m × n matrix A is all of Rm if


and only if the equation Ax = b has a solution for each b ∈ Rm .

The Contrast Between Nul A and Col A

Remark 4.20. Let A ∈ Rm×n .

Nul A:
1. Nul A is a subspace of Rn .
2. Nul A is implicitly defined; you are given only a condition (Ax = 0) that vectors must satisfy.
3. It takes time to find vectors in Nul A; row operations on [A 0] are required.
4. There is no obvious relation between Nul A and the entries in A.
5. A typical vector v in Nul A has the property that Av = 0.
6. Given a specific vector v, it is easy to tell if v is in Nul A: just compute Av.
7. Nul A = {0} ⇔ the equation Ax = 0 has only the trivial solution.
8. Nul A = {0} ⇔ the linear transformation x ↦ Ax is one-to-one.

Col A:
1. Col A is a subspace of Rm .
2. Col A is explicitly defined; you are told how to build vectors in Col A.
3. It is easy to find vectors in Col A: the columns of A are displayed, and others are formed from them.
4. There is an obvious relation between Col A and the entries in A, since each column of A is in Col A.
5. A typical vector v in Col A has the property that the equation Ax = v is consistent.
6. Given a specific vector v, it may take time to tell if v is in Col A; row operations on [A v] are required.
7. Col A = Rm ⇔ the equation Ax = b has a solution for every b ∈ Rm .
8. Col A = Rm ⇔ the linear transformation x ↦ Ax maps Rn onto Rm .

Kernel (Null Space) and Range of a Linear Transformation


Definition 4.21. A linear transformation T from a vector space V into a vector space W is a rule that assigns to each vector x ∈ V a unique vector T (x) ∈ W , such that

    (i) T (u + v) = T (u) + T (v) for all u, v ∈ V , and
    (ii) T (cu) = cT (u) for all u ∈ V and scalars c.     (4.2)

Example 4.22. Let T : V → W be a linear transformation from a vector


space V into a vector space W . Prove that the range of T is a subspace of
W.
Hint : Typical elements of the range have the form T (u) and T (v) for u, v ∈ V . See Defini-
tion 4.3, p. 119; you should check if the three conditions are satisfied.
Solution.

True-or-False 4.23.
a. The column space of A is the range of the mapping x ↦ Ax.
b. The kernel of a linear transformation is a vector space.
c. Col A is the set of all vectors that can be written as Ax for some x.
That is, Col A = {b | b = Ax, for x ∈ Rn }.
d. Nul A is the kernel of the mapping x ↦ Ax.
e. Col A is the set of all solutions of Ax = b.
Solution.

Ans: T,T,T,T,F

Exercises 4.2
1. Either show that the given set is a vector space, or find a specific example to the contrary.
     
(a) {[a; b; c] : a + b + c = 2}     (b) {[b − 2d; b + 3d; d] : b, d ∈ R}
Hint : See Definition 4.3, p. 119, and Example 4.18.


   
2. Let A = [−8 −2 −9; 6 4 8; 4 0 4] and w = [2; 1; −2].

(a) Is w in Col A? If yes, express w as a linear combination of columns of A.


(b) Is w in Nul A? Why?
Ans: yes, yes
3. Let V and W be vector spaces, and let T : V → W be a linear transformation. Given a
subspace U of V , let T (U ) denote the set of all images of the form T (x), where x ∈ U .
Show that T (U ) is a subspace of W .
Hint : You should check if the three conditions in Definition 4.3 are satisfied for all
elements in T (U ). For example, for the second condition, let’s first select two arbitrary
elements in T (U ): T (u1 ) and T (u2 ), where u1 , u2 ∈ U . Then what you have to do is to
show T (u1 ) + T (u2 ) ∈ T (U ). To show the underlined, you may use the assumption that
T is linear. That is, T (u1 ) + T (u2 ) = T (u1 + u2 ). Is the term in blue in T (U )? Why?
Advice from an old man: I know some of you may feel that the last problem is crazy. It is related
to mathematical logic and understandability. Just try to beat your brain out for it.
Chapter 5
Eigenvalues and Eigenvectors

In this chapter, we will learn

• how to find eigenvalues and corresponding eigenvectors,


• how to diagonalize matrices, and
• how to solve dynamical systems of differential equations.


5.1. Eigenvectors and Eigenvalues

Definition 5.1. Let A be an n × n matrix. An eigenvector of A is a


nonzero vector x such that Ax = λx for some scalar λ. In this case, a
scalar λ is an eigenvalue and x is the corresponding eigenvector.
" # " #
−1 5 2
Example 5.2. Is an eigenvector of ? What is the eigenvalue?
1 3 6
Solution.

" # " # " #


1 6 6 3
Example 5.3. Let A = ,u= , and v = .
5 2 −5 −2

a) Are u and v eigenvectors of A?

b) Show that 7 is an eigenvalue of matrix A, and find the corresponding


eigenvectors.
Solution. Hint : b) Start with Ax = 7x.

Definition 5.4. The set of all solutions of (A − λI) x = 0 is called the


eigenspace of A corresponding to eigenvalue λ.

Remark 5.5. Let λ be an eigenvalue of A. Then


a) Eigenspace is a subspace of Rn and the eigenspace of A corresponding
to λ is Nul (A − λI).
b) The homogeneous equation (A − λI) x = 0 has at least one free vari-
able.

Example 5.6. Find a basis for the eigenspace, and hence the dimension of the eigenspace, of A = [4 0 −1; 3 0 3; 2 −2 5], corresponding to the eigenvalue λ = 3.
Solution.

Example 5.7. Find a basis for the eigenspace, and hence the dimension of the eigenspace, of A = [4 −1 6; 2 1 6; 2 −1 8], corresponding to the eigenvalue λ = 2.
Solution.

Theorem 5.8. The eigenvalues of a triangular matrix are the entries


on its main diagonal.
   
Example 5.9. Let A = [3 6 −8; 0 0 6; 0 0 2] and B = [4 0 0; −2 1 0; 5 3 4]. What are the eigenvalues of A and B?
Solution.

Remark 5.10. Zero (0) is an eigenvalue of A ⇔ A is not invertible.

Ax = 0x = 0. (5.1)

Thus the eigenvector x ≠ 0 is a nontrivial solution of Ax = 0.



Theorem 5.11. If v1 , v2 , · · · , vr are eigenvectors that correspond


to distinct eigenvalues λ1 , λ2 , · · · , λr of n × n matrix A, then the set
{v1 , v2 , · · · , vr } is linearly independent.

Proof.

• Assume that {v1 , v2 , · · · , vr } is linearly dependent.


• One of the vectors in the set is a linear combination of the preceding
vectors.
• {v1 , v2 , · · · , vp } is linearly independent; vp+1 is a linear combination of
the preceding vectors.
• Then, there exist scalars c1 , c2 , · · · , cp such that
c1 v1 + c2 v2 + · · · + cp vp = vp+1 (5.2)

• Multiplying both sides of (5.2) by A, we obtain


c1 Av1 + c2 Av2 + · · · + cp Avp = Avp+1
and therefore, using the fact Avk = λk vk :
c1 λ1 v1 + c2 λ2 v2 + · · · + cp λp vp = λp+1 vp+1 (5.3)

• Multiplying both sides of (5.2) by λp+1 and subtracting the result from
(5.3), we have
c1 (λ1 − λp+1 )v1 + c2 (λ2 − λp+1 )v2 + · · · + cp (λp − λp+1 )vp = 0. (5.4)

• Since {v1 , v2 , · · · , vp } is linearly independent,


c1 (λ1 − λp+1 ) = 0, c2 (λ2 − λp+1 ) = 0, · · · , cp (λp − λp+1 ) = 0.

• Since λ1 , λ2 , · · · , λr are distinct,


c1 = c2 = · · · = cp = 0 ⇒ vp+1 = 0,
which is a contradiction.

Example 5.12. Show that if A2 is the zero matrix, then the only eigenvalue
of A is 0.
Solution. Hint : You may start with Ax = λx, x ≠ 0.

True-or-False 5.13.
a. If Ax = λx for some vector x, then λ is an eigenvalue of A.
b. A matrix A is not invertible if and only if 0 is an eigenvalue of A.
c. A number c is an eigenvalue of A if and only if the equation (A−cI)x = 0
has a nontrivial solution.
d. If v1 and v2 are linearly independent eigenvectors, then they corre-
spond to distinct eigenvalues.
e. An eigenspace of A is a null space of a certain matrix.
Solution.

Ans: F,T,T,F,T

Exercises 5.1
" #
7 3
1. Is λ = −2 an eigenvalue of ? Why or why not?
3 −1
Ans: Yes
" # " #
1 −3 1
2. Is an eigenvector of ? If so, find the eigenvalue.
4 −3 8
3. Find a basis for the eigenspace corresponding to each listed eigenvalue.

   (a) A = [1 0 −1; 1 −3 0; 4 −13 1], λ = −2
   (b) B = [3 0 2 0; 1 3 1 0; 0 1 1 0; 0 0 0 4], λ = 4

   Ans: (a) [1; 1; 3]  (b) [2; 3; 1; 0] and another vector
 
0 0 0
4. Find the eigenvalues of the matrix 0 2 5.
 

0 0 −1
 
5. For A = [1 2 3; 1 2 3; 1 2 3], find one eigenvalue, with no calculation. Justify your answer.
Ans: 0. Why?
6. Prove that λ is an eigenvalue of A if and only if λ is an eigenvalue of AT . (A and AT have
exactly the same eigenvalues, which is frequently used in engineering applications of
linear algebra.)
Hint : ① λ is an eigenvalue of A
⇔ ② (A − λI)x = 0, for some x ≠ 0
⇔ ③ (A − λI) is not invertible.
Now, try to use the Invertible Matrix Theorem (Theorem 2.25) to finish your proof. Note
that (A − λI)T = (AT − λI).

5.2. The Characteristic Equation

Definition 5.14. The scalar equation det (A − λI) = 0 is called the


characteristic equation of A; the polynomial p(λ) = det (A − λI)
is called the characteristic polynomial of A. The solutions of
det (A − λI) = 0 are the eigenvalues of A.
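
Note: (Matlab/Octave) The coefficients of the characteristic polynomial come from the built-in poly, and its roots are the eigenvalues; a minimal sketch, assuming the matrix of Example 5.15 below. (poly returns the coefficients of det(λI − A), which has the same roots as det(A − λI).)

    A = [8 2; 3 3];
    p = poly(A)       % [1 -11 18], i.e. lambda^2 - 11*lambda + 18
    lam = roots(p)    % eigenvalues 9 and 2
    disp(eig(A))      % same values, computed directly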

Example 5.15. Find the characteristic polynomial and all eigenvalues of


" #
8 2
A= .
3 3
Solution.

Example 5.16. Find the characteristic polynomial and all eigenvalues of


 
A = [1 1 0; 6 0 5; 0 0 2].
Solution.

Theorem 5.17. (Invertible Matrix Theorem; continued from The-


orem 2.25, p.86, and Theorem 2.66, p.105) Let A be n × n square ma-
trix. Then the following are equivalent.
s. the number 0 is not an eigenvalue of A.
t. det A ≠ 0

Example 5.18. Find the characteristic equation and all eigenvalues of


 
A = [5 −2 6 −1; 0 3 −8 0; 0 0 5 4; 0 0 0 1].
Solution.

Definition 5.19. The algebraic multiplicity (or, multiplicity) of an


eigenvalue is its multiplicity as a root of the characteristic equation.

Theorem 5.20. The eigenvalues of a triangular matrix are the entries


on its main diagonal.

Remark 5.21. Let A be an n × n matrix. Then the characteristic polynomial of A is of the form

    p(λ) = det (A − λI) = (−1)n (λn + cn−1 λn−1 + · · · + c1 λ + c0 )
         = (−1)n (λ − λ1 )(λ − λ2 ) · · · (λ − λn ),                     (5.5)

where some of the eigenvalues λi can be complex-valued numbers. Thus

    det A = p(0) = (−1)n (0 − λ1 )(0 − λ2 ) · · · (0 − λn ) = λ1 λ2 · · · λn .     (5.6)

That is, det A is the product of all eigenvalues of A.
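
Note: (Matlab/Octave) A quick numerical confirmation of (5.6), with an arbitrary matrix:

    A = [1 1 0; 6 0 5; 0 0 2];
    disp(prod(eig(A)))    % product of the eigenvalues
    disp(det(A))          % the same value (here -12)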

Similarity
Definition 5.22. Let A and B be n × n matrices. Then, A is similar to
B, if there is an invertible matrix P such that

A = P BP −1 , or equivalently, P −1 AP = B.

Writing Q = P −1 , we have B = QAQ−1 . So B is also similar to A, and we


say simply that A and B are similar. The map A ↦ P −1 AP is called a
similarity transformation.

The next theorem illustrates one use of the characteristic polynomial, and
it provides the foundation for several iterative methods that approximate
eigenvalues.

Theorem 5.23. If n × n matrices A and B are similar, then they


have the same characteristic polynomial and hence the same eigenvalues
(with the same multiplicities).

Proof. B = P −1 AP . Then,

B − λI = P −1 AP − λI
= P −1 AP − λP −1 P
= P −1 (A − λI)P,

from which we conclude that det (B − λI) = det (A − λI).



True-or-False 5.24.
a. The determinant of A is the product of the diagonal entries in A.
b. An elementary row operation on A does not change the determinant.
c. (det A)(det B) = det AB
d. If λ + 5 is a factor of the characteristic polynomial of A, then 5 is an
eigenvalue of A.
e. The multiplicity of a root r of the characteristic equation of A is called
the algebraic multiplicity of r as an eigenvalue of A.
Solution.
Ans: F,F,T,F,T

Exercises 5.2
" #
5 −3
1. Find the characteristic polynomial and the eigenvalues of .
−4 3 √
Ans: λ = 4 ± 13
2. Find the characteristic polynomial of matrices. [Note. Finding the characteristic poly-
nomial of a 3 × 3 matrix is not easy to do with just row operations, because the variable
is involved.]
   
(a) [0 3 1; 3 0 2; 1 2 0]     (b) [−1 0 1; −3 4 1; 0 0 2]
Ans: (b) −λ3 + 5λ2 − 2λ − 8 = −(λ + 1)(λ − 2)(λ − 4)

3. M Report the matrices and your conclusions.

(a) Construct a random integer-valued 4 × 4 matrix A, and verify that A and AT have
the same characteristic polynomial (the same eigenvalues with the same multiplic-
ities). Do A and AT have the same eigenvectors?
(b) Make the same analysis of a 5 × 5 matrix.

Note. Figure out by yourself how to generate random integer-valued matrices, how to
make its transpose, and how to get eigenvalues and eigenvectors.

5.3. Diagonalization

Definition 5.25. An n × n matrix A is said to be diagonalizable if


there exists an invertible matrix P and a diagonal matrix D such that

A = P DP −1 (or P −1 AP = D) (5.7)

Remark 5.26. Let A be diagonalizable, i.e., A = P DP −1 . Then

A2 = (P DP −1 )(P DP −1 ) = P D2 P −1
Ak = P Dk P −1
(5.8)
A−1 = P D−1 P −1 (when A is invertible)
det A = det D

Diagonalization enables us to compute Ak and det A quickly.


" #
7 2
Example 5.27. Let A = . Find a formula for Ak , given that
−4 1
" # " #
1 1 5 0
A = P DP −1 , where P = and D = .
−1 −2 0 3
Solution.

" #
2 · 5k − 3k 5k − 3k
Ans: Ak =
2 · 3k − 2 · 5k 2 · 3k − 5k
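
Note: (Matlab/Octave) The formula is easy to sanity-check numerically; a minimal sketch (k = 3 is arbitrary):

    P = [1 1; -1 -2];  D = [5 0; 0 3];
    A = P*D/P;            % A = P D P^{-1} = [7 2; -4 1]
    k = 3;
    disp(P * D^k / P)     % P D^k P^{-1}
    disp(A^k)             % the same matrix, computed directly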

Theorem 5.28. (The Diagonalization Theorem)


1. An n × n matrix A is diagonalizable if and only if A has n linearly
independent eigenvectors v1 , v2 , · · · , vn .
2. In fact, A = P DP −1 if and only if columns of P are n linearly inde-
pendent eigenvectors of A. In this case, the diagonal entries of D are
the corresponding eigenvalues of A. That is,

P = [v1 v2 · · · vn ],
 
λ1 0 · · · 0
(5.9)
 
 0 λ2 · · · 0 
D = diag(λ1 , λ2 , · · · , λn ) = 
 ... ... . . . ... ,

 
0 0 · · · λn

where Avk = λk vk , k = 1, 2, · · · , n.

Proof. Let P = [v1 v2 · · · vn ] and D = diag(λ1 , λ2 , · · · , λn ) be arbitrary matrices. Then,

    AP = A[v1 v2 · · · vn ] = [Av1 Av2 · · · Avn ],     (5.10)

while

    P D = [v1 v2 · · · vn ] diag(λ1 , λ2 , · · · , λn ) = [λ1 v1 λ2 v2 · · · λn vn ].     (5.11)
(⇒ ) Now suppose A is diagonalizable and A = P DP −1 . Then we have
AP = P D; it follows from (5.10) and (5.11) that
[Av1 Av2 · · · Avn ] = [λ1 v1 λ2 v2 · · · λn vn ],
from which we conclude
Avk = λk vk , k = 1, 2, · · · , n. (5.12)
Furthermore, P is invertible ⇒ {v1 , v2 , · · · , vn } is linearly independent.
(⇐ ) It is almost trivial.
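
Note: (Matlab/Octave) eig with two outputs returns the P and D of the theorem (up to scaling of the eigenvectors) when A is diagonalizable; a minimal sketch, assuming the matrix of Example 5.29 below. For a non-diagonalizable matrix, the returned eigenvector matrix is singular.

    A = [1 3 3; -3 -5 -3; 3 3 1];
    [P, D] = eig(A);          % columns of P: eigenvectors; diag(D): eigenvalues
    disp(norm(A - P*D/P))     % ~ 0, so A = P D P^{-1}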

Example 5.29. Diagonalize the following matrix, if possible.


 
A = [1 3 3; −3 −5 −3; 3 3 1]

Solution.
1. Find the eigenvalues of A.
2. Find three linearly independent eigenvectors of A.
3. Construct P from the vectors in step 2.
4. Construct D from the corresponding eigenvalues.
Check: AP = P D?

     
Ans: λ = 1, −2, −2.  v1 = [1; −1; 1], v2 = [−1; 1; 0], v3 = [−1; 0; 1]

Note: A matrix is not always diagonalizable.

Example 5.30. Diagonalize the following matrix, if possible.


 
A = [2 4 3; −4 −6 −3; 3 3 1],

for which det (A − λI) = −(λ − 1)(λ + 2)2 .


Solution.

Example 5.31. Diagonalize the following matrix, if possible.


 
A = [3 0 0 0; 0 2 0 0; 0 0 2 0; 1 0 0 3].
Solution.

Theorem 5.32. An n × n matrix with n distinct eigenvalues is diagonalizable.

Example 5.33. Determine if the following matrix is diagonalizable.
        A = [ 5  -8   1 ]
            [ 0   0   7 ]
            [ 0   0  -2 ]
Solution.

Matrices Whose Eigenvalues Are Not Distinct

Theorem 5.34. Let A be an n × n matrix whose distinct eigenvalues are
λ1, λ2, · · · , λp. Let E_A(λk) be the eigenspace for λk.

1. dim E_A(λk) ≤ (the multiplicity of the eigenvalue λk), for 1 ≤ k ≤ p.
2. The matrix A is diagonalizable
   ⇔ the sum of the dimensions of the eigenspaces equals n
   ⇔ dim E_A(λk) = the multiplicity of λk, for each 1 ≤ k ≤ p
   (and the characteristic polynomial factors completely into linear factors).
3. If A is diagonalizable and Bk is a basis for E_A(λk), then the total
   collection of vectors in the sets {B1, B2, · · · , Bp} forms an eigenvector
   basis for R^n.

Example 5.35. Diagonalize the following matrix, if possible.
        A = [  5   0   0   0 ]
            [  0   5   0   0 ]
            [  1   4  -3   0 ]
            [ -1  -2   0  -3 ]
Solution.

True-or-False 5.36.
a. A is diagonalizable if A = P D P^{-1}, for some matrix D and some invertible
   matrix P.
b. If Rn has a basis of eigenvectors of A, then A is diagonalizable.
c. If A is diagonalizable, then A is invertible.
d. If A is invertible, then A is diagonalizable.
e. If AP = P D, with D diagonal, then the nonzero columns of P must be
eigenvectors of A.
Solution.
Ans: F,T,F,F,T

Exercises 5.3
1. The matrix A is factored in the form A = P D P^{-1}. Find the eigenvalues of A
   and a basis for each eigenspace.

        A = [ 2 2 1 ]   [ 1  1  2 ] [ 5 0 0 ] [ 1/4  1/2  1/4 ]
            [ 1 3 1 ] = [ 1  0 -1 ] [ 0 1 0 ] [ 1/4  1/2 -3/4 ]
            [ 1 2 2 ]   [ 1 -1  0 ] [ 0 0 1 ] [ 1/4 -1/2  1/4 ]

2. Diagonalize the matrices, if possible.
    (a) [  2  2 -1 ]    (b) [  7  4  16 ]    (c) [ 4 0 0 ]
        [  1  3 -1 ]        [  2  5   8 ]        [ 1 4 0 ]
        [ -1 -2  2 ]        [ -2 -2  -5 ]        [ 0 0 5 ]

    Hint: (a) λ = 5, 1. (b) λ = 3, 1. (c) Not diagonalizable. Why?


3. A is a 3 × 3 matrix with two eigenvalues. Each eigenspace is one-dimensional. Is A
diagonalizable? Why?
4. Construct and verify.
(a) A nonzero 2 × 2 matrix that is invertible but not diagonalizable.
(b) A nondiagonal 2 × 2 matrix that is diagonalizable but not invertible.

5.5. Complex Eigenvalues

Remark 5.37. Let A be an n × n matrix with real entries. A complex eigenvalue λ
is a complex scalar such that

        Ax = λx,  where x ≠ 0.

We determine λ by solving the characteristic equation

        det(A − λI) = 0.
" #
0 −1
Example 5.38. Let A = and consider the linear transformation
1 0
x 7→ Ax, x ∈ R2 .
Then
a) It rotates the plane counterclockwise through a quarter-turn.
b) The action of A is periodic, since after four quarter-turns, a vector is
back where it started.
c) Obviously, no nonzero vector is mapped into a multiple of itself, so
A has no eigenvectors in R2 and hence no real eigenvalues.

Find the eigenvalues of A, and find a basis for each eigenspace.


Solution.

Eigenvalues and Eigenvectors of a Real Matrix That Acts on C^n

Remark 5.39. Let A be an n × n matrix whose entries are real. Then

        \overline{Ax} = \bar{A} \bar{x} = A \bar{x}.

Thus, if λ is an eigenvalue of A, i.e., Ax = λx, then

        A \bar{x} = \overline{Ax} = \overline{λx} = \bar{λ} \bar{x}.

That is,
        Ax = λx  ⇔  A \bar{x} = \bar{λ} \bar{x}.                      (5.13)

If λ is an eigenvalue of A with corresponding eigenvector x, then the complex
conjugate \bar{λ} of λ is also an eigenvalue, with eigenvector \bar{x}.
" #
1 5
Example 5.40. Let A = .
−2 3

(a) Find all eigenvalues and the corresponding eigenvectors for A.


(b) Let (λ,
" v), with
# λ = a − bi, be an eigen-pair. Let P = [Re v Im v] and
a −b
C= . Show that AP = P C. (This implies that A = P CP −1 .)
b a

Solution.

Ans: (a) λ = 2 ± 3i.



Theorem 5.41. (Factorization) Let A be a 2 × 2 real matrix with complex
eigenvalue λ = a − bi (b ≠ 0) and an associated eigenvector v. If
P = [Re v  Im v] and C = [ a -b ; b a ], then

        A = P C P^{-1}.                                                (5.14)
" #
a −b
Example 5.42. Find eigenvalues of C = .
b a
Solution.

Example 5.43. Find all eigenvalues and the corresponding eigenvectors for

        A = [ 3 -3 ; 1 1 ].
Also find an invertible matrix P and a matrix C such that A = P C P^{-1}.
Solution.

Exercises 5.5
1. Let each matrix act on C^2. Find the eigenvalues and a basis for each
   eigenspace in C^2.

    (a) [ 1 -2 ; 1 3 ]      (b) [ 1 5 ; -2 3 ]

    Ans: (a) An eigen-pair: λ = 2 + i, v = (−1 + i, 1)
" # " #
5 −2 a −b
2. Let A = . Find an invertible matrix P and a matrix C of the form such
1 3 b a
that A = P CP −1 . " # " #
1 −1 4 −1
Ans: P = ,C= .
1 0 1 4
3. Let A be an n × n real matrix with the property that A^T = A. Let x be any
   vector in C^n, and let q = \bar{x}^T A x. The equalities below show that q is
   a real number by verifying that \bar{q} = q. Give a reason for each step.

    \bar{q} = \overline{\bar{x}^T A x} = x^T \bar{A} \bar{x} = x^T A \bar{x}
            = (x^T A \bar{x})^T = \bar{x}^T A^T x = q
                 (a)        (b)        (c)        (d)        (e)

5.7. Applications to Differential Equations

Recall: For solving the first-order differential equation

        dx/dt = a x,                                                   (5.15)

we rewrite it as
        dx/x = a dt.                                                   (5.16)

By integrating both sides, we have

        ln |x| = a t + K.

Thus, for x = x(t),

        x = e^{a t + K} = e^{at} e^K = C e^{at}.                       (5.17)

Example 5.44. Consider the first-order initial-value problem

        dx/dt = 5x,   x(0) = 2.                                        (5.18)
(a) Find the solution x(t).
(b) Check if our solution satisfies both the differential equation and the
initial condition.
Solution.

Consider a system of first-order differential equations:

        x1′ = a11 x1 + a12 x2 + · · · + a1n xn
        x2′ = a21 x1 + a22 x2 + · · · + a2n xn
         ...                                                           (5.19)
        xn′ = an1 x1 + an2 x2 + · · · + ann xn

where x1, x2, · · · , xn are functions of t and the aij are constants. Then the
system can be written as

        x′(t) = A x(t),                                                (5.20)

where
        x(t) = [x1(t), · · · , xn(t)]^T,  x′(t) = [x1′(t), · · · , xn′(t)]^T,
        and A = (aij) is the n × n coefficient matrix.

How do we solve (5.20), x′(t) = A x(t)?

Observation 5.45. Let (λ, v) be an eigen-pair of A, i.e., Av = λv. Then, for an
arbitrary constant c,

        x(t) = c v e^{λt}                                              (5.21)

is a solution of (5.20).

Proof. Let's check it:

        x′(t) = c v (e^{λt})′ = c λ v e^{λt},
        A x(t) = A (c v e^{λt}) = c λ v e^{λt},                        (5.22)

which completes the proof.



Example 5.46. Solve the initial-value problem:

        x′(t) = [ -4 -2 ; 3 1 ] x(t),   x(0) = [ -2 ; 4 ].             (5.23)

Solution. The two eigen-pairs are (λi, vi), i = 1, 2. Then the general solution
is x = c1 x1 + c2 x2 = c1 v1 e^{λ1 t} + c2 v2 e^{λ2 t}.

    Ans: x(t) = −2 [ 2 ; -3 ] e^{−t} + 2 [ 1 ; -1 ] e^{−2t}

A Maple code
with(plots):
with(DEtools):
sys := diff(x(t),t)=-4*x(t)-2*y(t), diff(y(t),t)=3*x(t)+y(t):
f := {x(t),y(t)}:  sol := dsolve({sys,x(0)=-2,y(0)=4}, f):
P1 := DEplot({sys},[x(t),y(t)], t=0..5, x=-4..4, y=-4..4, arrows=medium,
             color=black, title=sol):
V1 := arrow([0,0],[2,-3],0.1,0.2,0.1,color="Green"):
V2 := arrow([0,0],[1,-1],0.1,0.2,0.1,color="Green"):
display(P1,V1,V2)

Figure 5.1: Trajectories for the dynamical system (5.23).

Definition 5.47.

1. If the eigenvalues of A are both negative, the origin is called an
   attractor or sink, since all trajectories (solutions x(t)) are drawn toward
   the origin. For (5.23), the direction of greatest attraction is along the
   trajectory of the eigenfunction x2 (along the line through 0 and v2),
   corresponding to the more negative eigenvalue λ = −2.
2. If the eigenvalues of A are both positive, the origin is called a repeller
   or source, since all trajectories (solutions x(t)) move away from the
   origin.
3. If A has both positive and negative eigenvalues, the origin is called a
   saddle point of the dynamical system.

The larger the eigenvalue is in modulus, the greater the attraction or
repulsion.

Example 5.48. Solve the initial-value problem:

        x′(t) = [ 2 3 ; -1 -2 ] x(t),   x(0) = [ 3 ; 2 ].              (5.24)

What are the direction of greatest attraction and the direction of greatest
repulsion?
Solution.

Figure 5.2: Trajectories for the dynamical system (5.24).

    Ans: x(t) = −(5/2) [ -3 ; 1 ] e^t + (9/2) [ -1 ; 1 ] e^{−t}.
    The direction of greatest attraction = [ -1 ; 1 ].

Summary 5.49. For the dynamical system x′(t) = A x(t), A ∈ R^{2×2}, let
(λ1, v1) and (λ2, v2) be eigen-pairs of A.

• Case (i): If λ1 and λ2 are real and distinct, then

        x = c1 v1 e^{λ1 t} + c2 v2 e^{λ2 t}.                           (5.25)

• Case (ii): If A has a double eigenvalue λ (with eigenvector v), then you
  should find a second, generalized eigenvector by solving, e.g.,

        (A − λI) w = v.                                                (5.26)

  (The above can be derived from a guess: x = t v e^{λt} + w e^{λt}.) Then the
  general solution becomes [2]

        x(t) = c1 v e^{λt} + c2 (t v + w) e^{λt}.                      (5.27)

  (w is a simple shift vector, which may not be unique.)
• Case (iii): If λ = a + bi (b ≠ 0) and \bar{λ} are complex eigenvalues with
  eigenvectors v and \bar{v}, then

        x = c1 v e^{λt} + c2 \bar{v} e^{\bar{λ} t},                    (5.28)

  from which two linearly independent real-valued solutions must be extracted.

Note: Let λ = a + bi and v = Re v + i Im v. Then

        v e^{λt} = (Re v + i Im v) · e^{at} (cos bt + i sin bt)
                 = [(Re v) cos bt − (Im v) sin bt] e^{at}
                   + i [(Re v) sin bt + (Im v) cos bt] e^{at}.
Let
        y1(t) = [(Re v) cos bt − (Im v) sin bt] e^{at},
        y2(t) = [(Re v) sin bt + (Im v) cos bt] e^{at}.

Then they are linearly independent and satisfy the dynamical system. Thus the
real-valued general solution of the dynamical system reads

        x(t) = C1 y1(t) + C2 y2(t).                                    (5.29)



Example 5.50. Construct the general solution of x′(t) = A x(t) when

        A = [ -3 2 ; -1 -1 ].

Solution.

Figure 5.3: Trajectories for the dynamical system.

    Ans: (complex solution)
        c1 [ 1−i ; 1 ] e^{(−2+i)t} + c2 [ 1+i ; 1 ] e^{(−2−i)t}
    Ans: (real solution)
        c1 [ cos t + sin t ; cos t ] e^{−2t} + c2 [ sin t − cos t ; sin t ] e^{−2t}

Exercises 5.7
1. (i) Solve the initial-value problem x′ = Ax with x(0) = (3, 2). (ii) Classify
   the nature of the origin as an attractor, repeller, or saddle point of the
   dynamical system. (iii) Find the directions of greatest attraction and/or
   repulsion. (iv) When the origin is a saddle point, sketch typical
   trajectories.

    (a) A = [ -2 -5 ; 1 4 ]      (b) A = [ 7 -1 ; 3 3 ].

    Ans: (a) The origin is a saddle point.
    The direction of G.A. = [ -5 ; 1 ]. The direction of G.R. = [ -1 ; 1 ].
    Ans: (b) The origin is a repeller. The direction of G.R. = [ 1 ; 1 ].
2. Use the strategies in (5.26) and (5.27) to solve

        x′ = [ 7 1 ; -4 3 ] x,   x(0) = [ 2 ; -5 ].

    Ans: x(t) = 2 [ 1 ; -2 ] e^{5t} − ( t [ 1 ; -2 ] + [ 0 ; 1 ] ) e^{5t}
Chapter 6
Orthogonality and Least Squares

In this chapter, we will learn

• Inner product, length, and orthogonality,


• Orthogonal projections,
• The Gram-Schmidt process, which is an algorithm to produce an or-
thogonal basis for any nonzero subspace of Rn , and
• Least-Squares problems, with applications to linear models.


6.1. Inner product, Length, and Orthogonality

Definition 6.1. Let u = [u1, u2, · · · , un]^T and v = [v1, v2, · · · , vn]^T be
vectors in R^n. Then, the inner product (or dot product) of u and v is given by

        u•v = u^T v = u1 v1 + u2 v2 + · · · + un vn = Σ_{k=1}^{n} uk vk.   (6.1)
Example 6.2. Let u = (1, −2, 2) and v = (3, 2, −4). Find u•v.
Solution.

Theorem 6.3. Let u, v, and w be vectors in Rn , and c be a scalar. Then


a. u•v = v•u
b. (u + v)•w = u•w + v•w
c. (cu)•v = c(u•v) = u•(cv)
d. u•u ≥ 0, and u•u = 0 ⇔ u = 0

Definition 6.4. The length (norm) of v is the nonnegative scalar ‖v‖ defined by

        ‖v‖ = √(v•v) = √(v1^2 + v2^2 + · · · + vn^2)  and  ‖v‖^2 = v•v.   (6.2)

Note: For any scalar c, ‖cv‖ = |c| ‖v‖.


" #
3
Example 6.5. Let W be a subspace of R2 spanned by v = . Find a unit
4
vector u that is a basis for W .
Solution.

Distance in R^n
Definition 6.6. For u, v ∈ R^n, the distance between u and v is

        dist(u, v) = ‖u − v‖,                                          (6.3)

the length of the vector u − v.

Example 6.7. Compute the distance between the vectors u = (7, 1) and
v = (3, 2).
Solution.

Figure 6.1: The distance between u and v is the length of u − v.
Example 6.8. Let u = (1, −2, 2) and v = (3, 2, −4). Find the distance between u
and v.
Solution.

Orthogonal Vectors
Definition 6.9. Two vectors u and v in Rn are orthogonal if u•v = 0.

Theorem 6.10. The Pythagorean Theorem: Two vectors u and v are orthogonal if
and only if

        ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.                                     (6.4)

Proof. For all u and v in R^n,

        ‖u + v‖^2 = (u + v)•(u + v) = ‖u‖^2 + ‖v‖^2 + 2 u•v.           (6.5)

Thus, u and v are orthogonal ⇔ (6.4) holds.

Note: The inner product can also be defined as

        u•v = ‖u‖ ‖v‖ cos θ,                                           (6.6)

where θ is the angle between u and v.



Orthogonal Complements
Definition 6.11. Let W ⊂ R^n be a subspace. A vector z ∈ R^n is said to be
orthogonal to W if z•w = 0 for all w ∈ W. The set of all vectors z that are
orthogonal to W is called the orthogonal complement of W and is denoted by W^⊥
(read as "W perpendicular" or simply "W perp"). That is,

        W^⊥ = {z | z•w = 0, ∀ w ∈ W}.                                  (6.7)

Example 6.12. Let W be a plane through the origin in R^3, and let L be the line
through the origin perpendicular to W. If z ∈ L and w ∈ W, then

        z•w = 0.

See Figure 6.2. In fact, L consists of all vectors that are orthogonal to the
w's in W, and W consists of all vectors orthogonal to the z's in L. That is,

        L = W^⊥  and  W = L^⊥.

Figure 6.2: A plane and line through the origin as orthogonal complements.

Remark 6.13.
1. A vector x is in W ⊥ ⇔ x is orthogonal to every vector in a set that
spans W .
2. W ⊥ is a subspace of Rn .

Definition 6.14. Let A be an m × n matrix. Then the row space is the set of all
linear combinations of the rows of A, denoted by Row A. That is,
Row A = Col A^T.

Theorem 6.15. Let A be an m × n matrix. The orthogonal complement of the row
space of A is the null space of A, and the orthogonal complement of the column
space of A is the null space of A^T:

        (Row A)^⊥ = Nul A  and  (Col A)^⊥ = Nul A^T.                   (6.8)
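A quick numeric illustration of (6.8), as an Octave/Matlab sketch; the built-in
null returns an orthonormal basis for the null space, and the rank-1 sample
matrix is chosen so that both null spaces are nonzero.

check_complements.m
% verify (Row A)^perp = Nul A and (Col A)^perp = Nul A^T numerically
A = [1 2 3; 2 4 6];      % rank-1 example
N = null(A);             % columns: orthonormal basis for Nul A
disp(A * N)              % rows of A are orthogonal to Nul A    => ~zero
NT = null(A');           % orthonormal basis for Nul A^T
disp(A' * NT)            % columns of A are orthogonal to Nul A^T => ~zero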

Example 6.16. Let u = (1, √3) and v = (−1/2, √3/2). Use (6.6) to find the angle
between u and v.
Solution.

Example 6.17. Let W be a subspace of R^n. Prove that if x ∈ W and x ∈ W^⊥, then
x = 0.
Solution. Hint: Let x ∈ W. The condition x ∈ W^⊥ implies that x is
perpendicular to every element in W, in particular to itself.

Example 6.18. Let W be a subspace of R^n. Prove that

        dim W + dim W^⊥ = n.                                           (6.9)

Solution. Hint: Collect a basis for W as the columns of a matrix A. Then you
may use (6.8): (Col A)^⊥ = Nul A^T.

True-or-False 6.19.
a. For any scalar c, u•(cv) = c(u•v).
b. If the distance from u to v equals the distance from u to −v, then u and
v are orthogonal.
c. For a square matrix A, vectors in Col A are orthogonal to vectors in
Nul A.
d. For an m × n matrix A, vectors in the null space of A are orthogonal to
vectors in the row space of A.
Solution.
Ans: T,T,F,T

Exercises 6.1
1. Let x = (3, −1, −5) and w = (6, −2, 3). Find x•x, x•w, and (x•w)/(x•x).

    Ans: 35, 5, 1/7

2. Find the distance between u = (0, −5, 2) and z = (−4, −1, 8).

    Ans: 2√17
3. Verify the parallelogram law for vectors u and v in R^n:

        ‖u + v‖^2 + ‖u − v‖^2 = 2‖u‖^2 + 2‖v‖^2.                       (6.10)

    Hint: Use (6.5).


4. Let u = (2, −5, −1) and v = (−7, −4, 6).

    (a) Compute u•v, ‖u‖, ‖v‖, ‖u + v‖, and ‖u − v‖.
    (b) Verify (6.5) and (6.10).

5. Suppose y is orthogonal to u and v. Show that y is orthogonal to every w in Span{u, v}.



6.2. Orthogonal Sets

Definition 6.20. A set of vectors {u1, u2, · · · , up} in R^n is said to be an
orthogonal set if each pair of distinct vectors from the set is orthogonal;
that is, ui•uj = 0 for i ≠ j.
Example 6.21. Let u1 = (1, −2, 1), u2 = (0, 1, 2), and u3 = (−5, −2, 1). Is the
set {u1, u2, u3} orthogonal?
Solution.

Theorem 6.22. If S = {u1, u2, · · · , up} is an orthogonal set of nonzero
vectors in R^n, then S is linearly independent and therefore forms a basis for
the subspace spanned by S.

Proof. It suffices to prove that S is linearly independent. Suppose

c1 u1 + c2 u2 + · · · + cp up = 0.

Take the dot product with u1 . Then the above equation becomes

c1 u1 •u1 + c2 u1 •u2 + · · · + cp u1 •up = 0,

from which we conclude c1 = 0. Similarly, by taking the dot product with ui ,


we can get ci = 0. That is,

c1 = c2 = · · · = cp = 0,

which completes the proof.



Definition 6.23. An orthogonal basis for a subspace W of Rn is a


basis for W that is also an orthogonal set.

Theorem 6.24. Let {u1, u2, · · · , up} be an orthogonal basis for a subspace W
of R^n. For each y in W, the weights in the linear combination

        y = c1 u1 + c2 u2 + · · · + cp up                              (6.11)

are given by
        cj = (y•uj)/(uj•uj),   j = 1, 2, · · · , p.                    (6.12)

Proof. y•uj = (c1 u1 + c2 u2 + · · · + cp up)•uj = cj uj•uj = cj ‖uj‖^2.


Example 6.25. Let u1 = (1, −2, 1), u2 = (0, 1, 2), and u3 = (−5, −2, 1). In
Example 6.21, we have seen that S = {u1, u2, u3} is orthogonal. Express the
vector y = (11, 0, −5) as a linear combination of the vectors in S.
Solution.

Ans: y = u1 − 2u2 − 2u3 .



6.2.1. An orthogonal projection

Note: Given a nonzero vector u in R^n, consider the problem of decomposing a
vector y ∈ R^n into the sum of two vectors, one a multiple of u and the other
orthogonal to u. Let

        y = ŷ + z,   ŷ // u and z ⊥ u.

Let ŷ = αu. Then

        0 = z•u = (y − αu)•u = y•u − α u•u.

Thus α = (y•u)/(u•u).

Figure 6.3: Orthogonal projection: y = ŷ + z.

Definition 6.26. Given a nonzero vector u in R^n, for y ∈ R^n, let

        y = ŷ + z,   ŷ // u and z ⊥ u.                                 (6.13)

Then
        ŷ = αu = ((y•u)/(u•u)) u,   z = y − ŷ.                         (6.14)

The vector ŷ is called the orthogonal projection of y onto u, and z is called
the component of y orthogonal to u. Let L = Span{u}. Then we denote

        ŷ = ((y•u)/(u•u)) u = proj_L y,                                (6.15)

which is called the orthogonal projection of y onto L.
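Formula (6.14) translates directly into code. A minimal Octave/Matlab sketch,
using the data of Example 6.27 below:

proj_sketch.m
% orthogonal projection of y onto u, per (6.14)
y = [7; 6];  u = [4; 2];
yhat = (dot(y,u)/dot(u,u)) * u;   % projection onto L = Span{u}
z    = y - yhat;                  % component of y orthogonal to u
disp(dot(z,u))                    % expect 0
disp(norm(z))                     % the distance from y to L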
" # " #
7 4
Example 6.27. Let y = and u = .
6 2

(a) Find the orthogonal projection of y onto u.


(b) Write y as the sum of two orthogonal vectors, one in L = Span{u} and
one orthogonal to u.
(c) Find the distance from y to L.
Solution.

Figure 6.4: The orthogonal projection of y onto L = Span{u}.
" # " #
2 7
Example 6.28. Let y = and u = .
6 1
(a) Find the orthogonal projection of y onto u.
(b) Write y as the sum of a vector in Span{u} and one orthogonal to u.
Solution.

Example 6.29. Let v = (4, −12, 8) and w = (2, 1, −3). Find the distance from v
to Span{w}.
Solution.

6.2.2. Orthonormal sets


Definition 6.30. A set {u1 , u2 , · · · , up } is an orthonormal set, if it is
an orthogonal set of unit vectors. If W is the subspace spanned by such a
set, then {u1 , u2 , · · · , up } is an orthonormal basis for W , since the set
is automatically linearly independent.
Example 6.31. In Example 6.21, p. 169, we saw that v1 = (1, −2, 1),
v2 = (0, 1, 2), and v3 = (−5, −2, 1) form an orthogonal basis for R^3. Find the
corresponding orthonormal basis.
Solution.

Theorem 6.32. An m × n matrix U has orthonormal columns if and only if
U^T U = I.

Proof. To simplify notation, we suppose that U has only three columns:
U = [u1 u2 u3], ui ∈ R^m. Then

        U^T U = [ u1^T ] [u1 u2 u3] = [ u1^T u1  u1^T u2  u1^T u3 ]
                [ u2^T ]              [ u2^T u1  u2^T u2  u2^T u3 ]
                [ u3^T ]              [ u3^T u1  u3^T u2  u3^T u3 ].

Thus, U has orthonormal columns ⇔ U^T U = I (the 3 × 3 identity). The proof of
the general case is essentially the same.

Theorem 6.33. Let U be an m × n matrix with orthonormal columns, and let
x, y ∈ R^n. Then

    (a) ‖Ux‖ = ‖x‖                 (length preservation)
    (b) (Ux)•(Uy) = x•y            (dot product preservation)
    (c) (Ux)•(Uy) = 0 ⇔ x•y = 0    (orthogonality preservation)

Proof.

Theorems 6.32 and 6.33 are particularly useful when applied to square ma-
trices.
Definition 6.34. An orthogonal matrix is a square invertible matrix U such
that U^{-1} = U^T.

True-or-False 6.35.
a. If y is a linear combination of nonzero vectors from an orthogonal set,
then the weights in the linear combination can be computed without
row operations on a matrix.
b. If the vectors in an orthogonal set of nonzero vectors are normalized,
then some of the new vectors may not be orthogonal.
c. A matrix with orthonormal columns is an orthogonal matrix.
d. If L is a line through 0 and if ŷ is the orthogonal projection of y onto L,
   then ‖ŷ‖ gives the distance from y to L.
e. Every orthogonal set in R^n is linearly independent.
f. If the columns of an m × n matrix A are orthonormal, then the linear
   mapping x ↦ Ax preserves lengths.
Solution.
Ans: T,F,F,F,F,T

Exercises 6.2
1. Determine which sets of vectors are orthogonal.

    (a) (2, −7, 1), (−6, −3, 9), (3, 1, −1)
    (b) (2, −5, −3), (0, 0, 0), (4, −2, 6)

2. Let u1 = (3, −3, 0), u2 = (2, 2, −1), u3 = (1, 1, 4), and x = (5, −3, 1).

    (a) Check if {u1, u2, u3} is an orthogonal basis for R^3.
    (b) Express x as a linear combination of {u1, u2, u3}.

    Ans: x = (4/3) u1 + (1/3) u2 + (1/3) u3
" # " #
1 −1
3. Compute the orthogonal projection of onto the line through and the origin.
−1 3
4. Determine which sets of vectors are orthonormal. If a set is only
   orthogonal, normalize the vectors to produce an orthonormal set.

    (a) (0, 1, 0), (0, −1, 0)
    (b) (1/3, 1/3, 1/3), (−1/2, 0, 1/2)
    (c) (1, 4, 1), (1, 0, −1), (−2/3, 1/3, −2/3)

5. Let U and V be n × n orthogonal matrices. Prove that UV is an orthogonal
   matrix.

    Hint: See Definition 6.34, where U^{-1} = U^T ⇔ U^T U = I.

6.3. Orthogonal Projections

Recall: (Definition 6.26) Given a nonzero vector u in R^n, for y ∈ R^n, let

        y = ŷ + z,   ŷ // u and z ⊥ u.                                 (6.16)

Then
        ŷ = αu = ((y•u)/(u•u)) u,   z = y − ŷ.                         (6.17)

The vector ŷ is called the orthogonal projection of y onto u, and z is called
the component of y orthogonal to u. Let L = Span{u}. Then we denote

        ŷ = ((y•u)/(u•u)) u = proj_L y,                                (6.18)

which is called the orthogonal projection of y onto L.

We now extend this orthogonal projection to a subspace.

Theorem 6.36. (The Orthogonal Decomposition Theorem) Let W be a subspace of
R^n. Then each y ∈ R^n can be written uniquely in the form

        y = ŷ + z,                                                     (6.19)

where ŷ ∈ W and z ∈ W^⊥. In fact, if {u1, u2, · · · , up} is an orthogonal
basis for W, then

        ŷ = proj_W y = ((y•u1)/(u1•u1)) u1 + ((y•u2)/(u2•u2)) u2 + · · ·
                       + ((y•up)/(up•up)) up,                          (6.20)
        z = y − ŷ.

Figure 6.5: Orthogonal projection of y onto W.


Example 6.37. Let u1 = (2, 5, −1), u2 = (−2, 1, 1), and y = (1, 2, 3). Observe
that {u1, u2} is an orthogonal basis for W = Span{u1, u2}.

(a) Write y as the sum of a vector in W and a vector orthogonal to W.
(b) Find the distance from y to W.

Solution. y = ŷ + z ⇒ ŷ = ((y•u1)/(u1•u1)) u1 + ((y•u2)/(u2•u2)) u2 and
z = y − ŷ.

Figure 6.6: A geometric interpretation of the orthogonal projection.

Remark 6.38. (Properties of Orthogonal Decomposition) Let y = ŷ + z, where
ŷ ∈ W and z ∈ W^⊥. Then

1. ŷ is called the orthogonal projection of y onto W (= proj_W y).
2. ŷ is the closest point to y in W
   (in the sense ‖y − ŷ‖ ≤ ‖y − v‖, for all v ∈ W).
3. ŷ is called the best approximation to y by elements of W.
4. If y ∈ W, then proj_W y = y.

Proof. 2. For an arbitrary v ∈ W, y − v = (y − ŷ) + (ŷ − v), where ŷ − v ∈ W.
Thus, by the Pythagorean theorem,

        ‖y − v‖^2 = ‖y − ŷ‖^2 + ‖ŷ − v‖^2,

which implies that ‖y − v‖ ≥ ‖y − ŷ‖.
Example 6.39. Find the closest point to y in the subspace W = Span{u1, u2},
and hence find the distance from y to W:

        y = (3, −1, 1, 13),  u1 = (1, −2, −1, 2),  u2 = (−4, 1, 0, 3).

Solution.

Example 6.40. Find the distance from y to the plane in R^3 spanned by u1 and
u2:

        y = (5, −9, 5),  u1 = (−3, −5, 1),  u2 = (−3, 2, 1).
Solution.

Example 6.41. Let {u1, u2, u3, u4} be an orthogonal basis for R^4:

        u1 = (1, 2, 1, 1),  u2 = (−2, 1, −1, 1),  u3 = (1, 1, −2, −1),
        u4 = (−1, 1, 1, −2),  and  v = (4, 5, −3, 3).

Write v as the sum of two vectors: one in Span{u1, u2} and the other in
Span{u3, u4}.
Solution.

Theorem 6.42. If {u1, u2, · · · , up} is an orthonormal basis for a subspace W
of R^n, then

        proj_W y = (y•u1) u1 + (y•u2) u2 + · · · + (y•up) up.          (6.21)

If U = [u1 u2 · · · up], then

        proj_W y = U U^T y,  for all y ∈ R^n.                          (6.22)

Thus the orthogonal projection can be viewed as a matrix transformation.
" # " √ #
7 1/ 10
Example 6.43. Let y = , u1 = √ , and W = Span{u1 }.
9 −3/ 10

(a) Let U be the 2 × 1 matrix whose only column is u1 . Compute U T U and


UUT.
(b) Compute projw y = (y•u1 ) u1 and U U T y.
Solution.

" # " #
1 1 −3 −2
Ans: (a) U U T = (b)
10 −3 9 6

True-or-False 6.44.
a. If z is orthogonal to u1 and u2 and if W = Span{u1, u2}, then z must be in
   W^⊥.
b. The orthogonal projection ŷ of y onto a subspace W can sometimes depend on
   the orthogonal basis for W used to compute ŷ.
c. If the columns of an n × p matrix U are orthonormal, then U U^T y is the
   orthogonal projection of y onto the column space of U.
d. If an n × p matrix U has orthonormal columns, then U U^T x = x for all x in
   R^n.
Solution.
Ans: T,F,T,F

Exercises 6.3
1. (i) Verify that {u1, u2} is an orthogonal set, (ii) find the orthogonal
   projection of y onto W = Span{u1, u2}, and (iii) write y as a sum of a
   vector in W and a vector orthogonal to W.

    (a) y = (−1, 2, 6), u1 = (3, −1, 2), u2 = (1, −1, −2)
    (b) y = (−1, 4, 3), u1 = (1, 1, 1), u2 = (−1, 3, −2)

    Ans: (b) y = (1/2)(3, 7, 2) + (1/2)(−5, 1, 4)
2. Find the best approximation to z by vectors of the form c1 v1 + c2 v2.

    (a) z = (3, −1, 1, 13), v1 = (1, −2, −1, 2), v2 = (−4, 1, 0, 3)
    (b) z = (3, −7, 2, 3), v1 = (2, −1, −3, 1), v2 = (1, 1, 0, −1)

    Ans: (a) ẑ = 3v1 + v2
3. Let z, v1, and v2 be given as in Exercise 2. Find the distance from z to the
   subspace of R^4 spanned by v1 and v2.

    Ans: (a) 8
4. Let W be a subspace of R^n. A transformation T : R^n → R^n is defined as
   x ↦ T(x) = proj_W x.

    (a) Prove that T is a linear transformation.
    (b) Prove that T(T(x)) = T(x).

    Hint: Use Theorem 6.42.



Project #1: Orthogonal projection

Implement a computer program (Matlab or Octave) and report


• A summary note, which is a small note for your experiences and
thought,
• Your code, which includes the keystrokes or commands you use to
solve the problem, and
• Results (computer outputs, when you run your code).
No results, no credit!!

The Problem
Let {v1, v2, v3, v4} be a basis for a subspace W of R^8. Let

        V = [v1 v2 v3 v4] = [ -6  -3   6   1 ]
                            [ -1   2   1  -6 ]
                            [  3   6   3  -2 ]
                            [  6  -3   6  -1 ]
                            [  2  -1   2   3 ]
                            [ -3   6   3   2 ]
                            [ -2  -1   2  -3 ]
                            [  1   2   1   6 ].

1. Show that the columns of the matrix V are orthogonal by making an
   appropriate matrix calculation.
2. Normalize the columns of V and save the results to U.
3. Let y = (1, 1, 1, 1, 1, 1, 1, 1). Find its closest point ŷ = proj_W y in
   Col U, using the formula in (6.21), page 181.
4. Find its closest point ŷ = proj_W y in Col U, using the formula in (6.22).
5. Verify that y − ŷ is orthogonal to each column of U.
6. Find the distance from y to Col U.

Hint: Save U. Then

(a) U(:,k) will be the k-th column.
(b) The dot product of the first two columns can be computed by using
    dot( U(:,1), U(:,2) ).
(c) The distance can be computed by using norm.
(d) A number in the results that is less than 10^{-15} can be considered zero.
    Such small numbers are due to round-off of numeric values; this is not a
    Matlab issue but a general one.

A small portion is implemented for you.


Project1.m
%% Project#1

V = [-6 -3  6  1
     -1  2  1 -6
      3  6  3 -2
      6 -3  6 -1
      2 -1  2  3
     -3  6  3  2
     -2 -1  2 -3
      1  2  1  6];

%%---------------------
%% Problem 1
%%---------------------
[m,n] = size(V);

for j=1:n
  for k=j:n
    dp = dot(V(:,j),V(:,k));
    fprintf(" dot(V%d,V%d) = %g\n",j,k,dp);
  end
end

%%---------------------
%% Problem 2
%%---------------------
U = V;
for j=1:n
  U(:,j) = U(:,j)/norm(U(:,j));
end
U

When the code is executed with (octave Project1.m), the result becomes:

Result
 dot(V1,V1) = 100
 dot(V1,V2) = 0
 dot(V1,V3) = 0
 dot(V1,V4) = 0
 dot(V2,V2) = 100
 dot(V2,V3) = 0
 dot(V2,V4) = 0
 dot(V3,V3) = 100
 dot(V3,V4) = 0
 dot(V4,V4) = 100

U =

  -0.60000  -0.30000   0.60000   0.10000
  -0.10000   0.20000   0.10000  -0.60000
   0.30000   0.60000   0.30000  -0.20000
   0.60000  -0.30000   0.60000  -0.10000
   0.20000  -0.10000   0.20000   0.30000
  -0.30000   0.60000   0.30000   0.20000
  -0.20000  -0.10000   0.20000  -0.30000
   0.10000   0.20000   0.10000   0.60000

6.4. The Gram-Schmidt Process


Note: The Gram-Schmidt process is an algorithm to produce an or-
thogonal or orthonormal basis for any nonzero subspace of Rn .
Example 6.45. Let W = Span{x1, x2}, where x1 = (3, 6, 0) and x2 = (1, 2, 2).
Find an orthogonal basis for W.

Main idea: Orthogonal projection.

    {x1, x2}  ⇒  {x1, x2 = αx1 + v2}  ⇒  {v1 = x1, v2 = x2 − proj_{x1} x2},

where x1•v2 = 0. Then W = Span{x1, x2} = Span{v1, v2}.

Solution.

Figure 6.7: Construction of an orthogonal basis {v1, v2}.

Example 6.46. Find an orthonormal basis for a subspace whose basis is

        { (3, 0, −1), (8, 5, −6) }.
Solution.

Theorem 6.47. (The Gram-Schmidt Process) Given a basis {x1, x2, · · · , xp} for
a nonzero subspace W of R^n, define

        v1 = x1
        v2 = x2 − ((x2•v1)/(v1•v1)) v1
        v3 = x3 − ((x3•v1)/(v1•v1)) v1 − ((x3•v2)/(v2•v2)) v2          (6.23)
         ...
        vp = xp − ((xp•v1)/(v1•v1)) v1 − ((xp•v2)/(v2•v2)) v2 − · · ·
                − ((xp•v_{p−1})/(v_{p−1}•v_{p−1})) v_{p−1}.

Then {v1, v2, · · · , vp} is an orthogonal basis for W. In addition,

        Span{x1, x2, · · · , xk} = Span{v1, v2, · · · , vk}, for 1 ≤ k ≤ p.  (6.24)

Remark 6.48. For the result of the Gram-Schmidt process, define

        uk = vk / ‖vk‖,  for 1 ≤ k ≤ p.                                (6.25)

Then {u1, u2, · · · , up} is an orthonormal basis for W. In practice, the
algorithm is often implemented in this normalized form (the normalized
Gram-Schmidt process); a code sketch follows below.
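A minimal Octave/Matlab sketch of the normalized process (6.23)+(6.25); it
assumes the columns of the input X are linearly independent, and the file name
gram_schmidt.m is only a suggestion.

gram_schmidt.m
function U = gram_schmidt(X)
  % normalized Gram-Schmidt: the columns of U form an orthonormal
  % basis for Col X, per (6.23) and (6.25)
  [m,p] = size(X);
  U = zeros(m,p);
  for k = 1:p
    v = X(:,k);
    for j = 1:k-1
      v = v - (U(:,j)'*v) * U(:,j);   % modified G-S update (more stable)
    end
    U(:,k) = v / norm(v);             % normalize, as in (6.25)
  end
end

For instance, gram_schmidt([3 1; 6 2; 0 2]) reproduces, after normalization,
the orthogonal basis constructed in Example 6.45.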

Example 6.49. Find an orthonormal basis for W = Span{x1, x2, x3}, where

        x1 = (1, 0, −1, 1),  x2 = (−2, 2, 1, 0),  x3 = (0, 1, −1, 1).
Solution.

QR Factorization of Matrices
Note: If an m × n matrix A has linearly independent columns x1, x2, · · · , xn,
then the Gram-Schmidt process (with normalization) factors A, as described in
the next theorem. This factorization is widely used in computer algorithms for
various computations, such as solving equations (discussed in Section 6.5) and
finding eigenvalues.

Theorem 6.50. (The QR Factorization) If A is an m × n matrix with linearly
independent columns, then A can be factored as A = QR, where Q is an m × n
matrix whose columns form an orthonormal basis for Col A and R is an n × n
upper triangular invertible matrix with positive entries on its diagonal.

Proof. The columns of A form a basis {x1, x2, · · · , xn} for W = Col A.

1. Construct an orthonormal basis {u1, u2, · · · , un} for W (the Gram-Schmidt
   process). Set
        Q := [u1 u2 · · · un].                                         (6.26)

2. Since Span{x1, x2, · · · , xk} = Span{u1, u2, · · · , uk}, 1 ≤ k ≤ n, there
   are constants r1k, r2k, · · · , rkk such that

        xk = r1k u1 + r2k u2 + · · · + rkk uk + 0·u_{k+1} + · · · + 0·un.  (6.27)

   We may assume that rkk > 0. (If rkk < 0, multiply both rkk and uk by −1.)
3. Let rk = [r1k, r2k, · · · , rkk, 0, · · · , 0]^T. Then

        xk = Q rk.                                                     (6.28)

4. Define
        R := [r1 r2 · · · rn].                                         (6.29)

   Then we see A = [x1 x2 · · · xn] = [Qr1 Qr2 · · · Qrn] = QR.

Note: Once Q is obtained as in (6.26), one can get R as follows:

        R = Q^T A.                                                     (6.30)

Let's view the QR factorization from a different angle.

Algorithm 6.51. (QR Factorization) Let A = [x1 x2 · · · xn]. From the
Gram-Schmidt process, obtain an orthonormal basis {u1, u2, · · · , un}. Then

        x1 = (u1•x1) u1
        x2 = (u1•x2) u1 + (u2•x2) u2
        x3 = (u1•x3) u1 + (u2•x3) u2 + (u3•x3) u3                      (6.31)
         ...
        xn = Σ_{j=1}^{n} (uj•xn) uj,

which is a rephrasing (with normalization) of (6.23). Thus

        A = [x1 x2 · · · xn] = QR                                      (6.32)

implies that

        Q = [u1 u2 · · · un],
        R = [ u1•x1  u1•x2  u1•x3  · · ·  u1•xn ]
            [   0    u2•x2  u2•x3  · · ·  u2•xn ]
            [   0      0    u3•x3  · · ·  u3•xn ]  = Q^T A.            (6.33)
            [  ...    ...    ...    ..    ...   ]
            [   0      0      0    · · ·  un•xn ]

In practice, the coefficients rij = ui•xj, i < j, can be saved during the
(normalized) Gram-Schmidt process.
" #
4 −1
Example 6.52. Find the QR factorization for A = .
3 2
Solution.

" # " #
0.8 −0.6 5 0.4
Ans: Q = R=
0.6 0.8 0 2.2
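The factorization can be checked in Octave/Matlab; a sketch follows. It assumes
the gram_schmidt.m helper sketched earlier in this section, and note that the
built-in qr may return a Q whose columns differ from ours by a sign.

qr_sketch.m
% QR factorization of Example 6.52, two ways
A = [4 -1; 3 2];
U  = gram_schmidt(A);      % orthonormal columns via Gram-Schmidt
R1 = U' * A;               % R = Q^T A, per (6.30)
disp(U),  disp(R1)         % expect Q = [0.8 -0.6; 0.6 0.8], R = [5 0.4; 0 2.2]
[Q, R] = qr(A);            % built-in QR, for comparison
disp(norm(A - U*R1))       % expect ~0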

True-or-False 6.53.
a. If {v1 , v2 , v3 } is an orthogonal basis for W , then multiplying v3 by a
scalar c gives a new orthogonal basis {v1 , v2 , cv3 }. Clue: c =?
b. The Gram-Schmidt process produces from a linearly independent set
{x1 , x2 , · · · , xp } an orthogonal set {v1 , v2 , · · · , vp } with the property
that for each k, the vectors v1 , v2 , · · · , vk span the same subspace as
that spanned by x1 , x2 , · · · , xk .
c. If A = QR, where Q has orthonormal columns, then R = Q^T A.
d. If x is not in a subspace W, then x̂ = proj_W x is not zero.
e. In a QR factorization, say A = QR (when A has linearly independent
columns), the columns of Q form an orthonormal basis for the column
space of A.
Solution.
Ans: F,T,T,F,T

Exercises 6.4
1. The given set is a basis for a subspace W. Use the Gram-Schmidt process to
   produce an orthogonal basis for W.

    (a) (3, 0, −1), (8, 5, −6)
    (b) (1, −4, 0, 1), (7, −7, −4, 1)

    Ans: (b) v2 = (5, 1, −4, −1)
2. Find an orthogonal basis for the column space of the matrix

        [ -1   6   6 ]
        [  3  -8   3 ]
        [  1  -2   6 ]
        [  1  -4  -3 ]

    Ans: v3 = (1, 1, −3, 1)
3. [M] Let

        A = [ -10  13   7  -11 ]
            [   2   1  -5    3 ]
            [  -6   3  13   -3 ]
            [  16 -16  -2    5 ]
            [   2   1  -5   -7 ]

    (a) Use the Gram-Schmidt process to produce an orthogonal basis for the
        column space of A.
    (b) Use the method in this section to produce a QR factorization of A.

    Ans: (a) v4 = (0, 5, 0, 0, −5)



6.5. Least-Squares Problems

Note: Let A be an m × n matrix. Then Ax = b may have no solution, particularly
when m > n. In the real world,

• m ≫ n, where m represents the number of data points and n denotes the
  dimension of the points;
• we need to find a best solution for Ax ≈ b.

Figure 6.8: Least-squares approximation for noisy data: m = 250 and n = 2. The
lines in cyan are the final linear models from random sample consensus (RANSAC)
and the iterative partial regression (IPR).

Note: Finding a best approximation/representation is still studied at the
research level.

Definition 6.54. Let A be an m × n matrix and b ∈ R^m. A least-squares (LS)
solution of Ax = b is an x̂ ∈ R^n such that

        ‖b − Ax̂‖ ≤ ‖b − Ax‖,  for all x ∈ R^n.                         (6.34)

Solution of the General Least-Squares Problem

Rewrite: Definition 6.54
Let A be an m × n matrix and b ∈ R^m. A least-squares (LS) solution of Ax = b
is an x̂ ∈ R^n such that

        ‖b − Ax̂‖ ≤ ‖b − Ax‖,  for all x ∈ R^n.                         (6.35)

Remark 6.55.
• For all x ∈ R^n, Ax will necessarily be in Col A, a subspace of R^m.
  – So we seek an x that makes Ax the closest point in Col A to b.
• Let b̂ = proj_{Col A} b. Then Ax = b̂ has a solution and there is an x̂ ∈ R^n
  such that
        A x̂ = b̂.                                                      (6.36)
• x̂ is an LS solution of Ax = b.
• The quantity ‖b − b̂‖ = ‖b − Ax̂‖ is called the least-squares error.

Figure 6.9: The LS solution x̂ is in R^n.

Remark 6.56. If A ∈ R^{n×n} is invertible, then Ax = b has a unique solution x̂
and therefore
        ‖b − Ax̂‖ = 0.                                                  (6.37)

Theorem 6.57. The set of LS solutions of Ax = b coincides with the nonempty
set of solutions of the normal equations

        A^T A x = A^T b.                                               (6.38)

Proof. Suppose x̂ satisfies A x̂ = b̂.
⇒ b − b̂ = b − Ax̂ ⊥ Col A
⇒ aj•(b − Ax̂) = 0 for all columns aj
⇒ aj^T (b − Ax̂) = 0 for all columns aj (note that aj^T is a row of A^T)
⇒ A^T (b − Ax̂) = 0
⇒ A^T A x̂ = A^T b
Example 6.58. Let A = [ -2 1 ; -2 0 ; 2 1 ] and b = (−5, 8, 1).

(a) Find an LS solution of Ax = b.
(b) Also find the orthogonal projection b̂ of b onto Col A, and the
    least-squares error ‖b − Ax̂‖.
Solution.

Remark 6.59. Theorem 6.57 implies that LS solutions of Ax = b are the solutions
of the normal equations A^T A x̂ = A^T b.
• When A^T A is not invertible, the normal equations have either no solution or
  infinitely many solutions.
• So data acquisition is important, to make A^T A invertible.

Theorem 6.60. Let A be an m × n matrix. The following statements are logically
equivalent:

a. The equation Ax = b has a unique LS solution for each b ∈ R^m.
b. The columns of A are linearly independent.
c. The matrix A^T A is invertible.

When these statements are true, the unique LS solution x̂ is given by

        x̂ = (A^T A)^{-1} A^T b.                                        (6.39)

Example 6.61. Describe all least-squares solutions of the equation Ax = b,
given

        A = [ 1 1 0 ]        b = [ 1 ]
            [ 1 1 0 ]            [ 3 ]
            [ 1 0 1 ]            [ 8 ]
            [ 1 0 1 ],           [ 2 ].
Solution.

Alternative Calculations of Least-Squares Solutions

Theorem 6.62. Given an m × n matrix A with linearly independent columns, let
A = QR be a QR factorization of A as in Algorithm 6.51. Then, for each b ∈ R^m,
the equation Ax = b has a unique LS solution, given by

        x̂ = R^{-1} Q^T b.                                              (6.40)

Proof. Let A = QR. Then

        (A^T A)^{-1} A^T = ((QR)^T QR)^{-1} (QR)^T
                         = (R^T Q^T Q R)^{-1} R^T Q^T
                         = R^{-1} (R^T)^{-1} R^T Q^T = R^{-1} Q^T,     (6.41)

which completes the proof.


Example 6.63. Find the LS solution of Ax = b for

        A = [ 1 3 5 ]    b = [  3 ]
            [ 1 1 0 ]        [  5 ]
            [ 1 1 2 ]        [  7 ]
            [ 1 3 3 ],       [ -3 ],

where A = QR = [ 1/2  1/2  1/2 ] [ 2 4 5 ]
               [ 1/2 -1/2 -1/2 ] [ 0 2 3 ]
               [ 1/2 -1/2  1/2 ] [ 0 0 2 ].
               [ 1/2  1/2 -1/2 ]
Solution.

    Ans: Q^T b = (6, −6, 4) and x̂ = (10, −6, 2)

True-or-False 6.64.
a. The general least-squares problem is to find an x that makes Ax as close as
   possible to b.
b. Any solution of A^T A x = A^T b is a least-squares solution of Ax = b.
c. If x̂ is a least-squares solution of Ax = b, then x̂ = (A^T A)^{-1} A^T b.
d. The normal equations always provide a reliable method for computing
   least-squares solutions.
Solution.
Ans: T,T,F,F

Exercises 6.5
1. Find a least-squares solution of Ax = b by (i) constructing the normal
   equations and (ii) solving for x̂. Also (iii) compute the least-squares
   error ‖b − Ax̂‖ associated with the least-squares solution.

    (a) A = [ -1 2 ; 2 -3 ; -1 3 ], b = (4, 1, 2)
    (b) A = [ 1 -2 ; -1 2 ; 0 3 ; 2 5 ], b = (3, 1, -4, 2)

    Ans: (b) x̂ = (4/3, −1/3)
2. Find (i) the orthogonal projection of b onto Col A and (ii) a least-squares
   solution of Ax = b. Also (iii) compute the least-squares error associated
   with the least-squares solution.

    (a) A = [ 4 0 1 ; 1 -5 1 ; 6 1 0 ; 1 -1 -5 ], b = (9, 0, 0, 0)
    (b) A = [ 1 1 0 ; 1 0 -1 ; 0 1 1 ; -1 1 -1 ], b = (2, 5, 6, 6)

    Ans: (b) b̂ = (5, 2, 3, 6) and x̂ = (1/3, 14/3, −5/3)

3. Describe all least-squares solutions of the system, and the associated
   least-squares error:

        x + y  = 1
        x + 2y = 3
        x + 3y = 3

    Ans: x̂ = (1/3, 1)

For the above problems, you may use either a pen and your brain or a computer
program. For example, for the last problem, a code can be written as

exercise-6.5.3.m
A = [1 1; 1 2; 1 3];
b = [1;3;3];

ATA = A'*A;  ATb = A'*b;

xhat = ATA\ATb
error = norm(b-A*xhat)

6.6. Applications to Linear Models


Recall: Section 6.5
• (Definition 6.54) Let A is an m × n matrix and b ∈ Rm . A least-
b ∈ Rn such that
squares (LS) solution of Ax = b is an x

kb − Ab
xk ≤ kb − Axk, for all x ∈ Rn . (6.42)

• (Theorem 6.57) The set of LS solutions of Ax = b coincides with


the nonempty set of solutions of the normal equations

AT Ax = AT b. (6.43)

The normal equations may have a unique solution, no solution, or


infinitely many solutions.
• (Theorem 6.62) Given an m × n matrix A with linearly indepen-
dent columns, let A = QR be a QR factorization of A as in Al-
gorithm 6.51. Then, for each b ∈ Rm , the equation Ax = b has a
unique LS solution, given by

b = R−1 QT b.
x (6.44)

6.6.1. Regression line

Figure 6.10: A regression line.

Definition 6.65. Suppose a set of experimental data points

        (x1, y1), (x2, y2), · · · , (xm, ym)

is given such that the graph is close to a line. We determine a line

        y = β0 + β1 x                                                  (6.45)

that is as close as possible to the given points. This line is called the
least-squares line; it is also called the regression line of y on x, and
β0, β1 are called the regression coefficients.

Calculation of Least-Squares Lines

Remark 6.66. Consider a least-squares (LS) model of the form y = β0 + β1 x,
for a given data set {(xi, yi) | i = 1, 2, · · · , m}. Then

        Predicted y-value      Observed y-value
        β0 + β1 x1        =    y1
        β0 + β1 x2        =    y2                                      (6.46)
         ...                    ...
        β0 + β1 xm        =    ym

It can be equivalently written as

        Xβ = y,                                                        (6.47)

where
        X = [ 1  x1 ]    β = [ β0 ]    y = [ y1 ]
            [ 1  x2 ]        [ β1 ],       [ y2 ]
            [ ..  .. ]                     [ ..  ]
            [ 1  xm ],                     [ ym ].

Here we call X the design matrix, β the parameter vector, and y the
observation vector. Thus the LS solution can be determined from

        X^T X β = X^T y  ⇒  β = (X^T X)^{-1} X^T y,                    (6.48)

provided that X^T X is invertible.

Example 6.67. Find the equation y = β0 + β1 x of the least-squares line that
best fits the given points:

        (−1, 0), (0, 1), (1, 2), (2, 4)
Solution.

Remark 6.68. It follows from (6.47) that

        X^T X = [ 1   1  · · ·  1  ] [ 1  x1 ]   [  m     Σxi   ]
                [ x1  x2 · · ·  xm ] [ 1  x2 ] = [  Σxi   Σxi^2 ],
                                     [ ..  .. ]
                                     [ 1  xm ]                         (6.49)
        X^T y = [ 1   1  · · ·  1  ] [ y1 ]   [  Σyi    ]
                [ x1  x2 · · ·  xm ] [ y2 ] = [  Σxi yi ].
                                     [ .. ]
                                     [ ym ]

Thus the normal equations for the regression line read

        [ m    Σxi   ] β = [ Σyi    ]                                  (6.50)
        [ Σxi  Σxi^2 ]     [ Σxi yi ].
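In Octave/Matlab the regression coefficients come from a single backslash solve;
a sketch with the data of Example 6.67 above (backslash applied to a tall
system returns the least-squares solution directly):

regline_sketch.m
% least-squares line y = b0 + b1*x
x = [-1; 0; 1; 2];  y = [0; 1; 2; 4];
X = [ones(size(x)), x];      % design matrix of (6.47)
beta = X \ y;                % same as (X'*X)\(X'*y)
fprintf('y = %g + %g x\n', beta(1), beta(2));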

Example 6.69. Find the equation y = β0 + β1 x of the least-squares line that
best fits the given points:

        (0, 1), (1, 1), (2, 2), (3, 2)
Solution.

6.6.2. Least-Squares fitting of other curves

Remark 6.70. Consider a regression model of the form

        y = β0 + β1 x + β2 x^2,

for a given data set {(xi, yi) | i = 1, 2, · · · , m}. Then

        Predicted y-value           Observed y-value
        β0 + β1 x1 + β2 x1^2   =    y1
        β0 + β1 x2 + β2 x2^2   =    y2                                 (6.51)
         ...                         ...
        β0 + β1 xm + β2 xm^2   =    ym

It can be equivalently written as

        Xβ = y,                                                        (6.52)

where
        X = [ 1  x1  x1^2 ]    β = [ β0 ]    y = [ y1 ]
            [ 1  x2  x2^2 ]        [ β1 ]        [ y2 ]
            [ ..  ..  ..  ]        [ β2 ],       [ ..  ]
            [ 1  xm  xm^2 ],                     [ ym ].

Now, it can be solved through the normal equations:

        X^T X β = [ Σ1     Σxi    Σxi^2 ]     [ Σyi      ]
                  [ Σxi    Σxi^2  Σxi^3 ] β = [ Σxi yi   ] = X^T y.    (6.53)
                  [ Σxi^2  Σxi^3  Σxi^4 ]     [ Σxi^2 yi ]



Example 6.71. Find an LS curve of the form y = β0 + β1 x + β2 x^2 that best
fits the given points:

        (0, 1), (1, 1), (1, 2), (2, 3).

Solution. The normal equations are given by (6.53).

    Ans: y = 1 + 0.5 x^2

Example 6.72. Find an LS curve of the form y = β0 + β1 x + β2 x^2 that best
fits the given points:

        (−2, 1), (−1, 0), (0, 1), (1, 4), (2, 9)
Solution.

    Ans: y = 1 + 2x + x^2
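The same backslash approach handles the quadratic model; a sketch with the data
of Example 6.72 (the built-in polyfit(x, y, 2) would return the same
coefficients, in descending order):

quadfit_sketch.m
% LS fit of y = b0 + b1*x + b2*x^2, per (6.52)
x = [-2; -1; 0; 1; 2];  y = [1; 0; 1; 4; 9];
X = [ones(size(x)), x, x.^2];   % design matrix
beta = X \ y;                   % expect (1, 2, 1): y = 1 + 2x + x^2
disp(beta')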

Further Applications
Example 6.73. Find an LS curve of the form y = a cos x + b sin x that best
fits the given points:
(0, 1), (π/4, 2), (π, 0).
Solution.


Ans: (a, b) = (1/2, −1/2 + 2 2) = (0.5, 2.32843)

Nonlinear Models: Linearization

Strategy 6.74. For nonlinear models, a change of variables can be applied to
obtain a linear model.

    Model            Change of Variables       Linearization

    y = A + B/x      x̃ = 1/x,  ỹ = y      ⇒   ỹ = A + B x̃
    y = 1/(A + Bx)   x̃ = x,    ỹ = 1/y    ⇒   ỹ = A + B x̃             (6.54)
    y = C e^{Dx}     x̃ = x,    ỹ = ln y   ⇒   ỹ = ln C + D x̃

Example 6.75. Find an LS curve of the form y = C e^{Dx} that best fits the
given points:

        (0, e), (1, e^3), (2, e^5).

Solution.
         x    y      x̃    ỹ = ln y
         0    e      0    1            ⇒  X = [ 1 0 ]    y = [ 1 ]
         1    e^3    1    3                   [ 1 1 ],       [ 3 ]
         2    e^5    2    5                   [ 1 2 ]        [ 5 ]

    Ans: y = e · e^{2x} = e^{2x+1}
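A sketch of the linearization strategy in Octave/Matlab, with the data of
Example 6.75: fit ln y by a line, then exponentiate the intercept.

linearize_sketch.m
% fit y = C*exp(D*x) by linearizing: ln y = ln C + D*x
x = [0; 1; 2];  y = [exp(1); exp(3); exp(5)];
X = [ones(size(x)), x];
beta = X \ log(y);          % beta = (ln C, D); expect (1, 2)
C = exp(beta(1));  D = beta(2);
fprintf('y = %.4f * exp(%.4f x)\n', C, D);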



Exercises 6.6
1. Find an LS curve of the form y = β0 + β1 x that best fits the given points.

(a) (1, 0), (2, 1), (4, 2), (5, 3) (b) (2, 3), (3, 2), (5, 1), (6, 0)
Ans: (a) y = −0.6 + 0.7x
2. A certain experiment produces the data

(1, 1.8), (2, 2.7), (3, 3.4), (4, 3.8), (5, 3.9).

Describe the model that produces an LS fit of these points by a function of the form
y = β1 x + β2 x^2.

(a) Give the design matrix, the observation vector, and the unknown parameter vector.
(b) [M] Find the associated LS curve for the data; also compute the LS error.
Bibliography

[1] D. Lay, S. Lay, and J. McDonald, Linear Algebra and Its Applications,
    5th Ed., Pearson, 2014.

[2] Paul, Paul's Online Notes.
    https://tutorial.math.lamar.edu/Classes/DE/RepeatedEigenvalues.aspx
Index

[M] (computer exercise marker), 28, 94, 139, 184, 193, 210
addition, 118
algebraic multiplicity, 137
attractor, 155
augmented matrix, 3
backward phase, 13
basic variables, 12
basis, 97
basis theorem, 105
best approximation, 179
change of variables, 209
characteristic equation, 136
characteristic polynomial, 136
closed, 95
closest point, 179
codomain, 53
coefficient matrix, 3
cofactor, 109
cofactor expansion, 109
column space, 95, 125
column vector, 20
complex conjugate, 149
complex eigenvalue, 148
consistent system, 3
coordinate vector, 101
coordinates, 101
design matrix, 203
determinant, 77, 107–109
diagonal entries, 72
diagonal matrix, 72
diagonalizable, 140
diagonalization theorem, 141
differential equation, 152
dimension, 103
direction of greatest attraction, 155
distance, 163
domain, 53
dot product, 36, 75, 162
dot product preservation, 175
double eigenvalue, 157
echelon form, 9
eigenspace, 131, 145
eigenvalue, 130, 136
eigenvector, 130
elementary matrix, 91
elementary row operations, 5, 71, 113
equal, 72
equality of vectors, 21
equivalent system, 2
Euler angles, 57
exercise-6.5.3.m, 200
existence and uniqueness, 5
for loop, 33
forward phase, 13, 18
free variables, 12
function, 33
fundamental questions, two, 5
general solution, 14
generalized eigenvector, 157
geometric linear transformations of R2, 63
Gram-Schmidt process, 187
Gram-Schmidt process, normalized, 188
homogeneous linear system, 38
inconsistent system, 3
initial condition, 152
initial-value problem, 152
injective, 64
inner product, 162, 164
invertibility, 77
invertible, 88
invertible linear transformation, 88
invertible matrix, 81
invertible matrix theorem, 86, 105, 137
isomorphic, 102
isomorphism, 102
iteration, 32
iterative partial regression, 194
kernel, 127
leading 1, 9
leading entry, 9
least-squares (LS) problem, 194
least-squares (LS) solution, 194, 195, 201
least-squares error, 195, 196
least-squares line, 202
length, 163
length preservation, 175
linear combination, 23
linear equation, 2
linear space, 117
linear system, 2
linear transformation, 56, 127
linearly dependent, 46
linearly independent, 46, 84
linspace, in Matlab, 31
lower triangular matrix, 90
LU factorization, 91
LU factorization algorithm, 93
Matlab, 29, 94
matrix algebra, 71
matrix equation, 34
matrix form, 3
matrix multiplication, 74
matrix product, 74
matrix transformation, 54, 181
multiplication by scalars, 118
multiplicity, 137
mysum.m, 33
nonhomogeneous linear system, 38
nontrivial solution, 38
nonzero row, 9
norm, 163
normal equations, 196, 201, 205
null space, 96, 123, 127
objects, 29
observation vector, 203
Octave, 29, 94
one-to-one, 64, 68
onto, 64
orthogonal, 164, 165
orthogonal basis, 170, 187, 188
orthogonal complement, 165
orthogonal decomposition theorem, 177
orthogonal matrix, 175
orthogonal projection, 171, 177, 179
orthogonal set, 169
orthogonality preservation, 175
orthonormal basis, 174, 187, 188
orthonormal set, 174
parallelogram law, 168
parallelogram rule for addition, 21
parameter vector, 203
parametric description, 14
parametric vector form, 19, 39, 45
plot, in Matlab, 30
position vector, 20
programming, 29
Project: Orthogonal projection, 184
Pythagorean theorem, 164, 179
QR factorization, 190
QR factorization algorithm, 191
random sample consensus, 194
range, 53, 127
rank, 103
rank theorem, 103
reduced echelon form, 9
reflections in R2, 63
regression coefficients, 202
regression line, 202
repeller, 155
repetition, 29, 32
reusability, 33
reusable, 29
roll-pitch-yaw, 57
rotation, 57
row equivalent, 5
row space, 166
row-column rule, 75
saddle point, 155
scalar multiple, 21
scalar multiplication, 72
shear transformation, 54, 62
similar, 138
similarity, 138
similarity transformation, 138
sink, 155
solution, 2
solution set, 2
source, 155
span, 25
square matrix, 72
standard basis, 97
standard matrix, 61, 88
standard unit vectors in Rn, 61
submatrix, 109
subspace, 95, 119
sum, 121
sum of products, 75
sum of two matrices, 72
superposition principle, 56, 58
surjective, 64
system of linear equations, 1, 2
transformation, 53
transpose, 19, 78
trivial solution, 38
unique inverse, 81
unit lower triangular matrix, 91
unit vector, 163, 174
upper triangular matrix, 88
vector, 20
vector addition, 21
vector equation, 24, 36
vector space, 117, 118
vectors, 117, 118
volume scaling factor, 107, 108
yaw, 57
zero subspace, 119