MAST10007 Notes S2 2023 (Student) 1pp
Semester 2, 2023
Lecture Notes
These notes have been made in accordance with the provisions of Part VB of the copyright act for teaching purposes of the
University. These notes are for the use of students of the University of Melbourne enrolled in MAST10007 Linear Algebra.
1
Topic 1: Linear equations [AR 1.1 and 1.2]
2
1.1 Systems of linear equations. Row operations
Linear equations
Definition (Linear equation and linear system)
A linear equation in n variables, x1 , x2 , . . . , xn , is an equation of the form
a1 x1 + a2 x2 + · · · + an xn = b,
where a1 , a2 , . . . , an and b are constants. For example, the following equations form a linear system in the variables x1 , x2 , x3 :
x1 + 5x2 + 6x3 = 0
x2 − x3 = 0
−x1 + x3 = 0
4
Example 1. Data fitting using a polynomial
Find the coefficients c, α, β, γ such that the cubic polynomial
y = c + αx + βx 2 + γx 3 passes through the points specified below.
x y
−0.1 0.90483
0 1
0.1 1.10517
0.2 1.2214
Solution:
Substituting the points (x, y ) into the cubic polynomial gives:
5
Note:
Solving such equations by hand can be tedious, so we can use a
computer package such as MATLAB.
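As a concrete illustration, here is a minimal sketch in Python with NumPy (an alternative to MATLAB; the variable names are illustrative) that sets up and solves the 4 × 4 linear system of Example 1:

import numpy as np

# Points (x, y) that the cubic y = c + alpha*x + beta*x^2 + gamma*x^3 must pass through
xs = np.array([-0.1, 0.0, 0.1, 0.2])
ys = np.array([0.90483, 1.0, 1.10517, 1.2214])

# Each point gives one linear equation in the unknowns (c, alpha, beta, gamma)
V = np.column_stack([xs**0, xs, xs**2, xs**3])   # coefficient matrix
coeffs = np.linalg.solve(V, ys)                  # solve the 4x4 system
print(coeffs)   # roughly [1, 1, 0.5, 0.18]; the data points lie close to y = e^x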
6
Example 2. (Revision)
Solve the following linear system.
2x − y = 3
x +y =0
Solution:
Graphically
Note:
◮ Need an accurate drawing.
Note:
Elimination of variables will always give a solution, but for three or more variables we need to do this systematically, not in an ad hoc manner.
8
Definition (Matrix)
A matrix is a rectangular array of numbers. An m × n matrix has m rows and n columns.
9
Example 3.
Write down a coefficient matrix and augmented matrix for the following
linear system:
2x − y = 3
x +y =0
Solution:
10
Row Operations
To find the solutions of a linear system, we perform row operations to
simplify the augmented matrix. An essential condition is that the row
operations we perform do not change the set of solutions to the system.
Note:
The matrices produced after each row operation are not equal but are
equivalent, meaning that the solution set is the same for the system
represented by each augmented matrix. We use the symbol ∼ to denote
equivalence of matrices.
11
Example 4.
Solve the following system using elementary row operations:
2x − y = 3
x +y =0
Solution:
12
1.2 Reduction of systems to row-echelon form
Note:
These conditions imply that in each column containing a leading entry,
all entries below the leading entry are zero.
13
Examples
$\begin{pmatrix} 1 & -2 & 3 & 4 & 5 \end{pmatrix}$  is in row-echelon form

$\begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 4 & 1 & 2 \\ 0 & 0 & 0 & 3 \end{pmatrix}$  is in row-echelon form

$\begin{pmatrix} 0 & 0 & 0 & 2 & 4 \\ 0 & 0 & 3 & 1 & 6 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & -3 & 6 & -4 & 9 \end{pmatrix}$  is not in row-echelon form
14
Gaussian elimination
Gaussian elimination is a systematic way to reduce a matrix to
row-echelon form using row operations.
Note: The row-echelon form obtained is not unique.
Gaussian elimination
1. Interchange rows, if necessary, to bring a non-zero number to the
top of the first column with a non-zero entry.
2. (Optional, but often useful.) Divide the first row by its leading
entry to create a new leading entry 1.
3. Add suitable multiples of the top row to lower rows so that all
entries below the leading entry are zero.
4. Start again at Step 1 applied to the matrix without the first row.
To solve a linear system we read off the equations from the row-echelon
matrix and then solve the equations to find the unknowns, starting with
the last equation. This final step is called back substitution.
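As a sketch (an illustration in Python/NumPy, not the course's prescribed implementation), Gaussian elimination with back substitution for the system of Example 5 might look like:

import numpy as np

def gaussian_elimination(aug):
    # Reduce an augmented matrix to row-echelon form (with partial pivoting).
    A = aug.astype(float).copy()
    m, n = A.shape
    row = 0
    for col in range(n - 1):
        if row == m:
            break
        # Step 1: bring a (largest) non-zero entry to the top of the current column
        pivot = row + int(np.argmax(np.abs(A[row:, col])))
        if np.isclose(A[pivot, col], 0.0):
            continue
        A[[row, pivot]] = A[[pivot, row]]
        # Step 3: add multiples of the pivot row to make entries below it zero
        for r in range(row + 1, m):
            A[r] -= (A[r, col] / A[row, col]) * A[row]
        row += 1
    return A

aug = np.array([[1, -3, 2, 11],
                [2, -3, -2, 13],
                [4, -2, 5, 31]])
R = gaussian_elimination(aug)

# Back substitution: solve from the last equation upwards
x = np.zeros(3)
for i in range(2, -1, -1):
    x[i] = (R[i, 3] - R[i, i + 1:3] @ x[i + 1:]) / R[i, i]
print(x)   # [ 6. -1.  1.]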
15
Example 5.
Solve the following system of linear equations using Gaussian
elimination:
x − 3y + 2z = 11
2x − 3y − 2z = 13
4x − 2y + 5z = 31
Solution:
Step 1. Write the system in augmented matrix form.
16
Step 2. Use elementary row operations to reduce the augmented matrix
to row-echelon form.
17
Step 3. Read off the equations from the augmented matrix and use back
substitution to solve for the unknowns, starting with the last equation.
18
1.3 Reduction of systems to reduced row-echelon form
19
Examples
$\begin{pmatrix} 1 & -2 & 3 & -4 & 5 \end{pmatrix}$  is in reduced row-echelon form

$\begin{pmatrix} 1 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$  is in reduced row-echelon form

$\begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 3 \end{pmatrix}$  is not in reduced row-echelon form

$\begin{pmatrix} 1 & 0 & 0 & 2 & 4 \\ 0 & 1 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -4 & 9 \end{pmatrix}$  is not in reduced row-echelon form
20
Gauss-Jordan elimination
Gauss-Jordan elimination is a systematic way to reduce a matrix to
reduced row-echelon form using row operations.
Note: The reduced row-echelon form obtained is unique.
Gauss-Jordan elimination
1. Use Gaussian elimination to reduce matrix to row-echelon form.
2. Multiply each non-zero row by an appropriate number to create a
leading 1 (type 2 row operations).
3. Use row operations (of type 3) to create zeros above the leading
entries.
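A reduced row-echelon computation can be cross-checked with software; a sketch using sympy (whose rref method performs a Gauss-Jordan style reduction):

from sympy import Matrix

# Augmented matrix of the system in Example 6
aug = Matrix([[1, -3, 2, 11],
              [2, -3, -2, 13],
              [4, -2, 5, 31]])
R, pivots = aug.rref()
print(R)        # last column holds the solution (6, -1, 1)
print(pivots)   # (0, 1, 2): each variable column contains a leading 1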
21
Example 6.
Solve the following system of linear equations using Gauss-Jordan
elimination:
x − 3y + 2z = 11
2x − 3y − 2z = 13
4x − 2y + 5z = 31
Solution:
Step 1. Use Gaussian elimination to reduce the augmented matrix to
row-echelon form.
22
Step 2. Divide rows by their leading entry.
Step 3. Add multiples of the last row to the rows above to make the
entries above its leading 1 equal to zero.
23
Step 4. Add multiples of the penultimate (2nd last) row to the rows
above to make the entries above its leading 1 equal to zero.
24
1.4 Consistent and inconsistent systems
Theorem
A system of linear equations has zero solutions, one solution, or
infinitely many solutions. There are no other possibilities.
Note:
Every homogeneous linear system is consistent, since it always has at
least one solution: x1 = 0, . . . , xn = 0.
25
Inconsistent systems
Theorem
The system is inconsistent if and only if there is at least one row in the
row-echelon matrix having all entries equal to zero except for a non-zero
final entry.
Example 7.
Solve the system with row-echelon form:
$\left(\begin{array}{ccc|c} 1 & 0 & 1 & -2 \\ 0 & 2 & 2 & 4 \\ 0 & 0 & 0 & -3 \end{array}\right)$
Solution:
26
Geometrically, an inconsistent system in 3 variables has no common
point of intersection for the planes determined by the system.
27
Consistent systems
Theorem
Suppose we have a consistent linear system with n variables.
◮ If the row-reduced augmented matrix has exactly n non-zero rows,
then the system has a unique solution.
◮ If the row-reduced augmented matrix has < n non-zero rows,
then the system has infinitely many solutions.
◮ If r is the number of non-zero rows in the row-echelon form,
then n − r parameters are needed to specify the solution set.
28
Example 8.
Solve the system with reduced row-echelon form:
$\left(\begin{array}{ccc|c} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & 15 \end{array}\right)$
Solution:
29
Example 9.
Solve the system with reduced row-echelon form:
$\left(\begin{array}{ccccc|c} 1 & 2 & 0 & 0 & 5 & 1 \\ 0 & 0 & 1 & 0 & 6 & 2 \\ 0 & 0 & 0 & 1 & 7 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right)$
Solution:
30
31
Geometrically, a consistent system in 3 variables has a common point,
line or plane of intersection for the planes determined by the system.
32
Example 10. Calculating flows in networks
At each node • we require
sum of flows in = sum of flows out.
Determine a, b, c and d in the following network.
33
Solution:
34
35
36
Note:
◮ x = t(−1, 1, −1, 1), t ∈ R is the general solution of the
homogeneous linear system.
◮ x = (9, 1, 6, 0) is called a particular solution of the linear system.
37
Example 11.
Find the values of a, b ∈ R for which the system
x − 2y + z = 4
2x − 3y + z = 7
3x − 6y + az = b
has
(a) no solution,
(b) a unique solution,
(c) an infinite number of solutions.
Solution:
38
39
Topic 2: Matrices and Determinants [AR 1.3–1.7, 2.1–2.4]
40
2.1 Matrix notation
Definition (Matrix)
A matrix of size m × n is a rectangular array of numbers with m rows
and n columns.
$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{pmatrix}$ or $A = [A_{ij}]$
We denote by Aij the entry in the i -th row and j-th column of A where
i = 1, 2, . . . , m and j = 1, 2, . . . , n.
41
Special matrices
43
2.2 Matrix operations
Recall that we can multiply a matrix by a scalar, add matrices of the
same size, and multiply matrices if the sizes are compatible.
44
Definition (Matrix multiplication)
Let A be an m × n matrix and B be a n × q matrix. The product AB is
a matrix of size m × q.
The entry in position (i , j) of the matrix product is obtained by taking
row i of A, and column j of B, then multiplying together the entries in
order and adding.
$(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$
Note:
◮ The product AB is only defined if the number of columns in A is
equal to the number of rows in B.
◮ We can think of matrix multiplication in terms of dot products: the (i, j) entry of AB is the dot product of row i of A with column j of B.
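A small sketch (illustrative sizes only) confirming that NumPy's matrix product agrees with the entry formula above:

import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # 2 x 3
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])          # 3 x 2
AB = A @ B                      # 2 x 2

# Entry (0, 1) equals the dot product of row 0 of A with column 1 of B
print(AB[0, 1], A[0, :] @ B[:, 1])   # both print 8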
45
Properties of matrix multiplication
Whenever the matrix products and sums are defined:
1. A(B + C ) = AB + AC (left distributivity)
2. (A + B)C = AC + BC (right distributivity)
3. A(BC ) = (AB)C (associativity)
4. A(αB) = α(AB)
5. AIn = Im A = A (where A has size m × n)
6. A0 = 0 and 0 A = 0
For a square matrix A and an integer n ≥ 1, the n-th power of A is An = AA · · · A (n factors).
Solution:
48
Example 2. Walks in a graph
A walk in a graph is a sequence of edges linking one vertex to another.
The length of a walk is equal to the number of edges it traverses.
Theorem
If A is an n × n adjacency matrix of a graph and $A^k_{ij}$ denotes the (i, j) entry of $A^k$, then $A^k_{ij}$ is equal to the number of walks of length k from Vi to Vj , for each k ≥ 1.
In particular, the entry Aij of A gives the number of walks of length 1 from Vi to Vj .
Problem:
Determine the number of walks of length 3 between V1 and V4 for the
graph in example 1.
49
Solution:
To find walks of length 3, we compute A3 .
$A^3 = \begin{pmatrix} 2 & 6 & 3 & 3 & 5 \\ 6 & 6 & 6 & 7 & 7 \\ 3 & 6 & 2 & 5 & 3 \\ 3 & 7 & 5 & 4 & 7 \\ 5 & 7 & 3 & 7 & 4 \end{pmatrix}$
The (1, 4) entry is 3, so there are 3 walks of length 3 from V1 to V4 .
50
Definition (Transpose of a matrix)
Let A be an m × n matrix. The transpose of A, denoted by AT , is the
n × m matrix whose entries are given by interchanging the rows and
columns of A:
(AT )ij = Aji
Example 3.
Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$. Find $A^T$.
Solution:
51
Properties of the transpose
Let α be a scalar. The following properties hold whenever the matrix
products and sums exist:
1. $(A^T)^T = A$
2. $(A + B)^T = A^T + B^T$
3. $(\alpha A)^T = \alpha A^T$
4. $(AB)^T = B^T A^T$
Proof: Property 4.
(1) Check that (AB)T and B T AT are the same size.
52
(2) Show that the corresponding entries are equal for all i and j.
53
2.3 Matrix inverses
Definition (Matrix inverse)
An n × n matrix A is called invertible if there exists a matrix B such that
AB = BA = In .
Proof: Property 2.
55
56
Theorem (Inverse of a 2 × 2 matrix)
Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then
1. A is invertible ⇐⇒ ad − bc ≠ 0.
2. If ad − bc ≠ 0, then $A^{-1} = \dfrac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$.
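A direct sketch of this formula in plain Python (the helper name is hypothetical):

def inverse_2x2(a, b, c, d):
    # Inverse of [[a, b], [c, d]] via the 2x2 formula
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Example 4: A = [[2, -1], [1, 1]] has ad - bc = 3
print(inverse_2x2(2, -1, 1, 1))   # [[1/3, 1/3], [-1/3, 2/3]] as decimals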
Example 4.
Find the inverse of $A = \begin{pmatrix} 2 & -1 \\ 1 & 1 \end{pmatrix}$.
Solution:
57
Finding the inverse of a square matrix via Gauss-Jordan elimination
Calculating the inverse of a matrix
Algorithm. Input: an n × n matrix A. Output: A−1 , or “A is not invertible”.
1. Construct the augmented matrix [A | In ].
2. Row reduce until the left block is in reduced row-echelon form:
[A | In ] ∼ [R | B].
3. If R = In , then A−1 = B; otherwise A is not invertible.
58
Example 5.
Find the inverse of $A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -1 & 1 \\ 0 & 1 & 3 \end{pmatrix}$.
Solution:
Form the augmented matrix [A | I3 ] and perform row operations so that
A is in reduced row-echelon form.
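A sketch of the same procedure with sympy doing the row reduction:

from sympy import Matrix, eye

A = Matrix([[1, 2, 1],
            [-1, -1, 1],
            [0, 1, 3]])
aug = A.row_join(eye(3))   # form [A | I3]
R, _ = aug.rref()
A_inv = R[:, 3:]           # right-hand block is the inverse
print(A_inv)               # Matrix([[-4, -5, 3], [3, 3, -2], [-1, -1, 1]])
assert A * A_inv == eye(3)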
59
60
2.4 Elementary matrices
The effect of an elementary row operation can be achieved by
multiplication on the left by an elementary matrix.
Theorem
Let Ep be the elementary matrix obtained by applying a row operation p
to In .
If A is a matrix such that Ep A is defined, then Ep A is equal to the result
of performing p on A.
Solution:
62
Example 7.
Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Assume that
$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 \\ 0 & -2 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
Find elementary matrices E1 , E2 , E3 such that E3 E2 E1 A = I2 .
63
64
In general, if A ∼ In by a sequence of elementary row operations, then
there is a sequence of elementary matrices E1 , E2 , . . . , Ek such that
Ek Ek−1 · · · E2 E1 A = In .
Theorem
1. Let A be an n × n matrix. Then
A is invertible ⇐⇒ A ∼ In .
Theorem
If A is an invertible n × n matrix, then every linear system of the form
Ax = b has a unique solution, given by x = A−1 b.
Proof:
Since A is an n × n invertible matrix, then AA−1 = A−1 A = In . So
Ax = b
⇒A−1 Ax = A−1 b
⇒In x = A−1 b
⇒x = A−1 b
66
Example 8.
Use a matrix inverse to solve the linear system
x + 2y + z = −4
−x − y + z = 11
y + 3z = 21
Solution:
Step 1. Write the system in the form Ax = b
67
Step 3.
The solution is x = A−1 b.
Note:
Once we know A−1 we can solve the linear system Ax = b for any
choice of b.
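A sketch of this workflow in NumPy for Example 8's system (np.linalg.solve would be the usual one-off choice, but the explicit inverse makes the reuse point concrete):

import numpy as np

A = np.array([[1, 2, 1],
              [-1, -1, 1],
              [0, 1, 3]])
A_inv = np.linalg.inv(A)

b = np.array([-4, 11, 21])
print(A_inv @ b)                      # [ 24. -21.  14.]
# Any other right-hand side now costs only one matrix-vector product:
print(A_inv @ np.array([1, 0, 0]))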
68
Rank of a matrix
Note:
◮ This is the same as the number of non-zero rows in any
row-echelon form of A.
◮ If A has size m × n, then rank(A) ≤ m and rank(A) ≤ n.
69
Example 9.
Find the rank of
$A = \begin{pmatrix} 1 & -1 & 2 & 1 \\ 0 & 1 & 1 & -2 \\ 1 & -3 & 0 & 5 \end{pmatrix}$
Solution:
70
Rank and solutions of linear systems
We can rewrite the results of section 1.4 for the solutions of a linear
system in terms of rank.
Theorem
The linear system Ax = b, where A is an m × n matrix, has
1. no solution if rank(A) < rank([A | b]),
2. a unique solution if rank(A) = rank([A | b]) = n,
3. infinitely many solutions if rank(A) = rank([A | b]) < n.
Note:
rank(A) ≤ rank([A | b]) always holds.
71
Example 10.
Determine the number of solutions of the system with row-echelon form:
$\begin{pmatrix} 1 & -1 & 2 & 1 \\ 0 & 1 & 1 & -2 \\ 1 & -3 & 0 & 5 \end{pmatrix} \sim \begin{pmatrix} 1 & -1 & 2 & 1 \\ 0 & 1 & 1 & -2 \\ 0 & 0 & 0 & 0 \end{pmatrix}$
Solution:
72
Theorem
If A is an n × n matrix, the following conditions are equivalent:
1. A is invertible.
2. Ax = b has a unique solution for any b.
3. The rank of A is n.
4. The reduced row-echelon form of A is In .
73
2.6 Determinants
Recall that the inverse of $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ exists if and only if ad − bc ≠ 0.
74
Examples of area and determinants
[Figure: regions with vertices at (1, 0) and (3, 0); det(A) = 2, area equals 2, and det(B) = 6, area equals 6]
75
Adding a multiple of one row to another row
[Figure: parallelograms with vertices including (0, 2), (2, 2) and (3, 0); det(A) = 6, area equals 6, and det(B) = 6, area equals 6: the area is unchanged]
76
Multiplying a row by −1
[Figure: det(A) = 6 with anti-clockwise orientation, area equals 6; det(B) = −6 with clockwise orientation, area equals 6]
77
Interchanging two rows
78
Defining determinants
Definition (Determinant)
The determinant is a function that maps each n × n matrix A to a
number, denoted det(A) or |A|, that satisfies the following conditions:
D1. Adding one row to another row does not change the determinant.
D2. Multiplying any row by a scalar α multiplies the determinant by α.
D3. Swapping two rows multiplies the determinant by −1.
D4. The determinant of the identity matrix In equals 1.
79
Theorem
These four conditions specify the determinant uniquely.
Note:
◮ Definitions D1 and D2 imply that adding any multiple of one row
to another row does not change the determinant.
◮ All invertible n × n matrices have a non-zero determinant.
80
Example 11.
Compute the determinant of
$A = \begin{pmatrix} 2 & 6 & 9 \\ 0 & 3 & 8 \\ 0 & 0 & -1 \end{pmatrix}$
81
82
A similar argument proves the following:
Theorem
If A is an upper triangular or lower triangular matrix, then det(A) is the
product of the elements on the main diagonal.
Calculating determinants
For a general matrix, we use Gaussian elimination to transform the
matrix into triangular form, keeping track of what each step does to the
determinant. Then we can read off the determinant using the theorem.
83
Example 12.
Compute the determinant of
$A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & 1 & 1 \\ 0 & 1 & 3 \end{pmatrix}$
using row operations.
Solution:
84
Properties of the determinant
Theorem
Let A, B be n × n matrices and let α be a scalar. Then
1. If A has a row (or column) of zeros, then det(A) = 0.
2. If A has a row (or column) which is a scalar multiple of another row (or column), then det(A) = 0.
3. det(αA) = α^n det(A)
4. det(AB) = det(A) det(B)
5. det(A^T ) = det(A)
Solution:
86
Cofactor expansion
Another way to calculate determinants is to use cofactor expansion.
Definition (Cofactor)
Let A be a square matrix. The (i, j)-cofactor of A, denoted by Cij , is the number given by
$C_{ij} = (-1)^{i+j} \det A(i, j),$
where A(i, j) is the matrix obtained from A by deleting the i-th row and j-th column.
87
Example 14.
If $A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & -1 & 1 \\ 0 & 1 & 3 \end{pmatrix}$, calculate $C_{23}$.
Solution:
88
Theorem (Cofactor expansion)
The determinant of an n × n matrix A can be computed by choosing
any row (or column) of A and multiplying the entries in that row (or
column) by their cofactors and then adding the resulting products.
That is, for each 1 ≤ i ≤ n,
$\det(A) = A_{i1} C_{i1} + A_{i2} C_{i2} + \cdots + A_{in} C_{in} = \sum_{k=1}^{n} A_{ik} C_{ik}$
89
How do you remember the sign of the cofactor?
The (1, 1) cofactor always has sign +. Starting from there, imagine
walking to the square you want using either horizontal or vertical steps.
The appropriate sign will change at each step.
Note:
The cofactor method is particularly useful for matrices with many zeros.
90
Example 15.
Compute the determinant of
$A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & 1 & 1 \\ 0 & 1 & 3 \end{pmatrix}$
91
Example 16.
Calculate $|A| = \begin{vmatrix} 1 & -2 & 0 & 1 \\ 3 & 2 & 2 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & -4 & 2 & 4 \end{vmatrix}$ using cofactors.
Solution:
92
93
Topic 3: Euclidean Vector Spaces [AR 3.1-3.5]
3.1 Vectors in Rn
3.2 Dot product
3.3 Cross product of vectors in R3
3.4 Lines and planes
94
3.1 Vectors in Rn
Note:
In handwritten work, we denote vectors using notation such as $\vec{v}$, $\underline{v}$ or $\underset{\sim}{v}$.
95
In R3 , the unit vectors in the x, y , z directions are often denoted by
i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1)
so any vector in R3 can be written in the form
v = (v1 , v2 , v3 ) = v1 i + v2 j + v3 k.
Note:
We say a vector u is parallel to a vector v if u is a scalar multiple of v.
Then every non-zero vector u is parallel to a unit vector û, where
$\hat{\mathbf{u}} = \frac{\mathbf{u}}{\|\mathbf{u}\|}.$
96
Distance between two points
Let u, v ∈ Rn . The distance between u and v is
d(u, v) = ||v − u||.
Example 1.
Find the distance between the points P(1, 3, −1, 2) and Q(2, 1, −1, 3).
Solution:
97
3.2 Dot product
Note:
If we write u and v as column matrices $U = (u_1 , \ldots , u_n)^T$ and $V = (v_1 , \ldots , v_n)^T$, then
u · v = U^T V .
98
Properties of the dot product
Let u, v, w ∈ Rn and let α ∈ R be a scalar. Then
1. u · v is a scalar
2. u · v = v · u
3. (u + v) · w = u · w + v · w
4. u · (v + w) = u · v + u · w
5. (αu) · v = u · (αv) = α(u · v)
6. u · u = ‖u‖²
99
Geometry of the dot product
We can also interpret the dot product geometrically by
u · v = ‖u‖ ‖v‖ cos θ
where θ is the angle between u and v.
[Figure: vectors u and v with angle θ between them]
Note:
◮ Two non-zero vectors u and v are perpendicular or orthogonal when u · v = 0, so that θ = π/2.
◮ Since |cos θ| ≤ 1, we have |u · v| ≤ ‖u‖ ‖v‖ (the Cauchy-Schwarz inequality).
100
Example 2.
Verify that the Cauchy-Schwarz inequality holds for the vectors
u = (0, 2, 2, −1) and v = (−1, 1, 1, −1).
Solution:
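A computational sketch of this verification (the by-hand version checks the same numbers):

import numpy as np

u = np.array([0, 2, 2, -1])
v = np.array([-1, 1, 1, -1])

lhs = abs(u @ v)                               # |u . v| = 5
rhs = np.linalg.norm(u) * np.linalg.norm(v)    # ||u|| ||v|| = 3 * 2 = 6
print(lhs, rhs, lhs <= rhs)                    # 5 6.0 True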
101
Example 3. Correlation
Correlation is a measure of how closely two variables are dependent.
Suppose we want to compare how closely 7 students’ assignment marks
correlate with their exam marks.
Subtracting the mean of each column gives the centred data matrix
$X - \bar{X} = \begin{pmatrix} 18.5 & 18.5 \\ -0.5 & 1 \\ -1.5 & -2.5 \\ -5 & 1 \\ 7 & 9.5 \\ -13.5 & -14 \\ -4.5 & -13.5 \end{pmatrix}$
To compare the exam and assignment scores, we compute the cosine of the angle between the two columns x1 , x2 of this matrix:
$\cos\theta = \frac{\mathbf{x}_1 \cdot \mathbf{x}_2}{\|\mathbf{x}_1\|\,\|\mathbf{x}_2\|} = \frac{656.75}{(24.92)(28.62)} \approx 0.92$
A cosine value near 1 indicates that the scores are highly correlated.
103
Vector projection
[Figure: v decomposed as v = v1 + v2 , where v1 = proj_u v lies along u and v2 = v − proj_u v is orthogonal to u]
The (vector) projection of v onto u is
$\mathrm{proj}_{\mathbf{u}}\,\mathbf{v} = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\|^2}\,\mathbf{u}.$
105
3.3 Cross product of vectors in R3
$\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = \begin{vmatrix} u_2 & u_3 \\ v_2 & v_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} u_1 & u_3 \\ v_1 & v_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}\mathbf{k}$
Note:
The cross product is defined only for R3 .
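A sketch computing a cross product with NumPy and checking perpendicularity (illustrative vectors):

import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

w = np.cross(u, v)
print(w)               # [-3  6 -3]
print(w @ u, w @ v)    # 0 0: w is perpendicular to both u and v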
106
Properties of the cross product
Let u, v, w ∈ R3 and α ∈ R be a scalar. Then
1. v × u = −(u × v)
2. (u + v) × w = u × w + v × w
3. u × (v + w) = u × v + u × w
4. (αu) × v = u × (αv) = α(u × v)
5. u × u = 0
Example
i × j = k, j × k = i, k × i = j.
107
Geometry of the cross product
We can also interpret the cross product geometrically by
u × v = ‖u‖ ‖v‖ sin(θ) n̂
where
◮ 0 ≤ θ ≤ π is the angle between u and v
◮ n̂ is a unit vector
◮ n̂ is perpendicular to both u and v
◮ n̂ points in the direction given by the right-hand rule. The
direction of u × v is determined by curling your fingers on your
right hand following the angle from u to v and noting the direction
your thumb is pointing.
108
Example 5.
Find a vector perpendicular to both (1, 1, 1) and (1, −1, −2).
Solution:
109
Application: Area of a parallelogram
Suppose that u, v ∈ R3 . The area of the parallelogram defined by u and v is equal to ‖u × v‖.
[Figure: parallelogram with base u (base length ‖u‖) and adjacent side v; the height is ‖v‖ sin θ]
110
Example 6.
Find the area of the triangle with vertices (2, −5, 4), (3, −4, 5) and
(3, −6, 2).
Solution:
111
Definition (Scalar triple product)
Let u = (u1 , u2 , u3 ), v = (v1 , v2 , v3 ) and w = (w1 , w2 , w3 ) be vectors in
R3 . The scalar triple product is the real number u · (v × w) given by
$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.$
1. v · (v × w) = w · (v × w) = 0
2. u · (v × w) = v · (w × u) = w · (u × v)
3. u · (v × w) = −u · (w × v) = −v · (u × w) = −w · (v × u)
Note:
Property 1 says that v × w is orthogonal to both v and w.
112
Application: Volume of a parallelepiped
Suppose that u, v, w ∈ R3 . Then, the parallelepiped defined by the
vectors u, v and w has volume equal to the absolute value of the scalar
triple product of u, v and w:
Volume of parallelepiped = |u · (v × w)|
[Figure: parallelepiped with adjacent edges u, v, w]
Note:
This is the absolute value of the determinant of the matrix with rows
given by u, v, w.
113
Example 7.
Find the volume of the parallelepiped with adjacent edges $\vec{PQ}$, $\vec{PR}$, $\vec{PS}$, where the points are P(2, 0, −1), Q(4, 1, 0), R(3, −1, 1) and S(2, −2, 2).
Solution:
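A computational sketch of the calculation (the hand-worked version forms the same determinant of edge vectors, as in the note above):

import numpy as np

P = np.array([2, 0, -1])
Q = np.array([4, 1, 0])
R = np.array([3, -1, 1])
S = np.array([2, -2, 2])

M = np.array([Q - P, R - P, S - P])   # rows are PQ, PR, PS
print(abs(np.linalg.det(M)))          # 3.0, up to rounding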
114
3.4 Lines and planes
Equations of a straight line
A vector equation of a line through a point P0 in the direction of a
vector v is given by
r = r0 + tv, t ∈ R
where $\mathbf{r}_0 = \vec{OP_0}$.
[Figure: the line through P0 with direction v; a general point has position vector r]
Each point on the line is given by a unique value of t.
115
For example, in R3 , if r = (x, y , z), r0 = (x0 , y0 , z0 ) and v = (a, b, c),
the vector equation of the line is
x = x0 + ta
y = y0 + tb, t∈R
z = z0 + tc
117
118
Example 9.
Find a vector equation of the line through the point (2, 0, 1) that is
parallel to the line
$\frac{x - 1}{1} = \frac{y + 2}{-2} = \frac{z - 6}{2}.$
Does the point (0, 4, −3) lie on the line?
Solution:
119
Definition
Two lines are said to
◮ intersect if there is a point lying on both lines.
◮ be parallel if their direction vectors are parallel.
◮ be skew if they do not intersect and are not parallel.
The angle between two lines is given by the angle θ between their
direction vectors. (We take the acute angle, which is the smaller of the
two possible angles θ and π − θ.)
120
Equations of planes
A vector equation of a plane through a point P0 that contains two
non-parallel vectors in the directions u and v is
r = r0 + su + tv, s, t ∈ R,
where $\mathbf{r}_0 = \vec{OP_0}$.
[Figure: the plane through P0 containing directions u and v; a general point has position vector r = r0 + su + tv]
A plane in R3 can also be described using a point P0 with position vector r0 and a normal vector n = (a, b, c): a point with position vector r lies on the plane exactly when r − r0 is perpendicular to n, i.e. (r − r0 ) · n = 0. Hence
r · n = r0 · n
⇒ (x, y , z) · (a, b, c) = (x0 , y0 , z0 ) · (a, b, c)
⇒ ax + by + cz = ax0 + by0 + cz0
⇒ ax + by + cz = d
Note:
◮ The equations for a plane are not unique.
◮ The angle between two planes in R3 is given by the angle θ
between their normal vectors. (We take the acute angle, which is
the smaller of the two possible angles θ and π − θ.)
123
Example 10.
Find a Cartesian equation of the plane with vector form
Solution:
124
125
Example 11.
Find a vector equation for the plane containing the points P(1, 0, 2),
Q(1, 2, 3) and R(4, 5, 6).
Solution:
126
Intersection of a line and a plane
Example 12.
Where does the line
$\frac{x - 1}{1} = \frac{y - 2}{2} = \frac{z - 3}{3}$
intersect the plane 3x + 2y + z = 20?
Solution:
127
Intersection of two planes
Example 13.
Find a vector form for the line of intersection of the two planes
x + 3y + 2z = 6 and 3x + 2y − z = 11.
Solution:
128
129
Topic 4: General Vector Spaces [AR 4.1 - 4.8]
130
4.1 General vector spaces
Vector spaces
We want to extend the properties of vectors in Rn to more general sets
of objects.
131
Fields
The scalars are members of a number system F called a field where
addition, subtraction, multiplication, and division by a non-zero scalar are defined.
132
Definition (Vector space)
Fix a field of scalars F. Then a vector space over F is a non-empty set
V with two operations: vector addition and scalar multiplication. These
operations are required to satisfy the following 10 rules (axioms).
For all u, v, w ∈ V :
Vector addition behaves well:
A1 u + v ∈ V . (closure of vector addition)
A2 (u + v) + w = u + (v + w). (associativity)
A3 u + v = v + u. (commutativity)
There must be a zero and inverses:
A4 There exists a vector 0 ∈ V such that
v + 0 = v for all v ∈ V . (existence of zero vector)
A5 For all v ∈ V , there exists a vector −v
such that v + (−v) = 0. (additive inverse)
133
Definition (Vector space continued)
For all u, v ∈ V and α, β ∈ F:
Scalar multiplication behaves well:
M1 αv ∈ V . (closure of scalar multiplication)
M2 α(u + v) = αu + αv. (distributivity)
M3 (α + β)v = αv + βv. (distributivity)
M4 α(βv) = (αβ)v. (associativity)
M5 1v = v. (scalar identity)
Note:
It follows from the axioms that for all v ∈ V :
1. 0v = 0.
2. (−1)v = −v.
134
Examples of vector spaces
Definition
A vector space that has R as the scalars is called a real vector space.
1. Rn
Our definition of vector spaces was based on algebraic properties of Rn ,
so it is easy to check that Rn is a vector space with scalars R.
135
2. Vector space of matrices
Definition
Denote by Mm,n ( or Mm,n (R)) the set of all m × n matrices with real
entries.
Mm,n is a real vector space with the usual matrix addition and scalar
multiplication operations.
Example 1.
Prove axioms A1, A4, A5 and M1 for M2,2 .
Solution:
Define two vectors in V and a scalar in R
136
A1. Closure under vector addition
137
A5. Additive inverse
138
3. Vector space of polynomials
Definition
For a fixed integer n ≥ 0, denote by Pn (or Pn (R)) the set of all polynomials of degree at most n with real coefficients:
Pn = {a0 + a1 x + a2 x 2 + · · · + an x n | a0 , a1 , . . . , an ∈ R}
Note:
Two polynomials are equal if and only if their coefficients are equal.
139
4. Vector space of functions
Definition
Let
F(S, R) = { all functions f : S → R }
denote the set of all functions from S to R, where S is a non-empty set.
Note:
The zero vector is the function which is identically zero, i.e.
0 : S → R such that 0(x) = 0 for all x ∈ S.
140
Example 2.
If f , g ∈ F(R, R) are given by
f (x) = sin(x) and g (x) = x for all x ∈ R,
then
(f + g )(x) = sin(x) + x and (3f )(x) = 3 sin(x) for all x ∈ R.
[Graphs of (f + g)(x) = sin(x) + x and (3f )(x) = 3 sin(x)]
Note:
Function spaces are important in analysis of sound and light waves,
quantum mechanics and the study of differential equations.
141
5. Vector space $\mathbb{F}_2^n$
The field F2
Definition
The set containing just two elements,
F2 = {0, 1},
is a field with addition and multiplication defined modulo 2:
Addition: 0 + 0 = 0, 1 + 0 = 1, 0 + 1 = 1, 1 + 1 = 0.
Multiplication: 0 × 0 = 0, 1 × 0 = 0, 0 × 1 = 0, 1 × 1 = 1.
Definition
Denote by $\mathbb{F}_2^n$ the set of all n-tuples of elements of F2 , with F2 as the scalars:
$\mathbb{F}_2^n$ = {(a1 , a2 , . . . , an ) | ai ∈ F2 , i = 1, . . . , n}
Definition
A vector space that has C as the scalars is called a complex vector
space.
Example 3.
C2 = {(a1 , a2 ) | a1 , a2 ∈ C}
Note:
The real vector spaces Rn , Mm,n (R), Pn (R) and F(S, R) have complex
analogues denoted by Cn , Mm,n (C), Pn (C) and F(S, C).
144
Example 4.
Consider V = R2 . Let u = (a1 , a2 ) ∈ R2 , v = (b1 , b2 ) ∈ R2 and
α ∈ R. Define addition and scalar multiplication as follows:
145
Applications of general vector spaces
Example 5.
Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ and $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$ so that A, B ∈ M2,2 .
146
Example 6.
The vector space $\mathbb{F}_2^n$ over F2 is used in coding theory applications to
reduce errors occurring in data storage and message transmission.
One way of coding data is to use a Hamming code so that errors can
not only be detected but also corrected efficiently.
1. Data is stored in the form of a string of 1s and 0s. These 1s and 0s
are called binary numbers.
2. Using modular 2 arithmetic enables checking and correction of
digital information.
147
4.2 Subspaces
Definition (Subspace)
A subspace W of a vector space V is a subset W ⊆ V that is itself a
vector space using the addition and scalar multiplication operations
defined on V .
Solution:
Define two vectors in S and a scalar in R
149
1. Closure under vector addition
150
Some geometric examples
1. The only subspaces of R2 are
• {0}
• Lines through the origin
• R2
Example 8.
Is the line {(x, y ) ∈ R2 | y = 2x + 1} a subspace of R2 ?
Solution:
151
Example 9.
Let S ⊂ P2 be the set of polynomials defined by
S = {a1 x + a2 x 2 | a1 , a2 ∈ R}
Is S a subspace of P2 ?
Solution:
Define two vectors in S and a scalar in R
152
1. Closure under vector addition
153
Definition (Trace)
If A is an n × n matrix, then the trace of A is the sum of the diagonal
entries, namely:
Tr(A) = A11 + A22 + ... + Ann
Example 10.
Show that W = {A ∈ M2,2 | Tr(A) = 0} is a subspace of M2,2 .
Solution:
Define two vectors in W and a scalar in R
154
0. W contains the zero vector
155
2. Closure under scalar multiplication
156
Example 11.
Let W be the set of real 2 × 2 matrices with determinant equal to 0:
$W = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{R} \text{ and } ad - bc = 0 \right\}$
Show that W is not a subspace of M2,2 .
Solution:
Find a specific counterexample
157
4.3 Linear combinations and spans
Linear combinations
Let V be a vector space with scalars F. A linear combination of vectors v1 , . . . , vk ∈ V is a vector of the form
α1 v1 + α2 v2 + · · · + αk vk ,
where each αi ∈ F.
Example
(5, 1) is a linear combination of (1, −1) and (1, 2) since
(5, 1) = 3(1, −1) + 2(1, 2).
Examples
Note:
These primary colours are different from the standard primary colours
red, blue, yellow.
159
Example 13.
Determine whether (1, 2, 3) ∈ R3 is a linear combination of (1, −1, 2)
and (−1, 1, 2).
Solution:
160
161
Example 14.
In P2 , determine whether q(x) = 1 − 2x − x 2 is a linear combination of
p1 (x) = 1 + x + x 2 and p2 (x) = 3 + x 2 .
Solution:
162
163
Spans
We can use linear combinations to produce vector equations for lines
and planes.
Theorem
Let V be a vector space and v1 , . . . , vk ∈ V . Then span{v1 , . . . , vk } is
a subspace of V .
Proof:
Check the conditions of the subspace theorem.
164
What does the span of a set of vectors look like?
Consider 3 vectors v1 , v2 , v3 in R3 . Then span{v1 , v2 , v3 } could be:
1. {0}
2. A line through the origin
3. A plane through the origin
4. R3
Note:
A span of 4 or more vectors in R3 will also give one of the above sets
because these are the only possible subspaces of R3 .
Example
The span of the vectors u and v is a plane through the origin.
[Figure: the plane through the origin spanned by u and v]
165
Example 15.
Let v1 = (1, 1, 1), v2 = (2, 2, 2), v3 = (3, 3, 3). In R3 , what is
span{v1 , v2 , v3 }?
Solution:
166
We can also use the word “span” as a verb: we say that the vectors v1 , . . . , vk span V if
span{v1 , . . . , vk } = V .
Note:
The vectors v1 , . . . , vk span V if and only if
(1) V contains all the vectors v1 , . . . , vk , and
(2) all vectors in V are linear combinations of the vectors v1 , . . . , vk .
167
Example 16.
Show that the vectors (1, −1), (2, 4) span R2 .
Solution:
168
169
Example 17.
Do the vectors (1, 2, 0), (1, 5, 3), (0, 1, 1) span R3 ?
Solution:
170
171
General method to determine if vectors span Rn
If A = [v1 · · · vk ] where v1 , · · · , vk ∈ Rn are column vectors, then:
1. v1 , · · · , vk span Rn ⇔ the linear system with augmented matrix
[A | w] has a solution for all w ∈ Rn .
2. v1 , . . . , vk span Rn ⇔ rank(A) = n.
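A sketch of criterion 2 applied to the vectors of Example 17 (sympy's rank on the matrix of columns):

from sympy import Matrix

# Columns are (1, 2, 0), (1, 5, 3), (0, 1, 1)
A = Matrix([[1, 1, 0],
            [2, 5, 1],
            [0, 3, 1]])
print(A.rank())   # 2 < 3, so these vectors do not span R^3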
Theorem
If k < n, then vectors v1 , . . . , vk ∈ Rn do not span Rn .
Proof:
Since A = [v1 · · · vk ] has k columns, rank(A) ≤ k.
Since k < n this implies that rank(A) ≠ n.
Hence v1 , . . . , vk do not span Rn .
172
Example 18.
Do the polynomials p1 (x) = 1 + x + x 2 and p2 (x) = x 2 span P2 ?
Solution:
173
4.4 Linear dependence and linear independence
174
Let V be a vector space with scalars F.
175
Note:
Any set of vectors containing the zero vector 0 is linearly dependent.
Theorem
The vectors v1 , . . . , vk (k ≥ 2) are linearly dependent if and only if one vector is a linear combination of the others.
In particular:
Two vectors are linearly dependent if and only if one is a multiple of the
other.
176
Example 19.
Are the following vectors linearly dependent or linearly independent?
(a) u = (2, −1, 1), v = (−6, 3, −3)
(b) u = (2, −1, 1), v = (4, 0, 2)
Solution:
177
Example 20.
Are the vectors (2, 0, 0), (6, 1, 7), (2, −1, 2) linearly dependent or linearly
independent?
Solution:
178
Note:
The augmented matrix has the original vectors as its columns.
179
General method to determine if vectors in Rn are linearly independent
Let A = [v1 · · · vk ] have the vectors as columns. Then:
1. The vectors are linearly independent ⇔ Ax = 0 has only the trivial solution.
2. The vectors are linearly independent ⇔ rank(A) = k.
3. If k = n, then the vectors are linearly independent ⇔ det(A) ≠ 0.
180
Theorem
Let v1 , . . . , vk be vectors in Rn . If k > n, then the vectors are linearly
dependent.
Proof:
Let A = [v1 · · · vk ] where vi ∈ Rn .
Then rank(A) ≤ n.
Since k > n this implies that rank(A) ≠ k.
Hence the vectors are linearly dependent.
Note:
The results and methods on linear combinations and linear dependence
for vectors in Rn also apply to vectors in Fn for any field F, e.g. C, F2 .
181
Example 21.
Determine if the following polynomials are linearly dependent or linearly independent:
2 + 2x + 5x 2 , 1 + x + x 2 , 1 + 2x + 3x 2
Solution:
182
183
Example 22.
Are the following matrices linearly independent or linearly dependent?
$\begin{pmatrix} 1 & 3 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} -2 & 1 \\ 1 & -1 \end{pmatrix}, \begin{pmatrix} 1 & 10 \\ 4 & 2 \end{pmatrix}.$
Solution:
184
185
Linear combinations via reduced row-echelon form
Let A = [v1 · · · vk ] and B = [u1 · · · uk ] be matrices with vectors in columns. If A ∼ B, then the columns of A and B satisfy exactly the same linear relations:
α1 v1 + · · · + αk vk = 0 ⇔ α1 u1 + · · · + αk uk = 0
186
Example 23.
Consider the vectors v1 = (1, 2, 3), v2 = (3, 6, 9), v3 = (−1, 0, −2) and
v4 = (1, 4, 4) in R3 .
Given that:
$A = \begin{pmatrix} 1 & 3 & -1 & 1 \\ 2 & 6 & 0 & 4 \\ 3 & 9 & -2 & 4 \end{pmatrix} \sim \begin{pmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} = B.$
Solution:
187
188
4.5 Bases and dimension
Definition (Basis)
A basis for a vector space V is a set of vectors in V which:
1. spans V , and
2. is linearly independent.
Note:
A vector space V can have many different bases.
e.g. Any two linearly independent vectors in R2 will give a basis for R2 .
189
Example 24.
Is {(1, −1), (2, 4)} a basis for R2 ?
Solution:
190
Example 25.
Recall that
W = {A ∈ M2,2 | Tr (A) = 0}
is a subspace of M2,2 .
Is $\left\{ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \right\}$ a basis for W ?
Solution:
1. Are the vectors in W ?
191
2. Do the vectors span W ?
192
3. Are the vectors linearly independent?
193
Theorem (Existence of bases)
Let V be a vector space. Then
1. V has a basis.
2. Any set that spans V contains a basis for V .
3. Any linearly independent set in V can be extended to a basis of V .
Theorem
Let V be a vector space and let {v1 , . . . , vk } be a basis for V .
194
This allows us to define the dimension of a vector space.
Definition (Dimension)
The dimension of a vector space V is the number of vectors in any basis
for V . This is denoted by dim(V ).
Note:
If V = {0}, we say that dim(V ) = 0.
195
Standard Bases
The standard basis for Rn is {e1 , . . . , en }, where ei has a 1 in position i and 0s elsewhere. The dimension of Rn is n.
The standard basis for Pn is
1, x, x 2 , . . . , x n .
The dimension of Pn is n + 1.
The space of all polynomials
P = {a0 + a1 x + · · · + an x n | n ∈ N, a0 , a1 , . . . , an ∈ R}
is infinite dimensional, with basis {1, x, x 2 , . . .}.
197
Theorem
Let V be a vector space, and suppose that dim(V ) = n is finite.
1. If a set with exactly n elements spans V , then it is a basis for V .
2. If a linearly independent subset of V has exactly n elements, then
it is a basis for V .
198
Example 26.
Is $S = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \right\}$ a basis for M2,2 ?
Solution:
199
Example 27.
Let B = {2 + 2x + 5x 2 , 1 + x + x 2 , 1 + 2x + 3x 2 }. Is B a basis for P2 ?
Solution:
200
How to find a basis for the span of a set of vectors in Rn (column method)
Note:
This method gives a basis that is a subset of the original set of vectors.
201
Example 28.
Let S = {v1 , v2 , v3 } = {(1, 3, −1, 1), (2, 6, 0, 4), (3, 9, −2, 4)}.
(a) Find a subset of S that is a basis for span(S).
(b) What is the dimension of span(S)?
Solution:
(a) We write the vectors as columns of a matrix and reduce to row
echelon form.
202
203
How to find a basis for the span of a set of vectors in Rn (row method)
Note:
◮ This method gives a basis that will usually not be a subset of the
original set of vectors.
◮ The method works because elementary row operations do not
change the subspace spanned by the rows, and in the row echelon
form the non-zero rows are linearly independent.
204
Example 29.
Let
S = {(1, 3, −1, 1), (2, 6, 0, 4), (3, 9, −2, 4)} ⊂ R4 .
Find a basis for W = span(S) and dim(W ) using the row method.
Solution:
We write the vectors as rows of a matrix and reduce to row echelon
form. Now
$A = \begin{pmatrix} 1 & 3 & -1 & 1 \\ 2 & 6 & 0 & 4 \\ 3 & 9 & -2 & 4 \end{pmatrix} \sim \begin{pmatrix} 1 & 3 & 0 & 2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} = B$
205
Example 30.
(a) Show that
W = {(a + b, 0, a − b, 0) | a, b ∈ R}
is a subspace of R4 and find a basis B for W .
(b) Extend B to a basis for R4 .
Solution:
206
207
208
4.6 Solution space, column space, row space
Theorem
The solution space of a homogeneous system Ax = 0 in n unknowns,
S = {x ∈ Rn | Ax = 0},
is a subspace of Rn .
Proof:
Check the conditions of the subspace theorem.
209
Example 31.
Find a basis for the solution space of
x1 + 2x2 + x3 + x4 = 0
3x1 + 6x2 + 4x3 + x4 = 0
What is the dimension of the solution space?
Solution:
210
211
Note:
The nullity of A, denoted nullity(A), is the dimension of the solution space. It equals the number of columns in the row-echelon form of A not containing a leading entry.
212
Column space and row space of a matrix
Note:
◮ The column space and row space are subspaces, since the span of
any set of vectors is a subspace.
◮ To find a basis for the column space use the column method.
◮ To find a basis for the row space use the row method.
213
Example 32.
Given that
$A = \begin{pmatrix} 1 & -1 & 2 & -2 \\ 2 & 0 & 1 & 0 \\ 5 & -3 & 7 & -6 \\ 1 & 1 & -1 & 3 \end{pmatrix} \sim \begin{pmatrix} 1 & -1 & 2 & -2 \\ 0 & 2 & -3 & 4 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} = B,$
find bases for the column space and row space of A.
214
215
Effect of row operations
Assume that A ∼ B by a sequence of elementary row operations. Then
◮ solution space of A = solution space of B,
◮ row space of A = row space of B,
◮ column space of A ≠ column space of B in general.
Theorem
For any m × n matrix A:
dim(row space of A) = dim(column space of A) = rank(A).
216
Rank-nullity theorem
Theorem
Suppose A is an m × n matrix. Then
rank(A) + nullity(A) = n.
Suppose A has row echelon form B.
The rank of A equals the number of columns in B containing a leading
entry.
The nullity of A equals the number of columns in B which do not
contain a leading entry.
Therefore, rank(A) + nullity(A) is equal to the number of columns in B,
which is equal to the number of columns in A.
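A sketch checking the theorem on the matrix of Example 31 (sympy's nullspace returns a basis for the solution space):

from sympy import Matrix

A = Matrix([[1, 2, 1, 1],
            [3, 6, 4, 1]])
rank = A.rank()
nullity = len(A.nullspace())
print(rank, nullity, rank + nullity == A.cols)   # 2 2 True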
217
Other fields of scalars
Our work on vector spaces applies using any field of scalars F instead of
the real numbers R. For example, we could use F = C or F = F2 .
Then
Fn = {(x1 , . . . , xn ) | x1 , . . . , xn ∈ F}
is a vector space using F as scalars.
218
Example 33.
Consider the matrix
$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \in M_{2,3}(\mathbb{F}_2).$
Determine the column space, row space and solution space for A.
Note:
Since the field is F2 all matrix entries are in {0, 1} and the operations of
vector addition and scalar multiplication are done using mod 2
arithmetic.
Solution:
Reduce A to reduced row-echelon form using mod 2 arithmetic.
219
Column space of A
220
Row space of A
221
Solution space of A
We need to solve the following system using mod 2 arithmetic
$\begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$
222
Note:
The rank-nullity theorem holds as rank(A) = 2, nullity(A) = 1 and A
has 3 columns.
223
4.7 Coordinates relative to a basis
Theorem
If {v1 , . . . , vn } is a basis for a vector space V , then every vector v ∈ V
can be written uniquely in the form
v = α1 v1 + . . . + αn vn ,
where α1 , . . . , αn ∈ F.
Proof:
Since {v1 , . . . , vn } is a basis, it spans V and so for some scalars
α1 , . . . , αn ∈ F
v = α1 v1 + . . . + αn vn (1)
Are these scalars unique? Suppose
v = β1 v1 + . . . + βn vn (2)
for some other scalars β1 , . . . , βn ∈ F.
224
Taking (2) − (1) gives
(β1 − α1 )v1 + · · · + (βn − αn )vn = 0.
Since {v1 , . . . , vn } is linearly independent, this forces
β1 − α1 = 0, . . . , βn − αn = 0
⇒ α1 = β1 , . . . , αn = βn
showing the uniqueness of the scalars α1 , . . . , αn .
225
Definition (Coordinates)
Suppose B = {v1 , . . . , vn } is an ordered basis for a vector space V ,
i.e. a basis arranged in the order v1 first, v2 second and so on.
For v ∈ V we can write v = α1 v1 + · · · + αn vn . The column vector
$[\mathbf{v}]_B = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}$
is called the coordinate vector of v with respect to B.
226
Example 34.
Consider the vector v = (1, 5) ∈ R2 . Determine the coordinate vector of
v with respect to the ordered bases:
(a) B = {(1, 0), (0, 1)}
(b) B ′ = {a, b} = {(2, 1), (−1, 1)}
Solution:
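Finding coordinates amounts to solving a linear system: with the basis vectors as columns of a matrix P, the coordinate vector solves P [v]B = v. A computational sketch for part (b):

import numpy as np

# Basis vectors a = (2, 1) and b = (-1, 1) as columns
P = np.array([[2, -1],
              [1, 1]])
v = np.array([1, 5])

coords = np.linalg.solve(P, v)
print(coords)   # [2. 3.], i.e. v = 2a + 3b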
227
[Figure: v = i + 5j relative to the standard basis, and v = 2a + 3b relative to B ′ ]
Solution:
229
Having fixed a finite ordered basis B for a vector space V , there is a
one-to-one correspondence between vectors and their coordinate
vectors.
Theorem
Let V be a finite dimensional vector space over a field of scalars F. Let B be an ordered basis of V .
If u, v ∈ V and α ∈ F is a scalar, then
[u + v]B = [u]B + [v]B and [αv]B = α[v]B .
230
Topic 5: Linear Transformations [AR 4.9-4.11, 8.1-8.5]
231
5.1 General linear transformations
We now consider functions or mappings between general vector spaces
that preserve the vector space structure of vector addition and scalar
multiplication.
Note:
Taking α = 0 in condition 2 shows that
T (0V ) = 0W
i.e. T maps the zero vector in V to the zero vector in W .
232
Example 1.
Show that the cross product T : R3 → R3 given by
Solution:
Define two vectors in the domain and a scalar
233
2. T preserves scalar multiplication
234
Example 2.
Show that the function D : P3 → P2 defined by differentiation with
respect to x,
D(a0 + a1 x + a2 x 2 + a3 x 3 ) = a1 + 2a2 x + 3a3 x 2
is a linear transformation.
Solution:
Define two vectors in the domain and a scalar
235
2. D preserves scalar multiplication
236
Example 3.
Show that the determinant T : M2,2 → R given by
$T\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$
is not a linear transformation.
Solution:
Find a specific counterexample
Method 1 - vector addition
237
Method 2 - scalar multiplication
238
Example 4.
Prove that the function T : R3 → R2 given by
T (x1 , x2 , x3 ) = (x2 − 2x3 , 3x1 + x3 )
is a linear transformation.
Solution:
Define two vectors in the domain and a scalar
239
2. T preserves scalar multiplication
240
Note:
◮ We can write T in matrix form using coordinate vectors:
$[T(x_1 , x_2 , x_3)] = \begin{pmatrix} x_2 - 2x_3 \\ 3x_1 + x_3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & -2 \\ 3 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$
The matrix
$[T] = \begin{pmatrix} 0 & 1 & -2 \\ 3 & 0 & 1 \end{pmatrix}$
is called the standard matrix representation for T .
241
Theorem
Let S = {e1 , . . . , en } and S ′ = {e′1 , . . . , e′m } be the ordered standard bases for Rn and Rm , and let v ∈ Rn .
Every linear transformation T : Rn → Rm has a standard matrix representation [T ] specified by the m × n matrix
[T ] = [ [T (e1 )] [T (e2 )] · · · [T (en )] ],
so that [T (v)] = [T ][v].
Proof:
Let v ∈ Rn and write
v = α1 e1 + α2 e2 + · · · + αn en .
The coordinate vector of v with respect to the standard basis is
$[\mathbf{v}] = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}$
Since T is a linear transformation
[T (v)] = [T (α1 e1 + α2 e2 + · · · + αn en )]
= [T (α1 e1 )] + [T (α2 e2 )] + · · · + [T (αn en )]
= α1 [T (e1 )] + α2 [T (e2 )] + · · · + αn [T (en )]
$= \big[\, [T(\mathbf{e}_1)]\ [T(\mathbf{e}_2)]\ \cdots\ [T(\mathbf{e}_n)] \,\big]\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}$
= [T ] [v]
243
5.2 Geometric linear transformations from R2 to R2
Many familiar geometric transformations are linear transformations
T : R2 → R2 .
A linear transformation T maps the parallelogram with vertices 0, v, w, v + w to the parallelogram with vertices 0, T (v), T (w), T (v + w).
[Figure: the two parallelograms]
245
Example 5.
Determine the standard matrix representation in R2 for reflection across
the y -axis.
[Figure: a point and its mirror image under reflection across the y -axis]
Solution:
246
Example 6.
Determine the standard matrix representation in R2 for reflection in the
line y = 5x.
Solution:
247
248
249
Example 7.
Determine the standard matrix representation in R2 for rotation around
the origin anticlockwise by an angle of θ.
[Figure: the standard basis vectors e1 , e2 rotated anticlockwise by θ to T (e1 ), T (e2 )]
250
Solution:
Consider the effect of the transformation on the standard basis vectors.
251
Example 8.
Determine the standard matrix representation in R2 for the
compression/expansion along the x-axis by a factor c > 0.
[Figure: the unit square mapped to a rectangle of width c and height 1]
Solution:
Consider the effect of the transformation on the standard basis vectors.
252
Example 9.
Determine the standard matrix representation in R2 for the shear by a
factor c along the x-axis.
[Figure: the unit square sheared along the x-axis; the corners (0, 1) and (1, 1) map to (c, 1) and (1 + c, 1)]
Solution:
Consider the effect of the transformation on the standard basis vectors.
253
Composition of linear transformations
Recall:
If S : U → V and R : V → W are functions, then the composition
R ◦ S : U → W is defined by (R ◦ S)(u) = R(S(u)).
Theorem
Suppose R : Rk → Rm and S : Rn → Rk are linear transformations with standard matrices [R] and [S] respectively. Then R ◦ S : Rn → Rm is a linear transformation with standard matrix
[R ◦ S] = [R][S].
254
Example 10.
Find the image of (x, y ) after a shear along the x-axis with c = 1
followed by a reflection across the y -axis.
Solution:
From examples 5 and 9
255
5.3 Matrix representations for general linear
transformations
Suppose that
◮ T : U → V is a linear transformation.
◮ B = {b1 , . . . , bn } is an ordered basis for U.
◮ C = {c1 , . . . , cm } is an ordered basis for V .
Then for any u ∈ U, the coordinate vector for the image of u under T
with respect to basis C , is given by
256
Theorem
There exists a unique matrix satisfying
which is given by
[T ]C,B = [ [T (b1 )]C [T (b2 )]C · · · [T (bn )]C ].
Note:
◮ The matrix [T ]C,B is called the matrix of T with respect to the
bases B and C.
◮ For the case U = V and B = C, we often write [T ]B instead of [T ]B,B .
257
Example 11.
Consider the linear transformation T : M2,2 → M2,2 where T is defined
by
T (Q) = Q T .
Find the matrix representation of T with respect to the ordered basis
$B = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}.$
Solution:
Find the image of each basis vector in B under T , and express it as a
coordinate vector using the basis B.
258
259
Example 12.
Find [T ]C,B for the linear transformation T : P2 → P1 given by
T (a0 + a1 x + a2 x 2 ) = (a0 + a2 ) + a0 x
using the ordered basis B = {1, x, x 2 } for P2 and the ordered basis
Solution:
1. Find the image of each basis vector in B under T .
260
2. Express each image as a coordinate vector with respect to basis C.
261
262
Example 13.
A linear transformation T : R3 → R2 has matrix
$[T] = \begin{pmatrix} 5 & 1 & 0 \\ 1 & 5 & -2 \end{pmatrix}$
with respect to the ordered standard bases of R3 and R2 .
What is the transformation matrix with respect to the ordered bases
B = {(1, 1, 0), (1, −1, 0), (1, −1, −2)} of R3
and
C = {(1, 1), (1, −1)} of R2 ?
Solution:
1. Find the coordinates of the image of each vector in B with respect to
the standard basis.
263
2. Express each image as a coordinate vector with respect to basis C.
264
265
5.4 Image, kernel, rank and nullity
Note:
◮ The kernel is a subspace of U. Its dimension is denoted by
nullity(T ).
◮ The image is a subspace of V . Its dimension is denoted by
rank(T ).
266
Example 14.
Find the kernel and image of the linear transformation D : P3 → P2
given by differentiation with respect to x:
D(a0 + a1 x + a2 x 2 + a3 x 3 ) = a1 + 2a2 x + 3a3 x 2 .
Solution:
267
Image and kernel from matrix representations
[T ] = [T ]C,B .
268
Using the rank-nullity theorem for matrices gives the following:
rank(T ) + nullity(T ) = n.
269
Example 15.
Consider the linear transformation T : R3 → R2 defined by
T (x, y , z) = (2x − y , y + z).
Find bases for Im(T ) and ker(T ). Verify the rank-nullity theorem.
Solution:
1. Image of T
270
2. Kernel of T
271
3. Verify rank-nullity theorem
272
Example 16.
Consider the linear transformation T : P2 → P1 defined by
T (a0 + a1 x + a2 x 2 ) = (a0 − a1 + a2 )(1 + 2x)
Find bases for Im(T ) and ker(T ).
Solution:
1. Image of T
273
2. Kernel of T
274
275
5.5 Invertible linear transformations
276
Example 17.
Consider the linear transformation T : P2 → P1 defined by
(a) Is T injective?
(b) Is T surjective?
Solution:
From example 16:
277
Definition (Invertible)
A function T : U → V is invertible if there exists a function S : V → U
such that
◮ (S ◦ T )(u) = u for all u ∈ U, and
◮ (T ◦ S)(v) = v for all v ∈ V .
Then S is called the inverse of T and is denoted by T −1 .
278
Theorem
Let T : U → V be a linear transformation.
1. T is invertible if and only if it is both injective and surjective.
2. If T is invertible, then its inverse T −1 is also a linear
transformation.
3. Assume that U and V have finite ordered bases B and C,
respectively. Then T is invertible if and only if [T ]C,B is invertible,
and
[T −1 ]B,C = ([T ]C,B )−1 .
This requires dim(U) = dim(V ).
Note:
An invertible linear transformation T is also called an isomorphism.
279
Example 18.
Is the rotation T : R2 → R2 around the origin anticlockwise by an angle θ, with standard matrix
$[T] = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$
invertible?
280
Example 19.
Consider the linear transformation T : R2 → R2 for orthogonal
projection onto the x-axis, with standard matrix
$[T] = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$
[Figure: the unit square and its image under projection onto the x-axis]
(a) Is T surjective?
(b) Is T injective?
(c) Is T invertible?
Solution:
281
282
5.6 Change of basis
Transition Matrices
Let B = {b1 , . . . , bn } and C = {c1 , . . . , cn } be ordered bases for the
same vector space V with field of scalars F.
Given v ∈ V , how are [v]B and [v]C related?
Theorem
There exists a unique matrix PC,B (the transition matrix from B to C) such that for any vector v ∈ V ,
[v]C = PC,B [v]B .
To identify PC,B , recall that for a linear transformation T ,
[T ]C,B = [ [T (b1 )]C [T (b2 )]C · · · [T (bn )]C ]
is the matrix representation of T with respect to bases B and C.
284
For the special case of the identity transformation T : V → V , T (v) = v,
this gives
[T ]C,B = [ [b1 ]C [b2 ]C · · · [bn ]C ]
and
[v]C = [T ]C,B [v]B .
So
PC,B = [ [b1 ]C [b2 ]C · · · [bn ]C ].
Note:
The transition matrix PC,B is invertible, and (PC,B )−1 = PB,C .
285
Example 20.
Consider the following ordered bases for R2
286
Example 21.
Consider the following ordered bases for R2
287
Calculating a general transition matrix from basis B to basis C
Let S be the standard basis. We have
[v]S = PS,B [v]B and [v]C = PC,S [v]S .
Combining these, we get
[v]C = PC,S PS,B [v]B , so PC,B = PC,S PS,B = (PS,C )−1 PS,B .
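A sketch of this recipe in NumPy for two hypothetical bases of R2 (each matrix has its basis vectors, in standard coordinates, as columns):

import numpy as np

P_SB = np.array([[1, 1],
                 [1, -1]])   # B = {(1, 1), (1, -1)}
P_SC = np.array([[2, 0],
                 [0, 1]])    # C = {(2, 0), (0, 1)}

# P_CB = (P_SC)^{-1} P_SB
P_CB = np.linalg.solve(P_SC, P_SB)

v_B = np.array([1, 2])       # coordinates of some v relative to B
v_C = P_CB @ v_B             # the same vector relative to C
print(v_C)                   # [ 1.5 -1. ], since v = (3, -1) = 1.5(2, 0) - 1(0, 1)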
288
Example 22.
Consider the following ordered bases for R2
Solution:
289
Note:
Writing the vectors of B as linear combinations of the vectors of C gives the columns of PC,B .
Theorem
The matrix representations of T : V → V with respect to two ordered
bases C and B for V are related by the following equation:
[T ]B = PB,C [T ]C PC,B .
Proof:
We need to show that for all v ∈ V ,
[T ]B [v]B = PB,C [T ]C PC,B [v]B .
292
Solution:
1. Find image of the vectors in B and write in terms of B to give [T ]B .
293
2. Find the transition matrices PS,B and PB,S .
294
3. Find the standard matrix representation of T .
295
Example 24.
Consider the linear transformation T : R2 → R2 given by
T (x, y ) = (3x − y , −x + 3y ).
296
297
[Figure: the basis vectors b1 , b2 and their images v1 , v2 under T ]
298
Topic 6: Eigenvalues and Eigenvectors [AR Chap 5]
299
6.1 Definition of eigenvalues and eigenvectors
300
Solution:
301
Algebraically, we seek non-zero vectors v such that
T (v) = λv
Geometrically,
◮ If λ = 0, then T maps v to 0.
◮ If λ is real and positive, then T rescales v by a factor λ.
◮ If λ is real and negative, then T rescales v by a factor λ and also
reverses the direction of v.
302
Let V be a vector space over a field of scalars F.
Definition (Eigenspace)
Let T : V → V be a linear transformation and let λ ∈ F. The eigenspace of λ is
{v ∈ V | T (v) = λv}.
Note:
Each non-zero vector in this eigenspace is an eigenvector of T with eigenvalue λ.
303
If we represent T by its matrix [T ]B with respect to an ordered basis B
for V , then we can find eigenvalues and eigenvectors by solving
Av = λv.
Note:
For a real matrix, eigenvalues and eigenvectors can be real or complex.
304
6.2 Finding eigenvalues
Av = λv
Av − λv = 0
⇒ Av − λI v = 0
⇒ (A − λI )v = 0
The values of λ for which this equation has non-zero solutions are the
eigenvalues.
305
Theorem
The homogeneous linear system
(A − λI )v = 0
has a non-zero solution v if and only if
det(A − λI ) = 0.
306
Example 2.
Find the eigenvalues of $A = \begin{pmatrix} 1 & 4 \\ 1 & 1 \end{pmatrix}$.
Solution:
Solve det(A − λI ) = 0
307
Definition (Characteristic polynomial)
Let A be an n × n matrix. The determinant det(A − λI ) is a polynomial
in λ of degree n called the characteristic polynomial of A.
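A numeric sketch for the matrix of Example 2 (np.linalg.eig returns the eigenvalues together with unit eigenvectors as columns):

import numpy as np

A = np.array([[1, 4],
              [1, 1]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # 3 and -1 (the order may vary)
print(eigenvectors)   # columns are corresponding unit-length eigenvectors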
309
310
Useful properties of eigenvalues
These properties are useful for checking that you have calculated the
eigenvalues correctly.
311
Example 4.
Check these properties for the eigenvalues λ = 2, 2, 3 of the matrix
$A = \begin{pmatrix} 2 & -3 & 6 \\ 0 & 5 & -6 \\ 0 & 1 & 0 \end{pmatrix}.$
Solution:
312
Example 5.
Consider the matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.
(a) Find the eigenvalues of A.
(b) Interpret the matrix and eigenvalues geometrically.
Solution:
313
Note:
The matrix A also defines a linear transformation T : C2 → C2 where
T (v) = Av
for v ∈ C2 .
Then T has complex eigenvalues.
314
Eigenvalues of matrices versus eigenvalues of linear transformations
Viewing an n × n real matrix A as a linear transformation of Cn , the eigenspace of λ is
{v ∈ Cn | Av = λv},
whereas viewing A as a linear transformation of Rn it is
{v ∈ Rn | Av = λv}.
315
6.3 Finding eigenvectors
(A − λI )v = 0.
316
Example 6.
Find the eigenvectors of the matrix $A = \begin{pmatrix} 1 & 4 \\ 1 & 1 \end{pmatrix}$.
Solution:
From example 2, the eigenvalues are λ = −1, 3.
For each eigenvalue λ, solve (A − λI )v = 0
317
318
319
Example 7.
Find the eigenvectors of the matrix $A = \begin{pmatrix} 2 & -3 & 6 \\ 0 & 5 & -6 \\ 0 & 1 & 0 \end{pmatrix}$.
What is the geometric multiplicity of each eigenvalue?
Solution:
From example 3, the eigenvalues are λ = 2, 3.
For each eigenvalue λ, solve (A − λI )v = 0
320
321
322
Note:
The geometric multiplicity is always less than or equal to the algebraic
multiplicity.
323
Example 8.
Find the eigenvectors of the matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.
Solution:
From example 5, the eigenvalues are λ = ±i .
For each eigenvalue λ, solve (A − λI )v = 0
324
325
Note:
◮ The eigenvalues are complex conjugates.
◮ The eigenvectors are complex conjugates.
In general, the non-real eigenvalues and eigenvectors of a real matrix
occur in complex conjugate pairs.
326
6.4 Diagonalisation
Definition
A linear transformation T : V → V is diagonalisable if there is a basis
B = {b1 , . . . , bn } for V such that the matrix [T ]B is diagonal.
Definition
A complex n × n matrix A is diagonalisable (or diagonalisable over C) if
the linear transformation
T : Cn → Cn with T (v) = Av
is diagonalisable.
A real n × n matrix A is diagonalisable over R if the linear
transformation
T : Rn → Rn with T (v) = Av
is diagonalisable.
327
Assume V is a finite dimensional vector space over a field F = R or C.
Theorem
A linear transformation T : V → V is diagonalisable if and only if there
is a basis B for V consisting of eigenvectors of T .
A matrix A ∈ Mn,n (F) is diagonalisable over F if and only if it has n
linearly independent eigenvectors in Fn .
Proof:
Let B = {b1 , b2 , · · · , bn } be a basis for V . The matrix [T ]B is diagonal if and only if each of the basis vectors bi satisfies
T (bi ) = λi bi for some scalar λi ∈ F,
i.e. if and only if each bi is an eigenvector of T .
328
How to diagonalise a matrix
Theorem
An n × n matrix A is diagonalisable if and only if there is an n × n
invertible matrix P and an n × n diagonal matrix D such that
P −1 AP = D
or, equivalently,
A = PDP −1 .
The matrix P is said to diagonalise A.
If v1 , . . . , vn are linearly independent eigenvectors of A with
corresponding eigenvalues λ1 , . . . , λn , then we can take
P = [v1 · · · vn ] and D = diag(λ1 , λ2 , . . . , λn ).
Note:
If A is diagonalisable over F then P and D must have entries in F.
329
Example 9.
Let $A = \begin{pmatrix} 1 & 4 \\ 1 & 1 \end{pmatrix}$. Suppose T : R2 → R2 is such that [T ] = A.
(a) Write A in the form PDP −1 . Check your answer.
(b) Give a geometric description of T .
Solution:
(a) From example 6:
An eigenvector corresponding to λ = −1 is $\mathbf{v}_1 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$.
An eigenvector corresponding to λ = 3 is $\mathbf{v}_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$.
330
Check A = PDP −1 or AP = PD by multiplying the matrices.
331
Theorem
Eigenvectors v1 , . . . , vk corresponding to distinct eigenvalues λ1 , . . . , λk
are linearly independent.
Proof:
Consider the case k = 2.
Let v1 , v2 be eigenvectors of A corresponding to λ1 and λ2 respectively, such that λ1 ≠ λ2 . Suppose
α1 v1 + α2 v2 = 0. (1)
Multiplying both sides by A gives
α1 λ1 v1 + α2 λ2 v2 = 0. (2)
332
Equation (2) − λ1 (1) gives
α2 (λ2 − λ1 )v2 = 0.
Since λ2 − λ1 ≠ 0 and v2 ≠ 0, this forces α2 = 0; then (1) gives α1 v1 = 0, so
α1 = α2 = 0
and v1 , v2 are linearly independent.
Note:
Similar arguments can be used for the cases k > 2.
333
Theorem
If an n × n matrix A has n distinct eigenvalues, then A has n linearly
independent eigenvectors. Hence, A will be diagonalisable.
Note:
If A has one or more repeated eigenvalues, then A may or may not be
diagonalisable depending on the number of linearly independent
eigenvectors associated with each eigenvalue.
◮ If the algebraic and geometric multiplicities for each eigenvalue are
equal, A will be diagonalisable. In this case, the dimensions of the
eigenspaces add up to n.
◮ If the geometric multiplicity is less than the algebraic multiplicity
for some eigenvalue, A will not be diagonalisable.
334
Example 10.
Is $A = \begin{pmatrix} 2 & -3 & 6 \\ 0 & 5 & -6 \\ 0 & 1 & 0 \end{pmatrix}$ diagonalisable?
If it is, specify the matrices P and D such that A = PDP −1 .
Solution:
From example 7:
Two linearly independent eigenvectors corresponding to λ = 2 are
$\mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$ and $\mathbf{v}_2 = \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}$.
An eigenvector corresponding to λ = 3 is
$\mathbf{v}_3 = \begin{pmatrix} -3 \\ 3 \\ 1 \end{pmatrix}$.
335
336
Example 11.
Show that the matrix $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ is not diagonalisable.
Solution:
Find the eigenvalues of A
337
Note:
The eigenvalue λ = 1 has algebraic multiplicity 2 but geometric
multiplicity of 1, so we cannot diagonalise A.
338
Example 12.
Is $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ diagonalisable?
If it is, specify the matrices P and D such that A = PDP −1 .
Solution:
From example 8:
An eigenvector corresponding to λ = i is $\mathbf{v}_1 = \begin{pmatrix} -i \\ 1 \end{pmatrix}$.
An eigenvector corresponding to λ = −i is $\mathbf{v}_2 = \begin{pmatrix} i \\ 1 \end{pmatrix}$.
339
6.5 Matrix powers
Example 13.
If $D = \begin{pmatrix} -4 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{pmatrix}$, write down $D^{10}$.
Solution:
340
Theorem
Suppose A is diagonalisable, so that
A = PDP −1
with D diagonal. Then, for every integer k ≥ 1,
Ak = PD k P −1 .
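A sketch checking this identity on Example 14's matrix:

import numpy as np

A = np.array([[1., 4.],
              [1., 1.]])
D = np.diag([-1., 3.])
P = np.array([[-2., 2.],
              [1., 1.]])

k = 5
Ak_direct = np.linalg.matrix_power(A, k)
Ak_diag = P @ np.linalg.matrix_power(D, k) @ np.linalg.inv(P)
print(np.allclose(Ak_direct, Ak_diag))   # True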
341
Example 14.
From example 9, $A = \begin{pmatrix} 1 & 4 \\ 1 & 1 \end{pmatrix}$ can be written in the form A = PDP −1 where
$D = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}$, $P = \begin{pmatrix} -2 & 2 \\ 1 & 1 \end{pmatrix}$ and $P^{-1} = \frac{1}{4}\begin{pmatrix} -1 & 2 \\ 1 & 2 \end{pmatrix}$.
Calculate Ak , for all integers k ≥ 1.
Solution:
342
Example 15. An application from genetics
If you continue crossing with only pink plants, what fractions of the
three types of flowers would you eventually expect to see in your garden?
343
Solution:
Let
$\mathbf{x}_n = \begin{pmatrix} r_n \\ p_n \\ w_n \end{pmatrix}$
where rn , pn , wn are the fractions of red, pink and white flowers after n generations, for n ≥ 0. Note that these satisfy
rn + pn + wn = 1.
344
This gives
xn+1 = T xn ,
where T is the transition matrix
$T = \begin{pmatrix} 1/2 & 1/4 & 0 \\ 1/2 & 1/2 & 1/2 \\ 0 & 1/4 & 1/2 \end{pmatrix}.$
Note that T has non-negative entries and the sum of each column is 1.
Then
x1 = T x0 , x2 = T x1 = T (T x0 ) = T 2 x0 ,
and in general
xn = T n x0
gives the distribution after n generations, for n ≥ 1.
345
Now T has eigenvalues
λ = 1, 1/2, 0
with corresponding eigenvectors
$\mathbf{v}_1 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \quad \mathbf{v}_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$
Hence
T = PDP −1
where
$P = \begin{pmatrix} 1 & -1 & 1 \\ 2 & 0 & -2 \\ 1 & 1 & 1 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad P^{-1} = \begin{pmatrix} 1/4 & 1/4 & 1/4 \\ -1/2 & 0 & 1/2 \\ 1/4 & -1/4 & 1/4 \end{pmatrix}.$
346
Now
T n = PD n P −1 .
Taking the limit as n → ∞, D n → diag(1, 0, 0), so we get
$T^n \to P \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} P^{-1} = \begin{pmatrix} 1/4 & 1/4 & 1/4 \\ 1/2 & 1/2 & 1/2 \\ 1/4 & 1/4 & 1/4 \end{pmatrix}.$
347
So in the long run, the expected distribution of flower colours is 1/4 red, 1/2 pink and 1/4 white (using r0 + p0 + w0 = 1).
Note:
◮ These fractions are proportional to the entries of the eigenvector of
T with eigenvalue 1 . This is expected since the limiting
proportions x∞ satisfy T x∞ = x∞ .
◮ This is an example of a Markov chain.
348
Topic 7: Inner Product Spaces [AR 6.1-6.5,5.3,7.1-7.2,7.5]
349
7.1 Definition of inner products
We now generalise the dot product to define inner products, angles and lengths on general vector spaces. The dot product is itself an inner product on Rn :
⟨u, v⟩ = u · v
351
Example 1.
For u = (u1 , u2 ) ∈ R2 and v = (v1 , v2 ) ∈ R2 , define
hu, vi = u1 v1 + 2u2 v2 .
1. Symmetry
352
2. Linearity of scalar multiplication
353
4. Positivity
354
Example 2.
For u = (u1 , u2 , u3 ) ∈ R3 and v = (v1 , v2 , v3 ) ∈ R3 , define
hu, vi = u1 v1 − u2 v2 + u3 v3 .
355
Note:
Let u = (u1 , · · · , un ) ∈ Rn and v = (v1 , · · · , vn ) ∈ Rn , written as column matrices U and V . Inner products on Rn can be written in the form ⟨u, v⟩ = U^T AV for a suitable matrix A with
A = A^T .
Example 3.
For u = (u1 , u2 ) ∈ R2 and v = (v1 , v2 ) ∈ R2 , define
Solution:
357
358
Example 4.
For p = p(x) ∈ Pn and q = q(x) ∈ Pn , define
$\langle p, q \rangle = \int_0^1 p(x)\,q(x)\,dx.$
Show that this gives an inner product on Pn .
Solution:
Define a third polynomial and a scalar
1. Symmetry
359
2. Linearity of scalar multiplication
360
4. Positivity
361
Complex vector spaces
For u = (u1 , . . . , un ) and v = (v1 , . . . , vn ) in Cn , the Hermitian dot product is
$\mathbf{u} \cdot \mathbf{v} = u_1\overline{v_1} + \cdots + u_n\overline{v_n}.$
Example 5.
Let u = (1 + i , 1 − i ), v = (i , 1) ∈ C2 .
Compute the Hermitian dot products u · v, v · u and u · u.
Solution:
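A computational sketch (NumPy complex arrays; note that np.vdot conjugates its first argument, so the sum is written explicitly to match the convention here, conjugation on the second vector):

import numpy as np

u = np.array([1 + 1j, 1 - 1j])
v = np.array([1j, 1])

hdot = lambda a, b: np.sum(a * np.conj(b))
print(hdot(u, v))   # (2-2j)
print(hdot(v, u))   # (2+2j), the complex conjugate of u . v
print(hdot(u, u))   # (4+0j): real and positive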
363
Definition (Inner product over C)
A Hermitian inner product on a complex vector space V is a function
that associates to every pair of vectors u, v ∈ V a complex number,
denoted by hu, vi, satisfying the following properties.
364
Example
The Hermitian dot product on Cn satisfies these properties.
Example
The inner product on Pn (R) defined in example 4 can be generalised to
Pn (C) by replacing the second polynomial in the definition by its
complex conjugate:
⟨p, q⟩ = ∫₀¹ p(x) \overline{q(x)} dx.
Note:
Inner products defined by integration are important in many
applications, for example in quantum mechanics, electrical engineering,
and in the study of differential equations.
365
Example 6.
For u = (u1 , u2 ) ∈ C2 and v = (v1 , v2 ) ∈ C2 , define
366
7.2 Geometry from inner products
d(u, v) = ‖v − u‖.
367
Example (Frobenius inner product and digital pictures)
Given two matrices A, B ∈ Mm,n , we define
368
Definition (Angle)
For a real inner product space, the angle θ between two non-zero
vectors u and v is defined by
cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖),   0 ≤ θ ≤ π.
For a complex inner product space, the angle θ between two non-zero
vectors u and v is defined by
cos θ = Re⟨u, v⟩ / (‖u‖ ‖v‖),   0 ≤ θ ≤ π,
where Re⟨u, v⟩ denotes the real part of the inner product.
369
Note:
In order for these definitions of angle to make sense, we need
−1 ≤ ⟨u, v⟩ / (‖u‖ ‖v‖) ≤ 1   and   −1 ≤ Re⟨u, v⟩ / (‖u‖ ‖v‖) ≤ 1.
Equality holds if and only if one vector is a multiple of the other, i.e. the vectors are linearly dependent.
370
Example 7.
Let u = 1 and v = x. Using the inner product:
⟨p, q⟩ = ∫₀¹ p(x)q(x) dx
find the
(a) lengths of u and v
(b) distance between u and v
(c) angle between u and v
Solution:
371
372
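One way the computation can be carried out:
(a) ‖u‖^2 = ∫₀¹ 1 dx = 1, so ‖u‖ = 1; ‖v‖^2 = ∫₀¹ x^2 dx = 1/3, so ‖v‖ = 1/√3.
(b) d(u, v) = ‖v − u‖ = ( ∫₀¹ (x − 1)^2 dx )^{1/2} = √(1/3) = 1/√3.
(c) ⟨u, v⟩ = ∫₀¹ x dx = 1/2, so cos θ = (1/2)/(1 · 1/√3) = √3/2 and θ = π/6.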
7.3 Orthogonal sets and the Gram-Schmidt procedure
Definition (Orthogonal vectors)
Vectors u and v in an inner product space V are orthogonal if ⟨u, v⟩ = 0.
Note:
For complex inner products this is stronger than saying the angle between u and v is π/2, which only requires Re⟨u, v⟩ = 0. It also implies that the angle between u and i v is π/2.
Solution:
374
Example 9.
Show that
{v1 , v2 , v3 } = {(1, 1, 1), (1, −1, 1), (1, 0, −1)}
is an orthogonal set of vectors in R3 with respect to the inner product
⟨u, v⟩ = u1 v1 + 2u2 v2 + u3 v3 .
Solution:
375
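The required checks, for reference:
⟨v1 , v2 ⟩ = (1)(1) + 2(1)(−1) + (1)(1) = 0,
⟨v1 , v3 ⟩ = (1)(1) + 2(1)(0) + (1)(−1) = 0,
⟨v2 , v3 ⟩ = (1)(1) + 2(−1)(0) + (1)(−1) = 0.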
Theorem
Every orthogonal set of non-zero vectors {v1 , . . . , vk } in an inner
product space V is linearly independent.
Proof:
Since {v1 , . . . , vk } is an orthogonal set, we have
⟨vi , vj ⟩ = 0, i ≠ j.
Also
⟨0, vi ⟩ = 0, i = 1, · · · , k.
Suppose that
α1 v1 + · · · + αk vk = 0. (*)
Taking the inner product of both sides of (*) with vi gives
⟨α1 v1 + · · · + αk vk , vi ⟩ = 0
⇒ ⟨α1 v1 , vi ⟩ + · · · + ⟨αi vi , vi ⟩ + · · · + ⟨αk vk , vi ⟩ = 0
⇒ α1 ⟨v1 , vi ⟩ + · · · + αi ⟨vi , vi ⟩ + · · · + αk ⟨vk , vi ⟩ = 0
⇒ αi ⟨vi , vi ⟩ = 0.
Since ⟨vi , vi ⟩ ≠ 0, it follows that
αi = 0, for all i = 1, · · · , k,
so {v1 , . . . , vk } is linearly independent.
377
Definition (Orthonormal set)
A set of vectors {v1 , . . . , vk } in an inner product space V is
orthonormal if it is orthogonal and each vector has length 1.
That is,
{v1 , . . . , vk } is orthonormal ⇔ ⟨vi , vj ⟩ = 0 for i ≠ j, and ⟨vi , vi ⟩ = 1 for each i.
Note:
Any orthogonal set of non-zero vectors can be made orthonormal by
dividing each vector by its length.
378
Example 10.
Using the orthogonal set from example 9, construct an orthonormal set.
Solution:
379
Orthonormal bases
Theorem
Let V be an inner product space. If {v1 , . . . , vn } is an orthonormal basis for V and x ∈ V , then
x = ⟨x, v1 ⟩v1 + · · · + ⟨x, vn ⟩vn .
Proof:
Since {v1 , . . . , vn } is a basis, we can write
x = α1 v1 + · · · + αn vn
for some scalars α1 , . . . , αn .
380
Then, for each i = 1, · · · , n, we have
⟨x, vi ⟩ = ⟨α1 v1 + · · · + αn vn , vi ⟩
= ⟨α1 v1 , vi ⟩ + · · · + ⟨αi vi , vi ⟩ + · · · + ⟨αn vn , vi ⟩
= α1 ⟨v1 , vi ⟩ + · · · + αi ⟨vi , vi ⟩ + · · · + αn ⟨vn , vi ⟩
= αi .
Hence
x = ⟨x, v1 ⟩v1 + · · · + ⟨x, vn ⟩vn .
381
Example 11.
Recall R3 has an orthonormal basis
{b1 , b2 , b3 } = { (1/2)(1, 1, 1), (1/2)(1, −1, 1), (1/√2)(1, 0, −1) }
with respect to the inner product
⟨u, v⟩ = u1 v1 + 2u2 v2 + u3 v3 .
Express x = (1, 1, −1) as a linear combination of b1 , b2 , b3 .
Solution:
382
383
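Using the coefficient formula αi = ⟨x, bi ⟩ from the theorem above:
⟨x, b1 ⟩ = (1/2)(1 + 2 − 1) = 1, ⟨x, b2 ⟩ = (1/2)(1 − 2 − 1) = −1, ⟨x, b3 ⟩ = (1/√2)(1 + 0 + 1) = √2,
so x = b1 − b2 + √2 b3 .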
Orthogonal projection
Orthogonal projection in Rn can be generalised to any inner product
space.
The orthogonal projection of v onto u is
proju v = ( ⟨v, u⟩ / ⟨u, u⟩ ) u.
Note:
If p = proju v then v − p is orthogonal to u since:
⟨v − p, u⟩ = ⟨v, u⟩ − ⟨p, u⟩ = ⟨v, u⟩ − ( ⟨v, u⟩ / ⟨u, u⟩ ) ⟨u, u⟩ = 0.
384
We can also generalise to projection onto a subspace W of V .
If p = projW v then:
(Diagram: the projection p = projW v lies in W ; the component v − p is orthogonal to W .)
385
Properties of orthogonal projection of v onto W
1. projW v ∈ W
2. If w ∈ W , then
projW w = w
3. v − projW v is orthogonal to W , i.e. orthogonal to every vector in
W.
4. projW v is the unique vector in W that is closest to v. Thus,
‖v − w‖ > ‖v − projW v‖ for all w ∈ W with w ≠ projW v.
386
Example 12.
Let W = {(x, y , z) | x + y + z = 0}. The set
{u1 , u2 } = { (1/√2)(1, −1, 0), (1/√6)(1, 1, −2) }
is an orthonormal basis for W with respect to the dot product.
Solution:
387
388
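A MATLAB sketch of projecting onto W ; the vector v below is a hypothetical choice, since this copy of the notes leaves the vector being projected blank:

u1 = [1; -1; 0]/sqrt(2);
u2 = [1; 1; -2]/sqrt(6);
v  = [1; 2; 3];                  % hypothetical vector to project
p  = (v'*u1)*u1 + (v'*u2)*u2     % proj_W v = <v,u1>u1 + <v,u2>u2 = (-1, 0, 1)
(v - p)'*[u1 u2]                 % both entries 0: v - p is orthogonal to W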
How to produce an orthonormal basis from a general basis
Gram-Schmidt procedure
Suppose {v1 , . . . , vk } is a basis for an inner product space V . Define:
1. u1 = v1 /‖v1 ‖
2. w2 = v2 − ⟨v2 , u1 ⟩u1 and u2 = w2 /‖w2 ‖
3. w3 = v3 − ⟨v3 , u1 ⟩u1 − ⟨v3 , u2 ⟩u2 and u3 = w3 /‖w3 ‖
and so on. Then {u1 , . . . , uk } is an orthonormal basis for V .
Solution:
Step 1: Find u1 = v1 /‖v1 ‖
390
Step 2a: Find w2 = v2 − ⟨v2 , u1 ⟩u1
Step 2b: Find u2 = w2 /‖w2 ‖
391
Step 3a: Find w3 = v3 − ⟨v3 , u1 ⟩u1 − ⟨v3 , u2 ⟩u2
392
Step 3b: Find u3 = w3 /‖w3 ‖
393
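The whole procedure in MATLAB with the Euclidean dot product; the starting basis here is an illustrative assumption:

v1 = [1; 1; 0]; v2 = [1; 0; 1]; v3 = [0; 1; 1];   % assumed basis of R^3
u1 = v1/norm(v1);
w2 = v2 - (v2'*u1)*u1;  u2 = w2/norm(w2);
w3 = v3 - (v3'*u1)*u1 - (v3'*u2)*u2;  u3 = w3/norm(w3);
Q = [u1 u2 u3];
Q'*Q            % identity matrix (up to rounding): {u1, u2, u3} is orthonormal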
7.4 Least squares curve fitting
Given a set of data points (x1 , y1 ), (x2 , y2 ),. . . , (xn , yn ) we want to find
the line y = a + bx that best approximates the data.
394
The quantity to minimise is the sum of squared errors
E^2 = (y1 − (a + bx1 ))^2 + · · · + (yn − (a + bxn ))^2 .
This can be written as
E^2 = ‖y − Au‖^2
using the Euclidean dot product, where
y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad u = \begin{pmatrix} a \\ b \end{pmatrix}, \quad Au = \begin{pmatrix} a + bx_1 \\ a + bx_2 \\ \vdots \\ a + bx_n \end{pmatrix}.
396
Theorem (Least squares line of best fit)
The least squares solution for the line y = a + bx of best fit for the points (x1 , y1 ), . . . , (xn , yn ) is determined by
A^T A u = A^T y,
or, if A^T A is invertible,
u = (A^T A)^{-1} A^T y,
where
y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad u = \begin{pmatrix} a \\ b \end{pmatrix}.
397
Example 14.
Find the line of best fit for the data points (−1, 1), (1, 1), (2, 3) using
the method of least squares.
Solution:
398
399
400
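A MATLAB check of example 14 using the normal equations:

x = [-1; 1; 2];  y = [1; 1; 3];
A = [ones(3,1) x];      % design matrix [1 x]
u = (A'*A) \ (A'*y)     % a = 9/7, b = 4/7, so y = 9/7 + (4/7)x
% The backslash on the rectangular system, A \ y, returns the same least squares solution.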
7.5 Orthogonal matrices and symmetric matrices
Definition (Orthogonal matrix)
A real n × n matrix Q is orthogonal if Q^{-1} = Q^T .
Note:
It is enough to show that Q^T Q = In or QQ^T = In .
Theorem
A real n × n matrix is orthogonal if and only if its columns form an
orthonormal basis for Rn using the dot product.
401
Example 15.
Show that the following rotation matrix is orthogonal:
Q = \begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix}
Solution:
402
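The check amounts to one matrix product:
Q^T Q = \begin{pmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{pmatrix}\begin{pmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{pmatrix} = \begin{pmatrix} \cos^2 θ + \sin^2 θ & 0 \\ 0 & \sin^2 θ + \cos^2 θ \end{pmatrix} = I_2 .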
Geometric properties of orthogonal matrices
Theorem
If Q is an orthogonal n × n matrix then
1. Qu · Qv = u · v for all u, v ∈ Rn (with u, v regarded as column
vectors),
2. det(Q) = ±1.
Note:
◮ Property 1 says that orthogonal matrices preserve dot products.
Hence they also preserve Euclidean lengths and angles.
◮ Geometrically, orthogonal matrices represent rotations when
det(Q) = 1, and reflections composed with rotations when
det(Q) = −1.
403
Example 16.
Prove property 1: If Q is an orthogonal n × n matrix then
Qu · Qv = u · v for all u, v ∈ Rn regarded as column vectors.
Solution:
404
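The key identity, writing the dot product as a matrix product:
Qu · Qv = (Qu)^T (Qv) = u^T Q^T Q v = u^T In v = u^T v = u · v.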
Real symmetric matrices
Theorem (Orthogonal Diagonalisation)
Let A be an n × n real symmetric matrix. Then using the dot product
as the inner product on Rn :
1. all eigenvalues λi (i = 1, · · · , n) of A are real,
2. eigenvectors corresponding to distinct eigenvalues are orthogonal,
3. there is an orthonormal basis of eigenvectors {v1 , · · · , vn },
4. A is diagonalisable with
A = QDQ^{-1} = QDQ^T ,
where Q is the orthogonal matrix whose columns are the orthonormal eigenvectors and D = diag(λ1 , · · · , λn ).
405
Example 17.
Write the symmetric matrix
A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
in the form A = QDQ^{-1} where D is a diagonal matrix and Q is an orthogonal matrix.
Solution:
Step 1. Find the eigenvalues of A
406
Step 2. Find a unit eigenvector corresponding to each eigenvalue
407
408
Step 3. Form D and Q.
Note:
In the case of repeated eigenvalues, you may need to use the
Gram-Schmidt procedure in each eigenspace to find an orthonormal
basis of eigenvectors.
409
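MATLAB performs this orthogonal diagonalisation directly (the eigenvalues here are 0 and 2; eig may order them differently from a hand computation):

A = [1 -1; -1 1];
[Q, D] = eig(A)     % for symmetric input the columns of Q are orthonormal
Q*D*Q'              % reproduces A, since Q^{-1} = Q'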
Singular value decomposition (SVD)
A singular value decomposition of a real m × n matrix A is a factorisation
A = USV^T
where S is an m × n matrix with Sii ≥ 0 and Sij = 0 whenever i ≠ j.
410
Theorem
Every real m × n matrix A has a singular value decomposition
A = USV T
where
◮ U is an m × m real orthogonal matrix,
◮ V is an n × n real orthogonal matrix,
◮ S is an m × n matrix with Sii = σi ≥ 0 and Sij = 0 for i ≠ j,
◮ σi for 1 ≤ i ≤ min(m, n) are called the singular values of A.
411
Note:
◮ AAT is an m × m matrix and AT A is an n × n matrix.
◮ AAT and AT A are real symmetric matrices.
◮ Geometrically, U and V correspond to rotations (or rotations followed by reflections), so only S changes the lengths of vectors: it stretches by the factors σi > 0 and projects orthogonally in the directions where σi = 0.
412
How to find the singular value decomposition of A ∈ Mm,n
1. Find the eigenvalues λ1 ≥ λ2 ≥ · · · ≥ λn ≥ 0 of A^T A.
2. Find a corresponding orthonormal basis of eigenvectors {v1 , . . . , vn } for Rn , and set σi = √λi .
3. Let
ui = (1/σi ) Avi
for σi ≠ 0, say i = 1, . . . , r = rank(A).
4. Extend u1 , . . . , ur to an orthonormal basis {u1 , . . . , um } for Rm .
5. Then
A = USV T
where V = [v1 . . . vn ], U = [u1 . . . um ] and S has diagonal entries
Sii = σi for i = 1, . . . , min(m, n).
413
Example 18.
Let
A = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} and S = \begin{pmatrix} σ1 & 0 \\ 0 & σ2 \end{pmatrix}.
Find orthogonal matrices U and V , and the singular values σ1 , σ2 , such that
A = USV^T .
Solution:
Step 1. Find a matrix V
414
415
Step 2. Find a matrix U compatible with V using property 4 of theorem
416
417
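For comparison, MATLAB computes an SVD directly (it orders the singular values decreasingly, so here S = diag(1, 0)):

A = [0 -1; 0 0];
[U, S, V] = svd(A)
U*S*V'              % reproduces A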
7.6 Unitary matrices and Hermitian matrices
We now consider the complex analogues of orthogonal and real symmetric matrices. For a complex matrix A, write A^* = \overline{A}^T for its conjugate transpose (also called the adjoint of A).
Note:
If α ∈ C is a scalar, then the following properties hold whenever the
matrix products and sums exist:
(A∗ )∗ = A, (A + B)∗ = A∗ + B ∗ ,
(AB)∗ = B ∗ A∗ , (αA)∗ = ᾱA∗ .
418
Definition (Unitary matrix)
A unitary matrix is a complex n × n matrix U such that
U^{-1} = U^* = \overline{U}^T .
Note:
◮ To check if a matrix is unitary it suffices to check that U ∗ U = In or
UU ∗ = In .
◮ Any real orthogonal matrix is unitary if viewed as a complex matrix.
419
Example 19.
Show that
U = \frac{1}{\sqrt{2}} \begin{pmatrix} -i & i \\ 1 & 1 \end{pmatrix}
is a unitary matrix.
Solution:
420
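A one-line numerical check; in MATLAB the apostrophe is already the conjugate transpose:

U = [-1i 1i; 1 1]/sqrt(2);
U'*U                % the 2x2 identity, so U is unitary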
The Hermitian dot product on Cn can be written in matrix form:
⟨u, v⟩ = u1 v̄1 + u2 v̄2 + · · · + un v̄n = v̄^T u = v^* u,
where u and v are regarded as column vectors.
Theorem
If U is a unitary n × n matrix then
1. ⟨Uu, Uv⟩ = ⟨u, v⟩ for all u, v ∈ Cn ,
2. | det(U)| = 1,
where ⟨u, v⟩ is the Hermitian dot product on Cn .
421
Theorem
A complex n × n matrix is unitary if and only if its columns form an orthonormal basis for Cn using the Hermitian dot product.
Definition (Hermitian matrix)
A Hermitian matrix is a complex n × n matrix A such that
A^* = A.
Note:
Any real symmetric matrix is Hermitian if viewed as a complex matrix.
422
Example 20.
Are the following matrices Hermitian?
(a) A = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}
(b) B = \begin{pmatrix} i & 0 \\ 0 & i \end{pmatrix}
Solution:
423
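For reference, the answers follow from computing the conjugate transposes:
A^* = \overline{A}^T = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} = A, so A is Hermitian;
B^* = \begin{pmatrix} -i & 0 \\ 0 & -i \end{pmatrix} ≠ B, so B is not Hermitian.
Note that the diagonal entries of a Hermitian matrix must be real.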
Theorem
Let A be an n × n Hermitian matrix. Then using the Hermitian dot
product as inner product on Cn :
1. all eigenvalues λi (i = 1, · · · , n) of A are real,
2. eigenvectors corresponding to distinct eigenvalues are orthogonal,
3. there is an orthonormal basis of eigenvectors {v1 , · · · , vn },
4. A is diagonalisable with
A = UDU^{-1} = UDU^* ,
where U is the unitary matrix whose columns are the orthonormal eigenvectors and D = diag(λ1 , λ2 , . . . , λn ).
424
Example 21.
Write the Hermitian matrix
A = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}
in the form
A = UDU −1
where D is a diagonal matrix and U is a unitary matrix.
Solution:
Step 1. Find the eigenvalues of A
425
Step 2. Find a unit eigenvector corresponding to each eigenvalue
426
427
Step 3. Form D and U.
428
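The same steps can be checked in MATLAB (the eigenvalues are 0 and 2, possibly in a different order):

A = [1 1i; -1i 1];
[U, D] = eig(A)     % for Hermitian input the columns of U are orthonormal
U*D*U'              % reproduces A, since U^{-1} = U'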