Linear Guest (110 218)
5. Let $P_3^{\mathbb{R}}$ be the set of polynomials with real coefficients of degree three or less.

(a) Propose a definition of addition and scalar multiplication to make $P_3^{\mathbb{R}}$ a vector space.

(b) Identify the zero vector, and find the additive inverse for the vector $3-2x+x^2$.

(c) Show that $P_3^{\mathbb{R}}$ is not a vector space over $\mathbb{C}$. Propose a small change to the definition of $P_3^{\mathbb{R}}$ to make it a vector space over $\mathbb{C}$. (Hint: Every little symbol in the instructions for part (c) is important!)
9. Let V be a vector space and S any set. Show that the set $V^S$ of all functions $S\to V$ is a vector space. Hint: first decide upon a rule for adding functions whose outputs are vectors.
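One natural choice of rule is to add functions pointwise. The sketch below illustrates this with tuples standing in for vectors in V; the function names are mine, not the book's:

```python
# Pointwise operations on functions S -> V, with tuples of numbers as the vectors in V.
def add_functions(f, g):
    """(f + g)(s) := f(s) + g(s), adding the output vectors componentwise."""
    return lambda s: tuple(a + b for a, b in zip(f(s), g(s)))

def scale_function(c, f):
    """(c * f)(s) := c * f(s), scaling the output vector."""
    return lambda s: tuple(c * a for a in f(s))

# Two functions from S = {"p", "q"} to V = R^2.
f = lambda s: (1.0, 2.0) if s == "p" else (0.0, 1.0)
g = lambda s: (3.0, 4.0) if s == "p" else (5.0, 6.0)

h = add_functions(f, g)
print(h("p"))                      # (4.0, 6.0)
print(scale_function(2, f)("q"))   # (0.0, 2.0)
```

With this rule the zero vector of $V^S$ is the function sending every element of S to the zero vector of V.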
6 Linear Transformations
The main objects of study in any course in linear algebra are linear functions.
Remark We will often refer to linear functions by names like “linear map”, “linear operator” or “linear transformation”. In some contexts you will also see the name “homomorphism”, which generally is applied to functions from one kind of set to the same kind of set while respecting any structures on the sets; linear maps are from vector spaces to vector spaces that respect scalar multiplication and addition, the two structures on vector spaces. It is common to denote a linear function by a capital L as a reminder of its linearity, but sometimes we will use just f; after all, we are just studying very special functions.
The definition above coincides with the two part description in Chapter 1;
the case r = 1, s = 1 describes additivity, while s = 0 describes homogeneity.
We are now ready to learn the powerful consequences of linearity.
because by homogeneity
\[
L\begin{pmatrix}5\\0\end{pmatrix}=L\left(5\begin{pmatrix}1\\0\end{pmatrix}\right)=5L\begin{pmatrix}1\\0\end{pmatrix}=5\begin{pmatrix}5\\3\end{pmatrix}=\begin{pmatrix}25\\15\end{pmatrix},
\]
because by additivity
\[
L\begin{pmatrix}1\\1\end{pmatrix}=L\left(\begin{pmatrix}1\\0\end{pmatrix}+\begin{pmatrix}0\\1\end{pmatrix}\right)=L\begin{pmatrix}1\\0\end{pmatrix}+L\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}5\\3\end{pmatrix}+\begin{pmatrix}2\\2\end{pmatrix}=\begin{pmatrix}7\\5\end{pmatrix}.
\]
6.1 The Consequence of Linearity
we know how L acts on every vector from $\mathbb{R}^2$ by linearity based on just two pieces of information:
\[
L\begin{pmatrix}x\\y\end{pmatrix}=L\left(x\begin{pmatrix}1\\0\end{pmatrix}+y\begin{pmatrix}0\\1\end{pmatrix}\right)=xL\begin{pmatrix}1\\0\end{pmatrix}+yL\begin{pmatrix}0\\1\end{pmatrix}=x\begin{pmatrix}5\\3\end{pmatrix}+y\begin{pmatrix}2\\2\end{pmatrix}=\begin{pmatrix}5x+2y\\3x+2y\end{pmatrix}.
\]
Thus, the value of L at infinitely many inputs is completely specified by its value at just two inputs. (We can see now that L acts in exactly the same way as the matrix
\[
\begin{pmatrix}5&2\\3&2\end{pmatrix}
\]
acts on vectors.)
This is the reason that linear functions are so nice; they are secretly very simple functions by virtue of two characteristics:
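A quick numerical sketch (NumPy) of the two properties at work for this particular L:

```python
import numpy as np

# The matrix secretly implementing L, read off from L(e1) = (5,3) and L(e2) = (2,2).
A = np.array([[5, 2],
              [3, 2]])

def L(v):
    return A @ v

# Homogeneity: L(5 * e1) = 5 * L(e1) = (25, 15).
assert np.array_equal(L(np.array([5, 0])), np.array([25, 15]))
# Additivity: L(e1 + e2) = L(e1) + L(e2) = (7, 5).
assert np.array_equal(L(np.array([1, 1])), np.array([7, 5]))
# The general formula L(x, y) = (5x + 2y, 3x + 2y).
x, y = 2, -3
assert np.array_equal(L(np.array([x, y])), np.array([5*x + 2*y, 3*x + 2*y]))
```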
Example 71 Let
\[
V=\left\{c_1\begin{pmatrix}1\\1\\0\end{pmatrix}+c_2\begin{pmatrix}0\\1\\1\end{pmatrix}\;\middle|\;c_1,c_2\in\mathbb{R}\right\}.
\]
The domain of L is a plane and its range is the line through the origin in the $x_2$ direction.
It is not clear how to formulate L as a single matrix, since both
\[
L\begin{pmatrix}c_1\\c_1+c_2\\c_2\end{pmatrix}=\begin{pmatrix}0&0&0\\1&0&1\\0&0&0\end{pmatrix}\begin{pmatrix}c_1\\c_1+c_2\\c_2\end{pmatrix}=(c_1+c_2)\begin{pmatrix}0\\1\\0\end{pmatrix},
\]
or
\[
L\begin{pmatrix}c_1\\c_1+c_2\\c_2\end{pmatrix}=\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix}\begin{pmatrix}c_1\\c_1+c_2\\c_2\end{pmatrix}=(c_1+c_2)\begin{pmatrix}0\\1\\0\end{pmatrix},
\]
give the correct result.
6.3 Linear Differential Operators
line is 1 dimensional, but the careful definition of “dimension” takes some work; this is tackled in Chapter 11.) This leads us to write
\[
L\left[c_1\begin{pmatrix}1\\1\\0\end{pmatrix}+c_2\begin{pmatrix}0\\1\\1\end{pmatrix}\right]=c_1\begin{pmatrix}0\\1\\0\end{pmatrix}+c_2\begin{pmatrix}0\\1\\0\end{pmatrix}=\begin{pmatrix}0&0\\1&1\\0&0\end{pmatrix}\begin{pmatrix}c_1\\c_2\end{pmatrix}.
\]
This makes sense, but requires a warning: the matrix
\[
\begin{pmatrix}0&0\\1&1\\0&0\end{pmatrix}
\]
specifies L so long as you also provide the information that you are labeling points in the plane V by the two numbers $(c_1,c_2)$.
\[
L\begin{pmatrix}1\\1\end{pmatrix}=\begin{pmatrix}2\\4\end{pmatrix}\,,\quad\text{and}\quad L\begin{pmatrix}1\\-1\end{pmatrix}=\begin{pmatrix}6\\8\end{pmatrix}.
\]
This is because any vector in $\mathbb{R}^2$ is a sum of multiples of $\begin{pmatrix}1\\1\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$, which can be calculated via a linear systems problem as follows:
\[
\begin{pmatrix}x\\y\end{pmatrix}=a\begin{pmatrix}1\\1\end{pmatrix}+b\begin{pmatrix}1\\-1\end{pmatrix}
\]
\[
\Leftrightarrow\quad\begin{pmatrix}1&1\\1&-1\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix}=\begin{pmatrix}x\\y\end{pmatrix}
\]
\[
\Leftrightarrow\quad\left(\begin{array}{cc|c}1&1&x\\1&-1&y\end{array}\right)\sim\left(\begin{array}{cc|c}1&0&\frac{x+y}{2}\\[2pt]0&1&\frac{x-y}{2}\end{array}\right)
\]
\[
\Leftrightarrow\quad\left\{\begin{array}{l}a=\frac{x+y}{2}\\[4pt]b=\frac{x-y}{2}\,.\end{array}\right.
\]
Thus
\[
\begin{pmatrix}x\\y\end{pmatrix}=\frac{x+y}{2}\begin{pmatrix}1\\1\end{pmatrix}+\frac{x-y}{2}\begin{pmatrix}1\\-1\end{pmatrix}.
\]
We can then calculate how L acts on any vector by first expressing the vector as a
6.4 Bases (Take 1)
It should not surprise you to learn there are infinitely many pairs of
vectors from R2 with the property that any vector can be expressed as a
linear combination of them; any pair that when used as columns of a matrix
gives an invertible matrix works. Such a pair is called a basis for R2 .
Similarly, there are infinitely many triples of vectors with the property that any vector from $\mathbb{R}^3$ can be expressed as a linear combination of them: these are the triples that, used as the columns of a matrix, give an invertible matrix. Such a triple is called a basis for $\mathbb{R}^3$.
In a similar spirit, there are infinitely many pairs of vectors with the property that every vector in
\[
V=\left\{c_1\begin{pmatrix}1\\1\\0\end{pmatrix}+c_2\begin{pmatrix}0\\1\\1\end{pmatrix}\;\middle|\;c_1,c_2\in\mathbb{R}\right\}
\]
(valid for all vectors u, v and any scalar c) is equivalent to the single condition
\[
L(ru+sv)=rL(u)+sL(v)\,,\qquad(2)
\]
(for all vectors u, v and any scalars r and s). Your answer should have two parts. Show that (1) $\Rightarrow$ (2), and then show that (2) $\Rightarrow$ (1).
6.5 Review Problems
7 Matrices
Matrices are a powerful tool for calculations involving linear transformations.
It is important to understand how to find the matrix of a linear transforma-
tion and the properties of matrices.
Example 74 Let
\[
V=\left\{\begin{pmatrix}a&b\\c&d\end{pmatrix}\;\middle|\;a,b,c,d\in\mathbb{R}\right\}
\]
be the vector space of $2\times 2$ real matrices, with addition and scalar multiplication defined componentwise. One choice of basis is the ordered set (or list) of matrices
\[
B=\left(\begin{pmatrix}1&0\\0&0\end{pmatrix},\begin{pmatrix}0&1\\0&0\end{pmatrix},\begin{pmatrix}0&0\\1&0\end{pmatrix},\begin{pmatrix}0&0\\0&1\end{pmatrix}\right)=:(e_{11},e_{12},e_{21},e_{22})\,.
\]
Given a particular vector and a basis, your job is to write that vector as a sum of multiples of basis elements. Here an arbitrary vector $v\in V$ is just a matrix, so we write
\[
v=\begin{pmatrix}a&b\\c&d\end{pmatrix}=\begin{pmatrix}a&0\\0&0\end{pmatrix}+\begin{pmatrix}0&b\\0&0\end{pmatrix}+\begin{pmatrix}0&0\\c&0\end{pmatrix}+\begin{pmatrix}0&0\\0&d\end{pmatrix}
=a\begin{pmatrix}1&0\\0&0\end{pmatrix}+b\begin{pmatrix}0&1\\0&0\end{pmatrix}+c\begin{pmatrix}0&0\\1&0\end{pmatrix}+d\begin{pmatrix}0&0\\0&1\end{pmatrix}
=a\,e_{11}+b\,e_{12}+c\,e_{21}+d\,e_{22}\,.
\]
The coefficients $(a,b,c,d)$ of the basis vectors $(e_{11},e_{12},e_{21},e_{22})$ encode the information of which matrix the vector v is. We store them in a column vector by writing
\[
v=a\,e_{11}+b\,e_{12}+c\,e_{21}+d\,e_{22}=:(e_{11},e_{12},e_{21},e_{22})\begin{pmatrix}a\\b\\c\\d\end{pmatrix}=:\begin{pmatrix}a\\b\\c\\d\end{pmatrix}_B.
\]
The 4-vector $\begin{pmatrix}a\\b\\c\\d\end{pmatrix}\in\mathbb{R}^4$ encodes the vector $\begin{pmatrix}a&b\\c&d\end{pmatrix}\in V$ but is NOT equal to it! (After all, v is a matrix so could not equal a column vector.) Both notations on the right hand side of the above equation really stand for the vector obtained by multiplying the coefficients stored in the column vector by the corresponding basis element and then summing over them.
are called the standard basis vectors of $\mathbb{R}^2=\mathbb{R}^{\{1,2\}}$. Their description as functions of $\{1,2\}$ is
\[
e_1(k)=\begin{cases}1&\text{if }k=1\\0&\text{if }k=2\end{cases}\,,\qquad
e_2(k)=\begin{cases}0&\text{if }k=1\\1&\text{if }k=2\,.\end{cases}
\]
7.1 Linear Transformations and Matrices
It is natural to assign these the order: $e_1$ is first and $e_2$ is second. An arbitrary vector v of $\mathbb{R}^2$ can be written as
\[
v=\begin{pmatrix}x\\y\end{pmatrix}=xe_1+ye_2\,.
\]
To emphasize that we are using the standard basis we define the list (or ordered set)
\[
E=(e_1,e_2)\,,
\]
and write
\[
\begin{pmatrix}x\\y\end{pmatrix}_E:=(e_1,e_2)\begin{pmatrix}x\\y\end{pmatrix}:=xe_1+ye_2=v\,.
\]
You should read this equation by saying:
“The column vector of the vector v in the basis E is $\begin{pmatrix}x\\y\end{pmatrix}$.”
Again, the first notation of a column vector with a subscript E refers to the vector
obtained by multiplying each basis vector by the corresponding scalar listed in the
column and then summing these, i.e. xe1 + ye2 . The second notation denotes exactly
the same thing but we first list the basis elements and then the column vector; a
useful trick because this can be read in the same way as matrix multiplication of a row
vector times a column vector–except that the entries of the row vector are themselves
vectors!
You should now try to write down the standard basis vectors for $\mathbb{R}^n$ for other values of n and express an arbitrary vector in $\mathbb{R}^n$ in terms of them.
The last example probably seems pedantic because column vectors are al-
ready just ordered lists of numbers and the basis notation has simply allowed
us to “re-express” these as lists of numbers. Of course, this objection does
not apply to more complicated vector spaces like our first matrix example.
Moreover, as we saw earlier, there are infinitely many other pairs of vectors
in R2 that form a basis.
Let
\[
B=(b,\beta)
\]
be the ordered basis. Note that for an unordered set we use the $\{\}$ parentheses while for lists or ordered sets we use $()$.
As before we define
\[
\begin{pmatrix}x\\y\end{pmatrix}_B:=(b,\beta)\begin{pmatrix}x\\y\end{pmatrix}:=xb+y\beta\,.
\]
You might think that the numbers x and y denote exactly the same vector as in the previous example. However, they do not. Inserting the actual vectors that b and $\beta$ represent we have
\[
xb+y\beta=x\begin{pmatrix}1\\1\end{pmatrix}+y\begin{pmatrix}1\\-1\end{pmatrix}=\begin{pmatrix}x+y\\x-y\end{pmatrix}.
\]
Thus, to contrast, we have
\[
\begin{pmatrix}x\\y\end{pmatrix}_B=\begin{pmatrix}x+y\\x-y\end{pmatrix}\qquad\text{and}\qquad\begin{pmatrix}x\\y\end{pmatrix}_E=\begin{pmatrix}x\\y\end{pmatrix}.
\]
Only in the standard basis E does the column vector of v agree with the column vector that v actually is!
Based on the above example, you might think that our aim would be to find the “standard basis” for any problem. In fact, this is far from the truth. Notice, for example, that the vector
\[
v=\begin{pmatrix}1\\1\end{pmatrix}=e_1+e_2=b
\]
written in the standard basis E is just
\[
v=\begin{pmatrix}1\\1\end{pmatrix}_E\,,
\]
which was easy to calculate. But in the basis B we find
\[
v=\begin{pmatrix}1\\0\end{pmatrix}_B\,,
\]
which is actually a simpler column vector! The fact that there are many
bases for any given vector space allows us to choose a basis in which our
computation is easiest. In any case, the standard basis only makes sense for $\mathbb{R}^n$. Suppose your vector space was the set of solutions to a differential equation; what would a standard basis then be?
\[
B=(\sigma_x,\sigma_y,\sigma_z)\,,
\]
where
\[
\sigma_x=\begin{pmatrix}0&1\\1&0\end{pmatrix}\,,\qquad\sigma_y=\begin{pmatrix}0&-i\\i&0\end{pmatrix}\,,\qquad\sigma_z=\begin{pmatrix}1&0\\0&-1\end{pmatrix}.
\]
These three matrices are the famous Pauli matrices; they are used to describe electrons in quantum theory, or qubits in quantum computation. Let
\[
v=\begin{pmatrix}2+i&1+i\\3-i&-2-i\end{pmatrix}.
\]
This gives four equations, i.e. a linear systems problem, for the $\alpha$’s:
\[
\left\{\begin{array}{rcl}
\alpha^x-i\alpha^y&=&1+i\\
\alpha^x+i\alpha^y&=&3-i\\
\alpha^z&=&2+i\\
-\alpha^z&=&-2-i
\end{array}\right.
\]
with solution
\[
\alpha^x=2\,,\qquad \alpha^y=-1-i\,,\qquad \alpha^z=2+i\,.
\]
Thus
\[
v=\begin{pmatrix}2+i&1+i\\3-i&-2-i\end{pmatrix}=\begin{pmatrix}2\\-1-i\\2+i\end{pmatrix}_B.
\]
The numbers $(\alpha^1,\alpha^2,\ldots,\alpha^n)$ are called the components of the vector v. Two useful shorthand notations for this are
\[
v=\begin{pmatrix}\alpha^1\\\alpha^2\\\vdots\\\alpha^n\end{pmatrix}_B=(b_1,b_2,\ldots,b_n)\begin{pmatrix}\alpha^1\\\alpha^2\\\vdots\\\alpha^n\end{pmatrix}.
\]
\[
=(\ell_1,\ell_2,\ldots,\ell_j,\ldots)\begin{pmatrix}
m^1_1&m^1_2&\cdots&m^1_i&\cdots\\
m^2_1&m^2_2&\cdots&m^2_i&\cdots\\
\vdots&\vdots&&\vdots&\\
m^j_1&m^j_2&\cdots&m^j_i&\cdots\\
\vdots&\vdots&&\vdots&
\end{pmatrix}
\]
Example 79 Consider $L\colon V\to\mathbb{R}^3$ (as in example 71) defined by
\[
L\begin{pmatrix}1\\1\\0\end{pmatrix}=\begin{pmatrix}0\\1\\0\end{pmatrix}\,,\qquad L\begin{pmatrix}0\\1\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\end{pmatrix}.
\]
We had trouble expressing this linear operator as a matrix. Let's take input basis
\[
B=\left(\begin{pmatrix}1\\1\\0\end{pmatrix},\begin{pmatrix}0\\1\\1\end{pmatrix}\right)=:(b_1,b_2)\,,
\]
The matrix on the right is the matrix of L in these bases. More succinctly we could write
\[
L\begin{pmatrix}x\\y\end{pmatrix}_B=(x+y)\begin{pmatrix}0\\1\\0\end{pmatrix}_E
\]
and thus see that L acts like the matrix
\[
\begin{pmatrix}0&0\\1&1\\0&0\end{pmatrix}.
\]
Hence
\[
L\begin{pmatrix}x\\y\end{pmatrix}_B=\left(\begin{pmatrix}0&0\\1&1\\0&0\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}\right)_E\,;
\]
given input and output bases, the linear operator is now encoded by a matrix.
7.2 Review Problems
Example 80 Let's compute a matrix for the derivative operator acting on the vector space of polynomials of degree 2 or less:
\[
V=\{a_0\cdot 1+a_1x+a_2x^2\mid a_0,a_1,a_2\in\mathbb{R}\}\,,
\]
and
\[
\frac{d}{dx}\begin{pmatrix}a\\b\\c\end{pmatrix}_B=b\cdot 1+2cx+0x^2=\begin{pmatrix}b\\2c\\0\end{pmatrix}_B
\]
In the ordered basis B for both domain and range
\[
\frac{d}{dx}\;\mapsto\;\begin{pmatrix}0&1&0\\0&0&2\\0&0&0\end{pmatrix}
\]
Notice this last line makes no sense without explaining which bases we are using!
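The derivative matrix in the basis $(1, x, x^2)$ is easy to check numerically; a small NumPy sketch (the polynomial $3+5x+7x^2$ is an arbitrary choice):

```python
import numpy as np

# Matrix of d/dx on polynomials a + b*x + c*x^2 in the ordered basis B = (1, x, x^2),
# as computed in Example 80.
D = np.array([[0, 1, 0],
              [0, 0, 2],
              [0, 0, 0]])

# p(x) = 3 + 5x + 7x^2  ->  p'(x) = 5 + 14x.
p = np.array([3, 5, 7])
print(D @ p)   # [ 5 14  0]
```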
1. A door factory can buy supplies in two kinds of packages, f and g. The
package f contains 3 slabs of wood, 4 fasteners, and 6 brackets. The
package g contains 5 fasteners, 3 brackets, and 7 slabs of wood.
(a) Explain how to view the packages f and g as functions and list
their inputs and outputs.
(b) Choose an ordering for the 3 kinds of supplies and use this to
rewrite f and g as elements of R3 .
2. You are designing a simple keyboard synthesizer with two keys. If you
push the first key with intensity a then the speaker moves in time as
a sin(t). If you push the second key with intensity b then the speaker
moves in time as b sin(2t). If the keys are pressed simultaneously,
(a) describe the set of all sounds that come out of your synthesizer.
(Hint: Sounds can be “added”.)
(b) Graph the function $\begin{pmatrix}3\\5\end{pmatrix}\in\mathbb{R}^{\{1,2\}}$.

(c) Let $B=(\sin(t),\sin(2t))$. Explain why $\begin{pmatrix}3\\5\end{pmatrix}_B$ is not in $\mathbb{R}^{\{1,2\}}$ but is still a function.

(d) Graph the function $\begin{pmatrix}3\\5\end{pmatrix}_B$.
3. (a) Find the matrix for $\frac{d}{dx}$ acting on the vector space V of polynomials of degree 2 or less in the ordered basis $B=(x^2,x,1)$.

(b) Use the matrix from part (a) to rewrite the differential equation $\frac{d}{dx}p(x)=x$ as a matrix equation. Find all solutions of the matrix equation. Translate them into elements of V.
(c) Find the matrix for $\frac{d}{dx}$ acting on the vector space V in the ordered basis $B'=(x^2+x,\,x^2-x,\,1)$.

(d) Use the matrix from part (c) to rewrite the differential equation $\frac{d}{dx}p(x)=x$ as a matrix equation. Find all solutions of the matrix equation. Translate them into elements of V.

(e) Compare and contrast your results from parts (b) and (d).
4. Find the “matrix” for $\frac{d}{dx}$ acting on the vector space of all power series in the ordered basis $(1,x,x^2,x^3,\ldots)$. Use this matrix to find all power series solutions to the differential equation $\frac{d}{dx}f(x)=x$. Hint: your “matrix” may not have finite size.

5. Find the matrix for $\frac{d^2}{dx^2}$ acting on $\{c_1\cos(x)+c_2\sin(x)\mid c_1,c_2\in\mathbb{R}\}$ in the ordered basis $(\cos(x),\sin(x))$.
Find the

(k) first order polynomial function $F\colon\mathbb{R}^2\to\mathbb{R}$ whose graph contains $\left(\begin{pmatrix}0\\0\end{pmatrix},1\right)$, $\left(\begin{pmatrix}0\\1\end{pmatrix},2\right)$, $\left(\begin{pmatrix}1\\0\end{pmatrix},3\right)$, and $\left(\begin{pmatrix}1\\1\end{pmatrix},4\right)$.

(l) homogeneous first order polynomial function $H\colon\mathbb{R}^2\to\mathbb{R}$ whose graph contains $\left(\begin{pmatrix}0\\1\end{pmatrix},2\right)$, $\left(\begin{pmatrix}1\\0\end{pmatrix},3\right)$, and $\left(\begin{pmatrix}1\\1\end{pmatrix},4\right)$.
(m) second order polynomial function $J\colon\mathbb{R}^2\to\mathbb{R}$ whose graph contains $\left(\begin{pmatrix}0\\0\end{pmatrix},0\right)$, $\left(\begin{pmatrix}0\\1\end{pmatrix},2\right)$, $\left(\begin{pmatrix}0\\2\end{pmatrix},5\right)$, $\left(\begin{pmatrix}1\\0\end{pmatrix},3\right)$, $\left(\begin{pmatrix}2\\0\end{pmatrix},6\right)$, and $\left(\begin{pmatrix}1\\1\end{pmatrix},4\right)$.
(o) How many points in the graph of a q-th order polynomial function $\mathbb{R}^n\to\mathbb{R}^n$ would completely determine the function?

(p) In particular, how many points of the graph of a linear function $\mathbb{R}^n\to\mathbb{R}^n$ would completely determine the function? How does a matrix (in the standard basis) encode this information?

(q) Propose a way to store the information required in 8g above in an array of numbers.

(r) Propose a way to store the information required in 8o above in an array of numbers.
7.3 Properties of Matrices
The numbers $m^i_j$ are called entries. The superscript indexes the row of the matrix and the subscript indexes the column of the matrix in which $m^i_j$ appears.
\[
v=\begin{pmatrix}v^1&v^2&\cdots&v^k\end{pmatrix}.
\]
The transpose of a column vector is the corresponding row vector and vice
versa:
Example 81 Let
\[
v=\begin{pmatrix}1\\2\\3\end{pmatrix}.
\]
Then
\[
v^T=\begin{pmatrix}1&2&3\end{pmatrix}\,,
\]
and (v T )T = v. This is an example of an involution, namely an operation which when
performed twice does nothing.
Example 82 In computer graphics, you may have encountered image files with a .gif
extension. These files are actually just matrices: at the start of the file the size of the
matrix is given, after which each number is a matrix entry indicating the color of a
particular pixel in the image.
This matrix then has its rows shuffled a bit: by listing, say, every eighth row, a web
browser downloading the file can start displaying an incomplete version of the picture
before the download is complete.
Finally, a compression algorithm is applied to the matrix to reduce the file size.
For example, the graph pictured above would have the following matrix, where $m^i_j$ indicates the number of edges between the vertices labeled i and j:
\[
M=\begin{pmatrix}1&2&1&1\\2&0&1&0\\1&1&0&1\\1&0&1&3\end{pmatrix}
\]
Then
\[
MN=M\begin{pmatrix}|&|&&|\\N_1&N_2&\cdots&N_s\\|&|&&|\end{pmatrix}=\begin{pmatrix}|&|&&|\\MN_1&MN_2&\cdots&MN_s\\|&|&&|\end{pmatrix}
\]
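A quick NumPy check of this column-by-column picture (the small matrices here are chosen arbitrarily for illustration):

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[5, 6, 7],
              [8, 9, 10]])

MN = M @ N
# Column j of MN is M applied to column j of N.
for j in range(N.shape[1]):
    assert np.array_equal(MN[:, j], M @ N[:, j])
```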
\[
(r\times k)\ \text{times}\ (k\times m)\ \text{is}\ (r\times m)
\]
The entries of MN are made from the dot products of the rows of M with the columns of N.
Example 85 Let
\[
M=\begin{pmatrix}1&3\\3&5\\2&6\end{pmatrix}=:\begin{pmatrix}u^T\\v^T\\w^T\end{pmatrix}\qquad\text{and}\qquad N=\begin{pmatrix}2&3&1\\0&1&0\end{pmatrix}=:\begin{pmatrix}a&b&c\end{pmatrix}
\]
where
\[
u=\begin{pmatrix}1\\3\end{pmatrix},\ v=\begin{pmatrix}3\\5\end{pmatrix},\ w=\begin{pmatrix}2\\6\end{pmatrix},\ a=\begin{pmatrix}2\\0\end{pmatrix},\ b=\begin{pmatrix}3\\1\end{pmatrix},\ c=\begin{pmatrix}1\\0\end{pmatrix}.
\]
Then
\[
MN=\begin{pmatrix}u\cdot a&u\cdot b&u\cdot c\\v\cdot a&v\cdot b&v\cdot c\\w\cdot a&w\cdot b&w\cdot c\end{pmatrix}=\begin{pmatrix}2&6&1\\6&14&3\\4&12&2\end{pmatrix}.
\]
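The dot-product description of the entries can be verified directly; a NumPy sketch of Example 85:

```python
import numpy as np

M = np.array([[1, 3],
              [3, 5],
              [2, 6]])
N = np.array([[2, 3, 1],
              [0, 1, 0]])

# Entry (i, j) of MN is the dot product of row i of M with column j of N.
MN = np.array([[M[i] @ N[:, j] for j in range(3)] for i in range(3)])

assert np.array_equal(MN, M @ N)
print(MN)
```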
Mx = 0
Remark Remember that the set of all vectors that can be obtained by adding up scalar multiples of the columns of a matrix is called its column space. Similarly the row space is the set of all row vectors obtained by adding up multiples of the rows of a matrix. The above theorem says that if $Mx=0$, then the vector x is orthogonal to every vector in the row space of M.
This is the same as the rule we use to multiply matrices. In other words,
L(M ) = N M is a linear transformation.
Matrix Terminology Let $M=(m^i_j)$ be a matrix. The entries $m^i_i$ are called diagonal, and the set $\{m^1_1,m^2_2,\ldots\}$ is called the diagonal of the matrix.
Any $r\times r$ matrix is called a square matrix. A square matrix that is zero for all non-diagonal entries is called a diagonal matrix. An example of a square diagonal matrix is
\[
\begin{pmatrix}2&0&0\\0&3&0\\0&0&0\end{pmatrix}.
\]
\[
I_rM=MI_k=M
\]
\[
M^T=(\hat m^i_j)\quad\text{where}\quad \hat m^i_j=m^j_i
\]
A matrix M is symmetric if $M=M^T$.
Example 86
\[
\begin{pmatrix}2&5&6\\1&3&4\end{pmatrix}^T=\begin{pmatrix}2&1\\5&3\\6&4\end{pmatrix}\,,
\]
and
\[
\begin{pmatrix}2&5&6\\1&3&4\end{pmatrix}\begin{pmatrix}2&5&6\\1&3&4\end{pmatrix}^T=\begin{pmatrix}65&41\\41&26\end{pmatrix}\,,
\]
is symmetric.
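The fact that $AA^T$ is always symmetric is easy to check numerically for this example:

```python
import numpy as np

A = np.array([[2, 5, 6],
              [1, 3, 4]])
S = A @ A.T

# A A^T is symmetric: (A A^T)^T = A A^T.
assert np.array_equal(S, S.T)
print(S)   # the 2x2 symmetric matrix from Example 86
```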
Observations
\[
(MN)^T=N^TM^T.
\]
So first we compute
\[
(MN)R=\Big(\sum_{k=1}^{r}\Big[\sum_{j=1}^{n}m^i_jn^j_k\Big]r^k_l\Big)=\Big(\sum_{k=1}^{r}\sum_{j=1}^{n}\big[m^i_jn^j_k\big]r^k_l\Big)=\Big(\sum_{k=1}^{r}\sum_{j=1}^{n}m^i_jn^j_kr^k_l\Big).
\]
In the first step we just wrote out the definition for matrix multiplication, in the second step we moved the summation symbol outside the bracket (this is just the distributive property $x(y+z)=xy+xz$ for numbers) and in the last step we used the associativity property for real numbers to remove the square brackets. Exactly the same reasoning shows that
\[
M(NR)=\Big(\sum_{j=1}^{n}m^i_j\Big[\sum_{k=1}^{r}n^j_kr^k_l\Big]\Big)=\Big(\sum_{k=1}^{r}\sum_{j=1}^{n}m^i_j\big[n^j_kr^k_l\big]\Big)=\Big(\sum_{k=1}^{r}\sum_{j=1}^{n}m^i_jn^j_kr^k_l\Big).
\]
¹As a fun remark, note that Einstein would simply have written
\[
(MN)R=(m^i_jn^j_k)r^k_l=m^i_jn^j_kr^k_l=m^i_j(n^j_kr^k_l)=M(NR).
\]
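Both associativity and the transpose rule are easy to spot-check numerically; a sketch with random integer matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.integers(-5, 5, size=(2, 3))
N = rng.integers(-5, 5, size=(3, 4))
R = rng.integers(-5, 5, size=(4, 2))

# Associativity: (MN)R = M(NR).
assert np.array_equal((M @ N) @ R, M @ (N @ R))
# Transpose of a product reverses the order: (MN)^T = N^T M^T.
assert np.array_equal((M @ N).T, N.T @ M.T)
```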
\[
MN\neq NM.
\]
Do Matrices Commute?
rotates vectors in the plane by an angle $\theta$. We can generalize this, using block matrices, to three dimensions. In fact the following matrices built from a $2\times 2$ rotation matrix, a $1\times 1$ identity matrix and zeroes everywhere else
\[
M=\begin{pmatrix}\cos\theta&\sin\theta&0\\-\sin\theta&\cos\theta&0\\0&0&1\end{pmatrix}\qquad\text{and}\qquad N=\begin{pmatrix}1&0&0\\0&\cos\theta&\sin\theta\\0&-\sin\theta&\cos\theta\end{pmatrix}\,,
\]
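These two rotations give a concrete non-commuting pair; a numerical spot check (with $\theta=\pi/4$ chosen arbitrarily):

```python
import numpy as np

theta = np.pi / 4
c, s = np.cos(theta), np.sin(theta)

M = np.array([[c,  s, 0],
              [-s, c, 0],
              [0,  0, 1]])   # rotation in the xy-plane
N = np.array([[1, 0,  0],
              [0, c,  s],
              [0, -s, c]])   # rotation in the yz-plane

# Rotations about different axes do not commute.
assert not np.allclose(M @ N, N @ M)
```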
• The blocks of a block matrix must fit together to form a rectangle. So
\[
\begin{pmatrix}B&A\\D&C\end{pmatrix}
\]
makes sense, but
\[
\begin{pmatrix}C&B\\D&A\end{pmatrix}
\]
does not.
This is exactly M 2 .
to define
\[
M^0=I\,,
\]
the identity matrix, just like $x^0=1$ for numbers.
As a result, any polynomial can have square matrices in its domain.
Then
\[
M^2=\begin{pmatrix}1&2t\\0&1\end{pmatrix}\,,\qquad M^3=\begin{pmatrix}1&3t\\0&1\end{pmatrix}\,,\ \ldots
\]
and so
\[
f(M)=\begin{pmatrix}1&t\\0&1\end{pmatrix}-2\begin{pmatrix}1&2t\\0&1\end{pmatrix}+3\begin{pmatrix}1&3t\\0&1\end{pmatrix}
=\begin{pmatrix}2&6t\\0&2\end{pmatrix}.
\]
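Taking the polynomial to be $f(x)=x-2x^2+3x^3$ (as the displayed combination suggests; the original definition of f sits on an earlier page) and setting $t=1$, the computation can be reproduced numerically:

```python
import numpy as np

t = 1.0
M = np.array([[1.0, t],
              [0.0, 1.0]])

# f(x) = x - 2x^2 + 3x^3, applied to the matrix M.
f_M = M - 2 * np.linalg.matrix_power(M, 2) + 3 * np.linalg.matrix_power(M, 3)

# Equals [[2, 6], [0, 2]], i.e. (2, 6t; 0, 2) with t = 1.
assert np.allclose(f_M, [[2.0, 6.0], [0.0, 2.0]])
```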
7.3.4 Trace

A large matrix contains a great deal of information, some of which often reflects the fact that you have not set up your problem efficiently. For example, a clever choice of basis can often make the matrix of a linear transformation very simple. Therefore, finding ways to extract the essential information of a matrix is useful. Here we need to assume that $n<\infty$; otherwise there are subtleties with convergence that we'd have to address.
Definition The trace of a square matrix $M=(m^i_j)$ is the sum of its diagonal entries:
\[
\operatorname{tr}M=\sum_{i=1}^{n}m^i_i\,.
\]
Example 91
\[
\operatorname{tr}\begin{pmatrix}2&7&6\\9&5&1\\4&3&8\end{pmatrix}=2+5+8=15\,.
\]
\[
\begin{aligned}
\operatorname{tr}(MN)&=\operatorname{tr}\Big(\sum_l M^i_lN^l_j\Big)\\
&=\sum_i\sum_l M^i_lN^l_i\\
&=\sum_l\sum_i N^l_iM^i_l\\
&=\operatorname{tr}\Big(\sum_i N^l_iM^i_j\Big)\\
&=\operatorname{tr}(NM).
\end{aligned}
\]
Proof Explanation
Thus we have a Theorem:
Theorem 7.3.3. For any square matrices M and N
tr(M N ) = tr(N M ).
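A quick numerical check of the theorem (the random matrices here are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.integers(-9, 9, size=(3, 3))
N = rng.integers(-9, 9, size=(3, 3))

# The products differ in general, but their traces agree.
assert np.trace(M @ N) == np.trace(N @ M)
```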
\[
\operatorname{tr}M=\operatorname{tr}M^T
\]
This is true because the trace only uses the diagonal entries, which are fixed by the transpose. For example,
\[
\operatorname{tr}\begin{pmatrix}1&1\\2&3\end{pmatrix}=4=\operatorname{tr}\begin{pmatrix}1&2\\1&3\end{pmatrix}=\operatorname{tr}\begin{pmatrix}1&1\\2&3\end{pmatrix}^T.
\]
Finally, trace is a linear transformation from matrices to the real numbers.
This is easy to check.
7.4 Review Problems
\[
\begin{pmatrix}x&y&z\end{pmatrix}\begin{pmatrix}2&1&1\\1&2&1\\1&1&2\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}\,,\qquad
\begin{pmatrix}2&1&2&1&2\\0&2&1&2&1\\0&1&2&1&2\\0&2&1&2&1\\0&0&0&0&2\end{pmatrix}\begin{pmatrix}1&2&1&2&1\\0&1&2&1&2\\0&2&1&2&1\\0&1&2&1&2\\0&0&0&0&1\end{pmatrix}\,,
\]
\[
\begin{pmatrix}-2&\tfrac{4}{3}&-\tfrac{1}{3}\\[4pt]2&-\tfrac{5}{3}&\tfrac{2}{3}\\[4pt]-1&2&-1\end{pmatrix}
\begin{pmatrix}4&\tfrac{2}{3}&-\tfrac{2}{3}\\[4pt]6&\tfrac{5}{3}&-\tfrac{2}{3}\\[4pt]12&-\tfrac{16}{3}&\tfrac{10}{3}\end{pmatrix}
\begin{pmatrix}1&2&1\\4&5&2\\7&8&2\end{pmatrix}\,.
\]
(a) Let M = (mij ) and let N = (nij ). Write out a few of the entries of
each matrix in the form given at the beginning of section 7.3.
(b) Multiply out M N and write out a few of its entries in the same
form as in part (a). In terms of the entries of M and the entries
of N , what is the entry in row i and column j of M N ?
(c) Take the transpose (M N )T and write out a few of its entries in
the same form as in part (a). In terms of the entries of M and the
entries of N , what is the entry in row i and column j of (M N )T ?
(d) Take the transposes N T and M T and write out a few of their
entries in the same form as in part (a).
(e) Multiply out N T M T and write out a few of its entries in the same
form as in part a. In terms of the entries of M and the entries of
N , what is the entry in row i and column j of N T M T ?
(f) Show that the answers you got in parts (c) and (e) are the same.
3. (a) Let $A=\begin{pmatrix}1&2&0\\3&-1&4\end{pmatrix}$. Find $AA^T$ and $A^TA$ and their traces.
4. Let $x=\begin{pmatrix}x_1\\\vdots\\x_n\end{pmatrix}$ and $y=\begin{pmatrix}y_1\\\vdots\\y_n\end{pmatrix}$ be column vectors. Show that the dot product $x\cdot y=x^T I\,y$.
9. Let
\[
M=\begin{pmatrix}
1&0&0&0&0&0&0&1\\
0&1&0&0&0&0&1&0\\
0&0&1&0&0&1&0&0\\
0&0&0&1&1&0&0&0\\
0&0&0&0&2&1&0&0\\
0&0&0&0&0&2&0&0\\
0&0&0&0&0&0&3&1\\
0&0&0&0&0&0&0&3
\end{pmatrix}.
\]
Divide M into named blocks, with one block the $4\times 4$ identity matrix, and then multiply blocks to compute $M^2$.
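The block bookkeeping can be checked numerically; in the sketch below the block names I, J, P, Z are my own labels, not the book's:

```python
import numpy as np

I = np.eye(4, dtype=int)
J = np.fliplr(np.eye(4, dtype=int))   # the anti-diagonal block of ones
P = np.array([[2, 1, 0, 0],
              [0, 2, 0, 0],
              [0, 0, 3, 1],
              [0, 0, 0, 3]])
Z = np.zeros((4, 4), dtype=int)

M = np.block([[I, J],
              [Z, P]])

# Block multiplication: M^2 = [[I, J + J P], [0, P^2]].
M2_blocks = np.block([[I, J + J @ P],
                      [Z, np.linalg.matrix_power(P, 2)]])

assert np.array_equal(M @ M, M2_blocks)
```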
\[
(A^{-1})^{-1}=A.
\]
\[
(AB)^{-1}=B^{-1}A^{-1}
\]
7.5 Inverse Matrix
Figure 7.1: The formula for the inverse of a $2\times 2$ matrix is worth memorizing!
Thus, much like the transpose, taking the inverse of a product reverses the order of the product.
\[
(A^{-1})^T=(A^T)^{-1}
\]
$2\times 2$ Example
\[
MX=V_1\,,\qquad MX=V_2
\]
we can consider augmented matrices with many columns on the right and then apply Gaussian row reduction to the left side of the matrix. Once the identity matrix is on the left side of the augmented matrix, then the solution of each of the individual linear systems is on the right.
\[
\left(\,M\,\big|\,V_1\ V_2\,\right)\sim\left(\,I\,\big|\,M^{-1}V_1\ M^{-1}V_2\,\right)
\]
\[
\left(\,M\,\big|\,I\,\right)\sim\left(\,I\,\big|\,M^{-1}I\,\right)=\left(\,I\,\big|\,M^{-1}\,\right)
\]
Example 93 Find $\begin{pmatrix}1&2&3\\2&-1&0\\4&2&5\end{pmatrix}^{-1}$.
We start by writing the augmented matrix, then apply row reduction to the left side.
\[
\left(\begin{array}{ccc|ccc}1&2&3&1&0&0\\2&-1&0&0&1&0\\4&2&5&0&0&1\end{array}\right)\sim\left(\begin{array}{ccc|ccc}1&2&3&1&0&0\\0&-5&-6&-2&1&0\\0&-6&-7&-4&0&1\end{array}\right)
\]
\[
\sim\left(\begin{array}{ccc|ccc}1&0&\tfrac35&\tfrac15&\tfrac25&0\\0&1&\tfrac65&\tfrac25&-\tfrac15&0\\0&0&\tfrac15&-\tfrac85&-\tfrac65&1\end{array}\right)
\]
\[
\sim\left(\begin{array}{ccc|ccc}1&0&0&5&4&-3\\0&1&0&10&7&-6\\0&0&1&-8&-6&5\end{array}\right)
\]
At this point, we know $M^{-1}$ assuming we didn't goof up. However, row reduction is a lengthy and involved process with lots of room for arithmetic errors, so we should check our answer, by confirming that $MM^{-1}=I$ (or if you prefer $M^{-1}M=I$):
\[
MM^{-1}=\begin{pmatrix}1&2&3\\2&-1&0\\4&2&5\end{pmatrix}\begin{pmatrix}5&4&-3\\10&7&-6\\-8&-6&5\end{pmatrix}=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}
\]
The product of the two matrices is indeed the identity matrix, so we're done.
\[
\begin{array}{rcrcrcl}
x&+&2y&+&3z&=&1\\
2x&-&y&&&=&2\\
4x&+&2y&+&5z&=&0
\end{array}
\]
The associated matrix equation is $MX=\begin{pmatrix}1\\2\\0\end{pmatrix}$, where M is the same as in the previous example, so the system above is equivalent to the matrix equation
\[
\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}1&2&3\\2&-1&0\\4&2&5\end{pmatrix}^{-1}\begin{pmatrix}1\\2\\0\end{pmatrix}=\begin{pmatrix}5&4&-3\\10&7&-6\\-8&-6&5\end{pmatrix}\begin{pmatrix}1\\2\\0\end{pmatrix}=\begin{pmatrix}13\\24\\-20\end{pmatrix}.
\]
That is, the system is equivalent to the equation $\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}13\\24\\-20\end{pmatrix}$, and it is easy to see what the solution(s) to this equation are.
In summary, when $M^{-1}$ exists,
\[
Mx=v\quad\Leftrightarrow\quad x=M^{-1}v.
\]
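This workflow is easy to reproduce numerically; a NumPy sketch with a concrete 3×3 system:

```python
import numpy as np

M = np.array([[1, 2, 3],
              [2, -1, 0],
              [4, 2, 5]])
v = np.array([1, 2, 0])

M_inv = np.linalg.inv(M)
x = M_inv @ v

assert np.allclose(M_inv, [[5, 4, -3], [10, 7, -6], [-8, -6, 5]])
assert np.allclose(x, [13, 24, -20])
assert np.allclose(M @ x, v)   # the recovered x really solves Mx = v
```

In practice `np.linalg.solve(M, v)` is preferred over forming the inverse explicitly, but the inverse makes the equivalence above concrete.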
7.6 Review Problems
1. Find formulas for the inverses of the following matrices, when they are not singular:

(a) $\begin{pmatrix}1&a&b\\0&1&c\\0&0&1\end{pmatrix}$

(b) $\begin{pmatrix}a&b&c\\0&d&e\\0&0&f\end{pmatrix}$

When are these matrices singular?
When are these matrices singular?
2. Write down all $2\times 2$ bit matrices and decide which of them are singular. For those which are not singular, pair them with their inverse.
3. Let M be a square matrix. Explain why the following statements are
equivalent:
(a) M X = V has a unique solution for every column vector V .
(b) M is non-singular.
Hint: In general for problems like this, think about the key words:
First, suppose that there is some column vector V such that the equa-
tion M X = V has two distinct solutions. Show that M must be sin-
gular; that is, show that M can have no inverse.
Next, suppose that there is some column vector V such that the equa-
tion M X = V has no solutions. Show that M must be singular.
Finally, suppose that M is non-singular. Show that no matter what
the column vector V is, there is a unique solution to M X = V.
4. Left and Right Inverses: So far we have only talked about inverses of square matrices. This problem will explore the notion of a left and right inverse for a matrix that is not square. Let
\[
A=\begin{pmatrix}0&1&1\\1&1&0\end{pmatrix}
\]
(a) Compute:
i. $AA^T$,
ii. $(AA^T)^{-1}$,
iii. $B:=A^T(AA^T)^{-1}$
(b) Show that the matrix B above is a right inverse for A, i.e., verify that
\[
AB=I\,.
\]
(f) True or false: Left and right inverses are unique. If false give a
counterexample.
Hint
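The construction in problem 4 can be sketched numerically (a check of the recipe, not a substitute for doing the algebra by hand):

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 1, 0]])

# Candidate right inverse from the problem: B = A^T (A A^T)^{-1}.
B = A.T @ np.linalg.inv(A @ A.T)

assert np.allclose(A @ B, np.eye(2))       # B is a right inverse for A ...
assert not np.allclose(B @ A, np.eye(3))   # ... but not a left inverse.
```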
5. Show that if the range (remember that the range of a function is the set of all its outputs, not the codomain) of a $3\times 3$ matrix M (viewed as a function $\mathbb{R}^3\to\mathbb{R}^3$) is a plane then one of the columns is a sum of multiples of the other columns. Show that this relationship is preserved under EROs. Show, further, that the solutions to $Mx=0$ describe this relationship between the columns.
6. If M and N are square matrices of the same size such that $M^{-1}$ exists and $N^{-1}$ does not exist, does $(MN)^{-1}$ exist?
7.7 LU Redux
Certain matrices are easier to work with than others. In this section, we will see how to write any square² matrix M as the product of two simpler matrices. We will write
\[
M=LU\,,
\]
where:
• L is lower triangular. This means that all entries above the main diagonal are zero. In notation, $L=(l^i_j)$ with $l^i_j=0$ for all $j>i$.
\[
L=\begin{pmatrix}l^1_1&0&0&\cdots\\l^2_1&l^2_2&0&\cdots\\l^3_1&l^3_2&l^3_3&\cdots\\\vdots&\vdots&\vdots&\ddots\end{pmatrix}
\]
• U is upper triangular. This means that all entries below the main diagonal are zero. In notation, $U=(u^i_j)$ with $u^i_j=0$ for all $j<i$.
\[
U=\begin{pmatrix}u^1_1&u^1_2&u^1_3&\cdots\\0&u^2_2&u^2_3&\cdots\\0&0&u^3_3&\cdots\\\vdots&\vdots&\vdots&\ddots\end{pmatrix}
\]
M = LU is called an LU decomposition of M .
This is a useful trick for computational reasons; it is much easier to com-
pute the inverse of an upper or lower triangular matrix than general matrices.
Since inverses are useful for solving linear systems, this makes solving any lin-
ear system associated to the matrix much faster as well. The determinant—a
very important quantity associated with any square matrix—is very easy to
compute for triangular matrices.
Example 96 Linear systems associated to upper triangular matrices are very easy to solve by back substitution.
\[
\left(\begin{array}{cc|c}a&b&1\\0&c&e\end{array}\right)\quad\Rightarrow\quad y=\frac{e}{c}\,,\qquad x=\frac{1}{a}\Big(1-\frac{be}{c}\Big)
\]
²The case where M is not square is dealt with at the end of the section.
\[
\left(\begin{array}{ccc|c}1&0&0&d\\a&1&0&e\\b&c&1&f\end{array}\right)\quad\Rightarrow\quad
\left\{\begin{array}{l}x=d\\y=e-ax\\z=f-bx-cy\end{array}\right.
\quad\Rightarrow\quad
\left\{\begin{array}{l}x=d\\y=e-ad\\z=f-bd-c(e-ad)\,.\end{array}\right.
\]
For lower triangular matrices, forward substitution gives a quick solution; for upper triangular matrices, back substitution gives the solution.
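The two substitution schemes can be sketched in a few lines of Python (the function names are mine):

```python
def forward_substitute(L, v):
    """Solve L x = v for lower triangular L (nonzero diagonal)."""
    n = len(v)
    x = [0.0] * n
    for i in range(n):
        x[i] = (v[i] - sum(L[i][j] * x[j] for j in range(i))) / L[i][i]
    return x

def back_substitute(U, v):
    """Solve U x = v for upper triangular U (nonzero diagonal)."""
    n = len(v)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (v[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# Forward substitution reproduces x = d, y = e - ad, z = f - bd - c(e - ad).
a, b, c = 2.0, 3.0, 4.0
d, e, f = 1.0, 2.0, 3.0
L = [[1.0, 0.0, 0.0], [a, 1.0, 0.0], [b, c, 1.0]]
x, y, z = forward_substitute(L, [d, e, f])
assert (x, y, z) == (d, e - a*d, f - b*d - c*(e - a*d))
```

A back-substitution example: `back_substitute([[2.0, 1.0], [0.0, 4.0]], [3.0, 8.0])` returns `[0.5, 2.0]`.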
• Step 1: Set $W=\begin{pmatrix}u\\v\\w\end{pmatrix}=UX$.
\[
\begin{pmatrix}3&0&0\\1&6&0\\2&3&1\end{pmatrix}\begin{pmatrix}u\\v\\w\end{pmatrix}=\begin{pmatrix}3\\19\\0\end{pmatrix}
\]
Using an LU decomposition
We would like to use the first row of M to zero out the first entry of every row below it. For our running example,
\[
M=\begin{pmatrix}6&18&3\\2&12&1\\4&15&3\end{pmatrix}\,,
\]
Moreover it is obtained by recording minus the constants used for all our row operations in the appropriate columns (this always works this way). Moreover, $U_2$ is upper triangular and $M=L_2U_2$, so we are done! Putting this all together we have
\[
M=\begin{pmatrix}6&18&3\\2&12&1\\4&15&3\end{pmatrix}=\begin{pmatrix}1&0&0\\[2pt]\tfrac13&1&0\\[2pt]\tfrac23&\tfrac12&1\end{pmatrix}\begin{pmatrix}6&18&3\\0&6&0\\0&0&1\end{pmatrix}.
\]
If the matrix you’re working with has more than three rows, just continue
this process by zeroing out the next column below the diagonal, and repeat
until there’s nothing left to do.
The fractions in the L matrix are admittedly ugly. For two matrices L and U, we can multiply one entire column of L by a constant and divide the corresponding row of U by the same constant without changing the product of the two matrices. Then:
\[
LU=\begin{pmatrix}1&0&0\\[2pt]\tfrac13&1&0\\[2pt]\tfrac23&\tfrac12&1\end{pmatrix}I\begin{pmatrix}6&18&3\\0&6&0\\0&0&1\end{pmatrix}
\]
\[
=\begin{pmatrix}1&0&0\\[2pt]\tfrac13&1&0\\[2pt]\tfrac23&\tfrac12&1\end{pmatrix}\begin{pmatrix}3&0&0\\0&6&0\\0&0&1\end{pmatrix}\begin{pmatrix}\tfrac13&0&0\\[2pt]0&\tfrac16&0\\[2pt]0&0&1\end{pmatrix}\begin{pmatrix}6&18&3\\0&6&0\\0&0&1\end{pmatrix}
=\begin{pmatrix}3&0&0\\1&6&0\\2&3&1\end{pmatrix}\begin{pmatrix}2&6&1\\0&1&0\\0&0&1\end{pmatrix}.
\]
The resulting matrix looks nicer, but isn't in standard (lower unit triangular matrix) form.
For matrices that are not square, LU decomposition still makes sense.
Given an m ⇥ n matrix M , for example we could write M = LU with L
a square lower unit triangular matrix, and U a rectangular matrix. Then
L will be an m ⇥ m matrix, and U will be an m ⇥ n matrix (of the same
shape as M ). From here, the process is exactly the same as for a square
matrix. We create a sequence of matrices Li and Ui that is eventually the
LU decomposition. Again, we start with L0 = I and U0 = M .
Example 99 Let's find the LU decomposition of $M=U_0=\begin{pmatrix}2&1&3\\4&4&1\end{pmatrix}$. Since M is a $2\times 3$ matrix, our decomposition will consist of a $2\times 2$ matrix and a $2\times 3$ matrix. Then we start with $L_0=I_2=\begin{pmatrix}1&0\\0&1\end{pmatrix}$.
The next step is to zero out the first column of M below the diagonal. There is only one row to cancel, and it can be removed by subtracting 2 times the first row of M from the second row of M. Then:
\[
L_1=\begin{pmatrix}1&0\\2&1\end{pmatrix}\,,\qquad U_1=\begin{pmatrix}2&1&3\\0&2&-5\end{pmatrix}
\]
Since $U_1$ is upper triangular, we're done. With a larger matrix, we would just continue the process.
Then:
\[
M=\begin{pmatrix}I&0\\ZX^{-1}&I\end{pmatrix}\begin{pmatrix}X&0\\0&W-ZX^{-1}Y\end{pmatrix}\begin{pmatrix}I&X^{-1}Y\\0&I\end{pmatrix}.
\]
By multiplying the diagonal matrix by the upper triangular matrix, we get the standard
LU decomposition of the matrix.
\[
\begin{array}{rcl}
x_1&=&v_1\\
l^2_1x_1+x_2&=&v_2\\
&\vdots&\\
l^n_1x_1+l^n_2x_2+\cdots+x_n&=&v_n
\end{array}
\]
(i) Find $x_1$.
(ii) Find $x_2$.
(iii) Find $x_3$.
7.8 Review Problems
4. Describe what upper and lower triangular matrices do to the unit hy-
percube in their domain.
7. If M is invertible then what are the LU, LDU, and LDPU decompositions of $M^T$ in terms of the decompositions for M? Can you do the same for $M^{-1}$?
8 Determinants
Given a square matrix, is there an easy way to know when it is invertible?
Answering this fundamental question is the goal of this chapter.
then
\[
M^{-1}=\frac{1}{m^1_1m^2_2-m^1_2m^2_1}\begin{pmatrix}m^2_2&-m^1_2\\-m^2_1&m^1_1\end{pmatrix}.
\]
Thus M is invertible if and only if
\[
\det M=m^1_1m^2_2m^3_3-m^1_1m^2_3m^3_2+m^1_2m^2_3m^3_1-m^1_2m^2_1m^3_3+m^1_3m^2_1m^3_2-m^1_3m^2_2m^3_1\neq 0\,.
\]
Notice that in the subscripts, each ordering of the numbers 1, 2, and 3 occurs exactly once. Each of these is a permutation of the set $\{1,2,3\}$.
8.1.2 Permutations

Consider n objects labeled 1 through n and shuffle them. Each possible shuffle is called a permutation. For example, here is an example of a permutation of 1–5:
\[
\sigma=\begin{bmatrix}1&2&3&4&5\\4&2&5&1&3\end{bmatrix}
\]
but since the top line of any permutation is always the same, we can omit it and just write
\[
\sigma=\begin{bmatrix}\sigma(1)&\sigma(2)&\sigma(3)&\sigma(4)&\sigma(5)\end{bmatrix}
\]
and so our example becomes simply $\sigma=[4\ 2\ 5\ 1\ 3]$.
The mathematics of permutations is extensive; there are a few key properties of permutations that we'll need:
Permutation Example
\[
\det M=\sum_\sigma\operatorname{sgn}(\sigma)\,m^1_{\sigma(1)}m^2_{\sigma(2)}\cdots m^n_{\sigma(n)}\,.
\]
The sum is over all permutations of n objects; a sum over all elements of $\{\sigma:\{1,\ldots,n\}\to\{1,\ldots,n\}\}$. Each summand is a product of n entries from the matrix with each factor from a different row. In different terms of the sum the column numbers are shuffled by different permutations $\sigma$.
Example 102 Because there are many permutations of n objects, writing the determinant this way for a general matrix gives a very long sum. For n = 4 there are 24 = 4! permutations, and for n = 5 there are already 120 = 5! permutations.

For a 4 × 4 matrix,
$$M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 & m^1_4 \\ m^2_1 & m^2_2 & m^2_3 & m^2_4 \\ m^3_1 & m^3_2 & m^3_3 & m^3_4 \\ m^4_1 & m^4_2 & m^4_3 & m^4_4 \end{pmatrix},$$
then det M is:
$$\begin{aligned}
\det M = {}& m^1_1 m^2_2 m^3_3 m^4_4 - m^1_1 m^2_3 m^3_2 m^4_4 - m^1_1 m^2_2 m^3_4 m^4_3 \\
& - m^1_2 m^2_1 m^3_3 m^4_4 + m^1_1 m^2_3 m^3_4 m^4_2 + m^1_1 m^2_4 m^3_2 m^4_3 \\
& + m^1_2 m^2_3 m^3_1 m^4_4 + m^1_2 m^2_1 m^3_4 m^4_3 \pm 16 \text{ more terms.}
\end{aligned}$$
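For larger matrices the permutation sum can be evaluated mechanically. A hedged sketch (illustrative only, not from the text) that enumerates all n! permutations and computes each sign by counting inversions:

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    # Parity of a permutation: +1 for an even number of inversions, -1 for odd.
    inv = sum(1 for i in range(len(sigma))
                for j in range(i + 1, len(sigma)) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def det(M):
    # det M = sum over permutations sigma of sgn(sigma) * m^1_sigma(1) ... m^n_sigma(n)
    n = len(M)
    return sum(sgn(s) * prod(M[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

print(det([[1, 2], [3, 4]]))  # -2
```

With n = 4 this already sums 24 terms, and with n = 5, 120 terms, so the formula is best viewed as a definition to reason with, not an efficient algorithm.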
Since the identity matrix is diagonal with all diagonal entries equal to one,
we have
det I = 1.
We would like to use the determinant to decide whether a matrix is invertible. Previously, we computed the inverse of a matrix by applying row operations. Therefore we ask what happens to the determinant when row operations are applied to a matrix.
Swapping rows Let's swap rows i and j of a matrix M and then compute its determinant. For the permutation σ, let σ̂ be the permutation obtained by swapping positions i and j. Clearly
$$\operatorname{sgn}(\hat\sigma) = -\operatorname{sgn}(\sigma)\,.$$
Let M′ be the matrix M with rows i and j swapped. Then (assuming i < j):
$$\begin{aligned}
\det M' &= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^j_{\sigma(i)} \cdots m^i_{\sigma(j)} \cdots m^n_{\sigma(n)} \\
&= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(j)} \cdots m^j_{\sigma(i)} \cdots m^n_{\sigma(n)} \\
&= \sum_\sigma \bigl(-\operatorname{sgn}(\hat\sigma)\bigr)\, m^1_{\hat\sigma(1)} \cdots m^i_{\hat\sigma(i)} \cdots m^j_{\hat\sigma(j)} \cdots m^n_{\hat\sigma(n)} \\
&= -\sum_{\hat\sigma} \operatorname{sgn}(\hat\sigma)\, m^1_{\hat\sigma(1)} \cdots m^i_{\hat\sigma(i)} \cdots m^j_{\hat\sigma(j)} \cdots m^n_{\hat\sigma(n)} \\
&= -\det M.
\end{aligned}$$
The step replacing $\sum_\sigma$ by $\sum_{\hat\sigma}$ often causes confusion; it holds since we sum over all permutations (see review problem 3). Thus we see that swapping rows changes the sign of the determinant. I.e.,
$$\det M' = -\det M\,.$$
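A quick numerical check of the sign flip, as a sketch (`det3` is just the six-term 3 × 3 formula from earlier in the chapter):

```python
def det3(m):
    # Six-term formula for a 3x3 determinant.
    return (m[0][0] * m[1][1] * m[2][2] - m[0][0] * m[1][2] * m[2][1]
          + m[0][1] * m[1][2] * m[2][0] - m[0][1] * m[1][0] * m[2][2]
          + m[0][2] * m[1][0] * m[2][1] - m[0][2] * m[1][1] * m[2][0])

M = [[1, 2, 3], [4, 0, 0], [7, 8, 9]]
M_swapped = [M[1], M[0], M[2]]        # rows 1 and 2 interchanged

print(det3(M), det3(M_swapped))       # 24 -24: same magnitude, opposite sign
```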
$$\det E^i_j = -1\,,$$
where the matrix $E^i_j$ is the identity matrix with rows i and j swapped. It is a row swap elementary matrix.
This implies another nice property of the determinant. If two rows of a matrix are identical, then swapping them changes the sign of the determinant, but leaves the matrix (and hence its determinant) unchanged. Then det M = −det M, and we see the following: any matrix with two identical rows has determinant zero.
8.2 Elementary Matrices and Determinants
Now we know that swapping a pair of rows flips the sign of the determinant, so det M′ = −det M. But det E^i_j = −1 and M′ = E^i_j M, so
$$\det E^i_j M = \det E^i_j \det M\,.$$
This result hints at a general rule for determinants of products of matrices.
The next row operation is multiplying a row by a scalar. Consider $M' = R^i(\lambda)M$, where $R^i(\lambda)$ multiplies the ith row of M by λ. Then:
$$\begin{aligned}
\det M' &= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots \bigl(\lambda m^i_{\sigma(i)}\bigr) \cdots m^n_{\sigma(n)} \\
&= \lambda \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(i)} \cdots m^n_{\sigma(n)} \\
&= \lambda \det M
\end{aligned}$$
Thus,
$$\det R^i(\lambda) M = \lambda \det M\,.$$
$$\det R^i(\lambda) = \det \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} = \lambda\,,$$
$$S^i_j(\mu) = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & \mu & \\ & & & \ddots & & \\ & & & & 1 & \\ & & & & & \ddots \\ & & & & & & 1 \end{pmatrix},$$
the identity matrix with an extra entry μ in row i, column j (here i < j).
Then multiplying M by $S^i_j(\mu)$ performs a row addition:
$$S^i_j(\mu) \begin{pmatrix} \vdots \\ R^i \\ \vdots \\ R^j \\ \vdots \end{pmatrix} = \begin{pmatrix} \vdots \\ R^i + \mu R^j \\ \vdots \\ R^j \\ \vdots \end{pmatrix}.$$
What is the effect of multiplying by $S^i_j(\mu)$ on the determinant? Let $M' = S^i_j(\mu)M$, and let M″ be the matrix M but with $R^i$ replaced by $R^j$. Then
$$\begin{aligned}
\det M' &= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots \bigl(m^i_{\sigma(i)} + \mu\, m^j_{\sigma(i)}\bigr) \cdots m^n_{\sigma(n)} \\
&= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots m^i_{\sigma(i)} \cdots m^n_{\sigma(n)} \\
&\quad + \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} \cdots \mu\, m^j_{\sigma(i)} \cdots m^j_{\sigma(j)} \cdots m^n_{\sigma(n)} \\
&= \det M + \mu \det M''
\end{aligned}$$
Since M″ has two identical rows, det M″ = 0, and so
$$\det M' = \det M\,,$$
Figure 8.4: Adding one row to another leaves the determinant unchanged.
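The same check for row addition, as a sketch (reusing the six-term 3 × 3 determinant; the matrix and the value of µ are arbitrary choices):

```python
def det3(m):
    # Six-term formula for a 3x3 determinant.
    return (m[0][0] * m[1][1] * m[2][2] - m[0][0] * m[1][2] * m[2][1]
          + m[0][1] * m[1][2] * m[2][0] - m[0][1] * m[1][0] * m[2][2]
          + m[0][2] * m[1][0] * m[2][1] - m[0][2] * m[1][1] * m[2][0])

M = [[3, -1, -1], [1, 2, 0], [0, 1, 1]]
mu = 7
# Row operation R2 -> R2 + mu * R3 (adding a multiple of one row to another).
M_added = [M[0], [a + mu * b for a, b in zip(M[1], M[2])], M[2]]

print(det3(M), det3(M_added))  # 6 6: the determinant is unchanged
```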
We have seen that any matrix M can be put into reduced row echelon form
via a sequence of row operations, and we have seen that any row operation can
be achieved via left matrix multiplication by an elementary matrix. Suppose
that RREF(M ) is the reduced row echelon form of M . Then
RREF(M ) = E1 E2 · · · Ek M ,
180
8.2 Elementary Matrices and Determinants 181
Corollary 8.2.3. Any elementary matrix $E^i_j$, $R^i(\lambda)$, $S^i_j(\mu)$ is invertible, except for $R^i(0)$. In fact, the inverse of an elementary matrix is another elementary matrix.
To obtain one last important result, suppose that M and N are square
n ⇥ n matrices, with reduced row echelon forms such that, for elementary
matrices Ei and Fi ,
M = E1 E2 · · · Ek RREF(M ) ,
and
N = F1 F2 · · · Fl RREF(N ) .
If RREF(M ) is the identity matrix (i.e., M is invertible), then:
1. Let
$$M = \begin{pmatrix} m^1_1 & m^1_2 & m^1_3 \\ m^2_1 & m^2_2 & m^2_3 \\ m^3_1 & m^3_2 & m^3_3 \end{pmatrix}.$$
8.3 Review Problems
Use row operations to put M into row echelon form. For simplicity, assume that $m^1_1 \neq 0 \neq m^1_1 m^2_2 - m^2_1 m^1_2$.
Prove that M is non-singular if and only if:
$$m^1_1 m^2_2 m^3_3 - m^1_1 m^2_3 m^3_2 + m^1_2 m^2_3 m^3_1 - m^1_2 m^2_1 m^3_3 + m^1_3 m^2_1 m^3_2 - m^1_3 m^2_2 m^3_1 \neq 0$$
2. (a) What does the matrix $E^1_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ do to $M = \begin{pmatrix} a & b \\ d & c \end{pmatrix}$ under left multiplication? What about right multiplication?
(b) Find elementary matrices $R^1(\lambda)$ and $R^2(\lambda)$ that respectively multiply rows 1 and 2 of M by λ but otherwise leave M the same under left multiplication.
(c) Find a matrix $S^1_2(\lambda)$ that adds a multiple λ of row 2 to row 1 under left multiplication.
3. Let σ̂ denote the permutation obtained from σ by transposing the first two outputs, i.e. σ̂(1) = σ(2) and σ̂(2) = σ(1). Suppose the function f : {1, 2, 3, 4} → R. Write out explicitly the following two sums:
$$\sum_\sigma f\bigl(\sigma(s)\bigr) \quad\text{and}\quad \sum_\sigma f\bigl(\hat\sigma(s)\bigr)\,.$$
What do you observe? Now write a brief explanation why the following equality holds
$$\sum_\sigma F(\sigma) = \sum_\sigma F(\hat\sigma)\,,$$
(a) det M.
(b) det N.
(c) det(MN).
(d) det M det N.
(e) det(M⁻¹) assuming ad − bc ≠ 0.
(f) det(Mᵀ).
(g) det(M + N) − (det M + det N). Is the determinant a linear transformation from square matrices to real numbers? Explain.
10. Suppose $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. Write M as a product of elementary row matrices times RREF(M).
11. Find the inverses of each of the elementary matrices, $E^i_j$, $R^i(\lambda)$, $S^i_j(\lambda)$.
Make sure to show that the elementary matrix times its inverse is ac-
tually the identity.
12. Let $e^i_j$ denote the matrix with a 1 in the i-th row and j-th column and 0's everywhere else, and let A be an arbitrary 2 × 2 matrix. Compute det(A + tI₂). What is the first order term (the t¹ term)? Can you
express your results in terms of tr(A)? What about the first order term
in det(A + tIn ) for any arbitrary n ⇥ n matrix A in terms of tr(A)?
Note that the result of det(A + tI2 ) is a polynomial in the variable t
known as the characteristic polynomial.
(a) $\displaystyle\lim_{t\to 0} \frac{\det(I_2 + t\,e^i_j) - \det(I_2)}{t}$
(b) $\displaystyle\lim_{t\to 0} \frac{\det(I_3 + t\,e^i_j) - \det(I_3)}{t}$
(c) $\displaystyle\lim_{t\to 0} \frac{\det(I_n + t\,e^i_j) - \det(I_n)}{t}$
(d) $\displaystyle\lim_{t\to 0} \frac{\det(I_n + At) - \det(I_n)}{t}$
Note, these are the directional derivatives in the $e^i_j$ and A directions.
$$\begin{aligned}
\det M &= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)} \\
&= m^1_1 \sum_{\sigma/1} \operatorname{sgn}(\sigma/1)\, m^2_{\sigma/1(2)} \cdots m^n_{\sigma/1(n)} \\
&\quad + m^1_2 \sum_{\sigma/2} \operatorname{sgn}(\sigma/2)\, m^2_{\sigma/2(1)} m^3_{\sigma/2(3)} \cdots m^n_{\sigma/2(n)} \\
&\quad + m^1_3 \sum_{\sigma/3} \operatorname{sgn}(\sigma/3)\, m^2_{\sigma/3(1)} m^3_{\sigma/3(2)} m^4_{\sigma/3(4)} \cdots m^n_{\sigma/3(n)} \\
&\quad + \cdots
\end{aligned}$$
Here the symbol σ/k refers to the permutation σ with the input k removed. The summand on the jth line of the above formula looks like the determinant of the minor obtained by removing the first row and jth column of M. However, we still need to replace the sum over σ/j by a sum over permutations of the column numbers of the matrix entries of this minor. This costs a minus sign whenever j − 1 is odd. In other words, to expand by minors we pick an entry $m^1_j$ of the first row, then add $(-1)^{j-1}$ times the determinant of the matrix with row 1 and column j deleted. An example will probably help:
8.4 Properties of the Determinant
Expanding along the first row of $M = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$ (whose entries can be read off from the minors below):
$$\begin{aligned}
\det M &= 1 \det \begin{pmatrix} 5 & 6 \\ 8 & 9 \end{pmatrix} - 2 \det \begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix} + 3 \det \begin{pmatrix} 4 & 5 \\ 7 & 8 \end{pmatrix} \\
&= 1(5 \cdot 9 - 8 \cdot 6) - 2(4 \cdot 9 - 7 \cdot 6) + 3(4 \cdot 8 - 7 \cdot 5) \\
&= 0
\end{aligned}$$
$$\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 0 & 0 \\ 7 & 8 & 9 \end{pmatrix} = -\det \begin{pmatrix} 4 & 0 & 0 \\ 1 & 2 & 3 \\ 7 & 8 & 9 \end{pmatrix} = -4 \det \begin{pmatrix} 2 & 3 \\ 8 & 9 \end{pmatrix} = 24$$
Since we know how the determinant of a matrix changes when you perform
row operations, it is often very beneficial to perform row operations before
computing the determinant by brute force.
¹ A fun exercise is to compute the determinant of a 4 × 4 matrix filled in order, from left to right, with the numbers 1, 2, 3, . . . , 16. What do you observe? Try the same for a 5 × 5 matrix with 1, 2, 3, . . . , 25. Is there a pattern? Can you explain it?
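Expansion by minors also gives a compact recursive algorithm for any n (a sketch, not efficient for large n), which can be used to try the footnote's experiment:

```python
def det(M):
    # Laplace expansion along the first row: pick entry m^1_j and add
    # (-1)**(j-1) times the determinant of the minor with row 1 and
    # column j deleted (here j is zero-indexed, so the sign is (-1)**j).
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

# The footnote's experiment: a 4x4 matrix filled with 1, 2, ..., 16.
M4 = [[4 * i + j + 1 for j in range(4)] for i in range(4)]
print(det(M4))
```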
Example 105
$$\det \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 3 & 3 \\ 6 & 6 & 6 \end{pmatrix} = \det \begin{pmatrix} 1 & 2 & 3 \\ 3 & 3 & 3 \\ 0 & 0 & 0 \end{pmatrix} = 0\,.$$
Try to determine which row operations we made at each step of this computation.
You might suspect that determinants have properties with respect to columns similar to those that hold for rows:
Proof. By definition,
$$\det M = \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)}\,.$$
For any permutation σ, there is a unique inverse permutation σ⁻¹ that undoes σ. If σ sends i → j, then σ⁻¹ sends j → i. In the two-line notation for a permutation, this corresponds to just flipping the permutation over. For example, if
$$\sigma = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{bmatrix},$$
then we can find σ⁻¹ by flipping the permutation and then putting the columns in order:
$$\sigma^{-1} = \begin{bmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{bmatrix}.$$
Since any permutation can be built up by transpositions, one can also find the inverse of a permutation σ by undoing each of the transpositions used to build up σ; this shows that one can use the same number of transpositions to build σ and σ⁻¹. In particular, sgn σ = sgn σ⁻¹.
Then, reindexing each product by j = σ(i) and using sgn σ = sgn σ⁻¹,
$$\begin{aligned}
\det M &= \sum_\sigma \operatorname{sgn}(\sigma)\, m^1_{\sigma(1)} m^2_{\sigma(2)} \cdots m^n_{\sigma(n)} \\
&= \sum_\sigma \operatorname{sgn}(\sigma^{-1})\, m^{\sigma^{-1}(1)}_1 m^{\sigma^{-1}(2)}_2 \cdots m^{\sigma^{-1}(n)}_n \\
&= \sum_{\sigma^{-1}} \operatorname{sgn}(\sigma^{-1})\, m^{\sigma^{-1}(1)}_1 m^{\sigma^{-1}(2)}_2 \cdots m^{\sigma^{-1}(n)}_n \\
&= \det M^T.
\end{aligned}$$
Example 106 Because of this, we see that expansion by minors also works over columns. Let
$$M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 5 & 6 \\ 0 & 8 & 9 \end{pmatrix}.$$
Then
$$\det M = \det M^T = 1 \det \begin{pmatrix} 5 & 8 \\ 6 & 9 \end{pmatrix} = -3\,.$$
Example 107
$$\operatorname{adj} \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix}
\det \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix} & -\det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} & \det \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \\
-\det \begin{pmatrix} -1 & -1 \\ 1 & 1 \end{pmatrix} & \det \begin{pmatrix} 3 & -1 \\ 0 & 1 \end{pmatrix} & -\det \begin{pmatrix} 3 & -1 \\ 0 & 1 \end{pmatrix} \\
\det \begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix} & -\det \begin{pmatrix} 3 & -1 \\ 1 & 0 \end{pmatrix} & \det \begin{pmatrix} 3 & -1 \\ 1 & 2 \end{pmatrix}
\end{pmatrix}^T$$
Let’s compute the product M adj M . For any matrix N , the i, j entry
of M N is given by taking the dot product of the ith row of M and the jth
column of N . Notice that the dot product of the ith row of M and the ith
column of adj M is just the expansion by minors of det M in the ith row.
Further, notice that the dot product of the ith row of M and the jth column
of adj M with j 6= i is the same as expanding M by minors, but with the
jth row replaced by the ith row. Since the determinant of any matrix with
a row repeated is zero, then these dot products are zero as well.
We know that the i, j entry of the product of two matrices is the dot
product of the ith row of the first by the jth column of the second. Then:
M adj M = (det M )I
Thus, when det M ≠ 0, the adjoint gives an explicit formula for M⁻¹:
$$M^{-1} = \frac{1}{\det M} \operatorname{adj} M\,.$$
$$\operatorname{adj} \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix}.$$
Now, multiply:
$$\begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix} = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 6 \end{pmatrix}$$
$$\Rightarrow \begin{pmatrix} 3 & -1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix}^{-1} = \frac{1}{6} \begin{pmatrix} 2 & 0 & 2 \\ -1 & 3 & -1 \\ 1 & -3 & 7 \end{pmatrix}$$
This process for finding the inverse matrix is sometimes called Cramer's Rule.
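The cofactor construction above can be written out as a short program, offered as a sketch (exact arithmetic with `Fraction` avoids rounding):

```python
from fractions import Fraction

def det(M):
    # Laplace expansion along the first row.
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(n))

def adj(M):
    # Adjoint (adjugate): the transpose of the matrix of signed cofactors.
    n = len(M)
    def minor(i, j):
        return [r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i]
    return [[(-1) ** (i + j) * det(minor(i, j)) for i in range(n)]
            for j in range(n)]

M = [[3, -1, -1], [1, 2, 0], [0, 1, 1]]
A = adj(M)
print(A)          # [[2, 0, 2], [-1, 3, -1], [1, -3, 7]]

# M^{-1} = (1 / det M) adj M, valid whenever det M != 0.
d = det(M)
M_inv = [[Fraction(a, d) for a in row] for row in A]
```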
$$\text{Volume} = \det \begin{pmatrix} u & v & w \end{pmatrix}$$
8.5 Review Problems
3. Let σ⁻¹ denote the inverse permutation of σ. Suppose the function f : {1, 2, 3, 4} → R. Write out explicitly the following two sums:
$$\sum_\sigma f\bigl(\sigma(s)\bigr) \quad\text{and}\quad \sum_\sigma f\bigl(\sigma^{-1}(s)\bigr)\,.$$
What do you observe? Now write a brief explanation why the following equality holds
$$\sum_\sigma F(\sigma) = \sum_\sigma F(\sigma^{-1})\,,$$
Subspaces and Spanning Sets
9
It is time to study vector spaces more carefully and return to some fundamental questions:
9.1 Subspaces
Definition We say that a subset U of a vector space V is a subspace of V
if U is a vector space under the inherited addition and scalar multiplication
operations of V .
This equation can be expressed as the homogeneous system
$$\begin{pmatrix} a & b & c \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0\,,$$
or MX = 0 with M the row matrix $\begin{pmatrix} a & b & c \end{pmatrix}$. If X₁ and X₂ are both solutions to MX = 0, then, by linearity of matrix multiplication, so is µX₁ + νX₂:
$$M(\mu X_1 + \nu X_2) = \mu M X_1 + \nu M X_2 = 0.$$
So P is closed under addition and scalar multiplication. Additionally, P contains the origin (which can be derived from the above by setting µ = ν = 0). All other vector space requirements hold for P because they hold for all vectors in R³.
9.2 Building Subspaces
Note that the requirements of the subspace theorem are often referred to as
“closure”.
We can use this theorem to check if a set is a vector space. That is, if we
have some set U of vectors that come from some bigger vector space V , to
check if U itself forms a smaller vector space we need check only two things:
In this case, the vectors in U define the xy-plane in R3 . We can view the
xy-plane as the set of all vectors that arise as a linear combination of the two
vectors in U . We call this set of all linear combinations the span of U :
$$\operatorname{span}(U) = \left\{ x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \,\middle|\, x, y \in \mathbb{R} \right\}.$$
Notice that any vector in the xy-plane is of the form
$$\begin{pmatrix} x \\ y \\ 0 \end{pmatrix} = x \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \in \operatorname{span}(U).$$
That is, the span of S is the set of all finite linear combinations¹ of elements of S. Any finite sum of the form "a constant times s₁ plus a constant times s₂ plus a constant times s₃ and so on" is in the span of S.²
Example 110 Let V = R³ and X ⊂ V be the x-axis. Let $P = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, and set
$$S = X \cup \{P\}\,.$$
The vector $\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix}$ is in span(S), because $\begin{pmatrix} 2 \\ 3 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$. Similarly, the vector $\begin{pmatrix} 12 \\ 17.5 \\ 0 \end{pmatrix}$ is in span(S), because $\begin{pmatrix} 12 \\ 17.5 \\ 0 \end{pmatrix} = \begin{pmatrix} 12 \\ 0 \\ 0 \end{pmatrix} + 17.5 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$. Similarly, any vector
¹ Usually our vector spaces are defined over R, but in general we can have vector spaces defined over different base fields such as C or Z₂. The coefficients rᵢ should come from whatever our base field is (usually R).
² It is important that we only allow finitely many terms in our linear combinations; in the definition above, N must be a finite number. It can be any finite number, but it must be finite. We can relax the requirement that S = {s₁, s₂, . . .} and just let S be any set of vectors. Then we shall write span(S) := {r₁s₁ + r₂s₂ + · · · + r_N s_N | rᵢ ∈ R, sᵢ ∈ S, N ∈ ℕ}.
of the form
$$\begin{pmatrix} x \\ 0 \\ 0 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}$$
is in span(S). On the other hand, any vector in span(S) must have a zero in the
z-coordinate. (Why?) So span(S) is the xy-plane, which is a vector space. (Try
drawing a picture to verify this!)
$$\begin{aligned}
u &= c_1 s_1 + c_2 s_2 + \cdots \\
v &= d_1 s_1 + d_2 s_2 + \cdots \\
\Rightarrow \lambda u + \mu v &= \lambda (c_1 s_1 + c_2 s_2 + \cdots) + \mu (d_1 s_1 + d_2 s_2 + \cdots) \\
&= (\lambda c_1 + \mu d_1) s_1 + (\lambda c_2 + \mu d_2) s_2 + \cdots
\end{aligned}$$
Note that this proof, like many proofs, consisted of little more than just
writing out the definitions.
$$r_1 \begin{pmatrix} 1 \\ 0 \\ a \end{pmatrix} + r_2 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + r_3 \begin{pmatrix} a \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$
We can write this as a linear system in the unknowns r₁, r₂, r₃ as follows:
$$\begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ a & 3 & 0 \end{pmatrix} \begin{pmatrix} r_1 \\ r_2 \\ r_3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$
If the matrix $M = \begin{pmatrix} 1 & 1 & a \\ 0 & 2 & 1 \\ a & 3 & 0 \end{pmatrix}$ is invertible, then we can find a solution
$$M^{-1} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r_1 \\ r_2 \\ r_3 \end{pmatrix}$$
for any vector $\begin{pmatrix} x \\ y \\ z \end{pmatrix} \in \mathbb{R}^3$.
Therefore we should choose a so that M is invertible:
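For a concrete value of the parameter, invertibility can be checked and the coefficients found with the determinant machinery of chapter 8. A sketch (a = 1 and w = (1, 2, 3) are sample choices for illustration, not from the text):

```python
from fractions import Fraction

def det3(m):
    # Six-term formula for a 3x3 determinant.
    return (m[0][0] * m[1][1] * m[2][2] - m[0][0] * m[1][2] * m[2][1]
          + m[0][1] * m[1][2] * m[2][0] - m[0][1] * m[1][0] * m[2][2]
          + m[0][2] * m[1][0] * m[2][1] - m[0][2] * m[1][1] * m[2][0])

a = 1                                   # sample parameter value
M = [[1, 1, a], [0, 2, 1], [a, 3, 0]]
print(det3(M))                          # nonzero, so for a = 1 the vectors span R^3

# Cramer's rule: r_i = det(M with column i replaced by w) / det M.
w = [1, 2, 3]
def replace_col(M, i, w):
    return [[w[r] if c == i else M[r][c] for c in range(3)] for r in range(3)]

r = [Fraction(det3(replace_col(M, i, w)), det3(M)) for i in range(3)]
# Check that M r really equals w:
print([sum(Fraction(M[i][j]) * r[j] for j in range(3)) for i in range(3)])
```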
Some other very important ways of building subspaces are given in the
following examples.
$$L(u) = 0 = L(u')\,,$$
Hence, thanks to the subspace theorem, the set of all vectors in U that are mapped to the zero vector is a subspace of U. It is called the kernel of L:
$$\ker L := \{u \in U \mid L(u) = 0\} \subset U.$$
Note that finding a kernel means finding a solution to a homogeneous linear equation.
Hence, calling once again on the subspace theorem, the set of all vectors in V that are obtained as outputs of the map L is a subspace. It is called the image of L:
$$\operatorname{im} L := \{L(u) \mid u \in U\} \subset V.$$
Hence, again by the subspace theorem, the set of all vectors in V that obey the eigenvector equation L(v) = λv is a subspace of V. It is called an eigenspace:
$$V_\lambda := \{v \in V \mid L(v) = \lambda v\}.$$
For most scalars λ, the only solution to L(v) = λv will be v = 0, which yields the trivial subspace {0}. When there are nontrivial solutions to L(v) = λv, the number λ is called an eigenvalue, and carries essential information about the map L.
1. Determine if $x - x^3 \in \operatorname{span}\{x^2,\; 2x + x^2,\; x + x^3\}$.
(a) U ∪ W
(b) U ∩ W
3. Let L : R³ → R³ where
$$L(x, y, z) = (x + 2y + z,\; 2x + y + z,\; 0)\,.$$
Find ker L, im L and the eigenspaces $\mathbb{R}^3_{-1}$, $\mathbb{R}^3_{3}$. Your answers should be subsets of R³. Express them using span notation.
Linear Independence
10
Consider a plane P that includes the origin in R3 and non-zero vectors
{u, v, w} in P .
If no two of u, v and w are parallel, then P = span{u, v, w}. But any two vectors determine a plane, so we should be able to span the plane using only two of the vectors u, v, w. Then we could choose two of the vectors in {u, v, w} whose span is P, and express the other as a linear combination of those two. Suppose u and v span P. Then there exist constants d₁, d₂ (not both zero) such that w = d₁u + d₂v. Since w can be expressed in terms of u and v we say that it is not independent. More generally, the relationship
$$c_1 u + c_2 v + c_3 w = 0\,, \qquad c_i \in \mathbb{R},\ \text{some } c_i \neq 0$$
c1 v1 + c2 v2 + · · · + cn vn = 0.
Remark The zero vector 0_V can never be on a list of independent vectors because α·0_V = 0_V for any scalar α.
c1 v1 + c2 v2 + c3 v3 = 0
¹ Usually our vector spaces are defined over R, but in general we can have vector spaces defined over different base fields such as C or Z₂. The coefficients cᵢ should come from whatever our base field is (usually R).
10.1 Showing Linear Dependence
Therefore nontrivial solutions exist. At this point we know that the vectors are
linearly dependent. If we need to, we can find coefficients that demonstrate linear
dependence by solving
$$\begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 2 & 2 & 0 \\ 1 & 1 & 3 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 1 & 3 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
The solution set {µ(−2, −1, 1) | µ ∈ R} encodes the linear combinations equal to zero; any choice of µ will produce coefficients c₁, c₂, c₃ that satisfy the linear homogeneous equation. In particular, µ = 1 corresponds to the equation
$$c_1 v_1 + c_2 v_2 + c_3 v_3 = 0 \quad\Rightarrow\quad -2v_1 - v_2 + v_3 = 0\,.$$
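The computation above can be double-checked in code. A sketch (reading the three vectors off as the columns of the augmented system is an assumption about the example's setup):

```python
v1, v2, v3 = (0, 0, 1), (1, 2, 1), (1, 2, 3)

# The claimed dependence with mu = 1: -2*v1 - v2 + v3 = 0.
combo = tuple(-2 * a - b + c for a, b, c in zip(v1, v2, v3))
print(combo)  # (0, 0, 0)

def det3(m):
    # Six-term formula for a 3x3 determinant.
    return (m[0][0] * m[1][1] * m[2][2] - m[0][0] * m[1][2] * m[2][1]
          + m[0][1] * m[1][2] * m[2][0] - m[0][1] * m[1][0] * m[2][2]
          + m[0][2] * m[1][0] * m[2][1] - m[0][2] * m[1][1] * m[2][0])

# Columns of M are v1, v2, v3; a zero determinant confirms dependence.
M = [list(row) for row in zip(v1, v2, v3)]
print(det3(M))  # 0
```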
Proof. The theorem is an if and only if statement, so there are two things to
show.
$$c_1 v_1 + \cdots + c_{k-1} v_{k-1} - v_k + 0 v_{k+1} + \cdots + 0 v_n = 0.$$
ii. Now we show that linear dependence implies that there exists k for which $v_k$ is a linear combination of the vectors $\{v_1, \ldots, v_{k-1}\}$.
The assumption says that
c1 v1 + c2 v2 + · · · + cn vn = 0.
Take k to be the largest number for which $c_k$ is not equal to zero. So:
$$c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1} + c_k v_k = 0.$$
$$c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1} = -c_k v_k$$
$$\Rightarrow\; -\frac{c_1}{c_k} v_1 - \frac{c_2}{c_k} v_2 - \cdots - \frac{c_{k-1}}{c_k} v_{k-1} = v_k.$$
10.2 Showing Linear Independence
Example 117 Consider the vector space P2 (t) of polynomials of degree less than or
equal to 2. Set:
v1 = 1 + t
v 2 = 1 + t2
v 3 = t + t2
v 4 = 2 + t + t2
v 5 = 1 + t + t2 .
c1 v1 + c2 v2 + c3 v3 = 0
Since the matrix M has non-zero determinant, the only solution to the system of equations
$$\begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} \begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix} = 0$$
is c¹ = c² = c³ = 0. So the vectors v₁, v₂, v₃ are linearly independent.
Example 119 Let Z₂³ be the space of 3 × 1 bit-valued matrices (i.e., column vectors). Is the following subset linearly independent?
$$\left\{ \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\}$$
If the set is linearly dependent, then we can find non-zero solutions to the system:
$$c^1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c^2 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + c^3 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = 0\,.$$
Non-trivial solutions exist if and only if the determinant of the matrix whose columns are these vectors is zero. But:
$$\det \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} = 1 \det \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} - 1 \det \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = -1 - 1 = 1 + 1 = 0$$
Therefore non-trivial solutions exist, and the set is not linearly independent.
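Arithmetic over Z₂ is easy to mimic in code by reducing mod 2. A sketch (note that over Z₂ the signs in the determinant formula do not matter, since −1 = 1):

```python
v1, v2, v3 = (1, 1, 0), (1, 0, 1), (0, 1, 1)

def det3_mod2(m):
    # Over Z_2 every sign is +1, so just add all six terms and reduce mod 2.
    return (m[0][0] * m[1][1] * m[2][2] + m[0][0] * m[1][2] * m[2][1]
          + m[0][1] * m[1][2] * m[2][0] + m[0][1] * m[1][0] * m[2][2]
          + m[0][2] * m[1][0] * m[2][1] + m[0][2] * m[1][1] * m[2][0]) % 2

M = [list(row) for row in zip(v1, v2, v3)]  # columns are the three vectors
print(det3_mod2(M))  # 0, so the set is linearly dependent over Z_2

# An explicit dependence: v1 + v2 + v3 = 0 (mod 2).
print(tuple((x + y + z) % 2 for x, y, z in zip(v1, v2, v3)))  # (0, 0, 0)
```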
10.3 From Dependent Independent
2. Let ei be the vector in Rn with a 1 in the ith position and 0’s in every
other position. Let v be an arbitrary vector in Rn .
10.4 Review Problems
First give an explicit example for each case, state whether the col-
umn vectors you use are linearly independent or spanning in each case.
Then, in general, determine whether (v1 , v2 , . . . , vm ) are linearly inde-
pendent and/or spanning Rn in each of the three cases. If they are
linearly dependent, does RREF(M ) tell you which vectors could be
removed to yield an independent set of vectors?
Basis and Dimension
11
In chapter 10, the notions of a linearly independent set of vectors in a vector
space V , and of a set of vectors that span V were established; any set of
vectors that span V can be reduced to some minimal collection of linearly
independent vectors; such a minimal set is called a basis of the subspace V .
Definition Let V be a vector space. Then a set S is a basis for V if S is
linearly independent and V = span S.
If S is a basis of V and S has only finitely many elements, then we say
that V is finite-dimensional. The number of vectors in S is the dimension
of V .
Suppose V is a finite-dimensional vector space, and S and T are two different bases for V. One might worry that S and T have a different number of
vectors; then we would have to talk about the dimension of V in terms of the
basis S or in terms of the basis T . Luckily this isn’t what happens. Later in
this chapter, we will show that S and T must have the same number of vec-
tors. This means that the dimension of a vector space is basis-independent.
In fact, dimension is a very important characteristic of a vector space.
Example 121 Pₙ(t) (polynomials in t of degree n or less) has a basis {1, t, . . . , tⁿ}, since every vector in this space is a sum
$$a_0 \cdot 1 + a_1 t + \cdots + a_n t^n, \qquad a_i \in \mathbb{R}\,,$$
so Pₙ(t) = span{1, t, . . . , tⁿ}. This set of vectors is linearly independent: if the polynomial p(t) = c₀·1 + c₁t + · · · + cₙtⁿ = 0, then c₀ = c₁ = · · · = cₙ = 0, so p(t) is the zero polynomial. Thus Pₙ(t) is finite dimensional, and dim Pₙ(t) = n + 1.
Proof. Since S is a basis for V , then span S = V , and so there exist con-
stants ci such that w = c1 v1 + · · · + cn vn .
Suppose there exists a second set of constants di such that
w = d 1 v1 + · · · + d n vn .
Then
$$\begin{aligned}
0_V &= w - w \\
&= c_1 v_1 + \cdots + c_n v_n - d_1 v_1 - \cdots - d_n v_n \\
&= (c_1 - d_1) v_1 + \cdots + (c_n - d_n) v_n.
\end{aligned}$$
Since the vectors vᵢ are linearly independent, each cᵢ − dᵢ = 0, so the two expressions for w agree.
Remark This theorem is the one that makes bases so useful: they allow us to convert abstract vectors into column vectors. By ordering the set S we obtain B = (v₁, . . . , vₙ) and can write
$$w = (v_1, \ldots, v_n) \begin{pmatrix} c^1 \\ \vdots \\ c^n \end{pmatrix} = \begin{pmatrix} c^1 \\ \vdots \\ c^n \end{pmatrix}_B.$$
Remember that in general it makes no sense to drop the subscript B on the column vector on the right; most vector spaces are not made from columns of numbers!
$$v_k = c^0 w_1 + c^1 v_1 + \cdots + c^{i-1} v_{i-1} + c^{i+1} v_{i+1} + \cdots + c^n v_n.$$
Then replacing w1 with its expression in terms of the collection S gives a way
to express the vector vk as a linear combination of the vectors in S, which
contradicts the linear independence of S. On the other hand, we cannot
express w1 as a linear combination of the vectors in {vj |j 6= i}, since the
expression of w1 in terms of S was unique, and had a non-zero coefficient for
the vector vi . Then no vector in S1 can be expressed as a combination of
other vectors in S1 , which demonstrates that S1 is linearly independent.
The set S1 spans V : For any u 2 V , we can express u as a linear com-
bination of vectors in S. But we can express vi as a linear combination of
Proof. Let S and T be two bases for V. Then both are linearly independent sets that span V. Suppose S has n vectors and T has m vectors. Then by the previous lemma, we have that m ≤ n. But (exchanging the roles of S and T in the application of the lemma) we also see that n ≤ m. Then m = n, as desired.
and that this set of vectors is linearly independent. (If you didn’t do that
problem, check this before reading any further!) So this set of vectors is
11.1 Bases in Rⁿ
a basis for Rn , and dim Rn = n. This basis is often called the standard
or canonical basis for Rn . The vector with a one in the ith position and
zeros everywhere else is written ei . (You could also view it as the function
{1, 2, . . . , n} → R where eᵢ(j) = 1 if i = j and 0 if i ≠ j.) It points in the
direction of the ith coordinate axis, and has unit length. In multivariable
calculus classes, this basis is often written {î, ĵ, k̂} for R3 .
Note that it is often convenient to order basis elements, so rather than
writing a set of vectors, we would write a list. This is called an ordered
basis. For example, the canonical ordered basis for Rn is (e1 , e2 , . . . , en ). The
possibility to reorder basis vectors is not the only way in which bases are
non-unique.
Bases are not unique. While there exists a unique way to express a vector in terms of any particular basis, bases themselves are far from unique. For example, both of the sets
$$\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\} \quad\text{and}\quad \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}$$
are bases for R2 . Rescaling any vector in one of these sets is already enough to show
that R2 has infinitely many bases. But even if we require that all of the basis vectors
have unit length, it turns out that there are still infinitely many bases for R2 (see
review question 3).
0 = x1 v1 + · · · + xn vn .
Let M be a matrix whose columns are the vectors vi and X the column
vector with entries xi . Then the above equation is equivalent to requiring
that there is a unique solution to
MX = 0 .
in the unknowns xi . For this, we need to find a unique solution for the linear
system M X = w.
Thus, we need to show that M⁻¹ exists, so that
$$X = M^{-1} w$$
is the unique solution we desire. Then we see that S is a basis for Rⁿ if and only if det M ≠ 0.
Theorem 11.1.1. Let S = {v₁, . . . , vₘ} be a collection of vectors in Rⁿ. Let M be the matrix whose columns are the vectors in S. Then S is a basis for V if and only if m is the dimension of V and
$$\det M \neq 0\,.$$
Remark Also observe that S is a basis if and only if RREF(M ) = I.
Example 122 Let
$$S = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\} \quad\text{and}\quad T = \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.$$
Then set $M_S = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Since det M_S = 1 ≠ 0, S is a basis for R².
Likewise, set $M_T = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Since det M_T = −2 ≠ 0, T is a basis for R².