
Solutions to Selected Problems In:

Fundamentals of Matrix Computations: Second Edition


by David S. Watkins.
John L. Weatherwax
March 13, 2012

wax@alum.mit.edu

Chapter 1 (Gaussian Elimination and Its Variants)


Exercise 1.1.20 (an example of block partitioning)
We first compute AX (ignoring the partitioning for the time being). It is given by

AX = [1 3 2; 2 1 1; 1 0 1] [1 0 1; 2 1 1; 1 2 0] = [9 7 4; 5 3 3; 2 2 1] = B .

Now to show that Ai1 X1j + Ai2 X2j = Bij for i, j = 1, 2 consider each of the four possible pairs in turn. First consider (i, j) = (1, 1), which is

A11 X11 + A12 X21 = [1; 2] [1] + [3 2; 1 1] [2; 1] = [1; 2] + [8; 3] = [9; 5] ,

which is the same as B11. Now (i, j) = (1, 2) gives

A11 X12 + A12 X22 = [1; 2] [0 1] + [3 2; 1 1] [1 1; 2 0] = [0 1; 0 2] + [7 3; 3 1] = [7 4; 3 3] ,

which is the same as B12. Now (i, j) = (2, 1) gives

A21 X11 + A22 X21 = [1] [1] + [0 1] [2; 1] = 1 + 1 = 2 ,

which is the same as B21. Finally, considering (i, j) = (2, 2), we have

A21 X12 + A22 X22 = [1] [0 1] + [0 1] [1 1; 2 0] = [0 1] + [2 0] = [2 1] ,

which is the same as B22. Thus we have shown the equivalence requested.
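
As a quick numerical cross-check (not part of the original solution, and assuming the matrices A and X as reconstructed above), the four block identities can be verified with NumPy:

import numpy as np

A = np.array([[1, 3, 2], [2, 1, 1], [1, 0, 1]])
X = np.array([[1, 0, 1], [2, 1, 1], [1, 2, 0]])
B = A @ X
# partition A with row blocks (2,1) and column blocks (1,2); partition X conformably
A11, A12, A21, A22 = A[:2, :1], A[:2, 1:], A[2:, :1], A[2:, 1:]
X11, X12, X21, X22 = X[:1, :1], X[:1, 1:], X[1:, :1], X[1:, 1:]
assert np.array_equal(A11 @ X11 + A12 @ X21, B[:2, :1])   # B11
assert np.array_equal(A11 @ X12 + A12 @ X22, B[:2, 1:])   # B12
assert np.array_equal(A21 @ X11 + A22 @ X21, B[2:, :1])   # B21
assert np.array_equal(A21 @ X12 + A22 @ X22, B[2:, 1:])   # B22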

Exercise 1.1.25
Let A be n by m and consider A partitioned by columns so that

A = [a1 | a2 | ... | am] .

Here ai is the i-th column of A of dimension n by 1. Consider x partitioned into m scalar blocks as

x = [x1; x2; ...; xm] .

Then

Ax = [a1 | a2 | ... | am] [x1; x2; ...; xm] = x1 a1 + x2 a2 + x3 a3 + ... + xm am = b ,    (1)

showing that b is a linear combination of the columns of A.

Exercise 1.2.4
Let's begin by assuming that there exists a nonzero y such that Ay = 0. Applying A^{-1} to both sides gives

A^{-1} A y = A^{-1} 0 = 0 ,

or y = 0, which contradicts our initial assumption. Therefore no nonzero y can exist.

Exercise 1.2.5
If A^{-1} exists then we have

A A^{-1} = I .

Taking the determinant of both sides of this expression gives

|A| |A^{-1}| = 1 ,    (2)

but since |A^{-1}| = |A|^{-1}, it is not possible to have |A| = 0, or else Equation 2 would give a contradiction.

Exercise 1.2.11
The equation for the car at x1 is

-4 x1 + 1 + 4(x2 - x1) = 0 ,

or

8 x1 - 4 x2 = 1 ,

which is the same as given in the book. The equation for the car at x2 is given by

-4(x2 - x1) + 2 + 4(x3 - x2) = 0 ,

which is the same as in the book. The equation for the car at x3 is given by

-4(x3 - x2) + 3 - 4 x3 = 0 ,

or

4 x2 - 8 x3 = -3 ,

which is the same as given in the book. Since the determinant of the coefficient matrix A has value det(A) = 256 ≠ 0, A is nonsingular.

Exercise 1.3.7
A pseudo-code implementation would look like the following

% Find the last element of b that is zero ($b_k=0$ and $b_{k+1} \ne 0$)
for i=1,2,...,n
if( b_i \ne 0 ) break
endfor
k=i-1 % we now have b_1 = b_2 = b_3 = \ldots = b_k = 0
for i=k+1,...,n
for j=k+1,...,i-1
b_i = b_i - g_{ij} b_j
endfor
if( g_{ii} = 0 ) set error flag, exit
b_i = b_i/g_{ii}
endfor
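
A hypothetical Python transcription of this pseudocode (not part of the original text; the routine and variable names are illustrative only):

def forward_substitution_leading_zeros(G, b):
    # Solve G y = b for lower triangular G, exploiting the leading zeros of b.
    n = len(b)
    y = list(b)
    k = 0                      # number of leading zeros of b
    while k < n and y[k] == 0:
        k += 1
    for i in range(k, n):      # the first k entries of the solution remain zero
        for j in range(k, i):
            y[i] -= G[i][j] * y[j]
        if G[i][i] == 0:
            raise ValueError("matrix is singular")
        y[i] /= G[i][i]
    return y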

Exercise 1.3.14
Part (a): To count the operations we first consider the inner for loop, which has 2 flops and is executed n - (j+1) + 1 = n - j times. The outer loop is executed once for every j = 1, 2, ..., n, so the total flop count is given by

Σ_{j=1}^{n} Σ_{i=j+1}^{n} 2 = Σ_{j=1}^{n} 2(n - j) = 2 Σ_{j=1}^{n-1} j = 2 · (1/2) n(n-1) = n(n-1) ,

the same as row-oriented substitution.


Part (b): Row oriented forward substitution subtracts just the columns of the row we are
working on as we get to each row. Column oriented forward substitution subtracts from all
rows before moving to the next unknown (row).

Exercise 1.3.14
Part (b): In row-oriented substitution we first solve for x1, then x2 using the known value of x1, then x3 using the known values of x1 and x2. This pattern continues until finally we solve for xn using all of the known values x1, x2, down to x_{n-1}. The column subtractions (in column-oriented substitution) occur one at a time, i.e. in row-oriented substitution we have

a11 x1 = b1  =>  x1 = b1 / a11

a21 x1 + a22 x2 = b2  =>  x2 = (b2 - a21 x1) / a22 = (b2 - a21 (b1/a11)) / a22 ,

while in column-oriented substitution we first solve for x1, with

x1 = b1 / a11 ,

and then we get

a22 x2 = b2 - a21 (b1/a11)

and solve for x2. In summary, using column-oriented substitution we do some of the subtractions in

xi = (bi - Σ_{j=1}^{i-1} aij xj) / aii

each time we go through the loop, while row-oriented substitution does all the subtractions for a given row at once.

Exercise 1.3.15
A pseudocode implementation of row-oriented back substitution would look something like
the following
for i=n,n-1,...,2,1 do
for j=n,n-1,...,i+2,i+1 do
b(i) = b(i) - A(i,j)*b(j)
end
if A(i,i)=0 then
// return error; system is singular
end
b(i) = b(i)/A(i,i)
end

Exercise 1.3.16 (column-oriented back substitution)


A pseudocode implementation of column-oriented back substitution would look something like the following

for j=n,n-1,...,2,1 do
if A(j,j)=0 then
// return error; system is singular
end
b(j) = b(j)/A(j,j)
for i=1,2,...,j-1 do
b(i) = b(i) - A(i,j)*b(j)
end
end

Note that the i loop above could be written backwards as i = j-1, j-2, ..., 2, 1 if this helps maintain consistency.
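
A small Python sketch of this column-oriented back substitution (an illustration only, not from the original) overwrites a copy of b with the solution of A x = b for upper triangular A:

def column_back_substitution(A, b):
    n = len(b)
    x = list(b)
    for j in range(n - 1, -1, -1):   # j = n, n-1, ..., 1 in the text's numbering
        if A[j][j] == 0:
            raise ValueError("matrix is singular")
        x[j] /= A[j][j]
        for i in range(j):           # subtract column j from all earlier rows at once
            x[i] -= A[i][j] * x[j]
    return x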
Exercise 1.3.16
In column-oriented substitution we partition our problem Ux = y as

[ U'  h    ] [ x'  ]   [ y'  ]
[ 0   u_nn ] [ x_n ] = [ y_n ] ,

where U' is the leading (n-1) x (n-1) block of U. Written out block by block this is

U' x' + h x_n = y'
u_nn x_n = y_n .

We first solve for x_n and then solve for x' in

U' x' = y' - h x_n .

This is an (n-1) x (n-1) upper triangular system to which we apply the same substitutions, so we let

y'_1 = y_1 - u_{1,n} x_n
y'_2 = y_2 - u_{2,n} x_n
...
y'_{n-1} = y_{n-1} - u_{n-1,n} x_n

and put x_n in y_n. The algorithm is given by

for i=n:-1:1 do
if u(i,i)=0 set error flag
y(i) = y(i)/u(i,i)
for j=i-1:-1:1 do
y(j) = y(j) - u(j,i) y(i)
end
end

The values of x are stored in y_1, y_2, ..., y_n. We can check this for the index n. We have

y_n <- y_n / u_nn
y_{n-1} <- y_{n-1} - u_{n-1,n} x_n
...
y_1 <- y_1 - u_{1,n} x_n .

Exercise 1.3.17
Part (a): Performing row-oriented back substitution on

[3 2 1 0; 0 1 2 3; 0 0 2 1; 0 0 0 4] [x1; x2; x3; x4] = [10; 8; 1; 12]

we have

x4 = 3
x3 = (1 - 3)/2 = -1
x2 = 8 - 2(-1) - 3(3) = 1
x1 = (10 - 2(1) - 1(-1))/3 = (10 - 2 + 1)/3 = 9/3 = 3 .

Part (b): Column-oriented back substitution would first solve for x4, giving x4 = 3, and then reduce the order of the system by one, giving

[3 2 1; 0 1 2; 0 0 2] [x1; x2; x3] = [10; 8; 1] - 3 [0; 3; 1] = [10; -1; -2] .

This general procedure is then repeated by solving for x3. The last equation above gives x3 = -1, and then reducing the order of the system above gives

[3 2; 0 1] [x1; x2] = [10; -1] - (-1) [1; 2] = [11; 1] .

Using the last equation to solve for x2 gives x2 = 1, and reducing the system one final time gives

[3] x1 = [11] - 2[1] = 9 ,

which has as its solution x1 = 3.

Exercise 1.3.20 (block column oriented forward substitution)


The block variant of column-oriented forward substitution on

[ G11                     ] [ y1 ]   [ b1 ]
[ G21 G22                 ] [ y2 ]   [ b2 ]
[ G31 G32 G33             ] [ y3 ] = [ b3 ]
[  .   .   .   .          ] [ .. ]   [ .. ]
[ Gs1 Gs2 Gs3 ... Gss     ] [ ys ]   [ bs ]

would look like

for j=1,2,...,s do
if G(j,j) is singular then
// return error; system is singular
end
// matrix inverse of G(j,j) taken here
b(j)=inverse(G(j,j))*b(j)
for i=j+1,j+2,...,s do
// matrix multiplication performed here
b(i) = b(i) - G(i,j)*b(j)
end
end
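
A NumPy sketch of the block algorithm above (not in the original; G is assumed to be a list of lists of blocks and b a list of block vectors, and we solve with each diagonal block rather than forming its inverse explicitly):

import numpy as np

def block_forward_substitution(G, b):
    s = len(b)
    y = [np.array(bi, dtype=float) for bi in b]
    for j in range(s):
        y[j] = np.linalg.solve(G[j][j], y[j])    # plays the role of inverse(G(j,j))*b(j)
        for i in range(j + 1, s):
            y[i] = y[i] - G[i][j] @ y[j]         # block matrix-vector multiply
    return y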
Exercise 1.3.21 (row and column back substitution)
The difference between (block) row- and column-oriented substitution is that in row-oriented substitution, as we process each row we subtract out multiples of the unknowns computed so far. In column-oriented substitution we do all the subtractions involving a just-computed unknown at one time and then no longer need to remember that unknown. The difference is simply a question of doing the subtractions all at once or a few at a time as we access each row.

Exercise 1.3.23 (the determinant of a triangular matrix)


The fact that the determinant of a triangular matrix is equal to the product of the diagonal elements can easily be proved by induction. Let's assume without loss of generality that our system is lower triangular (upper triangular systems are transposes of lower triangular systems), and let n = 1; then |G| = g11 trivially. Now assume that for a triangular matrix of size n x n the determinant is given by the product of its n diagonal elements, and consider a matrix G' of size (n+1) x (n+1) partitioned with a leading matrix G11 of size n x n,

G' = [ G11  0 ; h^T  g_{n+1,n+1} ] .

By expanding the determinant of G' along its last column we see that

|G'| = g_{n+1,n+1} |G11| = g_{n+1,n+1} Π_{i=1}^{n} g_ii = Π_{i=1}^{n+1} g_ii ,

proving by induction that the determinant of a triangular matrix is equal to the product of its diagonal elements.
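
A one-line NumPy illustration of this fact (not in the original):

import numpy as np
G = np.tril(np.random.rand(5, 5)) + np.eye(5)   # a nonsingular lower triangular matrix
assert np.isclose(np.linalg.det(G), np.prod(np.diag(G)))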

Exercise 1.3.29
Part (b): Partition the system G y = b as

[ G'   0    ] [ y'  ]   [ b'  ]
[ h^T  g_nn ] [ y_n ] = [ b_n ] ,

so that once ŷ = y' is known the last component is y_n = (b_n - h^T y')/g_nn. In general, given y1, y2, ..., y_{i-1} we compute yi from

yi = (bi - Σ_{j=1}^{i-1} g_ij y_j) / g_ii .

As an algorithm we have

for i=1:n do
if g(i,i)=0 (set error)
for j=1:i-1 do
b(i) = b(i) - g(i,j) b(j)
end
b(i) = b(i)/g(i,i)
end

Exercise 1.4.15
If A = [4 0; 0 9].

Part (a): x^T A x = [x1 x2] [4 0; 0 9] [x1; x2] = 4 x1^2 + 9 x2^2 >= 0, which can be equal to zero only if x = 0.

Part (b): Writing A = G G^T we have g11 = sqrt(a11); we take the positive sign, so g11 = 2. Now

gi1 = ai1 / g11   for i = 2, ..., n ,

so we have that

g21 = a21 / g11 = 0/2 = 0 .

Multiplying out G G^T we see that

a11 = g11^2
a12 = g11 g21
a22 = g21^2 + g22^2 ,

and in general ai2 = gi1 g21 + gi2 g22. So

a22 = 0 + g22^2  =>  g22 = sqrt(a22) = 3 ,

and the Cholesky factor of A = [4 0; 0 9] is G = [2 0; 0 3]; indeed

[2 0; 0 3] [2 0; 0 3] = G G^T = [4 0; 0 9] .

Part (c): The other lower triangular matrices with G G^T = A are

G2 = [-2 0; 0 3] ,  G3 = [2 0; 0 -3] ,  G4 = [-2 0; 0 -3] .

Part (d): We can change the sign of any of the elements on the diagonal, so there are 2 · 2 · ... · 2 = 2^n such choices, where n is the number of diagonal elements, and hence 2^n lower triangular matrices.
Exercise 1.4.21

For A = [16 4 8 4; 4 10 8 4; 8 8 12 10; 4 4 10 12] we compute the Cholesky factor entry by entry:

g11 = sqrt(16) = 4
g21 = 4/4 = 1
g31 = 8/4 = 2
g41 = 4/4 = 1
g22 = sqrt(a22 - g21^2) = sqrt(10 - 1) = 3
g32 = (a32 - g31 g21)/g22 = (8 - 2(1))/3 = 2
g42 = (a42 - g41 g21)/g22 = (4 - 1(1))/3 = 1
g33 = sqrt(a33 - g31^2 - g32^2) = sqrt(12 - 4 - 4) = 2
g43 = (a43 - g41 g31 - g42 g32)/g33 = (10 - 1(2) - 1(2))/2 = 3
g44 = sqrt(a44 - g41^2 - g42^2 - g43^2) = sqrt(12 - 1 - 1 - 9) = 1 .

Thus

G = [4 0 0 0; 1 3 0 0; 2 2 2 0; 1 1 3 1] .

Since we want to solve G G^T x = b, we let y = G^T x and first solve G y = b, which is the system

[4 0 0 0; 1 3 0 0; 2 2 2 0; 1 1 3 1] [y1; y2; y3; y4] = [32; 26; 38; 30] .

The first equation gives y1 = 8, which put in the second equation gives 8 + 3 y2 = 26 or y2 = 6. When we put these two variables into the third equation we get

16 + 12 + 2 y3 = 38  =>  y3 = 5 .

When all of these variables are put into the fourth equation we have

8 + 6 + 15 + y4 = 30  =>  y4 = 1 .

Using these values of yi we now want to solve G^T x = y, that is

[4 1 2 1; 0 3 2 1; 0 0 2 3; 0 0 0 1] [x1; x2; x3; x4] = [8; 6; 5; 1] .

The fourth equation gives x4 = 1. The third equation is then 2 x3 + 3 = 5, so x3 = 1. With both of these values in the second equation we get

3 x2 + 2 + 1 = 6  =>  x2 = 1 .

With these three values put into the first equation we get

4 x1 + 1 + 2 + 1 = 8  =>  x1 = 1 .

Thus in total we have x = [1; 1; 1; 1].
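
A NumPy check of the factor G and of the two triangular solves above (not part of the original solution):

import numpy as np

A = np.array([[16, 4, 8, 4], [4, 10, 8, 4], [8, 8, 12, 10], [4, 4, 10, 12]], dtype=float)
b = np.array([32, 26, 38, 30], dtype=float)
G = np.linalg.cholesky(A)        # lower triangular, A = G G^T
y = np.linalg.solve(G, b)        # forward substitution, gives y = (8, 6, 5, 1)
x = np.linalg.solve(G.T, y)      # back substitution, gives x = (1, 1, 1, 1)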

Exercise 1.4.52
Since A is positive definite we must have for all non-zero xs the condition xT Ax > 0. Let
x = ei , a vector of all zeros but with a one in the i-th spot. Then xT Ax = eTi Aei = aii > 0,
proving that the diagonal elements of a positive definite matrix must be positive.

Exercise 1.4.56
Let v be a nonzero vector and consider
v T X T AXv = (Xv)T A(Xv) = y T Ay > 0 ,
where we have defined y = Xv, and the last inequality is from the positive definiteness
of A. Thus, X T AX is positive definite. Note, if X were singular then X T AX is only
positive semi definite since there would then exist nonzero v such that Xv = 0, and therefore
v T X T AXv = 0, in contradiction to the requirement of positive definiteness.

Exercise 1.4.58
Part (a): Consider the expression for Ã22; we have

Ã22 = A22 - R12^T R12
    = A22 - (R11^{-T} A12)^T (R11^{-T} A12)
    = A22 - A12^T R11^{-1} R11^{-T} A12
    = A22 - A12^T (R11^T R11)^{-1} A12
    = A22 - A12^T A11^{-1} A12 ,

which results from the definition A11 = R11^T R11.

Part (b): Since A is symmetric we must have A21 = A12^T, and following the discussion on the previous page A has a decomposition like the following

A = [ A11  A12 ; A21  A22 ] = [ R11^T  0 ; A12^T R11^{-1}  I ] [ I  0 ; 0  Ã22 ] [ R11  R11^{-T} A12 ; 0  I ] .    (3)

This can be checked by expanding the products on the right-hand side (RHS) as follows:

RHS = [ R11^T  0 ; A12^T R11^{-1}  I ] [ R11  R11^{-T} A12 ; 0  Ã22 ]
    = [ R11^T R11   A12 ; A12^T   Ã22 + A12^T R11^{-1} R11^{-T} A12 ] ,

which will equal A when

Ã22 = A22 - A12^T R11^{-1} R11^{-T} A12 .

Part (c): To show that Ã22 is positive definite we note that, defining X by

X^T = [ R11^T  0 ; A12^T R11^{-1}  I ] ,

X is nonsingular, and from Eq. 3 we have that

[ I  0 ; 0  Ã22 ] = X^{-T} A X^{-1} .

Since X^{-1} is nonsingular and A is positive definite, X^{-T} A X^{-1} is also positive definite, and therefore by the square partitioning of principal submatrices of positive definite matrices (Proposition 1.4.53) the submatrix Ã22 is positive definite.

Exercise 1.4.60
We will do this problem using induction on n, the size of our matrix A. If n = 1 then showing that the Cholesky factor is unique is trivial:

[a11] = [+sqrt(a11)] [+sqrt(a11)] .

Assume that the Cholesky factor is unique for all matrices of size less than or equal to n. Let A be positive definite of size n+1 and assume (to reach a contradiction) that A has two Cholesky factorizations, i.e. A = R^T R and A = S^T S. Partition A and R as follows

[ A11    A12          ]   [ R11^T   0           ] [ R11  R12         ]
[ A12^T  a_{n+1,n+1}  ] = [ R12^T   r_{n+1,n+1} ] [ 0    r_{n+1,n+1} ] .

Here A11 is of dimension n by n, A12 is n by 1, and the elements of R are partitioned conformably with A. By equating terms in the above block matrix equation we have three unique equations (equating terms for A12^T results in the transpose of the equation for A12). The three equations are

A11 = R11^T R11
A12 = R11^T R12
a_{n+1,n+1} = R12^T R12 + r_{n+1,n+1}^2 .

From the induction hypothesis, since A11 is n by n its Cholesky factor R11 is unique. This implies that the column vector R12 is unique, since R11 and correspondingly R11^T is invertible. It then follows, since r_{n+1,n+1} > 0, that r_{n+1,n+1} must be unique, since it must equal

r_{n+1,n+1} = + sqrt( a_{n+1,n+1} - R12^T R12 ) .

We know that the expression

a_{n+1,n+1} - R12^T R12 > 0

by an equivalent result to the Schur complement result discussed in the book. Since every component of our construction of R is unique, we have proven that the Cholesky decomposition is unique for positive definite matrices of size n+1. By induction it must hold for positive definite matrices of any size.

Exercise 1.4.62
Since A is positive definite it has a Cholesky factorization A = R^T R. Taking the determinant of this expression gives

|A| = |R^T| |R| = |R|^2 = ( Π_{i=1}^{n} r_ii )^2 > 0 ,

showing the desired inequality.

Exercise 1.5.9
We wish to count the number of operations required to solve Rx = y when R is upper triangular with semiband width s, i.e. the banded system

[ r_{1,1}  r_{1,2}  ...  r_{1,s+1}                               ]
[          r_{2,2}  ...  r_{2,s+1}  r_{2,s+2}                    ]
[                    .      .          .                         ]  x = y ,
[                              r_{n-1,n-1}  r_{n-1,n}            ]
[                                            r_{n,n}             ]

so at row i we have nonzero elements in columns j = i, i+1, i+2, ..., i+s, assuming that i+s does not exceed n. Then a pseudocode implementation of row-oriented back substitution would look something like the following

for i=n,n-1,...,2,1 do
// the following loop is empty when i=n
for j=min(n,i+s),min(n,i+s)-1,...,i+2,i+1 do
y(i) = y(i) - R(i,j)*y(j)
end
if R(i,i)=0 then
// return error; system is singular
end
y(i) = y(i)/R(i,i)
end

Counting the number of flops this requires, we have approximately two flops for every execution of the line y(i) = y(i) - R(i,j)*y(j), giving the following expression for the number of flops

Σ_{i=n}^{1} ( Σ_{j=min(n,i+s)}^{i+1} 2 + 1 ) .

Now since

Σ_{j=min(n,i+s)}^{i+1} 2 = O(2s) ,

the above sum simplifies (using order notation) to

O(n + 2sn) = O(2sn) ,

as requested.
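
A Python sketch of this banded back substitution (not from the original text; R is stored here as a full array for simplicity even though only the band is referenced):

def banded_back_substitution(R, y, s):
    n = len(y)
    x = list(y)
    for i in range(n - 1, -1, -1):
        for j in range(min(n - 1, i + s), i, -1):   # at most s subtractions per row
            x[i] -= R[i][j] * x[j]
        if R[i][i] == 0:
            raise ValueError("matrix is singular")
        x[i] /= R[i][i]
    return x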

Exercise 1.6.4
Please see the Matlab file exercise 1 6 4.m for the evaluation of the code to perform the
suggested numerical experiments. From those experiments one can see that the minimum

degree ordering produces the best ordering as far as remaining non-zero elements and time to
compute the Cholesky factor. This is followed by the reverse Cuthill-McKee ordering. Since
the matrix delsq is banded to begin with a direct Cholesky factorization can be computed
rather quickly. Randomizing the ordering of the nodes produces the matrix with the most
fill in (largest number of non-zero elements) and also requires the largest amount of time to
compute the Cholesky factorization for.

Exercise 1.6.5
Please see the Matlab file exercise 1 6 5.m for the evaluation of the code to perform the
suggested numerical experiments. From those experiments one can see that the minimum
degree ordering produces the best ordering as far as remaining non-zero elements and time
to compute the Cholesky factor. This is followed by the reverse Cuthill-McKee ordering.

Exercise 1.7.18 (solving linear systems with LU)


From exercise 1.7.10 our matrix A has the following LU

1
0
0
2
1 1 3

2 0
0
0
0 1 1

=
4
1 2 6 2 1 1
3 2 1
6 1 2 3
Then we are solving LUx = b by first solving Ly
Ly = b or

1
0
0 0

1 1
0 0

2 1 1 0
3 2 1 1

decomposition

2 1 1
0

0 0 1 1
0 0 0 1
0 0 0
1

= b and then Ux = y. The first problem is



12
y1

y2 8
=
y3 21
26
y4

which upon performing forward substitution gives


y1
y2
y3
y4

=
=
=
=

12
8 + y1 = 8 + 12 = 4
21 2y1 + y2 = 21 24 + 4 = 1
26 + 3y1 2y2 + y3 = 3(12) 2(4) + 1 = 29

The second step is to solve Ux = y

2 1
0 1

0 0
0 0

3
3

3
3

or the system

x1
1 3

1 3 x2
1 3 x3
x4
0 3

12
4

=
1
29

which upon performing backward substitution gives


x4
x3
x2
2x1

=
=
=
=

13
1 3x1 = 1 39 = 38 x3 = 38
x3 3x4 = 38 39 = 1
x2 + x3 3x4 = 1 + 38 29 = 10

x1 = 5

so our solution is given by



5
x1
x2 1

x3 = 38
13
x4
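
The same two-stage procedure (a forward substitution followed by a back substitution) can be sketched with SciPy as follows; this is an illustration only, using an arbitrary example matrix rather than the matrix from Exercise 1.7.10:

import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.random.rand(4, 4) + 4 * np.eye(4)        # some nonsingular matrix
b = np.random.rand(4)
P, L, U = lu(A)                                  # A = P L U
y = solve_triangular(L, P.T @ b, lower=True)     # forward substitution L y = P^T b
x = solve_triangular(U, y, lower=False)          # back substitution U x = y
assert np.allclose(A @ x, b)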

Exercise 1.7.34
Part (a): By considering the given matrix M and the product MA we see that we are multiplying row j by m and adding it to row i. This is the definition of a row operation of type 1.
Part (b): Since the given matrix M is also lower triangular, the determinant of M is the product of the diagonal elements. In this case, since the diagonal elements are all ones, the product is also one, and therefore det(M) = +1. Since Ã = MA we have that

|Ã| = |MA| = |M| |A| = |A| .

Part (c): I will assume that the inverse of M is given by replacing the element m in M by its negative -m. We can check this for correctness by multiplying the two matrices as M^{-1} M and observing that we obtain the identity. This result can also be understood by recognizing that the inverse of M (whose action is multiplying row j by m and adding it to row i) is the action of multiplying row j by -m and adding it to row i. This action is the same as replacing the (i, j)th element in I with -m.
Exercise 1.7.35
Part (a): By multiplying a matrix A by the matrix M described we find the ith and jth
rows of A exchanged.
Part (b): Since M is obtained from the identity matrix by exchanging two rows of I, the determinant changes sign, i.e.

det(M) = -det(I) = -1 ,

so

det(Ã) = det(MA) = det(M) det(A) = -det(A) .

Part (c): The inverse of exchanging the ith row and the jth row is performing this operation twice, so M^{-1} = M. We can also see that this is true by explicitly calculating the product of M with itself.

Exercise 1.7.36
Part (a): M is obtained from the identity matrix by replacing the ith row by c times the
ith row, i.e. Mii = c.
Part (b): M 1 is obtained from the identity matrix by replacing the ith row by 1/c times
the ith row, i.e. Mii = 1/c.
Part (c): We have

det(M) = c det(I) = c ,

so

det(Ã) = det(MA) = det(M) det(A) = c det(A) .

Exercise 1.7.44 (triangular matrices have triangular inverses)


Part (a): We are told that L is non-singular and lower triangular. We want to prove that
L1 is lower triangular. We will do this by using induction on n the dimension of L. For
n = 1 L is a scalar and L1 is also a scalar. Trivially both are lower triangular. Now assume
that if L is non-singular and lower triangular of size n n, then L1 has the same property.
Let L be a matrix of size (n+1) x (n+1) and partition L as follows

L = [ L11  0 ; L21  L22 ] .

Here L11 and L22 are both lower triangular matrices of size no larger than n x n, so that we can apply the induction hypothesis. Let M = L^{-1} and partition M conformally, i.e.

M = [ M11  M12 ; M21  M22 ] .
We want to show that M12 must be zero. Now since LM = I, by multiplying the partitioned matrices out we obtain

LM = [ L11  0 ; L21  L22 ] [ M11  M12 ; M21  M22 ]
   = [ L11 M11   L11 M12 ; L21 M11 + L22 M21   L21 M12 + L22 M22 ] = [ I  0 ; 0  I ] .

Equating block components gives

L11 M11 = I
L11 M12 = 0
L21 M11 + L22 M21 = 0
L21 M12 + L22 M22 = I .

By the induction hypothesis both L11 and L22 are invertible. Thus the equation L11 M11 = I gives M11 = L11^{-1}, and the equation L11 M12 = 0 gives M12 = 0. With these two conditions the equation L21 M12 + L22 M22 = I becomes L22 M22 = I. Since L22 is invertible we compute that M22 = L22^{-1}. As both L11 and L22 are lower triangular of size no larger than n x n, by the induction hypothesis their inverses are lower triangular, and we see that M itself is then lower triangular since

M = [ L11^{-1}  0 ; M21  L22^{-1} ] .
Thus by the principle of induction we have shown that the inverse of a lower triangular
matrix is lower triangular.

Part (b): We can prove that the main diagonal elements of L^{-1} are given by 1/l_ii in a number of ways. One is by using mathematical induction; another is to simply compute the product of the corresponding row of L with the corresponding column of L^{-1}. For example, since the ith row of L multiplied by the ith column of L^{-1} must produce unity, we have

1 = Σ_{k=1}^{n} L_ik (L^{-1})_ki .

Since L is lower triangular and L^{-1} is lower triangular, their components must satisfy

L_ik = 0 for k > i
(L^{-1})_ki = 0 for i > k ,

so the above sum becomes

1 = L_ii (L^{-1})_ii   or   (L^{-1})_ii = 1 / L_ii .

Exercise 1.7.45 (product of lower triangular matrices)


Part (a): We will prove that the product of two lower triangular matrices is lower triangular by induction. We begin with n = 2, for which we have

[ l11  0 ; l21  l22 ] [ m11  0 ; m21  m22 ] = [ l11 m11   0 ; l21 m11 + l22 m21   l22 m22 ] ,

which is lower triangular. Assume the product of two lower triangular matrices of size n x n is also lower triangular, and consider two lower triangular matrices of size n+1. Performing a bordered block partitioning of the two lower triangular matrices we have

L_{n+1} = [ L_n  0 ; l^T  l_{n+1,n+1} ]   and   M_{n+1} = [ M_n  0 ; m^T  m_{n+1,n+1} ] ,

where the single subscripts denote the order of the matrices. With this bordered block decomposition of our two individual matrices we have a product given by

L_{n+1} M_{n+1} = [ L_n M_n   0 ; l^T M_n + l_{n+1,n+1} m^T   l_{n+1,n+1} m_{n+1,n+1} ] .

Since by the induction hypothesis the product L_n M_n is lower triangular, we see that our product L_{n+1} M_{n+1} is lower triangular.

Part (b): This can again be proved by induction. For n = 2 the diagonal elements of the product are given by l11 m11 and l22 m22. Assume this is true for size n x n. The fact that the (n+1)th diagonal element of L_{n+1} M_{n+1} in the above bordered factorization is given by l_{n+1,n+1} m_{n+1,n+1}, together with the induction hypothesis applied to L_n M_n, shows that the diagonal elements of L_{n+1} M_{n+1} have the correct form. As stated in the book this implies that the product of two unit lower triangular matrices is unit lower triangular.
One can prove that the diagonal elements of products of unit triangular matrices are ones
by another method. Letting L and M be unit lower triangular we have that the ith row of
L is given by
Li,: = [li,1 , li,2 , . . . li,i1 , 1, 0, . . . 0]
the ith column of the matrix M is given by
M:,i = [0, 0, . . . , 0, 1, mi+1,i , mi+2,i , . . . , mn,i ]T
Then the (i, i) element in LM is given by the inner product of these two vectors and gives
one since this is the only non-zero overlapping component.

Exercise 1.7.46 (product of upper triangular matrices)


Let matrices U and V be upper triangular, then U T and V T are lower triangular. Because of
this U T V T is lower triangular. Since U T V T = (V U)T , we see that (V U)T is lower triangular.
Taking the transpose we see that V U is upper triangular. Thus the product of upper
triangular matrices are again upper triangular.

Exercise 1.7.47
Part (e): Our L matrix is given by L = M1 M2 M3 . . . Mn2 Mn1 . To compute this product
we will multiply the factors from the left to the right. To begin, consider the product of

M_{n-2} M_{n-1}, which (showing only the trailing 3 x 3 blocks; the leading rows and columns are those of the identity) is given by

M_{n-2} M_{n-1} = [ 1  0  0 ; m_{n-1,n-2}  1  0 ; m_{n,n-2}  0  1 ] [ 1  0  0 ; 0  1  0 ; 0  m_{n,n-1}  1 ]
                = [ 1  0  0 ; m_{n-1,n-2}  1  0 ; m_{n,n-2}  m_{n,n-1}  1 ] .

From this we see that the effect of the multiplication is to insert the two elements m_{n-1,n-2} and m_{n,n-2} into the lower right block matrix of M_{n-1}. Motivated by this observation we will now block decompose the product M_{n-2} M_{n-1} into a matrix of the form

[ I_{n-4}  0 ]
[ 0        A ] ,

with A defined as

A = [ 1  0  0  0 ; 0  1  0  0 ; 0  m_{n-1,n-2}  1  0 ; 0  m_{n,n-2}  m_{n,n-1}  1 ] = [ 1  0 ; 0  M ] .

Here we have increased the dimension of the lower right block matrix by one over the lower right block matrix annotated in the product M_{n-2} M_{n-1} above. In addition, we have defined the matrix M to be

M = [ 1  0  0 ; m_{n-1,n-2}  1  0 ; m_{n,n-2}  m_{n,n-1}  1 ] .

Now consider the matrix M_{n-3}, given (again showing only the trailing block embedded in the identity) by

M_{n-3} = [ 1  0  0  0 ; m_{n-2,n-3}  1  0  0 ; m_{n-1,n-3}  0  1  0 ; m_{n,n-3}  0  0  1 ] ,

which can be viewed as having the same block structure as the product M_{n-2} M_{n-1} above by considering a block decomposition of this matrix as

M_{n-3} = [ I_{n-4}  0 ; 0  B ] ,

where B is given by

B = [ 1  0  0  0 ; m_{n-2,n-3}  1  0  0 ; m_{n-1,n-3}  0  1  0 ; m_{n,n-3}  0  0  1 ] = [ 1  0 ; m  I ] .

Here we have introduced the vector m defined as m = [m_{n-2,n-3}, m_{n-1,n-3}, m_{n,n-3}]^T. Then the block product M_{n-3} (M_{n-2} M_{n-1}) is given by

M_{n-3} M_{n-2} M_{n-1} = [ I  0 ; 0  B ] [ I  0 ; 0  A ] = [ I  0 ; 0  BA ] .    (4)

For the product of B with A, we have from their definitions that

BA = [ 1  0 ; m  I ] [ 1  0 ; 0  M ] = [ 1  0 ; m  M ] .    (5)

From this we see that the effect of the multiplication BA is to replace the first column of A with the first column of B. Since everything we have done simply involves viewing the various multiplications as block matrices in specific ways, this procedure can be repeated at each step, i.e. we can form block matrices of the current running product M_i M_{i+1} ... M_{n-1} with the current M_{i-1} to multiply on the left, just as is done in Eq. 4. The lower right block multiplication can be decomposed further in a form exactly like Eq. 5. By repeating this procedure n-2 times, corresponding to all required products, we will have produced the matrix L as given in the book.

Exercise 1.7.44/46
Part (a): For n = 2 we have

[ a11  a12 ; 0  a22 ]^{-1} = [ a11^{-1}   -a11^{-1} a12 a22^{-1} ; 0   a22^{-1} ] ,

so an upper triangular matrix with n = 2 has an upper triangular inverse. Assume this is true for n = k; by mathematical induction we want to show it is true for n = k+1. Partition a (k+1) x (k+1) upper triangular matrix as

[ a11  h^T ; 0  A ] ,

with A a k x k upper triangular matrix. The inverse of the above is

[ a11^{-1}   -a11^{-1} h^T A^{-1} ; 0   A^{-1} ] .

Now A^{-1} is upper triangular by the induction hypothesis, and therefore the matrix above is upper triangular.

Part (b): If V is unit upper triangular then V^{-1} is unit upper triangular. The inverse of any triangular matrix (upper or lower) must have as its diagonal elements the reciprocals of the corresponding elements in the original matrix, i.e.

(V^{-1})_ii = 1 / v_ii .

Thus if v_ii = 1 then (V^{-1})_ii = 1.

Exercise 1.7.50
Part (a): Assume that there are at least two matrices M that satisfy the given expression. What is important in this expression is that the block matrices A11 and A12 don't change, while the transformation introduces zeros below the block matrix A11. As specified in the problem, the matrix Ã22 in each case can be different. This means we assume that there exist matrices M1 and M2 such that

[ I_k  0 ; M1  I_{n-k} ] [ A11  A12 ; A21  A22 ] = [ A11  A12 ; 0  Ã22^{(1)} ] ,    (1)

where we have indicated that Ã22 may depend on the M matrix by providing it with a superscript. For the matrix M2 we have a similar expression. The inverse of the block matrix

[ I_k  0 ; M1  I_{n-k} ]

is

[ I_k  0 ; -M1  I_{n-k} ]

(this inverse can be shown by direct multiplication of the two matrices). Multiplying Equation 1 on the left by this inverse and inserting the result into the corresponding expression for M2 gives

[ A11  A12 ; 0  Ã22^{(2)} ] = [ I_k  0 ; M2  I_{n-k} ] [ A11  A12 ; A21  A22 ] = [ I_k  0 ; M2  I_{n-k} ] [ I_k  0 ; -M1  I_{n-k} ] [ A11  A12 ; 0  Ã22^{(1)} ] .    (2)

Equating the (2, 1) blocks of this expression gives (M2 - M1) A11 = 0, which implies that M1 = M2 since A11 is nonsingular. This shows the uniqueness of this block elimination factor.

Returning to a single M, by multiplying the given expression out we have

[ I_k  0 ; M  I_{n-k} ] [ A11  A12 ; A21  A22 ] = [ A11  A12 ; M A11 + A21   M A12 + A22 ] = [ A11  A12 ; 0  Ã22 ] ,

so equating the (2, 1) block components we see that M A11 + A21 = 0, or M = -A21 A11^{-1}. In the same way, equating the (2, 2) block components gives

Ã22 = -A21 A11^{-1} A12 + A22 ,

which is the Schur complement of A11 in A.

Part (b): By multiplying the relation from Part (a) on the left by the inverse of the block elimination matrix we have

[ I_k  0 ; M  I_{n-k} ]^{-1} [ A11  A12 ; 0  Ã22 ] = [ A11  A12 ; A21  A22 ] ,

or

[ A11  A12 ; A21  A22 ] = [ I_k  0 ; -M  I_{n-k} ] [ A11  A12 ; 0  Ã22 ] ,

as was to be shown.

Part (c): Taking the determinant of the above expression gives, since |A| ≠ 0,

|A| = | [ I  0 ; -M  I ] |  | [ A11  A12 ; 0  Ã22 ] | = 1 · |A11| |Ã22| ,

so |Ã22| ≠ 0 (or else |A| = 0, which is not true), and therefore, since its determinant is nonzero, Ã22 is nonsingular.

Exercise 1.7.54
Part (a): Let H be symmetric and invertible; then by the definition of the inverse, H^{-1} satisfies H H^{-1} = I. Taking the transpose of both sides, and remembering that the transpose of a product is the product of the transposes in the opposite order, we have (H^{-1})^T H^T = I^T, which simplifies to (H^{-1})^T H = I, since both H and I are symmetric. Multiplying both sides on the right by H^{-1} we have

(H^{-1})^T = H^{-1} ,

showing that H^{-1} is symmetric.
Part (b): The Schur complement is given by Ã22 = A22 - A21 A11^{-1} A12 and involves the submatrices of A. To determine properties of these submatrices consider the block definition of A given by

A = [ A11  A12 ; A21  A22 ] .

Taking the transpose of A (and using the fact that it is symmetric) we have

A^T = [ A11^T  A21^T ; A12^T  A22^T ] = [ A11  A12 ; A21  A22 ] ,

which gives (equating blocks) the following

A11^T = A11
A21^T = A12
A12^T = A21
A22^T = A22 .

With these components we can compute the transpose of the Schur complement:

Ã22^T = A22^T - A12^T (A11^{-1})^T A21^T = A22 - A21 A11^{-1} A12 = Ã22 ,

showing that Ã22 is symmetric.

Exercise 1.8.1
Taking the block determinant of the given B we have
|B| = |B11 ||B22 | .
Since B22 has a column of all zeros, it has a zero determinant. Thus B has a zero determinant
and is singular.

Exercise 1.8.7 (the inverse of a permutation matrix)


The fact that P^{-1} = P^T can be recognized by considering the product of P with P^T. When row i of P is multiplied by any column of P^T other than column i, the result is zero, since the positions of the ones in the two vectors do not agree. However, when row i of P is multiplied by column i of P^T the result is one. Thus, looking at the components of P P^T, we see that P P^T = I and P^T is the inverse of P.
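
A tiny NumPy check of this fact (not in the original):

import numpy as np
P = np.eye(5)[[2, 0, 4, 1, 3]]                   # permute the rows of the identity
assert np.array_equal(P @ P.T, np.eye(5))        # P^T is the inverse of P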

Chapter 2 (Sensitivity of Linear Systems)


Exercise 2.1.10 (the one norm is a norm)
We have for the one norm the following definition

||x||_1 = Σ_{i=1}^{n} |x_i| .

We can check that each norm requirement is satisfied. The condition ||x||_1 ≥ 0 for all x, with ||x||_1 = 0 if and only if x = 0, is certainly true.

Exercise 2.1.10
From the definition of the one norm ||x||_1 = Σ_{i=1}^{n} |x_i| we see that ||x||_1 ≥ 0, and that ||x||_1 = 0 can only be true if x = 0. We also see that

||α x||_1 = Σ_{i=1}^{n} |α x_i| = |α| Σ_{i=1}^{n} |x_i| = |α| ||x||_1 ,

and that

||x + y||_1 = Σ_{i=1}^{n} |x_i + y_i| ≤ Σ_{i=1}^{n} (|x_i| + |y_i|) = ||x||_1 + ||y||_1 .

Exercise 2.1.11
The one-norm distance between two points is the distance traveled when one can move only parallel to the coordinate axes, i.e. taking only right-angle turns (the taxicab distance).

Exercise 2.1.13
For the infinity norm we have

||x||_∞ ≥ 0 ,

and ||x||_∞ = 0 if and only if x = 0. We also have

||α x||_∞ = max_{1≤i≤n} |α x_i| = |α| max_{1≤i≤n} |x_i| = |α| ||x||_∞ .

The third requirement of a norm is given by

||x + y||_∞ = max_{1≤i≤n} |x_i + y_i| ≤ max_{1≤i≤n} (|x_i| + |y_i|) ≤ max_{1≤i≤n} |x_i| + max_{1≤i≤n} |y_i| = ||x||_∞ + ||y||_∞ .

Exercise 2.2.4
Part (a): Show that κ(A) = κ(A^{-1}). We know that

κ(A) = ||A|| ||A^{-1}|| ,

so that

κ(A^{-1}) = ||A^{-1}|| ||(A^{-1})^{-1}|| = ||A^{-1}|| ||A|| = κ(A) .

Part (b): We have

κ(cA) = ||cA|| ||(cA)^{-1}|| = |c| ||A|| (1/|c|) ||A^{-1}|| = κ(A) .
Exercise 2.2.5
If p = 1 the set {x ∈ R^2 : ||x||_1 = 1} is equivalent to {(x, y) : |x| + |y| = 1}, a diamond. If p = 3/2 the set {x ∈ R^2 : (|x|^{3/2} + |y|^{3/2})^{2/3} = 1} is equivalent to {(x, y) : |x|^{3/2} + |y|^{3/2} = 1}. If p = 2 we have the set {(x, y) : x^2 + y^2 = 1}, the unit circle. If p = 3 we have the set {(x, y) : |x|^3 + |y|^3 = 1}, and if p = 10 the set {(x, y) : |x|^{10} + |y|^{10} = 1}. If p = ∞ we have the set {(x, y) : max(|x|, |y|) = 1}, a square. As p increases the unit "circle" bulges outward from the diamond toward the square.
Exercise 2.2.6
Part (a): The condition number of A is defined by κ(A) = ||A|| ||A^{-1}||, while the condition number of A^{-1} is κ(A^{-1}) = ||A^{-1}|| ||(A^{-1})^{-1}|| = ||A^{-1}|| ||A|| = κ(A).
Part (b): If c is any nonzero scalar then we have for κ(cA) the following simplification, proving the desired relationship:

κ(cA) = ||cA|| ||(cA)^{-1}|| = |c| ||A|| (1/|c|) ||A^{-1}|| = ||A|| ||A^{-1}|| = κ(A) .

Exercise 2.2.7
If A = G G^T then
Part (a): ||x||_A = (x^T A x)^{1/2} = (x^T G G^T x)^{1/2} = ((G^T x)^T (G^T x))^{1/2} = ||G^T x||_2 .
Part (b): We know that ||G^T x||_2 ≥ 0 for all G^T x, which is the same as for all x. In addition, ||G^T x||_2 = 0 if and only if G^T x = 0, which, since G^T is invertible, is true if and only if x = 0.
Next we have

||α x||_A = ||G^T (α x)||_2 = |α| ||G^T x||_2 = |α| ||x||_A .

Next we have

||x + y||_A = ||G^T (x + y)||_2 = ||G^T x + G^T y||_2 ≤ ||G^T x||_2 + ||G^T y||_2 = ||x||_A + ||y||_A .
Exercise 2.2.7
Let A = [1 1; 1 1] and B = [1 1; 1 1], and measure matrices by the largest entry in absolute value, ||A|| = max_{i,j} |a_ij|. Then ||A|| = 1 and ||B|| = 1, while

AB = [2 2; 2 2]  and  ||AB|| = 2 .

But then ||AB|| = 2 > ||A|| ||B|| = 1 · 1 = 1, so this norm is not submultiplicative.


Exercise 2.2.8
Part (a): ||A(cx)|| = ||c Ax|| = |c| ||Ax||, and ||cx|| = |c| ||x||, so the ratio ||Ax||/||x|| is unchanged by scalar multiplication of x.
Part (b):

||A|| = max_{x ≠ 0} ||Ax|| / ||x|| = max_{x ≠ 0} ||A (x/||x||)|| .

But as x runs through all of R^n with x ≠ 0, the vector x/||x|| runs through all of R^n with norm one. Thus

||A|| = max_{||x||=1} ||Ax|| .

Exercise 2.2.9
The Frobenius norm is defined by

||A||_F = ( Σ_{i=1}^{n} Σ_{j=1}^{n} |a_ij|^2 )^{1/2}

and the two norm of A is given by

||A||_2 = max_{x ≠ 0} ||Ax||_2 / ||x||_2 .

Part (a): We have ||I||_F = ( Σ_{i=1}^{n} 1^2 )^{1/2} = n^{1/2}, and

||I||_2 = max_{x ≠ 0} ||Ix||_2 / ||x||_2 = 1 .

Part (b): By the Cauchy-Schwarz inequality,

||A||_2^2 = max_{||x||_2 = 1} ||Ax||_2^2 = max_{||x||_2 = 1} Σ_{i=1}^{n} | Σ_{j=1}^{n} a_ij x_j |^2
         ≤ max_{||x||_2 = 1} Σ_{i=1}^{n} ( Σ_{j=1}^{n} |a_ij|^2 ) ( Σ_{j=1}^{n} x_j^2 )
         = Σ_{i=1}^{n} Σ_{j=1}^{n} |a_ij|^2 = ||A||_F^2 ,

since ||x||_2 = 1 means Σ_{j=1}^{n} x_j^2 = 1. Thus ||A||_2 ≤ ||A||_F.

Exercise 2.2.10
Remembering the claimed expression for the infinity norm, ||A||_∞ = max_{1≤i≤n} Σ_{j=1}^{n} |a_ij|, consider the following:

||Ax||_∞ = max_{1≤i≤n} |(Ax)_i| ≤ max_{1≤i≤n} Σ_{j=1}^{n} |a_ij| |x_j|
        ≤ ( max_{1≤k≤n} |x_k| ) max_{1≤i≤n} Σ_{j=1}^{n} |a_ij|
        = ||x||_∞ max_{1≤i≤n} Σ_{j=1}^{n} |a_ij| .

Therefore

||Ax||_∞ / ||x||_∞ ≤ max_{1≤i≤n} Σ_{j=1}^{n} |a_ij| .

To show that this bound is attained, let max_{1≤i≤n} Σ_{j=1}^{n} |a_ij| occur at the k-th row, and choose the vector x with components

x_j = -1 if a_kj < 0 ,   x_j = +1 if a_kj ≥ 0 .

Then ||x||_∞ = 1 and

(Ax)_k = Σ_{j=1}^{n} a_kj x_j = Σ_{j=1}^{n} |a_kj| ,

so that

||Ax||_∞ / ||x||_∞ ≥ Σ_{j=1}^{n} |a_kj| = max_{1≤i≤n} Σ_{j=1}^{n} |a_ij| .

Combining the two inequalities gives ||A||_∞ = max_{1≤i≤n} Σ_{j=1}^{n} |a_ij|.
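
A NumPy illustration (not in the original) that the induced infinity norm is the maximum absolute row sum:

import numpy as np
A = np.array([[1., -2., 3.], [0., 4., -1.], [2., 2., 2.]])
assert np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max())   # both equal 6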
Exercise 2.3.2
Recall the definitions

maxmag(A) = max_{x ≠ 0} ||Ax|| / ||x||   and   minmag(A) = min_{x ≠ 0} ||Ax|| / ||x|| .

Substituting v = Ax (so x = A^{-1} v) we have

maxmag(A) = max_{x ≠ 0} ||Ax|| / ||x|| = max_{v ≠ 0} ||v|| / ||A^{-1} v|| = 1 / ( min_{v ≠ 0} ||A^{-1} v|| / ||v|| ) = 1 / minmag(A^{-1}) ,

and in the same way

minmag(A) = min_{x ≠ 0} ||Ax|| / ||x|| = min_{v ≠ 0} ||v|| / ||A^{-1} v|| = 1 / ( max_{v ≠ 0} ||A^{-1} v|| / ||v|| ) = 1 / maxmag(A^{-1}) .

Then, using these relations,

maxmag(A) / minmag(A) = maxmag(A) · maxmag(A^{-1}) = ||A|| ||A^{-1}|| = κ(A) .

Exercise 2.3.4
With A_ε = [ε 0; 0 ε] we have

||A_ε|| = max_{x ≠ 0} ||A_ε x|| / ||x|| = max_{(x1,x2) ≠ 0} ||(ε x1, ε x2)|| / ||x|| = ε max_{x ≠ 0} ||x|| / ||x|| = ε .

Also A_ε^{-1} = [1/ε 0; 0 1/ε], so by the same argument ||A_ε^{-1}|| = 1/ε, and therefore

κ(A_ε) = ε (1/ε) = 1 ,

while obviously det(A_ε) = ε^2.

Exercise 2.3.5 (perturbing the right-hand side b)

We want to prove that

||δb|| / ||b|| ≤ κ(A) ||δx|| / ||x|| .

We begin with Ax = b, which gives x = A^{-1} b, so that ||x|| ≤ ||A^{-1}|| ||b||, or equivalently

1 / ||b|| ≤ ||A^{-1}|| / ||x|| .

By perturbing the solution of Ax = b we see how δx relates to δb by considering

A(x + δx) = b + δb   =>   A δx = δb ,

so ||δb|| ≤ ||A|| ||δx||. Multiplying these two inequalities we get

||δb|| / ||b|| ≤ ||A|| ||A^{-1}|| ||δx|| / ||x|| = κ(A) ||δx|| / ||x|| .

Equality in this expression holds when ||x|| = ||A^{-1}|| ||b||, i.e. b is in the direction of maximum magnification of A^{-1}, and when ||δb|| = ||A|| ||δx||, i.e. δx is in the direction of maximum magnification of A. In that case

||δb|| / ||b|| = κ(A) ||δx|| / ||x|| .

Exercise 2.3.6 (||δb||/||b|| ≤ κ(A) ||δx||/||x|| as a statement about residuals)
We have r = A(x + δx) - b = A δx = δb. So

||r|| / ||b|| = ||δb|| / ||b|| ≤ κ(A) ||δx|| / ||x|| .

As the above inequality is sharp, we can have ||r||/||b|| = κ(A) ||δx||/||x||, and thus when κ(A) is large the relative residual can be very much nonzero even for a small relative error ||δx||/||x||. If the matrix is well conditioned then κ(A) is fairly small, and by the above the relative residual must then be small; we cannot get a very large ||r||/||b|| with a well conditioned matrix.

Watkins: Ex 2.3.10 (solving Ax = b with perturbations in A and b)

We are told to consider Ax = b and (A + δA)(x + δx) = b + δb. Expanding the left-hand side of the second equation gives

Ax + A δx + δA (x + δx) = b + δb ,

or, since Ax = b,

A δx = δb - δA (x + δx) ,

or, solving for δx,

δx = A^{-1} ( δb - δA (x + δx) ) .

Taking vector norms on both sides we get

||δx|| ≤ ||A^{-1}|| ( ||δb|| + ||δA|| (||x|| + ||δx||) )
       = ||A^{-1}|| ||A|| ( ||δb|| / ||A|| + (||δA|| / ||A||) (||x|| + ||δx||) ) .

Since b = Ax we have ||b|| ≤ ||A|| ||x||, so 1/||A|| ≤ ||x||/||b||, and using the definition of the condition number κ(A) = ||A|| ||A^{-1}|| we have

||δx|| ≤ κ(A) ( (||δb|| / ||b||) ||x|| + (||δA|| / ||A||) ||x|| + (||δA|| / ||A||) ||δx|| ) ,

or, solving for ||δx||,

( 1 - κ(A) ||δA|| / ||A|| ) ||δx|| ≤ κ(A) ( ||δA|| / ||A|| + ||δb|| / ||b|| ) ||x|| .

Since we are assuming that our matrix perturbation is small enough that the new matrix A + δA is still invertible, i.e. ||δA||/||A|| < 1/κ(A), the left-hand side has a positive leading coefficient and we can divide by it to get

||δx|| / ||x|| ≤ κ(A) ( ||δA||/||A|| + ||δb||/||b|| ) / ( 1 - κ(A) ||δA||/||A|| ) ,    (6)

the desired expression.
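
A numerical illustration of Equation 6 (not part of the original; it uses the 2-norm, a small random perturbation, and assumes κ(A) ||δA||/||A|| < 1):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)
b = rng.standard_normal(5)
dA = 1e-6 * rng.standard_normal((5, 5))
db = 1e-6 * rng.standard_normal(5)

x = np.linalg.solve(A, b)
dx = np.linalg.solve(A + dA, b + db) - x
kappa = np.linalg.cond(A)
rel_A = np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)
rel_b = np.linalg.norm(db, 2) / np.linalg.norm(b, 2)
bound = kappa * (rel_A + rel_b) / (1 - kappa * rel_A)
assert np.linalg.norm(dx, 2) / np.linalg.norm(x, 2) <= bound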

Chapter 4 (Eigenvalues and Eigenvectors I)


Exercise 4.3.21 (quartic convergence)
Exercise 4.3.22 (the number of digits of accuracy with linear convergence)
At iteration j denote our approximation error as ||q_j - v||_2 = 10^{-s_j} for some s_j. Since we are told that we have linear convergence with ratio 1/r, we know that

||q_{j+1} - v||_2 ≈ (1/r) ||q_j - v||_2 = (1/r) 10^{-s_j} = 10^{-s_j - log10(r)} ,

so the number of correct digits increases by log10(r) on each iteration.
Exercise 4.4.1 (geometric multiplicity)
The geometric multiplicity of an eigenvalue λ of A is the dimension of the space {v ∈ C^n : Av = λv}. If B = P^{-1} A P, this space transforms as

dim{v ∈ C^n : Bv = λv} = dim{v ∈ C^n : P^{-1} A P v = λ v} = dim{P v ∈ C^n : A (P v) = λ (P v)} ,

and since P is invertible the two spaces have the same dimension. Thus the geometric multiplicity of an eigenvalue of B equals the geometric multiplicity of the corresponding eigenvalue of A.

Exercise 4.4.2 (geometric and algebraic multiplicity for simple matrices)


When A is simple it has n distinct eigenvalues and thus n linearly independent eigenvectors.
Thus the geometric and the algebraic multiplicity of each eigenvalue is 1.

Exercise 4.4.3 (A1 is similar to B 1 )


If A is similar to B, that means there is an invertible transformation P such that B = P^{-1} A P. Taking the inverse of each side (assuming all inverses exist) we have

B^{-1} = P^{-1} A^{-1} P ,

so B^{-1} and A^{-1} are similar.

Exercise 4.4.4 (eigenvectors of a diagonal matrix)


The vectors that are all zeros except for a single 1 at one location are the eigenvectors of a
diagonal matrix. These vectors are obviously linearly independent and a diagonal matrix is
therefore simple.

Exercise 4.4.5
If A satisfies V^{-1} A V = D with D diagonal and V nonsingular, then we want to show that the columns of V are n linearly independent eigenvectors of A and the elements of D are the eigenvalues. From V^{-1} A V = D we have AV = V D. If we let v_i be the ith column of V and λ_i the ith diagonal element of D, then AV = VD is the same as A v_i = λ_i v_i for i = 1, 2, ..., n, so v_i is an eigenvector of A with eigenvalue λ_i. That v_1, ..., v_n are linearly independent follows from the fact that V is nonsingular. As A has n linearly independent eigenvectors, A is simple.

Exercise 4.4.6 (norms of unitary matrices)


Part (a): As U is unitary we have ||Ux||_2 = ||x||_2, so

||U||_2 = max_{x ≠ 0} ||Ux||_2 / ||x||_2 = max_{x ≠ 0} ||x||_2 / ||x||_2 = 1 .

As U is unitary so is U^{-1}, thus ||U^{-1}||_2 = 1 also. Thus we have

κ_2(U) = ||U||_2 ||U^{-1}||_2 = 1 · 1 = 1 .

Part (b): We are told that A and B are unitarily similar, thus B = U* A U. Then

||B||_2 ≤ ||U*||_2 ||A||_2 ||U||_2 = ||A||_2

by the consistency property of the matrix norm. Also, from A = U B U* we have ||A||_2 ≤ ||B||_2, so ||A||_2 = ||B||_2. In the same way we can show that ||A^{-1}||_2 = ||B^{-1}||_2, so we have

κ_2(A) = ||A||_2 ||A^{-1}||_2 = ||B||_2 ||B^{-1}||_2 = κ_2(B) .

Part (c): Given B = U* A U, consider B + δB = U* (A + δA) U, or

B + δB = U* A U + U* δA U ,

so δB = U* δA U. Thus δA and δB are unitarily similar and ||δB||_2 = ||δA||_2.

Exercise 4.4.7 (round-off errors in arbitrary similarity transforms)

From the given P = [1 α; 0 1] we have P^{-1} = [1 -α; 0 1].

Part (a): With A = [2 0; 0 1] we get B = P^{-1} A P = [2 α; 0 1], and (in the infinity norm)

||B|| = max(2 + |α|, 1) = 2 + |α| ,

which is large if α is large.

Part (b): Take A = [1 0; 0 1] and δA = (ε/2) [1 1; 1 1]. Then

||A|| = 1  and  ||δA|| = (ε/2)(2) = ε ,

thus ||δA||/||A|| = ε. From the matrices above we have B = P^{-1} A P = P^{-1} P = I, and

B + δB = P^{-1} (A + δA) P = I + P^{-1} δA P ,

so

δB = P^{-1} δA P = (ε/2) [1-α   1-α^2 ; 1   1+α] .

Thus ||B|| = 1 and

||δB|| = (ε/2) max( |1-α| + |1-α^2| , 1 + |1+α| ) ,

which can be made as large as desired by selecting α large. When α is large this expression is O(ε α^2), so the relative perturbation ||δB||/||B|| can be far larger than ||δA||/||A|| = ε when the transforming matrix P is ill conditioned.

Exercise 4.4.8 (conditioning of similarity transformations)


Part (a): Writing B = P^{-1} A P we have

||B|| ≤ ||P^{-1}|| ||A|| ||P|| = κ(P) ||A|| .

In the same way, from A = P B P^{-1} we get ||A|| ≤ κ(P) ||B||, so (1/κ(P)) ||A|| ≤ ||B||. Combining these two we get

(1/κ(P)) ||A|| ≤ ||B|| ≤ κ(P) ||A|| .

Part (b): From B = P^{-1} A P and B + δB = P^{-1} (A + δA) P we have

δB = P^{-1} δA P .

Thus, as in Part (a), we can show that

(1/κ(P)) ||δA|| ≤ ||δB|| ≤ κ(P) ||δA|| .

Using ||δB|| ≤ κ(P) ||δA|| together with ||B|| ≥ (1/κ(P)) ||A|| we have

||δB|| / ||B|| ≤ κ(P)^2 ||δA|| / ||A|| ,

and using ||δB|| ≥ (1/κ(P)) ||δA|| together with ||B|| ≤ κ(P) ||A|| we have

||δB|| / ||B|| ≥ (1/κ(P)^2) ||δA|| / ||A|| .
Exercise 4.4.9 (arbitrary similarity transformations don't preserve A = A*)

For this just pick any invertible P such that P* ≠ P, like P = [1 i; 0 1], and consider a matrix A that is Hermitian (satisfies A* = A), like A = [1 2; 2 1]. From the given P we have P^{-1} = [1 -i; 0 1], and we find

B = P^{-1} A P = [1 -i; 0 1] [1 2; 2 1] [1 i; 0 1]
  = [1-2i   2-i; 2   1] [1 i; 0 1]
  = [1-2i   i(1-2i) + 2 - i; 2   2i + 1]
  = [1-2i   4; 2   1+2i] .

Note that B* = [1+2i   2; 4   1-2i] ≠ B, so B is not Hermitian even though A is.

Exercise 4.4.10 (observations from the proof of Schur's theorem)

Part (a): We can obtain the eigenvalues on the diagonal of T in any order, since all that was required in the proof of Schur's theorem was to initially pick an eigenvector v of A; the eigenvalue λ associated with this eigenvector appears at the (1, 1) location in the matrix T. Since v was arbitrary we can pick any eigenvector, and correspondingly any eigenvalue, to come first. This argument can then be repeated recursively on the deflated matrix.

Part (b): We start the proof of Schur's theorem by picking an eigenvector v (of A); that vector becomes the first column of the unitary matrix U1 = [v W]. The total unitary transformation is U = U1 U2 with

U2 = [1  0 ; 0  Ũ2] ,

where Ũ2 is unitary of size (n-1) x (n-1). Note that

U = U1 U2 = [v  W] [1  0 ; 0  Ũ2] = [v  W Ũ2] .

Thus the first column of U is the eigenvector of A with eigenvalue λ. Since λ (and the corresponding eigenvector v) was chosen arbitrarily, we can obtain any ordering of the eigenvalues and eigenvectors.

Exercise 4.4.13 (the Rayleigh quotient can give approximate eigenvalues)


Part (a): From q = Σ_{i=1}^{n} c_i v_i we have

||q||_2^2 = q* q = ( Σ_{i=1}^{n} c_i v_i )* ( Σ_{j=1}^{n} c_j v_j ) = Σ_{i=1}^{n} |c_i|^2 ,

since v_i* v_j = δ_ij, the Kronecker delta.

Part (b): Note that

v_1 - q = (1 - c_1) v_1 - Σ_{j=2}^{n} c_j v_j .

Thus, using the same arguments as in the first part of this problem, we have

||v_1 - q||_2^2 = |1 - c_1|^2 + Σ_{j=2}^{n} |c_j|^2 .

Dropping the nonnegative term |1 - c_1|^2 we have

||v_1 - q||_2^2 ≥ Σ_{j=2}^{n} |c_j|^2 ,

as we were to show.

Part (c): Note the Rayleigh quotient is ρ = q* A q when the norm of q is one. If we expand q in terms of the eigenvectors of A we have q = Σ_{i=1}^{n} c_i v_i, so that

A q = Σ_{i=1}^{n} c_i λ_i v_i ,

giving

ρ = q* A q = ( Σ_{j=1}^{n} c_j v_j )* ( Σ_{i=1}^{n} c_i λ_i v_i ) = Σ_{i=1}^{n} |c_i|^2 λ_i ,

using the orthonormality of the eigenvectors.

Part (d): Since Σ_{i=1}^{n} |c_i|^2 = 1 by Part (a), we can write

λ_1 = Σ_{i=1}^{n} λ_1 |c_i|^2 ,

so that

λ_1 - ρ = Σ_{i=1}^{n} (λ_1 - λ_i) |c_i|^2 = Σ_{i=2}^{n} (λ_1 - λ_i) |c_i|^2 .

Using this we have

|λ_1 - ρ| ≤ C Σ_{i=2}^{n} |c_i|^2 ≤ C ||v_1 - q||_2^2 ,

using Part (b), with C a bound on |λ_1 - λ_i|. The takeaway is that if we pick a vector q that is close to the eigenvector v_1 (say within O(ε) in norm) then the Rayleigh quotient will be close to the eigenvalue λ_1 by an amount O(ε^2).
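
A NumPy demonstration of this quadratic accuracy (not in the original): perturbing an eigenvector of a symmetric matrix by O(ε) changes the Rayleigh quotient from the eigenvalue by roughly O(ε^2).

import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B + B.T                                  # a symmetric matrix
lam, V = np.linalg.eigh(A)
v1 = V[:, 0]
for eps in [1e-2, 1e-4]:
    q = v1 + eps * rng.standard_normal(6)
    q /= np.linalg.norm(q)
    rho = q @ A @ q                          # Rayleigh quotient
    print(eps, abs(rho - lam[0]))            # error scales roughly like eps**2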

Exercise 4.4.14 (eigenvalues of Hermitian matrices are real)


Part (a): Note that the conjugate of the expression x* A x is itself again:

(x* A x)* = x* A* x = x* A x .

Only real numbers are their own complex conjugates, so x* A x is real.
Part (b): Since the numerator x* A x and the denominator x* x in the Rayleigh quotient are real, the Rayleigh quotient itself must be real.
Part (c): As the Rayleigh quotient computed with an eigenvector returns the corresponding eigenvalue, the eigenvalues of Hermitian matrices must be real.

Exercise 4.4.15 (unitary similarity and positive definiteness)


Assume that B is unitarily similar to A, i.e. B = U* A U, and consider for an arbitrary nonzero vector x the inner product

x* B x = x* U* A U x = (Ux)* A (Ux) > 0 ,

since A is positive definite and Ux ≠ 0. Thus B is positive definite.

Exercise 4.4.16 (eigenvalues of positive definite matrices are positive)


Consider the Rayleigh quotient x* A x / x* x for a positive definite matrix A, evaluated at an eigenvector x. Since the numerator and the denominator are both positive real numbers, the quotient is positive. Thus the eigenvalues of the positive definite matrix A must be positive.

Exercise 4.4.17 (positive definite is equivalent to positive eigenvalues)


We have shown that if A is positive definite it has positive eigenvalues. Now assume that A is Hermitian with positive eigenvalues. From the spectral theorem we can write A = U D U* with U unitary and D a diagonal matrix with the positive eigenvalues of A as its diagonal elements. In that case, for x ≠ 0,

x* A x = x* U D U* x = (U* x)* D (U* x) > 0 ,

since the last expression is a sum of positive terms (U* x ≠ 0). Thus A is positive definite.

Exercise 4.4.18 (positive semidefinite matrices)


For the corresponding results with positive semidefinite matrices, all strict inequalities like > become ≥, but the arguments are otherwise the same.

Exercise 4.4.19 (skew-Hermitian matrices)


Part (a): Consider B* = (U* A U)* = U* A* U = -U* A U = -B, showing that B is skew-Hermitian.
Part (b): When A is skew-Hermitian, Schur's theorem gives T = U* A U, so T and A are unitarily similar. Thus T must be skew-Hermitian, T* = -T, and since T is also upper triangular this means that T is a diagonal matrix with purely imaginary elements on the diagonal.
Part (c): Besides the argument above we can consider the Rayleigh quotient ρ = x* A x / x* x. Taking its conjugate we have

ρ* = x* A* x / x* x = -x* A x / x* x = -ρ .

The only numbers that are the negatives of their own conjugates are purely imaginary numbers, showing that the eigenvalues of A must be purely imaginary.

Exercise 4.4.20 (unitary and unitary similarity)


Part (a): If B = U* A U then

B B* = U* A U (U* A* U) = U* A A* U = U* U = I ,

therefore B^{-1} = B* and B is unitary.
Part (b): Let T be an upper triangular matrix that is unitary. Because T is unitary we know that T^{-1} = T*. Since inverses of upper triangular matrices are also upper triangular, T^{-1} is upper triangular; at the same time T* is lower triangular. In order for these two to be equal they must both be diagonal matrices. As T* is a diagonal matrix, so is T.
Part (c): Schur's theorem states that there exists a unitary matrix U such that U* A U = T, with T upper triangular and with the eigenvalues of A on the diagonal. Since A is unitary and T is unitarily similar to A, by Part (a) we know that T is unitary. Since T is unitary and upper triangular we know by Part (b) that T must be a diagonal matrix.
Part (d): From the above discussion we have U* A U = T where T is a unitary diagonal matrix holding the eigenvalues of A. Since T is unitary, T^{-1} = T*, and writing this equation for an arbitrary diagonal element λ of T we have 1/λ = conj(λ), or |λ|^2 = 1. Thus the eigenvalues of A lie on the unit circle.
Another way to see this is to consider the Rayleigh quotient for a unit eigenvector x with eigenvalue λ. Since A is unitary, A* = A^{-1}, and x is also an eigenvector of A^{-1} with eigenvalue 1/λ, so

conj(λ) = (x* A x)* = x* A* x = x* A^{-1} x = (1/λ) x* x = 1/λ .

Again we have conj(λ) λ = 1, or |λ|^2 = 1.


Exercise 4.4.21 (normal matrices)
Part (a): We will show this just for the case where A is Hermitian, since all the other cases are similar. In that case A* = A, so A A* = A^2 and A* A = A^2 also; thus A A* = A* A and A is normal.
Part (b): If B = U* A U then we have

B B* = U* A U (U* A* U) = U* A A* U = U* A* A U ,

since A is normal. Continuing, we have

U* A* A U = U* A* U U* A U = B* B ,

showing that B is normal.
Part (c): Partition the original n x n upper triangular matrix T in the following way

T = [ t11  t_r ; 0  T' ] ,

where T' is an (n-1) x (n-1) upper triangular matrix, t11 is a scalar, and t_r is a row vector. Then

T* = [ conj(t11)  0 ; t_r*  T'* ] .

The products T T* and T* T are

T T* = [ |t11|^2 + t_r t_r*    t_r T'* ; T' t_r*    T' T'* ]

T* T = [ |t11|^2    conj(t11) t_r ; t11 t_r*    t_r* t_r + T'* T' ] .

Since we assume that T T* = T* T, equating the (1,1) components of the above we get

|t11|^2 + t_r t_r* = |t11|^2 ,

so t_r t_r* = 0. But t_r t_r* is the sum of the squared magnitudes of the elements of t_r, which is zero only if each element is zero; thus t_r = 0. With t_r = 0 the remaining equations reduce to T' T'* = T'* T', so T' is normal, upper triangular, and of smaller size; by induction T' is diagonal, and therefore T is diagonal.
Part (d): Let A be a diagonal matrix; then

A A* = diag( |a11|^2 , |a22|^2 , ... , |ann|^2 ) ,

and A* A is the same, so A is normal.
Part (e): Assume that A is unitarily similar to a diagonal matrix, say D. Since D is normal by Part (d), and every matrix unitarily similar to a normal matrix is normal by Part (b), A is normal.
Conversely, assume that A is normal, so A A* = A* A. By Schur's theorem there exists a unitary matrix U such that U* A U = T with T upper triangular. Now A is normal and thus T is normal by Part (b). By Part (c), since T is upper triangular and normal, T is diagonal. Thus A is unitarily similar to a diagonal matrix.

Exercise 4.4.22 (A has n orthonormal eigenvectors then A is normal)


Assume that A has n orthonormal eigenvectors. Form the matrix U with these n eigenvectors as columns; then U is unitary and AU = UD, where D is the diagonal matrix of eigenvalues, so D = U* A U and A and D are unitarily similar. As a diagonal matrix is normal, by Exercise 4.4.21 Part (b) (or Part (e)) A is also normal.

Exercise 4.4.23 (properties of diagonal matrices)


Part (a): D is Hermitian, so D* = D. In components this means conj(d_ii) = d_ii, so the diagonal elements must be real. These are also the eigenvalues of the matrix D.
Part (b): Let e_i be a vector of all zeros with a 1 in the ith location. Then e_i* D e_i = d_ii ≥ 0 when D is positive semidefinite, so the diagonal elements (the eigenvalues) are nonnegative.
Part (c): This is the same as the previous part except that e_i* D e_i = d_ii > 0.
Part (d): For D skew-Hermitian, D* = -D, or in component form conj(d_ii) = -d_ii, so d_ii is purely imaginary; thus the eigenvalues of D are purely imaginary.
Part (e): For D unitary, D* = D^{-1}, which in component form is conj(d_ii) = 1/d_ii, or |d_ii|^2 = 1.

Exercise 4.4.24 (properties of normal matrices)


Part (a): If A is a normal matrix then it is unitarily similar to a diagonal matrix: U* A U = D. If in addition A is Hermitian, then D* = (U* A U)* = U* A* U = U* A U = D, so D is Hermitian. By Exercise 4.4.23 (a) the eigenvalues of D are real; as A is similar to D it has the same (real) eigenvalues.
Part (b-e): Following the same steps as in Part (a), we show that A is unitarily similar to a diagonal matrix D that has the same property (skew-Hermitian, positive (semi)definite, or unitary) as A. Then we use Exercise 4.4.23 to characterize the eigenvalues of D. Since A has the same eigenvalues as D (they are similar), the eigenvalues of A satisfy the desired properties.

Exercise 4.4.25 (Rayleigh quotient of normal matrices approximate eigenvalues)


Since the book has shown that every normal matrix has n orthonormal eigenvectors (the
main requirement needed for Exercise 4.4.13) we can follow the steps in that proof to show
the required steps.

Exercise 4.4.26 (defective matrices and a nearby simple matrix)


Let A be defective. By Schur's theorem A = U T U* with T upper triangular and with the eigenvalues of A as the diagonal elements of T. Consider a matrix A' created from another (as yet unspecified) upper triangular matrix T' as A' = U T' U*, where U is the same unitary matrix as in Schur's theorem. Then we have

||A - A'||_2 = ||U (T - T') U*||_2 = ||T - T'||_2 .

We now pick T' to equal T in all off-diagonal elements. Then the matrix T - T' is a diagonal matrix, say D. The two norm of this diagonal matrix is the square root of the largest eigenvalue of D* D, which is the largest absolute value of its diagonal elements. Thus we have

||T - T'||_2 = max_i |T_ii - T'_ii| .

We can now select the diagonal elements of T' in such a way that they are all distinct and the maximum above is as small as we please. With this choice T', and hence A', is simple, and A' is within the requested distance of the matrix A.

Exercise 4.4.28 (properties of skew-symmetric matrices)


Part (a): Assume that A is n by n skew-symmetric with n odd. As the matrix A is real, the characteristic polynomial det(A - λI) is an nth order real polynomial, so any nonzero complex roots must come in complex conjugate pairs. Since A is skew-symmetric (hence skew-Hermitian) its eigenvalues lie on the imaginary axis. Since n is odd, not all of its roots can be nonzero purely imaginary numbers, since those come in conjugate pairs and would require an even polynomial degree. Thus λ = 0 must be a root of the characteristic equation and our matrix A must be singular.

Part (b): These matrices look like

[ 0  b ; -b  0 ] ;

that is, every 2 by 2 skew-symmetric matrix has this form.

Part (c): T will be block diagonal, with each block either a 2 x 2 block of the form [ 0  b ; -b  0 ] or a 1 x 1 block of zero.

Exercise 4.4.29 (properties of the trace function)


Part (a): The trace function is a linear operator, and this is just an expression of that fact.
Part (b): We want to show that trace(CD) = trace(DC), which we do by considering the left-hand side and showing that it equals the right-hand side. We have

trace(CD) = Σ_{i=1}^{n} (CD)_ii = Σ_{i=1}^{n} Σ_{k=1}^{n} C_ik D_ki
          = Σ_{k=1}^{n} Σ_{i=1}^{n} D_ki C_ik = Σ_{k=1}^{n} (DC)_kk = trace(DC) .

Part (c): Note that ||B||_F^2 = Σ_{i=1}^{n} Σ_{j=1}^{n} |b_ij|^2. Now consider the (i, j)th element of B* B,

(B* B)_ij = Σ_{k=1}^{n} (B*)_ik B_kj = Σ_{k=1}^{n} conj(B_ki) B_kj .

Thus

trace(B* B) = Σ_{i=1}^{n} Σ_{k=1}^{n} conj(B_ki) B_ki = Σ_{i=1}^{n} Σ_{k=1}^{n} |B_ki|^2 = ||B||_F^2 .

Using the result from Part (b) we have trace(B* B) = trace(B B*), and thus this also equals ||B||_F^2.
Exercise 4.4.30 (normal and block triangular)
Part (a): We assume that T = [ T11  T12 ; 0  T22 ] is normal, thus T T* = T* T. Writing out the left-hand side and the right-hand side of this expression we get

T T* = [ T11  T12 ; 0  T22 ] [ T11*  0 ; T12*  T22* ] = [ T11 T11* + T12 T12*    T12 T22* ; T22 T12*    T22 T22* ]

T* T = [ T11*  0 ; T12*  T22* ] [ T11  T12 ; 0  T22 ] = [ T11* T11    T11* T12 ; T12* T11    T12* T12 + T22* T22 ] .

Since T is normal the above two expressions are equal. Considering the (1, 1) blocks we get

T11 T11* + T12 T12* = T11* T11 .

Taking the trace of this equation and using trace(T11 T11*) = trace(T11* T11) (Exercise 4.4.29 Part (b)), the common terms cancel and we get

trace(T12 T12*) = 0 .

Using the result proven earlier that this trace equals the squared Frobenius norm, the above means

||T12||_F^2 = trace(T12 T12*) = 0 ,

or

Σ_i Σ_j |(T12)_ij|^2 = 0 ,

which means that each component of T12 must be zero, i.e. the matrix T12 is zero. This means that T is block diagonal, as we were to show.

Exercise 4.4.31 (2 x 2 normal matrices)

Part (a): For a real 2 x 2 matrix A = [ a  b ; c  d ] we find

A A^T = [ a  b ; c  d ] [ a  c ; b  d ] = [ a^2 + b^2    ac + bd ; ac + bd    c^2 + d^2 ]

A^T A = [ a  c ; b  d ] [ a  b ; c  d ] = [ a^2 + c^2    ab + cd ; ab + cd    b^2 + d^2 ] .

For A to be normal, equating the (1, 1) components of A A^T = A^T A we get

a^2 + b^2 = a^2 + c^2  =>  b^2 = c^2 ,

which means that b = ±c. Equating the (1, 2) components of A A^T = A^T A we get

ac + bd = ab + cd .

If b = c then this last equation is automatically satisfied, and we get A = [ a  b ; b  d ], a symmetric matrix. If b = -c (with b ≠ 0) then the equation above becomes (changing all c's into -b's)

-ab + bd = ab - bd ,  or  a = d ,

and thus we have A = [ a  b ; -b  a ].

Part (b): For A of the given form the eigenvalues are given by |A - λI| = 0, or

| a-λ    b ; -b    a-λ | = 0 .

This gives (a - λ)^2 + b^2 = 0, or, since b is real, λ = a ± bi.

Part (c): By Exercise 4.4.30, T is block diagonal. The blocks are either 1 x 1 scalars (for the real eigenvalues) or 2 x 2 blocks of the form [ a  b ; -b  a ] (for the complex conjugate pairs of eigenvalues).
Exercise 4.4.32 (more properties of the trace)
Part (a): A and B are similar, so B = S^{-1} A S. Then

trace(B) = trace(S^{-1} A S) = trace(A S S^{-1}) = trace(A) ,

using trace(CD) = trace(DC).
Part (b): By Schur's theorem, for any matrix A there exists a unitary matrix U such that U* A U = T, with T upper triangular and with the eigenvalues of A on the diagonal of T. Then using Part (a) we have

trace(A) = trace(U* A U) = trace(T) = Σ_{i=1}^{n} λ_i .

Exercise 4.4.33 (the determinant of a matrix)

By the complex Schur theorem, for any matrix A there exists a unitary matrix U such that U* A U = T, with T upper triangular and with the eigenvalues of A on the diagonal of T. Then, since U* = U^{-1},

|U* A U| = |T| = Π_{i=1}^{n} λ_i .

The left-hand side is |U|^{-1} |A| |U| = |A|, and we get

|A| = Π_{i=1}^{n} λ_i ,

as we were to show.