3 Orthogonal Vectors and Matrices
The linear algebra portion of this course focuses on three matrix factorizations: QR factorization, singular value decomposition (SVD), and LU factorization. The first two of these factorizations involve orthogonal matrices. These matrices play a fundamental role in many numerical methods. This lecture first considers orthogonal vectors and then defines orthogonal matrices.
The vectors $u, v \in \mathbb{R}^m$ are said to be orthogonal if
$$(u, v) = 0.$$
In view of formula (11) in Lecture 1, orthogonal vectors meet at a right angle. The zero-vector $0$ is orthogonal to all vectors, but we are more interested in nonvanishing orthogonal vectors.
A set of vectors $S_n = \{v_j\}_{j=1}^{n}$ in $\mathbb{R}^m$ is said to be orthonormal if each pair of distinct vectors in $S_n$ is orthogonal and all vectors in $S_n$ are of unit length, i.e., if
$$(v_j, v_k) = \begin{cases} 0, & j \neq k, \\ 1, & j = k. \end{cases} \qquad (1)$$
Here we have used that $(v_k, v_k) = \|v_k\|^2$. It is not difficult to show that orthonormal vectors are linearly independent; see Exercise 3.1 below. It follows that the $m$ vectors of an orthonormal set $S_m$ in $\mathbb{R}^m$ form a basis for $\mathbb{R}^m$.
Example 3.1
The set $S_3 = \{e_j\}_{j=1}^{3}$ in $\mathbb{R}^5$ is orthonormal, where the $e_j$ are axis vectors; cf. (15) of Lecture 1. $\Box$
Example 3.2
The set $S_2 = \{v_1, v_2\}$ in $\mathbb{R}^2$, with
$$v_1 = \frac{1}{\sqrt{2}}[1, 1]^T, \qquad v_2 = \frac{1}{\sqrt{2}}[-1, 1]^T,$$
is orthonormal. Moreover, the set $S_2$ forms a basis for $\mathbb{R}^2$. $\Box$
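Orthonormality is easy to check numerically. The following MATLAB lines (a minimal check; storing the vectors as the columns of a matrix V is our choice, not prescribed by the text) verify condition (1) for the set of Example 3.2:

% The columns of V are the vectors v1 and v2 of Example 3.2.
V = [1 -1; 1 1] / sqrt(2);
% By (1), V'*V must be the identity matrix for an orthonormal set.
disp(V' * V)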
An arbitrary vector $v \in \mathbb{R}^m$ can be decomposed into orthogonal components. Consider the set $S_n = \{v_j\}_{j=1}^{n}$ of orthonormal vectors in $\mathbb{R}^m$, and examine the expression
$$r = v - \sum_{j=1}^{n} (v_j, v)\, v_j. \qquad (2)$$
The vector $(v_j, v)v_j$ is referred to as the orthogonal component of $v$ in the direction $v_j$. Moreover, the vector $r$ is orthogonal to the vectors $v_j$. This can be seen by computing the inner products $(v_k, r)$ for all $k$. We obtain
$$(v_k, r) = \Bigl(v_k,\; v - \sum_{j=1}^{n} (v_j, v)\, v_j\Bigr) = (v_k, v) - \sum_{j=1}^{n} (v_j, v)(v_k, v_j).$$
Using (1), the sum in the right-hand side simplifies to
$$\sum_{j=1}^{n} (v_j, v)(v_k, v_j) = (v_k, v),$$
and it follows that $(v_k, r) = 0$ for $1 \le k \le n$.
Example 3.3
Let $v = [1, 2, 3, 4, 5]^T$ and let the set $S_3$ be the same as in Example 3.1. Then $r = [0, 0, 0, 4, 5]^T$. Clearly, $r$ is orthogonal to the axis vectors $e_1$, $e_2$, $e_3$. $\Box$
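As a sketch of how the decomposition (2) can be carried out in MATLAB (the layout with the orthonormal vectors as columns of a matrix V is again our choice), the residual of Example 3.3 is computed by:

% Orthogonal decomposition (2): subtract from v its orthogonal
% components along the orthonormal columns of V.
V = eye(5); V = V(:, 1:3);     % columns e1, e2, e3; cf. Example 3.1
v = [1; 2; 3; 4; 5];
r = v - V * (V' * v);          % V*(V'*v) = sum_j (vj, v) vj
disp(r')                       % displays 0 0 0 4 5
disp((V' * r)')                % r is orthogonal to each vj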
Exercise 3.1
Let $S_n = \{v_j\}_{j=1}^{n}$ be a set of orthonormal vectors in $\mathbb{R}^m$. Show that the vectors $v_1, v_2, \ldots, v_n$ are linearly independent. Hint: Assume this is not the case. For instance, assume that $v_1$ is a linear combination of the vectors $v_2, v_3, \ldots, v_n$, and apply (1). $\Box$
A matrix $Q \in \mathbb{R}^{m \times m}$ is said to be orthogonal if
$$Q^T Q = I,$$
where $I$ denotes the identity matrix. Multiplying the above expression by the inverse $Q^{-1}$ from the right shows that
$$Q^T = Q^{-1}.$$
Thus, the transpose of an orthogonal matrix is its inverse.
Example 3.4
The identity matrix $I$ is orthogonal. $\Box$
Example 3.5
The matrix
$$Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$$
is orthogonal. Its inverse is its transpose,
$$Q^{-1} = Q^T = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}. \qquad \Box$$
Geometrically, multiplying a vector by an orthogonal matrix reflects the vector in some plane and/or rotates it. Such a multiplication therefore does not change the length of the vector: the norm of a vector $u$ is invariant under multiplication by an orthogonal matrix $Q$, i.e.,
$$\|Qu\| = \|u\|. \qquad (3)$$
This can be seen by using the properties (18) and (16) of Lecture 1. We have
$$\|Qu\|^2 = (Qu, Qu) = (Qu)^T (Qu) = u^T Q^T Q\, u = u^T u = \|u\|^2.$$
We take square roots of the right-hand side and left-hand side, and since $\|\cdot\|$ is nonnegative, equation (3) follows.
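A quick numerical illustration of (3) in MATLAB, using the orthogonal matrix $Q$ of Example 3.5 (the test vector is an arbitrary choice):

% Norm invariance (3): Q is the orthogonal matrix of Example 3.5.
Q = [1 1; -1 1] / sqrt(2);
u = [3; -4];                   % an arbitrary test vector
disp([norm(Q * u), norm(u)])   % both entries equal 5 up to rounding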
An important class of orthogonal matrices are the Householder matrices. They are matrices of the form
$$H = I - \rho u u^T, \qquad \rho = \frac{2}{\|u\|^2}, \qquad (4)$$
where $u \in \mathbb{R}^m$ is a nonvanishing vector; one can verify that $H$ is both symmetric and orthogonal. Given a nonvanishing vector $w \in \mathbb{R}^m$, we wish to determine a Householder matrix $H$ such that
$$Hw = \sigma e_1 \qquad (5)$$
for some scalar $\sigma$. Since $H$ is orthogonal, equation (3) shows that $|\sigma| = \|Hw\| = \|w\|$. Therefore,
$$\sigma = \pm\|w\|. \qquad (6)$$
Moreover, using the definition (4) of $H$, we obtain
$$\sigma e_1 = Hw = (I - \rho u u^T) w = w - \tau u, \qquad \tau = \rho\, u^T w,$$
from which it follows that
$$\tau u = w - \sigma e_1.$$
The matrix $H$ is independent of the scaling factor $\tau$ in the sense that the entries of the matrix $H$ do not change if we replace $\tau u$ by $u$. We therefore may choose
$$u = w - \sigma e_1. \qquad (7)$$
This choice of $u$ and either one of the choices (6) of $\sigma$ give a Householder matrix that satisfies (5).
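The scaling invariance just described is easy to check numerically; in the following MATLAB lines the vector $u$ and the scaling factor are arbitrary test choices:

% H = I - rho*u*u' with rho = 2/||u||^2 does not change when u
% is rescaled: replacing u by c*u leaves all entries of H intact.
u = [3; 1; 1; 1];
H1 = eye(4) - (2 / (u' * u)) * (u * u');
c = 7;                          % arbitrary nonzero scaling factor
v = c * u;
H2 = eye(4) - (2 / (v' * v)) * (v * v');
disp(norm(H1 - H2))             % zero up to rounding errors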
Nevertheless, in finite precision arithmetic, the choice of sign in (6) may be important. To see this, let $w = [w_1, w_2, \ldots, w_m]^T$ and write the vector (7) in the form
$$u = \begin{bmatrix} w_1 - \sigma \\ w_2 \\ w_3 \\ \vdots \\ w_m \end{bmatrix}.$$
If the components $w_j$ for $j \ge 2$ are of small magnitude compared to $w_1$, then $\|w\| \approx |w_1|$ and, therefore, the first component of $u$ satisfies
$$u_1 = w_1 - \sigma = w_1 \pm \|w\| \approx w_1 \pm |w_1|. \qquad (8)$$
We would like to avoid the situation that $|w_1|$ is large and $|u_1|$ is small, because then $u_1$ is determined with low relative accuracy; see Exercise 3.3 below. We therefore let
$$\sigma = -\mathrm{sign}(w_1)\|w\|, \qquad (9)$$
which yields
$$u_1 = w_1 + \mathrm{sign}(w_1)\|w\|. \qquad (10)$$
Thus, $u_1$ is computed by adding numbers of the same sign, so that no cancellation occurs.
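The computations (7), (9), and (10) can be collected in a short MATLAB function. This is a minimal sketch: the function name house is our choice, as is the convention used when $w_1 = 0$ (in that case either sign avoids cancellation).

function [u, rho, sigma] = house(w)
% HOUSE  Compute u, rho, sigma such that the Householder matrix
% H = I - rho*u*u' of (4) satisfies H*w = sigma*e1, with the sign
% of sigma chosen as in (9) to avoid cancellation in u(1).
s = sign(w(1));
if s == 0, s = 1; end          % convention for w(1) = 0 (our choice)
sigma = -s * norm(w);
u = w;
u(1) = w(1) - sigma;           % cf. (7) and (10)
rho = 2 / (u' * u);
end

For instance, [u, rho, sigma] = house([1; 1; 1; 1]) yields $\sigma = -2$, $u = [3, 1, 1, 1]^T$, and $\rho = 1/6$, in agreement with Example 3.6 below.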
Example 3.6
Let $w = [1, 1, 1, 1]^T$. We are interested in determining the Householder matrix $H$ that maps $w$ onto a multiple of $e_1$. The parameter $\sigma$ in (5) is chosen such that $|\sigma| = \|w\| = 2$, i.e., $\sigma$ is $2$ or $-2$. The vector $u$ in (4) is given by (7),
$$u = w - \sigma e_1 = \begin{bmatrix} 1 - \sigma \\ 1 \\ 1 \\ 1 \end{bmatrix}.$$
In accordance with (9), we choose $\sigma$ negative, because the first entry of $w$ is positive. The choice $\sigma = -2$ yields $u = [3, 1, 1, 1]^T$. Finally, we obtain $\rho = 2/\|u\|^2 = 1/6$ and
$$H = I - \rho u u^T = \begin{bmatrix} -1/2 & -1/2 & -1/2 & -1/2 \\ -1/2 & 5/6 & -1/6 & -1/6 \\ -1/2 & -1/6 & 5/6 & -1/6 \\ -1/2 & -1/6 & -1/6 & 5/6 \end{bmatrix}.$$
It is easy to verify that $H$ is orthogonal and maps $w$ to $-2e_1$. $\Box$
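The computations of Example 3.6 can be checked directly in MATLAB; here $H$ is formed explicitly for illustration only:

% Verification of Example 3.6: H is orthogonal and maps w to -2*e1.
w = [1; 1; 1; 1];
u = [3; 1; 1; 1];
rho = 2 / (u' * u);            % equals 1/6
H = eye(4) - rho * (u * u');
disp((H * w)')                 % displays -2 0 0 0
disp(norm(H' * H - eye(4)))    % zero up to rounding errors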
In most applications it is not necessary to explicitly form the Householder matrix $H$. Matrix-vector products with a vector $v$ are computed by using the definition (4) of $H$, i.e.,
$$Hv = (I - \rho u u^T) v = v - \tau u, \qquad \tau = \rho\, u^T v.$$
The left-hand side is computed by first evaluating the scalar $\tau = \rho\, u^T v$ and then computing the vector scaling and addition $v - \tau u$. This way of evaluating $Hv$ requires fewer arithmetic floating-point operations than straightforward computation of the matrix-vector product; see Exercise 3.6. Moreover, the entries of $H$ do not have to be stored, only the vector $u$ and the scalar $\rho$. The savings in arithmetic operations and storage are important for large problems.
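In MATLAB, the implicit evaluation described above may be sketched as follows; the vectors u and v are sample data:

% Evaluate H*v without forming H; only the vector u and the
% scalar rho need to be stored.
u = [3; 1; 1; 1];
rho = 2 / (u' * u);
v = [1; 2; 3; 4];
tau = rho * (u' * v);          % the scalar tau = rho*u'*v
Hv = v - tau * u;              % equals (eye(4) - rho*(u*u')) * v
disp(Hv')                      % one inner product and one vector
                               % update, not a full matrix product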
Exercise 3.2
Let $w = [1, 2, 3]^T$. Determine the Householder matrix that maps $w$ to a multiple of $e_1$. Only the vector $u$ in (4) has to be computed. $\Box$
Exercise 3.3
This exercise illustrates the importance of the choice of the sign of $\sigma$ in (6). Let $w = [1,\ 0.5 \cdot 10^{-8}]^T$ and let $u$ be the vector in the definition of Householder matrices (4), chosen such that $Hw = \sigma e_1$. MATLAB yields

>> w = [1; 0.5e-8]
w =
    1.0000
    0.0000
>> sigma = norm(w)
sigma =
     1
>> u1 = w(1) + sigma
u1 =
     2
>> u1 = w(1) - sigma
u1 =
     0

where u1 denotes the computed approximation of the first component, $u_1$, of the vector $u$ for each choice of sign. How large are the absolute and relative errors in the computed approximations u1 of the component $u_1$ of the vector $u$? $\Box$
Exercise 3.4
Show that the product $U_1 U_2$ of two orthogonal matrices is an orthogonal matrix. Is the product of $k > 2$ orthogonal matrices an orthogonal matrix? $\Box$
Exercise 3.5
Let $Q$ be an orthogonal matrix, i.e., $Q^T Q = I$. Show that $QQ^T = I$. $\Box$
Exercise 3.6
What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with an $n \times n$ Householder matrix $H$ when the representation (4) of $H$ is used? Only $u$ and $\rho$ are stored, not the entries of $H$. What is the count of arithmetic floating-point operations for evaluating a matrix-vector product with $H$ when the entries of $H$ (but not $u$ and $\rho$) are available? The correct order of magnitude of the arithmetic work when $n$ is large suffices. $\Box$
Exercise 3.7
Let the nonvanishing $m$-vectors $w_1$ and $w_2$ be given. Determine an orthogonal matrix $\widetilde{H}$, such that $w_2 = \widetilde{H} w_1$. Hint: Use Householder matrices. $\Box$