4 Orthogonality
MÜGE TAŞKIN
Boğaziçi University, Istanbul
FALL 2020
[Figure: vectors $u = \langle u_1, u_2 \rangle$ and $v = \langle v_1, v_2 \rangle$ in $\mathbb{R}^2$ with angle $\theta$ between them and difference $v - u = \langle v_1 - u_1, v_2 - u_2 \rangle$.]
$$\cos\theta = \frac{u_1 v_1 + u_2 v_2}{\|u\|\,\|v\|},$$
i.e., the angle between two vectors can be given by a formula which depends only on their terminal points.
Definition
For two vectors $u = \langle u_1, u_2 \rangle$ and $v = \langle v_1, v_2 \rangle$, we define their inner product by the following rule:
$$u \cdot v := u_1 v_1 + u_2 v_2 = \begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = u^T v.$$
REMARK:
1. Let $\theta$ be the angle between two vectors $u$ and $v$. Then the formula $\cos\theta = \frac{u \cdot v}{\|u\|\,\|v\|}$ yields that two vectors $u$ and $v$ are orthogonal (that is, the angle between them is $\pi/2$) if and only if $u^T v = 0$.
2. The norm of a vector $\vec u = \langle u_1, \dots, u_n \rangle$ in $\mathbb{R}^n$ is given by
$$\|\vec u\| = \sqrt{u_1^2 + \dots + u_n^2}.$$
Definition
For two vectors $\vec u = \langle u_1, \dots, u_n \rangle$ and $\vec v = \langle v_1, \dots, v_n \rangle$ in $\mathbb{R}^n$, we define their inner product by the following rule:
$$\vec u \cdot \vec v := u_1 v_1 + \dots + u_n v_n = \begin{bmatrix} u_1 & \dots & u_n \end{bmatrix}\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \vec u^{\,T}\vec v.$$
Theorem
Two vectors $\vec u = \langle u_1, \dots, u_n \rangle$ and $\vec v = \langle v_1, \dots, v_n \rangle$ in $\mathbb{R}^n$ are orthogonal if and only if their inner product is equal to $0$. That is,
$$\vec u \cdot \vec v := u_1 v_1 + \dots + u_n v_n = \vec u^{\,T}\vec v = 0.$$
Proof: Orthogonality means that $\vec u$, $\vec v$, and $\vec u + \vec v$ satisfy the Pythagorean identity $\|\vec u + \vec v\|^2 = \|\vec u\|^2 + \|\vec v\|^2$, that is,
$$(u_1 + v_1)^2 + \dots + (u_n + v_n)^2 = u_1^2 + \dots + u_n^2 + v_1^2 + \dots + v_n^2,$$
and expanding the left-hand side shows that this holds if and only if $u_1 v_1 + \dots + u_n v_n = 0$.
EXAMPLE: Elements of the standard basis $\{\vec e_1, \dots, \vec e_n\}$ of $\mathbb{R}^n$ are pairwise orthogonal since $\vec e_i \cdot \vec e_j = 0$ whenever $i \neq j$.
Also observe that the elements of the basis $\{\langle 1, 0, 1 \rangle, \langle 0, 1, 0 \rangle, \langle 1, 0, -1 \rangle\}$ are pairwise orthogonal.
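As a quick numerical illustration (not part of the original notes), the test $\vec u^{\,T}\vec v = 0$ is a one-line check in NumPy; the vectors below are the basis from the example above:

```python
import numpy as np

# Basis from the example above; every pair should have dot product 0.
basis = [np.array([1, 0, 1]), np.array([0, 1, 0]), np.array([1, 0, -1])]

for i in range(len(basis)):
    for j in range(i + 1, len(basis)):
        # u . v = u^T v; the vectors are orthogonal exactly when this is 0
        print(i, j, basis[i] @ basis[j])   # prints 0 for every pair
```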
Lemma
Let $\vec u, \vec v, \vec w \in \mathbb{R}^n$ and $k \in \mathbb{R}$. Then:
1. $\vec u \cdot \vec v = \vec v \cdot \vec u$
2. $\vec u \cdot (\vec v + \vec w) = \vec u \cdot \vec v + \vec u \cdot \vec w$
3. $\vec u \cdot (k\vec v) = k(\vec u \cdot \vec v) = (k\vec u) \cdot \vec v$
Definition
Two subspaces $U$ and $W$ of $\mathbb{R}^n$ are said to be orthogonal to each other if
$$\vec u \cdot \vec w = 0 \quad\text{for all } \vec u \in U \text{ and } \vec w \in W.$$
Theorem
If $U$ and $W$ are orthogonal subspaces of $\mathbb{R}^n$ then $U \cap W = \{0\}$.
Proof: If $\vec x$ is in both $U$ and $W$ then, by the definition of two subspaces being orthogonal, $\vec x \cdot \vec x = 0$. That is, $\|\vec x\|^2 = 0$. But $\vec 0$ is the only vector of length $0$. Hence $\vec x$ must be the zero vector and $U \cap W = \{0\}$.
Theorem
Let $\{\vec u_1, \dots, \vec u_k\}$ and $\{\vec w_1, \dots, \vec w_l\}$ be bases for $U$ and $W$ respectively. Then $U$ and $W$ are orthogonal if and only if $\vec u_i \cdot \vec w_j = 0$ for all $i, j$.
Proof: First suppose that $U$ and $W$ are orthogonal. Then clearly $\vec u_i \cdot \vec w_j = 0$ for all $i, j$. Conversely, suppose $\vec u_i \cdot \vec w_j = 0$ for all $i, j$. Any $\vec u \in U$ and $\vec w \in W$ can be written as $\vec u = x_1 \vec u_1 + \dots + x_k \vec u_k$ and $\vec w = y_1 \vec w_1 + \dots + y_l \vec w_l$, and then
$$\vec u \cdot \vec w = (x_1 \vec u_1 + \dots + x_k \vec u_k) \cdot (y_1 \vec w_1 + \dots + y_l \vec w_l) = \sum_{i=1}^{k}\sum_{j=1}^{l} x_i y_j (\vec u_i \cdot \vec w_j) = 0.$$
EXAMPLE: $U = \mathrm{span}\{\langle 1, 0, 0, 0 \rangle\}$ and $V = \mathrm{span}\{\langle 0, 1, 0, 0 \rangle, \langle 0, 0, 1, 0 \rangle\}$ are orthogonal subspaces of $\mathbb{R}^4$ since
$$\langle a, 0, 0, 0 \rangle \cdot \langle 0, b, c, 0 \rangle = 0.$$
Definition
Given a subspace $U$ of $\mathbb{R}^n$, the set
$$U^{\perp} := \{\vec r \in \mathbb{R}^n \mid \vec r \cdot \vec u = 0 \text{ for all } \vec u \in U\}$$
is called the orthogonal complement of $U$.

EXAMPLE: For $U = \mathrm{span}\{\langle 1, 0, 0, 0 \rangle\}$ as in the previous example,
$$\langle x, y, z, t \rangle \in U^{\perp} \iff \langle a, 0, 0, 0 \rangle \cdot \langle x, y, z, t \rangle = ax = 0 \text{ for all } a \in \mathbb{R} \iff x = 0.$$
Hence
$$U^{\perp} = \{\langle 0, y, z, t \rangle \mid y, z, t \in \mathbb{R}\} = \mathrm{span}\{\langle 0, 1, 0, 0 \rangle, \langle 0, 0, 1, 0 \rangle, \langle 0, 0, 0, 1 \rangle\}.$$
Observe that the subspace $W = \mathrm{span}\{\langle 0, 1, 0, 0 \rangle, \langle 0, 0, 1, 0 \rangle\}$ is orthogonal to $U$, but it is not the largest subspace orthogonal to $U$; that is, $W \subsetneq U^{\perp}$.
Theorem
Let $U$ be a subspace of $\mathbb{R}^n$. Then
1. $U^{\perp}$ is a subspace of $\mathbb{R}^n$.
2. $U^{\perp}$ consists of all vectors which are orthogonal to every vector in $U$, so it is the largest subspace which is orthogonal to $U$.
3. If $W$ is orthogonal to $U$ then $W \subseteq U^{\perp}$.
4. $U \cap U^{\perp} = \{0\}$. (If $\vec v \in U \cap U^{\perp}$ then $\vec v \cdot \vec v = 0$, that is $\|\vec v\|^2 = 0$, and hence $\vec v = 0$.)
5. $\dim(U) + \dim(U^{\perp}) = n$. (Will be proven later.)
6. $(U^{\perp})^{\perp} = U$. (Skipped.)
Theorem
Let $A$ be an $m \times n$ matrix. Then $\mathrm{Row}(A)^{\perp} = \mathrm{Null}(A)$ and $\mathrm{Col}(A)^{\perp} = \mathrm{LNull}(A)$.

Proof: Let $\vec r_1, \dots, \vec r_m \in \mathbb{R}^n$ be the row vectors of $A$. Then $\vec x = \langle x_1, \dots, x_n \rangle \in \mathrm{Row}(A)^{\perp}$ if and only if $\vec r_i \cdot \vec x = \vec x \cdot \vec r_i = 0$ for all $i$. That is,
$$\vec x \in \mathrm{Row}(A)^{\perp} \iff \begin{bmatrix} \dots & \vec r_1 & \dots \\ \dots & \vec r_2 & \dots \\ & \vdots & \\ \dots & \vec r_m & \dots \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \iff \vec x \in \mathrm{Null}(A).$$
Similarly, if $\vec c_1, \dots, \vec c_n$ are the column vectors of $A$, then $\vec y = \langle y_1, \dots, y_m \rangle \in \mathrm{Col}(A)^{\perp}$ if and only if
$$\vec y \in \mathrm{Col}(A)^{\perp} \iff \begin{bmatrix} y_1 & \dots & y_m \end{bmatrix}\begin{bmatrix} \vdots & & \vdots \\ \vec c_1 & \dots & \vec c_n \\ \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} 0 & \dots & 0 \end{bmatrix} \iff \vec y \in \mathrm{LNull}(A).$$
Corollary
Let $A$ be an $m \times n$ matrix. Then
$$\dim(\mathrm{Row}(A)^{\perp}) + \dim(\mathrm{Row}(A)) = n \quad\text{and}\quad \dim(\mathrm{Col}(A)^{\perp}) + \dim(\mathrm{Col}(A)) = m.$$
Proof: Suppose that the row reduced echelon form of $A$ has $k$ pivots. Then $\dim(\mathrm{Row}(A)) = \dim(\mathrm{Col}(A)) = k$, and $\dim(\mathrm{Null}(A)) = n - k$ whereas $\dim(\mathrm{LNull}(A)) = m - k$. On the other hand, since $\mathrm{Row}(A)^{\perp} = \mathrm{Null}(A)$ and $\mathrm{Col}(A)^{\perp} = \mathrm{LNull}(A)$, the required result follows directly.
Corollary
Let $U$ be a subspace of $\mathbb{R}^n$ of dimension $k$. Then $U^{\perp}$ has dimension $n - k$. That is,
$$\dim(U) + \dim(U^{\perp}) = n.$$
Proof: Let $\{\vec u_1, \dots, \vec u_k\}$ be a basis of $U$ and let $A$ be the $k \times n$ matrix which admits $\vec u_1, \dots, \vec u_k$ as its row vectors. Then $U = \mathrm{span}\{\vec u_1, \dots, \vec u_k\} = \mathrm{Row}(A)$ and $U^{\perp} = \mathrm{Row}(A)^{\perp}$. Hence the dimension of $U^{\perp}$ is the dimension of $\mathrm{Row}(A)^{\perp}$, which is $n - k$ by the previous result.
EXAMPLE: Let $U$ be the row space of
$$A = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 3 & 1 \end{bmatrix}.$$
Then $\mathrm{Row}(A) = U$ and $\mathrm{Null}(A) = \mathrm{Row}(A)^{\perp} = U^{\perp}$, and it is easily obtained that
$$\mathrm{Null}(A) = \mathrm{span}\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ -2 \\ 1 \\ 0 \end{bmatrix}\right\}.$$
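A numerical sanity check of $\mathrm{Row}(A)^{\perp} = \mathrm{Null}(A)$ for this matrix, sketched via NumPy's SVD (any null-space routine would do; the tolerance `1e-10` is an arbitrary choice):

```python
import numpy as np

A = np.array([[0, 1, 0, 1, 0],
              [0, 0, 1, 2, 0],
              [0, 0, 0, 0, 1],
              [0, 1, 1, 3, 1]], dtype=float)

# Null space via SVD: right singular vectors belonging to (near-)zero singular values.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T              # columns span Null(A)

print(N.shape)               # (5, 2): dim Null(A) = n - dim Row(A) = 5 - 3
print(np.allclose(A @ N, 0)) # True: every null-space vector is orthogonal to every row
```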
Let $\vec v$ be a vector in $\mathbb{R}^2$ and let $L$ be the line along $\vec v$. For a vector $\vec u$ in $\mathbb{R}^2$, the projection of $\vec u$ along $\vec v$ is defined to be the vector $P\vec u$ in $L$ such that $P\vec u - \vec u$ is orthogonal to $\vec v$.

[Figure: the vectors $\vec u$ and $\vec v$ with the projection $P\vec u$ on the line $L$.]

Hence $P\vec u = k\vec v$ for some $k \in \mathbb{R}$. To find $k$, we need to solve $(P\vec u - \vec u) \cdot \vec v = 0$, that is, $(k\vec v - \vec u) \cdot \vec v = 0$. Clearly
$$(k\vec v - \vec u) \cdot \vec v = 0 \iff k(\vec v \cdot \vec v) - \vec u \cdot \vec v = 0 \iff k = \frac{\vec u \cdot \vec v}{\|\vec v\|^2}.$$
Hence
$$P\vec u = k\vec v = \left(\frac{\vec u \cdot \vec v}{\|\vec v\|^2}\right)\vec v = \left(\frac{\vec u^{\,T}\vec v}{\|\vec v\|^2}\right)\vec v = \left(\frac{\vec v^{\,T}\vec u}{\|\vec v\|^2}\right)\vec v.$$
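A minimal sketch of the projection formula just derived (the function name `proj_onto_line` and the sample vectors are mine, not from the notes):

```python
import numpy as np

def proj_onto_line(u, v):
    """Projection of u onto the line spanned by v: ((u.v)/||v||^2) v."""
    return (u @ v) / (v @ v) * v

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
p = proj_onto_line(u, v)
print(p)                              # [3. 0.]
print(np.isclose((p - u) @ v, 0.0))   # True: p - u is orthogonal to v
```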
More generally, let $V$ be a subspace of $\mathbb{R}^m$ and let $A$ be a matrix whose columns form a basis of $V$, so that $V = \mathrm{Col}(A)$. For $\vec b \in \mathbb{R}^m$, the projection of $\vec b$ onto $V$ is the vector
$$P\vec b = A\vec x$$
such that $\vec b - A\vec x$ is orthogonal to every column of $A$, that is,
$$A^T(\vec b - A\vec x) = 0.$$
Observe that $(A^TA)^{-1}$ exists since the columns of $A$ are linearly independent. So
$$A^T(\vec b - A\vec x) = 0 \iff A^T\vec b - A^TA\vec x = 0 \iff A^TA\vec x = A^T\vec b \iff \vec x = (A^TA)^{-1}A^T\vec b.$$
So for any $\vec b \in \mathbb{R}^m$, $P\vec b = A\vec x = A(A^TA)^{-1}A^T\vec b$ lies in $V$, while $\vec b - P\vec b = (I_m - A(A^TA)^{-1}A^T)\vec b$ lies in $V^{\perp}$.
Therefore, the matrix representing the projection onto $V$ is $P = A(A^TA)^{-1}A^T$, and the matrix representing the projection onto $V^{\perp}$ is $I_m - P = I_m - A(A^TA)^{-1}A^T$.
EXAMPLE: Let $V = \mathrm{span}\{\langle 1, 1, 0 \rangle, \langle 2, 3, 0 \rangle\}$. Find the matrix $P$ representing the projection of any vector in $\mathbb{R}^3$ onto $V$.
Set $A = \begin{bmatrix} 1 & 2 \\ 1 & 3 \\ 0 & 0 \end{bmatrix}$. Then $V = \mathrm{Col}(A)$ and
$$P = A(A^TA)^{-1}A^T = \begin{bmatrix} 1 & 2 \\ 1 & 3 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 2 & 5 \\ 5 & 13 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 1 & 0 \\ 2 & 3 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
$$I_3 - P = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
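The same computation in NumPy (a sketch; it reproduces the matrices above):

```python
import numpy as np

# Columns of A are the spanning vectors of V from the example.
A = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [0.0, 0.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T   # projection onto V = Col(A)
print(np.round(P))                     # [[1,0,0],[0,1,0],[0,0,0]]
print(np.round(np.eye(3) - P))         # projection onto V^perp: [[0,0,0],[0,0,0],[0,0,1]]
```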
EXAMPLE: Let $\vec v = \langle a, b, c \rangle$ be a vector in $\mathbb{R}^3$ and let $L$ be the line passing through the origin in the direction of $\vec v$. To find the matrix representing the projection $P$ onto $L$, we apply the formula $P = A(A^TA)^{-1}A^T$ where $A = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$. Hence
$$P = \frac{1}{a^2 + b^2 + c^2}\begin{bmatrix} a^2 & ab & ac \\ ba & b^2 & bc \\ ca & cb & c^2 \end{bmatrix}.$$
For any $\vec u = \langle x, y, z \rangle$,
$$P\vec u = \frac{1}{a^2 + b^2 + c^2}\begin{bmatrix} a^2 & ab & ac \\ ba & b^2 & bc \\ ca & cb & c^2 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \frac{1}{a^2 + b^2 + c^2}\begin{bmatrix} a^2x + aby + acz \\ bax + b^2y + bcz \\ cax + cby + c^2z \end{bmatrix} = \frac{ax + by + cz}{a^2 + b^2 + c^2}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \frac{\vec u \cdot \vec v}{\|\vec v\|^2}\,\vec v,$$
in agreement with the projection formula for a line.
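Note that this $P$ is just the rank-one outer product $\vec v\,\vec v^{\,T}/\|\vec v\|^2$; a short check (the sample vectors are mine):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])      # <a, b, c>
P = np.outer(v, v) / (v @ v)       # (1/(a^2+b^2+c^2)) * [[a^2, ab, ac], [ba, b^2, bc], [ca, cb, c^2]]

u = np.array([3.0, 0.0, 4.0])      # <x, y, z>
print(np.allclose(P @ u, (u @ v) / (v @ v) * v))   # True: P u = (u.v/||v||^2) v
```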
The projection matrix $P$ onto a subspace $V$ of $\mathbb{R}^m$ has the following properties.
1. $P\vec b \in V$ and $\vec b - P\vec b \in V^{\perp}$; that is, $\vec b = P\vec b + (\vec b - P\vec b)$ where $P\vec b \in V$ and $(\vec b - P\vec b) \in V^{\perp}$.
2. If $\vec b$ lies in $V$ then $P\vec b = \vec b$. ($\vec b \in V$ implies that $P\vec b - \vec b$ lies in $V$. But $P\vec b - \vec b$ lies in $V^{\perp}$ by definition. Since $V \cap V^{\perp} = \{0\}$, we get $P\vec b - \vec b = 0$.)
3. $P^2 = P$. (For any $\vec b \in \mathbb{R}^m$ we know that $P\vec b \in V$. Hence $P^2\vec b = P(P\vec b) = P\vec b$.)
4. Let $A$ be the matrix whose columns are made of a basis of $V$. Then $P = A(A^TA)^{-1}A^T$.
5. $P$ is symmetric. (Since $A^TA$ is symmetric, $P^T = (A(A^TA)^{-1}A^T)^T = A((A^TA)^{-1})^TA^T = A(A^TA)^{-1}A^T = P$.)
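A quick numerical check of these properties for a random $A$ (a sketch; the seed and dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))          # random 5x2 matrix; columns almost surely independent
P = A @ np.linalg.inv(A.T @ A) @ A.T     # projection onto V = Col(A)

b = rng.standard_normal(5)
print(np.allclose(P @ P @ b, P @ b))     # property 3: P^2 = P
print(np.allclose(P.T, P))               # property 5: P is symmetric
print(np.allclose(A.T @ (b - P @ b), 0)) # property 1: b - Pb lies in V^perp = Null(A^T)

v = A @ np.array([2.0, -1.0])            # a vector already in V
print(np.allclose(P @ v, v))             # property 2: b in V implies Pb = b
```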
Recall that for an $m \times n$ matrix $A$, the system $A\vec x = \vec b$ has a solution if and only if $\vec b \in \mathrm{Col}(A)$.
Using the notion of partial derivatives of multivariable functions, it is not hard to show that the distance $\|\vec b - A\vec x\|^2$ is minimal if and only if $\vec b - A\vec x$ is orthogonal to $\mathrm{Col}(A)$, that is, $\vec b - A\vec x \in \mathrm{Col}(A)^{\perp} = \mathrm{Null}(A^T)$. Equivalently, $A\vec x$ is the projection of $\vec b$ onto $\mathrm{Col}(A)$:
$$A\vec x = A(A^TA)^{-1}A^T\vec b.$$
Hence $\vec x = (A^TA)^{-1}A^T\vec b$ makes the product $A\vec x$ closest to $\vec b$, and so it is the least squares solution to the problem $A\vec x = \vec b$.
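A sketch comparing the normal-equations solution $\vec x = (A^TA)^{-1}A^T\vec b$ with NumPy's built-in least-squares solver (the sample system is mine, chosen so that $\vec b \notin \mathrm{Col}(A)$):

```python
import numpy as np

# Inconsistent system: b is not in Col(A), so Ax = b has no exact solution.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: solve (A^T A) x = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)                                   # [ 5. -3.]
print(np.linalg.lstsq(A, b, rcond=None)[0])    # same answer from the built-in solver

# Residual b - A x_hat is orthogonal to Col(A):
print(np.allclose(A.T @ (b - A @ x_hat), 0))   # True
```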
Definition
A set of $k$ nonzero vectors $\{\vec q_1, \dots, \vec q_k\}$ in $\mathbb{R}^n$ is called an orthogonal set if the vectors are pairwise orthogonal. That is,
$$\vec q_i \cdot \vec q_j = \vec q_i^{\,T}\vec q_j = 0 \quad\text{whenever } i \neq j.$$
Theorem
Any orthogonal set $\{\vec u_1, \dots, \vec u_k\}$ of nonzero vectors in $\mathbb{R}^n$ is linearly independent.
Proof.
We want to show that the equation $\vec 0 = x_1\vec u_1 + \dots + x_k\vec u_k$ has the unique solution $x_1 = \dots = x_k = 0$. Taking the inner product of both sides with the vector $\vec u_i$ yields
$$0 = \vec u_i \cdot \vec 0 = \vec u_i \cdot (x_1\vec u_1 + \dots + x_k\vec u_k) = x_i\|\vec u_i\|^2.$$
Since $\vec u_i \neq \vec 0$, we have $\|\vec u_i\|^2 \neq 0$ and hence $x_i = 0$ for every $i$.
Definition
An orthogonal set $\{\vec q_1, \dots, \vec q_k\}$ is called orthonormal if in addition
$$\vec q_i \cdot \vec q_i = \|\vec q_i\|^2 = 1 \quad\text{for all } i = 1, \dots, k.$$
NOTE that any orthogonal set $\{\vec q_1, \dots, \vec q_k\}$ can be turned into an orthonormal set $\left\{\frac{\vec q_1}{\|\vec q_1\|}, \dots, \frac{\vec q_k}{\|\vec q_k\|}\right\}$.
Theorem
Suppose that $Q = \{\vec q_1, \dots, \vec q_n\}$ is an orthonormal basis of $\mathbb{R}^n$. Then any $\vec u$ in $\mathbb{R}^n$ is written as
$$\vec u = (\vec u \cdot \vec q_1)\vec q_1 + \dots + (\vec u \cdot \vec q_n)\vec q_n,$$
where each summand $(\vec u \cdot \vec q_i)\vec q_i$ is in fact the projection of $\vec u$ along the unit vector $\vec q_i$.
Proof.
Since $Q = \{\vec q_1, \dots, \vec q_n\}$ is a basis, any $\vec u$ in $\mathbb{R}^n$ is written as a linear combination $\vec u = x_1\vec q_1 + \dots + x_i\vec q_i + \dots + x_n\vec q_n$. On the other hand, for each $i = 1, \dots, n$,
$$\vec u \cdot \vec q_i = x_1(\vec q_1 \cdot \vec q_i) + \dots + x_n(\vec q_n \cdot \vec q_i) = x_i\|\vec q_i\|^2 = x_i.$$
Orthonormal bases are the most efficient bases, since calculating the coefficients appearing in the linear expansion of a vector requires only inner products.
THE GRAM-SCHMIDT PROCESS: Let $\{\vec a_1, \dots, \vec a_n\}$ be a set of linearly independent vectors in $\mathbb{R}^m$.
First set $\vec q_1 = \frac{\vec a_1}{\|\vec a_1\|}$; that is, $\vec q_1$ is the unit vector in the direction of $\vec a_1$.
Then, to construct $\vec q_2$, first take the projection of $\vec a_2$ along $\vec q_1$, that is, $(\vec q_1 \cdot \vec a_2)\vec q_1$. By the definition of projection, $\vec a_2 - (\vec q_1 \cdot \vec a_2)\vec q_1$ is orthogonal to $\vec q_1$. Hence let $\vec q_2$ be this vector divided by its length.
In general, once $\vec q_1, \dots, \vec q_i$ are constructed, form
$$\vec p = \vec a_{i+1} - (\vec q_1 \cdot \vec a_{i+1})\vec q_1 - (\vec q_2 \cdot \vec a_{i+1})\vec q_2 - \dots - (\vec q_i \cdot \vec a_{i+1})\vec q_i.$$
Then for each $k = 1, \dots, i$,
$$\vec p \cdot \vec q_k = \vec a_{i+1} \cdot \vec q_k - (\vec q_1 \cdot \vec a_{i+1})(\vec q_1 \cdot \vec q_k) - \dots - (\vec q_k \cdot \vec a_{i+1})(\vec q_k \cdot \vec q_k) - \dots - (\vec q_i \cdot \vec a_{i+1})(\vec q_i \cdot \vec q_k)$$
$$= \vec a_{i+1} \cdot \vec q_k - (\vec q_k \cdot \vec a_{i+1})(\vec q_k \cdot \vec q_k) = \vec a_{i+1} \cdot \vec q_k - (\vec q_k \cdot \vec a_{i+1}) = 0.$$
Hence $\vec p \cdot \vec q_k = 0$ for all $k = 1, \dots, i$. Now we let $\vec q_{i+1} = \frac{\vec p}{\|\vec p\|}$.
We conclude that the set $\{\vec q_1, \dots, \vec q_n\}$ constructed in the above manner is an orthonormal set in $\mathbb{R}^m$.
Example
$\{\vec a_1 = \langle 1, 0, 1 \rangle,\ \vec a_2 = \langle 1, 1, 0 \rangle,\ \vec a_3 = \langle 0, 1, 1 \rangle\}$ is a basis of $\mathbb{R}^3$.
Firstly, set
$$\vec q_1 = \frac{\vec a_1}{\|\vec a_1\|} = \left\langle \frac{1}{\sqrt 2}, 0, \frac{1}{\sqrt 2} \right\rangle.$$
Then $\vec a_2 - (\vec q_1 \cdot \vec a_2)\vec q_1 = \langle 1, 1, 0 \rangle - \frac{1}{\sqrt 2}\left\langle \frac{1}{\sqrt 2}, 0, \frac{1}{\sqrt 2} \right\rangle = \left\langle \frac12, 1, -\frac12 \right\rangle$ and
$$\vec q_2 = \frac{\langle \frac12, 1, -\frac12 \rangle}{\sqrt{3/2}} = \left\langle \frac{1}{\sqrt 6}, \frac{2}{\sqrt 6}, -\frac{1}{\sqrt 6} \right\rangle.$$
Now $\vec a_3 - (\vec q_1 \cdot \vec a_3)\vec q_1 - (\vec q_2 \cdot \vec a_3)\vec q_2 = \langle 0, 1, 1 \rangle - \langle \frac12, 0, \frac12 \rangle - \langle \frac16, \frac26, -\frac16 \rangle = \langle -\frac46, \frac46, \frac46 \rangle$. So
$$\vec q_3 = \frac{\langle -\frac46, \frac46, \frac46 \rangle}{\sqrt{4/3}} = \left\langle -\frac{1}{\sqrt 3}, \frac{1}{\sqrt 3}, \frac{1}{\sqrt 3} \right\rangle.$$
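A minimal NumPy sketch of the process (the function name `gram_schmidt` is mine, not from the notes); run on the basis above, it reproduces $\vec q_1, \vec q_2, \vec q_3$:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors in R^m."""
    qs = []
    for a in vectors:
        p = a - sum((q @ a) * q for q in qs)   # subtract projections on earlier q's
        qs.append(p / np.linalg.norm(p))       # normalize the orthogonal part
    return qs

a1, a2, a3 = np.array([1., 0., 1.]), np.array([1., 1., 0.]), np.array([0., 1., 1.])
for q in gram_schmidt([a1, a2, a3]):
    print(np.round(q, 4))
# q1 = <1/sqrt2, 0, 1/sqrt2>, q2 = <1/sqrt6, 2/sqrt6, -1/sqrt6>, q3 = <-1/sqrt3, 1/sqrt3, 1/sqrt3>
```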
Definition
An $n \times n$ matrix $Q$ is called an orthogonal matrix if its columns $\{\vec q_1, \dots, \vec q_n\}$ form an orthonormal set.

More generally, let $Q = [\vec q_1 \dots \vec q_n]$ be any matrix with orthonormal columns. Since
$$\vec q_i^{\,T}\vec q_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
we have that
$$Q^TQ = \begin{bmatrix} \dots & \vec q_1^{\,T} & \dots \\ & \vdots & \\ \dots & \vec q_n^{\,T} & \dots \end{bmatrix}\begin{bmatrix} \vdots & & \vdots \\ \vec q_1 & \dots & \vec q_n \\ \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix} = I_n.$$
But when $Q$ is not square, $QQ^T \neq I$ in general.
EXAMPLE: Below, $Q$ is an orthogonal matrix since its set of columns (the vectors $\vec q_1, \vec q_2, \vec q_3$ from the Gram-Schmidt example above) is an orthonormal set:
$$Q = \begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 3} \\ 0 & \frac{2}{\sqrt 6} & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 6} & \frac{1}{\sqrt 3} \end{bmatrix}.$$
Since $Q$ is square and $Q^TQ = I_3$, we also have $QQ^T = I_3$, so the rows of $Q$ form an orthonormal set as well.
REMARK:
1. An $n \times n$ matrix $Q$ is an orthogonal matrix if and only if its columns $\{\vec q_1, \dots, \vec q_n\}$ form an orthonormal basis for $\mathbb{R}^n$.
2. An $n \times n$ matrix $Q$ is an orthogonal matrix if and only if $Q^TQ = I_n$.
3. If an $n \times n$ matrix $Q$ is orthogonal then $Q^{-1} = Q^T$.
4. If an $n \times n$ matrix $Q$ is orthogonal then for any $\vec b \in \mathbb{R}^n$, $Q\vec x = \vec b$ has the unique solution
$$\vec x = Q^{-1}\vec b = Q^T\vec b = \langle \vec q_1^{\,T}\vec b, \dots, \vec q_n^{\,T}\vec b \rangle = (\vec q_1 \cdot \vec b)\vec e_1 + \dots + (\vec q_n \cdot \vec b)\vec e_n.$$
5. If an $n \times n$ matrix $Q$ is orthogonal then for any $\vec x \in \mathbb{R}^n$ we have $\|Q\vec x\| = \|\vec x\|$, since
$$\|Q\vec x\|^2 = (Q\vec x)^TQ\vec x = \vec x^TQ^TQ\vec x = \vec x^TI_n\vec x = \vec x^T\vec x = \|\vec x\|^2.$$
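These properties can be checked numerically for the $3 \times 3$ orthogonal matrix $Q$ from the example above (a sketch):

```python
import numpy as np

# Q from the example: columns are the Gram-Schmidt vectors q1, q2, q3.
Q = np.array([[1/np.sqrt(2),  1/np.sqrt(6), -1/np.sqrt(3)],
              [0,             2/np.sqrt(6),  1/np.sqrt(3)],
              [1/np.sqrt(2), -1/np.sqrt(6),  1/np.sqrt(3)]])

print(np.allclose(Q.T @ Q, np.eye(3)))   # Q^T Q = I, so Q^{-1} = Q^T

b = np.array([1.0, 2.0, 3.0])
x = Q.T @ b                              # unique solution of Qx = b, no inversion needed
print(np.allclose(Q @ x, b))             # True

print(np.isclose(np.linalg.norm(Q @ b), np.linalg.norm(b)))   # ||Qb|| = ||b||
```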
EXAMPLE: Recall that in $\mathbb{R}^2$, rotation by an angle $\theta$ in the counterclockwise direction has matrix representation
$$A_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$
It is easy to check that $A_\theta$ is an orthogonal matrix and $\|A_\theta\vec u\| = \|\vec u\|$ for all $\vec u = \langle x, y \rangle \in \mathbb{R}^2$. That is, as a linear transformation, rotation does not change the length of vectors, as expected.
THE QR DECOMPOSITION: Let $A = [\vec a_1 \dots \vec a_n]$ be an $m \times n$ matrix with linearly independent columns, and let $\{\vec q_1, \dots, \vec q_n\}$ be the orthonormal set obtained from $\{\vec a_1, \dots, \vec a_n\}$ by the Gram-Schmidt process. By construction, $\mathrm{span}\{\vec q_1, \dots, \vec q_j\} = \mathrm{span}\{\vec a_1, \dots, \vec a_j\}$ for each $j$. That is, $\vec a_j$ can be written as a linear combination of $\vec q_1, \dots, \vec q_j$, with a nonzero coefficient for $\vec q_j$, in the following manner:
$$\vec a_j = (\vec a_j \cdot \vec q_1)\vec q_1 + \dots + (\vec a_j \cdot \vec q_j)\vec q_j = [\vec q_1 \dots \vec q_j \dots \vec q_n]\begin{bmatrix} (\vec a_j \cdot \vec q_1) \\ \vdots \\ (\vec a_j \cdot \vec q_j) \\ 0 \\ \vdots \\ 0 \end{bmatrix} = Q\begin{bmatrix} (\vec a_j \cdot \vec q_1) \\ \vdots \\ (\vec a_j \cdot \vec q_j) \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Now
$$A = [\vec a_1 \dots \vec a_n] = [\vec q_1 \dots \vec q_n]\begin{bmatrix} (\vec a_1 \cdot \vec q_1) & (\vec a_2 \cdot \vec q_1) & \dots & (\vec a_n \cdot \vec q_1) \\ 0 & (\vec a_2 \cdot \vec q_2) & \dots & (\vec a_n \cdot \vec q_2) \\ 0 & 0 & \dots & (\vec a_n \cdot \vec q_3) \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & (\vec a_n \cdot \vec q_n) \end{bmatrix} = QR,$$
where $Q$ is an $m \times n$ matrix with orthonormal columns and $R$ is an $n \times n$ upper triangular matrix.

THEOREM: Any $m \times n$ matrix $A = [\vec a_1 \dots \vec a_n]$ with linearly independent columns in $\mathbb{R}^m$ can be written as $A = QR$, where $Q$ is an $m \times n$ matrix with orthonormal columns and $R$ is an $n \times n$ upper triangular matrix.
EXAMPLE:
$$A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 3} \\ 0 & \frac{2}{\sqrt 6} & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 6} & \frac{1}{\sqrt 3} \end{bmatrix}\begin{bmatrix} \frac{2}{\sqrt 2} & \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ 0 & \frac{3}{\sqrt 6} & \frac{1}{\sqrt 6} \\ 0 & 0 & \frac{2}{\sqrt 3} \end{bmatrix} = QR$$
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} \\ 0 & \frac{2}{\sqrt 6} \\ \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 6} \end{bmatrix}\begin{bmatrix} \frac{2}{\sqrt 2} & \frac{1}{\sqrt 2} \\ 0 & \frac{3}{\sqrt 6} \end{bmatrix} = QR$$
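The factorization can be verified with NumPy's built-in QR routine (a sketch; note that `np.linalg.qr` fixes signs differently, so its $Q$ and $R$ may differ from the ones above by a sign in some columns and rows):

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))             # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q has orthonormal columns
print(np.round(R, 4))                    # upper triangular; entries match R above up to sign
```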
From Section 3.1 (Page 167 on the book): 1, 6, 7, 9, 12, 14, 16, 19, 21, 22, 45
From Section 3.2 (Page 177 on the book): 3, 14, 17, 18, 21
From Section 3.3 (Page 190 on the book): 3, 14, 17, 18, 21