W W L CHEN
© W W L Chen, 1997, 2006.
This chapter is available free to all individuals, on the understanding that it is not to be used for financial gain,
and may be downloaded and/or photocopied, with or without permission from the author.
However, this document may not be kept on any information storage and retrieval system without permission
from the author, unless such system is not accessible to any individuals other than its owners.
Chapter 11
APPLICATIONS OF
REAL INNER PRODUCT SPACES
11.1. Least Squares Approximation

Given a continuous real valued function f : [a, b] → R and a non-negative integer k, we wish to find a polynomial g of degree at most k for which the quantity
$$\int_a^b |f(x) - g(x)|^2\,dx$$
is minimized. The purpose of this section is to study this problem using the theory of real inner product spaces. Our argument is underpinned by the following simple result in the theory.
PROPOSITION 11A. Suppose that V is a real inner product space, and that W is a finite-dimensional subspace of V. Given any u ∈ V, the inequality
$$\|u - \mathrm{proj}_W u\| \le \|u - w\|$$
holds for every w ∈ W.

In other words, the distance from u to any w ∈ W is minimized by the choice w = proj_W u, the orthogonal projection of u on the subspace W. Alternatively, proj_W u can be thought of as the vector in W closest to u.
Let V denote the vector space C[a, b] of all continuous real valued functions on the closed interval [a, b], with inner product
$$\langle f, g\rangle = \int_a^b f(x)g(x)\,dx.$$
Then
$$\int_a^b |f(x) - g(x)|^2\,dx = \langle f - g, f - g\rangle = \|f - g\|^2.$$
It follows that the least squares approximation problem is reduced to one of finding a suitable polynomial g to minimize the norm $\|f - g\|$. Now let W = P_k[a, b] be the collection of all polynomials g : [a, b] → R with real coefficients and of degree at most k. Note that W is essentially P_k, although the variable is restricted to the closed interval [a, b]. It is easy to show that W is a subspace of V. In view of Proposition 11A, we conclude that
$$g = \mathrm{proj}_W f$$
gives the best least squares approximation among polynomials in W = P_k[a, b]. This subspace is of dimension k + 1. Suppose that {v_0, v_1, ..., v_k} is an orthogonal basis of W = P_k[a, b]. Then by Proposition 9L, we have
$$g = \mathrm{proj}_W f = \frac{\langle f, v_0\rangle}{\|v_0\|^2}\,v_0 + \frac{\langle f, v_1\rangle}{\|v_1\|^2}\,v_1 + \cdots + \frac{\langle f, v_k\rangle}{\|v_k\|^2}\,v_k.$$
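This projection formula is easy to carry out with a computer algebra system. The following Python sketch, assuming sympy is available (the helper name least_squares_poly is ours, not from the text), performs the Gram-Schmidt process on the monomial basis of P_k[a, b] and then applies the formula above; the two printed lines reproduce Examples 11.1.1 and 11.1.2 below.

    import sympy as sp

    x = sp.symbols('x')

    def least_squares_poly(f, a, b, k):
        """Best least squares approximation of f on [a, b] by a
        polynomial of degree at most k, via orthogonal projection."""
        ip = lambda p, q: sp.integrate(p * q, (x, a, b))  # inner product <p, q>
        # Gram-Schmidt on the monomial basis {1, x, ..., x^k}.
        basis = []
        for j in range(k + 1):
            v = x**j
            for u in basis:
                v -= ip(x**j, u) / ip(u, u) * u
            basis.append(sp.expand(v))
        # Proposition 9L: proj_W f = sum of <f, v_i>/||v_i||^2 * v_i.
        return sp.expand(sum(ip(f, v) / ip(v, v) * v for v in basis))

    print(least_squares_poly(x**2, 0, 2, 1))       # 2*x - 2/3
    print(least_squares_poly(sp.exp(x), 0, 1, 1))  # equals (18 - 6e)x + (4e - 10)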
Example 11.1.1. Consider the function f(x) = x^2 in the interval [0, 2]. Suppose that we wish to find a least squares approximation by a polynomial of degree at most 1. In this case, we can take V = C[0, 2], with inner product
$$\langle f, g\rangle = \int_0^2 f(x)g(x)\,dx,$$
and W = P_1[0, 2], with basis {1, x}. We now apply the Gram-Schmidt orthogonalization process to this basis to obtain an orthogonal basis {1, x − 1} of W, and take
$$g = \frac{\langle x^2, 1\rangle}{\|1\|^2}\,1 + \frac{\langle x^2, x - 1\rangle}{\|x - 1\|^2}\,(x - 1).$$
Now
$$\langle x^2, 1\rangle = \int_0^2 x^2\,dx = \frac{8}{3} \qquad\text{and}\qquad \|1\|^2 = \langle 1, 1\rangle = \int_0^2 dx = 2,$$
while
$$\langle x^2, x - 1\rangle = \int_0^2 x^2(x - 1)\,dx = \frac{4}{3} \qquad\text{and}\qquad \|x - 1\|^2 = \langle x - 1, x - 1\rangle = \int_0^2 (x - 1)^2\,dx = \frac{2}{3}.$$
It follows that
$$g = \frac{4}{3} + 2(x - 1) = 2x - \frac{2}{3}.$$
Example 11.1.2. Consider the function f(x) = e^x in the interval [0, 1]. Suppose that we wish to find a least squares approximation by a polynomial of degree at most 1. In this case, we can take V = C[0, 1], with inner product
$$\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx,$$
and W = P_1[0, 1], with basis {1, x}. We now apply the Gram-Schmidt orthogonalization process to this basis to obtain an orthogonal basis {1, x − 1/2} of W, and take
$$g = \frac{\langle e^x, 1\rangle}{\|1\|^2}\,1 + \frac{\langle e^x, x - 1/2\rangle}{\|x - 1/2\|^2}\left(x - \frac{1}{2}\right).$$
Now
$$\langle e^x, 1\rangle = \int_0^1 e^x\,dx = e - 1 \qquad\text{and}\qquad \langle e^x, x\rangle = \int_0^1 xe^x\,dx = 1,$$
so that
$$\left\langle e^x, x - \frac{1}{2}\right\rangle = \langle e^x, x\rangle - \frac{1}{2}\langle e^x, 1\rangle = \frac{3}{2} - \frac{e}{2}.$$
Also
$$\|1\|^2 = \langle 1, 1\rangle = \int_0^1 dx = 1 \qquad\text{and}\qquad \left\|x - \frac{1}{2}\right\|^2 = \left\langle x - \frac{1}{2}, x - \frac{1}{2}\right\rangle = \int_0^1 \left(x - \frac{1}{2}\right)^2 dx = \frac{1}{12}.$$
It follows that
$$g = (e - 1) + (18 - 6e)\left(x - \frac{1}{2}\right) = (18 - 6e)x + (4e - 10).$$
Remark. From the proof of Proposition 11A, it is clear that $\|u - w\|$ is minimized by the unique choice w = proj_W u. It follows that the least squares approximation problem posed here has a unique solution.
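The uniqueness of the minimizer can also be illustrated numerically. A small sympy check (ours, not from the text): for f(x) = e^x on [0, 1], the squared error of any linear polynomial w other than the g found in Example 11.1.2 comes out strictly larger.

    import sympy as sp

    x = sp.symbols('x')
    err2 = lambda w: sp.integrate((sp.exp(x) - w)**2, (x, 0, 1))  # ||f - w||^2

    g = (18 - 6*sp.E)*x + (4*sp.E - 10)   # the projection from Example 11.1.2
    for w in [g, x + 1, 2*x - 1, sp.Rational(3, 2)*x + sp.Rational(1, 2)]:
        print(float(err2(w)))
    # the first value (about 0.0039) is the smallest, as Proposition 11A predicts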
11.2. Quadratic Forms

A quadratic form in n variables x_1, ..., x_n is an expression of the form
$$\sum_{\substack{i,j=1\\ i\le j}}^{n} c_{ij}x_ix_j, \tag{1}$$
where the coefficients c_{ij} are real.

Example 11.2.1. The expression 5x_1^2 + 6x_1x_2 + 7x_2^2 is a quadratic form in two variables x_1 and x_2. It can be written in the form
$$5x_1^2 + 6x_1x_2 + 7x_2^2 = \begin{pmatrix} x_1 & x_2 \end{pmatrix}\begin{pmatrix} 5 & 3 \\ 3 & 7 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
Example 11.2.2. The expression 4x_1^2 + 5x_2^2 + 3x_3^2 + 2x_1x_2 + 4x_1x_3 + 6x_2x_3 is a quadratic form in three variables x_1, x_2 and x_3. It can be written in the form
$$4x_1^2 + 5x_2^2 + 3x_3^2 + 2x_1x_2 + 4x_1x_3 + 6x_2x_3 = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix}\begin{pmatrix} 4 & 1 & 2 \\ 1 & 5 & 3 \\ 2 & 3 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
Note that in both examples, the quadratic form can be described in terms of a real symmetric matrix.
In fact, this is always possible. To see this, note that given any quadratic form (1), we can write, for
every i, j = 1, . . . , n,
$$a_{ij} = \begin{cases} c_{ij} & \text{if } i = j, \\ \frac{1}{2}c_{ij} & \text{if } i < j, \\ \frac{1}{2}c_{ji} & \text{if } i > j. \end{cases} \tag{2}$$
Then
$$\sum_{\substack{i,j=1\\ i\le j}}^{n} c_{ij}x_ix_j = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_ix_j = \begin{pmatrix} x_1 & \cdots & x_n \end{pmatrix}\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
The matrix
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$$
is clearly symmetric, in view of (2).
We are interested in the case when x_1, ..., x_n take real values. In this case, we can write
$$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
It follows that a quadratic form can be written as
$$\mathbf{x}^tA\mathbf{x},$$
where A is an n × n real symmetric matrix.
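Formula (2) is mechanical enough to code directly. A minimal Python sketch, assuming numpy is available (the function name quadratic_form_matrix is our own label): given the coefficients c_ij of the terms x_i x_j with i ≤ j, it returns the symmetric matrix A.

    import numpy as np

    def quadratic_form_matrix(c, n):
        """Symmetric matrix A of formula (2), built from the coefficients
        c[(i, j)] of the terms x_i x_j with i <= j (0-based indices)."""
        A = np.zeros((n, n))
        for (i, j), cij in c.items():
            if i == j:
                A[i, i] = cij                 # a_ii = c_ii
            else:
                A[i, j] = A[j, i] = cij / 2   # a_ij = a_ji = c_ij / 2
        return A

    # Example 11.2.2: 4x1^2 + 5x2^2 + 3x3^2 + 2x1x2 + 4x1x3 + 6x2x3
    c = {(0, 0): 4, (1, 1): 5, (2, 2): 3, (0, 1): 2, (0, 2): 4, (1, 2): 6}
    print(quadratic_form_matrix(c, 3))        # recovers the matrix of Example 11.2.2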
Many problems in mathematics can be studied using quadratic forms. Here we shall restrict our attention to two fundamental problems which are in fact related. The first is the question of what conditions the matrix A must satisfy in order that the inequality
$$\mathbf{x}^tA\mathbf{x} > 0$$
holds for every non-zero x ∈ R^n. The second is the question of whether it is possible to have a change of variables of the type x = Py, where P is an invertible matrix, such that the quadratic form x^tAx can be represented in the alternative form y^tDy, where D is a diagonal matrix with real entries.
Definition. A quadratic form x^tAx is said to be positive definite if x^tAx > 0 for every non-zero x ∈ R^n. In this case, we say that the symmetric matrix A is a positive definite matrix.
PROPOSITION 11B. A quadratic form x^tAx is positive definite if and only if all the eigenvalues of the symmetric matrix A are positive.
Our strategy here is to prove Proposition 11B by first studying our second question. Since the matrix A is real and symmetric, it follows from Proposition 10E that it is orthogonally diagonalizable. In other words, there exists an orthogonal matrix P and a diagonal matrix D such that P^tAP = D, and so A = PDP^t. It follows that
$$\mathbf{x}^tA\mathbf{x} = \mathbf{x}^tPDP^t\mathbf{x},$$
and so, writing
$$\mathbf{y} = P^t\mathbf{x},$$
we have
$$\mathbf{x}^tA\mathbf{x} = \mathbf{y}^tD\mathbf{y}.$$
Also, since P is an orthogonal matrix, we also have x = Py. This answers our second question. Furthermore, in view of the Orthogonal diagonalization process, the diagonal entries in the matrix D can be taken to be the eigenvalues of A, so that
$$D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},$$
where λ_1, ..., λ_n ∈ R are the eigenvalues of A. Writing
$$\mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$
we have
$$\mathbf{x}^tA\mathbf{x} = \mathbf{y}^tD\mathbf{y} = \lambda_1y_1^2 + \cdots + \lambda_ny_n^2. \tag{3}$$
Note now that x = 0 if and only if y = 0, since P is an invertible matrix. Proposition 11B now follows immediately from (3).
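Proposition 11B also gives a practical test: compute the eigenvalues. A short numpy sketch (ours), using the matrix of Example 11.2.3 below; numpy.linalg.eigh returns the eigenvalues of a symmetric matrix in ascending order together with an orthogonal matrix of eigenvectors.

    import numpy as np

    A = np.array([[2., 2., 1.],
                  [2., 5., 2.],
                  [1., 2., 2.]])          # matrix of Example 11.2.3

    lam, P = np.linalg.eigh(A)            # lam = [1, 1, 7], P orthogonal
    print(np.all(lam > 0))                # True: the form is positive definite

    # Check the change of variables (3): with y = P^t x,
    # x^t A x = lam_1 y_1^2 + ... + lam_n y_n^2.
    xv = np.random.randn(3)
    yv = P.T @ xv
    print(np.isclose(xv @ A @ xv, np.sum(lam * yv**2)))   # True

Applied to the matrix of Example 11.2.5, the same test returns the eigenvalues 0 and 2, so the all-positive check fails, in agreement with Proposition 11B.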
Example 11.2.3. Consider the quadratic form 2x_1^2 + 5x_2^2 + 2x_3^2 + 4x_1x_2 + 2x_1x_3 + 4x_2x_3. This can be written in the form x^tAx, where
$$A = \begin{pmatrix} 2 & 2 & 1 \\ 2 & 5 & 2 \\ 1 & 2 & 2 \end{pmatrix} \qquad\text{and}\qquad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
The matrix A has eigenvalues λ_1 = 7 and (double root) λ_2 = λ_3 = 1; see Example 10.3.1. Furthermore, we have P^tAP = D, where
$$P = \begin{pmatrix} 1/\sqrt6 & 1/\sqrt2 & 1/\sqrt3 \\ 2/\sqrt6 & 0 & -1/\sqrt3 \\ 1/\sqrt6 & -1/\sqrt2 & 1/\sqrt3 \end{pmatrix} \qquad\text{and}\qquad D = \begin{pmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Writing y = P^tx, the quadratic form becomes 7y_1^2 + y_2^2 + y_3^2, which is clearly positive definite.
Example 11.2.4. Consider the quadratic form 5x_1^2 + 6x_2^2 + 7x_3^2 − 4x_1x_2 + 4x_2x_3. This can be written in the form x^tAx, where
$$A = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 6 & 2 \\ 0 & 2 & 7 \end{pmatrix} \qquad\text{and}\qquad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
The matrix A has eigenvalues λ_1 = 3, λ_2 = 6 and λ_3 = 9; see Example 10.3.3. Furthermore, we have P^tAP = D, where
$$P = \begin{pmatrix} 2/3 & 2/3 & -1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix} \qquad\text{and}\qquad D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 9 \end{pmatrix}.$$
Writing y = P^tx, the quadratic form becomes 3y_1^2 + 6y_2^2 + 9y_3^2, which is clearly positive definite.
Example 11.2.5. Consider the quadratic form x_1^2 + x_2^2 + 2x_1x_2. Clearly this is equal to (x_1 + x_2)^2 and is therefore not positive definite. The quadratic form can be written in the form x^tAx, where
$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \qquad\text{and}\qquad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
It follows from Proposition 11B that the eigenvalues of A are not all positive. Indeed, the matrix A has eigenvalues λ_1 = 2 and λ_2 = 0, with corresponding eigenvectors
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad\text{and}\qquad \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
Writing y = P^tx, the quadratic form becomes 2y_1^2, which is not positive definite.
11.3. Real Fourier Series

Let E denote the collection of all functions f : [−π, π] → R which are piecewise continuous on the
interval [−π, π]. This means that any f ∈ E has at most a finite number of points of discontinuity, at
each of which f need not be defined but must have one sided limits which are finite. We further adopt
the convention that any two functions f, g ∈ E are considered equal, denoted by f = g, if f (x) = g(x)
for every x ∈ [−π, π] with at most a finite number of exceptions.
It is easy to check that E forms a real vector space under pointwise addition and scalar multiplication, with the zero element given by the function λ : [−π, π] → R, where λ(x) = 0 for every x ∈ [−π, π].
We now give this vector space E more structure by introducing an inner product. For every f, g ∈ E, write
$$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx.$$
The integral exists since the function f(x)g(x) is clearly piecewise continuous on [−π, π]. It is easy to check that this satisfies the conditions of a real inner product on E.
The difficulty here is that the inner product space E is not finite-dimensional. It is not straightforward
to show that the set
$$\frac{1}{\sqrt2},\ \sin x,\ \cos x,\ \sin 2x,\ \cos 2x,\ \sin 3x,\ \cos 3x,\ \ldots \tag{4}$$
in E forms an orthonormal “basis” for E. The difficulty is to show that the set spans E.
Remark. It is easy to check that the elements in (4) form an orthonormal “system”. For every k, m ∈ N,
we have
$$\left\langle \frac{1}{\sqrt2}, \frac{1}{\sqrt2}\right\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}\,dx = 1;$$
$$\left\langle \frac{1}{\sqrt2}, \sin kx\right\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{\sqrt2}\sin kx\,dx = 0;$$
$$\left\langle \frac{1}{\sqrt2}, \cos kx\right\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{\sqrt2}\cos kx\,dx = 0;$$
as well as
$$\langle \sin kx, \sin mx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \sin kx\sin mx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}(\cos(k - m)x - \cos(k + m)x)\,dx = \begin{cases} 1 & \text{if } k = m, \\ 0 & \text{if } k \ne m; \end{cases}$$
$$\langle \cos kx, \cos mx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \cos kx\cos mx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}(\cos(k - m)x + \cos(k + m)x)\,dx = \begin{cases} 1 & \text{if } k = m, \\ 0 & \text{if } k \ne m; \end{cases}$$
and
$$\langle \sin kx, \cos mx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \sin kx\cos mx\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{1}{2}(\sin(k - m)x + \sin(k + m)x)\,dx = 0.$$
Let us assume that we have established that the set (4) forms an orthonormal basis for E. Then a
natural extension of Proposition 9H gives rise to the following: Every function f ∈ E can be written
uniquely in the form
$$\frac{a_0}{2} + \sum_{n=1}^{\infty}\,(a_n\cos nx + b_n\sin nx), \tag{5}$$
known usually as the (trigonometric) Fourier series of the function f, with Fourier coefficients
$$\frac{a_0}{\sqrt2} = \left\langle f, \frac{1}{\sqrt2}\right\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} \frac{f(x)}{\sqrt2}\,dx,$$
and, for every n ∈ N,
$$a_n = \langle f, \cos nx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \qquad\text{and}\qquad b_n = \langle f, \sin nx\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx.$$
Note that the constant term in the Fourier series (5) is given by
$$\left\langle f, \frac{1}{\sqrt2}\right\rangle \frac{1}{\sqrt2} = \frac{a_0}{2}.$$
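The coefficient formulas translate directly into code. A minimal sympy sketch (the function name fourier_coefficients is ours) computing a_0, a_n and b_n for a given f, with n treated as a positive integer symbol:

    import sympy as sp

    x = sp.symbols('x')
    n = sp.symbols('n', integer=True, positive=True)

    def fourier_coefficients(f):
        """Fourier coefficients of f on [-pi, pi] for the series (5)."""
        a0 = sp.integrate(f, (x, -sp.pi, sp.pi)) / sp.pi
        an = sp.simplify(sp.integrate(f * sp.cos(n*x), (x, -sp.pi, sp.pi)) / sp.pi)
        bn = sp.simplify(sp.integrate(f * sp.sin(n*x), (x, -sp.pi, sp.pi)) / sp.pi)
        return a0, an, bn

    print(fourier_coefficients(x))
    # a0 = 0, a_n = 0, b_n = 2(-1)^(n+1)/n, matching Example 11.3.1
    # (the exact printed form of b_n may differ by an equivalent rewriting)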
Example 11.3.1. Consider the function f : [−π, π] → R, given by f (x) = x for every x ∈ [−π, π]. For
every n ∈ N ∪ {0}, we have
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} x\cos nx\,dx = 0,$$
since the integrand is an odd function. On the other hand, for every n ∈ N, we have
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} x\sin nx\,dx = \frac{2}{\pi}\int_0^{\pi} x\sin nx\,dx = \frac{2(-1)^{n+1}}{n}.$$
We therefore have the (trigonometric) Fourier series
$$\sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n}\sin nx.$$
Note that the function f is odd, and this plays a crucial role in forcing the Fourier coefficients a_n, corresponding to the even part of the Fourier series, to vanish.
Example 11.3.2. Consider the function f : [−π, π] → R, given by f (x) = |x| for every x ∈ [−π, π]. For
every n ∈ N ∪ {0}, we have
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\cos nx\,dx = \frac{2}{\pi}\int_0^{\pi} x\cos nx\,dx,$$
since the integrand is an even function. It is easy to check that a_0 = π, while a_n = 2((−1)^n − 1)/πn^2 for every n ∈ N. On the other hand, b_n = 0 for every n ∈ N, since the integrand |x| sin nx is an odd function. We therefore have the (trigonometric) Fourier series
$$\frac{\pi}{2} - \sum_{\substack{n=1\\ n\ \text{odd}}}^{\infty} \frac{4}{\pi n^2}\cos nx = \frac{\pi}{2} - \sum_{k=1}^{\infty} \frac{4}{\pi(2k - 1)^2}\cos(2k - 1)x.$$
Note that the function f is even, and this plays a crucial role in forcing the Fourier coefficients b_n, corresponding to the odd part of the Fourier series, to vanish.
Example 11.3.3. Consider the function f : [−π, π] → R, given for every x ∈ [−π, π] by
$$f(x) = \operatorname{sgn}(x) = \begin{cases} +1 & \text{if } 0 < x \le \pi, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } -\pi \le x < 0. \end{cases}$$
For every n ∈ N ∪ {0}, we have
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} \operatorname{sgn}(x)\cos nx\,dx = 0,$$
since the integrand is an odd function. On the other hand, for every n ∈ N, we have
$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} \operatorname{sgn}(x)\sin nx\,dx = \frac{2}{\pi}\int_0^{\pi} \sin nx\,dx = \frac{2(1 - (-1)^n)}{\pi n}.$$
We therefore have the (trigonometric) Fourier series
$$\sum_{\substack{n=1\\ n\ \text{odd}}}^{\infty} \frac{4}{\pi n}\sin nx = \sum_{k=1}^{\infty} \frac{4}{\pi(2k - 1)}\sin(2k - 1)x.$$
Example 11.3.4. Consider the function f : [−π, π] → R, given by f (x) = x2 for every x ∈ [−π, π]. For
every n ∈ N ∪ {0}, we have
$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\cos nx\,dx = \frac{2}{\pi}\int_0^{\pi} x^2\cos nx\,dx,$$
since the integrand is an even function. It is easy to check that a_0 = 2π²/3, while a_n = 4(−1)^n/n² for every n ∈ N. On the other hand, b_n = 0 for every n ∈ N, since the integrand x² sin nx is an odd function. We therefore have the (trigonometric) Fourier series
$$\frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4(-1)^n}{n^2}\cos nx.$$
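As a closing check (ours, not part of the original text), the coefficients of Example 11.3.4 can be recovered symbolically, and a truncated Fourier series already approximates f(x) = x² well in the interior of the interval:

    import math
    import sympy as sp

    x = sp.symbols('x')
    n = sp.symbols('n', integer=True, positive=True)

    f = x**2
    a0 = sp.integrate(f, (x, -sp.pi, sp.pi)) / sp.pi   # 2*pi**2/3
    an = sp.simplify(sp.integrate(f * sp.cos(n*x), (x, -sp.pi, sp.pi)) / sp.pi)
    print(a0, an)                                      # a_n = 4*(-1)**n/n**2

    # Partial sum of the series at x = 1; the series converges to f(1) = 1.
    S = float(a0)/2 + sum(4*(-1)**k / k**2 * math.cos(k) for k in range(1, 2001))
    print(S)                                           # close to 1.0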
2. For each of the following functions, find the best least squares approximation by linear polynomials
of the form ax + b, where a, b ∈ R:
a) f : [0, π/2] → R : x → sin x    b) f : [0, 1] → R : x → x^3
c) f : [0, 2] → R : x → e^x
3. Consider the quadratic form 2x_1^2 + x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3 in three variables x_1, x_2, x_3.
a) Write the quadratic form in the form x^tAx, where A is a real symmetric matrix and
$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$
b) Find the eigenvalues of A and, for each eigenvalue, a corresponding eigenvector.
c) Find a substitution x = Py so that the quadratic form can be written as y^tDy, where D is a diagonal matrix with real entries. You should give the matrices P and D explicitly.
d) Is the quadratic form positive definite? Justify your assertion both in terms of the eigenvalues of
A and in terms of your solution to part (c).
4. For each of the following quadratic forms in three variables, write it in the form x^tAx, find a substitution x = Py so that it can be written as a diagonal form in the variables y_1, y_2, y_3, and determine whether the quadratic form is positive definite:
a) x_1^2 + x_2^2 + 2x_3^2 − 2x_1x_2 + 4x_1x_3 + 4x_2x_3    b) 3x_1^2 + 2x_2^2 + 3x_3^2 + 2x_1x_3
c) 3x_1^2 + 5x_2^2 + 4x_3^2 + 4x_1x_3 − 4x_2x_3    d) 5x_1^2 + 2x_2^2 + 5x_3^2 + 4x_1x_2 − 8x_1x_3 − 4x_2x_3
e) x_1^2 − 5x_2^2 − x_3^2 + 4x_1x_2 + 6x_2x_3
6. Find the trigonometric Fourier series for each of the following functions f : [−π, π] → R:
a) f (x) = x|x| for every x ∈ [−π, π]
b) f (x) = | sin x| for every x ∈ [−π, π]
c) f (x) = | cos x| for every x ∈ [−π, π]
d) f (x) = 0 for every x ∈ [−π, 0] and f (x) = x for every x ∈ (0, π]
e) f (x) = sin x for every x ∈ [−π, 0] and f (x) = cos x for every x ∈ (0, π]
f) f (x) = cos x for every x ∈ [−π, 0] and f (x) = sin x for every x ∈ (0, π]
g) f (x) = cos(x/2) for every x ∈ [−π, π]
h) f (x) = sin(x/2) for every x ∈ [−π, π]