Part II
1 Introduction: What is “linear analysis”?
The objects in this course are infinite dimensional vector spaces (hence the term
“linear”) over R or C, together with additional structure (a “norm” or “inner
product”) which “respects” in some way the linear structure. This additional
structure will allow us to do “analysis”. The most pedestrian way to understand
the last sentence is that it will allow us to “take limits”.
In fact, the extra structure allows much more than just “taking limits”.
Hidden in the notion of norm and inner product are notions of convexity, duality,
and orthogonality. In the most complicated structure we will discuss, that of
a Hilbert space, all these notions will be present simultaneously together with
completeness.
The point of view of this course, like most undergraduate mathematics
courses, will be axiomatic/revisionist. The objects will be defined, and the
basic theorems discussed without any “motivation”. The objects in this course
have come to be viewed as so central to mathematics that it is presently
inaccurate to consider them as having any single “motivation”.
This being said, the subject arose and developed under specific mathematical
circumstances, and this may be useful for deeper understanding of the theory.
Nice notes can be found in [3, 2]. At the very least, however, everyone starting
out in this subject should know that the vector spaces that originally gave rise
to this theory were spaces of functions, and the linear operators were linear
partial differential operators, their inverses, and more generally so-called linear
integral operators. So the linear algebraic question “Under what conditions can
you invert an operator?”–and this represents the kind of question we will be
answering in this course–in practice meant “Under what conditions can you
solve a linear pde?”.
Because of this connection to spaces of functions, the subject is almost always
known by the name Functional Analysis, and it is under this name that you will
probably search the literature for more material, if you find these notes to be
confusing.
Recall that the span of a set S in V is the smallest subspace of V containing
S, or alternatively, the set of all finite linear combinations ∑_{i=1}^m αi si . If this
set is V we say S spans V . A set S ⊂ V is linearly independent if ∑_{i=1}^m αi si =
0 =⇒ αi = 0 for all i. A set B ⊂ V is called a basis if it spans V and is linearly
independent. If S1 and S2 are subsets of V , and λ1 , λ2 are given scalars, we
will often use the notation λ1 S1 + λ2 S2 to denote the set of points of the form
{λ1 s1 + λ2 s2 }s1 ∈S1 ,s2 ∈S2 . Zorn’s lemma implies that all vector spaces have a
basis, and any two bases have the same cardinality. We will go through this
proof later, as a sort of warm up for the use of Zorn’s lemma in the Hahn-Banach
Theorem. If the cardinality of B is finite, we say that V is finite dimensional,
otherwise, infinite dimensional.
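For a concrete (finite dimensional) illustration: in R2 , the set {(1, 0), (1, 1)} is
linearly independent and spans R2 , hence is a basis, while the set {(1, 0), (2, 0), (3, 0)}
spans only a one-dimensional subspace and satisfies the nontrivial relation
2(1, 0) − (2, 0) = 0, so it is neither linearly independent nor a basis.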
Most of the Theorems in this course will be easy to prove in the finite
dimensional case by purely algebraic methods. Thus, the main emphasis of this
course will be the infinite dimensional case.
metric d given by d(x, y) = |x − y|, whose three defining properties are inherited
from those of the norm.
In particular, the metric space structure allows us to speak of a topology,
i.e. to identify open and closed sets and continuous mappings. We will see soon
that the latter, when required also to be linear, have an alternative characteri-
zation with respect to the norm.
In addition to open, closed, and continuous, we may talk about the notion
of basis for the topology, product topology, homeomorphism, dense, separable,
convergent sequence. Since the topology is metrizable, we may also talk about
Cauchy sequences and the notion of completeness. In the context of normed
vector spaces, these notions will be applied without further comment.
To understand the sense in which the topological structure is “married” to
the linear, we present the following
Proposition 2.1. Let V, | · | be a normed vector space. The vector space oper-
ations are continuous maps V × V → V , R × V → V .
Proof. We will do only +. Let U ⊂ V be open. We want to show that the inverse
image +−1 (U) is open. Let (v1 , v2 ) ∈ +−1 (U). That is to say, v1 + v2 = v
for some v ∈ U. Let B(ǫ) denote the open ball of radius ǫ around the origin;
then v + B(ǫ) is the open ball around v of radius ǫ. Clearly v + B(ǫ) ⊂ U for some
ǫ > 0. By the triangle inequality, v1 + B(ǫ/2) + v2 + B(ǫ/2) ⊂ v + B(ǫ). But
(v1 + B(ǫ/2), v2 + B(ǫ/2)) is an open neighborhood around (v1 , v2 ). So +−1 (U)
is indeed open, as desired.
Corollary 2.1. Let V be as above. Translations and dilations are homeomor-
phisms.
Proof. Just consider, for any v0 ∈ V , the map V → V given by composing
the continuous map V → V × V defined by v 7→ (v0 , v) with the addition map
V × V → V . By the previous Proposition, this is continuous. On the other
hand, its (continuous again!) inverse is given by the analogous map defined by
−v0 . So the map is a homeomorphism.
Similarly for dilations. Let λ ≠ 0 and consider the composition of the map
V → R × V defined by v 7→ (λ, v) with the map of the previous proposition,
etc.
2.5 Norms and convexity
At the most abstract level, the main object of this subject would be the topo-
logical vector space. Our previous proposition shows that normed vector spaces
are a special case.
To understand, at a slightly more abstract level, what is the extra structure
given by a norm, let us make the following
Definition 2.3. Let V be a vector space, and C ⊂ V a subset. C is said to be
convex if
tC + (1 − t)C ⊂ C
for all t ∈ [0, 1].
Proposition 2.2. Let V, | · | be a normed vector space. The unit ball B(1) ⊂ V
is convex.
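The proof is a one-line check: if |v|, |w| < 1 and t ∈ [0, 1], then
|tv + (1 − t)w| ≤ t|v| + (1 − t)|w| < 1.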
Remark. Note that if C is convex, then translations of C, i.e. sets of the form
p + C, are also convex.
Definition 2.4. A locally convex topological vector space V is a topological
vector space with a basis of convex sets.
By the above remark, it is clear that a sufficient condition for a topological
vector space to be locally convex is that every open set U ⊂ V containing the
origin contains a convex neighborhood of the origin. In any case, normed vector
spaces are clearly examples of locally convex topological vector spaces in view
of Proposition 2.2.
In the class of locally convex topological vector spaces, how special are metric
spaces? Let us make the following
Definition 2.5. Let V be a topological vector space. A subset B ⊂ V is said
to be bounded if for any open neighborhood U ⊂ V of 0, there exists an s > 0
such that B ⊂ tU for all t > s.
Remark. If V is a normed vector space, then B is bounded iff B ⊂ B(t) for
some t > 0, where B(t) denotes the open ball around the origin of radius t.
Proposition 2.3. Let V be a topological vector space, and assume C ⊂ V is a
bounded convex neighborhood of 0. Then V is normable, that is to say, a norm
| · | can be defined on V with the same induced topology.
Proof. For this we need
Lemma 2.1. C ⊂ C̃, where C̃ is a balanced bounded convex neighborhood of
the origin, that is to say, one for which in addition λC̃ ⊂ C̃ for all |λ| ≤ 1.
The proof is straightforward and omitted. Now define the function µC̃ by
µC̃ (v) = inf{λ > 0 : v ∈ λC̃}.
This function is called the Minkowski functional of C̃. One checks explicitly
that |v| = µC̃ (v) defines the structure of a normed vector space on V .
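To see what µC̃ does in a familiar case: if V is already a normed space and
C̃ = B(1) is its open unit ball, then v ∈ λC̃ iff |v| < λ, so µC̃ (v) = inf{λ > 0 :
|v| < λ} = |v|, and the construction simply returns the original norm.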
Definition 2.6. A topological vector space V is said to be locally bounded if
there exists a bounded open neighborhood U of the origin.
Normed vector spaces are clearly locally bounded. Proposition 2.3 says that
a topological vector space is normable if it is locally bounded and locally convex.
2.7 Examples
2.7.1 Finite dimensions
We have already seen the example of Rn , or Cn . As we shall see later on, these
are Banach spaces, as is any finite dimensional normed vector space.
2.7.2 BR (S)
Let S be a set, and let BR (S) denote the set of bounded functions f : S → R,
equipped with |f | = sup_{s∈S} |f (s)|. It is easy to see that |f | defines a norm,
making BR (S) into a normed vector space.
On the other hand, let X be a compact Hausdorff space, and consider CR (X)
the set of all real valued continuous functions. We have
CR (X) ⊂ BR (S)
by well known properties of continuous functions. It inherits thus the norm.
(A subspace of a n.v.s is a n.v.s.!) It turns out that CR (X) is a Banach space.
Actually, you have shown this already in your analysis classes. For you have
probably encountered a theorem “Suppose a sequence of continuous fi on X
converge uniformly to f . Then f is continuous.” Exercise: How far away is
this theorem from the statement that CR (X) is a Banach space?
All the above considerations apply equally well to CC (X).
Understanding the topology of CR (X), in particular, identifying its compact
subsets, will be important later on.
2.7.3 C k (Ū )
Let U ⊂ Rn be open and bounded, and consider now the space of functions
f : U → R such that Dα f is continuous and uniformly bounded on U for all
multiindices |α| ≤ k.4 Denote this space by C k (Ū ). Consider the norm | · |k
defined by
|f |k = max_{|α|≤k} |Dα f |
where | · | denotes the supremum norm. This makes C k (Ū ) into a Banach space.
2.7.4 Lp
Let X = [0, 1] and let p > 1, and consider the set L̂p ([0, 1]) of all f : [0, 1] → R
such that f is continuous. Define a norm on L̂p by
|f | = ( ∫_0^1 |f |^p )^{1/p} .    (3)
2.7.5 lp
The previous example is one of the most fruitful for the application of linear
analysis. Since we do not have the technology of measure theory at our disposal,
we will have to settle with a baby example, where functions are replaced by
sequences. (The latter of course are just functions on the natural numbers, and
4 Here α = (α1 , . . . , αn ) with αi nonnegative integers, |α| = ∑_i αi , and Dα f denotes
∂^{|α|} f /(∂x_1^{α1} · · · ∂x_n^{αn} ).
this example can be thought of as a special case of the previous where [0, 1] is
replaced by N.) This is so-called “little” lp .
For 0 < p < ∞, we define
lp (C) = {(x1 , x2 , x3 , . . .), xi ∈ C : ∑_{i=1}^∞ |xi |^p < ∞},
and l∞ (C) to be the set of bounded sequences. We define
|x| = ( ∑_{i=1}^∞ |xi |^p )^{1/p} ,    (4)
for p ≥ 1, and
|x| = sup_i |xi |,    (5)
for p = ∞. We will see later on that these are in fact Banach spaces. In the
case 0 < p < 1, lp is again a subspace of FC (N), but the expression (4) does not
define a norm.
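As a simple example distinguishing these spaces, consider the sequence x =
(1, 1/2, 1/3, . . .). Then ∑ |xi |^p < ∞ precisely when p > 1, so x ∈ lp (C) for every
p > 1 (and x ∈ l∞ ), but x ∉ l1 ; in particular the spaces lp are genuinely different
for different p.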
Proof. First, I claim that T is continuous iff T is continuous at 0, i.e. iff for
any open subset U in W around 0, there exists an open neighborhood of 0 in V
contained in T −1 (U).
To see this, suppose then that indeed for any open neighborhood U of the
origin in W , there exists an open neighborhood Ũ of the origin in V , with
Ũ ⊂ T −1 (U). Let v0 and w0 be arbitrary points such that T (v0 ) = w0 . Then
T (v0 + Ũ ) = w0 + T (Ũ) ⊂ w0 + U.
This shows that the inverse image of any open subset is open. (The other
direction of the implication is of course immediate.)
We now continue with the proof of the proposition. Suppose then T is
bounded. Let U ⊂ W be an arbitrary open neighborhood of 0, and let Ũ ⊂ V
be a bounded open neighborhood of 0. The latter exists since V is locally
bounded. We have that T (Ũ) is bounded, by assumption. Thus, by definition,
there exists an n > 0 such that T (Ũ) ⊂ nU. But then
n−1 Ũ ⊂ T −1 (U),
and this proves continuity, since U open implies n−1 U open by Corollary 2.1.
Suppose conversely that T is continuous, and let E ⊂ V be bounded. Let U
be an arbitrary open neighborhood of 0 in W . Since T −1 (U) is open, there exists
an open neighborhood of the origin Ũ ⊂ V such that T −1 (U) ⊃ Ũ, i.e. such that
U ⊃ T (Ũ).
E ⊂ tŨ
Proposition 2.5. B(V, W ) is a subspace of L(V, W ). ||T || defines a norm on
B(V, W ).
Proof. The proof of this is very easy. Clearly 0 ∈ B(V, W ) and ||0|| = 0. By
linearity, it is clear that if T ∈ B(V, W ), then λT ∈ B(V, W ) with ||λT || = |λ| ||T ||,
since
(λT )(B(1)) = T (λB(1)) = T (|λ|B(1)) = |λ|(T (B(1)))
so (if λ ≠ 0)
T (B(1)) ⊂ B(t)
iff
(λT )(B(1)) ⊂ |λ|B(t)
which is equivalent to
(λT )(B(1)) ⊂ B(|λ|t).
Thus the set is closed under scalar multiplication, and the norm satisfies the sec-
ond property. Finally, suppose T1 , T2 ∈ B(V, W ), and the definition is satisfied
with t1 , t2 . We have
Thus Ti (v) converges by the completeness of W . Define T (v) = lim Ti (v).
Claim. T is linear, i.e. T ∈ L(V, W ). This follows immediately from the con-
tinuity of addition in W . On the other hand, T is clearly bounded. For given
any |v| ≤ 1,
The above should be interpreted as follows. For any ǫ > 0 and any such v, there
exists an i depending on ǫ and v such that the second inequality holds.
Thus we have ||T || < ∞, so T ∈ B(V, W ). Finally we must show that
||T − Ti || → 0. We compute
and thus,
||T ∗ || ≤ ||T ||.
In fact, ||T ∗ || = ||T ||, but this will need Hahn-Banach.
2.11 V ∗∗
We define the double dual V ∗∗ of a normed vector space to be (V ∗ )∗ , i.e. the
dual of the dual.
Proposition 2.7. The map φ : V → V ∗∗ defined by
v 7→ v ∗∗ ,    where v ∗∗ (f ) = f (v) for f ∈ V ∗ ,
is linear and bounded, with ||φ(v)|| ≤ |v|.
2.12 Examples!
2.12.1 Finite dimensions
Let V , W be finite dimensional normed vector spaces. Then any linear map
T : V → W is bounded. We will see this later on. Exercise: Let vi , wi be
bases for V , W , respectively. What is the relation between the matrix of T and
T ∗?
id : X → C 1 [0, 1]
where X denotes C 1 [0, 1] equipped with the supremum norm, the target is the
usual C 1 defined as in Section 2.7.3, and the map is just
the identity. This map is clearly linear! But it is not bounded. For one can
consider a sequence fi of C 1 functions such that |fi | ≤ 1, yet |fi′ | → ∞.
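For instance, one concrete choice is fi (x) = sin(ix): then |fi | ≤ 1 for all i, while
|fi′ | = sup_{x∈[0,1]} |i cos(ix)| = i → ∞, so no bound of the form |id(f )|1 ≤ C|f |
can hold.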
3 Finite dimensional normed vector spaces
The point of the theory developed in this course is to treat the infinite dimen-
sional case. In this section, we shall see why the finite dimensional case is so
very different.
Let us begin with a definition.
Definition 3.1. We say that two norms | · |1 and | · |2 on a vector space V are
equivalent if there exist constants 0 < c < C < ∞ such that
c|v|1 ≤ |v|2 ≤ C|v|1    for all v ∈ V .
One easily sees that this defines an equivalence relation on the set of norms
on V . Clearly the induced topology defined by two equivalent norms is the
same. In particular, the notion of bounded operator with respect to the norms
coincides. Moreover, the notion of a Cauchy sequence with respect to the two
norms is the same. Thus, if | · |1 and | · |2 are equivalent, then vi is Cauchy with
respect to | · |1 iff it is Cauchy with respect to | · |2 .
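For example, on Rn the norms |x|1 = ∑ |xi | and |x|∞ = max |xi | are equivalent:
clearly |x|∞ ≤ |x|1 ≤ n|x|∞ , so one may take c = 1 and C = n in the definition.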
In this section we will show that all finite dimensional normed vector spaces
have equivalent norms and are Banach. We will show that all linear maps are
bounded. Moreover, we will show the closed unit ball in a finite dimensional
n.v.s. is compact. Finally, we shall show that the latter fact is a characterization
of finite dimensionality for normed vector spaces, that is to say, if the closed
unit ball is compact, then the dimension is necessarily finite.
Given a finite n dimensional vector space V , we may think of it as Rn or Cn
after choice of basis. Let us define a norm on Rn , resp. Cn , by
|x|1 = ∑_{i=1}^n |xi |.
We denote the resulting normed vector space by l_1^n .
Proof. This follows from the computation
Lemma 3.2. The closed unit ball (and thus the unit sphere) is compact in the
topology of l_1^n .
Proof. Suppose x^i is a sequence in the closed unit ball. Then x^i_1 is a sequence
in R or C. Choose a subsequence along which the first coordinate converges to
some x̃1 , then a further subsequence along which the second coordinate converges
to some x̃2 , etc. Construct from the final choices a sequence we will again denote
by x^i . Clearly x^i → x̃ = (x̃1 , . . . , x̃n ), since |x^i − x̃|1 = ∑_{j=1}^n |x^i_j − x̃j | → 0.
In fact, the above proof shows that any closed ball is compact in the topology
of l_1^n , and even more generally, any closed and bounded set.
From the above we know that | · |, being a continuous function on a compact
set, attains its infimum on the unit sphere of l_1^n . The latter then must be
strictly positive. Thus we have
0 < c ≤ |x|
for all |x|1 = 1. But then, applying this to an arbitrary x̃ = λx, for 0 ≠ |λ| = |x̃|1
we obtain
0 < |x̃|1 c = |λ|c ≤ |λ||x| = |λx| = |x̃|.
This completes the proof.
Proposition 3.2. Let V be a finite dimensional normed vector space. Then the
closed unit ball is compact.
Proof. This follows from the statement proven about l_1^n in Lemma 3.2 and from
the previous proposition.
Proposition 3.3. Let V be a finite dimensional normed vector space. Then V
is a Banach space.
Proof. Let vi be Cauchy in V . It follows that vi is in particular bounded,
i.e. there exists an R such that vi ∈ B̄(R), the closed ball of radius R. But B̄(R)
is compact (so in particular complete!). So vi converges.
Corollary 3.2. Let V ⊂ W , where W is a normed vector space, and V is a
finite dimensional subspace. Then V is closed.
Proof. Since ImT is clearly finite dimensional, it suffices to consider the case
where W is also finite dimensional. It suffices to prove this when V = l1n ,
W = l1m . Consider the matrix Tij associated to T . We have
T (x1 , . . . , xn ) = ( ∑_i Ti1 xi , . . . , ∑_i Tim xi ),
and thus
|T (x1 , . . . , xn )|1 = ∑_j | ∑_i Tij xi |
 ≤ ∑_{i,j} |Tij | |xi |
 ≤ ( m max_{i,j} |Tij | ) ∑_i |xi |
 = C |(x1 , . . . , xn )|1 ,
where C = m max_{i,j} |Tij |. This of course completes the proof.
Finally:
Proposition 3.5. Let V be a normed vector space with the property that the
closed unit ball is compact. Then V is finite dimensional.
Proof. Consider the open cover of the closed unit ball consisting of all open
balls around arbitrary points, of radius 1/2. By compactness, there exists a
finite subcover, i.e.
B(1) ⊂ ∪ni=1 (xi + B(1/2)).
Now let W denote the subspace spanned by the xi . This of course is finite dimen-
sional with dim W ≤ n. We have
B(1) ⊂ W + B(1/2).
Iterating once we obtain
B(1) ⊂ W + (1/2)(W + B(1/2)) = W + B(1/4).
Iterating arbitrarily many times, we obtain
B(1) ⊂ W + B(1/2m )
for all m ≥ 0, and thus
B(1) ⊂ ∩m (W + B(1/2^m )) = W̄ = W
where for the last equality we have used Corollary 3.2. But of course, this implies
that V ⊂ W , and thus V = W . So V is finite dimensional with dim V ≤ n.
The above proof can easily be generalised to a statement about locally com-
pact topological vector spaces, i.e. topological vector spaces with a neighborhood
of the origin with compact closure.
4 The Hahn-Banach Theorem
The Hahn-Banach Theorem is essentially a statement about the richness of V ∗ ,
i.e. it says (or rather, its corollaries show) that this space is sufficiently big.
At this point, we don’t even know whether in general V ≠ 0 implies
V ∗ ≠ 0!
The idea here is that one constructs elements of V ∗ by defining linear func-
tionals on subspaces of V (for instance finite dimensional ones, where this is
more or less trivial) and then extends them “little by little” to the whole of V .
The fact that one can extend a bounded linear functional from a codimension
1 subspace to the whole space is the content of Proposition 4.3. If
V = ∪Vi , (6)
where Vi are finite dimensional subspaces with Vi ⊂ Vi+1 of codimension 1, then
this allows one to obtain bounded linear functionals on V by induction.
The analogue of this induction procedure in the general case when (6) does
not hold is known as transfinite induction. We begin with a general discussion.
4.2 Application: Every vector space has a basis
Proposition 4.1. Let V ≠ {0} be a vector space. Then there exists a basis for
V.
In fact we will prove something stronger namely
Proposition 4.2. Let V ≠ {0} be a vector space, and let S ⊂ V be linearly
independent. Then there exists a basis B of V with S ⊂ B ⊂ V .
Proof. Let S be the set of all linearly independent subsets of V containing S.
This set is non-empty since it contains S. We may partially order S as follows.
For S1 , S2 ∈ S, we say that S1 ≤ S2 if S1 ⊂ S2 . One checks easily that this
defines an order relation. Suppose now that T ⊂ S is a non-empty totally
ordered subset. Define S b = ∪S̃∈T S̃. The set S̃ contains S, and is linearly
independent, because if m
P
i=1 αi xi = 0, then there exists by total ordering a
S̃ ∈ T such that xi ∈ S̃ for all i = 1, . . . n, and one applies linear independence
of S̃ to deduce that αi = 0. Thus, Sb is clearly a least upper bound in S, so S
has the least upper bound property.
By Zorn’s lemma, it follows that S has a maximal element. Call this B. To
show B is a basis, we need only show that Span(B) = V . Suppose this is not
the case, i.e. suppose ∃v : v ∉ Span B. Consider the set
B̃ = {v} ∪ B.
Claim. B̃ is linearly independent. For if αv + ∑_{i=1}^n αi vi = 0 with vi ∈ B, and
α ≠ 0, one obtains
v = − ∑ α^{−1} αi vi
and thus v ∈ Span(B), a contradiction. But now we have B ≤ B̃, but B ≠ B̃,
and this contradicts the maximality of B. So B is indeed a basis.
Note how it was important in the above that the scalars constitute a field.
The above result does not hold for modules.
Exercise. Show that any two bases have the same cardinality.
Proposition 4.3. Let V be a real vector space, and let p : V → R be a function
with p(v1 + v2 ) ≤ p(v1 ) + p(v2 ), and p(λv) = λp(v) for all λ > 0. Let
W be a codimension 1 subspace of V , and suppose that f : W → R is a linear
functional such that f (w) ≤ p(w) for all w ∈ W . Then there exists a linear
functional f˜ : V → R such that f˜|W = f and f˜(v) ≤ p(v) for all v ∈ V .
The assumptions on p imply in particular that it is a convex function. Note
that, in the case of a normed vector space V , if ||f || < ∞, then the conditions of
the proposition are satisfied with p(x) = ||f |||x|, and one has that ||f˜|| = ||f ||.
Proof. Pick v0 ∈ V \ W . Since W has codimension 1, it suffices to define f˜(v0 ) so
that f˜(w + av0 ) ≤ p(w + av0 ) for all w ∈ W , a ∈ R. That is to say, f (w) + af˜(v0 ) ≤ p(w + av0 ).
This is equivalent to
af˜(v0 ) ≤ p(w + av0 ) − f (w)
and thus, for a > 0, we obtain
f˜(v0 ) ≤ a−1 p(w + av0 ) − f (a−1 w),
and thus
f˜(v0 ) ≤ p(a−1 w + v0 ) − f (a−1 w),
for all w, a condition which, by redefining w, we may write as
f˜(v0 ) ≤ p(w + v0 ) − f (w) (7)
for all w.
On the other hand, for a < 0, we obtain the condition
f˜(v0 ) ≥ a−1 p(w + av0 ) − a−1 f (w),
or equivalently
f˜(v0 ) ≥ −(−a−1 )p(w + av0 ) + f (−a−1 w)
and thus, after redefining w as above, we obtain the condition
f˜(v0 ) ≥ −p(w − v0 ) + f (w). (8)
for all w.
So now, the condition that f˜(v0 ) can be chosen so that inequalities (7) and
(8) both hold is just that, for all w, w̃:
f (w̃) + f (w) ≤ p(w + v0 ) + p(w̃ − v0 ). (9)
But indeed, (9) holds. For we have
f (w̃) + f (w) = f (w̃ + w)
≤ p(w̃ + w)
= p(w + v0 + w̃ − v0 )
≤ p(w + v0 ) + p(w̃ − v0 ).
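A toy example may help fix the ideas: take V = R2 with p(x1 , x2 ) = |x1 | + |x2 |,
W = {(x1 , 0)}, and f (x1 , 0) = x1 , so that f ≤ p on W . For v0 = (0, 1) the
conditions (7) and (8) read −1 ≤ f˜(v0 ) ≤ 1, and indeed every extension
f˜(x1 , x2 ) = x1 + αx2 with |α| ≤ 1 satisfies f˜ ≤ p on all of R2 .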
4.3.2 The general case
Theorem 4.1. Proposition 4.3 holds without the assumption that W has codi-
mension 1.
Proof. Consider the set S of all extensions f˜ : Ṽ → R of f , where W ⊂ Ṽ ⊂ V
is a subspace, where f˜|W = f , and where f˜(x) ≤ p(x) for all x ∈ Ṽ .
This set is nonempty, as it contains f itself. We may partially order the set
by setting f˜1 ≺ f˜2 if Ṽ1 ⊂ Ṽ2 , where Ṽi are the domains of f˜i , and f˜2 |Ṽ1 = f˜1 .
This clearly defines a partial ordering.
Claim. Under this partial ordering, S satisfies the least upper bound prop-
erty. For if T is a totally ordered subset, we can consider an f˜ defined on ∪Ṽ ∈T Ṽ
by f˜(x) = f˜α (x) for some f˜α such that its domain Ṽα contains x. One easily sees
that since T is totally ordered, this definition does not depend on the choice.
Finally, it is clear that f˜α ≺ f˜, for any f˜α ∈ T .
We may thus apply Zorn’s lemma to obtain a maximal f˜ ∈ S. We are left
with proving that the domain of f˜ is V . So let us suppose that this is not the
case. Let v0 ∈ V \ W̃ , where W̃ denotes the domain of f˜. Consider the set
Ṽ = Span(v0 , W̃ ). We have that W̃ ⊂ Ṽ is codimension 1. We may then apply
Proposition 4.3 to extend f˜ to a linear g : Ṽ → R, with g(x) ≤ p(x) for all x ∈ Ṽ .
But clearly, f˜ ≺ g and f˜ ≠ g. This contradicts maximality.
So the domain of f˜ is V , and the theorem is proven.
Corollary 4.1. Let V be a normed vector space, and W ⊂ V a subspace. Let
f ∈ W ∗ . Then there exists an f˜ ∈ V ∗ with f˜|W = f , and ||f˜|| = ||f ||.
We will refer to Theorem 4.1 or Corollary 4.1 as the Hahn-Banach theorem.
In fact
Corollary 4.4. Let V be a normed vector space, v, w ∈ V with v ≠ w. Then
there exists an f ∈ V ∗ such that f (v) ≠ f (w).
Proof. Just take f = fv−w .
For another manifestation of the richness of V ∗ , let us consider V ∗∗ . We
have
Proposition 4.5. The map φ : V → V ∗∗ is an isometry, i.e. ||φ(v)|| = |v|.
In particular, φ is injective.
Proof. We have already shown in Proposition 2.7 that ||φ(v)|| ≤ |v|. For the
other direction, just note that for |v| = 1, we can choose a support functional
fv with ||fv || = 1, and this gives
|φ(v)(fv )| = |fv (v)| = 1,
and thus ||φ(v)|| ≥ 1 for all |v| = 1. But this gives ||φ(v)|| ≥ |v| for all v.
Finally, in a similar spirit, we show
Proposition 4.6. Let V and W be normed vector spaces and let T : V → W
be a bounded linear map. Then T ∗ : W ∗ → V ∗ satisfies ||T ∗ || = ||T ||.
Proof. Again, we have already shown that ||T ∗ || ≤ ||T ||. Let v ∈ V be arbitrary
with |v| = 1, and choose a support functional fw for w = T v. We have
||T ∗ || ≥ ||T ∗ fw || ≥ |(T ∗ fw )(v)| = |fw (T v)| = |T v|,
and thus ||T ∗ || ≥ |T v|.
construct a large subset of the dual. One has no need for applying the axiom of
choice to construct such elements.
One easily sees that the norm of g thought of as an element in (C([0, 1]))∗ is
∫ |g|. Since (C([0, 1]))∗ is a Banach space, and the set C([0, 1]) is not complete
under the L1 norm, it is clear that (C([0, 1]))∗ contains elements not of the
above form. For instance, f 7→ f (x0 ) is an element of the dual not induced as
above. It turns out that the dual of C([0, 1]) can be identified with the space
of signed Borel measures. This is not a space we have the technology to work
with in this course.
The convergence of this sum follows from the Hölder inequality, stating that for
x ∈ lq , y ∈ lp satisfying (10),
∑ |xi yi | ≤ |x|q |y|p .
It turns out (and you will show this on example sheets) that this identifica-
tion yields an isometric isomorphism of (lp )∗ and lq . What about (l∞ )∗ ?
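For instance, when p = q = 2 the Hölder inequality above is just the Cauchy–
Schwarz inequality ∑ |xi yi | ≤ |x|2 |y|2 ; in particular every y ∈ l2 defines a
bounded linear functional x 7→ ∑ xi yi on l2 , of norm |y|2 .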
5 Completeness
In the last section, we essentially exploited the structure of convexity provided
by a norm defined on a vector space. (Our slightly more general formulation of
Hahn-Banach in terms of a convex functional p should make this clear.) In this
section, we will exploit completeness. Thus, Banach spaces will become here
important.
The considerations of the current section stem from the observation that
complete metric spaces are necessarily “big”. This notion of “big” is captured
by a concept known as Baire category.
5.1 Baire category
Definition 5.1. Let X be a topological space. We say that X is of first category
if
X = ∪i Ei
where Ei is nowhere dense6 . Otherwise, we say that X is of second category.
Our convention is that ∪i always denotes a countable union. We can al-
ternatively characterize the spaces of second category as follows: A nonempty
topological space X is of second category if either of the following equivalent
statements is true
1. For all Ui countable collection of open dense sets, ∩i Ui is nonempty.
2. If X = ∪Ci , where Ci is closed, then there exists an i such that Ci has
non-empty interior.
Theorem 5.1. Let X be a complete metric space. Then X is of second category.
Proof. Let Ui be a sequence of open dense sets in X. Choose x1 ∈ U1 . By
the density of U2 , and openness of U1 , there exists an open ball of radius ≤ 1,
such that Bx1 (ǫ1 ) ⊂ U1 , and x2 ∈ Bx1 (ǫ1 ) such that x2 ∈ U2 . Given now 0 <
ǫj ≤ j −1 , and Bxj (ǫj ) ⊂ Uj , and Bxj (ǫj ) ⊂ Bxj−1 (ǫj−1 ), for j = 1, . . . , i, we can
choose xi+1 ∈ Ui+1 ∩Bxi (ǫi ), and an ǫi+1 such that Bxi+1 (ǫi+1 ) ⊂ Bxi (ǫi )∩Ui+1 .
Clearly, xi is a Cauchy sequence, and thus converges to some x. Moreover,
since xj ∈ Bxi (ǫi ) for j ≥ i, the limit x = lim xj is contained in the closed ball
B̄xi (ǫi ), which (choosing the ǫi small enough at each stage) is contained in Ui .
Thus, x ∈ Ui for all i, so ∩Ui ≠ ∅.
5.2 Applications
5.2.1 The existence of irrationals
Proposition 5.1. R \ Q 6= ∅.
Proof. Q is of first category, as it is countable and points are closed with empty
interior, yet R is of second category, as it is complete.
Note that this is a completely nonconstructive proof. Compare this with the
difficulty of constructing a particular number which is irrational.
Proof. Consider the space C 0 ([0, 1]). This is a Banach space. For all integer
n ≥ 1, and rationals p ∈ Q ∩ [0, 1], define
Let Fn denote the closure of En . One easily shows that Fn is nowhere dense,
as any open set in C 0 ([0, 1]) contains functions such that the Lipschitz constant
is arbitrarily large at all p, and thus cannot be approximated by members of
En . Thus, ∪Fn is of the first category. Since C 0 is complete, we must have
C 0 ≠ ∪Fn , i.e. there exists an f which is differentiable nowhere.
Showing that Fn is nowhere dense may give you an idea for a constructive
proof of the statement of the Proposition. Try it.
Then
sup_α ||Tα || < ∞.    (12)
∪Fn = V.
Tα (B(ǫ)) ⊂ Tα x0 + B(n)
and thus
Tα (B(1)) ⊂ ǫ−1 B(n0 + n) = B(ǫ−1 (n0 + n))
and thus ||Tα || ≤ ǫ−1 (n0 + n), i.e. we have shown (12).
5.4 Open mapping, inverse mapping, closed graph
Theorem 5.3. Let V and W be Banach spaces, and let T be a surjective
bounded linear map T : V → W . Then T is an open map, i.e. T (U) is open if
U is open.
Proof. First note the following
Lemma 5.1. T is open iff T (B(1)) ⊃ B(ǫ) for some ǫ > 0.
Proof. One direction is obvious. For the other, note that if U is an arbitrary
open set, and q = T (p) for p ∈ U, then there exists a δ such that p + B(δ) ⊂ U.
But then
T (U) ⊃ T (p + B(δ))
= T (p) + T (B(δ))
= q + δT (B(1))
⊃ q + B(δǫ),
and this shows that T (U) contains an open set around an arbitrary element q
of it.
So it suffices for us to show that under the assumptions of the Theorem,
T (B(1)) ⊃ B(ǫ) for some ǫ > 0. We certainly have V = ∪∞_{n=1} nB(1). Thus,
by surjectivity,
W = T (∪∞_{n=1} nB(1)) = ∪∞_{n=1} nT (B(1)).
Lemma 5.2. Let T be a bounded linear map T : V → W , where V is a Ba-
nach space and W a normed vector space, such that the closure of T (B(1))
contains B(1). Then T (B(1)) ⊃ B(1).
Proof. Let w ∈ B(1) ⊂ W . We have that w ∈ B(1 − δ) for some δ > 0. By
the density of T (B(ǫ)) in B(ǫ) for all ǫ > 0 (this follows by assumption, and
linearity), and the fact that, for any set X ⊂ V , X ⊂ X + B(ǫ̃), for any ǫ̃ > 0,
we have that for all i ≥ 1,
B(2^{−i} (1 − 2^{−i} δ)) = B(2^{−i−1} (1 − 2^{−i} δ)) + B(2^{−i−1} (1 − 2^{−i} δ))
 = \overline{T (B(2^{−i−1} (1 − 2^{−i} δ)))} ∩ B(2^{−i−1} (1 − 2^{−i} δ)) + B(2^{−i−1} (1 − 2^{−i} δ))
 ⊂ T (B(2^{−i−1} (1 − 2^{−i} δ))) ∩ B(2^{−i−1} (1 − 2^{−i} δ)) + B(2^{−2i−2} δ) + B(2^{−i−1} (1 − 2^{−i} δ))
 = T (B(2^{−i−1} (1 − 2^{−i} δ))) ∩ B(2^{−i−1} (1 − 2^{−i} δ)) + B(2^{−i−1} (1 − 2^{−i−1} δ)).
By induction, it follows that we may write w as w = ∑_{i=1}^∞ wi , where
wi ∈ T (B(2^{−i} (1 − 2^{−i+1} δ))) ∩ B(2^{−i} (1 − 2^{−i+1} δ)).
If vi ∈ B(2^{−i} (1 − 2^{−i+1} δ)) is thus such that T vi = wi , then ∑ vi converges to a
v ∈ B(1) since V is a Banach space. By continuity and linearity, T v = w. Thus
w ∈ T (B(1)).
Note how the completeness assumptions for V and W enter in very different
ways. The completeness of W was only used to infer it is second category. In
fact, one can replace the assumptions that W is Banach and T is surjective with
the assumption that the image of T is of second category in W . Surjectivity
then follows after having shown that the map is open.
The following result is known as the Inverse Mapping Theorem. It is im-
portant in various applications of the theory. It is essentially an immediate
corollary of Theorem 5.3.
Theorem 5.4. Let V and W be Banach spaces, and let T : V → W be an
injective and surjective bounded linear map. Then T −1 is bounded.
Proof. The map T −1 exists and is linear. Since by Theorem 5.3, we have
T (B(1)) ⊃ B(δ),
for some δ > 0, it follows that
B(1) ⊃ T −1 (B(δ)),
i.e.
T −1 (B(1)) ⊂ B(δ −1 ),
i.e. T −1 is bounded with ||T −1 || ≤ δ −1 .
Another celebrated Corollary of Theorem 5.3 is the so-called Closed graph
theorem.
Theorem 5.5. Let V and W be Banach spaces, and let T : V → W be linear.
Then T : V → W is bounded iff the graph Γ of T is closed as a subset of V × W .
By the graph of T , we mean the set Γ = {(v, T v)}v∈V ⊂ V × W .
Proof. Certainly, if T is bounded, then the graph is closed. So it suffices to
show the other implication.
Assume then that Γ is closed. As Γ is then a closed linear subspace of the
Banach space V × W , it follows that Γ is itself a Banach space.
Consider the map φ : Γ → V defined by
φ : (v, T v) 7→ v.
The map φ is clearly linear. Moreover, it is bounded, as |(v, T v)| = max{|v|, |T v|},
and thus ||φ|| ≤ 1. Finally, the map is manifestly both injective and surjective.
It follows from Theorem 5.4 that φ−1 is a bounded linear map. But this implies
that there exists a C such that for all v,
|T v| ≤ C|v|,
i.e. T is bounded.
To see the gain apparent from the previous Theorem, consider the sequential
characterization of continuity. To show that T is continuous, one has to show
that if vi → v then T vi → T v. Armed with the above theorem, it suffices to
show that if vi → v and if T vi → w, then w = T v.
Sn : C(S1 ) → C(S1 ),
defined by
Sn : f 7→ ∑_{k=−n}^{n} e^{ikt} fˆ(k),
where
fˆ(k) = (1/2π) ∫_{−π}^{π} f (t) e^{−ikt} dt.
Let φn denote the composition of Sn with the map “evaluation at 0.” Assuming
now that ∀f , supn |φn (f )| < ∞, one applies Banach-Steinhaus to obtain that
sup ||φn || < ∞. On the other hand, this is easily contradicted by using (13),
and choosing an appropriate f for each n.7 Thus there exists a continuous f
such that its Fourier series does not converge at the origin.
Another such argument, also left for the example sheets, is the following.
One can show that the Fourier series of an L1 function on S1 is an element
of c0 , the set of complex functions on Z that tend to 0 as |n| → ∞. Do all
sequences in c0 arise like this?
This would mean that the map Λ : L1 → c0 defined by taking a function to
its Fourier series is surjective. One first shows that this is an injective bounded
linear map. Were it surjective, its inverse would be bounded in view of the
inverse mapping theorem, Theorem 5.4. One obtains a contradiction by demon-
strating a sequence of L1 functions whose L1 norm goes to infinity, while the
sup norm of their Fourier coefficients remains bounded. Thus the answer is no.
One should make reference at this point to other results in analysis which are
beyond the scope of this class. If f ∈ C 1 (S1 ), then Sn f converges everywhere.
This is a 19th century theorem. On the other hand, for an Lp function on S1 ,
with p > 1 (in particular for continuous functions) the Fourier series converges
almost everywhere (i.e. the set where it doesn’t has measure 0). For p = 2, this
is the celebrated Carleson’s Theorem. This theorem is hard.
Finally, a classic result of Kolmogorov shows that there exists a function in
L1 (S1 ) such that the Fourier series diverges everywhere!
6.1 The extension theorem
Our first task will be to show that the space C(K) is sufficiently rich. Specif-
ically, we shall show that functions on C(K) can be constructed by extending
functions on closed subsets. Compare with the Hahn-Banach Theorem, Theo-
rem 4.1. In particular, C(K) separates points (Corollary 6.1).
U = ∩_{i=1}^m Uqi .
These sets are clearly open. By definition of a cover, V ⊃ C2 , while clearly
V ∩U = ∅. On the other hand, since Uqi ⊃ C1 for all i, it follows that U ⊃ C1 .
Note that the compactness of K was only used to assert the compactness of
the Ci . It follows thus from the proof that in any Hausdorff space X, one can
separate compact sets by open sets.
6.1.2 Urysohn’s lemma
We begin with the following
Lemma 6.1. Let X be a normal space, and let C0 , C1 be disjoint closed sets.
Then there exists an f ∈ C(X) with range [0, 1] such that f = 0 on C0 and
f = 1 on C1 .
Proof. Enumerate the rationals of Q ∩ [0, 1] as {qi }∞_{i=0} , where q0 = 0, q1 = 1.
Define inductively a collection of open sets Ui and closed sets Fi as follows: Let
U0 = ∅, F0 = C0 , let U1 be X \ C1 and F1 = X. Given Ui , Fi , for 0 ≤ i ≤ n, with
the property that Ui ⊂ Fi , and, if qi < qj , then Fi ⊂ Uj , there exists a unique
interval (qi1 , qi2 ) containing qn+1 , with 0 ≤ i1 , i2 ≤ n, and no other qj ∈ (qi1 , qi2 )
for j = 0, . . . , n. We have by normality that there exist an open Un+1 and a closed
Fn+1 , with Un+1 ⊂ Fn+1 , such that Fi1 ⊂ Un+1 ⊂ Fn+1 ⊂ Ui2 . It now follows that
the above property holds for all qi < qj with i, j = 0, . . . , n + 1, and thus, by
induction8 , one defines such a sequence for all i = 0, . . . , ∞.
Now define
f (x) = inf{qn : x ∈ Fn }.
6.1.3 The Tietze-Urysohn extension theorem
Finally,
Theorem 6.1. Let X be normal, and f : C → C a bounded continuous function
on a closed subset C. Then there exists a continuous extension f˜ : X → C, with
f˜|C = f , and |f˜| = |f |, where | · | denotes the sup norm.
Note that Lemma 6.1 is a special case of Theorem 6.1 where the range of f
consists of 2 points.
Proof. Clearly, by taking real and imaginary parts, translating and rescaling, it
suffices to consider the case where the range of f is [0, 1]. Also, one need not
worry about the condition |f˜| = |f |. For given any continuous extension fˆ : X → C
of f , we may define f˜ by f˜(x) = fˆ(x) if |fˆ(x)| ≤ |f |, and f˜(x) = e^{i arg fˆ(x)} |f |
otherwise; then f˜ is again a continuous extension of f with |f˜| = |f |.
Define a sequence of closed sets and functions by induction as follows. Let
f0 = f , C0 = f ^{−1} ([0, 1/3]), F0 = f ^{−1} ([2/3, 1]), and let g0 : X → [0, 1/3] be a
function equal to 0 on C0 and to 1/3 on F0 , given by Lemma 6.1. Having defined
fi , let
Ci = fi^{−1} ([0, (1/3)(2/3)^i ]),    Fi = fi^{−1} ([(2/3)(2/3)^i , (2/3)^i ]),
and let gi be a function equal to 0 on Ci and to (1/3)(2/3)^i on Fi , given by
Lemma 6.1. Set
fi+1 = (fi − gi )|C .    (14)
Clearly 0 ≤ fi+1 ≤ (2/3)^{i+1} . We thus have defined inductively functions
gi : X → [0, (1/3)(2/3)^i ],
fi+1 : C → [0, (2/3)^{i+1} ].
Clearly, from (14), we obtain
∑_{i=0}^∞ gi |C = f0 = f.
Setting f˜ = ∑_{i=0}^∞ gi , we have f˜ ∈ C(X), since
∑_{i=n}^m |gi | ≤ (2/3)^n ;
we’re done.
E ⊂ ∪x∈N Bǫ (x).
Corollary 6.2. Let X be a complete metric space. A set E is totally bounded
iff its closure Ē is compact.
Thus, since C(K) is a Banach space we are more than happy to classify
totally bounded sets.
We have
Definition 6.5. A subset F ⊂ C(K) is called equicontinuous at x ∈ K if for
every ǫ > 0 there exists a neighborhood U of x such that for y ∈ U, |f (y)−f (x)| <
ǫ for all f ∈ F. We say that F is equicontinuous if it is equicontinuous at x
for all x ∈ K.
Note that finite subsets of C(K) are clearly equicontinuous.
The Arzela-Ascoli theorem states
Theorem 6.2. Let K be compact Hausdorff. Then F ⊂ C(K) is totally bounded
iff it is bounded and equicontinuous.
Proof. Suppose F is totally bounded. It is certainly bounded. Given ǫ >
0, let {fi }_{i=1}^n be an ǫ-net for F . Given x ∈ K, let Ui be open neighborhoods of x
such that |fi (y) − fi (x)| < ǫ for y ∈ Ui . Define U = ∩i Ui . We have for y ∈ U,
|f (y) − f (x)| ≤ |f (y) − fi (y)| + |fi (y) − fi (x)| + |fi (x) − f (x)|.
Since {fi } forms an ǫ-net, there exists an i such that |f − fi | < ǫ. For this i, we
have then
|f (y) − f (x)| ≤ ǫ + |fi (y) − fi (x)| + ǫ ≤ 3ǫ.
We have shown that F is equicontinuous at x.
Conversely, suppose F is bounded and equicontinuous. Given ǫ > 0, x, let
Ux denote an open set around x such that |f (y) − f (x)| < ǫ for all y ∈ Ux .
The collection {Ux } forms an open cover for K, and thus there exists a finite
subcover {Uxi }ni=1 .
Consider F |{xi } as a subset of l∞^n . By assumption, this is a bounded subset,
and thus, since we are in l∞^n , totally bounded. There thus exists an ǫ-net
{fj |{xi } }_{j=1}^m ⊂ F |{xi } for this subset. For any j, we have
Note that examining the above proof, it is clearly enough to assume that
the family F is pointwise bounded, that is, for all x, supf ∈F |f (x)| < ∞.
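A standard example showing that boundedness alone does not suffice: the family
F = {x^n }n≥1 ⊂ C([0, 1]) is bounded by 1 (and pointwise bounded), but it is not
equicontinuous at x = 1: given any neighborhood U of 1, one can pick y ∈ U
with y < 1 and then n large enough that 1 − y^n ≥ 1/2. Correspondingly F is
not totally bounded, since the functions x^n converge pointwise to a discontinuous
function and hence admit no uniformly convergent subsequence.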
x′ = f (x), x(t0 ) = x0 ,
such that x is not the restriction of an x̃ satisfying the above on a larger interval.
Moreover, let K ⊂ R be compact. Then if T± ≠ ±∞ there exist t+ < T+ , t− > T− ,
respectively, such that x([t+ , T+ )) ∩ K = ∅, x((T− , t− ]) ∩ K = ∅,
respectively.
This theorem can be proven with the so-called Banach fixed point theorem
or with Picard iteration. On the other hand, what happens when we weaken the
Lipschitz condition to continuity? In this section, we will prove the following
Theorem 6.4. Let f : R → R be continuous. Then for any t0 , x0 , there exists
an ǫ > 0 and a function x : (t0 − ǫ, t0 + ǫ) → R satisfying the initial value
problem
x′ = f (x), x(t0 ) = x0 . (15)
That is to say, we still have existence, but we have lost uniqueness.
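The standard example of non-uniqueness: for f (x) = 2√|x| (continuous but not
Lipschitz at 0) and the data x(0) = 0, both x ≡ 0 and
x(t) = t^2 for t ≥ 0,    x(t) = 0 for t ≤ 0,
solve (15), as one checks directly.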
Proof. We shall prove Theorem 6.4 from Theorem 6.3. We shall make funda-
mental use of Arzela-Ascoli.
Choose δ > 0, and let F = sup|x−x0 |≤δ |f (x)|. Let ǫ > 0 be such that
ǫ(F + 1) ≤ δ/2. Let B denote the set {f + g̃} where g̃ ranges over functions
R → R with supremum norm less than or equal to 1.
Lemma 6.2. If x : [t0 − ǫ− , t0 + ǫ+ ] → R solves (15), for 0 < ǫ± ≤ ǫ, with
f˜ ∈ B in place of f , then
|x − x0 | < δ, (16)
|x′ | ≤ F + 1. (17)
The above lemma is an example of an a priori estimate.
Proof. For such an x, we have x′ = f˜(x), and thus
|x(t) − x0 | ≤ (F + 1)ǫ
 ≤ δ/2.
Consider the subset S ⊂ C 1 ([t0 − ǫ, t0 + ǫ]) which consists of solutions to
(15) on the interval [t0 − ǫ, t0 + ǫ], with f replaced by f˜ ∈ B, restricted to
[x0 − δ, x0 + δ]. We have just shown that this set is nonempty, and it contains
a unique function corresponding to any f˜ ∈ B ∩ C 1 ([x0 − δ, x0 + δ]).
Now we show
Lemma 6.3. S ⊂ C([t0 − ǫ, t0 + ǫ]) is totally bounded.
Proof. In view of Arzela-Ascoli, it suffices to show the uniform boundedness and
equicontinuity of S. This is immediate from (16) and (17).
On the other hand, we have
Lemma 6.4. C 1 ([x0 − δ, x0 + δ]) ⊂ C([x0 − δ, x0 + δ]) is dense.
Proof. Omitted. This will follow in particular from the next section. In the
meantime, see if you can come up with a direct proof.
In particular, consider a sequence fi → f with fi ∈ B ∩ C 1 ([x0 − δ, x0 + δ]).
By density, such a sequence exists. Let xi ∈ S denote the corresponding solution
to (15) with fi . By Lemma 6.3, there exists a subsequence xik such that xik → x
for some x ∈ C 0 ([t0 −ǫ, t0 +ǫ]). The function x is a good candidate for a solution
of (15)! But we have to be careful; we’re not done yet. We still must deduce
that x actually solves (15). (At this point, we don’t even know yet that x is
differentiable!)
This doesn’t turn out to be so difficult. Since fik → f uniformly, and
xik → x uniformly, it follows that fik (xik )(t) → f (x(t)) uniformly in t. Since
x′ik = fik (xik ), this implies by an elementary result in real analysis that x′ exists
and equals lim x′ik , i.e. f (x).
The importance of these ideas is even greater when one passes from ordinary
differential equations to partial. Such applications would take us, however, too
far afield.
is commutative as a ring, we say that V is a commutative algebra. If V has a
multiplicative identity, we say V is unital.
Examples. The space C(K) is certainly a unital commutative Banach alge-
bra under the usual multiplication of functions. L(V, V ) is an algebra under
composition of linear transformations. If V is a normed vector space, then
B(V, V ) ⊂ L(V, V ) is a unital subalgebra which is a normed algebra, and which, if
V is a Banach space, is a unital Banach algebra.
On the other hand
Definition 6.7. A lattice is a partially ordered set L such that any two element
subset has a least upper bound and a greatest lower bound.
If p and q are two elements of L, we may denote the least upper bound and
greatest lower bound by p ∨ q and p ∧ q, respectively. We may give equivalently
an algebraic characterization of the notion of lattice in terms of a set L and
two binary operations ∨ and ∧ satisfying a collection of axioms. What are the
axioms?
The vector space FR (X), the real valued functions on a topological space
X, is a lattice where FR (X) is given the obvious partial ordering f ≤ g ⇐⇒
∀x f (x) ≤ g(x). We have then
Proof. Let g(x) ∈ CR (K) be as in the assumption of the lemma. Given ǫ > 0,
we will produce an f ∈ L such that |f − g| < ǫ.
Pick an x ∈ K, and for each y ∈ K, let fx,y ∈ L be a function with fx,y (x) = g(x)
and fx,y (y) = g(y), as provided by the assumption of the lemma.
By continuity of fx,y , g, the inequality |fx,y − g| < ǫ must hold in open sets
around x, y. Call these open sets Vx,y , Ux,y . We have that {Ux,y } is an open
cover for K, and thus there exists a finite subcover Ux,yi . If we consider Vx =
∩ni=1 Vx,yi , and
fx = fx,y1 ∧ · · · ∧ fx,yn ,
we have
fx (y) < ǫ + g(y) (18)
for all y, and
g − ǫ < fx (19)
in Vx . By the definition of a sublattice, fx ∈ L. Now, consider the open cover
Vx for K. By compactness of K there exists a finite subcover Vxj . Define
f = fx1 ∨ · · · ∨ fxm .
and let Sn denote the partial sum ∑_{i=0}^n ci (x − 1/2)^i , and let S̃n = Sn − Sn (0).
We have that S̃n is a polynomial without constant term. On the other hand,
Sn (0) → ǫ, as n → ∞, thus, for n ≥ N , we have |Sn (0)| < 2ǫ.
We have that S̃n ◦ f 2 ∈ A, as A is an algebra, and S̃n has no constant term.
Since |f | ≤ 1 implies that 0 ≤ f 2 ≤ 1, we have that, given ǫ > 0, there exists
an Ñ such that, for n ≥ Ñ ,
|S̃n ◦ f 2 − |f || ≤ |S̃n ◦ f 2 − gǫ ◦ f 2 | + |gǫ ◦ f 2 − |f ||
≤ |S̃n ◦ f 2 − Sn ◦ f 2 | + |Sn ◦ f 2 − gǫ ◦ f 2 |
+ |gǫ ◦ f 2 − |f ||
< 2ǫ + ǫ + ǫ = 4ǫ.
(In the above centred formula, |f | denotes the function x 7→ |f (x)|, not the sup
of f .) Thus, since A is assumed closed, and ǫ > 0 is arbitrary, we are done.
Aside. One can remark that in the above lemma we do not use continuity. We
could replace CR (K) in the statement with the subspace FRb (K) ⊂ FR (K) of
bounded functions, given the supremum norm. Compare with Lemma 6.5 where
continuity is used in a fundamental way.
To complete the proof of Theorem 6.5, there is very little to say. Consider
the closure Ā. Note that this is again an algebra by the continuity of the multipli-
cation, addition, etc. Thus, by Lemma 6.6, Ā is a sublattice of C(K).
Let us suppose that for all x ∈ K, there exists an f ∈ A such that f (x) ≠ 0.
For x ≠ y, let fx be a function such that fx (x) ≠ 0, let fy be a function
such that fy (y) ≠ 0, and let fx,y be a function such that fx,y (x) ≠ fx,y (y).
By defining f˜ = fx + αfx,y + βfy , for some α, β ∈ R, we obtain a function
such that f˜(x) ≠ 0, f˜(y) ≠ 0, and f˜(x) ≠ f˜(y). It follows that (f˜(x), f˜(y)),
(f˜2 (x), f˜2 (y)) are linearly independent, and thus the assumptions of Lemma 6.5
hold for arbitrary g ∈ C(K).
Applying thus Lemma 6.5, we have shown that if for all x ∈ K there
exists an f ∈ A such that f (x) ≠ 0, it follows that Ā = C(K).
On the other hand, suppose now that there exists a point x such that for all
f ∈ A, f (x) = 0. Let us consider the algebra A′ which is spanned by A and the
constants. This is easily seen to equal {f + λ1}f ∈A, λ∈R . We have that A′ satisfies
the property enunciated in the previous paragraph, and in addition, separates
points. Thus, we have Ā′ = C(K). But now let g ∈ C(K) such that g(x) = 0,
and let ǫ > 0 be arbitrary. This means we may write
|g − (f + λ)| < ǫ
for some f ∈ A, λ ∈ R. Evaluating at x, since f (x) = g(x) = 0, we obtain
|λ| < ǫ. Thus, |g − f | < 2ǫ. It follows that g ∈ Ā.
We state the complex version of Stone-Weierstrass
Theorem 6.6. Let A ⊂ CC (K) be a subalgebra over C separating points, and
moreover, suppose that A is closed under complex conjugation. Then Ā =
CC (K), or there exists an x ∈ K such that Ā = {f ∈ CC (K) : f (x) = 0}.
Proof. Recall that ℜf = (1/2)(f + f¯), ℑf = (i/2)(f¯ − f ). Thus, by assumption,
f ∈ A =⇒ ℜf ∈ A, ℑf ∈ A. Consider the subalgebra A′ (over R) of CR (K)
generated by ℜf , ℑf , for all f ∈ A. We have A′ ⊂ A, and moreover, it is easily
seen to separate points.
Suppose that for all x ∈ K, there exists an f ∈ A′ such that f (x) ≠ 0.
Then by Theorem 6.5, we have that Ā′ = CR (K). But then for any u ∈ CR (K),
v ∈ CR (K) there exist sequences fj ∈ A′ with fj → u, gj ∈ A′ with gj → v, and
thus fj + igj → u + iv. But fj + igj ∈ A. Thus, we have shown in this case
Ā = CC (K).
If, on the other hand, there exists an x such that f (x) = 0 for all f ∈ A′ ,
then argue as in the last part of the proof of Theorem 6.5.
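The hypothesis that A be closed under conjugation cannot be dropped. Take
K = S1 and let A be the algebra of polynomials in z (restricted to |z| = 1). This
subalgebra separates points and contains the constants, yet Ā ≠ CC (S1 ): every
polynomial p satisfies ∫_{|z|=1} p(z) dz = 0, a property which passes to uniform
limits, whereas ∫_{|z|=1} z̄ dz = 2πi ≠ 0, so the continuous function z̄ does not lie
in Ā.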
We have that
∫ |f − SN (f )|2 → 0 as N → ∞.
and thus
∫_{−π}^{π} |f − SN (f )|2 ≤ ∫_{−π}^{π} |f − P |2 + ∫_{−π}^{π} |SN (P − f )|2 ≤ 2πǫ2 + 2πǫ2 .
Where did the last inequality come from? The claim is that for any function
g ∈ C(S1 ), we have
∫_{−π}^{π} |SN (g)|2 ≤ ∫_{−π}^{π} |g|2 .    (20)
Try showing this directly at this stage.
The geometric structure of (20) may not be immediately apparent. The e^{inx}
form what is known as an orthonormal set in C(S1 ), with respect to the “inner
product”
f · g = (1/2π) ∫_{−π}^{π} f ḡ.
Orthonormal means e^{inx} · e^{imx} = δnm , where δnm = 1 if n = m and 0 otherwise.
Modulo the convenient factor of (2π)−1 , this inner product is related to the L2
norm by |f |_2^2 = f · f .
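Indeed, for n ≠ m,
e^{inx} · e^{imx} = (1/2π) ∫_{−π}^{π} e^{i(n−m)x} dx = (1/2π) [ e^{i(n−m)x} /(i(n − m)) ]_{−π}^{π} = 0,
while for n = m the integrand is identically 1 and the integral equals 1.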
In this language, we can understand SN as an “orthogonal projection”, and
(20) will just follow from the general statement that orthogonal projections do
not increase the norm.
We embark in the next section on a general study of normed spaces whose
norm arises from an inner product in the sense described above. These are
called Euclidean spaces. A Euclidean space which is also complete is called a
Hilbert space. We will return to thus to (20) later. . .
7 Hilbert space
In this section, we shall introduce the concept of a Hilbert space, that is to say
a Banach space whose norm arises from an inner product. We have already
seen to a certain extent at the end of the previous section how the notion of
orthogonality may be useful. It is hard to give a sense of just how important
this notion is in analysis.
The inner product structure will also allow us to revisit the notion of duality,
defined for general Banach space earlier. As we shall see, for Hilbert spaces, the
dual and the adjoint map have very concrete realisations.
The properties (21), (22) together are sometimes called sesquilinearity and
define the notion of a Hermitian form. In the real case, then this just means
that p is a bilinear form.
If < v, w >= 0 we say that v and w are orthogonal.
Perhaps the most fundamental property of inner product spaces is the so-
called Schwarz inequality.
Proposition 7.1. Let (V, <, >) be an inner product space. Then
| < v, w > | ≤ √(< v, v > < w, w >).    (24)
Proof. In the real case, consider, for t ∈ R,
< v + tw, v + tw > = < v, v > + < v, tw > + < tw, v > + t2 < w, w >
 = < v, v > + 2t < v, w > + t2 < w, w > .
Since this polynomial in t is strictly positive, its roots are not real, that is to say
4 < v, w >2 − 4 < v, v >< w, w > < 0. But this is the desired inequality.
Aside: In general, we call a vector v isotropic with respect to a Hermitian
form if p(v, v) = 0; we call a Hermitian form positive if p(w, w) ≥ 0 for all w,
and we call it non-degenerate if p(v, w) = 0 for all w implies v = 0. The above
proposition has thus shown that a positive hermitian form is non-degenerate iff
there are no isotropic vectors, i.e. iff it is positive definite.
Proposition 7.2. Let (V, <, >) be an inner product space. Define | · | on V by
|v| = √(< v, v >).    (25)
Definition 7.2. A normed vector space (E, | · |) where | · | is defined by (25) for
some inner product <, > on E, is called a Euclidean space.
Proposition 7.3. Let (E, | · |) be a Euclidean space. Then there is a unique
<, > such that (25) holds.
Proof. Let <, > be such that (25) holds. In the case of a real inner product, we
have
< v, w > = (1/2)(< v + w, v + w > − < v, v > − < w, w >)
 = (1/2)(|v + w|2 − |v|2 − |w|2 ),
whereas in the case of a complex inner product, we have
< v, w > + < w, v > = < v + w, v + w > − < v, v > − < w, w >,
while
−i < v, w > + i < w, v > = < v + iw, v + iw > − < v, v > − < w, w >.
Also,
Proposition 7.5. Let (E, | · |) be a Euclidean space. Suppose v and w are
orthogonal. Then
|v + w|2 = |v|2 + |w|2 (27)
Proof. Expand using the inner product. . .
Iterating, we have that if vi are pairwise orthogonal, then
| ∑_{i=1}^n vi |2 = ∑_{i=1}^n |vi |2 .
Finally, we have
Definition 7.3. Let (H, | · |) be a Euclidean space which is Banach, i.e. such
that | · | defines a complete metric. We say H is a Hilbert space.
It turns out that any Euclidean space can be embedded into a larger Hilbert
space by taking the completion. For this, let us first note the following easy
proposition:
Proposition 7.6. Let E be a Euclidean space. Then <, >: E × E → C is
continuous.
Proof. This follows from the computation
| < v, w > − < ṽ, w̃ > | ≤ | < v, w > − < v, w̃ > | + | < v, w̃ > − < ṽ, w̃ > |
= | < v, w − w̃ > | + | < v − ṽ, w̃ > |
≤ |v||w − w̃| + |v − ṽ||w̃|.
7.1.1 Examples
For us, this whole story began with C(S1 ) with
< f, g > = ∫_{−π}^{π} f ḡ.
10 I will also resist the temptation to call Hilbert spaces post-Euclidean spaces.
We could also equally well have taken C([a, b]) for some a < b. This is clearly a
Euclidean space with the “L2 norm”
|f |2 = √(< f, f >) = ( ∫_a^b |f |2 )^{1/2} .
Alas, this is not a Hilbert space, as we know already from our discussion of
Banach spaces. Its completion is, by Proposition 7.7. It is a miraculous fact
that this completion can be realised as a set of (equivalence classes of) Lebesgue
measurable functions, where < f, g > is defined with respect to the Lebesgue
integral. This is the space L2 , which in any case, we have discussed before in the
context of Banach spaces. For this, however, you will need more mathematical
technology than you have.
So we have to settle for little l2 . This space is easily seen to be a Euclidean
space with inner product
< a, b > = ∑ ai b̄i .
Since we know it to be a Banach space with the norm |a|2 = √(< a, a >), it
follows that l2 is a Hilbert space.
The space l2 is clearly separable, i.e. there exists a countable dense subset.
We will see later on that all infinite dimensional separable Hilbert spaces are
isometrically isomorphic to l2 . So l2 is not a bad example to have. . .
This being said, the whole point about setting up the theory of Hilbert spaces
is applying it to solve problems in analysis. Knowing, thus, that a particular
space is indeed a Hilbert space, is fundamental. This is why it really is a shame
that we cannot talk about L2 .
Theorem 7.1. If F ⊂ E is a subspace, where F is assumed complete and E is
Euclidean, then F ⊕ F ⊥ = E. Moreover, for arbitrary x ∈ E, writing uniquely
x = x1 + x2 , where x1 ∈ F , x2 ∈ F ⊥ , the component x1 is characterized uniquely by
|x − x1 | = inf_{y∈F} |x − y|.
Note that, of course, the assumptions of the theorem are satisfied if F is finite
dimensional, or alternatively, if E is Hilbert and F is closed. In particular, in a
Hilbert space, they are satisfied if F = S ⊥ .
Proof. The statement of the theorem should tip you off to the nature of the
proof. Take a minimising sequence yi for |y − x|, that is to say, a sequence
yi ∈ F such that lim |yi − x| = inf_{y∈F} |y − x| =: d.
First, I claim that yi is Cauchy. For this, apply (26) for v = x−yi , w = x−yj
to obtain
and thus
Now, for small enough t the second term is negative and dominates the last
term. For such a t then we have
and this contradicts the definition of d in view of the fact that y + tỹ ∈ F .
When F ⊕ F ⊥ = E, we say that F ⊥ is an orthogonal complement of F .
We may interpret the above Theorem in terms of projection operators.
Corollary 7.1. Let F , E be as in Theorem 7.1. There exists a unique operator
P : E → E such that P (E) = F , P (F ⊥ ) = 0, P 2 = P , (I − P )(E) = F ⊥ ,
(I − P )(F ) = 0, (I − P )2 = (I − P ), and ||P || ≤ 1, ||I − P || ≤ 1, with equality
if F ≠ 0, F ≠ E, respectively.
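The simplest example: if F = Span{e} for a single unit vector e, then P x =
< x, e > e; one checks directly that x − < x, e > e is orthogonal to e, that P 2 = P ,
and that |P x| = | < x, e > | ≤ |x| by the Schwarz inequality.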
w 7→< w, v > .
7.3.1 Definitions and existence
Definition 7.5. Let E be Euclidean. A set {eα } of unit vectors is called an
orthonormal system if < eα , eβ > = 0 for α ≠ β.
Definition 7.6. Let E be Euclidean. An orthonormal system is called maximal
if it cannot be extended to a strictly larger orthonormal system.
By Zorn’s lemma, there always exists a maximal orthonormal system in any
Euclidean space E.
Proposition 7.10. Let H be Hilbert, and S a maximal orthonormal system.
Then Span S is dense in H.
Proof. Consider S ⊥ . By Proposition 7.8, we have that S ⊥ = F ⊥ , where F
denotes the closure of Span S. By Theorem 7.1, H = F ⊕ F ⊥ . If F ⊥ = 0, we have
that H = F , i.e. Span S is dense in H, as desired. Otherwise, we have that S ⊥ = F ⊥ ≠ 0, in
which case there exists a unit x ∈ S ⊥ . The system {x} ∪ S is then orthonormal
and contains S strictly. This contradicts the maximality of S.
Note also the easy converse, which holds in any Euclidean space:
Proposition 7.11. Let E be Euclidean, and S be an orthonormal system such
that Span S is dense in E. Then S is maximal.
Definition 7.7. Let H be Hilbert. A maximal orthonormal system is called a
Hilbert space basis.
Proposition 7.11 thus provides an alternative characterization of a Hilbert
space basis.
We shall see in the examples that either characterization may be easier to
check in showing that a given orthonormal system is indeed a basis.
As discussed before, Zorn’s lemma is not something one should use lightly. It
turns out that in the separable case we can avoid it completely in constructing
Hilbert space bases. First we note the following
Proposition 7.12. Let {xi }_{i=1}^N be linearly independent for some N ≤ ∞. Then
there exist orthonormal {ei }_{i=1}^N such that for all n = 1, . . . , N , we have
Span (ej )j=1...n = Span (xj )j=1...n .
The above Proposition, is nothing but the celebrated Gram-Schmidt orthog-
onalization procedure.
Proof. By induction. Let e1 = x1 /|x1 |. Having constructed e1 , . . . , en , for n ≥ 1,
define
en+1 = | xn+1 − ∑_{i=1}^n < xn+1 , ei > ei |^{−1} ( xn+1 − ∑_{i=1}^n < xn+1 , ei > ei ).
(Note that en+1 = |(I − Pn )xn+1 |^{−1} (I − Pn )xn+1 , where Pn is the orthogonal projection
to the subspace Span (xj )j=1...n .)
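As a quick worked example in l2 , take x1 = (1, 1, 0, 0, . . .) and x2 = (0, 1, 1, 0, . . .).
Then e1 = (1, 1, 0, . . .)/√2, and < x2 , e1 > = 1/√2, so
x2 − < x2 , e1 > e1 = (−1/2, 1/2, 1, 0, . . .),
which has norm √(3/2); normalising gives e2 = (−1, 1, 2, 0, . . .)/√6. One checks
that e1 , e2 are orthonormal and span the same subspace as x1 , x2 .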
From this, we obtain the following
Proposition 7.13. Let H be separable, and let {yi } be a countable set such that
Span (yi ) is dense in H. Then there exists a countable Hilbert space basis for H
in the span of {yi }.
Proof. Go down the list of {yi }, and discard yj in the span of the previous. One
arrives at a {yik } which are linearly independent, and such that Span {yik } =
Span {yi }. Now, apply Gram-Schmidt to {yik }.
7.3.2 Examples
Take l2 , and let ei = (0, . . . , 0, 1, 0, . . .) where the 1 is in the i’th place. This is
clearly maximal, and thus a basis.
Take the completion H of C(S¹) with respect to the L² norm. (This space is
also known as L².) We know by Stone-Weierstrass that the algebra A of trigono-
metric polynomials is dense in C(S¹) with respect to the sup norm. This means
that it is also dense with respect to the L² norm. (Why?) Thus A is dense in
H as well. Since A is the span of an orthonormal system (the suitably normalised
trigonometric monomials e^{inθ}, n ∈ Z), that system is maximal by Proposition 7.11,
and hence a basis.
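One can also check the orthonormality of the trigonometric system numerically. The
sketch below (mine; numpy assumed, with the inner product taken conjugate-linear in
the second argument) approximates the L² inner products on S¹ by Riemann sums.

```python
import numpy as np

M = 2048                              # quadrature points on S^1
theta = np.linspace(0.0, 2 * np.pi, M, endpoint=False)
dtheta = 2 * np.pi / M

def e(n):
    # Normalised trigonometric monomial e_n(theta) = exp(i n theta) / sqrt(2 pi).
    return np.exp(1j * n * theta) / np.sqrt(2 * np.pi)

def inner(f, g):
    # L^2(S^1) inner product <f, g> = int f conj(g) dtheta, by a Riemann sum.
    return np.sum(f * np.conj(g)) * dtheta

# <e_m, e_n> should be (approximately) the Kronecker delta.
for m in range(-3, 4):
    for n in range(-3, 4):
        val = inner(e(m), e(n))
        expected = 1.0 if m == n else 0.0
        assert abs(val - expected) < 1e-10
print("orthonormality of the trigonometric system verified numerically")
```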
As abstract Hilbert spaces, the above two examples are actually the same. A
countably infinite basis for a Hilbert space gives rise to an isometric isomorphism
with l². We turn to this now.
Recall first the finite dimensional case: if e1, . . . , en is an orthonormal basis, then
for every x we have

    x = ⟨x, e1⟩ e1 + · · · + ⟨x, en⟩ en,
    |x|² = |⟨x, e1⟩|² + · · · + |⟨x, en⟩|².
We now move on to the general, separable case. First we show the following
Lemma 7.1. (Bessel's inequality) Let E be Euclidean, and {ei}_{i=1}^{N} be a count-
able orthonormal system, for some 1 ≤ N ≤ ∞. For x ∈ E, define xi = ⟨x, ei⟩.
Then

    Σ_{i=1}^{N} |xi|² ≤ |x|².

In particular, if N = ∞, then (xi) ∈ l².
Proof. Let Fn be the span of {ei}_{i=1}^{n}, and note that Σ_{i=1}^{n} |xi|² = |P_{Fn} x|² ≤ |x|².
If N = ∞, take the limit n → ∞.
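A quick numerical sanity check of Bessel's inequality (again a sketch of mine, with
numpy assumed): take an orthonormal system spanning a proper subspace of C⁶ and
compare the two sides.

```python
import numpy as np

rng = np.random.default_rng(2)

# An orthonormal system of 3 vectors in C^6 (the columns of Q), spanning a proper subspace.
A = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
Q, _ = np.linalg.qr(A)

x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
coeffs = Q.conj().T @ x          # x_i = <x, e_i>, inner product conjugate-linear in the second slot

bessel_sum = np.sum(np.abs(coeffs) ** 2)
assert bessel_sum <= np.linalg.norm(x) ** 2 + 1e-12
print(bessel_sum, "<=", np.linalg.norm(x) ** 2)
```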
Proposition 7.14. Let H be a separable Hilbert space, and let {ei}_{i=1}^{N} be a
countable basis, for some 1 ≤ N ≤ ∞. Let x, y ∈ H, and define xi = ⟨x, ei⟩,
yi = ⟨y, ei⟩. Then

    x = Σ_{i=1}^{N} xi ei,        y = Σ_{i=1}^{N} yi ei,                (28)

    ⟨x, y⟩ = Σ_{i=1}^{N} xi ȳi.                                        (29)
Proof. Consider the partial sums sn = Σ_{i=1}^{n} xi ei. Since (xi) ∈ l² by Bessel's
inequality, it follows by the Pythagorean theorem that the sequence sn is Cauchy.
Thus by completeness sn → s for some s ∈ H. Noting that ⟨sn, ei⟩ = xi for n ≥ i,
the continuity of the inner product gives ⟨s, ei⟩ = xi = ⟨x, ei⟩ for all i, so x − s is
orthogonal to every ei. By maximality of the basis, x − s = 0, which proves (28).
Then (29) follows by applying the continuity of the inner product to the partial sums.
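In a finite dimensional toy model the proposition can be checked directly; the sketch
below (mine; numpy assumed) verifies (28) and (29) for a random orthonormal basis of
C⁵, i.e. that the coefficient map x ↦ (xi) is an isometry onto the corresponding l²-type
space.

```python
import numpy as np

rng = np.random.default_rng(3)

# A random orthonormal basis e_1, ..., e_5 of C^5: the columns of a unitary matrix.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))

x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

xi = Q.conj().T @ x              # x_i = <x, e_i>
yi = Q.conj().T @ y              # y_i = <y, e_i>

# (28): x is recovered from its coefficients.
assert np.allclose(Q @ xi, x)
# (29): <x, y> = sum_i x_i conj(y_i)  (Parseval).
assert np.isclose(np.vdot(y, x), np.sum(xi * np.conj(yi)))
```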
In the finite dimensional case, we have σ_p(T) = σ(T).
Lemma 8.1. The set U of invertible bounded operators is open.
Proof. Let S1 ∈ U, and let S2 be such that ||S1 − S2|| < ε. Write S2 = S1(I − Q)
with Q = S1^{−1}(S1 − S2), so that ||Q|| ≤ ||S1^{−1}|| ||S1 − S2||. Note that, for ||Q|| < 1,

    (I − Q)^{−1} = I + Q + Q² + Q³ + · · ·                         (30)

and

    ||(I − Q)^{−1}|| ≤ 1/(1 − ||Q||).

We have thus that S2 is invertible for ε < ||S1^{−1}||^{−1}, with S2^{−1} given by
S2^{−1} = (I − Q)^{−1} S1^{−1}, and moreover,

    ||S2^{−1}|| ≤ ||S1^{−1}|| / (1 − ||S1^{−1}|| ||S1 − S2||).

Now suppose that λ ∈ ρ(T). This means that (T − λI) ∈ U. But then
T − µI ∈ U for µ sufficiently close to λ, since ||(T − λI) − (T − µI)|| = |λ − µ|.
We have thus shown openness of the resolvent set, and hence closedness of the
spectrum.
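The Neumann series (30) and the accompanying bound are easy to test numerically.
A sketch (mine; numpy assumed) for a matrix Q with ||Q|| < 1:

```python
import numpy as np

rng = np.random.default_rng(4)

Q = rng.standard_normal((4, 4))
Q *= 0.4 / np.linalg.norm(Q, 2)          # rescale so the operator norm ||Q|| = 0.4 < 1
I = np.eye(4)

# Partial sums of the Neumann series I + Q + Q^2 + ...
S, term = np.zeros((4, 4)), I.copy()
for _ in range(200):
    S += term
    term = term @ Q

inv = np.linalg.inv(I - Q)
assert np.allclose(S, inv)

norm_Q = np.linalg.norm(Q, 2)
assert np.linalg.norm(inv, 2) <= 1.0 / (1.0 - norm_Q) + 1e-10
```

The geometric decay of the terms is what makes the truncation after finitely many
steps an accurate approximation here.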
To see that σ(T) ⊂ {|λ| ≤ ||T||}, or equivalently ρ(T) ⊃ {|λ| > ||T||},
just note that for |λ| > ||T||, T − λI = λ(λ^{−1}T − I), and λ^{−1}T − I is invertible
by the previous, since ||λ^{−1}T|| < 1. Thus (T − λI)^{−1} exists, i.e., λ ∈ ρ(T). Note
moreover, that for such λ, we have

    ||(T − λI)^{−1}|| ≤ 1/(|λ| − ||T||).

Suppose now, for the sake of contradiction, that σ(T) = ∅, i.e. ρ(T) = C. The resolvent
R(λ) = (T − λI)^{−1} is then defined for all λ ∈ C; it is holomorphic as an operator-valued
function of λ, and in view of the estimate above (and continuity on compact sets) we
have that R is bounded. Liouville's theorem from complex analysis says that R
must be constant.
But of course, R cannot be constant, because (T − λI)^{−1} ≠ (T − µI)^{−1} if
λ ≠ µ. The contradiction proves ρ(T) ≠ C, and thus, σ(T) ≠ ∅.
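In finite dimensions the inclusion σ(T) ⊂ {|λ| ≤ ||T||} can be checked directly on
examples; a sketch of mine (numpy assumed), with ||T|| computed as the largest
singular value:

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((6, 6))

eigenvalues = np.linalg.eigvals(T)
operator_norm = np.linalg.norm(T, 2)      # largest singular value = ||T||

# Every point of the spectrum lies in the closed disc of radius ||T||.
assert np.all(np.abs(eigenvalues) <= operator_norm + 1e-10)
```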
One might want to compare with the finite dimensional case. Specialising
to that case, in view of σp (T ) = σ(T ), we have shown that every linear trans-
formation has an eigenvalue.
The usual proof of this latter fact goes through the characteristic polynomial.
Any root of the characteristic polynomial is an eigenvalue. All polynomials over
C have a root, by the fundamental theorem of algebra. Thus, any T has an
eigenvalue.
The algebraic device of the characteristic polynomial is not available to us in
infinite dimensions. But one must remember that, even in finite dimensions, this
does not render the proof completely algebraic. For the fundamental theorem
of algebra requires an analytic argument. In fact, one classic proof proceeds
precisely via Liouville’s theorem!
8.3.3 The spectrum of compact operators
We shall not give a general discussion of the theory of compact operators on
general Banach spaces, in particular, their spectrum. Let us just quote the
following theorem:
Theorem 8.2. Let T : X → X be compact. Then the point spectrum of T is a
countable set {λi }. If X is infinite dimensional, then σ(T ) = {0} ∪ {λi }, and
λi → 0 if there are infinitely many λi . Moreover, the eigenspace corresponding
to λi 6= 0 is finite dimensional.
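Theorem 8.2 is illustrated by finite truncations of the compact diagonal operator
T ei = ei/i on l². A sketch (mine; numpy assumed):

```python
import numpy as np

# Truncation of the compact operator T e_i = e_i / i on l^2.
n = 50
T = np.diag(1.0 / np.arange(1, n + 1))

eigenvalues = np.sort(np.linalg.eigvalsh(T))[::-1]
# Eigenvalues are 1, 1/2, 1/3, ... ; they tend to 0, in agreement with Theorem 8.2.
assert np.allclose(eigenvalues, 1.0 / np.arange(1, n + 1))
print(eigenvalues[:5], "...", eigenvalues[-1])
```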
Recall the definition of the adjoint

    T∗ : Y∗ → X∗

of an operator

    T : X → Y.
Suppose now that X = Y = H, a Hilbert space. We know that H can be identi-
fied with H ∗ via the Riesz Representation Theorem. Thus, we can “compare”
T and T ∗ . Let φ : H → H ∗ be the antilinear isometry.
Definition 8.5. We say that T : H → H is self-adjoint if φ ◦ T ◦ φ−1 = T ∗ .
We have
Proposition 8.2. Let T : H → H be bounded, let T∗ : H∗ → H∗ be the
adjoint, and let φ be the map of the Riesz Representation theorem. Then

    ⟨Tx, y⟩ = ⟨x, φ^{−1} ◦ T∗ ◦ φ (y)⟩     for all x, y ∈ H.

In particular, T is self-adjoint if and only if ⟨Tx, y⟩ = ⟨x, Ty⟩ for all x, y ∈ H.
Note that the values ⟨x, φ^{−1} ◦ T∗ ◦ φ (y)⟩, for all x, y, determine T∗. Thus the
"if and only if" statement.
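In the model H = Cⁿ, the criterion of Proposition 8.2 says that T is self-adjoint
precisely when its matrix equals its conjugate transpose. A numerical sketch (mine;
numpy assumed, with ⟨u, v⟩ = Σ u_k v̄_k):

```python
import numpy as np

rng = np.random.default_rng(6)

A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = A + A.conj().T                 # T equals its conjugate transpose: self-adjoint

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# <Tx, y> = <x, Ty> with the convention <u, v> = sum_k u_k conj(v_k).
lhs = np.vdot(y, T @ x)            # np.vdot conjugates its first argument
rhs = np.vdot(T @ y, x)
assert np.isclose(lhs, rhs)
```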
Proposition 8.3. Let T : H → H be self-adjoint. Then σ_p(T) ⊂ R.
Proof. Let λ ∈ σ_p(T) with 0 ≠ v ∈ Eλ. Then

    λ⟨v, v⟩ = ⟨Tv, v⟩ = ⟨v, Tv⟩ = λ̄⟨v, v⟩,

so that λ = λ̄, i.e. λ ∈ R.
Corollary 8.2. Let T be compact and self-adjoint. Given ε > 0, there are only
finitely many λα ∈ σ_p(T) with |λα| > ε. Thus σ_p(T) is either finite or countably
infinite with λi → 0.
8.4.2 The spectral theorem
Theorem 8.3. Let T : H → H be a self-adjoint, compact operator. Then the
point spectrum of T is a real countable set {λi}_{i=1}^{N}, for some 1 ≤ N ≤ ∞. The
eigenspaces E_{λi} are finite dimensional for λi ≠ 0, and E_{λi} ⊥ E_{λj} for i ≠ j. If
there are infinitely many λi, then λi → 0. Moreover, if H is infinite dimensional,
then σ(T) = σ_p(T) ∪ {0}. Finally, we may write T as

    T = Σ_{i=1}^{N} λi P_{E_{λi}}.
Proof. Like other results in Hilbert space theory, this is proven by exploiting
variational methods. In particular, we have a variational characterization of the
eigenvalues of T . For this, the following Lemma will be useful:
Lemma 8.2. Let T : H → H be self-adjoint. Then

    ||T|| = sup_{|x|=1} |⟨Tx, x⟩|.                                  (33)

Proof. This is not immediately obvious, because ||T|| is defined as sup_{|x|=1} √⟨Tx, Tx⟩.
First we note that

    ||T|| = sup_{|x|=1, |y|=1} |⟨Tx, y⟩|.                           (34)

Indeed, take xi with |xi| = 1 and |Txi| → ||T||. Setting yi = |Txi|^{−1} Txi, we have
that ⟨xi, Tyi⟩ = ⟨Txi, yi⟩ = |Txi| → ||T||. On the other hand, |⟨Tx, y⟩| ≤ ||Tx|| |y| ≤ ||T||
for any |x| = 1, |y| = 1.
Let λ denote the right hand side of (33). We compute, for |x| = 1, |y| = 1
(multiplying y by a unimodular constant, we may assume ⟨Tx, y⟩ ≥ 0, and we use
the self-adjointness of T):

    |⟨Tx, y⟩| ≤ (1/4) |⟨T(x + y), x + y⟩ − ⟨T(x − y), x − y⟩|
              ≤ (1/4) (λ||x + y||² + λ||x − y||²)
              = (λ/4) (||x + y||² + ||x − y||²)
              = (λ/4) (2||x||² + 2||y||²)
              = λ,

where we have used the parallelogram law and the bound |⟨Tv, v⟩| ≤ λ|v|². Thus,
(34) implies (33), the reverse inequality λ ≤ ||T|| being immediate from Cauchy-Schwarz.
The notation λ was meant to be suggestive! Without loss of generality (replacing
T by −T if necessary), let us assume that λ = sup_{|x|=1} ⟨Tx, x⟩. We will show that λ
is an eigenvalue.
For this, let xi be a maximising sequence for ⟨Tx, x⟩, i.e., let ⟨Txi, xi⟩ → λ
with |xi| = 1. (We may assume λ ≠ 0, for otherwise T = 0 and there is nothing to
prove.) By compactness of T, there exists a subsequence x_{i_k} such that Tx_{i_k} → y.
Now we have

    ⟨Txi − λxi, Txi − λxi⟩ = ||Txi||² − 2λ⟨Txi, xi⟩ + λ²||xi||²
                           ≤ λ² − 2λ⟨Txi, xi⟩ + λ²,

since ||Txi|| ≤ ||T|| = λ and |xi| = 1. Thus ||Txi − λxi||² → 0. Since Tx_{i_k} → y, it
follows that λx_{i_k} → y, hence x_{i_k} → λ^{−1}y, and by continuity of T we get Ty = λy.
Note that y ≠ 0, since |y| = lim |λx_{i_k}| = λ > 0. We have produced an eigenvector of
eigenvalue λ.
Consider now the eigenspace Eλ. By compactness, it is necessarily finite
dimensional. Writing H = Eλ ⊕ Eλ⊥, we have that T(Eλ) = Eλ and T(Eλ⊥) ⊂ Eλ⊥.
The problem is thus reduced to understanding an operator on a smaller space.
Iterating the above argument with Eλ⊥ in place of H, etc., we obtain a
sequence of distinct eigenvalues |λ1| ≥ |λ2| ≥ · · · . If this sequence is finite, we
must have

    H = ⊕_{i=1}^{n} E_{λi},

from which one easily deduces

    T = Σ_{i=1}^{n} λi P_{E_{λi}}.
If instead the sequence is infinite, we obtain

    H = E0 ⊕ (⊕_{i=1}^{∞} E_{λi}),

where E0 = ker T. Setting Tn = Σ_{i=1}^{n} λi P_{E_{λi}}, we have ||T − Tn|| ≤ |λ_{n+1}| → 0,
which gives the stated representation of T. Finally, let ν ∉ {0} ∪ {λi}. From the above
decomposition it is clear that (Tn − νI)^{−1} exists, and ||(Tn − νI)^{−1}|| ≤
max{|ν|^{−1}, max_i |λi − ν|^{−1}}. We know that for every ε > 0 there exists an n such
that ||T − Tn|| < ε, and thus, by the above computation and the argument of Lemma 8.1,
T − νI is itself invertible, i.e. ν ∈ ρ(T). Hence σ(T) ⊂ σ_p(T) ∪ {0}; since a compact
operator on an infinite dimensional space cannot be invertible, 0 ∈ σ(T), and the claimed
identity σ(T) = σ_p(T) ∪ {0} follows.
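The decomposition T = Σ λi P_{E_{λi}} can be seen concretely in finite dimensions,
where numpy's eigh plays the role of the theorem. A sketch of mine (not the notes'
construction; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(7)

A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = (A + A.conj().T) / 2           # self-adjoint

# Real eigenvalues and an orthonormal basis of eigenvectors.
eigenvalues, V = np.linalg.eigh(T)

# Reconstruct T = sum_i lambda_i P_i, with P_i = v_i v_i^* the rank-one
# projection onto the span of the i-th eigenvector.
reconstruction = sum(
    lam * np.outer(V[:, i], V[:, i].conj()) for i, lam in enumerate(eigenvalues)
)
assert np.allclose(reconstruction, T)
assert np.all(np.isreal(eigenvalues))
```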
9 Thanks
Thanks to Susan Thomas and Paul Jefferys for comments and corrections.
References
[1] B. Bollobás, Linear Analysis: an introductory course, second edition, Cambridge
    University Press, 1999.
[2] J. Dieudonné, History of Functional Analysis, North-Holland Mathematics
    Studies 49, North-Holland, Amsterdam, 1981.
[3] P. Lax, Functional Analysis, Wiley-Interscience, New York, 2002.
[4] W. Rudin, Functional Analysis, second edition, McGraw-Hill, New York, 1991.