Intro to Analysis

Uploaded by

duncan888000
Copyright
© All Rights Reserved
Pugh’s Analysis

FIN
Contents

1 A Taste of Topology
1.1 Metric Spaces
1.2 Continuity
1.3 The Topology of a Metric Space
1.4 Product Metric
1.5 Completeness
1.6 Compactness
1.7 Coverings
1.8 Uniform Continuity
1.9 Connectedness
1.10 Total Boundedness
1.11 Perfect Metric Space
1.12 Cantor Sets

2 Functions of a Real Variable
2.1 Differentiation
2.2 Riemann Integration
2.3 Series

3 Function Space
3.1 Uniform Convergence and C^0[a, b]
3.2 Arzela-Ascoli

Chapter 1

A Taste of Topology

I assume that the reader is familiar with the following basic facts about the real numbers:
Proposition 1.0.1.
(a) (ϵ-principle) Suppose a, b ∈ R. If for any ϵ > 0 we have a ≤ b + ϵ, then a ≤ b.
(b) (lub-principle) Suppose S ⊆ R is nonempty. If S is bounded above, then sup(S) exists.
(c) (Archimedean Property) Suppose x, y ∈ R with x > 0. Then there exists n ∈ N such that nx > y.
(d) (density of Q) Suppose x, y ∈ R with x < y. Then there exists q ∈ Q such that q ∈ (x, y).
(e) (Bolzano-Weierstrass theorem) Every bounded sequence in R has a convergent subsequence.
(f) (monotone convergence theorem) Every sequence that is increasing (resp. decreasing) and bounded
above (resp. below) converges.

1.1 Metric Spaces


Definition 1.1.1. A metric space is a set M, whose elements are called points, together with a function
d : M × M → [0, ∞), referred to as a distance function (or metric), such that for all x, y, z ∈ M we have
(a) positive definiteness: d(x, y) ≥ 0, and d(x, y) = 0 if and only if x = y.
(b) symmetry: d(x, y) = d(y, x).
(c) triangle inequality: d(x, y) ≤ d(x, z) + d(z, y).
A subset M′ ⊆ M equipped with the same distance function d is called a subspace of M; we also say that M′
inherits its metric from M. A special case of a distance function is the discrete metric: d(x, y) = [x ≠ y] for
all x, y ∈ M, where [·] is the Iverson bracket, so that d(x, y) = 0 if x = y and d(x, y) = 1 otherwise.
Throughout these notes, unless stated otherwise, M, N, P, X, Y are always metric spaces. For simplicity,
their distance functions are all denoted by d, even though they may be different functions.
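As a sanity check on Definition 1.1.1, the following Python sketch tests the three axioms by brute force on a small finite set. It takes the discrete metric to be 0 for equal points and 1 otherwise. The names `discrete_metric` and `is_metric_on` are illustrative and not part of these notes, and a finite check is of course no substitute for a proof.

```python
from itertools import product

def discrete_metric(x, y):
    """Discrete metric: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def is_metric_on(points, d):
    """Brute-force check of the axioms of Definition 1.1.1 on a finite set."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False  # positive definiteness fails
        if d(x, y) != d(y, x):
            return False  # symmetry fails
        if d(x, y) > d(x, z) + d(z, y):
            return False  # triangle inequality fails
    return True

points = ["a", "b", "c", "d"]
print(is_metric_on(points, discrete_metric))               # True
# The squared difference is not a metric: d(0, 2) = 4 > d(0, 1) + d(1, 2) = 2.
print(is_metric_on([0, 1, 2], lambda x, y: (x - y) ** 2))  # False
```

The second call shows why the triangle inequality is a real restriction: the squared difference is symmetric and positive definite but still fails to be a distance function.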
Definition 1.1.2. We say a sequence (p_n) in M converges to p ∈ M, written p_n → p, if for any ϵ > 0
there exists N ∈ N such that d(p_n, p) < ϵ whenever n ≥ N.
By the triangle inequality and the positive definiteness of d, the limit of a convergent sequence is unique;
that is, if p_n → p and p_n → q, then d(p, q) ≤ d(p, p_n) + d(p_n, q) → 0 and hence p = q.
Theorem 1.1.3 (subsequence formulation of convergence). Suppose (p_n) is a sequence in M. Then
p_n → p ∈ M if and only if p_{n_k} → p for every subsequence (p_{n_k}).
Proof. Sufficiency is immediate, since (p_n) is a subsequence of itself. For the converse, suppose
p_n → p and (p_{n_k}) is a subsequence of (p_n). Let ϵ > 0 be given. By definition, there exists N ∈ N
such that d(p_n, p) < ϵ whenever n ≥ N. Note that n_k ≥ k for all k ∈ N. Thus, for k ≥ N we have n_k ≥ N
and so d(p_{n_k}, p) < ϵ, as desired.

Here is a weaker criterion:
Proposition 1.1.4 (even-odd formulation of convergence). Suppose (pn ) is a sequence in M . Then pn →
p ∈ M if and only if p2n → p and p2n+1 → p.
Proof. The necessary condition follows by 1.1.3. As for the converse, suppose p2n → p and p2n+1 → p. Let
ϵ > 0 be given. Then there exist N1 , N2 ∈ N such that d(p2n , p) < ϵ whenever n ≥ N1 and d(p2n+1 , p) < ϵ
whenever n ≥ N2 . Let N = 2 max{N1 , N2 } + 1. For n ≥ N , if n = 2k is even, then k ≥ N1 and thus
d(pn , p) < ϵ; otherwise, n = 2k + 1 is odd, then k ≥ N2 and thus d(pn , p) < ϵ. In conclusion, d(pn , p) < ϵ
whenever n ≥ N , as desired.
Theorem 1.1.5 (sub-subsequence criterion of convergence). Suppose (p_n) is a sequence in M. Then
p_n → p ∈ M if and only if every subsequence (p_{n_k}) has a sub-subsequence (p_{n_{k_ℓ}}) with p_{n_{k_ℓ}} → p.
Proof. Necessity follows from 1.1.3. For the converse, suppose p_n does not converge to p. Then
there exists ϵ0 > 0 such that for any N ∈ N there exists n ≥ N with d(p_n, p) ≥ ϵ0. We construct
inductively a subsequence with no sub-subsequence converging to p:
(initial step) Let N = 1. Then there exists n_1 ≥ 1 such that d(p_{n_1}, p) ≥ ϵ0.
(inductive step) Suppose p_{n_1}, ..., p_{n_j} are chosen. Let N = n_j + 1. Then there exists n_{j+1} > n_j such that
d(p_{n_{j+1}}, p) ≥ ϵ0.
By construction, every term of (p_{n_k}) is at distance at least ϵ0 from p, so (p_{n_k}) has no sub-subsequence
converging to p. The result follows by contraposition.

1.2 Continuity
Definition 1.2.1. Suppose N is a metric space with distance function d′ . Then we say a function f : M → N
is continuous at p ∈ M if for any sequence (pn ) in M with pn → p ∈ M , one has f (pn ) → f (p). A function
g : M → N is continuous on M ′ ⊆ M if for any x ∈ M ′ , g is continuous at x.
Theorem 1.2.2 (weak sequential and ϵ-δ formulation of continuity). Let f : M → N be a function. Then
the following are equivalent:
(a) f is continuous at p.
(b) For any ϵ > 0, there exists δ > 0 such that for any q ∈ M with d(p, q) < δ, d(f (p), f (q)) < ϵ.
(c) For any sequence (pn ) in M with pn → p, f (pn ) converges.
Proof. (a) implies (b). We argue by contraposition. Suppose there exists ϵ0 > 0 such that
for any δ > 0 there exists q ∈ M with d(p, q) < δ and d(f(p), f(q)) ≥ ϵ0. For each n ∈ N, take δ = 1/n
and let p_n be a corresponding point, so that d(p, p_n) < 1/n. By the Archimedean property, p_n → p.
However, d(f(p_n), f(p)) ≥ ϵ0 for all n, and thus f(p_n) does not converge to f(p).
(b) implies (c). Let (p_n) be a sequence in M such that p_n → p. Let ϵ > 0 be given. By condition (b),
there exists δ > 0 such that for any q ∈ M with d(p, q) < δ, d(f(p), f(q)) < ϵ. Since p_n → p, there exists
K ∈ N such that d(p_n, p) < δ whenever n ≥ K. Thus, d(f(p_n), f(p)) < ϵ whenever n ≥ K, and therefore
(f(p_n)) converges.
(c) implies (a). Let (p_n) be a sequence in M with p_n → p. The interleaved sequence p_1, p, p_2, p, p_3, p, ...
also converges to p (by 1.1.4). Thus, by (c), the sequence f(p_1), f(p), f(p_2), f(p), ... converges. By 1.1.4
(or 1.1.3), its subsequence (f(p_j)) and the constant subsequence (f(p)) converge to the same limit, which is
f(p), as required.
Proposition 1.2.3 (continuity is preserved under composition). Suppose f : M → N and g : N → P are
continuous. Then g ◦ f : M → P is continuous.
Proof. Let p ∈ M and a sequence (pn ) with pn → p be given. Then by definition, f (pn ) → f (p) and
g(f (pn )) → g(f (p)) and the result follows.
Definition 1.2.4. A function f : M → N is a homeomorphism if f is bijective and f, f −1 are both
continuous. In this case, we say that M and N are homeomorphic, denoted by M ≃ N .

1.3 The Topology of a Metric Space
Definition 1.3.1. Suppose p ∈ M and r > 0. The set B(p; r) := {q ∈ M : d(p, q) < r} is an open ball
of M . The point p and the number r are called the center and the radius of B(p; r), respectively. The collection
B := {B(p; r) : p ∈ M, r > 0} is said to be the standard base of M .
Proposition 1.3.2 (axioms of a base).
(a) For any p ∈ M , there exists B ∈ B such that p ∈ B; that is, B covers M .
(b) For every B1 , B2 ∈ B and x ∈ B1 ∩ B2 , there exists B3 ∈ B such that x ∈ B3 ⊆ B1 ∩ B2 .
Proof. (a) is immediate: we can write M = ⋃_{p∈M} B(p; 1). For (b), let B1 = B(p1; r1) and B2 = B(p2; r2)
be given. Suppose x ∈ B1 ∩ B2. Then r_j′ := r_j − d(x, p_j) > 0 and, by the triangle inequality,
B(x; r_j′) ⊆ B_j for j = 1, 2. Take r = min{r_1′, r_2′}; then B3 := B(x; r) satisfies x ∈ B3 ⊆ B1 ∩ B2, as required.
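The choice of radius in the proof of (b) can be tested numerically. The sketch below works in the plane with the Euclidean distance, which is an assumption made only for illustration; the helper names (`inner_radius`, `sample_in_ball`, `ball_fits`) are hypothetical. It draws random points of B(x; r) with r = min_j (r_j − d(x, p_j)) and checks that each of them lies in both given balls.

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane (illustrative choice of metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def inner_radius(x, balls):
    """r = min_j (r_j - d(x, p_j)), as in the proof of 1.3.2(b)."""
    return min(r_j - dist(x, p_j) for p_j, r_j in balls)

def sample_in_ball(center, radius, rng):
    """Random point of B(center; radius); the factor 0.999 keeps it
    safely inside despite floating-point rounding."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    rho = 0.999 * radius * rng.random()
    return (center[0] + rho * math.cos(angle),
            center[1] + rho * math.sin(angle))

def ball_fits(x, balls, trials=10_000, seed=0):
    """Check on random samples that B(x; r) lies in every given ball."""
    rng = random.Random(seed)
    r = inner_radius(x, balls)
    if r <= 0:
        return False  # x is not in the intersection of the balls
    return all(
        dist(q, p_j) < r_j
        for q in (sample_in_ball(x, r, rng) for _ in range(trials))
        for p_j, r_j in balls
    )

balls = [((0.0, 0.0), 2.0), ((1.5, 0.0), 1.0)]  # B1 and B2
x = (1.0, 0.2)                                   # a point of both balls
print(inner_radius(x, balls) > 0, ball_fits(x, balls))  # True True
```

Random sampling can only fail to find a counterexample; the actual guarantee is the triangle inequality used in the proof.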
Definition 1.3.3. We say that O ⊆ M is open if O is the union of a subcollection of B. Dually,
C ⊆ M is closed if C is co-open; that is, M\C is open.
Remark. O ⊆ M is open if and only if for any p ∈ O there exists r > 0 such that B(p; r) ⊆ O.
Remark. If B′ is a collection of open sets such that every open set is the union of a subcollection
of B′, then we say B′ is a base. It can be shown that any such B′ satisfies 1.3.2.
It immediately follows that every open ball and every union of open sets is open. It is worth
mentioning that a finite intersection of open sets is also open: Suppose O1, O2 ⊆ M are open and x ∈ O1 ∩ O2.
Then there exist open balls G_j ⊆ O_j such that x ∈ G_j (j = 1, 2). By 1.3.2, there exists an open ball G3 with
x ∈ G3 ⊆ G1 ∩ G2 ⊆ O1 ∩ O2. Since x is arbitrary, O1 ∩ O2 is open. We summarize the discussion as follows:
Proposition 1.3.4 (axioms of open sets). Let T be the collection of open sets (called the topology of M )
in M . Then
(a) ∅, M ∈ T .
(b) T is closed under arbitrary union and finite intersection.
Remark. In abstract space (without metric), a topology of a set X is a collection of subsets of X for which
1.3.4 holds.
By taking complements (and De Morgan's laws), we have the following:
Corollary 1.3.4.1 (axioms of closed sets). Let F be the collection of closed sets in M . Then
(a) ∅, M ∈ F .
(b) F is closed under finite union and arbitrary intersection.
Definition 1.3.5. Let S ⊆ M. By 1.3.4, we can define the largest open set contained in S, denoted by
S°, by
\[ S^\circ := \bigcup_{G \subseteq S,\ G \text{ open}} G, \]
which is referred to as the interior of S. Similarly, by 1.3.4.1, we can define the smallest closed set containing
S, denoted by S̄, by
\[ \overline{S} := \bigcap_{S \subseteq F,\ F \text{ closed}} F, \]
which is the closure of S.

Straight from the definition, it follows that S° = S if and only if S is open, and S̄ = S if and only if S is
closed.
Proposition 1.3.6 (sequential formulation of closure). Let S ⊆ M and p ∈ M. Then the following are equivalent:
(a) p ∈ S̄, the closure of S.
(b) For any r > 0, B(p; r) ∩ S ≠ ∅.
(c) There is a sequence (p_n) in S such that p_n → p.
Proof. (a) implies (b). We argue by contraposition. Suppose there exists r > 0 such
that B(p; r) ∩ S = ∅. Since F := M\B(p; r) is closed and contains S, it follows that S̄ ⊆ F. However,
p ∉ F and thus p ∉ S̄.
(b) implies (c). For any n ∈ N, by condition (b), there exists p_n ∈ S such that p_n ∈ B(p; 1/n), that is,
d(p, p_n) < 1/n. Thus, p_n → p by the Archimedean property.
(c) implies (a). Again we argue by contraposition. Suppose p ∉ S̄. Then there exists a
closed F ⊇ S with p ∉ F. Thus p ∈ M\F and, since M\F is open, there exists r > 0 such that
B(p; r) ⊆ M\F ⊆ M\S. Then d(p, p′) ≥ r for any p′ ∈ S, so there is no sequence (p_n) in S with p_n → p.
Corollary 1.3.6.1 (sequential formulation of closedness). Let S ⊆ M. Then S is closed if and only if the
limit of every convergent sequence (p_n) in S lies in S.
Proof. Suppose S is closed. Then S̄ = S. Let (p_n) be a sequence in S with p_n → p. Then, by
1.3.6, p ∈ S̄ = S.
For the converse, we shall show that S̄ = S. Suppose p ∈ S̄. By 1.3.6, there exists a sequence (p_n) in S
with p_n → p. Then by hypothesis p ∈ S, and thus S̄ ⊆ S. Since S ⊆ S̄ always holds, S̄ = S.
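Proposition 1.3.7 (basic set operations on closure and interior). Let S ⊆ T ⊆ M and {S_α}_{α∈∆} be a
collection of subsets of M. Then
(a) S̄ ⊆ T̄ and S° ⊆ T°.
(b) The interior satisfies
\[ \bigcup_{\alpha\in\Delta} S_\alpha^\circ \subseteq \Big(\bigcup_{\alpha\in\Delta} S_\alpha\Big)^\circ \quad\text{and}\quad \Big(\bigcap_{\alpha\in\Delta} S_\alpha\Big)^\circ \subseteq \bigcap_{\alpha\in\Delta} S_\alpha^\circ; \]
the converse of the second inclusion holds if ∆ is finite.
(c) The closure satisfies
\[ \overline{\bigcap_{\alpha\in\Delta} S_\alpha} \subseteq \bigcap_{\alpha\in\Delta} \overline{S_\alpha} \quad\text{and}\quad \overline{\bigcup_{\alpha\in\Delta} S_\alpha} \supseteq \bigcup_{\alpha\in\Delta} \overline{S_\alpha}; \]
the converse of the second inclusion holds if ∆ is finite.
Proof.
(a) Since S ⊆ T ⊆ T̄ and T̄ is closed, S̄ ⊆ T̄. Similarly, since S° ⊆ T and S° is open, S° ⊆ T°.
(b) Since ⋃_{α∈∆} S_α° is open and contained in ⋃_{α∈∆} S_α, by (a) we get ⋃_{α∈∆} S_α° ⊆ (⋃_{α∈∆} S_α)°. Now
suppose p ∈ (⋂_{α∈∆} S_α)°. Then there exists an open G ⊆ ⋂_{α∈∆} S_α with p ∈ G. Since G ⊆ S_α for all α ∈ ∆,
p ∈ S_α° for all α and thus p ∈ ⋂_{α∈∆} S_α°. For the converse, suppose ∆ is finite and p ∈ ⋂_{α∈∆} S_α°. Then for
each α ∈ ∆ there exists an open G_α ⊆ S_α with p ∈ G_α. Since ⋂_{α∈∆} G_α is open (∆ being finite) and contained
in ⋂_{α∈∆} S_α, we have p ∈ (⋂_{α∈∆} S_α)°.
(c) Note that the complement of the closure of S is the interior of the complement of S, since
\[ M \setminus \overline{S} = M \setminus \bigcap_{F \supseteq S,\ F \text{ closed}} F = \bigcup_{F \supseteq S,\ F \text{ closed}} (M \setminus F) = \bigcup_{G \subseteq M \setminus S,\ G \text{ open}} G = (M \setminus S)^\circ. \]
The rest follows from (b) by taking complements.
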
Proposition 1.3.8 (topological formulation of continuity). Let f : M → N be a function. The following
are equivalent:
(a) f is continuous.
(b) For any closed F ⊆ N, f^{-1}(F) is closed.
(c) For any open G ⊆ N, f^{-1}(G) is open.
Proof. (a) implies (b). Suppose F ⊆ N is a closed set. By 1.3.6.1, it suffices to prove that the limit of any
convergent sequence (p_n) in f^{-1}(F) lies in f^{-1}(F). Let p_n → p with p_n ∈ f^{-1}(F). Then f(p_n) → f(p) by
the continuity of f, and since (f(p_n)) is a sequence in the closed set F, we have f(p) ∈ F (by 1.3.6.1 again);
that is, p ∈ f^{-1}(F).
(b) implies (c). Note that f^{-1}(N\S) = M\f^{-1}(S) for any S ⊆ N. Thus, (c) holds by taking
complements.
(c) implies (a). Let p ∈ M and ϵ > 0. Since B(f(p); ϵ) is open, by condition (c), f^{-1}(B(f(p); ϵ)) is also
open, and it contains p. Thus, there exists δ > 0 such that B(p; δ) ⊆ f^{-1}(B(f(p); ϵ)), that is,
f(B(p; δ)) ⊆ B(f(p); ϵ). Hence f is continuous at p by 1.2.2.

Corollary 1.3.8.1 (homeomorphism bijects the topologies). Let f : M → N be a homeomorphism. Then
f bijects the topologies of M and N .
Proof. Let T_M and T_N be the topologies of M and N, respectively. We shall prove that f(T_M) := {f(G) :
G ∈ T_M} = T_N. Since f is a homeomorphism, f^{-1} is continuous. By 1.3.8, (f^{-1})^{-1}(G) = f(G) is open
for any G ∈ T_M. This proves f(T_M) ⊆ T_N. For the converse, suppose G′ ∈ T_N. Let O = f^{-1}(G′). Then
O ∈ T_M by the continuity of f, and G′ = f(O) ∈ f(T_M).
Replacing f by f^{-1} in the argument above, it follows that f^{-1}(T_N) = T_M, as desired.
Proposition 1.3.9 (axiom of subspace). Let M ′ ⊆ M be a subspace. Let G′ ⊆ M ′ . Then the following are
equivalent:
(a) G′ is open in M ′ .
(b) There exists an open G ⊆ M such that G′ = G ∩ M ′ .
Proof. The equivalence holds since the base B ′ of M ′ is given by

B ′ = B ∩ M ′ := {B ∩ M ′ : B is an open ball in M }.

Proposition 1.3.10 (inclusion is continuous). Suppose S ⊆ M. Then the inclusion map ι_S : S → M defined
by ι_S(p) = p is continuous.
Proof. Suppose U ⊆ M is open. Then ι_S^{-1}(U) = U ∩ S is open in S by 1.3.9. Thus, by 1.3.8, ι_S is
continuous.
Corollary 1.3.10.1 (restriction of continuous function). Suppose f : M → N is continuous. Then for any
S ⊆ M , f |S is continuous.
Proof. Since f |S = f ◦ ιS , by 1.2.3, we are done.
Corollary 1.3.10.2 (expansion of codomain). Suppose f : M → N is continuous. Let S ⊇ N be such that N
is a subspace of S. Suppose g : M → S satisfies g(x) = f(x) for all x ∈ M. Then g is continuous.
Proof. Since g = ι_N ◦ f, where ι_N : N → S is the inclusion map, the claim follows from 1.3.10 and 1.2.3.

1.4 Product Metric


Definition 1.4.1. Let M be a set with two metrics d and d′. Let T and T′ be the topologies
induced by d and d′, respectively. Then we say that T is finer than T′ if T′ ⊆ T.
Remark. We also say that d is finer than d′ if T is finer than T′. If d and d′ are finer than each other,
then we say d and d′ are equivalent. By definition, the equivalence of metrics is an equivalence relation.
Proposition 1.4.2 (metric criterion of equivalence). Let M be a set with two metrics d
and d′. If there exist C1, C2 > 0 such that C1 d ≤ d′ ≤ C2 d, then d and d′ are equivalent.
Proof. Write B(p; r) and B′(p; r) for the open balls with respect to d and d′, respectively. We show that every
ball of one metric contains a ball of the other metric with the same center; by the remark following 1.3.3,
this gives T′ ⊆ T and T ⊆ T′. Let p ∈ M and r > 0. If d(p, q) < r/C2, then d′(p, q) ≤ C2 d(p, q) < r, so
B(p; r/C2) ⊆ B′(p; r). On the other hand, if d′(p, q) < C1 r, then d(p, q) ≤ d′(p, q)/C1 < r, so
B′(p; C1 r) ⊆ B(p; r).
Proposition 1.4.3 (equivalent metrics have the same convergent sequences). Let M be a set
with two metrics d and d′ and (p_n) be a sequence in M. If d and d′ are equivalent, then p_n → p in
d if and only if p_n → p in d′.
Proof. We show a slightly stronger result: if d is finer than d′, then p_n → p in d implies p_n → p in d′.
Suppose p_n → p in d. Let r′ > 0 and consider B′(p; r′) := {q ∈ M : d′(p, q) < r′}. Since B′(p; r′) ∈ T′ ⊆ T,
there exists r′′ > 0 such that B(p; r′′) ⊆ B′(p; r′). Since p_n → p in d, there exists N ∈ N such that
p_n ∈ B(p; r′′) ⊆ B′(p; r′) for all n ≥ N. Hence p_n → p in d′.

We next define a metric on the Cartesian product M = X × Y of two metric spaces. There are three
natural ways to do so:
• d_E(p, p′) := √(d_X(x, x′)² + d_Y(y, y′)²);

• d_max(p, p′) := max{d_X(x, x′), d_Y(y, y′)};

• d_sum(p, p′) := d_X(x, x′) + d_Y(y, y′),

where p = (x, y) and p′ = (x′, y′).


Since d_max ≤ d_E ≤ d_sum ≤ 2 d_max, by 1.4.2, all the metrics defined above are equivalent. Thus, by 1.4.3,
convergence p_n → p in any one of these metrics implies convergence in the others. Moreover,
p_n = (p_n^1, p_n^2) converges in M if and only if (p_n^1) and (p_n^2) converge in X and Y, respectively, by the
fact that d_sum(p_n, p) = d_X(p_n^1, p^1) + d_Y(p_n^2, p^2), where p = (p^1, p^2).
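The chain d_max ≤ d_E ≤ d_sum ≤ 2 d_max depends only on the two coordinate distances, so it can be spot-checked numerically. In the sketch below, `dx` and `dy` stand for the nonnegative numbers d_X(x, x′) and d_Y(y, y′); the function names and the tolerance `tol` (which absorbs floating-point rounding) are illustrative assumptions.

```python
import math
import random

def d_E(dx, dy):
    return math.hypot(dx, dy)   # sqrt(dx**2 + dy**2)

def d_max(dx, dy):
    return max(dx, dy)

def d_sum(dx, dy):
    return dx + dy

def chain_holds(dx, dy, tol=1e-12):
    """Check d_max <= d_E <= d_sum <= 2*d_max for coordinate distances dx, dy >= 0."""
    a, b, c = d_max(dx, dy), d_E(dx, dy), d_sum(dx, dy)
    return a <= b + tol and b <= c + tol and c <= 2 * a + tol

rng = random.Random(1)
samples = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(10_000)]
print(all(chain_holds(dx, dy) for dx, dy in samples))    # True
print(d_max(3.0, 4.0), d_E(3.0, 4.0), d_sum(3.0, 4.0))   # 4.0 5.0 7.0
```

With coordinate distances 3 and 4 the three product distances are 4, 5 and 7, which sit inside the bounds 4 ≤ 5 ≤ 7 ≤ 8.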

1.5 Completeness
Definition 1.5.1. Let (pn ) be a sequence in M . Then we say (pn ) is Cauchy if for any ϵ > 0, there exists
N ∈ N such that d(pm , pn ) < ϵ for any n, m ≥ N .
Remark. Any Cauchy sequence is bounded. Here a set S ⊆ M is bounded if there exist p ∈ M and r > 0 such
that S ⊆ B(p; r).
Since d(p_n, p_m) ≤ d(p_m, p) + d(p, p_n), any convergent sequence is Cauchy. However, the converse does
not hold: Consider M := {1/n : n ∈ N} in R with the usual metric. It is clear that p_n = 1/n is Cauchy. However,
p_n → 0 in R and 0 ∉ M, so (p_n) does not converge in M.
Proposition 1.5.2 (Cauchy sequence with subsequential convergence is convergent). Let (pn ) be a Cauchy
sequence in M . If there is a convergent subsequence (pnk ) with pnk → p, then pn → p.
Proof. Since d(pn , p) ≤ d(pn , pnk ) + d(pnk , p), for any ϵ > 0, if n and nk are large enough, it follows that
d(pn , p) < ϵ and thus pn → p.
Definition 1.5.3. M is complete if every Cauchy sequence converges.
Theorem 1.5.4. R^m is complete.
Proof. Let (p_n) be a Cauchy sequence in R^m. Since (p_n) is Cauchy, (p_n) is bounded. Then by the
Bolzano-Weierstrass theorem (applied coordinatewise) and 1.5.2, (p_n) is convergent.

Remark. If M is complete and M′ ⊆ M is a closed subspace, then M′ is complete by 1.3.6.1.


Definition 1.5.5. We say f : X → Y is an isometry if d(x, y) = d(f (x), f (y)) for all x, y ∈ X. By the
positive definiteness of distance functions, any isometry is a continuous injection.
Definition 1.5.6. Let M be a metric space. A completion of M is a complete metric space M̂ such that
there exists an isometry i : M → M̂ and i(M) is dense in M̂, that is, the closure of i(M) in M̂ equals M̂.

Theorem 1.5.7 (completion theorem). Every metric space has a completion, and the completion is unique up
to isometry; that is, if X1, X2 are two completions of M with isometries ϕ1 : M → X1, ϕ2 : M → X2,
respectively, then there exists a unique bijective isometry f : X1 → X2 such that f ◦ ϕ1 = ϕ2.

Proof of existence. Let C be the collection of all Cauchy sequences in M. For (x_n), (y_n) ∈ C, we say that
(x_n) and (y_n) are co-Cauchy, denoted by (x_n) ∼ (y_n), if d(x_n, y_n) → 0. It is clear that ∼ is an equivalence
relation. Let M̂ = C/∼ and let D be the function on M̂ × M̂ defined by D(P, Q) = lim_{n→∞} d(p_n, q_n), where
P = [(p_n)] and Q = [(q_n)] are equivalence classes in M̂. It remains to verify the following four things:

(a) D is a well-defined metric on M̂: Let (p_n), (q_n) be two Cauchy sequences in M. Then
\[ |d(p_n, q_n) - d(p_m, q_m)| \le d(p_n, p_m) + d(q_n, q_m) \to 0 \]
as n, m → ∞. Thus, (d(p_n, q_n)) is a Cauchy sequence in R. Since (d(p_n, q_n)) is therefore bounded, by the
Bolzano-Weierstrass theorem and 1.5.2, d(p_n, q_n) → L for some L ∈ R. Suppose (p_n′) and (q_n′) in C
satisfy (p_n) ∼ (p_n′) and (q_n) ∼ (q_n′). We have
\[ |d(p_n, q_n) - d(p_n', q_n')| \le d(p_n, p_n') + d(q_n, q_n') \to 0 \]
as n → ∞. Then
\[ \lim_{n\to\infty} \big(d(p_n, q_n) - d(p_n', q_n')\big) = 0 \implies d(p_n', q_n') \to L. \]
Thus, D is well-defined. Since d is symmetric and satisfies the triangle inequality, by taking limits
these properties carry over to D on M̂. Positive definiteness follows from the definition of co-Cauchyness.
(b) construction of isometry: Define i : M → M̂ by i(p) = [p] := [(p, p, p, ...)]. It is clear that i is a
well-defined isometry.
(c) M̂ is complete: Let (P_k) be a Cauchy sequence in M̂. Observe that every subsequence of a Cauchy
sequence is Cauchy, and that it is co-Cauchy with the original sequence. Thus, for each P_k ∈ C/∼ we can
choose a representative (p_{k,n})_n ∈ P_k such that d(p_{k,i}, p_{k,j}) < 1/k for all i, j ∈ N. Let q_n = p_{n,n}. We
claim that (q_n) ∈ C and P_k → Q := [(q_n)]. Let ϵ > 0 be given. There exists N ≥ 3/ϵ such that if k, ℓ ≥ N then
\[ D(P_k, P_\ell) \le \frac{\epsilon}{3}. \]
Thus, for k, ℓ ≥ N and any n ∈ N,
\[ d(q_k, q_\ell) = d(p_{k,k}, p_{\ell,\ell}) \le d(p_{k,k}, p_{k,n}) + d(p_{k,n}, p_{\ell,n}) + d(p_{\ell,n}, p_{\ell,\ell}) \le \frac{2\epsilon}{3} + d(p_{k,n}, p_{\ell,n}). \]
By taking n → ∞, we have
\[ d(q_k, q_\ell) \le \frac{2\epsilon}{3} + D(P_k, P_\ell) \le \epsilon \]
and (q_n) ∈ C. Similarly, for ϵ > 0, we take N ≥ 2/ϵ such that if k, n ≥ N then d(q_k, q_n) < ϵ/2, from which
it follows that
\[ d(p_{k,n}, q_n) \le d(p_{k,n}, p_{k,k}) + d(p_{k,k}, q_n) \le \frac{1}{k} + d(q_k, q_n) < \epsilon. \]
By taking n → ∞ again, we have D(P_k, Q) ≤ ϵ for all k ≥ N and thus P_k → Q.

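(d) The closure of i(M) is M̂: Let P = [(p_n)] in M̂. Then we let P_k = [p_k] ∈ i(M), the class of the constant
sequence (p_k, p_k, ...). Let ϵ > 0 be given. Since (p_n) is Cauchy, there exists N ∈ N such that for any
k, n ≥ N, we have
\[ d(p_k, p_n) \le \epsilon. \]
Since p_k is the n-th term of the constant sequence representing P_k, by taking n → ∞, we get
\[ D(P_k, P) \le \epsilon \implies P_k \to P \]
and thus P lies in the closure of i(M).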

Proof of uniqueness. Let X1 be a completion of M with isometry ϕ1. We first prove a claim:

Claim. (universal property of completion) For any complete metric space (Y, δ) and isometry
f : M → Y, there exists exactly one isometry h : X1 → Y with h ◦ ϕ1 = f.
Proof of claim. For uniqueness, suppose h1, h2 are two such maps. Then h1(x) = h2(x) for all x ∈ ϕ1(M).
The set S := {x ∈ X1 : h1(x) = h2(x)} = {x ∈ X1 : δ(h1(x), h2(x)) = 0} is closed by 1.3.8, and
ϕ1(M) ⊆ S ⊆ X1. Hence, by 1.3.7, X1 (the closure of ϕ1(M)) is contained in S̄ = S, so S = X1. For existence:
given p ∈ X1, which is the closure of ϕ1(M), by 1.3.6 there exists a sequence (p_n) in ϕ1(M) with p_n → p.
As ϕ1 is an isometry, it is injective, so ϕ1^{-1}(p_n) is well-defined, and (f(ϕ1^{-1}(p_n))) is Cauchy. Hence
f(ϕ1^{-1}(p_n)) → h(p) for some h(p) ∈ Y, as Y is complete. By 1.1.4, the limit does not depend on the choice
of (p_n), so h : X1 → Y is well-defined. Taking limits in δ(f(ϕ1^{-1}(p_n)), f(ϕ1^{-1}(p_n′))) = d(p_n, p_n′) shows
that h is an isometry, and constant sequences give h ◦ ϕ1 = f.

Let X2 be another completion of M with isometry ϕ2. By the claim, there exist unique isometries
h1 : X1 → X2 and h2 : X2 → X1 such that h1 ◦ ϕ1 = ϕ2 and h2 ◦ ϕ2 = ϕ1. Now take Y = X1 and f = ϕ1 in the
claim: there exists a unique isometry h : X1 → X1 such that h ◦ ϕ1 = ϕ1. Since id_{X1} : X1 → X1 is an isometry
with id_{X1} ◦ ϕ1 = ϕ1, we have h = id_{X1}. On the other hand, (h2 ◦ h1) ◦ ϕ1 = ϕ1, so h2 ◦ h1 = id_{X1}. Similarly,
h1 ◦ h2 = id_{X2}, and thus h1 is the desired bijective isometry.

1.6 Compactness
Definition 1.6.1. A subset S ⊆ M is (sequentially) compact if every sequence (p_n) in S has a
subsequence that converges to a point of S.
Proposition 1.6.2 (compactness implies closed and bounded). Every compact set is closed and bounded.

Proof. Let S ⊆ M be a compact set and (p_n) be a sequence in S with p_n → p. By compactness, some
subsequence of (p_n) converges to a point of S, and by 1.1.3 that point is p; hence p ∈ S. Thus, S is closed
by 1.3.6.1.
For boundedness, let p ∈ S. Either S is bounded, or for any n ∈ N there exists p_n ∈ S with
d(p_n, p) ≥ n. However, if such a sequence exists, then since S is compact there exists a convergent (and thus
bounded) subsequence (p_{n_k}), which is absurd as d(p, p_{n_k}) ≥ n_k → ∞ when k → ∞.

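Theorem 1.6.3 (Heine-Borel theorem). A set S ⊆ R^m is compact if and only if S is closed and bounded.
Proof. By 1.6.2, the necessary condition holds. Conversely, if S is closed and bounded, then every sequence
in S has a convergent subsequence by the Bolzano-Weierstrass theorem, and its limit lies in S by 1.3.6.1.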

Proposition 1.6.4 (closed subsets of a compact set are compact). Let F ⊆ S ⊆ M with F closed and S
compact. Then F is compact.
Proof. Let (p_n) be a sequence in F ⊆ S. Since S is compact, there exists a subsequence p_{n_k} → p for some
p ∈ S. Since F is closed, by 1.3.6.1, p ∈ F.
Proposition 1.6.5 (union and intersection of compact sets). Finite union and arbitrary intersection of
compact subsets are compact.
Proof. An arbitrary intersection of compact sets is closed by 1.6.2 and 1.3.4.1, and it is contained in any one
of the compact sets, so it is compact by 1.6.4.
Let C_1, C_2, ..., C_m ⊆ M be compact sets. Let C = ⋃_{j=1}^{m} C_j and (p_n) be a sequence in C. Then there
exists j ∈ {1, ..., m} such that {n ∈ N : p_n ∈ C_j} is an infinite set, and therefore there exists a subsequence
(p_{n_k}) in C_j. Since C_j is compact, there is a convergent sub-subsequence p_{n_{k_ℓ}} → p ∈ C_j ⊆ C, proving the
compactness of C.
Theorem 1.6.6 (Cantor intersection theorem). The intersection of a nested sequence of compact and
nonempty sets is compact and nonempty.
Proof. Let C_1 ⊇ C_2 ⊇ C_3 ⊇ ··· be such a sequence. By 1.6.2 and 1.3.4.1, ⋂_{j≥1} C_j is closed. By 1.6.4,
⋂_{j≥1} C_j is compact. To show that ⋂_{j≥1} C_j is nonempty, let (p_n) be a sequence such that p_n ∈ C_n. Since
(C_n) is nested, (p_n) is a sequence in C_1. Thus, there exists a subsequence p_{n_k} → p for some p ∈ C_1. For
each j, since n_k ≥ k, we have p_{n_k} ∈ C_j for all k ≥ j, and C_j is closed, so p ∈ C_j. Hence p ∈ ⋂_{j≥1} C_j,
as desired.
Definition 1.6.7. Let S ⊆ M . The diameter of S, denoted by diam(S), is defined by

diam(S) := sup{d(x, y) : x, y ∈ S}.


Remark. If diam(⋂_{j≥1} C_j) = 0 in 1.6.6, then ⋂_{j≥1} C_j contains a single point.
Theorem 1.6.8 (continuity and compactness). If f : M → N is continuous and A ⊆ M is compact, then
f (A) is compact.
Proof. Let yn be a sequence in f (A). Then there exists xn ∈ A such that f (xn ) = yn for all n ∈ N. Since A
is compact, there exists a convergent subsequence xnk such that xnk → x ∈ A. Then by the continuity of f ,
ynk = f (xnk ) → f (x) ∈ f (A) as k → ∞.

Corollary 1.6.8.1 (extreme value theorem). A continuous real-valued function defined on a compact set is
bounded; it assumes maximum and minimum values.
Proof. Let K be a nonempty compact set and f : K → R be continuous. Then f(K) is compact by 1.6.8, hence
closed and bounded by 1.6.3. Thus, sup f(K) and inf f(K) exist as f(K) is nonempty and bounded; f attains
its maximum and minimum as f(K) is closed (sup f(K) and inf f(K) lie in the closure of f(K), which equals f(K)).

Definition 1.6.9. Let S1 , S2 ⊆ M . The distance between S1 and S2 is the value

d(S1 , S2 ) = inf{d(x, y) : x ∈ S1 , y ∈ S2 }.

If S1 = {x} for some x ∈ M , then we write d(S1 , S2 ) = d(x, S2 ).

Proposition 1.6.10 (distance formulation of closure). Let S ⊆ M be nonempty. Then S̄ = {x ∈ M : d(x, S) = 0}.

Proof. Suppose x ∈ S̄. By 1.3.6, there exists a sequence (x_n) in S with x_n → x. Then d(x, S) ≤ d(x, x_n) → 0,
so d(x, S) = 0.
On the other hand, if d(x, S) = 0, then for any r > 0 there exists p ∈ S such that p ∈ B(x; r). Then
x ∈ S̄ by 1.3.6 again.

Lemma 1.6.11. Let A ⊆ M be nonempty. The function d(·, A) : M → R is continuous.


Proof. Let x, y ∈ M . Then for any a ∈ A,

d(x, A) ≤ d(x, a) ≤ d(x, y) + d(y, a).

Thus, by taking infimum, one has d(x, A) ≤ d(x, y) + d(y, A) ⇐⇒ d(x, A) − d(y, A) ≤ d(x, y). By switching
the role of x and y, it follows that
|d(x, A) − d(y, A)| ≤ d(x, y).
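The estimate |d(x, A) − d(y, A)| ≤ d(x, y) says that d(·, A) is 1-Lipschitz. The sketch below spot-checks it on the real line for a finite set A, where the infimum is a minimum; the set `A`, the sampling range, and the helper names are assumptions chosen only for illustration.

```python
import random

def dist_to_set(x, A):
    """d(x, A) = inf{|x - a| : a in A}; for a finite set A this is a minimum."""
    return min(abs(x - a) for a in A)

def is_1_lipschitz(A, trials=10_000, seed=2, tol=1e-12):
    """Check |d(x, A) - d(y, A)| <= |x - y| on random pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-5, 5), rng.uniform(-5, 5)
        if abs(dist_to_set(x, A) - dist_to_set(y, A)) > abs(x - y) + tol:
            return False
    return True

A = [-1.0, 0.5, 2.0]
print(dist_to_set(1.0, A))   # 0.5, attained at a = 0.5
print(is_1_lipschitz(A))     # True
```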

Proposition 1.6.12 (distance between disjoint closed and compact sets). Let nonempty K, F ⊆ M with K
compact and F closed. If K ∩ F = ∅, then d(K, F ) > 0.
Proof. Consider d(·, F) : K → R. By 1.6.11 and 1.6.8.1, there exists p_min ∈ K such that d(p_min, F) ≤ d(p, F)
for all p ∈ K. Then d(p_min, F) > 0, as otherwise d(p_min, F) = 0, which by 1.6.10 means p_min ∈ F̄ = F,
contradicting K ∩ F = ∅. We claim that d(p_min, F) = d(K, F). It is clear that d(K, F) ≤ d(p_min, F). On the
other hand, let x ∈ K, y ∈ F. Then d(p_min, F) ≤ d(x, F) ≤ d(x, y). By taking the infimum, we have
d(p_min, F) ≤ d(K, F).
Example. Let I = [0, 1]. Then I cannot be written as a union of countably many (at least two) pairwise
disjoint nonempty closed sets.
Proof. Suppose not, and I = ⋃_{j≥1} F_j with the F_j closed, nonempty, and pairwise disjoint. We claim that
there is a closed interval I1 ⊆ I which is disjoint from F1 and intersects at least two members of {F_j}.

Proof of claim. Choose a ∈ F1, b ∈ F2 such that |a − b| = d(F1, F2); such a, b exist since F1 and F2 are
compact, and |a − b| > 0 by 1.6.12. Without loss of generality, we assume a < b. Then [a, b] ∩ F1 = {a} and
[a, b] ∩ F2 = {b}. Let I1 = [(a + b)/2, b]. Then I1 is disjoint from F1, contains b ∈ F2, and its left endpoint
lies in some F_j with j ≥ 3, and we are done.
By repeated use of the claim, we get a nested sequence I ⊇ I1 ⊇ I2 ⊇ ··· of closed intervals whose intersection
is disjoint from every F_n. Since the intersection is nonempty by 1.6.6, this is a contradiction.
Theorem 1.6.13 (homeomorphism and compactness). If M is compact then a continuous bijection f : M →
N is a homeomorphism.
Proof. We aim to show that f −1 : N → M is continuous. Let qn → q in N . Define pn = f −1 (qn ) and
p = f −1 (q).
Let (p_{n_k}) be a subsequence of (p_n). Since M is compact, there exists a sub-subsequence (p_{n_{k_ℓ}}) such that
p_{n_{k_ℓ}} → p′ for some p′ ∈ M. Then q_{n_{k_ℓ}} = f(p_{n_{k_ℓ}}) → f(p′) by the continuity of f. By 1.1.3,
q_{n_{k_ℓ}} → q, and therefore f(p′) = q, that is, p′ = f^{-1}(q) = p. Thus, by 1.1.5, p_n → p and f^{-1} is continuous.

1.7 Coverings
Definition 1.7.1. A collection C of subsets of M covers A ⊆ M if A is contained in the union of the sets
belonging to C. In this case, C is a covering of A. A subcollection C′ ⊆ C which is still a covering of A is a
subcovering of A.
Definition 1.7.2. If all the sets in a covering C of A are open, then C is an open covering. If every open
covering of A has a finite subcovering, we say that A is covering compact.
Theorem 1.7.3 (axiom of compactness). For a subset A of a metric space M , the following are equivalent:

(a) A is covering compact.


(b) A is sequentially compact.
Proof. (a) implies (b). Suppose A is covering compact. If A is not sequentially compact, then there exists a
sequence (p_n) in A with no subsequence converging to a point of A. Then for each a ∈ A there exists δ_a > 0
such that {n ∈ N : p_n ∈ B(a; δ_a)} is finite. Since A is covering compact, there exist a_1, ..., a_m ∈ A such that
\[ A \subseteq \bigcup_{j=1}^{m} B(a_j; \delta_{a_j}), \]
which is absurd: the index set {n ∈ N : p_n ∈ A} = N is infinite, yet it is contained in the finite union of the
finite sets {n ∈ N : p_n ∈ B(a_j; δ_{a_j})}.
(b) implies (a). For a covering C of A, a Lebesgue number of C is a number λ > 0 such that for any
a ∈ A there exists a member U of C with B(a; λ) ⊆ U. We first prove the Lebesgue number lemma: Every
open covering of a sequentially compact set has a Lebesgue number λ > 0.
Proof of lemma. Suppose not. Then there exists an open covering U of a sequentially compact set A such that
for any λ > 0 there exists a_λ ∈ A with B(a_λ; λ) ⊄ U for all U ∈ U. For n ∈ N, let λ = 1/n and let a_n be
the corresponding point in A. By sequential compactness, there exists a subsequence a_{n_k} → a for some
a ∈ A. Since U is an open covering of A, there exists r > 0 such that B(a; r) ⊆ U for some U ∈ U. Since
a_{n_k} → a, there exists K ∈ N such that a_{n_k} ∈ B(a; r/2) for any k ≥ K. Take k ≥ K so large that 1/n_k < r/2;
then B(a_{n_k}; 1/n_k) ⊆ B(a; r) ⊆ U, a contradiction.
Now let U be an open covering of A. By the Lebesgue number lemma, U has a Lebesgue number λ > 0. We
construct a finite subcovering U_1, …, U_n by the following algorithm:
(initial step) Pick any a_1 ∈ A. Then there exists U_1 ∈ U such that B(a_1; λ) ⊆ U_1.
(inductive step) Suppose U_1, …, U_j are constructed. If A ⊆ U_1 ∪ ⋯ ∪ U_j, then we are done. Otherwise,
there exists a_{j+1} ∈ A\(U_1 ∪ ⋯ ∪ U_j), and there exists U_{j+1} ∈ U such that B(a_{j+1}; λ) ⊆ U_{j+1}.

Notice that for i < j we have a_j ∉ U_i ⊇ B(a_i; λ), so d(a_i, a_j) ≥ λ for any distinct i, j ∈ N. Thus, if the
algorithm above does not halt, then there exists a sequence (a_j) in A with no convergent subsequence, which is absurd.
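The inductive step above can be imitated on a finite sample of points. The following is only a sketch under stated assumptions: the balls B(a_i; λ) stand in for the sets U_i, the sample grid and the value λ = 0.25 are hypothetical, and the function name is ours, not from the text.

```python
import math

def greedy_separated_points(points, lam):
    """Pick points one at a time, each at distance >= lam from all earlier picks.

    This mirrors the inductive step on a finite sample: a new point is chosen
    only if it is not covered by the balls B(a_i; lam) around earlier picks.
    """
    chosen = []
    for p in points:
        if all(math.dist(p, a) >= lam for a in chosen):
            chosen.append(p)
    return chosen

# Hypothetical sample: a 21 x 21 grid in the unit square.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
centers = greedy_separated_points(grid, 0.25)

# Every sample point lies within lam of some chosen center ...
covered = all(any(math.dist(p, a) < 0.25 for a in centers) for p in grid)
# ... and distinct centers are at least lam apart, as in the proof.
separated = all(math.dist(a, b) >= 0.25
                for i, a in enumerate(centers) for b in centers[:i])
```

On a bounded sample the loop necessarily stops; the proof uses sequential compactness to rule out an infinite λ-separated sequence in A.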
Remark. In an abstract topological space X, we say a set S ⊆ X is compact if S is covering compact.

1.8 Uniform Continuity


Definition 1.8.1. A function f : M → N is uniformly continuous if for each ϵ > 0, there exists δ > 0
such that for any p, q ∈ M with d(p, q) < δ, one has d(f (p), f (q)) < ϵ.

Remark. Uniform continuity implies continuity.


Theorem 1.8.2 (Heine-Cantor theorem). Every continuous function defined on a compact set is uniformly
continuous.

Proof. Let M be compact and f : M → N be continuous. Let ϵ > 0 be given. By 1.2.2, for any p ∈ M there
exists δ_p > 0 such that f(B(p; δ_p)) ⊆ B(f(p); ϵ). By 1.7.3, there exist p_1, …, p_n ∈ M such that, writing δ_j := δ_{p_j},

M ⊆ ⋃_{j=1}^{n} B(p_j; δ_j/2).

Let δ = (1/2) min{δ_1, …, δ_n} > 0. Then for any x, y ∈ M with d(x, y) < δ, there exists j ∈ {1, …, n} such that
d(x, p_j) < δ_j/2, and thus d(y, p_j) ≤ d(y, x) + d(x, p_j) < δ_j, so x, y ∈ B(p_j; δ_j). By the choice of δ_j,

d(f(x), f(y)) ≤ d(f(x), f(p_j)) + d(f(p_j), f(y)) < 2ϵ.

Lemma 1.8.3 (uniformly continuous function preserves Cauchyness). If f : M → N is uniformly continuous,
then for any Cauchy sequence (p_n) in M, (f(p_n)) is also Cauchy.
Proof. Let ϵ > 0 be given. Then there exists δ > 0 such that for any p, q ∈ M with d(p, q) < δ, d(f(p), f(q)) <
ϵ. Since (p_n) is Cauchy, there exists N ∈ N such that for any n, m ≥ N, d(p_n, p_m) < δ and hence d(f(p_n), f(p_m)) <
ϵ. Thus, (f(p_n)) is Cauchy.
Theorem 1.8.4 (uniformly continuous extension). Let A ⊆ M and f : A → N be a uniformly continuous
function with N complete. Then there exists an extension f̂ : Ā → N of f such that f̂ is also uniformly continuous.

Proof. By 1.3.6, for any p ∈ Ā, there exists a sequence (p_n) in A with p_n → p. Since (p_n) is Cauchy, by 1.8.3,
(f(p_n)) is Cauchy, and thus converges to some q ∈ N. We thus define f̂ : Ā → N by

f̂(p) = lim_{n→∞} f(p_n)

for some p_n → p. Such f̂ is well-defined: Suppose p_n, q_n → p. Then the interleaved sequence
p_1, q_1, p_2, q_2, … → p by 1.1.4, so it is Cauchy, and hence f(p_1), f(q_1), f(p_2), f(q_2), … → L for some L ∈ N. By 1.1.4 again,
lim f(p_n) = lim f(q_n) = L. Consequently, f̂(p) = f(p) for all p ∈ A since we can take the constant sequence p.
It remains to show that f̂ is uniformly continuous:

Let ϵ > 0 be given. Then there exists δ > 0 such that for any p′, q′ ∈ A with d(p′, q′) < δ, we
have d(f(p′), f(q′)) < ϵ. Let p, q ∈ Ā with d(p, q) < δ/3, and take sequences (p_n), (q_n) in A with
p_n → p, q_n → q. There exists N ∈ N such that for any n ≥ N, d(p_n, p), d(q_n, q) < δ/3.
Then
d(p_n, q_n) ≤ d(p_n, p) + d(p, q) + d(q, q_n) < δ =⇒ d(f(p_n), f(q_n)) < ϵ
for any n ≥ N. Since f(p_n) → f̂(p) and f(q_n) → f̂(q), there exists N′ ∈ N such that

d(f̂(p), f(p_n)), d(f̂(q), f(q_n)) < ϵ

for all n ≥ N′. Hence, for any n ≥ max{N, N′},

d(f̂(p), f̂(q)) ≤ d(f̂(p), f(p_n)) + d(f(p_n), f(q_n)) + d(f(q_n), f̂(q)) < 3ϵ.

Definition 1.8.5. If every closed and bounded subset A ⊆ M is compact, we say that the distance function d is
Heine-Borel and that M has the Heine-Borel property.
Corollary 1.8.5.1 (uniformly continuous function preserves boundedness). Suppose M has the Heine-Borel
property and A ⊆ M is bounded. If f : A → N is uniformly continuous with N complete, then f(A) is
bounded.
Proof. By 1.8.4, f can be extended to a uniformly continuous f̂ : Ā → N. Since A is bounded, Ā is closed and
bounded, hence compact as M has the Heine-Borel property. By 1.6.8, f̂(Ā) is compact, hence bounded. Since
f(A) ⊆ f̂(Ā), f(A) is bounded.

1.9 Connectedness
Definition 1.9.1. M is disconnected if there exists a nonempty, proper and clopen (both closed and
open) subset A ⊆ M . If M is not disconnected, then M is connected. We say that A ⊆ M is connected
if A is connected in its own subspace.
Theorem 1.9.2 (equivalence of disconnected space). The following are equivalent:
(a) M is disconnected.
(b) There exist disjoint nonempty open A, B ⊆ M such that M = A ∪ B.
(c) There is a continuous function f : M → {0, 1} (with discrete metric) such that f is not constant.
(d) There exist two nonempty subsets A, B such that Ā ∩ B = ∅, A ∩ B̄ = ∅ and M = A ∪ B.
Proof. (a) implies (b). Suppose M is disconnected. Then there exists a nonempty, proper and clopen subset
A ⊆ M. Then A and B := M\A are the desired subsets.
(b) implies (c). Let A, B be such subsets. Define f : M → {0, 1} by f(A) = {0} and f(B) = {1}, which is
well defined since A and B are disjoint. By 1.3.8, it remains to prove that f^{-1}(S) is open for all open
S ⊆ {0, 1}, which clearly holds since the preimages are ∅, A, B and M.
(c) implies (d). Let f : M → {0, 1} be such a function. Let A := f^{-1}({0}) and B := f^{-1}({1}). Then A, B
are disjoint, clopen and nonempty. Thus, Ā ∩ B = A ∩ B̄ = ∅ and A ∪ B = M.
(d) implies (a). Suppose A, B are such subsets. We claim that A is the desired set. Since A ∩ B ⊆ Ā ∩ B = ∅
and A ∪ B = M, we have A = M\B. Since A ∩ B̄ = ∅, M\B̄ = A and thus A is open. Since Ā ∩ B = ∅,
Ā ⊆ M\B = A and thus A is closed. As B = M\A is nonempty, A is a nonempty, proper and clopen subset.
Theorem 1.9.3 (continuity and connectedness). If M is connected, f : M → N is continuous, then f (M )
is connected.
Proof. Suppose f(M) is disconnected. Then by 1.9.2, there exists a continuous g : f(M) → {0, 1} such that g
is not constant. By 1.2.3, g ◦ f : M → {0, 1} is a continuous, non-constant function. Then M is disconnected
by 1.9.2 again.
Corollary 1.9.3.1 (generalized intermediate value theorem). If M is connected, f : M → R is continuous,
then f (M ) has the intermediate value property.
Proof. Suppose f(M) fails to have the intermediate value property. Then there exist p, q ∈ M and γ ∈ R with
f(p) < γ < f(q) and γ ∉ f(M), so that

M = f^{-1}((−∞, γ)) ∪ f^{-1}((γ, ∞))

is a union of two disjoint nonempty open sets, which contradicts the fact that M is connected.


Proposition 1.9.4 (intermediate value theorem). Suppose a ≤ b in R. If f : [a, b] → R is continuous, then
f ([a, b]) has the intermediate value property.
Proof. Without loss of generality, suppose f(a) < u < f(b). Let S = {x ∈ [a, b] : f(x) ≤ u} ⊆ [a, b]. Since
a ∈ S and S is bounded above by b, c := sup(S) exists by the lub-principle. We claim that f(c) = u. If
f(c) < u, then c < b and there exists δ_+ > 0 such that f(x) < u for any x ∈ [a, b] with |x − c| < δ_+; hence S
contains points greater than c, contradicting that c is an upper bound of S. Now suppose f(c) > u. Then c > a
and there exists δ_− > 0 such that f(x) > u for any x ∈ [a, b] with |x − c| < δ_−; hence c − δ_− is an upper
bound of S, contradicting that c is the least upper bound.
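The existence statement can be explored numerically by bisection. This is a sketch, not the argument of the proof: bisection locates some point where f crosses the level u, which need not be the supremum c used above, and the function name, tolerance and test function are our own assumptions.

```python
def bisect_value(f, a, b, u, tol=1e-12):
    """Approximate a point c in [a, b] with f(c) = u.

    Assumes f is continuous on [a, b] and f(a) < u < f(b).
    Invariant of the loop: f(lo) <= u < f(hi).
    """
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) <= u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: solve x**3 + x = 5 on [0, 2].
c = bisect_value(lambda x: x**3 + x, 0.0, 2.0, 5.0)
```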
Proposition 1.9.5 (union of connected sets).
(a) Suppose (E_n) is a sequence of connected subsets in M. If E_j ∩ E_{j+1} ≠ ∅ for all j ∈ N, then ⋃_{j≥1} E_j is connected.
(b) Suppose {E_α}_{α∈Γ} is a collection of connected subsets in M. If E_α ∩ E_β ≠ ∅ for all α, β ∈ Γ, then
⋃_{α∈Γ} E_α is connected.

Proof.
(a) Suppose f : ⋃_{j≥1} E_j → {0, 1} is continuous. Then by 1.3.10.1 and 1.9.2, f|E_j is a constant function
for any j ∈ N. Suppose f|E_1 = 1. Since E_1 ∩ E_2 is nonempty, f|E_2 = 1. By induction, for any j ∈ N,
f|E_j = 1. Thus, f = 1 and ⋃_{j≥1} E_j is connected by 1.9.2.
(b) Suppose f : ⋃_{α∈Γ} E_α → {0, 1} is continuous. Fix E_α. Since E_α is connected, f|E_α is constant. Suppose
f|E_α = 1. Since E_β ∩ E_α ≠ ∅ and f|E_β is constant, f|E_β = 1 for all β ∈ Γ. Thus, f = 1 and ⋃_{α∈Γ} E_α is connected.

Theorem 1.9.6 (connectedness and intervals). Suppose I ⊆ R. The following are equivalent:

(a) I is connected.
(b) For any a, b ∈ I, [a, b] ⊆ I.
Proof. (a) implies (b). Suppose there exist a, b ∈ I with a ≤ b such that [a, b] ⊄ I. Then there exists c ∈ [a, b]\I,
and a < c < b since a, b ∈ I. Hence, I = (I ∩ (−∞, c)) ∪ (I ∩ (c, ∞)). Let A = I ∩ (−∞, c) and B = I ∩ (c, ∞). By 1.3.9, A, B are
disjoint open sets in I, and they are nonempty since a ∈ A and b ∈ B. Thus, I is disconnected by 1.9.2.
(b) implies (a). If I is empty, then I is clearly connected. Otherwise let p ∈ I. Notice that by condition (b),

I = ⋃_{x ≤ p, x ∈ I} [x, p] ∪ ⋃_{p ≤ y, y ∈ I} [p, y].

All intervals in this union contain p, so by 1.9.5 it remains to show that for any a ≤ b in R, [a, b] is connected.
Let f : [a, b] → {0, 1} be a continuous function. By 1.3.10.2, f can be viewed as a continuous function mapping
[a, b] to R. If f is not constant, then it takes both values 0 and 1, so by 1.9.4 there exists c ∈ [a, b] such that
f(c) = 1/2, which is not possible. Thus, f is constant and [a, b] is connected.
Proposition 1.9.7 (closure and connectedness). If S ⊆ M is connected and S ⊆ T ⊆ S̄, then T is connected.

Proof. Let us introduce a lemma: Suppose A ⊆ M is dense and connected. Then M is connected.
Proof of lemma. Suppose M is disconnected. Then by 1.9.2, there exist disjoint nonempty open U, V ⊆ M such
that M = U ∪ V. Since A is dense, U ∩ A and V ∩ A are nonempty. Thus, A = (U ∩ A) ∪ (V ∩ A) with
U ∩ A, V ∩ A being disjoint nonempty open sets in A, a contradiction.

Since S ⊆ T ⊆ S̄, the closure of S in T is S̄ ∩ T = T. Then S is a dense and connected subset of T, which
implies T is connected by the lemma above.

1.10 Total Boundedness


Definition 1.10.1. A set A ⊆ M is totally bounded if for any ϵ > 0, there exist finitely many p_1, …, p_n ∈ M
such that

A ⊆ ⋃_{j=1}^{n} B(p_j; ϵ).

Theorem 1.10.2 (generalized Heine-Borel theorem). A subset of a complete metric space is compact if and
only if it is closed and totally bounded.
Proof. The necessary condition clearly holds. For sufficiency, we claim that every sequence in a totally bounded
set admits a Cauchy subsequence:

Proof of claim. Suppose A is totally bounded and (a_n) is a sequence in A. We construct the Cauchy
subsequence inductively:
(the initial step) Since A is totally bounded, there exist p_1, …, p_m ∈ M such that A ⊆ ⋃_{j=1}^{m} B(p_j; 1).
Then there exists k ∈ {1, …, m} such that a_n ∈ S_1 := B(p_k; 1) ∩ A for infinitely many n. Set q_1 := p_k and
pick n_1 with a_{n_1} ∈ S_1.
(the inductive step) Suppose S_N ⊆ S_{N−1} ⊆ ⋯ ⊆ S_1 are constructed with S_{j+1} = S_j ∩ B(q_{j+1}; 1/(j+1)),
each containing a_n for infinitely many n, and a_{n_j} ∈ S_j are chosen with n_1 < ⋯ < n_N. Since S_N ⊆ A, S_N is
totally bounded. Then by the same process as in the initial step, there exists q_{N+1} such that
S_{N+1} := S_N ∩ B(q_{N+1}; 1/(N+1)) contains a_n for infinitely many n.
Thus, we can take n_{N+1} > n_N such that a_{n_{N+1}} ∈ S_{N+1}.

Let ϵ > 0 be given. Pick N ∈ N such that 1/N < ϵ/2. Then for any k, ℓ ≥ N, d(a_{n_k}, a_{n_ℓ}) < 2/N < ϵ as
a_{n_k}, a_{n_ℓ} ∈ S_N ⊆ B(q_N; 1/N).
Now let A be a closed and totally bounded subset of a complete metric space and let (a_n) be a sequence in A.
By the claim, (a_n) has a Cauchy subsequence, which converges by completeness, and its limit lies in A by
closedness. Hence A is sequentially compact, and thus compact by 1.7.3.
Example. Consider the Hilbert cube

H = {(x_1, x_2, …) ∈ [0, 1]^∞ : |x_j| ≤ 1/2^j, ∀j ∈ N}

with metric
d(x, y) = sup_j |x_j − y_j|.

It can be shown that H is compact by 1.10.2.


Proof.
(a) ([0, 1]^∞ is complete) Suppose (p^k) is a Cauchy sequence in [0, 1]^∞ with p^k = (p^k_1, p^k_2, …). Then for
each ϵ > 0, there exists N ∈ N such that for any n, m ≥ N, d(p^n, p^m) < ϵ. By the definition of d, we
have |p^n_j − p^m_j| < ϵ for all n, m ≥ N and j ∈ N. Thus, for each j ∈ N, (p^k_j)_k is Cauchy in [0, 1] and therefore
p^k_j → p^*_j for some p^*_j ∈ [0, 1] as [0, 1] is complete. Let p^* = (p^*_1, p^*_2, …). We claim that p^k → p^*. Since
for each n, m ≥ N and j ∈ N we have |p^n_j − p^m_j| < ϵ, by taking m → ∞ we have |p^n_j − p^*_j| ≤ ϵ. Thus,
d(p^n, p^*) ≤ ϵ for all n ≥ N.
(b) (H is closed) Suppose (h^k) is a sequence in H with h^k → h^*. Then for each j ∈ N, h^k_j → h^*_j
and thus h^*_j ∈ [0, 1/2^j] as [0, 1/2^j] is closed.

(c) (H is totally bounded) Let ϵ > 0 be given. Take N ∈ N such that 1/2^n < ϵ for all n > N. For
j = 1, …, N, since [0, 1/2^j] is totally bounded, there exist y^j_1, …, y^j_{n_j} ∈ [0, 1/2^j] such that
{B(y^j_k; ϵ) : k = 1, …, n_j} covers [0, 1/2^j]. Let

S = {(p_1, p_2, …) ∈ H : p_j = y^j_k for some k if j = 1, …, N; p_j = 0 otherwise} ⊆ H.

Then S is finite and ⋃_{p∈S} B(p; ϵ) covers H: given x ∈ H, choose p ∈ S with |x_j − p_j| < ϵ for j ≤ N; for
j > N we have |x_j − p_j| = |x_j| ≤ 1/2^{N+1} < ϵ, so d(x, p) < ϵ.
Thus, by 1.10.2, H is compact.

1.11 Perfect Metric Space


Definition 1.11.1. Suppose S ⊆ M. A point p ∈ M is a cluster point of S if for any ϵ > 0, B(p; ϵ) ∩ S
is an infinite set. The set of all cluster points of S is the derivative set of S, denoted by S′.
Proposition 1.11.2 (equivalence of cluster points). Suppose S ⊆ M . The following are equivalent:
(a) p ∈ S ′ .
(b) there exists a sequence of distinct points in S converging to p.
(c) for any ϵ > 0, B(p; ϵ) ∩ S contains at least two points.
(d) for any ϵ > 0, B(p; ϵ) ∩ S contains a point of S other than p.
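Proof. (a) implies (b). There exists p_1 ∈ S such that p_1 ∈ B(p; 1) with p_1 ≠ p. Thus d(p_1, p) > 0 and we can
take r_2 := min{d(p_1, p), 1/2} as the new radius to find p_2 ∈ S ∩ B(p; r_2) with p_2 ≠ p. Continuing inductively
with r_{n+1} := min{d(p_n, p), 1/(n+1)}, we obtain distinct points p_n ∈ S with d(p_n, p) < 1/n, so p_n → p.
Clearly (b)⇒(c)⇒(d).
(d) implies (a). The procedure in “(a) implies (b)” uses only (d); applied inside B(p; ϵ), it produces infinitely
many distinct points of S in B(p; ϵ).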
Proposition 1.11.3 (formulation of closure by derivative set). Suppose S ⊆ M. Then S̄ = S ∪ S′.
Proof. Suppose p ∈ S̄. Then by 1.6.10, d(p, S) = 0. If p ∉ S, then for any ϵ > 0, there exists p′ ∈ S with
p′ ∈ B(p; ϵ). Since p ≠ p′, by 1.11.2, p ∈ S′. Hence S̄ ⊆ S ∪ S′.
Conversely, S ⊆ S̄, and if p ∈ S′, then d(p, S) = 0 ⇐⇒ p ∈ S̄.

Remark. S is closed if and only if S ′ ⊆ S.
Definition 1.11.4. We say that S ⊆ M is perfect if S = S ′ .
Theorem 1.11.5 (perfect and complete space is uncountable). Every nonempty, perfect, and complete
metric space is uncountable.
Proof. Suppose M is such a metric space. If M is countable, then we can enumerate M as M = {x_1, x_2, …}.
For p ∈ M and r > 0, we define B̄(p; r) := {q ∈ M : d(p, q) ≤ r}, which is a closed set as (−∞, r] is
closed in R and d(p, ·) : M → R is continuous.
Pick y_1 ∈ M with y_1 ≠ x_1 and r_1 ∈ (0, 1) with r_1 < d(y_1, x_1), and set Y_1 = B̄(y_1; r_1), so that x_1 ∉ Y_1.
Since M is perfect, B(y_1; r_1) is infinite, so there exists y_2 ≠ x_2 such that y_2 ∈ B(y_1; r_1). Since B(y_1; r_1) is open,
there exists r_2 ∈ (0, 1/2) such that B̄(y_2; r_2) ⊆ B(y_1; r_1) and r_2 < d(y_2, x_2). Set Y_2 = B̄(y_2; r_2).
Inductively, we construct Y_1 ⊇ Y_2 ⊇ ⋯ such that x_j ∉ Y_j with diam Y_j < 2/j. Thus the center points y_n
form a Cauchy sequence. Since M is complete, y_n → y ∈ M. Since the Y_j are closed and nested, y ∈ ⋂_{j≥1} Y_j. Since
x_j ∉ Y_j, y ≠ x_j for all j ∈ N, a contradiction.
Definition 1.11.6. M is separable if M admits a countable dense set.
Proposition 1.11.7 (equivalence of separable space). M is separable if and only if M admits a countable
base.
Proof. The sufficient condition is easy. Suppose {V_n} is a countable base. Without loss of generality, we
can assume V_n ≠ ∅. For each V_n, pick p_n ∈ V_n. Then for any p ∈ M and r > 0, B(p; r) is the union of a
nonempty subcollection of {V_n} and thus B(p; r) ∩ {p_n} ≠ ∅. Hence {p_n} is a countable dense set.
The necessary condition is also easy. Let {q_n} be a countable dense set. Let B′ = {B(q_n; r) : n ∈ N, r >
0, r ∈ Q}, which is countable. Let G be an open set. Then for any q ∈ G, there exists ϵ > 0 such that B(q; ϵ) ⊆ G. Since
{q_n} is dense, there exists j ∈ N such that q_j ∈ B(q; ϵ/2). Take r ∈ Q such that d(q, q_j) < r < ϵ/2. Thus,
q ∈ B(q_j; r) ⊆ B(q; ϵ) ⊆ G, so G is a union of members of B′.
Theorem 1.11.8 (cupcake theorem). Let M be separable. Then every closed set K ⊆ M can be written as
C ∪ P = K with P perfect, C at most countable and C ∩ P = ∅.
Proof. If K is at most countable, then set P = ∅ and C = K. Suppose K is uncountable.
We say a point p ∈ M is a condensation point of K if for any r > 0, B(p; r) ∩ K is uncountable. Let
P be the set of all condensation points of K. Since K is closed, we have P ⊆ K.
By 1.11.7, there exists a countable base {Vn }. Let V be the union of those Vn such that Vn ∩ K is at
most countable. We claim that P = M \V .
Proof of claim. We have
p ∈ V
⇐⇒ there exists V_n such that p ∈ V_n and V_n ∩ K is at most countable
⇐⇒ there exist V_n and r > 0 such that B(p; r) ⊆ V_n and V_n ∩ K is at most countable
=⇒ B(p; r) ∩ K is at most countable for some r > 0
⇐⇒ p ∉ P.
Hence P ⊆ M\V. Conversely, let p ∈ M\V. Suppose there exists r > 0 such that B(p; r) ∩ K is at most countable.
Since {V_n} is a base, B(p; r) is the union of a subcollection of {V_n}, say {V_{n_k}}, and p ∈ V_{n_k} for some k. Then
V_{n_k} ∩ K ⊆ B(p; r) ∩ K is at most countable and so p ∈ V, a contradiction. Thus p ∈ P.
Therefore, P = M\V is closed. By 1.11.3, it remains to show that P ⊆ P′. Suppose p ∈ P. To get a contradiction,
suppose there exists r > 0 such that B(p; r) ∩ P = {p}. Then for any y ∈ B(p; r)\{p}, we have y ∉ P, so there
exists r_y > 0 such that B(y; r_y) ∩ K is at most countable and B(y; r_y) ⊆ B(p; r). Since {V_n} is a base, there
exists V_{n_y} such that y ∈ V_{n_y} ⊆ B(y; r_y), and V_{n_y} ∩ K is at most countable. Thus,

(B(p; r)\{p}) ∩ K = ⋃_{y ∈ B(p;r)\{p}} (V_{n_y} ∩ K),

which is at most countable, being a union of at most countably many distinct sets V_n ∩ K, each at most
countable. Then B(p; r) ∩ K is at most countable, contradicting p ∈ P.

Hence, P is perfect and C := K\P = K ∩ V is at most countable.

1.12 Cantor Sets
Cantor sets are examples of compact sets that are maximally disconnected. Here is how to construct the standard
Cantor set. Start with the unit interval [0, 1] and remove its open middle third, (1/3, 2/3). Then remove
the open middle third from each of the remaining two intervals, and so on. This gives a nested sequence
C^0 ⊇ C^1 ⊇ C^2 ⊇ ⋯, where C^0 = [0, 1], C^1 is the union of the two intervals [0, 1/3] and [2/3, 1], C^2 is the
union of four intervals, C^3 is the union of eight intervals, and so on.
The standard middle third Cantor set is the nested intersection

C = ⋂_{j≥1} C^j.
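The construction of the levels C^n can be carried out exactly with rational arithmetic. The sketch below is an illustration only; the function name cantor_level is our own choice and not part of the text.

```python
from fractions import Fraction

def cantor_level(n):
    """Return the closed intervals of C^n as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]          # C^0 = [0, 1]
    for _ in range(n):
        next_level = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            next_level.append((lo, lo + third))       # keep the left third
            next_level.append((hi - third, hi))       # keep the right third
        intervals = next_level
    return intervals

level2 = cantor_level(2)
total_length = sum(hi - lo for lo, hi in cantor_level(5))
```

Here level2 lists the four intervals of C^2, and the total length of C^n computed this way equals (2/3)^n.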

Definition 1.12.1. A metric space M is totally disconnected if for any p ∈ M and ϵ > 0, there exists a
clopen set U such that
p ∈ U ⊆ B(p; ϵ).
Theorem 1.12.2 (characterization of the Cantor set). The Cantor set is a compact, nonempty, perfect, and
totally disconnected metric space.
Proof. Let E be the set of endpoints of all the C^n-intervals. Clearly E is countably infinite and E ⊆ C. Then C is
nonempty, infinite and compact (by 1.6.6).
To show that C is perfect and totally disconnected, take any x ∈ C and any ϵ > 0. Fix n ∈ N so large
that 1/3^n < ϵ. The point x lies in one of the 2^n intervals I of length 1/3^n that comprise C^n. Fix this
I. The set E ∩ I is infinite and contained in the interval (x − ϵ, x + ϵ). Thus x is a cluster point of C and C
is perfect.
The interval I is closed in R and therefore in C^n. The complement J = C^n\I consists of finitely many
closed intervals and is therefore closed too. Thus, I and J are clopen in C^n. Note that C ⊆ C^n. Then C ∩ I
is clopen in C with x ∈ C ∩ I ⊆ B(x; ϵ). Thus, C is totally disconnected.
Corollary 1.12.2.1 (the Cantor set is uncountable). The Cantor set is uncountable.
Definition 1.12.3. Suppose S ⊆ M. We say that S is nowhere dense if the closure of S has empty interior, that is, (S̄)° = ∅.
Proposition 1.12.4 (the Cantor set is nowhere dense). The Cantor set is nowhere dense.
Proof. Since C is closed, it suffices to show that C has empty interior. Suppose C contains (a, b) for some a < b.
Then (a, b) ⊆ C^n for all n ∈ N. Take n ∈ N so large that 1/3^n < b − a. Since (a, b) is connected, it lies in a
single interval of C^n, whose length is 1/3^n < b − a, a contradiction.
Definition 1.12.5. A set S ⊆ R is a zero set if for any ϵ > 0, there is a countable covering of S by open
intervals (a_j, b_j) whose total length satisfies

∑_{j=1}^{∞} (b_j − a_j) < ϵ.

Remark. It is clear that C is a zero set.
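One way to see the remark: each of the 2^n closed intervals of C^n has length 1/3^n and sits inside an open interval of twice that length, so C is covered by open intervals of total length 2·(2/3)^n. The sketch below finds a level n that works for a given ϵ; the doubling of the intervals and the function name are our own illustrative choices.

```python
from fractions import Fraction

def cantor_cover_level(eps):
    """Smallest n with 2 * (2/3)**n < eps.

    At that level, 2**n open intervals of length 2 / 3**n cover the Cantor set
    and have total length below eps. Assumes eps > 0.
    """
    n = 0
    while 2 * Fraction(2, 3) ** n >= eps:
        n += 1
    return n

n = cantor_cover_level(Fraction(1, 10))
```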


A more direct way to see that the Cantor set is uncountable involves a geometric coding scheme. Take
the code 0 = left and 2 = right. Then C_0 = [0, 1/3] and C_2 = [2/3, 1] and C^1 = C_0 ∪ C_2. Similarly, the left
and right subintervals of C_0 are coded C_{00} and C_{02} while the left and right subintervals of C_2 are C_{20} and
C_{22}. This gives C^2 = C_{00} ∪ C_{02} ∪ C_{20} ∪ C_{22}. Imagine now an infinite address string ω = ω_1ω_2⋯ of zeros
and twos. Corresponding to ω, we form a nested sequence of intervals
C_{ω_1} ⊇ C_{ω_1ω_2} ⊇ ⋯
the intersection of which is a point p = p(ω) ∈ C (since diam(C_{ω_1⋯ω_n}) → 0 as n → ∞). Specifically,

p(ω) = ⋂_{n∈N} C_{ω|n},

where ω|n = ω_1⋯ω_n truncates ω to an address of length n. This constructs a correspondence between addresses ω
and points in C, and distinct addresses give distinct points. Since the set of all such ω is uncountable (by the
standard diagonal argument), C is also uncountable.
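The coding can be made concrete for finite addresses: the interval C_{ω_1⋯ω_n} has left endpoint ω_1/3 + ω_2/3^2 + ⋯ + ω_n/3^n and length 1/3^n. The sketch below computes it exactly; the function name is hypothetical.

```python
from fractions import Fraction

def address_interval(word):
    """Return the interval C_word for a finite word over the digits '0' and '2'."""
    left = Fraction(0)
    for k, digit in enumerate(word, start=1):
        if digit not in ("0", "2"):
            raise ValueError("addresses use only the digits 0 and 2")
        left += Fraction(int(digit), 3**k)            # digit 2 moves to the right third
    return left, left + Fraction(1, 3**len(word))

c02 = address_interval("02")
c20 = address_interval("20")
```

For an infinite address ω, the left endpoints of C_{ω|n} increase to p(ω), which is the ternary expansion of p(ω) using only the digits 0 and 2.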
If M is compact, a piece of M is any compact nonempty subset of M.

Lemma 1.12.6. If M is compact, then for any ϵ > 0, M can be expressed as the finite union of pieces with
diameter less than ϵ.
Proof. Reduce the covering {B(x; ϵ/2) : x ∈ M } of M to a finite subcovering and take the closure of each
member of the subcovering.
We say that M divides into these small pieces. The metaphor is not perfect as the pieces may overlap.
Let W(n) be the set of words in two letters, say a and b, having length n. Then |W(n)| = 2^n.
Using 1.12.6, we divide M into a finite number of pieces of diameter ≤ 1 and denote the collection of these
pieces by M_1. Take n_1 ∈ N such that 2^{n_1} ≥ |M_1| and choose any surjection w_1 : W(n_1) → M_1. We say
w_1 labels M_1, and if w_1(α) = L then α is a label of L.
Then we divide each L ∈ M_1 into finitely many smaller pieces (say with diameter ≤ 1/2). Let M_2(L) be the
collection of these smaller pieces and let

M_2 = ⋃_{L∈M_1} M_2(L).

Choose n_2 such that 2^{n_2} ≥ max{|M_2(L)| : L ∈ M_1} and label M_2 with words αβ ∈ W(n_1 + n_2) such that:
(labeling rule) if L = w_1(α) then αβ labels the pieces S ∈ M_2(L) as β varies in W(n_2).
This labeling amounts to a surjection w_2 : W(n_1 + n_2) → M_2 that is coherent with w_1 in the sense that
β ↦ w_2(αβ) labels the pieces S ∈ M_2(w_1(α)).
Proceeding by induction, we get finer and finer divisions of M coherently labeled with longer and longer
words. More precisely, there is a sequence of divisions (M_k) and surjections w_k : W_k = W(n_1 + ⋯ + n_k) →
M_k such that:
(a) The maximum diameter of the pieces L ∈ M_k tends to zero as k → ∞.
(b) M_{k+1} refines M_k in the sense that each S ∈ M_{k+1} is contained in some L ∈ M_k.
(c) If L ∈ M_k and M_{k+1}(L) denotes {S ∈ M_{k+1} : S ⊆ L} then

L = ⋃_{S∈M_{k+1}(L)} S.

(d) The labelings w_k are coherent in the sense that if w_k(α) = L ∈ M_k then β ↦ w_{k+1}(αβ) labels M_{k+1}(L)
as β varies in W(n_{k+1}).
Theorem 1.12.7 (Cantor surjection theorem). Given a compact nonempty metric space M, there exists a
continuous surjection of C onto M.
Proof. We are given a nonempty compact metric space M and we seek a continuous surjection σ : C → M
where C is the standard Cantor set.
Note that for any p ∈ C, by the coding above, we have

p = ⋂_{n∈N} C_{ω|n}.

We refer to ω = ω(p) as the address of p.

For k ∈ N, let M_k be the finite divisions of M constructed above, coherently labeled by w_k. They obey (a)-
(d). Given p ∈ C, we look at the nested sequence of pieces L_k(p) ∈ M_k such that L_k(p) = w_k(ω|(n_1 + ⋯ + n_k))
where ω = ω(p) (reading the digits 0 and 2 as the letters a and b). Then (L_k(p)) is a nested decreasing sequence
of nonempty compact sets whose diameter tends to 0 as k → ∞. Thus ⋂ L_k(p) is a well-defined point in M and we set

σ(p) = ⋂_{k∈N} L_k(p).

We must show that σ is a continuous surjection C → M. Continuity is simple. If p, p′ ∈ C are close together
then for large n the first n entries of their addresses are equal. This implies that σ(p) and σ(p′) belong to a
common L_k with k large. Since diam(L_k) → 0 as k → ∞ we get continuity. Surjectivity is also simple.
Each q ∈ M is the intersection of at least one nested sequence of pieces L_k ∈ M_k. Coherence of the labeling
of the M_k implies that for each nested sequence (L_k) there is an infinite word α = α_1α_2α_3⋯ such that
α_i ∈ W(n_i) and L_k = w_k(α_1⋯α_k), a word of length n_1 + ⋯ + n_k. The point with address α is sent by σ to q.

Corollary 1.12.7.1 (existence of Peano curve). There exists a Peano curve, a continuous path in the
plane which is space-filling in the sense that its image has a nonempty interior. In fact, there is a Peano
curve whose image is the closed unit disc B^2.
Proof. Let σ : C → B^2 be a continuous surjection supplied by 1.12.7. Extend σ to a map τ : [0, 1] → B^2 by
setting

τ(x) = σ(x) if x ∈ C,
τ(x) = (1 − t)σ(a) + tσ(b) if x = (1 − t)a + tb ∈ (a, b),

where (a, b) is a gap interval, that is, (a, b) ⊆ C^c with a, b ∈ C. Since σ is uniformly continuous on the compact
set C (by 1.8.2), we have |σ(a) − σ(b)| → 0 as |a − b| → 0. Hence τ is continuous. Note that τ([0, 1]) = B^2: the
image contains σ(C) = B^2, and it is contained in B^2 because B^2 is convex and τ just extends σ via linear
interpolation.

Chapter 2

Functions of a Real Variable

2.1 Differentiation
Definition 2.1.1. Let U ⊆ R be open. The function f : U → R is differentiable at p ∈ U if there exists L ∈ R
such that for all small enough h ∈ R with p + h ∈ U, we have

f(p + h) = f(p) + Lh + R(h),

where R(h)/h → 0 as h → 0. It is clear that such L is unique (if it exists). Thus, we say that L is the
derivative of f at p, denoted by f′(p).
Remark. It is immediate that if f is differentiable at p, then f is continuous at p.
Remark. This definition is equivalent to the usual definition in calculus courses:

lim_{h→0} (f(p + h) − f(p))/h = L.
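The condition R(h)/h → 0 can be observed numerically. The sketch below is illustrative only: the function f(x) = x^2, the point p = 3 and the candidate L = 6 are our own example, for which R(h) = h^2 and hence R(h)/h = h.

```python
def remainder_ratio(f, p, L, h):
    """Return R(h)/h, where R(h) = f(p + h) - f(p) - L*h."""
    return (f(p + h) - f(p) - L * h) / h

def square(x):
    return x**2

# With the correct derivative L = 6 at p = 3 the ratios shrink like h itself.
ratios = [remainder_ratio(square, 3.0, 6.0, 10.0**-k) for k in range(1, 6)]
# With a wrong candidate, say L = 5, the ratios stay near 1 instead of tending to 0.
wrong = [remainder_ratio(square, 3.0, 5.0, 10.0**-k) for k in range(1, 6)]
```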
Proposition 2.1.2 (the rule of differentiation). Let U ⊆ R be open and suppose f, g : U → R are differentiable at p. Then
(a) f + g and fg are differentiable at p with (f + g)′(p) = f′(p) + g′(p) and (fg)′(p) = f′(p)g(p) + f(p)g′(p).
(b) If f(p) ≠ 0, then 1/f is differentiable at p with (1/f)′(p) = −f′(p)/f(p)^2.
(c) If V ⊆ R is open and q : V → R is differentiable at f(p), then q ◦ f is differentiable at p with (q ◦ f)′(p) =
q′(f(p))f′(p).
Proof. Write R_f and R_g for the remainders of f and g at p.
(a) (f + g)(p + h) = f(p + h) + g(p + h) = (f + g)(p) + (f′(p) + g′(p))h + R_f(h) + R_g(h), and
(R_f(h) + R_g(h))/h → 0 as h → 0. Moreover,
(fg)(p + h) = (f(p) + f′(p)h + R_f(h))(g(p) + g′(p)h + R_g(h))
= (fg)(p) + (f′(p)g(p) + f(p)g′(p))h + f′(p)g′(p)h^2 + R_f(h)g(p + h) + R_g(h)(f(p) + f′(p)h),
and the last three terms divided by h tend to 0 as h → 0.
(b) Since f(p) ≠ 0 and f is continuous at p, there exists δ > 0 such that f(p + h) ≠ 0 for any |h| < δ.
Then for such h,

1/f(p + h) − 1/f(p) = (−f′(p)h − R_f(h)) / (f(p + h)f(p)).

The rest follows from the continuity of f at p: dividing by h and letting h → 0 gives −f′(p)/f(p)^2.
(c) q(f(p + h)) = q(f(p) + f′(p)h + R_f(h)) = q(f(p)) + q′(f(p))(f′(p)h + R_f(h)) + R_q(f′(p)h + R_f(h)),
where R_q is the remainder of q at f(p). Define

e_f(h) = R_f(h)/h if h ≠ 0, and e_f(h) = 0 if h = 0,

and e_q(h) similarly. It is clear that e_f, e_q are continuous at 0. Then q′(f(p))R_f(h)/h = q′(f(p))e_f(h) →
0 as h → 0. On the other hand, f′(p)h + R_f(h) = (f′(p) + e_f(h))h and thus

R_q(f′(p)h + R_f(h)) = e_q(f′(p)h + R_f(h))(f′(p) + e_f(h))h.

Note that f′(p)h + R_f(h) → 0 as h → 0, so e_q(f′(p)h + R_f(h)) → 0 and the remainder of q ◦ f divided by h
tends to 0.

Theorem 2.1.3 (mean value theorems). Suppose f, g : [a, b] → R are continuous on [a, b] and differentiable on (a, b).
(a) (Rolle’s MVT) If f (a) = f (b), then there exists c ∈ (a, b) such that f ′ (c) = 0.

(b) (Lagrange’s MVT) There exists c ∈ (a, b) such that f (b) − f (a) = f ′ (c)(b − a).
(c) (Cauchy’s MVT) There exists c ∈ (a, b) such that (f (b) − f (a))g ′ (c) = (g(b) − g(a))f ′ (c).
Proof. (a) Since f is continuous, by 1.6.8.1, there exists c ∈ [a, b] such that f(c) ≥ f(x) for all x ∈ [a, b]. If
c ∈ (a, b), then for any x > c, we have

(f(x) − f(c))/(x − c) ≤ 0 =⇒ f′(c) ≤ 0.

If x < c, we have

(f(x) − f(c))/(x − c) ≥ 0 =⇒ f′(c) ≥ 0.

Thus, f′(c) = 0. The case that the minimum occurs in (a, b) is similar. Finally, if both the maximum
and the minimum occur in {a, b}, then f is constant since f(a) = f(b), and we can take any c ∈ (a, b).

(b) Define F (x) = f (x)(b − a) − (f (b) − f (a))x. Then F is continuous on [a, b] and differentiable on (a, b)
with F (a) = F (b). Thus there exists c ∈ (a, b) such that F ′ (c) = 0 ⇐⇒ f (b) − f (a) = f ′ (c)(b − a).
(c) Define G(x) = (g(b) − g(a))f (x) − (f (b) − f (a))g(x). Then G is continuous on [a, b] and differentiable
on (a, b) with G(a) = G(b). Thus there exists c ∈ (a, b) such that G′ (c) = 0 ⇐⇒ (f (b) − f (a))g ′ (c) =
(g(b) − g(a))f ′ (c).

Theorem 2.1.4 (L’Hospital Rules). Suppose f and g are real and differentiable in (a, b) and g′(x) ≠ 0 for
all x ∈ (a, b), where −∞ ≤ a < b ≤ ∞. Suppose

f′(x)/g′(x) → A as x → a.

If f(x), g(x) → 0 as x → a, or if g(x) → ∞ as x → a, then

f(x)/g(x) → A as x → a.
Proof. We first consider the case when −∞ ≤ A < ∞. Choose a real number q such that A < q, and then
choose r such that A < r < q. Since f′(x)/g′(x) → A as x → a and g′(x) ≠ 0 for all x ∈ (a, b), there exists a point
c ∈ (a, b) such that whenever x, y ∈ (a, c) with x ≠ y, we have

f′(x)/g′(x) < r, g(x) − g(y) ≠ 0

(the latter by 2.1.3(a), since g′ never vanishes). If a < x < y < c, then by 2.1.3(c), there exists t ∈ (x, y) such that

(f(x) − f(y))g′(t) = (g(x) − g(y))f′(t) =⇒ (f(x) − f(y))/(g(x) − g(y)) = f′(t)/g′(t) < r.

• Now suppose f(x), g(x) → 0 as x → a. Letting x → a, we see that
f(y)/g(y) ≤ r < q,
where y ∈ (a, c).
• Now suppose g(x) → ∞ as x → a. Fixing y, we can choose c_1 ∈ (a, y) such that g(x) > g(y) and g(x) > 0
if a < x < c_1. Multiplying by (g(x) − g(y))/g(x) > 0, we get
(f(x) − f(y))/(g(x) − g(y)) < r =⇒ f(x)/g(x) < r − r g(y)/g(x) + f(y)/g(x) (a < x < c_1).
Letting x → a, the right-hand side tends to r < q, so there exists c_2 ∈ (a, c_1) such that
f(x)/g(x) < q (a < x < c_2).

Hence, in either case, for any q > A, there exists c_2 such that f(x)/g(x) < q if x ∈ (a, c_2).
In the same manner, if −∞ < A ≤ ∞, and p is chosen so that p < A, we can find a point c_3 such that
p < f(x)/g(x) for any a < x < c_3. Therefore,
• If A ∈ R, then since p, q are both arbitrary, f(x)/g(x) → A as x → a.
• If A = ∞, then since p is an arbitrary real number, f(x)/g(x) → ∞ as x → a.
• If A = −∞, then since q is an arbitrary real number, f(x)/g(x) → −∞ as x → a.

Lemma 2.1.5 (local monotonicity from derivatives). Suppose f : (a, b) → R is differentiable. If f′(c) > 0
for some c ∈ (a, b), then there exist x ∈ (c, b) and y ∈ (a, c) such that f(x) > f(c) > f(y). Similarly, if
f′(c) < 0, then there exist x ∈ (c, b) and y ∈ (a, c) such that f(y) > f(c) > f(x).
Proof. Suppose f′(c) > 0. If f(x) ≤ f(c) for every x ∈ (c, b), then

(f(x) − f(c))/(x − c) ≤ 0 for all x ∈ (c, b) =⇒ f′(c) ≤ 0,

a contradiction. The rest are similar.
Proposition 2.1.6 (Darboux’s theorem). If f : (a, b) → R is differentiable, then its derivative f ′ (x) has the
intermediate value property.
Proof. Suppose c, d ∈ (a, b) with c < d. We assume f′(c) < f′(d); the other case is similar. Let f′(c) < η < f′(d).
Set F : [c, d] → R with F(x) = f(x) − ηx. Since F is continuous on the compact interval [c, d], there exists
ξ ∈ [c, d] at which F attains its minimum. Note that F′(c) = f′(c) − η < 0 and F′(d) = f′(d) − η > 0. Thus, by
2.1.5, the minimum is attained neither at c nor at d, so ξ ∈ (c, d) and thus
F′(ξ) = 0 ⇐⇒ f′(ξ) = η.
Definition 2.1.7. The derivative of f′, if it exists, is the second derivative of f. Higher derivatives are
defined inductively and written as f^{(r)}(x) := (f^{(r−1)})′(x). If f^{(r)}(x) exists then f is r-th differentiable at x.
If f^{(r)}(x) exists for all r and x then f is smooth. The zeroth derivative of f is f itself.
Remark. If f is r-th differentiable at x, then f^{(r−1)} is continuous at x.
Definition 2.1.8. Suppose n ∈ N. The class C^n(U) is the set of functions f, defined on U, which are n-th
differentiable with f^{(n)} continuous. Note that C^0 ⊋ C^1 ⊋ C^2 ⊋ ⋯ ⊋ ⋂_{n∈N} C^n = C^∞.

Definition 2.1.9. Suppose f : (a, b) → R is r-th differentiable at x. Then the r-th Taylor polynomial of
f at x, denoted by P(h), is the polynomial

P(h) = ∑_{j=0}^{r} (f^{(j)}(x)/j!) h^j.

Theorem 2.1.10 (Taylor approximation theorem). Suppose f : (a, b) → R is r-th differentiable at x. Let
P(h) be the r-th Taylor polynomial of f at x. Then
(a) R(h) = f(x + h) − P(h) is r-th order flat, that is, R(h)/h^r → 0 as h → 0.
(b) P(h) is the only polynomial of degree ≤ r with property (a).
(c) If, in addition, f is (r + 1)-th differentiable on (a, b), then there exists θ between x and x + h such that

R(h) = (f^{(r+1)}(θ)/(r + 1)!) h^{r+1}.

Proof. (a) Consider

(f(x + h) − P(h))/h^r

for h ≠ 0. Taking h → 0 and applying 2.1.4 r − 1 times (since P^{(j)}(0) = f^{(j)}(x) for 0 ≤ j ≤ r), the limit is
given by that of

(f^{(r−1)}(x + h) − P^{(r−1)}(h))/(r! h) = (1/r!) · (f^{(r−1)}(x + h) − (f^{(r−1)}(x) + f^{(r)}(x)h))/h → 0

as h → 0, by the definition of f^{(r)}(x).
(b) Suppose P(h) and Q(h) both satisfy (a). Then Q(h) − P(h) is r-th order flat. Write Q(h) − P(h) =
∑_{j=0}^{r} a_j h^j. If some a_j ≠ 0, let j be the smallest such index; then (Q(h) − P(h))/h^j → a_j ≠ 0 as h → 0,
contradicting flatness. Then a_0 = ⋯ = a_r = 0 ⇐⇒ Q(h) = P(h).

(c) Fix h > 0 (the case h < 0 is similar). Let

g(t) = f(x + t) − P(t) − (K/(r + 1)!) t^{r+1},

where K is the value such that

f(x + h) = P(h) + (K/(r + 1)!) h^{r+1}

holds. Then, by 2.1.3,

g(0) = g(h) = 0 =⇒ g′(ξ) = f′(x + ξ) − P′(ξ) − (K/r!) ξ^r = 0

for some ξ ∈ (0, h). Note that P′(h) is the (r − 1)-th Taylor polynomial of f′ at x. Thus, by induction on r (the
base case r = 0 holds by 2.1.3), there exists θ between x and x + ξ such that

f′(x + ξ) − P′(ξ) = (f^{(r+1)}(θ)/r!) ξ^r =⇒ K = f^{(r+1)}(θ).
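Part (a) of 2.1.10 can be checked numerically. The sketch below is only an illustration with our own choices: f = exp at x = 0 (all derivatives equal 1) and r = 3, so that R(h)/h^r behaves roughly like h/24.

```python
import math

def taylor_polynomial(derivatives, h):
    """Evaluate P(h) = sum_j derivatives[j] / j! * h**j."""
    return sum(d / math.factorial(j) * h**j for j, d in enumerate(derivatives))

r = 3
derivs = [1.0] * (r + 1)              # f^(j)(0) = 1 for f = exp, j = 0, ..., r
flatness = [abs(math.exp(h) - taylor_polynomial(derivs, h)) / h**r
            for h in (0.1, 0.01, 0.001)]
```

The entries of flatness decrease toward 0 as h shrinks, which is the r-th order flatness of R(h).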

Proposition 2.1.11 (real continuous injective function is strictly monotone). Suppose f : I → R
is a continuous injective function, where I ⊆ R is an interval. Then f is strictly monotone.
Proof. Suppose not. Then there exist a < b < c in I such that f(a) < f(b) > f(c) or f(a) > f(b) < f(c). For
the first case, pick u with max{f(a), f(c)} < u < f(b); by 1.9.4, there exist d ∈ [a, b) and e ∈ (b, c] such that
f(d) = u = f(e), a contradiction to injectivity. The second case is similar.
Proposition 2.1.12 (strictly monotone continuous function is a homeomorphism). Suppose f : (a, b) → R
is continuous and strictly monotone. Then f is a homeomorphism from (a, b) to f((a, b)).
Proof. Since f is a continuous bijection onto f((a, b)), it remains to show that f^{-1} is continuous. Let ϵ > 0 and
y ∈ f((a, b)) be given. Then there exists x ∈ (a, b) such that f(x) = y. Since (a, b) is open, there exists ϵ_0 > 0
such that B(x; ϵ_0) ⊆ (a, b). Take ϵ′ = (1/2) min{ϵ_0, ϵ}. Let δ = min{|y − f(x − ϵ′)|, |y − f(x + ϵ′)|}. Since f is
strictly monotone, δ > 0 and f^{-1}(B(y; δ)) ⊆ B(x; ϵ′) ⊆ B(x; ϵ).
Theorem 2.1.13 (change of variables in limit). Suppose g : I → R is continuous and injective and f : R → R
is a function, where I is an open interval. Then for any a ∈ I, limx→a (f ◦ g)(x) exists if and only if
limt→g(a) f (t) exists, and they are equal when they do exist.

Proof. Suppose lim_{x→a} f(g(x)) = L. We claim that lim_{t→g(a)} f(t) = L. Let ϵ > 0 be given. Then there exists δ > 0 such
that 0 < |x − a| < δ implies |f(g(x)) − L| < ϵ.
By 2.1.11 and 2.1.12, g is a homeomorphism from I to g(I). Thus, there exists γ > 0 such that |t − g(a)| <
γ implies |g⁻¹(t) − a| < δ. Since g⁻¹ is injective, t ≠ g(a) gives g⁻¹(t) ≠ a. Thus,

\[ 0 < |t - g(a)| < \gamma \implies 0 < |g^{-1}(t) - a| < \delta \implies |f(t) - L| < \epsilon \]

and f(t) → L as t → g(a).


The opposite direction is similar: let h = f ◦ g, then f (t) = h ◦ g −1 (t). Since g −1 is also a continuous
injection, we are done.
Theorem 2.1.14 (inverse function theorem of dimension 1). If f : (a, b) → (c, d) is a differentiable surjection
and f′(x) ≠ 0 for all x ∈ (a, b), then f is a homeomorphism. Its inverse is differentiable and its derivative at y ∈ (c, d) is
\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}. \]

Proof. By 2.1.6, f ′ is either always positive or always negative. Thus, by 2.1.3, f is strictly monotone. Then
by 2.1.12, f is a homeomorphism.
Suppose y ∈ (c, d) and f −1 (y) = x. Then,

\[ \frac{1}{f'(x)} = \lim_{t\to x} \frac{t-x}{f(t)-f(x)} = \lim_{t\to x} \frac{f^{-1}(f(t)) - x}{f(t)-f(x)}. \]
Let h(p) = (f⁻¹(p) − x)/(p − f(x)). Then, by 2.1.13,
\[ \frac{1}{f'(x)} = \lim_{t\to x} h(f(t)) = \lim_{p\to y} h(p) = \lim_{p\to y} \frac{f^{-1}(p) - f^{-1}(y)}{p-y}. \]

Definition 2.1.15. If a homeomorphism f and its inverse are both of class C r , r ≥ 1, then f is a C r -
diffeomorphism.
Corollary 2.1.15.1. If f : (a, b) → (c, d) is a homeomorphism of class C r , and for all x ∈ (a, b) we have
f ′ (x) ̸= 0, then f is a C r -diffeomorphism.
Proof. The case r = 1 is proven by 2.1.14. For r ≥ 2, induction, 2.1.2 and 2.1.14 complete the proof.

2.2 Riemann Integration


Definition 2.2.1. Let f : [a, b] → R be given. A partition pair consists of two finite sets of points
P, T ⊆ [a, b], where P = {x0 < x1 < · · · < xn} and T = {t1 < t2 < · · · < tn}, such that a = x0 ≤ t1 ≤ x1 ≤
· · · ≤ tn ≤ xn = b. The Riemann sum corresponding to f, P, T is
\[ R(f; P, T) = \sum_{j=1}^{n} f(t_j)\,\Delta x_j, \]
where ∆xj = xj − xj−1. The mesh of a partition P is maxj ∆xj, denoted by ∥P∥. A real number I is the
Riemann integral of f over [a, b] if it satisfies the following condition: for any ϵ > 0, there exists δ > 0
such that for any partition pair P, T with ∥P∥ < δ, we have |R(f; P, T) − I| < ϵ. If such an I exists, it is unique,
as I is the limit
\[ \lim_{n\to\infty} \sum_{j=1}^{n} f\!\left(a + j\,\frac{b-a}{n}\right) \frac{b-a}{n}. \]
In this case, we say that f is Riemann integrable with Riemann integral I and denote I by $\int_a^b f(x)\,dx$.
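The definition can be tried out numerically. The following is a minimal sketch, not part of the original notes: the helper names `riemann_sum` and `uniform_pair` and the test function f(x) = x² on [0, 1] are assumptions chosen for illustration. It evaluates R(f; P, T) for uniform partitions with right endpoints as sample points, which is exactly the limit displayed above.

```python
def riemann_sum(f, xs, ts):
    """R(f; P, T) for partition points xs = [x0, ..., xn] and sample points ts = [t1, ..., tn]."""
    assert len(xs) == len(ts) + 1
    return sum(f(t) * (xs[j + 1] - xs[j]) for j, t in enumerate(ts))


def uniform_pair(a, b, n):
    """Uniform partition of [a, b] into n intervals, sampled at right endpoints."""
    xs = [a + j * (b - a) / n for j in range(n + 1)]
    return xs, xs[1:]


# Hypothetical test function: f(x) = x^2 on [0, 1], whose integral is 1/3.
square = lambda x: x * x
for n in (10, 100, 1000):
    xs, ts = uniform_pair(0.0, 1.0, n)
    print(n, riemann_sum(square, xs, ts))
```

As the mesh 1/n shrinks, the printed sums approach 1/3, in line with the ϵ-δ condition in the definition.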
Proposition 2.2.2 (Riemann integrable function is bounded). Suppose f : [a, b] → R is Riemann integrable.
Then f is bounded.

Proof. Suppose f is unbounded. There exists δ > 0 such that for every partition pair P, T with ∥P∥ < δ, one has
\[ \left| R(f; P, T) - \int_a^b f(x)\,dx \right| < 1. \]
Fix such a pair, say T = {t1 < · · · < tn} and P = {x0 < x1 < · · · < xn}. There exists [xk−1, xk] on which f is not bounded.
Thus, we can take t′k ∈ [xk−1, xk] such that
\[ |f(t'_k) - f(t_k)|\,\Delta x_k > 2. \]
Let T′ be the same as T but with tk changed to t′k. Then |R(f; P, T) − R(f; P, T′)| > 2, although both Riemann sums lie within 1 of the integral, a contradiction.
Definition 2.2.3. Suppose f : [a, b] → R is bounded. Let P = {a = x0 < x1 < · · · < xn = b} be a partition.
Then the lower sum and upper sum of f are
\[ L(f; P) = \sum_{j=1}^{n} m_j\,\Delta x_j, \qquad U(f; P) = \sum_{j=1}^{n} M_j\,\Delta x_j, \]
where
\[ m_j = \inf_{[x_{j-1},\,x_j]} f, \qquad M_j = \sup_{[x_{j-1},\,x_j]} f. \]
The lower integral $\underline{\int_a^b} f(x)\,dx$ and the upper integral $\overline{\int_a^b} f(x)\,dx$ are the supremum of all lower
sums and the infimum of all upper sums, respectively. If the lower and upper integrals coincide, then we say f is Darboux
integrable and the common value is the Darboux integral of f on [a, b].
Remark. Notice that a bounded function f : [a, b] → R is Darboux integrable if and only if for any ϵ > 0,
there exists a partition P such that U(f; P) − L(f; P) < ϵ. This follows from the fact that whenever a
partition P′ ⊇ P (P′ refines P), we have
\[ L(f; P) \le L(f; P') \le \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx \le U(f; P') \le U(f; P). \]
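The refinement chain in the remark can be checked exactly on a small case. This is an illustrative sketch only: it assumes an increasing function (so that the infimum and supremum on each subinterval sit at the endpoints), and the helper names and the choice f(x) = x² on [0, 1] are not from the original notes.

```python
from fractions import Fraction


def darboux_sums_increasing(f, xs):
    """(L(f; P), U(f; P)) for an increasing f: inf at the left endpoint, sup at the right."""
    lower = sum(f(xs[j - 1]) * (xs[j] - xs[j - 1]) for j in range(1, len(xs)))
    upper = sum(f(xs[j]) * (xs[j] - xs[j - 1]) for j in range(1, len(xs)))
    return lower, upper


def uniform_partition(a, b, n):
    """Uniform partition of [a, b] with exact rational arithmetic."""
    return [a + Fraction(j, n) * (b - a) for j in range(n + 1)]


f = lambda x: x * x  # increasing on [0, 1]; its integral there is 1/3
P = uniform_partition(Fraction(0), Fraction(1), 4)
P_refined = uniform_partition(Fraction(0), Fraction(1), 8)  # contains every point of P
L1, U1 = darboux_sums_increasing(f, P)
L2, U2 = darboux_sums_increasing(f, P_refined)
print(L1, L2, U2, U1)
```

Refining the partition raises the lower sum and lowers the upper sum, and the gap U − L halves when the mesh halves, which is the quantity the integrability criterion asks to be small.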

Theorem 2.2.4 (equivalence of Darboux and Riemann integral). Riemann integrability is equivalent to
Darboux integrability, and when a function is integrable, its upper, lower, and Riemann integrals are equal.
Proof. Let f : [a, b] → R. Suppose f is Riemann integrable. Then by 2.2.2, f is bounded. Let ϵ > 0 be given.
Since f is Riemann integrable, there exists δ > 0 such that for any partition pair P, T with ∥P∥ < δ, one has
\[ \left| R(f; P, T) - \int_a^b f(x)\,dx \right| < \epsilon. \]
Let P = {x0 < x1 < · · · < xn} be such a partition. For each j ∈ {1, · · · , n}, take t′j, t′′j ∈ [xj−1, xj]
such that
\[ M_j - f(t'_j) < \epsilon, \qquad f(t''_j) - m_j < \epsilon. \]
Let T′ = {t′1, · · · , t′n} and T′′ = {t′′1, · · · , t′′n}. Then
\[ U(f; P) - R(f; P, T') < \epsilon(b-a), \qquad R(f; P, T'') - L(f; P) < \epsilon(b-a). \]
Let R′ = R(f; P, T′), R′′ = R(f; P, T′′) and $I = \int_a^b f(x)\,dx$. Then
\[ U(f; P) - L(f; P) = (U(f; P) - R') + (R' - I) + (I - R'') + (R'' - L(f; P)) < 2\epsilon((b-a)+1). \]
Since ϵ is arbitrary, f is Darboux integrable by the remark after 2.2.3. The same estimates show that U(f; P) and L(f; P) are within ϵ((b − a) + 1) of I, so the Darboux integral equals I.

Conversely, suppose f is Darboux integrable with Darboux integral I. Then |f| ≤ M for some M > 0. Let ϵ > 0 be given. Then there
exists a partition P1 such that
\[ U_1 - L_1 < \epsilon, \]
where L1 = L(f; P1) and U1 = U(f; P1). Let
\[ \delta = \frac{\epsilon}{n_1 M}, \]
where n1 is the number of P1-intervals. Consider any partition pair P, T with ∥P∥ < δ. We claim
that the Riemann sum R(f; P, T) differs from the Darboux integral I by
less than 5ϵ. Consider P∗ = P ∪ P1. Since P∗ is a refinement of P1, we have
\[ L_1 \le L^* \le U^* \le U_1, \]
where L∗ = L(f; P∗) and U∗ = U(f; P∗). We claim that 0 ≤ U − U∗ ≤ 2ϵ, where U = U(f; P). Since P∗
refines P, each P∗-interval I∗j = [x∗j−1, x∗j] is contained in some P-interval Ii = [xi−1, xi]. Except for at most
n1 exceptional intervals I∗j, we have I∗j = Ii and M∗j = Mi. Set
\[ \mathcal{I} = \{ i : I_i \text{ contains exceptional subintervals} \} \]
and
\[ \mathcal{I}(i) = \{ j : I^*_j \text{ is an exceptional subinterval of } I_i \}. \]
Therefore,
\[ 0 \le U - U^* = \sum_{i\in\mathcal{I}} M_i\,\Delta x_i - \sum_{i\in\mathcal{I}} \sum_{j\in\mathcal{I}(i)} M^*_j\,\Delta x^*_j = \sum_{i\in\mathcal{I}} \sum_{j\in\mathcal{I}(i)} (M_i - M^*_j)\,\Delta x^*_j. \]
Since 0 ≤ Mi − M∗j ≤ 2M, and since there are at most n1 exceptional intervals, each of length ≤ δ, this
implies
\[ 0 \le U - U^* \le 2M n_1 \delta = 2\epsilon. \]
Similarly, 0 ≤ L∗ − L ≤ 2ϵ, where L = L(f; P). Thus,
\[ U - L = (U - U^*) + (U^* - L^*) + (L^* - L) < 5\epsilon. \]
Since R(f; P, T) and I both lie in [L, U], we get |R(f; P, T) − I| < 5ϵ. Therefore, f is Riemann integrable and its integral
equals I.
Fix I = [a, b]. Let R(I) be the space of Riemann-integrable functions on I.
Proposition 2.2.5 (linearity of integral). R(I) is a vector space and $f \mapsto \int_a^b f(x)\,dx$ is a linear map from
R(I) to R.
Proof. For additivity, we prove a slightly stronger statement: the lower integral is superadditive and the upper
integral is subadditive. Suppose f, g : [a, b] → R are bounded. Then for any partition P, we have
\[ U(f+g; P) \le U(f; P) + U(g; P) \]
as the supremum is subadditive. Then the result follows by taking the infimum over P (passing to common refinements). The lower integral is
similar.
Suppose f ∈ R(I). Let ϵ > 0 and c ∈ R\{0} be given. Then there exists δ > 0 such that for any partition pair P, T with ∥P∥ < δ,
\[ \left| R(f; P, T) - \int_a^b f(x)\,dx \right| < \frac{\epsilon}{|c|} \iff \left| cR(f; P, T) - c\int_a^b f(x)\,dx \right| < \epsilon. \]
Since cR(f; P, T) = R(cf; P, T), we get cf ∈ R(I) and $\int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx$. The case c = 0 is trivial.
Proposition 2.2.6 (monotonicity of integral). Suppose f, g ∈ R(I). If f ≤ g, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.
Proof. Let h = g − f. By 2.2.5, we only need to show that $\int_a^b h(x)\,dx \ge 0$. Since h ≥ 0, for any partition P,
U(h; P) ≥ 0 and thus $\int_a^b h(x)\,dx \ge 0$.
Proposition 2.2.7 (sandwich principle). A function f : I → R belongs to R(I) if for any ϵ > 0, there exist g, h ∈ R(I) such that
g ≤ f ≤ h and $\int_a^b h(x)\,dx - \int_a^b g(x)\,dx < \epsilon$.

Proof. Let ϵ > 0 be given. By the given condition, there exist g, h ∈ R(I) such that g ≤ f ≤ h and
$\int_a^b h(x)\,dx - \int_a^b g(x)\,dx < \epsilon$. By the integrability of g and h, there exists a partition P such that
\[ \int_a^b g(x)\,dx - \epsilon < L(g; P) \le L(f; P) \le U(f; P) \le U(h; P) < \int_a^b h(x)\,dx + \epsilon. \]
Thus U(f; P) − L(f; P) < 3ϵ, and f is integrable.


Example. The characteristic function of a subset E ⊆ R, χE, is defined as
\[ \chi_E(x) = \begin{cases} 1, & x \in E, \\ 0, & x \notin E. \end{cases} \]
The integral of χ[a,b] is b − a. A step function is a linear combination of characteristic functions of intervals,
which is clearly integrable by 2.2.5. Note that by 2.2.7 and 2.2.6, χ(a,b), χ(a,b] and χ[a,b) are all integrable
with integral b − a.
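Example. Thomae’s function T : [0, 1] → R is defined by
\[ T(x) = \begin{cases} \frac{1}{q}, & \text{if } x = \frac{p}{q} \in \mathbb{Q}\cap[0,1] \text{ with } \gcd(p,q)=1; \\ 0, & \text{otherwise.} \end{cases} \]
For any ϵ > 0, we have
\[ 0 \le T(x) \le \epsilon\,\chi_{[0,1]}(x) + t(x), \]
where
\[ t(x) = \begin{cases} \frac{1}{q}, & \text{if } x = \frac{p}{q} \in \mathbb{Q}\cap[0,1] \text{ with } \gcd(p,q)=1 \text{ and } \frac{1}{q} \ge \epsilon; \\ 0, & \text{otherwise.} \end{cases} \]
Then t is a step function, as t(x) ≠ 0 for only finitely many x (at most (⌊1/ϵ⌋ + 1)² values), so its integral is 0. Thus, by 2.2.7 and 2.2.6 again, T is
integrable with integral 0.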

Definition 2.2.8. Suppose (an) is a sequence in R̄ := [−∞, ∞]. Then we say $\sum a_n$ converges if its partial
sums $\sum_{j=1}^{n} a_j \to s$ for some s ∈ R̄.

Remark. For any a ∈ R and c+ > 0, c− < 0, we define a + ∞ = ∞, c+ · ∞ = ∞, c− · ∞ = −∞. Also,


∞ + ∞ = ∞, ∞ · ∞ = ∞ and 0 · ∞ = 0.
Proposition 2.2.9 (supremum formulation of series convergence). Suppose (an) is a sequence of nonnegative
numbers (possibly infinite). Then
\[ \sum_j a_j = \sup_{F\subseteq\mathbb{N},\ F \text{ finite}}\ \sum_{j\in F} a_j. \tag{$*$} \]

Proof. If aj = ∞ for some j, then both sides of (∗) are ∞. Now suppose aj < ∞ for all j. Since
\[ \sum_{j=1}^{n} a_j \le \sup_{F\subseteq\mathbb{N},\ F \text{ finite}}\ \sum_{j\in F} a_j \]
for all n ∈ N, we have
\[ \sum_j a_j \le \sup_{F\subseteq\mathbb{N},\ F \text{ finite}}\ \sum_{j\in F} a_j, \]
as the partial sums are increasing. On the other hand, for any finite set F ⊆ N, letting M = max F, we have
\[ \sum_j a_j \ge \sum_{j=1}^{M} a_j \ge \sum_{j\in F} a_j. \]
Then the reverse inequality follows by taking the supremum over F.

Theorem 2.2.10 (Tonelli’s theorem for nonnegative sequences). Suppose (aij) is a double-indexed, nonnegative
sequence (possibly infinite). Then
\[ \sum_{i,j} a_{ij} = \sum_i \sum_j a_{ij} = \sum_j \sum_i a_{ij}, \]
where $\sum_{i,j} a_{ij}$ is defined as the right-hand side of (∗), with the supremum taken over finite subsets of N².

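Proof. Let F ⊆ N² be a finite set. Then there exists N such that F ⊆ [N] × [N], where [N] = {1, · · · , N}.
Thus,
\[ \sum_{(i,j)\in F} a_{ij} \le \sum_{i=1}^{N}\sum_{j=1}^{N} a_{ij} \le \sum_i \sum_j a_{ij}. \]
Therefore, $\sum_{i,j} a_{ij} \le \sum_i \sum_j a_{ij}$.
For the converse, let n ∈ N be given. We aim to show that
\[ \sum_{i,j} a_{ij} \ge \sum_{i=1}^{n} \sum_j a_{ij}, \]
which follows by letting m → ∞ in the inequality, valid for any m ∈ N,
\[ \sum_{i,j} a_{ij} \ge \sum_{i\in[n],\ j\in[m]} a_{ij} = \sum_{i=1}^{n}\sum_{j=1}^{m} a_{ij}. \]
Since n is arbitrary, we are done. Exchanging the roles of i and j gives the second equality.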


Proposition 2.2.11 (algebra of zero sets). Let Z be the collection of zero sets. Then

(a) If A ∈ Z, then for any B ⊆ A, B ∈ Z.


(b) If F is an at most countable set, then F ∈ Z.
(c) Z is closed under countable union.
(d) C ∈ Z, where C is the standard Cantor set.

Proof. (a) Trivial.

(b) Suppose F = {pn}. Let ϵ > 0 be given. Then we can cover each pn by an interval of length ϵ/2^n. Since
$\sum_{n=1}^{\infty} \epsilon/2^n = \epsilon$, F is a zero set.

(c) Suppose {Ai} is a sequence of zero sets. Let ϵ > 0 be given. Then for each i, there exists a sequence of
intervals {Iij} such that $A_i \subseteq \bigcup_j I_{ij}$ with $\sum_j |I_{ij}| < \epsilon/2^i$. Let $A = \bigcup_i A_i$. Then A can be covered by
$\bigcup_{i,j} I_{ij}$, with total length
\[ \sum_{i,j} |I_{ij}| \overset{2.2.10}{=} \sum_i \sum_j |I_{ij}| < \sum_i \frac{\epsilon}{2^i} = \epsilon. \]

(d) Let ϵ > 0 be given. Take n such that 2^n/3^n < ϵ. Since we can cover C^n by 2^n intervals with total length 2^n/3^n, C ∈ Z
as C ⊆ C^n.
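Part (d) can be made concrete with exact arithmetic. The sketch below is an illustration under stated assumptions: `cantor_stage` and `stage_for` are hypothetical helper names, and the cover uses the closed intervals of the n-th construction stage C^n (a definition of zero sets that insists on open intervals would need each one enlarged slightly).

```python
from fractions import Fraction


def cantor_stage(n):
    """Closed intervals making up C^n, the n-th stage of the middle-thirds construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep the left third
            next_intervals.append((right - third, right))  # keep the right third
        intervals = next_intervals
    return intervals


def stage_for(eps):
    """Smallest n with (2/3)^n < eps."""
    n = 0
    while Fraction(2, 3) ** n >= eps:
        n += 1
    return n


eps = Fraction(1, 100)
n = stage_for(eps)
cover = cantor_stage(n)
total = sum(right - left for left, right in cover)
print(n, len(cover), float(total))
```

The cover has 2^n intervals of total length (2/3)^n, so choosing n large enough pushes the total length below any prescribed ϵ.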

Definition 2.2.12. Let f : M → N be a function. Then the oscillation of f at x, oscx(f), is defined by
\[ \operatorname{osc}_x(f) = \inf_{\delta > 0} \operatorname{diam}\bigl(f(B(x;\delta))\bigr). \]

Proposition 2.2.13 (oscillation and continuity). Suppose f : M → N is a function. Then f is continuous
at x if and only if oscx(f) = 0.

Proof. Suppose f is continuous at x. Then for any ϵ > 0, there exists δ > 0 such that f(B(x; δ)) ⊆ B(f(x); ϵ),
thus diam f(B(x; δ)) ≤ diam(B(f(x); ϵ)) ≤ 2ϵ. Since ϵ is arbitrary, oscx(f) = 0.
Conversely, suppose f is not continuous at x. Then there exists ϵ0 > 0 such that for any δ > 0,
f(B(x; δ)) ̸⊆ B(f(x); ϵ0). Hence diam f(B(x; δ)) ≥ ϵ0 for every δ > 0, and oscx(f) ≥ ϵ0.
Remark. If I is any interval containing x in its interior then

MI − mI ≥ oscx (f ).

Theorem 2.2.14 (Riemann-Lebesgue theorem). A function f : [a, b] → R is Riemann integrable if and only
if it is bounded and its set of discontinuity points is a zero set.
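Proof. Let f : [a, b] → [−M, M] and let D be the set of discontinuities of f. By 2.2.13,
\[ D = \bigcup_{k=1}^{\infty} D_k, \]
where
\[ D_k = \left\{ x \in [a,b] : \operatorname{osc}_x(f) \ge \frac{1}{k} \right\}. \]
By 2.2.11, to show that D is a zero set it suffices to show that each Dk is a zero set.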
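Assume that f is Riemann integrable and let ϵ > 0 and k ∈ N be given. By 2.2.4, there exists a partition
P such that
\[ U - L = \sum_i (M_i - m_i)\,\Delta x_i < \frac{\epsilon}{k}. \]
We say that a P-interval is “bad” if it contains a point of Dk in its interior. On a bad interval, Mi − mi ≥ 1/k by the remark after 2.2.13. Then
\[ \frac{\epsilon}{k} > U - L = \sum_i (M_i - m_i)\,\Delta x_i \ge \sum_{\text{bad}} (M_i - m_i)\,\Delta x_i \ge \frac{1}{k}\sum_{\text{bad}} \Delta x_i. \]
Thus, the total length of the bad intervals is less than ϵ. Every point of Dk lies either in the interior of a bad interval or in P, and P ∩ Dk is finite since P is finite. Therefore, Dk is a
zero set.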
Conversely, assume that the discontinuity set D of f : [a, b] → [−M, M] is a zero set.
By 2.2.4, it suffices to show that for any ϵ > 0, there exists a partition P with U(f; P) − L(f; P) < ϵ. Let ϵ > 0
be given. Take k ∈ N such that 1/k < ϵ. Since D is a zero set, by 2.2.11, Dk ⊆ D is a zero set. Hence, there exists
a countable covering of Dk by open intervals Jj = (aj, bj) with total length
\[ \sum_j (b_j - a_j) < \epsilon. \]
On the other hand, for each x ∈ [a, b]\Dk, there is an open interval Ix containing x such that
\[ \sup_{I_x} f - \inf_{I_x} f < \frac{1}{k}. \]
These intervals Ix cover the good set [a, b]\Dk. Together, the intervals Jj and Ix form an open covering of
[a, b]. Since [a, b] is compact, this covering has a Lebesgue number λ > 0. Let P = {x0, · · · , xn} be a partition
with ∥P∥ < λ. Each P-interval Ii is contained wholly in some Ix or wholly in some Jj. Set
\[ J = \{ i \in \{1, \dots, n\} : I_i \text{ is contained in some bad interval } J_j \}. \]
For some finite m, J1 ∪ · · · ∪ Jm contains those P-intervals Ii with i ∈ J, so their total length is less than ϵ. Then
\[ U - L = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i = \sum_{i\in J} (M_i - m_i)\,\Delta x_i + \sum_{i\notin J} (M_i - m_i)\,\Delta x_i \le 2M \sum_{i\in J} \Delta x_i + \sum_{i\notin J} \frac{\Delta x_i}{k} \le 2M\epsilon + \frac{b-a}{k} < \epsilon(2M + (b-a)). \]

Remark. We say that f satisfies a property P almost everywhere if the set of points that f does not
satisfy P is a zero set.
Definition 2.2.15. A real-valued function f : M → R is piecewise continuous if f is bounded and the
set of discontinuities of f is finite.
Proposition 2.2.16 (integrability of common functions).
(a) Every continuous function and every piecewise continuous function is Riemann integrable.
(b) The characteristic function of S ⊆ [a, b] is Riemann integrable if and only if the boundary of S, ∂S,
defined by $\partial S = \overline{S} \cap \overline{\mathbb{R}\setminus S}$, is a zero set.
(c) Every monotone function is integrable.
(d) The product of Riemann integrable functions is integrable.
(e) If f : [a, b] → [c, d] is integrable and ϕ : [c, d] → R is continuous, then ϕ ◦ f is Riemann integrable.
(f) If f ∈ R(I), then |f| ∈ R(I).
(g) If a Riemann integrable function f : [a, b] → [0, M] has integral zero, then f = 0 almost everywhere.
(h) If f is Riemann integrable and ψ is a homeomorphism whose inverse satisfies a Lipschitz condition,
that is, |ψ⁻¹(x) − ψ⁻¹(y)| ≤ L|x − y|, then f ◦ ψ is Riemann integrable.
(i) If c ∈ (a, b) and f ∈ R(I), then its restrictions to [a, c] and [c, b] are Riemann integrable and
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

Proof.
(a) Trivial by considering the sets of their discontinuities.
(b) By 1.3.6, the discontinuity set of χS is ∂S.
(c) Notice that for each x ∈ (a, b), both the left and right limits exist. Without loss of generality, suppose f
is monotone increasing. Then the discontinuity set D of f is contained in the set {x ∈ (a, b) : f(x−) <
f(x+)} ∪ {a, b}. Since for x, y ∈ (a, b) with x < y, f(x+) ≤ f(y−), the gap intervals of the discontinuities
Ix = (f(x−), f(x+)) are disjoint open intervals. Each contains a rational number, so there are at most countably many of them, and D is therefore a zero
set by 2.2.11.
(d) Suppose f, g ∈ R(I). Then D(f ), D(g) are zero sets. Since D(f g) ⊆ D(f ) ∪ D(g), by 2.2.11, D(f g) is
a zero set and therefore f g is integrable.
(e) Since D(ϕ ◦ f ) ⊆ D(f ), ϕ ◦ f is integrable.
(f) |f | is the composition of absolute value and f , which is integrable by (e).
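(g) Suppose $\int_a^b f = 0$ but c := f(x0) > 0 at some continuity point x0 of f. Then there exists δ > 0 such
that f(x) ≥ c/2 for all x ∈ I := [x0 − δ, x0 + δ] ∩ [a, b]. Therefore, f(x) ≥ (c/2)χI(x) for all x ∈ [a, b] and thus
$\int_a^b f \ge \frac{c}{2}|I| > 0$ by 2.2.6, a contradiction. Hence f vanishes at each of its continuity points, and its discontinuity set is a zero set by 2.2.14.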
(h) More precisely, we assume that f : [a, b] → R is Riemann integrable and ψ bijects [c, d] onto [a, b] with
ψ(c) = a, ψ(d) = b (we may assume that ψ is strictly monotone by 2.1.11). We claim that f ◦ ψ is
continuous at x0 ∈ [c, d] if and only if f is continuous at ψ(x0). Indeed, suppose f ◦ ψ is continuous
at x0. Then for a given ϵ > 0, there exists δ > 0 such that for any x ∈ [c, d] with |x0 − x| < δ, we have
|f(ψ(x0)) − f(ψ(x))| < ϵ. Since ψ⁻¹ is continuous at ψ(x0), there exists η > 0 such that for any y ∈ [a, b]
with |ψ(x0) − y| < η, we have |x0 − ψ⁻¹(y)| < δ, which implies that |f(ψ(x0)) − f(y)| < ϵ. Conversely,
if f is continuous at ψ(x0), then for a given ϵ > 0, there exists δ > 0 such that for any y ∈ [a, b] with
|ψ(x0) − y| < δ, we have |f(ψ(x0)) − f(y)| < ϵ. Then since ψ is continuous at x0, there exists η > 0 such
that for any x ∈ [c, d] with |x0 − x| < η, we have |ψ(x0) − ψ(x)| < δ and thus |f(ψ(x0)) − f(ψ(x))| < ϵ.
Let D be the set of discontinuity points of f. Then D′ = ψ⁻¹(D) is the set of discontinuity points of
f ◦ ψ. Let ϵ > 0 be given. There is an open covering of D by intervals (ai, bi) whose total length is less than
ϵ/L. The homeomorphic intervals (a′i, b′i) = ψ⁻¹((ai, bi)) cover D′ and have total length
\[ \sum_i (b'_i - a'_i) \le \sum_i L(b_i - a_i) < \epsilon. \]
Therefore D′ is a zero set and f ◦ ψ is integrable.


(i) Since f = f χ[a,c] + f χ(c,b], the claim follows from (d) and 2.2.5.

Remark. For f ∈ R(I), we let
\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx. \]

Corollary 2.2.16.1. Suppose f, g ∈ R(I). Then max(f, g) and min(f, g) are integrable.
Proof. Since
\[ \max(f,g) = \frac{f+g+|f-g|}{2}, \qquad \min(f,g) = \frac{f+g-|f-g|}{2}, \]
by 2.2.5 and 2.2.16, we are done.

Theorem 2.2.17 (fundamental theorem of calculus). If f ∈ R(I), then its indefinite
integral
\[ F(x) = \int_a^x f(t)\,dt \]
is a continuous function of x. The derivative of F exists and equals f(x) at every point x at which f is
continuous.
Proof. Since f is Riemann integrable, f is bounded by some M > 0. Then
\[ |F(x) - F(y)| = \left| \int_x^y f(t)\,dt \right| \le M|x-y|, \]
so F is Lipschitz and thus continuous.

Suppose f is continuous at x0. Then for any ϵ > 0, there exists δ > 0 such that for any x ∈ [a, b] with
|x − x0| < δ, we have |f(x) − f(x0)| < ϵ. Then for h ≠ 0 with |h| < δ,
\[ \left| \frac{F(x_0+h) - F(x_0)}{h} - f(x_0) \right| = \left| \frac{1}{h}\int_{x_0}^{x_0+h} \bigl(f(x) - f(x_0)\bigr)\,dx \right| \le \epsilon. \]
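The second half of the theorem can be observed numerically. The following is a rough sketch, not a proof: the indefinite integral is approximated by a midpoint Riemann sum, and the integrand cos, the base point x0 = 1 and the step h are assumptions picked for illustration.

```python
import math


def integral(f, a, b, n=20000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h


f = math.cos                       # hypothetical continuous integrand
F = lambda x: integral(f, 0.0, x)  # numerical stand-in for the indefinite integral

x0, h = 1.0, 1e-3
difference_quotient = (F(x0 + h) - F(x0)) / h
print(difference_quotient, f(x0))
```

The difference quotient of F at x0 lands close to f(x0), and the remaining gap is of the order of h, as the estimate in the proof suggests for a Lipschitz integrand.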

Theorem 2.2.18 (antiderivative theorem). An antiderivative of a Riemann integrable function, if it exists,
differs from the indefinite integral by a constant.
Proof. We assume that f ∈ R(I) and G is an antiderivative of f. Then we claim that
\[ G(x) = \int_a^x f(t)\,dt + G(a). \]
Partition [a, x] as
\[ a = x_0 < x_1 < \dots < x_n = x, \]
and choose tk ∈ (xk−1, xk) such that
\[ G(x_k) - G(x_{k-1}) = G'(t_k)\,\Delta x_k; \]
such tk exist by 2.1.3. Then
\[ G(x) - G(a) = \sum_{k=1}^{n} \bigl(G(x_k) - G(x_{k-1})\bigr) = \sum_{k=1}^{n} f(t_k)\,\Delta x_k. \]
Given ϵ > 0, since f is integrable on [a, x], there exists δ > 0 such that for any partition pair P, T with
∥P∥ < δ, one has
\[ \left| R(f; P, T) - \int_a^x f(t)\,dt \right| < \epsilon. \]
Choosing the partition above with mesh less than δ, we get
\[ \left| G(x) - G(a) - \int_a^x f(t)\,dt \right| < \epsilon. \]
Since ϵ > 0 is arbitrary, we are done.
Corollary 2.2.18.1 (integration by parts). If f, g : [a, b] → R are differentiable and f′, g′ ∈ R(I), then
\[ \int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx. \]
Proof. By 2.1.2, (fg)′ = f′g + fg′. Thus, fg is an antiderivative of f′g + fg′, and by 2.2.18, fg differs from the
indefinite integral of f′g + fg′ by a constant. Then for all t ∈ [a, b], we have
\[ f(t)g(t) - f(a)g(a) = \int_a^t f'(x)g(x)\,dx + \int_a^t f(x)g'(x)\,dx. \]
Setting t = b, we are done.


Theorem 2.2.19 (integration by substitution). If f ∈ R(I) and g : [c, d] → [a, b] is a C¹-diffeomorphism
with g′ > 0, then
\[ \int_a^b f(y)\,dy = \int_c^d f(g(x))\,g'(x)\,dx. \]
Proof. The first integral exists by assumption. By 2.2.16, f ◦ g and g′ are integrable on [c, d], and thus so is (f ◦ g)g′. Let
P = {x0 < x1 < · · · < xn} be a partition of the interval [c, d] and choose tk ∈ (xk−1, xk) such that
\[ g(x_k) - g(x_{k-1}) = g'(t_k)\,\Delta x_k; \]
such tk exist by 2.1.3. Since g is a diffeomorphism, we have a partition Q = {y0 < y1 < · · · < yn} of the
interval [a, b], where yk = g(xk), and ∥P∥ → 0 implies ∥Q∥ → 0. Set sk = g(tk). This gives two equal
Riemann sums
\[ \sum_{k=1}^{n} f(s_k)\,\Delta y_k = \sum_{k=1}^{n} f(g(t_k))\,g'(t_k)\,\Delta x_k, \]
which converge to $\int_a^b f(y)\,dy$ and $\int_c^d f(g(t))\,g'(t)\,dt$, respectively, as ∥P∥ → 0.
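A quick numerical sanity check of the substitution formula, as a sketch only. The choices f(y) = 1/y and g(x) = x² on [c, d] = [1, 2] (so that [a, b] = [1, 4] and g′ > 0) are assumptions made for illustration, and `midpoint` is a hypothetical helper.

```python
import math


def midpoint(f, a, b, n=10000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h


f = lambda y: 1.0 / y        # integrand on [a, b] = [1, 4]
g = lambda x: x * x          # C^1 diffeomorphism from [1, 2] onto [1, 4]
g_prime = lambda x: 2.0 * x  # positive on [1, 2]
c, d = 1.0, 2.0

lhs = midpoint(f, g(c), g(d))                           # integral of f(y) dy over [a, b]
rhs = midpoint(lambda x: f(g(x)) * g_prime(x), c, d)    # integral of f(g(x)) g'(x) dx over [c, d]
print(lhs, rhs, math.log(4.0))
```

Both sums approximate log 4, the common value of the two integrals in this example.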
Theorem 2.2.20 (Cauchy-Schwarz inequality). Suppose f, g ∈ R(I). Then
\[ \left| \int_a^b f(x)g(x)\,dx \right| \le \sqrt{ \int_a^b f^2(x)\,dx \int_a^b g^2(x)\,dx }. \]
Proof. By 2.2.16, the integrals on both sides exist. Define h : R → R by
\[ h(t) = \int_a^b (f(x) - tg(x))^2\,dx = \int_a^b f^2(x)\,dx - 2t\int_a^b f(x)g(x)\,dx + t^2\int_a^b g^2(x)\,dx, \]
using 2.2.5. Since h(t) ≥ 0 for all t and h is a polynomial of degree at most 2, its discriminant is nonpositive (if $\int_a^b g^2 = 0$, then h is affine and nonnegative, which forces $\int_a^b fg = 0$). Thus,
\[ \left( \int_a^b f(x)g(x)\,dx \right)^2 \le \int_a^b f^2(x)\,dx \int_a^b g^2(x)\,dx. \]

Definition 2.2.21. Assume that f : [a, b) → R is Riemann integrable when restricted to any closed subinterval
[a, c] ⊆ [a, b). If the limit of $\int_a^c f(x)\,dx$ exists as c → b, then we define
\[ \int_a^b f(x)\,dx = \lim_{c\to b} \int_a^c f(x)\,dx. \]
In order for the two-sided improper integral to exist for a function f : (a, b) → R, it is natural to fix some
point m ∈ (a, b) and require that both improper integrals $\int_a^m f(x)\,dx$ and $\int_m^b f(x)\,dx$ exist. Their sum is the
improper integral $\int_a^b f(x)\,dx$.
Proposition 2.2.22 (Barbaletian lemmas). Suppose f : [0, ∞) → [0, ∞) is a function.
(a) If f is uniformly continuous and
\[ \int_0^\infty f(x)\,dx < \infty, \]
then f(x) → 0 as x → ∞.
(b) If f is decreasing and
\[ \int_0^\infty f(x)\,dx < \infty, \]
then f(x) → 0 as x → ∞.
Proof.
(a) Suppose f(x) ̸→ 0. Then there exist ϵ0 > 0 and an increasing sequence (xn) such that xn+1 > xn + 1,
xn → ∞, and f(xn) ≥ ϵ0. By uniform continuity, there exists δ ∈ (0, 1/2) such that for any n ∈ N
and x ∈ (xn − δ, xn + δ), f(x) ≥ ϵ0/2. Since f(x) ≥ 0,
\[ \int_0^\infty f(x)\,dx = \sup_{M>0} \int_0^M f(x)\,dx. \]
Thus, for any n ∈ N,
\[ \int_0^{x_n+\delta} f(x)\,dx \ge \sum_{k=2}^{n} \int_{x_k-\delta}^{x_k+\delta} f(x)\,dx \ge (n-1)\epsilon_0\delta \to \infty \]
as n → ∞, a contradiction.
(b) Suppose f(x) ̸→ 0. Then there exist ϵ0 > 0 and an increasing sequence (xn) such that xn → ∞ and
f(xn) ≥ ϵ0. Since f is decreasing, f ≥ ϵ0 on [0, xn]. Then for each n ∈ N,
\[ \int_0^\infty f(x)\,dx \ge \int_{x_n/2}^{x_n} f(x)\,dx \ge \frac{\epsilon_0 x_n}{2} \to \infty \]
as n → ∞, a contradiction.

Example (logarithm and exponential function). Define
\[ \log(x) = \int_1^x \frac{1}{t}\,dt \]
for all x > 0. By 2.2.17, (log(x))′ = 1/x and thus log(x) is smooth. It is clear that log(x) is strictly increasing
and
\[ \log(x) = \int_1^x \frac{1}{t}\,dt \ge \int_1^{\lfloor x\rfloor} \frac{1}{t}\,dt \ge \frac{1}{2} + \dots + \frac{1}{\lfloor x\rfloor} \to \infty \]
as x → ∞. On the other hand, comparing with the lower sum over the dyadic partition {2^{−n}, 2^{−n+1}, · · · , 1},
\[ \log(x) = \int_1^x \frac{1}{t}\,dt = -\int_x^1 \frac{1}{t}\,dt \le -\sum_{k=1}^{n} \frac{1}{2^{k}}\cdot 2^{k-1} = -\frac{n}{2} \]
whenever x < 1/2^n. Thus, as x → 0+, log(x) → −∞. Thus, the image of log(x) is R.
By 2.1.12 and 2.1.15.1, log(x) : (0, ∞) → R has a smooth inverse exp(x) : R → (0, ∞), and exp′(x) =
exp(x).
Fix y > 0 and consider f(x) = log(xy) − log(x) − log(y). By 2.1.2, f′(x) = 0 and thus f(x) is a constant by
2.1.3. Since f(1) = 0, we have log(xy) = log(x) + log(y) and thus exp(x + y) = exp(x) exp(y).
Given r > 0, we define r^x by r^x = exp(log(r)x). For r ≠ 1, we define log_r(x) = log(x)/log(r). Note that for
q ∈ Q, r^q coincides with the usual exponential, as exp(qx) = (exp(x))^q for all q ∈ Q.
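The integral definition of the logarithm can be compared against a library logarithm. This is a sketch under stated assumptions: `log_by_integral` is a hypothetical helper that approximates the defining integral by a midpoint Riemann sum, and the sample points x = 2.5 and y = 0.3 are arbitrary.

```python
import math


def log_by_integral(x, n=100000):
    """Midpoint Riemann sum for the integral of 1/t from 1 to x (signed when x < 1)."""
    h = (x - 1.0) / n
    return sum(1.0 / (1.0 + (j + 0.5) * h) for j in range(n)) * h


x, y = 2.5, 0.3
print(log_by_integral(x), math.log(x))
# The functional equation log(xy) = log(x) + log(y), up to quadrature error.
print(log_by_integral(x * y), log_by_integral(x) + log_by_integral(y))
```

For x < 1 the step h is negative, so the sum carries the sign convention for integrals with reversed limits and returns a negative value, as it should.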

2.3 Series
Let $\sum a_n$ be a series. Then by the Cauchy criterion for sequence convergence, we have
\[ \sum a_k \text{ converges} \iff \forall\epsilon>0,\ \exists N\in\mathbb{N},\ \forall n \ge m \ge N,\ \left|\sum_{k=m}^{n} a_k\right| < \epsilon. \]
Proposition 2.3.1 (comparison test). If a series $\sum b_k$ dominates a series $\sum a_k$ in the sense that |ak| ≤ bk
for sufficiently large k, then the convergence of $\sum b_k$ implies the convergence of $\sum a_k$.
Proof. We have
\[ \left|\sum_{k=m}^{n} a_k\right| \le \sum_{k=m}^{n} |a_k| \le \sum_{k=m}^{n} b_k \]
for all m, n large enough. The rest follows by the Cauchy criterion.


Definition 2.3.2. A series $\sum a_k$ converges absolutely if $\sum |a_k|$ converges. 2.3.1 shows that absolute
convergence implies convergence. A series that converges but not absolutely is said to converge conditionally.
Proposition 2.3.3 (integral test). Suppose that $\int_0^\infty f(x)\,dx$ is a given improper integral and $\sum a_k$ is a given
series. If |ak| ≤ f(x) for all sufficiently large k and all x ∈ (k − 1, k], then convergence of the improper integral
implies convergence of the series. Conversely, if 0 ≤ f(x) ≤ ak for all sufficiently large k and all x ∈ (k, k + 1],
then divergence of the improper integral implies divergence of the series.
Proof. For some large N0 and all N ≥ N0, we have
\[ \sum_{k=N_0+1}^{N} |a_k| \le \int_{N_0}^{N} f(x)\,dx \le \int_{N_0}^{\infty} f(x)\,dx. \]
Then the convergence of $\int_0^\infty f(x)\,dx$ implies the convergence of $\sum |a_k|$ by the monotone convergence theorem, since its partial sums are increasing and bounded.
The converse is similar.
Remark. The p-series is the series $\sum 1/n^p$. By 2.3.3, the p-series converges if and only if p > 1.
Proposition 2.3.4 (Cauchy condensation test). Suppose a1 ≥ a2 ≥ · · · and ak → 0. Then $\sum a_k$
converges if and only if $\sum 2^k a_{2^k}$ converges.
Proof. Since ak ≥ 0, we only need to show that the partial sums of $\sum a_k$ and $\sum 2^k a_{2^k}$ are bounded together.
For any n ∈ N and any k ≥ 2^n,
\[ a_1 + \dots + a_k \ge a_1 + a_2 + (a_3 + a_4) + \dots + (a_{2^{n-1}+1} + \dots + a_{2^n}) \ge \frac{1}{2}\,(a_1 + 2a_2 + \dots + 2^n a_{2^n}). \]
Conversely, for any k ∈ N, there exists n such that 2^{n+1} > k, and then
\[ a_1 + \dots + a_k \le a_1 + (a_2 + a_3) + \dots + (a_{2^n} + \dots + a_{2^{n+1}-1}) \le a_1 + 2a_2 + \dots + 2^n a_{2^n}. \]

Definition 2.3.5. Let (an) be a sequence in R and S be the set of limits of convergent subsequences (possibly
infinite). Then the limit superior of an, lim supn→∞ an, is the number sup(S). Similarly, the limit inferior
of an, lim infn→∞ an, is the number inf(S).

Proposition 2.3.6 (criterion of limit superior/inferior). Let (an) be a sequence in R. Then the following
are equivalent:
(a) s∗ = lim supn→∞ an.
(b) s∗ ∈ S and for any s > s∗, an < s for large n.

Similarly, the following are equivalent:

(c) s∗ = lim infn→∞ an.
(d) s∗ ∈ S and for any s < s∗, an > s for large n.
Proof. (a) implies (b). We first show that s∗ ∈ S. Suppose s∗ = ∞. Then for any n ∈ N there exists a subsequential
limit sn > n + 1, so there exist infinitely many terms am with am > n.
Thus, there exists a subsequence (ank) such that ank → ∞, proving s∗ ∈ S.
Suppose s∗ ∈ R. Let ϵ > 0 be given. Then there exists a subsequential limit s such that 0 ≤ s∗ − s < ϵ/2. Since
s is a subsequential limit, there exist infinitely many an such that |an − s∗| < ϵ. Choosing such terms with ϵ = 1/k and increasing indices gives a subsequence converging to s∗. Thus s∗ ∈ S.
Suppose s∗ = −∞. Then every subsequence with a limit tends to −∞, and S is nonempty, so s∗ ∈ S.
Now suppose s > s∗ (if s∗ = ∞, there is no s > s∗ and the result holds). Suppose, for contradiction, that an ≥ s for infinitely many n.
Then these terms have a subsequence with a limit L ∈ S such that L ≥ s > s∗, a contradiction.
(b) implies (a). It suffices to show that at most one value satisfies (b). Suppose p, q ∈ [−∞, ∞] both satisfy (b) and p > q. Take r such that p > r > q. By
(b) applied to q, an < r for large n. Then every subsequential limit is ≤ r < p, contradicting p ∈ S.
The equivalence of (c) and (d) is similar.
Proposition 2.3.7 (additive inequality of limit superior/inferior). Suppose (an), (bn) are two real sequences.
Then
\[ \liminf_{n\to\infty} a_n + \liminf_{n\to\infty} b_n \le \liminf_{n\to\infty} (a_n + b_n) \le \limsup_{n\to\infty} (a_n + b_n) \le \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n \]
if the additions are defined.

Proof. We only prove lim sup(an + bn) ≤ lim sup an + lim sup bn. If either lim sup an or lim sup bn is ∞,
then there is nothing to prove. Suppose both lim sup an and lim sup bn are less than ∞. Let α > lim sup an
and β > lim sup bn. Then by 2.3.6, for sufficiently large n, we have
\[ a_n < \alpha, \qquad b_n < \beta. \]
Thus, for any convergent subsequence of (an + bn) with limit γ,
\[ \gamma \le \alpha + \beta, \]
and, letting α and β decrease to lim sup an and lim sup bn,
\[ \limsup_{n\to\infty} (a_n + b_n) \le \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n. \]

Proposition 2.3.8 (limsup/liminf formulation of limit). Suppose (an ) is a real sequence. Then lim an exists
if and only if lim sup an = lim inf an . Also, if lim an exists, then lim an = lim inf an = lim sup an .

Proof. If lim an exists, then by 1.1.3, lim sup an = lim inf an = lim an.
Conversely, suppose α := lim sup an = lim inf an. If α = ∞ or −∞, then it is clear that lim an = α. If
α ∈ R, then (an) is bounded, as otherwise there exists a subsequence that tends to ∞ or −∞, contradicting S = {α}.
Let (ank) be a subsequence. Since (an) is bounded, by the Bolzano-Weierstrass theorem, (ank) has a further subsequence
that converges, and its limit must be α. Thus, by 1.1.5, an → α.

Proposition 2.3.9 (root test). Let
\[ \alpha = \limsup_{k\to\infty} \sqrt[k]{|a_k|}. \]
If α < 1, then $\sum a_k$ converges absolutely; if α > 1, then $\sum a_k$ diverges.

Proof. Suppose α < 1. Take r ∈ (α, 1). By 2.3.6, $\sqrt[k]{|a_k|} < r$ for all large k. Since r < 1, $\sum r^k$ converges, and
we are done by 2.3.1. If α > 1, take r′ ∈ (1, α). Then |ak| ≥ (r′)^k for infinitely many k. Since ak ̸→ 0, $\sum a_k$
diverges.
Corollary 2.3.9.1 (radius of convergence). If $\sum c_k x^k$ is a power series, then there exists a unique R with
0 ≤ R ≤ ∞, its radius of convergence, such that the series converges whenever |x| < R and diverges whenever
|x| > R. Moreover, R is given by the formula
\[ R = \frac{1}{\limsup_{k\to\infty} \sqrt[k]{|c_k|}}. \]

Proof. Apply the root test to the series $\sum c_k x^k$. Then
\[ \limsup_{k\to\infty} \sqrt[k]{|c_k x^k|} = |x| \limsup_{k\to\infty} \sqrt[k]{|c_k|} = \frac{|x|}{R}. \]

If |x| < R, the root test gives convergence. If |x| > R, it gives divergence.
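The root formula can be explored numerically, with the caveat that a finite sample of k-th roots only suggests the lim sup and does not compute it. In this sketch the coefficients c_k = k·2^k (for which R = 1/2) and the helper name `root_estimates` are assumptions chosen for illustration.

```python
def root_estimates(coeff, ks):
    """k-th roots |c_k|^(1/k) at the sampled indices; their lim sup is 1/R."""
    return [abs(coeff(k)) ** (1.0 / k) for k in ks]


coeff = lambda k: k * 2.0 ** k  # hypothetical coefficients c_k = k 2^k, so R = 1/2
ks = [10, 100, 1000]
roots = root_estimates(coeff, ks)
for k, r in zip(ks, roots):
    print(k, r, 1.0 / r)        # 1/r is the running estimate of R
```

The roots decrease toward 2 slowly, because the factor k^(1/k) tends to 1 only at a logarithmic rate; the estimate of R approaches 1/2 from below.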
Proposition 2.3.10 (ratio test). Let the ratio between successive terms of the series $\sum a_k$ be τk = |ak+1/ak|
(we suppose that ak ≠ 0). Set
\[ \rho = \limsup_{k\to\infty} \tau_k, \qquad \lambda = \liminf_{k\to\infty} \tau_k. \]
If ρ < 1, then the series converges; if λ > 1, then the series diverges.
Proof. Suppose ρ < 1. Take r ∈ (ρ, 1). Then for large k, |ak+1| < r|ak|, so |ak| ≤ C r^k for some constant C. Since r < 1 and
$\sum r^k$ converges, $\sum a_k$ converges by 2.3.1.
Suppose λ > 1. Take r′ ∈ (1, λ). Then for large k, |ak+1| > r′|ak| and thus ak ̸→ 0. Therefore, $\sum a_k$
diverges.
Proposition 2.3.11 (relation between ratio and root test). Let (an) be a nonzero sequence. Then
\[ \liminf_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right| \le \liminf_{k\to\infty} \sqrt[k]{|a_k|} \le \limsup_{k\to\infty} \sqrt[k]{|a_k|} \le \limsup_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right|. \]

Proof. We only prove the last inequality. If lim supk→∞ |ak+1/ak| = ∞, then there is nothing to prove. Suppose
lim supk→∞ |ak+1/ak| < ∞. Let β > lim supk→∞ |ak+1/ak|. Then there exists K such that |ak+1| < |ak|β for all
k ≥ K. Then for any h ∈ N, |aK+h| ≤ β^h |aK|, so $\sqrt[k]{|a_k|} \le \beta^{1-K/k}\,|a_K|^{1/k}$ for k > K, and thus $\limsup_{k\to\infty} \sqrt[k]{|a_k|} \le \beta$. Since β is arbitrary, we are
done.
Proposition 2.3.12 (logarithm test). Suppose (an) is a nonzero sequence. Let
\[ \alpha = \liminf_{k\to\infty} \frac{\log(1/|a_k|)}{\log(k)}, \qquad \beta = \limsup_{k\to\infty} \frac{\log(1/|a_k|)}{\log(k)}. \]
If α > 1, then $\sum a_n$ converges absolutely; if β < 1, then $\sum |a_n|$ diverges.
Proof. Suppose α > 1. Take r ∈ (1, α). Then
\[ \frac{\log(1/|a_k|)}{\log(k)} > r \]
for large k, and hence
\[ |a_k| < \frac{1}{k^{r}} \]
for large k. Then by 2.3.1 and the convergence of the p-series with p = r > 1, $\sum a_k$ converges absolutely.
Suppose β < 1. Take r′ ∈ (β, 1). Then
\[ |a_k| > \frac{1}{k^{r'}} \]
for large k. Then by 2.3.1 and the divergence of the p-series with p = r′ < 1, $\sum |a_n|$ diverges.

Definition 2.3.13. For a sequence (ak), the forward difference of ak, $\tilde\Delta a_k$, is defined as $\tilde\Delta a_k = a_{k+1} - a_k$.
Two immediate consequences are that
\[ \sum_{k=M}^{N} \tilde\Delta a_k = a_{N+1} - a_M \]
and
\[ \tilde\Delta(a_k b_k) = a_{k+1}b_{k+1} - a_k b_k = a_{k+1}b_{k+1} - a_k b_{k+1} + a_k b_{k+1} - a_k b_k = (\tilde\Delta a_k)\,b_{k+1} + a_k\,(\tilde\Delta b_k). \]
By summing from M to N, we have
\[ a_{N+1}b_{N+1} - a_M b_M = \sum_{k=M}^{N} (\tilde\Delta a_k)\,b_{k+1} + \sum_{k=M}^{N} a_k\,(\tilde\Delta b_k) \]
and thus
\[ \sum_{k=M}^{N} a_k\,(\tilde\Delta b_k) = a_{N+1}b_{N+1} - a_M b_M - \sum_{k=M}^{N} (\tilde\Delta a_k)\,b_{k+1}. \]
This equation is called summation by parts.
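Since summation by parts is a finite algebraic identity, it can be verified exactly on sample data. The sketch below uses rational arithmetic; the helper `fdiff`, the test sequences a_k = 1/k and b_k = k², and the range M = 3, N = 12 are all assumptions chosen for illustration.

```python
from fractions import Fraction


def fdiff(seq, k):
    """Forward difference of seq at index k: seq(k + 1) - seq(k)."""
    return seq(k + 1) - seq(k)


a = lambda k: Fraction(1, k)  # hypothetical test sequences
b = lambda k: Fraction(k * k)
M, N = 3, 12

# Telescoping: the sum of forward differences from M to N.
telescoped = sum(fdiff(a, k) for k in range(M, N + 1))

# Both sides of the summation-by-parts identity, summed from M to N.
lhs = sum(a(k) * fdiff(b, k) for k in range(M, N + 1))
rhs = (a(N + 1) * b(N + 1) - a(M) * b(M)
       - sum(fdiff(a, k) * b(k + 1) for k in range(M, N + 1)))
print(lhs, rhs)
```

Both sums run over k = M, ..., N, and the boundary terms involve the indices N + 1 and M, matching the telescoping identity.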


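Proposition 2.3.14 (Abel’s test). Suppose $\sum a_k$ converges and (bk) is monotone and bounded. Then $\sum a_k b_k$
is also convergent.
Proof. Let Rk be the tail of $\sum a_k$, that is, $R_k = \sum_{j=k}^{\infty} a_j$, so that $\tilde\Delta R_k = -a_k$. Since $\sum a_n$ converges, Rk → 0 as k → ∞.
Then, by summation by parts,
\[ \sum_{k=M}^{N} a_k b_k = -\sum_{k=M}^{N} b_k\,\tilde\Delta R_k = R_M b_M - R_{N+1} b_{N+1} + \sum_{k=M}^{N} (\tilde\Delta b_k)\,R_{k+1}. \]
Since (bk) is bounded and Rk → 0, RM bM, RN+1 bN+1 → 0 as M, N → ∞. Lastly, given ϵ > 0, we have |Rk| ≤ ϵ for all large k, and since the differences $\tilde\Delta b_k$ all have the same sign,
\[ \left| \sum_{k=M}^{N} (\tilde\Delta b_k)\,R_{k+1} \right| \le \epsilon\,|b_{N+1} - b_M| \le 2K\epsilon, \]
where K > 0 is a bound for |bk|. Hence $\sum a_k b_k$ satisfies the Cauchy criterion.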


Proposition 2.3.15 (Dirichlet’s test). Suppose (ak) is monotone with ak → 0 and the partial sums of $\sum b_n$ are
bounded. Then $\sum a_k b_k$ converges.

Proof. Without loss of generality, we assume (ak) is increasing. Let Bk be the k-th partial sum of $\sum b_n$ and |Bk| < K
for some K > 0. Then, by summation by parts,
\[ \sum_{k=M}^{N} a_k b_k = \sum_{k=M}^{N} a_k\,\tilde\Delta B_{k-1} = a_{N+1}B_N - a_M B_{M-1} - \sum_{k=M}^{N} B_k\,\tilde\Delta a_k. \]
Then
\[ \left| \sum_{k=M}^{N} a_k b_k \right| < K(|a_{N+1}| + |a_M|) + K(a_{N+1} - a_M) \to 0 \]
as N, M → ∞. Thus $\sum a_k b_k$ converges by the Cauchy criterion.

Theorem 2.3.16 (Young's inequality for increasing functions). Let $f : [0, \infty) \to [0, \infty)$ be a continuous and strictly increasing function with $f(0) = 0$. Then
\[ \int_0^a f(x)\,dx + \int_0^b f^{-1}(x)\,dx \ge ab \]
for any $a, b > 0$.

Proof. We first claim that for $c > 0$,
\[ \int_0^c f(x)\,dx + \int_0^{f(c)} f^{-1}(x)\,dx = c f(c). \]
Let $P, Q$ be partitions of $[0, c]$ and $[0, f(c)]$, respectively, and let $P^* = P \cup f^{-1}(Q)$ and $Q^* = Q \cup f(P)$. Let $P^* = \{x_0 < x_1 < \cdots < x_n\}$. Then $Q^* = \{y_0 < y_1 < \cdots < y_n\}$ with $y_k = f(x_k)$. Since $f$ is strictly increasing,
\[ \begin{aligned}
U(f, P^*) + U(f^{-1}, Q^*) &= \sum_{j=1}^{n} y_j \tilde{\Delta} x_{j-1} + \sum_{j=1}^{n} x_j \tilde{\Delta} y_{j-1} \\
&= y_n (x_n - x_{n-1}) + x_{n-1} y_n - \sum_{j=1}^{n-1} x_j \tilde{\Delta} y_j + \sum_{j=1}^{n} x_j \tilde{\Delta} y_{j-1} \\
&= c f(c) - x_1 (y_2 - y_1) - \cdots - x_{n-1} (y_n - y_{n-1}) + x_1 (y_1 - y_0) + \cdots + x_n (y_n - y_{n-1}) \\
&= c f(c) + x_1 (y_1 - y_0) + (x_2 - x_1)(y_2 - y_1) + \cdots + (x_n - x_{n-1})(y_n - y_{n-1}) > c f(c).
\end{aligned} \]
Thus, since $P, Q$ are arbitrary,
\[ \int_0^c f(x)\,dx + \int_0^{f(c)} f^{-1}(x)\,dx \ge c f(c). \]
Similarly,
\[ \int_0^c f(x)\,dx + \int_0^{f(c)} f^{-1}(x)\,dx \le c f(c) \]
by considering Darboux lower sums, and the claim follows.
Without loss of generality, let $f(a) \le b$. Then
\[ \int_0^a f(x)\,dx + \int_0^b f^{-1}(x)\,dx = a f(a) + \int_{f(a)}^{b} f^{-1}(x)\,dx \ge a f(a) + (b - f(a)) a = ab. \]
The equality holds if and only if $b = f(a)$ (see the remark below).
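As a quick sanity check of the inequality and its equality case, here is a small sketch for one concrete choice, $f(x) = x^2$ with $f^{-1}(y) = \sqrt{y}$. This choice and the function names are assumptions made for illustration only. Both integrals have closed forms, $\int_0^a x^2\,dx = a^3/3$ and $\int_0^b \sqrt{y}\,dy = \tfrac{2}{3} b^{3/2}$, so no numerical integration is needed.

```python
def young_lhs(a, b):
    """Left-hand side of Young's inequality for f(x) = x**2 (closed forms)."""
    return a ** 3 / 3 + (2 / 3) * b ** 1.5

def young_gap(a, b):
    """Left-hand side minus a*b; the theorem says this is nonnegative."""
    return young_lhs(a, b) - a * b

# The gap should vanish (up to rounding) exactly when b = f(a) = a**2.
for a, b in [(1.0, 3.0), (3.0, 1.0), (2.0, 4.0)]:
    print(a, b, young_gap(a, b))
```

For the pair $(2, 4)$ we have $b = f(a)$, so the printed gap is zero up to floating-point rounding; for the other two pairs it is strictly positive.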
Remark. Not only does 2.2.6 hold: if $f > 0$ and $f$ is integrable on $I_0 = [a, b]$, then $\int_{I_0} f(x)\,dx > 0$. Indeed, suppose $\int_{I_0} f(x)\,dx = 0$. Then there exists a partition $P_0$ such that $U(f; P_0) < |I_0|$. Then there exists a subinterval $I_1$ such that $\sup_{I_1} f < 1$. Since $\int_{I_1} f(x)\,dx = 0$, we can construct a sequence of closed intervals $I_0 \supseteq I_1 \supseteq I_2 \supseteq \cdots$ such that $\sup_{I_k} f < 1/k$. By 1.6.6, $\bigcap_k I_k$ is nonempty. However, in this case, $f(x) = 0$ for $x \in \bigcap_k I_k$, a contradiction.

Theorem 2.3.17 (Stolz-Cesaro). Let $(a_n)$ and $(b_n)$ be two sequences of real numbers.
1. Assume that $(b_n)$ is strictly monotone and unbounded. If
\[ \lim \frac{\tilde{\Delta} a_n}{\tilde{\Delta} b_n} = L, \]
then
\[ \lim \frac{a_n}{b_n} = L. \]
2. Assume that $a_n, b_n \to 0$ and $(b_n)$ is strictly monotone. If
\[ \lim \frac{\tilde{\Delta} a_n}{\tilde{\Delta} b_n} = L, \]
then
\[ \lim \frac{a_n}{b_n} = L. \]
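Proof.
1. We shall prove the case when $(b_n)$ is strictly increasing and $b_n \to \infty$; the other case is analogous. Suppose $-\infty \le L < \infty$. Then there exist $r, q$ such that $L < r < q$. Since $\tilde{\Delta} a_n / \tilde{\Delta} b_n \to L$ and $b_n \to \infty$, there exists $N \in \mathbb{N}$ such that $n \ge N$ implies $\tilde{\Delta} a_n / \tilde{\Delta} b_n < r$ and $b_n > 0$. Hence, for $n \ge N$, summing $\tilde{\Delta} a_k < r \tilde{\Delta} b_k$ over $N \le k \le n$ and dividing by $b_{n+1}$, it follows that
\[ \frac{a_{n+1}}{b_{n+1}} < r \left( 1 - \frac{b_N}{b_{n+1}} \right) + \frac{a_N}{b_{n+1}} := c_n. \]
Since $c_n \to r$ as $n \to \infty$, $a_n / b_n < q$ for sufficiently large $n$. Similarly, if $-\infty < L \le \infty$ and $p < L$, then $a_n / b_n > p$ for sufficiently large $n$. Hence,
• If $L = \infty$, then for any $p \in \mathbb{R}$, $a_n / b_n > p$ for sufficiently large $n$ and therefore $a_n / b_n \to \infty$.
• If $L = -\infty$, then for any $q \in \mathbb{R}$, $a_n / b_n < q$ for sufficiently large $n$ and therefore $a_n / b_n \to -\infty$.
• If $L \in \mathbb{R}$, then for any $p, q \in \mathbb{R}$ with $p < L < q$, we have $p < a_n / b_n < q$ for sufficiently large $n$ and therefore $a_n / b_n \to L$.
2. Assume that $(b_n)$ is strictly decreasing; the other case is analogous. We shall prove that $b_n > 0$ for all $n \in \mathbb{N}$. Suppose that for some $N \in \mathbb{N}$, $b_N \le 0$. Then since $(b_n)$ is strictly decreasing, $b_{N+1} < 0$. Furthermore, for any $n \ge N + 1$, $|b_n| \ge |b_{N+1}| > 0$, which contradicts $b_n \to 0$. Suppose $-\infty \le L < \infty$. Then there exist $r, q$ such that $L < r < q$. Since $\tilde{\Delta} a_n / \tilde{\Delta} b_n \to L$, $\tilde{\Delta} a_n / \tilde{\Delta} b_n < r$ for sufficiently large $n$. Notice that $a_n = (a_n - a_{n+1}) + \cdots + (a_m - a_{m+1}) + a_{m+1}$ for all $m \ge n$. Hence, for sufficiently large $n$ and $m \ge n$, we have
\[ a_n < r (b_n - b_{m+1}) + a_{m+1} \iff \frac{a_n}{b_n} < r \left( 1 - \frac{b_{m+1}}{b_n} \right) + \frac{a_{m+1}}{b_n} := c_m. \]
Since $c_m \to r$ as $m \to \infty$, $a_n / b_n < q$ for all sufficiently large $n$. Similarly, if $-\infty < L \le \infty$ and $p < L$, then $a_n / b_n > p$ for sufficiently large $n$. Then the rest is similar.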

Remark. Notice that this proof mimics the proof of 2.1.4.
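A standard illustration of part 1, not taken from these notes, is $a_n = 1^2 + \cdots + n^2$ and $b_n = n^3$. Here $\tilde{\Delta} a_n / \tilde{\Delta} b_n = (n+1)^2 / (3n^2 + 3n + 1) \to 1/3$, so the theorem gives $a_n / b_n \to 1/3$. The sketch below simply tabulates both quotients; the function name is hypothetical.

```python
def stolz_cesaro_ratios(n):
    """Return (a_n / b_n, delta a_n / delta b_n) for a_n = 1^2 + ... + n^2, b_n = n^3."""
    a_n = sum(k * k for k in range(1, n + 1))
    b_n = n ** 3
    delta_a = (n + 1) ** 2              # a_{n+1} - a_n
    delta_b = (n + 1) ** 3 - n ** 3     # b_{n+1} - b_n
    return a_n / b_n, delta_a / delta_b

# Both columns should approach 1/3 as n grows.
for n in (10, 100, 1000):
    print(n, stolz_cesaro_ratios(n))
```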


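Proposition 2.3.18 (Mertens' theorem). Suppose $\sum a_k$ converges absolutely and $\sum b_k$ converges. Let
\[ c_j = a_0 b_j + \cdots + a_j b_0. \]
Then $\sum c_j$ converges with
\[ \sum_{j=0}^{\infty} c_j = \left( \sum_{j=0}^{\infty} a_j \right) \left( \sum_{j=0}^{\infty} b_j \right). \]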
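Proof. Suppose $\sum a_j = A$ and $\sum b_j = B$. Let $A_j$ and $B_j$ be the $j$-th partial sums of $\sum a_j$ and $\sum b_j$, and let $\beta_j = B_j - B$. Then $\beta_j \to 0$. Let $N \in \mathbb{N}$. Consider
\[ \sum_{j=0}^{N} c_j = \sum_{j=0}^{N} \sum_{\ell=0}^{j} a_\ell b_{j-\ell} = \sum_{\ell=0}^{N} \sum_{j=\ell}^{N} a_\ell b_{j-\ell} = \sum_{\ell=0}^{N} a_\ell B_{N-\ell} = \sum_{\ell=0}^{N} a_\ell (\beta_{N-\ell} + B) = A_N B + \sum_{\ell=0}^{N} a_\ell \beta_{N-\ell}. \]
Since $A_N B \to AB$ as $N \to \infty$, it remains to show that
\[ \sum_{\ell=0}^{N} a_\ell \beta_{N-\ell} \to 0 \]
as $N \to \infty$.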
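Let $\sum |a_\ell| = \alpha$, let $\beta^* = \sup_\ell |\beta_\ell|$ (finite, as $\beta_\ell \to 0$) and let $\epsilon > 0$. Since $\sum |a_n|$ converges, there exists $L_1 \in \mathbb{N}$ such that $\sum_{\ell \ge L_1} |a_\ell| < \epsilon$. Since $\beta_\ell \to 0$, there exists $L_2 \in \mathbb{N}$ such that $\ell \ge L_2$ implies $|\beta_\ell| < \epsilon$. Let $L = \max\{L_1, L_2\}$. Then, for $N \ge 2L$,
\[ \begin{aligned}
\left| \sum_{\ell=0}^{N} a_\ell \beta_{N-\ell} \right| &= \left| \sum_{\ell=0}^{N} a_{N-\ell} \beta_\ell \right| \\
&\le \left| \sum_{\ell=0}^{L} a_{N-\ell} \beta_\ell \right| + \left| \sum_{\ell=L+1}^{N} a_{N-\ell} \beta_\ell \right| \\
&\le \beta^* \sum_{\ell=0}^{L} |a_{N-\ell}| + \alpha \max\{|\beta_{L+1}|, \cdots, |\beta_N|\} \\
&\le \epsilon (\beta^* + \alpha),
\end{aligned} \]
where the first sum is at most $\epsilon$ because $N - \ell \ge L \ge L_1$ for $\ell \le L$. Thus, $\sum_{\ell=0}^{N} a_\ell \beta_{N-\ell} \to 0$ as $N \to \infty$, as required.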
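Mertens' theorem only needs one of the two series to converge absolutely. The sketch below checks this numerically for an assumed pair that is not from the notes: $a_k = 2^{-k}$ (absolutely convergent, sum $2$) and $b_k = (-1)^k/(k+1)$ (conditionally convergent, sum $\log 2$). The Cauchy product should then sum to $2 \log 2$; the helper names are hypothetical.

```python
import math

def a_term(k):
    """Absolutely convergent series: a_k = 2**(-k), with sum 2."""
    return 0.5 ** k

def b_term(k):
    """Conditionally convergent series: b_k = (-1)**k / (k + 1), with sum log 2."""
    return (-1) ** k / (k + 1)

def cauchy_product_partial_sum(a, b, N):
    """Sum of c_j = a_0 b_j + ... + a_j b_0 over j = 0..N."""
    return sum(a(l) * b(j - l) for j in range(N + 1) for l in range(j + 1))

approx = cauchy_product_partial_sum(a_term, b_term, 400)
print(approx, 2 * math.log(2))
```

The convergence is slow because it is limited by the remainders $\beta_j$ of the alternating series, which is exactly the term the proof has to control.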
Definition 2.3.19. A rearrangement of an is a sequence bn such that bn = aψ(n) , where ψ : N → N is a
bijection.
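Theorem 2.3.20 (absolute convergence and rearrangement). If $\sum |a_n|$ converges, then $\sum b_n$ converges and $\sum b_n = \sum a_n$, where $b_n$ is a rearrangement of $a_n$.
Proof. Let $S = \sum a_n$, which exists since $\sum |a_n|$ converges, and let $\psi$ be the corresponding bijection, so that $b_n = a_{\psi(n)}$. Let $\epsilon > 0$ and take $N$ so large that $\sum_{n=N}^{\infty} |a_n| < \epsilon$ and $\left| \sum_{n=1}^{N} a_n - S \right| < \epsilon$. Let $N' = \max\{\psi^{-1}(1), \cdots, \psi^{-1}(N)\}$, so that $a_1, \cdots, a_N$ all appear among $b_1, \cdots, b_{N'}$. Then for $M \ge N'$,
\[ \left| \sum_{n=1}^{M} b_n - S \right| \le \left| \sum_{n=1}^{M} b_n - \sum_{n=1}^{N} a_n \right| + \left| \sum_{n=1}^{N} a_n - S \right| \le \sum_{n=N+1}^{\infty} |a_n| + \left| \sum_{n=1}^{N} a_n - S \right| < 2\epsilon. \]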

Definition 2.3.21. We say that $\sum a_n$ converges conditionally if $\sum a_n$ converges but $\sum |a_n|$ diverges.
Remark. Suppose $a_n^+ = \max(a_n, 0)$ and $a_n^- = \max(-a_n, 0)$. Then $a_n = a_n^+ - a_n^-$ and $|a_n| = a_n^+ + a_n^-$. If $\sum a_n$ converges conditionally, then one of $\sum a_n^+$ and $\sum a_n^-$ diverges. Suppose $\sum a_n^+$ diverges and $\sum a_n^-$ converges. Then since $a_n = a_n^+ - a_n^-$, we have $\sum a_n^+ = \sum (a_n + a_n^-) = \sum a_n + \sum a_n^-$ converges, a contradiction. The other case is symmetric. Thus, $\sum a_n^+$ and $\sum a_n^-$ both diverge.
Theorem 2.3.22 (Riemann rearrangement theorem). If $\sum a_n$ converges conditionally, then for any $-\infty \le \alpha \le \beta \le \infty$, there exists a rearrangement $b_n$ whose partial sums $B_n$ satisfy
\[ \liminf B_n = \alpha, \qquad \limsup B_n = \beta. \]

Proof. Let $P_k$ be the $k$-th nonnegative term of $a_n$ and $N_k$ the $k$-th negative term of $a_n$. By the above remark, $\sum P_k$ and $\sum N_k$ diverge. Note that $P_k, N_k \to 0$ as $\sum a_k$ converges.
Let βk and αk be sequences such that βk → β and αk → α with βk > αk . Let i1 be the first index such
that P1 + · · · + Pi1 > β1 and j1 be the first index such that P1 + · · · + Pi1 + N1 + · · · + Nj1 < α1 . Similarly,
let i2 be the first index such that

P1 + · · · + Pi1 + N1 + · · · + Nj1 + Pi1 +1 + · · · + Pi2 > β2

and let j2 be the first index such that

P1 + · · · + Pi1 + N1 + · · · + Nj1 + Pi1 +1 + · · · + Pi2 + Nj1 +1 + · · · + Nj2 < α2 .

Define $i_k, j_k$ inductively and let $b_n$ be the rearrangement defined above. Let $x_k$ and $y_k$ be the partial sums whose last terms are $P_{i_k}$ and $N_{j_k}$, respectively. Notice that
\[ |x_k - \beta_k| \le P_{i_k}, \qquad |y_k - \alpha_k| \le -N_{j_k}. \]
Then since $P_{i_k}, N_{j_k} \to 0$, we have $x_k \to \beta$ and $y_k \to \alpha$, so
\[ \limsup B_n \ge \beta, \qquad \liminf B_n \le \alpha. \]
Let $\ell_0^- = 0$, $\ell_k^+ = i_k + j_{k-1}$ (with $j_0 = 0$) and $\ell_k^- = i_k + j_k$, so that $B_{\ell_k^+} = x_k$ and $B_{\ell_k^-} = y_k$.
Then for $n \in [\ell_k^-, \ell_{k+1}^-]$, $B_n \le x_{k+1} \le \beta_{k+1} + P_{i_{k+1}}$. If $\limsup B_n > \beta$, take $r \in (\beta, \limsup B_n)$; then $B_n > r > \beta$ for infinitely many $n$, which is absurd as $\beta_k + P_{i_k} \to \beta$. Hence, $\limsup B_n \le \beta$. Similarly, $\liminf B_n \ge \alpha$ and the result follows.
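The construction in the proof is effectively a greedy algorithm, and it can be run directly. The sketch below treats only the special case $\alpha = \beta = $ `target` for the alternating harmonic series $1 - \tfrac12 + \tfrac13 - \cdots$. The series, the function name and the term counts are assumptions for illustration. It adds unused positive terms while the running sum is at most the target and unused negative terms otherwise.

```python
def rearranged_partial_sum(target, n_terms):
    """Partial sum of a greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ...

    Uses the next unused positive term 1/(2p-1) while the running sum is at
    most `target`, and the next unused negative term -1/(2q) otherwise.
    """
    total = 0.0
    p = q = 1  # indices of the next unused positive / negative term
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / (2 * p - 1)
            p += 1
        else:
            total -= 1.0 / (2 * q)
            q += 1
    return total

# The same terms, taken in different orders, approach different targets.
for target in (1.5, -0.3):
    print(target, rearranged_partial_sum(target, 60000))
```

After the running sum first crosses the target, its distance from the target is bounded by the size of the term that caused the most recent crossing, which mirrors the estimate $|x_k - \beta_k| \le P_{i_k}$ used in the proof.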
Definition 2.3.23. For a double-indexed real sequence $a_{n,m}$, we say that $a_{n,m}$ converges to $a$ if for any $\epsilon > 0$, there exist $N, M \in \mathbb{N}$ such that whenever $n \ge N$, $m \ge M$, $|a_{n,m} - a| < \epsilon$. For the convergence of a double-indexed series $\sum a_{n,m}$, we say that $\sum a_{n,m}$ converges if its partial sum $S_{n,m} = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{i,j}$ converges.
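Theorem 2.3.24 (Pringsheim's theorem). If $\sum_{n,m} a_{n,m}$, $\sum_n \sum_m a_{n,m}$ and $\sum_m \sum_n a_{n,m}$ exist, then they are equal.
Proof. If $\sum_{n,m} a_{n,m}$ exists, then for any $\epsilon > 0$ we can find $M \in \mathbb{N}$ such that
\[ \left| S_{n,m} - \sum_{n,m} a_{n,m} \right| < \epsilon \]
whenever $n, m > M$. Letting $m \to \infty$ and then $n \to \infty$ (or in the other order), the result follows due to the continuity of $|\cdot|$.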

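Theorem 2.3.25 (absolute convergence and order of summation). Suppose $a_{nm}$ is a double-indexed real sequence. If $\sum_{n,m} |a_{nm}|$ converges (in this case, we also say that $\sum a_{nm}$ converges absolutely), then
\[ \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{nm} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{nm} = \sum_{n,m=1}^{\infty} a_{nm}. \]

Proof. By 2.3.24, it suffices to show that these three sums exist. Let $N \in \mathbb{N}$ be given. Then
\[ S_N := \sum_{n,m=1}^{N} |a_{nm}| \le \sum_{n,m=1}^{\infty} |a_{nm}|. \]
Thus, by 1.0.1 and 2.3.1,
\[ \lim_{N \to \infty} \sum_{n,m=1}^{N} a_{nm} \]
exists, say $S$. Let $\epsilon > 0$ be given. Then for sufficiently large $N, M$,
\[ \left| \sum_{n=1}^{N} \sum_{m=1}^{M} a_{nm} - S \right| \le \left| \sum_{n=1}^{N} \sum_{m=1}^{M} a_{nm} - \sum_{n,m=1}^{N} a_{nm} \right| + \left| \sum_{n,m=1}^{N} a_{nm} - S \right| < \epsilon \]
and thus $\sum_{n,m} a_{nm}$ exists.
Fix $n \in \mathbb{N}$. Then
\[ \sum_{m=1}^{\infty} |a_{nm}| \le \sum_{n,m} |a_{nm}| \implies \sum_{m=1}^{\infty} a_{nm} \text{ exists.} \]
Moreover,
\[ \sum_{n=1}^{\infty} \left| \sum_{m=1}^{\infty} a_{nm} \right| \le \sum_{n,m} |a_{nm}| \implies \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{nm} \text{ exists.} \]
Similarly, $\sum_m \sum_n a_{nm}$ exists, and the result follows by 2.3.24.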

Chapter 3

Function Space

3.1 Uniform Convergence and $C^0[a, b]$


Definition 3.1.1. Let fn : X → Y be a sequence of functions. We say that fn → f : X → Y pointwise
if fn (x) → f (x) for all x ∈ X. On the other hand, we say that fn → f uniformly, denoted by fn ⇒ f , if
for any ϵ > 0, there exists N ∈ N such that for any x ∈ X and n ≥ N , d(fn (x), f (x)) < ϵ.
Remark. fn ⇒ f implies fn → f by definition.
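The converse of the remark fails. A standard example, not taken from these notes, is $f_n(x) = x^n$ on $[0, 1)$: it converges to $0$ pointwise, but for every $n$ there is a point where $f_n$ equals $1/2$, so the convergence is not uniform. On a closed subinterval $[0, r]$ with $r < 1$ the supremum is $r^n \to 0$ and the convergence is uniform. The sketch below, with hypothetical helper names, makes both facts concrete.

```python
def witness_point(n):
    """A point x in [0, 1) with x**n = 1/2, so sup over [0, 1) of |x**n| >= 1/2."""
    return 0.5 ** (1.0 / n)

def sup_on_closed_subinterval(n, r):
    """Exact supremum of x**n over [0, r] for 0 <= r < 1 (attained at x = r)."""
    return r ** n

for n in (1, 10, 1000):
    x = witness_point(n)
    # x stays inside [0, 1) while x**n stays at 1/2; r**n shrinks to 0.
    print(n, x, x ** n, sup_on_closed_subinterval(n, 0.9))
```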
Proposition 3.1.2 (Cauchy criterion of uniform convergence). Let Y be complete and fn : X → Y be a
sequence of functions. Then fn converges uniformly to some f : X → Y if and only if for any ϵ > 0, there
exists N ∈ N such that for any n, m ≥ N and x ∈ X, d(fn (x), fm (x)) < ϵ.
Proof. The necessary condition clearly holds. For the converse, notice that for any x ∈ X, (fn (x)) is a
Cauchy sequence. As Y is complete, we can define f by

f (x) = lim fn (x).

Let ϵ > 0 be given. Then for large enough n, m, we have

d(fn (x), fm (x)) < ϵ.

As d is continuous, we can take m → ∞ and get

d(fn (x), f (x)) ≤ ϵ

for all x ∈ X, as required.


Proposition 3.1.3 (uniform limit preserves continuity). Suppose fn : X → Y is a sequence of functions
that is continuous at x. If fn ⇒ f , then f is continuous at x.
Proof. Let ϵ > 0 be given. Notice that

d(f (x), f (y)) ≤ d(f (x), fn (x)) + d(fn (x), fn (y)) + d(fn (y), f (y)).

By uniform convergence, we can fix n so large that the first and third terms of the right-hand side are < ϵ for every y. For this fixed n, the middle term is < ϵ when y is close enough to x, since fn is continuous at x.
Proposition 3.1.4 (Dirichlet's test for uniform convergence). Suppose $f_k : X \to \mathbb{R}$ is a sequence of functions that is monotone in $k$ with $f_k ⇒ 0$, and $g_k : X \to \mathbb{R}$ is a sequence of functions such that the partial sums of $\sum g_k$ are uniformly bounded. Then $\sum f_k g_k$ converges uniformly.
Proof. Suppose $G_k$ is the $k$-th partial sum of $\sum g_k$. By the condition, $|G_k| < K$ for some $K > 0$. By summation by parts,
\[ \sum_{k=M}^{N} f_k g_k = \sum_{k=M}^{N} f_k \tilde{\Delta} G_{k-1} = f_{N+1} G_N - f_M G_{M-1} - \sum_{k=M}^{N} G_k \tilde{\Delta} f_k. \]
Thus for a given $\epsilon > 0$, as $f_n ⇒ 0$ and $f_n$ is monotone, for $N, M$ large enough, we have
\[ \left| \sum_{k=M}^{N} f_k g_k \right| \le 2K\epsilon + K |f_{N+1} - f_M| \le 4K\epsilon. \]
Then by 3.1.2, we are done.

Proposition 3.1.5 (Abel's test for uniform convergence). Suppose $f_k, g_k : X \to \mathbb{R}$ are two sequences of functions such that $\sum g_k$ converges uniformly and $f_k$ is monotone and uniformly bounded. Then $\sum f_k g_k$ converges uniformly.
Proof. Since $f_k$ is uniformly bounded, $|f_k| < K$ for some $K > 0$. Let $R_k = \sum_{j=k}^{\infty} g_j$ be the remainder of $\sum g_k$. Then $R_k ⇒ 0$. Given $\epsilon > 0$, for large enough $M, N$, since
\[ \sum_{k=M}^{N} f_k g_k = -\sum_{k=M}^{N} f_k \tilde{\Delta} R_k = -f_{N+1} R_{N+1} + f_M R_M + \sum_{k=M}^{N} R_{k+1} \tilde{\Delta} f_k, \]
it follows that
\[ \left| \sum_{k=M}^{N} f_k g_k \right| \le 4K\epsilon. \]

Definition 3.1.6. The space $C_b(X, \mathbb{R}^n)$ is the space of $\mathbb{R}^n$-valued bounded functions defined on $X$. The sup-norm $\|\cdot\|_\infty : C_b(X, \mathbb{R}^n) \to \mathbb{R}$ is the function defined by
\[ \|f\|_\infty = \sup\{|f(x)| : x \in X\}. \]
It is clear that $f_n ⇒ f$ if and only if $f_n \to f$ in $\|\cdot\|_\infty$. Also, the metric induced by $\|\cdot\|_\infty$ on $C_b(X, \mathbb{R}^n)$ is complete by 3.1.2. By 3.1.3, $C^0(X, \mathbb{R}^n)$ is a closed subset of $C_b(X, \mathbb{R}^n)$.
Proposition 3.1.7 (Weierstrass M-test). If $\sum M_k$ is a convergent series of constants and if $f_k \in C_b$ satisfies $\|f_k\|_\infty \le M_k$, then $\sum f_k$ converges uniformly and absolutely.
Proof. Clearly from
\[ \left\| \sum_{k=M}^{N} f_k \right\|_\infty \le \sum_{k=M}^{N} \|f_k\|_\infty \le \sum_{k=M}^{N} M_k < \epsilon \]
as $M, N$ large enough.
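The chain of inequalities in the proof can be observed numerically. The sketch below uses the assumed example $f_k(x) = \sin(kx)/k^2$ with majorants $M_k = 1/k^2$, which is not from the notes. It compares the largest value of a block $\left| \sum_{k=M}^{N} f_k(x) \right|$ over a grid of sample points with the corresponding block of $\sum M_k$. A grid only samples the supremum, so this is an illustration rather than a proof.

```python
import math

def partial_sum_block(x, M, N):
    """Sum of sin(k*x) / k**2 for k = M..N."""
    return sum(math.sin(k * x) / k ** 2 for k in range(M, N + 1))

def m_test_bound(M, N):
    """Sum of the majorants M_k = 1/k**2 for k = M..N."""
    return sum(1.0 / k ** 2 for k in range(M, N + 1))

# Sample [0, 2*pi] on an evenly spaced grid (assumed resolution: 201 points).
grid = [2 * math.pi * i / 200 for i in range(201)]
worst = max(abs(partial_sum_block(x, 50, 400)) for x in grid)
print(worst, m_test_bound(50, 400))
```

The bound on the right does not depend on $x$, which is what makes the convergence uniform.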

Theorem 3.1.8 (uniform limit preserves integrability). The uniform limit of Riemann integrable functions is Riemann integrable, and the limit of the integrals is the integral of the limit, that is,
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx \]
whenever $f_n ⇒ f$.
Proof. By 2.2.14, $f_n$ is continuous almost everywhere. Let $Z_n$ be the set of discontinuities of $f_n$. By 3.1.3, $f$ is continuous on $[a, b] \setminus \bigcup_n Z_n$. Since $\bigcup_n Z_n$ is a zero set and $f$ is bounded (being a uniform limit of bounded functions), $f$ is integrable by 2.2.14.
For the equality, note that
\[ \left| \int_a^b f_n - \int_a^b f \right| \le \int_a^b |f_n - f| \le \epsilon (b - a) \]
as $n$ is large enough.
Theorem 3.1.9 (uniform limit preserves differentiability). The uniform limit of a sequence of differentiable
functions is differentiable provided that the sequence of derivatives also converges uniformly.

Proof. We suppose that $f_n, f : (a, b) \to \mathbb{R}$, $f_n ⇒ f$ and $f_n' ⇒ g$. We shall prove that $f$ is differentiable and $f' = g$.
Fix $x \in (a, b)$, define
\[ D_n(t) = \begin{cases} \dfrac{f_n(t) - f_n(x)}{t - x}, & t \ne x \\ f_n'(x), & t = x \end{cases} \]
and
\[ D(t) = \begin{cases} \dfrac{f(t) - f(x)}{t - x}, & t \ne x \\ g(x), & t = x. \end{cases} \]
Then $D_n$ is continuous and $D_n \to D$ pointwise. For $t \ne x$, by 2.1.3 applied to $f_n - f_m$,
\[ D_n(t) - D_m(t) = f_n'(\theta) - f_m'(\theta), \]
where $\theta$ is between $t$ and $x$. Since $f_n' ⇒ g$, when $n, m \to \infty$, we have $D_n(t) - D_m(t) \to 0$ uniformly in $t$. Thus, $D_n$ converges uniformly and $D$ is thus continuous by 3.1.3. Hence, $f'(x) = g(x)$ and we are done.
Proposition 3.1.10 (power series converges uniformly). Let $\sum c_k x^k$ be a power series with radius of convergence $R$. Then for any $r \in [0, R)$, the series converges uniformly on $[-r, r]$.
Proof. Let $r < \beta < R$. Then for sufficiently large $k$,
\[ \frac{1}{\beta^k} > |c_k|. \]
Then, for $|x| \le r$,
\[ |c_k x^k| \le \frac{r^k}{\beta^k} \]
and $\sum r^k / \beta^k$ converges. By 3.1.7, $\sum c_k x^k$ converges uniformly.
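Corollary 3.1.10.1. A power series can be integrated and differentiated term by term on its interval of convergence.
Proof. Let $f(t) = \sum_{k=0}^{\infty} c_k t^k$ and fix $|x| < R$. Then by 3.1.8,
\[ \int_0^x f(t)\,dt = \sum_{k=0}^{\infty} \frac{c_k}{k+1} x^{k+1}. \]
Since
\[ \limsup_{k \to \infty} \sqrt[k]{\frac{|c_{k-1}|}{k}} = \limsup_{k \to \infty} \left( |c_{k-1}|^{1/(k-1)} \right)^{(k-1)/k} \left( \frac{1}{k} \right)^{1/k} \]
and $(k - 1)/k \to 1$ and $k^{-1/k} \to 1$, the radius of convergence of the integrated series is also $R$.
A similar calculation shows that
\[ \sum_{k=1}^{\infty} k c_k t^{k-1} \]
has radius of convergence $R$, so by 3.1.10 it converges uniformly on $[-r, r]$ for every $r \in [0, R)$. Thus by 3.1.9, $f$ is differentiable on $(-R, R)$ and
\[ f'(t) = \sum_{k=1}^{\infty} k c_k t^{k-1}. \]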
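For a concrete check of term-by-term integration and differentiation, consider the geometric series $f(t) = \sum_{k \ge 0} t^k = 1/(1-t)$ on $(-1, 1)$. This is an assumed example with $c_k = 1$ and $R = 1$, not one from the notes. Differentiating term by term should give $1/(1-t)^2$, and integrating term by term from $0$ to $x$ should give $-\log(1 - x)$. The sketch truncates both series at a fixed number of terms; the function names are hypothetical.

```python
import math

def derivative_series(t, terms=200):
    """Truncated term-by-term derivative of sum t**k: sum of k * t**(k-1)."""
    return sum(k * t ** (k - 1) for k in range(1, terms))

def integral_series(x, terms=200):
    """Truncated term-by-term integral of sum t**k from 0 to x: sum of x**(k+1)/(k+1)."""
    return sum(x ** (k + 1) / (k + 1) for k in range(terms))

t = 0.5
print(derivative_series(t), 1 / (1 - t) ** 2)
print(integral_series(t), -math.log(1 - t))
```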

Remark. This shows that if $f(x) = \sum_{k=0}^{\infty} c_k x^k$ with positive radius of convergence, then
\[ c_k = \frac{f^{(k)}(0)}{k!}. \]

Proposition 3.1.11 (change of reference point). Suppose $f(x) = \sum_{k=0}^{\infty} c_k x^k$ is a power series with radius of convergence $R$. Then for any $x_0 \in (-R, R)$, any $\delta > 0$ with $(x_0 - \delta, x_0 + \delta) \subseteq (-R, R)$ and any $x \in (x_0 - \delta, x_0 + \delta)$,
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k. \]

Proof. Take $x \in (x_0 - \delta, x_0 + \delta)$. We have
\[ f(x) = \sum_{k=0}^{\infty} c_k (x_0 + x - x_0)^k = \sum_{k=0}^{\infty} c_k \sum_{j=0}^{k} \binom{k}{j} x_0^{k-j} (x - x_0)^j \]
and
\[ \sum_{k=0}^{\infty} \sum_{j=0}^{k} |c_k| \binom{k}{j} |x_0|^{k-j} |x - x_0|^j = \sum_{k=0}^{\infty} |c_k| (|x_0| + |x - x_0|)^k \]
converges as $|x_0| + |x - x_0| < R$. Then by 2.3.25,
\[ f(x) = \sum_{j=0}^{\infty} \sum_{k=j}^{\infty} c_k \binom{k}{j} x_0^{k-j} (x - x_0)^j = \sum_{j=0}^{\infty} \frac{(x - x_0)^j}{j!} \sum_{k=j}^{\infty} c_k k(k-1) \cdots (k-j+1) x_0^{k-j}. \]
By 3.1.10.1,
\[ \sum_{k=j}^{\infty} c_k k(k-1) \cdots (k-j+1) x_0^{k-j} = f^{(j)}(x_0) \]
and the result follows.


Proposition 3.1.12 (Abel's theorem). Suppose $f(x) = \sum c_k x^k$ is a power series with radius of convergence $R > 0$. If $\sum c_k R^k$ converges, then $f(x)$ converges uniformly on $(0, R]$.
Proof. Define $a_k = x^k / R^k$ and $b_k = c_k R^k$. Then $\sum b_k$ converges uniformly (as a series of constant functions) and $a_k$ is monotone and uniformly bounded on $(0, R]$. By 3.1.5, $\sum a_k b_k$ converges uniformly on $(0, R]$.
Definition 3.1.13. A function $f : U \to \mathbb{R}$, where $U \subseteq \mathbb{R}$ is open, is analytic if for any $x_0 \in U$, there exists $\delta > 0$ such that
\[ f(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \]
with radius of convergence $\delta$.


Remark. By 3.1.11, if $f(x)$ can be written as a power series $f(x) = \sum c_k x^k$ with radius of convergence $R > 0$, then $f$ is analytic on $(-R, R)$.
Proposition 3.1.14 (roots of analytic function are discrete). Let f be a non-constant analytic function on
an open interval (a, b), and suppose f (c) = 0 for some c ∈ (a, b). Then there exists δ > 0 such that f (x) ̸= 0
whenever 0 < |x − c| < δ.
Proof. We first claim that if f is a constant in a subinterval (c− , c+ ), then f is a constant on (a, b). Let
c ∈ (c− , c+ ) and S = {x ∈ (a, b) : f (x) = f (c)}. Then S is closed in (a, b) as f is continuous. We claim that
sup S = b. Suppose s := sup S < b. Since its left derivative of any order (≥ 1) at s is 0, there exists s′ ∈ (s, b)
such that f (s′ ) = f (c) as f is analytic at s, a contradiction. Similarly, inf S = a and thus f is a constant on
(a, b).
Suppose $f(c) = 0$. Since $f$ is not a constant, there exists $k_0$ such that $f^{(k_0)}(c) \ne 0$ by the claim above (otherwise, $f(x) = 0$ on some open interval). Without loss of generality, let $k_0$ be the smallest such integer. Then $f(x) = (x - c)^{k_0} g(x)$ with $g(c) \ne 0$. Since $g$ is continuous, there exists a neighborhood $U$ of $c$ such that $g(x) \ne 0$ for all $x \in U$. Thus, $f(x) \ne 0$ for all $x \in U \setminus \{c\}$.
Definition 3.1.15. Suppose fn , f : X → Y are functions from X to Y . If for any X-sequence xn → x ∈ X,
fn (xn ) → f (x), then we say f is the continuous limit of fn and fn → f

3.2 Arzela-Ascoli
Definition 3.2.1. For a family of functions E on X to Y , we say that E is equicontinuous if for any ϵ > 0,
there exists δ > 0 such that for any f ∈ E and x, y ∈ X, d(x, y) < δ implies d(f (x), f (y)) < ϵ.

Proposition 3.2.2 (equicontinuity makes local boundedness uniform). Suppose X is compact. Let E be an
equicontinuous family of functions from X to Y . If for any x ∈ X, {f (x) : f ∈ E } is bounded, then E is
uniformly bounded.
Proof. Since E is equicontinuous, there exists δ > 0 such that for any x, y ∈ X and f ∈ E, d(x, y) < δ implies
d(f (x), f (y)) < 1. Cover X by finitely many δ-balls with centers x1 , · · · , xn . Since each {f (xi ) : f ∈ E} is bounded, there
exist y0 ∈ Y and R > 0 such that {f (xi ) : f ∈ E} ⊆ B(y0 ; R) for all i ∈ {1, · · · , n}. Then for any x ∈ X,
there exists xi ∈ X such that d(x, xi ) < δ. Then for any f ∈ E,

d(f (x), y0 ) ≤ d(f (x), f (xi )) + d(f (xi ), y0 ) < R + 1 =⇒ {f (x) : x ∈ X, f ∈ E} ⊆ B(y0 ; R + 1).
