math3301-lecturenotes

The document is a set of lecture notes for a course on Measure Theory and Integration, focusing on the Lebesgue integral as a generalization of the Riemann integral. It introduces key concepts such as measure theory, Lebesgue measure, and the properties of Lebesgue integrable functions, emphasizing their importance in various fields of mathematics. The content includes preliminaries, axiomatic measure theory, and detailed discussions on the Lebesgue measure and integral.


Math 3301

Measure theory and integration I

Lecture notes of Prof. Hicham Gebran


hicham.gebran@gmail.com

Lebanese University, Fanar, Fall 2017-2018


https://hichamgebran.wordpress.com

Introduction and orientation


In previous courses, you studied the Riemann integral. With it, you can integrate many functions and you know, for example, that every piecewise continuous function can be integrated. There are however functions that do not possess a Riemann integral. Consider for instance the function $f : [0, 1] \to \mathbb{R}$ defined by $f(x) = 1$ if x is rational and $f(x) = 0$ if x is irrational. We shall see that this function is not integrable in the sense of Riemann. It does however have an integral, called the Lebesgue integral. The Lebesgue integral is a generalization of the Riemann integral: it allows us to integrate more functions. One of our goals in this course is to construct the Lebesgue integral and study its properties.
There are many ways to present the subject. We have chosen the approach based on measure theory. A measure is a generalization of the familiar notions of length, area, volume and probability of an event. In this course we shall construct a measure on $\mathbb{R}$, called the Lebesgue measure, which assigns a "length" to many subsets of $\mathbb{R}$ that are not necessarily intervals.
Let me make some remarks about the subject before we start. They will become clear as we progress.
1. As we said, the Lebesgue integral allows us to integrate a wider class of functions than the Riemann integral does. Also, the sets on which we integrate need not be intervals.
2. The theorems of the Lebesgue theory are stronger and easier to use than those of the Riemann theory.
3. There is an analogy between the completion of the rational numbers by real numbers and the completion of Riemann integrable functions by Lebesgue integrable functions.
4. The Lebesgue theory of measure and integration is fundamental to many fields of mathematics such as probability, functional analysis, dynamical systems and Fourier series.

All questions, comments, remarks and suggestions are welcome.

References

Paul R. Halmos, Measure theory (Princeton, Van Nostrand, 1950).
Roger Jean, Mesure et intégration (Presses de l'Université du Québec, 1982).
Walter Rudin, Real and complex analysis (McGraw-Hill, 1977).
Marc Troyanov, Mesure et intégration (Lecture notes, EPFL, 2005).
Thierry Gallay, Théorie de la mesure et de l'intégration (Lecture notes, Université Joseph Fourier, 2009).
Contents

1 Preliminaries
  1.1 Sets and functions
  1.2 Topology
  1.3 Some fundamental properties of the real line and the extended real line
    1.3.1 The real line
    1.3.2 The extended real line
    1.3.3 Superior limit and inferior limit

2 Axiomatic measure theory
  2.1 Measurable spaces
  2.2 Measure spaces
  2.3 Measurable functions
    2.3.1 Measurable real valued functions
    2.3.2 Simple functions
  2.4 Outer measures and Caratheodory's theorem
  2.5 Completion of a measure space

3 The Lebesgue measure on $\mathbb{R}$
  3.1 Construction and properties
  3.2 Counterexamples
    3.2.1 A Lebesgue non measurable set
    3.2.2 The middle third Cantor set

4 The Lebesgue integral
  4.1 Construction and properties
    4.1.1 Simple functions
    4.1.2 Nonnegative measurable functions
    4.1.3 Summable functions
    4.1.4 Complex valued functions
  4.2 The Lebesgue dominated convergence theorem
  4.3 Relations with the Riemann integral
  4.4 Some applications

A The Riemann integral
  A.1 Definitions
  A.2 Criteria of integrability
  A.3 Classes of integrable functions
  A.4 Properties of the integral
  A.5 Integration and differentiation
  A.6 Limits and integration

Chapter 1

Preliminaries

1.1 Sets and functions


Given a set E we denote by $\mathcal{P}(E)$ or $2^E$ the set of all subsets of E. The reason for the notation $2^E$ is that if E is finite and contains n elements, then there are $2^n$ subsets of E.
Let E be a set and let $A \subset E$. The characteristic function of A is the function $\chi_A$ defined by

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$

It is also denoted by $1_A$. The complement of A in E is the set of points in E that are not in A. It is denoted by one of the following notations

$$\complement_E A, \quad \complement A, \quad E \setminus A, \quad E - A, \quad A^c.$$

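Given a family of sets $(A_i)_{i \in I}$, we set

$$\bigcup_{i \in I} A_i = \{x \mid \exists j \in I \text{ such that } x \in A_j\},$$

$$\bigcap_{i \in I} A_i = \{x \mid x \in A_i,\ \forall i \in I\}.$$

Recall De Morgan's laws

$$X \setminus \bigcup_{i \in I} A_i = \bigcap_{i \in I} (X \setminus A_i), \qquad X \setminus \bigcap_{i \in I} A_i = \bigcup_{i \in I} (X \setminus A_i),$$

and the distributivity laws

$$A \cap \bigcup_{i \in I} A_i = \bigcup_{i \in I} (A \cap A_i), \qquad A \cup \bigcap_{i \in I} A_i = \bigcap_{i \in I} (A \cup A_i).$$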

Let $f : X \to Y$ be a map between two sets. Let $A \subset X$. The (direct) image of A under f is

$$f(A) = \{y \in Y \mid \exists x \in A \text{ such that } y = f(x)\}.$$

Let $B \subset Y$. The inverse image of B under f is

$$f^{-1}(B) = \{x \in X \mid f(x) \in B\}.$$


Regardless of whether f is a bijection or not, $f^{-1}$ defines a map

$$f^{-1} : \mathcal{P}(Y) \to \mathcal{P}(X), \qquad B \mapsto f^{-1}(B).$$

This map satisfies

1. $f^{-1}(\emptyset) = \emptyset$ and $f^{-1}(Y) = X$.

2. $f^{-1}(A^c) = (f^{-1}(A))^c$ for every $A \subset Y$.

3. $f^{-1}(\bigcup_{i \in I} B_i) = \bigcup_{i \in I} f^{-1}(B_i)$ and $f^{-1}(\bigcap_{i \in I} B_i) = \bigcap_{i \in I} f^{-1}(B_i)$.

We use capital letters to denote sets and calligraphic letters to denote sets of sets. If $\mathcal{B}$ is a collection of subsets of Y, then we write $f^{-1}(\mathcal{B}) = \{f^{-1}(B) \mid B \in \mathcal{B}\}$. This is the direct image under $f^{-1}$ of the set $\mathcal{B}$.
Note that if $(A_i)_{i \in I}$ is a family of subsets of X, then

$$f\Big(\bigcup_{i \in I} A_i\Big) = \bigcup_{i \in I} f(A_i),$$

but we only have

$$f\Big(\bigcap_{i \in I} A_i\Big) \subset \bigcap_{i \in I} f(A_i);$$

equality holds if f is one to one (injective).
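The behaviour of images and inverse images under unions and intersections can be checked on a small finite example. The sketch below is only an illustration: the set `X`, the map `f` and the subsets `A1`, `A2`, `B1`, `B2` are chosen here and do not come from the notes.

```python
# Illustrative finite example; all names and data are assumptions of this sketch.

def image(f, A):
    """Direct image f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

def preimage(f, X, B):
    """Inverse image f^{-1}(B) = {x in X : f(x) in B}."""
    return {x for x in X if f(x) in B}

X = {-2, -1, 0, 1, 2}
f = lambda x: x * x          # not injective on X

A1, A2 = {-2, -1}, {1, 2}

# Images commute with unions.
union_ok = image(f, A1 | A2) == image(f, A1) | image(f, A2)

# For intersections only an inclusion holds; here it is strict.
lhs = image(f, A1 & A2)              # image of the empty set
rhs = image(f, A1) & image(f, A2)    # both images equal {1, 4}
inclusion_ok = lhs <= rhs
strict = lhs != rhs

# Inverse images commute with unions, intersections and complements.
Y = image(f, X)
B1, B2 = {0, 1}, {1, 4}
pre_union_ok = preimage(f, X, B1 | B2) == preimage(f, X, B1) | preimage(f, X, B2)
pre_inter_ok = preimage(f, X, B1 & B2) == preimage(f, X, B1) & preimage(f, X, B2)
pre_compl_ok = preimage(f, X, Y - B1) == X - preimage(f, X, B1)
```

Because `f` identifies `x` and `-x`, the sets `A1` and `A2` are disjoint but have the same image, which is exactly how the inclusion for intersections can fail to be an equality.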

Countable sets
A set E is called countable if there is a one to one function $f : E \to \mathbb{N}^*$, where $\mathbb{N}^*$ is the set of positive integers. This means that we can put the elements of E in a finite or infinite sequence $(x_1, x_2, \ldots)$. So for example the set of all integers $\mathbb{Z}$ and the set of all rational numbers $\mathbb{Q}$ are countable. A countable union of countable sets is countable, that is, if I is countable and $A_i$ is countable for all $i \in I$, then $\bigcup_{i \in I} A_i$ is countable. Also a finite Cartesian product of countable sets is countable, so for example $\mathbb{N} \times \mathbb{N}$ is countable. However $\mathcal{P}(\mathbb{N})$ and the set of real numbers $\mathbb{R}$ are uncountable.
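One concrete way to see that a product of two countable sets is countable is to write down an explicit one to one map into the positive integers. The sketch below uses a Cantor-style pairing along anti-diagonals; the function name `pair` is an assumption of this illustration, and the notes themselves do not single out any particular enumeration.

```python
def pair(m, n):
    """Cantor-style pairing of positive integers (m, n) into a positive integer."""
    s = m + n - 2                      # index of the anti-diagonal, starting at 0
    return s * (s + 1) // 2 + n        # position along that diagonal

# Check injectivity on a finite grid: distinct pairs get distinct codes.
grid = [(m, n) for m in range(1, 51) for n in range(1, 51)]
codes = [pair(m, n) for (m, n) in grid]
injective_on_grid = len(set(codes)) == len(grid)

# The first few pairs in the order induced by the codes.
first_pairs = sorted(grid, key=lambda p: pair(*p))[:6]
```

The grid check is only a finite sanity test; injectivity on all pairs follows because diagonal number `s` receives exactly the `s + 1` consecutive codes after those of the previous diagonals.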

Limsup and liminf


Let X be a set and $(A_n)_{n \in \mathbb{N}^*}$ be a sequence of subsets of X. We set

$$\liminf A_n = \underline{\lim}\, A_n = \bigcup_{n \in \mathbb{N}^*} \bigcap_{k \ge n} A_k,$$

and

$$\limsup A_n = \overline{\lim}\, A_n = \bigcap_{n \in \mathbb{N}^*} \bigcup_{k \ge n} A_k.$$

Therefore, $x \in \liminf A_n$ if and only if, starting from a certain rank, x belongs to all the $A_n$. In probability theory, the event $\liminf A_n$ is called the event that the $A_n$ happen almost always ($A_n$ a.a.).
On the other hand, $x \in \limsup A_n$ if and only if x belongs to $A_n$ for infinitely many indices n. In probability theory, the event $\limsup A_n$ is called the event that the $A_n$ happen infinitely often ($A_n$ i.o.).
Example. Consider the experiment of flipping a fair coin infinitely many times. Let $A_n$ be the event: a head appears on the nth flip. Then

1. $\bigcap_{n=1}^{\infty} A_n$ is the event: a head appears on all tosses, that is, {(HH . . . H . . .)}.

2. $\liminf A_n$ is the event: a head appears starting from a certain flip. It is the union of all events of the form

(HH . . . H . . .) (XHH . . . H . . .) (XXHH . . . H . . .) . . .

3. $\limsup A_n$ is the event: a head appears infinitely many times. This event is the complement of the event: a tail appears starting from a certain flip.

4. $\bigcup_{n=1}^{\infty} A_n$ is the event: a head appears at least once. □

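Now we claim that

$$\bigcap_{n \in \mathbb{N}^*} A_n \subset \underline{\lim}\, A_n \subset \overline{\lim}\, A_n \subset \bigcup_{n \in \mathbb{N}^*} A_n.$$

Proof. For the first inclusion, let $B_n = \bigcap_{k \ge n} A_k$. Then the first inclusion is equivalent to $B_1 \subset \bigcup_{n \in \mathbb{N}^*} B_n$, which is obviously true. For the third inclusion, let $C_n = \bigcup_{k \ge n} A_k$. Then the third inclusion is equivalent to $\bigcap_{n \in \mathbb{N}^*} C_n \subset C_1$, which is obviously true. Now for the second inclusion observe that

$$x \in \liminf A_n \iff x \in \bigcup_{n \in \mathbb{N}^*} \bigcap_{k \ge n} A_k \iff \exists n\ \forall k \ge n,\ x \in A_k \iff \{n \in \mathbb{N}^* \mid x \notin A_n\} \text{ is finite},$$

and that

$$x \in \limsup A_n \iff x \in \bigcap_{n \in \mathbb{N}^*} \bigcup_{k \ge n} A_k \iff \forall n\ \exists k \ge n,\ x \in A_k \iff \{n \in \mathbb{N}^* \mid x \in A_n\} \text{ is infinite}.$$

Now if $\{n \in \mathbb{N}^* \mid x \notin A_n\}$ is finite then its complement $\{n \in \mathbb{N}^* \mid x \in A_n\}$ is infinite because $\mathbb{N}^*$ is infinite. Hence the inclusion. □
We say that $(A_n)$ converges to A if $\underline{\lim}\, A_n = \overline{\lim}\, A_n = A$.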
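For a sequence of sets that is eventually periodic, the two limits can be computed exactly, which gives a quick way to test the chain of inclusions above. The following is a minimal sketch under that assumption; the helper `liminf_limsup` and the sample data are hypothetical and not part of the notes.

```python
from functools import reduce

def liminf_limsup(prefix, cycle):
    """lim inf and lim sup of the sequence: prefix first, then cycle repeated forever.

    Only the repeating part matters: x belongs to A_n for all large n
    iff it lies in every set of the cycle, and to A_n for infinitely many n
    iff it lies in at least one set of the cycle. The prefix is ignored.
    """
    lim_inf = reduce(set.intersection, cycle)
    lim_sup = reduce(set.union, cycle)
    return lim_inf, lim_sup

prefix = [{0, 1, 2, 3}, {3}]          # A_1, A_2 (hypothetical data)
cycle = [{0, 2}, {1, 2}]              # A_3, A_4, A_5, ... alternate between these

lim_inf, lim_sup = liminf_limsup(prefix, cycle)

all_sets = prefix + cycle
total_inter = reduce(set.intersection, all_sets)   # intersection of all A_n
total_union = reduce(set.union, all_sets)          # union of all A_n

# The chain: intersection ⊂ lim inf ⊂ lim sup ⊂ union.
chain_holds = total_inter <= lim_inf <= lim_sup <= total_union
```

Here the sequence does not converge, since the point 2 is the only one that is eventually in every set while 0 and 1 each return infinitely often.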

The axiom of choice


In chapter 3, we shall need the following axiom.
Axiom of choice. Let $(A_i)_{i \in I}$ be a family of nonempty and pairwise disjoint sets. Then there exists a set C such that for all $i \in I$, $C \cap A_i$ is a singleton. Otherwise stated, given a family of sets as above, we can select exactly one element from each set.
This axiom seems trivial (and this is why we call it an axiom). It has however some far-reaching and unexpected consequences, like the fact that any set can be well ordered (this is known as the well ordering theorem). An ordered set $(E, \le)$ is said to be well ordered if every nonempty subset of E has a smallest element. For example, $\mathbb{N}$ is well ordered in the usual order. However $\mathbb{Z}$ and $\mathbb{R}$ are not well ordered in the usual order. The well ordering theorem implies that there exists a well ordering on $\mathbb{R}$, a fact that no one can prove without the axiom of choice. Actually the axiom of choice is equivalent to the well ordering theorem. The well ordering theorem startled the mathematical world at the beginning of the twentieth century. A careful analysis of its proof led to the formulation of the axiom of choice. The axiom of choice asserts the existence of a set but gives no procedure to construct it, and this is why some mathematicians of the early twentieth century rejected it. Nowadays, most mathematicians accept it and we shall use it in chapter 3 to construct a nonmeasurable subset of $\mathbb{R}$.

1.2 Topology
Let X be a set. A family T of subsets of X (that is, T ⊂ P(X)) is called a topology on X
provided the following conditions hold.

1. ∅, X ∈ T .
2. If Oi ∈ T for all i ∈ I, then ∪i∈I Oi ∈ T .
3. If O1 , O2 ∈ T , then O1 ∩ O2 ∈ T .

The elements of T are called open sets of X. A set is called closed if its complement is open.
If d is a distance on X, then d generates a topology on X.
A function $f : X \to Y$ between two topological spaces is called continuous at a point $x_0$ if for every neighborhood V of $f(x_0)$, there exists a neighborhood U of $x_0$ such that $f(U) \subset V$. A function f is called continuous on X if it is continuous at every point of X. The following conditions are equivalent.

(i) f : X → Y is continuous.
(ii) The inverse image under f of every open set of Y is an open set of X.
(iii) The inverse image under f of every closed set of Y is a closed set of X.

A subset A of a metric space X is called dense if the closure of A is equal to X. The following conditions are equivalent.
1. A is dense in X.
2. Every nonempty open set of X meets A.
3. For every $x \in X$, there exists a sequence in A that converges to x.
Let A be a subset of a topological space. An open covering of A is a family of open sets whose union contains A. The subset A is called compact if any open covering of A contains a finite subcollection that also covers A. In $\mathbb{R}^n$, a subset is compact if and only if it is closed and bounded.
A subset A of a topological space is connected if the following condition holds. If A ⊂ B ∪ C
with B and C open and disjoint, then either A ⊂ B or A ⊂ C. The union of connected sets
having a point in common is connected. The connected subsets of IR are the intervals. A
connected component of a topological space is a maximal (with respect to inclusion) connected
subspace, i.e., a connected subspace which is not contained properly in a bigger connected
subspace.

1.3 Some fundamental properties of the real line and the extended real line
1.3.1 The real line
We start by formulating two fundamental properties of the set of real numbers that we shall use in the sequel. This set is denoted by $\mathbb{R}$ and it is naturally identified with a line. An upper bound of a subset $E \subset \mathbb{R}$ is a real number M such that $x \le M$ for all $x \in E$. The set E is called bounded from above if it has an upper bound. A lower bound of a subset $E \subset \mathbb{R}$ is a real number m such that $m \le x$ for all $x \in E$. The set E is called bounded from below if it has a lower bound.

Definition 1.1 Let E ⊂ IR be bounded from above. The number α is called the least upper
bound of E if the following hold.
i) α is an upper bound of E.
ii) If β < α, then β is not an upper bound of E.
In this case, we write α = sup E. If E is not bounded from above, we write sup E = +∞.

Proposition 1.1 Let E ⊂ IR be bounded from above. Then α = sup E if and only if the
following hold.
i) x ≤ α for all x ∈ E.
ii) ∀ε > 0 ∃y ∈ E such that α − ε < y.

Definition 1.2 Let E ⊂ IR be bounded from below. The number α is called the greatest lower
bound of E if the following hold.
i) α is a lower bound of E.
ii) If β > α, then β is not a lower bound of E.
In this case, we write α = inf E. If E is not bounded from below, we write inf E = −∞.

Proposition 1.2 Let E ⊂ IR be bounded from below. Then α = inf E if and only if the
following hold.
i) x ≥ α for all x ∈ E.
ii) ∀ε > 0 ∃y ∈ E such that α + ε > y.

Now we can state the two fundamental properties of IR.

Theorem 1.1 Any nonempty subset of IR which is bounded from above has a least upper
bound. Any nonempty subset of IR which is bounded from below has a greatest lower bound.
Any proof of this theorem involves going back to the construction of the real numbers from
the rational numbers. We do not address this issue here.
Recall now that a subset E ⊂ IR is an interval if [x, y] ⊂ E whenever x and y belong to E.
An interval has thus one of the following eleven forms ∅, {a}, ]a, b[, ]a, b], [a, b[, [a, b], ]−∞, a],
] − ∞, a[, ]a, +∞[, [a, +∞[ and IR. We shall say that an interval is trivial if it is empty or a
singleton.

Theorem 1.2 Any nontrivial interval of the real line contains rational as well as irrational
numbers.
This theorem is not difficult to prove using the following evident facts: 1- for each real
number a, there is an integer n bigger than a (this is known as the Archimedean property); 2-
every nonempty subset of IN has a smallest element (IN is well ordered).

Proposition 1.3 Let a and b be two real numbers. If a ≤ b + ε for all ε > 0 then a ≤ b.

Theorem 1.3 Every open set O of the real line is the union of a countable family of pairwise
disjoint open intervals.
Proof. Let $(I_\lambda)_{\lambda \in L}$ be the collection of connected components of O. We know that each $I_\lambda$ is an open interval (recall that in a locally connected space, a connected component of an open set is open). Since the rational numbers are dense in $\mathbb{R}$, each $I_\lambda$ contains a rational number. By the axiom of choice, we can choose exactly one rational number in each $I_\lambda$. This defines a function from L to $\mathbb{Q}$ which is one to one because two distinct components are disjoint. Therefore L is countable. Finally we have indeed $O = \bigcup_{\lambda \in L} I_\lambda$. □
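When the open set is given as a finite union of bounded open intervals, its connected components can be computed by merging the intervals that overlap. The sketch below covers only this finite case, as an illustration of the decomposition in Theorem 1.3; the function `components` and the sample data are assumptions made here.

```python
def components(intervals):
    """Connected components of a finite union of bounded open intervals ]a, b[.

    `intervals` is a list of pairs (a, b) with a < b.
    """
    result = []
    for a, b in sorted(intervals):
        # Two open intervals belong to the same component only if they overlap:
        # ]0, 1[ and ]1, 2[ stay separate because the point 1 is missing.
        if result and a < result[-1][1]:
            last_a, last_b = result[-1]
            result[-1] = (last_a, max(last_b, b))
        else:
            result.append((a, b))
    return result

O = [(0.5, 2), (5, 6), (0, 1), (2, 3)]    # hypothetical data
parts = components(O)
```

The strict inequality in the merge test is the important design choice: intervals that merely touch at an endpoint do not form a single interval, because the shared endpoint is not in the union.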

1.3.2 The extended real line


In measure and integration theory we will have to deal with sets of infinite measure and with unbounded functions. So we need to extend the set of real numbers. The extended real line, denoted by $\overline{\mathbb{R}}$ or $[-\infty, +\infty]$, is obtained from $\mathbb{R}$ by adding two objects $+\infty$ and $-\infty$ (which are not real numbers).
Rules in $\overline{\mathbb{R}}$
I. Order. We extend naturally the usual order on $\mathbb{R}$ by letting $-\infty < x < +\infty$ for every $x \in \mathbb{R}$. This order is total, that is, any two elements are comparable. Also every nonempty subset of $\overline{\mathbb{R}}$ has a least upper bound (sup) and a greatest lower bound (inf).
II. Arithmetic operations.

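1. $x + \infty = +\infty$, for all $x \in\, ]-\infty, +\infty]$.

2. $x - \infty = -\infty$, for all $x \in [-\infty, +\infty[$.

3. $$x \times (+\infty) = \begin{cases} +\infty & \text{if } x \in\, ]0, +\infty] \\ -\infty & \text{if } x \in [-\infty, 0[. \end{cases}$$

4. $0 \times (\pm\infty) = 0$. This is a convenient convention in measure theory. The reason is that we want to have $\int_A 0 = 0$ even if A has an infinite measure.

Warning. The following operations are not defined: $\infty - \infty$, $-\infty + \infty$, $\dfrac{\pm\infty}{\pm\infty}$.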
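These conventions differ from floating point arithmetic in one visible place: in IEEE arithmetic the product of zero and infinity is "not a number", whereas measure theory sets it to zero. The sketch below models the rules with Python floats; the helpers `ext_add` and `ext_mul` are hypothetical names introduced for this illustration, and division is deliberately left out.

```python
import math

INF = math.inf

def ext_add(x, y):
    """Sum in the extended real line; (+inf) + (-inf) is left undefined."""
    if {x, y} == {INF, -INF}:
        raise ValueError("inf - inf is not defined")
    return x + y

def ext_mul(x, y):
    """Product in the extended real line with the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

print(0 * INF)            # plain floating point gives nan here
print(ext_mul(0, INF))    # the measure-theoretic convention gives 0.0
```

Raising an error for the undefined sum, rather than returning a placeholder value, keeps an accidental `inf - inf` from propagating silently through later computations.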

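III. The topology of $\overline{\mathbb{R}}$. We define a distance on $\overline{\mathbb{R}}$ by setting

$$d(x, y) = |\arctan x - \arctan y|$$

with the convention that $\arctan(+\infty) = \frac{\pi}{2}$ and $\arctan(-\infty) = -\frac{\pi}{2}$, so that for example

$$d(x, +\infty) = \left|\arctan x - \frac{\pi}{2}\right| \text{ if } x \in \mathbb{R} \quad \text{and} \quad d(+\infty, -\infty) = \pi.$$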
Remark 1.1 Let $(x_n)$ be a sequence of $\mathbb{R}$. Recall that $(x_n)$ is said to tend to $+\infty$ if for every $A > 0$, we have $x_n > A$ for all n large enough. It is proved in calculus that $x_n \to +\infty$ if and only if $\arctan x_n \to \frac{\pi}{2}$. It follows that $x_n \to +\infty$ if and only if $\lim d(x_n, +\infty) = 0$ in $\mathbb{R}$, which means that $(x_n)$ converges to the point $+\infty$ in the topology of $\overline{\mathbb{R}}$. Similarly, $x_n \to -\infty$ if and only if $\lim d(x_n, -\infty) = 0$.
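The distance just defined is easy to experiment with numerically, since Python's `math.atan` accepts infinite arguments and returns ±π/2 there, in line with the convention above. The following sketch, with the sample sequence $x_n = n^2$ chosen only for illustration, shows the distance to $+\infty$ shrinking along a sequence that tends to $+\infty$.

```python
import math

def d(x, y):
    """Distance on the extended real line: |arctan x - arctan y|."""
    # math.atan returns +-pi/2 at +-inf, matching the convention in the text.
    return abs(math.atan(x) - math.atan(y))

diameter = d(math.inf, -math.inf)     # the largest possible distance, pi

# x_n = n^2 tends to +inf, so d(x_n, +inf) should shrink to 0.
distances = [d(n * n, math.inf) for n in (1, 10, 100, 1000)]
```

Every distance computed this way is at most π, which reflects the fact that the extended real line is bounded for this metric even though it contains the whole real line.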
Here are some properties of the topology of $\overline{\mathbb{R}}$.

1. The restriction of d to $\mathbb{R} \times \mathbb{R}$ is a distance that generates the usual topology of $\mathbb{R}$.
Indeed, let first O be open in $(\mathbb{R}, d)$ and let $x \in O$. Then there is $r > 0$ such that $B_d(x, r) := \{y \mid d(x, y) < r\} \subset O$. But $B(x, r) := \{y \mid |x - y| < r\} \subset B_d(x, r)$ since $d(x, y) = |\arctan x - \arctan y| \le |x - y|$. Therefore $B(x, r) \subset O$. Hence O is open for the usual topology, which therefore contains the topology generated by d.
Conversely, let O be open for the usual topology of $\mathbb{R}$ and let $x \in O$. Then there is $\varepsilon > 0$ such that $B(x, \varepsilon) \subset O$. Now since the function $t \mapsto \tan t$ is continuous at $\arctan x$, there is $\delta > 0$ such that $|\tan(\arctan x) - \tan z| < \varepsilon$ whenever $|z - \arctan x| < \delta$. We claim that $B_d(x, \delta) \subset B(x, \varepsilon)$. Indeed, $d(x, y) < \delta \Rightarrow |\arctan x - \arctan y| < \delta \Rightarrow |x - y| = |\tan(\arctan x) - \tan(\arctan y)| < \varepsilon$. Therefore, O is open for d.
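2. $\overline{\mathbb{R}}$ is homeomorphic to $[-\frac{\pi}{2}, \frac{\pi}{2}]$ and therefore to any compact interval $[a, b]$ ($a < b$) of $\mathbb{R}$. Therefore $\overline{\mathbb{R}}$ is compact and connected. Indeed, consider the map $h : \overline{\mathbb{R}} \to [-\frac{\pi}{2}, \frac{\pi}{2}]$ defined by $h(x) = \arctan x$. First, h is clearly a bijection. Next, h is continuous.

• Let $x \in \mathbb{R}$ and let $x_n \to x$, then $\arctan x_n \to \arctan x$, i.e., $h(x_n) \to h(x)$.
• Let $x = +\infty$ and let $x_n \to +\infty$, then $h(x_n) \to \frac{\pi}{2} = h(+\infty)$.
• Let $x = -\infty$ and let $x_n \to -\infty$, then $h(x_n) \to -\frac{\pi}{2} = h(-\infty)$.

The inverse map is given by

$$h^{-1}(x) = \begin{cases} \tan x & \text{if } x \in\, ]-\frac{\pi}{2}, \frac{\pi}{2}[ \\ +\infty & \text{if } x = \frac{\pi}{2} \\ -\infty & \text{if } x = -\frac{\pi}{2}. \end{cases}$$

The proof of the continuity of $h^{-1}$ is similar to the proof of continuity of h.

3. $\mathbb{R}$ is dense in $\overline{\mathbb{R}}$. This follows from the fact that h maps $\mathbb{R}$ homeomorphically onto $]-\frac{\pi}{2}, \frac{\pi}{2}[$, which is dense in $[-\frac{\pi}{2}, \frac{\pi}{2}]$.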

Theorem 1.4 Every open set O of $\overline{\mathbb{R}}$ is the union of a countable family of pairwise disjoint open intervals of $\overline{\mathbb{R}}$.

Proof. Observe first that the connected subsets of $\overline{\mathbb{R}}$ are exactly the intervals of $\overline{\mathbb{R}}$. Next, a nonempty open interval of $\overline{\mathbb{R}}$ contains an open interval of $\mathbb{R}$, which therefore contains rational points. Therefore the proof of Theorem 1.3 can be repeated. □

We will have to deal later with sequences and series in $\overline{\mathbb{R}}$. So it is important to formulate some rules for using them.

Proposition 1.4 A monotonic sequence $(x_n)$ of $\overline{\mathbb{R}}$ is convergent. Moreover

i) If $(x_n)$ is nondecreasing then

$$\lim x_n = \sup\{x_i \mid i \in \mathbb{N}^*\}.$$

ii) If $(x_n)$ is nonincreasing, then

$$\lim x_n = \inf\{x_i \mid i \in \mathbb{N}^*\}.$$

Proof. Let $(x_n)$ be a monotonic sequence in $\overline{\mathbb{R}}$. Let $y_n = h(x_n)$ where $h : \overline{\mathbb{R}} \to [-\frac{\pi}{2}, \frac{\pi}{2}]$ is the homeomorphism defined above. Then $(y_n)$ is also monotonic and therefore convergent (since it is bounded). Continuity of $h^{-1}$ implies that $(x_n)$ is convergent.
i) Let $(x_n)$ be nondecreasing. We distinguish between two cases.

1) $\sup x_i < +\infty$. We distinguish again between two cases.

a) $x_n = -\infty$ for all $n \in \mathbb{N}^*$. Then indeed, $\lim x_n = \sup x_i = -\infty$.

b) $x_{n_0} \in \mathbb{R}$ for some $n_0$. But then $x_n \in \mathbb{R}$ for all $n \ge n_0$ because the sequence is nondecreasing and $\sup x_i < +\infty$. We modify the sequence by letting $x_n = x_{n_0}$ for $n \le n_0$. This changes neither the limit nor the sup, and we have $x_n \in \mathbb{R}$ for all n. Let $\varepsilon > 0$ be given. By a property of the supremum, there exists $k \in \mathbb{N}^*$ such that $\sup x_i - \varepsilon < x_k$. Then $\sup x_i - \varepsilon < x_n$ for all $n \ge k$ since the sequence is nondecreasing. But $x_n \le \sup x_i$. It follows that $|x_n - \sup x_i| < \varepsilon$ for all $n \ge k$ and this means that $\lim x_n = \sup x_i$.

2) $\sup x_i = +\infty$. We distinguish again between two cases.

a) $x_n < +\infty$ for all $n \in \mathbb{N}^*$. The fact that $\sup x_i = +\infty$ means that the set $\{x_i \mid i \in \mathbb{N}^*\}$ is not bounded from above. This in turn means that for all $M > 0$, there exists k such that $x_k \ge M$. Then $x_n \ge M$ for all $n \ge k$. This means that $\lim x_n = +\infty$ and so $\lim x_n = \sup x_i$.
b) $x_{n_0} = +\infty$ for some $n_0$. Then $x_n = +\infty$ for all $n \ge n_0$. In this case, we have $\lim x_n = +\infty$ and so $\lim x_n = \sup x_i$.

ii) If $(x_n)$ is nonincreasing, then $(-x_n)$ is nondecreasing. Therefore $\lim(-x_n) = \sup(-x_i) = -\inf(x_i)$ and so $\lim x_n = \inf x_i$. □
Let $(x_n)$ be a sequence of $[0, +\infty]$. Then the sequence $S_n = \sum_{k=1}^{n} x_k$ is monotonic. It is therefore convergent. We set $\sum_{k=1}^{\infty} x_k = \lim_{n \to \infty} S_n$. This extends the definition of a convergent series of real numbers. Therefore series of nonnegative terms are always convergent in $\overline{\mathbb{R}}$. Note that the condition $\sum_{k=1}^{\infty} x_k < +\infty$ means that $x_k \in \mathbb{R}$ for all k and that the series is convergent in the usual sense.
The familiar rules of convergent series still hold. For example, if $(x_n)$ and $(y_n)$ are sequences in $[0, +\infty]$ then

1. $\sum_{n=1}^{\infty} (x_n + y_n) = \sum_{n=1}^{\infty} x_n + \sum_{n=1}^{\infty} y_n$.

2. $\sum_{n=1}^{\infty} \alpha x_n = \alpha \sum_{n=1}^{\infty} x_n$ for $\alpha \ge 0$.

3. $(\forall n \in \mathbb{N}^*,\ x_n \le y_n) \Rightarrow \sum_{n=1}^{\infty} x_n \le \sum_{n=1}^{\infty} y_n$.

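Lemma 1.1 The sum of a series of nonnegative terms does not depend on the order of summation.

Proof. Let $(x_n)$ be a sequence of $[0, +\infty]$ and let $\sigma : \mathbb{N}^* \to \mathbb{N}^*$ be a bijection. We have to prove that $\sum_{n=1}^{\infty} x_n = \sum_{n=1}^{\infty} x_{\sigma(n)}$. Let n be given and let $N = \max\{\sigma(1), \ldots, \sigma(n)\}$. Then $\{\sigma(1), \ldots, \sigma(n)\} \subset \{1, \ldots, N\}$. It follows that $\sum_{k=1}^{n} x_{\sigma(k)} \le \sum_{k=1}^{N} x_k \le \sum_{k=1}^{\infty} x_k$ because $x_k \ge 0$. Letting $n \to \infty$, we get $\sum_{k=1}^{\infty} x_{\sigma(k)} \le \sum_{k=1}^{\infty} x_k$. By symmetry (apply the same argument to the sequence $(x_{\sigma(n)})$ and the bijection $\sigma^{-1}$), $\sum_{k=1}^{\infty} x_k \le \sum_{k=1}^{\infty} x_{\sigma(k)}$. Hence the equality. □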
Remark 1.2 If the sequence $(x_n)$ is of variable sign, then the sum of the series $\sum x_n$ may depend on the order of summation. Here is an example. We shall prove later that

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{1}{2k-1} - \frac{1}{2k} + \cdots = \ln 2.$$

However,

$$1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \cdots + \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} + \cdots = \frac{1}{2} \ln 2.$$

Indeed, let $S_n$ be the nth partial sum of the first series and let $S'_n$ be the nth partial sum of the second series. Then

$$S'_{3n} = \sum_{k=1}^{n} \left( \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} \right) = \sum_{k=1}^{n} \left( \frac{1}{4k-2} - \frac{1}{4k} \right) = \frac{1}{2} S_{2n}.$$

Therefore $S'_{3n} \to \frac{1}{2} \ln 2$. Since $S'_{3n-1} = S'_{3n} + \frac{1}{4n}$ and $S'_{3n-2} = S'_{3n-1} + \frac{1}{4n-2}$, it follows that $S'_{3n-1}$ and $S'_{3n-2}$ also converge to $\frac{1}{2} \ln 2$. Consequently, $S'_n$ converges to $\frac{1}{2} \ln 2$. □
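The effect of the rearrangement can be observed numerically. The sketch below compares partial sums of the two series with floating point arithmetic; the function names and the choice `n = 10_000` are assumptions of this illustration, and the computed values only approximate the limits.

```python
import math

def alternating_partial(n):
    """S_n: n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

def rearranged_partial_3n(n):
    """S'_{3n}: sum of the first n blocks 1/(2k-1) - 1/(4k-2) - 1/(4k)."""
    return sum(1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
               for k in range(1, n + 1))

n = 10_000
original = alternating_partial(2 * n)      # close to ln 2
rearranged = rearranged_partial_3n(n)      # close to (1/2) ln 2

print(original, math.log(2))
print(rearranged, 0.5 * math.log(2))
```

Up to rounding, `rearranged` equals half of `original`, which is the identity between $S'_{3n}$ and $S_{2n}$ used in the remark.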
Let now A be an infinite countable set and let $(x_a)_{a \in A}$ be a family of $[0, +\infty]$. Then there exists a bijection $\varphi : \mathbb{N}^* \to A$. We set

$$\sum_{a \in A} x_a = \sum_{n=1}^{\infty} x_{\varphi(n)}.$$

The previous lemma ensures that this definition makes sense. Using arguments similar to those used in the proof of Lemma 1.1, one can prove the following

Proposition 1.5 Let $(x_{n,m})$ be a double sequence of $[0, +\infty]$. Then

$$\sum_{(n,m) \in \mathbb{N}^* \times \mathbb{N}^*} x_{n,m} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} x_{n,m} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} x_{n,m}.$$

1.3.3 Superior limit and inferior limit


Let $(x_n)$ be an arbitrary sequence in $\overline{\mathbb{R}}$. Let $y_n = \sup\{x_k \mid k \ge n\}$. Then $(y_n)$ is nonincreasing and therefore has a limit in $\overline{\mathbb{R}}$. This limit is called the superior limit of the sequence $(x_n)$ and it is denoted by $\overline{\lim}\, x_n$ or $\limsup x_n$. Thus,

$$\limsup x_n = \inf_{n \ge 1} \sup_{k \ge n} x_k = \lim_{n \to \infty} \sup_{k \ge n} x_k.$$

The inferior limit of a sequence $(x_n)$, denoted by $\underline{\lim}\, x_n$ or $\liminf x_n$, is defined by

$$\liminf x_n = \sup_{n \ge 1} \inf_{k \ge n} x_k = \lim_{n \to \infty} \inf_{k \ge n} x_k.$$

Examples.
1. Let $x_n = (-1)^n$. Then $y_n := \inf\{x_k \mid k \ge n\} = -1$. Hence $\liminf x_n = \lim_{n \to \infty} y_n = -1$. On the other hand, $z_n := \sup\{x_k \mid k \ge n\} = 1$. Hence $\limsup x_n = \lim_{n \to \infty} z_n = 1$.
2. Consider the sequence $(1, 2, 1 + \frac{1}{2}, 2 + \frac{1}{2}, 1 + \frac{1}{3}, 2 + \frac{1}{3}, 1 + \frac{1}{4}, 2 + \frac{1}{4}, \ldots)$. Then $\limsup x_n = 2$ and $\liminf x_n = 1$.
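The tail suprema and infima in the second example can be tabulated numerically. Since a program can only look at finitely many terms, the sketch below replaces the tail $k \ge n$ by a finite window $n \le k \le$ `horizon`; this truncation, the helper `tail_sup_inf` and the indexing of the sequence are assumptions of the illustration, so the printed values approximate the true tail sup and inf rather than compute them.

```python
def tail_sup_inf(x, n, horizon):
    """sup and inf of x(k) for n <= k <= horizon (a finite stand-in for k >= n)."""
    tail = [x(k) for k in range(n, horizon + 1)]
    return max(tail), min(tail)

def x(k):
    """Example 2: 1, 2, 1 + 1/2, 2 + 1/2, 1 + 1/3, 2 + 1/3, ... for k = 1, 2, 3, ..."""
    j = (k + 1) // 2                   # block number: 1, 1, 2, 2, 3, 3, ...
    base = 1 if k % 2 == 1 else 2
    return base + (0 if j == 1 else 1 / j)

horizon = 20_000
for n in (1, 11, 101, 1001):
    sup_n, inf_n = tail_sup_inf(x, n, horizon)
    print(n, sup_n, inf_n)             # sup_n decreases toward 2, inf_n stays near 1
```

In this example the tail supremum is attained at the first even index of the window, so the truncation does not affect it; the tail infimum equals 1 for every n with n at least 2 but is never attained there, so the window value only approaches it as `horizon` grows.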

Theorem 1.5 The lim sup and lim inf satisfy the following properties.

(i) xn ≤ yn ⇒ lim inf xn ≤ lim inf yn and lim sup xn ≤ lim sup yn .

(ii) lim inf xn ≤ lim sup xn .

(iii) lim inf xn = lim sup xn = ℓ if and only if lim xn = ℓ.

(iv) If xn ≥ 0 then lim xn = 0 ⇔ lim sup xn = 0.

(v) lim sup xn is the biggest limit point of (xn ) and lim inf xn is the smallest limit point of
(xn ).

Proof. (i) We have $\sup\{x_k \mid k \ge n\} \le \sup\{y_k \mid k \ge n\}$ for all n. Letting $n \to \infty$, we get $\limsup x_n \le \limsup y_n$. The proof for lim inf is similar.
(ii) For all n, we have $\inf\{x_k \mid k \ge n\} \le \sup\{x_k \mid k \ge n\}$. Letting $n \to \infty$, we get the result.
(iii) Suppose first that $\liminf x_n = \limsup x_n = \ell$. If $\ell = +\infty$ then the sequence $y_n := \inf_{k \ge n} x_k$ tends to $+\infty$. But $x_n \ge y_n$ and so $x_n \to +\infty$. If $\ell = -\infty$, then $z_n := \sup_{k \ge n} x_k$ tends to $-\infty$. But $x_n \le z_n$ and so $x_n \to -\infty$. Now we assume that $\ell \in \mathbb{R}$. Let $\varepsilon > 0$. Then $|\sup_{k \ge n} x_k - \ell| < \varepsilon$ for all n large enough. In particular $x_n \le \sup_{k \ge n} x_k < \ell + \varepsilon$. Similarly, $|\inf_{k \ge n} x_k - \ell| < \varepsilon$ for n large enough. In particular $x_n \ge \inf_{k \ge n} x_k > \ell - \varepsilon$. Thus, $\ell - \varepsilon < x_n < \ell + \varepsilon$ for all n large enough. This proves that $\lim x_n = \ell$.
Conversely, assume that $\lim x_n = \ell$. If $\ell = +\infty$, then $\sup_{k \ge n} x_k \to +\infty$ since $\sup_{k \ge n} x_k \ge x_n$. Therefore $\limsup x_n = +\infty$. Now since $x_n \to +\infty$, for all $A > 0$ there is $n_0$ such that $x_n \ge A$ for all $n \ge n_0$, and so $\inf_{k \ge n} x_k \ge \inf_{k \ge n_0} x_k \ge A$ for all $n \ge n_0$. Letting $n \to \infty$ we get $\liminf x_n \ge A$. Since A is arbitrary, we conclude that $\liminf x_n = +\infty$. The case $\ell = -\infty$ is similar. Now we assume that $\ell \in \mathbb{R}$. Let $\varepsilon > 0$ be given. Then there is $n_0$ such that $\ell - \varepsilon < x_k < \ell + \varepsilon$ for all $k \ge n_0$. Thus

$$\ell - \varepsilon \le \inf_{k \ge n} x_k \le \sup_{k \ge n} x_k \le \ell + \varepsilon$$

for all $n \ge n_0$. Letting $n \to \infty$, we get $\ell - \varepsilon \le \limsup x_n \le \ell + \varepsilon$ and $\ell - \varepsilon \le \liminf x_n \le \ell + \varepsilon$. Since $\varepsilon$ is arbitrary, we conclude that $\limsup x_n = \liminf x_n = \ell$.
(iv) If $\lim x_n = 0$ it follows from part (iii) that $\limsup x_n = 0$. Now conversely, suppose that $\limsup x_n = 0$. Since $x_n \ge 0$, it follows that $\liminf x_n \ge 0$. But $\liminf x_n \le \limsup x_n = 0$. Therefore $\limsup x_n = \liminf x_n = 0$. Thus, by part (iii), $\lim x_n = 0$.
(v) We prove the result for lim sup, the case of lim inf being similar. We need to show first that if $L := \limsup x_n$, then there is a subsequence $(x_{k_p})$ that converges to L. We consider the case $L \in \mathbb{R}$, the other cases being left as an exercise. We have the following:

$$\forall \varepsilon > 0\ \exists n_0 \text{ such that } L \le \sup_{k \ge n} x_k < L + \varepsilon \quad \forall n \ge n_0 \quad \text{(by definition of the limsup)}$$

and

$$\forall \varepsilon > 0\ \forall n\ \exists m \ge n \text{ such that } x_m > \sup_{k \ge n} x_k - \varepsilon \quad \text{(by a property of the sup)}.$$

Thus, we have the following condition

$$\forall \varepsilon > 0\ \forall n\ \exists m \ge n \text{ such that } L + \varepsilon > x_m > L - \varepsilon.$$

In particular, for $\varepsilon = 1$ and $n = 1$, there exists $k_1 \ge 1$ such that $L + 1 > x_{k_1} > L - 1$. Next take $\varepsilon = 1/2$ and $n = k_1 + 1$. Then there exists $k_2 \ge k_1 + 1$ such that $L + \frac{1}{2} > x_{k_2} > L - \frac{1}{2}$. At the pth step, we get an index $k_p \ge k_{p-1} + 1$ such that $L + \frac{1}{p} > x_{k_p} > L - \frac{1}{p}$. This defines a subsequence $(x_{k_p})$ that converges to L.
Let now $(x_{\varphi(n)})$ be a subsequence that converges to $\ell$. We have

$$\{x_{\varphi(k)} \mid k \ge n\} \subset \{x_k \mid k \ge n\}.$$

Therefore, $\sup_{k \ge n} x_{\varphi(k)} \le \sup_{k \ge n} x_k$. Letting $n \to \infty$, we get $\limsup x_{\varphi(n)} \le \limsup x_n$, that is, $\lim x_{\varphi(n)} \le \limsup x_n$. □
Chapter 2

Axiomatic measure theory

Our goal in this chapter is to develop a general theory of measure that includes the notions of length, area, volume and probability as special cases. So let us start by formulating some natural requirements that a measure has to satisfy. Consider a random experiment and let X denote its sample space, that is, the set of all possible outcomes (in probability theory, X is usually denoted by Ω). The subsets of X are usually called events. However, we may be interested in assigning a probability only to some special subsets of X; only these will be called events. Let $\mathcal{A}$ denote the collection of events (to which a probability will be assigned). First, it is natural to assign probability 1 to the "certain" event X. So X should belong to $\mathcal{A}$ and $P(X) = 1$. We would also like to assign probability 0 to the impossible event $\emptyset$. So $\emptyset$ should belong to $\mathcal{A}$ and $P(\emptyset) = 0$. Second, given an event $A \in \mathcal{A}$, we would like to consider its complement $X \setminus A$ as an event and set $P(X \setminus A) = 1 - P(A)$. Third, given two events A and B, we would like to consider their union $A \cup B$ as an event. If moreover A and B are disjoint we would like to have $P(A \cup B) = P(A) + P(B)$. It will follow by induction that if $A_1, \ldots, A_n$ are events then their union is an event. Moreover, if these events are pairwise disjoint, we would have $P(A_1 \cup \cdots \cup A_n) = P(A_1) + \cdots + P(A_n)$. Now sometimes we have to deal with an infinite but countable collection of events $(A_n)$ and we would like to consider their union as an event. If moreover these events are pairwise disjoint we would like to have

$$P(A_1 \cup \cdots \cup A_n \cup \cdots) = P(A_1) + \cdots + P(A_n) + \cdots$$

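Let us summarize our requirements for a collection of events $\mathcal{A}$.

(i) $X \in \mathcal{A}$.

(ii) $\emptyset \in \mathcal{A}$.

(iii) If $A \in \mathcal{A}$ then $X \setminus A \in \mathcal{A}$.

(iv) If $(A_n)$ is a sequence of $\mathcal{A}$ then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$.

A probability P on the collection of events $\mathcal{A}$ should satisfy

1. $P(X) = 1$.

2. $P(\emptyset) = 0$.

3. $P(X \setminus A) = 1 - P(A)$.

4. $P\big(\bigcup_{n=1}^{\infty} A_n\big) = \sum_{n=1}^{\infty} P(A_n)$ if the $A_n$ are pairwise disjoint.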

What we said about events and probabilities can be formulated for subsets of the plane and
their area. However, the area of the plane should be considered as infinite, so the requirement


1. above is not essential to our general theory. Also note that requirement (ii) is redundant
because it can be deduced from (i) and (iii).
Now we are sufficiently motivated to start developing our general theory. A collection of
subsets having properties (i)-(iv) above is called a σ−algebra. If in requirement (iv) we only
consider finite sequences, then the collection is called an algebra.

2.1 Measurable spaces


Definition 2.1 Let X be a set. A σ−algebra on X is a collection A of subsets of X (that is,
A ⊂ P(X)) satisfying the following properties.

(i) X ∈ A.

(ii) A ∈ A ⇒ X\A ∈ A. We say that A is closed or stable under complementation.



(iii) If $(A_n)$ is a sequence of elements of $\mathcal{A}$, then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$. We say that $\mathcal{A}$ is closed or stable under countable unions.

Examples. a) Let X be a set. Then P(X) is indeed a σ−algebra on X. It is the biggest σ−algebra on X.
b) {∅, X} is a σ−algebra on X. It is the smallest σ−algebra on X.
c) Let A ⊂ X. Then {∅, A, Ac , X} is a σ−algebra on X.
d) The collection $\mathcal{A}$ of all subsets of $\mathbb{R}$ which are either countable or have a countable complement is a σ−algebra on $\mathbb{R}$. Indeed, (i) $\mathbb{R} \in \mathcal{A}$ because $\mathbb{R}^c = \emptyset$ is countable. (ii) If $E$ is countable or $E^c$ is countable, then $E^c$ is countable or $E = (E^c)^c$ is countable (this condition is symmetric in $E$ and $E^c$). (iii) Let $(A_n)$ be a sequence of subsets of $\mathbb{R}$ that are either countable or have a countable complement. If all $A_n$ are countable, then $\bigcup_{n=1}^{\infty} A_n$ is also countable. If not, some $A_m$ is uncountable, and then $A_m^c$ is countable. Now $\left(\bigcup_{n=1}^{\infty} A_n\right)^c = \bigcap_{n=1}^{\infty} A_n^c \subset A_m^c$ is countable.
e) The collection of all subsets of $\mathbb{N}$ which are either finite or have a finite complement is not a σ−algebra on $\mathbb{N}$. Indeed, take $A_n = \{2n\}$; then $\bigcup_{n=1}^{\infty} A_n$ is the set of all even positive integers: it is infinite and its complement is infinite.
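On a finite set, the three properties of Definition 2.1 can be checked mechanically. The following is a minimal sketch, not part of the original notes; the function name `is_sigma_algebra` is hypothetical, and the check relies on the assumption that X is finite, so that every countable union reduces to a finite one.

```python
from itertools import combinations

def is_sigma_algebra(X, collection):
    """Check properties (i)-(iii) of Definition 2.1 for a FINITE set X.

    Assumption: X is finite, so countable unions are finite unions and
    closure under pairwise unions is enough.
    """
    X = frozenset(X)
    family = {frozenset(A) for A in collection}
    if X not in family:                            # (i) X belongs to the family
        return False
    if any(X - A not in family for A in family):   # (ii) stable under complementation
        return False
    # (iii) stable under unions (pairwise unions suffice for a finite family)
    return all(A | B in family for A, B in combinations(family, 2))

X = {1, 2, 3, 4}
A = {1, 2}
print(is_sigma_algebra(X, [set(), A, X - A, X]))   # the collection of example c)
print(is_sigma_algebra(X, [set(), {1}, {2}, X]))   # fails: the complement of {1} is missing
```

The first call corresponds to example c) and returns True; the second collection is not stable under complementation and returns False.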

Definition 2.2 A measurable space is a couple (X, A) where X is a set and A is a σ−algebra
on X. The elements of A are called the measurable subsets of X.

Proposition 2.1 Let A be a σ− algebra on X. Then

1. $\emptyset \in \mathcal{A}$.

2. $A_n \in \mathcal{A}$ for $n = 1, 2, \ldots$ ⇒ $\bigcap_{n=1}^{\infty} A_n \in \mathcal{A}$. We say that $\mathcal{A}$ is closed or stable under countable intersections.

3. $A_n \in \mathcal{A}$ for $n = 1, 2, \ldots, N$ ⇒ $\bigcup_{n=1}^{N} A_n \in \mathcal{A}$. We say that $\mathcal{A}$ is stable under finite unions.

4. $A_n \in \mathcal{A}$ for $n = 1, 2, \ldots, N$ ⇒ $\bigcap_{n=1}^{N} A_n \in \mathcal{A}$. We say that $\mathcal{A}$ is stable under finite intersections.

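Proof. 1. $X \in \mathcal{A}$ by the first property of a σ−algebra. Therefore, $\emptyset = X \setminus X \in \mathcal{A}$ by the second property.

2. Let $A_n \in \mathcal{A}$ for $n = 1, 2, \ldots$. Then $X \setminus A_n \in \mathcal{A}$ for all $n$ by the second property. Therefore $\bigcup_{n=1}^{\infty} (X \setminus A_n) \in \mathcal{A}$ by the third property. But $\bigcup_{n=1}^{\infty} (X \setminus A_n) = X \setminus \bigcap_{n=1}^{\infty} A_n$. Therefore $X \setminus \bigcap_{n=1}^{\infty} A_n \in \mathcal{A}$. Thus $\bigcap_{n=1}^{\infty} A_n \in \mathcal{A}$ by the second property.

3. Set $A_n = \emptyset$ for $n > N$. Then $A_n \in \mathcal{A}$ for all $n \in \mathbb{N}^*$. By property (iii), $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$. But $\bigcup_{n=1}^{N} A_n = \bigcup_{n=1}^{\infty} A_n$.

4. Set $A_n = X$ for $n > N$. Then $\bigcap_{n=1}^{N} A_n = \bigcap_{n=1}^{\infty} A_n \in \mathcal{A}$ by 2. □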

Remark 2.1 Let (X, A) be a measurable space and D ⊂ X. Then AD := {D ∩ E|E ∈ A} is a


σ−algebra on D. Thus (D, AD ) is a measurable space that we may call a subspace of (X, A).
Note that AD ⊃ P(D) ∩ A = {F ⊂ D|F ∈ A}. If in addition D ∈ A (i.e., D is measurable),
then AD = P(D) ∩ A.

Lemma 2.1 An intersection of σ−algebras on a set X is a σ−algebra on X.


Proof. Let $(\mathcal{A}_i)_{i \in I}$ be a family of σ−algebras on $X$. We need to show that $\mathcal{A} = \bigcap_{i \in I} \mathcal{A}_i$ is a σ−algebra.

(i) X ∈ A since X ∈ Ai for all i ∈ I.

(ii) If A ∈ A, then A ∈ Ai for all i ∈ I and so X\A ∈ Ai for all i since Ai is a σ−algebra.
Therefore, X\A ∈ A.
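(iii) If $A_n \in \mathcal{A}$ for all $n = 1, 2, \ldots$, then $A_n \in \mathcal{A}_i$ for all $n$ and all $i$. Therefore $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}_i$ for all $i \in I$ since $\mathcal{A}_i$ is a σ−algebra. Thus, $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$. □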

Thanks to this lemma we now can define the notion of a σ−algebra generated by a family
of sets.

Definition 2.3 Let X be a set and S ⊂ P(X) be a collection of subsets of X. The intersection
of all σ−algebras containing S is called the σ−algebra generated by S. It is the smallest (with
respect to inclusion) σ−algebra that contains S. It is denoted by σ(S).
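For a finite set X, the generated σ−algebra can be computed by closing the family S under complements and unions until nothing new appears. The sketch below is only an illustration under that finiteness assumption (the name `generated_sigma_algebra` is hypothetical); for infinite X no such enumeration is available in general.

```python
def generated_sigma_algebra(X, S):
    """Return sigma(S) for a FINITE set X, as a set of frozensets.

    Every set added below must belong to any sigma-algebra containing S,
    and the loop stops once the family is stable under complements and
    pairwise unions, so the result is the smallest such sigma-algebra.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in S}
    while True:
        new = {X - A for A in family}                     # complements
        new |= {A | B for A in family for B in family}    # pairwise unions
        if new <= family:                                 # nothing new: closed
            return family
        family |= new

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
for E in sorted(sigma, key=lambda E: (len(E), sorted(E))):
    print(sorted(E))
```

Here the atoms of the generated σ−algebra are {1}, {2} and {3, 4}, so it contains 2^3 = 8 sets.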

Examples. a) If (X, T ) is a topological space, then the σ−algebra σ(T ) generated by the
open sets of X is called the Borel σ−algebra of (X, T ). It is also denoted by B(X, T ) or just
B(X) if no confusion arises. An element B ∈ B(X) is called a Borel set or a Borel measurable
set.
b) Let A be a σ−algebra on a set X and B be a σ−algebra on a set Y . We denote by A ⊗ B
the σ−algebra generated by the following family

{A × B ⊂ X × Y such that A ∈ A and B ∈ B} .

It is called the product σ−algebra on X × Y .

Remark 2.2 Let X be a set.

1. If A is a σ−algebra on X, then σ(A) = A. Indeed, first, A ⊂ σ(A). Next, A is a


σ−algebra containing A, so A contains the intersection σ(A) of all σ−algebras containing
A.

2. If S ⊂ T ⊂ P(X), then σ(S) ⊂ σ(T ).



Remark 2.3 (Methodology) 1. To prove that A = σ(C), we show that A ⊂ σ(C) and
that C ⊂ A (which implies that σ(C) ⊂ σ(A) = A).

2. To prove that σ(C1 ) = σ(C2 ), we show that C1 ⊂ σ(C2 ) and that C2 ⊂ σ(C1 ).
The following important results will be proved in the exercises.

Proposition 2.2 The Borel σ−algebra on IR, that we denote by B(IR) or B, is generated by
any one of the following collections (where a, b ∈ IR or a, b ∈ Q):
i. the open intervals.

ii. the open intervals of the form ]a, b[.

iii. the closed intervals of the form [a, b].

iv. the intervals of the form ]a, +∞[.

v. the intervals of the form [a, +∞[.

vi. the intervals of the form ] − ∞, b[.

vii. the intervals of the form ] − ∞, b].

viii. all the intervals.

Proposition 2.3 The Borel σ−algebra of the extended real line $\overline{\mathbb{R}} = [-\infty, +\infty]$, that we denote by $\mathcal{B}(\overline{\mathbb{R}})$, is generated by any one
of the following collections (where a, b ∈ Q or a, b ∈ IR):
i. the intervals of the form ]a, +∞].

ii. the intervals of the form [a, +∞].

iii. the intervals of the form [−∞, b[.

iv. the intervals of the form [−∞, b].

2.2 Measure spaces


Let (X, A) be a measurable space (i.e., a set X equipped with a σ−algebra).

Definition 2.4 A measure on (X, A) is a function µ : A → [0, +∞] such that

(i) µ(∅) = 0.

(ii) If $\{A_n\} \subset \mathcal{A}$ is a sequence of pairwise disjoint elements of $\mathcal{A}$, then

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n).$$

Condition (ii) is called σ−additivity of the measure.

Remark 2.4 Condition (i) can be replaced by


(i’) There exists A ∈ A such that µ(A) < ∞.
Indeed, if $\mu(\emptyset) = 0$ then $\mu(\emptyset) < \infty$. Conversely, suppose that $\mu(A) < \infty$ for some $A \in \mathcal{A}$. Letting $A_1 = A$ and $A_n = \emptyset$ for $n \ge 2$, we get $\mu(A) = \mu(A) + \sum_{n=2}^{\infty} \mu(\emptyset)$. Since $\mu(A)$ is finite, subtracting it from both sides of the equality, we get $\sum_{n=2}^{\infty} \mu(\emptyset) = 0$. This implies that $\mu(\emptyset) = 0$.

Definition 2.5 A measure space is a triple (X, A, µ) such that A is a σ−algebra on X and µ
is a measure on (X, A).

Examples. In Examples (1) to (4), X is any set and one can take A = P(X).
(1) The trivial measure. µ(A) = 0 for all A ∈ A.
(2) The infinite measure.
$$\mu(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ +\infty & \text{if } A \neq \emptyset. \end{cases}$$
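Proof. (i) It is clear that $\mu(\emptyset) = 0$. (ii) Let now $(A_n)$ be a sequence of pairwise disjoint elements of $\mathcal{A}$. If $A_n = \emptyset$ for all $n$, then $\bigcup_{n=1}^{\infty} A_n = \emptyset$ and so $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = 0$. On the other hand, $\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} 0 = 0$. If $A_m \neq \emptyset$ for some $m$, then $\bigcup_{n=1}^{\infty} A_n \neq \emptyset$ and therefore $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \infty$. On the other hand, the series is equal to $\infty$ since $\sum_{n=1}^{\infty} \mu(A_n) \ge \mu(A_m) = \infty$. □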

(3) The Dirac measure or Dirac mass. Let $x \in X$. We set
$$\delta_x(A) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$
Proof. (i) Since $x \notin \emptyset$, it follows that $\delta_x(\emptyset) = 0$. (ii) Let $(A_n)$ be a sequence of pairwise disjoint elements of $\mathcal{A}$. We distinguish between two cases. If $x$ belongs to the union, then it belongs to exactly one $A_m$ since the $A_n$ are pairwise disjoint. Then $\delta_x\left(\bigcup_{n=1}^{\infty} A_n\right) = 1$, $\delta_x(A_m) = 1$ and $\delta_x(A_k) = 0$ for $k \neq m$. Therefore $\sum_{n=1}^{\infty} \delta_x(A_n) = 1$. Next, if $x$ does not belong to the union, it does not belong to any of the $A_n$. So both terms are zero. □
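(4) The counting measure. Let
$$\mu(A) = \begin{cases} \operatorname{card}(A) & \text{if } A \text{ is finite} \\ +\infty & \text{if not.} \end{cases}$$
Proof. (i) $\mu(\emptyset) = \operatorname{card}(\emptyset) = 0$. (ii) Let $(A_n)$ be a sequence of pairwise disjoint elements of $\mathcal{A}$. We distinguish between two cases.

Case 1. $\bigcup_{n=1}^{\infty} A_n$ is finite. Then there is some $N \in \mathbb{N}$ such that $A_n = \emptyset$ for all $n \ge N$. Otherwise, $\operatorname{card} A_n \ge 1$ for infinitely many $n$, so $\bigcup_{n=1}^{\infty} A_n$ would be infinite. Then $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{N} A_n$ and so
$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \mu\left(\bigcup_{n=1}^{N} A_n\right) = \operatorname{card}\left(\bigcup_{n=1}^{N} A_n\right) = \sum_{n=1}^{N} \operatorname{card} A_n = \sum_{n=1}^{N} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n).$$
Case 2. $\bigcup_{n=1}^{\infty} A_n$ is infinite. Then $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = +\infty$. Here we also distinguish between two cases. 2a) Some $A_k$ is infinite. Then $\mu(A_k) = +\infty$ and so $\sum_{n=1}^{\infty} \mu(A_n) = +\infty$. 2b) All $A_n$ are finite. But then $A_n \neq \emptyset$ for infinitely many $n$ because $\bigcup_{n=1}^{\infty} A_n$ is infinite. But then $\mu(A_n) = \operatorname{card}(A_n) \ge 1$ for infinitely many $n$ and so $\sum_{n=1}^{\infty} \mu(A_n) = +\infty$. □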
(5) Restriction of a measure. Let (X, A, µ) be a measure space and let D ∈ A. We define
a new measure ν on (X, A) by ν(A) = µ(A ∩ D). This is indeed a measure which is called the
restriction of µ to D. In fact, ν is also a measure on (D, AD ) where AD = {D ∩ A|A ∈ A}.
(6) Positive linear combination of measures. Let $(\mu_n)$ be a sequence of measures on a measurable space $(X, \mathcal{A})$ and let $(\alpha_n)$ be a sequence of $[0, +\infty]$. Then $\mu := \sum_{n=1}^{\infty} \alpha_n \mu_n$ is a measure on $(X, \mathcal{A})$.
Proof. (i) $\mu(\emptyset) = \sum_n \alpha_n \mu_n(\emptyset) = 0$. (ii) Let $(A_k)$ be a sequence of pairwise disjoint elements of $\mathcal{A}$. Then
$$\begin{aligned}
\mu\left(\bigcup_k A_k\right) &= \sum_n \alpha_n \mu_n\left(\bigcup_k A_k\right) = \sum_n \alpha_n \sum_k \mu_n(A_k) = \sum_n \sum_k \alpha_n \mu_n(A_k) \\
&= \sum_k \sum_n \alpha_n \mu_n(A_k) \quad \text{(we can interchange the order of summation by Proposition 1.5)} \\
&= \sum_k \mu(A_k).
\end{aligned}$$
□
In particular, if $(x_n)$ is a sequence of $X$, then $\mu := \sum_n \alpha_n \delta_{x_n}$ is a measure. A measure of this form is called a discrete measure. In particular, if $X = \{x_1, \ldots, x_N\}$ is finite then $\mu := \frac{1}{N} \sum_{n=1}^{N} \delta_{x_n}$ is the familiar probability measure
$$\mu(A) = \frac{\operatorname{card} A}{\operatorname{card} X}.$$
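The examples above can be tried out on a small finite set. The following sketch is an illustration only (the helper names `dirac`, `counting` and `average_of_diracs` are hypothetical); it represents a measure as a Python function on finite sets and checks that the average of Dirac masses agrees with card A / card X, assuming the listed points are distinct.

```python
from fractions import Fraction
from itertools import combinations

def dirac(x):
    """Dirac mass at x: delta_x(A) = 1 if x is in A, else 0."""
    return lambda A: 1 if x in A else 0

def counting(A):
    """Counting measure of a finite set A."""
    return len(A)

def average_of_diracs(points):
    """mu = (1/N) * sum of delta_{x_n} over the N listed (distinct) points."""
    points = list(points)
    return lambda A: Fraction(sum(dirac(x)(A) for x in points), len(points))

X = [1, 2, 3, 4]
mu = average_of_diracs(X)

# Compare mu(A) with card(A) / card(X) on every subset A of X.
for r in range(len(X) + 1):
    for A in combinations(X, r):
        assert mu(set(A)) == Fraction(len(A), len(X))

# Additivity on two disjoint sets, for each of the three measures.
A, B = {1, 2}, {4}
for m in (dirac(2), counting, mu):
    assert m(A | B) == m(A) + m(B)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison with card A / card X is not affected by floating-point rounding.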
Definition 2.6 Let (X, A, µ) be a measure space. We say that

1. µ is finite if µ(X) < ∞.

2. µ is a probability measure if µ(X) = 1.

3. µ is σ−finite if $X = \bigcup_{n=1}^{\infty} A_n$ with $A_n \in \mathcal{A}$ and $\mu(A_n) < \infty$ for all $n \in \mathbb{N}$.

Example. The counting measure is σ−finite on IN but not on IR.

Proposition 2.4 (Elementary properties of measures) Let (X, A, µ) be a measure space.

(a) If $A_1, A_2, \ldots, A_k \in \mathcal{A}$ are pairwise disjoint, then

$$\mu\left(\bigcup_{n=1}^{k} A_n\right) = \sum_{n=1}^{k} \mu(A_n).$$

(b) If B ⊂ A then µ(A\B) + µ(B) = µ(A). In particular, if µ(B) < +∞ then µ(A\B) =
µ(A) − µ(B).

(c) If B ⊂ A then µ(B) ≤ µ(A).

(d) µ(A ∪ B) + µ(A ∩ B) = µ(A) + µ(B). In particular, if µ(A ∩ B) < +∞, then µ(A ∪ B) =
µ(A) + µ(B) − µ(A ∩ B).

(e) If $\{A_n\}_{n \in \mathbb{N}^*} \subset \mathcal{A}$ then

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) \le \sum_{n=1}^{\infty} \mu(A_n).$$

This is called the σ−subadditivity (note that the An are not necessarily pairwise disjoint).

Proof. (a). Set $A_n = \emptyset$ for $n > k$. Then the sequence $\{A_n\}_{n \in \mathbb{N}^*}$ is a family of pairwise disjoint sets. Now, on the one hand, $\bigcup_{n=1}^{k} A_n = \bigcup_{n=1}^{\infty} A_n$. On the other hand, $\mu(A_n) = 0$ for $n > k$ since $\mu(\emptyset) = 0$. Therefore

$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{k} \mu(A_n) + \sum_{n=k+1}^{\infty} \mu(A_n) = \sum_{n=1}^{k} \mu(A_n).$$

By the σ−additivity of µ, we have

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n).$$

Therefore,
$$\mu\left(\bigcup_{n=1}^{k} A_n\right) = \sum_{n=1}^{k} \mu(A_n).$$

(b) It follows from (a) that µ(C ∪ D) = µ(C) + µ(D) if C and D are disjoint. Now note that
A = (A\B) ∪ B and A\B and B are disjoint. Therefore µ(A) = µ(A\B) + µ(B).
(c) follows from (b) and the fact that a measure is non-negative.
(d) Observe first that A ∪ B = A ∪ (B\(A ∩ B)) and the two sets A and B\(A ∩ B) are disjoint. It follows from
part (a) that µ(A ∪ B) = µ(A) + µ(B\(A ∩ B)). But µ(B\(A ∩ B)) + µ(A ∩ B) = µ(B) by part
(b). Adding µ(A ∩ B) to both sides of the first equality, the conclusion follows.
(e) Define a sequence {Bn } in the following way: B1 = A1 , B2 = A2 \A1 , B3 = A3 \(A1 ∪ A2 )
and more generally, Bn = An \(A1 ∪ · · · ∪ An−1 ). Then the sequence {Bn } has the following
properties (see the exercises).

1. The Bn are pairwise disjoint.

2. $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$.

3. Bn ⊂ An and therefore µ(Bn ) ≤ µ(An ).

It follows from the above and the σ−additivity of µ, that

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \mu\left(\bigcup_{n=1}^{\infty} B_n\right) = \sum_{n=1}^{\infty} \mu(B_n) \le \sum_{n=1}^{\infty} \mu(A_n).$$
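The construction of the disjoint sequence (B_n) used in part (e) is easy to carry out on finite lists of finite sets. The sketch below is an illustration only (the name `disjointify` is hypothetical) and uses the counting measure, via `len`, to observe the subadditivity inequality.

```python
def disjointify(sets):
    """Return B_1, ..., B_N with B_n = A_n minus (A_1 U ... U A_{n-1})."""
    seen = set()          # running union A_1 U ... U A_{n-1}
    result = []
    for A in sets:
        A = set(A)
        result.append(A - seen)
        seen |= A
    return result

A = [{1, 2}, {2, 3}, {3, 4, 5}]
B = disjointify(A)
print(B)                                   # [{1, 2}, {3}, {4, 5}]

union_A = set().union(*A)
union_B = set().union(*B)
assert union_A == union_B                              # same union
assert sum(len(b) for b in B) == len(union_B)          # the B_n are pairwise disjoint
assert len(union_A) <= sum(len(a) for a in A)          # subadditivity: 5 <= 7
```

With the counting measure, the union has 5 elements while the sum of the measures of the A_n is 7; the gap comes exactly from the overlaps that the B_n remove.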

Theorem 2.1 (Continuity of measures) Let (X, A, µ) be a measure space and let {An }n∈IN∗ ⊂
A. Then the following hold.

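(i) If $\{A_n\}$ is nondecreasing (i.e., $A_1 \subset A_2 \subset \cdots$) then

$$\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \lim_{n \to \infty} \mu(A_n).$$

(ii) If the sequence is nonincreasing (i.e., $A_1 \supset A_2 \supset \cdots$) and $\mu(A_1) < \infty$, then

$$\mu\left(\bigcap_{n=1}^{\infty} A_n\right) = \lim_{n \to \infty} \mu(A_n).$$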

Proof. (i). Define a sequence $\{B_n\}$ in the following way: $B_1 = A_1$, $B_2 = A_2 \setminus A_1$, $B_3 = A_3 \setminus A_2$ and more generally, $B_n = A_n \setminus A_{n-1}$. Then the sequence $\{B_n\}$ has the following properties

1. The $B_n$ are pairwise disjoint.

2. $A_n = B_1 \cup B_2 \cup \cdots \cup B_n$ and therefore $\mu(A_n) = \sum_{k=1}^{n} \mu(B_k)$.

3. $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$.

It follows from the above and the σ−additivity of µ, that

$$\lim_{n \to \infty} \mu(A_n) = \lim_{n \to \infty} \sum_{k=1}^{n} \mu(B_k) = \sum_{k=1}^{\infty} \mu(B_k) = \mu\left(\bigcup_{k=1}^{\infty} B_k\right) = \mu\left(\bigcup_{k=1}^{\infty} A_k\right).$$

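(ii). Let $A = \bigcap_{n=1}^{\infty} A_n$ and $C_n = A_1 \setminus A_n$. Then $\bigcup_{n=1}^{\infty} C_n = A_1 \setminus A$ and therefore

$$\mu\left(\bigcup_{n=1}^{\infty} C_n\right) = \mu(A_1 \setminus A) = \mu(A_1) - \mu(A).$$

Note that $\{C_n\}$ is increasing and therefore by (i) we know that $\mu\left(\bigcup_{n=1}^{\infty} C_n\right) = \lim_{n \to \infty} \mu(C_n)$. On the other hand, $\mu(C_n) = \mu(A_1 \setminus A_n) = \mu(A_1) - \mu(A_n)$ (these subtractions are legitimate because $\mu(A_1) < \infty$) and so

$$\mu(A_1) - \mu(A) = \mu\left(\bigcup_{n=1}^{\infty} C_n\right) = \lim_{n \to \infty} \mu(C_n) = \lim_{n \to \infty} \left(\mu(A_1) - \mu(A_n)\right) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n).$$

Whence $\mu(A) = \lim_{n \to \infty} \mu(A_n)$. □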

Remark 2.5 The assumption $\mu(A_1) < \infty$ is essential. For example, if $X = \mathbb{N}$ and µ is the counting measure, then the sequence

$$A_k = \{n \in \mathbb{N}^* \mid n \ge k\}$$

is decreasing but $\mu(A_k) = \infty$ and therefore $\lim_{k \to \infty} \mu(A_k) = \infty$, whereas $\mu\left(\bigcap_{k=1}^{\infty} A_k\right) = \mu(\emptyset) = 0$.

Definition 2.7 Let (X, A, µ) be a measure space. A subset A ∈ A is said to be of full measure
if µ(Ac ) = 0.

Remark 2.6 If µ is finite, then A is of full measure if and only if µ(A) = µ(X). However, if
µ is not a finite measure, then a subset A satisfying µ(A) = µ(X) need not be of full measure.
Can you give an example?

Definition 2.8 Let $(X, \mathcal{A}, \mu)$ be a measure space. A subset $A \subset X$ is called µ−negligible or simply negligible, if there exists $B \in \mathcal{A}$ such that $A \subset B$ and $\mu(B) = 0$.
In this definition, the point is that a negligible set need not be measurable. Here is an artificial
example. Let $X$ be a set, $A$ be a proper subset of $X$ and $a \in A$. Let $\mathcal{A} = \{\emptyset, A, A^c, X\}$ and
consider the measure space $(X, \mathcal{A}, \delta_a)$ where $\delta_a$ is the Dirac measure. Then any nonempty proper subset
of $A^c$ is negligible but not measurable.
If every negligible set is measurable (and therefore of measure zero), then the space (X, A, µ)
is called complete. Thus, in a complete measure space, a set is negligible if and only if it has
measure zero.
Exercise. A subset of a negligible set is negligible and a countable union of negligible sets is
negligible.

Definition 2.9 Let P be a predicate on a measure space (X, A, µ), that is, for each x ∈ X,
there is a proposition P (x) which is either true or false depending on x. We say that P holds
µ−almost everywhere (µ−a.e) or just almost everywhere if the set {x ∈ X | P (x) is false} is µ−
negligible.
For example two functions f, g : X → IR are equal almost everywhere if the set {x ∈ X | f (x) ̸=
g(x)} is negligible. We shall give more examples in the next chapter. A function from
(X, A, µ) → IR is called µ−negligible if it is equal to 0 almost everywhere.

Remark 2.7 A predicate P holds µ−a.e. if and only if there is A ∈ A such that µ(Ac ) = 0
and P (x) is true ∀ x ∈ A. Otherwise stated, P holds µ−a.e. if and only if it holds on a set of
full measure. Indeed, suppose first that P holds µ−a.e. This means that there exists B ∈ A
such that {x ∈ X | P (x) is false} ⊂ B and µ(B) = 0. Let A = B c . Then A ∈ A and A =
B c ⊂ {x ∈ X | P (x) is true}. This means that P (x) is true ∀x ∈ A with µ(Ac ) = 0. Conversely,
suppose that P (x) is true ∀x ∈ A. This means that A ⊂ {x ∈ X | P (x) is true}. Therefore
{x ∈ X | P (x) is false} ⊂ Ac with µ(Ac ) = 0. This means precisely that {x ∈ X | P (x) is false}
is negligible, that is, P holds µ−a.e.

Exercise. Let P1 and P2 be two predicates on a measure space. If both P1 and P2 hold
almost everywhere then the predicate P1 ∧P2 (conjunction) also holds almost everywhere. More
generally, if (Pn ) is a sequence of predicates that hold almost everywhere then the predicate
$\bigwedge_{n=1}^{\infty} P_n$ also holds almost everywhere. If P ⇒ Q and P holds a.e., then Q holds a.e.

Remark 2.8 Many mathematicians say that a predicate P holds almost everywhere if {x ∈
X | P (x) is false} has measure 0. Our definition is therefore more general. However, due to the
previous remark, this makes little difference in practice.

2.3 Measurable functions


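Definition 2.10 Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be two measurable spaces. A function $f : X \to Y$ is
called $(\mathcal{A}, \mathcal{B})$−measurable if $f^{-1}(\mathcal{B}) \subset \mathcal{A}$, that is, if for all $B \in \mathcal{B}$ we have $f^{-1}(B) \in \mathcal{A}$.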
Examples. a) Every function f : (X, P(X)) → (Y, B) is measurable.
b) A constant function is measurable.
c) The identity function on X is measurable.

The following lemma gives a useful criterion for the measurability of a function.

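Lemma 2.2 Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be two measurable spaces. Suppose that $\mathcal{B}$ is generated by
a family $\mathcal{C} \subset \mathcal{P}(Y)$. Then $f : X \to Y$ is $(\mathcal{A}, \mathcal{B})$−measurable if and only if $f^{-1}(\mathcal{C}) \subset \mathcal{A}$, that is, $f^{-1}(C) \in \mathcal{A}$ for all $C \in \mathcal{C}$.
Proof. Suppose first that $f$ is measurable. Then $f^{-1}(\mathcal{C}) \subset f^{-1}(\mathcal{B}) \subset \mathcal{A}$. Conversely, suppose
that $f^{-1}(\mathcal{C}) \subset \mathcal{A}$ and consider the family of subsets of $Y$
$$\mathcal{F} := \{E \subset Y \mid f^{-1}(E) \in \mathcal{A}\}.$$
Then one can check that $\mathcal{F}$ is a σ−algebra on $Y$. But this σ−algebra contains $\mathcal{C}$ and so it contains
$\mathcal{B}$ since $\mathcal{B}$ is the smallest σ−algebra containing $\mathcal{C}$. Now $\mathcal{B} \subset \mathcal{F}$ means precisely that $f^{-1}(B) \in \mathcal{A}$
for all $B \in \mathcal{B}$, that is, $f$ is $(\mathcal{A}, \mathcal{B})$−measurable. □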
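On finite spaces, the criterion of Lemma 2.2 gives a short measurability test: it suffices to compute the preimages of the generators. The sketch below is an illustration under the assumption that X is finite and that `sigma_X` lists all measurable subsets of X; the names `preimage` and `is_measurable` are hypothetical.

```python
def preimage(f, domain, E):
    """Return f^{-1}(E) as a frozenset, for a function f on a finite domain."""
    return frozenset(x for x in domain if f(x) in E)

def is_measurable(f, X, sigma_X, generators_Y):
    """Test measurability through Lemma 2.2: check the generators only."""
    family = {frozenset(A) for A in sigma_X}
    return all(preimage(f, X, C) in family for C in generators_Y)

X = {1, 2, 3, 4}
sigma_X = [set(), {1, 2}, {3, 4}, X]     # a sigma-algebra on X
generators = [{0}, {1}]                  # these singletons generate P({0, 1})

parity = lambda x: x % 2                 # preimage of {0} is {2, 4}: not in sigma_X
block = lambda x: 0 if x <= 2 else 1     # preimages are {1, 2} and {3, 4}

print(is_measurable(parity, X, sigma_X, generators))   # False
print(is_measurable(block, X, sigma_X, generators))    # True
```

The function `block` is constant on the atoms {1, 2} and {3, 4} of the σ−algebra, which is why it passes the test, whereas `parity` separates points inside an atom.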

Definition 2.11 Let X and Y be two topological spaces. A function f : X → Y is called


Borel-measurable if it is (B(X), B(Y ))−measurable, that is, if the inverse image under f of
every Borel subset of Y is a Borel subset of X.
Examples. 1) Every continuous map is Borel-measurable.
2) Every monotonic map f : IR → IR is Borel-measurable. Indeed, we claim first that the inverse
image under f of an interval of IR is an interval of IR and hence a Borel subset of IR. Let I
be an interval of IR and let x, y ∈ f −1 (I) with x ≤ y. We need to show that [x, y] ⊂ f −1 (I).
Let z ∈ [x, y]. If f is increasing then f (x) ≤ f (z) ≤ f (y) and so f (z) ∈ [f (x), f (y)] ⊂ I, that
is, z ∈ f −1 (I). If f is decreasing, a similar reasoning yields z ∈ f −1 (I) . The claim is proved.
Now the Borel σ−algebra of IR is generated by the intervals of IR, so f is Borel-measurable by Lemma 2.2.

Remark 2.9 Let (X, A) and (Y, B) be two measurable spaces and A ∈ A. We already
observed that (A, P(A) ∩ A) is a measurable space. Therefore it makes sense to say that a
function f : A → Y is measurable. This means that f −1 (B) ∈ A, whenever B ∈ B, because
f −1 (B) ⊂ A and so f −1 (B) ∈ P(A) automatically.

Lemma 2.3 (The pasting lemma) Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be two measurable spaces, and
let $A, B \in \mathcal{A}$. Let $f : A \to Y$ and $g : B \to Y$ be measurable functions that coincide on $A \cap B$
(this condition is satisfied if $A \cap B = \emptyset$). Then the function $h : A \cup B \to Y$ defined by
$$h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) & \text{if } x \in B \end{cases}$$
is measurable.

Proof. This follows from the fact that h−1 (E) = f −1 (E) ∪ g −1 (E). Thus if E ∈ B, then
f −1 (E) and g −1 (E) belong to A (and they are contained in A ∪ B). □

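Corollary 2.1 Let $(X, \mathcal{A})$ be a measurable space, $A \in \mathcal{A}$, and $f : A \to \mathbb{R}$ be measurable. Define $\tilde{f} : X \to \mathbb{R}$ by
$$\tilde{f}(x) = \begin{cases} f(x) & \text{if } x \in A \\ 0 & \text{if } x \notin A. \end{cases}$$

Then $\tilde{f}$ is measurable.

Proof. A constant function is measurable, so the result follows from the pasting lemma applied to $f$ on $A$ and to the constant function $0$ on $X \setminus A$. □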


This corollary means that we can always extend measurable functions to measurable func-
tions defined on the whole space X.
The next corollary shows how one can modify a measurable function and still get a measur-
able function.

Corollary 2.2 Let $f : X \to \mathbb{R}$ be measurable and let $E \subset X$ be a measurable set. Then the
function $h : X \to \mathbb{R}$ defined by
$$h(x) = \begin{cases} f(x) & \text{if } x \in X \setminus E \\ 0 & \text{if } x \in E \end{cases}$$

is measurable.

Proof. Note that the restriction of a measurable function to a measurable subset is measurable.

Proposition 2.5 Let f : X → Y be (A, B)−measurable and let µ be a measure on (X, A).
Then the formula
ν(B) := µ(f −1 (B))
for all B ∈ B defines a measure on (Y, B). This measure is denoted by f∗ (µ) and is called the
image measure of µ by f . It is also called the pushforward measure of µ by f .

Proof. (i). ν(∅) = µ(f −1 (∅)) = µ(∅) = 0.


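(ii). Let $\{B_n\} \subset \mathcal{B}$ be a sequence of pairwise disjoint sets. Then $\{f^{-1}(B_n)\} \subset \mathcal{A}$ is also a
sequence of pairwise disjoint sets and therefore
$$\nu\left(\bigcup_{n=1}^{\infty} B_n\right) = \mu\left(f^{-1}\left(\bigcup_{n=1}^{\infty} B_n\right)\right) = \mu\left(\bigcup_{n=1}^{\infty} f^{-1}(B_n)\right) = \sum_{n=1}^{\infty} \mu(f^{-1}(B_n)) = \sum_{n=1}^{\infty} \nu(B_n).$$
□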

Example 2.1 Let $X$ and $Y$ be two arbitrary sets and let $\mathcal{A} = \mathcal{P}(X)$ and $\mathcal{B} \subset \mathcal{P}(Y)$ be an
arbitrary σ−algebra on $Y$. Let µ be the counting measure on $X$. Then every function $f : X \to Y$
is $(\mathcal{A}, \mathcal{B})$−measurable and $\nu = f_*\mu$ is the measure that counts the number of preimages
$$\nu(B) = \mu(f^{-1}(B)) = \begin{cases} \operatorname{card}\{x \in X \mid f(x) \in B\} & \text{if this set is finite} \\ +\infty & \text{if not.} \end{cases}$$

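Example 2.2 Let $X$ be a set, $a \in X$ and consider the measure space $(X, \mathcal{P}(X), \delta_a)$. Let $(Y, \mathcal{B})$
be an arbitrary measurable space and let $f : X \to Y$ be a function (it is necessarily measurable).
Then for any $B \in \mathcal{B}$, we have
$$f_*(\delta_a)(B) = \delta_a(f^{-1}(B)) = \begin{cases} 1 & \text{if } a \in f^{-1}(B) \\ 0 & \text{if } a \notin f^{-1}(B) \end{cases} = \begin{cases} 1 & \text{if } f(a) \in B \\ 0 & \text{if } f(a) \notin B \end{cases} = \delta_{f(a)}(B).$$

Therefore $f_*(\delta_a) = \delta_{f(a)}$.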
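Both examples of image measures can be reproduced on a finite set. In the sketch below, which is an illustration only, a measure is a Python function on finite sets, `len` stands in for the counting measure, and the name `pushforward` is hypothetical.

```python
def pushforward(mu, f, X):
    """Return nu = f_*(mu), i.e. nu(B) = mu(f^{-1}(B)), for a finite set X."""
    def nu(B):
        return mu({x for x in X if f(x) in B})
    return nu

def dirac(a):
    """Dirac mass at a."""
    return lambda A: 1 if a in A else 0

X = [-2, -1, 0, 1, 2]
square = lambda x: x * x

# Image of the counting measure: nu(B) counts the preimages of B.
nu = pushforward(len, square, X)
print(nu({1}), nu({0}), nu({3}))        # 2 1 0

# Image of a Dirac mass: f_*(delta_a) agrees with delta_{f(a)}.
image = pushforward(dirac(-2), square, X)
target = dirac(square(-2))
assert all(image(B) == target(B) for B in [set(), {4}, {1}, {0, 4}, {0, 1}])
```

The value nu({1}) = 2 reflects that 1 has the two preimages -1 and 1 under the squaring map.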

Remark 2.10 The concept of image measure is very important in probability. Let (X, A, µ)
be a probability space and let f : X → IR be a measurable function (IR is equipped with its
Borel σ−algebra). Then f is called a random variable and f∗ (µ) is called the probability law
of f . In order to compute probabilities related to f , we have to know what is f∗ (µ).

2.3.1 Measurable real valued functions


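In the following two subsections we consider functions that take values in the extended real line $\overline{\mathbb{R}} = [-\infty, +\infty]$. It is assumed that
$\overline{\mathbb{R}}$ is equipped with its Borel σ−algebra $\mathcal{B}(\overline{\mathbb{R}})$. So given a measurable space $(X, \mathcal{A})$, a function
$f : X \to \overline{\mathbb{R}}$ is called measurable if it is $(\mathcal{A}, \mathcal{B}(\overline{\mathbb{R}}))$-measurable. To simplify, we will also say in
this case that $f$ is $\mathcal{A}$−measurable.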

Remark 2.11 Let f : X → IR. To simplify the notation, we write {f < a} instead of
f −1 ([−∞, a[) = {x | f (x) < a}. This notation is used in probability theory. Most often in
probability theory, a probability space is denoted by (Ω, F, P ) and symbols like X, Y, Z, W are
used to denote random variables, i.e, measurable functions on Ω. Then P (X < a) denotes the
measure (probability) of the event {X < a}.

Lemma 2.4 Let (X, A) be a measurable space and f : X → IR. Then the following conditions
are equivalent.

1. f is measurable.

2. {f < a} is measurable for every a ∈ IR.

3. {f ≤ a} is measurable for every a ∈ IR.

4. {f < a} is measurable for every a ∈ Q.

5. {f ≤ a} is measurable for every a ∈ Q.

6. {f > a} is measurable for every a ∈ IR.

7. {f ≥ a} is measurable for every a ∈ IR.

8. {f > a} is measurable for every a ∈ Q.

9. {f ≥ a} is measurable for every a ∈ Q.

Proof. This follows from Lemma 2.2 and the fact that B(IR) is generated by subsets of the
form [−∞, a[ etc. □

Theorem 2.2 Let (X, A) be a measurable space. Let f, g : X → IR be measurable and let
α be a real constant. Then the following functions are defined on measurable subsets and are
measurable.
$$f + g, \quad \alpha f, \quad fg, \quad |f|, \quad \frac{f}{g}.$$

Proof. $f + g$ is not defined when $f(x) = +\infty$ and $g(x) = -\infty$ or vice versa. Otherwise stated,
$f + g$ is not defined on the set $E = \left(f^{-1}(+\infty) \cap g^{-1}(-\infty)\right) \cup \left(f^{-1}(-\infty) \cap g^{-1}(+\infty)\right)$. Now observe
that $\{+\infty\}$ is a closed set in the extended real line $\overline{\mathbb{R}}$. It is therefore a Borel subset of $\overline{\mathbb{R}}$. It follows that $f^{-1}(\{+\infty\})$
is measurable because $f$ is measurable. Similarly, the sets $g^{-1}(-\infty)$, $f^{-1}(-\infty)$ and $g^{-1}(+\infty)$ are
measurable. Consequently, $E$ is measurable and so the set $X \setminus E$ on which $f + g$ is defined is
measurable.
For similar reasons, $\frac{f}{g}$ is defined on a measurable set. $fg$ is defined everywhere because of
our convention $0 \times \pm\infty = 0$.
a) Let $h = f + g$. Let $a \in \mathbb{Q}$. We claim that
$$\{h < a\} = \bigcup \left(\{f < p\} \cap \{g < q\}\right)$$

where the union is taken over all couples $(p, q) \in \mathbb{Q}^2$ such that $p + q = a$. Indeed, if $f(x) < p$
and $g(x) < q$ with $p + q = a$, then $h(x) = f(x) + g(x) < p + q = a$. Conversely suppose that
$h(x) < a$. Note that this implies that $f(x) < \infty$ and $g(x) < \infty$. Let $p$ be a rational number
such that $f(x) < p < a - g(x)$. Let $q = a - p \in \mathbb{Q}$. Then $g(x) < q$. This proves the claim. But
the claim means that $\{h < a\}$ is a countable union of measurable sets.
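b) $\alpha f$ is measurable because
$$\{\alpha f < a\} = \begin{cases} \{f < \frac{a}{\alpha}\} & \text{if } \alpha > 0 \\ \emptyset & \text{if } \alpha = 0 \text{ and } a \le 0 \\ X & \text{if } \alpha = 0 \text{ and } a > 0 \\ \{f > \frac{a}{\alpha}\} & \text{if } \alpha < 0. \end{cases}$$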

c) $f^2$ is measurable because
$$\{f^2 \le a\} = \begin{cases} \emptyset & \text{if } a < 0 \\ \{-\sqrt{a} \le f \le \sqrt{a}\} & \text{if } a \ge 0. \end{cases}$$

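d) We can write
$$f(x)g(x) = \begin{cases} +\infty & \text{if } f(x) = +\infty,\ g(x) > 0 \text{ or } f(x) = -\infty,\ g(x) < 0 \text{ or vice versa} \\ -\infty & \text{if } f(x) = +\infty,\ g(x) < 0 \text{ or } f(x) = -\infty,\ g(x) > 0 \text{ or vice versa} \\ \frac{1}{2}\left((f(x) + g(x))^2 - f(x)^2 - g(x)^2\right) & \text{if } f(x), g(x) \in \mathbb{R}. \end{cases}$$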

Since constant functions are measurable, it follows from a), b), c) and the pasting lemma that
f g is measurable.
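e) $|f|$ is measurable because
$$\{|f| \le a\} = \begin{cases} \emptyset & \text{if } a < 0 \\ \{-a \le f \le a\} & \text{if } a \ge 0. \end{cases}$$

f) $\frac{1}{g}$ is measurable because
$$\left\{\frac{1}{g} < a\right\} = \begin{cases} \{g > \frac{1}{a}\} \cup \{g < 0\} & \text{if } a > 0 \\ \{g < 0\} & \text{if } a = 0 \\ \{\frac{1}{a} < g < 0\} & \text{if } a < 0. \end{cases}$$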

Theorem 2.3 Let (X, A) be a measurable space and let (fn ) be a sequence of measurable
functions from X to IR, then the following functions
inf fn , sup fn , lim inf fn , lim sup fn
are measurable. In particular, if (fn ) converges pointwise, its limit is measurable.
Proof. a) Let $h = \sup_n f_n$. Then $h$ is measurable because
$$\{h \le a\} = \bigcap_{n=1}^{\infty} \{f_n \le a\}.$$

b) Let $g = \inf_n f_n$. Then $g$ is measurable because
$$\{g \ge a\} = \bigcap_{n=1}^{\infty} \{f_n \ge a\}.$$

c) Consequently,
$$\liminf f_n = \sup_{n \ge 1} \inf_{k \ge n} f_k$$
is measurable.
d) Similarly,
$$\limsup f_n = \inf_{n \ge 1} \sup_{k \ge n} f_k$$
is measurable. □

Corollary 2.3 If f and g are two measurable functions from X to IR, then max(f, g) and
min(f, g) are measurable. In particular, the functions f + := max(f, 0) and f − := − min(f, 0) =
max(−f, 0) are measurable.

2.3.2 Simple functions


Definition 2.12 Let (X, A) be a measurable space and let f : X → IR be a measurable
function. We say that f is a simple function if it takes only a finite number of values.
Note that we include the condition of measurability in our definition of simple functions.

Lemma 2.5 Let $f : X \to \mathbb{R}$ be a simple function. Then $f$ can be written in the form
$$f = \sum_{i=1}^{n} a_i 1_{A_i} \qquad (2.1)$$

where $\{A_i\}$ are measurable sets that form a partition of $X$.


Proof. Suppose that $f$ takes the values $\{a_1, \ldots, a_n\}$ with $a_i \neq a_j$ for $i \neq j$. Set $A_i = f^{-1}(a_i)$
for $i = 1, \ldots, n$. Then first, each $A_i$ is measurable as the inverse image of a measurable set.
Second, the sets $A_i$ are pairwise disjoint, for if $x \in A_i \cap A_j$, then $f(x) = a_i$ and $f(x) = a_j$ so
that $a_i = a_j$ and this is impossible if $i \neq j$. Third, if $x \in X$, then $f(x)$ takes some value $a_k$ so
that $x \in f^{-1}(a_k) = A_k$. This means that $\{A_i\}_i$ is a partition of $X$.
We show now that $f$ has the representation (2.1). Let $x \in X$. Then as we already observed,
$x$ belongs to exactly one $A_k$. Now, on the one hand, $f(x) = a_k$. On the other hand, $1_{A_k}(x) = 1$
and $1_{A_i}(x) = 0$ for $i \neq k$. Therefore $\sum_{i=1}^{n} a_i 1_{A_i}(x) = a_k$. Hence the equality. □
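The construction in this proof, grouping the points of X by the value that f takes, can be sketched for a function on a finite set. This is an illustration only: the names `canonical_representation` and `evaluate` are hypothetical, and measurability is not checked (it is automatic if X carries the σ−algebra P(X)).

```python
def canonical_representation(f, X):
    """Return {a_i: A_i} with A_i = f^{-1}(a_i), for a function f on a finite set X."""
    parts = {}
    for x in X:
        parts.setdefault(f(x), set()).add(x)
    return parts

def evaluate(parts, x):
    """Evaluate sum_i a_i * 1_{A_i}(x) for the representation {a_i: A_i}."""
    return sum(a * (1 if x in A else 0) for a, A in parts.items())

X = range(-3, 4)
f = lambda x: -1 if x < 0 else 1

parts = canonical_representation(f, X)
print(parts)                                  # {-1: {-3, -2, -1}, 1: {0, 1, 2, 3}}
assert all(evaluate(parts, x) == f(x) for x in X)
```

Because the sets A_i partition X, exactly one indicator is nonzero at each point, which is why the sum reduces to the single value f(x).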

Remark. If $f : X \to \mathbb{R}$ is a simple function then $f$ could be written in the form (2.1) in
several ways. For example consider the function $f$ defined by $f(x) = -1$ if $x < 0$ and $f(x) = 1$
if $x \ge 0$. Then $f = -\chi_{]-\infty,0[} + \chi_{[0,\infty[}$. But also $f = -\chi_{]-\infty,-2[} - \chi_{[-2,0[} + \chi_{[0,\infty[} + 3\chi_{\emptyset}$. Let us
say that $\sum_{j \in J} b_j \chi_{B_j}$ is an admissible representation of $f$ if the sets $\{B_j\}_{j \in J}$ form a partition of $X$.

Theorem 2.4 (Approximation of nonnegative measurable functions by simple functions) Let (X, A) be a measurable space and f : X → [0, ∞] be measurable. Then there exists
a sequence (hn ) of simple functions such that

a) hn (x) < +∞ for all n and all x ∈ X.

b) 0 ≤ h1 ≤ h2 ≤ · · · ≤ f .

c) hn (x) → f (x) as n → ∞ for all x ∈ X.

d) If f is bounded then the convergence is uniform.

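Proof. For each $n \in \mathbb{N}^*$ divide the interval $[0, n]$ into $n2^n$ intervals each of length $2^{-n}$ and
set
$$E_{n,k} = \begin{cases} \{x \mid k2^{-n} \le f(x) < (k+1)2^{-n}\} & \text{if } 0 \le k \le n2^n - 1 \\ \{x \mid f(x) \ge n\} & \text{if } k = n2^n. \end{cases}$$
Then for each $n$, the family $\{E_{n,k}\}_{k=0,\ldots,n2^n}$ is a partition of $X$. Set
$$h_n(x) = \begin{cases} k2^{-n} & \text{if } x \in E_{n,k} \text{ for some } 0 \le k \le n2^n - 1 \\ n & \text{if } x \in E_{n,n2^n}. \end{cases}$$

Otherwise stated,
$$h_n = \sum_{k=0}^{n2^n} \frac{k}{2^n} 1_{E_{n,k}}$$
is a simple function.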

a) It is clear that hn (x) < ∞ for all n ∈ IN∗ and all x ∈ X.


b) It is also clear that $0 \le h_n \le f$ for all $n$. Let us prove that $h_n \le h_{n+1}$. Let $x \in X$. There
are 2 cases:

(1) $k2^{-n} \le f(x) < (k+1)2^{-n}$ for some $0 \le k \le n2^n - 1$. Then $h_n(x) = k2^{-n}$. Also the
inequality of this case is equivalent to $2k \cdot 2^{-n-1} \le f(x) < (2k+2)2^{-n-1}$. So we distinguish
between two cases.

(1a) $2k \cdot 2^{-n-1} \le f(x) < (2k+1)2^{-n-1}$, that is, $x \in E_{n+1,2k}$. In this case, $h_{n+1}(x) = 2k \cdot 2^{-n-1} = k2^{-n} = h_n(x)$.
(1b) $(2k+1)2^{-n-1} \le f(x) < (2k+2)2^{-n-1}$, that is, $x \in E_{n+1,2k+1}$. In this case, $h_{n+1}(x) = (2k+1)2^{-n-1} > k2^{-n} = h_n(x)$.

(2) $f(x) \ge n$. In this case, $h_n(x) = n$. We distinguish between two cases.

(2a) $n \le f(x) < n+1$. Then $k2^{-n-1} \le f(x) < (k+1)2^{-n-1}$ for some $k \le (n+1)2^{n+1} - 1$
which means that $x \in E_{n+1,k}$. But then $(k+1)2^{-n-1} > n$ (because $f(x) \ge n$) and so
$k \ge n2^{n+1}$. Therefore, $h_{n+1}(x) = k2^{-n-1} \ge n = h_n(x)$.
(2b) $f(x) \ge n+1$. Then $h_{n+1}(x) = n+1 > n = h_n(x)$.

c) Let $x \in X$. There are two cases.

1. $f(x) < \infty$. Let $\varepsilon > 0$ be given. Choose $N$ such that $f(x) < N$ and $2^{-N} < \varepsilon$. For
$n \ge N$ we have $k2^{-n} \le f(x) < (k+1)2^{-n}$ for some $k$, and so $h_n(x) = k2^{-n}$, therefore
$|f(x) - h_n(x)| < 2^{-n} \le 2^{-N} < \varepsilon$. Since $\varepsilon$ was arbitrary, it follows that $h_n(x) \to f(x)$.

2. $f(x) = +\infty$. Then $f(x) > n$ for all $n$ and so $h_n(x) = n$ by construction. Therefore,
$h_n(x) \to +\infty$.

d) Let $f(x) \le M$ for all $x \in X$. Let $\varepsilon > 0$ be given. Choose $N$ such that $N > M$ and $2^{-N} < \varepsilon$.
Then $|f(x) - h_n(x)| < 2^{-n} \le 2^{-N} < \varepsilon$ for all $x \in X$ and all $n \ge N$. □
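The dyadic construction of h_n in this proof amounts to rounding f(x) down to the nearest multiple of 2^{-n} and truncating at height n. The sketch below is an illustration only; it works pointwise on a single value fx = f(x), the function name `h` mirrors the notation h_n, and floating-point numbers are assumed to be adequate for the sample values used.

```python
import math

def h(n, fx):
    """Value h_n(x) of the n-th dyadic approximation, given fx = f(x) in [0, +inf]."""
    if fx >= n:                               # x lies in E_{n, n 2^n}
        return float(n)
    return math.floor(fx * 2**n) / 2**n       # k 2^{-n} with k = floor(2^n f(x))

fx = 2.3
values = [h(n, fx) for n in range(1, 8)]
print(values)    # nondecreasing, bounded above by fx

# Monotonicity (property b) and the error bound used in property c).
assert all(a <= b <= fx for a, b in zip(values, values[1:]))
assert all(fx - h(n, fx) < 2**-n for n in range(3, 8))   # valid once n > fx

# If f(x) = +infinity, then h_n(x) = n for every n.
assert h(5, math.inf) == 5.0
```

The truncation at height n is what keeps every h_n finite-valued, in agreement with property a), even at points where f(x) = +∞.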

Corollary 2.4 (Approximation of measurable functions by simple functions)


Let (X, A) be a measurable space and f : X → IR be measurable. Then there exists a sequence
of simple functions that converges pointwise to f .
Proof. See the exercises. □

2.4 Outer measures and Caratheodory’s theorem


Definition 2.13 An outer measure on a set X is a function

µ∗ : P(X) → [0, ∞]

such that

(i) µ∗ (∅) = 0;

(ii) (monotonicity) A ⊂ B ⇒ µ∗ (A) ≤ µ∗ (B);

(iii) (σ−subadditivity) For any sequence of subsets $\{A_n\} \subset \mathcal{P}(X)$ we have

$$\mu^*\left(\bigcup_{n=1}^{\infty} A_n\right) \le \sum_{n=1}^{\infty} \mu^*(A_n).$$

Examples. 1) Let X be a set. Any measure on (X, P(X)) is an outer measure on X.


2) Let $X$ be a set. Define $\mu^* : \mathcal{P}(X) \to \mathbb{R}_+$ by
$$\mu^*(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ 1 & \text{if not.} \end{cases}$$

Then µ∗ is an outer measure which is not a measure (unless X is a one point set). Indeed,

1. µ∗ (∅) = 0.

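2. Let $A \subset B$. If $B = \emptyset$, then $A = \emptyset$ and $\mu^*(A) = \mu^*(B) = 0$. If $B \neq \emptyset$, then $\mu^*(B) = 1 \ge \mu^*(A)$.

3. Let $(A_n)$ be a sequence of $\mathcal{P}(X)$. We need to show that $\mu^*\left(\bigcup A_n\right) \le \sum \mu^*(A_n)$.
If $A_n = \emptyset$ for all $n$, then both sides are zero. If not, $A_k \neq \emptyset$ for some $k$. In this case,
$\mu^*\left(\bigcup A_n\right) = 1$ and $\sum \mu^*(A_n) \ge \mu^*(A_k) = 1$.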

We now show that µ∗ is not a measure if X has at least two points. Let a and b be two distinct
points of X. Then µ∗ ({a} ∪ {b}) = 1 whereas µ∗ ({a}) + µ∗ ({b}) = 2.
3) Define $\mu^* : \mathcal{P}(\mathbb{R}) \to \mathbb{R}_+$ by
$$\mu^*(A) = \begin{cases} 0 & \text{if } A \text{ is countable} \\ 1 & \text{if not.} \end{cases}$$

Then $\mu^*$ is an outer measure which is not a measure. Indeed,

1. $\mu^*(\emptyset) = 0$ since $\emptyset$ is countable.

2. Let $A \subset B$. If $B$ is countable, then $A$ is also countable and so $\mu^*(A) = \mu^*(B) = 0$. If $B$ is uncountable, then $\mu^*(B) = 1 \ge \mu^*(A)$.

3. Let $(A_n)$ be a sequence of $\mathcal{P}(\mathbb{R})$. We need to show that $\mu^*\left(\bigcup A_n\right) \le \sum \mu^*(A_n)$.
If all $A_n$ are countable, then $\bigcup A_n$ is also countable and both sides are zero. If not, then
$\bigcup A_n$ is uncountable and so $\mu^*\left(\bigcup A_n\right) = 1$, whereas $\sum \mu^*(A_n) \ge 1$.
To show that µ∗ is not a measure, take two uncountable disjoint sets for instance [0,1] and ]2,3].
Then the measure of the union is 1 whereas the sum of the measures is 2.

The theorem of Caratheodory below shows that an outer measure is not very far from
a measure. More precisely, an outer measure is a measure when restricted to some suitable
σ−algebra.

Lemma 2.6 If µ∗ is an outer measure on a set X, then for any subsets A1 , . . . , Ak of X,

\[ \mu^*(A_1 \cup \cdots \cup A_k) \le \sum_{i=1}^{k} \mu^*(A_i). \]

Proof. Extend the finite sequence (Ai ) to an infinite one by setting Aj = ∅ for j > k, then apply σ−subadditivity together with µ∗ (∅) = 0. □

Definition 2.14 Let µ∗ be an outer measure on a set X. A subset E ⊂ X is called µ∗ −measurable


in the sense of Caratheodory if for every A ⊂ X we have

µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A\E).

Remark 2.12 Since A = (A ∩ E) ∪ (A\E), it follows from the preceding lemma that µ∗ (A) ≤
µ∗ (A ∩ E) + µ∗ (A\E). Thus, a set E ⊂ X is µ∗ −measurable if and only if for every A ⊂ X,

µ∗ (A) ≥ µ∗ (A ∩ E) + µ∗ (A\E).
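When X is finite, Definition 2.14 can be tested by brute force. The following Python sketch does this for the outer measure of Example 2 on the three-point set X = {0, 1, 2}. The function names and the choice of X are illustrative assumptions, not part of the course.

```python
from itertools import combinations


def powerset(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer_measure(subset):
    """Example 2: value 0 on the empty set, 1 on every nonempty set."""
    return 0 if len(subset) == 0 else 1


def caratheodory_measurable(universe, mu_star):
    """Subsets E with mu*(A) = mu*(A & E) + mu*(A - E) for every A."""
    subsets = powerset(universe)
    return [E for E in subsets
            if all(mu_star(A) == mu_star(A & E) + mu_star(A - E)
                   for A in subsets)]


X = frozenset({0, 1, 2})
measurable = caratheodory_measurable(X, outer_measure)
```

In this toy case only ∅ and X pass the test, and µ∗ restricted to {∅, X} is indeed a measure, in agreement with the theorem of Caratheodory below.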

Theorem 2.5 (Caratheodory) Let µ∗ be an outer measure on a set X. Let Mµ∗ denote the
set of µ∗ −measurable subsets of X. Then
1. Mµ∗ is a σ−algebra.
2. µ := µ∗ |Mµ∗ is a measure on (X, Mµ∗ ).
3. (X, Mµ∗ , µ) is complete.
This theorem associates to each outer measure on a set X a measure space

(X, µ∗ ) 7→ (X, Mµ∗ , µ).

Proof. We prove 1. and 2. simultaneously. (i) Let A ⊂ X. Then µ∗ (A ∩ ∅) +


µ∗ (A\∅) = µ∗ (∅) + µ∗ (A) = µ∗ (A). Therefore ∅ ∈ Mµ∗ . (ii) It is clear that the condition of
µ∗ −measurability is symmetric in E and E c . Therefore E ∈ Mµ∗ implies that E c ∈ Mµ∗ . (iii)
To prove the third property of a σ−algebra, we proceed in several steps.
Step 1. Mµ∗ is closed under finite unions. Let E1 and E2 belong to Mµ∗ . We prove that
E1 ∪ E2 ∈ Mµ∗ . Let A ⊂ X. Then

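\[
\begin{aligned}
\mu^*(A) &\le \mu^*\big(A \cap (E_1 \cup E_2)\big) + \mu^*\big(A \cap (E_1 \cup E_2)^c\big) \\
&= \mu^*\big((A \cap E_1) \cup (A \cap E_2 \cap E_1^c)\big) + \mu^*\big(A \cap E_1^c \cap E_2^c\big) \\
&\le \mu^*(A \cap E_1) + \mu^*(A \cap E_1^c \cap E_2) + \mu^*(A \cap E_1^c \cap E_2^c) \\
&= \mu^*(A \cap E_1) + \mu^*(A \cap E_1^c) && \text{since } E_2 \text{ is } \mu^*\text{-measurable} \\
&= \mu^*(A) && \text{since } E_1 \text{ is } \mu^*\text{-measurable.}
\end{aligned}
\]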



Thus
\[ \mu^*(A) = \mu^*\big(A \cap (E_1 \cup E_2)\big) + \mu^*\big(A \cap (E_1 \cup E_2)^c\big), \]


which means that E1 ∪ E2 ∈ Mµ∗ . Now it is easily seen by induction that, if E1 , . . . , En ∈ Mµ∗ ,
then E1 ∪ · · · ∪ En ∈ Mµ∗ .
Step 2. Let E1 , . . . , En be pairwise disjoint elements of Mµ∗ . We prove that for any subset A ⊂ X,
\[ \mu^*\Big(A \cap \bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n} \mu^*(A \cap E_i). \]

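The claim is indeed true for n = 1. Assume that it is true for some n. Then
\[
\mu^*\Big(A \cap \bigcup_{i=1}^{n+1} E_i\Big) = \mu^*\Big(A \cap \Big(\bigcup_{i=1}^{n+1} E_i\Big) \cap E_{n+1}\Big) + \mu^*\Big(A \cap \Big(\bigcup_{i=1}^{n+1} E_i\Big) \cap E_{n+1}^c\Big)
\]
because En+1 is µ∗ −measurable. Now, $\big(\bigcup_{i=1}^{n+1} E_i\big) \cap E_{n+1} = \bigcup_{i=1}^{n+1} (E_i \cap E_{n+1}) = E_{n+1}$ because $E_i \cap E_{n+1} = \emptyset$ for i ≠ n + 1 by assumption. Next, $\big(\bigcup_{i=1}^{n+1} E_i\big) \cap E_{n+1}^c = \bigcup_{i=1}^{n} (E_i \cap E_{n+1}^c) = \bigcup_{i=1}^{n} E_i$ because $E_i \subset E_{n+1}^c$ since $E_i \cap E_{n+1} = \emptyset$ for i ≤ n. Therefore
\[
\begin{aligned}
\mu^*\Big(A \cap \bigcup_{i=1}^{n+1} E_i\Big) &= \mu^*(A \cap E_{n+1}) + \mu^*\Big(A \cap \bigcup_{i=1}^{n} E_i\Big) \\
&= \mu^*(A \cap E_{n+1}) + \sum_{i=1}^{n} \mu^*(A \cap E_i) && \text{by the induction assumption} \\
&= \sum_{i=1}^{n+1} \mu^*(A \cap E_i).
\end{aligned}
\]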

Therefore, the claim is proved. Observe that if we take A = X, we get
\[ \mu^*\Big(\bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n} \mu^*(E_i). \]

Step 3. Mµ∗ is closed under countable disjoint unions. Let E1 , E2 , . . . be pairwise disjoint elements of Mµ∗ . Let $E = \bigcup_{i=1}^{\infty} E_i$. We prove that E ∈ Mµ∗ and that for any A ⊂ X,
\[ \mu^*\Big(A \cap \bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} \mu^*(A \cap E_i). \]

Set $F_n = \bigcup_{i=1}^{n} E_i$. By Step 1, each Fn is µ∗ −measurable and so by Step 2,
\[ \mu^*(A) = \mu^*(A \cap F_n) + \mu^*(A \cap F_n^c) = \sum_{i=1}^{n} \mu^*(A \cap E_i) + \mu^*(A \cap F_n^c). \]
Now Fn ⊂ E so $E^c \subset F_n^c$ and therefore $\mu^*(A \cap E^c) \le \mu^*(A \cap F_n^c)$ by monotonicity of µ∗ . Hence,
\[ \mu^*(A) \ge \sum_{i=1}^{n} \mu^*(A \cap E_i) + \mu^*(A \cap E^c). \]
As this inequality is true for any n ∈ IN∗ , we get
\[ \mu^*(A) \ge \sum_{i=1}^{\infty} \mu^*(A \cap E_i) + \mu^*(A \cap E^c). \tag{2.2} \]

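By the σ−subadditivity, $\sum_{i=1}^{\infty} \mu^*(A \cap E_i) \ge \mu^*\big(\bigcup_{i=1}^{\infty} (A \cap E_i)\big) = \mu^*(A \cap E)$, so we get

µ∗ (A) ≥ µ∗ (A ∩ E) + µ∗ (A ∩ E c ).

This proves that E ∈ Mµ∗ . Now taking in particular A = E in (2.2), we get
\[ \mu^*(E) \ge \sum_{i=1}^{\infty} \mu^*(E_i). \]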

Since the reverse inequality holds by σ−subadditivity, we have in fact equality.


Step 4. Mµ∗ is closed under arbitrary countable unions. Let {An } be a sequence of Mµ∗ .
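Set E1 = A1 and En = An \(A1 ∪ · · · ∪ An−1 ) for n ≥ 2. Each En belongs to Mµ∗ by Step 1 and the stability of Mµ∗ under complements. We know that the En are pairwise disjoint and $\bigcup_{i=1}^{\infty} E_i = \bigcup_{i=1}^{\infty} A_i$. By Step 3, $\bigcup_{i=1}^{\infty} E_i \in$ Mµ∗ . Hence the conclusion.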

3. Let E be µ−negligible, that is E ⊂ B with B ∈ Mµ∗ and µ(B) = 0. Then µ∗ (B) = 0 and so µ∗ (E) = 0 by monotonicity. Let A ⊂ X. Then µ∗ (A ∩ E) = 0, and therefore µ∗ (A ∩ E) + µ∗ (A ∩ E c ) = µ∗ (A ∩ E c ) ≤ µ∗ (A). This means that E is µ∗ −measurable. □

2.5 Completion of a measure space


You will appreciate the importance of this section in the next chapter when we will compare
the Borel σ−algebra on IR with the Lebesgue σ−algebra.

Theorem 2.6 Let (X, A, µ) be a measure space. Let Ã denote the set of subsets of X of the
form A ∪ N where A ∈ A and N is µ−negligible. Then

(1) Ã is a σ−algebra containing A.

(2) µ extends in a unique way to a measure µ̃ on (X, Ã).

(3) The measure space (X, Ã, µ̃) is complete.

(4) A subset A ⊂ X is µ−negligible if and only if it is µ̃−negligible.

Proof. (1) We prove first that Ã is a σ−algebra. Let N denote the set of µ−negligible
subsets of X.
(i) X = X ∪ ∅ with X ∈ A and ∅ ∈ N . Therefore X ∈ Ã.

(ii) Let A ∪ N ∈ Ã. Then (A ∪ N )c = Ac ∩ N c . Now by definition, there exists C ∈ A such that N ⊂ C and µ(C) = 0. Then C c ⊂ N c and we can write:
\[ A^c \cap N^c = (A^c \cap C^c) \cup (A^c \cap N^c \cap C) = B \cup M. \]

Observe that B ∈ A and M ⊂ C so that M ∈ N . This means that (A ∪ N )c ∈ Ã.


(iii) Let (Ai ∪ Ni ) be a sequence of Ã. Then ∪i (Ai ∪ Ni ) = (∪i Ai ) ∪ (∪i Ni ). Observe that ∪i Ai ∈ A
since A is a σ−algebra, and ∪i Ni ∈ N , for if Ci ∈ A is such that Ni ⊂ Ci and µ(Ci ) = 0 then
∪i Ni ⊂ ∪i Ci with $\mu(\bigcup_i C_i) \le \sum_i \mu(C_i) = 0$. This means that ∪i (Ai ∪ Ni ) ∈ Ã.
Next, A ⊂ Ã since for every A ∈ A, we have the decomposition A = A ∪ ∅ where A ∈ A and
∅ ∈ N . Note also that N ⊂ Ã since for every N ∈ N , we have the decomposition N = ∅ ∪ N
where ∅ ∈ A and N ∈ N .
(2) Now we define a measure µ̃ on (X, Ã) in the following way: let Ã ∈ Ã, then there exist
A ∈ A and N ∈ N such that Ã = A ∪ N . We set µ̃(Ã) := µ(A). We need to ensure that this

definition makes sense because the representation A ∪ N of an element of Ã is not necessarily
unique. So let B ∪ M be another representation of Ã so that Ã = A ∪ N = B ∪ M . There
exists M1 ∈ A such that M ⊂ M1 and µ(M1 ) = 0. Then A ⊂ B ∪ M ⊂ B ∪ M1 and so
µ(A) ≤ µ(B ∪ M1 ) = µ(B). By symmetry, µ(B) ≤ µ(A). Hence µ̃ is well defined. It is clear
that µ̃ coincides with µ on A because if A ∈ A, the decomposition A = A ∪ ∅ implies that
µ̃(A) = µ(A).
• Next, we check that µ̃ is a measure on (X, Ã).

(i) µ̃(∅) = µ̃(∅ ∪ ∅) = µ(∅) = 0.

(ii) Let (An ∪ Nn ) be a sequence of pairwise disjoint elements of Ã. Observe that the sets
(An ) are also pairwise disjoint. Then
\[ \tilde{\mu}\Big(\bigcup_n (A_n \cup N_n)\Big) = \tilde{\mu}\Big(\bigcup_n A_n \cup \bigcup_n N_n\Big) = \mu\Big(\bigcup_n A_n\Big) = \sum_n \mu(A_n) = \sum_n \tilde{\mu}(A_n \cup N_n). \]

• We check that µ̃ is the unique extension of µ to Ã. So let ν be an extension of µ to Ã.


Let B ∈ Ã. Then B = A ∪ N where A ∈ A and N ∈ N . Then ν(B) ≤ ν(A) + ν(N ).
Let N ⊂ N1 where N1 ∈ A and µ(N1 ) = 0. Then ν(N ) ≤ ν(N1 ) = µ(N1 ) = 0 (because
ν coincides with µ on A). Therefore ν(B) ≤ ν(A) = µ(A) = µ̃(B). On the other hand,
µ̃(B) = ν(A) ≤ ν(A ∪ N ) = ν(B). Hence ν(B) = µ̃(B).
(3) We check that (X, Ã, µ̃) is complete. Let C ⊂ D with D ∈ Ã and µ̃(D) = 0. Then
D = D1 ∪ D2 where D1 ∈ A and D2 ∈ N . Therefore, µ(D1 ) = µ̃(D) = 0. Now D2 ∈ N
implies that D2 ⊂ D3 where D3 ∈ A and µ(D3 ) = 0. Consequently, C ⊂ D1 ∪ D3 where now
D1 ∪ D3 ∈ A and µ(D1 ∪ D3 ) = 0. This means that C ∈ N ⊂ Ã.
(4) What we just said shows that a µ̃−negligible set is µ−negligible. Conversely, let A ⊂ X
be µ−negligible. Then there is M ∈ A such that A ⊂ M and µ(M ) = 0. Then M ∈ Ã and
µ̃(M ) = 0. This means that A is µ̃−negligible.

Definition 2.15 The space (X, Ã, µ̃) constructed above is called the completion of the space
(X, A, µ). We shall also say that Ã is the completion of A.
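On a finite set the completion can be computed explicitly. The sketch below is a toy illustration under assumed data: X = {0, 1, 2, 3}, a σ−algebra generated by the atoms {0}, {1}, {2, 3}, and the weights 1, 2, 0 on these atoms. None of these choices come from the course. The code lists the negligible sets, forms all sets A ∪ N, and checks along the way that the value µ(A) does not depend on the representation.

```python
from itertools import combinations

X = frozenset({0, 1, 2, 3})
# Hypothetical atoms of the sigma-algebra A, with their measures.
atoms = {frozenset({0}): 1, frozenset({1}): 2, frozenset({2, 3}): 0}


def powerset(universe):
    """Return all subsets of `universe` as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def generated_sigma_algebra(atoms):
    """Map every union of atoms to its measure (finite additivity)."""
    atom_list = list(atoms)
    result = {}
    for r in range(len(atom_list) + 1):
        for chosen in combinations(atom_list, r):
            union = frozenset().union(*chosen)
            result[union] = sum(atoms[a] for a in chosen)
    return result


sigma_algebra = generated_sigma_algebra(atoms)
null_sets = [C for C, m in sigma_algebra.items() if m == 0]
# Negligible sets: subsets of some measurable set of measure 0.
negligible = [N for N in powerset(X) if any(N <= C for C in null_sets)]

completion = {}
for A, m in sigma_algebra.items():
    for N in negligible:
        B = A | N
        # If B was already reached through another representation,
        # the value must agree: the extended measure is well defined.
        assert completion.setdefault(B, m) == m
```

Here the σ−algebra has 8 elements while its completion has 16, that is, the completion is all of P(X). For instance {2} is not in A but it is in the completion, with measure 0.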

Remark 2.13 In this context, a property holds µ−almost everywhere if and only if it holds
µ̃−almost everywhere.
Exercise. a) Show that for each B ∈ Ã, there exist A ∈ A and N ∈ N such that B = A ∪ N
and moreover A ∩ N = ∅. Hint. Let M ∈ A be such that N ⊂ M and µ(M ) = 0. Write
A ∪ N = ((A ∪ N ) ∩ M c ) ∪ ((A ∪ N ) ∩ M ).
b) Show that B ∈ Ã if and only if there exist A1 , A2 ∈ A such that A1 ⊂ B ⊂ A2 and
µ(A2 \A1 ) = 0.

Proposition 2.6 Let (X, A, µ) be a measure space and let (X, Ã, µ̃) be its completion. Let
f : X → IR be Ã−measurable. Then there exists a function g : X → IR which is A−measurable
and coincides with f almost everywhere.
Proof. We prove this in two steps.
Step 1. Let $f = \sum_i a_i 1_{A_i}$ be an Ã−measurable simple function. Then for each i, there exists
Bi ∈ A such that Bi ⊂ Ai and Ai \Bi is negligible. Let $g = \sum_i a_i 1_{B_i}$. Then g is A−measurable.
We claim that x ∉ ∪(Ai \Bi ) ⇒ f (x) = g(x). Indeed, let x ∉ ∪(Ai \Bi ). Since the {Ai } form

a partition of X, there exists a unique k such that x ∈ Ak and so f (x) = ak . Then x ∈ Bk


because otherwise x ∈ Ak \Bk , contrary to our assumption. Then g(x) = ak (because the Bi are
also pairwise disjoint) and so g(x) = f (x). The claim is proved. It follows {f ̸= g} ⊂ ∪(Ai \Bi )
and therefore {f ̸= g} is negligible. This means that f = g almost everywhere.
Step 2. Let f be Ã− measurable. By the approximation theorem, there exists a sequence
of simple functions fn that are Ã−measurable and converge pointwise to f . Now by step 1,
for each n, there exists a simple function gn which is A−measurable and such that fn = gn
almost everywhere. By Remark 2.7, for each n, there exists An ∈ A such that fn (x) = gn (x)
for all x ∈ An and $\mu(A_n^c) = 0$. Let A = ∩An . Then µ(Ac ) = 0 and fn (x) = gn (x) for all x ∈ A
and for all n. Now for each n, modify gn by setting gn (x) = 0 for x ∈ Ac . Then each gn is
still A−measurable (Corollary 2.2) and moreover (gn ) converges to the function g defined by
g(x) = f (x) if x ∈ A and g(x) = 0 if x ∈ Ac . Observe that g is A− measurable as a limit of A−
measurable functions. Now g(x) = f (x) for all x ∈ A and µ(Ac ) = 0. This means that f = g
almost everywhere. □
Chapter 3

The Lebesgue measure on IR

Our target in this chapter is to give a mathematical meaning to the intuitive notion of length.
We shall assign a measure (a length) to many subsets of IR. This measure is called the Lebesgue
measure and the class of subsets having a measure is called the Lebesgue σ−algebra.

3.1 Construction and properties


Here is our plan for this section.
1. Construct an outer measure λ∗ on IR and establish its properties.

2. Use the first part of Caratheodory’s theorem to construct a σ−algebra L on IR and


establish its properties.

3. Use the second part of Caratheodory’s theorem to construct a measure λ on (IR, L) called
the Lebesgue measure. Establish some important properties of λ.

The Lebesgue outer measure


The Lebesgue outer measure on IR is the function λ∗ : P(IR) → [0, ∞] defined by
\[ \lambda^*(A) = \inf\Big\{ \sum_{n=1}^{\infty} (b_n - a_n) \ \Big|\ A \subset \bigcup_{n=1}^{\infty}\, ]a_n, b_n[ \Big\}. \tag{3.1} \]

Some comments are in order. First, λ∗ is well defined. Indeed, let
\[ X_A = \Big\{ \sum_{n=1}^{\infty} (b_n - a_n) \ \Big|\ A \subset \bigcup_{n=1}^{\infty}\, ]a_n, b_n[ \Big\}. \]
Then XA ⊂ [0, ∞]. Since $A \subset \bigcup_{n=1}^{\infty}\, ]-n, n[$ and $\sum_{n=1}^{\infty} (2n) = +\infty$, we have +∞ ∈ XA .
Therefore XA is not empty and so it has an infimum. It is possible that XA = {+∞}. For
example, XIR = {+∞}.
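Second, we can write
\[ \lambda^*(A) = \inf\Big\{ \sum_{n=1}^{\infty} \ell(I_n) \ \Big|\ A \subset \bigcup_{n=1}^{\infty} I_n \text{ and each } I_n \text{ is an open interval} \Big\} \]
where ℓ(In ) is the length of the interval In . Indeed, let
\[ Y_A = \Big\{ \sum_{n=1}^{\infty} \ell(I_n) \ \Big|\ A \subset \bigcup_{n=1}^{\infty} I_n \text{ and each } I_n \text{ is an open interval} \Big\}. \]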



Then XA ⊂ YA . Conversely, let s ∈ YA ; then $s = \sum_{n=1}^{\infty} \ell(I_n)$ where $A \subset \bigcup_{n=1}^{\infty} I_n$. If all intervals
In are bounded, then s ∈ XA . If some interval In is unbounded then s = +∞ ∈ XA .
A sequence of intervals In such that A ⊂ ∪In is called a covering of A and $\sum_{n=1}^{\infty} \ell(I_n)$ is
called the total length of the covering.
Third, the word open can be removed from the definition of λ∗ (A). Indeed, let
\[ Z_A = \Big\{ \sum_{n=1}^{\infty} \ell(I_n) \ \Big|\ A \subset \bigcup_{n=1}^{\infty} I_n \text{ and each } I_n \text{ is an interval} \Big\}. \]

Then YA ⊂ ZA and so inf ZA ≤ inf YA . On the other hand, let ε > 0 be given and let (In ) be a
covering of A by intervals. Observe that for each n, there exists an open interval Jn containing
In such that $\ell(J_n) = \ell(I_n) + \frac{\varepsilon}{2^n}$; for example if In = [an , bn ], take $J_n = \,]a_n - \frac{\varepsilon}{2^{n+1}},\, b_n + \frac{\varepsilon}{2^{n+1}}[$.
It follows that (Jn ) is a covering of A by open intervals and $\sum_{n=1}^{\infty} \ell(J_n) = \sum_{n=1}^{\infty} \ell(I_n) + \varepsilon$.
Consequently, $\sum_{n=1}^{\infty} \ell(I_n) = \sum_{n=1}^{\infty} \ell(J_n) - \varepsilon \ge \inf Y_A - \varepsilon$ and so inf ZA ≥ inf YA − ε. Since ε was arbitrary, we
conclude that inf ZA ≥ inf YA and hence equality.
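The enlargement trick used in the third comment can be checked numerically on a finite family of closed intervals. The following Python sketch, with illustrative function names and sample intervals that are assumptions of this example, replaces the n-th interval [an , bn ] by ]an − ε/2^(n+1), bn + ε/2^(n+1)[ and measures how much total length is added. For N intervals the added length is ε(1 − 2^(−N)), which stays below ε.

```python
from fractions import Fraction


def enlarge(intervals, eps):
    """Replace the n-th closed interval [a_n, b_n] (n = 1, 2, ...) by the
    open interval ]a_n - eps/2^(n+1), b_n + eps/2^(n+1)[, stored as a pair."""
    return [(a - eps / 2 ** (n + 1), b + eps / 2 ** (n + 1))
            for n, (a, b) in enumerate(intervals, start=1)]


def total_length(intervals):
    """Sum of the lengths b - a of the intervals (a, b)."""
    return sum(b - a for a, b in intervals)


# Sample data (hypothetical): three closed intervals, one reduced to a point.
closed = [(Fraction(0), Fraction(1)),
          (Fraction(2), Fraction(5, 2)),
          (Fraction(3), Fraction(3))]
eps = Fraction(1, 10)
opened = enlarge(closed, eps)
extra = total_length(opened) - total_length(closed)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the identity for the added length holds without rounding errors.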

Proposition 3.1 λ∗ is an outer measure on IR.


Proof.
(i) Let an = bn = 0 for n = 1, 2, 3, . . .. Then $\bigcup_{n=1}^{\infty}\, ]a_n, b_n[ \,= \emptyset$ and $\sum_{n=1}^{\infty} (b_n - a_n) = 0$.
Therefore, 0 ∈ X∅ and so 0 ≤ λ∗ (∅) = inf X∅ ≤ 0.
(ii) We claim that if A ⊂ B ⊂ IR, then XB ⊂ XA . Indeed, let s ∈ XB , then (by the definition
of XB ) there exist two sequences {an } and {bn } of IR such that $s = \sum_{n=1}^{\infty} (b_n - a_n)$ and
$B \subset \bigcup_{n=1}^{\infty}\, ]a_n, b_n[$. Since A ⊂ B, then $A \subset \bigcup_{n=1}^{\infty}\, ]a_n, b_n[$. Thus, s ∈ XA .
It follows that inf XA ≤ inf XB , that is, λ∗ (A) ≤ λ∗ (B).
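(iii) Let {An } be a sequence of subsets of IR. We need to show that
\[ \lambda^*\Big(\bigcup_n A_n\Big) \le \sum_n \lambda^*(A_n). \]

If λ∗ (An ) = +∞ for some n then the inequality is of course satisfied. So we assume that
λ∗ (An ) < +∞ for all n ∈ IN∗ . Let ε > 0 be given. By a fundamental property of the
infimum of a subset of IR, for each n ∈ IN∗ , there exists a sequence {Inm }m∈IN∗ of open
intervals such that
\[ A_n \subset \bigcup_{m=1}^{\infty} I_{nm} \quad \text{and} \quad \sum_{m=1}^{\infty} \ell(I_{nm}) < \lambda^*(A_n) + \frac{\varepsilon}{2^n}. \]

The countable family {Inm }n,m∈IN∗ is a covering of $\bigcup_{n=1}^{\infty} A_n$ and therefore
\[ \lambda^*\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n,m=1}^{\infty} \ell(I_{nm}). \]

We have therefore,
\[ \lambda^*\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \ell(I_{nm}) \le \sum_{n=1}^{\infty} \Big(\lambda^*(A_n) + \frac{\varepsilon}{2^n}\Big) = \sum_{n=1}^{\infty} \lambda^*(A_n) + \varepsilon. \]
Since ε is arbitrary, we conclude that
\[ \lambda^*\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n=1}^{\infty} \lambda^*(A_n). \]
□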


Definition 3.1 Let E ⊂ IR and a ∈ IR. We set E + a = {x + a | x ∈ E}. We say that E + a is
a translate of E. For example [0,1] + 3 = [3,4]. We set aE = {ax | x ∈ E}. The set aE is the image of E
under the homothety of center 0 and ratio a. For example if E = [1, 2] then 3E = [3, 6].

Exercise. Prove the following

1. ∪i (Ei + a) = (∪Ei ) + a and ∩i (Ei + a) = (∩Ei ) + a.

2. (E + a)c = E c + a and (aE)c = aE c for a ̸= 0.

3. A ∩ (E + a) = ((A − a) ∩ E) + a and A ∩ (aE) = a((1/a)A ∩ E) for a ≠ 0.

4. If a ≥ 0, then inf(aE) = a inf E and sup(aE) = a sup E. What if a < 0?

5. If I is an interval, then ℓ(aI) = |a|ℓ(I).

Here are some fundamental properties of λ∗ .

Proposition 3.2 The Lebesgue outer measure λ∗ satisfies the following properties.

1. λ∗ ({p}) = 0 for any p ∈ IR.

2. λ∗ (E) = 0 for any countable set E ⊂ IR.

3. For any interval I, λ∗ (I) = length(I).

4. λ∗ is translation invariant, that is, λ∗ (E + a) = λ∗ (E) for every E ⊂ IR and every a ∈ IR.

5. λ∗ (aE) = |a|λ∗ (E) for any E ⊂ IR and any a ∈ IR.

Proof. 1. Let ε > 0 be given. Let I1 =]p − ε, p + ε[ and In = ∅ for n > 1. Then {In } is
a countable covering of p by open intervals whose total length is 2ε. It follows that 2ε ∈ X{p}
and so λ∗ ({p}) ≤ 2ε. Since ε was arbitrary, we have λ∗ ({p}) = 0.
2. Let E be countable. We can write E = ∪p∈E {p}. Since λ∗ is an outer measure, $\lambda^*(E) \le \sum_{p \in E} \lambda^*(\{p\}) = 0$.
3. Let I be an interval (possibly unbounded). Set I1 = I and In = ∅ for n > 1. Then {In } is
a countable covering of I by intervals whose total length is ℓ(I). Therefore ℓ(I) ∈ ZI and so
λ∗ (I) = inf ZI ≤ ℓ(I).
Conversely, let {In } be a countable covering of I by open intervals. We already proved in
the exercises that $\ell(I) \le \sum_{n=1}^{\infty} \ell(I_n)$. Therefore ℓ(I) is a lower bound for YI and consequently,
ℓ(I) ≤ inf YI = λ∗ (I).
Remark. The proof that $\ell(I) \le \sum_{n=1}^{\infty} \ell(I_n)$ used the compactness of the interval [a, b]. It
seems that the compactness of [a, b] (in terms of open coverings) was discovered by Borel and
Lebesgue in their proof that λ∗ ([a, b]) = b − a.
4. Let E ⊂ IR and let a ∈ IR. We need to show that λ∗ (E + a) = λ∗ (E). The claim is indeed
true if E is an interval (use point 3. above). Let now E be an arbitrary subset of IR. If {In } is
a countable covering of E by open intervals, then {In + a} is a countable covering of E + a by
open intervals. Hence

\[ \lambda^*(E + a) \le \lambda^*\Big(\bigcup_{n=1}^{\infty} (I_n + a)\Big) \le \sum_{n=1}^{\infty} \lambda^*(I_n + a) = \sum_{n=1}^{\infty} \ell(I_n + a) = \sum_{n=1}^{\infty} \ell(I_n). \]

Therefore λ∗ (E + a) is a lower bound of XE and therefore λ∗ (E + a) ≤ λ∗ (E). Since a is


arbitrary, we also have λ∗ (E − a) ≤ λ∗ (E) (replace a by −a).

On the other hand, E = (E+a)−a and according to what we said λ∗ ((E+a)−a) ≤ λ∗ (E+a).
Therefore λ∗ (E) ≤ λ∗ (E + a). Hence the equality.
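5. If a = 0, the result is trivial. If not,
\[
\begin{aligned}
\lambda^*(aE) &= \inf\Big\{ \sum_n \ell(I_n) \ \Big|\ aE \subset \bigcup_n I_n \Big\} = \inf\Big\{ \sum_n \ell(I_n) \ \Big|\ E \subset \bigcup_n \tfrac{1}{a} I_n \Big\} \\
&= \inf\Big\{ \sum_n |a|\, \ell\big(\tfrac{1}{a} I_n\big) \ \Big|\ E \subset \bigcup_n \tfrac{1}{a} I_n \Big\} \\
&= |a| \inf\Big\{ \sum_n \ell\big(\tfrac{1}{a} I_n\big) \ \Big|\ E \subset \bigcup_n \tfrac{1}{a} I_n \Big\} = |a| \lambda^*(E).
\end{aligned}
\]
□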

The Lebesgue σ−algebra L


Since λ∗ is an outer measure on IR, it follows from the first part of Caratheodory’s theorem
that the set of λ∗ −measurable sets is a σ−algebra on IR. We call it the Lebesgue σ−algebra
and we denote it by L. Therefore E ∈ L if and only if

λ∗ (A) = λ∗ (A ∩ E) + λ∗ (A\E)

for every subset A ⊂ IR. An element in L is called a Lebesgue-measurable set. It turns out
that L is a big set. It contains the Borel σ−algebra of IR, but it is much bigger as we shall see.
Recall that the Borel σ−algebra B(IR) is the smallest σ−algebra containing the open subsets of
IR. It is also the smallest σ−algebra containing the closed subsets of IR. An element of B(IR)
is called Borel-measurable. We also denote the Borel σ− algebra on IR by B.

Proposition 3.3 B ⊂ L, that is, every Borel subset of the real line is Lebesgue-measurable.
Proof. Recall that B is generated by the family of intervals of the form ]a, ∞[. Therefore it is
enough to prove that such intervals belong to L. Let A be an arbitrary subset of IR and let a ∈ IR.
Set A1 = A∩]a, ∞[ and A2 = A∩] − ∞, a]. We need to show that λ∗ (A1 ) + λ∗ (A2 ) ≤ λ∗ (A).
The inequality is satisfied if λ∗ (A) = +∞, therefore we assume that λ∗ (A) < +∞.
Let ε > 0 be given. Then, by a fundamental property of the infimum, there exists a sequence
{In } of open intervals such that $A \subset \bigcup_{n=1}^{\infty} I_n$ and $\sum_{n=1}^{\infty} \ell(I_n) < \lambda^*(A) + \varepsilon$. Set $I_n' = I_n \cap\, ]a, \infty[$
and $I_n'' = I_n \cap\, ]-\infty, a]$. Then $I_n'$ and $I_n''$ are disjoint intervals (possibly empty) such that
$I_n = I_n' \cup I_n''$. Therefore,
\[ \ell(I_n) = \ell(I_n') + \ell(I_n''). \]
Now, since $A_1 \subset \bigcup_{n=1}^{\infty} I_n'$, we have
\[ \lambda^*(A_1) \le \sum_{n=1}^{\infty} \ell(I_n'). \]

Similarly,
\[ \lambda^*(A_2) \le \sum_{n=1}^{\infty} \ell(I_n''). \]
Therefore,
\[ \lambda^*(A_1) + \lambda^*(A_2) \le \sum_{n=1}^{\infty} \big(\ell(I_n') + \ell(I_n'')\big) = \sum_{n=1}^{\infty} \ell(I_n) < \lambda^*(A) + \varepsilon. \]
Since ε was arbitrary, we conclude that

λ∗ (A1 ) + λ∗ (A2 ) ≤ λ∗ (A). □



Corollary 3.1 Countable sets, intervals, open sets and closed sets are Lebesgue measurable.
In practice, all the real sets that we deal with are Lebesgue measurable. Constructing a non-
measurable set is not a trivial matter. See below.

Remark 3.1 You could ask if the inclusion B ⊂ L is strict. It is. There are two ways to see
this. First, we can construct explicitly a Lebesgue measurable set which is not Borel measurable.
But this is not trivial. The known examples use the middle third Cantor set that we introduce
in the next section. The second way is to show that there is no bijection from B to L. This
result uses the theory of cardinals that you probably do not know. In fact it can be shown
that B is in bijection with IR whereas L is in bijection with P(IR). Proving this is also a
nontrivial task.

Proposition 3.4 If E ∈ L and a ∈ IR, then E + a ∈ L and aE ∈ L.


Proof. Let E ∈ L. We first show that E + a ∈ L, that is, we show that

λ∗ (A) = λ∗ (A ∩ (E + a)) + λ∗ (A ∩ (E + a)c )

for every A ⊂ IR.


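Let δ = λ∗ (A ∩ (E + a)) + λ∗ (A ∩ (E + a)c ). Using the results of a previous exercise, we get
\[
\begin{aligned}
\delta &= \lambda^*\big(((A - a) \cap E) + a\big) + \lambda^*\big(A \cap (E^c + a)\big) \\
&= \lambda^*\big(((A - a) \cap E) + a\big) + \lambda^*\big(((A - a) \cap E^c) + a\big) \\
&= \lambda^*\big((A - a) \cap E\big) + \lambda^*\big((A - a) \cap E^c\big) && \text{by the translation invariance of } \lambda^* \\
&= \lambda^*(A - a) && \text{since } E \in L \\
&= \lambda^*(A) && \text{by the translation invariance of } \lambda^*.
\end{aligned}
\]

Next, we show that aE ∈ L. If a = 0, then aE ⊂ {0} is countable, hence measurable. Let now a ≠ 0 and A ⊂ IR, then
\[
\begin{aligned}
\lambda^*(A \cap aE) + \lambda^*(A \cap (aE)^c) &= \lambda^*\big[a\big(\tfrac{1}{a} A \cap E\big)\big] + \lambda^*\big[a\big(\tfrac{1}{a} A \cap E^c\big)\big] \\
&= |a| \lambda^*\big[\tfrac{1}{a} A \cap E\big] + |a| \lambda^*\big[\tfrac{1}{a} A \cap E^c\big] \\
&= |a| \lambda^*\big(\tfrac{1}{a} A\big) = \lambda^*(A).
\end{aligned}
\]
□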

Recall that f : (X, A) → (Y, B) is called (A, B)−measurable if f −1 (B) ⊂ A. Since on IR,
there are two main σ−algebras: the Lebesgue σ−algebra L and the Borel σ−algebra B, we
distinguish between two types of measurable functions. Recall that a function f : IR → IR is
called Borel-measurable if it is (B, B(IR))−measurable, that is, if the inverse image under f of
every Borel set is a Borel set. On the other hand we have the following definition.

Definition 3.2 A function f : IR → IR is called Lebesgue-measurable (sometimes just measur-


able) if it is (L, B(IR))−measurable, that is, if the inverse image under f of every Borel set is
Lebesgue measurable.

Remark 3.2 Every Borel-measurable function is Lebesgue-measurable. The converse is not


true. Why?

In general, if f : (X, A) → (Y, B) is (A, B)−measurable and g : (Y, B) → (Z, C) is


(B, C)−measurable, then g ◦ f is (A, C)−measurable. In particular let f : IR → IR and
g : IR → IR, then

1. If f and g are Borel-measurable then g ◦ f is Borel-measurable.

2. If f is Lebesgue-measurable and g is Borel-measurable, then g ◦ f is Lebesgue-measurable.

3. Warning. If f is Borel-measurable and g is Lebesgue-measurable, then g ◦ f is not nec-


essarily Lebesgue-measurable. In particular, the composition of two Lebesgue-measurable
functions need not be Lebesgue-measurable.

The Lebesgue measure λ


The second part of Caratheodory’s theorem ensures that λ∗ restricted to L is a measure. We
call it the Lebesgue measure on IR and we denote it by λ. Therefore (IR, L, λ) is a measure space.
In addition to the properties satisfied by any measure, the Lebesgue measure satisfies some
special properties.

Proposition 3.5 The Lebesgue measure λ satisfies the following properties.

1. The measure of a countable set is zero.

2. The measure of an interval is equal to its length.

3. λ is invariant under translations and symmetries.

4. The measure space (IR, L, λ) is complete.

5. λ is σ−finite.

Proof. The first three points follow from the same properties of λ∗ after recalling that
countable sets and intervals are measurable and that L is invariant under translations and
symmetries. Point 4. follows from Caratheodory’s theorem. Point 5. follows from the fact that
$\mathbb{R} = \bigcup_{n=1}^{\infty} [-n, n]$ and λ([−n, n]) = 2n < ∞. □

Here is another important relation between the Lebesgue σ−algebra and the Borel σ−algebra
on IR.

Proposition 3.6 The Lebesgue σ−algebra is the completion of the Borel σ−algebra. This
means that a set A is Lebesgue measurable if and only if there exist a Borel set B and a negligible
set N such that A = B ∪ N .

Proof. See the exercises.

Remark 3.3 It follows from Theorem 2.6 that a set N ⊂ IR is negligible if and only if it is
contained in a Borel set of measure 0.

Corollary 3.2 Let f : IR → IR be Lebesgue measurable. Then there exists a Borel measurable
function g : IR → IR that coincides with f almost everywhere.

Proof. This follows from the previous proposition and Proposition 2.6.
We end this section with a characterization of the sets of measure zero. The proof is
straightforward if you recall the definition of the measure in terms of covering by open intervals.
We can think of a set of measure zero as a "thin" set.

Proposition 3.7 A set E ⊂ IR has measure zero if and only if for every ε > 0 there exists a
countable covering of E by open intervals whose total length is less than ε.
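As an illustration of Proposition 3.7, the sketch below covers the first 50 rationals of [0, 1] (in one possible enumeration, chosen here for convenience) by open intervals, the n-th one having length ε/2^n. The enumeration, the value of ε and the function names are assumptions of this example. The same recipe applied to the whole enumeration gives a covering of Q ∩ [0, 1] of total length at most ε.

```python
from fractions import Fraction
from itertools import islice


def rationals_in_unit_interval():
    """Enumerate the rationals of [0, 1] without repetition,
    by increasing denominator."""
    seen = set()
    q = 1
    while True:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                yield r
        q += 1


def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by the open interval of
    length eps / 2^n centered at that point."""
    return [(p - eps / 2 ** (n + 1), p + eps / 2 ** (n + 1))
            for n, p in enumerate(points, start=1)]


eps = Fraction(1, 100)
points = list(islice(rationals_in_unit_interval(), 50))
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
```

For these 50 points the total length is ε(1 − 2^(−50)), which is strictly less than ε.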

3.2 Counterexamples
3.2.1 A Lebesgue non measurable set
This example is due to Vitali.
Construction.
1. Define on ]0,1[ the equivalence relation: x ∼ y if x − y is rational. This defines a partition
of ]0,1[ into equivalence classes. The class containing x is [x] = {x + q | q ∈ Q} ∩ ]0, 1[ (only rationals q ∈ ] − 1, 1[ can occur).

2. By the axiom of choice, there exists a set P that contains exactly one element from each
of the equivalence classes. It is clear that P ⊂]0, 1[.

Proposition 3.8 The set P defined above is not Lebesgue-measurable.


Proof.
1. Let {rn } be a sequence that counts the rational numbers of ]-1,1[. We assume that there
is no redundancy in the sequence, that is rn ≠ rm if n ≠ m. Set Pn = P + rn . We claim
that
\[ ]0, 1[ \ \subset \bigcup_{n=1}^{\infty} P_n \subset\ ]-1, 2[. \tag{3.2} \]
Indeed, first note that Pn = P + rn ⊂]0, 1[+] − 1, 1[=] − 1, 2[. Therefore, $\bigcup_{n=1}^{\infty} P_n \subset\ ]-1, 2[$.
Next, let y ∈]0, 1[. There is exactly one element x ∈ [y] such that x ∈ P . Then x ∼ y
and so q := y − x is rational. Note that q ∈] − 1, 1[ since x, y ∈]0, 1[. Therefore q = rj for
some j ∈ IN∗ . Conclusion: y = x + rj ∈ P + rj = Pj .

2. The Pn are pairwise disjoint. Otherwise there are two integers n ̸= m such that Pn ∩Pm ̸=
∅. Let t ∈ Pn ∩Pm . Then there are two numbers x and z in P such that t = x+rn = z+rm .
It follows that x − z = rm − rn is rational and so x ∼ z. But this contradicts the
construction of P (P contains only one element from each equivalence class).

3. Suppose that P is measurable. Then each Pn is measurable and λ(Pn ) = λ(P ) by the
translation invariance of the Lebesgue measure. It follows from the σ−additivity of λ and
the previous step that
\[ \lambda\Big(\bigcup_{n=1}^{\infty} P_n\Big) = \sum_{n=1}^{\infty} \lambda(P_n) = \sum_{n=1}^{\infty} \lambda(P). \]

It follows from relation (3.2) and the monotonicity of λ that $1 \le \sum_{n=1}^{\infty} \lambda(P) \le 3$. But this
is impossible because $\sum_{n=1}^{\infty} \lambda(P)$ is either 0 (if λ(P ) = 0) or +∞ (if λ(P ) > 0). Therefore
we have to conclude that P is not measurable. □

3.2.2 The middle third Cantor set


We know that every countable set has Lebesgue measure 0. Is the converse true? The following
example shows that this is not the case, that is, there is an uncountable set of measure 0. I believe
that every educated mathematician should know about this beautiful set which is the source of
many examples and counterexamples in Analysis. It is the simplest example of a fractal set.
Construction. Start with the interval [0,1].
Step 1: Remove the open middle third of [0,1] to obtain
\[ K_1 = \Big[0, \tfrac{1}{3}\Big] \cup \Big[\tfrac{2}{3}, 1\Big]. \]

Step 2: Remove the open middle third of each of the intervals of K1 , we get
\[ K_2 = \Big[0, \tfrac{1}{9}\Big] \cup \Big[\tfrac{2}{9}, \tfrac{1}{3}\Big] \cup \Big[\tfrac{2}{3}, \tfrac{7}{9}\Big] \cup \Big[\tfrac{8}{9}, 1\Big]. \]
\[ \vdots \]

Step n: Remove the open middle third of each of the intervals of Kn−1 , to get
\[ K_n = \bigcup_{i \in J_n} \Big[a_i^{(n)}, b_i^{(n)}\Big]. \]

Note that Kn is a union of $2^n$ pairwise disjoint intervals each of length $\frac{1}{3^n}$.
This defines a decreasing sequence (Kn ) of closed subsets of [0,1]. We set
\[ K = \bigcap_{n=1}^{\infty} K_n. \]

This set is called the middle third Cantor set. It is clear that K is a compact set. It is non
empty because it is the intersection of a decreasing sequence of nonempty closed sets in the
compact space [0,1]. In fact, it should be clear from the construction that the endpoints of the
intervals removed at each step belong to K, so for example $0, 1, \frac{1}{3}, \frac{2}{3}, \frac{1}{9}, \ldots$ belong to K, that is,
the endpoints $a_i^{(n)}$ and $b_i^{(n)}$ belong to K.

Proposition 3.9 The middle third Cantor set satisfies the following properties.

1. Its Lebesgue measure is 0.

2. It has an empty interior.

3. It has no isolated points.

4. It is uncountable.

Proof. 1. We already observed that Kn is a disjoint union of $2^n$ intervals of length $\frac{1}{3^n}$.
Therefore $\lambda(K_n) = \big(\frac{2}{3}\big)^n$ so that λ(Kn ) → 0 as n → ∞. By the continuity property of measures
and since λ(K1 ) < ∞, we get
\[ \lambda(K) = \lambda\Big(\bigcap_{n=1}^{\infty} K_n\Big) = \lim_{n \to \infty} \lambda(K_n) = 0. \]

2. Any set E of Lebesgue measure zero has an empty interior. For otherwise, E would contain
a nonempty open interval I. But then λ(E) ≥ λ(I) > 0.
3. Let x ∈ K. We need to show that any neighborhood of x meets K at a point y ≠ x. Let
therefore ε > 0 be given. Let n be an integer satisfying $\big(\frac{1}{3}\big)^n < \varepsilon$. Since x ∈ Kn , there is some
i ∈ Jn such that $x \in \big[a_i^{(n)}, b_i^{(n)}\big]$, an interval of length $\big(\frac{1}{3}\big)^n$. We already observed
that $a_i^{(n)}$ and $b_i^{(n)}$ belong to K. Now, if $x = a_i^{(n)}$, let $y = b_i^{(n)}$. If $x = b_i^{(n)}$, let $y = a_i^{(n)}$. Finally,
if $a_i^{(n)} < x < b_i^{(n)}$, let $y = a_i^{(n)}$. In all cases, y ∈ K, y ≠ x and $|y - x| \le \big(\frac{1}{3}\big)^n < \varepsilon$. Since ε was
arbitrary, this means that any neighborhood of x meets K at a point different from x.
4. A theorem of topology states that a compact Hausdorff space with no isolated points is
uncountable. □
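The first steps of the construction are easy to reproduce on a computer. The following Python sketch, whose function names are illustrative assumptions, builds the list of closed intervals of Kn with exact rational endpoints and lets one check the count 2^n, the common length 1/3^n and the total length (2/3)^n used in the proof of point 1.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Remove the open middle third of each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_level(n):
    """Return the intervals of K_n as a sorted list of pairs (a, b)."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals


K2 = cantor_level(2)
K5 = cantor_level(5)
measure_K5 = sum(b - a for a, b in K5)
```

For instance `K2` reproduces the four intervals of Step 2, and `measure_K5` equals (2/3)^5 = 32/243.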

Remark 3.4 Let I denote the collection of open intervals of IR and let J denote the collection
of all intervals of IR. Then we have

I ⊂ J ⊂ B ⊂ L ⊂ P(IR).

The first inclusion is clear. The second inclusion follows from Proposition 2.2. The third
inclusion follows from Proposition 3.3. The last inclusion is clear. In fact all the inclusions
are strict. This is clear for the first two. That the third inclusion is strict was pointed out in
Remark 3.1. The last inclusion is strict because of the existence of a Lebesgue non measurable
set.
Chapter 4

The Lebesgue integral

4.1 Construction and properties


We shall define the Lebesgue integral successively for

1. nonnegative simple functions,

2. nonnegative measurable functions,

3. a large class of measurable functions called summable functions,

4. summable complex valued functions.

Remark 4.1 The Riemann integral is defined by approximation from the integrals of step
functions, whereas the Lebesgue integral is constructed from the integral of simple functions.
In the first case, we divide the domain of the function (that is the interval on which it is defined)
into small parts. In the second case, we divide the range of the function. This is a fundamental
difference.

4.1.1 Simple functions


Lemma 4.1 Let (X, A, µ) be a measure space and f : X → IR be a nonnegative simple
function. If $\sum_{i=1}^{n} b_i 1_{B_i}$ and $\sum_{j=1}^{m} c_j 1_{C_j}$ are two admissible representations of f , then
\[ \sum_{i=1}^{n} b_i \mu(B_i) = \sum_{j=1}^{m} c_j \mu(C_j). \]

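Proof. Note that by assumption, {Cj }j=1,...,m form a partition of X. Therefore, $B_i = \bigcup_{j=1}^{m} (B_i \cap C_j)$ (disjoint union) and so $1_{B_i} = \sum_{j=1}^{m} 1_{B_i \cap C_j}$. Thus
\[ \sum_{i=1}^{n} b_i 1_{B_i} = \sum_{i,j} b_i 1_{B_i \cap C_j}. \]

Similarly,
\[ \sum_{j=1}^{m} c_j 1_{C_j} = \sum_{j,i} c_j 1_{C_j \cap B_i}. \]

Therefore,
\[ \sum_{i,j} b_i 1_{B_i \cap C_j} = \sum_{j,i} c_j 1_{C_j \cap B_i}. \]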


Thus, bi = cj for all i, j such that Bi ∩ Cj ≠ ∅. Let Λ be the set of indices (i, j) for which
Bi ∩ Cj ≠ ∅. Now
\[ \sum_{i=1}^{n} b_i \mu(B_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} b_i \mu(B_i \cap C_j) = \sum_{(i,j) \in \Lambda} b_i \mu(B_i \cap C_j). \]

Similarly,
\[ \sum_{j=1}^{m} c_j \mu(C_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} c_j \mu(B_i \cap C_j) = \sum_{(i,j) \in \Lambda} c_j \mu(B_i \cap C_j). \]

Hence the equality. □
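The argument of the proof can be followed on a small example. In the sketch below, the space X = {0, . . . , 5}, the point weights defining µ, and the two representations of the same simple function are all hypothetical choices made for illustration. The code computes the sum Σ bi µ(Bi ) for each representation and for their common refinement by the sets Bi ∩ Cj .

```python
# Hypothetical finite measure space: mu(S) is the sum of the weights of S.
weights = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6}


def mu(subset):
    """Measure of a subset of X = {0, ..., 5}."""
    return sum(weights[x] for x in subset)


def weighted_sum(representation):
    """Sum of coef * mu(block) over a representation [(coef, block), ...]."""
    return sum(coef * mu(block) for coef, block in representation)


def refine(rep_b, rep_c):
    """Common refinement: keep the coefficient b_i on each nonempty B_i & C_j."""
    return [(b, B & C) for b, B in rep_b for _, C in rep_c if B & C]


# Two admissible representations of the same function
# f = 2 on {0, 1, 2} and f = 5 on {3, 4, 5}.
rep_b = [(2, frozenset({0, 1, 2})), (5, frozenset({3, 4, 5}))]
rep_c = [(2, frozenset({0})), (2, frozenset({1, 2})),
         (5, frozenset({3})), (5, frozenset({4, 5}))]

value_b = weighted_sum(rep_b)
value_c = weighted_sum(rep_c)
value_refined = weighted_sum(refine(rep_b, rep_c))
```

All three values coincide (they equal 87 with these weights), as the lemma predicts.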


Thanks to this lemma we can now give the following definition.

Definition 4.1 Let (X, A, µ) be a measure space, E ∈ A and f : X → IR be a nonnegative
simple function. The Lebesgue integral of f on E, denoted by $\int_E f \, d\mu$ or just $\int_E f$, is
\[ \int_E f \, d\mu = \sum_{i=1}^{n} a_i \mu(A_i \cap E), \]
where $\sum_{i=1}^{n} a_i 1_{A_i}$ is any admissible representation of f .

Note that the integral is a value in [0, ∞].

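Example 4.1 Let (X, A, µ) = (IR, L, λ). Consider f : IR → IR defined by
\[ f(x) = \begin{cases} 1 & \text{if } x \in [0, 1] \cup [3, 4] \\ 3 & \text{if } x \in\, ]1, 3[ \\ 0 & \text{elsewhere.} \end{cases} \]

Then $f = 1 \times 1_{[0,1] \cup [3,4]} + 3 \times 1_{]1,3[} + 0 \times 1_{\mathbb{R} \setminus [0,4]}$, and
\[ \int_{\mathbb{R}} f \, d\lambda = 1 \times \lambda([0, 1] \cup [3, 4]) + 3 \times \lambda(]1, 3[) + 0 \times \lambda(\mathbb{R} \setminus [0, 4]) = 1 \times 2 + 3 \times 2 + 0 = 8, \]

where we have used the convention 0 × ∞ = 0.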
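The computation of Example 4.1 can be mirrored in a few lines of Python. This is only a sketch under simplifying assumptions: each set Ai is given as a finite disjoint union of intervals (lo, hi), its Lebesgue measure is the sum of the lengths, and the helper `times` (a name introduced here) implements the convention 0 × ∞ = 0, which plain floating point arithmetic does not follow since `0 * math.inf` is `nan`.

```python
import math


def lebesgue_measure(intervals):
    """Measure of a finite disjoint union of intervals (lo, hi);
    endpoints do not matter since single points have measure 0."""
    return sum(hi - lo for lo, hi in intervals)


def times(a, m):
    """Product a * m with the convention 0 * infinity = 0."""
    return 0.0 if a == 0 or m == 0 else a * m


def integral_simple(representation):
    """Sum of a_i * lambda(A_i) over a representation [(a_i, A_i), ...]."""
    return sum(times(a, lebesgue_measure(A)) for a, A in representation)


# The function f of Example 4.1.
f = [
    (1, [(0, 1), (3, 4)]),
    (3, [(1, 3)]),
    (0, [(-math.inf, 0), (4, math.inf)]),
]
value = integral_simple(f)
```

With these data `value` is 8, in agreement with the example.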


Example 4.2 Let A ⊂ IR be a Lebesgue measurable set. Then $\int_{\mathbb{R}} 1_A \, d\lambda = \lambda(A)$. More
generally, if (X, A, µ) is a measure space and A ∈ A, then,
\[ \int_X 1_A \, d\mu = \mu(A). \]

Example 4.3 Consider the measure space (X, P(X), δa ) where δa is the Dirac measure at
a ∈ X. Let f : X → [0, ∞] be a simple function. Then
\[ \int_X f \, d\delta_a = f(a). \]
Indeed, let $f = \sum_i \alpha_i 1_{A_i}$ be a simple nonnegative function (this implies that {Ai }i is a
partition of X). Then, $\int_X f \, d\delta_a = \sum_i \alpha_i \delta_a(A_i) = \alpha_k$ where k is the index of the unique set Ak
to which a belongs. But if a ∈ Ak then $f(a) = \sum_i \alpha_i 1_{A_i}(a) = \sum_i \alpha_i \delta_a(A_i) = \alpha_k$. Hence the equality.

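Example 4.4 Consider the measure space (IN, P(IN), µ) where µ is the counting measure. Let
f : IN → [0, ∞] be a simple function. Then
\[ \int_{\mathbb{N}} f \, d\mu = \sum_{n \in \mathbb{N}} f(n). \]
Indeed, note first that for E ⊂ IN, $\mu(E) = \sum_{n \in E} 1 = \sum_{n \in \mathbb{N}} 1_E(n)$. Next, let $f = \sum_{i=1}^{m} \alpha_i 1_{A_i}$ be a simple nonnegative function. Then,
\[ \int_{\mathbb{N}} f \, d\mu = \sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{i=1}^{m} \alpha_i \sum_{n \in \mathbb{N}} 1_{A_i}(n) = \sum_{i=1}^{m} \sum_{n \in \mathbb{N}} \alpha_i 1_{A_i}(n) = \sum_{n \in \mathbb{N}} \sum_{i=1}^{m} \alpha_i 1_{A_i}(n) = \sum_{n \in \mathbb{N}} f(n). \]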

Lemma 4.2 Let $(X, \mathcal{A}, \mu)$ be a measure space, $E, F \in \mathcal{A}$ and let $f : X \to [0, \infty]$ be simple. Then the following hold.
(i) $\int_E 0\,d\mu = 0$.
(ii) $E \subset F \Rightarrow \int_E f\,d\mu \le \int_F f\,d\mu$.
(iii) $\mu(E) = 0 \Rightarrow \int_E f\,d\mu = 0$.

Proof. (i) is trivial. (ii) Let $f = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$. Then $\int_E f\,d\mu = \sum_{i=1}^n a_i \mu(A_i \cap E)$ and $\int_F f\,d\mu = \sum_{i=1}^n a_i \mu(A_i \cap F)$. Since $\mu(A_i \cap E) \le \mu(A_i \cap F)$ and $a_i \ge 0$, the result follows. (iii) follows from the definition and the fact that $\mu(A_i \cap E) = 0$. □

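Lemma 4.3 Let $f$ be a simple nonnegative function defined on a measure space $(X, \mathcal{A}, \mu)$. Then $A \mapsto \int_A f\,d\mu$ is a measure on $(X, \mathcal{A})$.

Proof. Set $\delta(A) = \int_A f\,d\mu$ with $f = \sum_{i=1}^n a_i \mathbf{1}_{A_i}$. (i) It should be clear from the definition and the convention $0 \times \infty = 0$ that $\delta(\emptyset) = 0$. (ii) Let $(E_k)$ be a sequence of pairwise disjoint measurable sets. Then
\[ \int_{\bigcup_{k=1}^\infty E_k} f\,d\mu = \sum_{i=1}^n a_i\, \mu\Big(\bigcup_{k=1}^\infty (A_i \cap E_k)\Big) = \sum_{i=1}^n a_i \sum_{k=1}^\infty \mu(A_i \cap E_k) = \sum_{k=1}^\infty \sum_{i=1}^n a_i \mu(A_i \cap E_k) = \sum_{k=1}^\infty \int_{E_k} f\,d\mu. \]
□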

Proposition 4.1 The Lebesgue integral satisfies the following properties ($f$ and $g$ are two nonnegative simple functions).
(i) Additivity: $\int_E (f + g)\,d\mu = \int_E f\,d\mu + \int_E g\,d\mu$.
(ii) Positive homogeneity: $\int_E (\alpha f)\,d\mu = \alpha \int_E f\,d\mu$ for any constant $\alpha \ge 0$.
(iii) Monotonicity: $f \le g$ on $E$ $\Rightarrow$ $\int_E f\,d\mu \le \int_E g\,d\mu$.
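Proof. Let $f = \sum_{i\in I} a_i \mathbf{1}_{A_i}$ and $g = \sum_{j\in J} b_j \mathbf{1}_{B_j}$ be admissible representations of $f$ and $g$.
(i) Note that $(A_i \cap B_j)_{(i,j)\in I\times J}$ form a finite partition of $X$. It is also easy to see that
\[ f + g = \sum_{(i,j)\in I\times J} (a_i + b_j) \mathbf{1}_{A_i \cap B_j} \]
is an admissible representation of $f + g$. Then
\begin{align*}
\int_E (f + g)\,d\mu &= \sum_{(i,j)\in I\times J} (a_i + b_j)\mu(A_i \cap B_j \cap E) = \sum_{(i,j)\in I\times J} a_i \mu(A_i \cap B_j \cap E) + \sum_{(i,j)\in I\times J} b_j \mu(A_i \cap B_j \cap E) \\
&= \sum_{i\in I} a_i \sum_{j\in J} \mu(A_i \cap B_j \cap E) + \sum_{j\in J} b_j \sum_{i\in I} \mu(A_i \cap B_j \cap E) \\
&= \sum_{i\in I} a_i \mu\Big(A_i \cap E \cap \bigcup_{j\in J} B_j\Big) + \sum_{j\in J} b_j \mu\Big(B_j \cap E \cap \bigcup_{i\in I} A_i\Big) \\
&= \sum_{i\in I} a_i \mu(A_i \cap E) + \sum_{j\in J} b_j \mu(B_j \cap E) \\
&= \int_E f\,d\mu + \int_E g\,d\mu.
\end{align*}
(ii) For every $\alpha \ge 0$, $\alpha f = \sum_{i\in I} \alpha a_i \mathbf{1}_{A_i}$, hence
\[ \int_E (\alpha f)\,d\mu = \sum_{i\in I} \alpha a_i \mu(E \cap A_i) = \alpha \sum_{i\in I} a_i \mu(E \cap A_i) = \alpha \int_E f\,d\mu. \]
(iii) Write $g = (g - f) + f$ where $g - f$ is nonnegative and simple. It follows from (i) that
\[ \int_E g\,d\mu = \int_E (g - f)\,d\mu + \int_E f\,d\mu \ge \int_E f\,d\mu \]
since the integral of a nonnegative function is nonnegative. □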

4.1.2 Nonnegative measurable functions


Let (X, A, µ) be a measure space. The set of nonnegative measurable functions defined on X is
denoted by M+ (X, A, µ) or just M+ (X), i.e.,

M+ (X) = {f : X → [0, ∞] | f is measurable}.

Definition 4.2 Let $E \subset X$ be a measurable set and $f \in M_+(X)$. The Lebesgue integral of $f$ on $E$ (with respect to the measure $\mu$) is
\[ \int_E f\,d\mu = \sup\Big\{ \int_E h\,d\mu \;\Big|\; h : X \to [0, \infty] \text{ is simple and } 0 \le h \le f \Big\}. \]

We first need to justify that this definition of the integral extends the previous one, i.e., that both definitions coincide when $f$ is simple. Indeed, let $f$ be simple and let us denote its integral according to the first definition by $S_E(f)$. Also let
\[ I_f = \{ S_E(h) \mid h : X \to [0, \infty] \text{ is simple and } 0 \le h \le f \}. \]
Then of course $S_E(f) \in I_f$ and therefore $S_E(f) \le \sup I_f = \int_E f\,d\mu$. On the other hand, for any simple function $h$ satisfying $0 \le h \le f$, we have $S_E(h) \le S_E(f)$ by the monotonicity of the integral of simple functions (Proposition 4.1 (iii)). It follows that $\int_E f\,d\mu = \sup I_f \le S_E(f)$. Hence the equality.

Proposition 4.2 For every $f, g \in M_+(X)$, the following hold.
(i) $f \le g \Rightarrow \int_E f \le \int_E g$.
(ii) $E \subset F \Rightarrow \int_E f \le \int_F f$.
(iii) $\int_E (\alpha f) = \alpha \int_E f$ for every constant $\alpha \ge 0$.

Proof. (i) Let $f \le g$. If $h$ is a simple function such that $0 \le h \le f$ then $0 \le h \le g$. Therefore $\sup_{0\le h\le f} \int_E h \le \sup_{0\le h\le g} \int_E h$. Hence the result.
(ii) follows similarly by appealing to Lemma 4.2 (ii).
(iii) follows from the same property for nonnegative simple functions:
\[ \int_E (\alpha f) = \sup_{0\le h\le f} \int_E (\alpha h) = \sup_{0\le h\le f} \alpha \int_E h = \alpha \sup_{0\le h\le f} \int_E h = \alpha \int_E f. \]
□


Proposition 4.3 Let $f \in M_+(X, \mathcal{A}, \mu)$ and $E \in \mathcal{A}$. Then $\int_E f\,d\mu = 0$ if and only if $f = 0$ $\mu$-almost everywhere on $E$.

Proof. Suppose first that $f = 0$ on $E\setminus A$ where $A \subset E$ has measure zero. Then every simple function $h$ such that $0 \le h \le f$ is zero on $E\setminus A$. Therefore, by Lemma 4.3 and Lemma 4.2 (iii),
\[ \int_E h\,d\mu = \int_{E\setminus A} h\,d\mu + \int_A h\,d\mu = 0, \]
and so $\int_E f = \sup_{0\le h\le f} \int_E h = 0$. Conversely, suppose that $\int_E f\,d\mu = 0$. Let $A = \{x \in E \mid f(x) > 0\}$ and $A_n = \{x \in E \mid f(x) > \frac{1}{n}\}$ so that $A = \bigcup_{n=1}^\infty A_n$. The functions $h_n := \frac{1}{n}\mathbf{1}_{A_n}$ are simple functions such that $0 \le h_n \le f$. Consequently,
\[ \frac{1}{n}\mu(A_n) = \int_E h_n\,d\mu \le \int_E f\,d\mu = 0, \]
which implies that $\mu(A_n) = 0$ for all $n$. Therefore, $\mu(A) \le \sum_{n=1}^\infty \mu(A_n) = 0$. □

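Lemma 4.4 (Markov's inequality) For every $f \in M_+(X)$, every measurable set $E$ and every constant $a \ge 0$, we have
\[ \int_E f\,d\mu \ge a\,\mu(E \cap \{f \ge a\}). \]

Proof. Let $A = \{x \in E \mid f(x) \ge a\}$. Then $h := a\mathbf{1}_A$ is a simple function such that $0 \le h \le f$. Consequently,
\[ \int_E f\,d\mu \ge \int_E h\,d\mu = a\mu(A) = a\,\mu(E \cap \{f \ge a\}). \]
□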
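Markov's inequality is easy to test on a finite set equipped with the counting measure, where the integral is a finite sum. The sketch below is an illustration only; the function $f(x) = x^2 \bmod 7$, the set $E = \{1, \dots, 20\}$ and the helper names are arbitrary choices, not part of the notes.

```python
def integral_counting(f, E):
    """Integral of f over the finite set E with respect to the counting measure."""
    return sum(f(x) for x in E)

def markov_rhs(f, E, a):
    """Right-hand side a * mu(E ∩ {f >= a}) for the counting measure mu."""
    return a * sum(1 for x in E if f(x) >= a)

def f(x):
    return (x * x) % 7

E = range(1, 21)
integral = integral_counting(f, E)

# Compare the integral with a * mu({f >= a}) for several thresholds a.
checks = {a: (integral, markov_rhs(f, E, a)) for a in (0, 1, 2, 4, 5)}
for a, (lhs, rhs) in checks.items():
    print(f"a = {a}: integral = {lhs} >= {rhs}")
```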

Corollary 4.1 Let $f \in M_+(X)$ satisfy $\int_X f\,d\mu < \infty$. Then $\mu(\{f = \infty\}) = 0$.

Proof. Let $A_n = \{x \mid f(x) \ge n\}$. Then $\{f = \infty\} = \bigcap_n A_n$. By Markov's inequality,
\[ \int_X f\,d\mu \ge n\mu(A_n). \]
It follows that $\mu(A_n) \le \frac{1}{n}\int_X f\,d\mu \to 0$. But $(A_n)$ is decreasing and $\mu(A_1) < \infty$, therefore by the continuity property of a measure, $\mu(\{f = \infty\}) = \mu(\bigcap_n A_n) = \lim \mu(A_n) = 0$. □

Theorem 4.1 (Monotone convergence theorem or Beppo Levi theorem) Let (X, A, µ)
be a measure space and let (fn ) be a sequence of M+ (X) satisfying

1. fn (x) ≤ fn+1 (x) for all n and all x ∈ X (that is, the sequence (fn ) is nondecreasing).

2. fn (x) → f (x) for every x ∈ X.



Then,
\[ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]
Otherwise stated, $\lim_{n\to\infty} \int_X f_n\,d\mu = \int_X \lim_{n\to\infty} f_n\,d\mu$, that is, we can interchange $\lim$ and $\int$.

Proof. Observe first that $f_n(x) \le f(x)$ for all $x \in X$ and $n \in \mathbb{N}^*$. It follows from Proposition 4.2 (i) that $\int_X f_n\,d\mu \le \int_X f\,d\mu$ for all $n$ and therefore
\[ \lim_{n\to\infty} \int_X f_n\,d\mu \le \int_X f\,d\mu \]
(the limit exists in $[0, \infty]$ since the sequence $\{\int_X f_n\,d\mu\}$ is nondecreasing). So it remains to prove the reverse inequality.
Let $\alpha \in\, ]0, 1[$ and let $h$ be a simple function such that $0 \le h \le f$. Set
\[ A_n = \{x \in X \mid f_n(x) \ge \alpha h(x)\}. \]
Then $(A_n)$ is an increasing sequence of sets and we claim that $X = \bigcup_{n=1}^\infty A_n$. Indeed, let $x \in X$. If $h(x) = 0$, then $f_n(x) \ge \alpha h(x) = 0$ and so $x \in A_n$ for all $n$. If not, choose $\varepsilon$ with $0 < \varepsilon < (1 - \alpha)h(x)$. Since $f_n(x) \to f(x)$, there exists $m$ such that $f(x) - f_n(x) < \varepsilon$ for all $n \ge m$. In particular, $f(x) - f_m(x) < (1 - \alpha)h(x)$ and so $f(x) - f_m(x) < f(x) - \alpha h(x)$ since $h(x) \le f(x)$. Therefore $f_m(x) \ge \alpha h(x)$ and so $x \in A_m$.
Now we already know that $A \mapsto \int_A h\,d\mu$ is a measure $\nu$, and as any measure it satisfies $\nu(\bigcup_{n=1}^\infty A_n) = \lim_{n\to\infty} \nu(A_n)$, that is,
\[ \int_X h\,d\mu = \lim_{n\to\infty} \int_{A_n} h\,d\mu. \]
Note also that $\int_{A_n} \alpha h\,d\mu \le \int_{A_n} f_n\,d\mu \le \int_X f_n\,d\mu$. Hence
\[ \alpha \lim_{n\to\infty} \int_{A_n} h\,d\mu \le \lim_{n\to\infty} \int_X f_n\,d\mu \]
and thus
\[ \alpha \int_X h\,d\mu \le \lim_{n\to\infty} \int_X f_n\,d\mu. \]
Letting $\alpha \to 1$, we get
\[ \int_X h\,d\mu \le \lim_{n\to\infty} \int_X f_n\,d\mu. \]
Since this inequality holds for any simple function $h$ such that $0 \le h \le f$, by taking the supremum over such $h$, we finally get
\[ \int_X f\,d\mu \le \lim_{n\to\infty} \int_X f_n\,d\mu. \]
□

Example 4.5 Consider the measure space $(X, \mathcal{P}(X), \delta_a)$ where $\delta_a$ is the Dirac measure at $a \in X$. Let $f : X \to [0, \infty]$ be a nonnegative measurable function. Then
\[ \int_X f\,d\delta_a = f(a). \]
The result is true for simple functions. Let now $f \in M_+(X)$. Then there exists an increasing sequence $(h_n)$ of simple nonnegative functions that converges to $f$. By Beppo Levi's theorem,
\[ \int_X f\,d\delta_a = \lim_{n\to\infty} \int_X h_n\,d\delta_a = \lim_{n\to\infty} h_n(a) = f(a). \]

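Example 4.6 Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ where $\mu$ is the counting measure. Let $f : \mathbb{N} \to [0, \infty]$ be a nonnegative measurable function. Then
\[ \int_{\mathbb{N}} f\,d\mu = \sum_{n\in\mathbb{N}} f(n). \]
The result is true for simple nonnegative functions. Set
\[ s_k(n) = \begin{cases} f(n) & \text{if } n \le k \\ 0 & \text{otherwise.} \end{cases} \]
Then $(s_k)_k$ is an increasing sequence of simple functions that converges to $f$. By Beppo Levi's theorem,
\[ \int_{\mathbb{N}} f\,d\mu = \lim_{k\to\infty} \int_{\mathbb{N}} s_k\,d\mu = \lim_{k\to\infty} \sum_{n\in\mathbb{N}} s_k(n) = \lim_{k\to\infty} \sum_{n=0}^k f(n) = \sum_{n=0}^\infty f(n). \]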
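The truncation argument of Example 4.6 can be followed numerically. This is a minimal sketch for the particular choice $f(n) = 2^{-n}$ (an assumption made only for illustration), for which $\int_{\mathbb{N}} s_k\,d\mu = 2 - 2^{-k}$ increases to $\sum_{n\ge 0} 2^{-n} = 2$. The names `s` and `integral_of_truncation` are hypothetical.

```python
def f(n):
    return 2.0 ** (-n)

def s(k, n):
    """Truncation s_k of f: equal to f(n) for n <= k and to 0 otherwise."""
    return f(n) if n <= k else 0.0

def integral_of_truncation(k):
    """Integral of s_k for the counting measure; only n = 0..k contribute."""
    return sum(s(k, n) for n in range(k + 1))

# The integrals of the truncations form a nondecreasing sequence converging to 2.
integrals = [integral_of_truncation(k) for k in range(41)]
print(integrals[0], integrals[5], integrals[40])
```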

Proposition 4.4 Let $f, g \in M_+(X)$ and let $E \subset X$ be measurable. Then
\[ \int_E (f + g) = \int_E f + \int_E g. \]

Proof. By Theorem 2.4, there exist two nondecreasing sequences $(f_n)$ and $(g_n)$ of simple nonnegative functions that converge respectively to $f$ and $g$. Therefore $(f_n + g_n)$ is a nondecreasing sequence of simple nonnegative functions that converges to $f + g$. By the previous theorem,
\[ \lim_{n\to\infty} \int_E f_n = \int_E f, \quad \lim_{n\to\infty} \int_E g_n = \int_E g, \quad \text{and} \quad \lim_{n\to\infty} \int_E (f_n + g_n) = \int_E (f + g). \]
By the additivity of the integral of simple functions, we have
\[ \int_E (f_n + g_n) = \int_E f_n + \int_E g_n, \]
and thus,
\[ \int_E (f + g) = \lim_{n\to\infty} \Big( \int_E f_n + \int_E g_n \Big) = \lim_{n\to\infty} \int_E f_n + \lim_{n\to\infty} \int_E g_n = \int_E f + \int_E g. \]
□


Proposition 4.5 Let $f, g \in M_+(X)$. If $f = g$ $\mu$-a.e., then $\int_X f\,d\mu = \int_X g\,d\mu$.

Proof. Let
\[ h = \begin{cases} \max(f, g) - \min(f, g) & \text{if } \min(f, g) < \infty \\ 0 & \text{otherwise.} \end{cases} \]
Then $h \in M_+(X)$ and $\max(f, g) = \min(f, g) + h$. Furthermore, $h = 0$ on the set $\{f = g\}$. Therefore, $h = 0$ $\mu$-a.e. and so $\int_X h\,d\mu = 0$. By the previous proposition,
\[ \int_X \max(f, g)\,d\mu = \int_X \min(f, g)\,d\mu. \]
But both $\int_X f\,d\mu$ and $\int_X g\,d\mu$ lie between these equal values by monotonicity of the integral, and therefore they are equal to each other. □

Lemma 4.5 (Additivity of domains) Let $f \in M_+(X)$. If $E_1$ and $E_2$ are disjoint measurable sets then
\[ \int_{E_1 \cup E_2} f = \int_{E_1} f + \int_{E_2} f. \]

Proof. The result follows from the additivity of domains for simple functions and the approximation of functions in $M_+(X)$ by simple functions (reason as above). □

Corollary 4.2 Let $f \in M_+(X)$. If $E$ and $A$ are measurable sets then $\int_E f\mathbf{1}_A = \int_{A\cap E} f$.

Proof. By the previous lemma,
\[ \int_E f\mathbf{1}_A = \int_{E\cap A} f\mathbf{1}_A + \int_{E\setminus A} f\mathbf{1}_A = \int_{E\cap A} f + \int_{E\setminus A} 0 = \int_{E\cap A} f. \]
□

Proposition 4.6 (Integration term by term) Let $(f_n)$ be a sequence of $M_+(X)$. Then
\[ \int_X \Big( \sum_{i=1}^\infty f_i \Big) = \sum_{i=1}^\infty \int_X f_i. \]

Proof. Set $g_n = \sum_{i=1}^n f_i$. Then $(g_n)$ is a nondecreasing sequence of $M_+(X)$ that converges to $\sum_{i=1}^\infty f_i$. By the monotone convergence theorem,
\[ \lim_{n\to\infty} \int_X g_n = \int_X \sum_{i=1}^\infty f_i. \]
On the other hand, it follows by induction from Proposition 4.4 that
\[ \int_X g_n = \int_X \Big( \sum_{i=1}^n f_i \Big) = \sum_{i=1}^n \int_X f_i. \]
Letting $n \to \infty$ we get the result. □


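Corollary 4.3 Let $f \in M_+(X, \mathcal{A}, \mu)$. Then $E \mapsto \int_E f\,d\mu$ is a measure on $(X, \mathcal{A})$.

Proof. (i) It is clear that $\int_\emptyset f\,d\mu = 0$. (ii) Let $(E_n)$ be a sequence of pairwise disjoint elements of $\mathcal{A}$. We proved in the exercises that $\mathbf{1}_{\bigcup_{n=1}^\infty E_n} = \sum_{n=1}^\infty \mathbf{1}_{E_n}$ and therefore $f\mathbf{1}_{\bigcup_{n=1}^\infty E_n} = \sum_{n=1}^\infty f\mathbf{1}_{E_n}$. By Corollary 4.2 and the previous proposition,
\[ \int_{\bigcup_{n=1}^\infty E_n} f\,d\mu = \int_X f\mathbf{1}_{\bigcup_{n=1}^\infty E_n}\,d\mu = \int_X \Big( \sum_{n=1}^\infty f\mathbf{1}_{E_n} \Big)\,d\mu = \sum_{n=1}^\infty \int_X f\mathbf{1}_{E_n}\,d\mu = \sum_{n=1}^\infty \int_{E_n} f\,d\mu. \]
□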

4.1.3 Summable functions


Definition 4.3 Let $(X, \mathcal{A}, \mu)$ be a measure space, $E \in \mathcal{A}$ and $f : E \to \overline{\mathbb{R}}$ be measurable. $f$ is called integrable or summable on $E$ if
\[ \int_E |f|\,d\mu < \infty. \]
We denote by $L^1(E)$ the set of summable functions on $E$. Other notations: $L^1_{\overline{\mathbb{R}}}(E)$, $L^1(\mu)$, $L^1(E, \mu)$. The set of summable functions $f : E \to \mathbb{R}$ is denoted by $L^1_{\mathbb{R}}(E)$. So we have $L^1_{\mathbb{R}}(E) \subset L^1_{\overline{\mathbb{R}}}(E)$. But as we shall see, there is little difference between these sets.

Recall that for a function $f$ we defined $f^+ = \max(f, 0)$ and $f^- = -\min(f, 0)$, and that $f$ is measurable if and only if both $f^+$ and $f^-$ are measurable. Recall also that
\[ f = f^+ - f^- \quad \text{and} \quad |f| = f^+ + f^-. \]
Therefore, $f$ is summable if and only if both $f^+$ and $f^-$ are summable. It is natural to define the integral of a summable function by
\[ \int_E f = \int_E f^+ - \int_E f^-. \]

Example 4.7 Consider the measure space $(X, \mathcal{P}(X), \delta_a)$ where $\delta_a$ is the Dirac measure at $a \in X$. Let $f : X \to \overline{\mathbb{R}}$ be an arbitrary function (it is necessarily measurable). Then $f$ is summable if and only if $|f(a)| < +\infty$, that is, if and only if $f(a) \in \mathbb{R}$. In this case
\[ \int_X f\,d\delta_a = \int_X f^+\,d\delta_a - \int_X f^-\,d\delta_a = f^+(a) - f^-(a) = f(a). \]

Example 4.8 Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ where $\mu$ is the counting measure. Let $f : \mathbb{N} \to \mathbb{R}$ be a function (it is necessarily measurable). Then $f$ is summable if and only if the series $\sum_{n\in\mathbb{N}} |f(n)|$ is convergent, that is, if and only if the series $\sum_{n\in\mathbb{N}} f(n)$ is absolutely convergent. In this case we have
\[ \int_{\mathbb{N}} f\,d\mu = \int_{\mathbb{N}} f^+\,d\mu - \int_{\mathbb{N}} f^-\,d\mu = \sum_{n\in\mathbb{N}} f^+(n) - \sum_{n\in\mathbb{N}} f^-(n) = \sum_{n\in\mathbb{N}} (f^+(n) - f^-(n)) = \sum_{n\in\mathbb{N}} f(n). \]

Proposition 4.7 Let $(X, \mathcal{A}, \mu)$ be a measure space and $E \in \mathcal{A}$. Then the following hold.

1. $L^1(E)$ is a real vector space.

2. The map $f \mapsto \int_E f$ is linear, i.e., $\int_E (\alpha f + \beta g) = \alpha \int_E f + \beta \int_E g$.

Proof. 1. Let $f, g \in L^1(E)$ and $\alpha, \beta \in \mathbb{R}$. Then $|\alpha f + \beta g| \le |\alpha||f| + |\beta||g|$. Consequently,
\[ \int_E |\alpha f + \beta g| \le \int_E \big( |\alpha||f| + |\beta||g| \big) = |\alpha| \int_E |f| + |\beta| \int_E |g| < +\infty, \]
and so $\alpha f + \beta g \in L^1(E)$.
2. We check only that $\int_E (f + g) = \int_E f + \int_E g$. The homogeneity $\int_E \alpha f = \alpha \int_E f$ is left as an exercise (distinguish between the cases $\alpha \ge 0$ and $\alpha < 0$).
From the identity
\[ f + g = (f + g)^+ - (f + g)^- = (f^+ - f^-) + (g^+ - g^-), \]
we deduce that $(f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+$, and therefore
\[ \int_E \big( (f + g)^+ + f^- + g^- \big) = \int_E \big( (f + g)^- + f^+ + g^+ \big). \]
Now all the functions belong to $M_+(E)$ and so by the linearity of the integral for functions in $M_+(E)$ we deduce that
\[ \int_E (f + g)^+ + \int_E f^- + \int_E g^- = \int_E (f + g)^- + \int_E f^+ + \int_E g^+ \]
and so
\[ \int_E (f + g)^+ - \int_E (f + g)^- = \int_E f^+ - \int_E f^- + \int_E g^+ - \int_E g^-, \]
which means that
\[ \int_E (f + g) = \int_E f + \int_E g. \]
□

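Proposition 4.8 Let $f \in L^1(E)$, and let $(E_n)$ be a sequence of pairwise disjoint measurable subsets of $E$. Then
\[ \int_{\bigcup E_n} f = \sum_n \int_{E_n} f. \]

Proof. Using the additivity of domains for nonnegative measurable functions, we have
\[ \int_{\bigcup E_n} f = \int_{\bigcup E_n} f^+ - \int_{\bigcup E_n} f^- = \sum_n \int_{E_n} f^+ - \sum_n \int_{E_n} f^- = \sum_n \Big( \int_{E_n} f^+ - \int_{E_n} f^- \Big) = \sum_n \int_{E_n} f. \]
□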

Proposition 4.9 For every $f \in L^1(E)$ we have
\[ \Big| \int_E f \Big| \le \int_E |f|. \]

Proof. We have
\[ \Big| \int_E f \Big| = \Big| \int_E f^+ - \int_E f^- \Big| \le \int_E f^+ + \int_E f^- = \int_E (f^+ + f^-) = \int_E |f|. \]
□

Proposition 4.10 Let $f, g \in L^1(X, \mathcal{A}, \mu)$. If $f = g$ $\mu$-a.e., then $\int_X f\,d\mu = \int_X g\,d\mu$.

Proof. It is easy to check that if $f = g$ $\mu$-a.e., then $f^+ = g^+$ a.e. and $f^- = g^-$ a.e. By Proposition 4.5, $\int f^+ = \int g^+$ and $\int f^- = \int g^-$. Hence the conclusion follows. □

4.1.4 Complex valued functions


Let (X, A, µ) be a measure space and let f : X → C. We say that f is summable or integrable
if its components Re f and Im f are summable and in this case we set
Z Z Z
f dµ = Re f dµ + i Im f dµ.
X X X

The space of summable complex valued functions is denoted by L1C (X). Other notations: L1C (µ),
L1 (X, C), L1 (X, µ, C).
It is not difficult to see that $L^1_{\mathbb{C}}(X)$ is a vector space on $\mathbb{C}$ and that the map $f \mapsto \int_X f\,d\mu$ is a linear functional on $L^1_{\mathbb{C}}(X)$. The additivity on domains also holds.

Proposition 4.11 For every $f \in L^1_{\mathbb{C}}(E)$ we have
\[ \Big| \int_E f \Big| \le \int_E |f|. \]

Proof. If $\int f = 0$, the inequality is satisfied. Assume therefore that $\int f \neq 0$ and let $\alpha = \frac{|\int f|}{\int f}$. Then
\[ \Big| \int f \Big| = \alpha \int f = \int \alpha f. \]
Now, $\int \alpha f$ is real. Therefore $\int \alpha f = \mathrm{Re} \int \alpha f = \int \mathrm{Re}(\alpha f)$ (the last equality holds by definition). It follows that
\[ \Big| \int f \Big| = \int \mathrm{Re}(\alpha f) \le \int |\alpha f| = \int |f| \]
because $|\alpha| = 1$. □

4.2 The Lebesgue dominated convergence theorem


Suppose that we have a sequence of functions $f_n$ that converges pointwise to a limit $f$. Do we have
\[ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X \lim_{n\to\infty} f_n\,d\mu? \]
Otherwise stated, can we interchange $\int$ and $\lim$? The answer in general is no, as shown by the following example.

Counter-example. Consider the sequence defined by $f_n(x) = nx(1 - x^2)^n$ for $x \in [0, 1]$. This sequence converges pointwise to the zero function. Note however that $\int_0^1 f_n(x)\,dx = \frac{n}{2n+2} \to \frac{1}{2}$ whereas $\int_0^1 \lim f_n(x)\,dx = 0$.
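The counter-example can be checked numerically. The sketch below approximates $\int_0^1 f_n$ with a midpoint rule (a choice made here for simplicity, with a hypothetical helper `midpoint_integral`) and compares it with the closed form $\frac{n}{2n+2}$, while the pointwise values at a fixed $x$ shrink to zero.

```python
def f(n, x):
    return n * x * (1 - x * x) ** n

def midpoint_integral(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the Riemann integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

results = {}
for n in (1, 10, 100):
    approx = midpoint_integral(lambda x: f(n, x), 0.0, 1.0)
    exact = n / (2 * n + 2)
    results[n] = (approx, exact)
    print(f"n = {n}: integral ~ {approx:.6f}, n/(2n+2) = {exact:.6f}, f_n(0.5) = {f(n, 0.5):.3e}")
```

The integrals approach $\frac{1}{2}$ although $f_n(0.5) \to 0$: the mass of $f_n$ concentrates near $x = 0$, so no integrable function can dominate the whole sequence.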
R
There are of course many cases in which the interchange of and lim is possible. In the
Riemann theory, a sufficient condition is uniform convergence.
R However this is a restrictive
requirement. A better criterion for the interchange of and lim is the monotone convergence
theorem. Now we give another criterion known as the Lebesgue dominated convergence theorem.
But first, a lemma.

Lemma 4.6 (Fatou) Let $\{f_n\}$ be a sequence of $M_+(X)$. Then
\[ \int_X \liminf f_n \le \liminf \int_X f_n. \]

Proof. Set $g_n = \inf_{k\ge n} f_k$. Then $g_n \le f_n$, $g_n \le g_{n+1}$ and $\liminf f_n = \lim g_n$. Therefore $\int g_n \le \int f_n$ and so $\lim \int g_n \le \liminf \int f_n$. It follows from this and the monotone convergence theorem that
\[ \int \liminf f_n = \int \lim g_n = \lim \int g_n \le \liminf \int f_n. \]
□

Theorem 4.2 (The Lebesgue dominated convergence theorem) Let $(X, \mathcal{A}, \mu)$ be a measure space and $E \in \mathcal{A}$. Let $(f_n)$ be a sequence of measurable functions from $E$ to $\mathbb{R}$ or $\mathbb{C}$. Suppose that

(i) There exists a measurable function $f$ such that $f_n \to f$ a.e. on $E$.

(ii) There exists $g \in L^1(E)$ such that $|f_n| \le g$ a.e. on $E$ for all $n$.

Then $f$ is summable on $E$ and

(a) $\lim_{n\to\infty} \int_E |f_n - f|\,d\mu = 0$.

(b) $\lim_{n\to\infty} \int_E f_n\,d\mu = \int_E f\,d\mu$.
Proof. Let $A$ be a set on which the assumptions hold and such that $\mu(A^c) = 0$. Modify $f$, $f_n$ and $g$ by setting $f(x) = f_n(x) = g(x) = 0$ for $x \notin A$. This does not modify the measurability and summability properties but (i) and (ii) now hold everywhere.
Now the $f_n$ are summable since $|f_n| \le g$ and $g$ is summable. Also it follows by letting $n \to \infty$ in (ii) that $|f| \le g$ and therefore $f$ is also summable. Next, $|f - f_n| \le |f| + |f_n| \le 2g$ and so setting $\varphi_n := 2g - |f - f_n|$ we have that $\varphi_n$ is summable and nonnegative. Since $f_n \to f$, it follows that $\liminf \varphi_n = \lim \varphi_n = 2g$ and by Fatou's lemma
\[ \int_E 2g = \int_E \liminf \varphi_n \le \liminf \int_E \varphi_n = \int_E 2g + \liminf \int_E (-|f - f_n|). \]
It follows that $\liminf \int_E (-|f - f_n|) \ge 0$ and so $\limsup \int_E |f - f_n| \le 0$. Therefore
\[ \lim_{n\to\infty} \int_E |f - f_n| = 0. \]
This proves (a). Part (b) follows from the estimate
\[ \Big| \int_E f - \int_E f_n \Big| = \Big| \int_E (f - f_n) \Big| \le \int_E |f - f_n| \to 0. \]
□

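Corollary 4.4 (Integration term by term) Let $(f_n)$ be a sequence of measurable functions from $X$ to $\mathbb{R}$ or $\mathbb{C}$. Suppose that
\[ \sum_{n\ge 1} \int_X |f_n|\,d\mu < +\infty. \]
Then $\sum_{n\ge 1} f_n$ is $\mu$-integrable and
\[ \int_X \sum_{n\ge 1} f_n\,d\mu = \sum_{n\ge 1} \int_X f_n\,d\mu. \]

Proof. Let $g = \sum_{n\ge 1} |f_n|$. Using Proposition 4.6, we get
\[ \int_X g\,d\mu = \int_X \sum_{n\ge 1} |f_n|\,d\mu = \sum_{n\ge 1} \int_X |f_n|\,d\mu < \infty. \]
Therefore $g \in L^1(X)$. It follows that $g$ is finite $\mu$-a.e. and so the series $\sum_{n\ge 1} f_n$ is absolutely convergent (and hence convergent) a.e.
Let $g_n = \sum_{k=1}^n f_k$. Then $(g_n)$ converges to $\sum_{k\ge 1} f_k$ a.e. and $|g_n| \le g$ a.e. By the dominated convergence theorem, $\sum_{k\ge 1} f_k$ is $\mu$-integrable and
\[ \lim_{n\to\infty} \int_X g_n\,d\mu = \int_X \sum_{k\ge 1} f_k\,d\mu, \]
that is,
\[ \lim_{n\to\infty} \int_X \sum_{k=1}^n f_k\,d\mu = \int_X \sum_{k\ge 1} f_k\,d\mu. \]
But $\lim_{n\to\infty} \int_X \big( \sum_{k=1}^n f_k \big)\,d\mu = \lim_{n\to\infty} \sum_{k=1}^n \int_X f_k\,d\mu = \sum_{k\ge 1} \int_X f_k\,d\mu$. Therefore,
\[ \sum_{k\ge 1} \int_X f_k\,d\mu = \int_X \sum_{k\ge 1} f_k\,d\mu. \]
□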

4.3 Relations with the Riemann integral


Theorem 4.3 Let $f : [a, b] \to \mathbb{R}$ be bounded. If $f$ is Riemann integrable then $f$ is Lebesgue integrable on $[a, b]$ and the integrals coincide, i.e.,
\[ \int_{[a,b]} f\,d\lambda = \int_a^b f(x)\,dx. \]

Proof. Let $(P_n)$ be an increasing¹ sequence of partitions of $[a, b]$ such that $\|P_n\| \to 0$. For example one can take
\[ P_n = \Big\{ a + \frac{k}{2^n}(b - a) \;\Big|\; k = 0, \dots, 2^n \Big\}, \]
which divides $[a, b]$ into $2^n$ equal subintervals. Then form the Darboux upper and lower sums corresponding to each $P_n$:
\[ U(f, P_n) = \sum_{i\in I_n} M_{n,i} \Delta x_{n,i}, \qquad L(f, P_n) = \sum_{i\in I_n} m_{n,i} \Delta x_{n,i}. \]
We claim that
\[ \lim_{n\to\infty} U(f, P_n) = \lim_{n\to\infty} L(f, P_n) = \int_a^b f(x)\,dx. \]
Indeed, let $\varepsilon > 0$ be given. Then by Theorem A.3, there exists $n_0$ such that $U(f, P_n) - L(f, P_n) < \varepsilon$ for all $n \ge n_0$. Therefore $\int_a^b f(x)\,dx \le U(f, P_n) \le \int_a^b f(x)\,dx + \varepsilon$. Similarly, $\int_a^b f(x)\,dx - \varepsilon \le L(f, P_n) \le \int_a^b f(x)\,dx$.
Now define two sequences of step functions $G_n$ and $g_n$ by
\[ G_n(x) = M_{n,i} \text{ if } x_{i-1} \le x < x_i; \qquad g_n(x) = m_{n,i} \text{ if } x_{i-1} \le x < x_i; \qquad G_n(b) = g_n(b) = f(b). \]
Otherwise stated,
\[ G_n = \sum_{i\in I_n} M_{n,i} \mathbf{1}_{[x_{i-1}, x_i[} + f(b)\mathbf{1}_{\{b\}}, \qquad g_n = \sum_{i\in I_n} m_{n,i} \mathbf{1}_{[x_{i-1}, x_i[} + f(b)\mathbf{1}_{\{b\}}. \]
Then clearly,
\[ \int_{[a,b]} G_n\,d\lambda = U(f, P_n) \quad \text{and} \quad \int_{[a,b]} g_n\,d\lambda = L(f, P_n). \]
Moreover,
\[ G_1(x) \ge G_2(x) \ge \cdots \ge f(x) \quad \text{and} \quad g_1(x) \le g_2(x) \le \cdots \le f(x). \]
Hence
\[ G(x) := \lim_{n\to\infty} G_n(x) \ge f(x) \quad \text{and} \quad g(x) := \lim_{n\to\infty} g_n(x) \le f(x). \]
Note that $G$ and $g$ are measurable as limits of measurable functions. Also $G$ and $g$ are summable on $[a, b]$ since they are bounded. It follows from the monotone convergence theorem² that
\[ \int_{[a,b]} G\,d\lambda = \lim_{n\to\infty} \int_{[a,b]} G_n\,d\lambda = \lim_{n\to\infty} U(f, P_n) = \int_a^b f(x)\,dx, \]
and
\[ \int_{[a,b]} g\,d\lambda = \lim_{n\to\infty} \int_{[a,b]} g_n\,d\lambda = \lim_{n\to\infty} L(f, P_n) = \int_a^b f(x)\,dx. \]

¹ By increasing we mean that $P_{n+1}$ is a refinement of $P_n$.
² $(G_n - G)$ is a decreasing sequence in $M_+([a, b])$ that converges to 0 and such that $\int (G_1 - G) < \infty$. Therefore, $\int (G_n - G) \to 0$, i.e., $\int G_n \to \int G$. Also, $(g_n - g_1)$ is an increasing sequence in $M_+([a, b])$ that converges to $g - g_1$. Therefore $\int (g_n - g_1) \to \int (g - g_1)$ and so $\int g_n \to \int g$.

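Therefore,
\[ \int_{[a,b]} G\,d\lambda = \int_{[a,b]} g\,d\lambda. \]
Since $G - g \ge 0$ and $\int_{[a,b]} (G - g)\,d\lambda = 0$, Proposition 4.3 gives $G = g$ almost everywhere. As $g \le f \le G$, we get $f = G = g$ a.e. This implies first that $f$ is measurable (because $G$ is measurable as a limit of step functions), and second that
\[ \int_{[a,b]} f\,d\lambda = \int_{[a,b]} G\,d\lambda = \int_a^b f(x)\,dx. \]
□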


Next we show the relation between absolutely convergent Riemann integrals and Lebesgue integrals. Recall that the integral $\int_a^\infty f(x)\,dx$ is called absolutely convergent if $\lim_{b\to\infty} \int_a^b |f(x)|\,dx$ exists and is finite.

Theorem 4.4 Let $f : [a, \infty[ \to \mathbb{R}$ be Riemann integrable on any compact interval $[a, b]$. Then the following conditions are equivalent.

(i) $\int_a^\infty f(x)\,dx$ is absolutely convergent.

(ii) $f \in L^1([a, \infty[)$.

If one of these conditions holds, then
\[ \int_{[a,\infty[} f\,d\lambda = \int_a^\infty f(x)\,dx. \]

Proof. (i)⇒(ii). Let $f_n = f\mathbf{1}_{[a,n]}$. Then $(f_n)$ is a sequence of measurable functions that converges to $f$ and so $f$ is measurable. Now the sequence $(|f_n|)$ is nondecreasing and converges to $|f|$. By the monotone convergence theorem,
\[ \int_{[a,\infty[} |f|\,d\lambda = \lim_{n\to\infty} \int_{[a,\infty[} |f_n|\,d\lambda = \lim_{n\to\infty} \int_{[a,n]} |f|\,d\lambda. \]
But according to the previous theorem,
\[ \int_{[a,n]} |f|\,d\lambda = \int_a^n |f(x)|\,dx. \]
But this last integral has a finite limit by assumption and therefore $\int_{[a,\infty[} |f|\,d\lambda < +\infty$, that is, $f$ is summable on $[a, \infty[$.
Now $|f_n| \le |f|$, so by the dominated convergence theorem $\int_{[a,\infty[} f_n \to \int_{[a,\infty[} f$. But
\[ \int_{[a,\infty[} f_n\,d\lambda = \int_{[a,n]} f\,d\lambda = \int_a^n f(x)\,dx \]
and this last integral converges to $\int_a^\infty f(x)\,dx$. Hence the equality.
(ii)⇒(i). It is enough to prove that for any sequence $(t_n) \subset [a, \infty[$ converging to $+\infty$, the integral $\int_a^{t_n} |f(x)|\,dx$ has a finite limit independent of $(t_n)$. So let $(t_n)$ be such a sequence and set $f_n = f\mathbf{1}_{[a,t_n]}$. Then $(|f_n|)$ converges to $|f|$ and $|f_n| \le |f|$ with $|f| \in L^1$. By the dominated convergence theorem, $\int_{[a,\infty[} |f_n|\,d\lambda = \int_{[a,t_n]} |f|\,d\lambda$ converges to $\int_{[a,\infty[} |f|\,d\lambda$. But $\int_{[a,t_n]} |f|\,d\lambda = \int_a^{t_n} |f(x)|\,dx$ by the previous theorem. This means that $\int_a^{t_n} |f(x)|\,dx \to \int_{[a,\infty[} |f|\,d\lambda$. It follows that $\int_a^\infty |f(x)|\,dx$ is convergent. □

Theorem 4.5 Let $f : [a, b[ \to \mathbb{R}$ be Riemann integrable on any compact interval $[a, c] \subset [a, b[$. Then the following conditions are equivalent.

(i) $\int_a^b f(x)\,dx$ is absolutely convergent.

(ii) $f \in L^1([a, b[)$.

If one of these conditions holds, then
\[ \int_{[a,b[} f\,d\lambda = \int_a^b f(x)\,dx. \]

Proof. Exercise. Proceed as above. □


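Remark 4.2 $\int_a^\infty f(x)\,dx$ may be semi-convergent and so $f \notin L^1([a, \infty[)$. For example,
\[ \int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2} \quad \text{but} \quad \int_0^\infty \Big| \frac{\sin x}{x} \Big|\,dx = +\infty. \]
In this case $\int_{[0,\infty[} f\,d\lambda$ does not exist, although $\int_0^\infty f(x)\,dx$ exists.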

Theorem 4.6 [Lebesgue’s theorem on Riemann’s integral] Let f : [a, b] → IR be a bounded


function. Then f is Riemann integrable if and only if it is continuous almost everywhere, i.e.,
the set of its discontinuity points is of Lebesgue measure zero.

Proof. Suppose first that $f$ is Riemann integrable. We proceed as in the proof of Theorem 4.3: that is, we consider an increasing sequence $(P_n)$ of partitions of $[a, b]$ whose norms tend to zero, and then define two monotone sequences of step functions $G_n$ and $g_n$ converging to $G$ and $g$ respectively so that $g_n(x) \le g(x) \le f(x) \le G(x) \le G_n(x)$ and $G = g$ a.e. Now let
\[ E := \{x \in [a, b] \mid g(x) \neq G(x)\} \cup \bigcup_n P_n. \]
Then $E$ has measure zero since $G = g$ a.e. and $\bigcup_n P_n$ is countable. We claim that $f$ is continuous on $[a, b]\setminus E$. Indeed, let $x_0 \in [a, b]\setminus E$ and let $\varepsilon > 0$ be given. Then $g(x_0) = G(x_0)$ and since $g_n(x_0) \to g(x_0)$ and $G_n(x_0) \to G(x_0)$ we deduce that $G_k(x_0) - g_k(x_0) < \varepsilon$ for all $k$ large enough. Choose and fix such $k$. Now $x_0 \notin P_k$ and so it must be an interior point of some subinterval of the partition $P_k$ where $g_k$ and $G_k$ are constant. Hence there exists $\delta > 0$ such that $G_k(x) = G_k(x_0)$ and $g_k(x) = g_k(x_0)$ for all $|x - x_0| < \delta$. From the above and the inequalities $g_k \le f \le G_k$, we conclude that for all $|x - x_0| < \delta$,
\[ -\varepsilon < g_k(x_0) - G_k(x_0) \le f(x) - f(x_0) \le G_k(x_0) - g_k(x_0) < \varepsilon. \]
This proves that $f$ is continuous at $x_0$.


Suppose now conversely that $f$ is continuous on some $[a, b]\setminus D$ where $\lambda(D) = 0$. Let $\varepsilon > 0$ be given and $M$ be such that $|f(x)| \le M$. Then $|f(x) - f(y)| \le 2M$ for all $x, y \in [a, b]$. Since $D$ has measure zero there exists a countable cover of $D$ by open intervals $I_n$ such that $\sum_n \ell(I_n) < \frac{\varepsilon}{4M}$. Now for all $x \notin D$ there exists (by the continuity of $f$ at $x$) an open interval $J_x$ containing $x$ and such that $|f(z) - f(y)| < \frac{\varepsilon}{2(b-a)}$ for all $z, y \in J_x \cap [a, b]$. Now $\{I_n\}_n \cup \{J_x\}_{x\notin D}$ is an open cover of $[a, b]$ and so by compactness, there exists a finite subcover $\{I_{n_k}\}_{k=1}^n \cup \{J_{x_i}\}_{i=1}^m$. Let $P = \{t_0, \dots, t_N\}$ be the partition of $[a, b]$ determined by the endpoints of $I_{n_k}$ and $J_{x_i}$ (which are inside $[a, b]$). Then each interval $]t_{j-1}, t_j[$ is contained either in some $I_{n_k}$ or in some $J_{x_i}$. Let $J = \{j \mid\, ]t_{j-1}, t_j[ \subset I_{n_k} \text{ for some } k\}$ and $\Delta_j = t_j - t_{j-1}$. Then
\begin{align*}
U(f, P) - L(f, P) &= \sum_{j=1}^N \Delta_j \sup\{|f(y) - f(z)| \mid y, z \in [t_{j-1}, t_j]\} \\
&\le \sum_{j\in J} \Delta_j\, 2M + \sum_{j\notin J} \Delta_j \frac{\varepsilon}{2(b-a)} \\
&\le 2M \frac{\varepsilon}{4M} + (b - a)\frac{\varepsilon}{2(b-a)} = \varepsilon.
\end{align*}
Hence $f$ is Riemann integrable. □

Remark 4.3 The characteristic function f of Q ∩ [0, 1] is not Riemann integrable because it is
discontinuous everywhere (this is because every interval, no matter how small, contains rational
and irrational numbers). However it is Lebesgue integrable because it is Lebesgue measurable
(as Q ∩ [0, 1] is Lebesgue measurable) and its integral, by definition, is λ(Q ∩ [0, 1]) = 0.
On the other hand the characteristic function of the middle third Cantor set (restricted to
[0,1]) is Riemann integrable.

4.4 Some applications


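We give some applications of the power of the convergence theorems.

Example. Compute $\lim_{n\to\infty} \int_0^n \big(1 + \frac{x}{n}\big)^n e^{-2x}\,dx$. The classical theorems of Riemann integration do not apply since we work on an unbounded domain and there is no uniform convergence. However, we can work with the Lebesgue integral instead. First write
\[ I_n = \int_0^n \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,dx = \int_{[0,n]} \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,d\lambda = \int_{[0,\infty[} \Big(1 + \frac{x}{n}\Big)^n e^{-2x} \mathbf{1}_{[0,n]}\,d\lambda. \]
Let $f_n(x) = \big(1 + \frac{x}{n}\big)^n e^{-2x} \mathbf{1}_{[0,n]}(x)$. Then $f_n(x) \to e^{-x}$ and, since $\big(1 + \frac{x}{n}\big)^n \le e^x$ for $x \ge 0$, $|f_n(x)| \le e^{-x}$. But the function $x \mapsto e^{-x}$ is summable on $[0, \infty[$. By the dominated convergence theorem,
\[ \lim_{n\to\infty} I_n = \int_{[0,\infty[} e^{-x}\,d\lambda = \int_0^\infty e^{-x}\,dx = 1. \]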
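The limit obtained above can be sanity-checked numerically. This is only a sketch: the helper `I` approximates $I_n$ with a midpoint rule on $[0, n]$ (a numerical choice made here, with a hypothetical `steps_per_unit` parameter), and the domination $f_n \le e^{-x}$ is tested at a few sample points.

```python
import math

def f_n(n, x):
    """Integrand (1 + x/n)^n e^{-2x} restricted to [0, n], zero elsewhere."""
    if 0 <= x <= n:
        return (1 + x / n) ** n * math.exp(-2 * x)
    return 0.0

def I(n, steps_per_unit=500):
    """Midpoint-rule approximation of I_n, the integral of f_n over [0, n]."""
    steps = n * steps_per_unit
    h = n / steps
    return h * sum(f_n(n, (k + 0.5) * h) for k in range(steps))

values = {n: I(n) for n in (1, 10, 200)}
for n, v in values.items():
    print(f"I_{n} ~ {v:.5f}")

# Domination by the summable function e^{-x} at a few sample points.
dominated = all(f_n(n, x) <= math.exp(-x) for n in (1, 10, 200) for x in (0.0, 0.5, 1.0, 5.0))
```

The computed values increase towards 1, in agreement with both the dominated and the monotone convergence theorems.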

Remark. Note that the sequence (1 + x/n)n is nondecreasing for x ≥ 0 and therefore we can
also use the monotone convergence theorem.

Integrals depending on a parameter


Theorem 4.7 Let $(X, \mathcal{A}, \mu)$ be a measure space, $J$ be a metric space, $s_0 \in J$, and $f : X \times J \to \mathbb{R}$ (or $\mathbb{C}$) be a function satisfying
(i) For every $s \in J$, the function $x \mapsto f(x, s)$ is measurable.
(ii) For almost every $x \in X$, the function $s \mapsto f(x, s)$ is continuous at $s_0$.
(iii) There exists a function $g \in L^1(X)$ such that $|f(x, s)| \le g(x)$ for all $s \in J$ and almost every $x \in X$.
Then the function $I$ defined by
\[ I(s) = \int_X f(x, s)\,d\mu(x) \]
is continuous at $s_0$.

Proof. We need to show that for every sequence {tn } ⊂ J converging to s0 , we have
I(tn ) → I(s0 ). Let fn (x) = f (x, tn ) and h(x) = f (x, s0 ). Then fn → h a.e. The dominated
convergence theorem gives the result. □

Theorem 4.8 (Derivation under the integral) Let $(X, \mathcal{A}, \mu)$ be a measure space and $J$ be an open interval of $\mathbb{R}$ and $f : X \times J \to \mathbb{R}$ (or $\mathbb{C}$) be a function satisfying

(i) For every $s \in J$, the function $x \mapsto f(x, s)$ is $\mu$-integrable.

(ii) For almost every $x \in X$, the function $s \mapsto f(x, s)$ is differentiable on $J$.

(iii) There exists a function $g \in L^1(X)$ such that $\big| \frac{\partial f}{\partial s}(x, s) \big| \le g(x)$ for all $s \in J$ and almost every $x \in X$.

Then the function $I$ defined by
\[ I(s) = \int_X f(x, s)\,d\mu(x) \]
is differentiable on $J$ and
\[ \frac{dI}{ds} = \int_X \frac{\partial f}{\partial s}(x, s)\,d\mu(x). \]
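Proof. Let $t \in J$ and let $\{t_n\} \subset J$ be a sequence converging to $t$ with $t_n \neq t$. Set
\[ \varphi_n(x) = \frac{f(x, t_n) - f(x, t)}{t_n - t}. \]
Then $\varphi_n(x) \to \frac{\partial f}{\partial s}(x, t)$ for almost every $x$, and by the mean value theorem and assumption (iii),
\[ |\varphi_n(x)| \le \sup_{s\in J} \Big| \frac{\partial f}{\partial s}(x, s) \Big| \le g(x). \]
By the dominated convergence theorem,
\[ \frac{dI}{ds}(t) = \lim_{n\to\infty} \frac{I(t_n) - I(t)}{t_n - t} = \lim_{n\to\infty} \int_X \frac{f(x, t_n) - f(x, t)}{t_n - t}\,d\mu(x) = \int_X \lim_{n\to\infty} \frac{f(x, t_n) - f(x, t)}{t_n - t}\,d\mu(x) = \int_X \frac{\partial f}{\partial s}(x, t)\,d\mu(x). \]
□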

Appendix A

The Riemann integral

The integral of a function f over an interval [a, b] is thought of as the area of the planar region
bounded by the graph of f , and the lines y = 0, x = a and x = b. But how to define this area?
The Riemann approach is to divide the interval [a, b] into many small subintervals [xi−1 , xi ],
next, to choose a point ci in each subinterval and then consider the rectangles with base [xi−1 , xi ]
and height f (ci ). The sum of the areas of all these small rectangles is then an approximation
of the "area" under the graph of f . It is natural to expect that the more rectangles we have,
or the finer the decomposition of the interval [a, b] is, the better the approximation of
the area under the graph of f will be. In what follows, we shall elaborate these intuitive ideas and
construct a theory of the integral known as the Riemann integral.

A.1 Definitions
Definition A.1 Let [a, b] be a given interval. A partition P of [a, b] is a finite set of points
{x0 , x1 , . . . , xn } such that a = x0 < x1 < · · · < xn−1 < xn = b. We write ∆xi = xi − xi−1 for
i = 1, . . . , n. The biggest of the numbers ∆xi is called the norm of the partition and is denoted
by ∥P ∥.

Definition A.2 Let P1 and P2 be two partitions of [a, b]. We say that P2 is a refinement of P1
if P1 ⊂ P2 .

Let $f : [a, b] \to \mathbb{R}$ be a bounded function. In order to define the integral of $f$ over $[a, b]$, we proceed as follows. Let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$ and let $c_1, \dots, c_n$ be a sequence of points in $[a, b]$ such that $c_i \in [x_{i-1}, x_i]$ for all $i = 1, \dots, n$. Then we form the finite sum
\[ \sigma(f, P, c_1, \dots, c_n) = \sum_{i=1}^n f(c_i)\Delta x_i, \]
called a Riemann sum of $f$ corresponding to the partition $P$ and written $\sigma(f, P)$ for short. This Riemann sum is said to have a limit $I$ as the partition becomes finer and finer, or as $\|P\| \to 0$, if for every $\varepsilon > 0$, there exists a number $\delta > 0$ such that
\[ |\sigma(f, P) - I| < \varepsilon \]
for every partition $P$ of $[a, b]$ such that $\|P\| < \delta$ and any choice of the points $c_1, \dots, c_n$. In this case we write
\[ \lim_{\|P\|\to 0} \sigma(f, P) = I. \]

Definition A.3 A function $f : [a, b] \to \mathbb{R}$ is called integrable in the sense of Riemann or Riemann integrable if the Riemann sums of $f$ have a limit. In this case, this limit is called the integral of $f$ over $[a, b]$ and is denoted by
\[ \int_a^b f(x)\,dx. \]
Otherwise stated, the integral of $f$ is defined by
\[ \int_a^b f(x)\,dx = \lim_{\|P\|\to 0} \sigma(f, P) \]
provided that the limit exists.

Theorem A.1 Let $f : [a, b] \to \mathbb{R}$ be integrable. Then
\[ \lim_{n\to\infty} \frac{b - a}{n} \sum_{k=1}^n f\Big( a + k\frac{b - a}{n} \Big) = \int_a^b f(x)\,dx. \]

Proof. Take a uniform partition of $[a, b]$, that is, divide $[a, b]$ into equal intervals each of length $\frac{b-a}{n}$, and choose $c_k = a + k\frac{b-a}{n}$. We obtain the Riemann sum
\[ \sum_{k=1}^n f\Big( a + k\frac{b - a}{n} \Big) \frac{b - a}{n}. \]
According to the above, this sum tends to $\int_a^b f(x)\,dx$ as $n \to \infty$. □
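Theorem A.1 translates directly into a small computation. The following sketch evaluates the uniform right-endpoint Riemann sum for $f(x) = x^2$ on $[0, 1]$ (an example chosen here, with exact integral $\frac{1}{3}$); the name `uniform_riemann_sum` is hypothetical.

```python
def uniform_riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b] for the uniform partition with points c_k = a + k(b-a)/n."""
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(1, n + 1))

def square(x):
    return x * x

# The sums approach the integral 1/3 as n grows.
sums = {n: uniform_riemann_sum(square, 0.0, 1.0, n) for n in (10, 100, 1000)}
for n, s in sums.items():
    print(f"n = {n}: sum = {s:.6f}, error = {abs(s - 1 / 3):.2e}")
```

For this $f$ the sum equals $\frac{(n+1)(2n+1)}{6n^2}$, so the error decays like $\frac{1}{2n}$.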

Definition A.4 If $a > b$ and $f$ is integrable on $[b, a]$ we set $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$. Also we set $\int_a^a f(x)\,dx = 0$.

A.2 Criteria of integrability


Let $f : [a, b] \to \mathbb{R}$ be a bounded function and let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$. We set
\[ M_i = \sup_{x_{i-1}\le x\le x_i} f(x), \qquad m_i = \inf_{x_{i-1}\le x\le x_i} f(x). \]
We need the following quantities called respectively the Darboux upper sum and the Darboux lower sum of $f$ corresponding to the partition $P$:
\[ U(f, P) = \sum_{i=1}^n M_i \Delta x_i, \qquad L(f, P) = \sum_{i=1}^n m_i \Delta x_i. \]

Lemma A.1 Let P be a partition of [a, b] and let f be a bounded function. Then,

L(f, P ) ≤ σ(f, P, c1 , . . . , cn ) ≤ U (f, P ),

for any choice of the intermediate points ci .



Proof. Straightforward. □

Theorem A.2 Let P∗ be a refinement of a partition P of [a, b] and let f be a bounded function. Then

(i) L(f, P) ≤ L(f, P∗).

(ii) U(f, P∗) ≤ U(f, P).

Otherwise stated, inserting extra points into a partition does not decrease the lower sum and does not increase
the upper sum.
Proof. We prove (i). Let P = {x0 , . . . , xn }. It is enough to prove the claim when P∗
contains just one point more than P, since the general case follows by inserting the points one at a time. Let this point be x∗. Then there is an index i ∈ {1, . . . , n} such that
$x_{i-1} < x^* < x_i$. Let
\[ w_1 = \inf_{x_{i-1} \le x \le x^*} f(x) \quad \text{and} \quad w_2 = \inf_{x^* \le x \le x_i} f(x). \]
Clearly $w_1 \ge m_i$ and $w_2 \ge m_i$, where
\[ m_i = \inf_{x_{i-1} \le x \le x_i} f(x). \]
Now
\[ \begin{aligned} L(f, P^*) - L(f, P) &= w_1 (x^* - x_{i-1}) + w_2 (x_i - x^*) - m_i (x_i - x_{i-1}) \\ &= (w_1 - m_i)(x^* - x_{i-1}) + (w_2 - m_i)(x_i - x^*). \end{aligned} \]
Therefore the difference is nonnegative, and so our claim is true.
The proof of (ii) is similar. □

Corollary A.1 For any partition P and any partition Q of [a, b],

L(f, P ) ≤ U (f, Q).

Therefore, sup L(f, P ) ≤ inf U (f, Q) where the sup and inf are taken over all possible partitions
of [a, b].

Proof. Let P and Q be two partitions of [a, b] and let P∗ = P ∪ Q, so that P∗ is a refinement
of both P and Q. It follows from the theorem above that L(f, P) ≤ L(f, P∗) ≤ U(f, P∗) ≤
U(f, Q). □

Lemma A.2 Let P∗ be a refinement of a partition P of [a, b] and let f be a bounded function. Then
\[ U(f, P) - U(f, P^*) \le 2m\,\|f\|\,\|P\| \quad \text{and} \quad L(f, P^*) - L(f, P) \le 2m\,\|f\|\,\|P\|, \]
where m is the number of points in P∗ \ P and $\|f\| = \sup_{x \in [a,b]} |f(x)|$.

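Proof. It is enough to prove the lemma when P∗ has just one more point than P: the general case follows by inserting the m points one at a time, since inserting a point does not increase the norm of the partition. Let this
point be x∗. Then there is an index i ∈ {1, . . . , n} such that $x_{i-1} < x^* < x_i$. Only the terms over $[x_{i-1}, x_i]$ differ, and each supremum involved is bounded in absolute value by ∥f∥. Then,
\[ U(f, P) - U(f, P^*) \le \|f\|(x^* - x_{i-1}) + \|f\|(x_i - x^*) + \|f\|(x_i - x_{i-1}) = 2\|f\|(x_i - x_{i-1}) \le 2\|f\|\,\|P\|. \]
The proof for L(f, P) is similar. □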



Theorem A.3 (Darboux) Let f : [a, b] → IR be bounded. Then
\[ \lim_{\|P\| \to 0} U(f, P) = \inf U(f, P) \quad \text{and} \quad \lim_{\|P\| \to 0} L(f, P) = \sup L(f, P), \]
where the inf and the sup are taken over all partitions of [a, b].


Proof. Let $I^* = \inf U(f, P)$ and let ε > 0 be given. Then there exists a partition $P_1 = \{x_0, \dots, x_m\}$ such that $U(f, P_1) < I^* + \frac{\varepsilon}{2}$. If ∥f∥ = 0 then f = 0 and there is nothing to prove, so assume ∥f∥ > 0 and let $\delta = \frac{\varepsilon}{4m\|f\|}$, where ∥f∥ is the supremum of |f| on
[a, b]. Let P be a partition such that ∥P∥ < δ and finally let $P^* = P_1 \cup P$. Since $P^* \setminus P$ contains at most m points, the
previous lemma gives $U(f, P) - U(f, P^*) \le 2m\|f\|\,\|P\| < 2m\|f\|\delta = \frac{\varepsilon}{2}$. Now since $P^*$ is finer than
$P_1$, we have $U(f, P^*) \le U(f, P_1) < I^* + \frac{\varepsilon}{2}$. Thus,
\[ I^* \le U(f, P) < U(f, P^*) + \frac{\varepsilon}{2} < I^* + \varepsilon. \]
The proof for $I_* = \sup L(f, P)$ is similar or can be deduced from the preceding by noting that
inf(A) = − sup(−A) and L(−f, P) = −U(f, P). □

Lemma A.3 By a choice of the intermediate points ci , the Riemann sum σ(f, P, c1 , . . . , cn ) can
be made arbitrarily close to the upper Darboux sum U (f, P ) as well as to the lower Darboux
sum L(f, P ).
Proof. By the property of the supremum, for any ε > 0 and each i there exists a point $c_i$ in $[x_{i-1}, x_i]$
such that
\[ M_i - \frac{\varepsilon}{b-a} < f(c_i) \le M_i. \]
Multiplying both inequalities by $\Delta x_i$ and summing from i = 1 to i = n, we get
\[ U(f, P) - \varepsilon < \sigma(f, P, c_1, \dots, c_n) \le U(f, P). \]

This proves the first assertion of the lemma. The second assertion is proved similarly by using
the property of the infimum. □

Theorem A.4 Let f : [a, b] → IR be a bounded function. Then the following conditions are
equivalent.

(i) f is Riemann integrable on [a, b].

(ii) For every ε > 0 there exists a number δ > 0 such that U(f, P) − L(f, P) < ε, for any
partition P of [a, b] such that ∥P∥ < δ.

(iii) sup L(f, P ) = inf U (f, P ).

(iv) For every ε > 0 there exists a partition P of [a, b] such that U(f, P) − L(f, P) < ε.

In addition, if one of the above conditions holds then $\int_a^b f(x)\,dx = \sup L(f, P) = \inf U(f, P)$.
Rb
Proof. (i)⇒(ii). Let f be integrable and set $I = \int_a^b f(x)\,dx$. Let ε > 0 be given. Then by
definition, there exists a number δ > 0 such that
\[ |\sigma(f, P, c_1, \dots, c_n) - I| < \frac{\varepsilon}{3}, \quad \text{or} \quad I - \frac{\varepsilon}{3} < \sigma(f, P, c_1, \dots, c_n) < I + \frac{\varepsilon}{3}, \]
for every partition P of [a, b] such that ∥P∥ < δ and any choice of the intermediate points $c_i$.
By the previous lemma, the lower Darboux sum L(f, P) can be approximated arbitrarily closely by such Riemann sums, so it belongs to the closed interval $[I - \frac{\varepsilon}{3}, I + \frac{\varepsilon}{3}]$.
The same is true for the upper Darboux sum U(f, P). This means that both sums belong
to the same interval of length $\frac{2\varepsilon}{3}$, and so U(f, P) − L(f, P) < ε.

(ii)⇒(iii). For every partition P of [a, b], we have
\[ L(f, P) \le \sup L(f, P) \le \inf U(f, P) \le U(f, P), \]
where the inf and sup are taken over all possible partitions of [a, b]. Condition (ii) then implies
that
\[ 0 \le \inf U(f, P) - \sup L(f, P) < \varepsilon \]
for all ε > 0. This means that inf U(f, P) = sup L(f, P).
(iii)⇒(i). Let I be this common number. It follows from Darboux’s theorem that
\[ \lim_{\|P\| \to 0} U(f, P) = \lim_{\|P\| \to 0} L(f, P) = I. \]
But $L(f, P) \le \sigma(f, P, c_1, \dots, c_n) \le U(f, P)$ and so $\lim_{\|P\| \to 0} \sigma(f, P, c_1, \dots, c_n) = I$. This means
that f is integrable (and its integral is I).
(iii)⇒(iv). Denote this common number by I and let ε > 0. By the property of the sup, there exists a
partition P such that
\[ I < L(f, P) + \frac{\varepsilon}{2}. \]
By the property of the inf, there exists a partition Q such that
\[ I > U(f, Q) - \frac{\varepsilon}{2}. \]
Let P∗ = P ∪ Q. Then, by Theorem A.2,
\[ U(f, P^*) - \frac{\varepsilon}{2} \le U(f, Q) - \frac{\varepsilon}{2} < I < L(f, P) + \frac{\varepsilon}{2} \le L(f, P^*) + \frac{\varepsilon}{2}. \]
Hence U(f, P∗) − L(f, P∗) < ε, which is the conclusion.
(iv)⇒(iii). Let ε > 0 and let P be as in (iv). It follows that inf U(f, P) ≤ U(f, P) < L(f, P) + ε ≤ sup L(f, P) + ε. Since ε is arbitrary,
inf U(f, P) ≤ sup L(f, P). But the reverse inequality always holds, hence the
equality. □
Remark A.1 Observe that $U(f, P) - L(f, P) = \sum (M_i - m_i)\,\Delta x_i$. The difference $\omega_i = M_i - m_i$
is called the oscillation of f on $[x_{i-1}, x_i]$. Thus condition (ii) of the previous theorem can be
formulated as follows: For every ε > 0, there is δ > 0 such that $\sum \omega_i\,\Delta x_i < \varepsilon$, for every partition
P such that ∥P∥ < δ.

A.3 Classes of integrable functions


Not all functions are integrable in the sense of Riemann (this is why we sought another criterion
of integrability). Consider for example the Dirichlet function f : [0, 1] → IR defined by f(x) = 1
if x is rational and f(x) = 0 if x is irrational. Let P = {x0 , . . . , xn } be an arbitrary partition
of [0, 1]. Since every subinterval $[x_{k-1}, x_k]$ contains rational as well as irrational points, the
oscillation of f on every subinterval is 1. Therefore $\sum \omega_i\,\Delta x_i = 1$ for every partition, and this sum cannot
be made arbitrarily small by any choice of partition, however fine. This means that the function is
not integrable in the sense of Riemann.
However, there are many functions which are integrable in the sense of Riemann. In fact all
elementary functions are integrable on every compact interval on which they are continuous. We shall prove this shortly, and to this end we start with
a lemma that is proved in the exercises.

Lemma A.4 Let f : [α, β] → IR be a bounded function. Then the oscillation of f is also equal
to sup{|f (x) − f (y)| | x, y ∈ [α, β]}.

Theorem A.5 A continuous function is Riemann integrable.


Proof. Let f : [a, b] → IR be continuous and let ε > 0 be given. Since f is uniformly
continuous on [a, b], there exists δ > 0 such that $|f(x) - f(t)| < \frac{\varepsilon}{b-a}$ for all x, t ∈ [a, b] such that
|x − t| < δ. Let P = {x0 , . . . , xn } be a partition such that ∥P∥ < δ. Let $\omega_i$ be the oscillation of
f on the interval $[x_{i-1}, x_i]$. According to the previous lemma $\omega_i \le \frac{\varepsilon}{b-a}$ and therefore
\[ \sum_{i=1}^{n} \omega_i\,\Delta x_i \le \frac{\varepsilon}{b-a} \sum_{i=1}^{n} \Delta x_i = \frac{\varepsilon}{b-a}\,(b-a) = \varepsilon. \]
Since ε is arbitrary, f is integrable by Theorem A.4. □

Theorem A.6 A monotonic function is Riemann integrable.


Proof. We prove the theorem for an increasing function f. We distinguish between two cases:
(i) f(a) = f(b) and (ii) f(a) < f(b). In the first case f is constant and therefore integrable (since
it is continuous). So we consider the second case. Let ε > 0 be given, and let P = {x0 , . . . , xn }
be a partition of [a, b] with $\|P\| < \delta = \frac{\varepsilon}{f(b) - f(a)}$. Let as before $M_i = \sup_{x_{i-1} \le x \le x_i} f(x)$ and
$m_i = \inf_{x_{i-1} \le x \le x_i} f(x)$. Then $m_i = f(x_{i-1})$ and $M_i = f(x_i)$ (because f is increasing) so that
\[ U(f, P) - L(f, P) = \sum_{i=1}^{n} \bigl[f(x_i) - f(x_{i-1})\bigr]\,\Delta x_i \le \delta \sum_{i=1}^{n} \bigl[f(x_i) - f(x_{i-1})\bigr] = \delta\,[f(b) - f(a)] = \varepsilon. \]
By Theorem A.4, f is integrable. □
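The estimate used in this proof, U(f, P) − L(f, P) ≤ ∥P∥ [f(b) − f(a)] for an increasing f, can be observed numerically. The following sketch uses the exponential function on [0, 1] and an arbitrary non-uniform partition; both are illustrative choices of ours, as are the function names.

```python
import math


def oscillation_sum_increasing(f, partition):
    """U(f, P) - L(f, P) for an INCREASING f: on each subinterval the
    oscillation is f(x_i) - f(x_{i-1})."""
    return sum((f(x1) - f(x0)) * (x1 - x0)
               for x0, x1 in zip(partition, partition[1:]))


def partition_norm(partition):
    """Length of the longest subinterval of the partition."""
    return max(x1 - x0 for x0, x1 in zip(partition, partition[1:]))


partition = [0.0, 0.05, 0.2, 0.45, 0.5, 0.75, 1.0]
gap = oscillation_sum_increasing(math.exp, partition)
bound = partition_norm(partition) * (math.exp(1.0) - math.exp(0.0))
print(gap, bound)
```

Refining the partition makes the bound, and hence the gap, as small as we wish, which is exactly criterion (ii) of Theorem A.4.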

Theorem A.7 Let f : [a, b] → IR be a function and let c ∈ ]a, b[. If f is integrable on [a, c] and
on [c, b], then f is integrable on [a, b] and
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
Proof. Consider the sum $\sum \omega_i\,\Delta x_i$ for some partition. If the point c belongs to the partition,
then this sum consists of two similar sums for the intervals [a, c] and [c, b], each of which tends
to zero as the norm of the partition tends to zero. The conclusion remains true also in the case
where c does not belong to the partition: adding c to the partition we would change only one
term in the sum which itself tends to zero. □

Theorem A.8 A piecewise continuous function is Riemann integrable.


Proof. It is enough to prove the theorem when f has only one point of discontinuity c ∈ ]a, b[
because the general case follows by induction. Now f is integrable on [a, c] because it is continuous
there (after possibly redefining f at c by its one-sided limit, which does not affect integrability). Similarly f is integrable on [c, b]. By the previous theorem f is integrable on [a, b]. □
In fact we have a stronger theorem.

Theorem A.9 A bounded function having finitely many points of discontinuity is Riemann
integrable.
Example. Consider the function $x \mapsto \sin(\frac{1}{x})$, extended by an arbitrary value at x = 0. This function is discontinuous at x = 0 but it is
not piecewise continuous, since it has no one-sided limits at 0. According to the previous theorem it is integrable on any compact
interval.
The above theorem and the next one can be proved inside the Riemann theory but they can be
easily deduced from Theorem 4.6.

Theorem A.10 Let f be integrable and φ be continuous. Then φ ◦ f is integrable.


In particular, if f is integrable then so are |f|, $f^2$ and any positive power of f.

A.4 Properties of the integral


Proposition A.1 Suppose that f and g are integrable functions on [a, b] and that k is a
constant. Then

a) kf is integrable and $\int_a^b k f(x)\,dx = k \int_a^b f(x)\,dx$.

b) f + g is integrable and $\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

c) fg is integrable.
Proof. a) and b) are proved by taking Riemann sums and going to the limit.
c) is proved in the exercises. □

Proposition A.2 If f is integrable on an interval containing the points a, b, and c, then
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

Proposition A.3 If an integrable function f : [a, b] → IR satisfies f(x) ≥ 0 for every x ∈ [a, b],
then $\int_a^b f(x)\,dx \ge 0$.

Proof. If f (x) ≥ 0 then any Riemann sum is nonnegative. Going to the limit, we obtain the
result. □

Corollary A.2 If f and g are two integrable functions satisfying f(x) ≤ g(x) for all x ∈ [a, b],
then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

Corollary A.3 If f is integrable on an interval containing a and b and f satisfies |f(x)| ≤ M on this interval,
then $\left| \int_a^b f(x)\,dx \right| \le M\,|b - a|$.

A.5 Integration and differentiation


Theorem A.11 (First fundamental theorem of Calculus) Let f : [a, b] → IR be integrable.
Then the function F defined by $F(x) = \int_a^x f(t)\,dt$ is continuous, and differentiable at
each point where f is continuous. At these points, F′(x) = f(x).
Proof. Since f is bounded, there is a constant M such that |f(t)| ≤ M for all t ∈ [a, b]. Note
that $F(x) - F(y) = \int_y^x f(t)\,dt$ and so
\[ |F(x) - F(y)| \le M\,|x - y|. \]
This means that F is Lipschitz continuous and therefore continuous.
Let now x be a point at which f is continuous. Then, given ε > 0, there exists δ > 0 such
that |f(t) − f(x)| < ε for all t such that |t − x| < δ. Hence
\[ \left| \frac{F(x+h) - F(x)}{h} - f(x) \right| = \left| \frac{1}{h} \int_x^{x+h} \bigl(f(t) - f(x)\bigr)\,dt \right| \le \varepsilon \quad \text{for } 0 < |h| < \delta. \]
This means that $\lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = f(x)$ and so F is differentiable at x with derivative
f(x). □

Corollary A.4 Every continuous function defined on an interval has an antiderivative.

We rarely compute integrals by going to the definition. The most practical way to compute
the integral of elementary functions is given by the following theorem which connects integration
and the search for antiderivatives.

Theorem A.12 (Second fundamental theorem of Calculus) Let f be an integrable function
defined on an interval I and let a, b ∈ I. If G is an antiderivative of f, then
\[ \int_a^b f(x)\,dx = G(b) - G(a). \]

Proof. It is enough to prove the result for a < b. Note that for any partition P = {x0 , . . . , xn }
of [a, b] we have $G(b) - G(a) = \sum_{k=1}^{n} [G(x_k) - G(x_{k-1})]$. Now by the mean value theorem
there exists $c_k \in\, ]x_{k-1}, x_k[$ such that $G(x_k) - G(x_{k-1}) = G'(c_k)\,\Delta x_k = f(c_k)\,\Delta x_k$. Therefore,
$G(b) - G(a) = \sum_{k=1}^{n} f(c_k)\,\Delta x_k$. But the right hand side of this equality is a Riemann sum of
f. Since f is integrable this sum converges to $\int_a^b f(x)\,dx$ as ∥P∥ → 0. □

Remark. The quantity G(b) − G(a) is usually denoted by $G(x)\big|_a^b$.
Thanks to the above theorem we can write for example
\[ \int_0^1 x^2\,dx = \frac{x^3}{3}\bigg|_0^1 = \frac{1}{3}. \]
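Example of computing Riemann sums. Find
\[ \lim_{n \to \infty} \frac{1}{n\sqrt{n}} \left( \sqrt{1} + \sqrt{2} + \cdots + \sqrt{n} \right). \]
The sum can be written as
\[ \frac{1}{n} \left( \sqrt{\frac{1}{n}} + \sqrt{\frac{2}{n}} + \cdots + \sqrt{\frac{n}{n}} \right) = \frac{b-a}{n} \sum_{k=1}^{n} f\left(a + k\,\frac{b-a}{n}\right) \]
where a = 0, b = 1 and $f(x) = \sqrt{x}$. Therefore the sum is a Riemann sum of f and converges
therefore to $\int_0^1 \sqrt{x}\,dx = \frac{2}{3}$.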
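The limit found in this example can be checked numerically. The sketch below evaluates the scaled sum directly for a few values of n; the function name `scaled_sqrt_sum` is ours and the check is only a sanity test, not a proof.

```python
import math


def scaled_sqrt_sum(n):
    """(sqrt(1) + sqrt(2) + ... + sqrt(n)) / (n * sqrt(n))."""
    return sum(math.sqrt(k) for k in range(1, n + 1)) / (n * math.sqrt(n))


for n in (10, 1000, 100000):
    # the printed values should approach 2/3 as n grows
    print(n, scaled_sqrt_sum(n))
```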
There are of course several techniques for integration. The most important are integration
by parts, substitutions or change of variables and partial fractions. These are usually studied
in elementary Calculus courses. We mention the first two.

Theorem A.13 (Integration by parts) Let f and g be differentiable functions defined on
[a, b]. If f′ and g′ are integrable on [a, b], then
\[ \int_a^b f'(x)\,g(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b g'(x)\,f(x)\,dx. \]

Proof. The result follows from the derivative of a product and the second fundamental
theorem of Calculus.

Theorem A.14 (Integration by substitution) Let φ : [a, b] → [c, d] be differentiable and
let f : [c, d] → IR be continuous. If φ′ is integrable, then
\[ \int_a^b f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx. \]

Proof. By the chain rule and the second fundamental theorem of Calculus, both sides are
equal to F (φ(b)) − F (φ(a)) where F is an antiderivative of f .

A.6 Limits and integration


Example A.1 Consider the sequence defined by $f_n(x) = nx(1 - x^2)^n$ for x ∈ [0, 1]. This
sequence converges pointwise to the zero function. Note however that $\int_0^1 f_n(x)\,dx = \frac{n}{2n+2} \to \frac{1}{2}$ whereas
$\int_0^1 \lim f_n(x)\,dx = 0$.

The above example shows that one should be careful before interchanging limits and integrals.
However, there is a stronger form of convergence, namely uniform convergence, that permits this interchange and also preserves
the properties of continuity and integrability.

Example A.2 The sequence of functions defined by $f_n(x) = nx(1 - x^2)^n$ for x ∈ [0, 1] does
not converge uniformly because
\[ \sup_{x \in [0,1]} f_n(x) = f_n\left(\frac{1}{\sqrt{2n+1}}\right) = \frac{n}{\sqrt{2n+1}} \left(1 - \frac{1}{2n+1}\right)^n \to \infty \quad \text{as } n \to \infty. \]
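The behaviour described in Examples A.1 and A.2 can be observed numerically. The following sketch approximates the integral of f_n by a midpoint rule (our choice of quadrature, used only as a rough check) and evaluates f_n at the maximiser 1/√(2n+1); all function names are ours.

```python
import math


def f(n, x):
    """f_n(x) = n x (1 - x^2)^n on [0, 1]."""
    return n * x * (1.0 - x * x) ** n


def midpoint_integral(n, steps=20000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f(n, (j + 0.5) * h) for j in range(steps))


def sup_on_unit_interval(n):
    """Value of f_n at its maximiser x = 1 / sqrt(2n + 1)."""
    return f(n, 1.0 / math.sqrt(2 * n + 1))


for n in (25, 100, 400):
    # columns: n, numerical integral, n / (2n + 2), sup of f_n
    print(n, midpoint_integral(n), n / (2 * n + 2), sup_on_unit_interval(n))
```

The integrals stay near 1/2 while the pointwise values at any fixed x tend to 0 and the suprema grow, in agreement with both examples.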

Theorem A.15 Let $f_n$ : [a, b] → IR be a sequence of Riemann integrable functions which
converges uniformly to a function f. Then f is Riemann integrable and
\[ \lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b f(t)\,dt. \]
Otherwise stated,
\[ \lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b \lim_{n \to \infty} f_n(t)\,dt. \]
Rb
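Proof. The first step is to show that the sequence $\{\int_a^b f_n\}$ converges. Let ε > 0. Since $\{f_n\}$
converges uniformly, there exists an integer N such that $|f_n(x) - f_m(x)| < \varepsilon$ for all x ∈ [a, b]
and all n, m ≥ N. It follows that
\[ \left| \int_a^b f_n(x)\,dx - \int_a^b f_m(x)\,dx \right| = \left| \int_a^b [f_n(x) - f_m(x)]\,dx \right| \le \int_a^b |f_n(x) - f_m(x)|\,dx \le \varepsilon (b - a) \]
for all n, m ≥ N. This means that the sequence of integrals is a Cauchy sequence and therefore
is convergent to some limit L. Letting m → ∞ in the two estimates above, we also get $|f_n(x) - f(x)| \le \varepsilon$ for all x ∈ [a, b] and $\left| \int_a^b f_n(x)\,dx - L \right| \le \varepsilon (b - a)$ for all n ≥ N.
Choose n ≥ N. Since $f_n$ is integrable, there exists δ > 0 such that $\left| \sigma(f_n, P, c_1, \dots, c_k) - \int_a^b f_n(x)\,dx \right| \le \varepsilon$
for any partition P = {x0 , . . . , xk } with ∥P∥ ≤ δ and any choice of the intermediate points. On the other hand, it is easily seen that
$|\sigma(f, P, c_1, \dots, c_k) - \sigma(f_n, P, c_1, \dots, c_k)| \le \varepsilon (b - a)$.
The result now follows from
\[ |\sigma(f, P) - L| \le |\sigma(f, P) - \sigma(f_n, P)| + \left| \sigma(f_n, P) - \int_a^b f_n(x)\,dx \right| + \left| \int_a^b f_n(x)\,dx - L \right|, \]
since the right hand side is at most ε(b − a) + ε + ε(b − a). □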

Theorem A.16 Let fn : [a, b] → IR be a sequence of continuously differentiable functions


which satisfies the following conditions.

(i) There is some point x0 ∈ [a, b] such that fn (x0 ) converges.

(ii) The sequence of derivatives fn′ converges uniformly to some function g.

Then {fn } converges uniformly to a C 1 function f which satisfies f ′ = g.



Proof. By the second fundamental theorem of Calculus we can write
\[ f_n(x) = f_n(x_0) + \int_{x_0}^{x} f_n'(t)\,dt. \]
Let $\ell = \lim_{n \to \infty} f_n(x_0)$ and define now the function f by
\[ f(x) = \ell + \int_{x_0}^{x} g(t)\,dt. \]
Since g is continuous, being a uniform limit of continuous functions, the first fundamental theorem of Calculus gives f′(x) = g(x). We shall prove that $f_n$ converges uniformly to f. Taking the difference
between the last two identities, it follows from the triangle inequality that
\[ |f_n(x) - f(x)| \le |f_n(x_0) - \ell| + \left| \int_{x_0}^{x} |f_n'(t) - g(t)|\,dt \right|. \]
Now given ε > 0, we can make $|f_n(x_0) - \ell| < \varepsilon$ for all n large enough. Also we can make
$|f_n'(t) - g(t)| < \varepsilon$ for all n large enough and all t ∈ [a, b]. Thus integrating we get $\left| \int_{x_0}^{x} |f_n'(t) - g(t)|\,dt \right| \le (b - a)\varepsilon$
for all x ∈ [a, b]. This means that
\[ |f_n(x) - f(x)| \le \varepsilon + (b - a)\varepsilon, \]
for all n large enough and all x ∈ [a, b]. Since ε is arbitrary, the theorem is proved. □

Now we give the analog of the above theorems for series. The following theorem can be
easily proved by taking partial sums.

Theorem A.17 Let $f_n$ : I → IR be a sequence of functions such that the series $\sum f_n$ converges
uniformly to a function f.

a) If each $f_n$ is continuous then f is continuous.

b) If each $f_n$ is integrable on an interval [a, b] ⊂ I, then f is integrable on [a, b] and
$\int_a^b f(x)\,dx = \sum \int_a^b f_n(x)\,dx$. This means that we can integrate term by term a uniformly convergent series.

c) If each $f_n$ is continuously differentiable and $\sum f_n'$ is uniformly convergent on I then f is
continuously differentiable and $f'(x) = \sum f_n'(x)$ for all x ∈ I.
