
Introductory Mathematical Analysis for Quantitative

Finance

Daniele Ritelli and Giulia Spaletta

February 11, 2021


Contents

1 Euclidean space 1
1.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Topology of Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Limits of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 Sequences and series of functions 9


2.1 Sequences and series of real or complex numbers . . . . . . . . . . . . . . . . . . . . . 9
2.2 Sequences of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Uniform convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Series of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5 Power series: radius of convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Taylor–MacLaurin series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6.1 Binomial series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6.2 The error function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.3 Abel theorem and series summation . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7 Basel problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.8 Extension of elementary functions to the complex field . . . . . . . . . . . . . . . . . . 35
2.8.1 Complex exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.8.2 Complex goniometric hyperbolic functions . . . . . . . . . . . . . . . . . . . . . 36
2.8.3 Complex logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.9.1 Solved exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.9.2 Unsolved exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3 Multidimensional differential calculus 43


3.1 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 Differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3 Maxima and Minima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.4 Sufficient conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Lagrange multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.6 Mean–Value theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.7 Implicit function theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.8 Proof of Theorem 3.22 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.9 Sufficient conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

4 Ordinary differential equations of first order: general theory 55


4.1 Preliminary notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1.1 Systems of ODEs: equations of higher order . . . . . . . . . . . . . . . . . . . . 57
4.2 Existence of solutions: Peano theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3 Existence and uniqueness: Picard–Lindelöf theorem . . . . . . . . . . . . . . . . . . 59
4.3.1 Interval of existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.3.2 Vector–valued differential equations . . . . . . . . . . . . . . . . . . . . . . . . 64


4.3.3 Solution continuation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

5 Ordinary differential equations of first order: methods for explicit solutions 67


5.1 Separable equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Singular integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.3 Homogeneous Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.4 Quasi homogeneous equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.5 Exact equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.5.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.6 Integrating factor for non exact equations . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.6.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.7 Linear equations of first order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.7.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.8 Bernoulli equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.8.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.9 Riccati equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.9.1 Cross–Ratio property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.9.2 Reduced form of the Riccati equation . . . . . . . . . . . . . . . . . . . . . . . 91
5.9.3 Connection with the linear equation of second order . . . . . . . . . . . . . . . 92
5.9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.10 Change of variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.10.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

6 Linear differential equations of second order 97


6.1 Homogeneous equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.1.1 Operator notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.1.2 Wronskian determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1.3 Order reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.1.4 Constant–coefficient equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.1.5 Cauchy–Euler equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.1.6 Invariant and Normal form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.2 Non–homogeneous equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.2.1 Variation of parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.2.2 Non–homogeneous equations with constant coefficients . . . . . . . . . . . . . . 112
6.2.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

7 Prologue to Measure theory 115


7.1 Set theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.1.1 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.1.2 Indexes and Cartesian product . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.1.3 Cartesian product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.1.4 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.1.5 Equivalences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.1.6 Real intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.1.7 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.1.8 The Real Number System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.1.9 The extended Real Number System . . . . . . . . . . . . . . . . . . . . . . . . 118
7.2 Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.2.1 Closed sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.2.2 Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.2.3 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

7.2.4 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121


7.2.5 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

8 Lebesgue integral 123


8.1 Measure theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.1.1 σ–algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.1.2 Borel sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.1.3 Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
8.1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
8.2 Translation invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.3 Simple functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
8.3.1 Integral of simple functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.4 Measurable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
8.5 Lebesgue integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8.6 Almost everywhere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
8.7 Connection with Riemann integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.7.1 The Riemann integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
8.7.2 Lebesgue–Vitali theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.7.3 An interesting example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.8 Non Lebesgue integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.8.1 Dirac measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.8.2 Discrete measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.9 Generation of measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.10 Passage to the limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.10.1 Monotone convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.10.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.10.3 Dominated convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
8.10.5 A property of increasing functions . . . . . . . . . . . . . . . . . . . . . . . . . 148
8.11 Differentiation under the integral sign . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
8.11.1 The probability integral (1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.11.2 The probability integral (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.11.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
8.12 Basel problem again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8.13 Debye integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

9 Radon–Nikodym theorem 157


9.1 Signed measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.2 Radon–Nikodym theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

10 Multiple integrals 163


10.1 Integration in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
10.1.1 Smart applications of Fubini theorem . . . . . . . . . . . . . . . . . . . . . . . 165
10.2 Change of variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
10.3 Integration in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
10.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
10.4 Product of σ–algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

11 Gamma and Beta functions 181


11.1 Gamma function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
11.1.1 Historical background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
11.1.2 Main properties of Γ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
11.2 Beta function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
11.2.1 Γ(1/2) and the probability integral . . . . . . . . . . . . . . . . . . . . . . . . 185
11.2.2 Legendre duplication formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
11.2.3 Euler reflexion formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
11.3 Definite integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
11.4 Double integration techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

12 Fourier transform on the real line 195


12.1 Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
12.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
12.2 Properties of the Fourier transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
12.2.1 Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
12.2.2 The Shift Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.2.3 The Stretch Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.2.4 Combining shifts and stretches . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.3 Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
12.4 Linear ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
12.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

13 Parabolic equations 205


13.1 Partial differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
13.1.1 Classification of second–order linear partial differential equations . . . . . . . . 206
13.2 The heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
13.2.1 Uniqueness of solution: homogeneous case . . . . . . . . . . . . . . . . . . . . . 207
13.2.2 Fundamental solutions: heat kernel . . . . . . . . . . . . . . . . . . . . . . . . . 209
13.2.3 Initial data on (0, ∞) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
13.3 Parabolic equations with constant coefficients . . . . . . . . . . . . . . . . . . . . . . . 216
13.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
13.4 Black–Scholes equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
13.4.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
13.5 Non–homogeneous equation: Duhamel integral . . . . . . . . . . . . . . . . . . . . . . 223
Preface

The purpose of this book is to be a tool for students with little mathematical background who
aim to study Mathematical Finance. The only prerequisites assumed are one–dimensional differential
calculus, infinite series, the Riemann integral and elementary linear algebra.
In a sense, it is a sort of intensive course, or crash–course, which allows students, with minimal
knowledge in Mathematical Analysis, to reach the level of mathematical expertise necessary in modern
Quantitative Finance. These lecture notes concern pure mathematics, but the arguments presented
are oriented to Financial applications. The n–dimensional Euclidean space is briefly introduced, in
order to deal with multivariable differential calculus. Sequences and series of functions are introduced,
in view of theorems concerning the passage to the limit in Measure theory, and their role in the
general theory of ordinary differential equations, which is also presented. Due to its importance in
Quantitative Finance, the Radon–Nikodym theorem is stated without proof, since the von Neumann
argument requires notions of Functional Analysis that would demand a dedicated course. Finally, in
order to solve the Black–Scholes partial differential equation, basics of ordinary differential equations
and of the Fourier transform are provided.
We kept our exposition as short as possible, as the lectures are intended to be a first contact with
the mathematical concepts used in Quantitative Finance, often delivered in a one–semester course.
This book, therefore, is not aimed at a specialized audience, although the material presented here
can be used by both experts and non-experts to gain a clear idea of the mathematical tools used
in Finance.

1 Euclidean space

1.1 Vectors
If n ∈ N , we use the symbol Rn to indicate the Cartesian¹ product of n copies of R with itself, i.e.:

Rn := {(x1 , x2 , . . . , xn ) : xj ∈ R for j = 1, 2, . . . , n} .

The concept of Euclidean² space is not limited to the set Rn , but it also includes the so–called Euclidean
inner product, introduced in Definition 1.1. The integer n is called dimension of Rn , the elements
x = (x1 , x2 , . . . , xn ) of Rn are called points, or vectors or ordered n–tuples, while xj , j = 1, . . . , n , are
the coordinates, or components, of x. Vectors x and y are equal if xj = yj for j = 1, 2, . . . , n . The zero
vector is the vector whose components are null, that is, 0 := (0, 0, . . . , 0). In low dimension situations,
i.e. for n = 2 or n = 3, we will write x = (x , y) and x = (x , y , z) , respectively.
For our purposes, that is extending differential calculus to functions of several variables, we need to
define an algebraic structure in Rn . This is done by introducing operations in Rn .

Definition 1.1. Let x = (x1 , x2 , . . . , xn ) , y = (y1 , y2 , . . . , yn ) ∈ Rn and α ∈ R .

(i) The sum of x and y is the vector:

x + y := (x1 + y1 , x2 + y2 , . . . , xn + yn ) ;

(ii) The difference of x and y is the vector:

x − y := (x1 − y1 , x2 − y2 , . . . , xn − yn ) ;

(iii) The α–multiple of x is the vector:

α x = (α x1 , α x2 , . . . , α xn ) ;

(iv) The Euclidean inner product of x and y is the real number:

x · y := x1 y1 + x2 y2 + ... + xn yn .

The vector operations of Definition 1.1, illustrated in Figure 1.1, represent the analogues of the alge-
braic operations in R and imply algebraic rules in Rn .
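The operations of Definition 1.1 can be sketched in a few lines of Python; this is an illustrative sketch only, and the helper names (vec_sum, inner, and so on) are not from the text.

```python
# Vector operations of Definition 1.1, with plain Python lists as n-tuples.

def vec_sum(x, y):
    # (i) component-wise sum
    return [xj + yj for xj, yj in zip(x, y)]

def vec_diff(x, y):
    # (ii) component-wise difference
    return [xj - yj for xj, yj in zip(x, y)]

def scalar_mult(alpha, x):
    # (iii) the alpha-multiple of x
    return [alpha * xj for xj in x]

def inner(x, y):
    # (iv) Euclidean inner product: x1*y1 + ... + xn*yn
    return sum(xj * yj for xj, yj in zip(x, y))

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(vec_sum(x, y))        # [5.0, 7.0, 9.0]
print(scalar_mult(2.0, x))  # [2.0, 4.0, 6.0]
print(inner(x, y))          # 32.0
```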

Proposition 1.2. Let x , y , z ∈ Rn and α , β ∈ R . Then:

(a) α 0 = 0 ;
(b) 0 x = 0 ;
(c) 1 x = x ;
(d) α (β x) = β (α x) = (α β) x ;
(e) α (x · y) = (α x) · y = x · (α y) ;
(f) α (x + y) = α x + α y ;
(g) 0 + x = x ;
(h) x − x = 0 ;
(i) 0 · x = 0 ;
(j) x + (y + z) = (x + y) + z ;
(k) x + y = y + x ;
(l) x · y = y · x ;
(m) x · (y + z) = x · y + x · z .

¹ Renatus Cartesius (1596–1650), French mathematician and philosopher.
² Euclid of Alexandria (circa 350–250 B.C.), Greek mathematician.

Figure 1.1: Vector operations.

Definition 1.3. The standard base of Rn is the set En = {e1 , . . . , en } , where:

e1 = (1, 0, . . . , 0) , e2 = (0, 1, 0, . . . , 0) , . . . , en = (0, . . . , 0, 1) .

Note that a generic x = (x1 , . . . , xn ) ∈ Rn can be represented as a linear combination of vectors in En :

x = Σ_{j=1}^{n} xj ej = Σ_{j=1}^{n} (x · ej) ej .

It is worth noting that, when n = 2 and n = 3 , the standard base En is made of pairwise orthogonal
vectors. In order to extend orthogonality to n dimensions, consider the main property of the standard
base, i.e., ej · ek = 0 for j ≠ k .

Definition 1.4. Let x , y ∈ Rn be non–zero vectors; then:

(i) x , y are parallel if and only if there exists t ∈ R such that x = t y ; this is denoted with x ∥ y ;

(ii) x , y are orthogonal if and only if x · y = 0 ; this is denoted with x ⊥ y .

As an example, a = (3, 5) and b = (−6, −10) are parallel, while c = (1, 1) and d = (1, −1) are
orthogonal.

The Euclidean inner product allows introducing a metric in Rn , as shown in the following Definition 1.5.

Definition 1.5. Let x ∈ Rn . The (Euclidean) norm of x is the scalar:

||x|| := ( Σ_{k=1}^{n} xk² )^{1/2} .

Remark 1.6. Observe that:


||x||² = x · x .

If T is the triangle of vertices (0, 0) , (a, b) , (a, 0) ∈ R² , then, by the Pythagorean Theorem, the
hypotenuse √(a² + b²) of T is exactly the norm of x := (a, b) .
We now define the Euclidean distance between two points in Rn .
Definition 1.7. Given x , y ∈ Rn , their (Euclidean) distance is defined, and denoted, as:
dist(x, y) := ||x − y|| .
Theorem 1.8 (Cauchy–Schwarz inequality). Let x , y ∈ Rn . Then:
|x · y| ≤ ||x|| ||y|| .
Proof. We only consider the non–trivial situation x , y ≠ 0 . For t ∈ R , define:

f (t) := ||x − t y||² .

Since 0 ≤ f (t) = ||x||² − 2 t x · y + t² ||y||² for every t ∈ R , the discriminant of this quadratic
polynomial in t must satisfy:

Δ/4 = (x · y)² − ||x||² ||y||² ≤ 0 ,

and the thesis follows.
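As a quick numerical sanity check of Theorem 1.8 (not part of the original text), one can sample random vectors and verify the inequality; inner and norm below are illustrative helpers.

```python
import math
import random

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Euclidean norm, using Remark 1.6: ||x||^2 = x . x
    return math.sqrt(inner(x, x))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # |x . y| <= ||x|| ||y||, with a tiny cushion for floating-point error
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
print("Cauchy-Schwarz holds on all samples")
```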
The main properties of the Euclidean norm are stated in the following Proposition 1.9, in which
inequalities (iii)–(iv) are called triangle inequalities.
Proposition 1.9. If x , y ∈ Rn , then:
(i) ||x|| ≥ 0 , with ||x|| = 0 only when x = 0 ;
(ii) ||α x|| = |α| ||x|| for all scalars α ;
(iii) ||x + y|| ≤ ||x|| + ||y|| ;
(iv) ||x − y|| ≥ ||x|| − ||y|| .
Proof. Inequality (i) and equality (ii) are trivial. To get (iii), observe that:

||x + y||² = ||x||² + 2 x · y + ||y||² .

From the Cauchy–Schwarz³ inequality of Theorem 1.8, we infer:

||x||² + 2 x · y + ||y||² ≤ ||x||² + 2 ||x|| ||y|| + ||y||² = (||x|| + ||y||)² .

Inequality (iv) follows analogously from:

||x − y||² = ||x||² − 2 x · y + ||y||² ,

and from the Cauchy–Schwarz inequality.
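The triangle inequalities (iii)–(iv) of Proposition 1.9 can likewise be checked on random samples; this is an illustrative sketch, not part of the text.

```python
import math
import random

def norm(x):
    return math.sqrt(sum(v * v for v in x))

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-9   # (iii): ||x + y|| <= ||x|| + ||y||
    assert norm(d) >= norm(x) - norm(y) - 1e-9   # (iv): ||x - y|| >= ||x|| - ||y||
print("triangle inequalities verified on random samples")
```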
Remark 1.10. Let a , b ∈ R2 \ {0} and let T be the triangle determined by points 0 , a , b ; then, T
has sides of length ||a|| , ||b|| , ||a − b|| , as shown in Figure 1.2. If θ is the angle between the sides of
length ||a|| , ||b|| then, by the Law of Cosines (linked to Carnot Theorem), the following equality (1.1)
holds:
||a − b||² = ||a||² + ||b||² − 2 ||a|| ||b|| cos θ . (1.1)

Recalling that

||a − b||² = ||a||² + ||b||² − 2 a · b ,

we conclude that:

cos θ = (a · b) / (||a|| ||b||) . (1.2)
³ Augustin–Louis Cauchy (1789–1857), French mathematician, engineer and physicist; Karl Hermann
Amandus Schwarz (1843–1921), German mathematician.

Figure 1.2: Angle between two vectors.

Taking into consideration equation (1.2) and the Cauchy–Schwarz inequality, we can formulate the
following Definitions 1.11–1.12.

Definition 1.11. Let x , y ∈ Rn be two non–zero vectors. Their angle ϑ(x , y) is defined by:

ϑ(x , y) = arccos( (x · y) / (||x|| ||y||) ) .

Observe that, when x , y are orthogonal, then ϑ(x , y) = π/2 .
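A short numerical illustration of the angle formula ϑ(x , y) = arccos( (x · y) / (||x|| ||y||) ) of equation (1.2); the helper names are illustrative, not from the text.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def angle(x, y):
    # theta(x, y) = arccos( x.y / (||x|| ||y||) ); the Cauchy-Schwarz
    # inequality guarantees the argument of arccos lies in [-1, 1]
    return math.acos(inner(x, y) / (norm(x) * norm(y)))

print(angle([1, 0], [0, 1]))   # pi/2: orthogonal vectors
print(angle([1, 1], [2, 2]))   # ~0: parallel vectors with t > 0
```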
Definition 1.12. The hyperplane (a plane when n = 3) passing through a point a ∈ Rn , with normal
b ≠ 0 , is the set:

Πb (a) = {x ∈ Rn : (x − a) · b = 0} .

Note that, by definition, Πb (a) is the set of all points x such that x − a and b are orthogonal; observe
that, given a , b , the vector x − a orthogonal to b is not unique, as Figure 1.3 illustrates.

By definition, the hyperplane Πb (a) is given by:

b1 x1 + b2 x2 + · · · + bn xn = d ,

where b = (b1 , . . . , bn ) is a normal and d = b · a is a constant related to the distance from Πb (a) to
the origin. Planes in R3 have equations of the form:

ax + by + cz = d .
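The representation b1 x1 + · · · + bn xn = d with d = b · a gives an immediate membership test; a minimal sketch (the example plane x + y + z = 1 and the function names are illustrative).

```python
def inner(x, y):
    return sum(u * v for u, v in zip(x, y))

def on_hyperplane(x, a, b, tol=1e-12):
    # x lies on the hyperplane through a with normal b iff (x - a) . b = 0
    diff = [xi - ai for xi, ai in zip(x, a)]
    return abs(inner(diff, b)) < tol

a = [1.0, 0.0, 0.0]   # a point on the plane
b = [1.0, 1.0, 1.0]   # a normal vector
d = inner(b, a)       # here d = 1, so the plane is x + y + z = 1
print(on_hyperplane([0.0, 0.5, 0.5], a, b))  # True: 0 + 0.5 + 0.5 = 1
print(on_hyperplane([0.0, 0.0, 0.0], a, b))  # False: the origin is off the plane
```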

1.2 Topology of Rn
Topology, that is the description of the relations among subsets of Rn , is based on the concept of
open and closed sets, that generalises the notion of open and closed intervals. After introducing these
concepts, we state their most basic properties. The first step is the natural generalisation of intervals
in Rn .

Definition 1.13. Open and closed balls are defined as follows:

(i) ∀r > 0 , the open ball, centered at a , of radius r , is the set of points:

Br (a) := {x ∈ Rn : ||x − a|| < r} ;




Figure 1.3: Hyperplane.

(ii) ∀r ≥ 0 , the closed ball, centered at a , of radius r , is the set of points:

B̄r (a) := {x ∈ Rn : ||x − a|| ≤ r} .

Note that, when n = 1 , the open ball centered at a of radius r is the open interval (a − r , a + r) ,
and the corresponding closed ball is the closed interval [a − r , a + r] . Here we adopt the convention of
representing open balls as dashed circumferences, while closed balls are drawn as solid circumferences,
as shown in Figure 1.4.
Figure 1.4: Open ball, n = 2 .

To generalise the concept of open and closed intervals even further, observe that each element of an
open interval I lies inside I, i.e., it is surrounded by other points in I. Although closed intervals do
not satisfy this property, their complements do. Accordingly, we give the following Definition 1.14.

Definition 1.14. The open and closed sets are defined as follows:
(i) a set V ⊂ Rn is open if and only if, for every a ∈ V , there exists ε > 0 such that Bε (a) ⊆ V ;
(ii) a set E ⊂ Rn is closed if and only if its complement E c := Rn \ E is open.
It follows that every open ball is an open set. Note that, if a ∈ Rn , then Rn \ {a} is open and {a} is
closed.
Remark 1.15. For each n ∈ N , the empty set ∅ and the whole space Rn are both open and closed.
We state, without proof, the following Theorem 1.16, which explains the basic properties of open and
closed sets. Notions on sets, set operators and Topology are presented in greater detail in Chapter 7,
while in this first chapter only strictly necessary concepts are introduced.
Theorem 1.16. Let {Vα }α∈A and {Eα }α∈A be any collections of open and closed subsets of Rn ,
respectively, where A is any set of indexes. Let further {Vk : k = 1 , . . . , p} and {Ek : k = 1 , . . . , p}
be finite collections of open and closed subsets of Rn , respectively. Then:

(i) ⋃_{α∈A} Vα is open;

(ii) ⋂_{k=1}^{p} Vk is open;

(iii) ⋂_{α∈A} Eα is closed;

(iv) ⋃_{k=1}^{p} Ek is closed;

(v) if V is open and E is closed, then V \ E is open and E \ V is closed.


Remark 1.17. In Theorem 1.16, statements (ii) and (iv) are false if arbitrary collections are used in
place of finite collections. In the one–dimensional Euclidean space R1 = R , in fact, we have that:

⋂_{k∈N} (−1/k , 1/k) = {0}

is a closed set and

⋃_{k∈N} (1/(k+1) , k/(k+1)) = (0, 1)

is open.
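The two collections of Remark 1.17 can be probed numerically; an illustrative sketch (finite truncations only, so this suggests rather than proves the statement).

```python
def in_all_intervals(x, K):
    # membership of x in the intersection of (-1/k, 1/k) for k = 1, ..., K;
    # a point x != 0 is excluded as soon as k >= 1/|x|
    return all(-1 / k < x < 1 / k for k in range(1, K + 1))

print(in_all_intervals(0.0, 10**5))    # True: 0 belongs for every k
print(in_all_intervals(0.001, 10**5))  # False: excluded once k >= 1000

# Similarly, every x in (0, 1) eventually lies in some (1/(k+1), k/(k+1)),
# so the union of these intervals fills (0, 1):
x = 0.999
print(any(1 / (k + 1) < x < k / (k + 1) for k in range(1, 10**4)))  # True
```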
Definition 1.18. Let E ⊆ Rn .

(i) The interior of E is the set E° := ⋃ {V | V ⊆ E , V is open in Rn} .

(ii) The closure of E is the set Ē := ⋂ {B | B ⊇ E , B is closed in Rn} .

Note that every set E contains the open set ∅ and is contained in the closed set Rn ; hence, E° and
Ē are well–defined. Notice further that E° is always open and Ē is always closed: E° is the largest
open set contained in E , and Ē is the smallest closed set containing E . The following Theorem 1.19
illustrates the properties of E° and Ē .
Theorem 1.19. Let E ⊆ Rn , then:

(i) E° ⊆ E ⊆ Ē ;

(ii) if V is open and V ⊆ E , then V ⊆ E° ;

(iii) if C is closed and C ⊇ E , then C ⊇ Ē .
Let us, now, introduce the notion of boundary of a set.
Definition 1.20. The boundary of E is the set:

∂E := {x ∈ Rn : for all r > 0 , Br (x) ∩ E ≠ ∅ and Br (x) ∩ E^c ≠ ∅} .

Given a set E , its boundary ∂E is closely related to E° and Ē .

Theorem 1.21. If E ⊆ Rn then ∂E = Ē \ E° .
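Definition 1.20 can be illustrated for a hypothetical example E = [0, 1) in R, whose interior is (0, 1), closure is [0, 1] and boundary is {0, 1}; the sampling below is a sketch of the ball condition, not a proof.

```python
# For E = [0, 1) in R, a point x is in the boundary if every ball B_r(x)
# meets both E and its complement (Definition 1.20).

def in_E(x):
    return 0 <= x < 1

def meets_E_and_complement(x, r):
    # sample the ball (x - r, x + r) on a fine grid
    pts = [x - r + 2 * r * i / 1000 for i in range(1001)]
    return any(in_E(p) for p in pts) and any(not in_E(p) for p in pts)

for x in (0.0, 0.5, 1.0):
    print(x, all(meets_E_and_complement(x, r) for r in (0.1, 0.01, 0.001)))
# 0.0 True   (boundary point)
# 0.5 False  (interior point: small balls lie entirely in E)
# 1.0 True   (boundary point, even though 1 does not belong to E)
```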

1.3 Limits of functions


A vector function is a function f of the form f : A → Rm , where A ⊆ Rn . Since f (x) ∈ Rm for each
x ∈ A , there are m functions fj : A → R , called component functions of f , such that:
f (x) = (f1 (x) , . . . , fm (x)) for each x ∈ A .
When m = 1 , function f has only one component and we call f real–valued. If f = (f1 , . . . , fm ) is
a vector function, where the components fj have intrinsic domains, then the maximal domain of f is
defined to be the intersection of the domains of all components fj .
To set up a notation for the algebra of vector functions, let E ⊆ Rn and let f, g : E → Rm . For each
x ∈ E , the following operations can be defined.
The scalar multiple of α ∈ R by f is given by:
(α f )(x) := α f (x) .
The sum of f and g is obtained as:
(f + g)(x) := f (x) + g(x) .
The (Euclidean) dot product of f and g is constructed as:
(f · g)(x) := f (x) · g(x) .
Definition 1.22. Let n, m ∈ N and a ∈ Rn , let V be an open set containing a and let f : V \ {a} →
Rm . Then, f (x) is said to converge to L , as x approaches a , if and only if for every ε > 0 there exists
a positive δ (that in general depends on ε , f , V , a ) such that:
0 < ||x − a|| < δ =⇒ ||f (x) − L|| < ε .
In this case we write:
lim f (x) = L
x→a
and call L the limit of f (x) as x approaches a . Using the analogy between the norm on Rn and the
absolute value on R , it is possible to extend a great part of the one–dimensional theory on limits of
functions to the Euclidean space setting.
Example 1.23. Consider proving that:
\[
\lim_{(x,y)\to(0,0)} \frac{x^2 y}{x^2 + y^2} = 0 .
\]
Using polar coordinates x = ρ cos θ , y = ρ sin θ , we have:
\[
\frac{x^2 y}{x^2 + y^2} = \rho \sin\theta \cos^2\theta .
\]
When (x , y) → (0 , 0) , then ρ → 0 holds too and, since for any θ ∈ [0 , 2π] the quantity sin θ cos²θ is bounded, the equality to zero follows.
Example 1.24. Let us demonstrate that the following limit does not exist:
\[
\lim_{(x,y)\to(0,0)} \frac{x y}{x^2 + y^2} .
\]
If we move towards the origin (0, 0) , along the line y = m x , we see that:
\[
\frac{x y}{x^2 + y^2} = \frac{m}{1 + m^2} ,
\]
that is, the right hand side depends explicitly on the slope m , and we have different values for different slopes of the line: this means that the limit does not exist.
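The path–dependence argument of Example 1.24 can be checked numerically; the following sketch (ours, not part of the text) evaluates the quotient along lines y = m x close to the origin:

```python
# Evaluate h(x, y) = x*y/(x^2 + y^2) along the line y = m*x near the origin.
# Along that line the quotient is identically m/(1 + m^2), so different
# slopes m produce different values: the two-variable limit cannot exist.
def h(x, y):
    return x * y / (x**2 + y**2)

values = {m: h(1e-8, m * 1e-8) for m in (0.0, 1.0, 2.0)}

assert abs(values[0.0] - 0.0) < 1e-12   # m = 0 gives 0/(1+0) = 0
assert abs(values[1.0] - 0.5) < 1e-12   # m = 1 gives 1/(1+1) = 1/2
assert abs(values[2.0] - 0.4) < 1e-12   # m = 2 gives 2/(1+4) = 2/5
```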

Definition 1.25. Let ∅ ≠ E ⊆ Rn and let f : E → Rm .

(i) f is said to be continuous at a ∈ E if and only if for every ε > 0 there exists a positive δ (that in general depends on ε , f , a) such that

||x − a|| < δ and x ∈ E =⇒ ||f (x) − f (a)|| < ε ;

(ii) f is said to be continuous on E if and only if f is continuous at every x ∈ E .

Example 1.26. Function:
\[
f(x\,,y) =
\begin{cases}
\dfrac{x^2 y}{x^2 + y^2} & (x\,,y) \neq 0 ,\\[2mm]
0 & (x\,,y) = 0
\end{cases}
\]
is continuous at every x ∈ R2 , while:
\[
g(x\,,y) =
\begin{cases}
\dfrac{x y}{x^2 + y^2} & (x\,,y) \neq 0 ,\\[2mm]
0 & (x\,,y) = 0
\end{cases}
\]
is not continuous at 0 .

We now state the two important Theorems 1.27 and 1.28, that establish the topological properties of
continuity.
Theorem 1.27. Let n , m ∈ N and f : Rn → Rm . Then the following conditions are equivalent:
(i) f is continuous on Rn ;

(ii) f −1 (V ) is open in Rn for every open subset V of Rm ;

(iii) f −1 (E) is closed in Rn for every closed subset E of Rm .


Theorem 1.28. Let n , m ∈ N , E be open in Rn and assume f : E → Rm . Then f is continuous on
E if and only if f −1 (V ) is open in Rn for every open set V in Rm .

Definition 1.29. A subset B ⊂ Rn is bounded if there exists M > 0 such that ||x|| ≤ M for any
x∈B.
The following Theorem 1.30, due to Weierstrass4 , states the fundamental property that, if a set is
both closed and bounded (we call it compact), then its image under any continuous function is also
compact.
Theorem 1.30 (Weierstrass Theorem on compactness). Let n , m ∈ N . If H is compact in Rn and
f : H → Rm is continuous on H , then f (H) is compact in Rm .

In the particular situation of a scalar function, we can state the generalization of Theorem 1.30 to
functions depending on several variables.
Theorem 1.31 (Generalization of Weierstrass Theorem). Assume that H is a non–empty subset of
Rn and f : H → R . If H is compact and f is continuous on H , then:

M := sup{f (x) : x ∈ H} and m := inf{f (x) : x ∈ H}

are finite real numbers. Moreover, there exist points xM , xm ∈ H such that M = f (xM ) and m =
f (xm ) .
⁴Karl Theodor Wilhelm Weierstrass (1815–1897), German mathematician.
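As an illustration of Theorem 1.31 (a numerical sketch of ours, with a hypothetical choice of f and H not taken from the text), a dense grid search on a compact square approximates the attained extrema:

```python
# f(x, y) = x*y*(1 - x - y) is continuous on the compact set H = [0,1]^2,
# so Theorem 1.31 guarantees that sup and inf are attained; a grid search
# approximates them.  The exact values are f(1/3, 1/3) = 1/27 and f(1, 1) = -1.
def f(x, y):
    return x * y * (1 - x - y)

N = 300
vals = [f(i / N, j / N) for i in range(N + 1) for j in range(N + 1)]
M, m = max(vals), min(vals)

assert abs(M - 1 / 27) < 1e-3
assert m == -1.0          # the minimum is attained exactly at the grid point (1, 1)
```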
2 Sequences and series of functions

2.1 Sequences and series of real or complex numbers


A sequence is a set of numbers u1 , u2 , u3 , . . . , in a definite order of arrangement, that is, a map
u : N → R or u : N → C , formed according to a certain rule. Each number in the sequence is called
term; un is called the nth term. The sequence is called finite or infinite, according to the number of
terms. The sequence u1 , u2 , u3 , . . . , when considered as a function, is also designated as (un )n∈N or
briefly (un ) .
Definition 2.1. The real or complex number ` is called the limit of the infinite sequence (un ) if, for
any positive number ε , there exists a positive number n(ε) , depending on ε , such that |un − `| < ε
for all integers n > n(ε) . In such a case, we denote:
lim un = ` .
n→∞

Given a sequence $(u_n)$ , we say that its associated infinite series $\sum_{n=1}^{\infty} u_n$ :

(i) converges, when the following limit exists:
\[
\lim_{n\to\infty} \sum_{k=1}^{n} u_k := S = \sum_{n=1}^{\infty} u_n ;
\]

(ii) diverges, when the limit of the partial sums $\sum_{k=1}^{n} u_k$ does not exist.

2.2 Sequences of functions


Given a real interval [a , b] , we denote by F ([a , b]) the collection of all real functions defined on [a , b] :
\[
F([a\,,b]) = \{ f \mid f : [a\,,b] \to \mathbb{R} \} .
\]
Definition 2.2. A sequence of functions with domain [a , b] is a sequence of elements of F ([a, b]) .
Example 2.3. Functions fn (x) = xn , where x ∈ [0 , 1] , form a sequence of functions in F ([0 , 1]) .
Let us analyse what happens when n → ∞ . It is easy to realise that a sequence of continuous functions may converge to a non–continuous function. Indeed, for the sequence of functions in Example 2.3, it holds:
\[
\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} x^n =
\begin{cases}
1 & \text{if } x = 1 ,\\
0 & \text{if } 0 \le x < 1 .
\end{cases}
\]
Thus, even if every function of the sequence fn (x) = xⁿ is continuous, the limit function f (x) , defined below, may not be continuous:
\[
f(x) := \lim_{n\to\infty} f_n(x) .
\]
The convergence of a sequence of functions, like that of Example 2.3, is called simple convergence. We now provide its rigorous definition.


Definition 2.4. If (fn ) is a sequence of functions in I ⊆ [a , b] and f is a real function on I , then fn pointwise converges to f if, for any x ∈ I , there exists the limit of the real sequence (fn (x)) and its value is f (x) :
\[
\lim_{n\to\infty} f_n(x) = f(x) .
\]
Pointwise convergence is denoted as follows:
\[
f_n \xrightarrow{\ I\ } f .
\]
Remark 2.5. Definition 2.4 can be reformulated as follows: it holds that $f_n \xrightarrow{\ I\ } f$ if, for any ε > 0 and for any x ∈ I , there exists n(ε, x) ∈ N , depending on ε and x , such that:
\[
|f_n(x) - f(x)| < \varepsilon
\]
for any n ∈ N with n > n(ε , x) .

Example 2.3 shows that the pointwise limit of a sequence of continuous functions may not be continuous.

2.3 Uniform convergence


Pointwise convergence does not allow, in general, interchanging between limit and integral operators, a possibility that we call passage to the limit and that we also address in § 8.10. To explain it, consider the sequence of functions:
\[
f_n(x) = n\, e^{-n^2 x^2}
\]
defined on [0 , ∞) ; it is a sequence that clearly converges to the zero function. Employing the substitution n x = y , evaluation of the integral of fn yields:
\[
\int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} e^{-y^2}\,dy .
\]
We do not have the tools, yet, to evaluate the integral in the right hand–side of the above equality (but we will soon); it is clear, though, that it is a positive real number, so we have:
\[
\lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx = \int_0^{\infty} e^{-y^2}\,dy = \alpha > 0 \neq \int_0^{\infty} \lim_{n\to\infty} f_n(x)\,dx = 0 .
\]

To establish a ‘good’ notion of convergence, that allows the passage to the limit, when we take the
integral of the considered sequence, and that preserves continuity, we introduce the fundamental notion
of uniform convergence.

Definition 2.6. If (fn ) is a sequence of functions defined on the interval I , then fn converges uniformly to the function f if, for any ε > 0 , there exists nε ∈ N such that, for n ∈ N , n > nε , it holds:
\[
\sup_{x\in I} |f_n(x) - f(x)| < \varepsilon . \tag{2.1}
\]
Uniform convergence is denoted by:
\[
f_n \overset{I}{\Rightarrow} f .
\]
Remark 2.7. Definition 2.6 is equivalent to requesting that, for any ε > 0 , there exists nε ∈ N such that, for n ∈ N , n > nε , it holds:
\[
|f_n(x) - f(x)| < \varepsilon , \quad \text{for any } x \in I . \tag{2.2}
\]



Proof. Let $f_n \overset{I}{\Rightarrow} f$ . Then, for any ε > 0 , there exists nε ∈ N such that:
\[
\sup_{x\in I} |f_n(x) - f(x)| < \varepsilon , \quad \text{for any } n \in \mathbb{N} \,,\ n > n_\varepsilon ,
\]
and this implies (2.2). Vice versa, if (2.2) holds then, for any ε > 0 , there exists nε ∈ N such that:
\[
\sup_{x\in I} |f_n(x) - f(x)| \le \varepsilon , \quad \text{for any } n \in \mathbb{N} \,,\ n > n_\varepsilon ,
\]
that is to say, $f_n \overset{I}{\Rightarrow} f$ .
Remark 2.8. Uniform convergence implies pointwise convergence. The converse does not hold, as
Example 2.3 shows.

In the next Theorem 2.9, we state the so–called Cauchy uniform convergence criterion.
Theorem 2.9. Given a sequence of functions (fn ) in [a , b] , the following statements are equivalent:
(i) (fn ) converges uniformly;
(ii) for any ε > 0 , there exists nε ∈ N such that, for n , m ∈ N , with n , m > nε , it holds:
|fn (x) − fm (x)| < ε , for any x ∈ [a , b] .

Proof. We show that (i) =⇒ (ii). Assume that (fn ) converges uniformly, i.e., for a fixed ε > 0 , there exists nε > 0 such that, for any n ∈ N , n > nε , inequality $|f_n(x) - f(x)| < \frac{\varepsilon}{2}$ holds for any x ∈ [a , b] . Using the triangle inequality, we have:
\[
|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\]
for n , m > nε .
To show that (ii) =⇒ (i), let us first observe that, for a fixed x ∈ [a , b] , the numerical sequence (fn (x)) is indeed a Cauchy sequence, thus, it converges to a real number f (x) . We prove that such a convergence is uniform. Let us fix ε > 0 and choose nε ∈ N such that, for n , m ∈ N , n , m > nε , it holds:
\[
|f_n(x) - f_m(x)| < \varepsilon
\]
for any x ∈ [a , b] . Now, taking the limit for m → +∞ , we get:
\[
|f_n(x) - f(x)| \le \varepsilon
\]
for any x ∈ [a , b] . This completes the proof.
Example 2.10. The sequence of functions $f_n(x) = x\,(1 + n x)^{-1}$ converges uniformly to f (x) = 0 in the interval [0 , 1] . Since fn (x) ≥ 0 for n ∈ N and for x ∈ [0 , 1] , we have:
\[
\sup_{x\in[0,1]} \frac{x}{1 + n x} = \frac{1}{1 + n} \to 0 \quad \text{as } n \to \infty .
\]

Figure 2.1: fn (x) = x(1 + n x)⁻¹ , n = 1 , . . . , 6 .
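The supremum in Example 2.10 can be confirmed on a fine grid (a numerical sketch of ours; the grid maximum here coincides with the true supremum because fₙ is increasing on [0, 1]):

```python
# f_n(x) = x/(1 + n*x) is increasing on [0, 1], so its supremum is attained
# at x = 1 and equals 1/(1 + n); a grid maximum reproduces this value.
def sup_norm(n, pts=10_001):
    return max((i / (pts - 1)) / (1 + n * i / (pts - 1)) for i in range(pts))

for n in (1, 10, 100):
    assert abs(sup_norm(n) - 1 / (1 + n)) < 1e-12
# The suprema 1/(1+n) tend to 0: the convergence to 0 is uniform.
```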



Example 2.11. The sequence of functions $f_n(x) = (1 + n x)^{-1}$ does not converge uniformly to f (x) = 0 in the interval ]0 , 1] . Despite the pointwise limit of fn being 0 for x ∈ ]0 , 1] , we have in fact:
\[
\sup_{x\in\,]0,1]} \frac{1}{1 + n x} = 1 .
\]

Figure 2.2: fn (x) = (1 + n x)⁻¹ , n = 1 , . . . , 6 .

Example 2.12. If α ∈ R⁺ , the sequence of functions, defined on R⁺ by $f_n(x) = n^{\alpha}\, x\, e^{-\alpha n x}$ , converges pointwise to 0 on R⁺ , and uniformly if α < 1 .
For x > 0 , in fact, by taking the logarithm, we obtain:
\[
\ln f_n(x) = \ln\left( n^{\alpha}\, x\, e^{-\alpha n x} \right) = \alpha \ln n + \ln x - \alpha\, n\, x .
\]
It follows that $\lim_{n\to\infty} \ln f_n(x) = -\infty$ and, then, $\lim_{n\to\infty} f_n(x) = 0$ . Pointwise convergence is proved.
For uniform convergence, we show that, for any n ∈ N , the associate function fn reaches its absolute maximum in R⁺ . By differentiating with respect to x , we obtain, in fact:
\[
f_n'(x) = n^{\alpha}\, e^{-\alpha n x}\, (1 - \alpha\, n\, x) ,
\]
from which we see that function fn assumes its maximum value in $x_n = \frac{1}{\alpha n}$ ; such a maximum is absolute, since fn (0) = 0 and $\lim_{x\to+\infty} f_n(x) = 0$ . We have thus shown that:
\[
\sup_{x\in\mathbb{R}^+} f_n(x) = f_n\!\left( \frac{1}{\alpha n} \right) = \frac{e^{-1}}{\alpha}\, n^{\alpha - 1} .
\]
Now, $\lim_{n\to\infty} \sup_{x\in\mathbb{R}^+} f_n(x) = \lim_{n\to\infty} \frac{e^{-1}}{\alpha}\, n^{\alpha-1} = 0$ , when α < 1 . Hence, in this case, convergence is indeed uniform.
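The behaviour of the suprema in Example 2.12 can be estimated on a grid (a rough numerical sketch of ours, not a proof; the asserted values do not depend on the closed form of the supremum):

```python
import math

# Grid estimate of sup f_n on [0, xmax] for f_n(x) = n**a * x * exp(-a*n*x).
# (xmax = 10 suffices: beyond it the functions are exponentially small.)
def sup_fn(n, a, xmax=10.0, pts=50_001):
    h = xmax / (pts - 1)
    return max(n**a * (i * h) * math.exp(-a * n * i * h) for i in range(pts))

# For a = 1/2 < 1 the suprema decrease towards 0 (uniform convergence) ...
s1, s10, s100 = (sup_fn(n, 0.5) for n in (1, 10, 100))
assert s100 < s10 < s1
# ... while for a = 1 the supremum is the constant 1/e for every n.
assert abs(sup_fn(50, 1.0) - 1 / math.e) < 1e-6
```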

In the following example, we compare two sequences of functions, apparently very similar; yet the first one converges only pointwise, while the second one converges uniformly.

Example 2.13. Consider the sequences of functions (fn ) and (gn ) , both defined on [0 , 1] :
\[
f_n(x) =
\begin{cases}
n^2\, x\, (1 - n x) & \text{if } 0 \le x < \dfrac{1}{n} ,\\[2mm]
0 & \text{if } \dfrac{1}{n} \le x \le 1 ,
\end{cases}
\tag{2.3}
\]
and
\[
g_n(x) =
\begin{cases}
n\, x^2\, (1 - n x) & \text{if } 0 \le x < \dfrac{1}{n} ,\\[2mm]
0 & \text{if } \dfrac{1}{n} \le x \le 1 .
\end{cases}
\tag{2.4}
\]

Sequence (fn ) converges pointwise to f (x) = 0 for x ∈ [0 , 1] ; in fact, it is fn (0) = 0 and fn (1) = 0 for any n ∈ N . When x ∈ (0 , 1) , since n₀ ∈ N exists such that $\frac{1}{n_0} < x$ , it follows that fn (x) = 0 for any n ≥ n₀ .
The convergence of (fn ) is not uniform; to show this, observe that $\xi_n = \frac{1}{2n}$ maximises fn , since:
\[
f_n'(x) =
\begin{cases}
n^2\, (1 - 2 n x) & \text{if } 0 \le x < \dfrac{1}{n} ,\\[2mm]
0 & \text{if } \dfrac{1}{n} \le x \le 1 .
\end{cases}
\]
It then follows:
\[
\sup_{x\in[0,1]} |f_n(x) - f(x)| = \sup_{x\in[0,1]} f_n(x) = f_n(\xi_n) = \frac{n}{4} ,
\]
which prevents uniform convergence. With similar considerations, we can prove that (gn ) converges pointwise to g(x) = 0 , and that the convergence is also uniform, since:
\[
g_n'(x) =
\begin{cases}
n\, x\, (2 - 3 n x) & \text{if } 0 \le x < \dfrac{1}{n} ,\\[2mm]
0 & \text{if } \dfrac{1}{n} \le x \le 1 ,
\end{cases}
\]
implying that $\eta_n = \frac{2}{3n}$ maximises gn and that:
\[
\sup_{x\in[0,1]} |g_n(x) - g(x)| = \sup_{x\in[0,1]} g_n(x) = g_n(\eta_n) = \frac{4}{27\, n} ,
\]
which ensures the uniform convergence of (gn ) .
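The two suprema n/4 and 4/(27 n) computed in Example 2.13 can be double-checked on a grid (our sketch, not part of the text):

```python
# Grid check of sup f_n = n/4 (does not vanish) and sup g_n = 4/(27*n) -> 0.
def fn(x, n):
    return n**2 * x * (1 - n * x) if x < 1 / n else 0.0

def gn(x, n):
    return n * x**2 * (1 - n * x) if x < 1 / n else 0.0

def sup01(func, n, pts=120_001):
    return max(func(i / (pts - 1), n) for i in range(pts))

for n in (2, 5, 10):
    assert abs(sup01(fn, n) - n / 4) < 1e-6         # attained at x = 1/(2n)
    assert abs(sup01(gn, n) - 4 / (27 * n)) < 1e-6  # attained at x = 2/(3n)
```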

Uniform convergence implies remarkable properties: if a sequence of continuous functions is uniformly convergent, in fact, its limit is also a continuous function.

Theorem 2.14. If (fn ) is a sequence of continuous functions on a closed and bounded interval [a , b] , which converges uniformly to f , then f is a continuous function.
Proof. Let f (x) be the limit of fn . Choose ε > 0 and x₀ ∈ [a , b] . Due to uniform convergence, there exists nε ∈ N such that, if n ∈ N , n > nε , then:
\[
\sup_{x\in[a,b]} |f_n(x) - f(x)| < \frac{\varepsilon}{3} . \tag{2.5}
\]
Using the continuity of fn , we can see that there exists δ > 0 such that:
\[
|f_n(x) - f_n(x_0)| < \frac{\varepsilon}{3} \tag{2.6}
\]
for any x ∈ [a , b] with |x − x₀| < δ .
To end the proof, we have to show that, given x₀ ∈ [a , b] , if x ∈ [a , b] is such that |x − x₀| < δ , then |f (x) − f (x₀)| < ε . By the triangle inequality:
\[
|f(x) - f(x_0)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)| .
\]
Observe that:
\[
|f(x) - f_n(x)| < \frac{\varepsilon}{3} , \quad |f_n(x_0) - f(x_0)| < \frac{\varepsilon}{3} , \quad |f_n(x) - f_n(x_0)| < \frac{\varepsilon}{3} ,
\]
the first two inequalities being due to (2.5), while the third one is due to (2.6). Hence:
\[
|f(x) - f(x_0)| < \varepsilon
\]
if |x − x₀| < δ ; this concludes the proof.



When we are in presence of uniform convergence, for a sequence of continuous functions defined on the bounded and closed interval [a , b] , then the following passage to the limit holds:
\[
\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n\to\infty} f_n(x)\,dx . \tag{2.7}
\]
We can, in fact, state the following Theorem 2.15.

Theorem 2.15. If (fn ) is a sequence of continuous functions on [a , b] , converging uniformly to f (x) , then (2.7) holds true.

Proof. From Theorem 2.14, f (x) is continuous, thus, it is Riemann integrable (see § 8.7.1). Now, choose ε > 0 so that nε ∈ N exists such that, for n ∈ N , n > nε :
\[
|f_n(x) - f(x)| < \frac{\varepsilon}{b - a} \quad \text{for any } x \in [a\,,b] . \tag{2.8}
\]
By integration:
\[
\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| \le \int_a^b |f_n(x) - f(x)|\,dx < (b - a)\, \frac{\varepsilon}{b - a} = \varepsilon ,
\]
which ends the proof.

Remark 2.16. The passage to the limit is sometimes possible under less restrictive hypotheses than Theorem 2.15. In the following example, passage to the limit is possible without uniform convergence. Consider the sequence in [0 , 1] , given by $f_n(x) = n\,x\,(1 - x)^n$ . For such a sequence, it is:
\[
\sup_{x\in[0,1]} |f_n(x)| = f_n\!\left( \frac{1}{n+1} \right) = \frac{n}{n+1} \left( 1 - \frac{1}{n+1} \right)^{\!n} ,
\]
thus, fn is not uniformly convergent, since it holds:
\[
\lim_{n\to\infty} \sup_{x\in[0,1]} |f_n(x)| = \frac{1}{e} \neq 0 .
\]
On the other hand, it holds that $f_n \xrightarrow{[0,1]} 0$ . Moreover, we can use integration by parts as follows:
\[
\lim_{n\to\infty} \int_0^1 f_n(x)\,dx
= \lim_{n\to\infty} \int_0^1 n\,x\,(1 - x)^n\,dx
= \lim_{n\to\infty} \int_0^1 n\,x \left( -\frac{(1-x)^{n+1}}{n+1} \right)'\,dx
= \lim_{n\to\infty} \int_0^1 \frac{n}{n+1}\,(1 - x)^{n+1}\,dx
= \lim_{n\to\infty} \frac{n}{(n+1)(n+2)} = 0 ,
\]
and it also holds:
\[
\int_0^1 \lim_{n\to\infty} f_n(x)\,dx = \int_0^1 0\,dx = 0 .
\]
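The value of the integral used in Remark 2.16 can be verified exactly with rational arithmetic (our check, expanding (1 − x)ⁿ by the binomial theorem):

```python
from fractions import Fraction
from math import comb

# Integrate n*x*(1-x)^n over [0,1] exactly: expand (1-x)^n = sum_k C(n,k)(-x)^k,
# multiply by n*x and integrate term by term: sum_k n*C(n,k)*(-1)^k/(k+2).
def integral(n):
    return sum(Fraction(n * comb(n, k) * (-1) ** k, k + 2) for k in range(n + 1))

for n in (1, 2, 5, 20):
    assert integral(n) == Fraction(n, (n + 1) * (n + 2))
# The integrals n/((n+1)(n+2)) tend to 0 even though the convergence of f_n
# to 0 is only pointwise, not uniform.
```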

Remark 2.17. Consider again the sequences of functions (2.3) and (2.4), defined on [0 , 1] , with fn → 0 and gn ⇒ 0 . Observing that:
\[
\int_0^1 f_n(x)\,dx = \int_0^{1/n} n^2\,x\,(1 - n x)\,dx = \frac{1}{6}
\]
and
\[
\int_0^1 g_n(x)\,dx = \int_0^{1/n} n\,x^2\,(1 - n x)\,dx = \frac{1}{12\,n^2} ,
\]
it follows:
\[
\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \frac{1}{6} \neq \int_0^1 f(x)\,dx = 0 ,
\]
while:
\[
\lim_{n\to\infty} \int_0^1 g_n(x)\,dx = \lim_{n\to\infty} \frac{1}{12\,n^2} = 0 = \int_0^1 g(x)\,dx .
\]
In other words, the pointwise convergence of (fn ) does not permit the passage to the limit, while the uniform convergence of (gn ) does.
We provide a second example to illustrate, again, that pointwise convergence, alone, does not allow the passage to the limit.

Example 2.18. Consider the sequence of functions (fn ) on [0 , 1] defined by:
\[
f_n(x) =
\begin{cases}
n^2\, x & \text{if } 0 \le x \le \dfrac{1}{n} ,\\[2mm]
2 n - n^2\, x & \text{if } \dfrac{1}{n} < x \le \dfrac{2}{n} ,\\[2mm]
0 & \text{if } \dfrac{2}{n} < x \le 1 .
\end{cases}
\]
Observe that each fn is a continuous function. Plots of fn are shown in Figure 2.3, for some values of n ; it is clear that, pointwise, fn (x) → 0 for n → ∞ .

Figure 2.3: Plot of functions fn (x) , n = 3 , . . . , 6 , in Example 2.18. Solid lines are used for even values of n ; dotted lines are employed for odd n .

By construction, though, each triangle in Figure 2.3 has area equal to 1 , thus, for any n ∈ N :
\[
\int_0^1 f_n(x)\,dx = 1 .
\]
In conclusion:
\[
1 = \lim_{n\to\infty} \int_0^1 f_n(x)\,dx \neq \int_0^1 \lim_{n\to\infty} f_n(x)\,dx = 0 .
\]
In presence of pointwise convergence alone, therefore, swapping between integral and limit is not possible.
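A direct check of Example 2.18 (our sketch): each fₙ is piecewise linear, its graph a triangle of base 2/n and height n, so the integral is 1 for every n while the pointwise limit is 0.

```python
# The graph of f_n is a triangle of base 2/n and height f_n(1/n) = n,
# so the integral over [0,1] equals (1/2)*(2/n)*n = 1 for every n.
def fn(x, n):
    if x <= 1 / n:
        return n**2 * x
    if x <= 2 / n:
        return 2 * n - n**2 * x
    return 0.0

for n in (3, 4, 5, 6):
    area = 0.5 * (2 / n) * fn(1 / n, n)
    assert abs(area - 1.0) < 1e-12
    assert fn(0.9, n) == 0.0   # pointwise, f_n(x) -> 0 for each fixed x > 0
```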

Uniform convergence leads to a third interesting consequence, connected to the behaviour of sequences of differentiable functions.

Theorem 2.19. Let (fn ) be a sequence of continuous functions on [a , b] . Assume that each fn is differentiable, with continuous derivative, and that:

(i) $\lim_{n\to\infty} f_n(x) = f(x)$ , for any x ∈ [a , b] ;

(ii) (fn′ ) converges uniformly in [a , b] .

Then f (x) is differentiable and:
\[
\lim_{n\to\infty} f_n'(x) = f'(x) .
\]
Proof. Define g(x) as:
\[
g(x) = \lim_{n\to\infty} f_n'(x) ,
\]
and recall that such a limit is uniform. For Theorems 2.14 and 2.15, g(x) is continuous on [a , b] . A classical result from Calculus states that, for any x ∈ [a , b] :
\[
\int_a^x g(t)\,dt = \lim_{n\to\infty} \int_a^x f_n'(t)\,dt = \lim_{n\to\infty} \left( f_n(x) - f_n(a) \right) = f(x) - f(a) .
\]
This means that f (x) is differentiable and its derivative is g(x) .

The hypotheses of Theorem 2.19 are essential, as shown by the following example.

Example 2.20. Consider the sequence (fn ) on the open interval ] − 1 , 1[ :
\[
f_n(x) = \frac{2 x}{1 + n^2 x^2} , \quad n \in \mathbb{N} .
\]
Observe that fn converges to 0 uniformly in ] − 1 , 1[ , since:
\[
\sup_{x\in\,]-1,1[} |f_n(x)| = \frac{1}{n} \xrightarrow[n\to\infty]{} 0 .
\]
Function fn is differentiable for any n ∈ N and, for any x ∈ ] − 1 , 1[ and any n ∈ N , the derivative of fn , with respect to x , is:
\[
f_n'(x) = \frac{2\,(1 - n^2 x^2)}{(1 + n^2 x^2)^2} .
\]
Now, consider function g : ] − 1 , 1[ → R :
\[
g(x) =
\begin{cases}
0 & \text{if } x \neq 0 ,\\
2 & \text{if } x = 0 .
\end{cases}
\]
Clearly, $f_n' \xrightarrow{]-1,1[} g$ ; such a convergence holds pointwise, but not uniformly; by Theorem 2.14, in fact, uniform convergence of (fn′ ) would imply g to be continuous, which is not, in this case. Here, the hypotheses of Theorem 2.19 are not fulfilled, thus its thesis does not hold.

We end this section with Theorem 2.21, due to Dini1 , and the important Corollary 2.23, a consequence
of the Dini Theorem, very useful in many applications. Theorem and corollary connect monotonicity
and uniform convergence for a sequence of functions; for their proof, we refer the Reader to [16].

Theorem 2.21 (Dini). Let (fn ) be a sequence of continuous functions, converging pointwise to a
continuous function f , defined on the interval [a , b] .
Furthermore, assume that, for any x ∈ [a , b] and for any n ∈ N , it holds fn (x) ≥ fn+1 (x) . Then fn
converges uniformly to f in [a , b] .

Remark 2.22. In Theorem 2.21, hypothesis fn (x) ≥ fn+1 (x) can be replaced with its reverse mono-
tonicity assumption fn (x) ≤ fn+1 (x) , obtaining the same thesis.
¹Ulisse Dini (1845–1918), Italian mathematician and politician.

Corollary 2.23 (Dini). Let (fn ) be a sequence of nonnegative, continuous and integrable functions, defined on R , and assume that it converges pointwise to f , which is also nonnegative, continuous and integrable. Suppose further that it is either 0 ≤ fn (x) ≤ fn+1 (x) ≤ f (x) or 0 ≤ f (x) ≤ fn+1 (x) ≤ fn (x) , for any x ∈ R and any n ∈ N . Then:
\[
\lim_{n\to\infty} \int_{-\infty}^{+\infty} f_n(x)\,dx = \int_{-\infty}^{+\infty} f(x)\,dx .
\]

Example 2.24. Let us consider an application of Theorem 2.21 and Corollary 2.23. Define $f_n(x) = x^n \sin(\pi x)$ , x ∈ [0 , 1] . It is immediate to see that, for any x ∈ [0 , 1] :
\[
\lim_{n\to\infty} x^n \sin(\pi x) = 0 .
\]
Moreover, since it is 0 ≤ f (x) ≤ fn+1 (x) ≤ fn (x) for any x ∈ [0 , 1] , the convergence is uniform and, then:
\[
\lim_{n\to\infty} \int_0^1 x^n \sin(\pi x)\,dx = 0 .
\]
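The conclusion of Example 2.24 can also be seen numerically (a midpoint-rule sketch of ours):

```python
import math

# Midpoint-rule approximation of the integral of x^n * sin(pi*x) over [0, 1].
def integral(n, pts=20_000):
    h = 1.0 / pts
    return h * sum(((i + 0.5) * h) ** n * math.sin(math.pi * (i + 0.5) * h)
                   for i in range(pts))

vals = [integral(n) for n in (1, 5, 25, 125)]
assert all(a > b for a, b in zip(vals, vals[1:]))  # the integrands decrease in n
assert vals[-1] < 0.01                             # and the integrals tend to 0
```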

2.4 Series of functions


The process of transformation of a sequence of real numbers into an infinite series works, also, when
extending sequences of functions into series of functions.

Definition 2.25. The series of functions
\[
\sum_{n=1}^{\infty} f_n(x) = f_1(x) + f_2(x) + \cdots + f_m(x) + \cdots \tag{2.9}
\]
converges in [a , b] , if the sequence of its partial sums:
\[
s_n(x) = \sum_{k=1}^{n} f_k(x) \tag{2.10}
\]
converges in [a , b] . The same applies to uniform convergence, that is, if (2.10) converges uniformly in [a , b] , then (2.9) converges uniformly in [a , b] .

Remark 2.26. Defining rn (x) := f (x) − sn (x) , where f (x) denotes the sum of the series, then (2.9) converges uniformly in [a , b] if, for any ε > 0 , there exists nε such that:
\[
\sup_{x\in[a,b]} |r_n(x)| < \varepsilon , \quad \text{for any } n > n_\varepsilon .
\]

The following Theorem 2.27, due to Weierstrass and known as the Weierstrass M–test, establishes a sufficient condition to ensure the uniform convergence of a series of functions.

Theorem 2.27 (Weierstrass M–test). Let (fn ) be a sequence of functions defined on [a , b] . Assume that, for any n ∈ N , there exists a real number Mn ≥ 0 such that |fn (x)| ≤ Mn for any x ∈ [a , b] . Moreover, assume convergence for the numerical series:
\[
\sum_{n=1}^{\infty} M_n .
\]
Then (2.9) converges uniformly in [a , b] .



Proof. For the Cauchy criterion of convergence (Theorem 2.9), the series of functions (2.9) converges uniformly if and only if, for any ε > 0 , there exists nε ∈ N such that:
\[
\sup_{x\in[a,b]} \left| \sum_{k=n+1}^{m} f_k(x) \right| < \varepsilon , \quad \text{for any } m > n > n_\varepsilon .
\]
In our case, once ε > 0 is fixed, since the numerical series $\sum_{n=1}^{\infty} M_n$ converges, there exists nε ∈ N such that:
\[
\sum_{k=n+1}^{m} M_k < \varepsilon , \quad \text{for any } m > n > n_\varepsilon .
\]
Now, we use the triangle inequality:
\[
\sup_{x\in[a,b]} \left| \sum_{k=n+1}^{m} f_k(x) \right| \le \sup_{x\in[a,b]} \sum_{k=n+1}^{m} |f_k(x)| \le \sum_{k=n+1}^{m} M_k < \varepsilon .
\]
This proves the theorem.

Theorem 2.15 is useful for swapping between sum and integral of a series, and Theorem 2.19 for term–by–term differentiability. We now state three further helpful theorems.

Theorem 2.28. If (fn ) is a sequence of continuous functions on [a , b] and if their series (2.9) converges uniformly on [a , b] , then:
\[
\sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx = \int_a^b \left( \sum_{n=1}^{\infty} f_n(x) \right) dx . \tag{2.11}
\]
Proof. Define:
\[
f(x) := \sum_{n=1}^{\infty} f_n(x) = \lim_{n\to\infty} \sum_{m=1}^{n} f_m(x) . \tag{2.11a}
\]
By Theorem 2.14, function f (x) is continuous and:
\[
\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{m=1}^{n} \int_a^b f_m(x)\,dx . \tag{2.11b}
\]
Now, using the linearity of the integral:
\[
\sum_{m=1}^{n} \int_a^b f_m(x)\,dx = \int_a^b \sum_{m=1}^{n} f_m(x)\,dx = \int_a^b s_n(x)\,dx .
\]
From Theorem 2.15, the thesis (2.11) then follows.

We state (without proof) a more general result, that is not based on uniform convergence, but only on simple convergence and a few other assumptions.

Theorem 2.29. Let (fn ) be a sequence of functions on an interval [a , b] ⊂ R . Assume that each fn is both piecewise continuous and integrable on [a , b] , and that (2.9) converges pointwise, on [a , b] , to a piecewise continuous function f .
Moreover, assume convergence for the numerical (positive terms) series:
\[
\sum_{n=1}^{\infty} \int_a^b |f_n(x)|\,dx .
\]
Then the limit function f is integrable in [a , b] and:
\[
\int_a^b f(x)\,dx = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx .
\]

Example 2.30. The series of functions:
\[
\sum_{n=1}^{\infty} \frac{\sin(n x)}{n^2} \tag{2.12}
\]
converges uniformly on any interval [a , b] . It is, in fact, easy to use the Weierstrass Theorem 2.27 and verify that:
\[
\left| \frac{\sin(n x)}{n^2} \right| \le \frac{1}{n^2} .
\]
Our statement follows from the convergence of the infinite series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ , shown in formula (2.71) of § 2.7, later on.
Moreover, if f (x) denotes the sum of the series (2.12), then, due to the uniform convergence:
\[
\int_0^{\pi} f(x)\,dx = \int_0^{\pi} \sum_{n=1}^{\infty} \frac{\sin(n x)}{n^2}\,dx = \sum_{n=1}^{\infty} \frac{1}{n^2} \int_0^{\pi} \sin(n x)\,dx = \sum_{n=1}^{\infty} \frac{1 - \cos(n \pi)}{n^3} .
\]
Now, observe that:
\[
1 - \cos(n \pi) =
\begin{cases}
2 & \text{if } n \text{ is odd,}\\
0 & \text{if } n \text{ is even.}
\end{cases}
\]
It is thus possible to infer:
\[
\int_0^{\pi} f(x)\,dx = \sum_{n=1}^{\infty} \frac{2}{(2 n - 1)^3} .
\]
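The identity obtained in Example 2.30 lends itself to a numerical cross-check (our sketch, with both the series and the integral truncated and approximated):

```python
import math

# Truncate the series at N terms and compare a midpoint-rule value of the
# integral of f over [0, pi] with the closed-form sum 2/(2n-1)^3.
N = 300

def f(x):
    return sum(math.sin(n * x) / n**2 for n in range(1, N + 1))

M = 1500
h = math.pi / M
integral = h * sum(f((i + 0.5) * h) for i in range(M))
series = sum(2 / (2 * n - 1) ** 3 for n in range(1, N + 1))

assert abs(integral - series) < 1e-3
```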

Theorem 2.31. Assume that (2.9), defined on [a , b] , converges uniformly, and assume that each fn has continuous derivative fn′ (x) for any x ∈ [a , b] ; assume further that the series of the derivatives is uniformly convergent. If f (x) denotes the sum of the series (2.9), then f (x) is differentiable and, for any x ∈ [a , b] :
\[
f'(x) = \sum_{n=1}^{\infty} f_n'(x) . \tag{2.13}
\]
The derivatives at the extreme points a and b are obviously understood as right and left derivatives, respectively.

Proof. We present here the proof given in [33]. Let us denote by g(x) , with x ∈ [a , b] , the sum of the series of the derivatives fn′ (x) :
\[
g(x) = \sum_{n=1}^{\infty} f_n'(x) .
\]
By Theorem 2.14, function g(x) is continuous and, by Theorem 2.15, we can integrate term by term in [a , x] :
\[
\int_a^x g(\xi)\,d\xi = \sum_{n=1}^{\infty} \int_a^x f_n'(\xi)\,d\xi = \sum_{n=1}^{\infty} \left( f_n(x) - f_n(a) \right) = \sum_{n=1}^{\infty} f_n(x) - \sum_{n=1}^{\infty} f_n(a) , \tag{2.14}
\]
where linearity of the sum is used in the last step of the chain of equalities. Now, recalling definition (2.11a) of f (x) , formula (2.14) can be rewritten as:
\[
\int_a^x g(\xi)\,d\xi = f(x) - f(a) . \tag{2.14a}
\]
Differentiating both sides of (2.14a), by the Fundamental Theorem of Calculus² we obtain g(x) = f ′(x) , which means that:
\[
f'(x) = g(x) = \sum_{n=1}^{\infty} f_n'(x) .
\]
Hence, the proof is completed.

Uniform convergence of series of functions satisfies linearity properties, expressed by the following Theorem 2.32, whose proof is left as an exercise.

Theorem 2.32. Given two uniformly convergent series of functions in [a , b] :
\[
\sum_{n=1}^{\infty} f_n(x) , \qquad \sum_{n=1}^{\infty} g_n(x) ,
\]
the following series converges uniformly for any α , β ∈ R :
\[
\sum_{n=1}^{\infty} \left( \alpha f_n(x) + \beta g_n(x) \right) .
\]
Moreover, if h(x) is a continuous function, defined on [a , b] , then the following series is uniformly convergent:
\[
\sum_{n=1}^{\infty} h(x) f_n(x) .
\]

2.5 Power series: radius of convergence


The problem dealt with in this section is as follows. Consider a sequence of real numbers (an )n≥0 and the function defined by the so–called power series:
\[
f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n . \tag{2.15}
\]
Given x₀ ∈ R , it is important to find all the values x ∈ R such that the series of functions (2.15) converges.

Example 2.33. With x₀ = 0 , the power series:
\[
\sum_{n=0}^{\infty} \frac{\left( (2n)! \right)^2}{16^n \left( n! \right)^4}\, x^n
\]
converges for |x| < 1 .

Remark 2.34. It is not restrictive, by using a translation, to consider the following simplified–form power series, obtained from (2.15) with x₀ = 0 :
\[
\sum_{n=0}^{\infty} a_n x^n . \tag{2.16}
\]
Obviously, the choice of x in (2.16) determines the convergence of the series. The following Lemma 2.35 is of some importance.

Lemma 2.35. If (2.16) converges for x = r₀ then, for any 0 ≤ r < |r₀| , it is absolutely and uniformly convergent in [−r , r] .
²See, for example, mathworld.wolfram.com/FundamentalTheoremsofCalculus.html


X
Proof. It is assumed the convergence of the numerical series an r0n , that is to say, there exists a
n=0
∞ 
r X r n
positive constant K such that an r0n ≤ K . Since < 1 , then the geometrical series
r0 r0
n=0
converges. Now, for any n ≥ 0 and any x ∈ [−r, r] :
n
n n
an xn = an r0n x ≤ K x ≤ K r .

r0 r0 r0 (2.17)

By Theorem 2.27, inequality (2.17) implies that (2.16) is uniformly convergent. Due to positivity, the
convergence is also absolute.

From Lemma 2.35 follows the fundamental Theorem 2.36, due to Cauchy and Hadamard³, which explains the behaviour of a power series.

Theorem 2.36 (Cauchy–Hadamard). Given the power series (2.16), then only one of the following alternatives holds:

(i) series (2.16) converges for any x ;

(ii) series (2.16) converges only for x = 0 ;

(iii) there exists a positive number r such that series (2.16) converges for any x ∈ ] − r , r [ and diverges for any x ∈ ] − ∞ , −r [ ∪ ] r , +∞[ .

Proof. Define the set:
\[
C := \left\{ x \in [\,0\,,+\infty\,[ \ : \ \sum_{n=0}^{\infty} a_n x^n \ \text{converges} \right\} .
\]
If C = [ 0 , +∞ [ , then (i) holds. Otherwise, C is bounded. If C = { 0 } , then (ii) holds. If both (i) and (ii) are not true, then there exists the positive real number r = sup C . Now, choose any y ∈ ] − r , r [ and form $\bar{y} = \frac{|y| + r}{2}$ . Since ȳ is not an upper bound of C , a number z ≥ ȳ exists for which the series
\[
\sum_{n=0}^{\infty} a_n z^n
\]
converges. As a consequence, by Lemma 2.35, series (2.16) converges for any x ∈ ] − z , z [ and, in particular, the series
\[
\sum_{n=0}^{\infty} a_n y^n
\]
converges.
To end the proof, take |y| > r and assume, by contradiction, that the series
\[
\sum_{n=0}^{\infty} a_n y^n
\]
still converges. If so, using Lemma 2.35, it would follow that series (2.16) converges for any x ∈ ] − |y| , |y| [ and, in particular, it would converge for the number:
\[
\frac{|y| + r}{2} > r ,
\]
which contradicts the assumption r = sup C .

³Jacques Salomon Hadamard (1865–1963), French mathematician.

Definition 2.37. The interval within which (2.16) converges is called interval of convergence and r is called radius of convergence.

The radius of convergence can be calculated as stated in Theorem 2.38.

Theorem 2.38 (Radius of convergence). Consider the power series (2.16) and assume that the following limit exists:
\[
\ell = \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| .
\]
Then:

(i) if ℓ = ∞ , series (2.16) converges only for x = 0 ;

(ii) if ℓ = 0 , series (2.16) converges for all x ;

(iii) if ℓ > 0 , series (2.16) converges for $|x| < \frac{1}{\ell}$ .

Therefore $r = \frac{1}{\ell}$ is the radius of convergence of (2.16).

Proof. Consider the series:
\[
\sum_{n=0}^{\infty} |a_n x^n| = \sum_{n=0}^{\infty} |a_n|\, |x|^n
\]
and apply the ratio test, that is to say, study the limit of the fraction between the (n + 1)–th term and the n–th term in the series:
\[
\lim_{n\to\infty} \frac{|a_{n+1}|\, |x|^{n+1}}{|a_n|\, |x|^n} = |x| \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = |x|\, \ell .
\]
If ℓ = 0 , then series (2.16) converges for any x ∈ R , since it holds:
\[
\lim_{n\to\infty} \frac{|a_{n+1}|\, |x|^{n+1}}{|a_n|\, |x|^n} = 0 < 1 .
\]
If ℓ > 0 , then:
\[
\lim_{n\to\infty} \frac{|a_{n+1}|\, |x|^{n+1}}{|a_n|\, |x|^n} = |x|\, \ell < 1 \iff |x| < \frac{1}{\ell} .
\]
Eventually, if ℓ = ∞ , series (2.16) does not converge when x ≠ 0 , since it is:
\[
\lim_{n\to\infty} \frac{|a_{n+1}|\, |x|^{n+1}}{|a_n|\, |x|^n} > 1 ,
\]
while, for x = 0 , series (2.16) reduces to the zero series, which converges trivially.

Example 2.39. The power series (2.18), known as geometric series, has radius of convergence r = 1 :
\[
\sum_{n=0}^{\infty} x^n . \tag{2.18}
\]
Proof. In (2.18), it is an = 1 for all n ∈ N , thus:
\[
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = 1 ,
\]
which means that series (2.18) converges for −1 < x < 1 . At the boundary of the interval of convergence, namely x = 1 and x = −1 , the geometric series (2.18) does not converge. In conclusion, the interval of convergence of (2.18) is the open interval ] − 1 , 1 [ .

Example 2.40. The power series (2.19) has radius of convergence r = 1 :
\[
\sum_{n=1}^{\infty} \frac{x^n}{n} . \tag{2.19}
\]
Proof. Here, $a_n = \frac{1}{n}$ , thus:
\[
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{n}{n+1} = 1 ,
\]
that is, (2.19) converges for −1 < x < 1 .
At the boundary of the interval of convergence, (2.19) behaves as follows: when x = 1 , it reduces to the divergent harmonic series:
\[
\sum_{n=1}^{\infty} \frac{1}{n} ,
\]
while, when x = −1 , (2.19) reduces to the convergent alternate signs series:
\[
\sum_{n=1}^{\infty} (-1)^n\, \frac{1}{n} .
\]
The interval of convergence of (2.19) is, thus, [−1 , 1 [ .

Example 2.41. Series (2.20), given below, has infinite radius of convergence:
\[
\sum_{n=0}^{\infty} \frac{x^n}{n!} . \tag{2.20}
\]
Proof. Since it is $a_n = \frac{1}{n!}$ for any n ∈ N , it follows that:
\[
\lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{1}{n+1} = 0 .
\]
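The ratio-test computation of Theorem 2.38 is easy to script (our sketch; exact rational arithmetic avoids float overflow for the factorial coefficients):

```python
from fractions import Fraction
from math import factorial

# |a_{n+1}/a_n| for the coefficient sequences of Examples 2.39, 2.40 and 2.41.
def ratio(a, n):
    return float(abs(Fraction(a(n + 1)) / Fraction(a(n))))

geometric = lambda n: Fraction(1)                  # (2.18): l = 1, radius r = 1
harmonic = lambda n: Fraction(1, n + 1)            # (2.19), shifted to start at n = 0: l = 1
exponential = lambda n: Fraction(1, factorial(n))  # (2.20): l = 0, infinite radius

n = 1000
assert ratio(geometric, n) == 1.0
assert abs(ratio(harmonic, n) - 1.0) < 1e-2
assert ratio(exponential, n) < 1e-2
```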

It is possible to differentiate and integrate power series, as stated in the following Theorem 2.42, which
we include for completeness, as it represents a particular case of Theorems 2.28 and 2.31.

Theorem 2.42. Let f (x) be the sum of the power series (2.16), with radius of convergence r . The
following results hold.

(i) f (x) is differentiable and, for any |x| < r , it is:



X
f 0 (x) = n an xn−1 ;
n=1

(ii) if F (x) is the primitive of f (x) , which vanishes for x = 0 , then:



X an n+1
F (x) = x .
n+1
n=0

The radius of convergence of both power series f 0 (x) and F (x) is that of f (x) .

Power series behave nicely with respect to the usual arithmetical operations, as shown in Theorem 2.43, which states some useful results.

Theorem 2.43. Consider two power series, with radii of convergence r1 and r2 respectively:
\[
f_1(x) = \sum_{n=0}^{\infty} a_n x^n , \qquad f_2(x) = \sum_{n=0}^{\infty} b_n x^n . \tag{2.21}
\]

Then r = min{r1 , r2} is the radius of convergence of:
\[
(f_1 + f_2)(x) = \sum_{n=0}^{\infty} (a_n + b_n)\, x^n .
\]

If α ∈ R , then r1 is the radius of convergence of:
\[
\alpha f_1(x) = \sum_{n=0}^{\infty} \alpha\, a_n x^n .
\]

We state, without proof, Theorem 2.44, concerning the product of two power series.

Theorem 2.44. Consider the two power series in (2.21), with radii of convergence r1 and r2 respectively. The product of the two power series is defined by the Cauchy formula:
\[
\sum_{n=0}^{\infty} c_n x^n , \quad \text{where} \quad c_n = \sum_{j=0}^{n} a_j b_{n-j} , \tag{2.22}
\]

that is:
\[
\begin{aligned}
c_0 &= a_0 b_0 , \\
c_1 &= a_0 b_1 + a_1 b_0 , \\
&\;\;\vdots \\
c_n &= a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0 .
\end{aligned}
\]

Series (2.22) has interval of convergence given by |x| < r = min{r1 , r2 } , and its sum is the pointwise
product f1 (x) f2 (x) .
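The Cauchy formula (2.22) is easy to exercise numerically. In the sketch below (ours, standard library only), the geometric series is multiplied by itself: with a_n = b_n = 1 one gets c_n = n + 1, i.e. the coefficients of the expansion of 1/(1−x)²:

```python
# Cauchy product of the geometric series with itself:
# a_n = b_n = 1, so c_n = sum_{j=0}^{n} 1 = n + 1.
N = 30
a = [1] * N
b = [1] * N
c = [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(N)]
assert c == list(range(1, N + 1))

# The product series sums to the pointwise product 1/(1-x) * 1/(1-x).
x = 0.25
prod_series = sum(c[n] * x**n for n in range(N))
assert abs(prod_series - 1 / (1 - x)**2) < 1e-9
```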

2.6 Taylor–MacLaurin series


Our starting point, here, is the Taylor⁴ formula with Lagrange⁵ remainder term. Let f : I → R be a function that admits derivatives of any order at x0 ∈ I . The Taylor–Lagrange Theorem states that, if x ∈ I , then there exists a real number ξ , between x and x0 , such that:
\[
f(x) = P_n(f(x), x_0) + R_n(f(x), x_0) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x-x_0)^k + \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-x_0)^{n+1} . \tag{2.23}
\]

Since f has derivatives of any order, we may form the limit of (2.23) as n → ∞ ; a condition is stated
in Theorem 2.45 to detect when the passage to the limit is effective.
Theorem 2.45. If f has derivatives of any order in the open interval I , with x0 , x ∈ I , and if:
\[
\lim_{n\to\infty} R_n(f(x), x_0) = \lim_{n\to\infty} \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-x_0)^{n+1} = 0 ,
\]
then:
\[
f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}\,(x-x_0)^n . \tag{2.24}
\]
⁴ Brook Taylor (1685–1731), English mathematician.
⁵ Giuseppe Luigi Lagrange (1736–1813), Italian mathematician.

Definition 2.46. A function f (x) , defined on an open interval I , is analytic at x0 ∈ I , if its Taylor
series about x0 converges to f (x) in some neighborhood of x0 .
Remark 2.47. Assuming the existence of derivatives of any order is not enough to infer that a function is analytic and, thus, that it can be represented by a convergent power series. For instance, the function:
\[
f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0 , \\ 0 & \text{if } x = 0 , \end{cases}
\]
has derivatives of any order at x0 = 0 , but such derivatives are all zero, therefore the Taylor series reduces to the zero function. This happens because the Lagrange remainder does not vanish as n → ∞ .

Note that most of the functions of interest to us do not exhibit the behaviour shown in Remark 2.47. The series expansions of the most important, commonly used functions can be inferred from Equation (2.23), i.e., from the Taylor–Lagrange Theorem, and Theorem 2.45 yields a sufficient condition to ensure that a given function is analytic.
Corollary 2.48. Consider f with derivatives of any order in the interval I = ] a , b [ . Assume that
there exist L , M > 0 such that, for any n ∈ N ∪ {0} and for any x ∈ I :

\[
\left| f^{(n)}(x) \right| \leq M\, L^n . \tag{2.25}
\]

Then, for any x0 ∈ I , function f (x) coincides with its Taylor series in I .
Proof. Assume x > x0 . The Lagrange remainder for f (x) is given by:
\[
R_n(f(x), x_0) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-x_0)^{n+1} ,
\]
where ξ ∈ (x0 , x) , which can be written as ξ = x0 + α(x − x0) , with 0 < α < 1 . Now, using condition (2.25), it follows:
\[
|R_n(f(x), x_0)| \leq M\, \frac{\big(L\,(b-a)\big)^{n+1}}{(n+1)!} .
\]
The thesis follows from the limit:
\[
\lim_{n\to\infty} \frac{\big(L\,(b-a)\big)^{n+1}}{(n+1)!} = 0 .
\]

Corollary 2.48, together with Theorem 2.42, makes it possible to find the power series expansion of the most common elementary functions. Theorem 2.49 concerns a first group of power series that converge for any x ∈ R .
Theorem 2.49. For any x ∈ R , the exponential power series expansion holds:
\[
e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} . \tag{2.26}
\]
The goniometric and hyperbolic power series expansions also hold:
\[
\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} , \tag{2.27}
\]
\[
\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} , \tag{2.28}
\]
\[
\sinh x = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} , \tag{2.29}
\]
\[
\cosh x = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} . \tag{2.30}
\]

Proof. First, observe that the general term in each of the five series (2.26)–(2.30) comes from the MacLaurin⁶ formula.
To show that f (x) = e^x is the sum of the series (2.26), let us use the fact that f^{(n)}(x) = e^x for any n ∈ N ; in this way, it is possible to infer that, in any interval [a , b] , inequality (2.25) is fulfilled if we take L = 1 and M = max{e^x : x ∈ [a, b]} .
To prove (2.27) and (2.28), in which derivatives of the goniometric functions sin x and cos x are considered, condition (2.25) is immediately verified by taking M = 1 and L = 1 .
Finally, (2.29) and (2.30) are a straightforward consequence of the definition of the hyperbolic functions in terms of the exponential:
\[
\cosh x = \frac{e^x + e^{-x}}{2} , \qquad \sinh x = \frac{e^x - e^{-x}}{2} ,
\]
together with Theorem 2.43.
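The expansions (2.27)–(2.30) can be checked against the corresponding functions of Python's math module; the sketch below (our addition) compares 40-term partial sums at a sample point:

```python
import math

# Compare 40-term partial sums of the four series with the library functions.
x = 1.7
sin_s  = sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(40))
cos_s  = sum((-1)**n * x**(2*n) / math.factorial(2*n) for n in range(40))
sinh_s = sum(x**(2*n + 1) / math.factorial(2*n + 1) for n in range(40))
cosh_s = sum(x**(2*n) / math.factorial(2*n) for n in range(40))

for series, ref in [(sin_s, math.sin(x)), (cos_s, math.cos(x)),
                    (sinh_s, math.sinh(x)), (cosh_s, math.cosh(x))]:
    assert abs(series - ref) < 1e-12
```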

Theorem 2.50 concerns a second group of power series, converging for |x| < 1 .

Theorem 2.50. If |x| < 1 , the following power series expansions hold:
\[
\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n , \tag{2.31}
\]
\[
\frac{1}{(1-x)^2} = \sum_{n=0}^{\infty} (n+1)\, x^n , \tag{2.32}
\]
\[
\ln(1-x) = -\sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} , \tag{2.33}
\]
\[
\ln(1+x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1} , \tag{2.34}
\]
\[
\arctan x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1} , \tag{2.35}
\]
\[
\ln\left(\frac{1+x}{1-x}\right) = 2 \sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1} . \tag{2.36}
\]

Proof. To prove (2.31), define f (x) = 1/(1 − x) and build the MacLaurin polynomial of order n , which is P_n(f(x), 0) = 1 + x + x² + ⋯ + xⁿ ; the remainder can thus be estimated directly:
\[
R_n(f(x), 0) = f(x) - P_n(f(x), 0) = \frac{1}{1-x} - (1 + x + x^2 + \cdots + x^n) = \frac{1}{1-x} - \frac{1-x^{n+1}}{1-x} = \frac{x^{n+1}}{1-x} . \tag{2.37}
\]
Assuming |x| < 1 , we see that the remainder vanishes for n → ∞ , thus (2.31) follows.
Identity (2.32) can be proven by employing both formula (2.31) and Theorem 2.44, with a_n = b_n = 1 .
To obtain (2.33), the geometric series in (2.31) can be integrated term by term, using Theorem 2.42; in fact, letting |x| < 1 , we can consider the integral:
\[
\int_0^x \frac{dt}{1-t} = -\ln(1-x) .
\]
Now, from Theorem 2.42 it follows:
\[
\int_0^x \sum_{n=0}^{\infty} t^n \, dt = \sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1} .
\]

Formula (2.33) is then a consequence of formula (2.31). Formula (2.34) can be proven analogously to
(2.33), by considering −x instead of x .
⁶ Colin Maclaurin (1698–1746), Scottish mathematician.

To prove (2.35), we use again formula (2.31), with x replaced by −t² , so that we have:
\[
\frac{1}{1+t^2} = \sum_{n=0}^{\infty} (-1)^n t^{2n} .
\]
Integrating and invoking Theorem 2.42, we obtain:
\[
\arctan x = \int_0^x \frac{dt}{1+t^2} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1} .
\]

Finally, to prove (2.36), let us consider x = t² in formula (2.31), so that:
\[
\frac{1}{1-t^2} = \sum_{n=0}^{\infty} t^{2n} . \tag{2.38}
\]
Integrating, taking |x| < 1 , and using Theorem 2.42, the following result is obtained:
\[
\int_0^x \frac{dt}{1-t^2} = \frac{1}{2} \int_0^x \left( \frac{1}{1+t} + \frac{1}{1-t} \right) dt = \frac{1}{2} \ln\left(\frac{1+x}{1-x}\right) = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1} .
\]
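The expansions of Theorem 2.50 can be verified numerically; the Python sketch below (ours) compares partial sums of (2.34), (2.35) and (2.36) with the library values at x = 0.5:

```python
import math

x, N = 0.5, 400
log1p_s  = sum((-1)**n * x**(n + 1) / (n + 1) for n in range(N))     # (2.34)
arctan_s = sum((-1)**n * x**(2*n + 1) / (2*n + 1) for n in range(N)) # (2.35)
atanh_s  = 2 * sum(x**(2*n + 1) / (2*n + 1) for n in range(N))       # (2.36)

assert abs(log1p_s - math.log(1 + x)) < 1e-12
assert abs(arctan_s - math.atan(x)) < 1e-12
assert abs(atanh_s - math.log((1 + x) / (1 - x))) < 1e-12
```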

2.6.1 Binomial series


The role of the so–called binomial series is pivotal. Let us recall the binomial formula (2.39). If n ∈ N and x ∈ R , then:
\[
(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k , \tag{2.39}
\]
where the binomial coefficient is defined as:
\[
\binom{n}{k} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k!} . \tag{2.40}
\]
Observe that the right hand side of (2.40) does not require n to be a natural number. Therefore, if α ∈ R and n ∈ N , the generalized binomial coefficient is defined as:
\[
\binom{\alpha}{n} = \frac{\alpha \cdot (\alpha-1) \cdots (\alpha-n+1)}{n!} . \tag{2.41}
\]
From (2.41) a useful property of the generalized binomial coefficient can be inferred, which will later be used to expand in power series the function f (x) = (1 + x)^α .
Proposition 2.51. For any α ∈ R and any n ∈ N , the following identity holds:
\[
n \binom{\alpha}{n} + (n+1) \binom{\alpha}{n+1} = \alpha \binom{\alpha}{n} . \tag{2.42}
\]
Proof. The thesis follows from a straightforward computation:
\[
\begin{aligned}
n \binom{\alpha}{n} + (n+1) \binom{\alpha}{n+1}
&= n \binom{\alpha}{n} + (n+1)\, \frac{\alpha (\alpha-1) \cdots (\alpha-n)}{(n+1)!} \\
&= n \binom{\alpha}{n} + \frac{\alpha (\alpha-1) \cdots (\alpha-n)}{n!} \\
&= n \binom{\alpha}{n} + (\alpha-n)\, \frac{\alpha (\alpha-1) \cdots (\alpha-n+1)}{n!} \\
&= n \binom{\alpha}{n} + (\alpha-n) \binom{\alpha}{n} = \alpha \binom{\alpha}{n} .
\end{aligned}
\]

By using Proposition 2.51, it is possible to prove the so–called generalised Binomial Theorem 2.52.

Theorem 2.52 (Generalised Binomial). For any α ∈ R and |x| < 1 , the following identity holds:
\[
(1+x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n . \tag{2.43}
\]
Proof. Let us denote by f (x) the sum of the generalised binomial series:
\[
f(x) = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n ,
\]
and introduce the function g(x) as follows:
\[
g(x) = \frac{f(x)}{(1+x)^{\alpha}} .
\]
To prove the thesis, let us show that g(x) = 1 for any |x| < 1 . Differentiating g(x) we obtain:
\[
g'(x) = \frac{(1+x)\, f'(x) - \alpha f(x)}{(1+x)^{\alpha+1}} . \tag{2.44}
\]

Moreover, differentiating f (x) term by term, using Theorem 2.42, we get:
\[
\begin{aligned}
(1+x)\, f'(x) &= (1+x) \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^{n-1}
  = \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^{n-1} + \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^n \\
&= \sum_{n=0}^{\infty} (n+1) \binom{\alpha}{n+1} x^n + \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^n \\
&= \sum_{n=0}^{\infty} (n+1) \binom{\alpha}{n+1} x^n + \sum_{n=0}^{\infty} n \binom{\alpha}{n} x^n \\
&= \sum_{n=0}^{\infty} \left[ (n+1) \binom{\alpha}{n+1} + n \binom{\alpha}{n} \right] x^n
  = \alpha \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n = \alpha f(x) .
\end{aligned}
\]
Thus, g'(x) = 0 for any |x| < 1 , which implies that g(x) is a constant function. It follows that g(x) = g(0) = f (0) = 1 , which proves thesis (2.43).
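The generalised binomial series (2.43) can be verified numerically for a non-integer exponent; the sketch below (ours) builds the coefficients (2.41) iteratively and compares the partial sum with (1+x)^α for α = 1/3:

```python
alpha, x, N = 1.0 / 3.0, 0.4, 200

def gen_binom(a, n):
    """Generalised binomial coefficient a(a-1)...(a-n+1)/n!, built iteratively."""
    c = 1.0
    for k in range(n):
        c *= (a - k) / (k + 1)
    return c

series = sum(gen_binom(alpha, n) * x**n for n in range(N))
assert abs(series - (1 + x)**alpha) < 1e-12
```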
When considering the power series expansion of arcsin, the particular value α = −1/2 turns out to be important. Let us, then, study the generalised binomial coefficient (2.41) corresponding to such an α .

Proposition 2.53. For any n ∈ N , the following identity holds true:
\[
\binom{-\frac{1}{2}}{n} = (-1)^n\, \frac{(2n-1)!!}{(2n)!!} , \tag{2.45}
\]
in which n!! denotes the double factorial function (or semi–factorial) of n .


Proof. Evaluation of the binomial coefficient yields, for α = −1/2 :
\[
\binom{-\frac{1}{2}}{n}
= \frac{-\frac{1}{2} \left(-\frac{1}{2}-1\right) \left(-\frac{1}{2}-2\right) \cdots \left(-\frac{1}{2}-n+1\right)}{n!}
= (-1)^n\, \frac{\frac{1}{2} \left(\frac{1}{2}+1\right) \left(\frac{1}{2}+2\right) \cdots \left(\frac{1}{2}+n-1\right)}{n!}
= (-1)^n\, \frac{\frac{1}{2} \cdot \frac{3}{2} \cdot \frac{5}{2} \cdots \frac{2n-1}{2}}{n!} .
\]
Recalling that the double factorial n!! is the product of all integers from 1 to n of the same parity (odd or even) as n , we obtain:
\[
\frac{1}{2} \cdot \frac{3}{2} \cdot \frac{5}{2} \cdots \frac{2n-1}{2} = \frac{(2n-1)!!}{2^n} .
\]
Therefore:
\[
\binom{-\frac{1}{2}}{n} = (-1)^n\, \frac{(2n-1)!!}{2^n\, n!} .
\]
Recalling further that (2n)!! = 2^n n! , thesis (2.45) follows.

The following Corollary 2.54 is a consequence of Proposition 2.53.

Corollary 2.54. For any |x| < 1 , using the convention (−1)!! = 1 , it holds:
\[
\frac{1}{\sqrt{1-x}} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, x^n , \tag{2.46}
\]
\[
\frac{1}{\sqrt{1+x}} = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!}\, x^n . \tag{2.47}
\]

Formula (2.46), combined with Theorem 2.42, yields the MacLaurin series for arcsin x , while (2.47) gives the series for arcsinh x , as expressed in the following Theorem 2.55.

Theorem 2.55. Considering |x| < 1 and letting (−1)!! = 1 , then:
\[
\arcsin x = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} , \tag{2.48}
\]
\[
\operatorname{arcsinh} x = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} . \tag{2.49}
\]

Proof. For |x| < 1 , we can write:
\[
\arcsin x = \int_0^x \frac{dt}{\sqrt{1-t^2}} .
\]
Using (2.46) with x = t² and applying Theorem 2.42, it follows:
\[
\arcsin x = \int_0^x \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, t^{2n} \, dt
= \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!} \int_0^x t^{2n} \, dt
= \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{x^{2n+1}}{2n+1} .
\]
Equation (2.49) can be proved analogously.
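As a numeric check of (2.48) (a Python sketch of ours), partial sums built from double factorials can be compared with math.asin:

```python
import math

def double_fact(n):
    """Double factorial, with the convention (-1)!! = 0!! = 1."""
    r = 1
    while n > 1:
        r *= n
        n -= 2
    return r

x, N = 0.5, 80
arcsin_s = sum(double_fact(2*n - 1) / double_fact(2*n) * x**(2*n + 1) / (2*n + 1)
               for n in range(N))
assert abs(arcsin_s - math.asin(x)) < 1e-12   # arcsin(1/2) = pi/6
```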

Using the power series (2.46) and the central binomial coefficient formula:
\[
\binom{2n}{n} = \frac{2^n\, (2n-1)!!}{n!} , \tag{2.50}
\]
provable by induction, a result due to Lehmer⁷ [38] can be obtained.

⁷ Derrick Henry Lehmer (1905–1991), American mathematician.

Theorem 2.56 (Lehmer). If |x| < 1/4 , then:
\[
\frac{1}{\sqrt{1-4x}} = \sum_{n=0}^{\infty} \binom{2n}{n} x^n . \tag{2.51}
\]

Proof. Formula (2.50) yields:
\[
\sum_{n=0}^{\infty} \binom{2n}{n} x^n = \sum_{n=0}^{\infty} \frac{2^n\, (2n-1)!!}{n!}\, x^n = \sum_{n=0}^{\infty} \frac{4^n\, (2n-1)!!}{2^n\, n!}\, x^n .
\]
Using again the relation (2n)!! = 2^n n! , it follows:
\[
\sum_{n=0}^{\infty} \binom{2n}{n} x^n = \sum_{n=0}^{\infty} \frac{4^n\, (2n-1)!!}{(2n)!!}\, x^n = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, (4x)^n .
\]
Finally, equality (2.51) follows from (2.46).
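Lehmer's identity (2.51) can be checked directly with Python's math.comb (a sketch of ours; any |x| < 1/4 works):

```python
import math

x, N = 0.2, 120               # |x| < 1/4
lehmer = sum(math.comb(2*n, n) * x**n for n in range(N))
assert abs(lehmer - 1 / math.sqrt(1 - 4*x)) < 1e-10
```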

2.6.2 The error function


We present here an example of how to deal with the power series expansion of a function of great importance in Probability theory. In Statistics, it is fundamental to deal with the following definite integral:
\[
F(x) = \int_0^x e^{-t^2} \, dt . \tag{2.52}
\]
The main issue with the integral (2.52) is that it cannot be expressed by means of the known elementary functions [39]. On the other hand, some probabilistic applications require knowing, at least numerically, the values of the function introduced in (2.52). A way to achieve this goal is integration by series. Using the power series for the exponential function, it is possible to write:
\[
e^{-t^2} = \sum_{n=0}^{\infty} (-1)^n\, \frac{t^{2n}}{n!} .
\]

Since the power series is uniformly convergent, we can invoke Theorem 2.15 and transform the integral (2.52) into a series:
\[
\int_0^x e^{-t^2} \, dt = \sum_{n=0}^{\infty} (-1)^n \int_0^x \frac{t^{2n}}{n!} \, dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{x^{2n+1}}{2n+1} . \tag{2.53}
\]
The error function, used in Statistics, is defined in the following way:
\[
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt . \tag{2.54}
\]
Our previous argument, which led to equation (2.53), shows that the power series expansion of the error function, introduced in (2.54), is:
\[
\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{x^{2n+1}}{2n+1} . \tag{2.55}
\]

Notice that from Theorem 2.38 it follows that the radius of convergence of the power series (2.55) is
infinite.
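The expansion (2.55) converges quickly and can be compared with the library implementation math.erf (a Python sketch of ours):

```python
import math

x, N = 1.0, 60
erf_s = (2 / math.sqrt(math.pi)) * sum(
    (-1)**n * x**(2*n + 1) / (math.factorial(n) * (2*n + 1)) for n in range(N))
assert abs(erf_s - math.erf(x)) < 1e-12
```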

2.6.3 Abel theorem and series summation


We present here an important theorem, due to Abel⁸ , which explains the behavior of a given power series, with positive radius of convergence, at the boundary of the interval of convergence. In the previous Examples 2.39 and 2.40, we observed different behaviors at the boundary of the convergence interval: they can be explained by Abel Theorem 2.57, for the proof of which we refer to [9].

Theorem 2.57 (Abel). Denote by f (x) the sum of the power series (2.16), in which we assume that the radius of convergence is r > 0 . Assume further that the numerical series $\sum_{n=0}^{\infty} a_n r^n$ converges. Then:
\[
\lim_{x \to r^-} f(x) = \sum_{n=0}^{\infty} a_n r^n . \tag{2.56}
\]
Proof. The generality of the proof is not affected by the choice r = 1 , as different radii can be achieved with a straightforward change of variable. Let:
\[
s_n = \sum_{m=0}^{n-1} a_m ;
\]
then:
\[
s = \lim_{n\to\infty} s_n = \sum_{n=0}^{\infty} a_n .
\]
Now, observe that s₁ = a₀ and a_n = s_{n+1} − s_n for any n ∈ N . If |x| < 1 , then 1 is the radius of convergence of the power series:
\[
\sum_{n=0}^{\infty} s_{n+1}\, x^n . \tag{2.57}
\]
To show it, notice that:
\[
\lim_{n\to\infty} \frac{s_{n+2}}{s_{n+1}} = \lim_{n\to\infty} \frac{a_{n+1} + s_{n+1}}{s_{n+1}} = 1 .
\]
When |x| < 1 , series (2.57) can be multiplied by 1 − x , yielding:
\[
\begin{aligned}
(1-x) \sum_{n=0}^{\infty} s_{n+1} x^n
&= \sum_{n=0}^{\infty} s_{n+1} x^n - \sum_{n=0}^{\infty} s_{n+1} x^{n+1} \\
&= \sum_{n=0}^{\infty} s_{n+1} x^n - \sum_{n=1}^{\infty} s_n x^n \\
&= s_1 + \sum_{n=1}^{\infty} (s_{n+1} - s_n)\, x^n = a_0 + \sum_{n=1}^{\infty} a_n x^n = f(x) .
\end{aligned} \tag{2.58}
\]
To obtain thesis (2.56) we have to show that, for any ε > 0 , there exists δ_ε > 0 such that |f (x) − s| < ε for any x such that 1 − δ_ε < x < 1 . From (2.58), and using formula (2.31) for the sum of the geometric series, we have:
\[
\begin{aligned}
f(x) - s &= (1-x) \sum_{n=0}^{\infty} s_{n+1} x^n - s \\
&= (1-x) \sum_{n=0}^{\infty} s_{n+1} x^n - s\,(1-x) \sum_{n=0}^{\infty} x^n \\
&= (1-x) \sum_{n=0}^{\infty} s_{n+1} x^n - (1-x) \sum_{n=0}^{\infty} s\, x^n \\
&= (1-x) \sum_{n=0}^{\infty} (s_{n+1} - s)\, x^n .
\end{aligned} \tag{2.59}
\]

⁸ Niels Henrik Abel (1802–1829), Norwegian mathematician.

Now, fixed ε > 0 , there exists n_ε ∈ N such that |s_{n+1} − s| < ε/2 for any n ∈ N , n > n_ε ; therefore, using the triangle inequality in (2.59), the following holds for x ∈ ] −1 , 1 [ :
\[
\begin{aligned}
|f(x) - s| &= \left| (1-x) \left[ \sum_{n=0}^{n_\varepsilon} (s_{n+1} - s)\, x^n + \sum_{n=n_\varepsilon+1}^{\infty} (s_{n+1} - s)\, x^n \right] \right| \\
&\leq (1-x) \left| \sum_{n=0}^{n_\varepsilon} (s_{n+1} - s)\, x^n \right| + (1-x) \left| \sum_{n=n_\varepsilon+1}^{\infty} (s_{n+1} - s)\, x^n \right| \\
&\leq (1-x) \sum_{n=0}^{n_\varepsilon} |s_{n+1} - s|\, |x|^n + \frac{\varepsilon}{2}\, (1-x) \sum_{n=n_\varepsilon+1}^{\infty} |x|^n \\
&\leq (1-x) \sum_{n=0}^{n_\varepsilon} |s_{n+1} - s|\, |x|^n + \frac{\varepsilon}{2}
\;\leq\; (1-x) \sum_{n=0}^{n_\varepsilon} |s_{n+1} - s| + \frac{\varepsilon}{2} .
\end{aligned} \tag{2.60}
\]

Observing that the function:
\[
x \mapsto (1-x) \sum_{n=0}^{n_\varepsilon} |s_{n+1} - s|
\]
is continuous and vanishes for x = 1 , it is possible to choose δ ∈ ] 0 , 1 [ such that, if 1 − δ < x < 1 , we have:
\[
(1-x) \sum_{n=0}^{n_\varepsilon} |s_{n+1} - s| < \frac{\varepsilon}{2} .
\]
Thesis (2.56) thus follows.

Theorem 2.57 makes it possible to compute, in closed form, the sum of many interesting series.

Example 2.58. Recalling the power series expansion (2.34), from Theorem 2.57, with x = 1 , it follows:
\[
\ln 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} .
\]

Example 2.59. Recalling the power series expansion (2.35), Theorem 2.57, with x = 1 , allows finding the sum of the Leibnitz–Gregory⁹ series:
\[
\frac{\pi}{4} = \sum_{n=0}^{\infty} (-1)^n\, \frac{1}{2n+1} .
\]
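Both sums converge slowly; since, for an alternating series with decreasing terms, the error is bounded by the first omitted term, the Python sketch below (ours) checks the partial sums against ln 2 and π/4 with the matching tolerances:

```python
import math

N = 100000
ln2 = sum((-1)**(n + 1) / n for n in range(1, N))        # Example 2.58
leibniz = sum((-1)**n / (2*n + 1) for n in range(N))     # Example 2.59

# Alternating series: the error is below the first omitted term.
assert abs(ln2 - math.log(2)) < 1.0 / N
assert abs(leibniz - math.pi / 4) < 1.0 / (2*N + 1)
```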

Example 2.60. Recalling the particular binomial expansion (2.47), Abel Theorem 2.57 implies that, for x = 1 , the following holds:
\[
\frac{1}{\sqrt{2}} = \sum_{n=0}^{\infty} (-1)^n\, \frac{(2n-1)!!}{(2n)!!} .
\]
Using the fact that arccos x = π/2 − arcsin x , it is possible to obtain a second series which gives π .
2
Example 2.61. Recalling the arcsin expansion (2.48), from Theorem 2.57 it follows:
\[
\frac{\pi}{2} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{1}{2n+1} .
\]

⁹ James Gregory (1638–1675), Scottish mathematician and astronomer.
Gottfried Wilhelm von Leibnitz (1646–1716), German mathematician and philosopher.

Example 2.62. We show here two summation formulæ connecting π to the central binomial coefficients:
\[
\sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{4^n\, (2n+1)} = \frac{\pi}{2} ; \tag{2.61}
\]
\[
\sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{16^n\, (2n+1)} = \frac{\pi}{3} . \tag{2.62}
\]
The key to showing (2.61) and (2.62) lies in the representation of the central binomial coefficient (2.50), whose insertion in the left hand side of (2.61) leads to the infinite series:
\[
\sum_{n=0}^{\infty} \frac{(2n-1)!!}{2^n\, n!\, (2n+1)} . \tag{2.63}
\]
We further notice that, from the power expansion of the arcsin function (2.48), it is possible to infer the following equality:
\[
\frac{\arcsin \sqrt{x}}{\sqrt{x}} = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{2^n\, n!\, (2n+1)}\, x^n . \tag{2.64}
\]
The radius of convergence of the power series (2.64) is 1 ; Abel Theorem 2.57 can thus be applied to arrive at (2.61). It is worth noting that (2.61) can also be obtained using the Lehmer series (2.51), via the change of variable y = 4x and integrating term by term.
A similar argument leads to (2.62); here, the starting point is the following power series expansion, which has, again, radius of convergence r = 1 :
\[
\frac{\arcsin x}{x} = \sum_{n=0}^{\infty} \frac{\binom{2n}{n}}{4^n\, (2n+1)}\, x^{2n} . \tag{2.65}
\]
Equality (2.62) follows by evaluating formula (2.65) at x = 1/2 .

2.7 Basel problem


One of the most celebrated problems in Classical Analysis is the Basel Problem, which consists in determining the exact value of the infinite series:
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} . \tag{2.66}
\]
Mengoli¹⁰ originally posed this problem in 1644; it takes its name from Basel, birthplace of Euler¹¹ , who first provided the correct solution π²/6 in [19].
There exist several solutions of the Basel problem; here we present the solution of Choe [11], based on the power series expansion of f (x) = arcsin x , shown in Formula (2.48), as well as on the Abel Theorem 2.57 and on the following integral formula (2.67), which can be proved by induction on m ∈ N :
\[
\int_0^{\pi/2} \sin^{2m+1} t \, dt = \frac{(2m)!!}{(2m+1)!!} . \tag{2.67}
\]

¹⁰ Pietro Mengoli (1626–1686), Italian mathematician and clergyman from Bologna.
¹¹ Leonhard Euler (1707–1783), Swiss mathematician and physicist.

The first step towards solving the Basel problem is to observe that, in the sum (2.66), the attention can be confined to odd indexes only. Namely, if E denotes the sum of the series (2.66), then E can be computed by considering, separately, the sums over even and odd indexes:
\[
\sum_{n=1}^{\infty} \frac{1}{(2n)^2} + \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = E .
\]
On the other hand:
\[
\sum_{n=1}^{\infty} \frac{1}{(2n)^2} = \frac{1}{4} \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{E}{4} ,
\]
yielding:
\[
\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{3}{4}\, E . \tag{2.68}
\]
Now, observe that E = π²/6 ⟺ (3/4) E = π²/8 . In other words, the Basel problem is equivalent to showing that:
\[
\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8} , \tag{2.69}
\]
whose proof can be found in [11].
Abel Theorem 2.57 applies to the power series (2.48), since we can prove that (2.48) converges for x = 1 , using the Raabe¹² test, that is to say, forming:
\[
\rho = \lim_{n\to\infty} n \left( \frac{a_n}{a_{n+1}} - 1 \right) ,
\]
in which a_n is the n–th series term, and proving that ρ > 1 . In the case of (2.48), with x = 1 :
\[
\rho = \lim_{n\to\infty} n \left( \frac{(2n-1)!!}{(2n)!!\,(2n+1)} \cdot \frac{(2n+2)!!\,(2n+3)}{(2n+1)!!} - 1 \right)
= \lim_{n\to\infty} n \left( \frac{2(n+1)(2n+3)}{(2n+1)^2} - 1 \right)
= \lim_{n\to\infty} \frac{n\,(6n+5)}{(2n+1)^2} = \frac{3}{2} .
\]
This implies, also, that the series (2.48) converges uniformly. The change of variable x = sin t in both sides of (2.48) yields, when −π/2 < t < π/2 :
\[
t = \sin t + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{\sin^{2n+1} t}{2n+1} . \tag{2.70}
\]
Integrating (2.70) term by term, on the interval [0 , π/2] , and using (2.67), we obtain:
\[
\begin{aligned}
\frac{\pi^2}{8} &= 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} \int_0^{\pi/2} \frac{\sin^{2n+1} t}{2n+1} \, dt \\
&= 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!}\, \frac{(2n)!!}{(2n+1)!!}\, \frac{1}{2n+1} \\
&= 1 + \sum_{n=1}^{\infty} \frac{1}{(2n+1)^2} = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} .
\end{aligned}
\]

This shows (2.69) and, thus, the Euler summation formula:
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} . \tag{2.71}
\]

¹² Joseph Ludwig Raabe (1801–1859), Swiss mathematician.
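The two sums (2.69) and (2.71) can be checked numerically; in the Python sketch below (ours), the tail of Σ 1/n² beyond N is of order 1/N, which dictates the tolerance:

```python
import math

N = 10**6
basel = sum(1.0 / n**2 for n in range(1, N + 1))
# The tail sum_{n>N} 1/n^2 lies between 1/(N+1) and 1/N.
assert abs(basel - math.pi**2 / 6) < 2.0 / N

odd = sum(1.0 / (2*n + 1)**2 for n in range(N))
assert abs(odd - math.pi**2 / 8) < 1.0 / N
```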

2.8 Extension of elementary functions to the complex field


The set C of complex numbers, as well as the set R of reals, by virtue of the triangle inequality, possesses the topological structure of a metric space. The theory of convergence of sequences and of sequences of functions with complex values is, therefore, analogous to that of real valued ones. As a consequence, it is possible to extend to the complex domain the elementary functions that are representable in terms of convergent power series.

2.8.1 Complex exponential


Let us start by considering the complex exponential. In C , the exponential function is defined in terms of the usual power series, which is thought of, here, as a function of a variable z ∈ C :

Definition 2.63.
\[
e^z := \sum_{n=0}^{\infty} \frac{z^n}{n!} . \tag{2.72}
\]

Equations (2.26) and (2.72) only differ in the fact that, in the latter, the argument can be a complex number. Almost all the familiar properties of the exponential still hold, with the single exception of positivity, which is meaningless in the unordered field C . The fundamental property of the complex exponential is stated in the following Theorem 2.64, due to Euler.

Theorem 2.64 (Euler). For any z = x + iy ∈ C , with x , y ∈ R , it holds:
\[
e^{x+iy} = e^x (\cos y + i \sin y) . \tag{2.73}
\]
Proof. Let z = x + iy ∈ C ; then:
\[
\begin{aligned}
e^z = e^{x+iy} &= e^x \cdot e^{iy} \\
&= e^x \cdot \left( 1 + \frac{iy}{1!} + \frac{(iy)^2}{2!} + \frac{(iy)^3}{3!} + \frac{(iy)^4}{4!} + \cdots \right) \\
&= e^x \cdot \left[ \left( 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots \right) + i \left( y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots \right) \right] \\
&= e^x \cdot (\cos y + i \sin y) .
\end{aligned}
\]
The last step, above, exploits the real power series expansions for the sine and cosine functions, given in (2.27) and (2.28) respectively.

The first beautiful consequence of Theorem 2.64 is the famous Euler identity.

Corollary 2.65 (Euler identity).
\[
e^{i\pi} + 1 = 0 . \tag{2.74}
\]
Proof. First observe that, if x = 0 in (2.73), then it holds, for any y ∈ R :
\[
e^{iy} = \cos y + i \sin y . \tag{2.75}
\]
Now, with y = π , identity (2.74) follows.
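Both (2.73) and (2.74) can be checked with Python's cmath (a sketch of ours):

```python
import cmath
import math

# Euler's formula e^{x+iy} = e^x (cos y + i sin y), checked via cmath.
z = complex(1.2, 0.7)
lhs = cmath.exp(z)
rhs = math.exp(1.2) * complex(math.cos(0.7), math.sin(0.7))
assert abs(lhs - rhs) < 1e-12

# Euler's identity e^{i pi} + 1 = 0, up to floating-point rounding.
assert abs(cmath.exp(1j * math.pi) + 1) < 1e-12
```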

The C–extension of the exponential has an important consequence: the exponential function, when considered as a function C → C , is no longer one–to–one, but is a periodic function. In fact, if z , w ∈ C , then:
\[
e^z = e^w \iff z = w + 2n\pi i \quad \text{with } n \in \mathbb{Z} .
\]

2.8.2 Complex goniometric and hyperbolic functions

Equality (2.75) implies the following formulæ (2.76), again due to Euler and valid for any y ∈ R :
\[
\sin y = \frac{e^{iy} - e^{-iy}}{2i} , \qquad \cos y = \frac{e^{iy} + e^{-iy}}{2} . \tag{2.76}
\]
It is thus possible to use (2.76) to extend the goniometric functions to C .

Definition 2.66. For any z ∈ C , define:
\[
\sin z = \frac{e^{iz} - e^{-iz}}{2i} , \qquad \cos z = \frac{e^{iz} + e^{-iz}}{2} . \tag{2.77}
\]
In essence, for the sine and cosine functions, both in their goniometric and hyperbolic versions, the
power series expansions (2.27), (2.28), (2.29) and (2.30) are understood as functions of a complex
variable.

2.8.3 Complex logarithm


To define the complex logarithm, it must be taken into account that the C–exponential function is periodic, with period 2πi ; thus the C–logarithm is not uniquely determined. With this in mind, we formulate the following definition:

Definition 2.67. If w ∈ C , the logarithm of w is any complex number z ∈ C such that ez = w .

Remark 2.68. In C , as well as in R , the logarithm of zero is undefined since, from (2.73), it follows that e^z ≠ 0 for any z ∈ C .

Using the polar representation of a complex number, we can represent its logarithms as shown below.

Theorem 2.69. If w = ρ e^{iϑ} is a non–zero complex number, the logarithms of w are given by:
\[
\log w = \ln \rho + i\, (\vartheta + 2n\pi) , \quad n \in \mathbb{Z} . \tag{2.78}
\]

Proof. Let w = e^z and let z = x + iy ; then, we have to solve the equation:
\[
e^z = \rho\, e^{i\vartheta} ,
\]
with
\[
e^z = e^{x+iy} = e^x e^{iy} = e^x (\cos y + i \sin y) , \qquad \rho\, e^{i\vartheta} = \rho\, (\cos \vartheta + i \sin \vartheta) ,
\]
from which the real and imaginary components of z are obtained:
\[
x = \ln \rho \quad (\rho > 0) , \qquad y = \vartheta + 2n\pi .
\]
Since log w = z , thesis (2.78) follows.

Among the infinite logarithms of a complex number, we pin down one, corresponding to the most
convenient argument.

Definition 2.70. Consider w = ρ e^{iϑ} , w ≠ 0 . The main argument of w is ϑ , with −π < ϑ ≤ π , and it is referred to as arg(w) . Note, also, that ρ = |w| . Then, the principal determination of the logarithm of w is:
\[
\operatorname{Log} w = \ln \rho + i\vartheta = \ln |w| + i \arg(w) .
\]

Example 2.71. Compute Log(−1) . Here, w = ρ e^{iϑ} , with ρ = |−1| and ϑ = arg(−1) . Since ln 1 = 0 and arg(−1) = π , we obtain Log(−1) = iπ .

In other words, for a non–zero complex w , the principal determination (or principal value) Log w is
the logarithm whose imaginary part lies in the interval (−π , π] .

We end this section by introducing the complex power.

Definition 2.72. Given z ∈ C , z ≠ 0 , and w ∈ C , the complex power function is defined as:
\[
z^w = e^{w \operatorname{Log} z} .
\]
Example 2.73. Compute i^i . Applying Definition 2.72: i^i = e^{i Log i} . Since arg(i) = π/2 and |i| = 1 , then Log i = iπ/2 . Finally, i^i = e^{i · iπ/2} = e^{−π/2} .
2
Example 2.74. In C , it is possible to solve equations like sin z = 2 , obviously finding complex solutions. From the sin definition (2.77), in fact, we obtain:
\[
e^{2iz} - 4i\, e^{iz} - 1 = 0 .
\]
Thus:
\[
e^{iz} = \left( 2 \pm \sqrt{3} \right) i .
\]
Evaluating the logarithms, the following solutions are found:
\[
z = \frac{\pi}{2} + 2n\pi - i \ln\left( 2 \pm \sqrt{3} \right) , \quad n \in \mathbb{Z} .
\]
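The complex-field computations of Examples 2.71, 2.73 and 2.74 can be verified with Python's cmath (a sketch of ours):

```python
import cmath
import math

# Principal logarithm: Log(-1) = i*pi (Example 2.71).
assert abs(cmath.log(-1) - 1j * math.pi) < 1e-12

# i^i = e^{i Log i} = e^{-pi/2} (Example 2.73).
assert abs(1j**1j - math.exp(-math.pi / 2)) < 1e-12

# One solution of sin z = 2: z = pi/2 - i ln(2 + sqrt(3)) (Example 2.74).
z = math.pi / 2 - 1j * math.log(2 + math.sqrt(3))
assert abs(cmath.sin(z) - 2) < 1e-12
```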

2.9 Exercises
2.9.1 Solved exercises
1. Given the following sequence of functions, establish whether it is pointwise and/or uniformly convergent:
\[
f_n(x) = \frac{n x + x^2}{n^2} , \quad x \in [0, 1] .
\]
2. Evaluate the pointwise limit of the sequence of functions:
\[
f_n(x) = \sqrt[n]{1 + x^n} , \quad x \geq 0 .
\]

3. Show that the following sequence of functions converges pointwise, but not uniformly, to f (x) = 0 :
\[
f_n(x) = n x\, e^{-n x} , \quad x > 0 .
\]

4. Show that the following sequence of functions converges uniformly to f (x) = 0 :
\[
f_n(x) = \frac{\sqrt{1 - x^n}}{n^2} , \quad x \in [-1 , 1] .
\]

5. Show that:
\[
\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \ln 2 .
\]
n=1

6. Evaluate:
\[
\lim_{n\to\infty} \int_1^{\infty} \frac{n\, e^{-n x}}{1 + n x} \, dx .
\]

7. Use the definite integral $\int_0^1 \frac{x(1-x)}{1+x}\,dx$ to prove that:
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{(n+1)(n+2)} = \frac{3}{2} - \ln 4 .
\]

8. Let:
\[
f_n(x) = \left( 1 + \frac{x^2}{n} \right)^{-n} , \quad x \geq 0 .
\]

a. Show that (f_n) is pointwise convergent to a function f (x) to be determined.

b. Show that:
\[
\lim_{n\to\infty} \int_0^{\infty} f_n(x) \, dx = \int_0^{\infty} f(x) \, dx .
\]

Solutions to Exercises 2.9.1


1. Sequence f_n(x) converges pointwise to zero for any x ∈ [0, 1] , since:
\[
\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \left( \frac{x}{n} + \frac{x^2}{n^2} \right) = \lim_{n\to\infty} \frac{x}{n} + \lim_{n\to\infty} \frac{x^2}{n^2} = 0 + 0 = 0 .
\]
To establish whether such a convergence is also uniform, we evaluate:
\[
\sup_{x\in[0,1]} |f_n(x) - 0| = \sup_{x\in[0,1]} \frac{n x + x^2}{n^2} = \frac{n+1}{n^2} .
\]
Observe that:
\[
\lim_{n\to\infty} \sup_{x\in[0,1]} |f_n(x) - 0| = \lim_{n\to\infty} \frac{n+1}{n^2} = 0 .
\]
The uniform convergence on the interval [0, 1] follows.

2. If x = 0 , then f_n(0) = 1 for any n ∈ N .
If 0 < x ≤ 1 , then 1 ≤ 1 + xⁿ ≤ 2 ; therefore, for any n ∈ N , we have:
\[
1 \leq f_n(x) \leq \sqrt[n]{2} .
\]
Since $\lim_{n\to\infty} \sqrt[n]{2} = 1$ , we can use the Sandwich Theorem (or Squeeze Theorem¹³), for x ≤ 1 , to prove the limit relation:
\[
\lim_{n\to\infty} f_n(x) = 1 .
\]
Now, examine what happens when x > 1 . First, notice that:
\[
f_n(x) = \sqrt[n]{1 + x^n} = \sqrt[n]{x^n \left( \frac{1}{x^n} + 1 \right)} = x\, \sqrt[n]{\frac{1}{x^n} + 1} .
\]
Recalling that, here, 1/x < 1 , we consider the change of variable t = 1/x and repeat the previous argument (that we followed in the case of a variable t < 1) to obtain $\lim_{n\to\infty} \sqrt[n]{t^n + 1} = 1$ , that is:
\[
\lim_{n\to\infty} \sqrt[n]{\frac{1}{x^n} + 1} = 1 .
\]
In other words, for x > 1 , we have shown that:
\[
\lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \sqrt[n]{1 + x^n} = x .
\]
Putting everything together, we have proven that:
\[
\lim_{n\to\infty} f_n(x) = f(x) \quad \text{where} \quad f(x) = \begin{cases} 1 & \text{if } x \leq 1 , \\ x & \text{if } x > 1 . \end{cases}
\]

¹³ See, for example, mathworld.wolfram.com/SqueezingTheorem.html

3. The pointwise limit of the sequence f_n(x) = n x e^{−nx} is f (x) = 0 , due to the exponential decay of the factor e^{−nx} . To investigate the possible uniform convergence, we consider:
\[
\sup_{x>0} |f_n(x) - f(x)| = \sup_{x>0}\, n x\, e^{-n x} .
\]
Differentiating, we find:
\[
\frac{d}{dx}\left( n x\, e^{-n x} \right) = n\, e^{-n x}\, (1 - n x) ,
\]
showing that x = 1/n is a local maximizer; the corresponding extremum is:
\[
f_n\!\left( \frac{1}{n} \right) = \frac{1}{e} .
\]
But this implies that the convergence cannot be uniform, since:
\[
\lim_{n\to\infty} \sup_{x>0} |f_n(x) - f(x)| = \frac{1}{e} \neq 0 .
\]
4. For any x ∈ [−1, 1] and any n ∈ N , it holds that $\sqrt{1 - x^n} \leq \sqrt{2}$ , thus:
\[
f_n(x) = \frac{\sqrt{1 - x^n}}{n^2} \leq \frac{\sqrt{2}}{n^2} . \tag{2.79}
\]
Now, observe that inequality (2.79) is independent of x ∈ [−1, 1] : this fact, taking the supremum with respect to x ∈ [−1, 1] , ensures uniform convergence.

5. Consider, for any n ∈ N , the definite integral:
\[
\int_0^{1/2} x^{n-1} \, dx = \frac{1}{n\, 2^n} .
\]
Summing over all n ≥ 1 , we get:
\[
\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \sum_{n=1}^{\infty} \int_0^{1/2} x^{n-1} \, dx .
\]
Since the geometric series, in the right–hand side above, converges uniformly, we can swap series and integral, obtaining:
\[
\sum_{n=1}^{\infty} \frac{1}{n\, 2^n} = \int_0^{1/2} \left( \sum_{n=1}^{\infty} x^{n-1} \right) dx = \int_0^{1/2} \frac{1}{1-x} \, dx .
\]
The thesis follows by integrating:
\[
\int_0^{1/2} \frac{1}{1-x} \, dx = \left[ -\ln(1-x) \right]_{x=0}^{x=1/2} = \ln 2 .
\]

6. Define the (decreasing) function h_n(x) = n/(1 + nx) , with x ∈ [1, +∞) . Then:
\[
h_n'(x) = -\frac{n^2}{(1 + n x)^2} < 0 .
\]
Since:
\[
\lim_{x\to\infty} \frac{n}{1 + n x} = 0
\qquad \text{and} \qquad
\sup_{x\in[1,\infty)} \frac{n}{1 + n x} = h_n(1) = \frac{n}{1+n} ,
\]
we can infer that:
\[
|h_n(x)| \leq \frac{n}{1+n} < 1 .
\]
Therefore:
\[
\left| \frac{n\, e^{-n x}}{1 + n x} \right| < e^{-n x} .
\]
This shows uniform convergence to zero for f_n . We can now invoke Theorem 2.15, to obtain:
\[
\lim_{n\to\infty} \int_1^{\infty} \frac{n\, e^{-n x}}{1 + n x} \, dx = \int_1^{\infty} \lim_{n\to\infty} \frac{n\, e^{-n x}}{1 + n x} \, dx = \int_1^{\infty} 0 \, dx = 0 .
\]

7. First, evaluate the definite integral:
\[
\int_0^1 \frac{x(1-x)}{1+x} \, dx = \int_0^1 \left( 2 - x - \frac{2}{x+1} \right) dx = \frac{3}{2} - \ln 4 .
\]
Then, recall that, for 0 ≤ x < 1 , the geometric series expansion holds:
\[
\frac{1}{1+x} = \sum_{m=0}^{\infty} (-x)^m = \sum_{m=0}^{\infty} (-1)^m x^m .
\]
It is thus possible to integrate term by term, obtaining:
\[
\int_0^1 \frac{x(1-x)}{1+x} \, dx = \int_0^1 x(1-x) \sum_{m=0}^{\infty} (-1)^m x^m \, dx = \sum_{m=0}^{\infty} (-1)^m \int_0^1 x^{m+1} (1-x) \, dx .
\]
Now, evaluating the last right–hand side integral, we get:
\[
\int_0^1 \frac{x(1-x)}{1+x} \, dx = \sum_{m=0}^{\infty} (-1)^m \left( \frac{1}{m+2} - \frac{1}{m+3} \right) = \sum_{m=0}^{\infty} \frac{(-1)^m}{(m+2)(m+3)} .
\]
Our statement follows using the change of index n = m + 1 .


8. Observe that:
\[
\left( 1 + \frac{x^2}{n} \right)^{-n} = e^{-n \ln\left( 1 + \frac{x^2}{n} \right)} = e^{-n \left( \frac{x^2}{n} + o\left( \frac{x^2}{n} \right) \right)} = e^{-x^2 (1 + o(1))} ,
\]
where we have used ln(1 + t) = t + o(t) when t ≈ 0 . This means that:
\[
\lim_{n\to\infty} \left( 1 + \frac{x^2}{n} \right)^{-n} = e^{-x^2} .
\]
We have thus shown point (a). The second statement follows from Corollary 2.23.

2.9.2 Unsolved exercises


1. Show that the sequence of functions f_n(x) = x/n , with x ∈ R , converges pointwise to f (x) = 0 , but the convergence is not uniform.
Show also that, on the other hand, when a > 0 , the sequence (f_n) converges uniformly to f (x) = 0 for x ∈ [−a , a] .

2. Let $f_n(x) = \left( \cos \frac{x}{\sqrt{n}} \right)^n$ , x ∈ R . Show that:

a. f_n converges pointwise to a non–zero function f (x) to be determined;

b. if a > 0 , the sequence (f_n) converges uniformly on [−a , a] .

Hint. Consider the sequence g_n(x) = ln f_n(x) and use the power series (2.33) and (2.28).

3. Establish whether the sequence of functions (f_n)_{n∈N} , defined, for x ∈ R , by:
\[
f_n(x) = \frac{x + x^2\, e^{n x}}{1 + e^{n x}} ,
\]
converges pointwise and/or uniformly.

4. Show that $\sum_{n=1}^{\infty} \frac{1}{n\, 3^n} = \ln \frac{3}{2}$ .

5. Show that $\lim_{n\to\infty} \int_0^1 e^{\frac{x+1}{n}} \, dx = 1$ .

6. Consider the following equality and say if (and why) it is true or false:
\[
\lim_{n\to\infty} \int_0^1 \frac{x^4}{x^2 + n^2} \, dx = \int_0^1 \lim_{n\to\infty} \frac{x^4}{x^2 + n^2} \, dx .
\]

7. Let $f_n(x) = \frac{n (x^3 + x)\, e^{-x}}{1 + n x}$ , with x ∈ [0 , 1] .

a. Show that (f_n) is pointwise convergent to a function f (x) to be determined.

b. Show that, for any x ∈ [0, 1] and for any n ∈ N :
\[
|f_n(x) - f(x)| \leq \frac{2}{1 + n x} .
\]

c. Show that, for any a > 0 , the sequence (f_n) converges uniformly to f on [a , 1] , but the convergence is not uniform on [0 , 1] .

d. Evaluate $\lim_{n\to\infty} \int_0^1 f_n(x) \, dx$ .

8. Use the definite integral:

$$\int_0^1 \frac{1-x}{1-x^4} \, dx$$

to show that:

$$\sum_{n=0}^{\infty} \frac{1}{(4n+1)(4n+2)} = \frac{\pi}{8} + \frac{1}{4} \ln 2 .$$

Hint.

$$\frac{1}{(1+x)(1+x^2)} = \frac{1-x}{2\,(1+x^2)} + \frac{1}{2\,(1+x)} .$$
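The claimed value can be checked numerically before attempting the proof; the snippet below (ours) compares a partial sum with π/8 + (ln 2)/4:

```python
import math

# Partial sum of sum_{n>=0} 1/((4n+1)(4n+2)); the claimed value
# is pi/8 + (ln 2)/4. Terms behave like 1/(16 n^2), so truncating
# after N terms leaves an error of roughly 1/(16 N).
target = math.pi / 8 + math.log(2) / 4
s = sum(1.0 / ((4 * n + 1) * (4 * n + 2)) for n in range(100_000))
err = abs(s - target)
```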

9. Use the definite integral:

$$\int_0^1 \frac{1+\sqrt{x}}{1+x} \, dx$$

to show that:

$$\sum_{n=1}^{\infty} (-1)^{n-1} \frac{4n+1}{n\,(2n+1)} = 2 + \ln 2 - \frac{\pi}{2} .$$

10. Show that:

$$\int_0^{\infty} \frac{x^5}{e^{x^2} - 1} \, dx = \sum_{n=1}^{\infty} \frac{1}{n^3} .$$
11. Show that cos z = 2 ⇐⇒ z = 2 n π − i ln(2 ± √3) , n ∈ Z .
3 Multidimensional differential calculus

3.1 Partial derivatives


The most natural way to define derivatives of functions of several variables is to allow only one variable at a time to move, while freezing the others. Thus, if f : V → R is a function of n variables, whose domain is the open set V , we define the set {x1 } × · · · × {xj−1 } × [a, b] × {xj+1 } × · · · × {xn } , where [a , b] is chosen so as to have {x1 } × · · · × {xj−1 } × {t} × {xj+1 } × · · · × {xn } ⊂ V for any t ∈ [a , b] . We shall denote the function:

g(t) := f (x1 , . . . , xj−1 , t , xj+1 , . . . , xn )

by
f (x1 , . . . , xj−1 , · , xj+1 , . . . , xn ) .
If g is differentiable at some t0 ∈ (a , b) , then the first–order partial derivative of f at (x1 , . . . , xj−1 , t0 , xj+1 , . . . , xn ) , with respect to xj , is defined by:

$$f_{x_j}(x_1, \ldots, x_{j-1}, t_0, x_{j+1}, \ldots, x_n) := \frac{\partial f}{\partial x_j}(x_1, \ldots, x_{j-1}, t_0, x_{j+1}, \ldots, x_n) := g'(t_0) .$$

Therefore, the partial derivative fxj exists at a point a if and only if the following limit exists:

$$\frac{\partial f}{\partial x_j}(a) := \lim_{h \to 0} \frac{f(a + h\, e_j) - f(a)}{h} .$$
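The difference–quotient definition above translates directly into a finite–difference approximation; the sketch below (with an illustrative function of our choosing, not from the text) freezes y and perturbs only x:

```python
import math

# Central-difference approximation of a first-order partial derivative,
# for the illustrative function f(x, y) = x^2 sin(y).
def f(x, y):
    return x * x * math.sin(y)

def partial_x(f, x, y, h=1e-6):
    # difference quotient in the x-direction only; y stays frozen
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

approx = partial_x(f, 2.0, 1.0)      # numerical estimate
exact = 2 * 2.0 * math.sin(1.0)      # fx = 2 x sin(y), evaluated at (2, 1)
err = abs(approx - exact)
```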

Higher–order partial derivatives are defined by iteration. For example, when it exists, the second–order partial derivative of f , with respect to xj and xk , is defined by:

$$f_{x_j x_k} := \frac{\partial^2 f}{\partial x_k \, \partial x_j} := \frac{\partial}{\partial x_k} \left( \frac{\partial f}{\partial x_j} \right) .$$

Second–order partial derivatives are called mixed when j ≠ k .

Definition 3.1. Let V be a non–empty open subset of Rn , let f : V → R and p ∈ N.

(i) f is said to be C p on V if and only if every k–th order partial derivative of f , with k ≤ p , exists
and is continuous on V .

(ii) f is said to be C ∞ on V if and only if f is C p for all p ∈ N .

If f is C p on V and q < p , then f is C q on V . The symbol C p (V ) denotes the set of functions that are
C p on an open set V .
For simplicity, in the following we shall state all results for the case m = 1 and n = 2 , denoting x1
with x and x2 with y . With appropriate changes in notation, the same results hold for any m, n ∈ N .


Example 3.2. By the Product Rule1 , if fx and gx exist, then:

$$\frac{\partial}{\partial x}(f g) = f\, \frac{\partial g}{\partial x} + g\, \frac{\partial f}{\partial x} .$$

Example 3.3. By the Mean–Value Theorem2 , if f ( · , y) is continuous on [a, b] and the partial derivative fx ( · , y) exists on (a , b) , then there exists a point c ∈ (a , b) (which may depend on y as well as on a and b) such that:

$$f(b, y) - f(a, y) = (b - a)\, \frac{\partial f}{\partial x}(c, y) .$$
In most situations, when dealing with higher–order partial derivatives, the order of computation of
the derivatives is, in some sense, arbitrary. This is expressed by the Clairaut3 –Schwarz Theorem.

Theorem 3.4 (Clairaut–Schwarz). Assume that V is open in R2 , that (a , b) ∈ V and f : V → R . Assume further that f is C 1 on V and that one of the two second–order mixed partial derivatives of f exists on V and is continuous at the point (a, b) . Then, the other second–order mixed partial derivative exists at (a , b) and the following equality is verified:

$$\frac{\partial^2 f}{\partial y \, \partial x}(a, b) = \frac{\partial^2 f}{\partial x \, \partial y}(a, b) .$$

The hypotheses of Theorem 3.4 are met if f ∈ C 2 (V ) on V ⊆ R2 , V open.


For functions of n variables, the following Theorem 3.5 holds.

Theorem 3.5. If f is C 2 on an open subset V of Rn , if a ∈ V , and if j ≠ k , then:

$$\frac{\partial^2 f}{\partial x_j \, \partial x_k}(a) = \frac{\partial^2 f}{\partial x_k \, \partial x_j}(a) .$$

Remark 3.6. Existence of partial derivatives does not ensure continuity. As an example, consider:

$$f(x, y) = \begin{cases} \dfrac{x\, y}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0) , \\[4pt] 0 & \text{if } (x, y) = (0, 0) . \end{cases}$$

This function is not continuous at (0 , 0) , but admits partial derivatives at any (x, y) ∈ R2 ; at the origin, in particular:

$$\lim_{\Delta x \to 0} \frac{f(\Delta x, 0) - f(0, 0)}{\Delta x} = \lim_{\Delta x \to 0} 0 = 0 , \qquad \lim_{\Delta y \to 0} \frac{f(0, \Delta y) - f(0, 0)}{\Delta y} = \lim_{\Delta y \to 0} 0 = 0 .$$
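The function of Remark 3.6 can be probed numerically: both difference quotients at the origin vanish, yet along the line y = x the function equals 1/2 everywhere, ruling out continuity at (0 , 0). A minimal Python illustration (ours):

```python
def f(x, y):
    # the function of Remark 3.6
    return 0.0 if (x, y) == (0.0, 0.0) else x * y / (x * x + y * y)

h = 1e-8
# both partial derivatives at the origin exist and equal 0 ...
fx0 = (f(h, 0.0) - f(0.0, 0.0)) / h
fy0 = (f(0.0, h) - f(0.0, 0.0)) / h
# ... yet along y = x the function is identically 1/2, arbitrarily
# close to the origin, so f cannot be continuous at (0, 0).
diagonal_value = f(1e-12, 1e-12)
```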

3.2 Differentiability
In this section, we define what it means for a vector function f to be differentiable at a point a .
Whatever our definition, if f is differentiable at a , then we expect two things:

(1) f will be continuous at a ;

(2) all first–order partial derivatives of f will exist at a .


1 See, for example, mathworld.wolfram.com/ProductRule.html
2 See, for example, mathworld.wolfram.com/MeanValueTheorem.html
3 Alexis Claude Clairaut (1713–1765), French mathematician, astronomer, geophysicist.

To appreciate the following Definition 3.7 of total derivative of a function of n variables, we consider one peculiar aspect of differentiable functions of one variable. Recall that f : R → R is differentiable in x ∈ R if the following limit is finite, i.e., it is a real number:

$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} := f'(x) .$$

The definition above is equivalent to the following: f is differentiable in x ∈ R if there exist α ∈ R and a function ω : (−δ, δ) → R , with ω(0) = 0 and lim_{h→0} ω(h) = 0 , such that:

$$f(x+h) = f(x) + \alpha\, h + \omega(h)\, h . \qquad (3.1)$$

The definition of differentiability for functions of several variables extends Property (3.1).

Definition 3.7. Let f be a real function of n variables. f is said to be differentiable, at a point a ∈ Rn , if and only if there exists an open set V ⊆ Rn , such that a ∈ V and f : V → R , and there exists d ∈ Rn such that:

$$\lim_{h \to 0} \frac{f(a + h) - f(a) - d \cdot h}{\| h \|} = 0 .$$

d is called the total derivative of f at a .

Theorem 3.8. If f is differentiable at a, then:

(i) f is continuous at a ;

(ii) all first–order partial derivatives of f exist at a ;

(iii) $d = \nabla f(a) := \left( \dfrac{\partial f}{\partial x_1}(a) , \ldots , \dfrac{\partial f}{\partial x_n}(a) \right) .$

∇f (a) is called the gradient (or nabla) of f at a .


A converse implication to Theorem 3.8, under stronger hypotheses, also holds true.

Theorem 3.9. Let V be open in Rn , let a ∈ V and suppose that f : V → R . If all first–order partial
derivatives of f exist in V and are continuous at a , then f is differentiable at a .

The hypotheses of Theorem 3.9 are met if f ∈ C 1 (V ) on V ⊆ Rn , V open.

Theorem 3.10. Let α ∈ R , a ∈ Rn , and assume that f , g : V → R are differentiable at a , V ⊆ Rn being an open set. Then, the functions f + g and α f are differentiable at a , and the following equalities are verified:

(i) ∇(f + g)(a) = ∇f (a) + ∇g(a) ;

(ii) ∇(α f )(a) = α ∇f (a) ;

(iii) ∇(f g)(a) = g(a) ∇f (a) + f (a) ∇g(a) .

Moreover, if g(a) ≠ 0 , then f /g is differentiable at a , and it holds:

(iv) $\nabla \left( \dfrac{f}{g} \right)(a) = \dfrac{g(a)\, \nabla f(a) - f(a)\, \nabla g(a)}{g^2(a)} .$
The composition of functions follows similar rules, for differentiation, as in the one–dimensional case. For instance, the Chain Rule4 holds in the following way. Consider a vector function g : I → Rn , g = (g1 , . . . , gn ) , defined on an open interval I ⊆ R , and consider f : g(I) ⊆ Rn → R . If each of the components gj of g is differentiable at t0 ∈ I , and if f is differentiable at a = (g1 (t0 ) , . . . , gn (t0 )) , then the composition ϕ(t) := f (g(t)) is differentiable at t0 , and we have:

$$\varphi'(t_0) = \nabla f(a) \cdot g'(t_0) ,$$

where · is the dot (inner) product in Rn , introduced in Definition 1.1, and:

$$g'(t_0) := \big( g_1'(t_0) , \ldots , g_n'(t_0) \big) .$$

4 See, for example, mathworld.wolfram.com/ChainRule.html




In order to extend the notion of gradient, we introduce the Jacobian5 matrix associated to a vector–
valued function.

Definition 3.11. Let f : Rn → Rm be a function from the Euclidean n–space to the Euclidean m–space. f has m real–valued component functions:

f1 (x1 , . . . , xn ) , . . . , fm (x1 , . . . , xn ) .

If the partial derivatives of the component functions exist, they can be organized in an m–by–n matrix, namely the Jacobian matrix J of f :

$$J = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{pmatrix} := \frac{\partial(f_1, \ldots, f_m)}{\partial(x_1, \ldots, x_n)} .$$

The i–th row of J corresponds to the gradient ∇fi of the i–th component function fi , for i = 1 , . . . , m .
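A Jacobian can be approximated column by column with central differences; the sketch below (for an illustrative map of our choosing, not from the text) builds the m–by–n matrix whose i–th row approximates ∇fi :

```python
import math

# Numerical Jacobian of the illustrative map f : R^2 -> R^2,
# f(x, y) = (x y, sin(x) + y), via central differences.
def f(v):
    x, y = v
    return [x * y, math.sin(x) + y]

def jacobian(f, v, h=1e-6):
    m, n = len(f(v)), len(v)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):                 # perturb one variable at a time
        vp = list(v); vp[j] += h
        vm = list(v); vm[j] -= h
        fp, fm = f(vp), f(vm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

J = jacobian(f, [1.0, 2.0])
# exact Jacobian at (1, 2): [[y, x], [cos x, 1]] = [[2, 1], [cos 1, 1]]
```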

We introduce, now, the class of positive homogeneous functions.

Definition 3.12. A function f : Rn \ {0} → R is positive homogeneous, of degree k , if for any x ∈ Rn \ {0} and any α > 0 :

f (α x) = αk f (x) .

The following Theorem 3.13 is known as Euler Theorem on homogeneous functions.

Theorem 3.13. If f : Rn \ {0} → R is continuously differentiable, then f is positive homogeneous,


of degree k , if and only if:
x · ∇f (x) = k f (x) .
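Euler's identity is easy to test numerically; the snippet below (with an illustrative degree–3 homogeneous function of our choosing, not from the text) compares x · ∇f(x) with k f(x):

```python
# Euler's identity x . grad f(x) = k f(x), tested on the illustrative
# degree-3 positive homogeneous function f(x, y) = x^3 + 2 x y^2.
def f(x, y):
    return x ** 3 + 2 * x * y ** 2

def grad(f, x, y, h=1e-6):
    # central differences in each variable
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

x, y, k = 1.5, -0.7, 3
gx, gy = grad(f, x, y)
lhs = x * gx + y * gy     # x . grad f(x)
rhs = k * f(x, y)         # k f(x)
err = abs(lhs - rhs)
```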

3.3 Maxima and Minima


Definition 3.14. Let V be an open set in Rn , let a ∈ V and suppose that f : V → R . Then:

(i) f (a) is called a local minimum of f if and only if there exists r > 0 such that f (a) ≤ f (x) for
all x ∈ Br (a) , an open ball neighborhood of a (recall Definition 1.13);

(ii) f (a) is called a local maximum of f if and only if there exists r > 0 such that f (a) ≥ f (x) for
all x ∈ Br (a) ;

(iii) f (a) is called a local extremum of f if and only if f (a) is a local maximum or a local minimum
of f .
5 Carl Gustav Jacob Jacobi (1804–1851), German mathematician.

Remark 3.15. If the first–order partial derivatives of f exist at a, and if f (a) is a local extremum of f , then ∇f (a) = 0 .
In fact, the one–dimensional function:

g(t) = f (a1 , . . . , aj−1 , t , aj+1 , . . . , an )

has a local extremum at t = aj for each j = 1 , . . . , n . Hence, by the one–dimensional theory:

$$\frac{\partial f}{\partial x_j}(a) = g'(a_j) = 0 .$$

As in the one–dimensional case, condition ∇f (a) = 0 is necessary but not sufficient for f (a) to be a
local extremum.

Example 3.16. There exist continuously differentiable functions satisfying ∇f (a) = 0 and such that
f (a) is neither a local maximum nor a local minimum.
Consider, for instance, in the case n = 2 , the following function:

f (x, y) = y 2 − x2 .

It is easy to check that ∇f (0) = 0 , but the origin is a saddle point, as shown in Figure 3.1.

Figure 3.1: Saddle point of the function z = y 2 − x2 .

Let us give a formal definition to such a situation.

Definition 3.17. Let V be open in Rn , let a ∈ V , and let f : V → R be differentiable at a .


Point a is called a saddle point of f if ∇f (a) = 0 and there exists r0 > 0 such that, given any
ρ ∈ (0 , r0 ) , there exist points x , y ∈ Bρ (a) satisfying:

f (x) < f (a) < f (y) .

3.4 Sufficient conditions


To establish sufficient conditions for optimization, we introduce the notion of Hessian6 matrix.
6 Ludwig Otto Hesse (1811–1874), German mathematician.

Definition 3.18. Let V ⊆ Rn be an open set and let f : V → R be a C 2 function. The Hessian matrix of f at x ∈ V (or, simply, the Hessian) is the symmetric square matrix formed by the second–order partial derivatives of f , evaluated at point x :

$$H(f)(x) := \left( \frac{\partial^2 f}{\partial x_i \, \partial x_j}(x) \right) , \quad \text{for } i, j = 1, \ldots, n .$$

Tests for extrema and saddle points, in the simplest situation of n = 2 , are stated in Theorem 3.19.

Theorem 3.19. Let V be open in R2 , consider (a , b) ∈ V , and suppose that f : V → R satisfies ∇f (a, b) = 0 . Suppose further that f ∈ C 2 and set:

$$D := f_{xx}(a, b)\, f_{yy}(a, b) - \big( f_{xy}(a, b) \big)^2 .$$

(i) If D > 0 and fxx (a , b) > 0 , then f (a, b) is a local minimum.

(ii) If D > 0 and fxx (a , b) < 0 , then f (a, b) is a local maximum.

(iii) If D < 0 , then (a , b) is a saddle point.

Notice that D is the determinant of the Hessian of f evaluated at (a , b) :

D = det[H(f )(a, b)] .

Example 3.20. A couple of examples are provided here, and the Reader is invited to verify the stated
results.

(1) Function f (x , y) = x3 + 6 x y − 3 y 2 + 2 has a saddle point in (a , b) = (0 , 0) and a local maximum in (a , b) = (−2 , −2) .

(2) Function f (x , y) = x2 + y 3 − 2 x y − y admits a saddle point of coordinates (a , b) = (−1/3 , −1/3) and a local minimum in (a , b) = (1 , 1) .
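The test of Theorem 3.19 can be applied mechanically to Example 3.20 (2); the sketch below (ours) checks that the gradient vanishes at both candidate points and evaluates D at each:

```python
# Second-derivative test (Theorem 3.19) applied to Example 3.20 (2):
# f(x, y) = x^2 + y^3 - 2 x y - y, with fxx = 2, fyy = 6 y, fxy = -2.
def D(x, y):
    fxx, fyy, fxy = 2.0, 6.0 * y, -2.0
    return fxx * fyy - fxy ** 2

def grad(x, y):
    # gradient (fx, fy); it must vanish at both candidate points
    return (2 * x - 2 * y, 3 * y ** 2 - 2 * x - 1)

g1 = grad(1.0, 1.0)
g2 = grad(-1.0 / 3.0, -1.0 / 3.0)
D_min = D(1.0, 1.0)                    # D > 0 with fxx > 0: local minimum
D_saddle = D(-1.0 / 3.0, -1.0 / 3.0)   # D < 0: saddle point
```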

Tests for extrema and saddle points, in the general situation of n variables, are stated in Theorem
3.21.

Theorem 3.21. In n variables, a critical point x0 :

(i) is a local minimum for f ∈ C 2 if, for each k = 1 , . . . , n :

det[ Hk (f )(x0 ) ] > 0 ;

(ii) is a local maximum for f ∈ C 2 if, for each k = 1 , . . . , n :

det[ (−1)k Hk (f )(x0 ) ] > 0 ,

where Hk (f ) denotes the leading principal submatrix of order k of H(f ) .

3.5 Lagrange multipliers


In applications, it is often necessary to optimize functions under some constraints. The Lagrange multipliers Theorem 3.22 provides necessary optimality conditions for a problem of the following kind:

max f (x) subject to g(x) = 0

or

min f (x) subject to g(x) = 0 .

Theorem 3.22 (Lagrange multipliers – general case). Let m < n , let V be open in Rn , and let f , gj : V → R be C 1 on V , for j = 1 , 2 . . . , m . Suppose that:

$$\frac{\partial(g_1, \ldots, g_m)}{\partial(x_1, \ldots, x_n)}$$

has rank m at x0 ∈ V , where gj (x0 ) = 0 for j = 1, 2, . . . , m . Assume further that x0 is a local extremum for f in the set:

M = {x ∈ V : gj (x) = 0 , j = 1 , . . . , m} .

Then, there exist scalars λ1 , . . . , λm , such that:

$$\nabla \left( f(x_0) - \sum_{k=1}^{m} \lambda_k\, g_k(x_0) \right) = 0 .$$

We will limit the proof of the Lagrange multipliers Theorem 3.22 to a two–dimensional context. To this aim, it is first necessary to consider some preliminary results; we will resume the proof in §3.8.

3.6 Mean–Value theorem


We begin with recalling the definition of a segment in the Euclidean space.

Definition 3.23. Given x , y ∈ Rn , the segment joining x and y is defined as:

[x , y] := {z ∈ Rn : z = t x + (1 − t) y , 0 ≤ t ≤ 1} .

The one–dimensional Mean–Value theorem (already met in Example 3.3), also called Lagrange Mean–
Value theorem or First Mean–Value theorem, can be extended to the Euclidean space Rn .

Theorem 3.24 (Mean–Value). Let A ⊂ Rn , and f : A → R . Consider x , y ∈ Rn such that [x , y] ⊂ Ao , the interior of A (see Definition 1.18). Assume that f is differentiable on [x , y] . Then, there exists z ∈ [x , y] such that:

f (x) − f (y) = ∇f (z) · (x − y) .

Proof. Define ϕ : [0 , 1] → Rn , ϕ(t) = y + t (x − y) . Observe that ϕ is continuous and differentiable for any t ∈ (0 , 1) . Moreover, ϕ′(t) = x − y . It follows that g = f ◦ ϕ : [0 , 1] → R is continuous and differentiable in (0 , 1) . We can thus apply the one–dimensional version of the Mean–Value Theorem, to infer the existence of η ∈ (0 , 1) such that:

f (x) − f (y) = g(1) − g(0) = g′(η) .

On the other hand, the Chain Rule implies:

g′(η) = ∇f (ϕ(η)) · ϕ′(η) = ∇f (ϕ(η)) · (x − y) .

Since z = ϕ(η) ∈ [x , y] , Theorem 3.24 is proved.

3.7 Implicit function theorem


The next fundamental step is the implicit function theorem, proved by Dini in 1878. For simplicity, we provide its proof only in the R2 case, presented in Theorem 3.25; its generalization to Rn is quite straightforward, and we state it in Theorem 3.26.

Theorem 3.25 (Implicit Function – case n = 2). Let Ω be an open set in R2 , and let f : Ω → R be a C 1 function. Suppose there exists (x0 , y0 ) ∈ Ω such that f (x0 , y0 ) = 0 and fy (x0 , y0 ) ≠ 0 . Then, there exist δ , ε > 0 such that, for any x ∈ (x0 − δ , x0 + δ) there exists a unique y = ϕ(x) ∈ (y0 − ε , y0 + ε) such that:

f (x , y) = 0 .

Moreover, the function y = ϕ(x) is C 1 in (x0 − δ , x0 + δ) and it holds that, for any x ∈ (x0 − δ , x0 + δ) :

$$\varphi'(x) = - \frac{f_x(x, \varphi(x))}{f_y(x, \varphi(x))} .$$

Proof. Let us assume that fy (x0 , y0 ) > 0 . Since the function fy (x , y) is continuous, it is possible to find a ball Bδ1 (x0 , y0 ) in which it is verified that (x , y) ∈ Bδ1 (x0 , y0 ) =⇒ fy (x , y) > 0 .
This means that, with an appropriate narrowing of parameters ε and δ , the function y ↦ f (x , y) can be assumed to be an increasing function, for any x ∈ (x0 − δ , x0 + δ) .
In particular, y ↦ f (x0 , y) is increasing and, since f (x0 , y0 ) = 0 by assumption, the following inequalities are verified, for ε small enough:

f (x0 , y0 + ε) > 0 and f (x0 , y0 − ε) < 0 .

Using, again, continuity of f and an appropriate narrowing of δ , we infer that, for any x ∈ (x0 − δ , x0 + δ) :

f (x , y0 + ε) > 0 and f (x , y0 − ε) < 0 .

In conclusion, using continuity of y ↦ f (x , y) and the Bolzano theorem7 on the existence of zeros, we have shown that, for any x ∈ (x0 − δ , x0 + δ) , there is a unique y = ϕ(x) ∈ (y0 − ε , y0 + ε) such that:

f (x , y) = f (x , ϕ(x)) = 0 .

To prove the second part of Theorem 3.25, we need to show that ϕ(x) is differentiable. To this aim, consider h ∈ R such that x + h ∈ (x0 − δ , x0 + δ) . In this way, from the Mean–Value Theorem 3.24, there exists θ ∈ (0 , 1) such that:

$$\begin{aligned} 0 &= f\big( x + h \,,\, \varphi(x+h) \big) - f\big( x \,,\, \varphi(x) \big) \\ &= f_x\Big( x + \theta h \,,\, \varphi(x) + \theta \big( \varphi(x+h) - \varphi(x) \big) \Big)\, h \\ &\quad + f_y\Big( x + \theta h \,,\, \varphi(x) + \theta \big( \varphi(x+h) - \varphi(x) \big) \Big) \big( \varphi(x+h) - \varphi(x) \big) , \end{aligned}$$

thus:

$$\frac{\varphi(x+h) - \varphi(x)}{h} = - \frac{f_x\Big( x + \theta h \,,\, \varphi(x) + \theta \big( \varphi(x+h) - \varphi(x) \big) \Big)}{f_y\Big( x + \theta h \,,\, \varphi(x) + \theta \big( \varphi(x+h) - \varphi(x) \big) \Big)} .$$

The thesis follows by taking, in the equality above, the limit for h → 0 , observing that the evaluation points tend to (x , ϕ(x)) , and recalling that f (x , y) is C 1 .
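Theorem 3.25 can be illustrated on the circle f(x, y) = x² + y² − 1 near (0 , 1) (our example, not from the text), where ϕ(x) = √(1 − x²) is known explicitly, so the formula ϕ′ = −fx/fy can be compared with a direct difference quotient:

```python
import math

# Implicit function theorem, illustrated on f(x, y) = x^2 + y^2 - 1
# near (x0, y0) = (0, 1), where fy = 2 y != 0. Here phi(x) = sqrt(1 - x^2)
# solves f(x, phi(x)) = 0 explicitly on (-1, 1).
x = 0.4
phi = math.sqrt(1.0 - x * x)

# derivative predicted by the theorem: phi'(x) = -fx/fy = -(2 x)/(2 phi)
predicted = -(2 * x) / (2 * phi)

# compare with a central difference quotient of phi itself
h = 1e-6
numeric = (math.sqrt(1 - (x + h) ** 2) - math.sqrt(1 - (x - h) ** 2)) / (2 * h)
err = abs(predicted - numeric)
```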

We are now ready to state the Implicit function Theorem 3.26 in the general n–dimensional case; here,
Ω is an open set in Rn × R , thus (x, y) ∈ Ω means that x ∈ Rn and y ∈ R .

Theorem 3.26 (Implicit Function – general case). Let Ω ⊆ Rn × R be open, and let f ∈ C 1 (Ω , R) . Assume that there exists (x0 , y0 ) ∈ Ω such that f (x0 , y0 ) = 0 and fy (x0 , y0 ) ≠ 0 .
Then, there exist an open ball Bδ (x0 ) , an open interval (y0 − ε , y0 + ε) and a function ϕ : Bδ (x0 ) → (y0 − ε , y0 + ε) , such that:

(i) Bδ (x0 ) × (y0 − ε , y0 + ε) ⊂ Ω ;


7 Bernard Placidus Johann Nepomuk Bolzano (1781–1848), Czech mathematician, theologian and philosopher. For the theorem of Bolzano see, for example, mathworld.wolfram.com/BolzanoTheorem.html

(ii) (x , y) ∈ Bδ (x0 ) × (y0 − ε , y0 + ε) =⇒ fy (x , y) ≠ 0 ;

(iii) for any (x , y) ∈ Bδ (x0 ) × (y0 − ε , y0 + ε) it holds:

f (x , y) = 0 ⇐⇒ y = ϕ(x) ;

(iv) ϕ ∈ C 1 (Bδ (x0 )) and:

$$\varphi_{x_j}(x) = - \frac{f_{x_j}\big( x \,,\, \varphi(x) \big)}{f_y\big( x \,,\, \varphi(x) \big)} .$$

3.8 Proof of Theorem 3.22


We can now prove the multipliers Theorem 3.22; as said before, the proof is given only for the n = 2
case, presented in Theorem 3.27.

Theorem 3.27 (Lagrange multipliers – case n = 2). Let A ⊂ R2 be open, and let f , g : A → R be C 1 functions. Consider the subset of A :

M = {(x , y) ∈ A : g(x , y) = 0} .

Assume that ∇g(x , y) ≠ 0 for any (x , y) ∈ M . Assume further that (x0 , y0 ) ∈ M is a maximum or a minimum of f (x , y) for any (x , y) ∈ M .
Then, there exists λ ∈ R such that:

∇f (x0 , y0 ) = λ ∇g(x0 , y0 ) .

Proof. Since ∇g(x0 , y0 ) ≠ 0 , we can assume that gy (x0 , y0 ) ≠ 0 . Thus, from the Implicit function Theorem 3.25, there exist ε , δ > 0 such that, for x ∈ (x0 − δ , x0 + δ) , y ∈ (y0 − ε , y0 + ε) , it holds:

g(x , y) = g(x , ϕ(x)) = 0 .

Consider the function x ↦ f (x , ϕ(x)) := h(x) , for x ∈ (x0 − δ , x0 + δ) . By assumption, h(x) admits an extremum in x = x0 , therefore its derivative in x0 vanishes. Using the Chain Rule, it follows:

$$0 = h'(x_0) = f_x\big( x_0 \,,\, \varphi(x_0) \big) + f_y\big( x_0 \,,\, \varphi(x_0) \big)\, \varphi'(x_0) . \qquad (3.2)$$

Again, use the Implicit function Theorem 3.25, which gives:

$$\varphi'(x_0) = - \frac{g_x\big( x_0 \,,\, \varphi(x_0) \big)}{g_y\big( x_0 \,,\, \varphi(x_0) \big)} .$$

Substituting into (3.2), recalling that ϕ(x0 ) = y0 , we get:

fx (x0 , y0 ) gy (x0 , y0 ) − fy (x0 , y0 ) gx (x0 , y0 ) = 0 ,

which can be rewritten as:

$$\det \begin{pmatrix} f_x(x_0, y_0) & f_y(x_0, y_0) \\ g_x(x_0, y_0) & g_y(x_0, y_0) \end{pmatrix} = 0 .$$

Since the above determinant is zero, its rows are proportional (the second row being non–zero), implying that there exists λ ∈ R such that:

( fx (x0 , y0 ) , fy (x0 , y0 ) ) = λ ( gx (x0 , y0 ) , gy (x0 , y0 ) ) .

3.9 Sufficient conditions


The multipliers Theorem 3.22 expresses necessary conditions for the existence of an optimal solution. Stating sufficient conditions is also important; to such an aim, in the two–dimensional case, the main tool is the so–called Bordered Hessian.
Suppose we are dealing with the simplest case of constrained optimization, that is, find the maximum value (max) or the minimum value (min) of f (x , y) under the constraint g(x , y) = 0 . We form the Lagrangian functional L(x , y , λ) = f (x , y) − λ g(x , y) and, after solving the critical point system:

$$\begin{cases} f'_x(x, y) - \lambda\, g'_x(x, y) = 0 , \\ f'_y(x, y) - \lambda\, g'_y(x, y) = 0 , \\ g(x, y) = 0 , \end{cases}$$

we evaluate:

$$\Lambda = \det \begin{pmatrix} L''_{xx} & L''_{xy} & g_x \\ L''_{xy} & L''_{yy} & g_y \\ g_x & g_y & 0 \end{pmatrix} .$$

Then:

(a) Λ > 0 indicates a maximum value;

(b) Λ < 0 indicates a minimum value.
Example 3.28. An example of interest in Economics concerns the maximization of a production function of Cobb–Douglas kind. The mathematical problem can be modelled as:

$$\max f(x, y) = x^a\, y^{1-a} \quad \text{subject to} \quad p\, x + q\, y - c = 0 , \qquad (3.3)$$

where 0 < a < 1 , and p , q , c > 0 .
In a problem like (3.3), f (x , y) is referred to as the objective function while, by defining the function w(x , y) = p x + q y − c , the constraint is given by w(x , y) = 0 .
The Lagrangian is L(x , y ; m) = f (x , y) − m w(x , y) . The critical point equations are:

$$\begin{cases} L_x(x, y \,; m) = a\, x^{a-1} y^{1-a} - m\, p = 0 , \\ L_y(x, y \,; m) = (1-a)\, x^{a} y^{-a} - m\, q = 0 , \\ L_m(x, y \,; m) = p\, x + q\, y - c = 0 . \end{cases}$$

Eliminating m from the first two equations, by subtraction, we obtain the two–by–two linear system in the variables x , y :

$$\begin{cases} (1-a)\, p\, x - a\, q\, y = 0 , \\ p\, x + q\, y - c = 0 . \end{cases}$$

Solving the 2 × 2 system and recovering m from m = (a\, x^{a-1} y^{1-a})\, p^{-1} , we find the critical point:

$$x = \frac{a\, c}{p} \,, \qquad y = \frac{c\, (1-a)}{q} \,, \qquad m = (1-a)^{1-a}\, a^{a}\, p^{-a}\, q^{a-1} ,$$

which is a maximum; in fact, denoting the second–order derivatives of the Lagrangian by Lxx = a (a − 1) x^{a−2} y^{1−a} , Lxy = a (1 − a) x^{a−1} y^{−a} , Lyy = −a (1 − a) x^{a} y^{−a−1} , and recalling that wx = p , wy = q , the Bordered Hessian evaluated at the critical point is:

$$\Lambda = \begin{pmatrix} a(a-1) \left( \dfrac{a c}{p} \right)^{a-2} \left( \dfrac{c(1-a)}{q} \right)^{1-a} & a(1-a) \left( \dfrac{a c}{p} \right)^{a-1} \left( \dfrac{c(1-a)}{q} \right)^{-a} & p \\[6pt] a(1-a) \left( \dfrac{a c}{p} \right)^{a-1} \left( \dfrac{c(1-a)}{q} \right)^{-a} & -a(1-a) \left( \dfrac{a c}{p} \right)^{a} \left( \dfrac{c(1-a)}{q} \right)^{-a-1} & q \\[6pt] p & q & 0 \end{pmatrix}$$

and its determinant is positive:

$$\det \Lambda = \frac{a^{a-1}\, p^{2-a}\, q^{a+1}}{c\, (1-a)^{a}} > 0 .$$
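The critical point and the sign of the Bordered Hessian in Example 3.28 can be verified numerically; the parameter choice a = 0.3, p = 2, q = 5, c = 100 below is ours, purely for illustration, and the 3 × 3 determinant is expanded along the last row and column as −Lxx q² + 2 Lxy p q − Lyy p²:

```python
# Numerical check of Example 3.28 for illustrative parameters
# (a, p, q, c are our choices, not from the text).
a, p, q, c = 0.3, 2.0, 5.0, 100.0
x = a * c / p                      # stated critical point
y = c * (1 - a) / q
m = (1 - a) ** (1 - a) * a ** a * p ** (-a) * q ** (a - 1)

constraint = p * x + q * y - c                  # should vanish
Lx = a * x ** (a - 1) * y ** (1 - a) - m * p    # stationarity in x
Ly = (1 - a) * x ** a * y ** (-a) - m * q       # stationarity in y

# bordered-Hessian determinant versus the closed form in the text;
# for [[Lxx, Lxy, p], [Lxy, Lyy, q], [p, q, 0]] the determinant
# equals -Lxx q^2 + 2 Lxy p q - Lyy p^2.
Lxx = a * (a - 1) * x ** (a - 2) * y ** (1 - a)
Lxy = a * (1 - a) * x ** (a - 1) * y ** (-a)
Lyy = -a * (1 - a) * x ** a * y ** (-a - 1)
det_bordered = -Lxx * q * q + 2 * Lxy * p * q - Lyy * p * p
closed_form = a ** (a - 1) * p ** (2 - a) * q ** (a + 1) / (c * (1 - a) ** a)
```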
Example 3.29. In this example, we revert the point of view between constraint and objective function in a problem like (3.3). Here, the idea is to minimize the total cost, fixing the level of production. For the sake of simplicity, we treat the particular two–dimensional problem of finding maxima and minima of f (x , y) = 2 x + y , subject to the constraint x1/4 y 3/4 = 1 , x > 0 , y > 0 .
The critical point equations are:

$$\begin{cases} 2 - \dfrac{m\, y^{3/4}}{4\, x^{3/4}} = 0 , \\[6pt] 1 - \dfrac{3\, m\, x^{1/4}}{4\, y^{1/4}} = 0 , \\[6pt] x^{1/4}\, y^{3/4} = 1 . \end{cases}$$

Eliminating m from the first two equations, by substitution, we obtain the system in the variables x , y :

$$\begin{cases} \dfrac{y}{x} = 6 , \\[6pt] x^{1/4}\, y^{3/4} = 1 . \end{cases}$$

Solving this system and recovering m from m = (4\, y^{1/4})/(3\, x^{1/4}) , the critical point is found:

$$x = 6^{-3/4} \,, \qquad y = 6^{1/4} \,, \qquad m = \frac{4 \times 2^{1/4}}{3^{3/4}} .$$
The Bordered Hessian is:

$$\Lambda = \begin{pmatrix} \dfrac{3\, m\, y^{3/4}}{16\, x^{7/4}} & -\dfrac{3\, m}{16\, x^{3/4}\, y^{1/4}} & \dfrac{y^{3/4}}{4\, x^{3/4}} \\[8pt] -\dfrac{3\, m}{16\, x^{3/4}\, y^{1/4}} & \dfrac{3\, m\, x^{1/4}}{16\, y^{5/4}} & \dfrac{3\, x^{1/4}}{4\, y^{1/4}} \\[8pt] \dfrac{y^{3/4}}{4\, x^{3/4}} & \dfrac{3\, x^{1/4}}{4\, y^{1/4}} & 0 \end{pmatrix} .$$

Evaluating Λ at the critical point and computing its determinant:

$$\det \Lambda = - \frac{3 \cdot 3^{1/4}}{2^{3/4}} ,$$

we see that we found a minimum.
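The stated critical point can be checked by substitution into the three critical point equations; a minimal Python check (ours):

```python
# Check of the critical point stated in Example 3.29:
# x = 6^(-3/4), y = 6^(1/4), m = 4 * 2^(1/4) / 3^(3/4),
# for f(x, y) = 2 x + y under the constraint x^(1/4) y^(3/4) = 1.
x = 6.0 ** -0.75
y = 6.0 ** 0.25
m = 4.0 * 2.0 ** 0.25 / 3.0 ** 0.75

constraint = x ** 0.25 * y ** 0.75 - 1.0              # x^(1/4) y^(3/4) = 1
Lx = 2.0 - m * y ** 0.75 / (4.0 * x ** 0.75)          # first equation
Ly = 1.0 - 3.0 * m * x ** 0.25 / (4.0 * y ** 0.25)    # second equation
ratio = y / x                                          # should equal 6
```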
Example 3.30. The problem presented here is typical in the determination of an optimal investment portfolio in Corporate Finance.
We seek to minimize f (x , y , z) = x2 + 2 y 2 + 3 z 2 + 2 x z + 2 y z with the constraints:

x + y + z = 1 , 2x + y + 3z = 7 .

The Lagrangian is:

L(x , y , z ; m , n) = x2 + 2 y 2 + 3 z 2 + 2 x z + 2 y z − m (x + y + z − 1) − n (2 x + y + 3 z − 7) ,

hence the optimality conditions are:

$$\begin{cases} 2x + 2z = m + 2n , \\ 4y + 2z = m + n , \\ 2x + 2y + 6z = m + 3n , \\ x + y + z = 1 , \\ 2x + y + 3z = 7 . \end{cases}$$


The solution to this 5 × 5 linear system is

x = 0, y = −2 , z = 3, m = −10 , n = 8.

The convexity of the objective function ensures that the found solution is the absolute minimum.
Though this statement should be proved rigorously, we do not treat it here.
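The stated solution can be verified by substituting it into the five optimality conditions; a direct check (ours) in Python:

```python
# Direct verification of the solution stated in Example 3.30 by
# substituting x = 0, y = -2, z = 3, m = -10, n = 8 into the
# five optimality conditions.
x, y, z, m, n = 0.0, -2.0, 3.0, -10.0, 8.0

residuals = [
    2 * x + 2 * z - (m + 2 * n),          # L_x = 0
    4 * y + 2 * z - (m + n),              # L_y = 0
    2 * x + 2 * y + 6 * z - (m + 3 * n),  # L_z = 0
    x + y + z - 1,                        # first constraint
    2 * x + y + 3 * z - 7,                # second constraint
]
# minimal value of the objective at the solution
objective = x**2 + 2*y**2 + 3*z**2 + 2*x*z + 2*y*z
```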
4 Ordinary differential equations of first order: general theory

Our goal, in introducing ordinary differential equations, is to provide a brief account of methods of explicit integration, for the most common types of ordinary differential equations. However, the main theoretical problem, concerning existence and uniqueness of the solution of the Initial Value Problem, modelled by (4.3), is not taken for granted. Indeed, the proof of the Picard–Lindelöf Theorem 4.17 is presented in detail: to do this, we will use some notions from the theory of uniform convergence of sequences of functions, already discussed in Theorem 2.15. An abstract approach followed, for instance, in Chapter 2 of [60], is avoided here.
In the following Chapter 5, we present some classes of ordinary differential equations for which, using
suitable techniques, the solution can be described in terms of known functions: in this case, we say
that we are able to find an exact solution of the given ordinary differential equation.

4.1 Preliminary notions


Let x be an independent variable, moving on the real axis, and let y be a dependent variable, that is y = y(x) . Let further y′ , y′′ , . . . , y(n) represent successive derivatives of y with respect to x . An ordinary differential equation (ODE) is any relation of equality involving at least one of those derivatives and the function itself. For instance, the equation below:

$$\frac{dy}{dx}(x) := y'(x) = 2\, x\, y(x) \qquad (4.1)$$

states that the first derivative of the function y equals the product of 2 x and y . An additional, implicit statement is that (4.1) holds only for all those x for which both the function and its first derivative are defined.
The term ordinary distinguishes this kind of equation from a partial differential equation, which would involve two or more independent variables, a dependent variable and the corresponding partial derivatives, i.e., for example:

$$\frac{\partial f(x, y)}{\partial x} + 4\, x\, y\, \frac{\partial f(x, y)}{\partial y} = x + y .$$
We will present partial differential equations, used in Quantitative Finance, in Chapter 13.
The general ordinary differential equation of first order has the form:

F (x , y , y′) = 0 . (4.2)

A function y = y(x) is called a solution of (4.2), on an interval J , if y(x) is differentiable on J and if the following equality holds for all x ∈ J :

F (x , y(x) , y′(x)) ≡ 0 .
In general, we would like to know whether, under certain circumstances, a differential equation has a
unique solution. To accomplish this property, it is usual to consider the so–called Initial Value Problem
(or IVP) which, in the simplest scalar case, takes the form presented in Definition 4.1.


Definition 4.1. Given f : Ω ⊂ R2 → R , with Ω an open set, the initial value problem (also called Cauchy problem) takes the form:

$$\begin{cases} y' = f(x, y) , & x \in I , \\ y(x_0) = y_0 , & x_0 \in I , \ y_0 \in J , \end{cases} \qquad (4.3)$$

where I and J are intervals with I × J ⊆ Ω , and where we have simply denoted y in place of y(x) .

Remark 4.2. We say that differential equations are studied by quantitative or exact methods when
they can be solved completely, that is to say, all their solutions are known and could be written in
closed form, in terms of elementary functions or, at times, in terms of special functions (or in terms
of inverses of elementary and special functions).

We now provide some examples of ordinary differential equations.

Example 4.3. Let us consider the differential equation:

$$y' = \frac{1}{x^2} . \qquad (4.4)$$

If we rewrite equation (4.4) as:

$$\frac{d}{dx} \left( y(x) + \frac{1}{x} \right) = 0 ,$$

we see that we are dealing with a function whose derivative is zero. If we seek solutions defined on an interval, then we can exploit a consequence of the Mean–Value Theorem 3.24 (namely, a function that is continuous and differentiable on [a , b] and has null first–derivative on (a , b) , is constant on (a , b)) , to see that:

$$y(x) + \frac{1}{x} = C ,$$

for some constant C and for all x ∈ I , where I is an interval not containing zero. In other words, as long as we consider the domain of solutions to be an interval like I , any solution of the differential equation (4.4) takes the form:

$$y(x) = C - \frac{1}{x} , \quad \text{for } x \in I .$$

By choosing an initial condition, for example y(1) = 5 , a particular value C = 6 is determined, so that:

$$y(x) = 6 - \frac{1}{x} , \quad \text{for } x \in I .$$
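The computation can be confirmed numerically; the sketch below (ours) compares a difference quotient of y(x) = 6 − 1/x with 1/x² at a few points of an interval not containing zero, and checks the initial condition:

```python
# Direct check that y(x) = 6 - 1/x solves y' = 1/x^2 with y(1) = 5,
# comparing a central difference of y with the right-hand side.
def y(x):
    return 6.0 - 1.0 / x

h = 1e-6
sample_points = [0.5, 1.0, 2.0, 3.0]
max_err = max(abs((y(x + h) - y(x - h)) / (2 * h) - 1.0 / x ** 2)
              for x in sample_points)
initial = y(1.0)   # should equal 5
```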

We can also follow a reverse approach, in the sense that, as illustrated in Example 4.4, given a
geometrical locus, we obtain its ordinary differential equation.

Example 4.4. Consider the family of parabolas of equation:

y = α x2 . (4.5)

Any parabola in the family has the y–axis as common axis, with vertex in the origin. Differentiating, we get:

y′ = 2 α x . (4.6)

Eliminating α from (4.5) and (4.6), we obtain the differential equation:

$$y' = \frac{2\, y}{x} . \qquad (4.7)$$

This means that any parabola in the family is a solution to the differential equation (4.7).

4.1.1 Systems of ODEs: equations of higher order


It is possible to consider differential equations of order higher than one, or systems of many differential
equations of first order.
Example 4.5. The following ordinary differential equations are, respectively, of order 2 and of order 3 :

x y′′ + 2 y′ + 3 y − ex = 0 ,

(y(3))2 + y′′ + y = x .

The second equation is quadratic in the highest derivative y(3) , therefore we say, also, that it has degree 2.
A system of first–order differential equations is, for example, the following one:

$$\begin{cases} y_1' = y_1 (a - b\, y_2) , \\ y_2' = y_2 (c\, y_1 - d) , \end{cases} \qquad (4.8)$$

in which y1 = y1 (x) and y2 = y2 (x) are functions of a variable x that, in most applications, takes the meaning of time. System (4.8) is probably the most famous system of ordinary differential equations, as it represents the Lotka–Volterra predator–prey system; see, for instance, [23]. Notice that the right hand–sides in (4.8) do not depend explicitly on x : in this particular case, the system is called autonomous.
We now state, formally, the definition of Initial Value Problem for a system of n ordinary differential
equations, each of first order, and for a differential equation of order n , with integer n ≥ 1 in both
cases.
Definition 4.6. Consider Ω , open set in R × Rn , with integer n ≥ 1 , and let f : Ω → Rn be a vector–valued continuous function of (n + 1)–variables. Let further (x0 , y0 ) ∈ Ω and I be an open interval such that x0 ∈ I .
Then, a vector–valued function s : I → Rn is a solution of the initial value problem:

$$\begin{cases} y' = f(x, y) , \\ y(x_0) = y_0 , \end{cases} \qquad (4.9)$$

if the following conditions are verified:

(i) s ∈ C 1 (I) ;

(ii) (x , s(x)) ∈ Ω for any x ∈ I ;

(iii) s(x0 ) = y0 ;

(iv) s′(x) = f (x , s(x)) for any x ∈ I .

Remark 4.7. In the Lotka–Volterra case (4.8), it is n = 2 , thus y = (y1 , y2 ) , the open set is Ω = R × (0 , +∞) × (0 , +∞) and the continuous function is f (x , y) = f (x , y1 , y2 ) = ( y1 (a − b y2 ) , y2 (c y1 − d) ) .
The rigorous definition of initial value problem for a differential equation of order n is provided below.

Definition 4.8. Consider an open set Ω ⊆ R × Rn , where n ≥ 1 is integer. Let F : Ω → R be a scalar continuous function of (n + 1)–variables. Let further (x0 , b) ∈ Ω and I be an open interval such that x0 ∈ I . Finally, denote b = (b1 , . . . , bn ) .
Then, a real function s : I → R is a solution of the initial value problem:

$$\begin{cases} y^{(n)} = F(x, y, y', y'', \cdots, y^{(n-1)}) , \\ y(x_0) = b_1 , \\ y'(x_0) = b_2 , \\ \ \vdots \\ y^{(n-1)}(x_0) = b_n , \end{cases} \qquad (4.10)$$

if:

(i) s ∈ C n (I) ;

(ii) (x , s(x) , s′(x) , . . . , s(n−1)(x)) ∈ Ω for any x ∈ I ;

(iii) s(j)(x0 ) = bj+1 , for j = 0 , 1 , . . . , n − 1 ;

(iv) s(n)(x) = F (x , s(x) , s′(x) , · · · , s(n−1)(x)) .




Definition 4.9. Consider a family of functions y(x ; c1 , . . . , cn ) , depending on x and on n parameters c1 , . . . , cn , which vary within a set M ⊂ Rn . Such a family is called a complete integral, or a general solution, of the n–th order equation:

y(n) = F (x , y , y′ , y′′ , · · · , y(n−1)) , (4.11)

if it satisfies two requirements:

(1) each function y(x ; c1 , . . . , cn ) is a solution to (4.11) ;

(2) all solutions to (4.11) can be expressed as functions of the family itself, i.e., they take the form y(x ; c1 , . . . , cn ) .

Remark 4.10. Systems of first–order differential equations like (4.9) and equations of order n like (4.10) are intimately related. Given the n–th order equation (4.10), in fact, an equivalent system of the form (4.9) can be built, by introducing a new vector variable z = (z1, … , zn) and considering the system of differential equations:


    z1' = z2 ,
    z2' = z3 ,
    … ,                                                             (4.12)
    z_{n−1}' = zn ,
    zn' = F(x, z1, z2, … , zn)

with the set of initial conditions:

    z1(x0) = b1 ,  … ,  zn(x0) = bn .                               (4.13)

System (4.12) can be represented in the vectorial form (4.9), simply by setting z' = (z1', … , zn') , b = (b1, … , bn) and:

    f(x, z) = ( z2 , z3 , … , zn , F(x, z1, z2, … , zn) ) .

From Remark 4.10, the following Theorem 4.11 can be inferred, whose straightforward proof is omitted.

Theorem 4.11. A function s is a solution of the n–th order initial value problem (4.10) if and only if the vector function z = (s, s', … , s^(n−1)) solves system (4.12), with the initial conditions (4.13).
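The reduction of Remark 4.10 is easy to exercise numerically. Below is a minimal sketch (ours, not the book's): `to_system` builds the right-hand side f of (4.12) from F, and a classical Runge–Kutta integrator checks the familiar case y'' = −y, whose solution through y(0) = 1, y'(0) = 0 is cos x.

```python
import math

def to_system(F):
    """Rewrite y^(n) = F(x, y, y', ..., y^(n-1)) as z' = f(x, z), as in (4.12)."""
    def f(x, z):
        # z = (z1, ..., zn): shift the components and append F.
        return z[1:] + [F(x, *z)]
    return f

def rk4(f, x0, z0, x_end, steps):
    """Classical fourth-order Runge-Kutta integration of z' = f(x, z)."""
    h = (x_end - x0) / steps
    x, z = x0, list(z0)
    for _ in range(steps):
        k1 = f(x, z)
        k2 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
        k3 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
        k4 = f(x + h, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        x += h
    return z

# y'' = -y, i.e. F(x, z1, z2) = -z1, with y(0) = 1, y'(0) = 0.
f = to_system(lambda x, z1, z2: -z1)
z_at_1 = rk4(f, 0.0, [1.0, 0.0], 1.0, 200)
# z_at_1 approximates (cos 1, -sin 1).
```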

Remark 4.12. It is also possible to go the reverse way, that is to say, a system of n first–order differential equations can be transformed into a scalar differential equation of order n . We illustrate this procedure with the Lotka–Volterra system (4.8). The first step consists in computing the second derivative, with respect to x , of the first equation in (4.8):

    y1' = y1 (a − b y2)   =⇒   y1'' = y1' (a − b y2) − b y1 y2' .   (4.8a)



Then, the values of y1' and y2' from (4.8) are inserted in (4.8a), yielding:

    y1'' = y1 (a − b y2)² + b y1 y2 (d − c y1) .                    (4.8b)

Thirdly, using again the first equation in (4.8), y2 is expressed in terms of y1 and y1' , namely:

    y1' = y1 (a − b y2)   =⇒   y2 = (a y1 − y1') / (b y1) .         (4.8c)

Finally, (4.8c) is inserted into (4.8b), which provides the second–order differential equation for y1 :

    y1'' = (a y1 − y1') (d − c y1) + (y1')² / y1 .                  (4.8d)

4.2 Existence of solutions: Peano theorem


In this section, we briefly deal with the problem of the existence of solutions for ordinary differential equations, for which continuity is the only essential hypothesis. The Peano¹ Theorem 4.14 on existence is stated, but not demonstrated; the interested Reader is referred to Chapter 2 of [29].
We first state the Peano Theorem in the scalar case.

Theorem 4.13. Consider the rectangle R = [x0 − a, x0 + a] × [y0 − b, y0 + b] , and let f : R → R be continuous. Then, the initial value problem (4.3) admits at least one solution in a neighborhood of x0 .

To extend Theorem 4.13 to systems of ordinary differential equations, the rectangle R is replaced by
a parallelepiped, obtained as the Cartesian product of a real interval with an n–dimensional closed
ball.

Theorem 4.14 (Peano). Let us consider the (n + 1)–dimensional parallelepiped P = [x0 − a, x0 + a] × B̄(y0, r) , and let f : P → Rⁿ be a continuous function. Then, the initial value problem (4.9) admits at least one solution in a neighborhood of x0 .

Remark 4.15. Under the sole continuity assumption, a solution need not be unique. Consider, for example, the initial value problem:

    y'(x) = 2 √|y(x)| ,    y(0) = 0 .                               (4.14)

The zero function y(x) = 0 is a solution of (4.14), which is solved, though, by the function y(x) = x |x| as well. Moreover, for each pair of real numbers α < 0 < β , the following function ϕ_{α,β}(x) solves (4.14) too:

    ϕ_{α,β}(x) = −(x − α)²   if x < α ,
    ϕ_{α,β}(x) = 0           if α ≤ x ≤ β ,
    ϕ_{α,β}(x) = (x − β)²    if x > β .

In other words, the considered initial value problem admits infinitely many solutions. This phenomenon is known as the Peano funnel.
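As a quick numerical sanity check of the Peano funnel (our sketch, not part of the text), each member of the family ϕ_{α,β} can be verified to satisfy (4.14) pointwise; the names `phi` and `phi_prime` are ours, and the values of α, β are chosen arbitrarily.

```python
import math

def phi(x, alpha, beta):
    """The family of solutions from Remark 4.15 (alpha < 0 < beta)."""
    if x < alpha:
        return -(x - alpha) ** 2
    if x <= beta:
        return 0.0
    return (x - beta) ** 2

def phi_prime(x, alpha, beta):
    """Piecewise derivative of phi."""
    if x < alpha:
        return -2 * (x - alpha)
    if x <= beta:
        return 0.0
    return 2 * (x - beta)

# Check the ODE y' = 2*sqrt(|y|) on a grid for one choice of alpha, beta.
alpha, beta = -1.0, 2.0
grid = [-3 + 0.1 * k for k in range(61)]  # points in [-3, 3]
residual = max(abs(phi_prime(x, alpha, beta) - 2 * math.sqrt(abs(phi(x, alpha, beta))))
               for x in grid)
```

The residual is zero up to floating-point rounding, and ϕ(0) = 0, so every choice of α < 0 < β gives a distinct solution of the same IVP.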

4.3 Existence and uniqueness: Picard–Lindelöf theorem


To ensure existence and uniqueness of the solution to the initial value problem (4.9), a condition more restrictive than continuity needs to be considered; it is presented in Theorem 4.17. Given the importance of such a theorem, we provide here its proof, though in the scalar case only; notice that the
1
Giuseppe Peano (1858–1932), Italian mathematician and glottologist.

proof is constructive and turns out useful when trying to evaluate the solution of the given ordinary
differential equation.
The key notion to be introduced is Lipschitz continuity, which may be considered as a kind of inter-
mediate property, between continuity and differentiability.
For simplicity, we work in a scalar situation; the extension to systems of differential equations is only
technical; some details are provided in § 4.3.2.
We use again R to denote the rectangle:

R = [x0 , x0 + a] × [y0 − b , y0 + b] .

Definition 4.16. Function f : R → R is called uniformly Lipschitz–continuous in y , with respect to x , if there exists L > 0 such that:

    |f(x, y1) − f(x, y2)| ≤ L |y1 − y2| ,   for any (x, y1), (x, y2) ∈ R .   (4.15)

Using the Lipschitz² continuity property, we prove the Picard–Lindelöf³ Theorem.


Theorem 4.17 (Picard–Lindelöf). Let f : R → R be uniformly Lipschitz continuous in y , with respect to x , and define:

    M = max_R |f| ,    α = min { a , b/M } .                        (4.16)

Then, problem (4.3) admits a unique solution u ∈ C¹([x0, x0 + α], R) .

Proof. The proof is somewhat long, so we present it split into four steps.
First step. Let n ∈ N . Define the sequence of functions (un) by recurrence:

    u0(x) = y0 ,
    u_{n+1}(x) = y0 + ∫_{x0}^{x} f(X, un(X)) dX .

We want to show that (x, un(x)) ∈ R for any x ∈ [x0, x0 + α] . To this aim, it is enough to prove that, for n ≥ 0 , the following inequality is verified:

    |un(x) − y0| ≤ b ,   for any x ∈ [x0, x0 + α] .                 (4.17)

In the particular case n = 0 , inequality (4.17) is satisfied, since:

|u0 (x) − y0 | = |y0 − y0 | = 0 ≤ b .

In the general case, we have:

    |u_{n+1}(x) − y0| ≤ | ∫_{x0}^{x} f(X, un(X)) dX |
        ≤ ∫_{x0}^{x} |f(X, un(X))| dX ≤ M |x − x0| ≤ M α ≤ b .

It is precisely here that we can understand the reason for the peculiar definition (4.16) of the number α : such a choice turns out appropriate in correctly defining each (and any) term of the sequence un . It also highlights the local nature of the solution of the initial value problem (4.3).
Second step. We now show that (un) converges uniformly on [x0, x0 + α] . The identity:

    un = u0 + (u1 − u0) + · · · + (un − u_{n−1}) = u0 + Σ_{k=1}^{n} (uk − u_{k−1})
2
Rudolf Otto Sigismund Lipschitz (1832–1903), German mathematician.
3
Charles Émile Picard (1856–1941), French mathematician.
Ernst Leonard Lindelöf (1870–1946), Finnish mathematician.
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 61

suggests that the sequence (un) can be thought of as an infinite series: its uniform convergence, thus, can be proved by showing that the following series (4.18) converges totally on [x0, x0 + α] :

    Σ_{k=1}^{∞} (uk − u_{k−1}) .                                    (4.18)

To prove total convergence, we need to prove, for n ∈ N , the following bound:

    |un(x) − u_{n−1}(x)| ≤ M L^{n−1} |x − x0|^n / n! ,   for any x ∈ [x0, x0 + α] .   (4.19)
We proceed by induction. For n = 1 , the bound is verified, since:

    |u1(x) − u0(x)| = |u1(x) − y0| = | ∫_{x0}^{x} f(X, y0) dX | ≤ ∫_{x0}^{x} |f(X, y0)| dX ≤ M |x − x0| .

We now prove (4.19) for n + 1 , assuming that it holds true for n . Indeed:

    |u_{n+1}(x) − un(x)| = | ∫_{x0}^{x} [ f(X, un(X)) − f(X, u_{n−1}(X)) ] dX |
        ≤ ∫_{x0}^{x} | f(X, un(X)) − f(X, u_{n−1}(X)) | dX
        ≤ L ∫_{x0}^{x} |un(X) − u_{n−1}(X)| dX
        ≤ (M L^n / n!) ∫_{x0}^{x} |X − x0|^n dX = M L^n |x − x0|^{n+1} / (n + 1)! .

Therefore (4.19) is proved and implies that series (4.18) is totally convergent on [x0, x0 + α] ; in fact:

    Σ_{n=1}^{∞} sup_{[x0, x0+α]} |un − u_{n−1}| ≤ Σ_{n=1}^{∞} M L^{n−1} αⁿ / n!
        = (M/L) Σ_{n=1}^{∞} (L α)ⁿ / n! = (M/L) ( e^{α L} − 1 ) < +∞ .

Third step. We show that the limit of the sequence of functions (un) solves the initial value problem (4.3). From the equality:

    lim_{n→∞} u_{n+1}(x) = lim_{n→∞} ( y0 + ∫_{x0}^{x} f(X, un(X)) dX ) ,

we obtain, when u = lim_{n→∞} un , the fundamental relation:

    u(x) = y0 + ∫_{x0}^{x} f(X, u(X)) dX ,                          (4.20)

since, by the Lipschitz condition, the uniform convergence of (un) carries over to the sequence ( f(X, un(X)) )_{n∈N} , which allows taking the limit under the integral sign.
Now, differentiating both sides of (4.20), we see that u(x) is a solution of the initial value problem (4.3).
Fourth step. We have to prove uniqueness of the solution of (4.3). By contradiction, assume that v ∈ C¹([x0, x0 + α], R) also solves (4.3). Thus:

    v(x) = y0 + ∫_{x0}^{x} f(X, v(X)) dX .

As before, it is possible to show that, for any n ∈ N and any x ∈ [x0, x0 + α] , the following inequality holds true:

    |u(x) − v(x)| ≤ K L^n |x − x0|^n / n! ,                         (4.21)

where K is given by:

    K = max_{x ∈ [x0, x0+α]} |u(x) − v(x)| .

Indeed:

    |u(x) − v(x)| ≤ ∫_{x0}^{x} | f(X, u(X)) − f(X, v(X)) | dX ≤ K L |x − x0| ,

which proves (4.21) for n = 1 . Using induction, if we assume that (4.21) is satisfied for some n ∈ N , then:

    |u(x) − v(x)| ≤ ∫_{x0}^{x} | f(X, u(X)) − f(X, v(X)) | dX
        ≤ L ∫_{x0}^{x} |u(X) − v(X)| dX ≤ L K (L^n / n!) ∫_{x0}^{x} (X − x0)^n dX .

After calculating the last integral in the above inequality chain, we arrive at:

    |u(x) − v(x)| ≤ K L^{n+1} (x − x0)^{n+1} / (n + 1)! ,
which proves (4.21) for index n + 1 . By induction, (4.21) holds true for any n ∈ N .
We can finally conclude the proof of Theorem 4.17. In fact, by taking the limit n → ∞ in (4.21), we obtain that, for any x ∈ [x0, x0 + α] , the following inequality is verified:

    |u(x) − v(x)| ≤ 0 ,

which shows that u(x) = v(x) for any x ∈ [x0, x0 + α] .

Remark 4.18. Let us go back to Remark 4.15. In such a situation, where the initial value problem (4.14) has multiple solutions, the function f(x, y) = 2 √|y| does not fulfill the Lipschitz continuity property. In fact, taking, for instance, y1, y2 > 0 yields:

    |f(y1) − f(y2)| / |y1 − y2| = 2 / ( √y1 + √y2 ) ,

which is unbounded as y1, y2 → 0⁺ .
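The unboundedness of the difference quotient is easy to see numerically; the short check below (ours) evaluates the quotient 2/√h between the points h and 0 for shrinking h.

```python
import math

def f(y):
    """Right-hand side of (4.14)."""
    return 2 * math.sqrt(abs(y))

# Difference quotients |f(h) - f(0)| / h = 2 / sqrt(h): they blow up as h -> 0+,
# so no Lipschitz constant L can bound them.
steps = [1e-2, 1e-4, 1e-6, 1e-8]
quotients = [abs(f(h) - f(0.0)) / h for h in steps]
```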
The proof of Theorem 4.17, based on successive Picard iterates, is also useful in some simple situations, where it allows one to compute an approximate solution of the initial value problem (4.3). This is illustrated by Example 4.19.
Example 4.19. Construct the Picard–Lindelöf iterates for:

    y'(x) = −2 x y(x) ,    y(0) = 1 .                               (4.22)
The first iterate is y0(x) = 1 , while subsequent iterates are:

    y1(x) = y0 + ∫_0^x f(t, y0(t)) dt = 1 − ∫_0^x 2 t dt = 1 − x² ,
    y2(x) = y0 + ∫_0^x f(t, y1(t)) dt = 1 − ∫_0^x 2 t (1 − t²) dt = 1 − x² + x⁴/2 ,
    y3(x) = 1 − ∫_0^x 2 t (1 − t² + t⁴/2) dt = 1 − x² + x⁴/2 − x⁶/6 ,
    y4(x) = 1 − ∫_0^x 2 t (1 − t² + t⁴/2 − t⁶/6) dt = 1 − x² + x⁴/2 − x⁶/6 + x⁸/24 ,

and so on. A pattern emerges:

    yn(x) = 1 − x²/1! + x⁴/2! − x⁶/3! + x⁸/4! − · · · + (−1)ⁿ x^{2n}/n! .
The sequence of Picard–Lindelöf iterates converges if and only if the series:

    y(x) := lim_{m→∞} Σ_{n=0}^{m} (−1)ⁿ x^{2n} / n!

converges.

Now, recalling the Taylor series for the exponential function:

    e^x = Σ_{n=0}^{∞} xⁿ / n! ,

it follows:

    y(x) = Σ_{n=0}^{∞} (−x²)ⁿ / n! = e^{−x²} .

We will show later, in Example 5.4, that this function is indeed the solution to (4.22).
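The Picard iteration can also be carried out numerically rather than symbolically. The sketch below (ours, with arbitrary grid size and iteration count) replaces the exact integral in u_{k+1}(x) = y0 + ∫ f(t, u_k(t)) dt by the trapezoidal rule, and compares the result for (4.22) with e^{−x²}.

```python
import math

def picard(f, x0, y0, x_end, n_points=201, n_iter=25):
    """Approximate Picard iteration on a uniform grid:
    u_{k+1}(x) = y0 + int_{x0}^{x} f(t, u_k(t)) dt,
    with the integral evaluated by the cumulative trapezoidal rule."""
    h = (x_end - x0) / (n_points - 1)
    xs = [x0 + k * h for k in range(n_points)]
    u = [y0] * n_points                       # u_0 is the constant y0
    for _ in range(n_iter):
        g = [f(x, ux) for x, ux in zip(xs, u)]
        new, acc = [y0], 0.0
        for k in range(1, n_points):
            acc += 0.5 * h * (g[k - 1] + g[k])
            new.append(y0 + acc)
        u = new
    return xs, u

# Example 4.19: y' = -2 x y, y(0) = 1, with limit exp(-x^2).
xs, u = picard(lambda x, y: -2 * x * y, 0.0, 1.0, 0.8)
err = max(abs(ux - math.exp(-x * x)) for x, ux in zip(xs, u))
```

After a couple of dozen iterations the only error left is that of the quadrature rule, in line with the factorial convergence rate proved in Theorem 4.17.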

We leave it to the Reader, as an exercise, to determine the successive approximations of the IVP:

    y'(x) = y(x) ,    y(0) = 1 ,

and to recognise that the solution is the exponential function y(x) = e^x .

4.3.1 Interval of existence


The interval of existence of an initial value problem can be defined as the largest interval where the solution is well defined. This means that the initial point x0 must lie within the interval of existence. In the following, we discuss how to detect such an interval with a theoretical approach. When the exact solution is available, as in the case of the following Example 4.20, the determination of the interval of existence is straightforward.

Example 4.20. The initial value problem:

    y' = 1 − y² ,    y(0) = 0

is solved by the function y(x) = tanh x , so that its interval of existence is R . It is worth noting that the similar initial value problem:

    y' = 1 + y² ,    y(0) = 0

behaves differently, since it is solved by y(x) = tan x and has, therefore, interval of existence given by (−π/2, π/2) . Moreover, in this latter case, taking the limit at the boundary of the interval of existence yields:

    lim_{x→±π/2} y(x) = lim_{x→±π/2} tan x = ±∞ .

This is not a special situation: even when the interval of existence is bounded, for theoretical reasons that we present later in detail, the solution can be unbounded; this case is referred to as a blow-up phenomenon.
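The blow-up in Example 4.20 is visible numerically. The sketch below (ours; step counts are arbitrary) integrates y' = 1 + y² with a Runge–Kutta scheme: well inside (−π/2, π/2) the computed value tracks tan x closely, while approaching the boundary the value explodes.

```python
import math

def rk4(x0, y0, x_end, steps):
    """Integrate y' = 1 + y^2 with the classical Runge-Kutta scheme."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = 1 + y ** 2
        k2 = 1 + (y + h / 2 * k1) ** 2
        k3 = 1 + (y + h / 2 * k2) ** 2
        k4 = 1 + (y + h * k3) ** 2
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y

y_inside = rk4(0.0, 0.0, 1.0, 2000)        # x = 1, well inside (-pi/2, pi/2)
y_near_boundary = rk4(0.0, 0.0, 1.57, 20000)  # just below pi/2 ~ 1.5708
```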

4.3.2 Vector–valued differential equations


It is not difficult to adapt the argument presented in Theorem 4.17 to the vector–valued situation of
a function f : Ω → Rn defined on an open set Ω ⊂ R × Rn . In this case, the Lipschitz continuity
condition is:
||f (x , y 1 ) − f (x , y 2 )|| ≤ L ||y 1 − y 2 ||
for any (x , y 1 ) , (x , y 2 ) ∈ R , where the rectangle R is replaced by a cylinder:

R = [x0 , x0 + a] × B b (y 0 ) .

The vector–valued version of the Picard–Lindelöf Theorem 4.17 is represented by the following Theorem 4.21, whose proof is omitted, as it is very similar to that of Theorem 4.17.
Theorem 4.21 (Picard–Lindelöf, vector–valued case). Let f : R → Rⁿ be uniformly Lipschitz continuous in y , with respect to x , and define:

    M = max_R ||f|| ,    α = min { a , b/M } .                      (4.23)

Then, problem (4.9) admits a unique solution u ∈ C¹([x0, x0 + α], Rⁿ) .

4.3.3 Solution continuation


To detect the interval of existence, we start by observing that the Picard–Lindelöf Theorem 4.17 leads to a solution of the IVP (4.3) which is, by construction, local, i.e., it is a solution defined within a neighborhood of the initial point x0 . The radius of this neighborhood depends on the function f(x, y) in different ways: to understand them, we introduce the notion of joined solutions, as a first (technical) step.
Remark 4.22. Let f : Ω → Rⁿ be a continuous function, defined on an open set Ω ⊂ R × Rⁿ . Consider two solutions, y1 ∈ C¹([a, b], Rⁿ) and y2 ∈ C¹([b, c], Rⁿ) , of the differential equation y' = f(x, y) , such that y1(b) = y2(b) . Then, the function y : [a, c] → Rⁿ defined as:

    y(x) = y1(x)  if x ∈ [a, b] ,
    y(x) = y2(x)  if x ∈ (b, c]

is also a solution of y' = f(x, y) .
Function f represents a vector field.
With the Picard–Lindelöhf Theorem 4.21, we can build solutions to initial value problems associated
to y 0 = f (x , y) , choosing the initial data in Ω . In other words, given a point (x0 , y 0 ) ∈ Ω , we form
the IVP (4.9), for which Theorem 4.21 ensures existence of a solution u(x) in a neighborhood of x0 .
If now, in the rectangle R = [x0 , x0 + a] × B b (y 0 ) , we choose a , b > 0 so that R ⊂ Ω (which is
always possible, since Ω is open), then the solution of IVP (4.9) is defined at least up to the point
x1 = x0 + α1 , where the constant α1 > 0 is given by (4.23).
This allows us to continue and consider a new initial value problem:

    y' = f(x, y) ,    y(x1) = u(x1) := y1                           (4.9a)

whose solution is defined at least up to the point x2 = x1 + α2 , where the constant α2 > 0 is again given by (4.23).
This procedure can be iterated, leading to the formal Definition 4.23 of maximal domain solution. The
idea of continuation of a solution may be better understood by looking at Figure 4.1.
Definition 4.23 (Maximal domain solution). If u ∈ C 1 (I , Rn ) solves the initial value problem (4.9),
we say that u has maximal domain (or that u does not admit a continuation) if there exists no
function v ∈ C 1 (J , Rn ) which also solves (4.9) and such that I ⊂ J .

Figure 4.1: Continuation of a solution

The existence of the maximal domain solution to IVP (4.9) can be understood heuristically, as it comes from indefinitely repeating the continuation procedure. Establishing it with mathematical rigor is beyond the aim of these lecture notes, since it would require notions from advanced set theory, such as Zorn's Lemma⁴.
We end this section by stating, in Theorem 4.24, a result on the asymptotic behaviour of a solution with maximal domain, in the particular case where Ω = I × Rⁿ , with I an open interval. Such a result explains what was observed in Example 4.20, though we do not provide a proof of Theorem 4.24.

Theorem 4.24. Let f be defined on an open set I × Rⁿ ⊂ R × Rⁿ . Given (x0, y0) ∈ I × Rⁿ , assume that the function y is a maximal domain solution of the initial value problem:

    y' = f(x, y) ,    y(x0) = y0 .

Denote by (α, ω) the maximal domain of y . Then, one of two possibilities holds, respectively, for α and for ω :

(1) either α = inf I , or α > inf I and lim_{x→α⁺} |y(x)| = +∞ ;

(2) either ω = sup I , or ω < sup I and lim_{x→ω⁻} |y(x)| = +∞ .

4
See, for example, mathworld.wolfram.com/ZornsLemma.html
5 Ordinary differential equations of first order: methods for explicit solutions

In the previous Chapter 4 we presented the general theory, concerning conditions for existence and uniqueness of the solution of an initial value problem. Here, we consider some important particular situations in which, due to the structure of certain kinds of scalar ordinary differential equations, it is possible to establish methods to determine their explicit solution.

5.1 Separable equations


Definition 5.1. A differential equation is separable if it has the form:

    y'(x) = a(x) b(y(x)) ,    y(x0) = y0 ,                          (5.1)

where a(x) and b(y) are continuous functions, respectively defined on intervals Ia and Ib , such that x0 ∈ Ia and y0 ∈ Ib .

To obtain existence and uniqueness of the solution of (5.1), we have to assume that b(y0) ≠ 0 .

Theorem 5.2. If, for any y ∈ Ib , it holds:

    b(y) ≠ 0 ,                                                      (5.2)

then the unique solution to (5.1) is the function y(x) , defined implicitly by:

    ∫_{y0}^{y} dz / b(z) = ∫_{x0}^{x} a(s) ds .                     (5.3)

Remark 5.3. The hypothesis (5.2) cannot be removed, as shown, for instance, in Remark 4.15, where b(y) = 2 √|y| , which means that b(0) = 0 .

Proof. Introduce the two–variable function:

    F(x, y) := ∫_{y0}^{y} dz / b(z) − ∫_{x0}^{x} a(s) ds ,          (5.3a)

for which F(x0, y0) = 0 . Recalling (5.2) and since:

    ∂F(x, y)/∂y = 1/b(y) ,

it follows:

    ∂F(x0, y0)/∂y = 1/b(y0) ≠ 0 .


We can thus invoke the Implicit function theorem 3.25 and infer the existence of δ, ε > 0 for which, given any x ∈ (x0 − δ, x0 + δ) , there is a unique C¹ function y = y(x) ∈ (y0 − ε, y0 + ε) such that:

    F(x, y) = 0

and such that, for any x ∈ (x0 − δ, x0 + δ) :

    y'(x) = − Fx(x, y(x)) / Fy(x, y(x)) = a(x) / (1/b(y)) = a(x) b(y) .

Function y , implicitly defined by (5.3), is thus a solution of (5.1); to complete the proof, we still have to show its uniqueness. Assume that y1(x) and y2(x) are both solutions of (5.1), and define:

    B(y) := ∫_{y0}^{y} dz / b(z) ;

then:

    d/dx [ B(y1(x)) − B(y2(x)) ] = y1'(x)/b(y1(x)) − y2'(x)/b(y2(x))
        = a(x) b(y1(x))/b(y1(x)) − a(x) b(y2(x))/b(y2(x)) = 0 .

Notice that we used the fact that both y1(x) and y2(x) are assumed to solve (5.1). Thus B(y1(x)) − B(y2(x)) is a constant function, and its constant value is zero, since y1(x0) = y2(x0) = y0 . In other words, we have shown that, for any x ∈ Ia :

    B(y1(x)) − B(y2(x)) = 0 ,

which means, recalling the definition of B :

    0 = ∫_{y0}^{y1(x)} dz/b(z) − ∫_{y0}^{y2(x)} dz/b(z) = ∫_{y2(x)}^{y1(x)} dz/b(z) .

At this point, using the Mean–Value Theorem 3.24, we infer the existence of a number X(x) , between the integration limits y2(x) and y1(x) , such that:

    ( 1/b(X(x)) ) ( y1(x) − y2(x) ) = 0 .

But, from (5.2), it holds 1/b(X(x)) ≠ 0 , thus:

    y1(x) − y2(x) = 0 ,

and the proof is complete.

Example 5.4. Consider once more the IVP studied, using successive approximations, in Example 4.19:

    y'(x) = −2 x y(x) ,    y(0) = 1 .

Setting a(x) = −2x , b(y) = y , x0 = 0 , y0 = 1 in (5.3) leads to:

    ∫_1^y dz/z = ∫_0^x (−2 s) ds   ⟺   ln y = −x²   ⟺   y(x) = e^{−x²} .
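Formula (5.3) also suggests a simple numerical recipe when the two integrals cannot be computed in closed form: evaluate both sides by quadrature and solve for y by bisection. The sketch below (ours; function and routine names are our own, and the bracket [lo, hi] must be supplied by the user) is tested on Example 5.4.

```python
import math

def trapezoid(g, a, b, n=400):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + k * h) for k in range(1, n))
    return s * h

def solve_separable(a_fun, b_fun, x0, y0, x, lo, hi):
    """Solve (5.3), int_{y0}^{y} dz/b(z) = int_{x0}^{x} a(s) ds, for y by
    bisection; assumes b_fun has no zero between y0 and the root (hypothesis (5.2))."""
    rhs = trapezoid(a_fun, x0, x)
    def F(y):
        return trapezoid(lambda z: 1.0 / b_fun(z), y0, y) - rhs
    f_lo = F(lo)
    assert f_lo * F(hi) <= 0, "bracket does not contain the solution"
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f_lo * F(mid) <= 0:
            hi = mid
        else:
            lo, f_lo = mid, F(mid)
    return 0.5 * (lo + hi)

# Example 5.4: y' = -2 x y, y(0) = 1, whose exact solution is exp(-x^2).
y_half = solve_separable(lambda s: -2 * s, lambda z: z, 0.0, 1.0, 0.5, 0.1, 2.0)
```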

In the next couple of examples, some interesting particular cases of separable equations are considered.

Example 5.5. The choice b(y) = y in (5.1) yields the particular separable equation:

    y'(x) = a(x) y(x) ,    y(x0) = y0 ,                             (5.4)

where a(x) is a given continuous function. Using (5.3), we get:

    ∫_{y0}^{y} dz/z = ∫_{x0}^{x} a(s) ds   =⇒   ln( y/y0 ) = ∫_{x0}^{x} a(s) ds ,

thus:

    y = y0 exp( ∫_{x0}^{x} a(s) ds ) .

For instance, if a(x) = −x/2 , the initial value problem:

    y'(x) = −(x/2) y(x) ,    y(0) = 1

has solution:

    y(x) = e^{−x²/4} .
Example 5.6. In (5.1), let b(y) = y² , which leads to the separable equation:

    y'(x) = a(x) y²(x) ,    y(x0) = y0 ,                            (5.5)

with a(x) a continuous function. Using (5.3), we find:

    ∫_{y0}^{y} dz/z² = ∫_{x0}^{x} a(s) ds   =⇒   −1/y + 1/y0 = ∫_{x0}^{x} a(s) ds ,

and, solving with respect to y :

    y = 1 / ( 1/y0 − ∫_{x0}^{x} a(s) ds ) .

For instance, if a(x) = −2x , the initial value problem:

    y'(x) = −2 x y²(x) ,    y(0) = 1

has solution:

    y = 1 / (1 + x²) .

We now provide some practical examples, recalling that a complete treatment requires both finding the analytical expression of the solution and determining the maximal solution domain.

Example 5.7. Consider the equation:

    y'(x) = (x + 1) ( 1 + y²(x) ) ,    y(−1) = 0 .

Using (5.3), we find:

    ∫_0^{y(x)} dz/(1 + z²) = ∫_{−1}^{x} (s + 1) ds ,

and, evaluating the integrals:

    ∫_0^{y(x)} dz/(1 + z²) = arctan y(x) ,    ∫_{−1}^{x} (s + 1) ds = (x + 1)²/2 ,

the solution is obtained:

    y(x) = tan( (x + 1)²/2 ) .

Observe that the solution y(x) is only well–defined for x in a neighborhood of x0 = −1 such that:

    −π/2 < (x + 1)²/2 < π/2 ,

that is, −1 − √π < x < −1 + √π .
Example 5.8. Solve the initial value problem:

    y'(x) = (x − 1) / ( y(x) + 1 ) ,    y(0) = 0 .

From (5.3):

    ∫_0^{y(x)} (z + 1) dz = ∫_0^x (s − 1) ds ;

performing the relevant computations, we get:

    y²(x)/2 + y(x) = x²/2 − x ,

so that:

    y(x) = −1 ± √( (x − 1)² ) ,   i.e.   y(x) = x − 2   or   y(x) = −x .

Now, recall that x lies in a neighborhood of zero and that the initial condition requires y(0) = 0 ; it can be inferred, therefore, that y(x) = −x must be chosen. To establish the maximal domain, observe that y(x) + 1 = 1 − x vanishes for x = 1 ; thus, we infer that x < 1 .
Example 5.9. As a varies in R , investigate the maximal domain of the solutions to the initial value problem:

    u'(x) = a ( 1 + u²(x) ) cos x ,    u(0) = 0 .

From (5.3), we obtain:

    ∫_0^{u(x)} dz/(1 + z²) = a ∫_0^x cos s ds .

After performing the relevant computations, we get:

    arctan u(x) = a sin x .                                         (5.6)

It is clear that the range of the right hand–side of (5.6) is [−a, a] . To obtain a solution defined on R , we have to impose that a < π/2 . In such a case, solving with respect to u yields:

    u(x) = tan( a sin x ) .

Vice versa, when a ≥ π/2 , since there exists x̄ ∈ R⁺ for which a sin x̄ = π/2 , the obtained solution is defined on (−x̄, x̄) , where x̄ is the minimum positive number verifying the equality a sin x̄ = π/2 .

5.1.1 Exercises
1. Solve the following separable equations:

(a) y'(x) = 2x / sin y(x) ,  y(0) = π/2 ;

(b) y'(x) = 2x / cos y(x) ,  y(0) = 0 ;

(c) y'(x) = 2x / cos y(x) ,  y(0) = π/4 ;

(d) y'(x) = e^x / ( (1 + e^x) cosh y(x) ) ,  y(0) = 0 ;

(e) y'(x) = e^x / ( (1 + e^x) cos y(x) ) ,  y(0) = 5π ;

(f) y'(x) = ( x² / (1 + x) ) y²(x) ,  y(0) = 1 .

Solutions:

(a) ya(x) = arccos(−x²) ;

(b) yb(x) = arcsin(x²) ;

(c) yc(x) = arcsin( x² + 1/√2 ) ;

(d) yd(x) = arcsinh( ln( (1 + e^x)/2 ) ) ;

(e) ye(x) = 5π − arcsin( ln( (1 + e^x)/2 ) ) ;

(f) yf(x) = 2 / ( 2 (1 + x − ln(1 + x)) − x² ) .

2. Show that the solution to the initial value problem:

    y'(x) = ( e^{2x} / (4 + e^{2x}) ) y(x) ,    y(0) = √5

is y(x) = √( 4 + e^{2x} ) . What is the maximal domain of such a solution?

3. Show that the solution to the initial value problem:

    y'(x) = x sin x / ( 1 + y(x) ) ,    y(0) = 0

is y(x) = √( 2 sin x − 2 x cos x + 1 ) − 1 . Find the maximal domain of the solution.

4. Show that the solution to the initial value problem:

    y'(x) = y³(x) / ( 1 + x² ) ,    y(0) = 2

is y(x) = 2 / √( 1 − 8 arctan x ) . Find the maximal domain of the solution.
5. Show that the solution to the initial value problem:

    y'(x) = (sin x + cos x) e^{−y(x)} ,    y(0) = 1

is y(x) = ln( 1 + e + sin x − cos x ) . Find the maximal domain of the solution.

6. Show that the solution to the initial value problem:

    y'(x) = ( 1 + y²(x) ) ln( 1 + x² ) ,    y(0) = 1

is y(x) = tan( x ln(x² + 1) − 2x + 2 arctan x + π/4 ) . Find the maximal domain of the solution.

7. Show that the solution to the initial value problem:

    y'(x) = − (1/x) y(x) − (1/x) y⁴(x) ,    y(1) = 1

is y(x) = −1 / ∛( 1 − 2x³ ) . Find the maximal domain of the solution.
8. Solve the initial value problem:

    y''(x) = (sin x) y'(x) ,    y(0) = 1 ,    y'(1) = 0 .

Hint. Set z(x) = y'(x) and solve the equation z'(x) = (sin x) z(x) .

5.2 Singular integrals


Given a differential equation, one may want to describe its general solution, without fixing a set of initial conditions. Consider, for instance, the differential equation:

    y' = (y − 1) (y − 2) .                                          (5.7)

Equation (5.7) is separable, so we can easily adapt formula (5.3), using indefinite integrals and adding a constant of integration:

    ln | (y − 2)/(y − 1) | = x + c1 .

Solving for y , the general solution to (5.7) is obtained:

    y(x) = ( 2 − e^{x+c1} ) / ( 1 − e^{x+c1} ) = ( 2 − c e^x ) / ( 1 − c e^x ) ,   (5.8)

where we set c = ±e^{c1} .
Observe that the two constant functions y = 1 and y = 2 are solutions of equation (5.7). Observe further that y = 2 is obtained from (5.8) taking c = 0 , thus such a solution is a particular solution to (5.7). Vice versa, solution y = 1 cannot be obtained from the general solution (5.8); for this reason this solution is called singular.
Singular solutions of a differential equation can be found with a computational procedure, illustrated in Remark 5.10.

Remark 5.10. Given the differential equation (4.2), suppose that its general solution is given by Φ(x, y, c) = 0 . When there exists a singular integral of (4.2), it can be detected by eliminating c from the system:

    Φ(x, y, c) = 0 ,    ∂Φ/∂c (x, y, c) = 0 .                       (5.9)

In the case of equation (5.7), written as Φ(x, y, c) = c e^x (y − 1) − (y − 2) = 0 , system (5.9) becomes:

    c e^x (y − 1) − (y − 2) = 0 ,    e^x (y − 1) = 0 ,

and the second equation yields y = 1 , which confirms that y = 1 is the singular integral of (5.7).
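A quick numerical check (ours) that the one-parameter family y(x) = (2 − c eˣ)/(1 − c eˣ) consists of solutions of (5.7), and that c = 0 recovers the particular solution y = 2:

```python
import math

def y_general(x, c):
    """One-parameter family of solutions of y' = (y - 1)(y - 2)."""
    return (2 - c * math.exp(x)) / (1 - c * math.exp(x))

# Compare a central difference quotient with (y - 1)(y - 2) away from poles.
h = 1e-6
res = max(
    abs((y_general(x + h, c) - y_general(x - h, c)) / (2 * h)
        - (y_general(x, c) - 1) * (y_general(x, c) - 2))
    for c in [-0.5, 0.3, 2.0]
    for x in [-1.0, 0.0, 0.5]
)
```

The singular solution y = 1, by contrast, is not attained for any finite value of c.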

Remark 5.11. When a differential equation is given in the implicit form F(x, y, y') = 0 , uniqueness of the solution does not hold; this generates the occurrence of a singular integral, which can be detected, without solving the differential equation, by eliminating y' from system (5.10):

    F(x, y, y') = 0 ,    ∂F/∂y' (x, y, y') = 0 .                    (5.10)

A detailed discussion can be found in § 23 of [13].

5.3 Homogeneous equations


To obtain exact solutions of non–separable differential equations, it is possible, in some specific situations, to use an ansatz to transform the given equations. A few examples are provided in the following. The first kind of transformable equations that we consider here are the so–called homogeneous equations.

Theorem 5.12. Given f : [0, ∞) × [0, ∞) → R , if f(α x, α y) = f(x, y) for any α > 0 , and x0, y0 ∈ R , x0 ≠ 0 , are such that f(1, y0/x0) ≠ y0/x0 , then the change of variable y(x) = x u(x) can be employed to transform the differential equation:

    y'(x) = f(x, y(x)) ,    y(x0) = y0                              (5.11)

into the separable equation:

    u'(x) = ( f(1, u(x)) − u(x) ) / x ,    u(x0) = y0/x0 .          (5.12)
Proof. We represent the solution to (5.11) in the form y(x) = x u(x) and we look for the auxiliary unknown u(x) ; this change of variable, by virtue of the homogeneity condition, transforms (5.11) into a separable equation, expressed in terms of u(x) . Differentiating y(x) = x u(x) yields, in fact, y'(x) = x u'(x) + u(x) . Now, imposing the equality y'(x) = f(x, y(x)) leads to x u'(x) + u(x) = f(x, x u(x)) = f(1, u(x)) , where we used the fact that f(α x, α y) = f(x, y) . Therefore, u(x) solves the differential equation:

    u'(x) = ( f(1, u(x)) − u(x) ) / x .

Observe further that the initial condition y(x0) = y0 is changed into x0 u(x0) = y0 . Recalling that x0 ≠ 0 , equation (5.11) is changed into the separable problem (5.12), whose solution u(x) is defined by:

    ∫_{y0/x0}^{u(x)} ds / ( f(1, s) − s ) = ln |x| − ln |x0| .

Example 5.13. Consider the initial value problem:

    y'(x) = ( x² + y²(x) ) / ( x y(x) ) ,    y(2) = 2 .

In this case, f(x, y) is a homogeneous function:

    f(x, y) = (x² + y²) / (x y) .

Using y(x) = x u(x) :

    f(1, u(x)) − u(x) = ( 1 + u²(x) ) / u(x) − u(x) = 1 / u(x) .

Thus, the transformed problem turns out to be in separable form:

    u'(x) = 1 / ( x u(x) ) ,    u(2) = 1 ,

and its solution can be found by integration:

    ∫_1^{u(x)} s ds = ln |x| − ln 2 ,

yielding:

    u(x) = √( 2 ln |x| + 1 − ln 4 ) .

Observe that the solution is defined on the interval x > √( e^{ln 4 − 1} ) = 2/√e .
At this point, going back to our original initial value problem, we arrive at the solution of the homogeneous problem:

    y(x) = x √( 2 ln |x| + 1 − ln 4 ) ,    x > 2/√e .

5.3.1 Exercises
Solve the following initial value problems for homogeneous equations:

1. y'(x) = y²(x) / ( 3x² − x y(x) ) ,  y(1) = 1 ;

2. y'(x) = − ( x y(x) + y²(x) ) / x² ,  y(1) = 1 ;

3. y'(x) = ( y(x) + x e^{y(x)/x} ) / x ,  y(1) = −1 ;

4. y'(x) = − ( 15x + 11 y(x) ) / ( 9x + 5 y(x) ) ,  y(1) = 1 .

Solutions:

1. y(x) = ( √(1 + 3x²) − 1 ) / x ;

2. y(x) = 2x / ( 3x² − 1 ) ;

3. y(x) = −x ln( e − ln x ) ;

4. x / 2^{8/5} = 1 / ( (1 + y/x)^{2/5} (3 + y/x)^{3/5} ) .

5.4 Quasi homogeneous equations


By employing a few smart ansätze, it is possible to transform some differential equations, non–separable and non–homogeneous, into equivalent equations that are separable or homogeneous. Here, we deal with differential equations of the form:

    y' = f( (a x + b y + c) / (α x + β y + γ) ) ,                   (5.13)

where:

    det ( a  b
          α  β ) ≠ 0 .                                              (5.14)

In this situation, the linear system:

    a x + b y + c = 0 ,    α x + β y + γ = 0                        (5.15)

has a unique solution, say (x, y) = (x1, y1) . To obtain a homogeneous or a separable equation, it is possible to exploit the solution uniqueness, employing the change of variable:

    X = x − x1 ,    Y = y − y1    ⟺    x = X + x1 ,    y = Y + y1 .

Example 5.14 illustrates the transformation procedure.

Example 5.14. Consider the equation:

    y' = (3x + 4) / (y − 1) ,    y(0) = 2 .

The first step consists in solving the system:

    3x + 4 = 0 ,    y − 1 = 0 ,

whose solution is x1 = −4/3 , y1 = 1 . In the second step, the change of variable is performed:

    X = x + 4/3 ,    Y = y − 1 ,

which leads to the separable equation:

    Y' = 3X / Y ,    Y(4/3) = 1 ,

with solution:

    3 ( X² − 16/9 ) = Y² − 1 .

Recovering the original variables:

    3 ( (x + 4/3)² − 16/9 ) = (y − 1)² − 1

and simplifying:

    3x² + 8x = y² − 2y

yields:

    y = 1 ± √( 3x² + 8x + 1 ) .

Finally, recalling that y(0) = 2 , the solution is given by:

    y(x) = 1 + √( 3x² + 8x + 1 ) ,    x > (−4 + √13) / 3 .

The worked–out Example 5.15 illustrates the procedure to be followed if, when considering equation (5.13), condition (5.14) is not fulfilled.

Example 5.15. Consider the equation:

    y' = − (x + y + 1) / (2x + 2y + 1) ,    y(1) = 2 .

Here, there is no solution to the system:

    x + y + 1 = 0 ,    2x + 2y + 1 = 0 .

In this situation, since the left hand–sides of the two equations in the system are proportional, the change of variable to be employed is:

    t = x ,    z = x + y .

The given differential equation is, hence, transformed into the separable one:

    z' = z / (2z + 1) ,    z(1) = 3 .

Separating the variables leads to:

    ∫_3^z (2w + 1)/w dw = ∫_1^t ds .

Thus:

    2 z − 6 + ln z − ln 3 = t − 1    =⇒    x + 2y + ln(x + y) = 5 + ln 3 .

Observe that, in this example, it is not possible to express the dependent variable y in an elementary way, i.e., in terms of elementary functions.
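Even without an explicit formula for y, the implicit relation can be tested numerically (our sketch): we integrate the equation of Example 5.15 with a Runge–Kutta scheme and check that x + 2y + ln(x + y) stays equal to 5 + ln 3 along the trajectory.

```python
import math

def f(x, y):
    """Right-hand side of the equation in Example 5.15."""
    return -(x + y + 1) / (2 * x + 2 * y + 1)

def rk4(x0, y0, x_end, steps):
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y

# Integrate from (1, 2) to x = 2 and evaluate the implicit invariant there.
y2 = rk4(1.0, 2.0, 2.0, 2000)
invariant = 2.0 + 2 * y2 + math.log(2.0 + y2)
```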

5.5 Exact equations


Aim of this section is to provide full details on solving exact differential equations. To understand the idea behind the treatment of this kind of equation, we present Example 5.16, which will help in illustrating what an exact differential equation is, how its structure can be exploited to arrive at a solution, and why the process works as it does.

Example 5.16. Consider the differential equation:

\[
y' = \frac{3x^2 - 2xy}{2y + x^2 - 1} . \tag{5.16}
\]
First, rewrite (5.16) as:
\[
2xy - 3x^2 + (2y + x^2 - 1)\, y' = 0 . \tag{5.16a}
\]
Equation (5.16a) is solvable under the assumption that a suitable function Φ(x , y) can be found, that
verifies:
\[
\frac{\partial \Phi}{\partial x} = 2xy - 3x^2 , \qquad \frac{\partial \Phi}{\partial y} = 2y + x^2 - 1 .
\]
Note that it is not always possible to determine such a Φ(x , y) . In the current Example 5.16, though,
we are able to define \(\Phi(x, y) = y^2 + (x^2 - 1)\, y - x^3\). Therefore (5.16a) can be rewritten as:
\[
\frac{\partial \Phi}{\partial x} + \frac{\partial \Phi}{\partial y}\, y' = 0 . \tag{5.16b}
\]

Invoking the multi–variable Chain Rule1 , we can write (5.16b) as:


\[
\frac{d}{dx}\, \Phi\big(x,\, y(x)\big) = 0 . \tag{5.16c}
\]
Since, when the ordinary derivative of a function is zero, the function is constant, there must exist a
real number c such that:
\[
\Phi\big(x,\, y(x)\big) = y^2 + (x^2 - 1)\, y - x^3 = c . \tag{5.17}
\]
Thus (5.17) is an implicit solution for the differential equation (5.16); if an initial condition is assigned,
we can determine c .
It is not always possible to determine an explicit solution, expressed in terms of y . In the particular
situation of Example 5.16, though, this is feasible and we are able to find an explicit solution. For
instance, setting y(0) = 1 , we get c = 0 and:
\[
y(x) = \frac{1}{2} \left( 1 - x^2 + \sqrt{x^4 + 4x^3 - 2x^2 + 1} \right) .
\]

Let us, now, leave the particular case of Example 5.16, and return to the general situation, i.e., consider
ordinary differential equation of the form:

M (x , y) + N (x , y) y 0 = 0 . (5.18)

We call the differential equation (5.18) exact if there exists a function \(\Phi(x, y)\) such that:
\[
\frac{\partial \Phi}{\partial x} = M(x, y) , \qquad \frac{\partial \Phi}{\partial y} = N(x, y) , \tag{5.19}
\]
in which case the solution to the exact differential equation is implicitly defined by:
\[
\Phi(x, y) = c .
\]

In other words, finding \(\Phi(x, y)\) constitutes the central task in determining whether a differential
equation is exact and in computing its solution.
Establishing a necessary condition for (5.18) to be exact is easy. In fact, if we assume that (5.18) is
exact and that Φ(x , y) satisfies the hypotheses of Theorem 3.4, then the equality holds:
   
\[
\frac{\partial}{\partial x} \left( \frac{\partial \Phi}{\partial y} \right) = \frac{\partial}{\partial y} \left( \frac{\partial \Phi}{\partial x} \right) .
\]
1
See, for example, mathworld.wolfram.com/ChainRule.html

Inserting (5.19), we obtain the necessary condition for an equation to be exact:


\[
\frac{\partial}{\partial x} N(x, y) = \frac{\partial}{\partial y} M(x, y) . \tag{5.20}
\]

The result in Theorem 5.17 speeds up the search for solutions of exact equations.

Theorem 5.17. Define \(Q = \{ (x, y) \in \mathbb{R}^2 \mid a < x < b ,\ c < y < d \}\) and let \(M, N : Q \to \mathbb{R}\) be \(C^1\), with \(N(x, y) \neq 0\) for any \((x, y) \in Q\). Assume that \(M, N\) verify the closure condition (5.20) for any \((x, y) \in Q\). Then, there exists a unique solution to the initial value problem:
\[
\begin{cases} y' = -\dfrac{M(x, y)}{N(x, y)} ,\\[4pt] y(x_0) = y_0 , \qquad (x_0, y_0) \in Q . \end{cases} \tag{5.21}
\]
Such a solution is implicitly defined by:
\[
\int_{x_0}^{x} M(t, y_0)\, dt + \int_{y_0}^{y} N(x, s)\, ds = 0 . \tag{5.22}
\]

Example 5.18. Consider the differential equation:


 2
y 0 = − 6 x + y ,
2xy + 1
y(1) = 1 .

The closure condition (5.20) is fulfilled, since:

\[
M(x, y) = 6x + y^2 \;\Longrightarrow\; \frac{\partial M(x, y)}{\partial y} = 2y ,
\qquad
N(x, y) = 2xy + 1 \;\Longrightarrow\; \frac{\partial N(x, y)}{\partial x} = 2y .
\]
Formula (5.22) then yields:
\[
\int_1^x M(t, 1)\, dt = \int_1^x (6t + 1)\, dt = 3x^2 + x - 4 ,
\qquad
\int_1^y N(x, s)\, ds = \int_1^y (2xs + 1)\, ds = xy^2 + y - x - 1 .
\]

Hence, the solution to the given initial value problem is implicitly defined by:

x y 2 + y + 3 x2 − 5 = 0

and, solving for y , two solutions are reached:



\[
y = \frac{-1 \pm \sqrt{1 + 20x - 12x^3}}{2x} .
\]
Recalling that y(1) = 1 , we choose one solution:

\[
y = \frac{-1 + \sqrt{1 + 20x - 12x^3}}{2x} .
\]
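The recipe (5.22) of Theorem 5.17 lends itself to a direct numerical experiment; the sketch below (illustrative helper names, composite Simpson quadrature) rebuilds the implicit solution of Example 5.18 and checks that the explicit branch chosen above is a root of it.

```python
import math

def simpson(g, a, b, n=200):
    # composite Simpson rule on [a, b] (n even); handles b < a as a signed integral
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

def M(x, y):   # M(x, y) = 6x + y^2
    return 6 * x + y**2

def N(x, y):   # N(x, y) = 2xy + 1
    return 2 * x * y + 1

def implicit(x, y, x0=1.0, y0=1.0):
    # left-hand side of formula (5.22)
    return (simpson(lambda t: M(t, y0), x0, x)
            + simpson(lambda s: N(x, s), y0, y))

def y_explicit(x):
    # solution branch chosen in the text
    return (-1 + math.sqrt(1 + 20 * x - 12 * x**3)) / (2 * x)

for x in (0.8, 1.0, 1.2):
    assert abs(implicit(x, y_explicit(x))) < 1e-8
```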
Example 5.19. Consider solving:

\[
\begin{cases} y' = -\dfrac{3y\, e^{3x} - 2x}{e^{3x}} ,\\[4pt] y(1) = 1 . \end{cases}
\]

Here, it holds:

\[
M(x, y) = 3y\, e^{3x} - 2x \;\Longrightarrow\; \frac{\partial M(x, y)}{\partial y} = 3 e^{3x} ,
\qquad
N(x, y) = e^{3x} \;\Longrightarrow\; \frac{\partial N(x, y)}{\partial x} = 3 e^{3x} .
\]
Using formula (5.22):
\[
\int_1^x M(t, 1)\, dt = \int_1^x \left( 3 e^{3t} - 2t \right) dt = -x^2 + e^{3x} - e^3 + 1 ,
\qquad
\int_1^y N(x, s)\, ds = \int_1^y e^{3x}\, ds = (y - 1)\, e^{3x} .
\]

The solution to the given initial value problem is, therefore:

\[
-x^2 + e^{3x} - e^3 + 1 + (y - 1)\, e^{3x} = 0
\quad\Longrightarrow\quad
y = e^{-3x} \left( x^2 + e^3 - 1 \right) .
\]

5.5.1 Exercises
1. Solve the following initial value problems, for exact equations:

(a) \( y' = \dfrac{9x^3 - 2xy}{3x + y - 1} ,\quad y(1) = 2 ,\)

(b) \( y' = \dfrac{9x^2 - 2xy}{x^2 + 2y + 1} ,\quad y(0) = -3 ,\)

(c) \( y' = \dfrac{2xy^2 + 4}{2\,(3 - x^2 y)} ,\quad y(-1) = -8 ,\)

(d) \( y' = -\dfrac{2x + 3y}{x^2 + 2y + 1} ,\quad y(0) = -3 ,\)

(e) \( y' = \dfrac{2xy^2 + 4}{2\,(3 - x^2 y)} ,\quad y(-1) = 8 ,\)

(f) \( y' = \dfrac{\dfrac{2xy}{1 + x^2} - 2x}{2 - \ln(1 + x^2)} ,\quad y(0) = 1 .\)
 

2. Using the method for exact equation, described in this § 5.5, prove that the solution of the initial
value problem:
\[
\begin{cases} y' = \dfrac{1 - 3y^3\, e^{3xy}}{3xy^2\, e^{3xy} + 2y\, e^{3xy}} ,\\[4pt] y(0) = 1 , \end{cases}
\]

is implicitly defined by \(y^2 e^{3xy} - x = 1\), and verify this result using the Dini Implicit Function Theorem 3.25.

5.6 Integrating factor for non exact equations


In § 5.5, we faced differential equations of the form (5.18), whose solution is detected through the closure condition (5.20); we recall both formulæ for convenience:
\[
M(x, y) + N(x, y)\, y' = 0 , \qquad \frac{\partial}{\partial x} N(x, y) = \frac{\partial}{\partial y} M(x, y) .
\]
More frequent, though, is the case in which the closure condition (5.20) is not satisfied, so that we are unable to express the solution of the given differential equation in terms of known functions. There is, however, a general method which, at times, still allows the solution to be expressed in terms of known functions. In formula (5.18a) below, although

it can hardly be considered as an orthodox procedure, we split the derivative \(y'\) and, then, rewrite
(5.18) in the so–called Pfaffian2 form:

M (x , y) dx + N (x , y) dy = 0 . (5.18a)

We do not assume condition (5.20). In this situation, there exists a function µ(x , y) such that,
multiplying both sides of (5.18a) by µ , an equivalent equation is obtained which is exact, namely:

µ(x , y) M (x, y) dx + µ(x , y) N (x, y) dy = 0 . (5.18b)

This represents a theoretical statement, in the sense that it is easy to formulate conditions that need
to be satisfied by the integrating factor µ , namely:

\[
\frac{\partial}{\partial x} \big( \mu(x, y)\, N(x, y) \big) = \frac{\partial}{\partial y} \big( \mu(x, y)\, M(x, y) \big) . \tag{5.23}
\]
∂x ∂y

Evaluating the partial derivatives (and employing a simplified subscript notation for partial derivatives), the partial differential equation for µ is obtained:
\[
M(x, y)\, \mu_y - N(x, y)\, \mu_x = \big( N_x(x, y) - M_y(x, y) \big)\, \mu . \tag{5.23a}
\]

Notice that solving (5.23a) may turn out to be harder than solving the original differential equation
(5.18a). However, depending on the particular structure of the functions M (x , y) and N (x , y) , there
exist favorable situations in which it is possible to detect the integrating factor \(\mu(x, y)\), provided
that some restrictions are imposed on µ itself. In the following Theorems 5.20 and 5.23, we describe
what happens when µ depends on one variable only.

Theorem 5.20. Equation (5.18a) admits an integrating factor µ depending on x only, if the quantity:

\[
\rho(x) = \frac{M_y(x, y) - N_x(x, y)}{N(x, y)} \tag{5.24}
\]

also depends on x only. In this case, it is:


\[
\mu(x) = e^{\int \rho(x)\, dx} , \tag{5.25}
\]

with ρ(x) given by (5.24).

Proof. Assume that \(\mu(x, y)\) is a function of one variable only, say, a function of x only, thus:
\[
\mu(x, y) = \mu(x) , \qquad \mu_x = \frac{d\mu}{dx} = \mu'_x , \qquad \mu_y = 0 .
\]
In this situation, equation (5.23a) reduces to:

\[
N(x, y)\, \mu'_x = \big( M_y(x, y) - N_x(x, y) \big)\, \mu , \tag{5.23b}
\]
that is:
\[
\frac{\mu'_x}{\mu} = \frac{M_y(x, y) - N_x(x, y)}{N(x, y)} . \tag{5.23c}
\]
Now, if the right hand–side of (5.23c) depends on x only, then (5.23c) is separable: solving it leads to the integrating factor represented in thesis (5.25).
2
Johann Friedrich Pfaff (1765–1825), German mathematician.

Example 5.21. Consider the initial value problem:


\[
\begin{cases} y' = -\dfrac{3xy - y^2}{x\,(x - y)} ,\\[4pt] y(1) = 3 . \end{cases} \tag{5.26}
\]

Equation (5.26) is not exact, nor separable. Let us rewrite it in Pfaffian form, temporarily ignoring
the initial condition:
(3 x y − y 2 ) dx + x (x − y) dy = 0 . (5.26a)
Setting M (x , y) = 3 x y − y 2 and N (x , y) = x (x − y) yields:

\[
\frac{M_y(x, y) - N_x(x, y)}{N(x, y)} = \frac{3x - 2y - (2x - y)}{x\,(x - y)} = \frac{1}{x} ,
\]

which is a function of x only. The hypotheses of Theorem 5.20 are fulfilled, and the integrating factor
comes from (5.25):
\[
\mu(x) = e^{\int \frac{1}{x}\, dx} = x .
\]
Multiplying equation (5.26a) by the integrating factor x , we form an exact equation, namely:

(3 x2 y − x y 2 ) dx + x2 (x − y) dy = 0 . (5.26b)

Now, we can define the modified functions that constitute (5.26b):

\[
M_1(x, y) = x\, M(x, y) = 3x^2 y - xy^2 , \qquad N_1(x, y) = x\, N(x, y) = x^2\,(x - y) ,
\]
and employ them in equation (5.22), which also incorporates the initial condition:
\[
\int_1^x M_1(t, 3)\, dt + \int_3^y N_1(x, s)\, ds = 0 ,
\]
that is:
\[
\int_1^x \left( 9t^2 - 9t \right) dt + \int_3^y x^2\,(x - s)\, ds = 0 .
\]

Evaluating the integrals:
\[
x^3 y - \frac{x^2 y^2}{2} + \frac{3}{2} = 0 .
\]
Solving for y, and recalling the initial condition, leads to the solution of the initial value problem (5.26):
\[
y = x + \frac{\sqrt{3 + x^4}}{x} .
\]
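The effect of the integrating factor can be observed numerically: for the data of Example 5.21, the closure condition (5.20) fails for M, N but holds for xM, xN. A sketch (illustrative only) using finite-difference partial derivatives:

```python
def M(x, y):
    # M(x, y) = 3xy - y^2
    return 3 * x * y - y**2

def N(x, y):
    # N(x, y) = x(x - y)
    return x * (x - y)

def d_dx(g, x, y, h=1e-6):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-6):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 1.3, 0.7
# the equation is not exact: M_y differs from N_x ...
assert abs(d_dy(M, x, y) - d_dx(N, x, y)) > 0.1
# ... but multiplying by mu(x) = x restores the closure condition (5.20)
M1 = lambda x, y: x * M(x, y)
N1 = lambda x, y: x * N(x, y)
assert abs(d_dy(M1, x, y) - d_dx(N1, x, y)) < 1e-6
```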
Example 5.22. Consider the following initial value problem, in which the differential equation is not
exact nor separable:
\[
\begin{cases} y' = -\dfrac{4xy + 3y^2 - x}{x\,(x + 2y)} ,\\[4pt] y(1) = 1 . \end{cases} \tag{5.27}
\]

Here, \(M(x, y) = 4xy + 3y^2 - x\) and \(N(x, y) = x\,(x + 2y)\), so that the quantity below turns out to be a function of x:
\[
\frac{M_y(x, y) - N_x(x, y)}{N(x, y)} = \frac{2}{x} .
\]

Since the hypotheses of Theorem 5.20 are fulfilled, the integrating factor is given by (5.25):
\[
\mu(x) = e^{\int \frac{2}{x}\, dx} = x^2 .
\]

After defining the modified functions:
\[
M_1(x, y) = x^2 M(x, y) = 4x^3 y + 3x^2 y^2 - x^3 , \qquad N_1(x, y) = x^2 N(x, y) = x^3\,(x + 2y) ,
\]
we can use them in equation (5.22), which also incorporates the initial condition, obtaining:
\[
\int_1^x M_1(t, 1)\, dt + \int_1^y N_1(x, s)\, ds = 0 ,
\]
that is:
\[
\int_1^x \left( 3t^3 + 3t^2 \right) dt + \int_1^y x^3\,(2s + x)\, ds = 0 .
\]
Evaluating the integrals yields:
\[
x^4 y + x^3 y^2 - \frac{x^4}{4} - \frac{7}{4} = 0 .
\]
4 4
Solving for y and recalling the initial condition, we get the solution of the initial value problem (5.27):
\[
y = \frac{\sqrt{x^5 + x^4 + 7}}{2x^{3/2}} - \frac{x}{2} .
\]

We examine, now, the case in which the integrating factor µ is a function of y only. Given the analogy
with Theorem 5.20, the proof of Theorem 5.23 is not provided here.

Theorem 5.23. Equation (5.18a) admits an integrating factor µ depending on y only, if the quantity:

\[
\rho(y) = \frac{N_x(x, y) - M_y(x, y)}{M(x, y)} \tag{5.28}
\]

also depends on y only. In this case, the integrating factor is:
\[
\mu(y) = e^{\int \rho(y)\, dy} , \tag{5.29}
\]
with ρ(y) given by (5.28).

Example 5.24. Consider the initial value problem, with non–separable and non–exact differential
equation:
\[
\begin{cases} y' = -\dfrac{y\,(x + y + 1)}{x\,(x + 3y + 2)} ,\\[4pt] y(1) = 1 . \end{cases} \tag{5.30}
\]

Functions \(M(x, y) = y\,(x + y + 1)\) and \(N(x, y) = x\,(x + 3y + 2)\) are such that the following quantity is dependent on y only:
\[
\frac{N_x(x, y) - M_y(x, y)}{M(x, y)} = \frac{1}{y} .
\]
Formula (5.29) then leads to the integrating factor µ(y) = y , which in turn leads to the following
exact equation, written in Pfaffian form:

y 2 (x + y + 1) dx + x y (x + 3 y + 2) dy = 0 .

Define the modified functions:
\[
M_1(x, y) = y\, M(x, y) = y^2\,(x + y + 1) , \qquad N_1(x, y) = y\, N(x, y) = xy\,(x + 3y + 2) ,
\]
and employ them in equation (5.22), which also incorporates the initial condition, obtaining:
\[
\int_1^x M_1(t, 1)\, dt + \int_1^y N_1(x, s)\, ds = 0 ,
\]
that is:
\[
\int_1^x (2 + t)\, dt + \int_1^y x s\,(2 + 3s + x)\, ds = 0 .
\]

The solution to (5.30) can be thus expressed, in implicit form, as:

\[
\frac{x^2 y^2}{2} + xy^3 + xy^2 - \frac{5}{2} = 0 .
\]

To end this § 5.6, let us consider the situation of a family of differential equations for which an
integrating factor µ is available.

Theorem 5.25. Let \(Q = \{ (x, y) \in \mathbb{R}^2 \mid 0 < a < x < b ,\ 0 < c < y < d \}\), and let \(f_1\) and \(f_2\) be \(C^1\) functions on Q, such that \(f_1(xy) - f_2(xy) \neq 0\). Define the functions \(M(x, y)\) and \(N(x, y)\) as:
\[
M(x, y) = y\, f_1(xy) , \qquad N(x, y) = x\, f_2(xy) .
\]
Then:
\[
\mu(x, y) = \frac{1}{xy \big( f_1(xy) - f_2(xy) \big)}
\]
is an integrating factor for:
\[
y' = -\frac{y\, f_1(xy)}{x\, f_2(xy)} .
\]

Proof. It suffices to insert the above expressions of µ , M and N into condition (5.23) and verify that
it gets satisfied.

5.6.1 Exercises
1. Solve the following initial value problems, using a suitable integrating factor.

(a) \( y' = -\dfrac{y\,(x + y)}{x + 2y - 1} ,\quad y(1) = 1 ,\)

(b) \( y' = -\dfrac{y^2}{x\,(y - \ln x)} ,\quad y(1) = 1 ,\)

(c) \( y' = \dfrac{y}{y - 3x - 3} ,\quad y(0) = 0 ,\)

(d) \( y' = x - y ,\quad y(0) = 0 ,\)

(e) \( y' = \dfrac{3x^2 + 2y^2}{2xy} ,\quad y(1) = 1 ,\)

(f) \( y' = \dfrac{y - 2x^3}{x} ,\quad y(1) = 1 ,\)

(g) \( y' = \dfrac{y}{y^3 - 3x} ,\quad y(0) = 1 ,\)

(h) \( y' = -\dfrac{y^3 + 2y\, e^x}{e^x + 3y^2} ,\quad y(0) = 0 .\)


5.7 Linear equations of first order


Consider the differential equation:
\[
\begin{cases} y'(x) = a(x)\, y(x) + b(x) ,\\ y(x_0) = y_0 . \end{cases} \tag{5.31}
\]
Let functions a(x) and b(x) be continuous on the interval \(I \subset \mathbb{R}\). The first–order differential equation (5.31) is called linear, since the unknown function y and its derivative appear only to the first degree. We can establish a formula for its integration, following a procedure similar to that used for separable equations.
Theorem 5.26. The unique solution to (5.31) is:
\[
y(x) = e^{\int_{x_0}^{x} a(t)\, dt} \left( y_0 + \int_{x_0}^{x} b(t)\, e^{-\int_{x_0}^{t} a(s)\, ds}\, dt \right)
\]
i.e., in a more compact form:
\[
y(x) = e^{A(x)} \left( y_0 + \int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt \right) \tag{5.32}
\]
where:
\[
A(x) = \int_{x_0}^{x} a(s)\, ds . \tag{5.33}
\]
Proof. To arrive at formula (5.32), we first examine the case \(b(x) = 0\), for which (5.31) reduces to the separable (and linear) equation:
\[
\begin{cases} y'(x) = a(x)\, y(x) ,\\ y(x_0) = c , \end{cases} \tag{5.34}
\]
having set \(y_0 = c\). If \(x_0 \in I\), the solution of (5.34) through point \((x_0, c)\) is:
\[
y(x) = c\, e^{A(x)} . \tag{5.35}
\]
To find the solution to the more general differential equation (5.31), we use the method of Variation
of Parameters 3 , due to Lagrange: we assume that c is a function of x and search for c (x) such that
the function:
y(x) = c(x) eA(x) (5.36)
becomes, indeed, a solution of (5.31). To this aim, differentiate (5.36):
\[
y'(x) = c'(x)\, e^{A(x)} + c(x)\, a(x)\, e^{A(x)}
\]
and impose that function (5.36) solves (5.31), that is:
\[
c'(x)\, e^{A(x)} + c(x)\, a(x)\, e^{A(x)} = a(x)\, c(x)\, e^{A(x)} + b(x) ,
\]
from which:
\[
c'(x) = b(x)\, e^{-A(x)} . \tag{5.37}
\]
Integrating (5.37) between \(x_0\) and x, we obtain:
\[
c(x) = \int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt + K ,
\]

with K constant. Finally, the solution to (5.31) is:
\[
y(x) = e^{A(x)} \left( \int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt + K \right) .
\]

Evaluating y(x0 ) and recalling the initial condition in (5.31), we see that y0 = K . Thesis (5.32) thus
follows.
3
See, for example, mathworld.wolfram.com/VariationofParameters.html

Remark 5.27. An alternative proof of Theorem 5.26 can be provided, using Theorem 5.20 and the integrating factor procedure. In fact, if we assume:
\[
M(x, y) = a(x)\, y + b(x) , \qquad N(x, y) = -1 ,
\]
then:
\[
\frac{M_y - N_x}{N} = -a(x) ,
\]
which yields the integrating factor \(\mu(x) = e^{-A(x)}\), with A(x) defined as in (5.33). Considering the following exact equation, equivalent to (5.31):
\[
y'(x) = -\frac{e^{-A(x)} \big( a(x)\, y + b(x) \big)}{-e^{-A(x)}} ,
\]
and employing relation (5.22), we obtain:
\[
\int_{x_0}^{x} \big( a(t)\, y_0 + b(t) \big)\, e^{-A(t)}\, dt - \int_{y_0}^{y} e^{-A(x)}\, ds = 0 ,
\]
which, after some straightforward computations, yields formula (5.32).


Remark 5.28. The general solution of the linear differential equation (5.31) can be described when a particular solution of it is known, together with the general solution of the linear and separable equation (5.34). If \(y_1\) and \(y_2\) are both solutions of (5.31), in fact, there exist \(v_1, v_2 \in \mathbb{R}\) such that:
\[
y_1(x) = e^{A(x)} \left( v_1 + \int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt \right) ,
\qquad
y_2(x) = e^{A(x)} \left( v_2 + \int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt \right) .
\]
Subtracting, we obtain:
\[
y_1(x) - y_2(x) = (v_1 - v_2)\, e^{A(x)} ,
\]
which means that \(y_1 - y_2\) has the form (5.35) and, therefore, solves (5.34). Now, using the fact that \(y_1\) is a solution of (5.31), the general solution to (5.31) can be written as:
\[
y(x) = c\, e^{A(x)} + y_1(x) , \qquad c \in \mathbb{R} ,
\]
and all this is equivalent to saying that the general solution \(y = y(x)\) of (5.31) can be written in the form:
\[
\frac{y - y_1}{y_2 - y_1} = c , \qquad c \in \mathbb{R} .
\]
Example 5.29. Consider the equation:
\[
\begin{cases} y'(x) = 3x^2\, y(x) + x\, e^{x^3} ,\\ y(0) = 1 . \end{cases}
\]
Here, \(a(x) = 3x^2\) and \(b(x) = x\, e^{x^3}\). Using (5.32)–(5.33), we get:
\[
A(x) = \int_{x_0}^{x} a(s)\, ds = \int_0^x 3s^2\, ds = x^3
\]
and
\[
\int_{x_0}^{x} b(t)\, e^{-A(t)}\, dt = \int_0^x t\, e^{t^3} e^{-t^3}\, dt = \frac{x^2}{2} ,
\]
so that:
\[
y(x) = e^{x^3} \left( 1 + \frac{x^2}{2} \right) .
\]

Example 5.30. Consider the equation:
\[
\begin{cases} y'(x) = 2x\, y(x) + x ,\\ y(0) = 2 . \end{cases}
\]
Note that \(a(x) = 2x\), \(b(x) = x\). Then:
\[
A(x) = \int_0^x a(s)\, ds = \int_0^x 2s\, ds = x^2
\]
and
\[
\int_0^x b(t)\, e^{-A(t)}\, dt = \int_0^x t\, e^{-t^2}\, dt = \left[ -\frac{e^{-t^2}}{2} \right]_0^x = \frac{1}{2} - \frac{e^{-x^2}}{2} .
\]
Therefore, the solution is:
\[
y(x) = e^{x^2} \left( 2 + \frac{1}{2} - \frac{e^{-x^2}}{2} \right) = \frac{5}{2}\, e^{x^2} - \frac{1}{2} .
\]
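Formula (5.32) can also be exercised numerically, with the integrals replaced by quadrature; the sketch below (illustrative helper names, standard library only) checks it on Example 5.30, whose closed-form solution is \(y(x) = \tfrac{5}{2} e^{x^2} - \tfrac{1}{2}\).

```python
import math

def simpson(g, a, b, n=200):
    # composite Simpson rule (n even)
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

a = lambda x: 2 * x      # coefficient a(x) of Example 5.30
b = lambda x: x          # coefficient b(x) of Example 5.30
x0, y0 = 0.0, 2.0

def A(x):
    # A(x) from (5.33), by quadrature
    return simpson(a, x0, x)

def y(x):
    # formula (5.32), by quadrature
    integrand = lambda t: b(t) * math.exp(-A(t))
    return math.exp(A(x)) * (y0 + simpson(integrand, x0, x))

for x in (0.5, 1.0, 1.5):
    exact = 2.5 * math.exp(x**2) - 0.5
    assert abs(y(x) - exact) < 1e-6
```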
Remark 5.31. When, in equation (5.32), functions a(x) and b(x) are constant, we obtain:
\[
\begin{cases} y'(x) = a\, y(x) + b ,\\ y(x_0) = y_0 , \end{cases} \tag{5.32a}
\]
and the solution is given by:
\[
y(x) = \left( y_0 + \frac{b}{a} \right) e^{a (x - x_0)} - \frac{b}{a} .
\]
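The constant-coefficient formula is easy to check by substitution; a quick sketch (constants a, b, x₀, y₀ chosen arbitrarily for illustration):

```python
import math

a, b, x0, y0 = -0.5, 1.0, 0.0, 3.0

def y(x):
    # constant-coefficient solution of Remark 5.31
    return (y0 + b / a) * math.exp(a * (x - x0)) - b / a

assert abs(y(x0) - y0) < 1e-12          # initial condition
for x in (0.0, 1.0, 2.0):
    h = 1e-6
    dy = (y(x + h) - y(x - h)) / (2 * h)
    assert abs(dy - (a * y(x) + b)) < 1e-6   # y' = a y + b
```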

5.7.1 Exercises
Solve the following initial value problems for linear equations.

1. \( y'(x) = -\dfrac{1}{1 + x^2}\, y(x) + \dfrac{1}{1 + x^2} ,\quad y(0) = 0 ,\)

2. \( y'(x) = 3x^2\, y(x) + x\, e^{x^3} ,\quad y(0) = 0 ,\)

3. \( y'(x) = \dfrac{x}{1 + x^2}\, y(x) + 1 ,\quad y(0) = 0 ,\)

4. \( y'(x) = \dfrac{1}{x}\, y(x) + x^2 ,\quad y(1) = 0 ,\)

5. \( y'(x) = -\sin x\, y(x) + \sin x ,\quad y(0) = 1 ,\)

6. \( y'(x) = \left( x - \dfrac{1}{3x} \right) y(x) ,\quad y(1) = 1 .\)

5.8 Bernoulli equation


A Bernoulli4 differential equation, with exponent α , has the form:
y 0 (x) = a(x) y(x) + b(x) y α (x) . (5.38)
Let us assume α 6= 0 , 1 , so that (5.38) is non–linear. The change of variable
v(x) = y 1−α (x)
transforms (5.38) into a linear equation:
v 0 (x) = 1 − α a(x) v(x) + 1 − α b(x) .
 
(5.39)
4
Jacob Bernoulli (1654–1705), Swiss mathematician.

Example 5.32. Consider the differential equation:



\[
\begin{cases} y' = -\dfrac{1}{x}\, y - \dfrac{1}{x}\, y^4 ,\\[4pt] y(2) = 1 . \end{cases}
\]

Here \(\alpha = 4\), and the change of variable is \(v(x) = y^{-3}(x)\), i.e., \(y = v^{-1/3}\), leading to:
\[
-\frac{1}{3}\, v^{-4/3}\, v' = -\frac{1}{x}\, v^{-1/3} - \frac{1}{x}\, v^{-4/3} .
\]
Multiplication by \(v^{4/3}\) yields a linear differential equation in v:
\[
-\frac{1}{3}\, v' = -\frac{1}{x}\, v - \frac{1}{x} .
\]
Simplifying and recalling the initial condition, we obtain a linear initial value problem:
\[
\begin{cases} v' = \dfrac{3}{x}\, v + \dfrac{3}{x} ,\\[4pt] v(2) = 1 , \end{cases}
\]
solved by \(v(x) = \dfrac{1}{4}\, x^3 - 1\). Hence, the solution to the original problem is:
\[
y(x) = \sqrt[3]{\frac{4}{x^3 - 4}} .
\]
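A finite-difference check of this solution (illustrative sketch, valid where \(x^3 > 4\)):

```python
def y(x):
    # solution of the Bernoulli problem of Example 5.32 (requires x^3 > 4)
    return (4 / (x**3 - 4)) ** (1 / 3)

def rhs(x):
    # Bernoulli right-hand side -y/x - y^4/x along the solution
    return -y(x) / x - y(x)**4 / x

def dy(x, h=1e-6):
    return (y(x + h) - y(x - h)) / (2 * h)

assert abs(y(2.0) - 1.0) < 1e-12       # initial condition y(2) = 1
for x in (1.8, 2.0, 2.5):
    assert abs(dy(x) - rhs(x)) < 1e-5
```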
Example 5.33. Given x > 0, solve the initial value problem:
\[
\begin{cases} y'(x) = \dfrac{1}{2x}\, y(x) + 5x^2\, y^3(x) ,\\[4pt] y(1) = 1 . \end{cases}
\]

This is a Bernoulli equation with exponent \(\alpha = 3\). Consider the change of variable:
\[
y(x) = \big( v(x) \big)^{\frac{1}{1-3}} = \big( v(x) \big)^{-\frac{1}{2}} .
\]
The associated linear equation in v(x) is:


1
v 0 (x) = − v(x) − 10 x2 ,
x
which is solved by:
7 5
v(x) = − x3 .
2x 2
To recover y(x), we have to assume:
\[
\frac{7}{2x} - \frac{5}{2}\, x^3 > 0 \quad\Longleftrightarrow\quad 0 < x < \sqrt[4]{\frac{7}{5}} .
\]
Finally:
\[
y(x) = \frac{1}{\sqrt{\dfrac{7}{2x} - \dfrac{5}{2}\, x^3}} , \qquad\text{with}\quad 0 < x < \sqrt[4]{\frac{7}{5}} .
\]
2x 2

Example 5.34. Solve the initial value problem:



\[
\begin{cases} y'(x) = -\dfrac{1}{4x}\, y(x) - \dfrac{1}{4}\, x^2\, y^5(x) ,\\[4pt] y(1) = 1 . \end{cases}
\]

We have to solve a Bernoulli equation, with exponent \(\alpha = 5\). The change of variable is:
\[
y(x) = \big( v(x) \big)^{\frac{1}{1-5}} = \big( v(x) \big)^{-\frac{1}{4}} ,
\]
which leads to the transformed linear equation:
\[
\begin{cases} v'(x) = \dfrac{1}{x}\, v(x) + x^2 ,\\[4pt] v(1) = 1 , \end{cases}
\]
solved by:
\[
v(x) = \frac{x + x^3}{2} .
\]
Recovering y(x), we find:
\[
y(x) = \sqrt[4]{\frac{2}{x + x^3}} ,
\]
that is defined for x > 0.

Remark 5.35. Bernoulli equation (5.38) can also be solved using the same approach used for linear equations, that is, imposing a solution of the form:
\[
y(x) = c(x)\, e^{A(x)}
\]
and obtaining the separable equation for c(x):
\[
c'(x) = b(x)\, e^{(\alpha - 1) A(x)}\, c^{\alpha}(x) ,
\]
so that:
\[
c(x) = \Big( (1 - \alpha)\, F(x) + c_0^{\,1-\alpha} \Big)^{\frac{1}{1-\alpha}} ,
\]
where:
\[
F(x) = \int_{x_0}^{x} b(z)\, e^{(\alpha - 1) A(z)}\, dz .
\]

5.8.1 Exercises
Solve the following initial value problems for Bernoulli equations.

1. \( y' = \dfrac{2xy + 2\sqrt{y}}{x \ln x} ,\quad y(2) = 1 ,\)

2. \( y' = x^2\, y + \left( x^5 + x^2 \right) y^3 ,\quad y(0) = 1 ,\)

3. \( y' = -\dfrac{1}{1 + x}\, y - \dfrac{1}{1 + x}\, y^2 ,\quad y(0) = 1 ,\)

4. \( y' = y + x\, y^2 ,\quad y(0) = 1 ,\)

5. \( y'(x) - y(x) + (\cos x)\, y(x)^2 = 0 ,\quad y(0) = 1 .\)

5.9 Riccati equation


A Riccati differential equation has the following form:

y 0 = a(x) + b(x) y + c(x) y 2 . (5.40)



The solving strategy is based on knowing one particular solution \(y_1(x)\) of (5.40). Then, it is assumed that the other solutions of (5.40) have the form \(y(x) = y_1(x) + u(x)\), where u(x) is an unknown function, to be found, that solves the associated Bernoulli equation:
\[
u'(x) = \big( b(x) + 2\, c(x)\, y_1(x) \big)\, u(x) + c(x)\, u^2(x) . \tag{5.40a}
\]

Another way to form (5.40a) is via the substitution:
\[
y(x) = y_1(x) + \frac{1}{u(x)} ,
\]
which transforms (5.40) directly into the linear equation:
\[
u'(x) = -c(x) - \big( b(x) + 2\, c(x)\, y_1(x) \big)\, u(x) . \tag{5.41}
\]
Notice that, in this latter way, two substitutions are combined: the first one maps the Riccati⁵ equation into a Bernoulli equation; the second one linearizes the Bernoulli equation.
Example 5.36. Knowing that y1 (x) = 1 solves the Riccati equation:
\[
y' = -\frac{1 + x}{x} + y + \frac{1}{x}\, y^2 , \tag{5.42}
\]
we want to show that the general solution to equation (5.42) is:

\[
y(x) = 1 + \frac{x^2\, e^x}{c + (1 - x)\, e^x} .
\]
Let us use the change of variable:
\[
y(x) = 1 + \frac{1}{u(x)} ,
\]
to obtain the linear equation:
\[
u'(x) = -\frac{1}{x} - \left( 1 + \frac{2}{x} \right) u(x) . \tag{5.42a}
\]
To solve it, we proceed as we learned. First, compute A(x):
\[
A(x) = -\int \left( 1 + \frac{2}{x} \right) dx = -x - 2 \ln x \quad\Longrightarrow\quad e^{A(x)} = \frac{e^{-x}}{x^2} .
\]
Then, form:
\[
\int b(x)\, e^{-A(x)}\, dx = -\int \frac{1}{x}\, x^2\, e^x\, dx = -\int x\, e^x\, dx = (1 - x)\, e^x .
\]
The solution to (5.42a) is, therefore:
\[
u(x) = \frac{e^{-x}}{x^2} \big( c + (1 - x)\, e^x \big) = \frac{c\, e^{-x} + 1 - x}{x^2} = \frac{c + (1 - x)\, e^x}{x^2\, e^x} .
\]
Finally, the solution to (5.42) is:
\[
y(x) = 1 + \frac{1}{u(x)} = 1 + \frac{x^2\, e^x}{c + (1 - x)\, e^x} .
\]
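The general solution just found can be validated numerically for several values of the constant c; the following sketch (illustrative only) compares a finite-difference derivative with the Riccati right-hand side of (5.42).

```python
import math

def y(x, c):
    # general solution of (5.42)
    return 1 + x**2 * math.exp(x) / (c + (1 - x) * math.exp(x))

def rhs(x, yv):
    # Riccati right-hand side: -(1 + x)/x + y + y^2/x
    return -(1 + x) / x + yv + yv**2 / x

for c in (0.0, 1.0, -2.0):
    for x in (0.4, 0.5, 0.6):
        h = 1e-6
        dy = (y(x + h, c) - y(x - h, c)) / (2 * h)
        assert abs(dy - rhs(x, y(x, c))) < 1e-5
```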

Example 5.37. Using the fact that \(y_1(x) = x\) solves:
\[
y'(x) = -x^5 + \frac{y(x)}{x} + x^3\, y^2(x) , \tag{5.43}
\]
find the general solution of (5.43).
5
Jacopo Francesco Riccati (1676–1754), Italian mathematician and jurist.

The substitution \(y = x + \dfrac{1}{v}\) leads to the linear differential equation:
\[
1 - \frac{v'}{v^2} = -x^5 + \frac{x + \frac{1}{v}}{x} + x^3 \left( x + \frac{1}{v} \right)^{2}
\quad\Longrightarrow\quad
v' = -\frac{2x^5 + 1}{x}\, v - x^3 ,
\]
whose solution is:
\[
v(x) = \frac{c\, e^{-\frac{2x^5}{5}}}{x} - \frac{1}{2x} ,
\]
where c is an integration constant. The general solution of (5.43) is, therefore:
\[
y = x + \frac{2x}{2\, c\, e^{-\frac{2x^5}{5}} - 1} .
\]
Remark 5.38. In applications, it may be useful to state conditions on the coefficient functions
a(x) , b(x) and c(x) , to the aim that the relevant Riccati equation (5.40) is solved by some particular
function having simple form. The following list summarizes such conditions and, for each one, the
correspondent simple–form solution y1 .

1. Monomial solution: if \( a(x) + x^{n-1} \big( x\, b(x) + c(x)\, x^{n+1} - n \big) = 0 \), then \( y_1(x) = x^n \).

2. Exponential solution: if \( a(x) + e^{nx} \big( b(x) + c(x)\, e^{nx} - n \big) = 0 \), then \( y_1(x) = e^{nx} \).

3. Exponential monomial solution: if \( a(x) + e^{nx} \big( x\, b(x) + x^2\, c(x)\, e^{nx} - nx - 1 \big) = 0 \), then \( y_1(x) = x\, e^{nx} \).

4. Sine solution: if \( a(x) + b(x) \sin(nx) + c(x) \sin^2(nx) - n \cos(nx) = 0 \), then \( y_1(x) = \sin(nx) \).

5. Cosine solution: if \( a(x) + b(x) \cos(nx) + c(x) \cos^2(nx) + n \sin(nx) = 0 \), then \( y_1(x) = \cos(nx) \).

5.9.1 Cross–Ratio property


Solutions of Riccati equations possess some peculiar properties, due to the connection with linear equations, as explained in the following Theorem 5.39.

Theorem 5.39. Given any three functions \(y_1, y_2, y_3\) which satisfy (5.40), the general solution y of (5.40) can be expressed in the form:
\[
\frac{y - y_2}{y - y_1} = c\, \frac{y_3 - y_2}{y_3 - y_1} . \tag{5.44}
\]
Proof. We saw in § 5.9 that, if \(y_1\) is a solution of (5.40), then solutions \(y_2\) and \(y_3\) will be determined by two particular choices of u in the substitution:
\[
y = y_1 + \frac{1}{u} .
\]
Let us denote such functions with \(u_2\) and \(u_3\), respectively:
\[
y_2 = y_1 + \frac{1}{u_2} , \qquad y_3 = y_1 + \frac{1}{u_3} .
\]
Recalling that \(u_2\) and \(u_3\) are solutions to the linear equation (5.41), we know that the general solution of (5.41) can be written as shown in Remark 5.28:
\[
\frac{u - u_2}{u_3 - u_2} = c .
\]

At this point, employing the reverse substitution, and following [32] (page 23):
\[
u = \frac{1}{y - y_1} , \qquad u_2 = \frac{1}{y_2 - y_1} , \qquad u_3 = \frac{1}{y_3 - y_1} ,
\]
we arrive at formula (5.44), representing the general solution of (5.40).

A consequence of Theorem 5.39 is the so–called Cross–Ratio property of the Riccati equation, illustrated in the following Corollary 5.40.

Corollary 5.40. Given any four solutions \(y_1, \ldots, y_4\) of the Riccati equation (5.40), their Cross–Ratio is constant and is given by the quantity:
\[
\frac{y_4 - y_2}{y_4 - y_1} \cdot \frac{y_3 - y_1}{y_3 - y_2} . \tag{5.45}
\]

Proof. Relation (5.44) implies that, if \(y_4\) is a fourth solution of (5.40), then:
\[
\frac{y_4 - y_2}{y_4 - y_1} = c\, \frac{y_3 - y_2}{y_3 - y_1} ,
\]
which, since c is constant, demonstrates thesis (5.45).
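The Cross–Ratio property can be observed numerically on a simple Riccati equation, \(y' = 1 + y^2\), whose solutions are \(y = \tan(x + c)\); the sketch below (illustrative only) evaluates (5.45) for four such solutions at several abscissæ and checks that it does not depend on x.

```python
import math

# four particular solutions of the Riccati equation y' = 1 + y^2
sols = [lambda x, c=c: math.tan(x + c) for c in (0.0, 0.3, 0.7, 1.0)]

def cross_ratio(x):
    # quantity (5.45) for the four solutions above
    y1, y2, y3, y4 = (s(x) for s in sols)
    return ((y4 - y2) / (y4 - y1)) * ((y3 - y1) / (y3 - y2))

r0 = cross_ratio(0.1)
for x in (0.2, 0.3, 0.4):
    assert abs(cross_ratio(x) - r0) < 1e-9   # constant in x
```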

5.9.2 Reduced form of the Riccati equation


The particular differential equation:

\[
u' = A_0(x) + A_1(x)\, u^2
\]

is known as reduced form of the Riccati equation (5.40). Functions A0 (x) and A1 (x) are related
to functions a(x) , b(x) and c(x) appearing in (5.40). In fact, if B(x) is a primitive of b(x) , i.e.,
\(B'(x) = b(x)\), the change of variable:
\[
u(x) = e^{-B(x)}\, y \tag{5.46}
\]
transforms (5.40) into the reduced Riccati equation:
\[
u' = a(x)\, e^{-B(x)} + c(x)\, e^{B(x)}\, u^2 . \tag{5.40b}
\]
This can be seen by computing \(u' = e^{-B(x)} \big( y' - y\, B'(x) \big)\) from (5.46) and then substituting, in the factor \(y' - y\, B'(x)\), the equalities \(B'(x) = b(x)\) and \(y' = a(x) + b(x)\, y + c(x)\, y^2\), and finally \(y = e^{B(x)} u\).

Sometimes, given a Riccati equation, its solution can be obtained by simply transforming it to its reduced form. Example 5.41 illustrates this fact.

Example 5.41. Consider the initial value problem for the Riccati equation:
\[
\begin{cases} y' = 1 - \dfrac{1}{2x}\, y + x\, y^2 ,\\[4pt] y(1) = 0 . \end{cases}
\]

To obtain its reduced form, define:
\[
B(x) = \int -\frac{1}{2x}\, dx = -\frac{1}{2} \ln x ,
\]
and the change of variable:
\[
y = e^{B(x)}\, u = e^{-\frac{1}{2} \ln x}\, u = x^{-\frac{1}{2}}\, u .
\]

The reduced (separable) Riccati equation is, then:
\[
\begin{cases} u' = x^{\frac{1}{2}} \left( 1 + u^2 \right) ,\\[4pt] u(1) = 0 , \end{cases}
\]
whose solution is:
\[
u(x) = \tan\!\left( \frac{1}{3} \left( 2 x^{\frac{3}{2}} - 2 \right) \right) .
\]
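Undoing the substitution, the solution of the original problem is \(y(x) = x^{-1/2}\, u(x)\); a quick numerical sketch (illustrative only) confirms that it satisfies the Riccati equation and the initial condition.

```python
import math

def y(x):
    # y = x^(-1/2) u, with u the reduced-variable solution of Example 5.41
    u = math.tan((2 * x**1.5 - 2) / 3)
    return u / math.sqrt(x)

def rhs(x, yv):
    # Riccati right-hand side: 1 - y/(2x) + x y^2
    return 1 - yv / (2 * x) + x * yv**2

assert abs(y(1.0)) < 1e-12              # initial condition y(1) = 0
for x in (0.8, 1.0, 1.3):
    h = 1e-6
    dy = (y(x + h) - y(x - h)) / (2 * h)
    assert abs(dy - rhs(x, y(x))) < 1e-5
```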
Remark 5.42. The reduced Riccati equation is also separable if and only if there exists a real number λ such that:
\[
\lambda\, a(x)\, e^{-B(x)} = c(x)\, e^{B(x)} .
\]
In other words, to have separability of the reduced equation, the function:
\[
\frac{c(x)}{a(x)}\, e^{2 B(x)}
\]
has to be constant and equal to a certain real number λ. The topic of separability of the Riccati equation is presented in [2, 48, 49, 57, 61].

5.9.3 Connection with the linear equation of second order


Second–order differential equations will be discussed in Chapter 6, but the study of a particular
second–order differential equation is anticipated here, since it is related to the Riccati equation.
A linear differential equation of second order has the form:

y 00 + P (x) y 0 + Q(x) y = 0 , (5.47)

where P (x) and Q(x) are given continuos functions, defined on an interval I ⊂ R . The term linear
indicates the fact that the unknown function y = y(x) and its derivatives appear in polynomial form
of degree one.
The second–order linear differential equation (5.47) is equivalent to a particular Riccati equation. We
follow the fair exposition given in Chapter 15 of [46]. Let us introduce a new variable u = u(x) ,
setting: Z
y = e−U (x) , with U (x) = − u(x) dx . (5.48)

Compute the first and second derivatives of y, respectively:
\[
y' = -u\, e^{-U(x)} , \qquad y'' = e^{-U(x)} \left( u^2 - u' \right) .
\]
Equation (5.47) gets then transformed into:
\[
e^{-U(x)} (u^2 - u') - P(x)\, u\, e^{-U(x)} + Q(x)\, e^{-U(x)} = 0 ,
\]
simplifying which leads to the non–linear Riccati differential equation of first order:
\[
u' = Q(x) - P(x)\, u + u^2 . \tag{5.47a}
\]

Vice versa, to find a linear differential equation of second order that is equivalent to the first–order Riccati equation (5.40), let us proceed as follows. Consider the transformation:
\[
y = -\frac{w'}{c(x)\, w} , \tag{5.49}
\]
with first derivative:
\[
y' = \frac{w\, c'(x)\, w' - c(x)\, w\, w'' + c(x)\, (w')^2}{c^2(x)\, w^2} .
\]

Now, apply transformation (5.49) to the right hand–side of (5.40), i.e., to \(a(x) + b(x)\, y + c(x)\, y^2\):
\[
a(x) - \frac{b(x)\, w'}{c(x)\, w} + \frac{(w')^2}{c(x)\, w^2} .
\]
By comparison, and after some algebra, we arrive at:
\[
\frac{-a(x)\, c^2(x)\, w + b(x)\, c(x)\, w' + c'(x)\, w' - c(x)\, w''}{c^2(x)\, w} = 0 ,
\]
that is a linear differential equation of second order:
\[
c(x)\, w'' - \big( b(x)\, c(x) + c'(x) \big)\, w' + a(x)\, c^2(x)\, w = 0 \tag{5.47b}
\]
equivalent to the Riccati equation (5.40).


Example 5.43. Consider the linear differential equation of order 2:
\[
y'' - \frac{x}{1 - x^2}\, y' + \frac{1}{1 - x^2}\, y = 0 , \qquad\text{with}\quad -1 < x < 1 . \tag{5.50}
\]
Following the notations in (5.47):
\[
P(x) = -\frac{x}{1 - x^2} , \qquad Q(x) = \frac{1}{1 - x^2} ,
\]
and using the transformation (5.48), we arrive at the Riccati equation:
\[
u' = \frac{1}{1 - x^2} + \frac{x}{1 - x^2}\, u + u^2 . \tag{5.50a}
\]

To obtain the reduced form of (5.50a), we employ the transformation (5.46), observing that, here, such a transformation works in the following way:
\[
b(x) = \frac{x}{1 - x^2} \;\Longrightarrow\; B(x) = \int b(x)\, dx = -\frac{1}{2} \ln(1 - x^2)
\;\Longrightarrow\; v = e^{-B(x)}\, u = \sqrt{1 - x^2}\; u .
\]

The reduced Riccati (separable) differential equation is, then:
\[
v' = \frac{1}{\sqrt{1 - x^2}} + \frac{1}{\sqrt{1 - x^2}}\, v^2 , \tag{5.51}
\]
whose general solution is:
\[
v(x) = \tan\left( \arcsin x + c \right) .
\]
Recovering the u variable, we get the solution to equation (5.50a):
\[
u(x) = \frac{\tan\left( \arcsin x + c \right)}{\sqrt{1 - x^2}} .
\]
To get the solution of the linear equation (5.50), we reuse relation (5.48).
First, a primitive U (x) of u(x) must be found:

tan arcsin x + c
Z Z

U (x) = u(x) dx = √ dx = − ln cos(arcsin x + c) − K .
1 − x2
where ±K is a constant whose sign can be made positive. Then, using (5.48), we can conclude that
the general solution to (5.50) is:

ln cos(arcsin x+c) +K
y(x) = e = eK cos(arcsin x + c) .
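A finite-difference check (first and second central differences) that \(y = e^{K} \cos(\arcsin x + c)\) satisfies (5.50); since \(e^K\) only rescales the solution of a linear homogeneous equation, it is set to 1 in this illustrative sketch.

```python
import math

c = 0.2   # arbitrary value of the integration constant

def y(x):
    # candidate solution of (5.50), with e^K = 1
    return math.cos(math.asin(x) + c)

def residual(x, h=1e-5):
    # y'' - x/(1 - x^2) y' + y/(1 - x^2), by central differences
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return d2 - x / (1 - x**2) * d1 + y(x) / (1 - x**2)

for x in (-0.5, 0.0, 0.4):
    assert abs(residual(x)) < 1e-4
```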

5.9.4 Exercises
1. Knowing that \(y_1(x) = 1\) is a solution of:
\[
y'(x) = -1 - e^x + e^x\, y(x) + y^2(x) , \tag{5.52}
\]
show that the general solution to (5.52) is:
\[
y(x) = 1 + \frac{e^{2x + e^x}}{c - e^{e^x} \left( e^x - 1 \right)} .
\]
Hint: \(\displaystyle \int e^{2x + e^x}\, dx = e^{e^x} \left( e^x - 1 \right) .\)

2. Find the general solution of the equation:
\[
y'(x) = -\frac{2}{x^2} + y^2(x) ,
\]
knowing that \(y_1(x) = \dfrac{1}{x}\) is a particular solution.
3. Find the general solution of the equation:
\[
y'(x) = -1 + \frac{1}{x}\, y + \frac{1}{x^2}\, y^2(x) ,
\]
knowing that \(y_1(x) = x\) is a particular solution.

4. Solve the linear differential equation of second order:
\[
x^2\, y'' + 3x\, y' + y = 0 .
\]
Hint: Transform the equation into a Riccati equation, in reduced form, and then use the fact that \(v_1(x) = x^2\) is a particular solution.

5. Solve the linear differential equation of order 2:
\[
(1 + x^2)\, y'' - 2x\, y' + 2y = 0 .
\]
Hint: Transform the equation into a Riccati equation, then solve it, using the fact that \(u_1 = -\dfrac{1}{x}\) is a particular solution.

5.10 Change of variable


A differential equation can be solved, at times, using an appropriate change of variable. Aim of this
§ 5.10 is to provide a short account on how to apply any change of variable, to a given differential
equation, in a correct way. Details on how to find a change of variable, capable of transforming a given
differential equation into a simpler (possibly, the simplest) form, can be found in [31], [4].

Consider the differential equation:
\[
y'_x = f\big( x,\, y(x) \big) , \tag{5.53}
\]
where the subscript x emphasizes that we are considering the x–derivative, in contrast with the fact that, below, we are also going to form the derivative with respect to a new variable. Consider, in fact, the mapping:
\[
(x, y) \mapsto (X, Y) ,
\]

where \(X = X(x, y)\) and \(Y = Y(x, y)\) are \(C^1\) functions, with:
\[
\det \begin{pmatrix} X_x(x, y) & X_y(x, y) \\ Y_x(x, y) & Y_y(x, y) \end{pmatrix} \neq 0 .
\]
This last condition ensures uniqueness for the (x, y)–solution of the system:
\[
\begin{cases} X = X(x, y) ,\\ Y = Y(x, y) . \end{cases}
\]
Equation (5.53) gets, thus, changed into:
\[
\frac{D_x Y(x, y)}{D_x X(x, y)} = \frac{Y_x + Y_y\, y'_x}{X_x + X_y\, y'_x} ,
\]
i.e.,
\[
Y'_X = \frac{Y_x + f(x, y)\, Y_y}{X_x + f(x, y)\, X_y} . \tag{5.53a}
\]
The right hand–side of (5.53a) contains (x, y). To complete the coordinate change, we have to invert the mapping, solving the system:
\[
\begin{cases} X = X(x, y) ,\\ Y = Y(x, y) , \end{cases}
\quad\Longrightarrow\quad
\begin{cases} x = \hat{x}(X, Y) ,\\ y = \hat{y}(X, Y) , \end{cases}
\]
and substituting the resulting expressions for x and y into (5.53a).
and substituting the founded expressions for x and y into (5.53a).
Example 5.44. Consider the Riccati equation:

y'_x = x y^2 − 2 y / x − 1 / x^3 .     (5.54)

Consider further the change of variables (X , Y ) = (x^2 y , ln x) , so that:

Y_x = D_x Y (x , y) = D_x (ln x) = 1/x ,      Y_y = D_y Y (x , y) = D_y (ln x) = 0 ,
X_x = D_x X(x , y) = D_x (x^2 y) = 2 x y ,    X_y = D_y X(x , y) = D_y (x^2 y) = x^2 ,

thus:

Y'_X = (1/x) / ( 2 x y + x^2 ( x y^2 − 2 y/x − 1/x^3 ) ) = 1 / ( x^4 y^2 − 1 ) .     (5.54a)
Coordinates (x , y) can be expressed in terms of the new coordinates (X , Y ) as:

X = x^2 y ,          x = e^Y ,
              =⇒
Y = ln x ,           y = X e^{−2 Y} .
Substitution into (5.54a) leads to:

Y'_X = 1 / ( e^{4 Y} X^2 e^{−4 Y} − 1 ) = 1 / ( X^2 − 1 ) = (1/2) ( 1/(X − 1) − 1/(X + 1) ) .     (5.54b)

Integrating (5.54b):

Y = ln c + (1/2) ln ( (X − 1)/(X + 1) ) ,

where ln c is an integration constant. Recovering the original variables:

ln x = ln c + (1/2) ln ( (x^2 y − 1)/(x^2 y + 1) ) = ln ( c sqrt( (x^2 y − 1)/(x^2 y + 1) ) ) .

The solution of (5.54) is, therefore:

x = c sqrt( (x^2 y − 1)/(x^2 y + 1) )     =⇒     y = (c^2 + x^2) / ( x^2 (c^2 − x^2) ) .

5.10.1 Exercises
1. Prove that the differential equation:

y'_x = ( y − 4 x y^2 − 16 x^3 ) / ( y^3 + 4 x^2 y + x )

is transformed, by the change of variables (x , y) 7→ (X , Y ) , into:

Y'_X = −2 X ,

where X(x , y) = sqrt(4 x^2 + y^2) and Y (x , y) = arctan( y / (2 x) ) .
Then, use this fact to integrate the original differential equation.

2. Prove that the differential equation:

y'_x = ( y^3 + x^2 y − y − x ) / ( x y^2 + x^3 + y − x )

is transformed, by the change of variables (x , y) 7→ (X , Y ) , into:

Y'_X = Y (1 − Y^2) ,

where X(x , y) = arctan( y / x ) and Y (x , y) = sqrt(x^2 + y^2) .
Then, use this fact to integrate the original differential equation.

3. Given the differential equation, in the unknown y = y(x) :

y' = y^3 e^y / ( y^3 e^y − e^x ) ,     (5.55)

transform it into an equation for Y = Y (X) , using the change of variables:

X(x , y) = −1/y ,
Y (x , y) = e^{x−y} .

Then, express the solution to (5.55) in implicit form.

4. Given the differential equation, in the unknown y = y(x) :

y' = (y + 1)/x + y^2/x^3 ,     y(1) = 0 ,     (5.56)

transform it into an equation for Y = Y (X) , using the change of variables:

X(x , y) = y/x ,
Y (x , y) = −1/x ,

and then give the solution to (5.56).
6 Linear differential equations of second order

The general form of a differential equation of order n ∈ N was briefly introduced in equation (4.10) of
Chapter 4. The current Chapter 6 is devoted to the particular situation of linear equations of second
order:

a(x) y'' + b(x) y' + c(x) y = d(x) ,     (6.1)

where a , b , c and d are continuous real functions of the real variable x ∈ I , I being an interval in R ,
and a(x) ≠ 0 . Equation (6.1) may be represented, at times, in operational notation:

M y = d(x) ,

where M : C^2(I) → C(I) is a differential operator that acts on the function y ∈ C^2(I) :

M y = a(x) y'' + b(x) y' + c(x) y .     (6.2)

In this situation, existence and uniqueness of solutions are guaranteed for any initial value problem
associated to (6.1).
Before dealing with the simplest case, in which the coefficient functions a(x) , b(x) , c(x) are constant,
we examine general properties that hold in any situation. We will study some variable–coefficient
equations that are meaningful in applications. Our treatment can easily be extended to equations of
any order; for details, refer to Chapter 5 of [47] or Chapter 6 of [3].

6.1 Homogeneous equations

Assume that a(x) ≠ 0 on a certain interval I . Hence, we can set:

p(x) = b(x)/a(x) ,     q(x) = c(x)/a(x) ,     r(x) = d(x)/a(x) ,     (6.3)

and represent the differential equation (6.1) in the explicit form:

L y(x) = r(x) ,     (6.4)

where L is the differential operator:

L y = y'' + p(x) y' + q(x) y .     (6.5)

The homogeneous equation associated to (6.4) is:

L y = 0 .     (6.4a)

The first step in studying (6.4) consists in the change of variable:

y(x) = f (x) u(x) ,


where u(x) is the new dependent variable, while f (x) is a function to be specified, in order to simplify
computations. We find:

L(f u) = f u'' + ( 2 f' + p f ) u' + ( f'' + p f' + q f ) u = r .     (6.6)

In (6.6), we can choose f so that the coefficient of u vanishes, that is:

f'' + p f' + q f = 0 .     (6.7)

In this way, equation (6.6) becomes easily solvable, since it reduces to a first–order linear equation in
the unknown v = u' :

f v' + ( 2 f' + p f ) v = r .     (6.6a)
At this point, if any particular solution to the homogeneous equation (6.4a) is available, the solution
of the non–homogeneous equation (6.4) can be obtained.
The set of solutions to a homogeneous equation forms a two–dimensional vector space, as illustrated
in the following Theorem 6.1. The first, and easy, step is to recognize that, given two solutions y1 and
y2 of (6.4a), their linear combination:

y = α1 y1 + α2 y2 ,     α1 , α2 ∈ R ,

is also a solution to (6.4a). In particular, if y1 and y2 are solutions to (6.4a), then:

(L y1)(x) = y1'' + p(x) y1' + q(x) y1 = 0 ,     (6.8)
(L y2)(x) = y2'' + p(x) y2' + q(x) y2 = 0 .     (6.9)

To form their linear combination, we multiply both sides of (6.8) and (6.9) by α1 and α2 , respectively,
and we sum the results, obtaining:

α1 y1'' + α2 y2'' + p(x) ( α1 y1' + α2 y2' ) + q(x) ( α1 y1 + α2 y2 ) = 0 .

Using the elementary properties of differentiation, we see that:

α1 y1'' + α2 y2'' = (α1 y1 + α2 y2)''     and     α1 y1' + α2 y2' = (α1 y1 + α2 y2)' .

This shows that α1 y1 + α2 y2 is indeed a solution to (6.4a). This demonstrates also that, when
α2 = 0 , any multiple of one solution of (6.4a) solves (6.4a) too. By iteration, any linear combination
of solutions of (6.4a) solves (6.4a) too.

6.1.1 Operator notation

The operator notation (6.4) comes in handy for understanding, in detail, the structure of the solution
set of a linear homogeneous differential equation.
The operator L introduced in (6.5) represents a linear operator between the (infinite–dimensional)
vector space C^2(I) , formed by all the functions f whose first and second derivatives, f' , f'' , exist
and are continuous on I , and the (infinite–dimensional) space C(I) of the continuous functions on I :

L : C^2(I) → C(I) .     (6.10)

The task of solving the linear homogeneous differential equation (6.4a) becomes, thus, equivalent to
describing the kernel, denoted by ker(L) , of the linear operator L , that is, the space of the solutions
of the linear homogeneous equation (6.4a):

ker(L) = { y ∈ C^2(I) | L y = 0 } .     (6.11)

Even if C^2(I) is an infinite–dimensional vector space, ker(L) is a subspace of dimension 2 , as stated
in Theorem 6.1.

Theorem 6.1. Consider the linear differential operator L : C^2(I) → C(I) , defined by (6.5). Then,
the kernel of L has dimension 2 .

Proof. Fix x0 ∈ I and define the linear operator T : ker(L) → R^2 , which maps each function
y ∈ ker(L) onto its initial values, evaluated at x0 , i.e.:

T y = ( y(x0) , y'(x0) ) .

The existence part of the existence and uniqueness Theorem 4.17 shows that T is onto, while the
uniqueness part shows that T y = (0 , 0) implies y = 0 , i.e., that T is a one–to–one operator. Hence,
by the theory of linear operators, it holds:

dim ker(L) = dim R^2 = 2 .

6.1.2 Wronskian determinant

To study the vector space structure of the set of solutions to (6.4a), it is useful to examine some
properties related to the linear independence of functions. Given n real functions f1 , . . . , fn of a
real variable, all defined on the same interval I , we say that f1 , . . . , fn are linearly dependent if there
exist n numbers α1 , . . . , αn , not all zero, such that, for any x ∈ I :

Σ_{k=1}^{n} αk fk(x) = 0 .     (6.12)

If condition (6.12) holds only when all the αk are zero (i.e., αk = 0 for all k = 1 , . . . , n), then the
functions f1 , . . . , fn are linearly independent.

We now provide a sufficient condition for linear independence of a set of functions. Let us assume
that f1 , . . . , fn are (n − 1)–times differentiable. Then, from equation (6.12), applying successive
differentiations, we can form a system of n linear equations in the variables α1 , . . . , αn :

α1 f1 + α2 f2 + · · · + αn fn = 0 ,
α1 f1' + α2 f2' + · · · + αn fn' = 0 ,
α1 f1'' + α2 f2'' + · · · + αn fn'' = 0 ,     (6.13)
. . .
α1 f1^(n−1) + α2 f2^(n−1) + · · · + αn fn^(n−1) = 0 .

The functions f1 , . . . , fn are linearly independent if it holds:

      |  f1        f2        ...   fn        |
  det |  f1'       f2'       ...   fn'       |  ≠ 0 .     (6.14)
      |  ...       ...             ...       |
      |  f1^(n−1)  f2^(n−1)  ...   fn^(n−1)  |

The determinant in (6.14) is called the Wronskian^1 of the functions f1 , . . . , fn , denoted as:

                             |  f1(x)        f2(x)        ...   fn(x)        |
  W (f1 , ... , fn)(x) = det |  f1'(x)       f2'(x)       ...   fn'(x)       | .
                             |  ...          ...                ...          |
                             |  f1^(n−1)(x)  f2^(n−1)(x)  ...   fn^(n−1)(x)  |

^1 Josef–Maria Hoëne de Wronski (1778–1853), Polish philosopher, mathematician, physicist, lawyer and economist.

For example, the functions f1(x) = sin^2 x , f2(x) = cos^2 x , f3(x) = sin(2 x) are linearly independent
on I = R , since their Wronskian is non–zero:

                          |  f1(x)    f2(x)    f3(x)   |
W (f1 , f2 , f3)(x) = det |  f1'(x)   f2'(x)   f3'(x)  |
                          |  f1''(x)  f2''(x)  f3''(x) |

                          |  sin^2 x                 cos^2 x                  sin(2 x)    |
                    = det |  2 cos x sin x           −2 cos x sin x           2 cos(2 x)  |  = 4 .
                          |  2 (cos^2 x − sin^2 x)   2 (sin^2 x − cos^2 x)    −4 sin(2 x) |
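The Wronskian above is easy to reproduce with a computer algebra system; the sketch below (sympy, an addition to the text) builds the 3 × 3 Wronskian matrix, row k holding the k–th derivatives.

```python
import sympy as sp

x = sp.symbols('x')
f = [sp.sin(x)**2, sp.cos(x)**2, sp.sin(2*x)]

# Row k of the Wronskian matrix holds the k-th derivatives of f1, f2, f3
W = sp.Matrix([[sp.diff(fi, x, k) for fi in f] for k in range(3)])
print(sp.simplify(W.det()))  # → 4
```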

A non–vanishing Wronskian represents a sufficient condition for linear independence of functions. It is
worth noting that, in general, the Wronskian of a set of linearly independent functions may vanish, but
this situation cannot occur when the functions are solutions to a linear differential equation. There
exists, in fact, an important result, due to Abel^2 , stated in the following Theorem 6.2.
Theorem 6.2. Let the functions y1(x) and y2(x) , defined on the interval I , be solutions to the linear
differential equation (6.4a). Then, a necessary and sufficient condition for y1 and y2 to be linearly
independent is provided by their Wronskian being non–zero on I .

Proof. The Wronskian of y1(x) and y2(x) is a function W : I → R defined as:

W (x) = W (y1 , y2)(x) = det | y1(x)    y2(x)  | = y1(x) y2'(x) − y2(x) y1'(x) .
                             | y1'(x)   y2'(x) |

Differentiating, we obtain:

W'(x) = (d/dx) W (y1 , y2)(x) = y1(x) y2''(x) − y2(x) y1''(x) .
Since y1 and y2 are solutions to (6.4a), recalling that we assume a(x) ≠ 0 in (6.1), it holds:

y1''(x) = −p(x) y1'(x) − q(x) y1(x) ,
y2''(x) = −p(x) y2'(x) − q(x) y2(x) ,

where p(x) , q(x) are as in (6.3). Then:

W'(x) = −p(x) ( y1(x) y2'(x) − y2(x) y1'(x) ) = −p(x) W (x) .

In other words, the Wronskian solves the separable differential equation:

W' = −p(x) W .     (6.15)
Solving (6.15) yields:

W (x) = W (x0) e^{ −∫_{x0}^{x} p(s) ds } ,     (6.16)

where p is as in (6.3). Equation (6.16) implies that, if the Wronskian W (x) vanishes at some x0 ∈ I ,
then W (x) is the zero function; vice versa, if there exists x0 ∈ I such that W (x0) ≠ 0 , then
W (x) ≠ 0 for each x ∈ I . Hence, to prove the thesis of Theorem 6.2, we need to prove that there
exists x0 ∈ I such that W (x0) ≠ 0 . The demonstration is by contradiction. Let us negate the
statement:

∃ x0 ∈ I such that W (x0) ≠ 0 ,

which means assuming that the following holds true:

W (x0) = 0     ∀ x0 ∈ I .

^2 Niels Abel (1802–1829).

Construct the 2 × 2 linear system of algebraic equations in the unknowns α1 , α2 :

| y1(x0)    y2(x0)  |  | α1 |     | 0 |
|                   |  |    |  =  |   | .     (6.17)
| y1'(x0)   y2'(x0) |  | α2 |     | 0 |

By assumption, the determinant of this homogeneous system is zero, hence the system admits a non–
trivial solution (α1 , α2)^T , with α1 , α2 not simultaneously null. Now, define the function:

y(x) = α1 y1(x) + α2 y2(x) .

Since y(x) is a linear combination of the solutions y1(x) and y2(x) of (6.4a), then y(x) is also a
solution to (6.4a). And since, by construction, (α1 , α2) solves (6.17), it also holds that:

y(x0) = α1 y1(x0) + α2 y2(x0) = 0 ,
y'(x0) = α1 y1'(x0) + α2 y2'(x0) = 0 .

At this point, from the existence and uniqueness of the solutions of the initial value problem:

L y = 0 ,
y(x0) = y'(x0) = 0 ,

it turns out that y(x) = 0 identically; since α1 , α2 are not both zero, this contradicts the linear
independence of y1 and y2 . The theorem is proved.

Putting together Theorems 6.1 and 6.2, it is possible to establish whether a pair of solutions to (6.4a)
is a basis for the set of solutions to equation (6.4a), as illustrated in Theorem 6.3.

Theorem 6.3. Consider the linear differential operator L : C^2(I) → C(I) defined in (6.5). If y1 and
y2 are two independent elements of ker(L) , then any other element of ker(L) can be expressed as a
linear combination of y1 and y2 :

y(x) = c1 y1(x) + c2 y2(x) ,

for suitable constants c1 , c2 ∈ R .

6.1.3 Order reduction

When an integral of the homogeneous equation (6.4a) is known, possibly by inspection or by an
educated guess, a second independent solution to (6.4a) can be obtained, with a procedure illustrated
in Example 6.4.

Example 6.4. Knowing that y1(x) = x solves the differential equation:

x^2 (1 + x) y'' − x (2 + 4 x + x^2) y' + (2 + 4 x + x^2) y = 0 ,     (6.18)

a second independent solution to (6.18) can be found, by seeking a solution of the form:

y2(x) = y1(x) u(x) = x u(x) .

Let us evaluate the first and the second derivative of y2 :

y2'(x) = u(x) + x u'(x) ,     y2''(x) = 2 u'(x) + x u''(x) ,

and substitute the derivatives above into (6.18):

x^3 (x + 1) u''(x) − x^3 (x + 2) u'(x) = 0 ,     i.e.,     (x + 1) u''(x) − (x + 2) u'(x) = 0 .




Now, introducing v(x) = u'(x) , we see that v has to satisfy the first–order linear separable differential
equation:

v' = ( (x + 2)/(x + 1) ) v ,

which is solved by:

v(x) = c (1 + x) e^x .

We can assume c = 1 , since we are only interested in finding one particular solution of (6.18). The
function u(x) is then found by integration:

u(x) = ∫ (1 + x) e^x dx = x e^x .

Therefore, a second solution to (6.18) is y2(x) = x^2 e^x , where, again, we do not worry about the
integration constant. Functions y1 , y2 form an independent set of solutions to (6.18) if their Wronskian:

det | y1(x)    y2(x)  |  =  det | x    x^2 e^x       |  =  x^2 (x + 1) e^x
    | y1'(x)   y2'(x) |         | 1    x (x + 2) e^x |

is different from zero. Now, observe that the differential equation (6.18) has to be considered in one of
the intervals (−∞ , −1) , (−1 , 0) , (0 , +∞) , where the leading coefficient x^2 (1 + x) of (6.18) does
not vanish. On such intervals, the Wronskian does not vanish as well, thus y1 , y2 are linearly
independent. In conclusion, the general solution to (6.18) is:

y(x) = c1 x + c2 x^2 e^x ,     c1 , c2 ∈ R .     (6.19)
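The general solution (6.19) can be verified symbolically; a sympy sketch (an addition, not part of the text):

```python
import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2')
y = c1*x + c2*x**2*sp.exp(x)  # candidate general solution (6.19)

# Left-hand side of (6.18); it must vanish for every choice of c1, c2
lhs = (x**2*(1 + x)*sp.diff(y, x, 2)
       - x*(2 + 4*x + x**2)*sp.diff(y, x)
       + (2 + 4*x + x**2)*y)
print(sp.expand(lhs))  # → 0
```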

The procedure illustrated in Example 6.4 can be repeated in the general case. For simplicity, we recall
(6.4a), written in the explicit form:

y'' + p(x) y' + q(x) y = 0 .     (6.20)

If a solution y1(x) of (6.20) is known, we look for a second solution of the form:

y2(x) = u(x) y1(x) ,

where u is a function to be determined. Computing the first and second derivatives of y2 :

y2' = y1 u' + y1' u ,     y2'' = y1 u'' + 2 y1' u' + y1'' u ,

and inserting them into (6.20), yields, after some computations:

y1 u'' + (2 y1' + p y1) u' + (y1'' + p y1' + q y1) u = 0 .

Now, since y1 is a solution to (6.20), the previous equation reduces to:

y1 u'' + (2 y1' + p y1) u' = 0 .     (6.21)

Equation (6.21) is a first–order separable equation in the unknown u' , exactly as in Example 6.4,
and it can be integrated to obtain the second solution to (6.20).

The search for a second independent solution to (6.20) can also be pursued using the Wronskian
equation (6.16), without explicitly computing two solutions of (6.20). This is stated in the following
Theorem 6.5.

Theorem 6.5. If y1(x) is a non–vanishing solution of the second–order equation (6.20), then a second
independent solution is given by:

y2(x) = y1(x) ∫_{x0}^{x} ( e^{ −∫_{x0}^{t} p(s) ds } / y1^2(t) ) dt ,     (6.22)

where p is as in (6.3).

Proof. Given the assumption that y1 is a non–vanishing function, rewrite the Wronskian as:

W (y1 , y2) = det | y1    y2  |  =  y1 y2' − y2 y1'  =  y1^2 ( (y1 y2' − y2 y1') / y1^2 ) ,
                  | y1'   y2' |

and observe that:

(y1 y2' − y2 y1') / y1^2 = (d/dx) ( y2 / y1 ) .

In other words:

(d/dx) ( y2 / y1 ) = W (y1 , y2) / y1^2 ,     (6.23)

integrating which leads to:

y2(x) = y1(x) ∫_{x0}^{x} ( W (y1 , y2)(s) / y1^2(s) ) ds ,     (6.24)

setting to zero the constant of integration. At this point, thesis (6.22) follows from inserting equation
(6.16), normalized with W (x0) = 1 , into (6.24).

Example 6.6. Consider again Example 6.4. The solution y1(x) = x of (6.18) can be used in formula
(6.22), to detect a second solution to such equation. Observe that, in this case:

p(x) = −x (2 + 4 x + x^2) / ( x^2 (1 + x) ) = −(2 + 4 x + x^2) / ( x (1 + x) ) ,

so that:

∫ −p(x) dx = x + 2 ln x + ln(1 + x) ,

and:

e^{ ∫ −p(s) ds } = x^2 (1 + x) e^x .

The second solution is, hence:

y2(x) = x ∫ ( x^2 (1 + x) e^x / x^2 ) dx = x ∫ (1 + x) e^x dx = x^2 e^x ,

in accordance with the solution y2 found using the method of order reduction in Example 6.4.
Example 6.7. Find the general solution of the homogeneous linear differential equation of second
order:

y'' + (1/x) y' + ( 1 − 1/(4 x^2) ) y = 0 .

We seek a solution of the form y1 = x^m sin x . For such a solution, the first and second derivatives
are:

y1' = x^{m−1} ( x cos x + m sin x ) ,
y1'' = x^{m−2} ( 2 m x cos x + m (m − 1) sin x − x^2 sin x ) .

Imposing that y1 solves the considered differential equation, we obtain:

x^{m−2} ( (2 m + 1) x cos x + (m^2 − 1/4) sin x ) = 0 ,

which implies, in particular, m = −1/2 . In this way, we have proved that

y1 = sin x / sqrt(x)

solves the given differential equation.
To obtain a second independent solution, we employ (6.22); in our case, it is p(x) = 1/x , which gives:

y2 = ( sin x / sqrt(x) ) ∫ dx / sin^2 x = − cos x / sqrt(x) .
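Both solutions found in Example 6.7 can be checked symbolically; the following sympy sketch is an addition, not part of the text.

```python
import sympy as sp

x = sp.symbols('x', positive=True)

def residual(y):
    # Left-hand side of y'' + y'/x + (1 - 1/(4 x^2)) y = 0
    return sp.diff(y, x, 2) + sp.diff(y, x)/x + (1 - 1/(4*x**2))*y

y1 = sp.sin(x)/sp.sqrt(x)
y2 = -sp.cos(x)/sp.sqrt(x)
print(sp.simplify(residual(y1)), sp.simplify(residual(y2)))  # → 0 0
```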

Remark 6.8. To facilitate the search for some particular solution of a linear differential equation,
conditions on the coefficients p(x) and q(x) , defined in (6.3), are provided below, each leading to a
function y1 that solves (6.20).

1. Monomial solution: if n^2 − n + n x p(x) + x^2 q(x) = 0 , then y1(x) = x^n .

2. Exponential solution: if n^2 + n p(x) + q(x) = 0 , then y1(x) = e^{n x} .

3. Exponential monomial solution: if n^2 x + 2 n + (1 + n x) p(x) + x q(x) = 0 , then y1(x) = x e^{n x} .

4. Exponential Gaussian solution: if 2 m + 4 m^2 x^2 + 2 m x p(x) + q(x) = 0 , then y1(x) = e^{m x^2} .

5. Sine solution: if n p(x) cos(n x) − n^2 sin(n x) + q(x) sin(n x) = 0 , then y1(x) = sin(n x) .

6. Cosine solution: if q(x) cos(n x) − n^2 cos(n x) − n p(x) sin(n x) = 0 , then y1(x) = cos(n x) .

6.1.4 Constant–coefficient equations

In equation (6.1), the easiest case occurs when the coefficient functions a(x) , b(x) , c(x) are constant.
Suppose that a , b , c ∈ R , with a ≠ 0 . A constant–coefficient homogeneous differential equation has
the form:

M y = a y'' + b y' + c y = 0 .     (6.25)

We seek solutions of (6.25) in the exponential form

y(x) = e^{λ x} ,

where λ is a constant to be determined. Computing the first two derivatives:

y'(x) = λ y(x) ,     y''(x) = λ^2 y(x) ,

and imposing that y(x) solves (6.25), leads to the algebraic equation, called the characteristic equation
of (6.25):

a λ^2 + b λ + c = 0 .     (6.26)

The roots of (6.26) determine the solutions of (6.25). Namely, if the discriminant ∆ = b^2 − 4 a c is
positive, so that equation (6.26) admits two distinct real roots λ1 and λ2 , then the general solution
to (6.25) is:

y = c1 e^{λ1 x} + c2 e^{λ2 x} .     (6.27)

The independence of the solutions y1 = e^{λ1 x} and y2 = e^{λ2 x} follows from the analysis of their
Wronskian, which is non–vanishing for any x ∈ R , since λ1 ≠ λ2 and λ1 + λ2 = −b/a :

W (y1 , y2)(x) = det | y1    y2  |  =  (λ2 − λ1) e^{(λ1 + λ2) x}  =  (λ2 − λ1) e^{−(b/a) x} ≠ 0 .
                     | y1'   y2' |

When ∆ < 0 , equation (6.26) admits two distinct complex conjugate roots λ1 = α + i β and
λ2 = α − i β , and the general solution to (6.25) is:

y = e^{α x} ( c1 cos(β x) + c2 sin(β x) ) .     (6.28)

Forming the complex exponential of λ1 and that of λ2 , two complex valued functions z1 , z2 are
obtained:

z1 = e^{λ1 x} = e^{(α + i β) x} = e^{α x} e^{i β x} = e^{α x} ( cos(β x) + i sin(β x) ) ,
z2 = e^{λ2 x} = e^{(α − i β) x} = e^{α x} e^{−i β x} = e^{α x} ( cos(β x) − i sin(β x) ) ,




that have the same real parts and opposite imaginary parts:

Re(z1) = Re(z2) = e^{α x} cos(β x) ,     Im(z1) = −Im(z2) = e^{α x} sin(β x) .

Set, for example, y1 = e^{α x} cos(β x) and y2 = e^{α x} sin(β x) . Then, the real solution presented in
(6.28) is a linear combination of the real functions y1 and y2 , which are independent, since their
Wronskian is non–vanishing:

W (y1 , y2)(x) = det | y1    y2  |  =  β e^{2 α x} ≠ 0 .
                     | y1'   y2' |

When ∆ = 0 , equation (6.26) has one real root with multiplicity 2 , and the corresponding solution
to (6.25) is:

y1 = e^{ −(b/(2a)) x } .

In this situation, we need a second independent solution, that is obtained from formula (6.22) of
Theorem 6.5, using the just found y1 and with p = b/a built from equation (6.25), thus:

y2 = e^{ −(b/(2a)) x } ∫ ( e^{ −∫ (b/a) dx } / e^{ −(b/a) x } ) dx = x e^{ −(b/(2a)) x } .

In other words, when ∆ = 0 the general solution of (6.25) is:

y = e^{ −(b/(2a)) x } ( c1 + c2 x ) .     (6.29)

Observe that the Wronskian is:

W (y1 , y2)(x) = det | y1    y2  |  =  e^{ −(b/a) x } ≠ 0 .
                     | y1'   y2' |

Note that the knowledge of the Wronskian expression is useful in the study of non–homogeneous
differential equations too, as it will be shown in § 6.2.

Example 6.9. Consider the initial value problem:

y'' − 2 y' + 6 y = 0 ,
y(0) = 0 ,     y'(0) = 1 .

The characteristic equation is λ^2 − 2 λ + 6 = 0 , with roots λ = 1 ± i sqrt(5) . Hence, two independent
solutions are:

y1(x) = e^x cos( sqrt(5) x ) ,     y2(x) = e^x sin( sqrt(5) x ) ,

and the general solution can be expressed as y(x) = c1 y1(x) + c2 y2(x) . Now, forming the initial
conditions:

y(0) = c1 y1(0) + c2 y2(0) = c1 ,
y'(0) = c1 y1'(0) + c2 y2'(0) = c1 + c2 sqrt(5) ,

we see that the constants c1 and c2 must verify:

c1 = 0 ,
c1 + c2 sqrt(5) = 1 ,     =⇒     c1 = 0 ,   c2 = 1/sqrt(5) .

In conclusion, the considered initial value problem is solved by:

y(x) = ( e^x / sqrt(5) ) sin( sqrt(5) x ) .

6.1.5 Cauchy–Euler equations

A Cauchy–Euler differential equation is a particular second–order linear equation, with variable coef-
ficients, of the form:

a x^2 y'' + b x y' + c y = 0 ,     (6.30)

where a , b , c ∈ R , and with x > 0 . We seek solutions of (6.30) in power form, that is, y = x^m ,
m being a constant to be determined, which must satisfy the algebraic equation:

a m (m − 1) + b m + c = 0 ,     (6.31)

i.e., a m^2 + (b − a) m + c = 0 . Let m1 and m2 be the roots of equation (6.31) and denote its
discriminant by ∆ = (a − b)^2 − 4 a c . According to the sign of ∆ , the differential equation (6.30) is
solved by a power–form function y defined as follows, with c1 , c2 ∈ R in all cases:

(i) if ∆ > 0 , then m1 , m2 are real and distinct, hence y = c1 x^{m1} + c2 x^{m2} ;

(ii) if ∆ < 0 , then m1 , m2 are complex conjugate, say m1,2 = α ± i β , thus
y = x^α ( c1 cos(β ln x) + c2 sin(β ln x) ) ;

(iii) if ∆ = 0 , then m1 = m2 = m , and the solution is y = c1 x^m + c2 x^m ln x .


Remark 6.10. The solution y of the Cauchy–Euler equation, illustrated in each of the three cases
above, has components y1 , y2 that are linearly independent. This statement can be verified using the
Wronskian.
When ∆ > 0 , equation (6.31) has two distinct real roots m1 ≠ m2 , which implies, defining y1 = x^{m1}
and y2 = x^{m2} , that the Wronskian is non–null:

W (y1 , y2)(x) = det | y1    y2  |  =  (m2 − m1) x^{m1 + m2 − 1} ≠ 0 ,     for x > 0 .
                     | y1'   y2' |

When ∆ < 0 , there are two complex conjugate roots m1,2 = α ± i β of (6.31); then, setting for
example y1 = x^α cos(β ln x) and y2 = x^α sin(β ln x) , the Wronskian does not vanish:

W (y1 , y2)(x) = det | y1    y2  |  =  β x^{2α − 1} ≠ 0 ,     for x > 0 .
                     | y1'   y2' |

When ∆ = 0 , equation (6.31) has one real root m of multiplicity 2 ; in this case y1 = x^m and
y2 = y1 ln x ; again, the Wronskian does not vanish:

W (y1 , y2)(x) = det | y1    y2  |  =  x^{2m − 1} ≠ 0 ,     for x > 0 .
                     | y1'   y2' |

Notice, again, that knowing the Wronskian turns out useful also when studying the non–homogeneous
case (refer to § 6.2).

Example 6.11. Consider the initial value problem:

x^2 y'' − 2 x y' + 2 y = 0 ,
y(1) = 1 ,     y'(1) = 0 .

Here equation (6.31) assumes the form:

m (m − 1) − 2 m + 2 = 0 ,

whose roots are m = 1 , m = 2 . Therefore, the general solution of the given equation is y = c1 x + c2 x^2 ;
imposing the initial conditions yields the system:

c1 + c2 = 1 ,
c1 + 2 c2 = 0 .

In conclusion, the solution of the initial value problem is y = 2 x − x^2 .
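The same kind of symbolic check applies to Example 6.11; a sympy sketch (an addition to the text):

```python
import sympy as sp

x = sp.symbols('x')
y = 2*x - x**2  # solution of the initial value problem in Example 6.11

# y must solve x^2 y'' - 2 x y' + 2 y = 0 with y(1) = 1, y'(1) = 0
assert sp.expand(x**2*sp.diff(y, x, 2) - 2*x*sp.diff(y, x) + 2*y) == 0
assert y.subs(x, 1) == 1
assert sp.diff(y, x).subs(x, 1) == 0
```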

Example 6.12. Consider the initial value problem:

x^2 y'' − x y' + 5 y = 0 ,
y(1) = 1 ,     y'(1) = 0 .

Here, equation (6.31) assumes the form:

m (m − 1) − m + 5 = 0 ,

whose roots are m = 1 ± 2 i . Hence, the general solution of the given equation is
y = x ( c1 cos(2 ln x) + c2 sin(2 ln x) ) ; again, imposing the initial conditions, we obtain the system:

c1 = 1 ,
c1 + 2 c2 = 0 ,

leading to the solution:

y = x ( cos(2 ln x) − (1/2) sin(2 ln x) ) .

6.1.6 Invariant and normal form

To conclude § 6.1, we introduce the fundamental notion of the invariant of a homogeneous linear
differential equation of second order (6.4a). The basic fact is that any such equation can be transformed
into a differential equation without the term containing the first derivative, namely:

u'' + I(x) u = 0 ,     (6.32)

with:

I(x) = q(x) − (1/2) p'(x) − (1/4) p^2(x) .     (6.33)

In fact, looking for a solution to (6.4a) of the form y = f u leads to, in the same way followed to
obtain equation (6.6):

L(f u) = f u'' + ( 2 f' + p f ) u' + ( f'' + p f' + q f ) u = 0 ,     (6.34)

in which f (x) can be chosen so that the coefficient of u' vanishes, namely:

2 f' + p f = 0     =⇒     f (x) = e^{ −(1/2) ∫ p(x) dx } .

The function f (x) does not vanish, and has first and second derivatives given by:

f' = −(f p)/2 ,     f'' = f (p^2 − 2 p') / 4 .

Hence, f can be simplified out in (6.34), yielding the reduced form:

u'' + ( q − (1/2) p' − (1/4) p^2 ) u = 0 ,     (6.35)

which is stated in (6.32)–(6.33). Equation (6.32) is called the normal form of equation (6.4a). The
function I(x) , introduced in (6.33), is called the invariant of the homogeneous differential equation
(6.4a) and represents a mathematical invariant, in the sense expressed by the following Theorem 6.13.

Theorem 6.13. If the equation:

L1 y = y'' + p1 y' + q1 y = 0     (6.36)

can be transformed into the equation:

L2 y = y'' + p2 y' + q2 y = 0 ,     (6.37)

by the change of dependent variable y = f u , then the invariants of (6.36)–(6.37) coincide:

I1 = q1 − (1/2) p1' − (1/4) p1^2 = q2 − (1/2) p2' − (1/4) p2^2 = I2 .

Vice versa, when equations (6.36) and (6.37) admit the same invariant, either equation can be trans-
formed into the other one, by:

y(x) = u(x) e^{ −(1/2) ∫ ( p1(x) − p2(x) ) dx } .

Remark 6.14. We can transform any second–order linear differential equation into its normal form.
Moreover, if we are able to solve the equation in normal form, then we can easily obtain the general
solution to the original equation. The next Example 6.15 clarifies this idea.

Example 6.15. Consider the homogeneous differential equation, depending on the real positive pa-
rameter a :

y'' − (2/x) y' + ( a^2 + 2/x^2 ) y = 0 .     (6.38)

Hence:

p(x) = −2/x ,     q(x) = a^2 + 2/x^2     =⇒     I(x) = q(x) − (1/2) p'(x) − (1/4) p^2(x) = a^2 .

In this example, the invariant does not depend on x , and the normal form is:

u'' + a^2 u = 0 .     (6.39)

The general solution to (6.39) is u(x) = c1 cos(a x) + c2 sin(a x) , where c1 , c2 are real parameters,
and the solution to the original equation (6.38) is:

y(x) = u(x) e^{ −(1/2) ∫ p(x) dx } = c1 x cos(a x) + c2 x sin(a x) .
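As a check on Example 6.15 (a sympy sketch, not part of the text), the claimed general solution can be substituted into (6.38):

```python
import sympy as sp

x, a = sp.symbols('x a', positive=True)
c1, c2 = sp.symbols('c1 c2')
y = c1*x*sp.cos(a*x) + c2*x*sp.sin(a*x)  # solution of (6.38) obtained above

# Residual of y'' - (2/x) y' + (a^2 + 2/x^2) y = 0
residual = sp.diff(y, x, 2) - 2*sp.diff(y, x)/x + (a**2 + 2/x**2)*y
print(sp.simplify(residual))  # → 0
```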

Example 6.16. Find the general solution of the homogeneous linear differential equation of second
order:

y'' − 2 tan(x) y' + y = 0 .

The first step consists in transforming the given equation into normal form, with the change of variable:

y = u e^{ −(1/2) ∫ p(x) dx } = u e^{ ∫ tan x dx } = u / cos x .

The normal form is:

u'' + 2 u = 0 ,

which is a constant–coefficient equation, solved by:

u = c1 cos( sqrt(2) x ) + c2 sin( sqrt(2) x ) .

The solution to the given differential equation is:

y = c1 cos( sqrt(2) x ) / cos x + c2 sin( sqrt(2) x ) / cos x .

The normal form clarifies the structure of the solutions to a constant–coefficient linear equation of
second order.

Remark 6.17. Consider the constant–coefficient equation (6.25). The change of variable:

y = u e^{ −(b/(2a)) x }

allows to transform (6.25) into the normal form:

u'' − ( (b^2 − 4 a c) / (4 a^2) ) u = 0 ,     (6.40)

since, in this case:

p = b/a ,     q = c/a     =⇒     I = q − (1/2) p' − (1/4) p^2 = ( −b^2 + 4 a c ) / (4 a^2) .

The normal form (6.40) explains the nature of the following formulæ (6.41), (6.42) and (6.43), namely
describing the structure of the solution to a constant–coefficient homogeneous linear differential equa-
tion of second order. In the following, the discriminant is ∆ = b^2 − 4 a c and c1 , c2 are constants:

(i) if ∆ > 0 ,

y(x) = e^{ −(b/(2a)) x } ( c1 cosh( ( sqrt(∆)/(2a) ) x ) + c2 sinh( ( sqrt(∆)/(2a) ) x ) ) ;     (6.41)

(ii) if ∆ < 0 ,

y(x) = e^{ −(b/(2a)) x } ( c1 cos( ( sqrt(−∆)/(2a) ) x ) + c2 sin( ( sqrt(−∆)/(2a) ) x ) ) ;     (6.42)

(iii) if ∆ = 0 ,

y(x) = e^{ −(b/(2a)) x } ( c1 x + c2 ) .     (6.43)
6.2 Non–homogeneous equation


We finally deal with the non–homogeneous equation (6.4), which we recall and re–label for convenience:

L y = y 00 + p(x) y 0 + q(x) y = r(x) , (6.44)

changing, slightly, the point of view, in comparison to the beginning of § 6.1. Here, we assume to
know, already, the general solution of the homogeneous equation associated to (6.44). Aim of this
section is, indeed, to describe the relation between solutions of L y = 0 and solutions of L y = r(x) ,
being r(x) a given continuous function. The first and probably most important step in this direction
is represented by the following Theorem 6.18.
Theorem 6.18. Let y1 and y2 be independent solutions of L y = 0 , and let yp be a solution of
L y = r(x) . Then, any solution of the latest non–homogeneous equation has the form:

y(x) = c1 y1 (x) + c2 y2 (x) + yp (x) , (6.45)

where c1 , c2 are constant. for suitable constants c1 , c2 ∈ R .


Proof. Using the linearity of the operator L , we see that:

L (y − yp ) = L y − L yp = r − r = 0 .

This means that y − yp can be express by a linear combination of y1 and y2 . Hence, thesis (6.45) is
proved.

Formula (6.45) is called general solution of the non–homogeneous equation (6.44).



6.2.1 Variation of parameters

Theorem 6.18 indicates that, to describe the general solution of the linear non–homogeneous equation
(6.44), we need to know a particular solution to (6.44) and two independent solutions of the associated
homogeneous equation. As a matter of fact, the knowledge of two independent solutions to L y = 0
allows to individuate a particular solution of (6.44), using the method of Variation of parameters,
introduced by Lagrange in 1774.

Theorem 6.19. Let y1 and y2 be two independent solutions of the homogeneous equation associated
to (6.44). Then, a particular solution to (6.44) has the form:

yp(x) = k1(x) y1(x) + k2(x) y2(x) ,     (6.46)

where:

k1(x) = −∫ ( y2(x) r(x) / W (y1 , y2)(x) ) dx ,     k2(x) = ∫ ( y1(x) r(x) / W (y1 , y2)(x) ) dx .     (6.47)

Proof. Assume that y1 and y2 are independent solutions of the homogeneous equation associated to
(6.44), and look for a particular solution of (6.44) in the desired form:

yp = k1 y1 + k2 y2 ,

where k1 , k2 are two C^1 functions to be determined. Computing the first derivative of yp yields:

yp' = k1' y1 + k2' y2 + k1 y1' + k2 y2' .     (6.48)

Let us impose a first condition on y1 and y2 , i.e., impose that they verify:

k10 y1 + k20 y2 = 0 , (6.49)

so that (6.48) reduces to:


yp0 = k1 y10 + k2 y20 . (6.48a)

Now, compute yp00 , by applying differentiation to (6.48a):

yp00 = k1 y100 + k2 y200 + k10 y10 + k20 y20 .

At this point, imposing that yp solves equation (6.44) leads to forming the following expression (in
which variable x is discarded, to ease the notation):

yp00 + p yp0 + q yp = k1 y100 + k2 y200 + k10 y10 + k20 y20 + p k1 y10 + k2 y20 + q k1 y10 + k2 y20
  

= k1 y100 + p y10 + q y1 + k2 y200 + p y20 + q y2 + k10 y10 + k20 y20 .


  

In this way, a second condition on y1 and y2 is obtained:

k10 y10 + k20 y20 = r . (6.50)

Equations (6.49) and (6.50) form a 2 × 2 linear system, in the variables k1' , k2' , that admits a unique
solution:

k1' = − y2 r / W(y1 , y2) ,   k2' = y1 r / W(y1 , y2) , (6.51)
since the determinant of its coefficient matrix is the Wronskian W(y1 , y2) , which does not vanish, given the assumption
that y1 and y2 are independent. Thesis (6.47) follows by integration of k1' , k2' in (6.51).
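The proof above can be turned into a small numerical experiment. The sketch below is purely illustrative: the test equation y'' + y = x, with y1 = cos x, y2 = sin x and W ≡ 1, is an assumption of this sketch, not an example from the text. It builds k1 and k2 by cumulative trapezoidal integration of (6.47) and checks that yp = k1 y1 + k2 y2 solves the equation up to discretization error.

```python
import math

# Illustrative check of Theorem 6.19 on y'' + y = x (assumed test case):
# y1 = cos x, y2 = sin x, Wronskian W(y1, y2) = cos^2 x + sin^2 x = 1.
a, b, N = 0.0, 2.0, 2000
h = (b - a) / N
xs = [a + i * h for i in range(N + 1)]

def r(x):            # right-hand side of the equation
    return x

# k1' = -y2 r / W and k2' = y1 r / W, integrated by cumulative trapezoids
k1, k2 = [0.0], [0.0]
for i in range(N):
    f0 = -math.sin(xs[i]) * r(xs[i])
    f1 = -math.sin(xs[i + 1]) * r(xs[i + 1])
    g0 = math.cos(xs[i]) * r(xs[i])
    g1 = math.cos(xs[i + 1]) * r(xs[i + 1])
    k1.append(k1[-1] + h * (f0 + f1) / 2)
    k2.append(k2[-1] + h * (g0 + g1) / 2)

# particular solution y_p = k1 y1 + k2 y2, as in (6.46)
yp = [k1[i] * math.cos(xs[i]) + k2[i] * math.sin(xs[i]) for i in range(N + 1)]

# residual y_p'' + y_p - r at interior nodes, via central differences
res = max(abs((yp[i - 1] - 2 * yp[i] + yp[i + 1]) / h**2 + yp[i] - r(xs[i]))
          for i in range(1, N))
print(res)  # close to zero: y_p solves the equation
```

With these particular antiderivatives (vanishing at 0), the closed form is yp(x) = x − sin x, which indeed satisfies y'' + y = x.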
6.2. NON–HOMOGENEOUS EQUATION 111

Example 6.20. In Example 6.4, we showed that the general solution of the homogeneous equation
(6.18) has the form (6.19), i.e., c1 x + c2 x2 ex , c1 , c2 ∈ R . Here, we use Theorem 6.19 to find the
general solution of the non–homogeneous equation:
x^2 (1 + x) y'' − x (2 + 4 x + x^2) y' + (2 + 4 x + x^2) y = x^4 (1 + x)^2 . (6.52)

As a first step, let us rewrite (6.52) in explicit form, namely:

y'' − [ (2 + 4 x + x^2) / (x (1 + x)) ] y' + [ (2 + 4 x + x^2) / (x^2 (1 + x)) ] y = x^2 (1 + x) .
Then, using equation (6.47), we obtain:

k1(x) = − ∫ x^2 dx = − x^3/3 ,   k2(x) = ∫ x e^{−x} dx = −(1 + x) e^{−x} .

Hence, the general solution of (6.52) is:

y(x) = c1 x + c2 x^2 e^x − x^4/3 − x^2 (1 + x) .
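The particular solution just found can be verified by direct substitution; in the illustrative snippet below, the derivatives of yp = −x^4/3 − x^2 (1 + x) are written out by hand and the residual against (6.52), whose right-hand side is x^4 (1 + x)^2, is evaluated at sample points.

```python
def residual(x):
    # yp = -x^4/3 - x^2 (1 + x) and its first two derivatives
    yp  = -x**4 / 3 - x**2 * (1 + x)
    yp1 = -4 * x**3 / 3 - 2 * x - 3 * x**2
    yp2 = -4 * x**2 - 6 * x - 2
    lhs = (x**2 * (1 + x) * yp2
           - x * (2 + 4 * x + x**2) * yp1
           + (2 + 4 * x + x**2) * yp)
    return lhs - x**4 * (1 + x)**2   # should vanish identically

print(max(abs(residual(x / 10)) for x in range(1, 50)))  # ~ 0, up to rounding
```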
Example 6.21. Consider the initial value problem:

x^2 y'' − x y' + 5 y = x^2 ,   y(1) = y'(1) = 0 .

In Example 6.12, the associated homogeneous equation was considered, of which two independent
solutions were found, namely y1 = x cos(2 ln x) and y2 = x sin(2 ln x) , whose Wronskian is 2 x .
Now, writing the given non–homogeneous differential equation in explicit form:

y'' − (1/x) y' + (5/x^2) y = 1 ,
and using (6.47), we find:

k1(x) = − (1/2) ∫ sin(2 ln x) dx ,   k2(x) = (1/2) ∫ cos(2 ln x) dx .

Evaluating the integrals:

k1(x) = (1/2) ( (2/5) x cos(2 ln x) − (1/5) x sin(2 ln x) ) ,

k2(x) = (1/2) ( (2/5) x sin(2 ln x) + (1/5) x cos(2 ln x) ) .

Hence, a particular solution of the non–homogeneous equation is

yp = k1 y1 + k2 y2 = x^2/5 ,

while the general solution is:

y = c1 x cos(2 ln x) + c2 x sin(2 ln x) + x^2/5 .
To solve the initial value problem, c1 and c2 need to be determined, imposing the initial conditions:

y(1) = c1 + 1/5 = 0 ,   y'(1) = c1 + 2 c2 + 2/5 = 0 ,

yielding c1 = −1/5 , c2 = −1/10 . In conclusion, the solution of the given initial value problem is:

y(x) = (1/10) x ( 2 x − sin(2 ln x) − 2 cos(2 ln x) ) .
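As a final sanity check (again illustrative, not part of the text), the solution can be substituted back into the initial value problem, approximating derivatives by central differences.

```python
import math

def y(x):
    # solution of Example 6.21 obtained above
    return x * (2 * x - math.sin(2 * math.log(x))
                - 2 * math.cos(2 * math.log(x))) / 10

h = 1e-4
def residual(x):
    # x^2 y'' - x y' + 5 y - x^2, with derivatives by central differences
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return x**2 * d2 - x * d1 + 5 * y(x) - x**2

print(abs(y(1.0)), max(abs(residual(t)) for t in (0.5, 1.0, 2.0, 3.0)))
```

The same central difference applied at x = 1 also confirms y'(1) ≈ 0.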

6.2.2 Non–homogeneous equations with constant coefficients


While studying a second–order differential equation, the easiest situation is probably the one in which the
equation coefficients are constant. In this case, the method of Variation of parameters
can be applied systematically. Here, though, we only provide the results, that is, we indicate how
to search for a particular solution of a given constant–coefficient equation; the interested Reader
can then apply the Variation of parameters to validate our statements.
Assume that a constant–coefficient non–homogeneous differential equation of second order is given:

M y = a y'' + b y' + c y = r(x) , (6.53)

where r(x) is a continuous real function. Denote with K(λ) the characteristic polynomial associated
to the differential equation (6.53). A list is provided below, taken from [21], of particular functions
r(x) , together with a recipe to find a relevant particular solution of (6.53).

(1) Let r(x) = e^{α x} Pn(x) , where Pn(x) is a given polynomial of degree n .

(a) If K(α) 6= 0 , it means that α is not a root of the characteristic equation. Then, a particular
solution of (6.53) has the form:
yp = eα x Qn (x) , (6.54)
where Qn (x) is a polynomial of degree n to be determined.
(b) If K(α) = 0 , with multiplicity s ≥ 1 , it means that α is a root of the characteristic equation.
Then, a particular solution of (6.53) has the form:

yp = xs eα x Rn (x) , (6.55)

where Rn (x) is a polynomial of degree n to be determined.


 
(2) Let r(x) = e^{α x} ( Pn(x) cos(β x) + Qm(x) sin(β x) ) , where Pn and Qm are given polynomials of
degree n and m , respectively.

(a) If K(α + i β) ≠ 0 , it means that α + i β is not a root of the characteristic equation. In this
case, a particular solution of (6.53) has the form:

yp = e^{α x} ( Rp(x) cos(β x) + Sp(x) sin(β x) ) , (6.56)

with Rp(x) and Sp(x) polynomials of degree p = max{n , m} to be determined.

(b) If K(α + i β) = 0 , it means that α + i β is a root of the characteristic equation, with multiplicity
s ≥ 1 . Then, a particular solution of (6.53) has the form:

yp = x^s e^{α x} ( Rp(x) cos(β x) + Sp(x) sin(β x) ) , (6.57)

where Rp(x) and Sp(x) are polynomials of degree p = max{n , m} to be determined.

(3) Let r(x) = r1(x) + . . . + rn(x) and let yk be a solution of M yk = rk . Then, y = y1 + · · · + yn
solves M y = r . This fact is known as the superposition principle.

Example 6.22. Consider the differential equation:

y''(x) + 2 y'(x) + y(x) = x^2 + x .

Here α = 0 is not a root of the characteristic equation:

K(λ) = λ2 + 2 λ + 1 = 0 ,

thus, we are in situation (1a) and we look for a solution of the form (6.54), that is

yp(x) = s0 x^2 + s1 x + s2 .

Differentiating:

yp'(x) = 2 s0 x + s1 ,   yp''(x) = 2 s0 ,

and imposing that yp solves the differential equation, we obtain:

s0 x^2 + (4 s0 + s1) x + 2 s0 + 2 s1 + s2 = x^2 + x .

Therefore, it must be:

s0 = 1 ,   4 s0 + s1 = 1 ,   2 s0 + 2 s1 + s2 = 0 ,   =⇒   s0 = 1 , s1 = −3 , s2 = 4 .

Hence, a particular solution of the given equation is

yp = x^2 − 3 x + 4 .

Finally, solving the associated homogeneous equation, we obtain the required general solution:

y(x) = x^2 − 3 x + 4 + c1 e^{−x} + c2 x e^{−x} .
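The undetermined–coefficients computation can be double–checked mechanically (an illustrative snippet, not from the text): substituting yp = x^2 − 3x + 4 into the left–hand side must reproduce x^2 + x identically.

```python
def residual(x):
    # candidate particular solution and its hand-computed derivatives
    yp  = x**2 - 3 * x + 4
    yp1 = 2 * x - 3
    yp2 = 2.0
    return yp2 + 2 * yp1 + yp - (x**2 + x)   # should vanish identically

print(max(abs(residual(x / 10)) for x in range(-30, 31)))  # ~ 0, up to rounding
```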

Example 6.23. Consider the differential equation:

y''(x) + y(x) = sin x + cos x .

Observe, first, that the general solution of the associated homogeneous equation is:

y0 (x) = c1 cos x + c2 sin x , c1 , c2 ∈ R .

Observe, further, that the characteristic equation has roots ± i . Hence, we are in situation (2b) and
we look for a solution of the form (6.57), that is:

yp (x) = s1 x cos x + s2 x sin x , s1 , s2 ∈ R .

Imposing that yp (x) solves the given non–homogeneous equation, we find:

2 s2 cos x − 2 s1 sin x = cos x + sin x .

Solving the system:

s1 = − 1/2 ,   s2 = 1/2 ,
leads to the general solution of the given equation:

y(x) = (1/2) x sin x − (1/2) x cos x + c1 cos x + c2 sin x , c1 , c2 ∈ R .
2 2

6.2.3 Exercises
1. Solve the following second order variable coefficient linear differential equations:
 
(a) y'' − (1 + 2/x) y' + (2/x) y = 0 ;

(b) y'' − (2 x/(1 + x^2)) y' + (2/(1 + x^2)) y = 0 ;

(c) y'' − (1/x) y' − 4 x^2 y = 0 .

Hint: y = e^{x^m} .

2. Solve the following initial value problems, using the transformation of the given differential equation
in normal form:

(a) y'' + (2 sin x) y' + ( (sin x)^2 + cos x − 6/x^2 ) y = 0 , y(1) = 0 , y'(1) = 1 ;

(b) y'' + (2/x) y' + y = 0 , y(1) = 1 , y'(1) = 0 .

3. Find the general solution of the differential equation


y'' + (cot x) y' − (1/(sin x)^2) y = 0 ,

using the fact that y1 = 1/sin x is a solution. Then, find, if it exists, the particular solution that
vanishes for x → 0+ . Say whether there exists a solution such that lim_{x→0+} y(x)/x = 1/2 .
4. Solve the non–homogeneous equation:
y'' − (1/x) y' − 4 x^2 y = −4 x^4 ,

using the fact that y1(x) = e^{x^2} is a solution of the associated homogeneous equation.
7 Prologue to Measure theory

Some basic notions are presented in this chapter, which are needed as an introduction to Measure theory.
Some familiarity with the concepts presented here, and in the following Chapters 8 to 10, is
assumed; they are also recalled briefly, for the sake of completeness.

7.1 Set theory


7.1.1 Sets
In operations with sets, we shall always deal with collections of subsets of some universal set Ω . The
concept of set is considered as given: we do not provide a definition for it, being only concerned
with set membership and operations. Sets are usually denoted with capital letters. Set membership is
denoted with the symbol “ ∈ ”, therefore x ∈ A means that x belongs to A . Set inclusion is denoted
as A ⊂ B , which means that every member of A is a member of B . The case A = B is a particular
case of set inclusion. The inclusion is said to be strict if there exists at least one element b ∈ B such
that b ∉ A . In such a case, we may write A ⊊ B ; obviously, if A ⊊ B , then A ⊂ B is also true.
The set of subsets of A is denoted by P(A) , which is sometimes called the Power set of A , since, if
A has n elements, then P(A) has 2^n elements.
Intersection and union are defined as:
A ∩ B = {x : x ∈ A and x ∈ B} , A ∪ B = {x : x ∈ A or x ∈ B} .
The complement Ac of a set A consists of the elements of Ω which are not members of A , denoted
with:
Ac = Ω \ A .
The difference is:
/ A} = B ∩ Ac ,
B \ A = {x ∈ B : x ∈
while the symmetric difference is defined by:
A∆B = (A \ B) ∪ (B \ A) .
Symbol ∅ denotes the empty set, i.e. the set with no elements:
∅ = { x ∈ Ω : x 6= x } .
Note that A∆B = ∅ ⇐⇒ A = B .
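These operations map directly onto Python's built-in set type; the snippet below (an illustration, with arbitrarily chosen A , B and universe Ω) checks the identities B \ A = B ∩ A^c and A∆B = (A \ B) ∪ (B \ A) .

```python
Omega = set(range(10))            # universal set chosen for this illustration
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

Ac = Omega - A                    # complement of A in Omega
sym_diff = (A - B) | (B - A)      # symmetric difference A delta B

assert B - A == B & Ac            # difference expressed via intersection
assert sym_diff == A ^ B          # Python's built-in ^ operator on sets
assert (A ^ A) == set()           # A delta A = emptyset, i.e. A = A
print(sorted(sym_diff))           # [1, 2, 5, 6]
```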

7.1.2 Indexes and Cartesian product


Here, a rigorous extension is provided of the definitions of intersection and union to arbitrary collections
of sets. Let us consider a finite family of sets A1 , . . . , An ; their list can be understood as the image
set of a function x : {1 , 2 , . . . , n} → X , where X actually indicates the set of n objects we want to
index.
Assume, in general, that two non–empty sets I and X are given, together with a function x : I → X ,
x : i 7→ x(i) . We refer to I as the index set, while the triad (X , I, x) is called the indexing of X . The
following conventions are adopted:


 the notation xi indicates x(i) ;

 the notation (xi )i∈I represents the mapping x : I → X , and it is said to be a collection of elements
in X indexed by I .
In other words, x establishes an indexed family of elements in X indexed by I , and the elements of
X are referred to as forming the family, i.e., the indexed family (xi )i∈I is interpreted as a collection,
rather than a function.
When I = N , we are obviously dealing with a usual sequence. When we, instead, consider a finite
list of objects, then I = {1 , 2 , . . . , n} . It is also possible to consider as index set, for example, the
power set of X , that is, I = P(X ) ; in the latter case, the indexed family becomes the collection of
the subsets of X .
Union and intersection of arbitrary collections of sets are defined as:
⋂_{α∈I} Aα = { x : x ∈ Aα for all α ∈ I } = { x : ∀ α ∈ I , x ∈ Aα } ,

⋃_{α∈I} Aα = { x : x ∈ Aα for some α ∈ I } = { x : ∃ α ∈ I , x ∈ Aα } ,

and, recalling De Morgan1 Laws, it holds that:

( ⋃_{α∈I} Aα )^c = ⋂_{α∈I} Aα^c ,   ( ⋂_{α∈I} Aα )^c = ⋃_{α∈I} Aα^c . (7.1)

If A ∩ B = ∅ , then A and B are disjoint. A family of sets (Aα)α∈I is pairwise disjoint if Aα ∩ Aβ = ∅
whenever α ≠ β , for α , β ∈ I .
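De Morgan's laws (7.1) can be spot-checked on a finite indexed family; the family (A_α)_{α∈I} and the universe Ω in the snippet below are arbitrary choices made for the illustration.

```python
Omega = set(range(8))
family = {"a": {0, 1, 2}, "b": {1, 2, 3}, "c": {2, 3, 4}}   # (A_alpha), alpha in I

union = set().union(*family.values())
inter = set.intersection(*family.values())

def comp(S):
    return Omega - S      # complement in Omega

# (union of A_alpha)^c = intersection of A_alpha^c, and dually
assert comp(union) == set.intersection(*[comp(S) for S in family.values()])
assert comp(inter) == set().union(*[comp(S) for S in family.values()])
print(sorted(comp(union)), sorted(comp(inter)))  # [5, 6, 7] [0, 1, 3, 4, 5, 6, 7]
```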

7.1.3 Cartesian product


The Cartesian product A × B of a couple of sets A and B is defined as the set of ordered pairs:

A × B = { (a, b) : a ∈ A , b ∈ B} ,

while, more in general, the Cartesian product:

∏_{i∈I} Ai

of an arbitrary, indexed family of sets (Ai)i∈I , is the collection of all functions:

a : I → ⋃_{i∈I} Ai , a : i ↦ ai , such that ai ∈ Ai for any i ∈ I .

When Ai = A for any i ∈ I , the Cartesian product is denoted as AI , which reduces to An in the
case I = {1 , . . . , n} .
The Cartesian plane is R2 = R × R , while Rn is the set of all n–tuples (x1 , . . . , xn ) composed of
real numbers. A rectangle is the Cartesian product of two intervals.

7.1.4 Functions
A function f : A → B can be interpreted as a subset of A × B in which each first coordinate determines
the second one:
(a, b) , (a, c) ∈ f =⇒ b = c.
Domain and Range of a function are, thus, respectively defined as:

Df = {a ∈ A : ∃ b ∈ B , (a, b) ∈ f } , Rf = {b ∈ B : ∃ a ∈ A , (a, b) ∈ f } .
1
Augustus De Morgan (1806–1871), British mathematician and logician.

Informally, f associates elements of B with elements of A such that each a ∈ A has at most one
image b ∈ B , that is, b = f (a) . The Image of X ⊂ A is:

f (X) = { b ∈ B : b = f (a) for some a ∈ X} ,

and the Inverse image of Y ⊂ B is:

f −1 (Y ) = { a ∈ A : f (a) ∈ Y } .

Given two functions g and f , such that Df ⊂ Dg and g = f on Df , we say that g extends
f to Dg and, vice versa, that f restricts g to Df .
The algebra of real functions is defined pointwise. The sum f + g is defined as (f + g)(x) = f (x) + g(x) .
The product f g is given by (f g)(x) = f (x) g(x) . The indicator function 1A of the set A is the function:

1A(x) = 1 for x ∈ A ,   1A(x) = 0 for x ∉ A ,

which verifies 1A∩B = 1A 1B , 1A∪B = 1A + 1B − 1A 1B , 1Ac = 1 − 1A .
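The three indicator identities can be tested pointwise on a small universe; the sets A and B below are arbitrary illustrative choices.

```python
Omega = range(12)
A = {2, 3, 5, 7}
B = {1, 2, 3, 4}

def ind(S):
    """Indicator function 1_S of the set S."""
    return lambda x: 1 if x in S else 0

oneA, oneB = ind(A), ind(B)
oneAB, oneAuB, oneAc = ind(A & B), ind(A | B), ind(set(Omega) - A)

for x in Omega:
    assert oneAB(x) == oneA(x) * oneB(x)                      # 1_{A ∩ B}
    assert oneAuB(x) == oneA(x) + oneB(x) - oneA(x) * oneB(x) # 1_{A ∪ B}
    assert oneAc(x) == 1 - oneA(x)                            # 1_{A^c}
print("indicator identities verified pointwise")
```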

7.1.5 Equivalences
A relation between two sets, A and B , is a subset R of A × B ; we write x ∼ y to indicate that
(x, y) ∈ R . An equivalence relation on a set E is a relation R ⊂ E × E , with the following properties,
valid for any x , y , z ∈ E :

(i) reflexivity: x ∼ x;

(ii) symmetry: x ∼ y =⇒ y ∼ x ;

(iii) transitivity: x ∼ y and y ∼ z =⇒ x ∼ z .

For any x ∈ E , the subset [x] := { e ∈ E : e ∼ x } is called equivalence class of the element x ∈ E .
Clearly, when x ∼ y , then [x] = [y] .
An equivalence relation on E subdivides such a set into disjoint equivalence classes. A partition of a
set E is a family F of subsets of E such that:
(i) ⋃_{A∈F} A = E ,

(ii) for any A , B ∈ F , A 6= B , then A ∩ B = ∅ .

Consider x , y ∈ E such that x 6= y , then, it is either [x] = [y] or [x] 6= [y] . In the first case, we
have already observed that it is x ∼ y . In the second case, x and y are not equivalent, i.e., x 6∼ y ,
therefore [x] ∩ [y] = ∅ . Hence, the equivalence classes of E partition the set E itself. The collection
of the equivalence classes of E is called quotient set of E and is denoted by E/ ∼ . For instance, if
E = Z , the relation x ∼ y ⇐⇒ x − y = 2 k , k ∈ Z , is an equivalence in Z , which partitions Z in
the classes of even and odd numbers. In general, for any given m ∈ Z , an equivalence relation x ∼ y
can be defined, denoted by the symbol ≡m :

x ≡m y ⇐⇒ x−y = m k, for a certain k ∈ Z .

The quotient set, obtained in this way, is called the set of residue classes modulo m , and is denoted
by Zm .
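For E = Z and a fixed m, the construction can be made concrete on a finite window of integers: grouping by the remainder modulo m produces the residue classes, which indeed partition the window. This is an illustrative sketch; note that Python's % operator returns non-negative remainders also for negative integers.

```python
from collections import defaultdict

m = 3
window = range(-9, 10)            # finite window of Z for the illustration

classes = defaultdict(set)
for x in window:
    classes[x % m].add(x)         # x ~ y  iff  x - y = m k for some integer k

# the classes are pairwise disjoint and cover the window: a partition
reps = list(classes)
assert all(classes[a].isdisjoint(classes[b])
           for a in reps for b in reps if a != b)
assert set().union(*classes.values()) == set(window)
print(len(classes))  # 3 classes: Z_3 has exactly m = 3 elements
```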

7.1.6 Real intervals


Let us introduce some notation. N denotes the set of natural numbers, with the convention 0 ∉ N ,
while Z is the set of integers, Q is the set of rational numbers, R denotes the set of real numbers and
C is the set of complex numbers.
Intervals in R are denoted via endpoints, a square bracket indicating their inclusion, while an open
bracket means exclusion, so that, for example:

(a, b] = {x ∈ R : a < x ≤ b} .

Symbols ∞ and −∞ are used to describe unbounded intervals, such as (−∞ , b] . Later on, we will
define operations, in the extended real number system, involving these symbols.

7.1.7 Cardinality
Two non–empty sets X and Y share common cardinality if there exists a bijection f : X → Y .
An empty set is finite; a non–empty set is finite if it shares cardinality with the set In = {1 , . . . , n} ,
for some n ∈ N . A set is infinite if it shares cardinality with one of its proper subsets. A set A is
countable, or denumerable, if there exists a one–one correspondence between A and a subset of N .

 Q is denumerable;  R \ Q is not denumerable;

 R is not denumerable;  [a , b] is not denumerable.

7.1.8 The Real Number System


Definition 7.1. Let X be a non–empty subset of R .

(a) An element u ∈ R is called upper bound of X if x ≤ u for any x ∈ X ; in this case, X is said to
be bounded from above.

(b) An element ` ∈ R is called lower bound of X if x ≥ `, for any x ∈ X ; in this case X is said to
be bounded from below.

(c) An element u⋆ is called supremum, or least upper bound, of X if u⋆ is an upper bound of X and
u⋆ ≤ u for any upper bound u of X .

(d) An element ℓ⋆ is called infimum, or greatest lower bound, of X if ℓ⋆ is a lower bound of X and
ℓ⋆ ≥ ℓ for any lower bound ℓ of X .

Supremum and Infimum of a set X are denoted by:

u⋆ = sup X ,   ℓ⋆ = inf X .

7.1.9 The extended Real Number System


The set of the extended real numbers is denoted as:

R := R ∪ {−∞ , +∞} = [−∞ , +∞] , (7.2)

where we assume that, for any x ∈ R , it holds −∞ < x < +∞ . Rules for computations with
−∞ and + ∞ are introduced in the following list.

1. x + (+∞) = +∞ = +∞ + x ,

2. x + (−∞) = −∞ = −∞ + x ,

3. x > 0 =⇒ x · (+∞) = +∞ = +∞ · x ,

4. x > 0 =⇒ x · (−∞) = −∞ = −∞ · x ,
5. x < 0 =⇒ x · (+∞) = −∞ = −∞ · x ,
6. x < 0 =⇒ x · (−∞) = +∞ = −∞ · x ,
7. (+∞) + (+∞) = +∞ ,
8. (−∞) + (−∞) = −∞ ,
9. (+∞) · (+∞) = +∞ = (−∞) · (−∞) ,
10. (+∞) · (−∞) = −∞ = (−∞) · (+∞) ,
11. (+∞) · 0 = 0 = 0 · (+∞) ,
12. (−∞) · 0 = 0 = 0 · (−∞) .
Note that the following operations remain undefined:
(+∞) + (−∞) , (−∞) + (+∞) .
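Most of the rules above hold verbatim for IEEE floating-point infinities; the notable exception is rules 11-12, since IEEE arithmetic defines (±∞) · 0 as NaN, whereas the convention adopted here decrees it to be 0. The sketch below (with a hypothetical helper xmul, introduced only for the illustration) contrasts the two.

```python
import math

inf = math.inf
# rules 1-10 agree with IEEE arithmetic
assert 5.0 + inf == inf and 5.0 + (-inf) == -inf
assert 2.0 * inf == inf and -2.0 * inf == -inf
assert inf + inf == inf and (-inf) + (-inf) == -inf
assert inf * inf == inf and inf * (-inf) == -inf

# rules 11-12 differ: IEEE gives NaN, the convention above gives 0
assert math.isnan(inf * 0.0)

def xmul(a, b):
    """Multiplication in the extended reals, with the convention inf * 0 = 0."""
    if 0.0 in (a, b):
        return 0.0
    return a * b

assert xmul(inf, 0.0) == 0.0 and xmul(-inf, 0.0) == 0.0
# the operation (+inf) + (-inf), undefined above, is NaN in IEEE as well
assert math.isnan(inf + (-inf))
print("extended-real conventions checked")
```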

7.2 Topology
Basic notions of Topology are shortly resumed, now, which are needed to develop Measure theory,
as well as to generalize the concepts studied in section 1.2. A complete reference to Topology is
represented, for example, by [34]. The general definition of a Topology, in a non–empty set Ω , is given
below.
Definition 7.2 (Topology). Let Ω be any non–empty set. A collection T of subsets of Ω is called
topology if it verifies the four following properties:
(i) ∅∈T ;
(ii) Ω∈T ;
(iii) closure under union: if C ⊂ T , then ⋃_{C∈C} C ∈ T ;

(iv) closure under finite intersection: if O1 , . . . , On ∈ T , then ⋂_{k=1}^n Ok ∈ T .

The pair Ω, T is called a topological space, and the sets in T are called open sets.
Remark 7.3. Many of the topological spaces used in applications verify a further axiom, known as
Hausdorff 2 property or separation or T2 property:
(v) for any x1 , x2 ∈ Ω , x1 6= x2 , there exist O1 , O2 ∈ T , O1 ∩ O2 = ∅ , such that x1 ∈ O1 and
x2 ∈ O2 .
The idea of topological space is inspired by the open sets in Rn , introduced in definition 1.13 in § 1.2.
We provide here further examples of topological spaces.
Example 7.4. Let Ω be any non–empty set. Then, T = {∅ , Ω} is a topology, called trivial or indiscrete
topology. In this situation, all points of the space cannot be distinguished by topological means.
Example 7.5. Let Ω be any non–empty set. Then T = P (Ω) is a topology, called discrete topology.
Here, every set is open.
Example 7.6. If Ω = N , the family T consisting of ∅ , of N itself, and of all the finite initial segments
of N , that is to say, of the sets Jn = {1 , 2 , . . . , n} , is a topology.
Example 7.7. Let Ω be any non–empty set, and let F be a partition of Ω . The collection T of the
subsets of Ω , obtained as union of elements of F , is a topology, induced by the partition F.
2
Felix Hausdorff (1868–1942), German mathematician.
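For a finite Ω, the axioms of Definition 7.2 can be checked by brute force over all subfamilies; the helper below (illustrative, with the hypothetical name is_topology) tests the trivial and discrete topologies of Examples 7.4 and 7.5, and a collection that fails closure under union.

```python
from itertools import combinations

def is_topology(omega, T):
    """Brute-force check of Definition 7.2 for a finite collection T of sets."""
    T = {frozenset(S) for S in T}
    if frozenset() not in T or frozenset(omega) not in T:
        return False
    for r in range(1, len(T) + 1):
        for sub in combinations(T, r):
            if frozenset().union(*sub) not in T:                  # axiom (iii)
                return False
            if frozenset(omega).intersection(*sub) not in T:      # axiom (iv)
                return False
    return True

omega = {1, 2, 3}
trivial = [set(), omega]                                          # Example 7.4
discrete = [set(s) for r in range(4) for s in combinations(omega, r)]  # Example 7.5
assert is_topology(omega, trivial)
assert is_topology(omega, discrete)
assert not is_topology(omega, [set(), {1}, {2}, omega])  # {1} U {2} is missing
print("topology axioms verified")
```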

7.2.1 Closed sets


Let (Ω, T ) be a topological space. The set C ⊂ Ω is said to be closed if C c ∈ T , that is, if the
complement of C is an open set in T .
From the topology axioms in Definition 7.2 and De Morgan Laws (7.1), the collection K, formed by
the closed sets in a topological space, verifies the following four properties:

(i) ∅ ∈ K;

(ii) Ω ∈ K;
(iii) closure under intersection: if C ⊂ K , then ⋂_{C∈C} C ∈ K ;

(iv) closure under finite union: if O1 , . . . , On ∈ K , then ⋃_{k=1}^n Ok ∈ K .

7.2.2 Limit
Let (Ω , T ) be a topological space, and consider S ⊆ Ω . Then, x ∈ Ω is:

(a) separated from S if and only if there exists A ∈ T such that x ∈ A and A ∩ S = ∅ ;

(b) an adherent point for S if and only if A ∩ S 6= ∅ for any A ∈ T such that x ∈ A ;

(c) an accumulation point for S if and only if x is adherent to S \ {x} ;

(d) an isolated point in S if and only if x is not an accumulation point for S .

Remark 7.8. A few facts should be noticed.

 If x is an accumulation point for S , then x is also an adherent point for S .

 If x ∉ S is adherent for S , then x is an accumulation point for S .

 There exist adherent points for S that are not accumulation points for S .

An open neighborhood of x ∈ Ω is an open set U ∈ T such that x ∈ U ; for simplicity, it will be referred
to as neighborhood. It is possible to express the notions of separated, adherent and accumulation points
using the idea of neighborhood. The concept of limit can also be generalised.

Definition 7.9. Let (Ω1 , T1 ) and (Ω2 , T2 ) be topological spaces, and consider A ⊆ Ω1 and f : A →
Ω2 . Furthermore, let x0 ∈ A be an accumulation point for A , and ` ∈ Ω2 . Then:

lim f (x) = `
x→x0

if, for any T2 –neighborhood V of ` , there exists a T1 –neighborhood U of x0 such that

x ∈ A ∩ (U \ {x0 }) =⇒ f (x) ∈ V .

7.2.3 Closure
Consider a topological space ( Ω , T ) and a set S ⊆ Ω . The closure S̄ of S is the collection of all
adherent points of S . By construction, it holds S ⊆ S̄ . The closure of a closed set is the set itself,
and we can indeed state the following Theorem 7.10.

Theorem 7.10. In the topological space (Ω , T ) , a set S ⊆ Ω is closed if and only if S̄ = S .

Theorem 7.10 has a few implications. First of all, the closures of ∅ and Ω are ∅ and Ω themselves.
Moreover, given S ⊆ Ω , since S̄ is a closed set, its closure is S̄ itself. Furthermore, if S1 ⊆ S2 ,
then S̄1 ⊆ S̄2 . Finally, the closure of S1 ∪ S2 equals S̄1 ∪ S̄2 .
The main property of the closure of a set S , though, is that S̄ is the smallest closed set containing
S . In fact, the following Theorem 7.11 holds.

Theorem 7.11. Let (Ω, T ) be a topological space, and consider S ⊆ Ω . Denote with K the collection
of closed sets in (Ω , T ) . Furthermore, denote with I the family of subsets of Ω such that:

I = {K ∈ K : S ⊆ K } .

Then:

S̄ = ⋂_{K∈I} K .

7.2.4 Compactness

Definition 7.12. Let Ω be a given set and let S ⊆ Ω . A covering of S in Ω is an indexed family
(Ai)i∈I in Ω such that:

S ⊆ ⋃_{i∈I} Ai .

In a topological space, a covering is called open if Ai is an open set for any i ∈ I .

The notion of open covering leads to the fundamental definition of compactness in a topological space.

Definition 7.13. Let (Ω , T ) be a topological space. A set K ⊂ Ω is called compact if any open
covering of K admits a finite sub–covering, meaning that there exists a finite subset I0 ⊆ I such
that:

K ⊆ ⋃_{i∈I0} Ai .

In the familiar context of the Euclidean open–set topology in Rn , it is possible to show that a
subset is compact if and only if it is closed and bounded. This characterization of compactness does not hold
in general topological spaces. In the case of the real line R , intervals of the form [a , b] constitute
examples of compact sets. It is further possible to show that, in a T2 –space (where the T2 property
is defined in Remark 7.3), any compact set is also a closed set. Finally, finite subsets in a topological
space are compact.

7.2.5 Continuity
When, in a set, a topology is available, it is possible to introduce the concept of continuity. Let (Ω1 , T1 )
and (Ω2 , T2 ) be topological spaces and consider f : Ω1 → Ω2 . Function f is continuous if, for any
open set A2 ⊆ Ω2 , the inverse image f −1 (A2 ) is an open set in the space (Ω1 , T1 ) , that is, in formal
terms:
A2 ∈ T2 =⇒ f −1 (A2 ) ∈ T1 .
It is possible to show that f is continuous if and only if, for any closed set K2 in Ω2 , the inverse
image f −1 (K2 ) is a closed set in Ω1 . It is also possible to formulate a local notion of continuity, at a

given point x0 , using appropriate neighborhoods; as we do not need to analyze this problem here, we
leave it to the interested Reader as an exercise.
It is interesting to remark that Weierstrass Theorem 1.30 can be generalized to the case of topological
spaces.

Theorem 7.14 (Weierstrass Theorem on compactness in topological spaces). If f : Ω1 → Ω2 is a


continuous map, and if K1 is a compact subset of Ω1 , then f (K1 ) is compact in Ω2 .
8 Lebesgue integral

Here and in Chapters 9 and 10, we deal with a function µ , called measure, which returns the area, or volume,
or probability, of a given set. We assume that µ is already defined, adopting an axiomatic approach;
this turns out to be advantageous, as the same theoretical results apply to other situations, besides area
in R2 or volume in R3 , and is particularly fruitful in Probability theory. A general domain that
can be assumed for µ is a σ-algebra, defined in § 8.1.
For completeness, a few basic concepts are recalled, for which some familiarity is assumed.

8.1 Measure theory


Measure theory is introduced in an axiomatic way, to keep its exposition to a minimum. Some ideas
are provided, later, to explain the construction of the Lebesgue1 measure, which represents the most
important measure on the real line.

8.1.1 σ–algebras
Let us introduce, first, the notion of σ–algebra of sets.

Definition 8.1. Let Ω be any non–empty set. A collection A of subsets of Ω is called σ–algebra of
sets if:

(i) Ω ∈ A;

(ii) closure under complement: A∈A =⇒ Ac ∈ A ;



(iii) closure under countable union: (An)n ⊂ A =⇒ ⋃_{n=1}^∞ An ∈ A .

The pair (Ω , A) is called measurable space, and the sets in A are called measurable sets.

From Definition 8.1 and De Morgan Laws (7.1), a fourth property follows: the collection A of measurable
sets is also closed under countable intersections.
When the countable union property (iii) of Definition 8.1 is replaced with the weaker assumption of
finite union, the family of sets A becomes an algebra of sets.

Definition 8.2. Let Ω be any non–empty set. A collection A of subsets of Ω is called algebra of sets
if:

(i) Ω ∈ A;

(ii) closure under complement: A∈A =⇒ Ac ∈ A ;


(iii) closure under finite union: A1 , . . . , Am ∈ A =⇒ ⋃_{n=1}^m An ∈ A .
1
Henri Léon Lebesgue (1875–1941), French mathematician.


We do not develop Measure theory in the context of algebras of sets, as it is beyond the purpose of our
introductory treatment.

Example 8.3. Let Ω be any non–empty set.

 A = {Ω , ∅} is a σ–algebra. Since it is the simplest possible σ–algebra, it is called trivial.

 A = P(Ω) , i.e., the power set of Ω , is a σ–algebra.

 If A is a non–empty subset of Ω , then A = {∅ , A , Ac , Ω} is a σ–algebra.

 If Ω is an infinite set, then the collection A of subsets A ⊂ Ω , with A countable or Ac
countable, is a σ–algebra.

 If Ω is an infinite set, then the collection A of subsets A ⊂ Ω , with A finite or Ac finite, is an


algebra, but it is not a σ–algebra.

It may happen that a given collection of sets X is not a σ–algebra, but there always exists a minimal
σ–algebra which contains X . Hence, to obtain non–trivial σ–algebras, we need to consider the following
abstract construction.

Lemma 8.4. Consider a family of σ–algebras on Ω . Then, the intersection of all the σ–algebras from
this family is also a σ–algebra on Ω .

Proof. Denote by H a non–empty collection of σ–algebras on Ω , and define:


\
A0 = A.
A∈H

To prove Lemma 8.4, we need to prove that A0 is a σ–algebra. Observe, first, that Ω ∈ A0 , since
Ω ∈ A for any A ∈ H , by definition. Then, choose a set A ∈ A0 , so that A belongs to every σ–algebra
in H and, therefore, Ac ∈ A0 . Finally, if (An )n∈N is a sequence of sets in A0 , it is clear that such
a sequence belongs to any σ–algebra in H , implying:
⋃_{n∈N} An ∈ A

for each A ∈ H , so that:

⋃_{n∈N} An ∈ A0 .

Using Lemma 8.4, we can define the σ–algebra generated by a family of arbitrary sets.

Definition 8.5. Let X be a collection of subsets of Ω . Denote with H the collection of the σ–algebras
on Ω that contain X . Then:

σ(X) = ⋂_{A∈H} A

is a σ–algebra, called σ–algebra generated by X .
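On a finite Ω every σ-algebra is a finite collection, so σ(X) can be computed directly by closing X under complements and unions until no new set appears, instead of intersecting all σ-algebras in H. The snippet below (with the hypothetical helper generate_sigma, written for this illustration) reproduces the third item of Example 8.3.

```python
def generate_sigma(omega, X):
    """Sigma-algebra generated by the collection X of subsets of a finite omega."""
    omega = frozenset(omega)
    A = {frozenset(), omega} | {frozenset(S) for S in X}
    while True:
        # close under complement and (finite = countable, here) union
        new = {omega - S for S in A} | {S | T for S in A for T in A}
        if new <= A:
            return A
        A |= new

sigma = generate_sigma({1, 2, 3, 4}, [{1}])
assert sigma == {frozenset(), frozenset({1}), frozenset({2, 3, 4}),
                 frozenset({1, 2, 3, 4})}
print(len(sigma))  # 4: the sigma-algebra {emptyset, A, A^c, Omega}
```

Closure under intersection is automatic, by De Morgan's laws, once complements and unions are available.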

8.1.2 Borel sets


This construction is of great interest when, working on the real line, we consider the collection X of
open sets in R . In this situation, the generated σ–algebra, σ (X ) , is called Borel σ–algebra, and it
is denoted by A (R) , while every member of A (R) is called Borel set.
If I is the collection of intervals [a , b) , with a < b , i.e.,

I = { [a , b) | a , b ∈ R , a < b } ,

then:
σ (I) = A (R) . (8.1)
Equality (8.1) also holds when:

I = { (a , b) | a , b ∈ R , a < b } ,
I = { (a , +∞) | a ∈ R} ,
I = { (−∞ , a) | a ∈ R} .

The n–dimensional Borel2 σ–algebra in Rn is the σ–algebra generated by all open subsets of Rn ,
and it is denoted by A(Rn ) . By construction, A(Rn ) contains all open/closed sets and all countable
unions/intersections of open/closed sets. The Borel σ–algebra does not represent the whole of P(Rn ) ;
this result is known as Vitali3 Covering Theorem ([10], Theorem 2.1.4).

8.1.3 Measures

Definition 8.6. Let A be a σ–algebra on Ω . Function µ : A → [0 , +∞] is said to be a measure on


A if:

(i) µ(∅) = 0

(ii) the property of countable additivity holds, that is, for any disjoint sequence (An)n ⊂ A :

µ( ⋃_{n=1}^∞ An ) = Σ_{n=1}^∞ µ(An) .

The set ( Ω , A , µ ) is called measure space, and measure µ is called:

 finite if µ(Ω) < +∞ ; in particular, µ is called probability measure if µ(Ω) = 1 ;



 σ–finite if there exists a sequence (An)n ⊂ A , such that ⋃_{n=1}^∞ An = Ω , where µ(An) < +∞ for
any n ∈ N ;

 complete if A ∈ A , with µ(A) = 0 , and B ⊂ A imply B ∈ A and µ(B) = 0 ;

 concentrated on A ∈ A if µ(Ac ) = 0 ; in this case, A is said to be a support of µ .

Example 8.7. Here, the so–called counting measure is introduced.


Let Ω be an arbitrary set, and A be a σ–algebra on Ω . Define µ# : A → [0 , ∞] as:
µ#(A) = #A if A is finite ,   µ#(A) = +∞ if A is infinite ,

where #A indicates the cardinality of A . Function µ# is a measure on A .


Example 8.8. Let x ∈ Ω . For any A ∈ P(Ω) , define:
δx(A) = 1 if x ∈ A ,   δx(A) = 0 if x ∉ A .

δx is a measure on P(Ω) called Dirac x–measure. It is a measure concentrated on the singleton set
{x}.
2
Félix Édouard Justin Émile Borel (1871–1956), French mathematician and politician.
3
Giuseppe Vitali (1875–1932), Italian mathematician.
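Both the counting measure and the Dirac measure can be realized concretely on the power set of a finite set; the snippet below (illustrative only) checks additivity on a disjoint pair of sets.

```python
def counting(A):
    """mu_#: the counting measure, restricted here to finite sets."""
    return len(A)

def dirac(x):
    """Dirac x-measure: delta_x(A) = 1 if x in A, else 0."""
    return lambda A: 1 if x in A else 0

A, B = {1, 2}, {5, 6, 7}          # disjoint measurable sets
assert counting(A | B) == counting(A) + counting(B)   # additivity

d3 = dirac(3)
assert d3({1, 2, 3}) == 1 and d3({4, 5}) == 0
assert d3(A | B) == d3(A) + d3(B)                     # additivity (both sides 0)
print(counting(A | B), d3({3}))  # 5 1
```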

The following intuitive results, stated in Proposition 8.9, are known as monotonicity and subtractivity
of the measure.
Proposition 8.9. If A , B ∈ A and A ⊂ B , then:

µ(A) ≤ µ(B) .

Moreover, if µ(A) < ∞ , then:


µ(B \ A) = µ(B) − µ(A) .
Proof. The union B = (B \ A) ∪ A is disjoint, hence additivity implies:

µ(B) = µ(B \ A) + µ(A) . (8.2)

Since µ(B \ A) ≥ 0 , it follows that µ(B) ≥ µ(A) , thus monotonicity is proved.


Now, the assumption µ(A) < ∞ means that µ(A) is finite and, thus, it can be subtracted from both
sides of equality (8.2), proving subtractivity.

Proposition 8.10. If A , B ∈ A and if µ(A ∩ B) < ∞ , then:

µ(A ∪ B) = µ(A) + µ(B) − µ(A ∩ B) .

Proof. Notice that (A ∩ B) ⊂ A , (A ∩ B) ⊂ B , and that:

A ∪ B = ( A \ (A ∩ B) ) ∪ B ,

where the union on the right–hand side is disjoint. Since µ(A ∩ B) < ∞ , additivity and subtractivity
(Proposition 8.9) yield:

µ(A ∪ B) = µ(A) − µ(A ∩ B) + µ(B) .

The following Lemma 8.11 shows that, when dealing with a sequence of non–disjoint sets, it is always
possible to rearrange it and treat, instead, a pairwise disjoint sequence of sets, equivalent to the original
sequence.
Lemma 8.11. Consider a measure space ( Ω , A , µ ) . Let (Ak )k∈N be a countable sequence of
measurable sets. Then, there exists a sequence of measurable sets (Bk )k∈N such that:

1. Bk ∈ A for any k ∈ N ;

2. Bk ⊆ Ak for any k ∈ N ;

3. Bk ∩ Bj = ∅ for any k 6= j , i.e., sequence (Bk )k∈N is pairwise disjoint;



[ ∞
[
4. Ak = Bk .
k=1 k=1

Proof. The sequence (Bk)k∈N can be defined inductively, setting:

B1 = A1 ,   Bk = Ak \ ⋃_{i=1}^{k−1} Ai for k > 1 .

Properties (1)–(2) are straightforward, both being a consequence of the construction of each set Bk .
To demonstrate property (3), let us fix k , j ∈ N , assuming, without loss of generality, that k < j ,
and consider the intersection:

Bk ∩ Bj = ( Ak \ ⋃_{i=1}^{k−1} Ai ) ∩ ( Aj \ ⋃_{i=1}^{j−1} Ai ) .

Now, using De Morgan Laws (7.1):

Bk ∩ Bj = ( Ak ∩ A1^c ∩ A2^c ∩ . . . ∩ A_{k−1}^c ) ∩ ( Aj ∩ A1^c ∩ A2^c ∩ . . . ∩ A_{j−1}^c ) .

The assumption k < j implies that the set Ak^c appears in the second group of intersections. Thus,
we can infer that:

Bk ∩ Bj = ∅ .

We now demonstrate property (4). Notice, first, that from property (2) the following inclusion is immediate:

⋃_{k=1}^∞ Bk ⊆ ⋃_{k=1}^∞ Ak .

To complete the proof, the reverse inclusion must be shown. To this purpose, let us choose:

x ∈ ⋃_{k=1}^∞ Ak .

Hence, x must belong to at least one of the sets Ak , implying that the following definition is
well–posed:

m = min{ k ∈ N | x ∈ Ak } .

In other words, x ∈ Am and x ∉ Ak for k = 1 , . . . , m − 1 . Therefore, it must be x ∈ Bm , so that:

x ∈ ⋃_{k=1}^∞ Bk ,

which completes the proof.
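The disjointification in the proof of Lemma 8.11 is straightforward to implement for a finite list of sets; the snippet below (illustrative, with the hypothetical helper disjointify) verifies properties (2)-(4) on a sample sequence.

```python
def disjointify(As):
    """B_1 = A_1 and B_k = A_k minus (A_1 U ... U A_{k-1}), as in Lemma 8.11."""
    Bs, seen = [], set()
    for A in As:
        Bs.append(set(A) - seen)
        seen |= A
    return Bs

As = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
Bs = disjointify(As)

assert all(B <= A for A, B in zip(As, Bs))              # property (2): B_k within A_k
assert all(Bs[i].isdisjoint(Bs[j])                      # property (3): pairwise disjoint
           for i in range(len(Bs)) for j in range(i + 1, len(Bs)))
assert set().union(*As) == set().union(*Bs)             # property (4): same union
print(Bs)  # [{1, 2, 3}, {4}, {5}, {6}]
```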

While working with countable families of sets, not necessarily pairwise disjoint, we meet the so–called
property of countable subadditivity.

Proposition 8.12. If (Ak)k≥1 ⊂ A , then:

µ( ⋃_{k=1}^∞ Ak ) ≤ Σ_{k=1}^∞ µ(Ak) .

Proof. Denote with (Bk )k∈N the sequence of measurable sets, obtained from the sequence (Ak )k∈N using the procedure of Lemma 8.11. In this way, for any k ∈ N , it holds that Bk ∈ A and Bk ⊆ Ak . It also holds that the sequence (Bk )k∈N is pairwise disjoint, that is, Bk ∩ Bj = ∅ for any k ≠ j . Moreover:
\[ \bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k . \]
We can, then, infer:
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \mu\Big( \bigcup_{k=1}^{\infty} B_k \Big) = \sum_{k=1}^{\infty} \mu(B_k) \le \sum_{k=1}^{\infty} \mu(A_k) . \]

The next Theorem 8.13 states two very interesting results, related to increasing or decreasing families of nested sets.

Theorem 8.13. Let (Ak )k≥1 ⊂ A be an increasing sequence of sets, i.e., Ak ⊂ Ak+1 . Then:
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \lim_{n \to \infty} \mu(A_n) . \tag{8.3} \]
Let (Ak )k≥1 ⊂ A be a decreasing sequence of sets, i.e., Ak+1 ⊂ Ak , with µ(A1 ) < ∞ . Then:
\[ \mu\Big( \bigcap_{k=1}^{\infty} A_k \Big) = \lim_{n \to \infty} \mu(A_n) . \tag{8.4} \]

Proof. Let us first prove (8.3). By assumption, Ak ⊂ Ak+1 , hence the non–negative sequence (µ(Ak ))k∈N is monotonically increasing, which implies the existence of the limit:
\[ \ell = \lim_{m \to \infty} \mu(A_m) . \]
On the other hand, the inclusion \( A_m \subseteq \bigcup_{k=1}^{\infty} A_k \) holds true for any m ∈ N . Passing to the limit:
\[ \ell = \lim_{m \to \infty} \mu(A_m) \le \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) . \tag{8.5} \]
If ℓ = +∞ , there is nothing more to prove. If, instead, ℓ < +∞ , then, due to the monotonicity of the sequence (µ(Ak ))k∈N , we infer that, for any k ∈ N :
\[ \mu(A_k) \le \ell < +\infty . \tag{8.6} \]
Now, observe that the union of sets can be rewritten as:
\[ \bigcup_{k=1}^{\infty} A_k = A_1 \cup \bigcup_{k=1}^{\infty} \big( A_{k+1} \setminus A_k \big) . \tag{8.7} \]
The union in the right hand–side of (8.7) is, by construction, disjoint. Since condition (8.6) also holds, we can use Proposition 8.9 to obtain:
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \mu(A_1) + \sum_{k=1}^{\infty} \mu\big( A_{k+1} \setminus A_k \big) = \mu(A_1) + \sum_{k=1}^{\infty} \big( \mu(A_{k+1}) - \mu(A_k) \big) . \]
In the previous chain of equalities, the last series is telescopic, therefore:
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \mu(A_1) + \lim_{n \to \infty} \sum_{k=1}^{n} \big( \mu(A_{k+1}) - \mu(A_k) \big) = \lim_{n \to \infty} \mu(A_{n+1}) = \ell , \]
which proves (8.3).



We now prove the second relation (8.4). Define the measurable set \( B = \bigcap_{k=1}^{\infty} A_k \) , and form:
\[ A_1 \setminus B = A_1 \cap B^c = A_1 \cap \bigcup_{k=1}^{\infty} A_k^c = \bigcup_{k=1}^{\infty} \big( A_1 \cap A_k^c \big) = \bigcup_{k=1}^{\infty} \big( A_1 \setminus A_k \big) . \]
The sequence of sets (A1 \ Ak )k∈N is increasing, thus we can apply relation (8.3) and infer that:
\[ \mu(A_1 \setminus B) = \lim_{k \to \infty} \mu(A_1 \setminus A_k) . \tag{8.8} \]
Now, µ(A1 ) < +∞ by hypothesis. Hence, from Proposition 8.9 it follows that µ(A1 \ B) = µ(A1 ) − µ(B) , which can be inserted into (8.8) to yield, recalling the definition of B :
\[ \mu(A_1) - \mu\Big( \bigcap_{k=1}^{\infty} A_k \Big) = \lim_{k \to \infty} \mu(A_1 \setminus A_k) = \mu(A_1) - \lim_{k \to \infty} \mu(A_k) . \tag{8.9} \]
Thesis (8.4) follows by eliminating µ(A1 ) from both sides of (8.9).

8.1.4 Exercises
1. Consider Ω = {1 , 2 , 3} . Find necessary and sufficient conditions on the real numbers x , y , z ,
such that there exists a probability measure µ on the σ–algebra A = P(Ω) , where:

x = µ({1 , 2}) , y = µ({2 , 3}) , z = µ({1 , 3}) .

2. Consider a measure space (Ω , A , µ) .

(a) Let E1 , E2 ∈ A . The symmetric difference E1 ∆E2 is defined as:

E1 ∆E2 := (E1 \ E2 ) ∪ (E2 \ E1 ) .

Suppose that µ(E1 ∆E2 ) = 0 . Show that µ(E1 ) = µ(E2 ) .


(b) Show that, if µ is a complete measure, E1 ∈ A , and µ(E1 ∆E2 ) = 0 , then E2 ∈ A .

Solution to Exercise 2.

(a) To prove the first statement, we proceed as follows:
\[ \mu(E_1 \Delta E_2) = 0 \implies \mu\big( (E_1 \setminus E_2) \cup (E_2 \setminus E_1) \big) = 0 \implies \mu(E_1 \setminus E_2) + \mu(E_2 \setminus E_1) = 0 \implies \mu\big( E_1 \setminus (E_1 \cap E_2) \big) + \mu\big( E_2 \setminus (E_1 \cap E_2) \big) = 0 . \]
Since µ takes values in [0 , +∞] , it follows that µ( E1 \(E1 ∩ E2 ) ) = 0 and µ( E2 \(E1 ∩ E2 ) ) = 0 . Moreover, observing that (E1 ∩ E2 ) ⊂ E1 and (E1 ∩ E2 ) ⊂ E2 , we can write:
\[ \mu(E_1) - \mu(E_1 \cap E_2) = 0 , \qquad \mu(E_2) - \mu(E_1 \cap E_2) = 0 . \]
Thus:
\[ \mu(E_1) - \mu(E_1 \cap E_2) = \mu(E_2) - \mu(E_1 \cap E_2) \implies \mu(E_1) = \mu(E_2) . \]
(b) For this second point, we have:
\[ \mu(E_1 \Delta E_2) = 0 \implies \mu\big( (E_1 \setminus E_2) \cup (E_2 \setminus E_1) \big) = 0 . \]
Since µ is complete, the sets (E1 \E2 ) and (E2 \E1 ) are measurable, i.e., they belong to A . In this way, since the σ–algebra is closed with respect to union, intersection and set complementation, we have:
\[ E_1 \setminus (E_1 \Delta E_2) = E_1 \cap (E_1 \Delta E_2)^c \in \mathcal{A} , \]
hence:
\[ E_2 = (E_2 \setminus E_1) \cup \big( E_1 \setminus (E_1 \Delta E_2) \big) \in \mathcal{A} . \]

8.2 Translation invariance


Consider the situation of a universal set, where the measure space is established, that coincides with the
real line R , or with the Euclidean m–dimensional space Rm : it is natural, then, to relate the property
of measure with the algebraic structure of the universal set. The main property that a measure may
verify is its invariance with respect to translation.

Definition 8.14. Consider x ∈ Rm and A ∈ P(Rm ) . The translate of A with respect to x is the set:

A + x := {u ∈ Rm | u = x + a for some a ∈ A} .

In Measure theory, it makes sense to compare the measure of the two sets A and A + x : for geometric
reasons, we may expect them to be the same; this holds for an important class of measures. We have,
indeed, the following result, that we state without proof.

Theorem 8.15. There exists a unique complete measure, defined on R and denoted by ℓ , and there exists a unique σ–algebra M(R) , which are translation invariant, i.e.:
\[ \ell(A) = \ell(A + x) , \qquad \text{for any } x \in \mathbb{R} , \ A \in \mathcal{M}(\mathbb{R}) . \]

Moreover, for any interval with endpoints a , b ∈ R , with a < b , it holds:
\[ \ell\big([a , b]\big) = \ell\big([a , b)\big) = \ell\big((a , b]\big) = \ell\big((a , b)\big) = b - a . \]
ℓ is called Lebesgue measure in R .

The relation between the σ–algebra M(R) of the Lebesgue measurable sets and the Borel σ–algebra
is stated in the following Remark 8.16.

Remark 8.16. The following chain of strict inclusions holds:
\[ \mathcal{A}(\mathbb{R}) \subsetneq \mathcal{M}(\mathbb{R}) \subsetneq \mathcal{P}(\mathbb{R}) . \]

An extensive treatment and a construction of the Lebesgue measure and its properties can be found in [52], [51] and [6]. For our purposes, the axiomatic approach suffices: there exists a unique translation–invariant measure, which assigns to every finite interval [a , b] its length b − a .

8.2.1 Exercises

1. Find the Lebesgue measure of the set:
\[ \bigcup_{n=1}^{\infty} \left\{ x \ \middle|\ \frac{1}{n+1} \le x < \frac{1}{n} \right\} . \]

8.3 Simple functions

Definition 8.17. Let ( Ω , A , µ ) be a measure space, and consider A ∈ A . The function ϕ : A → [−∞ , +∞] is called simple if:

(i) ϕ(A) is a finite set, i.e., ϕ(A) = {ϕ1 , ϕ2 , . . . , ϕm } ;

(ii) there exist A1 , . . . , Am ∈ A such that Ai ∩ Aj = ∅ for i ≠ j , and A = A1 ∪ · · · ∪ Am ;

(iii) for x ∈ Ai , with i = 1 , . . . , m , it holds ϕ(x) = ϕi .



The plainest simple function is the characteristic function of a given set.


Example 8.18. If A is a subset of the set Ω , the characteristic function of A is defined by:
\[ \mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A , \\ 0 & \text{if } x \notin A . \end{cases} \]
Remark 8.19. Any simple function can be represented as a linear combination of characteristic functions, since it holds:
\[ \varphi(x) = \sum_{i=1}^{m} \varphi_i \, \mathbf{1}_{A_i}(x) . \]

8.3.1 Integral of simple functions


Let (Ω , A , µ) be a measure space, let A ∈ A be a measurable set, and let ϕ be a simple function
defined on A ; assume further that ϕ is non–negative.
Definition 8.20. The integral of ϕ , on the set A , with respect to measure µ , is defined by:
\[ \int_A \varphi \, d\mu := \sum_{i=1}^{m} \varphi_i \, \mu(A_i) . \]

When a simple function is defined on the real line, the idea of its integration is intuitive and it is
inspired by the geometric concept of area of a family of rectangles, as illustrated in Figure 8.1.
Figure 8.1: The integral of a simple real function.

Observe that, to define the integral of a simple function correctly, we have to adopt a measure–theory convention that concerns ∞ in the extended real number system, namely:
\[ \pm\infty \cdot 0 = 0 . \]
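As a concrete illustration, on the real line Definition 8.20 reduces to a finite sum of values weighted by interval lengths. The following sketch (the helper name `simple_integral` and the sample step function are ours, not from the text) computes the integral of a step function of the kind shown in Figure 8.1:

```python
# A minimal sketch of Definition 8.20 on the real line: the integral of a
# simple function is the finite sum of its values phi_i weighted by the
# measures mu(A_i) of the pieces (here, interval lengths).

def simple_integral(pieces):
    """pieces: list of (phi_i, mu(A_i)) pairs; returns sum of phi_i * mu(A_i)."""
    return sum(value * measure for value, measure in pieces)

# A step function taking value 1 on [0, 1), 3 on [1, 2) and 2 on [2, 3):
pieces = [(1.0, 1.0), (3.0, 1.0), (2.0, 1.0)]
print(simple_integral(pieces))  # -> 6.0
```

The order of the pieces is irrelevant, exactly as in the defining sum.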

We now state, without proof, the main properties of the integral of a simple function, starting with the properties of positivity and linearity.
Proposition 8.21. Let (Ω, A, µ) be a measure space, let A ∈ A be a measurable set, and let ϕ1 , ϕ2 be simple functions, defined on A . Then, the following properties hold:
Linearity: if α1 , α2 ∈ R , then
\[ \int_A (\alpha_1 \varphi_1 + \alpha_2 \varphi_2) \, d\mu = \alpha_1 \int_A \varphi_1 \, d\mu + \alpha_2 \int_A \varphi_2 \, d\mu ; \]
Positivity: if ϕ1 ≥ 0 , then
\[ \int_A \varphi_1 \, d\mu \ge 0 . \]

Even though it is evident and easy to show, the following Proposition 8.22 has a fundamental importance in Measure theory, and it follows from the concept of integral of a simple function.

Proposition 8.22. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Then:
\[ \mu(A) = \int \mathbf{1}_A \, d\mu . \]
Proof. Observe that the characteristic function of a set A is a simple function. Hence, since 1A (x) = 0 when x ∈ Ac , and since 1A (x) = 1 when x ∈ A , it follows:
\[ \int \mathbf{1}_A \, d\mu = 0 \cdot \mu(A^c) + 1 \cdot \mu(A) = \mu(A) . \]

8.4 Measurable functions


The notion of measurable functions is of great interest in Measure theory, but also in Probability
theory, since measurable functions can be interpreted as random variables.

Definition 8.23. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Function
f : A → [−∞ , +∞] is called measurable if, for any α ∈ R :

{ x ∈ A | f (x) ≤ α } ∈ A .

The choice of ≤ is not restrictive, as shown in Proposition 8.24.

Proposition 8.24. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. The
following statements for function f : A → [−∞ , +∞] are equivalent:

(i) {x ∈ A | f (x) > α} ∈ A ; (iii) {x ∈ A | f (x) < α} ∈ A ;

(ii) {x ∈ A | f (x) ≥ α} ∈ A ; (iv) {x ∈ A | f (x) ≤ α} ∈ A .

Proof. The proof follows from the equalities recalled below, for which we refer to Theorem 11.15 of [52]:
\[ \{x \in A \mid f(x) \le \alpha\} = \bigcap_{n=1}^{\infty} \left\{ x \in A \ \middle|\ f(x) < \alpha + \frac{1}{n} \right\} , \]
\[ \{x \in A \mid f(x) \ge \alpha\} = \bigcap_{n=1}^{\infty} \left\{ x \in A \ \middle|\ f(x) > \alpha - \frac{1}{n} \right\} , \]
and
\[ \{x \in A \mid f(x) < \alpha\} = A \setminus \{x \in A \mid f(x) \ge \alpha\} , \qquad \{x \in A \mid f(x) > \alpha\} = A \setminus \{x \in A \mid f(x) \le \alpha\} . \]
They show, in a chain, that statement (i) =⇒ (ii) , statement (ii) =⇒ (iii) , statement (iii) =⇒ (iv) , and, finally, statement (iv) =⇒ (i) .

The property of measurability is well related with the basic algebraic operations between functions,
hence, we can state, without proof, the following Proposition 8.25.

Proposition 8.25. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further, f , g : A → R be measurable functions. Then, the following functions are measurable:

(a) αf , with α ∈ R ;
(b) f + g ;
(c) f g ;
(d) f /g ;
(e) max {f , g} ;
(f) f + := max {f , 0} ;
(g) f − := max {−f , 0} ;
(h) |f | = f + + f − .

Continuous and monotonic functions are indeed measurable, as illustrated in Proposition 8.26.

Proposition 8.26. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Then,
any continuous function and any monotonic function, defined on A , is measurable.

In Proposition 8.27, we analyse how measurability interacts with sequences of functions.

Proposition 8.27. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further, {fn } be a sequence of measurable functions, defined on A . Then, the following are measurable functions too:
\[ \max_{n \le k} f_n , \qquad \min_{n \le k} f_n , \qquad \sup_{n \in \mathbb{N}} f_n , \qquad \inf_{n \in \mathbb{N}} f_n . \]

Recall the definition of upper and lower limit of a sequence (xn )n :
\[ \limsup_{n \to \infty} x_n := \inf_{k \in \mathbb{N}} \sup_{n \ge k} x_n , \qquad \liminf_{n \to \infty} x_n := \sup_{k \in \mathbb{N}} \inf_{n \ge k} x_n , \tag{8.10} \]
and notice that upper and lower limits are always well defined, since the following sequences are, respectively, decreasing and increasing:
\[ u_k := \sup_{n \ge k} x_n , \qquad l_k := \inf_{n \ge k} x_n . \]

We are now in the position to state the following Corollary 8.28.

Corollary 8.28. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further, {fn } be a sequence of measurable functions, defined on A . Then, the following functions are also measurable:
\[ \limsup_{n \to \infty} f_n , \qquad \liminf_{n \to \infty} f_n , \qquad \lim_{n \to \infty} f_n . \]

Reasoning with sequences of functions can yield interesting results, like the one stated in the following
Theorem 8.29.

Theorem 8.29. Let f : R → R be a differentiable function. Then, the derivative f ′ is a measurable function.

Proof. For any n ∈ N , define:
\[ g_n(x) = n \left( f\!\left(x + \tfrac{1}{n}\right) - f(x) \right) = \frac{f\!\left(x + \tfrac{1}{n}\right) - f(x)}{\tfrac{1}{n}} . \]
The thesis follows, observing that (gn ) is a sequence of measurable functions and that gn (x) −→ f ′ (x) .

From their Definition 8.17, simple functions are measurable. Their importance lies in the fact that
simple functions can approximate measurable functions, as shown by the following Theorem 8.30.

Theorem 8.30. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further, f : A → R be a measurable function. Then, there exists a sequence of simple functions (fn )n , defined on A , such that, for any x ∈ A , it holds |fn (x)| ≤ |f (x)| and:
\[ \lim_{n \to \infty} f_n(x) = f(x) . \]

Remark 8.31. The quite technical proof of Theorem 8.30 can be found, for instance, in Theorem 11.20 of [52]. Here, we only provide the interesting result that, for a non–negative measurable function f , the sequence of simple functions approximating f can be defined as follows, for any n ∈ N :
\[ f_n(x) = \begin{cases} n & \text{if } f(x) \ge n , \\[4pt] \dfrac{k-1}{2^n} & \text{if } \dfrac{k-1}{2^n} \le f(x) < \dfrac{k}{2^n} , \quad 1 \le k \le n\, 2^n . \end{cases} \]
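The dyadic construction of Remark 8.31 can be sketched numerically. In the snippet below (the helper names are ours, and f is assumed non–negative, as in the construction above), f_n truncates f at the level n and rounds it down to the grid of step 2^{−n} , so the approximation error is at most 2^{−n} wherever f (x) < n :

```python
import math

# Sketch of the approximating sequence of Remark 8.31 for a non-negative
# measurable f: truncate at n, then round down to the dyadic grid 2^{-n}.
def f_n(f, n):
    def fn(x):
        y = f(x)
        if y >= n:
            return n
        # (k - 1)/2^n <= y < k/2^n  means  k - 1 = floor(2^n y)
        return math.floor(y * 2**n) / 2**n
    return fn

f = lambda x: x * x                 # an example measurable function
approx = f_n(f, 10)
# wherever f(x) < n, the error is at most 2^{-n}
print(abs(approx(0.7) - f(0.7)) <= 2**-10)  # -> True
```

Note that each f_n takes only finitely many values, so it is indeed simple, and the sequence increases to f pointwise.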

8.5 Lebesgue integral


We introduce the notion of integral, using the approximation of a measurable function with sequences
of simple functions, in the spirit of Theorem 8.30. We start with working within the contest of non–
negative function.

Definition 8.32. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further,
f : A → [0 , +∞] be a measurable function. The integral of f on A , with respect to measure µ , is
defined as: Z Z 
f dµ := sup ϕ dµ | 0 ≤ ϕ ≤ f , ϕ simple .
A A

Here is a list of the main properties of the integral of a non–negative function. The three properties can be demonstrated by observing, first, that they hold for simple functions, and then forming the limit, via Theorem 8.30.

(P1) positivity property: \( 0 \le \int_A f \, d\mu \) ;

(P2) monotonicity property: \( 0 \le f \le g \implies \int_A f \, d\mu \le \int_A g \, d\mu \) ;

(P3) \( \int_A f \, d\mu < +\infty \implies \mu\big( \{ x \in A \mid f(x) = +\infty \} \big) = 0 \) .

We are now in the position to define the integral for measurable functions which may change sign. In particular, the integral is defined for the class of absolutely integrable functions.

Definition 8.33. Let (Ω , A , µ) be a measure space, and let f : Ω → [−∞ , +∞] be a measurable function. We say that f is summable on Ω if both integrals:
\[ \int_\Omega f^+ \, d\mu \qquad \text{and} \qquad \int_\Omega f^- \, d\mu \tag{8.11} \]
are finite. In this case, the integral of f is defined as:
\[ \int_\Omega f \, d\mu := \int_\Omega f^+ \, d\mu - \int_\Omega f^- \, d\mu . \tag{8.12} \]
Moreover, the integral of f on Ω exists if at least one of the two integrals (8.11) is finite. In this latter case, the integral is still defined by (8.12), but it may be infinite. The undefined situation is, obviously, +∞ − ∞ .

Take notice that some Authors employ alternative notations:
\[ \int_\Omega f(x) \, d\mu(x) , \qquad \int_\Omega f(x) \, \mu(dx) . \]
Remark 8.34. From Definition 8.33, it can be easily inferred that:
\[ \left| \int_\Omega f \, d\mu \right| \le \int_\Omega |f| \, d\mu . \]
The following Definition 8.35 adapts Definition 8.33 to functions defined on subsets of the measure space Ω .
Definition 8.35. Let (Ω, A , µ) be a measure space. Consider A ∈ A and let f : A → [−∞ , +∞] be a measurable function. Then, f is integrable on A if, following Definition 8.33, the function fˆ : Ω → [−∞ , +∞] defined by:
\[ \hat{f}(x) := \begin{cases} f(x) & \text{if } x \in A , \\ 0 & \text{if } x \notin A , \end{cases} \]
is integrable on Ω . The integral of f on A is, then, given by:
\[ \int_A f \, d\mu := \int_\Omega \hat{f} \, d\mu . \]

The notation L(Ω) represents the collection of all measurable functions, on Ω , which are summable.
The properties of the integral imply that L(Ω) is a vector space on R .

8.6 Almost everywhere


There are cases in which the integral defined using the Lebesgue measure–theory approach is more effective than the Riemann integral. This happens, in particular, because the Lebesgue integral allows one to deal with properties that do not hold for all x , as long as the set on which they fail has zero measure.
Definition 8.36. Let (Ω, A , µ) be a measure space. Consider a property P , and consider the set
formed by those elements of Ω for which P does not hold. If such a set has zero measure, then P
holds almost everywhere in Ω .
In other words, a property is said to hold almost everywhere, if there exists a set N ∈ A such that
µ(N ) = 0 and such that all the elements of Ω , where P does not hold, belong to N .
Example 8.37. Consider two measurable functions f , g : Ω → [−∞ , +∞] . We say that f = g almost everywhere if:
\[ \mu\big( \{ x \in \Omega \mid f(x) \ne g(x) \} \big) = 0 . \]
Example 8.38. Let f , g : Ω → [−∞ , +∞] be measurable functions. We say that f ≤ g almost everywhere if:
\[ \mu\big( \{ x \in \Omega \mid f(x) > g(x) \} \big) = 0 , \]
or that f ≥ g almost everywhere if:
\[ \mu\big( \{ x \in \Omega \mid f(x) < g(x) \} \big) = 0 . \]
Example 8.39. The sequence of functions fn (x) = xn , defined on the interval [0 , 1] , converges to 0 almost everywhere, if we take the Lebesgue measure. In fact, if x ∈ [0 , 1] , then:
\[ \lim_{n \to \infty} x^n = \begin{cases} 0 & \text{if } x \in [0, 1) , \\ 1 & \text{if } x = 1 . \end{cases} \]
This shows that the limit function is not the null function, but the set on which the limit function differs from the null function has zero measure.

The following Proposition 8.40 provides reasons of the relevance of the almost everywhere properties,
when they hold for a complete measure space.

Proposition 8.40. Let (Ω, A , µ) be a measure space, where µ is a complete measure. Let f , g : Ω → [−∞ , +∞] be functions that are almost everywhere equal. If f is measurable, then g is measurable. Moreover, if f ∈ L(Ω) , then g ∈ L(Ω) and:
\[ \int_\Omega f \, d\mu = \int_\Omega g \, d\mu . \]

8.7 Connection with Riemann integral


The Lebesgue integral was introduced in order to generalize the Riemann⁴ integral. In this § 8.7, the Lebesgue–Vitali theorem is stated, without proof, which explains the interplay between the two notions of integral. For completeness, we first recall the definition of Riemann integral.

8.7.1 The Riemann integral


A partition of a real interval [a , b] is any finite subset σ of [a , b] , such that a , b ∈ σ . Hence, for a
suitable n ∈ N , we must have:

σ = { a = x0 , x1 , . . . , xn−1 , xn = b } ,

where the elements of σ are listed according to the convention that xi−1 < xi , for any i = 1 , . . . , n .
The norm of σ is the positive number:
\[ \| \sigma \| = \max_{1 \le i \le n} (x_i - x_{i-1}) . \]
The collection of all partitions of [a , b] is denoted by Ω([a , b]) . Given σ1 , σ2 ∈ Ω([a , b]) , if σ1 ⊂ σ2 then ‖σ1 ‖ ≥ ‖σ2 ‖ , and we say that σ2 is a refinement of σ1 .
The notation f ∈ B([a, b]) indicates that f : [a , b] −→ R is a bounded function, i.e., for any x ∈ [a, b] :
\[ -\infty < m := \inf_{t \in [a ,b]} f(t) \le f(x) \le \sup_{t \in [a ,b]} f(t) =: M < +\infty . \]

The lower sum of f , induced by σ , is the real number:
\[ s(f , \sigma) := \sum_{i=1}^{n} m_i \, (x_i - x_{i-1}) , \qquad \text{where} \quad m_i = \inf_{x_{i-1} \le x \le x_i} f(x) \quad \text{for any } 1 \le i \le n . \]
Similarly, the upper sum of f , induced by σ , is the real number:
\[ S(f , \sigma) := \sum_{i=1}^{n} M_i \, (x_i - x_{i-1}) , \qquad \text{where} \quad M_i = \sup_{x_{i-1} \le x \le x_i} f(x) . \]

These definitions have a plain geometrical meaning; in particular, when f is non–negative, the idea is
to calculate the area of the plane region situated underneath the graph of f , over the interval [a , b] ,
as illustrated in Figure 8.2.
4. Georg Friedrich Bernhard Riemann (1826–1866), German mathematician.

Figure 8.2: Lower sum (left) and upper sum (right).
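The lower and upper sums are easy to compute explicitly when the integrand is monotonic, since the infimum and supremum on each subinterval are then attained at its endpoints. A small sketch (the helper name is ours, and a uniform partition is only one possible choice of σ ):

```python
# Lower and upper sums s(f, sigma), S(f, sigma) on a uniform partition.
# For a monotonic f, inf and sup on each subinterval sit at its endpoints,
# which is what the min/max below exploits.

def lower_upper_sums(f, a, b, n):
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lo = sum(min(f(xs[i - 1]), f(xs[i])) * (xs[i] - xs[i - 1])
             for i in range(1, n + 1))
    hi = sum(max(f(xs[i - 1]), f(xs[i])) * (xs[i] - xs[i - 1])
             for i in range(1, n + 1))
    return lo, hi

# f(x) = x^2 on [0, 1]: both sums squeeze the Riemann integral 1/3.
lo, hi = lower_upper_sums(lambda x: x * x, 0.0, 1.0, 1000)
print(lo <= 1 / 3 <= hi)  # -> True
```

Refining the partition makes the two sums converge to the common value, in agreement with the definition of Riemann integrability below.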

The real numbers:
\[ \sup_{\sigma \in \Omega([a ,b])} s(f , \sigma) \qquad \text{and} \qquad \inf_{\sigma \in \Omega([a ,b])} S(f , \sigma) \]
are called, respectively, lower integral and upper integral of f ∈ B([a , b]) . They are represented with the notations:
\[ \sup_{\sigma \in \Omega([a ,b])} s(f , \sigma) := \underline{\int_a^b} f(x) \, dx , \qquad \inf_{\sigma \in \Omega([a ,b])} S(f , \sigma) := \overline{\int_a^b} f(x) \, dx . \]
From the definitions, it follows immediately:
\[ \underline{\int_a^b} f(x) \, dx \le \overline{\int_a^b} f(x) \, dx . \]
A function f ∈ B([a , b]) is Riemann integrable if:
\[ \underline{\int_a^b} f(x) \, dx = \overline{\int_a^b} f(x) \, dx . \]
In this case, we denote f ∈ R([a , b]) , and the common value of upper and lower integrals is denoted by:
\[ \int_a^b f(x) \, dx . \]
The inclusion R([a , b]) ⊂ B([a , b]) is proper. In fact, if we consider the following function, due to Dirichlet⁵:
\[ D(x) = \begin{cases} 0 & \text{if } x \in [0 , 1] \cap (\mathbb{R} \setminus \mathbb{Q}) , \\ 1 & \text{if } x \in [0 , 1] \cap \mathbb{Q} , \end{cases} \]
we see that D ∈ B([0 , 1]) , while D ∉ R([0 , 1]) , since:
\[ \underline{\int_0^1} D(x) \, dx = 0 < \overline{\int_0^1} D(x) \, dx = 1 . \]
At this point, take notice that, if we consider the Lebesgue measure on R , then function D(x) coincides with the zero function almost everywhere; hence, by Proposition 8.40, the Dirichlet function turns out to be integrable and:
\[ \int_{[0 ,1]} D \, d\ell = 0 . \]
In other words, the Dirichlet function constitutes an example of a Lebesgue integrable function that is not Riemann integrable.
5. Johann Peter Gustav Lejeune Dirichlet (1805–1859), German mathematician.

8.7.2 Lebesgue–Vitali theorem


The full description of the relation between Riemann and Lebesgue integrals is due to Lebesgue and
Vitali, and is explained in the following Theorem 8.41, whose proof can be found, for example, in
Theorem 2.5.4 of [12].

Theorem 8.41. Let f : [a , b] → R be a bounded function. Then:

(1) f ∈ R([a , b]) ⇐⇒ f is continuous at almost every x ∈ [a , b] ;

(2) f ∈ R([a , b]) =⇒ f ∈ L([a , b] , ℓ) and
\[ \int_a^b f(x) \, dx = \int_{[a ,b]} f \, d\ell . \]

Remark 8.42. In view of Theorem 8.41, we will use the traditional Leibniz–Riemann notation:
\[ \int_a^b f(x) \, dx \]
to denote, also, the Lebesgue integral of the measurable function f , defined on the interval [a , b] .

8.7.3 An interesting example


One result of the classical theory of Riemann integration states that any monotonic function f : [a , b] → R is Riemann integrable. On the other hand, Theorem 8.41 ensures that the set of points, in which a Riemann integrable function is not continuous, has zero Lebesgue measure. The following Example 8.43 introduces a monotonic function f , defined on a bounded interval [a , b] , that has a numerable set of points (xn )n∈N ⊂ [a , b] in which it is not continuous, being:
\[ f^+(x_n) = \lim_{x \to x_n^+} f(x) \ne \lim_{x \to x_n^-} f(x) = f^-(x_n) , \]
with f + (xn ) , f − (xn ) ∈ R .

Example 8.43. Consider a strictly increasing sequence (xn )n∈N ⊂ [0 , 1] , and define:
\[ x_\infty := \lim_{n \to \infty} x_n , \]
the existence of x∞ being ensured by the hypothesis on (xn ) . Now, introduce the function f : [0 , 1] → R , defined as:
\[ f(x) = \begin{cases} x_n & \text{if } x \in \left[ \dfrac{n-1}{n} , \dfrac{n}{n+1} \right) , \\[6pt] x_\infty & \text{if } x = 1 . \end{cases} \]
By construction, f is increasing and, thus, integrable on [0 , 1] . Again by construction, f is discontinuous at any point ξn = n/(n + 1) , where f jumps, as well as at x = 1 . Moreover:
\[ \int_0^1 f(x) \, dx = \sum_{n=1}^{\infty} \int_{\frac{n-1}{n}}^{\frac{n}{n+1}} x_n \, dx = \sum_{n=1}^{\infty} x_n \left( \frac{n}{n+1} - \frac{n-1}{n} \right) = \sum_{n=1}^{\infty} \frac{x_n}{n(n+1)} , \]
which implies that the integral of f is strictly positive. For instance, as illustrated in Figure 8.3, if xn = 1 − 1/n , then:
\[ \int_0^1 f(x) \, dx = \sum_{n=1}^{\infty} \left( \frac{2}{n} - \frac{2}{n+1} - \frac{1}{n^2} \right) = 2 \sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n+1} \right) - \sum_{n=1}^{\infty} \frac{1}{n^2} = 2 - \frac{\pi^2}{6} . \]
Figure 8.3: Example 8.43, with xn = 1 − 1/n .
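The closed form 2 − π²/6 can be checked numerically against a partial sum of the series above (a sanity check of ours, not part of the text):

```python
import math

# Partial sum of sum_{n>=1} x_n / (n (n + 1)) with x_n = 1 - 1/n,
# compared against the closed form 2 - pi^2/6 of Example 8.43.
partial = sum((1 - 1 / n) / (n * (n + 1)) for n in range(1, 200001))
target = 2 - math.pi**2 / 6
print(abs(partial - target) < 1e-4)  # -> True
```

The tail of the series behaves like 1/n , summed as 1/n² terms, so 200 000 terms suffice for four decimal digits.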

8.8 Non Lebesgue integrals


8.8.1 Dirac measure
Recall the definition of the Dirac⁶ measure, already met in Example 8.8. Given a non–empty set Ω , consider x0 ∈ Ω . For any A ∈ P(Ω) , the Dirac measure, concentrated in x0 , is defined as:
\[ \delta_{x_0}(A) = \begin{cases} 1 & \text{if } x_0 \in A , \\ 0 & \text{if } x_0 \notin A ; \end{cases} \]
thus, it can be described in terms of the characteristic function:
\[ \delta_{x_0}(A) = \mathbf{1}_A(x_0) . \]
Moreover, any real function on Ω is integrable with respect to the Dirac measure δx0 ; from the general definition of integral, in fact, we have:
\[ \int f \, d\delta_{x_0} = f(x_0) . \]

8.8.2 Discrete measure


Let us generate a measure in Ω , via the following construction. Given a sequence (xn ) in Ω , for any A ⊂ Ω define:
\[ \mu(A) = \sum_{n \ge 1} \mathbf{1}_A(x_n) . \]
In other words, µ counts how many elements of (xn ) belong to A .

It is possible to show that a measurable function f is integrable if and only if the numeric series \( \sum_{n \ge 1} |f(x_n)| \) converges. In such a case, we also have:
\[ \int_\Omega f \, d\mu = \sum_{n \ge 1} f(x_n) . \]
The above result also means that infinite series can be seen as Lebesgue integrals: if Ω = R and (an ) is a sequence such that \( \sum_{n \ge 1} a_n \) converges absolutely, then we can write:
\[ \sum_{n \ge 1} a_n = \int_{\mathbb{R}} a(x) \, d\mu(x) , \]
where a(n) = an for n ∈ N , a(x) = 0 if x ∉ N , and µ is the counting measure concentrated on the natural numbers.
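The identity between series and integrals with respect to the counting measure can be illustrated with a finite truncation of the points xn (a hedged sketch; the helper name is ours):

```python
# Integral with respect to the counting measure concentrated on the points
# x_n: it reduces to the sum of f over those points (finite truncation here).

def integral_counting(f, points):
    return sum(f(x) for x in points)

# sum_{n=1}^{100} 1/2^n, seen as an integral, is 1 - 2^{-100}:
val = integral_counting(lambda n: 2.0 ** (-n), range(1, 101))
print(abs(val - 1.0) < 1e-12)  # -> True
```

Absolute convergence of the full series is exactly the integrability condition stated above.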
6. Paul Adrien Maurice Dirac (1902–1984), British theoretical physicist.

8.9 Generation of measures


Up to now, we have presented the theory of the Lebesgue integral under the assumption that a measure is given. When Measure theory is employed in Probability theory, there are many measures to deal with. The way these measures are generated is expressed by the following Theorem 8.44, known as the generation of measure theorem; we present a proof of it inspired by [52].

Theorem 8.44. Let ( Ω , A , µ ) be a measure space, and let f : Ω → [0 , +∞] be non–negative and measurable. Then, for any A ∈ A :
\[ \varphi(A) := \int_A f \, d\mu \]
is a measure on A . Moreover, φ is a finite measure, that is, φ(Ω) < +∞ , if and only if f ∈ L(Ω , µ) .

Proof. It is obvious that φ(∅) = 0 . To complete the proof, we have to show that, if An ∈ A , n ∈ N , is such that Ai ∩ Aj = ∅ for i ≠ j , then, setting \( A = \bigcup_{n=1}^{\infty} A_n \) yields:
\[ \varphi(A) = \sum_{n=1}^{\infty} \varphi(A_n) . \]

If f is a characteristic function of some measurable set E , then the countable additivity of φ follows from the countable additivity of µ ; in such a case, in fact, we have:
\[ \varphi(A) = \int_A \mathbf{1}_E \, d\mu = \mu(A \cap E) . \]
Therefore, if A , B ∈ A , with A ∩ B = ∅ , then:
\[ \varphi(A \cup B) = \mu\big( (A \cup B) \cap E \big) = \mu\big( (A \cap E) \cup (B \cap E) \big) . \]
Since (A ∩ E) ∩ (B ∩ E) = ∅ , it follows that:
\[ \varphi(A \cup B) = \mu(A \cap E) + \mu(B \cap E) = \varphi(A) + \varphi(B) . \]
The extension to a countable union of disjoint sets is straightforward.

If f is a simple function, the conclusion still holds, since:
\[ f = \sum_i f_i \, \mathbf{1}_{E_i} . \]
In the general case of non–negative measurable f , for any simple function s such that 0 ≤ s ≤ f , we have:
\[ \int_A s \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} s \, d\mu \le \sum_{n=1}^{\infty} \int_{A_n} f \, d\mu = \sum_{n=1}^{\infty} \varphi(A_n) . \]
Now, recalling the definition of the integral of a measurable function, and considering the supremum, we find:
\[ \sup_{0 \le s \le f} \int_A s \, d\mu = \int_A f \, d\mu = \varphi(A) \le \sum_{n=1}^{\infty} \varphi(A_n) . \tag{8.13} \]

To obtain the thesis, we have to prove the reverse of inequality (8.13). Now, if there exists an An such that φ(An ) = +∞ then, since A ⊃ An , it follows that φ(A) ≥ φ(An ) and the thesis is immediate. Hence, the proof can be limited to the case in which φ(An ) < +∞ for any n ∈ N . Then, for any ε > 0 , a simple function s can be chosen such that 0 ≤ s ≤ f and:
\[ \int_{A_1} s \, d\mu \ge \int_{A_1} f \, d\mu - \frac{\varepsilon}{2} , \qquad \int_{A_2} s \, d\mu \ge \int_{A_2} f \, d\mu - \frac{\varepsilon}{2} . \]
Thus:
\[ \varphi(A_1 \cup A_2) \ge \int_{A_1 \cup A_2} s \, d\mu = \int_{A_1} s \, d\mu + \int_{A_2} s \, d\mu \ge \varphi(A_1) + \varphi(A_2) - \varepsilon , \]
from which, it can be inferred that:
\[ \varphi(A_1 \cup A_2) \ge \varphi(A_1) + \varphi(A_2) . \]
Generalising to any n ∈ N :
\[ \varphi(A_1 \cup \dots \cup A_n) \ge \varphi(A_1) + \dots + \varphi(A_n) . \tag{8.14} \]
Now, since A ⊃ A1 ∪ · · · ∪ An , inequality (8.14) implies:
\[ \varphi(A) \ge \sum_{n=1}^{\infty} \varphi(A_n) . \tag{8.15} \]
Finally, the thesis follows from (8.13) and (8.15).

8.10 Passage to the limit


In applications involving a sequence (fn ) of measurable functions, the passage to the limit, related to the possibility of switching between the operations of limit and integration, is of great importance. We addressed this problem in § 2.3, in particular in Theorem 2.15, for uniformly convergent sequences of functions. Here, we provide a more general, measure–theoretical treatment.

8.10.1 Monotone convergence


We can surely state that passing to the limit represents one of main aims of Lebesgue integration theory,
in generalising the classic contest of Riemann integration. In this § 8.10.1 we build the main tool for
the passage–to–the–limit theory for the Lebesgue integral. The first, and probably most important
result, in this theory, is the monotone convergence Theorem, 8.45, due to Beppo Levi7 .
Theorem 8.45 (Beppo Levi – monotone convergence). Let E ∈ A and let (fn ) be a sequence of
measurable functions, such that, for any x ∈ E :
0 ≤ f1 (x) ≤ f2 (x) ≤ · · · ≤ fn (x) ≤ · · · · · · (8.16)
If:
f (x) = lim fn (x) , (8.17)
n→∞
then: Z Z
lim fn dµ = f dµ .
n→∞ E E
Remark 8.46. When fn is decreasing, Levi Theorem 8.45 does not hold.
1
To see it, consider the counter–example fn (x) = , for any x ∈ R . Here:
n
fn (x) & 0 ,
a notation indicating that the sequence decreases to zero, as n → ∞ . Thus, f (x) = lim fn (x) = 0 .
n→∞
At the same time, though, for any n ∈ N , it holds:
Z ∞
fn (x) dx = ∞ .
−∞
7. Beppo Levi (1875–1961), Italian mathematician.

Proof. (Theorem 8.45) Relations (8.16) imply that there exists α ∈ [0 , +∞] such that:
\[ \lim_{n \to \infty} \int_E f_n \, d\mu = \alpha . \]
Since \( \int_E f_n \, d\mu \le \int_E f \, d\mu \) , it is also:
\[ \alpha \le \int_E f \, d\mu . \tag{8.18} \]
To obtain the thesis, we have to prove the reverse of inequality (8.18). Let us fix 0 < c < 1 and a simple function s such that 0 ≤ s ≤ f . Define, further, for any n ∈ N :
\[ E_n = \{ x \in E \mid f_n(x) \ge c\, s(x) \} . \]
From (8.16) we infer that E1 ⊂ E2 ⊂ · · · ⊂ En ⊂ · · · and, from (8.17), we also have:
\[ E = \bigcup_{n=1}^{\infty} E_n . \]
Now, given an m ∈ N and any n ≥ m , the following inequalities hold, since Em ⊂ E and, by (8.16), fn ≥ fm ≥ c s on Em :
\[ \int_E f_n \, d\mu \ge \int_{E_m} f_n \, d\mu \ge c \int_{E_m} s \, d\mu . \tag{8.19} \]
At this point, evaluate the limit for n → ∞ in (8.19), leading to:
\[ \alpha \ge c \int_{E_m} s \, d\mu . \tag{8.20} \]
Due to the Generation measure Theorem 8.44, the integral in (8.20) is a countably additive set function. Then, we can use Theorem 8.13, for increasing sequences of nested sets, to infer:
\[ \alpha \ge c \int_E s \, d\mu . \]
Finally, we evaluate the limit c ↗ 1 , obtaining:
\[ \alpha \ge \int_E s \, d\mu , \]
and we consider the supremum over s , that is:
\[ \alpha \ge \int_E f \, d\mu . \tag{8.21} \]
The thesis follows from (8.18) and (8.21).

Analytic functions
The monotone convergence Theorem 8.45 applies in a natural way to analytic functions, that is, functions admitting a convergent expansion in power series. The connection to analytic functions is established by the following Corollary 8.47 to Theorem 8.45.

Corollary 8.47. Let \( \sum_{n=1}^{\infty} f_n \) be a series of positive functions on Ω . Then:
\[ \int_\Omega \sum_{n=1}^{\infty} f_n \, d\mu = \sum_{n=1}^{\infty} \int_\Omega f_n \, d\mu . \]

The next Example 8.48 aims to clarify how to deal with analytic functions using monotone convergence.
Take notice of the importance of this example, which provides a further solution to the Basel problem
presented in § 2.7.

Example 8.48. Compute the value of the so–called Leibniz integral, that is:
\[ \int_0^1 \frac{\ln x}{x^2 - 1} \, dx = \frac{\pi^2}{8} . \tag{8.22} \]
As mentioned, integral (8.22) is connected to the Basel problem in § 2.7. Recalling the geometric series expansion (2.38), the following result can be inferred:
\[ \frac{1}{1 - x^2} = \sum_{n=0}^{\infty} x^{2n} . \]
Then, we employ Levi Theorem 8.45:
\[ \int_0^1 \frac{-\ln x}{1 - x^2} \, dx = \sum_{n=0}^{\infty} \int_0^1 -x^{2n} \ln x \, dx . \]
Integrating by parts:
\[ \int_0^1 -x^{2n} \ln x \, dx = \left[ -\frac{x^{2n+1}}{2n+1} \ln x \right]_0^1 + \int_0^1 \frac{x^{2n}}{2n+1} \, dx = \frac{1}{(2n+1)^2} . \tag{8.23} \]
In other words, the integral in the left–hand side of (8.22) can be expressed in terms of a numerical series:
\[ \int_0^1 \frac{\ln x}{x^2 - 1} \, dx = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} . \tag{8.24} \]
Formula (8.22) follows from identity (2.69).
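A quick numerical sanity check of (8.24), comparing a partial sum of the series of odd inverse squares with π²/8 (an illustration of ours, not part of the text):

```python
import math

# Partial sum of sum_{n>=0} 1/(2n+1)^2, the series representation (8.24)
# of the Leibniz integral, compared against pi^2/8.
s = sum(1.0 / (2 * n + 1) ** 2 for n in range(200000))
print(abs(s - math.pi**2 / 8) < 1e-5)  # -> True
```

The tail beyond N terms is of order 1/(4N) , so 200 000 terms give roughly six correct digits.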

The monotone convergence Theorem 8.45 allows, also, the computation of many other infinite series, a
few of which will be presented in the following examples. Let us start with the most elementary cases.

Example 8.49. This interesting example, of infinite series summation, is taken from [17]. We show that:
\[ \sum_{n=1}^{\infty} \frac{1}{n^2 + \frac{n}{2}} = 4 \, (1 - \ln 2) . \tag{8.25} \]
If a , b ≥ 0 , a ≠ b , consider the infinite series:
\[ S(a , b) = \sum_{n=1}^{\infty} \frac{1}{(n + a)(n + b)} . \]

Making use of partial fraction decomposition, we get:
\[ S(a , b) = \frac{1}{a - b} \sum_{n=1}^{\infty} \left( \frac{1}{n + b} - \frac{1}{n + a} \right) . \]
Recalling that, for any α > 0 , it holds:
\[ \frac{1}{\alpha} = \int_0^{\infty} e^{-\alpha x} \, dx , \]

we get further:
\[ S(a , b) = \frac{1}{a - b} \sum_{n=1}^{\infty} \int_0^{\infty} \left[ e^{-(n+b) x} - e^{-(n+a) x} \right] dx = \frac{1}{a - b} \sum_{n=1}^{\infty} \int_0^{\infty} e^{-n x} \left[ e^{-b x} - e^{-a x} \right] dx . \]
As in Example 8.48, the trick is to recognise the geometric series expansion (2.31) in the last integrand:
\[ \sum_{n=0}^{\infty} e^{-n x} = \sum_{n=0}^{\infty} \left( e^{-x} \right)^n = \frac{1}{1 - e^{-x}} , \quad \text{i.e.,} \quad \sum_{n=1}^{\infty} e^{-n x} = \frac{1}{1 - e^{-x}} - 1 = \frac{1}{e^x - 1} = \frac{e^{-x}}{1 - e^{-x}} , \]
and, then, use the monotone convergence Theorem 8.45, which yields:
\[ S(a, b) = \frac{1}{a - b} \int_0^{\infty} \left[ e^{-b x} - e^{-a x} \right] \frac{e^{-x}}{1 - e^{-x}} \, dx . \]
At this point, considering the change of variable t = e−x :
\[ S(a , b) = \frac{1}{a - b} \int_0^1 \frac{t^b - t^a}{1 - t} \, dt . \tag{8.26} \]
The left–hand side of (8.25) corresponds to forming $S(\tfrac{1}{2}, 0)$:
\[
\sum_{n=1}^{\infty} \frac{1}{n^2 + \frac{1}{2}\,n} = S(\tfrac{1}{2}, 0) = 2 \int_0^1 \frac{1-\sqrt{t}}{1-t}\,dt = 2 \int_0^1 \frac{1}{1+\sqrt{t}}\,dt .
\]

Finally, recovering the variable $t = u^2$ , with $dt = 2u\,du$:
\[
\sum_{n=1}^{\infty} \frac{1}{n^2 + \frac{1}{2}\,n} = 4 \int_0^1 \frac{u}{1+u}\,du .
\]
Equality (8.25) follows by computing the last integral.
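A quick numerical check of (8.25) can be sketched as follows (the truncation level is an arbitrary illustrative choice):

```python
import math

# Partial sum of (8.25): sum of 1/(n^2 + n/2) tends to 4 (1 - ln 2).
s = sum(1.0 / (n * n + 0.5 * n) for n in range(1, 100000))
closed = 4.0 * (1.0 - math.log(2.0))

print(s, closed)
```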
Remark 8.50. Consider again equation (8.26), but, this time, let $a = b > 0$ . In this case, the infinite series becomes [17]:
\[
S(a) := S(a, a) = \sum_{n=1}^{\infty} \frac{1}{(n+a)^2} .
\]
To evaluate it, we can repeat the argument of Example 8.49, starting from the definite integral:
\[
\frac{1}{\alpha^2} = \int_0^{\infty} x\,e^{-\alpha x}\,dx , \qquad \alpha > 0 .
\]
Making use of the geometric expansion (2.31) and the monotone convergence Theorem 8.45:
\[
S(a) = \sum_{n=1}^{\infty} \frac{1}{(n+a)^2} = \sum_{n=1}^{\infty} \int_0^{\infty} x\,e^{-(n+a)x}\,dx = \int_0^{\infty} x\,e^{-ax} \sum_{n=1}^{\infty} e^{-nx}\,dx = \int_0^{\infty} x\,\frac{e^{-ax}}{e^x - 1}\,dx .
\]
With the change of variable $e^{-x} = t$ , we arrive at the identity:
\[
S(a) = \sum_{n=1}^{\infty} \frac{1}{(n+a)^2} = \int_0^1 \frac{t^a \ln t}{t-1}\,dt . \tag{8.27}
\]
In particular, when $a = 0$:
\[
S(0) = \sum_{n=1}^{\infty} \frac{1}{n^2} = \int_0^1 \frac{\ln t}{t-1}\,dt . \tag{8.27a}
\]
Note the interesting comparison between (8.24) and (8.27a), which evaluate to $\frac{\pi^2}{8}$ and $\frac{\pi^2}{6}$ , respectively, as shown in formulæ (2.69) and (2.71).
8.10. PASSAGE TO THE LIMIT 145

8.10.2 Exercises

1. Using the definite integral $\int_0^1 \frac{1-x}{1-x^4}\,dx$ , show that:
\[
\sum_{n=0}^{\infty} \frac{1}{(4n+1)(4n+2)} = \frac{\pi}{8} + \frac{1}{4}\ln 2 .
\]
Hint. Use the partial fraction decomposition:
\[
\frac{1}{(1+x)(1+x^2)} = \frac{1-x}{2\,(1+x^2)} + \frac{1}{2\,(1+x)} .
\]

2. Using the definite integral $\int_0^1 \frac{x\,(1-x)}{1+x}\,dx$ , show that:
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{(n+1)(n+2)} = \frac{3}{2} - \ln 4 .
\]

3. Show that $\int_0^{\infty} \frac{\sin(a\,x)}{e^x - 1}\,dx = \sum_{n=1}^{\infty} \frac{a}{n^2 + a^2}$ , for any $a \in \mathbb{R}$ .
Hint. Use the equality:
\[
\int \sin(a\,x)\,e^{-n\,x}\,dx = -\,\frac{e^{-n\,x}\left( n \sin(a\,x) + a \cos(a\,x) \right)}{a^2 + n^2} .
\]

4. Show that $\int_0^{\infty} \frac{x^2}{e^x - 1}\,dx = \sum_{n=1}^{\infty} \frac{2}{n^3}$ .

5. Show that $\sum_{n=1}^{\infty} \frac{1}{n\left( n + \frac{1}{4} \right)} = 2\,(8 - \pi - 6 \ln 2)$ .
Hint: Use the partial fraction decomposition:
\[
\frac{x^3}{(x+1)(1+x^2)} = 1 - \frac{1+x}{2\,(1+x^2)} - \frac{1}{2\,(1+x)} .
\]

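The closed forms in Exercises 1 and 5 above can be checked numerically; a minimal sketch (truncation levels are arbitrary illustrative choices):

```python
import math

# Exercise 1: sum of 1/((4n+1)(4n+2)) tends to pi/8 + ln(2)/4.
s1 = sum(1.0 / ((4 * n + 1) * (4 * n + 2)) for n in range(10 ** 5))

# Exercise 5: sum of 1/(n (n + 1/4)) tends to 2 (8 - pi - 6 ln 2).
s5 = sum(1.0 / (n * (n + 0.25)) for n in range(1, 10 ** 6))

print(s1, math.pi / 8 + math.log(2) / 4)
print(s5, 2 * (8 - math.pi - 6 * math.log(2)))
```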
8.10.3 Dominated convergence


The monotone convergence Theorem 8.45 assumes that the given sequence of functions is increasing. In many circumstances, this hypothesis is not fulfilled; thus, it is important to identify other situations where the passage to the limit is still possible. The Beppo Levi result remains, however, the key to the proof of the Lebesgue dominated convergence Theorem 8.52. In order to state and prove such a theorem, we need an important preliminary result, known as the Fatou Lemma, which holds for sequences of positive functions. Note that the notion of lower limit, given in equation (8.10), plays a key role here.

Lemma 8.51 (Fatou). Let $(\Omega, \mathcal{A}, \mu)$ be a measure space, and let $(f_n)_n$ be a sequence of measurable positive functions on $\Omega$ . Then:
\[
\int_{\Omega} \liminf_{n\to\infty} f_n \,d\mu \le \liminf_{n\to\infty} \int_{\Omega} f_n \,d\mu .
\]

Proof. For $n \in \mathbb{N}$ , introduce the increasing sequence of measurable functions:
\[
g_n := \inf_{p \ge n} f_p .
\]
By definition, $(g_n)$ is such that:
\[
\lim_{n\to\infty} g_n(x) = \liminf_{n\to\infty} f_n(x) .
\]
From the monotone convergence Theorem 8.45, since $g_n(x) \le f_n(x)$ , we infer:
\[
\int_{\Omega} \liminf_{n\to\infty} f_n \,d\mu = \lim_{n\to\infty} \int_{\Omega} g_n \,d\mu \le \liminf_{n\to\infty} \int_{\Omega} f_n \,d\mu ,
\]
which completes the proof.

We are now in a position to state the dominated convergence Theorem 8.52, due to Lebesgue.

Theorem 8.52. Consider a measure space $(\Omega, \mathcal{A}, \mu)$ and let $(f_n)_n$ be a sequence of measurable functions such that:
\[
\lim_{n\to\infty} f_n(x) = f(x) . \tag{8.28}
\]
Assume that there exists a non–negative summable $g \in L(\Omega)$ such that, for any $x \in \Omega$ and any $n \in \mathbb{N}$:
\[
|f_n(x)| \le g(x) . \tag{8.29}
\]
Then, $f \in L(\Omega)$ , and:
\[
\lim_{n\to\infty} \int_{\Omega} f_n \,d\mu = \int_{\Omega} f \,d\mu . \tag{8.30}
\]

Proof. We follow the proof proposed in [52]. Condition (8.29) implies that $f_n + g \ge 0$ ; thus, Lemma 8.51 can be used to obtain the inequality:
\[
\int_A (f + g)\,d\mu \le \liminf_{n\to\infty} \int_A (f_n + g)\,d\mu .
\]
It can then be inferred:
\[
\int_A f \,d\mu \le \liminf_{n\to\infty} \int_A f_n \,d\mu . \tag{8.31}
\]
From (8.29) we also get $g - f_n \ge 0$ , and, again from the Fatou Lemma 8.51, we obtain:
\[
\int_A (g - f)\,d\mu \le \liminf_{n\to\infty} \int_A (g - f_n)\,d\mu .
\]
Thus:
\[
-\int_A f \,d\mu \le \liminf_{n\to\infty} \int_A (-f_n)\,d\mu ,
\]
that is equivalent to:
\[
\int_A f \,d\mu \ge \limsup_{n\to\infty} \int_A f_n \,d\mu . \tag{8.32}
\]
Thesis (8.30) follows from (8.31) and (8.32).

Remark 8.53. When µ is a complete measure (see Definition 8.6), the dominated convergence The-
orem 8.52 holds, also, when conditions (8.28) and (8.29) are true almost everywhere.
8. Pierre Joseph Louis Fatou (1878–1929), French mathematician and astronomer.

8.10.4 Exercises

1. Evaluate $\lim_{n\to+\infty} \int_0^{+\infty} \frac{e^{-(x+1)^n}}{\sqrt{x}}\,dx$ .

2. If $f : [0,1] \to \mathbb{R}$ is a continuous function, show that:
\[
\lim_{n\to+\infty} \int_0^1 f(t^n)\,dt = f(0) \qquad \text{and} \qquad \lim_{n\to+\infty}\, n \int_0^1 f(t)\,e^{-n\,t}\,dt = f(0) .
\]

3. Explain why the passage to the limit $\lim_{n\to\infty} \int_0^1 \frac{e^{-x}}{x^2 + n^2}\,dx$ is possible.

4. Evaluate $\lim_{n\to\infty} \int_0^{\infty} \frac{\arctan(n\,x)}{1+x^2}\,dx$ , explaining why it is possible to use the dominated convergence Theorem 8.52.

5. Justify the following passage to the limit: $\lim_{n\to\infty} \int_0^1 \frac{x^4}{x^2 + n^2}\,dx$ .

6. Consider the sequence of functions $f_n(x) = n\,x\,e^{-n\,x^2}$ , $x \in [0, +\infty)$ . Show that the thesis of Theorem 8.52 does not hold. Explain the reasons, evaluating $\sup_{x \in [0,+\infty)} f_n(x)$ .

7. Using either of the Theorems 2.15, or 8.45, or 8.52, evaluate the following limits:
\[
\text{(a)} \ \lim_{n\to\infty} \int_0^{\infty} n\,x\,e^{-n\,x^2}\,dx ; \qquad
\text{(b)} \ \lim_{n\to\infty} \int_0^1 \sqrt{x^2 + \frac{1}{n}}\,dx ; \qquad
\text{(c)} \ \lim_{n\to\infty} \int_0^{\pi} \frac{\sin(n\,x)}{n}\,dx ;
\]
\[
\text{(d)} \ \lim_{n\to\infty} \int_0^{\pi} \sin(n\,x)\,e^{-n\,x}\,dx ; \qquad
\text{(e)} \ \lim_{n\to\infty}\, n \int_0^1 x\,(1-x^2)^n\,dx .
\]

Solution to n. 1 of Exercises 8.10.4

From the Bernoulli inequality⁹:
\[
(1+x)^n \ge 1 + n\,x ,
\]
it can be inferred, since $x > 0$ and $n \ge 1$:
\[
(1+x)^n \ge 1 + n\,x \ge 1 + x > 0 , \quad \text{so that} \quad -(1+x)^n < -x ,
\]
and, then, a uniform estimate for the integrand can be formed:
\[
\frac{e^{-(1+x)^n}}{\sqrt{x}} \le \frac{e^{-x}}{\sqrt{x}} := \varphi(x) .
\]
Now, observing that $\varphi \in L(\mathbb{R})$ , the dominated convergence Theorem 8.52 can be used, and we can interchange the limit with the integral:
\[
\lim_{n\to+\infty} \int_0^{+\infty} \frac{e^{-(x+1)^n}}{\sqrt{x}}\,dx = \int_0^{+\infty} \lim_{n\to+\infty} \frac{e^{-(x+1)^n}}{\sqrt{x}}\,dx = 0 .
\]
9. See, for example, mathworld.wolfram.com/BernoulliInequality.html

8.10.5 A property of increasing functions

Here, we deal with the Lebesgue measure on the real line. To introduce this topic, we consider again the function f of Example 8.43, which is piecewise constant and almost everywhere differentiable on $[0,1]$ . Its derivative is zero for $x \neq \frac{n}{n+1}$ and for any $n \in \mathbb{N}$ , meaning that:
\[
\int_0^1 f'(x)\,dx = 0 .
\]
On the other hand, the considered function f verifies:
\[
0 = \int_0^1 f'(x)\,dx < f(1) - f(0) = x_\infty - x_1 . \tag{8.33}
\]
Note that (8.33) does not contradict the following celebrated identity¹⁰, which holds true for any function $g \in C^1([a,b])$:
\[
\int_a^b g'(x)\,dx = g(b) - g(a) .
\]
The following Theorem 8.54 explains the origin of inequality (8.33); to prove it, we follow [45].

Theorem 8.54. Let $f : [a,b] \to \mathbb{R}$ be an increasing function, that is to say, $x, y \in [a,b]$ , $x < y \implies f(x) \le f(y)$ . Define the set E as the subset of the interval $(a,b)$ on which f is differentiable. Then, $f' : E \to \mathbb{R}$ is summable and:
\[
\int_E f'(x)\,dx \le f(b) - f(a) . \tag{8.34}
\]
Proof. Recall that any increasing function can have, at most, a countable infinity of points of discontinuity, which are indeed jumps. Hence, an increasing function is almost everywhere differentiable and, then, measurable. Let us, now, extend the domain of $f(x)$ to the interval $[a, b+1]$ , defining:
\[
\breve{f}(x) = \begin{cases} f(x) & \text{if } x \in [a,b] , \\ f(b) & \text{if } x \in (b, b+1] . \end{cases}
\]
For simplicity, we keep the notation f to indicate $\breve{f}$ . As in Theorem 8.29, let us introduce the sequence of (measurable) functions $(\varphi_n)_{n\in\mathbb{N}}$ , defined for $x \in [a,b]$ as:
\[
\varphi_n(x) = \frac{f(x + \frac{1}{n}) - f(x)}{\frac{1}{n}} .
\]
By definition, if $x \in E$ , then $\varphi_n(x) \to f'(x)$ as $n \to \infty$ , meaning that $f'$ is measurable too. Now, observing that $\ell([a,b] - E) = 0$ , we have:
\[
\begin{aligned}
\int_E \varphi_n(x)\,dx &= \int_a^b \varphi_n(x)\,dx = n \left( \int_a^b f(x + \tfrac{1}{n})\,dx - \int_a^b f(x)\,dx \right) \\
&= n \left( \int_{a+\frac{1}{n}}^{b+\frac{1}{n}} f(x)\,dx - \int_a^b f(x)\,dx \right) \\
&= n \left( \int_b^{b+\frac{1}{n}} f(x)\,dx - \int_a^{a+\frac{1}{n}} f(x)\,dx \right) \\
&= n \int_b^{b+\frac{1}{n}} f(b)\,dx - n \int_a^{a+\frac{1}{n}} f(x)\,dx \\
&= f(b) - n \int_a^{a+\frac{1}{n}} f(x)\,dx \\
&\le f(b) - f(a) . 
\end{aligned} \tag{8.35}
\]
10. We refer to the Fundamental Theorem of Calculus; see, for example, mathworld.wolfram.com/FundamentalTheoremsofCalculus.html and the references therein.

The last inequality (8.35) follows from the monotonicity assumption on f. In fact, for any $x \in [a, b+1]$ , from $f(a) \le f(x)$ and integrating on $[a, a+\frac{1}{n}]$ , we obtain:
\[
n \int_a^{a+\frac{1}{n}} f(a)\,dx \le n \int_a^{a+\frac{1}{n}} f(x)\,dx \iff -f(a) \ge -n \int_a^{a+\frac{1}{n}} f(x)\,dx .
\]
At this point, using Fatou Lemma 8.51, and passing to the limit, we infer:
\[
\int_E f'(x)\,dx = \int_E \liminf_{n\to\infty} \varphi_n(x)\,dx \le \liminf_{n\to\infty} \int_E \varphi_n(x)\,dx \le f(b) - f(a) .
\]
The proof of Theorem 8.54 is thus completed.

Remark 8.55. The inequality in the thesis of Theorem 8.54 can be strict, as shown by the earlier
result (8.33).

8.11 Differentiation under the integral sign


Here, we consider the problem of differentiating an integral that depends on a parameter. The solution technique is powerful and allows the evaluation of many definite integrals which, otherwise, would be impossible to compute. The relevant theorems are stated (without proof), followed by several of their applications, to illustrate their importance.
Theorem 8.56. Consider the real intervals $]a,b[$ , $[\alpha,\beta] \subset \mathbb{R}$ , and a function $f : \,]a,b[\, \times [\alpha,\beta] \to \mathbb{R}$ , such that:

(i) for any $x \in \,]a,b[$ , the function $t \mapsto f(x,t)$ is summable in $[\alpha,\beta]$ ;

(ii) for almost any $t \in [\alpha,\beta]$ , the function $x \mapsto f(x,t)$ is differentiable in $]a,b[$ ;

(iii) there exists g summable in $[\alpha,\beta]$ such that, for any $x \in \,]a,b[$ and for almost any $t \in [\alpha,\beta]$:
\[
\left| \frac{\partial f}{\partial x}(x,t) \right| \le g(t) .
\]

Then:
\[
F(x) := \int_\alpha^\beta f(x,t)\,dt
\]
is differentiable and:
\[
F'(x) = \int_\alpha^\beta \frac{\partial f}{\partial x}(x,t)\,dt .
\]
It is also useful to state an alternative version of Theorem 8.56, which uses the continuity of the partial derivatives of $f(x,t)$ .

Theorem 8.57. Let $f : [a,b] \times [\alpha,\beta] \to \mathbb{R}$ be a continuous function, such that the partial derivative $\frac{\partial f}{\partial x}$ exists and is continuous on $[a,b] \times [\alpha,\beta]$ . Then:
\[
F(x) := \int_\alpha^\beta f(x,t)\,dt
\]
is differentiable and:
\[
F'(x) = \int_\alpha^\beta \frac{\partial f}{\partial x}(x,t)\,dt .
\]
Several examples follow, on the use of Theorems 8.56 and 8.57.
150 CHAPTER 8. LEBESGUE INTEGRAL

8.11.1 The probability integral (1)

The first application of the derivation of parametric integrals is devoted to the evaluation of the probability integral:
\[
\int_0^{\infty} e^{-x^2}\,dx .
\]
Let us introduce the functions:
\[
f_1(x) := \left( \int_0^x e^{-t^2}\,dt \right)^{\!2} \qquad \text{and} \qquad f_2(x) := \int_0^1 \frac{e^{-(1+t^2)\,x^2}}{1+t^2}\,dt .
\]
The derivative of $f_1$ is:
\[
f_1'(x) = 2\,e^{-x^2} \int_0^x e^{-t^2}\,dt ,
\]
while, for the derivative of $f_2$ , we employ Theorems 8.56 and 8.57, obtaining:
\[
f_2'(x) = -2\,x\,e^{-x^2} \int_0^1 e^{-x^2 t^2}\,dt .
\]
In this last integral, the change of variable $\tau = x\,t$ leads to:
\[
f_2'(x) = -2\,e^{-x^2} \int_0^x e^{-\tau^2}\,d\tau .
\]
Hence, the derivative of the function $f(x) = f_1(x) + f_2(x)$ is zero for any x , meaning that $f(x)$ is a constant function, with $f(x) = f(0)$ for all $x \in \mathbb{R}$ . And it is clear that:
\[
f(0) = \int_0^1 \frac{1}{1+t^2}\,dt = \frac{\pi}{4} .
\]
In other words, we have shown that, for any $x \in \mathbb{R}$:
\[
\left( \int_0^x e^{-t^2}\,dt \right)^{\!2} + \int_0^1 \frac{e^{-(1+t^2)\,x^2}}{1+t^2}\,dt = \frac{\pi}{4} .
\]
Now, using the dominated convergence Theorem 8.52:
\[
\lim_{x\to\infty} \left\{ \left( \int_0^x e^{-t^2}\,dt \right)^{\!2} + \int_0^1 \frac{e^{-(1+t^2)\,x^2}}{1+t^2}\,dt \right\} = \left( \int_0^{\infty} e^{-t^2}\,dt \right)^{\!2} = \frac{\pi}{4} .
\]
The value of the probability integral is, therefore:
\[
\int_0^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2} . \tag{8.36}
\]
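A numerical check of (8.36) can be sketched as follows (the truncation point x = 10 and the grid size are arbitrary illustrative choices):

```python
import math

# Midpoint rule for integral_0^10 of e^{-x^2}; the tail beyond x = 10
# is below e^{-100}, so the truncation error is negligible.
N = 100000
h = 10.0 / N
approx = sum(math.exp(-(((k + 0.5) * h) ** 2)) for k in range(N)) * h

print(approx, math.sqrt(math.pi) / 2)
```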
Remark 8.58. If the interval of integration is symmetric, $I = [-a, a]$ , with $a \in \mathbb{R} \cup \{\pm\infty\}$ , then:
\[
\int_{-a}^{a} f(t)\,dt = 0 \quad \text{for any integrable odd } f ,
\]
while
\[
\int_{-a}^{a} f(t)\,dt = 2 \int_0^a f(t)\,dt \quad \text{for any integrable even } f .
\]
Remark 8.59. From (8.36) and from the fact that $e^{-x^2}$ is an even function, we infer that:
\[
\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} . \tag{8.36a}
\]
In applications, the following formula is also useful:
\[
\int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1 . \tag{8.36b}
\]
8.11. DIFFERENTIATION UNDER THE INTEGRAL SIGN 151

Remark 8.60. From (8.36), integrating by parts, it follows:
\[
\int_0^{\infty} x^2\,e^{-x^2}\,dx = \frac{\sqrt{\pi}}{4} , \tag{8.36c}
\]
and, equivalently:
\[
\int_{-\infty}^{\infty} x^2\,e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2} . \tag{8.36d}
\]
Remark 8.61. Let $a > 0$ and $\Delta = b^2 - 4ac < 0$ , and consider the following identity, known as square completion¹¹:
\[
a\,x^2 + b\,x + c = a \left( x + \frac{b}{2a} \right)^{\!2} + c - \frac{b^2}{4a} .
\]
Recalling (8.36a), the Gaussian integral identities follow:
\[
\int_{-\infty}^{\infty} e^{-(a x^2 + b x + c)}\,dx = \sqrt{\frac{\pi}{a}}\; e^{\frac{b^2}{4a} - c} , \tag{8.37}
\]
\[
\int_{-\infty}^{\infty} x\,e^{-(a x^2 + b x + c)}\,dx = -\sqrt{\frac{\pi}{a^3}}\;\frac{b}{2}\; e^{\frac{b^2}{4a} - c} , \tag{8.38}
\]
\[
\int_{-\infty}^{\infty} x^2\,e^{-(a x^2 + b x + c)}\,dx = \sqrt{\frac{\pi}{a^5}}\;\frac{2a + b^2}{4}\; e^{\frac{b^2}{4a} - c} . \tag{8.39}
\]
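The three identities (8.37)–(8.39) can be verified numerically for sample coefficients; a minimal sketch (the values of a, b, c are arbitrary, subject only to a > 0):

```python
import math

a, b, c = 1.5, -0.7, 0.3  # sample coefficients with a > 0

def moment(p, lo=-30.0, hi=30.0, n=200000):
    """Midpoint rule for the integral of x^p exp(-(a x^2 + b x + c))."""
    h = (hi - lo) / n
    s = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        s += x ** p * math.exp(-(a * x * x + b * x + c))
    return s * h

E = math.exp(b * b / (4 * a) - c)
i0 = math.sqrt(math.pi / a) * E                             # (8.37)
i1 = -math.sqrt(math.pi / a ** 3) * (b / 2) * E             # (8.38)
i2 = math.sqrt(math.pi / a ** 5) * (2 * a + b * b) / 4 * E  # (8.39)

print(moment(0), i0)
print(moment(1), i1)
print(moment(2), i2)
```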

8.11.2 The probability integral (2)

This second proof of (8.36a) is contained in [59] and, again, it is based on Theorems 8.56 and 8.57 for the derivation of parametric integrals. Assuming that the value of the probability integral is unknown, define:
\[
G := \int_{-\infty}^{\infty} e^{-x^2}\,dx = 2 \int_0^{\infty} e^{-x^2}\,dx , \tag{8.40}
\]
and define further, for $x \ge 0$:
\[
F(x) := \int_{-\infty}^{\infty} \frac{e^{-x\,(1+t^2)}}{1+t^2}\,dt . \tag{8.41}
\]
Observe that $F(x) \le \pi\,e^{-x}$ . In fact:
\[
F(x) = e^{-x} \int_{-\infty}^{\infty} \frac{e^{-x\,t^2}}{1+t^2}\,dt \le e^{-x} \int_{-\infty}^{\infty} \frac{1}{1+t^2}\,dt = \pi\,e^{-x} .
\]
Hence:
\[
\lim_{x\to\infty} F(x) = 0 .
\]
Using Theorems 8.56 and 8.57, for the derivation under the integral sign:
\[
F'(x) = -\int_{-\infty}^{\infty} e^{-x\,(1+t^2)}\,dt = -e^{-x} \int_{-\infty}^{\infty} e^{-x\,t^2}\,dt .
\]
Now, with the change of variable $t = \frac{1}{\sqrt{x}}\,z$:
\[
F'(x) = -\frac{e^{-x}}{\sqrt{x}} \int_{-\infty}^{\infty} e^{-z^2}\,dz = -\frac{e^{-x}}{\sqrt{x}}\,G .
\]
Then, integrating on $[x, \infty)$:
\[
\int_x^{\infty} F'(s)\,ds = -G \int_x^{\infty} \frac{e^{-s}}{\sqrt{s}}\,ds ,
\]
11. See, for example, mathworld.wolfram.com/CompletingtheSquare.html

that is:
\[
F(x) = G \int_x^{\infty} \frac{e^{-s}}{\sqrt{s}}\,ds . \tag{8.42}
\]
From (8.41) it follows that $F(0) = \pi$ , while (8.42), evaluated at $x = 0$ with the substitution $s = u^2$ , implies that $F(0) = G^2$ . Thus $G = \sqrt{\pi}$ , which means that (8.36a) is re-established.

8.11.3 Exercises

1. Consider $F(x) = \int_0^{+\infty} e^{-x\,t}\,\frac{\sin t}{t}\,dt$ , where $x > 0$ . Show that:

(a) $F'(x) = -\dfrac{1}{1+x^2}$ ;

(b) $\lim_{x\to+\infty} F(x) = 0$ ;

(c) $F(x) = \dfrac{\pi}{2} - \arctan x$ ;

(d) Dirichlet integral:
\[
\int_0^{+\infty} \frac{\sin t}{t}\,dt = \frac{\pi}{2} . \tag{8.43}
\]

2. Using Theorem 8.56, for differentiation under the integral sign, show that, for $x \ge 0$:
\[
\int_0^1 \frac{t^x - 1}{\ln t}\,dt = \ln(1+x) .
\]

3. For $x > 0$ consider the function:
\[
F(x) = \int_0^{+\infty} \frac{\arctan t - \arctan(t\,x)}{t}\,dt .
\]
Explain why it is possible to differentiate under the integral; then, show that:
\[
F(x) = -\frac{\pi}{2}\,\ln x .
\]

4. For $x > 0$ consider the function:
\[
F(x) = \int_0^{+\infty} \frac{e^{-t} - e^{-t\,x}}{t}\,dt .
\]
Explain why it is possible to differentiate under the integral sign; then, show that:
\[
F(x) = \ln x .
\]

5. Consider the function, defined for $x > -1$:
\[
F(x) = \int_0^{\infty} \frac{\arctan(t\,x)}{t^3 + t}\,dt .
\]
Explain why it is possible to differentiate under the integral sign; then, employ the partial fraction decomposition:
\[
\frac{1}{(1+t^2)(1+t^2 x^2)} = \frac{1}{1-x^2} \left( \frac{1}{1+t^2} - \frac{x^2}{1+t^2 x^2} \right) ,
\]
to show that:
\[
F(x) = \frac{\pi}{2}\,\ln(1+x) .
\]

6. Using the relation:
\[
\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} ,
\]
and using the derivation of parametric integrals, show that, for any $x \in \mathbb{R}$:
\[
\int_{-\infty}^{\infty} e^{-t^2} \cos(t\,x)\,dt = \sqrt{\pi}\;e^{-x^2/4} .
\]
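The closed form of Exercise 1(c) can be checked numerically at a sample point; a minimal sketch (the value x = 0.8, the cutoff and the grid are arbitrary illustrative choices):

```python
import math

# Midpoint rule for F(x) = integral_0^inf e^{-x t} sin(t)/t dt at x = 0.8;
# the integrand extends continuously with value 1 at t = 0, and the tail
# beyond t = 60 is bounded by e^{-48}/0.8, hence negligible.
x = 0.8
N = 400000
h = 60.0 / N
F = sum(math.exp(-x * t) * math.sin(t) / t
        for t in ((k + 0.5) * h for k in range(N))) * h

print(F, math.pi / 2 - math.atan(x))
```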
8.11. DIFFERENTIATION UNDER THE INTEGRAL SIGN 153

Solutions to n. 1 of Exercises 8.11.3

(a) Observe, first, that, since we assumed $x, t > 0$ , then:
\[
\left| \frac{\partial f}{\partial x} \right| = \left| -e^{-x\,t} \sin t \right| \le e^{-x\,t} \le 1 .
\]
This means that we can differentiate under the integral sign, and, integrating by parts, we obtain:
\[
F'(x) = \int_0^{\infty} -e^{-x\,t} \sin t \,dt = -\frac{1}{1+x^2} .
\]

(b) Make use of the fact that $x > 0$ , and that $\left| \frac{\sin t}{t} \right| \le 1$ for any $t \in \mathbb{R}$ , to form some estimates:
\[
|F(x)| = \left| \int_0^{\infty} e^{-x\,t}\,\frac{\sin t}{t}\,dt \right| \le \int_0^{\infty} e^{-x\,t}\,dt = \frac{1}{x} .
\]
Now, taking the limit as $x \to \infty$:
\[
\lim_{x\to\infty} \left| \int_0^{\infty} e^{-x\,t}\,\frac{\sin t}{t}\,dt \right| \le \lim_{x\to\infty} \frac{1}{x} = 0 .
\]

(c) Integrating the formula obtained in (a), we get:
\[
F(x) = -\int \frac{1}{1+x^2}\,dx + C = C - \arctan x ,
\]
where C is some integration constant. Recalling (b) and taking the limit as $x \to \infty$:
\[
0 = \lim_{x\to\infty} F(x) = C - \arctan(\infty) = C - \frac{\pi}{2} .
\]
It follows that $C = \frac{\pi}{2}$ and $F(x) = \frac{\pi}{2} - \arctan x$ .

(d) At this point, it is immediate to recognize that:
\[
\int_0^{\infty} \frac{\sin t}{t}\,dt = F(0) = \frac{\pi}{2} .
\]
The above integral, given in (8.43), is of great importance and it is known as the Dirichlet integral.

Solutions to n. 2 of Exercises 8.11.3

Recall, first, that:
\[
\frac{d}{dx}\,t^x = \frac{d}{dx}\,e^{x \ln t} = \ln t \cdot t^x .
\]
Then, set $f(x,t) = \frac{t^x - 1}{\ln t}$ , so that:
\[
\frac{\partial f}{\partial x}(x,t) = \frac{\ln t \cdot t^x}{\ln t} = t^x .
\]
The integration is for $t \in [0,1]$ , and $x \ge 0$ , meaning that $\left| \frac{\partial f}{\partial x} \right| \le 1$ , and allowing the derivation under the integral sign. Thus:
\[
\frac{d}{dx} \int_0^1 \frac{t^x - 1}{\ln t}\,dt = \int_0^1 \frac{d}{dx}\,\frac{t^x - 1}{\ln t}\,dt = \int_0^1 t^x\,dt = \left[ \frac{t^{1+x}}{1+x} \right]_{t=0}^{t=1} = \frac{1}{1+x} .
\]
Integrating with respect to x :
\[
\int_0^1 \frac{t^x - 1}{\ln t}\,dt = \ln(1+x) + K .
\]
To detect the integration constant K , it suffices to take $x = 0$:
\[
0 = \int_0^1 \frac{t^0 - 1}{\ln t}\,dt = \ln(1) + K \implies K = 0 .
\]
In conclusion:
\[
\int_0^1 \frac{t^x - 1}{\ln t}\,dt = \ln(1+x) .
\]
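The result can also be checked numerically at a sample point; a minimal sketch (x = 2.5 and the grid size are arbitrary illustrative choices):

```python
import math

# Midpoint rule for integral_0^1 (t^x - 1)/ln(t) dt; the integrand
# extends continuously with value 0 at t = 0 and value x at t = 1.
x = 2.5
N = 200000
h = 1.0 / N
val = sum((t ** x - 1.0) / math.log(t)
          for t in ((k + 0.5) * h for k in range(N))) * h

print(val, math.log(1 + x))
```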

8.12 Basel problem again

We show, here, that differentiation of parametric integrals can be used to provide an alternative solution to the Basel problem, presented in § 2.7 and solved in (2.71), as well as in (8.22) of Example 8.48. As already observed in § 2.7, it is sufficient to demonstrate the equivalent formula (2.69); the proof given here follows [42]. Let us define, for $0 \le x \le 1$:
\[
F(x) = \int_0^{\frac{\pi}{2}} \arcsin(x \sin t)\,dt .
\]
F(x) is continuous for $x \in [0,1]$ , and it can be differentiated for $0 \le x < 1$:
\[
F'(x) = \int_0^{\frac{\pi}{2}} \frac{\sin t}{\sqrt{1 - x^2 \sin^2 t}}\,dt .
\]
Now, the change of variable $\cos t = u$ yields:
\[
F'(x) = \int_0^1 \frac{du}{\sqrt{1 - x^2 (1 - u^2)}} = \frac{1}{x} \int_0^1 \frac{du}{\sqrt{\frac{1-x^2}{x^2} + u^2}} .
\]
Recalling the integration formula:
\[
\int \frac{du}{\sqrt{a^2 + u^2}} = \ln\left( u + \sqrt{a^2 + u^2} \right) ,
\]
we find out:
\[
F'(x) = \frac{1}{x} \left[ \ln\left( u + \sqrt{\frac{1-x^2}{x^2} + u^2} \right) \right]_{u=0}^{u=1} .
\]
Performing the relevant computations leads to:
\[
F'(x) = \frac{1}{x} \left( \ln\left( 1 + \frac{1}{x} \right) - \ln\left( \frac{\sqrt{1-x^2}}{x} \right) \right)
       = \frac{1}{x} \ln\left( \frac{1+x}{\sqrt{1-x^2}} \right) = \frac{1}{2x} \ln\left( \frac{1+x}{1-x} \right) = \sum_{n=0}^{\infty} \frac{x^{2n}}{2n+1} .
\]
The power series (2.36) was used in the very last step. Now, since $F(0) = 0$ , we have, for $0 \le x < 1$:
\[
F(x) = \int_0^x F'(t)\,dt = \int_0^x \sum_{n=0}^{\infty} \frac{t^{2n}}{2n+1}\,dt .
\]
For $0 \le x < 1$ , we thus have:
\[
F(x) = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)^2} .
\]
But F(x) is continuous in $x = 1$ too, therefore:
\[
\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = F(1) = \int_0^{\frac{\pi}{2}} \arcsin(\sin t)\,dt = \frac{\pi^2}{8} .
\]
In other words, we proved (2.69), obtaining, once more, the solution to the Basel problem.
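The intermediate identity for F(x) can itself be checked numerically at a sample point 0 ≤ x < 1; a minimal sketch (the value x = 0.6 and the resolutions are arbitrary illustrative choices):

```python
import math

# Midpoint rule for F(x) = integral_0^{pi/2} arcsin(x sin t) dt at x = 0.6,
# against the power series sum of x^(2n+1)/(2n+1)^2.
x = 0.6
N = 100000
h = (math.pi / 2) / N
F = sum(math.asin(x * math.sin((k + 0.5) * h)) for k in range(N)) * h

series = sum(x ** (2 * n + 1) / (2 * n + 1) ** 2 for n in range(200))

print(F, series)
```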

8.13 Debye integral

We compute a family of definite integrals, introduced by Debye¹², in a physical chemistry model: while his focus was on the calculation of heat capacity, depending on a parameter, his model has relevance in Mathematical Finance too. The family of Debye integrals is:
\[
D_m(x) = \int_0^x \frac{t^m}{e^t - 1}\,dt , \qquad x > 0 ; \tag{8.44}
\]
to ensure the convergence, we have to assume $m > 0$ . Let us restrict our attention to the particular case $m = 1$ . Taking into account the power series expansion (2.33), we introduce a generalization of the logarithmic function, $\operatorname{Li}_s(x)$ , of order s and argument x , known as the polylogarithm:
\[
\operatorname{Li}_s(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^s} . \tag{8.45}
\]
The case $s = 2$ , in particular, was introduced by L. Euler in 1768 and it is called the dilogarithm:
\[
\operatorname{Li}_2(x) := \sum_{n=1}^{\infty} \frac{x^n}{n^2} = -\int_0^x \frac{\ln(1-t)}{t}\,dt . \tag{8.45a}
\]
We are now in a position to evaluate the Debye integral $D_1$ . When $t > 0$ , we can write:
\[
\frac{t}{e^t - 1} = t\,e^{-t}\,\frac{1}{1 - e^{-t}} = t\,e^{-t} \sum_{n=0}^{\infty} e^{-n t} = \sum_{n=0}^{\infty} t\,e^{-(n+1)t} .
\]
Using the monotone convergence Theorem 8.45 and integrating:
\[
\begin{aligned}
D_1(x) &= \sum_{n=0}^{\infty} \int_0^x t\,e^{-(n+1)t}\,dt = \sum_{n=0}^{\infty} \frac{1 - \left( 1 + (n+1)\,x \right) e^{-(n+1)x}}{(n+1)^2} \\
&= \sum_{n=1}^{\infty} \frac{1 - (1 + n\,x)\,e^{-n x}}{n^2} = \sum_{n=1}^{\infty} \frac{1}{n^2} - \sum_{n=1}^{\infty} \frac{e^{-n x}}{n^2} - x \sum_{n=1}^{\infty} \frac{e^{-n x}}{n} \\
&= \frac{\pi^2}{6} - \operatorname{Li}_2(e^{-x}) + x \ln(1 - e^{-x}) .
\end{aligned}
\]
To study the general case of order $m > 0$ , it is necessary to use a special function, the Euler Gamma function, that is introduced in Chapter 11.
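The closed form for D₁ can be checked numerically at a sample point, summing the dilogarithm from its defining series; a minimal sketch (x = 2 and the resolutions are arbitrary illustrative choices):

```python
import math

# Midpoint rule for D_1(x) = integral_0^x t/(e^t - 1) dt at x = 2.
x = 2.0
N = 200000
h = x / N
D1 = sum(t / math.expm1(t) for t in ((k + 0.5) * h for k in range(N))) * h

# Right-hand side: pi^2/6 - Li_2(e^{-x}) + x ln(1 - e^{-x}).
li2 = sum(math.exp(-x) ** n / n ** 2 for n in range(1, 200))
closed = math.pi ** 2 / 6 - li2 + x * math.log(1 - math.exp(-x))

print(D1, closed)
```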

12. Peter Joseph William Debye (1884–1966), Dutch–American physicist and physical chemist, and Nobel laureate in Chemistry in 1936.
9 Radon–Nikodym theorem

9.1 Signed measures

This § 9.1 constitutes a brief account on the topic of signed measures. For further details we refer to [51].

Definition 9.1. Let A be a σ–algebra on Ω . A signed measure, or charge, is any set function ν , defined on A , such that:

(i) ν can take at most one of the values −∞ or +∞ ;

(ii) ν(∅) = 0 ;

(iii) for any sequence $(E_n)_n$ of measurable disjoint sets, it holds:
\[
\nu\!\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \nu(E_n) . \tag{9.1}
\]

Remarks 9.2.

• In Definition 9.1, equality (iii) means that, if the measure in the left hand–side of (9.1) is finite, then the infinite series in the right hand–side of (9.1) converges absolutely; otherwise, such a series diverges.

• As a consequence of Definition 9.1, a (positive) measure is a signed measure; viceversa, a signed measure may not be a measure.

Definition 9.3.

(i) A set P ∈ A is a positive set, with respect to the signed measure ν , if it holds ν(E) ≥ 0 for any subset E ⊂ P , E ∈ A .

(ii) A set N ∈ A is a negative set, with respect to the signed measure ν , if it holds ν(E) ≤ 0 for any subset E ⊂ N , E ∈ A .

(iii) If a set A is both positive and negative, with respect to the signed measure ν , then A is called a null set.

Remarks 9.4.

• Any measurable subset of a positive set is a positive set.

• The restriction of ν to positive sets is a measure.

Remark 9.5. Note that null sets, for a signed measure, are different from sets of measure zero, for a positive measure. This becomes evident if we observe that, for a null set A , by definition, it holds ν(A) = 0 , which means that A may be the union of two non–empty subsets of opposite charge. For instance, in R , the set function:
\[
\nu(A) = \int_A x\,dx
\]
is a signed measure, defined on the σ–algebra of Lebesgue measurable sets, and:
\[
\nu([-2,0]) = \int_{-2}^{0} x\,dx = -2 , \qquad \nu([0,2]) = \int_0^2 x\,dx = 2 , \qquad \nu([-2,2]) = \int_{-2}^{2} x\,dx = 0 .
\]

The Hahn¹ decomposition Theorem 9.6 explains the behaviour of signed measures.

Theorem 9.6 (Hahn). Let ν be a signed measure in the measure space (Ω , A) . Then, there exist a positive set P and a negative set N such that:
\[
P \cap N = \emptyset \qquad \text{and} \qquad P \cup N = \Omega .
\]
The decomposition of Ω into a positive set P and a negative set N is called a Hahn decomposition for the signed measure ν . Such a decomposition is not unique.
Remark 9.7. Denote by {P , N} a Hahn decomposition of the charge ν . Then, it is possible to define two positive measures ν⁺ and ν⁻ as follows:
\[
\nu^+(E) = \nu(E \cap P) , \qquad \nu^-(E) = -\nu(E \cap N) .
\]
The positive measures ν⁺ and ν⁻ turn out to be mutually singular, as they comply with the following Definition 9.8.
Definition 9.8. Two measures ν1 and ν2 , defined on (Ω , A) , are called mutually singular, and
denoted ν1 ⊥ ν2 , if there exist two measurable disjoint sets A and B such that Ω = A ∪ B and:
ν1 (A) = ν2 (B) = 0 .
The various results on charge measure and Hahn decomposition are contained in the following Theorem 9.9, due to Jordan².

Theorem 9.9 (Jordan decomposition). Let ν be a charge, defined on the measure space (Ω , A) . Then, there are exactly two positive measures ν⁺ and ν⁻ defined on (Ω , A) and such that:
\[
\nu = \nu^+ - \nu^- .
\]
ν⁺ and ν⁻ are called, respectively, positive and negative variation of ν . Since, by definition, ν can take at most one of the values −∞ or +∞ , then at least one of the two variations must be finite. If both variations are finite, then ν is a finite signed measure.
Remark 9.10. A consequence of Theorem 9.9 is that a new measure, denoted with the symbol |ν| , can be defined:
\[
|\nu|(E) := \nu^+(E) + \nu^-(E) .
\]
This measure is called the absolute value or total variation of ν . It is possible to show that:
\[
|\nu|(E) = \sup \int_E f\,d\nu ,
\]
where the supremum is taken over all measurable functions f that verify |f| ≤ 1 everywhere.

Moreover, the following properties hold, for any E ∈ A :
1. Hans Hahn (1879–1934), Austrian mathematician.
2. Marie Ennemond Camille Jordan (1838–1922), French mathematician.

(i) E is positive if and only if ν⁻(E) = 0 ;

(ii) E is negative if and only if ν⁺(E) = 0 ;

(iii) E is a null set if and only if |ν|(E) = 0 .

Definition 9.11. The integral of a function f , with respect to a signed measure ν , is defined by:
\[
\int f\,d\nu := \int f\,d\nu^+ - \int f\,d\nu^- , \tag{9.2}
\]
where we have to assume that f is measurable with respect to both ν⁺ and ν⁻ , and that the integrals in the right–hand side of (9.2) are not both infinite.

Example 9.12. Consider $f(x) = (x-1)\,e^{-|x-1|}$ , and define the signed measure on the Lebesgue measurable sets in R :
\[
\nu(A) = \int_A (x-1)\,e^{-|x-1|}\,dx .
\]
We wish to evaluate ν⁺(R) , ν⁻(R) , |ν|(R) , and:
\[
\int_{[0,+\infty)} \frac{1}{x-1}\,d\nu .
\]
From the Hahn decomposition Theorem 9.6, being ν a signed measure, we know that there exist two sets P and N , positive and negative, respectively, such that P ∩ N = ∅ , P ∪ N = R , and such that ν⁺(E) = ν(E ∩ P) and ν⁻(E) = −ν(E ∩ N) .
Observe that f(x) ≥ 0 in [1 , +∞) , while f(x) < 0 on (−∞ , 1) . Hence, we can choose, for instance, P = [1 , +∞) and N = (−∞ , 1) . Even though P and N are not uniquely determined, ν⁺ and ν⁻ do not change. We then obtain:
\[
\nu^+(\mathbb{R}) = \nu(\mathbb{R} \cap P) = \int_1^{\infty} (x-1)\,e^{-|x-1|}\,dx = \int_1^{\infty} (x-1)\,e^{-(x-1)}\,dx = \int_0^{\infty} u\,e^{-u}\,du = 1 ,
\]
\[
\nu^-(\mathbb{R}) = -\nu(\mathbb{R} \cap N) = -\int_{-\infty}^{1} (x-1)\,e^{-|x-1|}\,dx = -\int_{-\infty}^{1} (x-1)\,e^{x-1}\,dx = -\int_{-\infty}^{0} u\,e^{u}\,du = -(-1) = 1 ,
\]
\[
|\nu|(\mathbb{R}) = \nu^+(\mathbb{R}) + \nu^-(\mathbb{R}) = 2 .
\]
Finally, from equality (9.2), that defines an integral with respect to a signed measure, and recalling that ν⁻ has density −f on N , we get:
\[
\int_{[0,+\infty)} \frac{1}{x-1}\,d\nu = \int_{[0,+\infty)\cap P} \frac{1}{x-1}\,d\nu^+ - \int_{[0,+\infty)\cap N} \frac{1}{x-1}\,d\nu^-
= \int_1^{+\infty} e^{-(x-1)}\,dx + \int_0^1 e^{x-1}\,dx = 1 + \frac{e-1}{e} = 2 - \frac{1}{e} .
\]

9.2 Radon–Nikodym theorem

This § 9.2 illustrates the Radon–Nikodym³ Theorem 9.16, which has great implications in Probability theory and in Mathematical Finance, as explained, for example, in the ‘Applications’ section of [18]. In some sense, Theorem 9.16 represents the converse of the generation of measures Theorem 8.44. From the latter theorem, in fact, and from the monotone convergence Theorem 8.45, a result of great interest in mathematical probability can be inferred. For easiness, we recall the notations of Theorem 8.44: $(\Omega, \mathcal{A}, \mu)$ denotes a measure space, and $f : \Omega \to [0, +\infty]$ is a non–negative and measurable function. Then, we know that $\phi(A) := \int_A f\,d\mu$ defines a measure on $(\Omega, \mathcal{A}, \mu)$ , for any $A \in \mathcal{A}$ .

Theorem 9.13. Let $f : \Omega \to [0, +\infty]$ be µ–measurable, and let further $g : \Omega \to [0, +\infty]$ be φ–measurable. Then:
\[
\int_{\Omega} g\,d\phi = \int_{\Omega} g\,f\,d\mu . \tag{9.3}
\]
We refer to f as the density of the measure φ with respect to the measure µ . In particular, if f is such that $\int f\,d\mu = 1$ , then φ is a probability measure.

Proof. Theorem 8.44 implies that (9.3) holds, in particular, for $g = 1_E$ , with E measurable, i.e., $E \in \mathcal{A}$ . Hence, (9.3) also holds, by linear combination, for any simple function. The general case finally follows, using a passage to the limit, from the monotone convergence Theorem 8.45, in an analogous way to that illustrated in Remark 8.31.

Remark 9.14. If measure φ is built from f and µ , as shown in Theorem 9.13, then, if a set is negligible for µ , it is also negligible for φ , that is: µ(E) = 0 ⟹ φ(E) = 0 .

The situation illustrated in Remark 9.14 is formalised in the following Definition 9.15.

Definition 9.15. Given two measures φ and µ , on the same σ–algebra A , we say that φ is absolutely continuous, with respect to µ , if:
\[
\mu(E) = 0 \implies \phi(E) = 0 ,
\]
and we use the notation: φ ≪ µ .
So far, we have shown that, if measure φ is obtained from measure µ by integrating a positive measurable function f , then φ is absolutely continuous with respect to µ . It is possible to reverse this process: if measure φ is absolutely continuous with respect to measure µ , then, under certain essential hypotheses, it is possible to represent φ as an integral, with respect to µ , of a certain function. Such hypotheses are given in the Radon–Nikodym Theorem 9.16.
Theorem 9.16 (Radon–Nikodym). Let $(\Omega, \mathcal{A}, \mu)$ be a σ–finite measure space, and let φ be a measure, on A , absolutely continuous with respect to µ . Then, there exists a non–negative measurable function h such that, for any E ∈ A :
\[
\phi(E) = \int_E h\,d\mu .
\]
This function h is almost everywhere unique; it is called the Radon–Nikodym derivative of φ with respect to µ , and it is denoted by:
\[
h = \frac{d\phi}{d\mu} .
\]
Moreover, for any function $f \ge 0$ and φ–measurable:
\[
\int_{\Omega} f\,d\phi = \int_{\Omega} f\,h\,d\mu .
\]
3. Johann Karl August Radon (1887–1956), Austrian mathematician. Otto Marcin Nikodym (1887–1974), Polish mathematician.

For the proof, see § 6.9 of [53]. The Radon–Nikodym derivative, i.e., the function h which expresses the change of measure, has some interesting properties, stated below.

(a) If φ ≪ µ and if ϕ is a non–negative and measurable function, then: $\int \varphi\,d\phi = \int \varphi\,\frac{d\phi}{d\mu}\,d\mu$ .

(b) For any $\phi_1, \phi_2$ it holds: $\dfrac{d(\phi_1 + \phi_2)}{d\mu} = \dfrac{d\phi_1}{d\mu} + \dfrac{d\phi_2}{d\mu}$ .

(c) If φ ≪ µ ≪ λ , then $\dfrac{d\phi}{d\lambda} = \dfrac{d\phi}{d\mu}\,\dfrac{d\mu}{d\lambda}$ .

(d) If ν ≪ µ and µ ≪ ν , then $\dfrac{d\nu}{d\mu} = \left( \dfrac{d\mu}{d\nu} \right)^{-1}$ .
10 Multiple integrals

10.1 Integration in R²

In this § 10.1, we expose the theoretical process that extends the Lebesgue measure from R to the plane R² . Such a process also provides a method to compute the two–dimensional Lebesgue measure of sets in the plane, using the one–dimensional measure on the line. The basic step is the notion of section of a set. Due to our applicative commitment, most of the theorems are stated, but not demonstrated; for their proofs, refer to [51] and [6].

Definition 10.1. Let $A \subset \mathbb{R}^2$ and fix $x \in \mathbb{R}$ . Then, the section of foot x of A is defined as the subset $A_x$ of $\mathbb{R}$:
\[
A_x := \{ y \in \mathbb{R} \mid (x,y) \in A \} .
\]
Viceversa, given $y \in \mathbb{R}$ , the section of foot y of A is:
\[
A^y := \{ x \in \mathbb{R} \mid (x,y) \in A \} .
\]

Figure 10.1 illustrates the set sections Ax , Ay , for given A , x , y .

Ax A
y Ay

Figure 10.1: Sections of A , respectively of of foot x (left) and y (right).

The following Theorem 10.2 characterizes the measurability of set sections.

Theorem 10.2. Let $A \subset \mathbb{R}^2$ be measurable. Then, the following results hold.

(I) Given $x \in \mathbb{R}$ , the section $A_x$ is almost everywhere measurable. Moreover, the function $x \mapsto \ell(A_x)$ is measurable and:
\[
\ell_2(A) = \int_{\mathbb{R}} \ell(A_x)\,dx ,
\]
where $\ell_2$ denotes the Lebesgue measure in $\mathbb{R}^2$ .

(II) Given $y \in \mathbb{R}$ , the section $A^y$ is almost everywhere measurable. Moreover, the function $y \mapsto \ell(A^y)$ is measurable and:
\[
\ell_2(A) = \int_{\mathbb{R}} \ell(A^y)\,dy ,
\]
where, again, $\ell_2$ is the Lebesgue measure in $\mathbb{R}^2$ .

Example 10.3. From Theorem 10.2 it is immediate to find the measure of a circle, i.e., its area. Given the unit circle $A = \{ (x,y) \mid x^2 + y^2 \le 1 \}$ , in fact, we see that its $\ell_2$ measure can be found as illustrated in Figure 10.2. Fixed $x \in \mathbb{R}$ , the section of foot x is given by $A_x = \left[ -\sqrt{1-x^2}\,,\, \sqrt{1-x^2} \right]$ , if $-1 \le x \le 1$ , while it is $A_x = \emptyset$ elsewhere. Therefore:
\[
\ell_2(A) = \int_{-1}^{1} \ell(A_x)\,dx = 4 \int_0^1 \sqrt{1-x^2}\,dx = 4\,\frac{\pi}{4} = \pi .
\]

Figure 10.2: Application of Theorem 10.2 to the computation of the unit circle area.
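The section-based computation of Example 10.3 can be mimicked numerically; a minimal sketch (grid size is an arbitrary illustrative choice):

```python
import math

# l2(A) = integral_{-1}^{1} l(A_x) dx, with section length 2 sqrt(1 - x^2).
N = 400000
h = 2.0 / N
area = sum(2.0 * math.sqrt(max(0.0, 1.0 - x * x))
           for x in ((-1.0 + (k + 0.5) * h) for k in range(N))) * h

print(area, math.pi)
```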

The following theorem is known as the Fubini¹ Theorem [24, 25]: it rules the evaluation of integrals, with respect to the two–dimensional Lebesgue measure, establishing the method of nested integration.

Theorem 10.4 (Fubini). Let $A \subset \mathbb{R}^2$ be a measurable set and let $f \in L(A)$ .

(I) Denote with $S_0$ the null set where section $A_x$ is not measurable, and define S to be the subset of R where the y–sections of A have positive measure:
\[
S = \{ x \in \mathbb{R} \setminus S_0 \mid \ell(A_x) > 0 \} .
\]
Then, the function $y \mapsto f(x,y)$ is summable on $A_x$ , and it holds true that:
\[
\int_A f(x,y)\,dx\,dy = \int_S \left( \int_{A_x} f(x,y)\,dy \right) dx . \tag{10.1}
\]

(II) Denote with $T_0$ the null set where section $A^y$ is not measurable, and define T to be the subset of R where the x–sections of A have positive measure:
\[
T = \{ y \in \mathbb{R} \setminus T_0 \mid \ell(A^y) > 0 \} .
\]
Then, the function $x \mapsto f(x,y)$ is summable on $A^y$ , and the following holds true:
\[
\int_A f(x,y)\,dx\,dy = \int_T \left( \int_{A^y} f(x,y)\,dx \right) dy . \tag{10.2}
\]

Figure 10.3 illustrates cases (I) and (II) of the Fubini Theorem 10.4.
1. Guido Fubini (1879–1943), Italian mathematician.

Figure 10.3: Fubini Theorem 10.4: cases (I) and (II), to the left and to the right, respectively.

Remark 10.5. In Theorem 10.4, the same integral is evaluated by (10.1) and (10.2); in a practical situation, it usually occurs that one integration path is easier than the other. Moreover, the theoretical equality between integrals (10.1) and (10.2):
\[
\int_S \left( \int_{A_x} f(x,y)\,dy \right) dx = \int_T \left( \int_{A^y} f(x,y)\,dx \right) dy , \tag{10.3}
\]
can be exploited to evaluate integrals that are otherwise difficult to compute via a direct approach.

Example 10.6. Consider A = { (x, y) ∈ R² | 0 ≤ x ≤ 1, x² ≤ y ≤ x + 1 }; note that x ∈ [0, 1] ⟹ x² ≤ 1 + x. We want to evaluate:
\[\iint_A x\,y\,dx\,dy\,.\]

The first step consists in the analysis of the integration domain. In this example, the domain is described by a double inequality constraint, of the form f1(x) ≤ y ≤ f2(x), with x ∈ [a, b]. Many authors refer to this kind of integration domain as a normal domain. In our case, [a, b] = [0, 1], and f1(x) = x² is a parabola, while f2(x) = 1 + x is a line; hence, the domain is as shown in Figure 10.4. When working with normal domains, it is natural to adopt the vertical section approach, stated in case (I) of Theorem 10.4. Here, S = [0, 1], Ax = [x², 1 + x], and the nested integration formula (10.1) yields:

\[\iint_A x\,y\,dx\,dy = \int_0^1\left(\int_{x^2}^{1+x} x\,y\,dy\right)dx = \int_0^1 x\left[\frac{y^2}{2}\right]_{x^2}^{1+x} dx = \frac{1}{2}\int_0^1\left(-x^5 + x^3 + 2x^2 + x\right)dx = \frac{5}{8}\,.\]
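The nested integration can be cross–checked numerically; the quadrature helper below is ours, not part of the text:

```python
def simpson(f, a, b, n=2000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

# inner integral in y over the section [x^2, 1 + x], then outer integral in x
def inner(x):
    return simpson(lambda y: x * y, x * x, 1 + x, n=200)

value = simpson(inner, 0.0, 1.0)   # should be close to 5/8
```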

Example 10.7. Given A = { (x, y) ∈ R² | y ≥ 0, y ≤ −x + 3, y ≤ 2x + 3 }, evaluate:
\[\iint_A y\,dx\,dy\,.\]
A

The integration domain is the triangle with vertices (−3/2, 0), (3, 0), (0, 3), as shown in Figure 10.5. We first integrate in x and then in y:
\[\iint_A y\,dx\,dy = \int_0^3\left(\int_{(y-3)/2}^{3-y} y\,dx\right)dy = \int_0^3\left(\frac{9}{2}\,y - \frac{3}{2}\,y^2\right)dy = \frac{27}{4}\,.\]
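The same kind of numerical cross–check works here, nesting the quadrature in the other order (helper code is ours):

```python
def simpson(f, a, b, n=2000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

# for fixed y in [0, 3], x runs over the section [(y - 3)/2, 3 - y]
def inner(y):
    return simpson(lambda x: y, (y - 3) / 2, 3 - y, n=2)

value = simpson(inner, 0.0, 3.0)   # should be close to 27/4
```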

10.1.1 Smart applications of Fubini theorem


166 CHAPTER 10. MULTIPLE INTEGRALS

Figure 10.4: Normal integration domain of Example 10.6.


Figure 10.5: Integration domain of Example 10.7.

Fresnel integrals
We show, following [15] (pages 473–474), that:
\[F_1 := \int_{-\infty}^{\infty} \sin x^2\,dx = \sqrt{\frac{\pi}{2}} = \int_{-\infty}^{\infty} \cos x^2\,dx =: F_2\,.\]

The change of variable x² = t allows us to compute F1 and F2 as:
\[F_1 = \int_0^\infty \frac{\sin t}{\sqrt t}\,dt\,, \qquad F_2 = \int_0^\infty \frac{\cos t}{\sqrt t}\,dt\,.\]

Making use of the probability integral, discussed in § 8.11.1, 8.11.2, we see that:
\[\frac{1}{\sqrt t} = \frac{2}{\sqrt\pi}\int_0^\infty e^{-t x^2}\,dx\,.\]

Thus, we can write:
\[F_1 = \frac{2}{\sqrt\pi}\int_0^\infty\!\!\int_0^\infty e^{-t x^2}\sin t\,dx\,dt\,, \qquad F_2 = \frac{2}{\sqrt\pi}\int_0^\infty\!\!\int_0^\infty e^{-t x^2}\cos t\,dx\,dt\,.\]

We can employ Fubini Theorem 10.4 to invert the order of integration, and then observe that:
\[\int_0^\infty e^{-t x^2}\sin t\,dt = \frac{1}{1+x^4}\,, \qquad \int_0^\infty e^{-t x^2}\cos t\,dt = \frac{x^2}{1+x^4}\,,\]
which follow from the indefinite integration formulæ:
\[\int e^{-t x^2}\sin t\,dt = -\,\frac{e^{-t x^2}\left(x^2\sin t + \cos t\right)}{1+x^4}\,,\]
\[\int e^{-t x^2}\cos t\,dt = \frac{e^{-t x^2}\left(\sin t - x^2\cos t\right)}{1+x^4}\,.\]
Therefore:
\[F_1 = \frac{2}{\sqrt\pi}\int_0^\infty \frac{dx}{1+x^4}\,, \qquad F_2 = \frac{2}{\sqrt\pi}\int_0^\infty \frac{x^2}{1+x^4}\,dx\,.\]
The remaining computations are a matter of elementary integration, and they are left to the Reader,
or they can be obtained using the integration formulæ (11.25) and (11.26) of the next Chapter 11.
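The two inner integrals used above can be verified numerically at a sample point; the truncation of the t–integral is harmless, since the factor e^{−t x²} makes the tail negligible (helper code and the sample point are ours):

```python
import math

def simpson(f, a, b, n=40_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

x = 1.7                      # arbitrary sample point
T = 40.0 / (x * x)           # truncation: the tail beyond T is O(e^-40)
lhs_sin = simpson(lambda t: math.exp(-t * x * x) * math.sin(t), 0.0, T)
lhs_cos = simpson(lambda t: math.exp(-t * x * x) * math.cos(t), 0.0, T)
rhs_sin = 1 / (1 + x ** 4)           # predicted value for the sine integral
rhs_cos = x ** 2 / (1 + x ** 4)      # predicted value for the cosine integral
```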

Cauchy formula for iterated integration


We study here the problem, solved by Cauchy, of determining a function from the knowledge of its second derivative. The problem can be stated in these terms: given f ∈ L(R) ∩ C(R), the function:
\[x(t) := a_0 + a_1 t + \int_0^t (t-r)\,f(r)\,dr\]
is the solution to the initial value problem:
\[x''(t) = f(t)\,, \qquad x(0) = a_0\,,\quad x'(0) = a_1\,.\]

The problem above can, indeed, be solved directly by integrating x''(t) = f(t) twice on [0, t]. The first integration provides:
\[\int_0^t x''(\tau)\,d\tau = \int_0^t f(r)\,dr \;\Longrightarrow\; x'(t) - x'(0) = \int_0^t f(r)\,dr\,,\]

where, since x'(0) = a1, the first integration constant gets determined and we obtain:
\[x'(t) = a_1 + \int_0^t f(r)\,dr\,.\]

Integrating a second time on [0, t]:
\[\int_0^t x'(s)\,ds = \int_0^t\left(a_1 + \int_0^s f(r)\,dr\right)ds\,, \tag{10.4}\]

and evaluating the integrals in (10.4):
\[x(t) - x(0) = a_1 t + \int_0^t\left(\int_0^s f(r)\,dr\right)ds\,.\]

Now, recall that x(0) = a0 and use Fubini Theorem 10.4 to exchange the order of integration:
\[x(t) = a_0 + a_1 t + \int_0^t\left(\int_r^t f(r)\,ds\right)dr = a_0 + a_1 t + \int_0^t (t-r)\,f(r)\,dr\,.\]

The last passage can be understood by looking at Figure 10.6.



Figure 10.6: Illustration of the solution to Cauchy problem.

It is possible to extend this argument to any order of integration: the solution of the initial value problem:
\[x^{(n)}(t) = f(t)\,, \qquad x(0) = a_0\,,\; x'(0) = a_1\,,\; \dots\,,\; x^{(n-1)}(0) = a_{n-1}\,,\]

is indeed given by:
\[x(t) = a_0 + a_1 t + \frac{a_2}{2!}\,t^2 + \cdots + \frac{a_{n-1}}{(n-1)!}\,t^{n-1} + \frac{1}{(n-1)!}\int_0^t (t-r)^{n-1}\,f(r)\,dr\,,\]
where the factorial denominators in the polynomial part ensure that x⁽ᵏ⁾(0) = aₖ for every k = 0, …, n − 1.
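Cauchy's formula can be tested against a case with a known solution: for f(t) = cos t, a₀ = 1, a₁ = 0, the exact solution of x'' = f is x(t) = 2 − cos t (a sketch with our own Simpson helper):

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

a0, a1 = 1.0, 0.0            # initial data x(0) and x'(0)

def x_of(t):
    # Cauchy's single-integral formula for the second antiderivative
    return a0 + a1 * t + simpson(lambda r: (t - r) * math.cos(r), 0.0, t)

t = 2.0
err = abs(x_of(t) - (2 - math.cos(t)))   # exact solution: x(t) = 2 - cos t
```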

A challenging definite integral

We use Fubini Theorem 10.4 to compute a definite integral, presented in Chapter 5 of [26], which is hard to evaluate otherwise:
\[\int_0^1 \frac{x^b - x^a}{\ln x}\,dx = \ln\frac{1+b}{1+a}\,, \qquad 0 \le a < b\,. \tag{10.5}\]
To calculate (10.5), let us define:
\[A = \{ (x,y) \in \mathbb R^2 \mid 0 \le x \le 1\,,\; a \le y \le b \} := [0,1]\times[a,b]\]
and consider the double integral:
\[\iint_A x^y\,dx\,dy\,.\]

At this point, we integrate first in x and then in y:
\[\iint_A x^y\,dx\,dy = \int_a^b\left(\int_0^1 x^y\,dx\right)dy = \int_a^b \frac{dy}{1+y} = \ln\frac{1+b}{1+a}\,. \tag{10.6}\]

Reverting the order of integration:
\[\iint_A x^y\,dx\,dy = \int_0^1\left(\int_a^b x^y\,dy\right)dx = \int_0^1\left[\frac{x^y}{\ln x}\right]_{y=a}^{y=b} dx = \int_0^1 \frac{x^b - x^a}{\ln x}\,dx\,. \tag{10.7}\]

Comparing (10.6) and (10.7), we obtain (10.5).
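Identity (10.5) can also be confirmed numerically; the integrand has removable singularities at both endpoints (with limits 0 and b − a), which are patched by hand below (helper code is ours):

```python
import math

def simpson(f, a, b, n=20_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

a, b = 1.0, 2.0

def integrand(x):
    if x <= 0.0:
        return 0.0        # limit of (x^b - x^a)/ln x as x -> 0+
    if x >= 1.0:
        return b - a      # removable singularity at x = 1 (L'Hopital)
    return (x ** b - x ** a) / math.log(x)

lhs = simpson(integrand, 0.0, 1.0)
rhs = math.log((1 + b) / (1 + a))    # = ln(3/2) for these a, b
```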



Frullani integral
We use once more a double integral to evaluate a hard single–variable integral, known as the Frullani² integral:
\[\int_0^\infty \frac{\arctan(b x) - \arctan(a x)}{x}\,dx = \frac{\pi}{2}\,\ln\frac{b}{a}\,, \qquad 0 < a < b\,. \tag{10.8}\]
Consider the double integral:
\[\int_a^b\left(\int_0^\infty \frac{dx}{1+x^2 y^2}\right)dy = \int_a^b\left[\frac{\arctan(x y)}{y}\right]_{x=0}^{x=\infty} dy = \int_a^b \frac{\pi}{2\,y}\,dy = \frac{\pi}{2}\,\ln\frac{b}{a}\,. \tag{10.9}\]

Now, revert the order of integration in (10.9):
\[\int_0^\infty\left(\int_a^b \frac{dy}{1+x^2 y^2}\right)dx = \int_0^\infty\left[\frac{\arctan(x y)}{x}\right]_{y=a}^{y=b} dx = \int_0^\infty \frac{\arctan(b x) - \arctan(a x)}{x}\,dx\,. \tag{10.10}\]

Equation (10.8) follows by comparison of (10.9) with (10.10).

Remark 10.8. An analogous strategy allows us to state the identity:
\[\int_0^\infty \frac{e^{-a x} - e^{-b x}}{x}\,dx = \ln\frac{b}{a}\,, \tag{10.11}\]

which can be obtained starting from the double integral:
\[\iint_{(0,\infty)\times[a,b]} e^{-x y}\,dx\,dy\,,\]

where it is assumed that a , b > 0 .
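Both (10.8) and (10.11) can be checked numerically by truncating the half–line; the tails decay like 1/x² and e^{−a x}, respectively (helper code is ours):

```python
import math

def simpson(f, a, b, n=200_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

a, b = 1.0, 2.0

def arctan_integrand(x):
    if x == 0.0:
        return b - a      # limit of the integrand at 0
    return (math.atan(b * x) - math.atan(a * x)) / x

def exp_integrand(x):
    if x == 0.0:
        return b - a      # limit of (e^{-ax} - e^{-bx})/x at 0
    return (math.exp(-a * x) - math.exp(-b * x)) / x

frullani = simpson(arctan_integrand, 0.0, 10_000.0)        # tail ~ (1/a - 1/b)/X
exponential = simpson(exp_integrand, 0.0, 200.0, n=40_000)  # tail < e^{-200}
```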

Basel problem, once more


Here, we describe two further ways to solve the Basel problem presented in § 2.7; this time, double integration is used, and we refer to [28] and [50] for the first and second methods, respectively. In both cases, the aim is to prove the Leibniz integral (8.22), which is linked, through equality (8.24), to identity (2.69). In other words, once more, both methods lead to the Euler summation formula (2.71).

In [28], the following double integral is considered:
\[\int_0^{+\infty}\left(\int_0^1 \frac{x}{(1+x^2)(1+x^2 y^2)}\,dy\right)dx\,, \tag{10.12}\]

which can be re–written as:
\[\int_0^{+\infty} \frac{1}{x\,(1+x^2)}\left(\int_0^1 \frac{dy}{\frac{1}{x^2}+y^2}\right)dx\,, \tag{10.12a}\]
in order to exploit the integration formula:
\[\int \frac{dy}{a^2+y^2} = \frac{1}{a}\,\arctan\frac{y}{a}\,,\]
²Giuliano Frullani (1795–1834), Italian mathematician.

so that:
\[\int_0^{+\infty} \frac{1}{x\,(1+x^2)}\left(\int_0^1 \frac{dy}{\frac{1}{x^2}+y^2}\right)dx = \int_0^{+\infty}\left[\frac{\arctan(x y)}{1+x^2}\right]_{y=0}^{y=1} dx\,.\]
Therefore:
\[\int_0^{+\infty}\left(\int_0^1 \frac{x}{(1+x^2)(1+x^2 y^2)}\,dy\right)dx = \int_0^{+\infty} \frac{\arctan x}{1+x^2}\,dx\,.\]
Now, since:
\[\int \frac{\arctan x}{1+x^2}\,dx = \frac{\arctan^2 x}{2}\,,\]
it follows:
\[\int_0^{+\infty}\left(\int_0^1 \frac{x}{(1+x^2)(1+x^2 y^2)}\,dy\right)dx = \frac{\pi^2}{8}\,. \tag{10.13}\]
At this point, we use the partial fraction decomposition:
\[\frac{x}{(1+x^2)(1+x^2 y^2)} = \frac{1}{2\,(y^2-1)}\left(\frac{2\,y^2 x}{1+x^2 y^2} - \frac{2\,x}{1+x^2}\right),\]
and we exploit Fubini Theorem 10.4 to change the order of integration in (10.12), performing the integration:
\[\int_0^1\left(\int_0^{+\infty} \frac{x}{(1+x^2)(1+x^2 y^2)}\,dx\right)dy = \int_0^1 \frac{1}{2\,(y^2-1)}\left[\ln\frac{1+x^2 y^2}{1+x^2}\right]_0^{+\infty} dy = \int_0^1 \frac{\ln y^2}{2\,(y^2-1)}\,dy = \int_0^1 \frac{\ln y}{y^2-1}\,dy\,. \tag{10.14}\]

Comparison of (10.13) with (10.14) yields the Leibniz integral (8.22).

In [50], the double integral considered is:
\[\int_0^\infty\!\!\int_0^\infty \frac{dx\,dy}{(1+y)(1+x^2 y)}\,. \tag{10.15}\]
Let us integrate (10.15) with respect to x first, and then with respect to y:
\[\int_0^\infty \frac{1}{1+y}\left(\int_0^\infty \frac{dx}{1+x^2 y}\right)dy = \frac{\pi}{2}\int_0^\infty \frac{dy}{\sqrt y\,(1+y)} = \frac{\pi^2}{2}\,. \tag{10.16}\]
Reverting the order of integration:
\[\int_0^\infty\left(\int_0^\infty \frac{dy}{(1+y)(1+x^2 y)}\right)dx = \int_0^\infty \frac{1}{1-x^2}\left(\int_0^\infty\left(\frac{1}{1+y} - \frac{x^2}{1+x^2 y}\right)dy\right)dx = \int_0^\infty \frac{1}{1-x^2}\,\ln\frac{1}{x^2}\,dx = 2\int_0^\infty \frac{\ln x}{x^2-1}\,dx\,. \tag{10.17}\]
Equating (10.16) and (10.17), we get:
\[\int_0^\infty \frac{\ln x}{x^2-1}\,dx = \frac{\pi^2}{4}\,. \tag{10.18}\]
Now, split the integration domain into [0, 1] plus [1, ∞), and use the change of variable x = 1/u in the integral evaluated on [1, ∞):
\[\int_0^\infty \frac{\ln x}{x^2-1}\,dx = \int_0^1 \frac{\ln x}{x^2-1}\,dx + \int_1^\infty \frac{\ln x}{x^2-1}\,dx = \int_0^1 \frac{\ln x}{x^2-1}\,dx + \int_0^1 \frac{\ln u}{u^2-1}\,du\,. \tag{10.19}\]

Comparison of (10.18) and (10.19) yields the Leibniz integral (8.22).
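Result (10.18) can be sanity–checked numerically: by the splitting (10.19), the integral over (0, ∞) is twice the integral over (0, 1) (quadrature helper is ours; endpoint values are replaced by the corresponding limits):

```python
import math

def simpson(f, a, b, n=100_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

def g(x):
    if x >= 1.0:
        return 0.5        # removable singularity: lim_{x->1} ln x/(x^2-1) = 1/2
    return math.log(x) / (x * x - 1)

# the integrable log singularity at 0 is handled by starting just above 0
half = simpson(g, 1e-12, 1.0)
total = 2 * half          # splitting (10.19): both halves contribute equally
```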


10.2. CHANGE OF VARIABLE 171

10.2 Change of variable


We treat, here, the two–dimensional version of the change of variable for integrals; we begin by presenting the class of well–defined variable transformations.

Definition 10.9. Let A be an open subset of R². A map ϕ : A → R² is called a regular mapping if:

1. ϕ is injective;

2. ϕ has continuous partial derivatives;

3. det Jϕ(x) ≠ 0 for any x ∈ A, where Jϕ(x) is the Jacobian matrix of ϕ evaluated at x, i.e.:
\[J_\varphi(x) = \begin{pmatrix} \nabla\varphi_1 \\ \nabla\varphi_2 \end{pmatrix} = \begin{pmatrix} \dfrac{\partial\varphi_1(x)}{\partial x_1} & \dfrac{\partial\varphi_1(x)}{\partial x_2} \\[8pt] \dfrac{\partial\varphi_2(x)}{\partial x_1} & \dfrac{\partial\varphi_2(x)}{\partial x_2} \end{pmatrix},\]
and
\[\det J_\varphi(x) = \frac{\partial\varphi_1(x)}{\partial x_1}\,\frac{\partial\varphi_2(x)}{\partial x_2} - \frac{\partial\varphi_1(x)}{\partial x_2}\,\frac{\partial\varphi_2(x)}{\partial x_1}\,.\]

The most used regular mapping in the plane is the transformation into the so–called polar coordinates.

Example 10.10. Let A = (0, +∞) × [0, 2π), and let ϕ : A → R² be defined as:
\[\varphi(\rho\,,\vartheta) := (\rho\cos\vartheta\,,\ \rho\sin\vartheta)\,.\]
This is a regular mapping. In fact, the determinant of the Jacobian is det Jϕ(ρ, ϑ) = ρ > 0, since:
\[J_\varphi(\rho,\vartheta) = \begin{pmatrix} \dfrac{\partial(\rho\cos\vartheta)}{\partial\rho} & \dfrac{\partial(\rho\cos\vartheta)}{\partial\vartheta} \\[8pt] \dfrac{\partial(\rho\sin\vartheta)}{\partial\rho} & \dfrac{\partial(\rho\sin\vartheta)}{\partial\vartheta} \end{pmatrix} = \begin{pmatrix} \cos\vartheta & -\rho\sin\vartheta \\ \sin\vartheta & \rho\cos\vartheta \end{pmatrix}.\]

A second commonly used regular mapping is constituted by rotations.

Example 10.11. Fix α ∈ (0 , 2π) . The map ϕ : R2 \ {(0 , 0)} → R2 :

ϕ(x, y) := (x cos α − y sin α , x sin α + y cos α)

is a regular mapping, and it is called rotation of angle α. The Jacobian determinant takes value 1. Note
that the composition of two rotations, of angles α and β respectively, is again a rotation of amplitude
α + β mod 2 π .

Regular mappings are useful, since they can allow transforming a double integral into a simpler one.
We state the Change of Variable Theorem 10.12 and, then, study several situations where a good
change of variable eases the computation of integrals.

Theorem 10.12. Consider the open set A ⊆ R², and assume that ϕ : A → R² is a regular mapping. Let f : A → R be a measurable function. Then, f ∈ L(A) if and only if x ↦ f(ϕ(x)) |det Jϕ(x)| is summable on the inverse image ϕ⁻¹(A). In such a situation, the following equality holds:
\[\int_A f(u)\,du = \int_{\varphi^{-1}(A)} f\!\left(\varphi(x)\right)\left|\det J_\varphi(x)\right|dx\,.\]

Example 10.13. We wish to evaluate, once more, the probability integral G, defined in (8.40). To this aim, a double integral is considered, which equals G² thanks to Fubini Theorem 10.4:
\[\iint_{\mathbb R^2} e^{-(x^2+y^2)}\,dx\,dy = \left(\int_{\mathbb R} e^{-x^2}\,dx\right)\left(\int_{\mathbb R} e^{-y^2}\,dy\right) = G^2\,,\]

while, employing polar coordinates, it also holds:
\[\iint_{\mathbb R^2} e^{-(x^2+y^2)}\,dx\,dy = \iint_{(0,+\infty)\times[0,2\pi)} \rho\,e^{-\rho^2}\,d\rho\,d\vartheta = 2\pi\int_0^\infty \rho\,e^{-\rho^2}\,d\rho = 2\pi\left[-\frac{e^{-\rho^2}}{2}\right]_{\rho=0}^{\rho=\infty} = \pi\,.\]

Thus G² = π, i.e., G = √π, as already found in § 8.11.2.
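A direct numerical check of G, truncating at |x| = 10 where the Gaussian tail is below e^{−100} (the Simpson helper is ours):

```python
import math

def simpson(f, a, b, n=20_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

G = simpson(lambda x: math.exp(-x * x), -10.0, 10.0)
# expected: G^2 = pi, that is G = sqrt(pi)
```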
Example 10.14. Let us evaluate the measure of the set:
\[E = \{ (x,y) \in \mathbb R^2 \mid x^2 + y^2 \le 1 \} \cap \{ (x,y) \in \mathbb R^2 \mid 0 \le y \le x \le \sqrt3\,y \}\,.\]


As Figure 10.7 shows, E is a sort of triangle with a curvilinear base, in which the points A = (1/√2, 1/√2) and B = (√3/2, 1/2) are obtained by solving the systems:
\[\begin{cases} x^2 + y^2 = 1\,,\\ x = y\,, \end{cases} \qquad\qquad \begin{cases} x^2 + y^2 = 1\,,\\ x = \sqrt3\,y\,. \end{cases}\]

Figure 10.7: Set E of Example 10.14.

Passing to polar coordinates, the constraints defining E become:
\[\varphi^{-1}(E) = \left\{ (\rho\,,\vartheta) \in (0,\infty)\times\left[0\,,\frac{\pi}{2}\right] \;\middle|\; \rho < 1\,,\; 0 < \sin\vartheta < \cos\vartheta < \sqrt3\,\sin\vartheta \right\}.\]
From ϑ ∈ [0, π/2] and sin ϑ < cos ϑ, we infer:
\[0 < \vartheta < \frac{\pi}{4}\,,\]
while, from ϑ ∈ [0, π/2] and cos ϑ < √3 sin ϑ, it follows:
\[\frac{\pi}{6} < \vartheta < \frac{\pi}{2}\,,\]

so that, in conclusion:
\[\varphi^{-1}(E) = (0\,,1)\times\left(\frac{\pi}{6}\,,\frac{\pi}{4}\right).\]
Finally, invoking Theorem 10.12, we obtain:
\[\ell_2(E) = \iint_{(0,1)\times(\pi/6,\pi/4)} \rho\,d\rho\,d\vartheta = \frac{1}{2}\left(\frac{\pi}{4}-\frac{\pi}{6}\right) = \frac{\pi}{24}\,.\]
It is obviously possible not to use Theorem 10.12, and to compute ℓ₂(E) using, instead, Fubini Theorem 10.4. By doing so, though, the relevant computations turn out to be more involved. Looking again at Figure 10.7, in fact, we see that the domain of integration must be split as follows:
\[\ell_2(E) = \int_0^{1/\sqrt2}\left(\int_{x/\sqrt3}^{x} dy\right)dx + \int_{1/\sqrt2}^{\sqrt3/2}\left(\int_{x/\sqrt3}^{\sqrt{1-x^2}} dy\right)dx\,.\]

To evaluate the second integral, we must employ the appropriate integration formula:
\[\int \sqrt{1-x^2}\,dx = \frac{1}{2}\left(x\sqrt{1-x^2} + \arcsin x\right),\]
after which we arrive at the same value ℓ₂(E) = π/24 obtained using polar coordinates, but with greater effort.
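The splitting above is straightforward to evaluate numerically, confirming ℓ₂(E) = π/24 (helper code is ours):

```python
import math

def simpson(f, a, b, n=20_000):
    # composite Simpson rule on [a, b]; n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return s * h / 3

s3 = math.sqrt(3.0)
# widths of the vertical sections of E, as in the Fubini splitting
part1 = simpson(lambda x: x - x / s3, 0.0, 1 / math.sqrt(2))
part2 = simpson(lambda x: math.sqrt(max(0.0, 1 - x * x)) - x / s3,
                1 / math.sqrt(2), s3 / 2)
measure = part1 + part2    # should be close to pi/24
```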

Remark 10.15. Polar coordinates can be modified if we wish to compute the measure of the canonical ellipse, that is, the set E described by:
\[E = \left\{ (x\,,y) \in \mathbb R^2 \;\middle|\; \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1 \right\},\]

where a , b > 0 . Consider the map ϕ : R2 \ (0 , 0) → R2 given by:

ϕ(ρ , ϑ) := (a ρ cos ϑ , b ρ sin ϑ) .

It is a regular mapping, whose Jacobian determinant is:
\[\det J_\varphi(\rho,\vartheta) = \det\begin{pmatrix} a\cos\vartheta & -a\,\rho\sin\vartheta \\ b\sin\vartheta & b\,\rho\cos\vartheta \end{pmatrix} = a\,b\,\rho > 0\,.\]

Observe that ϕ⁻¹(E) = [0, 1] × [0, 2π), hence:
\[\ell_2(E) = \iint_E dx\,dy = \iint_{[0,1]\times[0,2\pi)} a\,b\,\rho\,d\rho\,d\vartheta = a\,b\,\pi\,.\]

In the next Example 10.16, we employ a rotation to compute the measure of a set.

Example 10.16. Let us compute the Lebesgue measure of
\[A := \{ (x,y) \in \mathbb R^2 \mid x^2 - x y + y^2 \le 1 \}\,, \tag{10.20}\]
using a rotation of angle α = π/4:
\[\begin{cases} x = \dfrac{1}{\sqrt2}\,u - \dfrac{1}{\sqrt2}\,v\,,\\[8pt] y = \dfrac{1}{\sqrt2}\,u + \dfrac{1}{\sqrt2}\,v\,. \end{cases}\]

In this way, the set of points verifying x² − x y + y² ≤ 1 is transformed into:
\[\frac{1}{2}\,(u-v)^2 - \frac{1}{2}\,(u+v)(u-v) + \frac{1}{2}\,(u+v)^2 \le 1\,,\]
that is:
\[\frac{u^2}{2} + \frac{3\,v^2}{2} \le 1\,. \tag{10.21}\]
The choice of the rotation angle α is not a lucky guess. In general, we start with a generic rotation of amplitude α:
\[\begin{cases} x = u\cos\alpha - v\sin\alpha\,,\\ y = u\sin\alpha + v\cos\alpha\,, \end{cases} \tag{10.22}\]
and we insert it into the equation which defines our domain. In the current example, inserting (10.22) into (10.20) yields:
\[(u\sin\alpha + v\cos\alpha)^2 - (u\cos\alpha - v\sin\alpha)(u\sin\alpha + v\cos\alpha) + (u\cos\alpha - v\sin\alpha)^2 \le 1\,,\]
that is:
\[u^2 + v^2 + (\sin^2\alpha - \cos^2\alpha)\,u v - (u^2 - v^2)\sin\alpha\cos\alpha \le 1\,. \tag{10.23}\]
We choose the value of α for which the rectangular term u v in (10.23) vanishes:

sin2 α − cos2 α = 0 ,

namely, α = π/4 .
At this point, returning to our transformed set (10.21), and recalling Remark 10.15, we infer that the measure of A is:
\[\sqrt2\cdot\sqrt{\frac{2}{3}}\;\pi = \frac{2\pi}{\sqrt3}\,.\]
Remark 10.17. The computational method described in Example 10.16 can be applied to the situation of a general ellipse:
\[E = \{ (x,y) \in \mathbb R^2 \mid a x^2 + 2\,b\,x y + c\,y^2 \le d \}\,, \tag{10.24}\]

where, to make sure we are dealing with an ellipse and a non–empty set, we must assume that
a , c , d > 0 , and that the discriminant ∆ = b2 − a c is negative. That said, we can apply the rotation
technique to compute `2 (E) . The two cases a = c and a 6= c must be distinguished. When a = c , the
π/4 rotation used in Example 10.16 still works, and transforms E into:

ϕ−1 (E) = { (u , v) | u2 (a + b) + v 2 (a − b) ≤ d } . (10.25)

Note that the two conditions a = c and ∆ < 0 imply |b| < a ; hence, a + b > 0 and a − b > 0 .
Recalling Remark 10.15, we can thus infer:
\[\ell_2(E) = \frac{d}{\sqrt{a^2 - b^2}}\;\pi\,. \tag{10.26}\]
If a ≠ c, it is always possible to transform the given ellipse into an ellipse of the form (10.25), by means of the variable transformation:
\[\begin{cases} x = x_1\,,\\[4pt] y = \sqrt{\dfrac{a}{c}}\;y_1\,, \end{cases} \tag{10.27}\]
applying which, the general ellipse (10.24) becomes:
\[a\,x_1^2 + 2\,b\,\sqrt{\frac{a}{c}}\;x_1 y_1 + a\,y_1^2 \le d\,. \tag{10.28}\]
Note that transformation (10.27) has non–zero Jacobian determinant √(a/c) > 0; hence, from (10.26), it follows:
\[\ell_2(E) = \frac{d}{\sqrt{a\,c - b^2}}\;\pi\,. \tag{10.29}\]
In other words, (10.29) nicely extends (10.26).
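Formula (10.29) can be spot–checked with a crude, seeded Monte Carlo estimate on a sample ellipse; the coefficients below are ours, chosen so that a c − b² = 1:

```python
import math
import random

a, b, c, d = 5.0, 2.0, 1.0, 1.0            # sample coefficients, ac - b^2 = 1
predicted = d * math.pi / math.sqrt(a * c - b * b)

# bounding box for E: |x| <= sqrt(c d/(ac-b^2)) = 1, |y| <= sqrt(a d/(ac-b^2)) = sqrt(5)
x_max, y_max = 1.1, 2.4
random.seed(0)
n = 400_000
hits = 0
for _ in range(n):
    x = random.uniform(-x_max, x_max)
    y = random.uniform(-y_max, y_max)
    if a * x * x + 2 * b * x * y + c * y * y <= d:
        hits += 1
estimate = hits / n * (2 * x_max) * (2 * y_max)
```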

Translations and affine transformations are regular transformations too.


Example 10.18. If a ∈ R2 , the map ϕ : R2 → R2 , ϕ(x) = x + a , is a regular transformation,
called translation of amplitude a . It is quite immediate to verify that, for such a transformation, it
holds det Jϕ = 1.
Example 10.19. Given a ∈ R², let M ∈ R²ˣ² be a non–singular square matrix. Then, the map ϕ : R² → R², ϕ(x) = M xᵀ + a, is a regular transformation, called an affine transformation. Here, the Jacobian determinant is given by det Jϕ = det M ≠ 0.
Remark 10.20. Any affine transformation is obtained by composing a linear transformation and a translation. As a consequence, we can think of a translation as an affine transformation, associated to the identity matrix.

10.3 Integration in Rn
The process of extending the Lebesgue measure from R to R² can be naturally iterated. First, though, some notation needs to be introduced. The Euclidean space Rⁿ is here decomposed into the Cartesian product of two lower–dimensional Euclidean spaces, Rᵖ and R^q, where p + q = n, and we make the identification Rⁿ = Rᵖ × R^q. Moreover, when writing (x, y) ∈ Rⁿ, we implicitly mean that x = (x₁, …, x_p) ∈ Rᵖ and y = (y₁, …, y_q) ∈ R^q. This said, following Definition 10.1, we can now provide the notion of section of a set A ⊆ Rⁿ.
Definition 10.21. Consider A ⊂ Rn and x ∈ Rp . Then, the section of foot x of A is defined as
the following subset Ax of Rq :

Ax := {y ∈ Rq | (x , y) ∈ A} .

Similarly, if y ∈ Rq , then the section of foot y of A is defined as the following subset Ay of Rp :

Ay := {x ∈ Rp | (x , y) ∈ A} .

The Lebesgue measure in Rⁿ is denoted by ℓₙ; analogous meaning holds for ℓₚ and ℓ_q. In the following, when the term measurable is used, the dimension of the considered Euclidean space will be clear from the context. The symbol dₚx indicates that we are integrating with respect to the Lebesgue measure in Rᵖ, and analogously for the other dimensions. The set section Theorem 10.22, in its proof, makes heavy use of the monotone convergence Theorem 8.45.
Theorem 10.22. Let A ⊂ Rn be a measurable set. Then:
(I) for almost any x ∈ Rᵖ, the section Ax ⊂ R^q is measurable; moreover, the function x ↦ ℓ_q(Ax) is measurable, and it holds:
\[\ell_n(A) = \int_{\mathbb R^p} \ell_q(A_x)\,d_p x\,;\]

(II) for almost any y ∈ R^q, the section Ay ⊂ Rᵖ is measurable; moreover, the function y ↦ ℓₚ(Ay) is measurable, and it holds:
\[\ell_n(A) = \int_{\mathbb R^q} \ell_p(A_y)\,d_q y\,.\]

We can now formulate the extension of Fubini Theorem 10.4 to the general Rn case.

Theorem 10.23. (Fubini – general case) Let A ⊂ Rn be a measurable set and consider f ∈ L(A) .
Denote with S0 the null set where section Ax , for x ∈ Rp , is non–measurable. Define, further, S
to be the subset of Rp where the x–sections of A have positive measure, formally:

S = { x ∈ Rp \ S0 | `q (Ax ) > 0 } . (10.30)

Then, y 7→ f (x , y) is summable on Ax , and the equality holds true:


Z Z Z 
f (x , y) dx dy = f (x , y) dq y dp x . (10.31)
A S Ax

To apply Fubini Theorem 10.23, it is mandatory that the integrand f (x , y) is a summable function.
In many circumstances, summability can be deduced by some a priori considerations. Otherwise, the
following Theorem 10.24, due to Tonelli3 , analyses the summability of a given function.

Theorem 10.24. (Tonelli) Let A , S0 and S be as in Theorem 10.23. Let further f : A → R be a


measurable function. Then, function y 7→ f (x , y) is measurable on Ax , and the following equality
holds true:
\[\int_A |f(x,y)|\,dx\,dy = \int_S\left(\int_{A_x} |f(x,y)|\,d_q y\right)d_p x\,. \tag{10.32}\]

We complete our study of integration in the Euclidean space Rn by describing some applications of
the Fubini Theorem 10.23. Some further terminology is also provided.

Definition 10.25. Consider a measurable set A ⊆ Rⁿ and a measurable real–valued non–negative function f : A → [0, +∞]. The following sets are called, respectively, subgraph Γf and graph Gf of f:
\[\Gamma_f = \{(x,t) \in A\times\mathbb R \mid 0 \le t < f(x)\}\,, \qquad G_f = \{(x,t) \in A\times\mathbb R \mid t = f(x)\}\,.\]

The geometrical meaning of Definition 10.25 is, clearly, that of the n–dimensional generalization of the
concept of graph of a function of one variable. The following Theorem 10.26 illustrates the geometric
idea of an n–dimensional integral.

Theorem 10.26. Sets Γf and Gf, introduced in Definition 10.25, are measurable in Rⁿ⁺¹. Moreover:
\[\ell_{n+1}(\Gamma_f) = \int_A f\,dx\,, \qquad \ell_{n+1}(G_f) = 0\,.\]

Combining Theorems 10.26 and 8.13, it is possible to provide an alternative proof of the monotone convergence Theorem 8.45.

Theorem 10.27. Let A ⊆ Rn be measurable, and consider a sequence (fp )p∈N of real non–negative
measurable functions on A , such that, for almost any x ∈ A :

\[\text{(i)}\quad f_p(x) \le f_{p+1}(x)\,; \qquad \text{(ii)}\quad \lim_{p\to\infty} f_p(x) = f(x)\,. \tag{10.33}\]
Then, the passage to the limit holds true:
\[\int_A f(x)\,d_n x = \lim_{p\to\infty}\int_A f_p(x)\,d_n x\,.\]
³Leonida Tonelli (1885–1946), Italian mathematician.
10.3. INTEGRATION IN RN 177

Proof. Following Definition 10.25, consider the subgraphs Γ_f and Γ_{f_p}. From the monotonicity hypothesis in point (i) of (10.33), it follows that Γ_{f_p} ⊆ Γ_{f_{p+1}}. Hence, the family of sets (Γ_{f_p})_{p∈N} is a nested increasing family and, due to point (ii) of (10.33), it is exhaustive, that is:
\[\Gamma_f = \bigcup_{p=1}^{\infty} \Gamma_{f_p}\,.\]

We can thus invoke Theorem 8.13, to infer:
\[\ell_{n+1}\left(\bigcup_{p=1}^{\infty} \Gamma_{f_p}\right) = \lim_{p\to\infty} \ell_{n+1}\!\left(\Gamma_{f_p}\right) = \lim_{p\to\infty} \int_A f_p(x)\,d_n x\,.\]

On the other hand, from Theorem 10.26, it follows:
\[\ell_{n+1}\left(\bigcup_{p=1}^{\infty} \Gamma_{f_p}\right) = \ell_{n+1}(\Gamma_f) = \int_A f(x)\,d_n x\,,\]

and the proof is completed.


To conclude § 10.3, we state an alternative formula for integral evaluation, that can be useful in
applications.
Theorem 10.28. Consider A ⊆ Rⁿ measurable. If f ∈ L(A), then the following equality holds true:
\[\int_A f(x)\,d_n x = \int_0^\infty \ell_n\big(\{x \in A \mid f(x) > t\}\big)\,dt\,.\]

10.3.1 Exercises

1. Consider A := { (x, y) ∈ R² | x² ≤ y ≤ x }. Show that:
\[\iint_A \frac{1}{y - x^2 - 1}\,dx\,dy = \frac{\pi}{\sqrt3} - 2\,.\]

2. Let A = { (x, y) ∈ R² | 0 ≤ x ≤ 1, x² ≤ y ≤ x + 1 }. Show that:
\[\iint_A x\,y\,dx\,dy = \frac{5}{8}\,.\]

3. Let A = { (x, y) ∈ R² | 0 ≤ x ≤ y ≤ 2x }. Show that:
\[\iint_A x\,e^{-y}\,dx\,dy = \frac{3}{4}\,.\]

4. Let A = { (x, y) ∈ R² | y ≥ 0, y ≤ 2 + x, y ≤ 2 − x² }. Show that:
\[\iint_A x\,dx\,dy = 4\,.\]

5. Let A = { (x, y) ∈ R² | x⁴ ≤ y ≤ x² }. Show that:
\[\iint_A x\,y\,dx\,dy = \frac{1}{30}\,.\]

6. Let A = { (x, y) ∈ R² | x² + 4y² ≤ 1 }. Show that:
\[\iint_A \frac{1}{1 + x^2 + 4y^2}\,dx\,dy = \frac{\pi}{2}\,\ln 2\,.\]

7. Let A = { (x, y) ∈ R² | 5x² + 4xy + y² ≤ 1 }. Show that:
\[\iint_A e^{5x^2 + 4xy + y^2}\,dx\,dy = \pi\,(e - 1)\,.\]

10.4 Product of σ–algebras


We end this chapter by giving a short outline of the theory of products of σ–algebras. Assume that
two measure spaces (Ω1 , A1 ) and (Ω2 , A2 ) are given. We form the Cartesian product Ω := Ω1 × Ω2 ,
on which we wish to build a measure µ , that agrees with the given measures on Ω1 and Ω2 . To do
so, we have to provide the domain A of such a set function µ , that is to say, we must construct a
suitable σ–algebra on Ω : the A that we define is the product σ–algebra, built to be the minimum
σ–algebra on Ω , containing rectangles A1 × A2 , with A1 ∈ A1 and A2 ∈ A2 . In other words, the
product σ–algebra is generated by the collection of rectangular sets:

R = { A1 × A2 | A1 ∈ A1 , A2 ∈ A2 } .

It is possible to prove the following result.

Theorem 10.29. The product σ–algebra A1 × A2 verifies what follows:

(i) A1 × A2 is also generated by the cylinders:
\[\mathcal C = \{ A_1 \times \Omega_2 \mid A_1 \in \mathcal A_1 \} \cup \{ \Omega_1 \times A_2 \mid A_2 \in \mathcal A_2 \}\,;\]

(ii) A1 × A2 is the minimum σ–algebra such that the following projections are measurable:

Pr1 : Ω → Ω1 , Pr1 (ω1 , ω2 ) = ω1 ,


Pr2 : Ω → Ω2 , Pr2 (ω1 , ω2 ) = ω2 .

For applications, the most relevant situation occurs when Ω1 = Ω2 = R and A1 = A2 = B is the
Borel σ–algebra on R . Borel sets, in the plane, can be generated in two equivalent ways.

Theorem 10.30. The σ–algebras generated by:

R = { B1 × B2 | B1 , B2 ∈ B } and I = { I1 × I2 | I1 , I2 intervals }

are the same.

Having defined the product σ–algebra, we have to introduce the product measure, and we must do so in a consistent manner, following the contribution of Fubini to Measure theory in the case of the Euclidean space Rⁿ.
To build the product measure, we need to work with σ–finite measure spaces (Ω1 , A1 , µ1 ) and
(Ω2 , A2 , µ2 ) . Note that this is the case of Lebesgue measure and of Probability measures. We denote
by µ the product measure defined on the product σ–algebra A1 × A2 . As a first construction step,
let us impose the ‘natural’ condition:
\[\mu(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)\,.\]

The task of assigning a measure to non–rectangular sets requires the notion of section of a subset A
of Ω1 × Ω2 .

Definition 10.31. Consider A ⊂ Ω1 × Ω2 and let ω2 ∈ Ω2 . Then, the section of foot ω2 is the subset
of Ω1 defined, and denoted, as:

Aω2 = { ω1 ∈ Ω1 | (ω1 , ω2 ) ∈ A } .

Analogously, if ω1 ∈ Ω1 , then the section of foot ω1 is the subset of Ω2 :

Aω1 = { ω2 ∈ Ω2 | (ω1 , ω2 ) ∈ A } .
10.4. PRODUCT OF σ–ALGEBRAS 179

We can now state Theorem 10.32, which concerns the measurability of sections, of a measurable set,
for the product measure.

Theorem 10.32. Let A ∈ A1 × A2 . Then, Aω2 ∈ A1 for any ω2 ∈ Ω2 , and Aω1 ∈ A2 for any
ω1 ∈ Ω1 .

Definition 10.33. Given A ∈ A1 × A2, we define the measure of A as the number:
\[\mu(A) = \int_{\Omega_2} \mu_1(A_{\omega_2})\,d\mu_2(\omega_2)\,. \tag{10.34}\]

Theorem 10.34. Consider the product σ–algebra A1 × A2. Let μ1, μ2 be σ–finite measures. Then, the functions:
\[\omega_2 \mapsto \mu_1(A_{\omega_2}) \qquad\text{and}\qquad \omega_1 \mapsto \mu_2(A_{\omega_1})\]
are measurable with respect to A2 and A1, respectively. Furthermore, it holds:
\[\int_{\Omega_1} \mu_2(A_{\omega_1})\,d\mu_1(\omega_1) = \int_{\Omega_2} \mu_1(A_{\omega_2})\,d\mu_2(\omega_2)\,.\]

Theorem 10.35. The set function µ introduced in (10.34) is a measure. Moreover, µ is unique,
since any other measure that coincides with µ on rectangles, is equal to µ on the product σ–algebra
A1 × A 2 .

We are, finally, in the position to state the Fubini Theorem 10.36 for nested integrals.

Theorem 10.36. Consider f ∈ L¹(Ω1 × Ω2). Then, the functions:
\[\omega_1 \mapsto \int_{\Omega_2} f(\omega_1,\omega_2)\,d\mu_2(\omega_2) \qquad\text{and}\qquad \omega_2 \mapsto \int_{\Omega_1} f(\omega_1,\omega_2)\,d\mu_1(\omega_1)\]
belong to L¹(Ω1) and L¹(Ω2), respectively. Moreover, it holds:
\[\int_{\Omega_1\times\Omega_2} f(\omega_1,\omega_2)\,d(\mu_1\times\mu_2)(\omega_1,\omega_2) = \int_{\Omega_1}\left(\int_{\Omega_2} f(\omega_1,\omega_2)\,d\mu_2(\omega_2)\right)d\mu_1(\omega_1) = \int_{\Omega_2}\left(\int_{\Omega_1} f(\omega_1,\omega_2)\,d\mu_1(\omega_1)\right)d\mu_2(\omega_2)\,.\]
11 Gamma and Beta functions

11.1 Gamma function


The material presented in this Chapter 11 is based on many excellent textbooks [5], [16], [20], [27],
[30], [41], [55], [62], [43], [40], to which the interested Reader is referred.

11.1.1 Historical background


The Gamma function was introduced by Euler in relation to the factorial problem. From the early days of Calculus, in fact, the problem of understanding the behavior of n!, with n ∈ N, and of the related binomial coefficients, was under the attention of the mathematical Community. In 1730, the formula of Stirling¹ was discovered:
\[\lim_{n\to\infty} \frac{n!\,e^n}{n^n\,\sqrt n} = \sqrt{2\pi}\,. \tag{11.1}\]
In the same period, Euler introduced a function e(x), given by the explicit formula (11.2) and defined for any x > 0, which reduces to e(n) = n! when its argument is x = n ∈ N. Euler described his results in a letter to Goldbach², who had posed, together with Bernoulli, the interpolation problem to Euler. This last problem was inspired by the fact that the additive counterpart of the factorial has a very simple solution:
\[s_n = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}\,,\]
and by the observation that the above sum function admits a continuation to C given by the following function:
\[f(x) = \frac{x(x+1)}{2}\,.\]

Euler’s solution for the factorial is the following integral:
\[e(x) = \int_0^1 \left(\ln\frac{1}{t}\right)^{x} dt\,. \tag{11.2}\]

Legendre³ introduced the notation Γ(x) to denote (11.2), and he modified its representation as follows:
\[\Gamma(x) = \int_0^\infty e^{-t}\,t^{x-1}\,dt\,, \qquad x > 0\,. \tag{11.3}\]

Observe that (11.2) and (11.3) imply the equality Γ(x + 1) = e(x). In fact, the change of variable t = − ln u in the Legendre integral Γ(x + 1) yields:
\[\Gamma(x+1) = \int_0^1 \left(\ln\frac{1}{u}\right)^{x} du = e(x)\,.\]
¹James Stirling (1692–1770), Scottish mathematician.
²Christian Goldbach (1690–1764), German mathematician.
³Adrien Marie Legendre (1752–1833), French mathematician.

182 CHAPTER 11. GAMMA AND BETA FUNCTIONS

11.1.2 Main properties of Γ


Function Γ is the natural continuation of the discrete factorial, since:

Γ(n + 1) = n! for n ∈ N , (11.4)

and, most importantly, Γ solves the functional equation:

Γ(x + 1) = x Γ(x) for any x > 0. (11.5)

These aspects are treated with the maximal generality in the Bohr–Mollerup4 Theorems 11.1–11.2.
Note that Γ(x) appears in many formulæ of Mathematical Analysis, Physics and Mathematical Statis-
tics.

Theorem 11.1. For any x > 0 , the recursion relation (11.5) is true. In particular, when x = n ∈ N ,
then (11.4) holds.

Proof. Consider (11.3) and integrate by parts:


\[\Gamma(x+1) = \int_0^\infty t^{x}\,e^{-t}\,dt = \left[-t^{x}\,e^{-t}\right]_0^{\infty} + x\int_0^\infty t^{x-1}\,e^{-t}\,dt = x\,\Gamma(x)\,.\]

For the integer argument case, exploit (11.3) to observe that:


\[\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1\,,\]

and use the just proved recursion (11.5) to compute:

Γ(2) = 1 · Γ(1) = 1 , Γ(3) = 2 · Γ(2) = 2 , Γ(4) = 3 · Γ(3) = 3 · 2 = 6 ,

and so on. Hence, Γ(n + 1) = n! can be inferred by an inductive argument.
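Both (11.4) and (11.5) can be checked directly against the standard–library implementation of Γ:

```python
import math

# functional equation (11.5): Gamma(x + 1) = x * Gamma(x)
x = 3.7
recursion_gap = abs(math.gamma(x + 1) - x * math.gamma(x))

# factorial property (11.4): Gamma(n + 1) = n!
factorials_match = all(
    math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=1e-12)
    for n in range(1, 15)
)
```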

When x > 0 , function Γ(x) is continuous and differentiable at any order. To evaluate its derivatives,
we use the differentiation of parametric integrals, obtaining:
\[\Gamma'(x) = \int_0^\infty e^{-t}\,t^{x-1}\,\ln t\,dt\,, \tag{11.6a}\]
\[\Gamma^{(2)}(x) = \int_0^\infty e^{-t}\,t^{x-1}\,(\ln t)^2\,dt\,, \tag{11.6b}\]
and more generally:
\[\Gamma^{(n)}(x) = \int_0^\infty e^{-t}\,t^{x-1}\,(\ln t)^n\,dt\,. \tag{11.7}\]

From its definition, Γ(x) is strictly positive. Moreover, since (11.6b) shows that Γ⁽²⁾(x) ≥ 0, it follows that Γ(x) is also strictly convex. We thus infer the existence of the following couple of limits:
\[\ell_0 := \lim_{x\to 0^+} \Gamma(x)\,, \qquad \ell_\infty := \lim_{x\to\infty} \Gamma(x)\,.\]

To evaluate ℓ₀, we use the inequality chain:
\[\Gamma(x) > \int_0^1 t^{x-1}\,e^{-t}\,dt > \frac{1}{e}\int_0^1 t^{x-1}\,dt = \frac{1}{x\,e}\,,\]
which ensures that:
\[\ell_0 = +\infty\,.\]
4
Harald August Bohr (1887–1951), Danish mathematician and soccer player.
Johannes Mollerup (1872–1937), Danish mathematician.
11.1. GAMMA FUNCTION 183

To evaluate ℓ∞, since we know, a priori, that such a limit exists, we can restrict the focus to natural numbers, so that we have immediately:
\[\ell_\infty = \lim_{n\to\infty} \Gamma(n) = \lim_{n\to\infty} (n-1)! = +\infty\,.\]

Observing that Γ(2) = Γ(1) and using Rolle Theorem5 , we see that there exists ξ ∈ ] 1 , 2 [ such that
Γ0 (ξ) = 0 . On the other hand, since Γ(2) (x) > 0 , the first derivative Γ0 (x) is strictly increasing, thus
there is a unique ξ such that Γ0 (ξ) = 0 . Furthermore, we have that 0 < x < ξ =⇒ Γ0 (x) < 0
and x > ξ =⇒ Γ0 (x) > 0 . This means that ξ represents the absolute minimum for Γ(x) , when
x ∈ ] 0 , ∞ [ . The numerical determination of ξ and Γ(ξ) is due to Legendre and Gauss6 :

ξ = 1.4616321449683622 , Γ(ξ) = 0.8856031944108886 .

An important property of Γ is that its logarithm is a convex function, as stated below.

Theorem 11.2. Γ(x) is logarithmically convex.

Proof. Recall the Schwarz inequality⁷ for functions whose second power is summable:
\[\left(\int_0^\infty f(t)\,g(t)\,dt\right)^{2} \le \int_0^\infty f^{2}(t)\,dt \cdot \int_0^\infty g^{2}(t)\,dt\,.\]

If we take:
\[f(t) = e^{-\frac{t}{2}}\,t^{\frac{x-1}{2}}\,, \qquad g(t) = f(t)\,\ln t\,,\]
recalling (11.3), (11.6a) and (11.6b), we find the inequality:
\[\left(\Gamma'(x)\right)^{2} \le \Gamma(x)\,\Gamma^{(2)}(x)\,.\]

Hence, we conclude that:
\[\frac{d^2}{dx^2}\,\ln\Gamma(x) = \frac{\Gamma(x)\,\Gamma^{(2)}(x) - \left(\Gamma'(x)\right)^{2}}{\Gamma^{2}(x)} \ge 0\,.\]

Property (11.5) can be iterated, therefore, if n ∈ N and x > 0 :

Γ(x + n) = (x + n − 1) (x + n − 2) · · · (x + 1) x Γ(x) . (11.8)

Definition 11.3. The quotient:
\[(x)_n := \frac{\Gamma(x+n)}{\Gamma(x)} = (x+n-1)\,(x+n-2)\cdots(x+1)\,x \tag{11.9}\]
is called the Pochhammer symbol or increasing factorial.

Rewriting (11.8) as:
\[\Gamma(x) = \frac{\Gamma(x+n)}{(x+n-1)\,(x+n-2)\cdots(x+1)\,x} \tag{11.8b}\]
allows the evaluation of Γ(x) also for negative values of x, except the integers. For instance, when x ∈ ]−1, 0[, Gamma is given by:
\[\Gamma(x) = \frac{\Gamma(x+1)}{x}\,;\]
⁵Michel Rolle (1652–1719), French mathematician. For the theorem of Rolle see, for example, mathworld.wolfram.com/RollesTheorem.html
⁶Carl Friedrich Gauss (1777–1855), German mathematician and physicist.
⁷See, for example, mathworld.wolfram.com/SchwarzsInequality.html

in particular:
\[\Gamma\!\left(-\frac{1}{2}\right) = -2\,\Gamma\!\left(\frac{1}{2}\right).\]
When x ∈ ]−2, −1[, the evaluation is:
\[\Gamma(x) = \frac{\Gamma(x+2)}{(x+1)\,x}\,,\]
thus, in particular:
\[\Gamma\!\left(-\frac{3}{2}\right) = \frac{4}{3}\,\Gamma\!\left(\frac{1}{2}\right).\]
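These negative–argument evaluations agree with the standard–library implementation of Γ, which also accepts negative non–integer arguments:

```python
import math

g_half = math.gamma(0.5)
# Gamma(-1/2) = -2 Gamma(1/2)
ok_minus_half = math.isclose(math.gamma(-0.5), -2 * g_half, rel_tol=1e-10)
# Gamma(-3/2) = (4/3) Gamma(1/2)
ok_minus_three_halves = math.isclose(math.gamma(-1.5), (4 / 3) * g_half, rel_tol=1e-10)
```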
In other words, Γ(x) is defined on the whole real line, except at the singular points x = 0, −1, −2, …, as shown in Figure 11.1.

Figure 11.1: Plot of Γ(x) .

11.2 Beta function


The Beta function B(x, y), also called the Eulerian integral of the first kind, is defined as:
\[B(x,y) = \int_0^1 t^{x-1}\,(1-t)^{y-1}\,dt\,, \qquad x\,,y > 0\,. \tag{11.10}\]
Notice that the change of variable t = 1 − s immediately provides the symmetry relation B(x, y) = B(y, x) in (11.10), while the substitution t = cos² ϑ yields:
\[B(x,y) = 2\int_0^{\pi/2} \cos^{2x-1}\vartheta\,\sin^{2y-1}\vartheta\,d\vartheta\,. \tag{11.10a}\]

The main property of the Beta function is its relationship with the Gamma function, as expressed by
Theorem 11.4 below.

Theorem 11.4. For any real x, y > 0, it holds:
\[B(x,y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}\,. \tag{11.11}\]

Proof. From the usual definition (11.3) for the Gamma function, after the change of variable t = u² ,
we have:

    Γ(x) = 2 ∫_0^{+∞} u^{2x−1} e^{−u²} du .

In the same way:

    Γ(y) = 2 ∫_0^{+∞} v^{2y−1} e^{−v²} dv .

Now, we form the product of the last two integrals above, and we use Fubini Theorem 10.4, obtaining:

    Γ(x) Γ(y) = 4 ∫∫_{[0,+∞)×[0,+∞)} u^{2x−1} v^{2y−1} e^{−(u²+v²)} du dv .

At this point, we change variable in the double integral, using polar coordinates:

    u = ρ cos ϑ ,    v = ρ sin ϑ ,

which leads to the relation:

    Γ(x) Γ(y) = 4 ( ∫_0^{+∞} ρ^{2x+2y−1} e^{−ρ²} dρ ) ( ∫_0^{π/2} cos^{2x−1} ϑ sin^{2y−1} ϑ dϑ )
              = Γ(x + y) · 2 ∫_0^{π/2} cos^{2x−1} ϑ sin^{2y−1} ϑ dϑ .

The thesis follows from (11.10a).
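Relation (11.11) lends itself to a direct numerical check. Below is a sketch of ours in plain Python (the names `beta_integral` and `beta_gamma` are hypothetical helpers, and the midpoint rule is our choice of quadrature): it compares a crude evaluation of the integral (11.10) with the Gamma quotient.

```python
import math

def beta_integral(x, y, n=200_000):
    """Midpoint-rule approximation of B(x, y) = int_0^1 t^(x-1) (1-t)^(y-1) dt."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += t ** (x - 1) * (1 - t) ** (y - 1)
    return total * h

def beta_gamma(x, y):
    """Right-hand side of (11.11)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

for x, y in [(2.0, 3.0), (2.5, 1.5), (4.0, 0.5)]:
    print(x, y, beta_integral(x, y), beta_gamma(x, y))
```

For (x, y) = (4, 0.5) the integrand is singular at t = 1, so the agreement is only to a few decimal digits; for smooth cases the two sides agree to high accuracy.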

Remark 11.5. Theorem 11.4 can be shown also using, again, Fubini Theorem 10.4, starting from:

    Γ(x) Γ(y) = ∫_0^∞ ∫_0^∞ e^{−(t+s)} t^{x−1} s^{y−1} dt ds ,

and using the changes of variable t = u v and s = u (1 − v) .

11.2.1 Γ(1/2) and the probability integral

Using (11.11), we can evaluate Γ(1/2) and, then, the probability integral (8.36). In fact, by taking
x = z and y = 1 − z , where 0 < z < 1 , we obtain:

    Γ(z) Γ(1 − z) = B(z , 1 − z) = ∫_0^1 t^{z−1} (1 − t)^{−z} dt = ∫_0^1 ( t/(1 − t) )^z (1/t) dt .

Now, the change of variable y = t (1 − t)^{−1} leads to:

    Γ(z) Γ(1 − z) = ∫_0^∞ y^{z−1} / (1 + y) dy .    (11.12)

In particular, the choice z = 1/2 yields:

    Γ(1/2) = √π .    (11.13)

To see it, observe that (11.12) implies:

    ( Γ(1/2) )² = ∫_0^∞ dy / [ (1 + y) √y ] ,    (11.14)

where the right hand–side integral is computed setting y = x² , so that:

    ( Γ(1/2) )² = 2 ∫_0^∞ dx / (1 + x²) = 2 lim_{b→∞} arctan b = π .

Employing (11.13), it is also possible to evaluate, once more, the probability integral (8.36). Evaluating
(11.3) at x = 1/2 , and then setting t = x² , we find, in fact:

    Γ(1/2) = ∫_0^∞ e^{−t} / √t dt = 2 ∫_0^∞ e^{−x²} dx ,

which implies (8.36).
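Both identities above can be sanity-checked numerically; the sketch below is ours (plain Python standard library; the truncation point L = 10 and the midpoint rule are our own choices):

```python
import math

# Gamma(1/2) = sqrt(pi), equation (11.13)
print(math.gamma(0.5))           # ~ 1.7724539
print(math.sqrt(math.pi))

# Probability integral: 2 * integral_0^inf e^{-x^2} dx = Gamma(1/2).
# We truncate the domain at L = 10; the neglected tail is below e^{-100}.
N, L = 100_000, 10.0
h = L / N
integral = sum(math.exp(-((k + 0.5) * h) ** 2) for k in range(N)) * h
print(2 * integral)              # ~ 1.7724539 again
```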

The change of variable s = t (1 − t)^{−1} , used in (11.10), provides an alternative representation for the
Beta function:

    B(x , y) = ∫_0^∞ s^{x−1} / (1 + s)^{x+y} ds .    (11.10b)

Setting s = t (1 − t)^{−1} , that is, t = s (1 + s)^{−1} and dt = (1 + s)^{−2} ds , in fact, leads to:

    B(x , y) = ∫_0^1 t^{x−1} (1 − t)^{y−1} dt = ∫_0^∞ s^{x−1} (1 + s)^{−x+1} (1 + s)^{−y+1} (1 + s)^{−2} ds
             = ∫_0^∞ s^{x−1} / (1 + s)^{x+y} ds .

The following representation Theorem 11.6 describes the family of integrals related to the Beta function.

Theorem 11.6. If x , y > 0 and a < b , then:

    ∫_a^b (s − a)^{x−1} (b − s)^{y−1} ds = (b − a)^{x+y−1} B(x , y) .    (11.15)

Proof. In (11.10), employ the change of variable t = (s − a)/(b − a) , to obtain:

    B(x , y) = ∫_0^1 t^{x−1} (1 − t)^{y−1} dt
             = ∫_a^b (s − a)^{x−1} (b − a)^{−x+1} (b − s)^{y−1} (b − a)^{−y+1} (b − a)^{−1} ds
             = (b − a)^{−x−y+1} ∫_a^b (s − a)^{x−1} (b − s)^{y−1} ds ,

which ends our argument.

The particular situation a = −1 and b = 1 is interesting, since it gives:

    ∫_{−1}^1 (1 + s)^{x−1} (1 − s)^{y−1} ds = 2^{x+y−1} B(x , y) .    (11.15b)

11.2.2 Legendre duplication formula

Legendre formula expresses Γ(2x) in terms of Γ(x) and Γ(x + 1/2) .

Theorem 11.7. It holds:

    2^{2x−1} Γ(x) Γ(x + 1/2) = √π Γ(2x) .    (11.16)

Proof. Define the integrals:

    I = ∫_0^{π/2} ( sin t )^{2x} dt    and    J = ∫_0^{π/2} ( sin(2t) )^{2x} dt .

Observe that, with the change of variable 2t = u in J , it follows that I = J . Observe, further, that:

    I = (1/2) B( x + 1/2 , 1/2 ) ,

    J = ∫_0^{π/2} ( 2 sin t cos t )^{2x} dt = 2^{2x−1} B( x + 1/2 , x + 1/2 ) .

Hence:

    (1/2) B( x + 1/2 , 1/2 ) = 2^{2x−1} B( x + 1/2 , x + 1/2 ) .    (11.17)

Recalling (11.11), equality (11.17) implies (11.16).
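The duplication formula (11.16) is easy to test numerically; the following is our own minimal sketch (helper names `lhs` and `rhs` are ours), using the standard-library `math.gamma`:

```python
import math

def lhs(x):
    """Left-hand side of (11.16): 2^(2x-1) Gamma(x) Gamma(x + 1/2)."""
    return 2 ** (2 * x - 1) * math.gamma(x) * math.gamma(x + 0.5)

def rhs(x):
    """Right-hand side of (11.16): sqrt(pi) Gamma(2x)."""
    return math.sqrt(math.pi) * math.gamma(2 * x)

for x in (0.3, 1.0, 2.7, 5.5):
    print(x, lhs(x), rhs(x))   # the two columns coincide
```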

Formula (11.16) is generalised by Gauss multiplication Theorem 11.8, which we present without proof.

Theorem 11.8. For each m ∈ N , the following formula holds true:

    Γ(x) ∏_{k=1}^{m−1} Γ( x + k/m ) = m^{1/2 − m x} (2π)^{(m−1)/2} Γ(m x) .    (11.18)

11.2.3 Euler reflexion formula


This is a famous and beautiful relation, that was found by Euler and that is stated in (11.19). It
admits two alternative proofs, which are not presented here, as one uses complex integrals, while the
other exploits infinite products.

Theorem 11.9. For any x ∈ ] 0 , 1 [ , it holds:

    Γ(x) Γ(1 − x) = π / sin(π x) .    (11.19)

From the reflexion formula (11.19), the computation of integral (11.20) follows immediately.

Corollary 11.10. For any x ∈ ] 0 , 1 [ , it holds:

    ∫_0^∞ u^{x−1} / (1 + u) du = π / sin(π x) .    (11.20)

It is possible to use (11.19) to establish a cosine reflexion formula: setting x = 1/2 + p , we obtain

    Γ( 1/2 + p ) Γ( 1/2 − p ) = π / sin( (1/2 + p) π ) = π / cos(p π) ,    (11.21)

in which we must assume p ≠ n + 1/2 , for n = 0 , 1 , . . . . In terms of the original variable x , we also
have:

    Γ(x) Γ(1 − x) = π / cos( (x − 1/2) π ) .
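Both reflexion formulas can be verified numerically; the sketch below is our own (the helper name `refl` is ours), again relying only on the standard-library `math.gamma`:

```python
import math

def refl(x):
    """Left-hand side of (11.19), for 0 < x < 1."""
    return math.gamma(x) * math.gamma(1 - x)

for x in (0.1, 0.25, 0.5, 0.9):
    print(x, refl(x), math.pi / math.sin(math.pi * x))

# cosine form (11.21): Gamma(1/2 + p) Gamma(1/2 - p) = pi / cos(pi p)
for p in (0.0, 0.2, 0.45):
    print(p, math.gamma(0.5 + p) * math.gamma(0.5 - p),
          math.pi / math.cos(math.pi * p))
```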

11.3 Definite integrals


Gamma and Beta functions are extremely useful for the computation of many definite integrals. Here,
we present some integrals, which can also be found in [40]: to solve them, the reflexion formulæ (11.19) and
(11.21) are employed. We begin with an integral identity due to Legendre.
Theorem 11.11. If n > 2 , then:

    ∫_0^1 dx / √(1 − x^n) = cos(π/n) ∫_0^∞ dx / √(1 + x^n) .    (11.22)

Proof. Legendre established formula (11.22) in equation (z) of his treatise [37]. First observe that, for
n > 2 , both integrals converge. Define:

    I₁ = ∫_0^1 dx / √(1 − x^n) ,    I₂ = ∫_0^∞ dx / √(1 + x^n) .

Employing the change of variable x^n = t , in both integrals, yields:

    I₁ = (1/n) ∫_0^1 t^{1/n − 1} / √(1 − t) dt = (1/n) B( 1/n , 1/2 ) ,

    I₂ = (1/n) ∫_0^∞ t^{1/n − 1} / √(1 + t) dt = (1/n) B( 1/n , 1/2 − 1/n ) .

In integral I₂ , above, we used the Beta representation (11.10b). Now, form the ratio I₁/I₂ and exploit
Theorem 11.4, to obtain:

    I₁/I₂ = B(1/n , 1/2) / B(1/n , 1/2 − 1/n)
          = [ Γ(1/n) Γ(1/2) / Γ(1/n + 1/2) ] · [ Γ(1/2) / ( Γ(1/n) Γ(1/2 − 1/n) ) ]
          = π / [ Γ(1/2 + 1/n) Γ(1/2 − 1/n) ] .

Thus, recalling (11.21) with p = 1/n :

    I₁/I₂ = π / ( π / cos(π/n) ) = cos(π/n) ,

which is our statement.
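Using the Beta-function expressions from the proof, (11.22) can be checked without any quadrature at all; this is our own sketch (the helper name `B` is ours), for the lemniscatic case n = 4:

```python
import math

def B(x, y):
    """Beta function via (11.11)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

n = 4
I1 = B(1 / n, 1 / 2) / n            # = int_0^1 dx / sqrt(1 - x^n), from the proof
I2 = B(1 / n, 1 / 2 - 1 / n) / n    # = int_0^inf dx / sqrt(1 + x^n)
print(I1)                           # ~ 1.3110
print(math.cos(math.pi / n) * I2)   # same value, as (11.22) predicts
```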

The same argument followed to demonstrate Theorem 11.11 can be applied to prove the following
Theorem 11.12; thus, we leave it as an exercise.

Theorem 11.12. If 2a < n , then:

    ∫_0^1 x^{a−1} / √(1 − x^n) dx = cos( aπ/n ) ∫_0^∞ z^{a−1} / √(1 + z^n) dz .    (11.23)

Theorem 11.13. If n ∈ N , n ≥ 2 , then:

    ∫_0^1 dx / (1 − x^n)^{1/n} = π / ( n sin(π/n) ) .    (11.24)

Proof. Using, again, the change of variable x^n = t leads to:

    ∫_0^1 dx / (1 − x^n)^{1/n} = (1/n) ∫_0^1 t^{1/n − 1} (1 − t)^{−1/n} dt = (1/n) ∫_0^1 t^{1/n − 1} (1 − t)^{(1 − 1/n) − 1} dt
                               = (1/n) B( 1/n , 1 − 1/n ) = (1/n) Γ(1/n) Γ(1 − 1/n) .

Thesis (11.24) follows from reflexion formula (11.19).

Theorem 11.14. For any n ≥ 2 , it holds:

    ∫_0^∞ dx / (1 + x^n) = π / ( n sin(π/n) ) .    (11.25)

Moreover, if n − m > 1 , then:

    ∫_0^∞ x^m / (1 + x^n) dx = π / ( n sin( (m + 1) π/n ) ) .    (11.26)

Proof. The change of variable 1 + x^n = 1/t is employed in the left hand–side integral of (11.25), and
therefore dx = −(1/n) t^{−1/n − 1} (1 − t)^{1/n − 1} dt :

    ∫_0^∞ dx / (1 + x^n) = (1/n) ∫_0^1 t · t^{−1/n − 1} (1 − t)^{1/n − 1} dt = (1/n) ∫_0^1 t^{(1 − 1/n) − 1} (1 − t)^{1/n − 1} dt
                         = (1/n) B( 1 − 1/n , 1/n ) = (1/n) Γ(1 − 1/n) Γ(1/n) .

Thesis (11.25) follows from reflexion formula (11.19). Formula (11.26) also follows, using an analogous
argument.
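Formulas (11.25) and (11.26) can be checked by brute-force quadrature. The sketch below is ours (the helper name `power_integral`, the truncation point L and the midpoint rule are our own choices):

```python
import math

def power_integral(m, n, L=200.0, N=200_000):
    """Midpoint approximation of int_0^L x^m / (1 + x^n) dx;
    the tail beyond L is of order L^(m + 1 - n) and is neglected."""
    h = L / N
    s = 0.0
    for k in range(N):
        x = (k + 0.5) * h
        s += x ** m / (1 + x ** n)
    return s * h

# (11.25) with n = 3
print(power_integral(0, 3), math.pi / (3 * math.sin(math.pi / 3)))
# (11.26) with m = 1, n = 4
print(power_integral(1, 4), math.pi / (4 * math.sin(math.pi * 2 / 4)))
```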

Formula (11.27), below, is needed to prove the following Theorem 11.15.

    Γ( n − 1/2 ) = ( √π / 2^{n−1} ) ∏_{k=1}^{n−1} (2k − 1) = ( √π / 2^{n−1} ) (2n − 3)!! .    (11.27)

Theorem 11.15. If n ∈ N , it holds:

    ∫_{−∞}^∞ dx / (1 + x²)^n = π (2n − 3)!! / ( 2^{n−1} (n − 1)! ) .    (11.28)

Proof. Using the symmetry of the integrand, we have:

    ∫_{−∞}^∞ dx / (1 + x²)^n = 2 ∫_0^∞ dx / (1 + x²)^n .

The change of variable 1 + x² = 1/t , that is, dx = −(1/2) t^{−3/2} (1 − t)^{−1/2} dt , leads to:

    ∫_{−∞}^∞ dx / (1 + x²)^n = ∫_0^1 t^{n − 3/2} (1 − t)^{−1/2} dt = B( n − 1/2 , 1/2 ) = ( √π / (n − 1)! ) Γ( n − 1/2 ) .

Exploiting (11.27), we arrive at thesis (11.28).
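As a concrete check of (11.28), the following is our own sketch (the helper names `closed_form` and `numeric` are ours; the double-factorial convention (−1)!! = 1 covers the case n = 1):

```python
import math

def closed_form(n):
    """Right-hand side of (11.28): pi (2n-3)!! / (2^(n-1) (n-1)!), with (-1)!! = 1."""
    df = 1
    for i in range(1, n):          # (2n-3)!! = product of 1, 3, ..., 2n-3
        df *= 2 * i - 1
    return math.pi * df / (2 ** (n - 1) * math.factorial(n - 1))

def numeric(n, L=60.0, N=200_000):
    """Midpoint approximation of int_{-L}^{L} dx / (1 + x^2)^n."""
    h = 2 * L / N
    return sum(1.0 / (1 + (-L + (k + 0.5) * h) ** 2) ** n for k in range(N)) * h

print(closed_form(1), math.pi)             # n = 1 recovers pi
for n in (2, 3):
    print(n, numeric(n), closed_form(n))   # ~ pi/2 and ~ 3*pi/8
```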

By an analogous argument, the following Theorem 11.16 can be demonstrated.

Theorem 11.16. If n p − m > 1 , then:

    ∫_0^∞ x^m / (1 + x^n)^p dx = Γ( (m+1)/n ) Γ( p − (m+1)/n ) / ( n Γ(p) ) .    (11.29)

11.4 Double integration techniques

As often happens in Analysis, evaluating a double integral in the two possible orders leads to interesting
integral identities. Here, we obtain a few such important identities, connecting the Eulerian integrals with
the reversal of the order of integration. In particular, the Fresnel integrals are attained, which are related
to the probability integral, as well as the Dirichlet integral, following the presentation in [43].
We start by proving the following beautiful identity (11.30), which holds for b > 0 and for 0 < p < 2 ,
in order to provide convergence of the integral:

    ∫_0^∞ sin(b x) / x^p dx = π b^{p−1} / ( 2 Γ(p) sin(p π/2) ) .    (11.30)

The starting point, to prove (11.30), is the double integral:

    I(b , p) = ∫_0^{+∞} ∫_0^{+∞} sin(b x) y^{p−1} e^{−x y} dx dy .

The assumptions we made on the parameters ensure summability. Hence, exploiting Fubini Theorem
10.4, the integration can be performed regardless of the order. Let us integrate, first, with respect to
x :

    I(b , p) = ∫_0^{+∞} y^{p−1} ( ∫_0^{+∞} sin(b x) e^{−x y} dx ) dy .

In this way, the inner integral turns out to be an elementary one, since:

    ∫_0^{+∞} sin(b x) e^{−x y} dx = b / (b² + y²) ,

and then:

    I(b , p) = b ∫_0^{+∞} y^{p−1} / (b² + y²) dy ,

for which, employing the change of variable t = y/b , we obtain:

    I(b , p) = b^{p−1} ∫_0^{+∞} t^{p−1} / (1 + t²) dt .

The latest formula allows to use identity (11.26) and complete the first computation:

    I(b , p) = π b^{p−1} / ( 2 sin(p π/2) ) .    (11.31)

Now, revert the order of integration:

    I(b , p) = ∫_0^{+∞} sin(b x) ( ∫_0^{+∞} y^{p−1} e^{−x y} dy ) dx .

The inner integral is immediately evaluated, in terms of the Gamma function, setting u = x y :

    ∫_0^{+∞} y^{p−1} e^{−x y} dy = (1/x^p) ∫_0^∞ u^{p−1} e^{−u} du = Γ(p) / x^p .    (11.32)

Equating (11.32) and (11.31) leads to:

    Γ(p) ∫_0^∞ sin(b x) / x^p dx = π b^{p−1} / ( 2 sin(p π/2) ) ,

which is nothing else but (11.30).
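In the particular case p = 1 , the right-hand side of (11.30) equals π/2 for every b > 0 , which is easy to check numerically. Below is our own sketch (the helper names `lhs` and `rhs`, the truncation point L and the midpoint rule are our choices):

```python
import math

def lhs(b, p, L=1000.0, N=500_000):
    """Midpoint approximation of int_0^L sin(bx)/x^p dx; for p = 1 the
    integrand is bounded at 0 and the oscillatory tail beyond L is O(1/(bL))."""
    h = L / N
    return sum(math.sin(b * (k + 0.5) * h) / ((k + 0.5) * h) ** p
               for k in range(N)) * h

def rhs(b, p):
    """Right-hand side of (11.30)."""
    return math.pi * b ** (p - 1) / (2 * math.gamma(p) * math.sin(p * math.pi / 2))

# p = 1 gives the Dirichlet integral, equal to pi/2 for every b > 0
for b in (1.0, 3.0):
    print(b, lhs(b, 1.0), rhs(b, 1.0))
```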



A first consequence of equation (11.30), corresponding to the particular case b = p = 1 , is the
Dirichlet integral (8.43), which was obtained in Exercises 8.11.3. Moreover, from (11.30), it is possible
to establish a second integral formula (11.33), which generalizes the Dirichlet integral (8.43), and
which holds for q > 1 :

    ∫_0^∞ sin(x^q) / x^q dx = ( 1/(q − 1) ) Γ(1/q) cos( π/(2q) ) .    (11.33)

To prove (11.33), the first step is the quite natural change of variable x^q = u , in the integral:

    ∫_0^∞ sin(x^q) / x^q dx = (1/q) ∫_0^∞ sin u / u^{2 − 1/q} du .

The right hand–side integral, above, has the form (11.30), with b = 1 and p = 2 − 1/q , therefore:

    ∫_0^∞ sin(x^q) / x^q dx = π / ( 2 q Γ(2 − 1/q) sin( (2 − 1/q) π/2 ) ) .

Now, evaluating the sine:

    sin( (2 − 1/q) π/2 ) = sin( π − π/(2q) ) = sin( π/(2q) ) ,

and using the reflection formula (11.19):

    Γ(2 − 1/q) = (1 − 1/q) Γ(1 − 1/q) = ( (q − 1)/q ) · π / ( sin(π/q) Γ(1/q) ) ,

we arrive at:

    ∫_0^∞ sin(x^q) / x^q dx = Γ(1/q) sin(π/q) / ( 2 (q − 1) sin(π/(2q)) ) .

Finally, the trigonometric identity:

    sin x / sin(x/2) = 2 cos(x/2) ,

implies the equality below, which simplifies to (11.33):

    ∫_0^∞ sin(x^q) / x^q dx = ( Γ(1/q) / (2 (q − 1)) ) · 2 cos( π/(2q) ) .

We now show, employing again the reversal of the order of integration, a cosine relation similar to
(11.30), namely:

    ∫_0^∞ cos(b x) / x^p dx = π b^{p−1} / ( 2 Γ(p) cos(p π/2) ) ,    (11.34)

where we must assume that 0 < p < 1 , to ensure convergence of the integral, due to the singularity
in the origin. To prove (11.34), we consider the double integral:

    ∫_0^∞ ∫_0^∞ cos(b x) y^{p−1} e^{−x y} dx dy ,

from which we show that (11.34) can be reached, via the Fubini Theorem 10.4, regardless of the order
of integration. The starting point is, then, the equality:

    ∫_0^∞ cos(b x) ( ∫_0^∞ y^{p−1} e^{−x y} dy ) dx = ∫_0^∞ y^{p−1} ( ∫_0^∞ cos(b x) e^{−x y} dx ) dy .    (11.35)

The inner integral in the right hand–side of (11.35) is elementary:

    ∫_0^∞ cos(b x) e^{−x y} dx = y / (b² + y²) .

Thus, the right hand–side of (11.35) is:

    ∫_0^∞ y^{p−1} · y / (b² + y²) dy = ∫_0^∞ y^p / (b² + y²) dy = b^{p−1} ∫_0^∞ t^p / (1 + t²) dt .

The last integral above is in the form (11.26). Therefore, the right hand–side integral of (11.35) turns
out to be:

    ∫_0^∞ y^p / (b² + y²) dy = π b^{p−1} / ( 2 sin( (p + 1) π/2 ) ) = π b^{p−1} / ( 2 cos(p π/2) ) .    (11.36)

The inner integral in the left hand–side of (11.35) is given by (11.32). Hence, the left hand–side integral
of (11.35) is:

    Γ(p) ∫_0^∞ cos(b x) / x^p dx .    (11.37)

Equating (11.36) and (11.37) leads to (11.34).

There is a further, very interesting consequence of equation (11.30), leading to the evaluation of the
Fresnel integrals (11.38), which hold for b > 0 and k > 1 :

    ∫_0^∞ sin(b x^k) dx = ( 1 / (k b^{1/k}) ) Γ(1/k) sin( π/(2k) ) .    (11.38)

To prove (11.38), we start from considering its left hand–side integral, inserting in it the change of
variable u = x^k , i.e., du = k x^{k−1} dx , thus:

    ∫_0^∞ sin(b x^k) dx = (1/k) ∫_0^∞ sin(b u) u^{1/k − 1} du = (1/k) ∫_0^∞ sin(b u) / u^{1 − 1/k} du .

The latest integral above is in the form (11.30), with p = 1 − 1/k . Hence:

    ∫_0^∞ sin(b u) / u^{1 − 1/k} du = π b^{−1/k} / ( 2 Γ(1 − 1/k) sin( (1 − 1/k) π/2 ) ) ,    (11.39)

and, thus:

    ∫_0^∞ sin(b x^k) dx = π / ( 2 k b^{1/k} Γ(1 − 1/k) sin( π/2 − π/(2k) ) ) .

At this point, employing the reflection formula (11.19):

    Γ(1/k) Γ(1 − 1/k) = π / sin(π/k) ,

and the trigonometric identities:

    sin( π/2 − π/(2k) ) = cos( π/(2k) ) ,    sin x / cos(x/2) = 2 sin(x/2) ,

we obtain (11.38).
8 Augustin–Jean Fresnel (1788–1827), French civil engineer and physicist.

In (11.38), the particular choices b = 1 and k = 2 correspond to the sine Fresnel integral:

    ∫_0^∞ sin(x²) dx = (1/2) Γ(1/2) sin(π/4) = √(π/8) .    (11.40)

Exploiting the same technique that produced (11.38) from (11.30), it is possible to derive the cosine
analogue (11.41) of the Fresnel integrals, which holds for b > 0 and k > 1 :

    ∫_0^∞ cos(b x^k) dx = ( 1 / (k b^{1/k}) ) Γ(1/k) cos( π/(2k) ) .    (11.41)

To prove (11.41), the starting point is (11.34). Then, as in the sine case, we introduce the change of
variable u = x^k , we choose p = 1 − 1/k , and, via calculations similar to those performed in the sine
case, we arrive at:

    ∫_0^∞ cos(b x^k) dx = π / ( 2 k b^{1/k} Γ(1 − 1/k) cos( π/2 − π/(2k) ) ) .

At this point, exploiting the reflection formula (11.19):

    Γ(1/k) Γ(1 − 1/k) = π / sin(π/k) ,

and the trigonometric identities:

    cos( π/2 − π/(2k) ) = sin( π/(2k) ) ,    sin x / sin(x/2) = 2 cos(x/2) ,

formula (11.41) follows.
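The sine Fresnel integral (11.40) can be verified numerically, although the increasingly rapid oscillation of sin(x²) requires some care. The sketch below is ours: we truncate at a point X with X² a multiple of 2π and add the leading term cos(X²)/(2X) of the asymptotic expansion of the tail (both the truncation strategy and the names are our own choices):

```python
import math

# Closed-form side of (11.40): (1/2) Gamma(1/2) sin(pi/4) = sqrt(pi/8)
print(0.5 * math.gamma(0.5) * math.sin(math.pi / 4))   # ~ 0.6267
print(math.sqrt(math.pi / 8))

# Numerical side: truncate int_0^inf sin(x^2) dx at X and correct for the tail.
X = math.sqrt(1000 * math.pi)
N = 200_000
h = X / N
I = sum(math.sin(((k + 0.5) * h) ** 2) for k in range(N)) * h
I += math.cos(X * X) / (2 * X)      # leading term of the oscillatory tail
print(I)                            # ~ 0.6267 again
```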


12 Fourier transform on the real line

A brief presentation of the Fourier transform is provided here, limited to those of its aspects that come
in handy while integrating a particular partial differential equation, namely, the heat equation. The latter
constitutes the main tool in solving the Black-Scholes equation, which is of great importance in
Quantitative Finance. This Chapter 12 is strongly inspired by [44].

12.1 Fourier transform


Definition 12.1. Let f be a real function of a real variable. The Fourier transform of f is the
complex valued function: Z +∞
Ff (s) := e−2 π i s t f (t) dt . (12.1)
−∞

A sufficient condition for the existence of the Fourier¹ integral is f ∈ L¹(R) . In any Fourier transform,
one value is immediate to compute, namely that corresponding to s = 0 :

    Ff (0) = ∫_{−∞}^{+∞} f (t) dt .

The inverse Fourier transform is realized by a sign change in the exponential:

    F⁻¹g(t) := ∫_{−∞}^{+∞} e^{2 π i s t} g(s) ds .    (12.2)

The Fourier inversion Theorem 12.2 explains the inversion process:

Theorem 12.2. Given f , g ∈ L¹(R) , then:

    F(F⁻¹g) = g ,    F⁻¹(Ff ) = f .

Remark 12.3. A standard definition of the Fourier transform does not exist: the definition presented
here is not unique. The reason for non–uniqueness is related to the position of the 2 π quantity: it
might be part of the exponential, as in (12.1), or it might be an external multiplicative factor, or it
might be at all missing. There is also a question on which is the Fourier transform and which is its
inverse, that is, where to set the minus sign in the exponential.
Various conventions are in common use, according to each particular study branch, and we provide a
summary of such conventions, following [35]. Let us consider the general definition:

    Ff (s) = (1/A) ∫_{−∞}^{+∞} e^{i B s t} f (t) dt .
The most common choices, found in practice, are the following pairs:

A = 2π, B = ±1 ;
A = 1, B = ±2 π ;
A = 1, B = ±1 .
1
Jean Baptiste Joseph Fourier (1768–1830), French mathematician and physicist.

195
196 CHAPTER 12. FOURIER TRANSFORM ON THE REAL LINE

Our choice (12.1) corresponds to A = 1 , B = −2 π . In computer algebra systems like, for instance,
Mathematica® , the Fourier transform is implemented as:

    F_{a,b} f (s) = √( |b| / (2 π)^{1−a} ) ∫_{−∞}^∞ e^{i b s t} f (t) dt .
Some results for the Fourier transform are now stated, starting with the Riemann–Lebesgue Theorem
12.4, whose proof is omitted for brevity.

Theorem 12.4 (Riemann–Lebesgue). If f ∈ L¹(R) then:

    lim_{|s|→∞} Ff (s) = 0 .

The following Plancherel Theorem 12.5 plays a key role in establishing the Fourier transform property
in L²(R) .

Theorem 12.5 (Plancherel). Consider f ∈ L¹(R) ∩ L²(R) . We have that Ff ∈ L²(R) , and:

    ∫_{−∞}^∞ |f (t)|² dt = ∫_{−∞}^∞ |Ff (s)|² ds .

Finally, Theorem 12.6 illustrates a further interesting property.

Theorem 12.6. Consider f , g ∈ L¹(R) . Then, it holds that:

    ∫_{−∞}^∞ Ff (s) g(s) ds = ∫_{−∞}^∞ f (x) Fg(x) dx .

12.1.1 Examples
Here, some examples are provided, on the computation of the Fourier transform. Before them, let us
state some remarks.
Remark 12.7. For each even function f (t) = f (−t) , the following relation holds:

    Ff (s) = ∫_{−∞}^{+∞} cos(2 π s t) f (t) dt = 2 ∫_0^{+∞} cos(2 π s t) f (t) dt ,    (12.3)

and, analogously, for each odd function f (t) = −f (−t) , it holds:

    Ff (s) = −i ∫_{−∞}^{+∞} sin(2 π s t) f (t) dt = −2 i ∫_0^{+∞} sin(2 π s t) f (t) dt .    (12.4)

Example 12.8. The triangle function. Consider the triangle function, defined by Λ(x) = max{ 1−
|x| , 0 } , which is equivalent to the explicit expression:
(
1 − |x| for |x| ≤ 1 ,
Λ(x) =
0 otherwise.

Figure 12.1: Graph of the triangle function.



To compute the Fourier transform, using the fact that the sine function is odd, we evaluate:

    FΛ(s) = ∫_{−∞}^{+∞} e^{−2 π i s t} Λ(t) dt
          = ∫_{−1}^1 ( cos(2 π s t) − i sin(2 π s t) ) (1 − |t|) dt
          = ∫_{−1}^1 cos(2 π s t) (1 − |t|) dt .

Since the cosine function is even, recalling (12.3), it holds:

    FΛ(s) = 2 ∫_0^1 cos(2 π s t) (1 − t) dt
          = 2 [ sin(2 π s)/(2 π s) − ( 2 π s sin(2 π s) + cos(2 π s) − 1 ) / (4 π² s²) ]
          = ( 1 − cos(2 π s) ) / (2 π² s²) .

Exploiting the trigonometric identity 1 − cos(2 x) = 2 (sin x)² , we finally arrive at:

    FΛ(s) = ( sin(π s) / (π s) )² .

Example 12.9. Exponential even function. Consider f (t) = e^{−|t|} ; then:

    Ff (s) = 2 ∫_0^{+∞} cos(2 π s t) f (t) dt = 2 ∫_0^{+∞} cos(2 π s t) e^{−t} dt
           = [ 2 e^{−t} ( 2 π s sin(2 π s t) − cos(2 π s t) ) / (4 π² s² + 1) ]_{t=0}^{t=+∞} = 2 / (1 + 4 π² s²) .

Example 12.10. Gaussian function. Define f (t) = e^{−π t²} . Then form:

    Ff (s) = ∫_{−∞}^{+∞} e^{−2 π i s t} e^{−π t²} dt .

Differentiate with respect to s :

    d/ds Ff (s) = ∫_{−∞}^{+∞} (−2 π i t) e^{−2 π i s t} e^{−π t²} dt
                = i ∫_{−∞}^{+∞} ( e^{−π t²} )′ e^{−2 π i s t} dt ,

and integrate by parts:

    d/ds Ff (s) = − i ∫_{−∞}^{+∞} e^{−π t²} (−2 π i s) e^{−2 π i s t} dt
                = −2 π s ∫_{−∞}^{+∞} e^{−π t²} e^{−2 π i s t} dt = −2 π s Ff (s) .

In other words, Ff (s) satisfies the differential equation:

    d/ds Ff (s) = −2 π s Ff (s) ,

whose unique solution, incorporating the initial condition, is:

    Ff (s) = Ff (0) e^{−π s²} .

Since:

    Ff (0) = ∫_{−∞}^{+∞} e^{−π t²} dt = 1 ,

it finally follows:

    Ff (s) = e^{−π s²} .

We have found the remarkable fact that the Gaussian f (t) = e^{−π t²} is equal to its own Fourier
transform, that is, the Gaussian function is a fixed point for the Fourier transform.
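The fixed-point property can be observed numerically: by parity, the imaginary part of the transform vanishes, so it suffices to integrate the real part. The sketch below is ours (the helper name `ft_gauss`, the truncation window [−L, L] and the midpoint rule are our own choices):

```python
import math

def ft_gauss(s, L=6.0, N=20_000):
    """Real part of int e^{-2 pi i s t} e^{-pi t^2} dt over [-L, L];
    the imaginary part vanishes by parity, the tail beyond L is negligible."""
    h = 2 * L / N
    return sum(math.cos(2 * math.pi * s * (-L + (k + 0.5) * h))
               * math.exp(-math.pi * (-L + (k + 0.5) * h) ** 2)
               for k in range(N)) * h

for s in (0.0, 0.5, 1.0, 2.0):
    print(s, ft_gauss(s), math.exp(-math.pi * s * s))   # the columns coincide
```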

Remark 12.11. The Fourier transform of the Gaussian can also be evaluated with a different method,
namely, the square completion of the exponent, which we now illustrate in detail. Form the Fourier
transform for the Gaussian function, according to (12.1):

    Ff (s) = ∫_{−∞}^{+∞} e^{−2 π i s t} e^{−π t²} dt = ∫_{−∞}^{+∞} e^{−π (2 i s t + t²)} dt .

Now, rewrite the exponent as:

    −π ( 2 i s t + t² ) = −π ( −s² + 2 i s t + t² + s² ) = −π s² − π (t + i s)² .

Thus:

    Ff (s) = e^{−π s²} ∫_{−∞}^{+∞} e^{−π (t + i s)²} dt .

Employ the change of variable √π (t + i s) = τ , so that:

    Ff (s) = ( e^{−π s²} / √π ) ∫_{−∞}^{+∞} e^{−τ²} dτ .

Hence:

    Ff (s) = e^{−π s²} .

Example 12.12. We follow an ingenious method presented in [43] (on pages 79–82 and leading, there,
to integral (3.1.7)), to evaluate the Fourier transform of:

    f (t) = 1 / (b² + t²) ,

being b a positive parameter.
From Definition (12.1), simplified into (12.3) since the given f is even, we have:

    Ff (s) = 2 ∫_0^∞ cos(2 π s t) / (b² + t²) dt .

After introducing x = 2 π s , the parametric integral has to be evaluated:

    y(x) = ∫_0^∞ cos(x t) / (b² + t²) dt .    (12.5)

Let us integrate (12.5) by parts:

    x y(x) = 2 ∫_0^∞ t sin(x t) / (t² + b²)² dt ,    (12.5a)

and, then, differentiate (12.5a):

    y(x) + x y′(x) = 2 ∫_0^∞ t² cos(x t) / (t² + b²)² dt .    (12.5b)

Now, consider the partial fraction decomposition:

    t² / (t² + b²)² = 1 / (b² + t²) − b² / (b² + t²)²

and insert it into (12.5b), to obtain:

    y(x) + x y′(x) = 2 y(x) − 2 b² ∫_0^∞ cos(x t) / (t² + b²)² dt ,

that is:

    x y′(x) − y(x) = −2 b² ∫_0^∞ cos(x t) / (t² + b²)² dt .    (12.5c)

We now differentiate (12.5c):

    x y″(x) = 2 b² ∫_0^∞ t sin(x t) / (t² + b²)² dt ,
which means that, recalling (12.5a), we have arrived at the second order linear differential equation:

    y″(x) = b² y(x) ,

whose general solution is:

    y(x) = c₁ e^{b x} + c₂ e^{−b x} .

To determine constants c₁ , c₂ , two conditions are needed: we can obtain the first one from (12.5),
that can be evaluated at x = 0 , yielding:

    y(0) = ∫_0^∞ dt / (b² + t²) = (1/b) [ arctan(t/b) ]_0^∞ = π / (2 b) ,    (12.6)

which implies c₁ + c₂ = π/(2 b) . Moreover, from (12.5a), we see that:

    y(x) = (2/x) ∫_0^∞ t sin(x t) / (t² + b²)² dt ,

and, thus:

    lim_{x→+∞} y(x) = 0 .    (12.7)

Since b > 0 , condition (12.7) implies c₁ = 0 , and this means that:

    y(x) = ∫_0^∞ cos(x t) / (b² + t²) dt = ( π / (2 b) ) e^{−b x} .    (12.8)

We can finally use (12.8), assuming s real and positive, to obtain the Fourier transform of the initial
f (t) :

    Ff (s) = (π / b) e^{−2 π b s} .
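This transform pair is also easy to test by direct quadrature; the following sketch is our own (the helper name `ft_lorentz`, the truncation point and the midpoint rule are our choices, not part of the text):

```python
import math

def ft_lorentz(s, b, L=1000.0, N=100_000):
    """2 * int_0^L cos(2 pi s t)/(b^2 + t^2) dt by the midpoint rule;
    the oscillatory tail beyond L contributes a negligible amount."""
    h = L / N
    return 2 * sum(math.cos(2 * math.pi * s * (k + 0.5) * h)
                   / (b * b + ((k + 0.5) * h) ** 2)
                   for k in range(N)) * h

b, s = 1.0, 0.25
print(ft_lorentz(s, b))                                 # ~ 0.6531
print((math.pi / b) * math.exp(-2 * math.pi * b * s))   # same value
```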
b

12.2 Properties of the Fourier transform


The Fourier transform has several useful properties.

12.2.1 Linearity
One of the simplest, and most frequently invoked properties of the Fourier transform is that it is a
linear operator. This means:
F(f + g)(s) = Ff (s) + Fg(s)
F(α f )(s) = α Ff (s) .
where α is any real or complex number.

12.2.2 The Shift Theorem


A shift of the variable t , which represents a time delay in many applications, has a simple effect on
the Fourier transform. Consider evaluating the Fourier transform of f (t + b) , for any constant b . We
introduce the special notation f_b(t) = f (t + b) , and then perform the following:

    Ff_b(s) = ∫_{−∞}^{+∞} f (t + b) e^{−2 π i s t} dt = ∫_{−∞}^{+∞} f (u) e^{−2 π i s (u − b)} du
            = e^{2 π i s b} ∫_{−∞}^{+∞} f (u) e^{−2 π i s u} du = e^{2 π i s b} Ff (s) .

12.2.3 The Stretch Theorem


How does the Fourier transform vary, if we stretch or shrink the variable t in the domain? More
precisely, when t is scaled to become a t , we want to know what happens to the Fourier transform of
f_a(t) = f (a t) . First, assume that a > 0 . Then:

    Ff_a(s) = ∫_{−∞}^{+∞} f (a t) e^{−2 π i s t} dt = (1/a) ∫_{−∞}^{+∞} f (u) e^{−2 π i s (u/a)} du
            = (1/a) ∫_{−∞}^{+∞} f (u) e^{−2 π i u (s/a)} du = (1/a) Ff (s/a) .

When a < 0 , the limits of integration are reversed, when we insert the substitution u = a t ; thus,
the resulting transform is:

    Ff_a(s) = −(1/a) Ff (s/a) .

Since −a is positive when a is negative, we can combine the two cases and present the Stretch
Theorem, assuming a ≠ 0 , as:

    Ff_a(s) = (1/|a|) Ff (s/a) .

For instance, recalling Example 12.9, the Fourier transform of g(t) = e^{−a |t|} , with a positive constant,
is:

    Fg(s) = 2 a / (a² + 4 π² s²) .

12.2.4 Combining shifts and stretches

We can combine Shift and Stretch Theorems, to find the Fourier transform of f (a t + b) :

    F( f (a · + b) )(s) = (1/|a|) e^{2 π i s b/a} Ff (s/a) .
Example 12.13. We use the above properties of the Fourier transform to show that, given:

    f (t) = ( 1 / (σ √(2π)) ) e^{−t²/(2σ²)} ,

its Fourier transform is:

    Ff (s) = e^{−2 π² σ² s²} .

We know that, if g(t) = e^{−π t²} , then Fg(s) = e^{−π s²} . Moreover, using the Stretch Theorem:

    F( g(a ·) )(s) = (1/|a|) Fg(s/a) .

Since we want to find a such that g(a t) = e^{−t²/(2σ²)} , the following relation must hold:

    −π a² t² = − t²/(2σ²) ,

that is:

    a = 1 / ( σ √(2π) ) .

Hence:

    F( g(a ·) )(s) = σ √(2π) e^{−2 π² σ² s²} ,

and, since f (t) = a g(a t) , linearity finally yields:

    Ff (s) = a · F( g(a ·) )(s) = e^{−2 π² σ² s²} .

12.3 Convolution
Convolution is an operation which combines two functions, f , g , producing a third function; as shown
in (12.9), it is defined as the integral of the pointwise multiplication of f and g , and it has independent
variable given by the amount by which either f or g is translated. Convolution finds applications
that include Probability, Statistics, computer vision, natural language processing, image and signal
processing, Engineering, and differential equations. Our interest, here, is in the last application.
Definition 12.14. The convolution of two functions g(t) and f (t) , both defined on the entire real
line, is the function defined as:
Z +∞
(g ? f )(t) = g(t − x) f (x) dx . (12.9)
−∞

Remark 12.15. Consider the case in which functions g(t) and f (t) are supported only on [0 , ∞) ,
that is, they are zero for negative arguments. Then, the integration limits can be truncated, and the
convolution is: Z t
(g ? f )(t) = g(t − x) f (x) dx . (12.10)
0

Remark 12.16. Convolution is a commutative operation:

(g ? f )(t) = (f ? g)(t) .

This follows from a simple change of variabile, namely t − x = u , in which x and u play the role of
integration variables, while t acts as a parameter:
Z ∞ Z −∞
(g ? f )(t) = g(t − x) f (x) dx = g(u) f (t − u) (−du) = (f ? g)(t) .
−∞ ∞

Some practical examples of convolution of functions are now provided, in particular to show that this
concept is of great importance in the Algebra of random variables. Let us begin with the convolution
of two power functions.

Example 12.17. (Convolution of powers) Consider g , f : [0 , +∞) → R respectively defined by
g(x) = x^a , f (x) = x^b , with a , b > 0 . Then:

    (g ⋆ f )(t) = t^a ⋆ t^b = ∫_0^t g(t − x) f (x) dx = ∫_0^t (t − x)^a x^b dx .

The last integral above can be computed via the Beta function: to see this, let us rewrite it as:

    ∫_0^t (t − x)^a x^b dx = t^a ∫_0^t ( 1 − x/t )^a x^b dx ,

so that, after the change of variable x = t u , we recognise the Eulerian integral (11.10):

    t^a ∫_0^t ( 1 − x/t )^a x^b dx = t^a ∫_0^1 (1 − u)^a t^b u^b t du = t^{a+b+1} B(a + 1 , b + 1) .

In terms of Gamma functions, recalling relation (11.11):

    t^a ⋆ t^b = t^{a+b+1} Γ(a + 1) Γ(b + 1) / Γ(a + b + 2) .    (12.11)

Formula (12.11) is more expressive, and easy to remember, rewritten as:

    ( t^a / Γ(a + 1) ) ⋆ ( t^b / Γ(b + 1) ) = t^{a+b+1} / Γ(a + b + 2) ,

which, when a = n ∈ N and b = m ∈ N , becomes:

    ( t^n / n! ) ⋆ ( t^m / m! ) = t^{n+m+1} / (n + m + 1)! .
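Identity (12.11) can be checked numerically; the sketch below is ours (the helper name `conv_powers` and the midpoint rule are our own choices):

```python
import math

def conv_powers(a, b, t, N=100_000):
    """Midpoint approximation of (t^a * t^b)(t) = int_0^t (t - x)^a x^b dx."""
    h = t / N
    return sum((t - (k + 0.5) * h) ** a * ((k + 0.5) * h) ** b
               for k in range(N)) * h

a, b, t = 2.0, 3.5, 1.7
closed = t ** (a + b + 1) * math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)
print(conv_powers(a, b, t))   # matches the closed form (12.11)
print(closed)
```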

Now, we consider the convolution of two exponential functions.

Example 12.18. (Convolution of exponentials) Consider g , f : [0 , +∞) → R respectively defined
by g(x) = e^{a x} , f (x) = e^{b x} . Assume that a ≠ b . Then:

    (g ⋆ f )(t) = e^{a t} ⋆ e^{b t} = ∫_0^t g(t − x) f (x) dx = ∫_0^t e^{a (t−x)} e^{b x} dx = ( e^{a t} − e^{b t} ) / (a − b) .

When a = b , it is, instead:

    (g ⋆ f )(t) = e^{a t} ⋆ e^{a t} = ∫_0^t g(t − x) f (x) dx = ∫_0^t e^{a (t−x)} e^{a x} dx = t e^{a t} .
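Both cases of the exponential convolution can be verified numerically as well; this is our own sketch (the helper name `conv_exp` and the midpoint rule are our choices):

```python
import math

def conv_exp(a, b, t, N=100_000):
    """(e^{at} * e^{bt})(t) = int_0^t e^{a(t-x)} e^{bx} dx by the midpoint rule."""
    h = t / N
    return sum(math.exp(a * (t - (k + 0.5) * h)) * math.exp(b * (k + 0.5) * h)
               for k in range(N)) * h

a, b, t = 2.0, -1.0, 1.5
print(conv_exp(a, b, t))
print((math.exp(a * t) - math.exp(b * t)) / (a - b))   # distinct exponents
print(conv_exp(a, a, t), t * math.exp(a * t))          # the a = b case
```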
Convolution behaves nicely with respect to the Fourier transform, as shown by the following result,
known as Convolution Theorem for Fourier transform.
Theorem 12.19. If g(t) and f (t) are both summable, then:
F(g ? f )(s) = Fg(s) Ff (s) . (12.12)
Proof. The proof is straightforward and uses Fubini Theorem 10.4; that is why we assume summability
of both functions. We can then write:
Z ∞
F(g ? f )(s) = (g ? f )(t) e−2 π i s t dt
−∞
Z ∞ Z ∞ 
−2 π i s t
= g(t − x) f (x) e dx dt
−∞ −∞
Z ∞ Z ∞ 
−2 π i s t
= g(t − x) f (x) e dt dx
−∞ −∞
Z ∞ Z ∞ 
−2 π i s t
= f (x) g(t − x) e dt dx .
−∞ −∞
With the change of variable t − x = u , so that dt = du , we obtain:
Z ∞ Z ∞ 
−2 π i s (u+x)
F(g ? f )(s) = f (x) g(u) e du dx
−∞ −∞
Z ∞ Z ∞ 
−2 π i s x −2 π i s u
= f (x) e g(u) e du dx
−∞ −∞
Z ∞
= Fg(s) f (x) e−2 π i s x dx = Fg(s) Ff (s) ,
−∞
and this completes the proof.

Remark 12.20. Another interesting property of the Fourier transform, involving convolution, is:

    F(f g)(s) = ( Ff ⋆ Fg )(s) ,

where we have to assume that f , g ∈ L¹(R) are such that their product also satisfies f g ∈ L¹(R) .

12.4 Linear ordinary differential equations


Fourier transform is useful in solving linear ordinary differential equations. To such an aim, a formula
is needed, for the Fourier transform of the derivative: it is given in Theorem 12.21.

Theorem 12.21. If f ∈ L1 (R) is a differentiable function, then:

Ff 0 (s) = 2 π i s Ff (s) . (12.13)

Proof. To obtain formula (12.13), it suffices to evaluate the Fourier transform of f 0 (t) :
Z +∞
0
Ff (s) = e−2 π i s t f 0 (t) dt ,
−∞

and integrate by parts.

Note that differentiation is transformed into multiplication: this represents another remarkable feature
of the Fourier transform, providing one more reason for its usefulness.
Formulæ for higher derivatives also hold, and the relevant result follows by mathematical induction:

Ff (n) (s) = (2 π i s)n Ff (s) . (12.14)

The derivative Theorem 12.21 is useful for solving linear ordinary, and partial, differential equations.
Example 12.22 illustrates its use with an ordinary differential equation.

Example 12.22. Consider the ordinary differential equation:

    u″ − u = −f ,

where f (t) is a given function. The problem consists in finding u(t) . Form the Fourier transform of
both sides of the stated equation:

    (2 π i s)² Fu − Fu = −Ff ,


1
Fu = Ff .
1 + 4 π 2 s2
1 1
Now, observe that quantity 2 2
is the Fourier transform of , that is,
1 + 4π s 2 e−|t|
 
1 −|t|
Fu = F e Ff
2
The right hand–side, in the above expression, is the product of two Fourier transforms. Therefore,
according to the Convolution Theorem 12.19:
1 −|t|
u(t) = e ? f (t) .
2
Written out in full, the solution is:

    u(t) = (1/2) ∫_{−∞}^{+∞} e^{−|t−τ|} f (τ) dτ .

12.5 Exercises

1. Consider the two functions f , g : [0, +∞) → R , respectively defined as f (t) = sin t e^{−t} , g(t) =
cos t e^{−t} . Show that:

    (g ⋆ f )(t) = (1/2) t e^{−t} sin t .

2. Consider the two functions f , g : [0, +∞) → R , respectively defined as f (t) = sin t e^{−2t} , g(t) =
cos t e^{−2t} . Show that:

    (g ⋆ f )(t) = (1/2) t e^{−2t} sin t .

3. Show that the Fourier transform of the function f (t) = e^{−t² − t − 1} is:

    Ff (s) = √π e^{−π² s² + i π s − 3/4} .
13 Parabolic equations

The Fourier transform method of Chapter 12 is used here, among other methods, to solve partial
differential equations of parabolic type, which are of fundamental importance in Mathematical Finance.
The exposition presented in this Chapter 13 exploits the material contained in various references;
in particular, we refer to [14], Chapter 6 of [1], Chapter 4 of [56], § 2.4 of [8], Chapter 6 of [58],
Chapter 4 of [22], and [54].

13.1 Partial differential equations


A partial differential equation (PDE) is a relation that involves partial derivatives of a function to be
determined. Denote by u such an unknown function, and let t , x , y , . . . and so on, be its indepen-
dent variables, that is, u = u(t , x , y , . . .) . Among these variables, t often represents time, and the
expressions we deal with have the following implicit form:

F (t , x , y , . . . , u , ut , ux , uy , . . . , utt , utx , . . . , uttt , . . .) = 0 , (13.1)

where the subscript notation indicates partial differentiation:

    u_t = ∂u/∂t ,    u_{tx} = ∂²u/(∂t ∂x) ,    · · · .
∂t ∂t ∂x
We assume, always, that the unknown function u is sufficiently well behaved, so that all the necessary
partial derivatives exist and the corresponding mixed partial derivatives are equal. As in the case of
ordinary differential equations, we define the order of (13.1) to be that of the highest–order partial
derivative appearing in the equation. Furthermore, we say that (13.1) is linear if F is linear as a
function of u , ut , ux , uy , utt , · · · etcetera, that is to say, F is a linear combination of the unknown
u and its derivatives.
Examples of partial differential equations are ux + uy = 3 ut − 2 x² − 5 t , which is first–order and linear, and uxx + uy = x² , which is second–order and linear.
A solution of (13.1) is a continuous function u = u(x , y , z , · · · ) that has continuous partial derivatives
and that, when substituted in (13.1), reduces equation (13.1) to an identity. For instance, u(x , t) =
x ex−t solves equation ut = uxx − 2 ux .
Example 13.1. Consider the first-order partial differential equation in the unknown u = u(x , y) :

ux + uy = 0 .

It is possible to show that a solution is:

u = φ(x − y) ,

where φ is any function, of x − y , having continuous first-order partial derivatives. Indeed, since:

ux = φ′(x − y)   and   uy = −φ′(x − y) ,

it immediately follows:

ux + uy = φ′(x − y) − φ′(x − y) = 0 .
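Claims like these are easy to sanity check with a computer algebra system. The sketch below (assuming the sympy package is available; the variable names are ours) verifies symbolically both the solution u = x e^{x−t} of ut = uxx − 2 ux mentioned above and the solution φ(x − y) of Example 13.1, for an arbitrary differentiable φ:

```python
# Symbolic check (sympy assumed available) of the two solutions above:
# u = phi(x - y) for u_x + u_y = 0, and u = x e^(x - t) for u_t = u_xx - 2 u_x.
import sympy as sp

x, y, t = sp.symbols('x y t')
phi = sp.Function('phi')

# Example 13.1: any C^1 function of x - y solves u_x + u_y = 0
u1 = phi(x - y)
residual1 = sp.simplify(sp.diff(u1, x) + sp.diff(u1, y))

# u = x e^(x - t) solves u_t = u_xx - 2 u_x
u2 = x * sp.exp(x - t)
residual2 = sp.simplify(sp.diff(u2, t) - sp.diff(u2, x, 2) + 2 * sp.diff(u2, x))
# both residuals reduce to 0
```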

205
206 CHAPTER 13. PARABOLIC EQUATIONS

13.1.1 Classification of second–order linear partial differential equations


Taking into account the nature of the PDE applications, that are of interest here, it is useful to present
a classification of second-order linear partial differential equations, where the unknown u = u(x , y)
is a function of two independent variables, i.e., PDEs of the form:
L[u] : = A(x , y) uxx + B(x , y) uxy + C(x , y) uyy
(13.2)
+ D(x , y) ux + E(x , y) uy + F (x , y) u − G(x , y) = 0 .
In the linear operator L[u] , above, functions A(x , y) , . . . , G(x , y) are continuous in some open set
Ω ⊆ R2 . Now, recall that the quadratic equation:
a x2 + b x y + c y 2 + d x + e y + f = 0
represents a hyperbola, parabola, or ellipse, according to its discriminant:
∆ = b2 − 4 a c
being positive, zero, or negative. In an analogous way, the differential operator L and the partial
differential equation (13.2) are said to be hyperbolic, parabolic, or elliptic, at a point (x0 , y0 ) ∈ Ω ,
according to the discriminant:
∆(x , y) = B 2 (x , y) − 4 A(x , y) C(x , y) (13.3)
being positive, zero, or negative, when (13.3) is evaluated at (x , y) = (x0 , y0 ) . Furthermore, L and
the PDE in (13.2) are called hyperbolic, or parabolic, or elliptic, on a domain Ω ⊆ R2 , if their
discriminant (13.3) is positive, or zero, or negative, at each point of Ω .
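The classification rule can be phrased as a tiny helper; the sketch below (an illustrative function of ours, not from the book) evaluates the discriminant (13.3) for constant coefficients:

```python
# Illustrative helper: classify a second-order linear PDE at a point from
# the coefficients A, B, C of u_xx, u_xy, u_yy, via Delta = B^2 - 4 A C.
def classify(A, B, C):
    delta = B * B - 4 * A * C
    if delta > 0:
        return 'hyperbolic'
    if delta == 0:
        return 'parabolic'
    return 'elliptic'

# Wave equation u_tt = c^2 u_xx (c = 1): A = -1, B = 0, C = 1 -> hyperbolic
# Heat equation u_t = c u_xx (c = 2):    A = -2, B = 0, C = 0 -> parabolic
# Laplace equation u_xx + u_yy = 0:      A = 1,  B = 0, C = 1 -> elliptic
```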
Example 13.2. The wave equation in one dimension:
utt = c2 uxx
is hyperbolic in any domain, since:
A = −c² ,   B = 0 ,   C = 1 ,

so that ∆ = B² − 4 A C = 4 c² > 0 .
Example 13.3. The one–dimensional heat equation, which is the main object of our study in § 13.2:
ut = c uxx , c > 0, (13.4)
is parabolic in any domain, since:
A = −c , B = 0, C=0
so that ∆ = B² − 4 A C = 0 .
Example 13.4. The potential, or Laplace equation, in two dimensions:
uxx + uyy = 0 (13.5)
is elliptic in any domain since:
A = 1, B = 0, C = 1,
so that ∆ = B² − 4 A C < 0 .

It is useful, here, to introduce the so–called Laplacian operator ∇2 u , for a two–variable function
u ∈ C 2 , defined as:
∇2 u = uxx + uyy . (13.6)

Given u , v ∈ C 2 , we can apply (13.6), recalling the nabla definition in Theorem 3.8, and the dot
(inner) product introduced in Definition 1.1, to show that:
∇²(u v) = (∇²u) v + 2 ∇u · ∇v + u ∇²v .

A C² function u is called harmonic if ∇²u = 0 . Examples of harmonic functions are u(x , y) = y³ − 3 x² y and u(x , y) = e^{k x} sin(k y) .

13.2 The heat equation


In this section, we treat the one–dimensional heat equation, that is a partial differential equation of
the form:
ut (x , t) = c uxx (x , t) + P (x , t) , x ∈ R, t > 0, (13.7)
where P (x , t) is a given real function of the two independent variables (x , t) . The n–dimensional
heat equation is:
ut (x , t) = c ∇2x u(x , t) + P (x , t) , x ∈ Rn , t > 0 , (13.8)
where the Laplace differential operator ∇2x acts on the so–called state variables x = (x1 , . . . , xn ) .
The heat equation is of great importance in Mathematical Physics, and it is of high interest, also, in
Physics and Probability. Here, only the one–dimensional case is treated, as it is useful in Mathematical
Finance. We begin with considering the homogeneous case, as well as some equations that can be
transformed into homogeneous ones. We then examine non–homogeneous equations, introducing the
Duhamel principle in § 13.5.

13.2.1 Uniqueness of solution: homogeneous case


For Mathematical Finance applications, we have to associate a given parabolic partial differential
equation with an appropriate initial condition. We treat, here, the homogeneous version of equation
(13.7), which means that we make the assumption that P (x , t) = 0 , for any (x , t) . We further
assume that c = 1 in (13.7), since this does not affect the generality of our argument, as shown in
Remark 13.5. Hence, we deal with the Cauchy problem:

ut(x , t) = uxx(x , t) ,   for t > 0 ,
u(x , 0) = f(x) ,     (13.9)

where both f (x) and u(x , t) are defined for x ∈ R , that is, −∞ < x < +∞ . The parabolic partial
differential equation in (13.9), that is:
ut = uxx , (13.10)
is sometimes called diffusion equation, given its fundamental connections with Brownian motion, i.e.,
the limit of a random walk, which is a mathematical formalization of a path that contains random
steps, and which turns out to be linked to the heat equation [36].

Remark 13.5. Generality is not affected by considering problem (13.9) instead of the following
problem (13.11), in which c 6= 1 and x ∈ R :

ut(x , t) = c uxx(x , t) ,   for t > 0 ,
u(x , 0) = f(x) .     (13.11)

To see it, let us assume that u(x , t) solves ut = c uxx , and introduce w(x , t) = u(x , t/c) . Then, w solves wt = wxx , since it verifies:

wt(x , t) = (1/c) ut(x , t/c) = (1/c) c uxx(x , t/c) = uxx(x , t/c) = wxx(x , t) .
This means that, without any loss of generality, we can assume c = 1 , and deal only with (13.9).

The first step towards solving the Cauchy problem (13.9) is to establish a uniqueness result for its
solutions. To such an aim, some assumptions are needed, as stated in the following Energy Theorem
13.6.

Theorem 13.6. Problem (13.9) admits a solution u ∈ C²(R × [0 , ∞)) , which is unique and satisfies the asymptotic condition:

lim_{|x|→∞} ux(x , t) = 0 .     (13.12)

Proof. Assume that function u(x , t) satisfies the differential equation in (13.9). Define the function:

W(t) = (1/2) ∫_{−∞}^{∞} u²(x , t) dx .

W (t) is called the energy of the solution to the heat equation in (13.9). Now, assume that there exist
two solutions to problem (13.9), say, r(x , t) and s(x , t) , and assume that both r and s satisfy
condition (13.12). At this point, if we define u to be the function u(x , t) = r(x , t) − s(x , t) , since
the heat equation in (13.9) is linear, we see that u solves, for x ∈ R :
ut(x , t) = uxx(x , t) ,   for t > 0 ,
u(x , 0) = 0 ,                                   (13.13)
lim_{|x|→∞} ux(x , t) = 0 .

We aim to prove that the unique solution to (13.13) is the zero function, since such a result implies
the thesis of Theorem 13.6. Now, condition (13.12) allows us to differentiate the energy function W(t) :

W′(t) = ∫_{−∞}^{∞} u(x , t) ut(x , t) dx = ∫_{−∞}^{∞} u(x , t) uxx(x , t) dx .

In the last equality step, above, we used the assumption that u(x , t) solves the heat equation in
(13.9). Integrating by parts, and using condition (13.12), leads to:
W′(t) = −∫_{−∞}^{∞} (ux(x , t))² dx ≤ 0 ,

which means that W (t) is a decreasing function. On the other hand, we know, a priori, that W (t) ≥ 0 .
Evaluating W (0) = 0 , we see that it must be W (t) = 0 for any t > 0 . Hence, it must also be
u(x , t) = 0 for any t > 0 , which shows that solutions r(x, t) and s(x, t) are equal.

It is possible to state a deeper result than Theorem 13.6, but its proof requires tools that go beyond the undergraduate curriculum in Economics and Management; hence, we just state such a result in Theorem 13.7, and refer to Theorem 4.4 of [22] for its proof. We remark that hypothesis (13.12) is replaced by conditions (13.14)–(13.15).

Theorem 13.7. Assume that u is a real continuous function on R × [0 , ∞) , and that it is C² on R × (0 , ∞) . Assume further that u(x , t) solves the following Cauchy problem, where x ∈ R :
(
ut = uxx , for t > 0 ,
u(x, 0) = 0 .

If, for any ε > 0 , there exists C > 0 such that, for any (x , t) ∈ R × [0 , ∞) :
|u(x , t)| ≤ C e^{ε x²} ,     (13.14)
|ux(x , t)| ≤ C e^{ε x²} ,     (13.15)

then u(x , t) is identically zero on R × [0 , ∞) .



13.2.2 Fundamental solutions: heat kernel


In force of Theorems 13.6 and 13.7, the Cauchy problem (13.9) can now be solved via analytical
methods, since, if a solution is found, then it has to be unique. To understand the type of solutions
that are obtained, it is interesting to take Remarks 13.8 into account: among them, in particular, the
last one is called scale invariance and is the most important property of solutions to (13.10).
Remarks 13.8.

• If u(x , t) solves (13.10), then function (x , t) ↦ u(x − y , t) also solves (13.10), for any y .

• If u(x , t) solves (13.10), then any of its derivatives, ut , ux , utt , ⋯ etcetera, is also a solution to (13.10).

• Any linear combination of solutions of (13.10) is also a solution to (13.10).

• Scale invariance – If u(x , t) solves (13.10), then (x , t) ↦ u(α x , α² t) is a function that also solves (13.10), for any α > 0 .

The scale invariant property provides an educated guess on the structure of the solutions to (13.10);
namely, we look for solutions of the form:
u(x , t) = w(x/√t) ,     (13.16)

being w = w(r) a differentiable real function of a real variable. Differentiating (13.16), we obtain:

ut(x , t) = −(x/(2 t^{3/2})) w′(x/√t) ,     uxx(x , t) = (1/t) w″(x/√t) .
Equating ut(x , t) and uxx(x , t) yields a linear second–order differential equation, with variable coefficients, in the unknown w = w(r) , where r := x/√t :

w″(r) = −(r/2) w′(r) ,

which can be integrated via separation of variables, in the unknown w′(r) :

w′(r) = w′(0) e^{−r²/4} .     (13.17)

Integration of (13.17) provides:

w(r) = w′(0) ∫₀^r e^{−s²/4} ds + w(0) ,

that, recovering the original coordinates, can be rewritten as:

u(x , t) = k₁ ∫₀^{x/√t} e^{−s²/4} ds + k₂ ,     (13.18)

being k1 and k2 arbitrary constants of integration. Due to its construction, function (13.18) solves
(13.10) for t > 0 , and so does, by Remarks 13.8, its partial derivative, with respect to x :
h(x , t) := ∂u/∂x (x , t) = k₁ e^{−x²/(4t)}/√t .     (13.19)
To fix one solution, we choose the integration constant k1 such that:
∫_{−∞}^{∞} h(x , t) dx = 1 .

Recalling the probability integral computation, illustrated in § 8.11.1 and § 11.2.1 and given by formula (8.36), we take k₁ = 1/√(4π) and obtain the particular solution to (13.10) known as heat kernel (or Green function, or fundamental solution for the heat equation), given by the following Gaussian:

H(x , t) = (1/√(4 π t)) e^{−x²/(4t)} .     (13.20)
This preliminary discussion is essential to understand the following integral transform approach to
solving the Cauchy problem (13.9). Starting from the Fourier transform of the Gaussian (see Examples
12.10 and 12.13), the idea is to compute the Fourier transform of both sides of the heat equation in
(13.9), with respect to x and thinking of t as a parameter. Following [22], we thus state Theorem 13.9,
namely, the analytical treatment of the heat equation based on Fourier transforms; we provide a heuristic proof for it, and refer to Theorem 4.3 of [22] for a more rigorous demonstration.
Theorem 13.9. Assume that f ∈ Lp , with 1 ≤ p ≤ ∞ . The solution to problem (13.9) is then given
by the following formula, which holds true on R × (0 , +∞) :
u(x , t) = (1/√(4 π t)) ∫_{−∞}^{+∞} e^{−(x−y)²/(4t)} f(y) dy ,     (13.21)

or, equivalently, recalling the heat kernel (13.20), by the convolution formula:
Z ∞
u(x , t) = H(x − y , t) f (y) dy . (13.22)
−∞

Moreover, if f is bounded and continuous, then the solution (13.22) is continuous on R × (0 , +∞) .
Proof. The Fourier transform of the right hand–side uxx (x , t) in (13.10) is:

Fuxx(s , t) = (2 π i s)² Fu(s , t) = −4 π² s² Fu(s , t) ,

while, for the left hand–side ut (x , t) , we note that it holds:


Fut(s , t) = ∫_{−∞}^{+∞} ut(x , t) e^{−2 π i s x} dx = (∂/∂t) ∫_{−∞}^{+∞} u(x , t) e^{−2 π i s x} dx = (∂/∂t) Fu(s , t) .
In other words, forming the Fourier transform (with respect to x ) of both sides of equation ut (x , t) =
uxx (x , t) leads to:

(∂/∂t) Fu(s , t) = −4 π² s² Fu(s , t) .

This is an ordinary differential equation in t , despite the partial derivative symbol; solving it yields:

Fu(s , t) = Fu(s , 0) e^{−4 π² s² t} ,

where the initial condition Fu(s , 0) can be computed as follows:


Z +∞ Z +∞
−2 π i s x
Fu(s , 0) = u(x , 0) e dx = f (x) e−2 π i s x dx = Ff (s) .
−∞ −∞

Putting it all together:


Fu(s , t) = Ff(s) e^{−4 π² s² t} .
Recalling Example 12.13, we recognize that the exponential factor, in the right hand–side above, is the Fourier transform FH(s , t) of the heat kernel H(x , t) in (13.20). In other words, the Fourier transform of
u is given by the product of two Fourier transforms:

Fu(s , t) = Ff(s) FH(s , t) .     (13.23)



Inverting (13.23) yields a convolution in the x–domain:

u(x , t) = H(x , t) ⋆ f(x) ,

that can be written out as (13.22).
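Formula (13.22) can be sanity-checked numerically. For Gaussian initial data f(y) = e^{−y²} the convolution of two Gaussians is itself Gaussian, so the exact solution is u(x , t) = e^{−x²/(1+4t)}/√(1+4t) . The sketch below (assuming numpy is available; names are ours) compares a Riemann-sum evaluation of (13.22) against that closed form:

```python
# Numerical check of the convolution formula (13.22) with f(y) = exp(-y^2),
# whose exact solution is u(x,t) = exp(-x^2/(1+4t)) / sqrt(1+4t).
import numpy as np

def heat_solution(x, t, f, half_width=20.0, n=4001):
    """Approximate u(x,t) = int H(x-y,t) f(y) dy by a Riemann sum."""
    y = np.linspace(-half_width, half_width, n)
    dy = y[1] - y[0]
    kernel = np.exp(-(x - y) ** 2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)
    return float(np.sum(kernel * f(y)) * dy)

x0, t0 = 0.7, 0.5
numeric = heat_solution(x0, t0, lambda y: np.exp(-y ** 2))
exact = float(np.exp(-x0 ** 2 / (1.0 + 4.0 * t0)) / np.sqrt(1.0 + 4.0 * t0))
# numeric and exact agree to many decimal places
```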

Remark 13.10. In the case of the (apparently) more general Cauchy problem (13.11), the solution
is:
u(x , t) = (1/√(4 π c t)) ∫_{−∞}^{+∞} e^{−(x−y)²/(4 c t)} f(y) dy .     (13.24)
Remark 13.11. Observe that H(x , t) vanishes very rapidly as |x| → ∞ . Then, the convolution
integral (13.22) is well defined, for t < T , if the following growth condition is fulfilled:
|f(x)| ≤ c e^{x²/(4T)} .     (13.25)

Moreover, under condition (13.25), and if f is assumed to be continuous, then, as t → 0⁺ , u(x , t) approaches f(x) uniformly on bounded sets.
Remark 13.12. The heat kernel H(x , t) is defined, by (13.20), for t > 0 only. Moreover, it is an
even function of x .

Figure 13.1: Graph of the heat kernel (13.20) for t = 1 , 2 , 4 , 1/2 , 1/4 .

Remark 13.13. Via the change of variable y = x + 2 s √t , i.e., dy = 2 √t ds , solution (13.21), to the Cauchy problem (13.9), can be written as:

u(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} f(x + 2 s √t) ds .     (13.26)
π −∞

Remark 13.14. If the initial value f (x) , appearing in problem (13.9), is an odd function, i.e.,
f (−x) = −f (x) for any x ∈ R , then (13.21) implies that solution u(x , t) is such that

u(−x , t) = −u(x , t) , for any x ∈ R and for any t > 0 .

Hence, u(0, t) = 0 , by a continuity argument.


If, instead, f (x) is an even function, i.e., f (−x) = f (x) for any x ∈ R , then:

u(−x , t) = u(x , t) , for any x ∈ R and for any t > 0 .

The heat kernel has some interesting properties, as Proposition 13.15 outlines.

Proposition 13.15. The heat kernel introduced in (13.20) verifies:

(i) Ht(x , t) = Hxx(x , t) , for t > 0 , x ∈ R ;

(ii) ∫_{−∞}^{∞} H(x , t) dx = 1 , for t > 0 ;

(iii) lim_{t→0⁺} H(x , t) = 0 for x ≠ 0 , and lim_{t→0⁺} H(x , t) = +∞ for x = 0 .
Proof. Equality (i) stems from a direct calculation. The integral value (ii) follows from the change of variable z = x/√(4t) , i.e., dx = √(4t) dz , which implies:

∫_{−∞}^{+∞} H(x , t) dx = (1/√π) ∫_{−∞}^{+∞} e^{−z²} dz = 1 .

Limit (iii) is trivial if x = 0 ; when x ≠ 0 , by the change of variable s = 1/t , and using L'Hospital's Rule¹ , we obtain:

lim_{t→0⁺} (1/√(4 π t)) e^{−x²/(4t)} = lim_{s→+∞} √s e^{−x² s/4}/√(4π) = lim_{s→+∞} 2/(√(4π) x² √s e^{x² s/4}) = 0 .
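Properties (i) and (ii) can also be verified symbolically; the sketch below assumes the sympy package is available:

```python
# Symbolic verification of Proposition 13.15 (i)-(ii) for the heat kernel.
import sympy as sp

x = sp.symbols('x', real=True)
t = sp.symbols('t', positive=True)
H = sp.exp(-x ** 2 / (4 * t)) / sp.sqrt(4 * sp.pi * t)

residual = sp.simplify(sp.diff(H, t) - sp.diff(H, x, 2))   # property (i)
mass = sp.integrate(H, (x, -sp.oo, sp.oo))                 # property (ii)
# residual reduces to 0 and mass to 1
```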

We now provide some examples on how to solve some heat equations for given initial value functions
f (x) . The first example, due to the nature of the f considered, has applications in Finance.
Example 13.16. Recall the notion of positive part of a function, introduced in Proposition 8.25, that is f⁺(x) := max{f(x) , 0} , and consider the initial value problem:

ut = uxx ,   for t > 0 ,
u(x , 0) = x⁺ ,     (13.27)

where x ∈ R . Using (13.26), we can write:


u(x , t) = (x/√π) ∫_{−x/(2√t)}^{∞} e^{−s²} ds + (2√t/√π) ∫_{−x/(2√t)}^{∞} s e^{−s²} ds

= (x/√π) ( ∫₀^{∞} e^{−s²} ds − ∫₀^{−x/(2√t)} e^{−s²} ds ) + (2√t/√π) (1/2) e^{−x²/(4t)}

= (x/√π) ( √π/2 − (√π/2) erf(−x/(2√t)) ) + (√t/√π) e^{−x²/(4t)}

= (x/2) ( 1 − erf(−x/(2√t)) ) + (√t/√π) e^{−x²/(4t)} .
Example 13.17. Using (13.26), we show that u(x , t) = x² + 2 t solves:

ut = uxx ,   for t > 0 ,
u(x , 0) = x² ,     (13.28)

where x ∈ R . Observe that the given initial value function f(x) = x² is even, thus the solution obeys Remark 13.14. Furthermore, from (13.26), we can write:

u(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} (x + 2 s √t)² ds .
¹ See, for example, mathworld.wolfram.com/LHospitalsRule.html

Now:

(x + 2 s √t)² = x² + 4 x s √t + 4 s² t .

Observe that s ↦ e^{−s²} 4 x s √t is an odd function of s ; thus, by Remark 8.58:

u(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} (x² + 4 s² t) ds

= (x²/√π) ∫_{−∞}^{∞} e^{−s²} ds + (4t/√π) ∫_{−∞}^{∞} s² e^{−s²} ds = x² + (4t/√π) c .

To find the value of the constant c , impose that the found family of functions u(x , t) solves (13.28):

ut = (4/√π) c ,     uxx = 2 ,     ⟹     c = √π/2 .

Finally, the solution to (13.28) is:

u(x , t) = x² + (4t/√π) (√π/2) = x² + 2 t .

Note that, in the previous calculation, we also re–established result (8.36d).

Example 13.18. This example is taken from [54] and consists in solving the Cauchy problem:
ut = uxx ,   x ∈ R , t > 0 ,
u(x , 0) = sin x ,   x ∈ R ,

and, then, inferring the integration formula:

∫_{−∞}^{+∞} e^{−s²} cos(a s) ds = √π e^{−a²/4} ,   ∀ a ∈ R .     (13.29)
−∞

In the case of the given Cauchy problem, formula (13.26) yields:

u(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} sin(x + 2 s √t) ds .

The trigonometric identity sin(α + β) = sin α cos β + cos α sin β implies:

sin(x + 2 s √t) = sin x cos(2 s √t) + cos x sin(2 s √t) .

Since function s ↦ e^{−s²} cos x sin(2 s √t) is odd, by Remark 8.58, we obtain:

u(x , t) = (sin x/√π) ∫_{−∞}^{∞} e^{−s²} cos(2 s √t) ds .

In other words, the solution to the given Cauchy problem has the form:

u(x , t) = (sin x/√π) y(t) ,

where we set:

y(t) = ∫_{−∞}^{∞} e^{−s²} cos(2 s √t) ds .
Since equality ut = uxx must hold, by computing:

ut = (sin x/√π) y′(t) ,     uxx = −(sin x/√π) y(t) ,

we see that y(t) solves the initial value problem, for ordinary differential equations:

y′(t) = −y(t) ,     y(0) = √π .

Thus:

y(t) = √π e^{−t} ,

and then:

u(x , t) = e^{−t} sin x .

Finally, statement (13.29) follows from the equality:

(sin x/√π) ∫_{−∞}^{∞} e^{−s²} cos(2 s √t) ds = e^{−t} sin x .
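Identity (13.29) can be spot-checked numerically; the sketch below assumes numpy is available:

```python
# Numerical spot check of (13.29):
# int_{-inf}^{inf} exp(-s^2) cos(a s) ds = sqrt(pi) exp(-a^2/4).
import math
import numpy as np

def lhs(a, half_width=10.0, n=20001):
    s = np.linspace(-half_width, half_width, n)
    ds = s[1] - s[0]
    return float(np.sum(np.exp(-s ** 2) * np.cos(a * s)) * ds)

def rhs(a):
    return math.sqrt(math.pi) * math.exp(-a ** 2 / 4.0)
# lhs(a) and rhs(a) agree to high accuracy for moderate a
```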
Example 13.19. Solve the Cauchy problem for the heat equation:
ut = uxx ,
u(x , 0) = x² + x .

Its solution is given by formula (13.26), with f(x) = x² + x , so that the integrand is:

e^{−s²} ( 4 s² t + 4 s √t x + 2 s √t + x² + x ) .

Discarding the integrand odd components, we arrive at:

u(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} ( 4 s² t + x² + x ) ds = x² + x + (4t/√π) ∫_{−∞}^{∞} s² e^{−s²} ds .
Now, imposing equality ut = uxx , we see that:

(4/√π) ∫_{−∞}^{∞} s² e^{−s²} ds = 2 ,

which leads, again, to formula (8.36d). In conclusion, the required solution is:

u(x , t) = x² + x + 2 t .

Example 13.20. Here, we use some formulæ that follow from the probability integral (8.36), and
that are extremely useful in many various applications. Consider the initial value problem:
ut = uxx ,   x ∈ R , t > 0 ,
u(x , 0) = x e^{x} ,   x ∈ R .

To obtain its solution u(x , t) , let us use formula (13.26), with f(x) = x e^{x} . Since:

f(x + 2 s √t) = 2 s √t e^{x} e^{2 s √t} + x e^{x} e^{2 s √t} ,

then:

u(x , t) = (1/√π) ( 2 √t e^{x} ∫_{−∞}^{∞} s e^{−s² + 2 √t s} ds + x e^{x} ∫_{−∞}^{∞} e^{−s² + 2 √t s} ds ) .

Now, formulæ (8.37)–(8.38), that are linked to the probability integral, yield, respectively:

∫_{−∞}^{∞} e^{−s² + 2 √t s} ds = √π e^{t} ,     ∫_{−∞}^{∞} s e^{−s² + 2 √t s} ds = √π √t e^{t} .

In conclusion:

u(x , t) = (1/√π) ( 2 √t e^{x} √π √t e^{t} + x e^{x} √π e^{t} ) = (x + 2 t) e^{t + x} .
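As a check, one can verify symbolically (sympy assumed available) that the function just found indeed solves the problem:

```python
# Symbolic verification that u = (x + 2t) e^(t + x) solves u_t = u_xx
# with initial value u(x, 0) = x e^x.
import sympy as sp

x, t = sp.symbols('x t')
u = (x + 2 * t) * sp.exp(t + x)
residual = sp.simplify(sp.diff(u, t) - sp.diff(u, x, 2))
initial = sp.simplify(u.subs(t, 0) - x * sp.exp(x))
# residual and initial both reduce to 0
```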

13.2.3 Initial data on (0, ∞)


It is interesting to associate the diffusion equation (13.10) with initial data defined only on (0 , ∞) , and not on the whole real line R . To do so, we employ here a technique similar to the d'Alembert method for the wave equation, for which we refer to § 1 of Chapter 5 in [1].
Given a continuous function f : (0 , ∞) → R , we deal, here, with two types of initial value problems, respectively:

ut = uxx ,       x > 0 , t > 0 ,
u(0 , t) = 0 ,   t > 0 ,                    (13.30)
u(x , 0) = f(x) ,

and

ut = uxx ,        x > 0 , t > 0 ,
ux(0 , t) = 0 ,   t > 0 ,                    (13.31)
u(x , 0) = f(x) .

To solve (13.30), we extend the initial data f (x) to an odd function f o (x) , called odd continuation
of f , that is defined on the whole real axis as follows:
f^o(x) = f(x) if x > 0 ,     f^o(x) = −f(−x) if x < 0 ,
and, then, we write the solution to the Cauchy problem with initial data f o , using the convolution
formula (13.22):
u(x , t) = ∫_{−∞}^{∞} H(x − y , t) f^o(y) dy

= −∫_{−∞}^{0} H(x − y , t) f(−y) dy + ∫_{0}^{∞} H(x − y , t) f(y) dy

= ∫_{0}^{∞} ( H(x − y , t) − H(x + y , t) ) f(y) dy

= ∫_{0}^{∞} H₁(x , y , t) f(y) dy ,

where H₁(x , y , t) is the so–called Green function of first kind:

H₁(x , y , t) = H(x − y , t) − H(x + y , t) = (1/√(4 π t)) ( e^{−(x−y)²/(4t)} − e^{−(x+y)²/(4t)} ) .

Note that condition u(0 , t) = 0 comes from the argument illustrated in Remark 13.14.

To solve (13.31), we form the even continuation of f :


f^e(x) = f(x) if x > 0 ,     f^e(x) = f(−x) if x < 0 ,
and, then, we solve the Cauchy problem with initial data f e , using the convolution formula (13.22):
u(x , t) = ∫_{−∞}^{∞} H(x − y , t) f^e(y) dy

= ∫_{−∞}^{0} H(x − y , t) f(−y) dy + ∫_{0}^{∞} H(x − y , t) f(y) dy

= ∫_{0}^{∞} ( H(x − y , t) + H(x + y , t) ) f(y) dy

= ∫_{0}^{∞} H₂(x , y , t) f(y) dy ,

where H₂(x , y , t) is the so–called Green function of second kind:

H₂(x , y , t) = H(x − y , t) + H(x + y , t) = (1/√(4 π t)) ( e^{−(x−y)²/(4t)} + e^{−(x+y)²/(4t)} ) .

Note that condition ux(0 , t) = 0 comes from the argument described in Remark 13.14.

13.3 Parabolic equations with constant coefficients


A partial differential equation, which is linear, parabolic, and with constant coefficients, has the form:

vt = vxx + a vx + b v , (13.32)

where a , b ∈ R . The following Theorem 13.21 shows that (13.32) can always be reduced to the heat
equation, by a suitable change of variable.

Theorem 13.21. The solution to (13.32) is given by:

v(x , t) = e^{(b − a²/4) t} e^{−a x/2} h(x , t) ,     (13.33)

where h(x , t) is a solution to the heat equation (13.9).

Proof. We seek a function that solves equation (13.32) and has the form:

v(x , t) = eα t eβ x h(x , t) . (13.34)

To do so, we impose that v(x , t) , above, is a solution to (13.32), finding α and β accordingly. Let
us compute:

vt = e^{α t} e^{β x} (α h + ht) ,
vx = e^{α t} e^{β x} (β h + hx) ,
vxx = e^{α t} e^{β x} (β² h + 2 β hx + hxx) ,

and substitute them into (13.32), rewritten as vt − vxx − a vx − b v = 0 , obtaining:

e^{α t} e^{β x} ( ht − (b − α + a β + β²) h − (a + 2 β) hx − hxx ) = 0 .


The hypothesis that h solves the heat equation allows to consider the system:
(
a + 2β = 0,
b − α + a β + β2 = 0 ,

which can be solved with respect to α and β :

α = b − a²/4 ,     β = −a/2 ,
showing that, when ht = hxx , equation (13.32) is solved by (13.33).
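The reduction of Theorem 13.21 can be spot-checked with a concrete heat-equation solution; the sketch below (sympy assumed available) takes h(x , t) = x² + 2 t and keeps a , b symbolic:

```python
# Spot check of Theorem 13.21: with h = x^2 + 2t (a heat-equation solution),
# v in (13.33) must solve v_t = v_xx + a v_x + b v for symbolic a, b.
import sympy as sp

x, t, a, b = sp.symbols('x t a b')
h = x ** 2 + 2 * t
v = sp.exp((b - a ** 2 / 4) * t) * sp.exp(-a * x / 2) * h
residual = sp.simplify(sp.diff(v, t) - sp.diff(v, x, 2) - a * sp.diff(v, x) - b * v)
# residual reduces to 0
```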

The proof to Theorem 13.21 can be adapted to the case of a Cauchy problem.

Corollary 13.22. The Cauchy problem, defined for x ∈ R and t ≥ 0 :



ut(x , t) = uxx(x , t) + a ux(x , t) + b u(x , t) ,   t > 0 ,
u(x , 0) = f(x) ,     (13.35)

is solved by:

u(x , t) = e^{(b − a²/4) t} e^{−a x/2} h(x , t) ,

where h(x , t) solves the Cauchy problem for the heat equation, given below:

ht(x , t) = hxx(x , t) ,
h(x , 0) = e^{a x/2} f(x) .

Proof. The proof follows from combining Theorems 13.9 and 13.21.

Corollary 13.22 can be further generalized to the case in which the second derivative term is multiplied
by c 6= 1 .
Corollary 13.23. The Cauchy problem, defined for x ∈ R and t ≥ 0 :

ut(x , t) = c uxx(x , t) + a ux(x , t) + b u(x , t) ,   t > 0 ,
u(x , 0) = f(x) ,     (13.36)

is solved by:

u(x , t) = e^{(b − a²/(4c)) t} e^{−a x/(2c)} h(x , t) ,

where h(x , t) solves the following Cauchy problem for the heat equation:

ht(x , t) = c hxx(x , t) ,
h(x , 0) = e^{a x/(2c)} f(x) .

Example 13.24. We want to solve the parabolic Cauchy problem:


ut = uxx − 2 ux ,
u(x , 0) = x e^{x} .

Observe that it is of the form (13.35), with f(x) = x e^{x} , a = −2 , and b = 0 . In order to use Corollary 13.22, we have to solve, first, the following Cauchy problem for the heat equation:

ht(x , t) = hxx(x , t) ,
h(x , 0) = e^{a x/2} x e^{x} = e^{−x} x e^{x} = x ,

whose solution, recalling Remark 13.13, is:

h(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} e^{a (x + 2 s √t)/2} (x + 2 s √t) e^{x + 2 s √t} ds

= (1/√π) ∫_{−∞}^{∞} e^{−s²} (x + 2 s √t) ds

= (x/√π) ∫_{−∞}^{∞} e^{−s²} ds + (2 √t/√π) ∫_{−∞}^{∞} s e^{−s²} ds

= (x/√π) √π + (2 √t/√π) · 0 = x ,

where the last equalities rely on (8.36a) and on the fact that s ↦ s e^{−s²} is an odd function, for which Remark 8.58 holds. In conclusion, the given problem is solved by:

u(x , t) = e^{(b − a²/4) t} e^{−a x/2} h(x , t) = e^{−t} e^{x} x = x e^{x − t} .
Example 13.25. We solve the Cauchy problem for the parabolic equation:

ut = uxx + 4 ux ,
u(x , 0) = x² e^{−2 x} .
This problem is of the form (13.35), with f(x) = x² e^{−2 x} , a = 4 , and b = 0 . In order to use Corollary 13.22, we have to solve, first, the following Cauchy problem for the heat equation:

ht(x , t) = hxx(x , t) ,
h(x , 0) = x² ,

whose solution, recalling Remark 13.13, is:

h(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} (x + 2 s √t)² ds

= (1/√π) ∫_{−∞}^{∞} e^{−s²} (x² + 4 x s √t + 4 s² t) ds

= (1/√π) ∫_{−∞}^{∞} e^{−s²} (x² + 4 s² t) ds

= (x²/√π) ∫_{−∞}^{∞} e^{−s²} ds + (4t/√π) ∫_{−∞}^{∞} s² e^{−s²} ds

= (x²/√π) √π + (4t/√π) (√π/2)

= x² + 2 t ,

where the chain of equalities relies on (8.36a) and (8.36d), and on the fact that s ↦ s e^{−s²} is an odd function, for which Remark 8.58 holds. In conclusion, the given problem is solved by:

u(x , t) = e^{−4 t − 2 x} (x² + 2 t) .
Example 13.26. Solve the parabolic initial value problem:
ut = uxx − 4 ux ,
u(x , 0) = x³ e^{2 x} .
This problem is of the form (13.35), with f(x) = x³ e^{2 x} , a = −4 , and b = 0 . In order to use Corollary 13.22, we have to solve, first, the following Cauchy problem for the heat equation:

ht(x , t) = hxx(x , t) ,
h(x , 0) = x³ ,

whose solution, recalling Remark 13.13, is:

h(x , t) = (1/√π) ∫_{−∞}^{∞} e^{−s²} (x + 2 s √t)³ ds

= (1/√π) ∫_{−∞}^{∞} e^{−s²} (x³ + 6 x² s √t + 12 x s² t + 8 s³ t √t) ds

= (1/√π) ∫_{−∞}^{∞} e^{−s²} (x³ + 12 x s² t) ds

= (x³/√π) √π + (12 x t/√π) (√π/2)

= x (x² + 6 t) ,

where the chain of equalities relies on (8.36a) and (8.36d), and on the fact that functions s ↦ s e^{−s²} and s ↦ s³ e^{−s²} are both odd, thus they verify Remark 8.58. In conclusion, the given problem is solved by:

u(x , t) = e^{2 x − 4 t} x (x² + 6 t) .
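The solutions of Examples 13.24, 13.25 and 13.26 can all be verified symbolically; the sketch below assumes sympy is available:

```python
# Verify the solutions of Examples 13.24-13.26 against u_t = u_xx + a u_x.
import sympy as sp

x, t = sp.symbols('x t')

def residual(u, a):
    return sp.simplify(sp.diff(u, t) - sp.diff(u, x, 2) - a * sp.diff(u, x))

u24 = x * sp.exp(x - t)                              # Example 13.24, a = -2
u25 = sp.exp(-4 * t - 2 * x) * (x ** 2 + 2 * t)      # Example 13.25, a = 4
u26 = sp.exp(2 * x - 4 * t) * x * (x ** 2 + 6 * t)   # Example 13.26, a = -4
# all three residuals reduce to 0
```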

13.3.1 Exercises

1. Show that function u(x , t) = (2 t + x) e^{2 x + t} solves the Cauchy parabolic problem:

   ut = uxx − 2 ux + u ,   t > 0 ,
   u(x , 0) = x e^{2 x} .

2. Show that function u(x , t) = (2 t + x) e^{2 t + 2 x} solves the Cauchy parabolic problem:

   ut = uxx − 2 ux + 2 u ,   t > 0 ,
   u(x , 0) = x e^{2 x} .

3. Consider a positive C² function u(x , t) , which solves (13.10) for t > 0 . Then, the following function:

   θ(x , t) = −2 ux/u

   satisfies, for t > 0 , the differential equation:

   θt + θ θx = θxx .

4. Consider the initial value problem (13.9), in which we set:

   f(x) = 1 if x > 0 ,   f(x) = 0 if x < 0 .

   Then, the solution of the so–formed initial value problem is given by:

   u(x , t) = (1/2) ( 1 + φ( x/√(4t) ) ) ,

   being φ(s) the error function, defined by:

   φ(s) = (2/√π) ∫₀^s e^{−t²} dt .
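Exercises 1 and 4 above can be checked symbolically; the sketch below assumes sympy is available (sympy provides the error function erf):

```python
# Symbolic check of Exercises 1 and 4 of this section.
import sympy as sp

x = sp.symbols('x', real=True)
t = sp.symbols('t', positive=True)

# Exercise 1: u = (2t + x) e^(2x + t) solves u_t = u_xx - 2 u_x + u
u1 = (2 * t + x) * sp.exp(2 * x + t)
res1 = sp.simplify(sp.diff(u1, t) - sp.diff(u1, x, 2) + 2 * sp.diff(u1, x) - u1)

# Exercise 4: u = (1 + erf(x / sqrt(4t))) / 2 solves the heat equation
u4 = (1 + sp.erf(x / sp.sqrt(4 * t))) / 2
res4 = sp.simplify(sp.diff(u4, t) - sp.diff(u4, x, 2))
# both residuals reduce to 0
```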

13.4 Black–Scholes equation


We are finally in the position to solve analytically the Black–Scholes equation, namely, a parabolic partial differential equation, with variable coefficients, which is at the base of Quantitative Finance, as it governs the price evolution of a European call option or put option, under the Black–Scholes² model. This is not the place to explain the origin and the economic foundation of such a model, for which we refer the Reader to [7]. The Black–Scholes equation is:

∂V/∂t + (1/2) σ² S² ∂²V/∂S² + r S ∂V/∂S − r V = 0 ,     S ≥ 0 , t ∈ [0 , T ] ,     (13.37)
where:
² Fischer Sheffey Black (1938–1995), American economist. Myron Samuel Scholes (born 1941), Canadian–American financial economist.

• t is the time;

• S = S(t) is the price of the underlying asset, at time t ;

• V = V (S , t) is the value of the option;

• T is the expiration date;

• σ is the volatility of the underlying asset;

• r is the risk–free interest rate.


We assume that r and σ are constant. The treatment of (13.37) is based on the interesting on–line material provided by [14]. We reduce the Black–Scholes equation to a general parabolic equation, with constant coefficients. For clarity, and for consistency with the previous chapters of this book, we employ the (x , t) notation to model the Black–Scholes equation, namely:

ut + (1/2) σ² x² uxx + r x ux − r u = 0 ,   x ≥ 0 , 0 ≤ t ≤ T .
In other words, the Black-Scholes equation can be described as:
ut = a x² uxx + b x ux + c u ,
u(x , 0) = f(x) ,     (13.38)

where a , b , c are given real numbers. Equation (13.38) can be turned into a parabolic equation, with constant coefficients, using the change of variable:

x = e^{y} ,   t = τ/a ,     that is to say,     y = ln x ,   τ = a t .     (13.39)
Equating u(x , t) = v(y , τ) , the transformed differential equation is obtained:

vτ = vyy + (b/a − 1) vy + (c/a) v ,
v(y , 0) = f(e^{y}) .     (13.40)
To see it, let us compute the partial derivatives of u(x , t) in terms of the transformed function v(y , τ) :

ut = (∂v/∂τ)(∂τ/∂t) = (∂v/∂τ) ∂(a t)/∂t = a vτ ;

ux = (∂v/∂y)(∂y/∂x) = (∂v/∂y) ∂(ln x)/∂x = (1/x) vy ;

uxx = ∂ux/∂x = ∂/∂x ( (1/x) vy ) = −(1/x²) vy + (1/x) ∂/∂x (∂v/∂y)
    = −(1/x²) vy + (1/x) ( ∂/∂y (∂v/∂y) ) (∂y/∂x)
    = −(1/x²) vy + (1/x²) vyy = (1/x²) (vyy − vy) .     (13.41)

Note that, in computing uxx , above, we wrote the differential operator as:

∂/∂x = (∂/∂y)(∂y/∂x) .

Inserting (13.41) into (13.38) yields:

a vτ = a x² (1/x²) (vyy − vy) + b x (1/x) vy + c v ,
v(y , 0) = f(e^{y}) ,
which is indeed (13.40). At this point, observe that (13.40) is a constant coefficients problem of the
form (13.35), which we rewrite here, for convenience, as:
vτ = vyy + A vy + B v ,
v(y , 0) = g(y) .     (13.42)

Recalling Corollary 13.22, we know that, in order to solve (13.42), we have to consider, first, the following heat equation:

hτ = hyy ,
h(y , 0) = e^{A y/2} g(y) ,

whose solution h(y , τ) is a component of the following function, that solves (13.42):

v(y , τ) = e^{(B − A²/4) τ} e^{−A y/2} h(y , τ) .

In other words, since in our case it is:

A = b/a − 1 = (b − a)/a ,     B = c/a ,

we have to consider the heat equation:

hτ = hyy ,
h(y , 0) = e^{(b−a) y/(2a)} f(e^{y}) ,

and solve it, using (13.26), thus obtaining:

h(y , τ) = (1/√π) ∫_{−∞}^{+∞} e^{−s²} e^{(b−a)(y + 2 s √τ)/(2a)} f(e^{y + 2 s √τ}) ds .     (13.43)

Then, we can insert (13.43) into the solution of the transformed problem (13.40), which is:

v(y , τ) = e^{(c/a − (1/4)((b−a)/a)²) τ} e^{−(b−a) y/(2a)} h(y , τ) .     (13.44)

Given the variable coefficient problem (13.38), equations (13.43)–(13.44) lead automatically to the
solution of the transformed problem (13.40), via the evaluation of the integral involved. Once such
a solution is computed, the solution to the given problem (13.38) can be obtained recovering the
original variables through (13.39). The following Examples 13.27, 13.28 and 13.29 illustrate the solution
procedure.

Example 13.27. Compute the solution to the parabolic problem:


ut = x² uxx + x ux + u ,
u(x , 0) = x .     (13.45)

The given problem has the form (13.38), with a = b = c = 1 and f(x) = x . Let us form (13.43):

h(y , τ) = (1/√π) ∫_{−∞}^{+∞} e^{−s²} e⁰ e^{y + 2 √τ s} ds = (1/√π) ∫_{−∞}^{+∞} e^{−s² + 2 √τ s + y} ds = e^{y + τ} ,

where we used the integration formula (8.37). From (13.44), we arrive at the solution of the transformed
problem (13.40):
v(y , τ ) = ey+2 τ .
Finally, we use (13.39), to recover the original variables:

τ = t, y = ln x ,

so that solution to (13.45) is:


u(x , t) = x e2 t .

Example 13.28. Compute the solution to the parabolic problem:


ut = 2 x² uxx + 4 x ux + u ,
u(x , 0) = x .     (13.46)

This problem has the form (13.38), with a = 2 , b = 4 , c = 1 and f (x) = x . As in the previous
Example 13.28, problem (13.46) can be solved working in exact arithmetic. Here, (13.43) becomes:

$$
\begin{aligned}
h(y,\tau) &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2}\, e^{\frac{y+2\,s\,\sqrt{\tau}}{2}}\, e^{\,y+2\,s\,\sqrt{\tau}}\, ds\\[4pt]
&= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2+3\,\sqrt{\tau}\,s+\frac{3\,y}{2}}\, ds = e^{\,\frac{3}{2}\,y+\frac{9}{4}\,\tau}\,,
\end{aligned}
$$

where the last equality is obtained via the integration formula (8.37). Applying (13.44), we arrive at the solution of the transformed problem (13.40), that is:
$$
v(y,\tau) = e^{\,y+\frac{5}{2}\,\tau}\,.
$$

Finally, recovering the original variables by means of the change of variable (13.39), that here is:
$$
\tau = 2\,t\,, \qquad y = \ln x\,,
$$
we arrive at the solution of the differential equation (13.46), namely:
$$
u(x,t) = x\, e^{5\,t}\,.
$$

Example 13.29. Solve the Cauchy problem:
$$
\begin{cases}
u_t = 2\,x^2\, u_{xx} + x\, u_x + u\,, & x > 0\,,\ t > 0\,,\\[2pt]
u(x,0) = \ln x\,.
\end{cases}
$$
This problem has the form (13.38), with $a = 2$, $b = c = 1$ and $f(x) = \ln x$. The solution (13.43) of the associated heat equation is:

$$
\begin{aligned}
h(y,\tau) &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2}\, e^{-\frac{y+2\,s\,\sqrt{\tau}}{4}} \left(y + 2\,s\,\sqrt{\tau}\right) ds\\[4pt]
&= \frac{y}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-\left(s^2 + \frac{s\,\sqrt{\tau}}{2} + \frac{y}{4}\right)}\, ds + \frac{2\,\sqrt{\tau}}{\sqrt{\pi}} \int_{-\infty}^{+\infty} s\, e^{-\left(s^2 + \frac{s\,\sqrt{\tau}}{2} + \frac{y}{4}\right)}\, ds\\[4pt]
&= \frac{y}{\sqrt{\pi}}\, \sqrt{\pi}\; e^{\frac{\tau}{16} - \frac{y}{4}} + \frac{2\,\sqrt{\tau}}{\sqrt{\pi}} \left(-\sqrt{\pi}\, \frac{\sqrt{\tau}}{4}\right) e^{\frac{\tau}{16} - \frac{y}{4}}\\[4pt]
&= \left(y - \frac{\tau}{2}\right) e^{\frac{\tau - 4\,y}{16}}\,,
\end{aligned}
$$
2

where the Gaussian integration formulæ (8.37)–(8.38) were employed. Using (13.44), the solution of the transformed problem (13.40) is obtained:
$$
v(y,\tau) = e^{\frac{7\,\tau}{16}}\, e^{\frac{y}{4}} \left(y - \frac{\tau}{2}\right) e^{\frac{\tau - 4\,y}{16}} = e^{\frac{\tau}{2}} \left(y - \frac{\tau}{2}\right).
$$
Finally, we use (13.39), to recover the original variables:
$$
\tau = 2\,t\,, \qquad y = \ln x\,,
$$
so that the solution to the given Cauchy problem is:
$$
u(x,t) = e^{t}\, (\ln x - t)\,.
$$
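The three closed forms just derived can also be checked mechanically: the sketch below computes a central-finite-difference residual $u_t - (a\,x^2\,u_{xx} + b\,x\,u_x + c\,u)$ at a sample point, which should vanish up to discretization error. The helper name, sample point and tolerances are choices of this illustration, not of the text.

```python
import math

def pde_residual(u, a, b, c, x, t, h=1e-4):
    """Central-difference residual u_t - (a x^2 u_xx + b x u_x + c u) at (x, t)."""
    ut = (u(x, t + h) - u(x, t - h)) / (2 * h)
    ux = (u(x + h, t) - u(x - h, t)) / (2 * h)
    uxx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return ut - (a * x * x * uxx + b * x * ux + c * u(x, t))

# Example 13.27: u = x e^{2t}       solves u_t = x^2 u_xx + x u_x + u
assert abs(pde_residual(lambda x, t: x * math.exp(2 * t), 1, 1, 1, 1.3, 0.7)) < 1e-4
# Example 13.28: u = x e^{5t}       solves u_t = 2 x^2 u_xx + 4 x u_x + u
assert abs(pde_residual(lambda x, t: x * math.exp(5 * t), 2, 4, 1, 1.3, 0.7)) < 1e-4
# Example 13.29: u = e^t (ln x - t) solves u_t = 2 x^2 u_xx + x u_x + u
assert abs(pde_residual(lambda x, t: math.exp(t) * (math.log(x) - t), 2, 1, 1, 1.3, 0.7)) < 1e-4
```

The same one-line check settles the two exercises of § 13.4.1.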

13.4.1 Exercises
1. Show that the function $u(x,t) = x^2\, e^{5\,t}$ solves the Cauchy parabolic problem:
$$
\begin{cases}
u_t = x^2\, u_{xx} + x\, u_x + u\,, & t > 0\,,\\[2pt]
u(x,0) = x^2\,.
\end{cases}
$$
2. Show that the function $u(x,t) = e^{2\,t}\, (\ln x - 2\,t)$ solves the Cauchy parabolic problem:
$$
\begin{cases}
u_t = x^2\, u_{xx} - x\, u_x + 2\,u\,, & t > 0\,,\\[2pt]
u(x,0) = \ln x\,.
\end{cases}
$$

13.5 Non–homogeneous equation: Duhamel integral


We study here the non–homogeneous initial value problem:
$$
\begin{cases}
u_t(x,t) = c\, u_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
u(x,0) = f(x)\,,
\end{cases} \tag{13.47}
$$
which differs from (13.11) in the source term $P(x,t)$. For problem (13.47), a general solution formula is obtained, which is analogous to (13.21). To do so, we follow an approach that generalizes the variation of parameters method, illustrated in Theorem 5.26 and in § 6.2.1. Such a generalization applies to linear partial differential equations, and it is known as the Duhamel integral or principle. To understand how it works, we first present Example 13.30, which refers to an ordinary differential equation, revisited having in mind the Duhamel³ approach.
Example 13.30. Let $a \in \mathbb{R}$ be a given real number, and let $f(t)$ be a continuous function, defined on $[0,+\infty)$. Then, the linear initial value problem:
$$
\begin{cases}
y'(t) = a\, y(t) + f(t)\,, & t > 0\,,\\[2pt]
y(0) = 0\,,
\end{cases} \tag{13.48}
$$
has the solution illustrated in Theorem 5.26, namely:
$$
y(t) = e^{a\,t} \int_0^t e^{-a\,s}\, f(s)\, ds = \int_0^t f(s)\, e^{a\,(t-s)}\, ds\,. \tag{13.49}
$$

Consider, now, the following set of linear homogeneous equations, depending on the parameter $s$, which plays the role of a dummy variable:
$$
\begin{cases}
u'(t) = a\, u(t)\,, & t > 0\,,\\[2pt]
u(0) = f(s)\,.
\end{cases} \tag{13.50}
$$
³ Jean–Marie Constant Duhamel (1797–1872), French mathematician and physicist.

At this point, observe that the solution (which is a function of $t$) of the parametric problem (13.50) is:
$$
u(t,s) = f(s)\, e^{a\,t}\,. \tag{13.51}
$$
Hence, if we look back at solution (13.49) of problem (13.48), we can write it as:
$$
y(t) = \int_0^t u(t-s\,,s)\, ds\,. \tag{13.52}
$$

We can interpret the solution just found in the following way. The solution of the non–homogeneous equation $(y' = a\,y + f(t))$, corresponding to the zero–value initial condition $(y(0) = 0)$, is obtained from the solution of the homogeneous equation $(u' = a\,u)$, when it is parametrized by the non–homogeneous initial condition $(u(0) = f(s))$.
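This interpretation is easy to test numerically: the sketch below evaluates (13.52) by the trapezoidal rule and compares it with the closed form that (13.49) gives for $a = 1$ and $f \equiv 1$, namely $y(t) = e^t - 1$. The function name `duhamel_ode` is an illustrative choice, not from the text.

```python
import math

def duhamel_ode(a, f, t, n=2000):
    """Approximate y(t) = integral_0^t u(t - s, s) ds from (13.52),
    where u(t, s) = f(s) e^{a t} solves the parametric problem (13.50)."""
    u = lambda tt, s: f(s) * math.exp(a * tt)   # formula (13.51)
    ds = t / n
    total = 0.5 * (u(t, 0.0) + u(0.0, t))       # trapezoidal endpoints
    for i in range(1, n):
        s = i * ds
        total += u(t - s, s)
    return total * ds

# a = 1, f = 1: formula (13.49) gives the closed form y(t) = e^t - 1
t = 0.8
assert abs(duhamel_ode(1.0, lambda s: 1.0, t) - (math.exp(t) - 1.0)) < 1e-6
```

The trapezoidal error here is of order $t^3/n^2$, far below the tolerance used in the check.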

The argument set out in Example 13.30 constitutes the foundation of the Duhamel method for partial differential equations. Though the method is not limited to parabolic equations, we use it here to solve non–homogeneous parabolic equations. Let us consider, in fact, the initial value problem:
$$
\begin{cases}
u_t(x,t) = c\, u_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
u(x,0) = 0\,.
\end{cases} \tag{13.53}
$$

Applying a procedure similar to that of Example 13.30, we build the homogeneous parametric initial value problem:
$$
\begin{cases}
h_t = c\, h_{xx}\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
h(x,0) = P(x,s)\,.
\end{cases} \tag{13.54}
$$
Observe that (13.54) has the form of the homogeneous initial value problem (13.11), whose solution is given by (13.24). Thus, the solution to the parametric problem (13.54) can be expressed in terms of the heat kernel $H(x,t)$, introduced in (13.20), modified as follows:
$$
H(x,t) = \frac{1}{\sqrt{4\,\pi\,c\,t}}\, e^{-\frac{x^2}{4\,c\,t}}
$$
and employed to define:
$$
h(x,t\,;s) = \int_{-\infty}^{\infty} H(x-y\,,t)\, P(y,s)\, dy\,. \tag{13.55}
$$
Finally, motivated by considerations similar to those described in Example 13.30, we conclude that the solution to (13.53) should be:
$$
\begin{aligned}
u(x,t) &= \int_0^t h(x\,,t-s\,;s)\, ds\\[4pt]
&= \int_0^t \left( \int_{-\infty}^{\infty} H(x-y\,,t-s)\, P(y,s)\, dy \right) ds\\[4pt]
&= \frac{1}{2\,\sqrt{\pi\,c}} \int_0^t \left( \int_{-\infty}^{\infty} \frac{1}{\sqrt{t-s}}\, e^{-\frac{(x-y)^2}{4\,c\,(t-s)}}\, P(y,s)\, dy \right) ds\,.
\end{aligned} \tag{13.56}
$$

As a matter of fact, the function $u(x,t)$, defined in (13.56), is indeed a solution to the initial value problem (13.53). The technical details, concerning differentiation under the integral sign, are omitted here, but we point out that formula (13.56) can be used to compute solutions to (13.53) in explicit form.
Example 13.31. We solve, here, the non–homogeneous initial value problem:
$$
\begin{cases}
u_t(x,t) = u_{xx}(x,t) + x\,t\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
u(x,0) = 0\,.
\end{cases} \tag{13.57}
$$

We build (13.56), with $P(x,t) = x\,t$, i.e.:
$$
u(x,t) = \frac{1}{2\,\sqrt{\pi}} \int_0^t s \left( \int_{-\infty}^{\infty} \frac{y}{\sqrt{t-s}}\, e^{-\frac{(x-y)^2}{4\,(t-s)}}\, dy \right) ds\,. \tag{13.58}
$$
Now, perform the change of variable, from $y$ to $z$:
$$
z = \frac{x-y}{2\,\sqrt{t-s}}\,,
$$

so that the evaluation of the inner integral in (13.58) simplifies as follows:
$$
\begin{aligned}
\frac{1}{2\,\sqrt{\pi}} \int_{-\infty}^{\infty} \frac{y}{\sqrt{t-s}}\, e^{-\frac{(x-y)^2}{4\,(t-s)}}\, dy &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} \left(x - 2\,z\,\sqrt{t-s}\right) e^{-z^2}\, dz\\[4pt]
&= \frac{x}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-z^2}\, dz = x\,,
\end{aligned}
$$
where the chain of equalities relies on (8.36a) and on the fact that the function $z \mapsto z\, e^{-z^2}$ is odd, thus it verifies Remark 8.58. In conclusion, the solution to the initial value problem (13.57) is:
$$
u(x,t) = \int_0^t s\, x\, ds = \frac{1}{2}\, x\, t^2\,.
$$
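The same computation suggests how to evaluate (13.56) numerically: after the substitution $z = (x-y)/(2\sqrt{t-s})$, the inner integral becomes $\frac{1}{\sqrt{\pi}}\int e^{-z^2}\, P(x - 2\,z\,\sqrt{t-s},\, s)\, dz$, with no singular factor left in $s$. The sketch below (with $c = 1$; the name `duhamel_heat` and the truncation bounds are choices of this illustration) recovers $u = \frac12 x\,t^2$ for $P(x,t) = x\,t$.

```python
import math

def duhamel_heat(P, x, t, ns=400, nz=801, L=6.0):
    """Approximate u(x, t) from (13.56) with c = 1, after the substitution
    z = (x - y)/(2 sqrt(t - s)), which turns the inner integral into
    (1/sqrt(pi)) * integral of exp(-z^2) P(x - 2 z sqrt(t-s), s) dz."""
    ds = t / ns
    dz = 2 * L / (nz - 1)
    total = 0.0
    for i in range(ns):
        s = (i + 0.5) * ds              # midpoint rule in s, so s < t always
        r = math.sqrt(t - s)
        inner = 0.0
        for j in range(nz):             # trapezoidal rule in z on [-L, L]
            z = -L + j * dz
            w = 0.5 if j in (0, nz - 1) else 1.0
            inner += w * math.exp(-z * z) * P(x - 2 * z * r, s)
        total += inner * dz / math.sqrt(math.pi)
    return total * ds

# P(x, t) = x t: Example 13.31 gives the closed form u(x, t) = x t^2 / 2
x, t = 1.4, 0.9
assert abs(duhamel_heat(lambda y, s: y * s, x, t) - 0.5 * x * t * t) < 1e-4
```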

Note that formula (13.56) provides the solution to the particular initial value problem (13.53), with zero–value initial condition. To arrive at the solution in the general case (13.47), we exploit the linearity of the differential equation, using a superposition technique, that is based on solving two initial value problems, namely:
$$
\begin{cases}
v_t(x,t) = c\, v_{xx}(x,t)\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
v(x,0) = f(x)\,,
\end{cases} \tag{13.59}
$$
and
$$
\begin{cases}
w_t(x,t) = c\, w_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\ t > 0\,,\\[2pt]
w(x,0) = 0\,.
\end{cases} \tag{13.60}
$$
It turns out that the function $u(x,t) = v(x,t) + w(x,t)$ solves the initial value problem (13.47). In other words, using formulæ (13.24) and (13.56) jointly, we can state that the solution $u(x,t)$ to the initial value problem (13.47) is given by:
$$
\begin{aligned}
u(x,t) &= \frac{1}{2\,\sqrt{\pi\,c\,t}} \int_{-\infty}^{+\infty} e^{-\frac{(x-y)^2}{4\,c\,t}}\, f(y)\, dy\\[4pt]
&\quad + \frac{1}{2\,\sqrt{\pi\,c}} \int_0^t \int_{-\infty}^{\infty} \frac{1}{\sqrt{t-s}}\, e^{-\frac{(x-y)^2}{4\,c\,(t-s)}}\, P(y,s)\, dy\, ds\,.
\end{aligned} \tag{13.61}
$$
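As a sanity check of (13.61), take $c = 1$, $f(x) = x$ and $P(x,t) = x\,t$: the homogeneous part gives $v = x$ (a stationary solution of the heat equation, since the kernel has mean $x$), while Example 13.31 gives $w = \frac12 x\,t^2$, so $u = x\,(1 + t^2/2)$. The sketch below evaluates both terms of (13.61) with the Gaussian substitution $y = x + 2\,z\,\sqrt{c\,(t-s)}$; all names and discretization choices are illustrative.

```python
import math

def solve_nonhomogeneous(f, P, x, t, c=1.0, ns=400, nz=801, L=6.0):
    """Approximate u(x, t) from (13.61): heat-kernel term for f plus the
    Duhamel term for P, both via the substitution y = x + 2 z sqrt(c (t-s))."""
    dz = 2 * L / (nz - 1)

    def gauss_avg(g):
        # (1/sqrt(pi)) * integral of exp(-z^2) g(z) dz, trapezoidal rule
        total = 0.0
        for j in range(nz):
            z = -L + j * dz
            w = 0.5 if j in (0, nz - 1) else 1.0
            total += w * math.exp(-z * z) * g(z)
        return total * dz / math.sqrt(math.pi)

    # first term of (13.61): heat evolution of the initial datum f
    u = gauss_avg(lambda z: f(x + 2 * z * math.sqrt(c * t)))
    # second term of (13.61): Duhamel part, midpoint rule in s
    ds = t / ns
    for i in range(ns):
        s = (i + 0.5) * ds
        r = math.sqrt(c * (t - s))
        u += ds * gauss_avg(lambda z, r=r, s=s: P(x + 2 * z * r, s))
    return u

# c = 1, f(x) = x, P(x, t) = x t: expected u = x (1 + t^2 / 2)
x, t = 1.2, 0.8
assert abs(solve_nonhomogeneous(lambda y: y, lambda y, s: y * s, x, t) - x * (1 + t * t / 2)) < 1e-4
```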
