Introductory Mathematical Analysis For Quantitative Finance: Daniele Ritelli and Giulia Spaletta

Contents

1 Euclidean space
1.1 Vectors
1.2 Topology of Rn
1.3 Limits of functions
The purpose of this book is to be a tool for students, with little mathematical background, who
aim to study Mathematical Finance. The only prerequisites assumed are one–dimensional differential
calculus, infinite series, Riemann integral and elementary linear algebra.
In a sense, it is a sort of intensive course, or crash–course, which allows students, with minimal
knowledge in Mathematical Analysis, to reach the level of mathematical expertise necessary in modern
Quantitative Finance. These lecture notes concern pure mathematics, but the arguments presented
are oriented to Financial applications. The n–dimensional Euclidean space is briefly introduced, in
order to deal with multivariable differential calculus. Sequences and series of functions are introduced,
in view of theorems concerning the passage to the limit in Measure theory, and their role in the
general theory of ordinary differential equations, which is also presented. Due to its importance in
Quantitative Finance, the Radon–Nikodym theorem is stated, without proof, since the von Neumann
argument requires notions of Functional Analysis, which would require a dedicated course. Finally, in
order to solve the Black–Scholes partial differential equation, basics in ordinary differential equations
and in the Fourier transform are provided.
We kept our exposition as short as possible, as these lectures are intended as a first contact with the
mathematical concepts used in Quantitative Finance, often taught in a one–semester course. This book,
therefore, is not aimed at a specialized audience, although the material presented here can be used by
experts and non–experts alike, to gain a clear idea of the mathematical tools used in Finance.
1 Euclidean space
1.1 Vectors
If n ∈ N , we use the symbol Rn to indicate the Cartesian1 product of n copies of R with itself, i.e.:
Rn := {(x1 , x2 , . . . , xn ) : xj ∈ R for j = 1, 2, . . . , n} .
The concept of Euclidean2 space is not limited to the set Rn , but it also includes the so–called Euclidean
inner product, introduced in Definition 1.1. The integer n is called dimension of Rn , the elements
x = (x1 , x2 , . . . , xn ) of Rn are called points, or vectors or ordered n–tuples, while xj , j = 1, . . . , n , are
the coordinates, or components, of x. Vectors x and y are equal if xj = yj for j = 1, 2, . . . , n . The zero
vector is the vector whose components are null, that is, 0 := (0, 0, . . . , 0). In low dimension situations,
i.e. for n = 2 or n = 3, we will write x = (x , y) and x = (x , y , z) , respectively.
For our purposes, that is, extending differential calculus to functions of several variables, we need to
define an algebraic structure in Rn . This is done by introducing the following operations.

Definition 1.1. Given x , y ∈ Rn and α ∈ R , define:

x + y := (x1 + y1 , x2 + y2 , . . . , xn + yn ) ;
x − y := (x1 − y1 , x2 − y2 , . . . , xn − yn ) ;
α x := (α x1 , α x2 , . . . , α xn ) ;
x · y := x1 y1 + x2 y2 + · · · + xn yn .

The scalar x · y is called the Euclidean inner product of x and y .
The vector operations of Definition 1.1, illustrated in Figure 1.1, represent the analogues of the alge-
braic operations in R and imply algebraic rules in Rn .
1 Renatus Cartesius (1596–1650), French mathematician and philosopher.
2 Euclid of Alexandria (350–250 B.C. circa), Greek mathematician.

Figure 1.1: Vector operations.
These rules are collected below: for any x , y , z ∈ Rn and any α , β ∈ R :

(a) α 0 = 0 ;
(b) 0 x = 0 ;
(c) 1 x = x ;
(d) α (β x) = β (α x) = (α β) x ;
(e) α (x · y) = (α x) · y = x · (α y) ;
(f) α (x + y) = α x + α y ;
(g) 0 + x = x ;
(h) x − x = 0 ;
(i) 0 · x = 0 ;
(j) x + (y + z) = (x + y) + z ;
(k) x + y = y + x ;
(l) x · y = y · x ;
(m) x · (y + z) = x · y + x · z .
Definition 1.3. The standard base of Rn is the set En = {e1 , . . . , en }, where:
e1 = (1, 0, . . . , 0) , e2 = (0, 1, 0, . . . , 0) , . . . , en = (0, . . . , 0, 1) .

Note that a generic x = (x1 , . . . , xn ) ∈ Rn can be represented as a linear combination of vectors in En :

x = Σ_{j=1}^{n} xj ej = Σ_{j=1}^{n} (x · ej ) ej .
It is worth noting that, when n = 2 and n = 3 , the standard base En is made of pairwise orthogonal
vectors. In order to extend orthogonality to n dimensions, consider the main property of the standard
base, i.e., ej · ek = 0 for j ≠ k .
Definition 1.4. Let x , y ∈ Rn be non–zero vectors; then:

(i) x , y are parallel if and only if there exists t ∈ R such that x = t y ; this is denoted with x ∥ y ;

(ii) x , y are orthogonal if and only if x · y = 0 ; this is denoted with x ⊥ y .

As an example, a = (3, 5) and b = (−6, −10) are parallel, while c = (1, 1) and d = (1, −1) are
orthogonal.
The Euclidean inner product allows introducing a metric in Rn , as shown in the following Definition 1.5.

Definition 1.5. Let x ∈ Rn . The (Euclidean) norm of x is the scalar:

||x|| := ( Σ_{k=1}^{n} xk^2 )^{1/2} .
Taking into consideration equation (1.2) and the Cauchy–Schwarz inequality, we can formulate the
following Definitions 1.11–1.12.
Definition 1.11. Let x , y ∈ Rn be two non–zero vectors. Their angle ϑ(x , y) is defined by:

ϑ(x , y) = arccos( (x · y)/( ||x|| ||y|| ) ) .

Observe that, when x , y are orthogonal, then ϑ(x , y) = π/2 .
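As a quick numerical illustration of Definitions 1.5 and 1.11 (a minimal sketch, assuming a Python environment with NumPy; the helper names norm and angle are ours, not the book's), we compute the norm and the angle of the orthogonal pair c = (1, 1) , d = (1, −1) from the example above:

```python
# Numerical illustration of the Euclidean norm and of the angle formula
# theta(x, y) = arccos( x.y / (||x|| ||y||) ).
import numpy as np

def norm(x):
    return np.sqrt(np.dot(x, x))          # ||x|| = (sum_k x_k^2)^(1/2)

def angle(x, y):
    return np.arccos(np.dot(x, y) / (norm(x) * norm(y)))

c = np.array([1.0, 1.0])
d = np.array([1.0, -1.0])
print(norm(c))       # sqrt(2) ~ 1.41421
print(angle(c, d))   # pi/2 ~ 1.57080, since c . d = 0
```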
Definition 1.12. The hyperplane (a plane when n = 3) passing through a point a ∈ Rn , with normal
b ≠ 0 , is the set:

Πb (a) = {x ∈ Rn : (x − a) · b = 0} .

Note that, by definition, Πb (a) is the set of all points x such that x − a and b are orthogonal; observe
that, given a , b , the vector x − a is not unique, as Figure 1.3 illustrates. Expanding the inner product,
Πb (a) has Cartesian equation:

b1 x1 + b2 x2 + · · · + bn xn = d ,

where b = (b1 , . . . , bn ) is a normal and d = b · a is a constant related to the distance from Πb (a) to
the origin. Planes in R3 have equations of the form:

ax + by + cz = d .
1.2 Topology of Rn
Topology, that is the description of the relations among subsets of Rn , is based on the concept of
open and closed sets, that generalises the notion of open and closed intervals. After introducing these
concepts, we state their most basic properties. The first step is the natural generalisation of intervals
in Rn .
Definition 1.13. Let a ∈ Rn .

(i) ∀ r > 0 , the open ball, centered at a , of radius r , is the set of points:

Br (a) := {x ∈ Rn : ||x − a|| < r} ;

(ii) ∀ r > 0 , the closed ball, centered at a , of radius r , is the set of points:

{x ∈ Rn : ||x − a|| ≤ r} .
Note that, when n = 1 , the open ball centered at a of radius r is the open interval (a − r , a + r) ,
and the corresponding closed ball is the closed interval [a − r , a + r] . Here we adopt the convention of
representing open balls as dashed circumferences, while closed balls are drawn as solid circumferences,
as shown in Figure 1.4.
Figure 1.4: Open ball, n = 2 .

To generalise the concept of open and closed intervals even further, observe that each element of an
open interval I lies inside I , i.e., it is surrounded by other points in I . Although closed intervals do
not satisfy this property, their complements do. Accordingly, we give the following Definition 1.14.
Definition 1.14. The open and closed sets are defined as follows:
(i) a set V ⊂ Rn is open if and only if, for every a ∈ V , there exists ε > 0 such that Bε (a) ⊆ V ;
(ii) a set E ⊂ Rn is closed if and only if its complement E c := Rn \ E is open.
It follows that every open ball is an open set. Note that, if a ∈ Rn , then Rn \ {a} is open and {a} is
closed.
Remark 1.15. For each n ∈ N , the empty set ∅ and the whole space Rn are both open and closed.
We state, without proof, the following Theorem 1.16, which explains the basic properties of open and
closed sets. Notions on sets, set operators and Topology are presented in greater detail in Chapter 7,
while in this first chapter only strictly necessary concepts are introduced.
Theorem 1.16. Let {Vα }α∈A and {Eα }α∈A be any collections of respectively open and closed subsets
of Rn , where A is any set of indexes. Let further {Vk : k = 1 , . . . , p} and {Ek : k = 1 , . . . , p} be
finite collections of respectively open and closed subsets of Rn . Then:
(i) ∪_{α∈A} Vα is open;

(ii) ∩_{k=1}^{p} Vk is open;

(iii) ∩_{α∈A} Eα is closed;

(iv) ∪_{k=1}^{p} Ek is closed.
Note that every set E contains the open set ∅ and is contained in the closed set Rn ; hence, the interior
E° and the closure Ē are well–defined. Notice further that E° is always open and Ē is always closed:
E° is the largest open set contained in E , and Ē is the smallest closed set containing E . The following
Theorem 1.19 illustrates the properties of E° and Ē .

Theorem 1.19. Let E ⊆ Rn , then:

(i) E° ⊆ E ⊆ Ē ;

(ii) if V is open and V ⊆ E , then V ⊆ E° ;

(iii) if C is closed and C ⊇ E , then C ⊇ Ē .
Let us, now, introduce the notion of boundary of a set.
Definition 1.20. The boundary of E is the set:
∂E := {x ∈ Rn : for all r > 0 , Br (x) ∩ E ≠ ∅ and Br (x) ∩ E^c ≠ ∅} .
Given a set E , its boundary ∂E is closely related to E o and E .
Theorem 1.21. If E ⊆ Rn then ∂E = Ē \ E° .
Definition 1.25. Let ∅ ≠ E ⊆ Rn and let f : E → Rm .

(i) f is said to be continuous at a ∈ E if and only if for every ε > 0 there exists a positive δ (that
in general depends on ε , f , a) such that ||x − a|| < δ and x ∈ E imply ||f (x) − f (a)|| < ε .
We now state the two important Theorems 1.27 and 1.28, that establish the topological properties of
continuity.
Theorem 1.27. Let n , m ∈ N and f : Rn → Rm . Then the following conditions are equivalent:
(i) f is continuous on Rn ;

(ii) f^{−1}(V) is open in Rn whenever V is open in Rm .
Definition 1.29. A subset B ⊂ Rn is bounded if there exists M > 0 such that ||x|| ≤ M for any
x∈B.
The following Theorem 1.30, due to Weierstrass4 , states the fundamental property that, if a set is
both closed and bounded (we call it compact), then its image under any continuous function is also
compact.
Theorem 1.30 (Weierstrass Theorem on compactness). Let n , m ∈ N . If H is compact in Rn and
f : H → Rm is continuous on H , then f (H) is compact in Rm .
In the particular situation of a scalar function, we can state the generalization of Theorem 1.30 to
functions depending on several variables.
Theorem 1.31 (Generalization of Weierstrass Theorem). Assume that H is a non–empty subset of
Rn and f : H → R . If H is compact and f is continuous on H , then:

M := sup_{x∈H} f (x) and m := inf_{x∈H} f (x)

are finite real numbers. Moreover, there exist points xM , xm ∈ H such that M = f (xM ) and m = f (xm ) .

4 Karl Theodor Wilhelm Weierstrass (1815–1897), German mathematician.
2 Sequences and series of functions
(ii) diverges, when the limit of the partial sums Σ_{k=1}^{n} uk does not exist.
Remark 2.5. Definition 2.4 can be reformulated as follows: it holds that fn → f pointwise on I if,
for any ε > 0 and for any x ∈ I , there exists n(ε, x) ∈ N , depending on ε and x , such that:

|fn (x) − f (x)| < ε , for any n > n(ε, x) .

Example 2.3 shows that the pointwise limit of a sequence of continuous functions may not be continuous.
We do not yet have the tools to evaluate the integral in the left hand–side of the equality below (but
we will soon); however, it is clear that it is a positive real number, so we have:

lim_{n→∞} ∫_0^∞ fn (x) dx = ∫_0^∞ e^{−y^2} dy = α > 0 ≠ ∫_0^∞ lim_{n→∞} fn (x) dx = 0 .
To establish a ‘good’ notion of convergence, that allows the passage to the limit, when we take the
integral of the considered sequence, and that preserves continuity, we introduce the fundamental notion
of uniform convergence.
Definition 2.6. If (fn ) is a sequence of functions defined on the interval I , then fn converges uni-
formly to the function f if, for any ε > 0 , there exists nε ∈ N such that, for n ∈ N , n > nε , it
holds:
sup_{x∈I} |fn (x) − f (x)| < ε .   (2.1)
Remark 2.7. Definition 2.6 is equivalent to requesting that, for any ε > 0 , there exists nε ∈ N such
that, for n ∈ N , n > nε , it holds:

|fn (x) − f (x)| < ε , for any x ∈ I .   (2.2)

Proof. Let fn ⇒ f on I . Then, for any ε > 0 , there exists nε ∈ N such that:

sup_{x∈I} |fn (x) − f (x)| < ε , for any n ∈ N , n > nε ,

and this implies (2.2). Vice versa, if (2.2) holds then, for any ε > 0 , there exists nε ∈ N such that:

sup_{x∈I} |fn (x) − f (x)| ≤ ε , for any n ∈ N , n > nε ,

that is to say, fn ⇒ f on I .
Remark 2.8. Uniform convergence implies pointwise convergence. The converse does not hold, as
Example 2.3 shows.
In the next Theorem 2.9, we state the so–called Cauchy uniform convergence criterion.
Theorem 2.9. Given a sequence of functions (fn ) in [a , b] , the following statements are equivalent:
(i) (fn ) converges uniformly;
(ii) for any ε > 0 , there exists nε ∈ N such that, for n , m ∈ N , with n , m > nε , it holds:
|fn (x) − fm (x)| < ε , for any x ∈ [a , b] .
Proof. We show that (i) ⟹ (ii). Assume that (fn ) converges uniformly, i.e., for a fixed ε > 0 ,
there exists nε > 0 such that, for any n ∈ N , n > nε , inequality |fn (x) − f (x)| < ε/2 holds for any
x ∈ [a , b] . Using the triangle inequality, we have:

|fn (x) − fm (x)| ≤ |fn (x) − f (x)| + |f (x) − fm (x)| < ε/2 + ε/2 = ε

for n , m > nε .
To show that (ii) =⇒ (i), let us first observe that, for a fixed x ∈ [a , b] , the numerical sequence
(fn (x)) is indeed a Cauchy sequence, thus, it converges to a real number f (x) . We prove that such
a convergence is uniform. Let us fix ε > 0 and choose nε ∈ N such that, for n , m ∈ N , n , m > nε , it
holds:
|fn (x) − fm (x)| < ε
for any x ∈ [a , b] . Now, taking the limit for m → +∞ , we get:

|fn (x) − f (x)| ≤ ε

for any x ∈ [a , b] . This completes the proof.
Example 2.10. The sequence of functions fn (x) = x (1 + n x)^{−1} converges uniformly to f (x) = 0
in the interval [0 , 1] . Since fn (x) ≥ 0 for n ∈ N and for x ∈ [0 , 1] , we have:

sup_{x∈[0,1]} x/(1 + n x) = 1/(1 + n) → 0 as n → ∞ .

Example 2.11. The sequence of functions fn (x) = (1 + n x)^{−1} does not converge uniformly to
f (x) = 0 in the interval [0 , 1] . Although fn (x) → 0 pointwise for every x ∈ ]0 , 1] , we have in fact:

sup_{x∈[0,1]} 1/(1 + n x) = 1 .
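The different behaviour of Examples 2.10 and 2.11 can also be observed numerically; the sketch below (plain Python; the grid resolution is an arbitrary choice of ours) estimates the supremum of |fn − f| on [0 , 1] :

```python
# Grid estimate of sup |fn(x) - 0| on [0, 1] for the two examples.
grid = [k / 100_000 for k in range(100_001)]
for n in (10, 100, 1000):
    sup_ex10 = max(x / (1 + n * x) for x in grid)   # Example 2.10
    sup_ex11 = max(1 / (1 + n * x) for x in grid)   # Example 2.11
    print(n, sup_ex10, sup_ex11)
# sup_ex10 ~ 1/(1+n) -> 0 (uniform convergence); sup_ex11 stays equal to 1,
# attained at x = 0 (no uniform convergence).
```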
Example 2.12. Consider fn (x) = n^α x e^{−n x} , with x ≥ 0 and α ∈ R . For any fixed x > 0 :

ln fn (x) = α ln n + ln x − n x .

It follows that lim_{n→∞} ln fn (x) = −∞ and, then, lim_{n→∞} fn (x) = 0 ; since also fn (0) = 0 ,
pointwise convergence to 0 is proved.
For uniform convergence, we show that, for any n ∈ N , the associated function fn reaches its absolute
maximum in R+ . By differentiating with respect to x , we obtain, in fact:

fn′ (x) = n^α e^{−n x} (1 − n x) ,

so that the maximum is attained at x = 1/n . Now, lim_{n→∞} sup_{x∈R+} fn (x) = lim_{n→∞} e^{−1} n^{α−1} = 0 ,
when α < 1 . Hence, in this case, convergence is indeed uniform.
In the following example, we compare two sequences of functions, apparently very similar, of which
the first is only pointwise convergent, while the second is uniformly convergent.
Example 2.13. Consider the sequences of functions (fn ) and (gn ) , both defined on [0 , 1] :

fn (x) = n^2 x (1 − n x) if 0 ≤ x < 1/n , and fn (x) = 0 if 1/n ≤ x ≤ 1 ;   (2.3)

gn (x) = n x^2 (1 − n x) if 0 ≤ x < 1/n , and gn (x) = 0 if 1/n ≤ x ≤ 1 .   (2.4)
Sequence (fn ) converges pointwise to f (x) = 0 for x ∈ [0 , 1] ; in fact, it is fn (0) = 0 and fn (1) = 0 for
any n ∈ N . When x ∈ (0 , 1) , since n0 ∈ N exists such that 1/n0 < x , it follows that fn (x) = 0 for any
n ≥ n0 .
The convergence of (fn ) is not uniform; to show this, observe that ξn = 1/(2 n) maximises fn , since:

fn′ (x) = n^2 (1 − 2 n x) if 0 ≤ x < 1/n , and fn′ (x) = 0 if 1/n ≤ x ≤ 1 .

It then follows:

sup_{x∈[0,1]} |fn (x) − f (x)| = sup_{x∈[0,1]} fn (x) = fn (ξn ) = n/4 ,
which prevents uniform convergence. With similar considerations, we can prove that (gn ) converges
pointwise to g(x) = 0 , and that the convergence is also uniform, since:

gn′ (x) = n x (2 − 3 n x) if 0 ≤ x < 1/n , and gn′ (x) = 0 if 1/n ≤ x ≤ 1 ,

implying that ηn = 2/(3 n) maximises gn and that:

sup_{x∈[0,1]} |gn (x) − g(x)| = sup_{x∈[0,1]} gn (x) = gn (ηn ) = 4/(27 n) → 0 as n → ∞ .
Using the continuity of fn , we can see that there exists δ > 0 such that:

|fn (x) − fn (x0 )| < ε/3   (2.6)

for any x ∈ [a , b] with |x − x0 | < δ .
To end the proof, we have to show that, given x0 ∈ [a , b] , if x ∈ [a , b] is such that |x − x0 | < δ , then
|f (x) − f (x0 )| < ε . By the triangle inequality:

|f (x) − f (x0 )| ≤ |f (x) − fn (x)| + |fn (x) − fn (x0 )| + |fn (x0 ) − f (x0 )| .

Observe that:

|f (x) − fn (x)| < ε/3 , |fn (x0 ) − f (x0 )| < ε/3 , |fn (x) − fn (x0 )| < ε/3 ,

the first two inequalities being due to (2.5), while the third one is due to (2.6). Hence:

|f (x) − f (x0 )| < ε/3 + ε/3 + ε/3 = ε .
When we are in presence of uniform convergence, for a sequence of continuous functions defined on
the bounded and closed interval [a , b] , the following passage to the limit holds:

lim_{n→∞} ∫_a^b fn (x) dx = ∫_a^b lim_{n→∞} fn (x) dx .   (2.7)

Proof. From Theorem 2.14, f (x) is continuous, thus, it is Riemann integrable (see § 8.7.1). Now, choose
ε > 0 so that nε ∈ N exists such that, for n ∈ N , n > nε :

|fn (x) − f (x)| < ε/(b − a) for any x ∈ [a , b] .   (2.8)

By integration:

| ∫_a^b fn (x) dx − ∫_a^b f (x) dx | ≤ ∫_a^b |fn (x) − f (x)| dx < ( ε/(b − a) ) (b − a) = ε .
Remark 2.16. The passage to the limit is sometimes possible under less restrictive hypotheses than
those of Theorem 2.15. In the following example, passage to the limit is possible without uniform
convergence. Consider the sequence in [0 , 1] given by fn (x) = n x (1 − x)^n . For such a sequence, it is:

sup_{x∈[0,1]} |fn (x)| = fn ( 1/(n + 1) ) = n/(n + 1) ( 1 − 1/(n + 1) )^n ,

which tends to e^{−1} ≠ 0 , so the convergence is not uniform. On the other hand, it holds that fn → 0
pointwise on [0 , 1] . Moreover, we can use integration by parts as follows:

lim_{n→∞} ∫_0^1 fn (x) dx = lim_{n→∞} ∫_0^1 n x (1 − x)^n dx
= lim_{n→∞} ∫_0^1 n x ( −(1 − x)^{n+1}/(n + 1) )′ dx
= lim_{n→∞} ∫_0^1 n/(n + 1) (1 − x)^{n+1} dx = lim_{n→∞} n/( (n + 1)(n + 2) ) = 0 ,

where the boundary term of the integration by parts vanishes.
Remark 2.17. Consider again the sequences of functions (2.3) and (2.4), defined on [0 , 1] , with
fn → 0 pointwise and gn ⇒ 0 uniformly. Observing that:

∫_0^1 fn (x) dx = ∫_0^{1/n} n^2 x (1 − n x) dx = 1/6

and

∫_0^1 gn (x) dx = ∫_0^{1/n} n x^2 (1 − n x) dx = 1/(12 n^2) ,

it follows:

lim_{n→∞} ∫_0^1 fn (x) dx = 1/6 ≠ ∫_0^1 f (x) dx = 0 ,

while:

lim_{n→∞} ∫_0^1 gn (x) dx = lim_{n→∞} 1/(12 n^2) = 0 = ∫_0^1 g(x) dx .

In other words, the pointwise convergence of (fn ) does not permit the passage to the limit, while the
uniform convergence of (gn ) does.
We provide a second example to illustrate, again, that pointwise convergence, alone, does not allow
the passage to the limit.
Example 2.18. Consider the sequence of functions (fn ) on [0 , 1] defined by:

fn (x) = n^2 x if 0 ≤ x ≤ 1/n ; fn (x) = 2 n − n^2 x if 1/n < x ≤ 2/n ; fn (x) = 0 if 2/n < x ≤ 1 .

Observe that each fn is a continuous function. Plots of fn are shown in Figure 2.3, for some values
of n ; it is clear that, pointwise, fn (x) → 0 for n → ∞ .
Figure 2.3: Plot of functions fn (x) , n = 3 , . . . , 6 , in Example 2.18. Solid lines are used for even values
of n ; dotted lines are employed for odd n .
By construction, though, each triangle in Figure 2.3 has area equal to 1 ; thus, for any n ∈ N :

∫_0^1 fn (x) dx = 1 .

In conclusion:

1 = lim_{n→∞} ∫_0^1 fn (x) dx ≠ ∫_0^1 lim_{n→∞} fn (x) dx = 0 .
In the presence of pointwise convergence alone, therefore, swapping integral and limit is not possible.
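A direct numerical check of Example 2.18 (a plain-Python sketch; the trapezoidal rule and its step count are our arbitrary choices) shows each "tent" integrating to 1 while the pointwise limit is 0:

```python
# Each tent function fn integrates to 1, although fn -> 0 pointwise.
def fn(x, n):
    if x <= 1 / n:
        return n * n * x
    if x <= 2 / n:
        return 2 * n - n * n * x
    return 0.0

def trapezoid(f, a, b, m=100_000):
    h = (b - a) / m
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, m)))

for n in (5, 50, 500):
    print(n, trapezoid(lambda x: fn(x, n), 0.0, 1.0))   # ~ 1.0 for every n
```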
Uniform convergence leads to a third interesting consequence, connected to the behaviour of sequences
of differentiable functions.
Theorem 2.19. Let (fn ) be a sequence of continuous functions, on [a , b] . Assume that each fn is
differentiable, with continuous derivative, and that:
16 CHAPTER 2. SEQUENCES AND SERIES OF FUNCTIONS
and recall that such a limit is uniform. For Theorems 2.14 and 2.15, g(x) is continuous on [a , b] . A
classical result from Calculus states that, for any x ∈ [a, b] :
Z x Z x
g(t)dt = lim fn0 (t)dt = lim (fn (x) − fn (a)) = f (x) − f (a)
a n→∞ a n→∞
The hypotheses of Theorem 2.19 are essential, as shown by the following example. Consider:

fn (x) = 2 x/(1 + n^2 x^2) , n ∈ N .

Observe that fn converges to 0 uniformly in ]−1 , 1[ , since:

sup_{x∈]−1,1[} |fn (x)| = 1/n → 0 as n → ∞ .
Function fn is differentiable for any n ∈ N and, for any x ∈ ]−1 , 1[ and any n ∈ N , the derivative of
fn , with respect to x , is:

fn′ (x) = 2 (1 − n^2 x^2)/(1 + n^2 x^2)^2 .

Now, consider the function g : ]−1 , 1[ → R :

g(x) = 0 if x ≠ 0 , and g(x) = 2 if x = 0 .

Clearly, fn′ → g pointwise on ]−1 , 1[ , but not uniformly; by Theorem 2.14, in fact, uniform
convergence of (fn′ ) would imply g to be continuous, which it is not, in this case. Here, the
hypotheses of Theorem 2.19 are not fulfilled, thus its thesis does not hold.
We end this section with Theorem 2.21, due to Dini1 , and the important Corollary 2.23, a consequence
of the Dini Theorem, very useful in many applications. Theorem and corollary connect monotonicity
and uniform convergence for a sequence of functions; for their proof, we refer the Reader to [16].
Theorem 2.21 (Dini). Let (fn ) be a sequence of continuous functions, converging pointwise to a
continuous function f , defined on the interval [a , b] .
Furthermore, assume that, for any x ∈ [a , b] and for any n ∈ N , it holds fn (x) ≥ fn+1 (x) . Then fn
converges uniformly to f in [a , b] .
Remark 2.22. In Theorem 2.21, hypothesis fn (x) ≥ fn+1 (x) can be replaced with its reverse mono-
tonicity assumption fn (x) ≤ fn+1 (x) , obtaining the same thesis.
1 Ulisse Dini (1845–1918), Italian mathematician and politician.
Corollary 2.23 (Dini). Let (fn ) be a sequence of nonnegative, continuous and integrable functions,
defined on R , and assume that it converges pointwise to f , which is also nonnegative, continuous and
integrable. Suppose further that it is either 0 ≤ fn (x) ≤ fn+1 (x) ≤ f (x) or 0 ≤ f (x) ≤ fn+1 (x) ≤
fn (x) , for any x ∈ R and any n ∈ N . Then:

lim_{n→∞} ∫_{−∞}^{+∞} fn (x) dx = ∫_{−∞}^{+∞} f (x) dx .
Example 2.24. Let us consider an application of Theorem 2.21 and Corollary 2.23. Define fn (x) =
x^n sin(π x) , x ∈ [0 , 1] . It is immediate to see that, for any x ∈ [0 , 1] :

lim_{n→∞} x^n sin(π x) = 0 .

Moreover, since it is 0 ≤ f (x) ≤ fn+1 (x) ≤ fn (x) for any x ∈ [0 , 1] , the convergence is uniform and,
then:

lim_{n→∞} ∫_0^1 x^n sin(π x) dx = 0 .
Remark 2.26. Defining rn (x) := f (x) − fn (x) , then (2.9) converges uniformly in [a , b] if, for any
ε > 0 , there exists nε such that sup_{x∈[a,b]} |rn (x)| < ε for any n > nε .
The following Theorem 2.27, due to Weierstrass, establishes a sufficient condition, known as the
Weierstrass M–test, ensuring the uniform convergence of a series of functions.

Theorem 2.27 (Weierstrass M–test). Let (fn ) be a sequence of functions defined on [a , b] . Assume
that, for any n ∈ N , there exists Mn ≥ 0 such that |fn (x)| ≤ Mn for any x ∈ [a , b] . Moreover,
assume convergence for the numerical series:

Σ_{n=1}^{∞} Mn .

Then the series (2.9) converges uniformly on [a , b] .
Proof. By the Cauchy criterion of convergence (Theorem 2.9), the series of functions (2.9) converges
uniformly if and only if, for any ε > 0 , there exists nε ∈ N such that:

sup_{x∈[a,b]} | Σ_{k=n+1}^{m} fk (x) | < ε , for any m > n > nε .

In our case, once ε > 0 is fixed, since the numerical series Σ_{n=1}^{∞} Mn converges, there exists
nε ∈ N such that:

Σ_{k=n+1}^{m} Mk < ε , for any m > n > nε .

Now, we use the triangle inequality:

sup_{x∈[a,b]} | Σ_{k=n+1}^{m} fk (x) | ≤ sup_{x∈[a,b]} Σ_{k=n+1}^{m} |fk (x)| ≤ Σ_{k=n+1}^{m} Mk < ε .
Theorem 2.15 is useful for swapping between sum and integral of a series, and Theorem 2.19 for
term–by–term differentiability. We now state three further helpful theorems.
Theorem 2.28. If (fn ) is a sequence of continuous functions on [a , b] and if their series (2.9) converges
uniformly on [a , b] , then:

Σ_{n=1}^{∞} ∫_a^b fn (x) dx = ∫_a^b ( Σ_{n=1}^{∞} fn (x) ) dx .   (2.11)

Proof. Define:

f (x) := Σ_{n=1}^{∞} fn (x) = lim_{n→∞} Σ_{m=1}^{n} fm (x) .   (2.11a)

By Theorem 2.14, function f (x) is continuous and:

∫_a^b f (x) dx = lim_{n→∞} Σ_{m=1}^{n} ∫_a^b fm (x) dx .   (2.11b)
We state (without proof) a more general result, that is not based on uniform convergence, but only
on pointwise convergence and a few other assumptions.

Theorem 2.29. Let (fn ) be a sequence of functions on an interval I = [a , b] ⊂ R . Assume that each
fn is both piecewise continuous and integrable on I , and that (2.9) converges pointwise, on I , to a
piecewise continuous function f . Moreover, assume convergence for the numerical (positive terms)
series:

Σ_{n=1}^{∞} ∫_a^b |fn (x)| dx .

Then f is integrable on I and the series can be integrated term by term.
Example 2.30. Consider the series of functions:

Σ_{n=1}^{∞} sin(n x)/n^2 ,   (2.12)

which converges uniformly on R by Theorem 2.27, taking Mn = 1/n^2 : our statement follows from
the convergence of the infinite series Σ_{n=1}^{∞} 1/n^2 , shown in formula (2.71) of § 2.7, later on.
Moreover, if f (x) denotes the sum of the series (2.12), then, due to the uniform convergence:

∫_0^π f (x) dx = ∫_0^π Σ_{n=1}^{∞} sin(n x)/n^2 dx = Σ_{n=1}^{∞} 1/n^2 ∫_0^π sin(n x) dx = Σ_{n=1}^{∞} (1 − cos(n π))/n^3 .
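The term-by-term integration above is easy to check numerically; in the sketch below (the truncation order N and the quadrature step are our arbitrary choices) the two computations agree:

```python
import math

N = 500                                  # truncation order (arbitrary)
# Right-hand side: sum of (1 - cos(n pi))/n^3, i.e. 2/n^3 over odd n.
rhs = sum((1 - math.cos(n * math.pi)) / n**3 for n in range(1, N + 1))

# Left-hand side: trapezoidal quadrature of the truncated sum on [0, pi].
def f_N(x):
    return sum(math.sin(n * x) / n**2 for n in range(1, N + 1))

m = 1_000
h = math.pi / m
lhs = h * (0.5 * (f_N(0.0) + f_N(math.pi)) + sum(f_N(k * h) for k in range(1, m)))
print(lhs, rhs)   # both ~ 2.10359...
```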
Theorem 2.31. Assume that (2.9), defined on [a , b] , converges uniformly, and assume that each fn
has continuous derivative fn′ (x) for any x ∈ [a , b] ; assume further that the series of the derivatives is
uniformly convergent. If f (x) denotes the sum of the series (2.9), then f (x) is differentiable and, for
any x ∈ [a , b] :

f ′(x) = Σ_{n=1}^{∞} fn′ (x) .   (2.13)
The derivatives at the extreme points a and b are obviously understood as right and left derivatives,
respectively.
Proof. We present here the proof given in [33]. Let us denote by g(x) , with x ∈ [a , b] , the sum of the
series of the derivatives fn′ (x) :

g(x) = Σ_{n=1}^{∞} fn′ (x) .

By Theorem 2.14, function g(x) is continuous and, by Theorem 2.15, we can integrate term by term
in [a , x] :

∫_a^x g(ξ) dξ = Σ_{n=1}^{∞} ∫_a^x fn′ (ξ) dξ = Σ_{n=1}^{∞} ( fn (x) − fn (a) ) = Σ_{n=1}^{∞} fn (x) − Σ_{n=1}^{∞} fn (a) ,   (2.14)

where linearity of the sum is used in the last step of the chain of equalities. Now, recalling definition
(2.11a) of f (x) , formula (2.14) can be rewritten as:

∫_a^x g(ξ) dξ = f (x) − f (a) .   (2.14a)

Differentiating both sides of (2.14a), by the Fundamental Theorem of Calculus2 we obtain g(x) =
f ′(x) , which means that:

f ′(x) = g(x) = Σ_{n=1}^{∞} fn′ (x) .

Hence, the proof is completed.
Uniform convergence of series of functions satisfies linearity properties expressed by the following
Theorem 2.32, whose proof is left as an exercise. Moreover, if h(x) is a continuous function, defined
on [a , b] , then the following series is uniformly convergent:

Σ_{n=1}^{∞} h(x) fn (x) .
A power series centered at x0 ∈ R is a series of functions of the form:

Σ_{n=0}^{∞} an (x − x0 )^n .   (2.15)

Given x0 ∈ R , it is important to find all the values x ∈ R such that the series of functions (2.15)
converges.

Remark 2.34. It is not restrictive, by using a translation, to consider the following simplified–form
power series, obtained from (2.15) with x0 = 0 :

Σ_{n=0}^{∞} an x^n .   (2.16)
Obviously, the choice of x in (2.16) determines the convergence of the series. The following Lemma
2.35 is of some importance.

Lemma 2.35. If (2.16) converges for x = r0 then, for any 0 ≤ r < |r0 | , it is absolutely and uniformly
convergent in [−r , r] .

2 See, for example, mathworld.wolfram.com/FundamentalTheoremsofCalculus.html
Proof. It is assumed the convergence of the numerical series Σ_{n=0}^{∞} an r0^n ; in particular, its
terms are bounded, that is to say, there exists a positive constant K such that |an r0^n | ≤ K . Since
|r/r0 | < 1 , the geometric series Σ_{n=0}^{∞} |r/r0 |^n converges. Now, for any n ≥ 0 and any
x ∈ [−r , r] :

|an x^n | = |an r0^n | |x/r0 |^n ≤ K |x/r0 |^n ≤ K |r/r0 |^n .   (2.17)

By Theorem 2.27, inequality (2.17) implies that (2.16) is uniformly convergent; since (2.17) bounds
the absolute values of the terms, the convergence is also absolute.
From Lemma 2.35 follows the fundamental Theorem 2.36, due to Cauchy and Hadamard3 , which
explains the behaviour of a power series:

Theorem 2.36 (Cauchy–Hadamard). Given the power series (2.16), then exactly one of the following
alternatives holds:

(i) series (2.16) converges only for x = 0 ;

(ii) series (2.16) converges for any x ∈ R ;

(iii) there exists a positive number r such that series (2.16) converges for any x ∈ ]−r , r[ and
diverges for any x ∈ ]−∞ , −r[ ∪ ]r , +∞[ .
Proof. Let C be the set of x ≥ 0 for which (2.16) converges and suppose C is bounded, with
r := sup C ; take |y| < r , so that there exists z ∈ C with |y| < z . As a consequence, by Lemma 2.35,
series (2.16) converges for any x ∈ ]−z , z[ and, in particular, the series Σ_{n=0}^{∞} an y^n converges.
To end the proof, take |y| > r and assume, by contradiction, that the series Σ_{n=0}^{∞} an y^n is
still convergent. If so, using Lemma 2.35, it would follow that series (2.16) converges for any
x ∈ ]−|y| , |y|[ and, in particular, it would converge for the number:

(|y| + r)/2 > r ,

which contradicts the assumption r = sup C .

3 Jacques Salomon Hadamard (1865–1963), French mathematician.
Definition 2.37. The interval within which (2.16) converges is called interval of convergence and r
is called radius of convergence.

Theorem 2.38 (Radius of convergence). Consider the power series (2.16) and assume that the
following limit exists:

ℓ = lim_{n→∞} |an+1 /an | .

Then: r = ∞ if ℓ = 0 ; r = 1/ℓ if 0 < ℓ < ∞ ; r = 0 if ℓ = ∞ .
Proof. Fix x ≠ 0 , consider the series of the absolute values of the terms of (2.16), and apply the ratio
test, that is to say, study the limit of the ratio between the (n + 1)–th term and the n–th term of the
series:

lim_{n→∞} ( |an+1 | |x|^{n+1} )/( |an | |x|^n ) = |x| lim_{n→∞} |an+1 /an | = |x| ℓ .

If ℓ = 0 , then, for any x ∈ R :

lim_{n→∞} ( |an+1 | |x|^{n+1} )/( |an | |x|^n ) = 0 < 1 ,

so (2.16) converges for any x ∈ R . If ℓ > 0 , then:

lim_{n→∞} ( |an+1 | |x|^{n+1} )/( |an | |x|^n ) = |x| ℓ < 1 ⟺ |x| < 1/ℓ .

Eventually, if ℓ = ∞ , series (2.16) does not converge when x ≠ 0 , since it is:

lim_{n→∞} ( |an+1 | |x|^{n+1} )/( |an | |x|^n ) > 1 ,

while, for x = 0 , series (2.16) reduces to the zero series, which converges trivially.
Example 2.39. The power series (2.18), known as the geometric series, has radius of convergence
r = 1 :

Σ_{n=0}^{∞} x^n .   (2.18)

In fact, here an = 1 for every n , so ℓ = 1 , which means that series (2.18) converges for −1 < x < 1 .
At the boundary of the interval of convergence, namely x = 1 and x = −1 , the geometric series (2.18)
does not converge. In conclusion, the interval of convergence of (2.18) is the open interval ]−1 , 1[ .
Example 2.40. Series (2.19), given below, has radius of convergence r = 1 :

Σ_{n=1}^{∞} x^n /n .   (2.19)

Proof. Here, an = 1/n , thus:

lim_{n→∞} |an+1 /an | = lim_{n→∞} n/(n + 1) = 1 ,

that is, (2.19) converges for −1 < x < 1 .
At the boundary of the interval of convergence, (2.19) behaves as follows: when x = 1 , it reduces to
the divergent harmonic series:

Σ_{n=1}^{∞} 1/n ,

while, when x = −1 , it converges, by the Leibnitz criterion for alternating series.
Example 2.41. Series (2.20), given below, has infinite radius of convergence:

Σ_{n=0}^{∞} x^n /n! .   (2.20)

Proof. Since it is an = 1/n! for any n ∈ N , it follows that:

lim_{n→∞} |an+1 /an | = lim_{n→∞} 1/(n + 1) = 0 .
It is possible to differentiate and integrate power series term by term, as stated in the following
Theorem 2.42, which we include for completeness, as it represents a particular case of Theorems 2.28
and 2.31.

Theorem 2.42. Let f (x) be the sum of the power series (2.16), with radius of convergence r . The
following results hold:

(i) f is differentiable in ]−r , r[ , with f ′(x) = Σ_{n=1}^{∞} n an x^{n−1} ;

(ii) f admits, in ]−r , r[ , the antiderivative F (x) = Σ_{n=0}^{∞} an x^{n+1}/(n + 1) .

The radius of convergence of both power series f ′(x) and F (x) is that of f (x) .
Power series behave nicely with respect to the usual arithmetical operations, as shown in Theorem
2.43, which states some useful results.

Theorem 2.43. Consider two power series, with radii of convergence r1 and r2 respectively:

f1 (x) = Σ_{n=0}^{∞} an x^n , f2 (x) = Σ_{n=0}^{∞} bn x^n .   (2.21)

Then, for any α , β ∈ R , the series Σ_{n=0}^{∞} (α an + β bn ) x^n converges at least for |x| < min{r1 , r2 } ,
and its sum is α f1 (x) + β f2 (x) .
We state, without proof, Theorem 2.44, concerning the product of two power series.

Theorem 2.44. Consider the two power series in (2.21), with radii of convergence r1 and r2
respectively. The product of the two power series is defined by the Cauchy formula:

Σ_{n=0}^{∞} cn x^n , where cn = Σ_{j=0}^{n} aj bn−j ,   (2.22)

that is:

c0 = a0 b0 ,
c1 = a0 b1 + a1 b0 ,
. . .
cn = a0 bn + a1 bn−1 + · · · + an−1 b1 + an b0 .

Series (2.22) has interval of convergence given by |x| < r = min{r1 , r2 } , and its sum is the pointwise
product f1 (x) f2 (x) .
Since f has derivatives of any order, we may form the limit of (2.23) as n → ∞ ; a condition is stated
in Theorem 2.45 to detect when the passage to the limit is effective.

Theorem 2.45. If f has derivatives of any order in the open interval I , with x0 , x ∈ I , and if:

lim_{n→∞} Rn (f (x), x0 ) = lim_{n→∞} f^{(n+1)}(ξ)/(n + 1)! (x − x0 )^{n+1} = 0 ,

then:

f (x) = Σ_{n=0}^{∞} f^{(n)}(x0 )/n! (x − x0 )^n .   (2.24)

4 Brook Taylor (1685–1731), English mathematician.
5 Giuseppe Luigi Lagrange (1736–1813), Italian mathematician.
Definition 2.46. A function f (x) , defined on an open interval I , is analytic at x0 ∈ I if its Taylor
series about x0 converges to f (x) in some neighborhood of x0 .

Remark 2.47. Assuming the existence of the derivatives of any order is not enough to infer that a
function is analytic and, thus, that it can be represented by a convergent power series. For instance,
the function:

f (x) = e^{−1/x^2} if x ≠ 0 , and f (x) = 0 if x = 0

has derivatives of any order at x0 = 0 , but such derivatives are all zero, therefore the Taylor series
reduces to the zero function. This happens because the Lagrange remainder does not vanish as n → ∞ .
Note that most of the functions of interest to us do not possess the behaviour shown in Remark 2.47.
The series expansion of the most important, commonly used functions can be inferred from Equation
(2.23), i.e., from the Taylor–Lagrange Theorem; Theorem 2.45 yields a sufficient condition ensuring
that a given function is analytic.
Corollary 2.48. Consider f with derivatives of any order in the interval I = ]a , b[ . Assume that
there exist L , M > 0 such that, for any n ∈ N ∪ {0} and for any x ∈ I :

|f^{(n)}(x)| ≤ M L^n .   (2.25)

Then, for any x0 ∈ I , function f (x) coincides with its Taylor series in I .

Proof. Assume x > x0 . The Lagrange remainder for f (x) is given by:

Rn (f (x) , x0 ) = f^{(n+1)}(ξ)/(n + 1)! (x − x0 )^{n+1} ,

where ξ ∈ (x0 , x) , which can be written as ξ = x0 + α (x − x0 ) , with 0 < α < 1 . Now, using condition
(2.25), it follows:

|Rn (f (x) , x0 )| ≤ M ( L (b − a) )^{n+1}/(n + 1)! .

The thesis follows from the limit:

lim_{n→∞} ( L (b − a) )^{n+1}/(n + 1)! = 0 .
Corollary 2.48, together with Theorem 2.42, allows us to find the power series expansion for the most
common elementary functions. Theorem 2.49 concerns a first group of power series that converges for
any x ∈ R .

Theorem 2.49. For any x ∈ R , the following power series expansions hold:

e^x = Σ_{n=0}^{∞} x^n /n! ;   (2.26)

sin x = Σ_{n=0}^{∞} (−1)^n x^{2n+1}/(2n + 1)! ,   (2.27)
cos x = Σ_{n=0}^{∞} (−1)^n x^{2n}/(2n)! ,   (2.28)

sinh x = Σ_{n=0}^{∞} x^{2n+1}/(2n + 1)! ,   (2.29)
cosh x = Σ_{n=0}^{∞} x^{2n}/(2n)! .   (2.30)
Proof. First, observe that the general term in each of the five series (2.26)–(2.30) comes from the
MacLaurin6 formula.
To show that f (x) = e^x is the sum of the series (2.26), let us use the fact that f^{(n)}(x) = e^x for any
n ∈ N ; in this way, it is possible to infer that, in any interval [a , b] , inequality (2.25) is fulfilled
if we take L = 1 and M = max{e^x : x ∈ [a , b]} .
To prove (2.27) and (2.28), in which derivatives of the trigonometric functions sin x and cos x are
considered, condition (2.25) is immediately verified by taking M = 1 and L = 1 .
Finally, (2.29) and (2.30) are a straightforward consequence of the definition of the hyperbolic functions
in terms of the exponential:

cosh x = ( e^x + e^{−x} )/2 , sinh x = ( e^x − e^{−x} )/2 ,

together with Theorem 2.43.
Theorem 2.50 concerns a second group of power series converging for |x| < 1 .

Theorem 2.50. If |x| < 1 , the following power series expansions hold:

1/(1 − x) = Σ_{n=0}^{∞} x^n ,   (2.31)
1/(1 − x)^2 = Σ_{n=0}^{∞} (n + 1) x^n ,   (2.32)

ln(1 − x) = −Σ_{n=0}^{∞} x^{n+1}/(n + 1) ,   (2.33)
ln(1 + x) = Σ_{n=0}^{∞} (−1)^n x^{n+1}/(n + 1) ,   (2.34)

arctan x = Σ_{n=0}^{∞} (−1)^n x^{2n+1}/(2n + 1) ,   (2.35)
ln( (1 + x)/(1 − x) ) = 2 Σ_{n=0}^{∞} x^{2n+1}/(2n + 1) .   (2.36)
Proof. To prove (2.31), define f (x) = 1/(1 − x) and build the MacLaurin polynomial of order n ,
which is Pn (f (x) , 0) = 1 + x + x^2 + . . . + x^n ; the remainder can be thus estimated directly:

Rn (f (x) , 0) = f (x) − Pn (f (x) , 0)
= 1/(1 − x) − (1 + x + x^2 + · · · + x^n )   (2.37)
= 1/(1 − x) − (1 − x^{n+1})/(1 − x) = x^{n+1}/(1 − x) .

Assuming |x| < 1 , we see that the remainder vanishes for n → ∞ , thus (2.31) follows.
Identity (2.32) can be proven by employing both formula (2.31) and Theorem 2.44, with an = bn = 1 .
To obtain (2.33), the geometric series in (2.31) can be integrated term by term, using Theorem 2.42;
in fact, letting |x| < 1 , we can consider the integral:

∫_0^x dt/(1 − t) = −ln(1 − x) .

Now, from Theorem 2.42 it follows:

∫_0^x Σ_{n=0}^{∞} t^n dt = Σ_{n=0}^{∞} x^{n+1}/(n + 1) .

Formula (2.33) is then a consequence of formula (2.31). Formula (2.34) can be proven analogously to
(2.33), by considering −x instead of x .

6 Colin Maclaurin (1698–1746), Scottish mathematician.
To obtain (2.36), observe that 1/(1 − t^2) = Σ_{n=0}^{∞} t^{2n} ; integrating, taking |x| < 1 , and
using Theorem 2.42, the following result is obtained:

∫_0^x dt/(1 − t^2) = 1/2 ∫_0^x ( 1/(1 + t) + 1/(1 − t) ) dt = 1/2 ln( (1 + x)/(1 − x) ) = Σ_{n=0}^{∞} x^{2n+1}/(2n + 1) .
By using Proposition 2.51, it is possible to prove the so–called generalised Binomial Theorem 2.52.

Theorem 2.52 (Generalised Binomial). For any α ∈ R and |x| < 1 , the following identity holds:

(1 + x)^α = Σ_{n=0}^{∞} \binom{α}{n} x^n .   (2.43)
Proof. Let f (x) denote the sum of the generalised binomial series:

f (x) = Σ_{n=0}^{∞} \binom{α}{n} x^n ,

which converges for |x| < 1 , and define:

g(x) = f (x)/(1 + x)^α .

To prove the thesis, let us show that g(x) = 1 for any |x| < 1 . Differentiating g(x) we obtain:

g′(x) = ( (1 + x) f ′(x) − α f (x) )/(1 + x)^{α+1} ,

and a direct computation on the series shows that (1 + x) f ′(x) = α f (x) . Thus, g′(x) = 0 for any
|x| < 1 , which implies that g(x) is a constant function. It follows that g(x) = g(0) = f (0) = 1 , which
proves thesis (2.43).
When considering the power series expansion of arcsin, the particular value α = −1/2 turns out to be
important. Let us, then, study the generalised binomial coefficient (2.41) corresponding to such an α :

\binom{−1/2}{n} = ( (−1/2)(−1/2 − 1)(−1/2 − 2) · · · (−1/2 − n + 1) )/n!
= (−1)^n ( (1/2)(1/2 + 1)(1/2 + 2) · · · (1/2 + n − 1) )/n!
= (−1)^n ( (1/2) · (3/2) · (5/2) · · · ((2 n − 1)/2) )/n! .
Recalling that the double factorial n!! is the product of all integers from 1 to n of the same parity
(odd or even) as n , we obtain:

(1/2) · (3/2) · (5/2) · · · ((2 n − 1)/2) = (2 n − 1)!!/2^n .

Therefore:

\binom{−1/2}{n} = (−1)^n (2 n − 1)!!/(2^n n!) .

Recalling further that (2n)!! = 2^n n! , thesis (2.45) follows.
Corollary 2.54. For any |x| < 1 , using the convention (−1)!! = 1 , it holds:

1/√(1 − x) = Σ_{n=0}^{∞} (2 n − 1)!!/(2 n)!! x^n ,   (2.46)

1/√(1 + x) = Σ_{n=0}^{∞} (−1)^n (2 n − 1)!!/(2 n)!! x^n .   (2.47)

Replacing x by t^2 in (2.46) and integrating term by term, via Theorem 2.42, yields the MacLaurin
series for arcsin x , while (2.47), treated in the same way, gives the series for arcsinh x , as expressed
in the following Theorem 2.55.
Using the power series (2.46) and the Central Binomial Coefficient formula:

\binom{2n}{n} = 2^n (2n − 1)!!/n! ,   (2.50)

it is possible to prove the following Theorem 2.56.

Theorem 2.56 (Lehmer). If |x| < 1/4 , then:

1/√(1 − 4x) = Σ_{n=0}^{∞} \binom{2n}{n} x^n .   (2.51)
The main issue with the integral (2.52) is that it is not possible to express it by means of the known
elementary functions [39]. On the other hand, some probabilistic applications require knowing, at least
numerically, the values of the function introduced in (2.52). A way to achieve this goal is integrating
by series. Using the power series for the exponential function, it is possible to write:

e^{−t^2} = Σ_{n=0}^{∞} (−1)^n t^{2n}/n! .
Since the power series converges uniformly on [0 , x] , we can invoke Theorem 2.15 and transform the
integral (2.52) into a series as:

∫_0^x e^{−t^2} dt = Σ_{n=0}^{∞} (−1)^n ∫_0^x t^{2n}/n! dt = Σ_{n=0}^{∞} (−1)^n x^{2n+1}/( n! (2n + 1) ) .   (2.53)
Our previous argument, which led to equation (2.53), shows that the power series expansion of the
error function, introduced in (2.54), is:

erf(x) = 2/√π Σ_{n=0}^{∞} (−1)^n x^{2n+1}/( n! (2n + 1) ) .   (2.55)

Notice that from Theorem 2.38 it follows that the radius of convergence of the power series (2.55) is
infinite.
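Formula (2.55) translates directly into a numerical scheme for erf; the sketch below (our truncation rule: stop when a term drops below 10^−16) agrees with Python's built-in math.erf:

```python
import math

def erf_series(x):
    s, n, term = 0.0, 0, x           # term at n = 0 is x
    while abs(term) > 1e-16:
        s += term
        n += 1
        term = (-1) ** n * x ** (2 * n + 1) / (math.factorial(n) * (2 * n + 1))
    return 2.0 / math.sqrt(math.pi) * s

for x in (0.5, 1.0, 2.0):
    print(x, erf_series(x), math.erf(x))   # the last two columns agree
```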
Now, fixed ε > 0 , there exists nε ∈ N such that |sn+1 − s| < ε/2 for any n ∈ N , n > nε ; therefore,
using the triangle inequality in (2.59), the following holds for x ∈ ]−1 , 1[ :

|f (x) − s| = | (1 − x) Σ_{n=0}^{∞} (sn+1 − s) x^n |
≤ (1 − x) | Σ_{n=0}^{nε} (sn+1 − s) x^n | + (1 − x) | Σ_{n=nε+1}^{∞} (sn+1 − s) x^n |
≤ (1 − x) Σ_{n=0}^{nε} |sn+1 − s| |x|^n + (1 − x) (ε/2) Σ_{n=nε+1}^{∞} |x|^n   (2.60)
≤ (1 − x) Σ_{n=0}^{nε} |sn+1 − s| |x|^n + ε/2 ≤ (1 − x) Σ_{n=0}^{nε} |sn+1 − s| + ε/2 .

Since the function x ↦ (1 − x) Σ_{n=0}^{nε} |sn+1 − s| is continuous and vanishes for x = 1 , it is
possible to choose δ ∈ ]0 , 1[ such that, if 1 − δ < x < 1 , we have:

(1 − x) Σ_{n=0}^{nε} |sn+1 − s| < ε/2 .

Hence |f (x) − s| < ε whenever 1 − δ < x < 1 , which completes the proof.
Theorem 2.57 allows us to compute, in closed form, the sum of many interesting series.
Example 2.58. Recalling the power series expansion (2.34), from Theorem 2.57, with x = 1 , it
follows:

ln 2 = Σ_{n=1}^{∞} (−1)^{n+1}/n .
Example 2.59. Recalling the power series expansion (2.35), Theorem 2.57, with x = 1 , allows finding
the sum of the Leibnitz–Gregory9 series:

π/4 = Σ_{n=0}^{∞} (−1)^n /(2n + 1) .
Example 2.60. Recalling the particular binomial expansion (2.47), Abel Theorem 2.57 implies that,
for x = 1 , the following holds:

1/√2 = Σ_{n=0}^{∞} (−1)^n (2n − 1)!!/(2n)!! .

Using the fact that arccos x = π/2 − arcsin x , it is possible to obtain a second series, which gives π .

Example 2.61. Recalling the arcsin expansion (2.48), from Theorem 2.57 it follows:

π/2 = Σ_{n=0}^{∞} (2n − 1)!!/( (2n)!! (2n + 1) ) .

9 James Gregory (1638–1675), Scottish mathematician and astronomer.
Gottfried Wilhelm von Leibnitz (1646–1716), German mathematician and philosopher.
Example 2.62. We show here two summation formulæ connecting π to the central binomial
coefficients:

Σ_{n=0}^{∞} \binom{2n}{n} /( 4^n (2 n + 1) ) = π/2 ;   (2.61)

Σ_{n=0}^{∞} \binom{2n}{n} /( 16^n (2 n + 1) ) = π/3 .   (2.62)
The key to show (2.61) and (2.62) lies in the representation of the central binomial coefficient (2.50),
whose insertion in the left hand side of (2.61) leads to the infinite series:

Σ_{n=0}^{∞} (2n − 1)!!/( 2^n n! (2n + 1) ) .   (2.63)

We further notice that, from the power expansion of the arcsin function (2.48), it is possible to infer
the following equality:

arcsin √x /√x = Σ_{n=0}^{∞} (2n − 1)!!/( 2^n n! (2n + 1) ) x^n .   (2.64)
The radius of convergence of the power series (2.64) is 1 ; Abel Theorem 2.57 can thus be applied,
evaluating at x = 1 , to arrive at (2.61). It is worth noting that (2.61) can also be obtained using the
Lehmer series (2.51), via the change of variable y = 4 x and integrating term by term.
A similar argument leads to (2.62); here, the starting point is the following power series expansion,
which has, again, radius of convergence r = 1 :

arcsin x /x = Σ_{n=0}^{∞} \binom{2n}{n} /( 4^n (2n + 1) ) x^{2n} .   (2.65)

Equality (2.62) follows by evaluating formula (2.65) at x = 1/2 .
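Both (2.61) and (2.62) are easy to verify numerically. In the sketch below we avoid huge integers by updating the ratio binom(2n, n)/q^n through the recurrence binom(2n+2, n+1) = binom(2n, n) · 2(2n+1)/(n+1); the truncation orders are our own choices.

```python
import math

def central_binomial_sum(q, terms):
    total, c = 0.0, 1.0          # c = binom(2n, n) / q**n, starting at n = 0
    for n in range(terms):
        total += c / (2 * n + 1)
        c *= 2.0 * (2 * n + 1) / ((n + 1) * q)
    return total

# (2.62): terms decay geometrically (ratio ~ 1/4), so few terms suffice.
print(central_binomial_sum(16.0, 60), math.pi / 3)
# (2.61) sits at the boundary x = 1 of (2.64): terms decay only like
# n**(-3/2), so convergence is slow, as expected in the Abel-theorem setting.
print(central_binomial_sum(4.0, 500_000), math.pi / 2)
```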
2.7 Basel problem

The Basel problem asks for the sum of the series:

Σ_{n=1}^{∞} 1/n^2 .   (2.66)

Mengoli10 originally posed this problem in 1644; it takes its name from Basel, birthplace of Euler11 ,
who first provided the correct solution π^2/6 in [19].
There exist several solutions of the Basel problem; here we present the solution of Choe [11], based
on the power series expansion of f (x) = arcsin x , shown in Formula (2.48), as well as on the Abel
Theorem 2.57 and on the following integral Formula (2.67), which can be proved by induction on
m ∈ N :

∫_0^{π/2} sin^{2m+1} t dt = (2m)!!/(2m + 1)!! .   (2.67)

10 Pietro Mengoli (1626–1686), Italian mathematician and clergyman from Bologna.
11 Leonhard Euler (1707–1783), Swiss mathematician and physicist.
The first step towards solving the Basel problem is to observe that, in the sum (2.66), the attention
can be confined to odd indexes only. Namely, if E denotes the sum of the series (2.66), then E can be
computed by considering, separately, the sums on even and odd indexes:

Σ_{n=1}^{∞} 1/(2 n)^2 + Σ_{n=0}^{∞} 1/(2 n + 1)^2 = E .

Now, observe that E = π^2/6 ⟺ (3/4) E = π^2/8 . In other words, the Basel problem is equivalent
to showing that:

Σ_{n=0}^{∞} 1/(2n + 1)^2 = π^2/8 ,   (2.69)
whose proof can be found in [11].
Abel Theorem 2.57 applies to the power series (2.48), since we can prove that (2.48) converges for
x = 1 , using the Raabe12 test, that is to say, forming:

ρ = lim_{n→∞} n ( an /an+1 − 1 ) ,

in which an is the n–th series term, and proving that ρ > 1 . In the case of (2.48), with x = 1 :

ρ = lim_{n→∞} n ( (2n − 1)!!/( (2n)!! (2n + 1) ) · ( (2n + 2)!! (2n + 3) )/(2n + 1)!! − 1 )
= lim_{n→∞} n ( 2(n + 1)(2n + 3)/(2n + 1)^2 − 1 ) = lim_{n→∞} n (6n + 5)/(2n + 1)^2 = 3/2 .
This implies, also, that the series (2.48) converges uniformly. The change of variable x = sin t in both
sides of (2.48) yields, when −π/2 < t < π/2 :

t = sin t + Σ_{n=1}^{∞} (2n − 1)!!/(2n)!! · sin^{2n+1} t/(2n + 1) .   (2.70)
Integrating (2.70) term by term, on the interval [0 , π/2] , and using (2.67), we obtain:

π^2/8 = 1 + Σ_{n=1}^{∞} (2n − 1)!!/(2n)!! ∫_0^{π/2} sin^{2n+1} t/(2n + 1) dt
= 1 + Σ_{n=1}^{∞} (2n − 1)!!/(2n)!! · (2n)!!/(2n + 1)!! · 1/(2n + 1)
= 1 + Σ_{n=1}^{∞} 1/(2n + 1)^2 = Σ_{n=0}^{∞} 1/(2n + 1)^2 .
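As a numerical aside, the partial sums of (2.69) can be watched approaching π^2/8 (plain Python; the sample sizes are arbitrary):

```python
import math

for N in (10**2, 10**4, 10**6):
    partial = sum(1.0 / (2 * n + 1) ** 2 for n in range(N))
    print(N, partial, math.pi ** 2 / 8)   # partial sums approach 1.2337005...
```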
Definition 2.63. For z ∈ C , the complex exponential is defined as:

e^z := Σ_{n=0}^{∞} z^n /n! .   (2.72)

Equations (2.26) and (2.72) only differ in the fact that, in the latter, the argument can be a complex
number. Almost all the familiar properties of the exponential still hold, with the one exception of
positivity, which is meaningless in the non–ordered field C . The fundamental property of the complex
exponential is stated in the following Theorem 2.64, due to Euler.
Theorem 2.64 (Euler formula). For z = x + i y ∈ C :

e^z = e^{x+i y} = e^x · e^{i y} = e^x ( cos y + i sin y ) .   (2.73)
The last step, above, exploits the real power series expansion for the sine and cosine functions given
in (2.27) and (2.28) respectively.
The first beautiful consequence of Theorem 2.64 is the famous Euler identity: e^{i π} + 1 = 0 .
The C–extension of the exponential has an important consequence: the exponential function, when
considered as a function C → C , is no longer one–to–one, but is periodic, with period 2 π i . In fact,
if z , w ∈ C , then:

e^z = e^w ⟺ z = w + 2 n π i , with n ∈ Z .
From (2.73) it follows that, for y ∈ R :

sin y = ( e^{i y} − e^{−i y} )/(2 i) , cos y = ( e^{i y} + e^{−i y} )/2 .   (2.76)

It is thus possible to use (2.76) to extend to C the trigonometric functions:

sin z = ( e^{i z} − e^{−i z} )/(2 i) , cos z = ( e^{i z} + e^{−i z} )/2 .   (2.77)
In essence, for the sine and cosine functions, both in their trigonometric and hyperbolic versions, the
power series expansions (2.27), (2.28), (2.29) and (2.30) are understood as functions of a complex
variable.

Remark 2.68. In C , as well as in R , the logarithm of zero is undefined, since, from (2.73), it follows
e^z ≠ 0 , for any z ∈ C .
Using the polar representation of a complex number, we can represent its logarithms as shown below.

Theorem 2.69. If w = ρ e^{i ϑ} is a non–zero complex number, the logarithms of w are given by:

log w = ln ρ + i (ϑ + 2 n π) , n ∈ Z .   (2.78)

Proof. Writing z = x + i y , we solve the equation e^z = ρ e^{i ϑ} , with:

e^z = e^{x+i y} = e^x e^{i y} = e^x ( cos y + i sin y ) , ρ e^{i ϑ} = ρ ( cos ϑ + i sin ϑ ) ,

from which the real and imaginary components of z are obtained:

x = ln ρ (ρ > 0) , y = ϑ + 2 n π .
Among the infinite logarithms of a complex number, we pin down one, corresponding to the most
convenient argument.
In other words, for a non–zero complex w , the principal determination (or principal value) Log w is
the logarithm whose imaginary part lies in the interval (−π , π] .
Definition 2.72. Given z ∈ C , z ≠ 0 , and w ∈ C , the complex power function is defined as:

z^w = e^{w Log z} .

Example 2.73. Compute i^i . Applying Definition 2.72: i^i = e^{i Log i} . Since arg(i) = π/2 and
|i| = 1 , then Log i = i π/2 . Finally, i^i = e^{i · i π/2} = e^{−π/2} .
Example 2.74. In C , it is possible to solve equations like sin z = 2 , obviously finding complex
solutions. From the sin definition (2.77), in fact, we obtain:

e^{2 i z} − 4 i e^{i z} − 1 = 0 .

Thus:

e^{i z} = (2 ± √3) i ,

and, taking logarithms, z = π/2 + 2 n π − i ln(2 ± √3) , n ∈ Z .
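Examples 2.73 and 2.74 can be checked with Python's cmath module (a sketch; we verify the n = 0 solution only):

```python
import cmath

# Example 2.73: i**i is real, equal to exp(-pi/2).
print(1j ** 1j, cmath.exp(-cmath.pi / 2))         # both ~ (0.20788+0j)

# Example 2.74: z = pi/2 - i ln(2 + sqrt(3)) solves sin z = 2.
z = cmath.pi / 2 - 1j * cmath.log(2 + cmath.sqrt(3))
print(cmath.sin(z))                               # ~ (2+0j)
```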
2.9 Exercises
2.9.1 Solved exercises
1. Given the following sequence of functions, establish whether it is pointwise and/or uniformly con-
vergent:
fn (x) = (n x + x^2)/n^2 , x ∈ [0 , 1] .
2. Evaluate the pointwise limit of the sequence of functions:

fn (x) = (1 + x^n )^{1/n} , x ≥ 0 .
3. Show that the following sequence of functions converges pointwise, but not uniformly, to f (x) = 0 :

fn (x) = n x e^{−n x} , x > 0 .

4. Show that the following sequence of functions converges uniformly on [−1 , 1] :

fn (x) = √(1 − x^n )/n^2 .

5. Show that:

Σ_{n=1}^{∞} 1/(n 2^n ) = ln 2 .
6. Evaluate:

lim_{n→∞} ∫_1^{∞} n e^{−n x}/(1 + n x) dx .
7. Use the definite integral ∫_0^1 x (1 − x)/(1 + x) dx to prove that:

Σ_{n=1}^{∞} (−1)^{n+1}/( (n + 1)(n + 2) ) = 3/2 − ln 4 .
8. Let fn (x) = ( 1 + x^2/n )^{−n} , x ≥ 0 . Establish whether (fn ) converges pointwise and/or
uniformly.
Solutions.

1. Observe that:

lim_{n→∞} sup_{x∈[0,1]} |fn (x) − 0| = lim_{n→∞} (n + 1)/n^2 = 0 ,

since sup_{x∈[0,1]} fn (x) = fn (1) = (n + 1)/n^2 ; hence (fn ) converges uniformly (and pointwise)
to 0 on [0 , 1] .
2. For 0 ≤ x ≤ 1 , it is 1 ≤ 1 + x^n ≤ 2 , hence, by the squeezing theorem13 :

lim_{n→∞} fn (x) = 1 .

When x > 1 , recalling that, here, 1/x < 1 and x ≠ 0 , we consider the change of variable t = 1/x and
repeat the previous argument (that we followed in the case of a variable t < 1) to obtain
lim_{n→∞} (t^n + 1)^{1/n} = 1 , that is:

lim_{n→∞} ( 1/x^n + 1 )^{1/n} = 1 .

Since fn (x) = x ( 1/x^n + 1 )^{1/n} for x > 0 , it follows that lim_{n→∞} fn (x) = x when x > 1 .

13 See, for example, mathworld.wolfram.com/SqueezingTheorem.html
3. The pointwise limit of the sequence fn (x) = n x e^{−n x} is f (x) = 0 , due to the exponential decay
of the factor e^{−n x} . To investigate the possible uniform convergence, we consider sup_{x>0} |fn (x)| .
Differentiating, we find:

d/dx ( n x e^{−n x} ) = n e^{−n x} (1 − n x) ,

showing that x = 1/n is a global maximizer and the corresponding extremum is:

fn (1/n) = 1/e .

But this implies that the found convergence cannot be uniform, since:

lim_{n→∞} sup_{x>0} |fn (x) − f (x)| = 1/e ≠ 0 .
4. For any x ∈ [−1 , 1] and any n ∈ N , it holds that √(1 − x^n ) ≤ √2 , thus:

fn (x) = √(1 − x^n )/n^2 ≤ √2/n^2 .   (2.79)

Now, observe that inequality (2.79) is independent of x ∈ [−1 , 1] : this fact, taking the supremum
with respect to x ∈ [−1 , 1] , ensures uniform convergence.
5. Start from the identity:

1/(n 2^n ) = ∫_0^{1/2} x^{n−1} dx .

Since the geometric series on the right–hand side converges uniformly on [0 , 1/2] , we can swap series
and integral, obtaining:

Σ_{n=1}^{∞} 1/(n 2^n ) = ∫_0^{1/2} ( Σ_{n=1}^{∞} x^{n−1} ) dx .

Therefore:

Σ_{n=1}^{∞} 1/(n 2^n ) = ∫_0^{1/2} 1/(1 − x) dx = [ −ln(1 − x) ]_0^{1/2} = ln 2 .
6. Define the (decreasing) function hn (x) = n/(1 + n x) , with x ∈ [1 , +∞) . Then:

hn′ (x) = −n^2/(1 + n x)^2 < 0 .

Since:

lim_{x→∞} n/(1 + n x) = 0 and sup_{x∈[1,∞)} n/(1 + n x) = hn (1) = n/(1 + n) ,

we can infer that:

|hn (x)| ≤ n/(1 + n) < 1 .

Therefore:

| n e^{−n x}/(1 + n x) | < e^{−n x} .

This shows uniform convergence to 0 for fn on [1 , +∞) . We can now invoke Theorem 2.15, to obtain:

lim_{n→∞} ∫_1^{∞} n e^{−n x}/(1 + n x) dx = ∫_1^{∞} lim_{n→∞} n e^{−n x}/(1 + n x) dx = ∫_1^{∞} 0 dx = 0 .
2. Let fn (x) = cos^n ( x/√n ) , x ∈ R . Show that:

lim_{n→∞} fn (x) = e^{−x^2/2} .

Hint. Consider the sequence gn (x) = ln fn (x) and use the power series (2.33) and (2.28).
3. Establish whether the sequence of functions (fn )n∈N , defined, for x ∈ R , by:

fn (x) = ( x + x^2 e^{n x} )/( 1 + e^{n x} ) ,

converges pointwise and/or uniformly.
4. Show that:

Σ_{n=1}^{∞} 1/(n 3^n ) = ln(3/2) .
5. Show that lim_{n→∞} ∫_0^1 e^{(x+1)/n} dx = 1 .
6. Consider the following equality and say if (and why) it is true or false:

lim_{n→∞} ∫_0^1 x^4/( x^2 + n^2 ) dx = ∫_0^1 lim_{n→∞} x^4/( x^2 + n^2 ) dx .
7. Let fn (x) = n (x^3 + x) e^{−x}/(1 + n x) , with x ∈ [0 , 1] .

a. Compute the pointwise limit f of (fn ) .

b. Show that:

|fn (x) − f (x)| ≤ 2/(1 + n x) .

c. Show that, for any a > 0 , sequence (fn ) converges uniformly to f on [a , 1] , but the convergence
is not uniform on [0 , 1] .

d. Evaluate lim_{n→∞} ∫_0^1 fn (x) dx .
Hint.

1/( (1 + x)(1 + x^2) ) = (1 − x)/( 2 (1 + x^2) ) + 1/( 2 (1 + x) ) .
10. Show that:

∫_0^{∞} x^5/( e^{x^2} − 1 ) dx = Σ_{n=1}^{∞} 1/n^3 .
11. Show that cos z = 2 ⟺ z = 2 n π − i ln( 2 ± √3 ) , n ∈ Z .
3 Multidimensional differential calculus
Let f be a real function of n variables, defined in a neighbourhood of a point whose j–th coordinate
ranges in an interval (a , b) ; freezing all the variables but the j–th, f induces the function of one
variable:

g := f (x1 , . . . , xj−1 , · , xj+1 , . . . , xn ) .

If g is differentiable at some t0 ∈ (a , b) , then the first–order partial derivative of f at
(x1 , . . . , xj−1 , t0 , xj+1 , . . . , xn ) , with respect to xj , is defined by:

fxj (x1 , . . . , xj−1 , t0 , xj+1 , . . . , xn ) := ∂f/∂xj (x1 , . . . , xj−1 , t0 , xj+1 , . . . , xn ) := g′(t0 ) .
Therefore, the partial derivative fxj exists at a point a if and only if the following limit exists:

∂f/∂xj (a) := lim_{h→0} ( f (a + h ej ) − f (a) )/h .
Higher–order partial derivatives are defined by iteration. For example, when it exists, the second–order
partial derivative of f , with respect to xj and xk , is defined by:

fxj xk := ∂^2 f/( ∂xk ∂xj ) := ∂/∂xk ( ∂f/∂xj ) .
(i) f is said to be C p on V if and only if every k–th order partial derivative of f , with k ≤ p , exists
and is continuous on V .
If f is C p on V and q < p , then f is C q on V . The symbol C p (V ) denotes the set of functions that are
C p on an open set V .
For simplicity, in the following we shall state all results for the case m = 1 and n = 2 , denoting x1
with x and x2 with y . With appropriate changes in notation, the same results hold for any m, n ∈ N .
For instance, the product rule for partial derivatives reads:

∂/∂x (f g) = f ∂g/∂x + g ∂f/∂x .
Example 3.3. By the Mean–Value Theorem2 , if f ( · , y) is continuous on [a , b] and the partial
derivative fx ( · , y) exists on (a , b) , then there exists a point c ∈ (a , b) (which may depend on y as
well as on a and b) such that:

f (b , y) − f (a , y) = (b − a) ∂f/∂x (c , y) .
In most situations, when dealing with higher–order partial derivatives, the order of computation of
the derivatives is, in some sense, arbitrary. This is expressed by the Clairaut3 –Schwarz Theorem: if
f is C^2 in a neighbourhood of (a , b) ∈ R^2 , then:

∂^2 f/( ∂y ∂x ) (a , b) = ∂^2 f/( ∂x ∂y ) (a , b) ;

more generally, if f is C^2 in a neighbourhood of a ∈ Rn , then, for any pair of indexes j , k :

∂^2 f/( ∂xj ∂xk ) (a) = ∂^2 f/( ∂xk ∂xj ) (a) .
Remark 3.6. Existence of partial derivatives does not ensure continuity. As an example, consider:

f (x , y) = x y/( x^2 + y^2 ) if (x , y) ≠ (0 , 0) , and f (x , y) = 0 if (x , y) = (0 , 0) .

This function is not continuous at (0 , 0) , but admits partial derivatives at any (x , y) ∈ R^2 ; at the
origin, in fact:

lim_{∆x→0} ( f (∆x , 0) − f (0 , 0) )/∆x = lim_{∆x→0} 0 = 0 ,

and

lim_{∆y→0} ( f (0 , ∆y) − f (0 , 0) )/∆y = lim_{∆y→0} 0 = 0 .
3.2 Differentiability
In this section, we define what it means for a vector function f to be differentiable at a point a .
Whatever our definition, if f is differentiable at a , then we expect two things:
To appreciate the following Definition 3.7 of total derivative of a function of n variables, we consider
one peculiar aspect of differentiable functions of one variable. Recall that f : R → R is differentiable
in x ∈ R if the following limit is finite, i.e., it is a real number:

lim_{h→0} ( f (x + h) − f (x) )/h := f ′(x) .

The definition above is equivalent to the following: f is differentiable in x ∈ R if there exist α ∈ R
and a function ω : (−δ , δ) → R , with ω(0) = 0 and lim_{h→0} ω(h)/h = 0 , such that:

f (x + h) = f (x) + α h + ω(h) .   (3.1)
The definition of differentiability for functions of several variables extends Property (3.1).
(i) f is continuous at a ;
Theorem 3.9. Let V be open in Rn , let a ∈ V and suppose that f : V → R . If all first–order partial
derivatives of f exist in V and are continuous at a , then f is differentiable at a .
In order to extend the notion of gradient, we introduce the Jacobian5 matrix associated to a
vector–valued function.

Definition 3.11. Let f : Rn → Rm be a function from the Euclidean n–space to the Euclidean
m–space. f has m real–valued component functions f1 (x1 , . . . , xn ) , . . . , fm (x1 , . . . , xn ) . If the
partial derivatives of the component functions exist, they can be organized in an m–by–n matrix,
namely the Jacobian matrix J of f :

J := ( ∂fi /∂xj ) , for i = 1 , . . . , m and j = 1 , . . . , n .

The i–th row of J corresponds to the gradient ∇fi of the i–th component function fi , for
i = 1 , . . . , m .
(i) f (a) is called a local minimum of f if and only if there exists r > 0 such that f (a) ≤ f (x) for
all x ∈ Br (a) , an open ball neighborhood of a (recall Definition 1.13);
(ii) f (a) is called a local maximum of f if and only if there exists r > 0 such that f (a) ≥ f (x) for
all x ∈ Br (a) ;
(iii) f (a) is called a local extremum of f if and only if f (a) is a local maximum or a local minimum
of f .
5
Carl Gustav Jacob Jacobi (1804–1851), German mathematician.
3.4. SUFFICIENT CONDITIONS 47
Remark 3.15. If the first–order partial derivatives of f exist at a, and if f (a) is a local extremum
of f , then ∇f (a) = 0 .
In fact, the one–dimensional function:
∂f
(a) = g 0 (aj ) = 0 .
∂xj
As in the one–dimensional case, condition ∇f (a) = 0 is necessary but not sufficient for f (a) to be a
local extremum.
Example 3.16. There exist continuously differentiable functions satisfying ∇f (a) = 0 and such that
f (a) is neither a local maximum nor a local minimum.
Consider, for instance, in the case n = 2 , the following function:
f (x, y) = y 2 − x2 .
It is easy to check that ∇f (0) = 0 , but the origin is a saddle point, as shown in Figure 3.1.
Definition 3.18. Let V ⊆ Rn an open set and let f : V → R be a C 2 function. The Hessian matrix
of f at x ∈ V (or, simply, the Hessian) is the symmetric square matrix formed by the second–order
partial derivatives of f , evaluated at point x :
2
∂ f
H(f )(x) := (x) , for i, j = 1, . . . , n .
∂xi ∂xj
Tests for extrema and saddle points, in the simplest situation of n = 2 , are stated in Theorem 3.19.
Example 3.20. A couple of examples are provided here, and the Reader is invited to verify the stated
results.
Tests for extrema and saddle points, in the general situation of n variables, are stated in Theorem
3.21.
or
min f (x) subject to g(x) = 0 .
3.6. MEAN–VALUE THEOREM 49
Theorem 3.22 (Lagrange multipliers – general case). Let m < n , let V be open in Rn , and let
f , gj : V → R be C 1 on V , for j = 1 , 2 . . . , m . Suppose that:
∂(g1 , . . . , gm )
∂(x1 , . . . , xn )
We will limit the proof of the Lagrange multipliers Theorem 3.22 in a two–dimensional context. To
this aim, it is first necessary to consider some preliminary results; we will resume the proof in §3.8.
[x , y] := {z ∈ Rn : z = t x + (1 − t) y , 0 ≤ t ≤ 1} .
The one–dimensional Mean–Value theorem (already met in Example 3.3), also called Lagrange Mean–
Value theorem or First Mean–Value theorem, can be extended to the Euclidean space Rn .
Theorem 3.25 (Implicit Function – case n = 2). Let Ω be an open set in R2 , and let f : Ω → R
be a C 1 function. Suppose there exists (x0 , y0 ) ∈ Ω such that f (x0 , y0 ) = 0 and fy (x0 , y0 ) 6= 0 .
Then, there exist δ , ε > 0 such that, for any x ∈ (x0 − δ , x0 + δ) there exists a unique y = ϕ(x) ∈
(y0 − ε , y0 + ε) such that:
f (x, y) = 0 .
Moreover, function y = ϕ(x) is C 1 in (x0 − δ, x0 + δ) and it holds that, for any x ∈ (x0 − δ , x0 + δ) :
fx (x, ϕ(x))
ϕ0 (x) = − .
fy (x, ϕ(x))
Proof. Let us assume that f (x0 , y0 ) > 0 . Since function fy (x , y) is continuos, it is possibile to find
a ball Bδ1 (x0 , y0 ) in which it is verified that (x , y) ∈ Bδ1 (x0 , y0 ) =⇒ fy (x, y) > 0 .
This means that, with an appropriate narrowing of parameters ε and δ , function y 7→ f (x , y) can
be assumed to be an increasing function, for any x ∈ (x0 − δ , x0 + δ) .
In particular, y 7→ f (x0 , y) is increasing and, since f (x0 , y0 ) = 0 by assumption, the following
disequalities are verified, for ε small enough:
Using, again, continuity of f and an appropriate narrowing of δ , we infer that, for any x ∈ (x0 −
δ , x0 + δ) :
f (x, y0 + ε) > 0 and f (x, y0 − ε) < 0 .
In conclusion, using continuity of y 7→ f (x , y) and the Bolzano theorem7 on the existence of zeros,
we have shown that, for any x ∈ (x0 − δ, x0 + δ) , there is a unique y = ϕ(x) ∈ (y0 − ε, y0 + ε) such
that:
f (x , y) = f (x , ϕ(x)) = 0 .
To prove the second part of Theorem 3.25, we need to show that ϕ(x) is differentiable. To this aim,
consider h ∈ R such that x + h ∈ (x0 − δ, x0 + δ) . In this way, from the Mean–Value Theorem 3.24,
there exist θ ∈ (0, 1) such that:
0 = f x + h , ϕ(x + h) − f x , ϕ(x)
= fx x + θ h ϕ(x) + θ ϕ(x + h) − ϕ(x) h +
+ fy x + θ h ϕ(x) + θ ϕ(x + h) − ϕ(x) ϕ(x + h) − ϕ(x) ,
thus:
ϕ(x + h) − ϕ(x) fx x + θ h ϕ(x) + θ ϕ(x + h) − ϕ(x)
=− .
h fy x + θ h ϕ(x) + θ ϕ(x + h) − ϕ(x)
The thesis follows by taking, in the equality above, the limit for h → 0 , observing that h → 0 =⇒
θ → 0 , and recalling that f (x , y) is C 1 .
We are now ready to state the Implicit function Theorem 3.26 in the general n–dimensional case; here,
Ω is an open set in Rn × R , thus (x, y) ∈ Ω means that x ∈ Rn and y ∈ R .
Theorem 3.26 (Implicit Function – general case). Let Ω ⊆ Rn × R be open, and let f ∈ C 1 (Ω , R) .
Assume that there exists (x0 , y0 ) ∈ Ω such that f (x0 , y0 ) = 0 and fy (x0 , y0 ) 6= 0 .
Then, there exist an open ball Bδ (x0 ) , an open interval (y0 − ε , y0 + ε) and a function ϕ : (y0 −
ε , y0 + ε) → R , such that:
f (x, y) = 0 ⇐⇒ y = ϕ(x) ;
Theorem 3.27 (Lagrange multipliers – case n = 2). Let A ⊂ R2 be open, and let f , g : A → R be
C 1 functions. Consider the subset of A :
M = {(x , y) ∈ A : g(x , y) = 0} .
Assume that ∇g(x , y) 6= 0 for any (x , y) ∈ M . Assume further that (x0 , y0 ) ∈ M is a maximum or
a minimum of f (x , y) for any (x , y) ∈ M .
Then, there exists λ ∈ R such that:
∇f (x0 , y0 ) = λ∇g(x0 , y0 ) .
Proof. Since ∇g(x0 , y0 ) 6= 0 , we can assume that gy (x0 , y0 ) 6= 0, . Thus, from the Implicit function
Theorem 3.25, there exist ε , δ > 0 such that, for x ∈ (x0 − δ , x0 + δ) , y ∈ (y0 − ε , y0 + ε) , it holds:
we evaluate: 00 00
Lxx Lxy gx
00 00
Λ = det Lxy Lyy gy .
gx gy 0
Then:
(a) Λ > 0 indicates a maximum value;
(b) Λ < 0 indicates a minimum value.
Example 3.28. An example of interest in Economics concerns the maximization of a production
function of Cobb–Douglas kind. The mathematical problem can be modelled as:
max f (x , y) = xa y 1−a
(3.3)
subject to p x + q y − c = 0
where 0 < a < 1 , and p , q , c > 0 .
In a problem like (3.3), f (x , y) is referred to as objective function, while , by defining the function
w(x , y) = p x + q y − c , the constraint is given by w(x , y) = 0 .
The Lagrangian is L(x , y ; m) = f (x , y) − m w(x , y) . The critical point equations are:
a−1 y 1−a − m p = 0 ,
Lx (x , y ; m) = a x
Ly (x , y ; m) = (1 − a) xa y −a − m q = 0 ,
Lm (x , y ; m) = p x + q y − c = 0 .
Eliminating m from the first two equations, by subtraction, we obtain the two–by-two linear system
in the variables x , y : (
(1 − a) p x − a q y = 0 ,
px + qy − c = 0 .
Solving the 2 × 2 system and recovering m , from m = (a xa−1 y 1−a ) p−1 , we find the critical point:
ac
x=
p
c (1 − a)
y=
q
m = (1 − a)1−a aa p−a q a−1
Eliminating m from the first two equations, by substitution, we obtain the two–by-two linear system
in the variables x , y : (y
=6,
x
x1/4 y 3/4 = 1 .
Solving the 2 × 2 system and recovering m from m = (4 y 1/4 )/(3 x1/4 ) , the critical point is found:
4 × 21/4
x = 6−3/4 , y = 61/4 , m= .
33/4
The Bordered Hessian is:
3 m y 3/4 y 3/4
3m
−
16 x7/4 16 x3/4 y 1/4 4 x3/4
3m 3 m x1/4 3 x1/4
Λ= − .
16 x3/4 y 1/4 16 y 5/4 4 y 1/4
y 3/4 3 x1/4
0
4 x3/4 4 y 1/4
Evaluating Λ at the critical point and computing its determinant:
3 (3)1/4
det Λ = − ,
23/4
we see that we found a minimum.
Example 3.30. The problem presented here is typical in the determination of an optimal investment
portfolio in Corporate Finance.
We seek to minimize f (x, y) = x2 + 2 y 2 + 3 z 2 + 2 xz + 2 y z with the constraints:
x+y+z =1 2x + y + 3z = 7.
L(x , y , z ; m , n) = x2 + 2 y 2 + 3 z 2 + 2 x z + 2 y z − m (x + y + z − 1) − n (2 x + y + 3 z − 7) ,
x = 0, y = −2 , z = 3, m = −10 , n = 8.
The convexity of the objective function ensures that the found solution is the absolute minimum.
Though this statement should be proved rigorously, we do not treat it here.
4 Ordinary differential equations of
first order: general theory
Our goal, in introducing ordinary differential equations, is to provide a brief account on methods of
explicit integration, for the most common types of ordinary differential equations. However, it is not
taken for granted the main theoretical problem, concerning existence and uniqueness of the solution of
the Initial Value Problem, modelled by (4.3). Indeed, the proof of the Picard–Lindelöhf Theorem 4.17
is presented in detail: to do this, we will use some notions from the theory of uniform convergence of
sequences of functions, already discussed in Theorem 2.15. An abstract approach followed, for instance,
in Chapter 2 of [60], is avoided here.
In the following Chapter 5, we present some classes of ordinary differential equations for which, using
suitable techniques, the solution can be described in terms of known functions: in this case, we say
that we are able to find an exact solution of the given ordinary differential equation.
55
56CHAPTER 4. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: GENERAL THEORY
Definition 4.1. Given f : Ω ⊂ R2 → R , being Ω an open set, the initial value problem (also called
Cauchy problem) takes the form:
(
y 0 = f (x , y) , x∈I ,
(4.3)
y(x0 ) = y0 , x0 ∈ I , y0 ∈ J ,
where I × J ⊆ Ω are intervals, and where we have simply denoted y in place of y(x) .
Remark 4.2. We say that differential equations are studied by quantitative or exact methods when
they can be solved completely, that is to say, all their solutions are known and could be written in
closed form, in terms of elementary functions or, at times, in terms of special functions (or in terms
of inverses of elementary and special functions).
We can also follow a reverse approach, in the sense that, as illustrated in Example 4.4, given a
geometrical locus, we obtain its ordinary differential equation.
y = α x2 . (4.5)
Any parabola in the family has the y-axes as common axis, with vertex in the origin. Differentiating,
we get:
y0 = 2 α x . (4.6)
Eliminating α from (4.5) and (4.6), we obtain the differential equation:
2y
y0 = . (4.7)
x
This means that any parabola in the family is solution to the differential equation (4.7).
4.1. PRELIMINARY NOTIONS 57
in which y1 = y1 (x) and y2 = y2 (x) are functions of a variable x that, in most applications, takes the
meaning of time. System (4.8) is probably the most famous system of ordinary differential equations,
as it represents the Lotka-Volterra predator prey system; see, for instance, [23]. Notice that the left
hand–sides in (4.8) are not dependent on x : in this particular case, the system is called autonomous.
We now state, formally, the definition of Initial Value Problem for a system of n ordinary differential
equations, each of first order, and for a differential equation of order n , with integer n ≥ 1 in both
cases.
Definition 4.6. Consider Ω , open set in R × Rn , with integer n ≥ 1 , and let f : Ω → Rn be a
vector–valued continuous function of (n + 1)–variables. Let further (x0 , y) ∈ Ω and I be an open
interval such that x0 ∈ I .
Then, a vector–valued function s : I → Rn is a solution of the initial value problem:
(
y 0 = f (x , y)
(4.9)
y(x0 ) = y 0
if the following conditions are verified:
Remark 4.7. In the Lotka–Volterra case (4.8), it is n = 2 , thus y = (y1 , y2 ) , the open set is
Ω = R × (0 , +∞) × (0 , +∞) and the continuos function is f (x , y) = f (x , y1 , y2 ) = y1 (a −
b y2 ) , y2 (c y1 − d) .
The rigorous definition of initial value problem for a differential equation of order n is provided below.
Definition 4.8. Consider an open set Ω ⊆ R × Rn , where n ≥ 1 is integer. Let F : Ω → R be a
scalar continuous function of (n + 1)–variables. Let further (x0 , b) ∈ Ω and I be an open interval
such that x0 ∈ I . Finally, denote b = (b1 , . . . , bn ) .
Then, a real function s : I → R is a solution of the initial value problem:
y (n) = F (x , y , y 0 , y 00 , · · · , y (n−1) )
y(x0 ) = b1
y 0 (x0 ) = b2 (4.10)
...
(n−1)
y (x0 ) = bn
if:
58CHAPTER 4. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: GENERAL THEORY
(i) s ∈ C n (I) ;
(2) all solutions to (4.11) can be expressed as functions of the family itself, i.e., they take the form
y(x ; c1 , . . . , cn ) .
Remark 4.10. Systems of first–order differential equations like (4.9) and equations of order n like
(4.10) are intimately related. Given the n-th order equation (4.10), in fact, an equivalent system can
be build, that has form (4.9), by introducing a new vector variable z = (z1 , . . . , zn ) and considering
the system of differential equations:
z10 = z2
0
z2 = z3
... (4.12)
0
zn−1 = zn
0
zn = F (x , z1 , z2 , . . . , zn )
System (4.12) can be represented in the vectorial form (4.9), simply by setting z 0 = (z10 , . . . , zn0 ) ,
b = (b1 , . . . , bn ) and:
z2
z3
f (x , z) =
... .
zn
F (x , z1 , z2 , . . . , zn )
Form Remark 4.10, the following Theorem 4.11 can be inferred, whose straightforward proof is omitted.
Theorem 4.11. Function s is solution of the n–th order initial value problem (4.10) if and only if
the vector function z solves system (4.12), with the initial conditions (4.13).
Remark 4.12. It is also possible to go in the reverse way, that is to say, any system of n differential
equations, of first order, can be transformed into a scalar differential equation of order n . We illustrate
this procedure with the Lotka-Volterra system (4.8). The first step consists in computing the second
derivative, with respect to x , of the first equation in (4.8):
Then, the values of y10 and y20 from (4.8) are inserted in (4.8a), yielding:
y100 = y1 (a − b y2 )2 + b y2 (d − c y1 ) .
(4.8b)
Thirdly, using again the first equation in (4.8), y2 is expressed in terms of y1 and y10 , namely:
a y1 − y10
y10 = y1 (a − b y2 ) =⇒ y2 = . (4.8c)
b y1
Finally, (4.8c) is inserted into (4.8b), which provides the second–order differential equation for y1 :
y 02
y100 = a y1 − y10 (d − c y1 ) + 1 .
(4.8d)
y1
To extend Theorem 4.13 to systems of ordinary differential equations, the rectangle R is replaced by
a parallelepiped, obtained as the Cartesian product of a real interval with an n–dimensional closed
ball.
Remark 4.15. Under the sole continuity assumption, a solution needs not to be unique. Consider,
for example, the initial value problem:
(
y 0 (x) = 2 |y(x)| ,
p
(4.14)
y(0) = 0 .
The zero function y(x) = 0 is a solution of (4.14), which is solved, though, by function y(x) = x |x|
as well. Moreover, for each pair of real numbers α < 0 < β , the following ϕα ,β (x) function solves
(4.14) too:
−(x − α)2
if x<α,
ϕα ,β (x) = 0 if α ≤ x ≤ β ,
(x − β)2 if x>β .
In other words, the considered initial value problem admits infinite solutions. This phenomenon is
known as Peano funnel.
proof is constructive and turns out useful when trying to evaluate the solution of the given ordinary
differential equation.
The key notion to be introduced is Lipschitz continuity, which may be considered as a kind of inter-
mediate property, between continuity and differentiability.
For simplicity, we work in a scalar situation; the extension to systems of differential equations is only
technical; some details are provided in § 4.3.2.
We use again R to denote the rectangle:
R = [x0 , x0 + a] × [y0 − b , y0 + b] .
Proof. The proof is somewhat long, so we present it splitted into four steps.
First step. Let n ∈ N . Define the sequence of functions (un ) by recurrence:
u0 (x) = y0 ,
Z x
un+1 (x) = y0 +
f X , un (X) dX .
x0
We want to show that x , un (x) ∈ R for any x ∈ [x0 , x0 + α] . To this aim, it is enough to prove
that, for n ≥ 0 , the following inequality is verified:
It is precisely here that we can understand the reason of the peculiar definition (4.16) of the number
α , as such a choice turns out appropriate in correctly defining each (and any) term in the sequence
un . It also highlights the local nature of the solution of the initial value problem (4.3).
Second step. We now show that (un ) converges uniformly on [x0 , x0 + α] . The identity:
n
X
un = u0 + (u1 − u0 ) + · · · + (un − un−1 ) = u0 + (uk − uk−1 )
k=1
2
Rudolf Otto Sigismund Lipschitz (1832–1903), German mathematician.
3
Charles Émile Picard (1856–1941), French mathematician.
Ernst Leonard Lindelöf (1870–1946), Finnish mathematician.
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 61
suggests that any sequence (un ) can be thought of as an infinite series: its uniform convergence, thus,
can be proved by showing that the following series (4.18) converges totally on [x0 , x0 + α] :
∞
X
(uk − uk−1 ) . (4.18)
k=1
Ln−1 |x − x0 |n
|un (x) − un−1 (x)| ≤ M , for any x ∈ [x0 , x0 + α] . (4.19)
n!
We proceed by induction. For n = 1 , the bound is verified, since:
We now prove (4.19) for n + 1 , assuming that it holds true for n . Indeed:
Z x
|un+1 (x) − un (x)| = f X , un (X) − f X , un−1 (X) dX
Z xx0
≤ f X , un (X) − f X , un−1 (X) dX
x0
Z x
≤L |un (X) − un−1 (X)| dX
x0
Ln−1 x
Ln |x − x0 |n+1
Z
≤M |X − x0 |n dX = M .
n! x0 (n + 1)!
Therefore (4.19) is proved and implies that series (4.18) is totally convergent for [x0 , x0 + α] ; in fact:
∞ ∞ ∞
X X X Ln−1 αn
(un − un−1 ) ≤ sup |un − un−1 | ≤ M
n!
n=1 n=1 [x0 ,x0 +α] n=1
∞
M X (L α)n M
eα L − 1 < +∞ .
= =
L n! L
n=1
Third step. We show that the limit of the sequence of functions (un ) solves the initial value problem
(4.3). From the equality:
Z x
lim un+1 (t) = lim y0 + f X , un (X) dX ,
n→∞ n→∞ x0
As before, it is possible to show that, for any n ∈ N and any x ∈ [x0 , x0 + α] , the following inequality
holds true:
Ln |x − x0 |n
|u(x) − v(x)| ≤ K , (4.21)
n!
where K is given by:
K= max |u(x) − v(x)| .
x∈[x0 ,x0 +α]
Indeed: Z x
|u(x) − v(x)| ≤ f X , u(X) − f x , v(X) dX ≤ K L |x − x0 | ,
x0
which proves (4.21) for n = 1 . Using induction, if we assume that (4.21) is satisfied for some n ∈ N ,
then:
Z x
|u(x) − v(x)| ≤ f X , u(X) − f X , v(X) dX
x0
Z x Z x
Ln (X − x0 )n
≤L |u(X) − v(X)| dX ≤ L K dX .
x0 x0 n!
After calculating the last integral in the above inequality chain, we arrive at:
Ln+1 (x − x0 )n+1
|u(x) − v(x)| ≤ K ,
(n + 1)!
which proves (4.21) for index n + 1 . By induction, (4.21) holds true for any n ∈ N .
We can finally end our demonstration of Theorem 4.17. In fact, by taking the limit n → ∞ in (4.21),
we obtain that, for any x ∈ [x0 , x0 + α] , the following inequality is verified:
|u(x) − v(x)| ≤ 0 ,
which shows that u(x) = v(x) for any x ∈ [x0 , x0 + α] .
Remark 4.18. Let us go back to Remark 4.15. In such p a situation, where the initial value problem
(4.14) has multiple solutions, function f (x , y) = 2 |y| does not fulfill the Lipschitz continuity
property. In fact, taking, for istance, y1 , y2 > 0 yields:
|f (y1 ) − f (y2 )| 2
= √ √ ,
|y1 − y2 | y1 − y2
which is unbounded.
The proof of Theorem 4.17, based on successive Picard iterates, is also useful in some simple situations,
where it allows to compute an approximate solution of the initial value problem (4.3). This is illustrated
by Example 4.19.
Example 4.19. Construct the Picard–Lindelhöf iterates for:
(
y 0 (x) = − 2 x y(x) ,
(4.22)
y(0) = 1 .
The first iterate is y0 (x) = 1 , while subsequent iterates are:
Z x Z x
y1 (x) = y0 (x) + f (t , y0 (t))dt = 1 − 2 t dt = 1 − x2 ,
0 0
Z x Z x
x4
y2 (x) = y0 (x) + f (t , y1 (t))dt = 1 − 2 t (1 − t2 )dt = 1 − x2 + ,
0 0 2
Z x
t4 x4 x6
2
y3 (x) = 1 − 2 t 1−t + dt = 1 − x2 + − ,
0 4 2 6
Z x
t4 t6 x4 x6 x8
2
y4 (x) = 1 − 2 t 1−t + − dt = 1 − x2 + − + ,
0 4 6 2 6 24
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 63
x2 x4 x6 x8 (−1)n x2n
yn (x) = 1 − + − + + ··· + .
1! 2! 3! 4! n!
The sequence of Picard–Lindelhöf iterates converges only if it also converges the series:
m
X (−1)n x2n
y(x) := lim .
m→∞ n!
n=0
it follows:
∞
X (−x2 )n 2
y(x) = = e−x .
n!
n=0
We will show later, in Example 5.4, that this function is indeed the solution to (4.22).
We leave it to the Reader, as an exercise, to determine the successive approximations of the IVP:
(
y 0 (x) = y(x) ,
y(0) = 1 .
is solved by the function y(x) = tanh x , so that its interval of existence is R . It is worth noting that
the similar initial value problem: (
y0 = 1 + y2
y(0) = 0
behaves
π π differently, since it is solved by y(x) = tan x and has, therefore, interval of existence given by
− 2 , 2 . Moreover, in this latter case, taking the limit at the boundary of the interval of existence
yields:
limπ y(x) = limπ tan x = ±∞.
x→± 2 x→± 2
This is not a special situation. Even when the interval of existence is bounded, for some theoretical
reason that we present later, in detail, the solution can be unbounded; this case is referred to as a
blow-up phenomenon.
64CHAPTER 4. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: GENERAL THEORY
R = [x0 , x0 + a] × B b (y 0 ) .
The vector–valued version of the Picard–Lindelöhf Theorem 4.17 is represented by the following The-
orem 4.21, whose proof is omitted, as it is very similar to that of Theorem 4.17.
Theorem 4.21 (Picard–Lindelöhf, vector–valued case). Let f : R → Rn br uniformly Lipschitz
continuous in y , with respect to x , and define:
b
M = max ||f || , α = min {a , }. (4.23)
R M
Then, problem (4.9) admits unique solution u ∈ C 1 ([x0 , x0 + α] , Rn ) .
is also a solution of y 0 = f (x , y) .
Function f represents a vector field.
With the Picard–Lindelöhf Theorem 4.21, we can build solutions to initial value problems associated
to y 0 = f (x , y) , choosing the initial data in Ω . In other words, given a point (x0 , y 0 ) ∈ Ω , we form
the IVP (4.9), for which Theorem 4.21 ensures existence of a solution u(x) in a neighborhood of x0 .
If now, in the rectangle R = [x0 , x0 + a] × B b (y 0 ) , we choose a , b > 0 so that R ⊂ Ω (which is
always possible, since Ω is open), then the solution of IVP (4.9) is defined at least up to the point
x1 = x0 + α1 , where the constant α1 > 0 is given by (4.23).
This allows us to continue and consider a new initial value problem:
(
y 0 = f (x, y)
(4.9a)
y(x1 ) = u(x1 ) := y 1
which is defined at least up to point x2 = x1 + α2 , where constant α2 > 0 is again given by (4.23).
This procedure can be iterated, leading to the formal Definition 4.23 of maximal domain solution. The
idea of continuation of a solution may be better understood by looking at Figure 4.1.
Definition 4.23 (Maximal domain solution). If u ∈ C 1 (I , Rn ) solves the initial value problem (4.9),
we say that u has maximal domain (or that u does not admit a continuation) if there exists no
function v ∈ C 1 (J , Rn ) which also solves (4.9) and such that I ⊂ J .
4.3. EXISTENCE AND UNIQUENESS: PICARD–LINDELÖHF THEOREM 65
Rn
Ω
y1
y0
x0 x1 R
The existence of the maximal domain solution to IVP (4.9) can be understood euristically, as it comes
from indefinitely repeating the continuation procedure. Establishing it with mathematical rigor is
beyond the aim of these lecture notes, since it would require notions from advanced theoretical Set
theory, such as Zorn’s Lemma4 .
We end this section stating, in Theorem 4.24, a result on the asymptotic behaviour of a solution with
maximal domain, in the particular case where Ω = I × Rn , being I an open interval. Such a result
explains what observed in Example 4.20, though we do not provide a proof of Theorem 4.24.
Denote (α , ω) the maximal domain of y . Then, one of two possibility holds respectively for α and
for ω :
4
See, for example, mathworld.wolfram.com/ZornsLemma.html
66CHAPTER 4. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: GENERAL THEORY
5 Ordinary differential equations of
first order: methods for explicit so-
lutions
In the previous Chapter 4 we exposed the general theory, concerning conditions for existence and
uniqueness of an initial value problem. Here, we consider some important particular situations, in
which, due to the structure of certain kind of scalar ordinary differential equations, it is possible to
establish methods to determine their explicit solution
where a(x) and b(y) are continuous functions, respectively defined on intervals Ia and Ib , such that
x0 ∈ Ia and y0 ∈ Ib .
To obtain existence and uniqueness in the solution of (5.1), we have to assume that b(y0 ) 6= 0 .
b(y) 6= 0 , (5.2)
then the unique solution to (5.1) is function y(x) , defined implicitly by:
Z y Z x
dz
= a(s) ds . (5.3)
y0 b(z) x0
Remarkp5.3. The hypothesis (5.2) cannot be removed, as shown, for instance, in Remark 4.15, where
b(y) = 2 |y| , which means that b(0) = 0 .
∂F (x, y) 1
= ,
∂y b(y)
it follows:
∂F (x0 , y0 ) 1
= 6= 0 .
∂y b(y0 )
67
68CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
We can thus invoke the Implicit function theorem 3.25 and infer the existence of δ , ε > 0 for which,
given any x ∈ (x0 − δ , x0 + δ) , there is a unique C 1 function y = y(x) ∈ (y0 − ε , y0 + ε) such that:
F (x, y) = 0
Function y , implicitly defined by (5.3), is thus a solution of (5.1); to complete the proof, we still have
to show its uniqueness. Assume that y1 (x) and y2 (x) are both solutions of (5.1), and define:
Z y
dz
B(y) := ;
y0 b(z)
then:
d y10 (x) y20 (x)
B y1 (x) − B y2 (x) = −
dx b y1 (x) b y2 (x)
a(x) b y1 (x) a(x) b y2 (x)
= − = 0.
b y1 (x) b y2 (x)
we used the fact that both y1 (x) and y2 (x) are assumed to solve (5.1). Thus B y1 (x) −
Notice that
B y2 (x) is a constant function, and its constant value is zero, since y1 (x0 ) = y2 (x0 ) = y0 . In other
words, we have shown that, for any x ∈ Ia :
B y1 (x) − B y2 (x) = 0 ,
At this point, using the Mean–Value Theorem 3.24, we infer the existence of a number X(x) between
the integration limits y2 (x) and y1 (x) , such that:
1
y1 (x) − y2 (x) = 0 .
b X(x)
Example 5.4. Consider once more the IVP studied, using successive approximations, in Example
4.19: (
y 0 (x) = −2 x y(x) ,
y(0) = 1 .
Setting a(x) = −2 x , b(y) = y , x0 = 0 , y0 = 1 in (5.3) leads to:
Z y Z x
1 2
dz = (−2 z) dz ⇐⇒ ln y = −x2 ⇐⇒ y(x) = e−x .
1 z 0
5.1. SEPARABLE EQUATIONS 69
In the next couple of examples, some interesting, particular cases of separable equations are considered.
Example 5.5. The choice b(y) = y in (5.3) yields the particular separable equation:
(
y 0 (x) = a(x) y(x) ,
(5.4)
y(x0 ) = y0 ,
thus: Z x
a(s) ds
x0
y = y0 e .
x
For instance, if a(x) = − , the initial value problem:
2
( x
y 0 (x) = − y(x)
2
y(0) = 1
has solution:
x2
y(x) = e− 4 .
Example 5.6. In (5.3), let b(y) = y 2 , which leads to the separable equation:
(
y 0 (x) = a(x) y 2 (x) ,
(5.5)
y(x0 ) = y0 ,
has solution
1
y= .
1 + x2
We now provide some practical examples, recalling that a complete treatment needs, both, finding the
analytical expression of the solution and determining the maximal solution domain.
y(−1) = 0 .
70CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
From (5.3):
Z y(x) Z x
(z + 1)dz = (s − 1)ds ,
0 0
performing the relevant computations, we get:
1 2 1
y (x) + y(x) = x2 − x ,
2 2
so that: (
p x − 2,
y(x) = −1 ± (x − 1)2 =
−x .
Now, recall that x lies in a neighborhood of zero and that the initial condition requires y(0) = 0 ; it
can be inferred, therefore, that y(x) = −x must be chosen. To establish the maximal domain, observe
that x − 1 vanishes for x = 1 ; thus, we infer that x < 1 .
Example 5.9. As a varies in R , investigate the maximal domain of the solutions to the initial value
problem: (
u0 (x) = a 1 + u2 (x) cos x ,
u(0) = 0 .
Form (5.3), to obtain:
Z u(x) Z x
dz
= a cos s ds.
0 1 + z2 0
After performing the relevant computations, we get:
arctan u(x) = a sin x . (5.6)
It is clear that the Range of the right hand–side of (5.6) is [−a , a] . To obtain a solution defined on
π
R , we have to impose that a < . In such a case, solving with respect to u yields:
2
u(x) = tan (a sin x) .
π π
Viceversa, when a ≥ , since there exists x ∈ R+ for which a sin x = , then, the obtained solution
2 2
π
is defined in (−x , x) , and x is the minimum positive number verifying the equality a sin x = .
2
5.1. SEPARABLE EQUATIONS 71
5.1.1 Exercises
1. Solve the following separable equations:
x
ex
y 0 (x) = sin2 y(x) , y 0 (x) = ,
y(x) (d) (1 + ex ) cosh y(x)
(a) r
π y(0) = 0 ,
y(0) =
,
2
ex
y 0 (x) = 2 x ,
y 0 (x) = ,
(b) cos y(x) (e) (1 + ex ) cos y(x)
y(0) = 0 ,
y(0) = 5π ,
2x
y 0 (x) =
,
x2 2
cos y(x) 0
y (x) = y (x) ,
(c) (f) 1+x
y(0) = π ,
y(0) = 1 .
4
Solutions:
p 1
(1 + e2 x ) ,
(a) ya (x) = arccos(−x2 ) , (d) yd (x) = arcsinh ln 2
(e) ye (x) = arcsin ln 21 (1 + e2 x ) ,
(b) yf (x) = arcsin x2 + 5 π ,
2
(c) yc (x) = arcsin √1 + x2 ,
(f) yb (x) = .
2 2(1 + x − ln(1 + x)) − x2
e2x
0
y (x) = y(x)
√ 4 + e2x
y(0) = 5
√
is y(x) = 4 + e2 x . What is the maximal domain of such a solution?
y 0 (x) = x sin x
1 + y(x)
y(0) = 0
√
is y(x) = 2 sin x − 2 x cos x + 1 − 1 . Find the maximal domain of the solution.
y 3 (x)
0
y (x) =
1 + x2
y(0) = 2
2
is y(x) = √ . Find the maximal domain of the solution.
1 − 8 arctan x
5. Show that the solution to the initial value problem:
(
y 0 (x) = (sin x + cos x) e−y(x)
y(0) = 1
is y(x) = ln(1 + e + sin x − cos x) . Find the maximal domain of the solution.
72CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
π
is y(x) = tan x ln(x2 + 1) − 2 x + 2 arctan x +
4 . Find the maximal domain of the solution.
1
is y(x) = − √
3
. Find the maximal domain of the solution.
1 − 2x3
8. Solve the initial value problem:
00 0
y (x) = (sin x) y (x)
y(0) = 1
0
y (1) = 0
Hint. Set z(x) = y 0 (x) and solve the equation z 0 (x) = (sin x) z(x) .
y 0 = (y − 1) (y − 2) . (5.7)
Equation (5.7) is separable, so we can easily adapt formula (5.3), using indefinite integrals and adding
a constant of integration:
y−2
ln = x + c1 .
y−1
Solving for y , the general solution to (5.7) is obtained:
2 − ex+c1 2 − c ex
y(x) = = , (5.8)
1 + ex+c1 1 + c ex
where we set c = ec1 .
Observe that the two constant functions y = 1 and y = 2 are solutions of equation (5.7). Observe
further that y = 2 is obtained from (5.8) taking c = 0 , thus such a solution is a particular solution
to (5.7). Viceversa, solution y = 1 cannot be obtained using the general solution (5.8); for this reason
this solution is called singular.
Singular solutions of a differential equation can be found with a computational procedure, illustrated
in Remark 5.10.
Remark 5.10. Given the differential equation (4.2), suppose that its general solution is given by
Φ(x , y , c) = 0 . When there exists a singular integral of (4.2), it can be detected eliminating c from
the system:
Φ(x , y , c) = 0 ,
∂Φ (5.9)
(x , y , c) = 0 .
∂c
5.3. HOMOGENEOUS EQUATIONS 73
Remark 5.11. When the differential equation (5.3) is given in implicit form, uniqueness of the
solution does not hold; this generates the occurrence of a singular integral, which can be detected
without solving the differential equation (4.2), eliminating y 0 from system 5.10:
0
F (x , y , y ) = 0 ,
∂F (5.10)
0 (x , y , y 0 ) = 0 .
∂y
x y(x)
y(2) = 2 .
x2 + y 2
f (x, y) = .
xy
u0 (x) = 1 ,
x u(x)
u(2) = 1 ,
yielding: p
u(x) = 2 ln |x| + 1 − ln 4 .
√ 2
Observe that the solution is defined on the interval x > eln 4−1 = √ .
e
At this point, going back to our original initial value problem, we arrive at the solution of the homo-
geneous problem:
p 2
y(x) = x 2 ln |x| + 1 − ln 4 , x> √ .
e
5.3.1 Exercises
Solve the following initial values problems for homogeneous equations:
y 2 (x) y(x)
1
0
y (x) = 2 , y 0 (x) =
y(x) + x e x ,
x − x y(x) 3. x
1.
y(1) = 1 ,
y(1) = −1 ,
3
1 15 x + 11 y(x)
y 0 (x) = − 2 x y(x) + y 2 (x) , y 0 (x) = −
,
2. x 4. 9 x + 5 y(x)
y(1) = 1 , y(1) = 1 ,
Solutions:
√
1 + 3x2 − 1 3. y(x) = −x ln (e − ln x) ,
1. y(x) = ,
x
x 1
2x 4. = ,
2 23/5 y 2/5 y 3/5
2. y(x) = 2 , 1+ 3+
3x − 1 x x
5.4. QUASI HOMOGENEOUS EQUATIONS 75
where
a b
det 6 0.
= (5.14)
α β
In this situation, the linear system:
(
ax + by + c = 0
(5.15)
αx + βy + γ = 0
y 0 = 3 x + 4 ,
y−1
y(0) = 2 .
4
whose solution is x1 = − , y1 = 1 . In the second step, the change variable is performed:
3
(
X = x + 43 ,
Y =y−1 ,
and simplifying:
8
3 x + x = y 2 − 2y
2
3
yields:
p
y =1± 3 x2 + 8 x + 1 .
The worked–out Example 5.15 illustrate the procedure to be followed if, when considering equation
(5.13), condition (5.14) is not fulfilled.
y 0 = − x + y + 1 ,
2x + 2y + 1
y(1) = 2 .
In this situation, since the two equations in the system are proportional, the change of variable to be
employed is:
(
t = x,
z = x+y.
The given differential equation is, hence, transformed into the separable one:
z
z 0 = ,
2z + 1
z(1) = 3 .
Thus:
2 z − 6 + ln z − ln 3 = t − 1 =⇒ x + 2 y + ln(x + y) = 5 + ln 3 .
Observe that, in this example, it is not possible to express the dependent variable y in an elementary
way, i.e., in terms of elementary functions.
3 x2 − 2 x y
y0 = . (5.16)
2 y + x2 − 1
First, rewrite (5.16) as:
2 x y − 3 x2 + (2 y + x2 − 1) y 0 = 0 . (5.16a)
Equation (5.16a) is solvable under the assumption that a suitable function Φ(x , y) can be found, that
verifies:
∂Φ ∂Φ
= 2 x y − 3 x2 , = 2 y + x2 − 1 .
∂x ∂y
Note that it is not always possible to determine such a Φ(x , y) . In the current Example 5.16, though,
we are able to define Φ(x , y) = y 2 + (x2 − 1) y − x3 . Therefore (5.16a) can be rewritten: as
∂Φ ∂Φ 0
+ y = 0. (5.16b)
∂x ∂y
Let us, now, leave the particular case of Example 5.16, and return to the general situation, i.e., consider
ordinary differential equation of the form:
M (x , y) + N (x , y) y 0 = 0 . (5.18)
We call exact the differential equation (5.18), if there exists a function Φ(x , y) such that:
∂Φ ∂Φ
= M (x , y) , = N (x , y) , (5.19)
∂x ∂y
and the (implicit) solution to an exact differential equation is constant:
Φ(x , y) = c .
In other words, finding Φ(x , y) constitues the central task in determining whether a differential
equation is exact and in computing its solution.
Establishing a necessary condition for (5.18) to be exact is easy. In fact, if we assume that (5.18) is
exact and that Φ(x , y) satisfies the hypotheses of Theorem 3.4, then the equality holds:
∂ ∂Φ ∂ ∂Φ
= .
∂x ∂y ∂y ∂x
1
See, for example, mathworld.wolfram.com/ChainRule.html
78CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
The result in Theorem 5.17 speeds up the search of solution for exact equations.
y 0 = − M (x , y) ,
N (x , y) (5.21)
y(x0 ) = y0 , (x0 , y0 ) ∈ Q .
∂M (x , y)
M (x , y) = 6 x + y 2 =⇒ = 2y,
∂y
∂N (x , y)
N (x , y) = 2 x y + 1 =⇒ = 2y.
∂x
Formula (5.22) then yields:
Z x Z x
M (t , 1) dt = (6 t + 1) dt = −4 + x + 3 x2 ,
Z 1y Z1 y
N (x , s) ds = (2 x s + 1) ds = −1 − x + y + x y 2 .
1 1
Hence, the solution to the given initial value problem is implicitly defined by:
x y 2 + y + 3 x2 − 5 = 0
3 y e3 x − 2 x
0
y =− ,
e3 x
y(1) = 1 .
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 79
Here, it holds:
∂M (x , y)
M (x , y) = 3 y e3 x − 2 x =⇒ = 3 e3 x ,
∂y
∂N (x , y)
N (x , y) = e3 x =⇒ = 3 e3 x .
∂x
Using formula (5.22):
Z x Z x
M (t , 1) dt = (3 e3 t − 2 t) dt = −x2 + e3 x − e3 + 1 ,
Z 1y Z1 y
N (x , s) ds = (e3 x ) ds = (y − 1) e3 x .
1 1
−x2 + e3 x − e3 + 1 + (y − 1) e3 x = 0 y = e−3 x x2 + e3 − 1 .
=⇒
5.5.1 Exercises
1. Solve the following initial value problems, for exact equations:
3 2
y 0 = 9 x − 2 x y ,
y 0 = − 2 x + 3 y ,
(a) 3x + y − 1 (d) x2 + 2 y + 1
y(1) = 2 , y(0) = −3 ,
2
2
y 0 = 9 x − 2 x y ,
y 0 = 2 x y + 4 ,
(e) 2 (3 − x2 y)
(b) x2 + 2 y + 1
y(−1) = 8 ,
y(0) = −3 ,
2xy
2
− 2x
y 0 = 2 x y + 4 , 2
0 = 1+x
(f) y ,
(c) 2 (3 − x2 y)
2 − ln (1 + x2 )
y(−1) = −8 , y(0) = 1 .
2. Using the method for exact equation, described in this § 5.5, prove that the solution of the initial
value problem:
1 − 3 y 3 e3 x y
y 0 =
3 x y 2 e3 x y + 2 y e3 x y
y(0) = 1
is implicitly defined by y 2 e3 x y − x = 1 , and verify this result using the Dini Implicit function
Theorem 3.25.
∂ ∂
M (x, y) + N (x, y) y 0 = 0 , N (x, y) = M (x , y) .
∂x ∂y
The case is more frequent, though, in which condition (5.19) is not satisfied, so that we are unable to
express the solution of the given differential equation in terms of the known functions.
There is, however, a general method of solution which, at times, allows the solution of the general
differential equation to be formulated using the known functions. In formula (5.18a) below, although
80CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
it can hardly be considered as an orthodox procedure, we split the derivative y 0 and, then, rewrite
(5.18) in the so–called Pfaffian2 form:
M (x , y) dx + N (x , y) dy = 0 . (5.18a)
We do not assume condition (5.20). In this situation, there exists a function µ(x , y) such that,
multiplying both sides of (5.18a) by µ , an equivalent equation is obtained which is exact, namely:
This represents a theoretical statement, in the sense that it is easy to formulate conditions that need
to be satisfied by the integrating factor µ , namely:
∂ ∂
µ(x , y) N (x, y) = µ(x , y) M (x, y) . (5.23)
∂x ∂y
Evaluating the partial derivatives (and employing a simplified subscript notation for partial deriva-
tives), the partial differential equation for µ is obtained:
M (x , y) µy − N (x , y) µx = Nx (x , y) − My (x , y) µ . (5.23a)
Notice that solving (5.23a) may turn out to be harder than solving the original differential equation
(5.18a). However, depending on the particular structure of the functions M (x , y) and N (x , y) , there
exist favorable situations in which it is possibile to detect the integrating factor µ(x , y) , provided
that some restrictions are imposed on µ itself. In the following Theorems 5.20 and 5.23, we describe
what happens when µ depends on one variable only.
Theorem 5.20. Equation (5.18a) admits an integrating factor µ depending on x only, if the quantity:
My (x , y) − Nx (x , y)
ρ(x) = (5.24)
N (x , y)
Proof. Assume that µ(x, y) is a function of one variable only, say, it is a function of x only, thus:
dµ
µ(x , y) = µ(x) , µx = = µ0x , µy = 0 .
dx
In this situation, equation (5.23a) reduces to:
N (x , y) µ0x = My (x , y) − Nx (x , y) µ ,
(5.23b)
that is:
µ0x My (x , y) − Nx (x , y)
= . (5.23c)
µ N (x , y)
Now, if the left hand–side of (5.23c) depends on x only, then (5.23c) is separable: solving it leads to
the integrating factor represented in thesis (5.25).
2
Johann Friedrich Pfaff (1765–1825), German mathematician.
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 81
x (x − y) (5.26)
y(1) = 3 .
Equation (5.26) is not exact, nor separable. Let us rewrite it in Pfaffian form, temporarily ignoring
the initial condition:
(3 x y − y 2 ) dx + x (x − y) dy = 0 . (5.26a)
Setting M (x , y) = 3 x y − y 2 and N (x , y) = x (x − y) yields:
My (x , y) − Nx (x , y) 3 x − 2 y − (2 x − y) 1
= = ,
N (x , y) x (x − y) x
which is a function of x only. The hypotheses of Theorem 5.20 are fulfilled, and the integrating factor
comes from (5.25): Z
1
dx
x
µ(x) = e = x.
Multiplying equation (5.26a) by the integrating factor x , we form an exact equation, namely:
(3 x2 y − x y 2 ) dx + x2 (x − y) dy = 0 . (5.26b)
M1 (x , y) = x M (x , y) = 3 x2 y − x y 2 ,
N1 (x, y) = x N (x , y) = x2 (x − y) ,
and employ them in equation (5.22), which also incorporates the initial condition:
Z x Z y
M1 (t , 3) dt + N1 (x , s) ds = 0 ,
1 3
that is: Z x Z y
(−9 t + 9 t2 ) dt + (x2 − x s) ds = 0 .
1 3
x (x + 2y) (5.27)
y(1) = 1 .
Since the hypotheses of Theorem 5.20 are fulfilled, the integrating factor is given by (5.25):
Z
2
dx
x
µ(x) = e = x2 .
M1 (x , y) = x M (x , y) = 4 x3 y + 3 x2 y 2 − x3 ,
N1 (x , y) = x N (x , y) = x3 (x + 2 y) ,
we can use them into equation (5.22), which also incorporates the initial condition, obtaining:
Z x Z y
M1 (t , 1) dt + N1 (x , s) ds = 0 ,
1 1
that is: Z x Z y
2 3
(3 t + 3 t ) dt + x2 (2 s + x) ds = 0 .
1 1
Evaluating the integrals yields:
x4 7
x4 y −
+ x3 y 2 − = 0 .
4 4
Solving for y and recalling the initial condition, we get the solution of the initial value problem (5.27):
√
x5 + x4 + 7 x
y= − .
2x3/2 2
We examine, now, the case in which the integrating factor µ is a function of y only. Given the analogy
with Theorem 5.20, the proof of Theorem 5.23 is not provided here.
Theorem 5.23. Equation (5.18a) admits an integrating factor µ depending on y only, if the quantity:
Nx (x , y) − My (x , y)
ρ(y) = (5.28)
M (x , y)
Example 5.24. Consider the initial value problem, with non–separable and non–exact differential
equation:
y 0 = − y (x + y + 1) ,
x (x + 3 y + 2) (5.30)
y(1) = 1 .
y 2 (x + y + 1) dx + x y (x + 3 y + 2) dy = 0 .
5.6. INTEGRATING FACTOR FOR NON EXACT EQUATIONS 83
M1 (x , y) = x M (x , y) = y 2 (x + y + 1) ,
N1 (x , y) = x N (x , y) = x y (x + 3 y + 2) ,
and employ them into equation (5.22), which also incorporates the initial condition, obtaining:
Z x Z y
M1 (t , 1) dt + N1 (x , s) ds ,
1 1
that is: Z x Z y
(2 + t) dt + s x (2 + 3 s + x) ds = 0 .
1 1
x2 y 2 5
+ x y3 + x y2 − = 0 .
2 2
To end this § 5.6, let us consider the situation of a family of differential equations for which an
integrating factor µ is available.
Theorem 5.25. Let Q = { (x , y) ∈ R2 | 0 < a < x < b , 0 < c < x < d } , and let f1 and f2 be
C 1 functions on Q , such that f1 (x y) − f2 (x y) 6= 0 . Define the functions M (x , y) and N (x , y) as:
M (x , y) = y f1 (x y) , N (x , y) = x f2 (x y) .
Then:
1
µ(x , y) =
x y f1 (x y) − f2 (x y)
is an integrating factor for:
y f1 (x y)
y0 = − .
x f2 (x y)
Proof. It suffices to insert the above expressions of µ , M and N into condition (5.23) and verify that
it gets satisfied.
5.6.1 Exercises
1. Solve the following initial value problems, using a suitable integrating factor.
2
y 0 = − y (x + y) ,
y 0 = 3 x + 2 y ,
(a) x + 2y − 1 (e) 2xy
y(1) = 1 , y(1) = 1 ,
0 y − 2 x3
y 0 = − y
2
, y = ,
(b) x (y − ln x) (f) x
y(1) = 1 , y(1) = 1 ,
y 0 = y
y 0 = y ,
, (g) 3
y − 3x
(c) y − 3x − 3 y(0) = 1 ,
y(0) = 0 ,
3 x
( y 0 = − y + 2 y e ,
y0 = x − y , (h) ex + 3 y 2
(d)
y(0) = 0 , y(0) = 0 .
84CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
where: Z x
A(x) = a(s) ds . (5.33)
x0
Proof. To arrive at formula (5.32), we first examine the case b(x) = 0 , for which (5.31) reduces to the
separable (and linear) equation: (
y 0 (x) = a(x) y(x) ,
(5.34)
y(x0 ) = c ,
having set y0 = c . If x0 ∈ I , the solution of (5.34) through point (x0 , c) is:
y(x) = c eA(x) . (5.35)
To find the solution to the more general differential equation (5.31), we use the method of Variation
of Parameters 3 , due to Lagrange: we assume that c is a function of x and search for c (x) such that
the function:
y(x) = c(x) eA(x) (5.36)
becomes, indeed, a solution of (5.31). To this aim, differentiate (5.36):
y 0 (x) = c0 (x) eA(x) + c(x) a(x) eA(x)
and impose that function (5.36) solves (5.31), that is:
c0 (x) eA(x) + c(x) a(x) eA(x) = a(x) c(x) eA(x) + b(x) ,
from which:
c0 (x) = b(x) e−A(x) . (5.37)
Integrating (5.37) between x0 and x , we obtain:
Z x
c(x) = b(t) e−A(t) dt + K ,
x0
Evaluating y(x0 ) and recalling the initial condition in (5.31), we see that y0 = K . Thesis (5.32) thus
follows.
3
See, for example, mathworld.wolfram.com/VariationofParameters.html
5.7. LINEAR EQUATIONS OF FIRST ORDER 85
Remark 5.27. An alternative proof to Theorem 5.26 can be provided, using Theorem 5.20 and the
integrating factor procedure. In fact, if we assume:
M (x , y) = a(x) y(x) + b(x) , N (x , y) = −1 ,
then:
My − Ny
= −a(x) ,
N
which yields the integrating factor µ(x) = e−A(x) , with A(x) defined as in (5.33). Considering the
following exact equation, equivalent to (5.31):
e−A(x) a(x) y + b(x)
0
y (x) = − ,
−e−A(x)
and employing relation (5.22), we obtain:
Z x Z y
a(t) y0 + b(t) e−A(t) dt − e−A(x) ds = 0 ,
x0 y0
which means that y1 − y2 has the form (5.35) and, therefore, solves (5.34). Now, using the fact that
y1 is a solution of (5.31), the general solution to (5.31) can be written as:
y(x) = c eA(x) + y1 (x) , c ∈ R,
and all this is equivalent to saying that the general solution y = y(x) of (5.31) can be written in the
form:
y − y1
= c, c ∈ R.
y2 − y1
Example 5.29. Consider the equation:
3
y 0 (x) = 3 x2 y(x) + x ex ,
y(0) = 1 .
3
Here, a(x) = 3 x2 and b(x) = x ex . Using (5.32)–(5.33), we get:
Z x Z x
A(x) = a(s) ds = 3 s2 ds = x3
x0 0
and
x x
x2
Z Z
−A(t) 3 3
b(t) e dt = t et e−t dt = ,
x0 0 2
so that:
x2
x3
y(x) = e 1+ .
2
86CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
and
" 2
#x 2
x x
e−t 1 e−x
Z Z
−A(t) −t2
b(t) e = te dt = − = − .
0 0 2 2 2
0
5.7.1 Exercises
Solve the following initial value problems for linear equations.
1 1 1
y 0 (x) = −
2
y(x) + 2
, y 0 (x) = y(x) + x2 ,
1. 1 + x 1 + x 4. x
y(0) = 0 , y(1) = 0 ,
3
y 0 (x) = − sin x y(x) + sin x ,
y 0 (x) = 3 x2 y(x) + x ex ,
2. 5.
y(0) = 0 ,
y(0) = 1 ,
x 1
0 y 0 (x) = x −
y (x) =
2
y(x) + 1 , y(x) ,
3. 1+x 6. 3x
y(0) = 0 ,
y(1) = 1 .
Here α = 4 , and the change of variable is v(x) = y −3 (x) , i.e., y = v −1/3 , leading to:
1 −4/3 0 1 1
− v v = − v −1/3 − v −4/3 .
3 x x
Multiplcation by v 4/3 yields a linear differential equation in v :
1 0 1 1
− v =− v− .
3 x x
Simplifying and recalling the initial condition, we obtain a linear initial value problem:
v 0 = 3 v + 3 ,
x x
v(2) = 1 ,
1 3
solved by v(x) = x − 1 . Hence, the solution to the original problem is:
4
r
4
y(x) = 3 3 .
x −4
Example 5.33. Given x > 0 , solve the initial value problem:
y 0 (x) = 1 y(x) + 5 x2 y 3 (x) ,
2x
y(1) = 1 .
We have to solve a Bernoulli equation, with exponent α = 5 . The change of variable is:
1 − 1
y(x) = v(x) 1−5
= v(x) 4 ,
solved by:
x + x3
v(x) = .
2
Recovering y(x) , we find: r
4 2
y(x) =
x + x3
that is defined for x > 0 .
Remark 5.35. Bernoulli equation (5.38) can also be solved using the same approach used for linear
equations, that is, imposing a solution of the form:
so that:
1
1−α
1−α
c(x) = (1 − α) F (x) + c0 ,
where: Z x
F (x) = b(z) e(α−1) A(z) dz .
x0
5.8.1 Exercises
Solve the following initial value problems for Bernoulli equations.
y 0 = 2 x y + 2√y ,
y 0 = − 1 − 1 y 2 ,
1. x ln x 3. 1+x
y(2) = 1 , y(0) = 1 ,
( 2 (
y 0 = x2 y + x5 + x2 y 3 , y0 = y + x y2 ,
2. 4.
y(0) = 1 , y(0) = 1 .
(
y 0 (x) − y(x) + (cos x) y(x)2 = 0 ,
5.
y(0) = 1 .
The solving strategy is based on knowing one particular solution y1 (x) of (5.40). Then, it is assumed
that the other solutions of (5.40) have the form y(x) = y1 (x) + u(x) , where u(x) is an unknown
function, to be found, and that solves the associated Bernoulli equation:
Notice that, in this latter way, we combine together two substitutions: the first one maps the Riccati5
equation into a Bernoulli equation; the second one linearizes the Bernoulli equation.
Example 5.36. Knowing that y1 (x) = 1 solves the Riccati equation:
1+x 1
y0 = − + y + y2 , (5.42)
x x
we want to show that the general solution to equation (5.42) is:
x2 e x
y(x) = 1 + .
c + (1 − x)ex
Let us use the change of variable:
1
y(x) = 1 + ,
u(x)
to obtain the linear equation:
1 2
u0 (x) = −
− 1+ u(x) . (5.42a)
x x
To solve it, we proceed as we learned. First, compute A(x) :
e−x
Z
2
A(x) = − 1+ dx = −x − 2 ln x =⇒ eA(x) = .
x x2
Then, form: Z Z Z
−A(x) 1 2 x
b(x) e dx = − x e dx = − x ex dx = (1 − x) ex .
x
The solution to (5.42a) is, therefore:
e−x x
c e−x + 1 − x c + (1 − x) ex
u(x) = c + (1 − x) e = = .
x2 x2 x2 e x
Finally, the solution to (5.42) is:
1 x2 ex
y(x) = 1 + =1+ .
u(x) c + (1 − x) ex
y(x)
y 0 (x) = −x5 + + x3 y 2 (x) , (5.43)
x
find the general solution of (5.43).
5
Jacopo Francesco Riccati (1676–1754), Italian mathematician and jurist.
90CHAPTER 5. ORDINARY DIFFERENTIAL EQUATIONS OF FIRST ORDER: METHODS FOR EXPLICIT
1
The substitution y = x + leads to the linear differential equation:
v
1
v0 x+ 2 x5 + 1
v + x3 x + 1 2
1 − 2 = −x5 + =⇒ v0 = − v − x3 ,
v x v x
whose solution is:
2 x5
c e− 5 1
v(x) = − ,
x 2x
where c is an integration constant. The general solution of (5.43) is, therefore:
2x
y= + x.
2 x5
2 c e− 5 −1
Remark 5.38. In applications, it may be useful to state conditions on the coefficient functions
a(x) , b(x) and c(x) , to the aim that the relevant Riccati equation (5.40) is solved by some particular
function having simple form. The following list summarizes such conditions and, for each one, the
correspondent simple–form solution y1 .
1. Monomial solution:
if a(x) + xn−1 x b(x) + c(x) xn+1 − n = 0 , then y1 (x) = xn .
2. Exponential solution:
if a(x) + en x b(x) + c(x) en x − n = 0 , then y1 (x) = en x .
4. Sine solution: if a(x) + b(x) sin(n x) + c(x) sin2 (n x) − n cos(n x) = 0 , then y1 (x) = sin(n x) .
5. Cosine solution: if a(x) + b(x) cos(n x) + c(x) cos2 (n x) + n sin(n x) = 0 , then y1 (x) = cos(n x) .
Theorem 5.39. Given any three functions y1 , y2 , y3 , which satisfy (5.40), then the general solution
y of (5.40) can be expressed in the form:
y − y2 y3 − y2
=c . (5.44)
y − y1 y3 − y1
Proof. We saw in § 5.9 that, if y1 is a solution of (5.40), then solutions y2 and y3 will be determined
by two particular choices of u in the substitution:
1
y = y1 + .
u
Let us denote such functions with u2 and u3 , respectively:
1 1
y2 = y1 + , y3 = y1 + .
u2 u3
Recalling that u2 and u3 are solutions to the linear equation (5.41), we know that the general solution
of (5.41) can be written as shown in Remark 5.28:
u − u2
= c.
u3 − u2
5.9. RICCATI EQUATION 91
At this point, employing the reverse substitution, and following [32] (page 23):
1 1 1
u= , u2 = , u3 = ,
y − y1 y2 − y1 y3 − y1
A consequence of Theorem 5.39 is the so–called Cross–Ratio property of the Riccati equation, illus-
trated in the following Corollary 5.40.
Corollary 5.40. Given any four solutions y1 , . . . , y4 of the Riccati equation (5.40), their Cross–Ratio
is constant and is given by the quantity:
y4 − y2 y3 − y1
. (5.45)
y4 − y1 y3 − y2
u0 = A0 (x) + A1 (x) u2
is known as reduced form of the Riccati equation (5.40). Functions A0 (x) and A1 (x) are related
to functions a(x) , b(x) and c(x) appearing in (5.40). In fact, if B(x) is a primitive of b(x) , i.e.,
B 0 (x) = b(x) , the change of variable:
u(x) = e−B(x) y (5.46)
trasforms (5.40) into the reduced Riccati equation:
This can be seen by computing u0 = e−B(x) y 0 − y B 0 (x) from (5.46) and then substituting, in
the factor y 0 − y B 0 (x) , the equalities B 0 (x) = b(x) and y 0 = a(x) + b(x) y + c(x) y 2 , and finally
y = eB(x) u .
Sometimes, given a Riccati equation, its solution can be obtained by simply transforming it to its
reduced form. Example 5.41. illustrates this fact.
Example 5.41. Consider the initial value problem for the Riccati equation:
y 0 = 1 − 1 y + x y 2 ,
2x
y(1) = 0 .
u(1) = 0 ,
where P (x) and Q(x) are given continuos functions, defined on an interval I ⊂ R . The term linear
indicates the fact that the unknown function y = y(x) and its derivatives appear in polynomial form
of degree one.
The second–order linear differential equation (5.47) is equivalent to a particular Riccati equation. We
follow the fair exposition given in Chapter 15 of [46]. Let us introduce a new variable u = u(x) ,
setting: Z
y = e−U (x) , with U (x) = − u(x) dx . (5.48)
simplifying which leads to the non–linear Riccati differential equation of first order:
Viceversa, to find a linear differential equation, of second order, that is equivalent to the first–order
Riccati equation (5.40), let us proceed as follows. Consider the transformation:
w0
y=− , (5.49)
c(x) w
with first derivative:
w c0 (x) w0 − c(x) w w00 + c(x) (w0 )2
y0 = .
c2 (x) w2
5.9. RICCATI EQUATION 93
Now, apply transformation (5.49) to the right hand–side of (5.40), i.e., to a(x) + b(x) y + c(x) y 2 :
b(x) w0 w2
a(x) − + .
c(x) w c(x) (w0 )2
By comparison, and after some algebra, we arrive at:
To obtain the reduced form of (5.50a), we employ the transformation (5.46), observing that, here,
such a transformation works in the following way:
Z
x 1
b(x) = 2
=⇒ B(x) = b(x) dx = − ln(1 + x2 )
1−x 2
p
−B(x)
=⇒ v=e u = 1 − x2 u .
5.9.4 Exercises
1. Knowing that y1 (x) = 1 is a solution of:
x2 y 00 + 3 x y 0 + y = 0 .
Hint: Transform the equation into a Riccati equation, in reduced form, and then use the fact that
v1 (x) = x2 is a particular solution.
(1 + x2 ) y 00 − 2 x y 0 + 2 y = 0 .
1
Hint: Transform the equation into a Riccati equation, then solve it, using the fact that u1 = −
x
is a particular solution.
5.10.1 Exercises
1. Prove that the differential equation:
y − 4 x y 2 − 16 x3
yx0 =
y 3 + 4 x2 y + x
YX0 = −2 X .
p y
where X(x , y) = 4 x2 + y 2 and Y (x , y) = arctan .
2x
Then, use this fact to integrate the original differential equation.
y 3 + x2 y − y − x
yx0 =
x y 2 + x3 + y − x
YX0 = Y (1 − Y 2 ) ,
y p
where X(x , y) = arctan and Y (x , y) = x2 + y 2 .
x
Then, use this fact to integrate the original differential equation.
y 3 ey
y0 = , (5.55)
y 3 ey − ex
X(x , y) = − 1 ,
y
Y (x , y) = ex−y .
y2
0 y+1
y = 3+ ,
x x (5.56)
y(1) = 0 ,
The general form of a differential equation of order n ∈ N was briefly introduced in equation (4.10) of
Chapter 4. The current Chapter 6 is devoted to the particular situation of linear equations of second
order:
a(x) y 00 + b(x) y 0 + c(x) y = d(x) , (6.1)
where a , b , c and d are continuous real functions of the real variable x ∈ I , being I an interval in R
and a(x) 6= 0 . Equation (6.1) may be represented, at times, in operational notation:
M y = d(x) ,
where M : C 2 (I) → C(I) is a differential operator that acts on the function y ∈ C 2 (I) :
In this situation, existence and uniqueness of solutions are verified, for any initial value problem
associated to (6.1).
Before dealing with the simplest case, in which the coefficient functions a(x) , b(x) , c(x) are constant,
we examine general properties, that hold in any situation. We will study some variable coefficient
equations, that are meaningful in applications. Our treatment can be easily extended to equation of
any order; for details, refer to Chapter 5 of [47] or Chapter 6 of [3].
Ly = 0. (6.4a)
97
98 CHAPTER 6. LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER
where u(x) is the new dependent variable, while f (x) is a function to be specified, in order to simplify
computations. We find:
L(f u) = f u00 + 2 f 0 + p f u0 + f 00 + p f 0 + q f u = r .
(6.6)
In (6.6), we can choose f so that the coefficient of u vanishes, that is:
f 00 + p f 0 + q f = 0 . (6.7)
In this way, equation (6.6) becomes easily solvable, since it reduces to a first–order linear equation in
the unknown v = u0 :
f v0 + 2 f 0 + p f v = r .
(6.6a)
At this point, if any particular solution to the homogeneous equation (6.4a) is available, the solution
of the non–homogeneous equation (6.4) can be obtained.
The set of solutions to a homogeneous equation forms a two–dimensional vector space, as illustrated
in the following Theorem 6.1. The first, and easy, step is to recognize, that given two solutions y1 and
y2 of (6.4a), their linear combination:
y = α1 y1 + α2 y2 , α1 , α2 ∈ R ,
is also a solution to (6.4a). In particular, if y1 and y2 are solution to (6.4a), then:
(L y1 )(x) = y100 + p(x) y10 + q(x) y1 = 0 , (6.8)
(L y2 )(x) = y200 + p(x) y20 + q(x) y2 = 0 . (6.9)
To form their linear combination, we multiply both sides of (6.8) and (6.9) by α1 and α2 , respectively,
and we sum the results, obtaining:
α1 y100 + α2 y200 + p(x) α1 y10 + α2 y20 + q(x) α1 y1 + α2 y2 = 0 .
Even if C (2) (I) in a infinite-dimensional vector space, ker(L) is a subspace of dimension 2 , as stated
in Theorem 6.1.
6.1. HOMOGENEOUS EQUATIONS 99
Theorem 6.1. Consider the linear differential operator L : C 2 (I) → C(I) , defined by (6.5). Then,
the kernel of L has dimension 2 .
Proof. Fix x0 ∈ I and define the linear operator T : ker(L) → R2 , which maps each function
y ∈ ker(L) onto its initial value, evaluated at x0 , i.e.:
T y = y(x0 ) , y 0 (x0 ) .
The existence and uniqueness Theorem 4.17 means that T y = (0 , 0) implies y = 0 . Hence, by the
theory of linear operators, this mean that T is one–one operator and it holds:
If condition (6.12) holds only when all the αk are zero (i.e., αk = 0 for all k = 1 , . . . , n), then
functions f1 , . . . , fn are linearly independent.
We now provide a sufficient condition for linear independence of a set of functions. Let us assume that
f1 , . . . , fn are n–times differentiable. Then, from equation (6.12), applying successive differentiations,
we can form a system of n linear equations in the variables α1 , . . . , αn :
α1 f1 + α2 f2 + . . . + αn fn = 0,
α1 f10 + α2 f20 + ... + αn fn0 = 0,
α1 f100 + α2 f200 + ... + αn fn00 = 0, (6.13)
.. ..
. .
(n−1) (n−1) (n−1)
α1 f1 +α2 f2 + · · · + αn fn = 0.
For example, functions f1 (x) = sin2 x , f2 (x) = cos2 x , f3 (x) = sin(2 x) , are linearly independent on
I = R , since their Wronskian is non–zero:
f1 (x) f2 (x) f3 (x)
W (f1 , f2 , f3 )(x) = det f10 (x) f20 (x) f30 (x)
f100 (x) f200 (x) f300 (x)
sin2 x cos2 x
sin(2 x)
= det 2 cos x sin x −2 cos x sin x 2 cos(2 x) = 4 .
2 2 2 2
2 cos x − sin x) 2 sin x − cos x) −4 sin(2 x)
Differentiating, we obtain:
d
W 0 (x) = W (y1 , y2 )(x) = y1 (x) y200 (x) − y2 (x) y100 (x) .
dx
Since y1 and y2 are solution to (6.4a), recalling that we assume a(x) 6= 0 in (6.1), it holds:
W 0 = −p(x) W . (6.15)
W (x0 ) = 0 ∀ x0 ∈ I .
2
Niels Abel (1802-1829)
6.1. HOMOGENEOUS EQUATIONS 101
By assumption, the determinant of this homogeneous system is zero, hence the system admits a non
T
trivial solution α1 , α2 , with α1 , α2 not simultaneously null. Now, define the function:
Since y(x) is a linear combination of solutions y1 (x) and y2 (x) of (6.4a), then y(x) is also a solution
to (6.4a). And since, by construction, (α1 , α2 ) solves (6.17), then it also holds that:
At this point, from the existence and uniqueness of the solutions of the initial value problem:
(
Ly = 0 ,
y(x0 ) = y 0 (x0 ) = 0 ,
it turns out that y(x) = 0 identically, implying that α1 = α2 = 0 . So, we arrive at a contradiction.
The theorem is proved.
Putting together Theorems 6.1 and 6.2, it is possible to establish if a pair of solutions to (6.4a) is a
basis for the set of solutions to the equation (6.4a), as illustrated in Theorem 6.3.
Theorem 6.3. Consider the linear differential operator L : C 2 (I) → C(I) defined in (6.4a). If y1 and
y2 are two independent elements of ker(L) , then any other element of ker(L) can be expressed as a
linear combination of y1 and y2 :
y(x) = c1 y1 (x) + c2 y2 (x)
for suitable constants c1 , c2 ∈ R .
x2 (1 + x) y 00 − x (2 + 4 x + x2 ) y 0 + (2 + 4 x + x2 ) y = 0 , (6.18)
a second independent solution to (6.18) can be found, by seeking a solution of the form:
Now, introducing v(x) = u0 (x) , we see that v has to satisfy the first–order linear separable differential
equation:
x+2
v0 = v,
x+1
which is solved by:
v(x) = c (1 + x) ex .
We can assume c = 1 , since we are only interested in finding one particular solution of (6.18). Function
u(x) is then found by integration:
Z
u(x) = (1 + x) ex dx = x ex .
Therefore, a second solution to (6.18) is y2 (x) = x2 ex , where, again, we do not worry about the inte-
gration constant. Functions y1 , y2 form an independent set of solutions to (6.18) if their Wronskian:
x 2 ex
y1 (x) y2 (x) x
det 0 = det = x2 (x + 1) ex .
y1 (x) y20 (x) 1 x (x + 2) ex
is different from zero. Now, observe that the differential equation (6.18) has to be considered in one of
the intervals (−∞ , −1) , (−1 , 0) , (0, +∞) , where the leading coefficient x2 (1 + x) of (6.18) does not
vanish. On such intervals, the Wronskian does not vanish as well, thus f1 , f2 are linearly independent.
In conclusion, the general solution to (6.18) is:
y(x) = c1 x + c2 x2 ex , c1 , c2 ∈ R . (6.19)
The procedure illustrated in Example 6.4 can be repeated in the general case. For simplicity, we recall
(6.4a) written in the explicit form:
y 00 + p(x) y 0 + q(x) y = 0 . (6.20)
If a solution y1 (x) of (6.20) is known, we look for a second solution of the form:
y2 (x) = u(x) y1 (x) ,
where u is a function to be determined. Computing the first and second derivatives of y2 :
y20 = y1 u0 + y10 u , y200 = y1 u00 + 2 y10 u0 + y100 u ,
and inserting them into (6.20), yields, after some computations:
y1 u00 + (2 y10 + p y1 ) u0 + (y100 + p y10 + q y1 ) u = 0 .
Now, since, y1 is a solution to (6.20), the previous equation reduces to:
y1 u00 + (2 y10 + p y1 ) u0 = 0 . (6.21)
Equation (6.21) is a first–order separable equation in the unknown u0 , exactly in the same way as in
the Example 6.4, and it can be integrated to obtain the second solution to (6.20).
The search for a second independent solution to (6.20) can also be pursued using the Wronskian
equation (6.16), without explicitly computing two solutions of (6.20). This is stated in the following
Theorem 6.5.
Theorem 6.5. If y1 (x) is a non–vanishing solution of the second–order equation (6.20), then a second
independent solution is given by:
Z t
−p(s) ds
Z x x0
e
y2 (x) = y1 (x) dt , (6.22)
x0 y12 (t)
with p(s) is as in (6.3).
6.1. HOMOGENEOUS EQUATIONS 103
Proof. Given the assumption that y1 is a non–vanishing function, rewrite the Wronskian as:
y1 y20 − y2 y10
y1 y2
W (y1 , y2 ) = det 0 0 = y1 y20 − y2 y10 = y12 ,
y1 y2 y12
and observe that:
y1 y20 − y2 y10 d y2
2 = .
y1 dx y1
In other words:
d y2 W (y1 , y2 )
= , (6.23)
dx y1 y12
integrating which leads to: Z x
W (y1 , y2 )(s)
y2 (x) = y1 (x) ds , (6.24)
x0 y12 (s)
setting to zero the constant of integration W (x0 ) . At this point, thesis (6.22) follows from inserting
equation (6.16) into (6.24).
Example 6.6. Consider again Example 6.4. The solution y1 (x) = x of (6.18) can be used in formula
(6.22), to detect a second solution to such equation. Observe that, in this case:
−x (2 + 4 x + x2 ) 2 + 4 x + x2
p(x) = = − ,
x2 (1 + x) x (1 + x)
so that: Z
−p(x) dx = x + 2 ln x + ln(1 + x) ,
and: Z
−p(s) ds
e = x2 (1 + x) ex .
The second solution is, hence:
Z
y2 (x) = x (1 + x) ex dx = x2 ex ,
in accordance with the solution y2 found using the method order reduction in Example 6.4.
Example 6.7. Find the general solution of the homogeneous linear differential equation of second
order:
1 1
y 00 + y 0 + 1 − y = 0.
x 4 x2
We seek a solution of the form y1 = xm sin x . For such a solution, the first and second derivatives
are: (
y10 = xm−1 x cos x + m sin x ,
Remark 6.8. To facilitate the search for some particular solution of a linear differential equation,
conditions on the coefficients p(x) and q(x) , defined in (6.3), are provided below, each leading to a
function y1 that solves (6.20).
5. Sine solution: if n p(x) cos(n x) − n2 sin(n x) + q(x) sin(n x) = 0 , then y1 (x) = sin(n x) .
6. Cosine solution: if q(x) cos(n x) − n2 cos(n x) − n p(x) sin(n x) = 0 , then y1 (x) = cos(n x) .
M y = a y 00 + b y 0 + c y = 0 . (6.25)
y(x) = eλ x ,
and imposing that y(x) solves (6.25), leads to the algebraic equation, called characteristic equation of
(6.25):
a λ2 + b λ + c = 0 . (6.26)
The roots of (6.26) determine solutions of (6.25). Namely, if the discriminant ∆ = b2 − 4 a c is
positive, so that equation (6.26) admits two distinct real roots λ1 and λ2 , then the general solution
to (6.25) is:
y = c1 eλ1 x + c2 eλ2 x . (6.27)
The independence of solutions y1 = c1 eλ1 x and y2 = c2 eλ2 x follows from the analysis of their
Wronskian, which is non–vanishing for any x ∈ R :
√
y1 y2 (λ1 +λ2 ) x ∆ −b x
W (y1 , y2 )(x) = det 0 0 = (λ2 − λ1 ) e = e a 6= 0 .
y1 y2 a
When ∆ < 0 , equation (6.26) admits two distinct complex conjugate roots λ1 = α + i β and λ2 =
α − i β , and the general solution to (6.25) is:
y = eα x c1 cos(β x) + c2 sin(β x) .
(6.28)
Forming the complex exponential of λ1 and that of λ2 , two complex valued functions z1 , z2 are
obtained:
z1 = eλ1 x = e(α+i β) x = eα x ei β x = eα x cos(β x) + i sin(β x) ,
Set, for example, y1 = eα x cos(β x) and y2 = eα x sin(β x) . Then, the real solution presented in
(6.28) is a linear combination of the real functions y1 and y2 , which are are independent, since their
Wronskian is non–vanishing:
y1 y2
W (y1 , y2 )(x) = det 0 = β e2 α x 6= 0 .
y1 y20
When ∆ = 0 , equation (6.26) has one real root with multiplicity 2 , and the correspondent solution
to (6.25) is:
b
y1 = e − 2 a x
.
In this situation, we need a second independent solution, that is obtained from formula (6.22) of
Theorem 6.5, using the just found y1 and with p(s) built for equation (6.26), thus:
Z
b
− dx
Z a
b e b
y2 = e− 2 a x b
dx = x e− 2 a x .
e− a x
In other words, when ∆ = 0 the general solution of (6.25) is:
b
y = e− 2 a x c1 + c2 x .
(6.29)
Note that the knowledge of the Wronskian expression is useful in the study of non–homogeneous
differential equations too, as it will be shown in § 6.2.
Example 6.9. Consider the initial value problem:
(
y 00 − 2 y 0 + 6 y = 0 ,
y(0) = 0 , y 0 (0) = 1 .
√
The characteristic equation is λ2 − 2 λ + 6 = 0 , with roots λ = 1 ± i 5 . Hence, two independent
solutions are: √ √
y1 (x) = ex cos 5x , y2 (x) = ex sin 5x ,
and the general solution can be expressed as y(x) = c1 y1 (x) + c2 y2 (x) . Now, forming the initial
conditions: (
y(0) = c1 y1 (0) + c2 y2 (0) = c1 ,
√
y 0 (0) = c1 y10 (0) + c2 y20 (0) = c1 + c2 5 ,
we see that constants c1 and c2 must verify:
c1 = 0 ,
(
c1 = 0 ,
√ =⇒ 1
c1 + c2 5 = 1 , c2 = √ .
5
In conclusion, the considered initial value problem is solved by:
ex √
y(x) = √ sin 5x .
5
106 CHAPTER 6. LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER
When ∆ < 0 , there are two complex conjugate roots m1,2 = α ± i β of (6.31); then, setting for
example y1 = xα cos(β ln x) and y2 = xα sin(β ln x) , the Wronskian does not vanish:
y y2
W (y1 , y2 )(x) = det 10 = β x2α−1 6= 0 , for x > 0 .
y1 y20
When ∆ = 0 , equation (6.31) has one real root m of multiplicity 2 ; in this case y1 = xm and
y2 = y1 ln x ; again, the Wronskian does not vanish:
y1 y2
W (y1 , y2 )(x) = det 0 = x2 m−1 6= 0 , for x > 0 .
y1 y20
Notice, again, that knowing the Wronskian turns out useful also when studying the non–homogeneous
differential equations case (refer to § 6.2).
Example 6.11. Consider the initial value problem:
(
x2 y 00 − 2 x y 0 + 2 y = 0 ,
y(1) = 1 , y 0 (1) = 0 .
Here equation (6.30) assumes the form:
m (m − 1) − 2 m + 2 = 0 ,
that is m = 1 , m = 2 . Therefore, the general solution of the given equation is y = c1 x + c2 x2 ;
imposing the initial conditions yields the system:
(
c1 + c2 = 1 ,
c1 + 2 c2 = 0 .
In conclusion, the solution of the initial vale problem is y = 2 x − x2 .
6.1. HOMOGENEOUS EQUATIONS 107
m (m − 1) − m + 5 = 0 ,
with:
1 0 1
I(x) = q(x) − p (x) − p2 (x) . (6.33)
2 4
In fact, assuming the knowledge of a solution to (6.4a) of the form y = f u leads to, in the same way
followed to obtain equation (6.6):
Function f (x) does not vanish, and has first and second derivatives given by:
fp f (p2 − 2 p0 )
f0 = − , f 00 = .
2 4
which is stated in (6.32)–(6.33). Equation (6.32) is called the normal form of equation (6.4a). Function
I(x) , introduced in (6.33), is called invariant of the homogeneous differential equation (6.4a) and
represents a mathematical invariant, in the sense expressed by the following Theorem 6.13.
108 CHAPTER 6. LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER
L1 y = y 00 + p1 y 0 + q1 y = 0 (6.36)
L2 y = y 00 + p2 y 0 + q2 y = 0 , (6.37)
by the the change of dependent variable y = f u , then the invariants of (6.36)–(6.37) coincide:
1 0 1 1 1
I1 = q1 − p − p2 = q2 − p02 − p22 = I2 .
2 1 4 1 2 4
Viceversa, when equations (6.36) and (6.37) admit the same invariant, either equation can be trans-
formed into the other one, by:
Z
1
−
2 p1 (x) − p2 (x) dx
y(x) = u(x) e .
Remark 6.14. We can transform any second–order linear differential equation into its normal form.
Moreover, if we are able to solve the equation in normal form, then we can obtain, easily, the general
solution to the original equation. The next Example 6.15 clarifies this idea.
Example 6.15. Consider the homogeneous differential equation, depending on the real positive pa-
rameter a :
00 2 0 2 2
y − y + a + 2 y = 0. (6.38)
x a
Hence:
2 2 1 1
p(x) = − , q(x) = a2 + 2 =⇒ I(x) = q(x) − p0 (x) − p2 (x) = a2 .
x a 2 4
In this example, the invariant is not dependent of x , and the normal form is:
u00 + a2 u = 0 . (6.39)
The general solution to (6.39) is u(x) = c1 cos(a x) + c2 sin(a x) , where c1 , c2 are real parameter,
and the solution to the original equation (6.38) is:
Z
1
−
2 p(x) dx
y(x) = u(x) e = c1 x cos(a x) + c2 x sin(a x) .
Example 6.16. Find the general solution of the homogeneous linear differential equation of second
order:
y 00 − 2 tan(x) y 0 + y = 0 .
The first step consists in transforming the given equation into normal form, with the change of variable:
Z Z
1
−
2 p(x) dx tan x dx
u
y=u e =u e = .
cos x
The normal form is:
u00 + 2 u = 0 ,
which is a constant–coefficient equation, solved by:
√ √
u = c1 cos( 2 x) + c2 sin( 2 x) .
The normal form clarifies the structure of the solutions to a constant–coefficient, linear equation of
second–order.
Remark 6.17. Consider the constant–coefficient equation (6.25). The change of variable:
b
y = u e− 2 a x
b2 − 4 a c
u00 − u = 0, (6.40)
4 a2
since, in this case:
b c 1 0 1 2 −b2 + 4 a c
p=− , q=− =⇒ I =q− p − p = .
a a 2 4 4 a2
The normal form (6.40) explains the nature of the following formulæ (6.41), (6.42) and (6.43), namely
describing the structure of the solution to a constant–coefficient homogeneous linear differential equa-
tion of second order. In the following, the discriminant is ∆ = b2 − 4 a c and c1 , c2 are constant:
(i) if ∆ > 0 , √ √
b
−2a x ∆ ∆
y(x) = e c1 cosh x + c2 sinh x ; (6.41)
2a 2a
(ii) if ∆ < 0 , √ √
b
−2a x −∆ −∆
y(x) = e c1 cos x + c2 sin x ; (6.42)
2a 2a
(iii) if ∆ = 0 ,
b
y(x) = e− 2 a x (c1 x + c2 ) . (6.43)
changing, slightly, the point of view, in comparison to the beginning of § 6.1. Here, we assume to
know, already, the general solution of the homogeneous equation associated to (6.44). Aim of this
section is, indeed, to describe the relation between solutions of L y = 0 and solutions of L y = r(x) ,
being r(x) a given continuous function. The first and probably most important step in this direction
is represented by the following Theorem 6.18.
Theorem 6.18. Let y1 and y2 be independent solutions of L y = 0 , and let yp be a solution of
L y = r(x) . Then, any solution of the latest non–homogeneous equation has the form:
L (y − yp ) = L y − L yp = r − r = 0 .
This means that y − yp can be express by a linear combination of y1 and y2 . Hence, thesis (6.45) is
proved.
Theorem 6.19. Let y1 and y2 be two independent solutions of the homogeneous equation associated
to (6.44). Then, a particular solution to (6.44) has the form:
where: Z Z
y2 (x) r(x) y1 (x) r(x)
k1 (x) = − dx , k2 (x) = dx . (6.47)
W (y1 , y2 )(x) W (y1 , y2 )(x)
Proof. Assume that y1 and y2 are independent solutions of the homogeneous equation associated to
(6.44), and look for a particular solution of (6.44) in the desired form:
yp = k1 y1 + k2 y2 ,
where k1 , k2 are two C 1 functions to be determined. Computing the first derivative of yp yields:
Let us impose a first condition on y1 and y2 , i.e., impose that they verify:
At this point, imposing that yp solves equation (6.44) leads to forming the following expression (in
which variable x is discarded, to ease the notation):
yp00 + p yp0 + q yp = k1 y100 + k2 y200 + k10 y10 + k20 y20 + p k1 y10 + k2 y20 + q k1 y10 + k2 y20
Equations (6.49) and (6.50) form a 2 × 2 linear system, in the variables k10 , k20 , that admits a unique
solution:
y2 r y1 r
k10 = − , k20 = , (6.51)
W (y1 , y2 ) W (y1 , y2 )
since its coefficient matrix is the Wronskian W (y1 , y2 ) , which does not vanish, given the assumption
that y1 and y2 are independent. Thesis (6.47) follows by integration of k10 , k20 in (6.51).
6.2. NON–HOMOGENEOUS EQUATION 111
Example 6.20. In Example 6.4, we showed that the general solution of the homogeneous equation
(6.18) has the form (6.19), i.e., c1 x + c2 x2 ex , c1 , c2 ∈ R . Here, we use Theorem 6.19 to find the
general solution of the non–homogeneous equation:
2
x2 (1 + x) y 00 − x (2 + 4 x + x2 ) y 0 + (2 + 4 x + x2 ) y = x2 (1 + x) . (6.52)
In Example 6.12, the associated homogeneous equation was considered, of which two independent
solutions were found, namely y1 = x cos(2 ln x) and y2 = x sin(2 ln x) , whose Wronskian is 2 x .
Now, writing the given non–homogenous differential equation in explicit form:
1 0 5
y 00 − y + 2 y = 1,
x x
and using (6.47), we find:
Z Z
1 1
k1 (x) = − sin(2 ln x) dx , k2 (x) = cos(2 ln x) dx .
2 2
Evaluating the integrals:
1 2 1
k1 (x) = x cos(2 ln x) − x sin(2 ln x) ,
2 5 5
1 2 1
k1 (x) = x sin(2 ln x) − x cos(2 ln x) .
2 5 5
Hence, a particular solution of the non–homogenous equation is
x2
yp = k1 y1 + k2 y2 = ,
5
while the general solution is:
x2
y = c1 x cos(2 ln x) + c2 x sin(2 ln x) + .
5
To solve the initial value problem, c1 and c2 need to be determined, imposing the initial conditions:
1 2
y(1) = c1 + = 0 , y 0 (1) = c1 + 2 c2 + = 0 ,
5 5
1 1
yielding c1 = − , c2 = − . In conclusion, the solution of the given initial value problem is:
5 10
1
y(x) = x (2 x − sin(2 ln x) − 2 cos(2 ln x)) .
10
112 CHAPTER 6. LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER
M y = a y 00 + b y 0 + c y = r(x) , (6.53)
where r(x) is a continuous real function. Denote with K(λ) the characteristic polynomial associated
to the differential equation (6.53). A list is provide below, taken from [21], of particular functions
r(x) , together with a recipe to find a relevant particular solution of (6.53).
(a) If K(α) 6= 0 , it means that α is not a root of the characteristic equation. Then, a particular
solution of (6.53) has the form:
yp = eα x Qn (x) , (6.54)
where Qn (x) is a polynomial of degree n to be determined.
(b) If K(α) = 0 , with multiplicity s ≥ 1 , it means that α is a root of the characteristic equation.
Then, a particular solution of (6.53) has the form:
yp = xs eα x Rn (x) , (6.55)
(a) If K(α + i β) 6= 0 , it means that α + i β is not a root of the characteristic equation. In this
case, a particular solution of (6.53) has the form:
αx
yp = e Rp (x) cos(β x) + Sp (x) cos(β x) , (6.56)
K(λ) = λ2 + 2 λ + 1 = 0 ,
6.2. NON–HOMOGENEOUS EQUATION 113
thus, we are in situation (1a) and we look for a solution of the form (6.54), that is
yp (x) = s0 x2 + s1 x + s2 .
Differentiating:
(
yp0 (x) = 2 s0 x + s1 ,
yp00 (x) = 2 s0 ,
s0 x2 + (4 s0 + s1 ) x + 2 s0 + 2 s1 + s2 = x2 + x .
yp = x 2 − 3 x + 4 .
Finally, solving the associated homogeneous equation, we obtain the required general solution:
Observe, first, that the general solution of the associated homogeneous equation is:
Observe, further, that the characteristic equation has roots ±i , Hence, we are in situation (2b) and
we look for a solution of the form (6.57), that is:
1 1
y(x) = x sin x − x cos x + c1 cos x + c2 sin x , c1 , c2 ∈ R .
2 2
114 CHAPTER 6. LINEAR DIFFERENTIAL EQUATIONS OF SECOND ORDER
6.2.3 Exercises
1. Solve the following second order variable coefficient linear differential equations:
00 2 2
(a) y − 1+ y0 + y = 0 ;
x x
2 x 2
(b) y 00 − y0 + y = 0;
1 + x2 1 + x2
1
(c) y 00 − y 0 − 4 x2 y = 0 .
x
m
Hint: y = ex .
2. Solve the following initial value problems, using the transformation of the given differential equation
in normal form:
00 + 2 sin x y 0 + (sin x)2 + cos x − 6
y y = 0,
x2
(a)
y(1) = 0 ,
y 0 (1) = 1 ,
2 0
00
y + x y + y = 0 ,
(b) y(1) = 1 ,
0
y (1) = 0 .
Some basic notions are presented in this chapter, that are needed as introduction to Measure theory.
Some familiarity, with the concepts presented, here and in the following Chapters 8 to 10, is also
assumed and recalled, briefly, for the sake of completeness.
115
116 CHAPTER 7. PROLOGUE TO MEASURE THEORY
the notation (xi )i∈I represents the mapping x : I → X , and it is said to be a collection of elements
in X indexed by I .
In other words, x establishes an indexed family of elements in X indexed by I , and the elements of
X are referred to as forming the family, i.e., the indexed family (xi )i∈I is interpreted as a collection,
rather than a function.
When I = N , we are obviously dealing with an usual sequence. When we, instead, consider a finite
list of objects, then I = {1 , 2 , . . . , n} . It is also possible to consider as index set, for example, the
power set of X , that is, I = P(X ) ; in the latest case, the indexed family becomes the collection of
the subsets of X .
Union and intersection of arbitrary collections of sets are defined as:
\
Aα = { x : x ∈ Aα for all α ∈ I } = {x : ∀ α ∈ Λ , x ∈ Aα } ,
α∈I
[
Aα = {x : x ∈ Aα for some α ∈ I} = {x : ∃ α ∈ I , x ∈ Aα } ,
α∈I
If A∩B = ∅ , then A and B are disjoint. A family of sets (Aα )α∈I is pairwise disjoint if Aα ∪Aβ = ∅
whenever α 6= β , for α , β ∈ I .
A × B = { (a, b) : a ∈ A , b ∈ B} ,
of an arbitrary, indexed family of sets (Ai )i∈I , is the collection of all functions:
[
a:I→ Ai , a : i 7→ ai , such that ai ∈ Ai for any i ∈ I .
i∈I
When Ai = A for any i ∈ I , the Cartesian product is denoted as AI , which reduces to An in the
case I = {1 , . . . , n} .
The Cartesian plane is R2 = R × R , while Rn is the set of all n–tuples (x1 , . . . , xn ) composed of
real numbers. A rectangle is the Cartesian product of two intervals.
7.1.4 Functions
A function f : A → B can be intepreted as a subset of A×B in which each first coordinate determines
the second one:
(a, b) , (a, c) ∈ f =⇒ b = c.
Domain and Range of a function are, thus, respectively defined as:
Df = {a ∈ A : ∃ b ∈ B , (a, b) ∈ f } , Rf = {b ∈ B : ∃ a ∈ A , (a, b) ∈ f } .
1
Augustus De Morgan (1806–1871), British mathematician and logician.
7.1. SET THEORY 117
Informally, f associates elements of B with elements of A such that each a ∈ A has at most one
image b ∈ B , that is, b = f (a) . The Image of X ⊂ A is:
f −1 (Y ) = { a ∈ A : f (a) ∈ Y } .
Given two functions g and f , such that Df ⊂ Dg and g = f on Df , then we say that g extends
f to Dg and, viceversa, f restricts g to Df .
The algebra of real functions is defined pointwise. The sum f + g is defined as (f + g)(x) = f (x) + g(x).
The product f g is given by (f g)(x) = f (x) g(x). The indicator function 1A of the set A is the function:
1 for x ∈ A
1A (x) =
0 for x ∈
/A
7.1.5 Equivalences
A relation between two sets, A and B , is a subset R of A × B ; we write x ∼ y to indicate that
(x, y) ∈ R. An equivalence relation on a set E is a relation R of E × E , with the following properties,
valid for any x , y , z ∈ E :
(i) reflexivity: x ∼ x;
(ii) symmetry: x ∼ y =⇒ y ∼ x ;
For any x ∈ E , the subset [x] := { e ∈ E : e ∼ x } is called equivalence class of the element x ∈ E .
Clearly, when x ∼ y , then [x] = [y] .
An equivalence relation on E subdivides such a set into disjoint equivalence classes. A partition of a
set E is a family F of subsets of E such that:
[
(i) A=E,
A∈F
Consider x , y ∈ E such that x 6= y , then, it is either [x] = [y] or [x] 6= [y] . In the first case, we
have already observed that it is x ∼ y . In the second case, x and y are not equivalent, i.e., x 6∼ y ,
therefore [x] ∩ [y] = ∅ . Hence, the equivalence classes of E partition the set E itself. The collection
of the equivalence classes of E is called quotient set of E and is denoted by E/ ∼ . For instance, if
E = Z , the relation x ∼ y ⇐⇒ x − y = 2 k , k ∈ Z , is an equivalence in Z , which partitions Z in
the classes of even and odd numbers. In general, for any given m ∈ Z , an equivalence relation x ∼ y
can be defined, denoted by the symbol ≡m :
The quotient set, obtained in this way, is called the set of residual classes, modulus m and is denoted
by Zm .
118 CHAPTER 7. PROLOGUE TO MEASURE THEORY
(a, b] = {x ∈ R : a < x ≤ b} .
Symbols ∞ and −∞ are used to describe unbounded intervals, such as (−∞ , b] . Later on, we will
define operations, in the extended real number system, involving these symbols.
7.1.7 Cardinality
Two non–empty sets X , and Y share common cardinality if there exists a bijection f : X → Y .
An empty set is finite; a non–empty set is finite if it shares cardinality with the set In = {1 , . . . , n} ,
for some n ∈ N . A set is infinite if it shares cardinality with one of its proper subsets. A set A is
countable, or denumerable, if there exists a one–one correspondence between A and a subset of N .
(a) An element u ∈ R is called upper bound of X if x ≤ u for any x ∈ X ; in this case, X is said to
be bounded from above.
(b) An element ` ∈ R is called lower bound of X if x ≥ `, for any x ∈ X ; in this case X is said to
be bounded from below.
(c) An element u? is called supremum, or least lower bound, of X if u? ≤ µ for any upper bound u
of X .
(d) An element `? is called infimum, or greatest lower bound, of X if `? ≥ ν for any lower bound `
of X .
u? = sup X, `? = inf X .
where we assume that, for any x ∈ R , it holds −∞ < x < +∞ . Rules for computations with
−∞ and + ∞ are introduced in the following list.
1. x + (+∞) = +∞ = +∞ + x ,
2. x + (−∞) = −∞ = −∞ + x ,
3. x > 0 =⇒ x · (+∞) = +∞ = +∞ · x ,
7.2. TOPOLOGY 119
4. x > 0 =⇒ x · (−∞) = −∞ = −∞ · x ,
5. x < 0 =⇒ x · (+∞) = −∞ = −∞ · x ,
6. x < 0 =⇒ x · (−∞) = +∞ = −∞ · x ,
7. (+∞) + (+∞) = +∞ ,
8. (−∞) + (−∞) = −∞ ,
9. (+∞) · (+∞) = +∞ = (−∞) · (−∞) ,
10. (+∞) · (−∞) = −∞ = (−∞) · (+∞) ,
11. (+∞) · 0 = 0 = 0 · (+∞) ,
12. (−∞) · 0 = 0 = 0 · (−∞) .
Note that the following operations remain undefined:
(+∞) + (−∞) , (−∞) + (+∞) .
7.2 Topology
Basic notions of Topology are shortly resumed, now, which are needed to develop Measure theory,
as well as to generalize the concepts studied in section 1.2. A complete reference to Topology is
represented, for example, by [34]. The general definition of a Topology, in a non–empty set Ω , is given
below.
Definition 7.2 (Topology). Let Ω be any non–empty set. A collection T of subsets of Ω is called
topology if it verifies the four following properties:
(i) ∅∈T ;
(ii) Ω∈T ;
[
(iii) closure under union: if C ⊂ T , then ∈T ;
C∈C
n
\
(iv) closure under finite intersection: if O1 , . . . , On ∈ T , then Ok ∈ T .
k=1
The pair Ω, T is called a topological space, and the sets in T are called open sets.
Remark 7.3. Many of the topological spaces used in applications verify a further axiom, known as
Hausdorff 2 property or separation or T2 property:
(v) for any x1 , x2 ∈ Ω , x1 6= x2 , there exist O1 , O2 ∈ T , O1 ∩ O2 = ∅ , such that x1 ∈ O1 and
x2 ∈ O2 .
The idea of topological space is inspired by the open sets in Rn , introduced in definition 1.13 in § 1.2.
We provide here further examples of topological spaces.
Example 7.4. Let Ω be any non–empty set. Then, T = {∅ , Ω} is a topology, called trivial or indiscrete
topology. In this situation, all points of the space cannot be distinguished by topological means.
Example 7.5. Let Ω be any non–empty set. Then T = P (Ω) is a topology, called discrete topology.
Here, every set is open.
Example 7.6. If Ω = N , the family T of all the finite initial segments of N , that is to say, the
collection of sets Jn = {1 , 2 , . . . , n} , is a topology.
Example 7.7. Let Ω be any non–empty set, and let F be a partition of Ω . The collection T of the
subsets of Ω , obtained as union of elements of F , is a topology, induced by the partition F.
2
Felix Hausdorff (1868–1942), German mathematician.
120 CHAPTER 7. PROLOGUE TO MEASURE THEORY
(i) ∅ ∈ K;
(ii) Ω ∈ K;
\
(iii) closure under intersection: if C ⊂ K , then ∈ K;
C∈C
n
[
(iv) closure under finite union: if O1 , . . . , On ∈ K , then Ok ∈ K .
k=1
7.2.2 Limit
Let (Ω , T ) be a topological space, and consider S ⊆ Ω . Then, x ∈ Ω is:
(a) separated from S if and only if there exists A ∈ T such that x ∈ A and A ∩ S = ∅ ;
(b) an adherent point for S if and only if A ∩ S 6= ∅ for any A ∈ T such that x ∈ A ;
If x ∈
/ S is adherent for S , then x is an accumulation point for S .
There exist adherent points for S that are not accumulation points for S .
An open neighborhood of x ∈ Ω is an open set U ∈ T such that x ∈ U ; for simplicity, it will be referred
to as neighborhood. It is possible to express the notions of separated, adherent and accumulation points
using the idea of neighborhood. The concept of limit can also be generalised.
Definition 7.9. Let (Ω1 , T1 ) and (Ω2 , T2 ) be topological spaces, and consider A ⊆ Ω1 and f : A →
Ω2 . Furthemore, let x0 ∈ A be an accumulation point for A , and ` ∈ Ω2 . Then:
lim f (x) = `
x→x0
x ∈ A ∩ (U \ {x0 }) =⇒ f (x) ∈ V .
7.2. TOPOLOGY 121
7.2.3 Closure
Consider a topological space ( Ω , T ) and a set S ⊆ Ω . The closure S is the collection of all
adherent points of S . By construction, it holds S ⊆ S . The closure of a closed set is the set itself,
and we can indeed state the following Theorem 7.10.
Theorem 7.10 has a few implications. First of all, ∅ = ∅ and Ω = Ω . Moreover, given S ⊆ Ω , since S
is a closed set, then S = S . Furthermore, if S1 ⊆ S2 , then S 1 ⊆ S 2 . Finally, S1 ∪ S2 = S1 ∪ S2 .
The main property of the closure of a set S , though, is that S is the smallest closed set containing
S . In fact, the following Theorem 7.11 holds.
Theorem 7.11. Let (Ω, T ) be a topological space, and consider S ⊆ Ω . Denote with K the collection
of closed sets in (Ω , T ) . Furthermore, denote with I the family of subsets of Ω such that:
I = {K ∈ K : S ⊆ K } .
Then \
S= K.
K∈I
7.2.4 Compactness
Definition 7.12. Let Ω be a given set and let S ⊆ Ω . A covering of S in Ω is an indexed family
(Ai )i∈I in Ω such that: [
S⊆ Ai .
i∈I
The notion of open covering leads to the fundamental definition of compactness in a topological space.
Definition 7.13. Let (Ω , T ) be a topological space. A set K ⊂ Ω is called compact if any open
covering of K admits a finite sub–covering, meaning that there exists a finite subset I0 ⊆ I such
that [
K⊆ Ai .
i∈I0
In the familiar context of the Euclidean, open set, topology in Rn , it is possible to show that a
subset is compact if and only if it is closed and bounded. This property of compactness does not hold
for general topological spaces. In the case of the real line R , intervals of the form [a , b] constitute
examples of compact sets. It is further possible to show that, in a T2 –space (where the T2 property
is defined in Remark 7.3), any compact set is also a closed set. Finally, finite subsets in a topological
space are compact.
7.2.5 Continuity
When, in a set, a topology is available, it is possible to introduce the concept of continuity. Let (Ω1 , T1 )
and (Ω2 , T2 ) be topological spaces and consider f : Ω1 → Ω2 . Function f is continuous if, for any
open set A2 ∈ Ω2 , the inverse image f −1 (A2 ) is an open set in the space (Ω1 , T1 ) , that is, in formal
words:
A2 ∈ T2 =⇒ f −1 (A2 ) ∈ T1 .
It is possible to show that f is continuous if and only if, for any closed set K2 in Ω2 , the inverse
image f −1 (K2 ) is a closed set in Ω1 . It is also possible to formulate a local notion of continuity, at a
122 CHAPTER 7. PROLOGUE TO MEASURE THEORY
given point x0 , using appropriate neighborhoods; as we do not need to analyze this problem here, we
leave it to the interested Reader as an exercise.
It is interesting to remark that Weierstrass Theorem 1.30 can be generalized to the case of topological
spaces.
Here and in Chapters 9 and 10, we deal with function µ , called measure, which returns area, or volume,
or probability, of a given set. We assume that µ is already defined, adopting an axiomatic approach
which turns out advantageous, as the same theoretical results apply to other situations, besides area
in R2 or volume in R3 , and which is particularly fruitful in Probability theory. A general domain that
can be assumed for µ , is a σ-algebra, defined in § 8.1.
For completeness, a few basic concepts are recalled, for which some familiarity is assumed.
8.1.1 σ–algebras
Let us introduce, first, the notion of σ–algebra of sets.
Definition 8.1. Let Ω be any non–empty set. A collection A of subsets of Ω is called σ–algebra of
sets if:
(i) Ω ∈ A;
The pair (Ω , A) is called measurable space, and the sets in A are called measurable sets.
From Definition 8.1 and De Morgan Laws (7.1), it follows a fourth property of closure under countable
intersections of the collection of measurable sets A .
When the countable union property (iii) of Definition iii is replaced with the weaker assumption of
finite union, the family of sets A becomes an algebra of sets.
Definition 8.2. Let Ω be any non–empty set. A collection A of subsets of Ω is called algebra of sets
if:
(i) Ω ∈ A;
123
124 CHAPTER 8. LEBESGUE INTEGRAL
We do not develop Measure theory in the contest of algebra of set, as it is beyond the purpose of our
introductory treatment.
It may happen that a given collection of sets, X , is not a σ–algebra, but there exists the minimal σ–
algebra, which contains X . Hence, to obtain non–trivial σ–algebras, we need to consider the following
abstract construction.
Lemma 8.4. Consider a family of σ–algebras on Ω . Then, the intersection of all the σ–algebras from
this family is also a σ–algebra on Ω .
To prove Lemma 8.4, we need to prove that A0 is a σ–algebra. Observe, first, that Ω ∈ A0 , since
Ω ∈ A for any A ∈ H , by definition. Then, choose a set A ∈ A0 , so that A belongs to any σ–algebra
in A ∈ H and, therefore, Ac ∈ A0 . Finally, if (An )n∈N is a sequence of sets in A0 , it is clear that such
a sequence belongs to any σ–algebra in H , implying:
[
An ∈ A
n∈N
Using Lemma 8.4, we can define the σ–algebra generated by a family of arbitrary sets.
Definition 8.5. Let X be a collection of subsets of Ω . Denote with H the collection of σ–algebras
in Ω , including X . Then: \
σ (X ) = A
A∈H
I = { [a , b) | a , b ∈ R , a < b } ,
8.1. MEASURE THEORY 125
then:
σ (I) = A (R) . (8.1)
Equality (8.1) also holds when:
I = { (a , b) | a , b ∈ R , a < b } ,
I = { (a , +∞) | a ∈ R} ,
I = { (−∞ , a) | a ∈ R} .
The n–dimensional Borel2 σ–algebra in Rn is the σ–algebra generated by all open subsets of Rn ,
and it is denoted by A(Rn ) . By construction, A(Rn ) contains all open/closed sets and all countable
unions/intersections of open/closed sets. The Borel σ–algebra does not represent the whole of P(Rn ) ;
this result is known as Vitali3 Covering Theorem ([10], Theorem 2.1.4).
8.1.3 Measures
(i) µ(∅) = 0
(ii) the property of countable additivity holds, that is, for any disjoint sequence (An )n ⊂ A :
∞ ∞
!
[ X
µ An = µ(An ) .
n=1 n=1
δx is a measure on P(Ω) called Dirac x–measure. It is a measure concentrated on the singleton set
{x}.
2
Félix Édouard Justin Émile Borel (1871–1956), French mathematician and politician.
3
Giuseppe Vitali (1875–1932), Italian mathematician.
126 CHAPTER 8. LEBESGUE INTEGRAL
The following intuitive results, stated in Proposition 8.9 are known as monotonicity and subtractivity
of the measure.
Proposition 8.9. If A , B ∈ A and A ⊂ B , then:
µ(A) ≤ µ(B) .
The following Lemma 8.11 shows that, when dealing with a sequence of non–disjoint sets, it is always
possible to rearrange it and treat, instead, a pairwise disjoint sequence of sets, equivalent to the original
sequence.
Lemma 8.11. Consider a measure space ( Ω , A , µ ) . Let (Ak )k∈N be a numerable sequence of mea-
surable sets. Then, there exists a sequence of measurable sets (Bk )k∈N such that:
1. Bk ∈ A for any k ∈ N ;
2. Bk ⊆ Ak for any k ∈ N ;
Properties (1)–(2) are straightforward, both being a consequence of the construction of each set Bk .
To demonstrate property (3), let us fix k , j ∈ N , assuming, without loss of generality, that k < j ,
and consider the intersection:
k−1 j−1
! !
[ [
Bk ∩ Bj = Ak \ Ai ∩ Aj \ Ai .
i=1 i=1
8.1. MEASURE THEORY 127
The assumption k < j implies that, in the second group of intersections, it appears the set Ack . Thus,
we can infer that:
Bk ∩ Bj = ∅ .
We now demonstrate property (4). Notice, first, that from property (2), it is immediate the inclusion:
∞
[ ∞
[
Bk ⊆ Ak .
k=1 k=1
To complete the proof, the reverse inclusion must be shown. To this purpose, let us choose:
∞
[
x∈ Ak .
k=1
Hence, x must belong to one of the sets Ak , at least, implying that the following definition is well–
posed:
m = min{ k ∈ N | x ∈ Ak } .
While working with numerable families of sets, not necessarily pairwise disjoint, we meet the so–called
property of numerable subadditivity.
Proof. Denote with (Bk )k∈N the sequence of measurable sets, obtained from the sequence (Ak )k∈N
using the procedure of Lemma 8.11. In this way, for any k ∈ N , it holds that Bk ∈ A and Bk ⊆ Ak .
It also holds that the sequence (Bk )k∈N is pairwise disjoint, that is, Bk ∩ Bj = ∅ for any k 6= j .
Moreover:
[∞ ∞
[
Ak = Bk .
k=1 k=1
The next Theorem 8.13 states two very interesting results, related to the situation of increasing or
decreasing families of nested sets.
128 CHAPTER 8. LEBESGUE INTEGRAL
Theorem 8.13. Let (Ak )k≥1 ⊂ A be an increasing sequence of sets, i.e., Ak ⊂ Ak+1 . Then:
∞
[
µ Ak = lim µ(An ) . (8.3)
n→∞
k=1
Let (Ak )k≥1 ⊂ A be a decreasing sequences of sets, i.e., Ak+1 ⊂ Ak , with µ(A1 ) < ∞ . Then:
∞
\
µ Ak = lim µ(An ) . (8.4)
n→∞
k=1
Proof. Let us first prove (8.3). By assumption, Ak ⊂ Ak+1 , hence, the non–negative sequence
(µ(Ak ))k∈N is monotonically increasing, which implies the existence of the limit:
` = lim µ(Am ) .
m→∞
∞
[
On the other hand, the inclusion Am ⊆ Ak holds true, for any m ∈ N . Passing to the limit:
k=1
∞
!
[
` = lim µ(Am ) ≤ lim µ Ak . (8.5)
m→∞ k→∞
k=1
If ` = +∞ , there is nothing more to proof. If, instead, ` < +∞ , then, due to the monotonicity of the
sequence (µ(Ak ))k∈N , we infer that, for any k ∈ N :
The union in the right hand–side of (8.7) is, by construction, disjoint. Since condition (8.6) also holds,
we can use Proposition 8.9 to obtain:
∞ ∞
!
[ X
µ Ak = µ(A1 ) + µ (Ak+1 \ Ak )
k=1 k=1
∞
X
= µ(A1 ) + µ (Ak+1 ) − µ (Ak ) .
k=1
∞
[ ∞
[ ∞
[
A1 \ B = A1 ∩ B c = A1 ∩ Ack = (A1 ∩ Ack ) = (A1 \ Ak ) .
k=1 k=1 k=1
The sequence of sets (A1 \ Ak )k∈N is increasing, thus, we can apply relation (8.3) and infer that:
Now, µ(A1 ) < +∞ by hypothesis. Hence, from Proposition 8.9 it follows that µ(A1 \ B) = µ(A1 ) −
µ(B) , which can be inserted into (8.8) to yield, recalling the definition of B :
∞
!
\
µ(A1 ) − µ Ak = lim µ(A1 \ Ak ) = µ(A1 ) − lim µ(Ak ) . (8.9)
k→∞ k→∞
k=1
8.1.4 Exercises
1. Consider Ω = {1 , 2 , 3} . Find necessary and sufficient conditions on the real numbers x , y , z ,
such that there exists a probability measure µ on the σ–algebra A = P(Ω) , where:
Solution to Exercise 2.
µ(E1 ∆E2 ) = 0 =⇒ µ (E1 \E2 ) ∪ (E2 \E1 ) = 0
=⇒ µ(E1 \E2 ) + µ(E2 \E1 ) = 0
=⇒ µ E1 \(E1 ∩ E2 ) + µ E2 \(E1 ∩ E2 ) = 0 .
Since µ : Ω → [0 , +∞] , it follows that µ E1 \(E1 ∩ E2 ) = 0 and µ E2 \(E1 ∩ E2 ) = 0 .
Moreover, observing that (E1 ∩ E2 ) ⊂ E1 and (E2 ∩ E1 ) ⊂ E2 , we can write:
µ(E1 ) − µ(E1 ∩ E2 ) = 0 ,
µ(E2 ) − µ(E1 ∩ E2 ) = 0 .
Thus:
µ(E1 ) − µ(E1 ∩ E2 ) = µ(E2 ) − µ(E1 ∩ E2 ) =⇒ µ(E1 ) = µ(E2 ) .
(b) For this second point, we have:
µ(E1 ∆E2 ) = 0 =⇒ µ (E1 \E2 ) ∪ (E2 \E1 ) = 0 .
Since µ is complete, then (E1 \E2 ) and (E2 \E1 ) are measurable, i.e., they belong to A . In this
way, since the σ–algebra is closed with respect to union, intersection and set complementation,
we have:
E1 \(E1 ∆E2 ) = E1 ∩ (E1 ∆E2 )c ∈ A ,
hence:
E2 = (E2 \E1 ) ∪ E1 \(E1 ∆E2 ) ∈ A .
130 CHAPTER 8. LEBESGUE INTEGRAL
Definition 8.14. Consider x ∈ Rm and A ∈ P(Rm ) . The translate of A with respect to x is the set:
A + x := {u ∈ Rm | u = x + a for some a ∈ A} .
In Measure theory, it makes sense to compare the measure of the two sets A and A + x : for geometric
reasons, we may expect them to be the same; this holds for an important class of measures. We have,
indeed, the following result, that we state without proof.
Theorem 8.15. There exists a unique complete measure, defined on R and denoted by ` , and there
exists a unique σ–algebra M(R) , which are translation invariant, i.e.:
The relation between the σ–algebra M(R) of the Lebesgue measurable sets and the Borel σ–algebra
is stated in the following Remark 8.16.
An extensive treatment and a construction of the Lebesgue measure and its properties can be found in
[52], [51] and [6]. For our purposes, our axiomatic approach suffices: there exists a unique translation–
invariant measure, which associates the measure µ of every finite interval [a , b] with its length b − a .
8.2.1 Exercises
Remark 8.19. Any simple function can be represented as linear combination of characteristic func-
tions, since it holds:
Xm
ϕ(x) = ϕi 1Ai (x) .
i=1
When a simple function is defined on the real line, the idea of its integration is intuitive and it is
inspired by the geometric concept of area of a family of rectangles, as illustrated in Figure 8.1.
xHtL
3.0
2.5
2.0
1.5
1.0
0.5
t
0.5 1.0 1.5 2.0 2.5 3.0
Observe that, to define a simple function correctly, we have to adopt a measure–theory convention,
that concerns ∞ in the extended real number system, namely:
±∞ · 0 = 0 .
We now state, without proof, the main properties of the integral of a simple function, strating with
the properties of positivity and linearity.
Proposition 8.21. Let (Ω, A, µ) be a measure space, let A ∈ A be a measurable set, and let ϕ1 , ϕ2
be simple functions, defined on A . Then, the following properties hold:
Linearity: if α1 , α2 ∈ R , then
Z Z Z
(α1 ϕ1 + α2 ϕ2 ) dµ = α1 ϕ1 dµ + α2 ϕ2 dµ ;
A A A
Positivity: if ϕ1 ≥ 0 , then Z
ϕ1 dµ ≥ 0 .
A
132 CHAPTER 8. LEBESGUE INTEGRAL
Even though it is evident and easy to show, the following Proposition 8.22 has a fundamental impor-
tance in Measure theory, and it follows from the concept of integral of a simple function.
Proposition 8.22. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Then
Z
µ(A) = 1A dµ .
Ω
Proof. Observe that the characteristic function of a set A is a simple function. Hence, since 1A (x) = 0
when x ∈ Ac , and since 1A (x) = 1 when x ∈ A , it follows:
Z
1A dµ = 0 × µ(Ac ) + 1 × µ(A) = µ(A) .
Ω
Definition 8.23. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Function
f : A → [−∞ , +∞] is called measurable if, for any α ∈ R :
{ x ∈ A | f (x) ≤ α } ∈ A .
Proposition 8.24. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. The
following statements for function f : A → [−∞ , +∞] are equivalent:
Proof. The proof follows from the equalities recalled below, for which we refer to Theorem 11.15 of
[52]:
∞
\ 1
{x ∈ A | f (x) ≤ α} = {x ∈ A | f (x) < α + },
n
n=1
∞
\ 1
{x ∈ A | f (x) ≥ α} = {x ∈ A | f (x) > α − },
n
n=1
and
{x ∈ A | f (x) < α} = A \ {x ∈ A | f (x) ≥ α} ,
{x ∈ A | f (x) > α} = A \ {x ∈ A | f (x) ≤ α} .
They show, in a chain, that statement (i) =⇒ (ii) , statement (ii) =⇒ (iii) , statement (iii) =⇒ (iv) ,
and, finally, statement (iv) =⇒ (i) .
The property of measurability is well related with the basic algebraic operations between functions,
hence, we can state, without proof, the following Proposition 8.25.
Proposition 8.25. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let,
further, f , g : A → R be measurable functions. Then, the following functions are measurable:
8.4. MEASURABLE FUNCTIONS 133
Continuous and monotonic functions are indeed measurable, as illustrated in Proposition 8.26.
Proposition 8.26. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Then,
any continuous function and any monotonic function, defined on A , is measurable.
Proposition 8.27. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let,
further, {fn } be a sequence of measurable functions, defined on A . Then, the following ones are
measurable functions too:
and notice that upper and lower limits are always well defined since the following sequences are,
respectively, decreasing and increasing:
Corollary 8.28. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further,
{fn } be a sequence of measurable functions, defined on A . Then, the following functions are also
measurable:
lim sup fn , lim inf fn , lim fn .
n→∞ n→∞ n→∞
Reasoning with sequences of functions can yield interesting results, like the one stated in the following
Theorem 8.29.
f (x + n1 ) − f (x)
gn (x) = n f (x + n1 ) − f (x) =
1 .
n
The thesis follows, observing that (gn ) is a sequence of measurable functions and that gn −→ f 0 (x) .
From their Definition 8.17, simple functions are measurable. Their importance lies in the fact that
simple functions can approximate measurable functions, as shown by the following Theorem 8.30.
134 CHAPTER 8. LEBESGUE INTEGRAL
Theorem 8.30. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further,
f : A → R be a measurable function. Then, there exist a sequence of simple functions (fn )n , defined
on A , such that, for any x ∈ A , it holds |fn (x)| ≤ |f (x)| and:
Remark 8.31. The quite technical proof of Theorem 8.30 can be found, for instance, in Theorem
11.20 of [52]. Here, we only provide the interesting result that the sequence of simple functions,
approximating the given measurable function f , can be defined as follows, for any n ∈ N :
n
if f (x) ≥ n ,
fn (x) = k − 1 k−1 k
n
if n
≤ f (x) < n , 1 ≤ k ≤ n 2n .
2 2 2
Definition 8.32. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further,
f : A → [0 , +∞] be a measurable function. The integral of f on A , with respect to measure µ , is
defined as: Z Z
f dµ := sup ϕ dµ | 0 ≤ ϕ ≤ f , ϕ simple .
A A
Here is a list of the main properties of the integral of a non–negative function. The three properties
can be demonstrated by observing, first, that they hold for simple functions, and then forming the
limit, via Theorem 8.30.
Z
(P1) positivity property: 0≤ f dµ ;
A
Z Z
(P2) monotonicity property: 0≤f ≤g =⇒ f dµ ≤ g dµ ;
A A
Z
(P3) f dµ < +∞ =⇒ µ { x ∈ A | f (x) = +∞ } = 0.
A
We are now in the position to define the integral for measurable functions which may change sign. In
particular, the integral is defined for the class of absolutely integrable functions.
Definition 8.33. Let (Ω , A , µ) be a measure space, and let A ∈ A be a measurable set. Let, further,
f : A → [0 , +∞] be a measurable function. We say that f is summable on Ω if both integrals:
Z Z
f + dµ and f − dµ (8.11)
Ω Ω
Moreover, the integral of f on R exists if at least one of the two integrals (8.11) is finite. In this
latest case, the integral is still defined by (8.12), but it may be infinite. The undefined situation is,
obviously: +∞ − ∞ .
8.6. ALMOST EVERYWHERE 135
The notation L(Ω) represents the collection of all measurable functions, on Ω , which are summable.
The properties of the integral imply that L(Ω) is a vector space on R .
Example 8.39. The sequence of functions fn (x) = xn , defined on the interval [0 , 1] , converges to 0
almost everywhere, if we take the Lebesgue measure. In fact, if x ∈ [0 , 1] , then
(
0 if x ∈ [0, 1[ ,
lim xn =
n→∞ 1 if x = 1 .
This shows that the limit function is not the null function, but the set on which the limit function
differs from the null function has zero measure.
136 CHAPTER 8. LEBESGUE INTEGRAL
The following Proposition 8.40 provides reasons of the relevance of the almost everywhere properties,
when they hold for a complete measure space.
Proposition 8.40. Let (Ω, A , µ) be a measure space, where µ is a complete measure. Let f , g : Ω →
[−∞ , +∞] be functions that are almost everywhere equal. If f is measurable, then g is measurable.
Moreover, if f ∈ L(Ω) , then g ∈ L(Ω) and:
Z Z
f dµ = g dµ .
Ω Ω
σ = { a = x0 , x1 , . . . , xn−1 , xn = b } ,
where the elements of σ are listed according to the convention that xi−1 < xi , for any i = 1 , . . . , n .
The norm of σ is the positive number:
where:
Mi = sup f (x) .
xi−1 ≤x≤xi
These definitions have a plain geometrical meaning; in particular, when f is non–negative, the idea is
to calculate the area of the plane region situated underneath the graph of f , over the interval [a , b] ,
as illustrated in Figure 8.2.
4
Georg Friedrich Bernhard Riemann (1826–1866), German mathematician.
8.7. CONNECTION WITH RIEMANN INTEGRAL 137
y y
f HbL f HbL
f HaL f HaL
x x
a b a b
are called, respectively, lower integral and upper integral of f ∈ B ([a , b]) . They are represented with
the notations:
Z b Z b
sup s(f , σ) := f (x) dx , inf S(f , σ) := f (x) dx .
σ∈Ω([a ,b]) a σ∈Ω([a ,b]) a
In this case, we denote f ∈ R ([a , b]) , and the common value of upper and lower integrals is denoted
by:
Z b
f (x) dx .
a
The inclusion R ([a , b]) ⊂ B ([a , b]) is proper. In fact, if we consider the following function, due to
Dirichlet5 : (
0 if x ∈ [0 , 1] ∩ (R\Q) ,
D(x) =
1 if x ∈ [0 , 1] ∩ Q ,
we see that D ∈ B ([a , b]) , while D ∈
/ R ([a , b]) , since:
Z b Z b
D(x) dx = 0 < D(x) dx = 1 .
a a
At this point, take notice that, if we consider the Lebesgue measure on R , then function D(x) coincides
with the zero function almost everywhere; hence, by Proposition 8.40, the Dirichlet function turns to
be integrable and: Z
D d` = 0 .
[0 ,1]
In other words, the Dirichlet function constitutes an example of a Lebesgue integrable function, that
is not Riemann integrable.
5
Johann Peter Gustav Lejeune Dirichlet (1805–1859), German mathematician.
138 CHAPTER 8. LEBESGUE INTEGRAL
Remark 8.42. In view of Theorem 8.41, we will use the traditional Leibniz–Riemann notation:
Z b
f (x) dx
a
to denote, also, the Lebesgue integral of the measurable function f , defined on the interval [a , b] .
Example 8.43. Consider a strictly increasing sequence (xn )n∈N ⊂ [0 , 1] , and define:
x∞ := lim xn ,
n→∞
being the existence of x∞ ensured by the hypothesis on (xn ) . Now, introduce the function f : [0 , 1] →
R , defined as: (
xn if x ∈ [ n−1 n
n , n+1 ) ,
f (x) =
x∞ if x = 1 .
By construction, f is strictly increasing and, thus, integrable on [0 , 1] . Again by construction, f is
n
discontinuous at any point ξn = , where f jumps, as well as at x = 1 . Moreover:
n+1
∞ Z n ∞ ∞
1
n − 1 X
Z X n+1 X n xn
f (x) dx = xn dx = xn − = ,
0
n−1 n+1 n n(n + 1)
n=1 n n=1 n=1
which implies that the integral of f is strictly positive. For instance, as illustrated in Figure 8.3, if
xn = 1 − n1 , then:
1 ∞ ∞ ∞
π2
Z X 2 2 1 X 1 1 X 1
f (x) dx = − − 2 =2 − − = 2 − .
0 n n+1 n n n+1 n2 6
n=1 n=1 n=1
8.8. NON LEBESGUE INTEGRALS 139
1
Figure 8.3: Example 8.43, with xn = 1 − n .
It
Xis possible to show that a measurable function f is integrable if and only if the numeric series
|f (xn )| converges. In such a case, we also have:
n≥1
Z X
f dµ = f (xn ) .
Ω n≥1
Theorem 8.44. Let ( Ω , A , µ ) be a measure space, and let f : Ω → [0 , +∞] be non–negative and
measurable. Then, for any A ∈ A : Z
φ(A) := f dµ
A
is a measure on A . Moreover, φ is a finite measure, that is, φ(Ω) < +∞ , if and only if f ∈ L(Ω , µ) .
Proof. It is obvious that φ(∅) = 0 . To complete the proof, we have to show that, if An ∈ A , n ∈ N ,
∞
[
is such that Ai ∩ Aj = ∅ for i 6= j , then, setting A = An yields:
n=1
∞
X
φ(A) = φ(An ) .
n=1
If f is a characteristic function of some measurable set E , then the countable additivity of φ follows
from the countable additivity of µ ; in such a case, in fact, we have:
Z
φ(A) = 1E dµ = µ(A ∩ E) .
A
In the general case of non–negative measurable f , for any simple function s such that 0 ≤ s ≤ f ,
we have: Z X∞ Z X∞ Z ∞
X
s dµ = s dµ ≤ f dµ = φ(An ) .
A n=1 An n=1 An n=1
Now, recalling the definition of integral of a measurable function, and considering the supremum, we
find: Z Z X∞
sup s dµ = f dµ = φ(A) ≤ φ(An ) . (8.13)
0≤s≤f A A n=1
To obtain the thesis, we have to revert inequality (8.13) and prove it. Now, if there exists an An such
that φ(An ) = +∞ then, since A ⊃ An , it follows that φ(A) ≥ φ(An ) and the thesis is immediate.
8.10. PASSAGE TO THE LIMIT 141
Hence, the proof can be limited to the case in which φ(An ) < +∞ for any n ∈ N . Then, for any
ε > 0 , a simple function s can be chosen such that 0 ≤ s ≤ f and:
Z Z Z Z
ε ε
s dµ ≥ f dµ − , s dµ ≥ f dµ − .
A1 A1 2 A2 A2 2
Thus: Z Z Z
φ(A1 ∪ A2 ) ≥ s dµ = s dµ + s dµ ≥ φ(A1 ) + φ(A2 ) − ε ,
A1 ∪A2 A1 A2
from which, it can be inferred that:
φ(A1 ∪ A2 ) ≥ φ(A1 ) + φ(A2 ) .
Generalising to any n ∈ N :
φ(A1 ∪ · · · ∪ An ) ≥ φ(A1 ) + · · · + φ(An ) . (8.14)
Now, since A ⊃ A1 ∪ · · · ∪ An , inequality (8.14) implies:
∞
X
φ(A) ≥ φ(An ) . (8.15)
n=1
Proof. (Theorem 8.45) Relations (8.16) imply that there exists α ∈ [0 , +∞] such that:
Z
lim fn dµ = α .
n→∞ E
Z Z
Since fn dµ ≤ f dµ , it is also:
E E Z
α ≤ f dµ . (8.18)
E
To obtain the thesis, we have to revert inequality (8.18) and prove it. Let us fix 0 < c < 1 and a
simple function s such that 0 ≤ s ≤ f . Define, further, for any n ∈ N :
En = { x ∈ E | fn (x) ≥ c s(x) } .
Due to the Generation measure Theorem 8.44, the integral in (8.20) is a countable additive set function.
Then, we can use Theorem 8.13, for increasing sequences of nested sets, to infer:
Z
α ≥ c s dµ .
E
Analytic functions
The monotone convergence Theorem 8.45 applies in a natural way to analytic functions, that are
functions admitting a convergent expansion in power series. The connection to analytic functions is
established by the following Corollary 8.47 to Theorem 8.45.
∞
X
Corollary 8.47. Let fn be a series of positive functions on Ω . Then:
n=1
∞
Z X ∞ Z
X
fn dµ = fn dµ .
Ω n=1 n=1 Ω
8.10. PASSAGE TO THE LIMIT 143
The next Example 8.48 aims to clarify how to deal with analytic functions using monotone convergence.
Take notice of the importance of this example, which provides a further solution to the Basel problem
presented in § 2.7.
Example 8.48. Compute the value of the so–called Leibniz integral, that is:
1
π2
Z
ln x
dx = . (8.22)
0 x2 − 1 8
As mentioned, integral (8.22) is connected to the Basel problem in § 2.7. Recalling the geometric series
expansion (2.38), the following result can be inferred:
∞
1 X
= x2 n .
1 − x2
n=0
1 ∞ 1
− ln x
Z X Z
2
dx = −x2 n ln x dx .
0 1−x 0
n=0
Integrating by parts:
1 1 Z 1
x2 n+1 x2 n
Z
2n 1
−x ln x dx = − ln x + dx = . (8.23)
0 2n + 1 0 0 2n + 1 (2 n + 1)2
In other words, the integral in the left–hand side of (8.22) can be expressed in terms of a numerical
series:
Z 1 ∞
ln x X 1
2−1
dx = . (8.24)
0 x (2 n + 1)2
n=0
The monotone convergence Theorem 8.45 allows, also, the computation of many other infinite series, a
few of which will be presented in the following examples. Let us start with the most elementary cases.
Example 8.49. This interesting example, of infinite series summation, is taken from [17]. We show
that:
∞
X 1
= 4 (1 − ln 2) . (8.25)
n=1
n + 21 n
2
we get further:
∞ Z
1 X ∞ h −(n+b) x i
S(a , b) = e − e−(n+a) x dx ,
a−b
n=1 0
∞ Z
1 X ∞ −n x h −b x i
= e e − e−a x dx .
a−b 0
n=1
As in Example 8.48, the trick is to recognise the geometric series expansion (2.31) in the last integrand:
∞ ∞
X
−x n 1 X 1 1 e−x
e−n x =
e = , i.e., − 1 = = ,
1 − e−x 1 − e−x ex − 1 1 − e−x
n=0 n=1
and, then, use the monotone convergence Theorem 8.45, which yields:
Z ∞h i e−x
1
S(a, b) = e−bx − e−ax dx
a−b 0 1 − e−x
At this point, considering the change of variable t = e−x :
Z 1 b
1 t − ta
S(a , b) = dt . (8.26)
a−b 0 1−t
The left–hand side of (8.25) corresponds to forming S( 12 , 0) :
∞ Z 1 √ Z 1
X 1 1 1− t 1
1 = S( 2 , 0) = 2 dt = 2 √ dt .
2
n +2n 0 1−t 0 1+ t
n=1
π2 π2
Note the interesting comparison between (8.24) and (8.27a), that evaluate to and , respectively,
8 6
as shown in formulæ (2.69) and (2.71).
8.10. PASSAGE TO THE LIMIT 145
8.10.2 Exercises
1
1−x
Z
1. Using the definite integral dx , show that:
0 1 − x4
∞
X 1 π 1
= + ln 2 .
(4 n + 1)(4 n + 2) 8 4
n=0
1 1−x 1
2
= 2
+ .
(1 + x) (1 + x ) 2 (1 + x ) 2 (1 + x)
1
x (1 − x)
Z
2. Using the definite integral dx , show that:
0 1+x
∞
X (−1)n 3
= − ln 4 .
(n + 1) (n + 2) 2
n=1
Z ∞ ∞
sin(a x) X a
3. Show that x
dx = , for any a ∈ R .
0 e −1 n + a2
2
n=1
Hint. Use the equality:
∞ ∞
x2
Z X 2
4. Show that dx = .
0 ex − 1 n3
n=1
∞
X 1
5. Show that = 2 (8 − π − 6 ln 2) .
n=1
n n + 14
Hint: Use the partial fraction decomposition:
x3 1+x 1
2
=1− 2
− .
(x + 1) (1 + x ) 2 (1 + x ) 2 (1 + x)
Lemma 8.51 (Fatou). Let (Ω , A , µ) be a measure space, and let (fn )n be a sequence of measurable
positive functions on Ω . Then:
Z Z
lim inf fn dµ ≤ lim inf fn dµ .
Ω n→∞ n→∞ Ω
146 CHAPTER 8. LEBESGUE INTEGRAL
gn := inf fp .
p≥n
From the monotone convergence Theorem 8.45, since gn (x) ≤ fn (x) , we infer:
Z Z Z
lim inf fn dµ = lim gn dµ ≤ lim inf fn dµ ,
Ω n→∞ n→∞ Ω n→∞ Ω
We are now in position to state the dominated convergence Theorem 8.52, due to Lebesgue.
Theorem 8.52. Consider a measure space (Ω , A , µ) and let (fn )n be a sequence of measurable
functions such that:
lim fn (x) = f (x) . (8.28)
n→∞
Assume that there exists a non–negative summable g ∈ L(Ω) such that, for any x ∈ Ω and any
n∈N :
|fn (x)| ≤ g(x) (8.29)
Proof. We follow the proof proposed in [52]. Condition (8.29) implies that fn + g ≥ 0 , thus, Lemma
8.51 can be used to obtain the inequality:
Z Z
(f + g) dµ ≤ lim inf (fn + g) dµ .
A n→∞ A
From (8.29) we also get g − fn ≥ 0 , and, again from Fatou8 Lemma 8.51, we obtain:
Z Z
(g − f ) dµ ≤ lim inf (g − fn ) dµ .
A n→∞ A
Thus: Z Z
− f dµ ≤ lim inf (−fn ) dµ ,
A n→∞ A
Remark 8.53. When µ is a complete measure (see Definition 8.6), the dominated convergence The-
orem 8.52 holds, also, when conditions (8.28) and (8.29) are true almost everywhere.
8
Pierre Joseph Louis Fatou (1878–1929), French mathematician and astronomer.
8.10. PASSAGE TO THE LIMIT 147
8.10.4 Exercises
n
+∞
e−(x+1)
Z
1. Evaluate lim √ dx .
n→+∞ 0 x
1
e−x
Z
3. Explain why the passage to the limit lim dx is possible.
n→∞ 0 x2 + n2
Z ∞
arctan n
4. Evaluate lim dx , explaining why it is possible to use the dominated convergence
n→∞ 0 1 + x2
Theorem 8.52.
1
x4
Z
5. Justify the following passage to the limit lim dx .
n→∞ 0 x2 + n2
2√
6. Consider the sequence of functions fn (x) = n x e−n x , x ∈ [0 , +∞) . Show that the thesis of
Theorem 8.52 does not hold. Explain the reasons, evaluating sup fn (x) .
x∈[0 ,+∞)
7. Using either of the Theorems 2.15, or 8.45, or 8.52, evaluate the following limits:
Z ∞ Z π
−x2
(a) nxe dx ; (d) sin(n x) e−n x dx ;
0 0
1
r
Z
1
Z 1
(b) x2 + dx ; (e) lim n x (1 − x2 )n dx .
n n→∞
Z0 π
0
sin(n x)
(c) dx ;
0 n
Now, observing that ϕ ∈ L(R) , the dominated convergence Theorem 8.52 can be used, and we can
interchange the limit with the integral:
n n
+∞
e−(x+1) +∞
e−(x+1)
Z Z
lim √ dx = lim √ dx = 0 .
n→+∞ 0 x 0 n→+∞ x
9
See, for example, mathworld.wolfram.com/BernoulliInequality.html
148 CHAPTER 8. LEBESGUE INTEGRAL
For simplicity, we keep the notation f to indicate f˘ . As in Theorem 8.29, let us introduce the sequence
of (measurable) functions (ϕn )n∈N , defined for x ∈ [a , b] as:
1
f (x + ) − f (x)
ϕn (x) = n .
1
n
By definition, if x ∈ E , then ϕn (x) → f 0 (x) as n → ∞ , meaning that f 0 is measurable too. Now,
observing that `( [a , b] − E ) = 0 , we have:
Z Z b Z b Z b
1
ϕn (x) dx = ϕn (x) dx = n f (x + ) dx − f (x) dx
E a a n a
1
!
Z b+ n Z b
=n f (x) dx − f (x) dx
1
a+ n a
1 1
!
Z b+ n Z a+ n
=n f (x) dx − f (x) dx
b a
1 1
Z b+ n Z a+ n
=n f (b) dx − n f (x) dx
b a
1
Z a+ n
= f (b) − n f (x) dx
a
≤ f (b) − f (a) . (8.35)
10
We refer to the Fundamental Theorem of Calculus; see, for example, math-
world.wolfram.com/FundamentalTheoremsofCalculus.html and the references therein.
8.11. DIFFERENTIATION UNDER THE INTEGRAL SIGN 149
The last inequality (8.35) follows from the monotonicity assumption on f . In fact, for any x ∈
1
[a , b + 1] , from f (a) ≤ f (x) and integrating on [a , a + ] , we obtain:
n
Z a+ 1 Z a+ 1 Z a+ 1
n n n
n f (a) dx ≤ n f (x) dx ⇐⇒ −f (a) ≥ −n f (x) dx .
a a a
At this point, using Fatou Lemma 8.51, and passing to the limit, we infer:
Z Z Z
0
f (x) dx = lim inf ϕn (x) dx ≤ lim inf ϕn (x) dx ≤ f (b) − f (a) .
E E n→∞ n→∞ E
Remark 8.55. The inequality in the thesis of Theorem 8.54 can be strict, as shown by the earlier
result (8.33).
(iii) for any x ∈]a , b[ and for almost any t ∈ [α , β] , there exists g summable in [α , β] such that:
∂f
(x , t) ≤ g(t) .
∂x
Then: Z β
F (x) := f (x , t) dt
α
is differentiable and: Z β
∂f
F 0 (x) = (x , t) dt .
α ∂x
It is also useful to state an alternative version of Theorem 8.56, which uses the continuity of the partial
derivatives of f (x , t) .
Theorem 8.57. Let f : [a , b] × [α , β] → R be a continuos function, such that the partial derivative
∂f
esists and is continuos on [a , b] × [α , β] . Then:
∂x
Z β
F (x) := f (x , t)dt
α
is differentiable and: Z β
0 ∂f
F (x) = (x , t) dt .
α ∂x
Several examples follow, on the use of Theorems 8.56 and 8.57.
150 CHAPTER 8. LEBESGUE INTEGRAL
Hence, the derivative of function f (x) = f1 (x) + f2 (x) is zero for any x , meaning that f (x) is a
constant function, with f (x) = f (0) for all x ∈ R .
And it is clear that: Z 1
1 π
f (0) = 2
dt = .
0 1+t 4
In other words, we have shown that, for any x ∈ R :
Z x 2 Z 1 −(1+t2 ) x2
−t2 e π
e dt + 2
dt = .
0 0 1+t 4
Now, using the dominated convergence Theorem 8.52:
(Z 2 Z 1 −(1+t2 ) x2 ) Z ∞ 2
x
−t2 e −t2 π
lim e dt + 2
dt = e dt = .
x→∞ 0 0 1 + t 0 4
while Z a Z a
f (t) dt = 2 f (t) dt for any integrable even f .
−a 0
2
Remark 8.59. From (8.36) and from the fact that e−x is an even function, we infer that:
Z ∞
2 √
e−x dt = π . (8.36a)
−∞
1
Now, with the change of variable t = √ z :
x
∞
e−x e−x
Z
2
F 0 (x) = − √ e−z dz = − √ G .
x −∞ x
Then, integrating on [t , ∞) :
∞ ∞
e−s
Z Z
0
F (s) ds = −G √
√ ds ,
x x s
11
See, for example, mathworld.wolfram.com/CompletingtheSquare.html
152 CHAPTER 8. LEBESGUE INTEGRAL
that is:
∞
e−s
Z
F (x) = G √
√ ds . (8.42)
x s
√
From (8.41) it follows that F (0) = π , while (8.42) implies that F (0) = G2 . Thus G = π , which
means that (8.36a) is re-established.
8.11.3 Exercises
Z +∞
sin t
1. Consider F (x) = e−x t dt , where x > 0 . Show that:
0 t
1
(a) F 0 (x) = − ;
1 + x2
(b) lim F (x) = 0 ;
x→+∞
π
(c) F (x) = − arctan x ;
2
(d) Dirichlet integral : Z +∞
sin t π
dt = . (8.43)
0 t 2
2. Using Theorem 8.56, for differentiation under the integral sign, show that, for x ≥ 0 :
Z 1 x
t −1
dt = ln(1 + x) .
0 ln t
3. For x > 0 consider the function:
+∞
arctan t − arctan(t x)
Z
F (x) = dt .
0 t
Explain why it is possible to differentiate under the integral; then, show that:
π
F (x) = − ln x .
2
4. For x > 0 consider the function
+∞
e−t − e−t x
Z
F (x) = dt .
0 t
Explain why it is possible to differentiate under the integral sign; then, show that:
F (x) = ln x .
This means that we can differentiate under the integral sign, and, integrating by parts, we obtain:
Z ∞
0 1
F (x) = −e−x t sin t dt = − .
0 1 + x2
sin t
(b) Make use of the fact that x > 0 , and that ≤ 1 for any t ∈ R , to form some estimates:
t
∞ Z ∞
Z
sin t 1
e−x t e−x t dt = .
|F (x)| = dt ≤
0 t 0 x
where C is some integration constant. Recalling (b) and taking the limit as x → ∞ :
π
0 = lim F (x) = C − arctan(∞) = C − .
x→∞ 2
π π
It follows that C = and F (x) = − arctan x .
2 2
(d) At this point, it is immediate recognize that:
Z ∞
sin t π
dt = F (0) = .
0 t 2
The above integral, given in (8.43), is of great importance and it is known as Dirichlet integral.
The power series (2.36) was used in the very last step. Now, since F (0) = 0 , then we have that, for
0≤x<1 :
Z x ∞
Z xX
t2 n
F (x) = F 0 (t) dt = dt .
0 0 2n + 1
n=0
For 0 ≤ x < 1 , we thus have:
∞
X x2 n+1
F (x) = .
(2 n + 1)2
n=0
8.13. DEBYE INTEGRAL 155
In other words, we proved (2.69), obtaining, once more, the solution to the Basel problem.
to ensure the convergence, we have to assume m > 0 . Let us restrict our attention to the particular
case m = 1 . Taking into account the power series expansion (2.33), we introduce a generalization of
the logarithmic function, Lis (x) , of order s and argument x , known as polylogarithm:
∞
X un
Lis (x) = . (8.45)
ns
n=1
The case s = 2 , in particular, was introduced by L. Euler in 1768 and it is called dilogarithm:
∞ Z x
X xn ln(1 − t)
Li2 (x) := 2
=− dt . (8.45a)
n 0 t
n=1
We are now in position to evaluate the Debye integral D1 . When t > 0 , we can write:
∞ ∞
t −t 1 −t
X
−n t
X
= t e = t e e = t e−(n+1) t .
et − 1 1 − e−t
n=0 n=0
12
Peter Joseph William Debye (1884–1966), Dutch–American physicist and physical chemist, and Nobel laureate in
Chemistry in 1936.
156 CHAPTER 8. LEBESGUE INTEGRAL
9 Radon–Nikodym theorem
Definition 9.1. Let A be a σ–algebra on Ω . A signed measure, or charge, is any set function ν ,
defined on A , such that:
(ii) ν(∅) = 0 ;
Remarks 9.2.
In Definition 9.1, equality (iii) means that, if the measure in the left hand–side of (9.1) is finite,
then the infinite series in the right hand–side of (9.1) converges absolutely; otherwise, such a series
diverges.
Definition 9.3.
(i) A set P ∈ A is a positive set, with respect to the signed measure ν , if it holds ν(E) ≥ 0 for
any subset E ⊂ P , E ∈ A .
(ii) A set N ∈ A is a negative set, with respect to the signed measure ν , if it holds ν(E) ≤ 0 for
any subset E ⊂ N , E ∈ A .
(iii) If a set A is both positive and negative, with respect to the signed mea– sure ν , then A is
called null set.
Remarks 9.4.
157
158 CHAPTER 9. RADON–NIKODYM THEOREM
Remark 9.5. Note that null sets, for a signed measure, are different from sets of measure zero, for
a positive measure. This becomes evident if we observe that, for a null set A , by definition, it holds
ν(A) = 0 , which means that A may be the union of two non–empty subset of opposite charge. For
instance, in R , the set function: Z
ν(A) = x dx
A
is a signed measure, defined on the σ–algebra of Lebesgue measurable sets, and:
Z 0 Z 2
ν([−2, 0]) = x dx = −2 , ν([0, 2]) = x dx = 2 ,
−2 0
Z 2
ν([−2, 2]) = x dx = 0 .
−2
The Hahn1 decomposition Theorem 9.6 explains the behaviour of the signed measures.
Theorem 9.6 (Hahn). Let ν be a signed measure in the measure space (Ω , A) . Then, there exist a
positive set P and a negative set N such that:
P ∩N =∅ and P ∪ N = Ω.
The decomposition of Ω into a positive set P and a negative set N is called Hahn decomposition for
the signed measure ν . Such a decomposition is not unique.
Remark 9.7. Denote by {P , N } a Hahn decomposition of the charge ν . Then, it is possibile to
define two positive measures ν + and ν − as follows:
ν + (E) = ν(E ∩ P ) , ν − (E) = −ν(E ∩ N ) .
The positive measures ν + and ν − turn out to be mutually singular, as they comply with the following
Definition 9.8.
Definition 9.8. Two measures ν1 and ν2 , defined on (Ω , A) , are called mutually singular, and
denoted ν1 ⊥ ν2 , if there exist two measurable disjoint sets A and B such that Ω = A ∪ B and:
ν1 (A) = ν2 (B) = 0 .
The various results on charge measure and Hahn decomposition are contained in the following Theo-
rem 9.9, due to Jordan2 .
Theorem 9.9 (Jordan decomposition). Let ν be a charge, defined on the measure space (Ω , A) .
Then, there are exactly two positive measures ν + and ν − defined on (Ω, A) and such that:
ν = ν+ − ν− .
ν + and ν − are called, respectively, positive and negative variation of ν . Since, by definition, µ can
take at most one of the values −∞ or +∞ , then at least one of the two variations must be finite. If
both variations are finite, then ν is a finite signed measure.
Remark 9.10. A consequence of Theorem 9.9 is that a new measure, denoted with the symbol |ν| ,
can be defined:
|ν| (E) := ν + (E) + ν − (E) .
This measure is called absolute value or total variation of ν . It is possible to show that:
Z
|ν| (E) = sup f dν ,
E
where the supremum is taken over all measurable functions f that verify |f | ≤ 1 everywhere.
Definition 9.11. The integral of a function f , with respect to a signed measure ν , is defined by:
Z Z Z
f dν := f dν + − f dν − , (9.2)
where we have to assume that f is measurable with respect to both ν + and ν − , and that the integrals
in the right–hand side of (9.2) are not both infinite.
Example 9.12. Consider f (x) = (x − 1)e−|x−1| , and define the signed measure on the Lebesgue
measurable set in R : Z
ν(A) = (x − 1) e−|x−1| dx .
A
From the Hahn decomposition Theorem 9.6, being ν a signed measure, we know that there exist two
sets P and N , positive and negative, respectively, such that P ∩ N = ∅ , P ∪ N = R , and such that
ν + (E) = ν(E ∩ P ) and ν − (E) = −ν(E ∩ N ) .
Observe that f (x) ≥ 0 in [1 , +∞) , while f (x) < 0 on (−∞ , 1) . Hence, we can choose, for instance,
P = [1 , +∞) and N = (−∞ , 1) . Even though P and N are not uniquely determined, ν + and ν − do
not change. We then obtain:
Z ∞ Z ∞
−|x−1|
+
ν (R) = ν(R ∩ P ) = (x − 1) e dx = (x − 1) e−(x−1) dx
1 1
Z ∞
= u e−u du = 1 ,
0
Z 1 Z 1
− −|x−1|
ν (R) = −ν(R ∩ N ) = − (x − 1) e dx = − (x − 1) e(x−1) dx
−∞ −∞
Z 0
=− u eu du = −(−1) = 1 ,
−∞
Finally, from equality (9.2), that defines an integral with respect to a signed measure, we get:
Z Z Z
1 1 1
dν = dν − dν
[0 ,+∞] x−1 [0 ,+∞]∩P x − 1 [0 ,+∞]∩N x − 1
Z +∞ Z 1
1 −(x−1) 1
= (x − 1) e dx − (x − 1) e(x−1) dx
1 x − 1 0 x − 1
Z +∞ Z 1
= e−(x−1) dx − e(x−1) dx
1 0
e−1 1
=1− = .
e e
160 CHAPTER 9. RADON–NIKODYM THEOREM
Remark 9.14. If measure φ is built from f and µ , as shown in Theorem 9.13, then, if a set is
negligible for µ , it is also negligible for φ , that is: µ(E) = 0 =⇒ φ(E) = 0 .
The situation illustrated in Remark 9.14 is formalised in the following Definition 9.15.
Definition 9.15. Given two measures φ and µ , on the same σ–algebra A , we say that φ is absolutely
continuous, with respect to µ , if:
µ(E) = 0 =⇒ φ(E) = 0 ,
and we use the notation: φ µ .
So far, we have shown that, if measure φ is obtained from measure µ by integrating a positive
measurable function f , then φ is absolutely continuous with respect to µ . It is possible to revert
this process: if measure φ is absolutely continuous with respect to measure µ , then, under certain
essential hypotheses, it is possible to represent µ as an integral of a certain function. Such hypotheses
are given in the Radon–Nikodym Theorem 9.16.
Theorem 9.16 (Radon–Nikodym). Let (Ω , A , µ) be a σ–finite measure space, and let φ be a
measure, on A , absolutely continuos with respect to µ . Then, there exists a non–negative measurable
function h such that, for any E ∈ A :
Z
φ(E) = h dµ .
E
This function h is almost everywhere unique, it is called Radon–Nikodym derivative of φ with respect
to µ , and it is denoted by:
dφ
dµ
Moreover, for any function f ≥ 0 and φ–measurable:
Z Z
f dφ = f h dµ .
Ω Ω
3
Johann Karl August Radon (1887–1956), Austrian mathematician.
Otto Marcin Nikodym (1887–1974), Polish mathematician.
9.2. RADON–NIKODYM THEOREM 161
For the proof, see § 6.9 of [53]. The Radon–Nikodym derivative, i.e., the function h which expresses
the change of measure, has some interesting properties, stated below.
Z Z
dφ
(a) If φ µ and if ϕ is a non–negative and measurable function, then: ϕ dφ = ϕ dµ .
dµ
d (φ1 + φ2 ) dφ1 dφ2
(b) For any φ1 , φ2 it holds: = + .
dµ dµ dµ
dφ dφ dµ
(c) If φ µ λ , then = .
dλ dµ dλ
−1
dν dµ
(d) If ν µ and µ ν , then = .
dµ dν
162 CHAPTER 9. RADON–NIKODYM THEOREM
10 Multiple integrals
10.1 Integration in R2
In this § 10.1, we expose the theoretical process that extends the Lebesgue measure from R to the
plane R2 . Such a process also provides a method to compute the two–dimensional Lebesgue measure
of sets in the plane, using the one–dimensional measure on the line. The basic step is the notion
of section of a set. Due to our applicative commitment, most of the theorems are stated, but not
demonstrated; for their proofs, refer to [51] and [6].
Definition 10.1. Let A ⊂ R2 and fix x ∈ R . Then, the section of foot x of A is defined as the
subset Ax of R :
Ax := {y ∈ R | (x , y) ∈ A} .
Viceversa, given y ∈ R , the section of foot y of A is:
Ay := {x ∈ R | (x, y) ∈ A} .
Ax A
y Ay
163
164 CHAPTER 10. MULTIPLE INTEGRALS
Example 10.3. From Theorem 10.2 it is immediate to find the measure of a circle, i.e., its area.
Given the unit circle A = {(x , y) | x2 + y 2 ≤ 1} , in fact, we see that its `2 measure
h √ can be found ias
√
illustrated in Figure 10.2. Fixed x ∈ R , the section of foot x is given by Ax = − 1 − x2 , 1 − x2 ,
if −1 ≤ x ≤ 1 , while it is Ax = ∅ elsewhere. Therefore:
Z 1 Z 1p
π
`2 (A) = `(Ax ) dx = 4 1 − x2 dx = 4 = π .
−1 0 4
��
�
�
Figure 10.2: Application of Theorem 10.2 to the computation of the unit circle area.
The following theorem is known as Fubini1 Theorem [24, 25]: it rules the evaluation of integrals, with
respect to the two–dimensional Lebesgue measure, establishing the method of nested integration.
(I) Denote with S0 the null set where section Ax is not measurable, and define S to be the subset
of R where the y–sections of A have positive measure:
S = { x ∈ R \ S0 | `(Ax ) > 0} .
(II) Denote with T0 the null set where section Ay is not measurable, and define T to be the subset
of R where the x–sections of A have positive measure:
T = { y ∈ R \ T0 | `(Ay ) > 0} .
Figure 10.3 illustrates cases (I) and (II) of the Fubini Theorem 10.4.
1
Guido Fubini (1879–1943), Italian mathematician.
10.1. INTEGRATION IN R2 165
Ax A T
y Ay
x S
Figure 10.3: Fubini Theorem 10.4: cases (I) and (II), to the left and to the right, respectively.
Remark 10.5. In Theorem 10.4, the same integral is evaluated by (10.1) and (10.2); in a practical
situation, it usually occurs that one integration path is easier than the other. Moreover, the theoretical
equality between integrals (10.1) and (10.2):
Z Z Z Z !
f (x , y) dy dx = f (x , y) dx dy , (10.3)
S Ax T Ay
can be exploited to evaluate integrals that are otherwise difficult to be computed via a direct approach.
(x , y) ∈ R2 | 0 ≤ x ≤ 1 , x2 ≤ y ≤ x + 1 ; note that x ∈
Example 10.6. Consider A =
[0 , 1] =⇒ x2 ≤ 1 + x . We want to evaluate:
ZZ
x y dx dy .
A
The first step consists in the analysis of the integration domain. In this example, the domain is
described by a double inequality constraint, of the form f1 (x) ≤ y ≤ f2 (x) , with x ∈ [a , b] . Many
authors describe such kind of integration domain as normal domain. In our case, [a , b] = [0 , 1] , and
f1 (x) = x2 is a parabola, while f2 (x) = 1 + x is a line, hence, the domain plot is as shown in
Figure 10.4. When working with normal domains, it is natural to adopt the vertical section approach,
stated in case (I) of Theorem 10.4. Here, S = [0 , 1] , Ax = [x2 , 1 + x] , and the nested integration
formula (10.1) yields:
1 1+x 1 1+x
y2
ZZ Z Z Z
x y dx dy = x y dy dx = x dx
A 0 x2 0 2 x2
Z 1
1 5
−x5 + x3 + 2 x2 + x dx = .
=
2 0 8
(x , y) ∈ R2 | y ≥ 0 , y ≤ −x + 3 , y ≤ 2x + 3
Example 10.7. Given A = , evaluate:
ZZ
y dx dy .
A
The integration domain is the triangle with vertices in (− 32 , 0) , (3 , 0) , (0 , 3) , as shown in Figure 10.5.
We first integrate in x and then in y :
Z 3 Z 3−y ! Z 3
9 3 2 27
y dx dy = y − y dy =
0 y−3
0 2 2 4
2
3.0
2.5
2.0
1.5
1.0
0.5
-2 -1 1 2 3
Fresnel integrals
We show, following [15] (pages 473–474), that:
∞ ∞
Z r Z
2 π
F1 := sin x dx = = cos x2 dx := F2 .
−∞ 8 −∞
Making use of the probability integral, discussed in § 8.11.1, 8.11.2, we see that:
Z ∞
1 2 2
√ =√ e−t x dx .
t π 0
We can employ Fubini Theorem 10.4, to invert the order of integration, and then observe that:
Z ∞ Z ∞
2 1 2 x2
e−t x sin t dt = 4
, e−t x cos t dt = ,
0 1+x 0 1 + x4
that follow from the indefinite integration formulæ :
2
e−t x x2 sin t + cos t
Z
−t x2
e sin t dt = − ,
1 + x4
2
e−t x sin t − x2 cos t
Z
−t x2
e cos t dt = .
1 + x4
Therefore:
∞ ∞
x2
Z Z
2 1 2
F1 = √ dx , F2 = √ dx .
π 0 1 + x4 π 0 1 + x4
The remaining computations are a matter of elementary integration, and they are left to the Reader,
or they can be obtained using the integration formulæ (11.25) and (11.26) of the next Chapter 11.
The problem above can, indeed, be solved directly by integrating x00 = f (t) twice on [0 , t] . The first
integration provides:
Z t Z t Z t
00 0 0
x (τ ) dτ = f (r) dr =⇒ x (t) − x (0) = f (r) dr ,
0 0 0
where, since x0 (0) = a1 , the first integration constant gets determined and we obtain:
Z t
0
x (t) = a1 + f (r) dr .
0
Now, recall that x(0) = a0 and use Fubini Theorem 10.4, to exchange the order of integration:
Z t Z t Z t
x(t) = a0 + a1 t + f (r) ds dr = a0 + a1 t + (t − r) f (r) dr .
0 r 0
s
t
It is possibile to extend this argument to any order of integration: the solution of the initial value
problem:
(
x(n) (t) = f (t) ,
x(0) = a0 , x0 (0) = a1 , . . . , x(n−1) (0) = an−1 ,
We use Fubini Theorem 10.4 to compute a definite integral, presented in Chapter 5 of [26], and hard
to evaluate otherwise:
Z 1 b
x − xa 1+b
dx = ln , 0 ≤ a < b. (10.5)
0 ln x 1+a
To calculate (10.5), let us define:
A = { (x , y) ∈ R2 | 0 ≤ x ≤ 1 , a ≤ y ≤ b } := [0 , 1] × [a , b]
1 b 1 y=b 1
xy xb − xa
ZZ Z Z Z Z
y y
x dx dy = x dy dx = dx = dx . (10.7)
A 0 a 0 ln x y=a 0 ln x
Frullani integral
We use again a two–fold double integral, to evaluate a hard single–variable integral, known as Frullani2
integral: Z ∞
arctan(b x) − arctan(a x) π b
dx = ln , 0 < a < b. (10.8)
0 x 2 a
Consider the double integral:
Z b Z ∞ Z b x=∞
1 arctan(x y)
dx dy = dy
a 0 1 + x2 y 2 a y x=0
Z b
(10.9)
π π b
= dy = ln .
a 2y 2 a
so that: ! y=1
Z +∞ Z 1 Z +∞
1 dy arctan x y
1 dx = dx .
0 x(1 + x2 ) 0 x2
+ y2 0 1 + x2 y=0
Therefore: Z +∞ Z 1 Z +∞
x arctan x
2 2 2
dy dx = dx .
0 0 (1 + x )(1 + x y ) 0 1 + x2
Now, since:
arctan2 x
Z
arctan x
dx = ,
1 + x2 2
it follows:
+∞ Z 1
π2
Z
x
dy dx = . (10.13)
0 0 (1 + x )(1 + x2 y 2 )
2 8
At this point, we use the partial fraction:
2 y2x
x 1 2x
2 2 2
= 2 2 2
− ,
(1 + x )(1 + x y ) 2 (y − 1) 1+x y 1 + x2
and we exploit Fubini Theorem 10.4, to change the order of integration in (10.12), and perform the
integration:
Z 1 Z +∞ Z 1 +∞
1 + x2 y 2
x 1
dx dy = ln dy
0 0 (1 + x2 )(1 + x2 y 2 ) 2
0 2 (y − 1) 1 + x2 0
Z 1 Z 1 (10.14)
ln y 2 ln y
= 2
dy = 2
dy .
0 2 (y − 1) 0 y −1
1. ϕ is injective;
3. det Jϕ (x) 6= 0 for any x ∈ A , where Jϕ (x) is the Jacobian matrix of ϕ evaluated at x , i.e.:
and
∂ϕ1 (x) ∂ϕ2 (x) ∂ϕ1 (x) ∂ϕ2 (x)
det Jϕ (x) = − .
∂x1 ∂x2 ∂x2 ∂x1
The most used regular mapping in the plane is the transformation into the so–called polar coordinates.
This is a regular mapping. In fact, the determinant of the Jacobian is det Jϕ (ρ, ϑ) = ρ > 0 , since:
is a regular mapping, and it is called rotation of angle α. The Jacobian determinant takes value 1. Note
that the composition of two rotations, of angles α and β respectively, is again a rotation of amplitude
α + β mod 2 π .
Regular mappings are useful, since they can allow transforming a double integral into a simpler one.
We state the Change of Variable Theorem 10.12 and, then, study several situations where a good
change of variable eases the computation of integrals.
Theorem 10.12. Consider the open set A ⊆ R2 , and assume that ϕ : A → R2 is a regular mapping.
Let f : A → R be a measurable function. Then, f ∈ L(A) if and only if x 7→ f (ϕ(x)) |det Iϕ (x)|
is summable on the inverse image ϕ−1 (A) . In such a situation, the equality holds true:
Z Z
f (u) du = f ϕ(x) |det Iϕ (x)| dx .
A ϕ−1 (A)
172 CHAPTER 10. MULTIPLE INTEGRALS
Example 10.13. We wish to evaluate, once more, the probability integral G , defined in (8.40). To
this aim, a double integral is considered, which equals G2 , thanks to Fubini Theorem 10.4:
ZZ Z Z
2 2 2 2
e−(x +y ) dx dy = e−x dx e−y dy = G2 ,
R2 R R
√ √
As Figure 10.7 shows,
√ E is a sort of triangle with curvilinear base, in which points A = (1/ 2 , 1/ 2)
and B = (1/2, 3/2) are obtained solving the systems:
( (
x2 + y 2 = 1 , x2 + y 2 = 1 ,
√
x = y, x = 3y.
so that, in conclusion:
π π
ϕ−1 (E) = (0 , 1) × ( , ) .
6 4
Finally, invoking Theorem 10.12, we obtain:
ZZ
1 π π π
`2 (E) = π π ρ dρ dϑ = 2 4 − 6 = 24 .
(0 ,1)×( , )
6 4
It is obviously possible to not use Theorem 10.12, and compute `2 (E) using, instead, Fubini Theorem
10.4. By doing so, though, the relevant computation trun out to be more involved. By looking again
at Figure 10.7, in fact, we see that the domain of integration must be split ad follows:
Z √ 1/ 2
Z !
x Z √ Z √ 2 !
3/2 1−x
`2 (E) = √ dy dx + √ √ dy dx .
0 x/ 3 1/ 2 x/ 3
Remark 10.15. Polar coordinates can be modified, if we wish to compute the measure of the canonical
ellipse, that is a set E described by:
x2 y 2
2
E = (x , y) ∈ R | 2 + 2 ≤ 1 ,
a b
In the next Example 10.16, we employ a rotation to compute the measure of a set.
A := { (x , y) ∈ R2 | x2 − x y + y 2 ≤ 1 } , (10.20)
π
using an α = rotation:
4
1 1
x = √ u− √ v,
2 2
1 1
y = √ u + √ v .
2 2
174 CHAPTER 10. MULTIPLE INTEGRALS
(u sin α + v cos α)2 − (u cos α − v sin α)(u sin α + v cos α) + (u cos α − v sin α)2 ≤ 1
that is:
u2 + v 2 + (sin2 α − cos2 α) u v − (u2 − v 2 ) sin α cos α ≤ 1 . (10.23)
We choose the value of α for which the rectangular term u v in (10.23) vanishes:
sin2 α − cos2 α = 0 ,
namely, α = π/4 .
At this point, returning to our transformed set (10.21), and recalling Remark 10.15, we infer that the
measure of A is: √
√ 2 2π
2 √ π=√ .
3 3
Remark 10.17. The computational method described in Example 10.16 can be applied to the situ-
ation of a general ellipse:
E = { (x , y) ∈ R2 | a x2 + 2 b x y + c y 2 ≤ d } , (10.24)
where, to make sure we are dealing with an ellipse and a non–empty set, we must assume that
a , c , d > 0 , and that the discriminant ∆ = b2 − a c is negative. That said, we can apply the rotation
technique to compute `2 (E) . The two cases a = c and a 6= c must be distinguished. When a = c , the
π/4 rotation used in Example 10.16 still works, and transforms E into:
Note that the two conditions a = c and ∆ < 0 imply |b| < a ; hence, a + b > 0 and a − b > 0 .
Recalling Remark 10.15, we can thus infer:
d
`2 (E) = √ π. (10.26)
a2− b2
If a 6= c , it is always possible to transform the given ellipse into an ellipse of the form (10.25), by
means of the variable transformation:
x = x1 ,
(10.27)
r
a
y = y1 ,
c
applying which, the general ellipse (10.24) becomes:
r
2 a
a x1 + 2 b x1 y1 + a y12 ≤ d . (10.28)
c
10.3. INTEGRATION IN RN 175
r
a
Note that transformation (10.27) has non–zero Jacobian determinant > 0 ; hence from (10.26), it
c
follows:
d
`2 (E) = √ π. (10.29)
a c − b2
In other words, (10.29) nicely extends (10.26).
10.3 Integration in Rn
The process of extending the Lebesgue measure, from R to R2 , can be naturally iterated. First,
though, some notations need to be introduced. The Euclidean space Rn is here decomposed into the
Cartesian product of two lower dimensional Euclidean spaces, Rp and Rq , where p + q = n , and we
make the identification Rn = Rp × Rq . Moreover, with the writing (x, y) ∈ Rn , we implicitly mean
that x = (x1 , . . . , xp ) ∈ Rp and y = (y1 , . . . , yq ) ∈ Rq . This said, following Definition 10.1, we can
now provide the idea of section of a set A ⊆ Rn .
Definition 10.21. Consider A ⊂ Rn and x ∈ Rp . Then, the section of foot x of A is defined as
the following subset Ax of Rq :
Ax := {y ∈ Rq | (x , y) ∈ A} .
Ay := {x ∈ Rp | (x , y) ∈ A} .
The Lebesgue measure in Rn is denoted by `n ; analogous meaning holds for `p and `q . In the following,
when the term measurable is used, the dimension of the considered Euclidean space will be clear form
the context. Symbol dp indicates that we are integrating with respect to the Lebesgue measure in
Rp , and analogously for other dimensions.
The set section Theorem 10.22, in its proof, makes a heavy use of the monotone convergence Theorem
8.45.
Theorem 10.22. Let A ⊂ Rn be a measurable set. Then:
(I) for almost any x ∈ Rp , section Ax ⊂ Rq is measurable; moreover, function Ax 7→ `q (Ax ) is
measurable, and it holds: Z
`n (A) = `q (Ax ) dp x ;
Rp
We can now formulate the extension of Fubini Theorem 10.4 to the general Rn case.
Theorem 10.23. (Fubini – general case) Let A ⊂ Rn be a measurable set and consider f ∈ L(A) .
Denote with S0 the null set set where section Ax , for x ∈ Rp , is non–measurable. Define, further, S
to be the subset of Rp where the x–sections of A have positive measure, formally:
To apply Fubini Theorem 10.23, it is mandatory that the integrand f (x , y) is a summable function.
In many circumstances, summability can be deduced by some a priori considerations. Otherwise, the
following Theorem 10.24, due to Tonelli3 , analyses the summability of a given function.
We complete our study of integration in the Euclidean space Rn by describing some applications of
the Fubini Theorem 10.23. Some further terminology is also provided.
The geometrical meaning of Definition 10.25 is, clearly, that of the n–dimensional generalization of the
concept of graph of a function of one variable. The following Theorem 10.26 illustrates the geometric
idea of an n–dimensional integral.
Theorem 10.26. Sets Γf and Gf , introduced in Definition 10.25, are measurable in Rn+1 . Moreover:
Z
`n+1 (Γf ) = f dx , `n+1 (Gf ) = 0 .
A
Combining Theorems 10.26, 8.13, it is possible to provide an alternative proof to the monotone con-
vergence Theorem 8.45.
Theorem 10.27. Let A ⊆ Rn be measurable, and consider a sequence (fp )p∈N of real non–negative
measurable functions on A , such that, for almost any x ∈ A :
Proof. Following Definition 10.25, consider the subgraphs Γf and Γfp . From the monotonicity hy-
pothesis in point (i) of (10.33), it follows that Γfp ⊆ Γfp+1 . Hence, the family of sets Γfp p∈N is a
nested increasing family and, due to point (ii) of (10.33), it is esaustive, that is:
∞
[
Γf = Γfp .
p=1
10.3.1 Exercises
(x , y) ∈ R2 | x2 ≤ y ≤ x . Show that:
1. Consider A :=
ZZ
1 π
2
dx dy = √ − 2 .
A y−x −1 3
2. Let A = (x , y) ∈ R2 | 0 ≤ x ≤ 1 , x2 ≤ y ≤ x + 1 . Show that:
ZZ
5
x y dx dy = .
A 8
(x , y) ∈ R2 | x4 ≤ y ≤ x 2
5. Let A = . Show that:
ZZ
1
x y dx dy = .
A 30
R = { A1 × A2 | A1 ∈ A1 , A2 ∈ A2 } .
C = { A1 × Ω2 | A1 ∈ A1 } ∩ { Ω1 × A2 | A2 ∈ A2 } ;
(ii) A1 × A2 is the minimum σ–algebra such that the following projections are measurable:
For applications, the most relevant situation occurs when Ω1 = Ω2 = R and A1 = A2 = B is the
Borel σ–algebra on R . Borel sets, in the plane, can be generated in two equivalent ways.
R = { B1 × B2 | B1 , B2 ∈ B } and I = { I1 × I2 | I1 , I2 intervals }
Having defined the product σ–algebra, we have to introduce the product measure, and we must do so
in a consistent manner, that follows the contribution of Fubini to Measure theory, in the case of the
Euclidean space Rn .
To build the product measure, we need to work with σ–finite measure spaces (Ω1 , A1 , µ1 ) and
(Ω2 , A2 , µ2 ) . Note that this is the case of Lebesgue measure and of Probability measures. We denote
by µ the product measure defined on the product σ–algebra A1 × A2 . As a first construction step,
let us impose the ‘natural’ condition:
The task of assigning a measure to non–rectangular sets requires the notion of section of a subset A
of Ω1 × Ω2 .
Definition 10.31. Consider A ⊂ Ω1 × Ω2 and let ω2 ∈ Ω2 . Then, the section of foot ω2 is the subset
of Ω1 defined, and denoted, as:
Aω2 = { ω1 ∈ Ω1 | (ω1 , ω2 ) ∈ A } .
Aω1 = { ω2 ∈ Ω2 | (ω1 , ω2 ) ∈ A } .
10.4. PRODUCT OF σ–ALGEBRAS 179
We can now state Theorem 10.32, which concerns the measurability of sections, of a measurable set,
for the product measure.
Theorem 10.32. Let A ∈ A1 × A2 . Then, Aω2 ∈ A1 for any ω2 ∈ Ω2 , and Aω1 ∈ A2 for any
ω1 ∈ Ω1 .
Theorem 10.34. Consider the product σ–algebra A1 × A2 . Let µ1 , µ2 be σ–finite measures. Then,
the functions:
ω2 7→ µ1 (Aω2 ) and ω1 7→ µ2 (Aω1 )
are measurable with respect to A2 and A1 , respectively. Furthermore, it holds:
Z Z
µ2 (Aω1 ) dµ1 (ω1 ) = µ1 (Aω2 ) dµ2 (ω2 ) .
Ω1 Ω2
Theorem 10.35. The set function µ introduced in (10.34) is a measure. Moreover, µ is unique,
since any other measure that coincides with µ on rectangles, is equal to µ on the product σ–algebra
A1 × A 2 .
We are, finally, in the position to state the Fubini Theorem 10.36 for nested integrals.
Legendre3 introduced the letter Γ(x) to denote (11.2), and he modified its representation as follows:
Z ∞
Γ(x) = e−t tx−1 dt , x > 0. (11.3)
0
Observe that (11.2) and (11.3) imply the equality Γ(x + 1) = e(x) . In fact, the change of variable
t = − ln u , in the Legendre integral Γ(x + 1) , yields:
1
1 x 1
Z
ln u
Γ(x + 1) = e ln du = e(x) .
0 u u
1
James Stirling (1692–1770), Scottish mathematician.
2
Christian Goldbach (1690–1764),German mathematician.
3
Adrien Marie Legendre (1752–1833), French mathematician.
181
182 CHAPTER 11. GAMMA AND BETA FUNCTIONS
These aspects are treated with the maximal generality in the Bohr–Mollerup4 Theorems 11.1–11.2.
Note that Γ(x) appears in many formulæ of Mathematical Analysis, Physics and Mathematical Statis-
tics.
Theorem 11.1. For any x > 0 , the recursion relation (11.5) is true. In particular, when x = n ∈ N ,
then (11.4) holds.
When x > 0 , function Γ(x) is continuous and differentiable at any order. To evaluate its derivatives,
we use the differentiation of parametric integrals, obtaining:
Z ∞
0
Γ (x) = e−t tx−1 ln t dt , (11.6a)
0
Z ∞
Γ(2) (x) = e−t tx−1 (ln t)2 dt , (11.6b)
0
From its definition, Γ(x) is strictly positive. Moreover, since (11.6b) shows that Γ(2) (x) ≥ 0 , it follows
that Γ(x) is also strictly convex. We thus infer the existence for the following couple of limits:
To evaluate `∞ , since we know, a priori, that such limit exists, we can restrict the focus to natural
numbers, so that we have immediately:
Observing that Γ(2) = Γ(1) and using Rolle Theorem5 , we see that there exists ξ ∈ ] 1 , 2 [ such that
Γ0 (ξ) = 0 . On the other hand, since Γ(2) (x) > 0 , the first derivative Γ0 (x) is strictly increasing, thus
there is a unique ξ such that Γ0 (ξ) = 0 . Furthermore, we have that 0 < x < ξ =⇒ Γ0 (x) < 0
and x > ξ =⇒ Γ0 (x) > 0 . This means that ξ represents the absolute minimum for Γ(x) , when
x ∈ ] 0 , ∞ [ . The numerical determination of ξ and Γ(ξ) is due to Legendre and Gauss6 :
Proof. Recall the Schwarz inequality7 for functions whose second power is summable:
Z ∞ 2 Z ∞ Z ∞
f (t) g(t) dt ≤ f 2 (t) dt · g 2 (t) dt .
0 0 0
If we take:
t x−1
f (t) = e− 2 t 2 , g(x) = f (x) ln t ,
recalling (11.3), (11.6a) and (11.6b), we find the inequality:
2
Γ0 (x) ≤ Γ(x) Γ(2) (x) .
in particular:
1 1
Γ − = −2 Γ .
2 2
When x ∈ ] − 2 , −1 [ , the evaluation is:
Γ(2 − x)
Γ(x) = ,
(x + 1) x
thus, in particular:
3 4 1
Γ − = Γ .
2 3 2
In other words, Γ(x) is defined on the real line, except the singular points x = 0 , −1 , −2 , · · · , and
so on, as shown in Figure 11.1.
y
x
-4 -3 -2 -1 1 Ξ 2 3 4 5
Notice that the change of variable t = 1 − s provides, immediately, the symmetry relation B(x , y) =
B(y , x) in (11.10), which yields:
Z π/2
B(x , y) = 2 cos2 x−1 ϑ sin2 y−1 ϑ dϑ . (11.10a)
0
The main property of the Beta function is its relationship with the Gamma function, as expressed by
Theorem 11.4 below.
Γ(x) Γ(y)
B(x , y) = . (11.11)
Γ(x + y)
11.2. BETA FUNCTION 185
Proof. From the usual definition (11.3) for the Gamma function, after the change of variable t = u2 ,
we have: Z +∞
2
Γ(x) = 2 u2 x−1 e−u du .
0
In the same way: Z +∞
2
Γ(y) = 2 v 2 y−1 e−v dv .
0
Now, we form the product of the last two integrals above, and we use Fubini Theorem 10.4, obtaining:
ZZ
2 2
Γ(x) Γ(y) = 4 u2 x−1 v 2 y−1 e−(u +v ) du dv .
[0 ,+∞)×[0 ,+∞)
At this point, we change variable in the double integral, using polar coordinates:
(
u = ρ cos ϑ ,
v = ρ sin ϑ ,
Remark 11.5. Theorem 11.4 can be shown also using, again, Fubini Theorem 10.4, starting from:
Z ∞Z ∞
Γ(x) Γ(y) = e−(t+s) tx−1 sy−1 dt ds ,
0 0
1
11.2.1 Γ 2
and the probability integral
1
Using (11.11), we can evaluate Γ and, then, the probability integral (8.36). In fact, by taking
2
x = z and y = 1 − z , where 0 < z < 1 , we obtain:
Z 1 Z 1
z−1 −z t z 1
Γ(z) Γ(1 − z) = B(z , 1 − z) = t (1 − t) dt = dt .
0 0 1−t t
Employing (11.13), it is also possible to evaluate, once more, the probability integral (8.36). Evaluating
(11.3) at x = 1/2 , and then setting t = x2 , we find, in fact:
∞ ∞
e−t
Z Z
1 2
Γ = √ dt = 2 e−x dx ,
2 0 t 0
The change of variable s = t (1 − t)−1 , used in (11.10), provides an alternative representation for the
Beta function: Z ∞
sx−1
B(x , y) = ds . (11.10b)
0 (1 + s)x+y
Setting s = t (1 − t)−1 , in fact, leads to:
Z 1 Z ∞
B(x , y) = t x−1
(1 − t) y−1
dt = sx−1 (1 + s)−x+1 (1 + s)−2 ds
0 0
∞
sx−1
Z
= ds .
0 (1 + s)x+y
The following representation Theorem 11.6 decribes the family of integrals related to the Beta function.
Observe that, with the change of variable 2 t = u in J , it follows that I = J . Observe, further, that:
1 1 1
I = B x+ , ,
2 2 2
Z π
2
2x 2x−1 1 1
J= (2 sin t cos t) dt = 2 B x + ,x + .
0 2 2
Hence:
1 1 1 1 1
B x+ , = 22 x−1 B x + , x + . (11.17)
2 2 2 2 2
Formula (11.16) is generalised by Gauss multiplication Theorem 11.8, which we present without proof.
m−1
Y k
Γ(x) Γ x+ = m1/2−m x (2 π)(m−1)/2 Γ(m x) . (11.18)
m
k=1
π
Γ(x) Γ(1 − x) = . (11.19)
sin(π x)
From the reflexion formula (11.19), it follows immediately the computation of integral (11.20).
∞
ux−1
Z
π
du = . (11.20)
0 1+u sin(π x)
The same argument followed to demonstrate Theorem 11.11 can be applied to prove the following
Theorem 11.12, thus, we leave it as an exercise.
Theorem 11.12. If 2 a < n , then:
Z 1
xa−1 a Z ∞ z a−1
√ dx = cos π √ dz . (11.23)
0 1 − xn n 0 1 + zn
1
Proof. The change of variable 1 + xn = is employed in the left hand–side integral of (11.25), and
t
1 1 1
therefore dx = − t− n −1 (1 − t) n −1 dt :
n
Z ∞
1 1 −1 1 1 (1− 1 )−1
Z Z
dx 1
−1 1
n
= t n (1 − t) n dt = t n (1 − t) n −1 dt
0 1+x n 0 n 0
1 1 1 1 1 1
= B 1− , = Γ 1− Γ .
n n n n n n
Thesis (11.25) follows from reflexion formula (11.19). Formula (11.26) also follows, using an analogous
argument.
1 1 3 1
The change of variable 1 + x2 = , that is, dx = − t− 2 (1 − t)− 2 dt , leads to:
t 2
Z ∞ √
dx 1 1 π 1
= B n− , = Γ n− .
−∞ (1 + x2 )n 2 2 (n − 1)! 2
The assumption we made for the parameter b ensures summability. Hence, exploiting Fubini Theorem
10.4, the integration can be performed regardless of the order. Let us integrate, first, with respect to
x : Z +∞ Z +∞
p−1 −x y
I(b , p) = y sin(b x) e dx dy .
0 0
In this way, the integral above turns out to be an elementary one, since:
Z +∞
b
sin(b x) e−x y dx = 2 ,
0 b + y2
and then:
+∞
y p−1
Z
I(b , p) = b dy ,
0 b2 + y 2
for which, employing the change of variable t = yb , we obtain:
∞
tp−1
Z
I(b , p) = b p−1 dt .
0 1 + t2
The latest formula allows to use identity (11.26) and complete the first computation:
π b p−1
I(b , p) = . (11.31)
2 sin p π2
The inner integral is immediately evaluated, in terms of the Gamma function, setting u = x y :
Z +∞ Z ∞
p−1 −x y 1 1
y e dy = p u p−1 e−u du = p Γ(p) . (11.32)
0 x 0 x
Equating (11.32) and (11.31) leads to:
∞
b p−1 π
Z
sin(b x)
Γ(p) dx = ,
0 xp 2 sin p π2
The right hand–side integral, above, has the form (11.30), with b = 1 and p = 2 − 1q , therefore:
Z ∞
sin xq π
q
dx = .
0 x
1
1 π
2 q Γ 2 − q sin 2 − q 2
we arrive at:
1
Z ∞
sin xq Γ q sin πq
dx = .
0 xq 2 (q − 1) sin π
2q
We now show, employing again reversal integration, a cosine relation similar to (11.30), namely:
Z ∞
cos(b x) π b p−1
dx = , (11.34)
0 xp 2 Γ(p) cos p π2
where we must assume that 0 < p < 1 , to ensure convergence of the integral, due to the singularity
in the origin. To prove (11.34), we consider the double integral:
Z ∞Z ∞
cos(b x) y p−1 e−x y dx dy ,
0 0
from which we show that (11.34) can be reached, via the Fubini Theorem 10.4, regardless of the order
of integration. The starting point is, then, the equality:
Z ∞ Z ∞ Z ∞ Z ∞
p−1 −x y p−1 −x y
cos(b x) y e dy dx = y cos(b x) e dx dy . (11.35)
0 0 0 0
192 CHAPTER 11. GAMMA AND BETA FUNCTIONS
The last integral above is in the form (11.26). Therefore, the right hand–side integral of (11.35) turns
out to be: Z ∞
yp π b p−1 π b p−1
dy = = . (11.36)
b2 + y 2 2 cos p2 π
0 2 sin p+1 π 2
The inner integral in the left hand–side of (11.35) is given by (11.32). Hence, the left hand–side integral
of (11.35) is: Z ∞
cos(b x)
Γ(p) dx . (11.37)
0 xp
Equating (11.36) and (11.37) leads to (11.34).
There is a further, very interesting consequence of equation (11.30) leading to the evaluation of the
Fresnel8 integrals (11.38), that hold for b > 0 and k > 1 :
Z ∞
k 1 1 π
sin(b x ) dx = 1 Γ sin . (11.38)
0 k bk k 2k
To prove (11.38), we start from considering its left hand–side integral, inserting in it the change of
xk
variable u = xk , i.e., du = k xk−1 dx = k dx , thus:
x
Z ∞ Z ∞ 1
1 ∞ sin(b u)
Z
k 1 uk
sin(b x ) dx = sin(b u) du = 1 du .
0 0 k u k 0 u1− k
1
The latest integral above is in the form (11.30), with p = 1 − k . Hence:
1
∞
π b− k
Z
sin(b u)
1 du = , (11.39)
u1− k 2 Γ 1 − k1 sin 1 − k1 π
0
2
and, thus: Z ∞
π
sin(b xk ) dx = 1
1 π π
0 2 k b Γ 1−
k
k sin 2 − 2k
At this point, employing the reflection formula (11.19):
1 1 π
Γ Γ 1− = ,
k k sin πk
we obtain (11.38).
8
Augustin–Jean Fresnel (1788–1827), French civil engineer and physicist.
11.4. DOUBLE INTEGRATION TECHNIQUES 193
In (11.38), the particular choices b = 1 and k = 2 correspond to the sine Fresnel integral:
Z ∞ π rπ
2 1 1
sin(x ) dx = Γ sin = . (11.40)
0 2 2 4 8
Exploiting the same technique that produced (11.38) from (11.30), it is possible to derive the cosinus
analogue (11.41) of Fresnel integrals, that hold for b > 0 and k > 1 :
Z ∞
k 1 1 π
cos(b x ) dx = 1 Γ cos . (11.41)
0 k bk k 2k
To prove (11.41), the starting point is (11.34). Then, as in the sine case, we introduce the change of
variable u = xk , we choose p = 1 − k1 , and, via calculations similar to those performed in the sine
case, we arrive at: Z ∞
π
cos(b xk ) dx = 1 .
2 k b k Γ 1 − k cos π2 − 2πk
1
0
A brief presentation of the Fourier transform is provided here, limited to those of its aspects that come
handy while integrating a particular partial differential equation, namely, the heat equation. This latest
one constitutes the main tool in solving the Black-Scholes equation, which is of great importance in
Quantitative Finance. This Chapter 12 is strongly inspired by [44].
A sufficient condition for the existence of the Fourier1 integral is f ∈ L∞ (R) . In any Fourier transform,
one value is immediate to compute, namely that corresponding to s = 0 :
Z +∞
Ff (0) := f (t) dt .
−∞
F(F −1 g) = g , F −1 (Ff ) = f .
Remark 12.3. A standard definition of the Fourier transform does not exist: the definition presented
here is not unique. The reason for non–uniqueness is related to the position of the 2 π quantity: it
might be part of the exponential, as in (12.1), or it might be an external multiplicative factor, or it
might be at all missing. There is also a question on which is the Fourier transform and which is its
inverse, that is, where to set the minus sign in the exponential.
Various conventions are in common use, according to each particular study branch, and we provide a
summary of such conventions, following [35]. Let us consider the general definition:
Z +∞
1
Ff (s) = ei B s t f (t) dt .
A −∞
The most common choices, found in practice, are the following pairs:
√
A = 2π, B = ±1 ;
A = 1, B = ±2 π ;
A = 1, B = ±1 .
1
Jean Baptiste Joseph Fourier (1768–1830), French mathematician and physicist.
195
196 CHAPTER 12. FOURIER TRANSFORM ON THE REAL LINE
Our choice (12.1). corresponds to A = 1 , B = −2 π . In computer algebra systems like, for instance,
Mathematica® , the Fourier transform is implemented as:
s Z ∞
|b|
Fa ,b f (s) = ei b s t f (t) dt .
(2 π)1−a −∞
Some results for the Fourier transform are now stated, starting with the Riemann–Lebesgue Theorem
12.4, whose proof is omitted for brevity.
Theorem 12.4 (Riemann–Lebesgue). If f ∈ L1 (R) then:
lim F(s) = 0 .
|s|→∞
The following Plancherel Theorem 12.5 plays a key role in establishing the Fourier transform property
in L2 (R) .
Theorem 12.5 (Plancherel). Consider f ∈ L1 (R) ∩ L2 (R) . We have that Ff ∈ L2 (R) , and:
Z ∞ Z ∞
|f (t)|2 dt = |Ff (s)|2 ds .
−∞ −∞
12.1.1 Examples
Here, some examples are provided, on the computation of the Fourier transform. Before them, let us
state some remarks.
Remark 12.7. For each even function f (t) = f (−t) , the following relation holds:
Z +∞ Z +∞
Ff (s) = cos(2 π s t) f (t) dt = 2 cos(2 π s t) f (t) dt , (12.3)
−∞ 0
Example 12.8. The triangle function. Consider the triangle function, defined by Λ(x) = max{ 1−
|x| , 0 } , which is equivalent to the explicit expression:
(
1 − |x| for |x| ≤ 1 ,
Λ(x) =
0 otherwise.
1.0
0.8
0.6
0.4
0.2
To compute the Fourier transform, using the fact that the sine function is odd, we evaluate:
Z +∞
FΛ(s) = e−2 π i s t Λ(t) dt
−∞
Z1
= cos(2 π s t) − i sin(2 π s t) (1 − |t|) dt
−1
Z1
= cos(2 π s t) (1 − |t|) dt .
−1
sin(π s) 2
FΛ(s) = .
πs
Z +∞
2
Ff (s) = e−2 π i s t e−π t dt .
−∞
Since: Z +∞
2
Ff (0) = e−π t dt = 1
−∞
it finally follows:
2
Ff (s) = e−π s .
2
We have found the remarkable fact that the Gaussian f (t) = e−π t is equal to its own Fourier
transform, that is, the Gaussian function is a fixed point for the Fourier transform.
Remark 12.11. The Fourier transform of the Gaussian can also be evaluated with a different method,
namely, the square completion of the exponent, which we now illustrate in detail. Form the Fourier
transform for the Gaussian function, according to (12.1):
Z +∞ Z +∞
2 2)
Ff (s) = e−2 π i s t e−π t dt = e−π (2 i s t+ t dt .
−∞ −∞
−π 2 i s t + t2 = −π (−s2 + 2 i s t + t2 + s2 ) = −π s2 − π (t + i s)2 .
Thus Z +∞
−π s2 2
Ff (s) = e e−π (t+i s) dt .
−∞
√
Employ the change of variable π (t + i s) = τ , so that:
2
e−π s +∞
Z
2
Ff (s) = √ e−τ dτ .
π −∞
Hence:
2
Ff (s) = e−π s .
Example 12.12. We follow an ingenious method presented in [43] (on pages 79–82 and leading, there,
to integral (3.1.7)), to evaluate the Fourier transform of:
1
f (t) = ,
b2 + t2
being b a positive parameter.
From Definition (12.1), simplified into (12.3) since the given f is even, we have:
Z ∞
cos(2 π s t)
Ff (s) = 2 dt .
0 b2 + t2
12.2.1 Linearity
One of the simplest, and most frequently invoked properties of the Fourier transform is that it is a
linear operator. This means:
F(f + g)(s) = Ff (s) + Fg(s)
F(α f )(s) = α Ff (s) .
where α is any real or complex number.
200 CHAPTER 12. FOURIER TRANSFORM ON THE REAL LINE
When a < 0 , the limits of integration are reversed, when we insert the substitution u = a t , thuse,
the resulting transform is:
1 s
Fa f (s) = − Ff .
a a
Since −a is positive when a is negative, we can combine the two cases and present the Stretch
Theorem as, assuming a 6= 0:
1 s
Fa f (s) = Ff .
|a| a
For instance, recalling Example 12.9, the Fourier transform of g(t) = e−a |t| , with a positive constant,
is:
2a
Fg(s) = 2 .
a + 4 π 2 s2
1 t2
f (t) = √ e− 2 σ 2 ,
σ 2π
its Fourier transform is:
2 σ 2 s2
Ff (s) = e−2π
2 2
We know that, if g(t) = e−π t , then Ff (s) = e−π s . Moreover, using the Stretch Theorem:
1 s
Fa g(s) = Fg .
|a| a
12.3. CONVOLUTION 201
1
t2
Since we want to find a such that g(a t) = e− 2 σ2 , the following relation must hold:
1 2
−π a2 t2 = − t ,
2σ 2
that is:
1
a= √ .
σ 2π
From relation:
1 s
Fa g(s) = Fg ,
|a| a
it finally follows: √ 2
Fa g(s) = σ 2 π e−π s 2πσ
.
12.3 Convolution
Convolution is an operation which combines two functions, f , g , producing a third function; as shown
in (12.9), it is defined as the integral of the pointwise multiplication of f and g , and it has independent
variable given by the amount by which either f or g is translated. Convolution finds applications
that include Probability, Statistics, computer vision, natural language processing, image and signal
processing, Engineering, and differential equations. Our interest, here, is in the last application.
Definition 12.14. The convolution of two functions g(t) and f (t) , both defined on the entire real
line, is the function defined as:
Z +∞
(g ? f )(t) = g(t − x) f (x) dx . (12.9)
−∞
Remark 12.15. Consider the case in which functions g(t) and f (t) are supported only on [0 , ∞) ,
that is, they are zero for negative arguments. Then, the integration limits can be truncated, and the
convolution is: Z t
(g ? f )(t) = g(t − x) f (x) dx . (12.10)
0
(g ? f )(t) = (f ? g)(t) .
This follows from a simple change of variabile, namely t − x = u , in which x and u play the role of
integration variables, while t acts as a parameter:
Z ∞ Z −∞
(g ? f )(t) = g(t − x) f (x) dx = g(u) f (t − u) (−du) = (f ? g)(t) .
−∞ ∞
Some practical examples of convolution of functions are now provided, in particular to show that this
concept is of great importance in the Algebra of random variables. Let us begin with the convolution
of two power function.
Example 12.17. (Convolution of powers) Consider g , f : [0 , +∞) → R respectively defined by
g(x) = xa , f (x) = xb , with a , b > 0 . Then:
Z t Z t
(g ? f )(t) = ta ? tb = g(t − x) f (x) dx = (t − x)a xb dx .
0 0
The last integral above can be computed via the Beta function: to see this, let us rewrite it as:
Z t Z t
a b a x a b
(t − x) x dx = t 1− x dx ,
0 0 t
202 CHAPTER 12. FOURIER TRANSFORM ON THE REAL LINE
so that, after the change of variable x = t u , we recognise the Eulerian integral (11.10):
Z t Z 1
x a b
t a
1− x dx = t a
(1 − u)a tb ub t du = ta+b+1 B(a + 1, b + 1) .
0 t 0
In terms of Gamma functions, recalling relation (11.11):
Γ(a + 1) Γ(b + 1)
ta ? tb = ta+b+1 . (12.11)
Γ(a + b + 2)
Formula (12.11) is more expressive, and easy to remember, rewritten as:
ta tb ta+b+1
? = ,
Γ(a + 1) Γ(b + 1) Γ(a + b + 2)
which, when a = n ∈ N and b = m ∈ N , becomes:
tn tm tn+m+1
? = .
n! m! (n + m + 1)!
Convolution behaves nicely with respect to the Fourier transform, as shown by the following result,
known as Convolution Theorem for Fourier transform.
Theorem 12.19. If g(t) and f (t) are both summable, then:
F(g ? f )(s) = Fg(s) Ff (s) . (12.12)
Proof. The proof is straightforward and uses Fubini Theorem 10.4, that is the why we assume summa-
bility of both functions. We can then write:
Z ∞
F(g ? f )(s) = (g ? f )(t) e−2 π i s t dt
−∞
Z ∞ Z ∞
−2 π i s t
= g(t − x) f (x) e dx dt
−∞ −∞
Z ∞ Z ∞
−2 π i s t
= g(t − x) f (x) e dt dx
−∞ −∞
Z ∞ Z ∞
−2 π i s t
= f (x) g(t − x) e dt dx .
−∞ −∞
With the change of variable t − x = u , so that dt = du , we obtain:
Z ∞ Z ∞
−2 π i s (u+x)
F(g ? f )(s) = f (x) g(u) e du dx
−∞ −∞
Z ∞ Z ∞
−2 π i s x −2 π i s u
= f (x) e g(u) e du dx
−∞ −∞
Z ∞
= Fg(s) f (x) e−2 π i s x dx = Fg(s) Ff (s) ,
−∞
and this completes the proof.
12.4. LINEAR ORDINARY DIFFERENTIAL EQUATIONS 203
Remark 12.20. Another interesting property, owned by the convolution of Fourier transforms, is:
where we have to assume that functions f , g ∈ L1 (R) are such that their product is also f g ∈ L1 (R) .
Proof. To obtain formula (12.13), it suffices to evaluate the Fourier transform of f 0 (t) :
Z +∞
0
Ff (s) = e−2 π i s t f 0 (t) dt ,
−∞
Note that differentiation is transformed into multiplication: this represents another remarkable feature
of the Fourier transform, providing one more reason for its usefulness.
Formulæ for higher derivatives also hold, and the relevant result follows by mathematical induction:
The derivative Theorem 12.21. is useful for solving linear ordinary, and partial, differential equations.
Example 12.22 illustrates its use with an ordinary differential equation.
u00 − u = −f ,
where f (t) is a given function. The problem consists in finding u(t) . Form the Fourier transform of
both sides of the stated equation:
(2 π i s)2 Fu − Fu = −Ff ,
12.5 Exercises
1. Consider the two functions f , g : [0, +∞) → R , respectively defined as f (t) = sin t e−t , g(t) =
cos t e−t . Show that:
1
(g ? f )(t) = t e−t sin t .
2
2. Consider the two functions f , g : [0, +∞) → R , respectively defined as f (t) = sin t e−2 t , g(t) =
cos t e−2 t . Show that:
1
(g ? f )(t) = t e−2 t sin t .
2
2 −t−1
3. Show that the Fourier transform of the function f (t) = e−t is:
√ 2 s2 3
Ff (s) = π e−π −iπs− 4 .
13 Parabolic equations
The Fourier transform method of Chapter 12 is used here, among other methods, to solve partial
differential equations of parabolic type, which are of fundamental importance in Mathematical Finance.
The exposition presented in this Chapter 13 exploits the material contained in various references;
in particular, we refer to [14], Chapter 6 of [1], Chapter 4 of [56], § 2.4 of [8], Chapter 6 of [58],
Chapter 4 of [22], and [54].
ux + uy = 0 .
u = φ(x − y) ,
where φ is any function, of x − y , having continuous first-order partial derivatives. Indeed, since:
ux = φ0 (x − y) and uy = −φ0 (x − y) ,
it immediately follows:
ux + uy = φ0 (x − y) − φ0 (x − y) = 0 .
205
206 CHAPTER 13. PARABOLIC EQUATIONS
It is useful, here, to introduce the so–called Laplacian operator ∇2 u , for a two–variable function
u ∈ C 2 , defined as:
∇2 u = uxx + uyy . (13.6)
Given u , v ∈ C 2 , we can apply (13.6), recalling the nabla definition in Theorem 3.8, and the dot
(inner) product introduced in Definition 1.1, to show that:
∇2 (u v) = ∇2 u v + 2 ∇u · ∇v + u ∇2 v .
where both f (x) and u(x , t) are defined for x ∈ R , that is, −∞ < x < +∞ . The parabolic partial
differential equation in (13.9), that is:
ut = uxx , (13.10)
is sometimes called diffusion equation, given its fundamental connections with Brownian motion, i.e.,
the limit of a random walk, which is a mathematical formalization of a path that contains random
steps, and which turns out to be linked to the heat equation [36].
Remark 13.5. Generality is not affected by considering problem (13.9) instead of the following
problem (13.11), in which c 6= 1 and x ∈ R :
ut (x , t) = c uxx (x , t) , for t > 0 ,
(13.11)
u(x , 0) = f (x) .
t
To see it, let us assume that u(x , t) solves ut = c uxx , and introduce w(x , t) = u x , . Then, w
c
solves wt = wxx , since it verifies:
1 t 1 t t
wt (x , t) = ut x , = c uxx x , = uxx x , = wxx (x , t) .
c c c c c
This means that, without any loss of generality, we can assume c = 1 , and deal only with (13.9).
The first step towards solving the Cauchy problem (13.9) is to establish a uniqueness result for its
solutions. To such an aim, some assumptions are needed, as stated in the following Energy Theorem
13.6.
208 CHAPTER 13. PARABOLIC EQUATIONS
Theorem 13.6. Problem (13.9) admits solution u ∈ C 2 R , [0 , ∞) , which is unique and satisfies the
asymptotic condition:
lim ux (x , t) = 0 . (13.12)
|x|→∞
Proof. Assume that function u(x , t) satisfies the differential equation in (13.9). Define the function:
Z ∞
1
W (t) = u2 (x , t) dx .
2 −∞
W (t) is called the energy of the solution to the heat equation in (13.9). Now, assume that there exist
two solutions to problem (13.9), say, r(x , t) and s(x , t) , and assume that both r and s satisfy
condition (13.12). At this point, if we define u to be the function u(x , t) = r(x , t) − s(x , t) , since
the heat equation in (13.9) is linear, we see that u solves, for x ∈ R :
ut (x , t) = uxx (x , t) , for t > 0 ,
u(x , 0) = 0 , (13.13)
lim ux (x , t) = 0 .
|x|→∞
We aim to prove that the unique solution to (13.13) is the zero function, since such a result implies
the thesis of Theorem 13.6. Now, condition (13.12) allows to differentiate the energy function W (t) :
Z ∞ Z ∞
0
W (t) = u(x , t) ut (x , t) dx = u(x , t) uxx (x , t) dx .
−∞ −∞
In the last equality step, above, we used the assumption that u(x , t) solves the heat equation in
(13.9). Integrating by parts, and using condition (13.12), leads to:
Z ∞
W 0 (t) = − u2x (x , t) dx ≤ 0 ,
−∞
which means that W (t) is a decreasing function. On the other hand, we know, a priori, that W (t) ≥ 0 .
Evaluating W (0) = 0 , we see that it must be W (t) = 0 for any t > 0 . Hence, it must also be
u(x , t) = 0 for any t > 0 , which shows that solutions r(x, t) and s(x, t) are equal.
It is possible to state a deeper result then Theorem 13.6, but its proof require tools that go beyond
the undergraduate curriculum in Economics and Management; hence, we just state such a result in
Theorem 13.7, and refer to Theorem 4.4 of [22] for its proof. We remark that hypothesis (13.12) is
replaced by conditions (13.14)–(13.15).
If, for any ε > 0 , there exists C > 0 such that, for any (x , t) ∈ R × [0 , ∞) :
2
|u(x , t)| ≤ C eε x , (13.14)
x2
|ux (x , t)| ≤ C eε , (13.15)
If u(x , t) solves (13.10), then function (x , t) 7→ u(x − y , t) also solves (13.10), for any y .
If u(x , t) solves (13.10), then any of its derivatives, ut , ux , utt , · · · etcetera, is also a solution to
(13.10).
Scale invariance – If u(x , t) solves (13.10), then (x , t) 7→ u(α x , α2 t) is a function that also solves
(13.10), for any α > 0 .
The scale invariant property provides an educated guess on the structure of the solutions to (13.10);
namely, we look for solutions of the form:
x
u(x , t) = w √ , (13.16)
t
being w = w(r) a differentiable real function of a real variable. Differentiating (13.16), we obtain:
x 0 x 1 00 x
ut (x , t) = − 3/2 w √ , uxx (x , t) = w √ .
2t t t t
Equating ut (x , t) and uxx (x , t) yields a linear second–order differential equation, with variable co-
x
efficients, in the unknown w = w(r) , where r := √ :
t
r 0
w00 (r) = − w (r) ,
2
which can be integrated via separation of variables, in the unknown w0 (r) :
r2
w0 (r) = w0 (0) e− 4 . (13.17)
being k1 and k2 arbitrary constants of integration. Due to its construction, function (13.18) solves
(13.10) for t > 0 , and so does, by Remarks 13.8, its partial derivative, with respect to x :
x2
∂u e− 4 t
h(x , t) := (x , t) = k1 √ . (13.19)
∂x t
To fix one solution, we choose the integration constant k1 such that:
Z ∞
h(x , t) dx = 1 .
−∞
210 CHAPTER 13. PARABOLIC EQUATIONS
Recalling the probability integral computation, illustrated in § 8.11.1 and § 11.2.1 and given by formula
(8.36), we take k1 = √14 π and obtain the particular solution to (13.10) known as heat kernel (or Green
function or fundamental solution for the heat equation) and given by the following Gaussian:
1 x2
H(x , t) = √ e− 4 t , (13.20)
4πt
This preliminary discussion is essential to understand the following integral transform approach to
solving the Cauchy problem (13.9). Starting from the Fourier transform of the Gaussian (see Examples
12.10 and 12.13), the idea is to compute the Fourier transform of both sides of the heat equation in
(13.9), with respect to x and thinking of t as a parameter. Following [22], we thus state Theorem 13.9,
namely, the analytical treatment of the heat equation based on Fourier transforms; we provide an
euristic proof for it, and refer to Theorem 4.3 of [22] for a more rigorous demonstration.
Theorem 13.9. Assume that f ∈ Lp , with 1 ≤ p ≤ ∞ . The solution to problem (13.9) is then given
by the following formula, which holds true on R × (0 , +∞) :
Z +∞ −(x−y)2
1 4t
u(x , t) = √ e f (y) dy , (13.21)
4πt −∞
or, equivalently, recalling the heat kernel (13.20), by the convolution formula:
Z ∞
u(x , t) = H(x − y , t) f (y) dy . (13.22)
−∞
Moreover, if f is bounded and continuous, then the solution (13.22) is continuous on R × (0 , +∞) .
Proof. The Fourier transform of the right hand–side uxx (x , t) in (13.10) is:
Remark 13.10. In the case of the (apparently) more general Cauchy problem (13.11), the solution
is:
Z +∞ −(x−y)2
1 4ct
u(x , t) = √ e f (y) dy . (13.24)
4 π c t −∞
Remark 13.11. Observe that H(x , t) vanishes very rapidly as |x| → ∞ . Then, the convolution
integral (13.22) is well defined, for t < T , if the following growth condition is fulfilled:
x2
4T
|f (x)| ≤ c e . (13.25)
0.8
0.6
0.4
0.2
-6 -4 -2 0 2 4 6
√ √
Remark 13.13. Via the change of variable y = x + 2 s t , i.e., dy = 2 t ds , solution (13.21), to
the Cauchy problem (13.9), can be written as:
Z ∞ √
1 2
u(x , t) = √ e−s f (x + 2 s t) ds . (13.26)
π −∞
Remark 13.14. If the initial value f (x) , appearing in problem (13.9), is an odd function, i.e.,
f (−x) = −f (x) for any x ∈ R , then (13.21) implies that solution u(x , t) is such that
The heat kernel has some interesting properties, as Proposition 13.15 outlines.
212 CHAPTER 13. PARABOLIC EQUATIONS
1
Limit (iii) is trivial if x = 0 ; when x 6= 0 , by the change of variable s = , and using L’Hospital
t
Rule 1 , we obtain:
√
1 −x2 s 1
lim √ e 4 t = lim = lim = 0.
t→0+ 4πt s→+∞ √ s x2 s→+∞ √ s x2
4π e 4 x2 sπ e 4
We now provide some examples on how to solve some heat equations for given initial value functions
f (x) . The first example, due to the nature of the f considered, has applications in Finance.
Example 13.16. Recall the notion of positive part of a function, introduced in Proposition 8.25, that is $f^+(x) := \max\{f(x)\,,0\}$, and consider the initial value problem:
$$\begin{cases} u_t = u_{xx}\,, & \text{for } t>0\,,\\ u(x,0) = x^+\,, \end{cases} \tag{13.27}$$
where $x \in \mathbb{R}$. From (13.26), the integrand $e^{-s^2}\left(x + 2s\sqrt{t}\right)^+$ vanishes for $s < -\frac{x}{2\sqrt{t}}$, so that:
$$u(x,t) = \frac{x}{\sqrt{\pi}} \int_{-\frac{x}{2\sqrt{t}}}^{\infty} e^{-s^2}\, ds + \frac{2\sqrt{t}}{\sqrt{\pi}} \int_{-\frac{x}{2\sqrt{t}}}^{\infty} s\; e^{-s^2}\, ds = \frac{x}{2}\left(1 + \operatorname{erf}\frac{x}{2\sqrt{t}}\right) + \sqrt{\frac{t}{\pi}}\; e^{-\frac{x^2}{4t}}\,,$$
where $\operatorname{erf}(z) := \frac{2}{\sqrt{\pi}} \int_0^z e^{-s^2}\, ds$ denotes the error function; as $t \to 0^+$, the right hand–side tends to $x^+$, as expected.

Example 13.17. Consider, now, the initial value problem:
$$\begin{cases} u_t = u_{xx}\,, & \text{for } t>0\,,\\ u(x,0) = x^2\,, \end{cases} \tag{13.28}$$
where $x \in \mathbb{R}$. Observe that the given initial value function $f(x) = x^2$ is even, thus the solution obeys Remark 13.14. Furthermore, from (13.26), we can write:
$$u(x,t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \left(x + 2 s \sqrt{t}\right)^2 ds\,.$$
¹ See, for example, mathworld.wolfram.com/LHospitalsRule.html
Now:
$$\left(x + 2 s \sqrt{t}\right)^2 = x^2 + 4\, x\, s \sqrt{t} + 4\, s^2 t\,.$$
Observe that $s \mapsto e^{-s^2}\, 4\, x\, s \sqrt{t}$ is an odd function of $s$; thus, by Remark 8.58:
$$u(x,t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \left(x^2 + 4 s^2 t\right) ds = \frac{x^2}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2}\, ds + \frac{4t}{\sqrt{\pi}} \int_{-\infty}^{\infty} s^2\, e^{-s^2}\, ds = x^2 + \frac{4t}{\sqrt{\pi}}\; c\,.$$
To find the value of the constant $c$, impose that the found family of functions $u(x,t)$ solves (13.28):
$$u_t = \frac{4}{\sqrt{\pi}}\; c\,,\qquad u_{xx} = 2\,,\qquad \Longrightarrow\qquad c = \frac{\sqrt{\pi}}{2}\,,$$
so that the solution to (13.28) is $u(x,t) = x^2 + 2t$.
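The same answer can be reproduced symbolically from formula (13.26); a minimal sketch, assuming SymPy is available:

    import sympy as sp

    x, s = sp.symbols('x s', real=True)
    t = sp.symbols('t', positive=True)

    # formula (13.26) with the initial value f(z) = z**2 of problem (13.28)
    f = lambda z: z**2
    integrand = sp.exp(-s**2) * f(x + 2*s*sp.sqrt(t)) / sp.sqrt(sp.pi)
    u = sp.integrate(integrand, (s, -sp.oo, sp.oo))
    print(sp.simplify(u))  # 2*t + x**2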
Example 13.18. This example is taken from [54] and consists in solving the Cauchy problem:
$$\begin{cases} u_t = u_{xx}\,, & x \in \mathbb{R}\,,\; t>0\,,\\ u(x,0) = \sin x\,, & x \in \mathbb{R}\,. \end{cases}$$
From (13.26) and the addition formula for the sine, since $s \mapsto e^{-s^2} \sin(2 s \sqrt{t})$ is an odd function of $s$ (Remark 8.58), we obtain:
$$u(x,t) = \frac{\sin x}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \cos(2 s \sqrt{t})\, ds\,.$$
In other words, the solution to the given Cauchy problem has the form:
$$u(x,t) = \frac{\sin x}{\sqrt{\pi}}\; y(t)\,,$$
where we set:
$$y(t) = \int_{-\infty}^{\infty} e^{-s^2} \cos(2 s \sqrt{t})\, ds\,.$$
Since equality $u_t = u_{xx}$ must hold, by computing:
$$u_t = \frac{\sin x}{\sqrt{\pi}}\; y'(t)\,,\qquad u_{xx} = -\frac{\sin x}{\sqrt{\pi}}\; y(t)\,,$$
we see that $y(t)$ solves the initial value problem, for ordinary differential equations:
$$\begin{cases} y'(t) = -y(t)\,,\\ y(0) = \sqrt{\pi}\,. \end{cases}$$
Thus:
$$y(t) = \sqrt{\pi}\; e^{-t}\,,$$
and then:
$$u(x,t) = e^{-t} \sin x\,.$$
Finally, statement (13.29) follows from the equality:
$$\frac{\sin x}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \cos(2 s \sqrt{t})\, ds = e^{-t} \sin x\,.$$
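The Gaussian–cosine integral just obtained can also be double-checked symbolically; a minimal sketch, assuming SymPy is available:

    import sympy as sp

    s = sp.symbols('s', real=True)
    t = sp.symbols('t', positive=True)
    I = sp.integrate(sp.exp(-s**2) * sp.cos(2*s*sp.sqrt(t)), (s, -sp.oo, sp.oo))
    print(sp.simplify(I / sp.sqrt(sp.pi)))  # exp(-t)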
Example 13.19. Solve the Cauchy problem for the heat equation:
$$\begin{cases} u_t = u_{xx}\,,\\ u(x,0) = x^2 + x\,. \end{cases}$$
Its solution is given by formula (13.26), with $f(x) = x^2 + x$, so that the integrand is:
$$e^{-s^2} \left(4 s^2 t + 4 s \sqrt{t}\, x + 2 s \sqrt{t} + x^2 + x\right).$$
The terms that are odd in $s$ integrate to zero, by Remark 8.58, while the remaining ones are treated as in Example 13.17; hence:
$$u(x,t) = x^2 + x + 2t\,.$$
Example 13.20. Here, we use some formulæ that follow from the probability integral (8.36), and that are extremely useful in many applications. Consider the initial value problem:
$$\begin{cases} u_t = u_{xx}\,, & x \in \mathbb{R}\,,\; t>0\,,\\ u(x,0) = x\, e^{x}\,, & x \in \mathbb{R}\,. \end{cases}$$
To obtain its solution $u(x,t)$, let us use formula (13.26), with $f(x) = x\, e^x$. Since:
$$f\!\left(x + 2 s \sqrt{t}\right) = 2 s \sqrt{t}\; e^{x}\, e^{2 s \sqrt{t}} + x\, e^{x}\, e^{2 s \sqrt{t}}\,,$$
then:
$$u(x,t) = \frac{1}{\sqrt{\pi}} \left( 2 \sqrt{t}\; e^{x} \int_{-\infty}^{\infty} s\, e^{-s^2 + 2\sqrt{t}\, s}\, ds + x\, e^{x} \int_{-\infty}^{\infty} e^{-s^2 + 2\sqrt{t}\, s}\, ds \right).$$
Now, formulæ (8.37)–(8.38), that are linked to the probability integral, yield, respectively:
$$\int_{-\infty}^{\infty} e^{-s^2 + 2\sqrt{t}\, s}\, ds = \sqrt{\pi}\; e^{t}\,,\qquad \int_{-\infty}^{\infty} s\, e^{-s^2 + 2\sqrt{t}\, s}\, ds = \sqrt{\pi}\, \sqrt{t}\; e^{t}\,.$$
In conclusion:
$$u(x,t) = \frac{1}{\sqrt{\pi}} \left( 2\sqrt{t}\; e^{x}\, \sqrt{\pi}\,\sqrt{t}\; e^{t} + x\, e^{x}\, \sqrt{\pi}\; e^{t} \right) = (x + 2t)\; e^{\,t+x}\,.$$
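Both Gaussian integrals above are easily confirmed symbolically; a minimal sketch, assuming SymPy is available:

    import sympy as sp

    s = sp.symbols('s', real=True)
    t = sp.symbols('t', positive=True)
    a = sp.sqrt(t)

    I0 = sp.integrate(sp.exp(-s**2 + 2*a*s), (s, -sp.oo, sp.oo))
    I1 = sp.integrate(s * sp.exp(-s**2 + 2*a*s), (s, -sp.oo, sp.oo))
    print(sp.simplify(I0))  # sqrt(pi)*exp(t)
    print(sp.simplify(I1))  # sqrt(pi)*sqrt(t)*exp(t)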
The convolution method also applies to initial–boundary value problems on the half line, such as:
$$\begin{cases} u_t = u_{xx}\,, & x>0\,,\; t>0\,,\\ u(0,t) = 0\,, & t>0\,,\\ u(x,0) = f(x)\,, \end{cases} \tag{13.30}$$
and
$$\begin{cases} u_t = u_{xx}\,, & x>0\,,\; t>0\,,\\ u_x(0,t) = 0\,, & t>0\,,\\ u(x,0) = f(x)\,. \end{cases} \tag{13.31}$$
To solve (13.30), we extend the initial data $f(x)$ to an odd function $f^o(x)$, called odd continuation of $f$, that is defined on the whole real axis as follows:
$$f^o(x) = \begin{cases} f(x) & \text{if } x \ge 0\,,\\ -f(-x) & \text{if } x < 0\,, \end{cases}$$
and, then, we write the solution to the Cauchy problem with initial data $f^o$, using the convolution formula (13.22):
$$\begin{aligned}
u(x,t) &= \int_{-\infty}^{\infty} H(x-y\,,t)\, f^o(y)\, dy\\
&= -\int_{-\infty}^{0} H(x-y\,,t)\, f(-y)\, dy + \int_{0}^{\infty} H(x-y\,,t)\, f(y)\, dy\\
&= \int_{0}^{\infty} \bigl( H(x-y\,,t) - H(x+y\,,t) \bigr)\, f(y)\, dy\\
&= \int_{0}^{\infty} H_1(x\,,y\,,t)\, f(y)\, dy\,,
\end{aligned}$$
where we set $H_1(x,y,t) := H(x-y\,,t) - H(x+y\,,t)$. Since $H_1(0,y,t) = 0$, the boundary condition in (13.30) is satisfied; problem (13.31) is treated analogously, via the even continuation of $f$, which leads to the kernel $H(x-y\,,t) + H(x+y\,,t)$.
13.3 Parabolic equations with constant coefficients

Consider the parabolic equation with constant coefficients:
$$v_t = v_{xx} + a\, v_x + b\, v\,, \tag{13.32}$$
where $a\,,b \in \mathbb{R}$. The following Theorem 13.21 shows that (13.32) can always be reduced to the heat equation, by a suitable change of variable.

Theorem 13.21. If $h(x,t)$ solves the heat equation $h_t = h_{xx}$, then the function:
$$v(x,t) = e^{\left(b - \frac{a^2}{4}\right) t}\; e^{-\frac{a x}{2}}\; h(x,t) \tag{13.33}$$
solves equation (13.32).
Proof. We seek a function that solves equation (13.32) and has the form:
$$v(x,t) = e^{\alpha t}\, e^{\beta x}\, h(x,t)\,, \tag{13.34}$$
where $h$ solves the heat equation $h_t = h_{xx}$. To do so, we impose that $v(x,t)$, above, is a solution to (13.32), finding $\alpha$ and $\beta$ accordingly. Let us compute:
$$v_t = e^{\alpha t} e^{\beta x} \left( \alpha\, h + h_t \right),\qquad v_x = e^{\alpha t} e^{\beta x} \left( \beta\, h + h_x \right),\qquad v_{xx} = e^{\alpha t} e^{\beta x} \left( \beta^2 h + 2\beta\, h_x + h_{xx} \right).$$
Substituting into (13.32) and collecting terms yields:
$$e^{\alpha t} e^{\beta x} \left( h_t - (b - \alpha + a \beta + \beta^2)\, h - (a + 2\beta)\, h_x - h_{xx} \right) = 0\,.$$
The hypothesis that $h$ solves the heat equation allows to consider the system:
$$\begin{cases} a + 2\beta = 0\,,\\ b - \alpha + a \beta + \beta^2 = 0\,, \end{cases}$$
whose solution is $\beta = -\dfrac{a}{2}$ and $\alpha = b - \dfrac{a^2}{4}$, which turns (13.34) into (13.33).
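Theorem 13.21 can also be verified symbolically, keeping $h$ as an unspecified function and imposing the heat equation at the end; a minimal sketch, assuming SymPy is available:

    import sympy as sp

    x, t, a, b = sp.symbols('x t a b', real=True)
    h = sp.Function('h')(x, t)

    # the function v of (13.33)
    v = sp.exp((b - a**2/4)*t) * sp.exp(-a*x/2) * h
    # residual of equation (13.32): v_t - v_xx - a v_x - b v
    res = sp.diff(v, t) - sp.diff(v, x, 2) - a*sp.diff(v, x) - b*v
    # impose that h solves the heat equation h_t = h_xx
    res = res.subs(sp.Derivative(h, t), sp.diff(h, x, 2))
    print(sp.simplify(res))  # 0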
The proof of Theorem 13.21 can be adapted to the case of a Cauchy problem.

Corollary 13.22. The Cauchy problem, defined for $x \in \mathbb{R}$ and $t \ge 0$:
$$\begin{cases} u_t(x,t) = u_{xx}(x,t) + a\, u_x(x,t) + b\, u(x,t)\,, & t>0\,,\\ u(x,0) = f(x)\,, \end{cases} \tag{13.35}$$
is solved by:
$$u(x,t) = e^{\left(b - \frac{a^2}{4}\right) t}\; e^{-\frac{a x}{2}}\; h(x,t)\,,$$
where $h(x,t)$ solves the Cauchy problem for the heat equation, given below:
$$\begin{cases} h_t(x,t) = h_{xx}(x,t)\,,\\ h(x,0) = e^{\frac{a x}{2}}\, f(x)\,. \end{cases}$$
Proof. The proof follows from combining Theorems 13.9 and 13.21.
Corollary 13.22 can be further generalized to the case in which the second derivative term is multiplied by $c \ne 1$.
Corollary 13.23. The Cauchy problem, defined for $x \in \mathbb{R}$ and $t \ge 0$:
$$\begin{cases} u_t(x,t) = c\, u_{xx}(x,t) + a\, u_x(x,t) + b\, u(x,t)\,, & t>0\,,\\ u(x,0) = f(x)\,, \end{cases} \tag{13.36}$$
is solved by:
$$u(x,t) = e^{\left(b - \frac{a^2}{4c}\right) t}\; e^{-\frac{a x}{2c}}\; h(x,t)\,,$$
where $h(x,t)$ solves the following Cauchy problem for the heat equation:
$$\begin{cases} h_t(x,t) = c\, h_{xx}(x,t)\,,\\ h(x,0) = e^{\frac{a x}{2c}}\, f(x)\,. \end{cases}$$
Example 13.24. Solve the Cauchy problem:
$$\begin{cases} u_t = u_{xx} - 2\, u_x\,,\\ u(x,0) = x\, e^{x}\,. \end{cases}$$
Observe that it is of the form (13.35), with $f(x) = x\, e^x$, $a = -2$, and $b = 0$. In order to use Corollary 13.22, we have to solve, first, the following Cauchy problem for the heat equation:
$$\begin{cases} h_t(x,t) = h_{xx}(x,t)\,,\\ h(x,0) = e^{\frac{a x}{2}}\, x\, e^{x}\,, \end{cases}\qquad\text{i.e.,}\qquad \begin{cases} h_t(x,t) = h_{xx}(x,t)\,,\\ h(x,0) = x\,, \end{cases}$$
whose solution, by (13.26), is:
$$h(x,t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \left(x + 2 s \sqrt{t}\right) ds = x\,,$$
where the last equalities rely on (8.36a) and on the fact that $s \mapsto s\, e^{-s^2}$ is an odd function, for which Remark 8.58 holds. In conclusion, the given problem is solved by:
$$u(x,t) = e^{\left(b - \frac{a^2}{4}\right) t}\; e^{-\frac{a x}{2}}\; h(x,t) = e^{-t}\; e^{x}\; x = x\, e^{x-t}\,.$$
Example 13.25. We solve the Cauchy problem for the parabolic equation:
$$\begin{cases} u_t = u_{xx} + 4\, u_x\,,\\ u(x,0) = x^3\, e^{-2x}\,. \end{cases}$$
This problem is of the form (13.35), with $f(x) = x^3\, e^{-2x}$, $a = 4$, and $b = 0$. In order to use Corollary 13.22, we have to solve, first, the following Cauchy problem for the heat equation:
$$\begin{cases} h_t(x,t) = h_{xx}(x,t)\,,\\ h(x,0) = e^{2x}\, x^3\, e^{-2x} = x^3\,, \end{cases}$$
whose solution, by (13.26), is:
$$h(x,t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2} \left(x + 2 s \sqrt{t}\right)^3 ds = x^3 + 6\,x\,t = x \left(x^2 + 6t\right),$$
where the chain of equalities relies on (8.36a) and (8.36d), and on the fact that functions $s \mapsto s\, e^{-s^2}$ and $s \mapsto s^3\, e^{-s^2}$ are both odd, thus they verify Remark 8.58. In conclusion, the given problem is solved by:
$$u(x,t) = e^{-4t}\; e^{-2x}\; x \left(x^2 + 6t\right) = x \left(x^2 + 6t\right) e^{-2x-4t}\,.$$
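The answers of the two examples above can be double-checked against the differential equations directly; a minimal sketch, assuming SymPy is available:

    import sympy as sp

    x, t = sp.symbols('x t', real=True)

    u24 = x * sp.exp(x - t)                       # Example 13.24, a = -2, b = 0
    u25 = x * (x**2 + 6*t) * sp.exp(-2*x - 4*t)   # Example 13.25, a = 4,  b = 0

    for u, a in [(u24, -2), (u25, 4)]:
        res = sp.diff(u, t) - sp.diff(u, x, 2) - a*sp.diff(u, x)
        print(sp.simplify(res))  # 0, 0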
13.3.1 Exercises
1. Show that function $u(x,t) = (2t + x)\, e^{2x+t}$ solves the Cauchy parabolic problem:
$$\begin{cases} u_t = u_{xx} - 2\, u_x + u\,, & t>0\,,\\ u(x,0) = x\, e^{2x}\,. \end{cases}$$
2. Show that function $u(x,t) = (2t + x)\, e^{2t+2x}$ solves the Cauchy parabolic problem:
$$\begin{cases} u_t = u_{xx} - 2\, u_x + 2\, u\,, & t>0\,,\\ u(x,0) = x\, e^{2x}\,. \end{cases}$$
3. Consider a positive $C^2$ function $u(x,t)$, which solves (13.10) for $t > 0$. Then, the following function:
$$\theta(x,t) = -2\, \frac{u_x}{u}$$
satisfies, for $t > 0$, the differential equation:
$$\theta_t + \theta\, \theta_x = \theta_{xx}\,.$$
Then, the solution of the so–formed initial value problem is given by:
$$u(x,t) = \frac{1}{2} \left( 1 + \varphi\!\left( \frac{x}{\sqrt{4t}} \right) \right).$$
13.4 Black–Scholes equation

We now turn to the celebrated partial differential equation introduced by Black² and Scholes³ in [7]:
$$\frac{\partial V}{\partial t} + \frac{1}{2}\, \sigma^2 S^2\, \frac{\partial^2 V}{\partial S^2} + r\, S\, \frac{\partial V}{\partial S} - r\, V = 0\,,\qquad S \ge 0\,,\quad t \in [0\,,T]\,, \tag{13.37}$$
where:
• $t$ is the time;
• $S$ is the price of the underlying asset;
• $V = V(S,t)$ is the value of the option;
• $\sigma$ is the volatility of the underlying asset;
• $r$ is the risk–free interest rate.

² Fischer Sheffey Black (1938–1995), American economist.
³ Myron Samuel Scholes (b. 1941), Canadian–American financial economist.
Suitable substitutions reduce the study of (13.37) to that of a Cauchy problem for a parabolic equation with variable coefficients, of the form:
$$\begin{cases} u_t(x,t) = a\, x^2\, u_{xx}(x,t) + b\, x\, u_x(x,t) + c\, u(x,t)\,, & t>0\,,\\ u(x,0) = f(x)\,, \end{cases} \tag{13.38}$$
where $a\,,b\,,c$ are given real numbers. Equation (13.38) can be turned into a parabolic equation, with constant coefficients, using the change of variable:
$$x = e^{y}\,,\quad t = \frac{\tau}{a}\,,\qquad\text{that is to say,}\qquad y = \ln x\,,\quad \tau = a\, t\,. \tag{13.39}$$
Setting $u(x,t) = v(y,\tau)$, the transformed differential equation is obtained:
$$\begin{cases} v_\tau = v_{yy} + \left( \dfrac{b}{a} - 1 \right) v_y + \dfrac{c}{a}\, v\,,\\ v(y,0) = f(e^{y})\,. \end{cases} \tag{13.40}$$
To see it, let us compute the partial derivatives of $u(x,t)$ in terms of the transformed function $v(y,\tau)$:
$$\begin{aligned}
u_t &= \frac{\partial v}{\partial \tau}\, \frac{\partial \tau}{\partial t} = \frac{\partial v}{\partial \tau}\, \frac{\partial (a\,t)}{\partial t} = a\, v_\tau\,;\\
u_x &= \frac{\partial v}{\partial y}\, \frac{\partial y}{\partial x} = \frac{\partial v}{\partial y}\, \frac{\partial (\ln x)}{\partial x} = \frac{1}{x}\, \frac{\partial v}{\partial y} = \frac{1}{x}\, v_y\,;\\
u_{xx} &= \frac{\partial u_x}{\partial x} = \frac{\partial}{\partial x} \left( \frac{1}{x}\, v_y \right) = -\frac{1}{x^2}\, v_y + \frac{1}{x}\, \frac{\partial}{\partial x} \left( \frac{\partial v}{\partial y} \right)\\
&= -\frac{1}{x^2}\, v_y + \frac{1}{x}\, \frac{\partial}{\partial y} \left( \frac{\partial v}{\partial y} \right) \frac{\partial y}{\partial x} = -\frac{1}{x^2}\, v_y + \frac{1}{x^2}\, v_{yy} = \frac{1}{x^2}\, (v_{yy} - v_y)\,.
\end{aligned} \tag{13.41}$$
Note that, in computing $u_{xx}$, above, we wrote the differential operator as:
$$\frac{\partial}{\partial x} = \frac{\partial}{\partial y}\, \frac{\partial y}{\partial x}\,.$$
Substituting (13.41) into (13.38) yields:
$$a\, v_\tau = a\, (v_{yy} - v_y) + b\, v_y + c\, v\,,\qquad\text{i.e.,}\qquad v_\tau = v_{yy} + \left( \frac{b}{a} - 1 \right) v_y + \frac{c}{a}\, v\,,$$
which is indeed (13.40). At this point, observe that (13.40) is a constant coefficients problem of the form (13.35), which we rewrite here, for convenience, as:
$$\begin{cases} v_\tau = v_{yy} + A\, v_y + B\, v\,,\\ v(y,0) = g(y)\,, \end{cases} \tag{13.42}$$
where $A := \dfrac{b}{a} - 1$, $B := \dfrac{c}{a}$ and $g(y) := f(e^y)$.
Recalling Corollary 13.22, we know that, in order to solve (13.42), we have to consider, first, the following heat equation:
$$\begin{cases} h_\tau = h_{yy}\,,\\ h(y,0) = e^{\frac{A y}{2}}\, g(y)\,, \end{cases}$$
whose solution, given by (13.26), is:
$$h(y,\tau) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2}\; e^{\frac{A}{2} \left(y + 2 s \sqrt{\tau}\right)}\; g\!\left(y + 2 s \sqrt{\tau}\right) ds\,, \tag{13.43}$$
and which is a component of the following function, that solves (13.42):
$$v(y,\tau) = e^{\left(B - \frac{A^2}{4}\right) \tau}\; e^{-\frac{A y}{2}}\; h(y,\tau)\,.$$
Then, we can insert (13.43) into the solution of the transformed problem (13.40), which is:
$$v(y,\tau) = e^{\left( \frac{c}{a} - \frac{1}{4} \left( \frac{b-a}{a} \right)^2 \right) \tau}\; e^{-\frac{(b-a)\, y}{2a}}\; h(y,\tau)\,. \tag{13.44}$$
Given the variable coefficient problem (13.38), equations (13.43)–(13.44) lead automatically to its solution, via the evaluation of the integral involved: once the solution of the transformed problem (13.40) is computed, the solution to the given problem (13.38) is obtained by recovering the original variables through (13.39). The following Examples 13.27, 13.28 and 13.29 illustrate the solution procedure.
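The whole recipe (13.39)–(13.44) can be scripted once and reused on concrete data; a minimal sketch, assuming SymPy is available, run here on the data of Example 13.28 below ($a = 2$, $b = 4$, $c = 1$, $f(x) = x$):

    import sympy as sp

    y, s = sp.symbols('y s', real=True)
    tau = sp.symbols('tau', positive=True)

    a, b, c = 2, 4, 1          # data of Example 13.28
    f = lambda z: z            # initial value f(x) = x

    A = sp.Rational(b, a) - 1
    B = sp.Rational(c, a)
    arg = y + 2*s*sp.sqrt(tau)

    # (13.43): heat-equation solution with datum exp(A*y/2)*f(exp(y))
    h = sp.integrate(sp.exp(-s**2) * sp.exp(A*arg/2) * f(sp.exp(arg)),
                     (s, -sp.oo, sp.oo)) / sp.sqrt(sp.pi)
    # (13.44): undo the exponential substitution
    v = sp.exp((B - A**2/4)*tau) * sp.exp(-A*y/2) * h
    print(sp.simplify(v))  # exp(y + 5*tau/2), i.e. u(x,t) = x*exp(5*t)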
Example 13.27. Solve the Cauchy problem:
$$\begin{cases} u_t = x^2\, u_{xx} + x\, u_x + u\,, & t>0\,,\\ u(x,0) = x\,. \end{cases} \tag{13.45}$$
The given problem has the form (13.38), with $a = b = c = 1$ and $f(x) = x$, so that $A = 0$, $B = 1$ and $g(y) = e^y$. Let us form (13.43):
$$h(y,\tau) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2}\; e^{0}\; e^{\,y + 2\sqrt{\tau}\, s}\, ds = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2 + 2\sqrt{\tau}\, s + y}\, ds = e^{\,y + \tau}\,,$$
where we used the integration formula (8.37). From (13.44), we arrive at the solution of the transformed problem (13.40):
$$v(y,\tau) = e^{\,y + 2\tau}\,.$$
Finally, we use (13.39), to recover the original variables:
$$\tau = t\,,\qquad y = \ln x\,,$$
so that the given problem is solved by:
$$u(x,t) = x\, e^{2t}\,.$$
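A direct check of Example 13.27 against problem (13.45); a minimal sketch, assuming SymPy is available:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    t = sp.symbols('t', real=True)

    u = x * sp.exp(2*t)
    res = sp.diff(u, t) - (x**2*sp.diff(u, x, 2) + x*sp.diff(u, x) + u)
    print(sp.simplify(res), u.subs(t, 0))  # 0 x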
Example 13.28. Solve the Cauchy problem:
$$\begin{cases} u_t = 2\, x^2\, u_{xx} + 4\, x\, u_x + u\,, & t>0\,,\\ u(x,0) = x\,. \end{cases} \tag{13.46}$$
This problem has the form (13.38), with $a = 2$, $b = 4$, $c = 1$ and $f(x) = x$, so that $A = 1$, $B = \frac{1}{2}$ and $g(y) = e^y$. As in the previous Example 13.27, problem (13.46) can be solved working in exact arithmetic. Here, (13.43) becomes:
$$h(y,\tau) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2}\; e^{\frac{y + 2 s \sqrt{\tau}}{2}}\; e^{\,y + 2 s \sqrt{\tau}}\, ds = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-s^2 + 3\sqrt{\tau}\, s + \frac{3y}{2}}\, ds = e^{\frac{3y}{2} + \frac{9\tau}{4}}\,,$$
where the last equality is obtained via the integration formula (8.37). Applying (13.44), we arrive at the solution of the transformed problem (13.40), that is:
$$v(y,\tau) = e^{\,y + \frac{5\tau}{2}}\,.$$
Finally, recovering the original variables by means of the change of variable (13.39), that here is:
$$\tau = 2t\,,\qquad y = \ln x\,,$$
we obtain:
$$u(x,t) = x\, e^{5t}\,.$$
Example 13.29. Solve the Cauchy problem:
$$\begin{cases} u_t = 2\, x^2\, u_{xx} + x\, u_x + u\,, & t>0\,,\\ u(x,0) = \ln x\,. \end{cases}$$
This problem has the form (13.38), with $a = 2$, $b = c = 1$ and $f(x) = \ln x$, so that $A = -\frac{1}{2}$, $B = \frac{1}{2}$ and $g(y) = y$. The associated heat equation solution, modelled in (13.43), is:
$$\begin{aligned}
h(y,\tau) &= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-s^2}\; e^{-\frac{y + 2 s \sqrt{\tau}}{4}}\; \left(y + 2 s \sqrt{\tau}\right) ds\\
&= \frac{y}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-\left(s^2 + \frac{s\sqrt{\tau}}{2} + \frac{y}{4}\right)}\, ds + \frac{2\sqrt{\tau}}{\sqrt{\pi}} \int_{-\infty}^{+\infty} s\; e^{-\left(s^2 + \frac{s\sqrt{\tau}}{2} + \frac{y}{4}\right)}\, ds\\
&= \frac{y}{\sqrt{\pi}}\; \sqrt{\pi}\; e^{\frac{\tau}{16} - \frac{y}{4}} + \frac{2\sqrt{\tau}}{\sqrt{\pi}} \left( -\frac{\sqrt{\pi}\,\sqrt{\tau}}{4} \right) e^{\frac{\tau}{16} - \frac{y}{4}}\\
&= \left( y - \frac{\tau}{2} \right) e^{\frac{\tau - 4y}{16}}\,,
\end{aligned}$$
where the Gaussian integration formulæ (8.37)–(8.38) were employed. Using (13.44), the solution of the transformed problem (13.40) is obtained:
$$v(y,\tau) = e^{\frac{7\tau}{16}}\; e^{\frac{y}{4}} \left( y - \frac{\tau}{2} \right) e^{\frac{\tau - 4y}{16}} = e^{\frac{\tau}{2}} \left( y - \frac{\tau}{2} \right).$$
Finally, we use (13.39), to recover the original variables:
$$\tau = 2t\,,\qquad y = \ln x\,,$$
so that the given problem is solved by:
$$u(x,t) = e^{t}\, (\ln x - t)\,.$$
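As before, the two answers can be checked against the original variable-coefficient problems of the form (13.38); a minimal sketch, assuming SymPy is available:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    t = sp.symbols('t', real=True)

    checks = [
        (x*sp.exp(5*t),            2, 4, 1),  # Example 13.28
        (sp.exp(t)*(sp.log(x)-t),  2, 1, 1),  # Example 13.29
    ]
    for u, a, b, c in checks:
        res = sp.diff(u, t) - (a*x**2*sp.diff(u, x, 2) + b*x*sp.diff(u, x) + c*u)
        print(sp.simplify(res))  # 0, 0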
13.4.1 Exercises
1. Show that function $u(x,t) = x^2\, e^{5t}$ solves the Cauchy parabolic problem:
$$\begin{cases} u_t = x^2\, u_{xx} + x\, u_x + u\,, & t>0\,,\\ u(x,0) = x^2\,. \end{cases}$$
2. Show that function $u(x,t) = e^{2t}\, (\ln x - 2t)$ solves the Cauchy parabolic problem:
$$\begin{cases} u_t = x^2\, u_{xx} - x\, u_x + 2\, u\,, & t>0\,,\\ u(x,0) = \ln x\,. \end{cases}$$
13.5 Non–homogeneous equation: Duhamel integral

Consider the non–homogeneous initial value problem:
$$\begin{cases} u_t(x,t) = c\, u_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\; t>0\,,\\ u(x,0) = f(x)\,, \end{cases} \tag{13.47}$$
which differs from (13.11) in the source term $P(x,t)$. For problem (13.47), a general solution formula is obtained, which is analogous to (13.21). To do so, we follow an approach that generalizes the variation of parameters method, illustrated in Theorem 5.26 and in § 6.2.1. Such a generalization applies to linear partial differential equations, and it is known as the Duhamel integral or principle. To understand how it works, we first present Example 13.30, which refers to an ordinary differential equation, revisited having in mind the Duhamel³ approach.

Example 13.30. Let $a \in \mathbb{R}$ be a given real number, and let $f(t)$ be a continuous function, defined on $[0\,,+\infty)$. Then, the linear initial value problem:
$$\begin{cases} y'(t) = a\, y(t) + f(t)\,, & t>0\,,\\ y(0) = 0\,, \end{cases} \tag{13.48}$$
is solved, by the variation of parameters formula, by:
$$y(t) = \int_0^t e^{a (t-s)}\, f(s)\, ds\,. \tag{13.49}$$
Consider, now, the following set of linear homogeneous equations, depending on the parameter $a$, and in which $s$ plays the role of a dummy variable:
$$\begin{cases} u'(t) = a\, u(t)\,, & t>0\,,\\ u(0) = f(s)\,. \end{cases} \tag{13.50}$$
³ Jean–Marie Constant Duhamel (1797–1872), French mathematician and physicist.
At this point, observe that the solution (which is a function of $t$) of the parametric problem (13.50) is:
$$u(t\,,s) = f(s)\; e^{a t}\,. \tag{13.51}$$
Hence, if we look back at solution (13.49) of problem (13.48), we can write it as:
$$y(t) = \int_0^t u(t-s\,,\,s)\, ds\,. \tag{13.52}$$
We can interpret the found solution in the following way. The solution of the non–homogeneous
equation (y 0 = a y + f (t)) , corresponding to the zero–value initial condition (y(0) = 0) , is obtained
from the solution of the homogeneous equation (u0 = a u) , when it is parametrized by the non–
homogeneous initial condition (u(0) = f (s)) .
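The interpretation above translates directly into a computation; the following minimal sketch, assuming SciPy is available, evaluates the Duhamel integral (13.52) for the illustrative choices $a = -1$ and $f(t) = \sin t$, and compares it with the exact solution of (13.48):

    import numpy as np
    from scipy.integrate import quad

    a = -1.0
    f = np.sin

    def u(t, s):
        """Solution (13.51) of the parametric homogeneous problem (13.50)."""
        return f(s) * np.exp(a * t)

    def y(t):
        """Duhamel integral (13.52): y(t) = int_0^t u(t - s, s) ds."""
        val, _ = quad(lambda s: u(t - s, s), 0.0, t)
        return val

    # exact solution of y' = -y + sin(t), y(0) = 0
    t = 2.0
    print(y(t), (np.sin(t) - np.cos(t) + np.exp(-t)) / 2)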
The argument set out in Example 13.30 constitutes the foundation of the Duhamel method for partial
differential equations. Though the method works not only for parabolic equations, we use it, here, for
solving non–homogeneous parabolic equations. Let us consider, in fact, the initial value problem:
$$\begin{cases} u_t(x,t) = c\, u_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\; t>0\,,\\ u(x,0) = 0\,. \end{cases} \tag{13.53}$$
Applying a procedure similar to that of Example 13.30, we build the homogeneous parametric initial value problem:
$$\begin{cases} h_t = c\, h_{xx}\,, & x \in \mathbb{R}\,,\; t>0\,,\\ h(x,0) = P(x,s)\,. \end{cases} \tag{13.54}$$
Observe that (13.54) has the form of the homogeneous initial value problem (13.11), whose solution is given by (13.24). Thus, the solution to the parametric problem (13.54) can be expressed in terms of the heat kernel $H(x,t)$, introduced in (13.20), modified as follows:
$$H(x,t) = \frac{1}{\sqrt{4\pi c\, t}}\; e^{-\frac{x^2}{4 c t}}$$
and employed to define:
$$h(x\,,t\,;s) = \int_{-\infty}^{\infty} H(x-y\,,t)\, P(y,s)\, dy\,. \tag{13.55}$$
Finally, motivated by considerations similar to those described in Example 13.30, we conclude that the solution to (13.53) should be:
$$\begin{aligned}
u(x,t) &= \int_0^t h(x\,,t-s\,;s)\, ds\\
&= \int_0^t \int_{-\infty}^{\infty} H(x-y\,,t-s)\, P(y,s)\, dy\, ds\\
&= \frac{1}{2\sqrt{\pi c}} \int_0^t \int_{-\infty}^{\infty} \frac{1}{\sqrt{t-s}}\; e^{-\frac{(x-y)^2}{4 c (t-s)}}\, P(y,s)\, dy\, ds\,.
\end{aligned} \tag{13.56}$$
As a matter of fact, the function $u(x,t)$, defined in (13.56), is indeed a solution to the initial value problem (13.53). The technical details, concerning differentiation under the integral sign, are omitted here, but we point out that formula (13.56) can be used to compute solutions to (13.53) in explicit form.
Example 13.31. We solve, here, the non–homogeneous initial value problem:
$$\begin{cases} u_t(x,t) = u_{xx}(x,t) + x\, t\,, & x \in \mathbb{R}\,,\; t>0\,,\\ u(x,0) = 0\,. \end{cases} \tag{13.57}$$
Here $c = 1$ and $P(x,t) = x\, t$. Since $H(x - \cdot\,,\tau)$ is a Gaussian probability density with mean $x$, the inner integral in (13.56) evaluates to:
$$\int_{-\infty}^{\infty} H(x-y\,,t-s)\; y\, s\; dy = x\, s\,,$$
so that:
$$u(x,t) = \int_0^t x\, s\, ds = \frac{x\, t^2}{2}\,.$$
Indeed, $u_t = x\, t$ and $u_{xx} = 0$, so that (13.57) is satisfied.
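Formula (13.56) can also be evaluated numerically and compared with the closed form just found; a minimal sketch, assuming SciPy is available, with illustrative evaluation point $(x,t) = (1.5\,,1)$:

    import numpy as np
    from scipy.integrate import dblquad

    def integrand(y, s, x, t):
        # kernel of (13.56) with c = 1 and P(y, s) = y*s,
        # up to the constant 1/(2*sqrt(pi))
        return np.exp(-(x - y)**2 / (4.0*(t - s))) / np.sqrt(t - s) * y * s

    x, t = 1.5, 1.0
    # the Gaussian factor makes the tails negligible: finite y-window
    val, _ = dblquad(integrand, 0.0, t, -30.0, 30.0, args=(x, t))
    print(val / (2.0*np.sqrt(np.pi)), x * t**2 / 2)  # both ~0.75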
Note that formula (13.56) provides the solution to the particular initial value problem (13.53), with
zero–value initial condition. To arrive at the solution in the general case (13.47), we exploit the linearity
of the differential equation, using a superposition technique, that is based on solving two initial value
problems, namely:
$$\begin{cases} v_t(x,t) = c\, v_{xx}(x,t)\,, & x \in \mathbb{R}\,,\; t>0\,,\\ v(x,0) = f(x)\,, \end{cases} \tag{13.59}$$
and
$$\begin{cases} w_t(x,t) = c\, w_{xx}(x,t) + P(x,t)\,, & x \in \mathbb{R}\,,\; t>0\,,\\ w(x,0) = 0\,. \end{cases} \tag{13.60}$$
It turns out that function $u(x,t) = v(x,t) + w(x,t)$ solves the initial value problem (13.47). In other words, using formulæ (13.24) and (13.56) jointly, we can state that the solution $u(x,t)$ to the initial value problem (13.47) is given by:
$$u(x,t) = \frac{1}{2\sqrt{\pi c\, t}} \int_{-\infty}^{+\infty} e^{-\frac{(x-y)^2}{4 c t}}\, f(y)\, dy + \frac{1}{2\sqrt{\pi c}} \int_0^t \int_{-\infty}^{\infty} \frac{1}{\sqrt{t-s}}\; e^{-\frac{(x-y)^2}{4 c (t-s)}}\, P(y,s)\, dy\, ds\,. \tag{13.61}$$
Bibliography
[1] Kuzman Adzievski and Abdul Hasan Siddiqi. Introduction to partial differential equations for
scientists and engineers using Mathematica. CRC-Press, Boca Raton, 2014.
[2] J.L. Allen and F.M. Stein. On solution of certain Riccati differential equations. Amer. Math.
Monthly, 71:1113–1115, 1964.
[3] Tom Mike Apostol. Calculus: Multi Variable Calculus and Linear Algebra, with Applications to
Differential Equations and Probability. John Wiley & Sons, New York, 1969.
[4] Daniel J. Arrigo. Symmetry analysis of differential equations: an introduction. John Wiley &
Sons, New York, 2015.
[5] E. Artin and M. Butler. The gamma function. Holt, Rinehart and Winston, New York, 1964.
[6] Robert G. Bartle. The elements of integration and Lebesgue measure. John Wiley & Sons, New
York, 2014.
[7] Fischer Black and Myron Scholes. The pricing of options and corporate liabilities. The Journal of Political Economy, pages 637–654, 1973.
[8] David Borthwick. Partial differential equations. 2nd ed. Springer, Cham, 2016.
[9] D. Brannan. A First Course in Mathematical Analysis. Cambridge University Press, Cambridge, 2006.
[10] Douglas S. Bridges. Foundations of Real and Abstract Analysis. Springer, 1998.
[11] B.R. Choe. An elementary proof of $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$. American Mathematical Monthly, 94(7):662–663, 1987.
[13] Lothar Collatz. Differential Equations: An Introduction With Applications. John Wiley and Sons,
1986.
[14] François Coppex. Solving the Black-Scholes equation: a demystification. private communication,
2009.
[15] Richard Courant and Fritz John. Introduction to Calculus and Analysis, volume 2. John Wiley
& Sons, New York, 1974.
[16] P. Duren. Invitation to Classical Analysis, volume 17. American Mathematical Society, Provi-
dence, 2012.
[17] Costas J. Efthimiou. Finding exact values for infinite sums. Mathematics magazine, 72(1):45–51,
1999.
[20] O.J. Farrell and B. Ross. Solved problems: gamma and beta functions, Legendre polynomials,
Bessel functions. Macmillan, New York, 1963.
[21] Angelo Favini, Ermanno Lanconelli, Enrico Obrecht, and Cesare Parenti. Esercizi di analisi
matematica: Equazioni Differenziali, volume 2. Clueb, 1978.
[22] Gerald B. Folland. Introduction to Partial differential equations. 2nd ed. Princeton University
Press, Princeton, 1995.
[24] Guido Fubini. Sugli integrali multipli. Rend. Acc. Naz. Lincei, 16:608–614, 1907.
[25] Guido Fubini. Il teorema di riduzione per gli integrali multipli. Rend. Sem. Mat. Univ. Pol.
Torino, 9:125–133, 1949.
[27] A. Ghizzetti, A. Ossicini, and L. Marchetti. Lezioni di complementi di matematica. 2nd ed.
Libreria eredi Virgilo Veschi, Roma, 1972.
[28] James Harper. Another simple proof of $1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6}$. American Mathematical Monthly, 110(6):540–541, 2003.
[29] Phillip Hartman. Ordinary Differential Equations. 2nd ed. SIAM, 2002.
[30] O. Hijab. Introduction to calculus and classical analysis. Springer, New York, 2011.
[31] P. Hydon. Symmetry methods for differential equations: a beginner’s guide. Cambridge University
Press, Cambridge, 2000.
[34] John L. Kelley. General Topology. D. Van Nostrand Company, Inc., Princeton, N.J., U.S.A.,
https://archive.org/details/GeneralTopology, 1955.
[35] Thomas William Körner. Fourier analysis. Cambridge university press, Cambridge, 1989.
[36] Gregory F. Lawler. Random walk and the heat equation. American Mathematical Society, Provi-
dence, 2010.
[37] A.M. Legendre. Traité des fonctions elliptiques et des intégrales Euleriennes, volume 2. Huzard-
Courcier, Paris, 1826.
[38] D.H. Lehmer. Interesting series involving the central binomial coefficient. The American Mathe-
matical Monthly, 92(7):449–457, 1985.
[39] A.R. Magid. Lectures on differential galois theory. Notices of the American Mathematical Society,
7:1041–1049, 1994.
[40] C.C. Maican. Integral Evaluations Using the Gamma and Beta Functions and Elliptic Integrals
in Engineering: A Self-study Approach. International Press of Boston Inc., Boston, 2005.
[41] A.M. Mathai and H.J. Haubold. Special Functions for Applied Scientists. Springer, New York,
2008.
[42] Habib Bin Muzaffar. A new proof of a classical formula. American Mathematical Monthly,
120(4):355–358, 2013.
[44] Brad Osgood. The Fourier Transform and its Applications. Stanford University, 2007.
[45] Bruno Pini. Terzo Corso di Analisi Matematica, volume 1. CLUEB, Bologna, 1977.
[46] Earl David Rainville. Intermediate differential equations. Macmillan, New York, 1964.
[47] Earl David Rainville and Phillip E. Bedient. Elementary differential equations. 6th ed. Macmillan,
New York, 1981.
[48] P.R.P. Rao. The Riccati differential equation. Amer. Math. Monthly, 69:995–996, 1962.
[49] P.R.P. Rao and V.H. Hukidave. Some separable forms of the Riccati equation. Amer. Math.
Monthly, 75:38–39, 1968.
[50] Daniele Ritelli. Another proof of $\zeta(2) = \frac{\pi^2}{6}$ using double integrals. Amer. Math. Monthly, 120:642–645, 2013.
[51] Halsey Lawrence Royden and Patrick Fitzpatrick. Real analysis. 4th ed. Macmillan, New York,
2010.
[52] W. Rudin. Principles of mathematical analysis. 3rd ed. McGraw-Hill, New York, 1976.
[53] W. Rudin. Real and Complex Analysis. Tata McGraw–Hill, New York, 2006.
[54] Fabio Scarabotti. Equazioni alle derivate parziali. Esculapio, Bologna, 2010.
[55] L. Schwartz. Mathematics for the physical sciences. Addison-Wesley, New York, 1966.
[56] William Shaw. Modelling Financial Derivatives using Mathematica® . Cambridge University
Press, Cambridge, 1988.
[57] H. Siller. On the separability of the Riccati differential equation. Math. Mag., 43:197–202, 1970.
[58] Walter A. Strauss. Introduction to Partial Differential Equations. John Wiley & Sons, New York,
2007.
[59] J. Van Yzeren. Moivre’s and Fresnel’s integrals by simple integration. American Mathematical
Monthly, pages 690–693, 1979.
[61] J.S.W. Wong. On solution of certain Riccati differential equations. Math. Mag., 39:141–143, 1966.
[62] R.C. Wrede and M.R. Spiegel. Schaum’s Outline of Advanced Calculus. McGraw-Hill, New York,
2010.