An Introduction To Stochastic Calculus

This document provides an introduction to stochastic calculus. It begins with a review of stochastic processes and their laws. It then discusses Brownian motion and its properties including the martingale property and Markov property. The main topics covered are Itô's calculus, including Itô's integral and formula. Applications of Itô's formula are also discussed. Later sections cover local time of Brownian motion, stochastic differential equations, continuous time martingales, and stochastic integrals with respect to continuous martingales.


AN INTRODUCTION TO

STOCHASTIC CALCULUS
Marta Sanz-Solé
Facultat de Matemàtiques i Informàtica
Universitat de Barcelona
October 15, 2017
Contents

1 A review of the basics on stochastic processes
  1.1 The law of a stochastic process
  1.2 Sample paths

2 The Brownian motion
  2.1 Equivalent definitions of Brownian motion
  2.2 A construction of Brownian motion
  2.3 Path properties of Brownian motion
  2.4 The martingale property of Brownian motion
  2.5 Markov property

3 Itô's calculus
  3.1 Itô's integral
  3.2 The Itô integral as a stochastic process
  3.3 An extension of the Itô integral
  3.4 A change of variables formula: Itô's formula
    3.4.1 One dimensional Itô's formula
    3.4.2 Multidimensional version of Itô's formula

4 Applications of the Itô formula
  4.1 Burkholder-Davis-Gundy inequalities
  4.2 Representation of L^2 Brownian functionals
  4.3 Girsanov's theorem

5 Local time of Brownian motion and Tanaka's formula

6 Stochastic differential equations
  6.1 Examples of stochastic differential equations
  6.2 A result on existence and uniqueness of solution
  6.3 Some properties of the solution
  6.4 Markov property of the solution

7 Numerical approximations of stochastic differential equations

8 Continuous time martingales
  8.1 Doob's inequalities for martingales
  8.2 Quadratic variation of a continuous martingale

9 Stochastic integrals with respect to continuous martingales

10 Appendix 1: Conditional expectation

11 Appendix 2: Stopping times


1 A review of the basics on stochastic processes

This chapter is devoted to introducing the notion of a stochastic process and some general definitions related to it. For a more complete account of the topic, we refer the reader to [12]. Let us start with a definition.

Definition 1.1 A stochastic process with state space S is a family {X_i, i ∈ I} of random variables X_i : Ω → S indexed by a set I.

To make progress in the analysis of such an object, some further structure on the index set I and on the state space S is required. In this course, we shall mainly deal with the particular cases I = N, Z_+, R_+, and S either a countable set or a subset of R^d, d ≥ 1.
The basic problem statisticians are interested in is the analysis of the probability law (mostly described by some parameters) of characters exhibited by populations. For a fixed character described by a random variable X, they use a finite number of independent copies of X (a sample of X). For many purposes it is useful to have samples of arbitrary size, and therefore to consider sequences X_n, n ≥ 1. It is important here to insist on the word copies, meaning that the circumstances around the different outcomes of X do not change: it is a static world. Hence, they deal with stochastic processes {X_n, n ≥ 1} consisting of independent and identically distributed random variables.
This is not the setting we are interested in here. Instead, we would like to give stochastic models for real-world phenomena which evolve as time goes by. Stochasticity is a choice that accounts for incomplete knowledge and extreme complexity. Evolution, in contrast with statics, is what we observe in most phenomena in Physics, Chemistry, Biology, Economics, Life Sciences, etc.
Stochastic processes are well suited for modeling stochastic evolution phe-
nomena. The interesting cases correspond to families of random variables Xi
which are not independent. In fact, the famous classes of stochastic processes
are described by means of types of dependence between the variables of the
process.

1.1 The law of a stochastic process


The probabilistic features of a stochastic process are gathered in the joint distributions of its variables, as given in the next definition.

Definition 1.2 The finite-dimensional joint distributions of the process {X_i, i ∈ I} consist of the multi-dimensional probability laws of any finite family of random vectors (X_{i_1}, ..., X_{i_m}), where i_1, ..., i_m ∈ I and m ≥ 1 is arbitrary.

Let us give an important example.


Example 1.1 A stochastic process {X_t, t ≥ 0} is said to be Gaussian if its finite-dimensional joint distributions are Gaussian laws.
Remember that in this case, the law of the random vector (X_{t_1}, ..., X_{t_m}) is characterized by two parameters:

µ(t_1, ..., t_m) = E(X_{t_1}, ..., X_{t_m}) = (E(X_{t_1}), ..., E(X_{t_m})),

Λ(t_1, ..., t_m) = (Cov(X_{t_i}, X_{t_j}))_{1≤i,j≤m}.

If det Λ(t_1, ..., t_m) > 0, then the law of (X_{t_1}, ..., X_{t_m}) has a density, and this density is given by

f_{t_1,...,t_m}(x) = ((2π)^m det Λ_{t_1,...,t_m})^{−1/2} exp( −(1/2) (x − µ_{t_1,...,t_m})^t Λ_{t_1,...,t_m}^{−1} (x − µ_{t_1,...,t_m}) ).
In the sequel we shall assume that I ⊂ R+ and S ⊂ R, either countable or
uncountable, and denote by RI the set of real-valued functions defined on I.
A stochastic process {Xt , t ≥ 0} can be viewed as a random vector

X : Ω → RI .

Putting the appropriate σ-field of events in RI , say B(RI ), one can define,
as for random variables, the law of the process as the mapping

PX (B) = P (X −1 (B)), B ∈ B(RI ).

Mathematical results from measure theory tell us that PX is defined by means


of a procedure of extension of measures on cylinder sets given by the family
of all possible finite-dimensional joint distributions. This is a deep result.
In Example 1.1, we have defined a class of stochastic processes by means of the type of its finite-dimensional joint distributions. But does such an object exist? In other words, can one define a stochastic process by giving only its finite-dimensional joint distributions? Roughly speaking, the answer is yes, after adding an extra consistency condition. The precise statement is a famous result by Kolmogorov that we now quote.

Theorem 1.1 Consider a family

{P_{t_1,...,t_n}, t_1 < ... < t_n, n ≥ 1, t_i ∈ I}   (1.1)

where:

1. P_{t_1,...,t_n} is a probability on R^n,

2. if {t_{i_1} < ... < t_{i_m}} ⊂ {t_1 < ... < t_n}, the probability law P_{t_{i_1},...,t_{i_m}} is the marginal distribution of P_{t_1,...,t_n}.

Then there exists a stochastic process {X_t, t ∈ I}, defined on some probability space, such that its finite-dimensional joint distributions are given by (1.1). That is, the law of the random vector (X_{t_1}, ..., X_{t_n}) is P_{t_1,...,t_n}.

One can apply this theorem to Example 1.1 to show the existence of Gaussian
processes, as follows.
Let K : I × I → R be a symmetric, nonnegative definite function. That means:

• for any s, t ∈ I, K(t, s) = K(s, t);

• for any natural number n, arbitrary t_1, ..., t_n ∈ I and x_1, ..., x_n ∈ R,

Σ_{i,j=1}^n K(t_i, t_j) x_i x_j ≥ 0.

Covariance functions provide an example of nonnegative definite functions, as we now illustrate. Consider a random vector (Y_1, ..., Y_n) whose components have mean zero and finite second order moments. Fix x_1, ..., x_n ∈ R. Then we have

Σ_{j,k=1}^n x_j x_k Cov(Y_j, Y_k) = Σ_{j,k=1}^n x_j x_k E(Y_j Y_k) = E( ( Σ_{j=1}^n x_j Y_j )^2 ) ≥ 0.

There exists a Gaussian process {X_t, t ∈ I} such that E(X_t) = 0 for any t ∈ I and Cov(X_{t_i}, X_{t_j}) = K(t_i, t_j) for any t_i, t_j ∈ I.
To prove this result, fix t_1, ..., t_n ∈ I and set µ = (0, ..., 0) ∈ R^n, Λ_{t_1...t_n} = (K(t_i, t_j))_{1≤i,j≤n} and

P_{t_1,...,t_n} = N(0, Λ_{t_1...t_n}).

We denote by (X_{t_1}, ..., X_{t_n}) a random vector with law P_{t_1,...,t_n}. For any subset {t_{i_1}, ..., t_{i_m}} of {t_1, ..., t_n}, it holds that

A (X_{t_1}, ..., X_{t_n}) = (X_{t_{i_1}}, ..., X_{t_{i_m}}),

where A is the m × n matrix with entries A_{l,j} = δ_{t_{i_l}, t_j}, and δ_{s,t} denotes the Kronecker delta.
By the properties of Gaussian vectors, the random vector (X_{t_{i_1}}, ..., X_{t_{i_m}}) has an m-dimensional normal distribution, zero mean, and covariance matrix A Λ_{t_1...t_n} A^t. By the definition of A, it is easy to check that

A Λ_{t_1...t_n} A^t = (K(t_{i_l}, t_{i_k}))_{1≤l,k≤m}.

Hence, the assumptions of Theorem 1.1 hold true and the result follows.
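The construction above can be tried numerically. The sketch below (all helper names are ours; Python with NumPy is an assumption of this illustration) samples the finite-dimensional law N(0, Λ) built from the nonnegative definite function K(s, t) = s ∧ t, which anticipates Example 1.3, and compares the empirical covariance with K.

```python
import numpy as np

# Sketch: sampling the finite-dimensional law N(0, Lambda) built from a
# symmetric nonnegative definite function K, here K(s, t) = min(s, t).
rng = np.random.default_rng(0)

def sample_gaussian_process(K, times, n_paths, rng):
    """Draw n_paths samples of (X_{t_1}, ..., X_{t_n}) with covariance K."""
    Lam = np.array([[K(s, t) for t in times] for s in times])
    # Nonnegative definiteness of K makes Lam a valid covariance matrix.
    return rng.multivariate_normal(np.zeros(len(times)), Lam, size=n_paths)

times = [0.25, 0.5, 1.0]
paths = sample_gaussian_process(min, times, 50_000, rng)
emp_cov = np.cov(paths, rowvar=False)   # should be close to (t_i ∧ t_j)
```

With 50,000 samples the empirical covariance matrix reproduces (t_i ∧ t_j) up to Monte Carlo error.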

1.2 Sample paths


In the previous discussion, stochastic processes are considered as random
vectors. In the context of modeling, what matters are the observed values
of the process. Observations correspond to fixed values of ω ∈ Ω. This new
point of view leads to the next definition.

Definition 1.3 The sample paths of a stochastic process {Xt , t ∈ I} are the
family of functions indexed by ω ∈ Ω, X(ω) : I → S, defined by X(ω)(t) =
Xt (ω).

Sample paths are also called trajectories.

Example 1.2 Consider random arrivals of customers at a store. We set our clock at zero and measure the times between two consecutive arrivals. They are random variables X_1, X_2, .... We assume X_i > 0, a.s. Set S_0 = 0 and S_n = Σ_{j=1}^n X_j, n ≥ 1; S_n is the time of the n-th arrival. Define N_t as the (random) number of customers who have visited the store during the time interval [0, t], t ≥ 0.
Clearly, N_0 = 0 and, for t > 0, N_t = k if and only if

S_k ≤ t < S_{k+1}.

The stochastic process {N_t, t ≥ 0} takes values in Z_+. Its sample paths are increasing right-continuous functions, with jumps of size one at the random times S_n, n ≥ 1. It is a particular case of a counting process. Sample paths of counting processes are always increasing right-continuous functions whose jumps are natural numbers.
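A simulation sketch of Example 1.2 follows; the exponential law for the interarrival times is our illustrative assumption (the text only requires X_i > 0), and all names are ours.

```python
import numpy as np

# Counting process of Example 1.2: S_n = X_1 + ... + X_n and
# N_t = k  iff  S_k <= t < S_{k+1}.
rng = np.random.default_rng(1)

def counting_path(interarrivals):
    """Return the arrival times (S_0, S_1, ...) and the sample path t -> N_t."""
    S = np.concatenate(([0.0], np.cumsum(interarrivals)))
    def N(t):
        # side="right" makes the path right-continuous: N_{S_n} = n.
        return int(np.searchsorted(S, t, side="right") - 1)
    return S, N

X = rng.exponential(scale=1.0, size=100)   # X_i > 0 a.s.
S, N = counting_path(X)
```

Evaluating N at the arrival times shows the unit jumps: N jumps to n exactly at t = S_n.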

Example 1.3 The evolution of prices of risky assets can be described by real-valued stochastic processes {X_t, t ≥ 0} with continuous, although very rough, sample paths. They are generalizations of Brownian motion.
Brownian motion, also called the Wiener process, is a Gaussian process {B_t, t ≥ 0} with the following parameters:

E(B_t) = 0,
E(B_s B_t) = s ∧ t.

This defines the finite-dimensional distributions and therefore the existence of the process, via Kolmogorov's theorem (see Theorem 1.1).

Before giving a heuristic motivation for the preceding definition of Brownian


motion, we introduce two further notions.

A stochastic process {Xt , t ∈ I} has independent increments if for any


t1 < t2 < . . . < tk the random variables Xt2 − Xt1 , . . . , Xtk − Xtk−1 are
independent.

A stochastic process {Xt , t ∈ I} has stationary increments if for any t1 < t2 ,


the law of the random variable Xt2 − Xt1 is the same as that of Xt2 −t1 .
Brownian motion is named after Robert Brown, a British botanist who observed and reported in 1827 the irregular movements of pollen particles suspended in a liquid. Assume that, when starting the observation, the pollen particle is at position x = 0. Denote by B_t (one coordinate of) the position of the particle at time t > 0. For physical reasons the trajectories must be continuous functions, and because of the erratic movement it seems reasonable to say that {B_t, t ≥ 0} is a stochastic process. It also seems reasonable to assume that the change in position of the particle during the time interval [t, t + s] is independent of its previous positions at times τ < t, and therefore that the process has independent increments. The fact that such an increment must be stationary is explained by kinetic theory, assuming that the temperature remains constant during the experiment.

The model for the law of B_t was given by Einstein in 1905. More precisely, Einstein's definition of Brownian motion is that of a stochastic process with independent and stationary increments such that the law of an increment B_t − B_s, s < t, is Gaussian with zero mean and E(B_t − B_s)^2 = t − s. This definition is equivalent to the one given before.

2 The Brownian motion
2.1 Equivalent definitions of Brownian motion
This chapter is devoted to the study of Brownian motion, the process intro-
duced in Example 1.3 that we recall now.
Definition 2.1 The stochastic process {Bt , t ≥ 0} is a one-dimensional
Brownian motion if it is Gaussian, zero mean and with covariance function
given by E (Bt Bs ) = s ∧ t.
The existence of such a process is ensured by Kolmogorov's theorem. Indeed, it suffices to check that

(s, t) → Γ(s, t) = s ∧ t

is nonnegative definite. That means, for any t_i ≥ 0 and any real numbers a_i, i = 1, ..., m,

Σ_{i,j=1}^m a_i a_j Γ(t_i, t_j) ≥ 0.

But

s ∧ t = ∫_0^∞ 1_{[0,s]}(r) 1_{[0,t]}(r) dr.

Hence,

Σ_{i,j=1}^m a_i a_j (t_i ∧ t_j) = Σ_{i,j=1}^m a_i a_j ∫_0^∞ 1_{[0,t_i]}(r) 1_{[0,t_j]}(r) dr = ∫_0^∞ ( Σ_{i=1}^m a_i 1_{[0,t_i]}(r) )^2 dr ≥ 0.

Notice also that, since E(B_0^2) = 0, the random variable B_0 is zero almost surely.
Each random variable B_t, t > 0, of the Brownian motion has a density, and it is

p_t(x) = (2πt)^{−1/2} exp(−x^2/(2t)),

while for t = 0 its "density" is a Dirac mass at zero, δ_{0}.
Differentiating p_t(x) once with respect to t, and then twice with respect to x, easily yields

(∂/∂t) p_t(x) = (1/2) (∂^2/∂x^2) p_t(x),
p_0(x) = δ_{0}.

This is the heat equation on R with initial condition p_0(x) = δ_{0}. That means that, as time evolves, the density of the random variables of the Brownian motion behaves like a diffusive physical phenomenon.
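This diffusive behaviour can be checked numerically. The sketch below (an illustration, not from the text; all names are ours) compares centered finite-difference approximations of both sides of the heat equation for p_t(x).

```python
import numpy as np

# Check numerically that p_t(x) = (2*pi*t)**(-1/2) * exp(-x**2/(2t))
# satisfies  d/dt p = (1/2) d^2/dx^2 p, using centered finite differences.
def p(t, x):
    return np.exp(-x ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

x = np.linspace(-3.0, 3.0, 601)
t, dt = 1.0, 1e-5
dx = x[1] - x[0]

dpdt = (p(t + dt, x) - p(t - dt, x)) / (2.0 * dt)        # time derivative
d2pdx2 = (p(t, x + dx) - 2.0 * p(t, x) + p(t, x - dx)) / dx ** 2  # space Laplacian
max_residual = float(np.max(np.abs(dpdt - 0.5 * d2pdx2)))
```

The residual is dominated by the finite-difference truncation error, so it is small on this grid.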
There are equivalent definitions of Brownian motion, such as the one given in the next result.
Proposition 2.1 A stochastic process {X_t, t ≥ 0} is a Brownian motion if and only if

(i) X_0 = 0, a.s.,

(ii) for any 0 ≤ s < t, the random variable X_t − X_s is independent of the σ-field generated by X_r, 0 ≤ r ≤ s, denoted σ(X_r, 0 ≤ r ≤ s), and X_t − X_s is a N(0, t − s) random variable.
Proof: Let us assume first that {X_t, t ≥ 0} is a Brownian motion. Then E(X_0^2) = 0; thus X_0 = 0 a.s.
Let Hs and H̃s be the vector spaces included in L2 (Ω) spanned by (Xr , 0 ≤
r ≤ s) and (Xs+u − Xs , u ≥ 0), respectively. Since for any 0 ≤ r ≤ s

E (Xr (Xs+u − Xs )) = 0,

Hs and H̃s are orthogonal in L2 (Ω). Consequently, Xt − Xs is independent


of the σ-field σ(Xr , 0 ≤ r ≤ s).
Since linear combinations of Gaussian random variables are also Gaussian, X_t − X_s is normal, with E(X_t − X_s) = 0 and

E(X_t − X_s)^2 = t − 2s + s = t − s.

This ends the proof of properties (i) and (ii).


Assume now that (i) and (ii) hold true. Then the finite-dimensional distributions of {X_t, t ≥ 0} are multidimensional normal, since they are obtained by linear transformation of random vectors with independent Gaussian components. Moreover, for 0 ≤ s ≤ t,

E(X_t X_s) = E((X_t − X_s + X_s) X_s) = E((X_t − X_s) X_s) + E(X_s^2)
= E(X_t − X_s) E(X_s) + E(X_s^2) = E(X_s^2) = s = s ∧ t.

Remark 2.1 We shall see later that Brownian motion has continuous sample paths. The description of the process given in the preceding proposition tells us that such a process is a model for a random evolution which starts from x = 0 at time t = 0, such that the law of an increment depends only on its length (stationarity), and such that the future evolution of the process is independent of its past (Markov property).

Remark 2.2 The Brownian motion possesses several invariance properties. Let us mention some of them.

• If B = {B_t, t ≥ 0} is a Brownian motion, so is −B = {−B_t, t ≥ 0}.

• For any λ > 0, the process B^λ = {(1/λ) B_{λ^2 t}, t ≥ 0} is also a Brownian motion. This means that, zooming in or out, we observe the same behaviour. This is called the scaling property of Brownian motion.

• For any a > 0, B^{+a} = {B_{t+a} − B_a, t ≥ 0} is a Brownian motion.
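Since a centered Gaussian process is determined by its covariance, these invariance properties reduce to identities for the covariance function. A small check (pure arithmetic, no simulation; the grid, the parameter values and the helper names are ours):

```python
# Verify that the transformed processes of Remark 2.2 have the Brownian
# covariance s ∧ t, by evaluating the covariance identities on a grid.
def cov_B(s, t):            # Cov(B_s, B_t) = s ∧ t
    return min(s, t)

def cov_scaled(s, t, lam):  # Cov((1/λ)B_{λ²s}, (1/λ)B_{λ²t})
    return cov_B(lam ** 2 * s, lam ** 2 * t) / lam ** 2

def cov_shifted(s, t, a):   # Cov(B_{s+a} - B_a, B_{t+a} - B_a), expanded bilinearly
    return cov_B(s + a, t + a) - cov_B(s + a, a) - cov_B(a, t + a) + cov_B(a, a)

grid = [0.1, 0.5, 1.0, 2.5]
ok = all(
    abs(cov_scaled(s, t, lam) - cov_B(s, t)) < 1e-12
    and abs(cov_shifted(s, t, 0.7) - cov_B(s, t)) < 1e-12
    for s in grid for t in grid for lam in (0.5, 2.0, 3.0)
)
```

The same expansion shows, on paper, that both transformed processes are Gaussian, centered, with covariance s ∧ t.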

2.2 A construction of Brownian motion

There are several ways to obtain a Brownian motion. Here we shall give P. Lévy's construction, which also provides the continuity of the sample paths. Before going through the details of this construction, we mention an alternative.
Brownian motion as limit of a random walk

Let {ξ_j, j ∈ N} be a sequence of independent, identically distributed random variables, with mean zero and variance σ^2 > 0. Consider the sequence of partial sums defined by S_0 = 0, S_n = Σ_{j=1}^n ξ_j. The sequence {S_n, n ≥ 0} is a Markov chain, and also a martingale.
Let us consider the continuous-time stochastic process defined by linear interpolation of {S_n, n ≥ 0}, as follows. For any t ≥ 0, let [t] denote its integer part. Then set

Y_t = S_{[t]} + (t − [t]) ξ_{[t]+1},   (2.1)

for any t ≥ 0.
The next step is to scale the sample paths of {Y_t, t ≥ 0}. By analogy with the scaling in the statement of the central limit theorem, we set

B_t^{(n)} = (1/(σ√n)) Y_{nt},   (2.2)

t ≥ 0.
A famous result in probability theory, Donsker's theorem, tells us that the sequence of processes {B_t^{(n)}, t ≥ 0}, n ≥ 1, converges in law to Brownian motion. The reference sample space is the set of continuous functions vanishing at zero. Hence, proving the statement, we obtain continuity of the sample paths of the limit.
Donsker's theorem is the infinite-dimensional version of the above-mentioned central limit theorem. Considering s = k/n, t = (k+1)/n, the increment B_t^{(n)} − B_s^{(n)} = (1/(σ√n)) ξ_{k+1} is a random variable with mean zero and variance t − s. Hence B^{(n)} is not that far from Brownian motion, and this is what Donsker's theorem proves.
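The scaled walk (2.1)-(2.2) can be simulated directly; in the sketch below the choice ξ_j = ±1 (so σ = 1), the sample sizes and all names are ours.

```python
import numpy as np

# Sketch of (2.1)-(2.2): Y interpolates the partial sums S_n linearly, and
# B^(n)_t = Y_{nt}/(sigma*sqrt(n)).
rng = np.random.default_rng(2)

def scaled_walk(xi, n, t_grid, sigma=1.0):
    """Evaluate B^(n)_t for one path of increments xi on the points of t_grid."""
    S = np.concatenate(([0.0], np.cumsum(xi)))       # S_0, ..., S_n
    u = n * np.asarray(t_grid, dtype=float)
    k = np.floor(u).astype(int)                      # [u]
    xi_next = np.append(xi, 0.0)                     # xi_{[u]+1} (0 past the end)
    Y = S[k] + (u - k) * xi_next[k]                  # (2.1)
    return Y / (sigma * np.sqrt(n))                  # (2.2)

n, n_paths = 400, 20_000
xi = rng.choice([-1.0, 1.0], size=(n_paths, n))      # mean 0, variance 1
B1 = np.array([scaled_walk(row, n, [1.0])[0] for row in xi])  # samples of B^(n)_1
```

In agreement with the central limit theorem, the samples of B^(n)_1 have mean close to 0 and variance close to 1.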

P. Lévy's construction of Brownian motion

An important ingredient in the procedure is a sequence of functions defined on [0, 1], termed Haar functions, defined as follows:

h_0(t) = 1,
h_n^k(t) = 2^{n/2} 1_{[2k/2^{n+1}, (2k+1)/2^{n+1})}(t) − 2^{n/2} 1_{[(2k+1)/2^{n+1}, (2k+2)/2^{n+1})}(t),

where n ≥ 1 and k ∈ {0, 1, ..., 2^n − 1}.


The set of functions (h_0, h_n^k) is a CONS of L^2([0,1], B([0,1]), λ), where λ stands for the Lebesgue measure. Consequently, for any f ∈ L^2([0,1], B([0,1]), λ), we can write the expansion

f = ⟨f, h_0⟩ h_0 + Σ_{n=1}^∞ Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ h_n^k,   (2.3)

where the notation ⟨·,·⟩ means the inner product in L^2([0,1], B([0,1]), λ).
Using (2.3), we define an isometry between L^2([0,1], B([0,1]), λ) and L^2(Ω, F, P) as follows. Consider a family (N_0, N_n^k) of independent random variables with law N(0, 1). Then, for f ∈ L^2([0,1], B([0,1]), λ), set

I(f) = ⟨f, h_0⟩ N_0 + Σ_{n=1}^∞ Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k.

Clearly,

E(I(f)^2) = ‖f‖_2^2.


Hence I defines an isometry between the space of random variables L^2(Ω, F, P) and L^2([0,1], B([0,1]), λ). Moreover, since

I(f) = lim_{m→∞} ( ⟨f, h_0⟩ N_0 + Σ_{n=1}^m Σ_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k ),

the random variable I(f) is N(0, ‖f‖_2^2), and by Parseval's identity

E(I(f) I(g)) = ⟨f, g⟩,   (2.4)

for any f, g ∈ L^2([0,1], B([0,1]), λ).

Theorem 2.1 The process B = {B_t = I(1_{[0,t]}), t ∈ [0, 1]} defines a Brownian motion indexed by [0, 1]. Moreover, the sample paths are continuous, almost surely.

Proof: By construction, B_0 = 0. Notice that for 0 ≤ s ≤ t ≤ 1, B_t − B_s = I(1_{(s,t]}). Hence, by virtue of (2.4), the random variable B_t − B_s is independent of any B_r, 0 ≤ r ≤ s, and B_t − B_s has a N(0, t − s) law. By Proposition 2.1, we obtain the first statement.
Our next aim is to prove that the series appearing in

B_t = I(1_{[0,t]}) = ⟨1_{[0,t]}, h_0⟩ N_0 + Σ_{n=1}^∞ Σ_{k=0}^{2^n−1} ⟨1_{[0,t]}, h_n^k⟩ N_n^k
    = g_0(t) N_0 + Σ_{n=1}^∞ Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k   (2.5)

converges uniformly, a.s. In the last term we have introduced the Schauder functions, defined as follows:

g_0(t) = ⟨1_{[0,t]}, h_0⟩ = t,
g_n^k(t) = ⟨1_{[0,t]}, h_n^k⟩ = ∫_0^t h_n^k(s) ds,

for any t ∈ [0, 1].


By construction, for any fixed n ≥ 1, the functions g_n^k(t), k = 0, ..., 2^n − 1, are positive, have disjoint supports, and satisfy

g_n^k(t) ≤ 2^{−n/2}.

Thus,

sup_{t∈[0,1]} | Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k|.

The next step consists in proving that |N_n^k| is bounded by a constant depending on n such that, when multiplied by 2^{−n/2}, the resulting series converges. For this, we will use a result on large deviations for Gaussian measures along with the first Borel-Cantelli lemma.

Lemma 2.1 For any random variable X with law N(0, 1) and for any a ≥ 1,

P(|X| ≥ a) ≤ e^{−a^2/2}.

Proof: We clearly have

P(|X| ≥ a) = (2/√(2π)) ∫_a^∞ e^{−x^2/2} dx ≤ (2/√(2π)) ∫_a^∞ (x/a) e^{−x^2/2} dx
           = (2/(a√(2π))) e^{−a^2/2} ≤ e^{−a^2/2},

where we have used that 1 ≤ x/a on the domain of integration and that 2/(a√(2π)) ≤ 1 for a ≥ 1.
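Since the exact Gaussian tail is P(|X| ≥ a) = erfc(a/√2), the bound of Lemma 2.1 can be checked directly; the sketch below is an illustration using the error function from the Python standard library.

```python
import math

# Compare the exact tail P(|X| >= a) = erfc(a / sqrt(2)) with the bound
# exp(-a**2 / 2) of Lemma 2.1, for a few values a >= 1.
def tail(a):
    return math.erfc(a / math.sqrt(2.0))

def bound(a):
    return math.exp(-a * a / 2.0)

checks = [(a, tail(a), bound(a)) for a in (1.0, 1.5, 2.0, 3.0, 5.0)]
```

For every a in the list the exact tail sits below the bound, as the lemma asserts.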

We now move to the Borel-Cantelli-based argument. By the preceding lemma,

P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) ≤ Σ_{k=0}^{2^n−1} P( |N_n^k| > 2^{n/4} ) ≤ 2^n exp(−2^{n/2−1}).
It follows that

Σ_{n=1}^∞ P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) < +∞,

and by the first Borel-Cantelli lemma,

P( lim inf_{n→∞} { sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4} } ) = 1.

That is, a.s. there exists n_0, which may depend on ω, such that

sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4}

for any n ≥ n_0. Hence, we have proved that

sup_{t∈[0,1]} | Σ_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{−n/4},

a.s., for n big enough, which proves the a.s. uniform convergence of the series (2.5).
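The series (2.5) can be truncated and simulated. Indexing conventions for the Haar levels vary; in the sketch below the tents are generated for every dyadic level n = 0, ..., M so that the simulated variances match Var(B_t) = t. The truncation level M, the sample sizes and all names are assumptions of this illustration.

```python
import numpy as np

# Truncated Lévy series (2.5): B_t ≈ g_0(t) N_0 + sum of tents g_n^k(t) N_n^k.
rng = np.random.default_rng(3)

def schauder(n, k, t):
    """g_n^k(t): a tent on [2k/2^(n+1), (2k+2)/2^(n+1)] of height 2^(-n/2-1)."""
    a = 2.0 * k / 2 ** (n + 1)
    m = (2.0 * k + 1) / 2 ** (n + 1)
    b = (2.0 * k + 2) / 2 ** (n + 1)
    up = np.clip(t, a, m) - a          # integral of the +2^(n/2) part of h_n^k
    down = np.clip(t, m, b) - m        # integral of the -2^(n/2) part of h_n^k
    return 2 ** (n / 2.0) * (up - down)

def levy_paths(t, M, n_paths, rng):
    B = t[None, :] * rng.standard_normal((n_paths, 1))          # g_0(t) N_0
    for n in range(M + 1):
        for k in range(2 ** n):
            B += schauder(n, k, t)[None, :] * rng.standard_normal((n_paths, 1))
    return B

t = np.linspace(0.0, 1.0, 33)
B = levy_paths(t, M=8, n_paths=2000, rng=rng)
```

By Parseval, the truncation error in Var(B_t) at level M is at most 2^{−M−2}, so the empirical variances are close to t.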

Next we discuss how, from Theorem 2.1, we can get a Brownian motion indexed by R_+. To this end, let us consider a sequence B^k, k ≥ 1, consisting of independent Brownian motions indexed by [0, 1]. That means, for each k ≥ 1, B^k = {B_t^k, t ∈ [0, 1]} is a Brownian motion, and for different values of k they are independent. Then we define a Brownian motion recursively as follows. Let k ≥ 1; for t ∈ [k, k + 1] set

B_t = B_1^1 + B_1^2 + ··· + B_1^k + B_{t−k}^{k+1}.

Such a process is Gaussian, zero mean, and E(B_t B_s) = s ∧ t. Hence it is a Brownian motion.

We end this section by giving the notion of d-dimensional Brownian motion, for a natural number d ≥ 1. For d = 1 it is the process we have seen so far. For d > 1, it is the process defined by

B_t = (B_t^1, B_t^2, ..., B_t^d), t ≥ 0,

where the components are independent one-dimensional Brownian motions.

2.3 Path properties of Brownian motion


We already know that the trajectories of Brownian motion are a.s. continuous functions. However, since the process is a model for particles wandering erratically, one expects rough behaviour. This section is devoted to proving some results that make these facts precise.
Firstly, it is possible to prove that the sample paths of Brownian motion are γ-Hölder continuous. The main tool for this is Kolmogorov's continuity criterion (see e.g. [11, Theorem 2.1]):

Proposition 2.2 Let {X_t, t ≥ 0} be a stochastic process satisfying the following property: for some positive real numbers α, β and C,

E(|X_t − X_s|^α) ≤ C |t − s|^{1+β}.

Then, almost surely, the sample paths of the process are γ-Hölder continuous for any γ < β/α.

The law of the random variable B_t − B_s is N(0, t − s). Thus it is possible to compute the moments, and we have

E((B_t − B_s)^{2k}) = ((2k)!/(2^k k!)) (t − s)^k,

for any k ∈ N. Applying Proposition 2.2 with α = 2k and β = k − 1 gives Hölder continuity of any order γ < (k − 1)/(2k); letting k → ∞, we conclude that almost surely the sample paths of the Brownian motion are γ-Hölder continuous with γ ∈ (0, 1/2).

Nowhere differentiability
We shall prove that the exponent γ = 1/2 above is sharp. As a consequence we will obtain a celebrated result by Dvoretzky, Erdős and Kakutani stating that a.s. the sample paths of Brownian motion are nowhere differentiable. We gather these results in the next theorem.
Theorem 2.2 Fix any γ ∈ (1/2, 1]; then a.s. the sample paths of {B_t, t ≥ 0} are nowhere Hölder continuous with exponent γ.
Proof: Let γ ∈ (1/2, 1] and assume that a sample path t → B_t(ω) is γ-Hölder continuous at s ∈ [0, 1). Then

|B_t(ω) − B_s(ω)| ≤ C |t − s|^γ,

for any t ∈ [0, 1] and some constant C > 0.
Let n be big enough and let i = [ns] + 1; by the triangle inequality,

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ |B_s(ω) − B_{j/n}(ω)| + |B_s(ω) − B_{(j+1)/n}(ω)|
   ≤ C ( |s − j/n|^γ + |s − (j+1)/n|^γ ).

Hence, by restricting j = i, i + 1, ..., i + N − 1, we obtain

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ M/n^γ,

with M = 2C(N + 1)^γ. Define

A_{M,n}^i = { |B_{j/n} − B_{(j+1)/n}| ≤ M/n^γ, j = i, i + 1, ..., i + N − 1 }.

We have seen that the set of trajectories where t → B_t(ω) is γ-Hölder continuous at some s is included in

∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i.

Next we prove that this set has null probability. Indeed,

P( ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) ≤ lim inf_{n→∞} P( ∪_{i=1}^n A_{M,n}^i )
   ≤ lim inf_{n→∞} Σ_{i=1}^n P( A_{M,n}^i )
   ≤ lim inf_{n→∞} n [ P( |B_{1/n}| ≤ M/n^γ ) ]^N,

where we have used that the random variables B_{j/n} − B_{(j+1)/n} are N(0, 1/n) and independent. But

P( |B_{1/n}| ≤ M/n^γ ) = √(n/(2π)) ∫_{−Mn^{−γ}}^{Mn^{−γ}} e^{−nx^2/2} dx
   = (1/√(2π)) ∫_{−Mn^{1/2−γ}}^{Mn^{1/2−γ}} e^{−x^2/2} dx ≤ C n^{1/2−γ}.
Hence, by taking N such that N(γ − 1/2) > 1,

P( ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) ≤ lim inf_{n→∞} n [ C n^{1/2−γ} ]^N = 0.

Since this holds for any k and M, we get

P( ∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) = 0.

This ends the proof of the theorem.

Notice that, if the sample paths of Brownian motion were differentiable at some point, they would also be Hölder continuous with exponent γ = 1 at that point. This contradicts the preceding theorem.
What happens for γ = 1/2? The answer to this question comes as a consequence of Paul Lévy's modulus of continuity result.
For a function f : [0, ∞) → R, the modulus of continuity is a way to describe its local smoothness. More precisely, let σ : [0, ∞) → [0, ∞) be strictly increasing with σ(0) = 0. The function σ is a modulus of continuity for the function f if for any 0 ≤ s ≤ t,

|f(s) − f(t)| ≤ σ(|t − s|).

Theorem 2.3 Let {B_t, t ≥ 0} be a Brownian motion. Then

P( lim sup_{h→0} [ sup_{0≤t≤1−h} |B(t + h) − B(t)| / √(2h |log h|) ] = 1 ) = 1.

As a consequence, the sample paths of a Brownian motion are not Hölder continuous with exponent γ = 1/2. Indeed,

sup_{0≤t≤1−h} |B(t + h) − B(t)| / h^{1/2}
   = [ sup_{0≤t≤1−h} |B(t + h) − B(t)| / √(2h |log h|) ] √(2 |log h|),

which tends to ∞ as h → 0.
Quadratic variation
The notion of quadratic variation provides a measure of the roughness of a function. The existence of variations of different orders is also important in approximation procedures via Taylor expansions, and in the development of infinitesimal calculus. We will study here the existence of the quadratic variation, i.e. the variation of order two, of Brownian motion. As will be discussed in more detail in the next chapter, this explains why the rules of Itô's stochastic calculus differ from those of classical deterministic differential calculus.
Fix a finite interval [0, T] and consider a sequence of partitions given by the points Π_n = (0 = t_0^n ≤ t_1^n ≤ ... ≤ t_{r_n}^n = T), n ≥ 1. We assume that

lim_{n→∞} |Π_n| = 0,

where |Π_n| denotes the norm of the partition Π_n:

|Π_n| = sup_{j=0,...,r_n−1} (t_{j+1}^n − t_j^n).

Set Δ_k B = B_{t_k^n} − B_{t_{k−1}^n}. Under the preceding conditions on the sequence (Π_n)_{n≥1} we have the following.

Proposition 2.3 The sequence { Σ_{k=1}^{r_n} (Δ_k B)^2, n ≥ 1 } converges in L^2(Ω) to the constant T. That is,

lim_{n→∞} E( ( Σ_{k=1}^{r_n} (Δ_k B)^2 − T )^2 ) = 0.

Proof: For the sake of simplicity, we shall omit the dependence on n. Set Δ_k t = t_k − t_{k−1}. Notice that the random variables (Δ_k B)^2 − Δ_k t, k = 1, ..., r_n, are independent and centered. Thus,

E( ( Σ_{k=1}^{r_n} (Δ_k B)^2 − T )^2 ) = E( ( Σ_{k=1}^{r_n} [ (Δ_k B)^2 − Δ_k t ] )^2 )
   = Σ_{k=1}^{r_n} E( [ (Δ_k B)^2 − Δ_k t ]^2 )
   = Σ_{k=1}^{r_n} ( 3(Δ_k t)^2 − 2(Δ_k t)^2 + (Δ_k t)^2 )
   = 2 Σ_{k=1}^{r_n} (Δ_k t)^2 ≤ 2T |Π_n|,
which clearly tends to zero as n tends to infinity.
This proposition, together with the continuity of the sample paths of Brownian motion, yields

sup_n Σ_{k=1}^{r_n} |Δ_k B| = ∞, a.s.

Therefore, a.s., Brownian motion has infinite variation. Indeed, assume that V := sup_n Σ_{k=1}^{r_n} |Δ_k B| < ∞. Then

Σ_{k=1}^{r_n} (Δ_k B)^2 ≤ ( sup_k |Δ_k B| ) Σ_{k=1}^{r_n} |Δ_k B| ≤ V sup_k |Δ_k B|.

Since the sample paths are uniformly continuous on [0, T], sup_k |Δ_k B| → 0 as |Π_n| → 0, and we would obtain lim_{n→∞} Σ_{k=1}^{r_n} (Δ_k B)^2 = 0 a.s., which contradicts the result proved in Proposition 2.3.
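Both facts are easy to observe by simulation; in the sketch below the mesh sizes, the seed and all names are illustrative choices of ours.

```python
import numpy as np

# Along meshes of size T/2^n, the quadratic variation sum (Delta_k B)^2
# stabilizes near T (Proposition 2.3), while sum |Delta_k B| grows like
# sqrt(r_n): infinite variation in the limit.
rng = np.random.default_rng(4)

T = 2.0
quads, v1s = [], []
for n in (6, 10, 14):
    r = 2 ** n
    dB = np.sqrt(T / r) * rng.standard_normal(r)   # Brownian increments on a mesh T/r
    quads.append(float(np.sum(dB ** 2)))           # quadratic variation -> T
    v1s.append(float(np.sum(np.abs(dB))))          # first variation, grows with r
```

Refining the mesh leaves the quadratic sums near T = 2 while the absolute sums keep growing.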

Remark 2.3 In the particular case Π_n ⊂ Π_{n+1}, n ≥ 1, the result on the quadratic variation of Brownian motion given in the preceding Proposition can be improved. In fact, the following stronger statement holds:

lim_{n→∞} Σ_{k=1}^{r_n} (Δ_k B)^2 = T, a.s.   (2.6)

For example, one can consider Π_n = (kT 2^{−n}, k = 0, ..., 2^n).

Assume that the sequence of partitions (Π_n)_{n≥1} satisfies the assumptions of Proposition 2.3 and, in addition, that there exists γ ∈ (0, 1) such that Σ_{n≥1} |Π_n|^γ < ∞. Then (2.6) holds. Indeed, let λ > 0. Using Chebyshev's inequality and the computations of the proof of Proposition 2.3, we have

P( | Σ_{k=1}^{r_n} (Δ_k B)^2 − T | > λ ) ≤ λ^{−2} E( ( Σ_{k=1}^{r_n} (Δ_k B)^2 − T )^2 ) ≤ C λ^{−2} |Π_n|.

Choose λ = |Π_n|^{(1−γ)/2}. Then

Σ_{n≥1} P( | Σ_{k=1}^{r_n} (Δ_k B)^2 − T | > |Π_n|^{(1−γ)/2} ) ≤ C Σ_{n≥1} |Π_n|^γ < ∞.

Thus, (2.6) follows from the Borel-Cantelli lemma.

The notion of quadratic variation presented before does not coincide with the usual notion of quadratic variation for real functions. In the latter case, there is no restriction on the partitions. Actually, for Brownian motion the following result holds:

sup_Π Σ_{t_k ∈ Π} (Δ_k B)^2 = +∞, a.s.,

where the supremum is over the set of all partitions of the interval [0, T].

2.4 The martingale property of Brownian motion


We start this section by giving the definition of martingale for continuous
time stochastic processes. First, we introduce the appropriate notion of fil-
tration, as follows.
A family {Ft , t ≥ 0} of sub σ–fields of F is termed a filtration if

1. F0 contains all the sets of F of null probability,

2. For any 0 ≤ s ≤ t, Fs ⊂ Ft .

If in addition
∩s>t Fs = Ft ,
for any t ≥ 0, the filtration is said to be right-continuous.

Definition 2.2 A stochastic process {Xt , t ≥ 0} is a martingale with respect


to the filtration {Ft , t ≥ 0} if each variable belongs to L1 (Ω) and moreover

1. Xt is Ft –measurable for any t ≥ 0,

2. for any 0 ≤ s ≤ t, E(Xt /Fs ) = Xs .

If the equality in (2) is replaced by ≤ (respectively, ≥), we have a super-


martingale (respectively, a submartingale).
Given a stochastic process {Xt , t ≥ 0}, there is a natural way to define a
filtration by considering

Ft = σ(Xs , 0 ≤ s ≤ t), t ≥ 0.

To ensure that the above property (1) for a filtration holds, one needs to com-
plete the σ-field. In general, there is no reason to expect right-continuity.
However, for the Brownian motion, the natural filtration possesses this prop-
erty.

A stochastic process X with X_0 constant, constant mean, and independent increments possesses the martingale property with respect to its natural filtration. Indeed, for 0 ≤ s ≤ t,

E(X_t − X_s / F_s) = E(X_t − X_s) = 0.

Hence, Brownian motion possesses the martingale property with respect to its natural filtration.
Other examples of martingales with respect to the same filtration, related to the Brownian motion, are

1. $\{B_t^2 - t,\ t \geq 0\}$,

2. $\left\{\exp\left(aB_t - \frac{a^2 t}{2}\right),\ t \geq 0\right\}$.

Indeed, for the first example, let us consider 0 ≤ s ≤ t. Then,
$$E\left(B_t^2 / \mathcal{F}_s\right) = E\left((B_t - B_s + B_s)^2 / \mathcal{F}_s\right) = E\left((B_t - B_s)^2 / \mathcal{F}_s\right) + 2E\left((B_t - B_s)B_s / \mathcal{F}_s\right) + E\left(B_s^2 / \mathcal{F}_s\right).$$
Since $B_t - B_s$ is independent of $\mathcal{F}_s$, owing to the properties of the conditional expectation, we have
$$E\left((B_t - B_s)^2 / \mathcal{F}_s\right) = E\left((B_t - B_s)^2\right) = t - s,$$
$$E\left((B_t - B_s)B_s / \mathcal{F}_s\right) = B_s\, E\left(B_t - B_s / \mathcal{F}_s\right) = 0,$$
$$E\left(B_s^2 / \mathcal{F}_s\right) = B_s^2.$$
Consequently,
$$E\left(B_t^2 - B_s^2 / \mathcal{F}_s\right) = t - s.$$


For the second example, we also use the property of independent increments, as follows:
$$E\left(\exp\left(aB_t - \frac{a^2 t}{2}\right) / \mathcal{F}_s\right) = \exp(aB_s)\, E\left(\exp\left(a(B_t - B_s) - \frac{a^2 t}{2}\right) / \mathcal{F}_s\right) = \exp(aB_s)\, E\left(\exp\left(a(B_t - B_s) - \frac{a^2 t}{2}\right)\right).$$
Using the expression of the density of the random variable $B_t - B_s$, we write
$$E\left(\exp\left(a(B_t - B_s) - \frac{a^2 t}{2}\right)\right) = \frac{1}{\sqrt{2\pi(t-s)}} \int_{\mathbb{R}} \exp\left(ax - \frac{a^2 t}{2} - \frac{x^2}{2(t-s)}\right) dx = \exp\left(\frac{a^2(t-s)}{2} - \frac{a^2 t}{2}\right) = \exp\left(-\frac{a^2 s}{2}\right),$$
where the next-to-last equality is obtained by using the identity
$$\frac{x^2}{2(t-s)} - ax + \frac{a^2 t}{2} = \frac{(x - a(t-s))^2}{2(t-s)} + \frac{a^2 t}{2} - \frac{a^2(t-s)}{2}. \tag{2.7}$$
Therefore, we obtain
$$E\left(\exp\left(aB_t - \frac{a^2 t}{2}\right) / \mathcal{F}_s\right) = \exp\left(aB_s - \frac{a^2 s}{2}\right).$$
Remark. The computation of the expression
$$E\left(\exp\left(a(B_t - B_s) - \frac{a^2 t}{2}\right)\right)$$
could have been avoided by using the following property: if $Z$ is a random variable with distribution $N(0, \sigma^2)$, then it has exponential moments, and
$$E\left(\exp(Z)\right) = \exp\left(\frac{\sigma^2}{2}\right).$$
This property is proved by computations analogous to those above.
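The identity $E(\exp Z) = \exp(\sigma^2/2)$ is also easy to confirm by a quick Monte Carlo experiment; the sketch below is plain Python, and the sample size and seed are arbitrary choices:

```python
import math
import random

random.seed(1)

def exp_moment(sigma, n=200000):
    """Monte Carlo estimate of E[exp(Z)] for Z ~ N(0, sigma^2)."""
    total = sum(math.exp(random.gauss(0.0, sigma)) for _ in range(n))
    return total / n

sigma = 0.5
print(exp_moment(sigma))         # Monte Carlo estimate
print(math.exp(sigma ** 2 / 2))  # exact value exp(sigma^2 / 2)
```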

2.5 Markov property


For any 0 ≤ s ≤ t, x ∈ R and A ∈ B(R), we set
$$p(s, t, x, A) = \frac{1}{(2\pi(t-s))^{1/2}} \int_A \exp\left(-\frac{|x-y|^2}{2(t-s)}\right) dy. \tag{2.8}$$
Actually, p(s, t, x, A) is the probability that a Normal random variable with mean x and variance t − s takes values in a fixed set A.
Let us prove the following identity:
$$P\{B_t \in A / \mathcal{F}_s\} := E\left(1_{\{B_t \in A\}} / \mathcal{F}_s\right) = p(s, t, B_s, A), \tag{2.9}$$
which means that, conditionally on the past of the Brownian motion until time s, the law of $B_t$ at a future time t only depends on $B_s$.
Let f : R → R be a bounded measurable function. Then, since $B_s$ is $\mathcal{F}_s$-measurable and $B_t - B_s$ is independent of $\mathcal{F}_s$, we obtain
$$E\left(f(B_t) / \mathcal{F}_s\right) = E\left(f(B_s + (B_t - B_s)) / \mathcal{F}_s\right) = E\left(f(x + B_t - B_s)\right)\Big|_{x = B_s}.$$
The random variable $x + B_t - B_s$ is $N(x, t - s)$. Thus,
$$E\left(f(x + B_t - B_s)\right) = \int_{\mathbb{R}} f(y)\, p(s, t, x, dy),$$
and consequently,
$$E\left(f(B_t) / \mathcal{F}_s\right) = \int_{\mathbb{R}} f(y)\, p(s, t, B_s, dy).$$
This yields (2.9) by taking $f = 1_A$.


With similar arguments, we can prove that
$$P\{B_t \in A / \sigma(B_s)\} = p(s, t, B_s, A),$$
which along with (2.9) yields
$$P\{B_t \in A / \mathcal{F}_s\} = P\{B_t \in A / \sigma(B_s)\} = p(s, t, B_s, A).$$
Going back to (2.8), we notice that the function x → p(s, t, x, A) is measurable, and the mapping A → p(s, t, x, A) is a probability.
Let us prove the additional property, called the Chapman-Kolmogorov equation: for any 0 ≤ s ≤ u ≤ t,
$$p(s, t, x, A) = \int_{\mathbb{R}} p(u, t, y, A)\, p(s, u, x, dy). \tag{2.10}$$
We recall that the sum of two independent Normal random variables is again Normal, with mean the sum of the respective means and variance the sum of the respective variances. This is expressed in mathematical terms by the fact that
$$\left[f_{N(x,\sigma_1)} * f_{N(y,\sigma_2)}\right](z) = \int_{\mathbb{R}} f_{N(x,\sigma_1)}(v)\, f_{N(y,\sigma_2)}(z - v)\, dv = f_{N(x+y,\,\sigma_1+\sigma_2)}(z),$$
where the first expression denotes the convolution of the two densities. Using this fact along with Fubini's theorem, we obtain
$$\begin{aligned}
\int_{\mathbb{R}} p(u, t, y, A)\, p(s, u, x, dy) &= \int_{\mathbb{R}} p(s, u, x, dy) \int_A p(u, t, y, dz) \\
&= \int_A dz \int_{\mathbb{R}} dy\, f_{N(x,\,u-s)}(y)\, f_{N(0,\,t-u)}(z - y) \\
&= \int_A dz\, \left[f_{N(x,\,u-s)} * f_{N(0,\,t-u)}\right](z) \\
&= \int_A dz\, f_{N(x,\,t-s)}(z) = p(s, t, x, A),
\end{aligned}$$
proving (2.10).
This equation is the time-continuous analogue of the property enjoyed by the transition probability matrices of a homogeneous Markov chain, namely
$$\Pi^{(m+n)} = \Pi^{(m)} \Pi^{(n)},$$
meaning that evolutions in m + n steps are obtained by concatenating m-step and n-step evolutions. In (2.10), m + n is replaced by the real time t − s, m by t − u, and n by u − s, respectively.
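The Chapman-Kolmogorov equation (2.10) can be verified numerically for the transition function (2.8). The sketch below (plain Python; the half line $A = (-\infty, a]$, the time points and the Riemann-sum discretisation are our own choices) compares the two sides of (2.10):

```python
import math

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p(s, t, x, a):
    """p(s, t, x, A) of (2.8) for the half line A = (-inf, a]."""
    return Phi((a - x) / math.sqrt(t - s))

def gauss_density(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

s, u, t, x, a = 0.0, 0.4, 1.0, 0.2, 0.7
dy = 0.001
# Right-hand side of (2.10): integrate p(u, t, y, A) against p(s, u, x, dy)
rhs = sum(p(u, t, -8.0 + k * dy, a) * gauss_density(-8.0 + k * dy, x, u - s) * dy
          for k in range(16000))
lhs = p(s, t, x, a)
print(lhs, rhs)  # the two sides agree closely
```

The integration grid [−8, 8] is wide enough for the Gaussian tails to be negligible.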
We are now ready to give the definition of a Markov process. Consider a mapping
$$p : \mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R} \times \mathcal{B}(\mathbb{R}) \to \mathbb{R}_+,$$
satisfying the following properties:

(i) for any fixed s, t ∈ R₊ and A ∈ B(R), the mapping x → p(s, t, x, A) is B(R)-measurable,

(ii) for any fixed s, t ∈ R₊ and x ∈ R, the mapping A → p(s, t, x, A) is a probability,

(iii) equation (2.10) holds.
Such a function p is termed a Markovian transition function. Let us also fix a probability µ on B(R).

Definition 2.3 A real valued stochastic process $\{X_t, t \in \mathbb{R}_+\}$ is a Markov process with initial law µ and transition probability function p if

(a) the law of $X_0$ is µ,

(b) for any 0 ≤ s ≤ t,
$$P\{X_t \in A / \mathcal{F}_s\} = p(s, t, X_s, A).$$

Therefore, we have proved that the Brownian motion is a Markov process with initial law the Dirac measure at 0 and transition probability function p the one defined in (2.8).
Strong Markov property
Throughout this section, (Ft , t ≥ 0) will denote the natural filtration asso-
ciated with a Brownian motion {Bt , t ≥ 0} and stopping times will always
refer to this filtration.

Theorem 2.4 Let T be a stopping time. Then, conditionally on {T < ∞}, the process defined by
$$B_t^T = B_{T+t} - B_T, \quad t \geq 0,$$
is a Brownian motion independent of $\mathcal{F}_T$.

Proof: Assume that T < ∞ a.s. We shall prove that for any $A \in \mathcal{F}_T$, any choice of parameters $0 \leq t_1 < \cdots < t_p$ and any continuous and bounded function f on $\mathbb{R}^p$, we have
$$E\left[1_A f\left(B_{t_1}^T, \dots, B_{t_p}^T\right)\right] = P(A)\, E\left(f\left(B_{t_1}, \dots, B_{t_p}\right)\right). \tag{2.11}$$

This suffices to prove all the assertions of the theorem. Indeed, by taking A = Ω, we see that the finite dimensional distributions of B and $B^T$ coincide. On the other hand, (2.11) states the independence of $1_A$ and the random vector $(B_{t_1}^T, \dots, B_{t_p}^T)$. By a monotone class argument, we get the independence of $1_A$ and $B^T$.
The continuity of the sample paths of B implies, a.s.,
$$f\left(B_{t_1}^T, \dots, B_{t_p}^T\right) = f\left(B_{T+t_1} - B_T, \dots, B_{T+t_p} - B_T\right) = \lim_{n \to \infty} \sum_{k=1}^{\infty} 1_{\{(k-1)2^{-n} < T \leq k2^{-n}\}}\, f\left(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \dots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\right).$$
Since f is bounded, we can apply bounded convergence and write
$$E\left[1_A f\left(B_{t_1}^T, \dots, B_{t_p}^T\right)\right] = \lim_{n \to \infty} \sum_{k=1}^{\infty} E\left(1_{\{(k-1)2^{-n} < T \leq k2^{-n}\}}\, 1_A\, f\left(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \dots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\right)\right).$$

Since $A \in \mathcal{F}_T$, the event $A \cap \{(k-1)2^{-n} < T \leq k2^{-n}\}$ belongs to $\mathcal{F}_{k2^{-n}}$. Since Brownian motion has independent and stationary increments, we have
$$E\left(1_{\{(k-1)2^{-n} < T \leq k2^{-n}\}}\, 1_A\, f\left(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \dots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\right)\right) = P\left(A \cap \{(k-1)2^{-n} < T \leq k2^{-n}\}\right) E\left(f(B_{t_1}, \dots, B_{t_p})\right).$$
Summing up with respect to k both terms in the preceding identity yields (2.11), and this finishes the proof if T < ∞ a.s.
If P(T = ∞) > 0, we can argue as before and obtain
$$E\left[1_{A \cap \{T<\infty\}} f\left(B_{t_1}^T, \dots, B_{t_p}^T\right)\right] = P(A \cap \{T < \infty\})\, E\left(f\left(B_{t_1}, \dots, B_{t_p}\right)\right).$$

An interesting consequence of the preceding property is given in the next
proposition.

Proposition 2.4 For any t > 0, set $S_t = \sup_{s \leq t} B_s$. Then, for any a ≥ 0 and b ≤ a,
$$P\{S_t \geq a,\ B_t \leq b\} = P\{B_t \geq 2a - b\}. \tag{2.12}$$
As a consequence, the probability laws of $S_t$ and $|B_t|$ coincide.

Proof: Consider the stopping time
$$T_a = \inf\{t \geq 0,\ B_t = a\},$$
which is finite a.s. We have
$$P\{S_t \geq a,\ B_t \leq b\} = P\{T_a \leq t,\ B_t \leq b\} = P\{T_a \leq t,\ B_{t-T_a}^{T_a} \leq b - a\}.$$
Indeed, $B_{t-T_a}^{T_a} = B_t - B_{T_a} = B_t - a$, and B and $B^{T_a}$ have the same law. Moreover, we know that these processes are independent of $\mathcal{F}_{T_a}$.
This last property, along with the fact that $B^{T_a}$ and $-B^{T_a}$ have the same law, yields that $(T_a, B^{T_a})$ has the same distribution as $(T_a, -B^{T_a})$.
Define $H = \{(s, w) \in \mathbb{R}_+ \times C(\mathbb{R}_+; \mathbb{R});\ s \leq t,\ w(t - s) \leq b - a\}$. Then
$$P\{T_a \leq t,\ B_{t-T_a}^{T_a} \leq b - a\} = P\{(T_a, B^{T_a}) \in H\} = P\{(T_a, -B^{T_a}) \in H\} = P\{T_a \leq t,\ -B_{t-T_a}^{T_a} \leq b - a\} = P\{T_a \leq t,\ 2a - b \leq B_t\} = P\{2a - b \leq B_t\}.$$
Indeed, by definition of the process $\{B_t^{T_a},\ t \geq 0\}$, the condition $-B_{t-T_a}^{T_a} \leq b - a$ is equivalent to $2a - b \leq B_t$; moreover, the inclusion $\{2a - b \leq B_t\} \subset \{T_a \leq t\}$ holds true. In fact, if $T_a > t$, then $B_t \leq a$; since b ≤ a, this yields $B_t \leq 2a - b$. This ends the proof of (2.12).
For the second assertion, we notice that $\{B_t \geq a\} \subset \{S_t \geq a\}$. This fact along with (2.12), applied with b = a, yields the identities
$$P\{S_t \geq a\} = P\{S_t \geq a,\ B_t \leq a\} + P\{S_t \geq a,\ B_t \geq a\} = 2P\{B_t \geq a\} = P\{|B_t| \geq a\}.$$
The proof is now complete.
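The conclusion of Proposition 2.4, that $S_t$ and $|B_t|$ share the same law, can be observed on simulated paths. A minimal Monte Carlo sketch in plain Python (the grid size, the threshold and the seed are illustrative choices; note that the discretised maximum slightly underestimates $S_t$):

```python
import math
import random

random.seed(2)

def max_and_endpoint(t=1.0, n=500):
    """Simulate one Brownian path on an n-step grid of [0, t];
    return the discretised running maximum S_t and the endpoint B_t."""
    dt = t / n
    sigma = math.sqrt(dt)
    b, s = 0.0, 0.0
    for _ in range(n):
        b += random.gauss(0.0, sigma)
        s = max(s, b)
    return s, b

samples = [max_and_endpoint() for _ in range(5000)]
a = 0.8
p_max = sum(1 for s, b in samples if s >= a) / len(samples)
p_abs = sum(1 for s, b in samples if abs(b) >= a) / len(samples)
print(p_max, p_abs)  # both close to P{|B_1| >= 0.8} = 2(1 - Phi(0.8))
```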




3 Itô’s calculus
Itô’s calculus has been developed in the 50’ by Kyoshi Itô in an attempt to
give rigourous meaning to some differential equations driven by the Brown-
ian motion appearing in the study of some problems related with continuous
time Markov processes. Roughly speaking, one could say that Itô’s calculus
is an analogue of the classical Newton and Leibniz calculus for stochastic pro-
cesses. In fact, in classical
R mathematical analysis, there are several extensions
of the Riemann integral f (x)dx. For example, if g is an increasing bounded
function (or the difference of two of these functions),
R Lebesgue-Stieltjes in-
tegral gives a precise meaning to the integral f (x)g(dx), for some set of
functions f . However, before Itô’s development, no theory allowing nowhere
differentiable integrators g was known. Brownian motion, introduced in the
preceding chapter, is an example of stochastic process whose sample paths, al-
though continuous, are nowhere differentiable. Therefore, Lebesgue-Stieltjes
integral does not apply to the sample paths of Brownian motion.
There are many motivations coming from a variety of disciplines to consider stochastic differential equations driven by a Brownian motion. Such an object is defined as
$$dX_t = \sigma(t, X_t)\,dB_t + b(t, X_t)\,dt, \qquad X_0 = x_0,$$
or in integral form,
$$X_t = x_0 + \int_0^t \sigma(s, X_s)\,dB_s + \int_0^t b(s, X_s)\,ds. \tag{3.1}$$
The first notion to be introduced is that of the stochastic integral. In fact, in (3.1) the integral $\int_0^t b(s, X_s)\,ds$ might be defined pathwise, but this is not the case for $\int_0^t \sigma(s, X_s)\,dB_s$, because of the roughness of the paths of the integrator. More explicitly, it is not possible to fix ω ∈ Ω, then to consider the path $\sigma(s, X_s(\omega))$, and finally to integrate with respect to $B_s(\omega)$.

3.1 Itô’s integral


Throughout this section, we will consider a Brownian motion B = {Bt , t ≥ 0}
defined on a probability space (Ω, F, P ). We will also consider a filtration
(Ft , t ≥ 0) satisfying the following properties:
1. B is adapted to (Ft , t ≥ 0),
2. for every t ≥ 0, the σ-field generated by $\{B_u - B_t,\ u \geq t\}$ is independent of $\mathcal{F}_t$.

Notice that these two properties are satisfied if (Ft , t ≥ 0) is the natural
filtration associated to B.
We fix a finite time horizon T and define $L^2_{a,T}$ as the set of stochastic processes $u = \{u_t, t \in [0, T]\}$ satisfying the following conditions:

(i) u is adapted and jointly measurable in (t, ω), with respect to the product σ-field $\mathcal{B}([0, T]) \otimes \mathcal{F}$;

(ii) $\int_0^T E(u_t^2)\,dt < \infty$.

This is a Hilbert space with the norm $\|u\|_{L^2_{a,T}} = \left[\int_0^T E(u_t^2)\,dt\right]^{1/2}$, which coincides with the natural norm on the Hilbert space $L^2(\Omega \times [0, T], \mathcal{F} \otimes \mathcal{B}([0, T]), dP \times d\lambda)$ (here λ stands for the Lebesgue measure on R).
The notation $L^2_{a,T}$ evokes the two properties (adaptedness and square integrability) described before.
Consider first the subset of $L^2_{a,T}$ consisting of step processes, that is, stochastic processes which can be written as
$$u_t = \sum_{j=1}^n u_j\, 1_{[t_{j-1}, t_j)}(t), \tag{3.2}$$
with $0 = t_0 \leq t_1 \leq \cdots \leq t_n = T$ and where $u_j$, j = 1, ..., n, are $\mathcal{F}_{t_{j-1}}$-measurable square integrable random variables. We shall denote by E the set of these processes.
For step processes, the Itô stochastic integral is defined by the very natural formula
$$\int_0^T u_t\,dB_t = \sum_{j=1}^n u_j\left(B_{t_j} - B_{t_{j-1}}\right), \tag{3.3}$$
which we may compare with the Lebesgue integral of simple functions. Notice that $\int_0^T u_t\,dB_t$ is a random variable. Of course, we would like to be able to consider more general integrands than step processes. Therefore, we must try to extend the definition (3.3). For this, we have to use tools from Functional Analysis based upon a very natural idea: if we are able to prove that (3.3) defines a continuous functional between two metric spaces, then the stochastic integral defined for the very particular class of step stochastic processes can be extended to a more general class given by the closure of this set with respect to a suitable norm. This is possible by one of the consequences of the Hahn-Banach Theorem.
The idea of continuity is made precise by the
Isometry property:
$$E\left(\int_0^T u_t\,dB_t\right)^2 = E\left(\int_0^T u_t^2\,dt\right). \tag{3.4}$$
Let us prove (3.4) for step processes. Clearly,
$$E\left(\int_0^T u_t\,dB_t\right)^2 = \sum_{j=1}^n E\left(u_j^2 (\Delta_j B)^2\right) + 2\sum_{j<k} E\left(u_j u_k (\Delta_j B)(\Delta_k B)\right).$$
The measurability property of the random variables $u_j$, j = 1, ..., n, implies that the random variables $u_j^2$ are independent of $(\Delta_j B)^2$. Hence, the contribution of the first term on the right-hand side of the preceding identity is equal to
$$\sum_{j=1}^n E(u_j^2)(t_j - t_{j-1}) = \int_0^T E(u_t^2)\,dt.$$
For the second term, we notice that for fixed j and k, j < k, the random variable $u_j u_k \Delta_j B$ is independent of $\Delta_k B$. Therefore,
$$E\left(u_j u_k (\Delta_j B)(\Delta_k B)\right) = E\left(u_j u_k (\Delta_j B)\right) E(\Delta_k B) = 0.$$
Thus, we have (3.4).


This property tells us that the stochastic integral is a continuous functional defined on E, endowed with the norm of $L^2(\Omega \times [0, T])$, taking values in the set $L^2(\Omega)$ of square integrable random variables.
Other properties of the stochastic integral of step processes

1. The stochastic integral is a centered random variable. Indeed,
$$E\left(\int_0^T u_t\,dB_t\right) = E\left(\sum_{j=1}^n u_j\left(B_{t_j} - B_{t_{j-1}}\right)\right) = \sum_{j=1}^n E(u_j)\, E\left(B_{t_j} - B_{t_{j-1}}\right) = 0,$$
where we have used that the random variables $u_j$ and $B_{t_j} - B_{t_{j-1}}$ are independent and moreover $E\left(B_{t_j} - B_{t_{j-1}}\right) = 0$.

2. Linearity: if $u^1$, $u^2$ are two step processes and a, b ∈ R, then clearly $au^1 + bu^2$ is also a step process and
$$\int_0^T (au^1 + bu^2)(t)\,dB_t = a\int_0^T u^1(t)\,dB_t + b\int_0^T u^2(t)\,dB_t.$$

The next step consists of identifying a set of random processes bigger than E, in which E is dense with respect to the norm of the Hilbert space $L^2(\Omega \times [0, T])$. Since $L^2_{a,T}$ is a Hilbert space with respect to this norm, we have that $\bar{E} \subset L^2_{a,T}$. The converse inclusion is also true. This is proved in the next Proposition, which is a crucial fact in Itô's theory.

Proposition 3.1 For any $u \in L^2_{a,T}$ there exists a sequence $(u^n, n \geq 1) \subset E$ such that
$$\lim_{n \to \infty} \int_0^T E\left(u_t^n - u_t\right)^2 dt = 0.$$

Proof: It will be done in three steps.


Step 1. Assume that $u \in L^2_{a,T}$ is bounded and has continuous sample paths, a.s. An approximating sequence can be defined as follows:
$$u^n(t) = \sum_{k=0}^{[nT]} u\left(\frac{k}{n}\right) 1_{\left[\frac{k}{n}, \frac{k+1}{n}\right)}(t),$$
with the convention $\frac{[nT]+1}{n} := T$.
Clearly, $u^n \in L^2_{a,T}$ and, by continuity,
$$\int_0^T |u^n(t) - u(t)|^2\,dt = \sum_{k=0}^{[nT]} \int_{\frac{k}{n}}^{\frac{k+1}{n} \wedge T} \left|u\left(\frac{k}{n}\right) - u(t)\right|^2 dt \leq T \sup_k \sup_{t \in \Delta_k} |u^n(t) - u(t)|^2 \to 0,$$
as n → ∞, a.s. Then, the approximation result follows by bounded convergence on the measure space $L^1(\Omega, \mathcal{F}, P)$ applied to the sequence of random variables $Y_n \in L^1(\Omega, \mathcal{F}, P)$,
$$Y_n = \int_0^T |u^n(t) - u(t)|^2\,dt.$$
Indeed, we have proved that $\lim_{n \to \infty} Y_n = 0$, a.s., and, by the assumptions on u, $\sup_{n \geq 1} |Y_n| \leq Y$, with $Y \in L^1(\Omega, \mathcal{F}, P)$.

Step 2. Assume that $u \in L^2_{a,T}$ is bounded. For any n ≥ 1, let $\Psi_n(s) = n\,1_{[0,\frac{1}{n}]}(s)$. The sequence $(\Psi_n)_{n \geq 1}$ is an approximation of the identity. Consider
$$u^n(t) = \int_{-\infty}^{+\infty} \Psi_n(t - s)\,u(s)\,ds = (\Psi_n * u)(t) = n\int_{t-\frac{1}{n}}^{t} u(s)\,ds,$$
where the symbol "∗" denotes the convolution operator on R. In the last integral we put u(r) = 0 if r < 0. With this, we define a stochastic process $u^n$ with continuous and bounded sample paths, a.s., and by the properties of the convolution (see e.g. [7][Corollary 2, p. 378]), we have
$$\int_0^T |u^n(s) - u(s)|^2\,ds \to 0.$$
As in the preceding step, by bounded convergence,
$$\lim_{n \to \infty} E\int_0^T |u^n(s) - u(s)|^2\,ds = 0.$$

Step 3. Let $u \in L^2_{a,T}$ and define
$$u^n(t) = \begin{cases} 0, & u(t) < -n, \\ u(t), & -n \leq u(t) \leq n, \\ 0, & u(t) > n. \end{cases}$$
Clearly, $\sup_{\omega,t} |u^n(t)| \leq n$ and $u^n \in L^2_{a,T}$. Moreover,
$$E\int_0^T |u^n(s) - u(s)|^2\,ds = E\int_0^T |u(s)|^2\, 1_{\{|u(s)|>n\}}\,ds \to 0,$$
where we have used that for a function $f \in L^1([0, T], \mathcal{B}([0, T]), dt)$,
$$\lim_{n \to \infty} \int_0^T |f(t)|\,1_{\{|f(t)|>n\}}\,dt = 0,$$
and then we have applied the bounded convergence theorem on $L^1(\Omega, \mathcal{F}, P)$. 
By using the approximation result provided by the preceding Proposition,
we can give the following definition.
Definition 3.1 The Itô stochastic integral of a process $u \in L^2_{a,T}$ is
$$\int_0^T u_t\,dB_t := L^2(\Omega)\hbox{-}\lim_{n \to \infty} \int_0^T u_t^n\,dB_t. \tag{3.5}$$

For this definition to make sense, one needs to make sure that if the process u is approximated by two different sequences, say $u^{n,1}$ and $u^{n,2}$, the resulting definitions of the stochastic integral coincide. This is proved using the isometry property. Indeed,
$$E\left(\int_0^T u_t^{n,1}\,dB_t - \int_0^T u_t^{n,2}\,dB_t\right)^2 = \int_0^T E\left(u_t^{n,1} - u_t^{n,2}\right)^2 dt \leq 2\int_0^T E\left(u_t^{n,1} - u_t\right)^2 dt + 2\int_0^T E\left(u_t^{n,2} - u_t\right)^2 dt \to 0.$$
Consequently, denoting by $I^i(u) = L^2(\Omega)\hbox{-}\lim_{n \to \infty} \int_0^T u_t^{n,i}\,dB_t$, i = 1, 2, and using the triangle inequality, we have
$$\|I^1(u) - I^2(u)\|_2 \leq \left\|I^1(u) - \int_0^T u_t^{n,1}\,dB_t\right\|_2 + \left\|\int_0^T u_t^{n,1}\,dB_t - \int_0^T u_t^{n,2}\,dB_t\right\|_2 + \left\|I^2(u) - \int_0^T u_t^{n,2}\,dB_t\right\|_2,$$
where $\|\cdot\|_2$ denotes the norm in $L^2(\Omega)$.
The right-hand side of this inequality tends to zero as n → ∞. Hence the left-hand side is null.
By its very definition, the stochastic integral defined in Definition 3.1 satisfies the isometry property as well. Indeed, using again the notation I(·) for the stochastic integral, we have
$$\|I(u)\|_{L^2(\Omega)} = \lim_{n \to \infty} \|I(u^n)\|_{L^2(\Omega)} = \lim_{n \to \infty} \|u^n\|_{L^2(\Omega \times [0,T])} = \|u\|_{L^2(\Omega \times [0,T])}.$$

Moreover,

(a) stochastic integrals are centered random variables:
$$E\left(\int_0^T u_t\,dB_t\right) = 0;$$

(b) stochastic integration is a linear operator:
$$\int_0^T (au_t + bv_t)\,dB_t = a\int_0^T u_t\,dB_t + b\int_0^T v_t\,dB_t.$$
Remember that these facts are true for processes in E, as has been mentioned before. The extension to processes in $L^2_{a,T}$ is done by applying Proposition 3.1. For the sake of illustration, we prove (a).
Consider an approximating sequence $u^n$ in the sense of Proposition 3.1. By the construction of the stochastic integral $\int_0^T u_t\,dB_t$, it holds that
$$\lim_{n \to \infty} E\left(\int_0^T u_t^n\,dB_t\right) = E\left(\int_0^T u_t\,dB_t\right).$$
Since $E\left(\int_0^T u_t^n\,dB_t\right) = 0$ for every n ≥ 1, this concludes the proof.
We end this section with an interesting example.

Example 3.1 For the Brownian motion B, the following formula holds:
$$\int_0^T B_t\,dB_t = \frac{1}{2}\left(B_T^2 - T\right).$$
Let us remark that we would rather expect $\int_0^T B_t\,dB_t = \frac{1}{2}B_T^2$, by analogy with the rules of deterministic calculus.
To prove this identity, we consider a particular sequence of approximating step processes based on the partition $\left\{\frac{jT}{n},\ j = 0, \dots, n\right\}$, as follows:
$$u_t^n = \sum_{j=1}^n B_{t_{j-1}}\, 1_{(t_{j-1}, t_j]}(t),$$
with $t_j = \frac{jT}{n}$. Clearly, $u^n \in L^2_{a,T}$ and we have
$$E\int_0^T \left(u_t^n - B_t\right)^2 dt = \sum_{j=1}^n \int_{t_{j-1}}^{t_j} E\left(B_{t_{j-1}} - B_t\right)^2 dt \leq \frac{T}{n}\sum_{j=1}^n \int_{t_{j-1}}^{t_j} dt = \frac{T^2}{n}.$$
Therefore, $(u^n, n \geq 1)$ is an approximating sequence of B in the norm of $L^2(\Omega \times [0, T])$.
According to Definition 3.1,
$$\int_0^T B_t\,dB_t = \lim_{n \to \infty} \sum_{j=1}^n B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right),$$
in the $L^2(\Omega)$ norm.
Clearly,
$$\sum_{j=1}^n B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right) = \frac{1}{2}\sum_{j=1}^n \left(B_{t_j}^2 - B_{t_{j-1}}^2\right) - \frac{1}{2}\sum_{j=1}^n \left(B_{t_j} - B_{t_{j-1}}\right)^2 = \frac{1}{2}B_T^2 - \frac{1}{2}\sum_{j=1}^n \left(B_{t_j} - B_{t_{j-1}}\right)^2. \tag{3.6}$$
We conclude by using Proposition 2.3.
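The identity of Example 3.1 can be checked numerically: on one simulated path, the left-point sums $\sum_j B_{t_{j-1}}(B_{t_j} - B_{t_{j-1}})$ come close to $\frac{1}{2}(B_T^2 - T)$ computed from the same path. A minimal plain Python sketch (step number and seed are our own choices):

```python
import math
import random

random.seed(4)

def bdb_riemann_and_closed_form(T=1.0, n=20000):
    """Left-point sum sum_j B_{t_{j-1}}(B_{t_j} - B_{t_{j-1}}) for one
    simulated path, together with (B_T^2 - T)/2 for the same path."""
    dt = T / n
    sigma = math.sqrt(dt)
    b = 0.0
    acc = 0.0
    for _ in range(n):
        db = random.gauss(0.0, sigma)
        acc += b * db      # integrand evaluated at the LEFT endpoint
        b += db
    return acc, 0.5 * (b * b - T)

approx, closed = bdb_riemann_and_closed_form()
print(approx, closed)  # close for a fine partition
```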

3.2 The Itô integral as a stochastic process


The indefinite Itô stochastic integral of a process $u \in L^2_{a,T}$ is defined as follows:
$$\int_0^t u_s\,dB_s := \int_0^T u_s\,1_{[0,t]}(s)\,dB_s, \qquad t \in [0, T]. \tag{3.7}$$
For this definition to make sense, we need that, for any t ∈ [0, T], the process $\{u_s 1_{[0,t]}(s),\ s \in [0, T]\}$ belongs to $L^2_{a,T}$. This is clearly true.
Obviously, properties of the integral mentioned in the previous section, like
zero mean, isometry, linearity, also hold for the indefinite integral.
The rest of the section is devoted to the study of important properties of the
stochastic process given by an indefinite Itô integral.
Proposition 3.2 The process $\left\{I_t = \int_0^t u_s\,dB_s,\ t \in [0, T]\right\}$ is a martingale.

Proof: We first establish the martingale property for any approximating sequence
$$I_t^n = \int_0^t u_s^n\,dB_s, \qquad t \in [0, T],$$
where $u^n$ converges to u in $L^2(\Omega \times [0, T])$. This suffices to prove the Proposition, since $L^2(\Omega)$-limits of martingales are again martingales (this fact follows from Jensen's inequality).
Let $u_t^n$, t ∈ [0, T], be defined by the right-hand side of (3.2). Fix 0 ≤ s ≤ t ≤ T and assume that $t_{k-1} < s \leq t_k < t_l < t \leq t_{l+1}$. Then
$$I_t^n - I_s^n = u_k\left(B_{t_k} - B_s\right) + \sum_{j=k+1}^{l} u_j\left(B_{t_j} - B_{t_{j-1}}\right) + u_{l+1}\left(B_t - B_{t_l}\right).$$
Using properties (g) and (f), respectively, of the conditional expectation (see Appendix 1) yields
$$E\left(I_t^n - I_s^n / \mathcal{F}_s\right) = E\left(u_k(B_{t_k} - B_s)/\mathcal{F}_s\right) + \sum_{j=k+1}^{l} E\left(E\left(u_j \Delta_j B / \mathcal{F}_{t_{j-1}}\right)/\mathcal{F}_s\right) + E\left(u_{l+1} E\left(B_t - B_{t_l}/\mathcal{F}_{t_l}\right)/\mathcal{F}_s\right) = 0.$$
This finishes the proof of the proposition. 


A proof not very different from that of Proposition 2.3 yields

Proposition 3.3 For any bounded process $u \in L^2_{a,T}$,
$$L^1(\Omega)\hbox{-}\lim_{n \to \infty} \sum_{j=1}^n \left(\int_{t_{j-1}}^{t_j} u_s\,dB_s\right)^2 = \int_0^t u_s^2\,ds.$$

That means, the "quadratic variation" of the indefinite stochastic integral is given by the process $\left\{\int_0^t u_s^2\,ds,\ t \in [0, T]\right\}$.
The isometry property of the stochastic integral can be extended in the following sense. Let $p \in [2, \infty)$. Then,
$$E\left|\int_0^t u_s\,dB_s\right|^p \leq C(p)\, E\left(\int_0^t u_s^2\,ds\right)^{p/2}. \tag{3.8}$$
Here C(p) is a positive constant depending on p. This is Burkholder's inequality.
A combination of Burkholder's inequality and Kolmogorov's continuity criterion allows to deduce the continuity of the sample paths of the indefinite stochastic integral. Indeed, assume that $\int_0^T E(u_r)^p\,dr < \infty$ for any $p \in [2, \infty)$. Using first (3.8) and then Hölder's inequality (be smart!) implies
$$E\left|\int_s^t u_r\,dB_r\right|^p \leq C(p)\, E\left(\int_s^t u_r^2\,dr\right)^{p/2} \leq C(p)\,|t - s|^{\frac{p}{2}-1} \int_s^t E(u_r)^p\,dr \leq C(p)\,|t - s|^{\frac{p}{2}-1}.$$
Since p ≥ 2 is arbitrary, with Theorem 1.1 we conclude that the sample paths of $\left\{\int_0^t u_s\,dB_s,\ t \in [0, T]\right\}$ are γ-Hölder continuous with $\gamma \in \left(0, \frac{1}{2}\right)$.

3.3 An extension of the Itô integral

In Section 3.1 we introduced the set $L^2_{a,T}$ and defined the stochastic integral of processes of this class with respect to the Brownian motion. In this section we shall consider a larger class of integrands. The notation and underlying filtration are the same as in Section 3.1.
Let $\Lambda^2_{a,T}$ be the set of real valued processes u adapted to the filtration $(\mathcal{F}_t, t \geq 0)$, jointly measurable in (t, ω) with respect to the product σ-field $\mathcal{B}([0, T]) \otimes \mathcal{F}$, and satisfying
$$P\left(\int_0^T u_t^2\,dt < \infty\right) = 1. \tag{3.9}$$
Clearly $L^2_{a,T} \subset \Lambda^2_{a,T}$. Our aim is to define the stochastic integral for processes in $\Lambda^2_{a,T}$. For this we shall follow the same approach as in Section 3.1. Firstly, we start with step processes of the form (3.2) belonging to $\Lambda^2_{a,T}$ and define the integral as in (3.3). The extension to processes in $\Lambda^2_{a,T}$ needs two ingredients. The first one is an approximation result that we now state without giving a proof. The reader may consult for instance [1].
Proposition 3.4 Let $u \in \Lambda^2_{a,T}$. There exists a sequence of step processes $(u^n, n \geq 1)$ of the form (3.2) belonging to $\Lambda^2_{a,T}$ such that
$$\lim_{n \to \infty} \int_0^T |u_t^n - u_t|^2\,dt = 0, \quad \text{a.s.}$$
The second ingredient gives a connection between the stochastic integral of a step process in $\Lambda^2_{a,T}$ and its quadratic variation, as follows.

Proposition 3.5 Let u be a step process in $\Lambda^2_{a,T}$. Then for any $\epsilon > 0$, N > 0,
$$P\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} \leq P\left\{\int_0^T u_t^2\,dt > N\right\} + \frac{N}{\epsilon^2}. \tag{3.10}$$
Proof: It is based on a truncation argument. Let u be given by the right-hand side of (3.2) (here it is not necessary to assume that the random variables $u_j$ are in $L^2(\Omega)$). Fix N > 0 and define
$$v_t^N = \begin{cases} u_j, & \text{if } t \in [t_{j-1}, t_j) \text{ and } \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) \leq N, \\ 0, & \text{if } t \in [t_{j-1}, t_j) \text{ and } \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) > N. \end{cases}$$
The process $\{v_t^N, t \in [0, T]\}$ belongs to $L^2_{a,T}$. Indeed, by definition,
$$\int_0^T |v_t^N|^2\,dt \leq N.$$
Moreover, if $\int_0^T u_t^2\,dt \leq N$, necessarily $u_t = v_t^N$ for any t ∈ [0, T]. Then, by considering the decomposition
$$\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} = \left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon,\ \int_0^T u_t^2\,dt > N\right\} \cup \left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon,\ \int_0^T u_t^2\,dt \leq N\right\},$$
we obtain
$$P\left\{\left|\int_0^T u_t\,dB_t\right| > \epsilon\right\} \leq P\left\{\left|\int_0^T v_t^N\,dB_t\right| > \epsilon\right\} + P\left\{\int_0^T u_t^2\,dt > N\right\}.$$
We finally apply Chebyshev's inequality along with the isometry property of the stochastic integral for processes in $L^2_{a,T}$ and get
$$P\left\{\left|\int_0^T v_t^N\,dB_t\right| > \epsilon\right\} \leq \frac{1}{\epsilon^2}\, E\left(\int_0^T v_t^N\,dB_t\right)^2 \leq \frac{N}{\epsilon^2}.$$
This ends the proof of the result.

The extension
Fix $u \in \Lambda^2_{a,T}$ and consider a sequence of step processes $(u^n, n \geq 1)$ of the form (3.2) belonging to $\Lambda^2_{a,T}$ such that
$$\lim_{n \to \infty} \int_0^T |u_t^n - u_t|^2\,dt = 0, \tag{3.11}$$
in the sense of convergence in probability.
By Proposition 3.5, for any $\epsilon > 0$, N > 0 we have
$$P\left\{\left|\int_0^T (u_t^n - u_t^m)\,dB_t\right| > \epsilon\right\} \leq P\left\{\int_0^T (u_t^n - u_t^m)^2\,dt > N\right\} + \frac{N}{\epsilon^2}.$$
Fix $\epsilon > 0$ and take N small enough so that $\frac{N}{\epsilon^2} \leq \frac{\epsilon}{2}$. Then, using (3.11), for n, m big enough,
$$P\left\{\int_0^T (u_t^n - u_t^m)^2\,dt > N\right\} \leq \frac{\epsilon}{2}.$$
Consequently, we have proved that the sequence of stochastic integrals of step processes
$$\left(\int_0^T u_t^n\,dB_t,\ n \geq 1\right) \tag{3.12}$$
is Cauchy in probability. The space $L^0(\Omega)$ of classes of a.s. finite random variables, endowed with the convergence in probability, is a complete metric space. For example, a possible distance is
$$d(X, Y) = E\left(\frac{|X - Y|}{1 + |X - Y|}\right).$$
Hence the sequence (3.12) does have a limit in probability. Then, we define
$$\int_0^T u_t\,dB_t = P\hbox{-}\lim_{n \to \infty} \int_0^T u_t^n\,dB_t. \tag{3.13}$$

It is easy to check that this definition is indeed independent of the particular approximating sequence used in the construction.
approximation sequence used in the construction.

3.4 A change of variables formula: Itô's formula

As in Example 3.1, we can prove the following formula, valid for any t ≥ 0:
$$B_t^2 = 2\int_0^t B_s\,dB_s + t. \tag{3.14}$$
If the sample paths of $\{B_t, t \geq 0\}$ were sufficiently smooth (for example, of bounded variation), we would rather have
$$B_t^2 = 2\int_0^t B_s\,dB_s. \tag{3.15}$$

Why is it so? Consider a decomposition similar to the one given in (3.6), obtained by restricting the time interval to [0, t]. More concretely, consider the partition of [0, t] defined by $0 = t_0 \leq t_1 \leq \cdots \leq t_n = t$:
$$B_t^2 = \sum_{j=0}^{n-1} \left(B_{t_{j+1}}^2 - B_{t_j}^2\right) = 2\sum_{j=0}^{n-1} B_{t_j}\left(B_{t_{j+1}} - B_{t_j}\right) + \sum_{j=0}^{n-1} \left(B_{t_{j+1}} - B_{t_j}\right)^2, \tag{3.16}$$
where we have used that $B_0 = 0$.
Consider a sequence of partitions of [0, t] whose mesh tends to zero. We already know that
$$\sum_{j=0}^{n-1} \left(B_{t_{j+1}} - B_{t_j}\right)^2 \to t,$$

in the sense of convergence in $L^2(\Omega)$. This gives the extra contribution in the development of $B_t^2$ in comparison with the classical calculus approach.
Notice that, if B were of bounded variation, then we could argue as follows:
$$\sum_{j=0}^{n-1} \left(B_{t_{j+1}} - B_{t_j}\right)^2 \leq \sup_{0 \leq j \leq n-1} \left|B_{t_{j+1}} - B_{t_j}\right| \times \sum_{j=0}^{n-1} \left|B_{t_{j+1}} - B_{t_j}\right|.$$
By the continuity of the sample paths of the Brownian motion, the first factor on the right-hand side of the preceding inequality tends to zero as the mesh of the partition tends to zero, while the second factor remains finite, by the property of bounded variation.
Summarising: differential calculus with respect to the Brownian motion should take into account second order differential terms. Roughly speaking,
$$(dB_t)^2 = dt.$$
A precise meaning to this formal formula is given in Proposition 2.3.

3.4.1 One dimensional Itô’s formula


In this section, we shall extend the formula (3.14) and write an expression for
f (t, Xt ) for a class of functions f and a family of stochastic processes which
include f (x) = x2 and the Brownian motion, respectively. To illustrate the
method of the proof, we start with the particular case given in the next
assertion (see [6]).

Theorem 3.1 Consider a function f : R → R belonging to $C^2$, the class of real functions which are continuously differentiable up to order two. Then, for any 0 ≤ a ≤ t,
$$f(B_t) = f(B_a) + \int_a^t f'(B_s)\,dB_s + \frac{1}{2}\int_a^t f''(B_s)\,ds, \tag{3.17}$$
a.s.

The proof relies on two technical lemmas. In the sequel, we will consider
a sequence of partitions Πn = {0 = t0 ≤ t1 ≤ · · · ≤ tn = t} such that
limn→∞ |Πn | = 0.

Lemma 3.1 Let g be a real continuous function and $\lambda_i \in (0, 1)$, i = 1, ..., n. There exists a subsequence (denoted by (n), for simplicity) such that
$$X_n := \sum_{i=1}^n \left[g\left(B_{t_{i-1}} + \lambda_i(B_{t_i} - B_{t_{i-1}})\right) - g\left(B_{t_{i-1}}\right)\right]\left(B_{t_i} - B_{t_{i-1}}\right)^2, \quad n \geq 1,$$
converges to zero a.s.

Proof. Consider the random variable
$$Y_n = \max_{1 \leq i \leq n,\ 0 < \lambda < 1} \left|g\left(B_{t_{i-1}} + \lambda(B_{t_i} - B_{t_{i-1}})\right) - g\left(B_{t_{i-1}}\right)\right|.$$
Clearly,
$$|X_n| \leq Y_n \sum_{i=1}^n \left(B_{t_i} - B_{t_{i-1}}\right)^2.$$
By the continuity of the sample paths of Brownian motion, $(Y_n)_{n \geq 1}$ converges to zero a.s. Moreover, $\left(\sum_{i=1}^n (B_{t_i} - B_{t_{i-1}})^2,\ n \geq 1\right)$ converges in $L^2(\Omega)$ to t. Hence, the lemma holds.


Lemma 3.2 The hypotheses are the same as in Lemma 3.1. The sequence
$$S_n := \sum_{i=1}^n g\left(B_{t_{i-1}}\right)\left[\left(B_{t_i} - B_{t_{i-1}}\right)^2 - (t_i - t_{i-1})\right], \quad n \geq 1,$$
converges in probability to zero.

Proof. We will introduce a localization in Ω. With this, g may be assumed to be bounded, and we will prove convergence to zero in $L^2(\Omega)$. Finally, we will remove the localization and obtain the result. This is a quite usual procedure in probability theory.
Fix L > 0 and set
$$A_{i-1}^{(L)} = \{|B_{t_l}| \leq L,\ 0 \leq l \leq i - 1\}, \quad 1 \leq i \leq n.$$
Notice that this is a decreasing family in the parameter i. Define
$$S_{n,L} = \sum_{i=1}^n g\left(B_{t_{i-1}}\right) 1_{A_{i-1}^{(L)}}\left[\left(B_{t_i} - B_{t_{i-1}}\right)^2 - (t_i - t_{i-1})\right].$$
This is a localization of $S_n$.
To simplify the notation, we set
$$X_i = \left(B_{t_i} - B_{t_{i-1}}\right)^2 - (t_i - t_{i-1}), \qquad Y_i = g\left(B_{t_{i-1}}\right) 1_{A_{i-1}^{(L)}} X_i,$$
$1 \leq i \leq n$. Our aim is to prove first that $E\left(\left|\sum_i Y_i\right|^2\right) \to 0$.
Let $\mathcal{F}_t = \sigma(B_s,\ 0 \leq s \leq t)$. Fix $1 \leq i < j \leq n$. Then
$$E(Y_i Y_j) = E\left(E\left(Y_i Y_j / \mathcal{F}_{t_{j-1}}\right)\right) = E\left(Y_i\, g\left(B_{t_{j-1}}\right) 1_{A_{j-1}^{(L)}}\, E\left(X_j / \mathcal{F}_{t_{j-1}}\right)\right) = 0.$$
We clearly have $Y_i^2 \leq \sup_{|x| \leq L} |g(x)|^2\, X_i^2$ and therefore,
$$E(Y_i^2) \leq C (t_i - t_{i-1})^2 \sup_{|x| \leq L} |g(x)|^2.$$
Consequently,
$$E(S_{n,L}^2) = \sum_{i=1}^n E(Y_i^2) \leq C\, t \sup_{|x| \leq L} |g(x)|^2\, |\Pi_n|,$$
which converges to zero as n → ∞.


Next, we fix $\epsilon > 0$ and write
$$P\{|S_n| > \epsilon\} = P\{|S_n| > \epsilon,\ S_n = S_{n,L}\} + P\{|S_n| > \epsilon,\ S_n \neq S_{n,L}\} \leq P\{|S_{n,L}| > \epsilon\} + P\{S_n \neq S_{n,L}\}.$$
Chebyshev's inequality yields
$$\lim_{n \to \infty} P\{|S_{n,L}| > \epsilon\} \leq \epsilon^{-2} \lim_{n \to \infty} E(|S_{n,L}|^2) = 0.$$
Moreover, if ω is such that $S_n(\omega) \neq S_{n,L}(\omega)$, there exists i = 1, ..., n such that $\omega \in (A_{i-1}^{(L)})^c$. The family $(A_{i-1}^{(L)})_i$ decreases in i; consequently, $\omega \in (A_{n-1}^{(L)})^c$. Thus,
$$P\{S_n \neq S_{n,L}\} \leq P\left((A_{n-1}^{(L)})^c\right) \leq P\left\{\sup_{0 \leq s \leq t} |B_s| > L\right\}.$$
We proved in Proposition 2.4 that the laws of $S_t = \sup_{0 \leq s \leq t} B_s$ and $|B_t|$ are the same. Together with the symmetry of Brownian motion, this yields
$$P\left\{\sup_{0 \leq s \leq t} |B_s| > L\right\} \leq 2P\left\{\sup_{0 \leq s \leq t} B_s > L\right\} = 2P\{|B_t| > L\} \leq 2L^{-1} E(|B_t|) = 2L^{-1}\sqrt{\frac{2t}{\pi}}.$$
This yields
$$\lim_{L \to \infty} P\{S_n \neq S_{n,L}\} = 0,$$
uniformly in n, and ends the proof of the Lemma.




Proof of Theorem 3.1
For simplicity, we take a = 0. We fix ω and consider a Taylor expansion up to the second order. This yields
$$f(B_t) - f(0) = \sum_{i=1}^n f'\left(B_{t_{i-1}}\right)\left(B_{t_i} - B_{t_{i-1}}\right) + \frac{1}{2}\sum_{i=1}^n f''\left(B_{t_{i-1}} + \lambda_i(B_{t_i} - B_{t_{i-1}})\right)\left(B_{t_i} - B_{t_{i-1}}\right)^2.$$
The stochastic process $\{u_t = f'(B_t),\ t \geq 0\}$ has continuous sample paths, a.s. Let
$$u^n(t) = \sum_{i=1}^n f'\left(B_{t_{i-1}}\right) 1_{[t_{i-1}, t_i)}(t).$$
By continuity,
$$\lim_{n \to \infty} \int_0^t |u^n(s) - u(s)|^2\,ds = 0, \quad \text{a.s.}$$
Therefore,
$$\lim_{n \to \infty} \sum_{i=1}^n f'\left(B_{t_{i-1}}\right)\left(B_{t_i} - B_{t_{i-1}}\right) = \lim_{n \to \infty} \int_0^t u^n(s)\,dB_s = \int_0^t u(s)\,dB_s,$$
in probability, and also a.s. for some subsequence.

Next we consider the terms with the second derivative. By the triangle inequality, we have
$$\begin{aligned}
&\left|\sum_{i=1}^n f''\left(B_{t_{i-1}} + \lambda_i(B_{t_i} - B_{t_{i-1}})\right)\left(B_{t_i} - B_{t_{i-1}}\right)^2 - \int_0^t f''(B_s)\,ds\right| \\
&\quad \leq \left|\sum_{i=1}^n f''\left(B_{t_{i-1}} + \lambda_i(B_{t_i} - B_{t_{i-1}})\right)\left(B_{t_i} - B_{t_{i-1}}\right)^2 - \sum_{i=1}^n f''\left(B_{t_{i-1}}\right)\left(B_{t_i} - B_{t_{i-1}}\right)^2\right| \\
&\qquad + \left|\sum_{i=1}^n f''\left(B_{t_{i-1}}\right)\left[\left(B_{t_i} - B_{t_{i-1}}\right)^2 - (t_i - t_{i-1})\right]\right| + \left|\sum_{i=1}^n f''\left(B_{t_{i-1}}\right)(t_i - t_{i-1}) - \int_0^t f''(B_s)\,ds\right|.
\end{aligned}$$
The first term on the right-hand side of the preceding inequality converges to zero as n → ∞, a.s. Indeed, this follows from Lemma 3.1 applied to the function $g := f''$. With the same choice of g, Lemma 3.2 yields the a.s. convergence to zero of a subsequence for the second term. Finally, the third term converges to zero as well, a.s., by the classical result on approximation of Riemann integrals by Riemann sums.

We now introduce the class of Itô processes, for which we will prove a more general version of the Itô formula.

Definition 3.2 Let $\{v_t, t \in [0, T]\}$ be an adapted stochastic process whose sample paths are almost surely Lebesgue integrable, that is, $\int_0^T |v_t|\,dt < \infty$, a.s. Consider a stochastic process $\{u_t, t \in [0, T]\}$ belonging to $\Lambda^2_{a,T}$ and a random variable $X_0$. The stochastic process defined by
$$X_t = X_0 + \int_0^t u_s\,dB_s + \int_0^t v_s\,ds, \tag{3.18}$$
t ∈ [0, T], is termed an Itô process.

An alternative writing of (3.18) in differential form is
$$dX_t = u_t\,dB_t + v_t\,dt.$$
Let $C^{1,2}$ denote the set of functions on [0, T] × R which are jointly continuous in (t, x), continuously differentiable in t and twice continuously differentiable in x, with jointly continuous derivatives. Our next aim is to prove an Itô formula for the stochastic process $\{f(t, X_t),\ t \in [0, T]\}$, $f \in C^{1,2}$. This will be an extension of (3.17).

Theorem 3.2 Let $f : [0, T] \times \mathbb{R} \to \mathbb{R}$ be a function in $C^{1,2}$ and let $X$ be an Itô process with decomposition given in (3.18). The following formula holds:
\[
f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\,ds + \int_0^t \partial_x f(s, X_s)\, u_s\,dB_s
+ \int_0^t \partial_x f(s, X_s)\, v_s\,ds + \frac{1}{2} \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds. \tag{3.19}
\]
Formula (3.19) can also be written as
\[
f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\,ds + \int_0^t \partial_x f(s, X_s)\,dX_s
+ \frac{1}{2} \int_0^t \partial^2_{xx} f(s, X_s)\,(dX_s)^2, \tag{3.20}
\]
or, in differential form,
\[
df(t, X_t) = \partial_t f(t, X_t)\,dt + \partial_x f(t, X_t)\,dX_t + \frac{1}{2}\, \partial^2_{xx} f(t, X_t)\,(dX_t)^2, \tag{3.21}
\]
where $(dX_t)^2$ is computed using the formal rules
\[
dB_t \times dB_t = dt, \qquad dB_t \times dt = dt \times dB_t = 0, \qquad dt \times dt = 0.
\]
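The formal multiplication table above can be probed numerically: on a fine grid, the sum of squared Brownian increments over $[0, t]$ approximates $t$ (the quadratic variation of $B$), while the mixed and pure-$dt$ products vanish with the mesh. A minimal sketch, assuming NumPy is available; the grid size and seed are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
t, n = 1.0, 100_000
dt = t / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)   # Brownian increments on a uniform grid

sum_dB_dB = np.sum(dB * dB)   # "dB x dB = dt": close to t
sum_dB_dt = np.sum(dB * dt)   # "dB x dt = 0": equals dt * B_t, negligible
sum_dt_dt = n * dt * dt       # "dt x dt = 0": equals t * dt, negligible
print(sum_dB_dB, sum_dB_dt, sum_dt_dt)
```

As the mesh goes to zero the first sum converges to $t$ in $L^2$, which is precisely the reason the second-order term in Itô's formula survives while the other products drop out.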

Example 3.2 Consider the function
\[
f(t, x) = e^{\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma x},
\]
with $\mu, \sigma \in \mathbb{R}$. Applying formula (3.19) to $X_t := B_t$ (a Brownian motion) yields
\[
f(t, B_t) = 1 + \mu \int_0^t f(s, B_s)\,ds + \sigma \int_0^t f(s, B_s)\,dB_s.
\]

Hence, the process $\{Y_t = f(t, B_t), t \ge 0\}$ satisfies the equation
\[
Y_t = 1 + \mu \int_0^t Y_s\,ds + \sigma \int_0^t Y_s\,dB_s.
\]
It is termed geometric Brownian motion. The equivalent differential form of this identity is the linear stochastic differential equation
\[
dY_t = \mu Y_t\,dt + \sigma Y_t\,dB_t, \qquad Y_0 = 1. \tag{3.22}
\]
Black and Scholes proposed, as a model of a market with a single risky asset with initial value $S_0 = 1$, the process $S_t = Y_t$. We have seen that such a process is in fact the solution to a linear stochastic differential equation (see (3.22)).
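As a numerical sanity check (not part of the text; parameters and the Euler discretization below are illustrative), one can compare the closed form $Y_t = \exp((\mu - \sigma^2/2)t + \sigma B_t)$ with an Euler scheme for (3.22) driven by the same Brownian path:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, T, n = 0.05, 0.2, 1.0, 100_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.cumsum(dB)
t = dt * np.arange(1, n + 1)

# Closed form Y_t = exp((mu - sigma^2/2) t + sigma B_t)
Y_exact = np.exp((mu - 0.5 * sigma**2) * t + sigma * B)

# Euler scheme for dY = mu Y dt + sigma Y dB, Y_0 = 1, on the same path
Y_euler = np.cumprod(1.0 + mu * dt + sigma * dB)

err = np.max(np.abs(Y_euler - Y_exact))
print(err)   # small, and shrinks as the mesh dt -> 0
```

The agreement of the two paths is one way to see that the drift of $Y$ is $\mu Y$, not $(\mu - \sigma^2/2)Y$: the Itô correction $\frac{1}{2}\sigma^2 Y$ is exactly what turns the exponent's rate into $\mu$.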

In the particular case where $f : \mathbb{R} \to \mathbb{R}$ is a function in $C^2$ (twice continuously differentiable), Theorem 3.2 gives the following version of Itô's formula:
\[
f(X_t) = f(X_0) + \int_0^t f'(X_s)\, u_s\,dB_s + \int_0^t f'(X_s)\, v_s\,ds + \frac{1}{2} \int_0^t f''(X_s)\, u_s^2\,ds. \tag{3.23}
\]
Proof of Theorem 3.2
Let $\Pi_n = \{0 = t^n_0 < \cdots < t^n_{p_n} = t\}$ be a sequence of increasing partitions such that $\lim_{n\to\infty} |\Pi_n| = 0$. First, we consider the decomposition
\[
f(t, X_t) - f(0, X_0) = \sum_{i=0}^{p_n - 1} \Big[ f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_i, X_{t^n_i}) \Big]
= \sum_{i=0}^{p_n - 1} \Big[ f(t^n_{i+1}, X_{t^n_i}) - f(t^n_i, X_{t^n_i}) \Big] + \Big[ f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_{i+1}, X_{t^n_i}) \Big]
\]
\[
= \sum_{i=0}^{p_n - 1} \partial_s f(\bar{t}^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) + \partial_x f(t^n_{i+1}, X_{t^n_i})(X_{t^n_{i+1}} - X_{t^n_i})
+ \frac{1}{2} \sum_{i=0}^{p_n - 1} \partial^2_{xx} f(t^n_{i+1}, \bar{X}^n_i)(X_{t^n_{i+1}} - X_{t^n_i})^2, \tag{3.24}
\]
with $\bar{t}^n_i \in\, ]t^n_i, t^n_{i+1}[$ and $\bar{X}^n_i$ an intermediate (random) point on the segment determined by $X_{t^n_i}$ and $X_{t^n_{i+1}}$.
In fact, this follows from a Taylor expansion of the function $f$ up to the first order in the variable $s$ (or the mean-value theorem), and up to the second order in the variable $x$. The asymmetry in the orders is due to the existence of a quadratic variation for the processes involved. The expression (3.24) is the analogue of (3.16). The latter is much simpler, for two reasons: firstly, there is no $s$-variable; secondly, $f$ is a polynomial of second degree, and therefore it has an exact Taylor expansion. But both formulas have the same structure.
When passing to the limit as $n \to \infty$, we expect
\[
\sum_{i=0}^{p_n - 1} \partial_s f(\bar{t}^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) \to \int_0^t \partial_s f(s, X_s)\,ds,
\]
\[
\sum_{i=0}^{p_n - 1} \partial_x f(t^n_{i+1}, X_{t^n_i})(X_{t^n_{i+1}} - X_{t^n_i}) \to \int_0^t \partial_x f(s, X_s)\, u_s\,dB_s + \int_0^t \partial_x f(s, X_s)\, v_s\,ds,
\]
\[
\sum_{i=0}^{p_n - 1} \partial^2_{xx} f(t^n_{i+1}, \bar{X}^n_i)(X_{t^n_{i+1}} - X_{t^n_i})^2 \to \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds,
\]
in some topology.
This actually holds in the sense of a.s. convergence (by taking, if necessary, a subsequence). As in Theorem 3.1, the proof requires a localization in $\Omega$. However, this can be avoided by imposing some additional assumptions, as follows: the process $v$ is bounded; $u \in L^2_{a,T}$; the partial derivatives $\partial_x f$, $\partial^2_{xx} f$ are bounded. We shall give a proof of the theorem under these additional hypotheses.
Checking the convergences
First term. We show
\[
\sum_{i=0}^{p_n - 1} \partial_s f(\bar{t}^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) \to \int_0^t \partial_s f(s, X_s)\,ds, \tag{3.25}
\]
a.s. Indeed,
\[
\Big| \sum_i \partial_s f(\bar{t}^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) - \int_0^t \partial_s f(s, X_s)\,ds \Big|
= \Big| \sum_i \int_{t^n_i}^{t^n_{i+1}} \big[ \partial_s f(\bar{t}^n_i, X_{t^n_i}) - \partial_s f(s, X_s) \big]\,ds \Big|
\]
\[
\le \sum_i \int_{t^n_i}^{t^n_{i+1}} \big| \partial_s f(\bar{t}^n_i, X_{t^n_i}) - \partial_s f(s, X_s) \big|\,ds
\le t \sup_i \sup_{s \in [t^n_i, t^n_{i+1}]} \big| \partial_s f(\bar{t}^n_i, X_{t^n_i}) - \partial_s f(s, X_s) \big|\, \mathbf{1}_{[t^n_i, t^n_{i+1}]}(s).
\]
The continuity of $\partial_s f$ along with that of the process $X$ implies
\[
\lim_{n\to\infty} \sup_i \sup_{s \in [t^n_i, t^n_{i+1}]} \big| \partial_s f(\bar{t}^n_i, X_{t^n_i}) - \partial_s f(s, X_s) \big|\, \mathbf{1}_{[t^n_i, t^n_{i+1}]}(s) = 0.
\]
This gives (3.25).


Second term. We prove
\[
\sum_{i=0}^{p_n - 1} \partial_x f(t^n_i, X_{t^n_i})(X_{t^n_{i+1}} - X_{t^n_i}) \to \int_0^t \partial_x f(s, X_s)\,dX_s \tag{3.26}
\]
in probability, which amounts to checking two convergences:
\[
\sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \partial_x f(t^n_i, X_{t^n_i})\, u_t\,dB_t \to \int_0^t \partial_x f(s, X_s)\, u_s\,dB_s, \tag{3.27}
\]
\[
\sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \partial_x f(t^n_i, X_{t^n_i})\, v_t\,dt \to \int_0^t \partial_x f(s, X_s)\, v_s\,ds. \tag{3.28}
\]
We start with (3.27). We have
\[
E\Big( \Big| \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \partial_x f(t^n_i, X_{t^n_i})\, u_t\,dB_t - \int_0^t \partial_x f(s, X_s)\, u_s\,dB_s \Big|^2 \Big)
= E\Big( \Big| \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \big[ \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big] u_s\,dB_s \Big|^2 \Big)
\]
\[
= \sum_{i=0}^{p_n - 1} E\Big( \Big| \int_{t^n_i}^{t^n_{i+1}} \big[ \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big] u_s\,dB_s \Big|^2 \Big)
= \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} E\Big( \big[ \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big]^2 u_s^2 \Big)\,ds,
\]
where we have used that stochastic integrals over disjoint intervals are mutually orthogonal (the expectation of their product vanishes, by the martingale property of the stochastic integral), along with the isometry property.
By the continuity of $\partial_x f$ and of the process $X$,
\[
\sup_i \sup_{s \in [t^n_i, t^n_{i+1}]} \big| \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big|\, \mathbf{1}_{[t^n_i, t^n_{i+1}]}(s) \to 0, \tag{3.29}
\]
a.s. Then, by bounded convergence,
\[
\sup_i \sup_{s \in [t^n_i, t^n_{i+1}]} E\Big( \big[ \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big]^2 u_s^2\, \mathbf{1}_{[t^n_i, t^n_{i+1}]}(s) \Big) \to 0.
\]
Therefore we obtain (3.27) in $L^2(\Omega)$.


For the proof of (3.28) we write
\[
\Big| \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \partial_x f(t^n_i, X_{t^n_i})\, v_t\,dt - \int_0^t \partial_x f(s, X_s)\, v_s\,ds \Big|
= \Big| \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \big[ \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big] v_s\,ds \Big|
\]
\[
\le \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \big| \partial_x f(t^n_i, X_{t^n_i}) - \partial_x f(s, X_s) \big|\, |v_s|\,ds.
\]
By virtue of (3.29) and bounded convergence, we obtain (3.28) in the sense of a.s. convergence.
Third term. Set $f_{n,i} = \partial^2_{xx} f(t^n_{i+1}, \bar{X}^n_i)$. We have to prove
\[
\sum_{i=0}^{p_n - 1} f_{n,i}\, (X_{t^n_{i+1}} - X_{t^n_i})^2 \to \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds. \tag{3.30}
\]
This will be a consequence of the following convergences:
\[
\sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big)^2 \to \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds, \tag{3.31}
\]
\[
\sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big) \Big( \int_{t^n_i}^{t^n_{i+1}} v_s\,ds \Big) \to 0, \tag{3.32}
\]
\[
\sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} v_s\,ds \Big)^2 \to 0, \tag{3.33}
\]
in the sense of a.s. convergence.


Let us start by arguing on (3.31). We have
\[
E\Big( \Big| \sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big)^2 - \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds \Big| \Big) \le T_1 + T_2,
\]
with
\[
T_1 = E\Big( \Big| \sum_{i=0}^{p_n - 1} f_{n,i} \Big[ \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big)^2 - \int_{t^n_i}^{t^n_{i+1}} u_s^2\,ds \Big] \Big| \Big),
\]
\[
T_2 = E\Big( \Big| \sum_{i=0}^{p_n - 1} f_{n,i} \int_{t^n_i}^{t^n_{i+1}} u_s^2\,ds - \int_0^t \partial^2_{xx} f(s, X_s)\, u_s^2\,ds \Big| \Big).
\]
Since $\partial^2_{xx} f$ is bounded,
\[
T_1 \le C\, E\Big( \Big| \sum_{i=0}^{p_n - 1} \Big[ \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big)^2 - \int_{t^n_i}^{t^n_{i+1}} u_s^2\,ds \Big] \Big| \Big).
\]
This tends to zero as $n \to \infty$ (see Proposition 3.3).
As for $T_2$, we have
\[
T_2 = E\Big( \Big| \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \big[ f_{n,i} - \partial^2_{xx} f(s, X_s) \big]\, u_s^2\,ds \Big| \Big)
\le E\Big( \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} \big| f_{n,i} - \partial^2_{xx} f(s, X_s) \big|\, u_s^2\,ds \Big).
\]
Using the continuity property, we have
\[
\sup_{0 \le i \le p_n - 1}\ \sup_{s \in [t^n_i, t^n_{i+1}]} \big| f_{n,i} - \partial^2_{xx} f(s, X_s) \big| \to 0,
\]
a.s. Then, by bounded convergence, we obtain that $T_2$ converges to zero as $n \to \infty$. Hence we have proved (3.31) in $L^1(\Omega)$ (and therefore also a.s. for some subsequence).
Next we prove (3.32) in $L^1(\Omega)$. Indeed,
\[
E\Big( \Big| \sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big) \Big( \int_{t^n_i}^{t^n_{i+1}} v_s\,ds \Big) \Big| \Big)
\le C \sum_{i=0}^{p_n - 1} E\Big( \Big| \int_{t^n_i}^{t^n_{i+1}} u_s\,dB_s \Big|\, \Big| \int_{t^n_i}^{t^n_{i+1}} v_s\,ds \Big| \Big)
\]
\[
\le C \sum_{i=0}^{p_n - 1} \Big( E \int_{t^n_i}^{t^n_{i+1}} u_s^2\,ds \Big)^{1/2} \Big( E \Big( \int_{t^n_i}^{t^n_{i+1}} |v_s|\,ds \Big)^2 \Big)^{1/2}
\le C \sum_{i=0}^{p_n - 1} |t^n_{i+1} - t^n_i|^{1/2} \Big( \int_{t^n_i}^{t^n_{i+1}} E|u_s|^2\,ds \Big)^{1/2} \Big( \int_{t^n_i}^{t^n_{i+1}} E|v_s|^2\,ds \Big)^{1/2}
\]
\[
\le C \sup_i |t^n_{i+1} - t^n_i|^{1/2} \Big( \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} E|u_s|^2\,ds \Big)^{1/2} \Big( \sum_{i=0}^{p_n - 1} \int_{t^n_i}^{t^n_{i+1}} E|v_s|^2\,ds \Big)^{1/2},
\]
which tends to zero as $n \to \infty$.


The proof of (3.33) is straightforward. Indeed,
\[
\sum_{i=0}^{p_n - 1} f_{n,i} \Big( \int_{t^n_i}^{t^n_{i+1}} v_s\,ds \Big)^2 \le C\, \sup_i \Big( \int_{t^n_i}^{t^n_{i+1}} |v_s|\,ds \Big) \int_0^t |v_s|\,ds.
\]
The first factor on the right-hand side of this inequality tends to zero as $n \to \infty$, while the second one is bounded, a.s. Therefore (3.33) holds in the sense of a.s. convergence.
This ends the proof of the theorem.


3.4.2 Multidimensional version of Itô's formula

Consider an $m$-dimensional Brownian motion $\{(B^1_t, \ldots, B^m_t), t \ge 0\}$ and $p$ real-valued Itô processes, as follows:
\[
dX^i_t = \sum_{l=1}^{m} u^{i,l}_t\,dB^l_t + v^i_t\,dt, \tag{3.34}
\]
$i = 1, \ldots, p$. We assume that each of the processes $u^{i,l}$ belongs to $\Lambda^2_{a,T}$ and that $\int_0^T |v^i_t|\,dt < \infty$, a.s. Following a similar plan as for Theorem 3.2, we can prove the following:

Theorem 3.3 Let $f : [0, \infty) \times \mathbb{R}^p \to \mathbb{R}$ be a function of class $C^{1,2}$ and let $X = (X^1, \ldots, X^p)$ be given by (3.34). Then
\[
f(t, X_t) = f(0, X_0) + \int_0^t \partial_s f(s, X_s)\,ds + \sum_{k=1}^{p} \int_0^t \partial_{x_k} f(s, X_s)\,dX^k_s
+ \frac{1}{2} \sum_{k,l=1}^{p} \int_0^t \partial^2_{x_k x_l} f(s, X_s)\,dX^k_s\,dX^l_s, \tag{3.35}
\]
where, in order to compute $dX^k_s\,dX^l_s$, we have to apply the following rules:
\[
dB^k_s\,dB^l_s = \delta_{k,l}\,ds, \qquad dB^k_s\,ds = 0, \qquad (ds)^2 = 0, \tag{3.36}
\]
where $\delta_{k,l}$ denotes the Kronecker symbol.
We remark that the identity (3.36) is a consequence of the independence of the components of the Brownian motion.
Example 3.3 Consider the particular case $m = 1$, $p = 2$ and $f(x, y) = xy$. That is, $f$ does not depend on $t$, and we have denoted a generic point of $\mathbb{R}^2$ by $(x, y)$. Then formula (3.35) yields
\[
X^1_t X^2_t = X^1_0 X^2_0 + \int_0^t X^1_s\,dX^2_s + \int_0^t X^2_s\,dX^1_s + \int_0^t u^1_s u^2_s\,ds. \tag{3.37}
\]
4 Applications of the Itô formula

This chapter is devoted to giving some important results whose proofs rely on the Itô formula.

4.1 Burkholder-Davis-Gundy inequalities

Theorem 4.1 Let $u \in L^2_{a,T}$ and set $M_t = \int_0^t u_s\,dB_s$. Define
\[
M^*_t = \sup_{s \in [0,t]} |M_s|.
\]
Then, for any $p > 0$, there exist two positive constants $c_p$, $C_p$ such that
\[
c_p\, E\Big( \int_0^T u_s^2\,ds \Big)^{p/2} \le E\big( (M^*_T)^p \big) \le C_p\, E\Big( \int_0^T u_s^2\,ds \Big)^{p/2}. \tag{4.1}
\]
Proof: We will only prove here the right-hand side of (4.1) for $p \ge 2$. For this, we assume that the process $\{M_t, t \in [0, T]\}$ is bounded; this assumption can be removed by a localization argument.
Consider the function $f(x) = |x|^p$, for which we have
\[
f'(x) = p|x|^{p-1}\,\mathrm{sign}(x), \qquad f''(x) = p(p-1)|x|^{p-2},
\]
for $x \ne 0$. Then, according to (3.23), we obtain
\[
|M_t|^p = \int_0^t p|M_s|^{p-1}\,\mathrm{sign}(M_s)\, u_s\,dB_s + \frac{1}{2} \int_0^t p(p-1)|M_s|^{p-2} u_s^2\,ds.
\]
Applying the expectation operator to both sides of the above identity yields
\[
E\big( |M_t|^p \big) = \frac{p(p-1)}{2}\, E\Big( \int_0^t |M_s|^{p-2} u_s^2\,ds \Big). \tag{4.2}
\]
We next apply Hölder's inequality to the expectation, with conjugate exponents $\frac{p}{p-2}$ and $\frac{p}{2}$, and get
\[
E\Big( \int_0^t |M_s|^{p-2} u_s^2\,ds \Big) \le E\Big( (M^*_t)^{p-2} \int_0^t u_s^2\,ds \Big)
\le \big[ E\big( (M^*_t)^p \big) \big]^{\frac{p-2}{p}} \Big[ E\Big( \int_0^t u_s^2\,ds \Big)^{p/2} \Big]^{2/p}. \tag{4.3}
\]
Doob's inequality (see Theorem 8.1) implies
\[
E\big( (M^*_t)^p \big) \le \Big( \frac{p}{p-1} \Big)^p E\big( |M_t|^p \big).
\]
Hence, by applying (4.2) and (4.3), we obtain
\[
E\big( (M^*_t)^p \big) \le \Big( \frac{p}{p-1} \Big)^p \frac{p(p-1)}{2}\, E\Big( \int_0^t |M_s|^{p-2} u_s^2\,ds \Big)
\le \Big( \frac{p}{p-1} \Big)^p \frac{p(p-1)}{2}\, \big[ E\big( (M^*_t)^p \big) \big]^{\frac{p-2}{p}} \Big[ E\Big( \int_0^t u_s^2\,ds \Big)^{p/2} \Big]^{2/p}.
\]
Since $1 - \frac{p-2}{p} = \frac{2}{p}$, from this inequality we obtain
\[
E\big( (M^*_t)^p \big) \le \Big[ \Big( \frac{p}{p-1} \Big)^p \frac{p(p-1)}{2} \Big]^{p/2} E\Big( \int_0^t u_s^2\,ds \Big)^{p/2}.
\]
This ends the proof of the upper bound.
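For $p = 2$ and $u \equiv 1$ (so that $M_t = B_t$ and $\int_0^T u_s^2\,ds = T$), the two sides of (4.1) can be bracketed numerically: by the isometry, $E((M^*_T)^2) \ge E(B_T^2) = T$, while Doob's inequality gives $E((M^*_T)^2) \le 4T$. A Monte Carlo sketch under these illustrative choices of grid and sample size:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, paths = 1.0, 500, 5_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
B = np.cumsum(dB, axis=1)

Mstar = np.max(np.abs(B), axis=1)   # discrete approximation of sup_{s<=T} |B_s|
second_moment = np.mean(Mstar**2)   # estimates E((M_T^*)^2)
print(second_moment)                # lies between T and 4T
```

The estimate lands strictly inside the interval $[T, 4T]$, illustrating that the BDG constants $c_2 = 1$ and $C_2 = 4$ are admissible but not sharp for this example.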




4.2 Representation of $L^2$ Brownian functionals

We already know that for any process $u \in L^2_{a,T}$, the stochastic integral process $\{\int_0^t u_s\,dB_s, t \in [0, T]\}$ is a martingale. The next result is a kind of converse statement. In the proof we shall use a technical ingredient that we state without proof.
In the sequel we denote by $\mathcal{F}_T$ the $\sigma$-field generated by $(B_t, 0 \le t \le T)$.

Lemma 4.1 The vector space generated by the random variables
\[
\exp\Big( \int_0^T f(t)\,dB_t - \frac{1}{2} \int_0^T f(t)^2\,dt \Big),
\]
$f \in L^2([0, T])$, is dense in $L^2(\Omega, \mathcal{F}_T, P)$.

Theorem 4.2 Let $Z \in L^2(\Omega, \mathcal{F}_T)$. There exists a unique process $h \in L^2_{a,T}$ such that
\[
Z = E(Z) + \int_0^T h_s\,dB_s. \tag{4.4}
\]
Hence, for any martingale $M = \{M_t, t \in [0, T]\}$ bounded in $L^2$, there exist a unique process $h \in L^2_{a,T}$ and a constant $C$ such that
\[
M_t = C + \int_0^t h_s\,dB_s. \tag{4.5}
\]
Proof: We start with the proof of (4.4). Let $\mathcal{H}$ be the vector space consisting of the random variables $Z \in L^2(\Omega, \mathcal{F}_T)$ such that (4.4) holds. Firstly, we argue the uniqueness of $h$. This is an easy consequence of the isometry of the stochastic integral. Indeed, if there were two processes $h$ and $h'$ satisfying (4.4), then
\[
E\Big( \int_0^T (h_s - h'_s)^2\,ds \Big) = E\Big( \Big( \int_0^T (h_s - h'_s)\,dB_s \Big)^2 \Big) = 0.
\]
This yields $h = h'$ in $L^2([0, T] \times \Omega)$.
We now turn to the existence of $h$. Any $Z \in \mathcal{H}$ satisfies
\[
E(Z^2) = (E(Z))^2 + E\Big( \int_0^T h_s^2\,ds \Big).
\]
From this it follows that if $(Z_n, n \ge 1)$ is a sequence of elements of $\mathcal{H}$ converging to $Z$ in $L^2(\Omega, \mathcal{F}_T)$, then the sequence $(h^n, n \ge 1)$ corresponding to the representations is Cauchy in $L^2_{a,T}$. Denoting by $h$ the limit, we have
\[
Z = E(Z) + \int_0^T h_s\,dB_s.
\]
Hence $\mathcal{H}$ is closed in $L^2(\Omega, \mathcal{F}_T)$.
For any $f \in L^2([0, T])$, set
\[
\mathcal{E}^f_t = \exp\Big( \int_0^t f_s\,dB_s - \frac{1}{2} \int_0^t f_s^2\,ds \Big).
\]
The random variable $\int_0^t f_s\,dB_s$ is Gaussian, centered, with variance $\int_0^t f_s^2\,ds$. Hence, $E(\mathcal{E}^f_t) = 1$. Then, by the Itô formula,
\[
\mathcal{E}^f_t = 1 + \int_0^t \mathcal{E}^f_s\, f(s)\,dB_s.
\]
Consequently, the representation holds for $Z := \mathcal{E}^f_T$, and any linear combination of such random variables also belongs to $\mathcal{H}$. The conclusion follows from Lemma 4.1.
Let us now prove the representation (4.5). The random variable $M_T$ belongs to $L^2(\Omega)$. Hence, by applying the first part of the theorem, we have
\[
M_T = E(M_T) + \int_0^T h_s\,dB_s,
\]
for some $h \in L^2_{a,T}$. By taking conditional expectations, and using that $E(M_T) = E(M_0)$ by the martingale property, we obtain
\[
M_t = E(M_T \mid \mathcal{F}_t) = E(M_0) + \int_0^t h_s\,dB_s, \qquad 0 \le t \le T.
\]
The proof of the theorem is now complete.




Example 4.1 Consider $Z = B_T^3$. In order to find the corresponding process $h$ in the integral representation, we first apply Itô's formula, yielding
\[
B_T^3 = \int_0^T 3 B_t^2\,dB_t + 3 \int_0^T B_t\,dt.
\]
An integration by parts gives
\[
\int_0^T B_t\,dt = T B_T - \int_0^T t\,dB_t.
\]
Thus,
\[
B_T^3 = \int_0^T 3 \big[ B_t^2 + T - t \big]\,dB_t.
\]
Notice that $E(B_T^3) = 0$. Then $h_t = 3\,[B_t^2 + T - t]$.
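The representation can be tested by Monte Carlo (this check is not in the text; grid and sample sizes are illustrative): discretized Itô sums of $h_t = 3(B_t^2 + T - t)$ against the Brownian increments should reproduce $B_T^3$ up to discretization error, and the mean $E(B_T^3)$ should vanish:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, paths = 1.0, 1_000, 2_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((paths, 1)), B[:, :-1]])   # B_{t_{i-1}} (left endpoints)
t_left = dt * np.arange(n)

h = 3.0 * (B_left**2 + (T - t_left))       # integrand of the representation
ito_sum = np.sum(h * dB, axis=1)           # discretized stochastic integral
mse = np.mean((B[:, -1]**3 - ito_sum)**2)  # small: discretization error only
print(np.mean(B[:, -1]**3), mse)
```

Using left endpoints in the Riemann sums matters here: evaluating $h$ at the right endpoints would converge to a different (Stratonovich-type) limit.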

4.3 Girsanov's theorem

It is well known that if $X$ is a multidimensional Gaussian random variable, any affine transformation maps $X$ into a multidimensional Gaussian random variable as well. The simplest version of Girsanov's theorem extends this result to a Brownian motion. Before giving the precise statement and the proof, let us introduce some preliminaries.

Lemma 4.2 Let $L$ be an a.s. positive random variable such that $E(L) = 1$. Set
\[
Q(A) = E(\mathbf{1}_A L), \qquad A \in \mathcal{F}. \tag{4.6}
\]
Then $Q$ defines a probability on $\mathcal{F}$, equivalent to $P$, with density given by $L$. Reciprocally, if $P$ and $Q$ are two probabilities on $\mathcal{F}$ with $Q \ll P$, then there exists a nonnegative random variable $L$ such that $E(L) = 1$ and (4.6) holds.
Proof: It is clear that $Q$ defines a $\sigma$-additive function on $\mathcal{F}$. Moreover, since
\[
Q(\Omega) = E(\mathbf{1}_\Omega L) = E(L) = 1,
\]
$Q$ is indeed a probability.
Let $A \in \mathcal{F}$ be such that $Q(A) = 0$. Since $L > 0$, a.s., we must have $P(A) = 0$. Reciprocally, for any $A \in \mathcal{F}$ with $P(A) = 0$, we have $Q(A) = 0$ as well.
The second assertion of the lemma is the Radon-Nikodym theorem.

If we denote by $E_Q$ the expectation operator with respect to the probability $Q$ defined before, one has
\[
E_Q(X) = E(XL).
\]
Indeed, this formula is easily checked for simple random variables and then extended to any random variable $X \in L^1(\Omega)$ by the usual approximation argument.
Consider now a Brownian motion $\{B_t, t \in [0, T]\}$. Fix $\lambda \in \mathbb{R}$ and let
\[
L_t = \exp\Big( -\lambda B_t - \frac{\lambda^2}{2}\, t \Big). \tag{4.7}
\]
Notice that $L_t = \mathcal{E}^f_t$ with $f = -\lambda$ (see Section 4.2). Itô's formula yields
\[
L_t = 1 - \int_0^t \lambda L_s\,dB_s.
\]
Hence, the process $\{L_t, t \in [0, T]\}$ is a positive martingale and $E(L_t) = 1$, for any $t \in [0, T]$. Set
\[
Q(A) = E(\mathbf{1}_A L_T), \qquad A \in \mathcal{F}_T. \tag{4.8}
\]
By Lemma 4.2, the probability $Q$ is equivalent to $P$ on the $\sigma$-field $\mathcal{F}_T$.
By the martingale property of $\{L_t, t \in [0, T]\}$, the same conclusion holds on $\mathcal{F}_t$, for any $t \in [0, T]$. Indeed, let $A \in \mathcal{F}_t$; then
\[
Q(A) = E(\mathbf{1}_A L_T) = E\big( E(\mathbf{1}_A L_T \mid \mathcal{F}_t) \big) = E\big( \mathbf{1}_A\, E(L_T \mid \mathcal{F}_t) \big) = E(\mathbf{1}_A L_t).
\]
Next, we give a technical result.
Lemma 4.3 Let $X$ be a random variable and let $\mathcal{G}$ be a sub-$\sigma$-field of $\mathcal{F}$ such that
\[
E\big( e^{iuX} \mid \mathcal{G} \big) = e^{-\frac{u^2 \sigma^2}{2}}.
\]
Then the random variable $X$ is independent of the $\sigma$-field $\mathcal{G}$, and its probability law is Gaussian with zero mean and variance $\sigma^2$.

Proof: By the definition of the conditional expectation, for any $A \in \mathcal{G}$,
\[
E\big( \mathbf{1}_A e^{iuX} \big) = P(A)\, e^{-\frac{u^2 \sigma^2}{2}}.
\]
In particular, for $A := \Omega$, we see that the characteristic function of $X$ is that of an $N(0, \sigma^2)$ law. This proves the last assertion.
Moreover, for any $A \in \mathcal{G}$ with $P(A) > 0$,
\[
E_A\big( e^{iuX} \big) = e^{-\frac{u^2 \sigma^2}{2}},
\]
saying that the law of $X$ conditioned on $A$ is also $N(0, \sigma^2)$. Thus,
\[
P\big( (X \le x) \cap A \big) = P(A)\, P_A(X \le x) = P(A)\, P(X \le x),
\]
yielding the independence of $X$ and $\mathcal{G}$.




Theorem 4.3 (Girsanov's theorem) Let $\lambda \in \mathbb{R}$ and set
\[
W_t = B_t + \lambda t.
\]
On the probability space $(\Omega, \mathcal{F}_T, Q)$, with $Q$ given in (4.8), the process $\{W_t, t \in [0, T]\}$ is a standard Brownian motion.

Proof: We will check that on the probability space $(\Omega, \mathcal{F}_T, Q)$, any increment $W_t - W_s$, $0 \le s < t \le T$, is independent of $\mathcal{F}_s$ and has the $N(0, t-s)$ distribution. That is, for any $A \in \mathcal{F}_s$,
\[
E_Q\big( e^{iu(W_t - W_s)}\, \mathbf{1}_A \big) = Q(A)\, e^{-\frac{u^2}{2}(t-s)}.
\]
The conclusion will then follow from Lemma 4.3. Indeed, writing
\[
L_t = \exp\Big( -\lambda (B_t - B_s) - \frac{\lambda^2}{2}(t-s) \Big) \exp\Big( -\lambda B_s - \frac{\lambda^2}{2}\, s \Big),
\]
we have
\[
E_Q\big( e^{iu(W_t - W_s)}\, \mathbf{1}_A \big) = E\big( \mathbf{1}_A\, e^{iu(W_t - W_s)} L_t \big)
= E\Big( \mathbf{1}_A\, e^{iu(B_t - B_s) + iu\lambda(t-s) - \lambda(B_t - B_s) - \frac{\lambda^2}{2}(t-s)} L_s \Big).
\]
Since $B_t - B_s$ is independent of $\mathcal{F}_s$, the last expression is equal to
\[
E(\mathbf{1}_A L_s)\, E\big( e^{(iu - \lambda)(B_t - B_s)} \big)\, e^{iu\lambda(t-s) - \frac{\lambda^2}{2}(t-s)}
= Q(A)\, e^{\frac{(iu-\lambda)^2}{2}(t-s) + iu\lambda(t-s) - \frac{\lambda^2}{2}(t-s)}
= Q(A)\, e^{-\frac{u^2}{2}(t-s)}.
\]
The proof is now complete.
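The change of measure can be illustrated by simulation (an illustrative check, not from the text): $P$-expectations weighted by the density $L_T = \exp(-\lambda B_T - \lambda^2 T/2)$ should reproduce the $Q$-law of $W_T = B_T + \lambda T$, namely $N(0, T)$:

```python
import numpy as np

rng = np.random.default_rng(4)
lam, T, paths = 0.7, 1.0, 400_000
B_T = rng.normal(0.0, np.sqrt(T), size=paths)    # B_T under P
L_T = np.exp(-lam * B_T - 0.5 * lam**2 * T)      # density dQ/dP on F_T
W_T = B_T + lam * T

mass_Q = np.mean(L_T)           # Q(Omega): ~ 1
mean_Q = np.mean(W_T * L_T)     # E_Q(W_T): ~ 0, the drift is removed under Q
var_Q = np.mean(W_T**2 * L_T)   # E_Q(W_T^2): ~ T
print(mass_Q, mean_Q, var_Q)
```

Under $P$ the sample mean of $W_T$ is $\lambda T$; once each path is reweighted by $L_T$, the drift disappears, which is the content of the theorem at the level of one marginal.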




5 Local time of Brownian motion and Tanaka's formula

This chapter deals with a very particular extension of Itô's formula. More precisely, we would like to have a decomposition of the positive submartingale $|B_t - x|$, for some fixed $x \in \mathbb{R}$, as in the Itô formula. Notice that the function $f(y) = |y - x|$ does not belong to $C^2(\mathbb{R})$. A natural way to proceed is to regularize the function $f$, for instance by convolution with an approximation of the identity, and then pass to the limit. Assuming that this is feasible, the question of identifying the limit involving the second-order derivative remains open. This leads us to introduce a process termed the local time of $B$ at $x$, introduced by Paul Lévy.

Definition 5.1 Let $B = \{B_t, t \ge 0\}$ be a Brownian motion and let $x \in \mathbb{R}$. The local time of $B$ at $x$ is defined as the stochastic process
\[
L(t, x) = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(B_s)\,ds
= \lim_{\epsilon \to 0} \frac{1}{2\epsilon}\, \lambda\{ s \in [0, t] : B_s \in (x-\epsilon, x+\epsilon) \}, \tag{5.1}
\]
where $\lambda$ denotes the Lebesgue measure on $\mathbb{R}$.


We see that $L(t, x)$ measures the time spent by the process $B$ at $x$ during a period of time of length $t$; more precisely, it is the density of this occupation time.
We shall see later that the above limit exists in $L^2$ (it also exists a.s.), a fact that is not obvious at all.
Local time enters naturally in the extension of the Itô formula we alluded to before. In fact, we have the following result.

Theorem 5.1 For any $t \ge 0$ and $x \in \mathbb{R}$, a.s.,
\[
(B_t - x)^+ = (B_0 - x)^+ + \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s + \frac{1}{2}\, L(t, x), \tag{5.2}
\]
where $L(t, x)$ is given by (5.1), the limit being in $L^2$.
Proof: The heuristics of formula (5.2) is the following. In the sense of distributions, $f(y) = (y - x)^+$ has first and second-order derivatives $f'(y) = \mathbf{1}_{[x,\infty)}(y)$ and $f''(y) = \delta_x(y)$, respectively, where $\delta_x$ denotes the Dirac delta measure. Hence we expect a formula like
\[
(B_t - x)^+ = (B_0 - x)^+ + \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s + \frac{1}{2} \int_0^t \delta_x(B_s)\,ds.
\]
However, we have to give a meaning to the last integral.
Approximation procedure
We are going to approximate the function $f(y) = (y - x)^+$. For this, we fix $\epsilon > 0$ and define
\[
f^\epsilon_x(y) =
\begin{cases}
0, & \text{if } y \le x - \epsilon,\\[2pt]
\dfrac{(y - x + \epsilon)^2}{4\epsilon}, & \text{if } x - \epsilon \le y \le x + \epsilon,\\[2pt]
y - x, & \text{if } y \ge x + \epsilon,
\end{cases}
\]
which clearly has derivatives
\[
(f^\epsilon_x)'(y) =
\begin{cases}
0, & \text{if } y \le x - \epsilon,\\[2pt]
\dfrac{y - x + \epsilon}{2\epsilon}, & \text{if } x - \epsilon \le y \le x + \epsilon,\\[2pt]
1, & \text{if } y \ge x + \epsilon,
\end{cases}
\qquad
(f^\epsilon_x)''(y) =
\begin{cases}
0, & \text{if } y < x - \epsilon,\\[2pt]
\dfrac{1}{2\epsilon}, & \text{if } x - \epsilon < y < x + \epsilon,\\[2pt]
0, & \text{if } y > x + \epsilon.
\end{cases}
\]
Let $\phi_n$, $n \ge 1$, be a sequence of $C^\infty$ functions with compact supports decreasing to $\{0\}$. For instance, we may consider the function
\[
\phi(y) = c\, \exp\big( -(1 - y^2)^{-1} \big)\, \mathbf{1}_{\{|y| < 1\}},
\]
with a constant $c$ such that $\int_{\mathbb{R}} \phi(z)\,dz = 1$, and then take $\phi_n(y) = n\phi(ny)$. Set
\[
g_n(y) = [\phi_n * f^\epsilon_x](y) = \int_{\mathbb{R}} f^\epsilon_x(y - z)\, \phi_n(z)\,dz.
\]
It is well known that $g_n \in C^\infty$, that $g_n$ and $g_n'$ converge uniformly on $\mathbb{R}$ to $f^\epsilon_x$ and $(f^\epsilon_x)'$, respectively, and that $g_n''$ converges pointwise to $(f^\epsilon_x)''$ except at the points $x + \epsilon$ and $x - \epsilon$.
We then have an Itô formula for $g_n$, as follows:
\[
g_n(B_t) = g_n(B_0) + \int_0^t g_n'(B_s)\,dB_s + \frac{1}{2} \int_0^t g_n''(B_s)\,ds. \tag{5.3}
\]
Convergence of the terms in (5.3) as $n \to \infty$
The function $(f^\epsilon_x)'$ is bounded. The function $g_n'$ is also bounded. Indeed,
\[
|g_n'(y)| = \Big| \int_{\mathbb{R}} (f^\epsilon_x)'(y - z)\, \phi_n(z)\,dz \Big|
= \Big| \int_{-1/n}^{1/n} (f^\epsilon_x)'(y - z)\, \phi_n(z)\,dz \Big|
\le \| (f^\epsilon_x)' \|_\infty .
\]
Moreover,
\[
\big| g_n'(B_s)\, \mathbf{1}_{[0,t]}(s) - (f^\epsilon_x)'(B_s)\, \mathbf{1}_{[0,t]}(s) \big| \to 0,
\]
uniformly in $t$ and in $\omega$. Hence, by bounded convergence,
\[
E \int_0^t \big| g_n'(B_s) - (f^\epsilon_x)'(B_s) \big|^2\,ds \to 0.
\]
Then, the isometry property of the stochastic integral implies
\[
E\Big( \Big| \int_0^t \big[ g_n'(B_s) - (f^\epsilon_x)'(B_s) \big]\,dB_s \Big|^2 \Big) \to 0,
\]
as $n \to \infty$.
We next deal with the second-order term. Since the law of each $B_s$, $s > 0$, has a density,
\[
P\{ B_s = x + \epsilon \} = P\{ B_s = x - \epsilon \} = 0.
\]
Thus, for any $s > 0$,
\[
\lim_{n\to\infty} g_n''(B_s) = (f^\epsilon_x)''(B_s),
\]
a.s. Using Fubini's theorem, we see that this convergence also holds for almost every $s$, a.s. In fact,
\[
\int_0^t ds \int_\Omega dP\, \mathbf{1}_{\{ (f^\epsilon_x)''(B_s) \ne \lim_{n\to\infty} g_n''(B_s) \}}
= \int_\Omega dP \int_0^t ds\, \mathbf{1}_{\{ (f^\epsilon_x)''(B_s) \ne \lim_{n\to\infty} g_n''(B_s) \}} = 0.
\]
We have
\[
\sup_{y \in \mathbb{R}} |g_n''(y)| \le \frac{1}{2\epsilon}.
\]
Indeed,
\[
|g_n''(y)| = \frac{1}{2\epsilon} \int_{\mathbb{R}} \phi_n(z)\, \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(y - z)\,dz
\le \frac{1}{2\epsilon} \int_{y-x-\epsilon}^{y-x+\epsilon} |\phi_n(z)|\,dz \le \frac{1}{2\epsilon}.
\]
Then, by bounded convergence,
\[
\int_0^t g_n''(B_s)\,ds \to \int_0^t (f^\epsilon_x)''(B_s)\,ds,
\]
a.s. and in $L^2$.
Thus, passing to the limit in (5.3) yields
\[
f^\epsilon_x(B_t) = f^\epsilon_x(B_0) + \int_0^t (f^\epsilon_x)'(B_s)\,dB_s + \frac{1}{2} \int_0^t \frac{1}{2\epsilon}\, \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(B_s)\,ds. \tag{5.4}
\]
Convergence of (5.4) as $\epsilon \to 0$
Since $f^\epsilon_x(y) \to (y - x)^+$ as $\epsilon \to 0$ and
\[
\big| f^\epsilon_x(B_t) - f^\epsilon_x(B_0) \big| \le |B_t - B_0|,
\]
we have
\[
f^\epsilon_x(B_t) - f^\epsilon_x(B_0) \to (B_t - x)^+ - (B_0 - x)^+,
\]
in $L^2$. Moreover,
\[
E \int_0^t \big[ (f^\epsilon_x)'(B_s) - \mathbf{1}_{[x,\infty)}(B_s) \big]^2\,ds
\le E\Big( \int_0^t \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(B_s)\,ds \Big)
\le \int_0^t \frac{2\epsilon}{\sqrt{2\pi s}}\,ds,
\]
which clearly tends to zero as $\epsilon \to 0$. Hence, by the isometry property of the stochastic integral,
\[
\int_0^t (f^\epsilon_x)'(B_s)\,dB_s \to \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s,
\]
in $L^2$.
Consequently, we have proved that
\[
\frac{1}{2\epsilon} \int_0^t \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(B_s)\,ds
\]
converges in $L^2$ as $\epsilon \to 0$, and that formula (5.2) holds.

We state without proof two further properties of local time.

1. The property of local time as a density of the occupation measure is made clear by the following identity, valid for any $t \ge 0$ and every $a \le b$:
\[
\int_a^b L(t, x)\,dx = \int_0^t \mathbf{1}_{(a,b)}(B_s)\,ds.
\]

2. The stochastic integral $\int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s$ has a jointly continuous version in $(t, x) \in (0, \infty) \times \mathbb{R}$. Hence, by (5.2), so does the local time $\{L(t, x), (t, x) \in (0, \infty) \times \mathbb{R}\}$.
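The definition (5.1) can be probed numerically (an illustrative experiment, not from the text). Averaging the $\epsilon$-approximation of $L(1, 0)$ over many simulated paths should give a value close to $E\,L(1, 0) = E|B_1| = \sqrt{2/\pi} \approx 0.80$, which follows from (5.2) with $x = 0$ since the stochastic integral has zero mean; the estimate carries both $\epsilon$- and time-discretization bias:

```python
import numpy as np

rng = np.random.default_rng(5)
t, n, paths, eps = 1.0, 10_000, 500, 0.1
dt = t / n
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)

occupation = dt * np.sum(np.abs(B) < eps, axis=1)   # time spent in (-eps, eps)
L_approx = occupation / (2 * eps)                   # per-path approximation of L(1, 0)
estimate = np.mean(L_approx)
print(estimate)   # roughly sqrt(2/pi) ~ 0.80, up to eps and grid bias
```

For the approximation to make sense, the grid spacing must satisfy $\sqrt{dt} \ll \epsilon$, otherwise the discrete path skips over the window $(-\epsilon, \epsilon)$ and the occupation time is underestimated.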

The next result, which follows easily from Theorem 5.1, is known as Tanaka's formula.

Theorem 5.2 For any $(t, x) \in [0, \infty) \times \mathbb{R}$, we have
\[
|B_t - x| = |B_0 - x| + \int_0^t \mathrm{sign}(B_s - x)\,dB_s + L(t, x). \tag{5.5}
\]

Proof: We will use the relations $|x| = x^+ + x^-$ and $x^- = \max(-x, 0) = (-x)^+$. Hence, by virtue of (5.2), we only need a formula for $(-B_t + x)^+$. Notice that we already have it, since the process $-B$ is also a Brownian motion. More precisely,
\[
(-B_t + x)^+ = (-B_0 + x)^+ + \int_0^t \mathbf{1}_{[-x,\infty)}(-B_s)\,d(-B_s) + \frac{1}{2}\, L^-(t, -x),
\]
where we have denoted by $L^-(t, -x)$ the local time of $-B$ at $-x$. We have the following facts:
\[
\int_0^t \mathbf{1}_{[-x,\infty)}(-B_s)\,d(-B_s) = -\int_0^t \mathbf{1}_{(-\infty,x]}(B_s)\,dB_s,
\]
\[
L^-(t, -x) = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t \mathbf{1}_{(-x-\epsilon,\, -x+\epsilon)}(-B_s)\,ds
= \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t \mathbf{1}_{(x-\epsilon,\, x+\epsilon)}(B_s)\,ds = L(t, x),
\]
where the limit is in $L^2(\Omega)$.
Thus, we have proved
\[
(B_t - x)^- = (B_0 - x)^- - \int_0^t \mathbf{1}_{(-\infty,x]}(B_s)\,dB_s + \frac{1}{2}\, L(t, x). \tag{5.6}
\]
Adding up (5.2) and (5.6) yields (5.5). Indeed,
\[
\mathbf{1}_{[x,\infty)}(B_s) - \mathbf{1}_{(-\infty,x]}(B_s) =
\begin{cases}
1, & \text{if } B_s > x,\\
-1, & \text{if } B_s < x,\\
0, & \text{if } B_s = x,
\end{cases}
\]
which is identical to $\mathrm{sign}(B_s - x)$.
6 Stochastic differential equations

In this chapter we shall introduce stochastic differential equations driven by a multi-dimensional Brownian motion. Under suitable assumptions on the coefficients, we shall prove a result on existence and uniqueness of the solution. Then we shall establish properties of the solution, like the existence of moments of any order and the Hölder property of the sample paths.

The setting
We consider a $d$-dimensional Brownian motion $B = \{B_t = (B^1_t, \ldots, B^d_t), t \ge 0\}$, $B_0 = 0$, defined on a probability space $(\Omega, \mathcal{F}, P)$, along with a filtration $(\mathcal{F}_t, t \ge 0)$ satisfying the following properties:

1. $B$ is adapted to $(\mathcal{F}_t, t \ge 0)$;

2. the $\sigma$-field generated by $\{B_u - B_t, u \ge t\}$ is independent of $\mathcal{F}_t$.

We also consider functions
\[
b : [0, \infty) \times \mathbb{R}^m \to \mathbb{R}^m, \qquad \sigma : [0, \infty) \times \mathbb{R}^m \to L(\mathbb{R}^d; \mathbb{R}^m).
\]
When necessary we will use the coordinate description
\[
b(t, x) = \big( b^i(t, x) \big)_{1 \le i \le m}, \qquad \sigma(t, x) = \big( \sigma^i_j(t, x) \big)_{1 \le i \le m,\, 1 \le j \le d}.
\]
By a stochastic differential equation, we mean an expression of the form
\[
dX_t = \sigma(t, X_t)\,dB_t + b(t, X_t)\,dt, \quad t \in (0, \infty), \qquad X_0 = x, \tag{6.1}
\]
where $x$ is an $m$-dimensional random vector independent of the Brownian motion.
We can also take any time $u \ge 0$ as the initial one; in this case, we must write $t \in (u, \infty)$ and $X_u = x$ in (6.1). For the sake of simplicity we will assume here that $x$ is deterministic.
The formal expression (6.1) has to be understood as follows:
\[
X_t = x + \int_0^t \sigma(s, X_s)\,dB_s + \int_0^t b(s, X_s)\,ds, \tag{6.2}
\]
or, coordinate-wise,
\[
X^i_t = x^i + \sum_{j=1}^{d} \int_0^t \sigma^i_j(s, X_s)\,dB^j_s + \int_0^t b^i(s, X_s)\,ds,
\]
$i = 1, \ldots, m$.

Strong existence and path-wise uniqueness
We now give the notions of existence and uniqueness of solution that will be considered throughout this chapter.

Definition 6.1 An $m$-dimensional, measurable, $\mathcal{F}_t$-adapted stochastic process $(X_t, t \ge 0)$ is a strong solution to (6.2) if the following conditions are satisfied:

1. The processes $(\sigma^i_j(s, X_s), s \ge 0)$ belong to $L^2_{a,\infty}$, for any $1 \le i \le m$, $1 \le j \le d$.

2. The processes $(b^i(s, X_s), s \ge 0)$ belong to $L^1_{a,\infty}$, for any $1 \le i \le m$.

3. Equation (6.2) holds true for the fixed Brownian motion defined before, for any $t \ge 0$, a.s.

Definition 6.2 The equation (6.2) has a path-wise unique solution if any two strong solutions $X_1$ and $X_2$ in the sense of the previous definition are indistinguishable, that is,
\[
P\{ X_1(t) = X_2(t), \text{ for any } t \ge 0 \} = 1.
\]

Hypotheses on the coefficients
We shall refer to (H) for the following set of hypotheses:

1. Linear growth:
\[
\sup_t \big[ |b(t, x)| + |\sigma(t, x)| \big] \le L(1 + |x|). \tag{6.3}
\]

2. Lipschitz continuity in the $x$ variable, uniformly in $t$:
\[
\sup_t \big[ |b(t, x) - b(t, y)| + |\sigma(t, x) - \sigma(t, y)| \big] \le L|x - y|. \tag{6.4}
\]

In (6.3) and (6.4), $L$ stands for a positive constant.
6.1 Examples of stochastic differential equations

When the functions $\sigma$ and $b$ have a linear structure, the solution to (6.2) admits an explicit form. This is not surprising, as it is indeed the case for ordinary differential equations. We deal with this question in this section. More precisely, suppose that
\[
\sigma(t, x) = \Sigma(t) + F(t)x, \tag{6.5}
\]
\[
b(t, x) = c(t) + D(t)x. \tag{6.6}
\]

Example 1 Assume for simplicity $d = m = 1$, $\sigma(t, x) = \Sigma(t)$, $b(t, x) = c(t) + Dx$, $t \ge 0$, with $D \in \mathbb{R}$. Now equation (6.2) reads
\[
X_t = X_0 + \int_0^t \Sigma(s)\,dB_s + \int_0^t \big[ c(s) + D X_s \big]\,ds,
\]
and has a unique solution given by
\[
X_t = X_0 e^{Dt} + \int_0^t e^{D(t-s)} \big( c(s)\,ds + \Sigma(s)\,dB_s \big). \tag{6.7}
\]
To check (6.7), we proceed as in the deterministic case. First we consider the equation
\[
dX_t = D X_t\,dt,
\]
with initial condition $X_0$, whose solution is $X_t = X_0 e^{Dt}$, $t \ge 0$. Then we use the variation-of-constants procedure and write $X_t = X_0(t) e^{Dt}$. A priori, $X_0(t)$ may be random. However, since $e^{Dt}$ is differentiable, the Itô differential of $X_t$ is given by
\[
dX_t = dX_0(t)\, e^{Dt} + X_0(t)\, e^{Dt} D\,dt.
\]
Equating the right-hand side of the preceding identity with
\[
\Sigma(t)\,dB_t + \big( c(t) + X_t D \big)\,dt
\]
yields
\[
dX_0(t)\, e^{Dt} + X_t D\,dt = \Sigma(t)\,dB_t + \big( c(t) + X_t D \big)\,dt,
\]
that is,
\[
dX_0(t) = e^{-Dt} \big[ \Sigma(t)\,dB_t + c(t)\,dt \big].
\]
In integral form,
\[
X_0(t) = x + \int_0^t e^{-Ds} \big[ \Sigma(s)\,dB_s + c(s)\,ds \big].
\]
Plugging the right-hand side of this equation into $X_t = X_0(t) e^{Dt}$ yields (6.7).

A particular example of the class of equations considered before is the Langevin equation:
\[
dX_t = \alpha\,dB_t - \beta X_t\,dt, \quad t > 0, \qquad X_0 = x_0 \in \mathbb{R},
\]
where $\alpha \in \mathbb{R}$ and $\beta > 0$. Here $X_t$ stands for the velocity at time $t$ of a free particle performing a physical Brownian motion (different from the $B_t$ in the equation). The solution to this equation is given by
\[
X_t = e^{-\beta t} x_0 + \alpha \int_0^t e^{-\beta(t-s)}\,dB_s.
\]
Notice that $\{X_t, t \ge 0\}$ defines a Gaussian process.
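Since the solution is Gaussian, its marginal at time $t$ has mean $e^{-\beta t} x_0$ and variance $\alpha^2 (1 - e^{-2\beta t})/(2\beta)$ (computed from the Itô isometry applied to the Wiener integral above). This can be checked by simulating the exact Gaussian transition of the process; the parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
alpha, beta, x0 = 1.0, 2.0, 1.0
T, n, paths = 1.0, 200, 50_000
dt = T / n
phi = np.exp(-beta * dt)                                              # one-step decay
step_sd = alpha * np.sqrt((1 - np.exp(-2 * beta * dt)) / (2 * beta))  # exact step noise

X = np.full(paths, x0)
for _ in range(n):
    X = phi * X + step_sd * rng.normal(size=paths)   # exact OU transition

mean_T = X.mean()   # theory: exp(-beta T) x0
var_T = X.var()     # theory: alpha^2 (1 - exp(-2 beta T)) / (2 beta)
print(mean_T, var_T)
```

Because the transition law is Gaussian with explicitly known mean and variance, this recursion samples the process exactly at the grid points, with no time-discretization bias (unlike an Euler scheme).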

6.2 A result on existence and uniqueness of solution

This section is devoted to proving the following result.

Theorem 6.1 Assume that the functions $\sigma$ and $b$ satisfy the assumptions (H). Then there exists a path-wise unique strong solution to (6.2).

Before giving a proof of this theorem, we recall a version of Gronwall's lemma that will be used repeatedly in the sequel.

Lemma 6.1 Let $u, v : [\alpha, \beta] \to \mathbb{R}_+$ be functions such that $u$ is Lebesgue integrable and $v$ is measurable and bounded. Assume that
\[
v(t) \le c + \int_\alpha^t u(s) v(s)\,ds, \tag{6.8}
\]
for some constant $c \ge 0$ and for any $t \in [\alpha, \beta]$. Then
\[
v(t) \le c\, \exp\Big( \int_\alpha^t u(s)\,ds \Big). \tag{6.9}
\]
Proof of Theorem 6.1
Let us introduce the Picard iteration scheme
\[
X^0_t = x, \qquad
X^n_t = x + \int_0^t \sigma(s, X^{n-1}_s)\,dB_s + \int_0^t b(s, X^{n-1}_s)\,ds, \quad n \ge 1, \tag{6.10}
\]
$t \ge 0$. Let us restrict the time interval to $[0, T]$, with $T > 0$. We shall prove that the sequence of stochastic processes defined recursively by (6.10) converges uniformly to a process $X$ which is a strong solution of (6.2). Eventually, we shall prove path-wise uniqueness.

Step 1: We prove by induction on $n$ that for any $t \in [0, T]$,
\[
E\Big( \sup_{0 \le s \le t} |X^n_s|^2 \Big) < \infty. \tag{6.11}
\]
Indeed, this property is clearly true for $n = 0$, since in this case $X^0_t$ is constant and equal to $x$. Suppose that (6.11) holds true for $n = 0, \ldots, m-1$. By applying Burkholder's and Hölder's inequalities, we reach
\[
E\Big( \sup_{0 \le s \le t} |X^m_s|^2 \Big)
\le C \Big[ |x|^2 + E\Big( \sup_{0 \le s \le t} \Big| \int_0^s \sigma(u, X^{m-1}_u)\,dB_u \Big|^2 \Big) + E\Big( \sup_{0 \le s \le t} \Big| \int_0^s b(u, X^{m-1}_u)\,du \Big|^2 \Big) \Big]
\]
\[
\le C \Big[ |x|^2 + E \int_0^t \big| \sigma(u, X^{m-1}_u) \big|^2\,du + E \int_0^t \big| b(u, X^{m-1}_u) \big|^2\,du \Big]
\le C \Big[ |x|^2 + E \int_0^t \big( 1 + |X^{m-1}_u|^2 \big)\,du \Big]
\]
\[
\le C \Big[ |x|^2 + T + T\, E\Big( \sup_{0 \le s \le T} |X^{m-1}_s|^2 \Big) \Big].
\]
Hence (6.11) is proved.
The assumptions (H) along with (6.11) imply that $\sigma(s, X^{n-1}_s) \in L^2_{a,T}$ and $b(s, X^{n-1}_s) \in L^2(\Omega \times [0, T])$.
Step 2: As in Step 1, we prove by induction on $n$ that
\[
E\Big( \sup_{0 \le s \le t} |X^{n+1}_s - X^n_s|^2 \Big) \le \frac{(Ct)^{n+1}}{(n+1)!}. \tag{6.12}
\]
Indeed, consider first the case $n = 0$, for which we have
\[
X^1_s - x = \int_0^s \sigma(u, x)\,dB_u + \int_0^s b(u, x)\,du.
\]
Burkholder's inequality yields
\[
E\Big( \sup_{0 \le s \le t} \Big| \int_0^s \sigma(u, x)\,dB_u \Big|^2 \Big) \le C \int_0^t |\sigma(u, x)|^2\,du \le C t (1 + |x|^2).
\]
Similarly,
\[
E\Big( \sup_{0 \le s \le t} \Big| \int_0^s b(u, x)\,du \Big|^2 \Big) \le C t \int_0^t |b(u, x)|^2\,du \le C t^2 (1 + |x|^2).
\]
With this, (6.12) is established for $n = 0$.
Assume that (6.12) holds for natural numbers $m \le n - 1$. Then, as we did for $n = 0$, we can consider the decomposition
\[
E\Big( \sup_{0 \le s \le t} |X^{n+1}_s - X^n_s|^2 \Big) \le 2\big( A(t) + B(t) \big),
\]
with
\[
A(t) = E\Big( \sup_{0 \le s \le t} \Big| \int_0^s \big[ \sigma(u, X^n_u) - \sigma(u, X^{n-1}_u) \big]\,dB_u \Big|^2 \Big),
\]
\[
B(t) = E\Big( \sup_{0 \le s \le t} \Big| \int_0^s \big[ b(u, X^n_u) - b(u, X^{n-1}_u) \big]\,du \Big|^2 \Big).
\]
Using first Burkholder's inequality and then Hölder's inequality along with the Lipschitz property of the coefficient $\sigma$, we obtain
\[
A(t) \le C(L) \int_0^t E\big( |X^n_s - X^{n-1}_s|^2 \big)\,ds.
\]
By the induction assumption, we can bound the last expression by
\[
C(L) \int_0^t \frac{(Cs)^n}{n!}\,ds \le C(T, L)\, \frac{(Ct)^{n+1}}{(n+1)!}.
\]
Similarly, applying Hölder's inequality along with the Lipschitz property of the coefficient $b$ and the induction assumption yields
\[
B(t) \le C(T, L)\, \frac{(Ct)^{n+1}}{(n+1)!}.
\]

Step 3: The sequence of processes $\{X^n_t, t \in [0, T]\}$, $n \ge 0$, converges uniformly in $t$ to a stochastic process $\{X_t, t \in [0, T]\}$ which satisfies (6.2).
Indeed, applying first Chebyshev's inequality and then (6.12), we have
\[
P\Big( \sup_{0 \le t \le T} \big| X^{n+1}_t - X^n_t \big| > \frac{1}{2^n} \Big) \le 2^{2n}\, \frac{(CT)^{n+1}}{(n+1)!},
\]
which clearly implies
\[
\sum_{n=0}^{\infty} P\Big( \sup_{0 \le t \le T} \big| X^{n+1}_t - X^n_t \big| > \frac{1}{2^n} \Big) < \infty.
\]
Hence, by the first Borel-Cantelli lemma,
\[
P\Big( \liminf_n \Big\{ \sup_{0 \le t \le T} \big| X^{n+1}_t - X^n_t \big| \le \frac{1}{2^n} \Big\} \Big) = 1.
\]
In other words, for almost every $\omega$ there exists a natural number $m_0(\omega)$ such that
\[
\sup_{0 \le t \le T} \big| X^{n+1}_t - X^n_t \big| \le \frac{1}{2^n},
\]
for any $n \ge m_0(\omega)$. The Weierstrass criterion for the convergence of series of functions then implies that
\[
X^m_t = x + \sum_{k=0}^{m-1} \big[ X^{k+1}_t - X^k_t \big]
\]
converges uniformly on $[0, T]$, a.s. Let us denote by $X = \{X_t, t \in [0, T]\}$ the limit. Obviously the process $X$ has a.s. continuous paths.
To conclude the proof, we must check that $X$ satisfies equation (6.2) on $[0, T]$. The continuity properties of $\sigma$ and $b$ imply the convergences
\[
\sigma(t, X^n_t) \to \sigma(t, X_t), \qquad b(t, X^n_t) \to b(t, X_t),
\]
as $n \to \infty$, uniformly in $t \in [0, T]$, a.s.
Therefore,
\[
\Big|\int_0^t \big[b(s,X_s^n)-b(s,X_s)\big]\,ds\Big| \le L\int_0^t |X_s^n-X_s|\,ds \le LT\,\sup_{0\le s\le t}|X_s^n-X_s| \to 0,
\]
as n → ∞, a.s. This proves the a.s. convergence of the sequence of path-wise integrals.
As for the stochastic integrals, we will prove that
\[
\int_0^t \sigma(s,X_s^n)\,dB_s \to \int_0^t \sigma(s,X_s)\,dB_s \tag{6.13}
\]
as n → ∞, with convergence in probability.
Indeed, applying the extension of Lemma 3.5 to processes of Λ_{a,T}, we have, for each ε, N > 0,
\[
P\Big\{\Big|\int_0^t \big(\sigma(s,X_s^n)-\sigma(s,X_s)\big)\,dB_s\Big| > \varepsilon\Big\}
\le P\Big\{\int_0^t \big|\sigma(s,X_s^n)-\sigma(s,X_s)\big|^2\,ds > N\Big\} + \frac{N}{\varepsilon^2}.
\]
The first term on the right-hand side of this inequality converges to zero as n → ∞. Since ε, N > 0 are arbitrary, this yields the convergence stated in (6.13).
Summarising, by considering if necessary a subsequence {X_t^{n_k}, t ∈ [0, T ]}, we have proved the a.s. convergence, uniformly in t ∈ [0, T ], to a stochastic process {X_t, t ∈ [0, T ]} which satisfies (6.2) and, moreover,
\[
E\Big(\sup_{0\le t\le T}|X_t|^2\Big) < \infty.
\]
In order to conclude that X is a strong solution to (6.1), we have to check that the required measurability and integrability conditions hold. This is left as an exercise to the reader.
Step 4: Path-wise uniqueness.
Let X1 and X2 be two strong solutions to (6.1). Proceeding in a similar way as in Step 2, we easily get
\[
E\Big(\sup_{0\le u\le t}|X_1(u)-X_2(u)|^2\Big) \le C\int_0^t E\Big(\sup_{0\le u\le s}|X_1(u)-X_2(u)|^2\Big)\,ds.
\]
Hence, from Lemma 6.1 we conclude that
\[
E\Big(\sup_{0\le u\le T}|X_1(u)-X_2(u)|^2\Big) = 0,
\]
proving that X1 and X2 are indistinguishable.
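The Picard scheme used in this proof is easy to simulate along a fixed Brownian path. The sketch below (plain Python; the linear coefficients σ(x) = 0.2x and b(x) = 0.1x, the grid size and the random seed are illustrative choices, not taken from the text) computes successive iterates X^n with left-point Riemann/Itô sums and records the sup-norm differences, which the bound (6.12) predicts should decay very fast:

```python
import random

random.seed(7)

T, N = 1.0, 200            # horizon and grid size (illustrative)
dt = T / N
x0 = 1.0
sigma = lambda x: 0.2 * x  # Lipschitz diffusion coefficient (illustrative)
b = lambda x: 0.1 * x      # Lipschitz drift coefficient (illustrative)

# one fixed Brownian path on the grid: independent increments ~ N(0, dt)
dB = [random.gauss(0.0, dt ** 0.5) for _ in range(N)]

def picard_step(X):
    """One Picard iteration: X^{n+1}_t = x0 + int sigma(X^n) dB + int b(X^n) ds,
    both integrals discretised by left-point sums on the grid."""
    Y = [x0]
    stoch, drift = 0.0, 0.0
    for k in range(N):
        stoch += sigma(X[k]) * dB[k]
        drift += b(X[k]) * dt
        Y.append(x0 + stoch + drift)
    return Y

X = [x0] * (N + 1)          # X^0 is the constant initial value
diffs = []                  # sup-norm distances between consecutive iterates
for n in range(12):
    Xnew = picard_step(X)
    diffs.append(max(abs(a - c) for a, c in zip(Xnew, X)))
    X = Xnew

print(diffs)  # rapidly decaying, mirroring the factorial bound (6.12)
```

Along a fixed discretised path the iteration contracts very strongly (each iterate depends only on the previous one at strictly earlier grid points), so `diffs` collapses to numerical noise after a few steps; this mirrors the Borel-Cantelli argument of Step 3.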




6.3 Some properties of the solution


We start this section by studying the Lp -moments of the solution to (6.2).

Theorem 6.2 Let the assumptions of Theorem 6.1 hold, and suppose in addition that the initial condition is a random variable X0, independent of the Brownian motion. Fix p ∈ [2, ∞) and t ∈ [0, T ]. There exists a positive constant C = C(p, t, L) such that
\[
E\Big(\sup_{0\le s\le t}|X_s|^p\Big) \le C\,\big(1 + E|X_0|^p\big). \tag{6.14}
\]

Proof: From (6.2) it follows that
\[
E\Big(\sup_{0\le s\le t}|X_s|^p\Big) \le C(p)\Big[E|X_0|^p
+ E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,X_u)\,dB_u\Big|^p\Big)
+ E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big)\Big].
\]
Applying first Burkholder's inequality and then Hölder's inequality yields
\[
E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,X_u)\,dB_u\Big|^p\Big)
\le C(p)\,E\Big(\int_0^t |\sigma(s,X_s)|^2\,ds\Big)^{\frac p2}
\le C(p,t)\,E\int_0^t |\sigma(s,X_s)|^p\,ds
\]
\[
\le C(p,L,t)\int_0^t \big(1+E|X_s|^p\big)\,ds
\le C(p,L,t)\int_0^t \Big(1 + E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\Big)\,ds.
\]
For the path-wise integral, we apply Hölder's inequality to obtain
\[
E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big)
\le C(p,t)\,E\int_0^t |b(s,X_s)|^p\,ds
\le C(p,L,t)\int_0^t \Big(1 + E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\Big)\,ds.
\]
Define
\[
\varphi(t) = E\Big(\sup_{0\le s\le t}|X_s|^p\Big).
\]
We have established that
\[
\varphi(t) \le C(p,L,t)\Big(E|X_0|^p + 1 + \int_0^t \varphi(s)\,ds\Big).
\]
Then, with Lemma 6.1, we end the proof of (6.14).



It is clear that the solution to (6.2) depends on the initial value X0. Consider two initial conditions X0, Y0 (recall that these are m-dimensional random vectors independent of the Brownian motion). Denote by X(X0), X(Y0) the corresponding solutions to (6.2). With a proof very similar to that of Theorem 6.2, we can obtain the following.

Theorem 6.3 The assumptions are the same as in Theorem 6.1. Then
\[
E\Big(\sup_{0\le s\le t}\big|X_s(X_0)-X_s(Y_0)\big|^p\Big) \le C(p,L,t)\,E|X_0-Y_0|^p, \tag{6.15}
\]
for any p ∈ [2, ∞), where C(p, L, t) is some positive constant depending on p, L and t.

The sample paths of the solution of a stochastic differential equation possess


the same regularity as those of the Brownian motion. We next discuss this
fact.

Theorem 6.4 The assumptions are the same as in Theorem 6.1. Let p ∈ [2, ∞) and 0 ≤ s ≤ t ≤ T. There exists a positive constant C = C(p, L, T) such that
\[
E\big(|X_t - X_s|^p\big) \le C(p,L,T)\,\big(1+E|X_0|^p\big)\,|t-s|^{\frac p2}. \tag{6.16}
\]
Proof: By virtue of (6.2) we can write
\[
E\big(|X_t-X_s|^p\big) \le C(p)\Big[E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p + E\Big|\int_s^t b(u,X_u)\,du\Big|^p\Big].
\]
Burkholder's inequality and then Hölder's inequality with respect to Lebesgue measure on [s, t] yield
\[
E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p
\le C(p)\,E\Big(\int_s^t |\sigma(u,X_u)|^2\,du\Big)^{\frac p2}
\le C(p)\,|t-s|^{\frac p2 -1}\,E\int_s^t |\sigma(u,X_u)|^p\,du
\]
\[
\le C(p,L,T)\,|t-s|^{\frac p2 -1}\int_s^t \big(1+E|X_u|^p\big)\,du.
\]
By using the estimate (6.14) of Theorem 6.2, we have
\[
\int_s^t \big(1+E|X_u|^p\big)\,du \le |t-s|\Big(1 + E\sup_{0\le u\le t}|X_u|^p\Big)
\le C(p,T,L)\,|t-s|\,\big(1+E|X_0|^p\big). \tag{6.17}
\]
This ends the estimate of the Lp moment of the stochastic integral.
The estimate of the path-wise integral follows from Hölder's inequality and (6.17). Indeed, we have
\[
E\Big|\int_s^t b(u,X_u)\,du\Big|^p \le |t-s|^{p-1}\int_s^t E\big(|b(u,X_u)|^p\big)\,du
\le C(p,L)\,|t-s|^{p-1}\int_s^t \big(1+E|X_u|^p\big)\,du
\]
\[
\le C(p,T,L)\,|t-s|^{p}\,\big(1+E|X_0|^p\big).
\]
Hence we have proved (6.16).



We can now apply Kolmogorov’s continuity criterion (see Proposition 2.2) to
prove the following.
Corollary 6.1 With the same assumptions as in Theorem 6.1, the sample paths of the solution to (6.2) are Hölder continuous of any degree α ∈ (0, 1/2).

Remark 6.1 Assume that in Theorem 6.3 the initial conditions are deterministic, denoted by x and y, respectively. An extension of Kolmogorov's continuity criterion to stochastic processes indexed by a multi-dimensional parameter yields that the sample paths of the stochastic process {X_t(x), t ∈ [0, T ], x ∈ R^m} are jointly Hölder continuous in (t, x), of degree α < 1/2 in t and β < 1 in x, respectively.

6.4 Markov property of the solution
In Section 2.5 we discussed the Markov property of a real-valued Brownian motion. With the obvious changes (replacing R by R^n, for arbitrary n ≥ 1), one sees that the property extends to multi-dimensional Brownian motion. In this section we prove that the solution to the SDE (6.2) inherits the Markov property from the Brownian motion. To establish this fact we need some preliminary results.

Lemma 6.2 Let (Ω, F, P) be a probability space and (E, E) a measurable space. Consider two independent sub-σ-fields of F, denoted by G, H, respectively, along with mappings
\[
X : (\Omega,\mathcal H) \to (E,\mathcal E),
\qquad
\Psi : (E\times\Omega,\ \mathcal E\otimes\mathcal G) \to \mathbb R^m,
\]
with ω ↦ Ψ(X(ω), ω) in L1(Ω). Then,
\[
E\big(\Psi(X,\cdot)\,\big|\,\mathcal H\big) = \Phi(X), \tag{6.18}
\]
with Φ(x) = E(Ψ(x, ·)).

Proof: Assume first that Ψ(x, ω) = f(x)Z(ω), with Z a G-measurable random variable and f an E-measurable function. Then, by the properties of the conditional expectation,
\[
E\big(\Psi(X,\cdot)\,\big|\,\mathcal H\big) = E\big(f(X(\cdot))Z(\cdot)\,\big|\,\mathcal H\big) = f(X(\cdot))\,E(Z).
\]
Indeed, we use that X is H-measurable and that G, H are independent. Clearly,
\[
f(X(\cdot))\,E(Z) = f(x)E(Z)\big|_{x=X(\cdot)} = E\big(\Psi(x,\cdot)\big)\big|_{x=X(\cdot)} = \Phi(X).
\]
This yields (6.18).
The result extends to any E ⊗ G-measurable function Ψ by a monotone class argument.


Lemma 6.3 Fix u ≥ 0 and let η be an Fu-measurable random variable in L2(Ω). Consider the SDE
\[
Y_t^{\eta} = \eta + \int_u^t \sigma(s,Y_s^{\eta})\,dB_s + \int_u^t b(s,Y_s^{\eta})\,ds,
\]
with coefficients σ and b satisfying the assumptions (H). Then, for any t ≥ u,
\[
Y_t^{\eta(\omega)}(\omega) = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)},
\]
where X_t^{x,u}, t ≥ u, denotes the solution to
\[
X_t^{x,u} = x + \int_u^t \sigma(s,X_s^{x,u})\,dB_s + \int_u^t b(s,X_s^{x,u})\,ds. \tag{6.19}
\]

Proof: Suppose first that η is a step function,
\[
\eta = \sum_{i=1}^{r} c_i\,\mathbf 1_{A_i}, \qquad A_i \in \mathcal F_u.
\]
By virtue of the local property of the stochastic integral, on the set Ai,
\[
X_t^{c_i,u}(\omega) = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)} = Y_t^{\eta}(\omega).
\]
Let now (ηn, n ≥ 1) be a sequence of simple Fu-measurable random variables converging in L2(Ω) to η. By Theorem 6.3 we have
\[
L^2(\Omega)\text{-}\lim_{n\to\infty} Y_t^{\eta_n} = Y_t^{\eta}.
\]
By taking if necessary a subsequence, we may assume that the limit holds a.s. Then, a.s.,
\[
Y_t^{\eta(\omega)}(\omega) = \lim_{n\to\infty} Y_t^{\eta_n(\omega)}(\omega)
= \lim_{n\to\infty} X_t^{x,u}(\omega)\big|_{x=\eta_n(\omega)} = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)},
\]
where in the last equality we have applied the joint continuity in (t, x) of X_t^{x,u}.

As a consequence of the preceding lemma, we have X_t^{x,s} = X_t^{X_u^{x,s},\,u} for any 0 ≤ s ≤ u ≤ t, a.s.
For any Γ ∈ B(R^m), set
\[
p(s,t,x,\Gamma) = P\{X_t^{x,s}\in\Gamma\}, \tag{6.20}
\]
so, for fixed 0 ≤ s ≤ t and x ∈ R^m, p(s, t, x, ·) is the law of the random variable X_t^{x,s}.

Theorem 6.5 The stochastic process {X_t^{x,s}, t ≥ s} is a Markov process with initial distribution µ = δ_{x} and transition probability function given by (6.20).

Proof: According to Definition 2.3, we have to check that (6.20) defines a Markovian transition function and that
\[
P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\} = p(u,t,X_u^{x,s},\Gamma). \tag{6.21}
\]
We start by proving this identity. For this, we shall apply Lemma 6.3 in the following setting:
\[
(E,\mathcal E) = (\mathbb R^m,\mathcal B(\mathbb R^m)),\qquad
\mathcal G = \sigma\big(B_{r+u}-B_u,\ r\ge 0\big),\qquad \mathcal H = \mathcal F_u,
\]
\[
\Psi(x,\omega) = \mathbf 1_{\Gamma}\big(X_t^{x,u}(\omega)\big),\ u\le t,
\qquad
X := X_u^{x,s},\ s\le u.
\]
The property of independent increments of the Brownian motion clearly yields that the σ-fields G and H defined above are independent, and X_u^{x,s} is Fu-measurable. Moreover,
\[
\Phi(x) = E\big(\Psi(x,\cdot)\big) = E\big(\mathbf 1_{\Gamma}(X_t^{x,u})\big)
= P\{X_t^{x,u}\in\Gamma\} = p(u,t,x,\Gamma).
\]
Thus, Lemma 6.3 and then Lemma 6.2 yield
\[
P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\}
= P\big\{X_t^{X_u^{x,s},\,u}\in\Gamma\,\big|\,\mathcal F_u\big\}
= E\big(\mathbf 1_{\Gamma}\big(X_t^{X_u^{x,s},\,u}\big)\,\big|\,\mathcal F_u\big)
= E\big(\Psi(X_u^{x,s},\omega)\,\big|\,\mathcal F_u\big)
= \Phi(X_u^{x,s}) = p(u,t,X_u^{x,s},\Gamma).
\]
Since x ↦ X_t^{x,s} is continuous a.s., the mapping x ↦ p(s, t, x, Γ) is also continuous and thus measurable. Moreover, by its very definition, Γ ↦ p(s, t, x, Γ) is a probability. We now prove that the Chapman-Kolmogorov equation is satisfied (see (2.10)).
Indeed, fix 0 ≤ s ≤ u ≤ t; by property (c) of the conditional expectation we have
\[
p(s,t,x,\Gamma) = E\big(\mathbf 1_{\Gamma}(X_t^{x,s})\big)
= E\Big(E\big(\mathbf 1_{\Gamma}(X_t^{x,s})\,\big|\,\mathcal F_u\big)\Big)
= E\big(P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\}\big).
\]
By (6.21), this last expression is E(p(u, t, X_u^{x,s}, Γ)). But
\[
E\big(p(u,t,X_u^{x,s},\Gamma)\big) = \int_{\mathbb R^m} p(u,t,y,\Gamma)\,\mathcal L_{X_u^{x,s}}(dy),
\]
where L_{X_u^{x,s}} denotes the probability law of X_u^{x,s}. By definition,
\[
\mathcal L_{X_u^{x,s}}(dy) = p(s,u,x,dy).
\]
Therefore,
\[
p(s,t,x,\Gamma) = \int_{\mathbb R^m} p(u,t,y,\Gamma)\,p(s,u,x,dy).
\]
The proof of the theorem is now complete.
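The transition function (6.20) is rarely explicit, but in the special case σ ≡ 1, b ≡ 0 (so that X^{x,s} is just a Brownian motion started at x at time s) one has p(s, t, x, (−∞, a]) = Φ((a − x)/√(t − s)), with Φ the standard normal distribution function, and the Chapman-Kolmogorov equation can be verified numerically. A minimal sketch in Python (the quadrature bounds and grid size are arbitrary choices):

```python
import math

def norm_cdf(z):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p(s, t, x, a):
    """p(s, t, x, (-inf, a]) for Brownian motion: Gaussian transition function."""
    return norm_cdf((a - x) / math.sqrt(t - s))

def trans_density(s, u, x, y):
    # density of p(s, u, x, dy): N(x, u - s)
    v = u - s
    return math.exp(-(y - x) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

s, u, t, x, a = 0.0, 0.5, 1.0, 0.3, 1.0

# right-hand side of Chapman-Kolmogorov: integrate p(u,t,y,Gamma) p(s,u,x,dy)
lo, hi, n = x - 8.0, x + 8.0, 4000          # truncation and mesh (arbitrary)
h = (hi - lo) / n
rhs = sum(p(u, t, lo + (k + 0.5) * h, a) * trans_density(s, u, x, lo + (k + 0.5) * h) * h
          for k in range(n))

lhs = p(s, t, x, a)
print(lhs, rhs)  # the two sides agree up to quadrature error
```

The midpoint rule with this truncation is accurate to far better than the tolerance checked below; the identity itself is exact.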




7 Numerical approximations of stochastic differential equations
In this section we consider a fixed time interval [0, T ]. Let π = {0 = τ0 < τ1 < . . . < τN = T } be a partition of [0, T ]. The Euler-Maruyama scheme for the SDE (6.2) based on the partition π is the stochastic process X^π = {X_t^π, t ∈ [0, T ]} defined iteratively as follows:
\[
X_{\tau_{n+1}}^{\pi} = X_{\tau_n}^{\pi} + \sigma(\tau_n, X_{\tau_n}^{\pi})\big(B_{\tau_{n+1}}-B_{\tau_n}\big) + b(\tau_n, X_{\tau_n}^{\pi})\,(\tau_{n+1}-\tau_n),
\quad n = 0,\dots,N-1,
\]
\[
X_0^{\pi} = x. \tag{7.1}
\]

Notice that the values X_{\tau_n}^π, n = 0, . . . , N − 1, are determined by the values of B_{\tau_n}, n = 1, . . . , N.
We can extend the definition of X^π to any value of t ∈ [0, T ] by setting
\[
X_t^{\pi} = X_{\tau_j}^{\pi} + \sigma(\tau_j, X_{\tau_j}^{\pi})\big(B_t-B_{\tau_j}\big) + b(\tau_j, X_{\tau_j}^{\pi})\,(t-\tau_j), \tag{7.2}
\]
for t ∈ [τj, τj+1).
The stochastic process {X_t^π, t ∈ [0, T ]} defined by (7.2) can also be written as a stochastic differential equation; this form is convenient for comparison with the solution of (6.2). Indeed, for any t ∈ [0, T ], set π(t) = sup{τl ∈ π : τl ≤ t}; then
\[
X_t^{\pi} = x + \int_0^t \sigma\big(\pi(s), X_{\pi(s)}^{\pi}\big)\,dB_s + \int_0^t b\big(\pi(s), X_{\pi(s)}^{\pi}\big)\,ds. \tag{7.3}
\]
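The recursion (7.1) is straightforward to implement. As an illustration (not taken from the text), the sketch below applies it to geometric Brownian motion, dX_t = µX_t dt + σX_t dB_t, whose exact solution X_t = x exp((µ − σ²/2)t + σB_t) is available in closed form, so the scheme can be compared path by path with the true solution; all parameter values are illustrative:

```python
import math
import random

random.seed(42)

# Euler-Maruyama for dX = mu*X dt + sig*X dB on a uniform partition of [0, T]
mu, sig, x0, T, N = 0.05, 0.2, 1.0, 1.0, 1000   # illustrative parameters
dt = T / N

X = [x0]          # Euler-Maruyama path on the grid tau_n = n*dt
B = 0.0           # running Brownian path, shared with the exact solution
for _ in range(N):
    dB = random.gauss(0.0, math.sqrt(dt))
    B += dB
    X.append(X[-1] + mu * X[-1] * dt + sig * X[-1] * dB)

# exact solution of geometric Brownian motion evaluated at T on the same path
exact_T = x0 * math.exp((mu - 0.5 * sig ** 2) * T + sig * B)
print(X[-1], exact_T)  # close for a fine partition
```

Since the same Brownian increments drive both the scheme and the exact solution, the gap at time T is the strong (pathwise) discretisation error studied in Theorem 7.1 below.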

The next theorem gives the rate of convergence of the Euler-Maruyama scheme to the solution of (6.2) in the Lp norm.
Theorem 7.1 We assume that the hypotheses (H) are satisfied. Moreover, we suppose that there exists α ∈ (0, 1) such that
\[
|\sigma(t,x)-\sigma(s,x)| + |b(t,x)-b(s,x)| \le C\,(1+|x|)\,|t-s|^{\alpha}, \tag{7.4}
\]
where C is some positive constant. Then, for any p ∈ [1, ∞),
\[
E\Big(\sup_{0\le t\le T}\big|X_t^{\pi}-X_t\big|^p\Big) \le C(T,p,x)\,|\pi|^{\beta p}, \tag{7.5}
\]
where |π| denotes the norm of the partition π and β = 1/2 ∧ α.
Proof: We shall apply the following result, which can be proved in a similar way as Theorem 6.2:
\[
\sup_{\pi}\,E\Big(\sup_{0\le s\le T}|X_s^{\pi}|^p\Big) \le C(p,T). \tag{7.6}
\]
Set
\[
Z_t = \sup_{0\le s\le t}\big|X_s^{\pi}-X_s\big|.
\]
Applying Burkholder's and Hölder's inequalities we obtain
\[
E(Z_t^p) \le C(p)\Big\{ t^{\frac p2 -1}\,E\int_0^t \big|\sigma(\pi(s),X_{\pi(s)}^{\pi})-\sigma(s,X_s)\big|^p\,ds
+ t^{p-1}\,E\int_0^t \big|b(\pi(s),X_{\pi(s)}^{\pi})-b(s,X_s)\big|^p\,ds \Big\}.
\]
The assumptions on the coefficients yield
\[
\big|\sigma(\pi(s),X_{\pi(s)}^{\pi})-\sigma(s,X_s)\big|
\le \big|\sigma(\pi(s),X_{\pi(s)}^{\pi})-\sigma(\pi(s),X_{\pi(s)})\big|
+ \big|\sigma(\pi(s),X_{\pi(s)})-\sigma(s,X_{\pi(s)})\big|
+ \big|\sigma(s,X_{\pi(s)})-\sigma(s,X_s)\big|
\]
\[
\le C_T\Big[\big|X_{\pi(s)}^{\pi}-X_{\pi(s)}\big| + \big(1+|X_{\pi(s)}|\big)\,|s-\pi(s)|^{\alpha} + \big|X_{\pi(s)}-X_s\big|\Big]
\le C_T\Big[Z_s + \big(1+|X_{\pi(s)}|\big)\,(s-\pi(s))^{\alpha} + \big|X_{\pi(s)}-X_s\big|\Big],
\]
and similarly for the coefficient b. Hence, using (7.6) and (6.16), we have
\[
E\Big(\big|\sigma(\pi(s),X_{\pi(s)}^{\pi})-\sigma(s,X_s)\big|^p\Big)
\le C(p,T)\Big[E(|Z_s|^p) + (1+|x|^p)\big(|\pi|^{\alpha}+|\pi|^{\frac12}\big)^p\Big],
\]
and a similar estimate for b. Consequently,
\[
E(Z_t^p) \le C(p,T,x)\Big[\int_0^t E(Z_s^p)\,ds + \big(|\pi|^{\alpha}+|\pi|^{\frac12}\big)^p\,T\Big].
\]
With Gronwall's lemma we conclude
\[
E(Z_t^p) \le C(p,T,x)\,|\pi|^{\beta p},
\]
with β = 1/2 ∧ α.


Remark 7.1 If the coefficients σ and b do not depend on t, a similar proof gives (7.5) with β = 1/2.

Assume that the sequence (πn, n ≥ 1) of partitions of [0, T ] satisfies the following property: there exist γ ∈ (0, β) and p ≥ 1 such that
\[
\sum_{n\ge 1} |\pi_n|^{(\beta-\gamma)p} < \infty. \tag{7.7}
\]
Then, Chebyshev's inequality and (7.5) imply
\[
\sum_{n=0}^{\infty} P\Big(|\pi_n|^{-\gamma}\sup_{0\le s\le T}\big|X_s^{\pi_n}-X_s\big| > \varepsilon\Big)
\le C(p,T,x)\,\varepsilon^{-p}\sum_{n=0}^{\infty}|\pi_n|^{(\beta-\gamma)p} < \infty.
\]
Borel-Cantelli's lemma then yields
\[
|\pi_n|^{-\gamma}\sup_{0\le s\le T}\big|X_s^{\pi_n}-X_s\big| \to 0,\quad \text{a.s.}, \tag{7.8}
\]
as n → ∞.
For example, for the sequence of dyadic partitions, |πn| = 2^{−n}, and (7.7) holds for any γ ∈ (0, β) and p ≥ 1.
8 Continuous time martingales
In this chapter we shall study some properties of martingales (respectively,
supermartingales and submartingales) whose sample paths are continuous.
We consider a filtration {Ft , t ≥ 0} as has been introduced in section 2.4 and
refer to definition 2.2 for the notion of martingale (respectively, supermartin-
gale, submartingale). We notice that in fact this definition can be extended
to families of random variables {Xt , t ∈ T} where T is an ordered set. In
particular, we can consider discrete time parameter processes.
We start by listing some elementary but useful properties.
1. For a martingale (respectively, supermartingale, submartingale) the
function t 7→ E(Xt ) is a constant (respectively, decreasing, increasing)
function.
2. Let {Xt , t ≥ 0} be a martingale and let f : R → R be a convex
function. Assume further that f (Xt ) ∈ L1 (Ω), for any t ≥ 0. Then
the stochastic process {f (Xt ), t ≥ 0} is a submartingale. The same
conclusion holds true for a submartingale if, additionally the convex
function f is increasing.
The first assertion follows easily from property (c) of the conditional expectation. The second assertion can be proved using Jensen's inequality, as follows. Assume first that {Xt, t ≥ 0} is a martingale, and fix 0 ≤ s ≤ t. Then, by applying the function f to the identity E(Xt|Fs) = Xs along with the convexity of f, we obtain
\[
E\big(f(X_t)\,\big|\,\mathcal F_s\big) \ge f\big(E(X_t|\mathcal F_s)\big) = f(X_s).
\]
If {Xt, t ≥ 0} is a submartingale, we consider the inequality E(Xt|Fs) ≥ Xs. Since f is increasing and convex, we have
\[
E\big(f(X_t)\,\big|\,\mathcal F_s\big) \ge f\big(E(X_t|\mathcal F_s)\big) \ge f(X_s).
\]

8.1 Doob’s inequalities for martingales


In the first part of this section, we will deal with discrete parameter martin-
gales indexed by {0, 1, . . . , N }.
Proposition 8.1 Let {Xn, 0 ≤ n ≤ N} be a submartingale. For any λ > 0, the following inequalities hold:
\[
\lambda\,P\Big(\sup_n X_n \ge \lambda\Big) \le E\big(X_N\,\mathbf 1_{(\sup_n X_n\ge\lambda)}\big)
\le E\big(|X_N|\,\mathbf 1_{(\sup_n X_n\ge\lambda)}\big). \tag{8.1}
\]
Proof: Consider the stopping time
\[
T = \inf\{n : X_n \ge \lambda\}\wedge N.
\]
Then
\[
E(X_N) \ge E(X_T) = E\big(X_T\,\mathbf 1_{(\sup_n X_n\ge\lambda)}\big) + E\big(X_T\,\mathbf 1_{(\sup_n X_n<\lambda)}\big)
\ge \lambda\,P\Big(\sup_n X_n\ge\lambda\Big) + E\big(X_N\,\mathbf 1_{(\sup_n X_n<\lambda)}\big).
\]
By subtracting E(X_N 1_{(sup_n X_n<λ)}) from the first and last terms above, we obtain the first inequality of (8.1). The second one is obvious.

As a consequence of this proposition we have the following.

As a consequence of this proposition we have the following.

Proposition 8.2 Let {Xn, 0 ≤ n ≤ N} be either a martingale or a positive submartingale. Fix p ∈ [1, ∞) and λ ∈ (0, ∞). Then,
\[
\lambda^p\,P\Big(\sup_n |X_n|\ge\lambda\Big) \le E\big(|X_N|^p\big). \tag{8.2}
\]
Moreover, for any p ∈ (1, ∞),
\[
E\big(|X_N|^p\big) \le E\Big(\sup_n |X_n|^p\Big) \le \Big(\frac{p}{p-1}\Big)^p E\big(|X_N|^p\big). \tag{8.3}
\]
Proof: Without loss of generality, we may assume that E(|X_N|^p) < ∞, since otherwise (8.2) holds trivially.
According to property 2 above, the process {|Xn|^p, 0 ≤ n ≤ N} is a submartingale and then, by Proposition 8.1 applied to the process |X_n|^p,
\[
\mu\,P\Big(\sup_n |X_n|^p \ge \mu\Big) = \mu\,P\Big(\sup_n |X_n| \ge \mu^{\frac1p}\Big)
= \lambda^p\,P\Big(\sup_n |X_n|\ge\lambda\Big) \le E\big(|X_N|^p\big),
\]
where for any µ > 0 we have written λ = µ^{1/p}.
We now prove the second inequality of (8.3); the first is obvious.
Set X* = sup_n |X_n|, for which we have
\[
\lambda\,P(X^*\ge\lambda) \le E\big(|X_N|\,\mathbf 1_{(X^*\ge\lambda)}\big).
\]
Fix k > 0. Fubini's theorem yields
\[
E\big((X^*\wedge k)^p\big) = E\Big(\int_0^{X^*\wedge k} p\lambda^{p-1}\,d\lambda\Big)
= \int_{\Omega} dP\int_0^{\infty}\mathbf 1_{\{\lambda\le X^*\wedge k\}}\,p\lambda^{p-1}\,d\lambda
= p\int_0^k d\lambda\,\lambda^{p-1}\int_{\{\lambda\le X^*\}} dP
\]
\[
= p\int_0^k d\lambda\,\lambda^{p-2}\,\lambda\,P(X^*\ge\lambda)
\le p\int_0^k d\lambda\,\lambda^{p-2}\,E\big(|X_N|\,\mathbf 1_{(X^*\ge\lambda)}\big)
= p\,E\Big(|X_N|\int_0^{k\wedge X^*}\lambda^{p-2}\,d\lambda\Big)
= \frac{p}{p-1}\,E\big(|X_N|\,(X^*\wedge k)^{p-1}\big).
\]
Applying Hölder's inequality with exponents p/(p−1) and p yields
\[
E\big((X^*\wedge k)^p\big) \le \frac{p}{p-1}\,\big[E\big((X^*\wedge k)^p\big)\big]^{\frac{p-1}{p}}\,\big[E\big(|X_N|^p\big)\big]^{\frac1p}.
\]
Consequently,
\[
\big[E\big((X^*\wedge k)^p\big)\big]^{\frac1p} \le \frac{p}{p-1}\,\big[E\big(|X_N|^p\big)\big]^{\frac1p}.
\]
Letting k → ∞ and using monotone convergence, we end the proof.

It is not difficult to extend the above results to martingales (submartingales) with continuous sample paths. In fact, for a given T > 0 we define
\[
D = \mathbb Q\cap[0,T], \qquad D_n = D\cap\Big\{\frac{k}{2^n},\ k\in\mathbb Z_+\Big\},
\]
where Q denotes the set of rational numbers.
We can now apply (8.2), (8.3) to the corresponding processes indexed by Dn. By letting n tend to ∞ we obtain
\[
\lambda^p\,P\Big(\sup_{t\in D}|X_t|\ge\lambda\Big) \le \sup_{t\in D} E\big(|X_t|^p\big),\quad p\in[1,\infty),
\]
and
\[
E\Big(\sup_{t\in D}|X_t|^p\Big) \le \Big(\frac{p}{p-1}\Big)^p \sup_{t\in D} E\big(|X_t|^p\big),\quad p\in(1,\infty).
\]
By the continuity of the sample paths we can finally state the following result.

Theorem 8.1 Let {Xt, t ∈ [0, T ]} be either a continuous martingale or a continuous positive submartingale. Then
\[
\lambda^p\,P\Big(\sup_{t\in[0,T]}|X_t|\ge\lambda\Big) \le \sup_{t\in[0,T]} E\big(|X_t|^p\big),\quad p\in[1,\infty), \tag{8.4}
\]
\[
E\Big(\sup_{t\in[0,T]}|X_t|^p\Big) \le \Big(\frac{p}{p-1}\Big)^p \sup_{t\in[0,T]} E\big(|X_t|^p\big) \tag{8.5}
\]
\[
= \Big(\frac{p}{p-1}\Big)^p E\big(|X_T|^p\big),\quad p\in(1,\infty). \tag{8.6}
\]
Inequality (8.4) is termed Doob's maximal inequality, while (8.6) is called Doob's Lp inequality.

8.2 Quadratic variation of a continuous martingale


In this section we construct the quadratic variation of a continuous martin-
gale.
We start by a very simple consequence of the martingale property.
Lemma 8.1 Let {Nt, t ≥ 0} be a square integrable martingale with respect to a filtration (Ft)t≥0. The following identities hold: for any 0 ≤ s ≤ t,
\[
E\big(N_t^2 - N_s^2\big) = E\big((N_t-N_s)^2\big), \tag{8.7}
\]
\[
E\big(N_t^2 - N_s^2\,\big|\,\mathcal F_s\big) = E\big((N_t-N_s)^2\,\big|\,\mathcal F_s\big). \tag{8.8}
\]
Proof: By developing the square,
\[
E\big((N_t-N_s)^2\big) = E\big(N_t^2 + N_s^2 - 2N_sN_t\big).
\]
From the martingale property,
\[
E(N_sN_t) = E\big(E(N_sN_t\,|\,\mathcal F_s)\big) = E(N_s^2).
\]
Hence (8.7) follows. The proof of (8.8) follows by similar arguments.

The next statement gives an idea of the roughness of the sample paths of a
continuous local martingale.

Proposition 8.3 Let N be a continuous bounded martingale, null at zero and with sample paths of bounded variation, a.s. Then N is indistinguishable from the constant process 0.
Proof: Fix t > 0 and consider a partition 0 = t0 < t1 < · · · < tp = t of [0, t]. Then
\[
E(N_t^2) = \sum_{i=1}^{p} E\big[N_{t_i}^2 - N_{t_{i-1}}^2\big]
= \sum_{i=1}^{p} E\big[(N_{t_i}-N_{t_{i-1}})^2\big]
\le E\Big[\sup_i |N_{t_i}-N_{t_{i-1}}|\,\sum_{i=1}^{p}|N_{t_i}-N_{t_{i-1}}|\Big]
\le C\,E\Big(\sup_i |N_{t_i}-N_{t_{i-1}}|\Big),
\]
where the second identity above is a consequence of Lemma 8.1.
By considering a sequence of partitions whose mesh tends to zero, the preceding estimate yields E(N_t^2) = 0, by the continuity of the sample paths of N. This finishes the proof of the proposition.

Throughout this section we will consider a fixed t > 0 and an increasing sequence of partitions of [0, t] whose mesh tends to zero. Points of the n-th partition will be generically denoted by t_k^n, k = 0, 1, . . . , pn. We will also consider a continuous martingale M and define
\[
\langle M\rangle_t^n = \sum_{k=1}^{p_n}\big(M_{t_k^n}-M_{t_{k-1}^n}\big)^2,
\qquad
(\Delta_k^n M)_t = M_{t_k^n}-M_{t_{k-1}^n}.
\]
Theorem 8.2 Let M be a continuous bounded martingale. Then the sequence (⟨M⟩_t^n, n ≥ 1), t ∈ [0, T ], converges uniformly in t, in probability, to a continuous, increasing process ⟨M⟩ = (⟨M⟩_t, t ∈ [0, T ]) such that ⟨M⟩_0 = 0. That is, for any ε > 0,
\[
P\Big\{\sup_{t\in[0,T]}\big|\langle M\rangle_t^n - \langle M\rangle_t\big| > \varepsilon\Big\} \to 0, \tag{8.9}
\]
as n → ∞. The process ⟨M⟩ is the unique process satisfying the above conditions and such that M² − ⟨M⟩ is a continuous martingale.
Proof: Uniqueness follows from Proposition 8.3. Indeed, assume there were two increasing processes ⟨M⟩^1, ⟨M⟩^2 satisfying that M² − ⟨M⟩^i is a continuous martingale. By taking the difference of these two processes, we get that ⟨M⟩^1 − ⟨M⟩^2 is of bounded variation and, at the same time, a continuous martingale. Hence ⟨M⟩^1 and ⟨M⟩^2 are indistinguishable.
The next objective is to prove that (⟨M⟩_t^n, n ≥ 1) is, uniformly in t ∈ [0, T ], a Cauchy sequence in probability.
Let m > n. We have the following:
\[
E\big(|\langle M\rangle_t^n - \langle M\rangle_t^m|^2\big)
= E\Bigg[\Bigg(\sum_{k=1}^{p_n}\Big((\Delta_k^n M)_t^2-\sum_{j:\,t_j^m\in[t_{k-1}^n,t_k^n)}(\Delta_j^m M)_t^2\Big)\Bigg)^2\Bigg]
\]
\[
= 4\,E\Bigg[\Bigg(\sum_{k=1}^{p_n}\ \sum_{j:\,t_j^m\in[t_{k-1}^n,t_k^n)}(\Delta_j^m M)_t\,\big(M_{t_j^m}-M_{t_{k-1}^n}\big)\Bigg)^2\Bigg]
= 4\,E\Bigg[\sum_{k=1}^{p_n}\ \sum_{j:\,t_j^m\in[t_{k-1}^n,t_k^n)}(\Delta_j^m M)_t^2\,\big(M_{t_j^m}-M_{t_{k-1}^n}\big)^2\Bigg]
\]
\[
\le 4\,E\Bigg[\sup_k\ \sup_{j:\,t_j^m\in[t_{k-1}^n,t_k^n)}\big(M_{t_j^m}-M_{t_{k-1}^n}\big)^2\ \sum_{j=1}^{p_m}(\Delta_j^m M)_t^2\Bigg]
\]
\[
\le 4\,\Bigg[E\Big(\sup_k\ \sup_{j:\,t_j^m\in[t_{k-1}^n,t_k^n)}\big(M_{t_j^m}-M_{t_{k-1}^n}\big)^4\Big)\Bigg]^{\frac12}
\Bigg[E\Bigg(\Big(\sum_{j=1}^{p_m}(\Delta_j^m M)_t^2\Big)^{2}\Bigg)\Bigg]^{\frac12}.
\]
Let us now consider the last expression. The first factor tends to zero as n and m tend to infinity, because M is continuous and bounded. The second factor is easily seen to be bounded uniformly in m. Thus we have proved
\[
\lim_{n,m\to\infty} E\big(|\langle M\rangle_t^n - \langle M\rangle_t^m|^2\big) = 0, \tag{8.10}
\]
for any t ∈ [0, T ].
One can easily check that, for any n ≥ 1, {M_t^2 − ⟨M⟩_t^n, t ∈ [0, T ]} is a continuous martingale. Indeed, let 0 ≤ s ≤ t; then
\[
E\big(M_t^2-M_s^2\,\big|\,\mathcal F_s\big) = E\big((M_t-M_s)^2\,\big|\,\mathcal F_s\big)
= E\big(\langle M\rangle_t^n - \langle M\rangle_s^n\,\big|\,\mathcal F_s\big).
\]
Therefore {⟨M⟩_t^n − ⟨M⟩_t^m, t ∈ [0, T ]} is a martingale for any n, m ≥ 1. Hence, Doob's inequality yields
\[
E\Big(\sup_{t\in[0,T]}|\langle M\rangle_t^n-\langle M\rangle_t^m|^2\Big) \le 4\,E\big(|\langle M\rangle_T^n-\langle M\rangle_T^m|^2\big).
\]
Consequently,
\[
P\Big\{\sup_{0\le t\le T}|\langle M\rangle_t^n-\langle M\rangle_t^m| > \varepsilon\Big\}
\le \varepsilon^{-2}\,E\Big(\sup_{0\le t\le T}|\langle M\rangle_t^n-\langle M\rangle_t^m|^2\Big)
\le 4\,\varepsilon^{-2}\,E\big(|\langle M\rangle_T^n-\langle M\rangle_T^m|^2\big).
\]
This last expression tends to zero as n, m tend to infinity. Since convergence in probability is metrizable, we conclude that there exists a process ⟨M⟩ satisfying the required conditions.


Remark. Assume that the martingale M in Theorem 8.2 is only bounded in L2. Then, by a stopping argument, the boundedness assumption on M can be removed.

Example. Consider the stochastic integral process M = {∫_0^t φ(s) dB_s, t ∈ [0, T ]}, with φ ∈ L2_{a,T}. It was proved in Section 3 that the process M is a continuous martingale with respect to the filtration generated by the Brownian motion B. By applying the extension of Theorem 8.2 to continuous martingales bounded in L2, we can see that
\[
\langle M\rangle_t = \int_0^t \varphi(s)^2\,ds.
\]
Indeed, the right-hand side of this equality defines an increasing process vanishing at t = 0. Moreover, by Itô's formula,
\[
M_t^2 - \int_0^t \varphi(s)^2\,ds = 2\int_0^t M_s\,\varphi(s)\,dB_s,
\]
and the right-hand side of this identity defines a continuous martingale.
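For Brownian motion itself (take φ ≡ 1 in the example above) we get ⟨B⟩_t = t, and the convergence of the approximating sums of Theorem 8.2 can be observed numerically. A quick sketch (plain Python; the partition sizes and the seed are arbitrary choices, and each partition is driven by a fresh simulated path):

```python
import math
import random

random.seed(3)

# Sums of squared Brownian increments over refining partitions of [0, t]
# concentrate around the quadratic variation <B>_t = t.
t = 1.0
estimates = []
for N in (100, 1000, 10000):                  # finer and finer partitions
    dt = t / N
    qv = sum(random.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(N))
    estimates.append(qv)

print(estimates)  # each close to t = 1, tighter as the mesh shrinks
```

The fluctuation of each sum around t has standard deviation of order √(2/N), which is the concentration behind the L² convergence (8.10).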


Given two continuous martingales M and N, we define the cross variation by
\[
\langle M,N\rangle_t = \frac12\big[\langle M+N\rangle_t - \langle M\rangle_t - \langle N\rangle_t\big], \tag{8.11}
\]
t ∈ [0, T ].
From the properties of the quadratic variation (see Theorem 8.2), we see that the process {⟨M, N⟩_t, t ∈ [0, T ]} is a process of bounded variation, and it is the unique process (up to indistinguishability) such that {M_tN_t − ⟨M, N⟩_t, t ∈ [0, T ]} is a continuous martingale. It is also clear that
\[
\lim_{n\to\infty}\sum_{k=1}^{p_n}\big(M_{t_k^n}-M_{t_{k-1}^n}\big)\big(N_{t_k^n}-N_{t_{k-1}^n}\big) = \langle M,N\rangle_t, \tag{8.12}
\]
uniformly in t ∈ [0, T ], in probability.
This result, together with Schwarz's inequality, implies
\[
|\langle M,N\rangle_t| \le \sqrt{\langle M\rangle_t\,\langle N\rangle_t},
\]
and, more generally, setting ⟨M, N⟩_s^t = ⟨M, N⟩_t − ⟨M, N⟩_s for 0 ≤ s ≤ t ≤ T,
\[
\big|\langle M,N\rangle_s^t\big| \le \sqrt{\langle M\rangle_s^t\,\langle N\rangle_s^t}.
\]
This inequality (a version of the Cauchy-Schwarz inequality) is a particular case of the result stated in the next proposition.
Proposition 8.4 Let M, N be two continuous martingales and H, K two measurable processes. Then, for any t ≥ 0,
\[
\Big|\int_0^t H_sK_s\,d\langle M,N\rangle_s\Big|
\le \Big(\int_0^t H_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^t K_s^2\,d\langle N\rangle_s\Big)^{\frac12}. \tag{8.13}
\]
Proof: Consider a finite partition {t0 = 0 < t1 < . . . < tr = t} of [0, t], and assume first that the stochastic processes H, K are step processes based on this partition, of the form
\[
H = H_0\,\mathbf 1_{\{0\}} + H_1\,\mathbf 1_{]0,t_1]} + \dots + H_r\,\mathbf 1_{]t_{r-1},t_r]},
\]
\[
K = K_0\,\mathbf 1_{\{0\}} + K_1\,\mathbf 1_{]0,t_1]} + \dots + K_r\,\mathbf 1_{]t_{r-1},t_r]},
\]
with Hi, Ki, i = 0, . . . , r, bounded measurable random variables. Then
\[
\int_0^t H_sK_s\,d\langle M,N\rangle_s = \sum_{i=1}^{r} H_iK_i\,\langle M,N\rangle_{t_i}^{t_{i+1}}.
\]
Thus,
\[
\Big|\int_0^t H_sK_s\,d\langle M,N\rangle_s\Big|
\le \sum_{i=1}^{r}|H_iK_i|\,\big|\langle M,N\rangle_{t_i}^{t_{i+1}}\big|
\le \sum_{i=1}^{r}|H_iK_i|\,\big(\langle M\rangle_{t_i}^{t_{i+1}}\big)^{\frac12}\big(\langle N\rangle_{t_i}^{t_{i+1}}\big)^{\frac12}.
\]
By applying Schwarz's inequality, the last expression is bounded by
\[
\Big(\int_0^t H_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^t K_s^2\,d\langle N\rangle_s\Big)^{\frac12}.
\]
The general case follows from an approximation argument.

The bounded variation process ⟨M, N⟩ gives rise to the total variation measure d‖⟨M, N⟩‖_s, defined by
\[
d\|\langle M,N\rangle\|_s = d\big(\langle M,N\rangle^+\big)(s) + d\big(\langle M,N\rangle^-\big)(s),
\]
where ⟨M, N⟩_s^+, ⟨M, N⟩_s^- denote the increasing functions such that
\[
\langle M,N\rangle_s = \langle M,N\rangle_s^+ - \langle M,N\rangle_s^-.
\]
It is worth noticing that (8.13) can be extended to the following inequality, known as the Kunita-Watanabe inequality:
\[
\int_0^t |H_s|\,|K_s|\,d\|\langle M,N\rangle\|_s
\le \Big(\int_0^t H_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^t K_s^2\,d\langle N\rangle_s\Big)^{\frac12}. \tag{8.14}
\]
More details can be found in [11].
9 Stochastic integrals with respect to continuous martingales
This chapter aims to give an outline of the main ideas of the extension of the Itô stochastic integral to integrators which are continuous martingales. We start by describing precisely the spaces involved in the construction of such a notion. Throughout the chapter, we consider a fixed probability space (Ω, F, P) endowed with a filtration (Ft, t ≥ 0).
We denote by H2 the space of continuous martingales M, indexed by [0, T ], with M0 = 0 a.s. and bounded in L2(Ω); that is,
\[
\sup_{t\in[0,T]} E\big(|M_t|^2\big) < \infty.
\]
This is a Hilbert space endowed with the inner product
\[
(M,N)_{\mathcal H^2} = E\big[\langle M,N\rangle_T\big].
\]
A stochastic process (Xt, t ≥ 0) is said to be progressively measurable if, for any t ≥ 0, the mapping (s, ω) ↦ Xs(ω) defined on [0, t] × Ω is measurable with respect to the σ-field B([0, t]) ⊗ Ft.
For any M ∈ H2 we define L2(M) as the set of progressively measurable processes H such that
\[
E\Big(\int_0^{\infty} H_s^2\,d\langle M\rangle_s\Big) < \infty.
\]
Notice that this is an L2 space of measurable mappings defined on R+ × Ω with respect to the measure dP d⟨M⟩. Hence it is also a Hilbert space, the natural inner product being
\[
(H,K)_{L^2(M)} = E\Big(\int_0^{\infty} H_sK_s\,d\langle M\rangle_s\Big).
\]
The standard Brownian motion belongs to the space H2, and the space L2(M) will play the same role as L2_{a,T} in the Itô theory of stochastic integration with respect to Brownian motion.
Let E be the linear subspace of L2(M) consisting of processes of the form
\[
H_s(\omega) = \sum_{i=0}^{p} H_i(\omega)\,\mathbf 1_{]t_i,t_{i+1}]}(s), \tag{9.1}
\]
where 0 = t0 < t1 < . . . < tp+1, and, for each i, Hi is an Fti-measurable, bounded random variable.
Stochastic processes belonging to E are termed elementary. They are related to L2(M) as follows.
Proposition 9.1 Fix M ∈ H2. The set E is dense in L2(M).
Proof: We will prove that if K ∈ L2(M) is orthogonal to E, then K = 0. For this, we fix 0 ≤ s < t ≤ T and consider the process
\[
H = F\,\mathbf 1_{]s,t]},
\]
with F an Fs-measurable and bounded random variable.
Saying that K is orthogonal to H in L2(M) can be written as
\[
E\Big(\int_0^T H_uK_u\,d\langle M\rangle_u\Big) = E\Big(F\int_s^t K_u\,d\langle M\rangle_u\Big) = 0.
\]
Consider the stochastic process
\[
X_t = \int_0^t K_u\,d\langle M\rangle_u.
\]
Notice that Xt ∈ L1(Ω). In fact,
\[
E|X_t| \le E\int_0^t |K_u|\,d\langle M\rangle_u
\le \Big(E\int_0^t |K_u|^2\,d\langle M\rangle_u\Big)^{\frac12}\big(E\langle M\rangle_t\big)^{\frac12}.
\]
We have thus proved that E(F(Xt − Xs)) = 0, for any 0 ≤ s < t and any Fs-measurable and bounded random variable F. This shows that the process (Xt, t ≥ 0) is a martingale. At the same time, (Xt, t ≥ 0) is also a process of bounded variation. Hence, by Proposition 8.3,
\[
\int_0^t K_u\,d\langle M\rangle_u = 0, \qquad \forall t\ge 0,
\]
which implies that K = 0 in L2(M).




Stochastic integral of processes in E

Proposition 9.2 Let M ∈ H2 and H ∈ E be as in (9.1). Define
\[
(H\cdot M)_t = \sum_{i=0}^{p} H_i\big(M_{t_{i+1}\wedge t}-M_{t_i\wedge t}\big).
\]
Then

(i) H·M ∈ H2.

(ii) The mapping H ↦ H·M extends to an isometry from L2(M) to H2.

The stochastic process {(H·M)_t, t ≥ 0} is called the stochastic integral of the process H with respect to M and is also denoted by ∫_0^t H_s dM_s.

Proof of (i): The martingale property follows from the measurability properties of H and the martingale property of M. Moreover, since H is bounded, H·M is bounded in L2(Ω).

Proof of (ii): We prove first that the mapping H ∈ E ↦ H·M is an isometry from E into H2.
Clearly H ↦ H·M is linear. Moreover, H·M is a finite sum of terms of the form
\[
M_t^{i} = H_i\big(M_{t_{i+1}\wedge t}-M_{t_i\wedge t}\big),
\]
each one being a martingale, orthogonal to each other. It is easy to check that
\[
\langle M^{i}\rangle_t = H_i^2\big(\langle M\rangle_{t_{i+1}\wedge t}-\langle M\rangle_{t_i\wedge t}\big).
\]
Hence,
\[
\langle H\cdot M\rangle_t = \sum_{i=0}^{p} H_i^2\big(\langle M\rangle_{t_{i+1}\wedge t}-\langle M\rangle_{t_i\wedge t}\big).
\]
Consequently,
\[
E\langle H\cdot M\rangle_T = \|H\cdot M\|_{\mathcal H^2}^2
= E\Big[\sum_{i=0}^{p} H_i^2\big(\langle M\rangle_{t_{i+1}\wedge T}-\langle M\rangle_{t_i\wedge T}\big)\Big]
= E\Big(\int_0^T H_s^2\,d\langle M\rangle_s\Big) = \|H\|_{L^2(M)}^2.
\]
Since E is dense in L2(M), this isometry extends to a unique isometry from L2(M) into H2. The extension is termed the stochastic integral of the process H with respect to M.
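For M = B a Brownian motion (so that ⟨B⟩_t = t), the isometry of Proposition 9.2 reads E[(H·B)_T²] = Σ_i H_i²(t_{i+1} − t_i), which can be checked by simulation. A minimal sketch (plain Python; the partition, the step values and the sample count are illustrative choices, and the H_i are taken deterministic for simplicity):

```python
import math
import random

random.seed(11)

# For an elementary step process H, the integral (H.B)_T is the finite sum
# sum_i H_i (B_{t_{i+1}} - B_{t_i}), and the isometry gives its second moment.
grid = [0.0, 0.25, 0.5, 1.0]      # illustrative partition of [0, 1]
H = [1.0, -2.0, 0.5]              # deterministic step values H_i

# theoretical value from the isometry: sum H_i^2 (t_{i+1} - t_i)
theory = sum(h * h * (b - a) for h, a, b in zip(H, grid, grid[1:]))

n_paths = 20000
second_moment = 0.0
for _ in range(n_paths):
    integral = 0.0
    for h, a, b in zip(H, grid, grid[1:]):
        integral += h * random.gauss(0.0, math.sqrt(b - a))  # H_i (B_b - B_a)
    second_moment += integral ** 2
second_moment /= n_paths          # Monte Carlo estimate of E[(H.B)_T^2]

print(theory, second_moment)
```

The Monte Carlo estimate fluctuates around the theoretical value with standard error of order 1/√n_paths, so the two agree within a modest tolerance.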


10 Appendix 1: Conditional expectation
Roughly speaking, a conditional expectation of a random variable is the
mean value with respect to a modified probability after having incorporated
some a priori information. The simplest case corresponds to conditioning
with respect to an event B ∈ F. In this case, the conditional expectation is
the mathematical expectation computed on the modified probability space
(Ω, F, P (·/B)).
However, in general, additional information cannot be described so easily. Assuming that we know about some events B1, . . . , Bn, we also know about those that can be derived from them, such as unions, intersections and complements. This explains the choice of a σ-field to encode the known information and to work with it.
In the sequel, we denote by G an arbitrary σ-field included in F and by X
a random variable with finite expectation (X ∈ L1 (Ω)). Our final aim is to
give a definition of the conditional expectation of X given G. However, in
order to motivate this notion, we shall start with more simple situations.
Conditional expectation given an event
Let B ∈ F be such that P(B) ≠ 0. The conditional expectation of X given B is the real number defined by the formula
\[
E(X/B) = \frac{1}{P(B)}\,E\big(\mathbf 1_B X\big). \tag{10.1}
\]
It immediately follows that

• E(X/Ω) = E(X),

• E(1_A/B) = P(A/B).

With the definition (10.1), the conditional expectation coincides with the expectation with respect to the conditional probability P(·/B). We check this fact with a discrete random variable X = Σ_{i≥1} a_i 1_{A_i}. Indeed,
\[
E(X/B) = \frac{1}{P(B)}\,E\Big(\sum_{i=1}^{\infty} a_i\,\mathbf 1_{A_i\cap B}\Big)
= \sum_{i=1}^{\infty} a_i\,\frac{P(A_i\cap B)}{P(B)}
= \sum_{i=1}^{\infty} a_i\,P(A_i/B).
\]
Conditional expectation given a discrete random variable
Let Y = Σ_{i≥1} y_i 1_{A_i}, with A_i = {Y = y_i}. The conditional expectation of X given Y is the random variable defined by
\[
E(X/Y) = \sum_{i=1}^{\infty} E(X/Y=y_i)\,\mathbf 1_{A_i}. \tag{10.2}
\]
Notice that knowing Y means knowing all the events that can be described in terms of Y. Since Y is discrete, these can be described in terms of the basic events {Y = yi}. This may explain the formula (10.2).
The following properties hold:

(a) E(E(X/Y)) = E(X);

(b) if the random variables X and Y are independent, then E(X/Y) = E(X).

For the proof of (a), we notice that, since E(X/Y) is a discrete random variable,
\[
E\big(E(X/Y)\big) = \sum_{i=1}^{\infty} E(X/Y=y_i)\,P(Y=y_i)
= E\Big(X\sum_{i=1}^{\infty}\mathbf 1_{\{Y=y_i\}}\Big) = E(X).
\]
Let us now prove (b). The independence of X and Y yields
\[
E(X/Y) = \sum_{i=1}^{\infty}\frac{E\big(X\,\mathbf 1_{\{Y=y_i\}}\big)}{P(Y=y_i)}\,\mathbf 1_{A_i}
= \sum_{i=1}^{\infty} E(X)\,\mathbf 1_{A_i} = E(X).
\]

In the sequel, we shall denote by σ(Y) the σ-field generated by a random variable Y. It consists of the sets of the form {ω : Y(ω) ∈ B}, where B is a Borel set of R.
The next proposition states two properties of the conditional expectation that motivate Definition 10.1.

Proposition 10.1 1. The random variable Z := E(X/Y) is σ(Y)-measurable; that is, for any Borel set B ∈ B, Z^{-1}(B) ∈ σ(Y);

2. for any A ∈ σ(Y), E(1_A E(X/Y)) = E(1_A X).

Proof: Set c_i = E(X/{Y = y_i}) and let B ∈ B. Then
\[
Z^{-1}(B) = \bigcup_{i:\,c_i\in B}\{\omega: Z(\omega)=c_i\}
= \bigcup_{i:\,c_i\in B}\{\omega: Y(\omega)=y_i\} \in \sigma(Y),
\]
proving the first property.
To prove the second one, it suffices to take A = {Y = y_k}. In this case,
\[
E\big(\mathbf 1_{\{Y=y_k\}}E(X/Y)\big) = E\big(\mathbf 1_{\{Y=y_k\}}E(X/Y=y_k)\big)
= E\Big(\mathbf 1_{\{Y=y_k\}}\frac{E\big(X\,\mathbf 1_{\{Y=y_k\}}\big)}{P(Y=y_k)}\Big)
= E\big(X\,\mathbf 1_{\{Y=y_k\}}\big).
\]

Conditional expectation given a σ-field

Definition 10.1 The conditional expectation of X given G is a random
variable Z satisfying the properties
1. Z is G-measurable; that is, for any Borel set B ∈ B, Z^{-1}(B) ∈ G;
2. for any G ∈ G,
       E(Z 1_G) = E(X 1_G).

We will denote the conditional expectation Z by E(X/G).


Notice that the conditional expectation is not a number but a random vari-
able. There is nothing strange in this, since conditioning depends on the
observations.
Condition (1) tells us that events that can be described by means of E(X/G)
are in G, whereas condition (2) tells us that on events in G the random
variables X and E(X/G) have the same mean value.

The existence of E(X/G) is not a trivial issue. You should trust mathe-
maticians and believe that there is a theorem in measure theory (the Radon-
Nikodym Theorem) which ensures the existence and uniqueness of such a
random variable (up to a set of probability zero).
Before stating properties of the conditional expectation, we are going to
explain how to compute it in two particular situations.

Example 10.1 Let G be the σ-field (actually, the field) generated by a finite
partition G_1, . . . , G_m. Then

    E(X/G) = ∑_{j=1}^m (E(X 1_{G_j})/P(G_j)) 1_{G_j}.                   (10.3)

Formula (10.3) can be checked using Definition 10.1. It tells us that, on
each generator G_j of G, the conditional expectation is constant; this constant
is the mean of X over G_j, that is, E(X 1_{G_j}) normalized by the mass P(G_j).
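A Monte Carlo sketch of (10.3) on an assumed partition of [0, 1] into three cells (illustrative choices, not from the text); it also checks property 2 of Definition 10.1 with A equal to the first cell:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
U = rng.uniform(0.0, 1.0, n)          # omega drawn uniformly from [0, 1]
X = U ** 2                            # an integrable random variable

# Partition [0, 1] into G_0 = [0, 1/3), G_1 = [1/3, 2/3), G_2 = [2/3, 1].
j = np.minimum((3 * U).astype(int), 2)

# (10.3): on each cell the conditional expectation equals E(X 1_{G_j}) / P(G_j),
# i.e. the (empirical) mean of X over that cell.
EXG = np.zeros(n)
for k in range(3):
    EXG[j == k] = X[j == k].mean()

# Property 2 of Definition 10.1 with A = G_0: E(1_A E(X/G)) = E(1_A X).
A = (j == 0)
assert abs((EXG * A).mean() - (X * A).mean()) < 1e-9
```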

Example 10.2 Let G be the σ-field generated by random variables
Y_1, . . . , Y_m, that is, the σ-field generated by events of the form
Y_1^{-1}(B_1), . . . , Y_m^{-1}(B_m), with B_1, . . . , B_m arbitrary Borel sets. Assume
in addition that the joint distribution of the random vector (X, Y_1, . . . , Y_m)
has a density f. Then

    E(X/Y_1, . . . , Y_m) = ∫_{-∞}^{∞} x f(x/Y_1, . . . , Y_m) dx,      (10.4)

with

    f(x/y_1, . . . , y_m) = f(x, y_1, . . . , y_m) / ∫_{-∞}^{∞} f(x, y_1, . . . , y_m) dx.   (10.5)

In (10.5), we recognize the conditional density of X given Y_1 = y_1, . . . ,
Y_m = y_m. Hence, in (10.4) we first compute the conditional expectation
E(X/Y_1 = y_1, . . . , Y_m = y_m) and finally replace the real values y_1, . . . , y_m
by the random variables Y_1, . . . , Y_m.
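As a concrete instance of (10.4)–(10.5), take m = 1 and (X, Y_1) standard bivariate normal with correlation ρ, for which the exact answer is E(X/Y_1 = y) = ρy; the sketch below (assumed parameter values) recovers this by discretizing the two integrals:

```python
import numpy as np

rho = 0.6
y0 = 1.3                                   # condition on Y_1 = y0

# Joint density of a standard bivariate normal with correlation rho.
def f(x, y):
    q = (x * x - 2 * rho * x * y + y * y) / (1 - rho * rho)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(1 - rho * rho))

# (10.5): conditional density f(x / y0) = f(x, y0) / integral of f(., y0) ...
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
fx = f(x, y0)
cond_density = fx / (fx.sum() * dx)

# ... then (10.4): E(X / Y_1 = y0) = integral of x f(x / y0) dx.
cond_mean = (x * cond_density).sum() * dx

# Gaussian theory gives exactly rho * y0.
assert abs(cond_mean - rho * y0) < 1e-6
```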
We now list some important properties of the conditional expectation.
(a) Linearity: for any random variables X, Y and real numbers a, b

E(aX + bY /G) = aE(X/G) + bE(Y /G).

(b) Monotony: If X ≤ Y then E(X/G) ≤ E(Y /G).

(c) The mean value of a random variable is the same as that of its conditional
expectation: E(E(X/G)) = E(X).

(d) If X is a G-measurable random variable, then E(X/G) = X.

(e) Let X be independent of G, meaning that any set of the form X −1 (B),
B ∈ B is independent of G. Then E(X/G) = E(X).

(f) Factorization: If Y is a bounded, G-measurable random variable,

E(Y X/G) = Y E(X/G).

(g) If Gi , i = 1, 2 are σ-fields with G1 ⊂ G2 ,

E(E(X/G1 )/G2 ) = E(E(X/G2 )/G1 ) = E(X/G1 ).

(h) Assume that X is a random variable independent of G and that Z is a
    G-measurable random variable. For any measurable function h(x, z)
    such that the random variable h(X, Z) is in L1(Ω),

        E(h(X, Z)/G) = E(h(X, z))|_{z=Z}.
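On a finite sample space the conditional expectation given a σ-field reduces to averaging over atoms, so several of these properties can be checked exactly; the sketch below (a made-up three-coin example) verifies (c), (f) and the tower property (g) with G_1 = σ(Y_1) contained in G_2 = σ(Y_1, Y_2):

```python
import itertools

omegas = list(itertools.product([0, 1], repeat=3))   # three fair coin flips
p = 1.0 / len(omegas)

def X(w): return w[0] + w[1] + w[2]                  # number of heads

# E(./G) when G is generated by the map `key`: average over each atom of key.
def cond_exp(Z, key):
    atoms = {}
    for w in omegas:
        atoms.setdefault(key(w), []).append(w)
    avg = {k: sum(Z(w) for w in ws) / len(ws) for k, ws in atoms.items()}
    return lambda w: avg[key(w)]

def E(Z): return sum(Z(w) * p for w in omegas)

G1 = lambda w: w[0]            # G1 = sigma(Y1)
G2 = lambda w: (w[0], w[1])    # G2 = sigma(Y1, Y2); G1 is contained in G2

# (c): the conditional expectation has the same mean as X.
assert E(cond_exp(X, G1)) == E(X)

# (f): factorization with the bounded G1-measurable Y = Y1.
lhs = cond_exp(lambda w: w[0] * X(w), G1)
rhs = cond_exp(X, G1)
assert all(lhs(w) == w[0] * rhs(w) for w in omegas)

# (g): tower property, E(E(X/G2)/G1) = E(X/G1).
assert all(cond_exp(cond_exp(X, G2), G1)(w) == rhs(w) for w in omegas)
```

Averaging over atoms is legitimate here because the uniform measure gives every atom of each σ-field equal mass, so the average is exactly E(X 1_atom)/P(atom).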

We give some proofs.

Property (a) follows from the definition of the conditional expectation and
the linearity of the operator E. Indeed, the candidate aE(X/G) + bE(Y/G)
is G-measurable. By property 2 of the conditional expectation and the
linearity of E,

    E(1_G [aE(X/G) + bE(Y/G)]) = aE(1_G X) + bE(1_G Y)
                               = E(1_G [aX + bY]).

Property (b) is a consequence of the monotonicity property of the operator
E and a result in measure theory stating that, for G-measurable random
variables Z_1 and Z_2 satisfying

    E(Z_1 1_G) ≤ E(Z_2 1_G)

for any G ∈ G, we have Z_1 ≤ Z_2 a.s. Indeed, under the standing assumptions,
for any G ∈ G,

    E(1_G X) ≤ E(1_G Y).

Then, property 2 of the conditional expectation yields

    E(1_G E(X/G)) = E(1_G X) ≤ E(1_G Y) = E(1_G E(Y/G)).

Applying the above-mentioned result to Z_1 = E(X/G) and Z_2 = E(Y/G)
finishes the proof.
Taking G = Ω in condition (2) above, we prove (c). Property (d) is obvious.
For (e), note that constant random variables are measurable with respect to
any σ-field; therefore E(X) is G-measurable. Assuming that X is independent
of G yields

    E(X 1_G) = E(X) E(1_G) = E(E(X) 1_G).

This proves (e).


For the proof of (f), we first consider the case Y = 1_G̃, with G̃ ∈ G. Claiming
(f) means that we propose as candidate E(Y X/G) = 1_G̃ E(X/G). Clearly
1_G̃ E(X/G) is G-measurable. Moreover,

    E(1_G 1_G̃ E(X/G)) = E(1_{G∩G̃} E(X/G)) = E(1_{G∩G̃} X).

The validity of the property extends by linearity to simple random variables;
then, by monotone convergence, to positive random variables; and, finally,
to random variables in L1(Ω), by the usual decomposition X = X^+ − X^−.
For the proof of (g), we notice that since E(X/G1 ) is G1 -measurable, it is
G2 -measurable as well. Then, by the very definition of the conditional expec-
tation,
E(E(X/G1 )/G2 ) = E(X/G1 ).
Next, we prove that

E(X/G1 ) = E(E(X/G2 )/G1 ).

For this, we fix G ∈ G1 and we apply the definition of the conditional expec-
tation. This yields

E(1G E(E(X/G2 )/G1 )) = E(1G E(X/G2 )) = E(1G X).

Property (h) is very intuitive: since X is independent of G, it does not enter
the game of conditioning. Moreover, the measurability of Z means that, by
conditioning, one can suppose it is a constant.
Let us give a proof of this property in the particular case h(x, z) =
1_A(x) 1_B(z), where A, B are Borel sets. In this case we have

    E(h(X, Z)/G) = E(1_A(X) 1_B(Z)/G) = 1_B(Z) E(1_A(X)/G)
                 = 1_B(Z) E(1_A(X)).

Moreover,

    E(h(X, z)) = E(1_A(X) 1_B(z)) = 1_B(z) E(1_A(X)).

Therefore

    E(h(X, z))|_{z=Z} = 1_B(Z) E(1_A(X)).

11 Appendix 2: Stopping times


Throughout this section we consider a fixed filtration (see Section 2.4)
(Ft , t ≥ 0).
Definition 11.1 A mapping T : Ω → [0, ∞] is termed a stopping time with
respect to the filtration (Ft , t ≥ 0) if for any t ≥ 0

{T ≤ t} ∈ Ft .

It is easy to see that if S and T are stopping times with respect to the same
filtration then T ∧ S, T ∨ S and T + S are also stopping times.
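The standard example of a stopping time is a first hitting time. In discrete time, with F_n = σ(S_0, . . . , S_n), the sketch below (an illustrative simple random walk, not from the text) checks that {T ≤ n} is determined by the path up to time n, which is exactly the defining property:

```python
import random

random.seed(7)

# One sample path of a simple symmetric random walk, S_0 = 0.
S = [0]
for _ in range(500):
    S.append(S[-1] + random.choice([-1, 1]))

# T = inf{n : S_n >= a}: a first hitting time.
def hitting_time(path, a):
    for n, x in enumerate(path):
        if x >= a:
            return n
    return float("inf")

a = 3
T = hitting_time(S, a)

# {T <= n} coincides with an event expressible through S_0, ..., S_n alone
# (namely {max_{k <= n} S_k >= a}), so T is a stopping time.
for n in range(len(S)):
    assert (T <= n) == (max(S[:n + 1]) >= a)
```

By contrast, the time at which the path attains its overall maximum is not a stopping time: deciding whether it has already occurred requires knowledge of the future of the path.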

Definition 11.2 For a given stopping time T , the σ-field of events prior to
T is the following

FT = {A ∈ F : A ∩ {T ≤ t} ∈ Ft , for all t ≥ 0}.

Let us prove that FT is actually a σ-field. By the definition of a stopping
time, Ω ∈ FT . Assume that A ∈ FT . Then

    Ac ∩ {T ≤ t} = {T ≤ t} ∩ (A ∩ {T ≤ t})c ∈ Ft .

Hence, with any A, FT also contains Ac .


Let now (An , n ≥ 1) ⊂ FT . We clearly have

    (∪_{n=1}^∞ An) ∩ {T ≤ t} = ∪_{n=1}^∞ (An ∩ {T ≤ t}) ∈ Ft .

This completes the proof.

Some properties related with stopping times

1. Any stopping time T is FT -measurable. Indeed, let s ≥ 0, then

{T ≤ s} ∩ {T ≤ t} = {T ≤ s ∧ t} ∈ Fs∧t ⊂ Ft .

2. If {Xt , t ≥ 0} is a process with a.s. continuous sample paths and is
(Ft )-adapted, then XT is FT -measurable. Indeed, the continuity implies

    XT = lim_{n→∞} ∑_{i=0}^∞ X_{i2^{-n}} 1_{i2^{-n} < T ≤ (i+1)2^{-n}} .

Let us now check that, for any s ≥ 0, the random variable Xs 1_{s<T} is
FT -measurable. This fact, along with the property {T ≤ (i + 1)2^{-n}} ∈
FT , shows the result.
Let A ∈ B(R) and t ≥ 0. The set

{Xs ∈ A} ∩ {s < T } ∩ {T ≤ t}

is empty if s ≥ t. Otherwise it is equal to {Xs ∈ A} ∩ {s < T ≤ t},


which belongs to Ft .
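For a single ω, the dyadic approximation above can be traced numerically; the sketch below (with a deterministic continuous function standing in for the sample path, an illustrative choice) shows the approximants converging to X_T:

```python
import math

# Fix one omega: a continuous path t -> X_t and the value T(omega).
X = math.sin
T = 1.234

# The dyadic approximation from the proof: on {i 2^-n < T <= (i+1) 2^-n}
# the n-th approximant takes the value X_{i 2^-n}.
def approx(n):
    i = math.ceil(T * 2 ** n) - 1    # the unique i with i2^-n < T <= (i+1)2^-n
    return X(i * 2.0 ** -n)

# The error is at most the modulus of continuity of the path over a mesh
# of size 2^-n, so it vanishes as n grows.
errors = [abs(approx(n) - X(T)) for n in range(1, 21)]
assert errors[-1] < 1e-5             # X_T recovered by path continuity
```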

