An Introduction To Stochastic Calculus
Marta Sanz-Solé
Facultat de Matemàtiques i Informàtica
Universitat de Barcelona
October 15, 2017
Contents
1 A review of the basics on stochastic processes 4
1.1 The law of a stochastic process . . . . . . . . . . . . . . . . . 4
1.2 Sample paths . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 Itô’s calculus 28
3.1 Itô’s integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 The Itô integral as a stochastic process . . . . . . . . . . . . . 35
3.3 An extension of the Itô integral . . . . . . . . . . . . . . . . . 37
3.4 A change of variables formula: Itô’s formula . . . . . . . . . . 39
3.4.1 One dimensional Itô’s formula . . . . . . . . . . . . . . 40
3.4.2 Multidimensional version of Itô’s formula . . . . . . . . 52
3
1 A review of the basics on stochastic processes
This chapter is devoted to introducing the notion of a stochastic process and some general definitions related to it. For a more complete account of the topic, we refer the reader to [12]. Let us start with a definition.
Definition 1.2 The finite-dimensional joint distributions of the process {X_i, i ∈ I} consist of the multi-dimensional probability laws of any finite family of random vectors X_{i_1}, ..., X_{i_m}, where i_1, ..., i_m ∈ I and m ≥ 1 is arbitrary.
If det Λ_{t_1,...,t_m} > 0, then the law of (X_{t_1}, ..., X_{t_m}) has a density, and this density is given by
\[
f_{t_1,\dots,t_m}(x) = \frac{1}{\big((2\pi)^m \det \Lambda_{t_1,\dots,t_m}\big)^{1/2}}
\exp\Big(-\frac{1}{2}\,(x-\mu_{t_1,\dots,t_m})^t\, \Lambda_{t_1,\dots,t_m}^{-1}\,(x-\mu_{t_1,\dots,t_m})\Big).
\]
In the sequel we shall assume that I ⊂ R_+ and S ⊂ R, either countable or uncountable, and denote by R^I the set of real-valued functions defined on I. A stochastic process {X_t, t ≥ 0} can be viewed as a random vector
\[
X : \Omega \to \mathbb{R}^I.
\]
Putting the appropriate σ-field of events in RI , say B(RI ), one can define,
as for random variables, the law of the process as the mapping
Theorem 1.1 Consider a family
where:
2. if {t_{i_1} < ... < t_{i_m}} ⊂ {t_1 < ... < t_n}, the probability law P_{t_{i_1}...t_{i_m}} is the marginal distribution of P_{t_1...t_n}.
One can apply this theorem to Example 1.1 to show the existence of Gaussian
processes, as follows.
Let K : I × I → R be a symmetric, nonnegative definite function. That
means:
There exists a Gaussian process {Xt , t ≥ 0} such that E(Xt ) = 0 for any
t ∈ I and Cov (Xti , Xtj ) = K(ti , tj ), for any ti , tj ∈ I.
To prove this result, fix t_1, ..., t_n ∈ I and set µ = (0, ..., 0) ∈ R^n, Λ_{t_1...t_n} = (K(t_i, t_j))_{1≤i,j≤n} and
\[
P_{t_1,\dots,t_n} = N(0, \Lambda_{t_1\dots t_n}).
\]
We denote by (X_{t_1}, ..., X_{t_n}) a random vector with law P_{t_1,...,t_n}. For any subset {t_{i_1}, ..., t_{i_m}} of {t_1, ..., t_n}, it holds that
\[
\big(X_{t_{i_1}}, \dots, X_{t_{i_m}}\big)^t = A\,\big(X_{t_1}, \dots, X_{t_n}\big)^t,
\]
with
\[
A = \begin{pmatrix} \delta_{t_1,t_{i_1}} & \cdots & \delta_{t_n,t_{i_1}} \\ \vdots & & \vdots \\ \delta_{t_1,t_{i_m}} & \cdots & \delta_{t_n,t_{i_m}} \end{pmatrix},
\]
where δ_{s,t} denotes the Kronecker delta. By the properties of Gaussian vectors, the random vector (X_{t_{i_1}}, ..., X_{t_{i_m}}) has an m-dimensional normal distribution, with zero mean and covariance matrix A Λ_{t_1...t_n} A^t. By the definition of A, it is trivial to check that A Λ_{t_1...t_n} A^t = Λ_{t_{i_1}...t_{i_m}}. Hence, the assumptions of Theorem 1.1 hold true and the result follows.
Definition 1.3 The sample paths of a stochastic process {Xt , t ∈ I} are the
family of functions indexed by ω ∈ Ω, X(ω) : I → S, defined by X(ω)(t) =
Xt (ω).
S_k ≤ t < S_{k+1}.
The stochastic process {N_t, t ≥ 0} takes values in Z_+. Its sample paths are increasing right-continuous functions, with jumps of size one at the random times S_n, n ≥ 1. It is a particular case of a counting process. Sample paths of counting processes are always increasing right-continuous functions, and their jumps are natural numbers.
\[
E(B_t) = 0, \qquad E(B_s B_t) = s \wedge t.
\]
This defines the finite-dimensional distributions and therefore the existence of the process via Kolmogorov's theorem (see Theorem 1.1).
The model for the law of B_t was given by Einstein in 1905. More precisely, Einstein's definition of Brownian motion is that of a stochastic process with independent and stationary increments such that the law of an increment B_t − B_s, s < t, is Gaussian with zero mean and E(B_t − B_s)^2 = t − s. This definition is equivalent to the one given before.
2 The Brownian motion
2.1 Equivalent definitions of Brownian motion
This chapter is devoted to the study of Brownian motion, the process intro-
duced in Example 1.3 that we recall now.
Definition 2.1 The stochastic process {Bt , t ≥ 0} is a one-dimensional
Brownian motion if it is Gaussian, zero mean and with covariance function
given by E (Bt Bs ) = s ∧ t.
The existence of such a process is ensured by Kolmogorov's theorem. Indeed, it suffices to check that
\[
(s,t) \mapsto \Gamma(s,t) = s \wedge t
\]
is nonnegative definite. That means, for any t_i ≥ 0 and any real numbers a_i, i = 1, ..., m,
\[
\sum_{i,j=1}^m a_i a_j\, \Gamma(t_i, t_j) \ge 0.
\]
But
\[
s \wedge t = \int_0^\infty \mathbf{1}_{[0,s]}(r)\, \mathbf{1}_{[0,t]}(r)\, dr.
\]
Hence,
\[
\sum_{i,j=1}^m a_i a_j\, (t_i \wedge t_j) = \sum_{i,j=1}^m a_i a_j \int_0^\infty \mathbf{1}_{[0,t_i]}(r)\, \mathbf{1}_{[0,t_j]}(r)\, dr
= \int_0^\infty \Big(\sum_{i=1}^m a_i\, \mathbf{1}_{[0,t_i]}(r)\Big)^2 dr \ge 0.
\]
Notice also that, since E(B_0^2) = 0, the random variable B_0 is zero almost surely.
Each random variable B_t, t > 0, of the Brownian motion has a density, namely
\[
p_t(x) = \frac{1}{\sqrt{2\pi t}} \exp\Big(-\frac{x^2}{2t}\Big),
\]
while for t = 0, its "density" is a Dirac mass at zero, δ_0.
Differentiating p_t(x) once with respect to t, and twice with respect to x, easily yields
\[
\frac{\partial}{\partial t} p_t(x) = \frac{1}{2}\, \frac{\partial^2}{\partial x^2} p_t(x), \qquad p_0(x) = \delta_0.
\]
This is the heat equation on R with initial condition p0 (x) = δ{0} . That
means, as time evolves, the density of the random variables of the Brownian
motion behaves like a diffusive physical phenomenon.
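The heat-equation identity above can be checked numerically. The following sketch (Python, not part of the notes; names are mine) compares central finite-difference approximations of ∂_t p and ½ ∂²_x p for the Gaussian density:

```python
import math

def p(t, x):
    # Gaussian density of B_t: p_t(x) = exp(-x^2 / (2t)) / sqrt(2*pi*t)
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

# Compare dp/dt with (1/2) d^2 p/dx^2 on a small grid using central
# finite differences of step h; the two sides should agree closely.
h = 1e-4
for t in [0.5, 1.0, 2.0]:
    for x in [-1.0, 0.0, 0.7]:
        dt = (p(t + h, x) - p(t - h, x)) / (2 * h)
        dxx = (p(t, x + h) - 2 * p(t, x) + p(t, x - h)) / (h * h)
        assert abs(dt - 0.5 * dxx) < 1e-5
```

The agreement degrades only through floating-point cancellation in the second difference; analytically the identity is exact.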
There are equivalent definitions of Brownian motion, such as the one given in the next result.
Proposition 2.1 A stochastic process {Xt , t ≥ 0} is a Brownian motion if
and only if
(i) X_0 = 0, a.s.,
\[
E\big(X_r (X_{s+u} - X_s)\big) = 0,
\]
\[
E(X_t - X_s)^2 = t + s - 2s = t - s.
\]
Remark 2.1 We shall see later that Brownian motion has continuous sample paths. The description of the process given in the preceding proposition tells us that such a process is a model for a random evolution which starts from x = 0 at time t = 0, in which the law of the change over a time increment depends only on the length of the increment (stationary increments), and in which the future evolution of the process is independent of its past (Markov property).
Remark 2.2 The Brownian motion possesses several invariance properties.
Let us mention some of them.
• If B = {Bt , t ≥ 0} is a Brownian motion, so is −B = {−Bt , t ≥ 0}.
• If a > 0, then {a^{-1/2} B_{at}, t ≥ 0} is also a Brownian motion. This means that, zooming in or out, we will observe the same behaviour. This is called the scaling property of Brownian motion.
\[
B_t^{(n)} = \frac{1}{\sigma\sqrt{n}}\, Y_{nt}, \qquad t \ge 0. \tag{2.2}
\]
A famous result in probability theory, Donsker's theorem, tells us that the sequence of processes {B_t^{(n)}, t ≥ 0}, n ≥ 1, converges in law to the Brownian motion. The reference sample space is the set of continuous functions vanishing at zero. Hence, in proving the statement, we obtain continuity of the sample paths of the limit.
Donsker's theorem is the infinite-dimensional version of the above-mentioned central limit theorem. Considering s = k/n, t = (k+1)/n, the increment B_t^{(n)} - B_s^{(n)} = \frac{1}{\sigma\sqrt{n}}\, \xi_{k+1} is a random variable with mean zero and variance t - s. Hence B^{(n)} is not that far from the Brownian motion, and this is what Donsker's theorem proves.
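Donsker's theorem is easy to explore by simulation. A minimal sketch (Python, not from the notes; function names are mine) builds the rescaled walk B^{(n)} of (2.2) from i.i.d. signs ξ_k = ±1 (so σ = 1) and checks that the endpoint B_1^{(n)} is approximately N(0, 1):

```python
import random

def donsker_path(n, T=1.0, seed=None):
    """One sample of the rescaled random walk B^(n) of (2.2) on [0, T],
    built from i.i.d. +-1 steps xi_k (sigma = 1): B^(n)_{k/n} = S_k / sqrt(n)."""
    rng = random.Random(seed)
    path = [0.0]
    s = 0.0
    for _ in range(int(n * T)):
        s += rng.choice((-1.0, 1.0))
        path.append(s / n ** 0.5)
    return path

# By the central limit theorem the endpoint B^(n)_1 should be roughly
# N(0, 1) for large n: check mean ~ 0 and variance ~ 1 over many paths.
n, m = 400, 2000
ends = [donsker_path(n, seed=i)[-1] for i in range(m)]
mean = sum(ends) / m
var = sum(x * x for x in ends) / m
assert abs(mean) < 0.1 and abs(var - 1.0) < 0.1
```

Plotting a few such paths for increasing n gives the usual picture of rough, Brownian-looking trajectories.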
\[
h_0(t) = 1,
\]
\[
h_n^k(t) = 2^{n/2}\, \mathbf{1}_{\big[\frac{2k}{2^{n+1}}, \frac{2k+1}{2^{n+1}}\big[} - 2^{n/2}\, \mathbf{1}_{\big[\frac{2k+1}{2^{n+1}}, \frac{2k+2}{2^{n+1}}\big[},
\]
where the notation ⟨·,·⟩ means the inner product in L^2([0,1], B([0,1]), λ). Using (2.3), we define an isometry between L^2([0,1], B([0,1]), λ) and L^2(Ω, F, P) as follows. Consider a family of independent random variables with law N(0,1), (N_0, N_n^k). Then, for f ∈ L^2([0,1], B([0,1]), λ), set
\[
I(f) = \langle f, h_0\rangle\, N_0 + \sum_{n=1}^{\infty} \sum_{k=0}^{2^n-1} \langle f, h_n^k\rangle\, N_n^k.
\]
Clearly,
\[
E\big(I(f)^2\big) = \|f\|_2^2.
\]
Theorem 2.1 The process B = {B_t = I(\mathbf{1}_{[0,t]}), t ∈ [0,1]} defines a Brownian motion indexed by [0,1]. Moreover, the sample paths are continuous, almost surely.
converges uniformly, a.s. In the last term we have introduced the Schauder functions, defined as follows:
\[
g_0(t) = \langle \mathbf{1}_{[0,t]}, h_0\rangle = t,
\]
\[
g_n^k(t) = \langle \mathbf{1}_{[0,t]}, h_n^k\rangle = \int_0^t h_n^k(s)\, ds.
\]
Thus,
\[
\sup_{t\in[0,1]} \Big|\sum_{k=0}^{2^n-1} g_n^k(t)\, N_n^k\Big| \le 2^{-\frac{n}{2}} \sup_{0\le k\le 2^n-1} |N_n^k|.
\]
The next step consists in proving that |N_n^k| is bounded by some constant depending on n such that, when multiplied by 2^{-n/2}, the series with these terms converges.
For this, we will use a result on large deviations for Gaussian measures along
with the first Borel-Cantelli lemma.
Lemma 2.1 For any random variable X with law N(0,1) and for any a ≥ 1,
\[
P(|X| \ge a) \le e^{-\frac{a^2}{2}}.
\]
Proof: We clearly have
\[
P(|X| \ge a) = \frac{2}{\sqrt{2\pi}} \int_a^\infty e^{-\frac{x^2}{2}}\, dx
\le \frac{2}{\sqrt{2\pi}} \int_a^\infty \frac{x}{a}\, e^{-\frac{x^2}{2}}\, dx
= \frac{2}{a\sqrt{2\pi}}\, e^{-\frac{a^2}{2}} \le e^{-\frac{a^2}{2}},
\]
where we have used that 1 ≤ x/a and \frac{2}{a\sqrt{2\pi}} ≤ 1.
We now move to the Borel–Cantelli-based argument. By the preceding lemma,
\[
P\Big(\sup_{0\le k\le 2^n-1} |N_n^k| > 2^{\frac{n}{4}}\Big) \le \sum_{k=0}^{2^n-1} P\big(|N_n^k| > 2^{\frac{n}{4}}\big) \le 2^n \exp\big(-2^{\frac{n}{2}-1}\big).
\]
n
It follows that
\[
\sum_{n=1}^{\infty} P\Big(\sup_{0\le k\le 2^n-1} |N_n^k| > 2^{\frac{n}{4}}\Big) < +\infty.
\]
That is, a.s. there exists n_0, which may depend on ω, such that
\[
\sup_{0\le k\le 2^n-1} |N_n^k| \le 2^{\frac{n}{4}}
\]
for n big enough, which proves the a.s. uniform convergence of the series (2.5).
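The Haar–Schauder construction can be explored numerically. The sketch below (Python, not part of the notes; names are mine) computes the Schauder functions g_n^k and checks Parseval's identity Var(B_t) = t² + Σ_{n,k} g_n^k(t)² → t for the truncated series. One assumption: the sum is taken over levels n = 0, 1, ..., where n = 0 contributes the single Haar wavelet on [0, 1], so that the system together with h_0 is complete.

```python
def schauder(n, k, t):
    # g_n^k(t) = int_0^t h_n^k(s) ds: a triangular bump of height
    # 2^(-n/2 - 1) supported on [2k / 2^(n+1), (2k + 2) / 2^(n+1)]
    a = 2 * k / 2 ** (n + 1)
    mid = (2 * k + 1) / 2 ** (n + 1)
    b = (2 * k + 2) / 2 ** (n + 1)
    if t <= a or t >= b:
        return 0.0
    height = 2.0 ** (n / 2.0)
    return height * (t - a) if t <= mid else height * (b - t)

def truncated_variance(t, levels):
    # Parseval: Var(B_t) = <1_[0,t], h_0>^2 + sum_n sum_k <1_[0,t], h_n^k>^2
    #                    = t^2 + sum_n sum_k g_n^k(t)^2, which converges to t
    v = t * t
    for n in range(levels + 1):
        for k in range(2 ** n):
            v += schauder(n, k, t) ** 2
    return v

assert abs(truncated_variance(0.5, 12) - 0.5) < 1e-9  # exact at dyadic points
assert abs(truncated_variance(0.3, 12) - 0.3) < 1e-3  # small tail beyond level 12
```

Multiplying t and each g_n^k(t) by independent N(0,1) coefficients and summing, as in the definition of I(f), produces one continuous sample path of B on [0, 1].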
Next we discuss how, from Theorem 2.1, we can get a Brownian motion indexed by R_+. To this end, let us consider a sequence B^k, k ≥ 1, consisting of independent Brownian motions indexed by [0,1]. That means, for each k ≥ 1, B^k = {B_t^k, t ∈ [0,1]} is a Brownian motion, and for different values of k they are independent. Then we define a Brownian motion recursively as follows. Let k ≥ 1; for t ∈ [k, k+1] set
\[
B_t = B_1^1 + B_1^2 + \cdots + B_1^k + B_{t-k}^{k+1}.
\]
Then, almost surely, the sample paths of the process are γ-Hölder continuous for every γ < β/α.
Nowhere differentiability
We shall prove that the exponent γ = 1/2 above is sharp. As a consequence, we will obtain a celebrated result by Dvoretzky, Erdős and Kakutani telling us that a.s. the sample paths of Brownian motion are not differentiable. We gather these results in the next theorem.
Theorem 2.2 Fix any γ ∈ (1/2, 1]; then a.s. the sample paths of {B_t, t ≥ 0} are nowhere Hölder continuous with exponent γ.
Proof: Let γ ∈ (1/2, 1] and assume that a sample path t → B_t(ω) is γ-Hölder continuous at s ∈ [0, 1). Then
\[
|B_t(\omega) - B_s(\omega)| \le C|t-s|^{\gamma},
\]
for any t ∈ [0,1] and some constant C > 0.
Let n be big enough and let i = [ns] + 1; by the triangle inequality,
\[
\big|B_{\frac{j}{n}}(\omega) - B_{\frac{j+1}{n}}(\omega)\big| \le \big|B_s(\omega) - B_{\frac{j}{n}}(\omega)\big| + \big|B_s(\omega) - B_{\frac{j+1}{n}}(\omega)\big|
\le C\Big(\Big|s - \frac{j}{n}\Big|^{\gamma} + \Big|s - \frac{j+1}{n}\Big|^{\gamma}\Big).
\]
Hence, by restricting j = i, i+1, ..., i+N−1, we obtain
\[
\big|B_{\frac{j}{n}}(\omega) - B_{\frac{j+1}{n}}(\omega)\big| \le C\Big(\frac{N}{n}\Big)^{\gamma} = \frac{M}{n^{\gamma}}.
\]
Define
\[
A_{M,n}^i = \Big\{\big|B_{\frac{j}{n}} - B_{\frac{j+1}{n}}\big| \le \frac{M}{n^{\gamma}},\ j = i, i+1, \dots, i+N-1\Big\}.
\]
We have seen that the set of trajectories where t → B_t(ω) is γ-Hölder continuous at s is included in
\[
\bigcup_{M=1}^{\infty} \bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} \bigcup_{i=1}^{n} A_{M,n}^i.
\]
where we have used that the random variables B_{\frac{j}{n}} - B_{\frac{j+1}{n}} are N\big(0, \frac{1}{n}\big) and independent. But
\[
P\Big(\big|B_{\frac{1}{n}}\big| \le \frac{M}{n^{\gamma}}\Big) = \sqrt{\frac{n}{2\pi}} \int_{-Mn^{-\gamma}}^{Mn^{-\gamma}} e^{-\frac{nx^2}{2}}\, dx
= \frac{1}{\sqrt{2\pi}} \int_{-Mn^{\frac{1}{2}-\gamma}}^{Mn^{\frac{1}{2}-\gamma}} e^{-\frac{x^2}{2}}\, dx \le C n^{\frac{1}{2}-\gamma}.
\]
Hence, by taking N such that N\big(\gamma - \frac{1}{2}\big) > 1,
\[
P\Big(\bigcap_{n=k}^{\infty} \bigcup_{i=1}^{n} A_{M,n}^i\Big) \le \liminf_{n\to\infty}\, n\,\big(Cn^{\frac{1}{2}-\gamma}\big)^N = 0.
\]
which tends to ∞ as h → 0.
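The blow-up of difference quotients can be illustrated numerically: since B_{t+h} − B_t is N(0, h), the mean absolute difference quotient is E|B_{t+h} − B_t|/h = √(2/(πh)), which diverges as h → 0. A Monte Carlo sketch (Python, not from the notes; names are mine):

```python
import math, random

rng = random.Random(0)

def mean_abs_quotient(h, m=20000):
    """Monte Carlo estimate of E |B_{t+h} - B_t| / h, using that the
    increment is N(0, h); the exact value is sqrt(2 / (pi * h))."""
    return sum(abs(rng.gauss(0.0, math.sqrt(h))) for _ in range(m)) / (m * h)

for h in [0.1, 0.01, 0.001]:
    est, exact = mean_abs_quotient(h), math.sqrt(2 / (math.pi * h))
    assert abs(est / exact - 1) < 0.05  # grows like h^(-1/2)
```

The estimate roughly triples each time h shrinks by a factor of 10, consistent with the h^{−1/2} rate and hence with nowhere differentiability.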
Quadratic variation
The notion of quadratic variation provides a measure of the roughness of a function. The existence of variations of different orders is also important in approximation procedures via Taylor expansions, and in the development of infinitesimal calculus. We will study here the existence of the quadratic variation, i.e. the variation of order two, of the Brownian motion. As shall be discussed in more detail in the next chapter, this provides an explanation of the fact that the rules of Itô's stochastic calculus are different from those of classical deterministic differential calculus.
Fix a finite interval [0, T] and consider the sequence of partitions given by the points Π_n = (t_0^n = 0 ≤ t_1^n ≤ ... ≤ t_{r_n}^n = T), n ≥ 1. We assume that
\[
\lim_{n\to\infty} |\Pi_n| = 0,
\]
Proof: For the sake of simplicity, we shall omit the dependence on n. Set Δ_k t = t_k − t_{k−1}. Notice that the random variables (Δ_k B)^2 − Δ_k t, k = 1, ..., r_n, are independent and centered. Thus,
\[
E\Big(\sum_{k=1}^{r_n} \big((\Delta_k B)^2 - \Delta_k t\big)\Big)^2 = \sum_{k=1}^{r_n} E\big((\Delta_k B)^2 - \Delta_k t\big)^2 = 2\sum_{k=1}^{r_n} (\Delta_k t)^2 \le 2\,|\Pi_n|\,T,
\]
which clearly tends to zero as n tends to infinity.
This proposition, together with the continuity of the sample paths of Brownian motion, yields
\[
\sup_n \sum_{k=1}^{r_n} |\Delta_k B| = \infty, \quad \text{a.s.}
\]
\[
\sum_{n\ge 1} P\Big\{\Big|\sum_{k=1}^{r_n} (\Delta_k B)^2 - T\Big| > \lambda\Big\} \le C \sum_{n\ge 1} |\Pi_n|^{\gamma} < \infty.
\]
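The convergence of Σ_k (Δ_k B)² to T is easy to observe by simulation. A minimal Monte Carlo sketch (Python, not part of the notes; names are mine):

```python
import math, random

rng = random.Random(42)

def quadratic_variation(T=1.0, n=2 ** 14):
    """Sample a Brownian path on a uniform partition of [0, T] with n steps
    and return sum_k (Delta_k B)^2, which should be close to T."""
    dt = T / n
    return sum(rng.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n))

qv = quadratic_variation()
assert abs(qv - 1.0) < 0.05  # variance of the sum is 2 T^2 / n, tiny here
```

Refining the partition (larger n) makes the sum concentrate ever more tightly around T, while the sum of |Δ_k B| blows up like √n.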
20
The notion of quadratic variation presented before does not coincide with the usual notion of quadratic variation for real functions. In the latter case, there is no restriction on the partitions. Actually, for Brownian motion the following result holds:
\[
\sup_{\Pi} \sum_{t_k \in \Pi} (\Delta_k B)^2 = +\infty, \quad \text{a.s.},
\]
where the supremum is over the set of all partitions of the interval [0, T].
2. For any 0 ≤ s ≤ t, Fs ⊂ Ft .
If in addition
∩s>t Fs = Ft ,
for any t ≥ 0, the filtration is said to be right-continuous.
Ft = σ(Xs , 0 ≤ s ≤ t), t ≥ 0.
To ensure that the above property (1) for a filtration holds, one needs to complete the σ-field. In general, there is no reason to expect right-continuity.
However, for the Brownian motion, the natural filtration possesses this prop-
erty.
A stochastic process X with X_0 constant, constant mean, and independent increments possesses the martingale property with respect to its natural filtration. Indeed, for 0 ≤ s ≤ t,
\[
E(X_t - X_s / F_s) = E(X_t - X_s) = 0.
\]
Hence, a Brownian motion possesses the martingale property with respect to its natural filtration.
Other examples of martingales with respect to the same filtration, related to the Brownian motion, are
1. \{B_t^2 - t,\ t \ge 0\},
2. \{\exp\big(aB_t - \frac{a^2 t}{2}\big),\ t \ge 0\}.
Indeed, for the first example, let us consider 0 ≤ s ≤ t. Then,
\[
E\big(B_t^2 / F_s\big) = E\big((B_t - B_s + B_s)^2 / F_s\big)
= E\big((B_t - B_s)^2 / F_s\big) + 2B_s\, E\big(B_t - B_s / F_s\big) + E\big(B_s^2 / F_s\big) = (t-s) + B_s^2.
\]
Consequently,
\[
E\big(B_t^2 - B_s^2 / F_s\big) = t - s.
\]
For the second example, we also use the property of independent increments, as follows:
\[
E\Big(\exp\Big(aB_t - \frac{a^2 t}{2}\Big) \Big/ F_s\Big) = \exp(aB_s)\, E\Big(\exp\Big(a(B_t - B_s) - \frac{a^2 t}{2}\Big) \Big/ F_s\Big)
= \exp(aB_s)\, E\Big(\exp\Big(a(B_t - B_s) - \frac{a^2 t}{2}\Big)\Big).
\]
Using the expression of the density of the random variable B_t − B_s, we write
\[
E\Big(\exp\Big(a(B_t - B_s) - \frac{a^2 t}{2}\Big)\Big) = \frac{1}{\sqrt{2\pi(t-s)}} \int_{\mathbb{R}} \exp\Big(ax - \frac{a^2 t}{2} - \frac{x^2}{2(t-s)}\Big)\, dx
= \exp\Big(\frac{a^2(t-s)}{2} - \frac{a^2 t}{2}\Big) = \exp\Big(-\frac{a^2 s}{2}\Big),
\]
where the before-last equality is obtained by completing the square in the exponent. Hence,
\[
E\Big(\exp\Big(aB_t - \frac{a^2 t}{2}\Big) \Big/ F_s\Big) = \exp\Big(aB_s - \frac{a^2 s}{2}\Big).
\]
Remark. The computation of the expression
\[
E\Big(\exp\Big(a(B_t - B_s) - \frac{a^2 t}{2}\Big)\Big)
\]
could have been avoided by using the following property: if Z is a random variable with distribution N(0, σ^2), then it has exponential moments, and
\[
E\big(\exp(Z)\big) = \exp\Big(\frac{\sigma^2}{2}\Big).
\]
This property is proved by computations analogous to those above.
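The exponential-moment formula can be checked by direct numerical integration against the Gaussian density. A sketch (Python, not part of the notes; names are mine):

```python
import math

def exp_moment(sigma, half_width=12.0, steps=200000):
    """Numerically integrate E[exp(Z)] for Z ~ N(0, sigma^2) with the
    trapezoid rule and compare with the closed form exp(sigma^2 / 2)."""
    a, b = -half_width * sigma, half_width * sigma
    h = (b - a) / steps
    s = 0.0
    for i in range(steps + 1):
        x = a + i * h
        w = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        s += w * math.exp(x) * math.exp(-x * x / (2 * sigma ** 2))
    return s * h / math.sqrt(2 * math.pi * sigma ** 2)

for sigma in (0.5, 1.0, 2.0):
    assert abs(exp_moment(sigma) - math.exp(sigma ** 2 / 2)) < 1e-5
```

The integration window of 12 standard deviations makes the truncated tail negligible compared with the quadrature tolerance.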
\[
p(s, t, x, A) = \frac{1}{(2\pi(t-s))^{1/2}} \int_A \exp\Big(-\frac{|x-y|^2}{2(t-s)}\Big)\, dy, \tag{2.8}
\]
which means that, conditionally to the past of the Brownian motion until
time s, the law of Bt at a future time t only depends on Bs .
Let f : R → R be a bounded measurable function. Then, since B_s is F_s-measurable and B_t − B_s is independent of F_s, we obtain
The random variable x + B_t - B_s is N(x, t-s). Thus,
\[
E\big(f(x + B_t - B_s)\big) = \int_{\mathbb{R}} f(y)\, p(s,t,x,dy),
\]
and consequently,
\[
E\big(f(B_t) / F_s\big) = \int_{\mathbb{R}} f(y)\, p(s,t,B_s,dy).
\]
We recall that the sum of two independent normal random variables is again normal, with mean the sum of the respective means and variance the sum of the respective variances. This is expressed in mathematical terms by the fact that
\[
\big[f_{N(x,\sigma_1)} * f_{N(y,\sigma_2)}\big](z) = \int_{\mathbb{R}} f_{N(x,\sigma_1)}(y')\, f_{N(y,\sigma_2)}(z-y')\, dy' = f_{N(x+y,\sigma_1+\sigma_2)}(z),
\]
where the first expression denotes the convolution of the two densities. Using this fact along with Fubini's theorem, we obtain
\[
\int_{\mathbb{R}} p(u,t,y,A)\, p(s,u,x,dy) = \int_{\mathbb{R}} p(s,u,x,dy) \int_A p(u,t,y,dz)
= \int_A dz \int_{\mathbb{R}} dy\, f_{N(x,u-s)}(y)\, f_{N(0,t-u)}(z-y)
\]
\[
= \int_A dz\, \big[f_{N(x,u-s)} * f_{N(0,t-u)}\big](z)
= \int_A dz\, f_{N(x,t-s)}(z) = p(s,t,x,A),
\]
proving (2.10).
This equation is the time-continuous analogue of the property enjoyed by the transition probability matrices of a homogeneous Markov chain. That is,
\[
\Pi^{(m+n)} = \Pi^{(m)}\, \Pi^{(n)},
\]
meaning that evolutions in m + n steps are done by concatenating m-step and n-step evolutions. In (2.10), m + n is replaced by the real time t − s, m by t − u, and n by u − s, respectively.
We are now ready to give the definition of a Markov process.
Consider a mapping
p : R+ × R+ × R × B(R) → R+ ,
satisfying the properties
(i) for any fixed s, t ∈ R+ , A ∈ B(R),
x → p(s, t, x, A)
is B(R)–measurable,
(ii) for any fixed s, t ∈ R+ , x ∈ R,
A → p(s, t, x, A)
is a probability,
(iii) Equation (2.10) holds.
Such a function p is termed a Markovian transition function. Let us also fix
a probability µ on B(R).
Definition 2.3 A real valued stochastic process {Xt , t ∈ R+ } is a Markov
process with initial law µ and transition probability function p if
(a) the law of X0 is µ,
(b) for any 0 ≤ s ≤ t,
P {Xt ∈ A/Fs } = p(s, t, Xs , A).
Theorem 2.4 Let T be a stopping time. Then, conditionally on {T < ∞}, the process defined by
\[
B_t^T = B_{T+t} - B_T, \quad t \ge 0,
\]
Proof: Assume that T < ∞ a.s. We shall prove that for any A ∈ F_T, any choice of parameters 0 ≤ t_1 < ... < t_p, and any continuous and bounded function f on R^p, we have
\[
E\Big[\mathbf{1}_A\, f\big(B_{t_1}^T, \dots, B_{t_p}^T\big)\Big] = P(A)\, E\Big[f\big(B_{t_1}, \dots, B_{t_p}\big)\Big]. \tag{2.11}
\]
This suffices to prove all the assertions of the theorem. Indeed, by taking A =
Ω, we see that the finite dimensional distributions of B and B T coincide. On
the other hand, (2.11) states the independence of 1A and the random vector
(BtT1 , · · · , BtTp ). By a monotone class argument, we get the independence of
1A and B T .
The continuity of the sample paths of B implies, a.s.,
\[
f\big(B_{t_1}^T, \dots, B_{t_p}^T\big) = f\big(B_{T+t_1} - B_T, \dots, B_{T+t_p} - B_T\big)
= \lim_{n\to\infty} \sum_{k=1}^{\infty} \mathbf{1}_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, f\big(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \dots, B_{k2^{-n}+t_p} - B_{k2^{-n}}\big).
\]
An interesting consequence of the preceding property is given in the next
proposition.
Proposition 2.4 For any t > 0, set S_t = \sup_{s\le t} B_s. Then, for any a ≥ 0 and b ≤ a,
\[
P\{S_t \ge a,\ B_t \le b\} = P\{B_t \ge 2a - b\}. \tag{2.12}
\]
As a consequence, the probability laws of S_t and |B_t| are the same.
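The reflection-principle consequence, P(S_t ≥ a) = P(|B_t| ≥ a), can be checked by simulation. A Monte Carlo sketch (Python, not from the notes; names are mine; note that a discretized path slightly underestimates the true supremum):

```python
import math, random

rng = random.Random(7)

def one_path_max_and_end(t=1.0, n=1000):
    """Simulate a Brownian path on [0, t]; return (sup_s B_s, B_t)."""
    dt = t / n
    b = m = 0.0
    for _ in range(n):
        b += rng.gauss(0.0, math.sqrt(dt))
        m = max(m, b)
    return m, b

a, trials = 1.0, 2000
hits_max = hits_abs = 0
for _ in range(trials):
    m, b = one_path_max_and_end()
    hits_max += m >= a
    hits_abs += abs(b) >= a
# P(S_1 >= 1) = P(|B_1| >= 1) = erfc(1/sqrt(2)) ~ 0.317
assert abs(hits_abs / trials - math.erfc(1 / math.sqrt(2))) < 0.05
assert abs(hits_max - hits_abs) / trials < 0.07
```

Increasing the number of time steps per path shrinks the discretization bias in the empirical frequency of {S_t ≥ a}.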
T_a = \inf\{t \ge 0,\ B_t = a\},
3 Itô's calculus
Itô's calculus was developed in the 1950s by Kiyoshi Itô in an attempt to give rigorous meaning to some differential equations driven by the Brownian motion which appear in the study of problems related to continuous-time Markov processes. Roughly speaking, one could say that Itô's calculus is an analogue of the classical Newton and Leibniz calculus for stochastic processes. In fact, in classical mathematical analysis there are several extensions of the Riemann integral ∫ f(x)dx. For example, if g is an increasing bounded function (or the difference of two such functions), the Lebesgue–Stieltjes integral gives a precise meaning to the integral ∫ f(x)g(dx), for some set of functions f. However, before Itô's development, no theory allowing nowhere differentiable integrators g was known. Brownian motion, introduced in the preceding chapter, is an example of a stochastic process whose sample paths, although continuous, are nowhere differentiable. Therefore, the Lebesgue–Stieltjes integral does not apply to the sample paths of Brownian motion.
There are many motivations, coming from a variety of disciplines, to consider stochastic differential equations driven by a Brownian motion. Such an object is defined as
\[
dX_t = \sigma(t, X_t)\, dB_t + b(t, X_t)\, dt,
\]
or, in integral form,
\[
X_t = x_0 + \int_0^t \sigma(s, X_s)\, dB_s + \int_0^t b(s, X_s)\, ds. \tag{3.1}
\]
Notice that these two properties are satisfied if (F_t, t ≥ 0) is the natural filtration associated to B.
We fix a finite time horizon T and define L^2_{a,T} as the set of stochastic processes u = {u_t, t ∈ [0,T]} satisfying the following conditions:
(i) u is adapted and jointly measurable in (t, ω) with respect to the product σ-field B([0,T]) ⊗ F.
(ii) \int_0^T E(u_t^2)\, dt < \infty.
This is a Hilbert space with the norm
\[
\|u\|_{L^2_{a,T}} = \Big[\int_0^T E(u_t^2)\, dt\Big]^{1/2},
\]
which coincides with the natural norm on the Hilbert space L^2(Ω × [0,T], F ⊗ B([0,T]), dP × dλ) (here λ stands for the Lebesgue measure on R).
The notation L^2_{a,T} evokes the two properties, adaptedness and square integrability, described before.
Consider first the subset of L^2_{a,T} consisting of step processes, that is, stochastic processes which can be written as
\[
u_t = \sum_{j=1}^n u_j\, \mathbf{1}_{[t_{j-1}, t_j[}(t), \tag{3.2}
\]
which we may compare with the Lebesgue integral of simple functions. Notice that \int_0^T u_t\, dB_t is a random variable. Of course, we would like to be able to consider more general integrands than step processes. Therefore, we must try to extend the definition (3.3). For this, we use tools from functional analysis based upon a very natural idea: if we are able to prove that (3.3) defines a continuous functional between two metric spaces, then the stochastic integral, defined for the very particular class of step stochastic processes, can be extended to a more general class, given by the closure of this set with respect to a suitable norm. This is possible by one of the consequences of the Hahn–Banach theorem.
The idea of continuity is made precise by the isometry property:
\[
E\Big(\int_0^T u_t\, dB_t\Big)^2 = E\Big(\int_0^T u_t^2\, dt\Big). \tag{3.4}
\]
For the second term, we notice that for fixed j and k, j < k, the random variables u_j u_k Δ_j B are independent of Δ_k B. Therefore,
2. Linearity: If u^1, u^2 are two step processes and a, b ∈ R, then clearly au^1 + bu^2 is also a step process and
\[
\int_0^T (au^1 + bu^2)(t)\, dB_t = a\int_0^T u^1(t)\, dB_t + b\int_0^T u^2(t)\, dB_t.
\]
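For step processes both the integral and the isometry (3.4) are completely explicit, so they can be checked by Monte Carlo. A sketch (Python, not part of the notes; the particular step process is my own example):

```python
import math, random

rng = random.Random(1)

# Deterministic step process u = 1 on [0, 1/2), u = 2 on [1/2, 1]:
# int_0^1 u dB = (B_{1/2} - B_0) + 2 (B_1 - B_{1/2}), and the isometry
# (3.4) predicts E (int u dB)^2 = int_0^1 u^2 dt = 1/2 + 4 * 1/2 = 5/2.
m = 200000
acc = 0.0
for _ in range(m):
    d1 = rng.gauss(0.0, math.sqrt(0.5))  # increment B_{1/2} - B_0
    d2 = rng.gauss(0.0, math.sqrt(0.5))  # increment B_1 - B_{1/2}
    acc += (d1 + 2 * d2) ** 2
assert abs(acc / m - 2.5) < 0.05
```

The same experiment with random (but adapted) coefficients u_j behaves identically, which is exactly what the independence argument in the proof exploits.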
The next step consists of identifying a bigger set than E of random processes
such that E is dense in the norm of the Hilbert space L2 (Ω × [0, T ]). Since
L2a,T is a Hilbert space with respect to this norm, we have that Ē ⊂ L2a,T .
The converse inclusion is also true. This is proved in the next Proposition,
which is a crucial fact in Itô’s theory.
\[
\int_0^T |u^n(t) - u(t)|^2\, dt = \sum_{k=0}^{[nT]} \int_{\frac{k}{n}}^{\frac{k+1}{n}\wedge T} \Big|u\Big(\frac{k}{n}\Big) - u(t)\Big|^2\, dt
\]
Step 2. Assume that u ∈ L^2_{a,T} is bounded. For any n ≥ 1, let Ψ_n(s) = n\mathbf{1}_{[0,\frac{1}{n}]}(s). The sequence (Ψ_n)_{n≥1} is an approximation of the identity. Consider
\[
u^n(t) = \int_{-\infty}^{+\infty} \Psi_n(t-s)\, u(s)\, ds = (\Psi_n * u)(t) = n\int_{t-\frac{1}{n}}^{t} u(s)\, ds,
\]
where the symbol "∗" denotes the convolution operator on R. In the last integral we put u(r) = 0 if r < 0. With this, we define a stochastic process u^n with continuous and bounded sample paths, a.s., and by the properties of the convolution (see e.g. [7, Corollary 2, p. 378]), we have
\[
\int_0^T |u^n(s) - u(s)|^2\, ds \to 0.
\]
where we have used that, for a function f ∈ L^1([0,T], B([0,T]), dt),
\[
\lim_{n\to\infty} \int_0^T |f(t)|\, \mathbf{1}_{\{|f(t)|>n\}}\, dt = 0,
\]
In order for this definition to make sense, one needs to make sure that if the process u is approximated by two different sequences, say u^{n,1} and u^{n,2}, the definitions of the stochastic integral using either u^{n,1} or u^{n,2} coincide. This is proved using the isometry property. Indeed,
\[
E\Big(\int_0^T u_t^{n,1}\, dB_t - \int_0^T u_t^{n,2}\, dB_t\Big)^2 = E\int_0^T \big(u_t^{n,1} - u_t^{n,2}\big)^2\, dt
\le 2\, E\int_0^T \big(u_t^{n,1} - u_t\big)^2\, dt + 2\, E\int_0^T \big(u_t^{n,2} - u_t\big)^2\, dt \to 0.
\]
Consequently, denoting by I^i(u) = L^2(\Omega)\text{-}\lim_{n\to\infty} \int_0^T u_t^{n,i}\, dB_t, i = 1, 2, and using the triangle inequality, we have
\[
\|I^1(u) - I^2(u)\|_2 \le \Big\|I^1(u) - \int_0^T u_t^{n,1}\, dB_t\Big\|_2 + \Big\|\int_0^T u_t^{n,1}\, dB_t - \int_0^T u_t^{n,2}\, dB_t\Big\|_2 + \Big\|I^2(u) - \int_0^T u_t^{n,2}\, dB_t\Big\|_2,
\]
Moreover,
Remember that these facts are true for processes in E, as has been mentioned before. The extension to processes in L^2_{a,T} is done by applying Proposition 3.1. For the sake of illustration, we prove (a).
Consider an approximating sequence u^n in the sense of Proposition 3.1. By the construction of the stochastic integral \int_0^T u_t\, dB_t, it holds that
\[
\lim_{n\to\infty} E\Big(\int_0^T u_t^n\, dB_t\Big) = E\Big(\int_0^T u_t\, dB_t\Big).
\]
Since E\big(\int_0^T u_t^n\, dB_t\big) = 0 for every n ≥ 1, this concludes the proof.
We end this section with an interesting example.
Example 3.1 For the Brownian motion B, the following formula holds:
\[
\int_0^T B_t\, dB_t = \frac{1}{2}\big(B_T^2 - T\big).
\]
Let us remark that we would rather expect \int_0^T B_t\, dB_t = \frac{1}{2}B_T^2, by analogy with the rules of deterministic calculus.
To prove this identity, we consider a particular sequence of approximating step processes based on the partition \{\frac{jT}{n},\ j = 0, \dots, n\}, as follows:
\[
u_t^n = \sum_{j=1}^n B_{t_{j-1}}\, \mathbf{1}_{]t_{j-1}, t_j]}(t),
\]
with t_j = \frac{jT}{n}. Clearly, u^n ∈ L^2_{a,T} and we have
\[
E\int_0^T (u_t^n - B_t)^2\, dt = \sum_{j=1}^n \int_{t_{j-1}}^{t_j} E\big(B_{t_{j-1}} - B_t\big)^2\, dt \le \frac{T}{n} \sum_{j=1}^n \int_{t_{j-1}}^{t_j} dt = \frac{T^2}{n}.
\]
Clearly,
\[
\sum_{j=1}^n B_{t_{j-1}}\big(B_{t_j} - B_{t_{j-1}}\big) = \frac{1}{2}\sum_{j=1}^n \big(B_{t_j}^2 - B_{t_{j-1}}^2\big) - \frac{1}{2}\sum_{j=1}^n \big(B_{t_j} - B_{t_{j-1}}\big)^2
= \frac{1}{2}B_T^2 - \frac{1}{2}\sum_{j=1}^n \big(B_{t_j} - B_{t_{j-1}}\big)^2. \tag{3.6}
\]
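The identity (3.6) is purely algebraic and holds path by path, so both it and the resulting formula for ∫ B dB can be checked numerically. A sketch (Python, not from the notes; names are mine):

```python
import math, random

rng = random.Random(3)

# The algebraic identity behind (3.6) holds for ANY sequence starting at 0,
# so it can be verified on an arbitrary list of numbers.
b = [0.0, 0.4, -0.1, 0.7, 0.2]
lhs = sum(b[j - 1] * (b[j] - b[j - 1]) for j in range(1, len(b)))
rhs = 0.5 * b[-1] ** 2 - 0.5 * sum((b[j] - b[j - 1]) ** 2 for j in range(1, len(b)))
assert abs(lhs - rhs) < 1e-12

# For a simulated Brownian path the second sum is the quadratic variation,
# close to T, so the Ito sums approach (B_T^2 - T) / 2, not B_T^2 / 2.
T, n = 1.0, 20000
path = [0.0]
for _ in range(n):
    path.append(path[-1] + rng.gauss(0.0, math.sqrt(T / n)))
ito_sum = sum(path[j - 1] * (path[j] - path[j - 1]) for j in range(1, n + 1))
assert abs(ito_sum - 0.5 * (path[-1] ** 2 - T)) < 0.05
```

The discrepancy between the Itô sum and B_T²/2 is exactly half the quadratic variation of the sampled path, which concentrates around T as n grows.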
Using properties (g) and (f), respectively, of the conditional expectation (see Appendix 1) yields
\[
E\big(I_t^n - I_s^n / F_s\big) = E\big(u_k (B_{t_k} - B_s) / F_s\big) + \sum_{j=k+1}^{l} E\Big(E\big(u_j\, \Delta_j B / F_{t_{j-1}}\big) \Big/ F_s\Big)
\]
\[
L^1(\Omega)\text{-}\lim_{n\to\infty} \sum_{j=1}^n \Big(\int_{t_{j-1}}^{t_j} u_s\, dB_s\Big)^2 = \int_0^t u_s^2\, ds.
\]
Since p ≥ 2 is arbitrary, with Theorem 1.1 we have that the sample paths of \{\int_0^t u_s\, dB_s,\ t \in [0,T]\} are γ-Hölder continuous with γ ∈ ]0, \frac{1}{2}[.
3.3 An extension of the Itô integral
In Section 3.1 we introduced the set L^2_{a,T} and defined the stochastic integral of processes of this class with respect to the Brownian motion. In this section we shall consider a larger class of integrands. The notations and the underlying filtration are the same as in Section 3.1.
Let Λ^2_{a,T} be the set of real-valued processes u adapted to the filtration (F_t, t ≥ 0), jointly measurable in (t, ω) with respect to the product σ-field B([0,T]) ⊗ F, and satisfying
\[
P\Big(\int_0^T u_t^2\, dt < \infty\Big) = 1. \tag{3.9}
\]
Clearly L^2_{a,T} ⊂ Λ^2_{a,T}. Our aim is to define the stochastic integral for processes in Λ^2_{a,T}. For this we shall follow the same approach as in Section 3.1. Firstly, we start with step processes (u^n, n ≥ 1) of the form (3.2) belonging to Λ^2_{a,T} and define the integral as in (3.3). The extension to processes in Λ^2_{a,T} needs two ingredients. The first one is an approximation result that we now state without giving a proof. The reader may consult for instance [1].
Proposition 3.4 Let u ∈ Λ^2_{a,T}. There exists a sequence of step processes (u^n, n ≥ 1) of the form (3.2) belonging to Λ^2_{a,T} such that
\[
\lim_{n\to\infty} \int_0^T |u_t^n - u_t|^2\, dt = 0, \quad \text{a.s.}
\]
The second ingredient gives a connection between stochastic integrals of step processes in Λ^2_{a,T} and their quadratic variation, as follows.
Proposition 3.5 Let u be a step process in Λ^2_{a,T}. Then for any ε > 0, N > 0,
\[
P\Big(\Big|\int_0^T u_t\, dB_t\Big| > \varepsilon\Big) \le P\Big(\int_0^T u_t^2\, dt > N\Big) + \frac{N}{\varepsilon^2}. \tag{3.10}
\]
Proof: It is based on a truncation argument. Let u be given by the right-hand side of (3.2) (here it is not necessary to assume that the random variables u_j are in L^2(Ω)). Fix N > 0 and define
\[
v_t^N = \begin{cases} u_j, & \text{if } t \in [t_{j-1}, t_j[ \text{ and } \sum_{k=1}^{j} u_k^2\,(t_k - t_{k-1}) \le N,\\[4pt] 0, & \text{if } t \in [t_{j-1}, t_j[ \text{ and } \sum_{k=1}^{j} u_k^2\,(t_k - t_{k-1}) > N, \end{cases}
\]
Moreover, if \int_0^T u_t^2\, dt \le N, necessarily u_t = v_t^N for any t ∈ [0,T]. Then, by considering the decomposition
\[
\Big\{\Big|\int_0^T u_t\, dB_t\Big| > \varepsilon\Big\} = \Big\{\Big|\int_0^T u_t\, dB_t\Big| > \varepsilon,\ \int_0^T u_t^2\, dt > N\Big\} \cup \Big\{\Big|\int_0^T u_t\, dB_t\Big| > \varepsilon,\ \int_0^T u_t^2\, dt \le N\Big\},
\]
we obtain
\[
P\Big(\Big|\int_0^T u_t\, dB_t\Big| > \varepsilon\Big) \le P\Big(\Big|\int_0^T v_t^N\, dB_t\Big| > \varepsilon\Big) + P\Big(\int_0^T u_t^2\, dt > N\Big).
\]
is Cauchy in probability. The space L^0(Ω) of classes of (a.s.) finite random variables, endowed with the convergence in probability, is a complete metric space. For example, a possible distance is
\[
d(X,Y) = E\Big(\frac{|X-Y|}{1+|X-Y|}\Big).
\]
Hence the sequence (3.12) does have a limit in probability. Then, we define
\[
\int_0^T u_t\, dB_t = P\text{-}\lim_{n\to\infty} \int_0^T u_t^n\, dB_t. \tag{3.13}
\]
in the convergence of L^2(Ω). This gives the extra contribution in the development of B_t^2 in comparison with the classical calculus approach.
Notice that, if B were of bounded variation, then we could argue as follows:
\[
\sum_{j=0}^{n-1} \big(B_{t_{j+1}} - B_{t_j}\big)^2 \le \Big(\sup_{0\le j\le n-1} |B_{t_{j+1}} - B_{t_j}|\Big) \sum_{j=0}^{n-1} |B_{t_{j+1}} - B_{t_j}|.
\]
By the continuity of the sample paths of the Brownian motion, the first factor on the right-hand side of the preceding inequality tends to zero as the mesh of the partition tends to zero, while the second factor remains finite, by the property of bounded variation.
Summarising: differential calculus with respect to the Brownian motion should take into account second-order differential terms. Roughly speaking,
\[
(dB_t)^2 = dt.
\]
a.s.
The proof relies on two technical lemmas. In the sequel, we will consider
a sequence of partitions Πn = {0 = t0 ≤ t1 ≤ · · · ≤ tn = t} such that
limn→∞ |Πn | = 0.
Lemma 3.1 Let g be a real continuous function and λ_i ∈ (0,1), i = 1, ..., n. There exists a subsequence (denoted by (n), for simplicity) such that
\[
X_n := \sum_{i=1}^n \Big[g\big(B_{t_{i-1}} + \lambda_i (B_{t_i} - B_{t_{i-1}})\big) - g(B_{t_{i-1}})\Big]\, \big(B_{t_i} - B_{t_{i-1}}\big)^2, \quad n \ge 1,
\]
Clearly,
\[
|X_n| \le Y_n \sum_{i=1}^n \big(B_{t_i} - B_{t_{i-1}}\big)^2.
\]
Lemma 3.2 The hypotheses are the same as in Lemma 3.1. The sequence
\[
S_n := \sum_{i=1}^n g(B_{t_{i-1}})\Big[\big(B_{t_i} - B_{t_{i-1}}\big)^2 - (t_i - t_{i-1})\Big], \quad n \ge 1,
\]
This is a localization of S_n.
To simplify the notation, denote by Y_i the i-th summand of S_{n,L}; these random variables are centered and orthogonal, so that E(Y_i Y_j) = 0 for i ≠ j. Consequently,
\[
E\big(S_{n,L}^2\big) = \sum_{i=1}^n E(Y_i^2) \le C\, t\, \sup_{|x|\le L} |g(x)|^2\, |\Pi_n|,
\]
We proved in Proposition 2.4 that the laws of \sup_{0\le s\le t} B(s) and |B_t| are the same. Thus,
\[
P\Big(\sup_{0\le s\le t} B(s) > L\Big) = P\{|B(t)| > L\} \le L^{-1}\, E(|B(t)|) = L^{-1}\sqrt{\frac{2t}{\pi}}.
\]
This yields
\[
\lim_{L\to\infty} P\{S_n \ne S_{n,L}\} = 0,
\]
Proof of Theorem 3.1
For simplicity, we take a = 0. We fix ω and consider a Taylor expansion up to the second order. This yields
\[
f(B_t) - f(0) = \sum_{i=1}^n f'(B_{t_{i-1}})\big(B_{t_i} - B_{t_{i-1}}\big) + \frac{1}{2}\sum_{i=1}^n f''\big(B_{t_{i-1}} + \lambda_i (B_{t_i} - B_{t_{i-1}})\big)\big(B_{t_i} - B_{t_{i-1}}\big)^2.
\]
The stochastic process {u_t = f'(B_t), t ≥ 0} has continuous sample paths, a.s. Let
\[
u^n(t) = \sum_{i=1}^n f'(B_{t_{i-1}})\, \mathbf{1}_{[t_{i-1}, t_i]}(t).
\]
By continuity,
\[
\lim_{n\to\infty} \int_0^t |u^n(s) - u(s)|^2\, ds = 0, \quad \text{a.s.}
\]
Therefore,
\[
\lim_{n\to\infty} \sum_{i=1}^n f'(B_{t_{i-1}})\big(B_{t_i} - B_{t_{i-1}}\big) = \lim_{n\to\infty} \int_0^t u^n(s)\, dB_s = \int_0^t u(s)\, dB_s,
\]
The first term on the right-hand side of the preceding inequality converges to zero as n → ∞, a.s. Indeed, this follows from Lemma 3.1 applied to the function g := f''. With the same choice of g, Lemma 3.2 yields the a.s. convergence to zero of a subsequence of the second term. Finally, the third term also converges to zero, a.s., by the classical result on the approximation of Riemann integrals by Riemann sums.
We now introduce the class of Itô processes, for which we will prove a more general version of the Itô formula.
Let C^{1,2} denote the set of functions on [0,T] × R which are jointly continuous in (t,x), continuously differentiable in t and twice continuously differentiable in x, with jointly continuous derivatives. Our next aim is to prove an Itô formula for the stochastic process {f(t, X_t), t ∈ [0,T]}, f ∈ C^{1,2}. This will be an extension of (3.17).
or, in differential form,
\[
df(t, X_t) = \partial_t f(t, X_t)\, dt + \partial_x f(t, X_t)\, dX_t + \frac{1}{2}\,\partial_{xx}^2 f(t, X_t)\, (dX_t)^2, \tag{3.21}
\]
where (dX_t)^2 is computed using the formal rule
with µ, σ ∈ R.
Applying formula (3.19) to X_t := B_t (a Brownian motion) yields
\[
f(t, B_t) = 1 + \mu \int_0^t f(s, B_s)\, ds + \sigma \int_0^t f(s, B_s)\, dB_s.
\]
Black and Scholes proposed, as a model of a market with a single risky asset with initial value S_0 = 1, the process S_t = Y_t. We have seen that such a process is in fact the solution to a linear stochastic differential equation (see (3.22)).
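The linear equation and its exponential solution can be compared by simulation. The sketch below (Python, not from the notes; parameter values and names are mine) runs an Euler–Maruyama scheme for dS = µS dt + σS dB and checks it against the closed-form geometric Brownian motion driven by the same increments:

```python
import math, random

rng = random.Random(9)

def gbm_euler_vs_exact(mu=0.05, sigma=0.2, T=1.0, n=20000):
    """Euler-Maruyama for dS = mu * S dt + sigma * S dB, S_0 = 1, against
    the closed-form solution S_T = exp((mu - sigma^2/2) T + sigma * B_T),
    both driven by the same Brownian increments."""
    dt = T / n
    s, b = 1.0, 0.0
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        s += mu * s * dt + sigma * s * db
        b += db
    exact = math.exp((mu - sigma ** 2 / 2) * T + sigma * b)
    return s, exact

s, exact = gbm_euler_vs_exact()
assert abs(s - exact) < 0.01
```

Note the Itô correction −σ²/2 in the exponent of the exact solution: dropping it gives a systematically biased path, which is the numerical face of the second-order term in (3.21).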
Proof of Theorem 3.2
Let Π_n = {0 = t_0^n < ... < t_{p_n}^n = t} be a sequence of increasing partitions such that \lim_{n\to\infty} |\Pi_n| = 0. First, we consider the decomposition
\[
f(t, X_t) - f(0, X_0) = \sum_{i=0}^{p_n-1} \Big[f\big(t_{i+1}^n, X_{t_{i+1}^n}\big) - f\big(t_i^n, X_{t_i^n}\big)\Big]
= \sum_{i=0}^{p_n-1} \Big(\Big[f\big(t_{i+1}^n, X_{t_i^n}\big) - f\big(t_i^n, X_{t_i^n}\big)\Big] + \Big[f\big(t_{i+1}^n, X_{t_{i+1}^n}\big) - f\big(t_{i+1}^n, X_{t_i^n}\big)\Big]\Big)
\]
\[
= \sum_{i=0}^{p_n-1} \Big(\partial_s f\big(\bar t_i^n, X_{t_i^n}\big)\big(t_{i+1}^n - t_i^n\big) + \partial_x f\big(t_{i+1}^n, X_{t_i^n}\big)\big(X_{t_{i+1}^n} - X_{t_i^n}\big)\Big)
+ \frac{1}{2}\sum_{i=0}^{p_n-1} \partial_{xx}^2 f\big(t_{i+1}^n, \bar X_i^n\big)\big(X_{t_{i+1}^n} - X_{t_i^n}\big)^2, \tag{3.24}
\]
with \bar t_i^n ∈ ]t_i^n, t_{i+1}^n[ and \bar X_i^n an intermediate (random) point on the segment determined by X_{t_i^n} and X_{t_{i+1}^n}.
In fact, this follows from a Taylor expansion of the function f up to the first order in the variable s (or the mean-value theorem), and up to the second order in the variable x. The asymmetry in the orders is due to the existence of the quadratic variation of the processes involved. The expression (3.24) is the analogue of (3.16). The latter is much simpler, for two reasons: firstly, there is no s-variable; secondly, f is a polynomial of second degree, and therefore it has an exact Taylor expansion. But both formulas have the same structure.
When passing to the limit as n → ∞, we expect
\[
\sum_{i=0}^{p_n-1} \partial_s f\big(\bar t_i^n, X_{t_i^n}\big)\big(t_{i+1}^n - t_i^n\big) \to \int_0^t \partial_s f(s, X_s)\, ds,
\]
\[
\sum_{i=0}^{p_n-1} \partial_x f\big(t_{i+1}^n, X_{t_i^n}\big)\big(X_{t_{i+1}^n} - X_{t_i^n}\big) \to \int_0^t \partial_x f(s, X_s)\, u_s\, dB_s + \int_0^t \partial_x f(s, X_s)\, v_s\, ds,
\]
\[
\sum_{i=0}^{p_n-1} \partial_{xx}^2 f\big(t_{i+1}^n, \bar X_i^n\big)\big(X_{t_{i+1}^n} - X_{t_i^n}\big)^2 \to \int_0^t \partial_{xx}^2 f(s, X_s)\, u_s^2\, ds,
\]
in some topology.
This actually holds in the sense of a.s. convergence (by taking, if necessary, a
subsequence). As in Theorem 3.1, the proof requires a localization in Ω. However,
this can be avoided by adding the following assumptions: the process v is
bounded; u ∈ L²_{a,T}; the partial derivatives ∂x f, ∂²xx f are bounded.
We shall give a proof of the theorem under these additional hypotheses.
Checking the convergences
First term
Σ_i ∂s f(t̄^n_i, X_{t^n_i})(t^n_{i+1} − t^n_i) → ∫₀ᵗ ∂s f(s, Xs) ds, a.s. (3.25)
Indeed,
|Σ_i ∂s f(t̄^n_i, X_{t^n_i})(t^n_{i+1} − t^n_i) − ∫₀ᵗ ∂s f(s, Xs) ds|
= |Σ_i ∫_{t^n_i}^{t^n_{i+1}} [∂s f(t̄^n_i, X_{t^n_i}) − ∂s f(s, Xs)] ds|
≤ Σ_i ∫_{t^n_i}^{t^n_{i+1}} |∂s f(t̄^n_i, X_{t^n_i}) − ∂s f(s, Xs)| ds
≤ t sup_i sup_{s∈[t^n_i, t^n_{i+1}]} |∂s f(t̄^n_i, X_{t^n_i}) − ∂s f(s, Xs)|.
We start with (3.27). We have
E(|Σ_i ∫_{t^n_i}^{t^n_{i+1}} ∂x f(t^n_i, X_{t^n_i}) us dBs − ∫₀ᵗ ∂x f(s, Xs) us dBs|²)
= E(|Σ_i ∫_{t^n_i}^{t^n_{i+1}} [∂x f(t^n_i, X_{t^n_i}) − ∂x f(s, Xs)] us dBs|²)
= Σ_i E(|∫_{t^n_i}^{t^n_{i+1}} [∂x f(t^n_i, X_{t^n_i}) − ∂x f(s, Xs)] us dBs|²)
= Σ_i ∫_{t^n_i}^{t^n_{i+1}} E(|∂x f(t^n_i, X_{t^n_i}) − ∂x f(s, Xs)|² u²s) ds,
Set f_{n,i} = ∂²xx f(t^n_{i+1}, X̄^n_i). We have to prove
Σ_{i=0}^{p_n−1} f_{n,i} (X_{t^n_{i+1}} − X_{t^n_i})² → ∫₀ᵗ ∂²xx f(s, Xs) u²s ds. (3.30)
Σ_{i=0}^{p_n−1} f_{n,i} (∫_{t^n_i}^{t^n_{i+1}} us dBs)² → ∫₀ᵗ ∂²xx f(s, Xs) u²s ds, (3.31)
Σ_{i=0}^{p_n−1} f_{n,i} (∫_{t^n_i}^{t^n_{i+1}} us dBs)(∫_{t^n_i}^{t^n_{i+1}} vs ds) → 0, (3.32)
Σ_{i=0}^{p_n−1} f_{n,i} (∫_{t^n_i}^{t^n_{i+1}} vs ds)² → 0, (3.33)
with
T1 = E(|Σ_{i=0}^{p_n−1} f_{n,i} [(∫_{t^n_i}^{t^n_{i+1}} us dBs)² − ∫_{t^n_i}^{t^n_{i+1}} u²s ds]|),
T2 = E(|Σ_{i=0}^{p_n−1} f_{n,i} ∫_{t^n_i}^{t^n_{i+1}} u²s ds − ∫₀ᵗ ∂²xx f(s, Xs) u²s ds|).
As for T2, we have
T2 = E(|Σ_i ∫_{t^n_i}^{t^n_{i+1}} [f_{n,i} − ∂²xx f(s, Xs)] u²s ds|)
≤ E(Σ_i ∫_{t^n_i}^{t^n_{i+1}} |f_{n,i} − ∂²xx f(s, Xs)| u²s ds).
≤ C Σ_{i=0}^{p_n−1} (E ∫_{t^n_i}^{t^n_{i+1}} u²s ds)^{1/2} (E (∫_{t^n_i}^{t^n_{i+1}} |vs| ds)²)^{1/2}
≤ C Σ_{i=0}^{p_n−1} |t^n_{i+1} − t^n_i|^{1/2} (∫_{t^n_i}^{t^n_{i+1}} E|us|² ds)^{1/2} (∫_{t^n_i}^{t^n_{i+1}} E|vs|² ds)^{1/2}
≤ C sup_i |t^n_{i+1} − t^n_i|^{1/2} (Σ_{i=0}^{p_n−1} ∫_{t^n_i}^{t^n_{i+1}} E|us|² ds)^{1/2}
× (Σ_{i=0}^{p_n−1} ∫_{t^n_i}^{t^n_{i+1}} E|vs|² ds)^{1/2},
The first factor on the right-hand side of this inequality tends to zero as
n → ∞, while the second one is bounded a.s. Therefore (3.33) holds in the
a.s. convergence.
This ends the proof of the Theorem.
where, in order to compute dX^k_s dX^l_s, we have to apply the following rules:
dB^k_s dB^l_s = δ_{k,l} ds, (3.36)
dB^k_s ds = 0,
(ds)² = 0,
where δ_{k,l} denotes the Kronecker symbol.
We remark that the identity (3.36) is a consequence of the independence of
the components of the Brownian motion.
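The rules (3.36) can be checked through discrete increments (a sketch of ours): for two independent Brownian motions the cross products Σ ΔB¹ΔB² vanish, while Σ(ΔB¹)² recovers t, and the mixed ΔB·Δt and (Δt)² sums are negligible.

```python
import numpy as np

# Sketch (our illustration) of the formal rules (3.36) via discrete increments
# of two independent Brownian motions on [0, t].
rng = np.random.default_rng(2)
t, n = 1.0, 500_000
dt = t / n
dB1 = np.sqrt(dt) * rng.standard_normal(n)
dB2 = np.sqrt(dt) * rng.standard_normal(n)

print(np.sum(dB1 * dB1))  # ~ t   (k = l)
print(np.sum(dB1 * dB2))  # ~ 0   (k != l, by independence)
print(np.sum(dB1 * dt))   # ~ 0   (dB ds = 0)
print(n * dt * dt)        # ~ 0   (sum of (ds)^2)
```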
Example 3.3 Consider the particular case m = 1, p = 2 and f(x, y) = xy.
That is, f does not depend on t and we have denoted a generic point of R² by
(x, y). Then the above formula (3.35) yields
X^1_t X^2_t = X^1_0 X^2_0 + ∫₀ᵗ X^1_s dX^2_s + ∫₀ᵗ X^2_s dX^1_s + ∫₀ᵗ u^1_s u^2_s ds. (3.37)
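Formula (3.37) can be tested pathwise in the simplest case X¹ = X² = B (so u¹ = u² = 1), where it reads B_t² = 2∫₀ᵗ B_s dB_s + t; a sketch of ours with left-point Riemann sums:

```python
import numpy as np

# Sketch (our illustration) of (3.37) with X^1 = X^2 = B:
# B_t^2 = 2 * int_0^t B_s dB_s + t.
rng = np.random.default_rng(3)
t, n = 1.0, 200_000
dB = np.sqrt(t / n) * rng.standard_normal(n)
B = np.concatenate(([0.0], np.cumsum(dB)))

ito = np.sum(B[:-1] * dB)       # left-point (Ito) Riemann sum
print(B[-1]**2, 2 * ito + t)    # the two sides should be close
```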
4 Applications of the Itô formula
This chapter presents some important results whose proofs rely on the Itô
formula.
Then, for any p > 0, there exist two positive constants cp, Cp such that
cp E((∫₀ᵀ u²s ds)^{p/2}) ≤ E((M*_T)^p) ≤ Cp E((∫₀ᵀ u²s ds)^{p/2}). (4.1)
Proof: We will only prove here the right-hand side of (4.1) for p ≥ 2. For this,
we assume that the process {Mt , t ∈ [0, T ]} is bounded. This assumption can
be removed by a localization argument.
Consider the function
f(x) = |x|^p,
for which we have
f′(x) = p|x|^{p−1} sign(x),
f″(x) = p(p − 1)|x|^{p−2},
for x ≠ 0. Then, according to (3.23) we obtain
|Mt|^p = ∫₀ᵗ p|Ms|^{p−1} sign(Ms) us dBs + (1/2) ∫₀ᵗ p(p − 1)|Ms|^{p−2} u²s ds.
Applying the expectation operator to both terms of the above identity yields
E(|Mt|^p) = (p(p − 1)/2) E(∫₀ᵗ |Ms|^{p−2} u²s ds). (4.2)
We next apply Hölder's inequality to the expectation with exponents p/(p − 2)
and q = p/2 and get
E(∫₀ᵗ |Ms|^{p−2} u²s ds) ≤ E((M*_t)^{p−2} ∫₀ᵗ u²s ds)
≤ [E((M*_t)^p)]^{(p−2)/p} [E((∫₀ᵗ u²s ds)^{p/2})]^{2/p}. (4.3)
Doob's inequality (see Theorem 8.1) implies
E((M*_t)^p) ≤ (p/(p − 1))^p E(|Mt|^p).
Since 1 − (p − 2)/p = 2/p, from this inequality we obtain
E((M*_t)^p) ≤ (p/(p − 1))^p (p(p − 1)/2)^{p/2} E((∫₀ᵗ u²s ds)^{p/2}).
Hence, for any martingale M = {Mt, t ∈ [0, T]} bounded in L², there exists a
unique process h ∈ L²_{a,T} and a constant C such that
Mt = C + ∫₀ᵗ hs dBs. (4.5)
Proof: We start with the proof of (4.4). Let H be the vector space consisting
of random variables Z ∈ L2 (Ω, FT ) such that (4.4) holds. Firstly, we argue
the uniqueness of h. This is an easy consequence of the isometry of the
stochastic integral. Indeed, if there were two processes h and h0 satisfying
(4.4), then
E(∫₀ᵀ (hs − h′s)² ds) = E((∫₀ᵀ (hs − h′s) dBs)²) = 0.
This yields h = h′ in L²([0, T] × Ω).
We now turn to the existence of h. Any Z ∈ H satisfies
E(Z²) = (E(Z))² + E(∫₀ᵀ h²s ds).
Consequently, the representation holds for Z := E_T^f, and also any linear
combination of such random variables belongs to H. The conclusion follows from
Lemma 4.1.
Let us now prove the representation (4.5). The random variable MT belongs
to L²(Ω). Hence, by applying the first part of the Theorem, we have
MT = E(M0) + ∫₀ᵀ hs dBs,
Thus,
B³_T = ∫₀ᵀ 3(B²_t + T − t) dBt.
Notice that E(B³_T) = 0. Then ht = 3[B²_t + T − t].
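This explicit representation can be verified pathwise (a sketch of ours): on a fine grid, the left-point Riemann sums of ∫₀ᵀ 3(B²_t + T − t) dB_t nearly reproduce B³_T.

```python
import numpy as np

# Sketch (our illustration): B_T^3 = int_0^T 3(B_t^2 + T - t) dB_t, checked
# pathwise with left-point Riemann sums.
rng = np.random.default_rng(4)
T, n = 1.0, 400_000
dB = np.sqrt(T / n) * rng.standard_normal(n)
B = np.concatenate(([0.0], np.cumsum(dB)))
s = np.linspace(0.0, T, n + 1)[:-1]          # left endpoints of the grid

h = 3.0 * (B[:-1]**2 + T - s)                # the integrand h_t
print(B[-1]**3, np.sum(h * dB))              # should nearly coincide
```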
Proof: It is clear that Q defines a σ-additive function on F. Moreover, since
E(L) = 1, Q is indeed a probability.
Let A ∈ F be such that Q(A) = 0. Since L > 0 a.s., we must have
P(A) = 0. Conversely, for any A ∈ F with P(A) = 0, we have Q(A) = 0
as well.
The second assertion of the lemma is the Radon–Nikodym theorem.
If we denote by EQ the expectation operator with respect to the probability
Q defined before, one has
EQ (X) = E(XL).
Indeed, this formula is easily checked for simple random variables and then
extended to any random variable X ∈ L1 (Ω) by the usual approximation
argument.
Consider now a Brownian motion {Bt, t ∈ [0, T]}. Fix λ ∈ R and let
Lt = exp(−λBt − (λ²/2) t). (4.7)
Lemma 4.3 Let X be a random variable and let G be a sub-σ-field of F such
that
E(e^{iuX} | G) = e^{−u²σ²/2}.
Then the random variable X is independent of the σ-field G and its probability
law is Gaussian, with zero mean and variance σ².
Wt = Bt + λt.
In the probability space (Ω, FT, Q), with Q given in (4.8), the process {Wt, t ∈
[0, T]} is a standard Brownian motion.
Proof: We will check that, in the probability space (Ω, FT, Q), any increment
Wt − Ws, 0 ≤ s < t ≤ T, is independent of Fs and has N(0, t − s) distribution.
That is, for any A ∈ Fs,
EQ(e^{iu(Wt−Ws)} 1A) = EQ(1A) e^{−(u²/2)(t−s)} = Q(A) e^{−(u²/2)(t−s)}.
Writing
Lt = exp(−λ(Bt − Bs) − (λ²/2)(t − s)) exp(−λBs − (λ²/2) s),
we have
EQ(e^{iu(Wt−Ws)} 1A) = E(1A e^{iu(Bt−Bs)+iuλ(t−s)−λ(Bt−Bs)−(λ²/2)(t−s)} Ls)
= Q(A) e^{((iu−λ)²/2)(t−s)+iuλ(t−s)−(λ²/2)(t−s)}
= Q(A) e^{−(u²/2)(t−s)}.
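Girsanov's theorem can be illustrated by a weighted Monte Carlo experiment (a sketch of ours): reweighting by L_T, the drifted process W_T = B_T + λT behaves like a centered Gaussian of variance T.

```python
import numpy as np

# Sketch (our illustration) of Girsanov: under dQ = L_T dP with
# L_T = exp(-lam B_T - lam^2 T / 2), W_t = B_t + lam t is a standard
# Brownian motion, so E_Q(W_T) = 0 and E_Q(W_T^2) = T.
rng = np.random.default_rng(5)
lam, T, n_paths = 0.7, 1.0, 400_000

B_T = np.sqrt(T) * rng.standard_normal(n_paths)
L_T = np.exp(-lam * B_T - 0.5 * lam**2 * T)   # density of Q w.r.t. P
W_T = B_T + lam * T

print(np.mean(W_T * L_T))      # E_Q(W_T)   ~ 0
print(np.mean(W_T**2 * L_T))   # E_Q(W_T^2) ~ T
print(np.mean(L_T))            # Q is a probability: E(L_T) = 1
```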
5 Local time of Brownian motion and
Tanaka’s formula
This chapter deals with a very particular extension of Itô's formula. More
precisely, we would like to have a decomposition of the positive submartingale
|Bt − x|, for some fixed x ∈ R, as in the Itô formula. Notice that the function
f(y) = |y − x| does not belong to C²(R). A natural way to proceed is to
regularize the function f, for instance by convolution with an approximation
of the identity, and then pass to the limit. Assuming that this is feasible,
the question of identifying the limit involving the second-order derivative
remains open. This leads us to introduce a process termed the local time of
B at x, introduced by Paul Lévy.
Definition 5.1 Let B = {Bt, t ≥ 0} be a Brownian motion and let x ∈ R.
The local time of B at x is defined as the stochastic process
L(t, x) = lim_{ε→0} (1/2ε) ∫₀ᵗ 1_{(x−ε,x+ε)}(Bs) ds
= lim_{ε→0} (1/2ε) λ{s ∈ [0, t] : Bs ∈ (x − ε, x + ε)}, (5.1)
where λ denotes Lebesgue measure.
We shall see later that the above limit exists in L² (it also exists a.s.), a fact
that is not at all obvious.
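The occupation-time quotient in (5.1) can at least be sanity-checked in the mean (a sketch of ours, using that taking expectations in (5.2) below with x = 0 and B0 = 0 gives E(L(t, 0)) = 2E(B_t⁺) = √(2t/π)):

```python
import numpy as np

# Sketch (our illustration): estimate E(L(t, 0)) by the occupation-time
# approximation (5.1) and compare with E(L(t, 0)) = sqrt(2 t / pi).
rng = np.random.default_rng(6)
t, n_steps, n_paths, eps = 1.0, 2_000, 5_000, 0.05

dB = np.sqrt(t / n_steps) * rng.standard_normal((n_paths, n_steps))
B = np.cumsum(dB, axis=1)

occupation = (np.abs(B) < eps).sum(axis=1) * (t / n_steps)  # Leb{s: |B_s| < eps}
L_est = occupation / (2 * eps)

print(L_est.mean(), np.sqrt(2 * t / np.pi))  # should be close
```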
Local time enters naturally in the extension of the Itô formula we alluded
before. In fact, we have the following result.
Theorem 5.1 For any t ≥ 0 and x ∈ R, a.s.,
(Bt − x)⁺ = (B0 − x)⁺ + ∫₀ᵗ 1_{[x,∞)}(Bs) dBs + (1/2) L(t, x), (5.2)
where L(t, x) is given by (5.1) in the sense of L² convergence.
Proof: The heuristics of formula (5.2) is the following. In the sense of
distributions, f(y) = (y − x)⁺ has first- and second-order derivatives
f′(y) = 1_{[x,∞)}(y) and f″(y) = δx(y), respectively, where δx denotes the Dirac
delta measure. Hence we expect a formula like
(Bt − x)⁺ = (B0 − x)⁺ + ∫₀ᵗ 1_{[x,∞)}(Bs) dBs + (1/2) ∫₀ᵗ δx(Bs) ds.
However, we have to give a meaning to the last integral.
Approximation procedure
We are going to approximate the function f(y) = (y − x)⁺. For this, we fix
ε > 0 and define
f^ε_x(y) = 0, if y ≤ x − ε;
f^ε_x(y) = (y − x + ε)²/(4ε), if x − ε ≤ y ≤ x + ε;
f^ε_x(y) = y − x, if y ≥ x + ε,
and
(f^ε_x)″(y) = 0, if y < x − ε;
(f^ε_x)″(y) = 1/(2ε), if x − ε < y < x + ε;
(f^ε_x)″(y) = 0, if y > x + ε,
with a constant c such that ∫_R φ(z) dz = 1, and then take
φn(y) = nφ(ny).
Set
gn(y) = [φn ∗ f^ε_x](y) = ∫_R f^ε_x(y − z) φn(z) dz.
Convergence of the terms in (5.3) as n → ∞
The function (f^ε_x)′ is bounded. The function g′n is also bounded. Indeed,
|g′n(y)| = |∫_R (f^ε_x)′(y − z) φn(z) dz|
= |∫_{−1/n}^{1/n} (f^ε_x)′(y − z) φn(z) dz|
≤ 2‖(f^ε_x)′‖∞.
Moreover,
|g′n(Bs) 1_{[0,t]} − (f^ε_x)′(Bs) 1_{[0,t]}| → 0,
uniformly in t and in ω. Hence, by bounded convergence,
E(∫₀ᵗ |g′n(Bs) − (f^ε_x)′(Bs)|² ds) → 0
as n → ∞.
We next deal with the second-order term. Since the law of each Bs has a
density, for each s > 0,
P{Bs = x + ε} = P{Bs = x − ε} = 0.
Hence g″n(Bs) → (f^ε_x)″(Bs) a.s. Using Fubini's theorem, we see that this
convergence also holds, for almost every s, a.s. In fact,
∫₀ᵗ ds ∫_Ω dP 1_{{(f^ε_x)″(Bs) ≠ lim_{n→∞} g″n(Bs)}}
= ∫_Ω dP ∫₀ᵗ ds 1_{{(f^ε_x)″(Bs) ≠ lim_{n→∞} g″n(Bs)}} = 0.
We have
sup_{y∈R} |g″n(y)| ≤ 1/(2ε).
Indeed,
|g″n(y)| = (1/2ε) |∫_R φn(z) 1_{(x−ε,x+ε)}(y − z) dz|
≤ (1/2ε) ∫_{y−x−ε}^{y−x+ε} |φn(z)| dz ≤ 1/(2ε).
Then, by bounded convergence,
∫₀ᵗ g″n(Bs) ds → ∫₀ᵗ (f^ε_x)″(Bs) ds,
a.s. and in L².
Thus, passing to the limit in (5.3) yields
f^ε_x(Bt) = f^ε_x(B0) + ∫₀ᵗ (f^ε_x)′(Bs) dBs + (1/2) ∫₀ᵗ (1/2ε) 1_{(x−ε,x+ε)}(Bs) ds. (5.4)
Convergence as ε → 0 of (5.4)
Since f^ε_x(y) → (y − x)⁺ as ε → 0, we have
f^ε_x(Bt) − f^ε_x(B0) → (Bt − x)⁺ − (B0 − x)⁺,
in L².
Moreover,
E(∫₀ᵗ |(f^ε_x)′(Bs) − 1_{[x,∞)}(Bs)|² ds) ≤ E(∫₀ᵗ 1_{(x−ε,x+ε)}(Bs) ds)
≤ ∫₀ᵗ (2ε/√(2πs)) ds,
which clearly tends to zero as ε → 0. Hence, by the isometry property of the
stochastic integral,
∫₀ᵗ (f^ε_x)′(Bs) dBs → ∫₀ᵗ 1_{[x,∞)}(Bs) dBs,
in L².
Consequently, we have proved that
(1/2ε) ∫₀ᵗ 1_{(x−ε,x+ε)}(Bs) ds
converges in L² as ε → 0 and that formula (5.2) holds.
We give without proof two further properties of local time.
Rt
2. The stochastic integral 0 1[x,∞) (Bs )dBs has a jointly continuous ver-
sion in (t, x) ∈ (0, ∞) × R. Hence, by (5.2) so does the local time
{L(t, x), (t, x) ∈ (0, ∞) × R}.
The next result, which follows easily from Theorem 5.1, is known as Tanaka's
formula.
where we have denoted by L⁻(t, −x) the local time of −B at −x. We have
the following facts:
∫₀ᵗ 1_{[−x,∞)}(−Bs) d(−Bs) = −∫₀ᵗ 1_{(−∞,x]}(Bs) dBs,
L⁻(t, −x) = lim_{ε→0} (1/2ε) ∫₀ᵗ 1_{(−x−ε,−x+ε)}(−Bs) ds
= lim_{ε→0} (1/2ε) ∫₀ᵗ 1_{(x−ε,x+ε)}(Bs) ds
= L(t, x).
Thus, we have proved
(Bt − x)⁻ = (B0 − x)⁻ − ∫₀ᵗ 1_{(−∞,x]}(Bs) dBs + (1/2) L(t, x). (5.6)
6 Stochastic differential equations
In this section we shall introduce stochastic differential equations driven by a
multi-dimensional Brownian motion. Under suitable properties on the coeffi-
cients, we shall prove a result on existence and uniqueness of solution. Then
we shall establish properties of the solution, like existence of moments of any
order and the Hölder property of the sample paths.
The setting
We consider a d-dimensional Brownian motion B = {Bt = (Bt1 , . . . , Btd ), t ≥
0}, B0 = 0, defined on a probability space (Ω, F, P ), along with a filtration
(Ft , t ≥ 0) satisfying the following properties:
1. B is adapted to (Ft , t ≥ 0),
or coordinate-wise,
X^i_t = x^i + Σ_{j=1}^d ∫₀ᵗ σ^i_j(s, Xs) dB^j_s + ∫₀ᵗ b^i(s, Xs) ds,
i = 1, …, m.
3. Equation (6.2) holds true for the fixed Brownian motion defined before,
for any t ≥ 0, a.s.
Definition 6.2 The equation (6.2) has a path-wise unique solution if any
two strong solutions X1 and X2 in the sense of the previous definition are
indistinguishable, that is,
1. Lipschitz condition:
sup_t [|b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)|] ≤ L|x − y|. (6.4)
6.1 Examples of stochastic differential equations
When the functions σ and b have a linear structure, the solution to (6.2)
admits an explicit form. This is not surprising as it is indeed the case for
ordinary differential equations. We deal with this question in this section.
More precisely, suppose that
Xt = X0 eDt , t ≥ 0.
Xt = X0 (t)eDt .
A priori X0(t) may be random. However, since e^{Dt} is differentiable, the Itô
differential of Xt is given by
dX0(t) e^{Dt} + Xt D dt.
Equating it with
Σ(t) dBt + (c(t) + Xt D) dt
yields
dX0(t) e^{Dt} + Xt D dt = Σ(t) dBt + (c(t) + Xt D) dt,
that is,
dX0(t) = e^{−Dt} [Σ(t) dBt + c(t) dt].
In integral form,
X0(t) = x + ∫₀ᵗ e^{−Ds} [Σ(s) dBs + c(s) ds].
Proof of Theorem 6.1
Let us introduce Picard's iteration scheme:
X⁰_t = x,
X^n_t = x + ∫₀ᵗ σ(s, X^{n−1}_s) dBs + ∫₀ᵗ b(s, X^{n−1}_s) ds, n ≥ 1. (6.10)
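The scheme (6.10) can be run on a fixed Brownian path on a grid (a sketch of ours, for the linear SDE dX = µX dt + σX dB with x = 1; all parameter names are ours): the sup-distances between successive iterates shrink rapidly, as in the proof below.

```python
import numpy as np

# Sketch (our illustration) of the Picard scheme (6.10) on one Brownian path.
rng = np.random.default_rng(7)
mu, sigma, x, T, n = 0.1, 0.3, 1.0, 1.0, 10_000
dt = T / n
dB = np.sqrt(dt) * rng.standard_normal(n)

X = np.full(n + 1, x)                       # X^0_t = x
gaps = []
for _ in range(10):
    incr = sigma * X[:-1] * dB + mu * X[:-1] * dt
    X_new = x + np.concatenate(([0.0], np.cumsum(incr)))
    gaps.append(np.max(np.abs(X_new - X)))  # sup_t |X^{n+1}_t - X^n_t|
    X = X_new

print(gaps)  # the sup-distances decay very fast, as in the proof
```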
Indeed, this property is clearly true if n = 0, since in this case X⁰_t is constant
and equal to x. Suppose that (6.11) holds true for n = 0, …, m − 1. By
applying Burkholder's and Hölder's inequalities, we reach
E(sup_{0≤s≤t} |X^m_s|²) ≤ C [x + E(sup_{0≤s≤t} |∫₀ˢ σ(u, X^{m−1}_u) dBu|²)
+ E(sup_{0≤s≤t} |∫₀ˢ b(u, X^{m−1}_u) du|²)]
≤ C [x + E(∫₀ᵗ |σ(u, X^{m−1}_u)|² du) + E(∫₀ᵗ |b(u, X^{m−1}_u)|² du)]
≤ C [x + E(∫₀ᵗ (1 + |X^{m−1}_u|²) du)]
≤ C [x + T + T E(sup_{0≤s≤T} |X^{m−1}_s|²)].
Indeed, consider first the case n = 0, for which we have
X¹_s − x = ∫₀ˢ σ(u, x) dBu + ∫₀ˢ b(u, x) du.
By Burkholder's inequality,
E(sup_{0≤s≤t} |∫₀ˢ σ(u, x) dBu|²) ≤ C ∫₀ᵗ |σ(u, x)|² du ≤ Ct(1 + |x|²).
Similarly,
E(sup_{0≤s≤t} |∫₀ˢ b(u, x) du|²) ≤ Ct ∫₀ᵗ |b(u, x)|² du ≤ Ct²(1 + |x|²),
with
A(t) = E(sup_{0≤s≤t} |∫₀ˢ [σ(u, X^n_u) − σ(u, X^{n−1}_u)] dBu|²),
B(t) = E(sup_{0≤s≤t} |∫₀ˢ [b(u, X^n_u) − b(u, X^{n−1}_u)] du|²).
Using first Burkholder's inequality and then Hölder's inequality along with
the Lipschitz property of the coefficient σ, we obtain
A(t) ≤ C(L) ∫₀ᵗ E(|X^n_s − X^{n−1}_s|²) ds.
Similarly, applying Hölder's inequality along with the Lipschitz property of
the coefficient b and the induction assumption yields
B(t) ≤ C(T, L) (Ct)^{n+1}/((n + 1)!).
P(sup_{0≤t≤T} |X^{n+1}_t − X^n_t| > 1/2ⁿ) ≤ 2^{2n} (Ct)^{n+1}/((n + 1)!),
In other words, for almost every ω there exists a natural number m0(ω) such
that
sup_{0≤t≤T} |X^{n+1}_t − X^n_t| ≤ 1/2ⁿ,
for any n ≥ m0(ω). The Weierstrass criterion for convergence of series of
functions then implies that
X^m_t = x + Σ_{k=0}^{m−1} [X^{k+1}_t − X^k_t]
Therefore,
|∫₀ᵗ [b(s, X^n_s) − b(s, Xs)] ds| ≤ L ∫₀ᵗ |X^n_s − Xs| ds
≤ Lt sup_{0≤s≤t} |X^n_s − Xs| → 0.
The first term on the right-hand side of this inequality converges to zero as
n → ∞. Since ε, N > 0 are arbitrary, this yields the convergence stated in
(6.13).
Summarising, by considering if necessary a subsequence {Xtnk , t ∈ [0, T ]},
we have proved the a.s. convergence, uniformly in t ∈ [0, T ], to a stochastic
process {Xt , t ∈ [0, T ]} which satisfies (6.2), and moreover
Hence, from Lemma 6.1 we conclude
E(sup_{0≤u≤T} |X1(u) − X2(u)|²) = 0,
Theorem 6.2 Assume the same assumptions as in Theorem 6.1 and suppose
in addition that the initial condition is a random variable X0 , independent of
the Brownian motion. Fix p ∈ [2, ∞) and t ∈ [0, T ]. There exists a positive
constant C = C(p, t, L) such that
E(sup_{0≤s≤t} |Xs|^p) ≤ C(1 + E|X0|^p). (6.14)
Define
ϕ(t) = E(sup_{0≤s≤t} |Xs|^p).
Theorem 6.3 The assumptions are the same as in Theorem 6.1. Then
E(sup_{0≤s≤t} |Xs(X0) − Xs(Y0)|^p) ≤ C(p, L, t) E(|X0 − Y0|^p), (6.15)
for any p ∈ [2, ∞), where C(p, L, t) is some positive constant depending on
p, L and t.
Theorem 6.4 The assumptions are the same as in Theorem 6.1. Let p ∈
[2, ∞), 0 ≤ s ≤ t ≤ T. There exists a positive constant C = C(p, L, T) such
that
E(|Xt − Xs|^p) ≤ C(p, L, T)(1 + E|X0|^p)|t − s|^{p/2}. (6.16)
E(|Xt − Xs|^p)
≤ C(p) [E(|∫ₛᵗ σ(u, Xu) dBu|^p) + E(|∫ₛᵗ b(u, Xu) du|^p)].
Burkholder's inequality and then Hölder's inequality with respect to
Lebesgue measure on [s, t] yield
E(|∫ₛᵗ σ(u, Xu) dBu|^p) ≤ C(p) E((∫ₛᵗ |σ(u, Xu)|² du)^{p/2})
≤ C(p)|t − s|^{p/2−1} E(∫ₛᵗ |σ(u, Xu)|^p du)
≤ C(p, L, T)|t − s|^{p/2−1} ∫ₛᵗ (1 + E(|Xu|^p)) du.
Remark 6.1 Assume that in Theorem 6.3 the initial conditions are deter-
ministic and are denoted by x and y, respectively. An extension of Kol-
mogorov’s continuity criterion to stochastic processes indexed by a multi-
dimensional parameter yields that the sample paths of the stochastic process
{Xt (x), t ∈ [0, T ], x ∈ Rm } are jointly Hölder continuous in (t, x) of degree
α < 12 in t and β < 1 in x, respectively.
6.4 Markov property of the solution
In Section 2.5 we discussed the Markov property of a real-valued Brownian
motion. With the obvious changes (R into Rⁿ, with arbitrary n ≥ 1), we can
see that the property extends to multi-dimensional Brownian motion. In this
section we prove that the solution to the sde (6.2) inherits the Markov prop-
erty from Brownian motion. To establish this fact we need some preliminary
results.
and
Ψ : (E × Ω, E ⊗ G) → R^m,
with ω ↦ Ψ(X(ω), ω) in L¹(Ω).
Then,
E(Ψ(X, ·) | H) = Φ(X)(·), (6.18)
with Φ(x)(·) = E(Ψ(x, ·)).
with coefficients σ and b satisfying the assumptions (H). Then, for any t ≥ u,
Y^η_t(ω) = X^{x,u}_t(ω)|_{x=η(ω)},
where in the last equality we have applied the joint continuity in (t, x) of
X^{x,u}_t.
As a consequence of the preceding lemma, we have X^{x,s}_t = X^{X^{x,s}_u, u}_t for any
0 ≤ s ≤ u ≤ t, a.s.
For any Γ ∈ B(R^m), set
Proof: According to Definition 2.3 we have to check that (6.20) defines a
Markovian transition function and that
We start by proving this identity. For this, we shall apply Lemma 6.3 in the
following setting:
where L_{X^{x,s}_u} denotes the probability law of X^{x,s}_u. By definition
Therefore,
p(s, t, x, Γ) = ∫_{R^m} p(u, t, y, Γ) p(s, u, x, dy).
X^π_t = X^π_{τ_j} + σ(τ_j, X^π_{τ_j})(Bt − B_{τ_j}) + b(τ_j, X^π_{τ_j})(t − τ_j), (7.2)
Theorem 7.1 We assume that the hypotheses (H) are satisfied. Moreover,
we suppose that there exists α ∈ (0, 1) such that
|σ(t, x) − σ(s, x)| + |b(t, x) − b(s, x)| ≤ C(1 + |x|)|t − s|^α, (7.4)
where |π| denotes the norm of the partition π and β = (1/2) ∧ α.
Proof: We shall apply the following result, which can be proved in a similar
way as Theorem 6.2:
sup_π E(sup_{0≤s≤t} |X^π_s|^p) ≤ C(p, T). (7.6)
Set
Zt = sup_{0≤s≤t} |X^π_s − Xs|.
≤ C(p, T) [E(|Zs|^p) + (1 + |x|^p)(|π|^α + |π|^{1/2})^p],
81
and a similar estimate for b. Consequently,
E(Z^p_t) ≤ C(p, T, x) [∫₀ᵗ E(Z^p_s) ds + (|π|^α + |π|^{1/2})^p T].
as n → ∞, uniformly in t ∈ [0, T].
For example, for the sequence of dyadic partitions |πn| = 2^{−n}, and for any
γ ∈ (0, β) and p ≥ 1, (7.7) holds.
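The convergence of the Euler scheme can be observed numerically (a sketch of ours for the linear SDE dX = µX dt + σX dB, X0 = 1, whose exact solution exp((µ − σ²/2)t + σBt) is available in closed form): the strong error at time T decreases as the mesh |π| shrinks.

```python
import numpy as np

# Sketch (our illustration) of the Euler scheme (7.2) for a linear SDE;
# the strong error E|X^pi_T - X_T| shrinks with the mesh.
rng = np.random.default_rng(8)
mu, sigma, T, n_paths = 0.1, 0.5, 1.0, 20_000

def strong_error(n_steps):
    dt = T / n_steps
    dB = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    X = np.ones(n_paths)
    for k in range(n_steps):
        X = X + mu * X * dt + sigma * X * dB[:, k]   # Euler step
    B_T = dB.sum(axis=1)
    exact = np.exp((mu - 0.5 * sigma**2) * T + sigma * B_T)
    return np.mean(np.abs(X - exact))

errors = [strong_error(n) for n in (50, 200, 800)]
print(errors)  # decreasing as the mesh |pi| shrinks
```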
8 Continuous time martingales
In this chapter we shall study some properties of martingales (respectively,
supermartingales and submartingales) whose sample paths are continuous.
We consider a filtration {Ft , t ≥ 0} as has been introduced in section 2.4 and
refer to definition 2.2 for the notion of martingale (respectively, supermartin-
gale, submartingale). We notice that in fact this definition can be extended
to families of random variables {Xt , t ∈ T} where T is an ordered set. In
particular, we can consider discrete time parameter processes.
We start by listing some elementary but useful properties.
1. For a martingale (respectively, supermartingale, submartingale) the
function t 7→ E(Xt ) is a constant (respectively, decreasing, increasing)
function.
2. Let {Xt , t ≥ 0} be a martingale and let f : R → R be a convex
function. Assume further that f (Xt ) ∈ L1 (Ω), for any t ≥ 0. Then
the stochastic process {f (Xt ), t ≥ 0} is a submartingale. The same
conclusion holds true for a submartingale if, additionally the convex
function f is increasing.
The first assertion follows easily from property (c) of the conditional ex-
pectation. The second assertion can be proved using Jensen’s inequality, as
follows. Assume first that {Xt , t ≥ 0} is a martingale, and fix 0 ≤ s ≤ t.
Then by applying the function f to the identity E(Xt |Fs ) = Xs along with
the convexity of f , we obtain
E (f (Xt )|Fs ) ≥ f (E(Xt |Fs )) = f (Xs ).
If {Xt , t ≥ 0} is a submartingale, we consider the inequality E(Xt |Fs ) ≥ Xs .
Since f is increasing and convex, we have
E (f (Xt )|Fs ) ≥ f (E(Xt |Fs )) ≥ f (Xs ).
Proof: Consider the stopping time
T = inf{n : Xn ≥ λ} ∧ N.
Then
E(XN) ≥ E(XT) = E(XT 1_{{sup_n Xn ≥ λ}}) + E(XT 1_{{sup_n Xn < λ}})
≥ λ P(sup_n Xn ≥ λ) + E(XN 1_{{sup_n Xn < λ}}).
By subtracting E(XN 1_{{sup_n Xn < λ}}) from the first and last terms above, we
obtain the first inequality of (8.1). The second one is obvious.
As a consequence of this proposition we have the following.
Proof: Without loss of generality, we may assume that E(|XN|^p) < ∞, since
otherwise (8.2) holds trivially.
According to property 2 above, the process {|Xn|^p, 0 ≤ n ≤ N} is a sub-
martingale and then, by Proposition 8.1 applied to the process |Xn|^p,
µ P(sup_n |Xn|^p ≥ µ) = µ P(sup_n |Xn| ≥ µ^{1/p})
= λ^p P(sup_n |Xn| ≥ λ) ≤ E(|XN|^p),
where for any µ > 0 we have written λ = µ^{1/p}.
We now prove the second inequality of (8.3); the first is obvious.
Set X* = sup_n |Xn|, for which we have
Fix k > 0. Fubini's theorem yields
E((X* ∧ k)^p) = E(∫₀^{X*∧k} p λ^{p−1} dλ)
= ∫_Ω dP ∫₀^∞ 1_{{λ ≤ X*∧k}} p λ^{p−1} dλ
= p ∫₀ᵏ dλ λ^{p−1} ∫_{{λ ≤ X*}} dP
= p ∫₀ᵏ dλ λ^{p−2} λ P(X* ≥ λ)
≤ p ∫₀ᵏ dλ λ^{p−2} E(|XN| 1_{{X* ≥ λ}})
= p E(|XN| ∫₀^{k∧X*} λ^{p−2} dλ)
= (p/(p − 1)) E(|XN| (X* ∧ k)^{p−1}).
Applying Hölder's inequality with exponents p/(p − 1) and p yields
E((X* ∧ k)^p) ≤ (p/(p − 1)) [E((X* ∧ k)^p)]^{(p−1)/p} [E(|XN|^p)]^{1/p}.
Consequently,
[E((X* ∧ k)^p)]^{1/p} ≤ (p/(p − 1)) [E(|XN|^p)]^{1/p}.
Letting k → ∞ and using monotone convergence, we end the proof.
It is not difficult to extend the above results to martingales (submartingales)
with continuous sample paths. In fact, for a given T > 0 we define
D = Q ∩ [0, T],
Dn = D ∩ {k/2ⁿ, k ∈ Z₊},
and
E(sup_{t∈D} |Xt|^p) ≤ (p/(p − 1))^p sup_{t∈D} E(|Xt|^p), p ∈ ]1, ∞).
By the continuity of the sample paths we can finally state the following result.
Theorem 8.1 Let {Xt , t ∈ [0, T ]} be either a continuous martingale or a
continuous positive submartingale. Then
λ^p P(sup_{t∈[0,T]} |Xt| ≥ λ) ≤ sup_{t∈[0,T]} E(|Xt|^p), p ∈ [1, ∞), (8.4)
E(sup_{t∈[0,T]} |Xt|^p) ≤ (p/(p − 1))^p sup_{t∈[0,T]} E(|Xt|^p) (8.5)
= (p/(p − 1))^p E(|XT|^p), p ∈ ]1, ∞). (8.6)
Inequality (8.4) is termed Doob’s maximal inequality, while (8.6) is called
Doob’s Lp inequality.
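Doob's L² inequality can be observed on simulated Brownian paths (a sketch of ours, p = 2, M = B): the empirical value of E(sup_t |Bt|²) stays below 4 E(|BT|²) = 4T.

```python
import numpy as np

# Sketch (our illustration) of Doob's L^p inequality (8.6) with p = 2 for
# the continuous martingale M_t = B_t on [0, T].
rng = np.random.default_rng(9)
T, n_steps, n_paths = 1.0, 1_000, 10_000
dB = np.sqrt(T / n_steps) * rng.standard_normal((n_paths, n_steps))
B = np.cumsum(dB, axis=1)

lhs = np.mean(np.abs(B).max(axis=1)**2)  # E(sup_t |B_t|^2)
rhs = 4 * np.mean(B[:, -1]**2)           # (p/(p-1))^p E(|B_T|^2) = 4T
print(lhs, rhs)                          # lhs should not exceed rhs
```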
Hence, (8.7) follows. The proof of (8.8) follows by similar arguments.
The next statement gives an idea of the roughness of the sample paths of a
continuous local martingale.
Proposition 8.3 Let N be a continuous bounded martingale, null at zero
and with sample paths of bounded variation, a.s. Then N is indistinguishable
from the constant process 0.
Proof: Fix t > 0 and consider a partition 0 = t0 < t1 < · · · < tp = t of [0, t].
Then
E(N²_t) = E(Σ_{i=1}^p [N²_{t_i} − N²_{t_{i−1}}])
= Σ_{i=1}^p E([N_{t_i} − N_{t_{i−1}}]²)
≤ E(sup_i |N_{t_i} − N_{t_{i−1}}| Σ_{i=1}^p |N_{t_i} − N_{t_{i−1}}|)
≤ C E(sup_i |N_{t_i} − N_{t_{i−1}}|),
Proof: Uniqueness follows from Proposition 8.3. Indeed, assume there were
two increasing processes ⟨M⟩₁ and ⟨M⟩₂ such that M² − ⟨M⟩ᵢ, i = 1, 2, is a
continuous martingale. By taking the difference of these two processes we get
that ⟨M⟩₁ − ⟨M⟩₂ is of bounded variation and, at the same time, a continuous
martingale. Hence ⟨M⟩₁ and ⟨M⟩₂ are indistinguishable.
The next objective is to prove that (⟨M⟩ⁿ_t, n ≥ 1), t ∈ [0, T], is, uniformly in
t, a Cauchy sequence in probability.
Let m > n. We have the following:
E(|⟨M⟩ⁿ_t − ⟨M⟩ᵐ_t|²)
= E[(Σ_{k=1}^{p_n} ((Δⁿ_k M)²_t − Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δᵐ_j M)²_t))²]
= 4 E[(Σ_{k=1}^{p_n} Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δᵐ_j M)_t (M_{t^m_j} − M_{t^n_{k−1}}))²]
= 4 E[Σ_{k=1}^{p_n} Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δᵐ_j M)²_t (M_{t^m_j} − M_{t^n_{k−1}})²]
≤ 4 E[sup_k sup_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} |M_{t^m_j} − M_{t^n_{k−1}}|² Σ_{j=1}^{p_m} (Δᵐ_j M)²_t]
≤ 4 [E(sup_k sup_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} |M_{t^m_j} − M_{t^n_{k−1}}|⁴)]^{1/2} [E((Σ_{j=1}^{p_m} (Δᵐ_j M)²_t)²)]^{1/2}.
Let us now consider the last expression. The first factor tends to zero as n
and m tend to infinity, because M is continuous and bounded. The second
factor is easily seen to be bounded uniformly in m. Thus we have proved
lim_{n,m→∞} E(|⟨M⟩ⁿ_t − ⟨M⟩ᵐ_t|²) = 0, (8.10)
Consequently,
P{sup_{0≤t≤T} |⟨M⟩ⁿ_t − ⟨M⟩ᵐ_t| > ε} ≤ ε⁻² E(sup_{0≤t≤T} |⟨M⟩ⁿ_t − ⟨M⟩ᵐ_t|²).
This last expression tends to zero as n, m tend to infinity. Since the con-
vergence in probability is metrizable, we see that there exists a process ⟨M⟩
satisfying the required conditions.
uniformly in t ∈ [0, T], in probability.
This result together with Schwarz's inequality implies
|⟨M, N⟩_t| ≤ √(⟨M⟩_t ⟨N⟩_t).
Proof: Consider a finite partition of [0, t], given by {t0 = 0 < t1 < … < tr =
t}, and assume first that the stochastic processes H, K are step processes
described by this partition, as follows:
Thus,
|∫₀ᵗ Hs Ks d⟨M, N⟩_s| ≤ Σ_{i=1}^r |Hi Ki| |⟨M, N⟩_{t_{i+1}} − ⟨M, N⟩_{t_i}|
≤ Σ_{i=1}^r |Hi Ki| (⟨M⟩_{t_{i+1}} − ⟨M⟩_{t_i})^{1/2} (⟨N⟩_{t_{i+1}} − ⟨N⟩_{t_i})^{1/2}.
The bounded variation process ⟨M, N⟩ gives rise to the total variation mea-
sure d‖⟨M, N⟩‖_s defined by
⟨M, N⟩_s = ⟨M, N⟩⁺_s − ⟨M, N⟩⁻_s.
∫₀ᵗ |Hs| |Ks| ‖d⟨M, N⟩_s‖ ≤ (∫₀ᵗ H²_s d⟨M⟩_s)^{1/2} (∫₀ᵗ K²_s d⟨N⟩_s)^{1/2}. (8.14)
9 Stochastic integrals with respect to contin-
uous martingales
This chapter aims to give an outline of the main ideas of the extension of
the Itô stochastic integral to integrators which are continuous martingales.
We start by describing precisely the spaces involved in the construction of
such a notion. Throughout the chapter, we consider a fixed probability space
(Ω, F, P ) endowed with a filtration (Ft , t ≥ 0).
We denote by H2 the space of continuous martingales M , indexed by [0, T ],
with M0 = 0 a.s. and bounded in L2 (Ω). That is,
sup E(|Mt |2 ) < ∞.
t∈[0,T ]
The standard Brownian motion belongs to the space H2 and the space L2 (M )
will play the same role as L2a,T in the Itô theory of stochastic integration with
respect to Brownian motion.
Let E be the linear subspace of L²(M) consisting of processes of the form
Hs(ω) = Σ_{i=0}^p Hi(ω) 1_{]t_i, t_{i+1}]}(s), (9.1)
where 0 = t0 < t1 < … < t_{p+1}, and for each i, Hi is an F_{t_i}-measurable,
bounded random variable.
Stochastic processes belonging to E are termed elementary. They are related
to L²(M) as follows.
Proposition 9.1 Fix M ∈ H2 . The set E is dense in L2 (M ).
Proof: We will prove that if K ∈ L2 (M ) and is orthogonal to E then K = 0.
For this, we fix 0 ≤ s < t ≤ T and consider the process
H = F 1]s,t] ,
We have thus proved that E (F (Xt − Xs )) = 0, for any 0 ≤ s < t and any
Fs -measurable and bounded random variable F . This shows that the process
(Xt , t ≥ 0) is a martingale. At the same time, (Xt , t ≥ 0) is also a process of
bounded variation. Hence
∫₀ᵗ Ku d⟨M⟩u = 0, ∀t ≥ 0.
Then
(i) H.M ∈ H2 .
Proof of (i): The martingale property follows from the measurability proper-
ties of H and the martingale property of M . Moreover, since H is bounded,
(H.M ) is bounded in L2 (Ω).
H ∈ E 7→ H.M
is an isometry from E to H2 .
Clearly H ↦ H.M is linear. Moreover, H.M is a finite sum of terms,
each one being a martingale, orthogonal to each other. It is easy to check
that
⟨M^i⟩_t = H²_i (⟨M⟩_{t_{i+1}∧t} − ⟨M⟩_{t_i∧t}).
Hence,
⟨H.M⟩_t = Σ_{i=0}^p H²_i (⟨M⟩_{t_{i+1}∧t} − ⟨M⟩_{t_i∧t}).
Consequently,
E⟨H.M⟩_T = ‖H.M‖²_{H²} = E[Σ_{i=0}^p H²_i (⟨M⟩_{t_{i+1}∧T} − ⟨M⟩_{t_i∧T})]
= E(∫₀ᵀ H²_s d⟨M⟩_s)
= ‖H‖²_{L²(M)}.
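The isometry can be checked by simulation for M = B, where ⟨M⟩_t = t (a sketch of ours with a three-step elementary process; the choice Hi = cos(B_{t_i}) is ours, just a bounded F_{t_i}-measurable variable):

```python
import numpy as np

# Sketch (our illustration) of the isometry for an elementary process
# integrated against B: E((H.B)_T^2) = E(int_0^T H_s^2 d<B>_s).
rng = np.random.default_rng(10)
t_grid = np.array([0.0, 0.3, 0.6, 1.0])
n_paths = 200_000

dB = rng.standard_normal((n_paths, 3)) * np.sqrt(np.diff(t_grid))
B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

H = np.cos(B[:, :3])                  # H_i = cos(B_{t_i}), F_{t_i}-measurable
stoch_int = np.sum(H * dB, axis=1)    # (H.B)_T
lhs = np.mean(stoch_int**2)
rhs = np.mean(np.sum(H**2 * np.diff(t_grid), axis=1))
print(lhs, rhs)                       # should be close
```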
10 Appendix 1: Conditional expectation
Roughly speaking, a conditional expectation of a random variable is the
mean value with respect to a modified probability after having incorporated
some a priori information. The simplest case corresponds to conditioning
with respect to an event B ∈ F. In this case, the conditional expectation is
the mathematical expectation computed on the modified probability space
(Ω, F, P (·/B)).
However, in general, additional information cannot be described so easily.
If we know about some events B1, …, Bn, we also know about those that
can be derived from them, like unions, intersections and complements. This
explains the choice of a σ-field to store the known information and to work
with it.
In the sequel, we denote by G an arbitrary σ-field included in F and by X
a random variable with finite expectation (X ∈ L1 (Ω)). Our final aim is to
give a definition of the conditional expectation of X given G. However, in
order to motivate this notion, we shall start with more simple situations.
Conditional expectation given an event
Let B ∈ F be such that P (B) 6= 0. The conditional expectation of X given
B is the real number defined by the formula
1
E(X/B) = E(11B X). (10.1)
P (B)
• E(X/Ω) = E(X),
With the definition (10.1), the conditional expectation coincides with the
expectation with respect to the conditional probability P(·/B). We check
this fact with a discrete random variable X = Σ_{i=1}^∞ a_i 1_{A_i}. Indeed,
E(X/B) = (1/P(B)) E(Σ_{i=1}^∞ a_i 1_{A_i∩B}) = Σ_{i=1}^∞ a_i P(A_i ∩ B)/P(B)
= Σ_{i=1}^∞ a_i P(A_i/B).
Conditional expectation given a discrete random variable
Let Y = Σ_{i=1}^∞ y_i 1_{A_i}, with A_i = {Y = y_i}. The conditional expectation of X given
Y is the random variable defined by
E(X/Y) = Σ_{i=1}^∞ E(X/Y = y_i) 1_{A_i}. (10.2)
Notice that knowing Y means knowing all the events that can be described
in terms of Y. Since Y is discrete, they can be described in terms of the
basic events {Y = y_i}. This may explain the formula (10.2).
The following properties hold:
(a) E (E(X/Y )) = E(X);
Proof: Set ci = E(X/{Y = yi }) and let B ∈ B. Then
2. for any G ∈ G,
E(Z11G ) = E(X11G ).
The existence of E(X/G) is not a trivial issue. You should trust mathe-
maticians and believe that there is a theorem in measure theory -the Radon-
Nikodym Theorem- which ensures the existence and uniqueness of such a
random variable (out of a set of probability zero).
Before stating properties of the conditional expectation, we are going to
explain how to compute it in two particular situations.
Example 10.1 Let G be the σ-field (actually, the field) generated by a finite
partition G1, …, Gm. Then
E(X/G) = Σ_{j=1}^m (E(X 1_{G_j}) / P(G_j)) 1_{G_j}. (10.3)
Formula (10.3) can be checked using Definition 10.1. It tells us that, on
each generator of G, the conditional expectation is constant: the mean value of X
on that generator, E(X 1_{G_j})/P(G_j).
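Formula (10.3) is easy to reproduce numerically (a sketch of ours; the partition of (0, 1) into thirds and X = U² are our choices): the resulting variable takes one constant value per generator and, as in property (a)/(c), has the same mean as X.

```python
import numpy as np

# Sketch (our illustration) of (10.3): E(X/G) is constant on each cell G_j
# of a finite partition, equal to E(X 1_{G_j}) / P(G_j).
rng = np.random.default_rng(11)
U = rng.uniform(size=100_000)
X = U**2                                   # a random variable on (0, 1)
labels = np.digitize(U, [1/3, 2/3])        # partition G_0, G_1, G_2 of Omega

cond_exp = np.empty_like(X)
for j in range(3):
    mask = labels == j
    cond_exp[mask] = X[mask].mean()        # empirical E(X 1_{G_j}) / P(G_j)

print(np.unique(np.round(cond_exp, 4)))    # three constants, one per cell
print(cond_exp.mean(), X.mean())           # same mean, as in property (c)
```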
with
f(x/y1, …, ym) = f(x, y1, …, ym) / ∫_{−∞}^{∞} f(x, y1, …, ym) dx. (10.5)
(c) The mean value of a random variable is the same as that of its conditional
expectation: E(E(X/G)) = E(X).
(e) Let X be independent of G, meaning that any set of the form X −1 (B),
B ∈ B is independent of G. Then E(X/G) = E(X).
(h) Assume that X is a random variable independent of G and Z another
G-measurable random variable. For any measurable function h(x, z)
such that the random variable h(X, Z) is in L1 (Ω),
E(Z1 1G ) ≤ E(Z2 1G ),
The validity of the property extends by linearity to simple random variables.
Then, by monotone convergence, to positive random variables and, finally,
to random variables in L1 (Ω), by the usual decomposition X = X + − X − .
For the proof of (g), we notice that since E(X/G1 ) is G1 -measurable, it is
G2 -measurable as well. Then, by the very definition of the conditional expec-
tation,
E(E(X/G1 )/G2 ) = E(X/G1 ).
Next, we prove that
For this, we fix G ∈ G1 and we apply the definition of the conditional expec-
tation. This yields
Moreover,
Therefore,
E(h(X, z))|_{z=Z} = 1_B(Z) E(1_A(X)).
{T ≤ t} ∈ Ft .
It is easy to see that if S and T are stopping times with respect to the same
filtration then T ∧ S, T ∨ S and T + S are also stopping times.
Definition 11.2 For a given stopping time T , the σ-field of events prior to
T is the following
A^c ∩ {T ≤ t} = {T ≤ t} ∩ (A ∩ {T ≤ t})^c ∈ F_t.
(∪_{n=1}^∞ A_n) ∩ {T ≤ t} = ∪_{n=1}^∞ (A_n ∩ {T ≤ t}) ∈ F_t.
{T ≤ s} ∩ {T ≤ t} = {T ≤ s ∧ t} ∈ F_{s∧t} ⊂ F_t.
Let us now check that for any s ≥ 0, the random variable Xs 1{s<T } is
FT -measurable. This fact along with the property {T ≤ (i + 1)2−n } ∈
FT shows the result.
Let A ∈ B(R) and t ≥ 0. The set
{Xs ∈ A} ∩ {s < T } ∩ {T ≤ t}
References
[1] P. Baldi: Equazioni differenziali stocastiche e applicazioni. Quaderni
dell'Unione Matematica Italiana 28. Pitagora Editrice, Bologna, 2000.