Stochan
Stochan
and H. Maassen
Mathematical Institute
University of Nijmegen
Toernooiveld 1, 6525 ED Nijmegen
The Netherlands
1
Contents
1 Introduction 3
2 Brownian motion 7
2.1 Construction of Brownian Motion 7
2.2 Non-smoothness of paths 11
2.3 More Brownian motion 14
3 The Itô-integral 16
3.1 Step functions 17
3.2 Arbitrary functions 19
3.3 Martingales 22
3.4 Continuity of paths 23
4 Stochastic integrals and the Itô-formula 25
4.1 The one-dimensional Itô-formula 25
4.2 Some examples 27
4.3 The multi-dimensional Itô-formula 28
4.4 Local times of Brownian motion 30
5 The Martingale Representation Theorem 33
6 Stochastic differential equations 38
6.1 Strong solutions 38
6.2 Weak solutions 41
7 Itô-diffusions and one-parameter semigroups 43
7.1 Introduction and motivation 43
7.2 Basic properties 43
7.3 Generalities on generators 47
7.4 Applications 49
8 Transformations of diffusions 51
8.1 The Feynman-Kac formula 51
8.2 The Cameron-Martin formula 52
8.3 Killing and drift 54
9 The Black and Scholes option pricing formula. 56
9.1 Stocks, bonds and stock options 56
9.2 The martingale case 57
9.3 The effect of stock trading: the case μ = 0 58
9.4 Motivation 59
9.5 Results 60
9.6 Inclusion of the interest rate: r = 0 61
2
1 Introduction
In stochastic analysis one studies random functions of one variable and various kinds of
integrals and derivatives thereof. The argument of these functions is usually interpreted
as ‘time’, so the functions themselves can be thought of as the path of a random process.
Here, like in other areas of mathematics, going from the discrete to the continuous yields
a pay-off in simplicity and smoothness, at the price
t of a formally more complicated analy-
n
sis. Compare, to make an analogy, the integral 0 x 3 dx with he sum k=1 k3 . The integral
requires a more refined analysis for its definition and its properties, but once this has
been done the integral is easier to calculate. Similarly, in stochastic analysis you will be-
come acquainted with a convenient differential calculus as a reward for some hard work
in analysis.
Stochastic analysis can be applied in a wide variety of situations. We sketch a few exam-
ples below.
1. Some differential equations become more realistic when we allow some random-
ness in their coefficients. Consider for example the following growth equation, used
among other places in population biology:
d
St = (r + “Nt ”)St . (1)
dt
Here, St is the size of the population at time t, r is the average growth rate of the
population, and the “noise” Nt models random fluctuations in the growth rate.
2. At time t = 0 an investor buys stocks and bonds on the financial market, i.e., he
divides his initial capital C0 into A0 shares of stock and B0 shares of bonds. The
bonds will yield a guaranteed interest rate r . If we assume that the stock price St
satisfies the growth equation (1), then his capital Ct at time t is
Ct = At St + Bt er t , (2)
where At and Bt are the amounts of stocks and bonds held at time t. With a keen eye
on the market the investor sells stocks to buy bonds and vice versa. If his tradings
are ‘self-financing’, then dCt = At dSt + Bt d(er t ). An interesting question is:
- What would he be prepared to pay for a so-called European call option, i.e.,
the right (bought at time 0) to purchase at time T > 0 a share of stock at a
predetermined price K?
The rational answer, q say, was found by Black and Scholes (1973) through an anal-
ysis of the possible strategies leading from an initial investment q to a payoff CT .
Their formula is being used on the stock markets all over the world.
3. The Langevin equation describes the behaviour of a dust particle suspended in a
fluid:
d
m Vt = −ηVt + “Nt ”. (3)
dt
Here, Vt is the velocity at time t of the dust particle, the friction exerted on the
particle due to the viscosity η of the fluid is −ηVt , and the “noise” Nt stands for the
disturbance due to the thermal motion of the surrounding fluid molecules colliding
with the particle.
3
4. The path of the dust particle in example 3 is observed with some inaccuracy. One
measures the perturbed signal Z(t) given by
Zt = Vt + “Ñt ”. (4)
Here Ñt is again a “noise”. One is interested in the best guess for the actual value
of Vt , given the observation Zs for 0 ≤ s ≤ t. This is called a filtering problem: how
to filter away the noise Ñt . Kalman and Bucy (1961) found a linear algorithm, which
was almost immediately applied in aerospace engineering. Filtering theory is now a
flourishing and extremely useful discipline.
5. Stochastic analysis can help solve boundary value problems such as the Dirichlet
problem. If the value of a harmonic function f on the boundary of some bounded
regular region D ⊂ Rn is known, then one can express the value of f in the interior
of D as follows:
E f Bτx = f (x), (5)
t
where Btx := x + 0 Nt dt is an “integrated noise” or Brownian motion, starting at x,
and τ denotes the time when this Brownian motion first reaches the boundary. (A
harmonic function f is a function satisfying Δf = 0 with Δ the Laplacian.)
The goal of this course is to make sense of the above equations, and to work with them.
In all the above examples the unexplained symbol Nt occurs, which is to be thought of
as a “completely random” function of t, in other words, the continuous time analogue of
a sequence of independent identically distributed random variables. In a first attempt to
catch this concept, let us formulate the following requirements:
1. Nt is independent of Ns for t ≠ s;
2. The random variables Nt (t ≥ 0) all have the same probability distribution μ;
3. E (Nt ) = 0.
However, when taken literally these requirements do not produce what we want. This is
seen by the following argument. By requirement 1 we have for every point in time an
independent value of Nt . We shall show that such a “continuous i.i.d. sequence” Nt is not
measurable in t, unless it is identically 0.
Let μ denote the probability distribution of Nt , which by requirement 2 does not depend
on t, i.e., μ([a, b]) := P[a ≤ Nt ≤ b]. Divide R into two half lines, one extending from a
to −∞ and the other extending from a to ∞. If Nt is not a constant function of t, then
there must be a value of a such that each of the half lines has positive measure. So
Now consider the set of time points where the noise Nt is low: E := { t ≥ 0 : Nt ≤ a }.
It can be shown that with probability 1 the set E is not Lebesgue measurable. Without
giving a full proof we can understand this as follows. Let λ denote the Lebesgue measure
on R. If E would be measurable, then by requirement 1 and Eq. (6) it would be reasonable
to expect its relative share in any interval (c, d) to be p, i.e.,
4
On the other hand, it is known from measure theory that every measurable set E is
arbitrarily thick somewhere with respect to the Lebesgue measure λ, i.e., for all α < 1 an
interval (c, d) can be found such that
(cf. Halmos (1974) Th. III.16.A). This clearly contradicts Eq. (7), so E is not measurable.
This is a bad property of Nt : for, in view of (1), (3), (4) and (5), we would like to integrate
Nt .
For this reason, let us approach the problem from another angle. Instead of Nt , let us
consider the integral of Nt , and give it a name:
t
Bt := Ns ds.
0
The three requirements on the evasive object Nt then translate into three quite sensible
requirements for Bt .
BM1. For 0 = t0 ≤ t1 ≤ · · · ≤ tn the random variables Btj+1 − Btj (j = 0, . . . , n − 1) are
independent;
BM2. Bt has stationary increments, i.e., the joint probability distribution of
Bt1 +s − Bu1 +s , Bt2 +s − Bu2 +s , . . . , Btn +s − Bun +s
Excercise. 1.1 Show that BM5 implies the following: For any ε > 0
as n → ∞.
Exercise 1.1 helps us to specify the increments of Brownian motion in the following way.
Excercise. 1.2 Suppose BM1, BM2, BM4 and (8) hold. Apply the Central Limit Theorem
(Lindeberg’s condition) to
Xn,k := B kt − B (k−1)t
n n
5
and conclude that Bt − Bs , s < t has a normal distribution with variance t − s, i.e.
1 x2
P (Bs+t − Bs ∈ A) = √ e− 2t dx.
2π t A
Introducing
BM 2’. If s, t ≥ 0 then
1 x2
P (Bs+t − Bs ∈ A) = √ e− 2t dx.
2π t A
6
2 Brownian motion
t−s t−s
P [Bs ∈ dx, Bθ ∈ dy, Bt ∈ dz] = p(s, 0, x)p( , x, y)p( , y, z)dx dy dz
2 2
1 (y −μ)2
−
= p(s, 0, x)p(t − s, x, z) · √ e 2σ 2 dx dy dz
σ 2π
we obtain
1 (y −μ)2
−
P [Bθ ∈ dy|Bs ∈ dx, Bt ∈ dz] = √ e 2σ 2 dy,
σ 2π
7
which is our claim.
This suggests that we might be able to construct Brownian motion on [0, 1] by interpola-
tion.
(n)
To carry out this program, we begin with a sequence {ξk , k ∈ I(n), n ∈ N0 } of indepen-
dent, standard normal random variables on some probability space (Ω, F , P ). Here
denotes the set of odd, positive integers less than 2n . For each n ∈ N0 we define a process
(n)
B (n) := {Bt : 0 ≤ t ≤ 1} by recursion and linear interpolation of the preceeding process,
(n) (n−1)
as follows. For n ∈ N, Bk/2n−1 will agree with Bk/2n−1 , for all k = 0, 1, . . . , 2n−1 . Thus for
(n)
each n we only need to specify the values of Bk/2n for k ∈ I(n). We start with
(n)
We shall show that, almost surely, Bt converges uniformly in t to a continuous function
Bt (as n → ∞) and that Bt is a Brownian motion.
We start with giving a more convenient representation of the processes B (n) , n = 0, 1, . . . .
We define the following Haar functions by H10 (t) ≡ 1, and for n ∈ N, k ∈ I(n)
⎧
⎪
⎪ (n−1)/2 , k−1 k
⎨ 2 2n ≤ t < 2n
(n) k k+1
Hk (t) := −2(n−1)/2 , 2n ≤ t < 2n
⎪
⎪
⎩ 0 otherwise.
(0) (n)
Note that S1 (t) = t, and that for n ≥ 1 the graphs of Sk are little tents of height
2−(n+1)/2 centered at k/2n and non overlapping for different values of k ∈ I(n). Clearly,
(0) (0) (0)
Bt = ξ1 S1 (t), and by induction on n, it is readily verified that
(n)
n (m) (m)
Bt (ω) = ξk (ω)Sk (t), 0 ≤ t ≤ 1, n ∈ N. (9)
m=0 k∈I(m)
(n)
Lemma 2.1 As n → ∞, the sequence of functions {Bt (ω), 0 ≤ t ≤ 1}, n ∈ N0 , given by
(9) converges uniformly in t to a continuous function {Bt (ω), 0 ≤ t ≤ 1} for almost every
ω ∈ Ω.
8
(n)
Proof. Let bn := maxk∈I(n) |ξk |. Oberserve that for x > 0 and each n, k
∞
2 2 /2
e−u
(n)
P (|ξk | > x) = du
π x
∞
2 u −u2 /2 2 1 −x 2 /2
≤ e du = e ,
π x x πx
which gives
(n) (n) 2 2n −n2 /2
P (bn > n) = P( {|ξk | > n}) ≤ 2n P (|ξ1 | > n) ≤ e ,
π n
k∈I(n)
, the Borel-Cantelli-Lemma implies that there is a set Ω̃ with P (Ω̃) = 1 such that for ω ∈ Ω̃
there is an n0 (ω) such that for all n ≥ n0 (ω) it holds true that bn (ω) ≤ n. But then
n2−(n+1)/2 < ∞;
(n) (n)
|ξk (ω)Sk (t)| ≤
n≥n0 (ω) k∈I(n) n≥n0 (ω)
(n)
so for ω ∈ Ω̃, Bt (ω) converges uniformly in t to a limit Bt . The uniformity of the
convergence implies the conitunuity of the limit Bt .
The following exercise facilitates the construction of Brownian motion substantially:
holds true.
9
Theorem 2.2 With the above notations
(n)
Bt := lim Bt
n→∞
Proof. In view of our definition of Brownian motion it suffices to prove that for 0 = t0 <
t1 . . . < tn ≤ 1, the increments (Btj − Btj−1 )j=1,... ,n are independent, normally distributed
with mean zero and variance (tj − tj−1 ). For this we will show that the Fourier √ transforms
satisfy the appropriate condition, namely that for λj ∈ R (and as usual i := −1)
n
n
1
E exp i λj (Btj − Btj−1 ) = exp − λ2j (tj − tj−1 ) . (12)
2
j=1 j=1
To derive (12) it is most natural to exploit the construction of Bt form Gaussian random
(n)
variables. Set λn+1 = 0 and use the independence and normality of the ξk to compute
for M ∈ N
n
(M)
E exp −i (λj+1 − λj )Btj
j=1
M
n
(m) (m)
= E exp −i ξk (λj+1 − λj )Sk (tj )
m=0 k∈I(m) j=1
M
n
(m) (m)
= E exp −iξk (λj+1 − λj )Sk (tj )
m=0 k∈I(m) j=1
M 1
n
(m) 2
= exp − (λj+1 − λj )Sk (tj )
m=0 k∈I(m)
2
j=1
1
n n M
(m) (m)
= exp − (λj+1 − λj )(λl+1 − λl ) Sk (tj )Sk (tl )
2 m=0
j=1 l=1 k∈I(m)
10
Then
n−1
L2 − lim (Bt(n) − Bt(n) )2 = T .
n→∞ j+1 j
j=0
(n) (n)
Proof. Abbreviate ΔBj = Bt(n) − Bt(n) and Δtj = tj+1 − tj . Let δn = maxj Δtj . Then
j+1 j
2 2
(ΔBj )2 − T = E (ΔBj )2 − T
j j
=E (ΔBi )2 (ΔBj )2 − 2T E (ΔBj )2 + T 2
i,j j
= E((ΔBj )4 ) + E((ΔBi )2 )E((ΔBj )2 ) − 2T Δtj + T 2
j i≠j j
2 2
= 3(Δtj ) + (Δti )(Δtj ) − T
j i≠j
=2 (Δtj )2
j
≤ 2δn Δtj
j
= 2δn T ,
where again we have used the
fact that the fourth moment of a centered Gaussian random
variable ξ is given by E ξ 4 = 3 Var(ξ)2 . Let n → ∞ to get the claim.
We may write the message of Lemma 2.4 symbolically as
(dBt )2 = dt,
11
saying that ‘Brownian motion has quadratic variation growing linearly with time’. This ex-
pression will acquire a precise meaning during the sequel of this course. For the moment,
let us just say that Bt has large fluctuations at a small scale, namely
dBt is of order dt dt.
To prove the infinite variation part of the above Theorem 2.3 we need one more prepara-
tory lemma, which applies to a general sequence of random variables.
∞ p
Proof. Choose a subsequence such that k=1 E(Xnk ) < ∞. By Chebyshev’s inequality
we have for all m ∈ N,
p
1 p
P Xnk ≥ ≤ m E Xnk ,
m
n−1 2
lim B(nk ) (ω) − B(n ) (ω) = T .
k→∞ tj+1 tj k
j=0
12
Then limk→∞ εnk = 0 by the uniform continuity of t
→ Bt . It follows that
k −1
n k −1
n
1 1
|ΔBj | ≥ |ΔBj |2 ∼ T →∞ as k → ∞.
j=0 j=0
εnk εnk
To derive form here that also the paths of Brownian motion are almost surely nowhere
differentiable, we need the following
Excercise. 2.2 Let (Bt )0≤t≤T be a Brownian motion on [0.T ]. Then for each c > 0 the
following stochastic process
(c · Bt/c 2 )0≤t≤T
is a Brownian motion on [0, T /c 2 ].
Now we are ready to proof that the paths of Brownian motion are almost surely nowhere
differentiable.
Proof. Let Xn,k := maxj=k,k+1,k+2 |B j − B j−1 |. For ε > 0 we have
2n 2n
where the second step follows from the above exercise. Thus for Yn := mink≤T ·2n Xn,k we
obtain
Denote
A := {ω ∈ Ω : t
→ Bt (ω) is differentiable somewhere}.
Let ω ∈ A, t
→ Bt (ω) be differentiable in t0 := t0 (ω), and let D denote its derivative.
Then there exists δ := δ(ω, t0 ) such that
1 δ
n
< , n0 > (|D| + 1) and n0 > t0 .
2 0 2
k k+1
For n ≥ n0 choose k such that 2n ≤ t0 < 2n . Then
j
|t0 − |<δ for j = k, k + 1, k + 2.
2n
Thus
1 n
Xn,k (ω) ≤ (|D| + 1) n
≤ n,
2 2
n n
and, since n > t0 > k/2n , also Yn (ω) ≤ 2n . Therefore A ⊂ An := {Yn (ω) ≤ 2n } for n
large enough and hence also
A ⊆ lim inf An .
13
But (13) implies
P (An ) ≤ n2n (2n/2+1 n2−n )3 < ∞
n n
as n → ∞, such that P (lim inf An ) = 0. Thus almost surely t → Bt (ω) is nowhere differ-
entiable.
E(Xt |Ft ) ≥ Xs
(respectively,
E(Xt |Ft ) ≤ Xs ).
We say that it is a martingale if is both, a supermartingale and a submartingale.
Excercise. 2.3 Prove that Brownian motion (Bt )0≤t≤T together with the canconical filtra-
tion
Ft ) := σ {Xs , 0 ≤ s ≤ t}
is a martingale.
Once we have construced Brownian motion in one dimension another natural question to
ask is, whether there is a multidimensional analogue to it. The following defintion seems
most natural.
14
4. P -a.s. the paths of t
→ Bt are continuous.
Note that the above definition implies that the coordinate processes of a d-dimensional
Brownian motion are a one dimensional Brownian motion.
This concludes the construction of Brownian motion. In the next sections we shall see
that Brownian motion is the building block of stochastic analysis.
It turns out that all random variables in L2 (Ω, P) can be represented in a natural way as
integrals of products of increments of Brownian motion (‘Wiener chaos expansion’). This
shows that Brownian motion is really the basic random process in L2 (Ω, P).
15
3 The Itô-integral
for some random function f which, of course, still would be a random object. The
Lebesgue-Stieltjes integral roughly follows the following idea. In the construction of the
Lebesgue (or Riemann) integral we give each interval I a weight which is equivalent to its
length |I|. Now a natural generalization of this concept is to assign a weight to I that de-
pends on its location. This can happen in the following way: Take a montonely increasing
function
G : [0, T ] → R
with G(0) = 0 and for a continuous function
f : [0, T ] → R
We now take a sequence of partitions (τν )ν∈N such that maxti ∈τν |ti+1 −ti | → 0 as ν → ∞.
The Lebesgue-Stieltjes integral of f with respect to G is then defined as
T
τ
f (t)dG(t) := lim IGν (f ).
0 ν→∞
16
It can be shown, that the Lebesgue-Stieltjes integral is well defined and may even be
extended to the case of non-increasing G, which reflects that an interval I might have
negative measure. Indeed, it turns out that the appropriate requirement is that G has
finite variation
lim |G(ti+1 ) − G(ti )| < ∞.
ν→∞
ti ∈τν
Now we have shown in Theorem 2.3 that the paths of Brownian motion have infinite vari-
ation on every time interval. Hence T the concept of Lebesgue-Stieltjes integration cannot
be simply carried over to define 0 f (t, ω)dBt (ω). We will soon see what goes wrong,
when we follow the standard ideas to define an integral, i.e. we first define the integral of
a step function and then continue by approximating “arbitrary functions” by step func-
tions. However, we shall see that f (t, ω) cannot be completely arbitrary, but has in some
way to be fitting to ω
→ Bt (ω). The construction was pioneered by K. Itô in the 1940’s.
n−1
φ(t, ω) = cj (ω)1[tj ,tj+1 ) (t)
j=0
by
T
n−1
φ(t, ω)dBt (ω) := cj (ω) Btj+1 (ω) − Btj (ω) . (14)
0
j=0
The next thing to do would be to approximate f by step functions and define f dBt to
be the limit of their stochastic integrals. But here we meet a difficulty!
n−1
φn (t, ω) := Btj (ω)1[tj ,tj+1 ) ,
j=0
n−1
ψn (t, ω) := Btj+1 (ω)1[tj ,tj+1 ) ,
j=0
where t0 , t1 , . . . , tn are defined as in Lemma 2.4 in Section 2.4 (and are not ω-dependent).
However, from our definition (14) we find that
T T
n−1
ψn dBt − φn dBt = (ΔBj )2 ,
0 0
j=0
which, according to Lemma 2.4, does not tend to 0 as n → ∞but to the constant T . In
other words, the variation of the path t
→ Bt is too large for Bt dBt to be defined in a
straightforward way.
17
We now introduce a requirement for the approximation of simple functions, and hence
also for the integrands.
We shall often abbreviate this space by L2 (B). The natural inner product that makes
L2 (B) into a real Hilbert space is
T
f , g := dt f (t, ω)g(t, ω)P(dω)
0 Ω
T
=E f (t, ·)g(t, ·)dt .
0
We note that the step functions φn in the last example are adapted, since φn (t, ω)= Btj
for t ∈ [tj , tj+1 ), so that φn (t, ω) only depends on past values of B. On the other hand,
ψn is not adapted, since at time t ∈ [tj , tj+1 ) it already anticipates the Brownian motion
at time tj+1 : ψn (t, ω)= Btj+1 (ω).
The next theorem is a crucial property of stochastic integrals of step functions.
Proposition 3.1 (The Itô-isometry) Let φ be a step function in L2 (B, [0, T ]), and let
T
I0 (φ)(ω) := φ(t, ω)dBt (ω)
0
i.e.,
2 T
T
P(dω) φ(t, ω)dBt (ω) = φ2 (t, ω)P(dω)dt.
Ω 0 Ω 0
18
Proof. By adaptedness, ci in (14) is independent of ΔBj := Btj+1 − Btj for i ≤ j. Therefore
n−1
2
I0 (φ)2 = E cj ΔBj
j=0
n−1
n−1
= E ci cj (ΔBi )(ΔBj )
i=0 j=0
n−1
= E cj2 (ΔBj )2 + 2 E ci cj (ΔBi ) E ΔBj
j=0 i<j
n−1
= E(cj2 )E (ΔBj )2
j=0
n−1
= E(cj2 )Δtj ,
j=0
where we use that E(ΔBj ) = 0, E((ΔBj )2 ) = Δtj (recall BM3-BM4 in Section 1). On the
other hand,
T n−1
2
φ2 = E cj 1[tj ,tj+1 ) (t) dt
0
j=0
T
n−1
n−1
= 1[ti ,ti+1 ) (t)1[tj ,tj+1 ) (t)dt E(ci cj )
0
i=0 j=0
n−1
= Δtj E(cj2 ).
j=0
Lemma 3.2 Every function f ∈ L2 (B, [0, T ]) can be approximated arbitrarily well by step
functions in L2 (B, [0, T ]).
On the basis of Proposition 3.1 and Lemma 3.2 we can now define the Itô-integral of a
function g ∈ L2 (B, [0, T ]) as follows. Approximate g by step functions φn ∈ L2 (B, [0,
T ]), i.e., φn → g in L2 (B, [0, T ]). Apply I0 to each of the φn . Since I0 is an isometry, the
sequence I0 φn has a limit in L2 (Ω, P). This is what we define to be the Itô-integral Ig of
g:
T
g(t, ω)dBt (ω) := (Ig)(ω) = L2 − lim (I0 φn )(ω).
0 n→∞
Proof of Lemma 3.2. We divide the proof into three steps of successive approximation.
1. Every bounded (pathwise) continuous g ∈ L2 (B) can be approximated by a se-
quence of step functions.
19
Proof. Partition the interval [0, T ] into n pieces by times (tj ) in the customary
way. Define
n−1
φn (t, ω) := g(tj , ω)1[tj ,tj+1 ) (t).
j=0
Then, since t
→ g(t, ω) is continuous and maxj |Δtj | → 0 for all ω ∈ Ω, we have
T
2
lim g(t, ω) − φn (t, ω) dt = 0.
n→∞ 0
2. Every bounded h ∈ L2 (B) can be approximated by a sequence of bounded continu-
ous functions in L2 (B).
Proof. ******* Suppose |h| ≤ M. For each n, let the “mollifier” ψn be a non-negative
continuous function of the form given in Figure 3.2, with the properties ψn (x) = 0
∞
for x ∉ [0, 1/n] and −∞ ψn (x)dx = 1. Define
t
gn (t, ω) := ψn (t − s)h(s, ω)ds.
0
Then t
→ gn (t, ω) is continuous for all ω, and |gn | ≤ M. Moreover, for all ω,
T
2
lim gn (s, ω) − h(s, ω) ds = 0,
n→∞ 0
3. Every f ∈ L2 (B) can be approximated by bounded functions in L2 (B). (This is a
general result on L2 -spaces.)
Proof. Let f ∈ L2 (B) and put hn (t, ω) := (−n) ∨ (n ∧ f (t, ω)). Then
T
f − hn 2L2 (B) ≤ dt P(dω) 1[n,∞)(|f (t, ω)|)f (t, ω)2 ,
0 Ω
20
0 1/n
FIG: the function ψn .
Here is an example of a stochastic integral.
where we use the shorthand notation Bj := Btj and ΔBj := Btj+1 − Btj . Note that Bi =
j<i ΔBj . We therefore have
2
BT2 = ΔBj
j
= (ΔBi )2 + 2 (ΔBi )(ΔBj )
i i<j
= (ΔBi )2 + 2 Bj (ΔBj )
i j
T
2
= (ΔBi ) + 2 φn (t)dBt .
0
i
21
3.3 Martingales
In section 4.4 we shall prove that the Itô-integral w.r.t. Brownian motion of an adapted
square integrable stochastic process always has a continuous version. For this we shall
need an interlude on martingales. We start with a reminder of what we already defined
in Section 2.3.
In words, Et (X)(ω) is the best estimate (in the sense of least mean square error) that can
be made of X(ω) on the basis of the knowledge of Bs (ω) for 0 ≤ s ≤ t.
Es (Mt ) = Ms for 0 ≤ s ≤ t ≤ T .
In words, a martingale is a ‘fair game’: the expected value at any time in the future is
equal to the current value. Note that Brownian motion itself is a martingale, since for
0 ≤ s ≤ t ≤ T,
Theorem 3.3 The stochastic integral of an adapted step function is a martingale with
continuous paths.
Proof. This directly follows from the fact that Brownian motion has continuous paths
and satisfies the martingale property (use the definition of the stochastic integral of a
step function given in (14) in Section 4.1).
The following powerful tool will help us prove that the Itô-integral of any process in
L2 (B) possesses a continuous version.
1
P[ sup |Mt | > λ] ≤ E(|MT |p ).
0≤t≤T λp
Proof. We may assume that E(|MT |p ) < ∞ for all s ∈ [0, T ]. Let Zt := |Mt |p . Then, since
x
→ |x|p is a convex function, Zt is sub-martingale, meaning that for all 0 ≤ s ≤ t ≤ T ,
22
It follows in particular that E(|MT |p ) < ∞ for all s ∈ [0, T ]. Let us discretise time and
first prove a discrete version of Doob’s inequality. To that end we fix n ∈ N and put
tk = kT /n. Let K(ω) denote the smallest value of k for which Ztk ≥ λp , if this occurs at
all. Otherwise, put K(ω) = ∞. Then we may write, since [K = k] ∈ Ftk ,
n
P[ max |Mtk | > λ] = P[K = k]
0≤k≤n
k=0
n
1
≤ E(1[K=k] Ztk )
λp
k=0
n
1
≤ p
E 1[K=k] E(ZT |Ftk )
k=0
λ
n
1
= E(1[K=k] ZT )
λp
k=0
1
≤ E(ZT ).
λp
Here, the second inequality uses the sub-martingale property. Now let An denote the
event An := [max0≤k≤n |Mtk | > λ]. Then we have A1 ⊂ A2 ⊂ A4 ⊂ A8 ⊂ · · · , and so,
t
→ Mt being continuous,
∞
1
P[ sup |Mtk | > λ] = P A2n = lim P(A2n ) ≤ p E(|MT |p ).
0≤t≤T n=0
n→∞ λ
Then there exists a version Jt of It with continuous paths, i.e., t → Jt (ω) is continuous
for almost all ω ∈ Ω.
Proof. The point of the proof is to turn continuity in L2 (Ω, P) into continuity of paths.
This requires some estimates.
Let φn ∈ L2 (B) be an approximation of f by step functions. Put
t
In (t, ω) = φn (s, ω)dBs (ω).
0
23
By Lemma 3.3 in Section 4.3, In is a pathwise continuous martingale for all n. The same
holds for the differences In − Im . Therefore, by the martingale inequality and the Itô-
isometry, we have
1
P[ sup |In (t) − Im (t)| > ] ≤ 2 E (In (t) − Im (t))2
0≤t≤T
1 T
= 2 E (φn (t) − φm (t))2 dt
0
1
= 2 φn − φm 2L2 (B) ,
which tends to 0 as n, m → ∞ because φn is a Cauchy sequence. We can therefore choose
an increasing sequence n1 , n2 , n3 , . . . of natural numbers such that
P[ sup |Ink+1 (t) − Ink (t)| > 2−k ] ≤ 2−k .
0≤t≤T
Hence for almost all ω there exists K(ω) such that for all k ≥ K(ω),
sup |Ink+1 (t, ω) − Ink (t, ω)| ≤ 2−k ,
0≤t≤T
The last equality uses that the orthogonal projection in L2 (Ω, P) is continuous.
t
From now on we shall always take 0 f (s, ω)dBs to mean a t-continuous version of the
integral.
We finish this section by extending Theorem 3.3 to arbitrary functions f ∈ L2 (B, [0, T ])
Proof. This follows from Theorem 3.3, the almost sure t-continuity of Mt , Doob’s mar-
tingale inequality combined with the Itô-isometry.
We have completed our construction of stochastic integrals. In the next sections we shall
investigate their main properties.
24
4 Stochastic integrals and the Itô-formula
In this chapter we shall treat the Itô-formula, a stochastic chain rule that is of great help
in the formal manipulation of stochastic integrals.
We say that a process Xt is a stochastic integral if there exist (square integrable adapted)
processes Ut , Vt ∈ L2 (B, [0, T ]) such that for all t ∈ [0, T ],
t t
Xt = X0 + Us ds + Vs dBs . (15)
0 0
The first integral on the r.h.s. is of finite variation, being pathwise differentiable almost
everywhere. The second integral is an Itô-integral and therefore a martingale. A decom-
position of a process into a martingale and a process of finite variation is called a Doob-
Meyer decomposition. Processes in L2 (B, [0, T ]) whave such a decomposition are called
‘semi-martingales’. Equation (15) is conveniently rewritten in differential form:
Example In Section 4.2 it was shown that the process Bt2 satisfies the equation
∂g ∂g 1 ∂ 2g 2
dYt = (t, Xt )dt + (t, Xt )dXt + 2 (t, Xt ) (dXt ) , (18)
∂t ∂x ∂x 2
with
1 ∂2g
Ut
∂g ∂g
= ∂t (t, Xt ) + ∂x (t, Xt ) Ut + 2 ∂x 2 (t, Xt ) Vt2
Vt
∂g
= ∂x (t, Xt ) Vt ,
25
which in its turn stands for
T T
YT = Y0 + Us ds + Vs dBs .
0 0
(The occurrence of the third term in the r.h.s. of (18) is sometimes called ‘the Itô-correction’.)
We shall prove Theorem 4.1 via the following extension of Lemma 2.4 in Section 2.4.
n−1 2 T
Atj ΔBj → At dt in L2 (Ω, P) as n → ∞.
0
j=0
∂g ∂g
(tj , Xj )ΔXj = tj , Xj Uj Δtj + Vj ΔBj + o(1)
∂x ∂x
j j
T T
∂g ∂g
→ (t, Xt ) Ut dt + (t, Xt ) Vt dBt .
0 ∂x 0 ∂x
26
The third and the fourth tend to zero. For instance, if in the fourth term we substitute
ΔXj = Uj Δtj + Vj ΔBj , then a term
∂ 2g
(tj , Xj )Vj Δtj ΔBj =: cj Δtj ΔBj
j
∂t∂x j
arises. But, because cj is Ftj -measurable and |cj | ≤ M for all j, it follows that (4.1) tends
to zero because
2
E cj Δtj ΔBj = E(cj2 )(Δtj )3 → 0.
j j
1
∂ 2g 2
1
∂ 2g 2
2 ΔX j = 2 (t j , X j ) Uj Δt j + V j ΔBj + o(1)
j
∂x 2 j
∂x 2
∂ 2g
1 2 2 2 2
= 2 (t j , X j ) Uj (Δt j ) + 2Uj V j Δt j ΔBj + V j (ΔBj ) + o(1)
∂x 2
j
T 2
1 ∂ g
→ 2 (t , Xj )Vt2 dt
2 j
0 ∂x
Example
T 2 With the help of the Itô-formula it is possible to quickly calculate an integral
like 0 Bt dBt , in much the same way as ordinary integrals are calculated: we make a guess
for the primitive, calculate its derivative, see if the guess is correct, if not then we adapt
our guess.
In the present case our guess is that we should have something like Bt3 , so we calculate
(use Theorem 4.1 with g(t, x) = x 3 , Ut ≡ 0, Vt ≡ 1):
1
⇒ Bt2 dBt = 3 d Bt3 − Bt dt
T 1 T
⇒ 0 Bt2 dBt = 3 BT3 − 0 Bt dt.
T
Example Let f be differentiable. Then the noise Nf = 0 f (t)dBt satisfies (use Theorem
4.1 with g(t, x) = f (t)x, Ut ≡ 0, Vt ≡ 1):
27
Example We want to solve the stochastic differential equation
This is obviously growing too fast: the second term in the r.h.s., which is the Itô-correction,
must be compensated. We thus try Xt = exp (−αt)Yt , which yields
1
dXt = βXt dBt + 2 β2 − α Xt dt.
1
The second term in the r.h.s. is zero for α = 2 β2 , so we find the solution
1 2
Xt = eβBt − 2 β t .
m
dXi (t) = Ui (t)dt + Vij (t)dBj (t) (i = 1, . . . , n), (19)
j=1
for some processes Ui (t) and Vij (t) in L2 (B, [0, T ]). We sometimes abbreviate (19) in the
vector notation
dX = U dt + V dB.
∂gi n ∂gi 1 n ∂ 2 gi
dYi (t) = ∂t (t, Xt ) dt + j=1 ∂xj (t, Xt ) dXj (t) + 2 j,k=1 ∂xj ∂xk (t, Xt ) dXj dXk
(i = 1, . . . , p),
(20)
where the product dXj dXk has to be evaluated according to the rules
28
Equation (20 is the multi-dimensional version of Itô’s formula, which can be proved in the
same way as its one-dimensional counterpart. (Be careful to keep track of all the indices.
The easiest case is m = n, p = 1.)
Example (‘Bessel process’) Let Rt (ω) = Bt (ω), where Bt is m-dimensional Brownian
motion and · is the Euclidean norm. Apply the Itô-formula to the function r : Rm →
∂r xi ∂2r 1 xi2
R+ : x
→ x. We compute ∂xi = r and ∂xi 2
= r − r3
. So we find
⎛ ⎞
2
m
Bj
m
1 B m
Bj m−1
dR =
1
dBj + 2 ⎝ − j ⎠ dt = dBj + dt.
R R R 3 R 2R
j=1 j=1 j=1
The next theorem gives a way to construct martingales out of Brownian motion. A func-
2
tion f is called harmonic if Δf = 0, with Δ = i ∂ 2 the Laplacian.
∂xi
f"
fk"
29
This is an Itô-integral and hence a martingale.
An alternative way to understand Theorem 4.3 is that a harmonic function f has the
property that its value in a point x is the average over its values on any sphere around
x. This property, together with the fact that Brownian motion is ‘isotropic’, explains why
f (Bt ) is a ‘fair game’.
The following extension of Itô’s formula will be useful later on.
Lemma 4.4 Itô’s formula for Yt = g(Bt ) still holds if g : R → R is C1 everywhere and C2
outside a finite set { z1 , . . . , zN }, with g locally bounded outside this set.
Proof. Take fk ∈ C2 (R) such that fk → g and fk → g as k → ∞, both uniformly and
such that for x ∉ { z1 , . . . , zN }:
⎧
⎨f (x) → g (x)
k
⎩|f (x)| ≤ M in a neighbourhood of { z1 , . . . , zN } .
k
(Fig. 4.3 shows the graph of fk for a simple example of g that has a jump.) For fk we
have the Itô formula
t t
fk (Bs )dBs + 2 fk (Bs )ds.
1
fk (Bt ) = fk (B0 ) +
0 0
1
Lt := lim λ ({ s ∈ [0, t] |Bs ∈ (−ε, ε) }) ,
ε↓0 2ε
(Think of Lt as the density per unit length of the total time spent close to the origin up
to time t.)
30
−ε ε
as shown in Fig. 4.4. Then gε is C2 , except in the points { −ε, ε }, and it is C1 everywhere
on R. Apply Lemma 4.4 to get
t t
gε (Bs )dBs ,
1
2 gε (Bs )ds = gε (Bt ) − gε (B0 ) − (21)
0 0
and
Now, the limit as ε ↓ 0 of the l.h.s. of (21) is precisely Lt . (The time spent by Bt in ±ε
is zero.) Moreover, we trivially have gε (Bt ) → |Bt | and gε (B0 ) → |B0 | as ε ↓ 0. Hence it
suffices to prove that the integral in the r.h.s. of (21) converges to the appropriate limit:
t
gε (Bs ) − sgn(Bs ) dBs → 0 in L2 (Ω, P).
0
t
≤ 0 P (Bs ∈ (−ε, ε)) ds
→ 0 as ε ↓ 0,
31
where in the second equality we use the Itô-isometry and the last statement holds because
Bs (s > 0) has an absolutely continuous distribution. It follows that Lt exists and can be
expressed as in the statement of the theorem.
Note that for smooth functions f :
t
|f (t)| − |f (0)| − sgn (f (s)) f (s)ds = 0
0
because
d
f (t) = sgn (f (t)) f (t) (f (t) ≠ 0).
dt
Thus, the local time is an Itô-correction to this relation, caused by the fact that d|Bt | ≠
sgn (Bt ) dBt : if Bt passes the origin during the time interval Δtj , then |Btj+1 − Btj | need
not be equal to sgn(Btj )ΔBtj . The difference is a measure of the time spent close to the
origin.
The existence of the local times of Brownian motion was proved by Lévy in the 1930’s
using hard estimates. The above approach is shorter and more elegant. What is described
above is the local time at the origin: Lt = Lt (0). In a completely analogous way one can
prove the existence of the local time Lt (x) at any site x ∈ R. The process x → Lt (x)
plays a key role in many applications associated with Brownian motion.
32
5 The Martingale Representation Theorem
Let B(t) = (B1 (t), . . . , Bd (t)) be d-dimesnional Brownian motion. In Section 3, Theorem
3.3, and Corollary 3.6, we have proved that if f ∈ L2 then the Itô integral
t
Xt = X0 + f (s, ω)dB(s); t ≥ 0
0
is always a martingale with respect to the filtration Ft of Brownian motion. This might
not be too surprising because also Brownina motion itself is a martingale. In this section
we prove a result, which is really stunning, namely that the converse also is true: Any Ft -
martingale with respect to P can be represented as an Itô-integral. This result, called the
martingale representation theorem, is important in many applications, e.g. mathematical
finance. In this section we will only prove ist one dimension, but essentailly the same
proof works for arbitrary (finite) d.
We start by establishing some auxiliary results
Excercise. 5.1 Look up the proof the following result, sometimes called the Doob-Dynkin
lemma, e.g. in M. M. Rao, Prop. 3, p.7, or B. Øksendal, Lemma 2.1.2,p.9.
Lemma 5.1 Let (Ω, F , P ) be a probability space and X, Y : Ω → Rd be two random vari-
ables. Denote
σ (X) := {X −1 (B), B ∈ F }.
Then Y ist σ (X)-measurable if and only if there exists a Borel-measurable function g :
Rn → Rn such that
Y = g(X).
Excercise. 5.2 Look up the proof the following result, called the Martingale convergence
theorem, e.g. in B. Øksendal, Corollary C.9.
F∞ := σ {Fk , k = 1, 2 . . . }.
Then
E[X|Fk ] →k→∞ E[F |F∞ ]
P -a.e. and in L1 (P ).
is dense in L2 (FT , P ).
33
Proof. Let {ti }∞
i=1 be a dense subset of [0, T ] and for each n = 1, 2, . . . let Hn denote
the σ -algebra generated by Bt1 , . . . , Btn . Clearly
Hn ⊆ Hn+1
and
FT = σ {Hn , n = 1, 2, . . . }.
Choose g ∈ L2 (FT , P ). Then by the martingale convergence theorem 5.2 we have that
P -a.e. and in L2 (FT , P ). By the Doob-Dynkin lemma 5.1 we can write, for each n,
is dense in L2 (FT , P ).
for all λ = (λ1 , . . . , λn ) ∈ Rn and all t1 , . . . tn ∈ [0.T ]. The function G(λ) is real analytic
and hence has an analytic extension to the complex space Cn given by
G(z) := exp{z1 Bt1 (ω) + . . . + zn Btn (ω)}g(ω)dP (ω)
Ω
34
where
−n/2
φ̂(y) = (2π ) φ(x)e−ix·y dx
Rn
is the Fourier transform and we have used the inverse Fourier transform theorem
φ(x) = (2π )−n/2 φ̂(y)eix·y dy.
Rn
By (23) and Lemma 5.3 g is orthogonal to a dense subset of L2 (FT , P ) and thus we
conculde that g ≡ 0. Therefore the linear span of the functions in (22) must be dense in
L2 (FT , P ) as claimed.
Let B(t) = (B1 (t), . . . , Bd (t)) be d-dimensional Brownian motion. If f (ω, s) ∈ L2 (B, [0, T ])
then the random vaiable T
V (ω) := f (ω, s)dB(s)
0
so V ∈ L2 (FT , P ).
The next result states that any F ∈ L2 (FT , P ) can be represented this way:
Theorem 5.5 (The Itô representation theorem) Let F ∈ L2 (FT , P ). Then there exists a
unique stochastic process f (ω, s) ∈ L2 (B, [0, T ]) such that
T
F (ω) = E[F ] + f (ω, s)dB(s).
0
Proof. Again we will only treat the case of d = 1. First assume that F has the form (22),
i.e. T !
1 T 2
F (ω) = exp h(t)dBt (ω) − h (t)dt ,
0 2 0
1 2 1
dYt = Yt (h(t)dBt − h (t)dt) + Yt (h(t)dBt )2 = Yt h(t)dBt ,
2 2
so that t
Yt = 1 + Ys h(s)dBs ; t ∈ [0, T ].
0
Thus T
F = YT = 1 + Ys h(s)dBs
0
35
and hence E[F ] = 1. So in this case the claim of the lemma holds true.
By linearity it also holds true for linear combinations of functions of the form (22). So, if
F ∈ L2 (FT , P ) we approximate it by linear combinations Fn of functions of the form (22).
Then for each n we have
T
Fn (ω) = E[Fn ] + fn (ω, s)dB(s)
0
the limit being taken in L2 (B, [0, T ]). Hence the representation part of the theorem fol-
lows.
To see the uniqueness we again employ the Itô isometry: Suppose
T T
F (ω) = E[F ] + f1 (ω, s)dB(s) = E[F ] + f2 (ω, s)dB(s)
0 0
Theorem 5.6 ((The martingale representation theorem)) Let B(t) = (B1 (t), . . . , Bd (t)),
0 ≤ t ≤ T , be d-dimensional Brownian motion and let Mt be an Ft - martingale with
respect to P , such that Mt ∈ L2 (P ) for all 0 ≤ t ≤ T . Then there exists a unique stochastic
process g(ω, s) ∈ L2 (B, [0, T ]) and
t
Mt (ω) = E[M0 ] + g(ω, s)dB(s) a.e.
0
for all 0 ≤ t ≤ T .
Proof. Again we just treat the case d = 1. By the Itô representation theorem applied to
T = t and F = Mt , we have that there exists a unique f (t) (ω, s) ∈ L2 (B, [0, T ]) such that
t t
(t)
Mt (ω) = E[Mt ] + f (ω, s)dBs (ω) = E[M0 ] + f (t) (ω, s)dBs (ω).
0 0
36
Now assume that 0 ≤ t1 < t2 Then
" #
t2
Mt1 = E[Mt2 |Ft1 ] = E[M0 ] + E f (t2 ) (ω, s)dBs (ω)|Ft1
0
t1
= E[M0 ] + f (t2 ) (ω, s)dBs (ω). (24)
0
and therefore
f (t1 ) (ω, s) = f (t2 ) (ω, s) for a.a.(ω, s) ∈ [0, t1 ] × Ω.
Now putting
f (ω, s) = f (T ) (ω, s)
gives the result.
37
6 Stochastic differential equations
A stochastic differential equation for a process X(t) with values in Rn is an equation of
the form
m
dXi (t) = bi (t, Xt )dt + σij (t, Xt )dBj (t) (i = 1, · · · , n). (26)
j=1
Here, B(t) = (B1 (t), B2 (t), · · · , Bm (t)) is an m-dimensional Brownian motion, i.e., an m-
tuple of independent Brownian motions on R. The functions bi and σij from R × Rn to R
with i = 1, 2, · · · , n and j = 1, 2, · · · , m form a field b of n-vectors and a field σ of n×m-
matrices. A process X ∈ L2 (B, [0, T ]) for which (26) holds is called a (strong) solution of
the equation. In more pictorial language, such a solution is called an Itô-diffusion with
drift b and diffusion matrix σ σ ∗ .
In this section we shall formulate a result on the existence and the uniqueness of Itô-
diffusions. It will be convenient to employ the following notation for the norms on vectors
and matrices:
n
n
m
x2 := xi2 (x ∈ Rn ); σ 2 := 2
σij = tr (σ σ ∗ ) (σ ∈ Rn×m ).
i=1 i=1 j=1
Also, we would like to take into account an initial condition X(0) = Z, where Z is an Rn -
valued random variable independent of the Brownian motion. All in all, we enlarge our
probability space and our space of adapted processes as follows. We choose a probability
measure μ on Rn and put
Ω := Rn × Ωm ;
Ft := B(Rn ) ⊗ Ft⊗m (t ∈ [0, T ]);
⊗m
P := μ ⊗ P ;
n
Z : Ω → R : (z, ω)
→ z;
L (B, [0, T ]) := {X ∈ L2 (Rn , μ) ⊗ L2 (Ω, FT , P)⊗m ⊗ L2 [0, T ] ⊗ Rn | ω
→ Xt,i
2 x
(ω) is Ft -measurable}.
This change of notation being understood, we shall drop the primes again.
Theorem 6.1 Fix T > 0. Let b : [0, T ] × Rn → Rn and σ : [0, T ] × Rn → Rn×m be measur-
able functions, satisfying the growth conditions
38
Proof. The proof comes in three parts.
1. Uniqueness. Suppose X, Y ∈ L2 (B, [0, T ]) are solutions of (26) with continuous paths.
Put
Applying the inequality (a+b)2 ≤ 2(a2 +b2 ) for real numbers t a and b, the independence
t
of the components of B(t), the Cauchy-Schwarz inequality ( 0 g(s)ds)2 ≤ t 0 g(s)2 ds for
an L2 -function g, the multi-dimensional Itô-isometry and finally the Lipschitz condition,
we find
n
E Xt − Yt 2 := E (Xi (t) − Yi (t))2
i=1
⎛⎛ ⎞2 ⎞
n t m t
⎜ ⎟
= E ⎝⎝ Δbi (s)ds + Δσij (s)dBj (s)⎠ ⎠
0 0
i=1 j=1
⎛ ⎞2 ⎞
2 ⎛ m
n
⎜
t t
⎟
≤2 E⎝ Δbi (s)ds +⎝ Δσij (s)dBj (s)⎠ ⎠
0 0
i=1 j=1
⎛ 2 ⎞
n t
=2 E⎝ Δbi (s)ds ⎠
0
i=1
⎛ ⎞
n m t
+2 E⎝ (Δσij (s))2 ds ⎠
0
i=1 j=1
⎛ ⎞ ⎛ ⎞
t
n t
n
m
≤ 2t E⎝ Δbi (s)2 ⎠ ds + 2 E⎝ Δσij (s)2 ⎠ ds
0 0
i=1 i=1 j=1
t t
= 2t E(Δb(s)2 )ds + 2 E(Δσ (s)2 )ds
0 0
t
≤ 2D 2 (T + 1) EXs − Ys 2 ds.
0
So the function f : t
→ E Xt − Yt 2 satisfies the integral inequality
t
0 ≤ f (t) ≤ A f (s)ds
0
for the constant A = 2D 2(T + 1). This inequality implies that f = 0 (“Gronwall’s inequal-
t
ity”). Indeed, put F (t) = 0 f (s)ds. Then F (t) is C1 and F (t) = f (t) ≤ AF (t). Therefore
d −tA
e F (t) = e−tA (f (t) − AF (t)) ≤ 0.
dt
Since F (0) = 0, it follows that e−tA F (t) ≤ 0 implying F (t) ≤ 0. So we have 0 ≤ f (t) ≤
AF (t) ≤ 0 and hence f (t) = 0.
39
Thus we have E Xt − Yt 2 = 0 for all t ∈ [0, T ]. In particular, for all t ∈ [0, T ] ∩ Q and
almost all ω:
Xt (ω) = Yt (ω).
Now let
and since Xt and Yt have continuous paths we conclude that for almost all ω:
Let us start with the constant process Xt0 := Z, and define recursively
(k+1) (k)
Xt := X̃t (k ≥ 0).
The calculation in the uniqueness part of this proof can be used to conclude that
2 t
E X̃t − Ỹt ≤ A E Xs − Ys 2 ds for any Xs , Ys ∈ L2 (B).
0
≤ 2T 2 C 2 E (1 + Z)2 + 2T C 2 E (1 + Z)2
≤ 2C 2 (T 2 + T )E (1 + Z)2 ,
40
which is finite by the growth condition and the requirement that Z has finite variance.
2 (k)
-
Now let X = L (B)- limk→∞ X . The existence of this limit follows because k Ak t k /k! <
∞. Then
2 (k)
X̃ = L (B)- lim X ˜= L2 (B)- lim X (k+1) = X.
k→∞ k→∞
So X is a solution of (26).
3. Continuity and adaptedness. By Theorem 3.5 the paths t
→ Xt (ω) can be assumed
continuous for almost all ω ∈ Ω. The fact that the solution is adapted is immediate from
the construction.
Definition 6.1 A weak solution of (26) is a pair (Bt , Xt ), measurable w.r.t. some filtration
(Gt )t∈[0,T ] on some probability space (Ω, G, P), such that Bt is m-dimensional Brownian
motion and such that (26) holds.
The key point here is that the filtration need not be (Ft )t∈[0,T ] = σ ((Bs )s∈[0,t] ). (If it is,
then we have a strong solution.) Strong uniqueness is uniqueness of a strong solution.
Weak uniqueness holds if, given σ and b, all weak solutions have the same law. Strong
uniqueness implies weak uniqueness.
By the following example we illustrate the difference between these two concepts.
Example (Tanaka) This is related to our example of Brownian local time in Section 5.4.
Let Bt be a Brownian motion and define
t
B̃t := sgn(Bs )dBs .
0
with (dB̃t )2 = sgn(Bt )2 (dBt )2 = (dBt )2 = dt. Now, Lévy’s characterization of Brownian
motion (which we do not prove here) says that any martingale with quadratic variation t
must be a Brownian motion. Hence B̃t is a Brownian motion.
Turning matters around, Bt is itself a solution of the stochastic differential equation
41
t t t
since Bt = 0 dBs = 0 sgn(Bs )2 dBs = 0 sgn(Bs )dB̃s . Because every solution of (27) is a
Brownian motion, we have weak uniqueness. However, because −Bt obviously also is a
solution of (27), there are two solutions of (27) for a given process B̃t . In other words, we
do not have strong uniqueness.
Taking this argument one step further we find (cf. Theorem 4.5):
where Lt is the local time at the origin. By its definition, Lt is adapted to the filtra-
tion (Gt )t∈[0,T ] generated by (|Bt |)t∈[0,T ] . Hence so is B̃t . It follows that F̃t ⊂ Gt , where
(F̃t )t∈[0,T ] is the filtration generated by (B̃t )t∈[0,T ] . However, since Bt is not Gt -measurable
(its sign is not determined by Gt ), it is not F̃t -measurable.
Note that (27) is an Itô-diffusion with b(t, x) = 0 and σ (t, x) = sgn(x). The latter is not
Lipschitz, which is why Theorem 6.1 does not apply.
42
7 Itô-diffusions and one-parameter semigroups
7.1 Introduction and motivation
An Itô-diffusion Xtx (with initial condition X0x = x ∈ Rn ) is a Markov process in contin-
uous time. If we suppose that the field b of drift vectors and the field σ of diffusion
matrices are both time-independent, i.e., Xtx is the solution starting at x of the stochastic
differential equation
(where b and σ satisfy the Lipschitz conditions mentioned in Theorem 6.1), then this
Markov process is stationary (= time-homogeneous) and can be characterised by its tran-
sition probabilities
. /
Pt (x, B) := P Xtx ∈ B (t ≥ 0),
where B runs through the Borel subsets of Rn . These transition probabilities satisfy the
one-parameter Chapman-Kolmogorov equation
Pt+s (x, B) = Pt (x, dy)Ps (y, B). (29)
y∈Rn
St+s = St ◦ Ss , (31)
i.e., the transformations (St )t≥0 form a one-parameter semigroup. Such a semigroup is
determined by its generator A, defined by
1
Af := lim (St f − f ) . (32)
t↓0 t
In this chapter we study the interplay between the diffusion Xtx and its generator A.
t,x x
Lemma 7.1 (Stationarity) The processes s
→ Xs and s
→ Xs−t are equal in law for s ≥ t.
43
and
x x x
dXs−t = b(Xs−t )ds + σ (Xs−t )d(Bs−t )
respectively, for s ≥ t. Since Bs −Bt and Bs−t are Brownian motions on [t, T ], both starting
at 0, the two processes have the same law by weak uniqueness.
Proposition 7.2 (Markov property) Let Xtx (t ≥ 0) denote the solution of (28) with initial
condition X0x = x. Fix t ≥ 0. Then for all s ≥ 0 the conditional expectation E(f (Xt+s
x
)|Ft )
x
depends on ω ∈ Ω only via Xt . In fact,
x
E(f (Xt+s )|Ft ) = (Ss f )(Xtx ).
On the other hand, the random variable Ys (y) depends on ω ∈ Ω only via the Brownian
motion u
→ Bu (ω) − Bt (ω) for u ≥ t, so Ys (y) is stochastically independent of Ft . By
strong uniqueness we must have, for s ≥ 0,
Ys Xtx = Xt+sx
a.s.,
since both sides are solutions of the same stochastic differential equation (28) for s ≥ 0
with initial condition Y0 (Xtx ) = Xtx . As Xtx is Ft -measurable, we now have
x
E f (Xt+s |Ft ) (ω) = f Ys (Xtx (ω))(ω ) P(dω ) = (Ss f ) Xtx (ω) .
ω ∈Ω
In our study of diffusions and their one-parameter semigroups we choose for B the space
C0 (Rn ) of all continuous functions Rn → R that tend to 0 at infinity. The natural norm
on this space is the supremum norm
f := sup |f (x)|.
x∈Rn
44
For f ∈ C0 (Rn ) we define
(St f )(x) := E f (Xtx ) ,
where Xtx (t ≥ 0) denotes the solution of (28) with initial condition X0x = x.
Theorem 7.3 The operators St with t ≥ 0 form a jointly continuous one-parameter semi-
group of contractions C0 (Rn ) → C0 (Rn ).
We thus obtain
y
EXtx − Xt 2 ≤ 2x − y2 + 2AF (t) ≤ 2x − y2 e2At . (33)
Now choose ε > 0. Since f is uniformly continuous on Rn , there exists a δ > 0 such that
ε
x − y ≤ δ ⇒ |f (x ) − f (y )| ≤ .
2
√ -
It follows that for x, y ∈ Rn not more than δ := δ εe−At /2 2f apart we may estimate
y
|St f (x) − St f (y)| = E f (Xtx ) − E f (Xt )
y
≤ E f (Xtx ) − f (Xt )
ε
y
≤ + 2f P Xtx − Xt > δ
2
ε 1
≤ + 4f e2At x − y2
2 (δ )2
< ε,
45
where the third inequality uses the Markov inequality and (33). So St f is uniformly con-
tinuous as well.
approach to 0 at infinity. Since t
→ St f is continuous in the supremum norm, it suffices
to prove that there exists a τ > 0 such that for all t ∈ [0, τ] we have
lim St f (x) = 0.
x→∞
(Physically speaking, we must prove that the diffusion cannot escape to infinity in a finite
time.) We start with the estimate
Ak t k
x,(k+1) x,(k) 2
E Xt − Xt ≤ Dt(1 + x)2
k!
for the iterates of the map X
→ X̃ in Section 6.1, starting from the constant process
x,0 x,(k+1)
Xt = x. Since Xtx = x + k (Xt − X x,(k)), it follows that
2
E Xtx − x ≤ Dt (1 + x)2 . (34)
-
with Dt = Dt( k Ak t k /k!)2 . Now choose ε > 0. Let M > 0 be such that |f (x )| ≤ 2 for
ε
all x ∈ Rn with x ≥ M, and let τ > 0 be such that 16Dτ f < 2 var epsilon. Then,
1
46
2
Note that from (34) it follows that limt↓0 E Xtx − x = 0. Now again take ε > 0. Let
δ > 0 be such that x − y < δ ⇒ |f (x) − f (y)| < ε/2. Let t0 > 0 be such that
EXtx − x2 < εδ2 /4f for 0 ≤ t ≤ t0 . Then for t ∈ [0, t0 ] we have, again by Chebyshev’s
inequality,
x
(St f ) (x) − f (x) = f Xt (ω) − f (x) P (dω)
Ω
ε . /
≤ + 2 f P Xtx (ω) − x ≥ δ
2
ε 2f
≤ + EXtx − x2
2 δ2
<ε.
47
So fε ∈ Dom(A), and hence Dom(A) is dense in B.
Theorem 7.5 Let (St )t≥0 be the one-parameter semigroup on B = C0 (Rn ) associated
to an Itô-diffusion with coefficients b and σ . Let A be the generator of (St )t≥0 . Then
Cc2 (Rn ) ⊂ Dom(A), and for all f ∈ Cc2 (Rn ):
∂f ∂ 2f
(Af )(x) = bi (x) (x) + 1
2 σ (x)σ (x)∗ ij (x).
i
∂xi i,j
∂xi ∂xj
Using the condition X0x = x, we conclude that for any f ∈ Cc2 (Rn ),
t t
∂f x x
f Xtx − f (x) = Xs σij Xsx dBj (s) + A f Xs ds. (35)
0
i,j
∂xi 0
The first part on the r.h.s. is a martingale and therefore, by the continuity of t
→ Xtx and
the definition of A,
1
(Af ) (x) = lim t E f Xtx − f (x)
t↓0
t
A f Xsx ds
1
= lim E t
t↓0 0
= A f (x) .
48
Note that the martingale part drops out after taking the expectation.
Having thus identified the generator associated with Itô-diffusions, we next formulate an
important formula for stopped Itô-diffusions.
Proof. The proof is a bit ‘sketchy’, but the details are easily filled in. By applying (35) to
x
the stopped process t
→ Xt∧τ and letting t → ∞ afterwards, we find:
τ τ
∂f
f Xτx = f (x) + (X x )σij (Xsx )dBj (s) + (Af ) Xsx ds.
0 ∂xi s 0
i,j
After taking expectations we get the claim (stopped martingales starting at 0 have expec-
tation 0).
7.4 Applications
The simplest example of a diffusion is Brownian motion itself: Xt = Bt (b ≡ 0, σ ≡ id).
Its generator is 1/2 times the Laplacian Δ.
Problem 1: Consider Brownian motion Bta := a + Bt starting at a ∈ Rn . Let R > a. What
is the average time Bta spends inside the ball DR = { x ∈ Rn : x ≤ R }?
Solution: Choose f ∈ Cc2 (Rn ) such that f (x) = x2 for x ≤ R. Let τRa denote the first
time the Brownian motion hits the sphere. Then τRa is a stopping time. Put τ := τRa ∧ T
and apply Dynkin’s formula, to obtain
τ
a
1 a
E f (Bτ ) = f (a) + E 2 (Δf ) Bs ds
0
2
= a + nE(τ),
1
where we use that 2 Δf ≡ n. Obviously, E f (Bτa ) ≤ R 2 . Therefore
1
E(τ) ≤ n R 2 − a2 .
1
As this holds for all T , it follows that E(τRa ) ≤ n R 2 − a2 < ∞. Hence τ → τRa as
T → ∞ by monotone convergence. But then we must have f (Bτa ) → Bτaa 2 = R 2 as
R
T → ∞, and so in fact
1
E τRa = n R 2 − a2
by dominated convergence.
Problem 2: Let b ∈ Rn be a point outside the ball DR . What is the probability that the
Brownian motion starting in b ever hits DR ?
49
Solution: We cannot use Dynkin’s formula directly, because we do not know if the Brow-
nian particle will ever hit the sphere. In order to obtain a well-defined stopping time, we
b
need a bigger ball DM := { x ∈ Rn |x < M } enclosing the point b. Let σM,R be the first
b b
exit time from the ‘annulus’ AM := DM \DR starting from b. Then clearly σM,R = τM ∧ τRb .
1
Now take A = 2 Δ, the generator of Brownian motion, and suppose that fM : AM → Rn
satisfies the following requirements:
(i) ΔfM = 0, i.e., fM is harmonic,
(ii) fM (x) = 1 for x = R,
(iii) fM (x) = 0 for x = M.
We then find
b
P B
b
= R = E fM B b
b = fM (b),
σM,R σM,R
where the first equality uses (ii) and (iii) and the second equality uses Dynkin’s formula
in combination with (i.) (Incidentally, this says that fM is uniquely determined if it exists!)
Next we let M → ∞ to obtain
P τRb < ∞ = P ∃M≥b : τRb < τM b
,
because any path of Brownian motion that hits the boundary of DR must be bounded.
From the latter we in turn obtain
P τRb < ∞ = P ∪M≥b [τRb < τM b
] = lim P τRb < τM
b
M→∞
b
= lim P BσM = R = lim fM (b).
M→∞ M→∞
Thus, the only thing that remains to be done is to calculate this limit, i.e., we must solve
(i)-(iii.)
For n = 2 we find, after some calculation:
It follows that Brownian motion with n = 1 or n = 2 is recurrent, since it will hit any
sphere with probability 1. But for n ≥ 3 Brownian motion is transient.
50
8 Transformations of diffusions
In this section we treat two useful formulas in the theory of diffusions, which are proved
using Itô-calculus.
Theorem 8.1 (Feynman-Kac formula) For x ∈ Rn , let Xtx be an Itô-diffusion with gener-
ator A and initial condition X0x = x. Let v : Rn → [0, ∞) be continuous, and let Stv f for
f ∈ C0 (Rn ) and t ≥ 0 be given by
v x t
x
St f (x) = E f Xt exp − v(Xu )du .
0
Proof. It is not difficult to show, by the techniques used in the proof of Theorem 7.3,
that Stv f lies again in C0 (Rn ), and that the properties 1, 2 and 4 in Definition 7.1 hold. It
is illuminating, however, to explicitly prove Property 3.
For 0 ≤ s ≤ t, let
t
x s,x
Zs,t := exp − v Xu du .
s
x
Property 3 in Definition 7.1 is preserved due to the particular form of the process Z0,t . In
fact, let f ∈ C0 (R ). Then
n
v x
x
St+s f (x) = E Z0,t+s f Xt+s
x Xtx x
= E Z0,t Zt,t+s f Xt+s
x
x Xt t,Xtx
= E Z0,t E Zt,t+s f Xt+s |Ft
x
= E Z0,t g Xtx
= Stv g (x) ,
where
y t,y y y
g(y) = E Zt,t+s f Xt+s |Ft = E Z0,s f (Xs ) = Ssv f y
by stationarity. So indeed
v
St+s f (x) = Stv ◦ Ssv f (x) .
Let us finally show that A − v is the generator. To that end we calculate (with the help of
Itô’s formula and dXt = b(Xt )dt + σ (Xt )dBt ):
d Z0,t f (Xt ) =(dZ0,t )f (Xt ) + Z0,t d(f (Xt )) + (dZ0,t )(df (Xt ))
. /
= −v (Xt ) f (Xt ) + (Af ) (Xt ) Z0,t dt + Z0,t σ (Xt )T ∇f (Xt ) , dBt .
51
In short, dStv f = Stv ((A − v)f ) dt which means that A − v is indeed the generator of the
semigroup (Stv )t≥0 .
We can give the semigroup (Stv )t≥0 a clear probabilistic interpretation. The positive num-
ber v(y) is viewed as a ‘hazard rate’ at y ∈ Rn , the probability per unit time for the
diffusion process to be ‘killed’. Let us extend the state space Rn by a single point ∂, the
‘coffin’ state, where the system ends up after being killed. Then it can be shown that there
exists a stopping time τ, the ‘killing time’, such that the process Ytx given by
⎧
⎨X x if t ≤ τ
x t
Yt :=
⎩∂ if t > τ
satisfies
x x t x
E f Yt = E f Xt exp − v Xu du ,
0
provided we define f (∂) := 0. The proof of this requires an explicit construction of the
killing time τ, which we shall not give here.
The Feynman-Kac formula was originally formulated as a non-rigorous ‘path-integral for-
mula’ in quantum mechanics by R. Feynman, and was later reformulated in terms of
diffusions by M. Kac. The connection with quantum mechanics can be stated as follows.
1
If Xt is Brownian motion, then the generator of Stv is 2 Δ − v. This is (−1)× the Hamilton
n
operator of a particle in a potential v in R . According to Schrödinger, the evolution in
time of the wave function ψ ∈ L2 (Rn ) describing such a particle is given by ψ
→ Utv ψ,
where Utv is a group of unitary operators given by
1
Utv = exp it 2 Δ − v (t ∈ R).
Finally, if v fails to be nonnegative, then the Feynman-Kac formula may still hold. For
instance, it suffices that v be bounded from below.
where b is bounded and Lipschitz continuous. This induces a probability measure P̃x
b on
Ω.
52
Theorem 8.2 (Cameron-Martin formula) For all x ∈ Rn the measures Px and P̃x
b are
mutually absolutely continuous with Radon-Nikodym derivative given by
T
dP̃x T x x 2
b
= exp x 1
b Bs , dBs − 2 b Bs ds .
dPx 0 0
Proof. Fix x ∈ Pn . The idea of the proof is to show that Xtx has the same distribution
under Px (or P0 ) as Btx has under ρT ·Px , where ρT denotes the Radon-Nikodym derivative
of the theorem. In other words, we shall show that for all 0 ≤ t1 ≤ t2 ≤ · · · ≤ tn ≤ T and
all f1 , f2 , · · · fn ∈ C0 (Rn ):
Ex f1 Xt1 × · · · × fn Xtn = Ex ρT f1 Btx1 × · · · × fn Btxn , (36)
Let us start with the l.h.s. First we note that for 0 ≤ s ≤ t ≤ T , for all F in the algebra
As of all bounded Fs -measurable functions on Ω and for all f ∈ C0 (Rn ) we have by the
property of conditional expectations and the Markov property,
Ex (F f (Xt )) = Ex E(F f (Xt ))|Fs
= Ex F E(f (Xt ))|Fs
= Ex F (St−s f )(Xt ) .
If we now apply this result repeatedly on the product f1 (Xt1 ) × · · · × fn (Xtn ), first pro-
jecting onto Atn−1 , then on Atn−2 , and continue down to At1 , we obtain (37).
Now consider the r.h.s. of (36), which is more difficult to handle. We shall show that it is
also given by (37) in three steps. Namely,
1. For all t ∈ [0, T ], F ∈ At ,
Ex (F ρT ) = Ex (F ρt ) . (38)
53
3. We apply (39) repeatedly, first with t = tn , s = tn−1 , F ∈ Atn−1 , then with t = tn−1 ,
s = tn−2 , F ∈ Atn−2 , continuing until t = t1 , and we obtain (37).
Thus, to complete the proof it remains to prove (38) and (39).
1
Proof of 1: Put ρt := exp (Zt ) with dZt = b (Bt ) , dBt − 2 b (Bt )2 dt. Then ρT is as
1
defined above and dρt = exp (Zt ) dZt + 2 exp (Zt ) (dZt )2 = ρt b (Bt ) , dBt . It follows
that t
→ ρt is a martingale. Therefore for F ∈ At :
Ex (F ρT ) = Ex (E (F ρT |Ft ))
= Ex (F E (ρT |Ft ))
= Ex (F ρt ) .
d x d x
E (F ρT f (Bt )) = E (F ρt f (Bt ))
dt dt
= Ex (F ρt (Af ) (Bt ))
= Ex (F (Af ) (Bt ) ρT ) ,
1
where A := 2 Δ + b, ∇ is the generator of (Xt ). Therefore the left and the right hand
side of (39) have the same derivative with respect to t for t ≥ s. Since they are equal for
t = s, they must be equal for all t ≥ s.
and let (St ) be the associated contraction semigroup, which can be expressed using the
Cameron-Martin formula as
t
t
x x x 1 x 2
(St f ) := E(f (Xt )) = E f (Bt ) exp − ∇h(Bs ), dBs − 2 h(Bs ) ds .
0 0
where
1
v (x) := 2 ∇h (x)2 − Δh (x) . (40)
54
Then from (40) and the equality
t t
h(Btx ) = h(x) + ∇h(Bsx ), dBs + 12 Δh(Bsx )ds
0 0
1 − 1 x2
Ω(x) = e 2 .
c
55
9 The Black and Scholes option pricing formula.
In 1973 Black and Scholes published a paper (after two rejections) containing a formula
for the fair price of a European call option on stocks. This formula now forms the basis of
pricing practice on the option market. It is a fine example of applied stochastic analysis,
and marks the beginning of an era were banks employ probabilists that are well versed
in Itô calculus.
A (European) call option on stocks is the right, but not the obligation, to buy at some
future time T a share of stock at price K. Both T and K are fixed ahead. Since at time T
the value of the stock may be higher than K, such a right-to-buy in itself has a value. The
problem of this section is: what is a reasonable price for this right?
We model the stock price S_t as the solution of the stochastic differential equation dS_t = S_t( μ dt + σ dB_t ). The constant μ ∈ R (usually positive) describes the relative rate of return of the shares. The constant σ > 0 is called the volatility and measures the size of the random fluctuations in the stock value. Shares of stock represent some real asset, for example partial ownership of a company. They can be bought or sold at any time t at the current price S_t.
Let us suppose that on the market certain other securities, called bonds, are available that yield a riskless return rate r ≥ 0. This is comparable to the interest on bank accounts. The value β_t of a bond satisfies the differential equation

dβ_t = r β_t dt, so that β_t = β_0 e^{rt}.
The question we are addressing is: How much would we be willing to pay at time 0 for the
right to buy at time T one share of stock at the price K > 0 fixed in advance? Such a right is
called a European stock option. The time T is called the expiry time, the price K is called
the exercise price. This option pricing problem turns out to be a problem of stochastic
control.
Another type of option is the American stock option, where the time at which the shares can be bought is not fixed beforehand but only has an upper limit. The pricing problem for American stock options contains, apart from a stochastic control element, also an optimal stopping problem. It is therefore more difficult, and we leave it aside.
The solution to the European option pricing problem is given by the Black and Scholes option pricing formula (50) appearing at the end of this Section. As this formula does not look wholly transparent at first sight, we shall introduce the result in three steps, raising the level of complexity slowly by introducing the ingredients one by one.
In this stationary world (the case μ = r = 0) our option pricing problem is easy: a fair price q at time 0 of the right to buy at time T a share of stock at the price K is

q := E( (S_T − K)^+ ), (44)
where x^+ stands for max(0, x). Indeed, if the stock value S_T turns out to be larger than
the exercise price K, then the holder of the option makes a profit of ST − K by buying
the share of stock at the price K that he is entitled to, and then immediately selling it
again at its current value ST . On the other hand, if ST ≤ K, then his option expires as a
worthless contract.
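Anticipating the explicit form of S_T used in the computation below, the fair price (44) can already be estimated by simulation. A minimal sketch in Python (the parameter values and names are only illustrative):

    import numpy as np

    rng = np.random.default_rng(2)
    S0, K, sigma, T = 100.0, 105.0, 0.2, 1.0
    paths = 1_000_000

    # in the case mu = r = 0 the stock price is the exponential martingale
    # S_T = S0 * exp(sigma * B_T - sigma^2 * T / 2), with B_T ~ N(0, T)
    B_T = rng.normal(0.0, np.sqrt(T), size=paths)
    S_T = S0 * np.exp(sigma * B_T - 0.5 * sigma ** 2 * T)

    q_estimate = np.mean(np.maximum(S_T - K, 0.0))   # Monte Carlo estimate of (44)
    print(q_estimate)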
Since we know the price process St to be an exponential martingale, we can explicitly
evaluate the option price q:
q = E( (S_T − K)^+ )

  = E( ( S_0 e^{σ B_T − ½σ²T} − K )^+ )

  = ∫_{−∞}^{∞} ( S_0 e^{w − ½σ²T} − K )^+ ϕ_{σ²T}(w) dw (45)

  = ∫_{log(K/S_0) + ½σ²T}^{∞} ( S_0 e^{w − ½σ²T} − K ) ϕ_{σ²T}(w) dw

  = S_0 Φ_{σ²T}( log(S_0/K) + ½σ²T ) − K Φ_{σ²T}( log(S_0/K) − ½σ²T ),

where Φ_λ : u ↦ Φ(u/√λ) is the distribution function of the normal law with mean 0 and variance λ (Φ denoting the standard normal distribution function), and ϕ_λ(w) = (1/√(2πλ)) exp(−w²/2λ) is the associated density function. In (45) we have made use of the equalities

e^{w − ½λ} ϕ_λ(w) = ϕ_λ(w − λ)

and

∫_x^{∞} ϕ_λ(w) dw = Φ_λ(−x).
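The last line of (45) is easy to evaluate in code. The sketch below (Python; the function names and parameter values are ours and only illustrative) implements it; for the parameters of the Monte Carlo sketch following (44) the two numbers agree up to sampling error.

    import math
    from statistics import NormalDist

    def Phi(u, lam):
        # normal distribution function with mean 0 and variance lam: Phi_lam(u) = Phi(u / sqrt(lam))
        return NormalDist().cdf(u / math.sqrt(lam))

    def call_price_no_interest(S0, K, sigma, T):
        # the fair option price (45) in the case mu = r = 0
        lam = sigma ** 2 * T
        x = math.log(S0 / K)
        return S0 * Phi(x + 0.5 * lam, lam) - K * Phi(x - 0.5 * lam, lam)

    print(call_price_no_interest(S0=100.0, K=105.0, sigma=0.2, T=1.0))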
The above option pricing formula covers the case μ = r = 0. Surprisingly, it also covers the case μ ≠ 0, r = 0; i.e., μ plays no role in the final result. In fact, the full Black and Scholes option pricing formula is obtained by substituting Ke^{−rT} for K to take devaluation into account. It was this surprising disappearance of μ from the formula that caused the difficulties that Black and Scholes experienced in getting their result accepted.
And indeed, for a justification of these statements we need a considerable extension of
our theoretical background.
We note that this equation expresses a kind of 'time delay': the amount a_{t_i} of stock bought at time t_i only has its effect at time t_{i+1}. This delay will remain even after the continuous-time limit is taken, when it will give rise to an Itô-term.
Now consider the portfolio value C_t at time t:

C_t := a_t S_t + b_t.

For a self-financing strategy the increments (ΔC)_i := C_{t_{i+1}} − C_{t_i} and (ΔS)_i := S_{t_{i+1}} − S_{t_i} then satisfy

(ΔC)_i = a_i (ΔS)_i.
Now, since there is no essential limit to the frequency of trade, the partition of [0, T] generated by the sequence of times 0 = t_0 < t_1 < t_2 < · · · < t_{n−1} < t_n = T can be made arbitrarily fine. It is therefore reasonable to make the following idealisation: a pair (a, b) of adapted processes with continuous paths is called a self-financing strategy if the portfolio value C_t := a_t S_t + b_t satisfies

dC_t = a_t dS_t ,

the right hand side being understood as an Itô integral with respect to the price process S.
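The discrete-time bookkeeping behind this idealisation can be made explicit in a few lines of Python (the rebalancing rule below is arbitrary and purely illustrative; only the self-financing update of the cash position b matters):

    import numpy as np

    rng = np.random.default_rng(3)
    n, S0, sigma, dt = 250, 100.0, 0.2, 1.0 / 250

    # one simulated stock path (mu = 0, for definiteness)
    increments = sigma * rng.normal(0.0, np.sqrt(dt), n) - 0.5 * sigma ** 2 * dt
    S = np.concatenate(([S0], S0 * np.exp(np.cumsum(increments))))

    a = rng.uniform(0.0, 1.0, n)   # arbitrary stock holdings on the intervals [t_i, t_{i+1})
    b = np.empty(n)
    b[0] = 10.0                    # arbitrary initial cash position
    for i in range(n - 1):
        # self-financing: the rebalancing at t_{i+1} is paid for out of the cash position
        b[i + 1] = b[i] - (a[i + 1] - a[i]) * S[i + 1]

    C = np.empty(n + 1)            # portfolio value C_i = a_i S_{t_i} + b_i
    C[:n] = a * S[:n] + b
    C[n] = a[n - 1] * S[n] + b[n - 1]

    # the increments of C are exactly a_i (Delta S)_i, whatever the rebalancing rule
    print(np.max(np.abs(np.diff(C) - a * np.diff(S))))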
We are now in a position to define what we mean by the ‘fair price’ of a claim or option.
Let g : [0, ∞) → [0, ∞) be measurable. By a claim to g(ST ) we mean the right to cash in
at time T the amount g(ST ), which depends on the current stock value ST at time T .
Definition 9.2 A claim to g(S_T) is called redundant if there exists a self-financing strategy (a, b) such that with probability 1,

C_T := a_T S_T + b_T = g(S_T).

In that case the fair price of the claim is defined as

F( g(S_T) ) := C_0 = a_0 S_0 + b_0.
9.4 Motivation
In the economic literature the above definition is usually motivated by the following
argument (a so-called arbitrage argument).
Suppose that claims to g(S_T) were traded at time 0 at a price p higher than the fair price q := F(g(S_T)). Then it
would be possible to make an unbounded and riskless profit (an ‘arbitrage’) by selling
n such claims for the market price p, then to reserve an amount nq as initial capital
for the self-financing strategy (na, nb) — yielding with probability 1 the amount ng(ST )
needed to satisfy the claims — and then to pocket the difference n(p − q). Conversely, if
the market price p of the claim were lower than q, then one could buy n claims and
apply the strategy (−na, −nb) yielding an immediate gain of n(q − p) at time 0. At time
T one could cancel one’s debts by executing the n claims to g(ST ). It should be admitted
that this second strategy, involving negative share holdings (or short sales of stock), is
somewhat more artificial than the first. But clearly, the possibility of arbitrage is not fair.
This concludes the motivation of Definition 9.2. In economic theory one often goes one
step further and assumes that arbitrage in fact does not occur. It is claimed that the
possibility of arbitrage would immediately be used by one of the parties on the market,
and this would set the market price equal to the fair price.
9.5 Results
Theorem 9.1 Let g ∈ C_0(0, ∞). On a stock market without bonds or interest (i.e., with r = 0), the fair price at time 0 of a claim to g(S_T) at time T > 0 is

F( g(S_T) ) = E( g(X_T) ),

where (X_t)_{t∈[0,T]} is the exponential martingale with parameter σ starting at S_0, i.e., the solution of the stochastic differential equation

dX_t = X_t ( σ dB_t ) with X_0 = S_0. (49)
Corollary 9.2 In the absence of bonds or interest the fair price at time 0 of an option to buy a share of stock at time T at the exercise price K is

F( (S_T − K)^+ ) = E( (X_T − K)^+ ),

where X_t is the exponential martingale of Theorem 9.1. The right hand side is given explicitly by (45).
Proof. Define f_t(x) := ( e^{(T−t)A} g )(x), where A := ½ σ² x² d²/dx² is the generator of the exponential martingale (X_t), and put a_t := f_t'(S_t), b_t := f_t(S_t) − S_t f_t'(S_t), so that C_t := a_t S_t + b_t = f_t(S_t) and C_T = g(S_T). By the Itô formula the dt-terms cancel, whatever the value of μ, and

d f_t(S_t) = f_t'(S_t) dS_t = a_t dS_t ,

so the strategy (a, b) is self-financing.
It follows that the fair price at time 0 is given by

F( g(S_T) ) := a_0 S_0 + b_0 = f_0(S_0) = ( e^{TA} g )(S_0) = E( g(X^{S_0}_T) ).
Note that there is no μ in this proof! Apparently the fair price at time 0 is not influenced
by μ! To begin to understand this fact, we take the case g(x) = x. Clearly the fair price
at time 0 of a share at time T is just S0 , not E(ST ): St is automatically a ‘martingale under
F’.
Explicit calculation of the strategy for the case g(x) = (x − K)^+ of a stock option yields

a_t = Φ_{σ²(T−t)}( log(S_t/K) + ½ σ²(T−t) )

and

b_t = −K Φ_{σ²(T−t)}( log(S_t/K) − ½ σ²(T−t) ).

These expressions describe a smooth steering mechanism, moving from the initial value (a_0, b_0) to the final value (a_T, b_T), given by

(a_T, b_T) = (0, 0) if S_T ≤ K, and (a_T, b_T) = (1, −K) if S_T > K.
Thus the pair (a_t, b_t) always moves inside [0, 1] × [−K, 0]: the strategy always involves borrowing money in order to buy up to one single share of stock. In cases where the option is cheap, relatively much money has to be borrowed in order to imitate the workings of the option.
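These formulas can be tested by simulation. The sketch below (Python; the time grid, the drift μ and the other parameter values are illustrative choices of ours) starts with the fair price (45) as initial capital, trades according to the strategy (a_t, b_t) above at finitely many times, and compares the terminal portfolio value with the payoff (S_T − K)^+. The drift μ is deliberately taken nonzero, to illustrate that it plays no role.

    import math
    import numpy as np
    from statistics import NormalDist

    def Phi(u, lam):
        # normal distribution function with mean 0 and variance lam
        return NormalDist().cdf(u / math.sqrt(lam))

    S0, K, sigma, mu, T, n = 100.0, 105.0, 0.2, 0.15, 1.0, 2000
    dt = T / n
    rng = np.random.default_rng(4)

    def stock_holding(t, S):
        # a_t from the formula above
        lam = sigma ** 2 * (T - t)
        return Phi(math.log(S / K) + 0.5 * lam, lam)

    lam0 = sigma ** 2 * T
    C0 = S0 * Phi(math.log(S0 / K) + 0.5 * lam0, lam0) - K * Phi(math.log(S0 / K) - 0.5 * lam0, lam0)
    S, a = S0, stock_holding(0.0, S0)
    b = C0 - a * S0                     # b_0, so that a_0 S_0 + b_0 is the fair price (45)

    for i in range(1, n + 1):
        # exact simulation of the stock price, geometric Brownian motion with drift mu
        S *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.normal())
        if i < n:
            a_new = stock_holding(i * dt, S)
            b -= (a_new - a) * S        # self-financing rebalancing (r = 0)
            a = a_new

    print(a * S + b, max(S - K, 0.0))   # terminal portfolio value vs. option payoff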
This result (the substitution of Ke^{−rT} for K mentioned above) will come out as a combined effect of an upward drift and a discount.
In the presence of bonds, a self-financing strategy is to be defined as a pair (a, b) of adapted processes with continuous paths such that the total portfolio value C_t := a_t S_t + b_t β_t satisfies

dC_t = a_t dS_t + b_t dβ_t.
Theorem 9.3 On a stock market with stocks and bonds the fair price at time 0 of a claim to g(S_T) at time T > 0 is

F( g(S_T) ) = E( e^{−rT} g(Y_T) ),

where (Y_t)_{t∈[0,T]} is the solution of the stochastic differential equation dY_t = Y_t( r dt + σ dB_t ) with Y_0 = S_0.
Proof. Let B := r x ∂/∂x + ½ σ² x² ∂²/∂x² be the generator of the diffusion Y_t and define f_t(x) := ( e^{(T−t)(B−r)} g )(x), a_t := f_t'(S_t) and b_t := ( f_t(S_t) − S_t f_t'(S_t) ) / β_t, so that a_t S_t + b_t β_t = f_t(S_t) and f_T(S_T) = g(S_T). By the Itô formula,

d f_t(S_t) = r ( f_t(S_t) − S_t f_t'(S_t) ) dt + f_t'(S_t) dS_t
           = b_t dβ_t + a_t dS_t ,

so the strategy (a, b) is self-financing. It follows that

F( g(S_T) ) := f_0(S_0) = ( e^{T(B−r)} g )(S_0) = E( e^{−rT} g(Y_T) ).
Corollary 9.4 On a stock market with stocks and bonds the fair price at time 0 of an option to buy a share of stock at time T at the exercise price K is

F( (S_T − K)^+ ) = E( e^{−rT} (Y_T − K)^+ ) = E( (X_T − Ke^{−rT})^+ )
                = S_0 Φ_{σ²T}( log(S_0/K) + rT + ½σ²T ) − K e^{−rT} Φ_{σ²T}( log(S_0/K) + rT − ½σ²T ). (50)

Proof. The first equality is Theorem 9.3, the second equality follows from the fact that Y_t = e^{rt} X_t, while the third equality is (45) with Ke^{−rT} substituted for K.
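In code, the substitution of Ke^{−rT} for K amounts to a one-line change in the routine given after (45); the sketch below (Python; names and parameter values are again only illustrative) also checks the resulting price against a Monte Carlo estimate of E( e^{−rT} (Y_T − K)^+ ).

    import math
    import numpy as np
    from statistics import NormalDist

    def Phi(u, lam):
        # normal distribution function with mean 0 and variance lam
        return NormalDist().cdf(u / math.sqrt(lam))

    def black_scholes_call(S0, K, sigma, r, T):
        # the r = 0 formula (45) with K replaced by K * exp(-r T)
        Kd = K * math.exp(-r * T)
        lam = sigma ** 2 * T
        x = math.log(S0 / Kd)
        return S0 * Phi(x + 0.5 * lam, lam) - Kd * Phi(x - 0.5 * lam, lam)

    S0, K, sigma, r, T = 100.0, 105.0, 0.2, 0.03, 1.0
    print(black_scholes_call(S0, K, sigma, r, T))

    # Monte Carlo check: Y_t = e^{rt} X_t, so Y_T = S0 * exp((r - sigma^2 / 2) T + sigma B_T)
    rng = np.random.default_rng(5)
    B_T = rng.normal(0.0, math.sqrt(T), size=500_000)
    Y_T = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * B_T)
    print(math.exp(-r * T) * np.mean(np.maximum(Y_T - K, 0.0)))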