
Chapter 1

Conditional Expectation and Linear Parabolic PDEs

Throughout this chapter, $(\Omega, \mathcal{F}, \mathbb{F}, P)$ is a filtered probability space with filtration $\mathbb{F} = \{\mathcal{F}_t, t \ge 0\}$ satisfying the usual conditions. Let $W = \{W_t, t \ge 0\}$ be a Brownian motion valued in $\mathbb{R}^d$, defined on $(\Omega, \mathcal{F}, \mathbb{F}, P)$.
Throughout this chapter, a maturity $T > 0$ will be fixed. By $\mathbb{H}^2$, we denote the collection of all progressively measurable processes $\phi$ with appropriate (finite) dimension such that $E\big[\int_0^T |\phi_t|^2\,dt\big] < \infty$.

1.1 Stochastic differential equations with random coefficients

In this section, we recall the basic tools from stochastic differential equations

$$dX_t = b_t(X_t)\,dt + \sigma_t(X_t)\,dW_t, \quad t \in [0,T], \tag{1.1}$$

where $T > 0$ is a given maturity date. Here, $b$ and $\sigma$ are $\mathbb{F} \otimes \mathcal{B}(\mathbb{R}^n)$-progressively measurable functions from $[0,T] \times \Omega \times \mathbb{R}^n$ to $\mathbb{R}^n$ and $M_{\mathbb{R}}(n,d)$, respectively. In particular, for every fixed $x \in \mathbb{R}^n$, the processes $\{b_t(x), \sigma_t(x), t \in [0,T]\}$ are $\mathbb{F}$-progressively measurable.

Definition 1.1. A strong solution of (1.1) is an $\mathbb{F}$-progressively measurable process $X$ such that $\int_0^T (|b_t(X_t)| + |\sigma_t(X_t)|^2)\,dt < \infty$, a.s. and
$$X_t = X_0 + \int_0^t b_s(X_s)\,ds + \int_0^t \sigma_s(X_s)\,dW_s, \quad t \in [0,T].$$


Let us mention that there is a notion of weak solutions which relaxes some conditions from the above definition in order to allow for more general stochastic differential equations. Weak solutions, as opposed to strong solutions, are defined on some probabilistic structure (which becomes part of the solution), and not necessarily on $(\Omega, \mathcal{F}, \mathbb{F}, P, W)$. Thus, for a weak solution we search for a probability structure $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{F}}, \tilde P, \tilde W)$ and a process $\tilde X$ such that the requirement of the above definition holds true. Obviously, any strong solution is a weak solution, but the opposite claim is false.
The main existence and uniqueness result is the following.

Theorem 1.2. Let $X_0 \in L^2$ be a random variable independent of $W$. Assume that the processes $b_\cdot(0)$ and $\sigma_\cdot(0)$ are in $\mathbb{H}^2$, and that for some $K > 0$:
$$|b_t(x) - b_t(y)| + |\sigma_t(x) - \sigma_t(y)| \le K|x - y| \quad\text{for all } t \in [0,T],\; x, y \in \mathbb{R}^n.$$
Then, for all $T > 0$, there exists a unique strong solution of (1.1) in $\mathbb{H}^2$. Moreover,
$$E\Big[\sup_{t \le T} |X_t|^2\Big] \le C\big(1 + E|X_0|^2\big)e^{CT}, \tag{1.2}$$
for some constant $C = C(T,K)$ depending on $T$ and $K$.


Proof. We first establish the existence and uniqueness result, then we prove the estimate (1.2).
Step 1. For a constant $c > 0$, to be fixed later, we introduce the norm
$$\|\phi\|_{\mathbb{H}^2_c} := E\Big[\int_0^T e^{-ct}|\phi_t|^2\,dt\Big]^{1/2} \quad\text{for every } \phi \in \mathbb{H}^2.$$
Clearly, the norms $\|\cdot\|_{\mathbb{H}^2}$ and $\|\cdot\|_{\mathbb{H}^2_c}$ on the Hilbert space $\mathbb{H}^2$ are equivalent. Consider the map $U$ on $\mathbb{H}^2$ defined by:
$$U(X)_t := X_0 + \int_0^t b_s(X_s)\,ds + \int_0^t \sigma_s(X_s)\,dW_s, \quad 0 \le t \le T.$$
By the Lipschitz property of $b$ and $\sigma$ in the $x$ variable and the fact that $b_\cdot(0), \sigma_\cdot(0) \in \mathbb{H}^2$, it follows that this map is well defined on $\mathbb{H}^2$. In order to prove existence and uniqueness of a solution for (1.1), we shall prove that $U(X) \in \mathbb{H}^2$ for all $X \in \mathbb{H}^2$ and that $U$ is a contracting mapping with respect to the norm $\|\cdot\|_{\mathbb{H}^2_c}$ for a convenient choice of the constant $c > 0$.
1- We first prove that $U(X) \in \mathbb{H}^2$ for all $X \in \mathbb{H}^2$. To see this, we decompose:
$$\|U(X)\|_{\mathbb{H}^2}^2 \le 3T\|X_0\|_{L^2}^2 + 3T\,E\Big[\int_0^T\Big|\int_0^t b_s(X_s)\,ds\Big|^2 dt\Big] + 3\,E\Big[\int_0^T\Big|\int_0^t \sigma_s(X_s)\,dW_s\Big|^2 dt\Big].$$
By the Lipschitz-continuity of $b$ and $\sigma$ in $x$, uniformly in $t$, we have $|b_t(x)|^2 \le K(1 + |b_t(0)|^2 + |x|^2)$ for some constant $K$. We then estimate the second term by:
$$E\Big[\int_0^T\Big|\int_0^t b_s(X_s)\,ds\Big|^2 dt\Big] \le KT\,E\Big[\int_0^T \big(1 + |b_s(0)|^2 + |X_s|^2\big)\,ds\Big] < \infty,$$
since $X \in \mathbb{H}^2$ and $b_\cdot(0) \in \mathbb{H}^2$.


As for the third term, we use the Doob maximal inequality together with the fact that $|\sigma_t(x)|^2 \le K(1 + |\sigma_t(0)|^2 + |x|^2)$, a consequence of the Lipschitz property of $\sigma$:
$$E\Big[\int_0^T\Big|\int_0^t \sigma_s(X_s)\,dW_s\Big|^2 dt\Big] \le T\,E\Big[\max_{t\le T}\Big|\int_0^t \sigma_s(X_s)\,dW_s\Big|^2\Big] \le 4T\,E\Big[\int_0^T |\sigma_s(X_s)|^2\,ds\Big] \le 4TK\,E\Big[\int_0^T \big(1 + |\sigma_s(0)|^2 + |X_s|^2\big)\,ds\Big] < \infty.$$

2- To see that $U$ is a contracting mapping for the norm $\|\cdot\|_{\mathbb{H}^2_c}$, for some convenient choice of $c > 0$, we consider two processes $X, Y \in \mathbb{H}^2$ with $X_0 = Y_0$, and we estimate that:
$$E\,|U(X)_t - U(Y)_t|^2 \le 2\,E\Big|\int_0^t \big(b_s(X_s) - b_s(Y_s)\big)\,ds\Big|^2 + 2\,E\Big|\int_0^t \big(\sigma_s(X_s) - \sigma_s(Y_s)\big)\,dW_s\Big|^2 = 2\,E\Big|\int_0^t \big(b_s(X_s) - b_s(Y_s)\big)\,ds\Big|^2 + 2\,E\int_0^t |\sigma_s(X_s) - \sigma_s(Y_s)|^2\,ds \le 2t\,E\int_0^t |b_s(X_s) - b_s(Y_s)|^2\,ds + 2\,E\int_0^t |\sigma_s(X_s) - \sigma_s(Y_s)|^2\,ds \le 2(T+1)K^2\int_0^t E|X_s - Y_s|^2\,ds.$$
Hence $\|U(X) - U(Y)\|_{\mathbb{H}^2_c} \le \big(\frac{2K^2(T+1)}{c}\big)^{1/2}\,\|X - Y\|_{\mathbb{H}^2_c}$, and therefore $U$ is a contracting mapping for sufficiently large $c$.
Step 2. We next prove the estimate (1.2). We shall alleviate the notation by writing $b_s := b_s(X_s)$ and $\sigma_s := \sigma_s(X_s)$. We directly estimate:
$$E\Big[\sup_{u\le t}|X_u|^2\Big] = E\Big[\sup_{u\le t}\Big|X_0 + \int_0^u b_s\,ds + \int_0^u \sigma_s\,dW_s\Big|^2\Big] \le 3\Big(E|X_0|^2 + t\,E\int_0^t |b_s|^2\,ds + E\Big[\sup_{u\le t}\Big|\int_0^u \sigma_s\,dW_s\Big|^2\Big]\Big) \le 3\Big(E|X_0|^2 + t\,E\int_0^t |b_s|^2\,ds + 4\,E\int_0^t |\sigma_s|^2\,ds\Big),$$
where we used Doob's maximal inequality. Since $b$ and $\sigma$ are Lipschitz-continuous in $x$, uniformly in $t$ and $\omega$, this provides:
$$E\Big[\sup_{u\le t}|X_u|^2\Big] \le C(K,T)\Big(1 + E|X_0|^2 + \int_0^t E\Big[\sup_{u\le s}|X_u|^2\Big]ds\Big),$$
and we conclude by using the Gronwall lemma. ♦
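The contraction argument of Step 1 is constructive: the Picard iterates $U^k(X)$ converge to the strong solution. As a minimal numerical sketch (our illustration, not from the text: the coefficients $b_t(x) = -x$, $\sigma_t(x) = 1$ and all parameters are assumptions made for concreteness), the following Python fragment runs the iteration $X^{(k+1)} = U(X^{(k)})$ on a single discretized Brownian path, with both integrals discretized at the left endpoint:

```python
import numpy as np

# Picard iteration X^{k+1} = U(X^k) on one discretized Brownian path,
# for dX_t = -X_t dt + dW_t, X_0 = 1 (illustrative coefficients).
rng = np.random.default_rng(0)
T, n = 1.0, 1000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)

b = lambda x: -x          # b_t(x), Lipschitz in x
sigma = lambda x: 1.0     # sigma_t(x), Lipschitz in x

def U(X):
    """One Picard step: U(X)_t = X_0 + int_0^t b(X_s) ds + int_0^t sigma(X_s) dW_s."""
    out = np.empty(n + 1)
    out[0] = X[0]
    # left-point (Ito) discretization of both integrals
    out[1:] = X[0] + np.cumsum(b(X[:-1]) * dt + sigma(X[:-1]) * dW)
    return out

X = np.full(n + 1, 1.0)   # start from the constant process equal to X_0
for k in range(8):
    X_new = U(X)
    print(f"iteration {k}: sup |U(X) - X| = {np.abs(X_new - X).max():.2e}")
    X = X_new
```

The printed sup-distances between successive iterates decay rapidly, which is the discrete counterpart of the contraction property established above.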

The following exercise shows that the Lipschitz-continuity condition on the coefficients $b$ and $\sigma$ can be relaxed. We observe that further relaxation of this assumption is possible in the one-dimensional case, see e.g. Karatzas and Shreve [8].

Exercise 1.3. In the context of this section, assume that the coefficients $b$ and $\sigma$ are locally Lipschitz and linearly growing in $x$, uniformly in $(t,\omega)$. By a localization argument, prove that strong existence and uniqueness hold for the stochastic differential equation (1.1).

In addition to the estimate (1.2) of Theorem 1.2, we have the following flow continuity results for the solution of the SDE.

Theorem 1.4. Let the conditions of Theorem 1.2 hold true, and consider some $(t,x), (t',x') \in [0,T)\times\mathbb{R}^n$ with $t \le t' \le T$.
(i) There is a constant $C$ such that:
$$E\Big[\sup_{t\le s\le t'}\big|X_s^{t,x} - X_s^{t,x'}\big|^2\Big] \le Ce^{Ct'}|x - x'|^2. \tag{1.3}$$
(ii) Assume further that $B := \sup_{t<t'\le T}(t'-t)^{-1}E\big[\int_t^{t'}\big(|b_r(0)|^2 + |\sigma_r(0)|^2\big)\,dr\big] < \infty$. Then for all $t' \in [t,T]$:
$$E\Big[\sup_{t'\le s\le T}\big|X_s^{t,x} - X_s^{t',x}\big|^2\Big] \le Ce^{CT}(B + |x|^2)\,|t' - t|. \tag{1.4}$$

0
Proof. (i) To simplify the notations, we set Xs := Xst,x and Xs0 := Xst,x for all
s 2 [t, T ]. We also denote x := x x0 , X := X X 0 , b := b(X) b(X 0 ) and
1.1. Stochastic differential equations 11

:= (X) (X 0 ). We first decompose:


✓ Z s Z s ◆
2 2
2 2
| Xs |  3 | x| + bu du + u dWu
t t
✓ Z s Z s ◆
2 2
2
 3 | x| + (s t) bu du + u dWu .
t t

Then, it follows from the Doob maximal inequality and the Lipschitz property
of the coefficients b and that:
 ✓ Z s Z s ◆
0 2 2 2 2
h(t ) := E sup | Xs |  3 | x| + (s t) E bu du + 4 E u du
tst0 t t
✓ Z s ◆
 3 | x|2 + K 2 (t0 + 4) E| Xu |2 du
t
✓ Z s ◆
 3 | x|2 + K 2 (t0 + 4) h(u)du .
t

Then the required estimate follows from the Gronwall inequality.


(ii) We next prove (1.4). We again simplify the notation by setting $X_s := X_s^{t,x}$, $s \in [t,T]$, and $X_s' := X_s^{t',x}$, $s \in [t',T]$. We also denote $\delta t := t'-t$, $\delta X := X - X'$, $\delta b := b(X) - b(X')$ and $\delta\sigma := \sigma(X) - \sigma(X')$. Then, following the same arguments as in the previous step, we obtain for all $u \in [t',T]$:
$$h(u) := E\Big[\sup_{t'\le s\le u}|\delta X_s|^2\Big] \le 3\Big(E|X_{t'} - x|^2 + K^2(T+4)\int_{t'}^u E|\delta X_r|^2\,dr\Big) \le 3\Big(E|X_{t'} - x|^2 + K^2(T+4)\int_{t'}^u h(r)\,dr\Big). \tag{1.5}$$
Observe that
$$E|X_{t'} - x|^2 \le 2\Big(E\Big|\int_t^{t'} b_r(X_r)\,dr\Big|^2 + E\Big|\int_t^{t'} \sigma_r(X_r)\,dW_r\Big|^2\Big) \le 2\Big(T\,E\int_t^{t'} |b_r(X_r)|^2\,dr + E\int_t^{t'} |\sigma_r(X_r)|^2\,dr\Big) \le 6(T+1)\int_t^{t'}\Big(K^2\big(E|X_r - x|^2 + |x|^2\big) + E|b_r(0)|^2 + E|\sigma_r(0)|^2\Big)\,dr \le 6(T+1)\Big((t'-t)\big(K^2|x|^2 + B\big) + K^2\int_t^{t'} E|X_r - x|^2\,dr\Big).$$
By the Gronwall inequality, this shows that
$$E|X_{t'} - x|^2 \le C(|x|^2 + B)\,|t'-t|\,e^{C(t'-t)}.$$
Plugging this estimate into (1.5), we see that:
$$h(u) \le 3\Big(C(|x|^2 + B)\,|t'-t|\,e^{C(t'-t)} + K^2(T+4)\int_{t'}^u h(r)\,dr\Big), \tag{1.6}$$
and the required estimate follows from the Gronwall inequality. ♦



1.2 Markov solutions of SDEs

In this section, we restrict the coefficients $b$ and $\sigma$ to be deterministic functions of $(t,x)$. In this context, we write
$$b_t(x) = b(t,x), \quad \sigma_t(x) = \sigma(t,x) \quad\text{for } t \in [0,T],\; x \in \mathbb{R}^n,$$
where $b$ and $\sigma$ are continuous functions, Lipschitz in $x$ uniformly in $t$. Let $X_\cdot^{t,x}$ denote the solution of the stochastic differential equation
$$X_s^{t,x} = x + \int_t^s b\big(u, X_u^{t,x}\big)\,du + \int_t^s \sigma\big(u, X_u^{t,x}\big)\,dW_u, \quad s \ge t.$$
The two following properties are obvious:
• Clearly, $X_s^{t,x} = F\big(t, x, s, (W_u - W_t)_{t\le u\le s}\big)$ for some deterministic function $F$.
• For $t \le u \le s$: $X_s^{t,x} = X_s^{u, X_u^{t,x}}$. This follows from the pathwise uniqueness, and holds also when $u$ is a stopping time.
With these observations, we have the following Markov property for the solutions of stochastic differential equations.

Proposition 1.5. (Markov property) For all $0 \le t \le s$:
$$E\big[\Phi(X_u, t\le u\le s)\,\big|\,\mathcal{F}_t\big] = E\big[\Phi(X_u, t\le u\le s)\,\big|\,X_t\big]$$
for every bounded function $\Phi : C([t,s]) \to \mathbb{R}$.

1.3 Connection with linear partial differential equations

1.3.1 Generator

Let $\{X_s^{t,x}, s \ge t\}$ be the unique strong solution of
$$X_s^{t,x} = x + \int_t^s \mu\big(u, X_u^{t,x}\big)\,du + \int_t^s \sigma\big(u, X_u^{t,x}\big)\,dW_u, \quad s \ge t,$$
where $\mu$ and $\sigma$ satisfy the required conditions for existence and uniqueness of a strong solution.
For a function $f : \mathbb{R}^n \to \mathbb{R}$, we define the function $\mathcal{A}f$ by
$$\mathcal{A}f(t,x) = \lim_{h\to 0}\frac{E\big[f\big(X_{t+h}^{t,x}\big)\big] - f(x)}{h} \quad\text{if the limit exists.}$$
Clearly, $\mathcal{A}f$ is well-defined for all bounded $C^2$-functions with bounded derivatives, and
$$\mathcal{A}f(t,x) = \mu(t,x)\cdot\frac{\partial f}{\partial x}(x) + \frac12\,\mathrm{Tr}\Big[\sigma\sigma^{\mathrm T}(t,x)\,\frac{\partial^2 f}{\partial x\,\partial x^{\mathrm T}}(x)\Big] \tag{1.7}$$
(Exercise!). The linear differential operator $\mathcal{A}$ is called the generator of $X$. It turns out that the process $X$ can be completely characterized by its generator or, more precisely, by the generator and the corresponding domain of definition. As the following result shows, the generator provides an intimate connection between conditional expectations and linear partial differential equations.
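As a simple illustration of (1.7), consider the one-dimensional geometric Brownian motion $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$ with constant coefficients (an example of our choosing): for smooth $f$,
$$\mathcal{A}f(x) = \mu x f'(x) + \frac12\,\sigma^2 x^2 f''(x).$$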
Proposition 1.6. Assume that the function $(t,x) \mapsto v(t,x) := E\big[g\big(X_T^{t,x}\big)\big]$ is $C^{1,2}([0,T)\times\mathbb{R}^n)$. Then $v$ solves the partial differential equation:
$$\frac{\partial v}{\partial t} + \mathcal{A}v = 0 \quad\text{and}\quad v(T,\cdot) = g.$$
Proof. Given $(t,x)$, let $\tau_1 := T \wedge \inf\{s > t : |X_s^{t,x} - x| \ge 1\}$. By the law of iterated expectations together with the Markov property of the process $X$, it follows that
$$v(t,x) = E\big[v\big(s\wedge\tau_1, X_{s\wedge\tau_1}^{t,x}\big)\big].$$
Since $v \in C^{1,2}([0,T)\times\mathbb{R}^n)$, we may apply Itô's formula, and we obtain by taking expectations:
$$0 = E\Big[\int_t^{s\wedge\tau_1}\Big(\frac{\partial v}{\partial t} + \mathcal{A}v\Big)(u, X_u^{t,x})\,du\Big] + E\Big[\int_t^{s\wedge\tau_1}\frac{\partial v}{\partial x}(u, X_u^{t,x})\cdot\sigma(u, X_u^{t,x})\,dW_u\Big] = E\Big[\int_t^{s\wedge\tau_1}\Big(\frac{\partial v}{\partial t} + \mathcal{A}v\Big)(u, X_u^{t,x})\,du\Big],$$
where the last equality follows from the boundedness of $(\partial v/\partial x\cdot\sigma)(u, X_u^{t,x})$ on $[t, s\wedge\tau_1]$. We now divide by $s - t$ and send $s \searrow t$; the required result follows from the mean value theorem and the dominated convergence theorem. ♦

1.3.2 Cauchy problem and the Feynman-Kac representation

In this section, we consider the following linear partial differential equation
$$\frac{\partial v}{\partial t} + \mathcal{A}v - k(t,x)v + f(t,x) = 0, \;\; (t,x)\in[0,T)\times\mathbb{R}^d, \qquad v(T,\cdot) = g, \tag{1.8}$$
where $\mathcal{A}$ is the generator (1.7), $g$ is a given function from $\mathbb{R}^d$ to $\mathbb{R}$, $k$ and $f$ are functions from $[0,T]\times\mathbb{R}^d$ to $\mathbb{R}$, and $b$ and $\sigma$ are functions from $[0,T]\times\mathbb{R}^d$ to $\mathbb{R}^d$ and $M_{\mathbb{R}}(d,d)$, respectively. This is the so-called Cauchy problem.
For example, when $k = f \equiv 0$, $b \equiv 0$, and $\sigma$ is the identity matrix, the above partial differential equation reduces to the heat equation.
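In this special case, $\mathcal{A} = \frac12\Delta$, and the solution of the Cauchy problem is given by the classical heat kernel representation
$$v(t,x) = E\big[g(x + W_{T-t})\big] = \int_{\mathbb{R}^d} g(y)\,\big(2\pi(T-t)\big)^{-d/2}\,e^{-|y-x|^2/(2(T-t))}\,dy,$$
a worked instance of the Feynman-Kac formula of Theorem 1.7 below.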
Our objective is to provide a representation of this purely deterministic problem by means of stochastic differential equations. We then assume that $\mu$ and $\sigma$ satisfy the conditions of Theorem 1.2, namely that
$$\mu, \sigma \;\text{Lipschitz in } x \text{ uniformly in } t, \qquad \int_0^T \big(|\mu(t,0)|^2 + |\sigma(t,0)|^2\big)\,dt < \infty. \tag{1.9}$$

Theorem 1.7. Let the coefficients $\mu, \sigma$ be continuous and satisfy (1.9). Assume further that the function $k$ is uniformly bounded from below, and $f$ has quadratic growth in $x$ uniformly in $t$. Let $v$ be a $C^{1,2}([0,T)\times\mathbb{R}^d)\cap C^0([0,T]\times\mathbb{R}^d)$ solution of (1.8) with quadratic growth in $x$ uniformly in $t$. Then
$$v(t,x) = E\Big[\int_t^T \beta_s^{t,x}\,f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\,g\big(X_T^{t,x}\big)\Big], \quad t \le T,\; x \in \mathbb{R}^d,$$
where $X_s^{t,x} := x + \int_t^s \mu(u, X_u^{t,x})\,du + \int_t^s \sigma(u, X_u^{t,x})\,dW_u$ and $\beta_s^{t,x} := e^{-\int_t^s k(u, X_u^{t,x})\,du}$ for $t \le s \le T$.
Proof. We first introduce the sequence of stopping times
$$\tau_n := \Big(T - \frac1n\Big)\wedge\inf\big\{s > t : \big|X_s^{t,x} - x\big| \ge n\big\},$$
and we observe that $\tau_n \to T$, $P$-a.s. Since $v$ is smooth, it follows from Itô's formula that for $t \le s < T$:
$$d\big(\beta_s^{t,x}\,v(s, X_s^{t,x})\big) = \beta_s^{t,x}\Big(-kv + \frac{\partial v}{\partial t} + \mathcal{A}v\Big)(s, X_s^{t,x})\,ds + \beta_s^{t,x}\,\frac{\partial v}{\partial x}(s, X_s^{t,x})\cdot\sigma(s, X_s^{t,x})\,dW_s = \beta_s^{t,x}\Big(-f(s, X_s^{t,x})\,ds + \frac{\partial v}{\partial x}(s, X_s^{t,x})\cdot\sigma(s, X_s^{t,x})\,dW_s\Big),$$
by the PDE satisfied by $v$ in (1.8). Then:
$$E\big[\beta_{\tau_n}^{t,x}\,v\big(\tau_n, X_{\tau_n}^{t,x}\big)\big] - v(t,x) = E\Big[\int_t^{\tau_n}\beta_s^{t,x}\Big(-f(s, X_s^{t,x})\,ds + \frac{\partial v}{\partial x}(s, X_s^{t,x})\cdot\sigma(s, X_s^{t,x})\,dW_s\Big)\Big].$$
Now observe that the integrand in the stochastic integral is bounded by definition of the stopping time $\tau_n$, the smoothness of $v$, and the continuity of $\sigma$. Then the stochastic integral has zero mean, and we deduce that
$$v(t,x) = E\Big[\int_t^{\tau_n}\beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_{\tau_n}^{t,x}\,v\big(\tau_n, X_{\tau_n}^{t,x}\big)\Big]. \tag{1.10}$$
Since $\tau_n \to T$ and the Brownian motion has continuous sample paths $P$-a.s., it follows from the continuity of $v$ that, $P$-a.s.,
$$\int_t^{\tau_n}\beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_{\tau_n}^{t,x}\,v\big(\tau_n, X_{\tau_n}^{t,x}\big) \xrightarrow[n\to\infty]{} \int_t^T \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\,v\big(T, X_T^{t,x}\big) = \int_t^T \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\,g\big(X_T^{t,x}\big) \tag{1.11}$$
by the terminal condition satisfied by $v$ in (1.8). Moreover, since $k$ is bounded from below and the functions $f$ and $v$ have quadratic growth in $x$ uniformly in $t$, we have
$$\Big|\int_t^{\tau_n}\beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_{\tau_n}^{t,x}\,v\big(\tau_n, X_{\tau_n}^{t,x}\big)\Big| \le C\Big(1 + \max_{t\le s\le T}|X_s|^2\Big).$$
By the estimate stated in the existence and uniqueness Theorem 1.2, the latter bound is integrable, and we deduce from the dominated convergence theorem that the convergence in (1.11) holds in $L^1(P)$, proving the required result by taking limits in (1.10). ♦

The above Feynman-Kac representation formula has an important numerical implication. Indeed, it opens the door to the use of Monte Carlo methods in order to obtain a numerical approximation of the solution of the partial differential equation (1.8). For the sake of simplicity, we provide the main idea in the case $f = k = 0$. Let $X^{(1)}, \ldots, X^{(k)}$ be an iid sample drawn from the distribution of $X_T^{t,x}$, and compute the mean:
$$\hat v_k(t,x) := \frac1k\sum_{i=1}^k g\big(X^{(i)}\big).$$
By the Law of Large Numbers, it follows that $\hat v_k(t,x) \to v(t,x)$, $P$-a.s. Moreover, the error estimate is provided by the Central Limit Theorem:
$$\sqrt{k}\,\big(\hat v_k(t,x) - v(t,x)\big) \xrightarrow[k\to\infty]{} \mathcal{N}\big(0, \mathrm{Var}\big[g\big(X_T^{t,x}\big)\big]\big) \quad\text{in distribution},$$
and is remarkably independent of the dimension $d$ of the variable $X$!
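As a minimal sketch of this Monte Carlo method in the case $f = k = 0$ (the drift, diffusion, payoff, and all numerical parameters below are illustrative assumptions, not from the text), the following Python fragment combines an Euler-Maruyama discretization of $X^{t,x}$ with the sample mean $\hat v_k$ and the CLT-based error estimate:

```python
import numpy as np

def euler_paths(t, x, T, mu, sigma, n_steps, n_paths, rng):
    """Euler-Maruyama samples of X_T^{t,x} for dX = mu(s,X) ds + sigma(s,X) dW."""
    dt = (T - t) / n_steps
    X = np.full(n_paths, float(x))
    s = t
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        X = X + mu(s, X) * dt + sigma(s, X) * dW
        s += dt
    return X

# Illustrative choices: Ornstein-Uhlenbeck dX = -X dt + dW, payoff g(x) = x^2
rng = np.random.default_rng(0)
XT = euler_paths(t=0.0, x=1.0, T=1.0,
                 mu=lambda s, x: -x, sigma=lambda s, x: np.ones_like(x),
                 n_steps=200, n_paths=100_000, rng=rng)
g = XT ** 2
v_hat = g.mean()
# CLT-based 95% confidence half-width -- independent of the dimension of X
half_width = 1.96 * g.std(ddof=1) / np.sqrt(len(g))
print(f"v_hat = {v_hat:.4f} +/- {half_width:.4f}")
```

Note that the Euler discretization introduces a bias which is not accounted for by the confidence interval; only the statistical error is quantified by the Central Limit Theorem.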

1.3.3 Representation of the Dirichlet problem

Let $D$ be an open bounded subset of $\mathbb{R}^d$. The Dirichlet problem is to find a function $u$ solving:
$$\mathcal{A}u - ku + f = 0 \;\text{ on } D \quad\text{and}\quad u = g \;\text{ on } \partial D, \tag{1.12}$$
where $\partial D$ denotes the boundary of $D$, $f$ and $k$ are continuous functions from $\mathbb{R}^d$ to $\mathbb{R}$, and $\mathcal{A}$ is the generator of the process $X^{0,X_0}$ defined as the unique strong solution of the homogeneous (time independent coefficients) stochastic differential equation
$$X_t^{0,X_0} = X_0 + \int_0^t \mu\big(X_s^{0,X_0}\big)\,ds + \int_0^t \sigma\big(X_s^{0,X_0}\big)\,dW_s, \quad t \ge 0.$$
Similarly to the representation result for the Cauchy problem obtained in Theorem 1.7, we have the following representation result for the Dirichlet problem.

Theorem 1.8. Let $u$ be a $C^2$ solution of the Dirichlet problem (1.12). Assume that $k$ is nonnegative, and
$$E[\tau_D^x] < \infty, \;\; x \in \mathbb{R}^d, \quad\text{where}\quad \tau_D^x := \inf\big\{t \ge 0 : X_t^{0,x} \notin D\big\}.$$
Then, we have the representation:
$$u(x) = E\Big[g\big(X_{\tau_D^x}^{0,x}\big)\,e^{-\int_0^{\tau_D^x} k(X_s)\,ds} + \int_0^{\tau_D^x} f\big(X_t^{0,x}\big)\,e^{-\int_0^t k(X_s)\,ds}\,dt\Big].$$

Exercise 1.9. Provide a proof of Theorem 1.8 by imitating the arguments in the proof of Theorem 1.7.

1.4 The stochastic control approach to the Black-Scholes model

1.4.1 The continuous-time financial market

Let $T$ be a finite horizon, and $(\Omega, \mathcal{F}, P)$ be a complete probability space supporting a Brownian motion $W = \{(W_t^1, \ldots, W_t^d), 0 \le t \le T\}$ with values in $\mathbb{R}^d$. We denote by $\mathbb{F} = \mathbb{F}^W = \{\mathcal{F}_t, 0 \le t \le T\}$ the canonical augmented filtration of $W$, i.e. the canonical filtration augmented by zero measure sets of $\mathcal{F}_T$.
We consider a financial market consisting of $d+1$ assets:
(i) The first asset $S^0$ is non-risky, and is defined by
$$S_t^0 = \exp\Big(\int_0^t r_u\,du\Big), \quad 0 \le t \le T,$$
where $\{r_t, t\in[0,T]\}$ is a non-negative adapted process with $\int_0^T r_t\,dt < \infty$ a.s., and represents the instantaneous interest rate.
(ii) The $d$ remaining assets $S^i$, $i = 1, \ldots, d$, are risky assets with price processes defined by the dynamics
$$\frac{dS_t^i}{S_t^i} = \mu_t^i\,dt + \sum_{j=1}^d \sigma_t^{i,j}\,dW_t^j, \quad t\in[0,T],$$
for $1 \le i \le d$, where $\mu, \sigma$ are $\mathbb{F}$-adapted processes with $\int_0^T |\mu_t^i|\,dt + \int_0^T |\sigma_t^{i,j}|^2\,dt < \infty$ for all $i, j = 1, \ldots, d$. It is convenient to use matrix notation to represent the dynamics of the price vector $S = (S^1, \ldots, S^d)$:
$$dS_t = S_t \star (\mu_t\,dt + \sigma_t\,dW_t), \quad t\in[0,T],$$
where, for two vectors $x, y \in \mathbb{R}^d$, we denote by $x \star y$ the vector of $\mathbb{R}^d$ with components $(x\star y)^i = x^i y^i$, $i = 1, \ldots, d$, and $\mu, \sigma$ are the $\mathbb{R}^d$-vector with components $\mu^i$ and the $M_{\mathbb{R}}(d,d)$-matrix with entries $\sigma^{i,j}$.
We assume that the $M_{\mathbb{R}}(d,d)$-matrix $\sigma_t$ is invertible for every $t\in[0,T]$ a.s., and we introduce the process
$$\lambda_t := \sigma_t^{-1}(\mu_t - r_t\mathbf{1}), \quad 0 \le t \le T,$$
called the risk premium process. Here $\mathbf{1}$ is the vector of ones in $\mathbb{R}^d$. We shall frequently make use of the discounted processes
$$\tilde S_t := \frac{S_t}{S_t^0} = S_t\exp\Big(-\int_0^t r_u\,du\Big).$$
Using the above matrix notation, the dynamics of the process $\tilde S$ are given by
$$d\tilde S_t = \tilde S_t \star \big((\mu_t - r_t\mathbf{1})\,dt + \sigma_t\,dW_t\big) = \tilde S_t \star \sigma_t(\lambda_t\,dt + dW_t).$$

1.4.2 Portfolio and wealth process

A portfolio strategy is an $\mathbb{F}$-adapted process $\pi = \{\pi_t, 0 \le t \le T\}$ with values in $\mathbb{R}^d$. For $1 \le i \le d$ and $0 \le t \le T$, $\pi_t^i$ is the amount (in Euros) invested in the risky asset $S^i$.
We next recall the self-financing condition in the present framework. Let $X_t^\pi$ denote the portfolio value, or wealth, process at time $t$ induced by the portfolio strategy $\pi$. Then, the amount invested in the non-risky asset is $X_t^\pi - \sum_{i=1}^d \pi_t^i = X_t^\pi - \pi_t\cdot\mathbf{1}$.
Under the self-financing condition, the dynamics of the wealth process are given by
$$dX_t^\pi = \sum_{i=1}^d \frac{\pi_t^i}{S_t^i}\,dS_t^i + \frac{X_t^\pi - \pi_t\cdot\mathbf{1}}{S_t^0}\,dS_t^0.$$
Let $\tilde X^\pi$ be the discounted wealth process
$$\tilde X_t^\pi := X_t^\pi\exp\Big(-\int_0^t r_u\,du\Big), \quad 0 \le t \le T.$$
Then, by an immediate application of Itô's formula, we see that
$$d\tilde X_t^\pi = \tilde\pi_t\cdot\sigma_t(\lambda_t\,dt + dW_t), \quad 0 \le t \le T, \tag{1.13}$$
where $\tilde\pi_t := e^{-\int_0^t r_u\,du}\,\pi_t$. We still need to place further technical conditions on $\pi$, at least in order for the above wealth process to be well-defined as a stochastic integral.
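For the reader's convenience, here is the Itô computation behind (1.13). Write $D_t := \exp(-\int_0^t r_u\,du)$, so that $\tilde X_t^\pi = D_t X_t^\pi$ and $dD_t = -r_t D_t\,dt$; since $D$ has finite variation, there is no cross-variation term, and
$$d\tilde X_t^\pi = D_t\,dX_t^\pi - r_t D_t X_t^\pi\,dt = D_t\,\pi_t\cdot\big((\mu_t - r_t\mathbf{1})\,dt + \sigma_t\,dW_t\big) = \tilde\pi_t\cdot\sigma_t(\lambda_t\,dt + dW_t),$$
where the second equality uses the self-financing dynamics of $X^\pi$, and the last one uses $\mu_t - r_t\mathbf{1} = \sigma_t\lambda_t$.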
Before this, let us observe that, assuming that the risk premium process satisfies the Novikov condition:
$$E\Big[e^{\frac12\int_0^T |\lambda_t|^2\,dt}\Big] < \infty,$$
it follows from the Girsanov theorem that the process
$$B_t := W_t + \int_0^t \lambda_u\,du, \quad 0 \le t \le T, \tag{1.14}$$
is a Brownian motion under the equivalent probability measure
$$Q := Z_T\cdot P \;\text{ on } \mathcal{F}_T \quad\text{where}\quad Z_T := \exp\Big(-\int_0^T \lambda_u\cdot dW_u - \frac12\int_0^T |\lambda_u|^2\,du\Big).$$
In terms of the $Q$-Brownian motion $B$, the discounted price process satisfies
$$d\tilde S_t = \tilde S_t \star \sigma_t\,dB_t, \quad t\in[0,T],$$
and the discounted wealth process induced by an initial capital $X_0$ and a portfolio strategy $\pi$ can be written as
$$\tilde X_t^\pi = \tilde X_0 + \int_0^t \tilde\pi_u\cdot\sigma_u\,dB_u, \quad 0 \le t \le T. \tag{1.15}$$

Definition 1.10. An admissible portfolio process $\pi = \{\pi_t, t\in[0,T]\}$ is an $\mathbb{F}$-progressively measurable process such that $\int_0^T |\sigma_t^{\mathrm T}\pi_t|^2\,dt < \infty$, a.s. and the corresponding discounted wealth process is bounded from below by a $Q$-martingale:
$$\tilde X_t^\pi \ge M_t^\pi, \quad 0 \le t \le T, \quad\text{for some } Q\text{-martingale } M^\pi.$$

The collection of all admissible portfolio processes will be denoted by $\mathcal{A}$.
The lower bound $M^\pi$, which may depend on the portfolio $\pi$, has the interpretation of a finite credit line imposed on the investor. This natural generalization of the more usual constant credit line corresponds to the situation where the total credit available to an investor is indexed by some financial holding, such as the physical assets of the company or the personal home of the investor, used as collateral. From the mathematical viewpoint, this condition is needed in order to exclude any arbitrage opportunity, and will be justified in the subsequent subsection.

1.4.3 Admissible portfolios and no-arbitrage

We first define precisely the notion of no-arbitrage.

Definition 1.11. We say that the financial market contains no arbitrage opportunities if, for any admissible portfolio process $\pi \in \mathcal{A}$,
$$X_0 = 0 \;\text{and}\; X_T^\pi \ge 0 \;\; P\text{-a.s.} \quad\text{implies}\quad X_T^\pi = 0 \;\; P\text{-a.s.}$$

The purpose of this section is to show that the financial market described above contains no arbitrage opportunities. Our first observation is that, by the very definition of the probability measure $Q$, the discounted price process $\tilde S$ satisfies:
$$\text{the process } \{\tilde S_t, 0\le t\le T\} \text{ is a } Q\text{-local martingale.} \tag{1.16}$$
For this reason, $Q$ is called a risk neutral measure, or an equivalent local martingale measure, for the price process $S$.
We also observe that the discounted wealth process satisfies:
$$\tilde X^\pi \text{ is a } Q\text{-local martingale for every } \pi\in\mathcal{A}, \tag{1.17}$$
as a stochastic integral with respect to the $Q$-Brownian motion $B$.

Theorem 1.12. The continuous-time financial market described above contains no arbitrage opportunities, i.e. for every $\pi\in\mathcal{A}$:
$$X_0 = 0 \;\text{and}\; X_T^\pi \ge 0 \;\; P\text{-a.s.} \implies X_T^\pi = 0 \;\; P\text{-a.s.}$$
Proof. For $\pi\in\mathcal{A}$, the discounted wealth process $\tilde X^\pi$ is a $Q$-local martingale bounded from below by a $Q$-martingale. Then $\tilde X^\pi$ is a $Q$-super-martingale. In particular, $E^Q[\tilde X_T^\pi] \le \tilde X_0 = X_0$. Recall that $Q$ is equivalent to $P$ and $S^0$ is strictly positive. Then, this inequality shows that, whenever $X_0^\pi = 0$ and $X_T^\pi \ge 0$ $P$-a.s. (or equivalently $Q$-a.s.), we have $\tilde X_T^\pi = 0$ $Q$-a.s. and therefore $X_T^\pi = 0$ $P$-a.s. ♦

1.4.4 Super-hedging and no-arbitrage bounds

Let $G$ be an $\mathcal{F}_T$-measurable random variable representing the payoff of a derivative security with given maturity $T > 0$. The super-hedging problem consists in finding the minimal initial cost so as to be able to face the payment $G$ without risk at the maturity of the contract $T$:
$$V(G) := \inf\big\{X_0\in\mathbb{R} : X_T^\pi \ge G \;\; P\text{-a.s. for some } \pi\in\mathcal{A}\big\}.$$
Remark 1.13. Notice that $V(G)$ depends on the reference measure $P$ only by means of the corresponding null sets. Therefore, the super-hedging problem is not changed if $P$ is replaced by any equivalent probability measure.
We now show that, under the no-arbitrage condition, the super-hedging problem provides no-arbitrage bounds on the market price of the derivative security.
Assume that the buyer of the contingent claim $G$ has the same access to the financial market as the seller. Then $V(G)$ is the maximal amount that the buyer of the contingent claim contract is willing to pay. Indeed, if the seller requires a premium of $V(G) + 2\varepsilon$, for some $\varepsilon > 0$, then the buyer would not accept to pay this amount as he can obtain at least $G$ by trading on the financial market with initial capital $V(G) + \varepsilon$.
Now, since selling the contingent claim $G$ is the same as buying the contingent claim $-G$, we deduce from the previous argument that
$$-V(-G) \le \text{market price of } G \le V(G). \tag{1.18}$$

1.4.5 The no-arbitrage valuation formula

We denote by $p(G)$ the market price of a derivative security $G$.

Theorem 1.14. Let $G$ be an $\mathcal{F}_T$-measurable random variable representing the payoff of a derivative security at the maturity $T > 0$, and recall the notation $\tilde G := G\exp\big(-\int_0^T r_t\,dt\big)$. Assume that $E^Q[|\tilde G|] < \infty$. Then
$$p(G) = V(G) = E^Q[\tilde G].$$
Moreover, there exists a portfolio $\pi^*\in\mathcal{A}$ such that $X_0^{\pi^*} = p(G)$ and $X_T^{\pi^*} = G$, a.s., that is, $\pi^*$ is a perfect replication strategy.
Proof. 1- We first prove that $V(G) \ge E^Q[\tilde G]$. Let $X_0$ and $\pi\in\mathcal{A}$ be such that $X_T^\pi \ge G$, a.s. or, equivalently, $\tilde X_T^\pi \ge \tilde G$ a.s. Notice that $\tilde X^\pi$ is a $Q$-super-martingale, as a $Q$-local martingale bounded from below by a $Q$-martingale. Then $X_0 = \tilde X_0 \ge E^Q[\tilde X_T^\pi] \ge E^Q[\tilde G]$.
2- We next prove that $V(G) \le E^Q[\tilde G]$. Define the $Q$-martingale $Y_t := E^Q[\tilde G|\mathcal{F}_t]$ and observe that $\mathbb{F}^W = \mathbb{F}^B$. Then, it follows from the martingale representation theorem that $Y_t = Y_0 + \int_0^t \phi_u\cdot dB_u$ for some $\mathbb{F}$-adapted process $\phi$ with $\int_0^T |\phi_t|^2\,dt < \infty$ a.s. Setting $\tilde\pi^* := (\sigma^{\mathrm T})^{-1}\phi$, we see that
$$\pi^*\in\mathcal{A} \quad\text{and}\quad Y_0 + \int_0^T \tilde\pi_u^*\cdot\sigma_u\,dB_u = \tilde G \;\; P\text{-a.s.},$$
which implies that $Y_0 \ge V(G)$ and $\pi^*$ is a perfect hedging strategy for $G$, starting from the initial capital $Y_0$.
3- From the previous steps, we have $V(G) = E^Q[\tilde G]$. Applying this result to $-G$, we see that $V(-G) = -V(G)$, so that the no-arbitrage bounds (1.18) imply that the no-arbitrage market price of $G$ is given by $V(G)$. ♦

1.4.6 PDE characterization of the Black-Scholes price

In this subsection, we specialize further the model to the case where the risky securities price processes are Markov diffusions defined by the stochastic differential equations:
$$dS_t = S_t \star \big(r(t,S_t)\,dt + \sigma(t,S_t)\,dB_t\big).$$
Here $(t,s)\mapsto s\star r(t,s)$ and $(t,s)\mapsto s\star\sigma(t,s)$ are Lipschitz-continuous functions from $\mathbb{R}_+\times[0,\infty)^d$ to $\mathbb{R}^d$ and $\mathbb{S}_d$, respectively. We also consider a Vanilla derivative security defined by the payoff
$$G = g(S_T),$$
where $g : [0,\infty)^d \to \mathbb{R}$ is a measurable function bounded from below. From the previous subsection, the no-arbitrage price at time $t$ of this derivative security is given by
$$V(t,S_t) = E^Q\big[e^{-\int_t^T r(u,S_u)\,du}\,g(S_T)\,\big|\,\mathcal{F}_t\big] = E^Q\big[e^{-\int_t^T r(u,S_u)\,du}\,g(S_T)\,\big|\,S_t\big],$$
where the last equality follows from the Markov property of the process $S$. Assuming further that $g$ has linear growth, it follows that $V$ has linear growth in $s$ uniformly in $t$. Since $V$ is defined by a conditional expectation, it is expected to satisfy the linear PDE:
$$\partial_t V + r\,s\cdot DV + \frac12\,\mathrm{Tr}\big[(s\star\sigma)^2 D^2V\big] - rV = 0. \tag{1.19}$$
More precisely, if $V \in C^{1,2}([0,T)\times\mathbb{R}_+^d)$, then $V$ is a classical solution of (1.19) and satisfies the final condition $V(T,\cdot) = g$. Conversely, if the PDE (1.19) combined with the final condition $v(T,\cdot) = g$ has a classical solution $v$ with linear growth, then $v$ coincides with the derivative security price $V$.
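As an illustration of this characterization in the simplest possible setting (a sketch under assumed constant coefficients $r$ and $\sigma$, dimension $d = 1$ and call payoff $g(s) = (s-K)^+$, none of which are imposed by the text), the following Python fragment compares the Monte Carlo estimate of the conditional expectation with the classical solution of (1.19), the Black-Scholes formula:

```python
import numpy as np
from scipy.stats import norm

# Illustrative parameters: constant r, sigma, one-dimensional call payoff
r, sigma, S0, K, T = 0.05, 0.2, 100.0, 100.0, 1.0

# Monte Carlo under Q: S_T = S_0 exp((r - sigma^2/2) T + sigma B_T)
rng = np.random.default_rng(0)
n = 1_000_000
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * rng.normal(size=n))
mc_price = np.mean(np.exp(-r * T) * np.maximum(ST - K, 0.0))

# Closed-form Black-Scholes price, the classical solution of (1.19) for d = 1
d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
d2 = d1 - sigma * np.sqrt(T)
bs_price = S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

print(f"Monte Carlo: {mc_price:.4f}   closed form: {bs_price:.4f}")
```

The two numbers agree up to the Monte Carlo error, in line with the equality between the conditional expectation and the classical solution of the PDE.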
Chapter 2

Stochastic Control and Dynamic Programming

In this chapter, we assume that the filtration $\mathbb{F}$ is the $P$-augmentation of the canonical filtration of the Brownian motion $W$. This restriction is only needed in order to simplify the presentation of the proof of the dynamic programming principle. We will also denote by
$$S := [0,T)\times\mathbb{R}^n \quad\text{where } T\in[0,\infty].$$
The set $S$ is called the parabolic interior of the state space. We will denote by $\bar S := \mathrm{cl}(S)$ its closure, i.e. $\bar S = [0,T]\times\mathbb{R}^n$ for finite $T$, and $\bar S = S$ for $T = \infty$.

2.1 Stochastic control problems in standard form

Control processes. Given a subset $U$ of $\mathbb{R}^k$, we denote by $\mathcal{U}$ the set of all progressively measurable processes $\nu = \{\nu_t, t < T\}$ valued in $U$. The elements of $\mathcal{U}$ are called control processes.

Controlled process. Let
$$b : (t,x,u)\in S\times U \longrightarrow b(t,x,u)\in\mathbb{R}^n$$
and
$$\sigma : (t,x,u)\in S\times U \longrightarrow \sigma(t,x,u)\in M_{\mathbb{R}}(n,d)$$
be two continuous functions satisfying the conditions
$$|b(t,x,u) - b(t,y,u)| + |\sigma(t,x,u) - \sigma(t,y,u)| \le K|x - y|, \tag{2.1}$$
$$|b(t,x,u)| + |\sigma(t,x,u)| \le K(1 + |x| + |u|), \tag{2.2}$$

for some constant $K$ independent of $(t,x,y,u)$. For each control process $\nu\in\mathcal{U}$, we consider the controlled stochastic differential equation:
$$dX_t = b(t,X_t,\nu_t)\,dt + \sigma(t,X_t,\nu_t)\,dW_t. \tag{2.3}$$
If the above equation has a unique solution $X$, for a given initial data, then the process $X$ is called the controlled process, as its dynamics is driven by the action of the control process $\nu$.
We shall be working with the following subclass of control processes:
$$\mathcal{U}_0 := \mathcal{U}\cap\mathbb{H}^2, \tag{2.4}$$
where $\mathbb{H}^2$ is the collection of all progressively measurable processes with finite $L^2(\Omega\times[0,T))$-norm. Then, for every finite maturity $T'\le T$, it follows from the above uniform Lipschitz condition on the coefficients $b$ and $\sigma$ that
$$E\Big[\int_0^{T'} \big(|b| + |\sigma|^2\big)(s,x,\nu_s)\,ds\Big] < \infty \quad\text{for all } \nu\in\mathcal{U}_0,\; x\in\mathbb{R}^n,$$
which guarantees the existence of a controlled process on the time interval $[0,T']$ for each given initial condition and control. The following result is an immediate consequence of Theorem 1.2.

Theorem 2.1. Let $\nu\in\mathcal{U}_0$ be a control process, and $\xi\in L^2(P)$ be an $\mathcal{F}_0$-measurable random variable. Then, there exists a unique $\mathbb{F}$-adapted process $X^\nu$ satisfying (2.3) together with the initial condition $X_0^\nu = \xi$. Moreover, for every $T > 0$, there is a constant $C > 0$ such that
$$E\Big[\sup_{0\le s\le t}|X_s^\nu|^2\Big] \le C\big(1 + E[|\xi|^2]\big)e^{Ct} \quad\text{for all } t\in\mathrm{cl}([0,T)). \tag{2.5}$$

Cost functional. Let
$$f, k : [0,T)\times\mathbb{R}^n\times U \to \mathbb{R} \quad\text{and}\quad g : \mathbb{R}^n \to \mathbb{R}$$
be given functions. We assume that $f, k$ are continuous and $\|k^-\|_\infty < \infty$ (i.e. $\max(-k,0)$ is uniformly bounded). Moreover, we assume that $f$ and $g$ satisfy the quadratic growth condition:
$$|f(t,x,u)| + |g(x)| \le K(1 + |u| + |x|^2),$$
for some constant $K$ independent of $(t,x,u)$. We define the cost function $J$ on $[0,T]\times\mathbb{R}^n\times\mathcal{U}$ by:
$$J(t,x,\nu) := E\Big[\int_t^T \beta^\nu(t,s)\,f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,T)\,g\big(X_T^{t,x,\nu}\big)\,1_{T<\infty}\Big],$$
when this expression is meaningful, where
$$\beta^\nu(t,s) := e^{-\int_t^s k(r, X_r^{t,x,\nu}, \nu_r)\,dr},$$
and $\{X_s^{t,x,\nu}, s\ge t\}$ is the solution of (2.3) with control process $\nu$ and initial condition $X_t^{t,x,\nu} = x$.

Admissible control processes. In the finite horizon case $T < \infty$, the quadratic growth condition on $f$ and $g$ together with the bound on $k^-$ ensure that $J(t,x,\nu)$ is well-defined for all control processes $\nu\in\mathcal{U}_0$. We then define the set of admissible controls in this case by $\mathcal{U}_0$.
More attention is needed for the infinite horizon case. In particular, the discount term $k$ needs to play a role to ensure the finiteness of the integral. In this setting the largest set of admissible control processes is given by
$$\mathcal{U}_0 := \Big\{\nu\in\mathcal{U} : E\Big[\int_0^\infty \beta^\nu(t,s)\big(1 + |X_s^{t,x,\nu}|^2 + |\nu_s|\big)\,ds\Big] < \infty \text{ for all } x\Big\} \quad\text{when } T = \infty.$$

The stochastic control problem. The purpose of this section is to study the maximization problem
$$V(t,x) := \sup_{\nu\in\mathcal{U}_0} J(t,x,\nu) \quad\text{for } (t,x)\in S.$$

Our main concern is to describe the local behavior of the value function $V$ by means of the so-called dynamic programming equation, or Hamilton-Jacobi-Bellman equation. We continue with some remarks.

Remark 2.2. (i) If $V(t,x) = J(t,x,\hat\nu_{t,x})$, we call $\hat\nu_{t,x}$ an optimal control for the problem $V(t,x)$.
(ii) The following are some interesting subsets of controls:
- a process $\nu\in\mathcal{U}_0$ which is adapted to the natural filtration $\mathbb{F}^X$ of the associated state process is called a feedback control,
- a process $\nu\in\mathcal{U}_0$ which can be written in the form $\nu_s = \tilde u(s,X_s)$ for some measurable map $\tilde u$ from $[0,T]\times\mathbb{R}^n$ into $U$, is called a Markovian control; notice that any Markovian control is a feedback control,
- the deterministic processes of $\mathcal{U}_0$ are called open loop controls.
(iii) Suppose that $T < \infty$, and let $(Y,Z)$ be the controlled processes defined by
$$dY_s = Z_s\,f(s,X_s,\nu_s)\,ds \quad\text{and}\quad dZ_s = -Z_s\,k(s,X_s,\nu_s)\,ds,$$
and define the augmented state process $\bar X := (X,Y,Z)$. Then, the above value function $V$ can be written in the form:
$$V(t,x) = \bar V(t,x,0,1),$$
where $\bar x = (x,y,z)$ is some initial data for the augmented state process $\bar X$,
$$\bar V(t,\bar x) := E_{t,\bar x}\big[\bar g(\bar X_T)\big] \quad\text{and}\quad \bar g(x,y,z) := y + g(x)z.$$
Hence the stochastic control problem $V$ can be reduced without loss of generality to the case where $f = k \equiv 0$. We shall appeal to this reduced form whenever convenient for the exposition.

(iv) For notational simplicity we consider the case $T < \infty$ and $f = k = 0$. The previous remark shows how to immediately adapt the following argument so that the present remark holds true without the restriction $f = k = 0$. The extension to the infinite horizon case is also immediate.
Consider the value function
$$\tilde V(t,x) := \sup_{\nu\in\mathcal{U}_t} E\big[g\big(X_T^{t,x,\nu}\big)\big], \tag{2.6}$$
differing from $V$ by the restriction of the control processes to
$$\mathcal{U}_t := \{\nu\in\mathcal{U}_0 : \nu \text{ independent of } \mathcal{F}_t\}. \tag{2.7}$$
Since $\mathcal{U}_t\subset\mathcal{U}_0$, it is obvious that $\tilde V \le V$. We claim that
$$\tilde V = V, \tag{2.8}$$
so that both problems are indeed equivalent. To see this, fix $(t,x)\in S$ and $\nu\in\mathcal{U}_0$. Then, $\nu$ can be written as a measurable function of the canonical process $\nu\big((\omega_s)_{0\le s\le t}, (\omega_s - \omega_t)_{t\le s\le T}\big)$, where, for fixed $(\omega_s)_{0\le s\le t}$, the map $\nu_{(\omega_s)_{0\le s\le t}} : (\omega_s - \omega_t)_{t\le s\le T} \mapsto \nu\big((\omega_s)_{0\le s\le t}, (\omega_s - \omega_t)_{t\le s\le T}\big)$ can be viewed as a control independent of $\mathcal{F}_t$. Using the independence of the increments of the Brownian motion, together with Fubini's Lemma, it thus follows that
$$J(t,x;\nu) = \int E\Big[g\Big(X_T^{t,x,\nu_{(\omega_s)_{0\le s\le t}}}\Big)\Big]\,dP\big((\omega_s)_{0\le s\le t}\big) \le \int \tilde V(t,x)\,dP\big((\omega_s)_{0\le s\le t}\big) = \tilde V(t,x).$$
By the arbitrariness of $\nu\in\mathcal{U}_0$, this implies that $\tilde V(t,x) \ge V(t,x)$.

2.2 The dynamic programming principle

2.2.1 A weak dynamic programming principle

The dynamic programming principle is the main tool in the theory of stochastic control. In these notes, we shall prove rigorously a weak version of the dynamic programming principle which will be sufficient for the derivation of the dynamic programming equation. We denote:
$$V_*(t,x) := \liminf_{(t',x')\to(t,x)} V(t',x') \quad\text{and}\quad V^*(t,x) := \limsup_{(t',x')\to(t,x)} V(t',x'),$$
for all $(t,x)\in\bar S$. We also recall the subset of controls $\mathcal{U}_t$ introduced in (2.7) above.

Theorem 2.3. Assume that $V$ is locally bounded and fix $(t,x)\in S$. Let $\{\theta^\nu, \nu\in\mathcal{U}_t\}$ be a family of finite stopping times independent of $\mathcal{F}_t$ with values in $[t,T]$. Then:
$$V(t,x) \ge \sup_{\nu\in\mathcal{U}_t} E\Big[\int_t^{\theta^\nu} \beta^\nu(t,s)\,f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,\theta^\nu)\,V_*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big)\Big].$$
Assume further that $g$ is lower-semicontinuous and $X^{t,x,\nu}1_{[t,\theta^\nu]}$ is $L^\infty$-bounded for all $\nu\in\mathcal{U}_t$. Then
$$V(t,x) \le \sup_{\nu\in\mathcal{U}_t} E\Big[\int_t^{\theta^\nu} \beta^\nu(t,s)\,f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,\theta^\nu)\,V^*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big)\Big].$$

We shall provide an intuitive justification of this result after the following comments. A rigorous proof is reported in Section 2.2.2 below.
(i) If $V$ is continuous, then $V = V_* = V^*$, and the above weak dynamic programming principle reduces to the classical dynamic programming principle:
$$V(t,x) = \sup_{\nu\in\mathcal{U}} E_{t,x}\Big[\int_t^\theta \beta(t,s)\,f(s,X_s,\nu_s)\,ds + \beta(t,\theta)\,V(\theta,X_\theta)\Big]. \tag{2.9}$$
(ii) In the discrete-time framework, the dynamic programming principle (2.9) can be stated as follows:
$$V(t,x) = \sup_{u\in U} E_{t,x}\big[f(t,X_t,u) + e^{-k(t+1,X_{t+1},u)}\,V(t+1,X_{t+1})\big].$$
Observe that the supremum is now taken over the subset $U$ of the finite dimensional space $\mathbb{R}^k$. Hence, the dynamic programming principle allows us to reduce the initial maximization problem, over the subset $\mathcal{U}$ of the infinite dimensional set of $\mathbb{R}^k$-valued processes, to a finite dimensional maximization problem. However, we are still facing an infinite dimensional problem since the dynamic programming principle relates the value function at time $t$ to the value function at time $t+1$.
(iii) In the context of the above discrete-time framework with finite horizon $T < \infty$, notice that the dynamic programming principle suggests the following backward algorithm to compute $V$ as well as the associated optimal strategy (when it exists). Since $V(T,\cdot) = g$ is known, the above dynamic programming principle can be applied recursively in order to deduce the value function $V(t,x)$ for every $t$; a numerical sketch of this backward algorithm is given after this list.
(iv) In the continuous time setting, there is no obvious counterpart to the above backward algorithm. But, as the stopping time $\theta$ approaches $t$, the above dynamic programming principle implies a special local behavior for the value function $V$. When $V$ is known to be smooth, this will be obtained by means of Itô's formula.
(v) It is usually very difficult to determine a priori the regularity of $V$. The situation is even worse since there are many counter-examples showing that the value function $V$ cannot be expected to be smooth in general; see Section 2.4. This problem is solved by appealing to the notion of viscosity solutions, which provides a weak local characterization of the value function $V$.
(vi) Once the local behavior of the value function is characterized, we are faced with the important uniqueness issue, which implies that $V$ is completely characterized by its local behavior together with some convenient boundary condition.
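The following Python fragment is a minimal sketch of the backward algorithm of items (ii) and (iii) above, for a hypothetical discrete-time problem on a finite grid; the dynamics $X_{t+1} = X_t + u + \varepsilon$ with $\varepsilon = \pm 1$, the rewards $f$ and $g$, and all parameters are illustrative assumptions, not from the text:

```python
import numpy as np

# Backward induction for V(t,x) = sup_u E[ f(t,x,u) + e^{-k} V(t+1, X_{t+1}) ].
T = 10
states = np.arange(-20, 21)           # finite state grid
controls = np.array([-1, 0, 1])       # finite control set U
g = lambda x: -np.abs(x)              # terminal reward g
f = lambda t, x, u: -0.1 * u**2       # running reward f
k = 0.0                               # constant discount rate (zero here)

nx = len(states)
V = g(states).astype(float)           # V(T, .) = g
policy = np.zeros((T, nx), dtype=int)

def next_index(i, u, eps):
    """Grid index of X_{t+1} = X_t + u + eps, clipped at the boundary."""
    return np.clip(i + u + eps, 0, nx - 1)

for t in range(T - 1, -1, -1):        # backward in time
    Q = np.empty((len(controls), nx))
    i = np.arange(nx)
    for j, u in enumerate(controls):
        # E[ V(t+1, X_{t+1}) ] with eps = +1/-1, each with probability 1/2
        EV = 0.5 * V[next_index(i, u, +1)] + 0.5 * V[next_index(i, u, -1)]
        Q[j] = f(t, states, u) + np.exp(-k) * EV
    policy[t] = Q.argmax(axis=0)      # optimal control index at each state
    V = Q.max(axis=0)                 # the dynamic programming recursion

print("V(0, 0) =", V[nx // 2])
```

Each backward step is a finite-dimensional maximization, exactly as pointed out in item (ii); the infinite-dimensional character of the original problem is absorbed into the recursion over $t$.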

Intuitive justification of (2.9). Let us assume that $V$ is continuous. In particular, $V$ is measurable and $V = V_* = V^*$. Let $\tilde V(t,x)$ denote the right hand-side of (2.9).
By the tower property of the conditional expectation operator, it is easily checked that
$$J(t,x,\nu) = E_{t,x}\Big[\int_t^\theta \beta(t,s)\,f(s,X_s,\nu_s)\,ds + \beta(t,\theta)\,J(\theta,X_\theta,\nu)\Big].$$
Since $J(\theta,X_\theta,\nu) \le V(\theta,X_\theta)$, this proves that $V \le \tilde V$. To prove the reverse inequality, let $\mu\in\mathcal{U}$ and $\varepsilon > 0$ be fixed, and consider an $\varepsilon$-optimal control $\nu^\varepsilon$ for the problem $V(\theta,X_\theta)$, i.e.
$$J(\theta,X_\theta,\nu^\varepsilon) \ge V(\theta,X_\theta) - \varepsilon.$$
Clearly, one can choose $\nu^\varepsilon = \mu$ on the stochastic interval $[t,\theta]$. Then
$$V(t,x) \ge J(t,x,\nu^\varepsilon) = E_{t,x}\Big[\int_t^\theta \beta(t,s)\,f(s,X_s,\mu_s)\,ds + \beta(t,\theta)\,J(\theta,X_\theta,\nu^\varepsilon)\Big] \ge E_{t,x}\Big[\int_t^\theta \beta(t,s)\,f(s,X_s,\mu_s)\,ds + \beta(t,\theta)\,V(\theta,X_\theta)\Big] - \varepsilon\,E_{t,x}\big[\beta(t,\theta)\big].$$
This provides the required inequality by the arbitrariness of $\mu\in\mathcal{U}$ and $\varepsilon > 0$. ♦

Exercise. Where is the gap in the above sketch of the proof?

2.2.2 Dynamic programming without measurable selection

In this section, we provide a rigorous proof of Theorem 2.3. Notice that we have no information on whether $V$ is measurable or not. Because of this, the right-hand side of the classical dynamic programming principle (2.9) is not even known to be well-defined.
The formulation of Theorem 2.3 avoids this measurability problem since $V_*$ and $V^*$ are lower- and upper-semicontinuous, respectively, and therefore measurable. In addition, it allows us to avoid the typically heavy technicalities related to measurable selection arguments needed for the proof of the classical (2.9) after a convenient relaxation of the control problem, see e.g. El Karoui and Jeanblanc [5].

Proof of Theorem 2.3. For simplicity, we consider the finite horizon case $T < \infty$, so that, without loss of generality, we assume $f = k = 0$, see Remark 2.2 (iii). The extension to the infinite horizon framework is immediate.
1. Let $\nu\in\mathcal{U}_t$ be arbitrary and set $\theta := \theta^\nu$. Then:
$$E\big[g\big(X_T^{t,x,\nu}\big)\,\big|\,\mathcal{F}_\theta\big](\omega) = J\big(\theta(\omega), X_\theta^{t,x,\nu}(\omega); \tilde\nu_\omega\big),$$
where $\tilde\nu_\omega$ is obtained from $\nu$ by freezing its trajectory up to the stopping time $\theta$. Since, by definition, $J(\theta(\omega), X_\theta^{t,x,\nu}(\omega); \tilde\nu_\omega) \le V^*(\theta(\omega), X_\theta^{t,x,\nu}(\omega))$, it follows from the tower property of conditional expectations that
$$E\big[g\big(X_T^{t,x,\nu}\big)\big] = E\big[E\big[g\big(X_T^{t,x,\nu}\big)\,\big|\,\mathcal{F}_\theta\big]\big] \le E\big[V^*\big(\theta, X_\theta^{t,x,\nu}\big)\big],$$
which provides the second inequality of Theorem 2.3 by the arbitrariness of $\nu\in\mathcal{U}_t$.
2. Let $\varepsilon > 0$ be given, and consider an arbitrary function
$$\varphi : S\to\mathbb{R} \quad\text{such that } \varphi \text{ is upper-semicontinuous and } V \ge \varphi.$$
2.a. There is a family $(\nu^{(s,y),\varepsilon})_{(s,y)\in S}\subset\mathcal{U}_0$ such that:
$$\nu^{(s,y),\varepsilon}\in\mathcal{U}_s \quad\text{and}\quad J(s,y;\nu^{(s,y),\varepsilon}) \ge V(s,y) - \varepsilon, \quad\text{for every } (s,y)\in S. \tag{2.10}$$
Since $g$ is lower-semicontinuous and has quadratic growth, it follows from Theorem 2.1 that the function $(t',x')\mapsto J(t',x';\nu^{(s,y),\varepsilon})$ is lower-semicontinuous, for fixed $(s,y)\in S$. Together with the upper-semicontinuity of $\varphi$, this implies that we may find a family $(r_{(s,y)})_{(s,y)\in S}$ of positive scalars so that, for any $(s,y)\in S$,
$$\varphi(s,y) - \varphi(t',x') \ge -\varepsilon \quad\text{and}\quad J(s,y;\nu^{(s,y),\varepsilon}) - J(t',x';\nu^{(s,y),\varepsilon}) \le \varepsilon \quad\text{for } (t',x')\in B(s,y;r_{(s,y)}), \tag{2.11}$$
where, for $r > 0$ and $(s,y)\in S$,
$$B(s,y;r) := \{(t',x')\in S : t'\in(s-r,s),\; |x'-y| < r\}.$$
Clearly, $\{B(s,y;r) : (s,y)\in S, 0 < r \le r_{(s,y)}\}$ forms an open covering of $[0,T)\times\mathbb{R}^d$. It then follows from the Lindelöf covering Theorem, see e.g. [4] Theorem 6.3 Chap. VIII, that we can find a countable sequence $(t_i,x_i,r_i)_{i\ge1}$ of elements of $S\times\mathbb{R}$, with $0 < r_i \le r_{(t_i,x_i)}$ for all $i\ge1$, such that $S\subset(\{T\}\times\mathbb{R}^d)\cup(\cup_{i\ge1}B(t_i,x_i;r_i))$. Set $A_0 := \{T\}\times\mathbb{R}^d$, $C_{-1} := \emptyset$, and define the sequence
$$A_{i+1} := B(t_{i+1},x_{i+1};r_{i+1})\setminus C_i \quad\text{where}\quad C_i := C_{i-1}\cup A_i, \; i\ge0.$$
With this construction, it follows from (2.10), (2.11), together with the fact that $V\ge\varphi$, that the countable family $(A_i)_{i\ge0}$ satisfies
$$\big(\theta, X_\theta^{t,x,\nu}\big)\in\cup_{i\ge0}A_i \;\; P\text{-a.s.}, \quad A_i\cap A_j = \emptyset \text{ for } i\ne j\in\mathbb{N}, \quad\text{and}\quad J(\cdot;\nu^{i,\varepsilon}) \ge \varphi - 3\varepsilon \text{ on } A_i \text{ for } i\ge1, \tag{2.12}$$
where $\nu^{i,\varepsilon} := \nu^{(t_i,x_i),\varepsilon}$ for $i\ge1$.


2.b. We now prove the first inequality in Theorem 2.3. We fix $\nu\in\mathcal{U}_t$ and $\theta\in\mathcal{T}^t_{[t,T]}$. Set $A^n := \cup_{0\le i\le n}A_i$, $n\ge1$. Given $\nu\in\mathcal{U}_t$, we define for $s\in[t,T]$:
$$\nu_s^{\varepsilon,n} := 1_{[t,\theta]}(s)\,\nu_s + 1_{(\theta,T]}(s)\Big(\nu_s\,1_{(A^n)^c}\big(\theta, X_\theta^{t,x,\nu}\big) + \sum_{i=1}^n 1_{A_i}\big(\theta, X_\theta^{t,x,\nu}\big)\,\nu_s^{i,\varepsilon}\Big).$$
Notice that $\{(\theta, X_\theta^{t,x,\nu})\in A_i\}\in\mathcal{F}_\theta^t$. Then, it follows that $\nu^{\varepsilon,n}\in\mathcal{U}_t$, and we deduce from (2.12) that:
$$E\Big[g\Big(X_T^{t,x,\nu^{\varepsilon,n}}\Big)\,\Big|\,\mathcal{F}_\theta\Big]\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big) = V\big(T, X_T^{t,x,\nu}\big)\,1_{A_0}\big(\theta, X_\theta^{t,x,\nu}\big) + \sum_{i=1}^n J\big(\theta, X_\theta^{t,x,\nu}, \nu^{i,\varepsilon}\big)\,1_{A_i}\big(\theta, X_\theta^{t,x,\nu}\big) \ge \sum_{i=0}^n \big(\varphi\big(\theta, X_\theta^{t,x,\nu}\big) - 3\varepsilon\big)\,1_{A_i}\big(\theta, X_\theta^{t,x,\nu}\big) = \big(\varphi\big(\theta, X_\theta^{t,x,\nu}\big) - 3\varepsilon\big)\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big),$$
which, by definition of $V$ and the tower property of conditional expectations, implies
$$V(t,x) \ge J(t,x,\nu^{\varepsilon,n}) = E\Big[E\Big[g\Big(X_T^{t,x,\nu^{\varepsilon,n}}\Big)\,\Big|\,\mathcal{F}_\theta\Big]\Big] \ge E\Big[\big(\varphi\big(\theta, X_\theta^{t,x,\nu}\big) - 3\varepsilon\big)\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big)\Big] + E\Big[g\big(X_T^{t,x,\nu}\big)\,1_{(A^n)^c}\big(\theta, X_\theta^{t,x,\nu}\big)\Big].$$
Since $g(X_T^{t,x,\nu})\in L^1$, it follows from the dominated convergence theorem that:
$$V(t,x) \ge -3\varepsilon + \liminf_{n\to\infty} E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big)\big] = -3\varepsilon + \lim_{n\to\infty} E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)^+\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big)\big] - \lim_{n\to\infty} E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)^-\,1_{A^n}\big(\theta, X_\theta^{t,x,\nu}\big)\big] = -3\varepsilon + E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)\big],$$
where the last equality follows from the left-hand side of (2.12) and from the monotone convergence theorem, due to the fact that either $E[\varphi(\theta, X_\theta^{t,x,\nu})^+] < \infty$ or $E[\varphi(\theta, X_\theta^{t,x,\nu})^-] < \infty$. By the arbitrariness of $\nu\in\mathcal{U}_t$ and $\varepsilon > 0$, this shows that:
$$V(t,x) \ge \sup_{\nu\in\mathcal{U}_t} E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)\big]. \tag{2.13}$$

3. It remains to deduce the first inequality of Theorem 2.3 from (2.13). Fix $r > 0$. It follows from standard arguments, see e.g. Lemma 3.5 in [12], that we can find a sequence of continuous functions $(\varphi_n)_n$ such that $\varphi_n \le V_* \le V$ for all $n\ge1$ and such that $\varphi_n$ converges pointwise to $V_*$ on $[0,T]\times B_r(0)$. Set $\phi^N := \min_{n\ge N}\varphi_n$ for $N\ge1$ and observe that the sequence $(\phi^N)_N$ is non-decreasing and converges pointwise to $V_*$ on $[0,T]\times B_r(0)$. By (2.13) and the monotone convergence theorem, we then obtain:
$$V(t,x) \ge \lim_{N\to\infty} E\big[\phi^N\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big)\big] = E\big[V_*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big)\big]. \; ♦$$

2.3 The dynamic programming equation

The dynamic programming equation is the infinitesimal counterpart of the dynamic programming principle. It is also widely called the Hamilton-Jacobi-Bellman equation. In this section, we shall derive it under strong smoothness assumptions on the value function. Let $\mathbb{S}_n$ be the set of all $n\times n$ symmetric matrices with real coefficients, and define the map $H : S\times\mathbb{R}\times\mathbb{R}^n\times\mathbb{S}_n\to\mathbb{R}$ by:
$$H(t,x,r,p,\gamma) := \sup_{u\in U}\Big\{-k(t,x,u)\,r + b(t,x,u)\cdot p + \frac12\,\mathrm{Tr}\big[\sigma\sigma^{\mathrm T}(t,x,u)\,\gamma\big] + f(t,x,u)\Big\}.$$
We also need to introduce the linear second order operator $\mathcal{L}^u$ associated to the process $\{\beta^u(0,t)X_t^u, t\ge0\}$ controlled by the constant control process $u$:
$$\mathcal{L}^u\varphi(t,x) := -k(t,x,u)\,\varphi(t,x) + b(t,x,u)\cdot D\varphi(t,x) + \frac12\,\mathrm{Tr}\big[\sigma\sigma^{\mathrm T}(t,x,u)\,D^2\varphi(t,x)\big],$$
where $D$ and $D^2$ denote the gradient and the Hessian operators with respect to the $x$ variable. With this notation, we have by Itô's formula:
$$\beta^\nu(0,s)\,\varphi(s,X_s^\nu) - \beta^\nu(0,t)\,\varphi(t,X_t^\nu) = \int_t^s \beta^\nu(0,r)\,(\partial_t + \mathcal{L}^{\nu_r})\varphi(r,X_r^\nu)\,dr + \int_t^s \beta^\nu(0,r)\,D\varphi(r,X_r^\nu)\cdot\sigma(r,X_r^\nu,\nu_r)\,dW_r$$
for every $s\ge t$, smooth function $\varphi\in C^{1,2}([t,s]\times\mathbb{R}^n)$, and each admissible control process $\nu\in\mathcal{U}_0$.

Proposition 2.4. Assume the value function $V\in C^{1,2}([0,T)\times\mathbb{R}^n)$, and let the coefficients $k(\cdot,\cdot,u)$ and $f(\cdot,\cdot,u)$ be continuous in $(t,x)$ for all fixed $u\in U$. Then, for all $(t,x)\in S$:
$$-\partial_t V(t,x) - H\big(t,x,V(t,x),DV(t,x),D^2V(t,x)\big) \ge 0. \tag{2.14}$$
Proof. Let $(t,x)\in S$ and $u\in U$ be fixed and consider the constant control process $\nu = u$, together with the associated state process $X$ with initial data $X_t = x$. For all $h > 0$, define the stopping time:
$$\theta_h := \inf\{s > t : (s-t, X_s - x)\notin[0,h)\times\alpha B\},$$
where $\alpha > 0$ is some given constant, and $B$ denotes the unit ball of $\mathbb{R}^n$. Notice that $\theta_h\to t$, $P$-a.s. when $h\searrow0$, and $\theta_h = h$ for $h\le\bar h(\omega)$ sufficiently small.
1. From the first inequality of the dynamic programming principle, it follows that:
$$0 \le E_{t,x}\Big[\beta(0,t)V(t,x) - \beta(0,\theta_h)V(\theta_h, X_{\theta_h}) - \int_t^{\theta_h}\beta(0,r)f(r,X_r,u)\,dr\Big] = -E_{t,x}\Big[\int_t^{\theta_h}\beta(0,r)\big(\partial_t V + \mathcal{L}^u V + f(\cdot,\cdot,u)\big)(r,X_r)\,dr\Big] - E_{t,x}\Big[\int_t^{\theta_h}\beta(0,r)\,DV(r,X_r)\cdot\sigma(r,X_r,u)\,dW_r\Big],$$
where the last equality follows from Itô's formula and uses the crucial smoothness assumption on $V$.
2. Observe that $\beta(0,r)\,DV(r,X_r)\cdot\sigma(r,X_r,u)$ is bounded on the stochastic interval $[t,\theta_h]$. Therefore, the second expectation on the right hand-side of the last inequality vanishes, and we obtain:
$$-E_{t,x}\Big[\frac1h\int_t^{\theta_h}\beta(0,r)\big(\partial_t V + \mathcal{L}^u V + f(\cdot,\cdot,u)\big)(r,X_r)\,dr\Big] \ge 0.$$
We now send $h$ to zero. The a.s. convergence of the random value inside the expectation is easily obtained by the mean value theorem; recall that $\theta_h = h$ for sufficiently small $h > 0$. Since the random variable $h^{-1}\int_t^{\theta_h}\beta(0,r)(\partial_t V + \mathcal{L}^u V + f(\cdot,\cdot,u))(r,X_r)\,dr$ is essentially bounded, uniformly in $h$, on the stochastic interval $[t,\theta_h]$, it follows from the dominated convergence theorem that:
$$-\partial_t V(t,x) - \mathcal{L}^u V(t,x) - f(t,x,u) \ge 0.$$
By the arbitrariness of $u\in U$, this provides the required claim. ♦

We next wish to show that $V$ satisfies the nonlinear partial differential equation (2.15) with equality. This is a more technical result which can be proved by different methods. We shall report a proof, based on a contradiction argument, which provides more intuition on this result, although it might be slightly longer than the usual proof reported in standard textbooks.

Proposition 2.5. Assume the value function $V\in C^{1,2}([0,T)\times\mathbb{R}^n)$, and let the function $H$ be upper semicontinuous, and $\|k^+\|_\infty < \infty$. Then, for all $(t,x)\in S$:
$$-\partial_t V(t,x) - H\big(t,x,V(t,x),DV(t,x),D^2V(t,x)\big) \le 0. \tag{2.15}$$
Proof. Let $(t_0,x_0)\in[0,T)\times\mathbb{R}^n$ be fixed, assume to the contrary that
$$\partial_t V(t_0,x_0) + H\big(t_0,x_0,V(t_0,x_0),DV(t_0,x_0),D^2V(t_0,x_0)\big) < 0, \tag{2.16}$$
and let us work towards a contradiction.
1. For a given parameter $\varepsilon > 0$, define the smooth function $\varphi\ge V$ by
$$\varphi(t,x) := V(t,x) + \varepsilon\big(|t-t_0|^2 + |x-x_0|^4\big).$$
Then
$$(V-\varphi)(t_0,x_0) = 0, \quad (DV-D\varphi)(t_0,x_0) = 0, \quad (\partial_t V-\partial_t\varphi)(t_0,x_0) = 0, \quad\text{and}\quad (D^2V-D^2\varphi)(t_0,x_0) = 0,$$
and (2.16) says that:
$$h(t_0,x_0) := \partial_t\varphi(t_0,x_0) + H\big(t_0,x_0,\varphi(t_0,x_0),D\varphi(t_0,x_0),D^2\varphi(t_0,x_0)\big) < 0.$$
2. By upper semicontinuity of $H$, we have:
$$h(t,x) < 0 \;\text{ on }\; N_\eta := (t_0-\eta, t_0+\eta)\times\eta B \;\text{ for } \eta > 0 \text{ sufficiently small,}$$
where $\eta B$ denotes the ball of radius $\eta$ centered at $x_0$. We next observe that the parameter $\gamma$ defined by the following is positive:
$$-\gamma\,e^{\eta\|k^+\|_\infty} := \max_{\partial N_\eta}(V-\varphi) < 0. \tag{2.17}$$

3. Let $\nu$ be an arbitrary control process, and denote by $X$ and $\beta$ the controlled process and the discount factor defined by $\nu$ and the initial data $X_{t_0} = x_0$. Consider the stopping time
$$\theta := \inf\{s > t_0 : (s,X_s)\notin N_\eta\},$$
and observe that, by continuity of the state process, $(\theta,X_\theta)\in\partial N_\eta$, so that:
$$(V-\varphi)(\theta,X_\theta) \le -\gamma\,e^{\eta\|k^+\|_\infty}$$
by (2.17). Recalling that $\beta(t_0,t_0) = 1$, we now compute that:
$$\beta(t_0,\theta)V(\theta,X_\theta) - V(t_0,x_0) \le \int_{t_0}^\theta d\big[\beta(t_0,r)\varphi(r,X_r)\big] - \gamma\,e^{\eta\|k^+\|_\infty}\beta(t_0,\theta) \le \int_{t_0}^\theta d\big[\beta(t_0,r)\varphi(r,X_r)\big] - \gamma.$$
By Itô's formula, this provides:
$$V(t_0,x_0) \ge \gamma + E_{t_0,x_0}\Big[\beta(t_0,\theta)V(\theta,X_\theta) - \int_{t_0}^\theta \beta(t_0,r)\,(\partial_t\varphi + \mathcal{L}^{\nu_r}\varphi)(r,X_r)\,dr\Big],$$
where the "$dW$" integral term has zero mean, as its integrand is bounded on the stochastic interval $[t_0,\theta]$. Observe also that $(\partial_t\varphi + \mathcal{L}^{\nu_r}\varphi)(r,X_r) + f(r,X_r,\nu_r) \le h(r,X_r) \le 0$ on the stochastic interval $[t_0,\theta]$. We therefore deduce that:
$$V(t_0,x_0) \ge \gamma + E_{t_0,x_0}\Big[\int_{t_0}^\theta \beta(t_0,r)\,f(r,X_r,\nu_r)\,dr + \beta(t_0,\theta)V(\theta,X_\theta)\Big].$$
As $\gamma$ is a positive constant independent of $\nu$, this implies that:
$$V(t_0,x_0) \ge \gamma + \sup_{\nu\in\mathcal{U}_t} E_{t_0,x_0}\Big[\int_{t_0}^\theta \beta(t_0,r)\,f(r,X_r,\nu_r)\,dr + \beta(t_0,\theta)V(\theta,X_\theta)\Big],$$
which is the required contradiction of the second part of the dynamic programming principle, and thus completes the proof. ♦

As a consequence of Propositions 2.4 and 2.5, we have the main result of this section:

Theorem 2.6. Let the conditions of Propositions 2.5 and 2.4 hold. Then, the value function $V$ solves the Hamilton-Jacobi-Bellman equation
$$-\partial_t V - H\big(\cdot, V, DV, D^2V\big) = 0 \;\text{ on } S. \tag{2.18}$$

2.4 On the regularity of the value function

The purpose of this paragraph is to show that the value function should not be expected to be smooth in general. We start by proving the continuity of the value function under strong conditions; in particular, we require the set $U$ in which the controls take values to be bounded. We then give a simple example in the deterministic framework where the value function is not smooth. Since it is well known that stochastic problems are "more regular" than deterministic ones, we also give an example of a stochastic control problem whose value function is not smooth.

2.4.1 Continuity of the value function for bounded controls

For notational simplicity, we reduce the stochastic control problem to the case $f = k\equiv0$, see Remark 2.2 (iii). Our main concern, in this section, is to show the standard argument for proving the continuity of the value function. Therefore, the following results assume strong conditions on the coefficients of the model in order to simplify the proofs. We first start by examining the value function $V(t,\cdot)$ for fixed $t\in[0,T]$.

Proposition 2.7. Let $f = k\equiv0$, $T < \infty$, and assume that $g$ is Lipschitz continuous. Then:
(i) $V$ is Lipschitz in $x$, uniformly in $t$.
(ii) Assume further that $U$ is bounded. Then $V$ is $\frac12$-Hölder-continuous in $t$, and there is a constant $C > 0$ such that:
$$\big|V(t,x) - V(t',x)\big| \le C(1+|x|)\sqrt{|t-t'|}; \quad t,t'\in[0,T],\; x\in\mathbb{R}^n.$$
Proof. (i) For $x, x'\in\mathbb{R}^n$ and $t\in[0,T)$, we first estimate that:
$$|V(t,x) - V(t,x')| \le \sup_{\nu\in\mathcal{U}_0} E\big[\big|g\big(X_T^{t,x,\nu}\big) - g\big(X_T^{t,x',\nu}\big)\big|\big] \le \mathrm{Const}\,\sup_{\nu\in\mathcal{U}_0} E\big[\big|X_T^{t,x,\nu} - X_T^{t,x',\nu}\big|\big] \le \mathrm{Const}\,|x - x'|,$$
where we used the Lipschitz-continuity of $g$ together with the flow estimates of Theorem 1.4, and the fact that the coefficients $b$ and $\sigma$ are Lipschitz in $x$ uniformly in $(t,u)$. This completes the proof of the Lipschitz property of the value function $V$.
(ii) To prove the Hölder continuity in $t$, we shall use the dynamic programming principle.
(ii-1) We first make the following important observation. A careful review of the proof of Theorem 2.3 reveals that, whenever the stopping times $\theta^\nu$ are constant (i.e. deterministic), the dynamic programming principle holds true with the semicontinuous envelopes taken only with respect to the $x$-variable. Since $V(t,\cdot)$ was shown to be continuous in the first part of this proof, we deduce that:
$$V(t,x) = \sup_{\nu\in\mathcal{U}_0} E\big[V\big(t', X_{t'}^{t,x,\nu}\big)\big] \tag{2.19}$$
for all $x\in\mathbb{R}^n$, $t < t'\in[0,T]$.
(ii-2) Fix $x\in\mathbb{R}^n$, $t < t'\in[0,T]$. By the dynamic programming principle (2.19), we have:
$$|V(t,x) - V(t',x)| = \Big|\sup_{\nu\in\mathcal{U}_0} E\big[V\big(t', X_{t'}^{t,x,\nu}\big) - V(t',x)\big]\Big| \le \sup_{\nu\in\mathcal{U}_0} E\big[\big|V\big(t', X_{t'}^{t,x,\nu}\big) - V(t',x)\big|\big].$$
By the Lipschitz-continuity of $V(s,\cdot)$ established in the first part of this proof, we see that:
$$|V(t,x) - V(t',x)| \le \mathrm{Const}\,\sup_{\nu\in\mathcal{U}_0} E\big[\big|X_{t'}^{t,x,\nu} - x\big|\big]. \tag{2.20}$$
We shall now prove that
$$\sup_{\nu\in\mathcal{U}} E\big[\big|X_{t'}^{t,x,\nu} - x\big|\big] \le \mathrm{Const}\,(1+|x|)\,|t-t'|^{1/2}, \tag{2.21}$$

which provides the required $\frac12$-Hölder continuity in view of (2.20). By definition of the process $X$, and assuming $t < t'$, we have
$$E\big[\big|X_{t'}^{t,x,\nu} - x\big|^2\big] = E\Big[\Big|\int_t^{t'} b(r,X_r,\nu_r)\,dr + \int_t^{t'} \sigma(r,X_r,\nu_r)\,dW_r\Big|^2\Big] \le \mathrm{Const}\,E\Big[\int_t^{t'} |h(r,X_r,\nu_r)|^2\,dr\Big]$$
where $h := \big(|b|^2 + |\sigma|^2\big)^{1/2}$. Since $h$ is Lipschitz-continuous in $(t,x,u)$ and $h^2$ has quadratic growth in $x$ and $u$, this provides:
$$E\big[\big|X_{t'}^{t,x,\nu} - x\big|^2\big] \le \mathrm{Const}\Big(\int_t^{t'}\big(1 + |x|^2 + |\nu_r|^2\big)\,dr + E\int_t^{t'}\big|X_r^{t,x,\nu} - x\big|^2\,dr\Big).$$
Since the control process $\nu$ is uniformly bounded, we obtain by the Gronwall lemma the estimate:
$$E\big[\big|X_{t'}^{t,x,\nu} - x\big|^2\big] \le \mathrm{Const}\,(1+|x|^2)\,|t'-t|, \tag{2.22}$$
where the constant does not depend on the control $\nu$. ♦
Remark 2.8. When $f$ and/or $k$ are non-zero, the conditions required on $f$ and $k$ in order to obtain the $\frac12$-Hölder continuity of the value function can be deduced from the reduction of Remark 2.2 (iii).

Remark 2.9. Further regularity results can be proved for the value function under convenient conditions. Typically, one can prove that $\mathcal{L}^u V$ exists in the generalized sense, for all $u\in U$. This implies immediately that the result of Proposition 2.5 holds in the generalized sense. More technicalities are needed in order to derive the result of Proposition 2.4 in the generalized sense. We refer to [6], §IV.10, for a discussion of this issue.

2.4.2 A deterministic control problem with non-smooth value function

Let $\sigma\equiv0$, $b(x,u) = u$, $U = [-1,1]$, and $n = 1$. The controlled state is then the one-dimensional deterministic process defined by:
$$X_s = X_t + \int_t^s \nu_u\,du \quad\text{for } 0\le t\le s\le T.$$
Consider the deterministic control problem
$$V(t,x) := \sup_{\nu\in\mathcal{U}} (X_T)^2.$$
The value function of this problem is easily seen to be given by:
$$V(t,x) = (x+T-t)^2 \;\text{ for } x\ge0, \text{ with optimal control } \hat u = 1, \qquad V(t,x) = (x-T+t)^2 \;\text{ for } x\le0, \text{ with optimal control } \hat u = -1;$$
in other words, $V(t,x) = (|x|+T-t)^2$. This function is continuous. However, a direct computation shows that it is not differentiable at $x = 0$.
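A quick numerical illustration (ours, not from the text): since the dynamics are $dX = u\,dt$ with $|u|\le1$, the dynamic programming principle over one time step reads $V(t,x) = \max_{u=\pm1} V(t+\Delta t, x+u\Delta t)$, and choosing the space step equal to $\Delta t$ makes this recursion exact on the grid. The one-sided difference quotients at $x = 0$ then exhibit the kink of slopes $\mp 2(T-t)$:

```python
import numpy as np

# Semi-Lagrangian backward induction for V(t,x) = sup_{|u|<=1} (X_T)^2, dX = u dt.
# DPP over one step: V(t,x) = max_{u=+-1} V(t+dt, x + u*dt); with grid spacing
# dx = dt the points x +- dt fall exactly on the grid (no interpolation needed).
T, nt, L = 1.0, 200, 3.0
dt = T / nt
x = np.arange(-L, L + dt / 2, dt)
V = x**2                                   # terminal condition V(T,.) = x^2
for _ in range(nt):                        # backward in time
    V[1:-1] = np.maximum(V[2:], V[:-2])    # interior update; edge values frozen
# Boundary effects travel at speed 1, so V is exact on |x| <= L - T.
mask = np.abs(x) <= L - T
exact = (np.abs(x) + T) ** 2               # known value function at t = 0
print("max error on |x| <= 2:", np.abs(V - exact)[mask].max())
# One-sided difference quotients at x = 0 reveal the kink (slopes ~ -2T and +2T)
i0 = np.argmin(np.abs(x))
print("left/right slopes at 0:", (V[i0] - V[i0 - 1]) / dt, (V[i0 + 1] - V[i0]) / dt)
```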

2.4.3 A stochastic control problem with non-smooth value function

Let $U = \mathbb{R}$, and the controlled process $X$ be the scalar process defined by the dynamics:
$$dX_t = \nu_t\,dW_t,$$
where $W$ is a scalar Brownian motion. Let $g$ be a lower semicontinuous mapping on $\mathbb{R}$, with $\alpha' - \beta'|x| \le g(x) \le \alpha + \beta x$, $x\in\mathbb{R}$, for some constants $\alpha, \beta, \alpha', \beta'\in\mathbb{R}$. We consider the stochastic control problem
$$V(t,x) := \sup_{\nu\in\mathcal{U}_0} E_{t,x}\big[g(X_T^\nu)\big].$$
Let us assume that $V$ is smooth, and work towards a contradiction.

1. If $V$ is $C^{1,2}([0,T)\times\mathbb{R})$, then it follows from Proposition 2.4 that $V$ satisfies
$$-\partial_t V - \frac12 u^2 D^2V \ge 0 \quad\text{for all } u\in\mathbb{R},$$
and all $(t,x)\in[0,T)\times\mathbb{R}$. By sending $u$ to infinity, it follows that
$$V(t,\cdot) \text{ is concave for all } t\in[0,T). \tag{2.23}$$
2. Notice that $V(t,x) \ge E_{t,x}[g(X_T^0)] = g(x)$. Then, it follows from (2.23) that:
$$V(t,x) \ge g^{\mathrm{conc}}(x) \quad\text{for all } (t,x)\in[0,T)\times\mathbb{R}, \tag{2.24}$$
where $g^{\mathrm{conc}}$ is the concave envelope of $g$, i.e. the smallest concave majorant of $g$. Notice that $g^{\mathrm{conc}} < \infty$ as $g$ is bounded from above by a line.
3. Since $g \le g^{\mathrm{conc}}$, we see that
$$V(t,x) := \sup_{\nu\in\mathcal{U}_0} E_{t,x}\big[g(X_T^\nu)\big] \le \sup_{\nu\in\mathcal{U}_0} E_{t,x}\big[g^{\mathrm{conc}}(X_T^\nu)\big] = g^{\mathrm{conc}}(x),$$
by Jensen's inequality and the martingale property of $X^\nu$. In view of (2.24), we have then proved that
$$V\in C^{1,2}([0,T)\times\mathbb{R}) \implies V(t,x) = g^{\mathrm{conc}}(x) \quad\text{for all } (t,x)\in[0,T)\times\mathbb{R}.$$
Now recall that this implication holds for any arbitrary non-negative lower semicontinuous function $g$. We then obtain a contradiction whenever the function $g^{\mathrm{conc}}$ is not $C^2(\mathbb{R})$. Hence
$$g^{\mathrm{conc}}\notin C^2(\mathbb{R}) \implies V\notin C^{1,2}([0,T)\times\mathbb{R}).$$


Chapter 3

Optimal Stopping and Dynamic Programming

As in the previous chapter, we assume here that the filtration $\mathbb{F}$ is defined as the $P$-augmentation of the canonical filtration of the Brownian motion $W$ defined on the probability space $(\Omega,\mathcal{F},P)$.
Our objective is to derive similar results, as those obtained in the previous chapter for standard stochastic control problems, in the context of optimal stopping problems. We will first start with the formulation of optimal stopping problems, then derive the corresponding dynamic programming principle, and the dynamic programming equation.

3.1 Optimal stopping problems

For $0\le t\le T\le\infty$, we denote by $\mathcal{T}_{[t,T]}$ the collection of all $\mathbb{F}$-stopping times with values in $[t,T]$. We also recall the notation $S := [0,T)\times\mathbb{R}^n$ for the parabolic state space of the underlying state process $X$ defined by the stochastic differential equation:
$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t, \tag{3.1}$$
where $\mu$ and $\sigma$ are defined on $\bar S$ and take values in $\mathbb{R}^n$ and $\mathbb{S}_n$, respectively. We assume that $\mu$ and $\sigma$ satisfy the usual Lipschitz and linear growth conditions, so that the above SDE has a unique strong solution satisfying the integrability proved in Theorem 1.2.
The infinitesimal generator of the Markov diffusion process $X$ is denoted by
$$\mathcal{A}\varphi := \mu\cdot D\varphi + \frac12\,\mathrm{Tr}\big[\sigma\sigma^{\mathrm T}D^2\varphi\big].$$
Let $g$ be a measurable function from $\mathbb{R}^n$ to $\mathbb{R}$, and assume that:
$$E\Big[\sup_{0\le t<T}|g(X_t)|\Big] < \infty. \tag{3.2}$$

For instance, if $g$ has polynomial growth, the latter integrability condition is automatically satisfied. Under this condition, the following criterion:
$$J(t,x,\tau) := E\big[g\big(X_\tau^{t,x}\big)\,1_{\tau<\infty}\big] \tag{3.3}$$
is well-defined for all $(t,x)\in S$ and $\tau\in\mathcal{T}_{[t,T]}$. Here, $X^{t,x}$ denotes the unique strong solution of (3.1) with initial condition $X_t^{t,x} = x$.
The optimal stopping problem is now defined by:
$$V(t,x) := \sup_{\tau\in\mathcal{T}_{[t,T]}} J(t,x,\tau) \quad\text{for all } (t,x)\in S. \tag{3.4}$$
A stopping time $\hat\tau\in\mathcal{T}_{[t,T]}$ is called an optimal stopping rule if $V(t,x) = J(t,x,\hat\tau)$.
The set
$$\mathcal{S} := \{(t,x) : V(t,x) = g(x)\} \tag{3.5}$$
is called the stopping region and is of particular interest: whenever the state is in this region, it is optimal to stop immediately. Its complement $\mathcal{S}^c$ is called the continuation region.
Remark 3.1. As in the previous chapter, we could have considered an apparently more general criterion

V(t, x) := sup_{τ∈T_{[t,T]}} E[ ∫_t^τ β(t, s) f(s, X_s) ds + β(t, τ) g(X^{t,x}_τ) 1_{τ<∞} ],

with

β(t, s) := e^{−∫_t^s k(r, X_r) dr} for 0 ≤ t ≤ s < T.

However, by introducing the additional states

Y_t := Y_0 + ∫_0^t Z_s f(s, X_s) ds and Z_t := Z_0 − ∫_0^t Z_s k(s, X_s) ds,

we see immediately that we may reduce this problem to the context of (3.4).
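In concrete terms (we spell out the reduction here; the notation X̄ and ḡ is ours): starting the augmented Markov process X̄ := (X, Y, Z) from X̄_t = (x, 0, 1), we get Z_s = β(t, s) and Y_τ = ∫_t^τ β(t, s) f(s, X_s) ds, so that, up to the handling of the indicator 1_{τ<∞}, the criterion above rewrites as

V(t, x) = sup_{τ∈T_{[t,T]}} E[ ḡ(X̄^{t,(x,0,1)}_τ) ] with ḡ(x, y, z) := y + z g(x),

which is exactly of the form (3.4).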
Remark 3.2. Consider the subset of stopping rules:

T^t_{[t,T]} := { τ ∈ T_{[t,T]} : τ independent of F_t }.   (3.6)

By a similar argument as in Remark 2.2 (iv), we can see that the maximization in the optimal stopping problem (3.4) can be restricted to this subset, i.e.

V(t, x) := sup_{τ∈T^t_{[t,T]}} J(t, x, τ) for all (t, x) ∈ S.   (3.7)

3.2 The dynamic programming principle


In the context of optimal stopping problems, the proof of the dynamic programming principle is easier than in the context of the stochastic control problems of the previous chapter. The reader may consult the excellent exposition in the book of Karatzas and Shreve [9], Appendix D, where the following dynamic programming principle is proved:

V(t, x) = sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<θ} g(X^{t,x}_τ) + 1_{τ≥θ} V(θ, X^{t,x}_θ) ],   (3.8)

for all (t, x) ∈ S and θ ∈ T_{[t,T]}. In particular, the proof in the latter reference does not require any heavy measurable selection argument, and is essentially based on the supermartingale nature of the so-called Snell envelope process. Moreover, we observe that it does not require any Markov property of the underlying state process.
We report here a different proof in the spirit of the weak dynamic programming principle for stochastic control problems proved in the previous chapter. The subsequent argument is specific to our Markovian framework and, in this sense, is weaker than the classical dynamic programming principle. However, the combination of the arguments of this chapter with those of the previous chapter allows us to derive a dynamic programming principle for mixed stochastic control and stopping problems.
The following claim will make use of the subset T^t_{[t,T]}, introduced in (3.6), of all stopping times in T_{[t,T]} which are independent of F_t, and of the notations:

V_*(t, x) := lim inf_{(t′,x′)→(t,x)} V(t′, x′) and V^*(t, x) := lim sup_{(t′,x′)→(t,x)} V(t′, x′)

for all (t, x) ∈ S̄. We recall that V_* and V^* are the lower and upper semicontinuous envelopes of V, and that V_* = V^* = V whenever V is continuous.
t
Theorem 3.3. Assume that V is locally bounded. For (t, x) ∈ S, let θ ∈ T^t_{[t,T]} be a stopping time such that X^{t,x}_θ is bounded. Then:

V(t, x) ≤ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<θ} g(X^{t,x}_τ) + 1_{τ≥θ} V^*(θ, X^{t,x}_θ) ],   (3.9)

V(t, x) ≥ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<θ} g(X^{t,x}_τ) + 1_{τ≥θ} V_*(θ, X^{t,x}_θ) ].   (3.10)

Proof. Inequality (3.9) follows immediately from the tower property and the fact that J ≤ V^*.
We next prove inequality (3.10) with V_* replaced by an arbitrary function

φ : S → R such that φ is upper-semicontinuous and V ≥ φ,

which implies (3.10) by the same argument as in Step 3 of the proof of Theorem 2.3.

Arguing as in Step 2 of the proof of Theorem 2.3, we first observe that, for every ε > 0, we can find a countable family Ā_i ⊂ (t_i − r_i, t_i] × A_i ⊂ S, together with a sequence of stopping times τ^{i,ε} in T^{t_i}_{[t_i,T]}, i ≥ 1, satisfying Ā_0 = {T} × Rᵈ and

∪_{i≥0} Ā_i = S, Ā_i ∩ Ā_j = ∅ for i ≠ j ∈ N, J(·; τ^{i,ε}) ≥ φ − 3ε on Ā_i for i ≥ 1.   (3.11)

Set Āⁿ := ∪_{i≤n} Ā_i, n ≥ 1. Given two stopping times θ, τ ∈ T^t_{[t,T]}, it is clear that

τ^{n,ε} := τ 1_{τ<θ} + 1_{τ≥θ} ( T 1_{(Āⁿ)^c}(θ, X^{t,x}_θ) + Σ_{i=1}^n τ^{i,ε} 1_{Ā_i}(θ, X^{t,x}_θ) )

defines a stopping time in T^t_{[t,T]}. We then deduce from the tower property and (3.11) that

V(t, x) ≥ J(t, x; τ^{n,ε})
≥ E[ g(X^{t,x}_τ) 1_{τ<θ} + 1_{τ≥θ} (φ(θ, X^{t,x}_θ) − 3ε) 1_{Āⁿ}(θ, X^{t,x}_θ) ]
+ E[ 1_{τ≥θ} g(X^{t,x}_T) 1_{(Āⁿ)^c}(θ, X^{t,x}_θ) ].

By sending n → ∞ and arguing as in the end of Step 2 of the proof of Theorem 2.3, we deduce that

V(t, x) ≥ E[ g(X^{t,x}_τ) 1_{τ<θ} + 1_{τ≥θ} φ(θ, X^{t,x}_θ) ] − 3ε,

and the result follows from the arbitrariness of ε > 0 and τ ∈ T^t_{[t,T]}. }

3.3 The dynamic programming equation


In this section, we explore the infinitesimal counterpart of the dynamic programming principle of Theorem 3.3, when the value function V is a priori known to be smooth. The smoothness required in this chapter is the one needed to apply Itô's formula to V. In particular, V is then continuous, and the dynamic programming principle of Theorem 3.3 reduces to the classical dynamic programming principle (3.8).
Loosely speaking, the following dynamic programming equation says the following:

• In the stopping region S defined in (3.5), continuation is sub-optimal, and therefore the linear PDE must hold with an inequality, in such a way that the value function is a supermartingale along the state process.

• In the continuation region S^c, it is optimal to delay the stopping decision by some small amount of time, and therefore the value function must solve a linear PDE as in Chapter 1.

Theorem 3.4. Assume that V ∈ C^{1,2}([0, T), Rⁿ), and let g : Rⁿ → R be continuous. Then V solves the obstacle problem:

min{ −(∂_t + A)V, V − g } = 0 on S.   (3.12)

Proof. We organize the proof into two steps.
1. We first show that:

min{ −(∂_t + A)V, V − g } ≥ 0 on S.   (3.13)

The inequality V − g ≥ 0 is obvious, as the constant stopping rule τ = t ∈ T_{[t,T]} is admissible. Next, for (t0, x0) ∈ S, consider the stopping times

θ_h := inf{ t > t0 : (t, X^{t0,x0}_t) ∉ [t0, t0 + h] × B }, h > 0,

where B is the unit ball of Rⁿ centered at x0. Then θ_h ∈ T^{t0}_{[t0,T]} for sufficiently small h, and it follows from (3.10) that:

V(t0, x0) ≥ E[ V(θ_h, X_{θ_h}) ].

We next apply Itô's formula, and observe that the expected value of the diffusion term vanishes because (t, X_t) lies in the compact subset [t0, t0 + h] × B for t ∈ [t0, θ_h]. Then:

E[ (1/h) ∫_{t0}^{θ_h} (∂_t + A)V(t, X^{t0,x0}_t) dt ] ≤ 0.

Clearly, there exists ĥ_ω > 0, depending on ω, such that θ_h = t0 + h for h ≤ ĥ_ω. Then, it follows from the mean value theorem that the expression inside the expectation converges P-a.s. to (∂_t + A)V(t0, x0), and we conclude by dominated convergence that −(∂_t + A)V(t0, x0) ≥ 0.
2. In order to complete the proof, we use a contradiction argument, assuming that

V(t0, x0) > g(x0) and −(∂_t + A)V(t0, x0) > 0 at some (t0, x0) ∈ S,   (3.14)

and we work towards a contradiction of (3.9). Introduce the function

φ(t, x) := V(t, x) + (ε/2)|x − x0|² for (t, x) ∈ S.

Then, it follows from (3.14) that for a sufficiently small ε > 0, we may find h > 0 and δ > 0 such that

V ≥ g + δ and −(∂_t + A)φ ≥ 0 on N_h := [t0, t0 + h] × hB.   (3.15)

Moreover:

−γ := max_{∂N_h}(V − φ) < 0.   (3.16)

Next, let

θ := inf{ t > t0 : (t, X^{t0,x0}_t) ∉ N_h }.

For an arbitrary stopping rule τ ∈ T^{t0}_{[t0,T]}, we compute by Itô's formula that:

E[ V(τ∧θ, X_{τ∧θ}) − V(t0, x0) ] = E[ (V − φ)(τ∧θ, X_{τ∧θ}) ] + E[ φ(τ∧θ, X_{τ∧θ}) − φ(t0, x0) ]
= E[ (V − φ)(τ∧θ, X_{τ∧θ}) ] + E[ ∫_{t0}^{τ∧θ} (∂_t + A)φ(t, X^{t0,x0}_t) dt ],

where the diffusion term has zero expectation because the process (t, X^{t0,x0}_t) is confined to the compact subset N_h on the stochastic interval [t0, τ∧θ]. Since −(∂_t + A)φ ≥ 0 on N_h by (3.15), this provides:

E[ V(τ∧θ, X_{τ∧θ}) − V(t0, x0) ] ≤ E[ (V − φ)(τ∧θ, X_{τ∧θ}) ] ≤ −γ P[τ ≥ θ],

by (3.16). Then, since V ≥ g + δ on N_h by (3.15):

V(t0, x0) ≥ γ P[τ ≥ θ] + E[ (g(X^{t0,x0}_τ) + δ) 1_{τ<θ} + V(θ, X^{t0,x0}_θ) 1_{τ≥θ} ]
≥ (γ ∧ δ) + E[ g(X^{t0,x0}_τ) 1_{τ<θ} + V(θ, X^{t0,x0}_θ) 1_{τ≥θ} ].

By the arbitrariness of τ ∈ T^{t0}_{[t0,T]}, this provides the desired contradiction of (3.9). }
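
The structure of (3.12) is easy to visualize numerically. The following is a minimal discrete-time sketch (the random-walk discretization, the sample reward g(x) = e^{−x²}, and all parameter values are our own choices, not the text's): replacing X by a binomial random walk with steps ±√(dt), the backward recursion V_k = max(g, E[V_{k+1}]) is the discrete analogue of min{−(∂_t + A)V, V − g} = 0, and the set {V = g} approximates the stopping region.

import numpy as np

T, N, M = 1.0, 400, 400
dt = T / N
dx = np.sqrt(dt)                          # random-walk step matching dX = dW
x = np.arange(-M, M + 1) * dx             # space grid, wide enough for T = 1

g = np.exp(-x**2)                         # sample reward: concave near 0, convex in the tails
V = g.copy()                              # terminal condition V(T, .) = g

for _ in range(N):                        # backward induction in time
    cont = V.copy()
    cont[1:-1] = 0.5 * (V[2:] + V[:-2])   # E[V(t + dt, X_{t+dt}) | X_t = x]
    V = np.maximum(g, cont)               # discrete min{-(d/dt + A)V, V - g} = 0

stop = (V - g) < 1e-12                    # stopping region {V = g} at time t = 0
i0 = M                                    # index of x = 0
edge = np.argmax(~stop[i0:])              # first continuation point to the right of 0
print("near the peak, stop immediately for |x| <=", x[i0 + edge - 1])

One observes that stopping is immediate around the peak of g, where g is locally concave, while it is optimal to continue where g is locally convex; far in the tails V ≈ g ≈ 0 and the distinction is immaterial.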

3.4 Regularity of the value function


3.4.1 Finite horizon optimal stopping
In this subsection, we consider the case T < ∞. Similar to the continuity result of Proposition 2.7 for the stochastic control framework, the following continuity result is obtained as a consequence of the flow continuity of Theorem 1.4 together with the dynamic programming principle.

Proposition 3.5. Assume g is Lipschitz-continuous, and let T < ∞. Then, there is a constant C such that:

|V(t, x) − V(t′, x′)| ≤ C ( |x − x′| + |t − t′|^{1/2} ) for all (t, x), (t′, x′) ∈ S.

Proof. (i) For t ∈ [0, T] and x, x′ ∈ Rⁿ, it follows from the Lipschitz property of g that:

|V(t, x) − V(t, x′)| ≤ Const sup_{τ∈T_{[t,T]}} E| X^{t,x}_τ − X^{t,x′}_τ |
≤ Const E[ sup_{t≤s≤T} | X^{t,x}_s − X^{t,x′}_s | ]
≤ Const |x − x′|

by the flow continuity result of Theorem 1.4.


(ii) To prove the Hölder continuity result in t, we argue as in the proof of Proposition 2.7, using the dynamic programming principle of Theorem 3.3.
(ii-1) We first observe that, whenever the stopping time θ = t′ > t is constant (i.e. deterministic), the dynamic programming principle (3.9)-(3.10) holds true if the semicontinuous envelopes are taken with respect to the variable x only, with fixed time variable. Since V is continuous in x by the first part of this proof, we deduce that

V(t, x) = sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<t′} g(X^{t,x}_τ) + 1_{τ≥t′} V(t′, X^{t,x}_{t′}) ].   (3.17)

(ii-2) We then estimate that

0 ≤ V(t, x) − E[ V(t′, X^{t,x}_{t′}) ] ≤ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<t′} ( g(X^{t,x}_τ) − V(t′, X^{t,x}_{t′}) ) ]
≤ sup_{τ∈T^t_{[t,T]}} E[ 1_{τ<t′} ( g(X^{t,x}_τ) − g(X^{t,x}_{t′}) ) ],

where the last inequality follows from the fact that V ≥ g. Using the Lipschitz property of g, this provides:

0 ≤ V(t, x) − E[ V(t′, X^{t,x}_{t′}) ] ≤ Const E[ sup_{t≤s≤t′} | X^{t,x}_s − X^{t,x}_{t′} | ] ≤ Const (1 + |x|) (t′ − t)^{1/2}

by the flow continuity result of Theorem 1.4. Using this estimate together with the Lipschitz property proved in (i) above, this provides:

|V(t, x) − V(t′, x)| ≤ | V(t, x) − E[V(t′, X^{t,x}_{t′})] | + | E[V(t′, X^{t,x}_{t′})] − V(t′, x) |
≤ Const ( (1 + |x|)(t′ − t)^{1/2} + E| X^{t,x}_{t′} − x | )
≤ Const (1 + |x|) (t′ − t)^{1/2},

by using again Theorem 1.4. }

3.4.2 Infinite horizon optimal stopping


In this section, the state process X is defined by a homogeneous scalar diffusion:

dX_t = µ(X_t) dt + σ(X_t) dW_t.   (3.18)

We introduce the hitting times:

H^x_b := inf{ t > 0 : X^{0,x}_t = b },

and we assume that the process X is regular, i.e.

P[ H^x_b < ∞ ] > 0 for all x, b ∈ R,   (3.19)



which means that there is no subinterval of R from which the process X cannot exit.
We consider the infinite horizon optimal stopping problem:

V(x) := sup_{τ∈T} E[ e^{−βτ} g(X^{0,x}_τ) 1_{τ<∞} ],   (3.20)

where T := T_{[0,∞]}, and β > 0 is the discount rate parameter.
According to Theorem 3.3, the dynamic programming equation corresponding to this optimal stopping problem is the obstacle problem:

min{ βv − Av, v − g } = 0,

where the differential operator in the present homogeneous context is given by the generator of the diffusion:

Av := µ v′ + (1/2) σ² v″.   (3.21)

The ordinary differential equation

Av − βv = 0   (3.22)

has two positive linearly independent solutions

ψ, φ > 0 such that ψ is strictly increasing and φ is strictly decreasing.   (3.23)

Clearly, ψ and φ are uniquely determined up to a positive constant, and any other solution of (3.22) can be expressed as a linear combination of ψ and φ.
The following result follows from an immediate application of Itô's formula.

Lemma 3.6. For any b1 < b2 and x ∈ [b1, b2], we have:

E[ e^{−βH^x_{b1}} 1_{H^x_{b1} ≤ H^x_{b2}} ] = ( ψ(x)φ(b2) − ψ(b2)φ(x) ) / ( ψ(b1)φ(b2) − ψ(b2)φ(b1) ),

E[ e^{−βH^x_{b2}} 1_{H^x_{b2} ≤ H^x_{b1}} ] = ( ψ(b1)φ(x) − ψ(x)φ(b1) ) / ( ψ(b1)φ(b2) − ψ(b2)φ(b1) ).
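
For instance (a standard special case, spelled out here for illustration): for X = W a Brownian motion (µ = 0, σ = 1), equation (3.22) reads (1/2)v″ = βv, so one may take ψ(x) = e^{√(2β) x} and φ(x) = e^{−√(2β) x}; the first formula of Lemma 3.6 then reduces to the familiar ratio of hyperbolic sines

E[ e^{−βH^x_{b1}} 1_{H^x_{b1} ≤ H^x_{b2}} ] = sinh(√(2β)(b2 − x)) / sinh(√(2β)(b2 − b1)),

and symmetrically for the second formula.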

We now show that the value function V is concave up to some change of variable, and provide conditions under which V is C¹ across the exercise boundary, i.e. the boundary between the exercise and the continuation regions. For the next result, we observe that the function (ψ/φ) is continuous and strictly increasing by (3.23), and therefore invertible.

Theorem 3.7. (i) The function (V/φ) ∘ (ψ/φ)^{−1} is concave. In particular, V is continuous on R.
(ii) Let x0 be such that V(x0) = g(x0), and assume that g, ψ and φ are differentiable at x0. Then V is differentiable at x0, and V′(x0) = g′(x0).

Proof. For (i), it is sufficient to prove that:

( (V/φ)(x) − (V/φ)(b1) ) / ( (ψ/φ)(x) − (ψ/φ)(b1) ) ≥ ( (V/φ)(b2) − (V/φ)(x) ) / ( (ψ/φ)(b2) − (ψ/φ)(x) ) for all b1 < x < b2.   (3.24)

For ε > 0, consider ε-optimal stopping rules τ1, τ2 ∈ T for the problems V(b1) and V(b2):

E[ e^{−βτ_i} g(X^{0,b_i}_{τ_i}) ] ≥ V(b_i) − ε for i = 1, 2.

We next define the stopping time

τ^ε := ( H^x_{b1} + τ1 ∘ θ_{H^x_{b1}} ) 1_{H^x_{b1} < H^x_{b2}} + ( H^x_{b2} + τ2 ∘ θ_{H^x_{b2}} ) 1_{H^x_{b2} < H^x_{b1}},

where θ denotes the shift operator on the canonical space, i.e. θ_t(ω)(s) = ω(t + s). In words, the stopping rule τ^ε uses the ε-optimal stopping rule τ1 if the level b1 is reached before the level b2, and the ε-optimal stopping rule τ2 otherwise. Then, it follows from the strong Markov property that

V(x) ≥ E[ e^{−βτ^ε} g(X^{0,x}_{τ^ε}) ]
= E[ e^{−βH^x_{b1}} E[ e^{−βτ1} g(X^{0,b1}_{τ1}) ] 1_{H^x_{b1} < H^x_{b2}} ]
+ E[ e^{−βH^x_{b2}} E[ e^{−βτ2} g(X^{0,b2}_{τ2}) ] 1_{H^x_{b2} < H^x_{b1}} ]
≥ (V(b1) − ε) E[ e^{−βH^x_{b1}} 1_{H^x_{b1} < H^x_{b2}} ] + (V(b2) − ε) E[ e^{−βH^x_{b2}} 1_{H^x_{b2} < H^x_{b1}} ].

Sending ε ↘ 0, this provides

V(x) ≥ V(b1) E[ e^{−βH^x_{b1}} 1_{H^x_{b1} < H^x_{b2}} ] + V(b2) E[ e^{−βH^x_{b2}} 1_{H^x_{b2} < H^x_{b1}} ].

By using the explicit expressions of Lemma 3.6 above, this provides:

(V/φ)(x) ≥ (V/φ)(b1) ( (ψ/φ)(b2) − (ψ/φ)(x) ) / ( (ψ/φ)(b2) − (ψ/φ)(b1) ) + (V/φ)(b2) ( (ψ/φ)(x) − (ψ/φ)(b1) ) / ( (ψ/φ)(b2) − (ψ/φ)(b1) ),

which implies (3.24).


(ii) We next prove the smooth-fit result. Let x0 be such that V(x0) = g(x0). Then, since V ≥ g, ψ is strictly increasing and φ is strictly decreasing, it follows from (3.24) that:

( (g/φ)(x0+ε) − (g/φ)(x0) ) / ( (ψ/φ)(x0+ε) − (ψ/φ)(x0) )
≤ ( (V/φ)(x0+ε) − (V/φ)(x0) ) / ( (ψ/φ)(x0+ε) − (ψ/φ)(x0) )   (3.25)
≤ ( (V/φ)(x0) − (V/φ)(x0−δ) ) / ( (ψ/φ)(x0) − (ψ/φ)(x0−δ) )
≤ ( (g/φ)(x0) − (g/φ)(x0−δ) ) / ( (ψ/φ)(x0) − (ψ/φ)(x0−δ) )

for all ε > 0, δ > 0. Multiplying by ( (ψ/φ)(x0+ε) − (ψ/φ)(x0) )/ε, this implies that:

( (g/φ)(x0+ε) − (g/φ)(x0) )/ε ≤ ( (V/φ)(x0+ε) − (V/φ)(x0) )/ε ≤ ( Δ⁺(ε)/Δ⁻(δ) ) ( (g/φ)(x0) − (g/φ)(x0−δ) )/δ,   (3.26)

where

Δ⁺(ε) := ( (ψ/φ)(x0+ε) − (ψ/φ)(x0) )/ε and Δ⁻(δ) := ( (ψ/φ)(x0) − (ψ/φ)(x0−δ) )/δ.

We next consider two cases:

• If (ψ/φ)′(x0) ≠ 0, then we may take ε = δ and send ε ↘ 0 in (3.26) to obtain:

(d⁺/dx)(V/φ)(x0) = (g/φ)′(x0).   (3.27)

• If (ψ/φ)′(x0) = 0, then we use the fact that, for every sequence ε_n ↘ 0, there is a subsequence ε_{n_k} ↘ 0 and a sequence δ_k ↘ 0 such that Δ⁺(ε_{n_k}) = Δ⁻(δ_k). Then (3.26) reduces to:

( (g/φ)(x0+ε_{n_k}) − (g/φ)(x0) )/ε_{n_k} ≤ ( (V/φ)(x0+ε_{n_k}) − (V/φ)(x0) )/ε_{n_k} ≤ ( (g/φ)(x0) − (g/φ)(x0−δ_k) )/δ_k,

and therefore

( (V/φ)(x0+ε_{n_k}) − (V/φ)(x0) )/ε_{n_k} → (g/φ)′(x0).

By the arbitrariness of the sequence (ε_n)_n, this provides (3.27).

Similarly, multiplying (3.25) by ( (ψ/φ)(x0) − (ψ/φ)(x0−δ) )/δ and arguing as above, we obtain:

(d⁻/dx)(V/φ)(x0) = (g/φ)′(x0).

Hence (V/φ) is differentiable at x0 with (V/φ)′(x0) = (g/φ)′(x0); since φ is differentiable at x0 and V(x0) = g(x0), this yields V′(x0) = g′(x0), thus completing the proof. }

3.4.3 An optimal stopping problem with nonsmooth value

We consider the example

X^{t,x}_s := x + (W_s − W_t) for s ≥ t.

Let g : R → R₊ be a measurable nonnegative function with lim inf_{x→∞} g(x) = 0, and consider the infinite horizon optimal stopping problem:

V(t, x) := sup_{τ∈T_{[t,∞]}} E[ g(X^{t,x}_τ) 1_{τ<∞} ] = sup_{τ∈T_{[t,∞)}} E[ g(X^{t,x}_τ) ].

Let us assume that V ∈ C^{1,2}(S), and work towards a contradiction. We first observe by the homogeneity of the problem that V(t, x) = V(x) is independent of t. Moreover, it follows from Theorem 3.4 that V is concave in x and V ≥ g. Then

V ≥ g^conc,   (3.28)

where g^conc is the concave envelope of g. If g^conc = ∞, then V = ∞. We then continue in the more interesting case where g^conc < ∞.
By the Jensen inequality and the non-negativity of g, the process {g^conc(X^{t,x}_s), s ≥ t} is a supermartingale, and:

V(t, x) ≤ sup_{τ∈T_{[t,∞)}} E[ g^conc(X^{t,x}_τ) ] ≤ g^conc(x).

Hence, V = g^conc, and we obtain the required contradiction whenever g^conc is not differentiable at some point of R.
Chapter 4

Solving Control Problems


by Verification

In this chapter, we present a general argument, based on Itô's formula, which allows one to show that some "guess" of the value function is indeed equal to the unknown value function. Namely, given a smooth solution v of the dynamic programming equation, we give sufficient conditions which allow to conclude that v coincides with the value function V. This is the so-called verification argument. The statement of this result is heavy, but its proof is simple and relies essentially on Itô's formula. However, depending on the problem at hand, the verification of the conditions which must be satisfied by the candidate solution can be difficult.
The verification argument will be provided in the contexts of stochastic control and optimal stopping problems. We conclude the chapter with some examples.

4.1 The verification argument for stochastic control problems
We recall the stochastic control problem formulation of Section 2.1. The set of admissible control processes U0 ⊂ U is the collection of all progressively measurable processes with values in the subset U ⊂ Rᵏ. For every admissible control process ν ∈ U0, the controlled process is defined by the stochastic differential equation:

dX^ν_t = b(t, X^ν_t, ν_t) dt + σ(t, X^ν_t, ν_t) dW_t.

The gain criterion is given by

J(t, x, ν) := E[ ∫_t^T β^ν(t, s) f(s, X^{t,x,ν}_s, ν_s) ds + β^ν(t, T) g(X^{t,x,ν}_T) ],

with

β^ν(t, s) := e^{−∫_t^s k(r, X^{t,x,ν}_r, ν_r) dr}.

The stochastic control problem is defined by the value function:

V(t, x) := sup_{ν∈U0} J(t, x, ν), for (t, x) ∈ S.   (4.1)

We follow the notations of Section 2.3. We recall the Hamiltonian H : S × R × Rᵈ × S_d → R defined by:

H(t, x, r, p, γ) := sup_{u∈U} { −k(t, x, u) r + b(t, x, u) · p + (1/2) Tr[σσᵀ(t, x, u) γ] + f(t, x, u) },

where b and σ satisfy the conditions (2.1)-(2.2), and the coefficients f and k are measurable. From the results of the previous section, the dynamic programming equation corresponding to the stochastic control problem (4.1) is:

−∂_t v − H(·, v, Dv, D²v) = 0 and v(T, ·) = g.   (4.2)

A function v will be called a supersolution (resp. subsolution) of the equation (4.2) if

−∂_t v − H(·, v, Dv, D²v) ≥ (resp. ≤) 0 and v(T, ·) ≥ (resp. ≤) g.

The proof of the subsequent result will make use of the following linear second order operator

L^u φ(t, x) := −k(t, x, u) φ(t, x) + b(t, x, u) · Dφ(t, x) + (1/2) Tr[σσᵀ(t, x, u) D²φ(t, x)],

which corresponds to the discounted controlled process {β^u(0, t) X^u_t, t ≥ 0} with constant control process u, in the sense that

β^ν(0, s) φ(s, X^ν_s) − β^ν(0, t) φ(t, X^ν_t) = ∫_t^s β^ν(0, r) (∂_t + L^{ν_r}) φ(r, X^ν_r) dr + ∫_t^s β^ν(0, r) Dφ(r, X^ν_r) · σ(r, X^ν_r, ν_r) dW_r

for every t ≤ s, every smooth function φ ∈ C^{1,2}([t, s], Rᵈ), and each admissible control process ν ∈ U0. The last expression is an immediate application of Itô's formula.
Theorem 4.1. Let T < ∞, and v ∈ C^{1,2}([0, T), Rᵈ) ∩ C([0, T] × Rᵈ). Assume that ‖k⁻‖_∞ < ∞ and that v and f have quadratic growth, i.e. there is a constant C such that

|f(t, x, u)| + |v(t, x)| ≤ C(1 + |x|² + |u|²), (t, x, u) ∈ [0, T) × Rᵈ × U.

(i) Suppose that v is a supersolution of (4.2). Then v ≥ V on [0, T] × Rᵈ.
(ii) Let v be a solution of (4.2), and assume that there exists a maximizer û(t, x) of u ↦ L^u v(t, x) + f(t, x, u) such that
• 0 = ∂_t v(t, x) + L^{û(t,x)} v(t, x) + f(t, x, û(t, x)),
• the stochastic differential equation

dX_s = b(s, X_s, û(s, X_s)) ds + σ(s, X_s, û(s, X_s)) dW_s

defines a unique solution X for each given initial data X_t = x,
• the process ν̂_s := û(s, X_s) is a well-defined control process in U0.
Then v = V, and ν̂ is an optimal Markov control process.
Proof. Let ν ∈ U0 be an arbitrary control process, X the associated state process with initial data X_t = x, and define the stopping time

θ_n := (T − n⁻¹) ∧ inf{ s > t : |X_s − x| ≥ n }.

By Itô's formula, we have

v(t, x) = β(t, θ_n) v(θ_n, X_{θ_n}) − ∫_t^{θ_n} β(t, r) (∂_t + L^{ν_r}) v(r, X_r) dr − ∫_t^{θ_n} β(t, r) Dv(r, X_r) · σ(r, X_r, ν_r) dW_r.

Observe that (∂_t + L^{ν_r}) v + f(·, ·, ν_r) ≤ ∂_t v + H(·, v, Dv, D²v) ≤ 0 by the supersolution property of v, and that the integrand in the stochastic integral is bounded on [t, θ_n], a consequence of the continuity of Dv and the condition ‖k⁻‖_∞ < ∞. Then:

v(t, x) ≥ E[ β(t, θ_n) v(θ_n, X_{θ_n}) + ∫_t^{θ_n} β(t, r) f(r, X_r, ν_r) dr ].   (4.3)

We now take the limit as n increases to infinity. Since θ_n → T a.s. and

| β(t, θ_n) v(θ_n, X_{θ_n}) + ∫_t^{θ_n} β(t, r) f(r, X_r, ν_r) dr |
≤ C e^{T‖k⁻‖_∞} ( 1 + |X_{θ_n}|² + T + ∫_t^T (|X_s|² + |ν_s|²) ds )
≤ C e^{T‖k⁻‖_∞} (1 + T) ( 1 + sup_{t≤s≤T} |X_s|² + ∫_t^T |ν_s|² ds ) ∈ L¹,

by the estimate (2.5) of Theorem 2.1, it follows from the dominated convergence theorem that

v(t, x) ≥ E[ β(t, T) v(T, X_T) + ∫_t^T β(t, r) f(r, X_r, ν_r) dr ]
≥ E[ β(t, T) g(X_T) + ∫_t^T β(t, r) f(r, X_r, ν_r) dr ],

where the last inequality uses the condition v(T, ·) ≥ g. Since the control ν ∈ U0 is arbitrary, this completes the proof of (i).
Statement (ii) is proved by repeating the above argument and observing that the control ν̂ achieves equality at the crucial step (4.3). }
Remark 4.2. When U is reduced to a singleton, the optimization problem V is
degenerate. In this case, the DPE is linear, and the verification theorem reduces
to the so-called Feynman-Kac formula.
Notice that the verification theorem assumes the existence of such a solution,
and is by no means an existence result. However, it provides uniqueness in the
class of functions with quadratic growth.
We now state without proof an existence result for the DPE together with the terminal condition V(T, ·) = g (see [8] and the references therein). The main assumption is the so-called uniform parabolicity condition:

there is a constant c > 0 such that ξᵀ σσᵀ(t, x, u) ξ ≥ c|ξ|² for all (t, x, u) ∈ [0, T] × Rⁿ × U and ξ ∈ Rⁿ.   (4.4)

In the following statement, we denote by C_b^k(Rⁿ) the space of bounded functions whose partial derivatives of orders ≤ k exist and are bounded continuous. We similarly denote by C_b^{p,k}([0, T], Rⁿ) the space of bounded functions whose partial derivatives with respect to t, of orders ≤ p, and with respect to x, of orders ≤ k, exist and are bounded continuous.
Theorem 4.3. Let Condition (4.4) hold, and assume further that:
• U is compact;
• b, σ and f are in C_b^{1,2}([0, T], Rⁿ);
• g ∈ C_b³(Rⁿ).
Then the DPE (2.18) with the terminal data V(T, ·) = g has a unique solution V ∈ C_b^{1,2}([0, T] × Rⁿ).

4.2 Examples of control problems with explicit solutions
4.2.1 Optimal portfolio allocation
We now apply the verification theorem to a classical example in finance, which was introduced by Merton [10, 11], and has generated a huge literature since then.
Consider a financial market consisting of a non-risky asset S⁰ and a risky one S. The dynamics of the price processes are given by

dS⁰_t = S⁰_t r dt and dS_t = S_t [µ dt + σ dW_t].

Here, r, µ and σ are given positive constants, and W is a one-dimensional Brownian motion.
The investment policy is defined by an F-adapted process π = {π_t, t ∈ [0, T]}, where π_t represents the amount invested in the risky asset at time t; the remaining wealth (X_t − π_t) is invested in the non-risky asset. Therefore, the liquidation value of a self-financing strategy satisfies

dX^π_t = π_t (dS_t / S_t) + (X^π_t − π_t) (dS⁰_t / S⁰_t) = (r X^π_t + (µ − r) π_t) dt + σ π_t dW_t.   (4.5)

Such a process π is said to be admissible if it lies in U0 = H², which will be referred to as the set of all admissible portfolios. Observe that, in view of the particular form of our controlled process X, this definition agrees with (2.4).
Let γ be an arbitrary parameter in (0, 1) and define the power utility function:

U(x) := x^γ for x ≥ 0.

The parameter γ is called the relative risk aversion coefficient.


The objective of the investor is to choose an allocation of his wealth so as to maximize the expected utility of his terminal wealth, i.e.

V(t, x) := sup_{π∈U0} E[ U(X^{t,x,π}_T) ],

where X^{t,x,π} is the solution of (4.5) with initial condition X^{t,x,π}_t = x.
The dynamic programming equation corresponding to this problem is:

∂w/∂t (t, x) + sup_{u∈R} A^u w(t, x) = 0,   (4.6)

where A^u is the second order linear operator:

A^u w(t, x) := (r x + (µ − r) u) ∂w/∂x (t, x) + (1/2) σ² u² ∂²w/∂x² (t, x).

We next search for a solution of the dynamic programming equation of the form v(t, x) = x^γ h(t). Plugging this form of solution into the PDE (4.6), we get the following ordinary differential equation on h:

0 = h′ + γ h sup_{u∈R} [ r + (µ − r)(u/x) + (1/2)(γ − 1) σ² (u/x)² ]   (4.7)
= h′ + γ h sup_{δ∈R} [ r + (µ − r)δ + (1/2)(γ − 1) σ² δ² ]   (4.8)
= h′ + γ h [ r + (µ − r)² / (2(1 − γ)σ²) ],   (4.9)

where the maximizer is:

û := (µ − r) x / ((1 − γ) σ²).

Since v(T, ·) = U, we seek a function h satisfying the above ordinary differential equation together with the boundary condition h(T) = 1. This induces the unique candidate:

h(t) := e^{a(T−t)} with a := γ ( r + (µ − r)² / (2(1 − γ)σ²) ).

Hence, the function (t, x) ↦ x^γ h(t) is a classical solution of the HJB equation (4.6). It is easily checked that the conditions of Theorem 4.1 are all satisfied in this context. Then V(t, x) = x^γ h(t), and the optimal portfolio allocation policy is given by the linear control process:

π̂_t = (µ − r) X^{π̂}_t / ((1 − γ) σ²).
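
As a quick sanity check of this closed form (a simulation sketch; the parameter values and all names below are our own choices), note that under the policy π̂ the wealth is a geometric Brownian motion with constant proportion θ̂ := (µ − r)/((1 − γ)σ²) held in the risky asset, so E[(X^{π̂}_T)^γ] can be compared directly with x^γ e^{a(T−t)}:

import numpy as np

r, mu, sigma, gamma = 0.02, 0.08, 0.25, 0.5   # sample market and preference parameters
T, x0 = 1.0, 1.0
a = gamma * (r + (mu - r)**2 / (2 * (1 - gamma) * sigma**2))

theta = (mu - r) / ((1 - gamma) * sigma**2)   # optimal proportion in the risky asset
drift, vol = r + (mu - r) * theta, sigma * theta

rng = np.random.default_rng(0)
W_T = np.sqrt(T) * rng.standard_normal(1_000_000)
X_T = x0 * np.exp((drift - 0.5 * vol**2) * T + vol * W_T)   # wealth under pi_hat

print("Monte Carlo E[U(X_T)]     :", (X_T**gamma).mean())
print("closed form x^g e^{a(T-t)}:", x0**gamma * np.exp(a * T))

Both numbers agree to Monte Carlo accuracy, and replacing theta by any other constant proportion yields a smaller expected utility, consistent with the optimality of π̂.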

4.2.2 Law of iterated logarithm for double stochastic integrals
The main object of this paragraph is Theorem 4.5 below, reported from [2], which describes the local behavior of double stochastic integrals near the starting point zero. This result will be needed in the problem of hedging under gamma constraints, which will be discussed later in these notes. An interesting feature of the proof of Theorem 4.5 is that it relies on a verification argument. However, the problem does not fit exactly in the setting of Theorem 4.1. Therefore, this is an interesting exercise on the verification concept.
Given a bounded predictable process b, we define the processes

Y^b_t := Y_0 + ∫_0^t b_r dW_r and Z^b_t := Z_0 + ∫_0^t Y^b_r dW_r, t ≥ 0,

where Y_0 and Z_0 are some given initial data in R.

Lemma 4.4. Let λ and T be two positive parameters with 2λT < 1. Then:

E[ e^{2λZ^b_T} ] ≤ E[ e^{2λZ^1_T} ] for each predictable process b with ‖b‖_∞ ≤ 1.

Proof. We split the argument into three steps.
1. We first directly compute that

E[ e^{2λZ^1_T} | F_t ] = v(t, Y^1_t, Z^1_t),

where, for t ∈ [0, T] and y, z ∈ R, the function v is given by:

v(t, y, z) := E[ exp( 2λ( z + ∫_t^T (y + W_u − W_t) dW_u ) ) ]
= e^{2λz} E[ exp( 2λ y W_{T−t} + λ(W²_{T−t} − (T − t)) ) ]
= µ exp( 2λz − λ(T − t) + 2µ²λ²(T − t) y² ),

where µ := [1 − 2λ(T − t)]^{−1/2}. Observe that

the function v is strictly convex in y,   (4.10)

and

y D²_{yz} v(t, y, z) = 8µ²λ³(T − t) v(t, y, z) y² ≥ 0.   (4.11)

2. For an arbitrary real parameter β, we denote by A^β the generator of the process (Y^β, Z^β):

A^β := (1/2) β² D²_{yy} + (1/2) y² D²_{zz} + β y D²_{yz}.

In this step, we intend to prove that for all t ∈ [0, T] and y, z ∈ R:

max_{|β|≤1} A^β v(t, y, z) = A¹ v(t, y, z) = 0.   (4.12)

The second equality follows from the fact that {v(t, Y^1_t, Z^1_t), t ≤ T} is a martingale. As for the first equality, we see from (4.10) and (4.11) that β = 1 is a maximizer of both functions β ↦ β² D²_{yy} v(t, y, z) and β ↦ β y D²_{yz} v(t, y, z) on [−1, 1].

3. Let b be some given predictable process valued in [−1, 1], and define the sequence of stopping times

τ_k := T ∧ inf{ t ≥ 0 : |Y^b_t| + |Z^b_t| ≥ k }, k ∈ N.

By Itô's lemma and (4.12), it follows that:

v(0, Y_0, Z_0) = v(τ_k, Y^b_{τ_k}, Z^b_{τ_k}) − ∫_0^{τ_k} (∂_t + A^{b_t}) v(t, Y^b_t, Z^b_t) dt − ∫_0^{τ_k} [b D_y v + y D_z v](t, Y^b_t, Z^b_t) dW_t
≥ v(τ_k, Y^b_{τ_k}, Z^b_{τ_k}) − ∫_0^{τ_k} [b D_y v + y D_z v](t, Y^b_t, Z^b_t) dW_t.

Taking expected values and sending k to infinity, we get by Fatou's lemma:

v(0, Y_0, Z_0) ≥ lim inf_{k→∞} E[ v(τ_k, Y^b_{τ_k}, Z^b_{τ_k}) ] ≥ E[ v(T, Y^b_T, Z^b_T) ] = E[ e^{2λZ^b_T} ],

which proves the lemma. }
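
A small simulation sketch of this inequality (the Euler discretization, the parameter values and the sample control below are our own choices): with Y_0 = Z_0 = 0 and 2λT < 1, we compare E[e^{2λZ^b_T}] for a predictable |b| ≤ 1 against b ≡ 1, for which the computation in Step 1 gives the closed form e^{−λT}(1 − 2λT)^{−1/2}.

import numpy as np

lam, T, N, paths = 0.2, 1.0, 500, 200_000
assert 2 * lam * T < 1
dt = T / N
rng = np.random.default_rng(1)

def exp_moment(b):                       # b = b(t, Y): predictable, |b| <= 1
    Y = np.zeros(paths)
    Z = np.zeros(paths)
    t = 0.0
    for _ in range(N):
        dW = np.sqrt(dt) * rng.standard_normal(paths)
        Z += Y * dW                      # dZ = Y dW, using Y before it moves
        Y += b(t, Y) * dW                # dY = b dW
        t += dt
    return np.exp(2 * lam * Z).mean()

print("b = cos(t):", exp_moment(lambda t, Y: np.cos(t)))
print("b = 1     :", exp_moment(lambda t, Y: 1.0))
print("closed form for b = 1:", np.exp(-lam * T) / np.sqrt(1 - 2 * lam * T))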

We are now able to prove the law of the iterated logarithm for double stochastic integrals by a direct adaptation of the case of the Brownian motion. Set

h(t) := 2t log log(1/t) for t > 0.

Theorem 4.5. Let b be a predictable process valued in a bounded interval [β0, β1] for some real parameters 0 ≤ β0 < β1, and X^b_t := ∫_0^t ∫_0^u b_v dW_v dW_u. Then:

β0 ≤ lim sup_{t↘0} 2X^b_t / h(t) ≤ β1 a.s.
Proof. We first show that the first inequality is an easy consequence of the second one. Set β̄ := (β0 + β1)/2 ≥ 0 and δ := (β1 − β0)/2, so that b = β̄ + δ b̃ with b̃ := δ^{−1}(b − β̄) valued in [−1, 1]. By the law of the iterated logarithm for the Brownian motion, we have

β̄ = lim sup_{t↘0} 2X^{β̄}_t / h(t) ≤ δ lim sup_{t↘0} 2X^{−b̃}_t / h(t) + lim sup_{t↘0} 2X^b_t / h(t).

It then follows from the second inequality that:

lim sup_{t↘0} 2X^b_t / h(t) ≥ β̄ − δ = β0.

We now prove the second inequality. Clearly, we can assume with no loss of generality that ‖b‖_∞ ≤ 1. Let T > 0 and λ > 0 be such that 2λT < 1. It follows from Doob's maximal inequality for submartingales that for all α ≥ 0,

P[ max_{0≤t≤T} 2X^b_t ≥ α ] = P[ max_{0≤t≤T} exp(2λX^b_t) ≥ exp(λα) ]
≤ e^{−λα} E[ e^{2λX^b_T} ].

In view of Lemma 4.4, this provides:

P[ max_{0≤t≤T} 2X^b_t ≥ α ] ≤ e^{−λα} E[ e^{2λX^1_T} ] = e^{−λ(α+T)} (1 − 2λT)^{−1/2}.   (4.13)
We have then reduced the problem to the case of the Brownian motion, and the rest of this proof is identical to the first half of the proof of the law of the iterated logarithm for the Brownian motion. Take θ, η ∈ (0, 1), and set for all k ∈ N,

α_k := (1 + η)² h(θ^k) and λ_k := [2θ^k(1 + η)]^{−1}.

Applying (4.13), we see that for all k ∈ N,

P[ max_{0≤t≤θ^k} 2X^b_t ≥ (1 + η)² h(θ^k) ] ≤ e^{−1/(2(1+η))} (1 + η^{−1})^{1/2} (−k log θ)^{−(1+η)}.

Since Σ_{k≥1} k^{−(1+η)} < ∞, it follows from the Borel-Cantelli lemma that, for almost all ω ∈ Ω, there exists a natural number K^{θ,η}(ω) such that for all k ≥ K^{θ,η}(ω),

max_{0≤t≤θ^k} 2X^b_t(ω) < (1 + η)² h(θ^k).

In particular, for all t ∈ (θ^{k+1}, θ^k],

2X^b_t(ω) < (1 + η)² h(θ^k) ≤ (1 + η)² h(t)/θ.

Hence,

lim sup_{t↘0} 2X^b_t / h(t) ≤ (1 + η)²/θ a.s.,

and the required result follows by letting θ tend to 1 and η to 0 along the rationals. }

4.3 The verification argument for optimal stopping problems
In this section, we develop the verification argument for finite horizon optimal stopping problems. Let T > 0 be a finite time horizon, and let X^{t,x} denote the solution of the stochastic differential equation:

X^{t,x}_s = x + ∫_t^s b(r, X^{t,x}_r) dr + ∫_t^s σ(r, X^{t,x}_r) dW_r,   (4.14)

where b and σ satisfy the usual Lipschitz and linear growth conditions. Given the functions k, f : [0, T] × Rᵈ → R and g : Rᵈ → R, we consider the optimal stopping problem

V(t, x) := sup_{τ∈T^t_{[t,T]}} E[ ∫_t^τ β(t, s) f(s, X^{t,x}_s) ds + β(t, τ) g(X^{t,x}_τ) ],   (4.15)

whenever this expected value is well-defined, where

β(t, s) := e^{−∫_t^s k(r, X^{t,x}_r) dr}, 0 ≤ t ≤ s ≤ T.

By the results of the previous chapter, the corresponding dynamic programming equation is:

min{ −∂_t v − Lv − f, v − g } = 0 on [0, T) × Rᵈ, v(T, ·) = g,   (4.16)

where L is the second order differential operator

Lv := b · Dv + (1/2) Tr[σσᵀ D²v] − k v.

Similar to Section 4.1, a function v will be called a supersolution (resp. subsolution) of (4.16) if

min{ −∂_t v − Lv − f, v − g } ≥ (resp. ≤) 0 and v(T, ·) ≥ (resp. ≤) g.



Before stating the main result of this section, we observe that for many interesting examples, it is known that the value function V does not satisfy the C^{1,2} regularity which we have been using so far for the application of Itô's formula. Therefore, in order to state a result which can be applied to a wider class of problems, we shall enlarge in the following remark the set of functions for which Itô's formula still holds true.

Remark 4.6. Let v be a function in the Sobolev space W^{1,2}(S). By definition, for such a function v, there is a sequence of functions (vⁿ)_{n≥1} ⊂ C^{1,2}(S) such that vⁿ → v uniformly on compact subsets of S, and

‖∂_t vⁿ − ∂_t vᵐ‖_{L²(S)} + ‖Dvⁿ − Dvᵐ‖_{L²(S)} + ‖D²vⁿ − D²vᵐ‖_{L²(S)} → 0.

Then, Itô's formula holds true for vⁿ for all n ≥ 1, and is inherited by v by sending n → ∞.
Theorem 4.7. Let T < ∞ and v ∈ W^{1,2}([0, T), Rᵈ). Assume further that v and f have quadratic growth. Then:
(i) If v is a supersolution of (4.16), then v ≥ V.
(ii) If v is a solution of (4.16), then v = V and

τ*_t := inf{ s > t : v(s, X_s) = g(X_s) }

is an optimal stopping time.


Proof. Let (t, x) 2 [0, T ) ⇥ Rd be fixed and denote s := (t, s).
t
(i) For an arbitrary stopping time ⌧ 2 T[t,T ) , we denote

⌧n := ⌧ ^ inf s > t : |Xst,x x| > n .

By our regularity conditions on v, notice that Itô’s formula can be applied to it


piecewise. Then:
Z ⌧n Z ⌧n
v(t, x) = ⌧n v(⌧n , X⌧t,x
n
) s (@ t + L)v(s, X t,x
s )ds s(
T
Dv)(s, Xst,x )dWs
t t
Z ⌧n Z ⌧n
t,x t,x T
⌧n v(⌧n , X⌧n ) + s f (s, Xs )ds s( Dv)(s, Xst,x )dWs
t t

by the supersolution property of v. Since (s, Xst,x ) is bounded on the stochastic


interval [t, ⌧n ], this provides:
h Z ⌧n i
v(t, x) E ⌧n v(⌧n , X⌧t,x
n
) + s f (s, X t,x
s )ds .
t

Notice that ⌧n ! ⌧ a.s. Then, since f and v have quadratic growth, we may
pass to the limit n ! 1 invoking the dominated convergence theorem, and we
get:
h Z T i
v(t, x) E T v(T, XTt,x ) + t,x
s f (s, Xs )ds .
t
4.2. Examples 61

Since v(T, .) g by the supersolution property, this concludes the proof of (i).
(ii) Let τ*_t be the stopping time introduced in the theorem. Then, since v(T, ·) = g, it follows that τ*_t ∈ T^t_{[t,T]}. Set

τⁿ_t := τ*_t ∧ inf{ s > t : |X^{t,x}_s − x| > n }.

Observe that v > g on [t, τⁿ_t) ⊂ [t, τ*_t), and therefore −∂_t v − Lv − f = 0 on [t, τⁿ_t). Then, proceeding as in the previous step, it follows from Itô's formula that:

v(t, x) = E[ β_{τⁿ_t} v(τⁿ_t, X^{t,x}_{τⁿ_t}) + ∫_t^{τⁿ_t} β_s f(s, X^{t,x}_s) ds ].

Since τⁿ_t → τ*_t a.s. and f, v have quadratic growth, we may pass to the limit n → ∞ invoking the dominated convergence theorem. This leads to:

v(t, x) = E[ β_{τ*_t} v(τ*_t, X^{t,x}_{τ*_t}) + ∫_t^{τ*_t} β_s f(s, X^{t,x}_s) ds ],

and the required result follows from the fact that v(τ*_t, X^{t,x}_{τ*_t}) = g(X^{t,x}_{τ*_t}), by the definition of τ*_t and the equality v(T, ·) = g. }

4.4 Examples of optimal stopping problems with explicit solutions

4.4.1 Perpetual American options
The pricing problem of the perpetual American put option reduces to the infinite horizon optimal stopping problem:

P(t, s) := sup_{τ∈T^t_{[t,∞)}} E[ e^{−r(τ−t)} (K − S^{t,s}_τ)⁺ ],

where K > 0 is a given exercise price, and S^{t,s} is defined by the Black-Scholes constant coefficients model:

S^{t,s}_u := s exp( (r − σ²/2)(u − t) + σ(W_u − W_t) ), u ≥ t,

where r ≥ 0 and σ > 0 are two given constants. By the time-homogeneity of the problem, we see that

P(t, s) = P(s) := sup_{τ∈T_{[0,∞)}} E[ e^{−rτ} (K − S^{0,s}_τ)⁺ ].   (4.17)

In view of this time independence, it follows that the dynamic programming equation corresponding to this problem is:

min{ v − (K − s)⁺, r v − r s Dv − (1/2) σ² s² D²v } = 0.   (4.18)

In order to proceed to a verification argument, we now guess a solution to the previous obstacle problem. From the nature of the problem, we search for a solution of this obstacle problem defined by a parameter s0 ∈ (0, K) such that:

p(s) = K − s for s ∈ [0, s0] and r p − r s p′ − (1/2) σ² s² p″ = 0 on [s0, ∞).

We are then reduced to solving a linear second order ODE on [s0, ∞), thus determining p by

p(s) = A s + B s^{−2r/σ²} for s ∈ [s0, ∞),

up to the two constants A and B. Notice that 0 ≤ p ≤ K. Then the constant A = 0 in our candidate solution, because otherwise p → ∞ at infinity. We finally determine the constants B and s0 by requiring our candidate solution to be continuous and differentiable at s0. This provides two equations:

B s0^{−2r/σ²} = K − s0 and (2r/σ²) B s0^{−2r/σ²−1} = 1,

which provide our final candidate:

s0 = 2rK/(2r + σ²), p(s) = (K − s) 1_{[0,s0]}(s) + (σ²s0/2r)(s/s0)^{−2r/σ²} 1_{(s0,∞)}(s).   (4.19)

Notice that our candidate p is not twice differentiable at s0, as p″(s0−) = 0 ≠ p″(s0+). However, by Remark 4.6, Itô's formula still applies to p, and p satisfies the dynamic programming equation (4.18). We now show that

p = P with optimal stopping time τ* := inf{ t > 0 : p(S^{0,s}_t) = (K − S^{0,s}_t)⁺ }.   (4.20)

Indeed, for an arbitrary stopping time τ ∈ T_{[0,∞)}, it follows from Itô's formula that:

p(s) = e^{−rτ} p(S^{0,s}_τ) − ∫_0^τ e^{−rt} ( −r p + r s p′ + (1/2) σ² s² p″ )(S_t) dt − ∫_0^τ e^{−rt} p′(S_t) σ S_t dW_t
≥ e^{−rτ} (K − S^{0,s}_τ)⁺ − ∫_0^τ e^{−rt} p′(S_t) σ S_t dW_t

by the fact that p is a supersolution of the dynamic programming equation. Since p′ is bounded, there is no need for any localization to get rid of the stochastic integral, and we directly obtain by taking expected values that p(s) ≥ E[ e^{−rτ} (K − S^{0,s}_τ)⁺ ]. By the arbitrariness of τ ∈ T_{[0,∞)}, this shows that p ≥ P.
We next repeat the same argument with the stopping time τ*, and we see that p(s) = E[ e^{−rτ*} (K − S^{0,s}_{τ*})⁺ ], completing the proof of (4.20).
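
A direct numerical check of (4.19)-(4.20) (a sketch; the parameter values and function names are our own choices). Among the barrier exercise rules τ_b := inf{t ≥ 0 : S^{0,s}_t ≤ b}, b ≤ s, the bounded solution of the same ODE gives E[e^{−rτ_b}] = (s/b)^{−2r/σ²}, so the rule τ_b earns (K − b)(s/b)^{−2r/σ²}; this should be maximized exactly at b = s0.

import numpy as np

r, sigma, K = 0.05, 0.3, 1.0          # sample parameters
k = 2 * r / sigma**2
s0 = 2 * r * K / (2 * r + sigma**2)   # candidate exercise boundary of (4.19)

def p(s):                             # candidate value function of (4.19)
    s = np.asarray(s, dtype=float)
    return np.where(s <= s0, K - s, (sigma**2 * s0 / (2 * r)) * (s / s0) ** (-k))

def barrier_value(s, b):              # value of stopping at the hitting time of b
    return (K - b) * (s / b) ** (-k)

s = 1.2
bs = np.linspace(0.05, 0.95 * K, 1000)
b_star = bs[np.argmax(barrier_value(s, bs))]
print("s0 =", s0, " best barrier ~", b_star)       # both ~ 2rK/(2r + sigma^2)
print("p(s) =", p(s), " best barrier value =", barrier_value(s, b_star))

Smooth fit can be checked in the same way: the one-sided difference quotients (p(s0 ± ε) − p(s0))/(±ε) both converge to −1 as ε ↘ 0.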

4.4.2 Finite horizon American options


Finite horizon optimal stopping problems rarely have an explicit solution, so the following example can be seen as a sanity check. In the context of the financial market of the previous subsection, we assume that the instantaneous interest rate r = 0, and we consider an American option with payoff function g and maturity T > 0. Then the price of the corresponding American option is given by the optimal stopping problem:

P(t, s) := sup_{τ∈T^t_{[t,T]}} E[ g(S^{t,s}_τ) ].   (4.21)

The corresponding dynamic programming equation is:

min{ v − g, −∂_t v − (1/2) σ² s² D²v } = 0 on [0, T) × R₊ and v(T, ·) = g.   (4.22)

Assuming further that g ∈ W^{1,2} and is concave, we see that g (viewed as a function of (t, s) constant in t) is a solution of the dynamic programming equation. Then, provided that g satisfies a suitable growth condition, we see by a verification argument that P = g.
Notice that the previous result can be obtained directly by the Jensen inequality together with the fact that S is a martingale.
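
This can be checked on a martingale binomial tree (our own discretization and parameter choices; a sketch, not part of the text): with r = 0, the risk-neutral probability makes S a martingale, and backward induction for the concave payoff g(s) = s ∧ K returns exactly the immediate exercise value, as the Jensen inequality predicts.

import numpy as np

s_init, K, sigma, T, N = 1.0, 1.0, 0.2, 1.0, 100
dt = T / N
u = np.exp(sigma * np.sqrt(dt))
d = 1.0 / u
p = (1.0 - d) / (u - d)                      # martingale probability: p*u + (1-p)*d = 1

g = lambda s: np.minimum(s, K)               # a concave payoff

S = s_init * u ** np.arange(N, -N - 1, -2)   # terminal stock prices, high to low
V = g(S)
for n in range(N, 0, -1):                    # American backward induction
    S = s_init * u ** np.arange(n - 1, -n, -2)
    cont = p * V[:-1] + (1 - p) * V[1:]      # continuation value (r = 0: no discounting)
    V = np.maximum(g(S), cont)               # exercise whenever it dominates

print("American price:", V[0], " immediate payoff g(s):", float(g(s_init)))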
