Stochastic Control and Optimal Stopping

Chapter 1
Conditional Expectation and Linear Parabolic PDEs
where $T > 0$ is a given maturity date. Here, $b$ and $\sigma$ are $\mathbb{F} \otimes \mathcal{B}(\mathbb{R}^n)$-progressively measurable functions from $[0,T] \times \Omega \times \mathbb{R}^n$ to $\mathbb{R}^n$ and $M_{\mathbb{R}}(n,d)$, respectively. In particular, for every fixed $x \in \mathbb{R}^n$, the processes $\{b_t(x), \sigma_t(x),\ t \in [0,T]\}$ are $\mathbb{F}$-progressively measurable.
Let us mention that there is a notion of weak solutions which relaxes some conditions from the above definition in order to allow for more general stochastic differential equations. Weak solutions, as opposed to strong solutions, are defined on some probabilistic structure (which becomes part of the solution), and not necessarily on $(\Omega, \mathcal{F}, \mathbb{F}, P, W)$. Thus, for a weak solution we search for a probability structure $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{F}}, \tilde P, \tilde W)$ and a process $\tilde X$ such that the requirement of the above definition holds true. Obviously, any strong solution is a weak solution, but the opposite claim is false.
The main existence and uniqueness result is the following.
Theorem 1.2. Let $X_0 \in L^2$ be a r.v. independent of $W$. Assume that the processes $b_\cdot(0)$ and $\sigma_\cdot(0)$ are in $H^2$, and that for some $K > 0$:
\[ |b_t(\omega,x) - b_t(\omega,y)| + |\sigma_t(\omega,x) - \sigma_t(\omega,y)| \le K|x-y| \quad \text{for all } t \in [0,T],\ \omega \in \Omega,\ x,y \in \mathbb{R}^n. \]
Then, for all $T > 0$, there exists a unique strong solution of (1.1) in $H^2$. Moreover,
\[ E\Big[\sup_{t \le T} |X_t|^2\Big] \le C\big(1 + E|X_0|^2\big)\, e^{CT}, \tag{1.2} \]
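Before turning to the proof, here is a minimal numerical sketch of (1.1) and of the moment estimate (1.2), via the standard Euler-Maruyama scheme. The coefficients $b(t,x) = -x$ and $\sigma(t,x) = 1 + \frac12\sin x$ are illustrative Lipschitz choices, not taken from the text.

```python
import numpy as np

# Euler-Maruyama sketch for dX_t = b(t, X_t) dt + sigma(t, X_t) dW_t,
# with an empirical check that E[sup_{t<=T} |X_t|^2] is finite, as (1.2) predicts.
def euler_maruyama(b, sigma, x0, T, n_steps, n_paths, rng):
    dt = T / n_steps
    X = np.full(n_paths, x0, dtype=float)
    sup_sq = np.abs(X) ** 2          # running sup of |X_t|^2 along each path
    t = 0.0
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_paths)
        X = X + b(t, X) * dt + sigma(t, X) * dW
        t += dt
        sup_sq = np.maximum(sup_sq, np.abs(X) ** 2)
    return X, sup_sq

rng = np.random.default_rng(0)
b = lambda t, x: -x                       # Lipschitz drift (illustrative)
sigma = lambda t, x: 1 + 0.5 * np.sin(x)  # Lipschitz diffusion (illustrative)
_, sup_sq = euler_maruyama(b, sigma, x0=1.0, T=1.0, n_steps=1000, n_paths=10_000, rng=rng)
print("E[sup_t |X_t|^2] ~", sup_sq.mean())
```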
Clearly, the norms $\|\cdot\|_{H^2}$ and $\|\cdot\|_{H^2_c}$ on the Hilbert space $H^2$ are equivalent. Consider the map $U$ on $H^2$ defined by:
\[ U(X)_t := X_0 + \int_0^t b_s(X_s)\,ds + \int_0^t \sigma_s(X_s)\,dW_s, \quad 0 \le t \le T. \]
By the Lipschitz property of $b$ and $\sigma$ in the $x$ variable and the fact that $b_\cdot(0), \sigma_\cdot(0) \in H^2$, it follows that this map is well defined on $H^2$. In order to prove existence and uniqueness of a solution for (1.1), we shall prove that $U(X) \in H^2$ for all $X \in H^2$ and that $U$ is a contracting mapping with respect to the norm $\|\cdot\|_{H^2_c}$ for a convenient choice of the constant $c > 0$.
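The contraction property of $U$ can also be watched numerically: on a fixed discretized Brownian path, iterating a discrete version of $U$ produces Picard iterates whose distance shrinks rapidly. A minimal sketch, with illustrative coefficients:

```python
import numpy as np

# Picard iteration sketch: X^{k+1} = U(X^k) on one discretized Brownian path.
rng = np.random.default_rng(1)
T, n = 1.0, 2000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
b = lambda x: -x                  # illustrative Lipschitz coefficients
sigma = lambda x: 0.5 * np.cos(x)

X = np.zeros(n + 1)               # initial guess: X^0 identically X_0 = 0
for k in range(8):
    drift = np.concatenate(([0.0], np.cumsum(b(X[:-1]) * dt)))
    noise = np.concatenate(([0.0], np.cumsum(sigma(X[:-1]) * dW)))
    X_new = X[0] + drift + noise
    print(f"iteration {k}: sup |X^(k+1) - X^k| = {np.max(np.abs(X_new - X)):.2e}")
    X = X_new
```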
1. We first prove that $U(X) \in H^2$ for all $X \in H^2$. To see this, we decompose:
\[ \|U(X)\|_{H^2}^2 \le 3T\|X_0\|_{L^2}^2 + 3T\, E\Big[\int_0^T \Big|\int_0^t b_s(X_s)\,ds\Big|^2 dt\Big] + 3\, E\Big[\int_0^T \Big|\int_0^t \sigma_s(X_s)\,dW_s\Big|^2 dt\Big]. \]
"Z Z 2
# "Z #
T t T
2 2
E bs (Xs )ds dt KT E (1 + |bt (0)| + |Xs | )ds < 1,
0 0 0
"Z Z 2
# " Z 2
#
T t t
E s (Xs )dWs dt T E max s (Xs )dWs dt
0 0 tT 0
"Z #
T
4T E | s (Xs )|2 ds
0
"Z #
T
2 2
4T KE (1 + | s (0)| + |Xs | )ds < 1.
0
2. To see that $U$ is a contracting mapping for the norm $\|\cdot\|_{H^2_c}$, for some convenient choice of $c > 0$, we consider two processes $X, Y \in H^2$ with $X_0 = Y_0$, and we estimate:
\[ E\,|U(X)_t - U(Y)_t|^2 \le 2\, E\Big|\int_0^t \big(b_s(X_s) - b_s(Y_s)\big)\,ds\Big|^2 + 2\, E\Big|\int_0^t \big(\sigma_s(X_s) - \sigma_s(Y_s)\big)\,dW_s\Big|^2 \]
\[ = 2\, E\Big|\int_0^t \big(b_s(X_s) - b_s(Y_s)\big)\,ds\Big|^2 + 2\, E\int_0^t |\sigma_s(X_s) - \sigma_s(Y_s)|^2\,ds \]
\[ \le 2t\, E\int_0^t |b_s(X_s) - b_s(Y_s)|^2\,ds + 2\, E\int_0^t |\sigma_s(X_s) - \sigma_s(Y_s)|^2\,ds \le 2(T+1)K^2 \int_0^t E\,|X_s - Y_s|^2\,ds. \]
Hence $\|U(X) - U(Y)\|_{H^2_c} \le \big(\tfrac{2K^2(T+1)}{c}\big)^{1/2}\, \|X - Y\|_{H^2_c}$, and therefore $U$ is a contracting mapping for sufficiently large $c$.
Step 2. We next prove the estimate (1.2). We shall alleviate the notation by writing $b_s := b_s(X_s)$ and $\sigma_s := \sigma_s(X_s)$.
where we used Doob's maximal inequality. Since $b$ and $\sigma$ are Lipschitz-continuous in $x$, uniformly in $t$ and $\omega$, this provides:
\[ E\Big[\sup_{u \le t} |X_u|^2\Big] \le C(K,T)\Big(1 + E|X_0|^2 + \int_0^t E\Big[\sup_{u \le s} |X_u|^2\Big]\,ds\Big). \]
Exercise 1.3. In the context of this section, assume that the coefficients $\mu$ and $\sigma$ are locally Lipschitz and linearly growing in $x$, uniformly in $(t,\omega)$. By a localization argument, prove that strong existence and uniqueness holds for the stochastic differential equation (1.1).
In addition to the estimate (1.2) of Theorem 1.2, we have the following flow continuity results for the solution of the SDE.

Theorem 1.4. Let the conditions of Theorem 1.2 hold true, and consider some $(t,x), (t',x') \in [0,T] \times \mathbb{R}^n$ with $t \le t' \le T$.
(i) There is a constant $C$ such that:
\[ E\Big[\sup_{t \le s \le t'} \big|X_s^{t,x} - X_s^{t,x'}\big|^2\Big] \le C e^{Ct'} |x - x'|^2. \tag{1.3} \]
(ii) Assume further that $B := \sup_{t < t' \le T} (t'-t)^{-1} E\int_t^{t'} \big(|b_r(0)|^2 + |\sigma_r(0)|^2\big)\,dr < \infty$. Then for all $t' \in [t,T]$:
\[ E\Big[\sup_{t' \le s \le T} \big|X_s^{t,x} - X_s^{t',x}\big|^2\Big] \le C e^{CT} (B + |x|^2)\,|t' - t|. \tag{1.4} \]

Proof. (i) To simplify the notations, we set $X_s := X_s^{t,x}$ and $X_s' := X_s^{t,x'}$ for all $s \in [t,T]$. We also denote $\delta x := x - x'$, $\delta X := X - X'$, $\delta b := b(X) - b(X')$ and $\delta\sigma := \sigma(X) - \sigma(X')$.
Then, it follows from Doob's maximal inequality and the Lipschitz property of the coefficients $b$ and $\sigma$ that:
\[ h(t') := E\Big[\sup_{t \le s \le t'} |\delta X_s|^2\Big] \le 3\Big( |\delta x|^2 + (t'-t)\,E\int_t^{t'} |\delta b_u|^2\,du + 4\,E\int_t^{t'} |\delta\sigma_u|^2\,du \Big) \]
\[ \le 3\Big( |\delta x|^2 + K^2(t'+4)\int_t^{t'} E|\delta X_u|^2\,du \Big) \le 3\Big( |\delta x|^2 + K^2(t'+4)\int_t^{t'} h(u)\,du \Big), \]
and (1.3) follows from the Gronwall inequality.
(ii) Observe that
\[ E|X_{t'} - x|^2 \le 2\Big( E\Big|\int_t^{t'} b_r(X_r)\,dr\Big|^2 + E\Big|\int_t^{t'} \sigma_r(X_r)\,dW_r\Big|^2 \Big) \le 2\Big( T\,E\int_t^{t'} |b_r(X_r)|^2\,dr + E\int_t^{t'} |\sigma_r(X_r)|^2\,dr \Big) \]
\[ \le 6(T+1)\int_t^{t'} \Big( K^2\big(E|X_r - x|^2 + |x|^2\big) + E|b_r(0)|^2 + E|\sigma_r(0)|^2 \Big)\,dr \le 6(T+1)\Big( (t'-t)(|x|^2 + B) + K^2 \int_t^{t'} E|X_r - x|^2\,dr \Big). \]
By the Gronwall inequality, this shows that
\[ E|X_{t'} - x|^2 \le C(|x|^2 + B)\,|t'-t|\, e^{C(t'-t)}. \]
Plugging this estimate in (1.5), we see that:
\[ h(u) \le 3\Big( C(|x|^2 + B)\,|t'-t|\, e^{C(t'-t)} + K^2(T+4)\int_{t'}^u h(r)\,dr \Big), \tag{1.6} \]
t0
where µ and satisfy the required condition for existence and uniqueness of a
strong solution.
For a function f : Rn ! R, we define the function Af by
t,x
E[f (Xt+h )] f (x)
Af (t, x) = lim if the limit exists.
h!0 h
Clearly, Af is well-defined for all bounded C 2 function with bounded deriva-
tives and
1 T @2f
Af (t, x) = µ(t, x) · f (t, x) + Tr (t, x) , (1.7)
2 @x@xT
Theorem 1.7. Let the coefficients $\mu, \sigma$ be continuous and satisfy (1.9). Assume further that the function $k$ is uniformly bounded from below, and $f$ has quadratic growth in $x$ uniformly in $t$. Let $v$ be a $C^{1,2}([0,T), \mathbb{R}^d) \cap C^0([0,T] \times \mathbb{R}^d)$ solution of (1.8) with quadratic growth in $x$ uniformly in $t$. Then
\[ v(t,x) = E\Big[ \int_t^T \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\, g\big(X_T^{t,x}\big) \Big], \quad t \le T,\ x \in \mathbb{R}^d, \]
where $X_s^{t,x} := x + \int_t^s \mu\big(u, X_u^{t,x}\big)\,du + \int_t^s \sigma\big(u, X_u^{t,x}\big)\,dW_u$ and $\beta_s^{t,x} := e^{-\int_t^s k(u, X_u^{t,x})\,du}$ for $t \le s \le T$.
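This representation can be sanity-checked by Monte Carlo in the simplest case $\mu = 0$, $\sigma = 1$, $k = f = 0$, where (1.8) reduces to the heat equation; for the illustrative payoff $g(x) = x^2$ (an assumption made here, not in the text) the exact value is $v(t,x) = x^2 + (T-t)$:

```python
import numpy as np

# Monte Carlo check of v(t,x) = E[g(X_T^{t,x})] for the heat equation,
# where X_s^{t,x} = x + (W_s - W_t) and g(x) = x^2, so v(t,x) = x^2 + (T - t).
rng = np.random.default_rng(2)
t, T, x = 0.25, 1.0, 0.7
W = rng.normal(0.0, np.sqrt(T - t), 500_000)   # samples of W_T - W_t
print("Monte Carlo:", np.mean((x + W) ** 2), " exact:", x**2 + (T - t))
```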
Proof. We first introduce the sequence of stopping times
\[ \tau_n := \Big(T - \frac1n\Big) \wedge \inf\big\{ s > t : \big|X_s^{t,x} - x\big| \ge n \big\}, \]
and we observe that $\tau_n \to T$ $P$-a.s. Since $v$ is smooth, it follows from Itô's formula that for $t \le s < T$:
\[ d\big(\beta_s^{t,x}\, v\big(s, X_s^{t,x}\big)\big) = \beta_s^{t,x} \Big( -kv + \frac{\partial v}{\partial t} + Av \Big)\big(s, X_s^{t,x}\big)\,ds + \beta_s^{t,x}\, \frac{\partial v}{\partial x}\big(s, X_s^{t,x}\big) \cdot \sigma\big(s, X_s^{t,x}\big)\,dW_s \]
\[ = \beta_s^{t,x} \Big( -f\big(s, X_s^{t,x}\big)\,ds + \frac{\partial v}{\partial x}\big(s, X_s^{t,x}\big) \cdot \sigma\big(s, X_s^{t,x}\big)\,dW_s \Big), \]
Now observe that the integrand in the stochastic integral is bounded by definition of the stopping time $\tau_n$, the smoothness of $v$, and the continuity of $\sigma$. Then the stochastic integral has zero mean, and we deduce that
\[ v(t,x) = E\Big[ \int_t^{\tau_n} \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_{\tau_n}^{t,x}\, v\big(\tau_n, X_{\tau_n}^{t,x}\big) \Big]. \tag{1.10} \]
Since $\tau_n \to T$ and the Brownian motion has continuous sample paths $P$-a.s., it follows from the continuity of $v$ that, $P$-a.s.,
\[ \int_t^{\tau_n} \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_{\tau_n}^{t,x}\, v\big(\tau_n, X_{\tau_n}^{t,x}\big) \;\xrightarrow{n \to \infty}\; \int_t^T \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\, v\big(T, X_T^{t,x}\big) \tag{1.11} \]
\[ = \int_t^T \beta_s^{t,x} f\big(s, X_s^{t,x}\big)\,ds + \beta_T^{t,x}\, g\big(X_T^{t,x}\big). \]
By the estimate stated in the existence and uniqueness Theorem 1.2, the latter bound is integrable, and we deduce from the dominated convergence theorem that the convergence in (1.11) holds in $L^1(P)$, proving the required result by taking limits in (1.10). $\square$
\[ \hat v_k(t,x) := \frac1k \sum_{i=1}^k g\big(X^{(i)}\big). \]
By the Law of Large Numbers, it follows that $\hat v_k(t,x) \to v(t,x)$ $P$-a.s. Moreover, the error estimate is provided by the Central Limit Theorem:
\[ \sqrt{k}\,\big(\hat v_k(t,x) - v(t,x)\big) \xrightarrow{k \to \infty} \mathcal{N}\big(0, \mathrm{Var}\big[g\big(X_T^{t,x}\big)\big]\big) \quad \text{in distribution,} \]
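In practice $\hat v_k$ and its CLT confidence interval are computed as in the following sketch; the lognormal dynamics and call payoff are illustrative assumptions, chosen because $X_T^{t,x}$ can then be sampled exactly:

```python
import numpy as np

# The estimator v_hat_k with a 95% confidence interval from the CLT.
rng = np.random.default_rng(3)
t, T, x, mu, sig, K, k = 0.0, 1.0, 100.0, 0.05, 0.2, 100.0, 100_000
Z = rng.normal(size=k)
XT = x * np.exp((mu - 0.5 * sig**2) * (T - t) + sig * np.sqrt(T - t) * Z)
g = np.maximum(XT - K, 0.0)                     # illustrative payoff g
v_hat = g.mean()
half_width = 1.96 * g.std(ddof=1) / np.sqrt(k)  # CLT-based error estimate
print(f"v_hat = {v_hat:.4f} +/- {half_width:.4f}")
```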
for $1 \le i \le d$, where $\mu, \sigma$ are $\mathbb{F}$-adapted processes with $\int_0^T |\mu_t^i|\,dt + \int_0^T |\sigma_t^{i,j}|^2\,dt < \infty$ for all $i,j = 1, \dots, d$. It is convenient to use matrix notation to represent the dynamics of the price vector $S = (S^1, \dots, S^d)$:
called the risk premium process. Here $\mathbf{1}$ is the vector of ones in $\mathbb{R}^d$. We shall frequently make use of the discounted processes
\[ \tilde S_t := \frac{S_t}{S_t^0} = S_t \exp\Big( -\int_0^t r_u\,du \Big), \]
In terms of the above matrix notations, the wealth process $X^\pi$ induced by a self-financing portfolio strategy $\pi$ satisfies
\[ dX_t^\pi = \sum_{i=1}^n \frac{\pi_t^i}{S_t^i}\,dS_t^i + \frac{X_t^\pi - \pi_t \cdot \mathbf{1}}{S_t^0}\,dS_t^0, \]
so that its discounted value satisfies
\[ d\tilde X_t = \tilde\pi_t \cdot \sigma_t (\lambda_t\,dt + dW_t), \quad 0 \le t \le T, \tag{1.13} \]
and the discounted wealth process induced by an initial capital $X_0$ and a portfolio strategy $\pi$ can be written as
\[ \tilde X_t^\pi = \tilde X_0 + \int_0^t \tilde\pi_u \cdot \sigma_u\,dB_u, \quad \text{for } 0 \le t \le T. \tag{1.15} \]
The purpose of this section is to show that the financial market described above contains no arbitrage opportunities. Our first observation is the following.
For this reason, $Q$ is called a risk neutral measure, or an equivalent local martingale measure, for the price process $S$.
We also observe that the discounted wealth process satisfies:
\[ \tilde X^\pi \ \text{is a } Q\text{-local martingale for every } \pi \in \mathcal{A}, \tag{1.17} \]
as a stochastic integral with respect to the $Q$-Brownian motion $B$.

Theorem 1.12. The continuous-time financial market described above contains no arbitrage opportunities, i.e. for every $\pi \in \mathcal{A}$:
\[ X_0 = 0 \ \text{and} \ X_T^\pi \ge 0 \ P\text{-a.s.} \implies X_T^\pi = 0 \ P\text{-a.s.} \]

Proof. For $\pi \in \mathcal{A}$, the discounted wealth process $\tilde X^\pi$ is a $Q$-local martingale
where the last equality follows from the Markov property of the process $S$. Assuming further that $g$ has linear growth, it follows that $V$ has linear growth in $s$ uniformly in $t$. Since $V$ is defined by a conditional expectation, it is expected to satisfy the linear PDE:
\[ \partial_t V + rs \cdot DV + \frac12 \mathrm{Tr}\big[(s \star \sigma)^2 D^2 V\big] - rV = 0. \tag{1.19} \]
More precisely, if $V \in C^{1,2}(\mathbb{R}_+, \mathbb{R}^d)$, then $V$ is a classical solution of (1.19) and satisfies the final condition $V(T,\cdot) = g$. Conversely, if the PDE (1.19) combined with the final condition $v(T,\cdot) = g$ has a classical solution $v$ with linear growth, then $v$ coincides with the derivative security price $V$.
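When no closed form is available, (1.19) is typically solved on a grid. The sketch below applies an explicit finite-difference scheme to the one-dimensional case with the illustrative call payoff $g(s) = (s-K)^+$; all parameters are assumptions, and the small time step is dictated by the stability constraint of the explicit scheme:

```python
import numpy as np

# Explicit finite differences for d_t V + r s DV + 0.5 sigma^2 s^2 D^2 V - r V = 0,
# V(T, s) = (s - K)^+, marching backward from the terminal condition.
r, sigma, T, K = 0.05, 0.2, 1.0, 100.0
smax, ns, nt = 300.0, 300, 20_000
s = np.linspace(0.0, smax, ns + 1)
dt, ds = T / nt, smax / ns
V = np.maximum(s - K, 0.0)                    # terminal condition
tau = 0.0
for _ in range(nt):
    tau += dt                                 # time to maturity
    Vs = (V[2:] - V[:-2]) / (2 * ds)
    Vss = (V[2:] - 2 * V[1:-1] + V[:-2]) / ds**2
    V[1:-1] += dt * (r * s[1:-1] * Vs + 0.5 * sigma**2 * s[1:-1]**2 * Vss - r * V[1:-1])
    V[0], V[-1] = 0.0, smax - K * np.exp(-r * tau)   # call boundary conditions
print("V(0, 100) ~", np.interp(100.0, s, V))  # close to the Black-Scholes value 10.45
```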
Chapter 2
Stochastic Control
and Dynamic Programming
The set $S$ is called the parabolic interior of the state space. We will denote by $\bar S := \mathrm{cl}(S)$ its closure, i.e. $\bar S = [0,T] \times \mathbb{R}^n$ for finite $T$, and $\bar S = S$ for $T = \infty$.
\[ b : (t,x,u) \in S \times U \longmapsto b(t,x,u) \in \mathbb{R}^n \quad \text{and} \quad \sigma : (t,x,u) \in S \times U \longmapsto \sigma(t,x,u) \in M_{\mathbb{R}}(n,d), \]
for some constant $K$ independent of $(t,x,y,u)$. For each control process $\nu \in \mathcal{U}$, we consider the controlled stochastic differential equation:
\[ dX_t = b(t, X_t, \nu_t)\,dt + \sigma(t, X_t, \nu_t)\,dW_t. \tag{2.3} \]
If the above equation has a unique solution $X$, for a given initial data, then the process $X$ is called the controlled process, as its dynamics are driven by the action of the control process $\nu$.
We shall be working with the following subclass of control processes:
\[ \mathcal{U}_0 := \mathcal{U} \cap H^2, \tag{2.4} \]
which guarantees the existence of a controlled process on the time interval $[0,T]$ for each given initial condition and control. The following result is an immediate consequence of Theorem 1.2.
Theorem 2.1. Let $\nu \in \mathcal{U}_0$ be a control process, and $\xi \in L^2(P)$ be an $\mathcal{F}_0$-measurable random variable. Then, there exists a unique $\mathbb{F}$-adapted process $X^\nu$ satisfying (2.3) together with the initial condition $X_0^\nu = \xi$. Moreover, for every $T > 0$, there is a constant $C > 0$ such that
\[ E\Big[\sup_{0 \le s \le t} |X_s^\nu|^2\Big] \le C\big(1 + E[|\xi|^2]\big)\, e^{Ct} \quad \text{for all } t \in [0,T]. \tag{2.5} \]
\[ f, k : [0,T) \times \mathbb{R}^n \times U \to \mathbb{R} \quad \text{and} \quad g : \mathbb{R}^n \to \mathbb{R} \]
for some constant $K$ independent of $(t,x,u)$. We define the cost function $J$ on $[0,T] \times \mathbb{R}^n \times \mathcal{U}$ by:
\[ J(t,x,\nu) := E\Big[ \int_t^T \beta^\nu(t,s)\, f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,T)\, g\big(X_T^{t,x,\nu}\big)\, \mathbf{1}_{T < \infty} \Big], \]
where $\beta^\nu(t,s) := e^{-\int_t^s k(r, X_r^{t,x,\nu}, \nu_r)\,dr}$, and $\{X_s^{t,x,\nu},\ s \ge t\}$ is the solution of (2.3) with control process $\nu$ and initial condition $X_t^{t,x,\nu} = x$.
Admissible control processes. In the finite horizon case $T < \infty$, the quadratic growth condition on $f$ and $g$ together with the bound on $k$ ensure that $J(t,x,\nu)$ is well-defined for all control processes $\nu \in \mathcal{U}_0$. We then define the set of admissible controls in this case by $\mathcal{U}_0$.
More attention is needed for the infinite horizon case. In particular, the discount term $k$ needs to play a role to ensure the finiteness of the integral. In this setting the largest set of admissible control processes is given by
\[ \mathcal{U}_0 := \Big\{ \nu \in \mathcal{U} : E\Big[\int_0^\infty \beta^\nu(t,s)\big( 1 + |X_s^{t,x,\nu}|^2 + |\nu_s| \big)\,ds\Big] < \infty \ \text{for all } x \Big\} \quad \text{when } T = \infty. \]
The stochastic control problem. The purpose of this section is to study the maximization problem
\[ V(t,x) := \sup_{\nu \in \mathcal{U}_0} J(t,x,\nu) \quad \text{for } (t,x) \in S. \]
Our main concern is to describe the local behavior of the value function V
by means of the so-called dynamic programming equation, or Hamilton-Jacobi-
Bellman equation. We continue with some remarks.
Remark 2.2. (i) If $V(t,x) = J(t,x,\hat\nu_{t,x})$, we call $\hat\nu_{t,x}$ an optimal control for the problem $V(t,x)$.
(ii) The following are some interesting subsets of controls:
- a process $\nu \in \mathcal{U}_0$ which is adapted to the natural filtration $\mathbb{F}^X$ of the associated state process is called a feedback control,
- a process $\nu \in \mathcal{U}_0$ which can be written in the form $\nu_s = \tilde u(s, X_s)$ for some measurable map $\tilde u$ from $[0,T] \times \mathbb{R}^n$ into $U$, is called a Markovian control; notice that any Markovian control is a feedback control,
- the deterministic processes of $\mathcal{U}_0$ are called open loop controls.
(iii) Suppose that $T < \infty$, and let $(Y,Z)$ be the controlled processes defined by
\[ dY_s = Z_s f(s, X_s, \nu_s)\,ds \quad \text{and} \quad dZ_s = -Z_s k(s, X_s, \nu_s)\,ds, \]
and define the augmented state process $\bar X := (X, Y, Z)$. Then, the above value function $V$ can be written in the form:
\[ V(t,x) = \bar V(t, x, 0, 1), \]
where $\bar x = (x,y,z)$ is some initial data for the augmented state process $\bar X$,
\[ \bar V(t, \bar x) := E_{t,\bar x}\big[\bar g(\bar X_T)\big] \quad \text{and} \quad \bar g(x,y,z) := y + g(x)\,z. \]
Hence the stochastic control problem $V$ can be reduced without loss of generality to the case where $f = k \equiv 0$. We shall appeal to this reduced form whenever convenient for the exposition.
(iv) For notational simplicity we consider the case $T < \infty$ and $f = k = 0$. The previous remark shows how to immediately adapt the following argument so that the present remark holds true without the restriction $f = k = 0$. The extension to the infinite horizon case is also immediate.
Consider the value function
\[ \tilde V(t,x) := \sup_{\nu \in \mathcal{U}_t} E\big[ g\big(X_T^{t,x,\nu}\big) \big], \tag{2.6} \]
\[ \mathcal{U}_t := \{ \nu \in \mathcal{U}_0 : \nu \ \text{independent of } \mathcal{F}_t \}. \tag{2.7} \]
Then
\[ \tilde V = V, \tag{2.8} \]
so that both problems are indeed equivalent. To see this, fix $(t,x) \in S$ and $\nu \in \mathcal{U}_0$. Then, $\nu$ can be written as a measurable function of the canonical process $\nu\big((\omega_s)_{0 \le s \le t}, (\omega_s - \omega_t)_{t \le s \le T}\big)$, where, for fixed $(\omega_s)_{0 \le s \le t}$, the map $\nu_{(\omega_s)_{0 \le s \le t}} : (\omega_s - \omega_t)_{t \le s \le T} \mapsto \nu\big((\omega_s)_{0 \le s \le t}, (\omega_s - \omega_t)_{t \le s \le T}\big)$ can be viewed as a control independent of $\mathcal{F}_t$. Using the independence of the increments of the Brownian motion, together with Fubini's Lemma, it thus follows that
\[ J(t,x;\nu) = \int E\Big[ g\Big(X_T^{t,x,\nu_{(\omega_s)_{0 \le s \le t}}}\Big) \Big]\,dP\big((\omega_s)_{0 \le s \le t}\big) \le \int \tilde V(t,x)\,dP\big((\omega_s)_{0 \le s \le t}\big) = \tilde V(t,x). \]
for all $(t,x) \in \bar S$. We also recall the subset of controls $\mathcal{U}_t$ introduced in (2.7) above.
Theorem 2.3. Assume that $V$ is locally bounded and fix $(t,x) \in S$. Let $\{\theta^\nu, \nu \in \mathcal{U}_t\}$ be a family of finite stopping times independent of $\mathcal{F}_t$ with values in $[t,T]$. Then:
\[ V(t,x) \le \sup_{\nu \in \mathcal{U}_t} E\Big[ \int_t^{\theta^\nu} \beta^\nu(t,s) f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,\theta^\nu)\, V^*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big) \Big], \]
\[ V(t,x) \ge \sup_{\nu \in \mathcal{U}_t} E\Big[ \int_t^{\theta^\nu} \beta^\nu(t,s) f\big(s, X_s^{t,x,\nu}, \nu_s\big)\,ds + \beta^\nu(t,\theta^\nu)\, V_*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big) \Big]. \]
Observe that the supremum is now taken over the subset $U$ of the finite dimensional space $\mathbb{R}^k$. Hence, the dynamic programming principle allows us to reduce the initial maximization problem, over the subset $\mathcal{U}$ of the infinite dimensional set of $\mathbb{R}^k$-valued processes, to a finite dimensional maximization problem. However, we are still facing an infinite dimensional problem since the dynamic programming principle relates the value function at time $t$ to the value function at time $t+1$.
(iii) In the context of the above discrete-time framework with finite horizon $T < \infty$, notice that the dynamic programming principle suggests the following backward algorithm to compute $V$ as well as the associated optimal strategy (when it exists). Since $V(T,\cdot) = g$ is known, the above dynamic programming principle can be applied recursively in order to deduce the value function $V(t,x)$ for every $t$; a sketch of this backward induction is given below.
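The following sketch implements that backward induction on a toy discrete-time model; the state grid, action set, rewards and transition kernel are all illustrative assumptions:

```python
# Backward induction for V(t,x) = sup_u E[ f(t,x,u) + V(t+1, X_{t+1}) ], V(T,.) = g.
# Toy dynamics: X_{t+1} = x + u + eps with eps = +/-1 equally likely, u in U.
T_steps, U = 5, [-1, 0, 1]
xs = range(-10, 11)                          # truncated state grid
g = lambda x: -abs(x)                        # terminal reward
f = lambda t, x, u: -0.1 * u**2              # running reward
V = {T_steps: {x: g(x) for x in xs}}
policy = {}
for t in range(T_steps - 1, -1, -1):
    V[t], policy[t] = {}, {}
    for x in xs:
        def q(u):
            nxt = (min(max(x + u + e, -10), 10) for e in (-1, 1))
            return f(t, x, u) + 0.5 * sum(V[t + 1][n] for n in nxt)
        best = max(U, key=q)
        V[t][x], policy[t][x] = q(best), best
print("V(0, 3) =", V[0][3], " optimal first action:", policy[0][3])
```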
(iv) In the continuous time setting, there is no obvious counterpart to the above backward algorithm. But, as the stopping time $\theta$ approaches $t$, the above dynamic programming principle implies a special local behavior for the value function $V$. When $V$ is known to be smooth, this will be obtained by means of Itô's formula.
(vi) Once the local behavior of the value function is characterized, we are faced with the important uniqueness issue, which asks whether $V$ is completely characterized by its local behavior together with some convenient boundary condition.
Clearly, one can choose $\nu^\varepsilon = \mu$ on the stochastic interval $[t,\theta]$. Then
\[ V(t,x) \ge J(t,x,\nu^\varepsilon) = E_{t,x}\Big[ \int_t^\theta \beta(t,s) f(s, X_s, \mu_s)\,ds + \beta(t,\theta)\, J(\theta, X_\theta, \nu^\varepsilon) \Big] \]
\[ \ge E_{t,x}\Big[ \int_t^\theta \beta(t,s) f(s, X_s, \mu_s)\,ds + \beta(t,\theta)\, V(\theta, X_\theta) \Big] - \varepsilon\, E_{t,x}\big[\beta(t,\theta)\big]. \]
This provides the required inequality by the arbitrariness of $\mu \in \mathcal{U}$ and $\varepsilon > 0$. $\square$
right-hand side of the classical dynamic programming principle (2.9) is not even known to be well-defined.
The formulation of Theorem 2.3 avoids this measurability problem since $V_*$ and $V^*$ are lower- and upper-semicontinuous, respectively, and therefore measurable. In addition, it allows us to avoid the typically heavy technicalities related to the measurable selection arguments needed for the proof of the classical (2.9) after a convenient relaxation of the control problem, see e.g. El Karoui and Jeanblanc [5].

Proof of Theorem 2.3. For simplicity, we consider the finite horizon case $T < \infty$, so that, without loss of generality, we may assume $f = k = 0$; see Remark 2.2 (iii). The extension to the infinite horizon framework is immediate.
1. Let $\nu \in \mathcal{U}_t$ be arbitrary and set $\theta := \theta^\nu$. Then:
\[ E\big[ g\big(X_T^{t,x,\nu}\big) \,\big|\, \mathcal{F}_\theta \big](\omega) = J\big(\theta(\omega), X_\theta^{t,x,\nu}(\omega); \tilde\nu_\omega\big), \]
where $\tilde\nu_\omega$ is obtained from $\nu$ by freezing its trajectory up to the stopping time $\theta$. Since, by definition, $J\big(\theta(\omega), X_\theta^{t,x,\nu}(\omega); \tilde\nu_\omega\big) \le V^*\big(\theta(\omega), X_\theta^{t,x,\nu}(\omega)\big)$, it follows from the tower property of conditional expectations that
\[ E\big[ g\big(X_T^{t,x,\nu}\big) \big] = E\Big[ E\big[ g\big(X_T^{t,x,\nu}\big) \,\big|\, \mathcal{F}_\theta \big] \Big] \le E\Big[ V^*\big(\theta, X_\theta^{t,x,\nu}\big) \Big], \]
\[ \nu^{(s,y),\varepsilon} \in \mathcal{U}_s \quad \text{and} \quad J\big(s,y;\nu^{(s,y),\varepsilon}\big) \ge V(s,y) - \varepsilon, \quad \text{for every } (s,y) \in S. \tag{2.10} \]
With this construction, it follows from (2.10), (2.11), together with the fact that $V \ge \varphi$, that the countable family $(A_i)_{i \ge 0}$ satisfies
where the last equality follows from the left-hand side of (2.12) and from the monotone convergence theorem, due to the fact that either $E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)^+\big] < \infty$ or $E\big[\varphi\big(\theta, X_\theta^{t,x,\nu}\big)^-\big] < \infty$. By the arbitrariness of $\nu \in \mathcal{U}_t$ and $\varepsilon > 0$, this shows that:
\[ V(t,x) \ge \sup_{\nu \in \mathcal{U}_t} E\big[ \varphi\big(\theta, X_\theta^{t,x,\nu}\big) \big]. \tag{2.13} \]
3. It remains to deduce the first inequality of Theorem 2.3 from (2.13). Fix $r > 0$. It follows from standard arguments, see e.g. Lemma 3.5 in [12], that we can find a sequence of continuous functions $(\varphi^n)_n$ such that $\varphi^n \le V_* \le V$ for all $n \ge 1$ and such that $\varphi^n$ converges pointwise to $V_*$ on $[0,T] \times B_r(0)$. Set $\phi^N := \min_{n \ge N} \varphi^n$ for $N \ge 1$, and observe that the sequence $(\phi^N)_N$ is non-decreasing and converges pointwise to $V_*$ on $[0,T] \times B_r(0)$. By (2.13) and the monotone convergence theorem, we then obtain:
\[ V(t,x) \ge \lim_{N \to \infty} E\big[ \phi^N\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big) \big] = E\big[ V_*\big(\theta^\nu, X_{\theta^\nu}^{t,x,\nu}\big) \big]. \]
for every $s \ge t$, smooth function $\varphi \in C^{1,2}([t,s], \mathbb{R}^n)$, and each admissible control process $\nu \in \mathcal{U}_0$.
Proposition 2.4. Assume the value function $V \in C^{1,2}([0,T), \mathbb{R}^n)$, and let the coefficients $k(\cdot,\cdot,u)$ and $f(\cdot,\cdot,u)$ be continuous in $(t,x)$ for all fixed $u \in U$. Then, for all $(t,x) \in S$:
\[ -\partial_t V(t,x) - H\big(t, x, V(t,x), DV(t,x), D^2 V(t,x)\big) \ge 0. \tag{2.14} \]
Proof. Let $(t,x) \in S$ and $u \in U$ be fixed and consider the constant control process $\nu = u$, together with the associated state process $X$ with initial data $X_t = x$. For all $h > 0$, define the stopping time:
\[ \theta_h := \inf\big\{ s > t : (s-t, X_s - x) \notin [0,h) \times \alpha B \big\}, \]
where $\alpha > 0$ is some given constant, and $B$ denotes the unit ball of $\mathbb{R}^n$. Notice that $\theta_h \to t$, $P$-a.s. when $h \searrow 0$, and $\theta_h = t + h$ for $h \le \bar h(\omega)$ sufficiently small.
1. From the first inequality of the dynamic programming principle, it follows that:
\[ 0 \le E_{t,x}\Big[ \beta(0,t) V(t,x) - \beta(0,\theta_h) V(\theta_h, X_{\theta_h}) - \int_t^{\theta_h} \beta(0,r) f(r, X_r, u)\,dr \Big] \]
\[ = -E_{t,x}\Big[ \int_t^{\theta_h} \beta(0,r)\,(\partial_t V + \mathcal{L}^u V + f)(r, X_r, u)\,dr \Big] - E_{t,x}\Big[ \int_t^{\theta_h} \beta(0,r)\, DV(r, X_r) \cdot \sigma(r, X_r, u)\,dW_r \Big], \]
where the last equality follows from Itô's formula and uses the crucial smoothness assumption on $V$.
2. Observe that $\beta(0,r)\, DV(r, X_r) \cdot \sigma(r, X_r, u)$ is bounded on the stochastic interval $[t, \theta_h]$. Therefore, the second expectation on the right hand-side of the last inequality vanishes, and we obtain:
\[ -\frac1h\, E_{t,x}\Big[ \int_t^{\theta_h} \beta(0,r)\,(\partial_t V + \mathcal{L}^u V + f)(r, X_r, u)\,dr \Big] \ge 0. \]
We now send $h$ to zero. The a.s. convergence of the random value inside the expectation is easily obtained by the mean value theorem; recall that $\theta_h = t + h$ for sufficiently small $h > 0$. Since the random variable $h^{-1}\int_t^{\theta_h} \beta(0,r)\,(\partial_t V + \mathcal{L}^u V + f)(r, X_r, u)\,dr$ is essentially bounded, uniformly in $h$, on the stochastic interval $[t, \theta_h]$, it follows from the dominated convergence theorem that:
\[ -\partial_t V(t,x) - \mathcal{L}^u V(t,x) - f(t,x,u) \ge 0. \]
By the arbitrariness of $u \in U$, this provides the required claim. $\square$
We next wish to show that $V$ satisfies the nonlinear partial differential equation (2.15) with equality. This is a more technical result which can be proved by different methods. We shall report a proof, based on a contradiction argument, which provides more intuition on this result, although it might be slightly longer than the usual proof reported in standard textbooks.
Proposition 2.5. Assume the value function $V \in C^{1,2}([0,T), \mathbb{R}^n)$, let the function $H$ be upper semicontinuous, and $\|k^+\|_\infty < \infty$. Then, for all $(t,x) \in S$:
\[ -\partial_t V(t,x) - H\big(t, x, V(t,x), DV(t,x), D^2 V(t,x)\big) \le 0, \]
where $B$ denotes the unit ball centered at $x_0$. We next observe that the parameter $\gamma$ defined by the following is positive:
\[ -\gamma\, e^{\eta \|k^+\|_\infty} := \max_{\partial N_\eta} (V - \varphi) < 0. \tag{2.17} \]
and observe that, by continuity of the state process, $(\theta, X_\theta) \in \partial N_\eta$, so that:
\[ (V - \varphi)(\theta, X_\theta) \le -\gamma\, e^{\eta \|k^+\|_\infty}, \]
where the "$dW$" integral term has zero mean, as its integrand is bounded on the stochastic interval $[t_0, \theta]$. Observe also that $(\partial_t \varphi + \mathcal{L}^{\nu_r} \varphi)(r, X_r) + f(r, X_r, \nu_r) \le h(r, X_r) \le 0$ on the stochastic interval $[t_0, \theta]$. We therefore deduce that:
\[ V(t_0, x_0) \ge \gamma + E_{t_0,x_0}\Big[ \int_{t_0}^\theta \beta(t_0, r) f(r, X_r, \nu_r)\,dr + \beta(t_0, \theta)\, V(\theta, X_\theta) \Big], \]
which is the required contradiction of the second part of the dynamic programming principle, and thus completes the proof. $\square$
\[ \le \text{Const}\, |x - x'|, \]
where we used the Lipschitz-continuity of $g$ together with the flow estimates of Theorem 1.4, and the fact that the coefficients $b$ and $\sigma$ are Lipschitz in $x$ uniformly in $(t,u)$. This completes the proof of the Lipschitz property of the value function $V$.
(ii) To prove the Hölder continuity in $t$, we shall use the dynamic programming principle.
(ii-1) We first make the following important observation. A careful review of the proof of Theorem 2.3 reveals that, whenever the stopping times $\theta^\nu$ are constant (i.e. deterministic), the dynamic programming principle holds true with the semicontinuous envelopes taken only with respect to the $x$ variable. Since $V$ was shown to be continuous in the first part of this proof, we deduce that:
\[ V(t,x) = \sup_{\nu \in \mathcal{U}_0} E\big[ V\big(t', X_{t'}^{t,x,\nu}\big) \big], \tag{2.19} \]
so that
\[ |V(t,x) - V(t',x)| \le \sup_{\nu \in \mathcal{U}_0} E\big| V\big(t', X_{t'}^{t,x,\nu}\big) - V(t',x) \big|. \]
\[ dX_t = \nu_t\,dW_t. \]
1. If $V$ is $C^{1,2}([0,T), \mathbb{R})$, then it follows from Proposition 2.4 that $V$ satisfies
\[ -\partial_t V - \frac12 u^2 D^2 V \ge 0 \quad \text{for all } u \in \mathbb{R}, \]
and all $(t,x) \in [0,T) \times \mathbb{R}$. By sending $u$ to infinity, it follows that $D^2 V(t,\cdot) \le 0$, i.e. $V(t,\cdot)$ is concave, so that
\[ V(t,x) \ge g^{\mathrm{conc}}(x), \]
where $g^{\mathrm{conc}}$ is the concave envelope of $g$, i.e. the smallest concave majorant of $g$. Notice that $g^{\mathrm{conc}} < \infty$ as $g$ is bounded from above by a line. On the other hand,
\[ V(t,x) := \sup_{\nu \in \mathcal{U}_0} E_{t,x}\big[g(X_T^\nu)\big] \le \sup_{\nu \in \mathcal{U}_0} E_{t,x}\big[g^{\mathrm{conc}}(X_T^\nu)\big] = g^{\mathrm{conc}}(x), \]
so that
\[ V \in C^{1,2}([0,T), \mathbb{R}) \implies V(t,x) = g^{\mathrm{conc}}(x) \ \text{for all } (t,x) \in [0,T) \times \mathbb{R}. \]
Now recall that this implication holds for any arbitrary non-negative lower semicontinuous function $g$. We then obtain a contradiction whenever the function $g^{\mathrm{conc}}$ is not $C^2(\mathbb{R})$. Hence $V$ fails, in general, to be $C^{1,2}$.
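Numerically, $g^{\mathrm{conc}}$ is the upper concave chain of the graph of $g$ over a grid, which the following sketch computes (the non-concave payoff is an arbitrary illustration):

```python
import numpy as np

# Concave envelope g^conc on a grid: keep the upper concave chain of the graph.
def concave_envelope(x, y):
    hull = []
    for p in zip(x, y):
        # pop hull[-1] while it lies on or below the chord hull[-2] -> p
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (p[0] - x1) <= (p[1] - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append(p)
    hx, hy = zip(*hull)
    return np.interp(x, hx, hy)

x = np.linspace(-2.0, 2.0, 401)
g = np.maximum(0.0, 1 - x**2) * np.abs(np.sin(3 * x))  # non-concave illustration
genv = concave_envelope(x, g)
print("g <= g^conc everywhere:", bool(np.all(g <= genv + 1e-12)))
```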
As in the previous chapter, we assume here that the filtration $\mathbb{F}$ is defined as the $P$-augmentation of the canonical filtration of the Brownian motion $W$ defined on the probability space $(\Omega, \mathcal{F}, P)$.
Our objective is to derive results similar to those obtained in the previous chapter for standard stochastic control problems, in the context of optimal stopping problems. We first formulate optimal stopping problems, then develop the corresponding dynamic programming principle and dynamic programming equation.
assume that $\mu$ and $\sigma$ satisfy the usual Lipschitz and linear growth conditions, so that the above SDE has a unique strong solution satisfying the integrability proved in Theorem 1.2.
The infinitesimal generator of the Markov diffusion process $X$ is denoted by
\[ A\varphi := \mu \cdot D\varphi + \frac12 \mathrm{Tr}\big[\sigma\sigma^{\mathrm T} D^2 \varphi\big]. \]
Let $g$ be a measurable function from $\mathbb{R}^n$ to $\mathbb{R}$, and assume that:
\[ E\Big[\sup_{0 \le t < T} |g(X_t)|\Big] < \infty. \tag{3.2} \]
is well-defined for all $(t,x) \in S$ and $\tau \in \mathcal{T}_{[t,T]}$. Here, $X^{t,x}$ denotes the unique strong solution of (3.1) with initial condition $X_t^{t,x} = x$.
The optimal stopping problem is now defined by:
\[ V(t,x) := \sup_{\tau \in \mathcal{T}_{[t,T]}} J(t,x;\tau) \quad \text{for all } (t,x) \in S. \tag{3.4} \]
The subset $\{(t,x) \in S : V(t,x) = g(x)\}$ is called the stopping region and is of particular interest: whenever the state is in this region, it is optimal to stop immediately. Its complement $\mathcal{S}^c$ is called the continuation region.
Remark 3.1. As in the previous chapter, we could have considered an apparently more general criterion
\[ V(t,x) := \sup_{\tau \in \mathcal{T}_{[t,T]}} E\Big[ \int_t^\tau \beta(t,s) f(s, X_s)\,ds + \beta(t,\tau)\, g\big(X_\tau^{t,x}\big)\, \mathbf{1}_{\tau < \infty} \Big], \]
with
\[ \beta(t,s) := e^{-\int_t^s k(r, X_r)\,dr} \quad \text{for } 0 \le t \le s < T. \]
We see immediately that we may reduce this problem to the context of (3.4).
Remark 3.2. Consider the subset of stopping rules:
\[ \mathcal{T}_{[t,T]}^t := \big\{ \tau \in \mathcal{T}_{[t,T]} : \tau \ \text{independent of } \mathcal{F}_t \big\}. \tag{3.6} \]
By a similar argument as in Remark 2.2 (iv), we can see that the maximization in the optimal stopping problem (3.4) can be restricted to this subset, i.e.
for all $(t,x) \in S$ and $\tau \in \mathcal{T}_{[t,T]}$. In particular, the proof in the latter reference does not require any heavy measurable selection, and is essentially based on the supermartingale nature of the so-called Snell envelope process. Moreover, we observe that it does not require any Markov property of the underlying state process.
We report here a different proof in the spirit of the weak dynamic programming principle for stochastic control problems proved in the previous chapter. The subsequent argument is specific to our Markovian framework and, in this sense, is weaker than the classical dynamic programming principle. However, the combination of the arguments of this chapter with those of the previous chapter allows one to derive a dynamic programming principle for mixed stochastic control and stopping problems.
The following claim will make use of the subset $\mathcal{T}_{[t,T]}^t$, introduced in (3.6), of all stopping times in $\mathcal{T}_{[t,T]}$ which are independent of $\mathcal{F}_t$, and the notations:
for all $(t,x) \in \bar S$. We recall that $V_*$ and $V^*$ are the lower and upper semicontinuous envelopes of $V$, and that $V_* = V^* = V$ whenever $V$ is continuous.
Theorem 3.3. Assume that $V$ is locally bounded. For $(t,x) \in S$, let $\theta \in \mathcal{T}_{[t,T]}^t$ be a stopping time such that $X_\theta^{t,x}$ is bounded. Then:
\[ V(t,x) \le \sup_{\tau \in \mathcal{T}_{[t,T]}^t} E\Big[ \mathbf{1}_{\{\tau < \theta\}}\, g\big(X_\tau^{t,x}\big) + \mathbf{1}_{\{\tau \ge \theta\}}\, V^*\big(\theta, X_\theta^{t,x}\big) \Big], \tag{3.9} \]
\[ V(t,x) \ge \sup_{\tau \in \mathcal{T}_{[t,T]}^t} E\Big[ \mathbf{1}_{\{\tau < \theta\}}\, g\big(X_\tau^{t,x}\big) + \mathbf{1}_{\{\tau \ge \theta\}}\, V_*\big(\theta, X_\theta^{t,x}\big) \Big]. \tag{3.10} \]
Proof. Inequality (3.9) follows immediately from the tower property and the fact that $J \le V^*$.
We next prove inequality (3.10) with $V_*$ replaced by an arbitrary continuous function $\varphi \le V$, which implies (3.10) by the same argument as in Step 3 of the proof of Theorem 2.3.
Arguing as in Step 2 of the proof of Theorem 2.3, we first observe that, for every $\varepsilon > 0$, we can find a countable family $\bar A_i \subset (t_i - r_i, t_i] \times A_i \subset S$, together with a sequence of stopping times $\tau^{i,\varepsilon}$ in $\mathcal{T}_{[t_i,T]}^{t_i}$, $i \ge 1$, satisfying $\bar A_0 = \{T\} \times \mathbb{R}^d$ and
\[ \bigcup_{i \ge 0} \bar A_i = S, \quad \bar A_i \cap \bar A_j = \emptyset \ \text{for } i \ne j \in \mathbb{N}, \quad \bar J(\cdot\,; \tau^{i,\varepsilon}) \ge \varphi - 3\varepsilon \ \text{on } \bar A_i \ \text{for } i \ge 1. \tag{3.11} \]
Set $\bar A^n := \bigcup_{i \le n} \bar A_i$, $n \ge 1$. Given two stopping times $\theta, \tau \in \mathcal{T}_{[t,T]}^t$, it is clear that
\[ \tau^{n,\varepsilon} := \tau\, \mathbf{1}_{\{\tau < \theta\}} + \mathbf{1}_{\{\tau \ge \theta\}} \Big( T\, \mathbf{1}_{(\bar A^n)^c}\big(\theta, X_\theta^{t,x}\big) + \sum_{i=1}^n \tau^{i,\varepsilon}\, \mathbf{1}_{\bar A_i}\big(\theta, X_\theta^{t,x}\big) \Big) \]
defines a stopping time in $\mathcal{T}_{[t,T]}^t$. We then deduce from the tower property and (3.11) that
\[ \bar V(t,x) \ge \bar J(t,x;\tau^{n,\varepsilon}) \ge E\Big[ \Big( g\big(X_\tau^{t,x}\big)\, \mathbf{1}_{\{\tau < \theta\}} + \mathbf{1}_{\{\tau \ge \theta\}} \big( \varphi\big(\theta, X_\theta^{t,x}\big) - 3\varepsilon \big) \Big)\, \mathbf{1}_{\bar A^n}\big(\theta, X_\theta^{t,x}\big) \Big] + E\Big[ \mathbf{1}_{\{\tau \ge \theta\}}\, g\big(X_T^{t,x}\big)\, \mathbf{1}_{(\bar A^n)^c}\big(\theta, X_\theta^{t,x}\big) \Big]. \]
We next apply Itô's formula, and observe that the expected value of the diffusion term vanishes because $(t, X_t)$ lies in the compact subset $[t_0, t_0+h] \times B$ for $t \in [t_0, \theta_h]$. Then:
\[ \frac1h\, E\Big[ \int_{t_0}^{\theta_h} (\partial_t + A) V\big(t, X_t^{t_0,x_0}\big)\,dt \Big] \le 0. \]
\[ V(t_0, x_0) > g(x_0) \quad \text{and} \quad -(\partial_t + A) V(t_0, x_0) > 0 \ \text{at some } (t_0, x_0) \in S, \tag{3.14} \]
Moreover:
Next, let
\[ \theta := \inf\big\{ t > t_0 : \big(t, X_t^{t_0,x_0}\big) \notin N_h \big\}. \]
For an arbitrary stopping rule $\tau \in \mathcal{T}_{[t,T]}^t$, we compute by Itô's formula that:
where the diffusion term has zero expectation because the process $(t, X_t^{t_0,x_0})$ is confined to the compact subset $N_h$ on the stochastic interval $[t_0, \tau \wedge \theta]$. Since $\mathcal{L}\varphi \le 0$ on $N_h$ by (3.15), this provides:
\[ E\big[ V(\tau \wedge \theta, X_{\tau \wedge \theta}) - V(t_0, x_0) \big] \le E\big[ (V - \varphi)(\tau \wedge \theta, X_{\tau \wedge \theta}) \big] \le -\gamma\, P[\tau \ge \theta], \]
by (3.16). Then, since $V \ge g + \delta$ on $N_h$ by (3.15):
\[ V(t_0, x_0) \ge \gamma\, P[\tau \ge \theta] + E\Big[ \big( g\big(X_\tau^{t_0,x_0}\big) + \delta \big)\, \mathbf{1}_{\{\tau < \theta\}} + V\big(\theta, X_\theta^{t_0,x_0}\big)\, \mathbf{1}_{\{\tau \ge \theta\}} \Big] \]
\[ \ge (\gamma \wedge \delta) + E\Big[ g\big(X_\tau^{t_0,x_0}\big)\, \mathbf{1}_{\{\tau < \theta\}} + V\big(\theta, X_\theta^{t_0,x_0}\big)\, \mathbf{1}_{\{\tau \ge \theta\}} \Big]. \]
By the arbitrariness of $\tau \in \mathcal{T}_{[t,T]}^t$, this provides the desired contradiction of (3.9). $\square$
Proof. (i) For $t \in [0,T]$ and $x, x' \in \mathbb{R}^n$, it follows from the Lipschitz property of $g$ that:
\[ |V(t,x) - V(t,x')| \le \text{Const} \sup_{\tau \in \mathcal{T}_{[t,T]}} E\big| X_\tau^{t,x} - X_\tau^{t,x'} \big| \le \text{Const}\, E\Big[ \sup_{t \le s \le T} \big| X_s^{t,x} - X_s^{t,x'} \big| \Big] \le \text{Const}\, |x - x'|, \]
where the last inequality follows from the fact that $V \ge g$. Using the Lipschitz property of $g$, this provides:
\[ 0 \le V(t,x) - E\big[ V\big(t', X_{t'}^{t,x}\big) \big] \le \text{Const}\, E\Big[ \sup_{t \le s \le t'} \big| X_s^{t,x} - X_{t'}^{t,x} \big| \Big] \le \text{Const}\,(1 + |x|)\sqrt{t' - t} \]
by the flow continuity result of Theorem 1.4. Using this estimate together with the Lipschitz property proved in (i) above, this provides:
\[ |V(t,x) - V(t',x)| \le \Big| V(t,x) - E\big[ V\big(t', X_{t'}^{t,x}\big) \big] \Big| + \Big| E\big[ V\big(t', X_{t'}^{t,x}\big) \big] - V(t',x) \Big| \]
\[ \le \text{Const}\Big( (1+|x|)\sqrt{t'-t} + E\big| X_{t'}^{t,x} - x \big| \Big) \le \text{Const}\,(1+|x|)\sqrt{t'-t}, \]
which means that there is no subinterval of $\mathbb{R}$ from which the process $X$ cannot exit.
We consider the infinite horizon optimal stopping problem:
\[ V(x) := \sup_{\tau \in \mathcal{T}} E\big[ e^{-\beta\tau}\, g\big(X_\tau^{0,x}\big)\, \mathbf{1}_{\{\tau < \infty\}} \big], \tag{3.20} \]
\[ \min\{ \beta v - Av,\ v - g \} = 0, \quad \text{where} \quad Av := \mu v' + \frac12 \sigma^2 v''. \tag{3.21} \]
The ordinary differential equation
\[ Av - \beta v = 0 \tag{3.22} \]
Clearly $\psi$ and $\phi$ are uniquely determined up to a positive constant, and all other solutions of (3.22) can be expressed as a linear combination of $\psi$ and $\phi$.
The following result follows from an immediate application of Itô's formula.
We now show that the value function $V$ is concave up to some change of variable, and provide conditions under which $V$ is $C^1$ across the exercise boundary, i.e. the boundary between the exercise and the continuation regions. For the next result, we observe that the function $\psi/\phi$ is continuous and strictly increasing by (3.23), and therefore invertible.
 0, consider">
For $\varepsilon > 0$, consider the $\varepsilon$-optimal stopping rules $\tau_1, \tau_2 \in \mathcal{T}$ for the problems $V(b_1)$ and $V(b_2)$:
\[ E\big[ e^{-\beta\tau_i}\, g\big(X_{\tau_i}^{0,b_i}\big) \big] \ge V(b_i) - \varepsilon \quad \text{for } i = 1, 2, \]
where $\theta$ denotes the shift operator on the canonical space, i.e. $\theta_t(\omega)(s) = \omega(t+s)$. In words, the stopping rule $\tau^\varepsilon$ uses the $\varepsilon$-optimal stopping rule $\tau_1$ if the level $b_1$ is reached before the level $b_2$, and the $\varepsilon$-optimal stopping rule $\tau_2$ otherwise. Then, it follows from the strong Markov property that
\[ V(x) \ge E\big[ e^{-\beta\tau^\varepsilon}\, g\big(X_{\tau^\varepsilon}^{0,x}\big) \big] = E\Big[ e^{-\beta H_{b_1}^x}\, E\big[ e^{-\beta\tau_1} g\big(X_{\tau_1}^{0,b_1}\big) \big]\, \mathbf{1}_{\{H_{b_1}^x < H_{b_2}^x\}} \Big] + E\Big[ e^{-\beta H_{b_2}^x}\, E\big[ e^{-\beta\tau_2} g\big(X_{\tau_2}^{0,b_2}\big) \big]\, \mathbf{1}_{\{H_{b_2}^x < H_{b_1}^x\}} \Big] \]
\[ \ge \big( V(b_1) - \varepsilon \big)\, E\Big[ e^{-\beta H_{b_1}^x}\, \mathbf{1}_{\{H_{b_1}^x < H_{b_2}^x\}} \Big] + \big( V(b_2) - \varepsilon \big)\, E\Big[ e^{-\beta H_{b_2}^x}\, \mathbf{1}_{\{H_{b_2}^x < H_{b_1}^x\}} \Big]. \]
 0,">
for all $\varepsilon > 0$, $\delta > 0$. Multiplying by $\big( (\psi/\phi)(x_0 + \varepsilon) - (\psi/\phi)(x_0) \big)/\varepsilon$, this implies that:
\[ \frac{\frac{g}{\phi}(x_0+\varepsilon) - \frac{g}{\phi}(x_0)}{\varepsilon} \le \frac{\frac{V}{\phi}(x_0+\varepsilon) - \frac{V}{\phi}(x_0)}{\varepsilon} + \frac{\delta^+(\varepsilon)}{\delta^-(\delta)}\, \frac{\frac{g}{\phi}(x_0) - \frac{g}{\phi}(x_0-\delta)}{\delta}, \tag{3.26} \]
where
\[ \delta^+(\varepsilon) := \frac{\frac{\psi}{\phi}(x_0+\varepsilon) - \frac{\psi}{\phi}(x_0)}{\varepsilon} \quad\text{and}\quad \delta^-(\delta) := \frac{\frac{\psi}{\phi}(x_0) - \frac{\psi}{\phi}(x_0-\delta)}{\delta}. \]
We next consider two cases:
• If $(\psi/\phi)'(x_0) \ne 0$, then we may take $\varepsilon = \delta$ and send $\varepsilon \searrow 0$ in (3.26) to obtain:
\[ \frac{d^+ (V/\phi)}{dx}(x_0) = \Big(\frac{g}{\phi}\Big)'(x_0). \tag{3.27} \]
• If $(\psi/\phi)'(x_0) = 0$, then we use the fact that for every sequence $\varepsilon_n \searrow 0$, there is a subsequence $\varepsilon_{n_k} \searrow 0$ and $\delta_k \searrow 0$ such that $\delta^+(\varepsilon_{n_k}) = \delta^-(\delta_k)$. Then (3.26) reduces to:
\[ \frac{\frac{g}{\phi}(x_0+\varepsilon_{n_k}) - \frac{g}{\phi}(x_0)}{\varepsilon_{n_k}} \le \frac{\frac{V}{\phi}(x_0+\varepsilon_{n_k}) - \frac{V}{\phi}(x_0)}{\varepsilon_{n_k}} + \frac{\frac{g}{\phi}(x_0) - \frac{g}{\phi}(x_0-\delta_k)}{\delta_k}, \]
and therefore
\[ \frac{\frac{V}{\phi}(x_0+\varepsilon_{n_k}) - \frac{V}{\phi}(x_0)}{\varepsilon_{n_k}} \longrightarrow \Big(\frac{g}{\phi}\Big)'(x_0). \]
Let us assume that $V \in C^{1,2}(S)$, and work towards a contradiction. We first observe by the homogeneity of the problem that $V(t,x) = V(x)$ is independent of $t$. Moreover, it follows from Theorem 3.4 that $V$ is concave in $x$ and $V \ge g$. Then
\[ V \ge g^{\mathrm{conc}}, \tag{3.28} \]
with
\[ \beta^\nu(t,s) := e^{-\int_t^s k(r, X_r^{t,x,\nu}, \nu_r)\,dr}, \]
\[ H(t,x,r,p,\gamma) := \sup_{u \in U} \Big\{ -k(t,x,u)\,r + b(t,x,u) \cdot p + \frac12 \mathrm{Tr}\big[\sigma\sigma^{\mathrm T}(t,x,u)\,\gamma\big] + f(t,x,u) \Big\}, \]
where $b$ and $\sigma$ satisfy the conditions (2.1)-(2.2), and the coefficients $f$ and $k$ are measurable. From the results of the previous section, the dynamic programming equation corresponding to the stochastic control problem (4.1) is:
\[ -\partial_t v(t,x) - H\big(t, x, v(t,x), Dv(t,x), D^2 v(t,x)\big) = 0. \]
The proof of the subsequent result will make use of the following linear second order operator:
\[ \mathcal{L}^u \varphi(t,x) := -k(t,x,u)\,\varphi(t,x) + b(t,x,u) \cdot D\varphi(t,x) + \frac12 \mathrm{Tr}\big[\sigma\sigma^{\mathrm T}(t,x,u)\, D^2\varphi(t,x)\big], \]
for every $t \le s$ and smooth function $\varphi \in C^{1,2}([t,s], \mathbb{R}^d)$ and each admissible control process $\nu \in \mathcal{U}_0$. The last expression is an immediate application of Itô's formula.
Theorem 4.1. Let $T < \infty$, and $v \in C^{1,2}([0,T), \mathbb{R}^d) \cap C([0,T] \times \mathbb{R}^d)$. Assume that $\|k^-\|_\infty < \infty$ and $v$ and $f$ have quadratic growth, i.e. there is a constant $C$ such that
\[ \le C e^{T\|k^-\|_\infty} \Big( 1 + |X_{\theta_n}|^2 + T + \int_t^T |X_s|^2\,ds \Big) \le C e^{T\|k^-\|_\infty} (1+T) \Big( 1 + \sup_{t \le s \le T} |X_s|^2 + \int_t^T |\nu_s|^2\,ds \Big) \in L^1, \]
by the estimate (2.5) of Theorem 2.1, it follows from the dominated convergence theorem that
\[ v(t,x) \ge E\Big[ \beta(t,T)\, v(T, X_T) + \int_t^T \beta(t,r) f(r, X_r, \nu_r)\,dr \Big] \ge E\Big[ \beta(t,T)\, g(X_T) + \int_t^T \beta(t,r) f(r, X_r, \nu_r)\,dr \Big], \]
where the last inequality uses the condition $v(T,\cdot) \ge g$. Since the control $\nu \in \mathcal{U}_0$ is arbitrary, this completes the proof of (i).
Statement (ii) is proved by repeating the above argument and observing that the control $\hat\nu$ achieves equality at the crucial step (4.3). $\square$
Remark 4.2. When U is reduced to a singleton, the optimization problem V is
degenerate. In this case, the DPE is linear, and the verification theorem reduces
to the so-called Feynman-Kac formula.
Notice that the verification theorem assumes the existence of such a solution,
and is by no means an existence result. However, it provides uniqueness in the
class of functions with quadratic growth.
We now state without proof an existence result for the DPE together with the terminal condition $V(T,\cdot) = g$ (see [8] and the references therein). The main assumption is the so-called uniform parabolicity condition:
\[ \text{there is a constant } c > 0 \text{ such that } \xi' \sigma\sigma^{\mathrm T}(t,x,u)\,\xi \ge c|\xi|^2 \ \text{for all } (t,x,u) \in [0,T] \times \mathbb{R}^n \times U. \tag{4.4} \]
In the following statement, we denote by $C_b^k(\mathbb{R}^n)$ the space of bounded functions whose partial derivatives of orders $\le k$ exist and are bounded continuous. We similarly denote by $C_b^{p,k}([0,T], \mathbb{R}^n)$ the space of bounded functions whose partial derivatives with respect to $t$, of orders $\le p$, and with respect to $x$, of orders $\le k$, exist and are bounded continuous.
Theorem 4.3. Let Condition (4.4) hold, and assume further that:
• $U$ is compact;
• $b$, $\sigma$ and $f$ are in $C_b^{1,2}([0,T], \mathbb{R}^n)$;
• $g \in C_b^3(\mathbb{R}^n)$.
Then the DPE (2.18) with the terminal data $V(T,\cdot) = g$ has a unique solution $V \in C_b^{1,2}([0,T] \times \mathbb{R}^n)$.
The remaining wealth $(X_t - \pi_t)$ is invested in the non-risky asset. Therefore, the liquidation value of a self-financing strategy satisfies
\[ dX_t^\pi = \frac{\pi_t}{S_t}\,dS_t + \frac{X_t^\pi - \pi_t}{S_t^0}\,dS_t^0 = \big( r X_t + (\mu - r)\pi_t \big)\,dt + \sigma \pi_t\,dW_t. \tag{4.5} \]
\[ U(x) := x^\gamma \quad \text{for } x \ge 0. \]
\[ \frac{\partial w}{\partial t}(t,x) + \sup_{u \in \mathbb{R}} \mathcal{A}^u w(t,x) = 0, \tag{4.6} \]
\[ \mathcal{A}^u w(t,x) := \big( rx + (\mu - r)u \big)\, \frac{\partial w}{\partial x}(t,x) + \frac12 \sigma^2 u^2\, \frac{\partial^2 w}{\partial x^2}(t,x). \]
We next search for a solution of the dynamic programming equation of the form $v(t,x) = x^\gamma h(t)$. Plugging this form of solution into the PDE (4.6), we get the following ordinary differential equation on $h$:
\[ 0 = h' + \gamma h \sup_{u \in \mathbb{R}} \Big\{ r + (\mu - r)\frac{u}{x} + \frac12 (\gamma - 1)\sigma^2 \frac{u^2}{x^2} \Big\} \tag{4.7} \]
\[ = h' + \gamma h \sup_{\delta \in \mathbb{R}} \Big\{ r + (\mu - r)\delta + \frac12 (\gamma - 1)\sigma^2 \delta^2 \Big\} \tag{4.8} \]
\[ = h' + \gamma h \Big( r + \frac{(\mu - r)^2}{2(1-\gamma)\sigma^2} \Big). \tag{4.9} \]
Since $v(T,\cdot) = U$, we seek a function $h$ satisfying the above ordinary differential equation together with the boundary condition $h(T) = 1$. This induces the unique candidate:
\[ h(t) := e^{a(T-t)} \quad \text{with} \quad a := \gamma\Big( r + \frac{(\mu - r)^2}{2(1-\gamma)\sigma^2} \Big). \]
Hence, the function $(t,x) \mapsto x^\gamma h(t)$ is a classical solution of the HJB equation (4.6). It is easily checked that the conditions of Theorem 4.1 are all satisfied in this context. Then $V(t,x) = x^\gamma h(t)$, and the optimal portfolio allocation policy is given by the linear control process:
\[ \hat\pi_t = \frac{\mu - r}{\sigma^2 (1-\gamma)}\, X_t^{\hat\pi}. \]
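The closed form lends itself to a quick numerical sanity check: compute $V(0,x) = x^\gamma h(0)$, then verify by simulation that the constant-proportion policy $\hat\pi$ yields a larger expected utility than a perturbed one. All market parameters below are illustrative assumptions:

```python
import numpy as np

# Merton solution: optimal fraction of wealth and value function, plus a
# simulation comparing the optimal policy with a suboptimal perturbation.
mu, r, sigma, gamma, T, x0 = 0.08, 0.03, 0.25, 0.5, 1.0, 1.0
frac = (mu - r) / (sigma**2 * (1 - gamma))          # pi_hat_t = frac * X_t
a = gamma * (r + (mu - r)**2 / (2 * (1 - gamma) * sigma**2))
print("V(0, x0) =", x0**gamma * np.exp(a * T))      # = x0^gamma * h(0)

def expected_utility(pi_frac, n=200_000, m=100, seed=4):
    rng = np.random.default_rng(seed)
    dt, X = T / m, np.full(n, x0)
    for _ in range(m):
        dW = rng.normal(0.0, np.sqrt(dt), n)
        X *= 1 + (r + (mu - r) * pi_frac) * dt + sigma * pi_frac * dW
    return np.mean(np.abs(X) ** gamma * np.sign(X))

print("E[U(X_T)] at the optimum :", expected_utility(frac))
print("E[U(X_T)] off the optimum:", expected_utility(0.5 * frac))
```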
where $\mu := [1 - 2\beta(T-t)]^{-1/2}$. Observe that (4.10) holds, and
\[ y D_{yz}^2 v(t,y,z) = 8\mu^2 \beta^3 (T-t)\, v(t,y,z)\, y^2 \ge 0. \tag{4.11} \]
The second equality follows from the fact that $\{v(t, Y_t^1, Z_t^1),\ t \le T\}$ is a martingale. As for the first equality, we see from (4.10) and (4.11) that $1$ is a maximizer of both functions $\beta \mapsto \beta^2 D_{yy}^2 v(t,y,z)$ and $\beta \mapsto \beta y D_{yz}^2 v(t,y,z)$ on $[-1,1]$.
3. Let $b$ be some given predictable process valued in $[-1,1]$, and define the sequence of stopping times
We are now able to prove the law of the iterated logarithm for double stochastic integrals by a direct adaptation of the case of the Brownian motion. Set
\[ h(t) := 2t \log\log\frac1t \quad \text{for } t > 0. \]
\[ 2X_t^b(\omega) < (1+\eta)^2\, h(\theta^k) \le (1+\eta)^2\, \frac{h(t)}{\theta}. \]
Hence,
\[ \limsup_{t \searrow 0} \frac{2X_t^b}{h(t)} \le \frac{(1+\eta)^2}{\theta} \quad \text{a.s.,} \]
and the required result follows by letting $\theta$ tend to $1$ and $\eta$ to $0$ along the rationals. $\square$
where $b$ and $\sigma$ satisfy the usual Lipschitz and linear growth conditions. Given the functions $k, f : [0,T] \times \mathbb{R}^d \to \mathbb{R}$ and $g : \mathbb{R}^d \to \mathbb{R}$, we consider the optimal stopping problem
\[ V(t,x) := \sup_{\tau \in \mathcal{T}_{[t,T]}^t} E\Big[ \int_t^\tau \beta(t,s) f\big(s, X_s^{t,x}\big)\,ds + \beta(t,\tau)\, g\big(X_\tau^{t,x}\big) \Big], \tag{4.15} \]
Before stating the main result of this section, we observe that for many interesting examples, it is known that the value function $V$ does not satisfy the $C^{1,2}$ regularity which we have been using so far for the application of Itô's formula. Therefore, in order to state a result which can be applied to a wider class of problems, we shall enlarge in the following remark the set of functions for which Itô's formula still holds true.
Remark 4.6. Let $v$ be a function in the Sobolev space $W^{1,2}(S)$. By definition, for such a function $v$, there is a sequence of functions $(v^n)_{n \ge 1} \subset C^{1,2}(S)$ such that $v^n \to v$ uniformly on compact subsets of $S$, and
Then, Itô's formula holds true for $v^n$ for all $n \ge 1$, and is inherited by $v$ by sending $n \to \infty$.
Theorem 4.7. Let $T < \infty$ and $v \in W^{1,2}([0,T), \mathbb{R}^d)$. Assume further that $v$ and $f$ have quadratic growth. Then:
(i) If $v$ is a supersolution of (4.16), then $v \ge V$.
(ii) If $v$ is a solution of (4.16), then $v = V$ and
Notice that $\tau_n \to \tau$ a.s. Then, since $f$ and $v$ have quadratic growth, we may pass to the limit $n \to \infty$ invoking the dominated convergence theorem, and we get:
\[ v(t,x) \ge E\Big[ \beta(t,T)\, v\big(T, X_T^{t,x}\big) + \int_t^T \beta(t,s) f\big(s, X_s^{t,x}\big)\,ds \Big]. \]
Since $v(T,\cdot) \ge g$ by the supersolution property, this concludes the proof of (i).
(ii) Let $\tau_t^*$ be the stopping time introduced in the theorem. Then, since $v(T,\cdot) = g$, it follows that $\tau_t^* \in \mathcal{T}_{[t,T]}^t$. Set
Since $\tau_t^n \to \tau_t^*$ a.s. and $f, v$ have quadratic growth, we may pass to the limit $n \to \infty$ invoking the dominated convergence theorem. This leads to:
\[ v(t,x) = E\Big[ \beta(t,T)\, v\big(T, X_T^{t,x}\big) + \int_t^T \beta(t,s) f\big(s, X_s^{t,x}\big)\,ds \Big], \]
and the required result follows from the fact that $v(T,\cdot) = g$. $\square$
In view of this time independence, it follows that the dynamic programming equation corresponding to this problem is:
\[ \min\Big\{ v - (K-s)^+,\ rv - rs\,Dv - \frac12 \sigma^2 s^2 D^2 v \Big\} = 0. \tag{4.18} \]
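For this perpetual American put equation, smooth fit gives an explicit solution, which the following sketch implements. The formula $s^* = 2rK/(2r+\sigma^2)$ is derived here under the stated coefficients by standard perpetual-put reasoning; it is not quoted from the text, and the parameters are illustrative:

```python
import numpy as np

# Perpetual American put: on the continuation region the ODE
#   r v - r s v' - 0.5 sigma^2 s^2 v'' = 0
# has the bounded solution c * s^(-p) with p = 2r / sigma^2. Value matching and
# smooth fit at the free boundary s* give s* = p K / (1 + p) = 2 r K / (2r + sigma^2).
r, sigma, K = 0.05, 0.3, 100.0
p = 2 * r / sigma**2
s_star = p * K / (1 + p)
c = (K - s_star) * s_star**p

def v(s):
    s = np.asarray(s, dtype=float)
    return np.where(s <= s_star, K - s, c * s ** (-p))

print("exercise boundary s* =", s_star, "  v(K) =", float(v(K)))
```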