RoughPaths 2
June 2014
Springer
To Waltraud and Rudolf Friz
and
To Xue-Mei
Preface
Programme (FP7/2007-2013) / ERC grant agreement nr. 258237 and DFG, SPP 1324.
MH was supported by the Leverhulme trust through a leadership award and by the
Royal Society through a Wolfson research award.
1 Introduction
1.1 Controlled differential equations
1.2 Analogies with other branches of mathematics
1.3 Regularity structures
1.4 Frequently used notations
1.5 Rough path theory works in infinite dimensions
4.6 Exercises
4.7 Comments
References
Index
Chapter 1
Introduction
Abstract We give a short overview of the scopes of both the theory of rough paths
and the theory of regularity structures. The main ideas are introduced and we point
out some analogies with other branches of mathematics.
where the (ξi ) are i.i.d. standard Gaussian random variables. Based on martingale
theory, Itô’s stochastic differential equations (SDEs) have provided a rigorous and
extremely useful mathematical framework for all this. And yet, stability is lost in the
passage to continuous time: while it is trivial to solve (1.2) for a fixed realisation of
$\xi_i(\omega)$, after all $(\xi_1, \dots, \xi_T; Y_0) \mapsto Y_i$ is surely a continuous map, the continuity of
the solution as a function of the driving noise is lost in the limit.
Taking Ẋ = ξ to be white noise in time (which amounts to saying that X is a
Brownian motion, say B), the solution map S : B ↦ Y to (1.1), known as the Itô map,
is a measurable map which in general lacks continuity, whatever norm one uses to
equip the space of realisations of B.¹ Actually, one can show the following negative
result (see [Lyo91, LCL07] as well as Exercise 5.21 below):
Proposition 1.1. There exists no separable Banach space $\mathcal{B} \subset C([0, 1])$ with the
following properties:
1. Sample paths of Brownian motion lie in $\mathcal{B}$ almost surely.
2. The map $(f, g) \mapsto \int_0^{\cdot} f(t)\, \dot g(t)\, dt$ defined on smooth functions extends to a continuous
map from $\mathcal{B} \times \mathcal{B}$ into the space of continuous functions on [0, 1].
Since, for any two distinct indices $i$ and $j$, the map
\[
B \mapsto \int_0^{\cdot} B^i(t)\, \dot B^j(t)\, dt \tag{1.3}
\]
is itself the solution of one of the simplest possible differential equations driven by
$B$ (take $Y \in \mathbb{R}^2$ solving $\dot Y^1 = \dot B^i$ and $\dot Y^2 = Y^1 \dot B^j$), this shows that it takes very
little for $S$ to lack continuity. In this sense, solving SDEs is an analytically ill-posed
task! On the other hand, there are well-known probabilistic well-posedness results
for SDEs of the form²
Theorem 1.2. Let $\xi_\varepsilon = \delta_\varepsilon * \xi$ denote the regularisation of white noise in time with a
compactly supported smooth mollifier $\delta_\varepsilon$. Denote by $Y^\varepsilon$ the solutions to (1.1) driven
by $\dot X = \xi_\varepsilon$. Then $Y^\varepsilon$ converges in probability (uniformly on compact sets). The
limiting process does not depend on the choice of mollifier $\delta_\varepsilon$, and in fact is the
Stratonovich solution to (1.4).
There are many variations on such “Wong–Zakai” results, another popular choice
being $\xi_\varepsilon = \dot B^{(\varepsilon)}$ where $B^{(\varepsilon)}$ is a piecewise linear approximation (of mesh size
∼ ε) to Brownian motion. However, as a consequence of the aforementioned lack of
continuity of the Itô map, there are also reasonable approximations to white noise for
which the above convergence fails. (We shall see an explicit example in Section 3.4.)
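The lack of continuity of the iterated integral (1.3) in the uniform topology can already be observed on smooth paths. The following is a minimal numerical sketch using a standard example that is not taken from the text above: the planar paths $x_n(t) = n^{-1}(\cos(2\pi n^2 t), \sin(2\pi n^2 t))$ converge uniformly to a constant path, yet $\int_0^1 x_n^1\, dx_n^2 = \pi$ for every n. The function names and grid size are illustrative choices.

```python
import numpy as np

def oscillating_path(n, num_points=200_001):
    """Sample x_n(t) = (cos, sin)(2*pi*n^2*t) / n on a uniform grid of [0, 1]."""
    t = np.linspace(0.0, 1.0, num_points)
    phase = 2.0 * np.pi * n**2 * t
    return np.cos(phase) / n, np.sin(phase) / n

def iterated_integral(f, g):
    """Approximate int_0^1 f dg by a trapezoidal Riemann-Stieltjes sum."""
    return float(np.sum(0.5 * (f[:-1] + f[1:]) * np.diff(g)))

results = {}
for n in (1, 5, 10):
    x1, x2 = oscillating_path(n)
    sup_norm = float(np.max(np.hypot(x1, x2)))   # equals 1/n
    results[n] = (sup_norm, iterated_integral(x1, x2))
    print(f"n={n:2d}  sup|x_n|={sup_norm:.3f}  int x1 dx2={results[n][1]:.6f}")
```

The supremum norm shrinks like 1/n while the iterated integral stays near π, whereas the limiting constant path has iterated integral 0.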
Perhaps rather surprisingly, it turns out that well-posedness is restored via the
iterated integrals (1.3), which are in fact the only data missing to turn S into
a continuous map. The role of (1.3) was already appreciated in [INY78, Thm 4.1]
and related works in the seventies, but statements at the time were probabilistic
in nature, such as Theorem 1.2 above. Rough path analysis, introduced by Terry
Lyons in the seminal article [Lyo98] and by now exposed in several monographs
[LQ02, LCL07, FV10b], provides the following remarkable insight: Itô’s solution
map can be factorised into a measurable “universal” map Ψ and a “nice” solution
map Ŝ as
¹ This lack of regularity is the raison d’être for Malliavin calculus, a Sobolev type theory of C([0, T])
equipped with Wiener measure, the law of Brownian motion.
² For the purpose of this introduction, all coefficients are assumed to be sufficiently nice.
\[
B(\omega) \overset{\Psi}{\longmapsto} (B, \mathbb{B})(\omega) \overset{\hat S}{\longmapsto} Y(\omega). \tag{1.5}
\]
The map Ψ is universal in the sense that it depends neither on the initial condition, nor
on the vector fields driving the stochastic differential equation, but merely consists
of enhancing Brownian motion with iterated integrals of the form
\[
\mathbb{B}^{i,j}(s, t) = \int_s^t \bigl( B^i(r) - B^i(s) \bigr)\, dB^j(r). \tag{1.6}
\]
At this stage, the choice of stochastic integration in (1.6) (e.g. Itô or Stratonovich)
does matter and probabilistic techniques are required for the construction of Ψ .
Indeed, the map Ψ is only measurable and usually requires the use of some sort
of stochastic integration theory (or some equivalent construction, see for example
Section 10 below for a general construction in a Gaussian, non-semimartingale
context).
The solution map Ŝ, on the other hand, is the solution map to a rough differential
equation (RDE), also known as the Itô–Lyons map and discussed in Section 8.1. It is
purely deterministic and only makes use of analytical constructions. More precisely,
it allows input signals to be arbitrary rough paths which, as discussed in Chapter 2,
are objects (thought of as enhanced paths) of the form $(X, \mathbb{X})$, defined via certain
algebraic properties (which mimic the interplay between a path and its iterated
integrals) and certain analytical, Hölder-type regularity conditions. In Chapter 3 these
conditions will be seen to hold true a.s. for $(B, \mathbb{B})$; a typical realisation is thus called
a Brownian rough path.
The Itô–Lyons map turns out, cf. Section 8.6, to be “nice” in the sense that it is a
continuous map of both its initial condition and the driving noise $(X, \mathbb{X})$, provided
that the dependency on the latter is measured in a suitable “rough path” metric. In
other words, rough path analysis allows for a pathwise solution theory for SDEs, i.e.
for a fixed realisation of the Brownian rough path. The solution map Ŝ is however
a much richer object than the original Itô map, since its construction is completely
independent of the choice of stochastic integral and even of the knowledge that the
driving path is Brownian. For example, if we denote by $\Psi^I$ (resp. $\Psi^S$) the maps
$B \mapsto (B, \mathbb{B})$ obtained by Itô (resp. Stratonovich) integration, then we have the almost
sure identities
\[
S^I = \hat S \circ \Psi^I, \qquad S^S = \hat S \circ \Psi^S,
\]
where $S^I$ (resp. $S^S$) denotes the solution to (1.4) interpreted in the Itô (resp.
Stratonovich) sense. Returning to Theorem 1.2, we see that the convergence there
is really a deterministic consequence of the probabilistic question whether or not
$\Psi^S(B^\varepsilon) \to \Psi^S(B)$ in probability and rough path topology, with $\dot B^\varepsilon = \xi_\varepsilon$. This
can be shown to hold in the case of mollifier, piecewise linear, and many other
approximations.
So how is this Itô–Lyons map Ŝ built? In order to solve (1.1), we need to be able
to make sense of the expression
\[
\int_0^t f(Y_s)\, dX_s, \tag{1.7}
\]
where Y is itself the as yet unknown solution. Here is where the usual pathwise
approach breaks down: as we have seen in Proposition 1.1 it is in general impossible,
even in the simplest cases, to find a Banach space of functions containing Brownian
sample paths and in which (1.7) makes sense. Actually, if we measure regularity
in terms of Hölder exponents, then (1.7) makes sense as a limit of Riemann sums
for X and Y that are arbitrary α-Hölder continuous functions if and only if α > 1/2.
The keyword here is arbitrary: in our case the function Y is anything but arbitrary!
Actually, since the function Y solves (1.1), one would expect the small-scale fluctuations
of Y to look exactly like the small-scale fluctuations of X in the sense that one
would expect that
\[
Y_{s,t} = f(Y_s)\, X_{s,t} + R_{s,t}
\]
where, for any path $F$ with values in a linear space, we set $F_{s,t} = F_t - F_s$, and
where $R_{s,t}$ is some remainder that one would expect to be “of higher order”.
Suppose now that X is a “rough path”, which is to say that it has been “enhanced”
with a two-parameter function $\mathbb{X}$ which should be interpreted as giving the values for
\[
\mathbb{X}^{i,j}(s, t) = \int_s^t X^i_{s,r}\, dX^j_r. \tag{1.8}
\]
Note here that this identity should be read in the reverse order from what one may be
used to: it is the right hand side that is defined by the left hand side and not the other
way around! The idea here is that if X is too rough, then we do not a priori know
how to define the integral of X against itself, so we simply postulate its values. Of
course, $\mathbb{X}$ cannot just be anything, but should satisfy a number of natural algebraic
identities and analytical bounds, see Chapter 2 below.
Anyway, assuming that we are provided with the data $(X, \mathbb{X})$, then we know how
to give meaning to the integral of components of X against other components of X:
this is precisely what $\mathbb{X}$ encodes. Intuitively, this suggests that if we similarly encode
the fact that Y “looks like X at small scales”, then one should be able to extend
the definition of (1.7) to a large enough class of integrands to include solutions to
(1.1), even when α < 1/2. One of the achievements of rough path theory is to make
this intuition precise. Indeed, in the framework of rough integration sketched here
and made precise in Chapter 4, the barrier α = 1/2 can be lowered to α = 1/3. In
principle, this can be lowered further by further enhancing X with iterated integrals
of higher order, but we decided to focus on the first non-trivial case for the sake of
simplicity and because it already covers the most important case when X is given
by a Brownian motion, or a stochastic process with properties similar to those of
Brownian motion. We do however indicate very briefly in Sections 2.4, 4.5 and 7.6
how the theory can be modified to cover the case α ≤ 1/3, at least in the “geometric”
case when X is a limit of smooth paths.
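To illustrate how the second-order data enters, here is a hedged numerical sketch for a scalar path. It assumes the compensated Riemann sum $\sum f(X_s) X_{s,t} + f'(X_s) \mathbb{X}_{s,t}$, which is the usual form of rough integration but is not written out in the passage above, together with the “geometric” choice $\mathbb{X}_{s,t} = \tfrac12 X_{s,t}^2$. The path is a simulated Brownian-type random walk; the names `plain_sum` and `compensated_sum` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_steps = 100_000
increments = rng.normal(scale=np.sqrt(1.0 / n_steps), size=n_steps)
path = np.concatenate(([0.0], np.cumsum(increments)))   # X_0 = 0
left = path[:-1]                                         # left endpoints X_s

# Integrand f = sin, its derivative, and an antiderivative F with F' = f.
f, f_prime = np.sin, np.cos
F = lambda x: -np.cos(x)

# Assumed second-order data for a scalar geometric rough path: XX_{s,t} = X_{s,t}^2 / 2.
second_level = 0.5 * increments**2

plain_sum = float(np.sum(f(left) * increments))
compensated_sum = plain_sum + float(np.sum(f_prime(left) * second_level))
exact = float(F(path[-1]) - F(path[0]))   # what first order calculus predicts

print("plain left-point sum :", plain_sum)
print("compensated sum      :", compensated_sum)
print("F(X_1) - F(X_0)      :", exact)
```

With this choice of second-order term the compensated sum tracks the chain-rule value, while the plain left-point sum differs by a term of the size of the quadratic variation.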
The simplest way for Y to “look like X” is when Y = G(X) for some sufficiently
regular function G. Despite what one might guess, it turns out that this particular
Theorem 1.3. Classes of SPDEs of the form $du = F[u]\, dt + H[u] \circ dB$, with
second and first order differential operators F and H, respectively, and driven
by finite-dimensional noise, with the Zakai equation from filtering and stochastic
Hamilton–Jacobi–Bellman (HJB) equations as examples, can be solved pathwise, i.e.
for a fixed realisation of the Brownian rough path. As in the SDE case, the SPDE
solution map factorises as $S^S = \hat S \circ \Psi^S$ where Ŝ, the solution map to a rough partial
differential equation (RPDE), is continuous in the rough path topology.
As we have just seen, the main idea of the theory of rough paths is to “enhance”
a path X with some additional data $\mathbb{X}$, namely the integral of X against itself, in
order to restore continuity of the Itô map. The general idea of building a larger
object containing additional information in order to restore the continuity of some
nonlinear transformation is of course very old and there are several other theories
that have a similar “flavour” to the theory of rough paths, one of them being the
theory of Young measures (see for example the notes [Bal00]) where the value of
a function is replaced by a probability measure, thus allowing one to describe limits of
highly oscillatory functions.
Nevertheless, when first confronted with some of the notions just outlined, the
first reaction of the reader might be that simply postulating the values for the right
hand side of (1.8) makes no sense. Indeed, if X is smooth, then we “know” that there
is only one “reasonable” choice for the integral $\mathbb{X}$ of X against itself, and this is the
Riemann integral. How could this be replaced by something else and how can one
expect to still get a consistent theory with a natural interpretation? These questions
will of course be fully answered in these notes.
For the moment, let us draw an analogy with a very well established branch of
geometric measure theory, namely the theory of varifolds [Alm66, LY02].
Varifolds arise as natural extensions of submanifolds in the context of certain
variational problems. We are not going into details here, but loosely speaking a
k-dimensional varifold in $\mathbb{R}^n$ is a (Radon) measure v on $\mathbb{R}^n \times G(k, n)$, where
G(k, n) denotes the space of all k-dimensional subspaces of $\mathbb{R}^n$. Here, one should
interpret G(k, n) as the space of all possible tangent spaces at any given point for
a k-dimensional submanifold of $\mathbb{R}^n$. The projection of v onto $\mathbb{R}^n$ should then be
interpreted as a generalisation of the natural “surface measure” of a submanifold,
while the conditional (probability) measure on G(k, n) induced at almost every point
by disintegration should be interpreted as selecting a (possibly random) tangent
space at each point. Why is this a reasonable extension of the notion of submanifold?
[Figure: a sequence of submanifolds $M_\varepsilon$ converging to the limit $M$ as ε → 0.]
It is intuitively clear that, as ε → 0, this converges to a circle, but the right half has
twice as much “weight” as the left half so that, if we were to describe the limit M
simply as a manifold, we would have lost some information about the convergence of
the surface measures in the process. More dramatically, there are situations where one
has a sequence of smooth manifolds such that the limit is again a smooth manifold,
but with a limiting “tangent space” which has nothing to do with the actual tangent
space of the limit! Indeed, consider the sequence of one-dimensional submanifolds
of R2 given by
[Figure: a sequence of one-dimensional submanifolds of $\mathbb{R}^2$, with label $\varepsilon^2$.]
This time, the limit is a piece of straight line, which is in principle a perfectly nice
smooth submanifold, but the limiting tangent space is deterministic and makes a 45°
angle with the canonical tangent space associated to the limit.
The situation here is philosophically very similar to that of the theory of rough
paths: a subset $M \subset \mathbb{R}^n$ may be sufficiently “rough” so that there is no way of
canonically associating to it either a k-dimensional Riemannian volume element,
or a k-dimensional tangent space, so we simply postulate them. The two examples
given above show that even in situations where M is a nice smooth manifold, it
still makes sense to associate to it a volume element and / or tangent space that are
different from the ones that one would construct canonically. A similar situation
arises in the theory of rough paths. Indeed, it may so happen that X is actually
given by a smooth function. Even so, this does not automatically mean that the right
hand side of (1.8) is given by the usual Riemann integral of X against itself. An
explicit example illustrating this fact is given in Exercise 2.17 below. Similarly to
the examples of “non-canonical” varifolds given above, “non-canonical” rough paths
can also be constructed as limits of ordinary smooth paths (with the second-order
term $\mathbb{X}$ defined by (1.8) where the integral is the usual Riemann integral), provided
that one takes limits in a suitably weak topology.
Very recently, a new theory of “regularity structures” was introduced [Hai14c], uni-
fying various flavours of the theory of rough paths (including Gubinelli’s controlled
rough paths [Gub04], as well as his branched rough paths [Gub10]), as well as the
usual Taylor expansions. While it has its roots in the theory of rough paths, the main
advantage of this new theory is that it is no longer tied to the one-dimensionality of
the time parameter, which makes it also suitable for the description of solutions to
stochastic partial differential equations, rather than just stochastic ordinary differen-
tial equations.
The main achievement of the theory of regularity structures is that it allows one to
give a (pathwise!) meaning to ill-posed stochastic PDEs that arise naturally when
trying to describe the macroscopic behaviour of models from statistical mechanics
near criticality. One example of such an equation is the KPZ equation arising as a
natural model for one-dimensional interface motion [KPZ86, BG97, Hai13]:
\[
\partial_t h = \partial_x^2 h + (\partial_x h)^2 - C + \xi.
\]
The problem with this equation is that, if anything, one has $(\partial_x h)^2 = +\infty$ (a
consequence of the roughness of (1 + 1)-dimensional space-time white noise) and
one would have to compensate with $C = +\infty$. It has become customary to define the
solution of the KPZ equation as the logarithm of the solution to the (multiplicative) stochastic heat
equation $\partial_t u = \partial_x^2 u + u\xi$, essentially ignoring the (infinite) Itô-correction term.³
The so-constructed solutions are called Hopf–Cole solutions and, to cite J. Quastel
[Qua11],
The evidence for the Hopf–Cole solutions is now overwhelming. Whatever the physicists
mean by KPZ, it is them.
³ This requires one of course to know that solutions to $\partial_t u = \partial_x^2 u + u\xi$ stay strictly positive with
probability one, provided $u_0 > 0$ a.s., but this turns out to be the case.
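The Hopf–Cole transformation can be made plausible by a purely formal computation, sketched here under the (unjustified) assumption that $u$ is smooth and strictly positive so that the classical chain rule applies. Setting $h = \log u$ with $\partial_t u = \partial_x^2 u + u\xi$ gives

```latex
\begin{align*}
\partial_x h &= \frac{\partial_x u}{u}, \qquad
\partial_x^2 h = \frac{\partial_x^2 u}{u} - \Bigl(\frac{\partial_x u}{u}\Bigr)^2
               = \frac{\partial_x^2 u}{u} - (\partial_x h)^2, \\
\partial_t h &= \frac{\partial_t u}{u}
              = \frac{\partial_x^2 u}{u} + \xi
              = \partial_x^2 h + (\partial_x h)^2 + \xi .
\end{align*}
```

This is the KPZ equation without the constant; the (infinite) Itô correction that a rigorous computation would produce is exactly what this formal argument leaves out.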
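Theorem 1.4. Consider KPZ and $\Phi^4_3$ on a bounded square spatial domain with
periodic boundary conditions. Let $\xi_\varepsilon = \delta_\varepsilon * \xi$ denote the regularisation of space-time
white noise with a compactly supported smooth mollifier $\delta_\varepsilon$ that is scaled by ε in
the spatial direction(s) and by $\varepsilon^2$ in the time direction. Denote by $h_\varepsilon$ and $\Phi_\varepsilon$ the
solutions to
\begin{align*}
\partial_t h_\varepsilon &= \partial_x^2 h_\varepsilon + (\partial_x h_\varepsilon)^2 - C_\varepsilon + \xi_\varepsilon, \\
\partial_t \Phi_\varepsilon &= \Delta \Phi_\varepsilon + \tilde C_\varepsilon \Phi_\varepsilon - \Phi_\varepsilon^3 + \xi_\varepsilon.
\end{align*}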
We shall deal with paths with values in, as well as maps between, Banach spaces
V, W. It will be important to consider tensor products of such Banach spaces. Assume
at first that V, W are finite-dimensional, $V \cong \mathbb{R}^m$, $W \cong \mathbb{R}^n$. In this case the
tensor product $V \otimes W$ can be identified with the matrix space $\mathbb{R}^{m \times n}$. Indeed, if
$(e_i : 1 \le i \le m)$ [resp. $(f_j : 1 \le j \le n)$] is a basis of V [resp. W], then
$(e_i \otimes f_j : 1 \le i \le m,\ 1 \le j \le n)$ is a basis of $V \otimes W$. If $(e_i)$ and $(f_j)$ are
orthonormal bases it is natural to define a Euclidean structure on $V \otimes W$ by declaring
the $(e_i \otimes f_j)$ to be orthonormal. This induces a norm on $V \otimes W$ which is compatible
in the sense $|v \otimes w| \le |v| \cdot |w|$ for all $v \in V$, $w \in W$. When applied to $V \otimes V$ we also have
\[
L(V \times \bar V, W) \cong L(V \otimes \bar V, W).
\]
In coordinates, this identification is almost trivial: any $A \in L(V \times \bar V, W)$, i.e. any
bilinear map from $V \times \bar V$ into W, can be expressed in terms of a 3-tensor $(A^j_{i,k})$ such
that A maps $v = v^i e_i$, $\bar v = \bar v^k e_k$ into $v^i \bar v^k A^j_{i,k} f_j \in W$. The same 3-tensor gives
rise to $\bar A \in L(V \otimes \bar V, W)$. Indeed, any $M = M^{i,k} (e_i \otimes e_k) \in V \otimes \bar V$ is mapped
linearly into $M^{i,k} A^j_{i,k} f_j \in W$. (A brief discussion of how these things are adapted in
an infinite-dimensional Banach setting is given in the following subsection.)
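The coordinate identification can be checked numerically. The sketch below assumes $V = \bar V = \mathbb{R}^3$ and $W = \mathbb{R}^2$ with the Euclidean structures described above; the array `A` (indexed as `A[i, k, j]` for $A^j_{i,k}$) and the function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
m, n = 3, 2                        # dim V = dim V-bar = m, dim W = n (assumed)
A = rng.normal(size=(m, m, n))     # A[i, k, j] plays the role of A^j_{i,k}

def bilinear(v, v_bar):
    """A as a bilinear map V x V-bar -> W: v^i v_bar^k A^j_{i,k}."""
    return np.einsum("i,k,ikj->j", v, v_bar, A)

def linear_on_tensors(M):
    """The same 3-tensor as a linear map V (x) V-bar -> W: M^{i,k} A^j_{i,k}."""
    return np.einsum("ik,ikj->j", M, A)

v, v_bar = rng.normal(size=m), rng.normal(size=m)
elementary = np.outer(v, v_bar)    # matrix representation of v (x) v_bar

agree = np.allclose(bilinear(v, v_bar), linear_on_tensors(elementary))
norm_compatible = np.isclose(
    np.linalg.norm(elementary), np.linalg.norm(v) * np.linalg.norm(v_bar)
)
print(agree, norm_compatible)
```

On elementary tensors the two maps coincide, and the Euclidean (Frobenius) norm satisfies the compatibility condition with equality.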
It will also be important to consider nonlinear maps between Banach spaces.
Generically, we write $C_b^n$ for the space of bounded continuous functions, say $F : V \to W$,
with up to n bounded, continuous derivatives in Fréchet sense, i.e. such that
\[
\|X\|_\alpha \overset{\text{def}}{=} \sup_{s,t \in [0,T]} \frac{|X_{s,t}|}{|t - s|^\alpha} < \infty,
\]
where we define the path increment $X_{s,t} \overset{\text{def}}{=} X_t - X_s$ (and also use the convention
$0/0 \overset{\text{def}}{=} 0$). As is well known, $C^\alpha$ is a Banach space when equipped with the norm
$X \mapsto |X_0| + \|X\|_\alpha$. When working with paths starting at the origin, the term $|X_0|$
can be omitted, i.e. we can work directly with $\|\cdot\|_\alpha$. The same is true if we
are only interested in the α-Hölder distance between two paths started at the same
point ξ ∈ V. Often we shall work with partitions or dissections of [0, T]; since
every dissection $D = \{0 = t_0 < t_1 < \dots < t_n = T\} \subset [0, T]$ can be thought of as
a partition of [0, T] into (essentially) disjoint intervals, $\mathcal{P} = \{[t_{i-1}, t_i] : i = 1, \dots, n\}$,
and vice versa, we shall use whatever is (notationally) more convenient. We recall
that $\lim_{|\mathcal{P}| \to 0}$, typically defined via nets, means convergence along any sequence $(\mathcal{P}_k)$
with mesh $|\mathcal{P}_k| \to 0$, with identical limit along each such sequence. Here, the mesh
$|\mathcal{P}|$ of a partition $\mathcal{P}$ is the length of its largest element, i.e.
$|\mathcal{P}| = \sup_{k \in \{1, \dots, n\}} |t_k - t_{k-1}|$ if $\mathcal{P}$ is as above.
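For concreteness, both notions can be evaluated on sampled data. The following sketch computes the Hölder seminorm restricted to a finite grid (a lower bound for the true supremum) for a scalar-valued path, and the mesh of a dissection; the function names are illustrative.

```python
import numpy as np

def holder_seminorm(times, values, alpha):
    """sup over grid points s != t of |X_t - X_s| / |t - s|^alpha (scalar path)."""
    dt = np.abs(times[:, None] - times[None, :])
    dx = np.abs(values[:, None] - values[None, :])
    off_diagonal = dt > 0            # the diagonal is covered by the convention 0/0 = 0
    return float(np.max(dx[off_diagonal] / dt[off_diagonal] ** alpha))

def mesh(dissection):
    """Length of the largest interval of the partition induced by a dissection."""
    return float(np.max(np.diff(dissection)))

grid = np.linspace(0.0, 1.0, 501)
sqrt_seminorm = holder_seminorm(grid, np.sqrt(grid), alpha=0.5)
example_mesh = mesh(np.array([0.0, 0.1, 0.5, 1.0]))
print(sqrt_seminorm, example_mesh)
```

For $X_t = \sqrt{t}$ the 1/2-Hölder seminorm is 1, attained along pairs with s = 0, and the dissection {0, 0.1, 0.5, 1} has mesh 0.5.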
We will frequently deal with functions Ξ mapping $(s, t) \in [0, T]^2$ continuously
into some Banach space and which enjoy some sort of “on-diagonal” α-Hölder
regularity. More precisely, we write $\Xi(s, t) \overset{\text{def}}{=} \Xi_{s,t} \in C_2^\alpha$ if there exists a constant
C such that $|\Xi_{s,t}| \le C |t - s|^\alpha$ for all $(s, t) \in [0, T]^2$. The smallest such constant is
then given by
\[
\|\Xi\|_\alpha \overset{\text{def}}{=} \sup_{s,t \in [0,T]} \frac{|\Xi_{s,t}|}{|t - s|^\alpha}.
\]
⁴ This will arise naturally, with $\bar V = V$, when pairing the second Fréchet derivatives (of some
$F : V \to W$) with second iterated integrals with values in $V \otimes V$.
In particular, if X is a function defined on [0, T] that is α-Hölder continuous in the
usual sense, then its increments $(s, t) \mapsto X_{s,t}$ belong to $C_2^\alpha$. For any such (non-trivial)
path increment, one has necessarily α ≤ 1, for otherwise $\dot X = 0$ and then
$X_{s,t} = \int_s^t \dot X \equiv 0$. In general, however, one has non-trivial elements $\Xi \in C_2^\alpha$ also
for α > 1 and indeed this is a crucial property whenever $\Xi_{s,t}$ represents some error
term, since, in this case, whenever $\mathcal{P}$ is a partition of the interval [0, T], one has
$\sum_{[s,t] \in \mathcal{P}} |\Xi_{s,t}| \le C T |\mathcal{P}|^{\alpha - 1}$, which goes to 0 with the mesh of $\mathcal{P}$.
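The bound on the sum of error terms follows in one line from the definition of $C_2^\alpha$. Taking $C = \|\Xi\|_\alpha$ and using $|t - s| \le |\mathcal{P}|$ together with α > 1, one can spell it out as follows:

```latex
\[
\sum_{[s,t] \in \mathcal{P}} |\Xi_{s,t}|
\;\le\; C \sum_{[s,t] \in \mathcal{P}} |t - s|^{\alpha - 1}\, |t - s|
\;\le\; C\, |\mathcal{P}|^{\alpha - 1} \sum_{[s,t] \in \mathcal{P}} |t - s|
\;=\; C\, T\, |\mathcal{P}|^{\alpha - 1}.
\]
```

The last equality only uses that the intervals of $\mathcal{P}$ tile [0, T].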
As usual, we will use the notation A = O(x) if there exists a constant C such
that the bound |A| ≤ Cx holds for every x ≤ 1 (or every x ≥ 1, depending on the
context). Similarly, we write A = o(x) if the constant C can be made arbitrarily
small as x → 0 (or as x → ∞, depending on the context). We will also occasionally
write C for a generic constant that only depends on the data of the problem under
consideration and which can change value from one line to the other without further
notice.
Finally, let us note that the symbols $\mathscr{C}^\alpha$, $\mathscr{D}_X^\alpha$ etc. refer to spaces of rough paths and
controlled rough paths, respectively. (Both are introduced in detail in the relevant
sections below.)
Unless explicitly otherwise stated, all rough path results in this book are valid (with
no complications in the arguments!) in a general Banach setting. Linear (or bilinear)
maps are now assumed to be continuous and we still use L(...) for the class of
such maps. What is a little more involved is the (classical) construction of a tensor
product as Banach space: one completes the algebraic tensor product, $V \otimes_a \bar V$, under
a compatible tensor norm upon which the resulting space $V \otimes \bar V$ depends. What one
would like, as above, is
\[
L(V \times \bar V, W) \cong L(V \otimes \bar V, W)
\]
will be enough for our purposes. For the rest of this text we shall thus make the
standing assumption that V ⊗ V has been equipped with a compatible tensor norm
that has this property. In many situations of interest the space V is just a copy of Rm
and then this is trivially true. In the existing literature, such aspects are discussed in
[LCL07, p19-20], [LQ02, p28,111].
Chapter 2
The space of rough paths
Abstract We define the space of (Hölder continuous) rough paths, as well as the
subspace of “geometric” rough paths which preserve the usual rules of calculus. The
latter can be interpreted in a natural way as paths with values in a certain nilpotent
Lie group. At the end of the chapter, we give a short discussion showing how these
definitions should be generalized to treat paths of arbitrarily low regularity.
which we assume to hold for every triple of times (s, u, t). Since $X_{t,t} = 0$, it
immediately follows (take s = u = t) that we also have $\mathbb{X}_{t,t} = 0$ for every t. As
already mentioned in the introduction, one should think of $\mathbb{X}$ as postulating the value
of the quantity
\[
\int_s^t X_{s,r} \otimes dX_r \overset{\text{def}}{=} \mathbb{X}_{s,t}, \tag{2.2}
\]
where we take the right hand side as a definition for the left hand side. (And not
the other way around!) We insist (cf. Exercise 2.7 below) that as a consequence
of (2.1), knowledge of the path $t \mapsto (X_{0,t}, \mathbb{X}_{0,t})$ already determines the entire
second order process $\mathbb{X}$. In this sense, the pair $(X, \mathbb{X})$ is indeed a path, and not
some two-parameter object, although it is often more convenient to consider it
as one. If X is a smooth function and we read (2.2) from right to left, then it is
straightforward to verify (see Exercise 2.6 below) that the relation (2.1) does indeed
hold. Furthermore, one can convince oneself that if $f \mapsto \int f\, dX$ denotes any form
of “integration” which is linear in f, has the property that $\int_s^t dX_r = X_{s,t}$, and is
such that $\int_s^t f(r)\, dX_r + \int_t^u f(r)\, dX_r = \int_s^u f(r)\, dX_r$ for any admissible integrand
f, and if we use such a notion of “integral” to define $\mathbb{X}$ via (2.2), then (2.1) does
automatically hold. This makes it a very natural postulate in our setting.
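The claim that additivity and linearity force (2.1) can be verified in a few lines. The computation below assumes that (2.1) is Chen's relation in its standard form, $\mathbb{X}_{s,t} - \mathbb{X}_{s,u} - \mathbb{X}_{u,t} = X_{s,u} \otimes X_{u,t}$ (its display does not survive in the passage above), and uses $X_{s,r} = X_{s,u} + X_{u,r}$ for $r \in [u, t]$:

```latex
\begin{align*}
\mathbb{X}_{s,t}
  &= \int_s^u X_{s,r} \otimes dX_r + \int_u^t X_{s,r} \otimes dX_r
     && \text{(additivity)} \\
  &= \mathbb{X}_{s,u} + \int_u^t \bigl( X_{s,u} + X_{u,r} \bigr) \otimes dX_r \\
  &= \mathbb{X}_{s,u} + X_{s,u} \otimes \int_u^t dX_r + \int_u^t X_{u,r} \otimes dX_r
     && \text{(linearity)} \\
  &= \mathbb{X}_{s,u} + X_{s,u} \otimes X_{u,t} + \mathbb{X}_{u,t}.
\end{align*}
```

Only the three stated properties of the “integral” are used; no regularity of X enters.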
Note that the algebraic relations (2.1) are by themselves not sufficient to determine
$\mathbb{X}$ as a function of X. Indeed, for any $V \otimes V$-valued function F, the substitution
$\mathbb{X}_{s,t} \mapsto \mathbb{X}_{s,t} + F_t - F_s$ leaves the left hand side of (2.1) invariant. We will see later
on how one should interpret such a substitution. It remains to discuss what are the
natural analytical conditions one should impose for $\mathbb{X}$. We are going to assume that
the path X itself is α-Hölder continuous, so that $|X_{s,t}| \lesssim |t - s|^\alpha$. The archetype of
an α-Hölder continuous function is one which is self-similar with index α, so that
$X_{\lambda s, \lambda t} \sim \lambda^\alpha X_{s,t}$.
(We intentionally do not give any mathematical definition of self-similarity here,
just think of ∼ as having the vague meaning of “looks like”.) Given (2.2), it is then
very natural to expect $\mathbb{X}$ to also be self-similar, but with $\mathbb{X}_{\lambda s, \lambda t} \sim \lambda^{2\alpha} \mathbb{X}_{s,t}$. This
discussion motivates the following definition of our basic spaces of rough paths.
Definition 2.1. For $\alpha \in (\tfrac13, \tfrac12]$, define the space of α-Hölder rough paths (over V),
in symbols $\mathscr{C}^\alpha([0, T], V)$, as those pairs $(X, \mathbb{X}) =: \mathbf{X}$ such that
Remark 2.2. Given an arbitrary path $X \in C^\alpha$ with values in some Banach space V it
is far from obvious that this path can indeed be lifted to a rough path $(X, \mathbb{X}) \in \mathscr{C}^\alpha$.
The Lyons–Victoir extension theorem [LV07] asserts that this can always be done
provided $\alpha \in (\tfrac13, \tfrac12)$, with an infinite-dimensional counterexample given in the case
α = 1/2. When dim V < ∞, there is no such restriction, see Proposition 13.23
below. In typical applications to stochastic processes, a “canonical” lift is constructed
via probability and one does not rely on the extension theorem.
If one ignores the nonlinear constraint (2.1), there is a natural way to think of
$(X, \mathbb{X})$ as an element in the Banach space $C^\alpha \oplus C_2^{2\alpha}$ of such maps with (semi-)norm
$\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$. However, taking into account (2.1) we see that $\mathscr{C}^\alpha$ is not a linear
space, although it is a closed subset of the aforementioned Banach space. We will
need (some sort of) a norm and metric on $\mathscr{C}^\alpha$. The induced “natural” norm on $\mathscr{C}^\alpha$
given by $\|X\|_\alpha + \|\mathbb{X}\|_{2\alpha}$ fails to respect the structure of (2.1) which is homogeneous
with respect to a natural dilation on $\mathscr{C}^\alpha$, given by $(X, \mathbb{X}) \mapsto (\lambda X, \lambda^2 \mathbb{X})$. This
suggests introducing the α-Hölder (homogeneous) rough path norm
\[
|||\mathbf{X}|||_\alpha \overset{\text{def}}{=} \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}}, \tag{2.4}
\]
which, although not a norm in the usual sense of normed linear spaces, is the adequate
concept for the rough path $\mathbf{X} = (X, \mathbb{X})$.
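The homogeneity that motivates the square root can be checked directly. Writing $\delta_\lambda \mathbf{X} = (\lambda X, \lambda^2 \mathbb{X})$ for the dilation (the symbol $\delta_\lambda$ is introduced here only for illustration), one finds

```latex
\begin{align*}
|||\delta_\lambda \mathbf{X}|||_\alpha
  &= \|\lambda X\|_\alpha + \sqrt{\|\lambda^2 \mathbb{X}\|_{2\alpha}}
   = |\lambda| \, \|X\|_\alpha + |\lambda| \sqrt{\|\mathbb{X}\|_{2\alpha}}
   = |\lambda| \; |||\mathbf{X}|||_\alpha, \\
\|\lambda X\|_\alpha + \|\lambda^2 \mathbb{X}\|_{2\alpha}
  &= |\lambda| \, \|X\|_\alpha + \lambda^2 \, \|\mathbb{X}\|_{2\alpha}.
\end{align*}
```

The first quantity scales linearly in $|\lambda|$, while the “natural” linear norm mixes the powers $|\lambda|$ and $\lambda^2$.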
Note also that the quantities defined in (2.3) are merely seminorms since they
vanish for constants. Most importantly, (2.3) leads to a notion of rough path metric
(and then rough path topology).
Perhaps the cheapest way to show convergence with respect to this rough path
metric is based on interpolation: in essence, it is enough to establish pointwise
convergence in conjunction with uniform “rough path” bounds of the form (2.3); see
Exercise 2.9. Let us also note that $\mathscr{C}^\alpha([0, T], V)$ thus becomes a complete metric
space; the reader is asked to work out the details in Exercise 2.11.
We conclude this part with two important remarks. First, we can ask ourselves up
to which point the relations (2.1) are already sufficient to determine $\mathbb{X}$. Assume that
we can associate to a given function X two different second order processes $\mathbb{X}$ and
$\bar{\mathbb{X}}$, and set $G_{s,t} = \mathbb{X}_{s,t} - \bar{\mathbb{X}}_{s,t}$. It then follows immediately from (2.1) that
\[
G_{s,t} = G_{s,u} + G_{u,t},
\]
so that in particular $G_{s,t} = G_{0,t} - G_{0,s}$. Since, conversely, we already noted that
setting $\bar{\mathbb{X}}_{s,t} = \mathbb{X}_{s,t} + F_t - F_s$ for an arbitrary continuous function F does not
change the left hand side of (2.1), we conclude that $\mathbb{X}$ is in general determined
¹ As was already emphasised, $\mathscr{C}^\alpha$ is not a linear space but is naturally embedded in the normed
space of maps $(X, \mathbb{X})$; the definition of $\varrho_\alpha$ makes use of this. While this may not appear intrinsic (the
situation is somewhat similar to using the (restricted) Euclidean metric of $\mathbb{R}^3$ on the 2-sphere), the
ultimate justification is that the Itô map will turn out to be locally Lipschitz continuous in $\varrho_\alpha$.
While (2.1) does capture the most basic (additivity) property that one expects any
decent theory of integration to respect, it does not imply any form of integration by
parts / chain rule. Now, if one looks for a first order calculus setting, such as is valid
in the context of smooth paths or the Stratonovich stochastic calculus, then for any
pair $e_i^*, e_j^*$ of elements in $V^*$, writing $X^i_t = e_i^*(X_t)$ and $\mathbb{X}^{ij}_{s,t} = (e_i^* \otimes e_j^*)(\mathbb{X}_{s,t})$, one
would expect to have the identity
\begin{align*}
\mathbb{X}^{ij}_{s,t} + \mathbb{X}^{ji}_{s,t}
  &\text{“=”} \int_s^t X^i_{s,r}\, dX^j_r + \int_s^t X^j_{s,r}\, dX^i_r \\
  &= \int_s^t d(X^i X^j)_r - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
  &= (X^i X^j)_{s,t} - X^i_s X^j_{s,t} - X^j_s X^i_{s,t} \\
  &= X^i_{s,t} X^j_{s,t},
\end{align*}
so that the symmetric part of $\mathbb{X}$ is determined by X. In other words, for all times s, t
we have the “first order calculus” condition
\[
\operatorname{Sym}(\mathbb{X}_{s,t}) = \tfrac12\, X_{s,t} \otimes X_{s,t}. \tag{2.5}
\]
However, if we take X to be an n-dimensional Brownian path and define $\mathbb{X}$ by Itô
integration, then (2.1) still holds, but (2.5) certainly does not.
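The failure of (2.5) for Itô-type integrals is already visible at the level of discrete sums. The sketch below is an illustration on a simulated random walk, not the construction used in the text: it compares left-point (“Itô-type”) and trapezoidal (“Stratonovich-type”) second-level sums over [0, 1]. For any discrete path the left-point sum has symmetric part $\tfrac12 (X \otimes X - Q)$ with $Q = \sum_k \Delta_k \otimes \Delta_k$, while the trapezoidal sum satisfies (2.5) exactly. All names are illustrative.

```python
import numpy as np

def second_level(increments, rule):
    """Second-level iterated sum sum_k base_k (x) Delta_k for a discrete path from 0."""
    left = np.cumsum(increments, axis=0) - increments    # path value before each step
    if rule == "ito":
        base = left                                       # left-point evaluation
    elif rule == "stratonovich":
        base = left + 0.5 * increments                    # midpoint of each linear piece
    else:
        raise ValueError(f"unknown rule: {rule}")
    return base.T @ increments

def sym(matrix):
    return 0.5 * (matrix + matrix.T)

rng = np.random.default_rng(seed=0)
n_steps, dim = 10_000, 2
increments = rng.normal(scale=np.sqrt(1.0 / n_steps), size=(n_steps, dim))

total = increments.sum(axis=0)                 # X_{0,1}
first_order = 0.5 * np.outer(total, total)     # right hand side of (2.5)
quad_var = increments.T @ increments           # Q, close to the identity for Brownian steps

sym_ito = sym(second_level(increments, "ito"))
sym_strat = sym(second_level(increments, "stratonovich"))
print("Stratonovich-type defect:", np.abs(sym_strat - first_order).max())
print("Ito-type defect         :", np.abs(sym_ito - first_order).max())
```

The Itô-type defect equals half the discrete quadratic variation, which does not vanish as the mesh is refined.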
There are two natural ways to define a set of “geometric” rough paths for which
(2.5) holds. On the one hand, we can define a subspace $\mathscr{C}_g^\alpha \subset \mathscr{C}^\alpha$ by stipulating that
$(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ if and only if $(X, \mathbb{X}) \in \mathscr{C}^\alpha$ and (2.5) holds for every s, t. Note that
$\mathscr{C}_g^\alpha$ is a closed subset of $\mathscr{C}^\alpha$. On the other hand, we have already seen that every
smooth path can be lifted canonically to an element of $\mathscr{C}^\alpha$ by reading the definition
(2.2) from right to left. This choice of $\mathbb{X}$ then obviously satisfies (2.5) and we can
define $\mathscr{C}_g^{0,\alpha}$ as the closure of lifts of smooth paths in $\mathscr{C}^\alpha$. We leave it as an exercise
to the reader to see that smooth paths in the definition of $\mathscr{C}_g^{0,\alpha}$ may be replaced by
piecewise smooth paths or (piecewise) $C^1$ paths without changing the resulting space
of geometric rough paths; see also Exercise 2.12.
One has the obvious inclusion $\mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^\alpha$, which turns out to be strict [FV06a].
The situation is similar to the classical situation of the set of α-Hölder continuous
functions being strictly larger than the closure of smooth functions under the α-Hölder
norm. (Or the set of bounded measurable functions being strictly larger than
C, the closure of smooth functions under the supremum norm.) Also similar to the
case of classical Hölder spaces, one has the converse inclusion $\mathscr{C}_g^\beta \subset \mathscr{C}_g^{0,\alpha}$ whenever
β > α, see Exercise 2.14. Let us finally mention that non-geometric rough paths
can always be embedded in a space of geometric rough paths at the expense of
adding new components; this is made precise in Exercise 2.14 and was systematically
explored in [HK12].
We now present a very fruitful interpretation of rough paths, at least in finite
dimensions, say $V = \mathbf{R}^d$. To this end, consider $X : [0,T] \to \mathbf{R}^d$, $\mathbb{X} : [0,T]^2 \to \mathbf{R}^d \otimes \mathbf{R}^d$
subject to (2.1) and define (with $X_{s,t} = X_t - X_s$ as usual)
also known as truncated tensor algebra. This multiplicative structure is very well
adapted to our needs since (2.1), combined with the obvious identity Xs,t = Xs,u +
Xu,t , means precisely that (again, called “Chen’s relation”)
write (1, b, c) = 1 + b + c. Computations using “formal power series” are then
possible by considering the standard basis $\{e_i : 1 \le i \le d\} \subset \mathbf{R}^d$ as non-commutative
variables. The usual power series $(1 + x)^{-1} = 1 - x + x^2 - \dots$ then leads to
$$(1 + b + c)^{-1} = 1 - (b + c) + (b + c) \otimes (b + c) = 1 - b - c + b \otimes b ,$$
and confirms the inverse of 1 + b + c given in (2.7). The usual power series also
suggest
$$\log(1 + b + c) \overset{\text{def}}{=} b + c - \tfrac{1}{2}\, b \otimes b , \qquad \exp(b + c) \overset{\text{def}}{=} 1 + b + c + \tfrac{1}{2}\, b \otimes b . \tag{2.8}$$
$$[b + c, b' + c'] = b \otimes b' - b' \otimes b ,$$
(0, b, 0). One can check that, as a Lie algebra, $\mathfrak{g}^{(2)} = \mathbf{R}^d \oplus \mathfrak{so}(d)$, i.e. the linear
span of $(e_i : 1 \le i \le d)$ and $(e_{ij} : 1 \le i < j \le d)$, where $e_{ij} = [e_i, e_j]$. The Lie
bracket of $e_{ij}$ with any other element in $\mathfrak{g}^{(2)}$ vanishes. Since $\mathfrak{g}^{(2)}$ is closed under
the operation [·, ·], its image under the exponential map, $G^{(2)}(\mathbf{R}^d) := \exp(\mathfrak{g}^{(2)})$, is a
Lie subgroup of $T_1^{(2)}(\mathbf{R}^d)$.
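The level-2 calculus above is easy to experiment with. The following is only a minimal sketch, assuming that an element of the truncated tensor algebra is stored as a triple (a, b, c) of a scalar, a vector and a d × d array; the function names `mul`, `exp2`, `log2` and `inv` are illustrative choices and not notation from the text.

```python
# Minimal sketch of the step-2 truncated tensor algebra T^(2)(R^d).
# An element is a triple (a, b, c): scalar a, vector b (list of length d),
# 2-tensor c (d x d nested list). All function names are illustrative.

def mul(x, y):
    """Truncated tensor product: all terms of level > 2 are discarded."""
    a1, b1, c1 = x
    a2, b2, c2 = y
    d = len(b1)
    b = [a1 * b2[i] + a2 * b1[i] for i in range(d)]
    c = [[a1 * c2[i][j] + a2 * c1[i][j] + b1[i] * b2[j] for j in range(d)]
         for i in range(d)]
    return (a1 * a2, b, c)

def exp2(b, c):
    """exp(b + c) = 1 + b + c + (1/2) b (x) b, cf. (2.8)."""
    d = len(b)
    return (1.0, list(b),
            [[c[i][j] + 0.5 * b[i] * b[j] for j in range(d)] for i in range(d)])

def log2(x):
    """log(1 + b + c) = b + c - (1/2) b (x) b, cf. (2.8); returns the pair (b, c)."""
    _, b, c = x
    d = len(b)
    return (list(b),
            [[c[i][j] - 0.5 * b[i] * b[j] for j in range(d)] for i in range(d)])

def inv(x):
    """(1 + b + c)^(-1) = 1 - b - c + b (x) b."""
    _, b, c = x
    d = len(b)
    return (1.0, [-v for v in b],
            [[-c[i][j] + b[i] * b[j] for j in range(d)] for i in range(d)])

b = [1.0, 2.0]
area = [[0.0, 0.5], [-0.5, 0.0]]   # an element of so(2)
x = exp2(b, area)                   # a point of G^(2)(R^2)
identity = mul(x, inv(x))           # should be the unit (1, 0, 0)
# Symmetric part of the second level of x; for antisymmetric c it is (1/2) b (x) b.
sym = [[0.5 * (x[2][i][j] + x[2][j][i]) for j in range(2)] for i in range(2)]
```

In this example the exponential of an element of $\mathbf{R}^2 \oplus \mathfrak{so}(2)$ has a second level whose symmetric part is ½ b ⊗ b, which is exactly the constraint (2.5) in this representation.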
We call $G^{(2)}(\mathbf{R}^d)$ the step-2 nilpotent Lie group (with d generators). The algebraic
constraint (2.5) then translates precisely to the statement that the path $t \mapsto \mathbf{X}_{0,t}$ (and
then the increments $\mathbf{X}_{s,t}$) takes values in $G^{(2)}(\mathbf{R}^d)$.
Without going into too much detail here, $G^{(2)}(\mathbf{R}^d)$ admits a natural homogeneous
“Carnot–Carathéodory norm” $\|\cdot\|_C$ with the property, for x = exp(b + c),
$$\|x\|_C \asymp |b| + |c|^{1/2} , \tag{2.9}$$
where ≍ indicates Lipschitz equivalence (with constants that may depend on the
dimension d). A left-invariant metric $d_C$, known as the Carnot–Carathéodory metric,
is induced by $\|\cdot\|_C$ so that
$$d_C(\mathbf{X}_s, \mathbf{X}_t) = \|\mathbf{X}_{s,t}\|_C \asymp |X_{s,t}| + |\mathbb{X}_{s,t}|^{1/2} . \tag{2.10}$$
Using the homogeneous rough path norm introduced in (2.4), taking into account
(2.3), we thus have
$$|||\mathbf{X}|||_{\alpha;[0,T]} \asymp \sup_{s,t \in [0,T]} \frac{d_C(\mathbf{X}_s, \mathbf{X}_t)}{|t - s|^{\alpha}} ,$$
1. One has (X, 𝕏) ∈ $\mathscr{C}_g^{\alpha}$, i.e. it satisfies (2.1), (2.3) and (2.5).
2. The path $t \mapsto \mathbf{X}_t = 1 + X_{0,t} + \mathbb{X}_{0,t}$ takes values in $G^{(2)}(\mathbf{R}^d)$ and is α-Hölder
continuous with respect to the distance $d_C$.
Without going into full detail, the above proposition, combined with the geodesic
nature of the space $G^{(2)}(\mathbf{R}^d)$, shows that geometric rough paths are essentially limits
of smooth paths (“geodesic approximations” in the terminology of [FV10b]) in the
rough path metric.
Proposition 2.5. Let β ∈ (1/3, 1/2]. For every (X, 𝕏) ∈ $\mathscr{C}_g^{\beta}([0,T], \mathbf{R}^d)$, there exists a
with uniform rough path bounds $\sup_n \|X^n\|_{\beta} + \|\mathbb{X}^n\|_{2\beta} < \infty$. By interpolation,
convergence holds in α-Hölder rough path metric for any α ∈ (1/3, β), namely
The interpretation given above gives a strong hint on how to construct geometric
rough paths with α-Hölder regularity for α ≤ 1/3: setting $p = \lfloor 1/\alpha \rfloor$, one defines the
p-step truncated tensor algebra $T^{(p)}(\mathbf{R}^d)$ by
$$T^{(p)}(\mathbf{R}^d) \overset{\text{def}}{=} \mathbf{R} \oplus \bigoplus_{n=1}^{p} \big(\mathbf{R}^d\big)^{\otimes n} .$$
We can construct a Lie group $G^{(p)}(\mathbf{R}^d) \subset T^{(p)}(\mathbf{R}^d)$ as before, by setting
$G^{(p)} = \exp(\mathfrak{g}^{(p)})$, where $\mathfrak{g}^{(p)} \subset T_0^{(p)}(\mathbf{R}^d)$ is the Lie algebra generated by elements of the
form (0, b, 0, . . . , 0). Again, one can construct a “homogeneous Carnot–Carathéodory
metric” on $G^{(p)}$, with a property similar to (2.9), but with the contribution coming
from the kth level scaling like $|\cdot|^{1/k}$.
A geometric α-Hölder rough path for arbitrary α ∈ (0, 1/2] is then given by a
2.5 Exercises
Exercise 2.6. Let X be a smooth V-valued path and let 𝕏 be given by the left hand
side of (2.2), namely
$$\mathbb{X}_{s,t} = \int_s^t X_{s,r} \otimes \dot{X}_r \, dr .$$
a) Show that 𝕏 does indeed satisfy Chen’s relation (2.1).
b) Consider the collection of all iterated integrals over [s, t], viewed as an element in
the tensor algebra over V, say
$$\mathbf{X}_{s,t} := \Big(1, X_{s,t}, \mathbb{X}_{s,t}, \int_{s<u_1<u_2<u_3<t} dX_{u_1} \otimes dX_{u_2} \otimes dX_{u_3}, \ldots\Big) \in T((V)) , \tag{2.11}$$
and show that the following general form of Chen’s relation holds,
determines the entire second order process X. In this sense (X, X) is indeed a path,
and not some two-parameter object.
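Exercise 2.8. Consider $s \equiv \tau_0 < \tau_1 < \dots < \tau_N \equiv t$. Show that (2.1) implies
$$\mathbb{X}_{s,t} = \sum_{0 \le i < N} \mathbb{X}_{\tau_i,\tau_{i+1}} + \sum_{0 \le j < i < N} X_{\tau_j,\tau_{j+1}} \otimes X_{\tau_i,\tau_{i+1}} = \sum_{i=0}^{N-1} \big( \mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}} \big) . \tag{2.12}$$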
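As a sanity check, the identity (2.12) can be tested numerically on a smooth path whose iterated integrals are known in closed form. The sketch below assumes the example path $X_t = (t, t^2)$ in $\mathbf{R}^2$, which is not taken from the text; the helper names `increment`, `second_level` and `chen_sum` are hypothetical.

```python
# Sketch: check the iterated form (2.12) of Chen's relation for the smooth
# path X_t = (t, t^2) in R^2 (an arbitrary example), whose second-level
# increments int_s^t X_{s,r} (x) dX_r are available in closed form.

def increment(s, t):
    """First level: X_{s,t} = X_t - X_s."""
    return [t - s, t * t - s * s]

def second_level(s, t):
    """Closed-form iterated integrals of X_t = (t, t^2) over [s, t]."""
    return [
        [0.5 * (t - s) ** 2,
         2.0 * t ** 3 / 3.0 - s * t ** 2 + s ** 3 / 3.0],
        [(t ** 3 - s ** 3) / 3.0 - s * s * (t - s),
         0.5 * (t * t - s * s) ** 2],
    ]

def chen_sum(taus):
    """Right-hand side of (2.12) for the partition taus[0] < ... < taus[-1]."""
    s = taus[0]
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lo, hi in zip(taus[:-1], taus[1:]):
        left = increment(s, lo)        # X_{s, tau_i}
        step = increment(lo, hi)       # X_{tau_i, tau_{i+1}}
        small = second_level(lo, hi)   # second level on [tau_i, tau_{i+1}]
        for i in range(2):
            for j in range(2):
                total[i][j] += small[i][j] + left[i] * step[j]
    return total

partition = [0.2, 0.5, 0.9, 1.0, 1.3]
direct = second_level(partition[0], partition[-1])
pieced = chen_sum(partition)
max_gap = max(abs(direct[i][j] - pieced[i][j])
              for i in range(2) for j in range(2))
```

Up to floating point error, `max_gap` should vanish for any choice of partition, since (2.12) is an exact algebraic consequence of (2.1).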
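Exercise 2.9 (Interpolation). Assume that $\mathbf{X}^n \in \mathscr{C}^{\beta}$, for 1/3 < α < β, with
uniform bounds
and uniform convergence $X^n_{s,t} \to X_{s,t}$ and $\mathbb{X}^n_{s,t} \to \mathbb{X}_{s,t}$, i.e. uniformly over s, t ∈
[0, T]. Show that this implies $\mathbf{X} \in \mathscr{C}^{\beta}$ and
$$\varrho_{\alpha}(\mathbf{X}^n, \mathbf{X}) \to 0 .$$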
Solution 2.10. Using the uniform bounds and pointwise convergence, there exists C
such that uniformly in s, t
$$|X_{s,t}| = \lim_n \big|X^n_{s,t}\big| \le C|t - s|^{\beta} , \qquad |\mathbb{X}_{s,t}| = \lim_n \big|\mathbb{X}^n_{s,t}\big| \le C|t - s|^{2\beta} .$$
as the $\varrho_{\alpha}$-closure of smooth paths (enhanced with their iterated Riemann integrals)
in $\mathscr{C}^{\alpha}([0,T], V)$. Assuming that V is separable, show that $\mathscr{C}_g^{0,\alpha}([0,T], V)$ is also
separable.
b) Show that for every geometric 1/2-Hölder rough path, $\mathbf{X} \in \mathscr{C}_g^{0,1/2}$, 𝕏 is necessarily
the iterated Riemann–Stieltjes integral of the underlying path $X \in \mathscr{C}^{0,1/2}$.
(Attention, this does not mean that for every $X \in \mathscr{C}^{0,1/2}$ the iterated Riemann–Stieltjes
integral exists! A counterexample is found in [FV10b, Ex. 9.14 (iii)].)
Solution 2.13. Let Q be a countable, dense subset of V and consider the space
$\Lambda_n$ of paths which are piecewise linear between level-n dyadic rationals $D_n :=
\{kT/2^n : 0 \le k \le 2^n\}$, and, at level-n dyadic points, take values in Q. Clearly $\Lambda =
\cup \Lambda_n$ is countable, for each $\Lambda_n$ is in one-to-one correspondence with the $(2^n + 1)$-fold
Cartesian product of Q. It is easy to see that each smooth X is the limit in $C^1$ of
some sequence $(X^n) \subset \Lambda$. Indeed, one can take $X^n$ to be the piecewise linear
dyadic approximation, modified such that $X^n|_{D_n}$ takes values in Q and such that
$|(X^n - X)|_{D_n}| < 1/n$. By continuity of the map $X \in C^1 \mapsto \big(X, \int X \otimes dX\big) \in
\mathscr{C}^{\alpha}$ in the respective topologies (we could even take α = 1), we have more than
enough to assert that every lifted smooth path, $\big(X, \int X \otimes dX\big)$, is the $\varrho_{\alpha}$-limit of
lifted paths in Λ. It is then easy to see that every $\varrho_{\alpha}$-limit point of lifted smooth paths
is also the $\varrho_{\alpha}$-limit of lifted paths in Λ.
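Turning to the second part of the question, it is not hard to see that
$$\mathscr{C}_g^{0,\alpha} \subset \Big\{ \mathbf{X} \in \mathscr{C}^{\alpha} : \sup_{s,t : |t-s| < \varepsilon} \frac{|X_{s,t}|}{|t-s|^{\alpha}} \to 0 , \ \sup_{s,t : |t-s| < \varepsilon} \frac{|\mathbb{X}_{s,t}|}{|t-s|^{2\alpha}} \to 0 \ \text{as } \varepsilon \to 0 \Big\} .$$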
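Consider now the case α = 1/2 and a dissection $\{s = \tau_0 < \tau_1 < \dots < \tau_N = t\}$
with mesh ≤ ε. It follows from Chen’s relation (2.1) that
$$\Big| \mathbb{X}_{s,t} - \sum_{0 \le i < N} X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}} \Big| = \Big| \sum_{0 \le i < N} \mathbb{X}_{\tau_i,\tau_{i+1}} \Big| \le C(\varepsilon) \sum_{0 \le i < N} |\tau_{i+1} - \tau_i|^{2\alpha} = (t - s)\, C(\varepsilon) \le T\, C(\varepsilon) .$$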
as the $\varrho_{\alpha}$-closure of smooth paths and their iterated integrals plus smooth V ⊗ V-valued
path increments. Show that
$$\mathscr{C}^{0,\alpha}([0,T], V) \cong \mathscr{C}_g^{0,\alpha}([0,T], V) \oplus C^{0,2\alpha}([0,T], V \otimes V) .$$
Define the (non-separable) space of weak geometric α-Hölder rough paths, $\mathscr{C}_g^{\alpha}$, as
those elements $\mathbf{X} \in \mathscr{C}^{\alpha}$ for which 2 Sym(𝕏) = X ⊗ X. Show that $\mathscr{C}_g^{0,\alpha}$ is a closed
subspace of $\mathscr{C}_g^{\alpha}$ and that
$$\mathscr{C}^{\alpha}([0,T], V) \cong \mathscr{C}_g^{\alpha}([0,T], V) \oplus C^{2\alpha}([0,T], V \otimes V) .$$
The point of this exercise is that non-geometric rough path spaces can effectively be
embedded in geometric rough path spaces.
Exercise 2.15. At least when dim V < ∞, there is not much difference between
$\mathscr{C}_g^{0,\alpha} \subset \mathscr{C}_g^{\alpha}$ in the following sense. Let 1/3 < α < β ≤ 1/2. By using the (non-trivial!)
fact that every $\mathbf{X} \in \mathscr{C}_g^{\beta}$ can be approximated uniformly by smooth paths, with
uniform β-Hölder rough path bounds, use interpolation to see that $\mathbf{X} \in \mathscr{C}_g^{0,\alpha}$; in fact,
show that one has the compact embedding
$$\mathscr{C}_g^{\beta} \hookrightarrow \mathscr{C}_g^{0,\alpha} .$$
Solution 2.16. $\mathscr{C}^{\beta} \subset \mathscr{C}^{0,\alpha}$ (and in fact a continuous embedding) is obvious from the
interpolation exercise above. The compactness of the embedding is a consequence
of Arzelà–Ascoli (use dim V < ∞). Finally, the extension to non-geometric rough
path spaces is fairly straightforward using the embedding into geometric rough path
spaces.
Exercise 2.17 (Pure area rough path). Identify $\mathbf{R}^2$ with the complex numbers and
consider
$$[0,1] \ni t \mapsto n^{-1} \exp\big(2\pi i n^2 t\big) \equiv X^n .$$
a) Set $\mathbb{X}^n_{s,t} := \int_s^t X^n_{s,r} \otimes dX^n_r$. Show that, for fixed s < t,
$$X^n_{s,t} \to 0 , \qquad \mathbb{X}^n_{s,t} \to \pi(t - s) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{2.13}$$
b) Establish the uniform bounds $\sup_n \|X^n\|_{1/2} < \infty$ and $\sup_n \|\mathbb{X}^n\|_{1} < \infty$.
c) Conclude by interpolation that (2.13) takes place in α-Hölder rough path metric
$\varrho_{\alpha}$ for any 1/3 < α < 1/2.
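Part a) can be explored numerically before attempting a proof. The following is a rough sketch only: it replaces $X^n$ by a fine piecewise linear interpolation (an assumption made here for computability), for which the iterated integral is a finite sum, and the function name `lift_circle` and the parameter `steps_per_turn` are hypothetical.

```python
import math

def lift_circle(n, s, t, steps_per_turn=200):
    """Levels 1 and 2 of X^n over [s, t], computed for a fine piecewise
    linear interpolation of the path r -> n^{-1} exp(2 pi i n^2 r)."""
    turns = n * n * (t - s)
    N = max(1, int(math.ceil(turns * steps_per_turn)))

    def path(r):
        theta = 2.0 * math.pi * n * n * r
        return (math.cos(theta) / n, math.sin(theta) / n)

    inc = [0.0, 0.0]                    # running increment X_{s, r_k}
    area = [[0.0, 0.0], [0.0, 0.0]]     # running iterated integral
    prev = path(s)
    for k in range(1, N + 1):
        cur = path(s + (t - s) * k / N)
        dx = (cur[0] - prev[0], cur[1] - prev[1])
        for i in range(2):
            for j in range(2):
                # exact contribution of one linear piece
                area[i][j] += (inc[i] + 0.5 * dx[i]) * dx[j]
        inc = [inc[0] + dx[0], inc[1] + dx[1]]
        prev = cur
    return inc, area

inc, area = lift_circle(8, 0.1, 0.7)
anti = 0.5 * (area[0][1] - area[1][0])   # antisymmetric part, as a real number
```

For n = 8 the increment `inc` is already of size at most 2/n, while `anti` is close to π(t − s), in line with (2.13); increasing n makes both effects more pronounced, at the cost of a longer loop.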
Solution 2.18. a) Obviously, $X^n_{s,t} = O(1/n) \to 0$ uniformly in s, t. Then
$$\mathbb{X}^n_{s,t} = \tfrac{1}{2} X^n_{s,t} \otimes X^n_{s,t} + A^n_{s,t} = O\big(1/n^2\big) + A^n_{s,t} ,$$
where $A^n_{s,t} \in \mathfrak{so}(2)$ is the antisymmetric part of $\mathbb{X}^n_{s,t}$. To avoid cumbersome
notation, we identify
$$\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} \in \mathfrak{so}(2) \ \leftrightarrow \ a \in \mathbf{R} .$$
$A^n_{s,t}$ then represents the signed area between the curve $(X^n_r : s \le r \le t)$ and
the straight chord from $X^n_t$ to $X^n_s$. (This is a simple consequence of Stokes’
theorem: the exterior derivative of the 1-form $\tfrac{1}{2}(x\,dy - y\,dx)$, which vanishes
along straight chords, is the volume form dx ∧ dy.) With s < t, $(X^n_r : s \le r \le t)$
makes $\lfloor n^2(t - s) \rfloor$ full spins around the origin, at radius 1/n. Each full spin
contributes area $\pi(1/n)^2$, while the final incomplete spin contributes some area
less than $\pi(1/n)^2$. The total signed area, with multiplicity, is thus
$$A^n_{s,t} = \big(n^2(t - s) + O(1)\big) \frac{\pi}{n^2} = \pi(t - s) + \frac{C_{s,t}}{n^2} ,$$
where $|C_{s,t}| \le \pi$ uniformly in s, t. It follows that
$$\mathbb{X}^n_{s,t} = \pi(t - s) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + O\big(1/n^2\big) . \tag{2.14}$$
b) The following two estimates for path increments of $n^{-1} \exp\big(2\pi i n^2 t\big) \equiv X^n_t$ hold
true:
$$\big|X^n_{s,t}\big| \le \big|\dot{X}^n\big|_{\infty} |t - s| \le n|t - s| , \qquad \big|X^n_{s,t}\big| \le 2|X^n|_{\infty} = 2/n .$$
Since $a \wedge b \le \sqrt{ab}$, it immediately follows that
$$\big|X^n_{s,t}\big| \le \sqrt{2|t - s|} ,$$
uniformly in n, s, t. In other words, $\sup_n \|X^n\|_{1/2} < \infty$. The argument for the
uniform bounds on $\mathbb{X}^n_{s,t}$ is similar. On the one hand, we have the bound (2.14).
On the other hand, we also have
$$\big|\mathbb{X}^n_{s,t}\big| = \Big| \int\!\!\int_{s<u<v<t} \dot{X}^n_u \otimes \dot{X}^n_v \, du \, dv \Big| \le \big|\dot{X}^n\big|_{\infty}^2 \frac{|t - s|^2}{2} \le \frac{n^2}{2} |t - s|^2 .$$
The required uniform bound on $\|\mathbb{X}^n\|_1$ follows by using (2.14) for $n^2|t - s| > 1$
and the above bound for $n^2|t - s| \le 1$.
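direction h is given by
$$T_h(\mathbf{X}) \overset{\text{def}}{=} \big(X^h, \mathbb{X}^h\big) ,$$
where $X^h := X + h$ and
$$\mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r} \otimes dX_r + \int_s^t X_{s,r} \otimes dh_r + \int_s^t h_{s,r} \otimes dh_r . \tag{2.15}$$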
a) Assume h is Lipschitz. (In particular, the last three integrals above are well-defined
Riemann–Stieltjes integrals.) Show that for fixed h, the translation operator
$T_h : \mathbf{X} \mapsto T_h(\mathbf{X})$ is a continuous map from $\mathscr{C}^{\alpha}$ into itself.
b) The above (Lipschitz) assumption on h is equivalently expressed by saying that
h ∈ W 1,∞ , where W 1,q denotes the space of absolutely continuous paths h
with derivative ḣ ∈ Lq . Weaken the assumption on h by only requiring ḣ ∈ Lq ,
for suitable q = q(α). Show that q = 2 (“Cameron–Martin paths of Brownian
motion”) works for all α ≤ 1/2. As a matter of fact, the integrals appearing in
(2.15) make sense for every q ≥ 1, but the resulting translated “rough path” would
not necessarily lie in C α .
2.6 Comments
The notion of rough path is due to Lyons and was introduced in [Lyo98]. Rather
than using Hölder-type norms, the original article introduced rough paths in the
p-variation sense for any p ∈ [1, ∞). For p ≥ 3 (corresponding to α < 1/3), this requires
additional $[p]^{\text{th}}$ order information. Various notes by Lyons preceding [Lyo98] already
dealt with α-Hölder rough paths for α ∈ (1/3, 1/2].
In the recent literature, elements in Cgα are actually called weakly geometric
(α-Hölder) rough paths. In contrast, the space of geometric rough paths Cg0,α is, by
definition, obtained via completion of smooth paths in %α . We do not insist on this
terminology here and indeed, by Proposition 2.5 there is not much difference. In the
early literature the two concepts were somewhat blurred; matters were clarified in
[FV06a].
Chapter 3
Brownian motion as a rough path
Abstract In this chapter, we consider the most important example of a rough path,
which is the one associated to Brownian motion. We discuss the difference, at the
level of rough paths, between Itô and Stratonovich Brownian motion. We also provide
a natural example of approximation to Brownian motion which converges to neither
of them.
The integration here is understood either in Itô- or Stratonovich sense (in the latter
case, we would write ◦dB); sometimes we indicate this by writing BItô resp. BStrat .
It should be noted that the antisymmetric part of B, which is nothing but Lévy’s
stochastic area and takes values in so(d), is not affected by the choice of stochastic
integration. Condition (2.1) is seen to be valid with either choice, while condition
(2.5) only holds in the Stratonovich case. We now address the question of α- resp. 2α-
Hölder regularity of X resp. X by a suitable extension of the classical Kolmogorov
criterion; the application to Brownian motion is then carried out in detail in the
following subsection.
Recalling that B ∈ C α , a.s. for any α < 1/2, we now address the question of
2α-Hölder regularity for B.
Theorem 3.1 (Kolmogorov criterion for rough paths). Let q ≥ 2, β > 1/q. Assume,
for all s, t in [0, T],
$$|X_{s,t}|_{L^q} \le C|t - s|^{\beta} , \qquad |\mathbb{X}_{s,t}|_{L^{q/2}} \le C|t - s|^{2\beta} , \tag{3.2}$$
for some constant C < ∞. Then, for all α ∈ [0, β − 1/q), there exists a modification
of (X, 𝕏) (also denoted by (X, 𝕏)) and random variables $K_{\alpha} \in L^q$, $\mathbb{K}_{\alpha} \in L^{q/2}$
such that, for all s, t in [0, T],
$$|X_{s,t}| \le K_{\alpha}(\omega)|t - s|^{\alpha} , \qquad |\mathbb{X}_{s,t}| \le \mathbb{K}_{\alpha}(\omega)|t - s|^{2\alpha} . \tag{3.3}$$
In particular, if β − 1/q > 1/3 then, for every α ∈ (1/3, β − 1/q), we have (X, 𝕏) ∈ $\mathscr{C}^{\alpha}$
a.s.
Proof. The proof is almost the same as the classical proof of Kolmogorov’s continuity
criterion, as exposed for example in [RY91]. Without loss of generality take T = 1
and let $D_n$ denote the set of integer multiples of $2^{-n}$ in [0, 1). As in the usual
criterion, it suffices to consider $s, t \in \bigcup_n D_n$, with the values at the remaining times
filled in using continuity. (This is why in general one ends up with a modification.)
Note that the number of elements in $D_n$ is given by $\#D_n = 1/|D_n| = 2^n$. Set
$$K_n = \sup_{t \in D_n} \big|X_{t,t+2^{-n}}\big| , \qquad \mathbb{K}_n = \sup_{t \in D_n} \big|\mathbb{X}_{t,t+2^{-n}}\big| .$$
$$E\big(K_n^q\big) \le E \sum_{t \in D_n} \big|X_{t,t+2^{-n}}\big|^q \le \frac{1}{|D_n|}\, C^q |D_n|^{\beta q} = C^q |D_n|^{\beta q - 1} ,$$
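where $(\tau_i, \tau_{i+1}) \in D_n$ for some n ≥ m + 1, and for each fixed n ≥ m + 1 there are at
most two such intervals taken from $D_n$. It follows that
$$|X_{s,t}| \le \max_{0 \le i < N} \big|X_{s,\tau_{i+1}}\big| \le \sum_{i=0}^{N-1} \big|X_{\tau_i,\tau_{i+1}}\big| \le 2 \sum_{n \ge m+1} K_n ,$$
and similarly,
$$\begin{aligned}
|\mathbb{X}_{s,t}| &= \Big| \sum_{i=0}^{N-1} \big( \mathbb{X}_{\tau_i,\tau_{i+1}} + X_{s,\tau_i} \otimes X_{\tau_i,\tau_{i+1}} \big) \Big| \le \sum_{i=0}^{N-1} \Big( \big|\mathbb{X}_{\tau_i,\tau_{i+1}}\big| + |X_{s,\tau_i}| \big|X_{\tau_i,\tau_{i+1}}\big| \Big) \\
&\le \sum_{i=0}^{N-1} \big|\mathbb{X}_{\tau_i,\tau_{i+1}}\big| + \max_{0 \le i < N} \big|X_{s,\tau_{i+1}}\big| \sum_{j=0}^{N-1} \big|X_{\tau_j,\tau_{j+1}}\big| \\
&\le 2 \sum_{n \ge m+1} \mathbb{K}_n + \Big( 2 \sum_{n \ge m+1} K_n \Big)^2 .
\end{aligned}$$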
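We thus obtain
$$\frac{|X_{s,t}|}{|t - s|^{\alpha}} \le \sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{\alpha}}\, 2K_n \le \sum_{n \ge m+1} \frac{2K_n}{|D_n|^{\alpha}} \le K_{\alpha} ,$$
where $K_{\alpha} := 2 \sum_{n \ge 0} K_n / |D_n|^{\alpha}$ is in $L^q$. Indeed, since α < β − 1/q by assumption
and $|D_n|$ to any positive power is summable, we have
$$\|K_{\alpha}\|_{L^q} \le \sum_{n \ge 0} \frac{2}{|D_n|^{\alpha}} \big| E(K_n^q) \big|^{1/q} \le \sum_{n \ge 0} \frac{2C}{|D_n|^{\alpha}} |D_n|^{\beta - 1/q} < \infty .$$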
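Similarly,
$$\frac{|\mathbb{X}_{s,t}|}{|t - s|^{2\alpha}} \le \sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{2\alpha}}\, 2\mathbb{K}_n + \Big( \sum_{n \ge m+1} \frac{1}{|D_{m+1}|^{\alpha}}\, 2K_n \Big)^2 \le \mathbb{K}_{\alpha} + K_{\alpha}^2 ,$$
where $\mathbb{K}_{\alpha} := 2 \sum_{n \ge 0} \mathbb{K}_n / |D_n|^{2\alpha}$ is in $L^{q/2}$. Indeed,
$$\|\mathbb{K}_{\alpha}\|_{L^{q/2}} \le \sum_{n \ge 0} \frac{2}{|D_n|^{2\alpha}} \Big| E\big(\mathbb{K}_n^{q/2}\big) \Big|^{2/q} \le \sum_{n \ge 0} \frac{2C}{|D_n|^{2\alpha}} |D_n|^{2\beta - 2/q} < \infty ,$$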
The reader will notice that the classical Kolmogorov criterion (KC) is contained
in the above proof and theorem by simply ignoring all considerations related to the
second-order process X. Let us also note in this context that the classical KC works
for processes (Xt : 0 ≤ t ≤ 1) with values in an arbitrary (separable) metric space
(it suffices to replace |Xs,t | by d(Xs , Xt ) in the argument). This observation actually
gives an alternative and immediate proof of Theorem 3.1, at least for “geometric”
(X, 𝕏), i.e. in the presence of the algebraic constraint (2.5), and at the price of some
Lie group language. The key observation, as discussed in Section 2.3, is that
$t \mapsto \mathbf{X}_t := (1, X_{0,t}, \mathbb{X}_{0,t})$ takes values in the step-2 nilpotent group with d generators,
$\big(G^{(2)}(\mathbf{R}^d), \otimes\big)$, endowed with the Carnot–Carathéodory metric
$$d_C(\mathbf{X}_s, \mathbf{X}_t) \asymp |X_{s,t}| + |\mathbb{X}_{s,t}|^{1/2} .$$
The assumptions of Theorem 3.1 then translate precisely to $|d_C(\mathbf{X}_s, \mathbf{X}_t)|_{L^q} \le
C|t - s|^{\beta}$, and the same conclusion, $d_C(\mathbf{X}_s, \mathbf{X}_t) \le K_{\alpha}(\omega)|t - s|^{\alpha}$, for $K_{\alpha} \in L^q$, is
obtained from the classical Kolmogorov criterion.
Remark 3.2 (Warning). It is not possible to obtain (3.3) by applying the classical
KC to the (V ⊗ V)-valued process $(\mathbb{X}_{0,t} : 0 \le t \le T)$. Doing so only gives
$|\mathbb{X}_{s,t}| = O(|t - s|^{\alpha})$ a.s. since one misses a crucial cancellation inherent in (cf. (2.1))
That said, it is possible (but tedious) to use a 2-parameter version of the KC to see
that $(s,t) \mapsto \mathbb{X}_{s,t}/|t - s|^{2\alpha}$ admits a continuous modification. In particular, this then
implies that $\|\mathbb{X}\|_{2\alpha}$ is finite almost surely. In the Brownian setting, this was carried
out in [Fri05].
Here is a similar result for rough path distances, say between $\mathbf{X}$ and $\tilde{\mathbf{X}}$. Note
that, due to the nonlinear structure of rough path spaces, one cannot simply apply
Theorem 3.1 to the “difference” of two rough paths. Indeed, $\tilde{\mathbf{X}} - \mathbf{X}$ is not defined in
general for, formally, one misses the information about the mixed integrals in
$$\tilde{\mathbf{X}} - \mathbf{X} = \Big( \tilde{X} - X, \int \big(\tilde{X} - X\big) \otimes d\big(\tilde{X} - X\big) \Big) .$$
Even when all of these expressions are well-defined, say when $\tilde{X}$ is smooth,
convergence of the right-hand side above to zero is different from saying that
$$\tilde{X} \to X , \qquad \int \tilde{X} \otimes d\tilde{X} \to \int X \otimes dX ,$$
and it is this type of convergence (in suitable Hölder-type norms) which our rough
path metric $\varrho_{\alpha}$ expresses.
Proof. The proof is a straightforward modification of the proof of Theorem 3.1 and
is left as an exercise to the reader. □
$$\{ \mathbf{X}^n \equiv (X^n, \mathbb{X}^n) : 1 \le n \le \infty \} ,$$
such that the moment conditions in the statement of KC hold with a constant
C, uniformly over 1 ≤ n ≤ ∞. Application of the above with $\varepsilon = \varepsilon_n$ then gives
$L^q$-rates of convergence,
$$|\varrho_{\alpha}(\mathbf{X}^n, \mathbf{X})|_{L^q} \lesssim \varepsilon_n .$$
Of course, when $\varepsilon_n$ decays sufficiently fast, a Borel–Cantelli argument also gives
almost-sure convergence with suitable rates.
Proposition 3.4. For any α ∈ (1/3, 1/2) and T > 0 with probability one,
Proof. Using Brownian scaling and finite moments of $\mathbb{B}_{0,1}$, which are immediate
from integrability properties of the (homogeneous) second Wiener–Itô chaos, the KC
for rough paths applies with β = 1/2 and all q < ∞. (As an exercise, the reader may
want to show finite moments of $\mathbb{B}_{0,1}$ without chaos arguments; an elementary way to
do so is via conditioning, Itô isometry, and reflection principle.) □
Observe that Brownian motion enhanced with its iterated Itô integrals (2nd order
calculus!) yields a (random) rough path but not a geometric rough path which is, by
definition, an object with hardwired first order behaviour. Indeed, Itô’s formula yields
the identity
$$d(B^i B^j) = B^i \, dB^j + B^j \, dB^i + d\langle B^i, B^j \rangle_t , \qquad i, j = 1, \ldots, d ,$$
so that, writing I for the identity matrix in d dimensions, we have for s < t,
$$\operatorname{Sym}\big(\mathbb{B}^{\text{Itô}}_{s,t}\big) = \tfrac{1}{2} B_{s,t} \otimes B_{s,t} - \tfrac{1}{2} I (t - s) \ne \tfrac{1}{2} B_{s,t} \otimes B_{s,t} ,$$
in contradiction with (2.5).
Let us also mention that Brownian motion with values in infinite-dimensional
spaces can also be lifted to rough paths, see the exercise section.
and has the advantage of a first order calculus. For instance, one has the first order
product rule
d(M N ) = M ◦ dN + N ◦ dM.
One can then define $\mathbb{B}^{\text{Strat}}$ by (component-wise) Stratonovich integration of Brownian
motion against itself. Using basic results on quadratic variation between Brownian
motions ($d\langle B^i, B^j \rangle_t = \delta^{i,j} dt$ where $\delta^{i,j} = 1$ if i = j, zero else), we see that
$$\mathbb{B}^{\text{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + \tfrac{1}{2} I (t - s) , \tag{3.5}$$
where I stands for the identity matrix. Note that the difference between BStrat and BItô
is symmetric, so that the antisymmetric parts of the two processes (Lévy’s stochastic
area) are identical.
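The relation (3.5) has a transparent discrete analogue which can be simulated. The sketch below is an illustration under stated assumptions rather than a construction from the text: Brownian motion is replaced by a Gaussian random walk on a uniform grid, the Itô integral by left-point Riemann sums and the Stratonovich integral by midpoint sums; the function name `brownian_lifts` and its parameters are hypothetical.

```python
import random

def brownian_lifts(n_steps=20000, T=1.0, d=2, seed=0):
    """Discrete Ito (left-point) and Stratonovich (midpoint) iterated sums
    of a d-dimensional Gaussian random walk approximating B on [0, T]."""
    rng = random.Random(seed)
    sd = (T / n_steps) ** 0.5
    B = [0.0] * d
    ito = [[0.0] * d for _ in range(d)]
    strat = [[0.0] * d for _ in range(d)]
    for _ in range(n_steps):
        dB = [rng.gauss(0.0, sd) for _ in range(d)]
        for i in range(d):
            for j in range(d):
                ito[i][j] += B[i] * dB[j]                     # left-point sum
                strat[i][j] += (B[i] + 0.5 * dB[i]) * dB[j]   # midpoint sum
        B = [B[i] + dB[i] for i in range(d)]
    return B, ito, strat

B, ito, strat = brownian_lifts()
# Discrete counterpart of (3.5): strat - ito = (1/2) * sum of dB (x) dB.
gap = [[strat[i][j] - ito[i][j] for j in range(2)] for i in range(2)]
```

With these sums, `gap` is half the realised quadratic covariation and hence close to ½ I T, the antisymmetric parts of `ito` and `strat` agree, and the symmetric part of `strat` equals ½ B ⊗ B up to rounding, mirroring the discussion of Lévy's area and of (2.5).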
3.3 Stratonovich Brownian motion 33
Proposition 3.5. For any α ∈ (1/3, 1/2) and with probability one,
and upon conditioning with respect to $\sigma\{B_{kT2^{-n}} : 0 \le k \le 2^n\}$, the same bounds
hold for $B^{(n);i}$ and for $\int_0^{\cdot} B^{(n);i} \, dB^{(n);j}$. In fact, $K_{\alpha}$, $\mathbb{K}_{\alpha}$ have (more than enough)
integrability to apply Doob’s maximal inequality. This leads, with probability one, to
the bound
$$\sup_n \Big\| \Big( B^{(n)}, \int_0^{\cdot} B^{(n)} \otimes dB^{(n)} \Big) \Big\|_{2\alpha} < \infty .$$
The reader should be warned that there are perfectly smooth and uniform
approximations to Brownian motion, which do not converge to Stratonovich enhanced
Brownian motion, but instead to some different geometric (random) rough path, such
as
$$\bar{\mathbf{B}} = \big(B, \bar{\mathbb{B}}\big) , \quad \text{where } \bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\text{Strat}}_{s,t} + (t - s) A , \quad A \in \mathfrak{so}(d) .$$
Note that the difference between $\bar{\mathbb{B}}$ and $\mathbb{B}^{\text{Strat}}$ is now antisymmetric, i.e. $\bar{\mathbf{B}}$ has a
stochastic area that is different from Lévy’s area. To construct such approximations,
it suffices to include oscillations (at small scales) such as to create the desired
effect in the area, while they do not affect the limiting path, see Exercise 2.17.
(In the context of Brownian motion and SDEs driven by Brownian motion such
approximations were studied by McShane, Ikeda–Watanabe and others, see [McS72,
IW89].) Although such “twisted” approximations do not seem to be the most obvious
way to approximate Brownian motion, they also arise naturally in some perfectly
reasonable situations.
Newton’s second law for a particle in R3 with mass m, and position x = x(t), (for
simplicity: constant) frictions α1 , α2 , α3 > 0 in orthonormal directions, subject
to a (3-dimensional) white noise in time, i.e. the distributional derivative of a 3-
dimensional Brownian motion B, reads
that is proportional to the strength of the magnetic field, the component of the velocity
that is perpendicular to the magnetic field and the charge of the particle. In terms
of our assumptions, this simply means that a non-zero antisymmetric component is
added to M . We shall hence drop the assumption of symmetry, and instead consider
for M a general square matrix with
Note that these second order dynamics can be rewritten as an evolution equation for the
momentum $p(t) = m\dot{x}(t)$,
$$\dot{p} = -M\dot{x} + \dot{B} = -\frac{1}{m} M p + \dot{B} .$$
As we shall see $X = X^m$, indexed by “mass” m, converges in a quite non-trivial
way to Brownian motion on the level of rough paths. In fact, the correct limit in
rough path sense is $\bar{\mathbf{B}} = (B, \bar{\mathbb{B}})$, where
$$\bar{\mathbb{B}}_{s,t} = \mathbb{B}^{\text{Strat}}_{s,t} + (t - s) A , \tag{3.7}$$
Theorem 3.8. Let $M \in \mathbf{R}^{d \times d}$ be a square matrix in dimension d such that all its
eigenvalues have strictly positive real part. Let B be a d-dimensional standard
Brownian motion, m > 0, and consider the stochastic differential equations
$$dX = \frac{1}{m} P \, dt , \qquad dP = -\frac{1}{m} M P \, dt + dB ,$$
with zero initial position X and momentum P. Then, for any q ≥ 1 and α ∈
(1/3, 1/2), as mass m → 0,
$$\Big( M X, \int M X \otimes d(M X) \Big) \to \bar{\mathbf{B}} \quad \text{in } \mathscr{C}^{\alpha} \text{ and } L^q .$$
Y ε = P/ε.
By assumption, there exists λ > 0 such that the real part of every eigenvalue of M
is (strictly) bigger than λ. For later reference, we note that this implies the estimate
$|\exp(-\tau M)| = O(\exp(-\lambda\tau))$ as τ → ∞. For fixed ε, define the Brownian motion
$\tilde{B}_t = \varepsilon B_{\varepsilon^{-2} t}$, and consider the SDEs
Note that the law of the solutions does not depend on ε. Furthermore, when solved
with identical initial data, we have pathwise equality
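For each t (and in particular for t = 0), the law of $\tilde{Y}^{\text{stat}}_t$ is precisely ν. We then see
that
$$\Sigma = E\big( \tilde{Y}^{\text{stat}}_0 \otimes \tilde{Y}^{\text{stat}}_0 \big) = \int_{-\infty}^{0} e^{-M(-s)} e^{-M^*(-s)} \, ds = \int_0^{\infty} e^{-Ms} e^{-M^* s} \, ds .$$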
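for all reasonable test functions f; we shall only use it for quadratics. (Footnote 1:
As found e.g. in textbooks by Stroock [Str11] or Kallenberg [Kal02]. Test functions are
usually assumed to be bounded, but by a truncation argument in our setting, this is
easily extended to quadratics.) Using $dX^{\varepsilon} = \varepsilon^{-1} Y^{\varepsilon} dt$ we can then write
$$\begin{aligned}
\int_0^t M X^{\varepsilon}_s \otimes d(M X^{\varepsilon})_s &= \int_0^t M X^{\varepsilon}_s \otimes dB_s - \varepsilon \int_0^t M X^{\varepsilon}_s \otimes dY^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \varepsilon \int_0^t d(M X^{\varepsilon})_s \otimes Y^{\varepsilon}_s \\
&= \int_0^t M X^{\varepsilon}_s \otimes dB_s - M X^{\varepsilon}_t \otimes (\varepsilon Y^{\varepsilon}_t) + \int_0^t M Y^{\varepsilon}_s \otimes Y^{\varepsilon}_s \, ds \\
&\to \int_0^t B_s \otimes dB_s - 0 + t \int (M y \otimes y) \, \nu(dy) \\
&= \int_0^t B_s \otimes dB_s + t M \Sigma = \mathbb{B}^{\text{Strat}}_{0,t} + t \big( M \Sigma - \tfrac{1}{2} I \big) ,
\end{aligned}$$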
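we see that $M\Sigma - \tfrac{1}{2} I$ has symmetric part 0, i.e. is antisymmetric, and hence also
equals $\tfrac{1}{2}(M\Sigma - \Sigma M^*)$. This settles pointwise convergence, in the sense that
$$S(M X^{\varepsilon})_t := \Big( M X^{\varepsilon}_t , \int_0^t M X^{\varepsilon}_s \otimes d(M X^{\varepsilon})_s \Big) \to \big( B_t , \bar{\mathbb{B}}_{0,t} \big) .$$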
0
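Step 2. (Uniform rough path bounds in $L^q$.) We claim that, for any q < ∞,
$$\sup_{\varepsilon \in (0,1]} E\big[ \|M X^{\varepsilon}\|_{\alpha}^q \big] < \infty , \qquad \sup_{\varepsilon \in (0,1]} E\Big[ \Big\| \int M X^{\varepsilon} \otimes d(M X^{\varepsilon}) \Big\|_{2\alpha}^q \Big] < \infty ,$$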
Since X is Gaussian, it follows from integrability properties of the first two Wiener–Itô
chaoses that it is enough to show these bounds for q = 2. Furthermore, we note
that the desired estimates are a consequence of the bounds
$$E\big[ |\tilde{X}_{s,t}|^2 \big] \lesssim |t - s| , \tag{3.10}$$
$$E\Big[ \Big| \int_s^t \tilde{X}_{s,u} \otimes d\tilde{X}_u \Big|^2 \Big] \lesssim |t - s|^2 , \tag{3.11}$$
where the implied proportionality constants are uniform over t, s ∈ (0, ∞). Indeed,
this follows directly from writing
$$E\big[ |X^{\varepsilon}_{s,t}|^2 \big] = E\big[ |\varepsilon \tilde{X}_{\varepsilon^{-2}s, \varepsilon^{-2}t}|^2 \big] \lesssim \varepsilon^2 \big| \varepsilon^{-2} t - \varepsilon^{-2} s \big| = |t - s| ,$$
(note the uniformity in ε), and similarly for the second moment of the iterated
integral.
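In order to check (3.10), it is enough to note that $M \tilde{X}_{s,t} = \tilde{B}_{s,t} - \tilde{Y}_{s,t}$, combined
with the estimate
$$E\big[ |\tilde{Y}_{s,t}|^2 \big] = E\big[ \big| (e^{-M(t-s)} - I) \tilde{Y}_s \big|^2 \big] + \int_s^t \operatorname{Tr}\big( e^{-Mu} e^{-M^* u} \big) \, du \lesssim |t - s| ,$$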
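where we used the fact that Real{σ(M)} ⊂ (0, ∞) to get a uniform bound. In order
to control (3.11), we consider one of the components and write
$$\begin{aligned}
E\Big[ \Big| \int_s^t \tilde{X}^i_{s,u} \, d\tilde{X}^j_u \Big|^2 \Big] &= E\Big[ \Big| \int_s^t \int_s^u \tilde{Y}^i_r \tilde{Y}^j_u \, dr \, du \Big|^2 \Big] \\
&= \int_{[s,t]^4} E\big[ \tilde{Y}^i_r \tilde{Y}^j_u \tilde{Y}^i_q \tilde{Y}^j_v \big] 1_{\{r \le u;\, q \le v\}} \, dr \, du \, dq \, dv \\
&\le \int_{[s,t]^4} \Big( E\big[ \tilde{Y}^i_r \tilde{Y}^j_u \big] E\big[ \tilde{Y}^i_q \tilde{Y}^j_v \big] + E\big[ \tilde{Y}^i_r \tilde{Y}^i_q \big] E\big[ \tilde{Y}^j_u \tilde{Y}^j_v \big] + E\big[ \tilde{Y}^i_r \tilde{Y}^j_v \big] E\big[ \tilde{Y}^j_u \tilde{Y}^i_q \big] \Big) \, dr \, du \, dq \, dv \\
&\lesssim \Big( \int_{[s,t]^2} \big| E\big[ \tilde{Y}_r \otimes \tilde{Y}_u \big] \big| \, dr \, du \Big)^2 \\
&\lesssim \Big( \int_{[s,t]^2} \big| E\big[ \tilde{Y}_r \otimes \tilde{Y}_u \big] \big| 1_{\{r \le u\}} \, dr \, du \Big)^2 ,
\end{aligned}$$
where we have used the fact that $\tilde{Y}$ is Gaussian (which yields Wick’s formula for the
expectation of products) in order to get the bound on the third line. But for r ≤ u,
$E\big[ \tilde{Y}_u \,\big|\, \tilde{Y}_r \big] = e^{-M(u-r)} \tilde{Y}_r$, so that
$$\int_{[s,t]^2} \big| E\big[ \tilde{Y}_r \otimes \tilde{Y}_u \big] \big| 1_{\{r \le u\}} \, dr \, du = \int_{[s,t]^2} \big| E\big[ \tilde{Y}_r \otimes e^{-M(u-r)} \tilde{Y}_r \big] \big| 1_{\{r \le u\}} \, dr \, du \lesssim \int_s^t \int_r^t e^{-\lambda(u-r)} \, du \, E\big| \tilde{Y}_r \big|^2 \, dr \lesssim |t - s| .$$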
order to find such cubature formulae, the mandatory first step, on which we focus
here, is the computation of the expectations of the n-fold iterated integrals²
$$E\Big( \int_{0<t_1<\dots<t_n<T} \circ dB \otimes \dots \otimes \circ dB \Big) .$$
Let us combine all of these integrals into one single object (also called the “signature”
of Brownian motion) by writing
$$S(B)_{0,T} = 1 + \sum_{n \ge 1} \int_{0<t_1<\dots<t_n<T} \circ dB \otimes \dots \otimes \circ dB .$$
variable. Then
$$E S(B)_{0,T} = \exp\Big( \frac{T}{2} \sum_{i=1}^{d} e_i \otimes e_i \Big) .$$
Proof. (Shekhar) Set $\varphi_t := E S(B)_{0,t}$. (It is not hard to see, by Wiener–Itô chaos
integrability or otherwise, that all involved iterated integrals are integrable so that ϕ
is well-defined.) By Chen’s formula (in its general form, see Exercise 2.6) and the
independence of Brownian increments, one has the identity
$$\varphi_{t+s} = \varphi_t \otimes \varphi_s .$$
For integers m, n we have $\log \varphi_m = n \log \varphi_{m/n}$ and $\log \varphi_m = m \log \varphi_1$. It follows
that
$$\log \varphi_t = t \log \varphi_1 ,$$
first for t = m/n ∈ Q, then for any real t by continuity. On the other hand, for t > 0,
Brownian scaling implies that $\varphi_t = \delta_{\sqrt{t}}\, \varphi_1$ where $\delta_{\lambda}$ is the dilatation operator, which
acts by multiplication with $\lambda^n$ on the nth tensor level, $(\mathbf{R}^d)^{\otimes n}$. Since $\delta_{\lambda}$ commutes
with ⊗ (and thus also with log, defined as power series),
Recall that in this expression, “1” is identified with (1, 0, 0) in the truncated tensor
algebra, and similarly for the other summands, and addition also takes place in
$T^{(2)}(\mathbf{R}^d)$. Taking the logarithm (in the tensor algebra truncated beyond level 2; in
this case $\log(1 + a + b) = a + b - \tfrac{1}{2} a \otimes a$ if a is a 1-tensor, b a 2-tensor) then
immediately gives the desired identification. □
Theorem 3.10 (Kolmogorov tightness criterion for rough paths). Let q ≥ 2, β >
1/q. Assume, for all s, t in [0, T],
$$E_n\big( |X^n_{s,t}|^q \big) \le C|t - s|^{\beta q} , \qquad E_n\big( |\mathbb{X}^n_{s,t}|^{q/2} \big) \le C|t - s|^{\beta q} , \tag{3.12}$$
for some constant C < ∞. Assume β − 1/q > 1/3. Then for every α ∈ (1/3, β − 1/q), the
$\mathbf{X}^n$’s are tight in $\mathscr{C}^{0,\alpha}$.
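In typical applications, the $X^n$ are only defined for discrete times, such as s =
j/n, t = k/n for integers j, k. The non-trivial work then consists, for a suitable
choice of $\mathbb{X}^n$, in checking the following discrete tightness estimates,
$$E_n\Big( \big| X^n_{\frac{j}{n},\frac{k}{n}} \big|^q \Big) \le C \Big| \frac{j - k}{n} \Big|^{\beta q} , \qquad E_n\Big( \big| \mathbb{X}^n_{\frac{j}{n},\frac{k}{n}} \big|^{q/2} \Big) \le C \Big| \frac{j - k}{n} \Big|^{\beta q} . \tag{3.13}$$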
Proposition 3.11. Consider a d-dimensional random walk $(X_j : j \in \mathbf{N})$, with i.i.d.
increments of zero mean, finite moments of any order q < ∞, and unit covariance
matrix. Extend the rescaled random walk
$$X^n_{\frac{j}{n}} := \frac{1}{\sqrt{n}} X_j ,$$
defined on discrete times only, by piecewise linear interpolation to all times and
construct $\mathbf{X}^n = (X^n, \mathbb{X}^n)$ by iterated (Riemann–Stieltjes) integration. Then the
tightness estimates in Theorem 3.10 hold with β = 1/2 and all q < ∞.
Proof. The iterated integrals of a linear (or affine) path with increment $v \in \mathbf{R}^d$ take
the simple form exp(v) in terms of the tensor exponential introduced in (2.8). Chen’s
relation then implies
The simple calculus on the level-2 tensor algebra $T^{(2)}(\mathbf{R}^d)$ leads to an explicit
expression for $\mathbb{X}^n_{\frac{j}{n},\frac{k}{n}}$, to which one can apply the (discrete) Burkholder–Davis–Gundy
inequality in order to get the discrete tightness estimates (3.13). The extension to
all times is straightforward. Details are left to the reader (see e.g. [BF13]). An
alternative argument, not restricted to level 2, is found in [BFH09]. □
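The first two sentences of the proof translate directly into a computation: each linear piece contributes a tensor exponential, and Chen's relation multiplies them together. The following is a minimal sketch under the assumption that elements of $T^{(2)}(\mathbf{R}^d)$ are stored as triples (scalar, vector, matrix); the names `tensor_exp`, `mul`, `lift_walk` and `rescale` are illustrative and do not come from the text.

```python
def tensor_exp(v):
    """Level-2 signature of a linear path with increment v: 1 + v + v(x)v/2."""
    d = len(v)
    return (1.0, list(v),
            [[0.5 * v[i] * v[j] for j in range(d)] for i in range(d)])

def mul(x, y):
    """Product in the level-2 truncated tensor algebra."""
    a1, b1, c1 = x
    a2, b2, c2 = y
    d = len(b1)
    b = [a1 * b2[i] + a2 * b1[i] for i in range(d)]
    c = [[a1 * c2[i][j] + a2 * c1[i][j] + b1[i] * b2[j] for j in range(d)]
         for i in range(d)]
    return (a1 * a2, b, c)

def lift_walk(steps):
    """Chen's relation: signature of the piecewise linear path whose
    consecutive increments are the given steps."""
    d = len(steps[0])
    total = (1.0, [0.0] * d, [[0.0] * d for _ in range(d)])
    for v in steps:
        total = mul(total, tensor_exp(v))
    return total

def rescale(x, n):
    """Dilation by n^(-1/2): level 1 scales like n^(-1/2), level 2 like 1/n."""
    _, b, c = x
    return (1.0, [v / n ** 0.5 for v in b], [[v / n for v in row] for row in c])

# Four unit steps tracing a counterclockwise unit square.
square = lift_walk([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
scaled = rescale(square, 4)
```

The closed square has zero first-level increment and a purely antisymmetric second level equal to its enclosed area, which makes it a convenient test case; for an actual random walk one would pass the sampled increments to `lift_walk` and then apply `rescale` with the number n of steps per unit time.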
Theorem 3.12. In the rescaled random walk setting of Proposition 3.11, and under
the additional assumption that E(X ⊗ X) = I, we have the weak convergence
$$\mathbf{X}^n \Longrightarrow \mathbf{B}^{\text{Strat}}$$
3.7 Exercises
Exercise 3.14. Bypass the use of Wiener–Itô chaos integrability in Proposition 3.4
by showing directly that the matrix-valued random variable $\mathbb{B}^{\text{Itô}}_{0,1}$ has moments of all
orders. Hint: this is trivial for the on-diagonal entries, for the off-diagonal entries
one can argue via conditioning, Itô isometry, and reflection principle.
Exercise 3.15. Show that d-dimensional Brownian motion B enhanced with Lévy’s
stochastic area is a degenerate diffusion process and find its generator.
exists a.s. and in L2 , uniformly on compacts and so defines X with values in H ⊗HS H,
the closure of the algebraic tensor product H ⊗a H under the Hilbert–Schmidt norm.
Consider both the case of Itô and Stratonovich integration and verify that with either
choice, (X, X) ∈ C α a.s. for any α < 1/2.
a) Verify that exactness holds with γ = 1/2 whenever dim V < ∞. (More generally,
exactness with γ = 1/2 always holds true if one works with the injective tensor
product space, V ⊗inj V , the injective norm being the smallest possible. For the
largest possible norm, the projective norm, the o(N )-estimate remains true but
can be as slow as one wishes; exactness may then fail; cf. [LLQ02]. Exactness
of the usual Wiener-space, with uniform or Hölder norm, is also known to be true.)
b) Fix α < 1/2. Show that dyadic piecewise linear approximations $B^n$, enhanced
with $\mathbb{B}^n = \int B^n \otimes dB^n$, converge in α-Hölder rough path metric to a limit
$\mathbf{B}$ in $\mathscr{C}^{\alpha}([0,T], V)$. More precisely, use the previous exercise to show that the
sequence $\mathbf{B}^n = (B^n, \mathbb{B}^n)$ is Cauchy in the sense that
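Solution 3.18. We only sketch the main step in the proof of b). Without loss of
generality, we set T = 1. The crux of the matter is to show that $\mathbb{B}^n_{0,1}$ converges in
V ⊗ V. The rest follows from scaling and equivalence of moments in the first two
Wiener chaoses. Set $t^n_k = k/2^n$. Then
$$\begin{aligned}
\big\| \mathbb{B}^{n+1}_{0,1} - \mathbb{B}^{n}_{0,1} \big\|^2_{L^2} &\sim E \Big\| \sum_{k=1}^{2^n} B_{t^{n+1}_{2k-2}, t^{n+1}_{2k-1}} \otimes B_{t^{n+1}_{2k-1}, t^{n+1}_{2k}} \Big\|^2_{V \otimes V} \\
&\sim \frac{1}{2^{2n+2}} E \Big\| \sum_{k=1}^{2^n} 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-2}, t^{n+1}_{2k-1}} \otimes 2^{\frac{n+1}{2}} B_{t^{n+1}_{2k-1}, t^{n+1}_{2k}} \Big\|^2_{V \otimes V} \\
&\sim 2^{-2n-2} E \Big\| \sum_{k=1}^{2^n} G_k \otimes \tilde{G}_k \Big\|^2_{V \otimes V} \\
&\lesssim 2^{-2n-2}\, 2^{2\gamma n} \\
&\sim 2^{-2n(1-\gamma)} ,
\end{aligned}$$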
Show that the area correction of $X^m$, in the (small mass) limit m → 0, is given
by
$$\frac{\alpha}{2(1 + \alpha^2)} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} .$$
(This correction is computed by multiscale / homogenisation techniques in the book
[PS08].)
Exercise 3.22. Consider $X_t = bt + \sigma B_t$ where $b \in \mathbf{R}^d$, $a = \sigma\sigma^* \in (\mathbf{R}^d)^{\otimes 2}$. In other
words, X is a Lévy process with triplet (a, b, 0). Show that the expected signature of
Here, the exponential should be interpreted as the exponential in the tensor algebra,
i.e.
$$\exp(u) = 1 + u + \frac{1}{2!}\, u \otimes u + \frac{1}{3!}\, u \otimes u \otimes u + \dots$$
Exercise 3.23 (Expected signature for Lévy processes [FS12b]). Consider a compound
Poisson process Y with intensity λ and jumps distributed like J = J(ω) ∼ ν.
In other words, Y is Lévy with triplet (0, 0, K) where the Lévy measure is given by
K = λν. A sample path of Y gives rise to a piecewise linear, continuous path, simply
by connecting $J_1$, $J_1 + J_2$ etc. Show that, under a suitable integrability condition for
J,
$$E S(Y)_{0,T} = \exp\big( T \lambda\, E(e^{J} - 1) \big) .$$
Can you handle the case of a general Lévy process?
Exercise 3.24 (Level-3 cubature formula). Define a measure µ on $C([0,1], \mathbb{R}^d)$ by
±1
Call the resulting process (Xt (ω) : t ∈ [0, 1]) and compute the expected signature
up to level 3, that is
$$\mathbb{E}\Big(1,\ X_{0,1},\ \int_{0<t_1<t_2<1} dX_{t_1}\otimes dX_{t_2},\ \int_{0<t_1<t_2<t_3<1} dX_{t_1}\otimes dX_{t_2}\otimes dX_{t_3}\Big).$$
Then,
$$\int_{0<t_1<t_2<1} dX_{t_1}\otimes dX_{t_2} = \frac12\sum_{i,j} Z_i Z_j\, e_i\otimes e_j = \frac12 I + (\text{zero mean})$$
on level 3 shows that every summand either contains, for some i, a factor $\mathbb{E} Z_i = 0$ or
$\mathbb{E} Z_i^3 = 0$. In other words, the expected signature at level 3 is zero, in agreement
with $\pi_3 \exp(\tfrac12 I) = 0$. We conclude that the expected signatures, of µ on the one
3.8 Comments
The modification of Kolmogorov’s criterion for rough paths (Theorem 3.1) is a minor
variation on a rather well-known theme. Rough path regularity of Brownian motion
was first established in the thesis of Sipiläinen, [Sip93].
For extensions to infinite dimensional Wiener processes (and also convergence
of piecewise linear approximations in rough path sense) see Ledoux, Lyons and
Qian [LLQ02] and Dereich [Der10]; much of the interest here is to go beyond the
Hilbert space setting. The resulting stochastic integration theory against Banach-
space valued Brownian motion, which in essence cannot be done by classical methods,
has proven crucial in some recent applications (cf. the works of Kawabi–Inahama
[IK06], Dereich [Der10]).
Early proofs of Brownian rough path regularity were typically established by
convergence of dyadic piecewise linear approximations to (B, BStrat ) in (p-variation)
rough path metric; see e.g. Lyons–Qian [LQ02]. Many other “obvious” (but as we
have seen: not all reasonable) approximations are seen to yield the same Brownian
rough path limit. The discussion of Brownian motion in a magnetic field follows
closely Friz, Gassiat and Lyons [FGL13]. Continuous semi-martingales and large
classes of multidimensional Gaussian – and Markovian – processes lift to random
rough paths; convergence of piecewise linear approximation in rough path topology
is also known to hold in great generality. See e.g. Friz–Victoir [FV10b]
and the references therein. The expected signature of Brownian motion was first
established in the thesis of Fawcett [Faw04]; different proofs were then given by
Lyons–Victoir, Baudoin and Friz–Shekhar, [LV04, Bau04, FS12b]. Fawcett’s formula
is central to the Kusuoka–Lyons–Victoir cubature method ([Kus01, LV04]). More
generally, expected signatures capture important aspects of the law of a stochastic
process. See Chevyrev [Che13]. The extension to Lévy processes, Exercise 3.23, is
taken from Friz–Shekhar [FS12b]. The computation of expected signatures of large
classes of stochastic processes including stopped Brownian motion and stochastic
Löwner equations is presently pursued by a number of people including Lyons–
Ni [LN11], Werness [Wer12] and Boedihardjo–Qian [BNQ13]. The Donsker type
theorem, Theorem 3.12, in uniform topology, is a consequence of Stroock–Varadhan
[SV73]; the rough path case is due to Breuillard, Friz, and Huesmann [BFH09].
Applications to cubature are discussed in [BF13].
Chapter 4
Integration against rough paths
Abstract The aim of this section is to give a meaning to the expression $\int Y_t\,dX_t$ for
a suitable class of integrands Y, integrated against a rough path X. We first discuss
the case originally studied by Lyons where Y = F(X). We then introduce the notion
of a controlled rough path and show that this forms a natural class of integrands.
4.1 Introduction
The aim of this chapter is to give a meaning to the expression $\int Y_t\,dX_t$, for
$X \in \mathcal{C}^\alpha([0,T],V)$ and Y some continuous function with values in L(V, W), the space
of bounded linear operators from V into some other Banach space W. Of course,
such an integral cannot be defined for arbitrary continuous functions Y, especially if
we want the map $(X, Y) \mapsto \int Y\,dX$ to be continuous in the relevant topologies. We
therefore also want to identify a “good” class of integrands Y for the rough path X.
A natural approach would be to try to define the integral as a limit of Riemann–
Stieltjes sums, that is
$$\int_0^1 Y_t\,dX_t = \lim_{|P|\to 0}\sum_{[s,t]\in P} Y_s X_{s,t}, \tag{4.1}$$
$$\Big|\int_0^1 (Y_r - Y_0)\,dX_r\Big| \le C\,\|Y\|_{\beta;[0,1]}\,\|X\|_{\alpha;[0,1]}, \tag{4.2}$$
with C depending on α + β > 1. Given paths X, Y defined on [s, t] rather than [0, 1],
it is an easy consequence of the scaling properties of Hölder semi-norms that
$$\Big|\int_s^t Y_r\,dX_r - Y_s X_{s,t}\Big| \le C\,\|Y\|_\beta\,\|X\|_\alpha\,|t-s|^{\alpha+\beta}. \tag{4.3}$$
In particular, when α = β > 1/2, the right hand side is proportional to $|t-s|^{2\alpha} = o(|t-s|)$,
which is to be compared with the estimate (4.20) below.
The main insight of the theory of rough paths is that this seemingly unsurmountable
barrier of α + β > 1 (which reduces to α > 1/2 in the case α = β which
is our main interest¹) can be broken by adding additional structure to the problem.
Indeed, for a rough path X, we postulate the values $\mathbb{X}_{s,t}$ of the integral of X against
itself, see (2.2). It is then intuitively clear that one should be able to define $\int Y\,dX$
in a consistent way, provided that Y “looks like X”, at least on very small scales (in
the precise sense of (4.16) below). The easiest way for a function Y to “look like
X” is to have $Y_t = F(X_t)$ for some sufficiently smooth $F : V \to L(V, W)$, called a
1-form.
for r in some (small) interval [s, t], say. Recall (see sections 1.4 and 1.5 concerning
the infinite-dimensional case) that²
$$L(V, L(V, W)) \cong L(V \otimes V, W),$$
side of
$$\int_0^1 F(X_s)\,d\mathbf{X}_s \approx \sum_{[s,t]\in P}\big(F(X_s)X_{s,t} + DF(X_s)\mathbb{X}_{s,t}\big), \tag{4.5}$$
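To see what the second-order term buys, here is a minimal numerical sketch, under simplifying assumptions that are ours and not the text's: the path is scalar and smooth (X = sin on [0, 1]), the second level is taken to be $\mathbb{X}_{s,t} = \tfrac12 X_{s,t}^2$, and F(x) = x². The function name `riemann_sums` is hypothetical. In this smooth setting both sums converge to the classical integral; the point is only that the compensated sum converges faster.

```python
import math

def riemann_sums(F, dF, X, n):
    """First- and second-order sums for int_0^1 F(X) dX on a uniform partition.

    Assumes a scalar smooth path X with second level XX_{s,t} = X_{s,t}**2 / 2.
    """
    first = 0.0
    second = 0.0
    for k in range(n):
        s, t = k / n, (k + 1) / n
        inc = X(t) - X(s)            # increment X_{s,t}
        area = 0.5 * inc * inc       # XX_{s,t} for this scalar lift
        first += F(X(s)) * inc
        second += F(X(s)) * inc + dF(X(s)) * area
    return first, second

F = lambda x: x * x
dF = lambda x: 2.0 * x
exact = math.sin(1.0) ** 3 / 3.0     # int_0^1 X^2 dX for X = sin
first, second = riemann_sums(F, dF, math.sin, 100)
err_first = abs(first - exact)
err_second = abs(second - exact)
```

With 100 intervals, the error of the compensated sum is smaller than that of the plain Riemann sum by roughly two orders of magnitude, consistent with one extra order of local accuracy.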
4
does exist and call it the rough integral.
In fact, in this section we shall construct the
(indefinite) rough integral $Z = \int_0^\cdot F(X)\,d\mathbf{X}$ as an element of $\mathcal{C}^\alpha$, i.e. as a path, similar
to the construction of stochastic integrals as processes rather than random variables.
Even this may not be sufficient in applications: one often wants to have an extended
meaning of the rough integral, such as $(Z, \mathbb{Z}) \in \mathscr{C}^\alpha$, a point of view emphasised in
[Lyo98, LQ02, LCL07], or something similar (such as “Z controlled by X” in the
sense of Definition 4.6 below, to be discussed in the next section).
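Lemma 4.1. Let $F : V \to L(V, W)$ be a $\mathcal{C}^2_b$ function and let $(X, \mathbb{X}) \in \mathscr{C}^\alpha$ for some
α > 1/3. Set $Y_s := F(X_s)$, $Y'_s := DF(X_s)$ and $R^Y_{s,t} := Y_{s,t} - Y'_s X_{s,t}$. Then
$$Y, Y' \in \mathcal{C}^\alpha \quad\text{and}\quad R^Y \in \mathcal{C}^{2\alpha}. \tag{4.7}$$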
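$$
\begin{aligned}
\|Y\|_\alpha &\le \|DF\|_\infty \|X\|_\alpha, \\
\|Y'\|_\alpha &\le \|D^2F\|_\infty \|X\|_\alpha, \\
\|R^Y\|_{2\alpha} &\le \tfrac12 \|D^2F\|_\infty \|X\|_\alpha^2.
\end{aligned}
$$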
3 Recall that $\lim_{|P|\to 0}$ means convergence along any sequence $(P_n)$ with mesh $|P_n| \to 0$, with
identical limit along each such sequence. In particular, it is not enough to establish convergence
along a particular sequence $(P_n)$, although a particular sequence may be used to identify the limit.
4 Of course, we can and will consider intervals other than [0, 1]. Without further notice, P always
denotes a partition of the interval under consideration.
Proof. $\mathcal{C}^2_b$ regularity of F implies that F and DF are both Lipschitz continuous with
Lipschitz constants $\|DF\|_\infty$ and $\|D^2F\|_\infty$ respectively. The α-Hölder bounds on Y
and Y' are then immediate. For the remainder term, consider the function
A Taylor expansion, with intermediate value remainder, yields ξ ∈ (0, 1) such that
$$R^Y_{s,t} = F(X_t) - F(X_s) - DF(X_s)X_{s,t} = \tfrac12 D^2F(X_s + \xi X_{s,t})(X_{s,t}, X_{s,t}).$$
The claimed 2α-Hölder estimate, in the sense that $|R^Y_{s,t}| \lesssim |t-s|^{2\alpha}$, then follows at
once. □
Before we prove that the rough integral (4.6) exists, we discuss some sort of
abstract Riemann integration. In what follows, at first reading, one may have in mind
the construction of a Riemann-Stieltjes (or Young) integral $Z_t := \int_0^t Y_r\,dX_r$. From
Young’s inequality (4.3), one has (with $Z_{s,t} = Z_t - Z_s$ as usual)
and $\Xi_{s,t} := Y_s X_{s,t}$ is a sufficiently good local approximation in the sense that it
fully determines the integral Z via the limiting procedure given in (4.1). In this
sense $Z = I\Xi$ is the well-defined image of Ξ under some abstract integration map
I. Note that $Z_{s,t} = Z_{s,u} + Z_{u,t}$, i.e. increments are additive (or “multiplicative” if
one regards + as group operation⁵) whereas a similar property fails for Ξ. In the
language of [Lyo98], such a Ξ corresponds to an “almost multiplicative functional”
and it is a key result in the theory that there is a unique associated “multiplicative
functional” (here: $Z = I\Xi$). Following [Gub04, FdLP06] we call “sewing” the step
from a (good enough) local approximation Ξ to some (abstract) integral $I\Xi$; the
concrete estimate which quantifies how well $I\Xi$ is approximated by Ξ will be called
“sewing lemma”. It plays an analogous role to “Davie’s lemma” (cf. section 8.7) in
the context of (rough) differential equations.
We now formalize what we mean by Ξ being a good enough local approximation.
For this, we introduce the space $\mathcal{C}_2^{\alpha,\beta}([0,T],W)$ of functions Ξ from the simplex
0 ≤ s ≤ t ≤ T into W such that $\Xi_{t,t} = 0$ and such that
$$\|\Xi\|_{\alpha,\beta} := \|\Xi\|_\alpha + \|\delta\Xi\|_\beta < \infty, \tag{4.8}$$
where $\|\Xi\|_\alpha = \sup_{s<t} \frac{|\Xi_{s,t}|}{|t-s|^\alpha}$ as usual, and also
$$\delta\Xi_{s,u,t} := \Xi_{s,t} - \Xi_{s,u} - \Xi_{u,t}, \qquad \|\delta\Xi\|_\beta := \sup_{s<u<t}\frac{|\delta\Xi_{s,u,t}|}{|t-s|^\beta}.$$
5 This terminology becomes natural if one considers Z together with its iterated integrals as a
group-valued path, increments of which satisfy Chen’s “multiplicative” relation, see (2.3).
4.2 Integration of 1-forms 51
Provided that β > 1, it turns out that such functions are “almost” of the form
$\Xi_{s,t} = F_t - F_s$, for some α-Hölder continuous function F (they would be if and
only if δΞ = 0). Indeed, it is possible to construct in a canonical way a function $\hat\Xi$
with $\delta\hat\Xi = 0$ and such that $\hat\Xi_{s,t} \approx \Xi_{s,t}$ for $|t-s| \ll 1$:
Lemma 4.2 (Sewing lemma). Let α and β be such that 0 < α ≤ 1 < β. Then,
there exists a (unique) continuous map $I : \mathcal{C}_2^{\alpha,\beta}([0,T],W) \to \mathcal{C}^\alpha([0,T],W)$ such
that $(I\Xi)_0 = 0$ and
$$\big|(I\Xi)_{s,t} - \Xi_{s,t}\big| \le C|t-s|^\beta, \tag{4.9}$$
where C only depends on β and $\|\delta\Xi\|_\beta$. (The α-Hölder norm of $I\Xi$ also depends
on $\|\Xi\|_\alpha$ and hence on $\|\Xi\|_{\alpha,\beta}$.)
Proof. Note first that I will be built as a linear map, so that its continuity is an
immediate consequence of its boundedness. Uniqueness of I is also immediate.
Indeed, assume by contradiction that, for a given Ξ, there are two candidates F
and F̄ for IΞ. Since both of these functions have to satisfy the bound (4.9), the
function $F - \bar F$ satisfies $(F - \bar F)_0 = 0$ and $|(F - \bar F)_{s,t}| \lesssim |t-s|^\beta$. Since β > 1 by
assumption, it follows immediately that $F - \bar F$ vanishes identically.
It remains to find the map I. It is very natural to make the guess
$$(I\Xi)_{s,t} = \lim_{|P|\to 0}\sum_{[u,v]\in P}\Xi_{u,v}, \tag{4.10}$$
where P denotes a partition of [s, t] and |P| denotes its mesh, i.e. the length of its
largest element. The remainder of the proof shows that this expression is well-defined
and that (4.9) holds.
Why is (4.10) well-defined? Because of its importance we give two (independent
but related) arguments. The first argument is based on successive (dyadic) refinement,
i.e. one starts by identifying the integral as limit of Riemann type sums, along a
particular sequence $(P_n)$. This is followed by checking that the limit is indeed
independent of the choice of partitions. More precisely, for a given interval [s, t], we
start with the trivial partition $P_0 = \{[s,t]\}$ and we set $(I^0\Xi)_{s,t} = \Xi_{s,t}$. We then
define recursively
$$P_{n+1} = \bigcup_{[u,v]\in P_n}\big\{[u,m],[m,v]\big\},$$
where it is a straightforward exercise to check that the second equality holds. It then
follows immediately from the definition of $\|\cdot\|_{\alpha,\beta}$ that
$$\big|(I^{n+1}\Xi)_{s,t} - (I^n\Xi)_{s,t}\big| \le 2^{n(1-\beta)}|t-s|^\beta\|\delta\Xi\|_\beta.$$
Since β > 1, these terms are summable and we conclude immediately that the
sequence (I n Ξ)s,t is Cauchy. It thus admits a limit (IΞ)s,t such that, by summing
up the bound above, one has
$$\big|(I\Xi)_{s,t} - \Xi_{s,t}\big| \le \sum_{n\ge 0}\big|(I^{n+1}\Xi)_{s,t} - (I^n\Xi)_{s,t}\big| \le C\|\delta\Xi\|_\beta|t-s|^\beta, \tag{4.11}$$
for some universal constant C depending only on β, which is precisely the required
bound (4.9). It remains to see that the limit just constructed is independent of the
choice of partitions. Once one has shown that δIΞ = 0, which is equivalent to
(IΞ)0,t = (IΞ)0,s + (IΞ)s,t for all pairs s, t, this is not too difficult. Indeed, if P
denotes an arbitrary partition of [s, t] and we introduce
$$\int_P \Xi := \sum_{[u,v]\in P}\Xi_{u,v},$$
then the difference between $(I\Xi)_{s,t}$ and this approximation can be estimated,
thanks to (4.11), as
$$\Big|\sum_{[u,v]\in P}\big((I\Xi)_{u,v} - \Xi_{u,v}\big)\Big| = O\big(|P|^{\beta-1}\big).$$
Since β > 1, this is enough to show that $(I\Xi)_{s,t}$ is the limit along any sequence $P_n$
with mesh tending to zero. What remains to be shown is $\delta I\Xi = 0$. In general this
is not obvious (but see Remark 4.3) and indeed, writing $P_n^{s,t}$ for the level-n dyadic
partition relative to [s, t], as used above, this is quite tedious since $P_n^{0,t}$ is not equal
to the partition of [0, t] given by $P_n^{0,s} \cup P_n^{s,t}$, even though both have mesh tending
to zero with n → ∞. In fact, one is better off to define the integral over [s, t] as the
limit of $\sum_{[u,v]\in P_n^{0,T},\,[u,v]\subset[s,t]} \Xi_{u,v}$. In Exercise 4.21, the reader is invited to work
out the remaining details.
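The dyadic refinement used in the first argument is easy to try out numerically. The following is a minimal sketch rather than a general implementation: the germ Ξ is passed as an ordinary Python function of (u, v), the name `dyadic_sum` is hypothetical, and the example germ $\Xi_{s,t} = Y_s X_{s,t}$ with Y = cos, X = sin is an illustrative choice for which β = 2 and $\|\delta\Xi\|_\beta \le 1/4$, since $\delta\Xi_{s,u,t} = -Y_{s,u}X_{u,t}$.

```python
import math

def dyadic_sum(Xi, s, t, n):
    """(I^n Xi)_{s,t}: sum of Xi over the level-n dyadic partition of [s, t]."""
    m = 2 ** n
    pts = [s + (t - s) * k / m for k in range(m + 1)]
    return sum(Xi(u, v) for u, v in zip(pts, pts[1:]))

# Young-type germ Xi_{s,t} = Y_s X_{s,t} with smooth Y = cos and X = sin.
Xi = lambda u, v: math.cos(u) * (math.sin(v) - math.sin(u))

approx = [dyadic_sum(Xi, 0.0, 1.0, n) for n in range(12)]
# Successive differences |I^{n+1} Xi - I^n Xi| on [0, 1].
gaps = [abs(b - a) for a, b in zip(approx, approx[1:])]
# Limit for this smooth example: int_0^1 cos(r)^2 dr.
exact = 0.5 + math.sin(2.0) / 4.0
```

For this germ the successive differences `gaps[n]` stay below $2^{n(1-\beta)}\|\delta\Xi\|_\beta = 2^{-n}/4$, in line with the bound stated in the proof, and `approx[n]` approaches the classical integral.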
The second argument, which is essentially due to Young, yields immediately
convergence as |P| → 0, i.e. the same limit is obtained along any sequence Pn with
mesh tending to zero. (As an immediate consequence, without any details left to the
reader, δIΞ = 0. Another advantage of Young’s construction is that it works under a
weaker 1/α-variation assumption on $(X, \mathbb{X})$.) Consider a partition P of [s, t] and let
r ≥ 1 be the number of intervals in P. When r ≥ 2 there exists u ∈ [s, t] such that
$[u_-, u], [u, u_+] \in P$ and
$$|u_+ - u_-| \le \frac{2}{r-1}|t-s|.$$
Indeed, assuming otherwise gives the contradiction $2|t-s| \ge \sum_{u\in P^\circ}|u_+ - u_-| > 2|t-s|$.
Hence, $\big|\int_{P\setminus\{u\}}\Xi - \int_P\Xi\big| = |\delta\Xi_{u_-,u,u_+}| \le \|\delta\Xi\|_\beta\,(2|t-s|/(r-1))^\beta$
and by iterating this procedure until the partition is reduced to P = {[s, t]}, we arrive
at the maximal inequality,
$$\sup_{P\subset[s,t]}\Big|\Xi_{s,t} - \int_P\Xi\Big| \le 2^\beta\|\delta\Xi\|_\beta\,\zeta(\beta)\,|t-s|^\beta,$$
But then, for any P with |P| ≤ ε we can use the maximal inequality to see that
$$\Big|\int_P\Xi - \int_{P'}\Xi\Big| \le 2^\beta\zeta(\beta)\|\delta\Xi\|_\beta\sum_{[u,v]\in P}|v-u|^\beta = O\big(|P|^{\beta-1}\big) = O(\varepsilon^{\beta-1}).$$
This concludes the Young argument (with no hidden tedium left to the reader). □
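Young's point-removal procedure can also be carried out literally. In the sketch below (our own illustration, with the hypothetical helper name `coarsen_defect`), interior points are deleted one at a time, always choosing a point u with minimal $|u_+ - u_-|$, and the defects $|\delta\Xi_{u_-,u,u_+}|$ are accumulated. The example germ $\Xi_{u,v} = u(v-u)$, i.e. $Y_s X_{s,t}$ with $X_t = Y_t = t$, is an assumption made for illustration; for it $\delta\Xi_{s,u,t} = -(u-s)(t-u)$, so β = 2 and $\|\delta\Xi\|_\beta = 1/4$.

```python
import math

def coarsen_defect(Xi, partition):
    """Remove interior points one by one (minimal |u_+ - u_-| first).

    Returns |Xi_{s,t} - sum_P Xi| and the accumulated sum of
    |delta Xi_{u_-,u,u_+}| over all removals.
    """
    pts = sorted(partition)
    total = sum(Xi(u, v) for u, v in zip(pts, pts[1:]))
    s, t = pts[0], pts[-1]
    accumulated = 0.0
    while len(pts) > 2:
        # interior index k minimising the length of [u_-, u_+]
        k = min(range(1, len(pts) - 1), key=lambda i: pts[i + 1] - pts[i - 1])
        lo, u, hi = pts[k - 1], pts[k], pts[k + 1]
        accumulated += abs(Xi(lo, hi) - Xi(lo, u) - Xi(u, hi))
        del pts[k]
    return abs(Xi(s, t) - total), accumulated

Xi = lambda u, v: u * (v - u)
defect, accumulated = coarsen_defect(Xi, [k / 50 for k in range(51)])

beta, delta_norm = 2.0, 0.25
# Right-hand side of the maximal inequality on [0, 1]; zeta(2) = pi^2 / 6.
bound = 2 ** beta * delta_norm * (math.pi ** 2 / 6)
```

By the triangle inequality `defect <= accumulated`, and the accumulated defects stay below `bound`, as the maximal inequality predicts.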
Remark 4.3. The first argument ultimately suffered from the tedium of checking
the additivity property $\delta I\Xi = 0$. In some cases, however, this additivity property
of $I\Xi$ can be immediate. Imagine $X : [0,T] \to V$ is smooth, $\mathbb{X} = \int X \otimes dX$,
and one is only interested in an error estimate for second order approximations of
Riemann-Stieltjes integrals, of the form
$$\Big|\int_s^t F(X_r)\,dX_r - F(X_s)X_{s,t} - DF(X_s)\mathbb{X}_{s,t}\Big| \le \text{right-hand side of (4.13)}.$$
(This is still a highly non-trivial estimate since the right-hand side is uniform over all
(smooth) paths, as long as their α-rough path norms remain bounded!) In the context
of the above proof, this estimate is contained in the first step, applied with
But here it is clear from classical Riemann-Stieltjes theory, or in fact just Riemann
integration theory, that $(I\Xi)_{s,t}$, constructed as limit of dyadic partitions of [s, t], is
precisely the Riemann-Stieltjes integral $\int_s^t F(X_r)\,dX_r$ and therefore additive. (The
contribution of $DF(X)\mathbb{X}$ in the approximations disappears in the limit; indeed, it
suffices to remark that $\mathbb{X}_{u,v} \sim |v-u|^2$, thanks to smoothness of X.)
We now apply the sewing lemma to the construction of (4.6). We have the following.
Theorem 4.4 (Lyons). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ for some T > 0 and
α > 1/3, and let $F : V \to L(V, W)$ be a $\mathcal{C}^2_b$ function. Then, the rough integral defined
where the constant C only depends on T and α and can be chosen uniformly in
T ≤ 1. Furthermore, $|||\mathbf{X}|||_\alpha = \|X\|_\alpha + \sqrt{\|\mathbb{X}\|_{2\alpha}}$ denotes again the homogeneous
α-Hölder rough path norm.
Remark 4.5. We will see in Section 4.4 that the map $(X, \mathbb{X}) \in \mathscr{C}^\alpha \mapsto \int_0^\cdot F(X)\,d\mathbf{X} \in \mathcal{C}^\alpha$
is continuous in α-Hölder rough path metric.
Proof. Let us stress the fact that the argument given here only relies on the properties
of the integrand Y = F (X) collected in Lemma 4.1 above. In particular, the general-
isation to “extended” integrands (Y, Y 0 ), which replace (F (X), DF (X)), subject to
(4.7), will be immediate. (We shall develop this “Gubinelli” point of view further in
Section 4.3 below.)
The result follows as a consequence of Lemma 4.2. With the notation that we just
introduced, the classical Young integral [You36] can be defined as the usual limit of
Riemann sums by
$$\int_s^t Y_r\,dX_r = (I\Xi)_{s,t}, \qquad \Xi_{s,t} = Y_s X_{s,t}.$$
so that, except in trivial cases, the required bound (4.8) is satisfied only if Y and
X are Hölder continuous with Hölder exponents adding up to β > 1. In order to
be able to cover the situation α < 1/2, it follows that we need to consider a better
approximation to the Riemann sums, as discussed above. To this end, we use the
notation from Lemma 4.1, namely
and then set $\Xi_{s,t} = Y_s X_{s,t} + Y'_s\mathbb{X}_{s,t}$. Note that, for any u ∈ (s, t), we have the
identity
$$\delta\Xi_{s,u,t} = -R^Y_{s,u}X_{u,t} - Y'_{s,u}\mathbb{X}_{u,t}.$$
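The identity for δΞ is pure algebra combined with Chen's relation $\mathbb{X}_{s,t} = \mathbb{X}_{s,u} + \mathbb{X}_{u,t} + X_{s,u}\otimes X_{u,t}$, and can be checked numerically. The sketch below works under assumptions of our own choosing: a scalar smooth path X = sin, the second level $\mathbb{X}_{s,t} = \tfrac12 X_{s,t}^2$ (which satisfies Chen's relation), and F = exp so that Y = F(X) and Y' = DF(X). All names are illustrative.

```python
import math

X = math.sin           # scalar smooth path
F = math.exp           # Y = F(X)
dF = math.exp          # Y' = DF(X)

def inc(f, s, t):
    """Increment f_{s,t} = f(t) - f(s)."""
    return f(t) - f(s)

Y = lambda r: F(X(r))
Yp = lambda r: dF(X(r))
XX = lambda s, t: 0.5 * inc(X, s, t) ** 2                 # second level, obeys Chen's relation
RY = lambda s, t: inc(Y, s, t) - Yp(s) * inc(X, s, t)     # remainder R^Y_{s,t}
Xi = lambda s, t: Y(s) * inc(X, s, t) + Yp(s) * XX(s, t)  # compensated germ

s, u, t = 0.2, 0.5, 0.9
lhs = Xi(s, t) - Xi(s, u) - Xi(u, t)                      # delta Xi_{s,u,t}
rhs = -RY(s, u) * inc(X, u, t) - inc(Yp, s, u) * XX(u, t)
```

Both sides agree up to floating-point rounding, for any choice of s < u < t.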
4.3 Integration of controlled rough paths 55
which is the claimed estimate (4.14) in the limit α ↓ 1/3. However, one can do better
by realising that the above estimate is best for |t − s| small, whereas for t − s large
it is better to split up $|Z_{s,t}|$ into the sum of small increments. To make this more
precise, set $\varrho := |||\mathbf{X}|||_\alpha$ and write (hiding the factor C = C(α, T) in ≲ below)
$$|Z_{s,t}| \lesssim \varrho|t-s|^\alpha + \varrho^2|t-s|^{2\alpha} + \varrho^3|t-s|^{3\alpha} \le 3\varrho|t-s|^\alpha \quad\text{for } \varrho^{1/\alpha}|t-s| \le 1.$$
Increments of Z over [s, t] with length greater than $h := \varrho^{-1/\alpha}$ are handled by
cutting them into pieces of length h. More precisely (cf. Exercise 4.24) we have
$\|Z\|_{\alpha;h} \le 3\varrho$ which entails
$$\|Z\|_\alpha \le 3\varrho\big(1 \vee 2h^{-(1-\alpha)}\big) \le 6\big(\varrho \vee \varrho^{1/\alpha}\big).$$
Motivated by Lemma 4.1 and the observation that rough integration essentially relies
on the properties (4.7) we introduce the notion of a controlled path Y, relative to
some “reference” path X, due to Gubinelli [Gub04]. For the sake of the following
definition we assume that Y takes values in some Banach space, say $\bar W$. When it
satisfies $\|R^Y\|_{2\alpha} < \infty$. This defines the space of controlled rough paths,
$$(Y, Y') \in \mathscr{D}^{2\alpha}_X([0,T], \bar W).$$
Although Y 0 is not, in general, uniquely determined from Y (cf. Remark 4.7 and
Section 6 below) we call any such Y 0 the Gubinelli derivative of Y (with respect to
X).
Here, $R^Y_{s,t}$ takes values in $\bar W$, and the norm $\|\cdot\|_{2\alpha}$ for a function with two
arguments is given by (2.3) as before. We endow the space $\mathscr{D}^{2\alpha}_X$ with the semi-norm
where the constant C only depends on T and α and in fact can be chosen uniformly
over T ∈ (0, 1].
Remark 4.7. Since we only assume that $\|Y\|_\alpha < \infty$, but then impose that $\|R^Y\|_{2\alpha} < \infty$,
it is in general the case that a genuine cancellation takes place in (4.16). The
question arises to what extent Y determines Y'. Somewhat contrary to the classical
situation, where a smooth function has a unique derivative, too much regularity of
the underlying rough path $\mathbf{X}$ leads to less information about Y'. For instance, if Y is
smooth, or in fact in $\mathcal{C}^{2\alpha}$, and the underlying rough path $\mathbf{X}$ happens to have a path
component X that is also $\mathcal{C}^{2\alpha}$, then we may take Y' = 0, but as a matter of fact
any continuous path Y' would satisfy (4.16) with $\|R^Y\|_{2\alpha} < \infty$. On the other hand,
if X is far from smooth, i.e. genuinely rough on all (small) scales, uniformly in all
directions, then Y' is uniquely determined by Y, cf. Section 6 below.
Remark 4.8. It is important to note that while the space of rough paths $\mathscr{C}^\alpha$ is not
even a vector space, the space $\mathscr{D}^{2\alpha}_X$ is a perfectly normal Banach space for any given
$\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$. The twist of course is that the space in question depends in a
crucial way on the choice of X. The set of all pairs $(\mathbf{X}; (Y, Y'))$ gives rise to the total
space
$$\mathscr{C}^\alpha \ltimes \mathscr{D}^{2\alpha} := \bigsqcup_{\mathbf{X}\in\mathscr{C}^\alpha} \{\mathbf{X}\} \times \mathscr{D}^{2\alpha}_X,$$
Remark 4.9. While the notion of “controlled rough path” has many appealing features,
it does not come with a natural approximation theory. To wit, consider
$(X, \mathbb{X}) \in \mathscr{C}^\alpha_g([0,T], \mathbb{R}^d)$ as limit of smooth paths $X_n : [0,T] \to \mathbb{R}^d$ in the sense of
Proposition 2.5. Then it is natural to approximate Y = F(X) by the $Y_n = F(X_n)$,
which is again smooth (to the extent that F permits). On the other hand, there are
no obvious approximations $(Y_n, Y'_n) \in \mathscr{D}^{2\alpha}_{X_n}$ for an arbitrary controlled rough path
$(Y, Y') \in \mathscr{D}^{2\alpha}_X$.
where we took $\bar W = L(V, W)$ and used the canonical injection $L(V, L(V, W)) \hookrightarrow
L(V \otimes V, W)$ in writing $Y'_s\mathbb{X}_{s,t}$. With these notations, the resulting integral takes
values in W.
With these notations at hand, it is now straight-forward to prove the following
result, which is a slight reformulation of [Gub04, Prop 1]:
Theorem 4.10 (Gubinelli). Let T > 0, let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ for some
α > 1/3, and let $(Y, Y') \in \mathscr{D}^{2\alpha}_X([0,T], L(V,W))$. Then there exists a constant C
depending only on T and α (and C can be chosen uniformly over T ∈ (0, 1]) such
that
a) The integral defined in (4.19) exists and, for every pair s, t, one has the bound
$$\Big|\int_s^t Y_r\,d\mathbf{X}_r - Y_s X_{s,t} - Y'_s\mathbb{X}_{s,t}\Big| \le C\big(\|X\|_\alpha\|R^Y\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha}\|Y'\|_\alpha\big)|t-s|^{3\alpha}. \tag{4.20}$$
b) The map from $\mathscr{D}^{2\alpha}_X([0,T], L(V,W))$ to $\mathscr{D}^{2\alpha}_X([0,T], W)$ given by
6 Note the abuse of notation: we hide dependence on Y' which in general affects the limit but is
usually clear from the context.
$$(Y, Y') \mapsto (Z, Z') := \Big(\int_0^\cdot Y_t\,d\mathbf{X}_t,\ Y\Big), \tag{4.21}$$
is a continuous linear map between Banach spaces and one has the bound
Remark 4.11. As in the above theorem, assume that $(X, \mathbb{X}) \in \mathscr{C}^\alpha([0,T],V)$ and
consider Y and Z two paths controlled by X. More precisely, we assume $(Y, Y') \in
\mathscr{D}^{2\alpha}_X([0,T], L(\bar V, W))$ and $(Z, Z') \in \mathscr{D}^{2\alpha}_X([0,T], \bar V)$, where of course $V, \bar V, W$ are
all Banach spaces. Then, in terms of the abstract integration map I (cf. the sewing
lemma) we may define the integral of Y against Z, with values in W, as follows,
$$\int_s^t Y_u\,dZ_u := (I\Xi)_{s,t}, \qquad \Xi_{u,v} = Y_u Z_{u,v} + Y'_u Z'_u\mathbb{X}_{u,v}. \tag{4.22}$$
Here, we use the fact that Zu0 ∈ L(V, V̄ ) can be canonically identified with an opera-
tor in L(V ⊗V, V ⊗ V̄ ) by acting only on the second factor, and Yu0 ∈ L(V, L(V̄ , W ))
is identified as before with an operator in L(V ⊗ V̄ , W ). The reader may be helped
to see this spelled out in coordinates, assuming finite dimensions: using indices i, j
in W, V̄ respectively, and then k, l in V :
$$(\Xi_{u,v})^i = (Y_u)^i_j\,(Z_{u,v})^j + (Y'_u)^i_{k,j}\,(Z'_u)^j_l\,(\mathbb{X}_{u,v})^{k,l}.$$
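The coordinate formula translates directly into nested sums. The sketch below is only an illustration with hypothetical names: tensors are stored as nested Python lists, `germ` evaluates $\Xi^i$ for given coefficient arrays, and repeated indices are summed as in the displayed formula. The example data takes Z = X and Z' equal to the identity, in which case the expression reduces to $Y_u X_{u,v} + Y'_u\mathbb{X}_{u,v}$.

```python
def germ(Y, Yp, Zp, dZ, XX):
    """Xi^i = Y^i_j dZ^j + Yp^i_{k,j} Zp^j_l XX^{k,l} (sum over repeated indices).

    Layout: Y[i][j], Yp[i][k][j], Zp[j][l], dZ[j], XX[k][l], all nested lists.
    """
    n_w, n_vbar, n_v = len(Y), len(dZ), len(XX)
    out = []
    for i in range(n_w):
        first = sum(Y[i][j] * dZ[j] for j in range(n_vbar))
        second = sum(Yp[i][k][j] * Zp[j][l] * XX[k][l]
                     for k in range(n_v)
                     for j in range(n_vbar)
                     for l in range(n_v))
        out.append(first + second)
    return out

# Example with dim V = dim V-bar = 2, dim W = 1, Z = X and Z' = identity.
Y = [[1.0, 2.0]]
Yp = [[[1.0, 0.0], [0.0, 1.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
dX = [0.5, -1.0]
XX = [[0.1, 0.2], [0.3, 0.4]]
xi = germ(Y, Yp, identity, dX, XX)
```

With these values the first term contributes 0.5 − 2.0 and the second term is the pairing of `Yp[0]` with `XX`, namely 0.1 + 0.4.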
in the present situation. Clearly $Y'Z' \in \mathcal{C}^\alpha$ and so $\|\delta\Xi\|_\beta$ is finite which allows the
proof to go through mutatis mutandis. In particular, (4.20) is valid, with the above
substitution, and reads
$$\Big|\int_s^t Y_r\,dZ_r - Y_s Z_{s,t} - Y'_s Z'_s\mathbb{X}_{s,t}\Big| \le C\big(\|X\|_\alpha\|R^Z\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha}\|Y'Z'\|_\alpha\big)|t-s|^{3\alpha}. \tag{4.23}$$
If Z = X and Z 0 is the identity operator, then this coincides with the definition
(4.19). Furthermore, in the smooth case, one can check that we again recover the
usual Riemann / Young integral.
Remark 4.12. If, in the notation of the proof of Theorem 4.4, Ξ and $\tilde\Xi$ are such that
$\Xi - \tilde\Xi \in \mathcal{C}_2^\beta$ for some β > 1, i.e.
which converges to 0 as |P| → 0. (This remains true if O(|t − s|β ) with β > 1 is
replaced by o(|t − s|).)
This also shows that, if X and Y are smooth functions and $\mathbb{X}$ is defined by (2.2),
the integral that we just defined does coincide with the usual Riemann–Stieltjes
integral. However, if we change $\mathbb{X}$, then the resulting integral does change, as will be
seen in the next example.
Example 4.13. Let f be a 2α-Hölder continuous function and let $\mathbf{X} = (X, \mathbb{X})$ and
$\bar{\mathbf{X}} = (\bar X, \bar{\mathbb{X}})$ be two rough paths such that
Here, the second term on the right hand side is a simple Young integral, which is
well-defined since α + 2α > 1 by assumption.
Remark 4.15. The bound (4.20) does behave in a very natural way under dilatations.
Indeed, the integral is invariant under the transformation
The same is true for the right hand side of (4.20), since under this dilatation, we also
have $R^Y \mapsto \lambda^{-1}R^Y$.
will be useful. Even when $X = \tilde X$, it is not a proper metric for it fails to separate
$(Y, Y')$ and $(Y + cX + \bar c, Y' + c)$ for any two constants c and $\bar c$. When $X \ne \tilde X$,
the assertion “zero distance implies $(Y, Y') = (\tilde Y, \tilde Y')$” does not even make sense.
(The two objects live in completely different spaces!) That said, for every fixed
$(X, \mathbb{X}) \in \mathscr{C}^\alpha$, one has (with $R^Y_{s,t} = Y_{s,t} - Y'_s X_{s,t}$ as usual), a canonical map
$$\iota_X : (Y, Y') \in \mathcal{C}^\alpha_X \mapsto (Y', R^Y) \in \mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha}.$$
Given $Y_0 = \xi$, this map is injective since one can reconstruct Y by $Y_t = \xi + Y'_0 X_{0,t} + R^Y_{0,t}$.
From this point of view, one simply has
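and one is back in a normal Banach setting, where $\|\cdot,\cdot\|_{\alpha,2\alpha} = \|\cdot\|_\alpha + \|\cdot\|_{2\alpha}$ is a
natural semi-norm on $\mathcal{C}^\alpha \oplus \mathcal{C}_2^{2\alpha}$. (In fact, it is a norm if one only considers elements
in $\mathcal{C}^\alpha$ started at 0.) Elementary estimates of the form
$$|ab - \tilde a\tilde b| \le |a|\,|b - \tilde b| + |a - \tilde a|\,|\tilde b| \tag{4.26}$$
then lead to
$$
\begin{aligned}
|Y_{s,t} - \tilde Y_{s,t}| &= \big|(Y'_{0,s} + Y'_0)X_{s,t} - (\tilde Y'_{0,s} + \tilde Y'_0)\tilde X_{s,t} + R^Y_{s,t} - R^{\tilde Y}_{s,t}\big| \\
&\le C|t-s|^\alpha\big(|Y'_0 - \tilde Y'_0| + \|Y' - \tilde Y'\|_\alpha + \|X - \tilde X\|_\alpha + \|R^Y - R^{\tilde Y}\|_{2\alpha}\big),
\end{aligned}
$$
with a constant C = C(R, T), provided $|Y'_0|$, $\|X\|_\alpha$, $\|Y'\|_\alpha$ and similarly the
same quantities with tilde, all have their norms bounded by R. (As usual, C can be
taken uniform in T ≤ 1 since in this case $\|\cdot\|_{\alpha;[0,T]} \le \|\cdot\|_{2\alpha;[0,T]}$.) It follows that
$$\|Y - \tilde Y\|_\alpha \le C\big(\|X - \tilde X\|_\alpha + |Y'_0 - \tilde Y'_0| + d_{X,\tilde X,2\alpha}(Y, Y'; \tilde Y, \tilde Y')\big). \tag{4.27}$$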
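$$(Z, Z') := \Big(\int_0^\cdot Y\,d\mathbf{X},\ Y\Big) \in \mathscr{D}^{2\alpha}_X,$$
and similarly for $(\tilde Z, \tilde Z')$. Then the following (local) Lipschitz estimates hold true,
$$d_{X,\tilde X,2\alpha}(Z, Z'; \tilde Z, \tilde Z') \le C_M\big(\varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |Y'_0 - \tilde Y'_0| + d_{X,\tilde X,2\alpha}(Y, Y'; \tilde Y, \tilde Y')\big), \tag{4.28}$$
and also
$$\|Z - \tilde Z\|_\alpha \le C_M\big(\varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + d_{X,\tilde X,2\alpha}(Y, Y'; \tilde Y, \tilde Y')\big), \tag{4.29}$$
where $C_M = C(M, T, \alpha)$ is a suitable constant.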
Proof. (The reader is advised to review the proofs of Theorems 4.4, 4.10.) We first
note that (4.27) applied to $Z, \tilde Z$ (note: $Z'_0 - \tilde Z'_0 = Y_0 - \tilde Y_0$) shows that (4.29) is an
immediate consequence of the first estimate (4.28). Thus, we only need to discuss
the first estimate. By definition of $d_{X,\tilde X,2\alpha}$, we need to estimate
$$\|Z' - \tilde Z'\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha} = \|Y - \tilde Y\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha}.$$
Thanks to (4.27), the first summand is clearly bounded by the right-hand side of
(4.28). For the second summand we recall
$$R^Z_{s,t} = Z_{s,t} - Z'_s X_{s,t} = \int_s^t Y\,d\mathbf{X} - Y_s X_{s,t} = (I\Xi)_{s,t} - \Xi_{s,t} + Y'_s\mathbb{X}_{s,t}$$
where $\Xi_{s,t} = Y_s X_{s,t} + Y'_s\mathbb{X}_{s,t}$ and similarly for $R^{\tilde Z}$. Setting $\Delta = \Xi - \tilde\Xi$, we use
(4.11) with β = 3α and Ξ replaced by ∆, so that
where $\delta\Delta_{s,u,t} = R^{\tilde Y}_{s,u}\tilde X_{u,t} - R^Y_{s,u}X_{u,t} + \tilde Y'_{s,u}\tilde{\mathbb{X}}_{u,t} - Y'_{s,u}\mathbb{X}_{u,t}$. We then conclude
with some elementary estimates of the type (4.26), noting that all involved quantities
stay bounded. □
Recall that we showed in Section 2.3 how an α-Hölder rough path $\mathbf{X}$ could be defined
as a path with values in the p-step nilpotent Lie group $G^{(p)}(\mathbb{R}^d) \subset T^{(p)}(\mathbb{R}^d)$, with
$p = \lfloor 1/\alpha \rfloor$. It does not seem obvious at all a priori how one would define a controlled
rough path in this context. One way of interpreting Definition 4.6 is as a kind of
local “Taylor expansion” up to order 2α. It seems natural in the light of the previous
subsections that if α < 1/3, a controlled rough path should have a kind of “Taylor
expansion” up to order pα.
As a consequence, if we expand $\mathbf{X}_{s,t} := \mathbf{X}_s^{-1} \otimes \mathbf{X}_t$ as
$$\mathbf{X}_{s,t} = \sum_{|w|\le p}\mathbf{X}^w_{s,t}\,e_w,$$
where |w| denotes the length of the word w, one would expect that a controlled rough
path should have an expansion of the form
$$\delta Y_{s,t} = \sum_{|w|\le p-1} Y^w_s\,\mathbf{X}^w_{s,t} + R^Y_{s,t}, \tag{4.30}$$
with $|R^Y_{s,t}| \lesssim |t-s|^{p\alpha}$. Recall however that in Definition 4.6 we also needed a
regularity condition on the “derivative process” Y'. The equivalent statement in the
present context is that the $Y^w_s$ should themselves be described by a local “Taylor
expansion”, but this time only up to order (p − |w|)α. A neat way of packaging
this into a compact statement is to view Y as a $T^{(p-1)}(\mathbb{R}^d)$-valued function and to
introduce a scalar product on $T^{(p)}(\mathbb{R}^d)$ by postulating that $\langle e_w, e_{\bar w}\rangle = \delta_{w,\bar w}$ for any
two words w and $\bar w$. One then has the following extension of Definition 4.6 (see
Exercise 4.26).
Definition 4.17. A controlled rough path is a $T^{(p-1)}(\mathbb{R}^d)$-valued function Y such
that, for every word w with |w| ≤ p − 1, one has the bound
$$\big|\langle e_w, Y_t\rangle - \langle \mathbf{X}_{s,t}\otimes e_w, Y_s\rangle\big| \le C|t-s|^{(p-|w|)\alpha}. \tag{4.31}$$
Given such a controlled rough path Y, it is then natural to define its integral
against any component $X^i$ by
$$Z_t := \int_0^t Y_s\,dX^i_s := \lim_{|P|\to 0}\sum_{[r,s]\in P}\ \sum_{|w|\le p-1} Y^w_r\,\langle e_w\otimes e_i, \mathbf{X}_{r,s}\rangle,$$
where $e_i$ is the unit vector associated to the word consisting of the single letter
i. It turns out [Gub10] that Z is again a controlled rough path in the sense of
Definition 4.17 provided that we lift it to $T^{(p-1)}(\mathbb{R}^d)$ by imposing that
$$\langle e_w\otimes e_i, Z_t\rangle := Y^w_t,$$
and by setting $Z^w_t = 0$ for all non-empty words that do not terminate with the letter
i.
4.6 Exercises
Solution 4.19. a) Given X on [s, t], define $\tilde X : [0,1] \ni u \mapsto X(s + u(t-s))$ and
verify $\|\tilde X\|_{\alpha;[0,1]} = |t-s|^\alpha\|X\|_{\alpha;[s,t]}$. Proceeding similarly for Y, applying (4.2)
to $\tilde X, \tilde Y$ then gives (4.3).
b) Write Z for the indefinite integral. From (4.3), for every 0 ≤ s < t ≤ T,
$$
\begin{aligned}
|Z_{s,t}| &\le |Y_s|\,|X_{s,t}| + C\|Y\|_{\beta;[s,t]}\|X\|_{\alpha;[s,t]}|t-s|^{\alpha+\beta} \\
&\le \big(|Y_0| + \|Y\|_{\beta;[0,T]}T^\beta\big)|X_{s,t}| + C\|Y\|_{\beta;[0,T]}\|X\|_{\alpha;[0,T]}T^\beta|t-s|^\alpha \\
&\le \Big[\big(|Y_0| + \|Y\|_{\beta;[0,T]}T^\beta\big)(1 + C)\Big]\|X\|_{\alpha;[0,T]}|t-s|^\alpha \\
&\le (1 \vee T^\beta)(1 + C)\Big[|Y_0| + \|Y\|_{\beta;[0,T]}\Big]\|X\|_{\alpha;[0,T]}|t-s|^\alpha,
\end{aligned}
$$
holds true whenever X is a geometric rough path. (Hence, from a rough path perspec-
tive, integration of gradient 1-forms against geometric rough paths is trivial for the
outcome does not depend on X.) What about non-geometric rough paths?
Exercise 4.21. Complete the “first argument” in the proof of Theorem 4.4.
Solution 4.22. Let $P_n$ be the dyadic partitions of [0, T], so that $\#P_n = 2^n$ and mesh
$|P_n| = T/2^n$. Call elements of $P_n$ dyadic intervals (of level-n). Given an interval
[s, t] ⊂ [0, T] there exists m ≥ 0, such that $P_m$ is the coarsest dyadic partition which
contains a dyadic interval ⊂ [s, t]. Note that $|t-s| \sim T/2^m$. We then define, for
n ≥ m, and a general interval [s, t],
$$I^n_{s,t} := \sum_{[u,v]\in P_n:\ [u,v]\subset[s,t]} \Xi_{u,v}.$$
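Note $I^n_{s,t} = \Xi_{s,t}$ if $[s,t] \in P_n$. Write $s^{(n)}_+$ (resp. $t^{(n)}_-$) for the closest right (resp. left)
level-n dyadic neighbour of s (resp. t) so that
$$s \le s^{(n)}_+ < t^{(n)}_- \le t.$$
Note that if s is a level-m dyadic (i.e. $s = kT/2^m$ for some integer k) then $s^{(n)}_+ = s$
for all n ≥ m, and similarly for t. We have
$$
\begin{aligned}
\big|I^{n+1}_{s,t} - I^n_{s,t}\big| &\le \sum_{[u,v]\in P_n:\ [u,v]\subset[s,t]} \big|\delta\Xi_{u,\frac{u+v}{2},v}\big| + \big|\Xi_{s^{(n+1)}_+,\,s^{(n)}_+}\big| + \big|\Xi_{t^{(n)}_-,\,t^{(n+1)}_-}\big| \\
&\lesssim \frac{|t^{(n)}_- - s^{(n)}_+|}{2^{-n}}\Big(\frac{1}{2^n}\Big)^\beta + 2^{-(n+1)\alpha} + 2^{-(n+1)\alpha}.
\end{aligned}
$$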
Plainly, these estimates imply, for general [s, t] ⊂ [0, T], that $(I^n_{s,t} : n)$ is Cauchy
and we call the limit $I_{s,t}$. In fact, I is additive in the sense that δI ≡ 0. Indeed, for
general s < u < t in [0, T], if $u_-$ (resp. $u_+$) denotes the closest left (resp. right)
level-n dyadic neighbour, then
$$I^n_{s,t} = I^n_{s,u} + I^n_{u,t} + \Xi_{u_-,u_+},$$
and since $|\Xi_{u_-,u_+}| \lesssim |u_+ - u_-|^\alpha \sim (1/2^n)^\alpha$, additivity of the limit $I = \lim I^n$
follows at once. Another immediate consequence of the above estimates, if applied
to a dyadic interval [s, t], is the estimate
$$|I_{s,t} - \Xi_{s,t}| \lesssim |t-s|^\beta. \tag{4.33}$$
We claim that the estimate (4.33) is valid for all intervals [s, t] ⊂ [0, T]. By
continuity, it will be enough to consider s < t in $\cup_n P_n$. As in the proof of
the Kolmogorov criterion, Theorem 3.1, we consider a (finite) partition $P = (\tau_i)$
of [s, t], which “efficiently” exhausts [s, t] with dyadic intervals of length
$\sim 2^{-n}$, n ≥ m, in the sense that no three intervals have the same length. Note
that $|P| \equiv \max\{|v-u| : [u,v] \in P\} = 2^{-m} \le |t-s|$ (and in fact ∼ |t − s| due
to minimal choice of m). Thanks to additivity of I and (4.33) for dyadic intervals,
$$
\begin{aligned}
|I_{s,t} - \Xi_{s,t}| &= \Big|\sum_{[u,v]\in P}(I_{u,v} - \Xi_{u,v}) - \Big(\Xi_{s,t} - \sum_{[u,v]\in P}\Xi_{u,v}\Big)\Big| \\
&\lesssim \sum_{[u,v]\in P}|v-u|^\beta + \Big|\Xi_{s,t} - \sum_{[u,v]\in P}\Xi_{u,v}\Big|.
\end{aligned}
$$
∞
X β
≤ |t − s| + δΞs,τ
−(i+1) ,τ−i
+ δΞτi ,τi+1 ,t ,
i=0
β
so that |Is,t − Ξs,t | . |t − s| , as required
Exercise 4.23. Adapt the proof of Theorem 4.4 so as to obtain Young’s estimate
(4.3).
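Exercise 4.24. Fix α ∈ (0, 1], h > 0 and M > 0. Consider a path Z : [0, T ] → V
and show that
$$ \|Z\|_{\alpha;h} \equiv \sup_{\substack{0 \le s < t \le T \\ t - s \le h}} \frac{|Z_{s,t}|}{|t - s|^{\alpha}} \le M \quad \Longrightarrow \quad \|Z\|_{\alpha} \le M \big( 1 \vee 2h^{-(1-\alpha)} \big) . $$
(Here, as usual, $\|Z\|_{\alpha} \equiv \sup_{0 \le s < t \le T} |Z_{s,t}| / |t - s|^{\alpha}$.)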
Proof. By scaling it suffices to consider M = 1. Fix 0 ≤ s < t ≤ T; we need
to show that $|Z_{s,t}| / |t - s|^{\alpha}$ is bounded by $1 \vee 2h^{-(1-\alpha)}$. There is nothing to show for
|t − s| ≤ h. We therefore assume h ≤ |t − s| and define $t_i = (s + ih) \wedge t$, for
i = 0, 1, . . ., noting that $t_N = t$ for N ≥ |t − s|/h and also $t_{i+1} - t_i \le h$ for all i.
But then
$$ |Z_{s,t}| \le \sum_{0 \le i < |t-s|/h} \big| Z_{t_i,t_{i+1}} \big| $$
4.7 Comments
Abstract In this chapter, we compare the integration theory developed in the
previous chapter to the usual theories of stochastic integration, be it in the Itô or the
Stratonovich sense.
Recall from Section 3 that Brownian motion B can be enhanced to a (random) rough
path $\mathbf{B} = (B, \mathbb{B})$. Presently our focus is the case when $\mathbb{B}$ is given by the iterated Itô
integral¹
$$ \mathbb{B}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} \overset{\text{def}}{=} \int_s^t B_{s,u} \otimes dB_u $$
and the so enhanced Brownian motion has almost surely (non-geometric) α-Hölder
rough sample paths, for any $\alpha \in (\tfrac13, \tfrac12)$. That is, $\mathbf{B}(\omega) = (B(\omega), \mathbb{B}(\omega)) \in \mathscr{C}^{\alpha}$ for
every $\omega \in N_1^c$ where, here and in the sequel, $N_i$, i = 1, 2, ... denote suitable null sets.
We now show that rough integrals (against $\mathbf{B} = \mathbf{B}^{\text{Itô}}$) and Itô integrals, whenever
both are well-defined, coincide.
Proposition 5.1. Assume $(Y(\omega), Y'(\omega)) \in \mathscr{D}^{2\alpha}_{B(\omega)}$ for every $\omega \in N_2^c$. Set $N_3 = N_1 \cup N_2$. Then the rough integral
$$ \int_0^T Y \, d\mathbf{B} = \lim_{n \to \infty} \sum_{[u,v] \in P_n} \big( Y_u B_{u,v} + Y'_u \mathbb{B}_{u,v} \big) $$
exists, for each fixed $\omega \in N_3^c$, along any sequence $(P_n)$ with mesh $|P_n| \downarrow 0$. If Y, Y′
are adapted then, almost surely,
$$ \int_0^T Y \, d\mathbf{B} = \int_0^T Y \, dB . $$
¹ The case when $\mathbb{B}$ is given via iterated Stratonovich integration is left to Section 5.2 below.
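To make the role of the second-order term concrete, here is a minimal Python sketch for the one-dimensional case Y = B, Y′ = 1, using the scalar identity $\mathbb{B}^{\text{Itô}}_{u,v} = \tfrac12 (B_{u,v}^2 - (v - u))$. The simulated random-walk path, the uniform partition and all function names are assumptions made for illustration only.

```python
import random

def brownian_path(n_steps, T=1.0, seed=0):
    """Discretised Brownian sample path on a uniform grid (illustrative simulation)."""
    rng = random.Random(seed)
    h = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, h ** 0.5))
    return path

def ito_sum(B):
    """Left-point Riemann sum: sum over [u, v] of B_u * B_{u,v}."""
    return sum(B[k] * (B[k + 1] - B[k]) for k in range(len(B) - 1))

def rough_sum(B, T=1.0):
    """Compensated sum: sum over [u, v] of Y_u B_{u,v} + Y'_u BB_{u,v}, with Y = B, Y' = 1
    and the scalar Ito lift BB_{u,v} = (B_{u,v}^2 - (v - u)) / 2."""
    h = T / (len(B) - 1)
    return sum(B[k] * (B[k + 1] - B[k]) + 0.5 * ((B[k + 1] - B[k]) ** 2 - h)
               for k in range(len(B) - 1))

if __name__ == "__main__":
    B = brownian_path(4096)
    print(ito_sum(B), rough_sum(B), 0.5 * (B[-1] ** 2 - 1.0))
```

In this special case the compensated sum telescopes to $\tfrac12 (B_T^2 - T)$ for every partition, while the plain left-point sum differs from it by half the deviation of the discrete quadratic variation from T, which is exactly the quantity that (5.1) controls.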
Proof. Without loss of generality T = 1. The existence of the rough integral for
$\omega \in N_3^c$ under the stated assumptions is immediate from Theorem 4.10, applied
to Y(ω), controlled by B(ω), for $\omega \in N_2^c$ fixed. Recall (e.g. [RY91]) that for any
continuous, adapted process Y the Itô integral against Brownian motion has the
representation
$$ \int_0^1 Y \, dB = \lim_{n \to \infty} \sum_{[u,v] \in P_n} Y_u B_{u,v} \quad \text{(in probability)} $$
$$ \sup_{\omega \in N_5^c} |Y'(\omega)|_{\infty} \le M . $$
(This is the case in the “model” situation Y = F(X), Y′ = DF(X) where F was
in particular assumed to have bounded derivatives; the general case is obtained by
localisation and left to Exercise 5.13.)
The claim is that the rough and Itô integral coincide on $N_5^c$. With a look at the
respective Riemann-sums, convergent away from $N_5$, basic analysis tells us that
$$ \forall \omega \in N_5^c : \quad \exists \lim_{n} \sum_{[u,v] \in P_n} Y'_u \mathbb{B}_{u,v} , $$
and that this limit equals the difference of rough and Itô integrals (on $N_5^c$, a set of
full measure). Of course, $|P_n| \downarrow 0$, and to see that the above limit is indeed zero (at
least on a set of full measure), it will be enough to show that
$$ \Big\| \sum_{[u,v] \in P} Y'_u \mathbb{B}_{u,v} \Big\|^2_{L^2} = O(|P|) . \tag{5.1} $$
as desired. □
5.2 Stratonovich integration
Almost surely, this construction then yields geometric α-Hölder rough sample paths,
for any $\alpha \in (\tfrac13, \tfrac12)$. Recall that, by definition, the Stratonovich integral is given by
$$ \int_0^T Y \circ dB \overset{\text{def}}{=} \int_0^T Y \, dB + \frac12 [Y, B]_T $$
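Proof. $\mathbb{B}^{\text{Strat}}_{s,t} = \mathbb{B}^{\text{Itô}}_{s,t} + f_{s,t}$ where $f(t) = (1/2)\, t\, I$, where $I \in \mathbb{R}^d \otimes \mathbb{R}^d$ denotes the
identity matrix. This entails, as was discussed in Example 4.13,
$$ \int_0^1 Y \, d\mathbf{B}^{\text{Strat}} = \int_0^1 Y \, d\mathbf{B}^{\text{Itô}} + \int_0^1 Y' \, df . $$
Thanks to Proposition 5.1, it only remains to identify $2 \int_0^1 Y' \, df = \int_0^1 Y'_t \, dt$ with
$[Y, B]_1$. To see this, write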
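$$
\begin{aligned}
\sum_{[u,v] \in P} Y_{u,v} B_{u,v}
&= \sum_{[u,v] \in P} \big( Y'_u B_{u,v} B_{u,v} + R_{u,v} B_{u,v} \big) \\
&= \sum_{[u,v] \in P} Y'_u (B_{u,v} \otimes B_{u,v}) + O\big( |P|^{3\alpha - 1} \big) ,
\end{aligned}
$$
where we used that $\sum_{P} R_{u,v} B_{u,v} = O(|P|^{3\alpha - 1})$ thanks to $R \in \mathcal{C}_2^{2\alpha}$ and $B \in \mathcal{C}^{\alpha}$.
Note that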
We have seen in the proof of Proposition 5.1 that any limit (in probability, say) of
$$ \sum_{[u,v] \in P} Y'_u \mathbb{B}^{\text{Itô}}_{u,v} $$
must be zero. In fact, a look at the argument reveals that this remains true with $\mathbb{B}^{\text{Itô}}_{u,v}$
replaced by $\mathrm{Sym}(\mathbb{B}^{\text{Itô}}_{u,v})$. It follows that
$$ \lim_{|P| \to 0} \sum_{[u,v] \in P} Y_{u,v} B_{u,v} = \lim_{|P| \to 0} \sum_{[u,v] \in P} Y'_u (v - u) = \int_0^1 Y'_t \, dt , $$
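As a rough numerical check of this identification of the bracket, the Python sketch below compares $\sum Y_{u,v} B_{u,v}$ with $\sum Y'_u (v - u)$ for the scalar example Y = sin(B), Y′ = cos(B). The simulated path, the choice of Y and the function names are illustrative assumptions, and a single sample path proves nothing; it only shows the two sums landing close to each other on a fine partition.

```python
import math
import random

def brownian_path(n_steps, T=1.0, seed=1):
    """Discretised Brownian sample path on a uniform grid (illustrative simulation)."""
    rng = random.Random(seed)
    h = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, h ** 0.5))
    return path

def bracket_sum(B):
    """Discrete covariation: sum over [u, v] of Y_{u,v} * B_{u,v} with Y = sin(B)."""
    return sum((math.sin(B[k + 1]) - math.sin(B[k])) * (B[k + 1] - B[k])
               for k in range(len(B) - 1))

def drift_sum(B, T=1.0):
    """Riemann sum: sum over [u, v] of Y'_u * (v - u) with Y' = cos(B)."""
    h = T / (len(B) - 1)
    return sum(math.cos(B[k]) * h for k in range(len(B) - 1))

if __name__ == "__main__":
    B = brownian_path(16384)
    print(bracket_sum(B), drift_sum(B))
```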
Unsurprisingly, the same change of variables formula holds for geometric rough
paths $\mathbf{X} = (X, \mathbb{X})$, which are essentially limits of smooth paths, and it is not hard
to figure out, in view of Example 4.13, that a “second order” correction, involving
$D^2 F$, appears in the non-geometric case. In other words, one can write down Itô
formulae for rough paths.
Before doing so, however, an important preliminary discussion is in order. Namely,
much of our effort so far was devoted to the understanding of (rough) integration
against 1-forms, say G = G(X) and indeed we found
$$ \int G(X) \, d\mathbf{X} \approx \sum_{[s,t] \in P} \langle G(X_s), X_{s,t} \rangle + \langle DG(X_s), \mathbb{X}_{s,t} \rangle $$
in the sense that the compensated Riemann-Stieltjes sums appearing on the right-hand
side converge with mesh |P| → 0. Let us split $\mathbb{X}$ into symmetric part, $\mathbb{S}_{s,t} := \mathrm{Sym}(\mathbb{X}_{s,t})$,
and antisymmetric (“area”) part, $\mathrm{Anti}(\mathbb{X}_{s,t}) := \mathbb{A}_{s,t}$. Then
and the final term disappears in the gradient case, i.e. when G = DF. Indeed, the
contraction of a symmetric tensor (here: $D^2 F$) with an antisymmetric tensor (here:
$\mathbb{A}$) always vanishes. In other words, area matters very much for general integrals
of 1-forms but not at all for gradient 1-forms. Note also that, contrary to $\mathbb{A}$, the
symmetric part $\mathbb{S}$ is a nice function of the underlying path X. For instance, for Itô
enhanced Brownian motion in $\mathbb{R}^d$, one has the identity
$$ \mathbb{S}^{i,j}_{s,t} = \frac12 \int_s^t \big( B^i_{s,r} \, dB^j_r + B^j_{s,r} \, dB^i_r \big) = \frac12 \big( B^i_{s,t} B^j_{s,t} - \delta^{ij} (t - s) \big) , \qquad 1 \le i, j \le d . $$
These considerations suggest that the following definition encapsulates all the data
required for the integration of gradient 1-forms.
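The vanishing of the contraction of a symmetric with an antisymmetric tensor is elementary linear algebra and easy to check numerically. The following Python sketch does so for random d × d matrices; the helper names and the random test matrices are assumptions for illustration only.

```python
import random

def random_matrix(d, seed):
    """A d x d matrix with independent standard Gaussian entries (illustrative data)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]

def sym(M):
    """Symmetric part (M + M^T) / 2."""
    d = len(M)
    return [[0.5 * (M[i][j] + M[j][i]) for j in range(d)] for i in range(d)]

def anti(M):
    """Antisymmetric part (M - M^T) / 2."""
    d = len(M)
    return [[0.5 * (M[i][j] - M[j][i]) for j in range(d)] for i in range(d)]

def contract(S, A):
    """Full contraction: sum over i, j of S_ij * A_ij."""
    d = len(S)
    return sum(S[i][j] * A[i][j] for i in range(d) for j in range(d))

if __name__ == "__main__":
    H = sym(random_matrix(4, seed=0))   # plays the role of D^2 F(X_s)
    XX = random_matrix(4, seed=1)       # plays the role of a second-level increment
    print(contract(H, anti(XX)), contract(H, XX) - contract(H, sym(XX)))
```

Both printed numbers vanish up to rounding: only the symmetric part of the second-level increment is seen by a symmetric Hessian.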
Definition 5.3. We call $\mathbf{X} = (X, \mathbb{S})$ a reduced rough path, in symbols $\mathbf{X} \in \mathscr{C}^{\alpha}_r([0, T], V)$,
if $X = X_t$ takes values in a Banach space V, $\mathbb{S} = \mathbb{S}_{s,t}$ takes values
in Sym(V ⊗ V), and the following hold:
i) a “reduced” Chen relation
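Here, writing P for partitions of [0, t], the first integral is given by²
$$ \int_0^t DF(X_s) \, d\mathbf{X}_s \overset{\text{def}}{=} \lim_{|P| \to 0} \sum_{[u,v] \in P} \Big( \langle DF(X_u), X_{u,v} \rangle + \langle D^2 F(X_u), \mathbb{S}_{u,v} \rangle \Big) , \tag{5.2} $$
while the second integral is a well-defined Young integral.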
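Proof. Consider first the geometric case, $\mathbb{S} = \bar{\mathbb{S}}$, in which case the bracket is zero. The
proof is straightforward. Indeed, thanks to α-Hölder regularity of X with α > 1/3,
we obtain
$$
\begin{aligned}
F(X_T) - F(X_0)
&= \sum_{[u,v] \in P} \big( F(X_v) - F(X_u) \big) \\
&= \sum_{[u,v] \in P} \Big( \langle DF(X_u), X_{u,v} \rangle + \frac12 \langle D^2 F(X_u), X_{u,v} \otimes X_{u,v} \rangle + o(|v - u|) \Big) \\
&= \sum_{[u,v] \in P} \Big( \langle DF(X_u), X_{u,v} \rangle + \langle D^2 F(X_u), \bar{\mathbb{S}}_{u,v} \rangle + o(|v - u|) \Big) .
\end{aligned}
$$
We conclude by taking the limit |P| → 0, also noting that $\sum_{[u,v] \in P} o(|v - u|) \to 0$.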
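For the non-geometric situation, just substitute
$$ \bar{\mathbb{S}}_{u,v} = \mathbb{S}_{u,v} + \frac12 [\mathbf{X}]_{u,v} . $$
Since $D^2 F$ is Lipschitz, $D^2 F(X_\cdot) \in \mathcal{C}^{\alpha}$ and we can split up the “bracket” term and
note that
$$ \sum_{[u,v] \in P} \big\langle D^2 F(X_u), [\mathbf{X}]_{u,v} \big\rangle \to \int_0^t D^2 F(X_u) \, d[\mathbf{X}]_u , $$
where the convergence to the Young integral follows from $[\mathbf{X}] \in \mathcal{C}^{2\alpha}$. The rest is
now obvious. □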
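The geometric case of this argument, where the second-order term is $\tfrac12 X_{u,v} \otimes X_{u,v}$, can be illustrated in one dimension. The Python sketch below uses the cubic F(x) = x³, for which the second-order Taylor sum misses F(X_T) − F(X_0) by exactly $\sum X_{u,v}^3$. The simulated path and the function names are assumptions for illustration only.

```python
import random

def brownian_path(n_steps, T=1.0, seed=2):
    """Discretised Brownian sample path on a uniform grid (illustrative simulation)."""
    rng = random.Random(seed)
    h = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, h ** 0.5))
    return path

def compensated_sum(X, dF, d2F):
    """Sum over [u, v] of DF(X_u) X_{u,v} + (1/2) D^2F(X_u) X_{u,v}^2 (geometric case)."""
    total = 0.0
    for k in range(len(X) - 1):
        dx = X[k + 1] - X[k]
        total += dF(X[k]) * dx + 0.5 * d2F(X[k]) * dx * dx
    return total

def cubic_remainder(X):
    """Sum over [u, v] of X_{u,v}^3, the exact Taylor remainder for F(x) = x^3."""
    return sum((X[k + 1] - X[k]) ** 3 for k in range(len(X) - 1))

if __name__ == "__main__":
    X = brownian_path(4096)
    lhs = X[-1] ** 3 - X[0] ** 3
    rhs = compensated_sum(X, lambda x: 3.0 * x * x, lambda x: 6.0 * x)
    print(lhs, rhs, lhs - rhs, cubic_remainder(X))
```

The remainder plays the role of $\sum o(|v - u|)$ in the proof and is small on a fine partition, since the path is α-Hölder with 3α > 1.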
² Note consistency with the rough integral when $\mathbf{X} \in \mathscr{C}^{\alpha}$.
Example 5.7. Consider the case when $\mathbf{X} = \mathbf{B}$, Itô enhanced Brownian motion. Then
$\mathbb{X}$ is given by the iterated Itô integrals, with twice its symmetric part given by
$$ 2 \mathbb{S}^{i,j}_{0,t} = \int_0^t \big( B^i \, dB^j + B^j \, dB^i \big) = B^i_t B^j_t - \langle B^i, B^j \rangle_t . $$
The usual Itô formula is then recovered from the fact that
$$ [\mathbf{B}]^{i,j}_t = B^i_{0,t} B^j_{0,t} - 2 \mathbb{S}^{i,j}_{0,t} = \langle B^i, B^j \rangle_{0,t} = \delta^{i,j} t . $$
We conclude this section with a short discussion on Föllmer’s calcul d’Itô sans
probabilités [Föl81]. For simplicity of notation, we take $V = \mathbb{R}^d$, $W = \mathbb{R}^e$ in what
follows. With regard to (5.2), let us insist that the compensation is necessary and one
cannot, in general, separate the sum into two convergent sums. On the other hand,
we can combine the converging sums and write
$$
\begin{aligned}
F(X)_{0,t}
&= \lim_{|P| \to 0} \bigg( \sum_{[u,v] \in P} \Big( \langle DF(X_u), X_{u,v} \rangle + \langle D^2 F(X_u), \mathbb{S}_{u,v} \rangle \Big) + \frac12 \sum_{[u,v] \in P} D^2 F(X_u) [\mathbf{X}]_{u,v} \bigg) \\
&= \lim_{|P| \to 0} \sum_{[u,v] \in P} \Big( \langle DF(X_u), X_{u,v} \rangle + \frac12 \langle D^2 F(X_u), X_{u,v} \otimes X_{u,v} \rangle \Big) .
\end{aligned}
\tag{5.3}
$$
We now put forward an assumption that allows us to break up the sum above.
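Definition 5.8. Let $(P_n)$ be a sequence of partitions of [0, T ] with mesh $|P_n| \to 0$.
We say that $X : [0, T] \to \mathbb{R}^d$ has finite quadratic variation in the sense of Föllmer
along $(P_n)$ if and only if, for every t ∈ [0, T ] and 1 ≤ i, j ≤ d the limit
$$ [X^i, X^j]_t := \lim_{n \to \infty} \sum_{[u,v] \in P_n} \big( X^i_{v \wedge t} - X^i_{u \wedge t} \big) \big( X^j_{v \wedge t} - X^j_{u \wedge t} \big) $$
exists. Write [X, X] for the resulting path with values in $\mathrm{Sym}(\mathbb{R}^d \otimes \mathbb{R}^d)$, i.e. the
$$ \lim_{n \to \infty} \sum_{\substack{[u,v] \in P_n \\ u < t}} \langle G(u), X_{u,v} \otimes X_{u,v} \rangle = \int_0^t G(u) \, d[X, X]_u \in \mathbb{R}^e . $$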
Proof. For the first statement, it is enough to argue component by component. Set
$[X^i] := [X^i, X^i]$. By polarisation,
$$ [X^i, X^j]_t = \frac12 \Big( [X^i + X^j]_t - [X^i]_t - [X^j]_t \Big) . $$
Since each term on the right-hand side is monotone in t, we see that $t \mapsto [X^i, X^j]_t$
is indeed of bounded variation.
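The polarisation identity already holds at the level of the discrete sums along any fixed partition, which the following short Python sketch checks on small hand-made paths. The function names and the sample data are illustrative assumptions.

```python
def discrete_bracket(x, y):
    """Sum over consecutive sample points of (x_v - x_u) * (y_v - y_u)."""
    return sum((x[k + 1] - x[k]) * (y[k + 1] - y[k]) for k in range(len(x) - 1))

def polarised_bracket(x, y):
    """The same quantity via polarisation: ([x + y] - [x] - [y]) / 2."""
    s = [a + b for a, b in zip(x, y)]
    return 0.5 * (discrete_bracket(s, s) - discrete_bracket(x, x) - discrete_bracket(y, y))

if __name__ == "__main__":
    x = [0, 1, 3]
    y = [0, 2, 5]
    print(discrete_bracket(x, y), polarised_bracket(x, y))
```

Each of the three terms on the right is a sum of squares, hence nondecreasing as further partition intervals are added, which is the monotonicity used in the proof.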
Regarding the second statement, it is enough to check that, for continuous g :
[0, T ] → R and Y of finite quadratic variation, with continuous bracket $t \mapsto [Y]_t$,
$$ \lim_{n \to \infty} \sum_{\substack{[u,v] \in P_n \\ u < t}} g(u) Y_{u,v}^2 = \int_0^t g(u) \, d[Y]_u . \tag{5.4} $$
Indeed, we can apply this for each component, with $g = G^k_{i,j}$ and
$$ Y \in \big\{ X^i + X^j, \; X^i, \; X^j \big\} . $$
To see that (5.4) holds, write $\sum_{[u,v] \in P_n, u < t} g(u) Y_{u,v}^2 = \int_{[0,t)} g(u) \, d\mu_n(u)$ with
$$ \mu_n = \sum_{[u,v] \in P_n, \, u < t} \delta_u Y_{u,v}^2 $$
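A hedged numerical illustration of (5.4): for a simulated Brownian path, whose bracket along dyadic partitions is $[Y]_t = t$, integrating g against the discrete measure $\mu_n$ should be close to $\int_0^1 g(u) \, du$. The path simulation, the test function g(u) = u and the function names below are assumptions for illustration only.

```python
import random

def brownian_path(n_steps, T=1.0, seed=3):
    """Discretised Brownian sample path on a uniform grid (illustrative simulation)."""
    rng = random.Random(seed)
    h = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, h ** 0.5))
    return path

def weighted_qv(g, Y, T=1.0):
    """Integral of g against mu_n: sum over [u, v] of g(u) * Y_{u,v}^2 on the uniform grid."""
    n = len(Y) - 1
    h = T / n
    return sum(g(k * h) * (Y[k + 1] - Y[k]) ** 2 for k in range(n))

if __name__ == "__main__":
    B = brownian_path(16384)
    print(weighted_qv(lambda u: u, B), 0.5)  # compare with the integral of u du over [0, 1]
```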
Combination of the above lemma with (5.3) gives the Itô–Föllmer formula,
$$ F(X_t) = F(X_0) + \int_0^t DF(X_s) \, dX + \frac12 \int_0^t D^2 F(X_s) \, d[X, X]_s , \qquad 0 \le t \le T \tag{5.5} $$
where the middle integral is given by the (now existent) limit of left-point
Riemann-Stieltjes approximations
$$ \lim_{n \to \infty} \sum_{[u,v] \in P_n} \langle DF(X_u), X_{u,v} \rangle =: \int_0^t DF(X) \, dX . $$
In fact, we encourage the reader to verify as an exercise that this formula is valid
whenever $X : [0, T] \to \mathbb{R}^d$ is continuous, of finite quadratic variation, with $t \mapsto [X, X]_t$
continuous. Note, however, that Föllmer’s notion of quadratic variation (and
the above integral) can and will depend in general on the sequence $(P_n)$.
whenever this limit exists, in probability and uniformly on compact time intervals,
and does not depend on the sequence of partitions (as long as their meshes tend to
zero). For instance,
$$ \int_0^t B_s \, d\overleftarrow{B}_s = \frac12 B_t^2 + \frac{t}{2} . $$
In many applications one encounters integrands f that are “backward adapted” in the
sense that $f_t$ is $\mathcal{F}_t^T$-measurable with $\mathcal{F}_s^t := \sigma(B_{u,v} : s \le u \le v \le t)$. For example
$$ \int_0^t (B_t - B_s) \, d\overleftarrow{B}_s = B_t^2 - \frac12 B_t^2 - \frac{t}{2} = \frac12 B_t^2 - \frac{t}{2} $$
and we note (in contrast to the previous example) the zero mean property, which of
course comes from a backward martingale structure. By analogy with its forward
counterpart, the backward Stratonovich integral is defined as the backward Itô
integral, minus 1/2 times the quadratic variation of the integrand.
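Both backward examples reduce, at the level of right-point Riemann sums, to exact algebraic identities involving the discrete quadratic variation. The Python sketch below checks them on a simulated path; the simulation and the function names are illustrative assumptions.

```python
import random

def brownian_path(n_steps, T=1.0, seed=4):
    """Discretised Brownian sample path on a uniform grid (illustrative simulation)."""
    rng = random.Random(seed)
    h = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, h ** 0.5))
    return path

def quadratic_variation(B):
    """Discrete quadratic variation: sum of squared increments."""
    return sum((B[k + 1] - B[k]) ** 2 for k in range(len(B) - 1))

def backward_ito_sum(B):
    """Right-point sum for the integrand B_s: sum over [u, v] of B_v * B_{u,v}."""
    return sum(B[k + 1] * (B[k + 1] - B[k]) for k in range(len(B) - 1))

def backward_adapted_sum(B):
    """Right-point sum for the backward adapted integrand B_T - B_s."""
    return sum((B[-1] - B[k + 1]) * (B[k + 1] - B[k]) for k in range(len(B) - 1))

if __name__ == "__main__":
    B = brownian_path(4096)
    qv = quadratic_variation(B)
    print(backward_ito_sum(B), 0.5 * (B[-1] ** 2 + qv))
    print(backward_adapted_sum(B), 0.5 * (B[-1] ** 2 - qv))
```

Since the discrete quadratic variation is close to T on a fine partition, the two sums approximate $\tfrac12 B_T^2 + \tfrac T2$ and $\tfrac12 B_T^2 - \tfrac T2$ respectively.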
The purpose of this section is to understand backward integration as rough integration.
To this end, recall that the rough integral of $(Y, Y') \in \mathscr{D}^{2\alpha}_X$ against $\mathbf{X} = (X, \mathbb{X})$
was defined by
$$ \int_0^T Y \, d\mathbf{X} = \lim_{|P| \downarrow 0} \sum_{[s,t] \in P} \big( Y_s X_{s,t} + Y'_s \mathbb{X}_{s,t} \big) $$
where P are partitions of [0, T ] with mesh-size |P|. Clearly, some sort of “left-point”
evaluation has been hard-wired into our definition of rough integral. On the other
hand, one can expect that feeding in explicit second order information makes this
choice somewhat less important than in the case of classical stochastic integration.
The next proposition, purely deterministic, answers the question of to what extent
one can replace left-point by right-point evaluation.
whenever $(*)_{s,t} \approx 0$ in the sense that $(*)_{s,t} = O(|t - s|^{3\alpha}) = o(|t - s|)$, so that it
does not contribute to the limit. (Recall (4.19) and Lemma 4.2.) But then
Remark 5.11. Note that another way of writing (5.6) is the somewhat more suggestive
$$ \int_0^T Y \, d\mathbf{X} = - \lim_{|P| \downarrow 0} \sum_{[s,t] \in P} \big( Y_t X_{t,s} + Y'_t \mathbb{X}_{t,s} \big) . $$
is, in general, not well-defined. In fact, in view of the above proposition, existence of
this limit is equivalent to existence of (either)
$$ \lim_{|P| \downarrow 0} \sum_{[s,t] \in P} Y'_t X_{s,t} \otimes X_{s,t} = \lim_{|P| \downarrow 0} \sum_{[s,t] \in P} Y'_s X_{s,t} \otimes X_{s,t} . $$
There is no reason why, for a general path $X \in \mathcal{C}^{\alpha}$, the above limits should exist.
On the other hand, we already considered such sums in the context of the Itô–Föllmer
formula, cf. Lemma 5.9. The appropriate condition for X was seen to be “quadratic
variation (in the sense of Föllmer, along some $(P_n)$)”. And under this assumption,
$$ \sum_{[s,t] \in P^n} Y'_s X_{s,t} \otimes X_{s,t} \to \int_0^T Y'_s \, d[X]_s . \tag{5.7} $$
Proof. Regarding point i), it follows from the definition of the rough integral (see
also Example 4.13) that
$$ \int_0^t Y \, d\mathbf{B}^{\text{back}} = \int_0^t Y \, d\mathbf{B}^{\text{Itô}} + \int_0^t Y' I \, ds . $$
The claim then follows from Proposition 5.1. The Stratonovich case is similar, now
using Corollary 5.2.
We now turn to point ii). Thanks to the backward presentation established in
Proposition 5.10,
$$
\begin{aligned}
\int_0^T Y \, d\mathbf{B}^{\text{back}}
&= \lim_{n \to \infty} \sum_{[s,t] \in P^n} \Big( Y_t B_{s,t} + Y'_t \big( \mathbb{B}^{\text{Itô}}_{s,t} + I(t - s) - B_{s,t} \otimes B_{s,t} \big) \Big) \\
&= \lim_{n \to \infty} \sum_{[s,t] \in P^n} \Big( Y_t B_{s,t} + Y'_t \mathbb{B}^{\text{Itô}}_{s,t} - Y'_s \big( B_{s,t} \otimes B_{s,t} - I(t - s) \big) \Big) ,
\end{aligned}
$$
using $Y'_{s,t} (X_{s,t} \otimes X_{s,t}) \approx 0$ and $Y'_{s,t} I (t - s) \approx 0$. (As before $(*)_{s,t} \approx 0$ means
$(*)_{s,t} = o(|t - s|)$.) Now we know that with probability 1, B(ω) has finite quadratic
variation $[B]_t = It$, in the sense of Föllmer along some sequence $(P^n)$. As a purely
deterministic consequence, cf. (5.7), on the same set of full measure,
$$ \lim_{n \to \infty} \sum_{[s,t] \in P^n} Y'_s B_{s,t} \otimes B_{s,t} = \int_0^T Y'_s \, d[B]_s = \lim_{n \to \infty} \sum_{[s,t] \in P^n} Y'_s I (t - s) . $$
Since $\mathbb{B}^{\text{Itô}}_{s,t}$ is independent from $\mathcal{F}_t^T$ and $Y_t$, $Y'_t$ are $\mathcal{F}_t^T$-measurable, a (backward)
martingale argument shows that
$$ \lim_{n \to \infty} \sum_{[s,t] \in P^n} Y'_t \mathbb{B}^{\text{Itô}}_{s,t} = 0 . $$
5.5 Exercises
Exercise 5.13. Complete the proof of Proposition 5.1 in the case of unbounded
Y′.
Solution 5.14. It suffices to show the convergence of (5.1) in probability; to this end,
we introduce stopping times
$$ \tau_M \overset{\text{def}}{=} \max \big\{ t \in P : |Y'_t| < M \big\} \in [0, T] \cup \{+\infty\} $$
and note that $\lim_{M \to \infty} \tau_M = \infty$ almost surely. The stopped process $S^{\tau_M}_{\cdot}$ is also a
martingale, and we see as above that, for every fixed M > 0,
$$ \Big\| \sum_{\substack{[u,v] \in P \\ u \le \tau_M}} Y'_u \mathbb{B}_{u,v} \Big\|^2_{L^2} = O(|P|) . $$
has a unique solution, starting from any $Y_0 = y_0 \in \mathbb{R}^d$. (As a matter of fact, this
SDE can be solved pathwise by considering the random ODE for $Z_t = Y_t - \sigma B_t$.)
We are interested in the maximum likelihood estimation of the drift parameter A over
a fixed time horizon [0, T ], given some observation path Y = Y(ω). Recall that this
estimator, $\hat{A}_T(\omega)$, is based on the Radon–Nikodym density on pathspace, as given
by Girsanov’s theorem, relative to the drift free diffusion.
a) Let d = 1, h(y) = y. Show that the estimator $\hat{A}$ can be “robustified” in the sense
that $\hat{A}_T(\omega) = \tilde{A}_T(Y(\omega))$ where
$$ \tilde{A}_T(Y) = \frac{Y_T^2 - y_0^2 - \sigma^2 T}{2 \int_0^T Y_t^2 \, dt} . \tag{5.9} $$
a) Let B denote one-dimensional Brownian motion on [0, T ]. Show that the Skorokhod
integral of $B_T$ against B over [0, T ], in symbols $\int_0^T B_T \, \delta B_t$, is given by
$B_T^2 - T$.
b) Set $Y_t(\omega) := B_T(\omega)$, with (zero) increments (trivially) controlled by B with
Y′ := 0. (In view of true roughness of Brownian motion, cf. Section 6, there is no
other choice for Y′.) Show that the rough integral of Y against Brownian motion
over [0, T ] equals $B_T^2$. Conclude that Skorokhod and rough integrals (against Itô
enhanced Brownian motion) do not coincide beyond adapted integrals.
where the limit on the right hand side exists in almost-sure sense. Conclude that in
this case rough integration against $\mathbf{B}^{\text{Strat}}$ coincides almost surely with Stratonovich
anticipating stochastic integration, i.e.
$$ \int_0^{\cdot} F_\omega(B_s) \, d\mathbf{B}^{\text{Strat}}(\omega) \equiv \int_0^{\cdot} F_\omega(B_s) \circ dB_s(\omega) . $$
whenever this limit exists along some sequence of dissections $(P_n) \subset [0, t]$ with
mesh $|P_n| \to 0$. Show that this limit does not exist, in general, when $X = B^H$, a
d-dimensional fractional Brownian motion with Hurst parameter H < 1/2. Hint:
Consider the simplest possible non-trivial case, namely d = 1 and $F(x) = x^2$.
Solution 5.19. Assume convergence in probability, say along some $(P_n)$, for the
approximating (left-point) sum,
$$ \sum_{[u,v] \in P_n} X_u X_{u,v} . $$
We look for a contradiction. Elementary “calculus for sums” implies that the mid-point
sum converges, i.e. where $X_u$ above is replaced by $X_u + X_{u,v}/2$. It follows
that convergence of the left-point sums is equivalent to existence of quadratic
variation, i.e. existence of
$$ \lim_{n \to \infty} \sum_{[u,v] \in P_n} |X_{u,v}|^2 . $$
Note that $E|X_{u,v}|^2 = (1/2^n)^{2H}$ so that the expectation of this sum equals $2^{n(1-2H)}$,
which diverges when H < 1/2. In particular, quadratic variation does not exist as $L^1$
limit. But it also cannot exist as a limit in probability, for both types of convergence
are equivalent on any finite Wiener–Itô chaos.
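The divergence of the expected quadratic variation is a one-line computation, spelled out in the following Python sketch for the level-n dyadic partition of [0, 1]. The function name is an assumption; the formula is the one stated in the solution.

```python
def expected_dyadic_qv(n, H):
    """Expected sum of |X_{u,v}|^2 over the 2^n level-n dyadic intervals of [0, 1],
    using E|X_{u,v}|^2 = |v - u|^{2H}; this equals 2^{n (1 - 2H)}."""
    return 2 ** n * (2.0 ** (-n)) ** (2 * H)

if __name__ == "__main__":
    for H in (0.25, 0.5, 0.75):
        print(H, [expected_dyadic_qv(n, H) for n in (5, 10, 15)])
```

For H = 1/2 the value stays equal to 1, for H > 1/2 it tends to zero, and for H < 1/2 it grows geometrically in n.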
Exercise 5.20. In Proposition 5.6, replace the assumption that $\mathbf{X} = (X, \mathbb{S}) \in \mathscr{C}^{\alpha}_r([0, T], V)$
with α > 1/3, by a suitable p-variation assumption with p < 3.
Show that $[\mathbf{X}]$ has finite p/2-variation and that $\int D^2 F(X) \, d[\mathbf{X}]$, as it appears in Itô’s
$$ e_0(t) = \frac{t}{\sqrt{2\pi}} , \qquad e_k(t) = \frac{\sin kt}{k \sqrt{\pi}} , \qquad e_{-k}(t) = \frac{1 - \cos kt}{k \sqrt{\pi}} , $$
for k > 0. It follows from standard Gaussian measure theory [Bog98] that, given
a sequence $\xi_n$ of i.i.d. normal Gaussian random variables, the sequence $X_N = \sum_{n=-N}^{N} e_n \xi_n$
converges almost surely in $\mathcal{B}$ to a limit X such that the law of X is µ.
Write now $Y_N = \sum_{n=-N}^{N} \mathrm{sign}(n) e_n \xi_n$, so that one also has $Y_N \to Y$ with law of
Y given by µ.
This immediately leads to a contradiction: on the one hand, assuming that $(f, g) \mapsto \int f \, dg$
is continuous on $\mathcal{B}$, this implies that $\int_0^{2\pi} X_N(t) \, dY_N(t)$ converges to some
finite (random) real number. On the other hand, an explicit calculation yields
$$ \int_0^{2\pi} X_N(t) \, dY_N(t) = \frac{\xi_0^2}{2} + \sum_{n=1}^{N} \frac{\xi_n^2 + \xi_{-n}^2}{n} . $$
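Taking expectations in the last display, with $E \xi_n^2 = 1$, gives $\tfrac12 + 2 \sum_{n=1}^{N} 1/n$, which grows like 2 ln N. The short Python sketch below evaluates this expectation; the function name is an assumption and the formula is simply the expectation of the displayed right-hand side.

```python
import math

def expected_integral(N):
    """Expectation of xi_0^2 / 2 + sum_{n=1}^{N} (xi_n^2 + xi_{-n}^2) / n for i.i.d. standard Gaussians."""
    return 0.5 + sum(2.0 / n for n in range(1, N + 1))

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, expected_integral(N), 2.0 * math.log(N))
```

The harmonic growth is what rules out convergence to a finite random variable.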
5.6 Comments
Rough integrals of 1-forms against the Brownian rough path (and also continuous
semi-martingales enhanced to rough paths) are well known to coincide with stochastic
integrals, see [LQ02, FV10b] for instance, but the extensions presented in this section
seem to be new. Our Itô formula for reduced rough paths also appears to be new.
Chapter 6
Doob–Meyer type decomposition for rough paths
$M \equiv \tilde{M}$ and $A \equiv \tilde{A}$ .
hence, by the first part, $M^{\tau}, A^{\tau} \equiv 0$. This also implies that the quadratic variation
of $M^{\tau}$, denoted by $[M^{\tau}]$, vanishes. Since $[M^{\tau}] = [M]^{\tau}$ (see e.g. [RY91, Ch. IV]) it
indeed follows that [M] ≡ 0 on [0, τ). □
¹ As opposed to Hölder regularity which quantifies “roughness from above”, in the sense of an
upper estimate of the increment.
6.2 Uniqueness of the Gubinelli derivative and Doob–Meyer
Here and in the sequel of this section we fix $\alpha \in (\tfrac13, \tfrac12]$, a rough path $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^{\alpha}([0, T], V)$
and a controlled rough path $(Y, Y') \in \mathscr{D}^{2\alpha}_X$. We first address the
question to what extent X and Y determine the Gubinelli derivative Y′. As it turns
out, Y′ is uniquely determined, provided that X is sufficiently “rough from below, in
all directions”. A Doob–Meyer type decomposition will then follow as a corollary.
Let us first consider the case when X is scalar, i.e. with values in V = R. Assume
that for some given s ∈ [0, T ), there exists a sequence of times $t_n \downarrow s$ such that
$|X_{s,t_n}| / |t_n - s|^{2\alpha} \to \infty$, i.e.
$$ \limsup_{t \downarrow s} \frac{|X_{s,t}|}{|t - s|^{2\alpha}} = +\infty . $$
Then $Y'_s$ is uniquely determined from Y by (4.16) and the condition that $\|R^Y\|_{2\alpha} < \infty$.
In fact, one necessarily has $X_{s,t_n} \in \mathbb{R} \setminus \{0\}$ for n large enough and so, from the
very definition of $R^Y$,
$$ Y'_s = \frac{Y_{s,t_n}}{X_{s,t_n}} - \frac{R^Y_{s,t_n}}{|t_n - s|^{2\alpha}} \, \frac{|t_n - s|^{2\alpha}}{X_{s,t_n}} $$
which implies that $\lim_{n \to \infty} Y_{s,t_n} / X_{s,t_n}$ exists and equals $Y'_s$. The multidimensional
case is not that different, and the above consideration suggests the following definition.
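A deterministic toy example may help to fix ideas. In the Python sketch below the scalar path $X_t = \sqrt{t}$ is rough from below at s = 0 for any 2α > 1/2, and Y = sin(X) is controlled by X with Y′ = cos(X), so the difference quotients $Y_{0,t} / X_{0,t}$ should recover $Y'_0 = 1$. The choice of paths, the value α = 0.4 and the function names are all assumptions made for illustration.

```python
import math

def X(t):
    """Illustrative scalar path, 1/2-Hoelder and rough from below at s = 0."""
    return math.sqrt(t)

def Y(t):
    """Controlled by X with Gubinelli derivative Y' = cos(X); remainder is O(|X_{s,t}|^2)."""
    return math.sin(X(t))

def roughness_ratio(t, alpha=0.4):
    """|X_{0,t}| / t^{2 alpha}, which should blow up as t decreases to 0."""
    return abs(X(t) - X(0.0)) / t ** (2.0 * alpha)

def derivative_ratio(t):
    """Y_{0,t} / X_{0,t}, which should converge to Y'_0 = cos(X_0) = 1."""
    return (Y(t) - Y(0.0)) / (X(t) - X(0.0))

if __name__ == "__main__":
    for t in (1e-2, 1e-4, 1e-6, 1e-8):
        print(t, roughness_ratio(t), derivative_ratio(t))
```

A smooth path with non-vanishing derivative would not do here: its increments are of order |t − s|, which is too small compared with $|t - s|^{2\alpha}$ when 2α < 1.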
$$ \forall v^* \in V^* \setminus \{0\} : \quad \limsup_{t \downarrow s} \frac{|\langle v^*, X_{s,t} \rangle|}{|t - s|^{2\alpha}} = \infty . $$
where the second equality follows from the assumption made in (6.2). Now, $Y'_s X_{s,t}$
takes values in $\bar{W}$, the same Banach space in which Y takes its values. For every
$w^* \in \bar{W}^*$, the map $V \ni v \mapsto \langle w^*, Y'_s v \rangle$ defines an element $v^* \in V^*$ so that
$$ \frac{|\langle v^*, X_{s,t} \rangle|}{|t - s|^{2\alpha}} = \bigg| \Big\langle w^*, \frac{Y'_s X_{s,t}}{|t - s|^{2\alpha}} \Big\rangle \bigg| = O(1) \quad \text{as } t \downarrow s; $$
Unless $v^* = 0$, the assumption that “X is rough at time s” implies that, along some
sequence $t_n \downarrow s$, we have the divergent behaviour $|\langle v^*, X_{s,t_n} \rangle| / |t_n - s|^{2\alpha} \to \infty$,
which contradicts that the same expression is O(1) as $t_n \downarrow s$. We thus conclude that
$v^* = 0$. In other words,
and this clearly implies $Y'_s = 0$. This finishes the proof of the implication stated in
(6.2). □
Theorem 6.5 (Doob–Meyer for rough paths). Assume that X is rough at some
time s ∈ [0, T ) and let $(Y, Y') \in \mathscr{D}^{2\alpha}_X$. Then
$$ \int_s^t Y \, d\mathbf{X} = O\big( |t - s|^{2\alpha} \big) \ \text{ as } t \downarrow s \quad \Longrightarrow \quad Y_s = 0 . \tag{6.3} $$
where the last inequality is just the statement that $|t - s| = O(|t - s|^{2\alpha})$ as t ↓ s,
thanks to α ≤ 1/2. We then conclude using (6.3) that $Y_s = \tilde{Y}_s$. If we now assume
true roughness of X, this conclusion holds for a dense set of times s and hence, by
(Attention that the above notation “hides” the dependence on Y′ resp. Ỹ′.) But then
(6.4) implies
$$ \int_0^t Z_r \, dr \equiv \int_0^t \tilde{Z}_r \, dr \quad \text{for } t \in [0, T], $$
and we conclude by differentiation with respect to t. □
Recall that (say, d-dimensional standard) Brownian motion satisfies the so-called
(Khintchine) law of the iterated logarithm, that is
$$ \forall t \ge 0 : \quad P \bigg[ \limsup_{h \downarrow 0} \frac{|B_{t,t+h}|}{h^{1/2} (\ln \ln 1/h)^{1/2}} = \sqrt{2} \bigg] = 1 . \tag{6.5} $$
See [McK69, p.18] or [RY91, Ch. II] for instance, typically proved with exponential
martingales. Remark that it is enough to consider t = 0 since $(B_{t,t+h} : h \ge 0)$ is
also a Brownian motion.
Theorem 6.6. With probability one, Brownian motion on $V = \mathbb{R}^d$ is truly rough,
relative to any Hölder exponent α ∈ [1/4, 1/2).
Proof. It is enough to show that, for fixed time s, and any θ ∈ [1/2, 1),
$$ P \bigg[ \forall \varphi \in V^*, |\varphi| = 1 : \ \limsup_{t \downarrow s} \frac{|\varphi(B_{s,t})|}{|t - s|^{\theta}} = +\infty \bigg] = 1 . $$
(Then take s ∈ Q and conclude that the above event holds true, simultaneously for all
such s, with probability one.)
To this end, set $h^{1/2} (\ln \ln 1/h)^{1/2} \equiv \psi(h)$. We need the following two consequences
of (6.5). There exists c > 0 (here $c = \sqrt{2}$) such that for every fixed
unit dual vector $\varphi \in V^* = (\mathbb{R}^d)^*$ and every fixed s ∈ [0, T )
$$ P \Big[ \limsup_{t \downarrow s} |\langle \varphi, B_{s,t} \rangle| / \psi(t - s) \ge c \Big] = 1 , \qquad P \Big[ \limsup_{t \downarrow s} \frac{|B_{s,t}|}{\psi(t - s)} < \infty \Big] = 1 . $$
full measure,
$$ P \Big[ \forall \varphi \in K : \ \limsup_{t \downarrow s} |\varphi(B_{s,t})| / \psi(t - s) \ge c \Big] = 1 $$
On the other hand, every unit dual vector $\varphi \in V^*$ is the limit of some $(\varphi_n) \subset K$.
Then
$$ \frac{|\langle \varphi_n, B_{s,t} \rangle|}{\psi(t - s)} \le \frac{|\langle \varphi, B_{s,t} \rangle|}{\psi(t - s)} + |\varphi_n - \varphi|_{V^*} \frac{|B_{s,t}|}{\psi(t - s)} $$
so that, using $\limsup (|a| + |b|) \le \limsup |a| + \limsup |b|$, and restricting to the above set
of full measure,
$$ 0 < c \le \limsup_{t \downarrow s} \frac{|\langle \varphi, B_{s,t} \rangle|}{\psi(t - s)} . $$
Hence, for a.e. sample B = B(ω) we can pick a sequence $(t_n)$ converging to s such
that $|\langle \varphi, B_{s,t_n} \rangle| / \psi(t_n - s) \ge c - 1/n$. On the other hand, for any θ ≥ 1/2 we have
where in the borderline case θ = 1/2 (which corresponds to α = 1/4) this divergence
is only logarithmic, $L(\tau) = (\ln \ln 1/\tau)^{1/2}$. □
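The two regimes of divergence can be seen by evaluating $\psi(h) / h^{\theta}$ for small h, as in the following Python sketch. The function names are assumptions, and the formula for ψ is the one fixed in the proof (it requires h < 1/e so that ln ln 1/h is positive).

```python
import math

def psi(h):
    """psi(h) = h^{1/2} (ln ln 1/h)^{1/2}, defined for 0 < h < 1/e."""
    return math.sqrt(h * math.log(math.log(1.0 / h)))

def ratio(h, theta):
    """psi(h) / h^theta: power-law blow-up for theta > 1/2, doubly logarithmic for theta = 1/2."""
    return psi(h) / h ** theta

if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-8, 1e-16):
        print(h, ratio(h, 0.5), ratio(h, 0.75))
```

In the borderline column θ = 1/2 the values increase extremely slowly, which reflects the factor $(\ln \ln 1/h)^{1/2}$.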
Observe that, indeed, any element in $\mathcal{C}^{\alpha}$ which is θ-Hölder rough for θ < 2α
is truly rough. (We shall see in the next section that multidimensional Brownian
motion is θ-Hölder rough for any θ > 1/2.) The following result can be viewed as
a quantitative version of Proposition 6.4.
Proposition 6.8. Let $(X, \mathbb{X}) \in \mathscr{C}^{\alpha}([0, T], V)$ be such that X is θ-Hölder rough for
some θ ∈ (0, 1]. Then, for every controlled rough path $(Y, Y') \in \mathscr{D}^{2\alpha}_X([0, T], W)$
one has,
$$ \forall \varepsilon \in (0, \varepsilon_0] : \quad L \varepsilon^{\theta} \|Y'\|_{\infty} \le \mathrm{osc}(Y, \varepsilon) + \|R^Y\|_{2\alpha} \, \varepsilon^{2\alpha} . \tag{6.7} $$
As an immediate consequence, if θ < 2α, Y′ is uniquely determined from Y, i.e. if
(Y, Y′) and (Ỹ, Ỹ′) both belong to $\mathscr{D}^{2\alpha}_X$ and Y ≡ Ỹ, then Y′ ≡ Ỹ′.
Proof. Let us start with the consequence: apply estimate (6.7) with Y replaced by
Y − Ỹ = 0 and similarly Y′ replaced by Y′ − Ỹ′. Thanks to L > 0 it follows that
$$ \|Y' - \tilde{Y}'\|_{\infty} = O\big( \varepsilon^{2\alpha - \theta} \big) $$
(Note that one has indeed $(Y'_s)^* : W^* \to V^*$.) Combining both (6.8) and (6.9), we
thus obtain that
Taking the supremum over all such $\varphi \in W^*$ of unit length,² and using the fact that
the norm of a linear operator is equal to the norm of its adjoint, we obtain
Remark 6.9. Even though the argument presented above is independent of the dimension
of V, we are not aware of any example where L(θ, X) > 0 and dim V = ∞.
The reason why this definition works well only in the finite-dimensional case will be
apparent in the proof of Proposition 6.11 below.
² Note that $|Y_s'^{*}|$ denotes the operator norm, by definition equal to $\sup_{|\varphi| = 1} |Y_s'^{*} \varphi|$.
Theorem 6.10 (Norris lemma for rough paths). Let $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^{\alpha}([0, T], V)$
Then there exist constants r > 0 and q > 0 such that, setting
$$ R := 1 + L_{\theta}(X)^{-1} + |||\mathbf{X}|||_{\alpha} + \|Y, Y'\|_{X;2\alpha} + |Y_0| + |Y'_0| + \|Z\|_{\alpha} + |Z_0| $$
Proof. We leave the details of the proof as an exercise, see [HP13], and only sketch
its broad lines.
First, we conclude from Proposition 6.8 that I small in the supremum norm
implies that $\|Y\|_{\infty}$ is also small. Then, we use interpolation to conclude from this
that (Y, Y′) is small when viewed as an element of $\mathscr{D}^{2\bar{\alpha}}$ for $\bar{\alpha} < \alpha$, thus implying
that $\int Y \, d\mathbf{X}$ is necessarily small. This implies that $\int Z \, ds$ is itself small from which,
using again interpolation, we finally conclude that Z itself must be small in the
supremum norm. □
We now turn to Hölder-roughness of Brownian motion. Our focus will be on the unit
interval T = 1, and we consider scale up to ε0 = 1/2 for the sake of argument.
The proof of Proposition 6.11 relies on the following variation of the standard
small ball estimate for Brownian motion:
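Proof. The standard small ball estimate for Brownian motion (see for example
[LS01]) yields the bound
$$ \sup_{|\varphi| = 1} P \Big( \sup_{t \in [0, \delta]} |\langle \varphi, B(t) \rangle| \le \varepsilon \Big) \le C \exp(-c \delta \varepsilon^{-2}) . \tag{6.11} $$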
The required estimate then follows from a standard chaining argument, as in [Nor86,
p. 127]: cover the sphere |ϕ| = 1 with $\varepsilon^{-2(d-1)}$ balls of radius $\varepsilon^2$, say, centred
at $\varphi_i$. We then use the fact that, since the supremum of B has Gaussian tails, if
$\sup_{t \in [0, \delta]} |\langle \varphi_i, B(t) \rangle| \le \varepsilon$, then the same bound, but with ε replaced by 2ε, holds
with probability exponentially close to 1 uniformly over all ϕ in the ball of radius $\varepsilon^2$
centred at $\varphi_i$. Since there are only polynomially many such balls required to cover
the whole sphere, (6.10) follows. Note that this chaining argument uses in a crucial
way that the number of balls of radius $\varepsilon^2$ required to cover the sphere ‖ϕ‖ = 1 grows
only polynomially with $\varepsilon^{-1}$.
It is clear that bounds of the type (6.10) break down in infinite dimensions: if we
consider a cylindrical Wiener process, then (6.11) still holds, but the unit sphere of a
Hilbert space cannot be covered by a finite number of small balls anymore. If on the
other hand, we consider a process with a non-trivial covariance, then we can get the
chaining argument to work, but the bound (6.11) would break down due to the fact
that $\langle \varphi, B(t) \rangle$ can then have arbitrarily small variance. □
Proof (Proposition 6.11). With T = 1, $\varepsilon_0 = 1/2$, a different way of formulating
Definition 6.7 is given by
$$ L_{\theta}(X) = \inf \sup_{t : |t - s| \le \varepsilon} \frac{1}{\varepsilon^{\theta}} |\langle \varphi, X_{s,t} \rangle| . $$
where the inf is taken over |ϕ| = 1, s ∈ [0, 1] and ε ∈ (0, 1/2]. We then define the
“discrete analog” $D_{\theta}(X)$ of $L_{\theta}(X)$ to be given by
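Therefore, by the triangle inequality, we conclude that the magnitude of the difference
between $\langle \varphi, X_s \rangle$ and one of the two terms $\langle \varphi, X_{t_i} \rangle$, i = 1, 2 (say $t_1$) is at least
$$ |\langle \varphi, X_{s,t_1} \rangle| \ge \frac12 \, 2^{-n\theta} D_{\theta}(X) $$
and therefore
$$ \frac{|\langle \varphi, X_{s,t_1} \rangle|}{\varepsilon^{\theta}} \ge \frac12 \, \frac{2^{-n\theta}}{\varepsilon^{\theta}} D_{\theta}(X) \ge \frac12 \, \frac{1}{2^{\theta}} D_{\theta}(X) . $$
Since s, ε and ϕ were chosen arbitrarily, the claim (6.12) follows.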
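Applying this to Brownian sample paths, X = B(ω), it follows that it is sufficient
to obtain the requested bound on $P(D_{\theta}(B) < \varepsilon)$. We have the straightforward bound
$$
\begin{aligned}
P(D_{\theta}(B) < \varepsilon)
&\le P \Big( \inf_{\|\varphi\| = 1} \inf_{n \ge 1} \inf_{k \le 2^n} \sup_{s,t \in I_{k,n}} \frac{|\langle \varphi, B_{s,t} \rangle|}{2^{-n\theta}} < \varepsilon \Big) \\
&\le \sum_{n=1}^{\infty} \sum_{k=1}^{2^n} P \Big( \inf_{\|\varphi\| = 1} \sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| < 2^{-n\theta} \varepsilon \Big) .
\end{aligned}
$$
Since trivially $\sup_{s,t \in I_{k,n}} |\langle \varphi, B_{s,t} \rangle| \ge \sup_{t \in I_{k,n}} |\langle \varphi, B_{r,t} \rangle|$, where r is the left boundary
of the interval $I_{k,n}$, we can bound this by applying Lemma 6.12. Noting that the
bound obtained in this way is independent of k, we conclude that
$$ P(D_{\theta}(B) < \varepsilon) \le M \sum_{n=1}^{\infty} 2^n \exp\big( -c \, 2^{(2\theta - 1)n} \varepsilon^{-2} \big) \le \tilde{M} \sum_{n=1}^{\infty} \exp\big( -\tilde{c} \, n \, \varepsilon^{-2} \big) . $$
Here, we used the fact that as soon as θ > 1/2, we can find constants K and $\tilde{c}$ such that
uniformly over all ε < 1 and all n ≥ 1. (Consider separately the cases $\varepsilon^2 \in (0, 1/n)$
and $\varepsilon^2 \in [1/n, 1)$.) We deduce from this the bound
$$ P(D_{\theta}(B) < \varepsilon) \le M \Big( e^{-\tilde{c} \varepsilon^{-2}} + \int_1^{\infty} \exp\big( -\tilde{c} \, \varepsilon^{-2} x \big) \, dx \Big) , $$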
Note that the proof given above is quite robust. In particular, we did not really
make use of the fact that B has independent increments. In fact, it transpires that all
that is required in order to prove the Hölder roughness of sample paths of a Gaussian
process W with stationary increments is a small ball estimate of the type
$$ P \Big( \sup_{t \in [0, \delta]} |W_t - W_0| \le \varepsilon \Big) \le C \exp(-c \delta^{\alpha} \varepsilon^{-\beta}) , $$
for some exponents α, β > 0. These kinds of estimates are available for example for
fractional Brownian motion with arbitrary Hurst parameter H ∈ (0, 1).
6.6 Exercises
Exercise 6.13. Show that the Q-Wiener process (as introduced in Exercise 3.16) is
truly rough.
Exercise 6.14. Prove and state precisely: multidimensional fractional Brownian motion
$B^H$, H ∈ (1/3, 1/2], is truly rough. Hint: A law of iterated logarithm for
fractional Brownian motion of the form
$$ P \bigg[ \limsup_{h \downarrow 0} \frac{|B^H_{t,t+h}|}{h^{H} (\ln \ln 1/h)^{1/2}} = \sqrt{2} \bigg] = 1 $$
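Carry out the elementary optimization, e.g. when $\varepsilon_0 = T/2$, to see that
$$ \|Z'\|_{\infty} \le \frac{4 \|Y\|_{\infty}}{L(\theta, X)} \Big( \|R^Z\|_{2\alpha}^{\theta/2\alpha} \, \|Y\|_{\infty}^{-\theta/2\alpha} \vee T^{-\theta} \Big) . $$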
Exercise 6.16 (Norris lemma for rough paths; [HP13]). Give a complete proof of
Theorem 6.10.
6.7 Comments
[FS12a]; the quantitative “Norris lemma” is taken from Cass, Litterer, Hairer and
Tindel [CHLT12]. These results also hold in “rougher” situations, i.e. when α ≤ 1/3,
[FS12a, CHLT12].
Chapter 7
Operations on controlled rough paths
Abstract At first sight, the notation $\int Y \, dX$ introduced in Chapter 4 is ambiguous
since the resulting controlled rough path depends in general on the choices of both
the second-order process X and the derivative process Y 0 . Fortunately, this “lack of
completeness” in our notations is mitigated by the fact that in virtually all situations
of interest, Y is constructed by using a small number of elementary operations
described in this chapter. For all of these operations, it turns out to be intuitively
rather clear how the corresponding derivative process is constructed.
where $Y'_u \otimes Y'_u \in L(V \otimes V, W \otimes W)$ is given by $(Y'_u \otimes Y'_u)(v \otimes \tilde{v}) = (Y'_u(v)) \otimes (Y'_u(\tilde{v}))$.
The fact that $\|\mathbb{Y}\|_{2\alpha}$ is finite is then a consequence of (4.23). On the other hand, the
algebraic relations (2.1) already hold for the “Riemann sum” approximations to the
three integrals, provided that the partition used for the approximation of $\mathbb{Y}_{s,t}$ is the
union of the one used for the approximation of $\mathbb{Y}_{s,u}$ with the one used for $\mathbb{Y}_{u,t}$.
We summarise the above consideration in saying that for every fixed $\mathbf{X} \in \mathscr{C}^{\alpha}([0, T], V)$,
we have a continuous canonical injection
$$ \mathscr{D}^{2\alpha}_X([0, T], W) \hookrightarrow \mathscr{C}^{\alpha}([0, T], W) . $$
Here, the left hand side uses (4.22) to define the integral of two controlled rough
paths against each other and the right hand side uses the original definition (4.19) of
the integral of a controlled rough path against its reference path.
Proof. By assumption, one has $Y_{s,t} = Y'_s X_{s,t} + O(|t - s|^{2\alpha})$ and $\tilde{Z}_{s,t} = Z'_s Y_{s,t} + O(|t - s|^{2\alpha})$.
Combining these identities, it follows immediately that
$$ Z_{s,t} = \tilde{Z}'_s Y'_s X_{s,t} + O(|t - s|^{2\alpha}) = Z'_s X_{s,t} + O(|t - s|^{2\alpha}) , $$
so that $(Z, Z') \in \mathscr{D}^{2\alpha}_X$ as required. Now the left hand side of (7.1) is given by $I\Xi_{0,t}$
with $\Xi_{s,t} = Z_s Y_{s,t} + Z'_s Y'_s \mathbb{X}_{s,t}$, whereas the right hand side is given by $I\tilde{\Xi}_{0,t}$,
where we set $\tilde{\Xi}_{s,t} = \tilde{Z}_s \tilde{Y}_{s,t} + \tilde{Z}'_s \mathbb{Y}_{s,t}$. Since $|\mathbb{Y}_{s,t} - Y'_s Y'_s \mathbb{X}_{s,t}| \le C |t - s|^{3\alpha}$ by
(4.20), the claim now follows from Remark 4.12. □
Remark 7.2. It is straightforward to see that if $\tfrac13 < \beta < \alpha$, then $\mathscr{C}^{\alpha} \hookrightarrow \mathscr{C}^{\beta}$ and, for
every $X \in \mathcal{C}^{\alpha}$, we have a canonical embedding $\mathscr{D}^{2\alpha}_X \hookrightarrow \mathscr{D}^{2\beta}_X$. Furthermore, in view
of the definition (4.10) of I, the values of the integrals defined above do not depend
on the interpretation of the integrand and integrator as elements of one or the other
space.
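Let W and W̄ be two Banach spaces and let ϕ : W → W̄ be a function in $C_b^2$. Let
furthermore $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0,T], W)$ for some $X \in C^\alpha$. (In applications X will
be part of some $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$ but this is irrelevant here.) Then one can define a
(candidate) controlled rough path $(\varphi(Y), \varphi(Y)') \in \mathscr{D}_X^{2\alpha}([0,T], \bar W)$ by
\[ \varphi(Y)_t = \varphi(Y_t) , \qquad \varphi(Y)'_t = D\varphi(Y_t)\,Y'_t . \tag{7.3} \]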
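which shows that $\varphi(Y), \varphi(Y)' \in C^\alpha$. Furthermore, $R^\varphi \equiv R^{\varphi(Y)}$ is given by
\begin{align*}
R^\varphi_{s,t} &= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y'_s X_{s,t} \\
 &= \varphi(Y_t) - \varphi(Y_s) - D\varphi(Y_s) Y_{s,t} + D\varphi(Y_s) R^Y_{s,t}
\end{align*}
so that,
\[ \|R^\varphi\|_{2\alpha} \le \tfrac12 |D^2\varphi|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} . \]
It follows that
\begin{align*}
\|\varphi(Y), \varphi(Y)'\|_{X,2\alpha}
 &\le \|D\varphi(Y_\cdot)\|_\infty \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|D^2\varphi(Y_\cdot)\|_\infty \|Y_\cdot\|_\alpha
   + \tfrac12 |D^2\varphi|_\infty \|Y\|_\alpha^2 + |D\varphi|_\infty \|R^Y\|_{2\alpha} \\
 &\le \|\varphi\|_{C_b^2} \bigl( \|Y'_\cdot\|_\alpha + \|Y'_\cdot\|_\infty \|Y_\cdot\|_\alpha + \|Y\|_\alpha^2 + \|R^Y\|_{2\alpha} \bigr) \\
 &\le C_{\alpha,T} \|\varphi\|_{C_b^2} (1 + \|X\|_\alpha)^2 \bigl( 1 + |Y'_0| + \|Y, Y'\|_{X,2\alpha} \bigr)
   \times \bigl( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \bigr) ,
\end{align*}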
It follows immediately that one has the following “Leibniz rule”, the proof of
which is left to the reader:
We now investigate the continuity properties of the controlled rough path constructed
in Lemma 7.3. In doing so, we shall use notation previously introduced in Section 4.4.
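$(Y, Y') \in \mathscr{D}_X^{2\alpha}$, $(\tilde Y, \tilde Y') \in \mathscr{D}_{\tilde X}^{2\alpha}$. For $\varphi \in C_b^3$ define
\[ + \; d_{X,\tilde X,2\alpha}(Y, Y'; \tilde Y, \tilde Y') , \tag{7.4} \]
as well as
\[ \|Z - \tilde Z\|_\alpha \le C_M \bigl( \varrho_\alpha(X, \tilde X) + |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| + d_{X,\tilde X,2\alpha}(Y, Y'; \tilde Y, \tilde Y') \bigr) , \tag{7.5} \]
for a suitable constant $C_M = C(M, T, \alpha, \varphi)$.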
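Proof. (The reader is urged to revisit Lemma 7.3 where the composition (7.3) was
seen to be well-defined for $\varphi \in C_b^2$.) Similarly to the previous proof, noting that
\[ |Z'_0 - \tilde Z'_0| = |D\varphi(Y_0)Y'_0 - D\varphi(\tilde Y_0)\tilde Y'_0| \le C_M \bigl( |Y_0 - \tilde Y_0| + |Y'_0 - \tilde Y'_0| \bigr) \]
\[ \|D\varphi(Y)Y' - D\varphi(\tilde Y)\tilde Y'\|_\alpha + \|R^Z - R^{\tilde Z}\|_{2\alpha} . \]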
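Write $C_M(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon)$ for the right hand side of (7.4). Note that with this
notation, from (4.27),
\[ \|Y - \tilde Y\|_\alpha \lesssim \varepsilon_X + \varepsilon'_0 + \varepsilon =: \varepsilon_Y , \]
and also $\|Y - \tilde Y\|_{\infty;[0,T]} \lesssim \varepsilon_0 + \varepsilon_Y$ (uniformly over T ≤ 1). Since $D\varphi \in C_b^2$, we
know from Lemma 8.2 that
\[ \|D\varphi(\tilde Y) - D\varphi(Y)\|_{C^\alpha} = |D\varphi(\tilde Y_0) - D\varphi(Y_0)| + \|D\varphi(\tilde Y) - D\varphi(Y)\|_\alpha \le C(\varepsilon_0 + \varepsilon_Y) \]
where C depends on the $C_b^3$-norm of ϕ. Also, $\|Y' - \tilde Y'\|_{C^\alpha} \le \varepsilon'_0 + \varepsilon$. Clearly then
($C^\alpha$ is a Banach algebra under pointwise multiplication), we have, for a constant $C_M$,
\[ \|D\varphi(Y)Y' - D\varphi(\tilde Y)\tilde Y'\|_\alpha \le C_M(\varepsilon_0 + \varepsilon_Y + \varepsilon'_0 + \varepsilon) \lesssim C_M(\varepsilon_X + \varepsilon_0 + \varepsilon'_0 + \varepsilon) . \]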
noting that this estimate is uniform in s, t ∈ [0, T] and θ ∈ [0, 1]. It then suffices
to insert/subtract $D^2\varphi(Y_s + \theta Y_{s,t})(\tilde Y_{s,t}, \tilde Y_{s,t})$ under the integral $\int \ldots (1 - \theta)\,d\theta$
appearing in the definition of $T_1$ and conclude with the triangle inequality and some
which is meaningful if we interpret the last two integrals as Young integrals. To show
that this is indeed the case, note first that, as a consequence of (7.6) and Theorem 4.10,
the increments of Y are of the form
which is the first term in the above identity, makes sense as a rough integral. Note
that, if X = B, Itô enhanced Brownian motion, and Y, Y 0 , Y 00 are all adapted, then
so is G and the integral is identified, by Proposition 5.1, as a classical Itô integral.
Proposition 7.6. Under the assumption (7.8), the Itô formula (7.7) holds true.
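Proof. By the (previous) Itô formula, we know that $F(Y_t) - F(Y_0)$ equals
\[ \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} \bigl( DF(Y_u) Y_{u,v} + D^2F(Y_u) \mathbb{Y}_{u,v} \bigr) + \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} D^2F(Y_u) [\mathbf{Y}]_{u,v} \tag{7.10} \]
where $\mathbb{Y}_{u,v} = \int_u^v Y_{u,\cdot} \otimes dY$ in the sense of Remark 4.11, noting that $\mathbb{Y}_{u,v} = Y'_u Y'_u \mathbb{X}_{u,v} + o(|v - u|)$. Also,
Let us also subtract/add $DF(Y_u) Y''_u \mathbb{X}_{u,v}$ from (7.10). Then $F(Y_t) - F(Y_0)$ equals
\begin{align*}
 & \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} \bigl( DF(Y_u)(Y_{u,v} - Y''_u \mathbb{X}_{u,v}) + DF(Y_u) Y''_u \mathbb{X}_{u,v} + D^2F(Y_u) Y'_u Y'_u \mathbb{X}_{u,v} \bigr) \\
 & \qquad + \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} D^2F(Y_u) Y'_u Y'_u [\mathbf{X}]_{u,v} \\
 &= \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} \bigl( DF(Y_u) Y'_u X_{u,v} + \bigl( DF(Y_u) Y''_u + D^2F(Y_u) Y'_u Y'_u \bigr) \mathbb{X}_{u,v} \bigr) \\
 & \qquad + \lim_{|\mathcal{D}|\to0} \sum_{[u,v]\in\mathcal{D}} DF(Y_u) \Gamma_{u,v} + \int_0^t D^2F(Y_u) Y'_u Y'_u \, d[\mathbf{X}]_u .
\end{align*}
In view of (7.9), also noting the appearance of two Young integrals in the last line,
the proof is complete. □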
Let us conclude this section by showing how these canonical operations can be
lifted to the case of controlled rough paths of low regularity, i.e. when α < 1/3.
Recall from Section 4.5 that in this case we view a controlled rough path Y as a
$T^{(p-1)}(\mathbb{R}^d)$-valued function, which is controlled by increments of X in the sense of
Definition 4.17.
This suggests that, in order to define the product of two controlled rough paths
Y and Ȳ, we should first ask ourselves how a product of the type $X^w_{s,t} X^{\bar w}_{s,t}$ for two
different words w and w̄ can be rewritten as a linear combination of the increments of
X. It was realised by Chen [Che54] that such a product is described by the shuffle
product. Recall that, for any alphabet A, the shuffle product is defined on the free
algebra over A by considering all possible ways of interleaving two words in ways
that preserve the original order of the letters. For example, if a, b and c are letters in
A, one has the identity
ew ? ew̄ = eww̄ ,
\[ Z_t = Y_t \star \bar Y_t . \]
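The interleaving description of the shuffle product given above can be sketched in a few lines of Python. This is only an illustration under our own conventions: words are plain strings, and the function name `shuffle` and the use of a `Counter` to record multiplicities are assumptions of this sketch, not notation from the text.

```python
from collections import Counter

def shuffle(w1, w2):
    """Shuffle product of two words, returned as {word: multiplicity}."""
    if not w1:
        return Counter({w2: 1})
    if not w2:
        return Counter({w1: 1})
    result = Counter()
    # Either the first letter of w1 comes first ...
    for word, mult in shuffle(w1[1:], w2).items():
        result[w1[0] + word] += mult
    # ... or the first letter of w2 does.
    for word, mult in shuffle(w1, w2[1:]).items():
        result[w2[0] + word] += mult
    return result

print(dict(shuffle("ab", "c")))
print(dict(shuffle("ab", "ac")))
```

Every interleaving preserves the order of the letters within each word, and repeated letters produce multiplicities larger than one; the total multiplicity of the shuffle of words of lengths m and n is the binomial coefficient "m + n choose m".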
where F (k) denotes the kth derivative of F and Ỹt = Yt − Ytφ is the part describing
def
7.7 Exercises
Exercise 7.7. Verify that $\mathbb{X}_{s,t} = \int_s^t X_{s,r} \otimes dX_r$ where the integral is to be interpreted
in the sense of (4.22), taking (Y, Y') to be (X, I). In fact, check that this holds not
only in the limit |P| → 0 but in fact for every fixed |P|, i.e. $\mathbb{X}_{s,t} = \int_P \Xi$. Compare
this with formula (2.12), obtained in Exercise 2.7.
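A numerical sanity check of the exercise can be sketched as follows, assuming the compensated Riemann sum has summands $\Xi_{u,v} = X_{s,u} \otimes X_{u,v} + \mathbb{X}_{u,v}$ (our reading of (4.22) with (Y, Y') = (X_{s,·}, I)). The smooth path X_t = (t, t²), its closed-form iterated integrals and all function names are choices made for this sketch only.

```python
def path(t):
    # Illustrative smooth path X_t = (t, t^2) in R^2.
    return (t, t * t)

def increment(s, t):
    xs, xt = path(s), path(t)
    return (xt[0] - xs[0], xt[1] - xs[1])

def second_level(s, t):
    """Exact iterated integrals int_s^t X^i_{s,r} dX^j_r for X_t = (t, t^2)."""
    return (
        ((t - s) ** 2 / 2, 2 * t**3 / 3 - s * t**2 + s**3 / 3),
        ((t**3 - s**3) / 3 - s**2 * (t - s), (t**2 - s**2) ** 2 / 2),
    )

def compensated_riemann_sum(partition):
    """Sum of X_{s,u} (x) X_{u,v} + XX_{u,v} over the intervals [u, v] of the partition."""
    s = partition[0]
    total = [[0.0, 0.0], [0.0, 0.0]]
    for u, v in zip(partition, partition[1:]):
        x_su, x_uv, xx_uv = increment(s, u), increment(u, v), second_level(u, v)
        for i in range(2):
            for j in range(2):
                total[i][j] += x_su[i] * x_uv[j] + xx_uv[i][j]
    return total

approx = compensated_riemann_sum([0.2, 0.3, 0.55, 0.9, 1.0])
exact = second_level(0.2, 1.0)
print(approx, exact)
```

The two printed matrices agree up to rounding for any partition, not just fine ones, which is exactly Chen's relation applied interval by interval.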
where we denote by $\|\varphi\|_{2\alpha;t}$ the supremum over y of the 2α-Hölder norm of ϕ(y, ·).
Exercise 7.9. Convince yourself that in the case p = 2, the definitions given in
Section 7.6 coincide with the definitions given earlier in this section.
Chapter 8
Solutions to rough differential equations
8.1 Introduction
sufficiently small to guarantee invariance of suitable balls and the contraction property.
Our key ingredients are estimates for rough integrals (cf. Theorem 4.10) and the
composition of controlled paths with smooth maps (Lemma 7.3). Recall that, for
rather trivial reasons (of the sort $|t - s|^{2\alpha} \le |t - s|^{\alpha}$, when 0 ≤ s ≤ t ≤ T ≤ 1), all
constants in these estimates were seen to be uniform in T ∈ (0, 1].
Let us postulate that there exists a solution to a differential equation in Young’s sense
and let us derive an a-priori estimate. (In finite dimension, this can actually be used
to prove the existence of solutions. Note that the regularity requirement here is “one
degree less” than what is needed for the corresponding uniqueness result.)
Proposition 8.1. Assume X, Y ∈ C β ([0, 1], V ) for some β ∈ (1/2, 1] such that,
given ξ ∈ W, f ∈ Cb1 (W, L(V, W )), we have
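Proof. By assumption, for 0 ≤ s < t ≤ 1, $Y_{s,t} = \int_s^t f(Y_r)\,dX_r$. Using Young’s
inequality (4.3), with C = C(β),
\[ |Y_{s,t} - f(Y_s)X_{s,t}| = \Bigl| \int_s^t (f(Y_r) - f(Y_s))\,dX_r \Bigr| \le C \|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^{2\beta} \]
so that
\[ |Y_{s,t}| / |t - s|^\beta \le \|f\|_\infty \|X\|_\beta + C \|Df\|_\infty \|Y\|_{\beta;[s,t]} \|X\|_{\beta;[s,t]} |t - s|^\beta . \]
Write $\|Y\|_{\beta;h} \equiv \sup |Y_{s,t}| / |t - s|^\beta$ where the sup is restricted to times s, t ∈ [0, 1]
for which t − s ≤ h. Clearly then,
\[ \|Y\|_{\beta;h} \le \|f\|_\infty \|X\|_\beta + C \|Df\|_\infty \|Y\|_{\beta;h} \|X\|_\beta h^\beta \]
and upon taking h small enough, s.t. $\delta h^\beta \ll 1$, with $\delta = \|X\|_\beta$, more precisely s.t.
\[ C \|Df\|_\infty \|X\|_\beta h^\beta \le C \bigl( 1 + \|f\|_{C_b^1} \bigr) \|X\|_\beta h^\beta \le 1/2 \]
(we will take h such that the second ≤ becomes an equality; adding 1 avoids trouble
when f ≡ 0)
\[ \tfrac12 \|Y\|_{\beta;h} \le \|f\|_\infty \|X\|_\beta . \]
It then follows from Exercise 4.24 that, with $h \propto \|X\|_\beta^{-1/\beta}$,
\[ \|Y\|_\beta \le \|Y\|_{\beta;h} \bigl( 1 \vee h^{-(1-\beta)} \bigr) \le C \|X\|_\beta \bigl( 1 \vee h^{-(1-\beta)} \bigr) = C \bigl( \|X\|_\beta \vee \|X\|_\beta^{1/\beta} \bigr) . \]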
Here, we have absorbed the dependence on $f \in C_b^1$ into the constants. By scaling
(any non-zero f may be normalised to $\|f\|_{C_b^1} = 1$ at the price of replacing X by
$\|f\|_{C_b^1} \times X$) we then get immediately the claimed estimate. □
The reader may be helped by first reviewing the classical Picard argument in a
Young setting, i.e. when β ∈ (1/2, 1]. Given ξ ∈ W , f ∈ Cb2 (W, L(V, W )), X ∈
C β ([0, 1], V ) and Y : [0, T ] → W of suitable Hölder regularity, T ∈ (0, 1], one
defines the map MT by
\[ \mathcal{M}_T(Y) := \Bigl( \xi + \int_0^t f(Y_s)\,dX_s : t \in [0, T] \Bigr) . \]
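To see the Picard map in action, here is a minimal numerical sketch. It assumes a smooth scalar driver X_t = sin(3t), the vector field f = cos, ξ = 0, and replaces the Young integral by left-point Riemann sums on a uniform grid; these choices, and the name `picard_young`, are illustrative assumptions rather than anything prescribed by the text. For this scalar example the solution is known in closed form, Y_t = 2 arctan(tanh(X_t / 2)), which gives something to compare against.

```python
import math

def picard_young(f, x, xi, iterations):
    """Iterate Y -> xi + int_0^. f(Y_s) dX_s using left-point Riemann sums on the grid of x."""
    n = len(x) - 1
    y = [xi] * (n + 1)                      # initial guess: the constant path
    for _ in range(iterations):
        new_y = [xi]
        for k in range(n):
            new_y.append(new_y[-1] + f(y[k]) * (x[k + 1] - x[k]))
        y = new_y
    return y

n = 4000
grid = [k / n for k in range(n + 1)]
x = [math.sin(3.0 * t) for t in grid]
y = picard_young(math.cos, x, 0.0, iterations=30)
exact = [2.0 * math.atan(math.tanh(v / 2.0)) for v in x]
max_err = max(abs(a - b) for a, b in zip(y, exact))
print(max_err)
```

The iterates settle after a few rounds; the remaining error is the discretisation error of the Riemann sums, not a failure of the fixed point argument.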
and so the α-Hölder norm of X has the desired behaviour. As previously, when no
confusion is possible, we write k · kα ≡ k · kα;[0,T ] .
To avoid norm versus semi-norm considerations, it is convenient to work on
the space of paths started at ξ, namely $\{Y \in C^\alpha([0,T], W) : Y_0 = \xi\}$. This affine
subspace is a complete metric space under $(Y, \tilde Y) \mapsto \|Y - \tilde Y\|_\alpha$ and so is the closed
unit ball
\[ B_T = \{Y \in C^\alpha([0,T], W) : Y_0 = \xi, \ \|Y\|_\alpha \le 1\} . \]
Young’s inequality (4.32) shows that there is a constant C which only depends on α
(thanks to T ≤ 1) such that for every Y ∈ BT ,
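Similarly, for $Y, \tilde Y \in B_T$, using Young, $f(Y_0) = f(\tilde Y_0)$ and Lemma 8.2 below (with
K = 1)
\begin{align*}
\|\mathcal{M}_T(Y) - \mathcal{M}_T(\tilde Y)\|_\alpha
 &= \Bigl\| \int_0^\cdot f(Y_s)\,dX_s - \int_0^\cdot f(\tilde Y_s)\,dX_s \Bigr\|_\alpha \\
 &\le C \bigl( |f(Y_0) - f(\tilde Y_0)| + \|f(Y) - f(\tilde Y)\|_\alpha \bigr) \|X\|_\alpha
\end{align*}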
Lemma 8.2. Assume $f \in C_b^2(W, \bar W)$ and T ≤ 1. Then there exists a $C_{\alpha,K}$ such that
for all $X, Y \in C^\alpha$ with $\|X\|_{\alpha;[0,T]}, \|Y\|_{\alpha;[0,T]} \le K \in [1, \infty)$
\[ \|f(X) - f(Y)\|_{\alpha;[0,T]} \le C_{\alpha,K} \|f\|_{C_b^2} \bigl( |X_0 - Y_0| + \|X - Y\|_{\alpha;[0,T]} \bigr) . \]
The idea is to use a division property of sufficiently smooth functions. In the present
context, this simply means that one has
\[ f(x) - f(y) = g(x, y)(x - y) \quad \text{with} \quad g(x, y) := \int_0^1 Df(tx + (1 - t)y)\,dt , \]
\[ |g(x, y) - g(\tilde x, \tilde y)| \le |g|_{\mathrm{Lip}} \, |(x - \tilde x, y - \tilde y)| \le C |D^2 f|_\infty (|x - \tilde x| + |y - \tilde y|) . \]
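The division property is easy to check numerically in the scalar case. The sketch below takes f = sin (so Df = cos and |D²f| ≤ 1) and approximates g by composite Simpson quadrature; the function name `g`, the quadrature rule and the sample points are assumptions of this illustration.

```python
import math

def g(df, x, y, n=200):
    """Composite Simpson approximation of int_0^1 Df(t*x + (1-t)*y) dt (n must be even)."""
    h = 1.0 / n
    total = df(y) + df(x)                    # endpoints t = 0 and t = 1
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * df(t * x + (1 - t) * y)
    return total * h / 3.0

x, y = 1.3, -0.4
lhs = math.sin(x) - math.sin(y)
rhs = g(math.cos, x, y) * (x - y)
print(lhs, rhs)
```

In this scalar example the partial derivatives of g are averages of D²f against the weights t and 1 − t, so one may take C = 1/2 in the Lipschitz bound.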
We now consider a priori estimates for rough differential equations, similar to Section
8.2. Recall that the homogeneous rough path norm |||X|||α was introduced in (2.4).
Proposition 8.3. Let ξ ∈ W, $f \in C_b^2(W, L(V, W))$ and a rough path $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha$
with α ∈ (1/3, 1/2] and assume that $(Y, Y') = (Y, f(Y)) \in \mathscr{D}_X^{2\alpha}$ is an RDE
solution to $dY = f(Y)\,d\mathbf{X}$ started at $Y_0 = \xi \in W$. That is, for all t ∈ [0, T],
\[ Y_t = \xi + \int_0^t f(Y_s)\,d\mathbf{X}_s , \tag{8.2} \]
where the integral is interpreted in the sense of Theorem 4.10 and $f(Y) \in \mathscr{D}_X^{2\alpha}$ is
built from Y by Lemma 7.3. (Thanks to $C_b^2$-regularity of f and Lemma 7.3 the above
rough integral equation (8.2) is well-defined.¹)
Then the following (a priori) estimate holds true
\[ \|Y\|_\alpha \le C \Bigl[ \bigl( \|f\|_{C_b^2} |||\mathbf{X}|||_\alpha \bigr) \vee \bigl( \|f\|_{C_b^2} |||\mathbf{X}|||_\alpha \bigr)^{1/\alpha} \Bigr] . \]
¹ Later we will establish existence and uniqueness under $C_b^3$-regularity.
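Proof. Consider an interval I := [s, t] so that, using basic estimates for rough
integrals (cf. Theorem 4.10),
\begin{align*}
|R^Y_{s,t}| &= |Y_{s,t} - f(Y_s) X_{s,t}| \\
 &\le \Bigl| \int_s^t f(Y)\,d\mathbf{X} - f(Y_s) X_{s,t} - Df(Y_s) f(Y_s) \mathbb{X}_{s,t} \Bigr| + |Df(Y_s) f(Y_s) \mathbb{X}_{s,t}| \\
 &\lesssim \bigl( \|X\|_{\alpha;I} \|R^{f(Y)}\|_{2\alpha;I} + \|\mathbb{X}\|_{2\alpha;I} \|f(Y)\|_{\alpha;I} \bigr) |t - s|^{3\alpha}
   + \|\mathbb{X}\|_{2\alpha;I} |t - s|^{2\alpha} . \tag{8.3}
\end{align*}
Recall that $\|\cdot\|_\alpha$ is the usual Hölder semi-norm over [0, T], while $\|\cdot\|_{\alpha;I}$ denotes
the same norm, but over I ⊂ [0, T], so that trivially $\|X\|_{\alpha;I} \le \|X\|_\alpha$. Whenever
notationally convenient, multiplicative constants depending on α and f are absorbed
in ≲; at the very end we can use scaling to make the f dependence reappear. We
will also write $\|\cdot\|_{\alpha;h}$ for the supremum of $\|\cdot\|_{\alpha;I}$ over all intervals I ⊂ [0, T] with
length |I| ≤ h. Again, one trivially has $\|X\|_{\alpha;I} \le \|X\|_{\alpha;h}$ whenever |I| ≤ h. Using
this notation, we conclude from (8.3) that
\[ \|R^Y\|_{2\alpha;h} \lesssim \|\mathbb{X}\|_{2\alpha;h} + \bigl( \|X\|_{\alpha;h} \|R^{f(Y)}\|_{2\alpha;h} + \|\mathbb{X}\|_{2\alpha;h} \|f(Y)\|_{\alpha;h} \bigr) h^\alpha . \]
so that,
\[ \|R^{f(Y)}\|_{2\alpha;h} \le \tfrac12 |D^2 f|_\infty \|Y\|_{\alpha;h}^2 + |Df|_\infty \|R^Y\|_{2\alpha;h} \lesssim \|Y\|_{\alpha;h}^2 + \|R^Y\|_{2\alpha;h} . \]
Hence, also using $\|f(Y)\|_{\alpha;h} \lesssim \|Y\|_{\alpha;h}$, there exists $c_1 > 0$, not dependent on $\mathbf{X}$ or
Y, such that
\begin{align*}
\|R^Y\|_{2\alpha;h} &\le c_1 \|\mathbb{X}\|_{2\alpha;h} + c_1 \|X\|_{\alpha;h} h^\alpha \|Y\|_{\alpha;h}^2 \tag{8.4} \\
 &\quad + c_1 \|X\|_{\alpha;h} h^\alpha \|R^Y\|_{2\alpha;h} + c_1 \|\mathbb{X}\|_{2\alpha;h} h^\alpha \|Y\|_{\alpha;h} .
\end{align*}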
with $c_2 = (2c_1 + 1)$. On the other hand, since $Y_{s,t} = f(Y_s) X_{s,t} + R^Y_{s,t}$ and f is
bounded, we have the bound
\[ \psi_h \le \lambda_h + \psi_h^2 . \]
(and similarly: $\lim_{g \downarrow h} \psi_g \le 3\psi_h$) which rules out any jumps of relative jump size
greater than 3. However, given that $\psi_h \ge 1/2$ in the first regime and $\psi_h < 1/6$ in the
second, we can never jump from the second into the first regime, as h increases (from
zero). And so, we indeed must be in the second regime for all $h \le h_0$. Elementary
estimates on $\psi_-$, as function of $\lambda_h$ then show that
\[ \|Y\|_{\alpha;h} \le c_6 |||\mathbf{X}|||_\alpha , \]
The aim of this section is to show that if f is regular enough and $(X, \mathbb{X}) \in \mathscr{C}^\beta$ with
β > 1/3, then we can solve differential equations driven by the rough path $\mathbf{X} = (X, \mathbb{X})$
of the type
\[ dY = f(Y)\,d\mathbf{X} . \]
Such an equation will yield solutions in $\mathscr{D}_X^{2\alpha}$ and will be interpreted in the corresponding
integral formulation, where the integral of f(Y) against $\mathbf{X}$ is defined using
Lemma 7.3 and Theorem 4.10. More precisely, one has the following result:
for some τ > 0. Here, the integral is interpreted in the sense of Theorem 4.10 and
$f(Y) \in \mathscr{D}_X^{2\beta}$ is built from Y by Lemma 7.3. Furthermore, one has $Y' = f(Y)$ and,
if $f \in C_b^3$, solutions are global in time.
Restricting from [0, 1] to [0, T], any T ≤ 1, Theorem 4.10 allows us to define the map
\[ \mathcal{M}_T(Y, Y') \stackrel{\text{def}}{=} \Bigl( \xi + \int_0^\cdot \Xi_s\,d\mathbf{X}_s , \ \Xi \Bigr) \in \mathscr{D}_X^{2\alpha} . \]
The RDE solution on [0, T] we are looking for is a fixed point of this map. Strictly
speaking, this would only yield a solution (Y, Y') in $\mathscr{D}_X^{2\alpha}$. But since $\mathbf{X} \in \mathscr{C}^\beta$, it
turns out that this solution is automatically an element of $\mathscr{D}_X^{2\beta}$. Indeed,
$|Y_{s,t}| \le |Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t - s|^{2\alpha}$, so that $Y \in C^\beta$. From the fixed point property it
then follows that $Y' = f(Y) \in C^\beta$ and also $R^Y \in C_2^{2\beta}$, since $\mathbb{X} \in C_2^{2\beta}$ and
\[ |R^Y_{s,t}| = |Y_{s,t} - Y'_s X_{s,t}| = \Bigl| \int_s^t (f(Y_r) - f(Y_s))\,d\mathbf{X}_r \Bigr| \le |Y'|_\infty |\mathbb{X}_{s,t}| + O\bigl( |t - s|^{3\alpha} \bigr) . \]
Note that if (Y, Y') is such that $(Y_0, Y'_0) = (\xi, f(\xi))$, then the same is true for
$\mathcal{M}_T(Y, Y')$. Therefore, $\mathcal{M}_T$ can be viewed as a map on the space of controlled paths
started at (ξ, f(ξ)), i.e.
\[ \bigl\{ (Y, Y') \in \mathscr{D}_X^{2\alpha}([0,T], W) : Y_0 = \xi, \ Y'_0 = f(\xi) \bigr\} . \]
Since $\mathscr{D}_X^{2\alpha}$ is a Banach space (under the norm $(Y, Y') \mapsto |Y_0| + |Y'_0| + \|Y, Y'\|_{X,2\alpha}$)
the above (affine) subspace is a complete metric space under the induced metric. This
is also true for the (closed) unit ball $B_T$ centred at, say
\[ t \mapsto (\xi + f(\xi) X_{0,t}, \ f(\xi)) . \]
(Note here that the apparently simpler choice $t \mapsto (\xi, f(\xi))$ does not in general
belong to $\mathscr{D}_X^{2\alpha}$.) In other words, $B_T$ is the set of all $(Y, Y') \in \mathscr{D}_X^{2\alpha}([0,T], W)$ :
$Y_0 = \xi$, $Y'_0 = f(\xi)$ and
In fact, $\|(Y - f(\xi) X_{0,\cdot}, Y'_\cdot - f(\xi))\|_{X,2\alpha} = \|Y, Y'_\cdot\|_{X,2\alpha}$ as a consequence of the
triangle inequality and $\|(f(\xi) X_{0,\cdot}, f(\xi))\|_{X,2\alpha} = \|f(\xi)\|_\alpha + \|0\|_{2\alpha} = 0$, so that
\[ B_T = \bigl\{ (Y, Y') \in \mathscr{D}_X^{2\alpha}([0,T], W) : Y_0 = \xi, \ Y'_0 = f(\xi) : \|(Y, Y'_\cdot)\|_{X,2\alpha} \le 1 \bigr\} . \]
Let us also note that, for all $(Y, Y') \in B_T$, one has the bound
\[ |Y'_0| + \|(Y, Y')\|_{X,2\alpha} \le |f|_\infty + 1 =: M \in [1, \infty) . \tag{8.7} \]
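We now show that, for T small enough, $\mathcal{M}_T$ leaves $B_T$ invariant and in fact is
contracting. Constants below are denoted by C, may change from line to line and
may depend on α, β, X, $\mathbb{X}$ without special indication. They are, however, uniform
in T ∈ (0, 1] and we prefer to be explicit (enough) with respect to f such as to
see where $C_b^3$-regularity is used. With these conventions, we recall the following
estimates, direct consequences from Lemma 7.3 and Theorem 4.10, respectively,
\[ \|\Xi, \Xi'\|_{X,2\alpha} \le C M \|f\|_{C_b^2} \bigl( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \bigr) \]
\begin{align*}
\Bigl\| \int_0^\cdot \Xi_s\,d\mathbf{X}_s , \Xi \Bigr\|_{X,2\alpha}
 &\le \|\Xi\|_\alpha + \|\Xi'\|_\infty \|\mathbb{X}\|_{2\alpha} + C \bigl( \|X\|_\alpha \|R^\Xi\|_{2\alpha} + \|\mathbb{X}\|_{2\alpha} \|\Xi'\|_\alpha \bigr) \\
 &\le \|\Xi\|_\alpha + C \bigl( |\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha} \bigr) \bigl( \|X\|_\alpha + \|\mathbb{X}\|_{2\alpha} \bigr) \\
 &\le \|\Xi\|_\alpha + C \bigl( |\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha} \bigr) T^{\beta-\alpha} .
\end{align*}
\begin{align*}
\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha}
 &= \Bigl\| \int_0^\cdot \Xi_s\,d\mathbf{X}_s , \Xi \Bigr\|_{X,2\alpha} \\
 &\le \|\Xi\|_\alpha + C \bigl( |\Xi'_0| + \|\Xi, \Xi'\|_{X,2\alpha} \bigr) T^{\beta-\alpha} \\
 &\le \|f\|_{C_b^1} \|Y\|_\alpha + C \Bigl( \|f\|_{C_b^1}^2 + C M \|f\|_{C_b^2} \bigl( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \bigr) \Bigr) T^{\beta-\alpha} \\
 &\le \|f\|_{C_b^1} (\|f\|_\infty + 1) T^{\beta-\alpha} + C M \Bigl( \|f\|_{C_b^1}^2 + \|f\|_{C_b^2} (\|f\|_\infty + 1) \Bigr) T^{\beta-\alpha} ,
\end{align*}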
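where in the last step we used (8.7) and also $\|Y\|_{\alpha;[0,T]} \le C_f T^{\beta-\alpha}$, seen from
\begin{align*}
|Y_{s,t}| &\le |Y'|_\infty |X_{s,t}| + \|R^Y\|_{2\alpha} |t - s|^{2\alpha} \\
 &\le (|Y'_0| + \|Y'\|_\alpha) \|X\|_\beta |t - s|^\beta + \|R^Y\|_{2\alpha} |t - s|^{2\alpha} .
\end{align*}
Then, using $T^\alpha \le T^{\beta-\alpha}$ and $\|R^Y\|_{2\alpha} \le \|Y, Y'\|_{X,2\alpha} \le 1$, we obtain the bound
\begin{align*}
\|Y\|_{\alpha;[0,T]} &\le \bigl( |Y'_0| + \|Y, Y'\|_{X,2\alpha} \bigr) \|X\|_\beta T^{\beta-\alpha} + \|R^Y\|_{2\alpha} T^{\beta-\alpha} \tag{8.8} \\
 &\le \bigl( (\|f\|_\infty + 1) \|X\|_\beta + 1 \bigr) T^{\beta-\alpha} .
\end{align*}
In other words, $\|\mathcal{M}_T(Y, Y')\|_{X,2\alpha} = \|\mathcal{M}_T(Y, Y')\|_{X,2\alpha;[0,T]} = O(T^{\beta-\alpha})$ with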
The contraction property is obvious, provided that we can establish the following
two estimates:
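\[ \|Y - \tilde Y\|_\alpha \le C T^{\beta-\alpha} \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} , \tag{8.9} \]
\[ \|\Delta, \Delta'\|_{X,2\alpha} \le C \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} . \tag{8.10} \]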
We now turn to (8.10). Similar to the proof of Lemma 8.2, $f \in C^3$ allows us to write
$\Delta_s = G_s H_s$ where
\[ G_s := g(Y_s, \tilde Y_s) , \qquad H_s := Y_s - \tilde Y_s , \]
and $g \in C_b^2$ with $\|g\|_{C_b^2} \le C \|f\|_{C_b^3}$. Lemma 7.3 tells us that $(G, G') \in \mathscr{D}_X^{2\alpha}$ (with
$G' = (D_Y g) Y' + (D_{\tilde Y} g) \tilde Y'$) and in fact immediately yields an estimate of the form
\[ \lesssim \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} , \]
where we made use of $\|g\|_\infty, \|g\|_{C_b^1} \lesssim \|f\|_{C_b^3}$ and $|Y'_0| = |\tilde Y'_0| = |f(\xi)| \le |f|_\infty$.
The argument from here on is identical to the Young case: the previous estimates
allow for a small enough $T_0 \le 1$ such that $\mathcal{M}_{T_0}(B_{T_0}) \subset B_{T_0}$ and for all
$(Y, Y'), (\tilde Y, \tilde Y') \in B_{T_0}$:
\[ \|\mathcal{M}_{T_0}(Y, Y') - \mathcal{M}_{T_0}(\tilde Y, \tilde Y')\|_{X,2\alpha} \le \tfrac12 \|Y - \tilde Y, Y' - \tilde Y'\|_{X,2\alpha} \]
and so $\mathcal{M}_{T_0}(\cdot)$ admits a unique fixed point $(Y, Y') \in B_{T_0}$, which is then the unique
solution Y to (8.1) on the (possibly rather small) interval $[0, T_0]$. Noting that the
choice of $T_0$ can again be done uniformly in the starting point, the solution on [0, 1]
is then constructed iteratively as before. □
In many situations, one is interested in solutions to an equation of the type
instead of (8.6). On the one hand, it is possible to recast (8.11) in the form (8.6) by
writing it as an RDE for $\hat Y_t = (Y_t, t)$ driven by $\hat{\mathbf{X}} = (\hat X, \hat{\mathbb{X}})$ where $\hat X_t = (X_t, t)$
and $\hat{\mathbb{X}}$ is given by $\mathbb{X}$ and the “remaining cross integrals” of $X_t$ and t, given by usual
Riemann-Stieltjes integration. However, it is possible to exploit the structure of (8.11)
to obtain somewhat better bounds on the solutions. See [FV10b, Ch. 12].
dY = f (Y ) dX, Y0 = ξ ∈ W ;
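similarly, let $(\tilde Y, f(\tilde Y))$ be the RDE solution driven by $\tilde{\mathbf{X}}$ and started at $\tilde\xi$ where
$\mathbf{X}, \tilde{\mathbf{X}} \in \mathscr{C}^\beta$ and α < β. Assuming
and also
\[ \|Y - \tilde Y\|_\alpha \le C_M \bigl( |\xi - \tilde\xi| + \varrho_\beta(\mathbf{X}, \tilde{\mathbf{X}}) \bigr) , \]
(similarly with tilde) and the local Lipschitz estimate for rough integration (uniform
in T ≤ 1) writing $(\Xi, \Xi') := (f(Y), f(Y)')$ for the integrand,
\begin{align*}
d_{X,\tilde X,2\alpha}(Y, f(Y); \tilde Y, f(\tilde Y)) &= d_{X,\tilde X,2\alpha}(Z, Z'; \tilde Z, \tilde Z') \\
 &\lesssim \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + d_{X,\tilde X,2\alpha}(\Xi, \Xi'; \tilde\Xi, \tilde\Xi') \\
 &\le \varrho_\beta(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + d_{X,\tilde X,2\beta}(\Xi, \Xi'; \tilde\Xi, \tilde\Xi') ,
\end{align*}
where we used α < β and T ≤ 1 in the last step. Thanks to the local Lipschitz
estimate for composition (also uniform over T ≤ 1)
\[ \le \varrho_\beta(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + d_{X,\tilde X,2\alpha}(Y, f(Y); \tilde Y, f(\tilde Y)) \, T^{\beta-\alpha} . \]
\[ d_{X,\tilde X,2\alpha}(Y, f(Y); \tilde Y, f(\tilde Y)) \le C \bigl( \varrho_\beta(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| + d_{X,\tilde X,2\alpha}(Y, f(Y); \tilde Y, f(\tilde Y)) \, T^{\beta-\alpha} \bigr) . \]
\[ d_{X,\tilde X,2\alpha}(Y, f(Y); \tilde Y, f(\tilde Y)) \le 2C \bigl( \varrho_\beta(\mathbf{X}, \tilde{\mathbf{X}}) + |\xi - \tilde\xi| \bigr) , \]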
Fix $f \in C_b^2(W, L(V, W))$ and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\beta([0,T], V)$ with β > 1/3. Under
these assumptions, the rough differential equation $dY = f(Y)\,d\mathbf{X}$ makes sense as a
well-defined integral equation. (In Theorem 8.4 we used additional regularity, namely
$C_b^3$, to establish existence of a unique solution on [0, T].) By the very definition of an
RDE solution, unique or not, $(Y, f(Y)) \in \mathscr{D}_X^{2\beta}$ i.e.
\[ Y_{s,t} = f(Y_s) X_{s,t} + O\bigl( |t - s|^{2\beta} \bigr) \]
It turns out that the description (8.13) is actually a formulation that is equivalent
to the RDE solution built previously in the following sense.
Proposition 8.8. The following two statements are equivalent
i) (Y, f(Y)) is an RDE solution to (8.6), as constructed in Theorem 8.4.
ii) Y ∈ C([0, T ], W ) is an “RDE solution in the sense of Davie”, i.e. in the sense of
(8.13).
Proof. We already discussed how (8.13) is obtained from an RDE solution to
(8.6). Conversely, (8.13) implies immediately $Y_{s,t} = f(Y_s) X_{s,t} + O(|t - s|^{2\beta})$
which shows that $Y \in C^\beta$ and also $Y' := f(Y) \in C^\beta$, thanks to $f \in C_b^2$, so that
$(Y, f(Y)) \in \mathscr{D}_X^{2\beta}$. It remains to see, in the notation of the proof of Theorem 4.10,
that $Y_{s,t} = (I\Xi)_{s,t}$ with
\[ \Xi_{s,t} = f(Y_s) X_{s,t} + (f(Y))'_s \mathbb{X}_{s,t} = f(Y_s) X_{s,t} + Df(Y_s) f(Y_s) \mathbb{X}_{s,t} . \]
To see this, we note that trivially $Y_{s,t} = (I\tilde\Xi)_{s,t}$ with $\tilde\Xi_{s,t} := Y_{s,t}$. But $\tilde\Xi_{s,t} =
\Xi_{s,t} + o(|t - s|)$ and one sees as in Remark 4.12 that $I\tilde\Xi = I\Xi$. □
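The local expansion used in this proof suggests a step-by-step numerical scheme: replace each increment of Y by $f(Y_s) X_{s,t} + Df(Y_s) f(Y_s) \mathbb{X}_{s,t}$. The sketch below does this in a deliberately simple setting that is an assumption of the example, not of the text: a scalar smooth driver X_t = sin(3t) with its geometric lift $\mathbb{X}_{s,t} = X_{s,t}^2 / 2$, f = cos and ξ = 0, for which the solution is Y_t = 2 arctan(tanh(X_t / 2)). The function name `davie_scheme` is ours.

```python
import math

def davie_scheme(f, df, xi, x, second_order=True):
    """Y_{k+1} = Y_k + f(Y_k) X_{k,k+1} + Df(Y_k) f(Y_k) XX_{k,k+1} along a scalar path x.

    Assumption of this sketch: scalar geometric lift, XX_{s,t} = X_{s,t}^2 / 2.
    With second_order=False the second-level term is dropped (plain Euler).
    """
    y = [xi]
    for a, b in zip(x, x[1:]):
        dx = b - a
        step = f(y[-1]) * dx
        if second_order:
            step += df(y[-1]) * f(y[-1]) * 0.5 * dx * dx
        y.append(y[-1] + step)
    return y

n = 1000
x = [math.sin(3.0 * k / n) for k in range(n + 1)]
exact = [2.0 * math.atan(math.tanh(v / 2.0)) for v in x]
f, df = math.cos, lambda v: -math.sin(v)
err_davie = max(abs(a - b) for a, b in zip(davie_scheme(f, df, 0.0, x), exact))
err_euler = max(abs(a - b) for a, b in zip(davie_scheme(f, df, 0.0, x, second_order=False), exact))
print(err_davie, err_euler)
```

For a smooth driver both variants converge; the point of the second-level term is that it remains necessary, and sufficient, when X is only α-Hölder with α > 1/3.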
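directly by
\[ Z_t \stackrel{\text{def}}{=} (I\Xi)_{0,t} , \qquad \Xi_{s,t} = F(X_s) X_{s,t} + DF(X_s) \mathbb{X}_{s,t} , \]
\[ \mathbb{Z}_{s,t} \stackrel{\text{def}}{=} (I\bar\Xi^s)_{s,t} , \qquad \bar\Xi^s_{u,v} = Z_{s,u} Z_{u,v} + \bigl( F(X_u) \otimes F(X_u) \bigr) \mathbb{X}_{u,v} . \]
It is possible to check that $\bar\Xi^s \in C_2^{\alpha,3\alpha}$ for every fixed s (see the proof of Theorem
4.10) so that the second line makes sense. It is also straightforward to check that
$(Z, \mathbb{Z})$ satisfies (2.1), so that it does indeed belong to $\mathscr{C}^\alpha$. Actually, one can see that
\[ Z_t = \int_0^t F(X_s)\,d\mathbf{X}_s , \qquad \mathbb{Z}_{s,t} = \int_s^t Z_{s,r} \otimes dZ_r , \]
where the integrals are defined as in the previous sections, where $F(X) \in \mathscr{D}_X^{2\alpha}$ as in
Section 7.3.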
We can now define solutions to (8.6) in the following way.
Write π(f ) (0, y; X) = Y for this solution. Note that the inverse flow exists trivially,
by following the RDE driven by X(. − t),
We call the map y 7→ π(f ) (0, y; X) the flow associated to the above RDE. Moreover,
if $X^\varepsilon$ is a smooth approximation to $\mathbf{X}$ (in rough path metric), then the corresponding
ODE solution $Y^\varepsilon$ is close to Y, with a local Lipschitz estimate as given in Section
8.6.
It is natural to ask if the flow depends smoothly on y. Given a multi-index
k = (k1 , . . . , ke ) ∈ Ne , write Dk for the partial derivative with respect to y 1 , . . . , y e .
The proof of the following statement is an easy consequence of [FV10b, Chapter 12].
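Theorem 8.10. Let α ∈ (1/3, 1/2] and $\mathbf{X}, \tilde{\mathbf{X}} \in \mathscr{C}_g^\alpha$. Assume $f \in C_b^{3+n}$ for some
integer n. Then the associated flow is of regularity $C^{n+1}$ in y, as is its inverse flow.
The resulting family of partial derivatives, $\{D^k \pi_{(f)}(0, \xi; \mathbf{X}), |k| \le n\}$ satisfies the
RDE obtained by formally differentiating $dY = f(Y)\,d\mathbf{X}$.
At last, for every M > 0 there exist C, K depending on M and the norm of f
such that, whenever $|||\mathbf{X}|||_\alpha, |||\tilde{\mathbf{X}}|||_\alpha \le M < \infty$ and |k| ≤ n,
\[ \sup_{\xi \in \mathbb{R}^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf{X}) - D^k \pi_{(f)}(0, \xi; \tilde{\mathbf{X}}) \bigr\|_{\alpha;[0,t]} \le C \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) , \]
\[ \sup_{\xi \in \mathbb{R}^e} \bigl\| D^k \pi_{(f)}(0, \xi; \mathbf{X})^{-1} - D^k \pi_{(f)}(0, \xi; \tilde{\mathbf{X}})^{-1} \bigr\|_{\alpha;[0,t]} \le C \varrho_\alpha(\mathbf{X}, \tilde{\mathbf{X}}) , \]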
8.10 Exercises
Exercise 8.12 (Linear RDEs). Consider $f \in L(W, L(V, W))$. Give an a priori
estimate for solutions to $dY = f(Y)\,d\mathbf{X}$. Conclude with a (global) existence and
uniqueness result for such linear RDEs.
f = (f1 , . . . , fd ) ∈ Cb∞ Re , L Rd , Re ,
Exercise 8.15. Establish existence, continuity and stability for rough differential
equations with drift (cf. (8.6)),
You may assume $f_0 \in C_b^3$ (although one can do much better and $f_0$ Lipschitz is
enough). Hint: Under this assumption, one solves $dY = \bar f(Y)\,d\bar{\mathbf{X}}$ with $\bar f = (f, f_0)$
and $\bar{\mathbf{X}}$ a “space-time” rough path extension of $\mathbf{X}$.
Exercise 8.16. Let $f \in C_b^2$ and assume (Y, f(Y)) is an RDE solution to (8.6), as
constructed in Theorem 8.4. Show that the o-term in Davie’s definition, (8.13), can
be chosen uniformly over $(X, \mathbb{X}) \in B_R$, any R < ∞, where
\[ B_R := \bigl\{ (X, \mathbb{X}) \in \mathscr{C}^\beta : \|X\|_\beta + \|\mathbb{X}\|_{2\beta} \le R \bigr\} . \]
Show also that RDE solutions are β-Hölder, uniformly over $(X, \mathbb{X}) \in B_R$, any
R < ∞.
Exercise 8.17. Show that $d_{X,X^n,2\alpha}((Y, f(Y)), (Y^n, f(Y^n))) \to 0$, together with
$\mathbf{X}^n \to \mathbf{X}$ in $\mathscr{C}^\beta$ implies that also $(Y^n, \mathbb{Y}^n) \to (Y, \mathbb{Y})$ in $\mathscr{C}^\alpha$. Since, at the price of
replacing f by F , cf. Definition 8.9, there is no loss of generality in solving for the
controlled rough path Z = X ⊕ Y , conclude that continuity of the RDE solution
map (Itô–Lyons map) also holds with Lyons’ definition of a solution.
8.11 Comments
ODEs driven by not too rough paths, i.e. paths that are α-Hölder continuous for some
α > 1/2 or of finite p-variation with p < 2, understood in the (Young) integral sense
were first studied by Lyons in [Lyo94]; nonetheless, the terminology Young-ODEs is
now widely used. Existence and uniqueness for such equations via Picard iterations
is by now classical; our discussion in Section 8.3 is a mild variation of [LCL07, p.22]
where also the division property (cf. proof of Lemma 8.2) is emphasised. Existence
and uniqueness of solutions to RDEs via Picard iteration in the (Banach!) space of
controlled rough paths originates in [Gub04] for regularity α ∈ (1/3, 1/2). This approach
also allows one to treat arbitrary regularities, see [Gub10, Hai14c].
The continuity result of Theorem 8.5 is due to T. Lyons; proofs of uniform
continuity on bounded sets were given in [Lyo98, LQ02, LCL07]. Local Lipschitz
estimates were pointed out subsequently and in different settings by various authors
including Lyons–Qian [LQ02], Gubinelli [Gub04], Friz–Victoir [FV10b], Inahama
[Ina10], Deya et al. [DNT12a].
The name universal limit theorem was suggested by P. Malliavin, meaning con-
tinuity of the Itô–Lyons map in rough path metrics. As we tried to emphasise, the
stability in rough path metrics is seen at all levels of the theory.
Lyons’ original argument (for arbitrary regularity) also involves a Picard iteration,
see e.g. [LCL07, p.88]. For regularity α > 1/3, Davie [Dav08] proves existence
and uniqueness for Young resp. rough differential equations via discrete Euler resp.
Milstein approximations. Using Lie group techniques, Davie’s argument was adapted
to arbitrary values of α by Friz–Victoir [FV10b]. Let us also note that the regularity
assumption in Theorem 8.4 (f ∈ Cb3 ) is not sharp; it is fairly straightforward to push
the argument to γ-Lipschitz (in the sense of Stein) regularity, for any γ > 1/α. It is
less straightforward [Dav08, FV10b] to show that uniqueness also holds for γ ≥ 1/α
and this is optimal, with counter-examples constructed in [Dav08]. Existence results
on the other hand are available for γ > (1/α) − 1. Setting α = 1, this is consistent
with the theory of ODEs where it is well known that, modulo possible logarithmic
divergences, Lipschitz continuity of the coefficients is required for the uniqueness
of local solutions, but continuity is sufficient for their existence.
Chapter 9
Stochastic differential equations
In particular, we may use almost every realisation of $(B, \mathbb{B})$ as the driving signal
of a rough differential equation. This RDE is then solved “pathwise” i.e. for a
fixed realisation of $(B(\omega), \mathbb{B}(\omega))$. Recall that the choice of $\mathbb{B}$ is never unique: two
important choices are the Itô and the Stratonovich lift; we write $\mathbf{B}^{\text{Itô}}$ and $\mathbf{B}^{\text{Strat}}$, where
$\mathbb{B}$ is defined as $\int B \otimes dB$ and $\int B \otimes \circ\, dB$ respectively. We now discuss the interplay
with classical stochastic differential equations (SDEs).
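The difference between the two lifts can be seen on a simulated path. The sketch below is an illustration only, restricted to a scalar Brownian motion on [0, 1] sampled on a uniform grid: left-point sums stand in for the Itô integral of B against dB, and midpoint (trapezoidal) sums for the Stratonovich one. The function name, grid size and seed are arbitrary choices of this example.

```python
import math
import random

def iterated_integrals(b):
    """Left-point (Ito-type) and midpoint (Stratonovich-type) sums for int B dB."""
    pairs = list(zip(b, b[1:]))
    ito = sum(lo * (hi - lo) for lo, hi in pairs)
    strat = sum(0.5 * (lo + hi) * (hi - lo) for lo, hi in pairs)
    return ito, strat

rng = random.Random(0)
n, T = 2**14, 1.0
b = [0.0]
for _ in range(n):
    b.append(b[-1] + rng.gauss(0.0, math.sqrt(T / n)))

ito, strat = iterated_integrals(b)
print(strat - 0.5 * b[-1] ** 2)   # telescopes to zero up to rounding
print(strat - ito)                # half the sampled quadratic variation, close to T / 2
```

The midpoint sum obeys the ordinary chain rule, while the two sums differ by half the quadratic variation, which is the familiar Itô correction T/2.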
dY = f0 (Y )dt + f (Y ) dBItô , Y0 = ξ.
Proof. We assume zero drift f0 , but see Exercise 8.15. The map
A classical result (e.g. [IW89, p.392]) asserts that SDE approximations based on
piecewise linear approximations to the driving Brownian motions converge to the
solution of the Stratonovich equation. Using the machinery built in the previous
sections, we can now give a simple proof of this by combining Proposition 3.6,
Theorem 8.5 and the understanding that RDEs driven by BStrat yield solutions to the
Stratonovich equation (Theorem 9.1).
Theorem 9.3 (Wong–Zakai, Clark, Stroock–Varadhan). Let $f, f_0, \xi$ be as in Theorem 9.1
above. Let α < 1/2. Consider dyadic piecewise-linear approximations
$(B^n)$ to B on [0, T], as defined in Proposition 3.6. Write $Y^n$ for the (random) ODE
solutions to $dY^n = f_0(Y^n)\,dt + f(Y^n)\,dB^n$ and Y for the Stratonovich SDE solution
to $dY = f_0(Y)\,dt + f(Y) \circ dB$, all started at ξ. Then the Wong–Zakai approximations
converge a.s. to the Stratonovich solution. More precisely, with probability
one,
\[ \|Y - Y^n\|_{\alpha;[0,T]} \to 0 . \]
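A small experiment illustrates the statement in the simplest possible case. The assumptions of this sketch are ours: scalar noise, zero drift, f = sin, ξ = 1, and a piecewise-linear path built from simulated Gaussian increments. In this one-dimensional situation the Stratonovich solution is explicit, Y_t = 2 arctan(tan(ξ/2) exp(B_t)), so the ODE driven by the piecewise-linear path can be compared with it at the grid points.

```python
import math
import random

def wong_zakai(f, xi, increments, dt, substeps=4):
    """Solve dY/dt = f(Y) * (dB^n/dt) along a piecewise-linear path with classical RK4."""
    y = xi
    h = dt / substeps
    for db in increments:
        slope = db / dt                      # constant derivative of B^n on this piece
        for _ in range(substeps):
            k1 = f(y) * slope
            k2 = f(y + 0.5 * h * k1) * slope
            k3 = f(y + 0.5 * h * k2) * slope
            k4 = f(y + h * k3) * slope
            y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return y

rng = random.Random(1)
n, T, xi = 256, 1.0, 1.0
dt = T / n
increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
b_T = sum(increments)

y_wz = wong_zakai(math.sin, xi, increments, dt)
y_strat = 2.0 * math.atan(math.tan(xi / 2.0) * math.exp(b_T))
print(y_wz, y_strat)
```

With one-dimensional noise the agreement at grid points is exact up to the ODE solver error, because no Lévy area is involved; the content of the theorem lies in the multidimensional case and in the Hölder norm over the whole interval.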
The only reason for dyadic piecewise-linear approximations in the above statement
is the formulation of the martingale-based Proposition 3.6. In Section 10 we shall
present a direct analysis (going far beyond the setting of Brownian drivers) which
easily entails quantitative convergence (in probability and Lq , any q < ∞) for all
piecewise-linear approximations towards a (Gaussian) rough path.
9.3 Support theorem and large deviations 125
In the forthcoming Exercise 10.14 it will be seen that (non-dyadic) piecewise linear
approximations (meshsize ∼ 1/n), viewed canonically as rough paths, converge a.s.
in C α with rate anything less than 1/2 − α. As long as α > 1/3, it then follows
from (local) Lipschitzness of the Itô–Lyons map that Wong–Zakai approximations
also converge with rate 1/2 − α−. Note that the “best” rate one obtains in this way
is 1/2 − 1/3− = 1/6−; the reason being that rate is measured in some Hölder space
with exponent 1/3+, rather than the uniform norm. The (well known) almost sure
“strong” rate 1/2− can be obtained from rough path theory at the price of working in
rough path spaces of (much) lower regularity; see [FR14].
We briefly discuss two fundamental results in diffusion theory and explain how
the theory of rough paths provides elegant proofs, reducing a question for general
diffusion to one for Brownian motion and its Lévy area.
The results discussed in this section were among the very first applications of
rough path theory to stochastic analysis, see Ledoux et al. [LQZ02]. Much more on
these topics is found [FV10b], so we shall be brief. The first result, due to Stroock–
Varadhan, concerns the support of diffusion processes.
(where Euclidean norm is used for the conditioning $\|B - h\|_{\infty,[0,T]} < \varepsilon$). As a
consequence, the support of the law of Y, viewed as measure on the pathspace
$C^{0,\alpha}([0,T], \mathbb{R}^e)$, is precisely the α-Hölder closure of $\{y^h : \dot h \in L^2([0,T], \mathbb{R}^d)\}$.
Proof. Using Theorem 9.1 we can and will take Y as RDE solution driven by
$\mathbf{B}^{\text{Strat}}(\omega)$. For h ∈ H and some fixed α ∈ (1/3, 1/2), we furthermore denote by
$S^{(2)}(h) = (h, \int h \otimes dh) \in \mathscr{C}_g^{0,\alpha}$ the canonical lift given by computing the it-
The conditional statement then follows easily from continuity of the Itô–Lyons map
and so yields the “difficult” support inclusion: every y h is in the support of Y . The
easy inclusion, support of Y contained in the closure of {y h }, follows from the
Wong–Zakai theorem, Theorem 9.3. If one is only interested in the support statement,
but without the conditional statement (9.2), there are “softer” proofs; see Exercise
9.6 below. □
The second result to be discussed here, due to Freidlin–Wentzell, concerns the
behaviour of diffusion in the singular (ε → 0) limit when B is replaced by εB. We
assume the reader is familiar with large deviation theory.
Theorem 9.5 (Freidlin–Wentzell large deviations). Let f, f0 , ξ be as in Theorem
9.1 above. Let α < 1/2, B be a d-dimensional Brownian motion and consider the
unique Stratonovich SDE solution $Y = Y^\varepsilon$ on [0, T] to
\[ dY = f_0(Y)\,dt + \sum_{i=1}^d f_i(Y) \circ \varepsilon\, dB^i \tag{9.4} \]
Here I is Schilder’s rate function for Brownian motion, i.e. $I(h) = \tfrac12 \|\dot h\|^2_{L^2([0,T], \mathbb{R}^d)}$
for h ∈ H and I(h) = +∞ otherwise.
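For a piecewise-linear path the rate function reduces to a finite sum, since the derivative is constant on each piece. The following sketch computes it; the function name `schilder_rate`, the representation of h by its values at the breakpoints, and the convention that a path not starting at the origin lies outside H (and so has infinite rate) are assumptions of this illustration.

```python
def schilder_rate(times, values):
    """I(h) = 1/2 * int |h'(t)|^2 dt for a piecewise-linear path h given at breakpoints."""
    if any(abs(c) > 0.0 for c in values[0]):
        return float("inf")                  # assumed convention: h must start at 0
    total = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        sq = sum((b - a) ** 2 for a, b in zip(values[k], values[k + 1]))
        total += sq / dt                     # |h'|^2 * dt on this piece
    return 0.5 * total

times = [0.0, 0.5, 1.0, 1.5, 2.0]
line = [(1.5 * k, 2.0 * k) for k in range(5)]    # h(t) = t * (3, 4) on [0, 2]
print(schilder_rate(times, line))
```

For the straight line h(t) = t v on [0, T] this returns |v|² T / 2, and rescaling h by ε multiplies the rate by ε², which is the scaling behind the ε² normalisation in large deviation statements of this kind.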
Proof. The key remark is that large deviation principles are robust under continuous
maps, a simple fact known as contraction principle. The problem is then reduced to
establishing a suitable large deviation principle for the Stratonovich lift of εB (which
is exactly $\delta_\varepsilon \mathbf{B}^{\text{Strat}}$) in the α-Hölder rough path topology. Readers familiar with general
facts of large deviation theory, in particular the inverse and generalized contraction
principles, are invited to complete the proof along Exercise 9.7 below. □
9.4 Exercises
(closed) subspace of $\mathscr{C}_g^{0,\alpha}$ of rough paths $\mathbf{X}$ started at $X_0 = 0$. Show that $\mathbf{B}^{\text{Strat}}$ has
full support. The “easy” inclusion, supp µ ⊂ $\mathscr{C}_g^{0,\alpha}$ is clear from Proposition 3.6. For
the other inclusion, recall the translation operator from Exercise 2.19 and follow the
steps below.
a) (Cameron–Martin theorem for Brownian rough path) Let $h \in H = W_0^{1,2}$. Show that
$\mathbf{X} \in \operatorname{supp} \mu$ implies $T_h(\mathbf{X}) \in \operatorname{supp} \mu$.
b) Show that the support of µ contains at least one point, say $\hat{\mathbf{X}} \in \mathscr{C}_g^{0,\alpha}$ with the
property that there exists a sequence of Lipschitz paths $(h^{(n)})$ so that $T_{h^{(n)}}(\hat{\mathbf{X}}) \to
(0, 0)$ in α-Hölder rough path metric. Hint: Almost every realization of $\mathbf{B}^{\text{Strat}}(\omega)$
will do, with $-h^{(n)} = B^{(n)}$, the dyadic piecewise-linear approximations from
Proposition 3.6.
c) Conclude that $(0, 0) = \lim_{n\to\infty} T_{h^{(n)}}(\hat{\mathbf{X}}) \in \operatorname{supp} \mu$.
d) As a consequence, any $(h, \int h \otimes dh) = T_h(0, 0) \in \operatorname{supp} \mu$, for any h ∈ H and
taking the closure yields the “difficult” inclusion.
e) Appeal to continuity of the Itô–Lyons map to obtain the “difficult” support inclu-
sion (“every y h is in the support of Y ” ) in the context of Theorem 9.4.
\[ J(\mathbf{X}) = I(X) , \]
where $\mathbf{X} = (X, \mathbb{X})$ and I is Schilder’s rate function for Brownian motion, i.e.
$I(h) = \tfrac12 \|\dot h\|^2_{L^2([0,T], \mathbb{R}^d)}$ for $h \in H = W_0^{1,2}$ and I(h) = +∞ otherwise.
Hint: Thanks to Fernique estimates for the homogeneous rough paths norm of
$\mathbf{B}^{\mathrm{Strat}}$ (which can be obtained by carefully tracking the moment-growth in Theorem
3.1 applied to $\mathbf{B}^{\mathrm{Strat}}$; alternatively see Theorem 11.9 below for an elegant Gaussian
argument) it is actually enough to establish a large deviation principle for $(\delta_\varepsilon \mathbf{B}^{\mathrm{Strat}} : \varepsilon > 0)$
in the (much coarser) uniform topology, which is not very hard to do “by
hand”, cf. [FV10b].
9.5 Comments
Lyons [Lyo98] used the Wong–Zakai theorem in conjunction with his continuity
result to deduce the fact that RDE solutions (driven by the Brownian rough path $\mathbf{B}^{\mathrm{Strat}}$)
coincide with solutions to (Stratonovich) stochastic differential equations. Similar to
Friz–Victoir [FV10b], the logic is reversed here: thanks to an a priori identification
of $\int f(Y)\, d\mathbf{B}^{\mathrm{Strat}}$ as a Stratonovich stochastic integral, the Wong–Zakai result is
obtained. Almost sure rates for Wong–Zakai approximations in Brownian (and then
more general Gaussian) situations, were studied by Hu–Nualart [HN09], Deya, Tindel
and Neuenkirch [DNT12b] and Friz–Riedel [FR14]; see also Riedel–Xu [RX13].
Let us also note that Lq -rates for the convergence of approximations are not easy
to obtain with rough path techniques (in contrast to Itô-calculus which is ideally
suited for moment calculations). Nonetheless, such rates can be obtained by Gaussian
techniques, as discussed in Section 11.2.3 below; applications include multi-level
Monte Carlo for rough differential equations [BFRS13]. The material in Section
9.3 goes back to Ledoux, Qian and Zhang ([LQZ02]; in p-variation). The results in
stronger Hölder topology are due to Friz and Victoir [Fri05, FV05, FV07, FV10b],
the conditional estimate (9.3) is due to Friz, Lyons and Stroock [FLS06].
Chapter 10
Gaussian rough paths
$$ X(\omega) : [0, T] \to \mathbf{R}^d $$
and may take the underlying probability space as $C([0, T], \mathbf{R}^d)$, equipped with a
Gaussian measure µ so that $X_t(\omega) = \omega(t)$. Recall that µ, the law of X, is fully
determined by its covariance function
$$ R : [0, T]^2 \to \mathbf{R}^{d\times d}, \qquad (s, t) \mapsto E[X_s \otimes X_t] . $$
In this section, a major role will be played by the rectangular increments of the
covariance, namely
$$ R\begin{pmatrix} s, t \\ s', t' \end{pmatrix} \overset{\mathrm{def}}{=} E[X_{s,t} \otimes X_{s',t'}] . $$
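To make the rectangular increment concrete, here is a minimal sketch for a scalar process. It expands $E[X_{s,t} X_{u,v}]$ by bilinearity into four values of the covariance function. The helper names `rect_increment` and `brownian_cov` are chosen for this illustration and do not come from the text.

```python
def rect_increment(R, s, t, u, v):
    """Rectangular increment E[X_{s,t} X_{u,v}] of a scalar covariance R(a, b) = E[X_a X_b].

    Obtained by expanding (X_t - X_s)(X_v - X_u) and taking expectations.
    """
    return R(t, v) - R(t, u) - R(s, v) + R(s, u)

def brownian_cov(a, b):
    """Covariance of standard Brownian motion, E[B_a B_b] = min(a, b)."""
    return min(a, b)
```

For Brownian motion the rectangular increment is the length of the overlap of $[s,t]$ and $[u,v]$, so it vanishes for disjoint intervals.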
As far as the Hölder regularity of sample paths is concerned, we have the following
classical result, which is nothing but a special case of Kolmogorov’s continuity
criterion:
Proposition 10.1. Assume there exist positive ϱ and M such that for every $0 \le s \le t \le T$,
$$ \left| R\begin{pmatrix} s, t \\ s, t \end{pmatrix} \right| \le M |t - s|^{1/\varrho} . \tag{10.1} $$
Then, for every $\alpha < 1/(2\varrho)$ there exists $K_\alpha \in L^q$, for all $q < \infty$, such that
$$ |X_{s,t}(\omega)| \le K_\alpha(\omega) |t - s|^\alpha . $$
Proof. We may argue componentwise and thus take d = 1 without loss of generality.
Since
$$ |X_{s,t}|_{L^2} = (E[X_{s,t} X_{s,t}])^{1/2} \le \left| R\begin{pmatrix} s, t \\ s, t \end{pmatrix} \right|^{1/2} \le M^{1/2} |t - s|^{\frac{1}{2\varrho}} $$
and $|X_{s,t}|_{L^q} \le c_q |X_{s,t}|_{L^2}$ by Gaussianity, we conclude immediately with an
application of the Kolmogorov criterion. □
Whenever the above proposition applies with ϱ < 1, the resulting sample paths
can be taken with Hölder exponent $\alpha \in (\frac12, \frac{1}{2\varrho})$; differential equations driven by X
can then be handled with Young’s theory, cf. Section 8.3. Therefore, our focus will be
on Gaussian processes which satisfy a suitable modification of condition (10.1) with
ϱ ≥ 1 such that the process X allows for a probabilistic construction of a suitable
second order process¹
$$ \mathbb{X}(\omega) : [0, T]^2 \to \mathbf{R}^{d\times d} , $$
which is tantamount to making sense of the “formal” stochastic integrals
$$ \int_s^t X^i_{s,r}\, dX^j_r \quad \text{for } 0 \le s < t \le T,\ 1 \le i, j \le d , \tag{10.2} $$
such that almost every realisation $\mathbb{X}(\omega)$ satisfies the algebraic and analytical
properties of Section 2, notably (2.1) and (2.3) for some $\alpha \in (\frac13, \frac12]$. We shall also look
for $(X, \mathbb{X})$ as (random) geometric rough path; thanks to (2.5), only the case i < j in
(10.2) then needs to be considered.
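The reduction to the case i < j can be checked by hand on smooth approximations. The sketch below computes the increment and the matrix of iterated integrals of a piecewise-linear path, assembling the segments with Chen's relation (assumed here to be the algebraic condition referred to above). The symmetric part of the second level is then determined by the first level. The function name `lift_piecewise_linear` is hypothetical.

```python
import numpy as np

def lift_piecewise_linear(points):
    """Return (x, xx) with x = X_{0,T} and xx[i, j] = int_0^T X^i_{0,r} dX^j_r
    for the piecewise-linear path through `points` (array of shape (n+1, d))."""
    points = np.asarray(points, dtype=float)
    d = points.shape[1]
    x = np.zeros(d)          # running first level X_{0,t}
    xx = np.zeros((d, d))    # running second level
    for delta in np.diff(points, axis=0):
        # Chen's relation: XX_{0,u} = XX_{0,t} + X_{0,t} (x) X_{t,u} + XX_{t,u};
        # a single linear segment has XX_{t,u} = 0.5 * delta (x) delta.
        xx = xx + np.outer(x, delta) + 0.5 * np.outer(delta, delta)
        x = x + delta
    return x, xx
```

For any such path one finds `xx + xx.T == np.outer(x, x)` up to rounding, so the entries with i > j and the diagonal can be recovered from those with i < j and the path increments.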
At the risk of being repetitive, the reader should keep in mind the following three
points: (i) the sample paths X(ω) will not have, in general, enough regularity to
define (10.2) as Young integrals; (ii) the process X will not be, in general, a
semimartingale, so (10.2) cannot be defined using classical stochastic integrals; (iii) a lift
of the process X to $(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ for some $\alpha \in (\frac13, \frac12]$, if at all possible, will never
be unique (as discussed in Chapter 2, one can always perturb the area, i.e. $\mathrm{Anti}(\mathbb{X})$,
by the increments of a 2α-Hölder path). But there might still be one distinguished
canonical choice for $\mathbb{X}$, in the same way as $\mathbb{B}^{\mathrm{Strat}}$ is canonically obtained as limit
(in probability) of $\int B^n \otimes dB^n$, for many natural approximations $B^n$ of Brownian
motion B.
¹ Despite the two parameters (s, t) one should not think of a random field here: as was noted in
Exercise 2.7, $(X, \mathbb{X})$ is really a path.
10.2 Stochastic integration and variation regularity of the covariance
where the limit is understood in probability, say. Classical stochastic analysis (e.g.
[RY91, p144]) tells us that care is necessary: if X, X̃ are semimartingales, the
choice ξ = s (“left-point evaluation”) leads to the Itô integral; ξ = t (“right-point
evaluation”) to the backward Itô integral; and ξ = (s + t)/2 to the Stratonovich
integral. On the other hand, all these integrals only differ by a bracket term ⟨X, X̃⟩
which vanishes if X, X̃ are independent. While we do not assume a semimartingale
structure here, we do have the standing assumption of componentwise independence.
This suggests a Riemann sum approximation of (10.2) in which we expect the precise
point of evaluation to play no rôle; we thus consider left-point evaluation (but mid-
or right-point evaluation would lead to the same result; cf. Exercise 10.18, (ii) below).
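The three evaluation rules are easy to compare numerically. The following sketch, with the hypothetical helper `riemann_sum`, forms the sum of $X_{0,\xi}\,\tilde X_{s,t}$ over a partition for ξ equal to the left end point, the mid-point or the right end point. It is tested only on smooth deterministic paths, where all three rules agree in the limit; it is not a simulation of the Gaussian setting.

```python
import numpy as np

def riemann_sum(x, y, partition, rule="left"):
    """Sum over [s, t] in the partition of (x(xi) - x(t_0)) * (y(t) - y(s)).

    x, y:      vectorised callables (scalar paths)
    partition: increasing array of times t_0 < t_1 < ... < t_n
    rule:      'left' (xi = s), 'mid' (xi = (s + t) / 2) or 'right' (xi = t)
    """
    partition = np.asarray(partition, dtype=float)
    s, t = partition[:-1], partition[1:]
    xi = {"left": s, "mid": 0.5 * (s + t), "right": t}[rule]
    return float(np.sum((x(xi) - x(partition[0])) * (y(t) - y(s))))
```

With $x(r) = r$ and $y(r) = r^2$ on $[0,1]$, each rule approximates $\int_0^1 r\, d(r^2) = 2/3$ as the mesh shrinks.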
Given partitions $\mathcal{P}, \mathcal{P}'$ of [0, 1] we set
$$ \int_{\mathcal{P}} X_{0,s}\, d\tilde X_s := \sum_{[s,t]\in\mathcal{P}} X_{0,s} \tilde X_{s,t} , $$
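Let us now assume that R has finite ϱ-variation in the sense $\|R\|_{\varrho;[0,1]^2} < \infty$ where
the ϱ-variation on a rectangle $I \times I'$ is given by
$$ \|R\|_{\varrho; I\times I'} := \Bigg( \sup_{\substack{\mathcal{P}\subset I,\\ \mathcal{P}'\subset I'}} \sum_{\substack{[s,t]\in\mathcal{P}\\ [s',t']\in\mathcal{P}'}} \left| R\begin{pmatrix} s, t \\ s', t' \end{pmatrix} \right|^{\varrho} \Bigg)^{1/\varrho} < \infty, \tag{10.5} $$
and similarly for R̃, with $\theta = 1/\varrho + 1/\tilde\varrho > 1$. A generalisation of Young’s maximal
inequality due to Towghi [Tow02] states that
$$ \sup_{\substack{\mathcal{P}\subset I,\\ \mathcal{P}'\subset I'}} \left| \int_{\mathcal{P}\times\mathcal{P}'} R\, d\tilde R \right| \le C(\theta)\, \|R\|_{\varrho; I\times I'}\, \|\tilde R\|_{\tilde\varrho; I\times I'} . $$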
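For a given pair of partitions, the inner sum in (10.5) is a finite computation. The sketch below evaluates it for a scalar covariance by double differencing the covariance matrix on the grid. It yields only one candidate in the supremum, hence a lower bound for the ϱ-th power of the variation norm. The names `grid_variation_sum` and `brownian_cov` are assumptions for this example.

```python
import numpy as np

def brownian_cov(a, b):
    """Covariance of standard Brownian motion, vectorised: min(a, b)."""
    return np.minimum(a, b)

def grid_variation_sum(R, P, P_prime, rho):
    """Sum of |R(s,t; s',t')|^rho over all [s,t] in P and [s',t'] in P'."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(P_prime, dtype=float)
    cov = R(P[:, None], Q[None, :])                 # cov[i, j] = R(P[i], Q[j])
    rect = np.diff(np.diff(cov, axis=0), axis=1)    # rectangular increments
    return float(np.sum(np.abs(rect) ** rho))
```

For Brownian motion and ϱ = 1 the sum equals the length of the interval for every grid, in line with the fact that only overlapping intervals contribute.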
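Hence, $\int_0^1 X_{0,r}\, d\tilde X_r$ exists as the $L^2$-limit of $\int_{\mathcal{P}} X_{0,r}\, d\tilde X_r$ as $|\mathcal{P}| \downarrow 0$ and
$$ E\left[ \Big( \int_0^1 X_{0,r}\, d\tilde X_r \Big)^2 \right] \le C\, \|R\|_{\varrho;[0,1]^2}\, \|\tilde R\|_{\varrho;[0,1]^2} . \tag{10.7} $$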
Proof. At first glance, the situation looks similar to Young’s part in the proof of
Theorem 4.10 where we deduce (4.12) from Young’s maximal inequality. However,
the same argument fails if re-run with Ξs,t = X0,s X̃s,t and | · | replaced by | · |L2 ;
in effect, the triangle inequality is too crude and does not exploit probabilistic
cancellations present here. We now present two arguments for the key estimate (10.6).
First argument: at the price of adding/subtracting P ∩ P 0 , we may assume without
loss of generality that P 0 refines P. This allows to write
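$$ \int_{\mathcal{P}'} X_{0,r}\, d\tilde X_r - \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r = \sum_{[u,v]\in\mathcal{P}} \int_{\mathcal{P}'\cap[u,v]} X_{u,r}\, d\tilde X_r \overset{\mathrm{def}}{=} I , $$
$$ = \sum_{[u,v]\in\mathcal{P}} \sum_{[u',v']\in\mathcal{P}} \int_{\mathcal{P}'\cap[u,v]\times\mathcal{P}'\cap[u',v']} R\, d\tilde R . $$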
Thanks to Towghi’s maximal inequality, the absolute value of this term is bounded
from above by a constant C = C(ϱ) times
$$ \sum_{[u,v]\in\mathcal{P}} \sum_{[u',v']\in\mathcal{P}} \|R\|_{\varrho;[u,v]\times[u',v']}\, \|\tilde R\|_{\varrho;[u,v]\times[u',v']} \le \sum_{[u,v]\in\mathcal{P}} \sum_{[u',v']\in\mathcal{P}} \omega([u,v]\times[u',v'])^{\frac{1}{\varrho}}\, \tilde\omega([u,v]\times[u',v'])^{\frac{1}{\varrho}} , $$
where $\omega = \omega([s,t]\times[s',t'])$ (and similarly for ω̃) is a so-called 2D control [FV11]:
super-additive, continuous and zero when s = t or s′ = t′. A possible choice, if
finite, is
$$ \omega([s,t]\times[s',t']) \overset{\mathrm{def}}{=} \sup_{Q\subset[s,t]\times[s',t']} \sum_{[u,v]\times[u',v']\in Q} \left| R\begin{pmatrix} u, v \\ u', v' \end{pmatrix} \right|^{\varrho} . \tag{10.8} $$
The difference to (10.5) is that the sup is taken over all (finite) partitions Q of
$[s,t]\times[s',t']$ into rectangles; not just “grid-like” partitions induced by $\mathcal{P}\times\mathcal{P}'$.
At this stage it looks like one should change the assumption “covariance of
finite ϱ-variation” to “finite controlled ϱ-variation”, which by definition means
$\omega([0,1]^2) < \infty$. But in fact there is little difference [FV11]: finite controlled
ϱ-variation trivially implies finite ϱ-variation; conversely, finite ϱ-variation implies
finite controlled ϱ′-variation, any ϱ′ > ϱ. Since (10.6) does not depend on ϱ, we may
as well (at the price of replacing ϱ by ϱ′) assume finite controlled ϱ-variation. The
Cauchy–Schwarz inequality for finite sums shows that $\bar\omega := \omega^{1/2}\tilde\omega^{1/2}$ is again a 2D
where we used the facts that $|\mathcal{P}| \downarrow 0$, ϱ < 2 and super-additivity of ω̄ to obtain
the last inequality. This is precisely the required bound. The second argument
makes use of Riemann–Stieltjes theory, applicable after mollification of X̃, and a
uniformity property of ϱ-variation upon mollification. Let us thus denote by $\tilde X^n := \tilde X * f_n$
the convolution of $t \mapsto \tilde X_t$ with $(f_n)$, a family of smooth, compactly supported
probability density functions, weakly convergent to a Dirac at 0. Writing
$\tilde R^n_{s,t} := E[\tilde X^n_s \tilde X^n_t]$ for the covariance of $\tilde X^n$, and also $\tilde S^n_{s,t} := E[\tilde X_s \tilde X^n_t]$ for the “mixed”
covariance, we leave the fact that
as an easy exercise for the reader. (Hint: Note $\tilde R^n = \tilde R * (f_n \otimes f_n)$, $\tilde S^n = \tilde R * (\delta \otimes f_n)$;
then estimate the rectangular increments of $\tilde R^n$, respectively $\tilde S^n$, to the
power ϱ with Jensen’s inequality.)
Since $\tilde X^n$ has finite variation sample paths, basic Riemann–Stieltjes theory implies
$$ \int_{\mathcal{P}} X_{0,r}\, d\tilde X^n_r \to \int X_{0,r}\, d\tilde X^n_r \quad \text{as } |\mathcal{P}| \to 0. \tag{10.10} $$
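In fact, this convergence (n fixed) takes also place in $L^2$ which may be seen as
consequence of Lemma 10.2. On the other hand, pick ϱ′ ∈ (ϱ, 2) and apply Lemma 10.2
to obtain³
$$ \begin{aligned} \sup_{\mathcal{P}} \left| \int_{\mathcal{P}} X_{0,r}\, d\tilde X_r - \int_{\mathcal{P}} X_{0,r}\, d\tilde X^n_r \right|^2_{L^2} &\le C \|R_X\|_{\varrho';[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|_{\varrho';[0,1]^2} \\ &\le C \|R_X\|_{\varrho';[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|^{\varrho/\varrho'}_{\varrho;[0,1]^2}\, \|R_{\tilde X - \tilde X^n}\|^{1-\varrho/\varrho'}_{\infty;[0,1]^2} , \end{aligned} \tag{10.11} $$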
where C = C(ϱ). Now ϱ′ > ϱ implies $\|R_X\|_{\varrho';[0,1]^2} \le \|R_X\|_{\varrho;[0,1]^2}$ (immediate
consequence of $|x|_{\varrho'} \le |x|_{\varrho} \equiv (\sum_{i=1}^m |x_i|^{\varrho})^{1/\varrho}$ on $\mathbf{R}^m$) and thanks to (10.9) we
also have the (uniform in n) estimate
$$ \|R_{\tilde X - \tilde X^n}\|_{\varrho;[0,1]^2} \le C_\varrho \Big( \|R_{\tilde X}\|_{\varrho;[0,1]^2} + 2 \|\tilde S^n\|_{\varrho;[0,1]^2} + \|R_{\tilde X^n}\|_{\varrho;[0,1]^2} \Big) \le 4 C_\varrho\, \|\tilde R\|_{\varrho;[0,1]^2} . $$
³ Define $|f|_{\infty;[0,1]^2} = \sup \left| f\begin{pmatrix} u, v \\ u', v' \end{pmatrix} \right|$ where the sup is taken over all $[u,v], [u',v'] \subset [0,1]$.
Note that there was nothing special about the time horizon [0, 1] in the above
discussion. Indeed, given any time horizon [s, t] of interest, it suffices to apply the
same argument to the process $(X_{s+\tau(t-s)} : 0 \le \tau \le 1)$. Since variation norms are
conveniently invariant under reparametrisation, (10.7) translates immediately to an
estimate of the form
$$ E\left[ \Big( \int_s^t X_{s,r}\, d\tilde X_r \Big)^2 \right] \le C\, \|R\|_{\varrho;[s,t]^2}\, \|\tilde R\|_{\varrho;[s,t]^2} , \tag{10.12} $$
first for the approximating Riemann–Stieltjes sums and then for their $L^2$-limits.
and then also (the algebraic conditions (2.1) and (2.5) leave no other choice!)
$$ \mathbb{X}^{i,i}_{s,t} := \frac12 \big( X^i_{s,t} \big)^2 \quad \text{and} \quad \mathbb{X}^{j,i}_{s,t} := -\mathbb{X}^{i,j}_{s,t} + X^i_{s,t} X^j_{s,t} . \tag{10.14} $$
Then, the following properties hold:
a) For every q ∈ [1, ∞) there exists $C_1 = C_1(q, \varrho, d, T)$ such that for all $0 \le s \le t \le T$,
$$ E\big( |X_{s,t}|^{2q} + |\mathbb{X}_{s,t}|^{q} \big) \le C_1 M^q |t - s|^{q/\varrho} . \tag{10.15} $$
b) There exists a continuous modification of $\mathbb{X}$, denoted by the same letter from here
on. Moreover, for any α < 1/(2ϱ) and q ∈ [1, ∞) there exists $C_2 = C_2(q, \varrho, d, \alpha)$
such that
$$ E\big( \|X\|_\alpha^{2q} + \|\mathbb{X}\|_{2\alpha}^{q} \big) \le C_2 M^q . \tag{10.16} $$
c) For any $\alpha < \frac{1}{2\varrho}$, with probability one, the pair $(X, \mathbb{X})$ satisfies conditions (2.1),
(2.3) and (2.5). In particular, for $\varrho \in [1, \frac32)$ and any $\alpha \in (\frac13, \frac{1}{2\varrho})$ we have
$(X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ almost surely.
(Here, $\varrho_\alpha(\mathbf{X}, \mathbf{Y})$ denotes the α-Hölder rough path distance between $\mathbf{X} = (X, \mathbb{X})$
and $\mathbf{Y} = (Y, \mathbb{Y})$ in $\mathscr{C}_g^\alpha$.)
Proof. By scaling we may without loss of generality assume M = 1. As for a) we
note (again) that equivalence of $L^q$- and $L^2$-norm on Wiener–Itô chaos allows us to
reduce our discussion to q = 2. The first level estimate being easy, we focus on
the second level estimate; to this end fix $i \ne j$. Since $L^2$-convergence implies a.s.
convergence along a subsequence, there exists $(\mathcal{P}_n)$, with mesh tending to zero, along
which we can use Fatou’s lemma to estimate
$$ \begin{aligned} E\big| \mathbb{Y}^{i,j}_{s,t} - \mathbb{X}^{i,j}_{s,t} \big|^2 &= E \lim_{n\to\infty} \left| \int_{\mathcal{P}_n} Y^i_{s,r}\, dY^j_r - X^i_{s,r}\, dX^j_r \right|^2 \\ &\le \liminf_{n} E \left| \int_{\mathcal{P}_n} Y^i_{s,r}\, dY^j_r - X^i_{s,r}\, dX^j_r \right|^2 \\ &\le \sup_{\mathcal{P}} E \left| \int_{\mathcal{P}} Y^i_{s,r}\, dY^j_r - X^i_{s,r}\, dX^j_r \right|^2 . \end{aligned} $$
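where we estimate the second moment of each term on the right hand side by the
respective variation norms of the covariances; e.g.
$$ E \left| \int_{\mathcal{P}} Y^i_{s,r}\, d(Y - X)^j_r \right|^2 \le C \|R_{Y^i}\|_{\varrho;[s,t]^2}\, \|R_{Y^j - X^j}\|_{\varrho;[s,t]^2} \le C \varepsilon^2 |t - s|^{\frac{2}{\varrho}} . $$
$$ E\big| \mathbb{Y}^{i,i}_{s,t} - \mathbb{X}^{i,i}_{s,t} \big|^2 = \frac14 E\Big| \big(Y^i_{s,t}\big)^2 - \big(X^i_{s,t}\big)^2 \Big|^2 = \frac14 E\Big[ \big(Y^i_{s,t} - X^i_{s,t}\big)^2 \big(Y^i_{s,t} + X^i_{s,t}\big)^2 \Big] , $$
then conclude with Cauchy–Schwarz.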
Regarding b), given the pointwise $L^q$-estimates as stated in a), the $L^q$-estimates
for $\|X - Y\|_\alpha$ and $\|\mathbb{Y} - \mathbb{X}\|_{2\alpha}$ are obtained from Theorem 3.3. The last statement
is then an immediate consequence of the definition of $\varrho_\alpha$. □
Corollary 10.6. As above, let $(X, Y) = (X^1, Y^1, \ldots, X^d, Y^d)$ be a centred
continuous Gaussian process such that $(X^i, Y^i)$ is independent of $(X^j, Y^j)$ when $i \ne j$.
Assume that there exist $\varrho \in [1, \frac32)$ and M ∈ (0, ∞) such that
$$ \|R_{(X,Y)}\|_{\varrho;[s,t]^2} \le M |t - s|^{1/\varrho} \quad \forall\, 0 \le s \le t \le T. \tag{10.18} $$
Proof. At the price of replacing (X, Y) by the rescaled process $M^{-1/2}(X, Y)$ we
may take M = 1. (The concluding $L^q$-estimate on $\varrho_\alpha(M^{-1/2}\mathbf{X}, M^{-1/2}\mathbf{Y})$ is then
readily translated into an estimate on $\varrho_\alpha(\mathbf{X}, \mathbf{Y})$, given that we allow the final constant
to depend on M.) Assumption (10.18) then spells out precisely to
$$ \|R_{X^i}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho} , \qquad \|R_{Y^i}\|_{\varrho;[s,t]^2} \le |t - s|^{1/\varrho} $$
$$ \eta := \max\{ \|R_{X^i - Y^i}\|_{\infty;[0,T]^2} : 1 \le i \le d \} $$
for any given $\theta \in (0, \frac12 - \varrho\alpha)$. At last, take $i_* \in \{1, \ldots, d\}$ as the arg max in the
Lemma 10.8. Assume that $\sigma^2(\cdot)$ is concave on [0, h] for some h > 0. Then, one
has non-positive correlation of non-overlapping increments in the sense that, for
$0 \le s \le t \le u \le v \le h$,
$$ E[X_{s,t} X_{u,v}] = R\begin{pmatrix} s, t \\ u, v \end{pmatrix} \le 0. $$
The first claim now easily follows from concavity, cf. [MR06, Lemma 7.2.7].
To show the second bound, note that $X_{s,t} X_{u,v} = (a + b + c)b$ where $a = X_{s,u}$,
$b = X_{u,v}$, and $c = X_{v,t}$. Applying the algebraic identity
$$ 2(a + b + c)b = (a + b)^2 - a^2 + (c + b)^2 - c^2 $$
where we used that $\sigma^2(\cdot)$ is non-decreasing. On the other hand, using $(a + b + c)b = b^2 + ab + cb$
and the non-positive correlation of non-overlapping increments, we
have
$$ E[X_{s,t} X_{u,v}] = E\big[X_{u,v}^2\big] + E[X_{s,u} X_{u,v}] + E[X_{v,t} X_{u,v}] \le E\big[X_{u,v}^2\big] , $$
$$ |\sigma^2(\tau)| \le L |\tau|^{1/\varrho} . $$
for all intervals [s, t] with length |t − s| ≤ h and some M = M (%, L) > 0.
Proof. Consider some interval [s, t] with length |t − s| ≤ h. The proof relies on
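separating “diagonal” and “off-diagonal” contributions. Let $D = \{t_i\}$, $D' = \{t'_j\}$ be
two dissections of [s, t]. For fixed i, we have
$$ \begin{aligned} 3^{1-\varrho} \sum_{t'_j\in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho} &\le 3^{1-\varrho}\, \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[s,t]} \\ &\le \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[s,t_i]} + \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]} \\ &\quad + \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_{i+1},t]} . \end{aligned} \tag{10.21} $$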
≤ 2σ 2 (ti+1 − ti ) .
The third term is bounded analogously. For the middle term in (10.21) we estimate
$$ \big\| E X_{t_i,t_{i+1}} X_{\cdot} \big\|^{\varrho}_{\varrho\text{-var};[t_i,t_{i+1}]} = \sup_{D'} \sum_{t'_j\in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho} $$
where we used the second estimate of Lemma 10.8 for the penultimate bound and
the assumption on $\sigma^2$ for the last bound. Using these estimates in (10.21) yields
$$ \sum_{t'_j\in D'} \big| E X_{t_i,t_{i+1}} X_{t'_j,t'_{j+1}} \big|^{\varrho} \le C |t_{i+1} - t_i| , $$
and (10.20) follows by summing over $t_i$ and taking the supremum over all dissections
of [s, t]. □
Corollary 10.10. Let $X = (X^1, \ldots, X^d)$ be a centred continuous Gaussian process
with independent components such that each $X^i$ satisfies the assumption of the
previous theorem, with common values of h, L and ϱ ∈ [1, 3/2). Then X, restricted
to any interval [0, T], lifts to $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha([0, T], \mathbf{R}^d)$.
Proof. Set $I_n = [(n - 1)h, nh]$ so that $[0, T] \subset I_1 \cup I_2 \cup \cdots \cup I_{[T/h]+1}$. On each
interval $I_n$, we may apply Theorem 10.4 to lift $X_n := X|_{I_n}$ to a (random) rough
path $\mathbf{X}_n \in \mathscr{C}_g^\alpha(I_n, \mathbf{R}^d)$. The concatenation of $\mathbf{X}_1, \mathbf{X}_2, \ldots$ then yields the desired
rough path lift on [0, T]. □
Example 10.11 (Fractional Brownian motion). Clearly, d-dimensional fractional
Brownian motion $B^H$ with Hurst parameter $H \in (\frac13, \frac12]$ satisfies the assumptions of
the above theorem / corollary for all components with
$$ \sigma^2(u) = u^{2H} , $$
obviously non-decreasing and concave for $H \le \frac12$ and on any time interval [0, T].
This also identifies
$$ \varrho = \frac{1}{2H} $$
and $\varrho < \frac32$ translates to $H > \frac13$ in which case we obtain a canonical geometric rough
path $\mathbf{B}^H = (B^H, \mathbb{B}^H)$ associated to fBm. In fact, a canonical “level-3” rough path
$\mathbf{B}^H$ can be constructed as long as $\varrho < \varrho^* = 2$, corresponding to H > 1/4 but this
requires level-3 considerations which we do not discuss here (see [FV10b, Ch.15]).
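The claims of this example can be checked numerically. The sketch below uses the standard covariance of scalar fBm, $E[B^H_s B^H_t] = \frac12(s^{2H} + t^{2H} - |t-s|^{2H})$, which is not written out in the text above and is stated here as an assumption. The function names are likewise hypothetical.

```python
def fbm_cov(s, t, H):
    """Covariance E[B^H_s B^H_t] of scalar fractional Brownian motion (standard formula)."""
    return 0.5 * (abs(s) ** (2 * H) + abs(t) ** (2 * H) - abs(t - s) ** (2 * H))

def fbm_rect_increment(s, t, u, v, H):
    """E[B^H_{s,t} B^H_{u,v}], obtained by expanding the product of increments."""
    return fbm_cov(t, v, H) - fbm_cov(t, u, H) - fbm_cov(s, v, H) + fbm_cov(s, u, H)

def rho_from_hurst(H):
    """Variation parameter rho = 1 / (2H) of the covariance."""
    return 1.0 / (2.0 * H)
```

With H = 0.4 the increment variance over an interval of length u is $u^{0.8}$, increments over $[0,1]$ and $[1,2]$ are negatively correlated as in Lemma 10.8, and ϱ = 1.25 lies below 3/2.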
Example 10.12 (Ornstein–Uhlenbeck process). Consider the d-dimensional (stationary)
OU process, consisting of i.i.d. copies of a scalar Gaussian process X with
covariance
$$ E[X_s X_t] = K(|t - s|) , \qquad K(u) = \exp(-cu) , $$
where c > 0 is fixed. Note that $\sigma^2(u) = E X_{t,t+u}^2 = E X_{t+u}^2 + E X_t^2 - 2 E[X_t X_{t+u}] = 2[K(0) - K(u)] = 2(1 - \exp(-cu))$,
so that $\sigma^2(u)$ is indeed increasing and concave.
One also has the bound $\sigma^2(u) = 2(1 - \exp(-cu)) \le 2cu$, which shows that the
assumptions of the above corollary are satisfied with ϱ = 1, L = 2c and arbitrary
h > 0.
10.4 Exercises
Use a Borel–Cantelli argument to show that, also for any θ < 1/2 − α,
$$ \|B - B^n\|_\alpha + \|\mathbb{B} - \mathbb{B}^n\|_{2\alpha} \le C(\omega) \frac{1}{n^\theta} . $$
When $\alpha \in (\frac13, \frac12)$ we can conclude convergence in α-Hölder rough path metric, i.e.
$$ \varrho_\alpha\big( (B, \mathbb{B}), (B^n, \mathbb{B}^n) \big) \to 0 , $$
Exercise 10.15. Let (B, B̃) be a 2-dimensional standard Brownian motion. The
(Gaussian) process given by
X = (Bt , Bt + B̃t )
fails to have independent components and yet lifts to a Gaussian rough path. Explain
how and detail the construction.
Exercise 10.16. Assume R(s, t) = K(|t − s|) for some $C^2$-function K. (This was
exactly the situation in the above Ornstein–Uhlenbeck case, Example 10.12.) Give a
direct proof that R has finite 2-dimensional 1-variation, more precisely,
This remains true when the mixed derivative is a signed measure, which in turn is the
case when R(s, t) = K(|t − s|) for some $C^2$-function K. Indeed, write H and 2δ
for the first and second distributional derivatives of | · |. Formal application of the chain-rule gives
$\partial_t R = K'(|t - s|) H(t - s)$ and then, using |H| ≤ 1 a.s.,
$$ \big| \partial^2_{s,t} R(s, t) \big| \le |K''(|t - s|)| + 2 |K'(|t - s|)|\, \delta(t - s). $$
Integration again over $[s, t]^2 \subset [0, T]^2$ yields
$$ \|R\|_{1\text{-var};[s,t]^2} = \int_{[s,t]^2} \big| \partial^2_{u,v} R(u, v) \big|\, du\, dv \le \big( T |K''|_\infty + 2 |K'(0)| \big) |t - s| . $$
This is easily made rigorous by replacing | · | (and then H, 2δ) by a mollified version,
say $|\cdot|_\varepsilon$ (and $H_\varepsilon$, $2\delta_\varepsilon$), noting that variation-norms behave in a lower-semi-continuous
fashion under pointwise limits; that is
$$ \sum_{i,j} \left| R\begin{pmatrix} t_{i-1}, t_i \\ t'_{j-1}, t'_j \end{pmatrix} \right| = \lim_{\varepsilon\to 0} \sum_{i,j} \left| R^\varepsilon\begin{pmatrix} t_{i-1}, t_i \\ t'_{j-1}, t'_j \end{pmatrix} \right| \le \liminf_{\varepsilon\to 0} \|R^\varepsilon\|_{1\text{-var};[u,v]^2} . $$
Exercise 10.18. Assume $X = (X^1, \ldots, X^d)$ is a centred, continuous Gaussian
process with independent components.
(i) Assume covariance of finite ϱ-variation with ϱ < 2. Show that each component
$X = X^i$, for i = 1, . . . , d, has almost surely vanishing compensated quadratic
variation on [0, T] by which we mean
$$ \lim_{n\to\infty} \sum_{[s,t]\in\mathcal{P}_n} \big( X_{s,t}^2 - E(X_{s,t}^2) \big) = 0 , $$
in probability (and $L^q$, any q < ∞) for any sequence of partitions $(\mathcal{P}_n)$ of [0, T]
with mesh $|\mathcal{P}_n| \to 0$.
(ii) Under the assumptions of (i), show that there exists $(\mathcal{P}_n)$ with $|\mathcal{P}_n| \to 0$ so
that, with probability one, the quadratic (co)variation $[X^i, X^j]$, in the sense of
Definition 5.8, vanishes, for any $i \ne j$, with i, j ∈ {1, . . . , d}.
Conclude that, with regard to Theorem 10.4, the off-diagonal elements $\mathbb{X}^{i,j}_{s,t}$,
defined as the $L^2$-limit of left-point Riemann–Stieltjes sums, could have been
equivalently defined via mid- or right-point Riemann sums.
(iii) Assume ϱ = 1. Show that, for all i = 1, . . . , d, there exists a sequence $(\mathcal{P}_n)$ with
mesh $|\mathcal{P}_n| \to 0$ so that, with probability one, the quadratic variation $[X^i, X^i]$, in the
sense of Definition 5.8, exists and equals
$$ [X^i]_t := \lim_{\varepsilon\to 0} \sup_{|\mathcal{P}|<\varepsilon} \sum_{\substack{[u,v]\in\mathcal{P}\\ u<t}} E\Big[ \big( X^i_{u,v} \big)^2 \Big] . $$
Verify that ϱ = 1 and compute [X]. (This example is related to the stochastic heat
equation, where s, t should be thought of as spatial variables; cf. Lemma 12.17.)
Solution 10.19. (i) Using Wick’s formula for the expectation of products of centred
Gaussians, namely
On the other hand, at the price of passing to another subsequence also denoted by
$\tilde{\mathcal{P}}_n$, we have
$$ \sup_{t\in[0,T]} \Bigg| \sum_{\substack{[u,v]\in\tilde{\mathcal{P}}_n\\ u<t}} \big( X_{u,v}^2 - E(X_{u,v}^2) \big) \Bigg| \to 0 \quad \text{almost surely,} $$
(iv) One has $E(X_{s,t}^2) = \cosh(-\pi) - \cosh(|t - s| - \pi) = \sinh(\pi) |t - s| + o(|t - s|)$
and so $[X]_t = t \sinh(\pi)$.
Exercise 10.20. Assume finite 1-variation of the covariance (as e.g. defined in (10.5))
of a zero-mean Gaussian process X and $E[X_{t,t+h}^2] = f(t)h + o(h)$ as h ↓ 0, for
some f ∈ C([0, T], R). Show that, for every smooth test function ϕ,
$$ \int_0^T \varphi(t) \frac{X_{t,t+h}^2}{h}\, dt \to \int_0^T \varphi(t) f(t)\, dt \quad \text{as } h \to 0, $$
where the convergence takes place in $L^q$ for any q < ∞ (and hence also in probability).
Solution 10.21. Since all types of $L^q$-convergence are equivalent on the finite
Wiener–Itô chaos (here we only need the chaos up to level 2), it suffices to consider
q = 2. A dissection $(t_k)$ of [0, T] is given by $t_k = kh \wedge T$. We have
$$ \sum_k \frac1h \int_{t_k}^{t_{k+1}} \varphi(t) X_{t,t+h}^2\, dt = \int_0^1 d\theta \sum_k \varphi(t_k + \theta h)\, X^2_{t_k+\theta h,\, t_k+\theta h+h} \equiv \int_0^1 \langle \varphi, \mu_{\theta,h} \rangle\, d\theta , $$
where the random measure $\mu_{\theta,h} := \sum_k \delta_{t_k+\theta h}\, X^2_{t_k+\theta h,\, t_k+\theta h+h}$ acts on test functions
uniformly in θ ∈ [0, 1], t ∈ [0, T]. On the other hand, the Gaussian (or Wick) identity
$E[A^2 B^2] - E[A^2] E[B^2] = 2 (E[AB])^2$, applied with $A = X_{t_k+\theta h,\, t_k+\theta h+h}$ and
$B = X_{t_j+\theta h,\, t_j+\theta h+h}$, gives
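$$ E\big| F(t) - \bar F(t) \big|^2 = E F^2(t) - \bar F^2(t) = 2 \sum_{\substack{k:\, t_k+\theta h\le t\\ j:\, t_j+\theta h\le t}} \left| R_X\begin{pmatrix} t_k + \theta h,\ t_k + \theta h + h \\ t_j + \theta h,\ t_j + \theta h + h \end{pmatrix} \right|^2 \lesssim \operatorname{osc}\big( R^{2-\varrho}; h \big) \to 0 \quad \text{as } h \to 0 , $$
in $L^2$, again uniformly in t and θ. Now, for fixed smooth ϕ, one has the bound
$$ \left| \int \varphi(t)\, \mu_{\theta,h}(dt) - \int \varphi(t) f(t)\, dt \right|^2 = \left| \int \Big( \int_0^t f(s)\, ds - \mu_{\theta,h}([0, t]) \Big) \dot\varphi(t)\, dt \right|^2 \lesssim \int_0^1 \left| \int_0^t f(s)\, ds - \mu_{\theta,h}([0, t]) \right|^2 dt $$
and so
$$ E \left| \int \varphi(t)\, \mu_{\theta,h}(dt) - \int \varphi(t) f(t)\, dt \right|^2 \lesssim \int_0^1 E \left| \int_0^t f(s)\, ds - \mu_{\theta,h}([0, t]) \right|^2 dt . $$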
10.5 Comments
Classes of Gaussian processes which admit (canonical) lifts to random rough paths
were first studied by Coutin–Qian [CQ02], with focus on fBm with Hurst parameter
H > 1/4. Ledoux, Qian and Zhang [LQZ02] used Gaussian techniques to establish
large deviation and support results for the Brownian rough path; extensions to fractional
Brownian motion were investigated by Millet–Sanz-Solé [MSS06], Feyel and de
la Pradelle [FdLP06], and Friz–Victoir [FV07, FV06a]. When H ≤ 1/4, there is no
canonical rough path lift: as noted in [CQ02], the L2 -norm of the area associated
to piecewise linear approximations to fBm diverges. See however the works of
Unterberger and then Nualart–Tindel [Unt10, NT11].
The notion of two-dimensional %-variation of the covariance, as adopted in this
chapter, is due to Friz–Victoir, [FV10a], [FV10b, Ch.15], [FV11], and allows for an
elegant and general construction of Gaussian rough paths. It also leads naturally to
useful Cameron–Martin embeddings, see Section 11.1. If restricted to the “diagonal”,
%-variation of the covariance relates to a classical criterion of Jain–Monrad [JM83].
The question remains how one checks finite %-variation when faced with a non-
trivial (and even non-explicit, e.g. given as Fourier series) covariance function. A
general criterion based on a certain covariance measure structure (reminiscent of
Kruk, Russo and Tudor [KRT07]) was recently given by Friz, Gess, Gulisashvili and
Riedel [FGGR13], a special case of which is the “concavity criterion” of Theorem
10.9.
Chapter 11
Cameron–Martin regularity and applications
where $X \in C([0, T], \mathbf{R}^d)$, say, and the supremum is taken over all partitions of [0, T].
The 1-variation (p = 1) of such a path is of course nothing but its length, possibly
+∞. Hölder regularity implies variation regularity; one has the immediate estimate
$$ \|X\|_{p\text{-var};[0,T]} \le T^\alpha \|X\|_{\alpha;[0,T]} . $$
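For a path known only at finitely many sample times, the supremum over partitions can be computed exactly by dynamic programming over the sample grid. The following is a minimal sketch of that idea with quadratic cost in the number of samples. The function name `p_variation` is an assumption, and the result is the p-variation of the sampled path only, which bounds that of the underlying continuous path from below.

```python
import numpy as np

def p_variation(path, p):
    """p-variation norm of a sampled path, sup over sub-partitions of the sample grid.

    path: array of shape (n,) or (n, d) with the sampled values
    """
    x = np.asarray(path, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n = len(x)
    # best[j] = sup over partitions of the first j+1 samples of sum |increment|^p
    best = np.zeros(n)
    for j in range(1, n):
        incr = np.linalg.norm(x[j] - x[:j], axis=1) ** p
        best[j] = np.max(best[:j] + incr)
    return best[-1] ** (1.0 / p)
```

For example, the samples 0, 1, 2 have 2-variation norm 2 (the coarsest partition is optimal), while 0, 1, 0 has 2-variation norm $\sqrt{2}$ and 1-variation norm 2 (its length).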
is satisfied.
We are now interested in the regularity of Cameron–Martin paths. As in the
last section, X is an $\mathbf{R}^d$-valued, continuous and centred Gaussian process on [0, T],
realized as $X(\omega) = \omega \in C([0, T], \mathbf{R}^d)$, a Banach space under the uniform norm,
$$ Z = \sum_{i=1}^d \int_0^T g^i_s\, dB^i_s \equiv \int_0^T \langle g, dB \rangle . $$
By Itô’s isometry, $h^i_t := E[Z B^i_t] = \int_0^t g^i_s\, ds$ so that $\dot h = g$ and
$\|h\|_H^2 := E[Z^2] = \int_0^T |g_s|^2\, ds = \|\dot h\|^2_{L^2}$ where | · | denotes Euclidean norm on $\mathbf{R}^d$. Clearly, h is of finite
1-variation, and its length is given by $\|\dot h\|_{L^1}$. On the other hand, Cauchy–Schwarz
shows any h ∈ H is 1/2-Hölder which, in general, “only” implies 2-variation.
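The proposition below applies to Brownian motion with ϱ = 1, also recalling that
$\|R\|_{1;[s,t]^2} = |t - s|$ in the Brownian motion case.
¹ The case ϱ = 1 may be seen directly by taking $\beta_j = \operatorname{sgn} h_{t_j,t_{j+1}}$.
$$ \begin{aligned} &\le \sup_{\beta,\, |\beta|_{\ell^{\varrho'}}\le 1} \sqrt{ \sum_{j,k} \big\langle \beta_j \otimes \beta_k ,\, E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big\rangle } \\ &\le \sup_{\beta,\, |\beta|_{\ell^{\varrho'}}\le 1} \sqrt{ \Big( \sum_{j,k} |\beta_j|^{\varrho'} |\beta_k|^{\varrho'} \Big)^{\frac{1}{\varrho'}} \Big( \sum_{j,k} \big| E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big|^{\varrho} \Big)^{\frac{1}{\varrho}} } \\ &\le \Big( \sum_{j,k} \big| E\big( X_{t_j,t_{j+1}} \otimes X_{t_k,t_{k+1}} \big) \big|^{\varrho} \Big)^{1/(2\varrho)} \le \sqrt{ \|R\|_{\varrho\text{-var};[s,t]^2} } . \end{aligned} $$
The proof is then completed by taking the supremum over all dissections $(t_j)$ of [0, t]. □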
Remark 11.3. It is typical (e.g. for Brownian or fractional Brownian motion, with
ϱ = 1/(2H) ≥ 1) that
$$ \forall s < t \text{ in } [0, T] : \quad \|R\|_{\varrho\text{-var};[s,t]^2} \le M |t - s|^{1/\varrho} . $$
In such a situation, Proposition 11.2 implies that
$$ |h_{s,t}| \le \|h\|_{\varrho\text{-var};[s,t]} \le \|h\|_H\, M^{1/2} |t - s|^{1/(2\varrho)} , $$
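The homogeneous p-variation rough path norm over [0, T] is then given by
$$ |||\mathbf{X}|||_{p\text{-var};[0,T]} \overset{\mathrm{def}}{=} |||\mathbf{X}|||_{p\text{-var}} \overset{\mathrm{def}}{=} \|X\|_{p\text{-var}} + \sqrt{ \|\mathbb{X}\|_{p/2\text{-var}} } . \tag{11.4} $$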
Of course, a geometric rough path of finite p-variation, X ∈ Cgp-var is one for which
the “first order calculus” condition (2.5) holds.
The following results will prove crucial in Section 11.2 where we will derive,
based on the Gaussian isoperimetric inequality, good probabilistic estimates on
Gaussian rough path objects. They are equally crucial for developing the Malliavin
calculus for (Gaussian) rough differential equations in Section 11.3.
Recall from Exercise 2.19 that the translation of a rough path $\mathbf{X} = (X, \mathbb{X})$ in
direction h is given by
$$ T_h(\mathbf{X}) \overset{\mathrm{def}}{=} \big( X^h, \mathbb{X}^h \big) \tag{11.5} $$
where $X^h := X + h$ and
$$ \mathbb{X}^h_{s,t} := \mathbb{X}_{s,t} + \int_s^t h_{s,r} \otimes dX_r + \int_s^t X_{s,r} \otimes dh_r + \int_s^t h_{s,r} \otimes dh_r , \tag{11.6} $$
provided that h is sufficiently regular to make the final three integrals above well-defined.
Lemma 11.4. i) Let $\mathbf{X} \in \mathscr{C}_g^{p\text{-var}}([0, T], \mathbf{R}^d)$, with p ∈ [2, 3) and consider a function
$h \in C^{q\text{-var}}([0, T], \mathbf{R}^d)$ with complementary Young regularity in the sense that
² From Remark 11.3, $\|h\|_{\varrho,\alpha} \lesssim \|h\|_H$ for all $\alpha \le \frac{1}{2\varrho}$.
Let $\alpha \in (\frac13, \frac{1}{2\varrho}]$ and $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}^\alpha([0, T], \mathbf{R}^d)$ a.s. be the random Gaussian
rough path constructed in Theorem 10.4. Then there exists a null set N such that for
every $\omega \in N^c$ and every h ∈ H,
$$ T_h(\mathbf{X}(\omega)) = \mathbf{X}(\omega + h) . $$
Proof. Note that complementary Young regularity holds, with $p = \frac{1}{\alpha} < 3$ and
$q = \varrho < \frac32$, as is seen from $\frac1p + \frac1q > \frac13 + \frac23 = 1$. As a consequence of Lemma 11.4,
the translation $T_h(\mathbf{X}(\omega))$ is well-defined whenever $\mathbf{X}(\omega) \in \mathscr{C}^\alpha$. The proof requires
a close look at the precise construction of $\mathbf{X}(\omega) = (X(\omega), \mathbb{X}(\omega))$ in Theorem 10.4,
using Kolmogorov’s criterion to build a suitable (continuous, and then Hölder)
modification from $\mathbb{X}$ restricted to dyadic times. We recall that $X(\omega) = \omega \in C([0, T], \mathbf{R}^d)$.
Let $N_1$ be the null set of ω where X(ω) fails to be of α-Hölder (or p-variation)
regularity. Note that $\omega \in N_1^c$ implies $\omega + h \in N_1^c$ for all h ∈ H. By the very
construction of $\mathbb{X}_{s,t}$ as an $L^2$-limit, for fixed s, t there exists a sequence of partitions
$(\mathcal{P}^m)$ of [s, t] such that $\mathbb{X}_{s,t}(\omega) = \lim_m \int_{\mathcal{P}^m} X \otimes dX$ exists for a.e. ω, and we write
$N_{2;s,t}$ for the null set on which this fails. The union of all these, for dyadic
times s, t, is again a null set, denoted by $N_2$. Now take $\omega \in N_1^c \cap N_2^c$. For fixed
dyadic s, t, consider the aforementioned partitions $(\mathcal{P}^m)$ and note
$$ \int_{\mathcal{P}^m} X(\omega + h) \otimes dX(\omega + h) = \int_{\mathcal{P}^m} X(\omega) \otimes dX(\omega) + \int_{\mathcal{P}^m} h \otimes dX + \int_{\mathcal{P}^m} X \otimes dh + \int_{\mathcal{P}^m} h \otimes dh . $$
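The displayed identity is nothing but bilinearity of the Riemann sums and holds for every fixed partition. A short numerical sketch, using the hypothetical helper `riemann_tensor_sum` and randomly generated sample paths purely as test data, illustrates this:

```python
import numpy as np

def riemann_tensor_sum(x, y):
    """Left-point sum over the sample grid of x_{t_0,t_k} (tensor) y_{t_k,t_{k+1}}.

    x, y: arrays of shape (n+1, d) holding two paths sampled on the same partition.
    Returns a d x d matrix.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    left = x[:-1] - x[0]        # increments x_{t_0, t_k}
    dy = np.diff(y, axis=0)     # increments y_{t_k, t_{k+1}}
    return left.T @ dy          # sum over k of outer(left[k], dy[k])
```

Applying the function to x + h in both slots gives the same matrix as summing the four terms obtained from the pairs (x, x), (h, x), (x, h) and (h, h).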
for the cumulative distribution function of a standard Gaussian, noting the elementary
tail estimate
$$ \bar\Phi(y) := 1 - \Phi(y) \le \exp\big( -y^2/2 \big) , \quad y \ge 0. $$
Theorem 11.6 (Borell’s inequality). Let (E, H, µ) be an abstract Wiener space and
A ⊂ E a measurable Borel set with µ(A) > 0 so that
$$ A_a = \{ x : g(x) \le a \} $$
Assume furthermore that there exists a null-set N such that for all $x \in N^c$, h ∈ H :
Then f has a Gaussian tail. More precisely, for all r > a and with $\bar a := \hat a - a/\sigma$,
Proof. Note that $\mu(A_a) > 0$ implies $\hat a = \Phi^{-1}(\mu(A_a)) > -\infty$. We have for all
$x \notin N$ and arbitrary r, M > 0 and $h \in r\mathcal{K}$,
Example 11.8 (Classical Fernique estimate). Take $f(x) = g(x) = \|x\|_E$. Then the
assumptions of the generalized Fernique Theorem are satisfied with σ equal to the
operator norm of the continuous embedding $H \hookrightarrow E$. This applies in particular to
Wiener measure on $C([0, T], \mathbf{R}^d)$.
Remark 11.10. Recall that the homogeneous “norm” $|||\mathbf{X}|||_\alpha$ was defined in (2.4) as
the sum of $\|X\|_\alpha$ and $\sqrt{\|\mathbb{X}\|_{2\alpha}}$. Since $\mathbb{X}$ is “quadratic” in X (more precisely: in the
second Wiener–Itô chaos), the square root is crucial for the Gaussian estimate (11.9)
to hold.
Proof. Combining Theorem 11.5 with Lemma 11.4 and Proposition 11.2 shows that
for a.e. ω and all h ∈ H
$$ |||\mathbf{X}(\omega)|||_\alpha \le C \big( |||\mathbf{X}(\omega - h)|||_\alpha + M^{1/2} \|h\|_H \big) . $$
We can thus apply the generalized Fernique Theorem with $f(\omega) = |||\mathbf{X}|||_\alpha(\omega)$ and
$g(\omega) = C f(\omega)$, noting that $|||\mathbf{X}|||_\alpha(\omega) < \infty$ almost surely implies that
$$ A_a \overset{\mathrm{def}}{=} \{ x : g(x) \le a \} $$
has positive probability for a large enough (and in fact, any a > 0 thanks to a
support theorem for Gaussian rough paths, [FV10b]). Gaussian integrability of the
homogeneous rough path norm, for a fixed Gaussian rough path $\mathbf{X}$, is thus established.
The claimed uniformity, η = η(M, T, α, ϱ) and not depending on the particular $\mathbf{X}$
under consideration, requires an additional argument. We need to make sure that
$\mu(A_a)$ is uniformly positive over all $\mathbf{X}$ with given bounds on the parameters (in
particular M, ϱ, α, d); but this is easy, using (10.16),
$$ \mu\big( |||\mathbf{X}|||_\alpha \le a \big) \ge 1 - \frac{1}{a^2} E |||\mathbf{X}|||^2_\alpha \ge 1 - \frac{1}{a^2} C , $$
where C = C(M, ϱ, α, d) and so, say, $a = \sqrt{2C}$ would do. □
The price of a pathwise integration / SDE theory is that all estimates (have to) deal
with the worst possible scenario. To wit, given $\mathbf{X} = (X, \mathbb{X}) \in \mathscr{C}_g^\alpha$ and a nice 1-form,
$F \in C_b^2$ say, we had the estimate
$$ \left| \int_0^T F(X)\, d\mathbf{X} \right| \le C \Big( |||\mathbf{X}|||_{\alpha;[0,T]} \vee |||\mathbf{X}|||^{1/\alpha}_{\alpha;[0,T]} \Big) , $$
reparametrisation. For the same reason, the integration domain [0, T] in (11.10) may
be replaced by any other interval.
Example 11.11. The estimate (11.10) is sharp, at least when p = 1/α = 2, in the
following sense. Consider the (“pure-area”) rough path given by
$$ t \mapsto (0, At) , \qquad A = \begin{pmatrix} 0 & c \\ -c & 0 \end{pmatrix} , $$
for some c > 0. The homogeneous (p-variation, or α-Hölder) rough path norm here
scales with $c^{1/2}$. Hence, the right-hand side of (11.10) scales like c (for c large), as
does the left-hand side which in fact is given by $T |DF(0)A|$.
$$ |||\mathbf{X}|||_{p\text{-var};[\tau_i,\tau_{i+1}]} = 1, $$
i.e. for all but the very last interval for which one has $|||\mathbf{X}|||_{p\text{-var};[\tau_N,\tau_{N+1}]} \le 1$. One
can then exploit rough path estimates such as (11.10) on (small) intervals $[\tau_i, \tau_{i+1}]$
on which estimates are linear in $|||\mathbf{X}|||_{p\text{-var}} \sim 1$. The problem of estimating rough
integrals is thus reduced to estimating $N = N(\mathbf{X})$ and it was a key technical result
in [CLL13] to use Borell’s inequality to establish good (probabilistic) estimates on
N when $\mathbf{X} = \mathbf{X}(\omega)$ is a Gaussian rough path. (Our proof below is different from
[CLL13] and makes good use of the generalized Fernique estimate.)
To formalize this construction, we fix a (1D) control function w = w(s, t), i.e.
a continuous map on {0 ≤ s ≤ t ≤ T}, super-additive, continuous and zero on the
⁵ The construction is purely deterministic. Of course, when $\mathbf{X} = \mathbf{X}(\omega)$ is random, then so is the
partition.
so that $w(\tau_i, \tau_{i+1}) = \beta$ for all i < N, while $w(\tau_N, \tau_{N+1}) \le \beta$, where N is given
by
$$ N(w) \equiv N_\beta(w; [0, T]) := \sup\{ i \ge 0 : \tau_i < T \}. $$
As immediate consequence of super-additivity of controls,
N
X −1
βNβ (w; [0, T ]) = w(τi , τi+1 ) ≤ w(0, τN ) ≤ w(0, τN +1 ) = w(0, T ).
i=0
Note also that N is monotone in w, i.e. w ≤ w̃ implies N (w) ≤ N (w̃). At last, let us
set $N(\mathbf X) = N(w_{\mathbf X})$. The following (purely deterministic) lemma is most naturally
stated in variation regularity.
Proof. (Riedel) It is easy to see that all $N_\beta, N_{\beta'}$, with $\beta, \beta' > 0$ are comparable; it
is therefore enough to prove the lemma for some fixed $\beta > 0$.
Given $h \in C^{q\text{-var}}$, $w_h(s,t) = |||h|||^q_{q\text{-var};[s,t]}$ is a control and so is $w_h^\theta$ whenever
$\theta \ge 1$. (Noting $1 \le q \le p$, we shall use this fact with $\theta = p/q$.) From Lemma 11.4
we have, for any interval $I$
$$(s,t) \mapsto |||T_h\mathbf X|||^p_{p\text{-var};[s,t]} \le C\Big(|||\mathbf X|||^p_{p\text{-var};[s,t]} + \|h\|^p_{q\text{-var};[s,t]}\Big) =: C\,\tilde w(s,t).$$
6
Do not confuse a control w with “randomness” ω.
7
Super-additivity, i.e. w(s, t) + w(t, u) ≤ w(s, u) whenever s ≤ t ≤ u, is immediate, but
continuity is non-trivial; see e.g. [FV10b, Prop. 5.8].
11.2 Concentration of measure 159
Replace $\mathbf X = T_hT_{-h}\mathbf X$ by $T_{-h}\mathbf X$ and then use elementary estimates of the type
$(a+b)^{1/q} \le (a^{1/q} + b^{1/q})$ for non-negative reals $a, b$, to obtain the claimed estimate
(11.12). □
The previous lemma, combined with variation regularity of Cameron–Martin
paths (Proposition 11.2) and the generalized Fernique Theorem 11.7 then gives
immediately
Theorem 11.13 (Cass–Litterer–Lyons). Let $\mathbf X = (X,\mathbb X) \in \mathscr C_g^\alpha$ a.s. be a Gaussian
rough path, as in Theorem 11.9. (In particular, the covariance is assumed to have
finite 2D $\varrho$-variation.) Then the integer-valued random variable
has a Weibull tail with shape parameter $2/\varrho$ (by which we mean that $N^{1/\varrho}$ has a
Gaussian tail).
Let us quickly illustrate how to use the above estimate.
Corollary 11.14. Let $\mathbf X$ be as in the previous theorem and assume $F \in \mathcal C_b^2$. Then the
random rough integral
$$Z(\omega) \overset{\text{def}}{=} \int_0^T F(X(\omega))\,d\mathbf X(\omega)$$
has a Weibull tail with shape parameter $2/\varrho$, by which we mean that $|Z|^{1/\varrho}$ has a
Gaussian tail.
Proof. Let (τi ) be the (random) partition associated to the p-variation of X(ω) as
defined in (11.11), with β = 1 and w = wX . Thanks to (11.10) we may estimate
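$$\begin{aligned}
\Big|\int_0^T F(X(\omega))\,d\mathbf X(\omega)\Big|
&\le \sum_{[\tau_i,\tau_{i+1}]\in\mathcal P}\Big|\int_{\tau_i}^{\tau_{i+1}} F(X(\omega))\,d\mathbf X(\omega)\Big| \\
&\lesssim (N(\omega)+1)\,\sup_i\Big(|||\mathbf X|||_{p\text{-var};[\tau_i,\tau_{i+1}]} \vee |||\mathbf X|||^p_{p\text{-var};[\tau_i,\tau_{i+1}]}\Big) \\
&= (N(\omega)+1),
\end{aligned}$$
where the proportionality constant may depend on $F$, $T$ and $\alpha \in (\tfrac13, \tfrac1{2\varrho})$. □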
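To get a feeling for what "Weibull tail with shape parameter $2/\varrho$" means, one can look at the model variable $N = |G|^\varrho$ with $G$ standard Gaussian, for which $N^{1/\varrho} = |G|$ has a Gaussian tail by construction. The sketch below is an illustration only (the function names and the value of `rho` are assumptions, not taken from the text); it compares the exact tail with the bound $\exp(-x^{2/\varrho}/2)$.

```python
import math


def weibull_type_tail(x, rho):
    """P(N > x) for the model variable N = |G| ** rho, G standard Gaussian."""
    # N > x  <=>  |G| > x ** (1 / rho), and P(|G| > y) = erfc(y / sqrt(2)).
    return math.erfc(x ** (1.0 / rho) / math.sqrt(2.0))


def gaussian_bound(x, rho):
    """Upper bound exp(-x ** (2 / rho) / 2), from P(|G| > y) <= exp(-y^2 / 2)."""
    return math.exp(-0.5 * x ** (2.0 / rho))


rho = 1.25   # illustrative value in [1, 3/2)
points = (1.0, 5.0, 20.0)
for x in points:
    print(x, weibull_type_tail(x, rho), gaussian_bound(x, rho))
```

The tail decays like $\exp(-c\,x^{2/\varrho})$, which is slower than Gaussian as soon as $\varrho > 1$.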
In this section, we assume that the reader is already familiar with the basics of
Malliavin calculus as exposed for example in the monographs [Mal97, Nua06].
Consider some abstract Wiener space $(W, \mathcal H, \mu)$ and a Wiener functional of the form
$F : W \to \mathbf R^e$. In the context of stochastic or rough differential equations (driven
by Gaussian signals), the Banach space $W$ is of the form $C([0,T],\mathbf R^d)$ where $\mu$
describes the statistics of the driving noise. If $F$ denotes the solution to a stochastic
differential equation at some time $t \in (0,T]$, then, in general, $F$ is not a continuous,
let alone Fréchet regular, function of the driving path. However, as we will see in this
section, it can be the case that for $\mu$-almost every $\omega$, the map $\mathcal H \ni h \mapsto F(\omega+h)$, i.e.
$F(\omega+\cdot)$ restricted to the Cameron–Martin space $(\mathcal H, \langle\cdot,\cdot\rangle)$, is Fréchet differentiable.
(This implies $\mathbb D^{1,p}_{\mathrm{loc}}$-regularity, based on the commonly used Shigekawa Sobolev space
$\mathbb D^{1,p}$; our notation here follows [Mal97] or [Nua06, Sec. 1.2, 1.3.4].) More precisely,
we introduce the following notion, see for example [Nua06, Sec. 4.1.3]:
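Definition 11.15. Given an abstract Wiener space $(W,\mathcal H,\mu)$, a random variable
$F : W \to \mathbf R$ is said to be continuously $\mathcal H$-differentiable, in symbols $F \in C^1_{\mathcal H}$, if for
$\mu$-almost every $\omega$, the map
$$\mathcal H \ni h \mapsto F(\omega+h)$$
is continuously Fréchet differentiable.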
The following well known criterion of Bouleau–Hirsch, see [BH91, Thm 5.2.2] and
[Nua06, Sec. 1.2, 1.3.4] then provides a condition under which the law of F has a
density with respect to Lebesgue measure:
As beautifully explained in his own book [Mal97], Malliavin realised that the
strong solution to the stochastic differential equation
$$dY_t = \sum_{i=1}^d V_i(Y_t)\circ dB^i_t, \tag{11.14}$$
$$\mathrm{Lie}\{V_1,\dots,V_d\}\big|_{y_0} = \mathbf R^e. \tag{H}$$
(Here, Lie $\mathcal V$ denotes the Lie algebra generated by a collection $\mathcal V$ of smooth vector
fields.) There are many variations on this theme: one can include a drift vector
field (which gives rise to a modified Hörmander condition), and under the same
assumptions one can show that $Y_T$ admits a smooth density. This result can also
(and was originally, see [Hör67, Koh78]) be obtained by using purely functional
analytic techniques, exploiting the fact that the density solves Kolmogorov’s forward
equation. On the other hand, Malliavin’s approach is purely stochastic and allows one to
go beyond the Markovian / PDE setting. In particular, we will see that it is possible
to replace B by a somewhat generic sufficiently non-degenerate Gaussian process,
with the interpretation of (11.14) as a random RDE driven by some Gaussian rough
path X rather than Brownian motion.
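Condition (H) can be checked by hand in simple cases. As a sketch, consider the (hypothetical, not taken from the text) Heisenberg-type vector fields $V_1 = (1, 0, -y/2)$, $V_2 = (0, 1, x/2)$ on $\mathbf R^3$: two vector fields cannot span $\mathbf R^3$, but together with their bracket they do. The code assumes the sign convention $[A,B] = DB\,A - DA\,B$ and approximates derivatives by central differences.

```python
def V1(p):
    x, y, z = p
    return [1.0, 0.0, -0.5 * y]


def V2(p):
    x, y, z = p
    return [0.0, 1.0, 0.5 * x]


def directional_derivative(field, p, v, eps=1e-6):
    """Central finite-difference approximation of (D field)(p) v."""
    plus = field([pi + eps * vi for pi, vi in zip(p, v)])
    minus = field([pi - eps * vi for pi, vi in zip(p, v)])
    return [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]


def lie_bracket(A, B, p):
    """[A, B](p) = DB(p) A(p) - DA(p) B(p) (sign convention assumed here)."""
    dB_A = directional_derivative(B, p, A(p))
    dA_B = directional_derivative(A, p, B(p))
    return [b - a for b, a in zip(dB_A, dA_B)]


def det3(c1, c2, c3):
    """Determinant of the 3x3 matrix with columns c1, c2, c3."""
    return (c1[0] * (c2[1] * c3[2] - c2[2] * c3[1])
            - c2[0] * (c1[1] * c3[2] - c1[2] * c3[1])
            + c3[0] * (c1[1] * c2[2] - c1[2] * c2[1]))


y0 = [0.3, -1.2, 2.0]
bracket = lie_bracket(V1, V2, y0)
spanning_det = det3(V1(y0), V2(y0), bracket)
print(bracket, spanning_det)
```

A non-zero determinant means that $V_1$, $V_2$ and $[V_1,V_2]$ span the tangent space at `y0`, so (H) holds there for this example.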
for any $\alpha$-Hölder geometric driving rough path $\mathbf X = (X,\mathbb X) \in \mathscr C_g^{0,\alpha}$, which may
be obtained as limit of smooth, or piecewise smooth, paths in $\alpha$-Hölder rough path
metric. Set $p = 1/\alpha$. Recall that, thanks to continuity of the Itô–Lyons map, RDE
solutions are limits of the corresponding ODE solutions.
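The unique RDE solution (11.15) passing through $Y_{t_0} = y_0$ gives rise to the
solution flow $y_0 \mapsto U^{\mathbf X}_{t\leftarrow t_0}(y_0) = Y_t$. We call the derivative of the flow with respect
to the starting point the Jacobian and denote it by $J^{\mathbf X}_{t\leftarrow t_0}$, so that
$$J^{\mathbf X}_{t\leftarrow t_0}\,a = \frac{d}{d\varepsilon}\,U^{\mathbf X}_{t\leftarrow t_0}(y_0+\varepsilon a)\Big|_{\varepsilon=0}.$$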
$$D_hU^{\mathbf X}_{t\leftarrow 0} = \frac{d}{d\varepsilon}\,U^{T_{\varepsilon h}\mathbf X}_{t\leftarrow 0}\Big|_{\varepsilon=0},$$
for any sufficiently smooth path $h : \mathbf R_+ \to \mathbf R^d$. Recall that the translation operator
$T_h$ was defined in (11.5). In particular, we have seen in Lemma 11.4 that, if $\mathbf X$ arises
from a smooth path $X$ together with its iterated integrals, then the translated rough
path $T_h\mathbf X$ is nothing but $X + h$ together with its iterated integrals. In the general case,
given $h \in C^{q\text{-var}}$ of complementary Young regularity, i.e. with $1/p + 1/q > 1$, the
translation $T_h\mathbf X$ can be written in terms of $\mathbf X$ and cross-integrals between $X$ and $h$.
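Suppose for a moment that the rough path $\mathbf X$ is the canonical lift of a smooth
$\mathbf R^d$-valued path $X$. Then, it is classical to prove that $J^{\mathbf X}_{t\leftarrow t_0}$ solves the linear ODE
$$dJ^{\mathbf X}_{t\leftarrow t_0} = \sum_{i=1}^d DV_i(Y_t)\,J^{\mathbf X}_{t\leftarrow t_0}\,dX^i_t, \tag{11.16}$$
and satisfies $J^{\mathbf X}_{t_2\leftarrow t_0} = J^{\mathbf X}_{t_2\leftarrow t_1}\cdot J^{\mathbf X}_{t_1\leftarrow t_0}$. Furthermore, the variation of constants
formula leads to
$$D_hU^{\mathbf X}_{t\leftarrow 0} = \int_0^t \sum_{i=1}^d J^{\mathbf X}_{t\leftarrow s}V_i(Y_s)\,dh^i_s. \tag{11.17}$$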
where $[V,W]$ denotes the Lie bracket between the vector fields $V$ and $W$. All this
extends to the rough path limit without difficulties. For instance, (11.16) can be
interpreted as a linear equation driven by the rough path $\mathbf X$, using the fact that
$DV(Y)$ is controlled by $X$ to give meaning to the equation. It is then still the case
that $J^{\mathbf X}_{t\leftarrow t_0}$ is the derivative of the flow associated to (11.15) with respect to its initial
condition.
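In the smooth case, both (11.16) and (11.17) can be checked numerically. The sketch below is an illustration under assumptions not taken from the text: a scalar state ($e = 1$), two hypothetical vector fields $V_1 = \sin$, $V_2 = \cos$, the driver $X_t = (\cos t, \sin t)$ on $[0,1]$, the direction $h_t = (t, t^2)$, and a plain Euler discretisation. For the Euler scheme the discrete Jacobian and the discrete variation of constants formula are exact derivatives of the discrete flow, so they should agree with finite differences up to rounding.

```python
import math


def V(y):
    """Hypothetical vector fields (V_1, V_2) evaluated at a scalar state y."""
    return (math.sin(y), math.cos(y))


def DV(y):
    """Their derivatives (DV_1, DV_2)."""
    return (math.cos(y), -math.sin(y))


def increments(path, n):
    """Increments of t -> path(t) on the uniform grid k/n of [0, 1]."""
    pts = [path(k / n) for k in range(n + 1)]
    return [tuple(b - a for a, b in zip(p, q)) for p, q in zip(pts, pts[1:])]


def euler_flow(y0, dX):
    """Euler scheme for dY = sum_i V_i(Y) dX^i; returns the whole trajectory."""
    ys = [y0]
    for dx in dX:
        y = ys[-1]
        ys.append(y + sum(v * d for v, d in zip(V(y), dx)))
    return ys


def jacobian(ys, dX):
    """Euler scheme for the linear equation (11.16), started at J = 1."""
    js = [1.0]
    for y, dx in zip(ys, dX):
        js.append((1.0 + sum(dv * d for dv, d in zip(DV(y), dx))) * js[-1])
    return js


def derivative_along_h(ys, js, dh):
    """Discrete analogue of (11.17): sum_k J_{t<-s_{k+1}} V_i(Y_{s_k}) dh^i_k."""
    return sum((js[-1] / js[k + 1]) * sum(v * d for v, d in zip(V(ys[k]), dhk))
               for k, dhk in enumerate(dh))


def shifted(dX, dh, s):
    """Increments of the translated driver X + s h."""
    return [tuple(x + s * h for x, h in zip(dx, dhk)) for dx, dhk in zip(dX, dh)]


n, y0, eps = 2000, 0.3, 1e-5
dX = increments(lambda t: (math.cos(t), math.sin(t)), n)
dh = increments(lambda t: (t, t * t), n)
ys = euler_flow(y0, dX)
js = jacobian(ys, dX)
dh_formula = derivative_along_h(ys, js, dh)

# Finite-difference references for the two derivatives.
fd_jacobian = (euler_flow(y0 + eps, dX)[-1] - euler_flow(y0 - eps, dX)[-1]) / (2 * eps)
fd_dh = (euler_flow(y0, shifted(dX, dh, eps))[-1]
         - euler_flow(y0, shifted(dX, dh, -eps))[-1]) / (2 * eps)
print(js[-1], fd_jacobian, dh_formula, fd_dh)
```

In the scalar case $J_{t\leftarrow s} = J_{t\leftarrow 0}/J_{s\leftarrow 0}$, which is the multiplicative property used inside `derivative_along_h`; in higher dimension one would use matrix inverses instead.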
11.3 Malliavin calculus for rough differential equations 163
$$D_hU^{\mathbf X}_{t\leftarrow 0}(y_0) = \int_0^t \sum_{i=1}^d J^{\mathbf X}_{t\leftarrow s}\big(V_i(U^{\mathbf X}_{s\leftarrow 0})\big)\,dh^i_s \tag{11.19}$$
Consider now an RDE driven by a Gaussian rough path $\mathbf X = \mathbf X(\omega)$. We now show
that the $\mathbf R^e$-valued random variable obtained from solving this random RDE enjoys
$C^1_{\mathcal H}$-regularity.
Proposition 11.19. With $\varrho \in [1,\tfrac32)$ and $\alpha \in (\tfrac13,\tfrac1{2\varrho})$, let $\mathbf X = (X,\mathbb X)\in\mathscr C_g^\alpha$ be a
Gaussian rough path as constructed in Theorem 10.4. For fixed $t \ge 0$, the $\mathbf R^e$-valued
random variable
$$\omega \mapsto U^{\mathbf X(\omega)}_{t\leftarrow 0}(y_0)$$
is continuously $\mathcal H$-differentiable.
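Proof. Recall $h \in \mathcal H \subset C^{\varrho\text{-var}}$ so that a.e. $X(\omega)$ and $h$ enjoy complementary Young
regularity. As a consequence, we saw that the event
Note that $s \mapsto Z_{i,s}$ is of finite $p$-variation, with $p = 1/\alpha$. We have, with implicit
summation over $i$,
$$\begin{aligned}
\big|D_hU^{\mathbf X}_{t\leftarrow 0}(y_0)\big|
&= \Big|\int_0^t J^{\mathbf X}_{t\leftarrow s}\big(V_i(U^{\mathbf X}_{s\leftarrow 0})\big)\,dh^i_s\Big| \\
&= \Big|\int_0^t Z_i\,dh^i\Big| \\
&\lesssim \big(\|Z\|_{p\text{-var}} + |Z(0)|\big)\times\|h\|_{\varrho\text{-var}} \\
&\lesssim \big(\|Z\|_{p\text{-var}} + |Z(0)|\big)\times\|h\|_{\mathcal H}.
\end{aligned}$$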
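Hence, the linear map $DU^{\mathbf X}_{t\leftarrow 0}(y_0) : h \mapsto D_hU^{\mathbf X}_{t\leftarrow 0}(y_0) \in \mathbf R^e$ is bounded and each
component is an element of $\mathcal H^*$. We just showed that
$$h \mapsto \frac{d}{d\varepsilon}\,U^{T_{\varepsilon h}\mathbf X(\omega)}_{t\leftarrow 0}(y_0)\Big|_{\varepsilon=0} = \Big\langle DU^{\mathbf X(\omega)}_{t\leftarrow 0}(y_0), h\Big\rangle_{\mathcal H}$$
and hence
$$h \mapsto \frac{d}{d\varepsilon}\,U^{\mathbf X(\omega+\varepsilon h)}_{t\leftarrow 0}(y_0)\Big|_{\varepsilon=0} = \Big\langle DU^{\mathbf X(\omega)}_{t\leftarrow 0}(y_0), h\Big\rangle_{\mathcal H}$$
emphasizing again that $\mathbf X(\omega+h) \equiv T_h\mathbf X(\omega)$ almost surely for all $h \in \mathcal H$ simultaneously.
Repeating the argument with $T_g\mathbf X(\omega) = \mathbf X(\omega+g)$ shows that the Gateaux
differential of $U^{\mathbf X(\omega+\cdot)}_{t\leftarrow 0}$ at $g \in \mathcal H$ is given by
$$DU^{\mathbf X(\omega+g)}_{t\leftarrow 0} = DU^{T_g\mathbf X(\omega)}_{t\leftarrow 0}.$$
(b) It remains to be seen that $g \in \mathcal H \mapsto DU^{T_g\mathbf X(\omega)}_{t\leftarrow 0} \in L(\mathcal H,\mathbf R^e)$, the space of linear
bounded maps equipped with operator norm, is continuous. We leave this as an exercise
to the reader, cf. Exercise 11.23 below. □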
Before we proceed we note that, by the multiplicative property of $J^{\mathbf X}_{t\leftarrow s}$, see the
remark following (11.16), one has
$$M_t = J^{\mathbf X}_{t\leftarrow 0}\,\tilde M_t\,\big(J^{\mathbf X}_{t\leftarrow 0}\big)^{\top},$$
on $[0,t]$. Thanks to Condition 2, true roughness of $X$, we can apply Theorem 6.5 to
conclude that one has
$$z^{\top} J^{\mathbf X}_{0\leftarrow\cdot}\,[V_i,V_j](Y_\cdot) \equiv 0,$$
for every $i,j \in \{1,\dots,d\}$. Iterating this argument shows that, with non-zero probability,
the processes $s \mapsto z^{\top} J^{\mathbf X}_{0\leftarrow s}W(Y_s)$ vanish identically for every vector field
$W$ obtained as a Lie bracket of the vector fields $V_i$. In particular, this is the case for
$s = 0$, which implies that with positive probability, $z$ is orthogonal to $W(y_0)$ for
all such vector fields. Since Hörmander’s condition (H) asserts precisely that these
vector fields span the tangent space at the starting point $y_0$, we conclude that $z = 0$
with positive probability, which is in contradiction with the fact that $z$ is a random
unit vector and thus concludes the proof. □
11.4 Exercises
newly covered regime H ∈ (1/4, 1/3], one would need to work in a “level-3” rough
path setting.)
Solution 11.24. In the notation of the (proof of) this Proposition, we have to show
that $g\in\mathcal H\mapsto DU^{T_g\mathbf X(\omega)}_{t\leftarrow0}\in L(\mathcal H,\mathbf R^e)$ is continuous. To this end, assume $g_n\to g$ in
$\mathcal H$ (and hence in $C^{\varrho\text{-var}}$). Continuity properties of the Young integral imply continuity
of the translation operator viewed as map $h\in C^{\varrho\text{-var}}\mapsto T_h\mathbf X(\omega)$ and so
depends continuously on $\mathbf x$ with respect to $p$-variation rough path metric: using the
fact that $J^{\mathbf x}_{t\leftarrow\cdot}$ and $U^{\mathbf x}_{\cdot\leftarrow 0}$ both satisfy rough differential equations driven by $\mathbf x$ this is
just a consequence of Lyons’ limit theorem (the universal limit theorem of rough
path theory). We apply this with $\mathbf x = \mathbf X(\omega)$ where $\omega$ remains a fixed element in
(11.20). It follows that
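$$\Big\|DU^{T_{g_n}\mathbf X(\omega)}_{t\leftarrow 0} - DU^{T_g\mathbf X(\omega)}_{t\leftarrow 0}\Big\|_{\mathrm{op}} = \sup_{h:\|h\|_{\mathcal H}=1}\Big|D_hU^{T_{g_n}\mathbf X(\omega)}_{t\leftarrow 0} - D_hU^{T_g\mathbf X(\omega)}_{t\leftarrow 0}\Big|$$
and defining $Z^g_i(s) \equiv J^{T_g\mathbf X(\omega)}_{t\leftarrow s}\big(V_i(U^{T_g\mathbf X(\omega)}_{s\leftarrow 0})\big)$, and similarly $Z^{g_n}_i(s)$, the same
reasoning as in part (a) leads to the estimate
$$\Big\|DU^{T_{g_n}\mathbf X(\omega)}_{t\leftarrow 0} - DU^{T_g\mathbf X(\omega)}_{t\leftarrow 0}\Big\|_{\mathrm{op}} \le c\,\Big(|Z^{g_n}-Z^g|_{p\text{-var}} + |Z^{g_n}(0)-Z^g(0)|\Big).$$
From the explanations just given this tends to zero as $n\to\infty$, which establishes
continuity of the Gateaux differential, as required, and the proof is finished.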
Exercise 11.25. Prove Theorem 11.20 in presence of a drift vector field V0 . In par-
ticular, show that in this case condition (H) can be weakened to
11.5 Comments
Abstract Second order stochastic partial differential equations are discussed from
a rough path point of view. In the linear and finite-dimensional noise case we
follow a Feynman–Kac approach which makes good use of concentration of measure
results, such as those obtained in Section 11.2. Alternatively, one can proceed by flow
decomposition and this approach also works in a number of non-linear situations.
Secondly, now motivated by some semi-linear SPDEs of Burgers’ type with
infinite-dimensional noise, we study the stochastic heat equation (in space dimension 1) as
evolution in Gaussian rough path space relative to the spatial variable, in the sense
of Chapter 10.
The second order stochastic partial differential equations we will be concerned with
here take the form of a terminal value problem,
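$$-du = L[u]\,dt + \sum_{i=1}^d\Gamma_i[u]\circ dW^i_t(\omega), \qquad u(T,\cdot) = g, \tag{12.1}$$
$$L[u] \overset{\text{def}}{=} \frac12\,\mathrm{Tr}\big(\sigma(x)\sigma^T(x)D^2u\big) + \langle b(x),Du\rangle + c(x)u, \tag{12.2}$$
$$\Gamma_i[u] \overset{\text{def}}{=} \langle\beta_i(x),Du\rangle + \gamma_i(x)u.$$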
170 12 Stochastic partial differential equations
ficients are assumed to be bounded with bounded derivatives of all orders (but see
Remark 12.3). We assume g ∈ BC(Rn ), that is bounded and continuous.1 The reader
may think of ◦dW in (12.1) as Stratonovich differential of a d-dimensional Brownian
motion. But of course, we are interested in replacing W by a genuine (geometric)
rough path W, such as to give meaning to the rough partial differential equation
(RPDE)
−du = L[u]dt + Γ [u]dW , u(T, ·) = g . (12.3)
To this end, since geometric rough paths are limits of smooth paths, we first consider
the case $W \in C^1([0,T],\mathbf R^d)$. It is a basic exercise in Itô calculus that any bounded
$C^{1,2}$ solution to
$$-\partial_t u = L[u] + \sum_{i=1}^d \Gamma_i[u]\,\dot W^i_t, \qquad u(T,\cdot) = g, \tag{12.4}$$
1
In contrast to the space Cb we shall equip BC with the topology of locally uniform convergence.
12.1 Rough partial differential equations 171
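$$(W^\varepsilon,\mathbb W^\varepsilon) := \Big(W^\varepsilon, \int_0^\cdot W^\varepsilon_{0,t}\otimes dW^\varepsilon_t\Big) \to \mathbf W$$
$$u^\varepsilon = S[W^\varepsilon;g] \to u =: S[\mathbf W;g]$$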
This is clearly not an equation that can be solved by Itô theory alone. But it is also not
immediately well-posed as a rough differential equation, since for this we would need to
understand $B$ and $\mathbf W = (W,\mathbb W)$ jointly as a rough path. In view of the Itô differential
$dB$ in (12.8), we take $(B,\mathbb B^{\text{Itô}})$, as constructed in Section 3.2, and are basically short
of the cross-integrals between $B$ and $W$. (For simplicity of notation only, pretend
over the next few lines $W, B$ to be scalar.) We can define $\int W\,dB(\omega)$ as Wiener
integral (Itô with deterministic integrand), and then $\int B\,dW = WB - \int W\,dB$ by
imposing integration by parts. We then easily get the estimate
$$\mathbf E\,\Big|\int_s^t W_{s,r}\,dB_r\Big|^2 \lesssim \|W\|_\alpha^2\,|t-s|^{2\alpha+1},$$
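also when switching the roles of $W, B$, thanks to the integration by parts formula. It
follows from Kolmogorov’s criterion that $\mathbf Z^{W}(\omega) := \mathbf Z = (Z,\mathbb Z) \in \mathscr C^{\alpha'}$ a.s. for any
$\alpha' \in (1/3,\alpha)$ where
$$Z_t = \begin{pmatrix} B_t(\omega) \\ W_t \end{pmatrix}, \qquad
\mathbb Z_{s,t} = \begin{pmatrix} \mathbb B^{\text{Itô}}_{s,t}(\omega) & \int_s^t W_{s,r}\otimes dB_r(\omega) \\ \int_s^t B_{s,r}\otimes dW_r(\omega) & \mathbb W_{s,t}\end{pmatrix}.$$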
We are hence able to say that a solution X = X(ω) of (12.8) is, by definition, a
solution to the genuine (random) rough differential equation
One can see, similar to (11.10), but now also relying on RDE growth estimates as
established in Proposition 8.3, with $p = 1/\alpha'$,
$$\Big|\int_s^t \gamma(X)\,d\mathbf W\Big| \lesssim |||\mathbf Z|||_{p\text{-var};[s,t]}$$
is indeed well-defined and the pointwise limit of uε (defined in the same way, with
W replaced by W ε ). By an Arzela–Ascoli argument, the limit is locally uniform. At
last, the claimed continuity of the solution map follows from the same arguments,
essentially by replacing W by W everywhere in the above argument, and of course
using (12.12) with $g$, $\mathbf W$ replaced by $g^\varepsilon$, $\mathbf W^\varepsilon$, respectively. □
Remark 12.3. The proof actually shows that our solution $u = u(s,x;\mathbf W)$ to the linear
RPDE (12.3) enjoys a Feynman–Kac type representation, namely (12.12), in terms
of the process constructed as solution to the hybrid Itô-rough differential equation
(12.8). Assume now $W$ is a Brownian motion, independent of $B$, and $\mathbf W(\omega) =
\mathbf W^{\mathrm{Strat}} = (W,\mathbb W^{\mathrm{Strat}}) \in \mathscr C_g^{0,\alpha}$ a.s. It is not difficult to show that $u = u(\cdot,\cdot,\mathbf W^{\mathrm{Strat}}(\omega))$
coincides with the Feynman–Kac SPDE solution derived by Pardoux [Par79] or
Kunita [Kun82], via conditional expectations given $\sigma(\{W_{u,v} : s \le u \le v \le T\}$,
Remark 12.4. It is easy to quantify the required regularity of the coefficients. The
argument essentially relies on solving (12.10) as bona fide rough differential equation.
It is then clear that we need to impose $C_b^3$-regularity for the vector fields $\sigma$ and $\beta$.
The drift vector field $b$ may be taken to be Lipschitz and $c \in C_b$.
Remark 12.5. We have not given meaning to the actual equation (12.3),
In Exercise 12.22 the reader is invited to check that this formulation is indeed
meaningful. In particular, the integral term $\int\langle u_t,\Gamma^*\varphi\rangle\,d\mathbf W_t$ is a bona fide rough
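Assume now that $W$ is a Brownian motion and take $\mathbf W(\omega) = \mathbf W^{\mathrm{Strat}}$ as above. Then,
thanks to Theorem 5.12, one can see that $u = u(\cdot,\cdot,\mathbf W^{\mathrm{Strat}}(\omega))$ is an analytically
weak SPDE solution in the sense that
$$\langle u_s,\varphi\rangle = \langle g,\varphi\rangle - \int_s^T\langle u_t, L^*\varphi\rangle\,dt - \int_s^T \langle u_t, M^*\varphi\rangle\circ d\overleftarrow{W}_t,$$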
We now develop the “calculus” for the transformations associated to each of the
above cases. All proofs consist of elementary computations and are left to the reader.
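$$\partial_t v - F^\psi\big(t,x,v,Dv,D^2v\big) = 0$$
$$F^\psi(t,\psi_t(x),r,p,X) \overset{\text{def}}{=} F\Big(x, r, \big\langle p, D\psi_t^{-1}\big\rangle, \big\langle X, D\psi_t^{-1}\otimes D\psi_t^{-1}\big\rangle + \big\langle p, D^2\psi_t^{-1}\big\rangle\Big).$$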
Proposition 12.7 (Case b). For any fixed $x \in \mathbf R^n$, assume that the one-dimensional
ODE
2
The terminology here follows [LS00a].
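if and only if $v(t,x) = \varphi^{-1}(t,u(t,x),x)$, or equivalently $\varphi(t,v(t,x),x) = u(t,x)$,
is a solution of
$$\partial_t v - {}^{\varphi}F\big(t,x,v,Dv,D^2v\big) = 0,$$
with
$${}^{\varphi}F(t,x,r,p,X) \overset{\text{def}}{=} \frac{1}{\varphi'}\,F\Big(t,x,\varphi,\ D\varphi+\varphi'p,\ \varphi''p\otimes p + D\varphi'\otimes p + p\otimes D\varphi' + D^2\varphi + \varphi'X\Big), \tag{12.16}$$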
Remark 12.8. It is worth noting that the “quadratic gradient” term $\varphi''p\otimes p$ disappears
in (12.16) whenever $\varphi''=0$. This happens when $H(x,u)$ is linear in $u$, i.e. when
$$H_i[u] = \gamma_i(x)u, \qquad i=1,\dots,d.$$
Remark 12.9. Note that all dependence on $\dot W$ has disappeared in (12.17), and consequently
(12.16). In the SPDE / filtering context this is known as robustification: the
transformed PDE $(\partial_t - {}^{\varphi}F)v = 0$ can be solved for any $W\in C([0,T],\mathbf R^d)$. This
provides a way to solve SPDEs of the form $du = F[u]dt + \sum_{i=1}^d\gamma_i(x)u\circ dW^i_t$
pathwise, so that $u$ depends continuously on $W$ in uniform topology.
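The robustification idea can be seen in the simplest possible toy case. The sketch below makes strong simplifying assumptions that are not in the text: no space variable, $d = 1$, $F[u] = a\,u$ and constant $\gamma$, so that the equation reduces to $\dot u = a u + \gamma u \dot W$. The outer transformation $\varphi(t,r) = r\exp(\gamma(W_t - W_0))$ turns it into $\dot v = a v$, which does not involve $W$ at all. The function name and all parameter values are hypothetical.

```python
import math


def robust_solution(u0, a, gamma, ts, W):
    """u(t) = phi(t, v(t)), where v' = a v and phi(t, r) = r exp(gamma (W_t - W_0))."""
    return [u0 * math.exp(a * t) * math.exp(gamma * (w - W[0])) for t, w in zip(ts, W)]


n, u0, a, gamma, eps = 1000, 1.0, -0.5, 2.0, 1e-2
ts = [k / n for k in range(n + 1)]
W = [math.sin(3.0 * t) for t in ts]
# Uniformly close driver whose derivative is of size 1/eps.
W_eps = [w + eps * math.sin(t / eps ** 2) for t, w in zip(ts, W)]

u = robust_solution(u0, a, gamma, ts, W)
u_eps = robust_solution(u0, a, gamma, ts, W_eps)
sup_diff = max(abs(x - y) for x, y in zip(u, u_eps))
print(sup_diff)
```

Although the derivative of `W_eps` is far from that of `W`, the two solutions differ by at most $\sup_t u(t)\,(e^{\gamma\varepsilon}-1)$, which illustrates continuity in the uniform topology for this linear noise.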
We now turn our attention to case c). The point here is that the “inner” and “outer”
transformation seen above, namely
Proposition 12.10 (Case c). Let $\psi = \psi^W$ be as in case a) and set
$\varphi(t,r,x) = r\exp\big(\int_0^t\gamma(\psi_s(x))\,dW_s\big)$. Then $u$ is a (classical) solution to
if and only if $v(t,x) = u(t,\psi_t(x))\exp\big(-\int_0^t\gamma(\psi_s(x))\,dW_s\big)$ is a (classical) solution
to
$$\partial_t v - {}^{\varphi}(F^\psi)\big(t,x,v,Dv,D^2v\big) = 0.$$
Let us reflect for a moment on what has been achieved. We started with a PDE
that involves Ẇ and in all cases we managed to transform the original problem to a
PDE where all dependence on Ẇ has been isolated in some auxiliary ODEs. In the
stochastic context (◦dW instead of dW = Ẇ dt) this is nothing but the reduction,
via stochastic flows, from a stochastic PDE to a random PDE, to be solved ω-wise.
In the same spirit, the rough case is now handled with the aid of flows for RDEs and
their stability properties.
Given $\mathbf W \in \mathscr C_g^{0,\alpha}$, we pick an approximating sequence $(W^\varepsilon)$, and transform
(in abusive notation) and the function F ε which appears on the right-hand side above
converges (e.g. locally uniformly) as ε → 0, due to stability properties of flows
associated to RDEs as discussed in Section 8.9.
All one now needs is a (deterministic) PDE framework with a number of good
properties, along the following “wish list”.
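1. All approximate problems, i.e. with $W^\varepsilon \in C^1([0,T],\mathbf R^d)$
$$\partial_t u^\varepsilon = F[u^\varepsilon] + \sum_{i=1}^d H_i[u^\varepsilon]\,\dot W^{\varepsilon,i}_t, \qquad u^\varepsilon(0,\cdot) = g^\varepsilon,$$
$$(W^\varepsilon,\mathbb W^\varepsilon) := \Big(W^\varepsilon, \int_0^\cdot W^\varepsilon_{0,t}\otimes dW^\varepsilon_t\Big) \to \mathbf W$$
$$u^\varepsilon = S[W^\varepsilon;g] \to u =: S[\mathbf W;g]$$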
as ε → 0 in U. This u is the unique solution to the RPDE (12.20) in the sense of the
above definition. Moreover, the resulting solution map,
$$S : \mathscr C_g^{0,\alpha}([0,T],\mathbf R^d)\times G \to U$$
is continuous.
It remains to identify suitable PDE frameworks, depending on the non-linearity
F . When ∂t u = F [u] is a scalar conservation law, entropy solutions actually provide
3
Given the roughness in t of our transformations, typically α-Hölder, it would not be wise to
incorporate temporal C 1 -regularity in the definition of the space U .
a suitable framework to handle additional rough noise, at least of (linear) type c),
[FG14]. On the other hand, when $F = F[u]$ is a fully non-linear second order operator,
say of Hamilton–Jacobi–Bellman (HJB) or Isaacs type, the natural framework
is viscosity theory [CIL92, FS06] and the problem of handling additional “rough”
noise, in the sense of $W \notin C^1$, also with non-linear $H = H(Du)$, was first raised by
Lions–Souganidis [LS98a, LS98b, LS00a, LS00b].
and some technical conditions hold.⁴ Without going into technical details, the conditions
are met for $F = L$ as in (12.2) and are robust under taking inf and sup (provided
the regularity of the coefficients holds uniformly). As a consequence, HJB
and Isaacs type non-linearities, where $F$ takes the form $\inf_a L_a$, $\inf_a\sup_{a'}L_{a,a'}$,
are also covered.
4
... the most important of which is [CIL92, (3.14)]. Additional assumptions on F are necessary,
however, in particular due to the unboundedness of the domain Rn , and these are not easily found
in the literature; see [DFO14]. One can also obtain existence and uniqueness result in BUC.
where $F$ satisfies $\psi$-invariant comparison. Then there exists $u = u(t,x) \in BC$, not
dependent on the approximating $(W^\varepsilon)$ but only on $\mathbf W \in \mathscr C_g^{0,\alpha}([0,T],\mathbf R^d)$, so that
$$u^\varepsilon = S[W^\varepsilon;g] \to u =: S[\mathbf W;g]$$
as $\varepsilon\to0$ in local uniform sense. This $u$ is the unique solution to the RPDE (12.20)
with transport noise $H[u] = \langle\beta(x),Du\rangle$ in the sense of the definition given previous
to Theorem 12.12. Moreover, we have continuity of the solution map,
Remark 12.14. In the above theorem, existence of RPDE solutions actually relies on
existence of approximate solutions uε , which one of course expects from standard
viscosity theory. Mild structural conditions on $F$, satisfied by HJB and Isaacs examples,
which imply this existence are reviewed in [DFO14]. One can also establish a
modulus of continuity for RPDE solutions, so that $u \in BUC$ after all.
Remark 12.15. The RPDE solution to $du = F[u]dt + \langle\beta(x),Du\rangle\,d\mathbf W$ as constructed
above, when $F = \inf_a L_a$ is of HJB form, arises in pathwise stochastic control
[LS98b, BM07, DFG13].
Unfortunately, in the “semi-linear” noise case b), it turns out the structural assumptions
one has to impose on $F$ in order to have the necessary comparison for
$\partial_t - {}^{\varphi}F = 0$ are rather restrictive, although semilinear situations are certainly covered.
Even in this case, due to the appearance of a quadratic non-linearity in Du, the argu-
ment is involved and requires a careful analysis on consecutive small time intervals,
rather than [0, T ]; see [LS00a, DF12]. A non-linear Feynman–Kac representation, in
terms of rough backward stochastic differential equations is given in [DF12].
At last, we return to the fully linear case of Section 12.1.1. That is, we consider
the (linear noise) case c) with linear $F = L$. With some care [FO14], the double
transformation leading to the transformed equation $\partial_t - {}^{\varphi}(F^\psi) = 0$ can be implemented
with the aid of coupled flows of rough differential equations. We can then
recover Theorem 12.2, but with somewhat different needs concerning the regularity
of the coefficients. (For instance, in the aforementioned theorem we really needed
$\sigma,\beta\in C_b^3$ whereas now, using flow decomposition, we need $\beta\in C_b^5$ but only $\sigma\in C_b^1$.)
Remark 12.16. By either approach, case c) with linear $F = L$ or Theorem 12.2,
we obtain a robust view on classes of SPDEs which contain the Zakai equation from
filtering theory, provided the initial law admits a $BUC$-density. Robustness is an
important issue in filtering theory, see also Exercise 12.24.
Nonlinear stochastic partial differential equations driven by very singular noise, say
space-time white noise, may suffer from the fact that their nonlinearities are ill-posed.
For instance, even in space dimension one, there is no obvious way of giving “weak”
meaning to Burgers-like stochastic PDEs of the type
$$\partial_t u^i = \partial_x^2u^i + f(u) + \sum_{j=1}^n g^i_j(u)\,\partial_xu^j + \xi^i, \qquad i = 1,\dots,n, \tag{12.25}$$
where $\xi = (\xi^i)$ denotes space-time white noise (strictly speaking, $n$ independent
copies of scalar space-time white noise). Recall that, at least formally, space-time
12.2 Stochastic heat equation as a rough path 181
As a consequence of the lack of regularity of ξ, it turns out that the solution to the
stochastic heat equation (i.e. the case f = g = 0 in (12.25) above) is only α-Hölder
continuous in the spatial variable x for any α < 1/2. In other words, one would
not expect any solution u to (12.25) to exhibit spatial regularity better than that of a
Brownian motion.
As a consequence, even when aiming for a weak solution theory, it is not clear
how to define the integral of a spatial test function ϕ against the nonlinearity. Indeed,
this would require us to make sense of expressions of the type
$$\int\varphi(x)\,g^i_j(u)\,\partial_xu^j(t,x)\,dx,$$
for fixed $t$. When $g$ happens to be a gradient, such an integral can be defined by
postulating that the chain rule holds and integrating by parts. For a general $g$, as arising
in applications from path sampling [HSV07], this approach fails. This suggests
seeking an understanding of $u(t,\cdot)$ as a spatial rough path. Indeed, this would solve the
problem just explained by allowing us to define the nonlinearity in a weak sense as
$$\int\varphi(x)\,g^i_j(u)\,du^j(t,x),$$
$$\partial_t\psi = \partial_x^2\psi + \xi.$$
Indeed, writing u = ψ + v and proceeding formally for the moment, we then see that
v should solve
$$\partial_t v^i = \partial_x^2v^i + f(v+\psi) + \sum_{j=1}^n g^i_j(v+\psi)\,\big(\partial_x\psi^j + \partial_xv^j\big).$$
If we were able to make sense of the term appearing in the right hand side of this
equation, one would expect it to have the same regularity as $\partial_x\psi$ so that, since
$\psi(t,\cdot)$ turns out to belong to $C^\alpha$ for every $\alpha<1/2$, one would expect $v(t,\cdot)$ to be
of regularity $C^{\alpha+1}$ for every $\alpha<1/2$. In particular, we would not expect the term
involving $\partial_xv^j$ to cause any trouble, so that it only remains to provide a meaning for
the term $g^i_j(v+\psi)\,\partial_x\psi^j$. If we know that $v \in C^1$ and we have an interpretation of
$\psi(t,\cdot)$ as a rough path $\boldsymbol\psi$ (in space), then this can be interpreted as the distribution
whose action, when tested against a test function $\varphi$, is given by
$$\int\varphi(x)\,g^i_j(\psi+v)\,d\boldsymbol\psi^j(t,x).$$
This reasoning can actually be made precise, see the original article [Hai11b]. In this
section we limit ourselves to providing the construction of ψ and giving some of its
basic properties.
We now study the model problem in this context: the construction of a spatial rough
path associated, in essence, to the above SPDE in the case $f = g = 0$. More precisely,
we are considering the stationary (in time) solution to the stochastic heat equation⁵,
Thanks to the fact that we chose $\lambda > 0$, the stochastic heat equation (12.26) has
indeed a stationary solution which, by taking Fourier transforms, may be decomposed
as $\psi(x,t;\omega) = \sum_k Y^k_t(\omega)\,e_k(x)$. The components $Y^k_t$ are then a family of
independent stationary one-dimensional Ornstein–Uhlenbeck processes given by
5
With λ = 0, the 0th mode of ψ behaves like a Brownian motion and ψ cannot be stationary in
time, unless one identifies functions that only differ by a constant.
$$\mathbf E\,\big(Y^k_t\big)^2 = \frac{\sigma^2}{2\mu_k},$$
for any fixed time $t$.
where $K$ is given by
$$K(u) := \frac{1}{4\pi}\,\sigma^2\sum_{k\in\mathbf Z}\frac{\cos(ku)}{\mu_k} = \frac{\sigma^2}{4\sqrt\lambda\,\sinh(\sqrt\lambda\,\pi)}\,\cosh\big(\sqrt\lambda\,(u-\pi)\big).$$
Here, the second equality holds for $u$ restricted to $[0,2\pi]$. In fact, the cosine series is
the periodic continuation of the r.h.s. restricted to $[0,2\pi]$.
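The identity between the cosine series and the closed form can be checked numerically. The sketch below assumes $\mu_k = \lambda + k^2$ (the form appearing in the proof that follows), truncates the series at a hypothetical cut-off `kmax`, and uses illustrative values $\sigma = \lambda = 1$; the function names are not from the text.

```python
import math


def K_series(u, sigma=1.0, lam=1.0, kmax=100_000):
    """Truncated series sigma^2/(4 pi) * sum_{|k| <= kmax} cos(k u) / (lam + k^2)."""
    total = 1.0 / lam                                    # k = 0 term
    for k in range(1, kmax + 1):
        total += 2.0 * math.cos(k * u) / (lam + k * k)   # terms k and -k coincide
    return sigma ** 2 / (4.0 * math.pi) * total


def K_closed(u, sigma=1.0, lam=1.0):
    """Closed form, valid for u in [0, 2 pi]."""
    r = math.sqrt(lam)
    return sigma ** 2 * math.cosh(r * (u - math.pi)) / (4.0 * r * math.sinh(r * math.pi))


points = (0.0, 1.0, math.pi, 5.0)
errors = [abs(K_series(u) - K_closed(u)) for u in points]
# One-sided derivative at 0, used later in the form C = -2 K'(0).
h = 1e-6
C = -2.0 * (K_closed(h) - K_closed(0.0)) / h
print(errors, C)
```

For $\sigma = \lambda = 1$ the last quantity is close to $1/2$, in agreement with the value $C = -2K'(0) = \tfrac12$ used in the solution to the exercise on the perturbed Burgers-type equation.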
Proof. From the basic identity $\cos(\alpha-\beta) = \cos\alpha\cos\beta+\sin\alpha\sin\beta$,
$$e_{-k}(x)e_{-k}(y) + e_k(x)e_k(y) = \frac1\pi\cos(k(x-y)), \qquad k\in\mathbf Z.$$
$$K(x) = \frac{\sigma^2}{4\pi}\sum_{k\in\mathbf Z}\frac{\cos(kx)}{\lambda+k^2}.$$
At last, expand the (even) function $\cosh\big(\sqrt\lambda\,(|\cdot|-\pi)\big)$ in its (cosine) Fourier series
to get the claimed equality. □
Proposition 12.18. Fix $t\ge0$. Then $\psi_t(x;\omega) = \psi(t,x;\omega)$, indexed by $x\in[0,2\pi]$,
is a centred Gaussian process with covariance of finite 1-variation. More precisely,
$$\big\|R_{\psi(t,\cdot)}\big\|_{1;[x,y]^2} \le 2\pi\,\|K\|_{C^2;[0,2\pi]}\,|x-y|,$$
and so (cf. Theorem 10.4), for each fixed $t\ge0$, the $\mathbf R^d$-valued process
Remark 12.19. There are ad-hoc ways to construct a (spatial) rough path lift associated
to the stochastic heat equation, for instance by writing $\psi(t,\cdot)$ as Brownian
bridge plus a random smooth function. In this way, however, one ignores the large
body of results available for general Gaussian rough paths: for instance, rough path
convergence of hyper-viscosity or Galerkin approximation, extensions to fractional
stochastic heat equations, concentration of measure can all be deduced from general
principles.
We now show that solutions to the stochastic heat equation induce a continuous
stochastic evolution in rough path space.
Proof. Fix $s$ and $t$. The proof then proceeds in two steps. First, we will verify the
assumptions of Corollary 10.6, namely we will show that
$$|\varrho_\alpha(\boldsymbol\psi_s,\boldsymbol\psi_t)|_{L^q} \le C\sup_{x,y\in[0,2\pi]}\Big[\mathbf E\big(|\psi_s(x,y)-\psi_t(x,y)|^2\big)\Big]^{\theta},$$
for some constant $C$ that is independent of $s$ and $t$. In the second step, we will show
that (here we may assume $d=1$), with $\psi_s(x,y) := \psi_s(y)-\psi_s(x)$, one has the
bound
$$\sup_{x,y\in[0,2\pi]}\mathbf E\Big[|\psi_s(x,y)-\psi_t(x,y)|^2\Big] = O\big(|t-s|^{1/2}\big).$$
$$= \frac{\sigma^2}{4\pi}\sum_{k\in\mathbf Z}\frac{\cos(k(x-y))}{\lambda+k^2}\,e^{-(\lambda+k^2)|t-s|} =: R^\tau(x,y).$$
After integrating over $[u,v]^2$, we see that the error made above is actually of order
$O(|v-u|^2)$. This is more than enough to conclude that
$$\big\|R_{(X^1,Y^1)}\big\|_{1\text{-var};[u,v]^2} \le C|v-u| < \infty,$$
the question reduces to a similar bound on $\mathbf E|\psi^1_s(x)-\psi^1_t(x)|^2$, uniform in $x\in[0,2\pi]$.
This quantity is equal to
where we used that $1-e^{-cx}\le cx$ for $c,x>0$ in the first sum. We then take
$N\sim|t-s|^{-1/2}$, so that the first sum is of order $O(|t-s|^{1/2})$. For the second sum,
we use the trivial bound $1-e^{-(\lambda+k^2)|t-s|}\le1$. It then suffices to note that
$$\sum_{k\ge N}\frac1{\lambda+k^2} \le \sum_{k\ge N}\frac1{k^2} = O(1/N) = O\big(|t-s|^{1/2}\big),$$
also implies “almost $\tfrac14$-Hölder” temporal regularity of the stochastic heat equation.
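The $O(|t-s|^{1/2})$ behaviour obtained by splitting the sum at $N\sim|t-s|^{-1/2}$ can also be observed directly. The sketch below evaluates, for $\tau = |t-s|$, the sum $\sum_k (1-e^{-(\lambda+k^2)\tau})/(\lambda+k^2)$ that governs the temporal increments (up to constants), with the illustrative choice $\lambda = 1$ and a hypothetical truncation `kmax`.

```python
import math


def temporal_increment_sum(tau, lam=1.0, kmax=100_000):
    """sum_{|k| <= kmax} (1 - exp(-(lam + k^2) tau)) / (lam + k^2)."""
    total = -math.expm1(-lam * tau) / lam              # k = 0 term
    for k in range(1, kmax + 1):
        mu = lam + k * k
        total += 2.0 * (-math.expm1(-mu * tau)) / mu   # terms k and -k coincide
    return total


taus = (1e-4, 1e-3, 1e-2)
ratios = [temporal_increment_sum(tau) / math.sqrt(tau) for tau in taus]
print(ratios)
```

The ratios stay bounded as $\tau\to0$ (a Riemann sum comparison suggests a limit of $2\sqrt\pi$), consistent with second moments of order $|t-s|^{1/2}$ and hence Hölder exponents just below $1/4$ in time.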
12.3 Exercises
as a signal and Y as noisy and incomplete observation. The filtering problem consists
in computing the conditional distribution of the unobserved component X, given the
observation Y . Equivalently, one is interested in computing
where Yt is the observation filtration and g is a suitably chosen test function. Measure
theory tells us that there exists a Borel-measurable map θtg : C([0, t], RdY ) → R, such
that a.s. πt (g) = θtg (Y ) where we consider Y = Y (ω) as a C([0, t], RdY )-valued
random variable. Note that θtg is not uniquely determined (after all, modifications
on null sets are always possible). On the other hand, there is obvious interest to
have a robust filter, in the sense of having a continuous version of θtg , so that close
observations lead to nearby conclusions about the signal.
a) Give an example to show that, in general, θtg does not admit a continuous version.
b) Let $\alpha \in (1/3, 1/2)$. Show that there exists a continuous map on rough path space
$$\pi_t(g) = \frac{p_t(g)}{p_t(1)}, \qquad p_t(g) := \mathbf E_0\big[g(X_t,Y_t)\,v_t\,\big|\,\mathcal Y_t\big]$$
where
$$\frac{d\mathbf P_0}{d\mathbf P}\bigg|_{\mathcal F_t} = \exp\bigg(-\sum_i\int_0^t h^i(X_s,Y_s)\,dW^i_s - \frac12\int_0^t\|h(X_s,Y_s)\|^2\,ds\bigg)$$
and $v = \{v_t, t>0\}$ is defined as the right-hand side above with $-W$ replaced by $Y$.
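$$u = u(t,x;\omega) : [0,T]\times\mathbf T\times\Omega\to\mathbf R$$
$$u\,\frac{u(\cdot+\varepsilon)-u}{\varepsilon}\,;$$
Assume $v^\varepsilon = u^\varepsilon-\psi \to v := u-\psi$ and its first (spatial) derivatives converge
locally uniformly in probability. Show that $u$ is an analytically weak solution to
the perturbed equation
$$\partial_tu = \partial_{xx}u + \frac12\,\partial_x\big(u^2\big) + C + \xi$$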
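b) We note
$$D_{\varepsilon,r}\Big(\frac12u^2\Big) = \frac12\,\frac{u(\cdot+\varepsilon)^2-u^2}{\varepsilon} = \frac12\,\frac{(u(\cdot+\varepsilon)+u)\,(u(\cdot+\varepsilon)-u)}{\varepsilon}
= u\,\frac{u(\cdot+\varepsilon)-u}{\varepsilon} + \frac1{2\varepsilon}\,(u(\cdot+\varepsilon)-u)^2.$$
It follows that
$$\begin{aligned}
\partial_t\langle v^\varepsilon,\varphi\rangle &= \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle + \Big\langle u^\varepsilon\,\frac{u^\varepsilon(\cdot+\varepsilon)-u^\varepsilon}{\varepsilon},\varphi\Big\rangle \\
&= \langle v^\varepsilon,\partial_{xx}\varphi\rangle - \langle v^\varepsilon,\varphi\rangle - \Big\langle\frac12(u^\varepsilon)^2, D_{\varepsilon,l}\varphi\Big\rangle - \Big\langle\frac1{2\varepsilon}\,(u^\varepsilon(\cdot+\varepsilon)-u^\varepsilon)^2,\varphi\Big\rangle.
\end{aligned}$$
$$\begin{aligned}
\big[u^\varepsilon(\cdot+\varepsilon)-u^\varepsilon\big] &= \psi(\cdot+\varepsilon)-\psi + v^\varepsilon(\cdot+\varepsilon)-v^\varepsilon \\
&= \psi(\cdot+\varepsilon)-\psi + O(\varepsilon)
\end{aligned}$$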
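Since $K(u) = \frac{\cosh(u-\pi)}{4\sinh(\pi)}$, we have $C = -2K'(0) = \frac12$, and it follows from
Exercise 10.20 that
$$\Big\langle\frac1{2\varepsilon}\,(\psi(\cdot+\varepsilon)-\psi)^2,\varphi\Big\rangle = \frac12\int\varphi(x)\,\frac{\psi^2_{x,x+\varepsilon}}{\varepsilon}\,dx
\to \frac12\int\varphi(x)\,C\,dx = \Big\langle\frac14,\varphi\Big\rangle,$$
where the convergence takes place in probability. It follows that $u$ is a solution (in
the above analytically weak sense) of
$$\partial_tu = \partial_{xx}u - u + \frac12\,\partial_x\big(u^2\big) + \frac14 + \xi.$$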
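The algebraic identity behind part b), namely that the right difference quotient of $\tfrac12u^2$ equals $u$ times the difference quotient of $u$ plus the quadratic correction $\tfrac1{2\varepsilon}(u(\cdot+\varepsilon)-u)^2$, is exact for every $\varepsilon > 0$. A quick numerical confirmation, with a hypothetical smooth test function standing in for $u$:

```python
import math


def u(x):
    """Hypothetical smooth periodic test function."""
    return math.sin(x) + 0.3 * math.cos(3.0 * x)


def forward_quotient(f, x, eps):
    """Right difference quotient D_{eps,r} f(x) = (f(x + eps) - f(x)) / eps."""
    return (f(x + eps) - f(x)) / eps


eps = 1e-3
gaps = []
for x in (0.0, 0.7, 2.0, 5.5):
    lhs = forward_quotient(lambda y: 0.5 * u(y) ** 2, x, eps)
    rhs = u(x) * forward_quotient(u, x, eps) + (u(x + eps) - u(x)) ** 2 / (2.0 * eps)
    gaps.append(abs(lhs - rhs))
print(max(gaps))
```

For a smooth function the correction term is of order $\varepsilon$ and vanishes in the limit; for the stochastic heat equation the increments are of order $\varepsilon^{1/2}$, which is why the correction survives as a constant.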
12.4 Comments
Abstract We give a short introduction to the main concepts of the general theory
of regularity structures. This theory unifies the theory of (controlled) rough paths
with the usual theory of Taylor expansions and allows to treat situations where the
underlying space is multidimensional.
13.1 Introduction
While a full exposition of the theory of regularity structures is well beyond the
scope of this book, we aim to give a concise overview of most of its concepts and
to show how the theory of controlled rough paths fits into it. In most cases, we will
only state results in a rather informal way and give some ideas as to how the proofs
work, focusing on conceptual rather than technical issues. The only exception is
the “reconstruction theorem”, Theorem 13.12 below, which is one of the linchpins
of the whole theory. Since its proof (or rather a slightly simplified version of it) is
relatively concise, we provide a fully self-contained version. For precise statements
and complete proofs of most of the results exposed here, we refer to the original
article [Hai14c]. See also the review articles [Hai14a, Hai14b] for shorter expositions
that complement the one given here.
It should be clear by now that a controlled rough path $(Y, Y') \in \mathscr{D}^{2\alpha}_W$ bears a
strong resemblance to a differentiable function, with the Gubinelli derivative $Y'$
describing the coefficient in front of a “first-order Taylor expansion” of the type
\[ Y_t = Y_s + Y'_s W_{s,t} + O(|t - s|^{2\alpha}) . \tag{13.1} \]
Compare this to the fact that a function $f : \mathbf{R} \to \mathbf{R}$ is of class $C^\gamma$ with $\gamma \in (k, k+1)$
if for every $s \in \mathbf{R}$ there exist coefficients $f_s^{(1)}, \ldots, f_s^{(k)}$ such that
\[ f_t = f_s + \sum_{\ell=1}^{k} f_s^{(\ell)} (t - s)^\ell + O(|t - s|^\gamma) . \tag{13.2} \]
192 13 Introduction to regularity structures
Of course, $f_s^{(\ell)}$ is nothing but the $\ell$th derivative of $f$ at the point $s$, divided by $\ell!$.
In this sense, one should really think of a controlled rough path $(Y, Y') \in \mathscr{D}^{2\alpha}_W$
Remark 13.3. In principle, the index set A can be infinite. By analogy with the
polynomials, it is then natural to consider $T$ as the set of all formal series of the form
$\sum_{\alpha \in A} \tau_\alpha$, where only finitely many of the $\tau_\alpha$’s are non-zero. This also dovetails
nicely with the particular form of elements in G. In practice however we will only
ever work with finite subsets of A so that the precise topology on T does not matter
as long as each of the Tα is finite-dimensional which is the case in all of the examples
we will consider here.
1
This only matters if dim T<α = +∞ for some α ∈ A.
We start with two simple special cases followed by the general polynomial structure.
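Fix $\gamma \in (0, 1)$ and consider a real-valued function belonging to the Hölder space
of exponent $\gamma$, say $f \in C^\gamma$. In other words, $f : \mathbf{R} \to \mathbf{R}$, and $|f_x - f_y| \lesssim |y - x|^\gamma$
uniformly for $x, y$ on compacts. The trivial regularity structure
\[ A = \{0\} , \qquad T = T_0 = \langle \mathbf{1} \rangle \cong \mathbf{R} , \qquad G = \{I\} , \]
\[ x \mapsto \boldsymbol{f}(x) := f_x \mathbf{1} . \]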
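\[ A = \{0, 1, 2\} , \qquad T = T_0 \oplus T_1 \oplus T_2 = \langle \mathbf{1}, X, X^2 \rangle \cong \mathbf{R}^3 , \]
with structure group $G = \{\Gamma_h \in L(T, T) : h \in (\mathbf{R}, +)\}$ where $\Gamma_h$ is given, with
respect to the ordered basis $\mathbf{1}, X, X^2$, by the matrix
\[ \begin{pmatrix} 1 & h & h^2 \\ 0 & 1 & 2h \\ 0 & 0 & 1 \end{pmatrix} . \]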
Note that $\Gamma_g \circ \Gamma_h = \Gamma_{g+h}$, so that $G$ inherits its group structure from $(\mathbf{R}, +)$.
Moreover, the triangular form, with ones on the diagonal, expresses exactly the
requirement (13.5), i.e. that the action of $\Gamma_h$ on any element in $T$ amounts to adding
terms of lower homogeneity. This structure allows us to represent the function $f$ and its
first two derivatives as a truncated Taylor series, namely as the $T$-valued map
\[ x \mapsto \boldsymbol{f}(x) := f_x \mathbf{1} + Df_x X + \frac{1}{2} D^2 f_x X^2 . \]
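As a sanity check on the matrix form of $\Gamma_h$, the sketch below verifies the group law $\Gamma_g \circ \Gamma_h = \Gamma_{g+h}$ and the fact that $\Gamma_{x-y}$ maps the jet at $y$ of a quadratic polynomial to its jet at $x$. The helper names (`gamma`, `jet`, `act`) and the sample quadratic are hypothetical, and the convention that re-expansion from $y$ to $x$ uses $h = x - y$ is the one adopted for the polynomial model later in this chapter.

```python
from fractions import Fraction

def gamma(h):
    """Matrix of Gamma_h in the ordered basis 1, X, X^2."""
    return [[1, h, h * h],
            [0, 1, 2 * h],
            [0, 0, 1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(matrix, vector):
    """Apply a matrix to a coefficient vector (c_1, c_X, c_{X^2})."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def jet(coeffs, x):
    """Coefficients (f_x, Df_x, D^2 f_x / 2) for f(t) = c0 + c1 t + c2 t^2."""
    c0, c1, c2 = coeffs
    return [c0 + c1 * x + c2 * x * x, c1 + 2 * c2 * x, c2]

g, h = Fraction(3, 4), Fraction(-5, 2)
group_law_holds = matmul(gamma(g), gamma(h)) == gamma(g + h)

coeffs = (Fraction(1), Fraction(-2), Fraction(3))  # hypothetical quadratic f
x, y = Fraction(1, 3), Fraction(7, 5)
reexpansion_holds = act(gamma(x - y), jet(coeffs, y)) == jet(coeffs, x)

print(group_law_holds, reexpansion_holds)
```

For a quadratic the Taylor expansion of order two is exact, which is why the re-expansion identity holds with equality here; for a general $f \in C^{2+\gamma}$ it would only hold up to a remainder.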
It is now an easy matter to generalize the above considerations to general Hölder
maps of several variables, say f : Rd → R in the Hölder space C n+γ , which is
defined by the obvious generalisation of (13.2) to functions on Rd . In this case, we
would take A = {0, 1, . . . , n} and T is the space of abstract polynomials of degree
at most n, in d commuting indeterminates X1 , . . . , Xd . This motivates the following
definition.
• $G \sim (\mathbf{R}^d, +)$ acts on $T$ via
\[ \Gamma_h P(X) = P(X + h \mathbf{1}) , \qquad h \in \mathbf{R}^d , \]
We start again from simple examples. What structure would be appropriate for Young
integration? Fix $\alpha \in (0, 1)$ and consider the problem of integrating a (continuous)
path $Y$ against a scalar $W \in C^\alpha$. In the case of smooth $W$, the indefinite integral
$Z = \int Y\, dW$ exists in Riemann–Stieltjes sense (and then $\dot{Z} = Y \dot{W}$). Otherwise,
$\dot{W}$ only exists as a Schwartz distribution (more precisely, $\dot{W}$ is an element of the
negative Hölder space $C^{\alpha - 1}$). The corresponding regularity structure is given by
\[ s \mapsto \dot{\boldsymbol{Z}}(s) := Y_s \dot{W} . \]
We shall see later how $\dot{\boldsymbol{Z}}$ gives rise to $\dot{Z}$, the distributional derivative of the
indefinite Young integral $\int Y\, dW$, provided of course that $Y$ has the correct regularity
such as $Y \in C^\beta$ with $\alpha + \beta > 1$.
Let us next consider the “task” of representing a controlled rough path in a suitable
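regularity structure. More precisely, consider $\alpha \in (1/3, 1/2]$, a path $W \in C^\alpha$ with
values in $\mathbf{R}$, say, and $(Y, Y') \in \mathscr{D}^{2\alpha}_W$ so that
\[ Y_t = Y_s + Y'_s W_{s,t} + O(|t - s|^{2\alpha}) . \tag{13.9} \]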
The right-hand side above is (some sort of) Taylor expansion, based on W ∈ C α ,
which describes Y well near the (time) point s. We want to formalize this by attaching
to each time s the “jet”
\[ \boldsymbol{Y}(s) := Y_s \mathbf{1} + Y'_s W . \]
Performing the substitution $\mathbf{1} \mapsto 1$, $W \mapsto (t \mapsto W_{s,t})$ gets us back to the right-hand
side of (13.9). This suggests defining the following regularity structure
\[ A = \{0, \alpha\} , \qquad T = T_0 \oplus T_\alpha = \langle \mathbf{1}, W \rangle \cong \mathbf{R}^2 , \]
with structure group $G = \{\Gamma_h \in L(T, T) : h \in (\mathbf{R}, +)\}$ where $\Gamma_h$ is given, with
respect to the ordered basis $\mathbf{1}, W$, by the matrix
\[ \begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix} . \]
this suggests (rather informally at this stage), that in the vicinity of any fixed time s,
the distributional derivative of Z should have an expansion of the type
which can be done with the aid of the following regularity structure.
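\[ A = \{\alpha - 1, 2\alpha - 1, 0, \alpha\} , \]
\[ T = T_{\alpha-1} \oplus T_{2\alpha-1} \oplus T_0 \oplus T_\alpha = \langle \dot{W}, \dot{\mathbb{W}}, \mathbf{1}, W \rangle \cong \mathbf{R}^4 , \]
with structure group $G = \{\Gamma_h \in L(T, T) : h \in (\mathbf{R}, +)\}$ where $\Gamma_h$ is given, with
respect to the ordered basis $\dot{W}, \dot{\mathbb{W}}, \mathbf{1}, W$, by the matrix
\[ \begin{pmatrix} 1 & h & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & h \\ 0 & 0 & 0 & 1 \end{pmatrix} . \]
Equivalently,
\[ \Gamma_h \mathbf{1} = \mathbf{1} , \qquad \Gamma_h \dot{W} = \dot{W} , \qquad \Gamma_h W = W + h \mathbf{1} , \qquad \Gamma_h \dot{\mathbb{W}} = \dot{\mathbb{W}} + h \dot{W} . \]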
It will be seen later that in this framework the function $\dot{\boldsymbol{Z}}$ defined in (13.11) does
indeed give rise to $\dot{Z}$, the distributional derivative of the indefinite rough integral
$\int Y\, d\mathbf{W}$. The extension to multi-component rough paths, $\mathbf{W} \in \mathscr{C}([0, T], \mathbf{R}^e)$ with
$e > 1$, is essentially trivial. We just need more basis vectors $\dot{W}^i$, $\dot{\mathbb{W}}^{j,k}$, $W^l$ (with
$1 \le i, j, k, l \le e$):
Definition 13.5. Let α ∈ (1/3, 1/2]. The regularity structure for α-Hölder rough
paths (over Re ) is given by
13.3 Definition of a model and first examples 197
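\[ T_0 = \langle \mathbf{1} \rangle , \qquad T_\alpha = \langle W^1, \ldots, W^e \rangle , \]
\[ T_{\alpha-1} = \langle \dot{W}^1, \ldots, \dot{W}^e \rangle , \qquad T_{2\alpha-1} = \langle \dot{\mathbb{W}}^{ij} : 1 \le i, j \le e \rangle . \]
\[
\begin{aligned}
\Gamma_h \mathbf{1} &= \mathbf{1} , & \Gamma_h W^i &= W^i + h^i \mathbf{1} , \\
\Gamma_h \dot{W}^i &= \dot{W}^i , & \Gamma_h \dot{\mathbb{W}}^{ij} &= \dot{\mathbb{W}}^{ij} + h^i \dot{W}^j .
\end{aligned}
\tag{13.12}
\]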
In a Brownian (rough path) context, one has Hölder regularity with exponent
α = 1/2 − κ, for arbitrarily small κ > 0. The above index set A, relevant for a
“regularity structure view” on stochastic integration, then becomes
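\[ A = \Big\{ -\tfrac{1}{2} - \kappa ,\; -2\kappa ,\; 0 ,\; \tfrac{1}{2} - \kappa \Big\} , \]
which, in abusive but convenient notation, we write as
\[ A = \Big\{ -\tfrac{1}{2}^{-} ,\; 0^{-} ,\; 0 ,\; \tfrac{1}{2}^{-} \Big\} . \]
Index sets of this form (“half-integers$^-$”) will also be typical in later SPDE situations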
driven by spatial or space-time white noise.
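The passage from $A = \{\alpha - 1, 2\alpha - 1, 0, \alpha\}$ to the Brownian index set is a direct substitution of $\alpha = 1/2 - \kappa$. The short sketch below carries it out in exact arithmetic; the function names and the sample value of $\kappa$ are hypothetical.

```python
from fractions import Fraction

def rough_path_index_set(alpha):
    """Homogeneities of dot-W, dot-WW, 1 and W, in the order used in the text."""
    return [alpha - 1, 2 * alpha - 1, Fraction(0), alpha]

def brownian_index_set(kappa):
    """Index set for Hoelder exponent alpha = 1/2 - kappa."""
    return rough_path_index_set(Fraction(1, 2) - kappa)

kappa = Fraction(1, 100)  # hypothetical small parameter
A = brownian_index_set(kappa)
expected = [-Fraction(1, 2) - kappa, -2 * kappa, Fraction(0), Fraction(1, 2) - kappa]

print(A == expected, A == sorted(A))
```

The second check confirms that, for small $\kappa$, the four homogeneities appear in increasing order, with the two negative ones belonging to the distribution-valued symbols.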
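around the origin. We also write $\mathcal{D}'(\mathbf{R}^d)$ for the space of Schwartz distributions on
$\mathbf{R}^d$. With these notations, our definition of a model for a given regularity structure
$\mathscr{T}$ is as follows.
\[ \Pi : \mathbf{R}^d \to L\big(T, \mathcal{D}'(\mathbf{R}^d)\big) , \qquad \Gamma : \mathbf{R}^d \times \mathbf{R}^d \to G , \]
\[ x \mapsto \Pi_x , \qquad (x, y) \mapsto \Gamma_{xy} , \]
such that $\Gamma_{xy} \Gamma_{yz} = \Gamma_{xz}$ and $\Pi_x \Gamma_{xy} = \Pi_y$. We then say that $\Pi_x$ realizes an element
of $T$ as a Schwartz distribution.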
Furthermore, write r for the smallest integer such that r > | min A| ≥ 0. We then
impose that for every compact set K ⊂ Rd and every γ > 0, there exists a constant
C = C(K, γ) such that the bounds
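\[ \big| (\Pi_x \tau)(\varphi^\lambda_x) \big| \le C \lambda^\alpha \|\tau\|_\alpha , \qquad \|\Gamma_{xy} \tau\|_\beta \le C |x - y|^{\alpha - \beta} \|\tau\|_\alpha , \tag{13.13} \]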
One very important remark is that the space M of all models for a given regularity
structure is not a linear space. However, it can be viewed as a closed subset (deter-
mined by the nonlinear constraints Γxy ∈ G, Γxy Γyz = Γxz , and Πy = Πx Γxy ) of
the linear space with seminorms (indexed by the compact set K) given by the smallest
constant C in (13.13). Also, there is a natural distance between models (Π, Γ ) and
(Π̄, Γ̄ ) given by the smallest constant C in (13.13), when replacing Πx by Πx − Π̄x
and Γxy by Γxy − Γ̄xy .
Remark 13.8. The identity Πx Γxy = Πy reflects the fact that Γxy is the linear map
that takes an expansion around y and turns it into an expansion around x. The first
bound in (13.13) states what we mean precisely when we say that τ ∈ Tα represents
a term that vanishes at order α. (See Exercise 13.31; note that α can be negative, so
that this may actually not vanish at all!) The second bound in (13.13) is very natural
in view of both (13.3) and (13.4). It states that when expanding a monomial of order
α around a new point at distance h from the old one, the coefficient appearing in
front of lower-order monomials of order β is of order at most hα−β .
Remark 13.9. In many cases of interest, it is natural to scale the different directions of
Rd in a different way. This is the case for example when using the theory of regularity
structures to build solution theories for parabolic stochastic PDEs, in which case
the time direction “counts as” two space directions. This “parabolic scaling” can be
formalized by the integer vector (2, 1, . . . , 1). More generally, one can introduce a
scaling $\mathfrak{s}$ of $\mathbf{R}^d$, which is just a collection of $d$ mutually prime strictly positive integers
and define $\varphi^\lambda_x$ in such a way that the $i$th direction is scaled by $\lambda^{\mathfrak{s}_i}$. The polynomial
structure introduced earlier, in particular (13.7), should be changed accordingly by
postulating that the homogeneity of $X^k$ is given by $|k|_{\mathfrak{s}} = \sum_{i=1}^{d} \mathfrak{s}_i k_i$. In this case,
the Euclidean distance between two points should be replaced everywhere by the
corresponding scaled distance $|x|_{\mathfrak{s}} = \sum_i |x_i|^{1/\mathfrak{s}_i}$. See [Hai14c] for more details.
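To make the scaled degree and the scaled distance of Remark 13.9 concrete, here is a minimal sketch for the parabolic scaling $(2, 1, 1)$ of one time and two space directions. The function names and the sample point are hypothetical. The last lines check the defining homogeneity property: rescaling the $i$th coordinate by $\lambda^{\mathfrak{s}_i}$ multiplies the scaled distance by $\lambda$.

```python
import math

def scaled_degree(k, s):
    """|k|_s = sum_i s_i k_i for a multi-index k and a scaling s."""
    return sum(si * ki for si, ki in zip(s, k))

def scaled_distance(x, s):
    """|x|_s = sum_i |x_i|^(1/s_i)."""
    return sum(abs(xi) ** (1.0 / si) for xi, si in zip(x, s))

def rescale(x, s, lam):
    """Scale the i-th direction by lam^(s_i)."""
    return tuple(lam**si * xi for xi, si in zip(x, s))

parabolic = (2, 1, 1)                                 # time counts as two space directions
time_degree = scaled_degree((1, 0, 0), parabolic)     # homogeneity of the monomial t
mixed_degree = scaled_degree((1, 2, 0), parabolic)    # homogeneity of t * x_1^2

x = (0.5, -0.3, 2.0)                                  # hypothetical point (t, x_1, x_2)
lam = 0.1
ratio = scaled_distance(rescale(x, parabolic, lam), parabolic) / scaled_distance(x, parabolic)

print(time_degree, mixed_degree, math.isclose(ratio, lam))
```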
Remark 13.11. (Compare with Remark 4.8 in the rough path context.) It is important
to note that while the space of models $\mathscr{M}$ is not a linear space, the space $\mathscr{D}^\gamma_M$ is a
linear space (with Banach, or at least Fréchet structure), given a model $M \in \mathscr{M}$. The
twist of course is that the space in question depends in a crucial way on the choice of
$M$. The total space then is
\[ \mathscr{M} \ltimes \mathscr{D}^\gamma \stackrel{\text{def}}{=} \bigcup_{M \in \mathscr{M}} \{M\} \times \mathscr{D}^\gamma_M , \]
with base space $\mathscr{M}$ and “fibres” $\mathscr{D}^\gamma_M$.
The most fundamental result in the theory of regularity structures then states that
given f ∈ D γ with γ > 0, there exists a unique Schwartz distribution Rf on Rd
such that, for every x ∈ Rd , Rf “looks like Πx f (x) near x”. More precisely, one
has
such that
\[ \big| \big( \mathcal{R} f - \Pi_x f(x) \big)(\varphi^\lambda_x) \big| \lesssim \lambda^\gamma , \tag{13.15} \]
uniformly over ϕ ∈ Br and λ as before, and locally uniformly in x. Without the
positivity assumption on γ, everything remains valid but uniqueness of R.
Remark 13.13. Actually, $\mathcal{R}$ should really be viewed as a (nonlinear!) map from the
total space $\mathscr{M} \ltimes \mathscr{D}^\gamma$ into $\mathcal{D}'(\mathbf{R}^d)$. It is then also continuous with respect to the
natural topology on this space, which is essential when using it to prove convergence
results. We will however not prove this stronger continuity statement here.
We postpone the proof of the reconstruction theorem, as well as the above remark,
and turn instead to our previous list of regularity structures, now adding the relevant
models and indicating the interest of the reconstruction map.
\[ \Pi_x X^k = \big( y \mapsto (y - x)^k \big) , \qquad \Gamma_{xy} = \Gamma_h \big|_{h = x - y} . \]
We leave it as an exercise to the reader to verify that this does indeed satisfy the
bounds and relations of Definition 13.6.
In the sense of the following proposition, modelled distributions in the context of
the polynomial model are nothing but classical Hölder functions.
The proof is not difficult. Given f ∈ C n+γ , write f (x) for the Taylor expansion up to
order n with all monomials (y − x)k replaced by X k . It is immediate to check that
fˆ = f will do. The converse is obvious when n = 0, the general case can be seen
by induction. The proposition remains valid for integer values of β with the usual
caveat that in this context C β means β − 1 times continuously differentiable with the
highest order derivatives locally Lipschitz continuous.
Validity of such a proposition for negative exponents requires a suitable notion of
“negative” Hölder spaces. In fact, the considerations above (see also Exercise 13.31)
suggest that a very natural space of distributions is obtained in the following way.
Given $\alpha > 0$, we denote by $C^{-\alpha}$ the space of all Schwartz distributions $\eta$ such that $\eta$
belongs to the dual of $C^r_c$ (elements in $C^r_b$ with compact support) with $r$ the smallest
integer such that $r > \alpha$, and such that
\[ \big| \eta(\varphi^\lambda_x) \big| \lesssim \lambda^{-\alpha} , \]
uniformly over all ϕ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. Given any
compact set K, the best possible constant such that the above bound holds uniformly
over x ∈ K yields a seminorm. The collection of these seminorms endows C −α with
a Fréchet space structure.
Remark 13.15. In terms of the scale of classical Besov spaces, the space $C^{-\alpha}$ is a
local version of $B^{-\alpha}_{\infty,\infty}$. It is in some sense the largest space of distributions that is
invariant under the scaling $\varphi(\cdot) \mapsto \lambda^{-\alpha} \varphi(\lambda^{-1} \cdot)$, see for example [BP08].
\[ B : C^\beta \times C^{-\alpha} \to \mathcal{D}'(\mathbf{R}^d) \]
Proof. Assume from now on that g = ξ ∈ C −α for some α > 0 and that f ∈ C β for
some β > α. We then build a regularity structure T in the following way. For the
set A, we take A = N ∪ (N − α) and for T , we set T = V ⊕ W , where each one of
the spaces V and W is a copy of the canonical polynomial model (in d commuting
variables). We also choose Γ as in the canonical polynomial model above, acting
simultaneously on each of the two instances.
As before, we denote by X k the canonical basis vectors in V . We also use the
suggestive notation “ΞX k ” for the corresponding basis vector in W , but we postulate
that ΞX k ∈ T|k|−α rather than ΞX k ∈ T|k| . Given any distribution ξ ∈ C −α , we
then define a model (Π ξ , Γ ), where Γ is as in the canonical model, while Π ξ acts as
with the obvious abuse of notation in the second expression. It is then straightforward
to verify that Πy = Πx ◦ Γxy and that the relevant analytical bounds are satisfied, so
that this is indeed a model.
Denote now by Rξ the reconstruction map associated to the model (Π ξ , Γ ) and,
for f ∈ C β , denote by f the element in D β given by the local Taylor expansion of
f of order β at each point. Note that even though the space D β does in principle
depend on the choice of model, in our situation f ∈ D β for any choice of ξ. It
follows immediately from the definitions that the map x 7→ Ξf (x) belongs to D β−α
so that, provided that β > α, one can apply the reconstruction operator to it. This
suggests that the multiplication operator we are looking for can be defined as
\[ B(f, \xi) = \mathcal{R}^\xi \big( \Xi \boldsymbol{f} \big) . \]
Let us see now how some of the results of Section 4 can be reinterpreted in the light
of this theory. Fix α ∈ (1/3, 1/2] and let T be the rough path regularity structure
put forward in Definition 13.5. Recall that this means A = {α − 1, 2α − 1, 0, α}.
We have for T0 a copy of R with unit vector 1, for Tα and Tα−1 a copy of Re with
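respective unit vectors $W^j$ and $\dot{W}^j$, and for $T_{2\alpha-1}$ a copy of $\mathbf{R}^{e \times e}$ with unit vectors
$\dot{\mathbb{W}}^{ij}$. The structure group $G$ is isomorphic to $\mathbf{R}^e$ and, for $h \in \mathbf{R}^e$, acts on $T$ via
\[ \Gamma_h \mathbf{1} = \mathbf{1} , \quad \Gamma_h \dot{W}^i = \dot{W}^i , \quad \Gamma_h W^i = W^i + h^i \mathbf{1} , \quad \Gamma_h \dot{\mathbb{W}}^{ij} = \dot{\mathbb{W}}^{ij} + h^i \dot{W}^j . \tag{13.16} \]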
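Let now $\mathbf{W} = (W, \mathbb{W})$ be an α-Hölder continuous rough path over $\mathbf{R}^e$. It turns out
that this defines a model for $\mathscr{T}$ in the following way:
Lemma 13.18. Given an α-Hölder continuous rough path $\mathbf{W}$, one can define a model
$M = M_{\mathbf{W}}$ for $\mathscr{T}$ on $\mathbf{R}$ by setting $\Gamma_{t,s} = \Gamma_{W_{s,t}}$ and
\[ \big( \Pi_s \mathbf{1} \big)(t) = 1 , \qquad \big( \Pi_s W^j \big)(t) = W^j_{s,t} , \]
\[ \big( \Pi_s \dot{W}^j \big)(\psi) = \int \psi(t)\, dW^j_t , \qquad \big( \Pi_s \dot{\mathbb{W}}^{ij} \big)(\psi) = \int \psi(t)\, d\mathbb{W}^{ij}_{s,t} . \]
Here, both integrals are perfectly well-defined Riemann integrals, with the differential
in the second case taken with respect to the variable $t$. Given a controlled rough path
$(Y, Y') \in \mathscr{D}^{2\alpha}_W$, this then defines an element $\boldsymbol{Y} \in \mathscr{D}^{2\alpha}_M$ by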
Proof. We first check that the algebraic properties of Definition 13.6 are satisfied.
It is clear that Γs,u Γu,t = Γs,t and that Πs Γs,u τ = Πu τ for τ ∈ {1, W j , Ẇ j }.
Regarding $\dot{\mathbb{W}}^{ij}$, we differentiate Chen’s relations (2.1) which yields the identity
\[ d\mathbb{W}^{i,j}_{s,t} = d\mathbb{W}^{i,j}_{u,t} + W^i_{s,u}\, dW^j_t . \]
The last missing algebraic relation then follows at once. The required analytic bounds
follow immediately (exercise!) from the definition of the rough path space C α .
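The algebraic part of this proof can be checked by hand on a smooth example. The sketch below is an illustration under stated assumptions rather than part of the argument: it takes the hypothetical path $W(t) = (t, t^2)$ in $\mathbf{R}^2$ with its canonical lift $\mathbb{W}^{ij}_{s,t} = \int_s^t W^i_{s,r}\, dW^j_r$ (computed in closed form), verifies Chen’s relation, and then verifies the relation $\Pi_s \Gamma_{s,u} \dot{\mathbb{W}}^{ij} = \Pi_u \dot{\mathbb{W}}^{ij}$ at the level of densities in $t$, using $\Gamma_{s,u} = \Gamma_{W_{u,s}}$.

```python
from fractions import Fraction

def path(t):
    """Hypothetical smooth path W(t) = (t, t^2) in R^2."""
    return (t, t * t)

def velocity(t):
    return (Fraction(1), 2 * t)

def increment(s, t):
    return tuple(b - a for a, b in zip(path(s), path(t)))

def second_level(s, t):
    """Closed-form iterated integrals WW^{ij}_{s,t} = int_s^t W^i_{s,r} dW^j_r."""
    cube = t**3 - s**3
    square = t * t - s * s
    return [[(t - s) ** 2 / 2, Fraction(2, 3) * cube - s * square],
            [cube / 3 - s * s * (t - s), square**2 / 2]]

def chen_holds(s, u, t):
    w_su, w_ut = increment(s, u), increment(u, t)
    ww_st, ww_su, ww_ut = second_level(s, t), second_level(s, u), second_level(u, t)
    return all(ww_st[i][j] == ww_su[i][j] + ww_ut[i][j] + w_su[i] * w_ut[j]
               for i in range(2) for j in range(2))

def density(s, t, i, j):
    """Density in t of Pi_s applied to dot-WW^{ij}, for the canonical lift."""
    return increment(s, t)[i] * velocity(t)[j]

def model_relation_holds(s, u, t):
    # Pi_s Gamma_{s,u} dot-WW^{ij} = Pi_s dot-WW^{ij} + W^i_{u,s} Pi_s dot-W^j
    # should agree with Pi_u dot-WW^{ij}; we compare densities at time t.
    w_us = increment(u, s)
    return all(density(s, t, i, j) + w_us[i] * velocity(t)[j] == density(u, t, i, j)
               for i in range(2) for j in range(2))

s, u, t = Fraction(1, 3), Fraction(1, 2), Fraction(2)
print(chen_holds(s, u, t), model_relation_holds(s, u, t))
```

For a genuinely rough $\mathbf{W}$ the densities do not exist and the identities have to be read in the distributional sense, as in the lemma; the smooth case only illustrates the algebra.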
Regarding the function Y defined in the statement, we have
so that the condition (13.14) with γ = 2α does indeed coincide with the definition of
a controlled rough path. □
Theorems 4.4 and 4.10 can then be recovered as a particular case of the recon-
struction theorem in the following way.
and such that $Z^j_{s,t} = Y(s)\, W^j_{s,t} + Y'_i(s)\, \mathbb{W}^{i,j}_{s,t} + O(|t - s|^{3\alpha})$.
By a simple approximation argument, it turns out that one can take for ψ the indicator
function of the interval [0, 1], so that
\[ \eta(1_{[s,t]}) = Y(s)\, W^j_{s,t} + Y'_i(s)\, \mathbb{W}^{i,j}_{s,t} + O(|t - s|^{3\alpha}) . \]
Here, the reason why one obtains an exponent 3α rather than 3α − 1 is that it is
really $|t - s|^{-1} 1_{[s,t]}$ that scales like an approximate δ-distribution as t → s. □
Remark 13.22. The theory of (controlled) rough paths of lower regularity already
hinted at in Section 2.4 can be recovered from the reconstruction operator and a
suitable choice of regularity structure (essentially two copies of the truncated tensor
algebra) in virtually the same way.
Let us give another application to rough path theory. Given an arbitrary path
$W \in C^\alpha$ with values in $\mathbf{R}^e$, does there exist a (since α ≤ 1/2: non-unique) rough
path lift? In dimension e = 1, the answer is trivially yes: it suffices to set $\mathbb{W}_{s,t} = \frac{1}{2} W_{s,t}^2$,
but the case of e > 1 is non-trivial. The following can be obtained as an easy application
of the reconstruction theorem in the case γ ≤ 0.
\[ \mathbf{W} = (W, \mathbb{W}) \in \mathscr{C}^\alpha([0, T], \mathbf{R}^e) . \]
Furthermore, this can be done in such a way that the map $W \mapsto \mathbf{W}$ is continuous.
Remark 13.24. The reader may wonder how this dovetails with Proposition 1.1. The
point is that if we define $W \mapsto \mathbb{W}$ by an application of the reconstruction theorem
with γ < 0, this map restricted to smooth paths does in general not coincide with the
Riemann–Stieltjes integral of W against itself.
We trust the reader is familiar with the Haar (wavelet) basis. The analysis seen earlier
in the rough path context (e.g. the proof of the sewing lemma, based on dyadic refine-
ments) can be viewed as based on this wavelet basis. The Haar basis, however, suffers
from lack of regularity. Fortunately, the following result due to Daubechies [Dau88]
provides us with much more regular functions that enjoy analogous properties:
Theorem 13.25. Given any integer 0 < r < ∞, there exists a function ϕ : Rd → R
with the following properties:
1. The function ϕ is of class Cbr and has compact support.
13.4 Wavelets and the reconstruction theorem 205
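2. For every polynomial $P$ of degree $r$, there exists a polynomial $\hat{P}$ of degree $r$ such
that, for every $x \in \mathbf{R}^d$, one has $\sum_{y \in \mathbf{Z}^d} \hat{P}(y) \varphi(x - y) = P(x)$.
3. One has $\int \varphi(x) \varphi(x - y)\, dx = \delta_{y,0}$ for every $y \in \mathbf{Z}^d$.
4. There exist coefficients $\{a_k\}_{k \in \mathbf{Z}^d}$ such that $2^{-d/2} \varphi(x/2) = \sum_{k \in \mathbf{Z}^d} a_k \varphi(x - k)$.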
The existence of such a function ϕ is highly non-trivial and actually equivalent to the
existence of a wavelet basis consisting of Cbr functions with compact support. Let us
restate the reconstruction theorem for the reader’s convenience. (We only consider
the case γ > 0 here.)
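Theorem 13.26. Let $\mathscr{T}$ be a regularity structure as above and let $(\Pi, \Gamma)$ be a model
for $\mathscr{T}$ on $\mathbf{R}^d$. Then, there exists a unique linear map $\mathcal{R} : \mathscr{D}^\gamma \to \mathcal{D}'(\mathbf{R}^d)$ such that
\[ \big| \big( \mathcal{R} f - \Pi_x f(x) \big)(\varphi^\lambda_x) \big| \lesssim \lambda^\gamma , \]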
Proof. We pick $\varphi$ with properties (1–4), as provided by Theorem 13.25, for some $r >
|\inf A|$. We also set $\Lambda_n = 2^{-n} \mathbf{Z}^d$ and, for $y \in \Lambda_n$, we set $\varphi^n_y(x) = 2^{nd/2} \varphi(2^n (x - y))$.
Here, the normalisation is chosen in such a way that the set $\{\varphi^n_y\}_{y \in \Lambda_n}$ is again
orthonormal in $L^2$. We then denote by $V_n \subset C^r$ the linear span of $\{\varphi^n_y\}_{y \in \Lambda_n}$, so that,
by the property (4) above, one has $V_0 \subset V_1 \subset V_2 \subset \ldots$. We furthermore denote by
$\hat{V}_n$ the $L^2$-orthogonal complement of $V_{n-1}$ in $V_n$, so that $V_n = V_0 \oplus \hat{V}_1 \oplus \ldots \oplus \hat{V}_n$.
In order to keep notations compact, it will also be convenient to define the coefficients
$a^n_k$ with $k \in \Lambda_n$ by $a^n_k = a_{2^n k}$.
With these notations at hand, we then define a sequence of linear operators
$\mathcal{R}^n : \mathscr{D}^\gamma \to C^r$ by
\[ \big( \mathcal{R}^n f \big)(y) = \sum_{x \in \Lambda_n} \big( \Pi_x f(x) \big)(\varphi^n_x)\, \varphi^n_x(y) . \]
We claim that there then exists a Schwartz distribution $\mathcal{R} f$ such that, for every
compactly supported test function $\psi$ of class $C^r$, one has $\langle \mathcal{R}^n f, \psi \rangle \to (\mathcal{R} f)(\psi)$,
and that $\mathcal{R} f$ furthermore satisfies the properties stated in the theorem.
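To get a feeling for the approximations $\mathcal{R}^n f$, consider the simplest possible setting, which is an assumption of this sketch and not the setting of the proof: $d = 1$, the trivial regularity structure $T = \langle \mathbf{1} \rangle$ with $\Pi_x \mathbf{1} \equiv 1$, and the Haar scaling function $\varphi = 1_{[0,1)}$ in place of a regular Daubechies function (Haar satisfies the orthonormality and two-scale properties (3) and (4) but is not of class $C^r$). In that case $\mathcal{R}^n f(y)$ is simply the value of $f$ at the left endpoint of the dyadic cell containing $y$, and for $f \in C^\gamma$ the error is of order $2^{-\gamma n}$. The test function $f = \sqrt{\cdot}$, which is Hölder of exponent $1/2$ with constant $1$, is a hypothetical choice.

```python
import math

def haar(x):
    """Haar scaling function, the indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def reconstruct_level(f, n):
    """R^n f for d = 1, the trivial structure T = <1> and the Haar function."""
    scale = 2.0 ** n
    def rn(y):
        x = math.floor(y * scale) / scale     # the only x in Lambda_n with phi^n_x(y) != 0
        coefficient = f(x) * scale ** -0.5    # (Pi_x f(x))(phi^n_x) = f(x) * integral of phi^n_x
        return coefficient * scale ** 0.5 * haar(scale * (y - x))  # times phi^n_x(y)
    return rn

f = math.sqrt                                 # Hoelder continuous with exponent 1/2
grid = [k / 1000 for k in range(1000)]
errors = {n: max(abs(reconstruct_level(f, n)(y) - f(y)) for y in grid)
          for n in range(1, 11)}
within_bound = all(errors[n] <= 2.0 ** (-n / 2) + 1e-12 for n in errors)

# Two-scale relation (4) for Haar: 2^{-1/2} phi(x/2) = a_0 phi(x) + a_1 phi(x - 1)
a = 2.0 ** -0.5
samples = [-0.5, 0.0, 0.25, 0.99, 1.0, 1.5, 1.999, 2.0, 3.0]
two_scale_holds = all(
    math.isclose(a * haar(x / 2), a * haar(x) + a * haar(x - 1), abs_tol=1e-15)
    for x in samples)

print(within_bound, two_scale_holds)
```

The bound $2^{-n/2}$ used in the check is the rate $2^{-\gamma n}$ with $\gamma = 1/2$, in line with the estimates derived in the remainder of the proof.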
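Let us first consider the size of the components of $\mathcal{R}^{n+1} f - \mathcal{R}^n f$ in $V_n$. Given
$x \in \Lambda_n$, we make use of properties (3-4), so that
\[
\begin{aligned}
\langle \mathcal{R}^{n+1} f - \mathcal{R}^n f, \varphi^n_x \rangle
 &= \sum_{k \in \Lambda_{n+1}} a^n_k\, \langle \mathcal{R}^{n+1} f, \varphi^{n+1}_{x+k} \rangle - \big( \Pi_x f(x) \big)(\varphi^n_x) \\
 &= \sum_{k \in \Lambda_{n+1}} a^n_k\, \big( \Pi_{x+k} f(x+k) \big)(\varphi^{n+1}_{x+k}) - \big( \Pi_x f(x) \big)(\varphi^n_x) \\
 &= \sum_{k \in \Lambda_{n+1}} a^n_k\, \Big( \big( \Pi_{x+k} f(x+k) \big)(\varphi^{n+1}_{x+k}) - \big( \Pi_x f(x) \big)(\varphi^{n+1}_{x+k}) \Big) \\
 &= \sum_{k \in \Lambda_{n+1}} a^n_k\, \Big( \Pi_{x+k} \big( f(x+k) - \Gamma_{x+k,x} f(x) \big) \Big)(\varphi^{n+1}_{x+k}) ,
\end{aligned}
\]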
where we used the algebraic relations between Πx and Γxy to obtain the last identity.
Since only finitely many of the coefficients ak are non-zero, it follows from the
definition of D γ that for the non-vanishing terms in this sum we have the bound
again uniformly over $n \ge 0$ and $x$ in any compact set. Here, the additional factor
$2^{-nd/2}$ comes from the fact that the functions $\varphi^n_x$ are normalised in $L^2$ rather than $L^1$.
Combining these two bounds, we immediately obtain that
\[ \big| \langle \mathcal{R}^{n+1} f - \mathcal{R}^n f, \varphi^n_x \rangle \big| \lesssim 2^{-\gamma n - \frac{nd}{2}} , \tag{13.17} \]
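uniformly over $n \ge 0$ and $x$ in compact sets. Take now a test function $\psi \in C^r_b$ with
compact support and let us try to estimate $\langle \mathcal{R}^{n+1} f - \mathcal{R}^n f, \psi \rangle$. Since
$\mathcal{R}^{n+1} f - \mathcal{R}^n f \in V_{n+1}$, we can decompose it into a part $\delta \mathcal{R}^n f \in V_n$ and a part $\hat{\delta} \mathcal{R}^n f \in \hat{V}_{n+1}$
and estimate both parts separately. Regarding the part in $V_n$, we have
\[
\big| \langle \delta \mathcal{R}^n f, \psi \rangle \big|
 = \Big| \sum_{x \in \Lambda_{n+1}} \langle \delta \mathcal{R}^n f, \varphi^n_x \rangle \langle \varphi^n_x, \psi \rangle \Big|
 \lesssim 2^{-\gamma n - \frac{nd}{2}} \sum_{x \in \Lambda_{n+1}} \big| \langle \varphi^n_x, \psi \rangle \big| ,
\tag{13.18}
\]
where we made use of the bound (13.17). At this stage we use the fact that, due
to the boundedness of $\psi$, we have $|\langle \varphi^n_x, \psi \rangle| \lesssim 2^{-nd/2}$. Furthermore, thanks to the
boundedness of the support of $\psi$, the number of non-vanishing terms appearing in
this sum is bounded by $2^{nd}$, so that we eventually obtain the bound
\[ \big| \langle \delta \mathcal{R}^n f, \psi \rangle \big| \lesssim 2^{-\gamma n} . \tag{13.19} \]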
Regarding the second term, we use the standard fact coming from wavelet analysis
[Mey92] that a basis of $\hat{V}_{n+1}$ can be obtained in the same way as the basis of $V_n$, but
replacing the function $\varphi$ by functions $\hat{\varphi}$ from some finite set $\Phi$. In other words, $\hat{V}_{n+1}$
is the linear span of $\{\hat{\varphi}^n_x\}_{x \in \Lambda_n;\, \hat{\varphi} \in \Phi}$. Furthermore, as a consequence of property (2),
the functions $\hat{\varphi} \in \Phi$ all have the property that
\[ \int \hat{\varphi}(x)\, P(x)\, dx = 0 , \tag{13.20} \]
for any polynomial $P$ of degree less than or equal to $r$. In particular, this shows that one
has the bound
\[ \big| \langle \hat{\varphi}^n_x, \psi \rangle \big| \lesssim 2^{-\frac{nd}{2} - nr} . \]
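As a consequence, one has
\[
\big| \langle \hat{\delta} \mathcal{R}^n f, \psi \rangle \big|
 = \Big| \sum_{x \in \Lambda_n,\, \hat{\varphi} \in \Phi} \langle \mathcal{R}^{n+1} f, \hat{\varphi}^n_x \rangle \langle \hat{\varphi}^n_x, \psi \rangle \Big|
 \lesssim 2^{-\frac{nd}{2} - nr} \sum_{x \in \Lambda_n,\, \hat{\varphi} \in \Phi} \big| \langle \mathcal{R}^{n+1} f, \hat{\varphi}^n_x \rangle \big| .
\]
At this stage, we note that, thanks to the definition of $\mathcal{R}^{n+1}$ and the bounds on the
model $(\Pi, \Gamma)$, we have $|\langle \mathcal{R}^{n+1} f, \hat{\varphi}^n_x \rangle| \lesssim 2^{-\frac{nd}{2} - \alpha_0 n}$, where $\alpha_0 = \inf A$, so that
$|\langle \hat{\delta} \mathcal{R}^n f, \psi \rangle| \lesssim 2^{-nr - \alpha_0 n}$. Combining this with (13.19), we see that one has indeed
$\mathcal{R}^n f \to \mathcal{R} f$ for some Schwartz distribution $\mathcal{R} f$.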
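It remains to show that the bound (13.15) holds. For this, given a distribution
$\eta \in C^\alpha$ for some $\alpha > -r$, we first introduce the notation
\[ P_n \eta = \sum_{x \in \Lambda_n} \eta(\varphi^n_x)\, \varphi^n_x , \qquad \hat{P}_n \eta = \sum_{\hat{\varphi} \in \Phi} \sum_{x \in \Lambda_n} \eta(\hat{\varphi}^n_x)\, \hat{\varphi}^n_x . \]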
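\[
{} = \big( \mathcal{R}^n f - P_n \Pi_x f(x) \big)
 + \sum_{m \ge n} \big( \hat{\delta} \mathcal{R}^m f - \hat{P}_m \Pi_x f(x) \big)
 + \sum_{m \ge n} \delta \mathcal{R}^m f . \tag{13.21}
\]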
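We then test these terms against $\psi^\lambda_x$ and we estimate the resulting terms separately.
For the first term, we have the identity
\[ \big( \mathcal{R}^n f - P_n \Pi_x f(x) \big)(\psi^\lambda_x) = \sum_{y \in \Lambda_n} \big( \Pi_y f(y) - \Pi_x f(x) \big)(\varphi^n_y)\, \langle \varphi^n_y, \psi^\lambda_x \rangle . \tag{13.22} \]
We have the bound $|\langle \varphi^n_y, \psi^\lambda_x \rangle| \lesssim \lambda^{-d} 2^{-dn/2} \sim 2^{dn/2}$. Since one furthermore has
$|y - x| \lesssim \lambda$ for all non-vanishing terms in the sum, one also has similarly to before
\[ \big| \big( \Pi_y f(y) - \Pi_x f(x) \big)(\varphi^n_y) \big| \lesssim \sum_{\alpha < \gamma} \lambda^{\gamma - \alpha}\, 2^{-\frac{dn}{2} - \alpha n} \sim 2^{-\frac{dn}{2} - \gamma n} . \tag{13.23} \]
Since only finitely many (independently of $n$) terms contribute to the sum in (13.22),
it is indeed bounded by a constant proportional to $2^{-\gamma n} \sim \lambda^\gamma$ as required.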
We now turn to the second term in (13.21), where we consider some fixed value
m ≥ n. We rewrite this term very similarly to before as
where the sum runs over $y \in \Lambda_{m+1}$ and $z \in \Lambda_m$. This time, we use the fact that by
the property (13.20) of the wavelets $\hat{\varphi}$, one has the bound
\[ \big| \langle \hat{\varphi}^m_z, \psi^\lambda_x \rangle \big| \lesssim \lambda^{-d-r}\, 2^{-rm - \frac{md}{2}} , \tag{13.24} \]
For the last term in (13.21), we combine (13.18) with the bound $|\langle \varphi^m_y, \psi^\lambda_x \rangle| \lesssim
\lambda^{-d} 2^{-dm/2}$ and the fact that there are of the order of $\lambda^d 2^{md}$ terms appearing in the
sum (13.18) to conclude that the $m$th summand is bounded by a constant proportional
to $2^{-\gamma m}$. Summing over $m$ yields again the desired bound and concludes the proof. □
Remark 13.27. There are obvious analogies between the construction of the recon-
struction operator R and that of the “rough integral” in Section 4, see also Exer-
cise 13.33. As a matter of fact, there exists a slightly more abstract formulation of
the reconstruction theorem which can be interpreted as a multidimensional analogue
to the sewing lemma, Lemma 4.2.
Remark 13.28. In view of Remark 13.11, and with $M = (\Pi, \Gamma) \in \mathscr{M}$, one should
really view $\mathcal{R} = \mathcal{R}_M f$ as a map from $\mathscr{M} \ltimes \mathscr{D}^\gamma$ into $\mathcal{D}'$. Since the space $\mathscr{M} \ltimes \mathscr{D}^\gamma$ is
not a linear space, this shows that the map $\mathcal{R}$ isn’t actually linear, despite appearances.
However, the map $(\Pi, \Gamma, f) \mapsto \mathcal{R} f$ turns out to be locally Lipschitz continuous
provided that the distance between $(\Pi, \Gamma, f)$ and $(\bar{\Pi}, \bar{\Gamma}, \bar{f})$ is given by the smallest
constant $\varrho$ such that
Here, in order to obtain bounds on $(\mathcal{R} f - \bar{\mathcal{R}} \bar{f})(\psi)$ for some smooth compactly
supported test function $\psi$, the above bounds should hold uniformly for $x$ and $y$ in a
neighbourhood of the support of $\psi$. The fact that this stronger continuity property
also holds is actually crucial when showing that sequences of solutions to mollified
equations all converge to the same limiting object. However, its proof is somewhat
more involved which is why we chose not to give it here.
Remark 13.29. In the particular case where Πx τ happens to be a continuous function
for every τ ∈ T (and every x ∈ Rd ), Rf is also a continuous function and one has
the identity
\[ \mathcal{R} f(x) = \big( \Pi_x f(x) \big)(x) . \tag{13.25} \]
This can be seen from the fact that
\[ \mathcal{R} f(y) = \lim_{n \to \infty} \mathcal{R}^n f(y) = \lim_{n \to \infty} \sum_{x \in \Lambda_n} \big( \Pi_x f(x) \big)(\varphi^n_x)\, \varphi^n_x(y) . \]
Indeed, our assumptions imply that the function $(x, z) \mapsto (\Pi_x f(x))(z)$ is jointly
continuous and since the non-vanishing terms in the above sum satisfy $|x - y| \lesssim
2^{-n}$, one has $2^{dn/2} (\Pi_x f(x))(\varphi^n_x) \approx (\Pi_y f(y))(y)$ for large $n$. Since furthermore
$\sum_{x \in \Lambda_n} \varphi^n_x(y) = 2^{dn/2}$, the claim follows.
13.5 Exercises
Exercise 13.30. Use wavelets to construct an example demonstrating the “only if”
part of Theorem 13.16.
uniformly over all ϕ ∈ Br and λ ∈ (0, 1], and locally uniformly in x. Show that
the space C −α is independent of the choice of r in the definition given above,
which justifies the notation. Take now d = 1 and α ∈ (0, 1) for simplicity. Show
that any f ∈ C −α is the distributional derivative of some Hölder continuous
function F ∈ C 1−α .
Exercise 13.33. Retrace the proof of Theorem 13.12 in the context of Proposition
13.19 with the Haar basis as the choice of wavelet basis (i.e. set ϕ(x) = 1[0,1] (x)).
Convince yourself that this is equivalent to the proof of Lemma 4.2.
Exercise 13.34. Let $(\Pi, \Gamma)$ be a model for the “rough path” regularity structure
given in Definition 13.5 with the additional property that $\Pi_s \dot{W}^i$ is the distributional
derivative of $\Pi_s W^i$ for every $s$. Show that it is then necessarily of the form $M_{\mathbf{W}}$ for
some α-Hölder rough path $\mathbf{W}$ as in Lemma 13.18.
13.6 Comments
An alternative theory to the theory of regularity structures [Hai14c] has been
introduced more or less simultaneously in Gubinelli–Imkeller–Perkowski [GIP12].
Rather than relying on the reconstruction theorem, that theory builds on properties of
Bony’s paraproduct [Bon81, BMN10, BCD11]. It is also in principle able to deal
with stochastic PDEs like the KPZ equation or the dynamical $\Phi^4_3$ equation, see
Catellier–Chouk [CC13], but its scope is not as wide as that of the theory of
regularity structures. (For example, it cannot deal with classical one-dimensional parabolic
SPDEs driven by space-time white noise with a diffusion coefficient depending on
the solution.)
One advantage of the paraproduct-based theory is that one generally deals with
globally defined objects rather than the “jets” used in the theory of regularity struc-
tures. It also uses some already well-studied objects, so that it can rely on a substantial
body of existing literature. However, besides being less systematic than the theory
of regularity structures, it achieves a less clean break between the analytical and the
algebraic aspects of a given problem.
Chapter 14
Operations on modelled distributions
Abstract The original motivation for the development of the theory of regularity
structures was to provide robust solution theories for singular stochastic PDEs like
the KPZ equation or the dynamical $\Phi^4_3$ model. The idea is to reformulate them as fixed
point problems in some space $\mathscr{D}^\gamma$ (or rather a slightly modified version that takes into
account possible singular behaviour near time 0) based on a suitable random model
in a regularity structure purpose-built for the problem at hand. In order to achieve
this, this chapter provides a systematic way of formulating the standard operations
arising in the construction of the corresponding fixed point problem (differentiation,
multiplication, composition by a regular function, convolution with the heat kernel)
as operations on the spaces D γ .
14.1 Differentiation
Proposition 14.2. Let ∂ be a map that realises L for the model (Π, Γ ) and let
f ∈ D γ (V ) for some γ > m. Then, ∂f ∈ D γ−m and the identity R∂f = LRf
holds.
for some δ > 0. Here, we defined ψxλ as before. By the assumption on the model Π,
we have the identity
One of the main purposes of the theory presented here is to give a robust way to
multiply distributions (or functions with distributions) that goes beyond the barrier
illustrated by Theorem 13.16. Provided that our functions / distributions are repre-
sented as elements in D γ for some model and regularity structure, we can multiply
their “Taylor expansions” pointwise, provided that we give ourselves a table of
multiplication on T .
It is natural to consider products with the following properties.
Remark 14.4. The condition that homogeneities add up under multiplication is very
natural, bearing in mind the case of the polynomial regularity structure. The second
condition is also very natural since it merely states that if one reexpands the product
of two “polynomials” around a different point, one should obtain the same result as
if one reexpands each factor first and then multiplies them together.
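Given such a product, we can ask ourselves when the pointwise product of an
element in $\mathscr{D}^{\gamma_1}$ with an element in $\mathscr{D}^{\gamma_2}$ again belongs to some $\mathscr{D}^\gamma$. In order to answer
this question, we introduce the notation $\mathscr{D}^\gamma_\alpha$ to denote those elements $f \in \mathscr{D}^\gamma$ such
that furthermore
\[ f(x) \in T_{\ge \alpha} = \bigoplus_{\beta \ge \alpha} T_\beta , \]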
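It follows from the properties of the product ⋆ that the first term in (14.2) is bounded
by a constant times
\[
\begin{aligned}
\sum_{\beta_1 + \beta_2 = \beta} & \|\Gamma_{xy} f_1(y) - f_1(x)\|_{\beta_1}\, \|\Gamma_{xy} f_2(y) - f_2(x)\|_{\beta_2} \\
 & \lesssim \sum_{\beta_1 + \beta_2 = \beta} \|x - y\|^{\gamma_1 - \beta_1}\, \|x - y\|^{\gamma_2 - \beta_2} \lesssim \|x - y\|^{\gamma_1 + \gamma_2 - \beta} .
\end{aligned}
\]
\[ \lesssim \|x - y\|^{\gamma_1 + \alpha_2 - \beta} , \]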
Remark 14.6. It is clear that the formula (14.1) for γ is optimal in general as can
be seen from the following two “reality checks”. First, consider the case of the
polynomial model and take $f_i \in C^{\gamma_i}$. In this case, the (abstract) truncated Taylor
series of the $f_i$ belong to $D^{\gamma_i}_0$. It is clear that in this case, the product cannot be
expected to have better regularity than $\gamma_1 \wedge \gamma_2$ in general, which is indeed what (14.1)
states. The second reality check comes from (the proof of) Theorem 13.16. In this
case, with β > α ≥ 0, one has $f \in D^\beta_0$, while the constant function $x \mapsto \Xi$ belongs
to $D^\infty_{-\alpha}$ so that, according to (14.1), one expects their product to belong to
$D^{\beta-\alpha}_{-\alpha}$, which is indeed the case.
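The two reality checks can be replayed numerically. The sketch below assumes that (14.1) is the usual multiplication rule, namely that $f_1 \in D^{\gamma_1}_{\alpha_1}$ and $f_2 \in D^{\gamma_2}_{\alpha_2}$ give $f_1 \star f_2 \in D^\gamma_\alpha$ with $\alpha = \alpha_1 + \alpha_2$ and $\gamma = (\gamma_1 + \alpha_2) \wedge (\gamma_2 + \alpha_1)$. The function name `product_regularity` is purely illustrative.

```python
import math

def product_regularity(gamma1, alpha1, gamma2, alpha2):
    """Return (gamma, alpha) such that f1 * f2 lies in D^gamma_alpha.

    Assumed rule: alpha = alpha1 + alpha2 and
    gamma = min(gamma1 + alpha2, gamma2 + alpha1).
    """
    gamma = min(gamma1 + alpha2, gamma2 + alpha1)
    alpha = alpha1 + alpha2
    return gamma, alpha

# Reality check 1: polynomial model, f_i in D^{gamma_i}_0.
check_polynomial = product_regularity(0.7, 0.0, 1.3, 0.0)

# Reality check 2: f in D^beta_0 times the constant function Xi in D^infty_{-a}.
beta, a = 1.5, 0.5
check_noise = product_regularity(beta, 0.0, math.inf, -a)
```

In the first case the rule returns $\gamma_1 \wedge \gamma_2$, in the second it returns the pair $(\beta - \alpha, -\alpha)$, in agreement with the remark.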
Here, $G^{(k)}$ denotes the kth derivative of G and $\tau^{\star k}$ denotes the k-fold product
$\tau \star \cdots \star \tau$. We also used the usual conventions $G^{(0)} = G$ and $\tau^{\star 0} = 1$.
Note that as long as G is $C^\infty$, this expression is well-defined. Indeed, by
assumption, there exists some $\alpha_0 > 0$ such that $\tilde f(x) \in T_{\ge \alpha_0}$. By the properties of
the product, this implies that one has $\tilde f(x)^{\star k} \in T_{\ge k\alpha_0}$. As a consequence, when
considering the component of $G \circ f$ in $T_\beta$ for β < γ, the only terms that give a
contribution are those with $k < \gamma/\alpha_0$. Since we cannot possibly hope in general that
$G \circ f \in D^{\gamma'}$ for some $\gamma' > \gamma$, this is all we really need.
It turns out that if G is sufficiently regular, then the map $f \mapsto G \circ f$ enjoys
similarly nice continuity properties to what we are used to from classical Hölder
spaces. The following result is the analogue in this context to Lemma 7.3:
Proposition 14.7. In the same setting as above, provided that G is of class $C^k$ with
$k > \gamma/\alpha_0$, the map $f \mapsto G \circ f$ is continuous from $D^\gamma(V)$ into itself. If
$k > \gamma/\alpha_0 + 1$, then it is locally Lipschitz continuous.
The proof of this result can be found in [Hai14c]. It is somewhat lengthy, but
ultimately rather straightforward.
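To make the definition of $G \circ f$ concrete, here is a minimal sketch in the simplest possible setting, which is our own choice and not taken from the text: the polynomial regularity structure in one variable, where an element of $T$ is a list of coefficients of $1, X, X^2, \dots$, the product $\star$ is polynomial multiplication, and $\alpha_0 = 1$. The helper names `mul_trunc` and `compose` are hypothetical. The code evaluates $\sum_k G^{(k)}(\bar f)\, \tilde f^{\star k} / k!$, discarding all components of degree $n$ or more.

```python
from math import exp, factorial

def mul_trunc(p, q, n):
    """Product of two polynomials in X (coefficient lists), dropping X^j for j >= n."""
    out = [0.0] * n
    for i, a in enumerate(p[:n]):
        for j, b in enumerate(q[:n - i]):
            out[i + j] += a * b
    return out

def compose(G_deriv, f, n):
    """Truncated lift of G o f, keeping the components X^j with j < n.

    G_deriv(k, u) must return the k-th derivative of G at the point u.
    f is the coefficient list [f_bar, a_1, a_2, ...] of 1, X, X^2, ...
    """
    f = list(f[:n]) + [0.0] * (n - len(f))
    f_bar = f[0]
    f_tilde = [0.0] + f[1:]            # part of strictly positive homogeneity
    power = [1.0] + [0.0] * (n - 1)    # f_tilde^{*0} = 1
    result = [0.0] * n
    for k in range(n):                 # f_tilde^{*k} has no component below degree k
        coeff = G_deriv(k, f_bar) / factorial(k)
        result = [r + coeff * c for r, c in zip(result, power)]
        power = mul_trunc(power, f_tilde, n)
    return result

# G = exp applied to f = X: the Taylor coefficients 1/j! of exp at 0.
exp_of_X = compose(lambda k, u: exp(u), [0.0, 1.0], 4)

# G(u) = u^2 applied to f = 1 + X: the polynomial 1 + 2X + X^2.
square_derivs = lambda k, u: [u * u, 2.0 * u, 2.0][k] if k < 3 else 0.0
square_of_f = compose(square_derivs, [1.0, 1.0], 3)
```

Only finitely many values of $k$ contribute, exactly as in the discussion above: here the sum stops at $k < n$ because $\tilde f^{\star k}$ has no component of degree below $k$.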
One of the reasons why the theory of regularity structures is very successful at
providing detailed descriptions of the small-scale features of solutions to semilinear
(S)PDEs is that it comes with very sharp Schauder estimates. Recall that the classical
Schauder estimates state that if $K : \mathbf{R}^d \to \mathbf{R}$ is a kernel that is smooth everywhere,
except for a singularity at the origin that is (approximately) homogeneous of degree
β − d for some β > 0, then the operator $f \mapsto K * f$ maps $C^\alpha$ into $C^{\alpha+\beta}$ for every
α ∈ R, except for those values for which α + β ∈ N. (See for example [Sim97].)
It turns out that similar Schauder estimates hold in the context of general regularity
structures in the sense that it is in general possible to build an operator K : D γ →
D γ+β with the property that RKf = K ∗ Rf . Of course, such a statement can only
be true if our regularity structure contains not only the objects necessary to describe
Rf up to order γ, but also those required to describe K ∗ Rf up to order γ + β.
What are these objects? At this stage, it might be useful to reflect on the effect of the
convolution of a singular function (or distribution) with K.
Let us assume for a moment that a given real-valued function f is smooth ev-
erywhere, except at some point x0 . It is then straightforward to convince ourselves
that K ∗ f is also smooth everywhere, except at x0 . Indeed, for any δ > 0, we can
write K = Kδ + Kδc , where Kδ is supported in a ball of radius δ around 0 and
Kδc is a smooth function. Similarly, we can decompose f as f = fδ + fδc , where
fδ is supported in a δ-ball around x0 and fδc is smooth. Since the convolution of
a smooth function with an arbitrary distribution is smooth, it follows that the only
non-smooth component of K ∗ f is given by Kδ ∗ fδ , which is supported in a ball of
radius 2δ around x0 . Since δ was arbitrary, the statement follows. By linearity, this
strongly suggests that the local structure of the singularities of K ∗ f can be described
completely by only using knowledge on the local structure of the singularities of f .
It also suggests that the “singular part” of the operator K should be local, with the
non-local parts of K only contributing to the “regular part”.
This discussion suggests that we certainly need the following ingredients to build
an operator K with the desired properties:
Γ Iτ − IΓ τ ∈ T̄ . (14.3)
Finally, we want to consider models that are compatible with this structure for a
given kernel K. For this, we first make precise what we mean exactly when we said
that K is approximately homogeneous of degree β − d.
Assumption 14.10 One can write $K = \sum_{n \ge 0} K_n$ where each of the kernels
$K_n : \mathbf{R}^d \to \mathbf{R}$ is smooth and compactly supported in a ball of radius $2^{-n}$ around the
origin. Furthermore, we assume that for every multi-index k, one has a constant C
such that the bound
$$ \sup_x |D^k K_n(x)| \le C\, 2^{n(d-\beta+|k|)} , \tag{14.4} $$
holds uniformly in n. Finally, we assume that $\int K_n(x) P(x)\,dx = 0$ for every
polynomial P of degree at most N, for some sufficiently large value of N.
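A dyadic decomposition of this type can be sketched with a smooth partition of unity. The example below is an illustration under our own assumptions: we take $d = 1$, $\beta = 1/2$ and the model kernel $K(x) = |x|^{\beta - d}$, and we set $K_n(x) = (\varphi(2^n x) - \varphi(2^{n+1} x)) K(x)$ for a smooth cutoff $\varphi$. The names `cutoff`, `K` and `K_n` are hypothetical. The sketch does not enforce the last condition of the assumption (annihilation of polynomials), and $\sum_n K_n = \varphi K$ only agrees with $K$ near the origin, the difference being smooth.

```python
import math

BETA, D = 0.5, 1   # assumed: kernel K(x) = |x|^(BETA - D) on the real line

def _h(t):
    """Smooth function vanishing for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def cutoff(x):
    """Smooth function equal to 1 for |x| <= 1/2 and to 0 for |x| >= 1."""
    r = abs(x)
    a, b = _h(1.0 - r), _h(r - 0.5)
    return a / (a + b)

def K(x):
    """Model singular kernel, homogeneous of degree BETA - D."""
    return abs(x) ** (BETA - D)

def K_n(n, x):
    """n-th dyadic piece, supported in 2^-(n+2) <= |x| <= 2^-n."""
    if x == 0.0:
        return 0.0
    return (cutoff(2.0 ** n * x) - cutoff(2.0 ** (n + 1) * x)) * K(x)

# Near the origin the pieces add up to K itself (telescoping sum).
x0 = 0.1
partial_sum = sum(K_n(n, x0) for n in range(11))

# Empirical check of the k = 0 case of (14.4) on a grid, with C = 2.
grid = [i / 10000.0 for i in range(1, 10001)]
sup_ratio = max(
    max(abs(K_n(n, x)) for x in grid) / 2.0 ** (n * (D - BETA))
    for n in range(6)
)
```

Since $0 \le \varphi(2^n x) - \varphi(2^{n+1} x) \le 1$ and $|x| \ge 2^{-(n+2)}$ on the support of $K_n$, one gets $\sup |K_n| \le 2 \cdot 2^{n(d - \beta)}$ in this example, which is what `sup_ratio` measures.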
Remark 14.11. It turns out that in order to define the operator K on D γ , we will need
K to annihilate polynomials of degree N for some N ≥ γ + β.
Remark 14.12. The last assumption may appear to be extremely stringent at first
sight. In practice, this turns out not to be a problem at all. Say for example that we
want to define an operator that represents convolution with G, the Green’s function of
the Laplacian. Then, G can be decomposed into a sum of terms satisfying the bound
(14.4) with β = 2, but it does of course not annihilate generic polynomials and it is
not supported in the ball of radius 1.
However, for any fixed value of N > 0, it is straightforward to decompose G
as G = K + R, where the kernel K is compactly supported and satisfies all of the
properties mentioned above, and the kernel R is smooth. Lifting the convolution with
R to an operator from D γ → D γ+β (actually to D γ̄ for any γ̄ > 0) is straightforward,
so that we have reduced our problem to that of constructing an operator describing
the convolution by K.
Given such a kernel K, we can now make precise what we meant earlier when we
said that the models under consideration should be compatible with the kernel K.
$$ \Pi_x X^k(y) = (y - x)^k , \qquad \Pi_x I\tau = K * \Pi_x \tau - \Pi_x J(x)\tau , \tag{14.5} $$
holds for every τ ∈ T with |τ| ≤ N. Here, $J(x) : T \to \bar T$ is the linear map given
on homogeneous elements by
$$ J(x)\tau = \sum_{|k| < |\tau| + \beta} \frac{X^k}{k!} \int D^{(k)} K(x - y)\, (\Pi_x \tau)(dy) . \tag{14.6} $$
Remark 14.14. Note first that if τ ∈ T̄ , then the definition given above is coherent as
long as |τ | < N . Indeed, since Iτ = 0, one necessarily has Πx Iτ = 0. On the other
hand, the properties of K ensure that in this case one also has K ∗ Πx τ = 0, as well
as J (x)τ = 0.
Note now that the scaling properties of the $K_n$ ensure that $2^{(\beta-|k|)n} D^{(k)} K_n(x - \cdot)$
is a test function that is localised around x at scale $2^{-n}$. As a consequence, one has
$$ \big(\Pi_x \tau\big)\big(D^{(k)} K_n(x - \cdot)\big) \lesssim 2^{(|k| - \beta - |\tau|)n} , $$
Remark 14.16. As a matter of fact, it turns out that the above definition of an
admissible model dovetails very nicely with our axioms defining a general model. Indeed,
starting from any regularity structure T, any model (Π, Γ) for T, and a kernel
K satisfying Assumption 14.10, it is usually possible to build a larger regularity
structure T̂ containing T (in the “obvious” sense that T ⊂ T̂ and the action of Ĝ on
T is compatible with that of G) and endowed with an abstract integration map I, as
With all of these definitions in place, we can finally build the operator
$K : D^\gamma \to D^{\gamma+\beta}$ announced at the beginning of this section. Recalling the definition
of J from (14.6), we set
$$ Kf(x) = If(x) + J(x)f(x) + Nf(x) , \tag{14.7} $$
Note first that thanks to the reconstruction theorem, it is possible to verify that the
right hand side of (14.8) does indeed make sense for every f ∈ D γ in virtually the
same way as in Remark 14.15. One has:
Proof. The complete proof of this result can be found in [Hai14c] and will not be
given here. Let us simply show that one has indeed RKf = K ∗ Rf in the particular
case when our model consists of continuous functions so that Remark 13.29 applies.
In this case, one has
RKf (x) = Πx (If (x) + J (x)f (x)) (x) + Πx N f (x) (x) .
As a consequence of (14.5), the first term appearing in the right hand side of this
expression is given by
Πx (If (x) + J (x)f (x)) (x) = K ∗ Πx f (x) (x) .
On the other hand, the only term contributing to the second term is the one with
k = 0 (which is always present since γ > 0 by assumption) which then yields
$$ \Pi_x N f(x)\,(x) = \int K(x - y)\, \big(Rf - \Pi_x f(x)\big)(dy) . $$
Adding both of these terms, we see that the expression $K * \Pi_x f(x)\,(x)$ cancels,
leaving us with the desired result. □
We are now in principle in possession of all of the ingredients required to formu-
late a large number of semilinear stochastic PDEs: multiplication, composition by
regular functions, differentiation, and integration against the Green’s function of the
linearised equation.
In the next chapter we show how this can be leveraged in practice in order to build
a robust solution theory for the KPZ equation.
14.4 Exercises
fails.
b) Transfer Exercise 2.17 to the present context.
Solution 14.19. (We only address the first part.) Consider for instance the regularity
structure given by A = (−2κ, −κ, 0) for fixed κ > 0 with each $T_\alpha$ being a copy
of R given by $T_{-n\kappa} = \langle \Xi^n \rangle$. We furthermore take for G the trivial group. This
regularity structure comes with an obvious product by setting $\Xi^m \star \Xi^n = \Xi^{m+n}$
provided that m + n ≤ 2.
Then, we could for example take as a model for T = (T, A, G):
Since our group G is trivial, one has $f_i \in D^\gamma$ provided that each of the $f_i$ belongs to
$C^\gamma$ and each of the $\tilde f_i$ belongs to $C^{\gamma+\kappa}$. (And one has γ + κ < 1.) One furthermore
has the identity $Rf_i(x) = f_i(x)$.
However, the pointwise product is given by
$$ (f_1 \star f_2)(x) = f_1(x) f_2(x)\,\Xi^0 + \big(\tilde f_1(x) f_2(x) + \tilde f_2(x) f_1(x)\big)\,\Xi + \tilde f_1(x) \tilde f_2(x)\,\Xi^2 , $$
which by Theorem 14.5 belongs to $D^{\gamma-\kappa}$. Provided that γ > κ, one can then apply
the reconstruction operator to this product and one obtains
which is obviously quite different from the pointwise product (Rf1 )(x) · (Rf2 )(x).
How should this be interpreted? For n > 0, we could have defined a model $\Pi^{(n)}$
by
$$ \Pi_x \Xi^0(y) = 1 , \qquad \Pi_x \Xi(y) = \sqrt{2c}\, \sin(ny) , \qquad \Pi_x \Xi^2(y) = 2c \sin^2(ny) . $$
as well as $R^{(n)}(f_1 \star f_2) = R^{(n)} f_1 \cdot R^{(n)} f_2$. As a model, the model $\Pi^{(n)}$ actually
converges to the limiting model Π defined in (14.9). As a consequence of the
continuity of the reconstruction operator, this implies that
which is of course also easy to see “by hand”. This shows that in some cases, the
“non-canonical” models as in (14.9) can be interpreted as limits of “canonical” models
for which the usual rules of calculus hold. Even this is however not always the case
(think of the Itô Brownian rough path).
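The convergence of these oscillating models can be checked “by hand” numerically. The sketch below, in which the test function `phi` and the values of `c` and `n` are arbitrary choices of ours, integrates $\sqrt{2c}\,\sin(ny)$ and $2c\,\sin^2(ny)$ against a smooth periodic test function: the first integral is essentially zero while the second is essentially $c \int \varphi$, so that in the sense of distributions the square of the limit is not the limit of the squares.

```python
import math

def integrate_periodic(g, num=20000):
    """Riemann sum of a 2*pi-periodic function over one period."""
    h = 2.0 * math.pi / num
    return h * sum(g(i * h) for i in range(num))

c, n = 0.3, 50                              # arbitrary illustrative values

def phi(y):
    """Smooth 2*pi-periodic test function (an arbitrary choice)."""
    return math.exp(math.cos(y))

mass = integrate_periodic(phi)

# Pairing of Pi_x Xi = sqrt(2c) sin(n y) with phi: tends to 0.
linear = integrate_periodic(lambda y: phi(y) * math.sqrt(2 * c) * math.sin(n * y))

# Pairing of Pi_x Xi^2 = 2c sin^2(n y) with phi: tends to c times the mass of phi.
quadratic = integrate_periodic(lambda y: phi(y) * 2 * c * math.sin(n * y) ** 2)
```

For this particular `phi` the Fourier coefficients decay so fast that `quadratic` already agrees with `c * mass` to high accuracy for moderate `n`.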
G = K + K̂ ,
where the kernel K satisfies all of the assumptions of Section 14.3 (with β = 2) and
the remainder K̂ is smooth and bounded.
Chapter 15
Application to the KPZ equation
Abstract We show how the theory of regularity structures can be used to build a
robust solution theory for the KPZ equation. We also give a very short survey of the
original approach to the same problem using controlled rough paths and we discuss
how the two approaches are linked.
Let us now briefly explain how the theory of regularity structures can be used to
make sense of solutions to very singular semilinear stochastic PDEs. We will keep
the discussion in this section at a very informal level without attempting to make
mathematically precise statements. The interested reader may find more details in
[Hai13, Hai14c].
For definiteness, we focus on the case of the KPZ equation [KPZ86], which is
formally given by
$$ \partial_t h = \partial_x^2 h + (\partial_x h)^2 + \xi - C , \tag{15.1} $$
where ξ denotes space-time white noise, the spatial variable takes values in the circle
(i.e. in the interval [0, 2π] endowed with periodic boundary conditions), and C is a
fixed constant. The problem with such an equation is that even the solution to the
linear part of the equation, namely
$$ \partial_t \Psi = \partial_x^2 \Psi + \xi , $$
This has usually been interpreted in the following way. Assuming for a moment
that ξ is a smooth function, a simple consequence of the change of variables formula
shows that if we define h = log Z, then Z satisfies the PDE
$$ \partial_t Z = \partial_x^2 Z + Z\,\xi . $$
The only ill-posed product appearing in this equation now is the product of the
solution Z with white noise ξ. As long as Z takes values in L2 , this product can
be given a meaning as a classical Itô integral, so that the equation for Z can be
interpreted as the Itô equation
$$ dZ = \partial_x^2 Z\,dt + Z\,dW , \tag{15.2} $$
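For smooth ξ, the change of variables $h = \log Z$ used above can be verified in two lines. The computation below assumes that $Z$ is strictly positive and smooth, and it recovers (15.1) with $C = 0$:

```latex
\begin{align*}
\partial_t h &= \frac{\partial_t Z}{Z} , \qquad
\partial_x h = \frac{\partial_x Z}{Z} , \qquad
\partial_x^2 h = \frac{\partial_x^2 Z}{Z} - \Big(\frac{\partial_x Z}{Z}\Big)^2
             = \frac{\partial_x^2 Z}{Z} - (\partial_x h)^2 , \\
\partial_t h &= \frac{\partial_x^2 Z + Z\,\xi}{Z}
             = \partial_x^2 h + (\partial_x h)^2 + \xi .
\end{align*}
```

The nonlinearity $(\partial_x h)^2$ is thus exactly the correction term produced by the chain rule for the logarithm.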
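$$
\begin{array}{ccc}
F \times M \times C^\alpha & \xrightarrow{\;S_a\;} & D^\gamma \\
\big\uparrow {\scriptstyle \Psi} & & \big\downarrow {\scriptstyle R} \\
F \times C \times C^\alpha & \xrightarrow{\;S_c\;} & C([0,T], C^\alpha) \\
(C,\ \xi,\ h_0) & \longmapsto & h
\end{array}
\tag{15.4}
$$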
Here, Sc denotes the classical solution map Sc (C, ξ, h0 ) which provides the solution
(up to some fixed final time T ) to the equation
for regular instances of the noise ξ. The space F of “formal right hand sides” is in
this case just a copy of R which holds the value of the constant C appearing in (15.5).
The diagram commutes in the sense that if M ∈ R, then
where we identify M with its respective actions on R and M . The important addi-
tional features are the following:
• If ξε denotes a “natural” regularisation of space-time white noise, then there
exists a sequence Mε of elements in R such that Mε Ψ (ξε ) converges to a limiting
random element (Π, Γ ) ∈ M . This element can also be characterised directly
without resorting to specific approximation procedures and RSa (0, (Π, Γ ), h0 )
coincides almost surely with the Hopf–Cole solution to the KPZ equation.
• The maps Sa and R are both continuous, unlike the map Sc which is discontinuous
in its second argument for any topology for which ξε converges to ξ.
• As an abstract group, the “renormalisation group” R is simply equal to $(\mathbf{R}^3, +)$.
However, it is possible to extend the picture to deal with much larger classes of
approximations, which has the effect of increasing both R and the space F of
possible right hand sides. See for example [HQ14] for a proof of convergence to
KPZ for a much larger class of interface growth models.
An example of a statement that can be proved from these considerations (see
[Hai13, Hai14c, HQ14]) is the following.
Theorem 15.1. Consider the sequence of equations
where $\xi_\varepsilon = \delta_\varepsilon * \xi$ with $\delta_\varepsilon(t, x) = \varepsilon^{-3} \varrho(\varepsilon^{-2} t, \varepsilon^{-1} x)$, for some smooth and compactly
supported function ϱ, and ξ denotes space-time white noise. Then, there exists a
Remark 15.2. It is important to note that although the limiting process is independent
of the choice of mollifier ϱ, the constant $C_\varepsilon$ does very much depend on this choice,
as we already alluded to earlier.
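The prefactor $\varepsilon^{-3}$ in the mollifier reflects the parabolic scaling, in which time counts twice as much as space: it is exactly what keeps the total mass of $\delta_\varepsilon$ independent of ε. A minimal numerical sketch follows, where the product bump `rho` is our own stand-in for ϱ (it is not normalised to have integral one) and the helper names are hypothetical.

```python
import math

def bump(s):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def rho(t, x):
    """Stand-in for the mollifier: smooth, supported in [-1, 1]^2."""
    return bump(t) * bump(x)

def delta_eps(eps, t, x):
    """Parabolically rescaled mollifier eps^-3 * rho(t / eps^2, x / eps)."""
    return eps ** -3 * rho(t / eps ** 2, x / eps)

def total_mass(eps, num=200):
    """Midpoint rule for the integral of delta_eps over [-eps^2, eps^2] x [-eps, eps]."""
    dt, dx = 2 * eps ** 2 / num, 2 * eps / num
    total = 0.0
    for i in range(num):
        t = -eps ** 2 + (i + 0.5) * dt
        for j in range(num):
            x = -eps + (j + 0.5) * dx
            total += delta_eps(eps, t, x)
    return total * dt * dx

mass_1 = total_mass(1.0)
mass_small = total_mass(0.1)
```

The two masses agree up to rounding, since the substitution $(t, x) = (\varepsilon^2 s, \varepsilon y)$ has Jacobian $\varepsilon^3$.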
Remark 15.3. Regarding the initial condition, one can take h0 ∈ C β for any fixed
β > 0. Unfortunately, this result does not cover the case of “infinite wedge” initial
conditions, see for example [Cor12].
The aim of this section is to sketch how the theory of regularity structures can be
used to obtain this kind of convergence result and how (15.4) is constructed. First of
all, we note that while our solution h will be a Hölder continuous space-time function
(or rather an element of $D^\gamma$ for some regularity structure with a model over $\mathbf{R}^2$), the
“time” direction has a different scaling behaviour from the “space” direction.
As a consequence, it turns out to be effective to slightly change our definition of
“localised test functions” by setting
Our first step is to build a regularity structure that is sufficiently large to allow us to
reformulate (15.1) as a fixed point in $D^\gamma$ for some γ > 0. Denoting by G the heat
kernel (i.e. the Green’s function of the operator $\partial_t - \partial_x^2$), we can rewrite the solution
to (15.1) with initial condition $h_0$ as
where ∗ denotes space-time convolution and where we denote by Gh0 the harmonic
extension of h0 . (That is the solution to the heat equation with initial condition h0 .)
In order to have a chance of fitting this into the framework described above, we first
decompose the heat kernel G as in Exercise 14.20 as
G = K + K̂ ,
where the kernel K satisfies all of the assumptions of Section 14.3 (with β = 2) and
the remainder K̂ is smooth. If we consider any regularity structure containing the
usual Taylor polynomials and equipped with an admissible model, it is straightforward
to associate to K̂ an operator $\hat K : D^\gamma \to D^\infty$ via
$$ \hat K f(z) = \sum_k \frac{X^k}{k!}\, \big(D^{(k)} \hat K * Rf\big)(z) , $$
where z denotes a space-time point and k runs over all possible 2-dimensional
multiindices. Similarly, the harmonic extension of $h_0$ can be lifted to an element
in D ∞ which we denote again by Gh0 by considering its Taylor expansion around
every space-time point. At this stage, we note that we actually cheated a little: while
Gh0 is smooth in {(t, x) : t > 0, x ∈ S 1 } and vanishes when t < 0, it is of course
singular on the time-0 hyperplane {(0, x) : x ∈ S 1 }. This problem can be cured
by introducing weighted versions of the spaces D γ allowing for singularities on a
given hyperplane. A precise definition of these spaces and their behaviour under
multiplication and the action of the integral operator K can be found in [Hai14c].
For the purpose of the informal discussion given here, we will simply ignore this
problem.
This suggests that the “abstract” formulation of (15.1) should be given by
shorthand I′ = ∂I. This is also suggestive of the fact that I′ can itself be considered
an abstract integration map, associated to the kernel $K' = \partial_x K$.
We then simply add to T all of the formal expressions that an application of the
right hand side of (15.8) can generate for the description of H, ∂H, and $(\partial H)^2$. The
homogeneity of a given expression is furthermore completely determined by the
rules |Iτ| = |τ| + 2, |∂τ| = |τ| − 1 and |τ τ̄| = |τ| + |τ̄|. For example, it follows
from (15.8) that the symbol I(Ξ) is required for the description of H, so that I′(Ξ)
is required for the description of ∂H. This then implies that $I'(\Xi)^2$ is required for
the description of the right hand side of (15.8), which in turn implies that $I(I'(\Xi)^2)$
is also required for the description of H, etc.
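The homogeneity rules lend themselves to a small recursive computation. The sketch below encodes symbols as nested tuples (an encoding of our own) and assumes the standard values $|\Xi| = -3/2 - \kappa$ for space-time white noise in one space dimension, $|1| = 0$, and the parabolic degrees $|X_0| = 2$ (time) and $|X_1| = 1$ (space). A homogeneity $a - b\kappa$ is stored as the pair `(a, b)`, and `dI` stands for I′, so that `dI` raises homogeneity by $2 - 1 = 1$.

```python
from fractions import Fraction

# Symbols are nested tuples; this encoding is purely illustrative.
XI = ("Xi",)
ONE = ("1",)

def X(i):
    return ("X", i)          # X(0): time variable, X(1): space variable

def I(tau):
    return ("I", tau)        # abstract integration map I

def dI(tau):
    return ("dI", tau)       # I' = dI, associated to the kernel K' = d_x K

def mul(sigma, tau):
    return ("mul", sigma, tau)

def homogeneity(tau):
    """Return (a, b), meaning the homogeneity a - b * kappa."""
    kind = tau[0]
    if kind == "Xi":
        return (Fraction(-3, 2), 1)      # assumed: |Xi| = -3/2 - kappa
    if kind == "1":
        return (Fraction(0), 0)
    if kind == "X":
        return (Fraction(2 if tau[1] == 0 else 1), 0)
    if kind == "I":
        a, b = homogeneity(tau[1])
        return (a + 2, b)                # |I tau| = |tau| + 2
    if kind == "dI":
        a, b = homogeneity(tau[1])
        return (a + 1, b)                # |I' tau| = |tau| + 2 - 1
    if kind == "mul":
        a1, b1 = homogeneity(tau[1])
        a2, b2 = homogeneity(tau[2])
        return (a1 + a2, b1 + b2)        # homogeneities add under products
    raise ValueError("unknown symbol: %r" % (tau,))

noise_derivative = dI(XI)                            # I'(Xi)
square = mul(noise_derivative, noise_derivative)     # I'(Xi)^2
integrated_square = I(square)                        # I(I'(Xi)^2)
```

With these conventions one finds $|I'(\Xi)| = -1/2 - \kappa$, $|I'(\Xi)^2| = -1 - 2\kappa$ and $|I(I'(\Xi)^2)| = 1 - 2\kappa$.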
Remark 15.4. Here we made a distinction between I(Ξ), interpreted as the linear
map I applied to the symbol Ξ, and the symbol I(Ξ). Since the map I is then
defined by I(Ξ) = I(Ξ), this distinction is somewhat moot and will be blurred in
the sequel.
More formally, denote by U the collection of those formal expressions that are
required to describe H. This is then defined as the smallest collection containing 1,
X, and I(Ξ), and such that
τ1 , τ2 ∈ U ⇒ I(∂τ1 ∂τ2 ) ∈ U ,
and we define our space T as the set of all linear combinations of elements in W.
Naturally, Tα consists of those linear combinations that only involve elements in W
that are of homogeneity α. It is not too difficult to convince oneself that, for every
α ∈ R, W contains only finitely many elements of homogeneity less than α, so that
each Tα is finite-dimensional.
In order to simplify expressions later, we will use the following shorthand graphi-
cal notation for elements of W. For Ξ, we draw a small circle. The integration map I
is then represented by a downfacing wavy line and I 0 is represented by a downfacing
plain line. The multiplication of symbols is obtained by joining them at the root. For
example, we have
T = ⟨Ξ, , , , , , , , 1, , , , , X1 , , , . . .⟩ , (15.10)
where we ordered symbols in increasing order of homogeneity and used ⟨·⟩ to denote
the linear span.
Exercise 15.5. Compute the homogeneities of the symbols appearing in the list
(15.10).
Recall that the purpose of the group G is to provide a class of linear maps Γ : T → T
arising as possible candidates for the action of “reexpanding” a “Taylor series” around
a different point. In our case, in view of (14.5), the coefficients of these reexpansions
will naturally be some polynomials in x and in the expressions appearing in (14.6).
This suggests that we should define a space $T^+$ whose basis vectors consist of formal
expressions of the type
$$ X^k \prod_{i=1}^N J_{\ell_i}(\tau_i) , \tag{15.11} $$
where N is an arbitrary but finite number, the $\tau_i$ are canonical basis elements in W
defined in (15.9), and the $\ell_i$ are d-dimensional multiindices satisfying $|\ell_i| < |\tau_i| + 2$.
(The last bound is a reflection of the restriction of the summands in (14.6) with
β = 2.) The space $T^+$ is endowed with a natural commutative product. (In fact,
$T^+$ is nothing but the free commutative algebra over the symbols $\{X_i, J_\ell(\tau)\}$ with
i ∈ {1, . . . , d} and τ ∈ W with $|\ell| < |\tau| + 2$.)
It turns out that with this definition, the coefficients of the linear maps $\Gamma_{xy}$ can be
expressed as polynomials of the numbers $f_x(J_{\ell_i}(\tau_i))$ and $f_y(J_{\ell_i}(\tau_i))$ for suitable
expressions $\tau_i$ and multiindices $\ell_i$. In order to formalize this, we consider the
following construction. We define a linear map $\Delta : T \to T \otimes T^+$ in the following way. For
the basic elements Ξ, 1 and $X_i$ (i ∈ {0, 1}), we set
$$ \Delta 1 = 1 \otimes 1 , \qquad \Delta \Xi = \Xi \otimes 1 , \qquad \Delta X_i = X_i \otimes 1 + 1 \otimes X_i . $$
$$ \Delta(\tau \bar\tau) = \Delta\tau \cdot \Delta\bar\tau , $$
$$ \Delta I(\tau) = (I \otimes I)\Delta\tau + \sum_{\ell, m} \frac{X^\ell}{\ell!} \otimes \frac{X^m}{m!}\, J_{\ell+m}(\tau) , $$
$$ \Delta I'(\tau) = (I' \otimes I)\Delta\tau + \sum_{\ell, m} \frac{X^\ell}{\ell!} \otimes \frac{X^m}{m!}\, J_{\ell+m+(0,1)}(\tau) . $$
Here, we extend $\tau \mapsto J_k(\tau) \stackrel{\mathrm{def}}{=} J_k(\tau)$ to a linear map $J_k : T \to T^+$ by setting
$J_k(\tau) = 0$ for those basis vectors τ ∈ W for which |τ| < |k| − 2.
Let now G+ denote the set of all linear maps g : T + → R with the property that
g(σσ̄) = g(σ)g(σ̄) for any two elements σ and σ̄ in T + . Then, to any such map, we
can associate a linear map Γg : T → T by
Γg τ = (I ⊗ g)∆τ . (15.13)
In principle, this definition makes sense for every g ∈ (T + )∗ . However, it turns out
that the set of such maps with g ∈ G+ forms a group, which is our structure group
G.
Furthermore, there exists a linear map ∆+ : T + → T + ⊗ T + such that
(f ◦ g)(σ) = (f ⊗ g)∆+ σ .
This has the property that Γf ◦g = Γf ◦ Γg , with the symbol ◦ on the right denoting
the composition of linear maps as usual. The second identity of (15.14) furthermore
ensures that if f and g belong to $G^+$, then $f \circ g \in G^+$. It also turns out that every
$f \in G^+$ admits a unique inverse $f^{-1}$ such that $f^{-1} \circ f = f \circ f^{-1} = e$, where
$e : T^+ \to \mathbf{R}$ maps every basis vector of the form (15.11) to zero, except for e(1) = 1.
The element e is neutral in the sense that Γe is the identity operator.
It is a highly non-trivial fact [Hai14c, Sec. 8] that if Π comes from an admissible
model as in Definition 14.13 and we define Fx : T → T by
Fx = Γfx ,
While this is a very nice coherent algebraic framework, it begs the question whether
in general there do even exist any non-trivial admissible models. This is a valid
question since the analytic bounds and algebraic identities that any admissible model
should satisfy are extremely stringent. The next section shows that fortunately there
exists a very rich class of admissible models.
as well as (14.5). Here we used z and z̄ as notations for generic space-time points in
order not to overload the notations. The maps Γxy are then determined from Π by
the discussion in the previous subsection.
With such a model ιξ at hand, it follows from (15.15), (13.25), and the admissibil-
ity of ιξ that the associated reconstruction operator satisfies the properties
RKf = K ∗ Rf , R(f g) = Rf · Rg ,
as long as all the functions to which R is applied belong to D γ for some γ > 0. As a
consequence, applying the reconstruction operator R to both sides of (15.7), we see
that if H solves (15.7) then, provided that the model (Π, Γ ) = ιξ was built as above
starting from any continuous realisation ξ of the driving noise, the function h = RH
solves the equation (15.1).
At this stage, the situation is as follows. For any continuous realisation ξ of the
driving noise, we have factorized the solution map $(h_0, \xi) \mapsto h$ associated to (15.1)
into maps
$$ (h_0, \xi) \mapsto (h_0, \iota\xi) \mapsto H \mapsto h = RH , $$
where the middle arrow corresponds to the solution to (15.7) in some weighted
D γ -space. The advantage of such a factorisation is that the last two arrows yield
continuous maps, even in topologies sufficiently weak to be able to describe driving
noise having the lack of regularity of space-time white noise. The only arrow that
isn’t continuous in such a weak topology is the first one. At this stage, it should
be believable that a similar construction can be performed for a very large class of
semilinear stochastic PDEs, provided that certain scaling properties are satisfied.
This is indeed the case and large parts of this programme have been carried out in
[Hai14c].
Given this construction, one is led naturally to the following question: given
a sequence $\xi_\varepsilon$ of “natural” regularisations of space-time white noise, for example
as in (15.6), do the lifts $\iota\xi_\varepsilon$ converge in probability in a suitable space of admissible
models? Unfortunately, unlike in the theory of rough paths where this is very often
the case (see Section 10), the answer to this question in the context of SPDEs is often
an emphatic no. Indeed, if it were the case for the KPZ equation, then one would
have been able to choose the constant $C_\varepsilon$ to be independent of ε in (15.6), which is
certainly not the case.
One way of circumventing the fact that ιξε does not converge to a limiting model as
ε → 0 is to consider instead a sequence of renormalised models. The main idea is
to exploit the fact that our abstract definitions of a model do not impose the identity
(15.15), even in situations where ξ itself happens to be a continuous function. One
question that then imposes itself is: what are the natural ways of “deforming” the
usual product which still lead to lifts to an admissible model? It turns out that the
regularity structure whose construction was sketched above comes equipped with
a natural finite-dimensional group of continuous transformations R on its space of
admissible models (henceforth called the “renormalisation group”), which essentially
amounts to the space of all natural deformations of the product. It then turns out that
even though ιξε does not converge, it is possible to find a sequence Mε of elements in
R such that the sequence Mε ιξε converges to a limiting model (Π̂, Γ̂ ). Unfortunately,
the elements Mε do not preserve the image of ι in the space of admissible models.
As a consequence, when solving the fixed point map (15.7) with respect to the model
Mε ιξε and inserting the solution into the reconstruction operator, it is not clear a
priori that the resulting function (or distribution) can again be interpreted as the
solution to some modified PDE. It turns out that in our case, at least for a suitable
subgroup of R, this is again the case and the modified equation is precisely given
by (15.6), where Cε is some linear combination of the constants appearing in the
description of Mε .
There are now three questions that remain to be answered:
1. How does one construct the renormalisation group R?
2. How does one derive the new equation obtained when renormalising a model?
3. What is the right choice of Mε ensuring that the renormalised models converge?
How does all this help with the identification of a natural class of deformations for
the usual product? First, it turns out that for every continuous function ξ, if we denote
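again by (Π, Γ) the model ιξ, then the linear map $\boldsymbol{\Pi} : T \to C$ given by
$$ \boldsymbol{\Pi} = \Pi_y F_y^{-1} , \tag{15.16} $$
$$ \boldsymbol{\Pi} X^k(x) = x^k , \qquad \boldsymbol{\Pi}\,\Xi(x) = \xi(x) , \tag{15.17} $$
$$ \boldsymbol{\Pi}\,\tau\bar\tau = \boldsymbol{\Pi}\tau \cdot \boldsymbol{\Pi}\bar\tau , \qquad \boldsymbol{\Pi} I\tau = K * \boldsymbol{\Pi}\tau . \tag{15.18} $$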
Note that this is very similar to the definition of ιξ, with the notable exception that
(14.5) is replaced by the more “natural” identity $\boldsymbol{\Pi} I\tau = K * \boldsymbol{\Pi}\tau$. It turns out
that the knowledge of $\boldsymbol{\Pi}$ and the knowledge of (Π, Γ) are equivalent since one has
$\Pi_x = \boldsymbol{\Pi} F_x$ and the map $F_x$ can be recovered from $\Pi_x$ by (15.12). (This argument
appears circular but it is possible to put a suitable recursive structure on T and $T^+$
ensuring that this actually works.) Furthermore, the translation (Π, Γ) ↔ $\boldsymbol{\Pi}$ actually
works for any admissible model and does not at all rely on the fact that it was built by
lifting a continuous function. However, in the general case, the first identity in (15.18)
does of course not make any sense anymore and might fail even if the coordinates of
$\boldsymbol{\Pi}$ consist of continuous functions.
At this stage we note that if ξ happens to be a stationary stochastic process
and $\boldsymbol{\Pi}$ is built from ξ by following the above procedure, then $\boldsymbol{\Pi}\tau$ is a stationary
stochastic process for every τ ∈ T. In order to define R, it is natural to consider only
transformations of the space of admissible models that preserve this property. Since
we are not in general allowed to multiply components of $\boldsymbol{\Pi}$, the only remaining
operation is to form linear combinations. It is therefore natural to describe elements
of R by linear maps M : T → T and to postulate their action on admissible models
by $\boldsymbol{\Pi} \mapsto \boldsymbol{\Pi}^M$ with
$$ \boldsymbol{\Pi}^M \tau = \boldsymbol{\Pi} M \tau . \tag{15.19} $$
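The action (15.19) only involves linear combinations of the components of the model. The following toy sketch illustrates this on a three-symbol structure spanned by $1$, $\Xi$ and $\Xi^2$, which is our own simplification and not the KPZ structure T; the map `M`, the function `xi` and the helper `renormalise` are all hypothetical, and no claim is made that this `M` satisfies the conditions imposed on elements of R.

```python
import math

def renormalise(M, Pi):
    """Return Pi^M with Pi^M(tau) = sum over sigma of M[tau][sigma] * Pi(sigma).

    M maps each symbol to the coefficients of its image,
    Pi maps each symbol to a function of the space-time point x.
    """
    def combine(row):
        return lambda x: sum(coef * Pi[sigma](x) for sigma, coef in row.items())
    return {tau: combine(row) for tau, row in M.items()}

xi = math.cos                       # a continuous "noise", chosen arbitrarily
Pi = {
    "1": lambda x: 1.0,
    "Xi": xi,
    "Xi^2": lambda x: xi(x) ** 2,   # canonical choice: the square of Pi(Xi)
}

c = 0.5
M = {
    "1": {"1": 1.0},
    "Xi": {"Xi": 1.0},
    "Xi^2": {"Xi^2": 1.0, "1": -c}, # M Xi^2 = Xi^2 - c 1
}

Pi_M = renormalise(M, Pi)
```

In this toy case the renormalised component `Pi_M["Xi^2"]` is the function $\xi^2 - c$, which is no longer the square of `Pi_M["Xi"]`: the product has been deformed by a constant.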
It is not clear a priori whether given such a map M and an admissible model (Π, Γ)
there is a coherent way of building a new model $(\Pi^M, \Gamma^M)$ such that $\boldsymbol{\Pi}^M$ is the map
associated to $(\Pi^M, \Gamma^M)$ as above. It turns out that one has the following statement:
Proposition 15.7. In the above context, for every linear map M : T → T commuting
with I and multiplication by $X^k$, there exist unique linear maps $\Delta^M : T \to T \otimes T^+$
and $\hat\Delta^M : T^+ \to T^+ \otimes T^+$ such that if we set
$$ \Pi^M_x \tau = (\Pi_x \otimes f_x)\Delta^M \tau , \qquad \gamma^M_{xy}(\sigma) = (\gamma_{xy} \otimes f_x)\hat\Delta^M \sigma , $$
then $\Pi^M_x$ satisfies again (14.5) and the identity $\Pi^M_x \Gamma^M_{xy} = \Pi^M_y$.
At this stage it may look like any linear map M : T → T commuting with I and
multiplication by X k yields a transformation on the space of admissible models by
Proposition 15.7. This however is not true since we have completely disregarded the
analytical bounds that every model has to satisfy. It is clear from Definition 13.6 that
in the absence of any additional knowledge these are satisfied if and only if $\Pi^M_x \tau$ is
a linear combination of the $\Pi_x \bar\tau$ for some symbols τ̄ with |τ̄| ≥ |τ|. This suggests
the following definition.
Definition 15.8. The renormalisation group R consists of the set of linear maps
M : T → T commuting with I, I′, and with multiplication by $X^k$, such that for
$\tau \in T_\alpha$ one has
$$ \Delta^M \tau - \tau \otimes 1 \in T_{>\alpha} \otimes T^+ . \tag{15.20} $$
Its action on the space of admissible models is given by Proposition 15.7.
$$ \hat\Delta^M \sigma - \sigma \otimes 1 \in T^+_{>\alpha} \otimes T^+ . $$
However, it turns out that this is always the case, provided that $\Delta^M$ satisfies (15.20).
The reason for this is that it is possible to verify that one always has the identity
$$ \hat\Delta^M J_k(\tau) = (J_k \otimes I)\Delta^M \tau . $$
In the case of the KPZ equation, it turns out that we need a three-parameter
subgroup of R to renormalise the equations, but in order to explain the procedure we
will consider a larger 4-dimensional subgroup of R. More precisely, we consider
elements M ∈ R of the form $M = \exp(-\sum_{i=0}^{3} C_i L_i)$, where the generators $L_i$ are
determined by the following contraction rules:
L0 : ↦ 1 , L1 : ↦ 1 , L2 : ↦ 1 , L3 : ↦ 1 . (15.21)
L0 =2 , L0 =2 + ,
etc. The extension of the other operators $L_i$ to all of T proceeds in principle along
the same lines. However, as a consequence of the fact that I(1) = I′(1) = 0 by
construction, it actually turns out that $L_i \tau = 0$ for i ≠ 0 and every τ for which $L_i$
wasn’t defined in (15.21). It is possible to verify that one has the following result.
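Since the generators lower the number of noise symbols, they act nilpotently on any finite-dimensional truncation, so that the exponential defining M is a finite sum. The toy sketch below illustrates this on a three-dimensional space with basis $(1, \tau, \tau^2)$ and a single generator $L$ with $L\tau = 1$, extended as a derivation so that $L\tau^2 = 2\tau$; this space and this generator are our own simplification, not the actual $L_i$ acting on T.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_nilpotent(A):
    """exp(A) for a nilpotent n x n matrix A, as the finite sum of A^k / k! for k < n."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, n):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# Columns are the images of the basis vectors (1, tau, tau^2):
# L 1 = 0, L tau = 1, L tau^2 = 2 tau.
L = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0]]

C = 0.5
M = exp_nilpotent([[-C * entry for entry in row] for row in L])
```

The last column of `M` reads $(C^2, -2C, 1)$, that is $M\tau^2 = \tau^2 - 2C\tau + C^2 = (\tau - C)^2$: in this toy model, exponentiating the contraction simply shifts τ by the constant C.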
Proposition 15.10. The linear maps M of the type just described belong to R. Fur-
thermore, if (Π, Γ ) is an admissible model such that Πx τ is a continuous function
for every τ ∈ T , then one has the identity
Remark 15.11. Note that it is the same value x that appears twice on each side of
(15.22). It is in fact not the case that one has ΠxM τ = Πx M τ in general! However,
the identity (15.22) is all we need to derive the renormalised equation.
Proof. By Theorem 14.5, it turns out that (15.7) can be solved in D γ as soon as γ is
a little bit greater than 3/2. Therefore, we only need to keep track of its solution H
up to terms of homogeneity 3/2. By repeatedly applying the identity (15.8), we see
that the solution H ∈ D γ for γ close enough to 3/2 is necessarily of the form
H = h1 + + + h0 X1 + 2 + 2h0 ,
∂H = + + h0 1 + 2 + 2h0 , (15.23)
as an element of D γ for γ sufficiently close to 1/2. Similarly, the right hand side of
the equation is given up to order 0 by
It follows from the definition of M that one then has the identity
M ∂H = ∂H − 4C0 ,
so that, as an element of D γ with very small (but positive) γ, one has the identity
As a consequence, after neglecting all terms of strictly positive order, one has the
identity (writing c instead of c1 for real constants c)
Remark 15.13. It turns out that, thanks to the symmetry x ↦ −x enjoyed by our
problem, the corresponding model can be renormalised by a map M as above, but
with C0 = 0. The reason why we considered the general case here is twofold. First,
it shows that it is possible to obtain renormalised equations that differ from the
original equation in a more complicated way than just by the addition of a large
constant. Second, it is plausible that if one tries to approximate the KPZ equation by
a microscopic model which is not symmetric under space inversion, then the constant
C0 could play a non-trivial role.
It remains to argue why one expects to be able to find constants $C_i^\varepsilon$ such that the
sequence of renormalised models $M^\varepsilon \iota\xi_\varepsilon$ with $M^\varepsilon = \exp\bigl(-\sum_{i=1}^{3} C_i^\varepsilon L_i\bigr)$ converges
to a limiting model. Instead of considering the actual sequence of models, we only
consider the sequence of stationary processes $\hat\Pi^\varepsilon \tau := \Pi^\varepsilon M^\varepsilon \tau$, where $\Pi^\varepsilon$ is
associated to $(\Pi^\varepsilon, \Gamma^\varepsilon) = \iota\xi_\varepsilon$ as in Section 15.5.1.
Remark 15.14. It is important to note that we do not attempt here to give a full proof
that the renormalised model converges to a limit in the correct topology for the space
of admissible models. We only aim to argue that it is plausible that $\hat\Pi^\varepsilon$ converges
to a limit in some topology. A full proof of convergence (but in a slightly different
setting) can be found in [Hai13], see also [Hai14c, Section 10].
Since there are general arguments available to deal with all the expressions τ
of positive homogeneity as well as expressions of the type I′(τ) and Ξ itself, we
restrict ourselves to those that remain. Inspecting (15.10), we see that they are given
by
, , , , .
For this part, some elementary notions from the theory of Wiener chaos expansions
are required, but we’ll try to hide this as much as possible. At a formal level, one has
the identity
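\[ \Pi^\varepsilon = K' * \xi_\varepsilon = K'_\varepsilon * \xi , \]
where the kernel $K'_\varepsilon$ is given by $K'_\varepsilon = K' * \delta_\varepsilon$. This shows that, at least formally,
one has
\[ \Pi^\varepsilon(z) = \bigl(K' * \xi_\varepsilon\bigr)(z)^2 = \iint K'_\varepsilon(z - z_1)\,K'_\varepsilon(z - z_2)\,\xi(z_1)\,\xi(z_2)\,dz_1\,dz_2 . \]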
Similar but more complicated expressions can be found for any formal expression τ .
This naturally leads to the study of random variables of the type
\[ I_k(f) = \int\!\cdots\!\int f(z_1, \ldots, z_k)\,\xi(z_1) \cdots \xi(z_k)\,dz_1 \cdots dz_k . \tag{15.25} \]
Ideally, one would hope to have an Itô isometry of the type $\mathbf{E} I_k(f) I_k(g) = \langle f^{\mathrm{sym}}, g^{\mathrm{sym}} \rangle$,
where $\langle \cdot, \cdot \rangle$ denotes the $L^2$-scalar product and $f^{\mathrm{sym}}$ denotes the symmetrisation
of f. This is unfortunately not the case. Instead, one should replace the
products in (15.25) by Wick products, which are formally generated by all possible
contractions of the type
If we then set
\[ \hat I_k(f) = \int\!\cdots\!\int f(z_1, \ldots, z_k)\,\xi(z_1) \diamond \cdots \diamond \xi(z_k)\,dz_1 \cdots dz_k , \]
Finally, one has $\mathbf{E} \hat I_k(f) \hat I_\ell(g) = 0$ if $k \neq \ell$. Random variables of the form $\hat I_k(f)$ for
some k ≥ 0 and some square integrable function f are said to belong to the kth
homogeneous Wiener chaos.
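A finite-dimensional analogue may make the role of Wick products more concrete. For a single standard Gaussian variable X, the Wick powers coincide with the probabilists' Hermite polynomials, which play the role of the homogeneous chaoses: Wick powers of different orders are orthogonal, whereas plain powers are not. The following is only a sketch of this one-dimensional analogue, not of the space-time white noise setting; the helper names `gaussian_moment`, `poly_mul`, `expectation` and `wick_power` are hypothetical and chosen for illustration.

```python
from math import factorial

def gaussian_moment(n):
    """E[X^n] for X ~ N(0, 1): zero for odd n, (n - 1)!! for even n."""
    if n % 2 == 1:
        return 0
    result = 1
    for j in range(n - 1, 0, -2):
        result *= j
    return result

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def expectation(p):
    """E[p(X)] for X ~ N(0, 1), computed exactly from the Gaussian moments."""
    return sum(c * gaussian_moment(n) for n, c in enumerate(p))

def wick_power(k):
    """Coefficients of the k-th Wick power of X, i.e. the Hermite polynomial He_k,
    built from the recursion He_{n+1}(x) = x He_n(x) - n He_{n-1}(x)."""
    prev, cur = [0], [1]              # He_{-1} := 0, He_0 = 1
    for n in range(k):
        shifted = [0] + cur           # coefficients of x * He_n
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - n * t for s, t in zip(shifted, padded)]
    return cur

if __name__ == "__main__":
    # Plain powers are not orthogonal to constants: E[X^2 * 1] = 1.
    print(expectation(poly_mul([0, 0, 1], [1])))
    # Wick powers of different orders are orthogonal, and E[(He_k)^2] = k!.
    for k in range(4):
        for l in range(4):
            pairing = expectation(poly_mul(wick_power(k), wick_power(l)))
            assert pairing == (factorial(k) if k == l else 0)
```

In this toy setting the constant subtracted from X² to obtain the second Wick power is exactly E[X²], which is the analogue of the contraction constants appearing in the renormalisation procedure.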
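Returning to our problem, we first argue that it should be possible to choose $M^\varepsilon$
in such a way that $\hat\Pi^\varepsilon$ converges to a limit as ε → 0. The above considerations
suggest that one should rewrite $\Pi^\varepsilon$ as
\[ \Pi^\varepsilon(z) = \bigl(K' * \xi_\varepsilon\bigr)(z)^2 = \iint K'_\varepsilon(z - z_1)\,K'_\varepsilon(z - z_2)\,\xi(z_1) \diamond \xi(z_2)\,dz_1\,dz_2 + C_\varepsilon^{(1)} , \tag{15.26} \]
where the constant $C_\varepsilon^{(1)}$ is given by the contraction
\[ C_\varepsilon^{(1)} \stackrel{\text{def}}{=} \int \bigl(K'_\varepsilon(z)\bigr)^2\,dz . \]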
Note now that $K'_\varepsilon$ is an ε-approximation of the kernel $K'$, which has the same singular
behaviour as the derivative of the heat kernel. In terms of the parabolic distance, the
singularity of the derivative of the heat kernel scales like $K(z) \sim |z|^{-2}$ for $z \to 0$.
(Recall that we consider the parabolic distance $|(t, x)| = \sqrt{|t|} + |x|$, so that this is
consistent with the fact that the derivative of the heat kernel is bounded by $t^{-1}$.) This
suggests that one has $K'_\varepsilon(z)^2 \sim |z|^{-4}$ for $|z| \gg \varepsilon$. Since parabolic space-time has
scaling dimension 3 (time counts double!), this is a non-integrable singularity. As a
matter of fact, there is a whole power of z missing to make it borderline integrable,
which suggests that one has
\[ C_\varepsilon^{(1)} \sim \frac{1}{\varepsilon} . \]
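The power counting behind this rate can be sketched as follows. This is only a heuristic, under the simplifying assumptions that the squared kernel behaves exactly like $|z|^{-4}$ for $\varepsilon \le |z| \le 1$, that it is of order $\varepsilon^{-4}$ for $|z| \le \varepsilon$, and that the region $|z| \ge 1$ contributes a bounded amount. Since a parabolic ball of radius r has volume of order $r^3$, integrating in the "radial" variable r gives:

```latex
\begin{align*}
\int_{\varepsilon \le |z| \le 1} |z|^{-4}\,dz
  &\approx \int_{\varepsilon}^{1} r^{-4}\, r^{2}\,dr
   = \frac{1}{\varepsilon} - 1 , \\
\int_{|z| \le \varepsilon} \varepsilon^{-4}\,dz
  &\approx \varepsilon^{-4}\,\varepsilon^{3}
   = \frac{1}{\varepsilon} .
\end{align*}
```

Both regions therefore contribute at order $1/\varepsilon$. With an exponent $-3$ in place of $-4$ the first integral would instead diverge like $\log(1/\varepsilon)$, which is the borderline case referred to above.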
This already shows that one should not expect $\Pi^\varepsilon$ to converge to a limit as ε → 0.
However, it turns out that the first term in (15.26) converges to a distribution-valued
stationary space-time process, so that one would like to somehow get rid of this
diverging constant $C_\varepsilon^{(1)}$. This is exactly where the renormalisation map $M^\varepsilon$ (in
particular the factor $\exp(-C_1 L_1)$) enters into play. Following the above definitions,
we see that one has
\[ \hat\Pi^\varepsilon(z) = \bigl(\Pi^\varepsilon M\bigr)(z) = \Pi^\varepsilon(z) - C_1 . \]
This suggests that if we make the choice $C_1 = C_\varepsilon^{(1)}$, then $\hat\Pi^\varepsilon$ does indeed converge
to a non-trivial limit as ε → 0. This limit is a distribution given, at least formally, by
\[ \Pi(\psi) = \iint \int \psi(z)\,K'(z - z_1)\,K'(z - z_2)\,dz\;\xi(z_1) \diamond \xi(z_2)\,dz_1\,dz_2 . \]
Using again the scaling properties of the kernel $K'$, it is not too difficult to show that
this yields indeed a random variable belonging to the second homogeneous Wiener
chaos for every choice of smooth test function ψ.
The case τ = is treated in a somewhat similar way. This time one has
where the constant $C_\varepsilon^{(0)}$ is given by the contraction
\[ C_\varepsilon^{(0)} \stackrel{\text{def}}{=} \int K'_\varepsilon(z)\,\bigl(K' * K'_\varepsilon\bigr)(z)\,dz . \]
This time however $K'_\varepsilon$ is an odd function (in the spatial variable) and $K' * K'_\varepsilon$ is an
even function, so that $C_\varepsilon^{(0)}$ vanishes for every ε > 0. This is why we can set C0 = 0
and no renormalisation is required for .
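The parity argument can be spelled out in one line. Writing $z = (t, x)$ and $F(t,x) = K'_\varepsilon(t,x)\,(K' * K'_\varepsilon)(t,x)$, and assuming (as a hypothesis of this sketch) that F is absolutely integrable for fixed ε > 0, the product of an odd and an even function of x is odd in x, so the substitution x ↦ −x gives:

```latex
\begin{align*}
C_\varepsilon^{(0)}
  = \iint F(t, x)\,dx\,dt
  = \iint F(t, -x)\,dx\,dt
  = -\iint F(t, x)\,dx\,dt
  = -\,C_\varepsilon^{(0)} ,
\end{align*}
```

which forces $C_\varepsilon^{(0)} = 0$. The evenness of $K' * K'_\varepsilon$ itself follows from the fact that the convolution of two functions that are odd in x is even in x.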
Turning to our list of terms of negative homogeneity, it remains to consider ,
, and . It turns out that the latter two are the more difficult ones, so we only
discuss these. Let us first argue why we expect to be able to choose the constant C2
in such a way that $\hat\Pi^\varepsilon$ converges to a limit. In this case, the “bad” term comes
from the part of $\Pi^\varepsilon(z)$ belonging to the homogeneous chaos of order 0. This is
simply a constant, which is given by
\[ C_\varepsilon^{(2)} \stackrel{\text{def}}{=} 2 \int K'(z)\,K'(\bar z)\,Q_\varepsilon^2(z - \bar z)\,dz\,d\bar z , \tag{15.27} \]
Remark 15.15. The factor 2 comes from the fact that the contraction (15.27) appears
twice, since it is equal to the contraction . In principle, one would think that the
contraction also contributes to $C_\varepsilon^{(2)}$. This term however vanishes due to the fact
that the integral of $K'_\varepsilon$ vanishes.
\[ C_\varepsilon^{(3)} \stackrel{\text{def}}{=} 2 \int K'(z)\,K'(\bar z)\,Q_\varepsilon(\bar z)\,Q_\varepsilon(z + \bar z)\,dz\,d\bar z , \]
which diverges logarithmically for exactly the same reason as $C_\varepsilon^{(2)}$. Setting $C_2 = C_\varepsilon^{(2)}$,
this diverging constant can again be cancelled out. The combinatorial factor 2
arises in essentially the same way as for and the contribution of the term where
the two top nodes are contracted vanishes for the same reason as previously.
It remains to consider the contribution of Π ε to the second Wiener chaos. This
contribution consists of three terms, which correspond to the contractions
It turns out that the first one of these terms does not give rise to any singularity. The
last two terms can be treated in essentially the same way, so we focus on the last one,
which we denote by η ε . For fixed ε, the distribution (actually smooth function) η ε is
given by
Z
η (ψ) = ψ(z0 )K 0 (z0 − z1 )Qε (z0 − z1 )K 0 (z2 − z1 )
ε
This is akin to the problem of making sense of the “Cauchy principal value”
distribution, which formally corresponds to the integration against 1/x. For the sake
of the argument, let us consider a function W : R → R which is compactly supported
and smooth everywhere except at the origin, where it diverges like |W (x)| ∼ 1/|x|.
It is then natural to associate to W a “renormalised” distribution RW given by
\[ RW(\varphi) = \int W(x)\,\bigl(\varphi(x) - \varphi(0)\bigr)\,dx . \]
Note that RW has the property that if ϕ(0) = 0, then it simply corresponds to
integration against W , which is the standard way of associating a distribution to
a function. Furthermore, the above expression is always well-defined, since ϕ is
smooth and therefore the factor (ϕ(x) − ϕ(0)) cancels out the singularity of W at
the origin. It is also straightforward to verify that if $W_\varepsilon$ is a sequence of smooth
approximations to W (say one has $W_\varepsilon(x) = W(x)$ for |x| > ε and $|W_\varepsilon| \lesssim 1/\varepsilon$
otherwise) which has the property that each $W_\varepsilon$ integrates to 0, then $W_\varepsilon \to RW$ in
a distributional sense.
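This convergence can be checked numerically in a simple case. The sketch below makes several assumptions that are not in the text: it takes W(x) = 1/x on [−1, 1] with a sharp cut-off at ±1 (so W is not smooth there), it uses the regularisation $W_\varepsilon(x) = x/\varepsilon^2$ for |x| ≤ ε (continuous rather than smooth, odd, hence integrating to 0 and bounded by 1/ε), and it tests against φ = exp. The function names are hypothetical.

```python
import math

def w_eps(x, eps):
    """Odd regularisation of 1/x: equal to 1/x for |x| > eps, linear inside."""
    return 1.0 / x if abs(x) > eps else x / eps**2

def midpoint(f, a, b, n=200_000):
    """Midpoint rule; with a = -b and n even, no node falls on the origin."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pairing(phi, eps):
    """Integral of W_eps * phi over [-1, 1]."""
    return midpoint(lambda x: w_eps(x, eps) * phi(x), -1.0, 1.0)

def renormalised(phi):
    """RW(phi): integral of W(x) * (phi(x) - phi(0)) over [-1, 1]."""
    return midpoint(lambda x: (phi(x) - phi(0.0)) / x, -1.0, 1.0)

if __name__ == "__main__":
    limit = renormalised(math.exp)
    for eps in (0.1, 0.03, 0.01):
        # The discrepancy shrinks roughly linearly in eps.
        print(eps, pairing(math.exp, eps) - limit)
```

For this choice of W the renormalised distribution is the Cauchy principal value of 1/x restricted to [−1, 1], since the odd function W pairs to zero with the constant φ(0).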
In the same way, one can show that $\hat Q_\varepsilon$ converges as ε → 0 to a limiting distribution
$R\hat Q$. As a consequence, one can show that $\eta^\varepsilon$ converges to a limiting (random)
distribution η given by
\[ \eta(\psi) = \int \psi(z_0)\,R\hat Q(z_0 - z_1)\,K'(z_2 - z_1)\,K'(z_3 - z_2)\,K'(z_4 - z_2)\,\xi(z_3)\,\xi(z_4)\,dz . \]
It should be clear from this whole discussion that while the precise values of the
constants Ci depend on the details of the mollifier δε , the limiting (random) model
(Π̂, Γ̂ ) obtained in this way is independent of it. Combining this with the continuity
of the solution to the fixed point map (15.7) and of the reconstruction operator R
with respect to the underlying model, we see that the statement of Theorem 15.1
follows almost immediately.
15.6 The KPZ equation and rough paths

In the particular case of the KPZ equation, it turns out that it is possible to give a robust
solution theory by only using “classical” controlled rough path theory, as exposed in
the earlier part of this book. This is actually how it was originally treated in [Hai13].
To see how this can be the case, we make the following crucial remarks:
1. First, looking at the expression (15.23) for ∂H, we see that most symbols come
with constant coefficients. The only non-constant coefficients that appear are in
front of the term 1, which is some kind of renormalised value for ∂H, and in front
of the term . This suggests that the problem of finding a solution h to the KPZ
equation (or equivalently a solution h′ to the corresponding Burgers equation) can
be simplified considerably by considering instead the function v given by
v = ∂x h − Π + +2 , (15.28)
∆1 = 1 ⊗ 1 , ∆ = ⊗ 1 + 1 ⊗ J 0( ) ,
∆ = ⊗1, ∆ = ⊗ 1 + ⊗ J 0( ) .
It then follows from this and the definition (15.13) of the structure group G that
the space h , , 1, i ⊂ T is invariant under the action of G. Furthermore, its
action on this subspace is completely described by one real number corresponding
to J 0 ( ). Finally, viewing this subspace as a regularity structure in its own right,
we see that it is nothing but the regularity structure of Section 13.3.2, provided
that we make the identifications ∼ Ẇ , ∼ W , and ∼ Ẇ.
3. One has the identities
∆ = ⊗ 1 + ⊗ J 0( ), ∆ = ⊗ 1 + ⊗ J 0( ),
so that the pair of symbols { , } could also have played the role of {W , Ẇ}
in the previous remark.
Let now ξ be a smooth function and let h be given by the solution to the unrenor-
malised KPZ equation (15.1). Defining Π by ΠΞ = ξ and then recursively as in
(15.18), and defining v by (15.28), we then obtain for v the equation
∂t v = ∂x2 v + ∂x v Π + 4 Π
+R, (15.29)
where the “remainder” R belongs to C α for every α < −1. Similarly to before, it also
turns out that if we replace Π by Π̂ = Π M defined as in (15.19) (with C0 = 0) and
h by the solution to the renormalised KPZ equation (15.6) with Cε = C1 + C2 + 4C3 ,
then v also satisfies (15.29), but with Π replaced by the renormalised model Π̂.
We are now in the following situation. As a consequence of (15.23) we can guess
that for any fixed time t, the solution v should be controlled by the function Π̂ ,
which we can interpret as one component (say W 1 ) of some rough path (W, W).
Note that here the spatial variable plays the role of time! The time variable merely
plays the role of a parameter, so we really have a family of rough paths indexed
by time. Furthermore, Π̂ can be interpreted as the distributional derivative of
another component (say W 0 ) of the rough path W . Finally, the function Π̂ can be
interpreted as a third component W 2 of W .
As a consequence of the second and third remarks above, the two distributions
Π̂ and Π̂ can then be interpreted as the distributional derivatives of the “iterated
integrals” W1,0 and W2,1 . It follows automatically from these algebraic relations
combined with the analytic bounds (13.13) that W1,0 and W2,1 then satisfy the
required estimates (2.3). Our model does not provide any values for W1,2 , but these
turn out not to be required. Assuming that v is indeed controlled by X1 = Π̂ , it
\[ \partial_t Z = \partial_x^2 Z + \partial_x^2 Y , \]
References
[Dav08] A. M. Davie. Differential equations driven by rough paths: an approach via discrete approximation. Appl. Math. Res. Express. AMRX 2008, no. 2, (2008), 1–40. doi:10.1093/amrx/abm009.
[Der10] S. Dereich. Rough paths analysis of general Banach space-valued Wiener processes. J. Funct. Anal. 258, no. 9, (2010), 2910–2936. doi:10.1016/j.jfa.2010.01.018.
[DF12] J. Diehl and P. Friz. Backward stochastic differential equations with rough drivers. Ann. Probab. 40, no. 4, (2012), 1715–1758. doi:10.1214/11-AOP660.
[DFG13] J. Diehl, P. Friz, and P. Gassiat. Stochastic control with rough paths. ArXiv e-prints (2013). arXiv:1303.7160.
[DFM13] J. Diehl, P. Friz, and H. Mai. Pathwise stability of likelihood estimators for diffusions via rough paths. ArXiv e-prints (2013). arXiv:1311.1061.
[DFO14] J. Diehl, P. K. Friz, and H. Oberhauser. Regularity theory for rough partial differential equations and parabolic comparison revisited. Preprint; earlier version available on arXiv (2014).
[DFS14] J. Diehl, P. Friz, and W. Stannat. Stochastic partial differential equations: a rough path view, 2014. Preprint.
[DGT12] A. Deya, M. Gubinelli, and S. Tindel. Non-linear rough heat equations. Probab. Theory Related Fields 153, no. 1-2, (2012), 97–147. doi:10.1007/s00440-011-0341-z.
[DNT12a] A. Deya, A. Neuenkirch, and S. Tindel. A Milstein-type scheme without Lévy area terms for SDEs driven by fractional Brownian motion. Ann. Inst. Henri Poincaré Probab. Stat. 48, no. 2, (2012), 518–550. doi:10.1214/10-AIHP392.
[DNT12b] A. Deya, A. Neuenkirch, and S. Tindel. A Milstein-type scheme without Lévy area terms for SDEs driven by fractional Brownian motion. Ann. Inst. Henri Poincaré Probab. Stat. 48, no. 2, (2012), 518–550. doi:10.1214/10-AIHP392.
[DOR13] J. Diehl, H. Oberhauser, and S. Riedel. A Levy-area between Brownian motion and rough paths with applications to robust non-linear filtering and RPDEs. ArXiv e-prints (2013). arXiv:1301.3799.
[DPD03] G. Da Prato and A. Debussche. Strong solutions to the stochastic quantization equations. Ann. Probab. 31, no. 4, (2003), 1900–1916. doi:10.1214/aop/1068646370.
[DPZ92] G. Da Prato and J. Zabczyk. Stochastic equations in infinite dimensions, vol. 44 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1992.
[Faw04] T. Fawcett. Non-commutative harmonic analysis. Ph.D. thesis, University of Oxford, 2004.
[FdLP06] D. Feyel and A. de La Pradelle. Curvilinear integrals along enriched paths. Electronic Journal of Probability 11, (2006), 860–892. doi:10.1214/EJP.v11-356.
[FG13] P. Friz and P. Gassiat. Eikonal equations and pathwise solutions to fully nonlinear SPDEs, 2013. Preprint.
[FG14] P. K. Friz and B. Gess. Stochastic scalar conservation laws driven by rough paths. ArXiv e-prints (2014). arXiv:1403.6785.
[FGGR13] P. K. Friz, B. Gess, A. Gulisashvili, and S. Riedel. Jain-Monrad criterion for rough paths. ArXiv e-prints (2013). arXiv:1307.3460.
[FGL13] P. Friz, P. Gassiat, and T. Lyons. Physical Brownian motion in magnetic field as rough path. ArXiv e-prints (2013). arXiv:1302.2531.
[FLS06] P. Friz, T. Lyons, and D. Stroock. Lévy’s area under conditioning. Ann. Inst. H. Poincaré Probab. Statist. 42, no. 1, (2006), 89–101. doi:10.1016/j.anihpb.2005.02.003.
[FO10] P. Friz and H. Oberhauser. A generalized Fernique theorem and applications. Proc. Amer. Math. Soc. 138, (2010), 3679–3688. doi:10.1090/S0002-9939-2010-10528-2.
[FO14] P. Friz and H. Oberhauser. Rough path stability of (semi-)linear SPDEs. Probab. Theory Related Fields 158, no. 1-2, (2014), 401–434. doi:10.1007/s00440-013-0483-2.
[JM83] N. C. Jain and D. Monrad. Gaussian measures in Bp . Ann. Probab. 11, no. 1, (1983), 46–57.
[Kal02] O. Kallenberg. Foundations of modern probability. Probability and its Applications (New York). Springer-Verlag, New York, second ed., 2002.
[KM14] D. Kelly and I. Melbourne. Smooth approximation of stochastic differential equations. ArXiv e-prints (2014). arXiv:1403.7281.
[Koh78] J. J. Kohn. Lectures on degenerate elliptic problems. In Pseudodifferential operator with applications (Bressanone, 1977), 89–151. Liguori, Naples, 1978.
[KPZ86] M. Kardar, G. Parisi, and Y.-C. Zhang. Dynamic scaling of growing interfaces. Phys. Rev. Lett. 56, no. 9, (1986), 889–892.
[KR77] N. V. Krylov and B. L. Rozovskii. The Cauchy problem for linear stochastic partial differential equations. Izv. Akad. Nauk SSSR Ser. Mat. 41, no. 6, (1977), 1329–1347, 1448.
[KRT07] I. Kruk, F. Russo, and C. A. Tudor. Wiener integrals, Malliavin calculus and covariance measure structure. J. Funct. Anal. 249, no. 1, (2007), 92–142. doi:10.1016/j.jfa.2007.03.031.
[KS84] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. I. In Stochastic analysis (Katata/Kyoto, 1982), vol. 32 of North-Holland Math. Library, 271–306. North-Holland, Amsterdam, 1984.
[KS85] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. II. J. Fac. Sci. Univ. Tokyo Sect. IA Math. 32, no. 1, (1985), 1–76.
[KS87] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. III. J. Fac. Sci. Univ. Tokyo Sect. IA Math. 34, no. 2, (1987), 391–442.
[Kun82] H. Kunita. Stochastic partial differential equations connected with nonlinear filtering. In Nonlinear filtering and stochastic control (Cortona, 1981), vol. 972 of Lecture Notes in Math., 100–169. Springer, Berlin, 1982.
[Kus01] S. Kusuoka. Approximation of expectation of diffusion process and mathematical finance. In Taniguchi Conference on Mathematics Nara ’98, vol. 31 of Adv. Stud. Pure Math., 147–165. Math. Soc. Japan, Tokyo, 2001.
[LCL07] T. J. Lyons, M. Caruana, and T. Lévy. Differential equations driven by rough paths, vol. 1908 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 34th Summer School on Probability Theory held in Saint-Flour, July 6–24, 2004, With an introduction concerning the Summer School by Jean Picard.
[Led96] M. Ledoux. Isoperimetry and Gaussian analysis. In Lectures on probability theory and statistics (Saint-Flour, 1994), vol. 1648 of Lecture Notes in Math., 165–294. Springer, Berlin, 1996.
[LLQ02] M. Ledoux, T. Lyons, and Z. Qian. Lévy area of Wiener processes in Banach spaces. Ann. Probab. 30, no. 2, (2002), 546–578. doi:10.1214/aop/1023481002.
[LN11] T. Lyons and H. Ni. Expected signature of Brownian Motion up to the first exit time from a bounded domain. ArXiv e-prints (2011). arXiv:1101.5902.
[LPS13] P.-L. Lions, B. Perthame, and P. E. Souganidis. Scalar conservation laws with rough (stochastic) fluxes. ArXiv e-prints (2013). arXiv:1309.1931.
[LQ02] T. Lyons and Z. Qian. System control and rough paths. Oxford Mathematical Monographs. Oxford University Press, Oxford, 2002. Oxford Science Publications.
[LQZ02] M. Ledoux, Z. Qian, and T. Zhang. Large deviations and support theorem for diffusion processes via rough paths. Stochastic Process. Appl. 102, no. 2, (2002), 265–283. doi:10.1016/S0304-4149(02)00176-X.
[LS98a] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic partial differential equations. C. R. Acad. Sci. Paris Sér. I Math. 326, no. 9, (1998), 1085–1092. doi:10.1016/S0764-4442(98)80067-0.
[LS98b] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic partial differential equations: non-smooth equations and applications. C. R. Acad. Sci. Paris Sér. I Math. 327, no. 8, (1998), 735–741. doi:10.1016/S0764-4442(98)80161-4.
[LS00a] P.-L. Lions and P. E. Souganidis. Fully nonlinear stochastic PDE with semilinear stochastic dependence. C. R. Acad. Sci. Paris Sér. I Math. 331, no. 8, (2000), 617–624. doi:10.1016/S0764-4442(00)00583-8.
[LS00b] P.-L. Lions and P. E. Souganidis. Uniqueness of weak solutions of fully nonlinear stochastic partial differential equations. C. R. Acad. Sci. Paris Sér. I Math. 331, no. 10, (2000), 783–790. doi:10.1016/S0764-4442(00)01597-4.
[LS01] W. V. Li and Q.-M. Shao. Gaussian processes: inequalities, small ball probabilities and applications. In Stochastic processes: theory and methods, vol. 19 of Handbook of Statist., 533–597. North-Holland, Amsterdam, 2001.
[LV04] T. Lyons and N. Victoir. Cubature on Wiener space. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 460, no. 2041, (2004), 169–198. doi:10.1098/rspa.2003.1239. Stochastic analysis with applications to mathematical finance.
[LV07] T. Lyons and N. Victoir. An extension theorem to rough paths. Ann. Inst. H. Poincaré Anal. Non Linéaire 24, no. 5, (2007), 835–847. doi:10.1016/j.anihpc.2006.07.004.
[LY02] F. Lin and X. Yang. Geometric measure theory—an introduction, vol. 1 of Advanced Mathematics (Beijing/Boston). Science Press Beijing, Beijing, 2002.
[Lyo91] T. Lyons. On the nonexistence of path integrals. Proc. Roy. Soc. London Ser. A 432, no. 1885, (1991), 281–290. doi:10.1098/rspa.1991.0017.
[Lyo94] T. Lyons. Differential equations driven by rough signals. I. An extension of an inequality of L. C. Young. Math. Res. Lett. 1, no. 4, (1994), 451–464. doi:10.4310/MRL.1994.v1.n4.a5.
[Lyo98] T. J. Lyons. Differential equations driven by rough signals. Rev. Mat. Iberoamericana 14, no. 2, (1998), 215–310. doi:10.4171/RMI/240.
[Mal78] P. Malliavin. Stochastic calculus of variations and hypoelliptic operators. Proc. Intern. Symp. SDE 195–263.
[Mal97] P. Malliavin. Stochastic analysis, vol. 313 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1997.
[McK69] H. P. McKean, Jr. Stochastic integrals. Probability and Mathematical Statistics, No. 5. Academic Press, New York-London, 1969.
[McS72] E. J. McShane. Stochastic differential equations and models of random processes. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. III: Probability theory, 263–294. Univ. California Press, Berkeley, Calif., 1972.
[Mey92] Y. Meyer. Wavelets and operators, vol. 37 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1992. Translated from the 1990 French original by D. H. Salinger.
[MR06] M. B. Marcus and J. Rosen. Markov processes, Gaussian processes, and local times, vol. 100 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2006.
[MSS06] A. Millet and M. Sanz-Solé. Large deviations for rough paths of the fractional Brownian motion. Ann. Inst. H. Poincaré Probab. Statist. 42, no. 2, (2006), 245–271.
[Nor86] J. Norris. Simplified Malliavin calculus. In Séminaire de Probabilités, XX, 1984/85, vol. 1204 of Lecture Notes in Math., 101–130. Springer, Berlin, 1986.
[NP88] D. Nualart and É. Pardoux. Stochastic calculus with anticipating integrands. Probab. Theory Related Fields 78, no. 4, (1988), 535–581. doi:10.1007/BF00353876.
[NT11] D. Nualart and S. Tindel. A construction of the rough path above fractional Brownian motion using Volterra’s representation. Ann. Probab. 39, no. 3, (2011), 1061–1096. doi:10.1214/10-AOP578.
[Nua06] D. Nualart. The Malliavin calculus and related topics. Probability and its Applications (New York). Springer-Verlag, Berlin, second ed., 2006.
[Par79] E. Pardoux. Stochastic partial differential equations and filtering of diffusion processes. Stochastics 3, no. 2, (1979), 127–167. doi:10.1080/17442507908833142.
Index

Th , 25
$T_{\ge\alpha}$, 213
$|||\cdot|||_\alpha$, 54
$C^\gamma$, 194
$C^\alpha$, 14
$C_g^\alpha$, 16
$C^{p\text{-var}}$, 151
$C_g^{p\text{-var}}$, 151
$C_{g,0}^{0,\alpha}$, 126
$C_g^{0,\alpha}$, 22
$D^\gamma(V)$, 212
$D_X^{2\alpha}$, 56
$D_\alpha^\gamma$, 213
M , 198
W 1 , 150
$\varrho_\alpha$, 15
$d_C$, 18, 30
p-variation, 149
BC, 170
1-form, 48

admissible models, 215

Borell’s inequality, 154
Bouleau–Hirsch criterion, 160
bracket of a rough path, 71
Brownian motion, 150
  Banach-valued, 43
  fractional, 139, 141
  Hölder roughness, 90
  Hilbert-valued, 42
  in magnetic field, as rough path, 34
  Itô, as rough path, 31
  physical, 34, 44
  Stratonovich, as rough path, 32
Brownian rough path, 28, 33

Cameron–Martin
  embedding theorem, 150
  paths, 150
  space, 150
  theorem for Brownian rough path, 127
  variation embedding, 150
  variation embedding, improved, 166
Carnot–Carathéodory
  metric, 18
  norm, 18
Cass–Litterer–Lyons estimates, 159
Chen’s relation, 13, 17, 20
Chen–Strichartz formula, 121
complementary Young regularity, 149
concentration of measure, 154
controlled rough paths, 56
  composition with regular functions, 97
  integration, 57
  of low regularity, 101
  operations on, 95
  relation to rough paths, 95
covariance function, 129
cubature formula, 40, 45
cubature on Wiener space, 39

Davie’s lemma, 50
differential equations
  Young, 106
Doob–Meyer
  decomposition, 83
  for rough paths, 86

enhanced Brownian motion, 28

Fawcett’s formula, 39
Fernique theorem, 154
  for Gaussian rough paths, 155