Forward Start Options in Heston Model
Lund University
Henrik Sandler
August 2021
Contents
1 Introduction
3 Diffusion Equation
  3.1 Diffusion
    3.1.1 Solution to the diffusion equation
    3.1.2 Diffusion equation with drift
    3.1.3 Solution to the diffusion equation with drift
    3.1.4 Fokker-Planck equation
    3.1.5 A general SDE
    3.1.6 Kolmogorov backward equation
  3.2 Bibliographical notes
5 Fourier Transformation
  5.1 Fourier
    5.1.1 Definition of the Fourier transform and characteristic function
    5.1.2 Derivation, Carr-Madan (1999), of the call price with Fourier transform
    5.1.3 Characteristic function
    5.1.4 Inversion of the CF to the CDF/PDF
    5.1.5 Proofs and facts for the Fourier inversion
    5.1.6 Inversion of an option
  5.2 Bibliographical notes
6 Heston model
  6.1 Heston Model
    6.1.1 Intuition of Heston model
    6.1.2 Another specification of the Heston model
    6.1.3 European call option price in the Heston model
    6.1.4 Bibliographical notes
References
Chapter 1
Introduction
In chapter 6 I discuss the Heston model, a model that tries to better explain the non-flat volatility surface. I also derive its characteristic function. The Heston model can also be derived with the martingale method, which Heston did not use in his derivation; I chose not to include it. There is a good reason to work on the log scale of the asset price: it makes the variance affine in the state space. In the last chapter I look at forward start contracts under the Black-Scholes and the Heston model. For the Heston model, I write down three different "ansatz" to the problem. I only look at the variance and the (log) asset price; there are extensions that include a random grant time and a random interest rate, which I chose not to include in the article.
This would not have been possible without the help, support, and trust of
my supervisor Magnus Wiktorsson, Lund University.
Chapter 2
Geometric Brownian Motion
• The holder of the contract has, exactly at time t = T , the right to buy X
US dollars at the price K SEK/$
• The holder of the option has no obligation to buy the dollars
This contract is called an option because it gives the holder the option, but not the obligation, to buy some underlying asset. The prefix European means that the option can only be exercised at maturity, the exercise date $T$. There are other types of options that can be exercised at times prior to maturity $T$. An option is a derivative asset, in the sense that it is defined in terms of the underlying asset. An option always has a non-negative price when the contract is entered. The price is determined on the existing option market. The value of the option at $T$ depends on the future level of the exchange rate, that is, it is stochastic.
and to get the population after the end of t years
\[
X_0 (1 + r)^t = X_t \tag{2.2}
\]
Here $X$ and $t$ are discrete variables, but we will turn them into continuous variables. Let us consider the number of bacteria, a much larger count, which makes the step to continuous variables easier. In the previous example we had one compounding per year; let us here assume that we have $f$ compounding periods per year (unit time), and let the rate $r$ be the same as in the previous example. In each compounding period the number of bacteria grows by $r/f$, and in $t$ years there will be $ft$ periods. We get the following
\[
X_t = X_0 \left(1 + \frac{r}{f}\right)^{ft} \tag{2.3}
\]
and letting the number of compounding periods per unit time (year) grow, we get
\[
X_t = \lim_{f \to \infty} X_0 \left(1 + \frac{r}{f}\right)^{ft} \tag{2.4}
\]
The limit in (2.4) is one of the definitions of $e$; using this fact and the law of composition of limits we end up with
\[
X_t = X_0 \left(e^{r}\right)^{t} = X_0 e^{rt} \tag{2.5}
\]
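The convergence of (2.3) to (2.5) can be checked numerically. The following is a minimal sketch; the function names `compound` and `continuous` and the parameter values are illustrative assumptions, not taken from the text.

```python
import math


def compound(x0: float, r: float, t: float, f: int) -> float:
    """Value after t years with f compounding periods per year, as in (2.3)."""
    return x0 * (1.0 + r / f) ** (f * t)


def continuous(x0: float, r: float, t: float) -> float:
    """Continuous-compounding limit X_0 * exp(r t), as in (2.5)."""
    return x0 * math.exp(r * t)


if __name__ == "__main__":
    x0, r, t = 1.0, 0.05, 10.0  # illustrative values (assumption)
    for f in (1, 12, 365, 10**6):
        # The discrete value approaches the continuous limit as f grows.
        print(f, compound(x0, r, t, f), continuous(x0, r, t))
```

As $f$ grows, the discretely compounded value increases towards $X_0 e^{rt}$.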
We will interpret the variables as continuous. Going back to the population (2.2), to see what happens to the population in a small time step $h$, we get
\[
\frac{dX_t}{dt} = \lim_{h \to 0} X_0 e^{rt}\, \frac{e^{rh} - 1}{h} \tag{2.8}
\]
The Taylor expansion of the exponential function $e^x$ around zero is
\[
e^x = 1 + x + \frac{x^2}{2!} + \dots \tag{2.9}
\]
Applying the Taylor series expansion (2.9) to (2.8) we get
\[
\frac{dX_t}{dt} = \lim_{h \to 0} X_0 e^{rt}\, \frac{1 + rh + \frac{(rh)^2}{2!} + \dots - 1}{h} \tag{2.10}
\]
Dividing through by $h$ and canceling the 1 in the numerator, we get
\[
\frac{dX_t}{dt} = \lim_{h \to 0} \left( r + \frac{r^2 h}{2!} + \dots \right) X_0 e^{rt} = r X_0 e^{rt} = r X_t \tag{2.11}
\]
This differential equation could also have been solved using an integrating factor or separation of variables. The differential equation
\[
\frac{dX_t}{dt} = r X_t \tag{2.12}
\]
can be used to model many objects, for example the growth of a bank account. Here we assume that $r$ in (2.12) is constant; in later chapters I will look at the case where $r_t$ is a deterministic function of time.
\[
dX_t = X_t \left( r_t\, dt + dR_t \right) \tag{2.13}
\]
When you deposit money at the bank, you know the interest rate (the risk-free rate) at the beginning of each period. The rate will vary over time; this is the case of a time-dependent deterministic interest rate $r_t$. Assume instead that you invest money in a stock by buying shares of a company. Then you would expect to earn a return, but this return will be random, since it depends on the future price of the stock. That gives the additional random component $dR_t$. This random component will be a stochastic process. There are many random processes, but in the Geometric Brownian Motion we assume the following model
Equation (2.15) allows you to build and model many stochastic processes. The deterministic $r_t$ allows you to control the mean of the process; remember that the expectation of the integral of a stochastic process with respect to a standard Brownian motion is zero for a fixed time horizon. In the simple model we assume that $r$ and $\sigma$ are constant, so the process can be written as $dX_t = r X_t\, dt + \sigma X_t\, dW_t$. Writing this model in integral form, and noting that the expectation of a stochastic integral is zero, we get
\[
\begin{aligned}
E X_t &= E X_0 + \int_0^t r\, E X_s\, ds \\
E X_T &= X_0 e^{rT}
\end{aligned} \tag{2.18}
\]
and for the second moment we get
\[
\begin{aligned}
E\left[X_t^2\right] &= E X_0^2 + \int_0^t (2r + \sigma^2)\, E X_s^2\, ds \\
E\left[X_t^2\right] &= X_0^2\, e^{(2r + \sigma^2) t}
\end{aligned} \tag{2.19}
\]
In a process with a random component, the infinitesimal change is no longer deterministic: it has a probability distribution, and the variance refers to the infinitesimal changes, not to the process itself. This variance is the probability-weighted average of the squared displacement. The quadratic variation of the process, the limiting sum of the squared displacements when the interval is divided into a large number of subintervals, is equal to the same quantity. For the standard Brownian increments it follows that
\[
dW_t^2 = dt \tag{2.21}
\]
\[
dx = r x\, dt \tag{2.22}
\]
Collect the $x$ terms and the $t$ terms on separate sides of the equation (this technique is called separation of variables); in the next step, integrate and take the exponential of each side.
\[
\frac{dx}{x} = r\, dt \;\Leftrightarrow\; d\log(x) = r\, dt \;\Leftrightarrow\; x = x_0 e^{rt} \tag{2.23}
\]
This gives us a hint that $\log(x)$ should also solve the stochastic differential equation, but in the stochastic world the chain rule, Ito's lemma, is different from the deterministic world.
\[
d\log X_t = \frac{\partial \log X_t}{\partial X_t}\, dX_t + \frac{1}{2} \frac{\partial^2 \log X_t}{\partial X_t^2}\, dX_t^2 \tag{2.24}
\]
Inserting the first and second derivatives of $\log X$, it becomes
\[
d\log X_t = \frac{1}{X_t}\, dX_t - \frac{1}{2} \frac{1}{X_t^2}\, dX_t^2 \tag{2.25}
\]
Insert $dX_t$ and $dX_t^2$, and remember box algebra:
\[
d\log X_t = \frac{1}{X_t} \left( r X_t\, dt + \sigma X_t\, dW_t \right) - \frac{1}{2} \frac{1}{X_t^2}\, \sigma^2 X_t^2\, dt \tag{2.26}
\]
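On the left-hand side the differential and the integral cancel, and taking the constants out of the integrals on the right-hand side gives us
\[
\log X_T - \log X_0 = \left( r - \frac{1}{2}\sigma^2 \right) \int_0^T dt + \sigma \int_0^T dW_t \tag{2.29}
\]
Take the exponential of both sides, noting that $W_0 = 0$ by the definition of Brownian motion:
\[
X_T = X_0\, e^{\left(r - \frac{1}{2}\sigma^2\right) T + \sigma W_T} \tag{2.31}
\]
It can also be written as
\[
X_T = e^{\log X_0 + \left(r - \frac{1}{2}\sigma^2\right) T + \sigma W_T} \tag{2.32}
\]
With $Z \sim N[0, 1]$ we have $W_T = \sqrt{T}\, Z$, so for an arbitrary starting point $t$ we can write (2.32) as
\[
X_T = X_t\, e^{\left(r - \frac{1}{2}\sigma^2\right)(T - t) + \sigma \sqrt{T - t}\, Z} \tag{2.35}
\]
and, letting $\Delta t$ be the size of the time step,
\[
X_{t+\Delta t} = X_t\, e^{\left(r - \frac{1}{2}\sigma^2\right)\Delta t + \sigma \sqrt{\Delta t}\, Z}
\sim LN\!\left( \log X_t + \left( r - \frac{1}{2}\sigma^2 \right) \Delta t,\; \sigma^2 \Delta t \right) \tag{2.36}
\]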
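The exact step (2.36) can be used directly to simulate the process. The following is a minimal sketch using only the Python standard library; the function name `simulate_gbm_terminal`, the seed and all parameter values are illustrative assumptions. The sample moments can be compared with (2.18) and (2.19).

```python
import math
import random


def simulate_gbm_terminal(x0, r, sigma, T, n_steps, n_paths, seed=0):
    """Simulate X_T by repeating the exact log-normal step (2.36)."""
    rng = random.Random(seed)
    dt = T / n_steps
    drift = (r - 0.5 * sigma**2) * dt   # deterministic part of the log step
    vol = sigma * math.sqrt(dt)         # scale of the Gaussian shock
    terminal = []
    for _ in range(n_paths):
        log_x = math.log(x0)
        for _ in range(n_steps):
            log_x += drift + vol * rng.gauss(0.0, 1.0)
        terminal.append(math.exp(log_x))
    return terminal


if __name__ == "__main__":
    x0, r, sigma, T = 100.0, 0.05, 0.2, 1.0  # illustrative values (assumption)
    xs = simulate_gbm_terminal(x0, r, sigma, T, n_steps=4, n_paths=50_000, seed=1)
    mean = sum(xs) / len(xs)
    second = sum(x * x for x in xs) / len(xs)
    print("sample mean  ", mean, "theory", x0 * math.exp(r * T))
    print("sample E[X^2]", second, "theory", x0**2 * math.exp((2 * r + sigma**2) * T))
```

Because each step is sampled from the exact log-normal transition, the number of steps affects only the path resolution, not the distribution of $X_T$.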
\[
B_t = e^{rt} \tag{2.39}
\]
We have the stock dynamics under the physical, or real, probability measure $P$, and we want to know what these dynamics look like under the risk-neutral measure $Q$, which is the measure associated with the bank account as numeraire, and under the measure $S$ induced by the stock as numeraire. The numeraire is an asset which acts as a measure of value, for example money, gold or tulips. Three key concepts are needed to derive the dynamics: the general valuation formula (RNVF), the theory of martingales and the Girsanov theorem.
under the risk-neutral measure $Q$. The value of the unit under the stock measure $S$ will also be a martingale.
\[
\frac{V_t}{S_t} = E^{S}\!\left[ \frac{V_T}{S_T} \,\middle|\, \mathcal{F}_t \right] \tag{2.42}
\]
To derive the dynamics of the stock under the risk-neutral measure, we write it in terms of the price of the stock expressed in units of the bank account. The stock price scaled by the value of the bank account will be a martingale under the measure $Q$
\[
\frac{V_t}{B_t} = E^{Q}\!\left[ \frac{V_T}{B_T} \,\middle|\, \mathcal{F}_t \right] \tag{2.43}
\]
Let the ratio between the stock price and the bank account be
\[
Z_t = \frac{S_t}{B_t} \tag{2.44}
\]
Then it follows from the theory of martingales that the SDE of $Z_t$, i.e. the dynamics of the stock asset, will have zero drift under the measure induced by the bank account ($Q$).
\[
dZ_t = \sigma Z_t\, dW_t^{Q} \tag{2.45}
\]
where $W^{Q}$ is the standard Brownian motion under the $Q$ measure. Use Ito's lemma for the ratio $Z_t$
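\[
\begin{aligned}
d\!\left( \frac{S_t}{B_t} \right) &= \frac{dS_t}{B_t} + S_t\, d\!\left( \frac{1}{B_t} \right) \\
&= \frac{dS_t}{B_t} + S_t\, d\!\left( e^{-rt} \right) \\
&= \frac{dS_t}{B_t} + S_t \left( -r e^{-rt} \right) dt \\
&= \frac{dS_t}{B_t} + S_t \left( -r \frac{1}{B_t} \right) dt \\
&= \frac{dS_t}{B_t} - r \frac{S_t}{B_t}\, dt \\
&= \frac{\mu S_t\, dt + \sigma S_t\, dW_t}{B_t} - r \frac{S_t}{B_t}\, dt \\
&= \sigma \frac{S_t}{B_t} \left( \frac{\mu - r}{\sigma}\, dt + dW_t \right) \\
dZ_t &= \sigma Z_t \left( \frac{\mu - r}{\sigma}\, dt + dW_t \right)
\end{aligned} \tag{2.46}
\]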
So we have two equations that describe the dynamics of $Z_t$, and they must therefore be equal
\[
dZ_t = \sigma Z_t\, dW_t^{Q} \tag{2.47}
\]
\[
dZ_t = \sigma Z_t \left( \frac{\mu - r}{\sigma}\, dt + dW_t \right) \tag{2.48}
\]
The Brownian motion under the original measure $P$ and the Brownian motion under the risk-neutral measure are linked as follows
\[
dW_t^{Q} = \frac{\mu - r}{\sigma}\, dt + dW_t \tag{2.49}
\]
Substituting the Brownian motion under $Q$ for the original Brownian motion under the physical measure $P$, we get
\[
dS_t = \mu S_t\, dt + \sigma S_t \left( dW_t^{Q} - \frac{\mu - r}{\sigma}\, dt \right) = r S_t\, dt + \sigma S_t\, dW_t^{Q} \tag{2.50}
\]
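For the dynamics under the stock measure I will use the Girsanov theorem. We know that the value of an asset with the bank account as numeraire is a martingale
\[
\frac{V_0}{B_0} = E^{Q}\!\left[ \frac{V_t}{B_t} \,\middle|\, \mathcal{F}_0 \right] \tag{2.51}
\]
and the value of an asset with the stock as numeraire will be a martingale under the measure induced by that numeraire
\[
\frac{V_0}{S_0} = E^{S}\!\left[ \frac{V_t}{S_t} \,\middle|\, \mathcal{F}_0 \right] \tag{2.52}
\]
$B_0$ and $S_0$ are known given $\mathcal{F}_0$, so we can put them inside the expectation.
\[
V_0 = E^{Q}\!\left[ V_t \frac{B_0}{B_t} \,\middle|\, \mathcal{F}_0 \right] \tag{2.53}
\]
and for the stock as numeraire
\[
V_0 = E^{S}\!\left[ V_t \frac{S_0}{S_t} \,\middle|\, \mathcal{F}_0 \right] \tag{2.54}
\]
As both expressions represent the price of the same asset, and as this holds for any asset, the terms inside the expectations must be equal.
\[
\begin{aligned}
\frac{B_0}{B_t}\, dQ &= \frac{S_0}{S_t}\, dP^{S} \;\Leftrightarrow \\
\frac{dP^{S}}{dQ} &= \frac{B_0}{B_t} \frac{S_t}{S_0}
\end{aligned} \tag{2.55}
\]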
The solution of the stock price SDE under the risk-neutral measure $Q$ is
\[
S_t = S_0 \exp\!\left( rt - \frac{\sigma^2}{2} t + \sigma W_t^{Q} \right) \tag{2.56}
\]
Putting this into equation (2.55), together with the ratio for the bank account, gives us
\[
\frac{dP^{S}}{dQ} = e^{-rt}\, e^{rt - \frac{1}{2}\sigma^2 t + \sigma W_t^{Q}} = e^{-\frac{1}{2}\sigma^2 t + \sigma W_t^{Q}} \tag{2.57}
\]
The Girsanov theorem states that if $W_t^{Q}$ is a Brownian motion under the measure $Q$ and we shift the process by $Y(t) = \int_0^t y_u\, du$, then the shifted process $W_t^{S} = W_t^{Q} - \int_0^t y_u\, du$ will be a Brownian motion under the measure $P^{S}$, which is identified by its density
\[
\frac{dP^{S}}{dQ} = \exp\!\left( -\frac{1}{2} \int_0^t y_u^2\, du + \int_0^t y_u\, dW_u^{Q} \right) \tag{2.58}
\]
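Two consequences of the density (2.58) can be checked numerically for a constant kernel $y_u = \sigma$, for which (2.58) reduces to (2.57). The sketch below is an illustration under stated assumptions: it writes $W_t^{Q} = \sqrt{t}\, Z$ with $Z \sim N(0,1)$ and integrates against the Gaussian density with the trapezoidal rule; the helper names `gauss_expectation` and `density` and the parameter values are hypothetical.

```python
import math


def gauss_expectation(g, n=4001, z_max=10.0):
    """E[g(Z)] for Z ~ N(0,1), by the trapezoidal rule on [-z_max, z_max]."""
    h = 2.0 * z_max / (n - 1)
    total = 0.0
    for i in range(n):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * g(z) * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return total * h


def density(z, sigma, t):
    """Radon-Nikodym density (2.57), with W_t^Q written as sqrt(t) * z."""
    return math.exp(-0.5 * sigma**2 * t + sigma * math.sqrt(t) * z)


sigma, t = 0.2, 1.0  # illustrative values (assumption)

# The density integrates to one under Q, so P^S is a probability measure.
total_mass = gauss_expectation(lambda z: density(z, sigma, t))

# Under P^S the mean of W_t^Q is sigma * t, so W_t^Q - sigma * t has mean zero.
mean_under_s = gauss_expectation(lambda z: density(z, sigma, t) * math.sqrt(t) * z)

print(total_mass, mean_under_s, sigma * t)
```

The second quantity shows the shift in the mean of $W_t^{Q}$ that the Girsanov theorem removes.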
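In our example $y_t = \sigma$, i.e. a constant, and the relationship between the two Brownian motions would be
\[
E[y] = E\!\left[ \log(S_0) + \sigma W_t + rt - \frac{\sigma^2 t}{2} \right] = \log(S_0) + rt - \frac{\sigma^2 t}{2} \tag{2.64}
\]
\[
\mathrm{Var}[y] = \int_0^t \sigma^2\, du = \sigma^2 t \tag{2.65}
\]
and the distribution of $S_t$ can be found by variable transformation.
\[
\begin{aligned}
f_{S_t}(S) &= \frac{1}{\sqrt{2\pi \mathrm{Var}[\log S]}} \exp\!\left( -\frac{1}{2} \left( \frac{\log S - E[\log S]}{\sqrt{\mathrm{Var}[\log S]}} \right)^{2} \right) \frac{1}{S} \\
&= \frac{1}{S \sqrt{2\pi \sigma^2 t}} \exp\!\left( -\frac{1}{2} \left( \frac{\log S - \log S_0 - rt + \frac{\sigma^2 t}{2}}{\sqrt{\sigma^2 t}} \right)^{2} \right)
\end{aligned} \tag{2.66}
\]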
A call option pays, at maturity, the difference between the price of the underlying and the strike if that difference is positive.
\[
\text{Payoff} = (S_T - K)^{+} \tag{2.67}
\]
We can simplify the two expressions, i.e. solve them analytically, and we end up with
and the
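\[
\begin{aligned}
d^{(1)} &= \frac{\log\!\left( \frac{S_t}{K} \right) + \left( r + \frac{1}{2}\sigma^2 \right)(T - t)}{\sigma \sqrt{T - t}} \\
d^{(2)} &= \frac{\log\!\left( \frac{S_t}{K} \right) + \left( r - \frac{1}{2}\sigma^2 \right)(T - t)}{\sigma \sqrt{T - t}}
\end{aligned} \tag{2.70}
\]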
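The quantities in (2.70) can be evaluated as follows. This is a minimal sketch: it assumes the standard Black-Scholes call formula $C = S_t N(d^{(1)}) - K e^{-r(T-t)} N(d^{(2)})$, which does not appear in this excerpt, and the function names and parameter values are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(s: float, k: float, r: float, sigma: float, tau: float):
    """d^(1) and d^(2) from (2.70), with tau = T - t the time to maturity."""
    denom = sigma * math.sqrt(tau)
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * tau) / denom
    d2 = (math.log(s / k) + (r - 0.5 * sigma**2) * tau) / denom
    return d1, d2


def bs_call(s: float, k: float, r: float, sigma: float, tau: float) -> float:
    """Standard Black-Scholes call price (assumed formula, see lead-in)."""
    d1, d2 = d1_d2(s, k, r, sigma, tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


if __name__ == "__main__":
    # Illustrative at-the-money example (values are assumptions).
    print(d1_d2(100.0, 100.0, 0.05, 0.2, 1.0))
    print(bs_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

Note that $d^{(1)} - d^{(2)} = \sigma \sqrt{T - t}$, which gives a quick consistency check on any implementation.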
\[
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0 \tag{2.71}
\]
Let the price of the option be $V$. $V$ is a function $V = V(T - t, S_t; r, \sigma, K)$. Assume that $r$, $\sigma$ and $K$, the risk-free rate, the volatility and the strike price, are all constant; then the price of the option will depend on $V = V(T - t, S)$, the time to maturity and the price of the underlying. We next use Ito's lemma on the differential of $V$.¹
\[
dV = \frac{\partial V}{\partial t}\, dt + \frac{\partial V}{\partial S}\, dS + \frac{1}{2} \frac{\partial^2 V}{\partial S^2}\, dS^2 \tag{2.72}
\]
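Insert $dS$ from (2.37) and $dB$ from (2.38), use box algebra, and collect the $dt$ terms; we end up with
\[
dV = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \sigma S \frac{\partial V}{\partial S}\, dW_t \tag{2.73}
\]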
Equation (2.73) looks like an SDE, with a drift term and a diffusion term; the latter is driven by the same Brownian motion as the stock. Use the Delta argument to eliminate the stochastic component. This is done by trading in the underlying, i.e. the stock, which hedges the risk of the option. As both are driven by the same Brownian motion, we can almost eliminate the exposure of the option price to it by trading the underlying. We need to know how many units of the underlying to buy or sell, when to buy and sell, and for how long we should keep the position hedged. Assume that we are hedging a short position in a call option. Our strategy will involve buying stocks, which requires funding. Assume that we have unlimited access to a bank account: we pay the bank interest when we borrow, and we must be able to repay our debt. Conversely, the bank will pay us interest when we have excess cash. Assume that we bought $\Delta$ units of the stock and borrowed $\alpha$ units of the currency; $\Delta$ and $\alpha$ can be negative or positive. That gives us
\[
\Pi = \Delta S + \alpha B \tag{2.74}
\]
and the differential of the portfolio
This is an SDE for the portfolio. Since the Brownian motion is the same in both (2.73) and (2.77), we can set the stochastic terms equal.
\[
\sigma S \frac{\partial V}{\partial S} = -\Delta \sigma S_t \tag{2.78}
\]
¹ The reason for not adding $\frac{\partial^2 V}{\partial t^2}\, dt^2$ is box algebra, and the reason for not adding the cross term is the same, so we need only the second partial derivative with respect to the asset price.
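Isolating $\frac{\partial V}{\partial S}$ and canceling terms, we find that
\[
\Delta = -\frac{\partial V}{\partial S} \tag{2.79}
\]
i.e. the negative of the derivative of the option price w.r.t. the stock price. Then the combined portfolio is
\[
dV + d\Pi = \left( \frac{\partial V}{\partial t} + \mu S \frac{\partial V}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} - \mu S \frac{\partial V}{\partial S} + \alpha r B \right) dt \tag{2.80}
\]
Only the drift terms are left, and some cancellation gives us
\[
d(V + \Pi) = \left( \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + \alpha r B \right) dt \tag{2.81}
\]
The total portfolio has only a deterministic component and must, to avoid arbitrage, grow at the risk-free rate
\[
d(V + \Pi) = (V + \Pi)\, r\, dt \tag{2.82}
\]
Use that
\[
\Pi = -\frac{\partial V}{\partial S} S + \alpha B \tag{2.83}
\]
Substituting for $\Pi$ in (2.82) we get
\[
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + \alpha r B = rV - r \frac{\partial V}{\partial S} S + \alpha r B \tag{2.84}
\]
and cancellation gives us
\[
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0 \tag{2.85}
\]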
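As a sanity check, the PDE (2.85) can be evaluated numerically on a candidate solution. The sketch below assumes the standard Black-Scholes call formula as that candidate (it is not shown in this excerpt) and approximates the derivatives with central finite differences; the function names, step sizes and parameter values are illustrative assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, tau):
    """Standard Black-Scholes call price with time to maturity tau (assumed formula)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def bs_pde_residual(t, s, k, r, sigma, T, dt=1e-4, ds=1e-2):
    """Left-hand side of (2.85), with derivatives from central differences."""
    def v(tt, ss):
        return bs_call(ss, k, r, sigma, T - tt)

    v_t = (v(t + dt, s) - v(t - dt, s)) / (2.0 * dt)
    v_s = (v(t, s + ds) - v(t, s - ds)) / (2.0 * ds)
    v_ss = (v(t, s + ds) - 2.0 * v(t, s) + v(t, s - ds)) / ds**2
    return v_t + 0.5 * sigma**2 * s**2 * v_ss + r * s * v_s - r * v(t, s)


if __name__ == "__main__":
    # Illustrative point in (t, S); the residual should be close to zero.
    print(bs_pde_residual(t=0.25, s=105.0, k=100.0, r=0.05, sigma=0.2, T=1.0))
```

The residual is not exactly zero because of the finite-difference truncation, but it should be small compared with the individual terms of the equation.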
If you shift the last terms to the right-hand side, you will see that
\[
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} = r \left( V - \frac{\partial V}{\partial S} S \right) \tag{2.86}
\]
The right-hand side is the return on the bank account: the balance, equal to the option premium minus the amount that we borrowed to finance the delta units of stock, times the interest rate over an infinitesimal time period. The left-hand side represents how the Delta-hedged option changes in an infinitesimal time. The first term captures the shortening of the maturity; the second term is the gamma impact, the risk that remains after the Delta is hedged. This is almost the backward diffusion equation. You can also solve this problem by introducing a replicating portfolio.
2.5 Bibliographical notes
A good starting point is (Björk, 2009). There are many books in this area: a bit more applied is (Lindström, Madsen, & Nielsen, 2015); more hands-on, with many coding exercises, is (Iacus, 2011); a nice introduction to Brownian motion, the stochastic integral and its existence is (Evans, 2012). There are also many lecture notes on the internet. I have used http://www.frouah.com/pages/finmath.html for some clarification, as well as Rolf Poulsen's notes from Copenhagen University, http://web.math.ku.dk/ rolf/teaching/ctff03/ . A good place with derivations and videos is https://quantpie.co.uk/
Chapter 3
Diffusion Equation
3.1 Diffusion
This chapter deals with the diffusion equation. I have included it because it gives a better understanding of variance (volatility). It will help to get a better picture of Dupire's local volatility model and of the Heston model, where the variance is also assumed to be a volatile process, and not only, as in the Black-Scholes model, the underlying stock price process. This derivation of the diffusion equation follows Einstein's probabilistic solution to the problem. Assume that we have particles suspended in a liquid; we will take the 1-dimensional view of Einstein's solution to the diffusion equation. See figure (3.1): at time $t$ there will be $f(x, t) \cdot dx$ particles in the left rectangle, an area of width $dx$ around the generic $x$. As time moves on to $t + \tau$, from the upper to the lower subplot in figure (3.1), there will be another number of particles in the same area, keeping the generic $x$ on the x-axis fixed. Assume that the time step $\tau$ is small, but big enough that the two subplots in figure (3.1) can be assumed independent. Look at the distance a particle in the upper subplot needs to move, from the right rectangle to the left rectangle, during a time interval $\tau$. The length of the movement, or displacement, has a probabilistic interpretation: let $\Delta$ be a random variable. Einstein assumed that most particles will have a small displacement and that the probability of a large movement is small. Notice that I have chosen a displacement to the left. We assume that there is no influx of particles. The number of particles in the rectangle $\Delta$ away from the original, at time $t$, will be $f(x + \Delta, t) \cdot dx$. Let $\phi(\Delta)$ be the probability that a particle has a displacement equal to $\Delta$; the number of particles in the new rectangle that will move to the original rectangle after a time step $\tau$ will be $dx\, f(x + \Delta, t)\phi(\Delta)$.¹ This holds for every rectangle to the right of the original rectangle; as $\Delta$ is the distance, the movement from the left would be $dx\, f(x - \Delta, t)\phi(-\Delta)$. I assume that $\phi$ is symmetric around the generic $x$, i.e. $\phi(\Delta) = \phi(-\Delta)$.

¹ This is nothing but the expected number, $E(X) = \int f(x + \Delta, t)\phi(\Delta)\, dx$.

Figure 3.1: Particles in a fluid. The upper is at time $t$ and the lower is at time $t + \tau$.

I further assume that the probability that a particle will make two movements in a small time interval $\tau$ is zero. Integrating across the x-axis, we get the number of particles in $x$ at a later time $t + \tau$. There is no influx of new particles.
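\[
\begin{aligned}
f(x, t + \tau)\, dx &= dx \int_{\Delta = -\infty}^{\infty} f(x + \Delta, t)\phi(\Delta)\, d\Delta \\
f(x, t + \tau) &= \int_{\Delta = -\infty}^{\infty} f(x + \Delta, t)\phi(\Delta)\, d\Delta
\end{aligned} \tag{3.1}
\]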
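Expand the left-hand side of equation (3.1) using a Taylor expansion
\[
f(x, t + \tau) = f(x, t) + \frac{\partial f}{\partial t} \tau \tag{3.2}
\]
and the right-hand side of equation (3.1) using a Taylor expansion
\[
f(x + \Delta, t) = f(x, t) + \frac{\partial f}{\partial x} \Delta + \frac{1}{2!} \frac{\partial^2 f}{\partial x^2} \Delta^2 \tag{3.3}
\]
Substituting these in equation (3.1) we get
\[
f(x, t) + \frac{\partial f}{\partial t} \tau = \int_{\Delta = -\infty}^{\infty} \left( f(x, t) + \frac{\partial f}{\partial x} \Delta + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \Delta^2 \right) \phi(\Delta)\, d\Delta \tag{3.4}
\]
\[
\begin{aligned}
\int_{\Delta = -\infty}^{\infty} \left( f(x, t) + \frac{\partial f}{\partial x} \Delta + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \Delta^2 \right) \phi(\Delta)\, d\Delta
&= \int_{\Delta = -\infty}^{\infty} f(x, t)\phi(\Delta)\, d\Delta \\
&\quad + \int_{\Delta = -\infty}^{\infty} \frac{\partial f}{\partial x} \Delta \phi(\Delta)\, d\Delta \\
&\quad + \int_{\Delta = -\infty}^{\infty} \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \Delta^2 \phi(\Delta)\, d\Delta
\end{aligned} \tag{3.5}
\]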
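The total probability is equal to one, so the first integral after the equals sign is equal to $f(x, t)$. The number of particles in a rectangle, $f(x, t)$, does not depend on the displacement $\Delta$, and $\phi$ is symmetric around zero, which makes the second integral zero. So we end up with
\[
\begin{aligned}
\frac{\partial f}{\partial t} \tau &= \int_{\Delta = -\infty}^{\infty} \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \Delta^2 \phi(\Delta)\, d\Delta \\
\frac{\partial f}{\partial t} &= \frac{1}{2\tau} \frac{\partial^2 f}{\partial x^2} \int_{\Delta = -\infty}^{\infty} \Delta^2 \phi(\Delta)\, d\Delta \\
\frac{\partial f}{\partial t} &= D \frac{\partial^2 f}{\partial x^2} \quad \text{where } D = \frac{1}{2\tau} \int_{\Delta = -\infty}^{\infty} \Delta^2 \phi(\Delta)\, d\Delta
\end{aligned} \tag{3.6}
\]
\[
\frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} \tag{3.7}
\]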
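and we want to find $f(x, t)$ that solves equation (3.7). You can use the similarity principle. We have a PDE; the key is to find an invariant transformation of the variables $x, t$ to a set with fewer variables that also solves the heat equation. We then proceed to solve the simpler equation, and hope it is easier. It turns out that for the diffusion equation the following transformation reduces the number of parameters: $(x, t)$ is mapped to $v = \lambda x$, $u = \lambda^2 t$ for a new set of variables $(u, v)$, so $f(v, u) = f(\lambda x, \lambda^2 t)$. The diffusion equation under the new transformed variables becomes
\[
\begin{aligned}
\frac{\partial f(v, u)}{\partial t} &= \frac{\partial f(v, u)}{\partial u} \frac{\partial u}{\partial t} = \lambda^2 \frac{\partial f(v, u)}{\partial u} \\
\frac{\partial f(v, u)}{\partial x} &= \frac{\partial f(v, u)}{\partial v} \frac{\partial v}{\partial x} = \lambda \frac{\partial f(v, u)}{\partial v} \\
\frac{\partial^2 f(v, u)}{\partial x^2} &= \lambda \frac{\partial}{\partial x} \frac{\partial f(v, u)}{\partial v} = \lambda^2 \frac{\partial^2 f(v, u)}{\partial v^2}
\end{aligned} \tag{3.8}
\]
The transformed function satisfies the diffusion equation; notice that $\lambda^2$ cancels
\[
\frac{\partial f(\lambda x, \lambda^2 t)}{\partial t} = D \frac{\partial^2 f(\lambda x, \lambda^2 t)}{\partial x^2} \tag{3.9}
\]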
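The question is now how to find the function $f(x, t)$. I will not go through the steps, but give a brief outline. If you set $\lambda = \frac{1}{\sqrt{t}}$, the scaling relation $f(x, t) = \lambda f(\lambda x, \lambda^2 t)$ becomes $f(x, t) = \frac{1}{\sqrt{t}} f\!\left( \frac{x}{\sqrt{t}}, 1 \right)$. We also want the argument to be dimensionless; remember that $x$ is the distance along the horizontal axis and $t$ is time, so
\[
\frac{x}{\sqrt{t}} \sim \frac{\text{length}}{\sqrt{\text{time}}} \tag{3.10}
\]
Putting this into the diffusion equation, and letting $f$ be the number of particles,
\[
\begin{aligned}
\frac{\partial f}{\partial t} &= D \frac{\partial^2 f}{\partial x^2} \\
\frac{\text{particle}}{\text{time}} &= D\, \frac{\text{particle}}{\text{area}}
\end{aligned} \tag{3.11}
\]
This gives $D$ the dimension $\frac{\text{area}}{\text{time}}$, so $\sqrt{D}$ has the dimension $\frac{\text{length}}{\sqrt{\text{time}}}$, the same as $\frac{x}{\sqrt{t}}$. The ratio $\frac{x}{\sqrt{Dt}}$ is therefore dimensionless. The solution is
\[
f(x, t) = \frac{m}{\sqrt{4\pi D t}} \exp\!\left( -\frac{1}{4} \frac{x^2}{Dt} \right) \tag{3.12}
\]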
At time 0 the number of particles at location 0 is $f(0, 0) = \psi(0)$; every particle is concentrated at its initial position, so for a generic point $x$ on the x-axis we have $f(x, 0) = \psi(x)$. If we want to see how the particles spread over the x-axis and time, we get
\[
f(x, t) = \int \psi(z) \frac{1}{\sqrt{4\pi D t}} \exp\!\left( -\frac{1}{4} \frac{(x - z)^2}{Dt} \right) dz \tag{3.13}
\]
where $f(x, 0) = \psi(x)$, the initial distribution or the initial number of particles, is also called the impulse function. The exponential kernel in equation (3.13) is Green's function, and is the response to the impulse.
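The claim that the Gaussian kernel (3.12) solves the diffusion equation (3.7) and conserves the number of particles can be checked numerically. The following is a minimal sketch with central finite differences and the trapezoidal rule; the function names, step sizes and parameter values are illustrative assumptions.

```python
import math


def heat_kernel(x: float, t: float, D: float, m: float = 1.0) -> float:
    """Solution (3.12): m particles released at x = 0 at time 0."""
    return m / math.sqrt(4.0 * math.pi * D * t) * math.exp(-x * x / (4.0 * D * t))


def diffusion_residual(x, t, D, m=1.0, dt=1e-5, dx=1e-3):
    """f_t - D f_xx from central differences; zero for an exact solution of (3.7)."""
    f_t = (heat_kernel(x, t + dt, D, m) - heat_kernel(x, t - dt, D, m)) / (2.0 * dt)
    f_xx = (heat_kernel(x + dx, t, D, m) - 2.0 * heat_kernel(x, t, D, m)
            + heat_kernel(x - dx, t, D, m)) / dx**2
    return f_t - D * f_xx


def total_particles(t, D, m=1.0, n=4001):
    """Trapezoidal integral of f(x, t) over x; should stay equal to m."""
    x_max = 12.0 * math.sqrt(2.0 * D * t)  # twelve standard deviations
    h = 2.0 * x_max / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * heat_kernel(-x_max + i * h, t, D, m)
    return total * h


if __name__ == "__main__":
    print(diffusion_residual(x=0.7, t=1.0, D=0.5, m=3.0))  # close to zero
    print(total_particles(t=1.0, D=0.5, m=3.0))            # close to m = 3
```

The spread of the kernel has variance $2Dt$, which is why the integration range is scaled with $\sqrt{2Dt}$.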
3.1.2 Diffusion equation with drift
We have the diffusion equation
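\[
\frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} \tag{3.14}
\]
and the diffusion equation with drift is
\[
\frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} - \mu \frac{\partial f}{\partial x} \tag{3.15}
\]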
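The particles will move as in the standard diffusion equation, but now they will have a force acting upon them, think gravity or a current, that makes them move in a preferred direction. In the diffusion equation we were only interested in the size of the displacement, as it was assumed to be symmetric around the current value $x$; that made the second integral in (3.5) zero, but this will no longer be the case. A particle will move in from the right, from $x + \Delta$, if it experiences a displacement of $-\Delta$, which has probability $\phi(-\Delta)$; a movement from the left of $x$, from $x - \Delta$, has probability $\phi(\Delta)$. So the number of particles in $x$ an instant $\tau$ later is
\[
\begin{aligned}
f(x, t + \tau)\, dx &= dx \int_{\Delta = -\infty}^{\infty} f(x + \Delta, t)\phi(-\Delta)\, d\Delta \\
f(x, t + \tau) &= \int_{\Delta = -\infty}^{\infty} f(x + \Delta, t)\phi(-\Delta)\, d\Delta \\
f(x, t) + \frac{\partial f}{\partial t} \tau &= \int_{\Delta = -\infty}^{\infty} \left( f(x, t) + \frac{\partial f}{\partial x} \Delta + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \Delta^2 \right) \phi(-\Delta)\, d\Delta \\
&= f(x, t) + \frac{\partial f}{\partial x} \int_{-\infty}^{\infty} \Delta \phi(-\Delta)\, d\Delta + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{\infty} \Delta^2 \phi(-\Delta)\, d\Delta \\
\frac{\partial f}{\partial t} &= \frac{\partial f}{\partial x} \frac{1}{\tau} \int_{-\infty}^{\infty} \Delta \phi(-\Delta)\, d\Delta + \frac{1}{2\tau} \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{\infty} \Delta^2 \phi(-\Delta)\, d\Delta \\
\frac{\partial f}{\partial t} &= -\frac{\partial f}{\partial x} \frac{1}{\tau} \int_{-\infty}^{\infty} \Delta \phi(\Delta)\, d\Delta + \frac{1}{2\tau} \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{\infty} \Delta^2 \phi(\Delta)\, d\Delta
\end{aligned} \tag{3.16}
\]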
We made a Taylor expansion in (3.16), just as in equation (3.5). Expanding the parenthesis on the right-hand side, the first integral gives $f(x, t)$, because the total probability is equal to one. The second integral, however, does not vanish as in equation (3.5), because there is now a drift. I also change $\Delta$ to $-\Delta$; remember that $(-1)^2 = 1$ in the term with the second partial derivative. So the diffusion coefficient and the drift become
\[
D = \frac{1}{2\tau} \int_{-\infty}^{\infty} \Delta^2 \phi(\Delta)\, d\Delta, \qquad \mu = \frac{1}{\tau} \int_{-\infty}^{\infty} \Delta \phi(\Delta)\, d\Delta \tag{3.17}
\]
The left-hand side of equation (3.15) tells us how the number of particles changes over a small time interval, keeping the position constant. The term with $D$ tells us, just as in the heat equation, that if you are in a location where the number of particles is lower than in the surroundings (remember that we are in a 1-dimensional setting), more particles will enter that point, and if the number of particles is higher than in the surroundings, more particles will move out. The drift component tells us that if the location is on the left, more particles will move into the area of $x$.
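\[
\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial \tilde{f}}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial \tilde{f}}{\partial y} \\
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial^2 \tilde{f}}{\partial y^2} \\
\frac{\partial f}{\partial t} &= \frac{\partial \tilde{f}}{\partial t} + \frac{\partial \tilde{f}}{\partial y} \frac{\partial y}{\partial t} = \frac{\partial \tilde{f}}{\partial t} - \mu \frac{\partial \tilde{f}}{\partial y}
\end{aligned} \tag{3.21}
\]
We can now plug this into the diffusion-convection equation (3.18) and we get
\[
\frac{\partial \tilde{f}}{\partial t} - \mu \frac{\partial \tilde{f}}{\partial y} = D \frac{\partial^2 \tilde{f}}{\partial y^2} - \mu \frac{\partial \tilde{f}}{\partial y} \tag{3.22}
\]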
The drift terms cancel, and we have the famous diffusion equation. We need to transform the initial conditions for $f(x, t)$ when we transform it to $\tilde{f}(y, t)$, but they will be the same at time zero, $f(x, 0) = \tilde{f}(y, 0)$, and the solution for $y$ will be
\[
\tilde{f}(y, t) = \frac{m}{\sqrt{4\pi D t}}\, e^{-\frac{1}{4} \frac{y^2}{Dt}} \tag{3.23}
\]
Substituting back we get
\[
f(x, t) = \frac{m}{\sqrt{4\pi D t}}\, e^{-\frac{1}{4} \frac{(x - \mu t)^2}{Dt}} \tag{3.24}
\]
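Splitting the square in the exponent in the above equation gives us
\[
\begin{aligned}
f(x, t) &= e^{-\frac{1}{4} \frac{\mu^2 t^2 - 2x\mu t}{Dt}}\, \frac{m}{\sqrt{4\pi D t}}\, e^{-\frac{1}{4} \frac{x^2}{Dt}} \\
f(x, t) &= e^{\frac{\mu}{2D} \left( x - \frac{\mu t}{2} \right)}\, \frac{m}{\sqrt{4\pi D t}}\, e^{-\frac{1}{4} \frac{x^2}{Dt}}
\end{aligned} \tag{3.25}
\]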
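Here we can make a change of measure, using the Girsanov theorem. Setting $D = \frac{1}{2}$, it becomes
\[
\begin{aligned}
f(x, t) &= e^{-\frac{1}{2} \frac{\mu^2 t^2 - 2x\mu t}{t}}\, \frac{m}{\sqrt{2\pi t}}\, e^{-\frac{1}{2} \frac{x^2}{t}} \\
f(x, t) &= e^{-\frac{1}{2} \mu^2 t + \mu x}\, \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{1}{2} \frac{x^2}{t}}
\end{aligned} \tag{3.26}
\]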
I replaced $m$ with 1, meaning that we start with 1 particle that behaves as a Brownian motion, and use the fact that the Radon-Nikodym density process can be written as
\[
\frac{dQ'}{dQ} = e^{-\frac{1}{2} \mu^2 t + \mu \tilde{W}_t} \tag{3.27}
\]
Then the new process $W'_t$ is equal to the old process minus the drift, $\tilde{W}_t - \mu t$.
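The factorisation in (3.26) can be verified pointwise: for $D = \frac{1}{2}$ and $m = 1$, the density with drift equals the driftless Brownian density multiplied by the exponential factor that has the same form as (3.27). A minimal sketch follows; the function names and the test points are illustrative assumptions.

```python
import math


def brownian_density(x: float, t: float) -> float:
    """Density of a standard Brownian motion at time t (D = 1/2, m = 1)."""
    return math.exp(-0.5 * x * x / t) / math.sqrt(2.0 * math.pi * t)


def drifted_density(x: float, t: float, mu: float) -> float:
    """Density (3.24) with D = 1/2 and m = 1, i.e. Brownian motion with drift mu."""
    return math.exp(-0.5 * (x - mu * t) ** 2 / t) / math.sqrt(2.0 * math.pi * t)


def girsanov_factor(x: float, t: float, mu: float) -> float:
    """Exponential factor in (3.26), of the same form as the density process (3.27)."""
    return math.exp(-0.5 * mu * mu * t + mu * x)


if __name__ == "__main__":
    t, mu = 2.0, 0.3  # illustrative values (assumption)
    for x in (-1.5, 0.0, 0.4, 2.0):
        lhs = drifted_density(x, t, mu)
        rhs = girsanov_factor(x, t, mu) * brownian_density(x, t)
        print(x, lhs, rhs)  # the two columns agree up to rounding
```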
I used the fact that the expectation removes the random part of the Brownian motion, and put it inside the expectation, since expectation means integration w.r.t. probability. I have also interchanged expectation and derivative. The Fokker-Planck equation contains the probability density in place of the Brownian motion. Let $p(x, t)$ be the probability density for a fixed $x$ and a fixed $t$, so the expectation can be written as
\[
\frac{d}{dt} \int_{-\infty}^{\infty} f(x) p(x, t)\, dx = \frac{1}{2} \int_{-\infty}^{\infty} f_{XX}(x) p(t, x)\, dx \tag{3.30}
\]
\[
\begin{aligned}
\frac{d}{dt} \int_{-\infty}^{\infty} f(x) p(x, t)\, dx &= \frac{1}{2} \int_{-\infty}^{\infty} f(x) \frac{\partial^2 p(x, t)}{\partial x^2}\, dx \\
\int_{-\infty}^{\infty} f(x) \left( \frac{\partial p(x, t)}{\partial t} - \frac{1}{2} \frac{\partial^2 p(x, t)}{\partial x^2} \right) dx &= 0
\end{aligned} \tag{3.31}
\]
Since the function $f(\cdot)$ was arbitrary, the only way for equation (3.31) to equal zero is if the parenthesis is equal to zero:
\[
\frac{\partial p(x, t)}{\partial t} = \frac{1}{2} \frac{\partial^2 p(x, t)}{\partial x^2} \tag{3.32}
\]
This is the Fokker-Planck equation; it is similar to the diffusion equation for Brownian motion.
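differentiable. We use Ito's lemma.
\[
\begin{aligned}
df &= \frac{\partial f}{\partial x}\, dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}\, dX_t^2 \\
&= f_X \left( \mu\, dt + \sigma\, dW_t \right) + \frac{1}{2} \sigma^2 f_{XX}\, dt \\
&= \left( \mu f_X + \frac{1}{2} \sigma^2 f_{XX} \right) dt + f_X \sigma\, dW_t \\
E[df] &= E\!\left[ \left( \mu f_X + \frac{1}{2} \sigma^2 f_{XX} \right) dt + f_X \sigma\, dW_t \right] \\
E[df] &= E\!\left[ \mu f_X + \frac{1}{2} \sigma^2 f_{XX} \right] dt \\
\frac{d}{dt} E[f] &= E\!\left[ \mu f_X + \frac{1}{2} \sigma^2 f_{XX} \right]
\end{aligned} \tag{3.34}
\]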
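Let $p(x, t)$ be the (forward) probability density function for an object starting in position $x_0$ at time $t_0$; it is more correct to write the function $p$ as $p(x, t|x_0, t_0)$. This also reveals that it is a forward equation, hence the name Kolmogorov forward equation.
\[
\frac{d}{dt} \int_{-\infty}^{\infty} f(x) p(x, t)\, dx = \int_{-\infty}^{\infty} \left( \mu(x, t) f_X(x) + \frac{1}{2} \sigma^2(x, t) f_{XX}(x) \right) p(t, x)\, dx \tag{3.35}
\]
\[
\frac{\partial p(x, t)}{\partial t} + \frac{\partial}{\partial x} \left( \mu(x, t) p(x, t) \right) - \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( \sigma^2(x, t) p(x, t) \right) = 0 \tag{3.37}
\]
\[
\frac{\partial p(x, t)}{\partial t} + \frac{\partial}{\partial x} \left( \mu(x, t) p(x, t) \right) - \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( \sigma^2(x, t) p(x, t) \right) = 0 \tag{3.39}
\]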
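Looking at equation (3.33), substituting a constant $\sigma$ for the diffusion coefficient and $\kappa X$ for the drift $\mu$, and taking the constants out of the derivatives, we get
\[
\frac{\partial p(x, t)}{\partial t} - \kappa \frac{\partial}{\partial x} \left( x p(x, t) \right) - \frac{1}{2} \sigma^2 \frac{\partial^2}{\partial x^2} \left( p(x, t) \right) = 0 \tag{3.41}
\]
For the geometric Brownian SDE,
which can be written as
\[
\frac{\partial p(x, t)}{\partial t} + \mu \frac{\partial}{\partial x} \left( x p(x, t) \right) - \frac{1}{2} \sigma^2 \frac{\partial^2}{\partial x^2} \left( x^2 p(x, t) \right) = 0 \tag{3.43}
\]
we get the Fokker-Planck equation for the Geometric Brownian motion.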
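Equation (3.43) can be checked against the log-normal density of the Geometric Brownian motion, cf. (2.66) with drift $\mu$ in place of $r$. The sketch below evaluates the left-hand side of (3.43) with central finite differences; the function names, step sizes and parameter values are illustrative assumptions.

```python
import math


def gbm_density(x, t, x0, mu, sigma):
    """Log-normal transition density of GBM started at x0, cf. (2.66)."""
    mean = math.log(x0) + (mu - 0.5 * sigma**2) * t
    var = sigma**2 * t
    return math.exp(-((math.log(x) - mean) ** 2) / (2.0 * var)) / (
        x * math.sqrt(2.0 * math.pi * var)
    )


def fokker_planck_residual(x, t, x0, mu, sigma, dt=1e-5, dx=1e-4):
    """Left-hand side of (3.43), with derivatives from central differences."""
    def p(xx, tt):
        return gbm_density(xx, tt, x0, mu, sigma)

    p_t = (p(x, t + dt) - p(x, t - dt)) / (2.0 * dt)
    # First derivative of x * p and second derivative of x^2 * p with respect to x.
    d_xp = ((x + dx) * p(x + dx, t) - (x - dx) * p(x - dx, t)) / (2.0 * dx)
    d2_x2p = ((x + dx) ** 2 * p(x + dx, t) - 2.0 * x**2 * p(x, t)
              + (x - dx) ** 2 * p(x - dx, t)) / dx**2
    return p_t + mu * d_xp - 0.5 * sigma**2 * d2_x2p


if __name__ == "__main__":
    # Illustrative point; the residual should be close to zero.
    print(fokker_planck_residual(x=1.1, t=1.0, x0=1.0, mu=0.05, sigma=0.2))
```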
Infinitesimal generator
Avoiding the dependency on time, which can otherwise complicate things, for a Markov process we have
\[
\frac{\partial p(x, t)}{\partial t} = \left( -\frac{\partial}{\partial x} \mu(x) + \frac{1}{2} \frac{\partial^2}{\partial x^2} \sigma^2(x) \right) p(x, t) \tag{3.44}
\]
Taking the finite approximation of the derivative as
\[
\lim_{\epsilon \to 0} \frac{p(x, t + \epsilon) - p(x, t)}{\epsilon} = L(x)\left( p(x, t) \right) \tag{3.45}
\]
where $L(x)$ is the linear differential operator, also known as the infinitesimal generator.
\[
L(x) = -\frac{\partial}{\partial x} \mu(x) + \frac{1}{2} \frac{\partial^2}{\partial x^2} \sigma^2(x) \tag{3.46}
\]
The infinitesimal generator is usually just the local Taylor series expansion. This concept was also used to Delta-hedge the Black-Scholes PDE. The infinitesimal generator can also describe the transition densities in a Markov process: it gives the probability to move from state (i) to state (j) in a short time interval, and this can be extended to work for a process that can take infinitely many states. This can be used to model many stochastic processes: by changing the drift and the diffusion coefficients, we can arrive at many types of process, e.g. GBM or the O-U process. The Kolmogorov forward equation, also known as the Fokker-Planck equation, can be written as
\[
\frac{\partial p(x, t)}{\partial t} = -\frac{\partial}{\partial x} \left( \mu(x, t) p(x, t) \right) + \frac{1}{2} \frac{\partial^2}{\partial x^2} \left( \sigma^2(x, t) p(x, t) \right) \tag{3.48}
\]
This density $p(x, t)$ depends on its starting condition, $p(x, t|x_0, t_0)$. Here $x_0$ and $t_0$ are fixed. The left-hand side tells us what happens when $t$ changes; here $t$ is a forward variable. The right-hand side of equation (3.48) is with respect to the forward variable $x$. In the backward Kolmogorov equation we describe the conditional probability density with respect to the initial time $t_0$, and the derivatives on the right-hand side are with respect to the initial position $x_0$. The Kolmogorov backward equation thus has the following 3 derivatives.
\[
\frac{\partial p(x, t|x_0, t_0)}{\partial t_0}, \qquad
\frac{\partial p(x, t|x_0, t_0)}{\partial x_0}, \qquad
\frac{\partial^2 p(x, t|x_0, t_0)}{\partial x_0^2} \tag{3.49}
\]
Let the conditional probability be $P(A, t|x_0, t_0) = P(X_t \in A|X_0 = x_0)$; if $x$ is a real number it can be viewed as
\[
P(x, t|x_0, t_0) = P(X_t \le x|X_0 = x_0) = \int_{-\infty}^{x} p(z, t|x_0, t_0)\, dz = \int_{-\infty}^{x} P(dz, t|x_0, t_0) \tag{3.50}
\]
The Chapman-Kolmogorov equation states that the probability to go from $x_0$ to $x$ is the same as to go through an intermediate step $y$, summing (or integrating) over all $y$ in the system.
\[
P(x, t|x_0, t_0) = \int P(x, t|y, t_1)\, P(dy, t_1|x_0, t_0) \tag{3.51}
\]
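For a concrete case, the Chapman-Kolmogorov equation can be checked in its density form, $p(x, t|x_0, t_0) = \int p(x, t|y, t_1)\, p(y, t_1|x_0, t_0)\, dy$, using the Gaussian transition density of a standard Brownian motion. This is a minimal sketch under that assumption; the function names, the integration range and the parameter values are illustrative.

```python
import math


def transition_density(x, t, x0, t0):
    """Transition density of standard Brownian motion from (x0, t0) to (x, t)."""
    var = t - t0
    return math.exp(-((x - x0) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def chapman_kolmogorov(x, t, x0, t0, t1, n=4001, y_max=15.0):
    """Integrate over the intermediate point y at time t1 (trapezoidal rule)."""
    h = 2.0 * y_max / (n - 1)
    total = 0.0
    for i in range(n):
        y = -y_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * transition_density(x, t, y, t1) * transition_density(y, t1, x0, t0)
    return total * h


if __name__ == "__main__":
    # Illustrative values: start at 0 at time 0, end at 0.5 at time 1, via time 0.4.
    direct = transition_density(0.5, 1.0, 0.0, 0.0)
    via_intermediate = chapman_kolmogorov(0.5, 1.0, 0.0, 0.0, t1=0.4)
    print(direct, via_intermediate)  # the two numbers agree up to quadrature error
```

The result does not depend on the choice of the intermediate time $t_1$, as long as $t_0 < t_1 < t$.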
The backward equation is about the initial time $t_0$, which is moved backward. The finite difference approximation is (remember that we are going backward in time, and assume $h > 0$)
\[
\frac{P(x, t|x_0, t_0) - P(x, t|x_0, t_0 - h)}{h} \tag{3.52}
\]
Using equations (3.51) and (3.52) I will rewrite $P(x, t|x_0, t_0 - h)$. I also change $x_0$ to $x_{-1}$; it becomes
\[
P(x, t|x_{-1}, t_0 - h) = \int P(x, t|y, t_0)\, P(dy, t_0|x_{-1}, t_0 - h) \tag{3.53}
\]
What equation (3.54) says is that the probability of going from $x_0$ to $x$ over the time interval from $t_0 - h$ to $t$ (the left-hand side of the above equation) equals the probability of going from $x_0$ at time $t_0 - h$ through an intermediate point $y$ and from the point $y$ to the value $x$ at time $t$; the integral is over all $y \le A$. We can return to equation (3.52) and use the derivative approximation formula.
\[
\begin{aligned}
&\frac{\int P(x, t|y, t_0)\, P(dy, t_0|x_0, t_0 - h) - P(x, t|x_0, t_0)}{h} \\
&= \frac{\int \left( P(x, t|y, t_0) - P(x, t|x_0, t_0) \right) P(dy, t_0|x_0, t_0 - h)}{h}
\end{aligned} \tag{3.55}
\]
Let us focus on the last term in the above expression. If we scale it by $h$ it represents the probability per unit of time.
\[
\frac{P(dy, t_0|x_0, t_0 - h)}{h}, \qquad \frac{P(y, t_0|x_0, t_0 - h)}{h} \tag{3.56}
\]
If you are at $x_0$ you would expect that the change over a small interval of time $h$ can be made arbitrarily small. The case $||y - x_0|| < \delta$ is the diffusion part, and when $||y - x_0|| > \delta$ we have a process with jumps. Since this thesis is not about jump processes, I will assume that $||y - x_0|| < \delta$. So equation (3.55) becomes
$$\int \left(P(x,t|y,t_0) - P(x,t|x_0,t_0)\right) ||y - x_0|| \tag{3.57}$$
Note that $t$ is the same in the above expression; the variable that changes is $x$, so we have reduced a problem in 2 dimensions to a 1-dimensional problem. Going back to (3.52) we get
Taking the limit as $h \to 0$, and noting that the integration is with respect to $y$, we can take the differential out of the integral. We get
$$-\frac{\partial P(x,t|x_0,t_0)}{\partial t_0} = \mu(x_0,t_0)\frac{\partial P(x,t|x_0,t_0)}{\partial x_0} + \frac{1}{2}\sigma^2(x_0,t_0)\frac{\partial^2 P(x,t|x_0,t_0)}{\partial x_0^2} \tag{3.60}$$
where I have used the fact that the drift is the average displacement per unit time, and the variance is the average squared displacement per unit time. The minus sign in front of the first derivative is due to the fact that we use the finite difference approximation for a value to the left of $t_0$; we are going backward in time. Stating equation (3.60) in terms of probability density functions, that is, differentiating with respect to the forward variable $x$, we get the Kolmogorov backward equation
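$$-\frac{\partial p(x,t|x_0,t_0)}{\partial t_0} = \mu(x_0,t_0)\frac{\partial p(x,t|x_0,t_0)}{\partial x_0} + \frac{1}{2}\sigma^2(x_0,t_0)\frac{\partial^2 p(x,t|x_0,t_0)}{\partial x_0^2} \tag{3.61}$$
$$\frac{\partial p(x,t|x_0,t_0)}{\partial t_0} = -\left(\mu(x_0,t_0)\frac{\partial p(x,t|x_0,t_0)}{\partial x_0} + \frac{1}{2}\sigma^2(x_0,t_0)\frac{\partial^2 p(x,t|x_0,t_0)}{\partial x_0^2}\right) \tag{3.62}$$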
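As a sanity check of the backward equation, here is a small sketch under the assumption of constant $\mu$ and $\sigma$ (illustrative values of my own), where the transition density is known in closed form. The derivatives with respect to the backward variables $x_0$ and $t_0$ are approximated by central differences, and the residual of the equation should be close to zero.

```python
import math

MU, SIGMA = 0.1, 0.3  # assumed constant drift and volatility, for illustration

def p(x, t, x0, t0):
    """Gaussian transition density of dX = MU dt + SIGMA dW."""
    var = SIGMA ** 2 * (t - t0)
    mean = x0 + MU * (t - t0)
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def backward_residual(x, t, x0, t0, h=1e-3):
    """-dp/dt0 - (MU dp/dx0 + 0.5 SIGMA^2 d2p/dx0^2), by central differences."""
    dp_dt0 = (p(x, t, x0, t0 + h) - p(x, t, x0, t0 - h)) / (2.0 * h)
    dp_dx0 = (p(x, t, x0 + h, t0) - p(x, t, x0 - h, t0)) / (2.0 * h)
    d2p_dx02 = (p(x, t, x0 + h, t0) - 2.0 * p(x, t, x0, t0)
                + p(x, t, x0 - h, t0)) / h ** 2
    return -dp_dt0 - (MU * dp_dx0 + 0.5 * SIGMA ** 2 * d2p_dx02)

residual = backward_residual(x=0.4, t=1.0, x0=0.0, t0=0.0)
```

Note that only the backward variables are perturbed; the forward variables $x$ and $t$ are held fixed throughout.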
Chapter 4
binomial setting. I call this the Dupire model, without making any preferences.
will produce the current market prices of options, but the implied dynamics for future times are not good. Suppose the smile curve of the volatility is very steep for the shorter maturities and flattens as the maturity increases. Supply and demand will nevertheless give a steeper smile curve as we approach the longer maturities; it will most likely get a steeper slope, just as for the shorter maturity. Local volatility only uses today's prices and makes no assumption about the behavior over time, which must be viewed as a weakness of the model. It provides a perfect fit for today's data, but when the data changes, you will need to refit.
it is the inverse problem. In the diffusion equation we interpret the diffusion coefficients in terms of the distribution of particles over time. If you know the distribution of particles at two different times, $t$ and $t + \delta t$, you could find the diffusion coefficient by some finite difference approximation. In the Dupire PDE, we replace the distribution function $p(x,t|x_0,t_0)$ with the call option price, and the diffusion term with the local volatility function, which is a function of $K$ and $T$. Given the call option prices we can estimate the volatility values at different levels of strike and time to maturity. I will follow the steps that Dupire took when he derived the local volatility. He used the Fokker - Planck equation; many presentations use the backward diffusion, but I will follow his original presentation.
$$\frac{\partial p}{\partial t} = -r\frac{\partial (Sp)}{\partial S} + \frac{1}{2}\frac{\partial^2(\sigma^2 S^2 p)}{\partial S^2} \tag{4.9}$$
The undiscounted price of a call option is the expectation of the payoff, that is, the integral over all values of the stock price $S$ greater than $K$
$$C^u_{K,T} = \int_{S=K}^{\infty} p(S,T)(S_T - K)\,dS \tag{4.10}$$
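The left hand side in the above equation is the Dupire PDE, and the derivative inside the integration is the Fokker - Planck equation, which states the propagation over time of $p(S,T)$
$$\frac{\partial C^u_{K,T}}{\partial T} = \int_{S=K}^{\infty}\left[-r\frac{\partial(Sp)}{\partial S} + \frac{1}{2}\frac{\partial^2(\sigma^2S^2p)}{\partial S^2}\right](S_T - K)\,dS$$
$$\frac{\partial C^u_{K,T}}{\partial T} = -r\int_{S=K}^{\infty}\frac{\partial(Sp)}{\partial S}(S_T - K)\,dS + \frac{1}{2}\int_{S=K}^{\infty}\frac{\partial^2(\sigma^2S^2p)}{\partial S^2}(S_T - K)\,dS \tag{4.12}$$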
Let us look at the two integrals on the right hand side of equation (4.12). I start with the first, using integration by parts; remember that $K$ is a constant and that the boundary term vanishes at both limits.
$$\int_{S=K}^{\infty}\frac{\partial(Sp)}{\partial S}(S-K)\,dS = \Big[S\,p(S,T)(S-K)\Big]_{S=K}^{\infty} - \int_{S=K}^{\infty}S\,p(S,T)\,dS = -\int_{S=K}^{\infty}S\,p(S,T)\,dS \tag{4.13}$$
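Let us look at the second integral on the right hand side of equation (4.12); here we need to perform the integration by parts two times.
$$\begin{aligned}
\int_{S=K}^{\infty}\frac{\partial^2(\sigma^2S^2p)}{\partial S^2}(S_T-K)\,dS
&= \left[\frac{\partial(\sigma^2S^2p)}{\partial S}(S-K)\right]_{S=K}^{\infty} - \int_{S=K}^{\infty}\frac{\partial(\sigma^2S^2p)}{\partial S}\,dS \\
&= -\int_{S=K}^{\infty}\frac{\partial(\sigma^2S^2p)}{\partial S}\,dS \\
&= -\Big[\sigma^2(S,T)S^2p(S,T)\Big]_{S=K}^{\infty} \\
&= \sigma^2(K,T)K^2p(K,T)
\end{aligned} \tag{4.14}$$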
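Remember that the call option price, expressed with the density, is
$$\begin{aligned}
C^u_{K,T} &= \int_{S=K}^{\infty}p(S,T)(S-K)\,dS \\
C^u_{K,T} &= \int_{S=K}^{\infty}S\,p(S,T)\,dS - K\int_{S=K}^{\infty}p(S,T)\,dS \\
\int_{S=K}^{\infty}S\,p(S,T)\,dS &= C^u_{K,T} + K\int_{S=K}^{\infty}p(S,T)\,dS
\end{aligned}$$
Inserting (4.13) and (4.14) into (4.12) gives
$$\frac{\partial C^u_{K,T}}{\partial T} = rC^u_{K,T} + rK\int_{S=K}^{\infty}p(S,T)\,dS + \frac{1}{2}\sigma^2(K,T)K^2p(K,T) \tag{4.16}$$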
The undiscounted price for a call option is
$$C^u_{K,T} = \int_{S=K}^{\infty}p(S,T)(S_T-K)\,dS \tag{4.17}$$
with the same maturity as the option. Normally the moneyness is defined the other way around, $y = \frac{F}{K}$; the smile curve is usually shown as a function of strike prices
$S_0$ is a fixed number, the price of today, and it is the strike that gives the option prices. Let us reproduce the Dupire PDE for the strike
$$\frac{\partial C^u_{K,T}}{\partial T} = rC^u_{K,T} - rK\frac{\partial C^u_{K,T}}{\partial K} + \frac{1}{2}K^2\sigma^2(T,K)\frac{\partial^2 C^u_{K,T}}{\partial K^2} \tag{4.22}$$
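Equation (4.22) can be solved for the local volatility once the call prices are known. The sketch below is only an illustration under assumptions of my own: the "market" prices are generated by the undiscounted Black - Scholes formula with a constant volatility of 0.2, the derivatives are taken by central differences, and the step sizes and parameter values are made up. With such input the recovered local volatility should be the constant we started from.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def undiscounted_bs_call(strike, maturity, s0=100.0, r=0.03, sigma=0.2):
    """Undiscounted Black-Scholes call, used here as synthetic market data."""
    forward = s0 * math.exp(r * maturity)
    sd = sigma * math.sqrt(maturity)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    return forward * norm_cdf(d1) - strike * norm_cdf(d1 - sd)

def dupire_local_vol(price, strike, maturity, r=0.03, dk=0.5, dt=1e-3):
    """Solve (4.22) for sigma(T, K) with central finite differences."""
    c = price(strike, maturity)
    c_t = (price(strike, maturity + dt) - price(strike, maturity - dt)) / (2.0 * dt)
    c_k = (price(strike + dk, maturity) - price(strike - dk, maturity)) / (2.0 * dk)
    c_kk = (price(strike + dk, maturity) - 2.0 * c
            + price(strike - dk, maturity)) / dk ** 2
    variance = 2.0 * (c_t - r * c + r * strike * c_k) / (strike ** 2 * c_kk)
    return math.sqrt(variance)

local_vol = dupire_local_vol(undiscounted_bs_call, strike=105.0, maturity=1.0)
```

With real market quotes the second strike derivative is the delicate part, since it amplifies noise in the prices; that issue is not addressed in this sketch.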
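and the Dupire PDE in terms of moneyness, $y$
$$\frac{\partial \tilde{C}^u_{y,T}}{\partial T} = r\tilde{C}^u_{y,T} + \frac{1}{2}\sigma^2(T,y)\left(\frac{\partial^2 \tilde{C}^u_{y,T}}{\partial y^2} - \frac{\partial \tilde{C}^u_{y,T}}{\partial y}\right) \tag{4.23}$$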
You can also write the Dupire PDE in terms of the BS implied volatility
Chapter 5
Fourier Transformation
5.1 Fourier
Here I will follow the notes of (Matsuda, 2004). Many option pricing models assume that the stock follows an exponential (geometric) Lévy process.
$$S_t = S_0e^{L_t} \tag{5.1}$$
The Black - Scholes price is the discounted value of the expected terminal payoff under the risk - neutral measure $Q$
$$C(S_0,T) = e^{-rT}\int_K^{\infty}(S_T-K)\,Q(S_T|\mathcal{F}_0)\,dS_T \tag{5.4}$$
It was a well known fact, even before the publication of the famous Black - Scholes paper, that the empirical log return density is not a normal distribution; it has excess kurtosis and skewness. The models after Black - Scholes have tried to capture this deviation. (Carr & Madan, 1999) rewrote (5.4) in terms of the CF of the conditional log terminal stock price, $\phi(\log S_T|\mathcal{F}_0)$
$$C(T,k) = \frac{e^{-\alpha k}}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega k}\,\frac{e^{-rT}\phi_T(\omega-(\alpha+1)i)}{\alpha^2+\alpha-\omega^2+i(2\alpha+1)\omega}\,d\omega \tag{5.6}$$
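Option pricing with Fourier transforms is simple, and works if the CF of the conditional log terminal stock price $S_T|\mathcal{F}_0$ is obtained in closed form.
In order to calculate the characteristic function, set the parameters $(a,b) = (1,1)$ $^1$, and thus (5.7) and (5.8) become
$$\begin{aligned}
\mathcal{G}(\omega) &\equiv \mathcal{F}[g(t)](\omega) \equiv \int_{-\infty}^{\infty}e^{i\omega t}g(t)\,dt \\
g(t) &\equiv \mathcal{F}^{-1}[\mathcal{G}(\omega)](t) \equiv \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega t}\mathcal{G}(\omega)\,d\omega
\end{aligned} \tag{5.9}$$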
Let $X$ be a random variable with the probability density function $P(x)$. A characteristic function $\phi(\omega)$ with $\omega \in \mathbb{R}$ is defined as the Fourier transform of the probability density function $P(x)$, using the Fourier transform parameters $(a,b) = (1,1)$. From the definition (5.9)
$$\phi(\omega) \equiv \mathcal{F}[P(x)] \equiv \int_{-\infty}^{\infty}e^{i\omega x}P(x)\,dx \equiv E\left[e^{i\omega x}\right] \tag{5.10}$$
The probability density function can be obtained by the inverse Fourier transform of the characteristic function.
$$P(x) \equiv \mathcal{F}^{-1}[\phi(\omega)] \equiv \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega x}\phi(\omega)\,d\omega \tag{5.11}$$
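The pair (5.10) and (5.11) is easy to try out numerically. The following sketch assumes a normal random variable, whose characteristic function is known in closed form, and recovers the density from it with the trapezoidal rule. The truncation of the integral at $\pm 40$ and the number of grid points are arbitrary choices for the illustration.

```python
import cmath
import math

def normal_cf(omega, mean=0.0, std=1.0):
    """Characteristic function E[exp(i omega X)] of a normal variable."""
    return cmath.exp(1j * omega * mean - 0.5 * (std * omega) ** 2)

def density_from_cf(x, cf, cutoff=40.0, n=8000):
    """Inverse transform (5.11), truncated to [-cutoff, cutoff]."""
    h = 2.0 * cutoff / n
    total = 0.0 + 0.0j
    for i in range(n + 1):
        omega = -cutoff + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * cmath.exp(-1j * omega * x) * cf(omega)
    # The imaginary part cancels for a genuine characteristic function.
    return (total * h / (2.0 * math.pi)).real

recovered = density_from_cf(0.3, normal_cf)
exact = math.exp(-0.5 * 0.3 ** 2) / math.sqrt(2.0 * math.pi)
```

The same routine works for any model where only the characteristic function is available in closed form, which is the situation in the Heston model.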
1 In pure mathematics the pair $(a,b) = (1,-1)$ is used, in modern physics the pair $(a,b) = (0,1)$ is used, and in signal processing the pair $(a,b) = (0,-2\pi)$ is used.
5.1.2 Derivation, Carr - Madan (1999), of the call price
with Fourier transform
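Let $Q(S_T|\mathcal{F}_t)$ be the pdf of the terminal asset price $S_T$ under the risk neutral measure $Q$, conditional on the information $\mathcal{F}_t$. The call price will thus be
$$\begin{aligned}
C(t,S_t) &= e^{-r(T-t)}\left[\int_K^{\infty}(S_T-K)\,Q(S_T|\mathcal{F}_t)\,dS_T + \int_0^K(0)\,Q(S_T|\mathcal{F}_t)\,dS_T\right] \\
C(t,S_t) &= e^{-r(T-t)}\int_K^{\infty}(S_T-K)\,Q(S_T|\mathcal{F}_t)\,dS_T
\end{aligned} \tag{5.12}$$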
For the further derivation, assume $t = 0$, change the stock price to its logarithm, $\log(S_T) \equiv s_T$, and do the same for the strike, $\log K \equiv k$. We can rewrite (5.12) as
$$C(T,k) = e^{-rT}\int_k^{\infty}\left(e^{s_T} - e^k\right)Q(s_T|\mathcal{F}_0)\,ds_T \tag{5.13}$$
So (5.18) cannot have a Fourier transform. Carr and Madan defined a modified call price
$$C_{mod}(T,k) = e^{\alpha k}C(T,k) \tag{5.20}$$
and by carefully choosing $\alpha > 0$ we get
$$\int_{-\infty}^{\infty}|C_{mod}(T,k)|\,dk < \infty \tag{5.21}$$
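As it is the call price that we want, we use the definition of the inverse Fourier transform
$$\begin{aligned}
C_{mod}(T,k) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega k}\psi_T(\omega)\,d\omega \\
e^{\alpha k}C(T,k) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega k}\psi_T(\omega)\,d\omega \\
C(T,k) &= \frac{e^{-\alpha k}}{2\pi}\int_{-\infty}^{\infty}e^{-i\omega k}\psi_T(\omega)\,d\omega
\end{aligned} \tag{5.23}$$
Carr and Madan then derived an analytic expression for $\psi_T(\omega)$ in terms of the characteristic function, and the end result is
$$\psi_T(\omega) = \frac{e^{-rT}\phi_T(\omega-(\alpha+1)i)}{\alpha^2+\alpha-\omega^2+i(2\alpha+1)\omega} \tag{5.24}$$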
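To see (5.23) and (5.24) at work, here is a minimal sketch that prices a call with the Carr - Madan integral and compares it with the closed form Black - Scholes price. It assumes the Black - Scholes characteristic function of $\log S_T$, a damping parameter $\alpha = 1.5$, a truncated integration range and a plain midpoint rule; all of these are illustrative choices and not the FFT implementation of the original paper. Since the call price is real, the integral over the whole real line is replaced by twice the real part of the integral over the positive half line.

```python
import cmath
import math

def bs_log_price_cf(u, s0, r, sigma, t):
    """CF of log(S_T) under Black-Scholes; the argument u may be complex."""
    mean = math.log(s0) + (r - 0.5 * sigma ** 2) * t
    return cmath.exp(1j * u * mean - 0.5 * sigma ** 2 * t * u * u)

def carr_madan_call(s0, strike, r, sigma, t, alpha=1.5, cutoff=100.0, n=20000):
    """Call price from (5.23)-(5.24) with a midpoint rule on [0, cutoff]."""
    k = math.log(strike)
    h = cutoff / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        psi = (math.exp(-r * t) * bs_log_price_cf(w - (alpha + 1.0) * 1j, s0, r, sigma, t)
               / (alpha ** 2 + alpha - w ** 2 + 1j * (2.0 * alpha + 1.0) * w))
        total += (cmath.exp(-1j * w * k) * psi).real
    return math.exp(-alpha * k) * total * h / math.pi

def bs_call(s0, strike, r, sigma, t):
    """Closed form Black-Scholes call, used as the reference value."""
    sd = sigma * math.sqrt(t)
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / sd
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * t) * cdf(d1 - sd)

fourier_price = carr_madan_call(100.0, 100.0, 0.05, 0.2, 1.0)
closed_form_price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

Replacing `bs_log_price_cf` by another closed form characteristic function is all that is needed to price under a different model.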
So the call option price becomes
where $u \in \mathbb{R}$. For option prices we extend the definition to the complex plane, $u \in D \subseteq \mathbb{C}$, where $D$ denotes the subset of the complex plane on which the expectation is well defined. $\phi_X(u)$ under the extended definition is called the generalized Fourier transform. Note that the generalized Fourier transform includes as special cases the Laplace transform, when $\Im(u) > 0$, and the cumulant generating function, when $\Im(u) < 0$, if they are well defined.
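Example 5.1.1. Under the B-S model, the log security return is given by
$$s_t := \log\frac{S_t}{S_0} = \left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma W_t \tag{5.27}$$
and for the above mean and variance the PDF of $s$ will be
$$f_s(x) = \frac{1}{\sqrt{2\pi\sigma^2t}}\exp\left(-\frac{1}{2}\left(\frac{x - \left(\mu - \frac{1}{2}\sigma^2\right)t}{\sqrt{\sigma^2t}}\right)^2\right) \tag{5.29}$$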
• $\frac{2}{\pi}\int_{u=0}^{\infty}\frac{\sin(u\zeta)}{u}\,du = \mathrm{sgn}(\zeta)$
• $\int_{y=-\infty}^{\infty}\mathrm{sgn}(y-x)\,dF(y) = -\int_{y=-\infty}^{x}dF(y) + \int_{y=x}^{\infty}dF(y) = 1 - 2F(x)$
Here comes a proof of the inversion formula; for a more detailed proof see, for example, (Kendall et al., 1946)
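$$\begin{aligned}
I &= \int_{u=0}^{\infty}\frac{e^{iux}\phi_X(-u) - e^{-iux}\phi_X(u)}{iu}\,du \\
&= \int_{u=0}^{\infty}\int_{z=-\infty}^{\infty}\frac{e^{iux}e^{-iuz} - e^{-iux}e^{iuz}}{iu}\,dF(z)\,du \\
&= \int_{u=0}^{\infty}\int_{z=-\infty}^{\infty}\frac{2\sin(u(x-z))}{u}\,dF(z)\,du \\
&= \int_{z=-\infty}^{\infty}\int_{u=0}^{\infty}\frac{2\sin(u(x-z))}{u}\,du\,dF(z) \\
&= \int_{z=-\infty}^{\infty}\pi\,\mathrm{sgn}(x-z)\,dF(z) = \pi\left(2F(x) - 1\right)
\end{aligned} \tag{5.34}$$
hence $F(x) = \frac{1}{2} + \frac{1}{2\pi}I$, and the PDF inversion
$$f(x) = F'(x) = \frac{1}{2\pi}\frac{d}{dx}\int_{u=0}^{\infty}\frac{e^{iux}\phi_X(-u) - e^{-iux}\phi_X(u)}{iu}\,du = \frac{1}{\pi}\int_{u=0}^{\infty}\Re\left[e^{-iux}\phi_X(u)\right]du \tag{5.35}$$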
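The inversion formula (5.34) can be tested on a distribution where both the characteristic function and the CDF are known. The sketch below assumes a standard normal variable and evaluates the integral $I$ with a midpoint rule, which conveniently avoids the point $u = 0$; the truncation point and the grid size are arbitrary choices.

```python
import cmath
import math

def standard_normal_cf(u):
    """Characteristic function of a standard normal variable."""
    return cmath.exp(-0.5 * u * u)

def cdf_from_cf(x, cf, cutoff=40.0, n=40000):
    """F(x) = 1/2 + I / (2 pi), with I as in (5.34), by the midpoint rule."""
    h = cutoff / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        integrand = (cmath.exp(1j * u * x) * cf(-u)
                     - cmath.exp(-1j * u * x) * cf(u)) / (1j * u)
        total += integrand.real
    return 0.5 + total * h / (2.0 * math.pi)

recovered_cdf = cdf_from_cf(0.5, standard_normal_cf)
exact_cdf = 0.5 * (1.0 + math.erf(0.5 / math.sqrt(2.0)))
```

The probabilities $P_1$ and $P_2$ in the Heston formula later in the thesis are computed with an integral of exactly this type.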
The option transform will be
$$\chi'_c(u) = \int_{k=-\infty}^{\infty}e^{iuk}\,dc(k) = -\frac{\phi_s(u-i)}{iu+1}, \quad u \in \mathbb{R} \tag{5.38}$$
The boundary conditions give that $c(\infty) = 0$, when the strike is infinity, and $c(-\infty) = 1$, when the strike is zero (remember that we have been scaling the option parameters), hence $e^{iu\infty} = 0$
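$$\begin{aligned}
\chi'_c(u) &= e^{-iu\infty} - iu\int_{k=-\infty}^{\infty}c(k)\,e^{iuk}\,dk \\
&= e^{-iu\infty} - iu\int_{k=-\infty}^{\infty}\int_{s=-\infty}^{\infty}\left(e^{s_t} - e^{k}\right)1_{s_t>k}\,dF(s)\,e^{iuk}\,dk \\
&= e^{-iu\infty} - iu\int_{s=-\infty}^{\infty}\int_{k=-\infty}^{\infty}\left(e^{s_t} - e^{k}\right)1_{s_t>k}\,e^{iuk}\,dk\,dF(s) \\
&= e^{-iu\infty} - iu\int_{s=-\infty}^{\infty}\int_{k=-\infty}^{s_t}\left(e^{iuk+s_t} - e^{(iu+1)k}\right)dk\,dF(s) \\
&= e^{-iu\infty} - iu\int_{s=-\infty}^{\infty}\left[e^{s_t}\frac{e^{iuk}}{iu} - \frac{e^{(iu+1)k}}{iu+1}\right]_{k=-\infty}^{k=s_t}dF(s)
\end{aligned} \tag{5.41}$$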
Another check on the boundary conditions: $\lim_{k\to-\infty}e^{(iu+1)k} = 0$ given the real component $e^{-\infty}$. The other boundary term, $e^{s_t}e^{-iu\infty}$, is non-convergent, so we pull it out and take the expectation to have
$$iu\int_{s=-\infty}^{\infty}\frac{e^{s_t}e^{-iu\infty}}{iu}\,dF(s) = e^{-iu\infty} \tag{5.42}$$
$$\chi'_c(u) = -iu\int_{s=-\infty}^{\infty}\left(\frac{e^{(iu+1)s_t}}{iu} - \frac{e^{(iu+1)s_t}}{iu+1}\right)dF(s) = -\int_{s=-\infty}^{\infty}\frac{e^{(iu+1)s_t}}{iu+1}\,dF(s) = -\frac{\phi(u-i)}{iu+1} \tag{5.43}$$
The scaled version of $c(k)$ behaves as a CDF; in particular it has $c(\infty) = 0$, when the strike is infinity, and $c(-\infty) = 1$, when the strike is zero.
$$I = \int_{u=0}^{\infty}\frac{e^{iux}\chi(-u) - e^{-iux}\chi(u)}{iu}\,du = \int_{z=-\infty}^{\infty}\pi\,\mathrm{sgn}(x-z)\,dF(z) = -\pi\left(1 - 2c(x)\right) \tag{5.44}$$
thus $c(x) = \frac{1}{2} + \frac{1}{2\pi}I$
Treat $c(k)$ as a PDF; then the option transform is given below. For more information, see (Carr & Wu, 2004)
$$\chi''_c(z) = \int_{k=-\infty}^{\infty}e^{izk}c(k)\,dk = \frac{\phi_x(z-i)}{(iz)(iz+1)} \tag{5.45}$$
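Proof 5.1.2. Proof of the inversion transformation for the option prices.
$$\begin{aligned}
\chi''_c(z) &= \int_{k=-\infty}^{\infty}e^{izk}c(k)\,dk \\
&= \int_{k=-\infty}^{\infty}\int_{s=-\infty}^{\infty}\left(e^{s_t} - e^{k}\right)1_{s_t>k}\,dF(s)\,e^{izk}\,dk \\
&= \int_{s=-\infty}^{\infty}\int_{k=-\infty}^{\infty}\left(e^{s_t} - e^{k}\right)1_{s_t>k}\,e^{izk}\,dk\,dF(s) \\
&= \int_{s=-\infty}^{\infty}\int_{k=-\infty}^{s_t}\left(e^{izk+s_t} - e^{(iz+1)k}\right)dk\,dF(s) \\
&= \int_{s=-\infty}^{\infty}\left[e^{s_t}\frac{e^{izk}}{iz} - \frac{e^{(iz+1)k}}{iz+1}\right]_{k=-\infty}^{k=s_t}dF(s)
\end{aligned} \tag{5.47}$$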
5.2 Bibliographical notes
The theory of Fourier transformation is huge. A good starting point is a course from Stanford University, EE261 - The Fourier Transform and its Applications, which you can find at https://see.stanford.edu/Course/EE261. A good note with a financial emphasis is (Matsuda, 2004). The financial application began with (Carr & Madan, 1999); another good introduction is (Černý, 2004). The PPT presentation by Liuren Wu $^2$ has also been used. A financial view of Fourier transformation can be found in (Pascucci, 2011). A standard text book is (Kendall et al., 1946). To see the connection between the option price and the characteristic function, look at (Carr & Wu, 2004).
Chapter 6
Heston model
variance.
These are called the stylized facts in the financial mathematics literature. When observing the log-returns, you see higher peaks and fatter tails than the normal distribution allows. There is also the smile curve in the observed implied volatilities; were the Black - Scholes model correct, you would see a flat surface over the maturity dimension and the strikes. The correlation $\rho$ gives control over the relationship between the volatility and the stock price. For example, the volatility is usually higher when prices are depressed; setting $\rho < 0$ gives you that feature in Heston's model. The correlation also affects the skew of the volatility surface. Remember that the variance is positive, which is why there is a square root in the diffusion term of $dv_t$, and mean-reverting. As there are only a limited number of parameters, it makes the calibration complicated, but it is still an attractive model, since it captures more of the dynamics of the stocks. It does not capture the skew seen in the short tenors; for that a jump process can be an attractive model, but then the Heston model would compete with other models, for example the Bates model or local stochastic volatility models with jumps. The Heston model belongs to the class of affine jump diffusions (AJD), but without jumps; the $J_t$ are set to zero.
Let us start by deriving the pricing PDE for the Heston model. The price of the option $V$ will depend on the time to maturity and on the two diffusion processes $S_t, v_t$. I will suppress $(r, K, T)$, as they are constant in the Heston model.
$$V = V(t,S_t,v_t;r,K,T) \sim V(t,S_t,v_t) \tag{6.4}$$
The differences between the Black - Scholes pricing PDE and the Heston model pricing PDE are

BS | Heston
One Brownian motion | Two Brownian motions
One source of randomness | Two sources of randomness
Use Delta hedging | Use Delta and Sigma hedging
Complete market | Volatility is not traded, hence incomplete market
Unique martingale measure | Many martingale measures
To derive the Heston PDE, we need the 2-dimensional version of Ito's lemma with time dependency. To make the presentation clearer I will avoid writing the subscript $t$ on the two random processes, but it is always there. We begin with a small change in the value process.
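Use the differential operator, and denote it by $L$
$$L = \frac{\partial}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2}{\partial S^2} + \frac{1}{2}\sigma^2v\frac{\partial^2}{\partial v^2} + \rho\sigma vS\frac{\partial^2}{\partial v\,\partial S} \tag{6.6}$$
The drift term can then be written as $(LV)$, and using the differential operator (6.6) in (6.5) gives us
$$dV = (LV)(t,s,v)\,dt + \frac{\partial V}{\partial S}\,dS + \frac{\partial V}{\partial v}\,dv \tag{6.7}$$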
To hedge the two sources of randomness in Heston's model we need two assets. One is the stock, as in BS; for the other source of randomness we use a second option with a different maturity. Let the value of the option with the shorter maturity $T_1$ be $V = V(t,S_t,v_t;r,K,T_1)$, and assume that we know the value of an option with another maturity $T_2$, denoted by $U = U(t,S_t,v_t;r,K,T_2)$, where $T_1 < T_2$. We need a longer maturity to be able to hedge the option with the shorter maturity all the way to its maturity. The change in the second option, $dU$, is given by the Ito differential for two processes and time dependency.
$$dU = (LU)(t,s,v)\,dt + \frac{\partial U}{\partial S}\,dS + \frac{\partial U}{\partial v}\,dv \tag{6.8}$$
We are hedging the risk of the option with maturity $T_1$ using the stock and the longer maturity option, $T_2$. We construct a portfolio with $\Delta$ units of the stock, $\Sigma$ units of the second option, with the longer maturity, and $\alpha$ units of the bank account.
$$V = \Delta S + \Sigma U + \alpha B \tag{6.9}$$
Using the self-financing concept, we can find the change in the value of the option by $^2$
$$dV = \Delta\,dS + \Sigma\,dU + \alpha\,dB \tag{6.10}$$
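We have (6.7) and (6.8), and (6.3) is only a deterministic growth. Then (6.10) becomes
$$\begin{aligned}
dV &= \Delta\,dS + \Sigma\,dU + \alpha\,dB \\
&= \Delta\,dS + \Sigma\left[(LU)(t,s,v)\,dt + \frac{\partial U}{\partial S}\,dS + \frac{\partial U}{\partial v}\,dv\right] + \alpha rB\,dt
\end{aligned} \tag{6.11}$$
combine the $dS$-terms
$$= \Sigma(LU)(t,s,v)\,dt + \left(\Delta + \Sigma\frac{\partial U}{\partial S}\right)dS + \Sigma\frac{\partial U}{\partial v}\,dv + \alpha rB\,dt$$
The stochastic terms, the sources of randomness, are $dS$ and $dv$; our aim is to remove them. Equating the $dS$-terms of (6.7) and (6.11) gives us
$$\begin{aligned}
\frac{\partial V}{\partial S}\,dS &= \Delta\,dS + \Sigma\frac{\partial U}{\partial S}\,dS \\
\frac{\partial V}{\partial S} &= \Delta + \Sigma\frac{\partial U}{\partial S}
\end{aligned} \tag{6.12}$$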
2 note that no cross-term is needed, e.g. (?, ?), rebalancing after each short step.
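and for the $dv$
$$\frac{\partial V}{\partial v} = \Sigma\frac{\partial U}{\partial v} \tag{6.13}$$
that gives that
$$\Sigma = \frac{\partial V/\partial v}{\partial U/\partial v} \tag{6.14}$$
isolate $\Delta$ in (6.12)
$$\Delta = \frac{\partial V}{\partial S} - \Sigma\frac{\partial U}{\partial S} \tag{6.15}$$
after some cancellation we get
$$(LV)(t,s,v)\,dt + \frac{\partial V}{\partial S}\,dS + \frac{\partial V}{\partial v}\,dv = \Sigma(LU)(t,s,v)\,dt + \frac{\partial V}{\partial S}\,dS + \frac{\partial V}{\partial v}\,dv + \alpha rB\,dt \tag{6.16}$$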
Cancellation of the stochastic terms gives
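Use the replicating portfolio to get rid of the bank account. Remember that $\alpha B = V - \Delta S - \Sigma U$, just a rewrite of (6.9), and use $\Delta = \frac{\partial V}{\partial S} - \Sigma\frac{\partial U}{\partial S}$ from (6.15)
$$\alpha B = V - \frac{\partial V}{\partial S}S + \Sigma\frac{\partial U}{\partial S}S - \Sigma U \tag{6.18}$$
put it in (6.16)
$$(LV)(t,s,v)\,dt = \Sigma(LU)(t,s,v)\,dt + r\left(V - \frac{\partial V}{\partial S}S + \Sigma\frac{\partial U}{\partial S}S - \Sigma U\right)dt \tag{6.19}$$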
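Every term has $dt$, so you can get rid of it. Put all the terms containing $V$, whose PDE we are after (remember that it is the option with the shorter maturity), on the left hand side.
$$(LV)(t,s,v) - rV + r\frac{\partial V}{\partial S}S = \Sigma(LU)(t,s,v) + r\Sigma\frac{\partial U}{\partial S}S - r\Sigma U \tag{6.20}$$
and we end up, after some more straightforward calculation, with
$$\frac{(LV)(t,s,v) - rV + r\frac{\partial V}{\partial S}S}{\frac{\partial V}{\partial v}} = \frac{(LU)(t,s,v) - rU + r\frac{\partial U}{\partial S}S}{\frac{\partial U}{\partial v}} \tag{6.21}$$
The left hand side depends only on $V$ and the right hand side only on $U$. This means that the fraction must not depend on $V, U$, but must depend only on the parameters $(t,s,v)$. I ignore the normalization.
$$\frac{(LV)(t,s,v) - rV + r\frac{\partial V}{\partial S}S}{\frac{\partial V}{\partial v}} = -f(t,s,v) \tag{6.22}$$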
A more revealing rewrite is
$$(LV)(t,s,v) - rV = -rS\frac{\partial V}{\partial S} - f(t,s,v)\frac{\partial V}{\partial v} \tag{6.24}$$
Remember that $rS$ is the drift of the stock price $S$, so $f$ must also be some drift of the SDE.
6.1.2 Another specification of the Heston model
We have the following system driven by Brownian motions, see equation (6.1), which I reproduce here
$$\begin{aligned}
dS_t &= \mu S_t\,dt + \sqrt{v_t}\,S_t\,dZ_1 \\
dv_t &= \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dZ_2
\end{aligned} \tag{6.34}$$
We can proceed as follows: let the second Brownian motion be the same in the new specification,
$$dZ_2 = dW_2 \tag{6.37}$$
and write the other, the old correlated Brownian motion of equation (6.1), as a linear combination
$$dZ_1 = \sqrt{1-\rho^2}\,dW_1 + \rho\,dW_2 \tag{6.38}$$
with the correlation $\langle dW_1, dW_2\rangle = 0$. So we can write Heston's model in terms of independent Brownian motions
$$\begin{aligned}
dS &= \mu S\,dt + \sqrt{v}\,S\left(\sqrt{1-\rho^2}\,dW_1 + \rho\,dW_2\right) \\
dv &= \kappa(\theta - v)\,dt + \sigma\sqrt{v}\,dW_2
\end{aligned} \tag{6.40}$$
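The construction (6.37) and (6.38) is also how correlated increments are generated in a simulation. The following sketch draws independent increments $dW_1, dW_2$, builds $dZ_1, dZ_2$ from them and estimates their correlation. The time step, the number of samples and the seed are arbitrary choices of mine.

```python
import math
import random

def correlated_increments(rho, dt, rng):
    """Build (dZ1, dZ2) with correlation rho from independent (dW1, dW2)."""
    dw1 = rng.gauss(0.0, math.sqrt(dt))
    dw2 = rng.gauss(0.0, math.sqrt(dt))
    dz1 = math.sqrt(1.0 - rho ** 2) * dw1 + rho * dw2  # equation (6.38)
    dz2 = dw2                                          # equation (6.37)
    return dz1, dz2

def sample_correlation(rho, n=200_000, dt=1.0 / 252.0, seed=0):
    """Estimate corr(dZ1, dZ2); the increments have mean zero by construction."""
    rng = random.Random(seed)
    s11 = s22 = s12 = 0.0
    for _ in range(n):
        dz1, dz2 = correlated_increments(rho, dt, rng)
        s11 += dz1 * dz1
        s22 += dz2 * dz2
        s12 += dz1 * dz2
    return s12 / math.sqrt(s11 * s22)

estimated_rho = sample_correlation(-0.7)
```

The estimate should be close to the chosen $\rho$, and the variance of $dZ_1$ remains $dt$ because $(1-\rho^2) + \rho^2 = 1$.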
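and the Radon - Nikodym density process, or the Girsanov theorem, gives the link between the physical and the risk-neutral measure
$$dW^Q = \frac{\mu - r}{\sigma}\,dt + dW \tag{6.44}$$
so that the dynamics under the risk neutral measure can be written as
$$\begin{aligned}
\frac{dS}{S} &= \mu\,dt + \sigma\left(dW^Q - \frac{\mu - r}{\sigma}\,dt\right) \\
&= \mu\,dt + \sigma\,dW^Q - \mu\,dt + r\,dt \\
&= r\,dt + \sigma\,dW^Q \\
&= (\mu - \lambda\sigma)\,dt + \sigma\,dW^Q
\end{aligned} \tag{6.45}$$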
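In this specification, $(\mu - \lambda\sigma) = r$ and the ratio $\frac{\mu - r}{\sigma} = \lambda$ is the market price of risk. This $\lambda$ is used when markets are not complete, or the assets are not traded, which is the case for the variance process. Our risk neutral measure $Q$ depends on $\lambda$, and for different $\lambda$ you get different risk neutral measures. For stocks, which are traded assets, we can write the relationship using
$$dS = \mu S\,dt + \sqrt{v}\,S\left(\sqrt{1-\rho^2}\,dW_1 + \rho\,dW_2\right) \tag{6.46}$$
where the second Brownian motion, $dW_2$, is the one that drives the variance, which is not traded.
$$\begin{aligned}
dW_1^{Q_\lambda} &= dW_1 + \frac{\mu - r - \lambda\rho\sqrt{v}}{\sqrt{1-\rho^2}\,\sqrt{v}}\,dt \\
dW_2^{Q_\lambda} &= dW_2 + \lambda\,dt
\end{aligned} \tag{6.47}$$
Then the drift of the stock will be equal to $r$, the risk free rate. Substituting into the stock price SDE we get
$$\frac{dS}{S} = \mu\,dt + \sqrt{1-\rho^2}\,\sqrt{v}\left[dW_1^{Q_\lambda} - \frac{\mu - r - \lambda\rho\sqrt{v}}{\sqrt{1-\rho^2}\,\sqrt{v}}\,dt\right] + \rho\sqrt{v}\left(dW_2^{Q_\lambda} - \lambda\,dt\right) \tag{6.48}$$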
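Going back to $Z_1$ and $Z_2$, which were linear combinations of $dW_1$ and $dW_2$, we get the system in the original form
$$\begin{aligned}
\frac{dS}{S} &= r\,dt + \sqrt{v}\,dZ_1^{Q_\lambda} \\
dv &= \left(\kappa(\theta - v) - \lambda\sigma\sqrt{v}\right)dt + \sigma\sqrt{v}\,dZ_2^{Q_\lambda}
\end{aligned} \tag{6.51}$$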
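Conditioning on $\lambda$ we are in the risk - neutral world and can use the theory of martingales, as we did in the Black - Scholes setting. The discounted stock price process is a martingale under the risk neutral measure, so we can write the price of the option as the discounted expected value. That can be done as follows
and the PDE will be given by the 2-dimensional version of the Feynman - Kac theorem. The $h(S_T, *)$ is the payoff function.
$$\begin{aligned}
0 = {} & \frac{\partial V}{\partial t} + \frac{1}{2}vS^2\frac{\partial^2V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV \\
& + \frac{1}{2}\sigma^2v\frac{\partial^2V}{\partial v^2} + \left(\kappa(\theta - v) - \lambda\sigma\sqrt{v}\right)\frac{\partial V}{\partial v} + \rho\sigma vS\frac{\partial^2V}{\partial v\,\partial S}
\end{aligned} \tag{6.53}$$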
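where the second line corresponds to the second dimension in the Feynman - Kac theorem, and the last term is the cross-term. Rearranging the terms gives us
$$\begin{aligned}
0 = {} & \frac{\partial V}{\partial t} + \frac{1}{2}\left(vS^2\frac{\partial^2V}{\partial S^2} + \sigma^2v\frac{\partial^2V}{\partial v^2} + 2\rho\sigma vS\frac{\partial^2V}{\partial v\,\partial S}\right) \\
& + rS\frac{\partial V}{\partial S} + \left(\kappa(\theta - v) - \lambda\sigma\sqrt{v}\right)\frac{\partial V}{\partial v} - rV
\end{aligned} \tag{6.54}$$
and the Feynman - Kac 2-dimensional PDE is
$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}\sum_{i,j}\sigma_{i,j}\frac{\partial^2V}{\partial x_i\,\partial x_j} + \sum_i\mu_i^{Q_\lambda}\frac{\partial V}{\partial x_i} - rV \tag{6.55}$$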
where the $\{x_i\}$ are the underlying processes; in Heston we have two processes, the stock price and the variance SDE. $\sigma_{11} = vS^2$ is the squared diffusion coefficient of the stock price SDE, $\sigma_{22} = v\sigma^2$ is the squared diffusion coefficient of the variance SDE, and $\sigma_{12} = \sigma_{21} = \rho\sigma vS$ is the covariance term that comes from the correlation between the two Brownian motions. The drift of the stock price SDE is $\mu_1 = rS$, and $\mu_2 = \kappa(\theta - v) - \lambda\sigma\sqrt{v}$ is the drift of the variance SDE.
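following dynamics for the stock price, where the variance itself is a random process
$$\begin{aligned}
dS_t &= \mu S_t\,dt + \sqrt{v_t}\,S_t\,dZ_1 \\
dv_t &= \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dZ_2
\end{aligned} \tag{6.56}$$
and in the risk - neutral setting, under the measure $Q$, we get
$$\begin{aligned}
dS_t &= rS_t\,dt + \sqrt{v_t}\,S_t\,dZ_1^{Q_\lambda} \\
dv_t &= \left(\kappa(\theta - v_t) - \lambda\sigma\sqrt{v_t}\right)dt + \sigma\sqrt{v_t}\,dZ_2^{Q_\lambda}
\end{aligned} \tag{6.57}$$
where the drift in the variance process gets adjusted by the market price of risk. Heston $^3$ made the variance process simpler by
$$dv_t = \left(\kappa(\theta - v_t) - \lambda v_t\right)dt + \sigma\sqrt{v_t}\,dZ_2^{Q_\lambda} \tag{6.58}$$
Let us write the variance process a bit shorter to save space.
$$dv_t = \mu_v\,dt + \sigma\sqrt{v_t}\,dZ_2^{Q_\lambda} \tag{6.59}$$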
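In the last chapter we showed that, using Delta and Sigma hedging, we could arrive at this pricing PDE, which I reproduce here; it is the 2-dimensional version of the Black - Scholes equation.
$$\begin{aligned}
0 = {} & \frac{\partial V}{\partial t} + \frac{1}{2}\left(vS^2\frac{\partial^2V}{\partial S^2} + \sigma^2v\frac{\partial^2V}{\partial v^2} + 2\rho\sigma vS\frac{\partial^2V}{\partial v\,\partial S}\right) \\
& + rS\frac{\partial V}{\partial S} + \left(\kappa(\theta - v) - \lambda\sigma\sqrt{v}\right)\frac{\partial V}{\partial v} - rV
\end{aligned} \tag{6.60}$$
The price of the derivative can be written as the expected value of the discounted terminal payoff
$$V_0 = E^{Q_\lambda}\left[e^{-rT}h(S_T)\,\middle|\,S_0, v_0\right] \tag{6.61}$$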
where $h(S_T)$ is the payoff function; for a European call option it is $h(S_T) = \max(S_T - K, 0)$. Note that we are not using the filtration, but condition on the initial values; this is due to the Markov property. The Feynman - Kac theorem provides the link between the two representations (6.60) and (6.61). As in the Black - Scholes model, it gets simpler if we transform the stock price to the log price, $x = \log(S)$. Applying Ito's lemma we get
$$dx = d\log(S) = \left(r - \frac{1}{2}v_t\right)dt + \sqrt{v_t}\,dZ_1^{Q_\lambda} \tag{6.62}$$
Let us also transform the PDE (6.60) from being a function of $S$ to being a function of $x$
$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}v\frac{\partial^2V}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2V}{\partial v^2} + \rho\sigma v\frac{\partial^2V}{\partial v\,\partial x} + \left(r - \frac{1}{2}v\right)\frac{\partial V}{\partial x} + \mu_v\frac{\partial V}{\partial v} - rV \tag{6.63}$$
3 (Breeden, 1979) Breeden states how to find a risk neutral measure for the market price
For the BS case we could use the heat equation to get a closed form solution; that cannot be done in the Heston model. There are 3 variables in the Heston model, $t, x, v$, but the steps of the procedure to derive the pricing formula are similar. The valuation formula gives us the following
The payoff for a European call option can be written as $h(S_T) = \max(S_T - K, 0) = S_T1_{S_T>K} - K1_{S_T>K}$, so we can split the payoff into two terms, due to the linearity of expectation. Let us go back to (6.61)
The second term is similar to Black - Scholes, but the first term is a bit more complicated: a change of numeraire is needed. The change of numeraire is the same as in the Black - Scholes example, where the first term was in the stock measure and the second term was in the bank account numeraire.
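$$\frac{V_0}{B_0} = E^Q\left[\frac{V_T}{B_T}\,\middle|\,\mathcal{F}_0\right], \qquad \frac{V_0}{S_0} = E^S\left[\frac{V_T}{S_T}\,\middle|\,\mathcal{F}_0\right] \tag{6.66}$$
The value of an asset scaled by the value of the stock price will be a martingale under the measure induced by the stock price as the numeraire. Notice that $S_0$ is known at time 0; it is a constant
$$V_0 = E^Q\left[\frac{B_0}{B_T}V_T\,\middle|\,\mathcal{F}_0\right], \qquad V_0 = E^S\left[\frac{S_0}{S_T}V_T\,\middle|\,\mathcal{F}_0\right] \tag{6.67}$$
As both expressions in (6.67) are the price of the same asset, we come to
$$\frac{B_0}{B_T}\,dQ = \frac{S_0}{S_T}\,dP^S, \qquad \frac{B_0}{B_T}\,S_T\,dQ = S_0\,dP^S \tag{6.68}$$
and since the bank account starts with 1 we get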
$P_1$ is the probability that the stock price is greater than $K$, the strike price, under the stock measure. $P_2$ is the probability that the stock price is greater than $K$, but under the risk-neutral measure. You can denote the price of the option by $V_\tau$, with remaining time $\tau = T - t \in [0,T]$. Use the chain rule from regular calculus: since $T$ is fixed, $\partial/\partial t = -\partial/\partial\tau$, so there will be a change of sign when working with $\tau$ instead of $t$. Our PDE is in $x = \log(S)$, so our pricing function (6.65) will become
$$V_\tau = e^xP_1 - Ke^{-r\tau}P_2 \tag{6.71}$$
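$$0 = \frac{\partial V}{\partial t} + \frac{1}{2}v\frac{\partial^2V}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2V}{\partial v^2} + \rho\sigma v\frac{\partial^2V}{\partial v\,\partial x} + \left(r - \frac{1}{2}v\right)\frac{\partial V}{\partial x} + \mu_v\frac{\partial V}{\partial v} - rV \tag{6.72}$$
$$\frac{\partial V}{\partial\tau} = \frac{1}{2}v\frac{\partial^2V}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2V}{\partial v^2} + \rho\sigma v\frac{\partial^2V}{\partial v\,\partial x} + \left(r - \frac{1}{2}v\right)\frac{\partial V}{\partial x} + \mu_v\frac{\partial V}{\partial v} - rV \tag{6.73}$$
$$S_t = e^{a(t) + b(t)\cdot x} \tag{6.74}$$
$$S_t = e^{a(t) + b(t)\cdot x} \tag{6.76}$$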
where $x$ denotes a vector of two factors, with coefficients $a(t)$ and $b(t)$. We now have the PDE (6.73), and we have a solution, (6.71), that satisfies it. Since $V_\tau$ is a linear combination of the two terms containing $P_1$ and $P_2$, and the PDE is linear, both terms must solve (6.73) by themselves. I carry out the calculations. Let $V_1 = e^xP_1$ and solve (6.73): first we must calculate the derivatives, then put the results in the PDE and simplify.
$$\frac{\partial^2V_1}{\partial v^2} = e^x\frac{\partial^2P_1}{\partial v^2}, \qquad \frac{\partial^2V_2}{\partial v^2} = e^{-r\tau}\frac{\partial^2P_2}{\partial v^2}$$
$$\frac{\partial^2V_1}{\partial v\,\partial x} = e^x\frac{\partial P_1}{\partial v} + e^x\frac{\partial^2P_1}{\partial x\,\partial v}, \qquad \frac{\partial^2V_2}{\partial v\,\partial x} = e^{-r\tau}\frac{\partial^2P_2}{\partial x\,\partial v}$$
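Then we insert the substitution into (6.73). Notice that $e^x$ appears in all terms for $V_1 = e^xP_1$, so it cancels. For the second term, $V_2 = e^{-r\tau}P_2$, notice that $e^{-r\tau}$ appears in all terms in the second line, so we can cancel it.
$$\begin{aligned}
\frac{\partial P_1}{\partial\tau} &= \frac{1}{2}v\frac{\partial^2P_1}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2P_1}{\partial v^2} + \rho\sigma v\frac{\partial^2P_1}{\partial v\,\partial x} + \left(r + \frac{1}{2}v\right)\frac{\partial P_1}{\partial x} + (\mu_v + \rho\sigma v)\frac{\partial P_1}{\partial v} \\
\frac{\partial P_2}{\partial\tau} &= \frac{1}{2}v\frac{\partial^2P_2}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2P_2}{\partial v^2} + \rho\sigma v\frac{\partial^2P_2}{\partial v\,\partial x} + \left(r - \frac{1}{2}v\right)\frac{\partial P_2}{\partial x} + \mu_v\frac{\partial P_2}{\partial v}
\end{aligned} \tag{6.77}$$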
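These two lines look very similar; the only difference is in the coefficients of $\frac{\partial P_j}{\partial x}$ and $\frac{\partial P_j}{\partial v}$. In the Black - Scholes model the stock dynamics under the stock measure would be $dS = (r + \sigma^2)S\,dt + \sigma S\,dW^S$; in Heston, it means that we add $v$ to the drift under $P_1$, which shows up in the coefficient of $\frac{\partial P_1}{\partial x}$. It also explains the extra term in the coefficient of $\frac{\partial P_1}{\partial v}$: because $\rho$ is the correlation between the stock price SDE and $dZ_2$, $\rho\sigma v$ acts as the covariance.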
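As the two PDEs are similar, we can write them as one PDE with an index $j$. Let $u_1 = 0.5$, $u_2 = -0.5$, $b_1 = \kappa + \lambda - \rho\sigma$ and $b_2 = \kappa + \lambda$; then (6.77) can be written as a generic PDE
$$\frac{\partial P_j}{\partial\tau} = \frac{1}{2}v\frac{\partial^2P_j}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2P_j}{\partial v^2} + \rho\sigma v\frac{\partial^2P_j}{\partial v\,\partial x} + (r + u_jv)\frac{\partial P_j}{\partial x} + (a - b_jv)\frac{\partial P_j}{\partial v} \tag{6.78}$$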
To solve (6.78) numerically subject to its terminal condition, which has become the initial condition because $\tau = T - t$, remember that the value is $V_t = e^xP_1 - Ke^{-r\tau}P_2$, and that $P_j$ is the probability that the stock price is greater than or equal to the strike $K$ at maturity, under the respective measure. So the initial condition is $P_0 = 1_{S>K}$
where the function value at maturity is in terms of the indicator function. The characteristic function is
$$f_j = E^Q\left[e^{i\phi X_T}\,\middle|\,S_0, v_0\right] \tag{6.80}$$
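and it must satisfy the same PDE (6.78); only the terminal condition changes, from the indicator function to the exponential. Remember that the expectation of the indicator function is the same as the probability of the indicated event, under some measure.
$$\frac{\partial f_j}{\partial\tau} = \frac{1}{2}v\frac{\partial^2f_j}{\partial x^2} + \frac{1}{2}\sigma^2v\frac{\partial^2f_j}{\partial v^2} + \rho\sigma v\frac{\partial^2f_j}{\partial v\,\partial x} + (r + u_jv)\frac{\partial f_j}{\partial x} + (a - b_jv)\frac{\partial f_j}{\partial v} \tag{6.81}$$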
where $p(x)$ is the probability density function, in this case the normal density, and let $u = i\phi$ be a complex number. For a geometric Brownian motion, where the stock price has the following dynamics under the risk neutral measure
$$dS = rS\,dt + \sigma S\,dW_t \tag{6.83}$$
the logarithm of the stock price will be normally distributed
$$\log(S_T) \sim N\left(x + \left(r - \frac{1}{2}\sigma^2\right)\tau,\ \sigma^2\tau\right) \tag{6.84}$$
which can be viewed as a marginal distribution, and its characteristic function will be
Note that $x = \log(S_0)$, $\tau$ is the time to maturity, we are using the filtration now, and the superscript $x$ in $X_t^x$ represents the value of a Markov process $X$ at time $t$ that started at $x$ at time 0. $\psi$ and $\phi$ are some generic functions of $\tau, u$. An affine function is a function of this form
$$f(x) = a + bx \tag{6.85}$$
that is, a linear transformation plus a translation; the exponential of such a function is called exponential affine. If $\{X_t\}$ is a stochastic process with dynamics given by
then $X_t$ is called an affine process if the drift and diffusion are affine functions, that is, if they can be written as follows; note that it is drift and variance, and not drift and volatility
Then the characteristic function will be of exponential affine form. The market price of risk and the square root in the diffusion term of Heston's model are chosen so that the model stays in this class. In Heston's model the log of the stock price is given by the following SDE
$$dx = (r - 0.5v)\,dt + \sqrt{v}\,dZ_1 \tag{6.90}$$
For an affine function to be zero, the coefficient of $v$ must be zero, and the sum of the other terms must also be zero.
$$-\frac{\partial D}{\partial\tau} - \frac{1}{2}\phi^2 + \frac{1}{2}\sigma^2D^2 + \rho\sigma i\phi D + u_ji\phi - b_jD = 0 \tag{6.96}$$
and that
$$-\frac{\partial C}{\partial\tau} + ri\phi + aD = 0 \tag{6.97}$$
This is the Riccati equation system. Begin with (6.81) and put the time-derivative on the l.h.s.
$$\frac{\partial D}{\partial\tau} = iu_j\phi - \frac{1}{2}\phi^2 + (\rho\sigma i\phi - b_j)D + \frac{1}{2}\sigma^2D^2 \tag{6.98}$$
$$\frac{\partial C}{\partial\tau} = ri\phi + aD \tag{6.99}$$
If $\tau = 0$, then our function is $f(x,v,0) = e^{i\phi x}$, and it follows that we have the initial conditions
$$D(0,\phi) = 0, \qquad C(0,\phi) = 0 \tag{6.100}$$
The general form of a Riccati equation is
$$\frac{dy}{d\tau} = a + by + cy^2 \tag{6.101}$$
Start with solving (6.98); we can identify the coefficients in the Riccati equation
$$a = iu_j\phi - \frac{1}{2}\phi^2, \qquad b = \rho\sigma i\phi - b_j, \qquad c = \frac{1}{2}\sigma^2 \tag{6.102}$$
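The solution can be written as follows, given the initial condition $y(0) = 0$ and $d = \pm\sqrt{b^2 - 4ac}$; I will only use the solution with a plus $^4$. Use the transformation to a second order linear equation and solve it with the characteristic equation, given the condition $y(0) = 0$
$$\begin{aligned}
y &= -\frac{1}{2c}\,\frac{-(b-d)e^{d\tau} + (b-d)}{-\frac{b-d}{b+d}e^{d\tau} + 1} \\
D(\tau) &= \frac{d-b}{2c}\,\frac{1 - e^{d\tau}}{1 - \frac{b-d}{b+d}e^{d\tau}} \\
d_j &= \sqrt{(\rho\sigma i\phi - b_j)^2 - (2iu_j\phi - \phi^2)\sigma^2} \\
g_j &= \frac{b_j - i\rho\sigma\phi + d_j}{b_j - i\rho\sigma\phi - d_j} \\
D(\tau) &= \frac{b_j - i\rho\sigma\phi + d_j}{\sigma^2}\,\frac{1 - e^{d_j\tau}}{1 - g_je^{d_j\tau}}
\end{aligned} \tag{6.103}$$
$$C(\tau) = ir\phi\tau + \frac{a}{\sigma^2}\left[-2\log\left(\frac{1 - g_je^{d_j\tau}}{1 - g_j}\right) + (b_j - i\rho\sigma\phi + d_j)\tau\right] \tag{6.104}$$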
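The formulas (6.103) and (6.104), together with the probabilities $P_j$ and the pricing function (6.71), can be put together in a few lines. The sketch below is an illustration under assumptions of my own: $a = \kappa\theta$ (read off from the drift in (6.58)), $\lambda = 0$ by default, a plain midpoint rule on a truncated range for the integral in $P_j$, and the standard form $f_j = e^{C + Dv + i\phi x}$ for the characteristic function. As a check it uses the fact that for a very small volatility of variance $\sigma$ and $v_0 = \theta$ the Heston price should be close to the Black - Scholes price with volatility $\sqrt{\theta}$.

```python
import cmath
import math

def heston_cf(phi, j, x, v, tau, r, kappa, theta, sigma, rho, lam=0.0):
    """f_j = exp(C + D v + i phi x), with D and C as in (6.103)-(6.104)."""
    u_j = 0.5 if j == 1 else -0.5
    b_j = kappa + lam - rho * sigma if j == 1 else kappa + lam
    a = kappa * theta  # assumption: a = kappa * theta
    beta = b_j - 1j * rho * sigma * phi
    d = cmath.sqrt(beta ** 2 - sigma ** 2 * (2j * u_j * phi - phi ** 2))
    g = (beta + d) / (beta - d)
    growth = cmath.exp(d * tau)
    big_d = (beta + d) / sigma ** 2 * (1.0 - growth) / (1.0 - g * growth)
    big_c = 1j * r * phi * tau + a / sigma ** 2 * (
        (beta + d) * tau - 2.0 * cmath.log((1.0 - g * growth) / (1.0 - g)))
    return cmath.exp(big_c + big_d * v + 1j * phi * x)

def heston_probability(j, s0, strike, v0, tau, r, kappa, theta, sigma, rho,
                       lam=0.0, cutoff=100.0, n=10000):
    """P_j by the midpoint rule, which avoids the point phi = 0."""
    x, log_k, h = math.log(s0), math.log(strike), cutoff / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        f = heston_cf(phi, j, x, v0, tau, r, kappa, theta, sigma, rho, lam)
        total += (cmath.exp(-1j * phi * log_k) * f / (1j * phi)).real
    return 0.5 + total * h / math.pi

def heston_call(s0, strike, v0, tau, r, kappa, theta, sigma, rho, lam=0.0):
    """Call price S0 P1 - K exp(-r tau) P2, as in (6.71)."""
    args = (s0, strike, v0, tau, r, kappa, theta, sigma, rho, lam)
    return (s0 * heston_probability(1, *args)
            - strike * math.exp(-r * tau) * heston_probability(2, *args))

def bs_call(s0, strike, tau, r, vol):
    """Black-Scholes call, used only as the limiting reference value."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(s0 / strike) + (r + 0.5 * vol ** 2) * tau) / sd
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * tau) * cdf(d1 - sd)

# Nearly deterministic variance: sigma = 0.01, rho = 0, v0 = theta = 0.04.
limit_price = heston_call(100.0, 100.0, 0.04, 1.0, 0.05, 2.0, 0.04, 0.01, 0.0)
reference_price = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
```

This form of the characteristic function uses $e^{d_j\tau}$ and the complex logarithm directly; for long maturities the logarithm can pick the wrong branch, so the sketch should only be trusted for moderate maturities and parameters.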
4 The solution with a minus sign is called the Heston trap
So the functions $C(\tau)$ and $D(\tau)$ determine the two characteristic functions $f_1$ and $f_2$ that correspond to $P_1$ and $P_2$, which appear in the European option price in the Heston model. We have deduced the solution for the characteristic function
$$V_\tau = e^xP_1 - Ke^{-r\tau}P_2 \tag{6.106}$$
$$P_j = \frac{1}{2} + \frac{1}{\pi}\int_0^{\infty}\Re\left[\frac{e^{-i\phi\log K}f_j}{i\phi}\right]d\phi \tag{6.107}$$
To calculate the price of the option, we first need to find the characteristic functions; thereafter we can determine the probabilities by some numerical integration method, and when we have the probabilities, we have the price of the option.
Chapter 7
• state variable, all uncertainty in valuing the option after time t is resolved
once the underlying asset price after time t is known
• date-invariance, the variables determining the value of the option are not
date-dependent
• payout, the underlying asset through the grant date has a known constant
payout rate d
Furthermore, let
• S ≡ current value of the underlying
• St ≡ the (random) value of the underlying after time t, the grant date.
1 Rubinstein, Mark, "Pay Now, Choose Later", RISK 4 (February 1991), p. 13, found on the internet as "Forward-Start Option", https://ramurapt.files.wordpress.com/2009/10/forwardstartoptions.doc, retr. 2021-03-15
• $C(X, Y, T-t)$ ≡ the value of a call option with $X$ as the underlying, $Y$ as the strike price and $T - t$ remaining time to maturity.
The value of a forward starting at-the-money call option is, on the grant date $t$, by using the homogeneity assumption
Since all randomness comes from $S_t$, the second factor, $C(1,1,T-t)$, is non-random.
By the replicating portfolio argument, if we can make an investment now that will for sure produce the same outcome at time $t$, $S_tC(1,1,T-t)$, then the current cost of the investment must equal the value of the forward start option. To replicate the value of the option at time $t$, we need to hold $C(1,1,T-t)$ shares, correcting for the dividends until time $t$
$$Sd^{-1}C(1,1,T-t) \tag{7.2}$$
is the current value of the forward-start option. Using homogeneity it can be written as
$$d^{-1}C(S,S,T-t) \tag{7.3}$$
Thus "the value of a forward-start option is simply the current value of $d^{-1}$ calls which are currently at-the-money, with time to expiration $T - t$".
We can split the time until maturity into two parts. One part is from the current time until the grant time, $0$ to $t$; during that period we need to hold $C(1,1,T-t)$ shares of the stock.
7.2 Background
Here I follow (Musiela & Rutkowski, 2005) for the theoretical background of the problem. The value of a forward-start option changes with volatility. In the Black-Scholes setting, with constant and deterministic volatility $\sigma$, the forward-start option becomes very simple. In their notation, the payoff at $T$ is
$$(S_T - K S_{T_0})^+$$
that is, a European call option whose life starts at time $T_0$, when the strike $K S_{T_0}$ is set. Its price at $T_0$ is
$$C_{T_0} = S_{T_0}\,c(1, T - T_0, K, r, \sigma)$$
where everything in the parentheses on the right-hand side of the above equation is deterministic.
7.2.1 Deterministic volatility
Suppose we extend the classic Black-Scholes model with a deterministic, time-dependent volatility $\sigma(t)$ (Musiela & Rutkowski, 2005). The volatility then differs across time, but for each maturity the implied volatility smile is flat in the strike. Let the maturity $T$ be known, as is the case for a forward-start option; then the mapping $K \mapsto \hat\sigma_0(T, K)$ is the implied volatility curve for the maturity date $T$. The market-based Black-Scholes implied volatility surface $\hat\sigma_0(T, K)$ is thus implicitly defined by
$$C_0^m(T, K) = c(S_0, T, K, r, \hat\sigma_0(T, K)) \tag{7.7}$$
where $c(S_0, T, K, r, \sigma)$ is the Black-Scholes price of a call option. Let $C_0^m(T, K)$ be a family of market prices of European call options with all strikes $K > 0$ and all maturities $0 < T < T^*$ for some $T^* > 0$. I will treat the parameter $r$ as constant in this thesis. Assume that the implied volatility $\hat\sigma_0(T, K)$ inferred from the call option prices is flat in $K$: for each maturity date $T$, the implied volatility does not depend on the strike $K$, so in this case $\hat\sigma_0 : (0, T^*) \to \mathbb{R}_+$. To match market data we then only need an extension of the Black-Scholes model with a time-dependent volatility function $\hat\sigma : \mathbb{R}_+ \to \mathbb{R}_+$. The extended model is driven by the following SDE under the risk-neutral measure $Q$
$$dS(t) = rS(t)\,dt + \hat\sigma(t)S(t)\,dW^Q(t) \tag{7.8}$$
and the volatility function satisfies
$$\hat\sigma_0^2(T, K) = \frac{1}{T}\int_0^T \hat\sigma^2(u)\,du \tag{7.9}$$
The resulting implied volatility smile is flat in the strike. Going back to the forward-start option, if we assume a flat implied volatility surface we have
$$C_{T_0} = C(S_{T_0}, T - T_0, KS_{T_0}) = S_{T_0}\,c(1, T - T_0, K, r, \hat\sigma(T_0, T)) \tag{7.10}$$
where the average future volatility is
$$\hat\sigma^2(T_0, T) = \frac{1}{T - T_0}\int_{T_0}^{T} \hat\sigma^2(t)\,dt \tag{7.11}$$
and thus
$$FS_0 = S_0\,c(1, T - T_0, K, r, \hat\sigma(T_0, T)) = c(S_0, T - T_0, KS_0, r, \hat\sigma(T_0, T)) \tag{7.12}$$
Since the forward-start option starts its life at $T_0$, before that time it behaves like a process growing at the risk-free rate. Once the option becomes active, the volatility that matters is the future variance of the underlying asset $S_t$ between $T_0$ and $T$, that is, the total implied variance up to $T$ less the total implied variance up to $T_0$, the time at which the asset value $S_{T_0}$ is frozen into the strike. The forward implied volatility is
$$\hat\sigma^2(T_0, T) = \frac{T\hat\sigma_0^2(T, K) - T_0\hat\sigma_0^2(T_0, K)}{T - T_0} \tag{7.13}$$
Note that the forward implied volatility is independent of $K$.
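As a small numerical illustration of (7.12) and (7.13), the following Python sketch computes the forward implied volatility from two spot implied volatilities and prices the forward-start call with the Black-Scholes formula. The function names and the sample numbers are hypothetical; the argument order of `bs_call` is chosen to mirror $c(S, \tau, K, r, \sigma)$ in (7.10).

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(s, tau, strike, r, sigma):
    """Black-Scholes call price c(S, tau, K, r, sigma)."""
    sd = sigma * math.sqrt(tau)
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / sd
    d2 = d1 - sd
    return s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d2)


def forward_vol(t0, t, sigma_t0, sigma_t):
    """Forward implied volatility between t0 and t, as in (7.13)."""
    fwd_var = (t * sigma_t ** 2 - t0 * sigma_t0 ** 2) / (t - t0)
    if fwd_var < 0.0:
        raise ValueError("total implied variance must not decrease with maturity")
    return math.sqrt(fwd_var)


def forward_start_call(s0, k, t0, t, r, sigma_t0, sigma_t):
    """FS_0 = S_0 * c(1, T - T_0, K, r, sigma(T_0, T)), as in (7.12)."""
    return s0 * bs_call(1.0, t - t0, k, r, forward_vol(t0, t, sigma_t0, sigma_t))


if __name__ == "__main__":
    # Hypothetical inputs: 20% implied volatility to T0 = 1, 25% to T = 2.
    print(forward_vol(1.0, 2.0, 0.20, 0.25))
    print(forward_start_call(100.0, 1.0, 1.0, 2.0, 0.03, 0.20, 0.25))
```

The guard in `forward_vol` reflects that (7.13) only defines a volatility when the total implied variance $T\hat\sigma_0^2(T, K)$ is non-decreasing in $T$.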
7.2.2 Case for random volatility, Musiela & Rutkowski
If there is a volatility smile, we can no longer derive the forward volatility uniquely from the implied volatility surface. To deal with this case, we assume that $S$ satisfies
$$dS_t = S_t\,(r\,dt + \sigma_t\,dW_t^*) \tag{7.14}$$
for some stochastic volatility process $\sigma$. Suppose that the volatility process $\sigma$ is given and that we want a closed-form expression for the price $C(S_{T_0}, T - T_0, KS_{T_0})$. To price the forward-start option at a time $t \in [0, T_0]$, we need to compute
$$FS_t = e^{-r(T_0 - t)}\,E^Q\left[C(S_{T_0}, T - T_0, KS_{T_0}) \mid \mathcal{F}_t\right] \tag{7.15}$$
which is difficult. The terminal condition helps: since the option is European, the terminal payoff is
$$FS_T = (S_T - KS_{T_0})^+ = S_{T_0}(Y - K)^+ = \hat S_T (Y - K)^+ \tag{7.16}$$
where $\hat S$ is given by $\hat S_t = S_{t \wedge T_0}$ for every $t \in [0, T]$ and where $Y = S_T / S_{T_0} = S_T / \hat S_T$. Define a measure $\hat Q$ equivalent to $Q$ by
dQ̂ ŜT B0 ŜT B0
ηT = =a = 0 where a = e−r(T −T0 ) (7.17)
dQ Ŝ0 BT0 S 0 BT0
In order to find the option price, we need the dynamics of the volatility process $\sigma$ under $\hat Q$. These can be found if we know the SDE governing the volatility process $\sigma$ under $Q$.
Let $\sigma$ have the following dynamics under $Q$
where
$$W_t = \tilde W_t - \int_0^t \rho_u \sigma_u \mathbf{1}_{[0, T_0]}(u)\,du \tag{7.25}$$
This is the model that Lucic and Kruse-Nögel used to incorporate the Heston model for stochastic volatility.
The problem is to value the options. Assume that we have two independent Brownian motions, $W^{(1)}$ and $W^{(2)}$, one driving the asset process and the other entering the variance process, under some martingale measure $Q$:
$$dS_t = r_t S_t\,dt + \sigma_t(v_t, S_t) S_t\,dW_t^{(1)}$$
$$dv_t = \alpha_t(v_t)\,dt + \beta_t(v_t)\left(\rho\,dW_t^{(1)} + \sqrt{1 - \rho^2}\,dW_t^{(2)}\right) \tag{7.29}$$
Assume some regularity conditions, and that the discounted asset price is a martingale. Within this general framework we can study many different models for stochastic volatility (Hull and White, Stein-Stein, Heston) as well as the local volatility model of Dupire. Let
$$P(s, t) = \exp\left(-\int_s^t r_u \mathbf{1}_{s \le u}\,du\right) \tag{7.30}$$
We can split the life of the forward-start option into two parts, one before the grant date, $t \in [0, T_0]$, and the other after it, $t \in [T_0, T]$. Fix a $t$ in the former part, and study the asset price process
$$S_u = S_0 \exp\left(\int_0^u \left(r_s - \tfrac{1}{2}\sigma_s^2\right) ds + \int_0^u \sigma_s\,dW_s^{(1)}\right), \quad u \in [0, T_0] \tag{7.34}$$
Do a change of numeraire
$$N_u = \frac{S_u^{T_0}}{P(T_0, u)} \tag{7.35}$$
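Then we can rewrite equation (7.33) as
$$V^{(1)} = N_t\,E^N\left[\left(\frac{S_T P(T_0, T)}{S_T^{T_0}} - K P(T_0, T)\right)^+ \,\Big|\, \mathcal{F}_t\right]$$
$$V^{(1)} = S_t P(T_0, T)\,E^N\left[\left(\exp\left(\int_{T_0}^T \left(r_s - \tfrac{1}{2}\sigma_s^2\right) ds + \int_{T_0}^T \sigma_s\,dW_s^{(1)}\right) - K\right)^+ \,\Big|\, \mathcal{F}_t\right] \tag{7.36}$$
where
$$\frac{dN}{dQ} = \frac{N_T P(0, T)}{N_0} = \exp\left(-\frac{1}{2}\int_0^T \sigma_s^2 \mathbf{1}_{s \le T_0}\,ds + \int_0^T \sigma_s \mathbf{1}_{s \le T_0}\,dW_s^{(1)}\right) \tag{7.37}$$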
Using equations (7.29) and (7.36), Girsanov's theorem gives the Brownian motions under the measure $N$:
$$W_u^{N(1)} = W_u^{(1)} - \int_0^u \sigma_s \mathbf{1}_{s \le T_0}\,ds$$
$$W_u^{N(2)} = W_u^{(2)} \tag{7.38}$$
This gives us two independent Brownian motions under $N$
(7.39)
By the risk-neutral valuation theorem, the value process scaled by the numeraire is a martingale under the measure induced by the numeraire. So $V_t^{(1)} / S_t$ is the value of a European call option, and the asset dynamics under the risk-neutral measure $Q$ are
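With this change of numeraire, the asset $\hat S_t$ is frozen until $T_0$, the grant date, that is, the time when the strike is set. If we use the payout in (7.28)
$$V^{(2)} = P(t, T)\,E^Q\left[\left(\exp\left(\int_{T_0}^T \left(r_s - \tfrac{1}{2}\sigma_s^2\right) ds + \int_{T_0}^T \sigma_s\,dW_s^{(1)}\right) - K\right)^+ \,\Big|\, \mathcal{F}_t\right] \tag{7.41}$$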
The pricing of a forward-start call option is thus reduced to pricing vanilla call options.
This is a case of (7.29). Under the risk-neutral measure $Q$, writing $V^{(m)}$, $m = 1, 2$, for the two types of payouts (7.27) and (7.28), we have the following dynamics.
$$d\hat S_t^{(m)} = r_t \hat S_t^{(m)} \mathbf{1}_{T_0 \le t}\,dt + \sqrt{v_t^{(m)}}\,\hat S_t^{(m)} \mathbf{1}_{T_0 \le t}\,dW^{(1)}$$
$$dv_t^{(m)} = \left(\lambda \bar v - \left(\lambda - \rho\eta(2 - m)\mathbf{1}_{t \le T_0}\right) v_t^{(m)}\right) dt + \eta\sqrt{v_t^{(m)}}\left(\rho\,dW^{(1)} + \sqrt{1 - \rho^2}\,dW^{(2)}\right) \tag{7.43}$$
The difference between (7.42) and the above is that the coefficients are now piecewise constant in time. So the original Heston procedure can be used over the intervals on which the coefficients are constant. Lucic follows the steps outlined in Gatheral's notes. Denote $\tau = T - t$ and use, as in Heston, the log scale of the asset process, $x = \log(S)$.
$$\frac{\partial C}{\partial \tau} = \lambda D, \quad C(0) = 0$$
$$\frac{\partial D}{\partial \tau} = \alpha_t - \beta_t^{(m)} D + \frac{\eta^2 D^2}{2}, \quad D(0) = 0 \tag{7.44}$$
Now we need to integrate (7.44) over $[0, \tau]$, which is done in two separate cases. If $\tau \in [0, T - T_0]$, the option is a vanilla call option: the valuation time is after $T_0$, so $S_{T_0}$ is known and all coefficients are constant. We get
$$D(m, k, \tau) = r_-^{(m)}\,\frac{1 - e^{-d^{(m)}\tau}}{1 - g^{(m)} e^{-d^{(m)}\tau}}$$
$$C(m, k, \tau) = \lambda\left(r_-^{(m)}\tau - \frac{2}{\eta^2}\log\left(\frac{1 - g^{(m)} e^{-d^{(m)}\tau}}{1 - g^{(m)}}\right)\right)$$
$$d^{(m)} = \sqrt{\left(\beta_0^{(m)}\right)^2 - 2\alpha_0\eta^2} \tag{7.45}$$
$$r_\pm^{(m)} = \frac{\beta_0^{(m)} \pm d^{(m)}}{\eta^2}$$
$$g^{(m)} = \frac{r_-^{(m)}}{r_+^{(m)}}$$
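For the genuine forward-start option, where $\tau > T - T_0$, we integrate over $[T - T_0, \tau]$, using $C$ and $D$ from (7.45) at $\tau = T - T_0$ as the initial conditions.
$$D(m, k, \tau) = \frac{2\beta_T^{(m)}}{\eta^2\left(1 + c\exp\left(\beta_T^{(m)}(\tau - T + T_0)\right)\right)}$$
$$C(m, k, \tau) = C(m, k, T - T_0) + \frac{2\beta_T^{(m)}\lambda(\tau - T + T_0)}{\eta^2} - \frac{2\lambda}{\eta^2}\log\left(\frac{1 + c\exp\left(\beta_T^{(m)}(\tau - T + T_0)\right)}{1 + c}\right) \tag{7.46}$$
$$c = \frac{2\beta_T^{(m)}}{\eta^2 D(m, k, T - T_0)} - 1$$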
This solves the calculations for the Fourier transform in the option price.
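The two-regime solution can be checked numerically. The sketch below implements (7.45) and (7.46) in Python and lets one verify by finite differences that $D$ and $C$ satisfy (7.44) in each regime and are continuous at $\tau = T - T_0$. It rests on assumptions that are not spelled out in the text: the closed form (7.46) is taken to correspond to a vanishing $\alpha$ for $\tau > T - T_0$, and the complex values of $\alpha_0$, $\beta_0^{(m)}$ and $\beta_T^{(m)}$ used in the demo are arbitrary, hypothetical inputs rather than the coefficients of any particular payout.

```python
import cmath


def d_c_vanilla(tau, alpha0, beta0, eta, lam):
    """D and C from (7.45), for tau in [0, T - T0]."""
    d = cmath.sqrt(beta0 ** 2 - 2.0 * alpha0 * eta ** 2)
    r_minus = (beta0 - d) / eta ** 2
    r_plus = (beta0 + d) / eta ** 2
    g = r_minus / r_plus
    e = cmath.exp(-d * tau)
    D = r_minus * (1.0 - e) / (1.0 - g * e)
    C = lam * (r_minus * tau
               - 2.0 / eta ** 2 * cmath.log((1.0 - g * e) / (1.0 - g)))
    return D, C


def d_c_forward(tau, tau0, alpha0, beta0, beta_T, eta, lam):
    """D and C from (7.46), for tau >= tau0 = T - T0."""
    D0, C0 = d_c_vanilla(tau0, alpha0, beta0, eta, lam)  # initial conditions
    c = 2.0 * beta_T / (eta ** 2 * D0) - 1.0
    s = tau - tau0
    e = cmath.exp(beta_T * s)
    D = 2.0 * beta_T / (eta ** 2 * (1.0 + c * e))
    C = (C0 + 2.0 * beta_T * lam * s / eta ** 2
         - 2.0 * lam / eta ** 2 * cmath.log((1.0 + c * e) / (1.0 + c)))
    return D, C


def central_diff(func, x, h=1e-5):
    """Central finite difference of a function returning a tuple (D, C)."""
    up, down = func(x + h), func(x - h)
    return tuple((u - d) / (2.0 * h) for u, d in zip(up, down))


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to exercise the formulas.
    alpha0, beta0, beta_T = -0.5 - 0.25j, 1.5 - 0.3j, 1.2
    eta, lam, tau0 = 0.4, 1.5, 0.5
    D, C = d_c_forward(0.8, tau0, alpha0, beta0, beta_T, eta, lam)
    dD, dC = central_diff(
        lambda t: d_c_forward(t, tau0, alpha0, beta0, beta_T, eta, lam), 0.8)
    print(abs(dD - (-beta_T * D + 0.5 * eta ** 2 * D ** 2)), abs(dC - lam * D))
```

Both printed residuals should be close to zero if the closed forms solve the Riccati system; the same check with `d_c_vanilla` and the full right-hand side of (7.44) covers the first regime.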
7.4 Kruse: FSO under Heston model
Here I am following the article (?, ?). A forward-start option starts at some point in the future, the determination time of the strike, when the strike is set equal to a proportion of the then-current price of the underlying. In the BS setting, one can easily transform the pricing problem for an FSO into a valuation problem for a vanilla option at the determination time. The option price at the determination time has only one stochastic component, the asset price. In a stochastic volatility model we add the randomness of the volatility of the underlying. This makes today's price depend on today's volatility and on the assumed SDE for the volatility process.
The payoff structure is
$$P_{FWS}(S(T), S(t^*)) = (S(T) - kS(t^*))^+ \tag{7.47}$$
where $k$ is the strike expressed as a proportion of the asset price at $t^*$. In Heston's model the asset has the following dynamics under the measure $Q$, and the variance under the measure $Q^\lambda$:
$$dS(t) = rS(t)\,dt + \sqrt{\nu(t)}\,S(t)\,dW_1(t) \tag{7.48}$$
$$d\nu(t) = \kappa(\theta - \nu(t))\,dt + \sigma\sqrt{\nu(t)}\,d\left(\rho W_1 + \sqrt{1 - \rho^2}\,W_2\right)(t) \tag{7.49}$$
assuming some regularity conditions. Note that $d\langle W_1, W_2\rangle_t = 0\,dt$, that is, $W_1$ and $W_2$ are uncorrelated. Heston showed that for a European vanilla option with payoff
$$(S(T) - K)^+$$
where the strike $K$ is known, Delta-Sigma hedging leads to a parabolic PDE, which is solved with the Fourier transform and Riccati equations and yields the probabilities $P_1$ and $P_2$. The option price at time $t \in [0, T]$ is
$$C(t, S(t), \nu(t)) = S(t)P_1(t, S(t), \nu(t), K) - Ke^{-r(T - t)}P_2(t, S(t), \nu(t), K) \tag{7.51}$$
7.4.1 Kruse’s solution
Choose a time $t$ before the determination time $t^*$. For a European call option, under some regularity conditions, the price is
$$C(t, \nu(t), S(t)) = S(t)\hat P_1(t, \nu(t)) - ke^{-r(T - t^*)}S(t)\hat P_2(t, \nu(t)) \tag{7.54}$$
where
$$\hat\kappa = \kappa - \rho\sigma, \qquad \hat\theta = \frac{\kappa\theta}{\kappa - \rho\sigma} \tag{7.61}$$
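The option price can be rewritten as an expectation under the measure $S$:
$$C_{FWS}(t, \nu(t), S(t)) = E^S\left[S(t)\left(1 - k\,\frac{S(t^*)}{S(T)}\right)^+ \,\Big|\, \mathcal{F}_t\right] \tag{7.62}$$
Remember that $t^*$ is the determination time. If the valuation time is $t < t^*$, the tower property of conditional expectation gives
$$C_{FWS}(t, \nu(t), S(t)) = S(t)\,E^S\left[E^S\left[\left(1 - k\,\frac{S(t^*)}{S(T)}\right)^+ \,\Big|\, \mathcal{F}_{t^*}\right] \,\Big|\, \mathcal{F}_t\right] \tag{7.63}$$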
where the inner expectation corresponds to the value of the call option at the determination time.
$$E^S\left[\frac{S(t^*)\,(S(T) - kS(t^*))^+}{S(T)} \,\Big|\, \mathcal{F}_{t^*}\right] = C_{FWS}(t^*, \nu(t^*), S(t^*)) \tag{7.65}$$
Assume that there is an equivalent martingale measure $Q$ and a bank account $B$ with a constant, deterministic interest rate. Then the risk-neutral valuation formula (RNVF) gives today's value of the payoff at time $T_2$:
$$V^{fwd}(t_0, S_0) = B(t_0)\,E^Q\left[\frac{1}{B(T_2)}\max\left(\frac{S(T_2)}{S(T_1)} - K^*, 0\right) \,\Big|\, \mathcal{F}(t_0)\right] \tag{7.73}$$
Following (Duffie et al., 1999), to obtain an affine process we take the logarithm of the asset values. Using (7.72) and the fact that the logarithm of a quotient is the difference of the logarithms, we get
$$V^{fwd}(t_0, S_0) = \frac{B(t_0)}{B(T_2)}\,E^Q\left[\max\left(e^{x(T_1, T_2)} - K^*, 0\right) \,\Big|\, \mathcal{F}(t_0)\right] \tag{7.74}$$
where $x(T_1, T_2) = \log(S(T_2)) - \log(S(T_1))$. Now we can derive the characteristic function
$$\phi_x(u) \equiv \phi_x(u, t_0, T_2) = E^Q\left[e^{iu(\log(S(T_2)) - \log(S(T_1)))} \,\Big|\, \mathcal{F}(t_0)\right] \tag{7.75}$$
Using the property of iterated expectations, the tower property, we can condition (7.75) on the time $T_1$ and write the characteristic function as
$$\phi_x(u) = E^Q\left\{E^Q\left[e^{iu(\log(S(T_2)) - \log(S(T_1)))} \,\Big|\, \mathcal{F}(T_1)\right] \,\Big|\, \mathcal{F}(t_0)\right\} \tag{7.76}$$
where we will derive $\psi_X(u, T_1, T_2)$ for two different models, the Black-Scholes model and the Heston model.
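where $T_2 - T_1 = \Delta T$. Inserting the above in (7.78) we get
$$\phi_x(u) = E^Q\left[e^{\left(r - \frac{1}{2}\sigma^2\right)iu\Delta T - \frac{1}{2}\sigma^2 u^2 \Delta T} \,\Big|\, \mathcal{F}(t_0)\right]$$
$$\phi_x(u) = \exp\left(\left(r - \tfrac{1}{2}\sigma^2\right)iu\Delta T - \tfrac{1}{2}\sigma^2 u^2 \Delta T\right) \tag{7.80}$$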
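and the last line is the characteristic function of a normally distributed random variable with mean $\left(r - \frac{1}{2}\sigma^2\right)\Delta T$ and variance $\sigma^2 \Delta T$. We make the normal random variable standard by subtracting its mean and dividing by its standard deviation, and we get
$$e^{\left(r - \frac{1}{2}\sigma^2\right)\Delta T + \frac{1}{2}\sigma^2\Delta T}\,\frac{1}{\sqrt{2\pi}}\int_a^\infty e^{-\frac{1}{2}\left(x - \sigma\sqrt{\Delta T}\right)^2}\,dx = e^{r\Delta T}\left[1 - \Phi\left(a - \sigma\sqrt{\Delta T}\right)\right] \tag{7.86}$$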
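Using the fact that the standard normal distribution is symmetric around zero, we get (taking $t_0 = 0$)
$$V^{fwd}(t_0, S_0) = e^{-rT_1}\,\Phi\left(\sigma\sqrt{\Delta T} - a\right) - K^* e^{-rT_2}\,\Phi(-a)$$
with $d_1 = \sigma\sqrt{\Delta T} - a$ and $d_2 = -a$, that is,
$$d_1 = \frac{\log\frac{1}{K^*} + \left(r + \frac{1}{2}\sigma^2\right)\Delta T}{\sigma\sqrt{\Delta T}}, \qquad d_2 = \frac{\log\frac{1}{K^*} + \left(r - \frac{1}{2}\sigma^2\right)\Delta T}{\sigma\sqrt{\Delta T}} \tag{7.87}$$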
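The closed form can be cross-checked against a direct numerical evaluation of the discounted expectation in (7.74). The Python sketch below assumes $t_0 = 0$, $B(t) = e^{rt}$ and the discounted form $e^{-rT_1}\Phi(d_1) - K^* e^{-rT_2}\Phi(d_2)$; the function names, the grid size and the truncation at eight standard deviations are hypothetical choices made for the illustration.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fwd_start_bs(k_star, r, sigma, t1, t2):
    """Closed form: value at t0 = 0 of max(S(T2)/S(T1) - K*, 0) paid at T2."""
    dt = t2 - t1
    sd = sigma * math.sqrt(dt)
    d1 = (math.log(1.0 / k_star) + (r + 0.5 * sigma ** 2) * dt) / sd
    d2 = d1 - sd
    return (math.exp(-r * t1) * norm_cdf(d1)
            - k_star * math.exp(-r * t2) * norm_cdf(d2))


def fwd_start_quadrature(k_star, r, sigma, t1, t2, n=20000, z_max=8.0):
    """Same value by a midpoint rule over the standard normal density.

    x(T1, T2) is normal with mean (r - sigma^2/2)*dT and variance sigma^2*dT.
    """
    dt = t2 - t1
    mean = (r - 0.5 * sigma ** 2) * dt
    sd = sigma * math.sqrt(dt)
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n):
        z = -z_max + (i + 0.5) * h
        payoff = max(math.exp(mean + sd * z) - k_star, 0.0)
        total += payoff * math.exp(-0.5 * z * z)
    return math.exp(-r * t2) * total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money performance option, T1 = 1, T2 = 2.
    print(fwd_start_bs(1.0, 0.03, 0.2, 1.0, 2.0))
    print(fwd_start_quadrature(1.0, 0.03, 0.2, 1.0, 2.0))
```

The two numbers should agree to several decimals; note that neither depends on $S_0$, because the payoff only involves the ratio $S(T_2)/S(T_1)$.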
Under the Heston model the task is to find $\psi_X(u, T_1, T_2)$. Now the argument $u$ is a two-dimensional vector, $\mathbf{u}^T = [u, 0]^T$, with the second component set to zero: we want the distribution of the asset price at maturity, not of the variance. Of course the variance influences the asset price, but this is already captured in the asset price. Using the fact that the Heston model belongs to the class of affine jump diffusions (AJD), we know that its characteristic function can be written as
where $A$, $B$, $C$ are complex-valued functions. In the Heston model, the variance follows a CIR model, a square-root diffusion with mean reversion and no jumps. In the Heston model $B(u, \tau) = iu$, and in (7.89) the term $A(u, \tau)$ is not stochastic. I let $\tau = T - t$ denote the time to maturity. This simplifies (7.88) to the following
h i
φx (u) = eA(u,τ )+r(T2 −T1 ) EQ eC(u,τ )v(T1 ) ψX (u, T1 , T2 ) F(t0 ) (7.90)
This formula does not depend on the asset price $S(t)$ or on the log asset price $X(t) = \log(S(t))$. In order to get an affine system we use the log of the asset price. The idea behind solving (7.90) is to use the moment-generating function of the CIR model. (Oosterlee & Grzelak, 2019) write the dynamics of the variance in the Heston model as
$$dv(t) = \kappa(\bar v - v(t))\,dt + \gamma\sqrt{v(t)}\,dW_v(t) \tag{7.91}$$
This is the same as Heston's representation,
$$dv(t) = \kappa(\theta - v(t))\,dt + \sigma\sqrt{v(t)}\,dW_v(t) \tag{7.92}$$
with $\bar v = \theta$ and $\gamma = \sigma$. The moment generating function has the following form.
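$$E^Q\left[e^{uv(t)} \,\Big|\, \mathcal{F}(t_0)\right] = \left(\frac{1}{1 - 2uc(t, t_0)}\right)^{\frac{1}{2}\delta}\exp\left(\frac{uc(t, t_0)\kappa(t, t_0)}{1 - 2uc(t, t_0)}\right) \tag{7.93}$$
with the following parameters
$$c(t, t_0) = \frac{\gamma^2}{4\kappa}\left(1 - e^{-\kappa(t - t_0)}\right)$$
$$\delta = \frac{4\kappa\bar v}{\gamma^2} \tag{7.94}$$
$$\kappa(t, t_0) = \frac{4\kappa v_0 e^{-\kappa(t - t_0)}}{\gamma^2\left(1 - e^{-\kappa(t - t_0)}\right)}$$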
$$M_{v(t)}(u) = \sum_{k=0}^{\infty}\frac{1}{k!}\,e^{-\frac{\kappa(t, t_0)}{2}}\left(\frac{\kappa(t, t_0)}{2}\right)^k\int_0^\infty e^{uc(t, t_0)x} f_{\chi^2(\delta + 2k)}(x)\,dx \tag{7.97}$$
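Adding and subtracting an exponential term gives
$$M_{v(t)}(u) = \left(\frac{1}{1 - 2uc(t, t_0)}\right)^{\frac{1}{2}\delta}\exp\left(\frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))} - \frac{\kappa(t, t_0)}{2}\right)\cdot\sum_{k=0}^{\infty}\frac{1}{k!}\,e^{-\frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))}}\left(\frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))}\right)^k \tag{7.99}$$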
The expression under the sum is the probability $P[Y = k]$, that is, the probability mass function, of a Poisson distributed random variable $Y$. With parameter $\hat\alpha$ it is
$$P[Y = k] = \frac{1}{k!}\,e^{-\hat\alpha}\hat\alpha^k \tag{7.100}$$
In our example $\hat\alpha = \frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))}$, and we get
$$M_{v(t)}(u) = \left(\frac{1}{1 - 2uc(t, t_0)}\right)^{\frac{1}{2}\delta}\exp\left(\frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))} - \frac{\kappa(t, t_0)}{2}\right)\sum_{k=0}^{\infty}P[Y = k] \tag{7.101}$$
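but as the sum of all probabilities is equal to one, the last sum drops out, and we get
$$M_{v(t)}(u) = \left(\frac{1}{1 - 2uc(t, t_0)}\right)^{\frac{1}{2}\delta}\exp\left(\frac{\kappa(t, t_0)}{2(1 - 2uc(t, t_0))} - \frac{\kappa(t, t_0)}{2}\right) \tag{7.102}$$
Inserting this in (7.90), with $u$ replaced by $C(u, \tau)$ from the solution of the coupled Riccati equations, we get $\phi_X(u, T_1, T_2)$:
$$\phi_x(u) = \exp\left(A(u, \tau) + r\tau + \frac{C(u, \tau)c(T_1, t_0)\kappa(T_1, t_0)}{1 - 2C(u, \tau)c(T_1, t_0)}\right)\cdot\left(\frac{1}{1 - 2C(u, \tau)c(T_1, t_0)}\right)^{\frac{1}{2}\delta} \tag{7.103}$$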
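The step from the Poisson-weighted series (7.97) to the closed form (7.102) can be verified numerically. The Python sketch below evaluates both for a real argument $u$ with $2uc(t, t_0) < 1$, using the fact that a central $\chi^2$ variable with $\nu$ degrees of freedom has moment generating function $(1 - 2s)^{-\nu/2}$. The function names, the number of series terms and the parameter values are hypothetical; the complex argument $C(u, \tau)$ needed in (7.103) is not covered by this simple check.

```python
import math


def cir_params(kappa, vbar, gamma, v0, dt):
    """c, delta and the non-centrality parameter from (7.94); dt = t - t0."""
    decay = math.exp(-kappa * dt)
    c = gamma ** 2 / (4.0 * kappa) * (1.0 - decay)
    delta = 4.0 * kappa * vbar / gamma ** 2
    nc = 4.0 * kappa * v0 * decay / (gamma ** 2 * (1.0 - decay))
    return c, delta, nc


def cir_mgf_closed(u, c, delta, nc):
    """Closed-form moment generating function (7.102), real u with 2*u*c < 1."""
    q = 1.0 - 2.0 * u * c
    if q <= 0.0:
        raise ValueError("the MGF is only finite for 2*u*c < 1")
    return q ** (-0.5 * delta) * math.exp(nc / (2.0 * q) - 0.5 * nc)


def cir_mgf_series(u, c, delta, nc, n_terms=200):
    """Truncated Poisson mixture of central chi-square MGFs, as in (7.97)."""
    q = 1.0 - 2.0 * u * c
    weight = math.exp(-0.5 * nc)  # Poisson weight for k = 0
    total = 0.0
    for k in range(n_terms):
        total += weight * q ** (-(0.5 * delta + k))
        weight *= 0.5 * nc / (k + 1)  # next Poisson weight
    return total


if __name__ == "__main__":
    # Hypothetical CIR parameters: kappa, vbar, gamma, v0, horizon.
    c, delta, nc = cir_params(1.5, 0.04, 0.3, 0.05, 1.0)
    print(cir_mgf_closed(-2.0, c, delta, nc), cir_mgf_series(-2.0, c, delta, nc))
```

A further sanity check is the mean: $c(t, t_0)(\delta + \kappa(t, t_0))$ should equal $v_0 e^{-\kappa(t - t_0)} + \bar v\,(1 - e^{-\kappa(t - t_0)})$, the known conditional mean of the CIR process.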
7.6 Appendix
I reproduce the variance dynamics in the Heston model (CIR model)
$$dv(t) = \kappa(\bar v - v(t))\,dt + \gamma\sqrt{v(t)}\,dW_v(t) \tag{7.104}$$
The process $v(t)|v(s)$ with $0 < s < t$ under the CIR dynamics is distributed as $c(t, s)$ times a non-central $\chi^2$ random variable $\chi^2(\delta, \kappa(t, s))$, where $\delta$ is the number of degrees of freedom and $\kappa(t, s)$ is the non-centrality parameter. This gives us
$$v(t)|v(s) \sim c(t, s)\,\chi^2(\delta, \kappa(t, s)), \quad t > s > 0$$
with
$$c(t, s) = \frac{1}{4\kappa}\gamma^2\left(1 - e^{-\kappa(t - s)}\right)$$
$$\delta = \frac{4\kappa\bar v}{\gamma^2} \tag{7.105}$$
$$\kappa(t, s) = \frac{4\kappa v(s)e^{-\kappa(t - s)}}{\gamma^2\left(1 - e^{-\kappa(t - s)}\right)}$$
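The cumulative distribution function, CDF, is
$$F_{v(t)}(x) = Q[v(t) \le x] = Q\left[\chi^2(\delta, \kappa(t, s)) \le \frac{x}{c(t, s)}\right] = F_{\chi^2(\delta, \kappa(t, s))}\left(\frac{x}{c(t, s)}\right) \tag{7.106}$$
where
$$F_{\chi^2(\delta, \kappa(t, s))}(y) = \sum_{k=0}^{\infty}\exp\left(-\frac{\kappa(t, s)}{2}\right)\frac{\left(\frac{\kappa(t, s)}{2}\right)^k}{k!}\,\frac{\gamma\left(k + \frac{\delta}{2}, \frac{y}{2}\right)}{\Gamma\left(k + \frac{\delta}{2}\right)} \tag{7.107}$$
and the lower incomplete Gamma function $\gamma(a, z)$ and the Gamma function $\Gamma(z)$ are
$$\gamma(a, z) = \int_0^z t^{a - 1}e^{-t}\,dt, \qquad \Gamma(z) = \int_0^\infty t^{z - 1}e^{-t}\,dt \tag{7.108}$$
and the probability density function, pdf, is
$$f_{\chi^2(\delta, \kappa(t, s))}(y) = \frac{1}{2}\,e^{-\frac{1}{2}(y + \kappa(t, s))}\left(\frac{y}{\kappa(t, s)}\right)^{\frac{1}{2}\left(\frac{\delta}{2} - 1\right)}B_{\frac{\delta}{2} - 1}\left(\sqrt{\kappa(t, s)\,y}\right) \tag{7.109}$$
where $B_a$ denotes the modified Bessel function of the first kind of order $a$.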
7.7 Bibliographical notes
The starting point is Rubinstein's article from 1991. A good book that describes the forward-start problem is (Musiela & Rutkowski, 2005). Thereafter I used two articles, (Lucic, 2003) and (?, ?), that independently produced closed-form expressions under the Heston model. Another article is (Ahlip & Rutkowski, 2009), which also sets up the framework for stochastic interest rates; I only reproduced their result with a deterministic interest rate. A recent book, (Oosterlee & Grzelak, 2019), makes use of the moment generating function to derive the forward-start price under the Black-Scholes and the Heston models.
References
Ahlip, R., & Rutkowski, M. (2009). Forward start options under stochastic
volatility and stochastic interest rates. International Journal of Theoreti-
cal and Applied Finance, 12 , 209-225.
Björk, T. (2009). Arbitrage theory in continuous time. Oxford university press.
Breeden, D. T. (1979). An intertemporal asset pricing model with stochastic
consumption and investment opportunities. Journal of Financial Eco-
nomics (JFE), 7 (3).
Carr, P., & Madan, D. (1999). Option valuation using the fast Fourier transform.
Journal of Computational Finance, 2 , 61-73.
Carr, P., & Wu, L. (2004). Time-changed Lévy processes and option pricing.
Journal of Financial Economics, 71 (1), 113–141.
Černý, A. (2004). Introduction to fast Fourier transform in finance. The Journal
of Derivatives, 12 (1), 73–88.
Derman, E., & Kani, I. (1994). Riding on a smile. Risk , 7 , 32–39.
Duffie, D., Pan, J., & Singleton, K. (1999). Transform analysis and asset pricing
for affine jump-diffusions. Capital Markets eJournal .
Dupire, B., et al. (1994). Pricing with a smile. Risk , 7 (1), 18–20.
Einstein, A. (1905). Über die von der molekularkinetischen theorie der
wärme geforderte bewegung von in ruhenden flüssigkeiten suspendierten
teilchen. Annalen der Physik , 322 (8), 549-560. Retrieved from https://
onlinelibrary.wiley.com/doi/abs/10.1002/andp.19053220806 doi:
https://doi.org/10.1002/andp.19053220806
Evans, L. C. (2012). An introduction to stochastic differential equations
(Vol. 82). American Mathematical Soc.
Gatheral, J. (2011). The volatility surface: a practitioner’s guide (Vol. 357).
John Wiley & Sons.
Gilli, M., Maringer, D., & Schumann, E. (2019). Numerical methods and opti-
mization in finance (Second ed.). Waltham, MA, USA: Elsevier/Academic
Press. Retrieved from http://www.enricoschumann.net/NMOF/ (ISBN
978-0128150658)
Heston, S. (1993). A closed-form solution for options with stochastic volatility
with applications to bond and currency options. Review of Financial
Studies, 6 , 327-343.
Iacus, S. M. (2011). Option pricing and estimation of financial models with r.
John Wiley & Sons.
Kendall, M. G., et al. (1946). The advanced theory of statistics. (No. 2nd Ed).
Charles Griffin and Co., Ltd., London.
Lindström, E., Madsen, H., & Nielsen, J. N. (2015). Statistics for finance. CRC
Press.
Lucic, V. (2003). Forward-start options in stochastic volatility model. Wilmott,
2003 , 72-75.
Matsuda, K. (2004). Introduction to option pricing with fourier transform:
Option pricing with exponential lévy models. white paper. City University
of New York .
Musiela, M., & Rutkowski, M. (2005). Martingale methods in financial modelling.
Springer-Verlag, Berlin, Heidelberg, New York.
Oosterlee, C. W., & Grzelak, L. A. (2019). Mathematical modeling and compu-
tation in finance: With exercises and python and matlab computer codes.
World Scientific.
Pascucci, A. (2011). Pde and martingale methods in option pricing. Springer
Science & Business Media.
Singleton, K. J. (2001). Estimation of affine asset pricing models using the
empirical characteristic function. Journal of Econometrics, 102 (1), 111–
141.