Further Results With OU and CIR Processes
A. Krul
Adriaan.Krul@INGbank.com
tw1249266
Literature report
10-12-07 – 20-2-2008
Contents

1 Introduction
  1.1 Introduction
2 Stochastic convenience yield model following an Ornstein-Uhlenbeck process
3 Stochastic convenience yield model following a Cox-Ingersoll-Ross process
4 Kalman Filter
  4.1 Introduction
  4.2 Kalman Algorithm
  4.3 Introductory example
    4.3.1 Simulation results
    4.3.2 How to choose R
5 The Kalman Filter for the Ornstein-Uhlenbeck process
6 The Kalman Filter for the Cox-Ingersoll-Ross process
7 Conclusion
  7.1 Further research
Chapter 1
Introduction
1.1 Introduction
A futures contract is an agreement between two parties to buy or sell an asset at a certain time in the future for a certain price. Nowadays, traders calibrate the so-called convenience yield (CY), δt, from market data every two days, using futures contracts. The convenience yield is the benefit or premium associated with holding the underlying product or physical good rather than the contract or derivative product. Sometimes, due to irregular market movements such as an inverted market (when short-term contract prices are higher than long-term ones), holding the underlying good or security may become more profitable than owning the contract or derivative instrument, due to its relative scarcity versus high demand.
An example would be purchasing physical barrels of oil rather than futures contracts. Should there be a sudden shock in which the demand for oil increases, the difference between the first purchase price of the oil and the price after the shock would be the convenience yield. The market shows, however, that the CY behaves stochastically and has a mean-reverting property, see e.g. [5]. The CY may therefore be modelled by an Ornstein-Uhlenbeck (OU) process. The disadvantage of this process is that it allows a negative CY, which can result in cash and carry arbitrage possibilities. To prevent this we assume that the CY follows a Cox-Ingersoll-Ross (CIR) model. For both processes, the spot price follows a geometric Brownian motion.
For commodities, the spot price and the CY cannot be observed directly from the market. The non-observability of these state variables makes modelling both stochastic processes (OU and CIR) difficult. Since futures prices of commodities are widely observed and traded on the market, we use a method that links these actual observations with the latent state variables. This method is known as the Kalman Filter (KF). The main idea of the KF is to use observable (futures) variables to reconstruct the values of the non-observable ones (spot prices and CY). We use futures on light crude oil for the period from 01-02-2002 until 25-01-2008.
To apply the Kalman Filter, we assume an affine form for the closed-form solution of the futures prices for both the OU and the CIR process. Under this assumption, solving the stochastic differential equations of both processes is much easier, and having the closed-form solution of the futures prices makes the use of the KF considerably simpler.
The aim of this thesis is to implement the KF for both the OU and the CIR process and to compare the results with market data. Other commodities are tested numerically, and we will try to price options on commodities using this method. The outline of this report is as follows: in Chapters 2 and 3 analytical results are given for the OU and CIR processes, respectively. Chapter 4 explains the KF in detail, starting with an introductory example. Chapter 5 applies the KF to the OU process and discusses the numerical results. In Chapter 6 the KF is set up for the CIR process, but not yet tested numerically. The report ends with a conclusion and suggestions for further research.
Chapter 2
Stochastic convenience yield model following an Ornstein-Uhlenbeck process

In this chapter analytical results are given for the stochastic CY when it follows an Ornstein-Uhlenbeck process. The assumption is that the spot price St follows a geometric Brownian motion, i.e.,

dSt = µSt dt + ηSt dWt, (2.1)

where µ is the drift term, η is the volatility term, and Wt a standard Brownian motion. The form of the convenience yield δt is a result of the studies of Gibson and Schwartz (see [5]), who find empirical evidence that the convenience yield has a mean-reverting property, i.e.,

dδt = k(α − δt) dt + σ dZt. (2.2)

In (2.2), α is the long range mean to which δt tends to revert, k the speed of adjustment, σ the volatility term and Zt a standard Brownian motion. Here dWt dZt = ρ dt. The solution of (2.1) is given by
ST = St exp{ (µ − ½η²)(T − t) + η ∫_t^T dWs }. (2.3)

This implies

δt e^{kt} = δ0 + kα ∫_0^t e^{ks} ds + σ ∫_0^t e^{ks} dZs.
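As an illustration, the spot price and convenience yield dynamics above can be simulated jointly with a simple Euler scheme. This is only a sketch: the function name and all parameter values below are hypothetical, not calibrated values from this report.

```python
import numpy as np

def simulate_ou_gbm(S0=100.0, d0=0.05, mu=0.1, eta=0.3,
                    k=1.5, alpha=0.05, sigma=0.4, rho=0.8,
                    T=1.0, n=252, seed=0):
    """Euler scheme for dS = mu*S dt + eta*S dW and
    d(delta) = k*(alpha - delta) dt + sigma dZ, with dW dZ = rho dt."""
    rng = np.random.default_rng(seed)
    dt = T / n
    S = np.empty(n + 1)
    delta = np.empty(n + 1)
    S[0], delta[0] = S0, d0
    for i in range(n):
        dW = np.sqrt(dt) * rng.standard_normal()
        # build dZ correlated with dW: corr(dW, dZ) = rho
        dZ = rho * dW + np.sqrt((1 - rho**2) * dt) * rng.standard_normal()
        S[i + 1] = S[i] + mu * S[i] * dt + eta * S[i] * dW
        delta[i + 1] = delta[i] + k * (alpha - delta[i]) * dt + sigma * dZ
    return S, delta
```

Note that nothing prevents the simulated convenience yield from going negative here; this is exactly the drawback that motivates the CIR variant of Chapter 3.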
For the future price we consider the case YT ≡ ST and we compute Vt[ST]. The current value of a claim on a future delivery of the commodity at the future date T is

Vt[ST] = St exp{ [ −α + (σλ − σηρ)/k + ½(σ/k)² ](T − t)
        − (1/k)[ δt − α + (σλ − σηρ)/k + (σ/k)² ](1 − θ)
        + (1/(2k²))(σ²/(2k))(1 − θ²) }, (2.11)

with θ = e^{−k(T−t)}.
Proof. From (2.7) and (2.3) we get

e^{−r(T−t)} ST = St exp{ (µ − r − ½η²)(T − t) + η ∫_t^T dWs }. (2.12)
µ̃ = E[z] = −( ½η² + α − σλ/k )(T − t) + (1/k)( α − δt − σλ/k )(1 − θ). (2.15)
For calculating Var[z] we notice that for a general process g ∈ L²[t, T] it follows (Itô isometry) that

E[ ( ∫_t^T g(s) dWs )² ] = ∫_t^T E[g(s)²] ds.
σ̃² = E[z²] − (E[z])²
   = ( η² − 2σηρ/k + σ²/k² )(T − t)
   + 2( σηρ/k² − σ²/k³ )(1 − θ)
   + (1/k²)(σ²/(2k))(1 − θ²), (2.22)
where θ = e−k(T −t) as before. From the standard formula of the expected value of a lognormal random
variable we get
Vt[ST] = St exp( µ̃ + ½σ̃² ). (2.23)
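As a numerical sanity check, the exponent of (2.11) should coincide with µ̃ + ½σ̃² from (2.15) and (2.22). The sketch below verifies this identity for one arbitrary (hypothetical) choice of parameter values:

```python
import math

# Arbitrary illustrative parameters, not calibrated values.
k, alpha, sigma, eta, rho = 1.4, 0.07, 0.4, 0.36, 0.8
lam, delta_t, tau = -0.02, 0.05, 0.5     # lam plays the role of lambda
theta = math.exp(-k * tau)

# Exponent of (2.11).
e211 = ((-alpha + (sigma * lam - sigma * eta * rho) / k + 0.5 * (sigma / k) ** 2) * tau
        - (1 / k) * (delta_t - alpha + (sigma * lam - sigma * eta * rho) / k
                     + (sigma / k) ** 2) * (1 - theta)
        + 0.5 * (1 / k ** 2) * (sigma ** 2 / (2 * k)) * (1 - theta ** 2))

# mu_tilde from (2.15) and sigma_tilde^2 from (2.22).
mu_t = (-(0.5 * eta ** 2 + alpha - sigma * lam / k) * tau
        + (1 / k) * (alpha - delta_t - sigma * lam / k) * (1 - theta))
sig2 = ((eta ** 2 - 2 * sigma * eta * rho / k + sigma ** 2 / k ** 2) * tau
        + 2 * (sigma * eta * rho / k ** 2 - sigma ** 2 / k ** 3) * (1 - theta)
        + (1 / k ** 2) * (sigma ** 2 / (2 * k)) * (1 - theta ** 2))

assert abs(e211 - (mu_t + 0.5 * sig2)) < 1e-12
```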
This follows from the absence of risk-free arbitrage opportunities, i.e. we must have
Vt [ST − F ] = 0. (2.25)
In order to find the PDE which F(S, δ, τ) satisfies, we use the Feynman-Kac theorem.
subject to
A similar result exists for V(S, δ, t) := Vt[YT]; V(S, δ, t) satisfies the PDE
Chapter 3
Stochastic convenience yield model following a Cox-Ingersoll-Ross process

In this chapter, instead of (2.2), we assume that the convenience yield follows a Cox-Ingersoll-Ross (CIR) process, and the spot price follows a geometric Brownian motion with a time-varying volatility which is proportional to the square root of the instantaneous convenience yield level, i.e.,

dSt/St = (µ − δt) dt + η √δt dWt,
dδt = k(α − δt) dt + σ √δt dZt. (3.1)

In (3.1), S is the price of the underlying, δ is the CY, µ is the drift term, η is the volatility term of dSt/St, Wt a standard Brownian motion, α is the long range mean to which δt tends to revert, k the speed of adjustment, σ the volatility term of dδt, and Zt a standard Brownian motion. Here dWt dZt = ρ dt. The reason why we assume that the CY follows a CIR process is nonnegativity. A negative CY would make the forward prices grow at more than the interest rate and provide a kind of cash and carry arbitrage through buying the spot commodity and selling a forward. The CIR process excludes a negative CY.
The expectation of δt satisfies

(d/dt) E[δt] = k( α − E[δt] )  ⇒  (d/dt)( e^{kt} E[δt] ) = e^{kt}( k E[δt] + (d/dt) E[δt] ) = e^{kt} kα.

This leads to

e^{kt} E[δt] − δ0 = kα ∫_0^t e^{ku} du = α( e^{kt} − 1 )  ⇒  E[δt] = α + e^{−kt}( δ0 − α ) = e^{−kt} δ0 + (1 − e^{−kt}) α.

Remark. We see that if δ0 = α then E[δt] = α for all t. If δ0 ≠ α, then δt exhibits mean reversion, i.e. lim_{t→∞} E[δt] = α.
For the variance we first calculate dδt² via Itô's formula. Define f(x) = x². We have

dδt² = df(δt)
     = f′(δt) dδt + ½ f″(δt) dδt dδt
     = 2δt[ k(α − δt) dt + σ√δt dZt ] + [ k(α − δt) dt + σ√δt dZt ]²
     = 2αkδt dt − 2kδt² dt + 2σδt^{3/2} dZt + σ²δt dt
     = (2kα + σ²) δt dt − 2kδt² dt + 2σδt^{3/2} dZt.
This leads to

δt² = δ0² + (2kα + σ²) ∫_0^t δu du − 2k ∫_0^t δu² du + 2σ ∫_0^t δu^{3/2} dZu,

and hence

(d/dt) E[δt²] = (2kα + σ²) E[δt] − 2k E[δt²].
From this we get

(d/dt)( e^{2kt} E[δt²] ) = e^{2kt}( 2k E[δt²] + (d/dt) E[δt²] ) = e^{2kt} (2kα + σ²) E[δt].

It follows that

E[δt²] = ασ²/(2k) + α² + (δ0 − α)( σ²/k + 2α ) e^{−kt} + (δ0 − α)² e^{−2kt} − (σ²/k)( δ0 − α/2 ) e^{−2kt}.
We finally have

Var[δt] = E[δt²] − (E[δt])² = (σ²/k) δ0 ( e^{−kt} − e^{−2kt} ) + ( ασ²/(2k) )( 1 − e^{−kt} )².
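The moment formulas above can be checked against each other numerically: the variance expression must equal E[δt²] − (E[δt])². A small sketch (parameter values are arbitrary, not calibrated):

```python
import math

# Arbitrary illustrative CIR parameters.
k, alpha, sigma, d0, t = 1.4, 0.07, 0.4, 0.12, 0.8
ekt, e2kt = math.exp(-k * t), math.exp(-2 * k * t)

m1 = alpha + (d0 - alpha) * ekt                      # E[delta_t]
m2 = (alpha * sigma ** 2 / (2 * k) + alpha ** 2
      + (d0 - alpha) * (sigma ** 2 / k + 2 * alpha) * ekt
      + (d0 - alpha) ** 2 * e2kt
      - (sigma ** 2 / k) * (d0 - alpha / 2) * e2kt)  # E[delta_t^2]

# Variance of the CIR process started at d0.
var = ((sigma ** 2 / k) * d0 * (ekt - e2kt)
       + (alpha * sigma ** 2 / (2 * k)) * (1 - ekt) ** 2)

assert abs(var - (m2 - m1 ** 2)) < 1e-12
```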
Again, via the same explanation as given in Chapter 2, we insert the market price of risk into (3.3). This leads to the two joint stochastic processes
Assuming (see e.g. [2]) that this PDE has an affine-form solution

F(S, δ, τ) = S e^{A(τ) − B(τ)δ},  A(0) = 0,  B(0) = 0, (3.7)
To find B(τ) and A(τ) we notice that (by substituting (3.7) into (3.6))

½σ²B² + (k − ρησ)B − 1 + Bτ = 0,
r + (λ − kα)B − Aτ = 0.

For simplicity first write a₁ = ½σ² and a₂ = k − ρησ, which leads to f(B) = Bτ = −a₁B² − a₂B + 1. Now we want to factorize the function f(B):

f(B)/(−a₁) = B² + (a₂/a₁)B − 1/a₁ = (B + ψ)(B − γ).

From this we get that (ψ − γ) = a₂/a₁ and ψγ = 1/a₁. From these two relations it follows that we have to solve

a₁γ² + a₂γ − 1 = 0.

This gives

γ₁,₂ = −a₂/(2a₁) ± √(a₂² + 4a₁)/(2a₁) = (−k₂ ± k₁)/(2a₁), (3.8)

where k₁ = √(k₂² + 2σ²) and k₂ = k − ρησ, as in (3.15). From this point one expects that both γ's should work; for the convenience of the reader we check this below. We have

Bτ + a₁(B + ψ)(B − γ) = 0,
dB/((B + ψ)(B − γ)) + a₁ dτ = 0,
(1/(γ + ψ)) [ −1/(B + ψ) + 1/(B − γ) ] dB + a₁ dτ = 0,
(1/(γ + ψ)) [ −ln|B + ψ| + ln|B − γ| ] + a₁τ = c,
ln( |B − γ| / |B + ψ| ) = −(γ + ψ)a₁τ + (γ + ψ)c,

so that

B(τ) = ( γ + ψ c̃ e^{−(γ+ψ)a₁τ} ) / ( 1 − c̃ e^{−(γ+ψ)a₁τ} ).

With the initial condition B(0) = 0 we get c̃ = −γ/ψ. Substituting c̃ and calculating for γ = γ₁ (which refers to the plus sign) gives

B(τ) = γ( 1 − e^{−(γ+ψ)a₁τ} ) / ( 1 + (γ/ψ) e^{−(γ+ψ)a₁τ} )
     = ( 1 − e^{−(γ+ψ)a₁τ} ) / ( k₂ + a₁γ( 1 + e^{−(γ+ψ)a₁τ} ) )
     = 2( 1 − e^{−(γ+ψ)a₁τ} ) / ( 2k₂ + (−k₂ + k₁)( 1 + e^{−(γ+ψ)a₁τ} ) )
     = 2( 1 − e^{−(γ+ψ)a₁τ} ) / ( k₁ + k₂ + (k₁ − k₂) e^{−(γ+ψ)a₁τ} ), (3.9)

where we used ψγ = 1/a₁ and a₁ψ = k₂ + a₁γ in the second step, and a₁γ₁ = (−k₂ + k₁)/2 in the third. Now calculate

(γ + ψ)a₁ = ( (−k₂ + k₁)/(2a₁) + k₂/a₁ + (−k₂ + k₁)/(2a₁) ) a₁ = k₁,

so that e^{−(γ+ψ)a₁τ} = e^{−k₁τ}, (3.10)

and hence

B(τ) = 2( 1 − e^{−k₁τ} ) / ( k₁ + k₂ + (k₁ − k₂) e^{−k₁τ} ). (3.11)

Now substituting c̃ and calculating for γ = γ₂ (which refers to the minus sign) gives, by exactly the same steps (now a₁γ₂ = −(k₂ + k₁)/2),

B(τ) = 2( 1 − e^{−(γ+ψ)a₁τ} ) / ( k₂ − k₁ − (k₁ + k₂) e^{−(γ+ψ)a₁τ} ),

while now

(γ + ψ)a₁ = ( −(k₂ + k₁)/(2a₁) + k₂/a₁ − (k₂ + k₁)/(2a₁) ) a₁ = −k₁,

so that e^{−(γ+ψ)a₁τ} = e^{k₁τ}. (3.12)

Multiplying numerator and denominator by −e^{−k₁τ} gives

B(τ) = 2( 1 − e^{−k₁τ} ) / ( k₁ + k₂ + (k₁ − k₂) e^{−k₁τ} ),

which is again (3.11): both roots indeed lead to the same B(τ).
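Expression (3.11) can also be verified numerically: it should satisfy the Riccati equation Bτ = −a₁B² − a₂B + 1 together with B(0) = 0. A sketch with hypothetical parameter values, using a central finite difference for Bτ:

```python
import math

# Hypothetical parameter values.
k, rho, eta, sigma = 1.4, 0.8, 0.36, 0.4
a1, a2 = 0.5 * sigma ** 2, k - rho * eta * sigma
k2 = a2
k1 = math.sqrt(k2 ** 2 + 2 * sigma ** 2)

def B(tau):
    """B(tau) as in (3.11)."""
    e = math.exp(-k1 * tau)
    return 2 * (1 - e) / (k1 + k2 + (k1 - k2) * e)

# Central finite difference approximation of B_tau at an arbitrary tau.
tau, h = 0.7, 1e-5
lhs = (B(tau + h) - B(tau - h)) / (2 * h)
rhs = -a1 * B(tau) ** 2 - a2 * B(tau) + 1

assert abs(B(0.0)) < 1e-15
assert abs(lhs - rhs) < 1e-6
```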
Since Aτ = r + (λ − kα)B and A(0) = 0, integration gives

A(τ) = rτ + (λ − kα) ∫_t^T B(q) dq, (3.14)

where

∫_t^T B(q) dq = ( 2/(k₁(k₁ + k₂)) ) ln( ( (k₁ + k₂)e^{k₁τ} + k₁ − k₂ ) / (2k₁) )
             + ( 2/(k₁(k₁ − k₂)) ) ln( ( k₁ + k₂ + (k₁ − k₂)e^{−k₁τ} ) / (2k₁) ), (3.15)

and where

k₁ = √(k₂² + 2σ²),
k₂ = k − ρησ. (3.16)
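The closed-form expression for the integral of B can be checked against straightforward numerical quadrature of B itself. A sketch with hypothetical parameter values, using the composite trapezoidal rule:

```python
import math

# Hypothetical parameter values.
sigma, k, rho, eta = 0.4, 1.4, 0.8, 0.36
k2 = k - rho * eta * sigma
k1 = math.sqrt(k2 ** 2 + 2 * sigma ** 2)

def B(tau):
    e = math.exp(-k1 * tau)
    return 2 * (1 - e) / (k1 + k2 + (k1 - k2) * e)

def int_B_closed(tau):
    """Closed-form integral of B over an interval of length tau."""
    t1 = (2 / (k1 * (k1 + k2))) * math.log(
        ((k1 + k2) * math.exp(k1 * tau) + k1 - k2) / (2 * k1))
    t2 = (2 / (k1 * (k1 - k2))) * math.log(
        (k1 + k2 + (k1 - k2) * math.exp(-k1 * tau)) / (2 * k1))
    return t1 + t2

# Composite trapezoidal quadrature of B over [0, tau].
tau, n = 1.5, 20000
h = tau / n
quad = 0.5 * (B(0.0) + B(tau)) * h + h * sum(B(i * h) for i in range(1, n))

assert abs(int_B_closed(0.0)) < 1e-12
assert abs(quad - int_B_closed(tau)) < 1e-7
```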
Chapter 4
Kalman Filter
4.1 Introduction
In 1960 R.E. Kalman published the Kalman Filter (KF). This algorithm makes optimal use of imprecise data on a (quasi-)linear system with Gaussian errors (white noise) to continuously update the best estimate of the system's current state. The power of the KF is that, to compute these updates, it only needs the estimate from the previous time step and the new measurement, not all the previous data. The main idea is that we want to estimate the current state and its uncertainty, but we cannot observe this state directly; instead we observe noisy measurements.
The system is described by the linear equations

xt = At xt−1 + Bt ut−1 + wt−1, (4.1)
zt = Ht xt + vt, (4.2)
where
At is an n × n matrix,
Bt is an n × l matrix,
Ht is an m × n matrix,
ut is the control input,
wt and vt are the process and measurement noise (resp.) with mean zero and covariance matrices Q and R (resp.).
Now define x̂t⁻ as the a priori state estimate, given knowledge of the process prior to time t, and x̂t as the a posteriori estimate, given the measurement zt, with corresponding error covariance matrices Pt⁻ and Pt.
The KF algorithm is a predictor-corrector algorithm and uses the time updates for the prediction and
the measurement updates as the corrector. The KF algorithm is given by
Predictor (Time updates)
x̂t⁻ = At x̂t−1 + Bt ut−1
Pt⁻ = At Pt−1 At′ + Q

Table 4.1: Time update (predictor) equations.

Corrector (Measurement updates)
Kt = Pt⁻ Ht′ ( Ht Pt⁻ Ht′ + R )⁻¹
x̂t = x̂t⁻ + Kt ( zt − Ht x̂t⁻ )
Pt = ( I − Kt Ht ) Pt⁻

Table 4.2: Measurement update (corrector) equations.
In Chapter 5 we discuss the Kalman Filter for the OU process. But first we introduce a simple example following [6].
In [6] the KF is used to estimate a random constant. We assume all matrices in Tables 4.1 and 4.2 to be constant. By setting A = 1, B = 0 we obtain x̂k⁻ = x̂k−1, i.e., we skip the updating step. By setting H = 1 we get zk = xk + vk, i.e., the measurements come directly from the state xk. We set Q = 1e−5. Now we have to choose an initial state to begin with. Since the random variable is normally distributed with mean zero, we take x0 = 0 as the initial state. Accordingly we must choose an initial value P0 for Pk. It turns out that we can choose P0 ≠ 0 arbitrarily and the filter will eventually converge. We take P0 = 1; by taking P0 large enough, the choice of x0 does not influence the Kalman Filter [7]. The crosses in the following figures are generated by the Matlab function

y = -0.37727 + normrnd(0, 0.025^2, samples, 1);
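Following [6], the whole scalar example can be sketched in a few lines. The report's figures were produced in Matlab; the version below is an equivalent Python sketch (the function name and seed are arbitrary):

```python
import numpy as np

def kalman_constant(z, Q=1e-5, R=0.0238, x0=0.0, P0=1.0):
    """Scalar Kalman Filter with A = 1, B = 0, H = 1:
    estimate a random constant from noisy measurements z."""
    xhat = np.empty(len(z))
    x, P = x0, P0
    for i, zi in enumerate(z):
        # time update: with A = 1, B = 0 the prediction is the previous estimate
        x_prior, P_prior = x, P + Q
        # measurement update
        K = P_prior / (P_prior + R)          # Kalman gain
        x = x_prior + K * (zi - x_prior)
        P = (1 - K) * P_prior
        xhat[i] = x
    return xhat

rng = np.random.default_rng(1)
true_x = -0.37727
z = true_x + 0.025 * rng.standard_normal(100)  # noisy measurements
est = kalman_constant(z)
```

Rerunning this with R = 1 or R = 0.0001 reproduces the slow- and fast-response behaviour discussed below.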
Figure 4.1: Simulation of a random constant, with R = 0.0238. The true value of x = −0.37727 is given
by the green line, the Kalman Filter by the red line and the crosses are the noisy measurements.
The effect of the choice of R is made clear by Figures 4.2 and 4.3. In Figure 4.2 we take R = 1; this causes much slower convergence than in Figure 4.1, because the filter responds more slowly to the noisy measurements. However, the filter will eventually converge to the true value of x. In Figure 4.3 we take R = 0.0001; as we can see, the filter quickly responds to the measurements and tries to fit them. With this choice of R we can expect that it will take very long for the filter to converge to the green line.
Figure 4.2: Simulation of a random constant, with R = 1. The true value of x = −0.37727 is given by
the green line, the Kalman Filter by the red line and the crosses are the noisy measurements.
Figure 4.3: Simulation of a random constant, with R = 0.0001. The true value of x = −0.37727 is given
by the green line, the Kalman Filter by the red line and the crosses are the noisy measurements.
Figure 4.4: Comparison between the theoretical innovations (blue lines) and the statistical innovations
(red crosses). The number of red crosses between the blue lines is 54 per cent. R = 0.10.
For a correctly specified R, roughly 68 per cent of the innovations should lie between the one-standard-deviation bands. It is therefore necessary to change the value of R, say to R = 0.16. We then see that 71 per cent of the red crosses lie between the blue lines. If we now plot the Kalman Filter using this R, we see that this is one of the best choices for R, as we expected.
Figure 4.5: Comparison between the theoretical innovations (blue lines) and the statistical innovations
(red crosses). The number of red crosses between the blue lines is 71 per cent. R = 0.16.
Figure 4.6: Simulation of a random constant, with R = 0.16. The true value of x = −0.37727 is given by
the green line, the Kalman Filter by the red line and the crosses are the noisy measurements.
Chapter 5
The Kalman Filter for the Ornstein-Uhlenbeck process

To fit the state-space model, the state variables (in this case the spot price and the convenience yield) are collected in a state vector. The measurement equation, consisting of this vector and uncorrelated disturbances to account for possible errors in the data, links actual observations (in this case the futures prices for several different maturities) with the latent variables. These latent variables are assumed to be first-order Markov processes and are related to systems (2.1) and (2.2). The Kalman Filter gives an optimal prediction of the unobserved data by considering only the previously estimated value.
Consider again

ln Ft(τ) = xt − B(τ) δt + A(τ), (5.1)

where xt = ln St,

A(τ) = ( r − α̃ + σ²/(2k²) − ησρ/k ) τ + (σ²/4)( 1 − e^{−2kτ} )/k³ + ( α̃k + σηρ − σ²/k )( 1 − e^{−kτ} )/k²,

and

B(τ) = ( 1 − e^{−kτ} )/k,  α̃ = α − λ/k.
In state variable terms, (5.1) can be rewritten as

Yt = dt + Zt [xt, δt]′ + εt, (5.3)

where¹
Yt = [ln Ft(τi)], for i = 1, ..., n, is an n × 1 vector for n maturities. τi are the maturity dates. Ft(τi) are observed from market data.
dt = [A(τi)], for i = 1, ..., n, is an n × 1 vector.
Zt = [1, −B(τi)], for i = 1, ..., n, is an n × 2 matrix.
εt is an n × 1 vector of uncorrelated disturbances, assumed to be normal with zero mean and covariance matrix Ht. Ht is an n × n diagonal matrix with hi on its diagonal.

¹ NOBS is the number of observations.
The εt term in (5.3) is included to account for possible errors in the measurement. These errors occur especially when the state variables are unobservable. To get a feeling for the size of the error, suppose that the OU model generates the prices and yields perfectly, and that the state variables can be observed from the market directly. The error term in the measurement equation can then be seen as capturing bid-ask spreads, errors in the data, etc. The error is assumed to be small in comparison to the variation of the yield. The matrix Ht is assumed to be diagonal for convenience, in order to reduce the number of parameters to be estimated. The diagonal elements hi are then estimated via the log-likelihood function.
where ξt is the transition error. Like εt, ξt is also assumed normal with mean zero and covariance matrix given by

Vt = [ σ²∆t    ρση∆t
       ρση∆t   η²∆t ]. (5.7)
The matrices H, Z, Q, c, d given above are parametrized by the unknown parameter set ϕ = {k, α, λ, η, σ, ρ, µ, hi}. We drop the time subscript for each matrix because they are time-independent. Note that Vt does not depend on the state variables {xt, δt}.
while (number of iterations has not been reached, optimal ϕ has not been found) do
  (Kalman Filter)
  for i = 1 : NOBS do
    (Prediction equations)
    ãti = Q ∗ ati−1 + c;
    P̃ti = Q ∗ Pti−1 ∗ Q′ + R ∗ V ∗ R′;
    ỹti = Z ∗ ãti + D;
    (Innovations)
    vti = yti − ỹti;
    (Updating equations)
    Fti = Z ∗ P̃ti ∗ Z′ + H;
    ati = ãti + P̃ti ∗ Z′ ∗ Fti⁻¹ ∗ vti;
    Pti = P̃ti − P̃ti ∗ Z′ ∗ Fti⁻¹ ∗ Z ∗ P̃ti;
    dFti = det Fti;
    if dFti ≤ 0 then
      dFti = 10⁻¹⁰;
    end if
    (Log-likelihood function per iteration)
    logl(i) = −(n/2) ∗ ln(2π) − 0.5 ∗ ln(dFti) − 0.5 ∗ vti′ ∗ Fti⁻¹ ∗ vti;
  end for
  LogL = Σi logl(i);
  (Adjustment of ϕ via a Matlab optimization routine)
end while
Via an optimization routine, the vector ϕ is chosen such that the total sum of the log-likelihood function is maximized and the innovations are minimized. After this is done, the matrices in the measurement and transition equations are updated and the Kalman Filter algorithm is repeated. This iterative procedure is repeated until the optimal ϕ is found. With this optimized set of parameters the matrices are updated once more and used a final time to generate the paths of the non-observable variables via the Kalman Filter. The total sum of the log-likelihood function is calculated as follows:
ln L(Y ; ϕ) = −(1/2) Σt n ln 2π − (1/2) Σt ln|Ft| − (1/2) Σt vt′ Ft⁻¹ vt, (5.8)
where Yt is the information vector at time t and it is assumed that Yt conditional on Yt−dt is normal with
mean E[Yt |Yt−dt ] and covariance matrix Ft .2
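One filtering pass of the pseudocode above, returning the total log-likelihood (5.8), can be sketched as follows. This is a Python sketch (the report uses Matlab); the selection matrix R of the pseudocode is taken as the identity here, and all system matrices in the example call are placeholders, not calibrated values.

```python
import numpy as np

def kf_loglik(y, Qm, c, Z, d, H, V, a0, P0):
    """Kalman Filter pass over y (NOBS x n log futures prices); returns the
    summed log-likelihood. Qm, c: transition; Z, d: measurement;
    H: measurement error covariance; V: transition error covariance."""
    a, P = a0, P0
    n = y.shape[1]
    total = 0.0
    for t in range(y.shape[0]):
        # prediction
        a_pred = Qm @ a + c
        P_pred = Qm @ P @ Qm.T + V
        # innovation
        v = y[t] - (Z @ a_pred + d)
        F = Z @ P_pred @ Z.T + H
        Finv = np.linalg.inv(F)
        # update
        a = a_pred + P_pred @ Z.T @ Finv @ v
        P = P_pred - P_pred @ Z.T @ Finv @ Z @ P_pred
        _, logdet = np.linalg.slogdet(F)
        total += -0.5 * n * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * v @ Finv @ v
    return total

# Placeholder two-state, two-maturity example.
Qm = np.array([[1.0, -0.02], [0.0, 0.9]])
c = np.array([0.01, 0.007])
Z = np.array([[1.0, -0.5], [1.0, -0.3]])
d = np.array([0.1, 0.2])
H = 0.01 * np.eye(2)
V = np.array([[0.004, 0.001], [0.001, 0.002]])
y = np.tile([4.1, 4.2], (20, 1))
ll = kf_loglik(y, Qm, c, Z, d, H, V, np.array([4.0, 0.05]), 0.1 * np.eye(2))
```

In a calibration this function would be called inside the optimizer, once per trial parameter set ϕ.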
with δ0 = δ0^implied.
5.2.2 Results
The optimized parameter set is estimated for different initial sets and given in Table 5.1. To test the robustness of the calibration, we first randomly choose an initial parameter set. This gives the optimized set, which we then insert as the initial one. Within a few iteration steps the parameters converge to the same values.
Remark: In the following sections we use the second column of Table 5.1 as the initial parameter set.
2 This Ft is not the same as the future price
Parameters Ini parset Opti parset Ini parset Opti parset Ini parset Opti parset
k 0.3 1.4221 (0.0372) 0.3 1.4221 (0.0380) 2 1.4221 (0.0382)
µ 0.2 0.3733 (0.1471) 0.2 0.3733 (0.1376) 0.2 0.3733 (0.1382)
α 0.06 0.0699 (0.1128) 0.2 0.0699 (0.1025) 0.06 0.0699 (0.1082)
λ 0.01 -0.0183(0.1602) 0.1 -0.0183(0.1459) 0.01 -0.0183(0.1543)
η 0.4 0.3630 (0.0137) 0.4 0.3630 (0.0139) 0.5 0.3630 (0.0153)
σ 0.4 0.4028 (0.0165) 0.4 0.4028 (0.0172) 0.4 0.4028 (0.0181)
ρ 0.8 0.8378 (0.0162) 0.5 0.8378 (0.0164) 0.5 0.8378 (0.0177)
|h1 | 0.0246 0.0188 (0.0008) 0.0246 0.0188 (0.0007) 0.01 0.0188 (0.0007)
|h2 | 0.0268 0.0072 (0.0003) 0.0268 0.0072 (0.0003) 0.01 0.0072 (0.0003)
|h3 | 0.0291 0.0022 (0.0001) 0.0291 0.0022 (0.0001) 0.01 0.0022 (0.0001)
|h4 | 0.0313 0.0000 (0.0001) 0.0313 0.0000 (0.0001) 0.01 0.0000 (0.0001)
|h5 | 0.0336 0.0006 (0.0000) 0.0336 0.0006 (0.0000) 0.01 0.0006 (0.0000)
|h6 | 0.0357 0.0000 (0.0001) 0.0357 0.0000 (0.0001) 0.01 0.0000 (0.0001)
|h7 | 0.0377 0.0014 (0.0001) 0.0377 0.0014 (0.0001) 0.01 0.0014 (0.0001)
Log-Likelihood 8744.6479 8744.6479 8744.6479
Table 5.1: Optimized parameter set (Opti parset) for different initial parameter sets (Ini parset). Standard errors in parentheses.
Table 5.2: Optimized parameter set (Opti parset) for different initial parameter sets (Ini parset). Standard errors in parentheses.
Figure 5.1: Blue line: State variable xt , Red line:ln(Ft (τ1 )).
5.2.6 Innovations vt
For each futures contract we plot the innovation, i.e. vti = yti − ỹti. We assumed the measurement errors εt in equation (5.3) to be normal with mean zero and variance H. Considering the figures, this assumption is acceptable. Note that there are some outlying points; these could be caused by data errors.
Figure 5.3: Innovation corresponding to F1. Figure 5.4: Innovation corresponding to F2.
Mean = -8.1734e-004, Variance = 0.0022. Mean = 8.6078e-004, Variance = 0.0019.
Figure 5.5: Innovation corresponding to F3. Figure 5.6: Innovation corresponding to F4.
Mean = 6.4612e-004, Variance = 0.0017. Mean = -1.0181e-004, Variance = 0.0015.
Figure 5.7: Innovation corresponding to F5. Figure 5.8: Innovation corresponding to F6.
Mean = -5.5711e-004, Variance = 0.0014. Mean = -3.4456e-004, Variance = 0.0013.
[Figure: scatter plot of the increments of the state variable δ against the increments of the state variable x.]
Since the pattern of dots slopes from lower left to upper right, it suggests a positive correlation between
the variables being studied, which is exactly what we expected.
If this is not done, the iterative procedure may break down. The matrix H, consisting of the variances of the error terms, can become negative, and this causes problems in the KF: Ft could converge to a negative diagonal matrix, which is impossible to invert. We therefore square each element of H before calculating Ft. If the parameters hi come out negative this is acceptable, because they effectively enter Ft as variances, i.e. they are squared. This is why the hi in Tables 5.1 and 5.2 are given in absolute value.
at = Qt at + ct ,
Pt = Qt Pt Q0t + Rt Vt Rt0 ,
yt = Zt at + Dt . (5.10)
Forecasting the state variable δt is not a success. This can be explained by the sudden drop at observation 160.
Figure 5.11: Forecast over half a sample of the log of the future prices (F1-F7).
Figure 5.12: Forecast over half a sample of the state variables. Blue line: xt , green line: δt , red lines:
forecasting
The Kalman forecast cannot absorb this sudden drop in the convenience yield; instead it follows the previous estimates. However, as noted earlier, the mean-reversion level equals 0.0699, which is approximately equal to the forecast. Also for xt, the sudden drop at observation 250 cannot be absorbed by this forecasting method. This is a real disadvantage of the method. One way to make it absorb these drops is to insert jumps into the stochastic processes of both state variables.
Chapter 6
The Kalman Filter for the Cox-Ingersoll-Ross process

Consider again

ln Ft(τ) = xt − B(τ) δt + A(τ),

where A(τ) and B(τ) are given in (3.14) and (3.13). The measurement equation then reads
where
Yt = [ln Ft(τi)], for i = 1, ..., n, is an n × 1 vector for n maturities. τi are the maturity dates.
dt = [A(τi)], for i = 1, ..., n, is an n × 1 vector.
Zt = [1, −B(τi)], for i = 1, ..., n, is an n × 2 matrix.
εt is an n × 1 vector of uncorrelated disturbances, assumed to be normal with zero mean and covariance matrix Ht.
Another motivation for including the error term is that the dynamics of the stochastic convenience yield perhaps do not include a square root. If so, the yields implied by the CIR model will differ from the observed yields. Furthermore, since the CIR model forbids a negative convenience yield, there are two possible fixes: first, we can replace any negative element of the estimate δt by zero; second, we can skip the updating step, i.e. we set δt = δt−1 whenever δt is negative. This is done purely for convenience: it seems rather difficult to impose the positivity constraint directly in the estimation procedure, so a workaround like this is needed.
Again, the vector [xt, δt]′ is the state which we want to estimate. From (3.1) and (3.5) the transition equation immediately follows:

xt = xt−∆t + µ∆t − (1 + ½η²) δt−∆t ∆t + ξt¹,
δt = kα∆t + (1 − k∆t) δt−∆t + ξt²,
where E[ξti] = 0 for i = 1, 2, and where the covariance matrix Vt of ξt is given by

Vt = [ η²∆t δt−∆t                          ρη √(∆t δt−∆t) √(Var[δt|δt−∆t])
       ρη √(∆t δt−∆t) √(Var[δt|δt−∆t])     Var[δt|δt−∆t]                  ].
The difference between Vt for the OU process and for the CIR process is the time dependency. Implementing the Kalman Filter for the CIR process therefore includes feedback from the Kalman Filter to update Vt with δt|t−1. As mentioned earlier, since the square root of the convenience yield is taken, we replace any negative element of the estimate δt|t−1 by zero.
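The state-dependent rebuild of Vt, with negative filtered estimates floored at zero, can be sketched as follows. The conditional variance Var[δt|δt−∆t] is taken from the moment formulas of Chapter 3; the function name and parameter values are hypothetical.

```python
import math
import numpy as np

def cir_transition_cov(delta_prev, dt, eta, rho, k, alpha, sigma):
    """Transition error covariance V_t for the CIR state-space model,
    rebuilt each step from the filtered estimate delta_{t|t-1}."""
    d = max(delta_prev, 0.0)  # floor negative filtered estimates at zero
    e = math.exp(-k * dt)
    var_d = (sigma ** 2 / k) * d * (e - e ** 2) \
        + (alpha * sigma ** 2 / (2 * k)) * (1 - e) ** 2  # Var[delta_t | delta_{t-dt}]
    cov = rho * eta * math.sqrt(dt * d * var_d)
    return np.array([[eta ** 2 * dt * d, cov],
                     [cov, var_d]])
```

The resulting matrix is positive semidefinite by construction, since its determinant equals η²∆t δ · Var[δt|δt−∆t] · (1 − ρ²) ≥ 0.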
General remark: it might be better to work with δt + γ, with γ a constant shift parameter.
Chapter 7
Conclusion
We implemented the KF for the OU process. Both the CY and the state variable x (the log of the spot price) seem to follow the implied yield and the market prices (resp.) quite well. Also, different initial values for the parameter set eventually converge to the optimized set with the same value of the log-likelihood. This is a good result and demonstrates the robustness of the method.
The main difference between the system matrices of the two processes is the transition error covariance matrix Vt. For the CIR process, we simply replaced any negative element of the CY by zero, but since it is negative for a large number of observations, this will probably give rise to large standard errors in the optimized parameter set ϕ.
Bibliography

[1] F.H.C. Naber, Fast solver for the three-factor Heston-Hull-White problem, 2006-2007, Wholesale Banking Amsterdam, ING.
[2] A. Krul, L∞ Error estimates for the solution of the Black-Scholes and Hull-White models on truncated
domains with a payoff boundary condition, 2007, Wholesale Banking Amsterdam ING.
[3] P. Bjerksund, Contingent Claims Evaluation when the Convenience Yield is stochastic: Analytical
Results, 1991.
[4] D.R. Ribeiro and S. D. Hodges, A Two-Factor Model for Commodity Prices and Futures Valuation,
2004.
[5] R. Gibson and E. S. Schwartz, Stochastic Convenience Yield and the Pricing of Oil Contingent Claims,
The Journal of Finance, Volume 45, Issue 3(Jul.1990), 959-976.
[6] G. Welch and G. Bishop, An Introduction to the Kalman Filter, July 24, 2006.
[7] A. Heemink, http://ta.twi.tudelft.nl/wagm/users/heemink/ch5dataas.doc, Delft University of Technology.
[8] D. Lautier, The Kalman Filter In Finance: An Application to Term Structure Models of Commodity
Prices and a Comparison between the Simple and the Extended Filters, 06/12/02, working paper.
[9] D. Lautier, The Informational Value of Crude Oil Futures Prices, June 2003.