
Delft University of Technology

Faculty of Electrical Engineering, Mathematics and Computer Science


Delft Institute of Applied Mathematics

Calibration of Different Interest Rate Models


for a Good Fit of Yield Curves

A thesis submitted to the


Delft Institute of Applied Mathematics
in partial fulfillment of the requirements

for the degree

MASTER OF SCIENCE
in
APPLIED MATHEMATICS

by

H.H.N. AMIN

Delft, the Netherlands


September 2012

Copyright © 2012 by H.H.N. Amin. All rights reserved.
MSc THESIS APPLIED MATHEMATICS

“Calibration of Different Interest Rate Models for a Good Fit of Yield Curves”

H.H.N. AMIN

Delft University of Technology

Daily supervisor Responsible professor

Dr. J.A.M. van der Weide Prof. dr. F.H.J. Redig

Other thesis committee members

Prof. dr. ir. A.W. Heemink

Dr. R.J. Fokkink

Drs. J. Hommels

September 11, 2012 Delft, the Netherlands


Preface

One of the first mathematical models to describe the interest rate over time was the Vasicek model (1978). It was followed by the Cox Ingersoll Ross (CIR) model (1985). The Vasicek model and the CIR model belong to the family of short interest rate models. Through a transformation these models can be used to compute the interest rate values. Classical techniques such as Maximum Likelihood Estimation (MLE) and the Least Squares Method (LSM) are used to estimate the parameters of a short rate model from historical data.

One must understand that the short rate values cannot be observed in the financial market. However, we can observe the bond prices, and from these we can compute the interest rate values. It turns out that when the short interest rate values are not known, MLE and LSM cannot be used to estimate the parameters of the CIR model. Rabobank encountered this problem.

My assignment was to estimate the parameters of the CIR model using the historical data collected by Rabobank. When the Vasicek model is calibrated using MLE or LSM, it turns out that the drift parameters are estimated with a very high bias. Rabobank uses the Long Term Quantile (LTQ) method, which is expected to have no bias. My assignment was also to test this claim and, if a bias was present, to eliminate it.

Acknowledgment

First of all I would like to thank Rabobank and in particular Jasper Hommels for giving
me the opportunity to work on this challenging project, for his feedback and the excellent
cooperation. Furthermore, I would like to thank my daily supervisor dr. J.A.M. van der
Weide for his guidance and advice. Next, I would like to thank prof.dr.ir. A.W. Heemink for
his feedback on the Kalman Filter method and dr.ir. R.J. Fokkink for taking over the tasks
of my daily supervisor in the final weeks. A special thanks goes to my family and dearest
friends who supported me through good and bad times and made it possible for me to be the
person I am today.

Hassan Amin
Delft, September 2012

Contents

Introduction

1 Interest Rate Models
1.1 Deriving Partial Differential Equation for Bond Price
1.2 The Vasicek model
1.3 The Cox Ingersoll Ross model
1.4 The Nelson-Siegel model

2 Calibration of Interest Rate Models
2.1 Calibration with MLE and LSM
2.2 Calibration with Kalman Filter
2.3 Calibration with the Long Term Quantile method
2.4 Testing and Comparing the Methods

3 Forecasting
3.1 Modeling data with the Vasicek model
3.2 Modeling data with the Cox Ingersoll Ross model
3.3 Modeling data with the Nelson-Siegel model
3.4 Comparing the Results

Conclusions and Further Research

Bibliography

Introduction

Large financial institutions such as banks and pension funds hold zero-coupon bonds (also called pure discount bonds): contracts that guarantee the payment of one unit of currency at maturity¹, with no intermediate payments. This means that one party, in the case of a government bond the government, agrees to pay a fixed interest rate at maturity to the holder of the contract. In mathematical terms, we define the contract value at time $t$ as $P(t,T)$, where $T$ is the maturity; hence the payment at maturity is $P(T,T) = 1$. The length of the contract can vary from one month up to 30 years, and the fixed interest rate typically increases with the length of the contract.

Thus, for a contract of 1 year the interest rate may be 1%, whereas for the same contract over a period of 10 years one might get a fixed interest rate of 3%. One of the reasons behind this is that it is often assumed that the interest rate will rise. Secondly, if the interest rate were equal for a contract of 1 month and a contract of 30 years, there would be no advantage in entering a contract with a duration of 30 years; the longer contract must therefore offer a higher interest rate. When entering a contract, the two parties agree on a fixed interest rate. It is very important to compute the "fairest" interest rate for each contract. If one were able to predict the future, the fairest interest rate would simply be the average of the daily interest rates up to maturity. Since this is not possible, mathematical models come into play. It is therefore extremely important to find a model that closely describes the changes in the interest rate over time. First we define the (continuously compounded spot) interest rate by $R(t,T)$. The bond price is defined as

$$P(t,T) = e^{-\tau R(t,T)}, \tag{1}$$

with $\tau = T - t$, see [4]. The compound interest rate $R(t,T)$ belongs to the family of affine term structure models. Affine term structure models are interest rate models in which the compound interest rate $R(t,T)$ is an affine function of the short rate $r(t)$². Thus, we have

$$R(t,T) = \alpha(t,T) + \beta(t,T)\,r(t), \tag{2}$$

where $\alpha(t,T)$ and $\beta(t,T)$ are deterministic functions of time [4]. Therefore, the zero-coupon bond price can be rewritten as

$$P(t,T) = A(t,T)\,e^{-B(t,T)\,r(t)}, \tag{3}$$

where

$$\alpha(t,T) = \frac{-\ln\big(A(t,T)\big)}{\tau}, \qquad \beta(t,T) = \frac{B(t,T)}{\tau}.$$

¹ Maturity here is defined as the expiry date of the contract.
² Unless stated otherwise, $\tau = T - t$, $T$ = maturity, $t$ = time, $R(t,T)$ = compound interest rate and $r(t) = r_t$ = short interest rate.
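As an illustration of Equations (1) to (3), the following sketch converts between zero-coupon bond prices and continuously compounded interest rates and evaluates the affine relation. The function names and all numerical inputs are hypothetical and chosen only for this example; they are not market data.

```python
import math

def yield_from_price(price, tau):
    """Compound interest rate R from a zero-coupon bond price P, Equation (1)."""
    return -math.log(price) / tau

def price_from_yield(rate, tau):
    """Zero-coupon bond price P = exp(-tau * R), Equation (1)."""
    return math.exp(-tau * rate)

def affine_yield(A, B, r, tau):
    """Rate implied by P = A * exp(-B * r), Equations (2) and (3)."""
    alpha = -math.log(A) / tau
    beta = B / tau
    return alpha + beta * r

# Hypothetical input: a 10-year bond priced at 0.74 per unit of face value.
tau = 10.0
price = 0.74
R = yield_from_price(price, tau)
print(f"compound interest rate: {R:.4%}")

# Hypothetical affine coefficients and short rate, for illustration only.
A, B, r = 0.95, 8.0, 0.02
R_affine = affine_yield(A, B, r, tau)
P_affine = A * math.exp(-B * r)
print(f"affine rate: {R_affine:.4%}, bond price: {P_affine:.4f}")
```

Both representations describe the same price: discounting with `R_affine` over `tau` years reproduces `P_affine`.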

It is impossible to model $R(t,T)$ exactly. Every model, no matter how good, rests on assumptions, and for this reason a model can never be perfect. Modeling $R(t,T)$ is therefore like forecasting the weather. The best way to predict the weather for tomorrow is to measure the weather today and to use mathematical models to predict the movement of the clouds. The assumption made here is that the weather tomorrow depends on the weather today, which is true in most cases but not always. The more assumptions one makes, the simpler the model becomes and the less accurate its predictions are. Models are the backbone of our society: because of models we know that there are four different seasons, and that if it is 30 degrees today it is almost certain that it will not snow tomorrow, although it might rain depending on the movement of the clouds. These predictions are not 100% accurate, but a farmer has to rely on them, since he has no better alternative. Models improve day by day; the predictions will never be 100% accurate, but they may become very close.

In the world of finance we have the same problem. Mathematical models are used to describe the future by using historical information. As in weather forecasting, the mathematical models used in the financial world allow certain scenarios to be ruled out with high certainty. Poor predictions may result in big losses. This is why we observe an increase in the number of mathematicians working for financial institutions, improving the models used today in order to tackle the financial crisis.
In this thesis we will compare three different models: the Vasicek model [21] (1978), the Cox Ingersoll Ross (CIR) model [5] (1985) and the Nelson-Siegel model [6] (2006). The first two models are well known and have been used for decades in the financial world. The Nelson-Siegel model, which originates from [16] (1987) and was reintroduced by Diebold and Li in 2006, is used by the majority of central banks [3]. Each of these three models comes with the assumption that the historical movements of the interest rate are the best predictor of future interest rates. The models are described by a fixed number of parameters that depend on the collected historical data. It is extremely important to estimate these parameters without bias, because a bias in the estimation may result in a huge loss over both the short term and the long term, as we will show in Chapter 2. Estimation of these parameters is called calibration. Classical methods such as the Maximum Likelihood Estimator (MLE) and the Least Squares Method (LSM) are commonly used for the calibration of short rate models. Recently, M. de Ruijter [17] and E. van Elen [9] both applied these methods to calibrate short rate models in their final theses, for Rabobank and Netspar respectively. However, estimation using MLE and LSM results in a large bias of up to 400% in one of the estimated parameters. This phenomenon was first addressed by Merton [14] (1980) and later confirmed by many others, such as Ball and Torous [2] (1996) and Yu and Phillips [23] (2001). Despite this bias, these methods are still used today. Yu and Phillips [24] (2006) proposed a jackknife method that reduces the bias by a factor of 5 up to 10; however, this comes with an increase in the variance. Aït-Sahalia [1] (2008) and Tang and Chen [20] (2009) used a bootstrap method combined with MLE to reduce the bias by the same factor without increasing the variance. E. van Elen did not use any bias reduction techniques. Therefore, the bias remains in the estimated parameters; the consequences of this for the computed bond price are shown in Chapter 2. Rabobank uses the Long Term Quantile (LTQ) method as a better alternative to MLE and LSM in order to reduce the bias in the estimated parameters. This is why M. de Ruijter applied the LTQ method; however, no tests were performed to measure the bias reduction with respect to LSM and MLE.
In our work we reviewed LTQ and found that we still have a bias of around 60%, which can have a significant impact on the bond price, as we shall see in Chapter 2. Furthermore, we propose the Kalman filter, which is usually applied to estimate parameters for the CIR model [15], in order to reduce the bias in the estimated parameters. The Kalman filter is often used as a last resort when methods such as LSM and MLE fail, as is the case when estimating the parameters of the CIR model [15]. We shall show in this thesis why the Kalman filter should be applied in all cases, as this method outperforms LSM, MLE and even LTQ. In the final thesis of E. van Elen the CIR model is estimated using MLE. This can only be done if no distinction is made between $R(t,T)$ and $r(t)$, which is a common pitfall. One must understand that the historical data collected from the market is $R(t,T)$ and that $r(t)$ cannot be observed in the market. Therefore, a transformation is always needed when dealing with short rate models, as we will show in Chapter 1. For this reason LSM and MLE fail when calibrating the CIR model.
After the calibration, we compare our results with the initial historical yield curve. The
yield curve for day x is defined as the compound interest rate for 9 different contracts, with
9 different maturities (1 month, 3 months, 6 months, 1 year, 2 years, 5 years, 10 years, 20
years, 30 years), at day x. The compound interest rate is computed from the bond prices for
each contract, which are observed by Rabobank.

1

Interest Rate Models

In this chapter we will describe the models that we will use. In this thesis we will work
with the Vasicek model, the CIR model and the Nelson-Siegel model. In Section 1.1 we
derive the partial differential equation for the zero-coupon bond price for short rate models.
This is done using the same approach as when deriving the Black-Scholes partial differential
equation. The partial differential equation for the zero-coupon bond price allows us to derive
the zero-coupon bond price for the Vasicek model and the CIR model, in Section 1.2 and
Section 1.3 respectively.
Most banks today use the Vasicek model, or its extended version called the Hull White model, which was introduced by Hull and White [11] (1987), to model the evolution of short interest rates. There are two common reasons for this. The first and most obvious reason is that these were the first models introduced to describe $r_t$, but that alone is not enough. The more important reason is the model's simplicity, as we will see in Section 1.2. The latter is probably why the model is still used after decades. The major drawback of this model is that one encounters negative interest rate values. This is due to the fact that the short rate in the Vasicek model is normally distributed, as we will show in Section 1.2. No one will put money in a savings account knowing that it will be worth less next year than it is today, so working with negative interest rates is not realistic.

Therefore, a better alternative is the Cox Ingersoll Ross (CIR) model, as this model does not generate negative interest rate values, and if the Feller condition holds, see Section 1.3, the interest rate values will be strictly positive. The CIR model is more realistic than the Vasicek model, but it is also more complicated. Deriving the bond price for the CIR model is much harder, as we will see in Section 1.3. Estimating the parameters of the CIR model cannot be done using classical methods such as MLE and LSM; the reasons for this will be given in Section 1.3.

Even though the Vasicek model, or an extended version of it, is widely used in the financial world, nine of the thirteen central banks that inform the Bank for International Settlements of their estimations use the model developed by Nelson and Siegel [6], or its extended version proposed by Svensson, to model the yield curve [3]. In Section 1.4 we explain the Nelson-Siegel model, the idea behind it, and why it is very popular with central banks.


1.1 Deriving Partial Differential Equation for Bond Price

The Vasicek model and the CIR model assume that the interest rate follows a Markov process. We define $W(t)$ as a Wiener process on the risk-neutral probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then the short interest rate follows the stochastic differential equation [18]

$$dr(t) = \mu(r,t)\,dt + \sigma(r,t)\,dW(t). \tag{1.1}$$

In this section we drop the notation $r_t = r(t)$ and work with $r = r(t)$, because later on we will use partial derivatives and we do not want to cause confusion.

We assume that the market is perfectly liquid. This means that it is possible to purchase or sell a bond, or any fraction of it, in any amount and at any given time. We define the bond price, $P(t,T)$, as a function of $t$ and $r$, hence $P(t,T) = f(t,r)$. We assume that $f(t,r)$ is at least twice continuously differentiable, so that we can apply Itô's lemma to get

$$\begin{aligned}
df(t,r) &= f_t\,dt + f_r\,dr + \tfrac{1}{2} f_{rr}\,(dr)^2 \\
&= f_t\,dt + f_r\big(\mu(r,t)\,dt + \sigma(r,t)\,dW(t)\big) + \tfrac{1}{2} f_{rr}\big(\mu(r,t)\,dt + \sigma(r,t)\,dW(t)\big)^2 \\
&= \big(f_t + f_r\,\mu(r,t) + \tfrac{1}{2} f_{rr}\,\sigma^2(r,t)\big)\,dt + f_r\,\sigma(r,t)\,dW(t),
\end{aligned} \tag{1.2}$$

where $f_t$ denotes the partial derivative of $f = f(t,r)$ with respect to $t$, etcetera. In the last step we used $(dW(t))^2 = dt$ and discarded the terms of higher order in $dt$.


We want to create a risk-free self-financing investment, $\Pi(t)$, at time $t$ [19]¹. We define $b_1$ and $b_2$ as two arbitrary zero-coupon bonds with different maturities and with bond prices $f_1(t,r)$ and $f_2(t,r)$ respectively. Furthermore, we define $\Delta_1$ and $\Delta_2$ as the (fractional) amounts of the bonds $b_1$ and $b_2$ that we hold in our risk-free portfolio. The return on our risk-free portfolio is the sum of the proportional returns on the two underlying bonds. In other words,

$$\frac{d\Pi}{\Pi} = \Delta_1 \frac{df_1(t,r)}{f_1(t,r)} + \Delta_2 \frac{df_2(t,r)}{f_2(t,r)}. \tag{1.3}$$

We can substitute Equation (1.2) into Equation (1.3) to get

$$\begin{aligned}
\frac{d\Pi}{\Pi} &= \Delta_1 \frac{\big(f_{1t} + f_{1r}\,\mu(r,t) + \tfrac{1}{2} f_{1rr}\,\sigma^2(r,t)\big)\,dt + f_{1r}\,\sigma(r,t)\,dW(t)}{f_1(t,r)} \\
&\quad + \Delta_2 \frac{\big(f_{2t} + f_{2r}\,\mu(r,t) + \tfrac{1}{2} f_{2rr}\,\sigma^2(r,t)\big)\,dt + f_{2r}\,\sigma(r,t)\,dW(t)}{f_2(t,r)} \\
&= \Delta_1 \hat{\mu}_1\,dt + \Delta_1 \hat{\sigma}_1\,dW(t) + \Delta_2 \hat{\mu}_2\,dt + \Delta_2 \hat{\sigma}_2\,dW(t) \\
&= \big(\Delta_1 \hat{\mu}_1 + \Delta_2 \hat{\mu}_2\big)\,dt + \big(\Delta_1 \hat{\sigma}_1 + \Delta_2 \hat{\sigma}_2\big)\,dW(t),
\end{aligned} \tag{1.4}$$

with

$$\hat{\mu}_1 = \frac{f_{1t} + f_{1r}\,\mu(r,t) + \tfrac{1}{2} f_{1rr}\,\sigma^2(r,t)}{f_1(t,r)}, \qquad \hat{\sigma}_1 = \frac{f_{1r}\,\sigma(r,t)}{f_1(t,r)},$$

$$\hat{\mu}_2 = \frac{f_{2t} + f_{2r}\,\mu(r,t) + \tfrac{1}{2} f_{2rr}\,\sigma^2(r,t)}{f_2(t,r)}, \qquad \hat{\sigma}_2 = \frac{f_{2r}\,\sigma(r,t)}{f_2(t,r)}.$$

¹ In the book of Seydel the self-financing portfolio is used to derive the Black-Scholes equation; the steps are almost identical, only in our work we use it to derive the differential equation for the bond price.

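Since we want a risk-free portfolio, we need to eliminate the risk factor, which is the Wiener process. As $\Delta_1$ and $\Delta_2$ are the only two variables that can be changed, they must be chosen in such a way that the Wiener term vanishes. In other words,

$$\Delta_1 + \Delta_2 = 1, \qquad \Delta_1 \hat{\sigma}_1 + \Delta_2 \hat{\sigma}_2 = 0.$$

Solving this results in

$$\Delta_1 = \frac{-\hat{\sigma}_2}{\hat{\sigma}_1 - \hat{\sigma}_2}, \qquad \Delta_2 = \frac{\hat{\sigma}_1}{\hat{\sigma}_1 - \hat{\sigma}_2}.$$

If we then substitute this into Equation (1.4) we get

$$\frac{d\Pi}{\Pi} = \left(\frac{-\hat{\sigma}_2}{\hat{\sigma}_1 - \hat{\sigma}_2}\,\hat{\mu}_1 + \frac{\hat{\sigma}_1}{\hat{\sigma}_1 - \hat{\sigma}_2}\,\hat{\mu}_2\right) dt. \tag{1.5}$$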

Our portfolio, $\Pi$, is risk-free and hence must offer the same return as any other risk-free investment. A safe investment in this case might be a savings account at a bank. Therefore, if we put our money in a savings account, the return is

$$\frac{d\Pi}{\Pi} = r\,dt. \tag{1.6}$$

Equating Equation (1.5) and Equation (1.6) and solving for $r$ gives

$$r(t) = \frac{\hat{\mu}_2 \hat{\sigma}_1 - \hat{\mu}_1 \hat{\sigma}_2}{\hat{\sigma}_1 - \hat{\sigma}_2}. \tag{1.7}$$

We can rewrite this as follows,

$$\frac{\hat{\mu}_1 - r}{\hat{\sigma}_1} = \frac{\hat{\mu}_2 - r}{\hat{\sigma}_2}. \tag{1.8}$$

Since the two bonds were arbitrary, this ratio does not depend on the maturity of the bond.
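The elimination of the Wiener term can be checked numerically. The sketch below uses hypothetical values for the relative drifts and volatilities of the two bonds (they are not taken from any data set), computes the portfolio weights, and confirms that the diffusion term vanishes and that both bonds share the same ratio in Equation (1.8).

```python
def hedge_weights(sigma1, sigma2):
    """Weights that sum to one and cancel the dW term of the two-bond portfolio."""
    if sigma1 == sigma2:
        raise ValueError("bonds with equal relative volatility cannot hedge each other")
    delta1 = -sigma2 / (sigma1 - sigma2)
    delta2 = sigma1 / (sigma1 - sigma2)
    return delta1, delta2

# Hypothetical relative drifts and volatilities (illustration only). The
# volatilities are negative because bond prices fall when the short rate rises.
mu1, sigma1 = 0.035, -0.04
mu2, sigma2 = 0.050, -0.10

delta1, delta2 = hedge_weights(sigma1, sigma2)
portfolio_drift = delta1 * mu1 + delta2 * mu2              # dt term, Equation (1.5)
portfolio_diffusion = delta1 * sigma1 + delta2 * sigma2    # dW term, should vanish

# Risk-free rate implied by Equation (1.7) and the two ratios of Equation (1.8).
r = (mu2 * sigma1 - mu1 * sigma2) / (sigma1 - sigma2)
ratio1 = (mu1 - r) / sigma1
ratio2 = (mu2 - r) / sigma2
print(delta1, delta2, portfolio_drift, portfolio_diffusion, r, ratio1, ratio2)
```

With these inputs the portfolio drift coincides with the implied risk-free rate, which is exactly the statement of Equations (1.5) to (1.7).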

Let us now define

$$\lambda(t) = \frac{\hat{\mu}_t - r(t)}{\hat{\sigma}_t}. \tag{1.9}$$

This parameter, $\lambda(t)$, is called the market risk parameter [5]². If we now substitute the expressions for $\hat{\sigma}_t$ and $\hat{\mu}_t$ back into Equation (1.9) we get

$$\lambda(t) = \frac{f_t + f_r\,\mu(r,t) + \tfrac{1}{2} f_{rr}\,\sigma^2(r,t)}{f_r\,\sigma(r,t)} - \frac{r f(t,r)}{f_r\,\sigma(r,t)}.$$

This can be rewritten as

$$f_t(t,r) + \big(\mu(r,t) - \lambda(t)\sigma(r,t)\big) f_r(t,r) + \tfrac{1}{2}\sigma^2(r,t) f_{rr}(t,r) - r f(t,r) = 0. \tag{1.10}$$

Let us put back $f(t,r) = P(t,T)$, then we get

$$P_t(t,T) + \big(\mu(r,t) - \lambda(t)\sigma(r,t)\big) P_r(t,T) + \tfrac{1}{2}\sigma^2(r,t) P_{rr}(t,T) - r P(t,T) = 0. \tag{1.11}$$

We want to compute the bond price in the following form,

$$P(t,T) = e^{C(t,T) - B(t,T)\,r(t)}. \tag{1.12}$$

Note that this is simply Equation (3), see the Introduction, only now $A(t,T) = \exp(C(t,T))$. Note that, because $T$ is fixed, the functions $B(t,T)$ and $C(t,T)$ depend on the variable $t$ only. Therefore, we can define $B(t,T)$ and $C(t,T)$ as functions of the single variable $\tau = T - t$, hence $B(t,T) = B(\tau)$ and $C(t,T) = C(\tau)$. For the derivatives of $B(t,T)$ and $C(t,T)$ with respect to $\tau$ we find

$$B_\tau(\tau) = -B_t(\tau), \qquad C_\tau(\tau) = -C_t(\tau).$$

Next, we introduce

$$\tilde{\mu} = \mu(r,t) - \lambda_t\,\sigma(r,t), \qquad \tilde{\sigma} = \sigma(r,t).$$

² A negative value for $\lambda$ means that the risk premium for holding longer term bonds is positive [15].


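Rewriting Equation (1.11) using $\tilde{\mu}$ and $\tilde{\sigma}$ yields

$$P_t + \tilde{\mu} P_r + \tfrac{1}{2}\tilde{\sigma}^2 P_{rr} - rP = 0. \tag{1.13}$$

Then we compute $P_t$, $P_r$ and $P_{rr}$ from Equation (1.12). Using $P_t = -P_\tau$, this gives us

$$P_t(\tau) = \big(-C_\tau(\tau) + B_\tau(\tau)\,r\big) P(\tau), \qquad P_r(\tau) = -B(\tau) P(\tau), \qquad P_{rr}(\tau) = B^2(\tau) P(\tau).$$

Substituting this in Equation (1.13) and dividing by $P(\tau)$ we get

$$-C_\tau(\tau) - \tilde{\mu} B(\tau) + \tfrac{1}{2}\tilde{\sigma}^2 B^2(\tau) - \big(1 - B_\tau(\tau)\big) r = 0. \tag{1.14}$$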

Note that if there exists a solution then this solution is unique [10]. In Section 1.2 and
Section 1.3 we will solve Equation (1.14) for the Vasicek model and the CIR model.

1.2 The Vasicek model

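The stochastic differential equation of the Vasicek model is

$$dr_t = \kappa(\mu - r_t)\,dt + \sigma\,dW_t, \tag{1.15}$$

where $\kappa$, $\mu$ and $\sigma$ are constants and $W_t$ is a Wiener process [21]³.

It can easily be shown using Itô's lemma [18], applied to $Y_t = e^{\kappa t} r_t$, that the exact discrete model corresponding to (1.15) is

$$r_{t+\Delta t} = e^{-\kappa \Delta t} r_t + \mu\big(1 - e^{-\kappa \Delta t}\big) + \sigma e^{-\kappa (t+\Delta t)} \int_t^{t+\Delta t} e^{\kappa s}\,dW_s. \tag{1.16}$$

This means that $r_{t+\Delta t}$, given $r_t$, is normally distributed with Equation (1.17) as its mean and Equation (1.18) as its variance,

$$\mathbb{E}\big[r_{t+\Delta t} \mid r_t\big] = e^{-\kappa \Delta t} r_t + \mu\big(1 - e^{-\kappa \Delta t}\big), \tag{1.17}$$

$$\mathrm{Var}\big[r_{t+\Delta t} \mid r_t\big] = \frac{\sigma^2}{2\kappa}\big(1 - e^{-2\kappa \Delta t}\big). \tag{1.18}$$

If $\kappa$ goes to zero, the expectation goes to $r_t$ and the variance to $\sigma^2 \Delta t$. If instead $\kappa \Delta t$ becomes large, the exponential terms vanish, so the expectation goes to $\mu$ and the variance to $\frac{\sigma^2}{2\kappa}$.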
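Because the transition in Equation (1.16) is Gaussian, a path of the short rate can be simulated without discretization error. The following is a minimal sketch under that observation; the parameter values, the seed and the function names are hypothetical and serve only as an illustration.

```python
import math
import random

def vasicek_conditional_mean(r_t, kappa, mu, dt):
    """Conditional mean of r_{t+dt} given r_t, Equation (1.17)."""
    decay = math.exp(-kappa * dt)
    return decay * r_t + mu * (1.0 - decay)

def vasicek_conditional_variance(kappa, sigma, dt):
    """Conditional variance of r_{t+dt} given r_t, Equation (1.18)."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return sigma ** 2 / (2.0 * kappa) * (-math.expm1(-2.0 * kappa * dt))

def simulate_vasicek(r0, kappa, mu, sigma, dt, n_steps, rng):
    """Simulate a path using the exact Gaussian transition of Equation (1.16)."""
    std = math.sqrt(vasicek_conditional_variance(kappa, sigma, dt))
    path = [r0]
    for _ in range(n_steps):
        mean = vasicek_conditional_mean(path[-1], kappa, mu, dt)
        path.append(rng.gauss(mean, std))
    return path

# Hypothetical parameters: monthly steps over ten years.
kappa, mu, sigma, dt = 0.5, 0.03, 0.01, 1.0 / 12.0
rng = random.Random(2012)  # fixed seed so that the run is reproducible
path = simulate_vasicek(0.01, kappa, mu, sigma, dt, 120, rng)
print(min(path), max(path))
```

Nothing in the simulation prevents a path from becoming negative, which is the drawback of the Vasicek model mentioned earlier.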
We want to compute the zero-coupon bond price for the Vasicek model. This means we have to solve Equation (1.14) with

$$\tilde{\mu} = \kappa(\mu - r), \qquad \tilde{\sigma} = \sigma.$$

³ We assume that the risk parameter $\lambda = 0$.

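We get

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa(\mu - r) B(\tau) + \tfrac{1}{2}\sigma^2 B^2(\tau) - \left(1 - \frac{\partial B(\tau)}{\partial \tau}\right) r = 0. \tag{1.19}$$

We can rewrite this as

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa\mu B(\tau) + \tfrac{1}{2}\sigma^2 B^2(\tau) - \left(1 - \kappa B(\tau) - \frac{\partial B(\tau)}{\partial \tau}\right) r = 0. \tag{1.20}$$

Because Equation (1.20) holds for every $r$ and every $\tau$ we must have

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa\mu B(\tau) + \tfrac{1}{2}\sigma^2 B^2(\tau) = 0, \tag{1.21}$$

and

$$\kappa B(\tau) + \frac{\partial B(\tau)}{\partial \tau} = 1. \tag{1.22}$$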

Furthermore, we have that

$$P(T,T) = P(\tau = 0) = e^{C(0) - B(0)\,r(T)} = 1$$

for every value of $r(T)$, which implies that $C(0) = B(0) = 0$. We first solve Equation (1.22) with boundary condition $B(0) = 0$. It is easy to see that $\frac{\partial B(\tau)}{\partial \tau} = e^{-\kappa\tau}$; solving this with $B(0) = 0$ we get

$$B(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa}. \tag{1.23}$$

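Now substitute Equation (1.23) into Equation (1.21) and integrate with respect to $s$ over the interval $[t,T]$. This gives

$$\begin{aligned}
C(\tau) &= \int_t^T \Big(-\kappa\mu B(T-s) + \tfrac{1}{2}\sigma^2 B^2(T-s)\Big)\,ds \\
&= -\kappa\mu \int_t^T \frac{1 - e^{-\kappa(T-s)}}{\kappa}\,ds + \tfrac{1}{2}\sigma^2 \int_t^T \left(\frac{1 - e^{-\kappa(T-s)}}{\kappa}\right)^2 ds \\
&= -\kappa\mu \int_t^T \frac{1 - e^{-\kappa(T-s)}}{\kappa}\,ds + \frac{\sigma^2}{2\kappa^2} \int_t^T \Big(1 - 2e^{-\kappa(T-s)} + e^{-2\kappa(T-s)}\Big)\,ds \\
&= -\mu(T-t) + \mu\,\frac{1 - e^{-\kappa(T-t)}}{\kappa} + \frac{\sigma^2}{2\kappa^2}(T-t) - \frac{\sigma^2}{\kappa^2}\,\frac{1 - e^{-\kappa(T-t)}}{\kappa} + \frac{\sigma^2}{2\kappa^2}\,\frac{1 - e^{-2\kappa(T-t)}}{2\kappa} \\
&= -\mu\tau + \mu B(\tau) + \frac{\sigma^2}{2\kappa^2}\tau - \frac{\sigma^2}{\kappa^2} B(\tau) + \frac{\sigma^2}{2\kappa^2}\,\frac{1 - e^{-2\kappa\tau}}{2\kappa} \\
&= \frac{\big(B(\tau) - \tau\big)\big(\kappa^2\mu - \tfrac{\sigma^2}{2}\big)}{\kappa^2} - \frac{\sigma^2}{4\kappa} B^2(\tau).
\end{aligned}$$

Therefore, we get for the bond price

$$P(t,\tau) = A(\tau)\,e^{-B(\tau)\,r(t)} = e^{-\tau R(t,\tau)}, \tag{1.24}$$

with

$$A(\tau) = \exp\left(\Big(\mu - \frac{\sigma^2}{2\kappa^2}\Big)\big(B(\tau) - \tau\big) - \frac{\sigma^2}{4\kappa} B^2(\tau)\right), \qquad B(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa}.$$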
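Equation (1.24) translates directly into code. The sketch below evaluates the Vasicek bond price and the corresponding compound interest rate for the nine maturities used in this thesis. The parameter values and function names are hypothetical and chosen only for illustration; they are not calibrated values.

```python
import math

def vasicek_B(tau, kappa):
    """B(tau) of Equation (1.23)."""
    return -math.expm1(-kappa * tau) / kappa

def vasicek_log_A(tau, kappa, mu, sigma):
    """C(tau) = ln A(tau) of Equation (1.24)."""
    B = vasicek_B(tau, kappa)
    return ((mu - sigma ** 2 / (2.0 * kappa ** 2)) * (B - tau)
            - sigma ** 2 * B ** 2 / (4.0 * kappa))

def vasicek_bond_price(r, tau, kappa, mu, sigma):
    """Zero-coupon bond price P = A(tau) * exp(-B(tau) * r)."""
    return math.exp(vasicek_log_A(tau, kappa, mu, sigma) - vasicek_B(tau, kappa) * r)

def vasicek_yield(r, tau, kappa, mu, sigma):
    """Compound interest rate R = -ln(P) / tau."""
    return -math.log(vasicek_bond_price(r, tau, kappa, mu, sigma)) / tau

# Hypothetical parameters and short rate (illustration only).
kappa, mu, sigma, r0 = 0.5, 0.03, 0.01, 0.01
maturities = [1 / 12, 0.25, 0.5, 1, 2, 5, 10, 20, 30]  # in years
curve = [vasicek_yield(r0, tau, kappa, mu, sigma) for tau in maturities]
for tau, rate in zip(maturities, curve):
    print(f"{tau:6.2f} years: {rate:.4%}")
```

For very short maturities the compound interest rate approaches the short rate itself, which is a quick sanity check on any implementation.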

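Recall that these values were determined under the risk-neutral measure. Since $R(t,\tau) = \big(B(\tau)\,r(t) - \ln A(\tau)\big)/\tau$, we get the following stochastic differential equation for $R(t,\tau)$,

$$dR(t,\tau) = \kappa\left(\frac{B(\tau)\,\mu}{\tau} - \frac{\ln A(\tau)}{\tau} - R(t,\tau)\right) dt + \frac{B(\tau)}{\tau}\,\sigma\,dW_t, \tag{1.25}$$

which can be rewritten as

$$dR(t,\tau) = \hat{\kappa}\big(\hat{\mu} - R(t,\tau)\big)\,dt + \hat{\sigma}\,dW_t, \tag{1.26}$$

with

$$\hat{\kappa} = \kappa, \qquad \hat{\mu} = \frac{B(\tau)\,\mu}{\tau} - \frac{\ln A(\tau)}{\tau}, \qquad \hat{\sigma} = \frac{B(\tau)}{\tau}\,\sigma.$$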

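Note that both $r_t$ and $R(t,\tau)$ are normally distributed in the Vasicek model. We can therefore calibrate both stochastic differential equations using the same distribution function but with different parameters. We collected $R(t,\tau)$ from the historical data. After estimating $\hat{\kappa}$, $\hat{\mu}$ and $\hat{\sigma}$ using the historical data, we can find the parameters $\kappa$, $\mu$ and $\sigma$ for the short rate model. Suppose that through calibration we estimated $\hat{\kappa}$, $\hat{\mu}$ and $\hat{\sigma}$. Inverting the relations below Equation (1.26) and using Equation (1.24), we then get

$$\kappa = \hat{\kappa}, \qquad B(\tau) = \frac{1 - e^{-\kappa\tau}}{\kappa}, \qquad \sigma = \frac{\tau}{B(\tau)}\,\hat{\sigma},$$

$$A(\tau) = \exp\left(\big(\tau - B(\tau)\big)\left[\frac{B(\tau)}{\tau}\left(\frac{\sigma^2 B^2(\tau)}{4\kappa\big(B(\tau) - \tau\big)} + \frac{\sigma^2}{2\kappa^2}\right) - \hat{\mu}\right]\right),$$

$$\mu = \left(1 - \frac{B(\tau)}{\tau}\right)\left(\frac{\sigma^2 B^2(\tau)}{4\kappa\big(B(\tau) - \tau\big)} + \frac{\sigma^2}{2\kappa^2}\right) + \hat{\mu}.$$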
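The mapping between the two parameter sets can be sketched as a pair of functions. The code below is an illustration under the assumption that a single maturity $\tau$ is used; the function names and the parameter values are hypothetical. Applying one function after the other should return the original short rate parameters.

```python
import math

def short_rate_to_yield_params(kappa, mu, sigma, tau):
    """Map (kappa, mu, sigma) to the hatted parameters of Equation (1.26)."""
    B = -math.expm1(-kappa * tau) / kappa
    log_A = ((mu - sigma ** 2 / (2.0 * kappa ** 2)) * (B - tau)
             - sigma ** 2 * B ** 2 / (4.0 * kappa))
    return kappa, (B * mu - log_A) / tau, B * sigma / tau

def yield_to_short_rate_params(kappa_hat, mu_hat, sigma_hat, tau):
    """Recover (kappa, mu, sigma) from estimates obtained for one maturity tau."""
    kappa = kappa_hat
    B = -math.expm1(-kappa * tau) / kappa
    sigma = tau * sigma_hat / B
    bracket = (sigma ** 2 * B ** 2 / (4.0 * kappa * (B - tau))
               + sigma ** 2 / (2.0 * kappa ** 2))
    mu = mu_hat + (1.0 - B / tau) * bracket
    return kappa, mu, sigma

# Hypothetical short rate parameters and a 5-year maturity (illustration only).
true_params = (0.5, 0.03, 0.01)
tau = 5.0
hat_params = short_rate_to_yield_params(true_params[0], true_params[1], true_params[2], tau)
recovered = yield_to_short_rate_params(hat_params[0], hat_params[1], hat_params[2], tau)
print(hat_params)
print(recovered)
```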

1.3 The Cox Ingersoll Ross model

Now we consider the CIR model. The stochastic differential equation of the CIR model is

$$dr_t = \kappa(\mu - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \tag{1.27}$$

where $\kappa$, $\mu$ and $\sigma$ are constants and $W_t$ is a Wiener process [5]⁴.

Using Itô's lemma again, as for the Vasicek model, we can show that the exact discrete model corresponding to (1.27) is

$$r_{t+\Delta t} = e^{-\kappa \Delta t} r_t + \mu\big(1 - e^{-\kappa \Delta t}\big) + \sigma e^{-\kappa (t+\Delta t)} \int_t^{t+\Delta t} e^{\kappa s}\sqrt{r_s}\,dW_s. \tag{1.28}$$

Now we can compute the expectation and the variance of the process. This gives us

$$\mathbb{E}\big[r_{t+\Delta t} \mid r_t\big] = e^{-\kappa \Delta t} r_t + \mu\big(1 - e^{-\kappa \Delta t}\big), \tag{1.29}$$

$$\mathrm{Var}\big[r_{t+\Delta t} \mid r_t\big] = e^{-\kappa \Delta t}\,r_t\,\frac{\sigma^2}{\kappa}\big(1 - e^{-\kappa \Delta t}\big) + \mu\,\frac{\sigma^2}{2\kappa}\big(1 - e^{-\kappa \Delta t}\big)^2. \tag{1.30}$$

If $\kappa$ goes to zero, the expectation goes to $r_t$ and the variance to $\sigma^2 r_t \Delta t$. If instead $\kappa \Delta t$ becomes large, the exponential terms vanish, so the expectation goes to $\mu$ and the variance to $\mu\frac{\sigma^2}{2\kappa}$. It can be shown that the process $r_t$ features a noncentral chi-squared distribution function [5].

⁴ We assume that the risk parameter $\lambda = 0$.
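We want to compute the zero-coupon bond price for the CIR model. This means we have to solve Equation (1.14) with

$$\tilde{\mu} = \kappa(\mu - r), \qquad \tilde{\sigma} = \sigma\sqrt{r}.$$

Therefore, we get

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa(\mu - r) B(\tau) + \tfrac{1}{2} r \sigma^2 B^2(\tau) - \left(1 - \frac{\partial B(\tau)}{\partial \tau}\right) r = 0. \tag{1.31}$$

We can rewrite this as

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa\mu B(\tau) - \left(1 - \tfrac{1}{2}\sigma^2 B^2(\tau) - \kappa B(\tau) - \frac{\partial B(\tau)}{\partial \tau}\right) r = 0. \tag{1.32}$$

Using the same argument as for the Vasicek model we get

$$-\frac{\partial C(\tau)}{\partial \tau} - \kappa\mu B(\tau) = 0, \tag{1.33}$$

and

$$\kappa B(\tau) + \frac{\partial B(\tau)}{\partial \tau} + \tfrac{1}{2}\sigma^2 B^2(\tau) = 1, \tag{1.34}$$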
with $B(0) = 0$ and $C(0) = 0$. We easily solved the ordinary differential equation for $B(\tau)$ in the Vasicek model, see Section 1.2. For the CIR model this is slightly harder. Note that we derive the solution for $C(\tau)$ but not for $B(\tau)$, for two reasons: deriving the solution for $C(\tau)$ is harder, and a very similar derivation for $B(\tau)$ is provided in [10]. We know that there exists a solution, see [5]. The solution is

$$B(\tau) = \frac{2\big(e^{\lambda\tau} - 1\big)}{2\lambda + (\lambda + \kappa)\big(e^{\lambda\tau} - 1\big)}, \tag{1.35}$$

with⁵

$$\lambda = \sqrt{\kappa^2 + 2\sigma^2}.$$

This gives us

$$B^2(\tau) = \frac{4\big(e^{\lambda\tau} - 1\big)^2}{\Big(2\lambda + (\lambda + \kappa)\big(e^{\lambda\tau} - 1\big)\Big)^2}, \qquad B_\tau(\tau) = \frac{4\lambda^2 e^{\lambda\tau}}{\Big(2\lambda + (\lambda + \kappa)\big(e^{\lambda\tau} - 1\big)\Big)^2}.$$

⁵ Note that in this case $\lambda$ does not denote the risk parameter.

As expected, this solution indeed solves Equation (1.34), and since a solution of this equation is unique [10], it is the only one. Now substitute Equation (1.35) into Equation (1.33) and integrate with respect to $s$ over the interval $[t,T]$. Therefore, we get

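$$C(\tau) = \int_t^T -\kappa\mu B(T-s)\,ds = -\kappa\mu \int_t^T \frac{2\big(e^{\lambda(T-s)} - 1\big)}{2\lambda + (\lambda + \kappa)\big(e^{\lambda(T-s)} - 1\big)}\,ds.$$

To solve this integral, we start with the transformation

$$x(s) = e^{\lambda(T-s)}, \qquad ds = -\frac{1}{\lambda x(s)}\,dx.$$

This gives us

$$\begin{aligned}
C(\tau) &= -2\kappa\mu \int_t^T \frac{e^{\lambda(T-s)} - 1}{2\lambda + (\lambda + \kappa)\big(e^{\lambda(T-s)} - 1\big)}\,ds \\
&= -2\kappa\mu \int_{e^{\lambda(T-t)}}^{1} \frac{-(x - 1)}{\lambda x\big(2\lambda + (\lambda + \kappa)(x - 1)\big)}\,dx \\
&= \int_{e^{\lambda(T-t)}}^{1} \frac{2\kappa\mu}{\lambda}\left(\frac{1}{2\lambda + (\lambda + \kappa)(x - 1)} - \frac{1}{x\big(2\lambda + (\lambda + \kappa)(x - 1)\big)}\right) dx \\
&= \int_{e^{\lambda(T-t)}}^{1} \frac{2\kappa\mu}{\lambda}\left(\frac{1}{2\lambda + (\lambda + \kappa)(x - 1)} - \frac{\lambda + \kappa}{(\kappa - \lambda)\big(2\lambda + (\lambda + \kappa)(x - 1)\big)} - \frac{1}{x(\lambda - \kappa)}\right) dx.
\end{aligned}$$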

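Next, we apply the transformation

$$y(x) = 2\lambda + (\lambda + \kappa)(x - 1), \qquad dx = \frac{1}{\lambda + \kappa}\,dy,$$

to the first two terms. For $x = 1$ we have $y = 2\lambda$. This leads to

$$\begin{aligned}
C(\tau) &= \frac{2\kappa\mu}{\lambda} \int_{2\lambda + (\lambda + \kappa)(e^{\lambda(T-t)} - 1)}^{2\lambda} \left(\frac{1}{(\lambda + \kappa)y} - \frac{1}{(\kappa - \lambda)y}\right) dy - \frac{2\kappa\mu}{\lambda} \int_{e^{\lambda(T-t)}}^{1} \frac{1}{(\lambda - \kappa)x}\,dx \\
&= \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda + \kappa}\ln(y) - \frac{1}{\kappa - \lambda}\ln(y)\right]_{2\lambda + (\lambda + \kappa)(e^{\lambda(T-t)} - 1)}^{2\lambda} - \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda - \kappa}\ln(x)\right]_{e^{\lambda(T-t)}}^{1} \\
&= \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda + \kappa}\ln\big(2\lambda + (\lambda + \kappa)(x - 1)\big) - \frac{1}{\kappa - \lambda}\ln\big(2\lambda + (\lambda + \kappa)(x - 1)\big)\right]_{e^{\lambda(T-t)}}^{1} - \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda - \kappa}\ln(x)\right]_{e^{\lambda(T-t)}}^{1} \\
&= \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda + \kappa}\ln\big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\big) - \frac{1}{\kappa - \lambda}\ln\big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\big)\right]_{t}^{T} - \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda - \kappa}\ln\big(e^{\lambda(T-s)}\big)\right]_{t}^{T}.
\end{aligned}$$

Instead of just substituting the boundaries and evaluating, we first simplify the expression so that we eventually obtain a simple formula to work with. Simplifying the above equation, and using $\kappa^2 - \lambda^2 = -2\sigma^2$, gives

$$\begin{aligned}
C(\tau) &= \frac{2\kappa\mu}{\lambda} \left[\ln\Big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\Big)^{-\frac{2\lambda}{\kappa^2 - \lambda^2}}\right]_{t}^{T} - \frac{2\kappa\mu}{\lambda} \left[\frac{1}{\lambda - \kappa}\ln\big(e^{\lambda(T-s)}\big)\right]_{t}^{T} \\
&= \frac{2\kappa\mu}{\lambda} \left[\ln\Big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\Big)^{-\frac{2\lambda}{\kappa^2 - \lambda^2}}\right]_{t}^{T} - \frac{2\kappa\mu}{\lambda} \left[\ln\Big(e^{\frac{(T-s)(\kappa + \lambda)}{2}}\Big)^{-\frac{2\lambda}{\kappa^2 - \lambda^2}}\right]_{t}^{T} \\
&= \frac{-4\kappa\mu}{\kappa^2 - \lambda^2} \left[\ln\Big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\Big) - \ln\Big(e^{\frac{(T-s)(\kappa + \lambda)}{2}}\Big)\right]_{t}^{T} \\
&= \frac{2\kappa\mu}{\sigma^2} \left[\ln\Big(2\lambda + (\lambda + \kappa)(e^{\lambda(T-s)} - 1)\Big) - \ln\Big(e^{\frac{(T-s)(\kappa + \lambda)}{2}}\Big)\right]_{t}^{T} \\
&= \frac{2\kappa\mu}{\sigma^2} \left(\ln(2\lambda) - \ln\Big(2\lambda + (\lambda + \kappa)(e^{\lambda\tau} - 1)\Big) + \ln\Big(e^{\frac{\tau(\kappa + \lambda)}{2}}\Big)\right) \\
&= \frac{2\kappa\mu}{\sigma^2} \ln\left(\frac{2\lambda e^{(\lambda + \kappa)\frac{\tau}{2}}}{2\lambda + (\lambda + \kappa)(e^{\lambda\tau} - 1)}\right).
\end{aligned}$$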

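Now we have a formula for the zero-coupon bond price for the CIR model, which is given by

$$P(t,\tau) = A(\tau)\,e^{-B(\tau)\,r(t)} = e^{-\tau R(t,\tau)}, \tag{1.36}$$

with

$$A(\tau) = \left(\frac{2\lambda e^{(\lambda + \kappa)\frac{\tau}{2}}}{2\lambda + (\lambda + \kappa)\big(e^{\lambda\tau} - 1\big)}\right)^{\frac{2\kappa\mu}{\sigma^2}}, \qquad B(\tau) = \frac{2\big(e^{\lambda\tau} - 1\big)}{2\lambda + (\lambda + \kappa)\big(e^{\lambda\tau} - 1\big)}, \qquad \lambda = \sqrt{\kappa^2 + 2\sigma^2}.$$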
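The closed-form solution in Equation (1.36) can be evaluated as follows. This is a minimal sketch with hypothetical parameter values and function names; it works with $\ln A(\tau)$ rather than $A(\tau)$ to avoid raising a small number to a large power.

```python
import math

def cir_lambda(kappa, sigma):
    """lambda = sqrt(kappa^2 + 2 sigma^2); not the market risk parameter."""
    return math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)

def cir_B(tau, kappa, sigma):
    """B(tau) of Equation (1.35)."""
    lam = cir_lambda(kappa, sigma)
    growth = math.expm1(lam * tau)  # e^{lambda tau} - 1
    return 2.0 * growth / (2.0 * lam + (lam + kappa) * growth)

def cir_log_A(tau, kappa, mu, sigma):
    """C(tau) = ln A(tau) of Equation (1.36)."""
    lam = cir_lambda(kappa, sigma)
    denominator = 2.0 * lam + (lam + kappa) * math.expm1(lam * tau)
    log_ratio = math.log(2.0 * lam) + (lam + kappa) * tau / 2.0 - math.log(denominator)
    return 2.0 * kappa * mu / sigma ** 2 * log_ratio

def cir_bond_price(r, tau, kappa, mu, sigma):
    """Zero-coupon bond price P = A(tau) * exp(-B(tau) * r)."""
    return math.exp(cir_log_A(tau, kappa, mu, sigma) - cir_B(tau, kappa, sigma) * r)

def cir_yield(r, tau, kappa, mu, sigma):
    """Compound interest rate R = -ln(P) / tau, for tau > 0."""
    return -math.log(cir_bond_price(r, tau, kappa, mu, sigma)) / tau

# Hypothetical parameters and short rate (illustration only).
kappa, mu, sigma, r0 = 0.5, 0.03, 0.1, 0.01
for tau in [1 / 12, 0.25, 0.5, 1, 2, 5, 10, 20, 30]:
    print(f"{tau:6.2f} years: {cir_yield(r0, tau, kappa, mu, sigma):.4%}")
```

A finite-difference check of Equations (1.33) and (1.34) against these functions is a cheap way to catch sign or exponent mistakes in an implementation.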

Now we repeat the same steps as for the Vasicek model. Under the risk-neutral measure we get the following stochastic differential equation for $R(t,\tau)$,

$$dR(t,\tau) = \kappa\,\frac{B(\tau)}{\tau}\left(\mu - \frac{\tau R(t,\tau) + \ln A(\tau)}{B(\tau)}\right) dt + \sigma\,\frac{B(\tau)}{\tau}\sqrt{\frac{\tau R(t,\tau) + \ln A(\tau)}{B(\tau)}}\;dW_t. \tag{1.37}$$

Because of the term $\ln A(\tau)$ inside the square root, we cannot rewrite this equation in the form

$$dR(t,\tau) = \hat{\kappa}\big(\hat{\mu} - R(t,\tau)\big)\,dt + \hat{\sigma}\sqrt{R(t,\tau)}\;dW_t, \tag{1.38}$$

and from this we can conclude that $r_t$ and $R(t,\tau)$ do not follow the same distribution, as was the case in the Vasicek model. This means that classical methods such as MLE and LSM fail. Therefore, we must apply the Kalman filter algorithm, which was used in the paper of Duan and Simonato [7] (1995) dealing with the estimation of affine term structure models.

1.4 The Nelson-Siegel model

Diebold and Li (2006) reintroduced the Nelson-Siegel model, which dates from 1987. In their paper [6] they provided evidence that the model can also be a valuable tool for forecasting the yield curve. Even though the yield curve is almost perfectly fitted (unlike with the Vasicek model and the CIR model, as we will see in Chapter 3), the Nelson-Siegel model does allow arbitrage. Despite this major disadvantage, the Bank for International Settlements (BIS, 2005) reports that nine out of the thirteen central banks which report their curve estimation methods to the BIS use the Nelson-Siegel model to construct zero-coupon yield curves. For this reason, we describe this model briefly. Let us define $r(t,\tau) = r_t(\tau)$ as the forward rate curve at time $t$ for a time to maturity $\tau$. Then in the Nelson-Siegel model $r_t(\tau)$ is given by

$$r_t(\tau) = \beta_{1,t} + \beta_{2,t}\,e^{-\lambda_t \tau} + \beta_{3,t}\,\lambda_t \tau\,e^{-\lambda_t \tau}. \tag{1.39}$$

The compound interest rate, $R(t,\tau) = R_t(\tau)$, is the average of the forward rate curve,

$$R_t(\tau) = \frac{1}{\tau}\int_0^\tau r_t(s)\,ds. \tag{1.40}$$

Substituting Equation (1.39) into (1.40) we get

$$R_t(\tau) = \beta_{1,t}\,\kappa_1 + \beta_{2,t}\,\kappa_2 + \beta_{3,t}\,\kappa_3, \tag{1.41}$$

with

$$\kappa_1 = 1, \qquad \kappa_2 = \frac{1 - e^{-\lambda_t \tau}}{\lambda_t \tau}, \qquad \kappa_3 = \frac{1 - e^{-\lambda_t \tau}}{\lambda_t \tau} - e^{-\lambda_t \tau}.$$
There are several reasons why this model is widely used. The most important reason is
that it provides a very good approximation of the yield curve by using a small number of
parameters. Together, the three parameters κ1 , κ2 , κ3 , give the model enough flexibility to fit
all different types shapes. We want to explain the parameters β1,t , β2,t and β3,t . Diebold and

17
18 CHAPTER 1. INTEREST RATE MODELS

Li (2006) define the parameters β1,t , β2,t and β3,t as long-term, short-term and medium-term
components respectively. The following limits are well-defined,

lim Rt (τ ) = β1,t + β2,t , (1.42)


τ →0

and

lim Rt (τ ) = β1,t . (1.43)


τ →∞

The first component, $\beta_{1,t}$, is referred to as the long-term component, because the loading
on $\beta_{1,t}$ is $\kappa_1$, which is equal to 1 and hence does not decay to zero in the limit as $\tau \to \infty$.
The long-term component at time $t$ is chosen in such a way that it is equal to the interest rate
of the contract with the longest maturity [6] at time $t$. For our specific case we have,

\[ R_t(\tau = 30 \text{ years}) \approx \beta_{1,t}. \tag{1.44} \]


The second component, $\beta_{2,t}$, is referred to as the short-term component, because the loading
on $\beta_{2,t}$ is $\kappa_2$, which starts at 1 but decays monotonically and quickly to zero as $\tau \to \infty$.
Diebold and Li define the yield curve slope as the long-term limit minus the short-term limit,
that is, Equation (1.43) minus Equation (1.42), which gives $-\beta_{2,t}$. For our specific case we have,

\[ R_t(\tau = 30 \text{ years}) - R_t(\tau = 1 \text{ month}) \approx -\beta_{2,t}. \tag{1.45} \]


The final component, $\beta_{3,t}$, is referred to as the medium-term component, because the
loading on $\beta_{3,t}$ is $\kappa_3$, which starts at 0, increases to its maximum, which is determined by
the decay parameter $\lambda_t$, and then returns to zero as $\tau \to \infty$. The medium-term component is
closely related to the yield curve curvature, which is defined as twice the two-year yield minus
the sum of the limits in Equation (1.42) and Equation (1.43) [6]. For our specific case we have,

\[ 2R_t(\tau = 2 \text{ years}) - R_t(\tau = 30 \text{ years}) - R_t(\tau = 1 \text{ month}) \approx \beta_{3,t}. \tag{1.46} \]
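Thus, if we define $l_t$, $s_t$ and $c_t$ as the yield curve level, yield curve slope and yield curve
curvature respectively, we must have,

\[ \mathrm{corr}\bigl(\beta_{1,t}, l_t\bigr) \approx 1, \tag{1.47} \]

\[ \mathrm{corr}\bigl(\beta_{2,t}, s_t\bigr) \approx -1, \tag{1.48} \]

\[ \mathrm{corr}\bigl(\beta_{3,t}, c_t\bigr) \approx 1. \tag{1.49} \]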
We may choose to estimate the parameters $\theta = \{\beta_{1,t}, \beta_{2,t}, \beta_{3,t}, \lambda_t\}$ by nonlinear least
squares for each time step. However, Diebold and Li instead fix $\lambda_t$ at a prespecified value,
which makes the model linear in $\beta_{1,t}$, $\beta_{2,t}$ and $\beta_{3,t}$ and allows them to compute these values
by using LSM for each time step $t$. Therefore, we will also fix $\lambda_t$ before using LSM to estimate
the parameters $\beta_{1,t}$, $\beta_{2,t}$ and $\beta_{3,t}$. We must choose $\lambda_t$ in such a way that the error between
the actual yield curve and the yield curve estimated with the Nelson-Siegel model is minimized;
for details see Chapter 3.
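
The fixed-$\lambda_t$ procedure can be sketched in a few lines of code. The block below is a minimal
illustration and not the implementation used in this thesis: the function names (`ns_loadings`,
`solve_linear`, `fit_betas`, `ns_yield`) are hypothetical, maturities are assumed to be measured in
years, and the value of `lam` in the usage note is an arbitrary placeholder. For a fixed $\lambda_t$ the
loadings $\kappa_1, \kappa_2, \kappa_3$ of Equation (1.41) are known numbers, so the $\beta$'s follow from ordinary
least squares via the 3 x 3 normal equations.

```python
import math

def ns_loadings(tau, lam):
    """Nelson-Siegel loadings (kappa_1, kappa_2, kappa_3) for a maturity tau > 0."""
    x = lam * tau
    k2 = (1.0 - math.exp(-x)) / x
    return (1.0, k2, k2 - math.exp(-x))

def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_betas(taus, yields, lam):
    """Least-squares estimate of (beta_1, beta_2, beta_3) for a fixed lambda."""
    rows = [ns_loadings(t, lam) for t in taus]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    return solve_linear(xtx, xty)

def ns_yield(tau, betas, lam):
    """Compound rate R_t(tau) of Equation (1.41) for given betas and lambda."""
    return sum(b * k for b, k in zip(betas, ns_loadings(tau, lam)))
```

A possible use is to call `fit_betas` once per observation date with the nine maturities
`[1/12, 0.25, 0.5, 1, 2, 5, 10, 20, 30]` and the observed yields of that date, and to repeat this
for a grid of candidate values of `lam`, keeping the value with the smallest fitting error.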

2 Calibration of Interest Rate Models

2.1 Calibration with MLE and LSM

We are given the historical data and we want to find the parameters $\kappa$, $\mu$ and $\sigma$ for the
Vasicek model. To do so, we will use two methods, LSM and MLE. We start with LSM. The Euler
discretization of the Vasicek model is given by,

\[ R_{(t+\Delta t,\tau)} = R_{(t,\tau)} + \kappa\bigl(\mu - R_{(t,\tau)}\bigr)\Delta t + \sigma\bigl(W_{t+\Delta t} - W_t\bigr). \tag{2.1} \]

By rewriting Equation (2.1) we can use LSM to estimate the drift parameters by solving
the following equation,

\[ (\hat{\kappa}, \hat{\mu}) = \arg\min_{\kappa,\mu} \sum_{i=1}^{N-1}
   \Bigl( R_{(i+1,\tau)} - R_{(i,\tau)} - \kappa\mu\Delta t + \kappa R_{(i,\tau)}\Delta t \Bigr)^2. \tag{2.2} \]

Once we have solved (2.2) (in our case with MATLAB), we compute the standard deviation of
the residuals and use it as the estimator $\hat{\sigma}$ for the parameter $\sigma$. Thus, we have an
estimator $(\hat{\kappa}, \hat{\mu}, \hat{\sigma})$ for $(\kappa, \mu, \sigma)$. We can apply the transformations in Section 1.2 to
get the parameters for $r_t$. Note that for LSM we do not use the conditional density function or
the density function. Any discretization suffices; in our case we used the Euler discretization
to estimate the parameters.
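
The least-squares problem (2.2) is a simple linear regression of the increments on the current
level, so it can be sketched without any numerical optimizer. The block below is a hedged
illustration, not the MATLAB code used for the thesis: the function name `vasicek_lsm` is
hypothetical, and it assumes that the Brownian increment in Equation (2.1) has variance $\Delta t$,
so that the residual standard deviation is divided by $\sqrt{\Delta t}$ to obtain $\hat{\sigma}$.

```python
import math

def vasicek_lsm(rates, dt):
    """Least-squares fit of the Euler-discretised Vasicek model.

    Regresses the increments R[i+1] - R[i] on R[i]; the intercept equals
    kappa * mu * dt and the slope equals -kappa * dt.
    """
    x = rates[:-1]
    y = [b - a for a, b in zip(rates[:-1], rates[1:])]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    kappa = -slope / dt
    mu = -intercept / slope
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Two regression coefficients were estimated, hence n - 2 degrees of freedom.
    # Assumption: residuals have variance sigma^2 * dt, so rescale by dt.
    sigma = math.sqrt(sum(e * e for e in resid) / (n - 2) / dt)
    return kappa, mu, sigma
```

For monthly observations one would call `vasicek_lsm(rates, 1 / 12)` on the series of one
contract; the function returns the triple $(\hat{\kappa}, \hat{\mu}, \hat{\sigma})$.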
Next, we will use MLE to estimate the parameters $(\kappa, \mu, \sigma)$. Since $R_t$ follows a Markov
process, the joint density of the observations factorizes into the product of the conditional
densities of $R_{(t+\Delta t,\tau)}$ given $R_{(t,\tau)}$ [18]. Therefore, the likelihood function is given by

\[ \begin{aligned}
L\bigl(\theta; R_{(2,\tau)}|R_{(1,\tau)}, \ldots, R_{(N,\tau)}|R_{(N-1,\tau)}\bigr)
  &= f\bigl(R_{(2,\tau)}|R_{(1,\tau)}, \ldots, R_{(N,\tau)}|R_{(N-1,\tau)}; \theta\bigr) \\
  &= \prod_{i=1}^{N-1} f\bigl(R_{(i+1,\tau)}|R_{(i,\tau)}; \theta\bigr),
\end{aligned} \tag{2.3} \]

with θ = (κ, µ, σ). In our case it is more convenient to work with the log-likelihood, which
is the logarithm of the likelihood function L. Thus, we get



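\[ \begin{aligned}
\ell\bigl(\theta; R_{(2,\tau)}|R_{(1,\tau)}, \ldots, R_{(N,\tau)}|R_{(N-1,\tau)}\bigr)
  &= \ln L\bigl(\theta; R_{(2,\tau)}|R_{(1,\tau)}, \ldots, R_{(N,\tau)}|R_{(N-1,\tau)}\bigr) \\
  &= \sum_{i=1}^{N-1} \ln f\bigl(R_{(i+1,\tau)}|R_{(i,\tau)}; \theta\bigr).
\end{aligned} \tag{2.4} \]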
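
To evaluate the log-likelihood numerically an explicit transition density is needed. The sketch
below is only an illustration under a stated assumption: instead of the exact conditional density
from Chapter 1 it uses the Gaussian one-step density implied by the Euler scheme (2.1), that is,
a normal distribution with mean $R_{(i,\tau)} + \kappa(\mu - R_{(i,\tau)})\Delta t$ and variance $\sigma^2 \Delta t$. The function
name `euler_loglik` is hypothetical.

```python
import math

def euler_loglik(theta, rates, dt):
    """Log-likelihood (sum of log transition densities) under the Euler scheme.

    theta = (kappa, mu, sigma); assumes Gaussian increments with variance sigma^2 * dt.
    """
    kappa, mu, sigma = theta
    var = sigma * sigma * dt
    total = 0.0
    for r0, r1 in zip(rates[:-1], rates[1:]):
        mean = r0 + kappa * (mu - r0) * dt
        total += -0.5 * math.log(2.0 * math.pi * var) - (r1 - mean) ** 2 / (2.0 * var)
    return total
```

Maximizing this function over $\theta$ with a numerical optimizer gives the (approximate) maximum
likelihood estimate; under this Gaussian assumption the maximizing drift parameters coincide
with the least-squares solution of (2.2).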
Many articles, such as Durham and Gallant (2002), have approximated the likelihood
function of a continuous system analytically and computed it by numerical methods. Despite
its good asymptotic properties, MLE can produce a huge bias in the estimated parameters of
the short rate models. It is well known that the bias is mainly found in the drift parameters,
see for instance Merton [14], Ball and Torous [2] and Yu and Phillips [23]. Merton first
discovered the bias when estimating parameters with the Black-Scholes model. The difficulty
with the drift parameters is common to the estimation approaches, including MLE. The problem
becomes worse when the process has a lack of dynamics, which happens when $\kappa$ is small. As
mentioned by Yu and Phillips [24] and as shown by our test results, see Section 2.4, MLE and
LSM can produce a bias of more than 400% for $\kappa$. The prices of bond options and other
derivative securities depend mainly on the values of these unknown parameters. Yu and
Phillips [24] introduced the jackknife method to reduce the bias in parameter estimation;
however, this comes with an increase in the variance. Aït-Sahalia [1] and Tang and Chen [20]
used bootstrapping combined with MLE, which reduces the bias in the parameters without
increasing the variance.

2.2 Calibration with Kalman Filter

In 1960, Rudolf E. Kalman published his famous paper [12] introducing a powerful linear
filtering technique named after him. The Kalman filter is a method that provides an efficient
recursive algorithm to estimate the state of a process, in a way that minimizes the mean of
the squared error. The filter is very powerful because it supports estimations of past, present
and future states, even when the exact parameters of the model are not known.
A good introduction to the Kalman filter is given in [22]. We will follow this paper but give
a more detailed introduction. The Kalman filter addresses the general problem of estimating
the state $x \in \mathbb{R}^n$ of a discrete-time controlled process that is governed by the linear
stochastic difference equation,

\[ x_t = A x_{t-1} + B u_{t-1} + w_t, \tag{2.5} \]

with a measurement $z \in \mathbb{R}^m$ that is observed. Thus,

\[ z_t = H x_t + v_t. \tag{2.6} \]

The variable $u_{t-1} \in \mathbb{R}^n$ is the optional control input. We assume that $u_{t-1}$ is equal to
the identity, so that the term $B u_{t-1}$ in Equation (2.5) reduces to $B$. The random variables $w_t$
and $v_t$ represent the process and measurement noise respectively. They are assumed to be
independent and normally distributed. Thus, $w_t \sim N(0, Q_t)$ and $v_t \sim N(0, R_t)$, where

\[ Q_t = \begin{pmatrix}
q_{t,1}^2 & 0 & \cdots & 0 \\
0 & q_{t,2}^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & q_{t,n}^2
\end{pmatrix}, \]

and

\[ R_t = \begin{pmatrix}
\rho_{t,1}^2 & 0 & \cdots & 0 \\
0 & \rho_{t,2}^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \rho_{t,m}^2
\end{pmatrix}. \]
The covariance matrices $Q_t$ and $R_t$ are time dependent; therefore, they may change with
each time step or measurement. However, we will take the covariance matrix $R_t$ constant.
Minimizing it will result in more reliable estimated parameters, as we will show later in
this section. In Section 2.4 we will also demonstrate this when testing the Kalman filter. We
know that since $w_t \sim N(0, Q_t)$,

\[ p(x_t | x_{t-1}) \sim N(A x_{t-1} + B, Q_t), \tag{2.7} \]

and since $v_t \sim N(0, R_t)$ we have

\[ p(z_t | x_t) \sim N(H x_t, R_t). \tag{2.8} \]

It follows that,

\[ p(x_t | z_t) \sim N(\mu_t, \sigma_t^2), \tag{2.9} \]


for details see [22]. We want to find $\mu_t$ and $\sigma_t^2$. We define $\hat{x}_{t|t-1}$ to be the estimated
value of $x_t$ at time $t$, using information up to $t-1$, and $E[\hat{x}_{t|t-1}]$ is the expected value of
this estimate. The estimated error at time $t$ is given by,

\[ \begin{aligned}
\Delta x_{t|t-1} &= x_t - \hat{x}_{t|t-1} \\
 &= x_t - E[\hat{x}_{t|t-1}] \\
 &= A x_{t-1} + B + w_t - (A \hat{x}_{t-1|t-1} + B) \\
 &= A (x_{t-1} - \hat{x}_{t-1|t-1}) + w_t \\
 &= A \Delta x_{t-1|t-1} + w_t .
\end{aligned} \]

We define the covariance of the estimated error at time $t$, given information up to $t-1$, as
$P_{\{t|t-1\}}$. To compute $P_{\{t|t-1\}}$ it suffices to compute $E[\Delta x_{t|t-1} \Delta x_{t|t-1}^{T}]$, because
$E[\Delta x_{t|t-1}] = 0$ (in this section the superscript $T$ denotes the transpose and not the
maturity time, hence $A^{T}$ is the transpose of a matrix $A$). Thus, we get

\[ \begin{aligned}
P_{\{t|t-1\}} &= E\bigl[\Delta x_{t|t-1} \Delta x_{t|t-1}^{T}\bigr] \\
 &= E\bigl[(A \Delta x_{t-1|t-1} + w_t)(A \Delta x_{t-1|t-1} + w_t)^{T}\bigr] \\
 &= E\bigl[(A \Delta x_{t-1|t-1} + w_t)(\Delta x_{t-1|t-1}^{T} A^{T} + w_t^{T})\bigr] \\
 &= E\bigl[A \Delta x_{t-1|t-1} \Delta x_{t-1|t-1}^{T} A^{T} + A \Delta x_{t-1|t-1} w_t^{T}
      + w_t \Delta x_{t-1|t-1}^{T} A^{T} + w_t w_t^{T}\bigr] \\
 &= A\, E\bigl[\Delta x_{t-1|t-1} \Delta x_{t-1|t-1}^{T}\bigr] A^{T} + E\bigl[w_t w_t^{T}\bigr] \\
 &= A P_{\{t-1|t-1\}} A^{T} + Q_t .
\end{aligned} \]

Next, we define $\epsilon_{t|t-1}$, also called the innovation or measurement residual, as the difference
between the data observed at time $t$ and the data we expected to observe given the
information up to time $t-1$, that is,

\[ \epsilon_{t|t-1} = z_t - H \hat{x}_{t|t-1}. \tag{2.10} \]

Define $K$ as the Kalman gain matrix. The Kalman gain matrix determines the correction
added to the estimate $\hat{x}_{t|t-1}$, which is proportional to the measurement residual. Therefore,
we have

\[ \begin{aligned}
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + K \epsilon_{t|t-1} \\
 &= \hat{x}_{t|t-1} + K (z_t - H \hat{x}_{t|t-1}).
\end{aligned} \tag{2.11} \]

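The matrix $K$ is chosen in such a way that it minimizes $P_{\{t|t\}}$, the covariance of the
estimated error at time $t$. So let us first find an expression for $P_{\{t|t\}}$, before we give an
expression for $K$. Writing $\epsilon_{t|t-1}$ for the measurement residual of Equation (2.10), we have

\[ \begin{aligned}
P_{\{t|t\}} &= E\bigl[\Delta x_{t|t} \Delta x_{t|t}^{T}\bigr] \\
 &= E\bigl[(x_t - \hat{x}_{t|t})(x_t - \hat{x}_{t|t})^{T}\bigr] \\
 &= E\bigl[(x_t - \hat{x}_{t|t-1} - K \epsilon_{t|t-1})(x_t - \hat{x}_{t|t-1} - K \epsilon_{t|t-1})^{T}\bigr] \\
 &= E\bigl[(x_t - \hat{x}_{t|t-1} - K(z_t - H \hat{x}_{t|t-1}))(x_t - \hat{x}_{t|t-1} - K(z_t - H \hat{x}_{t|t-1}))^{T}\bigr] \\
 &= E\bigl[(x_t - \hat{x}_{t|t-1} - K(H x_t + v_t - H \hat{x}_{t|t-1}))(x_t - \hat{x}_{t|t-1} - K(H x_t + v_t - H \hat{x}_{t|t-1}))^{T}\bigr] \\
 &= E\bigl[((I - KH)(x_t - \hat{x}_{t|t-1}) - K v_t)((I - KH)(x_t - \hat{x}_{t|t-1}) - K v_t)^{T}\bigr] \\
 &= (I - KH)\, E\bigl[(x_t - \hat{x}_{t|t-1})(x_t - \hat{x}_{t|t-1})^{T}\bigr] (I - KH)^{T} + K\, E\bigl[v_t v_t^{T}\bigr] K^{T} \\
 &= (I - KH) P_{\{t|t-1\}} (I - KH)^{T} + K R_t K^{T} \\
 &= P_{\{t|t-1\}} - P_{\{t|t-1\}} H^{T} K^{T} - K H P_{\{t|t-1\}} + K H P_{\{t|t-1\}} H^{T} K^{T} + K R_t K^{T},
\end{aligned} \]

where $I$ is the identity matrix. So we have

\[ P_{\{t|t\}} = P_{\{t|t-1\}} - P_{\{t|t-1\}} H^{T} K^{T} - K H P_{\{t|t-1\}} + K H P_{\{t|t-1\}} H^{T} K^{T} + K R_t K^{T}. \tag{2.12} \]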

If we want to minimize the covariance $P_{\{t|t\}}$, we can use the mean square error measure

\[ \begin{aligned}
E\bigl[\lVert x_t - \hat{x}_{t|t} \rVert^2\bigr] &= \mathrm{tr}\bigl(E[\Delta x_{t|t} \Delta x_{t|t}^{T}]\bigr) \\
 &= \mathrm{tr}(P_{\{t|t\}}).
\end{aligned} \]

If we then take the trace of both sides of Equation (2.12) and compute the gradient with
respect to $K$, we get

\[ \nabla_K\, \mathrm{tr}(P_{\{t|t\}}) = -P_{\{t|t-1\}} H^{T} - P_{\{t|t-1\}} H^{T} + 2 K H P_{\{t|t-1\}} H^{T} + 2 K R_t . \]

By setting this gradient equal to zero and solving for $K$, we find the value of $K$ that
minimizes $P_{\{t|t\}}$. Therefore, the Kalman gain matrix equals

\[ K = P_{\{t|t-1\}} H^{T} \bigl(H P_{\{t|t-1\}} H^{T} + R_t\bigr)^{-1}. \tag{2.13} \]


Note that when the covariance matrix $R_t$ approaches zero, the Kalman gain matrix weights
the residual more heavily. In other words, the measurement $z_t$ is assumed to be more
and more reliable. Our measurement in the short interest rate models is the compound
interest rate, that is, the historical data. The historical data must be trusted; therefore, we
ought to minimize $R_t$, the variance of the error term $v_t$ in Equation (2.6), to get reliable
estimated parameters. However, if the estimated covariance matrix $P_{\{t|t-1\}}$ approaches zero
we see that,

\[ \lim_{P_{\{t|t-1\}} \to 0} K = 0, \tag{2.14} \]

which means that the Kalman gain matrix weights the residuals less heavily. In other
words, the measurement $z_t$ is assumed to be less and less reliable compared with the
prediction. What is left is to update the covariance of the estimated error at time $t$, $P_{\{t|t\}}$.
We can do this by substituting Equation (2.13) into Equation (2.12). This gives us

\[ P_{\{t|t\}} = (I - KH) P_{\{t|t-1\}}. \tag{2.15} \]
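
The substitution can be written out in one line of algebra. The short derivation below is added
as a worked check; it groups the last two terms of Equation (2.12) and uses the fact that, by
Equation (2.13), $K (H P_{\{t|t-1\}} H^{T} + R_t) = P_{\{t|t-1\}} H^{T}$.

```latex
\begin{align*}
P_{\{t|t\}} &= P_{\{t|t-1\}} - P_{\{t|t-1\}} H^{T} K^{T} - K H P_{\{t|t-1\}}
              + K \bigl(H P_{\{t|t-1\}} H^{T} + R_t\bigr) K^{T} \\
 &= P_{\{t|t-1\}} - P_{\{t|t-1\}} H^{T} K^{T} - K H P_{\{t|t-1\}} + P_{\{t|t-1\}} H^{T} K^{T} \\
 &= (I - KH)\, P_{\{t|t-1\}} .
\end{align*}
```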


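So now we know the distribution in Equation (2.9), namely

\[ p(x_t | z_t) \sim N(\hat{x}_{t|t}, P_{\{t|t\}}). \tag{2.16} \]

The log-likelihood function is given by

\[ \ell(\theta) = -\frac{n}{2} \ln(2\pi) - \frac{1}{2} \sum_{t=1}^{N} \Bigl[ \ln\bigl(\det(P_{\{t|t\}})\bigr)
   + \epsilon_{t|t-1} \bigl(P_{\{t|t\}}\bigr)^{-1} \epsilon_{t|t-1}^{T} \Bigr], \tag{2.17} \]

with $n$ the dimension of $x_t$ as defined at the start, $\epsilon_{t|t-1}$ the measurement residual of
Equation (2.10) and $N$ the number of collected data points. The estimator that maximizes
Equation (2.17) is also called the quasi maximum likelihood estimator; it yields the parameter
values that best explain the observed data.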
Putting all the steps together we have the Kalman filter algorithm:

1. Choose initial values for $\hat{x}_{t-1|t-1}$ and $P_{\{t-1|t-1\}}$,

2. $\hat{x}_{t|t-1} = A \hat{x}_{t-1|t-1} + B$,

3. $P_{\{t|t-1\}} = A P_{\{t-1|t-1\}} A^{T} + Q_t$,

4. $\epsilon_{t|t-1} = z_t - H \hat{x}_{t|t-1}$,

5. $K = P_{\{t|t-1\}} H^{T} (H P_{\{t|t-1\}} H^{T} + R_t)^{-1}$,

6. $\hat{x}_{t|t} = \hat{x}_{t|t-1} + K \epsilon_{t|t-1}$,

7. $P_{\{t|t\}} = (I - KH) P_{\{t|t-1\}}$.
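
The seven steps can be sketched in code for the simplest case in which the state and the
measurement are both scalars, so that all matrices reduce to numbers. This is a minimal
illustration and not the implementation used for the results in this thesis; the function name
`kalman_filter` and its argument names are hypothetical, and the noise variances `q` and `r` are
taken constant over time.

```python
def kalman_filter(zs, a, b, h, q, r, x0, p0):
    """Scalar Kalman filter following steps 1 to 7 (all quantities are floats)."""
    x, p = x0, p0                        # step 1: initial estimate and error variance
    estimates = []
    for z in zs:
        x_pred = a * x + b               # step 2: predicted state
        p_pred = a * p * a + q           # step 3: predicted error variance
        eps = z - h * x_pred             # step 4: measurement residual
        k = p_pred * h / (h * p_pred * h + r)  # step 5: Kalman gain
        x = x_pred + k * eps             # step 6: updated state estimate
        p = (1.0 - k * h) * p_pred       # step 7: updated error variance
        estimates.append((x, p))
    return estimates
```

With `r = 0` and `h = 1` the gain equals one and the filter reproduces the measurements exactly,
whereas with a vanishing predicted variance the gain is zero and the measurement is ignored.
This matches the two limiting cases discussed around Equation (2.14).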

In this thesis we will only show how the Kalman filter is applied to the CIR model, because
the steps for the Vasicek model are almost identical. The only differences are the choice of
the matrices in Equation (2.5) and the moments.


2.3 Calibration with the Long Term Quantile method

In this section we will explain the Long Term Quantile (LTQ) method, which is the method
used by Rabobank for calibration purposes. The major assumption in this method is that the
quantiles of the historical data are representative of the quantiles in the future. Therefore,
a 95% confidence interval is taken from the historical data and the parameters in the short
interest rate model are chosen such that in 95% of the cases the generated interest rates will
fall within the confidence interval taken from the historical data. Sometimes the best way
to explain a method is by giving an example. We will show how this method works for the
Vasicek model. We know that for the Vasicek model the compound interest rate is normally
distributed. Recall,

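\[ R(t, \tau) = \alpha_\tau + \beta_\tau r(t), \tag{2.18} \]

and

\[ dR_{(t,\tau)} = \kappa\bigl(\beta_\tau \mu - \alpha_\tau - R_{(t,\tau)}\bigr)\,dt + \beta_\tau \sigma\, dW_t. \tag{2.19} \]

Therefore, we get

\[ \begin{aligned}
E\bigl[R_{(t+\Delta t,\tau)} | R_{(t,\tau)}\bigr] &= E\bigl[\alpha_\tau + \beta_\tau r_{t+\Delta t} | R_{(t,\tau)}\bigr] \\
 &= \alpha_\tau + \beta_\tau E\bigl[r_{t+\Delta t} | R_{(t,\tau)}\bigr] \\
 &= \alpha_\tau + \beta_\tau \bigl(e^{-\kappa \Delta t} r_t + \mu (1 - e^{-\kappa \Delta t})\bigr),
\end{aligned} \]

and

\[ \begin{aligned}
\mathrm{Var}\bigl[R_{(t+\Delta t,\tau)} | R_{(t,\tau)}\bigr] &= \mathrm{Var}\bigl[\alpha_\tau + \beta_\tau r_{t+\Delta t} | R_{(t,\tau)}\bigr] \\
 &= \mathrm{Var}\bigl[\beta_\tau r_{t+\Delta t} | R_{(t,\tau)}\bigr] \\
 &= \beta_\tau^2\, \mathrm{Var}\bigl[r_{t+\Delta t} | R_{(t,\tau)}\bigr] \\
 &= \beta_\tau^2\, \frac{\sigma^2}{2\kappa} \bigl(1 - e^{-2\kappa \Delta t}\bigr).
\end{aligned} \]

Therefore, if $t \to \infty$ we get

\[ \lim_{t \to \infty} R_{(t,\tau)} \sim N\Bigl(\beta_\tau \mu - \alpha_\tau,\ \beta_\tau^2 \frac{\sigma^2}{2\kappa}\Bigr). \tag{2.20} \]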
Another key assumption that is made in the LTQ method is that the drift term is ignored.
There is a logical reason behind this: Rabobank collects the data on a daily basis, which means
that $dt = \Delta t$ is very small and hence the drift term can be ignored. This means,

\[ dR_{(t,\tau)} = \beta_\tau \sigma\, dW_t. \tag{2.21} \]

Using Equation (2.21) with the collected historical data we can solve for $\beta_\tau \sigma$. Suppose we
collected $N$ data points; this gives us,

\[ \beta_\tau \sigma = \sqrt{\frac{\sum_{t=1}^{N} \bigl(R_{(t,\tau)} - R_{(t-1,\tau)}\bigr)^2}{(N-1)\,\Delta t}}. \tag{2.22} \]

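Thus, for the 95% confidence interval we have

\[ P\Bigl(\beta_\tau \mu - \alpha_\tau - q \sqrt{\beta_\tau^2 \tfrac{\sigma^2}{2\kappa}} < \lim_{t \to \infty} R_{(t,\tau)}
   < \beta_\tau \mu - \alpha_\tau + q \sqrt{\beta_\tau^2 \tfrac{\sigma^2}{2\kappa}}\Bigr) = q, \tag{2.23} \]

with $q = 0.95$. We define $(x, y, z)$ as follows,

\[ x = \beta_\tau \sigma, \qquad
   y = \beta_\tau \mu - \alpha_\tau - q \sqrt{\beta_\tau^2 \frac{\sigma^2}{2\kappa}}, \qquad
   z = \beta_\tau \mu - \alpha_\tau + q \sqrt{\beta_\tau^2 \frac{\sigma^2}{2\kappa}}. \]

We can solve for $x$ using Equation (2.22), and $(y, z)$ can be collected from the historical data.
Now we can estimate the parameters $\kappa$, $\mu$ and $\sigma$. This gives us

\[ \kappa = 2 \Bigl(\frac{x q}{z - y}\Bigr)^2, \qquad
   \sigma = \frac{x}{\beta_\tau}, \qquad
   \mu = \frac{y + z}{2} + \frac{\sigma^2 - x\sigma}{2\kappa^2} - \frac{x^2 \tau}{4\kappa}. \]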

2.4 Testing and Comparing the Methods

In the previous sections we described four different calibration methods. Before using
any of the methods it is wise to test their accuracy. Some may suggest using Monte Carlo,
but we do not want to use Monte Carlo simulation for the simple reason that we only collect
the data for 9 different contracts and not for thousands of different contracts.
The collected data from Rabobank runs from 1 January 2001 up to 1 September 2011. That
is approximately 10 years of data. Therefore, we simulate 10 years of data using Euler's
forward scheme. We want to eliminate (or reduce to almost zero) any bias caused by the
discretization. Before we explain how we will do that, we first describe the model that we
will use for this test. In Chapter 1 we discussed three different models.


For the test we will use the Vasicek model. There are two reasons for this. Firstly, the
Vasicek model is the easiest model to work with. Secondly, the parameters in this model
can be estimated using any of the methods, unlike the CIR model, whose parameters can only
be fitted with the Kalman filter. Therefore, the Vasicek model is the only “fair” model to do
this test. We want to eliminate the bias in the discretization. This can be done as follows.
First, we choose the parameters $\kappa = 0.1$, $\mu = 0.05$ and $\sigma = 0.02$, with initial value $r_0 = 0.06$.
Then, with $\Delta t = \frac{1}{12}$ (assuming that we are measuring the data once every month), we simulate
the short rates using Euler's forward scheme with the finer step size $\frac{\Delta t}{10}$ for 9 different
contracts defined as $\tau = \{$1 month, 3 months, 6 months, 1 year, 2 years, 5 years, 10 years,
20 years, 30 years$\}$. Thus,

\[ r_{t+\frac{\Delta t}{10}} = r_t + \kappa(\mu - r_t)\frac{\Delta t}{10} + \sigma\bigl(W_{t+\frac{\Delta t}{10}} - W_t\bigr). \tag{2.24} \]

Then, we compute the compound interest rate from every 10th simulated short rate,

\[ R(t + \Delta t, \tau) = \alpha_\tau + \beta_\tau r(t + \Delta t). \tag{2.25} \]
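
The simulation set-up can be sketched as follows. The block is a hedged illustration, not the
code used to produce the tables below: the function names `simulate_vasicek` and `compound_rate`
are hypothetical, the Brownian increments are assumed to be normal with variance equal to the
fine step size, and the coefficients $\alpha_\tau$ and $\beta_\tau$ from Section 1.2 are passed in as plain numbers
because their formulas are not repeated here.

```python
import math
import random

def simulate_vasicek(r0, kappa, mu, sigma, dt, n_obs, substeps=10, seed=0):
    """Euler scheme on the fine grid dt / substeps; keeps every substeps-th value."""
    rng = random.Random(seed)
    h = dt / substeps
    r = r0
    observed = [r0]
    for _ in range(n_obs):
        for _ in range(substeps):
            dw = rng.gauss(0.0, math.sqrt(h))   # Brownian increment over a step h
            r += kappa * (mu - r) * h + sigma * dw
        observed.append(r)                      # value on the coarse (observation) grid
    return observed

def compound_rate(r, alpha_tau, beta_tau):
    """Compound rate R = alpha_tau + beta_tau * r for one contract (Equation (2.25))."""
    return alpha_tau + beta_tau * r
```

Ten years of monthly observations with the test parameters would be generated by
`simulate_vasicek(0.06, 0.1, 0.05, 0.02, 1 / 12, 120)`, after which `compound_rate` is applied to
each observed short rate for each of the 9 contracts.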


We want to show the strength of the Kalman filter method. Therefore, we first estimate
the parameters for one contract, $\tau = 1$ month. We define $\hat{\kappa}$, $\hat{\mu}$ and $\hat{\sigma}$ as the estimated
values for $\kappa$, $\mu$ and $\sigma$ respectively. If we look at Table 2.1, Table 2.2 and Table 2.3, we see
that the diffusion parameter $\sigma$ comes with the smallest bias. This is because the bias caused
by the Euler discretization of the Vasicek model is confined to the diffusion part. In general,
discretization schemes are likely to encounter this kind of systematic bias [13]. However, if
$\Delta t \to 0$ the bias in the diffusion parameter goes to zero.

Method          σ      σ̂        Bias     Bias (%)
LSM             0.02   0.0176   0.0024   12
MLE             0.02   0.0186   0.0014    7
LTQ             0.02   0.0180   0.0020   10
Kalman filter   0.02   0.0130   0.0070   34

Table 2.1: Estimated σ values for each method using only τ = 1 month.

If we look at Tables 2.2 and 2.3 we see that there is a large bias in the estimated parameters
when using LSM and MLE, especially when estimating the parameter $\kappa$. As mentioned before,
the bias may be very large. The LTQ method is the best calibration method when using just
one contract (e.g., $\tau = 1$ month). Note that the Kalman filter in this case estimates the
parameters with a large bias, just like LSM and MLE.

Method          κ     κ̂        Bias     Bias (%)
LSM             0.1   0.6506   0.5506   551
MLE             0.1   0.6689   0.5689   569
LTQ             0.1   0.1582   0.0582    58
Kalman filter   0.1   0.5162   0.4162   416

Table 2.2: Estimated κ values for each method using only τ = 1 month.

Method          µ      µ̂        Bias     Bias (%)
LSM             0.05   0.0843   0.0343   69
MLE             0.05   0.0843   0.0343   69
LTQ             0.05   0.0775   0.0275   55
Kalman filter   0.05   0.0747   0.0247   49

Table 2.3: Estimated µ values for each method using only τ = 1 month.

If we now estimate the parameters using all the contracts, $\tau = \{$1 month, 3 months,
6 months, 1 year, 2 years, 5 years, 10 years, 20 years, 30 years$\}$, then we see that the bias in
the Kalman filter disappears. In other words, the independent error term $v_t$ from Equation (2.6)
has variance zero. Note that this is only the case in a test scenario: when dealing with real
data we would never encounter an independent error term $v_t$ with variance zero, simply
because no theoretical model can fit the historical data exactly. Therefore, when calibrating
to the real data in Chapter 3, the reliability of our estimated parameters depends heavily on
the variance of the independent error term.
For MLE the bias in the parameter $\sigma$ increases rapidly. This is because, as the duration
of the contract increases ($\tau$ = 10 years, 20 years, 30 years), the variance of $R(t, \tau)$ increases
and hence the bias in the parameter $\sigma$ increases. This is also the reason behind the bias
increase for LSM, even though the increase is much smaller there. We can see that the bias in
the drift parameter $\mu$ decreases by a small margin. This is simply because for each contract
the long-term mean must converge to $\mu$. Therefore, the more data we collect, the easier it is
to estimate the long-term mean. If we had used the Monte Carlo method to do this test, then
the bias in the parameter $\mu$ would go to zero.
The parameter $\kappa$ is by far the most difficult one to estimate when using classical techniques
like MLE and LSM; the bias reduction techniques are, therefore, mainly designed to reduce the
bias in this parameter. As in most cases, collecting more data does not necessarily mean
a better estimate. When dealing with only one contract we have seen that LTQ outperforms
the other methods, including the Kalman filter. However, when we put in all the collected
data we see that the Kalman filter is by far the best calibration method, because the bias is
eliminated altogether. We strongly advise this method above the alternatives (MLE, LSM and
LTQ). The LTQ method is still a better alternative than MLE and LSM, as the bias for the $\kappa$
estimator is reduced by a factor of 10 and there is no increase in the bias for the parameter $\sigma$
when dealing with more contracts.

Method          σ      σ̂        Bias     Bias (%)
LSM             0.02   0.0138   0.0062    31
MLE             0.02   0.0510   0.0310   155
LTQ             0.02   0.0209   0.0009     5
Kalman filter   0.02   0.02     0          0

Table 2.4: Estimated σ values using all the simulated data for τ.

Why does the Kalman filter method outperform the other methods? That is because the
Kalman filter exploits all the collected data. The methods LSM, MLE and LTQ estimate the
parameters of $R(t + \Delta t, \tau)$ separately for each $\tau$ and average them to get the results.
Therefore, it does not matter if one has the collected data for just one contract or for 9
contracts, because these methods do not use the collected data for different contracts
efficiently. The Kalman filter, on the other hand, distinguishes itself from the other methods
mainly because it uses each collected contract efficiently. As we have seen, when only using
one contract for the calibration (instead of 9 contracts) the Kalman filter comes with a large
bias.

Method          κ     κ̂        Bias     Bias (%)
LSM             0.1   0.6506   0.5506   551
MLE             0.1   0.6690   0.5690   569
LTQ             0.1   0.1582   0.0582    58
Kalman filter   0.1   0.1      0          0

Table 2.5: Estimated κ values using all the simulated data for τ.

Method          µ      µ̂        Bias     Bias (%)
LSM             0.05   0.0741   0.0241   48
MLE             0.05   0.0723   0.0223   45
LTQ             0.05   0.0713   0.0213   43
Kalman filter   0.05   0.05     0         0

Table 2.6: Estimated µ values using all the simulated data for τ.
Next, we plot the yield curve, see Figure 2.1, and see what the error in the approximation
does to our yield curve. We use the estimated parameters for each method and plot the
resulting curves together with the actual yield curve, which has parameters $\kappa = 0.1$,
$\mu = 0.05$ and $\sigma = 0.02$. We can see that, even though the LTQ parameters have a relatively
small bias compared to LSM and MLE, the yield curve of LTQ is closer to the estimated yield
curves of LSM and MLE, which are way off, than to the actual yield curve. Therefore, a
relatively small bias in the parameters can still have a huge impact on the yield curve. This
problem can be avoided by using the Kalman filter.

[Plot: compound interest rate (vertical axis, 0.04 to 0.08) against years (horizontal axis, 0 to 30);
curves: actual yield curve, yield curve with LSM, yield curve with MLE, yield curve with LTQ.]

Figure 2.1: A plot of the actual initial test yield curve against estimated initial yield curves using
MLE, LSM and LTQ.

The reason for calibration is to eventually try to predict the future and compute the fairest
bond price. We do not want to overestimate the bond price, as this means that we are selling
the bond at a higher price than it is actually worth. If another company can offer lower prices
at the same risk, then we lose out. However, if we underestimate the bond price, we sell the
bond for less than it is actually worth, losing money and risking bankruptcy.
We will compute the bond price using Monte Carlo. Our initial value is $r_0 = 0.06$ and we
simulate 1000 paths using the Euler discretization,

\[ r_{t+\frac{\Delta t}{10}} = r_t + \kappa(\mu - r_t)\frac{\Delta t}{10} + \sigma\bigl(W_{t+\frac{\Delta t}{10}} - W_t\bigr). \]
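
A Monte Carlo bond price of this kind can be sketched as follows. The estimator itself is an
assumption, since the text only specifies the simulation scheme: the block averages the discount
factor $\exp(-\int_0^T r_s\,ds)$ over the simulated paths and approximates the integral by a
left-point sum. The function name `mc_bond_price` and the default of 120 steps per year
(which corresponds to $\Delta t / 10$ with $\Delta t = 1/12$) are likewise illustrative choices.

```python
import math
import random

def mc_bond_price(r0, kappa, mu, sigma, maturity,
                  n_paths=1000, steps_per_year=120, seed=0):
    """Monte Carlo estimate of E[exp(-integral of r)] for the Vasicek Euler scheme."""
    rng = random.Random(seed)
    n_steps = max(1, round(maturity * steps_per_year))
    h = maturity / n_steps
    total = 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            integral += r * h     # left-point rule for the integral of r
            r += kappa * (mu - r) * h + sigma * rng.gauss(0.0, math.sqrt(h))
        total += math.exp(-integral)
    return total / n_paths
```

Calling `mc_bond_price(0.06, kappa, mu, sigma, maturity)` once with the true parameters and once
with each set of estimated parameters gives bond prices that can be compared in the way the
following figures do.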

We want to see the impact of the bias in the estimated parameters on the short and the long
term. Therefore, we plot in Figures 2.2, 2.3 and 2.4 the bond price for a duration of 1 month,
1 year and 10 years respectively. We can see that for a contract with a duration of 1 month the
bias in the estimated parameters using LSM and LTQ results in an underestimate of the bond
price, whereas estimation using MLE overestimates the bond price. For a contract with a
duration of 1 year or 10 years, the bias in the estimated parameters using LSM, MLE and
LTQ results in an underestimate of the bond price in all three cases. The gap between the
“real” bond price and the estimated bond prices, using LSM, MLE and LTQ, increases as the
years increase. This results in a loss which would have been prevented by using the Kalman
filter. Therefore, we can conclude that the Kalman filter is the best method for calibration. In
Section 3.2 we will only work with the Kalman filter to estimate the parameters in the CIR
model. This is not only due to the fact that the Kalman filter is a better alternative but also
because LSM, MLE and LTQ cannot be applied to the CIR model, as we have shown in Chapter 1.

[Figures 2.2 to 2.4: plots of the bond price (vertical axis) against years (horizontal axis); curves:
actual bond price, bond price with LSM, bond price with MLE, bond price with LTQ.]

Figure 2.2: A plot of the actual bond price of τ = 1 month over a period of 1 month compared to a
plot using MLE, LSM and LTQ.

Figure 2.3: A plot of the actual bond price of τ = 1 month over a period of 1 year compared to a plot
using MLE, LSM and LTQ.

Figure 2.4: A plot of the actual bond price of τ = 1 month over a period of 10 years compared to a
plot using MLE, LSM and LTQ.
3 Forecasting

We have seen in Chapter 2 that the Kalman filter estimates the parameters in the Vasicek
model without any bias. The same argument holds for the CIR model. In this chapter we
will model the interest rate using the Vasicek model, the CIR model and the Nelson-Siegel
model.

3.1 Modeling data with the Vasicek model

To keep the order as in Chapter 1 we start again with the Vasicek model. As described before,
the major handicap of this model is that it can generate negative values for $r_t$. However, we
still want to use the Vasicek model as a comparison for the CIR model later on in this chapter.
Even though we strongly advise working with the Kalman filter for calibration, we want to see
what we get if we do the calibration using LSM, MLE or LTQ. Table 3.1 shows the results.

Method          κ        µ        σ
LSM             0.3856   0.0264   0.0069
MLE             0.3906   0.0238   0.0325
LTQ             0.0904   0.0368   0.0113
Kalman filter   0.2039   0.0492   0.0045

Table 3.1: Estimated parameters, for the Vasicek model, of the historical data (from 1 January 2001 -
1 September 2011) for EUR.

The dimension of $\tau$ is 9, because we have the historical data for 9 different contracts.
This means that the covariance matrix of the independent error term $v_t$ of Equation (2.6)
has dimension 9 × 9, with nonzero values only on the diagonal. Recall that the values on the
diagonal were defined as $\rho_1, \ldots, \rho_n$. The values $\rho_1, \ldots, \rho_9$ corresponding to the estimated
parameters of the Vasicek model are given in Table 3.2.

Variance   ρ1      ρ2      ρ3   ρ4      ρ5      ρ6      ρ7      ρ8      ρ9
           0.003   0.001   0    0.001   0.003   0.004   0.004   0.005   0.006

Table 3.2: The variance of the independent error term vt of Equation (2.6) for the Vasicek model.

Next, we plot the yield curves. Unlike in Section 2.4 we are now dealing with 9 different
values for $R(0, \tau)$ (we take $t = 0$ to be equal to 1 January 2001). In Section 2.4 we chose one
initial value for $r_t$; hence we had one initial value for $R(0, \tau)$ that corresponded to all
maturities, and we plotted our yield curves for this starting value. This is not the case for
the historical data, and this is also why short rate models give a poor fit of the historical yield
curves, as we will see later on in this section. Our 9 initial values are

R(0, τ1 ) = 0.0478,

R(0, τ2 ) = 0.0473,

R(0, τ3 ) = 0.0460,

R(0, τ4 ) = 0.0444,

R(0, τ5 ) = 0.0455,

R(0, τ6 ) = 0.0484,

R(0, τ7 ) = 0.0527,

R(0, τ8 ) = 0.0568,

R(0, τ9 ) = 0.0572,

with $\tau_1$ = contract for 1 month, $\tau_2$ = contract for 3 months, $\ldots$, $\tau_9$ = contract for 30
years. In other words, instead of one initial value we have 9 different initial values and, hence,
we get 9 different yield curves. We use the estimated parameters for each method and plot the
resulting curves together with the actual initial yield curve. See Figures 3.1 up to 3.11 for
a plot of the yield curves using the Kalman filter, LSM, MLE and LTQ. We can see that the
Kalman filter estimation is the closest to the actual initial yield curve for each initial value
$R(0, \tau_i)$, with $i = 1, \ldots, 9$, followed by LTQ, as was the case in Section 2.4.

34
35 3.1. MODELING DATA WITH THE VASICEK MODEL

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.1: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ1 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.2: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ2 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

35
36 CHAPTER 3. FORECASTING

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.3: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ3 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.4: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ4 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

36
37 3.1. MODELING DATA WITH THE VASICEK MODEL

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.5: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ5 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.6: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ6 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

37
38 CHAPTER 3. FORECASTING

0.1
Actual yield curve
Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.09
Yield curve with LTQ

0.08

0.07
Compound Interest Rate

0.06

0.05

0.04

0.03

0.02
0 5 10 15 20 25 30
Years

Figure 3.7: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ7 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

Actual yield curve


Yield curve with LSM
Yield curve with LSM
Yield curve with MLE
0.3 Yield curve with LTQ

0.25
Compound Interest Rate

0.2

0.15

0.1

0.05

0
0 5 10 15 20 25 30
Years

Figure 3.8: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ8 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curves with LSM, MLE and LTQ.]

Figure 3.9: A close-up plot of the actual initial yield curve, for the Vasicek model, against estimated
initial yield curves, with R(0, τ9 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curves with LSM, MLE and LTQ.]

Figure 3.10: A plot of the actual initial yield curve, for the Vasicek model, against estimated initial
yield curves, with R(0, τ8 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curves with LSM, MLE and LTQ.]

Figure 3.11: A close-up plot of the actual initial yield curve, for the Vasicek model, against estimated
initial yield curves, with R(0, τ9 ), using the Kalman Filter (KF), MLE, LSM and LTQ.

Now we plot the yield curves obtained with the KF-estimated parameters at several different
time steps; see Figures 3.12 to 3.16.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.12: A plot of the actual yield curve on 01/01/2003, for the Vasicek model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.13: A plot of the actual yield curve on 03/01/2005, for the Vasicek model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.14: A plot of the actual yield curve on 01/01/2008, for the Vasicek model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.15: A plot of the actual yield curve on 01/01/2010, for the Vasicek model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.16: A plot of the actual yield curve on 03/01/2011, for the Vasicek model, against estimated
yield curves, using the Kalman Filter (KF).

Next, we plot the estimated bond price for contracts with maturities of 1 month, 1 year
and 10 years, over periods of 1 month, 1 year and 10 years respectively, for each method.
We compute the estimated bond price in the same way as in Chapter 2, using Monte Carlo
simulation; the results are shown in Figures 3.17, 3.18 and 3.19. Note that, as in
Section 2.4, the LTQ method is close to the Kalman filter estimate.
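
The Monte Carlo step can be illustrated with a minimal Python sketch. It is not the procedure of Chapter 2, which is not reproduced here: it assumes the Vasicek dynamics dr = κ(µ − r)dt + σ dW are simulated directly under the pricing measure, uses the exact Gaussian transition for each step, and approximates the integral of the short rate with the trapezoidal rule. The function names and the parameter values in the usage note are hypothetical. The standard closed-form Vasicek bond price is included only as a reference value for the simulation.

```python
import math
import random


def vasicek_bond_price(r0, kappa, mu, sigma, tau):
    """Closed-form Vasicek zero-coupon bond price P(0, tau) = A * exp(-B * r0)."""
    b = (1.0 - math.exp(-kappa * tau)) / kappa
    ln_a = ((mu - sigma ** 2 / (2.0 * kappa ** 2)) * (b - tau)
            - sigma ** 2 * b ** 2 / (4.0 * kappa))
    return math.exp(ln_a - b * r0)


def vasicek_bond_price_mc(r0, kappa, mu, sigma, tau,
                          n_paths=1000, n_steps=50, seed=0):
    """Monte Carlo estimate of E[exp(-integral of r over [0, tau])]."""
    rng = random.Random(seed)
    dt = tau / n_steps
    decay = math.exp(-kappa * dt)
    # Standard deviation of the exact one-step Gaussian transition.
    step_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * kappa))
    total = 0.0
    for _ in range(n_paths):
        r = r0
        integral = 0.0
        for _ in range(n_steps):
            r_next = mu + (r - mu) * decay + step_std * rng.gauss(0.0, 1.0)
            integral += 0.5 * (r + r_next) * dt  # trapezoidal rule
            r = r_next
        total += math.exp(-integral)
    return total / n_paths
```

For example, `vasicek_bond_price_mc(0.03, 0.2, 0.05, 0.01, 1.0)` can be compared with `vasicek_bond_price(0.03, 0.2, 0.05, 0.01, 1.0)`; with 1000 paths the two values should agree to roughly three decimal places.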

[Plot: bond price against time in years (0 to 0.09). Legend: bond price for a one-month contract with KF, LSM, MLE and LTQ.]

Figure 3.17: A plot of the simulated bond price for τ = 1 month with initial value R(0, τ1 ) , for the
Vasicek model, for 1 month compared to a plot using the Kalman Filter (KF) MLE, LSM and LTQ.

[Plot: bond price against time in years (0 to 1). Legend: bond price for a one-year contract with KF, LSM, MLE and LTQ.]

Figure 3.18: A plot of the simulated bond price for τ = 1 year with initial value R(0, τ4 ) , for the
Vasicek model, for 1 year compared to a plot using the Kalman Filter (KF) MLE, LSM and LTQ.

[Plot: bond price against time in years (0 to 10). Legend: bond price for a ten-year contract with KF, LSM, MLE and LTQ.]

Figure 3.19: A plot of the simulated bond price for τ = 10 years with initial value R(0, τ7 ) , for the
Vasicek model, for 10 years compared to a plot using the Kalman Filter (KF) MLE, LSM and LTQ.


3.2 Modeling data with the Cox Ingersoll Ross model

In this section we model the historical data using the CIR model. The parameters in
this model can only be estimated using the Kalman filter, so we give a brief explanation
of the Kalman filter applied to the CIR model. There is a relationship between the compound
interest rate R(t, τ) = Rt(τ) and the short interest rate rt, as shown in the previous chapters.
We use the short interest rate, rt, as the state in the transition space, with wt defined as in
Equation (2.5). Therefore, we have

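$$ r_t = G + H\,r_{t-1} + w_t, \tag{3.1} $$

with

$$ G = \mu\,(1 - e^{-\kappa \Delta t}), $$

$$ H = e^{-\kappa \Delta t}, $$

$$ w_t = r_t\,e^{-\kappa \Delta t}\,\frac{\sigma^2}{\kappa}\,(1 - e^{-\kappa \Delta t}) + \mu\,\frac{\sigma^2}{2\kappa}\,(1 - e^{-\kappa \Delta t})^2 . $$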
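
As a rough illustration, the transition quantities can be computed as in the Python sketch below. Two assumptions are made that the text does not state: the expression given for wt is read as the conditional variance of the transition noise and is evaluated at the previous short rate, and the monthly step Δt = 1/12 is hypothetical. The parameter values are the Kalman filter estimates of Table 3.3; the function name is illustrative.

```python
import math


def cir_transition(kappa, mu, sigma, dt):
    """Return (G, H, cond_var) for the discretised CIR transition equation.

    cond_var(r_prev) is the conditional variance of the transition noise,
    evaluated at the previous short rate (an assumption of this sketch).
    """
    h = math.exp(-kappa * dt)
    g = mu * (1.0 - h)

    def cond_var(r_prev):
        return (r_prev * h * sigma ** 2 / kappa * (1.0 - h)
                + mu * sigma ** 2 / (2.0 * kappa) * (1.0 - h) ** 2)

    return g, h, cond_var


# Kalman filter estimates from Table 3.3; the monthly step is an assumption.
kappa, mu, sigma = 0.1990, 0.0497, 0.0354
G, H, cond_var = cir_transition(kappa, mu, sigma, dt=1.0 / 12.0)

# Long-run variance mu * sigma^2 / (2 * kappa) of the short rate.
stationary_var = mu * sigma ** 2 / (2.0 * kappa)
```

A useful sanity check on these formulas is that the long-run mean µ and the long-run variance µσ²/(2κ) are left unchanged by one transition step, that is, G + Hµ = µ and H² · stationary_var + cond_var(µ) = stationary_var.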

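The measure space is defined as

$$ R_t(\theta) = C(\theta) + D(\theta)\,r_t + v_t, \tag{3.2} $$

where $\theta = (\kappa, \mu, \sigma)$, $D(\theta)$ and $C(\theta)$ are vectors with $C(\theta) = c_i = \frac{-\ln(A(\tau_i))}{\tau_i}$ and $D(\theta) = d_i = \frac{B(\tau_i)}{\tau_i}$, and $v_t$ is defined as in Equation (2.6)¹.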
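
The loadings c_i and d_i can be sketched in Python as follows. The definitions of A(τ) and B(τ) belong to an earlier chapter and are not repeated in this section, so the sketch assumes the standard closed-form CIR bond-price coefficients, with P(τ) = A(τ) exp(−B(τ) r) and γ = sqrt(κ² + 2σ²). The function names are illustrative.

```python
import math


def cir_a_b(tau, kappa, mu, sigma):
    """Standard CIR coefficients A(tau), B(tau) (assumed form, see lead-in)."""
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    e = math.exp(gamma * tau) - 1.0
    denom = (gamma + kappa) * e + 2.0 * gamma
    b = 2.0 * e / denom
    a = (2.0 * gamma * math.exp((kappa + gamma) * tau / 2.0) / denom) \
        ** (2.0 * kappa * mu / sigma ** 2)
    return a, b


def measurement_loadings(taus, kappa, mu, sigma):
    """Return the vectors C = (c_i) and D = (d_i) of Equation (3.2)."""
    c, d = [], []
    for tau in taus:
        a, b = cir_a_b(tau, kappa, mu, sigma)
        c.append(-math.log(a) / tau)   # c_i = -ln(A(tau_i)) / tau_i
        d.append(b / tau)              # d_i = B(tau_i) / tau_i
    return c, d
```

With these loadings the model yield for maturity τ_i at short rate r is c_i + d_i r. For a very short maturity this yield is close to r itself, and for a very long maturity it approaches 2κµ/(γ + κ), which gives two quick checks of an implementation.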
Now that we have described the state space representation of the CIR model, we apply
the Kalman filter. The first step in the algorithm is to choose the initial mean as $E[r_t \mid r_0] = \mu$
and the initial variance as $\mathrm{Var}[r_t \mid r_0] = \frac{\mu\sigma^2}{2\kappa}$. The updating equations
used in the Kalman filter are outlined below.

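$$ R_t = C(\theta) + D(\theta)\,E[r_t \mid r_{t-1}], \tag{3.3} $$

$$ \epsilon_t = R_t - R_{t-1}, \tag{3.4} $$

$$ \mathrm{Var}[R_t \mid R_{t-1}] = D(\theta)\,\mathrm{Var}[r_t \mid r_{t-1}]\,D(\theta)^T + v_t, \tag{3.5} $$

$$ K_t = \mathrm{Var}[r_t \mid r_{t-1}]\,D(\theta)^T\,\bigl(\mathrm{Var}[R_t \mid R_{t-1}]\bigr)^{-1}, \tag{3.6} $$

$$ E[r_{t+1} \mid r_t] = E[r_{t+1} \mid r_t] + K_t\,\epsilon_t, \tag{3.7} $$

$$ \mathrm{Var}[r_{t+1} \mid r_t] = \mathrm{Var}[r_t \mid r_{t-1}] - K_t\,D(\theta)\,\mathrm{Var}[r_t \mid r_{t-1}], \tag{3.8} $$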
¹ It is important to point out that if the parameters κ, µ and σ are not known, then we cannot compute A(τ)
or B(τ), because A(τ) = A(τ, κ, µ, σ) and B(τ) = B(τ, κ, µ, σ).

with $K_t$ the Kalman gain matrix at time $t$. Under the assumption that the errors, $\epsilon_t$,
are normally distributed, we can construct the quasi maximum log-likelihood estimator,

$$ \ell(\theta) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\sum_{i=1}^{N} \ln\Bigl(\det\bigl(\mathrm{Var}[R_{t_i} \mid R_{t_i-1}]\bigr)\Bigr) - \frac{1}{2}\,\epsilon_{t_i}\bigl(\mathrm{Var}[R_{t_i} \mid R_{t_i-1}]\bigr)^{-1}\epsilon_{t_i}^{T}, \tag{3.9} $$

with n = number of contracts and N = number of collected data points for each contract.
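
The recursion and the resulting log-likelihood can be sketched in Python as below. This is an illustration under stated assumptions, not the implementation used for the results in this chapter. The innovation is taken as the observed yield minus the predicted yield, the measurement errors are assumed independent across maturities (so that each maturity can be processed one at a time, which avoids a matrix inverse and gives the same Gaussian likelihood), the transition variance is evaluated at the filtered short rate, and the additive constant may differ from Equation (3.9). The function and argument names are hypothetical.

```python
import math


def cir_kalman_loglik(yields, c, d, meas_var, kappa, mu, sigma, dt):
    """Quasi log-likelihood of a yield panel for a one-factor CIR model.

    yields[k][j] is the observed yield at time step k for maturity j,
    c[j] and d[j] are the loadings of Equation (3.2), and meas_var[j] is
    the variance of the j-th measurement error.
    Returns (log-likelihood, list of filtered short rates).
    """
    h = math.exp(-kappa * dt)
    g = mu * (1.0 - h)
    mean = mu                                # initial mean
    var = mu * sigma ** 2 / (2.0 * kappa)    # initial variance
    loglik = 0.0
    filtered = []
    for row in yields:
        # Measurement update, one maturity at a time.
        for y, cj, dj, vj in zip(row, c, d, meas_var):
            innovation = y - (cj + dj * mean)
            f = dj * var * dj + vj           # innovation variance
            gain = var * dj / f              # Kalman gain
            mean += gain * innovation
            var -= gain * dj * var
            loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(f)
                             + innovation ** 2 / f)
        filtered.append(mean)
        # Time update; max() guards against a negative filtered rate.
        q = (max(mean, 0.0) * h * sigma ** 2 / kappa * (1.0 - h)
             + mu * sigma ** 2 / (2.0 * kappa) * (1.0 - h) ** 2)
        mean, var = g + h * mean, h * var * h + q
    return loglik, filtered
```

Maximizing the first return value over (κ, µ, σ) and the measurement variances, with C and D recomputed from the parameters at every evaluation, would then give the parameter estimate.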
Therefore, the best estimator for θ is found by maximizing (3.9). We do so using the MATLAB
optimization function fmincon. This function finds a constrained minimum of a function of
several variables, specified by

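$$
\min_x f(x) \quad \text{such that} \quad
\begin{cases}
c(x) \le 0, & \text{(nonlinear constraint)} \\
ceq(x) = 0, & \text{(nonlinear constraint)} \\
A \cdot x \le b, & \text{(linear constraint)} \\
Aeq \cdot x = beq, & \text{(linear constraint)} \\
lb \le x \le ub, & \text{(bounds)}
\end{cases}
$$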

Here x, b, beq, lb and ub are vectors, A and Aeq are matrices, c(x) and ceq(x) are functions
that return vectors, and f(x) is a function that returns a scalar. Furthermore, f(x), c(x) and
ceq(x) can be nonlinear functions. We want to maximize the function in Equation (3.9); this
is the same as minimizing −ℓ(θ).
Table 3.3 shows the estimated parameters for the CIR model and Table 3.4 the ρn values
that correspond to them.

Parameter κ µ σ
Kalman Filter 0.1990 0.0497 0.0354

Table 3.3: Estimated parameters, for the CIR model, of the historical data (from 1 January 2001 to 1
September 2011) for EUR.

Variance   ρ1      ρ2      ρ3      ρ4      ρ5      ρ6      ρ7      ρ8      ρ9
           0.003   0.001   0       0.001   0.003   0.004   0.004   0.005   0.006

Table 3.4: The variance of the independent error term vt of Equation (2.6) for the CIR model.

Now we plot the yield curves. This is done in the same way as for the Vasicek model: we
first compute R(0, τi ) for i = 1, . . . , 9. This gives us,

R(0, τ1 ) = 0.0478,

R(0, τ2 ) = 0.0473,

R(0, τ3 ) = 0.0460,

R(0, τ4 ) = 0.0444,

R(0, τ5 ) = 0.0455,

R(0, τ6 ) = 0.0484,

R(0, τ7 ) = 0.0527,

R(0, τ8 ) = 0.0568,

R(0, τ9 ) = 0.0572.

The resulting yield curves are plotted in Figures 3.20 to 3.25. In Section 3.4 we plot
the bond price and compare it with that of the Vasicek model.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.20: A plot of the actual initial yield curve, for the CIR model, against estimated yield curves,
using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.21: A plot of the actual yield curve on 01/01/2003, for the CIR model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.22: A plot of the actual yield curve on 03/01/2005, for the CIR model, against estimated
yield curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.23: A plot of the actual yield curve on 01/01/2008, for the CIR model, against estimated yield
curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.24: A plot of the actual yield curve on 01/01/2010, for the CIR model, against estimated yield
curves, using the Kalman Filter (KF).

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with CIR model.]

Figure 3.25: A plot of the actual yield curve on 03/01/2011, for the CIR model, against estimated yield
curves, using the Kalman Filter (KF).

3.3 Modeling data with the Nelson-Siegel model

The Nelson-Siegel model is the easiest to calibrate, as described in Chapter 1. We have

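$$ R_t(\tau) = \beta_{1,t}\,\kappa_1 + \beta_{2,t}\,\kappa_2 + \beta_{3,t}\,\kappa_3, \tag{3.10} $$

with

$$ \kappa_1 = 1, $$

$$ \kappa_2 = \frac{1 - e^{-\lambda_t \tau}}{\lambda_t \tau}, $$

$$ \kappa_3 = \frac{1 - e^{-\lambda_t \tau}}{\lambda_t \tau} - e^{-\lambda_t \tau}. $$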

Recall that λt is fixed. Therefore, we first compute λt using MATLAB (optimization function
fmincon) and get λt = 0.6807. This allows us to estimate β1,t , β2,t and β3,t for each time
step; see Figures 3.26, 3.27 and 3.28. We define β̂1,t , β̂2,t and β̂3,t as the average values of
β1,t , β2,t and β3,t respectively; see Table 3.5.
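
With λt fixed, Equation (3.10) is linear in the three β parameters, so for a single date they can be estimated by a linear regression of the observed yields on the loadings. The Python sketch below assumes ordinary least squares for this step, which is a common choice but is not confirmed by the text of this section; the function names and the maturities used in the usage note are hypothetical.

```python
import math


def ns_loadings(tau, lam):
    """Nelson-Siegel loadings (kappa_1, kappa_2, kappa_3) of Equation (3.10)."""
    x = lam * tau
    slope = (1.0 - math.exp(-x)) / x
    return [1.0, slope, slope - math.exp(-x)]


def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_betas(taus, yields, lam):
    """Least squares estimate of (beta_1, beta_2, beta_3) for one date."""
    rows = [ns_loadings(tau, lam) for tau in taus]
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(3)]
    return solve_linear(xtx, xty)
```

As a usage example, yields generated from known β values on a set of maturities such as 1 month up to 30 years, with λt = 0.6807, are recovered by `fit_betas` up to rounding error. Repeating the fit for every date in the sample gives the time series of β1,t, β2,t and β3,t.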

[Plot: values against time in years (0 to 12). Legend: "Level of historical data"; "Estimated long-term compound".]

Figure 3.26: A plot of the yield curve level and the parameters β1,t of the Nelson-Siegel model.

[Plot: values against time in years (0 to 12). Legend: "Level of historical data"; "Negative values of estimated short-term compound".]

Figure 3.27: A plot of the yield curve slope and the parameters −β2,t of the Nelson-Siegel model.

[Plot: values against time in years (0 to 12). Legend: "Level of historical data"; "Estimated medium-term compound".]

Figure 3.28: A plot of the yield curve curvature and the parameters β3,t of the Nelson-Siegel model.

Parameter   β̂1,t      β̂2,t      β̂3,t
            0.0463    -0.0187   -0.0203

Table 3.5: Estimated parameters of the Nelson-Siegel model of the historical data (from 1 January
2001 to 1 September 2011) for EUR.

Next, we want to plot the initial yield curve. We do so using the initial values of β1,t , β2,t
and β3,t ; see Table 3.6. See Figure 3.29 for a plot of the yield curve. We can see that the
actual initial yield curve is estimated perfectly with the Nelson-Siegel model. This is presumably
the reason why this model is so popular with the majority of central banks.
     
Parameter   corr(β1,t , lt)   corr(β2,t , st)   corr(β1,t , lt)
            0.9598            -0.9755           0.9946

Table 3.6: Initial estimated parameters of the Nelson-Siegel model of historical data (from 1 January
2001 to 1 September 2011) for EUR.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curve with Nelson-Siegel model.]

Figure 3.29: A plot of the actual initial yield curve level and the estimated yield curve of the Nelson-
Siegel model.

3.4 Comparing the Results


Now that we have estimated the parameters for the Vasicek model, the CIR model and the
Nelson-Siegel model, we plot the yield curves to compare these models. For
the Vasicek model we use the estimated values computed with the Kalman filter. See
Figure 3.30 for the result. Note that the Vasicek model and the CIR model both give the
same initial yield curve. However, the CIR model estimates the yield curve better when
different time steps are taken into account, as we have seen in Section 3.2. In addition,
in the CIR model the short rate values cannot turn negative, so it is better to use this model.

[Plot: compound interest rate against maturity in years (0 to 30). Legend: actual yield curve; yield curves with Nelson-Siegel model, Vasicek model and CIR model.]

Figure 3.30: A plot of the actual initial yield curve and the estimated initial yield curves with the
Vasicek model, the CIR model and the Nelson-Siegel model.

Next, we compare the estimated bond price for the Vasicek model and the CIR model for
a duration of one month, one year and ten years. We do this using 1000 Monte Carlo
simulations. See Figures 3.31, 3.32 and 3.33 for the results. We can see that the bond price
of the Vasicek model is more like a straight line, because the interest rates are normally
distributed and the estimated σ is very small, so the values remain around the initial value.
Looking at the historical data, such a small σ is highly unlikely. The bond price for
the CIR model, on the other hand, decreases over time. This is due to the fact that the estimated
σ here is considerably larger, which is more realistic in view of the historical data. Therefore,
we can conclude that the CIR model is, as expected, a better model to describe the interest rate.

[Plot: bond price against time in years (0 to 0.09). Legend: bond price for a one-month contract with CIR model and with Vasicek model.]

Figure 3.31: A plot of the bond price with the Vasicek model and the CIR model for a duration of 1
month.

[Plot: bond price against time in years (0 to 1). Legend: bond price for a one-year contract with CIR model and with Vasicek model.]

Figure 3.32: A plot of the bond price with the Vasicek model and the CIR model for a duration of 1
year.

[Plot: bond price against time in years (0 to 10). Legend: bond price for a ten-year contract with CIR model and with Vasicek model.]

Figure 3.33: A plot of the bond price with the Vasicek model and the CIR model for a duration of 10
years.

Conclusions and Further Research

Conclusions
We have shown that calibration with MLE and LSM results in a large bias. In recent years,
methods have been developed to reduce this bias. Methods such as the jackknife, the bootstrap
combined with MLE, and LTQ are all good alternatives, as they reduce the bias by a factor
between 5 and 10. However, the bias on one of the drift parameters still remains high, around
60%. We have used the Kalman filter to eliminate the bias, and from this we can conclude
that the Kalman filter outperforms LSM, MLE and LTQ. The strength of the Kalman filter
lies in the way the method handles the collected historical data for different contracts, as we
have shown in Chapter 2. Therefore, we recommend that Rabobank apply the Kalman filter
for calibration when dealing with short rate models, such as the Vasicek model or the CIR
model (or extended versions of these models).
We have shown that short rate models, such as the Vasicek model and the CIR model, fit
the initial yield curve poorly. This is due to the fact that we have 9 different initial values
from the historical data and we can only apply one initial value at time zero. This results
in 9 different yield curves, each fitting the actual yield curve poorly. However, unlike the Vasicek
model, the CIR model estimates the yield curve better at different time steps, as shown in
Section 3.2. The Nelson-Siegel model, on the other hand, fits all the yield curves perfectly.
However, this raises another concern, as the arbitrage rule is neglected.
The CIR model estimates the yield curves considerably better than the Vasicek
model. In addition, in the CIR model the short rate values cannot turn negative, so it is
better to use this model. Even though the Nelson-Siegel model fits the yield curves perfectly,
it does allow arbitrage and must therefore be ignored. Thus, we conclude that the CIR
model is preferable when dealing with interest rates.

Further Research
For further research, we recommend that Rabobank look at the two-factor CIR
model as defined in [18]. We expect that this results in a more accurate estimation of the
yield curves. The problem that will arise is that the partial differential equation of the bond
price does not have an exact solution. However, the Kalman filter can still be applied for
calibration.
In this thesis we have given a brief description of the Nelson-Siegel model. This model
gives a good fit of the yield curves; however, the arbitrage rule is not obeyed. Diebold and Li
have recently expanded the Nelson-Siegel model to obey the no-arbitrage rule. This resulted
in a poor fit of the yield curves. The choice is clear: one can have a perfect fit of the
yield curves with the arbitrage rule violated, or a poor fit of the yield curves with the arbitrage
rule upheld. Therefore, we strongly recommend that Rabobank do some research on this model
and see how the arbitrage rule can be obeyed while maintaining a perfect fit of the yield
curves.

Bibliography

[1] Aït-Sahalia Y., 2008. Closed-form likelihood expansions for multivariate diffusions.
Annals of Statistics 36, 906-937.

[2] Ball C. and Torous W., 1996. Unit roots and the estimation of interest rate dynamics.
Journal of Empirical Finance 3, 215-238.

[3] Bank for International Settlements (BIS), 2005. Zero-coupon yield curves: technical
documentation. Papers, No. 25, Basel.

[4] Brigo D. and Mercurio F., 2006. Interest Rate Models: Theory and Practice with Smile,
Inflation and Credit, Heidelberg, Springer Verlag, 2nd Edition.

[5] Cox, J., Ingersoll J. and S. Ross, 1985. A theory of the term structure of interest rates.
Econometrica, Vol. 53, 385-407.

[6] Diebold F.X. and Li C., 2006. Forecasting the Term Structure of Government Bond
Yields. Journal of Econometrics, 130, 337-364.

[7] Duan J.C. and Simonato J.G., 1995. Estimating and Testing Exponential-Affine Term
Structure Models by Kalman Filter. Discussion paper, Centre interuniversitaire de recherche
en analyse des organisations (CIRANO).

[8] Durham G. and A. R. Gallant, 2002. Numerical Techniques for Maximum Likelihood
Estimation of Continuous-time Diffusion Processes. Journal of Business and Economic
Statistics, 20, 297-316.

[9] Elen E.A.L.J. van, 2010. Term structure forecasting. Rabobank.

[10] Haberman R., 2003. Applied Partial Differential Equations. Vol. 4.

[11] Hull J. and White A., 1987. The Pricing of Options with Stochastic Volatilities. Journal
of Finance, Vol.42, 281-300.

[12] Kalman R.E., 1960. A new approach to linear filtering and prediction theory. Journal
of Basic Engineering, Transactions ASME, Series D.

[13] Lo A., 1988. Maximum likelihood estimation of generalized Ito processes with discretely
sampled data. Econometric Theory 4, 231-247.

[14] Merton R.C., 1980. On estimating the expected return on the market: An exploratory
investigation. Journal of Financial Economics 8, 323-361.


[15] Miller J., Edelman D. and Appleby J., 2007. Numerical Methods for Finance. Chapman
& Hall/CRC Financial Mathematics Series.

[16] Nelson C.R. and A.F. Siegel, 1987. Parsimonious Modeling Of Yield Curves. Journal of
Business, 60, 473-489.

[17] Ruijter M.C.A. de, 2010. An interest rate model for counterparty credit risk. Rabobank.

[18] Shreve S., 2004. Stochastic Calculus for Finance II, Vol. 2.

[19] Seydel R., 2003. Tools for Computational Finance, 3rd edition, Springer Verlag, Berlin.

[20] Tang C.Y. and Chen S.X., 2009. Parameter estimation and bias correction for diffusion
processes. Journal of Econometrics 149, 65-81.

[21] Vasicek O., 1977. An equilibrium characterization of the term structure. Journal of
Financial Economics, Vol. 5, 177-186.

[22] Welch G. and Bishop G., 2001. An Introduction to the Kalman Filter.
http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf.

[23] Yu J. and Phillips P., 2001. A Gaussian approach for estimating continuous time models
of short term interest rates. The Econometrics Journal 4, 211-225.

[24] Yu J. and Phillips P., 2006. Jackknifing bond option prices. Review of Financial Studies,
Vol. 18, 707-742.
