
An introduction to Filtering

Samuel N. Cohen
Mathematical Institute
University of Oxford

©2019, Not for general distribution


Filtering
What are we doing here?

I In these lectures, we are going to look at the basic principles


of ‘stochastic filtering’.
I The key idea is that you have two processes, which are
correlated, and you use observations of one (which you can
see) to determine the behaviour of the other (which you can’t
see)
I Our aim is to give applicable theory, with numerical examples
(all implemented in the statistical environment R (available at
r-project.org))
I We will also give an example with data from high-frequency
trading.

Filtering: Introduction 2
A basic problem
Bayesian estimation of the mean

I We begin with a simple Bayesian estimation problem, which


will lead nicely to filtering.
I We have a hierarchical model for some observations
Y1 , Y2 , ..., YT with unknown mean X .
I For simplicity, suppose X ∼ N(µ0 , τ02 ) and Yi |X ∼ N(X , σ 2 ),
where the Yi are conditionally independent.
I We assume σ, τ0 , µ0 are all known.
I Our aim is to estimate X from the observations of Y ’s.

Filtering: A basic problem 3


The joint density
Bayesian estimation of the mean

We write out the joint density of X and Y1 , expand and complete


the square, to see that
\[
f(x, y) \propto \exp\Big( -\frac{(x-\mu_0)^2}{2\tau_0^2} - \frac{(y-x)^2}{2\sigma^2} \Big)
= \exp\Bigg( -\frac{1}{2}\,
\frac{\Big(x - \frac{\mu_0/\tau_0^2 + y/\sigma^2}{1/\tau_0^2 + 1/\sigma^2}\Big)^2}{1/(1/\tau_0^2 + 1/\sigma^2)}
- \frac{(y-\mu_0)^2}{2(\sigma^2 + \tau_0^2)} \Bigg).
\]

Using Bayes’ theorem, we conclude that


\[
X \mid Y_1 \sim N\Big( \frac{\mu_0/\tau_0^2 + Y_1/\sigma^2}{1/\tau_0^2 + 1/\sigma^2},\ \frac{1}{1/\tau_0^2 + 1/\sigma^2} \Big) =: N(\mu_1, \tau_1^2).
\]

Filtering: A basic problem 4


The correction equations
Bayesian estimation of the mean

I This gives us a way of ‘correcting’ our opinions of X given the


first observation
I We take a weighted average for the mean, and add the inverse
variances (‘precisions’).
I Of course, we can repeat this, to include the second
observation, then the third,..., and after some simplification
we find
\[
X \mid (Y_1, \dots, Y_t) \sim N(\mu_t, \tau_t^2),
\]
where
\[
\mu_t = \frac{\mu_{t-1}/\tau_{t-1}^2 + Y_t/\sigma^2}{1/\tau_{t-1}^2 + 1/\sigma^2}, \qquad
\tau_t^2 = \frac{1}{1/\tau_{t-1}^2 + 1/\sigma^2}.
\]

Filtering: A basic problem 5


Simplification
Bayesian estimation of the mean

I This simplifies, in this setting, to
\[
\mu_t = \frac{\sigma^2 \mu_0 + t\,\tau_0^2\, \bar{Y}_t}{\sigma^2 + t\,\tau_0^2}, \qquad
\tau_t^2 = \frac{1}{1/\tau_0^2 + t/\sigma^2},
\]
with \(\bar{Y}_t = \frac{1}{t}\sum_{s=1}^{t} Y_s\).
I This simplification is special to this particular setting.

Example 1
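
A minimal R sketch of this update rule (not the original Example 1 code; the parameter values below are illustrative):

# Simulate observations with an unknown mean X, then apply the correction rule
# recursively: add precisions, take a precision-weighted average of the means.
set.seed(1)
sigma <- 1; tau0 <- 2; mu0 <- 0        # assumed known parameters (illustrative)
X <- rnorm(1, mu0, tau0)               # the hidden mean
Y <- rnorm(50, X, sigma)               # conditionally independent observations

mu <- mu0; tau2 <- tau0^2
for (y in Y) {
  prec <- 1 / tau2 + 1 / sigma^2       # add the precisions
  mu   <- (mu / tau2 + y / sigma^2) / prec
  tau2 <- 1 / prec
}
c(posterior_mean = mu, posterior_sd = sqrt(tau2), true_X = X)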

Filtering: A basic problem 6


The correction dynamics
Bayesian estimation of the mean

I Let’s focus on the way the distribution changes.


I Whenever we get a new observation Yt , we correct our
estimate of X , by updating the conditional distribution with
the rule
\[
(\mu_{t-1}, \tau_{t-1}^2) \xrightarrow{\;Y_t\;} (\mu_t, \tau_t^2) \xrightarrow{\;Y_{t+1}\;} (\mu_{t+1}, \tau_{t+1}^2).
\]

I This is the basic idea of filtering: we have a hidden value X ,


and use our observations to update an estimate of X (in
particular, its conditional distribution).

Filtering: A basic problem 7


A simple filtering problem
Bayesian estimation of a changing mean

I Instead of X being constant, we will now assume that X is a


random process.
I In particular, we will take X0 ∼ N(µ0 , τ02 ) and

Xt |Xt−1 ∼ N(Xt−1 , γ 2 ), Yt ∼ N(Xt , σ 2 ).

I Equivalently,

X0 = µ0 + τ0 W0 , Xt = Xt−1 + γWt , Yt = Xt + σVt ,

where W , V are standard white noise (i.e. Wt , Vt are


independent N(0, 1)).
I Here γ, σ, µ0 and τ0 are all known.
I We call X the signal process and Y the observation process.

Filtering: The simplest filtering problem 8


Dependence diagram
Bayesian estimation of a changing mean

The dependence diagram for our model is the following:

X0 −→ X1 −→ X2 −→ · · · −→ XT
       ↓     ↓               ↓
       Y1    Y2     · · ·    YT

Our conclusion depends on learning from the observations Y , but


also takes account of the fact that X is changing through time.

Filtering: The simplest filtering problem 9


Solving the filter
Bayesian estimation of a changing mean

I We want to find the distribution of Xt given Y1 , ..., Yt .


I Write Ft = σ(Xs , Ys ; s ≤ t) for the ‘full information filtration’
and Yt = σ(Ys ; s ≤ t) for the ‘observation filtration’.
I We can now repeat calculations similar to those we did before:

I From the dynamics, we have the prediction:

X0 ∼ N(µ0 , τ02 ) ⇒ X1 ∼ N(µ0 , τ02 + γ 2 )


I Writing \(\tau_{1|0}^2 = \tau_0^2 + \gamma^2\), Bayes’ rule gives the correction:
\[
X_1 \mid Y_1 \sim N\Big( \frac{\mu_0/\tau_{1|0}^2 + Y_1/\sigma^2}{1/\tau_{1|0}^2 + 1/\sigma^2},\ \frac{1}{1/\tau_{1|0}^2 + 1/\sigma^2} \Big) =: N(\mu_{1|1}, \tau_{1|1}^2).
\]

Filtering: The simplest filtering problem 10


Solving the filter
Bayesian estimation of a changing mean

I In general, we write µt|t−1 for the mean of Xt given Yt−1 and


µt|t for the mean of Xt given Yt , similarly for the variances
τt|t−1² and τt|t².
I Our system can be described in two steps, prediction and
correction:
\[
(\mu_{t-1|t-1}, \tau_{t-1|t-1}^2) \xrightarrow{\text{Prediction}} (\mu_{t|t-1}, \tau_{t|t-1}^2) \xrightarrow{\text{Correction}} (\mu_{t|t}, \tau_{t|t}^2),
\]
where
\[
\mu_{t|t-1} = \mu_{t-1|t-1}, \qquad \tau_{t|t-1}^2 = \tau_{t-1|t-1}^2 + \gamma^2,
\]
\[
\mu_{t|t} = \frac{\mu_{t|t-1}/\tau_{t|t-1}^2 + Y_t/\sigma^2}{1/\tau_{t|t-1}^2 + 1/\sigma^2}, \qquad
\tau_{t|t}^2 = \frac{1}{1/\tau_{t|t-1}^2 + 1/\sigma^2}.
\]

Filtering: The simplest filtering problem 11


Solving the filter
Bayesian estimation of a changing mean

I By iterating these equations, we solve our filtering problem,


that is, we have a complete description of the distribution of
Xt given Yt for every t.
I These calculations are recursive, so including new observations
is simple (and fast!).

(µ0|0, τ0|0²) −→ (µ1|0, τ1|0²) −−Y1−→ (µ1|1, τ1|1²) −→ (µ2|1, τ2|1²) −−Y2−→ (µ2|2, τ2|2²) −→ · · ·

Filtering: The simplest filtering problem 12


Solving the filter
Bayesian estimation of a changing mean

Example 2

I These equations can be solved quickly, using only basic


methods.
I Updating only involves addition, multiplication and division.
I Division can be largely avoided by using precisions (τ −2 )
instead of variances (τ 2 ).
I Observe that τt|t² converges quickly to a stationary value; the
limit is found by solving the equation
\[
\tau^2 = \frac{1}{1/(\tau^2 + \gamma^2) + 1/\sigma^2}
\quad\Rightarrow\quad
\tau^2 = \frac{1}{2}\Big( \gamma\sqrt{4\sigma^2 + \gamma^2} - \gamma^2 \Big).
\]
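
A short R sketch of the prediction/correction recursion and the stationary variance (illustrative values; not the course's Example 2 code):

# Simulate the signal and observations, run the scalar filter, and compare the
# long-run filter variance with the stationary value above.
set.seed(2)
gamma <- 0.1; sigma <- 0.5; mu0 <- 0; tau0 <- 1
T <- 200
x0 <- rnorm(1, mu0, tau0)
X  <- x0 + cumsum(rnorm(T, 0, gamma))          # signal X_1, ..., X_T
Y  <- X + rnorm(T, 0, sigma)                   # observations

mu <- numeric(T); tau2 <- numeric(T)
m <- mu0; t2 <- tau0^2
for (t in 1:T) {
  t2 <- t2 + gamma^2                           # prediction
  prec <- 1 / t2 + 1 / sigma^2                 # correction
  m  <- (m / t2 + Y[t] / sigma^2) / prec
  t2 <- 1 / prec
  mu[t] <- m; tau2[t] <- t2
}
c(filter_mean = mu[T], true_signal = X[T], filter_var = tau2[T],
  stationary_var = (gamma * sqrt(4 * sigma^2 + gamma^2) - gamma^2) / 2)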

Filtering: The simplest filtering problem 13


The Kalman Filter
The general setup

I We have just solved a simple case of the famous Kalman


filtering problem.
I The general case has two differences: our processes are vector
valued and the relationship between X and Y is more general
(but still linear).
I These simple generalizations yield an extraordinarily powerful
technique.

Filtering: Discrete Time: The Kalman filter 14


The Kalman Filter
The general setup

Consider the following model:

Xt = AXt−1 + Wt , Yt = CXt + Vt

with starting distribution X0 ∼ N(µ0|0 , P0|0 ).


Here
I W , V are white noise processes in Rk and Rd with respective
(nonnegative definite) variances Γ and Σ
I In other words, Wt ∼ N(0, Γ) and Vt ∼ N(0, Σ) for all t, all
values independent.
I A and Γ are k × k-matrices, C is d × k, Σ is d × d.
I We know A, C , Γ, Σ.

Filtering: Discrete Time: The Kalman filter 15


The key equations
Conditioning normal distributions

The key fact we will need is that


if you have jointly (multivariate) normal random variables
Y , Z , then Y |Z is also normal.

Furthermore
E [Y |Z ] = E [Y ] + cov(Y , Z )var(Z )−1 (Z − E [Z ])
var(Y |Z ) = var(Y ) − cov(Y , Z )var(Z )−1 cov(Y , Z )>

These facts can be proven using the densities, and justify


everything that follows.
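
As an illustration (not from the slides), these formulas translate directly into code; the function name and the values below are purely illustrative:

# Conditional mean and variance of Y given Z = z for jointly normal (Y, Z).
cond_normal <- function(muY, muZ, SigmaYY, SigmaYZ, SigmaZZ, z) {
  W <- SigmaYZ %*% solve(SigmaZZ)              # cov(Y,Z) var(Z)^{-1}
  list(mean = muY + W %*% (z - muZ),
       var  = SigmaYY - W %*% t(SigmaYZ))
}
# Scalar example with correlation 0.8: gives mean 0.8 and variance 0.36.
cond_normal(muY = 0, muZ = 1,
            SigmaYY = matrix(1), SigmaYZ = matrix(0.8), SigmaZZ = matrix(1),
            z = 2)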

Filtering: Discrete Time: The Kalman filter 16


The Kalman Filter: Prediction
Step 1 of the filter

I We know that Xt |Yt = Xt |(Y1 , Y2 , ..., Yt ) is normal (and


similarly Xt |Yt−1 ),
I Using the dynamics of X and Y , we can easily calculate the
prediction equations:

µt|t−1 = E [Xt |Yt−1 ] = E [AXt−1 + Wt |Yt−1 ]


= AE [Xt−1 |Yt−1 ]
= Aµt−1|t−1
Pt|t−1 = var(Xt |Yt−1 ) = var(AXt−1 + Wt |Yt−1 )
= Avar(Xt−1 |Yt−1 )A> + var(Wt |Yt−1 )
= APt−1|t−1 A> + Γ

Filtering: Discrete Time: The Kalman filter 17


The Kalman Filter: Kalman Gain
Step 2a of the filter

I The correction equations are made simpler if we first calculate


the ‘innovation’ process η and its variance S
I η tells us what ‘new’ information we learn from Yt

ηt = Yt − E [Yt |Yt−1 ] = Yt − C µt|t−1 ,


St = var(ηt |Yt−1 ) = var(Yt |Yt−1 ) = CPt|t−1 C > + Σ.

I Using S, we can calculate the ‘Kalman gain’ process, which


allows us to optimally incorporate new information,

Kt = Pt|t−1 C > St^{−1} = (St^{−1} C Pt|t−1 )>

Filtering: Discrete Time: The Kalman filter 18


The Kalman Filter: Correction
Step 2b of the filter

Finally, it is easy to calculate the correction equations:

µt|t = µt|t−1 + Kt ηt ,
Pt|t = (I − Kt C )Pt|t−1 .

Given these equations, we are ready to calculate!

Example 3
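
A compact R sketch of the prediction/gain/correction recursion (illustrative only; the course's Example 3 code is not reproduced here). Y is a d x T matrix of column-wise observations, and the function also returns the one-step predictions, which are reused for smoothing below.

kalman_filter <- function(Y, A, C, Gamma, Sigma, mu0, P0) {
  k <- nrow(A); T <- ncol(Y)
  mu_f <- mu_p <- matrix(0, k, T)
  P_f <- P_p <- array(0, c(k, k, T))
  mu <- mu0; P <- P0
  for (t in 1:T) {
    # Prediction
    mu_p[, t]  <- A %*% mu
    P_p[, , t] <- A %*% P %*% t(A) + Gamma
    # Innovation, its variance, and the Kalman gain
    eta <- Y[, t] - C %*% mu_p[, t]
    S   <- C %*% P_p[, , t] %*% t(C) + Sigma
    K   <- P_p[, , t] %*% t(C) %*% solve(S)
    # Correction
    mu <- mu_p[, t] + K %*% eta
    P  <- (diag(k) - K %*% C) %*% P_p[, , t]
    mu_f[, t] <- mu; P_f[, , t] <- P
  }
  list(mean = mu_f, var = P_f, pred_mean = mu_p, pred_var = P_p)
}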

Filtering: Discrete Time: The Kalman filter 19


The Kalman Filter: Forecasting
Easy with Matrices!

I Using our equations, it is easy to see how to calculate the


forecasted values E [Xt |Ys ] for s < t.
I By direct recursion:

µt|s = E [Xt |Ys ] = A^{t−s} µs|s .

I Furthermore, the conditional variance Pt|s = var(Xt |Ys )


satisfies
Pt+1|s = APt|s A> + Γ
which is easy to calculate recursively.

Filtering: Discrete Time: The Kalman filter 20


The Kalman Filter: Smoothing
Harder, but useful!

I Calculating the ‘smoother’, that is, µt|N = E [Xt |YN ] for


t < N is also possible.
I First write Jt = Pt|t A> (Pt+1|t)^{−1} . Then, using our basic
properties of normal distributions (and plenty of algebra),

µt|N = µt|t + Jt (µt+1|N − µt+1|t ),


Pt|N = Pt|t + Jt (Pt+1|N − Pt+1|t )Jt> ,

I These can be calculated backwards, starting at time N.


I In effect, you first do a single forward pass through the
observations from 0 → N calculating the filter, then
backwards from N → 0 to calculate the smoother.
Example 3 (ctd)
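
An illustrative R sketch of the backward pass, reusing the output of the kalman_filter sketch given earlier (so `fit` is assumed to be its return value):

kalman_smoother <- function(fit, A) {
  mu_s <- fit$mean; P_s <- fit$var
  mu_p <- fit$pred_mean; P_p <- fit$pred_var
  T <- ncol(mu_s)
  for (t in (T - 1):1) {
    J <- fit$var[, , t] %*% t(A) %*% solve(P_p[, , t + 1])   # J_t
    mu_s[, t]  <- fit$mean[, t] + J %*% (mu_s[, t + 1] - mu_p[, t + 1])
    P_s[, , t] <- fit$var[, , t] + J %*% (P_s[, , t + 1] - P_p[, , t + 1]) %*% t(J)
  }
  list(mean = mu_s, var = P_s)
}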

Filtering: Discrete Time: The Kalman filter 21


The Kalman Filter: Smoothing
One-step correlations

I We shall see that, when trying to fit a filter in practice, it will


also be useful to know the values of

Pt−1,t|N := E [(Xt − µt|N )(Xt−1 − µt−1|N )> |YN ].

I Fortunately, there is a formula:

\[
P_{N-1,N|N} = (I - K_N C) A P_{N-1|N-1}, \qquad
P_{t-1,t|N} = P_{t|t} J_{t-1}^\top + J_t \big( P_{t,t+1|N} - A P_{t|t} \big) J_{t-1}^\top.
\]

I The derivation is even more algebra than before.


I It can also be calculated using a single sweep back through
the data.
Exercise: prove these formulae!

Filtering: Discrete Time: The Kalman filter 22


Example: An ARMA(1,1) process
A common time series model

To see how rich a theory this gives, consider an ARMA(1,1)


process, where
xt = φxt−1 + θzt−1 + zt
for constants φ, θ and white noise z.
I We only observe xt .
I It’s difficult to calculate E [xt |xt−1 , xt−2 , ...], which is usually
needed when fitting these models.
I This does not look like the models we’ve considered... until
we write it as a ‘state space’ model.

Filtering: Discrete Time: The Kalman filter 24


Example: An ARMA(1,1) process
Surprisingly a Kalman Filter model!

We can write
\[
X_t = \begin{pmatrix} x_t \\ \theta z_t \end{pmatrix}
    = \begin{pmatrix} \phi & 1 \\ 0 & 0 \end{pmatrix}
      \begin{pmatrix} x_{t-1} \\ \theta z_{t-1} \end{pmatrix}
    + \begin{pmatrix} 1 \\ \theta \end{pmatrix} z_t
    = A X_{t-1} + W_t
\]
and
\[
Y_t = x_t = \begin{pmatrix} 1 & 0 \end{pmatrix} X_t .
\]
I Hence we can apply the Kalman filter to X , and so efficiently
calculate
\[
\begin{pmatrix} 1 & 0 \end{pmatrix} \mu_{t|t-1}
= \begin{pmatrix} 1 & 0 \end{pmatrix} E[X_t \mid \mathcal{Y}_{t-1}]
= E[x_t \mid x_{t-1}, x_{t-2}, \dots].
\]

I In our earlier notation, we have Σ = 0, A and C as indicated, and
\[
\Gamma = \begin{pmatrix} 1 & \theta \\ \theta & \theta^2 \end{pmatrix}.
\]
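
An illustrative R sketch (phi, theta and the sample size are assumed values): simulate an ARMA(1,1) path, cast it into the state-space form above, and run the kalman_filter sketch from earlier to get the one-step-ahead predictions.

set.seed(3)
phi <- 0.7; theta <- 0.3; T <- 500
z <- rnorm(T); x <- numeric(T)
for (t in 2:T) x[t] <- phi * x[t - 1] + theta * z[t - 1] + z[t]

A     <- matrix(c(phi, 0, 1, 0), 2, 2)       # = [[phi, 1], [0, 0]]
C     <- matrix(c(1, 0), 1, 2)
Gamma <- matrix(c(1, theta, theta, theta^2), 2, 2)
Sigma <- matrix(0, 1, 1)                     # no observation noise in this model

fit  <- kalman_filter(matrix(x, nrow = 1), A, C, Gamma, Sigma,
                      mu0 = c(0, 0), P0 = diag(2))
pred <- (A %*% fit$mean)[1, ]                # E[x_{t+1} | x_t, x_{t-1}, ...]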
Filtering: Discrete Time: The Kalman filter 25
Hidden Markov Models
Another simple filter

I The equations we’ve seen have been fairly ‘nice’.


I The filters can be solved in closed form, recursively, and are
finite dimensional.
I This is because we have assumed throughout that all our
random variables are Gaussian, and all the relationships
between them are linear.
I Without this assumption, as we will see in continuous time,
we are in a much more difficult situation.
I One other case where a nice set of equations can be obtained
is when X is a finite-state Markov chain.

Filtering: Discrete Time: Hidden Markov Models 26


Hidden Markov Models
A general setup

I Suppose X is a finite-state Markov chain. We write X as a


process Xt = AXt−1 + Mt where X takes values in the basis
vectors in Rd , and M is a martingale difference process (so
E [Mt |Ft−1 ] = 0).
I The matrix A> is the familiar transition matrix of the Markov
chain.
I We just need to calculate the probability X takes values in
each state, or equivalently, the vector µt|t = E [Xt |Yt ] ∈ Rd
(as P(Xt = ei |Yt ) = E [ei> Xt |Yt ] = ei> µt|t ).
I We assume that Yt |Ft ∼ c(y ; Xt )m(dy ), where c is some
density function and m is some measure (no normality is
needed).

Filtering: Discrete Time: Hidden Markov Models 27


The Filter
Still easy to calculate

I We can directly calculate the prediction equation:

µt|t−1 = E [Xt |Yt−1 ] = E [AXt−1 + Mt |Yt−1 ] = Aµt−1|t−1 .

I To calculate the correction equation, we use Bayes’ theorem:

\[
P(X_t = e_i \mid Y_t, \mathcal{Y}_{t-1})
= \frac{c(Y_t; e_i)\, P(X_t = e_i \mid \mathcal{Y}_{t-1})}{\sum_j c(Y_t; e_j)\, P(X_t = e_j \mid \mathcal{Y}_{t-1})}
\propto c(Y_t; e_i)\, P(X_t = e_i \mid \mathcal{Y}_{t-1}).
\]
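
A hedged R sketch of a single filter step (not from the slides): `mu` is the current probability vector, `A` the matrix from Xt = AXt−1 + Mt (so its transpose is the usual transition matrix), and `dens(y)` returns the vector of observation densities (c(y; e1), ..., c(y; ed)).

hmm_filter_step <- function(mu, A, y, dens) {
  pred <- A %*% mu                    # prediction: mu_{t|t-1} = A mu_{t-1|t-1}
  corr <- dens(y) * as.vector(pred)   # correction: weight by observation densities
  corr / sum(corr)                    # normalize
}
# Illustrative two-state example with Gaussian observation densities:
A    <- matrix(c(0.95, 0.05, 0.10, 0.90), 2, 2)   # columns sum to 1
dens <- function(y) dnorm(y, mean = c(-1, 1), sd = 1)
hmm_filter_step(c(0.5, 0.5), A, y = 0.8, dens = dens)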

Filtering: Discrete Time: Hidden Markov Models 28


Forecasting and Smoothing
Simple algorithms

I Again, forecasting is easy: µt|s = A^{t−s} µs|s for s < t.


I Smoothing can be done with a backward pass, by looking at a
‘dual’ variable ν satisfying the equation (for N > t)

νt|N ∝ A> C (Yt+1 )νt+1|N , νN|N = 1,

and then calculating µt|N ∝ µt|t νt|N , where the product is


taken component by component.
I There are closed form equations for other quantities also (for
example, estimating occupation times, the number of
transitions, functions of X and Y , ... see Elliott, Aggoun and
Moore, Hidden Markov Models, Springer 1995)

Example 4
Filtering: Discrete Time: Hidden Markov Models 29
Continuous time
Much more technically difficult

I So now we change gear a little technically, as we want to see


what happens in continuous time.
I This is particularly useful as a model when observations occur
at very high frequency, as it allows us to find good
approximations to our problem.
I On the other hand, it becomes more difficult to find and solve
the filtering equations.

Filtering: Continuous Time: The key equations 30


The reference probability method
A nice version of Bayes’ theorem

I The approach we shall take is called the ‘reference probability


method’.
I It depends on the following result, which will serve as “Bayes’
theorem” in this context.

Theorem
Suppose we have a probability measure Q ∼ P. Write the
Radon–Nikodym density Z = dQ/dP, and suppose we have a
filtration {Ft }t≥0 . Then for any t ≥ 0 and any random variable ξ,
we know that
\[
E_Q[\xi \mid \mathcal{F}_t] = \frac{E_P[Z \xi \mid \mathcal{F}_t]}{E_P[Z \mid \mathcal{F}_t]}.
\]

Filtering: Continuous Time: The key equations 31


A continuous model
Common basic time series model

I We assume as before that we have processes X and Y , on an


interval [0, T ].
I These satisfy the SDEs

dXt = f (t, Xt )dt + κ(t, Xt )dBt


dYt = c(t, Xt )dt + dWt

where f , κ, c are known (Lipschitz continuous) functions, and


B and W are Brownian motions.
I We assume X and Y are scalar and B and W are independent
for simplicity.
I These assumptions can be relaxed, but the notation becomes
more difficult.

Filtering: Continuous Time: The key equations 32


Feynman–Kac
Connecting SDEs and PDEs

I From the Feynman–Kac theorem/Ito’s lemma, we know that


for any smooth bounded function φ,
\[
\phi(X_t) = \phi(X_0) + \int_{[0,t]} L\phi(X_u)\, du + \text{martingale},
\]
where L is the infinitesimal generator of X , that is,
\[
L\phi = f(t,x)\,\frac{\partial \phi}{\partial x} + \frac{1}{2}\,\kappa(t,x)^2\,\frac{\partial^2 \phi}{\partial x^2}.
\]
I We expect L to be part of the solution to our filtering
problem.

Filtering: Continuous Time: The key equations 33


Changing measure
Making Bayes’ theorem work for us

I We define a probability Q by dQ/dP = Z_T , where
\[
Z_t = \mathcal{E}\Big( -\int_0^{\cdot} c(s, X_s)\, dW_s \Big)_t
    = \exp\Big( -\int_0^t c(s, X_s)\, dW_s - \frac{1}{2}\int_0^t c(s, X_s)^2\, ds \Big).
\]

I We write Λ = 1/Z , and using Ito’s lemma we can see that


dΛt = Λt c(t, Xt ) dYt .
I Using Girsanov’s theorem, this change of measure has the
effect of changing the drift in Y , so under Q we have the
dynamics

dXt = f (t, Xt )dt + κ(t, Xt )dBt , dYt = dWtQ

where B and W Q are independent Q-Brownian motions.


I X and Y are independent under Q!
Filtering: Continuous Time: The key equations 34
Unnormalized expectations
Expanding with Ito

I We will now try to calculate the unnormalized expectations,


which we write:

σt (φ) := EQ [Λt φ(Xt )|Yt ].

I “Bayes’ theorem” tells us that EP [φ(Xt )|Yt ] = σt (φ)/σt (1).


I Now, we can write out Λs φ(Xs ) using Ito’s lemma. This gives

\[
d(\Lambda\phi(X))_t = \Lambda_t \frac{\partial\phi}{\partial x}\, dX_t + \frac{1}{2}\Lambda_t \frac{\partial^2\phi}{\partial x^2}\,\kappa(t,X_t)^2\, dt + \Lambda_t \phi(X_t) c(t,X_t)\, dY_t
= \Lambda_t L\phi(X_t)\, dt + \Lambda_t \frac{\partial\phi}{\partial x}\,\kappa(t,X_t)\, dB_t + \Lambda_t \phi(X_t) c(t,X_t)\, dY_t.
\]

Filtering: Continuous Time: The key equations 35


Unnormalized expectations
Using independence

I Taking an expectation, as (X , B) and Y are Q-independent,


we have the ‘Zakai equation’
\[
\begin{aligned}
\sigma_t(\phi) &= E_Q[\Lambda_t \phi(X_t) \mid \mathcal{Y}_t] \\
&= \sigma_0(\phi) + \int_0^t E_Q[\Lambda_s L\phi(X_s) \mid \mathcal{Y}_t]\, ds + \int_0^t E_Q[\Lambda_s \phi(X_s) c(s, X_s) \mid \mathcal{Y}_t]\, dY_s \\
&= \sigma_0(\phi) + \int_0^t E_Q[\Lambda_s L\phi(X_s) \mid \mathcal{Y}_s]\, ds + \int_0^t E_Q[\Lambda_s \phi(X_s) c(s, X_s) \mid \mathcal{Y}_s]\, dY_s \\
&= \sigma_0(\phi) + \int_0^t \sigma_s(L\phi)\, ds + \int_0^t \sigma_s(\phi c)\, dY_s.
\end{aligned}
\]

I This is a simple equation apart from one thing: the term


σs (φc) cannot be calculated recursively in terms of σs (φ).

Filtering: Continuous Time: The key equations 36


Normalized expectations
Simplifying with Ito

I Rearranging and applying Ito’s lemma, we can obtain an


equation for the normalized expectations

πs (φ) := σs (φ)/σs (1) = E [φ(Xs )|Ys ],

the ‘Fujisaki–Kallianpur–Kunita’ equation


\[
\pi_t(\phi) = \pi_0(\phi) + \int_{[0,t]} \pi_s(L\phi)\, ds + \int_{[0,t]} \big( \pi_s(\phi c) - \pi_s(\phi)\pi_s(c) \big)\, dV_s.
\]

I Here dVs = dYs − πs (c)ds is the (differential of the)


‘innovations process’ (and is a Y-Brownian motion under P).

Filtering: Continuous Time: The key equations 37


The Density equation
Finding an SPDE

I Let’s assume X has a smooth density given Yt , so
\(\sigma_t(\phi) = \int_{\mathbb{R}} \phi(x) q(t,x)\, dx\). Then we see that
\[
\int_{\mathbb{R}} \phi(x) q(t,x)\, dx = \int_{\mathbb{R}} \phi(x) q(0,x)\, dx
 + \int_0^t \int_{\mathbb{R}} L\phi(x)\, q(s,x)\, dx\, ds
 + \int_0^t \int_{\mathbb{R}} \phi(x) c(s,x)\, q(s,x)\, dx\, dY_s.
\]
I By integration by parts, if L∗ is the adjoint of L,
\[
L^* q = -\frac{\partial (f q)}{\partial x} + \frac{1}{2}\,\frac{\partial^2 (\kappa^2 q)}{\partial x^2},
\]
we calculate
\[
\int_{\mathbb{R}} \phi(x) q(t,x)\, dx = \int_{\mathbb{R}} \phi(x) \Big( q(0,x) + \int_0^t L^* q(s,x)\, ds + \int_0^t c(s,x)\, q(s,x)\, dY_s \Big)\, dx.
\]

Filtering: Continuous Time: The key equations 38


The Density equation
Finding an SPDE

I This should hold for every smooth and bounded φ, so we have


the linear SPDE
\[
q(t, x) = q(0, x) + \int_0^t L^* q(s, x)\, ds + \int_0^t c(s, x)\, q(s, x)\, dY_s.
\]

We can then calculate the density of Xt |Yt as

\[
p(t, x) = \frac{q(t, x)}{\int_{\mathbb{R}} q(t, x')\, dx'}.
\]

I One can also get a nonlinear SPDE for the normalized density.
I Solving SPDEs is hard, so this equation is not frequently
solved in practice in this general form – instead it suggests
good approximations, or allows special cases to be derived.

Filtering: Continuous Time: The key equations 39


The Kalman–Bucy filter
The Continuous-time Gaussian model

I Let’s see the continuous-time Gaussian case.


I Here we assume c(t, Xt ) = cXt , f (t, Xt ) = aXt and
κ(t, Xt ) = b. Then we have the dynamics

dXt = aXt dt + b dBt , dYt = cXt dt + dWt

I Here a, b, c are known.


I With Ỹ = Y /c and f = 1/c, this is the same as the model
for observations d Ỹt = Xt dt + f dWt .
I We know that these equations define a Gaussian process (i.e.
all marginals are jointly normal), so it’s enough to calculate
the mean and variance.

Filtering: Continuous Time: The Kalman–Bucy Filter 40


The Kalman–Bucy filter
Simplifying...

I Write X̂t = EP [Xt |Yt ].


I First observe that everything here is Gaussian, and X̂ − X is
uncorrelated with Ys for all s < t.
I In particular, this implies they are independent, and

E [(Xt − X̂t )2 |Yt ] = E [(Xt − X̂t )2 ] =: Pt

is deterministic.
I Also, E [(Xt − X̂t )3 |Yt ] = 0.

Filtering: Continuous Time: The Kalman–Bucy Filter 41


The Kalman–Bucy filter
Applying the FKK equation

I Taking φ(x) = x, so X̂t = πt (φ), we know Lφ(x) = ax (the second-derivative term vanishes), and so
\[
\hat{X}_t = \hat{X}_0 + \int_0^t a \hat{X}_s\, ds + c \int_0^t \big( \pi_s(X_s^2) - \hat{X}_s^2 \big)\, dV_s
 = \hat{X}_0 + \int_0^t a \hat{X}_s\, ds + c \int_0^t P_s\, dV_s.
\]

I Notice this is in terms of the innovations process V .

Filtering: Continuous Time: The Kalman–Bucy Filter 42


The Kalman–Bucy filter
Applying the FKK equation

I Taking φ(x) = x²,
\[
\pi_t(X^2) = \pi_0(X^2) + \int_{[0,t]} \big( 2a\,\pi_s(X^2) + b^2 \big)\, ds
 + c \int_{[0,t]} \big( \pi_s(X^3) - \hat{X}_s \pi_s(X^2) \big)\, dV_s,
\]
\[
\hat{X}_t^2 = \hat{X}_0^2 + \int_0^t \big( 2a \hat{X}_s^2 + c^2 P_s^2 \big)\, ds + 2c \int_0^t \hat{X}_s P_s\, dV_s.
\]
I Taking a difference and simplifying, we obtain a Riccati
equation for the variance P
\[
P_t = \pi_t(X^2) - \hat{X}_t^2 = P_0 + \int_0^t \big( 2a P_s + b^2 - c^2 P_s^2 \big)\, ds.
\]

Filtering: Continuous Time: The Kalman–Bucy Filter 43


The Kalman–Bucy filter
The filter

I Together, we have an SDE for the mean


\[
\hat{X}_t = \hat{X}_0 + \int_0^t a \hat{X}_s\, ds + c \int_0^t P_s\, dV_s,
\]
and a (deterministic) Riccati equation for the variance
\[
P_t = P_0 + \int_0^t \big( 2a P_s + b^2 - c^2 P_s^2 \big)\, ds.
\]
I This pair of equations is called the ‘Kalman–Bucy filter’.
I It can then be approximated using the usual methods for
SDEs/ODEs (eg Euler methods)
I It is possible to obtain a Kalman–Bucy smoother as well.
Example 5
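
An illustrative Euler discretisation of these equations in R (a sketch; a, b, c and the time grid are assumed values, and Example 5 is not reproduced):

set.seed(5)
a <- -0.5; b <- 1; c <- 2
dt <- 0.001; Tend <- 5; n <- Tend / dt
X <- numeric(n); Y <- numeric(n)                    # simulate signal and observations
for (i in 2:n) {
  X[i] <- X[i - 1] + a * X[i - 1] * dt + b * sqrt(dt) * rnorm(1)
  Y[i] <- Y[i - 1] + c * X[i - 1] * dt + sqrt(dt) * rnorm(1)
}
Xhat <- numeric(n); P <- numeric(n); P[1] <- 1      # filter: Euler steps
for (i in 2:n) {
  dV      <- (Y[i] - Y[i - 1]) - c * Xhat[i - 1] * dt        # innovation increment
  Xhat[i] <- Xhat[i - 1] + a * Xhat[i - 1] * dt + c * P[i - 1] * dV
  P[i]    <- P[i - 1] + (2 * a * P[i - 1] + b^2 - c^2 * P[i - 1]^2) * dt
}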

Filtering: Continuous Time: The Kalman–Bucy Filter 44


The Wonham filter
Continuous Markov Chains

I Just as in discrete time, there is a continuous time equation


for the filter based on a (continuous time) Markov chain.
I Here we have the dynamics

dXt = AXt dt + dMt


dYt = c > Xt dt + dWt

where A> is the Q-matrix of the Markov chain, M is a


martingale and c is a vector.

Filtering: Continuous Time: The Wonham Filter 45


The Wonham filter
Continuous Markov Chains

I As X is written using only basis vectors in RN , any function of


X can be written Φ(X ) = φ> X for some vector φ ∈ RN .
I While X is not of the form we considered earlier, we can still
find the generator of X is LΦ = A> φ, and the adjoint of L is
simply L∗ v = Av .
I We can calculate (from the Zakai equation) the unnormalized
probability vector for the state of X
\[
E[X_t \mid \mathcal{Y}_t] \propto q_t = q_0 + \int_0^t A q_u\, du + \int_0^t \mathrm{diag}(c)\, q_u\, dY_u.
\]

I This equation is just an N-dimensional linear SDE. Equations


for the smoother are also known.
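
A hedged R sketch of an Euler scheme for this linear SDE (illustrative; the arguments are assumed: dY is the vector of observation increments over steps of length dt, and cvec is the vector c):

wonham_filter <- function(dY, A, cvec, q0, dt) {
  q <- matrix(0, length(q0), length(dY) + 1)
  q[, 1] <- q0
  for (i in seq_along(dY)) {
    qi <- q[, i]
    q[, i + 1] <- qi + A %*% qi * dt + diag(cvec) %*% qi * dY[i]
  }
  sweep(q, 2, colSums(q), "/")     # normalize each column to get probabilities
}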

Filtering: Continuous Time: The Wonham Filter 46


Calibrating a filter
Trying to make things usable.

I What we have seen so far deals with the problem of how to


take our observations Y and obtain the behaviour of X .
I However, we have assumed throughout that we know the
probability model, that is, all the other parameters are fixed.
I The question of how to estimate those parameters is what we
consider next.
I This problem has a wide range of approaches, depending on
the details involved.
I We shall focus on a simple case, using the EM algorithm
(discussed in Elliott, van der Hoek and Malcolm (2005),
based on Shumway and Stoffer (1982)).

Filtering: Calibration 47
Calibrating a filter
A setup

I We focus on the following simple scalar version of the discrete


time Kalman filter:
Xt+1 = a + bXt + cWt+1
Yt = Xt + f Vt .

I If we could observe X and Y directly, then we could calculate


a, b, c and f easily using regression.
I If we cannot observe X , then we need to use a more advanced
method.

Filtering: Calibration 48
Calibrating a filter
The equations

I Our filtering equations simplify to
\[
\mu_{t+1|t} = a + b\,\mu_{t|t}, \qquad P_{t+1|t} = b^2 P_{t|t} + c^2,
\]
\[
K_{t+1} = \frac{P_{t+1|t}}{P_{t+1|t} + f^2},
\]
\[
\mu_{t+1|t+1} = \mu_{t+1|t} + K_{t+1}(y_{t+1} - \mu_{t+1|t}), \qquad
P_{t+1|t+1} = (1 - K_{t+1}) P_{t+1|t} = f^2 K_{t+1}.
\]
I The smoothing equations are (with \(J_t = b P_{t|t} / P_{t+1|t}\))
\[
\mu_{t|N} = \mu_{t|t} + J_t \big( \mu_{t+1|N} - (a + b\,\mu_{t|t}) \big), \qquad
P_{t|N} = P_{t|t} + J_t^2 \big( P_{t+1|N} - P_{t+1|t} \big),
\]
\[
P_{t-1,t|N} = J_{t-1} P_{t|t} + J_t J_{t-1} \big( P_{t,t+1|N} - b P_{t|t} \big), \qquad
P_{N-1,N|N} = b (1 - K_N) P_{N-1|N-1}.
\]

Filtering: Calibration 49
Calibrating a filter
The EM algorithm

I So, how to estimate the parameters?


I Simply regressing the smoothed values of X gives bias, as we
expect X will be ‘rougher’ than the smoothed values.
I The likelihood function is hard to compute, as it depends on
X and Y
I If we assumed we could calculate the expectation, then we
could instead try to maximize the expected log-likelihood
E [`(a, b, c, f ; {Xt , Yt }t≤T )|YT ]
I We can then iterate (calculate parameters) ↔ (calculate filter
estimates) until convergence. This is the
“Expectation-Maximization algorithm”, as we iterate between
(Maximum likelihood step) ↔ (Expectation step).

Filtering: Calibration 50
Calibrating a filter
The maximum estimates

I The maximization step can be solved! (all sums run from t = 1 to N):
\[
\hat{b} = \frac{E\big[\sum \big(X_{t-1} - \tfrac{1}{N}\sum X_{t-1}\big)\big(X_t - \tfrac{1}{N}\sum X_t\big) \,\big|\, \mathcal{Y}_N\big]}{E\big[\sum (X_t - \bar{X})^2 \,\big|\, \mathcal{Y}_N\big]}
= \frac{\sum P_{t-1,t|N} + \sum \mu_{t|N}\mu_{t-1|N} - \tfrac{1}{N}\sum \mu_{t|N} \sum \mu_{t-1|N}}{\sum P_{t|N} + \sum \mu_{t|N}^2 - \tfrac{1}{N}\big(\sum \mu_{t|N}\big)^2}
\]
\[
\hat{a} = \frac{1}{N}\sum \mu_{t|N} - \hat{b}\,\frac{1}{N}\sum \mu_{t-1|N}
\]
\[
\begin{aligned}
\hat{c}^2 &= \frac{1}{N}\, E\Big[ \sum \big(X_t - \hat{a} - \hat{b} X_{t-1}\big)^2 \,\Big|\, \mathcal{Y}_N \Big] \\
&= \frac{1}{N} \sum \Big( P_{t|N} + \mu_{t|N}^2 + \hat{a}^2 - 2\hat{a}\mu_{t|N} + 2\hat{a}\hat{b}\mu_{t-1|N}
 + \hat{b}^2\big(P_{t-1|N} + \mu_{t-1|N}^2\big) - 2\hat{b} P_{t-1,t|N} - 2\hat{b}\mu_{t|N}\mu_{t-1|N} \Big)
\end{aligned}
\]
\[
\hat{f}^2 = \frac{1}{N} \sum \Big( (Y_t - \mu_{t|N})^2 + P_{t|N} \Big)
\]
Filtering: Calibration 51
Calibrating a filter
Running the EM algorithm

I We start off with approximate estimates of a, b, c, f


I We iterate the EM algorithm to improve these estimates
I Convergence may be slow (or get stuck) given bad starting
points.
I Given a large amount of data, it may be worth starting with
only a small subsample, then increasing the amount of data
used as you go.

Example 6
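
An illustrative R sketch of the whole EM loop for this scalar model (the course's Example 6 code is not shown; y is the observation vector and a, b, c, f are rough starting guesses). For simplicity the M-step sums run over t = 2, ..., N rather than handling the initial state exactly, and the b-update uses the standard lagged regression denominator.

kalman_em <- function(y, a, b, c, f, mu0 = 0, P0 = 1, n_iter = 100) {
  N <- length(y)
  for (iter in 1:n_iter) {
    ## E-step: forward filtering pass
    mu_p <- P_p <- mu_f <- P_f <- K <- numeric(N)
    m <- mu0; P <- P0
    for (t in 1:N) {
      mu_p[t] <- a + b * m;  P_p[t] <- b^2 * P + c^2
      K[t] <- P_p[t] / (P_p[t] + f^2)
      m <- mu_p[t] + K[t] * (y[t] - mu_p[t])
      P <- (1 - K[t]) * P_p[t]
      mu_f[t] <- m; P_f[t] <- P
    }
    ## E-step: backward smoothing pass, with one-step cross-covariances
    mu_s <- mu_f; P_s <- P_f
    J <- c(b * P_f[-N] / P_p[-1], 0)               # J_t for t = 1, ..., N-1
    for (t in (N - 1):1) {
      mu_s[t] <- mu_f[t] + J[t] * (mu_s[t + 1] - mu_p[t + 1])
      P_s[t]  <- P_f[t] + J[t]^2 * (P_s[t + 1] - P_p[t + 1])
    }
    P_cross <- numeric(N)                          # P_cross[t] = P_{t-1,t|N}
    P_cross[N] <- b * (1 - K[N]) * P_f[N - 1]
    for (t in (N - 1):2) {
      P_cross[t] <- J[t - 1] * P_f[t] + J[t] * J[t - 1] * (P_cross[t + 1] - b * P_f[t])
    }
    ## M-step: regression-type updates from the smoothed moments
    i1 <- 2:N; i0 <- 1:(N - 1); n <- N - 1
    S11 <- sum(P_s[i1] + mu_s[i1]^2)
    S00 <- sum(P_s[i0] + mu_s[i0]^2)
    S10 <- sum(P_cross[i1] + mu_s[i1] * mu_s[i0])
    b <- (S10 - sum(mu_s[i1]) * sum(mu_s[i0]) / n) / (S00 - sum(mu_s[i0])^2 / n)
    a <- (sum(mu_s[i1]) - b * sum(mu_s[i0])) / n
    c <- sqrt((S11 + n * a^2 + b^2 * S00 - 2 * a * sum(mu_s[i1])
               + 2 * a * b * sum(mu_s[i0]) - 2 * b * S10) / n)
    f <- sqrt(mean((y - mu_s)^2 + P_s))
  }
  c(a = a, b = b, c = c, f = f)
}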

Filtering: Calibration 52
Pairs trading
A simple application

I We will look at using these methods to create a basic pairs


trading system, using a toy setup, with real data.
I We will build this using one-second mid-prices for Microsoft
(MSFT) and Intel Corp. (INTC), on individual days in the
week beginning 3 November 2014.
I Thanks to Álvaro Cartea and Sebastian Jaimungal for data.

Filtering: Application: Pairs Trading 53


Pairs trading
A model

I We will model this using the method in previous section, as


suggested by Elliott, van der Hoek and Malcolm (2005).
I We fit the filter using Y = log(INTC/MSFT) using the
Kalman–EM method described above.
I We will then create a trading signal depending on the value of
Yt − µt|t .
I If our model is reasonable, we expect this value will revert
quickly to zero, which suggests a profitable trade, either long
INTC and short MSFT (if Y < µt|t ) or vice versa.
I Effectively, we expect prices to oscillate around a short-term
mean.

Filtering: Application: Pairs Trading 54


Pairs trading
A model

I We choose to trade only when the difference is sufficiently


large, in such a way that we have a position 1% of the time.
I We reevaluate our position every second, and only
invest/short at most $1 in each stock.
I We ignore all transaction costs, microstructure issues, trading
constraints, etc.
Example 7
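
A toy R sketch of this rule (Example 7's code is not reproduced; the vectors intc and msft of one-second mid-prices and the filtered mean mu are assumed to exist):

Y <- log(intc / msft)
spread <- Y - mu
threshold <- quantile(abs(spread), 0.99)            # in a position about 1% of the time
position <- ifelse(spread >  threshold, -1,         # short INTC / long MSFT
            ifelse(spread < -threshold,  1, 0))     # long INTC / short MSFT
# Approximate P&L from one-second log-returns, at most $1 per leg, ignoring
# transaction costs and microstructure effects.
ret <- position[-length(Y)] * (diff(log(intc)) - diff(log(msft)))
pnl <- cumsum(ret)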

Filtering: Application: Pairs Trading 55


Pairs trading
A model

I This suggests that these methods can be used to build


profitable trading strategies.
I Of course, we would need to incorporate further effects into
our model of profits before implementing this in practice, as in
the real world we can only buy at the ask and sell at the bid,
which will likely eliminate most of our observed gains.
I Filtering is fast, which is important in this setting.

Filtering: Application: Pairs Trading 56


Conclusion
What have we done

I We have looked at the problem of filtering in a variety of


contexts.
I Discrete/Continuous time
I Gaussian/Finite state (or general with an SPDE solution)
I We have seen how you can implement these filters, and how
to estimate the coefficients in a simple setting
I The EM algorithm can be used more widely
I We have seen a toy application of these methods to financial
data

Filtering: Conclusion 57
