Kalman Filter
Shoudong
Contents

1 Introduction
2 Kalman Filter Formulas
3 Multi-Dimensional Gaussian Distribution
4 Some Useful Properties of Gaussian Distributions
5 Derivation of the KF Prediction Formulas
6 Derivation of the KF Update Formulas
6.1 The case when H is invertible
6.2 The case when H is arbitrary
7 Matrix Inversion Lemma

1 Introduction
I assume the readers have already read Part I of these notes (the one-dimensional KF).
The structure of this note is as follows. The Kalman Filter formulas are reviewed in Section 2. The multi-dimensional Gaussian distribution is reviewed in Section 3 and some of its properties are listed in Section 4. The derivation of the multi-dimensional Kalman Filter prediction formulas is explained in Section 5, and the multi-dimensional Kalman Filter update formulas are derived in Section 6. Section 7 provides the matrix inversion lemma which is used in the derivation of the formulas.
For a linear system, the process model (from time $k$ to time $k+1$) is described as
$$x_{k+1} = F x_k + G u_k + w_k, \tag{1}$$
where $x_k, x_{k+1}$ are the system state (vector) at time $k, k+1$, $F$ is the system transition matrix, $G$ is the gain of the control $u_k$, and $w_k$ is the zero-mean Gaussian process noise, $w_k \sim N(0, Q)$.
For the state estimation problem, the true system state is not available and needs to be estimated. The initial state $x_0$ is assumed to follow a known Gaussian distribution, $x_0 \sim N(\hat{x}_0, P_0)$. The objective is to estimate the state at each time step using the process model and the observations.
The observation model at time $k+1$ is given by
$$z_{k+1} = H x_{k+1} + v_{k+1}, \tag{2}$$
where $H$ is the observation matrix and $v_{k+1}$ is the zero-mean Gaussian observation noise, $v_{k+1} \sim N(0, R)$.
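As a concrete illustration, consider a hypothetical constant-velocity tracking system; the matrices and noise values below are illustrative choices of mine, not from the text:

```python
import numpy as np

# Hypothetical constant-velocity example (illustrative values):
# state x = [position, velocity], scalar control u = commanded acceleration.
dt = 0.1                              # sample time
F = np.array([[1.0, dt],
              [0.0, 1.0]])            # system transition matrix
G = np.array([[0.5 * dt**2],
              [dt]])                  # gain of the control u_k
H = np.array([[1.0, 0.0]])            # observation matrix: position is measured
Q = 0.01 * np.eye(2)                  # process noise covariance (w_k ~ N(0, Q))
R = np.array([[0.25]])                # observation noise covariance (v_k ~ N(0, R))
```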
2 Kalman Filter Formulas

Suppose the knowledge on $x_k$ at time $k$ is
$$x_k \sim N(\hat{x}_k, P_k). \tag{3}$$
After the prediction step (after the process but before the observation), the knowledge on $x_{k+1}$ is
$$x_{k+1} \sim N(\bar{x}_{k+1}, \bar{P}_{k+1}), \tag{4}$$
where
$$\bar{x}_{k+1} = F \hat{x}_k + G u_k, \quad \bar{P}_{k+1} = F P_k F^T + Q. \tag{5}$$
After the update step using the observation $z_{k+1}$, the estimate becomes
$$\hat{x}_{k+1} = \bar{x}_{k+1} + K(z_{k+1} - H \bar{x}_{k+1}), \tag{6}$$
$$P_{k+1} = \bar{P}_{k+1} - K S K^T, \tag{7}$$
where the Kalman gain $K$ and the innovation covariance $S$ are given by
$$K = \bar{P}_{k+1} H^T S^{-1}, \tag{8}$$
$$S = H \bar{P}_{k+1} H^T + R. \tag{9}$$
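These formulas translate directly into code. Below is a minimal numpy sketch under the notation above; the function and variable names are my own:

```python
import numpy as np

def kf_predict(x_hat, P, F, G, u, Q):
    """Prediction step, equations (4)-(5)."""
    x_bar = F @ x_hat + G @ u            # predicted mean
    P_bar = F @ P @ F.T + Q              # predicted covariance
    return x_bar, P_bar

def kf_update(x_bar, P_bar, z, H, R):
    """Update step, equations (6)-(9)."""
    S = H @ P_bar @ H.T + R              # innovation covariance, (9)
    K = P_bar @ H.T @ np.linalg.inv(S)   # Kalman gain, (8)
    x_hat = x_bar + K @ (z - H @ x_bar)  # updated mean, (6)
    P = P_bar - K @ S @ K.T              # updated covariance, (7)
    return x_hat, P
```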
3 Multi-Dimensional Gaussian Distribution

A multi-dimensional Gaussian random vector is denoted by
$$x \sim N(m, P), \tag{10}$$
where $m$ is the mean vector and $P$ is the covariance matrix. The probability density function of an $n$-dimensional Gaussian is
$$p(x) = \frac{1}{(2\pi)^{n/2} |P|^{1/2}} \exp\left(-\frac{1}{2}(x - m)^T P^{-1}(x - m)\right). \tag{11}$$
The information matrix of a Gaussian distribution is defined as the inverse of its covariance matrix,
$$I = P^{-1}. \tag{12}$$
Note that this information is a matrix instead of a scalar. So a scalar measure (for example, the determinant, the trace, or the largest eigenvalue) is necessary in order to compare two information matrices.
4 Some Useful Properties of Gaussian Distributions

Below are a few properties of Gaussian distributions which are useful in deriving the Kalman Filter.

For any constant matrix $F$,
$$x \sim N(m, P) \implies F x \sim N(F m, F P F^T). \tag{13}$$

For any constant vector $c$,
$$x \sim N(m, P) \implies x + c \sim N(m + c, P). \tag{14}$$

For two independent random variables $x$ and $y$ (the value of $x$ contains no information about the value of $y$ and vice versa),
$$x \sim N(m_x, P_x),\ y \sim N(m_y, P_y) \implies x + y \sim N(m_x + m_y, P_x + P_y). \tag{15}$$
Remark 4.1 These properties are extensions of the properties of one-dimensional Gaussian distributions. Rigorous proofs of these properties can be found in many tutorials or books on probability, for example, the following website:
http://en.wikipedia.org/wiki/Gaussian_distribution
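These properties are also easy to check numerically by sampling. The following sketch (with arbitrary illustrative numbers of my choosing) verifies (13) and (15) by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
m = np.array([1.0, -2.0])
P = np.array([[2.0, 0.3], [0.3, 1.0]])
F = np.array([[0.5, 1.0], [0.0, 2.0]])

x = rng.multivariate_normal(m, P, size=200_000)          # samples of x ~ N(m, P)

# Property (13): F x ~ N(F m, F P F^T)
y = x @ F.T
print(np.allclose(y.mean(axis=0), F @ m, atol=0.02))     # -> True
print(np.allclose(np.cov(y.T), F @ P @ F.T, atol=0.05))  # -> True

# Property (15): for independent w ~ N(0, 0.5 I), x + w ~ N(m, P + 0.5 I)
w = rng.multivariate_normal(np.zeros(2), 0.5 * np.eye(2), size=200_000)
print(np.allclose(np.cov((x + w).T), P + 0.5 * np.eye(2), atol=0.05))  # -> True
```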
5 Derivation of the KF Prediction Formulas

This section shows that the KF prediction formulas can be obtained easily from the properties of Gaussian distributions listed in Section 4.
Suppose the process model is
$$x_{k+1} = F x_k + G u_k + w_k, \tag{16}$$
where $u_k$ is the control vector (a constant vector from time $k$ to time $k+1$) and $w_k$ is the zero-mean Gaussian process noise vector with covariance matrix $Q$, that is, $w_k \sim N(0, Q)$. It is also assumed that $w_k$ is independent of $x_k$.
At time $k$, the estimate of $x_k$ is a Gaussian distribution $x_k \sim N(\hat{x}_k, P_k)$ (see equation (3)); thus by property (13),
$$F x_k \sim N(F \hat{x}_k, F P_k F^T). \tag{17}$$
By property (14) (here $G u_k$ is a constant vector),
$$F x_k + G u_k \sim N(F \hat{x}_k + G u_k, F P_k F^T). \tag{18}$$
Since $w_k \sim N(0, Q)$ is independent of $x_k$, property (15) gives
$$x_{k+1} = (F x_k + G u_k) + w_k \sim N(F \hat{x}_k + G u_k, F P_k F^T + Q). \tag{19}$$
Thus if we denote the estimate of $x_{k+1}$ (after the process but before the observation) as
$$x_{k+1} \sim N(\bar{x}_{k+1}, \bar{P}_{k+1}), \tag{20}$$
then
$$\bar{x}_{k+1} = F \hat{x}_k + G u_k, \quad \bar{P}_{k+1} = F P_k F^T + Q. \tag{21}$$
These are exactly the prediction formulas (5) in Section 2.
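The conclusion (21) can also be checked by Monte Carlo simulation of the process model (16); the numbers below are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
F = np.array([[1.0, 0.1], [0.0, 1.0]])
G = np.array([[0.005], [0.1]])
Q = 0.01 * np.eye(2)
x_hat = np.array([0.0, 1.0])
P = np.array([[0.5, 0.1], [0.1, 0.2]])
u = np.array([2.0])

# Draw x_k ~ N(x_hat, P) and w_k ~ N(0, Q), then push samples through (16).
xk = rng.multivariate_normal(x_hat, P, size=300_000)
wk = rng.multivariate_normal(np.zeros(2), Q, size=300_000)
xk1 = xk @ F.T + G @ u + wk

# Empirical mean/covariance should match (21): N(F x_hat + G u, F P F^T + Q).
print(np.allclose(xk1.mean(axis=0), F @ x_hat + G @ u, atol=0.01))  # -> True
print(np.allclose(np.cov(xk1.T), F @ P @ F.T + Q, atol=0.01))       # -> True
```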
6 Derivation of the KF Update Formulas

This section shows that the KF update formulas can be obtained easily by adding the information from the observation to the prior information.
The observation model is
$$z_{k+1} = H x_{k+1} + v_{k+1}, \tag{22}$$
where $H$ is a constant matrix, $z_{k+1}$ is the observation value at time $k+1$ (a constant vector), and $v_{k+1}$ is the zero-mean Gaussian observation noise with covariance matrix $R$, that is, $v_{k+1} \sim N(0, R)$. It is also assumed that $v_{k+1}$ is independent of $x_{k+1}$.
6.1 The case when H is invertible

First suppose the observation matrix $H$ is invertible. Then the observation model (22) gives
$$H^{-1} z_{k+1} = x_{k+1} + H^{-1} v_{k+1}. \tag{23}$$
By property (13), the transformed noise satisfies
$$H^{-1} v_{k+1} \sim N(0, H^{-1} R (H^{-1})^T), \tag{24}$$
so the observation provides the following information about $x_{k+1}$:
$$x_{k+1} \sim N(H^{-1} z_{k+1}, H^{-1} R (H^{-1})^T). \tag{25}$$
The prior information about $x_{k+1}$ is given by (20) (after the prediction but before the update). So we have two pieces of information about $x_{k+1}$: the information from the observation (25) and the prior information (20).

According to the definition of the information matrix of a Gaussian distribution (see (12) in Section 3), the information matrix (about $x_{k+1}$) contained in (20) is
$$I_{prior} = \bar{P}_{k+1}^{-1}, \tag{26}$$
and the information matrix (about $x_{k+1}$) contained in (25) is
$$I_{obs} = \left(H^{-1} R (H^{-1})^T\right)^{-1} = H^T R^{-1} H. \tag{27}$$
The total information (about $x_{k+1}$) after the observation should be the sum of the two, namely,
$$I_{total} = I_{prior} + I_{obs} = \bar{P}_{k+1}^{-1} + H^T R^{-1} H. \tag{28}$$
The new mean value is the weighted sum of the mean values of the two Gaussian distributions (25) and (20). The weights are determined by the proportion of information contained in each of the two Gaussian distributions (as compared with the total information). That is,
$$\hat{x}_{k+1} = I_{total}^{-1} I_{prior} \bar{x}_{k+1} + I_{total}^{-1} I_{obs} (H^{-1} z_{k+1}) = I_{total}^{-1} I_{prior} \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} z_{k+1}, \tag{29}$$
since $I_{obs} H^{-1} = H^T R^{-1} H H^{-1} = H^T R^{-1}$. The new covariance matrix is the inverse of the total information,
$$P_{k+1} = I_{total}^{-1}. \tag{30}$$
So, the final estimate of $x_{k+1}$ (after the prediction and the update) is
$$x_{k+1} \sim N(\hat{x}_{k+1}, P_{k+1}), \tag{31}$$
where $\hat{x}_{k+1}$ and $P_{k+1}$ are given in the equations (29) and (30) above.
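Before rewriting these equations in the familiar gain form, here is a minimal numpy sketch of the information-form update (26)-(30); it assumes $H$ is invertible, as in this subsection, and the function name is my own:

```python
import numpy as np

def kf_update_information(x_bar, P_bar, z, H, R):
    """Information-form update, equations (26)-(30); H assumed invertible."""
    R_inv = np.linalg.inv(R)
    I_prior = np.linalg.inv(P_bar)          # prior information, (26)
    I_obs = H.T @ R_inv @ H                 # observation information, (27)
    I_total = I_prior + I_obs               # total information, (28)
    P = np.linalg.inv(I_total)              # new covariance, (30)
    # New mean, (29): information-weighted sum of prior mean and H^{-1} z
    x_hat = P @ (I_prior @ x_bar + H.T @ R_inv @ z)
    return x_hat, P
```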
The update equations (29) and (30) can also be expressed in the gain form of Section 2. For the mean, using $I_{obs} = H^T R^{-1} H$,
$$\begin{aligned}
\hat{x}_{k+1} &= I_{total}^{-1} (I_{prior} + I_{obs} - I_{obs}) \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} z_{k+1} \\
&= I_{total}^{-1} (I_{total} - I_{obs}) \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} z_{k+1} \\
&= \bar{x}_{k+1} - I_{total}^{-1} H^T R^{-1} H \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} z_{k+1} \\
&= \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} (z_{k+1} - H \bar{x}_{k+1}),
\end{aligned} \tag{32}$$
and for the covariance,
$$P_{k+1} = I_{total}^{-1} = (\bar{P}_{k+1}^{-1} + H^T R^{-1} H)^{-1}. \tag{33}$$
By the matrix inversion lemma (see equation (44) in Section 7),
$$P_{k+1} = (\bar{P}_{k+1}^{-1} + H^T R^{-1} H)^{-1} = \bar{P}_{k+1} - \bar{P}_{k+1} H^T (R + H \bar{P}_{k+1} H^T)^{-1} H \bar{P}_{k+1} \tag{34}$$
$$= \bar{P}_{k+1} - \bar{P}_{k+1} H^T S^{-1} H \bar{P}_{k+1} = \bar{P}_{k+1} - K S K^T, \tag{35}$$
where $K = \bar{P}_{k+1} H^T S^{-1}$. This is the gain formula (8), and (35) is the covariance update formula (7).
Now from (34),
$$\begin{aligned}
I_{total}^{-1} H^T R^{-1} &= P_{k+1} H^T R^{-1} \\
&= (\bar{P}_{k+1} - \bar{P}_{k+1} H^T S^{-1} H \bar{P}_{k+1}) H^T R^{-1} \\
&= \bar{P}_{k+1} H^T S^{-1} (S - H \bar{P}_{k+1} H^T) R^{-1} \\
&= \bar{P}_{k+1} H^T S^{-1} R R^{-1} \\
&= \bar{P}_{k+1} H^T S^{-1} = K. \tag{36}
\end{aligned}$$
Substituting (36) into (32) gives
$$\hat{x}_{k+1} = \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} (z_{k+1} - H \bar{x}_{k+1}) = \bar{x}_{k+1} + K (z_{k+1} - H \bar{x}_{k+1}). \tag{37}$$
This is the mean update formula (6).
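The equivalence just derived, i.e. that the information-form update (29)-(30) and the gain-form update (6)-(9) give the same result, can be verified numerically with random matrices (an illustrative check, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
P_bar = A @ A.T + n * np.eye(n)            # random symmetric positive definite prior
H = rng.standard_normal((n, n))            # square, (almost surely) invertible
R = 0.5 * np.eye(n)
x_bar = rng.standard_normal(n)
z = rng.standard_normal(n)

# Gain form, (6)-(9)
S = H @ P_bar @ H.T + R
K = P_bar @ H.T @ np.linalg.inv(S)
x_gain = x_bar + K @ (z - H @ x_bar)
P_gain = P_bar - K @ S @ K.T

# Information form, (28)-(30)
I_total = np.linalg.inv(P_bar) + H.T @ np.linalg.inv(R) @ H
P_info = np.linalg.inv(I_total)
x_info = P_info @ (np.linalg.inv(P_bar) @ x_bar + H.T @ np.linalg.inv(R) @ z)

print(np.allclose(x_gain, x_info), np.allclose(P_gain, P_info))  # -> True True
```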
6.2 The case when H is arbitrary

When $H$ is arbitrary, $H^{-1}$ may not exist, but we still have the formula of the information matrix from the observation,
$$I_{obs} = H^T R^{-1} H. \tag{38}$$
Here $I_{obs}$ might not be of full rank (the singularity of $I_{obs}$ means that the observation only contains information about part of the state $x_{k+1}$; there is no information about some part of the vector $x_{k+1}$ in the observation). However, the total information $I_{total} = I_{prior} + I_{obs}$ is still invertible (full rank), since $I_{prior} = \bar{P}_{k+1}^{-1}$ is positive definite and $I_{obs}$ is positive semidefinite.

Also, we still have the new mean value as the weighted sum
$$\hat{x}_{k+1} = I_{total}^{-1} I_{prior} \bar{x}_{k+1} + I_{total}^{-1} H^T R^{-1} z_{k+1}. \tag{39}$$
Now all the formulas after (27) and (29) remain true, since $H^{-1}$ is not required in any of them.
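A small numerical sketch of this rank-deficient case (illustrative numbers of mine: a two-dimensional state of which only the first component is observed):

```python
import numpy as np

P_bar = np.array([[0.5, 0.1], [0.1, 0.2]])
x_bar = np.array([0.0, 1.0])
H = np.array([[1.0, 0.0]])        # 1x2: only the first state component is observed
R = np.array([[0.25]])
z = np.array([0.3])

I_obs = H.T @ np.linalg.inv(R) @ H            # (38)
print(np.linalg.matrix_rank(I_obs))           # -> 1: singular, as discussed

I_total = np.linalg.inv(P_bar) + I_obs        # still full rank, hence invertible
P = np.linalg.inv(I_total)                    # (30)
x_hat = P @ (np.linalg.inv(P_bar) @ x_bar + H.T @ np.linalg.inv(R) @ z)  # (39)
print(x_hat, P)
```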
Remark 6.1 If you have read through all the sections and they make sense to you, then congratulations. If you feel this is not enough and you also want to understand more about the multi-dimensional extended Kalman Filter (EKF), then please read Part III.
7 Matrix Inversion Lemma

The following matrix inversion lemma is very useful and can be found in many textbooks on matrices or the Kalman Filter.

Lemma 7.1 Suppose that the partitioned matrix
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{40}$$
is invertible and that the inverse is conformably partitioned as
$$M^{-1} = \begin{bmatrix} X & Y \\ U & V \end{bmatrix}. \tag{41}$$
If $D$ is invertible, then
$$\begin{aligned}
X &= (A - B D^{-1} C)^{-1}, \\
Y &= -(A - B D^{-1} C)^{-1} B D^{-1}, \\
U &= -D^{-1} C (A - B D^{-1} C)^{-1}, \\
V &= D^{-1} + D^{-1} C (A - B D^{-1} C)^{-1} B D^{-1}. \tag{42}
\end{aligned}$$
Similarly, if $A$ is invertible, then
$$\begin{aligned}
X &= A^{-1} + A^{-1} B (D - C A^{-1} B)^{-1} C A^{-1}, \\
Y &= -A^{-1} B (D - C A^{-1} B)^{-1}, \\
U &= -(D - C A^{-1} B)^{-1} C A^{-1}, \\
V &= (D - C A^{-1} B)^{-1}. \tag{43}
\end{aligned}$$
Comparing the two expressions for $X$ gives the matrix inversion lemma
$$(A - B D^{-1} C)^{-1} = A^{-1} + A^{-1} B (D - C A^{-1} B)^{-1} C A^{-1}. \tag{44}$$
Choosing $A = \bar{P}_{k+1}^{-1}$, $B = H^T$, $C = H$ and $D = -R$ in (44) gives the identity used in (34).
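The identity (44) can be checked numerically with random matrices (an illustrative check):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 2
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)   # invertible A
D = rng.standard_normal((m, m)); D = D @ D.T + m * np.eye(m)   # invertible D
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
A_inv, D_inv = np.linalg.inv(A), np.linalg.inv(D)

# (44): (A - B D^{-1} C)^{-1} = A^{-1} + A^{-1} B (D - C A^{-1} B)^{-1} C A^{-1}
lhs = np.linalg.inv(A - B @ D_inv @ C)
rhs = A_inv + A_inv @ B @ np.linalg.inv(D - C @ A_inv @ B) @ C @ A_inv
print(np.allclose(lhs, rhs))  # -> True
```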