Understanding the Kalman Filter

Richard J. Meinhold and Nozer D. Singpurwalla

The American Statistician, Vol. 37, No. 2 (May 1983), pp. 123-127.

Stable URL: http://links.jstor.org/sici?sici=0003-1305%28198305%2937%3A2%3C123%3AUTKF%3E2.0.CO%3B2-Z

Understanding the Kalman Filter
RICHARD J. MEINHOLD and NOZER D. SINGPURWALLA*

This is an expository article. Here we show how the successfully used Kalman filter, popular with control engineers and other scientists, can be easily understood by statisticians if we use a Bayesian formulation and some well-known results in multivariate statistics. We also give a simple example illustrating the use of the Kalman filter for quality control work.

KEY WORDS: Bayesian inference; Box-Jenkins models; Forecasting; Exponential smoothing; Multivariate normal distribution; Time series.

1. INTRODUCTION

The Kalman filter (KF), commonly employed by control engineers and other physical scientists, has been successfully used in such diverse areas as the processing of signals in aerospace tracking and underwater sonar, and the statistical control of quality. More recently, it has also been used in some nonengineering applications such as short-term forecasting and the analysis of life lengths from dose-response experiments. Unfortunately, much of the published literature on the KF is in the engineering journals (including the original development, in Kalman 1960 and Kalman and Bucy 1961), and uses a language, notation, and style that is alien to statisticians. Consequently, many practitioners of statistics are not aware of the simplicity of this useful methodology. However, the model, the notions, and the techniques of Kalman filtering are potentially of great interest to statisticians, owing to their similarity to the linear models of regression and time series analysis and because of their great utility in applications.

In actuality, the KF may be easily understood by the statistician if it is cast as a problem in Bayesian inference and we employ some well-known elementary results in multivariate statistics. This feature was evidently first published by Harrison and Stevens (1971, 1976), who were primarily interested in Bayesian forecasting. However, the particular result is presented by them in a nontutorial manner, with emphasis placed on the implementation of the KF. Our aim, on the other hand, is to provide an exposition of the key notions of the approach in a single source: laying out its derivation in a few easy steps, filling in some clarifying technical details, giving an example, and giving an interpretation of results. A more mathematical discussion of the KF, emphasizing the stochastic differential equation approach, is given by Wegman (1982). We feel that once it is demystified, the KF will be used more often by applied statisticians.

2. THE KALMAN FILTER MODEL: MOTIVATION AND APPLICATIONS

Let Y_t, Y_{t-1}, . . . , Y_1, the data (which may be either scalars or vectors), denote the observed values of a variable of interest at times t, t-1, . . . , 1. We assume that Y_t depends on an unobservable quantity θ_t, known as the state of nature. Our aim is to make inferences about θ_t, which may be either a scalar or a vector and whose dimension is independent of the dimension of Y_t. The relationship between Y_t and θ_t is linear and is specified by the observation equation

    Y_t = F_t θ_t + v_t,    (2.1)

where F_t is a known quantity. The observation error v_t is assumed to be normally distributed with mean zero and a known variance V_t, denoted v_t ~ N(0, V_t).

The essential difference between the KF and the conventional linear model representation is that in the former the state of nature (analogous to the regression coefficients of the latter) is not assumed to be a constant but may change with time. This dynamic feature is incorporated via the system equation

    θ_t = G_t θ_{t-1} + w_t,    (2.2)

G_t being a known quantity and the system equation error w_t ~ N(0, W_t), with W_t known. Since there are many physical systems for which the state of nature θ_t changes over time according to a relationship prescribed by engineering or scientific principles, the ability to include a knowledge of the system behavior in the statistical model is an apparent source of attractiveness of the KF. Note that the relationships (2.1) and (2.2), specified through F_t and G_t, may or may not change with time, as is also true of the variances V_t and W_t; we have subscripted these here for the sake of generality.

In addition to the usual linear model assumptions regarding the error terms, we also postulate that v_t is independent of w_t; while extension to the case of dependency is straightforward, there is no need in this article to do so.
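
To fix ideas, here is a minimal Python sketch (our illustration, not part of the original article) that generates data from (2.1) and (2.2) in the scalar case; the constant values chosen for F_t, G_t, V_t, and W_t are arbitrary:

    import numpy as np

    rng = np.random.default_rng(42)    # arbitrary seed, for reproducibility

    T = 100
    F, G = 1.0, 0.9                    # known observation and system coefficients (our choice)
    V, W = 2.0, 1.0                    # known error variances (our choice)

    theta = np.empty(T)                # states: ordinarily unobservable
    y = np.empty(T)                    # observations
    state = 0.0                        # theta_0
    for t in range(T):
        state = G * state + rng.normal(0.0, np.sqrt(W))   # system equation (2.2)
        theta[t] = state
        y[t] = F * state + rng.normal(0.0, np.sqrt(V))    # observation equation (2.1)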

*Richard J. Meinhold is a graduate student, and Nozer D. Singpurwalla is Professor of Operations Research and Statistics at George Washington University, Washington, D.C. 20052. The work of the second author was supported in part by the Office of Naval Research Contract N00014-77-C-0263 and by the U.S. Army Research Office under Grant DAAG-29-80-C-0067 with George Washington University.

2.1 Applications
To look at how the KF model might be employed in practice, we consider a version of the frequently referenced example of tracking a satellite's orbit around the earth. The unknown state of nature θ_t could be the position and speed of the satellite at time t, with respect to a spherical coordinate system with origin at the center of the earth. These quantities cannot be measured directly. Instead, from tracking stations around the earth, we may obtain measurements of distance to the satellite and the accompanying angles of measurement; these are the Y_t's. The principles of geometry, mapping Y_t into θ_t, would be incorporated in F_t, while v_t would reflect the measurement error; G_t would prescribe how the position and speed change in time according to the physical laws governing orbiting bodies, while w_t would allow for deviations from these laws owing to such factors as nonuniformity of the earth's gravitational field, and so on.

A less complicated situation is considered by Phadke (1981) in the context of statistical quality control. Here the observation Y_t is a simple (approximately normal) transform of the number of defectives observed in a sample obtained at time t, while θ_{1,t} and θ_{2,t} represent, respectively, the true defective index of the process and the drift of this index. We then have as the observation equation

    Y_t = θ_{1,t} + v_t,

and as the system equations

    θ_{1,t} = θ_{1,t-1} + θ_{2,t-1} + w_{1,t},
    θ_{2,t} = θ_{2,t-1} + w_{2,t}.

In vector notation, with θ_t = (θ_{1,t}, θ_{2,t})', this system of equations becomes

    θ_t = G θ_{t-1} + w_t,

where

    G = | 1  1 |
        | 0  1 |

does not change with time.

If we examine Y_t - Y_{t-1} for this model, we observe that, under the assumption of constant variances, namely V_t = V and W_t = W, the autocorrelation structure of this difference is identical to that of an ARIMA(0,1,1) process in the sense of Box and Jenkins (1970). Although such a correspondence is sometimes easily discernible, we should in general not, because of the discrepancies in the philosophies and methodologies involved, consider the two approaches to be equivalent.
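
As a concrete look at this model, the short Python sketch below (ours; the variance values are arbitrary choices, since Phadke's are not given here) simulates the drift model and prints the sample autocorrelations of Y_t - Y_{t-1}:

    import numpy as np

    rng = np.random.default_rng(0)                   # arbitrary seed

    T = 500
    G = np.array([[1.0, 1.0], [0.0, 1.0]])           # system matrix from the vector form above
    W = np.diag([0.05, 0.01])                        # illustrative system variances
    V = 1.0                                          # illustrative observation variance

    theta = np.zeros(2)                              # (defective index, drift)
    y = np.empty(T)
    for t in range(T):
        theta = G @ theta + rng.multivariate_normal(np.zeros(2), W)   # system equations
        y[t] = theta[0] + rng.normal(0.0, np.sqrt(V))                 # observation equation

    d = np.diff(y)                                   # the differences Y_t - Y_{t-1}
    acf = [float(np.corrcoef(d[:-k], d[k:])[0, 1]) for k in range(1, 6)]
    print([round(a, 3) for a in acf])                # sample autocorrelations at lags 1-5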
3. THE RECURSIVE ESTIMATION PROCEDURE

The term "Kalman filter" or "Kalman filtering" refers to a recursive procedure for inference about the state of nature θ_t. The key notion here is that given the data Y_t = (Y_t, . . . , Y_1), inference about θ_t can be carried out through a direct application of Bayes's theorem:

    Prob{State of Nature | Data} ∝ Prob{Data | State of Nature} × Prob{State of Nature},    (3.1)

which can also be written as

    P(θ_t | Y_t) ∝ P(Y_t | θ_t, Y_{t-1}) × P(θ_t | Y_{t-1}),    (3.2)

where the notation P(A | B) denotes the probability of occurrence of event A given that (or conditional on) event B has occurred. Note that the expression on the left side of (3.2) denotes the posterior distribution for θ at time t, whereas the first and second expressions on the right side denote the likelihood and the prior distribution for θ, respectively.

The recursive procedure can best be explained if we focus attention on time point t-1, t = 1, 2, . . . , and the observed data until then, Y_{t-1} = (Y_{t-1}, Y_{t-2}, . . . , Y_1). In what follows, we use matrix manipulations in allowing for Y and/or θ to be vectors, without explicitly noting them as such.

At t-1, our state of knowledge about θ_{t-1} is embodied in the following probability statement for θ_{t-1}:

    (θ_{t-1} | Y_{t-1}) ~ N(θ̂_{t-1}, Σ_{t-1}),    (3.3)

where θ̂_{t-1} and Σ_{t-1} are the expectation and the variance of (θ_{t-1} | Y_{t-1}). In effect, (3.3) represents the posterior distribution of θ_{t-1}; its evolution will become clear in the subsequent text.

It is helpful to remark here that the recursive procedure is started off at time 0 by choosing θ̂_0 and Σ_0 to be our best guesses about the mean and the variance of θ_0, respectively.

We now look forward to time t, but in two stages:

1. prior to observing Y_t, and
2. after observing Y_t.

Stage 1. Prior to observing Y_t, our best choice for θ_t is governed by the system equation (2.2) and is given as G_t θ_{t-1} + w_t. Since θ_{t-1} is described by (3.3), our state of knowledge about θ_t is embodied in the probability statement

    (θ_t | Y_{t-1}) ~ N(G_t θ̂_{t-1}, R_t),  where R_t = G_t Σ_{t-1} G_t' + W_t;    (3.4)

this is our prior distribution. In obtaining (3.4), which represents our prior for θ_t in the next cycle of (3.2), we used the well-known result that for any constant C,

    X ~ N(μ, Σ) implies CX ~ N(Cμ, CΣC'),

where C' denotes the transpose of C.
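
In code, Stage 1 is a single application of the result just quoted. A minimal Python sketch (ours), with theta_hat and Sigma holding the posterior mean and variance from time t-1:

    import numpy as np

    def predict(theta_hat, Sigma, G, W):
        """Stage 1: the prior for theta_t before Y_t is seen, Eq. (3.4)."""
        prior_mean = G @ theta_hat          # G_t theta_hat_{t-1}
        R = G @ Sigma @ G.T + W             # R_t = G_t Sigma_{t-1} G_t' + W_t
        return prior_mean, R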

Stage 2. On observing Y_t, our goal is to compute the posterior of θ_t using (3.2). However, to do this, we need to know the likelihood L(θ_t | Y_t), or equivalently P(Y_t | θ_t, Y_{t-1}), the determination of which is undertaken via the following arguments.

Let e_t denote the error in predicting Y_t from the point t-1; thus

    e_t = Y_t - F_t G_t θ̂_{t-1}.    (3.5)

Since F_t, G_t, and θ̂_{t-1} are all known, observing Y_t is equivalent to observing e_t. Thus (3.2) can be rewritten as

    P(θ_t | Y_t, Y_{t-1}) ∝ P(e_t | θ_t, Y_{t-1}) × P(θ_t | Y_{t-1}),    (3.6)

with P(e_t | θ_t, Y_{t-1}) being the likelihood.

Using the fact that Y_t = F_t θ_t + v_t, (3.5) can be written as e_t = F_t(θ_t - G_t θ̂_{t-1}) + v_t, so that E(e_t | θ_t, Y_{t-1}) = F_t(θ_t - G_t θ̂_{t-1}). Since v_t ~ N(0, V_t), it follows that the likelihood is described by

    (e_t | θ_t, Y_{t-1}) ~ N(F_t(θ_t - G_t θ̂_{t-1}), V_t).    (3.7)

We can now use Bayes's theorem (Eq. (3.6)) to obtain

    P(θ_t | Y_t, Y_{t-1}) ∝ exp{-(1/2)[e_t - F_t(θ_t - G_t θ̂_{t-1})]' V_t^{-1} [e_t - F_t(θ_t - G_t θ̂_{t-1})]}
                            × exp{-(1/2)(θ_t - G_t θ̂_{t-1})' R_t^{-1} (θ_t - G_t θ̂_{t-1})},    (3.8)

and this best describes our state of knowledge about θ_t at time t. Once P(θ_t | Y_t, Y_{t-1}) is computed, we can go back to (3.3) for the next cycle of the recursive procedure. In the next section, we show that the posterior distribution of (3.8) is of the form presented in (3.3).

4. DETERMINATION OF THE POSTERIOR DISTRIBUTION

The tedious effort required to obtain P(θ_t | Y_t) using (3.8) can be avoided if we make use of the following well-known result in multivariate statistics (Anderson 1958, pp. 28-29), and some standard properties of the normal distribution.

Let X_1 and X_2 have a bivariate normal distribution with means μ_1 and μ_2, respectively, and a covariance matrix

    Σ = | Σ_11  Σ_12 |
        | Σ_21  Σ_22 | ;

we denote this by

    (X_1, X_2) ~ N[(μ_1, μ_2), Σ].    (4.1)

When (4.1) holds, the conditional distribution of X_1 given X_2 is described by

    (X_1 | X_2) ~ N(μ_1 + Σ_12 Σ_22^{-1}(X_2 - μ_2), Σ_11 - Σ_12 Σ_22^{-1} Σ_21).    (4.2)

The quantity μ_1 + Σ_12 Σ_22^{-1}(X_2 - μ_2) is called the regression function, and Σ_12 Σ_22^{-1} is referred to as the coefficient of the least squares regression of X_1 on X_2.

As a converse to the relationship "(4.1) implies (4.2)," we have the result that whenever (4.2) holds, and when X_2 ~ N(μ_2, Σ_22), then (4.1) will hold; we will use this converse relationship.

For our situation, we suppress the conditioning variables Y_{t-1} and let X_1 correspond to e_t, and X_2 correspond to θ_t; we denote this correspondence by X_1 ⇔ e_t and X_2 ⇔ θ_t. Since (θ_t | Y_{t-1}) ~ N(G_t θ̂_{t-1}, R_t) (see (3.4)), we note that

    μ_2 ⇔ G_t θ̂_{t-1}  and  Σ_22 ⇔ R_t.

If in (4.2) we replace X_1, X_2, μ_2, and Σ_22 by e_t, θ_t, G_t θ̂_{t-1}, and R_t, respectively, and recall the result that (e_t | θ_t, Y_{t-1}) ~ N(F_t(θ_t - G_t θ̂_{t-1}), V_t) (Eq. (3.7)), then

    μ_1 + Σ_12 R_t^{-1}(θ_t - G_t θ̂_{t-1}) ⇔ F_t(θ_t - G_t θ̂_{t-1}),

so that μ_1 ⇔ 0 and Σ_12 ⇔ F_t R_t; similarly,

    Σ_11 - Σ_12 Σ_22^{-1} Σ_21 = Σ_11 - F_t R_t F_t' ⇔ V_t,

so that Σ_11 ⇔ V_t + F_t R_t F_t'.

We now invoke the converse relation mentioned previously to conclude that the joint distribution of θ_t and e_t, given Y_{t-1}, can be described as

    (e_t, θ_t | Y_{t-1}) ~ N[ (0, G_t θ̂_{t-1}),  | V_t + F_t R_t F_t'   F_t R_t |  ].    (4.3)
                                                  | R_t F_t'             R_t     |

Making e_t the conditioning variable and identifying (4.3) with (4.1), we obtain via (4.2) the result that

    (θ_t | e_t, Y_{t-1}) ~ N( G_t θ̂_{t-1} + R_t F_t'(V_t + F_t R_t F_t')^{-1} e_t,
                              R_t - R_t F_t'(V_t + F_t R_t F_t')^{-1} F_t R_t ).    (4.4)

This is the desired posterior distribution. We now summarize to highlight the elements of the recursive procedure.

After time t-1, we had a posterior distribution for θ_{t-1} with mean θ̂_{t-1} and variance Σ_{t-1} (Eq. (3.3)). Forming a prior for θ_t with mean G_t θ̂_{t-1} and variance R_t = G_t Σ_{t-1} G_t' + W_t (Eq. (3.4)), and evaluating a likelihood given e_t = Y_t - F_t G_t θ̂_{t-1} (Eq. (3.5)), we arrive at the posterior density for θ_t; this has mean

    θ̂_t = G_t θ̂_{t-1} + R_t F_t'(V_t + F_t R_t F_t')^{-1} e_t    (4.5)

and variance

    Σ_t = R_t - R_t F_t'(V_t + F_t R_t F_t')^{-1} F_t R_t.    (4.6)

We now continue through the next cycle of the process.
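
This summary translates directly into code. The sketch below (ours; a transcription of (3.4), (3.5), (4.5), and (4.6), not the authors' own program) performs one full cycle, with all quantities treated as matrices and vectors of conforming dimensions:

    import numpy as np

    def kalman_step(theta_hat, Sigma, y, F, G, V, W):
        """One recursion cycle: prior (3.4), forecast error (3.5),
        posterior mean (4.5), posterior variance (4.6)."""
        prior_mean = G @ theta_hat
        R = G @ Sigma @ G.T + W                 # Eq. (3.4)
        e = y - F @ prior_mean                  # Eq. (3.5), the one-step forecast error
        S = V + F @ R @ F.T                     # variance of e_t
        K = R @ F.T @ np.linalg.inv(S)          # coefficient of the regression of theta_t on e_t
        theta_hat = prior_mean + K @ e          # Eq. (4.5)
        Sigma = R - K @ F @ R                   # Eq. (4.6)
        return theta_hat, Sigma

Iterating kalman_step over Y_1, Y_2, . . . reproduces the cycle just described, with each posterior serving as the next prior.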

5. INTERPRETATION OF RESULTS AND CONCLUDING REMARKS

If we look at (4.4) for obtaining some additional insight into the workings of the Kalman filter, we note that the mean of the posterior distribution of (θ_t | e_t, Y_{t-1}) is indeed the regression function of θ_t on e_t. The mean (regression function) is the sum of two quantities: G_t θ̂_{t-1}, and a multiple of the one-step-ahead forecast error e_t.

We first remark that G_t θ̂_{t-1} is the mean of the prior distribution of θ_t (see (3.4)), and by comparing (4.3) and (4.4) with (4.1) and (4.2) we verify that the multiplier of e_t, namely R_t F_t'(V_t + F_t R_t F_t')^{-1}, is the coefficient of the least squares regression of θ_t on e_t (conditional on Y_{t-1}). Thus one way to view Kalman filtering is to think of it as an updating procedure that consists of forming a preliminary (prior) guess about the state of nature and then adding a correction to this guess, the correction being determined by how well the guess has performed in predicting the next observation.

Second, we should clarify the meaning of regressing θ_t on e_t, since this pair constitutes but a single observation and the regression relationship is not estimated in the familiar way. Rather, we recall the usual framework of sequential Bayesian estimation, wherein a new posterior distribution arises with each successive piece of data. At time zero, the regression of θ_1 on e_1 is determined entirely by our prior specifications. On receiving the first observation, the value of e_1 is mapped into θ̂_1 through this function, which is then replaced by a new regression relation based on e_1, F_1, G_1, V_1, and W_1. This in turn is used to map e_2 into θ̂_2, and so on as the process continues in the usual Bayesian prior/posterior iterative manner; see Figure 1. Thus Kalman filtering can also be viewed as the evolution of a series of regression functions of θ_t on e_t, at times 0, 1, . . . , t-1, t, each having a potentially different intercept and regression coefficient; the evolution stems from a learning process involving all the data.

Figure 1. Regression of θ_t on e_t. (The figure contrasts the regression function posterior to t-1 and prior to t with the regression function posterior to t and prior to t+1.)

The original development of the Kalman filter approach was motivated by the updating feature just described, and its derivation followed via least squares estimation theory. The Bayesian formulation described here yields the same result in an elegant manner and additionally provides the attractive feature of enabling inference about θ_t through a probability distribution rather than just a point estimate.

6. ILLUSTRATIVE EXAMPLES

We consider two examples to illustrate the preceding mechanism and its performance.

6.1 The Steady Model

We first return to the quality control model of Section 2.1, simplified by the removal of the drift parameter. This yields

    Y_t = θ_t + v_t  (Obs. Eqn.)
    θ_t = θ_{t-1} + w_t  (Sys. Eqn.).    (6.1)

This is the simplest possible nontrivial KF model (sometimes referred to in the forecasting literature as the steady model); it also corresponds, in the sense of possessing the same autocorrelation structure (assuming constant variances), to a class of ARIMA(0,1,1) models of Box and Jenkins (1970). In this situation, F_t = G_t = 1; if we further specify that Σ_0 = 1, V_t = 2, and W_t = 1, we can easily demonstrate inductively that R_t = G_t Σ_{t-1} G_t' + W_t = 2 and, from (4.6), Σ_t = 1.
In (4.5), then, our recursive relationship becomes

    θ̂_t = θ̂_{t-1} + (1/2)(Y_t - θ̂_{t-1}).

Table 1. A Simulation of the Process Described in Section 6.2

We see then that in this simple situation the KF estimator of θ_t, and thus of Ŷ_{t+1}, is actually equivalent to that derived from a form of exponential smoothing.
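
A few lines of Python (ours, using the Section 6.1 values Σ_0 = 1, V = 2, W = 1) confirm both the constant posterior variance and the smoothing form of the estimator:

    # Steady model: F = G = 1, V = 2, W = 1, Sigma_0 = 1 (values from the text).
    Sigma = 1.0
    for _ in range(10):
        R = Sigma + 1.0                  # R_t = Sigma_{t-1} + W
        Sigma = R - R * R / (2.0 + R)    # Eq. (4.6) with F = 1, V = 2
        print(Sigma)                     # prints 1.0 every time: the inductive claim

    # The gain is R/(V + R) = 2/4 = 1/2, so Eq. (4.5) reads
    # theta_hat_t = theta_hat_{t-1} + (1/2)(Y_t - theta_hat_{t-1}),
    # exponential smoothing with smoothing constant 1/2.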

6.2 A Numerical Example

We present in Table 1 a numerical example involving a simulation of the (scalar-dimensional) general model of (2.1) and (2.2). We continue to specify Σ_0 = 1, V_t = 2, and W_t = 1, but incorporate cyclical behavior in θ_t by setting G_t equal to a known cyclical function of t, while F_t is in the nature of the familiar independent variable of ordinary regression. This situation clearly cannot be contained in any class of the ARIMA family; instead it is analogous, if not equivalent, to the transfer function model approach of Box and Jenkins (1970).

Starting with a value for θ_0, the disturbances v_t and w_t were generated from a table of random normal variates and used in turn to produce, via the system and observation equations, the processes {θ_t} and {Y_t}, of which only the latter would ordinarily be visible. A "bad guess" value of θ̂_0 was chosen; as can be seen in Figure 2, where the actual values of θ_t and their estimates θ̂_t are plotted, the effect of this error is short-lived. The reader may find it conducive to a better understanding of the model to work through several iterations of the recursive procedure.

Figure 2. A Plot of the Simulated Values of θ_t, the State of Nature at Time t, and Their Estimated Values θ̂_t Via the Kalman Filter
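
A reader wishing to carry out such an experiment can adapt the sketch below (ours). The cyclical G_t and the F_t actually used for Table 1 are not reproduced above, so the sketch substitutes an illustrative cosine for G_t and a constant F_t:

    import numpy as np

    rng = np.random.default_rng(7)                 # arbitrary seed

    T = 50
    V, W = 2.0, 1.0                                # variances from the text
    def G(t): return np.cos(2 * np.pi * t / 12)    # illustrative cyclical system coefficient
    F = 1.0                                        # illustrative observation coefficient

    # Simulate {theta_t} and {Y_t}; only the latter would ordinarily be visible.
    theta, y = np.empty(T), np.empty(T)
    state = 0.0                                    # theta_0
    for t in range(T):
        state = G(t) * state + rng.normal(0.0, np.sqrt(W))
        theta[t] = state
        y[t] = F * state + rng.normal(0.0, np.sqrt(V))

    # Filter the observations, starting from a deliberately bad guess for theta_0.
    theta_hat, Sigma = 10.0, 1.0                   # bad guess; Sigma_0 = 1
    for t in range(T):
        R = G(t) * Sigma * G(t) + W                # Eq. (3.4)
        e = y[t] - F * G(t) * theta_hat            # Eq. (3.5)
        K = R * F / (V + F * R * F)                # regression coefficient (gain)
        theta_hat = G(t) * theta_hat + K * e       # Eq. (4.5)
        Sigma = R - K * F * R                      # Eq. (4.6)
        print(t, round(theta[t], 2), round(theta_hat, 2))   # effect of the bad guess dies out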
Prediction Problems," Journal of Basic Engineering, 82, 34-45.
[Received October 1981. Revised July 1982. ] KALMAN, R . E . , and BUCY, R.S. (1961), "New Results in Linear
Filtering and Prediction Theory," Journal of Basic Engineering,
83, 95-108.
REFERENCES PHADKE, M.S. (1981), ' " ~ u a l i t Audit
~ Using Adaptive Kalman
Filtering," P S Q C Quality Congress Transactions-San Francisco,
ANDERSON, T. W. (1958), A n Introduction to Multivariate Statisti- 1045-1052.
cal Analysis, New York: John Wiley. WEGMAN, E.J. (1982), "Kalman Filtering," in Encyclopedia o f
BOX, G.E.P., and JENKINS, G.M. (1970), Time Series Analysis, Statistics, eds. Norman Johnson and Samuel Kotz, New York: John
Forecasting and Control, San Francisco: Holden-Day. Wiley.
