JSS

University of Bristol Research Report 08:16

http://www.stats.bris.ac.uk/

July 2008

SMCTC: Sequential Monte Carlo in C++

Adam M. Johansen

Abstract

Sequential Monte Carlo methods are a very general class of Monte Carlo methods for sampling from sequences of distributions. Simple examples of these algorithms are used very widely in the tracking and signal processing literature. Recent developments illustrate that these techniques have much more general applicability, and can be applied very effectively to statistical inference problems. Unfortunately, these methods are often perceived as being computationally expensive and difficult to implement. This article seeks to address both of these problems. A C++ template class library for the efficient and convenient implementation of very general Sequential Monte Carlo algorithms is presented. Two example applications are provided: a simple particle filter for illustrative purposes and a state-of-the-art algorithm for rare event estimation.

Keywords: Monte Carlo, particle filtering, Sequential Monte Carlo, simulation, template class.

1. Introduction
Sequential Monte Carlo (SMC) methods provide weighted samples from a sequence of distributions using sampling and resampling mechanisms. They have been widely employed in the approximate solution of the optimal filtering equations (see, for example, Doucet, de Freitas, and Gordon (2001); Liu (2001); Doucet and Johansen (2008) for reviews of this literature) over the past fifteen years (in this domain, the technique is often termed particle filtering). More recently, it has been established that the same techniques could be much more widely employed to provide samples from essentially arbitrary sequences of distributions (Del Moral, Doucet, and Jasra 2006a,b).

SMC algorithms are perceived as being difficult to implement and yet there is no existing software or library which provides a cohesive framework for the implementation of general algorithms. Even in the field of particle filtering, little generic software is available. Implementations of various particle filters described in van der Merwe, Doucet, de Freitas, and Wan (2000) were made available and could be adapted to some degree; more recently, a MATLAB implementation of a particle filter entitled PFlib has been developed (Chen, Lee, Budhiraja, and Mehra 2007). This software is restricted to the particle filtering setting and is somewhat limited even within this class.

Furthermore, many interesting algorithms are computationally intensive: a fast implementation in a compiled language is essential to the generation of results in a timely manner. There are two situations in which fast, efficient execution is essential to practical SMC algorithms:

• Traditionally, SMC algorithms are very widely used in real-time signal processing situations. Here, it is essential that an update of the algorithm can be carried out in the time between two consecutive observations.

• SMC samplers are often used to sample from complex, high-dimensional distributions. Doing so can involve substantial computational effort. In a research environment, one typically requires the output of hundreds or thousands of runs to establish the properties of an algorithm and in practice it is often necessary to run algorithms on very large data sets.

In either case, efficient algorithmic implementations are needed. The purpose of the present paper is to present a flexible framework for the implementation of general SMC algorithms. This flexibility and the speed of execution come at the cost of requiring some simple programming on the part of the end user. It is our perception that this is not a severe limitation and that a place for such a library does exist. It is our experience that with the widespread availability of high-quality mathematical libraries, particularly the GNU Scientific Library (Galassi, Davies, Theiler, Gough, Jungman, Booth, and Rossi 2006), there is little overhead associated with the development of software in C or C++ rather than an interpreted statistical language; although it may be slightly simpler to employ PFlib if a simple particle filter is required, it is not difficult to implement such a thing using SMCTC (the Sequential Monte Carlo Template Class) as illustrated in section 5.1. Using the library should be simple enough that it is appropriate for fast prototyping and use in a standard research environment (as the examples in section 5 hopefully demonstrate).

The fact that the library makes use of a standard language which has been implemented for essentially every modern architecture means that it can also be used for the development of production software: there is no difficulty in including SMC algorithms implemented using SMCTC as constituents of much larger pieces of software.

2. Sequential Monte Carlo

Sequential Monte Carlo methods are a general class of techniques which provide weighted samples from a sequence of distributions using importance sampling and resampling mechanisms. A number of other sophisticated techniques have been proposed in recent years to improve the performance of such algorithms. However, these can almost all be interpreted as techniques for making use of auxiliary variables in such a way that the target distribution is recovered as a marginal or conditional distribution or simply as a technique which makes use of a different distribution together with an importance weighting to approximate the distributions of interest.

2.1. Sequential Importance Sampling and Resampling

The Sequential Importance Resampling (SIR) algorithm is usually regarded as a simple example of an SMC algorithm which makes use of importance sampling and resampling techniques to provide samples from a sequence of distributions defined upon state-spaces of strictly-increasing dimension. Here, we will consider SIR as being a prototypical SMC algorithm of which essentially all other such algorithms can be interpreted as particular cases. Some motivation for this is provided in the following section. What follows is a short reminder of the principles behind SIR algorithms; see Doucet and Johansen (2008) for a more detailed discussion and an interpretation of most SMC algorithms as particular forms of SIR.

Importance sampling is a technique which allows the calculation of expectations with respect to a distribution $\pi$ using samples from some other distribution, $q$, with respect to which $\pi$ is absolutely continuous. To maximise the accessibility of this document, we assume throughout that all distributions admit a density with respect to Lebesgue measure and use appropriate notation; this is not a restriction imposed by the method or the software, simply a decision made for convenience. Rather than approximating $\int \varphi(x)\pi(x)\,dx$ as the sample average of $\varphi$ over a collection of samples from $\pi$, one approximates it with the sample average of $\varphi(x)\pi(x)/q(x)$ over a collection of samples from $q$. Thus, we approximate $\int \varphi(x)\pi(x)\,dx$ with the sample approximation

$$\hat{\varphi}_1 = \frac{1}{n} \sum_{i=1}^{n} \frac{\pi(X^i)}{q(X^i)} \varphi(X^i).$$

This is justified by the fact that $E_q[\varphi(X)\pi(X)/q(X)] = E_\pi[\varphi(X)]$. In practice, one typically knows $\pi(X)/q(X)$ only up to a normalising constant. We define $w(x) \propto \pi(x)/q(x)$ and note that this constant is usually estimated using the same sample as the integral of interest, leading to the consistent estimator of $\int \varphi(x)\pi(x)\,dx$ given by:

$$\hat{\varphi}_2 := \frac{\sum_{i=1}^{n} w(X^i)\varphi(X^i)}{\sum_{i=1}^{n} w(X^i)}.$$
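The self-normalised estimator above can be sketched in a few lines of standard C++. This is an illustration only and is independent of SMCTC: the target (a standard normal known only up to its normalising constant), the normal proposal with standard deviation `sigma`, and all function names are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>

// Unnormalised target density: a standard normal without its 1/sqrt(2*pi) factor.
double target_unnorm(double x) { return std::exp(-0.5 * x * x); }

// Proposal density q: a normal with mean 0 and standard deviation sigma.
double proposal_pdf(double x, double sigma) {
    const double kPi = 3.14159265358979323846;
    return std::exp(-0.5 * x * x / (sigma * sigma)) / (sigma * std::sqrt(2.0 * kPi));
}

// Self-normalised importance sampling estimate of E_pi[phi(X)] from n draws of q.
template <typename Phi>
double self_normalised_estimate(Phi phi, std::size_t n, double sigma, unsigned seed) {
    assert(n > 0 && sigma > 0.0);
    std::mt19937 gen(seed);
    std::normal_distribution<double> q(0.0, sigma);
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = q(gen);
        // Weight proportional to pi(x)/q(x); the unknown constant cancels in num/den.
        const double w = target_unnorm(x) / proposal_pdf(x, sigma);
        num += w * phi(x);
        den += w;
    }
    return num / den;
}
```

Calling `self_normalised_estimate` with `phi(x) = x * x` and a proposal wider than the target (for instance `sigma = 2`) should give a value close to 1, the variance of the standard normal target.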

Sequential importance sampling is a simple extension of this method. If a distribution $q$ is defined over a product space $\prod_{i=1}^{n} E_i$ then it may be decomposed as the product of conditional distributions $q(x_{1:n}) = q(x_1) q(x_2|x_1) \cdots q(x_n|x_{1:n-1})$. In principle, given a sequence of probability distributions $\{\pi_n(x_{1:n})\}_{n \geq 1}$ over the spaces $\{\prod_{i=1}^{n} E_i\}_{n \geq 1}$, we could estimate expectations with respect to each in turn by extending the sample used at time $n-1$ to time $n$ by sampling from the appropriate conditional distribution and then using the fact that:

$$w_n(x_{1:n}) \propto \frac{\pi_n(x_{1:n})}{q(x_{1:n})} = \frac{\pi_n(x_{1:n})}{q(x_n|x_{1:n-1})\, q(x_{1:n-1})} = \frac{\pi_n(x_{1:n})}{q(x_n|x_{1:n-1})\, \pi_{n-1}(x_{1:n-1})} \cdot \frac{\pi_{n-1}(x_{1:n-1})}{q(x_{1:n-1})} \propto w_{n-1}(x_{1:n-1}) \frac{\pi_n(x_{1:n})}{q(x_n|x_{1:n-1})\, \pi_{n-1}(x_{1:n-1})}$$

to update the weights associated with each sample from one iteration to the next. However, this approach fails as $n$ becomes large as it amounts to importance sampling on a space of high dimension.

Resampling is a technique which helps to retain a good representation of the final time-marginals (and these are usually the distributions of interest in applications of SMC). Resampling is the principled elimination of samples with small weight and replication of those with large weights, followed by resetting all of the weights to the same value. The mechanism is chosen to ensure that the expected number of replicates of each sample is proportional to its weight before resampling.
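One mechanism with the property just described is systematic resampling, sketched below in standard C++. The function name and interface are hypothetical and are not taken from SMCTC. The sketch returns the number of replicates of each sample, and each count is either the floor or the ceiling of N times the normalised weight.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns, for each particle, how many copies survive systematic resampling.
// `weights` may be unnormalised but must be non-negative with a positive sum.
std::vector<std::size_t> systematic_counts(const std::vector<double>& weights,
                                           std::mt19937& gen) {
    const std::size_t n = weights.size();
    assert(n > 0);
    const double total = std::accumulate(weights.begin(), weights.end(), 0.0);
    assert(total > 0.0);

    // One uniform draw positions an evenly spaced comb of n points on [0, n).
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double u = unif(gen);

    std::vector<std::size_t> counts(n, 0);
    double cumulative = 0.0;   // running sum of n times the normalised weights
    std::size_t assigned = 0;  // comb points consumed so far
    for (std::size_t i = 0; i < n; ++i) {
        cumulative += static_cast<double>(n) * weights[i] / total;
        // Comb point k sits at k + u; count those that fall below the running sum.
        while (assigned < n && static_cast<double>(assigned) + u < cumulative) {
            ++counts[i];
            ++assigned;
        }
    }
    // Guard against floating-point round-off leaving comb points unassigned.
    counts[n - 1] += n - assigned;
    return counts;
}
```

After resampling, every surviving copy would carry the same weight 1/N, so only the counts (or the corresponding ancestor indices) need to be stored.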

Algorithm 1 The Generic SIR Algorithm

At time 1
for i = 1 to N do
    Sample $X_1^i \sim q_1(\cdot)$
    Set $W_1^i \propto \pi_1(X_1^i) / q_1(X_1^i)$
end for
Resample $\{X_1^i, W_1^i\}$ to obtain $\{\tilde{X}_1^i, 1/N\}$

At time n ≥ 2
for i = 1 to N do
    Set $X_{1:n-1}^i = \tilde{X}_{1:n-1}^i$
    Sample $X_n^i \sim q_n(\cdot | X_{1:n-1}^i)$
    Set $W_n^i \propto \pi_n(X_{1:n}^i) / \left( q_n(X_n^i | X_{1:n-1}^i)\, \pi_{n-1}(X_{1:n-1}^i) \right)$
end for
Resample $\{X_{1:n}^i, W_n^i\}$ to obtain $\{\tilde{X}_{1:n}^i, 1/N\}$

Algorithm 1 shows how this translates into an algorithm for a generic sequence of distributions. It is essentially this algorithm which SMCTC allows one to implement. It should be noted, however, that this algorithm encompasses almost all SMC algorithms.

2.2. Particle Filters

The majority of SMC algorithms were developed in the context of approximate solution of the optimal filtering and smoothing equations (although it should be noted that their use in some areas of the physics literature dates back to at least the 1950s). Their interpretation as SIR algorithms, and a detailed discussion of particle filtering and related fields, is provided by Doucet and Johansen (2008). Here, we attempt to present a concise overview of some of the more important aspects of the field. Particle filtering provides a strong motivation for SMC methods more generally and remains their primary application area at present.

State space models (SSMs, and the closely related hidden Markov models) are very popular statistical models for time series. Such models describe the trajectory of some system of interest as an unobserved $E$-valued Markov chain, known as the signal process, which for the sake of simplicity is treated as being time-homogeneous in this paper. Let $X_1 \sim \nu$ and $X_n | (X_{n-1} = x_{n-1}) \sim f(\cdot | x_{n-1})$ and assume that a sequence of observations, $\{Y_n\}_{n \in \mathbb{N}}$, is available. If $Y_n$ is, conditional upon $X_n$, independent of the remainder of the observation and signal processes, with $Y_n | (X_n = x_n) \sim g(\cdot | x_n)$, then this describes an SSM. A common objective is the recursive approximation of an analytically intractable sequence of posterior distributions $\{p(x_{1:n} | y_{1:n})\}_{n \in \mathbb{N}}$, of the form:

$$p(x_{1:n} | y_{1:n}) \propto \nu(x_1)\, g(y_1 | x_1) \prod_{j=2}^{n} f(x_j | x_{j-1})\, g(y_j | x_j). \qquad (1)$$

There are a small number of situations in which these distributions can be obtained in closed form (notably the linear-Gaussian case, which leads to the Kalman filter). However, in general it is necessary to employ approximations and one of the most versatile approaches is to use SMC to approximate these distributions. The standard approach is to use the SIR algorithm described in the previous section, targeting this sequence of posterior distributions, although alternative strategies exist. This leads to Algorithm 2.

Algorithm 2 SIR for Particle Filtering

At time 1
for i = 1 to N do
    Sample $X_1^i \sim q(x_1 | y_1)$.
    Compute the weights $w_1(X_1^i) = \nu(X_1^i)\, g(y_1 | X_1^i) / q(X_1^i | y_1)$ and $W_1^i \propto w_1(X_1^i)$.
end for
Resample $\{W_1^i, X_1^i\}$ to obtain N equally-weighted particles $\{1/N, \tilde{X}_1^i\}$.

At time n ≥ 2
for i = 1 to N do
    Sample $X_n^i \sim q(x_n | y_n, \tilde{X}_{n-1}^i)$ and set $X_{1:n}^i \leftarrow (\tilde{X}_{1:n-1}^i, X_n^i)$.
    Compute the weights $W_n^i \propto g(y_n | X_n^i)\, f(X_n^i | X_{n-1}^i) / q(X_n^i | y_n, X_{n-1}^i)$.
end for
Resample $\{W_n^i, X_{1:n}^i\}$ to obtain N new equally-weighted particles $\{1/N, \tilde{X}_{1:n}^i\}$.

We obtain, at time $n$, the approximation:

$$\hat{p}(dx_{1:n} | y_{1:n}) = \sum_{i=1}^{N} W_n^i\, \delta_{X_{1:n}^i}(dx_{1:n}).$$

Notice that, if we are interested only in approximating the marginal distributions $\{p(x_n | y_{1:n})\}$ and $\{p(y_{1:n})\}$, then we need to store only the terminal-value particles $X_{n-1:n}^i$ to be able to compute the weights: the algorithm's storage requirements do not increase over time.
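As a concrete instance of SIR for particle filtering, the following standard C++ sketch makes one particular choice of proposal: the prior dynamics, so that q = f (and q = ν at time 1) and the weight reduces to the likelihood g. It does not use SMCTC. The toy linear-Gaussian model, the multinomial resampling at every step and all names are assumptions made for illustration. Only the most recent particle values are stored, in line with the storage remark above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Toy linear-Gaussian state space model (all parameter names are illustrative):
//   X_1 ~ N(0, 1),  X_n | x_{n-1} ~ N(a * x_{n-1}, sx^2),  Y_n | x_n ~ N(x_n, sy^2).
struct ToyModel {
    double a;   // autoregressive coefficient of the signal process
    double sx;  // standard deviation of the signal noise
    double sy;  // standard deviation of the observation noise
};

// Runs SIR with the prior dynamics as proposal and returns the estimated
// filtering means E[X_n | y_{1:n}] for n = 1, ..., y.size().
std::vector<double> bootstrap_filter(const ToyModel& m, const std::vector<double>& y,
                                     std::size_t num_particles, unsigned seed) {
    assert(num_particles > 0 && m.sx > 0.0 && m.sy > 0.0);
    std::mt19937 gen(seed);
    std::normal_distribution<double> std_normal(0.0, 1.0);
    std::vector<double> x(num_particles), w(num_particles), resampled(num_particles);
    std::vector<double> means;
    means.reserve(y.size());

    for (std::size_t n = 0; n < y.size(); ++n) {
        double wsum = 0.0, mean = 0.0;
        for (std::size_t i = 0; i < num_particles; ++i) {
            // Propose from the initial law at the first step, else from f(.|x_{n-1}).
            x[i] = (n == 0) ? std_normal(gen) : m.a * x[i] + m.sx * std_normal(gen);
            // With q = f the incremental weight reduces to the likelihood g(y_n | x_n).
            const double r = (y[n] - x[i]) / m.sy;
            w[i] = std::exp(-0.5 * r * r);
            wsum += w[i];
            mean += w[i] * x[i];
        }
        assert(wsum > 0.0);  // if every weight underflows the estimate is undefined
        means.push_back(mean / wsum);

        // Multinomial resampling: draw ancestors with probability proportional to w.
        std::discrete_distribution<std::size_t> ancestor(w.begin(), w.end());
        for (std::size_t i = 0; i < num_particles; ++i) resampled[i] = x[ancestor(gen)];
        x.swap(resampled);
    }
    return means;
}
```

For this linear-Gaussian model the exact filtering means are available from the Kalman filter, which gives a convenient check: with a = 0.9, unit noise variances and observations 0.5 and 1.0, the exact means are 0.25 and approximately 0.678.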

2.3. SMC Samplers

It has recently been established that similar techniques can be used to sample from a general sequence of distributions defined upon general spaces (i.e. the requirement that the state space be strictly increasing can be relaxed and the connection between sequential distributions can be rather more general). This is achieved by applying standard SIR-type algorithms to a sequence of synthetic distributions defined upon an increasing sequence of state spaces constructed in such a way as to preserve the distributions of interest as their marginals.

SMC Samplers are a class of algorithms for sampling iteratively from a sequence of distributions, denoted by $\{\pi_n(x_n)\}_{n \in \mathbb{N}}$, defined upon a sequence of potentially arbitrary spaces, $\{E_n\}_{n \in \mathbb{N}}$ (Del Moral et al. 2006a). The approach involves the application of SIR to a cleverly constructed sequence of synthetic distributions which admit the distributions of interest as marginals. The synthetic distributions are

$$\tilde{\pi}_n(x_{1:n}) = \pi_n(x_n) \prod_{p=1}^{n-1} L_p(x_{p+1}, x_p),$$

where $\{L_n\}_{n \in \mathbb{N}}$ is a sequence of backward-in-time Markov kernels from $E_n$ into $E_{n-1}$. With this structure, an importance sample from $\tilde{\pi}_n$ is obtained by taking the path $x_{1:n-1}$, an importance sample from $\tilde{\pi}_{n-1}$, and extending it with a Markov kernel, $K_n$, which acts from $E_{n-1}$ into $E_n$, providing samples from $\tilde{\pi}_{n-1} \times K_n$ and leading to the incremental importance weight:

$$w_n(x_{n-1:n}) \propto \frac{\tilde{\pi}_n(x_{1:n})}{\tilde{\pi}_{n-1}(x_{1:n-1})\, K_n(x_{n-1}, x_n)} = \frac{\pi_n(x_n)\, L_{n-1}(x_n, x_{n-1})}{\pi_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}. \qquad (2)$$

In most applications, each $\pi_n(x_n)$ can only be evaluated point-wise, up to a normalising constant, and the importance weights defined by (2) are normalised in the same manner as in the SIR algorithm. Resampling may then be performed. The choice of auxiliary kernels, $L_n$, is critical to the performance of the algorithm. As was demonstrated in Del Moral et al. (2006b) the optimal form (if resampling is used at every iteration) is $L_{n-1}(x_n, x_{n-1}) \propto \pi_{n-1}(x_{n-1}) K_n(x_{n-1}, x_n)$ but it is typically impossible to evaluate the associated normalising factor (which cannot be neglected as it depends upon $x_n$ and appears in the importance weight). In practice, obtaining a good approximation to this kernel is essential to obtaining a good estimator variance; a number of methods for doing this have been developed in the literature.

It should also be noted that a number of other modern sampling algorithms can be interpreted as examples of SMC samplers. Algorithms which admit such an interpretation include annealed importance sampling (Neal 2001), population Monte Carlo (Cappé, Guillin, Marin, and Robert 2004) and the particle filter for static parameters (Chopin 2002). It is consequently straightforward to use SMCTC to implement these classes of algorithms.

Although it is apparent that this technique is applicable to numerous statistical problems and has been found to outperform existing techniques, including MCMC, in at least some problems of interest (see, for example, Fan, Leslie, and Wand (2007); Johansen, Doucet, and Davy (2008)), there have been relatively few attempts to apply these techniques, largely, in the opinion of the author, due to the perceived complexity of SMC approaches. Some of the difficulties are more subtle than simple implementation issues (in particular, selection of the forward and backward kernels, an issue which is discussed at some length in the original paper, and for which sensible techniques do exist), but we hope that this library will bring the widespread implementation of SMC algorithms for real-world problems one step closer.
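To make the rôle of the backward kernel in (2) concrete, consider one widely used special case, stated here as an assumption rather than as something required by the method: all spaces coincide, $K_n$ is an MCMC kernel with invariant distribution $\pi_n$, and $L_{n-1}$ is taken to be its time reversal. The kernel then cancels from the incremental weight, which depends only on the previous state:

```latex
% Assumption: K_n is \pi_n-invariant and L_{n-1} is its time reversal.
L_{n-1}(x_n, x_{n-1}) = \frac{\pi_n(x_{n-1}) \, K_n(x_{n-1}, x_n)}{\pi_n(x_n)}
\quad\Longrightarrow\quad
w_n(x_{n-1:n}) \propto
\frac{\pi_n(x_n)}{\pi_{n-1}(x_{n-1}) \, K_n(x_{n-1}, x_n)}
\cdot \frac{\pi_n(x_{n-1}) \, K_n(x_{n-1}, x_n)}{\pi_n(x_n)}
= \frac{\pi_n(x_{n-1})}{\pi_{n-1}(x_{n-1})}.
```

Invariance of $K_n$ is what makes this $L_{n-1}$ a properly normalised Markov kernel, and the resulting weight can be evaluated even when $K_n$ itself has no tractable density.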

3. Using SMCTC
This section documents some practical considerations: how the library can be obtained and what must be done in order to make use of it. The software has been successfully compiled and tested under a number of environments (including Gentoo, SuSE and Ubuntu Linux utilising GCC-3 and GCC-4, and Microsoft Visual C++ 5 under the Windows operating system). Microsoft Visual C++ project files are also provided in the msvc subdirectory of the appropriate directories and these should be used in place of the Makefile when working with this operating system/compiler combination. The file smctc.sln in the top-level directory comprises a Visual C++ solution which incorporates each of the individual projects. The remainder of this section assumes that working versions of the GNU C++ compiler (g++) and make are available. In principle, other compatible compilers and make utilities should work, although it might be necessary to make some small modifications to the Makefile or source code in some instances.

3.1. Obtaining SMCTC

SMCTC can be obtained from the author's website (http://www.stats.bris.ac.uk/~maamj/smctc/smctc.html at the time of writing) and is released under version 3 of the GNU General Public License (Free Software Foundation 2007). A link to the latest version of the software should be present on the SMC methods preprint server (http://www-sigproc.eng.cam.ac.uk/smc/software.html). Software is available in source form archived in .tar, .tar.bz2 and .zip formats.

3.2. Installing SMCTC

Having downloaded and unarchived the library source (for example, using tar xvf smctc.tar) it is necessary to perform a number of operations in order to make use of the library:

1. Compile the binary component of the library.
2. Install the library somewhere in your library path.
3. Install the header files somewhere in your include path.
4. Compile the example programs to verify that everything works.

Actually, only the first of these steps is essential. The library and header files can reside anywhere provided that the directories in which they reside are specified at link and compile time, respectively.

Compiling the library

Enter the top level of the SMCTC directory and run make libraries. This produces a static library named libsmctc.a and copies it to the lib directory within the SMCTC directory. Alternatively, make all will compile the library, the documentation and the example programs. After compiling the library and any other components which are required, it is safe to run make clean which will delete certain intermediate files.

Optional Steps
After compiling the library there will be a static library named libsmctc.a within the lib subdirectory. This should either be copied to your preferred library location (typically /usr/lib on a Linux system) or its location must be specified every time the library is used. The header files contained within the include subdirectory should be copied to a system-wide include directory (such as /usr/include) or it will be necessary to specify the location of the SMCTC include directory whenever a file which makes use of the library is compiled. In order to compile the examples, enter make examples in the SMCTC directory. This will build the examples and copy them into the bin subdirectory.

3.3. Building Programs with SMCTC

It should be noted that SMCTC is dependent upon the GNU Scientific Library (GSL) (Galassi et al. 2006) for random number generation. It would be reasonably straightforward to adapt the library to work with other random number generators but there seems to be little merit in doing so given the provisions of the GSL and its wide availability. It is necessary, therefore, to link executables against the GSL itself and a suitable CBLAS implementation (one is ordinarily provided with the GSL if a locally optimised version is not available). None of these things is likely to pose problems for a machine used for scientific software development. Assuming that the appropriate libraries are installed, it is a simple matter of compiling your source files with your preferred C++ compiler and then linking the resulting object files with the SMCTC and GSL libraries. The following commands, for example, are sufficient to compile the example program described in section 5.1:
g++ -I../../include -c pfexample.cc pffuncs.cc
g++ pfexample.o pffuncs.o -L../../lib -lsmctc -lgsl -lgslcblas -opf

It is, of course, advisable to include some additional options to encourage the compiler to optimise the code as much as possible once it has been debugged.

3.4. Additional Documentation

The library is fully documented using the Doxygen system (van Heesch 2007). This includes a comprehensive class and function reference for the library. It can be compiled using the command make docs in the top level directory of the library, if the freely-available Doxygen and GraphViz (Gansner and North 2000) programs are installed. It is available compiled in both HTML and PDF formats from the same place as the library itself.

4. The SMCTC Library

The principal rôle of this article is to introduce a C++ template class for the implementation of quite general SMC algorithms. It seems natural to consider an object orientated approach to this problem: a sampler itself is a single object, it contains particles and distributions; previous generations of the sampler may themselves be viewed as objects. For convenience it is also useful to provide an object which provides random numbers (via the GSL).

Templating is an object oriented programming (OOP) technique which abstracts the operation being performed from the particular type of object upon which the action is carried out. One of its simplest uses is the construction of container classes such as lists whose contents can be of essentially any type but whose operation is not qualitatively influenced by that type. See Stroustrup (1991, chapter 8) for further information about templates and their use within C++. Stroustrup (1991) helpfully suggests that, "One can think of a template as a clever kind of macro that obeys the scope, naming and type rules of C++".

It is natural to use such an approach for the implementation of a generic SMC library: whatever the state space of interest, $E$, (and, implicitly, the distributions of interest over those spaces) it is clear that the same actions are carried out during each iteration and the same basic tasks need to be performed. The structure of a simple particle filter with a real-valued state space (allowing a simple double to store the state associated with a particle) and that of a sophisticated trans-dimensional SMC algorithm defined over a state space which permits the representation of complex objects of a priori unknown dimension are essentially the same. An SMC algorithm iteratively carries out the following steps:

1. Move each particle according to some transition kernel.
2. Weight the particles appropriately.
3. Resample (perhaps only if some criterion is met).
4. Optionally apply an MCMC move of appropriate invariant distribution [1].

[Figure 1: Collaboration Diagram. Classes shown: sampler<T>, gslrnginfo, exception, history<T>, moveset<T>, rng, historyelement<T>, particle<T>, historyflags. T denotes the type of the sampler: the class used to represent an element of the sample space.]

The SMCTC library attempts to perform all operations which are related solely to the fact that an SMC algorithm is running (iteratively moving the entire collection of particles, resampling them and calculating appropriate integrals with respect to the empirical measure associated with the particle set, for example) whilst relying upon user-defined callback functions to perform those tasks which depend fundamentally upon the state space, target and proposal distributions (such as proposing a move for an individual particle and weighting that particle correctly). Whilst this means that implementing a new algorithm is not completely trivial, it provides considerable flexibility and transfers as much complexity and implementation effort from individual algorithm implementations to the library as possible whilst preserving that flexibility.
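The division of labour just described, with generic bookkeeping in a template and problem-specific work in user callbacks, can be illustrated with a deliberately tiny sketch. `ToyParticle`, `ToySampler` and the two callbacks are hypothetical names invented for this example; they are not SMCTC's classes or signatures, and resampling is omitted for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not SMCTC's classes.
template <typename T>
struct ToyParticle {
    T value;            // a point in the state space
    double log_weight;  // unnormalised log weight
};

template <typename T>
class ToySampler {
public:
    // User callback: moves one particle and updates its log weight in place.
    typedef void (*MoveFn)(long time, ToyParticle<T>& particle);

    ToySampler(std::size_t n, const T& initial, MoveFn move)
        : particles_(n), move_(move), time_(0) {
        assert(n > 0 && move != nullptr);
        for (std::size_t i = 0; i < n; ++i) {
            particles_[i].value = initial;
            particles_[i].log_weight = 0.0;
        }
    }

    // Generic part: advance every particle by one step using the callback.
    void iterate() {
        ++time_;
        for (std::size_t i = 0; i < particles_.size(); ++i) move_(time_, particles_[i]);
    }

    // Generic part: weighted average of f over the particle set.
    double integrate(double (*f)(const T&)) const {
        double num = 0.0, den = 0.0;
        for (std::size_t i = 0; i < particles_.size(); ++i) {
            const double w = std::exp(particles_[i].log_weight);
            num += w * f(particles_[i].value);
            den += w;
        }
        return num / den;
    }

private:
    std::vector<ToyParticle<T> > particles_;
    MoveFn move_;
    long time_;
};

// Problem-specific part for T = double: a deterministic shift, weight unchanged.
void shift_by_one(long /*time*/, ToyParticle<double>& p) { p.value += 1.0; }
double identity(const double& x) { return x; }
```

Nothing in `ToySampler` depends on what `T` is: a `ToySampler<double>` built with `shift_by_one` and iterated twice reports an average of 2 from `integrate(identity)`, and a different state type would only require different callbacks.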

4.1. Library and Program Structure

Almost the entire template library resides within the smc namespace, although a small number of objects are defined in std to allow for simpler stream I/O in particular. Figure 1 shows the essential structure of the library, which makes use of five templated classes and four standard ones. The highest level of object within the library corresponds to an entire algorithm: it is the smc::sampler class.

• smc::particle holds the value and (logarithmic, unnormalised) weight associated with an individual sample.
• smc::history describes the state of the sampler after each previous iteration (if this data is recorded).
• smc::historyelement is used by smc::history to hold the sampler state at a single earlier iteration.
• smc::historyflags contains elementary information about the history of the sampler. Presently this is simply used to record whether resampling occurred after a particular iteration.
• smc::moveset deals with the initialisation of particles, proposal distributions and additional MCMC moves.
• smc::rng provides a wrapper for the GSL random number facilities. Presently only a subset of its features are available, but direct access to the underlying GSL object is possible.
• smc::gslrnginfo is used only for the handling of information about the GSL random number generators.
• smc::exception is used for error handling.

[1] Strictly, this step could be incorporated into step 1 during the next iteration, but it is such a common technique that it is convenient to incorporate it explicitly as an additional step.

The general structure of a program (or program component) which carries out SMC using SMCTC consists of a number of sections, regardless of the function of that program.

Initialisation: Before the sampler can be used it is necessary to specify what the sampler does and the parameters of the SMC algorithm.

Iteration: Once a sampler has been created it is iterated either until completion or for one or more iterations until some calculation or output is required. Depending upon the purpose of the software this phase, in which the sampler actually runs, may be interleaved with the output phase.

Output: If the sampler is to be of any use, it is also necessary to output either details of the sampler itself or, more commonly, the result of calculating some expectations with respect to the empirical measure associated with the particle set.

Section 4.2 describes how to carry out each of these phases and the remainder of this section is then dedicated to the description of some implementation details. This section serves only to provide a conceptual explanation of the implementation of a sampler. For detailed examples, together with annotated source code, see section 5.
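The description of smc::particle above mentions that weights are held in logarithmic, unnormalised form. A likely motivation is numerical range, and the sketch below (plain C++ with a hypothetical function name, not part of SMCTC) shows one common way to turn such log weights into normalised weights without overflow: subtract the largest log weight before exponentiating.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts unnormalised log weights into normalised weights that sum to one.
// Subtracting the maximum first prevents overflow and guarantees that the
// largest weight does not underflow to zero.
std::vector<double> normalise_log_weights(const std::vector<double>& log_w) {
    assert(!log_w.empty());
    const double max_log = *std::max_element(log_w.begin(), log_w.end());
    std::vector<double> w(log_w.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < log_w.size(); ++i) {
        w[i] = std::exp(log_w[i] - max_log);
        sum += w[i];
    }
    for (std::size_t i = 0; i < w.size(); ++i) w[i] /= sum;
    return w;
}
```

With log weights of -1000 and -1000 + log 3, exponentiating directly would give 0/0 in double precision, whereas this routine returns weights close to 0.25 and 0.75.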

4.2. Creating, Configuring and Running a Sampler: smc::sampler

The top-level class is the one which most programs are going to make most use of. Indeed, it is possible to use the SMCTC library to perform SMC with almost no direct reference to any of its lower-level components. The first interaction between most programs and the SMCTC library is the creation of a new sampler object. It is necessary to specify the number of particles which the sampler will use at this stage (although some use of a variable number of particles has been made in the literature, this is very much less common than the use of a fixed number and for simplicity and efficiency the present library does not support such algorithms).

The constructor of smc::sampler must be supplied with two parameters [2]: the first indicates the number of particles to use and the other must take one of two values: SMC_HISTORY_RAM or SMC_HISTORY_NONE. If SMC_HISTORY_RAM is used then the sampler retains the full history of the sampler in memory [3]. Whilst this is convenient in some applications it uses a much greater amount of memory and has some computational overheads; if SMC_HISTORY_NONE is used, then only the most recent generation of particles is stored in memory. The latter setting is to be preferred in the absence of a reason to retain historical information.

As the smc::sampler class is really a template class, it is necessary to specify what type of SMC sampler is to be created. The type in this case corresponds to a class (or native C++ type) which describes a single point in the state space of the sampler of interest. So, for example, we could create an SMC sampler suitable for performing filtering in a one-dimensional real state space using 1000 particles and no storage of its history using the command:

smc::sampler<double> Sampler(1000, SMC_HISTORY_NONE);

or, using the C++ standard template library to provide a class of vectors, we could define a sampler with a vector-valued real state space using 1000 particles that retains its full history in memory using

smc::sampler<std::vector<double> > Sampler(1000, SMC_HISTORY_RAM);

Such recursive use of templates is valid although some early compilers failed to correctly implement this feature. Note that this is a situation in which C++ is white-space-sensitive: under the C++98 and C++03 standards it is essential to close the template-type declaration with > > rather than >>, because the latter is parsed as the right-shift operator (C++11 and later accept >>).

[2] A more complex version of the constructor which provides control over the nature of the random number generator used is also available; see section 4.5 or the library documentation for details.

Proposals and Importance Weights

All of the functions which move and weight individual particles are supplied to the smc::sampler object via an object of class smc::moveset. See section 4.3 for details. Once an smc::moveset object has been created, it is supplied to the smc::sampler via the SetMoveSet member function. This function takes a single argument which should be an already initialised smc::moveset object which specifies functions used for moving and weighting particles. Once the moveset has been specified, the sampler will call these functions as needed with no further intervention from the user.

Resampling
A number of resampling schemes are implemented within the library. Resampling can be carried out always, never or whenever the effective sample size (ESS) in the sense of (Liu 2001, pp. 35–36) falls below a specified threshold. To control resampling behaviour, use the SetResampleParams(int, double) member function. The first argument should be set to a constant indicating the resampling scheme to use (see table 1) and the second controls when resampling is performed. If the second argument is negative, then resampling is never performed; if it lies in [0, 1] then resampling is performed when the ESS falls below that proportion of the number of particles and when it is greater than 1, resampling is carried out when the ESS falls below that value. Note that if the second parameter is larger than the total number of particles, then resampling will always be performed.
3 In this case, the history means the particle set as it was at every iteration in the sampler's evolution. This is different to the path-space implementation of the sampler: if one wishes to work on the path space then it is necessary to store the full path in the particle value.

Constant                    Resampling Scheme Used
SMC_RESAMPLE_MULTINOMIAL    Multinomial
SMC_RESAMPLE_RESIDUAL       Residual (Liu and Chen 1998)
SMC_RESAMPLE_STRATIFIED     Stratified (Carpenter, Clifford, and Fearnhead 1999)
SMC_RESAMPLE_SYSTEMATIC     Systematic (Kitagawa 1996)

Table 1: Constants defined in sampler.hh which can be used to specify a resampling scheme.

The default behaviour is to perform stratified resampling whenever the ESS falls below half the number of particles. If this is acceptable then no call of SetResampleParams is required, although such a call can improve the readability of the code.

MCMC Diversification
Following the lead of the resample-move algorithm (Gilks and Berzuini 2001), many users of SMC methods make use of an MCMC kernel of the appropriate invariant distribution after the resampling step. This is done automatically by SMCTC if the appropriate component of the moveset supplied to the sampler is non-null. See section 5.2 for an example of an algorithm with such a move.

Running the Algorithm

Having set all of the algorithm's operating parameters, including the smc::moveset (it is not possible to initialise the sampler before it has been supplied with a function with which it can initialise the particles), the first step is to initialise the particle system. This is done using the Initialise method of the smc::sampler, which takes no arguments. This function eliminates any information from a previous run of the sampler and then initialises all of the particles by calling the function specified in the moveset once for each of them.

Once the particle system has been initialised, one may wish to output some information from the first generation of the system (see the following section). It is then time to begin iterating the particle system. The sampler class provides two methods for doing this: one which should be used if the program must control the rate of execution (such as in a real-time environment) or interact with the sampler at each iteration (perhaps obtaining the next observation for the likelihood function and calculating estimates of the current state) and another which is appropriate if one is interested in only the final state of the sampler. The first of these is Iterate() and it takes no arguments: it simply propagates the system to the next iteration using the moves specified in the moveset, resampling if the specified resampling criterion is met. The other, IterateUntil(int), takes a single argument: the number of the iteration which should be reached before the sampler stops iterating. The second function essentially calls the first iteratively until the desired iteration is reached.

Output
The smc::sampler object also provides the interface by which it is possible to perform some basic integration with respect to empirical measures associated with the particle set and to obtain the locations and weights of the particles.

Simple Integration
The most common use for the weighted sample associated with an SMC algorithm is the approximation of expectations with respect to the target measure. The Integrate function performs the appropriate calculation (for a user-specified function) and returns the estimate of the integral. In order to use the built-in sample integrator, it is necessary to provide a function which can be evaluated for each particle. The library then takes care of calculating the appropriate weighted sum over the particle set. The function, assumed to be named integrand, should take the form:
double integrand(T, void *)

where T denotes the type of the smc::sampler template class in use. This function will be called with the first argument set to the value associated with each particle in turn by the smc::sampler class. The function has an additional argument of type void * to allow the user to pass arbitrary additional information to the function. Having defined such a function, its integral with respect to the weighted empirical measure associated with the particle set of an smc::sampler object named Sampler is provided by calling
Sampler.Integrate(integrand, (void *) p);

where p is a pointer to auxiliary information that is passed directly to the integrand function via its second argument; this may be safely set to NULL if no such information is required by the function. See section 5.1 for examples of the use of this function with and without auxiliary information.

Path-sampling Integration
As is described in section 5.2, it is sometimes useful to estimate the normalising constant of the final distribution using a joint Monte Carlo/numerical integration of the path-sampling identity of Gelman and Meng (1998). The IntegratePS function performs this task, again using a user-specified function. This function can only be used if the sampler was created with the SMC_HISTORY_RAM option as it makes use of the full history of the particle set. The function to be integrated may have an explicit dependence upon the generation of the sampler and so an additional argument is supplied to the function which is to be integrated. In this case, the function (assumed to be named integrand_ps) should take the form:
double integrand_ps(long, T, void *)

Here, the first argument corresponds to the iteration number which is passed to the function explicitly and the remaining arguments have the same interpretation as in the simple integration case. An additional function is also required: one which specifies how far apart successive distributions are; this information is required to calculate the trapezoidal integral used in the path-sampling approximation. This function, here termed width_ps, takes the form
double width_ps(long, void *)

where the first argument is set to an iteration time and the second to user-supplied auxiliary information. It should return the width of the bin of the trapezoidal integration in which the function is approximated by this particle generation. Once these functions have been defined, the full path-sampling calculation is performed by calling
Sampler.IntegratePS(integrand_ps, width_ps, (void *) p);

where p is a pointer to auxiliary information that is passed directly to the integrand_ps function via its third argument; this may be safely set to NULL if no such information is required by the function. Section 5.2 provides an example of the use of the path-sampling integrator.

General Output
For more general tasks it is possible to access the locations and weights of the particles directly. Three low-level member functions provide access to the current generation of particles; each takes a single integer argument corresponding to a particle index. The functions are
GetParticleValue(int n)
GetParticleLogWeight(int n)
GetParticleWeight(int n)

and they return a constant reference to the value of particle n, the logarithm of the unnormalised weight of particle n and the unnormalised weight of that particle, respectively. The GetHistory() member of the smc::sampler class returns a constant pointer to the smc::history class in which the full particle history is stored to allow for very general use of the generated samples. This function is used in much the same manner as the simple particle-access functions described above; see the user manual for detailed information. Finally, a human-readable summary of the state of the particle system can be directed to an output stream using the usual << operator.
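The weighted sum behind the simple integrator described in this section can be sketched as follows. This is a self-contained illustration under stated assumptions rather than the library's code: the function integrate and the two integrands are hypothetical names, particle values are plain doubles, and weights are held as unnormalised log weights and normalised inside the function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Self-normalised estimate of E[f(X)] from particle values and unnormalised
// log weights. The largest log weight is subtracted before exponentiating so
// that very small weights do not all underflow to zero.
double integrate(const std::vector<double>& values,
                 const std::vector<double>& logWeights,
                 double (*integrand)(double, void*),
                 void* aux)
{
    assert(!values.empty() && values.size() == logWeights.size());
    double maxLog = logWeights[0];
    for (double lw : logWeights) {
        if (lw > maxLog) maxLog = lw;
    }
    double numerator = 0.0, denominator = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double w = std::exp(logWeights[i] - maxLog);
        numerator += w * integrand(values[i], aux);
        denominator += w;
    }
    return numerator / denominator;
}

// Integrand that needs no auxiliary information (aux may be NULL).
double identityIntegrand(double x, void*)
{
    return x;
}

// Integrand that receives a previously estimated mean via the void pointer.
double squaredDeviation(double x, void* aux)
{
    const double mean = *static_cast<double*>(aux);
    return (x - mean) * (x - mean);
}
```

A mean can be estimated first with identityIntegrand and a null pointer, and its address then passed as the auxiliary argument to squaredDeviation to estimate the variance, which is the same pattern used with the void * argument in section 5.1.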

4.3. Specifying Proposals and Importance Weights: smc::moveset

It is necessary to provide SMCTC with functions to initialise a particle and to move a particle at each iteration and weight it appropriately; if MCMC moves are required, a function to apply such a move to a particle is also needed. The following sections describe the functions which must be supplied for each of these tasks and this section concludes with a discussion of how to package these functions into an smc::moveset object and to pass the object to the sampler.

Initialising the particles

The first thing that the user needs to tell the library how to do is to initialise an individual particle: how should the initial value and weight be set? This is done via an initialisation function which should have prototype:
smc::particle<T> fInitialise(smc::rng * pRng);

where T denotes the type of the sampler and the function is assumed to be named fInitialise.

When the sampler calls this function, it supplies a pointer to an smc::rng class which serves as a source of random numbers (the user is, of course, free to use an alternative source if they prefer). Random numbers can be obtained via member functions of that class, which act as wrappers to some of the more commonly used GSL random variate generators, or by using the GetRaw() member function, which returns a pointer to the underlying GSL random number generator. Note that the GSL contains a very large number of efficient generators for random variables with most standard distributions. The function is expected to produce a new smc::particle of type T and to return this object to the sampler. The simplest way to do this is to use the initialising constructor defined for smc::particle objects. If an object named value of type T and a double named dLogWeight are available then
smc::particle<T>(value, dLogWeight)

will produce a new particle object which contains those values.

Moving and Weighting the particles

Similarly, it is necessary for a proposal function to be supplied. Such a function follows broadly the same pattern as the initialisation function but is supplied with the existing particle value and weight (which should be updated in place), the current iteration number and a pointer to an smc::rng class which serves as a source of randomness. The proposal function(s) take the form:
void fMove(long, smc::particle<T> &, smc::rng *)

When the sampler calls this function, the first argument is set to the current iteration number, the second to an smc::particle object which corresponds to the particle to be updated (this should be amended in place and so the function need return nothing) and the final argument is the random number generator. There are a number of functions which can be used to determine the current value and weight of the particle in question and to alter their values. It is important to remember that the weight must be updated as well as the particle value. Whilst this may seem undesirable, and one may ask why the library cannot simply calculate the weight automatically from supplied functions which specify the target distributions, proposal densities (and auxiliary kernel densities, where appropriate), there is a good reason for this. The automatic calculation of weights from generic expressions has two principal drawbacks: it need not be numerically stable if the distributions are defined on spaces of high dimension (it is likely to correspond to the ratio of two very small numbers) and it is very rarely an efficient way to update the weights. One should always eliminate any cancelling terms from the numerator and denominator as well as any constant (independent of particle value) multipliers in order to minimise redundant calculations. Note that the sampler assumes that the weights supplied are not normalised; no advantage is obtained by normalising them (this also allows the sampler to renormalise the weights as it sees fit to obtain numerical stability).

Changing the value
There are two approaches to accessing and changing the value of the particle. The first is to use the GetValue() and SetValue(T) functions (where T, again, serves as shorthand for the type of the sampler) to retrieve and then set the value. This is likely to be satisfactory when dealing with simple objects and produces safe, readable code. The alternative, which is likely to produce substantially faster code when T is a complicated class, is to use the GetValuePointer() function which returns a pointer to the internal representation of the particle value. This pointer can be used to modify the value in place to minimise the computational overhead.

Updating the weight
The present unnormalised weight of the particle can be obtained with the GetWeight() method; its logarithm with GetLogWeight(). The SetWeight(double) and SetLogWeight(double) functions serve to change the value. As one generally wishes to multiply the weight by the current incremental weight, the functions AddToLogWeight(double) and MultiplyWeightBy(double) are provided and perform the obvious function. Note that the logarithmic version of all these functions should be preferred for two reasons: numerical stability is typically improved by working with logarithms (weights are often very small and have an enormous range) and the internal representation of particle weights is logarithmic, so using the direct forms requires a conversion.
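The preference for the logarithmic functions can be illustrated with a small stand-alone class. This is a hypothetical sketch, not the smc::particle class: LogWeightedParticle is an invented name, its member names merely echo those mentioned above, and it assumes (as the text states for the library) that the weight is stored as a logarithm.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a weighted particle that stores only a log weight.
class LogWeightedParticle
{
public:
    explicit LogWeightedParticle(double logWeight) : logWeight_(logWeight) {}

    double GetLogWeight() const { return logWeight_; }

    // Converting back to the direct scale can underflow to zero.
    double GetWeight() const { return std::exp(logWeight_); }

    // Multiplying the weight by exp(increment) is an addition on the log scale.
    void AddToLogWeight(double increment) { logWeight_ += increment; }

    // The direct form needs a conversion to the internal log representation.
    void MultiplyWeightBy(double factor)
    {
        assert(factor > 0.0);
        logWeight_ += std::log(factor);
    }

private:
    double logWeight_;
};
```

After ten increments of -200 the log weight is -2000, which is perfectly representable, whereas the corresponding direct weight exp(-2000) is smaller than any positive double and evaluates to zero.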

Mixtures of Moves
It is common practice in advanced SMC algorithms to make use of a mixture of several proposal kernels. So common, in fact, that a dedicated interface has been provided to remove the overhead associated with selecting and applying an individual move from application programs. If there are several possible proposals, one should produce a function of the form described in the previous section for each of them and, additionally, a function which selects (possibly randomly) the particular move to apply to a given particle during a particular iteration. The sampler can then apply these functions appropriately, minimising the risk of any error being introduced at this stage. In order to use the automated mixture of moves, two additional objects are required. One is a list of move functions in a form the sampler can understand. This amounts to an array of pointers to functions of the appropriate form. Although the C++ syntax for such objects is slightly messy, it is very straightforward to create such an object. For example, if the sampler is of type T and fMv1 and fMv2 each correspond to a valid move function then the following code would produce an array of the appropriate type named pfMoves which contains pointers to these two functions:
void (*pfMoves[])(long, smc::particle<T> &, smc::rng *) = {fMv1, fMv2};

The other requirement is the function which selects which move to make at any given moment. In general, one would expect the selection to have some randomness associated with it. The function which performs the selection should have prototype:
long fSelect(long lTime, const smc::particle<T> & p, smc::rng * pRng)

When it is called by the sampler, lTime will contain the current iteration of the sampler, p will be an smc::particle object containing the state and weight of the current particle and the final argument corresponds to a random number generator. The function should return a value between 0 and one less than the number of moves available (this is interpreted by the sampler as an index into the array of function pointers defined previously).
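The dispatch that such an array and selector make possible can be sketched without the library. Everything below is an assumption made for illustration: ToyParticle, MoveFn, SelectFn, toyMoves and applyMixtureMove are hypothetical names, std::mt19937 stands in for smc::rng, and the moves omit the weight update that a real proposal function must perform.

```cpp
#include <cassert>
#include <random>

// Hypothetical particle type; not smc::particle.
struct ToyParticle
{
    double value;
    double logWeight;
};

typedef void (*MoveFn)(long, ToyParticle&, std::mt19937&);
typedef long (*SelectFn)(long, const ToyParticle&, std::mt19937&);

// Two toy random-walk moves with different step sizes (weight update omitted).
void moveSmall(long, ToyParticle& p, std::mt19937& rng)
{
    std::normal_distribution<double> step(0.0, 0.1);
    p.value += step(rng);
}

void moveLarge(long, ToyParticle& p, std::mt19937& rng)
{
    std::normal_distribution<double> step(0.0, 10.0);
    p.value += step(rng);
}

// Array of pointers to move functions, analogous to pfMoves in the text.
MoveFn toyMoves[] = {moveSmall, moveLarge};

// Selector: index 1 (the large move) with probability 0.25, otherwise index 0.
long selectMove(long, const ToyParticle&, std::mt19937& rng)
{
    std::bernoulli_distribution pickLarge(0.25);
    return pickLarge(rng) ? 1 : 0;
}

// What a sampler might do for one particle: ask the selector for an index,
// check that it is valid, then call the chosen move.
void applyMixtureMove(long time, ToyParticle& p, std::mt19937& rng,
                      MoveFn* moves, long nMoves, SelectFn select)
{
    const long k = select(time, p, rng);
    assert(0 <= k && k < nMoves);
    moves[k](time, p, rng);
}
```

Keeping the selection separate from the moves means a new proposal can be added by extending the array and adjusting the selector, without touching the code that applies the moves.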

Additional MCMC moves

If MCMC moves are required then one should simply produce an additional move function with an almost identical prototype to that used for proposal moves. However, whereas proposals have a void return type, the MCMC move function should return int. The function should return zero if a move is rejected and a positive value if it is accepted (see footnote 4). In addition to allowing SMCTC to monitor the acceptance rate, this ensures that no confusion between proposal and MCMC moves is possible. Ordinarily, one would not expect these functions to alter the weight of the particle which they move but this is not enforced by the library.
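The return-value convention can be illustrated with a random-walk Metropolis step. This is a sketch under assumptions, not code from the library: ToyParticle and mcmcMove are hypothetical names, std::mt19937 stands in for smc::rng, and the invariant distribution is taken to be a standard normal purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical particle type; not smc::particle.
struct ToyParticle
{
    double value;
    double logWeight;
};

// Log density of the assumed N(0,1) invariant distribution, up to a constant.
double logTarget(double x)
{
    return -0.5 * x * x;
}

// Random-walk Metropolis move. Returns 1 if the proposed value is accepted
// and 0 if it is rejected; the weight is deliberately left untouched.
int mcmcMove(long, ToyParticle& p, std::mt19937& rng)
{
    std::normal_distribution<double> step(0.0, 1.0);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    const double proposal = p.value + step(rng);
    const double logRatio = logTarget(proposal) - logTarget(p.value);

    if (std::log(unif(rng)) < logRatio) {
        p.value = proposal;
        return 1;
    }
    return 0;
}
```

Counting the returned values over many calls gives the acceptance rate, which is the kind of monitoring that the text says the return value allows.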

Creating an smc::moveset object

Having specified all of the individual functions, it is necessary to package them all into a moveset object and then tell the sampler to use that moveset. The simplest way to fill a moveset is to use an appropriate constructor. There is a three-argument form which is appropriate for samplers with a single proposal function and a five-argument variant for samplers with a mixture of proposals.

Single-proposal Movesets
If there is a single proposal, then the moveset must contain three things: a pointer to the initialisation function, a pointer to the proposal function and, optionally, a pointer to an MCMC move function (if one is not required, then the argument specifying this function should be set to NULL). A constructor which takes three arguments exists and has prototype
moveset(particle<T> (*pfInit)(rng *),
        void (*pfNewMoves)(long, particle<T> &, rng *),
        int (*pfNewMCMC)(long, particle<T> &, rng *))

indicating that the first argument should correspond to a pointer to the initialisation function, the second to the move function and the third to any MCMC function which is to be used (or NULL). Section 5.1 shows this approach in action.

Mixture-proposal Movesets
There is also a constructor which initialises a moveset for use in the mixture formulation. Its prototype takes the form:
moveset(particle<T> (*pfInit)(rng *),
        long (*pfMoveSelector)(long, const particle<T> &, rng *),
        long nMoves,
        void (**pfNewMoves)(long, particle<T> &, rng *),
        int (*pfNewMCMC)(long, particle<T> &, rng *))

Here, the first and last arguments coincide with those of the single-proposal-moveset constructor described above; the second argument is the function which selects a move, the third is the number of different moves which exist and the fourth argument is an array of nMoves pointers to move functions. This is used in section 5.2.

Using a Moveset
Having created a moveset by either of these methods, all that remains is to call the SetMoveSet member of the sampler, specifying the newly created moveset as the sole argument. This tells the sampler object that this moveset contains all information about initialisation, proposals, weighting and MCMC moves and that calling the appropriate members of this object will perform the low-level application-specific functions which it requires the user to specify.

4 It is advisable to return a positive value in the case of moves which do not involve a rejection step.

4.4. Error Handling: smc::exception

If an error that is too serious to be indicated via the return value of a function within the library occurs then an exception is thrown. Exceptions indicating an error within SMCTC are of type smc::exception. This class contains four pieces of information:
char * szFile;
long lLine;
long lCode;
char * szMessage;

szFile is a NULL-terminated string specifying the source file in which the exception occurred; lLine indicates the line of that file at which the exception was generated. lCode provides a numerical indication of the type of error (this should correspond to one of the SMCX_* constants defined in smc-exception.hh) and, finally, szMessage provides a human-readable description of the problem which occurred. For convenience, the << operator has been overloaded so that os << e will send a human-readable description of the smc::exception, e, to an ostream, os; see section 5.1 for an example.
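The pattern of an exception object carrying these four fields together with an overloaded << operator can be sketched in isolation. The class below is a hypothetical stand-in and not smc::exception: ToyException, requirePositiveParticleCount, describeFailure, the error code 1 and the output format are all assumptions made for the example.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical exception type with the same four pieces of information.
class ToyException
{
public:
    ToyException(const char* file, long line, long code, const char* message)
        : szFile(file), lLine(line), lCode(code), szMessage(message) {}

    const char* szFile;     // source file in which the error was raised
    long lLine;             // line of that file
    long lCode;             // numerical error code
    const char* szMessage;  // human-readable description
};

// Overloaded << so that "os << e" prints a readable summary (format assumed).
std::ostream& operator<<(std::ostream& os, const ToyException& e)
{
    os << "Error " << e.lCode << " at " << e.szFile << ":" << e.lLine
       << ": " << e.szMessage;
    return os;
}

// Example of raising the exception where the error is detected.
void requirePositiveParticleCount(long n)
{
    if (n <= 0) {
        throw ToyException(__FILE__, __LINE__, 1,
                           "particle count must be positive");
    }
}

// Example of catching it and formatting the description via <<.
std::string describeFailure(long n)
{
    try {
        requirePositiveParticleCount(n);
        return "ok";
    } catch (const ToyException& e) {
        std::ostringstream os;
        os << e;
        return os.str();
    }
}
```

Catching by const reference, as here, avoids copying the exception object; a program could equally send the caught exception to std::cerr and exit with the stored error code.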

4.5. Random Number Generation

In principle, no user involvement is required to configure the random number generation used by the SMCTC library. However, if no information is provided then the sampler will use the same pseudorandom number sequence for every execution: it simply uses the GSL default generator type and seed. In fact, complete programmatic control of the random number generator can be arranged via the random number generator classes (it is possible to supply one to the smc::sampler class when it is created rather than allowing it to generate a default). However, this is not an essential feature of the library; the details can be found in the class reference and are not reproduced here. For day-to-day use, it is probably sufficient for most users to take advantage of the fact that the GSL checks two environment variables before using its default generator (this is true when the GSL is invoked by SMCTC in its default mode). Consequently, it is possible to control the behaviour of the random number sequence provided to any program which uses the SMCTC library by setting these environment variables before launching the program. Specifically, GSL_RNG_SEED specifies the random number seed and GSL_RNG_TYPE specifies which generator should be used; see Galassi et al. (2006) for details. By way of an example, the pf binary produced by compiling the example in section 5.1 can be executed using the ranlux generator with a seed of 36532673278 by entering the following at a BASH (Bourne Again Shell) prompt from the appropriate directory:
GSL_RNG_TYPE=ranlux GSL_RNG_SEED=36532673278 ./pf

5. Example Applications
This section provides two sample implementations: a simple particle filter in section 5.1 and a more involved SMC sampler which estimates rare event probabilities in section 5.2. This section shows how one can go about implementing SMC algorithms using the SMCTC library and (hopefully) emphasizes that no great technical requirements are imposed by the use of a compiled language such as C++ in the development of software of this nature. It is also possible to use these programs as a basis for the development of new algorithms.

5.1. A Simple Particle Filter

Model

It is useful to look at a basic particle filter in order to see how the description above relates to a real implementation. The following simple state space model, known as the almost constant velocity model in the tracking literature, provides a simple scenario. The state vector $X_n$ contains the position and velocity of an object moving in a plane: $X_n = (s^x_n, u^x_n, s^y_n, u^y_n)$. Imperfect observation of the position, but not velocity, is possible at each time instance. The state and observation equations are linear with additive noise:
\[ X_n = A X_{n-1} + V_n, \qquad Y_n = B X_n + W_n, \]
where
\[ A = \begin{pmatrix} 1 & \Delta & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \Delta = 0.1, \]
and we assume that the elements of the noise vector $V_n$ are independent normal with variances 0.02 and 0.001 for position and velocity components, respectively. The observation noise, $W_n$, comprises independent, identically distributed t-distributed random variables with $\nu = 10$ degrees of freedom. The prior at time 0 corresponds to an axis-aligned Gaussian with variance 4 for the position coordinates and 1 for the velocity coordinates.

Implementation
For simplicity, we define a simple bootstrap filter (Gordon, Salmond, and Smith 1993) which samples from the system dynamics (i.e. the conditional prior of a given state variable given the state at the previous time but no knowledge of any subsequent observations) and weights according to the likelihood. The pffuncs.hh header performs some basic housekeeping, with function prototypes and global-variable declarations. The only significant content is the definition of the classes used to describe the states and observations:

// State: position and velocity in the plane
class cv_state
{
public:
    double x_pos, y_pos;
    double x_vel, y_vel;
};

// Observation: position only
class cv_obs
{
public:
    double x_pos, y_pos;
};

In this case, nothing sophisticated is done with these classes: a shallow copy suffices to duplicate the contents and the default copy constructor, assignment operator and destructor are sufficient. Indeed, it would be straightforward to implement the present program using no more than an array of doubles. The purpose of this (apparently more complicated) approach is twofold: it is preferable to use a class which corresponds to precisely the objects of interest and it illustrates just how straightforward it is to employ user-defined types within the template classes. The main function of this particle filter, defined in pfexample.cc, looks like this:

18  int main(int argc, char ** argv)
19  {
20    long lNumber = 1000;
21    long lIterates;
22
23    try {
24      // Load observations
25      lIterates = load_data("data.csv", &y);
26
27      // Initialise and run the sampler
28      smc::sampler<cv_state> Sampler(lNumber, SMC_HISTORY_NONE);
29      smc::moveset<cv_state> Moveset(fInitialise, fMove, NULL);
30
31      Sampler.SetResampleParams(SMC_RESAMPLE_RESIDUAL, 0.5);
32      Sampler.SetMoveSet(Moveset);
33      Sampler.Initialise();
34
35      for (int n = 1; n < lIterates; ++n) {
36        Sampler.Iterate();
37
38        double xm, xv, ym, yv;
39        xm = Sampler.Integrate(integrand_mean_x, NULL);
40        xv = Sampler.Integrate(integrand_var_x, (void *) &xm);
41        ym = Sampler.Integrate(integrand_mean_y, NULL);
42        yv = Sampler.Integrate(integrand_var_y, (void *) &ym);
43
44        cout << xm << "," << ym << "," << xv << "," << yv << endl;
45      }
46    }
47
48    catch (smc::exception e)
49    {
50      cerr << e;
51      exit(e.lCode);
52    }
53  }

This should be fairly self-explanatory, but some comments are justified. Line 25 serves to load some observations from disk (the load_data function is included in the source file but is not detailed here as it could be replaced by any method for sourcing data; indeed, in real filtering applications one would anticipate this data arriving in real time from a signal source). This function assumes that a file called data.csv exists in the present directory; the first line of this file identifies the number of observation pairs present and the remainder of the file contains these observations in a comma-separated form. A suitable data file is present in the downloadable source archive.

Lines 23–46 are enclosed in a try block so that any exceptions thrown by the sampler can be caught, allowing the program to exit gracefully and display a suitable error message should this happen. Lines 48–52 perform this elementary error handling.

Line 28 creates an smc::sampler which employs lNumber = 1000 particles and which does not store the history of the system. It is followed by line 29 which creates an smc::moveset comprising an initialisation function fInitialise and a proposal function fMove; the final argument is NULL as no additional MCMC moves are used. These functions are described below. Once the basic objects have been created, lines 31–33 initialise the sampler by: (i) specifying that we wish to perform residual resampling when the ESS drops below 50% of the number of particles; (ii) supplying the moveset information to the sampler; and (iii) telling the sampler to initialise itself and all of the particles (the fInitialise function is called for each particle at this stage).

Lines 35–45 then iterate through the observations, propagating the particle set from one filtering distribution to the next and outputting the mean and variance of $s^x_n$ and $s^y_n$ for each n. Line 36 causes the sampler to predict and update the particle set, resampling if the conditions specified in line 31 are met.
The remaining lines calculate the mean and variance of the x and y coordinates using the simple integrator built in to the sampler. Consider the x components (the y components are dealt with in precisely the same way with essentially identical code). The mean is calculated by line 39 which asks the sampler to obtain the weighted average of function integrand_mean_x over the particle set. This function, as we require the mean of $s^x_n$, simply returns the value of $s^x_n$ for the specified particle:
double integrand_mean_x(cv_state s, void *)
{
    return s.x_pos;
}

Line 40 then calculates the variance. Whilst it would be straightforward to estimate the mean of $(s^x_n)^2$ using the same method as for $s^x_n$ and to calculate the variance from this, an alternative is to supply the mean calculated in line 39 to a function which returns the squared difference between $s^x_n$ and an estimate of its mean. We take this approach to illustrate the use of the final argument of the integrand function. The function, in this case, is:
double integrand_var_x(cv_state s, void * vmx)
{
    // vmx points to the previously estimated mean of the x position
    double * dmx = (double *) vmx;
    double d = (s.x_pos - (*dmx));
    return d * d;
}

as the final argument of the Sampler.Integrate() call is set to a pointer to the mean of $s^x_n$ estimated in line 39, this is what is supplied as the final argument of the integrand function when it is called for each individual particle. Finally, the functions used by the particle filter are contained in pffuncs.cc. The first of these is the initialisation function:

31  smc::particle<cv_state> fInitialise(smc::rng * pRng)
32  {
33    cv_state value;
34
35    value.x_pos = pRng->Normal(0, sqrt(var_s0));
36    value.y_pos = pRng->Normal(0, sqrt(var_s0));
37    value.x_vel = pRng->Normal(0, sqrt(var_u0));
38    value.y_vel = pRng->Normal(0, sqrt(var_u0));
39
40    return smc::particle<cv_state>(value, logLikelihood(0, value));
41  }

Line 33 declares an object of the same type as the value of the particle. Lines 35–38 use the SMCTC random number class to initialise the position and velocity components of the state to samples from appropriate independent Gaussian distributions. As the particle has been drawn from a distribution corresponding to the prior distribution at time 0, it should be weighted by the likelihood. Line 40 returns an smc::particle object with the value of value and a weight obtained by calling the logLikelihood function with the first argument set to the current iteration number and the second to value. The logLikelihood function

double logLikelihood(long lTime, const cv_state & X)
{
    return -0.5 * (nu_y + 1.0) *
           (log(1 + pow((X.x_pos - y[lTime].x_pos) / scale_y, 2) / nu_y) +
            log(1 + pow((X.y_pos - y[lTime].y_pos) / scale_y, 2) / nu_y));
}

is slightly misnamed. It does not return the log likelihood, but the log of the likelihood up to a normalising constant. Its operation is not significantly more complicated than it would be if the same function were implemented in a dedicated statistical language. Finally, the move function takes the form:

49  void fMove(long lTime, smc::particle<cv_state> & pFrom, smc::rng * pRng)
50  {
51    cv_state * cv_to = pFrom.GetValuePointer();
52
53    cv_to->x_pos += cv_to->x_vel * Delta + pRng->Normal(0, sqrt(var_s));
54    cv_to->x_vel += pRng->Normal(0, sqrt(var_u));
55    cv_to->y_pos += cv_to->y_vel * Delta + pRng->Normal(0, sqrt(var_s));
56    cv_to->y_vel += pRng->Normal(0, sqrt(var_u));
57
58    pFrom.AddToLogWeight(logLikelihood(lTime, *cv_to));
59  }

Again, this is reasonably straightforward. Line 51 obtains a pointer to the value associated with the particle so that it can be modified in place. Lines 53–56 add appropriate Gaussian random variables to each element of the state, according to the system dynamics, using the SMCTC random number class. Finally, line 58 adds the value of the log likelihood to the logarithmic weight (this, of course, is equivalent to multiplying the weight by the likelihood). Between them, these functions comprise a complete working particle filter. In total, 156 lines of code and 28 lines of header file are involved in this example program, including significant comments and white-space. Figure 2 shows the output of this algorithm running on simulated data, together with the simulated data itself and the observations. Only the position coordinates are illustrated. For reference, using a 1.7 GHz Pentium-M, this simulation takes 0.35 s to run for 100 iterations using 1000 particles.

5.2. Gaussian Tail Probabilities

The following example is an implementation of an algorithm, described in Johansen, Del Moral, and Doucet (2006, section 2.3.1), for the estimation of rare event probabilities. A detailed discussion is outside the scope of this paper.

Model and Distribution Sequence

In general, estimating the probability of rare events (by definition, those with very small probability) is a difficult problem. In this section we consider one particular class of rare events. We are given a (possibly inhomogeneous) Markov chain, $(X_n)_{n \in \mathbb{N}}$, which takes its values in a sequence of measurable spaces $(E_n)_{n \in \mathbb{N}}$ with initial distribution $\eta_0$ and elementary transitions given by the set of Markov kernels $(M_n)_{n \geq 1}$. The law $\mathbb{P}$ of the Markov chain is defined by its finite dimensional distributions:

$$\mathbb{P} \circ X_{0:N}^{-1}(dx_{0:N}) = \eta_0(dx_0) \prod_{i=1}^{N} M_i(x_{i-1}, dx_i). \qquad (3)$$

For this Markov chain, we wish to estimate the probability of the path of the chain lying in some rare set, $\mathcal{R}$, over some deterministic interval $0:P$. We also wish to estimate the distribution of the Markov chain conditioned upon the chain lying in that set, i.e., to obtain a set of samples from the distribution:

$$\mathbb{P}_{\eta_0} \circ X_{0:P}^{-1}(\cdot \mid X_{0:P} \in \mathcal{R}). \qquad (4)$$

In general, even if it is possible to sample directly from $\eta_0(\cdot)$ and from $M_n(x_{n-1}, \cdot)$ for all $n$ and almost all $x_{n-1}$, it is difficult to estimate either the probability or the conditional distribution.
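To illustrate why direct simulation is inadequate, the following minimal sketch estimates a tail probability by crude Monte Carlo. The model (a random walk with independent standard Gaussian increments), the function name crudeTailEstimate and the use of the C++ standard library generator rather than SMCTC's smc::rng are all assumptions made purely for this illustration. With n samples, such an estimator cannot resolve probabilities much smaller than 1/n.

```cpp
#include <random>

// Crude Monte Carlo estimate of P(X >= threshold), where X is the sum of
// `steps` independent standard Gaussian increments (an assumed toy model).
double crudeTailEstimate(int steps, double threshold, long nSamples, unsigned seed)
{
  std::mt19937 gen(seed);
  std::normal_distribution<double> stdNormal(0.0, 1.0);

  long hits = 0;
  for (long n = 0; n < nSamples; ++n) {
    double x = 0.0;
    for (int p = 0; p < steps; ++p) x += stdNormal(gen);  // simulate one path
    if (x >= threshold) ++hits;                           // did it reach the set?
  }
  return static_cast<double>(hits) / static_cast<double>(nSamples);
}
```

For a threshold far in the tail, essentially every simulated path misses the rare set, so the estimate is zero and carries no information about the true probability.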

Figure 2: Proof of concept: simulated data, observations and the posterior mean filtering estimates obtained by the particle filter. (Plot legend: Ground Truth, Filtering Estimate, Observations.)

The approach which we propose is to employ a sequence of intermediate distributions which move smoothly from $\mathbb{P} \circ X_{0:P}^{-1}$ to the target distribution $\mathbb{P} \circ X_{0:P}^{-1}(\cdot \mid X_{0:P} \in \mathcal{R})$ and to obtain samples from these distributions using SMC methods. By operating directly upon the path space, we obtain a number of advantages. It provides more flexibility in constructing the importance distribution than methods which consider only the time marginals, and allows us to take complex correlations into account. We can, of course, cast the probability of interest as the expectation of an indicator function over the rare set, and the conditional distribution of interest in a similar form as:

$$\mathbb{P}(X_{0:P} \in \mathcal{R}) = \mathbb{E}\left[\mathbb{I}_{\mathcal{R}}(X_{0:P})\right],$$

$$\mathbb{P}(dx_{0:P} \mid X_{0:P} \in \mathcal{R}) = \frac{\mathbb{P}(dx_{0:P} \cap \mathcal{R})}{\mathbb{E}\left[\mathbb{I}_{\mathcal{R}}(X_{0:P})\right]}.$$

We concern ourselves with those cases in which the rare set of interest can be characterised by some measurable function, $V : E_{0:P} \to \mathbb{R}$, which has the properties that:

$$V : \mathcal{R} \to [\hat{V}, \infty), \qquad V : E_{0:P} \setminus \mathcal{R} \to (-\infty, \hat{V}).$$

In this case, it makes sense to consider a sequence of distributions defined by a potential function which is proportional to their Radon-Nikodým derivative with respect to the law of the Markov chain, namely:

$$g_\theta(x_{0:P}) = \left(1 + \exp\left(-\alpha(\theta)\left(V(x_{0:P}) - \hat{V}\right)\right)\right)^{-1},$$

where $\alpha(\theta) : [0, 1] \to \mathbb{R}^+$ is a differentiable monotonically-increasing function such that $\alpha(0) = 0$ and $\alpha(1)$ is sufficiently large that this potential function approaches the indicator function on the rare set as we move through the sequence of distributions defined by this potential function at the parameter values $\theta \in \{t/T : t \in \{0, 1, \ldots, T\}\}$.

Let $\left\{\pi_t(dx_{0:P}) \propto \mathbb{P}(dx_{0:P})\, g_{t/T}(x_{0:P})\right\}_{t=0}^{T}$ be the sequence of distributions which we use. The SMC samplers framework allows us to obtain a set of samples from each of these distributions in turn via a sequential importance sampling and resampling strategy. Note that each of these distributions is over the first $P + 1$ elements of a Markov chain: they are defined upon a common space. In order to estimate the expectation which we seek, we make use of the identity:

$$\mathbb{E}\left[\mathbb{I}_{\mathcal{R}}(X_{0:P})\right] = \mathbb{E}_{\pi_T}\left[Z_1 \frac{\mathbb{I}_{\mathcal{R}}(X_{0:P})}{g_1(X_{0:P})}\right],$$

where $Z_\theta = \int g_\theta(x_{0:P})\, \mathbb{P}(dx_{0:P})$, and use the particle approximation of the right hand side of this expression. This is simply importance sampling: $Z_1 / g_1(\cdot)$ is simply the density of $\pi_0$ with respect to $\pi_T$ and we wish to estimate the expectation of this indicator function under $\pi_0$. Similarly, the subset of particles representing samples from $\pi_T$ which hit the rare set can be interpreted as (weighted) samples from the conditional distribution of interest.

We use the notation $(Y_t^i)_{i=1}^{N}$ to describe the particle set at time $t$ and $Y_t^{(i,j)}$ to describe the $j$th state in the Markov chain described by particle $i$ at time $t$. We further use $Y_t^{(i,-p)}$ to refer to every state in the Markov chain described by particle $i$ at time $t$ except the $p$th, and similarly, $Y_t^{(i,-p)} \cup Y' \triangleq \left(Y_t^{(i,0:p-1)}, Y', Y_t^{(i,p+1:P)}\right)$, i.e., it refers to the Markov chain described by the same particle, with the $p$th state of the Markov chain replaced by some quantity $Y'$.
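The potential function is a logistic function of the distance from the threshold, and for large schedule values a direct evaluation of its logarithm can overflow. The following is a minimal sketch of one numerically stable way to evaluate the log potential; the function name logPotential and its argument names are hypothetical and do not appear in SMCTC or in the example program.

```cpp
#include <cmath>

// Hypothetical helper: log g(v) = -log(1 + exp(-alpha * (v - vHat))),
// where alpha is the schedule value, v = V(x) and vHat is the threshold.
double logPotential(double alpha, double v, double vHat)
{
  const double z = alpha * (v - vHat);
  // For z >= 0 the exponential is at most 1. For z < 0, rewrite
  // -log(1 + exp(-z)) as z - log(1 + exp(z)) so exp() never overflows.
  return (z >= 0.0) ? -std::log1p(std::exp(-z))
                    : z - std::log1p(std::exp(z));
}
```

At a schedule value of zero the potential is the constant 1/2 for every path, and as the schedule value grows the log potential tends to 0 inside the rare set and to minus infinity outside it.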

The Path Sampling Approximation

The estimation of the normalising constant associated with our potential function can be achieved by a Monte Carlo approximation of the path sampling formulation given by Gelman and Meng (1998). Given a parameter $\theta$ such that a potential function $g_\theta(x)$ allows a smooth transition from a reference distribution to a distribution of interest as the parameter increases from zero to one, one can estimate the logarithm of the ratio of their normalising constants via the integral relationship:

$$\log\left(\frac{Z_1}{Z_0}\right) = \int_0^1 \mathbb{E}_\theta\left[\frac{d \log g_\theta}{d\theta}\right] d\theta, \qquad (5)$$

where $\mathbb{E}_\theta$ denotes the expectation under $\pi_\theta$. In our case, we can describe our sequence of distributions in precisely this form via a discrete sequence of intermediate distributions parametrized by a sequence of values of $\theta$:

$$\frac{d \log g_\theta}{d\theta}(x) = \frac{V(x) - \hat{V}}{\exp\left(\alpha(\theta)(V(x) - \hat{V})\right) + 1} \frac{d\alpha}{d\theta},$$

$$\log\left(\frac{Z_{t/T}}{Z_0}\right) = \int_0^{t/T} \mathbb{E}_\theta\left[\frac{V(\cdot) - \hat{V}}{\exp\left(\alpha(\theta)(V(\cdot) - \hat{V})\right) + 1}\right] \frac{d\alpha}{d\theta}\, d\theta = \int_0^{\alpha(t/T)} \mathbb{E}_{\frac{\alpha}{\alpha(1)}}\left[\frac{V(\cdot) - \hat{V}}{\exp\left(\alpha(V(\cdot) - \hat{V})\right) + 1}\right] d\alpha,$$

where $\mathbb{E}_\theta$ is used to denote the expectation under the distribution associated with the potential function at the specified value of its parameter.

The SMC sampler provides us with a set of weighted particles obtained from a sequence of distributions suitable for approximating the integrals in (5). At each $t$ we can obtain an estimate of the expectation within the integral via the usual importance sampling estimator; and the integral over $\theta$ (which is one dimensional and over a bounded interval) can then be approximated via a trapezoidal integration. As we know that $Z_0 = 0.5$, we are then able to estimate the normalising constant of the final distribution and then use an importance sampling estimator to obtain the probability of hitting the rare set.

A Gaussian Case
It is useful to consider a simple example for which it is possible to obtain analytic results for the rare event probability. The tails of a Gaussian distribution serve well in this context, and we borrow the example of Del Moral and Garnier (2005). We consider a homogeneous Markov chain defined on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for which the initial distribution is a standard Gaussian distribution and each kernel is a standard Gaussian distribution centred on the previous position:

$$\eta_0(dx) = \mathcal{N}(dx; 0, 1), \qquad \forall n > 0 : M_n(x, dy) = \mathcal{N}(dy; x, 1).$$

The function $V(x_{0:P}) := x_P$ corresponds to a canonical coordinate operator, and the probability of the rare set $\mathcal{R} := E^P \times [\hat{V}, \infty)$ is simply a Gaussian tail probability: the marginal distribution of $X_P$ is simply $\mathcal{N}(0, P + 1)$ as $X_P$ is the sum of $P + 1$ iid standard Gaussian random variables.

Sampling from $\pi_0$ is trivial. We employ an importance kernel which moves position $i$ of the chain by $ij\delta$, where $j$ is sampled from a discrete distribution. This distribution over $j$ is obtained by considering a finite collection of possible moves and evaluating the density of the target distribution after each possible move; $j$ is then sampled from a distribution proportional to this vector of probabilities. $\delta$ is an arbitrary scale parameter. The operator, $G_\epsilon$, defined by

$$G_\epsilon Y_n^i = \left(Y_n^{(i,p)} + p\epsilon\right)_{p=0}^{P},$$

where $\epsilon$ is interpreted as a parameter, is used for notational convenience. This forward kernel can be written as:

$$K_n(Y_{n-1}^i, Y_n^i) = \sum_{j=-S}^{S} \omega_n(Y_{n-1}^i, Y_n^i)\, \delta_{G_{\delta j} Y_{n-1}^i}(Y_n^i),$$

where the probability of each of the possible moves is given by

$$\omega_n(Y_{n-1}^i, Y_n^i) = \frac{\pi_n(Y_n^i)}{\sum_{j=-S}^{S} \pi_n(G_{\delta j} Y_{n-1}^i)}.$$

This leads to the following optimal auxiliary kernel:

$$L_{n-1}(Y_n^i, Y_{n-1}^i) = \frac{\pi_{n-1}(Y_{n-1}^i) \sum_{j=-S}^{S} \omega_n(Y_{n-1}^i, Y_n^i)\, \delta_{G_{\delta j} Y_{n-1}^i}(Y_n^i)}{\sum_{j=-S}^{S} \pi_{n-1}(G_{-\delta j} Y_n^i)\, \omega_n(G_{-\delta j} Y_n^i, Y_n^i)}.$$

The incremental importance weight is consequently:

$$w_n(Y_{n-1}^i, Y_n^i) = \frac{\pi_n(Y_n^i)}{\sum_{j=-S}^{S} \pi_{n-1}(G_{-\delta j} Y_n^i)\, \omega_n(G_{-\delta j} Y_n^i, Y_n^i)}.$$
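The structure of this grid kernel and its incremental weight can be sketched for a scalar state, which is a simplification: in the example each element of the chain is shifted, whereas here a single number is moved by a multiple of the grid spacing. The names gridPropose, gridLogIncrementalWeight and LogDensity are hypothetical, and the standard library generator stands in for SMCTC's random number class.

```cpp
#include <cmath>
#include <functional>
#include <random>
#include <vector>

using LogDensity = std::function<double(double)>;

// Propose y = x + j*delta for j in {-S, ..., S}, choosing j with probability
// proportional to the (unnormalised) target density at the proposed point.
double gridPropose(double x, double delta, int S,
                   const LogDensity& logPiN, std::mt19937& gen)
{
  std::vector<double> w;
  for (int j = -S; j <= S; ++j) w.push_back(std::exp(logPiN(x + j * delta)));
  std::discrete_distribution<int> pick(w.begin(), w.end());
  return x + (pick(gen) - S) * delta;
}

// Log incremental weight under the optimal auxiliary kernel:
// log pi_n(y) - log sum_j pi_{n-1}(y - j*delta) * omega_n(y - j*delta, y).
double gridLogIncrementalWeight(double y, double delta, int S,
                                const LogDensity& logPiPrev,
                                const LogDensity& logPiN)
{
  double denom = 0.0;
  for (int j = -S; j <= S; ++j) {
    const double xOld = y - j * delta;   // a state from which y is reachable
    double norm = 0.0;                   // normaliser of omega_n(xOld, .)
    for (int k = -S; k <= S; ++k) norm += std::exp(logPiN(xOld + k * delta));
    denom += std::exp(logPiPrev(xOld)) * std::exp(logPiN(y)) / norm;
  }
  return logPiN(y) - std::log(denom);
}
```

The nested loop makes the cost visible: each weight evaluation requires on the order of (2S + 1) squared evaluations of the target density.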

As the calculation of the integrals involved in the incremental weight expression tends to be analytically intractable in general, we have made use of a discrete grid of proposal distributions as proposed by Peters (2005). This naturally impedes the exploration of the sample space. Consequently, we make use of a Metropolis-Hastings kernel of the correct invariant distribution at each time step (whether resampling has occurred, in which case this also helps to prevent sample impoverishment, or not). We make use of a linear schedule $\alpha(\theta) = k\theta$ and show the results of our approach (using a chain of length 15, a grid spacing of $\delta = 0.025$ and $S = 12$ in the sampling kernel) in Table 2.

It should be noted that constructing a proposal in this way has a number of positive points: it allows the use of a (discrete approximation to) essentially any proposal distribution, and it is possible to use the optimal auxiliary kernel with it. However, there are also some drawbacks and we would not recommend the use of this strategy without careful consideration. In particular, the use of a discrete grid limits the movement which is possible considerably and it will generally be necessary to make use of accompanying MCMC moves to maintain sample diversity. More seriously, these grid-type proposals are typically extremely expensive to use as they require numerous evaluations of the target distribution for each proposal. Although it provides a more-or-less automatic mechanism for constructing a proposal and using its optimal auxiliary kernel, the cost of each sample obtained in this way can be sufficiently high that using a simpler proposal kernel with an approximation to its optimal auxiliary kernel could yield rather better performance at a given computational cost.

    Threshold, $\hat{V}$   True log probability   SMC Mean   SMC Variance   k      T
    5                      -2.32                  -2.30      0.016          2      333
    10                     -5.32                  -5.30      0.028          4      667
    15                     -9.83                  -9.81      0.026          6      1000
    20                     -15.93                 -15.94     0.113          10     2000
    25                     -23.64                 -23.83     0.059          12.5   2500
    30                     -33.00                 -33.08     0.106          14     3500
    $9\sqrt{15}$           -43.63                 -43.61     0.133          12     3600
    $10\sqrt{15}$          -53.23                 -53.20     0.142          11.5   4000

Table 2: Means and variances of the estimates produced by 10 runs of the proposed algorithm using 100 particles at each threshold value for the Gaussian random walk example.
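Because the terminal state of the chain is Gaussian, the reference values in the table can be checked directly with the complementary error function. The sketch below assumes a terminal variance of 15 (the stated chain length) and natural logarithms, an assumption that reproduces the tabulated reference values to the precision shown; the function name logGaussianTail is hypothetical.

```cpp
#include <cmath>

// log P(X >= threshold) for X ~ N(0, variance), using
// P(X >= v) = 0.5 * erfc(v / sqrt(2 * variance)).
double logGaussianTail(double threshold, double variance)
{
  return std::log(0.5 * std::erfc(threshold / std::sqrt(2.0 * variance)));
}
```

For example, logGaussianTail(5.0, 15.0) is approximately -2.32 and logGaussianTail(10.0 * std::sqrt(15.0), 15.0) is approximately -53.23, matching the first and last rows of the reference column.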

Implementation
The simfunctions.hh file in this case contains the usual overhead of function prototypes and global variable definitions. It also includes a file markovchain.h which provides a template class for objects corresponding to evolutions of a Markov chain. This is a container class which operates as a doubly-linked list. The use of this class is intended to illustrate the ease with which complex classes can be used to represent the state space. The details of this class are not documented here, as it is somewhat outside the scope of this article; it should be sufficiently straightforward to understand the features used here. The file main.cc contains the main function:
    14  int main(int argc, char** argv)
    15  {
    16    cout << "Number of Particles: ";
    17    long lNumber;
    18    cin >> lNumber;
    19    cout << "Number of Iterations: ";
    20    cin >> lIterates;
    21    cout << "Threshold: ";
    22    cin >> dThreshold;
    23    cout << "Schedule Constant: ";
    24    cin >> dSchedule;
    25
    26    try {
    27      ///An array of move function pointers
    28      void (*pfMoves[])(long, smc::particle<mChain<double> > &, smc::rng*) = {fMove1, fMove2};
    29      smc::moveset<mChain<double> > Moveset(fInitialise, fSelect, sizeof(pfMoves) / sizeof(pfMoves[0]), pfMoves, fMCMC);
    30      smc::sampler<mChain<double> > Sampler(lNumber, SMC_HISTORY_RAM);
    31
    32      Sampler.SetResampleParams(SMC_RESAMPLE_STRATIFIED, 0.5);
    33      Sampler.SetMoveSet(Moveset);
    34
    35      Sampler.Initialise();
    36      Sampler.IterateUntil(lIterates);
    37
    38      ///Estimate the normalising constant of the terminal distribution
    39      double zEstimate = Sampler.IntegratePathSampling(pIntegrandPS, pWidthPS, NULL) - log(2.0);
    40      ///Estimate the weighting factor for the terminal distribution
    41      double wEstimate = Sampler.Integrate(pIntegrandFS, NULL);
    42
    43      cout << zEstimate << " " << log(wEstimate) << " " << zEstimate + log(wEstimate) << endl;
    44    }
    45    catch(smc::exception e)
    46    {
    47      cerr << e;
    48      exit(e.lCode);
    49    }
    50
    51    return 0;
    52  }

Lines 16–24 allow the runtime specification of certain parameters of the model and of the employed sequence of distributions. The remainder of the function takes essentially the same form as that described in the particle filter example; in particular, the same error handling mechanism is used. The core of the program is contained in lines 28–43. In this case, the more complicated form of moveset is used: we consider an implementation which makes use of a mixture of the grid based moves described above and a simple update move (which does not alter the state of the particle, but reweights it to account for the change in distribution; the inclusion of such moves can improve performance at a given computational cost in some circumstances).

Following the approach detailed in section 4.3, the program first populates an array of pointers to move functions with references to fMove1 and fMove2 in line 28. Line 29 causes the generation of a moveset with initialisation function fInitialise, the function fSelect used to select which move to apply, the number of moves (two, computed automatically using the ratio of sizeof operators to eliminate errors), the moves themselves, which are specified in the aforementioned array, and, finally, an MCMC move available in function fMCMC. Line 30 creates a sampler, stipulating that the full history of the particle system should be retained in memory (we wish to use the entire history to calculate some integrals used in path sampling at the end; whilst this could be done with a calculation after each iteration, it is simpler to retain all of the information and to calculate the integral at the end). Line 32 tells the sampler that we wish to use stratified resampling whenever the ESS drops below half the number of particles and line 33 specifies the moveset.

Lines 35–36 initialise the sampler and then iterate until the desired number of iterations have elapsed. These two lines are in some sense responsible for the SMC algorithm running. Output is then generated in the remainder of the function. Line 39 calculates the normalising constant of the final distribution using path sampling as described above. It does this by calling the SMCTC function which performs this calculation automatically, using widths supplied by a function pWidthPS (defined in simfunctions.cc):
    double pWidthPS(long lTime, void* pVoid)
    {
      if(lTime > 1 && lTime < lIterates)
        return ((0.5) * double(ALPHA(lTime + 1.0) - ALPHA(lTime - 1.0)));
      else
        return ((0.5) * double(ALPHA(lTime + 1.0) - ALPHA(lTime)) + (ALPHA(1) - 0.0));
    }

and integrand pIntegrandPS:

    double pIntegrandPS(long lTime, smc::particle<mChain<double> > pPos, void* pVoid)
    {
      double dPos = pPos.GetValue().GetTerminal()->value;
      return (dPos - THRESHOLD) / (1.0 + exp(ALPHA(lTime) * (dPos - THRESHOLD)));
    }

This operates in the same manner as the simple integrator, an example of which was given in section 5.1. For details about the origin of the terms being integrated etc. see Johansen et al. (2006). Line 41 then calculates a correction term arising from the fact that the final distribution is not the indicator function on the rare set (this is essentially an importance sampling correction) using the simple integrator and the following integrand function:
    double pIntegrandFS(mChain<double> dPos, void* pVoid)
    {
      if(dPos.GetTerminal()->value > THRESHOLD) {
        return (1.0 + exp(-FTIME * (dPos.GetTerminal()->value - THRESHOLD)));
      }
      else
        return 0;
    }

The results of these two calculations, and an estimate of the natural logarithm of the rare event probability, are then produced in line 43. Finally, the detailed functions are provided in simfunctions.cc. The following function is used throughout to calculate probabilities:
    double logDensity(long lTime, const mChain<double> & X)
    {
      double lp;

      mElement<double> *x = X.GetElement(0);
      mElement<double> *y = x->pNext;
      // Begin with the density excluding the effect of the potential
      lp = log(gsl_ran_ugaussian_pdf(x->value));

      while(y) {
        lp += log(gsl_ran_ugaussian_pdf(y->value - x->value));
        x = y;
        y = x->pNext;
      }

      // Now include the effect of the multiplicative potential function
      lp -= log(1.0 + exp(-(ALPHA(lTime) * (x->value - THRESHOLD))));
      return lp;
    }

It should be fairly self-explanatory, but this function calculates the unnormalised log density under the target distribution at time lTime of a point in the state space, X (which is, of course, a Markov chain). It does this by calculating its log probability under the law of the Markov chain and then correcting for the potential. Initialisation is straightforward in this case as the first distribution is simply the law of the Markov chain with independent standard Gaussian increments. This function simulates from this distribution and sets the (logarithmic) weight equal to 0 (this is the preferred constant value for numerical reasons and should be used whenever all particles should have the same weight):
    smc::particle<mChain<double> > fInitialise(smc::rng *pRng)
    {
      // Create a Markov chain with the appropriate initialisation and then assign that to the particles.
      mChain<double> Mc;

      double x = 0;
      for(int i = 0; i < PATHLENGTH; i++) {
        x += pRng->NormalS();
        Mc.AppendElement(x);
      }

      return smc::particle<mChain<double> >(Mc, 0);
    }

As described above, and in the original paper, the grid-based move is used for every particle at every iteration. Consequently, the selection function simply returns zero, indicating that the first move should be used, regardless of its arguments:
    long fSelect(long lTime, const smc::particle<mChain<double> > & p, smc::rng *pRng)
    {
      return 0;
    }

It would be straightforward to modify this function to return 1 with some probability (using the random number generator to determine which action to use). This would lead to a sampler which makes use of a mixture of these grid-based moves (fMove1) and the update move provided by fMove2. The main proposal function is:


    59   void fMove1(long lTime, smc::particle<mChain<double> > & pFrom, smc::rng *pRng)
    60   {
    61     // The distance between points in the random grid.
    62     static double delta = 0.1;
    63     static double gridweight[2*GRIDSIZE+1], gridws = 0;
    64     static mChain<double> NewPos[2*GRIDSIZE+1];
    65     static mChain<double> OldPos[2*GRIDSIZE+1];
    66
    67     // First select a new position from a grid centred on the old position, weighting the possible choices by the
    68     // posterior probability of the resulting states.
    69     gridws = 0;
    70     for(int i = 0; i < 2*GRIDSIZE+1; i++) {
    71       NewPos[i] = pFrom.GetValue() + ((double)(i - GRIDSIZE)) * delta;
    72       gridweight[i] = exp(logDensity(lTime, NewPos[i]));
    73       gridws = gridws + gridweight[i];
    74     }
    75
    76     double dRUnif = pRng->Uniform(0, gridws);
    77     long j = -1;
    78
    79     while(dRUnif > 0 && j < 2*GRIDSIZE) {  // bound keeps j a valid index if rounding leaves dRUnif > 0
    80       j++;
    81       dRUnif -= gridweight[j];
    82     }
    83
    84     pFrom.SetValue(NewPos[j]);
    85
    86     // Now calculate the weight change which the particle suffers as a result
    87     double logInc = log(gridweight[j]), Inc = 0;
    88
    89     for(int i = 0; i < 2*GRIDSIZE+1; i++) {
    90       OldPos[i] = pFrom.GetValue() - ((double)(i - GRIDSIZE)) * delta;
    91       gridws = 0;
    92       for(int k = 0; k < 2*GRIDSIZE+1; k++) {
    93         NewPos[k] = OldPos[i] + ((double)(k - GRIDSIZE)) * delta;
    94         gridweight[k] = exp(logDensity(lTime, NewPos[k]));
    95         gridws += gridweight[k];
    96       }
    97       Inc += exp(logDensity(lTime-1, OldPos[i])) * exp(logDensity(lTime, pFrom.GetValue())) / gridws;
    98     }
    99     logInc -= log(Inc);
    100
    101    pFrom.SetLogWeight(pFrom.GetLogWeight() + logInc);
    102
    103    for(int i = 0; i < 2*GRIDSIZE+1; i++)
    104    {
    105      NewPos[i].Empty();
    106      OldPos[i].Empty();
    107    }
    108
    109    return;
    110  }

This is a reasonably large amount of code for an importance distribution, but that is largely due to the complex nature of this particular proposal. Even in this setting, the code is straightforward, consisting of four basic operations: lines 61–74 produce a representation of the state after each possible move and calculate the proposal probability of each one, lines 76–84 sample from the resulting distribution, lines 87–101 calculate the weight of the particle, and the remainder of the function deletes the states produced by the first section. In contrast, the update move takes the rather trivial form:

    void fMove2(long lTime, smc::particle<mChain<double> > & pFrom, smc::rng *pRng)
    {
      pFrom.SetLogWeight(pFrom.GetLogWeight() + logDensity(lTime, pFrom.GetValue()) - logDensity(lTime-1, pFrom.GetValue()));
    }

It simply updates the weight of the particle to take account of the fact that it should now target the distribution at time lTime rather than lTime-1. The following function provides an additional MCMC move:
    118  int fMCMC(long lTime, smc::particle<mChain<double> > & pFrom, smc::rng *pRng)
    119  {
    120    static smc::particle<mChain<double> > pTo;
    121
    122    mChain<double> * pMC = new mChain<double>;
    123
    124    for(int i = 0; i < pFrom.GetValue().GetLength(); i++)
    125      pMC->AppendElement(pFrom.GetValue().GetElement(i)->value + pRng->Normal(0, 0.5));
    126    pTo.SetValue(*pMC);
    127    pTo.SetLogWeight(pFrom.GetLogWeight());
    128
    129    delete pMC;
    130
    131    double alpha = exp(logDensity(lTime, pTo.GetValue()) - logDensity(lTime, pFrom.GetValue()));
    132    if(alpha < 1)
    133      if(pRng->UniformS() > alpha) {
    134        return false;
    135      }
    136
    137    pFrom = pTo;
    138    return true;
    139  }

This is a simple Metropolis-Hastings (Metropolis, Rosenbluth, Rosenbluth, and Teller 1953; Hastings 1970) move. Lines 120–130 produce a new state by adding a Gaussian random variable of variance 1/4 to each element of the state. Lines 131–138 then determine whether to reject the move, in which case the function returns false, or to accept it, in which case the value of the existing state is set to the proposal and the function returns true. This should serve as a prototype for the inclusion of Metropolis-Hastings moves within SMCTC programs.

Again this comprises a full implementation of the SMC algorithm; in this instance one which uses 229 lines of code and 32 lines of header. Although some of the individual functions are relatively complex in this case, that is a simple function of the model and proposal structure. Fundamentally, this program has the same structure as the simple particle filter introduced in the previous section.
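The accept/reject step of such a move can also be carried out entirely on the log scale, which avoids exponentiating a large difference of log densities. The following is a minimal sketch for a symmetric proposal (such as the Gaussian random walk used above, for which the proposal densities cancel); the function name metropolisAccept is hypothetical and the standard library generator is used in place of smc::rng.

```cpp
#include <cmath>
#include <random>

// Metropolis accept/reject decision for a symmetric proposal, made on the
// log scale: accept with probability min(1, exp(logTargetNew - logTargetOld)).
bool metropolisAccept(double logTargetOld, double logTargetNew, std::mt19937& gen)
{
  const double logAlpha = logTargetNew - logTargetOld;
  if (logAlpha >= 0.0) return true;  // moves to higher density are always accepted

  // Otherwise accept when log(U) < logAlpha for U uniform on [0, 1).
  // A proposal with zero target density (logAlpha = -infinity) is never accepted.
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  return std::log(unif(gen)) < logAlpha;
}
```

A caller would keep the current state when the function returns false and replace it with the proposed state otherwise, mirroring the return values of fMCMC.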


6. Discussion
This article has introduced a C++ template class intended to ease the development of efficient SMC algorithms in C++. Whilst some work and technical proficiency is required on the part of the user to implement particular algorithms using this approach (and it is clear that demand also exists for a software platform for the execution of SMC algorithms by users without the necessary skills), it seems to us to offer the optimal compromise between speed, flexibility and effort. SMCTC currently represents a first step towards the provision of software for the implementation of SMC algorithms. Sequential Monte Carlo is a young and dynamic field and it is inevitable that other requirements will emerge and that some desirable features will prove to have been omitted from this library. The author envisages that SMCTC will continue to be developed for the foreseeable future and would welcome any feedback.

Acknowledgements
The author thanks Dr. Mark Briers and Dr. Edmund Jackson for their able and generous assistance in the testing of this software on a variety of platforms.

References
Cappé O, Guillin A, Marin JM, Robert CP (2004). Population Monte Carlo. Journal of Computational and Graphical Statistics, 13(4), 907–929.

Carpenter J, Clifford P, Fearnhead P (1999). An Improved Particle Filter for Non-linear Problems. IEEE Proceedings on Radar, Sonar and Navigation, 146(1), 2–7.

Chen L, Lee C, Budhiraja A, Mehra RK (2007). PFlib: An Object Oriented MATLAB Toolbox for Particle Filtering. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition XVI, volume 6567.

Chopin N (2002). A sequential particle filter method for static models. Biometrika, 89(3), 539–551.

Del Moral P, Doucet A, Jasra A (2006a). Sequential Monte Carlo Methods for Bayesian Computation. In Bayesian Statistics 8, Oxford University Press.

Del Moral P, Doucet A, Jasra A (2006b). Sequential Monte Carlo Samplers. Journal of the Royal Statistical Society B, 63(3), 411–436.

Del Moral P, Garnier J (2005). Genealogical Particle Analysis of Rare Events. Annals of Applied Probability, 15(4), 2496–2534.

Doucet A, de Freitas N, Gordon N (eds.) (2001). Sequential Monte Carlo Methods in Practice. Statistics for Engineering and Information Science. Springer Verlag, New York.

Doucet A, Johansen AM (2008). Particle Filtering and Smoothing: Fifteen years later. In review.

Fan Y, Leslie D, Wand MP (2007). Generalized linear mixed model analysis via sequential Monte Carlo sampling. Statistics Group Research Report 07:10, University of Bristol, Department of Mathematics.

Free Software Foundation (2007). GNU General Public License. URL http://www.gnu.org/licenses/gpl-3.0.html.

Galassi M, Davies J, Theiler J, Gough B, Jungman G, Booth M, Rossi F (2006). GNU Scientific Library Reference Manual. Network Theory Limited, revised 2nd edition.

Gansner ER, North SC (2000). An open graph visualization system and its applications to software engineering. Software: Practice and Experience, 30(11).

Gelman A, Meng XL (1998). Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling. Statistical Science, 13(2), 163–185.

Gilks WR, Berzuini C (2001). Following a moving target: Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society B, 63, 127–146.

Gordon NJ, Salmond SJ, Smith AFM (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2), 107–113.

Hastings WK (1970). Monte Carlo Sampling Methods Using Markov Chains and their Applications. Biometrika, 52, 97–109.

Johansen AM, Del Moral P, Doucet A (2006). Sequential Monte Carlo samplers for rare events. In Proceedings of the 6th International Workshop on Rare Event Simulation, pp. 256–267. Bamberg, Germany.

Johansen AM, Doucet A, Davy M (2008). Particle Methods for Maximum Likelihood Parameter Estimation in Latent Variable Models. Statistics and Computing, 18(1), 47–57.

Kitagawa G (1996). Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models. Journal of Computational and Graphical Statistics, 5, 1–25.

Liu JS (2001). Monte Carlo Strategies in Scientific Computing. Springer Series in Statistics. Springer Verlag, New York.

Liu JS, Chen R (1998). Sequential Monte Carlo Methods for Dynamic Systems. Journal of the American Statistical Association, 93(443), 1032–1044.

Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH (1953). Equation of State Calculations by Fast Computing Machines. Journal of Chemical Physics, 21, 1087–1092.

Neal RM (2001). Annealed Importance Sampling. Statistics and Computing, 11, 125–139.

Peters GW (2005). Topics In Sequential Monte Carlo Samplers. M.Sc. thesis, University of Cambridge, Department of Engineering.

Stroustrup B (1991). The C++ Programming Language. Addison Wesley, 2nd edition.

van der Merwe R, Doucet A, de Freitas N, Wan E (2000). The Unscented Particle Filter. Technical Report CUED-F/INFENG-TR380, University of Cambridge, Department of Engineering.

van Heesch D (2007). doxygen Manual, 1.5.5 edition. URL http://www.doxygen.org.

Affiliation:
Adam Johansen
University of Bristol
Department of Mathematics
University Walk
Bristol, BS3 3HY, United Kingdom
E-mail: Adam.Johansen@bristol.ac.uk
URL: http://stats.bris.ac.uk/~maamj/


17 pages
Sequential Monte Carlo Methods
No ratings yet
Sequential Monte Carlo Methods
6 pages
Big Data JPM
No ratings yet
Big Data JPM
31 pages
2 Simulation
No ratings yet
2 Simulation
16 pages
Computation
No ratings yet
Computation
11 pages
Computational Bayesian Statistics. An Introduction - Amaral, Paulino, Muller PDF
100% (4)
Computational Bayesian Statistics. An Introduction - Amaral, Paulino, Muller PDF
257 pages
Book Numerical
100% (1)
Book Numerical
388 pages
Bayes Intro PT 2
No ratings yet
Bayes Intro PT 2
13 pages
SharmaJK 2016 Contents OperationsResearchThe
100% (1)
SharmaJK 2016 Contents OperationsResearchThe
100 pages
Putational Statistics Using Matlab
No ratings yet
Putational Statistics Using Matlab
78 pages
PRMS - Uncertanity and Reserve Estimation
100% (2)
PRMS - Uncertanity and Reserve Estimation
70 pages
An Introduction To MCMC For Machine Learning
No ratings yet
An Introduction To MCMC For Machine Learning
39 pages
ALFANTA, JESIEL - Paper Review Using Montecarlo
No ratings yet
ALFANTA, JESIEL - Paper Review Using Montecarlo
1 page
Pflib - An Object Oriented Matlab Toolbox For Particle Filtering
No ratings yet
Pflib - An Object Oriented Matlab Toolbox For Particle Filtering
8 pages
Monte Carlo
No ratings yet
Monte Carlo
59 pages
Bayesian Analysis
No ratings yet
Bayesian Analysis
20 pages
SPM Unit 3 Notes
No ratings yet
SPM Unit 3 Notes
27 pages
BigData - Monte Carlo Algorithm
No ratings yet
BigData - Monte Carlo Algorithm
22 pages
An Introduction To MCMC For Machine Learning: Abstract
No ratings yet
An Introduction To MCMC For Machine Learning: Abstract
39 pages
Shinozuka (1991)
No ratings yet
Shinozuka (1991)
14 pages
Particle Filtering: Emin Orhan Eorhan@bcs - Rochester.edu
No ratings yet
Particle Filtering: Emin Orhan Eorhan@bcs - Rochester.edu
6 pages
Rambaut2018 Tracer 1.7
No ratings yet
Rambaut2018 Tracer 1.7
4 pages
Nist TN 1900 PDF
No ratings yet
Nist TN 1900 PDF
105 pages
The Unscented Particle Filter: Rudolph Van Der Merwe Arnaud Doucet
No ratings yet
The Unscented Particle Filter: Rudolph Van Der Merwe Arnaud Doucet
7 pages
Paper Siga23metalayer
No ratings yet
Paper Siga23metalayer
15 pages
3rd Year Syllabus 2020-21
No ratings yet
3rd Year Syllabus 2020-21
36 pages
Resampling
No ratings yet
Resampling
2 pages
Computational Statistics With Matlab
No ratings yet
Computational Statistics With Matlab
71 pages
Article2 PDF
No ratings yet
Article2 PDF
27 pages
Monte Carlo Integration
No ratings yet
Monte Carlo Integration
5 pages
The Kernel Polynomial Method
No ratings yet
The Kernel Polynomial Method
32 pages
MCMC For Epidemiologists
No ratings yet
MCMC For Epidemiologists
8 pages
Burger 2005 Batu Hijau Mill Throughput Model
No ratings yet
Burger 2005 Batu Hijau Mill Throughput Model
21 pages
Overview of Methods For Voltage Sag Performance Estimation
No ratings yet
Overview of Methods For Voltage Sag Performance Estimation
5 pages
ECE 4310: Energy System II Study Guide Winter 2014: - Frequency Error
No ratings yet
ECE 4310: Energy System II Study Guide Winter 2014: - Frequency Error
6 pages
Best Financial Engineering Books
100% (1)
Best Financial Engineering Books
3 pages
2021 - Spare Parts Ordering Decisions Using Age Based, Block Based and Condition Based Replacement Policies
No ratings yet
2021 - Spare Parts Ordering Decisions Using Age Based, Block Based and Condition Based Replacement Policies
6 pages
SIMULATION
No ratings yet
SIMULATION
6 pages
Ge 501 Notes
No ratings yet
Ge 501 Notes
34 pages
Chapter 3 Contingency Analysis and Allocation
No ratings yet
Chapter 3 Contingency Analysis and Allocation
15 pages
Principles of Good Practice For The Use of Monte Carlo Techniques in Human Health and Ecological Risk Assessments
No ratings yet
Principles of Good Practice For The Use of Monte Carlo Techniques in Human Health and Ecological Risk Assessments
5 pages
Editorial: Mathematical Modeling and Optimization of Industrial Problems
No ratings yet
Editorial: Mathematical Modeling and Optimization of Industrial Problems
4 pages
Acceptance-Rejection Sampling and Multi-dimensional Monte Carlo Integrations Utilizing Mathematica®
From Everand
Acceptance-Rejection Sampling and Multi-dimensional Monte Carlo Integrations Utilizing Mathematica®
SUJAUL CHOWDHURY
No ratings yet
Multi-dimensional Monte Carlo Integrations Utilizing Mathematica
From Everand
Multi-dimensional Monte Carlo Integrations Utilizing Mathematica
SUJAUL CHOWDHURY
No ratings yet
Nonlinear Transformations of Random Processes
From Everand
Nonlinear Transformations of Random Processes
Ralph Deutsch
No ratings yet
Graph Layout Support for Model-Driven Engineering
From Everand
Graph Layout Support for Model-Driven Engineering
Miro Spönemann
No ratings yet
Loop-shaping Robust Control
From Everand
Loop-shaping Robust Control
Philippe Feyel
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


SMCTC: Sequential Monte Carlo in C++

2. Sequential Monte Carlo

2.1. Sequential Importance Sampling and Resampling

Algorithm 1 The Generic SIR Algorithm. At time 1: for $i = 1$ to $N$ do: sample $X_1^{i} \sim q_1(\cdot)$
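The first step of a sampling importance resampling scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not SMCTC code: the `Particle` struct, the function names, the Gaussian target (unnormalised density $\exp(-x^2/2)$) and the proposal $q_1 = N(0, 2^2)$ are all hypothetical choices made for the example, and multinomial resampling is used only because it is the simplest scheme to write down.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical weighted sample for this sketch (not SMCTC's smc::particle).
struct Particle {
    double x;     // particle value
    double logw;  // unnormalised log-weight
};

// Convert log-weights to normalised weights, subtracting the maximum for stability.
std::vector<double> normalised_weights(const std::vector<Particle>& ps) {
    assert(!ps.empty());
    double m = ps[0].logw;
    for (const Particle& p : ps) m = std::max(m, p.logw);
    std::vector<double> w(ps.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < ps.size(); ++i) {
        w[i] = std::exp(ps[i].logw - m);
        sum += w[i];
    }
    for (double& wi : w) wi /= sum;
    return w;
}

// Time 1: draw from q1 = N(0, 2^2) and weight by target / proposal.
// Assumed target: exp(-x^2 / 2); constants cancel after normalisation.
std::vector<Particle> sir_initialise(std::size_t n, std::mt19937& rng) {
    std::normal_distribution<double> q1(0.0, 2.0);
    std::vector<Particle> ps(n);
    for (Particle& p : ps) {
        p.x = q1(rng);
        p.logw = -0.5 * p.x * p.x + p.x * p.x / 8.0;
    }
    return ps;
}

// Effective sample size, 1 / sum of squared normalised weights.
double effective_sample_size(const std::vector<Particle>& ps) {
    double s2 = 0.0;
    for (double w : normalised_weights(ps)) s2 += w * w;
    return 1.0 / s2;
}

// Multinomial resampling: draw indices with probability proportional to weight,
// then reset all weights to be equal.
std::vector<Particle> resample(const std::vector<Particle>& ps, std::mt19937& rng) {
    std::vector<double> w = normalised_weights(ps);
    std::discrete_distribution<std::size_t> pick(w.begin(), w.end());
    std::vector<Particle> out(ps.size());
    for (Particle& p : out) {
        p.x = ps[pick(rng)].x;
        p.logw = 0.0;
    }
    return out;
}

// Self-normalised importance sampling estimate of the mean.
double weighted_mean(const std::vector<Particle>& ps) {
    std::vector<double> w = normalised_weights(ps);
    double m = 0.0;
    for (std::size_t i = 0; i < ps.size(); ++i) m += w[i] * ps[i].x;
    return m;
}
```

In practice one would typically resample only when the effective sample size falls below some threshold, since resampling itself adds Monte Carlo variance.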

2.2. Particle Filters

$$p(x_{1:n} \mid y_{1:n}) \propto \nu(x_1)\, g(y_1 \mid x_1) \prod_{j=2}^{n} f(x_j \mid x_{j-1})\, g(y_j \mid x_j).$$
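As a worked illustration of how this factorisation leads to sequential weighting, suppose particles are extended at time $n$ by sampling from a proposal density $q_n(x_n \mid x_{n-1})$ (the symbol $q_n$ is notation assumed here). The incremental importance weight is then the ratio of the new factor in the target to the proposal:

$$W_n(x_{n-1}, x_n) = \frac{f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{q_n(x_n \mid x_{n-1})}.$$

In the special case $q_n = f$, often called the bootstrap filter, the transition density cancels and the weight reduces to the likelihood alone:

$$W_n(x_{n-1}, x_n) = g(y_n \mid x_n).$$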

2.3. SMC Samplers

The synthetic distributions are $\tilde{\pi}_n(x_{1:n}) = \pi_n(x_n) \prod_{p=1}^{n-1} L_p(x_{p+1}, x_p)$, where $\{L_n\}_{n \in \mathbb{N}}$ is a sequence of backward Markov kernels.
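A short derivation may help show why this construction is useful. Write $\pi_n$ for the target at step $n$, $L_p$ for the backward kernels and $K_n(x_{n-1}, x_n)$ for the forward proposal kernel used to move particles (the symbol $K_n$ is notation assumed here). Taking the ratio of consecutive synthetic distributions, all but the most recent backward kernel cancel:

$$w_n(x_{n-1}, x_n) = \frac{\tilde{\pi}_n(x_{1:n})}{\tilde{\pi}_{n-1}(x_{1:n-1})\, K_n(x_{n-1}, x_n)} = \frac{\pi_n(x_n)\, L_{n-1}(x_n, x_{n-1})}{\pi_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}.$$

The incremental weight therefore depends only on the two most recent states, so it can be evaluated without storing the whole path.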

3.1. Obtaining SMCTC

3.2. Installing SMCTC

Compiling the library

3.3. Building Programs with SMCTC

3.4. Additional Documentation

4. The SMCTC Library

4.1. Library and Program Structure

4.2. Creating, Configuring and Running a Sampler: smc::sampler

Proposals and Importance Weights

Running the Algorithm

4.3. Specifying Proposals and Importance Weights: smc::moveset

Initialising the particles

will produce a new particle object which contains those values.

Moving and Weighting the particles

Additional MCMC moves

Creating an smc::moveset object

4.4. Error Handling: smc::exception

4.5. Random Number Generation

5.1. A Simple Particle Filter Model

lIterates = load_data("data.csv", &y);

cerr << e; exit(e.lCode); } }

double integrand_mean_x(cv_state s, void *) { return s.x_pos; }

return smc::particle<cv_state>(value, logLikelihood(0, value)); }

pFrom.AddToLogWeight(logLikelihood(lTime, *cv_to)); }
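The fragments above accumulate importance weights on the log scale. The idea can be sketched in isolation as follows; `WeightedState`, `add_to_log_weight` and the Gaussian observation model in `log_gaussian` are hypothetical stand-ins chosen for this example and are not SMCTC's `smc::particle` interface or the likelihood used in the paper.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a weighted particle; not SMCTC's smc::particle.
struct WeightedState {
    double x = 0.0;           // current state value
    double log_weight = 0.0;  // accumulated unnormalised log-weight

    // Multiplying a weight by exp(increment) is an addition in log space.
    void add_to_log_weight(double increment) { log_weight += increment; }
};

// Log-density of an N(mean, sd^2) observation (an assumed model for this sketch).
double log_gaussian(double y, double mean, double sd) {
    assert(sd > 0.0);
    const double kHalfLog2Pi = 0.9189385332046727;  // 0.5 * log(2 * pi)
    const double z = (y - mean) / sd;
    return -0.5 * z * z - std::log(sd) - kHalfLog2Pi;
}
```

Working with log-weights avoids the underflow that would occur if many small likelihood values were multiplied together directly.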

5.2. Gaussian Tail Probabilities

Model and Distribution Sequence

[Figure: ground truth, filtering estimate and observations.]

The Path Sampling Approximation

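$$\frac{d \log g_\theta}{d\theta}(x) = \frac{V(x) - \hat{V}}{\exp\left(\alpha(\theta)(V(x) - \hat{V})\right) + 1}\,\frac{d\alpha}{d\theta}$$

$$\log\left(\frac{Z_{t/T}}{Z_0}\right) = \int_0^{t/T} \mathbb{E}_\theta\left[\frac{V(\cdot) - \hat{V}}{\exp\left(\alpha(\theta)(V(\cdot) - \hat{V})\right) + 1}\right] \frac{d\alpha}{d\theta}\, d\theta = \int_0^{\alpha(t/T)} \mathbb{E}_\alpha\left[\frac{V(\cdot) - \hat{V}}{\exp\left(\alpha(V(\cdot) - \hat{V})\right) + 1}\right] d\alpha,$$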

, where is interpreted as a parameter, is used for notational

convenience. This forward kernel can be written as:

where the probability of each of the possible moves is given by

This leads to the following optimal auxiliary kernel:

The incremental importance weight is consequently:

double zEstimate = Sampler.IntegratePathSampling(pIntegrandPS, pWidthPS, NULL) - log(2.0);

and integrand pIntegrandPS

lp -= log(1.0 + exp(-(ALPHA(lTime) * (x->value - THRESHOLD)))); return lp; }
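The logistic term in the fragment above, and its derivative with respect to the scale parameter, can be sketched on their own. This is an illustration under assumptions rather than the library's implementation: the function names and signatures are hypothetical, and the trapezoidal rule is used only to check, for a single fixed point `x`, that integrating the derivative recovers the change in the log-potential.

```cpp
#include <cassert>
#include <cmath>

// log of 1 / (1 + exp(-alpha * (x - threshold))), written to avoid overflow.
double log_potential(double alpha, double x, double threshold) {
    const double z = alpha * (x - threshold);
    // For z >= 0, exp(-z) <= 1; for z < 0, use the equivalent form z - log(1 + exp(z)).
    return z >= 0.0 ? -std::log1p(std::exp(-z)) : z - std::log1p(std::exp(z));
}

// Derivative of log_potential with respect to alpha:
// (x - threshold) / (exp(alpha * (x - threshold)) + 1).
double dlog_potential_dalpha(double alpha, double x, double threshold) {
    const double v = x - threshold;
    const double z = alpha * v;
    if (z >= 0.0) {
        const double e = std::exp(-z);
        return v * e / (1.0 + e);
    }
    return v / (1.0 + std::exp(z));
}

// Trapezoidal integral of the derivative over alpha in [0, alpha_max], for fixed x.
double trapezoid_in_alpha(double alpha_max, double x, double threshold, int steps) {
    assert(steps > 0);
    const double h = alpha_max / steps;
    double sum = 0.5 * (dlog_potential_dalpha(0.0, x, threshold) +
                        dlog_potential_dalpha(alpha_max, x, threshold));
    for (int i = 1; i < steps; ++i) {
        sum += dlog_potential_dalpha(i * h, x, threshold);
    }
    return sum * h;
}
```

At `alpha = 0` the potential equals 1/2 for every `x`, so its logarithm is `-log(2)`, which is consistent with the `- log(2.0)` correction in the `zEstimate` line.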


van Heesch D (2007). doxygen Manual, 1.5.5 edition. URL http://www.doxygen.org.
