Spde No R PDF
Finn Lindgren
University of Bath, United Kingdom
Håvard Rue
Norwegian University of Science and Technology, Norway
Abstract
The principles behind the interface to continuous domain spatial models in the R-
INLA software package for R are described. The Integrated Nested Laplace Approxima-
tion (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally
effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaus-
sian models, a very wide and flexible class of models ranging from (generalized) linear
mixed to spatial and spatio-temporal models. Combined with the Stochastic Partial
Differential Equation approach (SPDE; Lindgren, Rue, and Lindström 2011), one can accommodate
all kinds of geographically referenced data, including areal and geostatistical ones, as well
as spatial point process data. The implementation interface covers stationary spatial mod-
els, non-stationary spatial models, and also spatio-temporal models, and is applicable in
epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Keywords: Bayesian inference, Gaussian Markov random fields, stochastic partial differential
equations, Laplace approximation, R.
Traditionally, Markov models in image analysis and spatial statistics have been largely con-
fined to discrete spatial domains, such as lattices and regional adjacency graphs. However,
as discussed in Lindgren, Rue, and Lindström (2011), one can express a large class of ran-
dom field models as solutions to continuous domain stochastic partial differential equations
(SPDEs), and write down explicit links between the parameters of each SPDE and the ele-
ments of precision matrices for weights in a discrete basis function representation. As shown
by Whittle (1963), such models include those with Matérn covariance functions, which are
ubiquitous in traditional spatial statistics, but in contrast to covariance based models it is
far easier to introduce non-stationarity into the SPDE models. This is because the differ-
ential operators act locally, similarly to local increments in Gibbs-specifications of Markov
models, and only mild regularity conditions are required. The practical significance of this
2 Bayesian Spatial Modelling with R-INLA
is that classical Gaussian random fields can be merged with methods based on the Markov
property, providing continuous domain models that are computationally efficient, and where
the parameters can be specified locally without having to worry about positive definiteness
of covariance functions.
The fundamental building block of such Gaussian Markov random field (GMRF) models, as
implemented in R-INLA, is a high-dimensional basis representation, with simple local basis
functions. This is in contrast to Fixed Rank Kriging (Cressie and Johannesson 2008) that
typically uses a smaller number of global basis functions, and predictive process methods
(Banerjee, Gelfand, Finley, and Sang 2008). See Wikle (2010) for an overview of such
low-rank representation methods. A numerical comparison of the error introduced in Kriging
calculations was performed by Bolin and Lindgren (2013) for SPDE based GMRF models,
covariance tapering, and process convolutions. A non-parametric approach using similar
GMRF models is available in the LatticeKrig package on CRAN (Nychka, Hammerling, Sain,
and Lenssen 2013).
The different methods can also be combined, although the details for doing that within R-
INLA are beyond the scope of this paper. For example, the global temperature analysis in
Lindgren et al. (2011) used a combination of a low dimensional global basis, like in Fixed
Rank Kriging, and a small-scale GMRF process, both with priors based on approximations to
continuous domain SPDE models. There is considerable overlap between models formulated
using Fixed Rank Kriging and SPDE/GMRF models, and a clear line separating the methods
cannot be drawn.
The R-INLA software package currently has direct support for stationary and non-stationary
locally isotropic SPDE/GMRF models on compact subsets of $\mathbb{R}$, $\mathbb{R}^2$, and on $\mathbb{S}^2$, as well as
separable space-time models. Some non-separable space-time models, non-stationary fully
anisotropic models, as well as models on $\mathbb{R}^3$ and other user-defined domains are also partially
supported by the internal implementation, but have not yet been added to the basic interface.
Consequently, auto-regressive models (see e.g., Cameletti, Lindgren, Simpson, and Rue 2013)
are fully supported. Anisotropic advection-diffusion models (see e.g., Sigrist, Künsch,
and Stahel 2014) are limited to non-advective models and require advanced user interaction,
although non-stationary anisotropy is supported if only the strength of the non-isotropic
diffusion is unknown.
The following sections present the basic ingredients of the link between continuous domains
and Markov models and related simulation-free Bayesian inference methods (Section 1),
describe the structure of the interface to using such models in the R-INLA software package
(Sections 2 and 3), and discuss planned future development (Section 4). Special emphasis is
placed on the abstractions necessary to simplify the practical bookkeeping for the users of the
software. For further details on the computational and inferential methods in the R-INLA
package we refer to Martins, Simpson, Lindgren, and Rue (2013).
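The basic stationary model considered here is defined through the SPDE
$$(\kappa^2 - \Delta)^{\alpha/2} (\tau x(s)) = W(s), \quad s \in \Omega,$$
where ∆ is the Laplacian, κ is the spatial scale parameter, α controls the smoothness of the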
realisations, τ controls the variance, and Ω is the spatial domain. The right-hand side of the
equation, W(s), is a Gaussian spatial white noise process. As noted by Whittle (1954, 1963),
the stationary solutions on $\mathbb{R}^d$ have Matérn covariances,
$$\operatorname{Cov}(x(0), x(s)) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)} (\kappa\|s\|)^{\nu} K_{\nu}(\kappa\|s\|). \tag{1}$$
The parameters in the two formulations are coupled so that the Matérn smoothness is $\nu = \alpha - d/2$
and the marginal variance is
$$\sigma^2 = \frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}\kappa^{2\nu}\tau^2}. \tag{2}$$
From this we can identify the exponential covariance with ν = 1/2. For d = 1, this is obtained
with α = 1, and for d = 2 with α = 3/2.
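Equations 1 and 2 can be evaluated directly in base R. The following is a minimal sketch; the function names matern_cov and spde_variance are chosen here for illustration and are not part of R-INLA. It checks that ν = 1/2 reproduces the exponential covariance.

```r
# Matern covariance from Equation 1; h is a vector of distances ||s||.
matern_cov <- function(h, sigma2, kappa, nu) {
  out <- rep(sigma2, length(h))          # limiting value at h = 0
  pos <- h > 0
  x <- kappa * h[pos]
  out[pos] <- sigma2 / (2^(nu - 1) * gamma(nu)) * x^nu * besselK(x, nu)
  out
}

# Marginal variance from Equation 2, with nu = alpha - d / 2.
spde_variance <- function(kappa, tau, alpha = 2, d = 2) {
  nu <- alpha - d / 2
  gamma(nu) / (gamma(alpha) * (4 * pi)^(d / 2) * kappa^(2 * nu) * tau^2)
}

h <- seq(0, 3, by = 0.5)
# nu = 1/2 should agree with the exponential covariance sigma^2 * exp(-kappa * h)
err_exponential <- max(abs(matern_cov(h, sigma2 = 1, kappa = 2, nu = 0.5) - exp(-2 * h)))
# d = 2, alpha = 2 (nu = 1) gives 1 / (4 * pi * kappa^2 * tau^2)
v <- spde_variance(kappa = 2, tau = 0.5)
```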
From spectral theory one can show that integer values for α give continuous domain Markov
fields (Rozanov 1982), and these are the easiest for which to provide discrete basis represen-
tations. In R-INLA, the default value is α = 2, but 0 ≤ α < 2 are also available, though
not as extensively tested. For the non-integer α values the approximation method introduced
in the authors’ discussion response in Lindgren et al. (2011) is used. Historically, Whittle
(1954) argued that α = 2 was a more natural basic choice for d = 2 models than the frac-
tional α = 3/2 alternative. Note that fields with α ≤ d/2 have ν ≤ 0 and that such fields
have no point-wise interpretation, although they have well-defined integration properties. In
particular, this means that the case d = 2, α = 1, which on a regular lattice discretisation
corresponds to the common CAR(1) model, needs to be interpreted with care (Besag 1981;
Besag and Mondal 2005), especially when used in combination with irregular discretisation
domains.
The models discussed in Lindgren et al. (2011) and implemented in R-INLA are built on a
basis representation
$$x(s) = \sum_{k=1}^{n} \psi_k(s) x_k, \tag{3}$$
where $\psi_k(\cdot)$ are deterministic basis functions, and the joint distribution of the weight vector
$x = \{x_1, \dots, x_n\}$ is chosen so that the distribution of the functions $x(s)$ approximates the
distribution of solutions to the SPDE on the domain. To obtain a Markov structure, and
to preserve it when conditioning on local observations, we use piecewise polynomial basis
functions with compact support. The construction is done by projecting the SPDE onto the
basis representation in what is essentially a Finite Element method.
To allow easy and explicit evaluation, for two-dimensional domains we use piece-wise linear
basis functions defined by a triangulation of the domain of interest. For one-dimensional
domains, B-splines of degrees 1 (piecewise linear) and 2 (piecewise quadratic) are supported.
This yields sparse matrices $C$, $G_1$, and $G_2$ such that the appropriate precision matrix for the
weights is given by
$$Q = \tau^2 (\kappa^4 C + 2\kappa^2 G_1 + G_2)$$
for the default case $\alpha = 2$, so that the elements of $Q$ have explicit expressions as functions
of $\kappa$ and $\tau$. Assigning the Gaussian distribution $x \sim \mathsf{N}(0, Q^{-1})$ now generates continuously
defined functions $x(s)$ that are approximate solutions to the SPDE (in a stochastically weak
sense).
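To make the precision matrix concrete, the following sketch assembles Q for a regular one-dimensional mesh with piecewise linear basis functions, using dense base R matrices. Several choices are assumptions made for this illustration and are not stated in the text above: C is taken as the lumped (diagonal) mass matrix, G1 as the stiffness matrix, and G2 = G1 C^{-1} G1. The central marginal variance is then compared with Equation 2 for d = 1, α = 2.

```r
# ASSUMPTIONS: lumped (diagonal) mass matrix C, stiffness matrix G1, and
# G2 = G1 C^{-1} G1 on a regular 1D mesh with spacing h.
n <- 401; h <- 0.05                       # mesh nodes on [0, 20]
kappa <- 2; tau <- 1

C <- diag(c(h / 2, rep(h, n - 2), h / 2))
G1 <- diag(c(1, rep(2, n - 2), 1)) / h
G1[cbind(1:(n - 1), 2:n)] <- -1 / h       # super-diagonal
G1[cbind(2:n, 1:(n - 1))] <- -1 / h       # sub-diagonal
G2 <- G1 %*% solve(C, G1)

Q <- tau^2 * (kappa^4 * C + 2 * kappa^2 * G1 + G2)

# Marginal variance at the central node versus Equation 2 with d = 1,
# alpha = 2 (nu = 3/2), where Gamma(nu) / (Gamma(alpha) * sqrt(4 * pi)) = 1/4.
var_mid <- solve(Q)[(n + 1) / 2, (n + 1) / 2]
var_eq2 <- 0.25 / (kappa^3 * tau^2)
c(discrete = var_mid, continuous = var_eq2)
```

A dense inverse is used only to keep the sketch short; in practice the matrices are sparse and the variances would be obtained with sparse methods.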
The simplest internal representation of the parameters in the model interface is log(τ ) = θ1
and log(κ) = θ2 , where θ1 and θ2 are assigned a joint normal prior distribution. Since τ
and κ have a joint influence on the marginal variances of the resulting field, it is often more
natural to construct the parameter model using the standard deviation σ and range ρ, where
$\rho = (8\nu)^{1/2}/\kappa$ is the distance for which the correlation functions have fallen to approximately
0.13, for all ν > 1/2. Another commonly used definition for the range is as the distance at
which the correlation is 0.05. The alternative definition used in R-INLA has the advantage
of explicit dependence on ν. Translating this into τ and κ yields
$$\log\tau = \frac{1}{2}\log\left(\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}}\right) - \log\sigma - \nu\log\kappa, \tag{4}$$
$$\log\kappa = \frac{\log(8\nu)}{2} - \log\rho. \tag{5}$$
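The stated correlation level at the nominal range can be checked numerically from the Matérn correlation function. This is a small base R sketch; corr_at_range is a name chosen here for illustration. Since κρ = (8ν)^{1/2}, the value does not depend on κ.

```r
# Matern correlation evaluated at distance rho = sqrt(8 * nu) / kappa.
corr_at_range <- function(nu) {
  r <- sqrt(8 * nu)                      # kappa * rho, free of kappa
  2^(1 - nu) / gamma(nu) * r^nu * besselK(r, nu)
}

nu <- c(0.5, 1, 1.5, 2.5)
corr <- sapply(nu, corr_at_range)
round(corr, 3)
```

For ν = 1/2 the value is exactly exp(-2); the other smoothness values give correlations of a similar size.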
Suppose we want a parameterisation
$$\log\sigma = \log\sigma_0 + \theta_1, \tag{6}$$
$$\log\rho = \log\rho_0 + \theta_2, \tag{7}$$
Journal of Statistical Software 5
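where $\sigma_0$ and $\rho_0$ are base-line standard deviation and range values. We then substitute $\log\sigma$
and $\log\rho$ into Equations 4 and 5, giving the internal parameterisation
$$\log\kappa_0 = \frac{\log(8\nu)}{2} - \log\rho_0,$$
$$\log\tau_0 = \frac{1}{2}\log\left(\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}}\right) - \log\sigma_0 - \nu\log\kappa_0,$$
$$\log\tau = \log\tau_0 - \theta_1 + \nu\theta_2,$$
$$\log\kappa = \log\kappa_0 - \theta_2.$$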
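The substitution can be verified numerically. The sketch below implements Equations 4 and 5 in base R and confirms that the internal parameterisation agrees with a direct evaluation at σ = σ0 exp(θ1) and ρ = ρ0 exp(θ2). The function name sigma_rho_to_tau_kappa and the numerical values are illustrative assumptions.

```r
# Equations 4 and 5: map (sigma, rho) to (tau, kappa) for given alpha and d.
sigma_rho_to_tau_kappa <- function(sigma, rho, alpha = 2, d = 2) {
  nu <- alpha - d / 2
  log_kappa <- log(8 * nu) / 2 - log(rho)
  log_tau <- 0.5 * log(gamma(nu) / (gamma(alpha) * (4 * pi)^(d / 2))) -
    log(sigma) - nu * log_kappa
  c(tau = exp(log_tau), kappa = exp(log_kappa))
}

sigma0 <- 1; rho0 <- 0.4; nu <- 1       # illustrative baseline, d = 2, alpha = 2
theta <- c(0.3, -0.2)                   # illustrative theta_1, theta_2
base <- sigma_rho_to_tau_kappa(sigma0, rho0)

# Internal parameterisation: log tau = log tau0 - theta1 + nu * theta2,
# log kappa = log kappa0 - theta2.
internal <- c(tau = exp(log(base[["tau"]]) - theta[1] + nu * theta[2]),
              kappa = exp(log(base[["kappa"]]) - theta[2]))

# Direct evaluation at sigma = sigma0 * exp(theta1), rho = rho0 * exp(theta2).
direct <- sigma_rho_to_tau_kappa(sigma0 * exp(theta[1]), rho0 * exp(theta[2]))
all.equal(internal, direct)
```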
Non-stationary fields
There is a vast range of possible extensions to the stationary SPDE described in the previous
section, including non-stationary versions (see Lindgren et al. 2011; Bolin and Lindgren 2011,
for examples). In the current version of the package, a non-stationary model defined via
spatially varying κ(s) and τ (s) is available for the case α = 2. The SPDE is defined as
$$(\kappa(s)^2 - \Delta)(\tau(s) x(s)) = W(s), \quad s \in \Omega,$$
and log κ(s) and log τ (s) are defined as linear combinations of basis functions,
$$\log(\tau(s)) = b^{\tau}_0(s) + \sum_{k=1}^{p} b^{\tau}_k(s)\theta_k,$$
$$\log(\kappa(s)) = b^{\kappa}_0(s) + \sum_{k=1}^{p} b^{\kappa}_k(s)\theta_k,$$
where $\{\theta_1, \dots, \theta_p\}$ is a common set of internal representation parameters, and $b^{\tau}_k(\cdot)$ and $b^{\kappa}_k(\cdot)$
are spatial basis functions, some of which, for each $k$, may be identically zero for either $\tau$ or
$\kappa$. The precision matrix for the discrete field representation weights is a simple modification
of the stationary one, with the parameter fields (evaluated at the mesh discretisation points)
entering via diagonal matrices:
Just as in the stationary case, the model can be reparameterised using Equations 4 and 5,
where
$$\log(\sigma(s)) = b^{\sigma}_0(s) + \sum_{k=1}^{p} b^{\sigma}_k(s)\theta_k,$$
$$\log(\rho(s)) = b^{\rho}_0(s) + \sum_{k=1}^{p} b^{\rho}_k(s)\theta_k,$$
and σ(s) and ρ(s) are the nominal local standard deviations and correlation ranges. There
are no explicit expressions for the actual values, since they depend on the entire parameter
functions in a non-trivial way. For given values of θ, the marginal variances can be efficiently
calculated using the discretised GMRF representation, see Section 2.3.
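Given the offsets, $b^{\sigma}_0(s)$ and $b^{\rho}_0(s)$, and basis functions, $b^{\sigma}_k(s)$ and $b^{\rho}_k(s)$, for the $\log(\sigma(s))$
and $\log(\rho(s))$ parameter fields, the internal model representation can be constructed using
the following identities:
$$b^{\kappa}_0(s) = \frac{\log(8\nu)}{2} - b^{\rho}_0(s), \tag{8}$$
$$b^{\kappa}_k(s) = -b^{\rho}_k(s), \tag{9}$$
$$b^{\tau}_0(s) = \frac{1}{2}\log\left(\frac{\Gamma(\nu)}{\Gamma(\alpha)(4\pi)^{d/2}}\right) - b^{\sigma}_0(s) - \nu b^{\kappa}_0(s), \tag{10}$$
$$b^{\tau}_k(s) = -b^{\sigma}_k(s) - \nu b^{\kappa}_k(s). \tag{11}$$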
The constant $\Gamma(\nu)/(\Gamma(\alpha)(4\pi)^{d/2})$ is 1/2 and 1/4 for $d = 1$ with $\alpha = 1$ and 2, respectively. For $d = 2$ and $\alpha = 2$
it is $1/(4\pi)$. There is experimental support for constructing basis functions that reduce the
influence of the range on the variance for cases where the basis functions for $\log \rho(s)$ have
rapid changes.
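The quoted values of the constant follow from ν = α − d/2 and can be confirmed with a few lines of base R; spde_constant is a name chosen here for illustration.

```r
# Constant Gamma(nu) / (Gamma(alpha) * (4 * pi)^(d / 2)) with nu = alpha - d / 2.
spde_constant <- function(alpha, d) {
  nu <- alpha - d / 2
  gamma(nu) / (gamma(alpha) * (4 * pi)^(d / 2))
}

constants <- c(d1_alpha1 = spde_constant(alpha = 1, d = 1),   # 1/2
               d1_alpha2 = spde_constant(alpha = 2, d = 1),   # 1/4
               d2_alpha2 = spde_constant(alpha = 2, d = 2))   # 1 / (4 * pi)
constants
```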
Boundary effects
When constructing solutions to the SPDEs on bounded domains, boundary conditions are
imposed, but how to construct practical and proper stochastic boundary conditions for these
models is an open research problem. In the current version of the package, all 2D models are
restricted to deterministic Neumann boundaries (zero normal-derivatives), as this is easy to
construct, has a well-defined physical interpretation in terms of reflection, and has an effect on
the covariances that is easy to quantify. As a rule of thumb, the boundary effect is negligible
at a distance ρ from the boundary, and the variance is inflated near the boundary by a factor
2 along straight boundaries, and by a factor 4 near right-angled corners. In practice one can
therefore avoid the boundary effect by extending the domain of interest by a distance at least
ρ, as well as avoid sharp corners. The built-in mesh generation routines (see Section 2.1)
are designed to do this. For one-dimensional models, the boundaries can also be defined as
Dirichlet (value zero at the boundary), free, or cyclic.
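The rule of thumb can be illustrated in a one-dimensional analogue, where the end point of an interval plays the role of a straight boundary. The sketch below is only an illustration under stated assumptions (regular mesh, piecewise linear basis, lumped mass matrix C, G2 = G1 C^{-1} G1, Neumann boundaries arising as the natural finite element boundary condition); it is not the R-INLA implementation.

```r
# ASSUMPTIONS: 1D regular mesh, lumped mass matrix, G2 = G1 C^{-1} G1,
# alpha = 2 (so nu = 3/2 in one dimension), Neumann boundaries.
n <- 401; h <- 0.05; kappa <- 2; tau <- 1
C <- diag(c(h / 2, rep(h, n - 2), h / 2))
G1 <- diag(c(1, rep(2, n - 2), 1)) / h
G1[cbind(1:(n - 1), 2:n)] <- -1 / h
G1[cbind(2:n, 1:(n - 1))] <- -1 / h
Q <- tau^2 * (kappa^4 * C + 2 * kappa^2 * G1 + G1 %*% solve(C, G1))

v <- diag(solve(Q))                 # marginal variances at the mesh nodes
rho <- sqrt(8 * 1.5) / kappa        # nominal range for nu = 3/2
i_rho <- 1 + round(rho / h)         # node at distance of about rho from the boundary
i_mid <- (n + 1) / 2                # central node, far from both boundaries

ratio_boundary <- v[1] / v[i_mid]   # variance inflation at the boundary
ratio_range <- v[i_rho] / v[i_mid]  # variance inflation one range into the domain
c(boundary = ratio_boundary, at_range = ratio_range)
```

The boundary ratio should be close to 2, and the ratio one range into the domain close to 1, in line with the recommendation to extend the domain by at least ρ.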
Space-time models
While no space-time models are currently implemented explicitly, it is possible to construct
such models using general code features. The most important method is to construct a
Kronecker product model. Starting from a basis representation
$$x(s, t) = \sum_{k} \psi_k(s, t) x_k,$$
where each basis function is the product of a spatial and a temporal basis function, $\psi_k(s, t) =
\psi^{s}_i(s)\psi^{t}_j(t)$, the space-time SPDE
$$\frac{\partial}{\partial t}(\kappa(s)^2 - \Delta)^{\alpha/2} (\tau(s) x(s, t)) = W(s, t), \quad (s, t) \in \Omega \times \mathbb{R}$$
which is called a Laplace approximation. This allows approximate evaluation of the (unnor-
malised) posterior density for θ at any point. The algorithm uses numerical optimisation to
find the mode of the posterior. The marginal posteriors for each θk and xj are then calculated
using numerical integration over θ, with another Laplace approximation involved in the latent
field marginal posterior calculations:
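$$p(\theta_k \mid y) \approx \int \tilde{p}(\theta \mid y)\, d\theta_{-k},$$
$$p(x_j \mid y) \approx \int \tilde{p}(x_j \mid \theta, y)\, \tilde{p}(\theta \mid y)\, d\theta.$$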
Let $z^{(1)}$ and $z^{(2)}$ denote the covariate values for the fixed and time effects, and let $z^{(3)}$
denote random effect indices. We can then rewrite the linear model from Equation 14 using
mapping functions $h_k(\cdot)$ that denote the mapping from covariates or indices to the actual
latent value for formula component $k$,
$$h_1(z_i^{(1)}) = z_i^{(1)} \beta,$$
$$h_2(z_i^{(2)}) = \text{smooth effect evaluated at } z_i^{(2)},$$
$$h_3(z_i^{(3)}) = \text{random effect component number } z_i^{(3)},$$
$$\eta_i = \sum_{k} h_k(z_i^{(k)}).$$
The latent field is the joint vector of all latent Gaussian variables, including the linear covariate
effect coefficient β. For missing values in the z-vectors, the h functions are defined to be zero.
Since this construction only allows each observation to directly depend on a single element
from each hk (·) effect, this does not cover the case when an effect is defined using a basis
expansion such as Equation 3. To solve this, R-INLA can apply a second layer of linear
combinations to the $\eta$ predictor,
$$\eta^* = A\eta, \tag{16}$$
where $A$ is a user-defined sparse matrix. This allows the SPDE models to be treated as
indexed random effects, and the mapping between the basis weights and function values is
done by placing appropriate $\psi_j(s)$ values in the $A$ matrix. Whenever an $A$ matrix is used,
the elements of the $\eta^*$ vector are the linear predictor values used in the general observation
model in Equation 15, instead of $\eta$. This is further formalised in Section 2.5.
See Section 2.2 for how to construct the part of the A matrix needed for an SPDE model,
and Section 2.5 for how to set up the joint matrices needed for general models.
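As an illustration of how basis function values enter the A matrix, the following base R sketch builds the matrix for a one-dimensional mesh with piecewise linear basis functions. The helper make_A_1d is hypothetical and written only for this example; it is not an R-INLA function, and a dense matrix is used for brevity where the package uses sparse matrices.

```r
# Hypothetical helper (not part of R-INLA): A[i, j] = psi_j(s_i) for piecewise
# linear basis functions on a 1D mesh with sorted nodes.
make_A_1d <- function(mesh_nodes, s) {
  n <- length(mesh_nodes)
  # index of the mesh interval containing each location
  k <- findInterval(s, mesh_nodes, rightmost.closed = TRUE, all.inside = TRUE)
  w <- (s - mesh_nodes[k]) / (mesh_nodes[k + 1] - mesh_nodes[k])
  A <- matrix(0, length(s), n)
  A[cbind(seq_along(s), k)] <- 1 - w       # weight of the left node
  A[cbind(seq_along(s), k + 1)] <- w       # weight of the right node
  A
}

mesh_nodes <- seq(0, 1, by = 0.25)
s <- c(0.1, 0.5, 0.9)                      # illustrative observation locations
A <- make_A_1d(mesh_nodes, s)
weights <- mesh_nodes^2                    # illustrative basis weights x_k
row_sums <- rowSums(A)                     # each row sums to 1
field_values <- drop(A %*% weights)        # x(s_i) = sum_j A_ij x_j
```

Each row has at most two non-zero entries, which is what keeps the mapping sparse, and the observation locations need not coincide with mesh nodes.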
2. R interface
The R-INLA interface to the SPDE models described in the previous section is divided into
five basic categories: 1) Mesh construction, 2) space mapping, 3) SPDE model construction,
4) plotting, and 5) INLA input and output structure bookkeeping.
Due mostly to the complexity of building the binary executables that form the computational
backbone of the R-INLA package, it is not available to install from CRAN, but can
still be easily installed directly from its web page (Rue et al. 2013b) from within R:
R> source("http://www.math.ntnu.no/inla/givemeINLA-testing.R")
R> library("INLA")
The package can later be upgraded to the latest development version with
which contains the latest features and bug fixes. The non-testing version is updated less
frequently. The package website http://www.r-inla.org contains more documentation, as
well as a discussion forum. The recommended way to access the full source code is to clone the
repository located at http://code.google.com/p/inla/ (Rue, Martino, Lindgren, Simpson,
and Riebler 2013a). On GNU/Linux systems, the Makefile supplied in the supplementary
material can be used to download the code and build a binary R package.
The major challenge when designing a general software package for practical use of the
SPDE/GMRF models is that of bookkeeping, i.e., how to assist the user in keeping track
of the links between continuous and discrete representations, in a way that frees the user
from having to know the details of the implementation and internal storage. To solve this, a
bit of abstraction is needed to avoid cluttering the interface with those details. Thus, instead
of visibly keeping track of mappings between triangle mesh node indices and data locations,
the user can use sparse matrices to encode these relationships, and wrapper functions are
provided to manipulate these matrices and associated index and covariate vectors in ways
suitable for the intended usage.
effects may actually be desirable. The function inla.mesh.2d() allows us to create a mesh
with small triangles in the domain of interest, and use larger triangles in the extension used
to avoid boundary effects. This minimises the extra computational work needed due to the
extension.
R> m = 50
R> points = matrix(runif(m * 2), m, 2)
R> mesh = inla.mesh.2d(
+ loc = points, cutoff = 0.05, offset = c(0.1, 0.4), max.edge = c(0.05, 0.5) )
The cutoff parameter is used to avoid building many small triangles around clustered input
locations, offset specifies the size of the inner and outer extensions around the data locations,
and max.edge specifies the maximum allowed triangle edge lengths in the inner domain and
in the outer extension. The overall effect of the triangulation construction is that, if desired,
one can have smaller triangles, and hence higher accuracy of the field representation, where
the observation locations are dense, larger triangles where data is more sparse (and hence
provides less detailed information), and large triangles where there is no data and spending
computational resources would be wasteful. However, note that there is neither any guarantee
nor any requirement that the observation locations are included as nodes in the mesh. If one
so desires, the mesh can be designed from different principles, such as lattice points with no
relation to the precise measurement locations. This emphasises the decoupling between the
continuous domain of the field model and the discrete data locations.
A new feature is the option to compute a non-convex covering to use as boundary information,
with the resulting domain boundary shown in Figure 1b. The boundary is then supplied to
inla.mesh.2d(),
resulting in the non-convex mesh shown in Figure 1c. Just as before, the SPDE edge effects
can be moved outside the domain of interest using an extension with larger triangles, shown
in Figure 1d:
For geostatistical problems with global data, one can work directly on a spherical mesh. Any
spatial coordinates must first be converted into 3D Cartesian coordinates. For longitudes
and latitudes this can be done with inla.mesh.map(), and the result can be used with
inla.mesh.2d():
Alternatively, a semi-regular mesh can be constructed using the more low-level command
where the globe parameter specifies the number of sub-segments to use when subdividing
an icosahedron. The points are adjusted to lie on constant latitude circles. See Figure 3 for
an example of how to plot fields defined on spherical meshes.
which produces a matrix with Aij = ψj (si ) for all points si in points.
Models in R-INLA can have several replicates, as well as being grouped, which corresponds
to Kronecker product models. In order to obtain the correct A matrix for such models, the
user can specify indices for the parameters group and repl. A recently introduced feature
also allows specifying a one-dimensional group.mesh, which is then interpreted as defining a
Kronecker product basis, such as for the space-time models mentioned in Section 1.1, and an
example is given in Section 3.2.
but in practice we need to also specify the prior distribution for the parameters, and/or modify
the parameterisation to suit the specific situation. This is true in particular when the models
are used as simple smoothers, as there is then rarely enough information in the likelihood to
fully identify the parameters, giving more importance to the prior distributions.
Using the theory from Section 1.1, the empirically derived range expression $\rho = \sqrt{8\nu}/\kappa$ allows
for construction of a model with known range and variance (= 1) for $(\theta_1, \theta_2) = (0, 0)$, via
R> sigma0 = 1
R> size = min(c(diff(range(mesh$loc[, 1])), diff(range(mesh$loc[, 2]))))
R> range0 = size / 5
R> kappa0 = sqrt(8) / range0
R> tau0 = 1 / (sqrt(4 * pi) * kappa0 * sigma0)
R> spde = inla.spde2.matern(mesh,
+ B.tau = cbind(log(tau0), -1, +1),
+ B.kappa = cbind(log(kappa0), 0, -1),
+ theta.prior.mean = c(0, 0),
+ theta.prior.prec = c(0.1, 1) )
Here, sigma0 is the field standard deviation and range0 is the spatial range for θ = 0,
and B.tau and B.kappa are matrices storing the parameter basis functions introduced in
Section 1.1. For stationary models, only the first matrix row needs to be supplied. In this
example, the prior median for the spatial range is chosen heuristically to be a fifth of the
approximate domain diameter.
Setting suitable priors for θ in these models is generally a difficult problem. The heuristic
used above is to specify a fairly vague prior for θ1, which controls the variance, with $\sigma_0^2$ being
the median prior variance, and a larger prior precision for θ2. When range0 is a fifth of
the domain size, the precision 1 for θ2 gives an approximate 95% prior probability for the
range being shorter than the domain size. Experimental helper functions for constructing
parameterisations and priors are included in the package.
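The stated prior probability follows from log ρ = log ρ0 + θ2 with θ2 normally distributed with mean 0 and precision 1. A short base R check, using variable names chosen for this illustration:

```r
# With rho0 = size / 5, the event rho < size is the event theta2 < log(5).
prec_theta2 <- 1
p_range_below_size <- pnorm(log(5), mean = 0, sd = 1 / sqrt(prec_theta2))
p_range_below_size

# The vaguer prior for theta1 (precision 0.1) implies a wide central 95% prior
# interval for the ratio sigma / sigma0.
sigma_ratio_interval <- exp(qnorm(c(0.025, 0.975), sd = 1 / sqrt(0.1)))
sigma_ratio_interval
```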
Models with range larger than the domain size are usually indistinguishable from intrinsic
random fields, which can be modelled by fixing κ to zero (or rather some small positive
value) with B.tau = cbind(log(tau0), 1) and B.kappa = cbind(log(small), 0). Note
that the sum-to-zero constraint often used for lattice-based intrinsic Markov models is
inappropriate due to the irregular mesh structure, and a weighted sum-to-zero constraint is
needed to reproduce such models. The option constr = TRUE to the inla.spde2.matern()
call can be used to apply an integrate-to-zero constraint instead, which uses the triangle
geometry to calculate the needed weights. Further integration constraints can be specified using
the extraconstr.int = list(A = A, e = e) option, which implements constraints of the form
$$A C x = e,$$
where $C$ is the sparse matrix with elements $C_{ij} = \int_{\Omega} \psi_i(s)\psi_j(s)\, ds$. Non-integration
constraints can be supplied with extraconstr, and all constraints will be passed on automatically
to the inla() call later.
Figure 2: The two random field samples, with only the domain of interest, [0, 1]×[0, 1], shown.
The following code then generates two samples from the model,
R> x = inla.qsample(n = 2, Q)
and the resulting fields are shown in Figure 2. To take any constraints specified in the spde
object into account when sampling, use
Obtaining covariances is a much more costly operation, but the function inla.qinv(Q) can
quickly calculate all covariances between neighbours (in the Markov sense), including the
marginal variances. Finally, inla.qsolve(Q,b) uses the internal R-INLA methods for solving
a linear system involving Q.
2.4. Plotting
Mesh structure
The interface supports a plot() function aimed at plotting the basic structure of a triangu-
lation mesh. By specifying rgl = TRUE, the rgl plotting system is used, which is useful in
particular for spherical domains. Variations of the following commands were used to produce
Figure 1:
R> plot(mesh)
R> plot(mesh, rgl = TRUE)
R> lines(mesh$segm$bnd, mesh$loc, add = FALSE)
Spatial fields
For plotting fields defined on meshes, one option is to use the rgl option, which supports
specifying colour values for the nodes, producing an interpolated field plot, and optionally
draw triangle edges and vertices in the same plot:
The more common option is to explicitly evaluate the field on a regular lattice, and use any
matrix-based plotting mechanism, such as image():
All the figures showing fields have been drawn using a wrapper around levelplot() from
the lattice package, which is available in the supplementary material.
The inla.mesh.projector() and inla.mesh.project() functions are here used to map between the basis function
weights for the mesh nodes and points on a regular grid, by default a 100 × 100 lattice
covering the mesh domain. The functions also support several types of projections for spherical
domains,
[Figure 3 panels; axes: Longitude, Latitude]
Figure 3: Projections of a sample from an SPDE model on a spherical domain. The left panel
uses longitude-latitude projection, and the right hand panel uses the equal area Mollweide
projection.
$$Z = \begin{bmatrix} z^{(1)} & \dots & z^{(K)} \end{bmatrix},$$
$$\eta = H(Z) = \sum_{k} h_k(z^{(k)}),$$
$$\eta^* = A\, H(Z),$$
where hk (·) are the effect mapping functions defined in Section 1.2 and specified using a
model formula. The effects are treated as named vectors, regardless of the ordering. Any
effect known to H but not present in a particular Z is treated as “no effect”, which is the
same effect as when providing NA values.
The first operation is to construct sums of predictors (shown here only for two predictors):
$$\eta^* = A_1 H(Z_1) + A_2 H(Z_2) = \tilde{A} H(\tilde{Z}),$$
$$\tilde{Z} = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix},$$
$$\tilde{A} = \begin{bmatrix} A_1 & A_2 \end{bmatrix}.$$
The joining of the $Z_1$ and $Z_2$ effect collections is performed by matching vector names, adding
NA for any missing components.
The second operation is to join predictors in sequence (only two predictors shown):
$$\eta^*_1 = A_1 H(Z_1),$$
$$\eta^*_2 = A_2 H(Z_2),$$
$$\tilde{\eta}^* = (\eta^*_1, \eta^*_2) = \tilde{A} H(\tilde{Z}),$$
$$\tilde{Z} = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix},$$
$$\tilde{A} = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix}.$$
As a post-processing step for both operations, the new covariate matrix $\tilde{Z}$ and matrix $\tilde{A}$ are
analysed to detect any duplicate rows in $\tilde{Z}$ or any all-zero columns in $\tilde{A}$, and those are by
default removed in order to minimise the internal size of the model representation.
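The two operations amount to simple block matrix manipulations. The sketch below mimics them with small dense base R matrices and illustrative effect values; it does not use inla.stack() and ignores the name matching and NA handling.

```r
# Illustrative inputs: two predictors with two rows each.
A1 <- matrix(c(1, 0, 0.5, 0.5), 2, 2, byrow = TRUE)
A2 <- diag(2)
H1 <- c(1, 3)                              # stands in for H(Z_1)
H2 <- c(10, 20)                            # stands in for H(Z_2)

# Sum: the A matrices are placed side by side and act on the stacked effects.
eta_sum <- drop(cbind(A1, A2) %*% c(H1, H2))
sum_ok <- all.equal(eta_sum, drop(A1 %*% H1 + A2 %*% H2))

# Join: the predictors are placed in sequence, giving a block-diagonal matrix.
A_join <- rbind(cbind(A1, matrix(0, 2, 2)),
                cbind(matrix(0, 2, 2), A2))
eta_join <- drop(A_join %*% c(H1, H2))
join_ok <- all.equal(eta_join, c(drop(A1 %*% H1), drop(A2 %*% H2)))

# Post-processing: drop all-zero columns together with the unused effects.
A3 <- cbind(A1, 0)                         # third column is never used
keep <- colSums(A3 != 0) > 0
A3_reduced <- A3[, keep, drop = FALSE]
```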
The syntax for a sum operation is
where each A matrix has an associated list of effects. A join operation is performed by
supplying two or more previously generated stack objects,
Any vectors specified in the data list, most importantly the response variable vector itself,
should be the same length as the predictor itself (scalars are replicated to the appropriate
length). With the help of the name-tag it also keeps track of the indices needed to map from
the original inputs into the resulting stacked representation. See Section 2.6 for an illustrating
example of using inla.stack().
Note that H(·) is conceptually defined by the model formula, which needs to mention every
covariate component present in Z and that is meant to be used.
$$y_i = \beta_0 + c_i \beta_c + x_1(s_i) + e_i,$$
$$y_{i+m} = \beta_0 + c_{i+m} \beta_c + x_2(s_i) + e_{i+m},$$
where ci is an observation-specific covariate, ei is measurement noise, and x1 (·) and x2 (·) are
the two field replicates. Note that the intercept, β0 , can be interpreted as a spatial covariate
effect, constant over the domain.
We use the basis function representation of $x(\cdot)$ to define a sparse matrix of weights $A$ such
that $x(s_i) = \sum_j A_{ij} x_j$, where $\{x_j\}$ is the joint set of weights for the two replicate fields.
If we only had one replicate, we would have $A_{ij} = \psi_j(s_i)$. The matrix can be generated
by inla.spde.make.A(), which locates the points in the mesh and organises the evaluated
values of the basis functions for the two replicates:
R> A = inla.spde.make.A(mesh,
+ loc = points,
+ index = rep(1:m, times = 2),
+ repl = rep(1:2, each = m) )
For each observation, index gives the corresponding index into the matrix of measurement
locations, and repl determines the corresponding replicate index. In case of missing observa-
tions, one can either keep this A-matrix while setting the corresponding elements of the data
vector y to NA, or omit the corresponding elements from y as well as from the index and repl
parameters above. Also note that the row-sums of A are 1, since the piece-wise linear basis
functions we use sum to 1 at each location.
Rewriting the observation model in vector form gives
$$y = \mathbf{1}\beta_0 + c\beta_c + Ax + e = A(x + \mathbf{1}\beta_0) + c\beta_c + e.$$
Using the helper functions, we can generate data using our two previously simulated model
replicates,
R> x = as.vector(x)
R> covariate = rnorm(m * 2)
R> y = 5 + covariate * 2 + as.vector(A %*% x) + rnorm(m * 2) * 0.1
The formula in inla() defines a linear predictor η as the sum of all effects, and an NA in a
covariate or index vector is interpreted as no effect. To accommodate predictors that involve
more than one element of a random effect, one can specify a sparse matrix of weights defining
an arbitrary linear combination of the elements of η, giving a new predictor vector η ∗ . The
linear predictor output from inla() then contains the joint vector (η ∗ , η). To implement our
model, we separate the spatial effects from the covariate by defining
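$$\tilde{\eta} = \begin{bmatrix} x + \mathbf{1}\beta_0 \\ c\beta_c \end{bmatrix},$$
so that now $\mathsf{E}(y \mid \tilde{\eta}) = \tilde{\eta}^*$. The bookkeeping required to describe this to inla() involves
concatenating matrices and adding NA elements to the covariates and index vectors:
$$\tilde{A} = \begin{bmatrix} A & I_{2m} \end{bmatrix}$$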
field0 = (1, . . . , n, 1, . . . , n)
field = (field0, NA, . . . , NA)
intercept = (1, . . . , 1, NA, . . . , NA)
cov = (NA, . . . , NA, c1 , . . . , c2m )
Doing this by hand with Matrix::cBind(), c(), and rep() quickly becomes tedious and
error-prone, so one can instead use the helper function inla.stack(), which takes blocks of
data, weight matrices, and effects and joins them, adding NA where needed. Identity matrices
and constant covariates can be abbreviated to scalars, with a complaint being issued if the
input is inconsistent or ambiguous.
We also need to keep track of the two field replicates, and use inla.spde.make.index(),
which gives a list of index vectors for indexing the full mesh and its replicates (it can also be
used for indexing Kronecker product group models, e.g., in simple multivariate and spatio-
temporal models). The code
The predictor information for the observed data can now be collected, using
where the tag identifier can later be used for identifying the correct indexing into the inla()
output. As discussed in Section 2.5, each “A” matrix must have an associated list of “effects”,
in this case A:(field, field.repl, field.group, intercept) and 1:(cov). The data list
may contain anything associated with the “left hand side” of the model, such as exposure E
for Poisson likelihoods. By default, duplicates in the effects are identified and replaced by
single copies (compress = TRUE), and effects that do not affect η ∗ are removed completely
(remove.unused = TRUE), so that each column of the resulting A matrix has at least one
non-zero element.
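For the two-replicate model, the index vectors field, field.repl, and field.group named above can be written out in base R. The helper below is a hypothetical stand-in written for this illustration, not the R-INLA implementation; it assumes that node indices vary fastest, then groups, then replicates.

```r
## Hypothetical base-R stand-in (not the R-INLA code) for building index vectors
make.index.sketch <- function(name, n.spde, n.group = 1, n.repl = 1) {
  out <- list(
    rep(seq_len(n.spde), times = n.group * n.repl),             # node index
    rep(rep(seq_len(n.group), each = n.spde), times = n.repl),  # group index
    rep(seq_len(n.repl), each = n.spde * n.group)               # replicate index
  )
  names(out) <- c(name, paste0(name, ".group"), paste0(name, ".repl"))
  out
}

## Toy mesh with 4 nodes and two replicates
mesh.index.sketch <- make.index.sketch("field", n.spde = 4, n.repl = 2)
str(mesh.index.sketch)
```

With a single group, field.group is constant, so only field and field.repl carry information in this example.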
If we want to obtain the posterior prediction of the combined spatial effects at the mesh
nodes, $x(s_i) + \beta_0$, we can define
$$
\begin{aligned}
\eta_p &= x + 1\beta_0 \\
\eta^*_p &= I\eta_p = A_p\eta_p
\end{aligned}
$$
We can now join the estimation and prediction stack into a single stack,
In this simple example, the second block row of $A_s$ (generating $x + 1\beta_0$) is not strictly needed,
since the same information would be available in $\eta_s$ itself if we specified remove.unused =
FALSE when constructing stack.pred and stack, but in general such special cases can be
hard to keep track of.
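One way to picture the joined system, assuming duplicated effects are merged into a shared predictor $\eta_s = \eta_e$, is as a two-row block matrix. The sketch below uses toy sizes and made-up values in base R; it shows only the linear algebra, not the internals of inla.stack().

```r
## Toy sizes and values (assumptions for illustration only)
n <- 4
m <- 3
set.seed(2)
A <- matrix(runif(2 * m * 2 * n), 2 * m, 2 * n)   # stand-in observation matrix
x <- rnorm(2 * n)
beta0 <- 10
covariate <- rnorm(2 * m)
beta.c <- 0.5

## Shared predictor: eta_s = (x + 1 beta0, c beta.c)
eta.s <- c(x + beta0, covariate * beta.c)

## Estimation block row [A  I_2m] stacked on prediction block row [I_2n  0]
A.s <- rbind(cbind(A, diag(2 * m)),
             cbind(diag(2 * n), matrix(0, 2 * n, 2 * m)))

eta.star.s <- drop(A.s %*% eta.s)
eta.star.e <- eta.star.s[seq_len(2 * m)]           # predictor at the observations
eta.star.p <- eta.star.s[2 * m + seq_len(2 * n)]   # x + beta0 at the mesh nodes
```

The prediction rows simply copy the first 2n elements of eta.s, which illustrates why that block row carries no information beyond the predictor itself in this particular case.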
We are now ready to do the actual estimation. Note that we must explicitly remove the default
intercept from the $\eta$-model, since it would otherwise be applied twice in the construction
of $\eta^*$; the constant covariate intercept is used instead:
R> formula =
+ y ~ -1 + intercept + cov + f(field, model = spde, replicate = field.repl)
R> inla.result = inla(formula,
+ data = inla.stack.data(stack, spde = spde),
+ family = "normal",
+ control.predictor = list(A = inla.stack.A(stack),
+ compute = TRUE))
The function inla.stack.data() produces the list of variables needed to evaluate the formula
and inla.stack.A() extracts the $A_s$ matrix.
Since the SPDE-related contents of inla.result can be hard to interpret, the helper function
inla.spde2.result() can be used to extract the relevant parts and transform them into more
user-friendly information, such as posterior densities for range and variance instead of raw
distributions for θ, as shown in Figure 4:
Figure 4: Posterior densities for nominal range and variance. The true values were 0.44 and
9.
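The kind of transformation behind such summaries can be sketched in base R. The formulas below assume a Matérn field on a two-dimensional domain with α = 2 (so ν = 1), with nominal range sqrt(8ν)/κ and variance 1/(4πκ²τ²). The function name and the example values are hypothetical, and this is not the code of inla.spde2.result().

```r
## Hypothetical helper: map (tau, kappa) to nominal range and variance,
## assuming d = 2 and alpha = 2, i.e., nu = alpha - d/2 = 1
tk.to.range.var <- function(tau, kappa) {
  nu <- 1
  list(range    = sqrt(8 * nu) / kappa,
       variance = 1 / (4 * pi * kappa^2 * tau^2))
}

## Made-up target values: range 0.5 and variance 2
kappa <- sqrt(8) / 0.5
tau   <- 1 / sqrt(4 * pi * kappa^2 * 2)
res   <- tk.to.range.var(tau, kappa)
```

Because both maps are monotone in the parameters, posterior densities for range and variance follow from those of the internal parameters by a change of variables.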
The posterior means and standard deviations for the latent fields can be extracted and plotted
as follows, where inla.stack.index() provides the necessary mappings between the inla()
output and the original data stack specifications:
Figure 5 shows the posterior mean and standard deviation fields as drawn with levelplot()
instead of image().
3. Further examples
The package website (Rue et al. 2013b) has several tutorials and case-studies, including an
extensive collection of examples for the SPDE models. Here we give only three brief examples
showing fundamental concepts.
Figure 5: Posterior mean (left) and standard deviation (right) for $\beta_0 + x_1(s)$.
periodic function giving the rainfall probability for each day of the year. We will model $p_t$ as
a logit-transformed Gaussian process.
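Written out, the model can be sketched as follows. The symbols $y_t$ (number of rainy days observed on calendar day $t$), $n_t$ (number of years observed for that day), and $u(t)$ (the periodic Gaussian process) are introduced here for illustration and are assumptions about the notation:

```latex
y_t \mid p_t \sim \mathrm{Binomial}(n_t,\, p_t), \qquad
\mathrm{logit}(p_t) = \log\frac{p_t}{1 - p_t} = u(t), \qquad
p_t = \frac{1}{1 + \exp\{-u(t)\}} .
```

The logit link keeps $p_t$ in $(0, 1)$ while the Gaussian process $u(t)$ is unconstrained.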
This is a simple example of using a 1D SPDE model with a 2nd order B-spline basis
representation. The desired model is an intrinsic 2nd order random walk, but since there is no
inla.spde2.intrinsic() wrapper function yet, the details are set up explicitly.
First, a 2nd order B-spline basis mesh with 24 basis functions is defined, and an appropriate
spde model is constructed. The mesh is specified to have cyclic boundary conditions,
R> data("Tokyo")
R> knots = seq(1, 367, length = 25)
R> mesh = inla.mesh.1d(knots, interval = c(1, 367), degree = 2, boundary = "cyclic")
and the prior median for τ is calculated to give a specified prior median standard deviation
when κ is fixed to a small value,
R> sigma0 = 1
R> kappa0 = 1e-3
R> tau0 = 1 / (4 * kappa0^3 * sigma0^2)^0.5
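The expression for tau0 agrees with the Matérn marginal variance formula if one assumes α = 2 on a one-dimensional domain, so that ν = α − d/2 = 3/2 with d = 1. That assumption is made here for illustration; it is not stated in the code itself:

```latex
\sigma^2
  = \frac{\Gamma(\nu)}{\Gamma(\alpha)\,(4\pi)^{d/2}\,\kappa^{2\nu}\,\tau^{2}}
  = \frac{\Gamma(3/2)}{\Gamma(2)\,(4\pi)^{1/2}\,\kappa^{3}\,\tau^{2}}
  = \frac{\sqrt{\pi}/2}{2\sqrt{\pi}\,\kappa^{3}\,\tau^{2}}
  = \frac{1}{4\,\kappa^{3}\,\tau^{2}}
\quad\Longrightarrow\quad
\tau_0 = \frac{1}{\sqrt{4\,\kappa_0^{3}\,\sigma_0^{2}}} .
```

With sigma0 = 1 and kappa0 = 1e-3 this gives tau0 = 1/sqrt(4e-9), roughly 1.58e4.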
y : measurements
station.loc : coordinates for measurement stations
station.id : for each measurement, which station index?
time : for each measurement, what time?
Figure 6: Empirical and model based Binomial probability estimates for the Tokyo rainfall
data set, with 95% posterior predictive bounds. The empirical probability estimates are the
proportion of observed rainfall days for each day of the year. Axes: Day (horizontal),
Probability (vertical).
4. Future development
The R-INLA package is in constant development, with new models added as they are needed
and developed. The current work for the SPDE models is focusing on construction of param-
eter basis functions and priors for non-stationary model parameters, as well as implementing
extensions to non-separable space-time models and more flexible boundary conditions. An
associated package excursions for computing level excursion sets with joint excursion
probabilities, as well as credible regions for contour curves, is available on CRAN (Bolin and
Lindgren 2014).
As the size of spatial and spatio-temporal models and data sets grows, iterative matrix meth-
ods and other approximation techniques for more complex models are also being investigated,
with the long-term goal of replacing the core of R-INLA to more easily handle such challenges.
Acknowledgements
The authors wish to thank their collaborators David Bolin, Michela Cameletti, Janine Illian,
Johan Lindström, Thiago Martins, Daniel Simpson, Sigrunn Sørbye, Elias Krainski, and
Ryan Yue, who have all contributed ideas and suggestions for the development of the
spatial model interface. We are also grateful to the editors and reviewers for their thoughtful
comments on the manuscript.
References
Banerjee S, Gelfand AE, Finley AO, Sang H (2008). “Gaussian Predictive Process Models for
Large Spatial Datasets.” Journal of the Royal Statistical Society B, 70(4), 825–848.
Besag J, Mondal D (2005). “First-order Intrinsic Autoregressions and the de Wijs Process.”
Biometrika, 92(4), 909–920.
Bolin D, Lindgren F (2011). “Spatial Models Generated by Nested Stochastic Partial Differ-
ential Equations, with an Application to Global Ozone Mapping.” The Annals of Applied
Statistics, 5(1), 523–550. ISSN 1932-6157.
Bolin D, Lindgren F (2014). “Excursion and contour uncertainty regions for latent Gaussian
models.” Journal of the Royal Statistical Society B. ISSN 1467-9868.
doi:10.1111/rssb.12055.
Cressie NAC, Johannesson G (2008). “Fixed Rank Kriging for Very Large Spatial Data Sets.”
Journal of the Royal Statistical Society B, 70(1), 209–226.
Lindgren F, Rue H, Lindström J (2011). “An Explicit Link Between Gaussian Fields and
Gaussian Markov Random Fields: the Stochastic Partial Differential Equation Approach.”
Journal of the Royal Statistical Society B, 73(4), 423–498. ISSN 1369-7412.
Martins TG, Simpson D, Lindgren F, Rue H (2013). “Bayesian Computing with INLA: New
Features.” Computational Statistics and Data Analysis, 67, 68–83.
Rue H, Martino S, Chopin N (2009). “Approximate Bayesian Inference for Latent Gaussian
Models using Integrated Nested Laplace Approximations (with discussion).” Journal of the
Royal Statistical Society B, 71, 319–392.
Sigrist F, Künsch HR, Stahel WA (2014). “Stochastic partial differential equation based
modelling of large space-time data sets.” Journal of the Royal Statistical Society B. ISSN
1467-9868. doi:10.1111/rssb.12061.
Simpson D, Lindgren F, Rue H (2012a). “In Order to Make Spatial Statistics Computationally
Feasible, We Need to Forget About the Covariance Function.” Environmetrics, 23(1), 65–
74. ISSN 1180-4009.
Whittle P (1954). “On Stationary Processes in the Plane.” Biometrika, 41(3/4), 434–449.
Whittle P (1963). “Stochastic Processes in Several Dimensions.” Bull. Inst. Internat. Statist.,
40, 974–994.
Affiliation:
Finn Lindgren
Department of Mathematical Sciences
University of Bath
Claverton Down
BA2 7AY, Bath, United Kingdom
E-mail: f.lindgren@bath.ac.uk
URL: http://people.bath.ac.uk/fl353/