The Value of Multiproxy Reconstruction of Past Climate
Bo Li (Vanderbilt University)
Doug Nychka (NSF National Center for Atmospheric Research)
Caspar M Ammann (NSF National Center for Atmospheric Research)
Understanding the dynamics of climate change in its full richness requires the knowledge of long temperature time series. Although long-
term, widely distributed temperature observations are not available, there are other forms of data, known as climate proxies, that can have
a statistical relationship with temperatures and have been used to infer temperatures in the past before direct measurements. We propose
a Bayesian hierarchical model to reconstruct past temperatures that integrates information from different sources, such as proxies with
different temporal resolution and forcings acting as the external drivers of large scale temperature evolution. Additionally, this method
allows us to quantify the uncertainty of the reconstruction in a rigorous manner. The reconstruction method is assessed, using a global
climate model as the true climate system and with synthetic proxy data derived from the simulation. The target is to reconstruct Northern
Hemisphere temperature from proxies that mimic the sampling and errors from tree ring measurements, pollen indices, and borehole
temperatures. The forcing series used as covariates are solar irradiance, volcanic aerosols, and greenhouse gas concentrations. The Bayesian
model was successful in integrating these different sources of information in creating a coherent reconstruction. Within the context of this
numerical testbed, a statistical process model that includes the external forcings can improve the quality of a hemispheric reconstruction
when long time scale proxy information is not available. This article has supplementary material online.
KEY WORDS: Bayesian hierarchical model; Forcings; Global climate model; Past temperature reconstruction; Proxies.
884 Journal of the American Statistical Association, September 2010
of the Earth’s climate as “truth,” and we also apply our methods to synthetic proxy data generated from this simulation. The synthetic proxy data bear the main characteristics of the real proxies.
A scientific contribution of this paper is an estimate of the added value of combining proxies with different climate retention characteristics and including external forcings series. To a statistician this is a practical exploration of different statistical designs for climate proxy and forcing datasets. Assembling a meaningful multiproxy dataset is by itself time consuming and thus an important question is whether such an effort is worth the additional accuracy in the reconstruction. A companion issue is the value of including external forcing information into the statistical procedure, especially when the forcing series also have observational errors. The inclusion of forcings can be criticized as adding nonclimatic information that should better be used in the post reconstruction to evaluate the cause of the climatic variations. However, if the forcings provide substantial improvement in the reconstruction and are included with relevant components of statistical uncertainty then they might be very useful. In particular, we are interested in determining the tradeoff between using external forcings jointly with a single proxy type or using multiple proxies. To our knowledge this is the first deliberate exploration of these statistical design/data issues for paleoclimate applications. Given the scope of these methodological issues we defer any reconstructions based on actual observations and their geophysical interpretation to a subsequent paper.
1.1 Multiple Proxies for Temperature
Different proxies preserve the climate information in different ways and therefore are sensitive to climate variables at different time scales. One might be good at short time scales, while another better at longer time scales. This is a key to our approach and the Bayesian hierarchical modeling that we employ takes advantage of such complementary skills/characteristics among different proxies. In this work we consider three widely used but distinct climate proxies: tree rings, borehole temperatures, and pollen abundance. Tree-ring width and density are perhaps the most widely distributed and generally used proxy and their relationship to seasonal temperatures has been extensively studied (Fritts 1976; Cook and Kairiukstis 1990; Schweingruber 1996). Although tree-ring measurements typically have their dating accurate to the year, their ability to encode centennial and slower climate variability is often limited by the technique used to remove nonclimatic variations in tree-ring time series (Cook et al. 1995; Briffa and Melvin 2008). Borehole depth temperature profiles directly preserve the surface temperature variability as the surface heat diffuses downward into the earth and have been recently used to characterize NH continental temperature for the past 500 years (Huang, Pollack, and Shen 2000; Harris and Chapman 2001; Chapman, Bartlett, and Harris 2004). As opposed to tree rings, borehole temperatures are only sensitive to climate variations at multidecadal or longer time scales due to the attenuation by the diffusion process. As a third proxy, pollen records provide climate information that can fill the gap between tree ring and borehole temperatures because they are considered sensitive to multidecadal variability. Although pollen records are widely distributed and have been used for reconstructing temperatures at discrete periods in the past (Guiot, Harrison, and Prentice 1993; Williams, Bartlein, and Webb 2000), they have only rarely been used in large-scale temperature reconstructions for the past millennium. For more details on these proxy characteristics, see Guiot et al. (2005).
1.2 External Drivers of Climate
Besides proxy measurements of past climate, there are also observations of the external drivers of the climate system. In this work we will focus on solar irradiance, volcanism, and greenhouse gases as primary external forcings for temperature evolution. These three forcings drive the large-scale climate variation, because the climate system has to react to any variations of these three forcings in order to restore the energy balance (Crowley 2000; IPCC 2007). A positive solar irradiance forcing tends to warm the surface, whereas a negative one tends to cool it. Volcanism often causes sudden temperature drops because the large amount of aerosols ejected by an explosive volcanic eruption into the atmosphere reduces the radiation reaching the surface. Unlike those two natural forcings, recent increase in greenhouse gases, with CO2 as a major component, is a forcing due to human activities. An important component of our statistical model is the inclusion of an empirical model for temperature that depends on these forcings. We believe that this covariate information is an important addition to that from the proxy observations (Hughes and Ammann 2009), and also distinguishes our approach from many conventional paleoclimate reconstructions.
1.3 Statistical Estimates of Past Temperatures
Most studies reconstruct the past temperature by relying on only one proxy record of a particular resolution and do not use external forcings as covariate information. For example, Mann, Bradley, and Hughes (1998), Jones et al. (1998), and Crowley and Lowery (2000) reconstructed the NH temperature based on annually recorded proxy data, with Esper, Cook, and Schweingruber (2002) and Briffa and Melvin (2008) exclusively based on tree rings. Viau et al. (2006) carried out the reconstruction by focusing on pollen only and Harris and Chapman (2001) drew inference for the past temperature primarily based on borehole. Only a few reconstruction applications have tried to integrate data from sources with very different temporal resolution (e.g., Beltrami et al. 1995; Huang 2004; Moberg et al. 2005; Haslett et al. 2006). Although they found that data integration improves the reconstruction, none of them has integrated all three types of proxies as well as the external forcings mentioned above. Furthermore, no work has been conducted to systematically investigate the role of proxies and forcings and thus provide a guide for temperature reconstruction.
We develop a Bayesian hierarchical model (BHM) to reconstruct the NH mean temperature that incorporates the information from tree ring, pollen, and borehole records altogether, and also makes use of past forcings. BHMs have been demonstrated to be a powerful method in solving complex problems in climatology, ecology, and environmetrics (e.g., Wikle et al. 2001; Berliner, Milliff, and Wikle 2003) by splitting a complicated model into three basic components: an observation level, a process level, and a level of prior information on statistical parameters. In this application the observation level relates each
Li, Nychka, and Ammann: Multiproxy Reconstruction of Past Climate 885
proxy to temperature, the process level relates temperature to the external forcings and the prior level specifies prior distributions of regression and variance parameters that tend to stabilize the problem but are otherwise uninformative. The synthesis of proxies and forcings enabled by BHMs will ideally provide more accurate reconstructions because the strength of one component can compensate for weakness of others.
Compared to the widely used regression approach (e.g., Li, Nychka, and Ammann 2007) in the temperature reconstruction, our method can avoid the possible attenuation effects caused by errors in explanatory variables (Ammann, Genton, and Li 2010), because the BHM explicitly models the measurement error in proxies. To our knowledge Haslett et al. (2006) and Lee, Zwiers, and Tsao (2008) are the two earliest statistical approaches that consider an observational model for the proxies. Haslett et al. first proposed a Bayesian reconstruction based on the fossil pollen data, but they did not consider other types of proxies or external forcings. Lee, Zwiers, and Tsao proposed a Kalman filter approach to incorporate the external driving factors, yet they did not consider combining proxies with different resolution.
1.4 Evaluation of Reconstruction Methods
In developing and testing this statistical method we take an unconventional approach in its evaluation of scientific impact. Because there is no adequate “ground truth” available for evaluating the fidelity of reconstruction approaches, we use a high-resolution, state-of-the-art climate simulation of the past 1150 years (Ammann et al. 2007) as a means for evaluating the method. Specifically we use the output from the climate model to generate synthetic proxy datasets that represent the characteristics of real-world proxies and include the forcings used in the simulation to determine how well our method can reconstruct the model temperatures. The strategy of using synthetic data from climate model output to evaluate the reconstruction methods is well established in the paleoclimatology literature (Mann and Rutherford 2002; Rutherford et al. 2003; Zorita, Gonzalez-Rouco, and Legutke 2003; von Storch et al. 2004; Ammann and Wahl 2007; Lee, Zwiers, and Tsao 2008) and is a practical solution to provide test beds that are complex but where the true temperatures are assumed known. Finally it should be noted that the climate model has substantial complexity relative to any tractable BHM and so studies using this model provide a reasonable measure of how well a statistical model can account for high dimensional and nonlinear geophysical processes with stochastic components.
1.5 Outline of Article
Section 2 gives details on the global climate model output used for evaluating the method and the external forcings series for the past 1150 years. Section 3 describes the salient features of different proxy data and how we generate the synthetic proxies. Section 4 presents the BHM for combining proxies and forcings in temperature reconstruction, and discusses several variations of the hierarchical model. Section 5 shows the results from different hierarchical models under different combinations of proxies and forcings, answers the design questions raised in the Introduction, and analyzes the identifiability of parameters. Section 6 discusses the strength and extensibility of this Bayesian hierarchical framework with respect to temperature reconstruction. Finally, all the data, R files, and supplement material involved in this paper are posted at http://www.image.ucar.edu/Data/.
2. CLIMATE MODEL OUTPUT AND RADIATIVE FORCINGS
Climate system models are large computer codes that implement the basic physical equations for fluid dynamics and for thermodynamics to describe the motion of the atmosphere, ocean, and sea ice and their interaction with the land. The models are highly nonlinear and expressed in a differential form where the state of the climate system is stepped from one time point to the next over a short time interval by solving a large system of coupled partial differential equations. The model simulations are started by initial conditions of the ocean, atmosphere, and sea ice and are then subsequently driven by internal variability that is modulated or changed over time by external forcings such as solar irradiance, volcanic aerosols, and greenhouse gas concentrations. In addition, topography and an annual cycle of land cover are prescribed as boundary conditions for the atmosphere over land. Averaging the results of the model over a specific time period provides an estimate of the climate of the model.
The climate simulation used in this work (Ammann et al. 2007) is a run of the National Center for Atmospheric Research (NCAR) Community Climate System Model (CCSM) Version 1.4 (Boville et al. 2001; Otto-Bliesner and Brady 2001). The model atmosphere and land components were configured with a resolution of 3.75° × 3.75° or 400 km × 400 km, a 3° resolution in the ocean with meridional resolution of <1° at the equator. Details of this experiment can be found in Ammann et al. (2007) and at http://www.cesm.ucar.edu. Its simulated global annual temperature during 850–1999 is shown in Figure 1. A near hockey stick shape of this temperature series results from the increased forcings due to greenhouse gases. This shape, while a realistic representation of how temperature has increased, can cause a bias in the reconstruction and is discussed in the section on numerical results.
The three external forcings we consider here are solar irradiance, volcanism, and greenhouse gases. Those are also the key forcings used in the simulation by Ammann et al. (2007). For simplicity, we have not separated out tropospheric aerosol, as its variation after 1870 is similar to the greenhouse gases. The solar irradiance series is a reconstruction by Bard et al. (2000) and is derived from measurements of fluctuations of $^{10}$Be production rates, which are modulated by solar magnetic variability. The volcanic series is based on a synthesis of individual ice cores (see http://www.ncdc.noaa.gov/paleo/icecore.html) and in some cases on historical records of large eruptions. It is then transformed to be an estimate of volcanic sulfate aerosol mass (Ammann et al. 2007) whose amplitude reflects the radiation reduction. The change in greenhouse gases and its forcing prior to the 1950s is derived from air bubbles in ice cores; direct measurements exist subsequently. We use carbon dioxide as a simple representation because other gases changed very similarly, albeit with different relative concentration. The shape of the carbon dioxide series is dominated by an increase, first slow and then rapid, since the beginning of the 19th century. Figure 1 illustrates the three forcing series.
Figure 1. The diagram of three forcings. The grey curve is the NH temperature, and among the black curves (a) is the volcanism, (b) is the
solar irradiance, and (c) is the greenhouse gases. All the curves are scaled in order to show them clearly in one figure.
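Since the curves in Figure 1 are rescaled for display, it may help to see why the absolute scale of a forcing series does not matter once a regression coefficient is estimated for it. The following is a minimal sketch using ordinary least squares with a single covariate; it is a stand-in for illustration, not the Bayesian model of Section 4, and the toy numbers and the function name `ols_fit` are assumptions.

```python
def ols_fit(x, y):
    """Least-squares intercept and slope for y ~ a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy forcing and temperature values (hypothetical, for illustration only).
forcing = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
temperature = [0.1, 0.3, 0.8, 0.9, 1.4, 1.6]

# Fit with the forcing on its original scale and on a scale ten times larger.
a1, b1 = ols_fit(forcing, temperature)
scaled = [10.0 * f for f in forcing]
a2, b2 = ols_fit(scaled, temperature)

# The slope absorbs the change of scale, so the fitted temperatures agree.
fitted1 = [a1 + b1 * f for f in forcing]
fitted2 = [a2 + b2 * f for f in scaled]
print(b1, b2)
```

The slope estimated from the rescaled series is one tenth of the original slope, and the two sets of fitted temperatures coincide up to rounding.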
Because the BHM will estimate the regression relationship between these variables and temperature, the individual scales of these series will not affect the reconstruction. This is useful as some of the uncertainty or error in these series is attributed to the absolute scale. For example, most discrepancies among different solar irradiance estimates are due to different scaling (Bard et al. 2000), which has little effect on our approach. The year-to-year variation in the greenhouse effect is very small for centuries before about 1800 and will have negligible effect because the key impact will more likely occur at the end of the time series. The estimates of volcanism, however, bear uncertainty in individual events that can reach 50% of the magnitude (Rind 1995 and Zielinski 2000), and thus need to be taken into account by a statistical model.
3. CLIMATE PROXIES OBSERVATIONS AND GENERATING SYNTHETIC PROXIES
3.1 Tree Rings
Variation in the width and density of tree rings represents the most widely known climate proxy. Moreover, the wide geophysical distribution of trees makes them well suited to high-resolution paleoclimate research. As a result of strong replication within and between specific sites and regions, and careful checking for common patterns, dating is effectively absolute, that is, accurate to the year. However, the potentially limited ability of capturing centennial and slower climate variability might inhibit the tree-ring networks from representing climate uniformly across the frequency spectrum, and hence the tree-ring-based reconstruction should be viewed with caution for time scales ranging from multidecadal to multicentennial (Cook et al. 1995).
Synthetic Tree-Ring Observations. To generate synthetic but realistic tree rings from the climate model simulation, we first select 15 local temperature series from the CCSM output and then generate 15 pseudo tree-ring records by a high pass filter. Specifically, we subtract the 10-year smoothing average from each of the local temperature series to give a filtered result. This might be an extreme approach for representing tree-ring information that perhaps in reality will not occur so drastically. However, the high frequency representation of tree rings in general has been well studied and accepted in the dendrochronology community (e.g., Cook et al. 1995 and Briffa et al. 1996), and the strategy of treating tree rings as informative only at high frequencies was originally employed by Moberg et al. (2005). It therefore serves the purpose of our study, which evaluates the capability of our method in combining different proxies. Figure 2 shows the 15 locations and Figure 3 gives an example of generated tree rings.
3.2 Borehole Temperatures
Temperature-depth profiles measured in boreholes contain a record of surface temperature changes due to the thermal diffusion in the Earth, hence they provide a means to directly estimate the past temperatures through inversion of the down-core temperature profile after taking the natural geothermal gradient into account (e.g., Beltrami 2001). However, the borehole profile itself is unable to recover the surface temperatures at annual resolution, because the ground essentially behaves as a low pass filter only retaining the long term trends of climate. In particular, the filtering becomes more and more severe with time further back and thus smears out the potentially recoverable temperatures. As a result, borehole temperatures are only sensitive to climate variations at multidecadal or longer time scales (Gonzalez-Rouco, von Storch, and Zorita 2003; Mann and Schmidt 2003; Chapman, Bartlett, and Harris 2004; Huang 2004). Recently, borehole-derived temperature estimates have attracted much attention because several studies have systematically pointed towards a larger temperature change since AD 1500 over the NH continents compared to the other reconstructions (e.g., Huang, Pollack, and Shen 2000).
Synthetic Borehole Observations. Unlike generating synthetic tree-ring proxies using individual temperature series, we generate borehole data based on five regional composite temperature series. These five composite temperature series are the local average of model temperature output over five 20° × 20° squares as shown in Figure 2. The distribution of those locations reasonably represents the spread of real borehole data (Huang, Pollack, and Shen 2000). Due to the complexity of the physical process in forming the borehole profile, the algorithm
Figure 2. Pseudo-proxy sampling locations in the Community Climate System Model. The online version of this figure is in color.
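The regional composites sampled at the locations in Figure 2 are local averages of gridded model temperature over latitude-longitude squares. The sketch below shows one plausible way to form such a box average. The cosine-of-latitude area weighting, the regular toy grid, and the function name `box_average` are assumptions made for illustration; the text does not state how the averages were weighted.

```python
import math

def box_average(field, lats, lons, lat_bounds, lon_bounds):
    """Area-weighted mean of field[i][j] over cells whose centers lie in the box."""
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not (lat_bounds[0] <= lat <= lat_bounds[1]):
            continue
        # Grid cells shrink toward the poles, so weight each row by cos(latitude).
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_bounds[0] <= lon <= lon_bounds[1]:
                total += weight * field[i][j]
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no grid cells fall inside the box")
    return total / weight_sum

# Toy regular grid with 3.75 degree spacing (the real model grid may differ).
lats = [-88.125 + 3.75 * i for i in range(48)]
lons = [3.75 * j for j in range(96)]

# Toy field whose value equals the latitude of the cell center.
field = [[lat for _ in lons] for lat in lats]

# Average over a 20 x 20 degree square.
composite = box_average(field, lats, lons, (40.0, 60.0), (0.0, 20.0))
print(composite)
```

Applying the same function to the temperature field of each year would give one composite time series per square.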
to generate the borehole depth temperature is not straightforward. Therefore, we follow the modern preobservational mean-surface air temperature (POM-SAT) model which was originally derived by Carslaw and Jaeger (1959) and recently discussed by Harris and Chapman (2001) and Harris (2007) to simulate the borehole profiles up to 500 m. The value of the profile at every 5 m depth interval is considered the synthetic depth temperature. The POM-SAT model basically describes the diffusion process of surface temperature given an appropriate initial condition while having the attenuation of a thermal perturbation with respect to depth taken into account. This model has proven to be consistent with all diffusion processes (Harris 2007). To illustrate the characteristic of the borehole data, we show in Figure 4 artificial examples of how a 1000-year-long constant time series with a perturbation at different ages is represented in the borehole temperature profile. As expected for a diffusion process, the older a surface perturbation, the more smeared out it is in the borehole profile. For example, the temperature profile corresponding to the time series with the earliest pulse is the most flat among the five due to the severe smearing of this pulse, whereas the profile corresponding to the latest pulse displays a clear perturbation. As a consequence, the flat profile loses the most information about the exact timing, the nature and duration of the pulse.
3.3 Pollen Indices
Pollen assemblages retain a smoothed record of climate variation due to the persistence properties of mature plants (Brown et al. 2005), even where the data itself might be available at higher resolution. Since fossil pollen records possess skills in recovering temperatures at bidecadal to semicentennial time resolution, they provide climate information that can fill the gap between tree rings and borehole data (Bradley 1999).
Synthetic Pollen Observations. Similar to the procedure of generating synthetic borehole proxies, we select 10 regional composite temperature series as the local average of 7.5° × 7.5° squares to generate pollen data. The locations of those squares are also shown in Figure 2. Because pollen, as discussed above, carries information only at multidecadal temporal resolution, we
Figure 3. An example of synthetic tree rings and synthetic pollen together with the temperature (grey curve). The tree rings are represented
by the upper black curve, and the pollen is shown by dots which are observed at every 30 years. The black curve with dots embedded is the
10-year smoothing average of the temperature.
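The two transformations pictured in Figure 3 can be sketched in a few lines: pseudo tree rings are the temperature series minus its 10-year smoothing average, and pseudo pollen is the 10-year average sampled every 30 years. This is a minimal sketch under stated assumptions. The exact smoother is not specified in the text, so a running mean truncated at the series ends is assumed here, and the toy temperature series and function names are hypothetical rather than taken from the authors' R files.

```python
import math
import random

def moving_average(series, window=10):
    """Running mean over `window` points; the window is truncated at the ends."""
    half = window // 2
    smoothed = []
    for t in range(len(series)):
        lo = max(0, t - half)
        hi = min(len(series), t - half + window)
        chunk = series[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def pseudo_tree_ring(series, window=10):
    """High-pass filter: the series minus its running mean."""
    smooth = moving_average(series, window)
    return [x - s for x, s in zip(series, smooth)]

def pseudo_pollen(series, window=10, step=30):
    """Low-frequency proxy: the running mean sampled every `step` years."""
    smooth = moving_average(series, window)
    return [(t, smooth[t]) for t in range(0, len(smooth), step)]

# Toy local temperature: slow oscillation plus yearly noise (not CCSM output).
rng = random.Random(0)
years = 1150
temperature = [
    0.3 * math.sin(2.0 * math.pi * t / 400.0) + rng.gauss(0.0, 0.2)
    for t in range(years)
]

tree_ring = pseudo_tree_ring(temperature)
pollen = pseudo_pollen(temperature)
print(len(tree_ring), len(pollen))
```

Both operations are linear in the temperature series, which is why each can be written as a fixed transformation matrix applied to the temperature vector.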
Figure 4. The borehole profile corresponding to the temperature series with a pulse at different time locations. This illustrates that the older
a surface perturbation, the more smeared out in the borehole profile due to the diffusion. The online version of this figure is in color.
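The smearing shown in Figure 4 can be reproduced qualitatively with the textbook solution for heat conduction into a uniform half-space, where a unit step in surface temperature that began t years ago leaves an anomaly erfc(z / (2 sqrt(kappa t))) at depth z. The sketch below is not the POM-SAT code used for the synthetic boreholes. The diffusivity value, the rectangular pulse shape, and the function names are assumptions chosen for illustration.

```python
import math

KAPPA = 31.5  # assumed thermal diffusivity in m^2 per year (about 1e-6 m^2/s)

def step_response(depth_m, years_ago, kappa=KAPPA):
    """Anomaly at depth from a unit surface step that began `years_ago` years ago."""
    if years_ago <= 0:
        return 0.0
    return math.erfc(depth_m / (2.0 * math.sqrt(kappa * years_ago)))

def pulse_profile(start_years_ago, end_years_ago, amplitude=1.0, max_depth=500, dz=5):
    """Present-day anomaly profile left by a rectangular surface pulse."""
    depths = range(0, max_depth + dz, dz)
    # A pulse is a step switched on at the start minus a step switched on at the end.
    return [
        amplitude * (step_response(z, start_years_ago) - step_response(z, end_years_ago))
        for z in depths
    ]

# Two 50-year pulses of equal size, one old and one recent, on a 5 m depth grid.
old = pulse_profile(950, 900)
recent = pulse_profile(150, 100)
print(max(old), max(recent))
```

The older pulse leaves a much smaller and broader anomaly than the recent one, which is the loss of timing and amplitude information described for the borehole proxy.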
therefore mimic a pollen assemblage by sampling a 10-year average temperature series at 30-year intervals. The strategy of having pollen contain only lower frequency information than tree rings is analogous to the wavelet decomposition in Moberg et al. (2005). Figure 3 gives an example of such a generated pollen series.
Note that the above three temperature transformations in Sections 3.1 to 3.3 for generating proxies determine the three transformation matrices $M_D$, $M_P$, and $M_B$ in Section 4. More details of synthetic proxies generation can be seen in the supplement and the corresponding R files.
3.4 Noise in Proxies
Real world proxies are expected to contain extra noise in addition to the uncertainty between temperature at local scales and the hemispheric average. In order to assess the sensitivity of our approach to the noise in proxies, we additionally synthesize another set of proxies with an error component by adding white noise to the local/regional temperature series before they are processed to generate synthetic tree rings, pollen, and borehole. The variance of the noise is chosen to give a signal to noise ratio of 1:4 that conservatively reflects the expected precision in actual data (Mann et al. 2005). The reason that we add random perturbation to the original temperature rather than directly to the synthetic proxies is to preserve the smooth profile of pollen and borehole temperatures, and it more realistically represents the local climate noise that subsequently carries over into the proxy archives.
4. BAYESIAN HIERARCHICAL MODEL
BHMs split a complicated model into three basic components. The data model occupies one level of the hierarchy, while the process model resides below it. Typically, a third hierarchical level contains statistical models, also called priors, for unknown parameters that include additional physical information. The levels are formally generated by a series of conditioning steps where one level is conditioned on knowledge of the levels below it. The reader is referred to Banerjee, Carlin, and Gelfand (2004) for an introduction to BHMs. Let $[X, Y]$ denote the joint probability density function of the random variables $X$ and $Y$ and $[X \mid Y]$ the conditional density of $X$ given $Y$. Now let $P$ denote proxy observations, $T$ the NH temperature process and $\theta$ a set of statistical parameters that are involved in specifying the joint distribution of $P$ and $T$. The model specification is precisely the joint distribution $[P, T, \theta]$. This form can be built from the product of conditional distributions:
$$[P, T, \theta] = [P \mid T, \theta]\,[T \mid \theta]\,[\theta].$$
Regarding the paleoclimate reconstruction problem an adumbration of the hierarchical levels is (i) Data stage $[P \mid T, \theta]$, (ii) Process stage $[T \mid \theta]$, (iii) Priors $[\theta]$. Level (i) focuses on modeling statistical errors of the observed data and presents the likelihood of the observed proxies given the true temperature process, while level (ii) models the temperature process from the physical perspective. Level (iii) gives prior distributions of the unknown parameters and closes the hierarchy.
4.1 Full Model
Let $D_i$, $P_j$, and $B_k$ be vectors of synthetic tree ring (Dendrochronology), Pollen, and Borehole data indexed by their various locations. Note that these groups of proxy vectors will have different lengths due to their sampling. Moreover, each tree ring and pollen vector is indexed with respect to time, and the borehole vectors are indexed by depth. Let $S$, $V_0$, and $C$ be the
time series vectors of solar irradiance, volcanism, and greenhouse gases, and let $V$ denote the volcanic series with error. Also let $\mathbf{1}$ denote a vector of ones with a generic length which will be determined individually by the conformable condition according to the local context.
Let $M_D$, $M_P$, and $M_B$ be the three transformation matrices to link temperature series to tree ring, pollen, and borehole, respectively. Here we assume that those transformations are known. Although the real relationship between proxies and temperatures can be more complex than such a linear transformation, those three matrices are derived from the main characteristics of tree ring, pollen, and borehole data described in Section 3. More specifically, they represent the linear filters used to generate the corresponding pseudo proxies from the model temperature series. For example, $M_B$ is derived from the POM-SAT model that is used to generate synthetic borehole profiles. Note that those matrices can be easily updated if more precise working models between proxies and temperatures are developed. Finally it is useful to partition the full-length temperature process $T$ into the unknown temperatures $T_1$ requiring reconstruction over the time span of available proxy data, and the observed instrumental temperatures $T_2$ (1850–present), that is, $T = (T_1', T_2')'$. Then we have the data and process models below:
(i) Data stage:
$$D_i \mid (T_1', T_2')' = \mu_{iD}\mathbf{1} + \beta_{iD} M_D (T_1', T_2')' + \epsilon_{iD}, \qquad \epsilon_{iD} \sim \mathrm{AR}(2)(\sigma_D^2, \phi_{1D}, \phi_{2D}), \tag{4.1}$$
$$P_j \mid (T_1', T_2')' = \mu_{jP}\mathbf{1} + \beta_{jP} M_P (T_1', T_2')' + \epsilon_{jP}, \qquad \epsilon_{jP} \sim \mathrm{AR}(2)(\sigma_P^2, \phi_{1P}, \phi_{2P}), \tag{4.2}$$
$$B_k \mid (T_1', T_2')' = M_B \{\mu_{kB}\mathbf{1} + \beta_{kB} (T_1', T_2')' + \epsilon_{kB}\}, \qquad \epsilon_{kB} \sim \text{iid } N(0, \sigma_B^2), \tag{4.3}$$
$$V \mid V_0 = (1 + \epsilon_V) V_0, \qquad \epsilon_V \sim \text{iid } N(0, 1/64). \tag{4.4}$$
(ii) Process stage:
$$(T_1', T_2')' \mid (S, V_0, C) = \beta_0 \mathbf{1} + \beta_1 S + \beta_2 V_0 + \beta_3 C + \epsilon_T, \qquad \epsilon_T \sim \mathrm{AR}(2)(\sigma_T^2, \phi_{1T}, \phi_{2T}). \tag{4.5}$$
The target is to estimate $T_1$ given $T_2$, the proxies, and the forcings. Forward models (4.1) to (4.3) describe the statistical relationship between proxies and the true temperature process. Those models assume a stationary linear relationship between a local proxy and NH temperature over time and also conditional independency between proxies given true temperatures. The special form of model (4.3) respects the smooth feature of the borehole profile by applying the smooth filter $M_B$ also to the error term $\epsilon_{kB}$, because as opposed to tree rings and pollen assemblage the real borehole profile is a smooth curve. Proxies of the same type but from different locations, such as the 15 tree rings, 10 pollens, or 5 boreholes generated in Section 3, are allowed to have different regression coefficients including the intercepts and slopes, but they all share the same parameters
before being used, which is a standard method in paleoclimatology (Bradley and Jones 1993; Osborn and Briffa 2006).
Model (4.4) accounts for the uncertainty in the volcanism, which is estimated to be within ±25% of the magnitude of the individual volcanic pulses themselves. Model (4.5) brings the physical understanding of temperature evolution based on energy balance theory into the reconstruction. Although our BHM contains only linear models, we could of course replace those linear models by more complicated ones. However, we found that the linear models suffice for our data, and the problem at hand, that is, reconstructing NH mean temperature (Mann et al. 2008). Our experience with related data suggests that an AR(1) structure is likely sufficient but we use an AR(2) model to provide additional flexibility to accommodate the possibility of more complex dependence in our application. An $\mathrm{AR}(2)(\sigma^2, \phi_1, \phi_2)$ process is defined as $e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \epsilon_t$, $\epsilon_t \sim \text{iid Normal}(0, \sigma^2)$. The choice of priors and the justification for the linear models with AR(2) error structure can be seen in the supplement.
4.2 Sampling From the Posterior Distribution
The specific choice of priors for time lag coefficients $(\phi_{1L}, \phi_{2L})$ with $L \in \{D, P, T\}$ guarantees their corresponding AR(2) process to be stationary and causal (Shumway and Stoffer 2006, chapter 3), and the conjugate priors for all the other parameters allow for an explicit full conditional posterior distribution for those parameters and $T_1$. There is no closed form for the posterior distribution of time lag coefficients. Thus the posterior is sampled by alternating the Gibbs sampler, which is used for updating $T_1$ and parameters with explicit full conditional distribution, and the Metropolis–Hastings (M–H) algorithm, which is used for updating autoregressive parameters. More specifically, we generate posteriors by Gibbs sampling for $T_1$, $V_0$, $(\mu_{iL}, \beta_{iL})$ with $L \in \{D, P, B\}$, $\beta_i$ with $i = 0, 1, 2, 3$ and $\sigma_L^2$ with $L \in \{D, P, B, T\}$, and generate posteriors by M–H for $(\phi_{1L}, \phi_{2L})$ with $L \in \{D, P, T\}$.
Whenever the M–H algorithm is used, the acceptance rate is tuned to be roughly between 25% and 50% to secure adequate mixing of posterior samples (Gelman, Roberts, and Gilks 1996). We choose the hyperparameters so that the prior means of $\mu_{iL}$ and $\beta_{iL}$ are 0 and 1, respectively, for $L \in \{D, P, B\}$, because this represents the ideal case when the local/regional temperatures are not biased against the NH temperature. For a similar reason, the prior means of $\beta_i$ are set to be 0, 1, 1, 1 for $i = 0, 1, 2, 3$. We found the results are robust to different choices of those hyperparameters. In order to let the data determine the final estimates of the regression coefficients, we choose a relatively wide prior variance of 1 for each $\mu_{iL}$ and $\beta_{iL}$ with $L \in \{D, P, B\}$ and for each $\beta_i$ with $i = 0, 1, 2, 3$, to make the priors less informative. The hyperparameters $(q_L, r_L)$ with $L \in \{D, P, B, T\}$ are set to be (3, 1), which corresponds to relatively vague prior knowledge. The convergence check by starting sampling from different sets of initial values indicates that $\beta_0$, $\beta_2$, which is the scale parameter for zero-inflated volcanism, and $\sigma_T^2$ in the process model converge less well than the others. However, our main interest, the temperature reconstructions, appears to be very insensitive to different initial values (see the supplement for details).
4.3 Simplifications of the Full BHM
In order to answer questions raised in the Introduction, we
in the error process to retain the parsimony of the whole model. consider several simplifications of the full model. Here we list
This restriction is reasonable if the proxies are first standardized the different factors that will figure into our numerical study.
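As an illustration of the Metropolis–Hastings update described in Section 4.2, the following is a minimal sketch for the AR(2) coefficients of a single error series. It is not the authors' R implementation. It assumes a uniform prior on the stationary and causal triangle, a known innovation variance, a Gaussian likelihood conditional on the first two values, and a random-walk proposal. The function names (`is_stationary`, `ar2_loglik`, `mh_ar2`, `simulate_ar2`) and the step size are hypothetical.

```python
import numpy as np

def is_stationary(phi1, phi2):
    """Stationary/causal region of an AR(2) process (the usual triangle)."""
    return (phi2 > -1.0) and (phi1 + phi2 < 1.0) and (phi2 - phi1 < 1.0)

def ar2_loglik(e, phi1, phi2, sigma2):
    """Gaussian log-likelihood of e[2:], conditional on the first two values."""
    innov = e[2:] - phi1 * e[1:-1] - phi2 * e[:-2]
    n = innov.size
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - 0.5 * np.sum(innov**2) / sigma2

def mh_ar2(e, sigma2, n_iter=5000, step=0.05, seed=0):
    """Random-walk M-H for (phi1, phi2); prior assumed uniform on the triangle."""
    rng = np.random.default_rng(seed)
    phi = np.array([0.0, 0.0])
    ll = ar2_loglik(e, phi[0], phi[1], sigma2)
    draws = np.empty((n_iter, 2))
    accepted = 0
    for t in range(n_iter):
        prop = phi + step * rng.standard_normal(2)
        # Proposals outside the triangle have zero prior mass and are rejected.
        if is_stationary(prop[0], prop[1]):
            ll_prop = ar2_loglik(e, prop[0], prop[1], sigma2)
            if np.log(rng.uniform()) < ll_prop - ll:
                phi, ll = prop, ll_prop
                accepted += 1
        draws[t] = phi
    return draws, accepted / n_iter

def simulate_ar2(n, phi1, phi2, sigma, seed=1):
    """Simulate an AR(2) series started from zeros (toy data only)."""
    rng = np.random.default_rng(seed)
    e = np.zeros(n)
    for t in range(2, n):
        e[t] = phi1 * e[t - 1] + phi2 * e[t - 2] + sigma * rng.standard_normal()
    return e

# Toy usage: recover the coefficients of a simulated error series.
e = simulate_ar2(1000, 0.5, 0.2, 1.0)
draws, rate = mh_ar2(e, sigma2=1.0)
print("acceptance rate:", rate)
print("posterior mean after burn-in:", draws[1000:].mean(axis=0))
```

In practice the `step` argument would be adjusted until the reported acceptance rate falls in the 25% to 50% range mentioned above. In the full sampler this update would alternate with the Gibbs draws for the remaining parameters.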
890 Journal of the American Statistical Association, September 2010
Figure 5. (a) Bias, (b) variance, and (c) rmse of the reconstructions for five data models and $2^3$ scenarios that are combinations of with/without forcings, with/without noise, and modeling $T_1$/$T$ in the process stage. “C” and “F” are the reconstructions without forcings (with constant mean function) and with forcings incorporated, respectively. The online version of this figure is in color.
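The summaries in Figure 5, together with the coverage of the 95% uncertainty band, can be computed from posterior draws roughly as sketched below. This is a minimal sketch under stated assumptions: bias is taken as the time-averaged error of the posterior-mean reconstruction, variance as the spread of that error over time, and rmse as the root mean squared error, so that rmse² = bias² + variance. The exact definitions used for the figure are not restated in this section, and the function name `reconstruction_scores` and the toy data are hypothetical.

```python
import numpy as np

def reconstruction_scores(samples, truth, level=0.95):
    """Score a reconstruction.

    samples: array (n_draws, n_years) of posterior draws of temperature.
    truth:   array (n_years,) of target temperatures.
    """
    post_mean = samples.mean(axis=0)
    err = post_mean - truth
    bias = err.mean()
    variance = err.var()                 # population variance of the error over time
    rmse = np.sqrt(np.mean(err**2))
    alpha = 1.0 - level
    lower = np.quantile(samples, alpha / 2.0, axis=0)
    upper = np.quantile(samples, 1.0 - alpha / 2.0, axis=0)
    coverage = np.mean((truth >= lower) & (truth <= upper))
    return {"bias": bias, "variance": variance, "rmse": rmse, "coverage": coverage}

# Toy data (assumed, not from the paper): a smooth target and a biased, noisy posterior.
rng = np.random.default_rng(0)
n_years, n_draws = 200, 500
truth = 0.3 * np.sin(np.linspace(0.0, 6.0 * np.pi, n_years))
center = truth + 0.1 + 0.05 * rng.standard_normal(n_years)
samples = center + 0.2 * rng.standard_normal((n_draws, n_years))

scores = reconstruction_scores(samples, truth)
print(scores)
```

The identity rmse² = bias² + variance under these definitions is what makes the bias and variance panels directly comparable with the rmse panel.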
Skill of Proxies. We focus on the five data models (T, D, DP, DB, DBP) where forcings are absent, proxies are not subject to noise, and a constant mean process model is assumed for $T = (T_1', T_2')'$ rather than only $T_1$, to study the contribution of each type of proxy. This corresponds to the “C” points in the leftmost panels of Figures 5(a) to (c), and the patterns in those plots roughly imply the role of each type of proxy. To illustrate more clearly the benefit of incorporating a particular type of proxy, we compare the spectra of the reconstruction residuals from the five different data subsets. The residual spectrum at a given frequency measures the component of variation that the reconstruction missed at that frequency. Figure 6 shows that the spectra based on data models (D) and (DB) look similar and that the spectra based on data models (DP) and (DBP) are hardly distinguishable, although the latter two have smaller power at low frequencies due to the involvement of pollen proxies in the reconstruction. Since pollen was sampled every 30 years from a 10-year smoothing average of temperatures, it would be expected to retain the variability at around a 30-year period. This is verified by the observation that the spectra of (DP) and (DBP) depart from the spectra of (D) and (DB) at about a 30-year period after an agreement at high frequencies. Borehole information does not appear to contribute much because the spectrum only describes the variation whereas the information therein is smeared too much to recover any detail of temperature evolution other than a long term trend. However, as seen in Figure 5(a), the borehole does help to reduce the bias a bit. Overall, compared to the oracle proxies (T), the reconstructions using other proxies perform well at high frequency but are less precise at low frequency, as part of the low frequency information is lost in pollen and borehole.

Figure 6. Smoothed spectra of the reconstruction residuals from the five data models, illustrating the frequency bands at which proxies capture the variation of the temperature process. Both axes are plotted on a logarithmic scale. The online version of this figure is in color.

5.2 Other Inferences

Cause of Bias. It can be seen from Figure 5(a) that the reconstruction when $T = (T_1', T_2')'$ is modeled in the process stage carries some systematic positive bias. This positive bias is pronounced in cases where forcings or information at longer time scales is not available, that is, $T$ is modeled as an AR(2) with constant mean, which corresponds to the “C” points in the figure, and no pollen is used in the reconstruction. The positive bias arises because the observed $T_2$, which serves as the primary source to estimate the mean of $T$, has a higher mean temperature than $T_1$ but was nonetheless assumed to have the same mean function as $T_1$ in the model. Therefore, after we remove this a priori assumption and only leave $T_1$ in the process stage as described in the partial temperature process model in Section 4.3, the positive bias is largely reduced. Another way to reduce the bias is to incorporate external forcings, since this enables the temperature process to be estimated by its dependency on forcings. In this way the difference in mean functions is accounted for by the varying external forcings.

If bias is the primary concern, one should consider the reconstruction that is based on modeling only $T_1$ in the process stage, though one would have to accept that it will carry more variance. Note that the oracle reconstruction is not necessarily the optimal reconstruction just in terms of bias. We can see that some reconstructions have even lower bias than their corresponding oracle experiment. Yet due to the bias–variance trade-off, the oracle reconstruction is the best in terms of rmse. This is particularly visible when realistic noise is applied to the proxies.

Sensitivity to Noise in Proxies. Not surprisingly, the noise in proxies introduces both bias and more variability in the reconstructions. We can see from Figure 5 that with the substantial amount of noise added to the proxies, the performance of the reconstruction deteriorates some but not terribly. It is worth noting that our results are only based on one set of contaminated proxies. Given different noise, the performance is slightly different, but does not appreciably change the basic conclusion.

5.3 Posterior Samples of the Reconstructed Temperature

We select model (DP) with forcings to show examples of the reconstruction under three scenarios. Figure 7(a) is for modeling $T$ and no noise, Figure 7(b) for modeling $T_1$ and no noise, and Figure 7(c) for modeling $T$ and with noise. To make this comparison clearer we report posterior reconstructions for decadal average temperature. In all those figures, the 95% uncertainty band calculated from the posterior samples is also displayed together with the reconstruction. In Figure 7, all the reconstructions follow the trend of the target very well, although they seem to miss some details. Comparison between panels (a) and (b) in this figure shows that modeling only the unknown $T_1$ in the process stage can effectively reduce the bias caused by assuming $T_1$ and $T_2$ to have the same mean function. The difference between panels (a) and (b) and panel (c) illustrates the larger bias and wider uncertainty band introduced by noise. In panels (a) and (b), the uncertainty band covers the target temperatures fairly well, while in panel (c) the coverage deteriorates due to the bias.

Figure 7. The reconstructions using tree rings and pollen together with forcings in three scenarios. (a) modeling $T$ and without noise; (b) modeling $T_1$ and without noise; (c) modeling $T$ and with noise. The grey area is the 95% uncertainty band of the reconstruction.

In addition, we formally assess the model adequacy with posterior samples. Using the criterion of verification rank histogram and coverage probability of posterior distributions as in Gel, Raftery, and Gneiting (2004), we assess our models by comparing the performance of the reconstructions that are based on synthetic proxies to the reference reconstruction that is
based on “oracle” proxies. This is because the reference reconstruction reaches the capacity of those “oracle” proxies (local temperatures) in recovering the NH temperature and thus can serve as a baseline to evaluate the models for synthetic proxies. The results show no evidence of inadequacy of those models (see details in the supplement).

5.4 Identifiability of Parameters

We examine the posterior distribution of parameter estimates and compare them to their corresponding priors to make sure that the priors have only little influence on the parameter estimates. We focus on the model (DBP) with forcings included and with proxy errors, because this case has the most complex setting and provides the greatest challenge in determining parameters. We found that, in general, the posteriors are not sensitive to the priors, as suggested by the relatively small variance of the posteriors compared to their priors. However, compared to the stable estimates for regression coefficients for tree rings, the estimates for borehole data contain more uncertainties. Moreover, the BHM is unable to resolve the variance parameter in the borehole model. We conjecture that this is because the transformation matrix $M_B$ is an ill-posed matrix that has a small effective rank, on the order of 6 degrees of freedom; hence there is no way to fully recover the temperature information from the borehole profile. This essentially has been illustrated in Figure 4. Despite the difficulty in estimating some of the parameters related to the borehole proxy data, the resulting reconstructions are about the same as (DP) combinations or slightly better.

6. DISCUSSION AND CONCLUSIONS

This paper has proposed a new application of BHMs to reconstruct NH temperatures by jointly using proxy data with different temporal resolutions, and using forcings as external covariates of temperature evolution. With this method we investigated the benefits of combining different proxies and of including the forcings. Our results showed that a process model that includes the external forcings in the form of an energy balance can dramatically improve the reconstruction, particularly if the applied proxy data are deficient in decadal or centennial scale variability. In our numerical study this improvement can be by a factor of 2 in rmse. However, its role can be partially replaced by our hypothetical pollen proxy that fills this range. Tree rings play a significant role in retaining the high frequency variability, while pollen improved the reconstruction remarkably by capturing the variation at the lower frequency band. These results make a case for attempting multiproxy reconstructions with tree rings and pollen assemblages and also including external forcing covariates. Although we base these conclusions on a synthetic Monte Carlo experiment, the climate model simulation used as truth is a complex and extensive representation of the actual climate system and so provides confidence that these results will extend well to real world conditions.

Pollen proxies are usually collected from sediment layers and thus are possibly subject to dating errors, that is, the date during which pollen was formed or the age of a layer in lake sediment might not be exactly identified due to various reasons, such as the time lag between initial plant introduction and its abundance, and different sediment accumulation rates (see Bradley 1999). We account for the dating error to some extent by considering additive proxy errors. However, one area of future work is to consider a more realistic model for dating errors (Haslett and Parnell 2008). One surprise in this work is that the hypothetical borehole proxies do not improve the reconstruction in a significant way. Nevertheless, in reality pollen records might not perform as well as the synthetic ones, and in such a case borehole information may be used more effectively. Also the use of borehole information might be improved by greater attention to the ill-posed aspects of the data model.

We believe that our work is a positive contribution to the paleoclimate community as it attempts to exploit all available proxy information to discern past climate. Also, whenever a Bayesian approach is used it will include companion measures of the reconstruction uncertainty. One perspective of the debate concerning the Northern Hemisphere temperature reconstruction proposed by Mann, Bradley, and Hughes (1998) is that most attention is centered on the estimate but only little on its uncertainty (see NRC 2006 for a comprehensive report). The “hockey stick” shape, which was the center of that discussion, is in our context an approximation to the posterior mean, and any assessment of the shape must therefore also include the uncertainty about this estimate. The concept of an ensemble of possible states is a well-recognized technique in geosciences to quantify uncertainty in an estimate. Via Bayesian analysis, a random sample of states from the posterior provides a rigorous and easily interpreted method for generating an ensemble.

Another advantage of the BHM framework is that it readily extends to more complex data level or process models. Because of this flexibility we believe that the methods will adapt to more complex statistical features of real data and the experience in this study will transfer to more complicated cases. An important extension will be reconstructing the space–time temperature process instead of the NH temperature based on the work of Tingley and Huybers (2010), and further reconstructing the multivariate space–time climate process. Climate variables of interest include temperature, precipitation and geopotential heights. The Bayesian framework in this article naturally lends itself to univariate and multivariate space–time random field reconstruction. Achieving this long-term goal will provide a valuable analysis to evaluate the next generation of climate system models and improve our understanding of past climate.

SUPPLEMENTAL MATERIALS

Additional analysis details and results: More details and results of the analysis are shown in a pdf file. (supplement.pdf)

Data and R codes: The tar file contains all data sets that have been used in reconstructions and all R codes that implement the reconstructions. The readme.rtf enclosed in the tar file describes the content of each data file and R code file. (paleo.tar)

[Received June 2009. Revised March 2010.]

REFERENCES

Ammann, C. M., and Wahl, E. (2007), “The Importance of the Geophysical Context in Statistical Evaluations of Climate Reconstruction Procedures,” Climatic Change, 85, 71–88. [885]
Ammann, C. M., Genton, M. G., and Li, B. (2010), “Correcting for Signal Attenuation From Noisy Proxy Data in Climate Reconstruction,” Climate of the Past, 6, 273–279. [885]
Ammann, C. M., Joos, F., Schimel, D., Otto-Bliesner, B. L., and Tomas, R. (2007), “Solar Influence on Climate During the Past Millennium: Results From Transient Simulations With the NCAR Climate System Model,” Proceedings of the National Academy of Sciences, 104, 3713–3718. [885]
Banerjee, S., Carlin, B., and Gelfand, A. E. (2004), Hierarchical Modeling and Analysis for Spatial Data, Boca Raton: Chapman & Hall/CRC. [888]
Bard, E., Raisbeck, G., Yiou, F., and Jouzel, J. (2000), “Solar Irradiance During the Last 1200 Years Based on Cosmogenic Nuclides,” Tellus, Ser. B, 52, 985–992. [885,886]
Beltrami, H. (2001), “Surface Heat Flux Histories From Inversion of Geothermal Data: Energy Balance at the Earth’s Surface,” Journal of Geophysical Research, 106, 21979–21993. [886]
Beltrami, H., Chapman, D. S., Archambault, S., and Bergeron, Y. (1995), “Reconstruction of High Resolution Ground Temperature Histories Combining Dendrochronological and Geothermal Data,” Earth Planetary Science Letters, 136, 437–445. [884]
Berliner, L. M., Milliff, R. F., and Wikle, C. K. (2003), “Bayesian Hierarchical Modeling of Air–Sea Interaction,” Journal of Geophysical Research, 108, DOI: 10.1029/2002JC001413. [884]
Boville, B. A., Kiehl, J. T., Rasch, P. J., and Bryan, F. O. (2001), “Improvements to the NCAR CSM-1 for Transient Climate Simulations,” Journal of Climate, 14, 164–179. [885]
Bradley, R. S. (1999), Paleoclimatology: Reconstructing Climates of the Quaternary (2nd ed.), San Diego: Academic Press. [887,893]
Bradley, R. S., and Jones, P. D. (1993), ““Little Ice Age” Summer Temperature Variations: Their Nature and Relevance to Recent Global Warming Trends,” The Holocene, 3, 367–376. [889]
Briffa, K. R., and Melvin, T. M. (2008), “A Closer Look at Regional Chronology Standardisation of Tree-Ring Records: Justification of the Need, a Warning of Some Pitfalls, and Suggested Improvements in Its Application,” in Dendroclimatology: Progress and Prospects. Developments in Paleoenvironmental Research, eds. M. K. Hughes, H. F. Diaz, and T. W. Swetnam, New York: Springer. [884]
Briffa, K. R., Jones, P. D., Schweingruber, F. H., Karlén, W., and Shiyatov, S. G. (1996), “Tree-Ring Variables as Proxy-Climate Indicators: Problems With Low-Frequency Signals,” in Climatic Variations and Forcing Mechanism of the Last 2000 Years. Series I: Global Environmental Change, eds. P. D. Jones, R. S. Bradley, and J. Jouzel, Berlin, Heidelberg: Springer-Verlag, pp. 9–41. [886]
Brown, K. J., Clark, J. S., Grimm, E. C., Donovan, J. J., Mueller, P. G., Hansen, B. C. S., and Stefanova, I. (2005), “Fire Cycles in North American Interior Grasslands and Their Relationship to Prairie Drought,” Proceedings of the National Academy of Sciences, 102 (25), 8865–8870. [887]
Carslaw, H. S., and Jaeger, J. C. (1959), Conduction of Heat in Solids (2nd ed.), Oxford, U.K.: Oxford University Press. [887]
Chapman, D. S., Bartlett, M. G., and Harris, R. N. (2004), Comment on “Ground vs. Surface Air Temperature Trends: Implications for Borehole Surface Temperature Reconstructions” by M. E. Mann and G. Schmidt, Geophysical Research Letters, 31, L07205, DOI: 10.1029/2003GL019054. [884,886]
Cook, E. R., and Kairiukstis, L. A. (1990), Methods of Dendrochronology, Dordrecht: Kluwer Academic. [884]
Cook, E. R., Briffa, K. R., Meko, D. M., Graybill, D. A., and Funkhouser, G. (1995), “The “Segment Length Curse” in Long Tree-Ring Chronology Development for Paleoclimatic Studies,” The Holocene, 5, 229–237. [884,886]
Crowley, T. J. (2000), “Causes of Climate Change Over the Last 1000 Years,” Science, 289, 270–277. [884]
Crowley, T. J., and Lowery, T. S. (2000), “How Warm Was the Medieval Warm Period?” Ambio, 29, 51–54. [884]
Esper, J., Cook, E. R., and Schweingruber, F. H. (2002), “Low-Frequency Signals in Long Tree-Ring Chronologies for Reconstructing Past Temperature Variability,” Science, 295, 2250–2253. [884]
Fritts, H. C. (1976), Tree Rings and Climate, New York: Academic Press. [884]
Gel, Y., Raftery, A. E., and Gneiting, T. (2004), “Calibrated Probabilistic Mesoscale Weather Field Forecasting: The Geostatistical Output Perturbation Method,” Journal of the American Statistical Association, 99, 575–583. [892]
Gelfand, A. E., and Ghosh, S. K. (1998), “Model Choice: A Minimum Posterior Predictive Loss Approach,” Biometrika, 85, 1–11. [890]
Gelman, A., Roberts, G. O., and Gilks, W. R. (1996), “Efficient Metropolis Jumping Rules,” in Bayesian Statistics 5, eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, Oxford: Oxford University Press. [889]
Gonzalez-Rouco, F., von Storch, H., and Zorita, E. (2003), “Deep Soil Temperature as Proxy for Surface Air-Temperature in a Coupled Model Simulation of the Last Thousand Years,” Geophysical Research Letters, 30 (21), 2116, DOI: 10.1029/2003GL018264. [886]
Guiot, J., Harrison, S. P., and Prentice, C. I. (1993), “Reconstruction of Holocene Precipitation Patterns in Europe Using Pollen and Lake Level Data,” Quaternary Research, 40, 139–149. [884]
Guiot, J., Nicault, A., Rathgeber, C., Edouard, J. L., Guibal, F., Pichard, G., and Till, C. (2005), “Last-Millennium Summer-Temperature Variations in Western Europe Based on Proxy Data,” The Holocene, 15 (4), 489–500. [884]
Harris, R. N. (2007), “Variations in Air and Ground Temperature and the POM-SAT Model: Results From the Northern Hemisphere,” Climate of the Past, 3, 611–621. [887]
Harris, R. N., and Chapman, D. S. (2001), “Mid-Latitude (30–60N) Climatic Warming Inferred by Combining Borehole Temperatures With Surface Air Temperatures,” Geophysical Research Letters, 28, 747–750. [884,887]
Haslett, J., and Parnell, A. (2008), “A Simple Monotone Process With Application to Radiocarbon-Dated Depth Chronologies,” Journal of the Royal Statistical Society, Ser. C, 57, 399–418. [893]
Haslett, J., Whiley, M., Bhattacharya, S., Salter-Townshend, M., Wilson, S. P., Allen, J. R. M., Huntley, B., and Mitchell, F. J. G. (2006), “Bayesian Palaeoclimate Reconstruction,” Journal of the Royal Statistical Society, Ser. A, 169, 395–438. [884,885]
Huang, S. (2004), “Merging Information From Different Resources for New Insights Into Climate Change in the Past and Future,” Geophysical Research Letters, 31, L13205. [884,886]
Huang, S., Pollack, H. N., and Shen, P. Y. (2000), “Temperature Trends Over the Past Five Centuries Reconstructed From Borehole Temperatures,” Nature, 403, 756–758. [884,886]
Hughes, M. K., and Ammann, C. M. (2009), “The Future of the Past: An Earth System Framework for High Resolution Paleoclimatology: Editorial Essay,” Climatic Change, 94, 247–259. [884]
IPCC (2007), Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, eds. S. Solomon, D. Qin, M. Manning, Z. Chen, M. Marquis, K. B. M. Tignor, and H. L. Miller, Cambridge, U.K. and New York: Cambridge University Press. [883,884]
Jones, P. D., Briffa, K. R., Barnett, T. P., and Tett, S. F. B. (1998), “High-Resolution Palaeoclimatic Records for the Last Millennium: Interpretation, Integration and Comparison With General Circulation Model Control-Run Temperatures,” The Holocene, 8, 455–471. [884]
Lee, T. C. K., Zwiers, F. W., and Tsao, M. (2008), “Evaluation of Proxy-Based Millennial Reconstruction Methods,” Climate Dynamics, 31 (2–3), 263, DOI: 10.1007/s00382-007-0351-9. [885]
Li, B., Nychka, W. D., and Ammann, C. M. (2007), “The “Hockey Stick” and the 1990s: A Statistical Perspective on Reconstructing Hemispheric Temperatures,” Tellus, 59, 591–598. [885]
Mann, M. E., and Rutherford, S. (2002), “Climate Reconstruction Using “Pseudoproxies”,” Geophysical Research Letters, 29, 1501, DOI: 10.1029/2001GL014554. [885]
Mann, M. E., and Schmidt, G. A. (2003), “Ground vs. Surface Air Temperature Trends: Implications for Borehole Surface Temperature Reconstructions,” Geophysical Research Letters, 30 (12), 1607, DOI: 10.1029/2003GL017170. [886]
Mann, M. E., Bradley, R. S., and Hughes, M. K. (1998), “Global-Scale Temperature Patterns and Climate Forcing Over the Past Six Centuries,” Nature, 392, 779–787. [884,893,895]
Mann, M. E., Rutherford, S., Wahl, E., and Ammann, C. (2005), “Testing the Fidelity of Methods Used in Proxy-Based Reconstruction of Past Climate,” Journal of Climate, 18, 4097–4107. [888]
Mann, M. E., Zhang, Z., Hughes, M. K., Bradley, R. S., Miller, S. K., and Rutherford, S. (2008), “Proxy-Based Reconstructions of Hemispheric and Global Surface Temperature Variations Over the Past Two Millennia,” Proceedings of the National Academy of Sciences, 105, 13252–13257. [889]
Moberg, A., Sonechkin, D. M., Holmgren, K., Datsenko, N. M., and Karlen, W. (2005), “Highly Variable Northern Hemisphere Temperatures Reconstructed From Low- and High-Resolution Proxy Data,” Nature, 433, 613–617. [884,886,888]
National Research Council (NRC) (2006), Surface Temperature Reconstructions for the Last 2,000 Years, Washington, DC: National Academy Press. [893]
Osborn, T. J., and Briffa, K. R. (2006), “The Spatial Extent of 20th-Century Warmth in the Context of the Past 1200 Years,” Science, 311, 841–844. [889]
Otto-Bliesner, B. L., and Brady, E. C. (2001), “Tropical Pacific Variability in the NCAR Climate System Model,” Journal of Climate, 14, 3587–3607. [885]
Rind, D. (1995), “The Potential for Modeling the Effects of Different Forcing Factors on Climate During the Past 2000 Years,” in Global Environmental Change. NATO ASI Series, Vol. 41, ed. P. D. Jones, Stuttgart: Springer-Verlag, pp. 563–581. [886]
Rutherford, S., Mann, M. E., Delworth, T. L., and Stouffer, R. (2003), “Climate Field Reconstruction Under Stationary and Nonstationary Forcing,” Journal of Climate, 16, 462–479. [885]
Schweingruber, F. H. (1996), Tree Rings and Environment Dendroecology, Berne, Switzerland: Paul Haupt. [884]
Shumway, R. H., and Stoffer, D. S. (2006), Time Series Analysis and Its Applications: With R Examples, New York: Springer. [889]
Tingley, M. P., and Huybers, P. (2010), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part 1: Development and Applications to Paleoclimate Reconstruction Problems,” Journal of Climate, 23, 2759–2781. [893]
Viau, A. E., Gajewski, K., Sawada, M. C., and Fines, P. (2006), “Millennial-Scale Temperature Variations in North America During the Holocene,” Journal of Geophysical Research, 111, D09012, DOI: 10.1029/2005JD006031. [884]
von Storch, H., Zorita, E., Jones, J., Dimitriev, Y., Gonzalez-Rouco, F., and Tett, S. (2004), “Reconstructing Past Climate From Noisy Data,” Science, 306, 679–682. [885]
Wikle, C. K., Milliff, R. F., Nychka, D., and Berliner, L. M. (2001), “Spatiotemporal Hierarchical Bayesian Modeling: Tropical Ocean Surface Winds,” Journal of the American Statistical Association, 96, 382–397. [884]
Williams, J. W., Bartlein, P. J., and Webb III, T. (2000), “Data-Model Comparisons for Eastern North America—Inferred Biomes and Climate Values From Pollen Data,” in Proceedings of the Third Paleoclimatic Modeling Intercomparison Project Workshop, ed. P. Braconnot, Montreal, Canada: World Climate Research Program, pp. 77–86. [884]
Zielinski, G. A. (2000), “Use of Paleo-Records in Determining Variability Within the Volcanism-Climate System,” Quaternary Science Reviews, 19, 417–438. [886]
Zorita, E., Gonzalez-Rouco, F., and Legutke, S. (2003), “Testing the Mann et al. (1998) Approach to Paleoclimate Reconstructions in the Context of a 1000-yr Control Simulation With the ECHO-G Coupled Climate Model,” Journal of Climate, 16, 1378–1390. [885]

Cressie and Tingley: Comment
The article by Bo Li, Douglas W. Nychka, and Caspar M. Ammann (hereafter, LNA) has several goals. It considers the important problem of reconstruction of past (over a period of more than 1000 Years Before Present) climate from multiproxy data, and it directly recognizes the various uncertainties in this undertaking. These uncertainties are expressed through (conditional) probability distributions in a framework known to readers of this journal as hierarchical statistical modeling. LNA use a physical–statistical model that also includes climate forcings, and their statistical inference is Bayesian. Rather than using actual multiproxy data, LNA simulate their data. Then they design a computer-simulation experiment to assess the value of including the various (simulated) proxies and the forcings. The design of the experiment, its analysis, and the conclusions obtained from it, are intended to guide climate scientists towards more precise inferences when carrying out actual paleoclimate reconstructions. Our discussion of LNA in the sections that follow considers both the scientific and statistical goals summarized above.

1. INTRODUCTION

Because LNA use pseudo-proxy data, not real data, we shall examine the way these pseudo-proxy data were created. Based on a combination of scientific expertise and statistical intuition, time series that mimic tree-ring, borehole, and pollen data were synthetically produced from the output of a general circulation model (GCM) of the climate system. This is akin to the biochemist conducting experiments on lab animals, with the goal being to eventually take the results from laboratory to bedside (a goal of Transformative Medicine). LNA realize the importance of calibrating their proxy data (the “lab animals”) to the real data (whose analogue would be the “patients”).

LNA use a methodology we call here posterior analysis (whose analogue might be the “treatment”), that may be new to the paleoclimate-reconstruction community, but it is well known to statisticians. Posterior analysis resulting from hierarchical statistical modeling is a powerful way to account for uncertainties in all aspects of a scientific study. The main strength of a hierarchical model (HM) is also a point of difficulty, namely that all these uncertainties have to be expressed through (parametric) probability distributions. This might not be easy for a paleoclimate scientist to do, and hence the statistician’s involvement is needed in a posterior analysis from the “get-go.” As part of our discussion, we shall examine the appropriateness of the HMs proposed by LNA.

LNA use a computer simulation experiment (i.e., try it on the lab animals first!) to determine the worthiness of a Bayesian HM to address this highly complex climate-reconstruction problem. Their experiment should be assessed like any other, in terms of the basic principles of blocking, randomization, and replication (Fisher 1935), and in terms of the responses that are studied to answer the questions that provoked the experiment. In LNA’s analyses, the responses all depend on the posterior distribution obtained for the various HMs that were fitted, which is consistent with their (Bayesian) hierarchical modeling approach.

Obtaining posterior distributions from an HM usually requires a considerable investment in computation. Sometimes, modeling decisions in the HM are made more for computational reasons than scientific ones. All of us who use the HM approach are faced with these compromises, and we discuss this in the context of LNA’s analyses. In the sections that follow, we expand on all of these issues raised in our introduction.

Noel Cressie is Director of the Program in Spatial Statistics and Environmental Statistics, Professor of Statistics, and Distinguished Professor of Mathematical and Physical Sciences, Department of Statistics, The Ohio State University, Columbus, OH 43210-1247 (E-mail: ncressie@stat.osu.edu). Martin P. Tingley is Post-Doctoral Fellow, Statistical and Applied Mathematical Sciences Institute (SAMSI), P.O. Box 14006, Research Triangle Park, NC 27709-4006 (E-mail: mtingley@samsi.info). This research was carried out while Cressie was visiting SAMSI under the 2009–2010 Program, “Space–Time Analysis of Environmental Mapping, Epidemiology and Climate Change.” It is supported by the National Science Foundation under agreement DMS-0635449. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

© 2010 American Statistical Association
Journal of the American Statistical Association
September 2010, Vol. 105, No. 491, Applications and Case Studies
DOI: 10.1198/jasa.2010.ap10318
896 Journal of the American Statistical Association, September 2010
2. DESIGN OF THE COMPUTER SIMULATION EXPERIMENT
LNA use a General Circulation Model (GCM) as the basis of an experiment to see whether an HM approach to paleoclimate reconstruction of temperature is worthwhile. A large part of their article discusses the design of the experiment, and we start with that.
Experimental design has its foundations set out in the book by Fisher (1935); the three basic tenets are blocking, randomization, and replication, and they are by now well accepted by scientists. Think of the “treatment” in this experiment of LNA’s as the generic posterior analysis using Bayes’ Theorem and MCMC. Then the “experimental units” are the various HMs outlined in LNA’s Section 4.3. It is now noticeable that their experiment involves only one treatment. It is unusual to do an experiment without another treatment to compare to; in this case, it might be a standard analysis in the paleoclimate-reconstruction literature, such as the RegEM method of Schneider (2001). Even if LNA’s posterior analysis does well, does it do better than RegEM, say, or any other method a paleoclimate scientist might use (see, e.g., Jones et al. 2009)? Science advances by replacing an inferior methodology with a superior one, whose superiority is ideally established through a designed experiment.
LNA do carry out “blocking”; see their Section 4.3 where the different factors are listed. These correspond to the various combinations of data and terms included in the HM (e.g., temperature process model without external forcings, the “oracle” proxy, etc.). However, all their blocks are of size one, because they only consider one treatment.
There is a component of “randomization” in the experiment, but not in the sense that Fisher meant it. Fisher was concerned with which experimental unit received which treatment within a block. Here the blocks are of size one, but what should happen if two methodologies (e.g., posterior analysis and RegEM) were applied to the (proxy) data and compared? In a simulation experiment, the statistician is able to create two (or more) identical experimental units, something a crop scientist could only dream about. (The real-world analogy would be to have homogeneous material, such as a water sample from a lake, that is divided into two parts and a different treatment would be applied to each.) Therefore, in a simulation experiment, randomization of treatment assignment to experimental unit may not be important, depending on the computing resources needed to apply a treatment to an experimental unit.
Finally, how much “replication” do LNA have in their experiment? The experiment has a lot of factors and there are many ways the responses are quantified (see their Section 5), but their experiment has no replication. In effect, their study is on only one “lab animal.” It is true that they tried to choose a “typical lab animal” by using a GCM simulation of the Earth’s climate. However, there are many decisions that go into such climate simulations. A way to introduce replication into this experiment would be to look at an ensemble of such Earth-climate simulators and run several (chosen randomly or purposively) to guard against any objection that the conclusions from this experiment are particular to the climate simulator used. Christiansen, Schmith, and Thejll (2009) make the same point in a recent article published in the Journal of Climate.
Statistical simulation experiments should be designed in the same way agricultural, industrial, computer, etc. experiments are designed. Ultimately, the experimenter is looking to attribute the total variability of the responses to various sources in the experiment. Aldworth and Cressie (1999) discuss how this can be done in a systematic way, which they illustrate with a simulation experiment to compare various spatial sampling schemes of an ecological resource. In the case of LNA, with one treatment and one replicate, their analysis can only attribute the total variability to the various factors (or “blocks”).
3. MULTIPROXY DATA, REAL AND SIMULATED
In LNA’s experiment, the target quantity is the Northern Hemisphere (NH) average temperature, and they build an HM to capture the characteristics of three classes of real-world climate proxies. In order to test their HM, they simulate pseudo-proxy time series using the output of a GCM.
Tree-ring pseudo-proxies are constructed by taking the output of the model at a number of grid locations, adding noise, and then removing the 11-year running mean. This construction amounts to high-band-pass filtering the GCM model output that has been purposely noise degraded. Consequently, a tree-ring pseudo-proxy observation at year $t$ is a function of the model output at the corresponding location for years $t - 5$ to $t + 5$. Perhaps a running mean that looks back 10 or 11 years would have been a better choice, since a tree ring cannot contain information about the future climate.
We interpret the choices they made for tree-ring pseudo-proxy construction as an attempt to mimic the preprocessing that is often applied to tree-ring data. A number of tree-ring series (each perhaps covering a different time interval) are generally combined to arrive at a single, long, climate-sensitive series, and techniques such as Regional Curve Standardization (e.g., Briffa et al. 1992; Esper, Cook, and Schweingruber 2002) are used to remove biological growth effects from raw tree observations. These steps may result in a tree-ring observation at year $t$ being dependent on local climate for years both before and after $t$. As referenced in LNA, there is evidence that tree-ring proxies only record faithfully high-frequency climatic changes, due either to this processing or to biology. Some reconstructions (e.g., Moberg et al. 2005) have used tree rings only to infer the high-frequency component of the spatial average temperature series. In their supplementary material, LNA claim that the tree-ring pseudo-proxy construction results in time series that “look similar” to actual (i.e., not band-pass-filtered) tree-ring time series. Our spectral analysis (not shown here) suggests that this is not the case: the actual tree-ring series provided in the supplementary material have an abundance of power at low frequencies (the spectra are red), while the LNA construction results in spectra with a sharp drop-off in power for periods longer than 10 years (as expected).
More realistic tree-ring pseudo-proxies could be created by using the GCM to drive a forward model of tree growth, such as that proposed by Shashkin and Vaganov (e.g., Shashkin and Vaganov 1993; Evans et al. 2006). As a result, the pseudo-proxy time series would approximate more closely the actual proxy time series (i.e., the makeup of the “lab animal” would be closer to that of the “patient”).
Pollen pseudo-proxies are created by averaging the model output over a number of 7.5° × 7.5° regions, adding noise, calculating an 11-year running mean, and then sampling every 30 years. This reflects the fact that pollen assemblages record information about the climate over large spatial and long temporal scales. For example, the species composition of a forest stand responds gradually to changes in the climate, while the pollen produced by that stand can travel considerable distances. The implication that an observation of the proxy for a given year contains information about the climate in both the past and the future is perhaps more justified in the case of pollen proxies, which are measured by analyzing small segments of sediment, often from lake-floor cores. Various physical and biological mechanisms can mix the pollen deposited over a number of years, while the measurement process itself could involve sediment accumulated over more than one year.
Borehole pseudo-proxies are formed from spatial averages over 20° × 20° regions by application of the POM-SAT model, which describes the diffusion of surface-temperature perturbations through the bedrock. LNA use POM-SAT to simulate borehole temperature profiles down to 500 m, and then they sample this depth profile every 5 m. The details of the POM-SAT model are not provided.
LNA assume that a borehole profile provides information about surface temperatures at large spatial scales. As a temperature anomaly needs to propagate through rock, to where the measurement takes place, the spatial scale of the information might be considered similar to the depth scale. If this were so, we find the choice of 20° × 20° regions to be too large. By forming the borehole pseudo-proxies from such large regions, LNA likely overemphasize the information in actual borehole proxies for inferring the NH (spatial mean) temperature time series. Their borehole pseudo-proxy construction was probably motivated by the observation that surface temperatures averaged over longer time scales tend to reflect larger spatial scales. A fully Bayesian reconstruction of temperature in the San Rafael region of Utah from borehole proxies is given by Brynjarsdóttir and Berliner (2010).
A better approach would involve constructing the borehole pseudo-proxies from local GCM output, and then using an HM with a spatial component to reconstruct the temperature field through time, not just the NH spatial average. Ideally, this would be calibrated to the heat equation that governs the temporal evolution of subsurface temperatures for which surface temperatures are boundary conditions. The data level of such a hierarchical model could represent the borehole data as reflecting local temperature, smoothed through time, while the process level could model a temperature field that becomes increasingly smooth in space as temporal smoothing increases. We say more about the introduction of a spatial component in Section 6.
For all three classes of pseudo-proxies, LNA add noise to the local GCM temperatures before forming the pseudo-proxies. In other words, additive white noise, as well as the GCM output, are subject to the transforms that create the pseudo-proxies. This decision is motivated by the observation that actual pollen and borehole proxies appear somewhat “smooth” in time. The notion that borehole proxies are temporally smooth is reflected in the data level, as the noise term is also subject to the transformation $M_B$ [LNA’s (4.3)]. This is not the case for the pollen pseudo-proxies, which are modeled at the data level as being subject to additive AR(2) noise [LNA’s (4.2)]. The tree-ring pseudo-proxies are likewise modeled at the data level as being subject to additive AR(2) noise that is not filtered by the transform matrix $M_D$ [LNA’s (4.1)]. In short, we see something of a disconnect between the observed properties of proxies, pseudo-proxy constructions, and the assumptions made at the data level of the HM.
LNA investigate the potential of borehole, pollen, and tree-ring proxies, along with estimated time series of forcings, to reconstruct NH mean temperatures. However, as they state clearly, their methodology has not been tested on actual data. In their experiment, the parameters used to transform the GCM output into pseudo-proxies are assumed known: the matrices $M_D$, $M_P$, and $M_B$ that appear in the data level of the model are used to construct the pseudo-proxies. In real paleoclimate reconstructions, this will not be the case. For example, the assumption that a pollen observation reflects some weighted average of temperatures over a number of years is likely reasonable, but the number of years reflected in that observation, and the weights associated with the averaging, will in general not be known.
LNA assume that each pseudo-proxy time series has a (different) linear relationship with the transformed unknown temperatures $T$; for each time series, they infer two regression parameters, and for each proxy type they infer three parameters [two AR(2) coefficients and a variance] for the error process. For 15 tree-ring time series, this amounts to $2 \cdot 15 + 3 = 33$ parameters.
Another strategy would be to infer the averaging weights in the matrices $M_D$, $M_P$, and $M_B$, in which case the scaling coefficients ($\beta_{i,D}$, $\beta_{j,P}$, and $\beta_{k,B}$) in the data model are redundant. We shall now investigate the consequences of this for tree-ring proxies. If the matrix $M_D$ is assumed to represent a stationary linear transform of the temperatures $T$ within plus or minus five years of a given observation, then an additional 11 parameters must be inferred. If there are 15 tree-ring proxies, each with an intercept only, and assumed to have common error process parameters, then there are $11 + 1 \cdot 15 + 3 = 29$ parameters to be estimated for the tree-ring proxies. As the matrices $M_D$, $M_P$, and $M_B$ are not known in real-world applications, this (slightly more parsimonious) model may be more realistic. The impacts of these differing modeling choices would certainly be of interest to the scientific community and could be included in a future simulation experiment.
In this paper, LNA have tested the ability of a BHM to reconstruct past temperatures, given pseudo-proxies obtained by applying different, known transforms to the climate-model output. While these transforms are constructed to reflect aspects of tree-ring, pollen, and borehole proxies, they are at best simple approximations to the processes that generate the actual proxies. In addition, assuming that these transforms are known eliminates a source of uncertainty which, in real-world applications, could be large. To sum up our discussion in this section, we see many ways that the “lab animal” is different from the “patient.”
4. THE HM USED FOR RECONSTRUCTING PAST CLIMATE
4.1 Hierarchical Modeling Choices
There are several choices made by LNA when building the HM in their Section 4.1. First, they introduce forcings $S$ (solar irradiance), $V$ (volcanism), and $C$ (greenhouse gases, represented by the concentration of CO$_2$) into the model. Strictly speaking, the forcings should appear in the process model (4.5) in terms of their “noise-free” versions, $S_0$, $V_0$, and $C_0$. In terms of the probability structure defined by LNA’s HM, all distributions are in fact conditional on $S$ and $C$. This is an assumption that we invite LNA to comment on. Was the reason for this a pragmatic one that kept the number of unknowns to a manageable size? Notice that a data model for $S$ and $C$ would introduce unknowns $S_0$ and $C_0$ into the HM and hence into the MCMC.
There was another hierarchical modeling decision made by LNA that we would like to invite comment on. While it is not explicitly stated, LNA assume that the observed instrumental temperatures $T_2$ have no measurement error; however, this does not seem to reflect paleoclimate scientists’ understanding (e.g., Brohan et al. 2006). Therefore, we suggest that the data stage should include one more equation:
$$T_2 = T_{2,0} + \varepsilon_2.$$
Then, at the process stage, their (4.5) should be in terms of $(T_1, T_{2,0})$. In fact, we find the notation, $T_1$, for past temperature misleading; in line with other notation, we suggest that it be replaced with $T_{1,0}$. Consequently, when LNA state [just after (4.5)], “The target is to estimate $T_1$ given $T_2$, the proxies and the forcings,” we suggest that their target should be to make inference on the unknown $T_{1,0}$, given the temperature data $T_2$. Moreover, we would extend this to making inference on the unknown $T_{2,0}$ as well.
It is clear that LNA have assumed $\operatorname{var}(\varepsilon_2) = 0$, without explicitly stating it. Even if it were true, we find that it helps to distinguish between the (potentially) observed temperatures, $T = (T_1, T_2)$, and the unknown true temperatures, $T_0 \equiv (T_{1,0}, T_{2,0})$. Then (4.1), (4.2), and (4.3) should be formulated conditional on $T_0$, and the focus of the study would be on gaining knowledge about the unobserved past temperatures, $T_{1,0}$ (and $T_{2,0}$ as well), from all relevant data sources (including $T_2$).
4.2 Spatial Sampling Issues
LNA provide justification for the spatial distribution of the pseudo-proxies they chose (Figure 2 of LNA) for only the boreholes, saying in this case that the “distribution of those locations reasonably represents the spread of real borehole data.” Now consider the particular spatial distribution they chose for the tree-ring pseudo-proxies (Figure 2 of LNA). First, a number of the tree-ring pseudo-proxies are located in the Southern Hemisphere, despite their goal of wanting to reconstruct NH mean temperatures. Second, LNA locate several tree-ring pseudo-proxies in the tropics, despite the fact that trees in the tropics do not generally develop annual rings, due to the lack of strong seasonality. Third, LNA locate a tree-ring pseudo-proxy in Eastern Greenland, north of 75° N, and certainly north of the tree line.
More generally, the spatial distribution of the pseudo-proxies presumably impacts their ability to infer the NH spatial average. All things being equal, we might think that a regularly spaced distribution of locations would be preferred (see, e.g., Aldworth and Cressie 1999). However, the surface-temperature field is inhomogeneous and the spatial distributions of tree-ring and pollen proxies used in any actual application are limited to particular geographical areas. Indeed, the locations of proxies in published reconstructions (e.g., Mann et al. 2008) could have been used to inform the locations of the pseudo-proxies, but this was not a factor controlled in LNA’s experiments. We suggest that in a future simulation experiment, the opportunity should be taken to consider the effect of spatial locations of proxies on inference for NH mean temperatures.
4.3 Accounting for Nonlinearities in the HM
We have already noted in Section 3 that each pseudo-proxy time series is linearly related to the true temperature time series. There is growing evidence that some tree-ring proxies, particularly those at high northern latitudes, have become less sensitive to changes in local temperature over the last few decades (Briffa et al. 1998; Jones et al. 2009). This so-called “divergence” problem could be explained by nonstationarities or nonlinearities in the tree-ring temperature relationship, or by the presence of confounding covariates that are generally not included in paleoclimate reconstructions. The problem with capturing nonlinearities in the data stage of an HM (as do LNA) is that the nonlinearities are assumed part of the measurement error and filtered out by the posterior analysis.
We would like to finish this section by augmenting our discussion of LNA’s Equation (4.5). They assume that the forcings in (4.5) are additive in $S$ (or $S_0$), $V_0$, and $C$ (or $C_0$), an assumption that seems to be supported by the IPCC’s Fourth Assessment Report (Forster et al. 2007). However, this only refers to the lack of interaction between $S$, $V_0$, and $C$. Now, the radiative forcing associated with CO$_2$ increases as the log of the mixing ratio (Forster et al. 2007). Furthermore, it seems clear from LNA’s Figure 1 and the multiplicative measurement error in their (4.4), that $V_0$ should also be expressed on the log scale. Therefore, we suggest that Equation (4.5) be modified to be additive in $S$, $\log(V_0)$, and $\log(C)$, where the log of a vector is interpreted as the vector of elementwise logs.
5. INFERENCE IN THE PRESENCE OF UNCERTAINTY: WHAT ARE THE QUESTIONS AND HOW ARE THEY ANSWERED?
LNA’s scientific goal is to investigate the value of including proxies with different spatial and temporal relationships, as well as various forcings, into a paleoclimate reconstruction based on posterior analysis. Their statistical goal is to assess a computer simulation experiment designed around how an HM includes additional data for paleoclimate reconstruction.
Their posterior analyses are done carefully but, as discussed in Section 2, we believe the design lacks a competing methodology and there should be some replication. Perhaps the lack of replication is an explanation for the questions below. In Figure 5, under the factor combination (Noise, T1, D), the “forcings” bias is extremely negative, resulting in a worse RMSE when forcings are included. Does this make sense? Also, does the borehole proxy really lead to a less-biased reconstruction? And shouldn’t the “oracle” be best in terms of virtually any skill measure? (It’s not; see their supplementary material.)
A lot of effort was put into understanding which factors in the reconstruction are important; the end product is a comparison of a number of possible reconstructions, based on the bias and mean squared prediction error of posterior means. (We believe that LNA used posterior means, but we could not actually find where they specified which posterior summary was used to define the reconstructions.) Coverage rates of posterior credible intervals are investigated in LNA’s supplementary material and,
there, the performance of even the “oracle” is quite poor. (The coverage rate is an attractive measure of skill since it is unitless and intuitively interpretable.) For example, with a nominal coverage rate of 90%, the oracle proxy only gives 65% coverage! Presumably, this difference is an indication of the inherent limitations involved in inferring the NH mean using proxies at the particular locations used by LNA.
Is this experiment applicable to real-world proxies? We have already mentioned that the matrices $M_D$, $M_P$, and $M_B$, given in their data stage, are not known in practice. Either estimating them, putting a prior on them, or carrying out a sensitivity analysis should be done before applying the HM proposed by LNA to real-world data.
LNA investigate the interaction of proxy and climate-forcing information in their roles of reconstructing climate. They investigate the impacts of including different combinations of proxies, of including noise in the proxy construction, of including the forcing time series, and of modeling the temperature process over the entire time span or only over the reconstructed time interval. They conclude that it is important to include information about the target time series at a wide array of frequencies. If tree rings only reflect high-frequency climate variability (as is assumed by LNA), then including the different forcing time series improves results. Including proxies that reflect the lower-frequency variability of the target time series can partially replace the role of the forcing time series. These are nice, “take-home” conclusions, and they extend the results of Moberg et al. (2005) to include the effects of climate forcings. A consistent message seems to be that having information on different time scales is essential for arriving at skillful reconstructions of past climate.
According to the process stage of the model [Equation (4.5) from LNA] the NH mean temperature time series is a linear combination of the three forcing time series plus AR(2) noise. It would be interesting to re-run the analysis using different combinations of the forcing series, to investigate, for example, the impact of including solar variability in the model. (Climate change skeptics often attribute temperature changes to solar forcings, in place of the more common attribution given to greenhouse gases.) This would address the influences of the various forcing time series, similar to the way in which LNA address the influence of the various types of pseudo-proxies.
There are several other factors that could be investigated, which we have discussed earlier but group together here: Are the results robust to different runs of the climate model, or to output from different models? The spatial network of proxies is held fixed across all experiments, although the spatial distribution of these series must have an impact on their ability to infer NH mean temperature. How variable are the results as a function of the spatial network? What is the optimal spatial design, and is this dependent on the GCM output or the particular GCM chosen? Finally, LNA discuss the possibility of dating errors when dealing with actual proxy time series, particularly for pollen observations. Sensitivity to mild dating errors could have been explored in their experiment.
6. SPATIAL MODELING FOR PALEOCLIMATE RECONSTRUCTION
The spatial aspect has not been featured in LNA’s data stage or process stage, something we would like to discuss in this section. We have suggested above that $T_0$ is the appropriate time series of temperature, where the spatial component has been averaged out over the NH. Now think of a time series of spatial temperature processes, which we write as $\{T_{0,1}(s) : s \in \text{globe}\}$, $\{T_{0,2}(s) : s \in \text{globe}\}$, $\ldots$, the present-day temperature process over the globe. Write this time series of spatial processes as $T_0(\cdot)$. Then the spatial component might be introduced into the data stage through spatially varying parameters in (4.1), (4.2), and (4.3), including parameters found in $M_D$, $M_P$, and $M_B$ (see our Section 3). The current LNA model relates each proxy observation to a number of years of the NH mean time series (via the matrices $M_D$, $M_P$, and $M_B$). A spatial adaptation of $M_D$ would reflect the local (in space) temperature value, whereas that of $M_P$ would reflect regional temperature values. As discussed above, it is our view that $M_B$ should reflect the local temperature value, but with a space–time covariance that models an increasing spatial range with longer temporal averaging. Clearly, the curse of dimensionality needs to be taken into account when including spatial dependence in the model, which we discuss below.
At the process stage, the forcings likely mix well enough over annual time scales that we do not have to include any spatial variability in that part of the model. However, the error term $\varepsilon_T$, which accounts for process variability in LNA’s key regression Equation (4.5), should now be spatio-temporal with nonstationary spatial covariances.
With all this extra structure, any posterior analysis runs the risk of being overwhelmed by high dimensionality. One way to reduce the dimensionality is to use a spatio-temporal random effects (STRE) model, as in Cressie, Shi, and Kang (2010), where a spatio-temporal analysis was done on a very large remote-sensing dataset. In LNA’s terminology, write the (now) spatio-temporal error $\varepsilon_T$ in their (4.5) as $\varepsilon_T \equiv (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_t, \ldots)$, and assume
$$\varepsilon_t \equiv S_t \eta_t + \xi_t; \quad t = 1, 2, \ldots, T,$$
where the matrices $\{S_t : t = 1, 2, \ldots, T\}$ are made up of known spatial basis functions and $\{\eta_t : t = 1, 2, \ldots, T\}$ is an $r$-dimensional vector autoregressive time series. Importantly, $r$ is fixed. In Cressie, Shi, and Kang (2010), $r$ was on the order of 100. The last term in the STRE model, $\{\xi_t : t = 1, 2, \ldots\}$, captures fine-spatial-scale variability.
The STRE model inherits nonseparable, nonstationary spatio-temporal covariances, an attractive feature since stationarity is not expected over a global spatial scale and a millennial temporal scale. Critically, the dimension reduction allows very fast matrix inversions in an MCMC. Tingley and Huybers (2010a, 2010b) present a spatio-temporal BHM for paleoclimate reconstructions that assumes separability of the spatial and temporal variability in order to achieve a computationally feasible MCMC. Both approaches use a sequential updating procedure to speed up inferences and they are $O(T)$ in computational complexity. The dimension reduction appears to be needed when spatial-data sizes go beyond about 2000 observations per time point.
Including a spatial model will produce estimates of the spatial mean and its associated uncertainty that are consistent across global, hemispheric (including the NH), continental, and regional scales. Indeed, results included in LNA’s supplementary material point to the limitations of inferring a spatial average without modeling the spatial covariance. LNA compare
reconstructions based on the oracle proxies to those based on dendro and pollen proxies using rank-verification histograms. They note that the shapes of the rank-verification histograms are the same for each, but that the time series of temperature observations at the particular set of locations they have chosen cannot capture all aspects of the spatial average temperature.
7. CONCLUSIONS
LNA have presented an HM for reconstructing past climate from various types of (pseudo-)proxy and forcing information, which represents a 21st-century statistical approach to paleoclimate reconstruction. One of the great advantages of an HM is the conceptual ease with which different forms of uncertainty can be included, as well as the transparency of the physical and statistical modeling assumptions. While we feel that there are a number of aspects of LNA’s HM that could be improved upon, their efforts do represent a substantial step forward for the paleoclimate-reconstruction community, whose statistical approaches are summarized in NRC (2006). The suggestions we have made are in support of posterior analysis from an HM approach, and LNA’s paper indicates that posterior analysis on actual paleoclimate data will advance our understanding of the Earth’s past climate (as well as quantify the associated uncertainties). We look forward to such efforts appearing in the literature in the near future.
ADDITIONAL REFERENCES
Aldworth, W. J., and Cressie, N. (1999), “Sampling Designs and Prediction Methods for Gaussian Spatial Processes,” in Multivariate Analysis, Design of Experiments, and Survey Sampling, ed. S. Ghosh, New York: Marcel Dekker, pp. 1–54.
Briffa, K. R., Jones, P. D., Bartholin, T. S., Eckstein, D., Schweingruber, F. H., Karlen, W., Zetterberg, P., and Eronen, M. (1992), “Fennoscandian Summers From AD 500: Temperature Changes on Short and Long Timescales,” Climate Dynamics, 7, 111–119.
Briffa, K. R., Schweingruber, F. H., Jones, P. D., Osborn, T. J., Shiyatov, S. G., and Vaganov, E. A. (1998), “Reduced Sensitivity of Recent Tree-Growth to Temperature at High Northern Latitudes,” Nature, 391, 678–682.
Brohan, P., Kennedy, J. J., Harris, I., Tett, S. F. B., and Jones, P. D. (2006), “Uncertainty Estimates in Regional and Global Observed Temperature Changes: A New Data Set From 1850,” Journal of Geophysical Research, 2, 99–113.
Brynjarsdóttir, J., and Berliner, L. M. (2010), “Bayesian Hierarchical Modeling for Temperature Reconstruction From Geothermal Data,” Technical Report 842, The Ohio State University, Dept. of Statistics, Columbus, OH.
Christiansen, B., Schmith, T., and Thejll, P. (2009), “A Surrogate Ensemble Study of Climate Reconstruction Methods: Stochasticity and Robustness,” Journal of Climate, 22, 951–976.
Cressie, N., Shi, T., and Kang, E. L. (2010), “Fixed Rank Filtering for Spatio-Temporal Data,” Journal of Computational and Graphical Statistics, 19, 724–745.
Evans, M., Reichert, B., Kaplan, A., Anchukaitis, K., Vaganov, E., Hughes, M., and Cane, M. (2006), “A Forward Modeling Approach to Paleoclimatic Interpretation of Tree-Ring Data,” Journal of Geophysical Research, 111, G03008, DOI: 10.1029/2006JG000166.
Fisher, R. A. (1935), The Design of Experiments, Edinburgh, U.K.: Oliver & Boyd.
Forster, P., Ramaswamy, V., Artaxo, P., Berntsen, T., Betts, R., Fahey, D. W., Haywood, J., Lean, J., Lowe, D. C., Myhre, G., Nganga, J., Prinn, R., Raga, G., Schulz, M., and Van Dorland, R. (2007), “Changes in Atmospheric Constituents and in Radiative Forcing,” in Climate Change 2007: The Physical Science Basis, eds. S. Solomon, D. Qin, M. Manning, Z. Chen, M. Marquis, K. B. Averyt, M. Tignor, and H. L. Miller, Cambridge, U.K.: Cambridge University Press.
Jones, P. D., Briffa, K. R., Osborn, T. J., Lough, J. M., van Ommen, T. D., Vinther, B. M., Luterbacher, J., Wahl, E. R., Zwiers, F. W., Mann, M. E. et al. (2009), “High-Resolution Paleoclimatology of the Last Millennium: A Review of Current Status and Future Prospects,” The Holocene, 19, 3–49.
Schneider, T. (2001), “Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values,” Journal of Climate, 14, 853–871.
Shashkin, A. V., and Vaganov, E. A. (1993), “Simulation Model of Climatically Determined Variability of Conifer’s Annual Increment (on the Example of Common Pine in the Steppe Zone),” Russian Journal of Ecology, 24, 275–280.
Tingley, M. P., and Huybers, P. (2010a), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part 1: Development and Applications to Paleoclimate Reconstruction Problems,” Journal of Climate, 23, 2759–2781.
Tingley, M. P., and Huybers, P. (2010b), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part 2: Comparison With the Regularized Expectation–Maximization Algorithm,” Journal of Climate, 23, 2782–2800.
Comment
Eugene WAHL, Christian SCHOELZEL, John WILLIAMS, and Seyitriza TIGREK
Imbrie and Imbrie 1979). A shorter-term example is the recognition that very long and intense droughts have occurred in North America and eastern Asia (as examples; these are the regions with the most complete paleodrought coverage), which have lasted far longer than the worst droughts in the instrumental record that have been used for planning in water management applications (e.g., Cook et al. 2004). Paleoclimate reconstruction is currently in a time of explosion in both interest in its results and the development and testing of new methods. The work by Li, Nychka, and Ammann (this issue) to study the capabilities of Bayesian Hierarchical Modeling for use in paleoclimate reconstruction is an important “new shoot” of this development.

2. TECHNICAL BACKGROUND

In many applications, proxy information can be calibrated against a climatic variable (or variables) of interest. In some cases, such calibrations can be physically deterministic, derived from first principles and/or carefully controlled laboratory experiments. More typically, they are purely statistical, developed over a period of common overlap between the climate and proxy data, and then the calibration relationship is applied to yield quantitative estimates of the climate variable as far back in time as the proxy data allow. Both kinds of calibration assume stationarity of the relationships modeled over time, as does the Bayesian hierarchical approach evaluated by Li, Nychka, and Ammann. Statistical calibrations have involved a variety of simple to more complex mathematical models, often utilizing one of a number of regression-based approaches. Two of the most common are: (1) direct (or indirect) regression of the proxies against, for example, temperature, in either simple or multiple regression forms (cf. Frank et al. 2010); and (2) a more complex process in which one of these closed-form regression approaches (or an iterative procedure such as regularized expectation maximization, RegEM) is used to fit the proxy data to the component time series weights of an orthogonal spatial decomposition of the target instrumental data field, and the resulting fitted component time series are then input back into the appropriate eigen/singular value (SV) expansion to yield estimated spatial fields (e.g., Luterbacher et al. 2004; Mann et al. 2007, 2009). These methods have proved very effective in making expected value reconstructions of climate at both single sites and for entire climate fields up to global scale. Simultaneous whole-field reconstructions, in particular, have been one of the key successes of these approaches, because they provide site-to-site-consistent spatial information about a dynamical system (climate) that is inherently spatial in its character. (In this context, it is worth keeping in mind that even with modern instrumentation, including satellites, providing systematic whole-field climate data coverage remains difficult.) In addition, a variety of methods for validating reconstruction results have been derived from mathematical statistics, including econometrics and the theory of calibration and validation (cf. Cook, Briffa, and Jones 1994; Wahl and Ammann 2007). Additional work is currently being done in this area to refine and extend the efficacy of validation methods. Much work is also underway to consider the impact of, and solutions for, classically known problems in regression such as errors-in-variables (EIV), heteroscedasticity, multicollinearity, etc. (e.g., Hegerl et al. 2007; Mann et al. 2007; cf. Ammann, Genton, and Li 2010 both for new methodology and for a good review of the relevant literature). For example, one reason (among several) for using eigen/SV decompositions is to help limit stochastic noise in the instrumental and proxy data by truncating the eigenvectors/EOFs used in the reconstruction process, in particular to help reduce EIV issues when the form of the regressions is climate = f(proxy).

3. USE OF BAYESIAN METHODS

Perhaps most importantly at the current time, much work is being done to develop appropriate models that can be used to generate reconstruction ensembles, for the purpose of characterizing reconstruction uncertainty. This work includes (but is not limited to): (1) identifying stochastic models for this purpose and then applying these via Monte Carlo methods to make random draws from the estimated distribution of reconstructions (Li, Nychka, and Ammann 2007; Supplemental Online Material here); and (2) a more “engineering” style approach that describes reasonable ranges of free analyst choices in model specification per se, and then uses all possible models within these ranges to generate the reconstruction ensemble from which draws can be taken (Frank et al. 2010). It is, of course, possible to combine these approaches to evaluate the combined effects of stochastic and model selection uncertainties. The work of Li, Nychka, and Ammann in this issue is part of a fundamentally new approach to the problem of generating statistically appropriate ensemble distributions of reconstructions from which random draws can be taken (e.g., Haslett et al. 2006). In this approach, a clearly specified hierarchy of mathematical models is formulated as conditional probability density functions, which can then be combined via Bayes’ Theorem to yield an estimated joint posterior distribution of the climate variable(s) being reconstructed, the proxies used in the process, and the parameters in the hierarchical model. The hierarchy of models in this formulation has three stages: (1) a data stage, in which the proxy or proxies used in the reconstruction are modeled in a “forward” way as causally dependent on the climate variable(s) of interest; (2) a process stage, in which the climate variable(s) is (are) modeled, for example, as a simple autoregressive process or, more appropriately from a physical standpoint (as Li, Nychka, and Ammann do), as causally dependent on known factors that “force” the climate system (such as solar output, volcanic aerosol release, and greenhouse gas concentrations); and (3) the assumed prior distributions of the unknown parameter values at both the data and process stages.

Although complex, this Bayesian hierarchical modelling (BHM) approach offers distinct resources in relation to the “traditional” reconstruction approaches mentioned above. First, and most importantly, the specification of the full model in an explicit mathematical/statistical form that leads to direct estimation of the posterior distribution means that full uncertainty estimation is inherent in the reconstruction process from start to finish. It does not have to be developed in a component-by-component manner that runs a risk of becoming, to a lesser or greater degree, ad hoc, or simply not tractable per se. Second, explicit specification of the mathematical reconstruction model used allows each part of the reconstruction process to be formally examined for its physical realism, climatological
902 Journal of the American Statistical Association, September 2010
validity, etc. This transparency of assumed model(s) is not always as easily considered in the other approaches mentioned. Third, especially when the process stage model relates the climate to its known forcing factors, the use of forward models in the data and process stages is more physically realistic in terms of cause-and-effect relationships than the common specification of physically “inverse” models, for example, of the form climate = f(proxy).

Li, Nychka, and Ammann apply the BHM approach in a reconstruction simulation experiment (RSE) context, constructing the data models and deriving climate information (in this case, average Northern Hemisphere surface temperature) from a long paleoclimate run of a complex three-dimensional atmospheric-ocean general circulation model (AOGCM). The RSE approach represents a major step forward in examining paleoclimate reconstruction efficacy, providing an experimental “test bed” in which reconstruction approaches can be evaluated prior to implementation on real-world data. Li, Nychka, and Ammann exploit the RSE context to examine a factorial set of simple to more complex BHM designs. Relaxing the physically realistic process model of temperature driven by climate forcings to a simple mean function allows examination of the usefulness of incorporating forcing data in the reconstruction process, which has only begun to be systematically explored in paleoclimate science (cf. Lee, Zwiers, and Tsao 2008, who examine use of the Kalman filter for this purpose in a seminal RSE study).

Li, Nychka, and Ammann’s design additionally allows examination of another important area of development in climate reconstruction: the value of using one proxy type only (preserving homogeneity of climate response characteristics, but with potential frequency sensitivity limitations) versus the combining of proxies with different response characteristics, but with a wider range of frequency sensitivities (e.g., Moberg et al. 2005). They construct simulated proxies with three kinds of frequency sensitivities to enable this examination, each with and without a stochastic noise component: (1) a high-frequency-only proxy (defined as carrying annual-to-decadal resolution), sampled at annual time steps (mimicking a potential worst-case of the information available in tree rings); (2) a mid and low-frequency proxy (defined as carrying decadal-to-lower resolution), sampled at 30-year time steps (mimicking the information typically available from pollen preserved in sedimentary deposits); and (3) a low-frequency-only proxy (defined as carrying multidecadal-to-multicentennial resolution), sampled at 100 five meter depth intervals (mimicking the information available from borehole temperature profiles as a partial function of heat diffusion from the surface, which is increasingly smoothed the farther in the past a surface temperature anomaly occurred). (Note that the relationship of depth to time in the diffusion model is not described by the authors.) Li, Nychka, and Ammann also add what they term an “oracle proxy,” which is formed as the “true” model temperature time series at the proxy locations, with and without stochastic noise (allowing examination of the extent to which these locations carry sufficient information to capture the full hemispheric mean). This richness of factorial examination is unique in the RSE literature to date, and is even more powerful when combined with examination of the presence/absence of forcing information.

It should be noted that RSE comparison of reconstruction models, noise levels, proxy richness and spatial coverage, calibration choices, and other factors has been done in this relatively new area of study, although typically only one or a few factors are evaluated in a given study (cf. Mann and Rutherford 2002; Rutherford et al. 2005, 2010; Bürger, Fast, and Cubasch 2006; Ammann and Wahl 2007; Mann et al. 2007; Lee, Zwiers, and Tsao 2008; Jones et al. 2009; Riedwyl et al. 2009; von Storch, Zorita, and González-Rouco 2009; Smerdon et al. 2010). The kind of richer factorial RSE developed by Li, Nychka, and Ammann in a BHM context can also be done with the traditional reconstruction methods described. Large n-way sets of simulated proxy data could be developed that include stochastic noise components, and these could be used in various reconstruction scenarios in Monte Carlo fashion. Forcings information could also be incorporated in a variety of ways. However, carrying over uncertainty estimation into a real-world context would continue to be less “natural” than in the Bayesian context, as described.

4. SIMULATION RESULTS

4.1 Inclusion of Forcing Information

The results from Li, Nychka, and Ammann’s examination show several salient features. Most prominently, the inclusion of forcing information generally leads to greatly improved reconstructions, especially in terms of reduction of variance. Forcing information also is greatly valuable to reduce reconstruction bias when the temperature process model is assumed to be constant across both the known, instrumental time period, and the time period to be reconstructed utilizing the proxy data. When this assumption is relaxed (representing a partial relaxation of the stationarity assumption described above) bias is generally unaffected by inclusion of forcing information, with the exception of reconstructions based on noisy tree ring information only, in which case adding forcing information increases bias towards too-low values. Although this relaxation leads to reduced bias (which can be noted particularly in the absence of noise in the proxy data), it is associated with increased variance, which should be taken into account before choosing this variant of the BHM over its stationary complement.

4.2 Inclusion of Different Kinds of Proxy Information

When pollen-type proxy data are included with either tree ring or borehole-type data, or these types combined, the value of forcing information to reduce bias under the full stationarity assumption is significantly lessened. The inclusion of pollen-type information also reduces variance in all cases. These results suggest that the mid-to-low frequency information carried by the pollen data represents a significant portion of the additional information content that the forcing data carry and could, to an extent, substitute for the forcing information. It should be noted that establishing the depth-age relationship in the sedimentary deposits from which fossil pollen is typically extracted depends on dating methods that carry significant inherent imprecision and age-dependent possibilities of bias (such as ¹⁴C dating), and can also be affected by offsets in the time of pollen production versus its deposition, similar offsets between production of the material dated and its deposition, nonconstant
rates of sedimentation that make interpolation between dated strata uncertain, and other factors. These sources of error could potentially be modeled in the BHM framework (cf. Haslett and Parnell 2008), but would introduce additional imprecision in the information content of pollen data. Thus, the frequency capacity assumed in Li, Nychka, and Ammann’s tests is a best-case situation (approached in the real world at relatively rare sites that produce annual/subannual depositional bands called varves, which, presumably, can be counted like tree rings), and the partial capacity of pollen-derived information to substitute for forcing data in BHM reconstructions likely represents a theoretical optimum that is not generally obtainable in practice.

By construction, tree-ring-type data are necessary to obtain higher-frequency information at subdecadal time scales in Li, Nychka, and Ammann’s tests. This kind of dependence is also seen in many real-world paleoclimate reconstruction situations, and the availability/unavailability of dendrochronological information is one of the key features that differentiate paleoclimatology of the past 1–2 millennia from reconstruction work further back in time. That very low frequency information provided by borehole-type data had little effect on reconstruction quality is surprising, and likely is due to the age-related smoothing of borehole information the farther one goes back in time. However, the geographic coverage of boreholes is extensive, and exploiting them for the kind of low frequency information they can offer, whether within a BHM framework or not, continues to be a valuable tool to paleoclimatologists, not least because the inversion relationships that underlie borehole reconstructions are based on well-established physical laws of heat diffusion in solids.

5. LOOKING TO THE FUTURE

The BHM RSE study by Li, Nychka, and Ammann, along with a few empirical studies using Bayesian approaches, demonstrates that Bayesian methods, and the BHM structure in particular, represent an important new application in the toolkit of paleoclimate reconstruction. By offering an explicit model structure and systematic treatment of uncertainty from “start to finish,” BHM offers both a new methodology and a reference point for work to extend traditional paleoclimate methods to build well-composed reconstruction ensembles. It will be particularly valuable to make carefully designed parallel comparisons between BHM and traditional reconstruction methods extended to generate true ensembles, to determine the relative efficacy of both approaches for understanding past climates, and also to examine potential issues in terms of computational effort and effects that may be caused by relative stability/instability of estimated parameters in the BHM framework. Dealing with dating issues involved with any kind of proxy that is not associated with a physical process with regular time steps (annual growth characteristics in trees, varves in sediments, etc.) is another clear avenue for refinement, as mentioned. Additional dimensions in RSE analyses of BHM performance that would be valuable to incorporate include: (1) joint analysis of more than one climate variable, for example, temperature and precipitation; (2) employing more than one AOGCM run, to enable isolation and elimination of model-specific outcomes; (3) examination of seasonal as well as annual reconstruction performance, as annual averages can mask important seasonal variability; and (4) relaxation of the assumption of time-invariant noise processes, which is recognized as nontrivial and beyond the scope of the design used by Li, Nychka, and Ammann.

5.1 Extending BHM Methods to Reconstruct Spatially Explicit Climate Data

Extending BHM methods to reconstruct spatially explicit climate data, for example, regularly gridded data comprising climate fields in latitude-longitude space, is a new frontier for significant further effort, following the pioneering work of Tingley and Huybers (2010a, 2010b). As mentioned, a key success of paleoclimatology has been the development of simultaneous whole-field reconstruction approaches, one of the challenges of which is to properly characterize uncertainty. This is a challenge to which application of the BHM approach could be expected to add significant capacity, again both methodologically and as a reference point for improving traditional reconstruction methods. An intriguing possibility in this regard would be to include spatial climate information produced by AOGCMs (and their successors, earth system models, or ESMs) at the process model stage in the BHM, in an analogous manner to the way forcing information is used by Li, Nychka, and Ammann. Another issue that arises for spatial climate reconstructions is that logically they should be weighted in reconstruction skill analyses to take into account the fact that the climate system has “nodes” of particular sensitivity or importance for the entire earth (e.g., the tropical Pacific involved in the El Niño-Southern Oscillation phenomenon and the North Atlantic region). Accounting for spatially varying importance of reconstruction success in building validation schemes for spatial climate reconstructions is a nontrivial issue, as it is in weather forecasting, and the application of BHM methods to help in this work could be highly valuable.

5.2 Refining the Data Models Used in BHMs

Finally, significant work could be done to refine the data models used in BHMs for climate reconstruction. The borehole model used by Li, Nychka, and Ammann is realistic at a first-order level, but the tree ring and pollen models are less realistic and deserve significant refinement. Work is currently being done in developing biophysical models of seasonal tree growth driven by a small set of key climatic variables, which carries potential for being incorporated into BHM-based reconstructions (e.g., Evans et al. 2006). The authors are developing a more realistic model for pollen response to temperature, which is described in general terms in Section 5.3 and applied and tested in the Supplemental Online Material (SOM) (cf. Korhola et al. 2002; Haslett et al. 2006). Adding these, and/or other, data model refinements will amplify the inherent strength of the BHM approach in terms of explicit specification of the hierarchical reconstruction model. Of particular interest would be to combine such data model refinements with the incorporation of ESM output at the process model stage, described above. Pursuit of these goals provides important direction for further development work of the kind set forth by Li, Nychka, and Ammann.
5.3 A Binomial-Logistic Pollen Data Model for Use in BHM Reconstruction of Temperature

Modern paleoenvironmental reconstruction from fossil pollen data often attempts to take advantage of the fact that pollen “assemblages” archived in sedimentary deposits are generally rich in taxonomic (i.e., plant type) diversity. To produce a more realistic pollen data model for use in real-world applications of the BHM, however, it was decided to define a “reduced-space” taxonomic model that would be parsimonious in introducing additional parameters that need to be estimated within the BHM. To be even more parsimonious, another possibility is to estimate the pollen model separately, which could then be incorporated in the BHM to serve a role analogous to the transformation matrices utilized by Li, Nychka, and Ammann to define the “forward” (i.e., causal) relationships between climate and proxy information [the “M” matrices in their Equations (4.1)–(4.3)].

To meet these criteria, a refinement of the traditional pollen ratio method (cf. Adam and West 1983) was developed. The ratio method has been known for some time to be a useful tool in pollen-based paleoclimate reconstruction in situations in which one (or a few) dominant pollen type(s) in a region have a strong positive correlation with a climate variable of interest and another (or a few) dominant pollen type(s) have a strong negative correlation with the same climate variable. A classic example of this situation occurs in the coastal mountain regions of California, where oak (Quercus) and pine (Pinus) pollen representation, respectively, vary inversely in relation to temperature (Adam and West 1983; Wahl 2003). When counts of these pollen types are combined as Pinus/(Quercus + Pinus) or Quercus/(Quercus + Pinus) ratios, a mathematically appropriate estimation form (which generally has not been utilized in this context) is the binomial logistic generalized linear model (GLM) (Gelman et al. 2004).

The GLM also can readily model this relationship in the forward form of pollen = g(climate), which is more physically realistic in terms of the direction of causation. The specification of such a forward model is shown below:

$$r_{\mathrm{num}} \sim \mathrm{Bin}(n, p),$$

where

$$E(r \mid T) = p = \frac{\exp(\eta)}{1 + \exp(\eta)},$$

and

$$\eta = \alpha + \beta T.$$

Here, $r$ is the pollen ratio formed as above, $r_{\mathrm{num}}$ is the ratio numerator, $n$ is the ratio denominator (i.e., the sum of pollen counts), the denominator-specific count is $(n - r_{\mathrm{num}})$, and $T$ is the temperature at each site corresponding to a specific value of $r$.

As described further in the SOM, $\alpha$ and $\beta$ were estimated using the GLM algorithm (in the R language), yielding fitted values of $p = E(r \mid T)$ for given values of $T$. These fitted $E(r \mid T)$, $T$ combinations were then compared with the actual sampled $r$, $T$ pairs to determine how much of the total GLM deviance the $E(r \mid T)$ values explain, analogous to explained variation in a standard linear model. Importantly, the new ratio method performs extremely well, thereby allowing great simplification from ecological, mathematical, and by extension, computational standpoints. Although it includes only two to four pollen types, it can provide as much or more explained variation in the pollen-temperature relationship as a 64-type “modern analog technique,” or MAT (∼80% explained variation in temperate northeast North America where it has been applied; cf. the SOM for description of the MAT). Thus, the new pollen ratio method represents an information-rich, taxonomic “reduced space” data model that can be fruitfully, and efficiently, employed in a BHM framework.

SUPPLEMENTAL MATERIALS

A Pollen ‘Forward’ Model to Enhance the Realism of the BHM: Provides additional material describing the application and testing of the binomial-logistic pollen data model outlined in Section 5.3. Section S.1: Specification, testing, and estimation of the binomial-logistic GLM. Section S.2: Use of the model in paleotemperature reconstruction and uncertainty estimation. Section S.3: Technical note (regarding languages used in computation). Figures S1, S2, and S3 are also provided. (supplement.pdf)

ADDITIONAL REFERENCES

Adam, D. P., and West, G. J. (1983), “Temperature and Precipitation Estimates Through the Last Glacial Cycle From Clear Lake, California, Pollen Data,” Science, 219, 168–170. [904]
Ammann, C. M., Genton, M. G., and Li, B. (2010), “Technical Note: Correcting for Signal Attenuation From Noisy Proxy Data in Climate Reconstructions,” Climate of the Past, 6, 273–279. [901]
Bürger, G., Fast, I., and Cubasch, U. (2006), “Climate Reconstruction by Regression – 32 Variations on a Theme,” Tellus, Ser. A, 58 (2), 227–235. [902]
Cook, E. R., Briffa, K. R., and Jones, P. D. (1994), “Spatial Regression Methods in Dendroclimatology: A Review and Comparison of Two Techniques,” International Journal of Climatology, 14, 379–402. [901]
Cook, E. R., Woodhouse, C. A., Eakin, C. M., Meko, D. M., and Stahle, D. W. (2004), “Long-Term Aridity Changes in the Western United States,” Science, 306, 1015–1018. [901]
Evans, M. N., Reichert, B. K., Kaplan, A., Anchukaitis, K. J., Vaganov, E. A., Hughes, M. K., and Cane, M. A. (2006), “A Forward Modeling Approach to Paleoclimatic Interpretation of Tree-Ring Data,” Journal of Geophysical Research, 111, G03008, DOI: 10.1029/2006JG000166. [903]
Frank, D., Esper, J., Raible, C. C., Büntgen, U., Trouet, V., Stocker, B., and Joos, F. (2010), “Ensemble Reconstruction Constraints on the Global Carbon Cycle Sensitivity to Climate,” Nature, 463, 527–530. [901]
Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2004), Bayesian Data Analysis (2nd ed.), Boca Raton, FL: Chapman & Hall/CRC Press. [904]
Hegerl, G. C., Crowley, T. J., Allen, M. R., Hyde, W. T., Pollack, H. N., Smerdon, J. E., and Zorita, E. (2007), “Detection of Human Influence on a New, Validated 1500 Year Temperature Reconstruction,” Journal of Climate, 20, 650–666. [901]
Imbrie, J., and Imbrie, K. P. (1979), Ice Ages: Solving the Mystery, Cambridge, MA and London: Harvard University Press. [901]
Jones, P. D., Briffa, K. R., Osborn, T. J., Lough, J. M., van Ommen, T. D., Vinther, B. M., Luterbacher, J., Wahl, E. R., Zwiers, F. W., Mann, M. E., Schmidt, G. A., Ammann, C. M., Buckley, B. M., Cobb, K. M., Esper, J., Goosse, H., Graham, N., Jansen, E., Kiefer, T., and Kull, C. (2009), “High-Resolution Palaeoclimatology of the Last Millennium: A Review of Current Status and Future Prospects,” Holocene, 19 (1), 3–49. [902]
Korhola, A., Vasko, K., Toivonen, H. T. T., and Olander, H. (2002), “Holocene Temperature Changes in Northern Fennoscandia Reconstructed From Chironomids Using Bayesian Modelling,” Quaternary Science Reviews, 21, 1841–1860. [903]
Luterbacher, J., Dietrich, D., Xoplaki, E., Grosjean, M., and Wanner, H. (2004), “European Seasonal and Annual Temperature Variability, Trends, and Extremes Since 1500,” Science, 303, 1499–1503. [901]
Mann, M. E., Rutherford, S., Wahl, E., and Ammann, C. (2007), “Robustness of Proxy-Based Climate Field Reconstruction Methods,” Journal of Geophysical Research, 112, D12109, DOI: 10.1029/2006JD008272; corrigenda (2008), 113, D18107, DOI: 10.1029/2008JD009964. [901,902]
Mann, M. E., Zhang, Z., Rutherford, S., Bradley, R. S., Hughes, M. K., Shindell, D., Ammann, C., Faluvegi, G., and Ni, F. (2009), “Global Signatures and Dynamical Origins of the Little Ice Age and Medieval Climate Anomaly,” Science, 326, 1256–1260, DOI: 10.1126/science.1177303. [901]
Riedwyl, N., Küttel, M., Luterbacher, J., and Wanner, H. (2009), “Comparison of Climate Field Reconstruction Techniques: Application to Europe,” Climate Dynamics, 32 (2–3), 381–395, DOI: 10.1007/s00382-008-0395-5. [902]
Rutherford, S., Mann, M. E., Ammann, C. M., and Wahl, E. R. (2010), Comments on “A Surrogate Ensemble Study of Climate Reconstruction Methods: Stochasticity and Robustness,” by B. Christiansen, T. Schmith, and P. Theijll, Journal of Climate, 23, 2832–2838, DOI: 10.1175/2009JCLI3146.1. [902]
Rutherford, S., Mann, M. E., Osborn, T. J., Bradley, R. S., Briffa, K. R., Hughes, M. K., and Jones, P. D. (2005), “Proxy-Based Northern Hemisphere Surface Temperature Reconstructions: Sensitivity to Methodology, Predictor Network, Target Season and Target Domain,” Journal of Climate, 18, 2308–2329. [902]
Smerdon, J. E., Kaplan, A., Chang, D., and Evans, M. N. (2010), “A Pseudoproxy Evaluation of the CCA and RegEM Methods for Reconstructing Climate Fields of the Last Millennium,” Journal of Climate, in press. [902]
Tingley, M. P., and Huybers, P. (2010a), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part I: Development and Applications to Paleoclimate Reconstruction Problems,” Journal of Climate, 23, 2759–2781. [903]
Tingley, M. P., and Huybers, P. (2010b), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part II: Comparison With the Regularized Expectation–Maximization Algorithm,” Journal of Climate, 23, 2782–2800. [903]
von Storch, H., Zorita, E., and González-Rouco, F. (2009), “Assessment of Three Temperature Reconstruction Methods in the Virtual Reality of a Climate Simulation,” International Journal of Earth Sciences, 98, 67–82. [902]
Wahl, E. R. (2003), “Pollen Surface Samples for Paleoenvironmental Reconstruction From the Coast and Transverse Ranges of Southern California,” Madroño, 50 (4), 286–299. [904]
Wahl, E. R., and Ammann, C. M. (2007), “Robustness of the Mann, Bradley, Hughes Reconstruction of Surface Temperatures: Examination of Criticisms Based on the Nature and Processing of Proxy Climate Evidence,” Climatic Change, 85, 33–69. [901]
Comment
Richard L. SMITH

Richard L. Smith is Director, Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, NC 27709-4006, and Mark L. Reed III Distinguished Professor, Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC 27599-3260 (E-mail: rls@email.unc.edu). SAMSI is supported by the National Science Foundation, grant DMS 0635449. I am grateful to Doug Nychka and Caspar Ammann for making their data and programs available.
© 2010 American Statistical Association. Journal of the American Statistical Association, September 2010, Vol. 105, No. 491, Applications and Case Studies. DOI: 10.1198/jasa.2010.ap10507

The paper by Li, Nychka, and Ammann (2010) has exemplified the power of Bayesian Hierarchical Models to solve fundamental problems in paleoclimatology. However, much can also be learned by more elementary statistical methods. In this discussion, we use principal components analysis, regression, and time series analysis to reconstruct the temperature signal since 1400 based on tree ring data. Although the “hockey stick” shape is less clear cut than in the original analysis of Mann, Bradley, and Hughes (1998, 1999), there is still substantial evidence that recent decades are among the warmest of the past 600 years.

The problem of paleoclimate reconstruction is a natural one for the use of Bayesian hierarchical models (BHMs). As in most BHMs, there is an unobserved “process” which is the true object of interest (in this case, the true series of temperatures). There are also various sources of “data” which are dependent on the “process” with different levels of accuracy: observational data, tree rings, boreholes, ice cores, etc. The problem of paleoclimate reconstruction may be characterized as how to combine the different data series to obtain the best reconstruction of the unobserved process, with suitable measures of uncertainty. The BHM technique is especially valuable for answering nonstandard uncertainty questions, for instance, “what is the probability that the 1990s were the warmest decade of the [1000–2000] millennium?”

In an earlier paper, Li, Nychka, and Ammann (henceforth LNA 2007) used an ensemble reconstruction, obtained via a combination of linear regression, bootstrapping and cross-validation, to reconstruct Northern Hemisphere average temperatures back to 1000, using 14 proxy series first discussed in Mann, Bradley, and Hughes (MBH 1999). Their results showed that there is indeed a high probability that the 1990s were the warmest decade of the millennium. The BHM technique has since been taken up by other authors, such as Tingley and Huybers (2010a, 2010b) and Brynjarsdóttir and Berliner (2010), and promises to be the method of choice for future statistical analyses of paleoclimatic data.

In the paper under discussion, LNA (2010) have shown that it is also possible to answer “design”-type questions using BHMs. I believe that this is the logical next step in the scientific application of BHMs to paleoclimatology, and the methodology they have presented will play an important role in the selection of proxies for future paleoclimatological studies. I commend their contribution.

Although I fully support the further development of the BHM approach, it seems to me there is still some merit in looking for simpler statistical approaches, using methods that are routinely taught in first-year graduate courses in statistics, and that can (through the ready availability of the R programming language) be easily adopted by paleoclimatologists without extensive statistical training. Indeed, much of the debate over the “hockey stick curve” has focused on the correct use of elementary statistical methods, in particular, the method of principal components (PCs). For the remainder of this note, I aim to show how routine application of PCs, regression, and time series analysis can be used to resolve some issues that have caused much contention in the literature.

BRIEF SUMMARY OF THE CONTROVERSY

The hockey stick curve, in the form that is currently debated, was first constructed in two papers of MBH (1998, 1999). After
Figure 1. Reconstructed temperatures from 70 tree rings (1400–1980) in the North American ITRDB dataset. Each series has been smoothed using the 25-year triangular window described in the text. The online version of this figure is in color.

Figure 3. The first principal component, computed from the tree ring dataset using a conventional (correlation-based) PC decomposition, together with the smoothed trend. The online version of this figure is in color.

Figure 5. Six reconstructions of historical temperature anomalies, together with their smoothed trends and pointwise 90% prediction intervals on the trends. The online version of this figure is in color.
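
The 25-year triangular window mentioned in the Figure 1 caption is not defined in this excerpt. The sketch below shows one common reading, assumed here rather than taken from the comment: a centered, symmetric filter whose weights rise linearly to a peak at the middle year and are normalized to sum to one. The function names and the choice to leave the first and last 12 years unsmoothed are illustrative assumptions.

```python
def triangular_weights(width=25):
    """Symmetric triangular weights of odd length, normalized to sum to 1.

    ASSUMPTION: weights proportional to 1, 2, ..., peak, ..., 2, 1.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window can be centered")
    half = width // 2
    raw = [half + 1 - abs(j) for j in range(-half, half + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def triangular_smooth(series, width=25):
    """Centered triangular moving average.

    ASSUMPTION: positions where the full window does not fit are
    returned as None instead of being padded or reweighted.
    """
    weights = triangular_weights(width)
    half = width // 2
    smoothed = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        smoothed[t] = sum(w * x for w, x in zip(weights, window))
    return smoothed
```

Because the weights are symmetric, a linear trend passes through this filter unchanged, which is a quick sanity check on any implementation.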
powerful way to select a model is to treat all of K, p, and q as undetermined model parameters, to perform a generalized least squares (GLS) analysis, and to select K, p, and q to minimize one of AIC, BIC, or AICC. Unfortunately, this procedure quickly produces unwieldy models and does not lead to a clear-cut conclusion. For example, fitting models by minimizing AIC up to K = 9, p = 10, q = 5 produced the best model at K = 8, p = 2, q = 5. These results were obtained by maximum likelihood fitting using the arima command in R (R Core Development Team 2010), ignoring models for which the MLE algorithm did not converge or for which the resulting model fit violated the stationarity condition for the autoregressive part of the model. However, the final model is hard to interpret with so many parameters, and it seems probable that still higher-order models would be obtained if larger values of K, p, q were permitted. Similar results were obtained using BIC and AICC.

Table 1. Table of AIC, BIC, AICC values for OLS regression without allowing for autocorrelation

 K     AIC     BIC    AICC
 1   −40.4   −33.3   −40.1
 2   −58.2   −48.7   −57.7
 3   −58.7   −46.9   −57.9
 4   −57.3   −43.0   −56.1
 5   −55.4   −38.8   −53.8
 6   −57.3   −38.4   −55.3
 7   −58.1   −36.8   −55.5
 8   −63.8   −40.1   −60.5
 9   −66.4   −40.4   −62.5
10   −66.1   −37.7   −61.4

As an alternative to full GLS time series regression, therefore, I used the same OLS fits for the regression components produced earlier, but selected the optimal ARMA(p, q) model fitted to the residuals, and then recalculated the width of the prediction intervals to take account of the autocorrelation. This produced more easily interpretable results. For example, with K = 2, the optimal ARMA model had p = 1, q = 2 when selecting by AIC and p = 1, q = 0 when using BIC or AICC. With K = 9, all three selection criteria resulted in AR(1) as the optimal time series model, which is incidentally relevant to LNA (2010), where they used AR(2) as the time series model for residuals, though Dr. Li remarked in her oral presentation that the AR(1) model appears equally suitable in practice.

For the three models just derived, the reconstructed curves, with prediction interval bounds for the 25-year moving average,
are shown in Figures 6 through 8. Also shown on the plot are the actual global mean temperatures for 1902–1980, and a 25-year triangular moving average filter applied to those. The three figures look very similar to each other, though the width of the prediction intervals is about twice that in Figure 5. All three confirm that the reconstructed smoothed temperature for prior centuries was well below its value in recent decades.

Figure 6. Reconstructed series with time series corrections: K = 2 PCs with AR(1) residuals. The solid curves represent the smoothed reconstructed series with pointwise 90% prediction intervals. The dashed curve at the right-hand end is the same smoother applied to the observational data points. The online version of this figure is in color.

Figure 7. Reconstructed series with time series corrections: K = 2 PCs with ARMA(1, 2) residuals. The online version of this figure is in color.

Figure 8. Reconstructed series with time series corrections: K = 9 PCs with AR(1) residuals. The online version of this figure is in color.

DISCUSSION AND SUMMARY

This analysis has used principal components regression combined with time series analysis of the residuals to reconstruct the global mean temperature series back to 1400. I smoothed the reconstructed series using a 25-year triangular moving average, and calculated 90% prediction intervals on the smoothed reconstruction as a measure of uncertainty. Three standard statistical model selection criteria (AIC, BIC, and AICC) were used to select the model orders K (number of PCs), p and q (for the autoregressive and moving average components of the time series model fitted to the residuals). Although these criteria do not lead to clear-cut selection of the best model, the final reconstructions do not appear to depend too sensitively on the model selected. Taking into account the general desire in applied statistics for a parsimonious model, the model with K = 2 PCs and AR(1) residuals appears adequate.

The idea of PC regression as a technique in paleoclimate reconstruction is not new; for example, it was discussed in chapter 9 of NRC (2006). However, it does not appear to have been systematically developed.

The results support an overall conclusion that the temperatures in recent decades have been higher than at any previous time since 1400. On the other hand, none of the recent reconstructions shows as sharp a hockey stick shape as the widely reproduced figure 3(a) of MBH (1999), so in that respect, critics of the hockey stick are also partially vindicated by these results.

I have confined this discussion to statistical aspects of the reconstruction, not touching on the question of selecting trees for the proxy series (extensively discussed by M&M; Wegman, Scott, and Said; and Ammann/Wahl) nor the apparent recent “divergence” of the relationship between tree ring reconstructions and measured temperatures (see, e.g., NRC 2006, pp. 48–52). I regard these as part of the wider scientific debate about dendroclimatology but not strictly part of the statistical discussion, though it would be possible to apply the same methods as have been given here to examine the sensitivity of the analysis to different constructions of the proxy series or to different specifications of the starting and ending points of the analysis.

ADDITIONAL REFERENCES

Akaike, H. (1973), “Information Theory and an Extension of the Maximum Likelihood Principle,” in 2nd International Symposium on Information Theory, eds. B. N. Petrov and F. Csaki, Budapest: Akademia Kiado, pp. 267–281. [907]
Akaike, H. (1978), “A Bayesian Analysis of the Minimum AIC Procedure,” Annals of the Institute of Statistical Mathematics, 30, 9–14. [907]
Brockwell, P. J., and Davis, R. A. (2003), Introduction to Time Series and Forecasting (2nd ed.), New York: Springer. [907]
Brynjarsdóttir, J., and Berliner, L. M. (2010), “Bayesian Hierarchical Modeling for Paleoclimate Reconstruction From Geothermal Data,” preprint, Ohio State University. [905]
Hurvich, C. M., and Tsai, C.-L. (1989), “Regression and Time Series Model Selection in Small Samples,” Biometrika, 76, 297–307. [907]
Li, B., Nychka, D. W., and Ammann, C. M. (2010), “The Value of Multi-Proxy Reconstruction of Past Climate,” Journal of the American Statistical Association, 105, 883–895. [905,908]
Mann, M. E., Bradley, R. S., and Hughes, M. K. (1999), “Northern Hemisphere Temperatures During the Past Millennium: Inferences, Uncertainties, and Limitations,” Geophysical Research Letters, 26, 759–762. [905,906,909]
McIntyre, S., and McKitrick, R. (2003), “Corrections to the Mann et al. (1998) Proxy Data Base and Northern Hemisphere Average Temperature Series,” Energy and Environment, 14, 751–771. [906]
McIntyre, S., and McKitrick, R. (2005a), “Hockey Sticks, Principal Components, and Spurious Significance,” Geophysical Research Letters, 32, L03710, DOI: 10.1029/2004GL021750. [906]
McIntyre, S., and McKitrick, R. (2005b), “The M&M Critique of the MBH98 Northern Hemisphere Climate Index: Update and Implications,” Energy and Environment, 16, 69–100. [906,907]
R Core Development Team (2010), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing, available at http://www.R-project.org. [908]
Tingley, M. P., and Huybers, P. (2010a), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part I: Development and Applications to Paleoclimate Reconstruction Problems,” Journal of Climate, 23, 2759–2781. [905]
Tingley, M. P., and Huybers, P. (2010b), “A Bayesian Algorithm for Reconstructing Climate Anomalies in Space and Time. Part II: Comparison With the Regularized Expectation–Maximization Algorithm,” Journal of Climate, 23, 2782–2800. [905]
Wahl, E. R., and Ammann, C. M. (2007), “Robustness of the Mann, Bradley, Hughes Reconstruction of Northern Hemisphere Surface Temperatures: Examination of Criticisms Based on the Nature and Processing of Proxy Climate Evidence,” Climatic Change, 85, 33–69. [907]
Wegman, E. J., Scott, D. W., and Said, Y. H. (2006), “Ad hoc Committee Report on the ‘Hockey Stick’ Global Climate Reconstruction,” presented to the Committee on Energy and Commerce and the Subcommittee on Oversight and Investigations, U.S. House of Representatives. [906]
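
As a worked check on Table 1 of Smith's comment, the sketch below picks the K that minimizes each criterion and tests whether the three printed columns are mutually consistent under the standard definitions (BIC = AIC + k(log n − 2), AICC = AIC + 2k(k + 1)/(n − k − 1)). Two inputs are assumptions, not statements from the comment: n = 79 calibration years (1902–1980) and k = K + 2 fitted parameters (intercept, K slopes, residual variance). The comment used R; Python is used here only for illustration.

```python
import math

# Table 1 of the comment, copied as printed: K -> (AIC, BIC, AICC)
TABLE_1 = {
    1: (-40.4, -33.3, -40.1),
    2: (-58.2, -48.7, -57.7),
    3: (-58.7, -46.9, -57.9),
    4: (-57.3, -43.0, -56.1),
    5: (-55.4, -38.8, -53.8),
    6: (-57.3, -38.4, -55.3),
    7: (-58.1, -36.8, -55.5),
    8: (-63.8, -40.1, -60.5),
    9: (-66.4, -40.4, -62.5),
    10: (-66.1, -37.7, -61.4),
}

N_OBS = 79  # ASSUMPTION: calibration period 1902-1980


def n_params(K):
    # ASSUMPTION: intercept + K regression slopes + residual variance
    return K + 2


def bic_from_aic(aic, k, n=N_OBS):
    # Both criteria share -2 log L, so BIC - AIC = k log(n) - 2k
    return aic + k * (math.log(n) - 2.0)


def aicc_from_aic(aic, k, n=N_OBS):
    # Small-sample correction in the Hurvich-Tsai form
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


def best_K(column):
    """Return the K minimizing column 0 (AIC), 1 (BIC), or 2 (AICC)."""
    return min(TABLE_1, key=lambda K: TABLE_1[K][column])


def max_discrepancy():
    """Largest gap between printed and recomputed BIC/AICC values."""
    worst = 0.0
    for K, (aic, bic, aicc) in TABLE_1.items():
        k = n_params(K)
        worst = max(worst,
                    abs(bic_from_aic(aic, k) - bic),
                    abs(aicc_from_aic(aic, k) - aicc))
    return worst
```

On the printed values, AIC and AICC are minimized at K = 9 while BIC is minimized at K = 2, which matches the choice of K = 2 and K = 9 for the reconstructions in Figures 6 through 8. Under the two assumptions above, the recomputed BIC and AICC columns agree with the printed ones to within about 0.1, which is what rounding to one decimal place would produce.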
Rejoinder
Bo LI, Douglas W. NYCHKA, and Caspar M. AMMANN

Bo Li is Assistant Professor, Department of Statistics, Purdue University, West Lafayette, IN 47906 (E-mail: boli@stat.purdue.edu). Douglas W. Nychka is Senior Scientist and Director of Institute for Mathematics Applied to Geosciences (E-mail: nychka@ucar.edu) and Caspar M. Ammann is scientist (E-mail: ammann@ucar.edu), National Center for Atmospheric Research (NCAR), Boulder, CO 80307.
© 2010 American Statistical Association. Journal of the American Statistical Association, September 2010, Vol. 105, No. 491, Applications and Case Studies. DOI: 10.1198/jasa.2010.ap10450

Paleoclimate reconstruction provides a very good example of the necessity of combining statistics with the geosciences, and both areas clearly benefit from this combination. The “Climategate” review panel chaired by Lord Oxburgh concluded that “It would be helpful for researchers to work more closely with professional statisticians in future. This would ensure the best methods were used when analyzing the complex and often ‘messy’ data on climate,” quoted from BBC news (http://news.bbc.co.uk/2/hi/science/nature/8618024.stm). In our paper we provided a framework for paleoclimate reconstruction to help more statisticians contribute to this area and also to provide a clearer description of the statistical assumptions that support a particular method. In this way we hope that we have addressed needs highlighted by this recent review panel. The key points from the three discussions point out limitations of our analysis and also suggest solutions for improving the reconstructions. We note that there is a consensus among the discussants that the Bayesian hierarchical model (BHM) approach is a positive contribution to paleoclimate, and the numerous constructive comments are much appreciated. We also thank the discussants for sharing some new work in their discussions on the spatial aspects of reconstructions and on better models for linking proxies to climate variables. To keep our response succinct we will not comment on this new work except to acknowledge the value and innovativeness of these extensions.

1. DESIGN OF THE EXPERIMENT

Cressie and Tingley (CT) examined our experiment and interpreted our setup as having only one treatment: the BHM applied to the dataset. Indeed, we do not compare this with other methods of reconstruction. Our main interest was to understand the information content of different kinds of climate proxies and how this interacted with a process model. For that purpose, we consider the combination of three factors: with/without forcing covariates, with/without proxy noise, and modeling $T_1$/$T$. In the language of experimental design, we have five blocks comprising the different kinds of models and a $2^3$ design for the treatments. We accept CT’s criticism on the lack of replication. To our knowledge, however, this climate model run is unique in both its resolution and length. Adding in another climate model simulation would be better described as creating another factor for the experimental design rather than a replicate. Simulating a sample (also termed an ensemble in the geosciences) of climate model runs that reflect the model uncertainty and the variation in response is a grand challenge not only for paleoclimate studies but also for projections of future climate change.

CT questioned why the “oracle” case is not always the best. If forcings are not included, the “oracle” is uniformly the best in both measures. When forcings are included, the “oracle” is sometimes slightly worse than the others. The most likely explanation for this result is the sampling error due to considering just one replication, and this reinforces the point that we should consider more than one climate model run for evaluating the BHM approach.

2. REFINEMENT OF DATA MODELS

We agree that the forward models are unknown and much more complex than our current models, which are simply linear approximations and are not derived from physics. Regarding the moving average calculated in the forward model for tree rings, CT suggested using a moving window from t − 10 to t instead of the current t − 5 to t + 5. We agree that for a single tree ring the current window does not make physical sense, but it could reflect a composite of different trees or another form of annual climate proxy. The “tree ring” proxy sited at 75° N in our study was actually intended to represent an ice-core based proxy that had the same annual time scale information as a tree-ring proxy. Both CT and Wahl, Schoelzel, and Tigrek (WST) suggested a seasonal tree-growth model (Evans et al. 2006) as the future forward model for tree rings. This is indeed a much more realistic model to consider, but it requires several other climate inputs including precipitation, transpiration, runoff, and solar radiation, and it will introduce some nonlinear data models. Therefore, we suspect a simpler version of this growth model might be more useful in practice. WST have made impressive progress in developing a binomial-logistic generalized linear model as an efficient forward model for pollen, and we look forward to these components being incorporated into spatial reconstructions.
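
The two moving windows discussed in Section 2 of this rejoinder for the tree-ring forward model (the centered window t − 5 to t + 5 and CT's proposed trailing window t − 10 to t) can be compared with a small sketch. It is illustrative only: the helper names, the equal weighting within the window, and the treatment of the high-frequency component as the series minus its window mean are assumptions, not the forward model as specified in the paper.

```python
def window_mean(series, t, back, fwd):
    """Equal-weight mean of series[t - back .. t + fwd], inclusive.

    Returns None when the window runs off either end of the series.
    """
    lo, hi = t - back, t + fwd
    if lo < 0 or hi >= len(series):
        return None
    segment = series[lo : hi + 1]
    return sum(segment) / len(segment)


def centered_mean(series, t):
    # Window t-5 .. t+5: uses five future years, 11 values in total
    return window_mean(series, t, back=5, fwd=5)


def trailing_mean(series, t):
    # Window t-10 .. t: uses only current and past years, 11 values
    return window_mean(series, t, back=10, fwd=0)


def high_frequency(series, t, back=5, fwd=5):
    """ASSUMPTION: high-frequency part = value minus its window mean."""
    m = window_mean(series, t, back, fwd)
    return None if m is None else series[t] - m
```

For a steadily rising series the centered window leaves no residual at year t, while the trailing window leaves a positive one, which shows that the two choices are not interchangeable when the temperature series has a trend.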
CT raised the concern that adding the white noise to the GCM output before the forward matrices were applied is not consistent with the models for tree rings (4.1) and pollen (4.2). Since the transformer $M_D$ retains only high-frequency variability, the white noise added to GCM temperature will be carried over to the synthetic tree rings. Therefore, it makes little difference to the synthetic tree rings whether the white noise is added before or after the transformer is applied, and the AR(2) is used to model a mixture of white and serially correlated noise. The additive error term in (4.2) serves as an approximation to the well-known dating error in pollen rather than as a model of the white noise.

As opposed to assuming known forward models, CT suggested an approach that parametrizes the unknown transformation matrices, using $M_D$ as an example. They proposed to have $D_i(t) = \mu_i + c_1 T(t-5) + \cdots + c_{11} T(t+5)$ and then estimate the parameters with data. However, in order to remain parsimonious, that model imposes the strong restriction that the relationship between any tree-ring series and global temperature shares the same slope parameters, which seems somewhat unrealistic, as implied by the “divergence” problem that CT mentioned in Section 4.3 of their discussion. Their other suggestion, estimating the transformation matrices by putting a prior on them and then carrying out a sensitivity analysis, is more general and an excellent compromise between flexibility and controlling the number of model parameters.

3. SPATIAL RECONSTRUCTION

CT have sketched the hierarchies and presented a space–time random effects model that proposes nonseparable and nonstationary spatio-temporal covariances, as expected over a global spatial scale and a millennial temporal scale (Cressie, Shi, and Kang 2010). This model also allows for fast computation of iterations in MCMC by reducing the dimension of basis functions, and is a promising, feasible approach to deal with data over a large spatial scale. In addition, WST raised the excellent points that spatial climate information produced by AOGCMs can be included in the process stage, and that the spatial reconstruction should be weighted to account for “modes” in the Earth system with particular sensitivity for the entire earth. The spatial network sensitivity and spatial network design for proxies mentioned in CT are definitely interesting research questions that we should consider. In particular, what are the optimal locations to look for new proxies? Or, given all the available data, which locations should be considered?

4. OTHER ISSUES WITH BHM MODELS

In response to CT’s comments concerning the forcings: we introduced observational errors only into the volcanic forcing because, in our understanding, errors in the other two forcings have much less impact. The discrepancy between different estimates of solar irradiance S only amounts to a different scaling, which will be absorbed into the parameter $\beta_2$. The greenhouse gas estimates are nearly flat prior to the 1950s and afterwards are direct measurements; therefore the uncertainty in greenhouse gases is expected to be small and thus negligible. CT suggested modeling the measurement errors in instrumental temperatures, motivated by the various noises added to instrumental temperatures in the post-processing (Brohan et al. 2006). We find this a perceptive comment. In this sense the instrumental record is just another climate proxy, one that we hope has a small measurement error and a simple forward model! Including the instrumental record as another observational equation is an elegant way to deal with all the observed information in a unified way.

Because the radiative forcing associated with CO2 increases as the log of the mixing ratio, and because of the multiplicative form of the error in volcanism, CT suggested a new linear process model:

$$T = \beta_0 \mathbf{1} + \beta_1 S + \beta_2 \log(V_0) + \log(C) + \epsilon.$$

CT also suggested investigating the role of each forcing, as in our systematic evaluation of each proxy. Both are valuable suggestions that are worth trying in future studies.

Considering the relatively extensive statistical training required to use the BHM, and for the convenience of paleoclimatologists, Smith discussed the efficiency and benefits of using principal components regression combined with time series analysis. We agree that with a simple data structure, regression techniques can be very useful and are much appreciated for being easy to understand and implement, but it is also worth mentioning that those techniques may suffer from the attenuation effects caused by errors-in-variables (Ammann, Genton and Li 2010). A careful examination of the assumptions made for the regressions, together with appropriate corrections, can perhaps improve those methods.

In summary, the discussions have sketched some important improvements to our work and future directions. We are glad our paper has initiated these valuable responses.

ADDITIONAL REFERENCES

Brohan, P., Kennedy, J. J., Harris, I., Tett, S. F. B., and Jones, P. D. (2006), “Uncertainty Estimates in Regional and Global Observed Temperature Changes: A New Data Set From 1850,” Journal of Geophysical Research, 111, DOI: 10.1029/2005JD006548. [911]
Cressie, N., Shi, T., and Kang, E. L. (2010), “Fixed Rank Filtering for Spatio-Temporal Data,” Journal of Computational and Graphical Statistics, 19, 724–745. [911]
Evans, M., Reichert, B., Kaplan, A., Anchukaitis, K., Vaganov, E., Hughes, M., and Cane, M. (2006), “A Forward Modeling Approach to Paleoclimatic Interpretation of Tree-Ring Data,” Journal of Geophysical Research, 111, DOI: 10.1029/2006JG000166. [910]