Double Bootstrapping For Visualising The Distribution
We study single and double bootstrap procedures for estimating the distribution of descriptive
statistics for independent and identically distributed functional data. At the cost of longer
computational time, the double bootstrap with the same bootstrap method reduces confidence level
error and provides better coverage accuracy than the single bootstrap. Illustrated by a Canadian
weather station data set, the double bootstrap procedure presents a tool for visualising the
distribution of the descriptive statistics of functional data.
1 Introduction
Recent computer technology in data collection and storage allows statisticians to analyse functional
data. The monographs by Ramsay and Silverman (2002, 2005) present state-of-the-art parametric
techniques, while the book by Ferraty and Vieu (2006) gives a nonparametric treatment for
functional data analysis. For selective reviews on various aspects of functional data analysis,
consult Cuevas (2014), Shang (2014), Morris (2015), Goia and Vieu (2016), Wang et al. (2016) and
Reiss et al. (2017).
In functional data analysis, a key objective is to draw statistical inference about the distribution
of descriptive statistics from realisations generated from an unknown data generating process
at a population level. Although it is important to have a consistent estimator for descriptive
statistics, it is equally important to estimate the variability associated with the descriptive statistics,
construct confidence intervals (CIs) and carry out hypothesis tests. When such a problem arises,
bootstrapping turns out to be the only practical alternative (see, e.g., Cuevas et al., 2006; McMurry
and Politis, 2011).
* Corresponding address: Department of Actuarial Studies and Business Analytics, Level 7, 4 Eastern Road,
Macquarie University, Sydney, NSW 2109, Australia; Email: hanlin.shang@mq.edu.au; ORCID: https://orcid.org/
0000-0003-1769-6430
Since functional data do not have a well-defined density, nonparametric bootstrapping is
commonly used for independent and identically distributed (i.i.d.) functional data. In nonparametric bootstrapping, one resamples $\mathcal{X}^b = \{X_1^b, \dots, X_n^b\}$ drawn randomly with replacement from the original functions $\mathcal{X} = \{X_1, \dots, X_n\}$, so that each $X_i^b$ has probability $n^{-1}$ of being equal to any given one of the $X_j$'s,
$$P\big(X_i^b = X_j \mid \mathcal{X}\big) = n^{-1}, \qquad 1 \le i, j \le n.$$
With a set of bootstrap samples $\{\mathcal{X}^1, \dots, \mathcal{X}^B\}$, where $B$ denotes the number of bootstrap samples, we can reduce the coverage level error between the nominal and empirical coverage probabilities by iterating the bootstrap once more. Through a nonparametric bootstrap, one resamples $\mathcal{X}^{b\eta} = \{X_1^{b\eta}, \dots, X_n^{b\eta}\}$ drawn randomly with replacement from the bootstrap functions $\mathcal{X}^b = \{X_1^b, \dots, X_n^b\}$, so that each $X_i^{b\eta}$ has probability $n^{-1}$ of being equal to any given one of the $X_j^b$'s,
$$P\big(X_i^{b\eta} = X_j^b \mid \mathcal{X}^b, \mathcal{X}\big) = n^{-1}, \qquad 1 \le i, j \le n.$$
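The two resampling layers can be sketched in a few lines; a minimal illustration assuming each curve is stored as a row of an $n \times T$ numpy array evaluated on a common grid (the array `X` below is toy data, not from the paper):

```python
import numpy as np

def bootstrap_sample(X, rng):
    """One nonparametric bootstrap sample: the n rows of X (one curve per
    row, on a common grid) are resampled with replacement, so each
    resampled curve equals any given original curve with probability 1/n."""
    n = X.shape[0]
    idx = rng.integers(0, n, size=n)  # uniform draws from {0, ..., n-1}
    return X[idx]

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 101))      # toy data: n = 100 curves on 101 grid points

Xb = bootstrap_sample(X, rng)        # first layer: single bootstrap sample
Xbeta = bootstrap_sample(Xb, rng)    # second layer: resample the bootstrap sample
```

Iterating `bootstrap_sample` on its own output is exactly the second-layer scheme above.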
We use the single and double bootstrap procedures to study the distribution of descriptive
statistics of functional data denoted by θ = ψ(F0 ), which is a function of the underlying distribu-
tion function F0 . The bootstrap principle is to estimate a function of the population distribution
F0 by using the same function of the bootstrap distribution function based on a random sample
of size n from F0 . Following the early work by Martin (1990a,b), we consider an example of iter-
ated bootstrapping, namely double bootstrapping, to construct CIs with more accurate coverage
probability than single bootstrapping for estimating the distribution of descriptive statistics of
functional data (see also Shang, 2015).
There is an extensive literature on double bootstrapping for scalar-valued data (see, e.g., Chang
and Hall, 2015). The first article to mention the double bootstrap was by Hall (1986), followed
quickly by Beran (1987, 1988). Extensive theoretical discussions are given in Hall (1988) and Hall
and Martin (1988). In the statistics literature, double bootstrapping was used to construct CIs and
estimate the underlying distribution (see, e.g., Booth and Hall, 1994; Booth and Presnell, 1998; Lee
and Young, 1999; Chang and Hall, 2015). Double bootstrapping was also incorporated within
the errors-in-variables framework and applied to a calibration problem (see Pešta, 2013). In the
econometrics literature, Davidson and MacKinnon (2002) use double bootstrapping to improve
the reliability of bootstrap tests of non-nested linear regression models. The contribution of this
paper is to extend the use of double bootstrapping to functional data analysis. Since functional data
are often observed with measurement error, we aim to provide accurate uncertainty quantification
for the descriptive statistics of functional data.
The outline of this article is as follows: Section 2 sets out notation and definitions, and introduces a
bootstrap method. Section 3 presents the descriptive statistics, the L2 distance metrics used for
evaluation and comparison, and the finite sample performance based on a series of simulation
studies. In Section 4, we apply the single and double bootstrap procedures to a Canadian weather
station data set. Section 5 concludes, along with some ideas on how the double bootstrap procedure
presented here can be further extended. Additional simulation results are shown in the appendices.
2 Bootstrapping
2.1 Notation
Random functions are assumed to be sampled from a second-order stochastic process X in L2 [0, τ ],
where L2 [0, τ ] is the Hilbert space of square-integrable functions on the bounded interval [0, τ ].
The stochastic process $X$ satisfies the finite variance condition $\int_0^{\tau} X^2(t)\,dt < \infty$, with inner product $\langle f, g \rangle = \int_0^{\tau} f(t)g(t)\,dt$ for any two functions $f, g \in L^2[0, \tau]$ and induced squared norm $\| \cdot \|^2 = \langle \cdot, \cdot \rangle$. All random functions are defined on a common probability space $(\Omega, \mathcal{A}, P)$. The
notation $X \in L_H^p(\Omega, \mathcal{A}, P)$ is used to indicate $E(\|X\|^p) < \infty$ for some $p > 0$, where $H$ denotes the
Hilbert space. When $p = 1$, $X(t)$ has the mean curve $\mu(t) = E[X(t)]$; when $p = 2$, a non-negative
definite covariance function is given by
$$c_X(t, s) = \text{Cov}[X(s), X(t)] = E\big\{[X(s) - \mu(s)][X(t) - \mu(t)]\big\}, \qquad (2.1)$$
for all s, t ∈ I .
The covariance function $c_X(t, s)$ in (2.1) allows the covariance operator of $X$, denoted by $K_X$, to be defined as
$$K_X(\phi)(s) = \int_{\mathcal{I}} c_X(t, s)\phi(t)\,dt.$$
Via Mercer's lemma, there exists an orthonormal sequence $(\phi_k)$ of continuous functions in $L^2[0, \tau]$ and a non-increasing sequence $(\lambda_k)$ of positive numbers, such that
$$c_X(t, s) = \sum_{k=1}^{\infty} \lambda_k \phi_k(t) \phi_k(s), \qquad s, t \in \mathcal{I},$$
where $(\lambda_1, \lambda_2, \dots)$ are the eigenvalues and $(\phi_1(t), \phi_2(t), \dots)$ are the corresponding orthonormal eigenfunctions.
Bootstrap samples $\mathcal{X}^b$ are obtained by randomly sampling with replacement from the original functions (see also Shang, 2015). In
practice, we can only observe and evaluate $X$ at discretised data points $0 \le t_1 < t_2 < \cdots < t_T \le \tau$;
thus the discretised bootstrap samples are obtained as $\{\mathcal{X}^b(t_j) = [X_1^b(t_j), X_2^b(t_j), \dots, X_n^b(t_j)]^{\top};\ j = 1, 2, \dots, T\}$.
To avoid the possible appearance of repeated curves in the bootstrap samples, Cuevas et al.
(2006) and Shang (2015) replaced the standard i.i.d. bootstrap samples by so-called smooth
bootstrap samples, which are drawn from a smooth empirical distribution function. This can be
achieved by adding white noise to the bootstrap sample, expressed as
$$X^0(t_j) = X^b(t_j) + z(t_j), \qquad j = 1, 2, \dots, T, \qquad (2.2)$$
where $z(t_j)$ denotes the added white noise.
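A minimal sketch of the smooth bootstrap in (2.2); the Gaussian form and scale of the white noise $z(t_j)$ are illustrative choices, since the text only requires added white noise:

```python
import numpy as np

def smooth_bootstrap_sample(X, sigma, rng):
    """Smooth bootstrap: resample the n curves (rows) with replacement,
    then add independent white noise z(t_j) at every grid point, so that
    repeated curves in the bootstrap sample become distinct."""
    n, T = X.shape
    Xb = X[rng.integers(0, n, size=n)]                # ordinary bootstrap sample
    return Xb + rng.normal(scale=sigma, size=(n, T))  # add noise, as in (2.2)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 101))                        # toy curves
Xs = smooth_bootstrap_sample(X, sigma=0.05, rng=rng)
```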
Let $\theta := \theta(t) = \psi(F_0)$ be a statistic whose distribution is unknown, where $\psi$ is the function that
defines the parameter $\theta$. Let $c_n(\alpha)$ denote the largest $(1 - \alpha)$th quantile of the distribution of $\theta$.
Suppose $\mathcal{F}$ is the set of all possible values of the parameter $\theta$. Then
$$\{\vartheta \in \mathcal{F} : \theta_{\vartheta} \le c_n(\alpha)\}$$
defines a confidence set for $\theta$.
The empirical coverage probability of $\theta$ obtained from the single bootstrap is given as
$$\frac{\#\big\{D(\widehat{\theta}^b, \widehat{\theta}) > D(\widehat{\theta}, \theta)\big\}}{B_1}, \qquad b = 1, \dots, B_1,$$
where B1 is the number of samples in the single bootstrap. In contrast, the empirical coverage
probability of $\theta$ obtained from the double bootstrap is given as
$$\frac{\#\big\{D(\widehat{\theta}^{b\eta}, \widehat{\theta}^b) > D(\widehat{\theta}, \theta)\big\}}{B_1 B_2},$$
where $B_2$ is the number of samples in the second layer of the double bootstrap. To reduce both
computational time and confidence level error, $B_2$ can be set to one (see, e.g., Chang and Hall, 2015).
The difference between the single and double bootstrap procedures stems from the distance measure
$D(\widehat{\theta}^{b\eta}, \widehat{\theta})$, which plays the role of an error correction to achieve better calibration.
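The two empirical coverage fractions above can be sketched as follows, a minimal numpy illustration assuming curves are stored as rows of an array, the functional mean as the statistic, the L2 distance as $D$, and a supplied population target `theta_true` (toy inputs, not the paper's simulation design):

```python
import numpy as np

def l2_dist(f, g, grid):
    """L2 distance between two curves evaluated on a common grid."""
    return np.sqrt(np.trapz((f - g) ** 2, grid))

def coverage_fractions(X, theta_true, B1, B2, rng, grid):
    """Single- and double-bootstrap coverage fractions for the functional
    mean: the proportion of bootstrap distances exceeding D(theta_hat, theta)."""
    n = X.shape[0]
    theta_hat = X.mean(axis=0)
    d0 = l2_dist(theta_hat, theta_true, grid)       # D(theta_hat, theta)
    single = double = 0
    for _ in range(B1):
        Xb = X[rng.integers(0, n, size=n)]          # first-layer resample
        tb = Xb.mean(axis=0)                        # theta_hat^b
        single += l2_dist(tb, theta_hat, grid) > d0
        for _ in range(B2):                         # B2 = 1 keeps cost low
            Xbe = Xb[rng.integers(0, n, size=n)]    # second-layer resample
            double += l2_dist(Xbe.mean(axis=0), tb, grid) > d0
    return single / B1, double / (B1 * B2)

rng = np.random.default_rng(1)
grid = np.linspace(0, 1, 101)
X = rng.normal(size=(100, 101))
s, d = coverage_fractions(X, np.zeros(101), B1=50, B2=1, rng=rng, grid=grid)
```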
3 Simulation study
The single and double bootstrap procedures are used to estimate the distribution of descriptive
statistics of functional data. In Section 3.1, we review several descriptive statistics that characterise
i.i.d. functional data. Section 3.2 introduces a simulated Gaussian process, while the evaluation
metrics of estimation accuracy are given in Section 3.3. Section 3.4 displays the simulation results,
where the performance of single and double bootstrapping is evaluated and compared based on
the differences between the empirical and nominal coverage probabilities.
3.1 Descriptive statistics
We seek an estimator of the functional median, which allows us to rank a set of functions based on
their location depth, i.e., the distance from the functional median (the deepest curve). This leads to
the idea of functional depth, which has received much attention in the functional data literature
(see, e.g., Cuevas et al., 2006, 2007; Gervini, 2012; López-Pintado and Romo, 2009). We consider
two functional depth measures, namely Fraiman and Muniz’s (2001) depth and Gervini’s (2012)
depth based on small ball probability.
Fraiman and Muniz’s (2001) depth is the oldest functional depth measure. For each t ∈ [0, τ ],
let F1 be the empirical sampling distribution of {X1 (t), . . . , Xn (t)} and let Zi (t) be the univariate
depth of function $X_i(t)$, given by $Z_i(t) = 1 - \big|\frac{1}{2} - F_1[X_i(t)]\big|$. Then
$$I_i = \int_0^{\tau} Z_i(t)\,dt = \int_0^{\tau} \left[1 - \left|\frac{1}{2} - F_1[X_i(t)]\right|\right] dt,$$
and the values of Ii provide a way of ranking the curves from inward to outward (Shang, 2015).
The functional median is the deepest curve with the largest value of Ii .
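A sketch of the Fraiman and Muniz depth, approximating the integral by an average over the grid points and $F_1$ by pointwise ranks (toy data; each curve stored as a row of a numpy array):

```python
import numpy as np

def fraiman_muniz_depth(X):
    """Fraiman-Muniz depth I_i for curves stored as the rows of X (n x T):
    F1 is the pointwise empirical distribution, Z_i(t) = 1 - |1/2 - F1[X_i(t)]|,
    and the integral over t is approximated by the grid average."""
    n = X.shape[0]
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1  # pointwise ranks 1..n
    F1 = ranks / n                                         # empirical cdf values
    Z = 1.0 - np.abs(0.5 - F1)                             # univariate depths
    return Z.mean(axis=1)                                  # I_i

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 101))
depth = fraiman_muniz_depth(X)
functional_median = X[np.argmax(depth)]  # the deepest curve
```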
Gervini (2012) considered the set of distances between any two functions, denoted by d(Xi , X j ).
An observation Xi is an outlier if it is far away from most other functions. Given a probability
$\alpha \in [0, 1]$, Gervini (2012) defines the $\alpha$-radius $r_i$ as the distance between $X_i(t)$ and its $\lceil \alpha n \rceil$th closest
observation, where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$. Customarily, we consider
$\alpha = 0.5$. The rank of $r_i$ provides a measure of outlyingness of $X_i(t)$: the smaller $r_i$ is, the denser
the neighbourhood of $X_i(t)$ is (Shang, 2015). The functional median is the curve with the smallest $r_i$.
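The $\alpha$-radius can be sketched from pairwise L2 distances; a toy illustration (each sorted row includes the zero self-distance in position 0, so index $\lceil \alpha n \rceil$ picks the $\lceil \alpha n \rceil$th closest other curve):

```python
import numpy as np

def alpha_radius(X, grid, alpha=0.5):
    """Gervini's alpha-radius r_i: the distance from curve i to its
    ceil(alpha * n)-th closest observation; a small r_i marks a deep curve."""
    n = X.shape[0]
    k = int(np.ceil(alpha * n))
    diff = X[:, None, :] - X[None, :, :]            # pairwise differences
    d = np.sqrt(np.trapz(diff ** 2, grid, axis=2))  # pairwise L2 distances
    d_sorted = np.sort(d, axis=1)                   # column 0 is the self-distance 0
    return d_sorted[:, k]

rng = np.random.default_rng(3)
grid = np.linspace(0, 1, 101)
X = rng.normal(size=(20, 101))
r = alpha_radius(X, grid)
functional_median = X[np.argmin(r)]                 # smallest radius = deepest curve
```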
Apart from the functional median, we also consider sample versions of the functional mean
and functional variance, given by
$$\bar{X}(t) = \frac{1}{n}\sum_{i=1}^{n} X_i(t), \qquad \widehat{V}(t) = \frac{1}{n-1}\sum_{i=1}^{n} \left[X_i(t) - \bar{X}(t)\right]^2,$$
and the $\gamma$-trimmed functional mean, which is the mean function of the $100(1 - \gamma)\%$ deepest curves.
The functional trimmed mean is expressed as
$$\frac{1}{n - \lceil \gamma n \rceil} \sum_{i=1}^{n - \lceil \gamma n \rceil} X_{(i)}(t),$$
where $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ are the sample curves ordered by their location depth, from deepest to
shallowest, and $\gamma \in \left[0, \frac{n-1}{n}\right]$ is the trimming parameter. Customarily, we consider $\gamma = 0.05$.
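The three sample statistics can be sketched as follows; `depth` is any vector of depth values (larger meaning deeper), for example the Fraiman and Muniz depths, and the toy values below are illustrative only:

```python
import numpy as np

def trimmed_mean(X, depth, gamma=0.05):
    """gamma-trimmed functional mean: average the 100(1 - gamma)% deepest
    curves, ranked by the supplied depth values (larger = deeper)."""
    n = X.shape[0]
    keep = n - int(np.ceil(gamma * n))
    deepest = np.argsort(depth)[::-1][:keep]  # indices of the deepest curves
    return X[deepest].mean(axis=0)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 101))               # toy curves
mean_curve = X.mean(axis=0)                   # sample functional mean
var_curve = X.var(axis=0, ddof=1)             # sample functional variance, 1/(n-1) factor
tmean_curve = trimmed_mean(X, depth=rng.random(100))  # toy depth values
```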
3.2 Data generating process
A series of Monte Carlo simulation studies is implemented to evaluate and compare the finite
sample performance of the single and double bootstrap procedures for estimating the distribution
of descriptive statistics of i.i.d. functional data. For comparison, we consider the functional
model previously studied in Cuevas et al. (2006) and Shang (2015): a Gaussian process X (t) with
mean m(t) = 0.95 × 10t(1 − t) + 0.05 × 30t(1 − t), Cov[X (ti ), X (t j )] = exp(−|ti − t j |/0.3), and
Var[X (t)] = 1.
Given that the data generating process is a Gaussian process, we can draw samples from
a multivariate normal distribution with mean $[m(t_1), m(t_2), \dots, m(t_T)]^{\top}$ and covariance matrix
with $(i, j)$th entry $\text{Cov}[X(t_i), X(t_j)] = \exp(-|t_i - t_j|/0.3)$. In practice, we evaluate and compare our methods on a
common set of grid points. In our simulation studies, we have taken 101 equally-spaced grid
points between 0 and 1, for two sample sizes n = 100 and 300.
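The data generating process above can be sketched directly as a multivariate normal draw:

```python
import numpy as np

def simulate_curves(n, T=101, seed=0):
    """Draw n curves from the simulation model: a Gaussian process with
    mean m(t) = 0.95 * 10t(1 - t) + 0.05 * 30t(1 - t) and covariance
    exp(-|t_i - t_j| / 0.3), on T equally spaced grid points in [0, 1]."""
    t = np.linspace(0, 1, T)
    m = 0.95 * 10 * t * (1 - t) + 0.05 * 30 * t * (1 - t)
    C = np.exp(-np.abs(t[:, None] - t[None, :]) / 0.3)
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(m, C, size=n), t

X, grid = simulate_curves(n=100)   # n = 100; use n = 300 for the larger design
```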
3.3 Simulation evaluation
To evaluate the performance of the bootstrap procedures, we calculate the bootstrap CIs of the
descriptive statistics. Given a set of original functions $\{X_1, \dots, X_n\}$, we draw $B_1 = B_2 = 399$
bootstrap samples for each of the $R = 200$ replications; the same pseudo-random seed
is used for all the methods to ensure the same simulation randomness (see also Shang, 2015).
For each replication, the $100(1 - \delta)\%$ bootstrap CIs of a descriptive statistic $\theta$ are defined by
calculating the cut-off value $D(X_1, \dots, X_n)$, such that $100(1 - \delta)\%$ of the bootstrap repetitions
$(\widehat{\theta}^b,\ b = 1, \dots, B_1)$ are within a distance smaller than $D(X_1, \dots, X_n)$. In the double bootstrap, the
$100(1 - \delta)\%$ bootstrap CIs of a descriptive statistic are defined by calculating the cut-off value
$D(X_1^b, \dots, X_n^b)$, such that $100(1 - \delta)\%$ of the bootstrap repetitions $(\widehat{\theta}^{b\eta},\ \eta = 1, \dots, B_2)$ are within a
distance smaller than $D(X_1^b, \dots, X_n^b)$.
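The cut-off computation can be sketched as a quantile of bootstrap distances; the functional mean and the L2 distance are used for concreteness (toy inputs):

```python
import numpy as np

def ci_cutoff(theta_hat, theta_boot, grid, delta=0.05):
    """Cut-off value: the 100(1 - delta)% quantile of the distances between
    the bootstrap statistics (rows of theta_boot) and the sample statistic;
    bootstrap statistics within this distance form the bootstrap CI."""
    dists = np.sqrt(np.trapz((theta_boot - theta_hat) ** 2, grid, axis=1))
    return np.quantile(dists, 1 - delta)

rng = np.random.default_rng(5)
grid = np.linspace(0, 1, 101)
X = rng.normal(size=(100, 101))                 # toy curves
theta_hat = X.mean(axis=0)
theta_boot = np.stack([X[rng.integers(0, 100, size=100)].mean(axis=0)
                       for _ in range(399)])    # B1 = 399 bootstrap means
cutoff = ci_cutoff(theta_hat, theta_boot, grid)
```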
We calculate the empirical coverage probability that the target function at the population level
lies within the CIs. As a performance measure, L2 distance metrics are used for constructing CIs.
They are defined as
$$\big\|\theta(t) - \widehat{\theta}(t)\big\|_2 = \left[\int_{t \in \mathcal{I}} \big(\theta(t) - \widehat{\theta}(t)\big)^2\,dt\right]^{\frac{1}{2}},$$
$$\big\|\widehat{\theta}(t) - \widehat{\theta}^b(t)\big\|_2 = \left[\int_{t \in \mathcal{I}} \big(\widehat{\theta}(t) - \widehat{\theta}^b(t)\big)^2\,dt\right]^{\frac{1}{2}},$$
$$\big\|\widehat{\theta}^b(t) - \widehat{\theta}^{b\eta}(t)\big\|_2 = \left[\int_{t \in \mathcal{I}} \big(\widehat{\theta}^b(t) - \widehat{\theta}^{b\eta}(t)\big)^2\,dt\right]^{\frac{1}{2}}.$$
3.4 Simulation results
In Figure 1, we plot the nominal (from 0.5 to 0.95 in steps of 0.05) and empirical coverage
probabilities for estimating the functional mean of i.i.d. functional data using single and double
bootstrap procedures. As the sample size increases from n = 100 to 300, the empirical coverage
probability improves for all the bootstrap procedures, including the single bootstrap procedure of
Shang (2015). This result is not surprising, as the validity of bootstrapping relies on a moderate
or large sample size (see also McMurry and Politis, 2011). Subject to the same pseudo-random
seed, the double bootstrap procedure outperforms the single bootstrap procedure for most if not
all confidence levels. This result is not surprising, as iterating the bootstrap principle reduces
the dependence between the probability distribution of the resample and the unknown data
generating process (as outlined in Shang, 2015). Between bootstrapping the original functions and
the smoothed functions in (2.2), there is an advantage to bootstrapping the original functions.
[Figure 1 near here. Panels: L2 (n = 100) and L2 (n = 300); vertical axis: empirical coverage probability; legend: nominal, single bootstrap, single smooth bootstrap, double smooth bootstrap.]
Figure 1: Empirical and nominal coverage probabilities for estimating the functional mean, based on
B1 = B2 = 399 repetitions and R = 200 replications. See (2.2) for the smooth bootstrap.
In Figures 2 and 3, we present the nominal and empirical coverage probabilities for estimating
the functional median based on the Fraiman and Muniz’s (2001) depth and the α-radius depth.
A striking feature is that the single bootstrap procedure noticeably over-estimates the nominal
coverage probability, especially when the confidence level is large. In contrast, the double bootstrap
procedure corrects the coverage error, and it even slightly under-estimates the nominal coverage
probability. Compared to the functional mean, in general, the functional median is harder to
estimate correctly. This is because the quantity being estimated, such as a quantile, is sensitive to
the discreteness of the empirical distribution function (see, e.g., Bickel and Freedman, 1981; Beran and
Srivastava, 1985; Hall and Martin, 1989; Eaton and Tyler, 1991). However, the comparison result
further demonstrates the usefulness of the double bootstrap procedure, which can still correct
coverage error to achieve better calibration, as measured by the L2 metrics. Between bootstrapping
the original and smoothed functions, it is advantageous to bootstrap the smoothed functions.
[Figure 2 near here. Panels: L2 (n = 100) and L2 (n = 300); vertical axis: empirical coverage probability; legend: nominal, single bootstrap, double bootstrap, single smooth bootstrap, double smooth bootstrap.]
Figure 2: Empirical and nominal coverage probabilities for estimating the functional median based on the
Fraiman and Muniz’s (2001) depth, using B1 = B2 = 399 repetitions and R = 200 replications.
[Figure 3 near here. Panels: L2 (n = 100) and L2 (n = 300); vertical axis: empirical coverage probability; legend: nominal, single bootstrap, double bootstrap, single smooth bootstrap, double smooth bootstrap.]
Figure 3: Empirical and nominal coverage probabilities for estimating the functional median based on the
α-radius depth, using B1 = B2 = 399 repetitions and R = 200 replications.
In Figures 4 and 5, we show the nominal and empirical coverage probabilities for estimating the
functional trimmed mean based on the Fraiman and Muniz’s (2001) depth and the α-radius depth.
Similar to the functional mean, the empirical coverage probability improves for all the bootstrap
procedures as the sample size increases from n = 100 to 300. Subject to the same pseudo-random
seed, the single bootstrap procedure is outperformed by the double bootstrap procedure for most,
if not all, confidence levels. This result again demonstrates the advantage of the double bootstrap
procedure, which can correct coverage error. Between bootstrapping the original and smoothed
functions, there is an advantage to bootstrapping the original functions.
[Figure 4 near here. Panels: L2 (n = 100) and L2 (n = 300); vertical axis: empirical coverage probability; legend: nominal, single bootstrap, double bootstrap, single smooth bootstrap, double smooth bootstrap.]
Figure 4: Empirical and nominal coverage probabilities for estimating the functional trimmed mean based
on the Fraiman and Muniz’s (2001) depth, using B1 = B2 = 399 repetitions and R = 200
replications.
In Figure 6, we plot the nominal and empirical coverage probabilities for estimating functional
variance. Similar to the functional mean and trimmed functional mean, the empirical coverage
probability improves for all the bootstrap procedures as the sample size increases from n = 100 to
[Figure 5 near here. Panels: L2 (n = 100) and L2 (n = 300).]
Figure 5: Empirical and nominal coverage probabilities for estimating the functional trimmed mean based
on the α-radius depth, using B1 = B2 = 399 repetitions and R = 200 replications.
300. When n = 100, the inferior estimation accuracy of the single bootstrap procedure is magnified,
because the variance is a squared distance between Xi (t) and X (t). When n = 300, the estimation
accuracy of the single bootstrap procedure improves, but the double bootstrap procedure still
outperforms it. This result again demonstrates the usefulness of the double bootstrap procedure
because of its better calibration. Between bootstrapping the original and smoothed functions, there
is an advantage to bootstrapping the original functions.
[Figure 6 near here. Panels: L2 (n = 100) and L2 (n = 300); vertical axis: empirical coverage probability; legend: nominal, single bootstrap, double bootstrap, single smooth bootstrap, double smooth bootstrap.]
Figure 6: Empirical and nominal coverage probabilities for estimating the functional variance, based on
B1 = B2 = 399 repetitions and R = 200 replications.
When the descriptive statistic includes the functional mean, it is advantageous to bootstrap the
original functions; when the descriptive statistic includes the functional median, it is advantageous
to bootstrap the smoothed functions. In Appendix B, we perform a sensitivity analysis of the
number of bootstrap replications and examine whether it affects the finite sample estimation
of the empirical coverage probabilities at various confidence levels.
4 Application to meteorology data
We consider the classic Canadian weather station data set, which is publicly available in the fda
package (Ramsay et al., 2020). This data set has been widely studied by Ramsay and Silverman
(2002, 2005) in the areas of descriptive and regression analysis of functional data.
In Figure 7, we plot the temperature change in degrees Celsius throughout a year, taken from
35 weather stations across Canada. The 35 weather stations cover the Atlantic, Pacific, Continental
and Arctic climate zones. The functional curves were interpolated from 365 data points, which
measure the daily mean temperature recorded by a weather station averaged over the period from
1960 to 1994. Through the rainbow plot of Hyndman and Shang (2010), the colours correspond
to the geographic climates of stations; the red lines show the weather stations located in the
comparably warmer regions, whereas the purple lines show the weather stations located in the
colder regions (Shang, 2015).
[Figure 7 near here. Vertical axis: Celsius; horizontal axis: Day.]
Figure 7: Averaged Canadian daily temperatures from 1960 to 1994 observed at 35 weather stations. Each
curve shows averaged Canadian daily temperatures at a weather station, not at a particular year.
Via the single and double bootstrap procedures, we aim to visualise the distribution of the
descriptive statistics of the Canadian weather station data. As an illustration, we apply the single
and double bootstrap procedures to plot the 95% CIs of the functional mean, functional variance,
functional median, and 5% trimmed functional mean in Figure 8. The sample estimates are shown
in solid black lines. While the 95% empirical CIs are shown in solid red lines for the single
bootstrap, the 95% empirical CIs are shown in dotted blue lines for the double bootstrap.
[Figure 8 near here. Panels: Mean, Variance, Median (FM), Median (radius); horizontal axis: Day; legend: sample estimate, single bootstrapped CIs, double bootstrapped CIs.]
Figure 8: Ninety-five percent CIs of the descriptive statistics for the Canadian weather station data, based
on B1 = B2 = 399 repetitions. The sample estimates are shown in solid black lines; their 95%
CIs obtained from the single and double bootstrap procedures are shown in solid red lines and
dotted blue lines, respectively.
5 Conclusion
We present a double bootstrap procedure for drawing random samples from a set of i.i.d. func-
tional data to visualise the distribution of descriptive statistics. Through a series of Monte Carlo
simulations, we show reduced confidence level error and improved coverage probability in com-
parison to the single bootstrap procedure, using the same bootstrap method. This result is not
surprising since, as pointed out by Chang and Hall (2015) and Shang (2015), iterating the
bootstrap principle reduces the dependence between the probability distribution of the resample
and the unknown data generating process. As the number of sample curves increases from 100
to 300, the estimation accuracy improves for both bootstrap procedures, more so for the double
bootstrap. Illustrated by the Canadian weather station data set, the single and double bootstrap
procedures produce similar results for estimating the distributions of various descriptive statistics,
but there are noticeable differences at the edges of function support for the functional median.
There are a few ways in which the paper can be further extended, and we briefly outline two.
Firstly, bootstrapping functional time series is still in its infancy, and we intend to extend the
double bootstrap procedure to analyse stationary and weakly dependent functional time series
(see, e.g., Zhu and Politis, 2017; Nyarige, 2016; Shang, 2018; Paparoditis, 2018; Paparoditis and
Shang, 2020, for various single bootstrap procedures). Secondly, we aim to develop bootstrap
procedures that can handle stationary long-memory functional time series (see, e.g., Li et al., 2020a)
and non-stationary long-memory functional time series (see, e.g., Li et al., 2020b).
Acknowledgement
The author thanks Dr. Yanrong Yang for many comments and suggestions.
Appendix A: Additional simulation results
Similar to Section 3.3 of the main manuscript, we also consider $L^{\infty}$ distance metrics for constructing
CIs. The $L^{\infty}$ distance metrics are defined as
$$\big\|\theta(t) - \widehat{\theta}(t)\big\|_{\infty} = \sup_{t \in \mathcal{I}} \big|\theta(t) - \widehat{\theta}(t)\big|, \quad \big\|\widehat{\theta}(t) - \widehat{\theta}^b(t)\big\|_{\infty} = \sup_{t \in \mathcal{I}} \big|\widehat{\theta}(t) - \widehat{\theta}^b(t)\big|, \quad \big\|\widehat{\theta}^b(t) - \widehat{\theta}^{b\eta}(t)\big\|_{\infty} = \sup_{t \in \mathcal{I}} \big|\widehat{\theta}^b(t) - \widehat{\theta}^{b\eta}(t)\big|.$$
Using the $L^{\infty}$ distance metrics for constructing CIs, Figures A.1 to A.6 present the finite sample
performance of the two bootstrap procedures.
[Figure A.1 near here. Panels: L∞ (n = 100) and L∞ (n = 300); vertical axis: empirical coverage probability.]
Figure A.1: Empirical and nominal coverage probabilities for estimating the functional mean, based on
B1 = B2 = 399 repetitions and R = 200 replications. See (2.2) for the smooth bootstrap.
[Figure A.2 near here. Panels: L∞ (n = 100) and L∞ (n = 300); vertical axis: empirical coverage probability.]
Figure A.2: Empirical and nominal coverage probabilities for estimating the functional median based on
the Fraiman and Muniz’s (2001) depth, using B1 = B2 = 399 repetitions and R = 200
replications.
[Figure A.3 near here. Panels: L∞ (n = 100) and L∞ (n = 300); vertical axis: empirical coverage probability.]
Figure A.3: Empirical and nominal coverage probabilities for estimating the functional median based on
the α-radius depth, using B1 = B2 = 399 repetitions and R = 200 replications.
[Figure A.4 near here. Panels: L∞ (n = 100) and L∞ (n = 300).]
Figure A.4: Empirical and nominal coverage probabilities for estimating the functional trimmed mean
based on the Fraiman and Muniz’s (2001) depth, using B1 = B2 = 399 repetitions and
R = 200 replications.
[Figure A.5 near here. Panels: L∞ (n = 100) and L∞ (n = 300).]
Figure A.5: Empirical and nominal coverage probabilities for estimating the functional trimmed mean
based on the α-radius depth, using B1 = B2 = 399 repetitions and R = 200 replications.
[Figure A.6 near here. Panels: L∞ (n = 100) and L∞ (n = 300); vertical axis: empirical coverage probability.]
Figure A.6: Empirical and nominal coverage probabilities for estimating the functional variance, based on
B1 = B2 = 399 repetitions and R = 200 replications.
Appendix B: Sensitivity analysis of bootstrap replications
In the double bootstrap procedure, the choices of $B_1$ and $B_2$ may affect the empirical coverage
probability. In Table B.1, we study four different combinations of $B_1$ and $B_2$ and find that the empirical
coverage probabilities are fairly similar regardless of the combination of $B_1$ and $B_2$.
Table B.1: Empirical and nominal coverage probabilities for estimating various descriptive
statistics, based on four combinations of bootstrap replications. $B_1$ denotes the number of
bootstrap samples in the first layer of the double bootstrap, whereas $B_2$ denotes the number
of bootstrap samples in the second layer. FM denotes the Fraiman and Muniz's (2001) depth.
                                                Nominal coverage probability
Statistic      B1   B2   Metric          0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95
Mean           399  399  L2 (n = 100)    0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.95
                         Linf (n = 100)  0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.90  0.95
                         L2 (n = 300)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.84  0.90  0.95
                         Linf (n = 300)  0.49  0.54  0.59  0.64  0.69  0.74  0.80  0.85  0.90  0.95
               399  99   L2 (n = 100)    0.48  0.53  0.58  0.63  0.68  0.73  0.78  0.84  0.89  0.94
                         Linf (n = 100)  0.48  0.53  0.58  0.63  0.68  0.73  0.79  0.84  0.89  0.94
                         L2 (n = 300)    0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.94
                         Linf (n = 300)  0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.94
               99   399  L2 (n = 100)    0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.95
                         Linf (n = 100)  0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.95
                         L2 (n = 300)    0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.89  0.95
                         Linf (n = 300)  0.49  0.54  0.59  0.64  0.69  0.74  0.79  0.84  0.90  0.95
               99   99   L2 (n = 100)    0.49  0.53  0.58  0.63  0.68  0.73  0.78  0.83  0.88  0.94
                         Linf (n = 100)  0.48  0.53  0.58  0.63  0.68  0.73  0.78  0.84  0.89  0.94
                         L2 (n = 300)    0.48  0.53  0.58  0.64  0.69  0.74  0.79  0.83  0.89  0.94
                         Linf (n = 300)  0.49  0.54  0.59  0.63  0.68  0.73  0.78  0.84  0.89  0.94
Median         399  399  L2 (n = 100)    0.33  0.41  0.50  0.59  0.65  0.70  0.75  0.82  0.87  0.92
(FM)                     Linf (n = 100)  0.36  0.43  0.51  0.59  0.65  0.70  0.75  0.81  0.86  0.91
                         L2 (n = 300)    0.40  0.45  0.52  0.61  0.66  0.70  0.75  0.80  0.86  0.91
                         Linf (n = 300)  0.40  0.45  0.51  0.59  0.64  0.68  0.72  0.77  0.83  0.88
               399  99   L2 (n = 100)    0.34  0.41  0.49  0.57  0.64  0.70  0.75  0.81  0.87  0.92
                         Linf (n = 100)  0.36  0.43  0.50  0.58  0.64  0.69  0.74  0.80  0.85  0.90
                         L2 (n = 300)    0.40  0.45  0.52  0.60  0.66  0.70  0.75  0.80  0.86  0.91
                         Linf (n = 300)  0.40  0.45  0.51  0.58  0.64  0.68  0.72  0.77  0.83  0.88
               99   399  L2 (n = 100)    0.36  0.44  0.53  0.62  0.68  0.72  0.77  0.83  0.89  0.93
                         Linf (n = 100)  0.38  0.45  0.53  0.61  0.66  0.71  0.76  0.81  0.86  0.91
                         L2 (n = 300)    0.43  0.48  0.55  0.64  0.69  0.73  0.76  0.82  0.87  0.92
                         Linf (n = 300)  0.42  0.47  0.53  0.61  0.66  0.70  0.74  0.79  0.85  0.90
               99   99   L2 (n = 100)    0.33  0.40  0.49  0.57  0.64  0.69  0.75  0.81  0.87  0.92
                         Linf (n = 100)  0.36  0.42  0.50  0.57  0.64  0.69  0.74  0.80  0.85  0.90
                         L2 (n = 300)    0.39  0.45  0.51  0.58  0.65  0.69  0.74  0.79  0.85  0.91
                         Linf (n = 300)  0.40  0.45  0.51  0.57  0.63  0.67  0.71  0.76  0.82  0.87
Median         399  399  L2 (n = 100)    0.59  0.64  0.68  0.73  0.77  0.82  0.86  0.89  0.93  0.97
(α-radius)               Linf (n = 100)  0.72  0.77  0.80  0.84  0.87  0.90  0.93  0.95  0.97  0.99
                         L2 (n = 300)    0.56  0.61  0.66  0.70  0.75  0.79  0.84  0.88  0.92  0.96
                         Linf (n = 300)  0.68  0.73  0.77  0.81  0.85  0.88  0.91  0.94  0.96  0.99
               399  99   L2 (n = 100)    0.58  0.63  0.68  0.72  0.76  0.81  0.85  0.89  0.92  0.96
                         Linf (n = 100)  0.71  0.76  0.80  0.83  0.87  0.90  0.92  0.95  0.97  0.99
                         L2 (n = 300)    0.56  0.60  0.65  0.70  0.74  0.79  0.83  0.87  0.91  0.95
                         Linf (n = 300)  0.67  0.72  0.76  0.80  0.84  0.87  0.91  0.93  0.96  0.98
               99   399  L2 (n = 100)    0.59  0.64  0.68  0.73  0.77  0.81  0.86  0.89  0.93  0.96
                         Linf (n = 100)  0.72  0.76  0.80  0.84  0.87  0.90  0.93  0.95  0.97  0.99
                         L2 (n = 300)    0.56  0.61  0.66  0.70  0.75  0.79  0.84  0.88  0.92  0.96
                         Linf (n = 300)  0.68  0.72  0.76  0.81  0.84  0.88  0.91  0.94  0.97  0.99
               99   99   L2 (n = 100)    0.58  0.63  0.68  0.72  0.77  0.81  0.85  0.89  0.92  0.96
                         Linf (n = 100)  0.71  0.76  0.80  0.83  0.87  0.90  0.93  0.95  0.97  0.99
                         L2 (n = 300)    0.55  0.60  0.65  0.69  0.74  0.78  0.83  0.87  0.91  0.95
                         Linf (n = 300)  0.67  0.71  0.76  0.80  0.84  0.87  0.90  0.93  0.96  0.98
Trimmed        399  399  L2 (n = 100)    0.50  0.55  0.60  0.65  0.70  0.76  0.81  0.86  0.90  0.95
mean (FM)                Linf (n = 100)  0.50  0.55  0.60  0.65  0.71  0.76  0.81  0.86  0.91  0.96
                         L2 (n = 300)    0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.86  0.90  0.95
                         Linf (n = 300)  0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.85  0.90  0.95
               399  99   L2 (n = 100)    0.50  0.55  0.59  0.65  0.70  0.75  0.80  0.85  0.90  0.95
                         Linf (n = 100)  0.49  0.55  0.59  0.65  0.70  0.75  0.80  0.85  0.90  0.95
                         L2 (n = 300)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95
                         Linf (n = 300)  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.94
               99   399  L2 (n = 100)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95
                         Linf (n = 100)  0.50  0.55  0.60  0.65  0.70  0.75  0.81  0.86  0.91  0.95
                         L2 (n = 300)    0.50  0.55  0.60  0.65  0.71  0.75  0.81  0.85  0.90  0.95
                         Linf (n = 300)  0.51  0.56  0.61  0.66  0.71  0.76  0.80  0.85  0.90  0.95
               99   99   L2 (n = 100)    0.49  0.54  0.59  0.64  0.70  0.75  0.80  0.85  0.89  0.94
                         Linf (n = 100)  0.49  0.54  0.59  0.64  0.69  0.75  0.80  0.85  0.90  0.95
                         L2 (n = 300)    0.49  0.54  0.59  0.65  0.70  0.75  0.80  0.85  0.90  0.94
                         Linf (n = 300)  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.94
Trimmed        399  399  L2 (n = 100)    0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.86  0.91  0.95
mean                     Linf (n = 100)  0.51  0.56  0.61  0.66  0.71  0.77  0.82  0.87  0.91  0.96
(α-radius)               L2 (n = 300)    0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.86  0.90  0.95
                         Linf (n = 300)  0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.86  0.91  0.95
               399  99   L2 (n = 100)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95
                         Linf (n = 100)  0.50  0.55  0.60  0.65  0.71  0.76  0.81  0.86  0.90  0.95
                         L2 (n = 300)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.94
                         Linf (n = 300)  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95
               99   399  L2 (n = 100)    0.50  0.55  0.61  0.66  0.71  0.76  0.81  0.86  0.90  0.95
                         Linf (n = 100)  0.50  0.56  0.61  0.66  0.71  0.76  0.82  0.86  0.91  0.96
                         L2 (n = 300)    0.50  0.55  0.61  0.66  0.71  0.76  0.80  0.85  0.90  0.95
                         Linf (n = 300)  0.51  0.56  0.61  0.66  0.71  0.76  0.81  0.86  0.91  0.95
               99   99   L2 (n = 100)    0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.94
                         Linf (n = 100)  0.50  0.55  0.61  0.66  0.71  0.76  0.80  0.86  0.90  0.95
                         L2 (n = 300)    0.49  0.55  0.60  0.65  0.70  0.75  0.80  0.84  0.89  0.94
                         Linf (n = 300)  0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.94
Variance       399  399  L2 (n = 100)    0.46  0.51  0.56  0.61  0.67  0.72  0.77  0.83  0.88  0.93
                         Linf (n = 100)  0.48  0.53  0.58  0.64  0.69  0.75  0.80  0.86  0.91  0.96
                         L2 (n = 300)    0.48  0.53  0.58  0.64  0.69  0.74  0.79  0.84  0.89  0.94
                         Linf (n = 300)  0.49  0.54  0.59  0.64  0.70  0.75  0.80  0.86  0.91  0.96
B1 = 399 L2 (n = 100) 0.45 0.50 0.55 0.61 0.66 0.71 0.76 0.82 0.87 0.92
B2 = 99 Linf (n = 100) 0.47 0.52 0.58 0.63 0.68 0.74 0.79 0.85 0.90 0.95
L2 (n = 300) 0.48 0.53 0.58 0.63 0.68 0.73 0.78 0.83 0.88 0.94
Linf (n = 300) 0.48 0.53 0.58 0.64 0.69 0.74 0.79 0.85 0.90 0.95
B1 = 99 L2 (n = 100) 0.46 0.51 0.56 0.61 0.67 0.72 0.77 0.83 0.88 0.93
B2 = 399 Linf (n = 100) 0.48 0.53 0.58 0.64 0.69 0.75 0.80 0.86 0.91 0.96
L2 (n = 300) 0.49 0.54 0.59 0.64 0.69 0.74 0.79 0.84 0.89 0.95
Linf (n = 300) 0.49 0.54 0.59 0.65 0.70 0.75 0.80 0.86 0.91 0.96
B1 = 99 L2 (n = 100) 0.45 0.50 0.55 0.60 0.66 0.71 0.76 0.82 0.87 0.93
B2 = 99 Linf (n = 100) 0.47 0.52 0.58 0.63 0.69 0.74 0.79 0.85 0.90 0.95
L2 (n = 300) 0.48 0.53 0.58 0.63 0.68 0.73 0.78 0.83 0.89 0.94
Linf (n = 300) 0.48 0.53 0.59 0.64 0.69 0.74 0.80 0.85 0.90 0.95
19
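Each entry of the table above pairs a nominal coverage level with an empirical coverage estimate: simulate many samples, build a bootstrap confidence interval from each, and record how often the interval covers the true value of the statistic. The following is a minimal scalar sketch of that Monte Carlo check for a single-bootstrap percentile interval; it does not reproduce the functional setting, the depth-based statistics, or the L2/Linf bands of the study, and all function and parameter names are illustrative.

```python
import numpy as np

def percentile_ci(x, stat, B, level, rng):
    """Single-bootstrap percentile confidence interval for stat(x)."""
    boots = np.array([stat(rng.choice(x, x.size, replace=True))
                      for _ in range(B)])
    alpha = 1.0 - level
    return np.quantile(boots, [alpha / 2.0, 1.0 - alpha / 2.0])

def empirical_coverage(true_value, stat, n, B, level, reps, rng):
    """Fraction of simulated samples whose bootstrap CI covers true_value."""
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)   # toy i.i.d. data-generating process
        lo, hi = percentile_ci(x, stat, B, level, rng)
        hits += int(lo <= true_value <= hi)
    return hits / reps

rng = np.random.default_rng(1)
# One table entry in miniature: empirical vs nominal coverage for the
# sample mean (true mean = 0) at nominal level 0.90 with B = 99 resamples.
print(empirical_coverage(0.0, np.mean, n=100, B=99, level=0.90,
                         reps=200, rng=rng))
```

The double-bootstrap rows of the table add an inner resampling loop of size B2 within each of the B1 outer resamples to calibrate the interval's level before this coverage check, which is why they track the nominal levels more closely at the cost of B1 × B2 resamples.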
References
Beran, R. (1987), ‘Prepivoting to reduce level error in confidence sets’, Biometrika 74(3), 457–468.
Beran, R. (1988), ‘Prepivoting test statistics: A bootstrap view of asymptotic refinements’, Journal of
the American Statistical Association: Theory and Methods 83(403), 687–697.
Beran, R. and Srivastava, M. S. (1985), ‘Bootstrap tests and confidence regions for functions of a
covariance matrix’, The Annals of Statistics 13(1), 95–115.
Bickel, P. J. and Freedman, D. A. (1981), ‘Some asymptotic theory for the bootstrap’, The Annals of
Statistics 9(6), 1196–1217.
Booth, J. G. and Hall, P. (1994), ‘Monte Carlo approximation and the iterated bootstrap’, Biometrika
81(2), 331–340.
Booth, J. G. and Presnell, B. (1998), ‘Allocation of Monte Carlo resources for the iterated bootstrap’,
Journal of Computational and Graphical Statistics 7(1), 92–112.
Chang, J. and Hall, P. (2015), ‘Double-bootstrap methods that use a single double-bootstrap
simulation’, Biometrika 102(1), 203–214.
Cuevas, A. (2014), ‘A partial overview of the theory of statistics with functional data’, Journal of
Statistical Planning and Inference 147, 1–23.
Cuevas, A., Febrero, M. and Fraiman, R. (2006), ‘On the use of the bootstrap for estimating
functions with functional data’, Computational Statistics & Data Analysis 51(2), 1063–1074.
Cuevas, A., Febrero, M. and Fraiman, R. (2007), ‘Robust estimation and classification for functional
data via projection-based depth notions’, Computational Statistics 22(3), 481–496.
Davidson, R. and MacKinnon, J. G. (2002), ‘Fast double bootstrap tests of nonnested linear regres-
sion models’, Econometric Reviews 21(4), 419–429.
Eaton, M. L. and Tyler, D. E. (1991), ‘On Wielandt’s inequality and its application to the asymptotic
distribution of the eigenvalues of a random symmetric matrix’, The Annals of Statistics 19(1), 260–
271.
Ferraty, F. and Vieu, P. (2006), Nonparametric Functional Data Analysis, Springer, New York.
Fraiman, R. and Muniz, G. (2001), ‘Trimmed mean for functional data’, Test 10(2), 419–440.
Gervini, D. (2012), ‘Outlier detection and trimmed estimation for general functional data’, Statistica
Sinica 22(4), 1639–1660.
Goia, A. and Vieu, P. (2016), ‘An introduction to recent advances in high/infinite dimensional
statistics’, Journal of Multivariate Analysis 146, 1–6.
Hall, P. (1986), ‘On the bootstrap and confidence intervals’, The Annals of Statistics 14(4), 1431–1452.
Hall, P. (1988), ‘Theoretical comparison of bootstrap confidence intervals’, The Annals of Statistics
16(3), 927–953.
Hall, P. and Martin, M. A. (1988), ‘On bootstrap resampling and iteration’, Biometrika 75(4), 661–671.
Hall, P. and Martin, M. A. (1989), ‘A note on the accuracy of bootstrap percentile method confidence
intervals for a quantile’, Statistics and Probability Letters 8(3), 197–200.
Hyndman, R. J. and Shang, H. L. (2010), ‘Rainbow plots, bagplots, and boxplots for functional
data’, Journal of Computational and Graphical Statistics 19(1), 29–45.
Lee, S. M. S. and Young, G. A. (1999), ‘The effect of Monte Carlo approximation on coverage
error of double-bootstrap confidence intervals’, Journal of the Royal Statistical Society: Series B
61(2), 353–366.
Li, D., Robinson, P. M. and Shang, H. L. (2020a), ‘Long-range dependent curve time series’, Journal
of the American Statistical Association: Theory and Methods 115(530), 957–971.
Li, D., Robinson, P. M. and Shang, H. L. (2020b), Nonstationary fractionally integrated functional
time series, Working paper, University of York. DOI: 10.13140/RG.2.2.20579.09761.
López-Pintado, S. and Romo, J. (2009), ‘On the concept of depth for functional data’, Journal of the
American Statistical Association: Theory and Methods 104(486), 718–734.
Martin, M. A. (1990a), ‘On bootstrap iteration for coverage correction in confidence intervals’,
Journal of the American Statistical Association: Theory and Methods 85(412), 1105–1118.
Martin, M. A. (1990b), On the double bootstrap, in C. Page and R. LePage, eds, ‘Computing Science
and Statistics’, Springer, New York, pp. 73–78.
McMurry, T. and Politis, D. N. (2011), Resampling methods for functional data, in F. Ferraty and
Y. Romain, eds, ‘The Oxford Handbook of Functional Data Analysis’, Oxford University Press,
Oxford, pp. 189–209.
Morris, J. S. (2015), ‘Functional regression’, Annual Review of Statistics and Its Application 2, 321–359.
Nyarige, E. G. (2016), The bootstrap for the functional autoregressive model FAR(1), PhD thesis,
Technische Universität Kaiserslautern.
URL: https://kluedo.ub.uni-kl.de/frontdoor/index/index/year/2016/docId/4410
Paparoditis, E. (2018), ‘Sieve bootstrap for functional time series’, The Annals of Statistics
46(6B), 3510–3538.
Paparoditis, E. and Shang, H. L. (2020), Bootstrap prediction bands for functional time series,
Technical report, University of Cyprus.
URL: https://arxiv.org/abs/2004.03971
Pešta, M. (2013), ‘Total least squares and bootstrapping with applications in calibration’, Statistics:
A Journal of Theoretical and Applied Statistics 47(5), 966–991.
Ramsay, J. O., Wickham, H., Graves, S. and Hooker, G. (2020), fda: Functional Data Analysis. R
package version 5.1.9.
URL: https://CRAN.R-project.org/package=fda
Ramsay, J. and Silverman, B. (2002), Applied Functional Data Analysis, Springer Series in Statistics,
Springer, New York.
Ramsay, J. and Silverman, B. (2005), Functional Data Analysis, 2nd edn, Springer Series in Statistics,
Springer, New York.
Reiss, P. T., Goldsmith, J., Shang, H. L. and Ogden, R. T. (2017), ‘Methods for scalar-on-function
regression’, International Statistical Review 85(2), 228–249.
Shang, H. L. (2015), ‘Resampling techniques for estimating the distribution of descriptive statistics
of functional data’, Communications in Statistics – Simulation and Computation 44(3), 614–635.
Shang, H. L. (2018), ‘Bootstrap methods for functional time series’, Statistics and Computing 28(1), 1–
10.
Wang, J.-L., Chiou, J.-M. and Müller, H.-G. (2016), ‘Functional data analysis’, Annual Review of
Statistics and Its Application 3, 257–295.