A Comparison of Bootstrap Methods For Variance Estimation
Research Report
Centre of Biostochastics
Abstract
This paper presents a comparison of the nonparametric and parametric bootstrap methods when the statistic of interest is the sample variance estimator. Conditions under which the nonparametric bootstrap method of variance estimation performs better than the parametric bootstrap method are described.
E-mail address of the corresponding author: saeid.amiri@et.slu.se
1 Introduction
There has been much theoretical and empirical research on properties of the
bootstrap method and it has become a standard tool in statistical analysis.
The idea behind the bootstrap method is that if the sample distribution is a good approximation of the population distribution, then the sampling distribution of a statistic of interest can be approximated by generating a large number of new samples from the original sample via sampling with replacement. Bootstrapping treats the sample as the actual population.
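For readers who prefer code to notation, a minimal sketch of this resampling scheme in Python follows; the sample, the statistic, and the number of replicates are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_replicates(x, statistic, B=1000):
    """Draw B samples of size n from x with replacement (treating the
    sample as the population) and evaluate the statistic on each."""
    x = np.asarray(x)
    return np.array([statistic(rng.choice(x, size=len(x), replace=True))
                     for _ in range(B)])

# Example: bootstrap estimate of the standard error of the mean.
sample = rng.normal(size=30)
reps = bootstrap_replicates(sample, np.mean, B=2000)
print(reps.std())
```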
The most important property of the bootstrap method is the ability to estimate the standard error of any well-defined function of the random variables corresponding to the sample data. Applying the bootstrap method requires fewer assumptions than are needed for conventional methods. There are many books and papers on the bootstrap method and its applications in a variety of fields; see e.g. Hall (1992), Efron and Tibshirani (1993), Shao and Tu (1996), Davison and Hinkley (1997), MacKinnon (2002), Janssen and Pauls (2003) and Athreya and Lahiri (2006). Many use the bootstrap method without focusing on the theory. However, by considering the theoretical aspects, it is possible to understand the mechanism behind the simulations.
In this paper we present a finding that helps explain the difference in performance between the nonparametric and parametric bootstrap methods. The statistic of interest is the variance. The commonly applied nonparametric bootstrap resamples the observations from an original sample, whereas the parametric bootstrap method generates bootstrap observations from a given parametric distribution. If justification cannot be provided for the use of a specific parametric distribution, then the nonparametric bootstrap can be used. This is discussed in the rest of this paper.
The nonparametric and parametric methods are considered simultaneously in some studies, but often the results of the simulations are given without explicit discussion of their different performances. For example, Efron and Tibshirani (1993) discuss the nonparametric and parametric bootstrap confidence intervals of the variance by using an example, Ostaszewski and Rempala (2000) explain how to use the bootstrap methods within the actuarial sciences, and Lee (1994) explains how to use a tuning parameter to obtain a more accurate estimate.
It is difficult to study the comparison in general, but it is possible for the main parameters such as the mean and variance. Here we will use a heuristic criterion to compare the bootstrap method with the real distribution. Although the bootstrap method is based on the sample, it is intended to approach the real distribution. According to Hall (1992), the bootstrap method may be expressed as an expectation conditional on the sample or as an integral with respect to the sample distribution function. This allows us to make direct comparisons of the nonparametric and parametric bootstrap methods and to draw conclusions from these comparisons.
We can show that the behavior of the nonparametric and parametric bootstrap methods of variance estimation is affected by the kurtosis, which is explained in Theorem 1. This can be expected because the variance of the variance depends on the fourth moment. Distributions are usually classified by how flat-topped they are relative to the normal distribution. This can be assessed via the sample kurtosis. It should be mentioned that there is no universal agreement about what kurtosis is; see Darlington (1970) and Joanes and Gill (1998). In the case of variance estimation, we show that the bootstrap estimation depends on the kurtosis, whereas there is no difference between the parametric and nonparametric bootstrap methods in the case of mean estimation. We also show that the nonparametric bootstrap method can be better than the parametric bootstrap under some conditions, regardless of whether the real distribution and the distribution of the parametric bootstrap method belong to the same distribution family.
In Section 2, we briefly outline the bootstrap approaches. In Section 3,
the main results are presented. In Section 4, the theoretical discussion is
illustrated using some examples.
2 Bootstrap method
Let us look at the bootstrap stages, which can be formulated as below:
1. Suppose $X = (X_1, \dots, X_n)$ is an i.i.d. random sample from the distribution $F$. Assume $V(X) = \sigma^2$ and $EX^4 < \infty$.
2. We are interested in $\theta(F) = \sigma^2$ and consider the plug-in estimator $\hat\theta = \theta(X_1, \dots, X_n) = \theta(F_n) = S_X^2$, where
$$S_X^2 = \frac{1}{n}\sum_{j=1}^{n} X_j^2 - (\bar X)^2. \qquad (1)$$
3. Generate the bootstrap samples. This can be done in two different ways,
the nonparametric and the parametric bootstrap, with the symbols "$*$" and "$\#$" used to distinguish the approaches.
(i) The nonparametric bootstrap method: $X_{ij}^{*} \overset{\text{iid}}{\sim} F_n(x)$, $i = 1, \dots, B$, $j = 1, \dots, n$. Note that if $Z \sim F_n(x)$ then $EZ = \bar X$ and $V(Z) = S_X^2$, where $S_X^2$ is the second central moment estimator. The kurtosis of $F_n$ is defined as:
$$K_{F_n} = \frac{\sum_{j=1}^{n}(X_j - \bar X)^4/n}{\Big(\sum_{j=1}^{n}(X_j - \bar X)^2/n\Big)^2}. \qquad (2)$$
(ii) The parametric bootstrap method: $X_{ij}^{\#} \overset{\text{iid}}{\sim} G_{\hat\lambda}$, $i = 1, \dots, B$, $j = 1, \dots, n$, where $G_{\hat\lambda} = G(\cdot|\mathcal X)$ is an element of a class $\{G_\lambda,\ \lambda \in \Lambda\}$ of distributions. The parameter $\lambda$ is estimated by statistical methods. We also have $E(X^{\#}) = \bar X$ and $V(X^{\#}) = S_X^2$. In this case, the kurtosis is
$$K_{G(\cdot|\mathcal X)} = \frac{E_{\mathcal X}(X - \bar X)^4}{\big(E_{\mathcal X}(X - \bar X)^2\big)^2}.$$

With "$\times$" standing for either "$*$" or "$\#$", the bootstrap replicates of the variance are
$$S^2(X_i^{\times}) = S^2(X_{i1}^{\times}, \dots, X_{in}^{\times}), \quad i = 1, \dots, B,$$
their average is
$$S^{2\times} = \frac{1}{B}\sum_{i=1}^{B} S^2(X_i^{\times}), \qquad (3)$$
and their spread is measured by
$$V^{\times} = \frac{1}{B}\sum_{i=1}^{B}\Big(S^2(X_i^{\times}) - S^{2\times}\Big)^2 = \frac{\sum_{i=1}^{B}\big(S^2(X_i^{\times})\big)^2}{B} - (S^{2\times})^2. \qquad (4)$$
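The stages above can be sketched in Python as follows; the normal choice of $G_{\hat\lambda}$ in the parametric branch, the default $B$, and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def s2(x):
    """Plug-in variance estimator S_X^2 of (1)."""
    x = np.asarray(x)
    return np.mean(x**2) - np.mean(x)**2

def kurtosis_fn(x):
    """Sample kurtosis K_{F_n} of (2)."""
    d = np.asarray(x) - np.mean(x)
    return np.mean(d**4) / np.mean(d**2)**2

def boot_variance(x, B=1000, parametric=False):
    """Return (S^{2x}, V^x) of (3) and (4) for the nonparametric
    bootstrap (i) or a parametric bootstrap (ii) with G = N(xbar, S_X^2)."""
    x = np.asarray(x)
    n = len(x)
    if parametric:
        samples = rng.normal(x.mean(), np.sqrt(s2(x)), size=(B, n))  # (ii)
    else:
        samples = rng.choice(x, size=(B, n), replace=True)           # (i)
    reps = np.array([s2(row) for row in samples])  # S^2(X_i^x), i = 1..B
    return reps.mean(), reps.var()                 # S^{2x} and V^x of (3), (4)
```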
3 Main results
It is difficult to offer a specific guideline to compare the variances explicitly. Here a criterion $e$ is proposed: the ratio of the conditional expectations of $V^{*}$ and $V^{\#}$, i.e.
$$e = \frac{E(V^{*}|\mathcal X)}{E(V^{\#}|\mathcal X)}. \qquad (5)$$
3.1 Bias
The following theorem clarifies the properties of the nonparametric and parametric bootstrap estimators of variance. It shows explicitly how bootstrapping is affected by the kurtosis.
Theorem 1 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ with $EX^4 < \infty$. Then for the bootstrap methods in (i) and (ii), presented in the previous section:
$$E(S^{2*}|\mathcal X) = E(S^{2\#}|\mathcal X) = \frac{n-1}{n} S_X^2, \qquad (6)$$
$$K_{F_n} < K_{G(\cdot|\mathcal X)} \iff E(V^{*}|\mathcal X) < E(V^{\#}|\mathcal X), \qquad (7)$$
where $K_{F_n}$ and $K_{G(\cdot|\mathcal X)}$ are the sample kurtosis and the kurtosis corresponding to the parametric distribution $G_{\hat\lambda}$ used in (ii).
Proof: For the bootstrap methods in (i) and (ii),
$$E\big(S^2(X_i^{\times})\,\big|\,\mathcal X\big) = E_{\mathcal X}\Big(\frac{1}{n}\sum_{j=1}^{n} X_{ij}^{\times 2} - (\bar X_i^{\times})^2\Big) = \frac{n-1}{n}\Big(E_{\mathcal X}(X^{\times 2}) - \big(E_{\mathcal X}(X^{\times})\big)^2\Big),$$
and according to (3),
$$E_{\mathcal X}(S^{2\times}) = E_{\mathcal X}\Big(\frac{1}{B}\sum_{i=1}^{B} S^2(X_i^{\times})\Big) = E_{\mathcal X}\big(S^2(X_i^{\times})\big). \qquad (8)$$
Therefore
$$E(S^{2*}|\mathcal X) = E(S^{2\#}|\mathcal X) = \frac{n-1}{n} S_X^2,$$
and (6) is verified. The conditional expectation of $V^{\times}$ is given by:
$$E_{\mathcal X}(V^{\times}) = \frac{1}{B}\sum_{i=1}^{B} E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) - E_{\mathcal X}\big((S^{2\times})^2\big)$$
$$= E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) - \Big[\frac{1}{B} E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) + \frac{B-1}{B}\big(E_{\mathcal X}(S^{2\times})\big)^2\Big]$$
$$= \frac{B-1}{B}\Big[E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) - \big(E_{\mathcal X}(S^{2\times})\big)^2\Big], \qquad (9)$$
where
$$E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) = \frac{1}{n^2} E_{\mathcal X}\Big[\Big(\sum_{j=1}^{n} X_j^{\times 2}\Big)^2 + n^2 \bar X^{\times 4} - 2n \bar X^{\times 2}\sum_{j=1}^{n} X_j^{\times 2}\Big]$$
$$= \frac{1}{n^4}\Big[(n-1)^2 n\, E_{\mathcal X}(X^{\times 4}) + (4-4n)n(n-1)\, E_{\mathcal X}(X^{\times 3}) E_{\mathcal X}(X^{\times}) + n(n-1)(n^2+3-2n)\big(E_{\mathcal X}(X^{\times 2})\big)^2$$
$$\qquad + 3(12-4n)\binom{n}{3} E_{\mathcal X}(X^{\times 2})\big(E_{\mathcal X}(X^{\times})\big)^2 + 24\binom{n}{4}\big(E_{\mathcal X}(X^{\times})\big)^4\Big]$$
$$= \frac{n-1}{n^3}\Big[(n-1)\, E_{\mathcal X}\big(X^{\times} - E_{\mathcal X}(X^{\times})\big)^4 + (n^2-2n+3)\Big(E_{\mathcal X}\big(X^{\times} - E_{\mathcal X}(X^{\times})\big)^2\Big)^2\Big]. \qquad (10)$$
By using (8) and (10),
$$E_{\mathcal X}(V^{\times}) = \frac{B-1}{B}\Big(E_{\mathcal X}\big(S^2(X_i^{\times})^2\big) - \big(E_{\mathcal X}(S^{2\times})\big)^2\Big)$$
$$= \Big(\frac{B-1}{B}\Big)\Big(\frac{n-1}{n^3}\Big)\Big[(n-1)\, E_{\mathcal X}\big(X^{\times} - E_{\mathcal X}(X^{\times})\big)^4 - (n-3)\Big(E_{\mathcal X}\big(X^{\times} - E_{\mathcal X}(X^{\times})\big)^2\Big)^2\Big]$$
$$= \Big(\frac{B-1}{B}\Big)\Big(\frac{n-1}{n^3}\Big)\Big(E_{\mathcal X}\big(X^{\times} - E_{\mathcal X}(X^{\times})\big)^2\Big)^2\Big((n-1)K^{\times} - (n-3)\Big)$$
$$= \Big(\frac{B-1}{B}\Big)\Big(\frac{n-1}{n^3}\Big)\big(S_X^2\big)^2\Big((n-1)K^{\times} - (n-3)\Big), \qquad (11)$$
where $K^{\times}$ can be either $K_{F_n}$ or $K_{G(\cdot|\mathcal X)}$. Thus the difference between $E_{\mathcal X}(V^{*})$ and $E_{\mathcal X}(V^{\#})$ depends on the kurtosis. The ratio of the conditional expectations of $V^{*}$ and $V^{\#}$, by using (11), equals:
$$e = \frac{E(V^{*}|\mathcal X)}{E(V^{\#}|\mathcal X)} = \frac{(S_X^2)^2\big((n-1)K_{F_n} - (n-3)\big)}{(S_X^2)^2\big((n-1)K_{G(\cdot|\mathcal X)} - (n-3)\big)} = \frac{(n-1)K_{F_n} - (n-3)}{(n-1)K_{G(\cdot|\mathcal X)} - (n-3)}. \qquad (12)$$
□
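As a worked illustration of (12), take the variable x of Table 1, for which $n = 26$ and $K_{F_n} = 2.59$; with a normal parametric bootstrap ($K_{G(\cdot|\mathcal X)} = 3$),
$$e = \frac{25(2.59) - 23}{25(3) - 23} = \frac{41.75}{52} \approx 0.80,$$
so, conditionally on the sample, the nonparametric replicates of $S^2$ are about 20% less dispersed than the parametric ones.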
From relation (6), it follows that the unconditional expectations of the parametric and nonparametric bootstrap estimators equal
$$E(S^{2*}) = E(S^{2\#}) = \frac{n-1}{n}\, E(S_X^2) = \Big(\frac{n-1}{n}\Big)^2 \sigma^2. \qquad (14)$$
Thus, for a normal parametric bootstrap ($K_{G(\cdot|\mathcal X)} = 3$), (11) gives
$$E_{\mathcal X}(V^{\#}) = 2\Big(\frac{n-1}{n^2}\Big)\Big(\frac{B-1}{B}\Big) S_X^4. \qquad (15)$$
Hence in the case of the normal distribution, if $K_{F_n} < 3$ holds, then $E_{\mathcal X}(V^{*})$ is less than $E_{\mathcal X}(V^{\#})$, and the replications of the nonparametric bootstrap are less dispersed than those of the parametric bootstrap.
Figure 1: Percentage of simulations in which $K_{F_n} < 3$, as a function of sample size.
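A sketch of the kind of simulation behind Figure 1; standard normal data are assumed, and the grid of sample sizes and the number of repetitions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def kurtosis_fn(x):
    """Sample kurtosis K_{F_n} of (2)."""
    d = x - x.mean()
    return np.mean(d**4) / np.mean(d**2)**2

# Estimate P(K_{F_n} < 3) under normality for several sample sizes.
for n in (10, 20, 50, 100, 500):
    k = np.array([kurtosis_fn(rng.normal(size=n)) for _ in range(5000)])
    print(n, np.mean(k < 3))
```

Because the sample kurtosis is biased downwards under normality, the proportion lies well above one half for small samples.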
Theorem 2 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ with $EX^4 < \infty$. Then for the bootstrap methods in (i) and (ii) in Section 2, where $K_{G(\cdot|\mathcal X)}$ is assumed to be independent of the observations, the following relations hold for $V^{*}$ and $V^{\#}$, which are defined in (4):
$$E(V^{*}) = \frac{B-1}{B}\,\frac{(n-1)^2}{n^6}\,\sigma^4\Big(K(n-1)(n^2-4n+6) + (-n^3+11n^2-24n+18)\Big), \qquad (17)$$
$$E(V^{\#}) = \frac{B-1}{B}\,\frac{(n-1)^2}{n^6}\,\sigma^4\Big(K(n-1) + (n^2-2n+3)\Big)\Big((n-1)K_{G(\cdot|\mathcal X)} - (n-3)\Big), \qquad (18)$$
where $K$ is the kurtosis of $F$.
The proof of the theorem is completed by inserting these equations into the corresponding terms in (11). □
This theorem states that $E(V^{*})$ depends on $K$, whereas $E(V^{\#})$ depends on $K$ and $K_{G(\cdot|\mathcal X)}$. It should be noted that if $K_{G(\cdot|\mathcal X)}$ depends on the observations, as for example for the lognormal distribution, then it is impossible to present a closed-form solution. Hence in this case, studying the performance of the parametric bootstrap is rather difficult. However, for the nonparametric bootstrap, (17) always holds. It is obvious that the methods are biased. In the case of the normal distribution, the following corollary is a direct result of Theorem 2.
Corollary 1 If $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F = N(\mu, \sigma^2)$ and also if $G(\cdot|\mathcal X) = N(\bar X, S_X^2)$, then the following relations hold:
$$\frac{Bn^3}{(B-1)(n-1)(n^2-2n+3)}\,E(V^{*}) = \frac{Bn^3}{(B-1)(n^2-1)n}\,E(V^{\#}) = V(S_X^2). \qquad (22)$$
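Rearranging (22) gives the exact ratio of the two expectations under normality:
$$\frac{E(V^{*})}{E(V^{\#})} = \frac{(n-1)(n^2-2n+3)}{(n^2-1)n},$$
which equals $747/990 \approx 0.75$ for $n = 10$ and tends to 1 as $n \to \infty$; even when both the data and $G(\cdot|\mathcal X)$ are normal, the two bootstrap methods differ noticeably in small samples.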
Figure 2: Plots of $E(V^{\#})/V(S_X^2)$ and $E(V^{*})/V(S_X^2)$ versus kurtosis, for sample sizes of 10 and 30, respectively.
Figure 3: Plots of $E(V^{\#})/V(S_X^2)$ and $E(V^{*})/V(S_X^2)$ versus kurtosis, for sample sizes of 10 and 30, where $G(\cdot|\mathcal X) \sim N(\bar X, S_X^2)$.
Consider the exponential power distribution (EPD), with density
$$f_X(x) = \frac{1}{2p^{1/p}\,\Gamma(1+1/p)\,\sigma_p}\exp\Big(\frac{-|x-\mu|^p}{p\,\sigma_p^p}\Big), \quad p > 0,\ x \in \mathbb R, \qquad (23)$$
where $\mu$ and $\sigma_p$ are location and scale parameters and $p$ is a shape parameter. The following relations hold:
$$\mu = E(X), \qquad \sigma_p = E\big(|X-\mu|^p\big)^{1/p}, \qquad K = \frac{\Gamma(1/p)\,\Gamma(5/p)}{\big(\Gamma(3/p)\big)^2}.$$
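The EPD coincides with the generalized normal distribution, so samples can be drawn with scipy; in the sketch below the mapping scale $= p^{1/p}\sigma_p$ translates (23) into scipy's parametrization (an assumption worth verifying), and the chosen values of $p$ are illustrative.

```python
import numpy as np
from scipy.special import gamma
from scipy.stats import gennorm

def epd_kurtosis(p):
    """Closed-form EPD kurtosis given below (23)."""
    return gamma(1/p) * gamma(5/p) / gamma(3/p)**2

rng = np.random.default_rng(3)
for p in (1.0, 2.0, 4.0):
    # gennorm(beta=p) has density ~ exp(-|x|^p); scale = p**(1/p)
    # corresponds to sigma_p = 1 in the parametrization of (23).
    x = gennorm.rvs(p, scale=p**(1/p), size=200_000, random_state=rng)
    d = x - x.mean()
    print(p, epd_kurtosis(p), np.mean(d**4) / np.mean(d**2)**2)
```

For $p = 2$ the density reduces to the normal ($K = 3$) and for $p = 1$ to the Laplace distribution ($K = 6$), so $p$ controls how far the kurtosis moves from the normal value.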
Figure 4: Violin plot of the nonparametric and parametric bootstrap of the EPD. The violin plot, a combination of a box plot and a kernel density plot (see Hintze and Nelson, 1998), helps to study the results of the simulations.
3.2 MSE
The variability of an estimator can also be assessed by its MSE, defined as
$$MSE(\hat\theta) = V(\hat\theta) + (\text{Bias})^2. \qquad (24)$$
For the plug-in variance estimator,
$$MSE(S_X^2) = V(S_X^2) + \frac{1}{n^2}\,\sigma^4. \qquad (25)$$
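The second term in (25) is the squared bias of the plug-in estimator, since
$$\mathrm{Bias}(S_X^2) = E(S_X^2) - \sigma^2 = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}.$$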
In the following, the conditional MSE, which is the direct result of the bootstrap method, is discussed first; the unconditional MSE is a combination of the bootstrap and the frequentist approaches. The following lemma discusses the bootstrap estimation of $S^2$.
Lemma 1 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ with $E(X^4) < \infty$. Then for the bootstrap methods in (i) and (ii) in Section 2:
Proof: Because the $S_i^{2\times}$ are conditionally independent, the following equation holds:
$$V(S^{2\times}|\mathcal X) = \frac{1}{B}\, V(S_i^{2\times}|\mathcal X).$$
For $B \to \infty$, this tends to zero, and therefore $MSE(S^{2\times}|\mathcal X)$ converges to the squared bias, which is given in (26). □
Lemma 2 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ with $E(X^4) < \infty$. Then for the bootstrap methods explained in (i) and (ii) in Section 2:
$$\lim_{B\to\infty} MSE(S^{2*}) = \lim_{B\to\infty} MSE(S^{2\#}) = \Big(\frac{n-1}{n}\Big)^2 V(S_X^2) + \Big(\frac{1-2n}{n^2}\Big)^2 \sigma^4. \qquad (27)$$
Proof: It holds that:
$$V(V^{\times}|\mathcal X) = V\Big(\frac{1}{B}\sum_{i=1}^{B}\big(S^2(X_i^{\times}) - S^{2\times}\big)^2\,\Big|\,\mathcal X\Big)$$
$$= \frac{B-1}{B^3}\Big((B-1)\,E\big((S^2(X_i^{\times}) - S^{2\times})^4\,\big|\,\mathcal X\big) - (B-3)\Big(E\big((S^2(X_i^{\times}) - S^{2\times})^2\,\big|\,\mathcal X\big)\Big)^2\Big).$$
□
This lemma shows that the conditional $MSE(V^{\times})$ is affected by the kurtosis via $E(V^{\times}|\mathcal X)$, as expected from the discussion of Theorem 1. The following lemma is necessary for the discussion of $MSE(V^{\times})$.
Lemma 4 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ and $E(X^8) < \infty$. Then
$$V\big(\hat V(S_X^2)\big) = \Big(\frac{n-1}{n^3}\Big)^2\Big(n^2\, V(\hat\mu_4) + n^2\, V(\hat\mu_2^2)\Big) + O(n^{-4}), \qquad (31)$$
where $\hat V(S_X^2)$ is the estimate of (16).
Proof: Let
$$\hat V(S_X^2) = \frac{n-1}{n^3}\big((n-1)\hat\mu_4 - (n-3)\hat\mu_2^2\big),$$
where $\hat\mu_2$ and $\hat\mu_4$ are the estimators of the second and fourth central moments. Then it can be shown that
$$V\big(\hat V(S_X^2)\big) = \Big(\frac{n-1}{n^3}\Big)^2\Big((n-1)^2\, V(\hat\mu_4) + (n-3)^2\, V(\hat\mu_2^2) - 2(n-1)(n-3)\,\mathrm{Cov}(\hat\mu_4, \hat\mu_2^2)\Big),$$
where
$$\mathrm{Cov}(\hat\mu_4, \hat\mu_2^2) = \frac{1}{n}\,V(\hat\mu_4) + \frac{2(n-1)}{n^2}\,\mathrm{Cov}\big((X_1-\bar X)^4,\ (X_1-\bar X)^2(X_2-\bar X)^2\big) + \frac{(n-1)(n-2)}{n^2}\,\mathrm{Cov}\big((X_1-\bar X)^4,\ (X_2-\bar X)^2(X_3-\bar X)^2\big).$$
Moreover, it can be shown by some algebra that:
$$\mathrm{Cov}\big((X_1-\bar X)^4,\ (X_1-\bar X)^2(X_2-\bar X)^2\big) = (\mu_6\mu_2 - \mu_4\mu_2^2) + \frac{1}{n}\big(21\mu_4\mu_2^2 - 7\mu_6\mu_2 - 6\mu_2^4\big) + O(n^{-2}),$$
$$\mathrm{Cov}\big((X_1-\bar X)^4,\ (X_2-\bar X)^2(X_3-\bar X)^2\big) = \frac{1}{n^2}\big(23\mu_2^2\mu_4 - 85\mu_2^4 + 2\mu_6\mu_2\big) + O(n^{-3}).$$
Therefore, by using these relations, the following result is obtained:
$$V\big(\hat V(S_X^2)\big) = \Big(\frac{n-1}{n^3}\Big)^2\Big(n^2\, V(\hat\mu_4) + n^2\, V(\hat\mu_2^2)\Big) + O(n^{-4}). \qquad (33)$$
□
The next theorem discusses $MSE(V^{\times})$ in general.
Theorem 3 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F$ and $E(X^8) < \infty$. If $K_{G(\cdot|\mathcal X)}$ is independent of the observations, then by using the bootstrap estimation given in (i) and (ii) in Section 2:
$$MSE(V^{*}) = \Big(\frac{B-1}{B}\Big)^2 V\big(\hat V(S_X^2)\big) + \big(E(V^{*}) - V(S_X^2)\big)^2, \qquad (34)$$
$$MSE(V^{\#}) = \Big(\frac{B-1}{B}\Big)^2\Big(\frac{n-1}{n^3}\Big)^2\big((n-1)K_{G(\cdot|\mathcal X)} - (n-3)\big)^2\, V(S_X^4) + \big(E(V^{\#}) - V(S_X^2)\big)^2, \qquad (35)$$
where $E(V^{*})$ and $E(V^{\#})$ are given in Theorem 2 and $V(\hat V(S_X^2))$ can be found by Lemma 4.
Proof: The proof can be obtained directly by using the definition of MSE, Lemma 3 and Lemma 4. □
This theorem can be used to find $MSE(V^{\times})$ for the nonparametric and parametric bootstrap. The next corollary is an application of this theorem for the normal distribution.
Table 1: Data used to study the simulation of variance

Variable  K_{F_n}  Data
x         2.59     48 36 20 29 42 42 20 42 22 41 45 14 6 0 33 28 34 4 32 24 47 41 24 26 30 41
y         3.41     48 36 20 29 42 42 20 42 22 41 45 14 30 0 33 28 34 24 32 24 47 41 24 26 30 41
Corollary 2 Let $X = (X_1, \dots, X_n) \overset{\text{iid}}{\sim} F = N(\mu, \sigma^2)$ and let $G(\cdot|\mathcal X) = N(\bar X, S_X^2)$. Then the following relation holds asymptotically:
4 Simulations
This section presents simulations of the bootstrap methods to clarify the results obtained theoretically in Section 3.
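A sketch of such a simulation in Python, using the variable x from Table 1; the 1000 repetitions match Table 2, while B and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

x = np.array([48, 36, 20, 29, 42, 42, 20, 42, 22, 41, 45, 14, 6,
              0, 33, 28, 34, 4, 32, 24, 47, 41, 24, 26, 30, 41], float)

def s2(a):
    """Plug-in variance estimator S_X^2 of (1)."""
    return np.mean(a**2) - np.mean(a)**2

def boot(a, B=500, parametric=False):
    """One bootstrap run: return (S^{2x}, V^x) as in (3) and (4)."""
    n = len(a)
    if parametric:
        samples = rng.normal(a.mean(), np.sqrt(s2(a)), size=(B, n))
    else:
        samples = rng.choice(a, size=(B, n), replace=True)
    reps = np.array([s2(row) for row in samples])
    return reps.mean(), reps.var()

# Proportion of repetitions in which V* < V#, as reported in Table 2.
wins = 0
for _ in range(1000):
    _, v_star = boot(x)
    _, v_hash = boot(x, parametric=True)
    wins += v_star < v_hash
print(wins / 1000)
```

Since $K_{F_n} = 2.59 < 3$ for x, Theorem 1 predicts $E(V^{*}|\mathcal X) < E(V^{\#}|\mathcal X)$, which is reflected in the high proportion (0.990) reported for x in Table 2.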
Table 2: Simulation of $S^{2\times}$ and $V^{\times}$

Variable  $S^{2*}$, $S^{2\#}$   Ratio†   $V^{*}$, $V^{\#}$    Ratio‡
x         165.00, 165.00        0.517    1755.70, 2178.04     0.990
y         118.31, 118.37        0.492    1338.85, 1124.60     0.042

† The proportion of the 1000 simulations in which $S^{2*} < S^{2\#}$.
‡ The proportion of the 1000 simulations in which $V^{*} < V^{\#}$.
$$S^2 \pm t_{\alpha/2}\,\widehat{se}_B^{\times},$$
where $\widehat{se}_B^{\times}$ is the bootstrap estimate of the standard error.
where $t^{\times}_{\alpha/2}$ is the $\alpha/2$ percentile of $t^{\times} = \dfrac{S^{2\times} - S^2}{\sqrt{V(S^2)^{\times}}}$, where $S^{2\times}$ and $V(S^2)^{\times}$ are estimated by the bootstrap method.
where $\chi^{2\times}_{\alpha/2}$ is the percentile of $\chi^{2\times} = \dfrac{nS^{2\times}}{S^2}$.
Method VI This method is called the percentile CI:
$$[\hat\theta_{\%low},\ \hat\theta_{\%up}] = \big[\hat G^{-1}(\alpha/2),\ \hat G^{-1}(1-\alpha/2)\big],$$
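A sketch of Method VI for the variance, approximating $\hat G$ by the empirical distribution of the bootstrap replicates; B and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def percentile_ci_variance(x, B=500, alpha=0.05):
    """Percentile CI (Method VI) for the variance: the alpha/2 and
    1 - alpha/2 quantiles of the bootstrap replicates of S_X^2."""
    x = np.asarray(x, float)
    n = len(x)
    samples = rng.choice(x, size=(B, n), replace=True)
    reps = np.array([np.mean(r**2) - np.mean(r)**2 for r in samples])
    return np.quantile(reps, [alpha / 2, 1 - alpha / 2])
```

Applied to the x data of Table 1, this should give an interval comparable to the "VI non" row of Table 3.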
Tables 3 and 4 include the bootstrap confidence intervals at the 95% level for x and y with B = 500, which are discussed in the previous example. The parametric bootstrap is done with the normal distribution. The first two lines of both tables are the standard methods for the construction of a CI of the variance, which are based on the t and $\chi^2$ distributions. Method I has a smaller length than Method II because the former is based on a symmetrical distribution, while in reality the distribution of the variance is asymmetrical. Method II is known as the exact method and can serve as a benchmark for studying the different methods.
The length of the parametric bootstrap CI for the variance of x is wider than that of the nonparametric bootstrap. In contrast, the length of the parametric bootstrap CI for the variance of y is shorter than that of the nonparametric bootstrap. Because Method III uses the square root of $V(S^2)$, which depends on the kurtosis, this method is directly affected by the kurtosis. Method IV uses the bootstrap resamples in t. Although Methods V and VI do not use $V(S^2)^{\times}$ directly, they are based on the $\alpha/2$ and $1-\alpha/2$ percentiles, and of course their spread is directly affected by the kurtosis. Method VII is similarly affected.
Table 3: Confidence intervals at 95% for x and y

                      x                              y
Method     Low       Up        Length     Low      Up        Length
I          99.018    244.049   145.031    59.050   187.094   128.043
II         118.448   305.233   186.784    84.984   218.999   134.014
III non    100.064   243.003   142.938    61.886   184.258   122.372
III par    91.483    251.584   160.101    66.084   180.060   113.976
IV non     110.249   283.828   173.578    75.743   295.122   219.379
IV par     115.847   309.760   193.912    77.494   236.850   159.356
V non      124.379   306.281   181.902    83.498   225.435   141.936
V par      120.598   311.475   190.876    85.460   219.482   134.022
VI non     99.927    233.364   133.437    65.223   183.094   117.870
VI par     96.051    248.405   152.353    69.262   175.156   103.660
VII non    119.520   258.307   138.786    79.792   227.236   147.443
VII par    113.565   289.907   176.342    84.531   217.723   133.192
5 Conclusions
This paper discusses bootstrap estimation of the variance in the nonparametric and the parametric setting and studies their behavior. It shows that the expectations of the parametric and nonparametric bootstrap estimators of the variance are equal (6), but that the bootstrap standard error depends on the kurtosis (7). If the distribution of the sample is normal and the parametric bootstrap is based on the normal distribution, then the parametric bootstrap can be expected to be better than the nonparametric bootstrap, i.e. closer to the sample distribution. If $K_{F_n} > 3$, then for small sample sizes the nonparametric bootstrap method seems more appropriate.
Moreover, Theorem 2 gives the expectations of $V^{*}$ and $V^{\#}$. In the case of the nonparametric method, this depends on $K$, but for the parametric method it depends on both $K$ and $K_{G(\cdot|\mathcal X)}$. When $K_{G(\cdot|\mathcal X)}$ depends on the observations, the given general form of the parametric bootstrap does not hold.
Figure 2 explains the expected result. It clarifies that when $K$ is between 1.4 and 2, the result of the nonparametric bootstrap is more appropriate, regardless of whether $G(\cdot|\mathcal X)$ and $F$ belong to the same distribution family.
This paper emphasizes that special care should be taken when making claims about the accuracy of the parametric bootstrap approach in applications. Figure 3, which is based on Theorem 2, clarifies how much the result is affected by a wrong choice of distribution when the distribution of the population is not normal.
Two kinds of expectations are discussed throughout: conditional and unconditional. The conditional expectation clarifies the result of the bootstrapping, whereas the unconditional expectation is a combination of the bootstrapping and a frequentist approach.
References
[1] Athreya, K.B. and Lahiri, S.N. (2006). Measure Theory and Probability Theory. Springer, New York.
[2] Chiodi, M. (1995). Generation of pseudo random variates from a normal distribution of order p. Statistica Applicata, 7(4), 401-416.
[3] Cramér, H. (1945). Mathematical Methods of Statistics. Almqvist & Wiksells, Uppsala.
[4] Darlington, R.B. (1970). Is kurtosis really peakedness? The American Statistician, 24, 19-22.
[5] Davison, A.C. and Hinkley, D.V. (1997). Bootstrap Methods and Their Application. Cambridge University Press, Cambridge.
[6] Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York.
[7] Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.
[8] Hintze, J.L. and Nelson, R.D. (1998). Violin plots: a box plot-density trace synergism. The American Statistician, 52(2), 181-184.
[9] Janssen, A. and Pauls, T. (2003). How do bootstrap and permutation tests work? Annals of Statistics, 31, 768-806.
[10] Joanes, D.N. and Gill, C.A. (1998). Comparing measures of sample skewness and kurtosis. The Statistician, 47(1), 183-189.
[11] Lee, S.S. (1994). Optimal choice between parametric and non-parametric bootstrap estimates. Math. Proc. Camb. Phil. Soc., 115, 335.
[12] MacKinnon, J.G. (2002). Bootstrap inference in econometrics. Canadian Journal of Economics, 35, 615-645.
[13] Ostaszewski, K. and Rempala, G.A. (2000). Parametric and nonparametric bootstrap in actuarial practice. www.actuarialfoundation.org/research edu/parametic.pdf.
[14] Shao, J. and Tu, D. (1996). The Jackknife and Bootstrap. Springer, New York.