A general bootstrap algorithm for hypothesis testing
Article history: Received 14 December 2010; Received in revised form 2 June 2011; Accepted 5 September 2011; Available online 10 September 2011.

Keywords: Gini index; Survival model; Competing risk; Cumulative incidence function

Abstract: The bootstrap is an intensive computer-based method originally devoted mainly to estimating the standard deviations, confidence intervals and bias of the studied statistic. This technique is useful in a wide variety of statistical procedures; however, its use for hypothesis testing, when the data structure is complex, is not straightforward, and each case must be treated individually. A general bootstrap method for hypothesis testing is studied. The considered method preserves the data structure of each group independently, and the null hypothesis is used only to compute the bootstrap statistic values (not at the resampling stage, as usual). The asymptotic distribution is developed and several case studies are discussed.

© 2011 Elsevier B.V. All rights reserved.
1. Introduction
The bootstrap method, introduced and explored in detail by Efron (1979, 1982), is a (not only but mainly) nonparametric, intensive computer-based method of statistical inference which is often used to solve real problems without needing to know the underlying mathematical formulas. In particular, the bootstrap is really useful for assigning measures of accuracy to statistical estimates.
Obviously, there exists a vast literature on it. Among others, the monographs of Efron and Tibshirani (1993), Hall (1992) or Shao and Tu (1995) addressed the problem from different approaches. Besides, bootstrap resampling plans alternative to the original one have been proposed. González-Manteiga and Prada-Sánchez (1994) provided a brief review of the smoothed, symmetrized and Bayesian bootstraps.
Originally, the bootstrap was devoted to confidence interval construction, and there exists a huge number of papers on this topic (see, e.g., Hall, 1988 or DiCiccio and Efron, 1996 and references therein). Of course, there is an intimate connection between confidence intervals and hypothesis testing. However, the two procedures can differ because of the need (for hypothesis testing) to generate the bootstrap distribution of the selected test statistic under a specific null hypothesis (Martin, 2007). Once this point is taken care of, bootstrap methods provide a creative way of building hypothesis tests without the need for restrictive parametric assumptions (see, e.g., Silverman, 1981 or Davison and Hinkley, 1997 and references therein).
Nonetheless, although authors such as Davison and Hinkley (1997) dealt with the problem of developing fully nonparametric null models from which resampling can be carried out when no simple null model exists, there is a vast variety of problems in which resampling under the null implies, in some way, the (partial) loss of the original data structure (e.g., marginal distribution comparison of k-dimensional random variables). Moreover, when the null does not necessarily imply equality among the involved cumulative distribution functions (CDFs), the usual bootstrap resampling plan may not be a good method for estimating the variability of the statistic (see, for instance, the analysis of the Gini index in Section 3 of the present paper).
Correspondence to: Oficina de Investigación Biosanitaria del Principado de Asturias, C/Rosal 7 bis, 33009 Oviedo, Spain. Tel.: +34 985109805.
E-mail addresses: pablomc@ficyt.es (P. Martínez-Camblor), norbert@uniovi.es (N. Corral).
0378-3758/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2011.09.003
590 P. Martı́nez-Camblor, N. Corral / Journal of Statistical Planning and Inference 142 (2012) 589–600
In this paper, a general resampling plan for hypothesis testing is studied. Our bootstrap algorithm, previously considered to develop a kernel density based test for the classical k-sample problem under paired designs (Martínez-Camblor, 2010a), allows us to preserve the particular structure of each (involved) group. The key of this algorithm is that the null is used in order to compute the (bootstrap) statistic values instead of at the resampling step (as usual).
The rest of the paper is organized as follows: in Section 2, the problem of general hypothesis testing is introduced, and general asymptotic distributions for the traditional and the new bootstrap procedures are developed. In Section 3, we analyze the problem of testing the equality among Gini indices from independent samples (equality among Gini indices does not imply equality among the underlying distribution functions). Section 4 is devoted to the comparison of survival curves under unequal censoring time distributions. Finally, in Section 5 we deal with the comparison of cumulative incidence functions (CIFs) in the competing risk setting.
The studied algorithm is really simple and easy to implement; in addition, it performs well (in the sense that it leads to a good approximation of the distribution of interest) in all the considered problems. Finally, we remark that, arguing as in Horváth (1991), we can assume without loss of generality that all random variables and processes are defined on the same (and adequate) probability space.
2. General hypothesis testing

With the purpose of developing a test to check whether a parameter (or function) J (J(t) for functions) is the same for k different populations (of course, J depends on the underlying distribution function, F), i.e., to contrast the hypothesis

    H_0 : J_1 = \cdots = J_k (= J),    (1)

we must first choose an adequate estimator of the target. Let \hat{J}_n be this estimator. If \hat{J}_n = (\hat{J}_{n_1}, \ldots, \hat{J}_{n_k}) stands for the k-dimensional vector where \hat{J}_{n_i} is the estimator of J_i (the value of J in the ith group, i \in 1, \ldots, k), we assume that, for a certain fixed \lambda \in (0,1), the weak convergence

    n^{\lambda}\{\hat{J}_n - J\} \longrightarrow_L \mathcal{D}_k[J]    (2)

is satisfied, with J = (J_1, \ldots, J_k), where \mathcal{D}_k[J] is a k-dimensional probability law which may depend on the real (and unknown) value of J. For different sample sizes, n^{\lambda}\{\hat{J}_n - J\} denotes the vector \{n_1^{\lambda}(\hat{J}_{n_1} - J_1), \ldots, n_k^{\lambda}(\hat{J}_{n_k} - J_k)\}. Note that (2) implies that, if \mathcal{C}_k[J] = \{G_{(1)}[J_1], \ldots, G_{(k)}[J_k]\} is a k-dimensional vector with distribution \mathcal{D}_k[J], then we also have the convergence n^{\lambda}\{\hat{J}_n - J\} \longrightarrow_L \mathcal{C}_k[J]. Moreover, for each i \in 1, \ldots, k we have that n_i^{\lambda}\{\hat{J}_{n_i} - J_i\} \longrightarrow_L G_{(i)}[J_i] (\mathcal{D}[J_i] denotes the (marginal) distribution of G_{(i)}[J_i], 1 \le i \le k).
Let X = \{X_1, \ldots, X_k\}, with X_i = \{x_{i1}, \ldots, x_{in_i}\} for i \in 1, \ldots, k, be k random samples (X could also be a random sample from a k-dimensional random variable with marginals X_i). It can be assumed (without loss of generality) that the hypothesis in (1) is rejected for large values of the statistic

    T_N = \sum_{i=1}^{k} c_N( n_i^{\lambda} \{\hat{J}_{n_i} - \hat{J}\} ),    (3)

where \{c_N\}_{N \in \mathbb{N}} is a sequence of real functions such that c_N \to_N c, \hat{J}_{n_i} is the estimation of J in the ith sample (1 \le i \le k) and \hat{J} = N^{-1} \sum_{i=1}^{k} n_i \hat{J}_{n_i} (N = \sum_{i=1}^{k} n_i). Under the null (and only under the null), it is easy to derive the equality

    T_N = \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n)\, n_j^{\lambda} \{\hat{J}_{n_j} - J_j\} ),    (4)

with a_{ii}(n) = (1 - n_i/N) and a_{ij}(n) = -(n_i^{\lambda} n_j^{1-\lambda}/N) for j \ne i (1 \le i, j \le k). Under general and mild conditions (continuity is sufficient although not needed) on the functions c_N (N \in \mathbb{N}), and if a_{ij}(n) \to_N a_{ij} \in \mathbb{R} (i.e., there exist real constants c_{ij} such that n_i/n_j \to_{n_i,n_j} c_{ij} for i, j \in 1, \ldots, k), the following convergence is derived directly from (2) and (4):

    T_N \longrightarrow_L \sum_{i=1}^{k} c( \sum_{j=1}^{k} a_{ij} G_{(j)}[J_j] ),    (5)

where G_{(1)}[J], \ldots, G_{(k)}[J] are the k components of the k-dimensional random vector \mathcal{C}_k[J] which appears in statement (2) (note that the marginal distributions are \mathcal{D}[J_j], 1 \le j \le k).
The classical bootstrap method for hypothesis testing (assuming independence among the samples) replicates the original problem by drawing k independent (bootstrap) samples from the pooled sample distribution (\hat{F}_N) following the general algorithm:

B1. Compute the statistic value, T_N, from the original sample X = \{X_1, \ldots, X_k\}.
B2. Draw B samples X^{*,b} = \{X_1^{*,b}, \ldots, X_k^{*,b}\} (1 \le b \le B) from the pooled sample distribution (\hat{F}_N).
B3. For b \in 1, \ldots, B, compute T_N^{*,b}, the statistic value referred to the sample X^{*,b}.
B4. The distribution of T_N under the null is approximated by \{T_N^{*,1}, \ldots, T_N^{*,B}\}, i.e., the final P-value is computed as

    P^* = B^{-1} \sum_{b=1}^{B} I\{T_N \le T_N^{*,b}\},

where I\{A\} stands for the usual indicator function on the set A (taking value 1 if A is true and 0 otherwise).
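As an illustration, steps B1-B4 can be sketched in a few lines of code (a minimal Python sketch; the particular statistic used, a weighted comparison of group means, and all function names are illustrative choices, not the authors' implementation):

```python
import numpy as np

def pooled_bootstrap_pvalue(samples, statistic, B=2000, seed=None):
    """Classical bootstrap test (steps B1-B4): every group is resampled
    from the pooled sample, i.e., from the ECDF of the joined data."""
    rng = np.random.default_rng(seed)
    pool = np.concatenate(samples)            # pooled sample (ECDF F_N)
    t_obs = statistic(samples)                # B1: statistic on the original data
    t_star = np.empty(B)
    for b in range(B):                        # B2-B3: resample and recompute
        boot = [rng.choice(pool, size=len(s), replace=True) for s in samples]
        t_star[b] = statistic(boot)
    return np.mean(t_obs <= t_star)           # B4: bootstrap P-value

def mean_stat(samples):
    """Illustrative statistic: weighted squared deviations of group means."""
    n = np.array([len(s) for s in samples], dtype=float)
    means = np.array([np.mean(s) for s in samples])
    grand = np.sum(n * means) / n.sum()
    return np.sum(n * (means - grand) ** 2)
```

For instance, `pooled_bootstrap_pvalue([x1, x2, x3], mean_stat)` returns a bootstrap P-value for a k-sample comparison of means.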
If T_N^* denotes the bootstrap version of T_N (the underlying distribution function is the ECDF computed from the pooled sample, \hat{F}_N), since for each i \in 1, \ldots, k, n_i^{\lambda}\{\hat{J}_{n_i} - J_i\} \longrightarrow_L G_{(i)}[J_i], we have that, for each u \in \mathbb{R},

    ( P_X\{T_N^* \le u\} - P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij} G_{(j)}[\hat{J}_N] ) \le u \} ) \longrightarrow_N 0  a.s.,    (6)

where G_{(1)}[\hat{J}_N], \ldots, G_{(k)}[\hat{J}_N] are k independent random variables with distribution \mathcal{D}[\hat{J}_N] and P_X denotes probability conditional on the sample X.
In a vast variety of situations, the traditional bootstrap works correctly and provides a valuable tool for computing the distribution of the statistic under the null. Moreover, all the (bootstrap) samples generated by the above algorithm come from the same distribution, \hat{F}_N, and, therefore, the distribution derived from B1-B4 is always a distribution under the null. However, the null does not necessarily imply equality among the k distribution functions and, in some particular problems, the distribution derived from the above algorithm can produce unsatisfactory critical regions.

Note that the origin of the error lies in assuming equality among the underlying distribution functions when this is untrue. If we resample from each sample separately (in particular, from \hat{F}_{n_i} with i \in 1, \ldots, k), Eq. (4) (which only uses the relevant information contained in the null) suggests the following bootstrap estimator:
    T_N^{**} = \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n)\, n_j^{\lambda} \{\hat{J}_{n_j}^{**} - \hat{J}_{n_j}\} ) = \sum_{i=1}^{k} c_N( n_i^{\lambda} [ (\hat{J}_{n_i}^{**} - \hat{J}^{**}) - (\hat{J}_{n_i} - \hat{J}) ] ),    (7)

where, for i \in 1, \ldots, k, \hat{J}_{n_i}^{**} is the value of the statistic in a sample drawn from \hat{F}_{n_i} and \hat{J}^{**} = N^{-1} \sum_{i=1}^{k} n_i \hat{J}_{n_i}^{**}.
This expression allows us to define the following (simple) algorithm:

N1. Compute the statistic value, T_N = \sum_{i=1}^{k} c_N( n_i^{\lambda} \{\hat{J}_{n_i} - \hat{J}\} ), from the original sample X = \{X_1, \ldots, X_k\}.
N2. For each i \in 1, \ldots, k, draw B samples (of size n_i) X_i^{**,b} from \hat{F}_{n_i} to build X^{**,b} = \{X_1^{**,b}, \ldots, X_k^{**,b}\} (1 \le b \le B).
N3. For b \in 1, \ldots, B, compute T_N^{**,b} = \sum_{i=1}^{k} c_N( n_i^{\lambda} [ (\hat{J}_{n_i}^{**,b} - \hat{J}^{**,b}) - (\hat{J}_{n_i} - \hat{J}) ] ), the statistic value referred to the sample X^{**,b} (\hat{J}_{n_i} and \hat{J} are still the estimations from the original sample).
N4. The distribution of T_N under the null is approximated by \{T_N^{**,1}, \ldots, T_N^{**,B}\}, i.e., the final P-value is computed as

    P^{**} = B^{-1} \sum_{b=1}^{B} I\{T_N \le T_N^{**,b}\}.
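The steps N1-N4 above can be sketched as follows (an illustrative Python sketch, assuming the usual rate \lambda = 1/2 and a generic pair (J, c); it is not the authors' code). Note that the null enters only at step N3, through the centring term of Eq. (7):

```python
import numpy as np

def new_bootstrap_pvalue(samples, J, c, B=2000, seed=None):
    """New bootstrap test (steps N1-N4): each group is resampled from its
    OWN ECDF; the null is used only when the bootstrap statistic is built."""
    rng = np.random.default_rng(seed)
    lam = 0.5                                        # assumed rate, lambda = 1/2
    n = np.array([len(s) for s in samples], dtype=float)
    N = n.sum()
    J_hat = np.array([J(s) for s in samples])        # J-hat_{n_i}
    J_bar = np.sum(n * J_hat) / N                    # pooled estimate J-hat
    t_obs = np.sum(c(n ** lam * (J_hat - J_bar)))    # N1
    t_star = np.empty(B)
    for b in range(B):
        # N2: resample each group from its own empirical distribution
        boot = [rng.choice(s, size=len(s), replace=True) for s in samples]
        Jb = np.array([J(s) for s in boot])
        Jb_bar = np.sum(n * Jb) / N
        # N3: Eq. (7) -- subtracting (J_hat - J_bar) imposes the null here
        t_star[b] = np.sum(c(n ** lam * ((Jb - Jb_bar) - (J_hat - J_bar))))
    return np.mean(t_obs <= t_star)                  # N4
```

For example, `new_bootstrap_pvalue([x1, x2], J=np.mean, c=np.square)` tests the equality of two group means while resampling each group from its own data.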
If T_N^{**} stands for the new bootstrap version of T_N (following the above algorithm), Theorem 1 guarantees that, under the null hypothesis together with usual and mild conditions, the distribution of T_N^{**} approximates the T_N distribution.

Theorem 1. Under the conditions and notations previously introduced, if \mathcal{D}_k[J + \delta] \to_{\delta \to 0} \mathcal{D}_k[J] (\mathcal{D}_k being the law involved in (2)), then for each u \in \mathbb{R} we have the convergence

    ( P_X\{T_N^{**} \le u\} - P\{T_N \le u\} ) \longrightarrow_N 0  a.s.,    (8)

where P_X denotes probability conditional on the sample X.

Proof. Arguing as for (6), but resampling each group from its own ECDF, for each u \in \mathbb{R} we obtain

    ( P_X\{T_N^{**} \le u\} - P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n) G_{(j)}[\hat{J}_{n_j}] ) \le u \} ) \longrightarrow_N 0  a.s.,    (9)

where G_{(1)}[\hat{J}_{n_1}], \ldots, G_{(k)}[\hat{J}_{n_k}] are the k components of the k-dimensional random vector \mathcal{C}_k[\hat{J}_n] (note that the marginal distributions are \mathcal{D}[\hat{J}_{n_j}] (1 \le j \le k) and that independence among them is not needed).

On the other hand, under the null, \hat{J}_{n_i} \to_{n_i} J (1 \le i \le k); hence, the properties required of \{c_N\}_{N \in \mathbb{N}} and of \mathcal{D}_k allow us to write

    \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n) G_{(j)}[\hat{J}_{n_j}] ) \longrightarrow_L \sum_{i=1}^{k} c( \sum_{j=1}^{k} a_{ij} G_{(j)}[J] )  a.s.    (10)

Obviously, for all u \in \mathbb{R},

    ( P_X\{T_N^{**} \le u\} - P\{T_N \le u\} )
        = ( P_X\{T_N^{**} \le u\} - P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n) G_{(j)}[\hat{J}_{n_j}] ) \le u \} )
        + ( P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij}(n) G_{(j)}[\hat{J}_{n_j}] ) \le u \} - P\{ \sum_{i=1}^{k} c( \sum_{j=1}^{k} a_{ij} G_{(j)}[J] ) \le u \} )
        + ( P\{ \sum_{i=1}^{k} c( \sum_{j=1}^{k} a_{ij} G_{(j)}[J] ) \le u \} - P\{T_N \le u\} ).    (11)

The convergence in (8) is derived from (11), (9), (10) and (5). □
The proposed algorithm (N1-N4) behaves well in a variety of problems. It always resamples from the original data (without taking the null hypothesis into account at this stage). This point, which is the main particularity of the method (step N3), allows it to preserve the whole original data structure (variance-covariance matrix), and even the (possible) dependence (resampling on the individual in paired designs). The null (and only the null) is incorporated into the algorithm through the bootstrap estimator definition (Eq. (7)) and is used at the moment of computing the (bootstrap) statistic values. No other assumptions (for instance, other kinds of equalities among the groups) are necessary. In addition, under the null, arguing as in Theorem 1 and from (6) and (8), for each u \in \mathbb{R} the convergence ( P_X\{T_N^* \le u\} - P_X\{T_N^{**} \le u\} ) \to_N 0 a.s. is derived. However, under the alternative, this statement is, in general, untrue.

Although our theoretical developments have focused on the k-sample problem, the proposed algorithm can be applied to general hypothesis testing; in particular, to the one-sample problem (H_0: J_1 = J for a fixed J). However, in this case steps N2 and B2 (the resampling procedures) are equal, and the traditional and new bootstraps are equivalent. Note that, in this setting, both resampling plans preserve the original data structure.
The hypothesis test in which the traditional bootstrap method reaches its best and most direct application is, obviously, the k-sample problem for independent samples, i.e., checking the null

    H_0 : F_1 = \cdots = F_k (= F).

There exist a number of statistics for this goal; probably one of the most popular is the k-sample version of the traditional Cramér-von Mises test proposed by Kiefer (1959). Let X_i = \{x_{i1}, \ldots, x_{in_i}\} (1 \le i \le k) be k independent samples; it is defined by

    C_N^2(k) = \sum_{i=1}^{k} n_i \int \{\hat{F}_{n_i}(X_i, t) - \hat{F}_N(X, t)\}^2 \, d\hat{F}_N(X, t),    (12)

where \hat{F}_{n_i}(X_i, t) (1 \le i \le k) and \hat{F}_N(X, t) (N = \sum_{i=1}^{k} n_i) denote the empirical cumulative distribution functions (ECDFs) referred to the ith sample and to the pooled sample, respectively.
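For concreteness, the statistic (12) can be computed by noting that integration against d\hat{F}_N reduces to an average over the pooled observations (an illustrative Python sketch, not the authors' implementation):

```python
import numpy as np

def cramer_von_mises_k(samples):
    """k-sample Cramer-von Mises statistic C_N^2(k) of Eq. (12): the
    integral against dF_N becomes an average over the pooled points."""
    pool = np.sort(np.concatenate(samples))
    N = len(pool)
    # pooled ECDF evaluated at every pooled observation (ties handled)
    FN = np.searchsorted(pool, pool, side="right") / N
    stat = 0.0
    for s in samples:
        Fi = np.searchsorted(np.sort(s), pool, side="right") / len(s)
        stat += len(s) * np.mean((Fi - FN) ** 2)   # n_i * (1/N) * sum(...)
    return stat
```

When the samples coincide, the group ECDFs equal the pooled ECDF and the statistic is exactly zero.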
It is well known (see, for instance, Van der Vaart, 1998) that, under general (and mild) conditions, we have the convergence

    \sqrt{n}\{\hat{F}_n(X, t) - F(t)\} \longrightarrow_L W^0\{F(t)\},

where W^0\{t\} (0 \le t \le 1) stands for a standard Brownian bridge. Following the above exposition, we also know that, under the null,

    C_N^2(k) \longrightarrow_L \sum_{i=1}^{k} \int ( \sum_{j=1}^{k} a_{ij} W_{(j)}^0\{F(t)\} )^2 \, dF(t),

where W_{(1)}^0\{F(t)\}, \ldots, W_{(k)}^0\{F(t)\} are k independent Brownian bridges. If C_N^{2,*}(k) and C_N^{2,**}(k) denote, respectively, the bootstrap (where the underlying distribution function is always \hat{F}_N) and the new bootstrap (the resampling is done following step N2 and the statistic value is computed using N3) versions of C_N^2(k), we have (particular versions of (6) and (8)) that, for each u \in \mathbb{R}, the convergences

    ( P_X\{C_N^{2,*}(k) \le u\} - P\{ \sum_{i=1}^{k} \int ( \sum_{j=1}^{k} a_{ij} W_{(j)}^0\{\hat{F}_N(t)\} )^2 \, d\hat{F}_N(t) \le u \} ) \longrightarrow_N 0  a.s.

and

    ( P_X\{C_N^{2,**}(k) \le u\} - P\{ \sum_{i=1}^{k} \int ( \sum_{j=1}^{k} a_{ij} W_{(j)}^0\{\hat{F}_{n_j}(t)\} )^2 \, d\hat{F}_N(t) \le u \} ) \longrightarrow_N 0  a.s.
It is worth noting that the traditional bootstrap resamples from the pooled sample; therefore, the ECDF estimation is made from a sample of size N (N = \sum_{i=1}^{k} n_i). However, in the considered new bootstrap, the ECDF estimations are always made from each particular group (sizes n_j, 1 \le j \le k). In practice, under the null, the effects are almost negligible (although, in this case, for very small sample sizes the new bootstrap could have some problems with the nominal level) but, under the alternative, the estimations can be different (see Fig. 1).
Let us consider a homogeneous (equal sample size) two-sample problem. Let X_1 and X_2 be two independent random samples (of size n) drawn from the distributions F_1 and F_2, respectively, and let F = (1/2)F_1 + (1/2)F_2. The C_N^2(k) distribution can then be approximated from

    D_n = 2 \int [ W_{(1)}^0\{F_1(t)\} - W_{(2)}^0\{F_2(t)\} + \sqrt{n}\,\varepsilon(t) ]^2 \, dF(t),

where W_{(1)}^0\{F_1(t)\} and W_{(2)}^0\{F_2(t)\} are two independent Brownian bridges and \varepsilon(t) = [F_2(t) - F_1(t)]. It is easy to compute the corresponding expected value:

    E[D_n] = 2 \int F_1(t)(1 - F_1(t)) \, dF(t) + 2 \int F_2(t)(1 - F_2(t)) \, dF(t) + 2n \int [F_2(t) - F_1(t)]^2 \, dF(t).

The bootstrap approximation is based on resampling from the ECDF assuming that the null hypothesis is true, i.e., on resampling from \hat{F}_N(X, t) with X = \{X_1, X_2\} (N = 2n). Therefore, the C_N^{2,*}(k) distribution can be approximated by

    D_n^* = 2 \int [ W_{(1)}^0\{F(t)\} - W_{(2)}^0\{F(t)\} ]^2 \, dF(t).
Fig. 1. Density estimations for the two studied procedures (N-Boots. and Boots.) under H_0 (n_1 = n_2 = 100, at left) and H_1 (n_1 = n_2 = 100, at right). The histograms depict the real distribution under the null computed from 10,000 Monte Carlo replications.
The studied resampling plan (algorithm N1-N4) always resamples from the original data (without any additional assumption); therefore, it preserves the original data structure. The null is taken into account (only) in order to compute the bootstrap statistic values (following step N3). The C_N^{2,**}(k) distribution can be approximated by

    D_n^{**} = 2 \int [ W_{(1)}^0\{F_1(t)\} - W_{(2)}^0\{F_2(t)\} ]^2 \, dF(t).

Obviously, E[D_n^{**}] \le E[D_n] and, although under the null the asymptotic difference is zero (for finite samples the difference is almost negligible), under the alternative this difference can lead to changes in the obtained statistical power. The new bootstrap procedure estimates the original distribution assuming that the null is true, i.e., \varepsilon(t) = 0. Fig. 1 depicts the traditional and the new bootstrap approximations when F_1 is a standard normal distribution (N(0,1)) and F_2 is a N(2,1). At left, the null is true and both samples are drawn from F (the observed rejection proportions (\alpha = 0.05) were 0.051 and 0.054 for the bootstrap and the new bootstrap methods, respectively (n = 50), and 0.049 and 0.051 for n = 100). The differences between the densities are negligible for the two considered sample sizes (n = 50, 100). At right (the histograms with the real distributions used as reference are the same), the simulations are from the alternative hypothesis (the observed statistical power was 1; the distributions are clearly different and \varepsilon(t) is relevant even for small sample sizes). The density estimation for the new bootstrap is slightly sharper than the one for the traditional bootstrap.
For a fixed nominal level \alpha, the statistical power of the statistic C_N^2(k) is P_{H_1}\{C_N^2(k) > t_\alpha\} (P_{H_1} denotes probability conditional on the alternative hypothesis), where t_\alpha = \inf\{u \in \mathbb{R} : P_{H_0}\{C_N^2(k) > u\} \le \alpha\} (P_{H_0} denotes probability conditional on the null). Of course, one method (of approximating the distribution of the statistic) will be more powerful when its corresponding t_\alpha estimation is smaller.
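In a bootstrap implementation, t_\alpha is estimated by the empirical (1 - \alpha)-quantile of the B bootstrap statistic values; a one-line sketch (illustrative names):

```python
import numpy as np

def critical_value(boot_stats, alpha=0.05):
    """Estimate t_alpha, the smallest u with P_H0{T_N > u} <= alpha,
    by the empirical (1 - alpha)-quantile of the bootstrap values."""
    return np.quantile(boot_stats, 1.0 - alpha)
```

The comparison of the two resampling plans then reduces to comparing the `critical_value` of their respective bootstrap replicates.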
Arguing as in the previous scheme, we consider a two-sample problem where F_1 follows a N(0,1) and F_2 a N(\mu,1) (F = (1/2)F_1 + (1/2)F_2). Under the null, both samples are generated from F and, under the alternative, one is drawn from F_1 and the other one from F_2. Table 1 reports the observed rejection proportions (at nominal level \alpha = 0.05) in 10,000 Monte Carlo simulations, for different values of \mu and different sample sizes (although always n_1 = n_2 = n). The P-values were approximated with B = 2000. Although the new bootstrap method always rejected slightly more often than the traditional one, the differences between the two methods are negligible.

Fig. 2 depicts the mean of the t_\alpha estimations from the two methods (t_\alpha^* and t_\alpha^{**} for the traditional and proposed algorithms, respectively) for the previously considered problem (n = 50). Note that, although the observed statistical power is almost the same, when resampling from the alternative hypothesis the t_\alpha^{**} values are smaller than the t_\alpha^* ones.
3. Testing the equality among Gini indices
The Gini concentration index (Gini, 1995) is often used to study distributional inequality (mainly, although not only, in economic contexts). Let X be a non-negative random variable with cumulative distribution function (CDF) F; the Gini index can be defined as

    G(F) = \frac{1}{E[X]} \int F(u)(1 - F(u)) \, du.

By replacing the CDF with the empirical cumulative distribution function (ECDF), the typical nonparametric Gini index estimator, G(\hat{F}_N), is obtained. This estimator has been widely studied (see, e.g., Martínez-Camblor, 2007 and references therein).
Table 1
Observed rejection proportions in 10,000 Monte Carlo simulations for the bootstrap and new bootstrap algorithms. The P-values were approximated from 2000 iterations.

                 Bootstrap                        New bootstrap
    n      μ=0    μ=1/2   μ=1    μ=3/2      μ=0    μ=1/2   μ=1    μ=3/2
H0
    50    0.052  0.050  0.051  0.049      0.056  0.053  0.054  0.053
    75    0.045  0.050  0.048  0.052      0.047  0.052  0.051  0.055
    100   0.049  0.051  0.048  0.052      0.050  0.053  0.049  0.052
    150   0.050  0.046  0.050  0.051      0.051  0.049  0.050  0.051
H1
    50    0.052  0.648  0.996  1.000      0.055  0.659  0.997  1.000
    75    0.054  0.824  1.000  1.000      0.056  0.831  1.000  1.000
    100   0.049  0.919  1.000  1.000      0.051  0.920  1.000  1.000
    150   0.047  0.980  1.000  1.000      0.048  0.980  1.000  1.000
Fig. 2. Continuous and dotted lines stand for the means of t_\alpha^{**} and t_\alpha^* (\alpha = 0.05), respectively. At left, the samples are from the null; at right, the samples are from the alternative (n = 50). Grey areas stand for 95% confidence intervals.
Following the general scheme, under the null the statistic T_N converges in law to \sum_{i=1}^{k} c( \sum_{j=1}^{k} a_{ij} N_{(j)}(0, V_G(F_j)) ), where, for 1 \le j \le k, the N_{(j)}(0, V_G(F_j)) are random variables with distribution N(0, V_G(F_j)). For each u \in \mathbb{R}, the bootstrap version of T_N (T_N^*, associated with algorithm B1-B4) satisfies the convergence

    ( P_X\{T_N^* \le u\} - P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij} N_{(j)}(0, V_G(\hat{F}_N)) ) \le u \} ) \longrightarrow_N 0  a.s.,

where, for 1 \le j \le k, the N_{(j)}(0, V_G(\hat{F}_N)) are independent random variables with distribution N(0, V_G(\hat{F}_N)). The new bootstrap version of T_N (T_N^{**}, associated with algorithm N1-N4) satisfies, for each u \in \mathbb{R}, the convergence

    ( P_X\{T_N^{**} \le u\} - P\{ \sum_{i=1}^{k} c_N( \sum_{j=1}^{k} a_{ij} N_{(j)}(0, V_G(\hat{F}_{n_j})) ) \le u \} ) \longrightarrow_N 0  a.s.,

where N = \{N_{(1)}(0, V_G(\hat{F}_{n_1})), \ldots, N_{(k)}(0, V_G(\hat{F}_{n_k}))\} is a k-dimensional random vector (which preserves the original data structure) whose marginal random variables follow the distributions N(0, V_G(\hat{F}_{n_j})) (1 \le j \le k).
Even for independent samples, there exist some issues which prevent guaranteeing the correct convergence of the traditional bootstrap method in general. On one hand, there exist different cumulative distribution functions which lead to the same Gini index and, on the other hand, given two CDFs, F_1 and F_2, the equality G((1/2)(F_1 + F_2)) = (1/2)(G(F_1) + G(F_2)) is usually untrue.
For instance, consider one sample drawn from the distribution F_1(t) = t\,I_{[0,1)}(t) + I_{[1,\infty)}(t) (I_A stands for the usual indicator function) and another one independently drawn from F_2(t) = (1/2)F_1(t) + (1/2)I_{[2,\infty)}(t) (Fig. 3, at left, shows the respective Lorenz curves). It is easy to check that G(F_1) = G(F_2) = 1/3 (with V_G^2(F_1) = 8/135 (≈0.059) and V_G^2(F_2) = 304/3375 (≈0.090)). Obviously, the null is true; however, G((1/2)F_1 + (1/2)F_2) = 3/7 (V_G^2((1/2)F_1 + (1/2)F_2) = 64/1135 (≈0.056)). In this case, assuming equal sample sizes, the asymptotic distribution for T_N is (168/1125)·\chi_1^2. Since, for j \in 1, \ldots, k, \hat{F}_{n_j} \to_{n_j} F_j (almost surely), we have that T_N^{**} \longrightarrow_L (168/1125)·\chi_1^2 while T_N^{*} \longrightarrow_L (128/1135)·\chi_1^2.

Fig. 3. At left, Lorenz curves for the considered distributions F_1 and F_2. At right, histogram of the real distribution (based on 10,000 Monte Carlo replications) and density estimations for the bootstrap and new bootstrap approximations.
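These figures are easy to reproduce numerically. The sketch below (illustrative Python, not the authors' code) computes the plug-in estimator G(\hat{F}_n) through the equivalent order-statistics formula and checks, on simulated samples from F_1 and F_2, that each group index stays near 1/3 while the pooled index moves towards 3/7:

```python
import numpy as np

def gini(x):
    """Plug-in Gini estimator G(F_hat). Using int F(1-F)du = E|X - X'|/2,
    it equals sum_i (2i - n - 1) x_(i) / (n^2 * mean(x))."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    return np.sum((2 * i - n - 1) * x) / (n ** 2 * x.mean())

rng = np.random.default_rng(0)
a = rng.uniform(0, 1, 5000)                 # sample from F1 (uniform on [0,1))
b = np.concatenate([rng.uniform(0, 1, 2500),
                    np.full(2500, 2.0)])    # sample from F2 (half mass at 2)
g_pool = gini(np.concatenate([a, b]))       # pooled index, close to 3/7
```

Although G(F_1) = G(F_2) = 1/3, the pooled index is close to 3/7, which is why resampling from the pooled sample misrepresents the variability under this null.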
Fig. 3 shows, at right, the two considered approximations of the sample distribution of the statistic T_N (sample sizes were n_1 = n_2 = 100) for the problem described above: one based on the usual bootstrap algorithm (B1-B4) and one based on the studied resampling plan (algorithm N1-N4). The real sample distribution is approximated by 10,000 Monte Carlo replications. The observed rejection percentages (nominal level \alpha = 0.05) were 11.1% and 5.3% for the traditional and new bootstrap, respectively.
4. Survival curves comparison

Conventionally, survival studies are concerned with the estimation of the involved survival functions (1 − CDF). The data are often randomly right censored. Specifically, let T_n = \{t_1, \ldots, t_n\} be a sample of n independent and identically distributed (iid) lifetime (non-negative) observations with cumulative distribution function F, and let C_n = \{c_1, \ldots, c_n\} be iid censoring time (also non-negative) observations with CDF G. We observe z_i = \min\{t_i, c_i\} and know the pairs (z_i, \delta_i), where \delta_i = I\{t_i < c_i\} (1 \le i \le n). Clearly, Z = \{z_1, \ldots, z_n\} are iid with CDF H, where (1 − H) = (1 − F)(1 − G). In this context, the Kaplan-Meier (KM) or product-limit estimator (Kaplan and Meier, 1958) plays the role that the ECDF plays for complete information. The KM estimator has been widely studied and there exists a vast literature on it. We would like to highlight here the paper by Csörgő (1996), in which the author established universal Gaussian approximations for the empirical cumulative hazard and product-limit processes. In particular, it is known that, if d(t) is the variance function of the KM estimator and D(t) = d(t)/[1 + d(t)], then for t \in [0, \tau_H) (\tau_H = \inf\{x : H(x) = 1\}),

    \sqrt{n}\{\hat{F}_n^{KM}(t) - F(t)\} \longrightarrow_L [1 - F(t)][1 + d(t)]\, W^0\{D(t)\}.
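For completeness, the product-limit estimator itself can be sketched in a few lines (illustrative Python, assuming no tied observation times; real analyses would rely on a dedicated survival library):

```python
import numpy as np

def kaplan_meier(z, delta):
    """Kaplan-Meier estimate of S(t) = 1 - F(t) at the observed times.
    z: observed times min(t_i, c_i); delta: 1 = event, 0 = censored.
    Assumes no ties among the observed times."""
    order = np.argsort(z)
    z = np.asarray(z, dtype=float)[order]
    delta = np.asarray(delta, dtype=float)[order]
    at_risk = len(z) - np.arange(len(z))        # subjects at risk at each z_(i)
    surv = np.cumprod(1.0 - delta / at_risk)    # censored points give factor 1
    return z, surv
```

For example, `kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1])` returns the survival values 0.75, 0.50, 0.50 and 0.00 at times 1 through 4.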
Fig. 4. Histogram of the real distribution of the statistic (based on 10,000 Monte Carlo replications) and density estimations for the bootstrap and new bootstrap approximations under the null (at left) and under the alternative described in the text (at right).
Consider a three-sample problem where the (real) underlying survival function is S(t) = (1/3)\{S_1(t) + S_2(t) + S_3(t)\} with S_i(t) = e^{-(2i)^{-1} t}\, I_{[0,\infty)}(t) (1 \le i \le 3), and where the respective censoring times are drawn from the distributions C_i(t) = (i+5)^{-1} t\, I_{[0,\,i+5]}(t) + I_{(i+5,\,\infty)}(t) (the expected censoring percentages are 32.5%, 26.7% and 22.1% for the first, second and third sample, respectively). Although the null hypothesis is obviously true, in a Monte Carlo simulation study (the considered sample sizes were n_1 = n_2 = n_3 = 50 and 10,000 Monte Carlo replications were run) the observed rejection percentage (\alpha = 0.05) for the traditional bootstrap was 9.5% (the P-values were estimated from 1000 bootstrap resamples). The proposed resampling procedure preserves (again) the internal structure within each group, and the observed rejection proportion (on the same Monte Carlo replications) was 4.3% (1000 bootstrap samples were also used to estimate the P-values).
Fig. 4 depicts the real distribution of the statistic (estimated by 10,000 Monte Carlo replications), the traditional bootstrap approximation (resampling from the pooled sample) and the proposed new bootstrap distribution under the null (at left). At right, the distribution under the null is computed in a problem in which the underlying survival functions are S_1, S_2 and S_3 for the first, second and third sample, respectively (the observed statistical power (nominal level \alpha = 0.05) is close to 0.95). The histogram of the real distribution is the same in both cases.

The problem with resampling from the pooled sample is the estimation of the failure times t_i. As usual, since the censoring and the survival times are independent (i.e., the censoring mechanism is noninformative), the problem can also be solved by using a little modification of the usual bootstrap algorithm (see Martínez-Camblor, 2011). In this case, simulations (not shown here) suggest that the traditional bootstrap results (under the null) are quite similar to the ones obtained by using the proposed resampling algorithm.
5. Comparing cumulative incidence functions

The above problem is compounded when the studied event may be precluded by the occurrence of other events which alter the probability of experiencing the event of interest. Such events are known as competing risk events.

To be precise, in the considered competing risks setting we suppose there are k independent groups of subjects. Let T_{n_i} = \{t_{i1}, \ldots, t_{in_i}\} (1 \le i \le k) be the failure times of the ith group and let \delta'_{ij} (1 \le j \le n_i) be the indicator of the observed event (1 if the event of interest occurs and 2 if another (competing) event occurs). For each i \in 1, \ldots, k, the pairs (t_{ij}, \delta'_{ij}) (1 \le j \le n_i) from different subjects within the same group are assumed to be iid. Conventionally, there also exist independent censoring times, C_{n_i} = \{c_{i1}, \ldots, c_{in_i}\}, which are iid with CDF G_i. We observe z_{ij} = \min\{t_{ij}, c_{ij}\} and, for i \in 1, \ldots, k, the pairs (z_{ij}, \delta_{ij})_{j=1}^{n_i}, where \delta_{ij} = \delta'_{ij}\, I\{t_{ij} \le c_{ij}\}, are known. There exist different functions of interest related to this problem; we will focus on the cumulative incidence function (CIF). Therefore, our goal is to test the hypothesis

    H_0 : F_{11}(t) = \cdots = F_{1k}(t) (= F_1(t)),    (15)

where, for j \in 1, \ldots, k, F_{1j}(t) = P\{t_{j1} \le t, \delta'_{j1} = 1\}. Although the methods for estimating this subdistribution function are not new (see Kalbfleisch and Prentice, 1980 for an early reference), misuse of the Kaplan-Meier estimator in this setting is still common in the biomedical literature (Gooley et al., 1999). Of course, several different criteria have been proposed in order to test (15). Gray (1988) developed log-rank type tests for the comparison of CIFs. Pepe (1991) used the integrated difference of CIF estimates (for two-sample problems). Lin (1997) proposed using the Kolmogorov-Smirnov criterion for
Fig. 5. Histogram of the real distribution of the statistic (based on 10,000 Monte Carlo replications) and density estimation for the new bootstrap approximation under the null (at left) and under the alternative described in the text (at right).
this goal. The asymptotic distribution (the theory of counting processes (Aalen, 1978) is usually employed to derive it) is typically used to approximate the distribution of the considered statistic.
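A nonparametric CIF estimate for the event of interest can be sketched as follows (an illustrative Python sketch of Aalen-Johansen type, assuming no tied times; it is not the implementation used in the simulations discussed in this section):

```python
import numpy as np

def cumulative_incidence(z, delta):
    """Nonparametric CIF estimate for the event of interest.
    z: observed times; delta: 1 = event of interest, 2 = competing event,
    0 = censored. F1_hat jumps by S_hat(t-)/Y(t) at cause-1 events, where
    S_hat is the all-cause Kaplan-Meier estimate and Y(t) the risk set size."""
    order = np.argsort(z)
    z = np.asarray(z, dtype=float)[order]
    delta = np.asarray(delta)[order]
    n = len(z)
    at_risk = n - np.arange(n)                        # Y(t) at each z_(i)
    any_event = (delta > 0).astype(float)
    surv_all = np.cumprod(1.0 - any_event / at_risk)  # all-cause KM, S_hat(t)
    surv_left = np.concatenate([[1.0], surv_all[:-1]])  # S_hat(t-)
    jumps = np.where(delta == 1, surv_left / at_risk, 0.0)
    return z, np.cumsum(jumps)
```

Without censoring and without competing events the estimate reduces to the ECDF of the failure times, as it should.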
Let us consider the Kolmogorov–Smirnov type test suggested by Lin (1997). It is known (Kraus, 2007) that its convergence to the asymptotic distribution is slow and that the fixed nominal level is usually underestimated. The studied resampling plan not only provides a useful tool for approximating the null distribution of general statistics, as the usual bootstrap does; it can also increase the convergence speed for small samples. Suppose we have two independent samples in which the real failure time distribution of the studied event (labeled 1) is, for both samples, $F_{10}(t) = \{1 - (1/2)e^{-t} - (1/4)e^{-(1/2)t}\}\, I_{[0,\infty)}(t)$, and this event happens with probability 1/2. The other competing event (labeled 2) follows $F_{20}(t) = \{1 - e^{-t}\}\, I_{[0,\infty)}(t)$; obviously, it also happens with probability 1/2. Once the failure times are generated, the censoring times are drawn, independently, from a uniform distribution on [0,10] (about 15% and 10% of censored observations are expected for $F_{10}$ and $F_{20}$, respectively). In this situation, in a Monte Carlo simulation study (sample sizes $n_1 = n_2 = 50$; 10,000 replications), the observed rejection percentage ($\alpha = 0.05$) for the asymptotic approximation (Lin, 1997) was (only) 1.6%. The proposed bootstrap improves this approximation, obtaining a rejection percentage of 5.9% (the P-values were estimated from 1000 bootstrap resamples). This improvement has a direct impact on the statistical power. Under the alternative (the studied event, 1, is drawn from $F_{11}(t) = \{1 - e^{-t}\}\, I_{[0,\infty)}(t)$ and $F_{12}(t) = \{1 - (1/2)e^{-(1/2)t}\}\, I_{[0,\infty)}(t)$ for the first and second sample, respectively, with probability 1/2; the rest of the problem conditions remain as before), the observed rejection percentages (from 10,000 Monte Carlo replications) were 10.2% for the asymptotic distribution and 22.1% for the new bootstrap.
Fig. 5 shows density estimations for the real distribution of the statistic (estimated from 10,000 Monte Carlo replications) and for the new bootstrap algorithm, under the null (left) and under the alternative hypothesis (right). In this case, both distributions are almost equal.
6. Main conclusions
In order to construct a resampling algorithm useful to hypothesis testing, it looks reasonable that, the null should be
taken into account. Most of the authors match that the resampling under the null is critical to the proper construction of
bootstrap test (see, for example, Fisher and Hall, 1990; Hall and Wilson, 1991 or Westfall and Young, 1993). Despite that
the resampling under the null can be developed in a wide range of situations, it is true that, in many common practical
problems sampling under the null could be complicated. The problems appear when the null hypothesis restrictions are
not easily reflected, in adequate way, by the pool sample (Bickel and Ren, 2001). In particular, if the studied parameter, J(F)
P P
(F stands for the underlying distribution function) is such that Jð ki ¼ 1 gi Fi Þa ki ¼ 1 gi JðFi Þ (gi 2 R for i 2 1, . . . k), resampling
from the pool sample could drive to mistaken critical regions.
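A toy numerical check makes the point concrete, with the variance playing the role of a nonlinear functional $J$ (an illustration of ours, not an example from the text):

```python
import numpy as np

# Two degenerate groups: each has zero variance on its own.
x1 = np.array([0.0, 0.0])
x2 = np.array([1.0, 1.0])

# J applied to each group, then averaged (gamma_1 = gamma_2 = 1/2).
avg_of_J = 0.5 * x1.var() + 0.5 * x2.var()      # -> 0.0

# J applied to the pooled (mixture) sample: a very different quantity.
J_of_mixture = np.concatenate([x1, x2]).var()   # -> 0.25

print(avg_of_J, J_of_mixture)
```

Since the pooled sample carries between-group spread that neither group has on its own, any bootstrap that resamples from the pool implicitly targets `J_of_mixture` rather than the group-wise values.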
There have been a number of papers in the statistical literature which deal with particular problems and propose modifications of the bootstrap for its correct use in testing (see, for instance, Romano, 1988, 1989 and references therein). In this report we study a general, simple and useful bootstrap algorithm which allows fully nonparametric tests to be developed. The main difference from the usual bootstrap plan is that the proposed bootstrap does not resample from the null; instead, the relevant information contained in the null (and only this information) is used to define the bootstrap estimator (i.e., to compute the bootstrap statistic values). The idea lies in step N3, as opposed to step B3 of the traditional bootstrap. Note that both steps coincide for one-sample problems; therefore, in this case, the two methods are the same.
This device allows resampling from the alternative (in order to preserve the data structure within each of the studied groups) without making any additional assumptions.
When the traditional bootstrap works (the samples are independent and the null hypothesis implies the equality of the involved CDFs), the distributions provided by the two methodologies under the null are asymptotically equivalent. However, the two procedures are different and the obtained distributions under the alternative can be really different; hence the obtained statistical power could also differ depending on the particular problem studied. For instance, in the considered two-sample problem based on the Cramér–von Mises statistic, the differences between the algorithms do not produce relevant changes in the observed statistical powers.
The developed method has many practical applications. Since the studied algorithm preserves the internal covariance structure of the data, perhaps the most direct one is the comparison of the marginal distributions of a multivariate random variable (see Martı́nez-Camblor, 2010a; Martı́nez-Camblor et al., 2011a, 2011b for applications to the k-sample problem with paired designs, and Martı́nez-Camblor and Corral (2011c) for the generalization of the repeated measures problem to functional data). However, it is also really useful when the implications of the null hypothesis are not clear (see Martı́nez-Camblor et al., 2011b for a practical application of this case). Competing risks and multistate models are especially interesting settings: the different functions involved in these problems, which can be different even under the null, complicate the use of the usual resampling plans; the studied procedure replicates the complexity of the problem and allows the hypothesis of interest to be checked without any additional assumption.
Of course, the considered algorithm is compatible with, and can be easily adapted to, other bootstrap resampling alternatives such as the smoothed, symmetrized or Bayesian ones.
Acknowledgements
The authors are grateful to the anonymous reviewers, whose comments and suggestions have helped to improve the paper.
References
Aalen, O.O., 1978. Nonparametric inference for a family of counting processes. Annals of Statistics 6, 701–726.
Akritas, M.G., 1986. Bootstrapping the Kaplan–Meier estimator. Journal of the American Statistical Association 81 (396), 1032–1038.
Bickel, P.J., Ren, J.J., 2001. The Bootstrap in Hypothesis Testing. Lecture Notes-Monograph Series, vol. 36, pp. 91–112.
Csörgő, S., 1996. Universal Gaussian approximations under random censorship. Annals of Statistics 24 (6), 2744–2778.
Davison, A.C., Hinkley, D.V., 1997. Bootstrap Methods and their Application. Cambridge University Press, Cambridge.
DiCiccio, T.J., Efron, B., 1996. Bootstrap confidence intervals (with discussion). Statistical Science 11, 189–228.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Annals of Statistics 7 (1), 1–26.
Efron, B., 1981. Censored data and the bootstrap. Journal of the American Statistical Association 76, 312–319.
Efron, B., 1982. The jackknife, the bootstrap and other resampling plans. In: Regional Conference Series in Applied Mathematics, CBMS-NSF.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.
Fisher, N.I., Hall, P., 1990. On bootstrap hypothesis testing. Australian & New Zealand Journal of Statistics 32 (2), 177–190.
Gray, R.J., 1988. A class of k-sample tests for comparing the cumulative incidence of a competing risk. Annals of Statistics 16, 1141–1154.
Gini, C., 1912. Variabilità e mutabilità. Reprinted in: Pizetti, E., Salvemini, T. (Eds.), 1955. Memorie di Metodologia Statistica. Libreria Eredi Virgilio Veschi, Rome.
González-Manteiga, W., Prada-Sánchez, J.M., 1994. The bootstrap—a review. Computational Statistics 9, 165–205.
Gooley, T.A., Leisenring, W., Crowley, J., Storer, B.E., 1999. Estimation of failure probabilities in the presence of competing risks: new representations of
old estimators. Statistics in Medicine 18, 695–706.
Hall, P., 1988. Theoretical comparison of bootstrap confidence intervals (with discussion). Annals of Statistics 16, 927–953.
Hall, P., 1992. The Bootstrap and Edgeworth Expansion. Springer, New York.
Hall, P., Wilson, S.R., 1991. Two guidelines for bootstrap hypothesis testing. Biometrics 47, 757–762.
Harrington, D.P., Fleming, T.R., 1982. A class of rank test procedures for censored survival data. Biometrika 69 (3), 553–566.
Horváth, L., 1991. On Lp-norms of multivariate density estimators. Annals of Statistics 19 (4), 1933–1949.
Kalbfleish, J.D., Prentice, R.L., 1980. The Statistical Analysis of Failure Time Data. John Wiley, New York.
Kaplan, E.L., Meier, P., 1958. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association 53, 457–481.
Kiefer, J., 1959. k-Sample analogues of the Kolmogorov–Smirnov and Cramér–von Mises tests. Annals of Mathematical Statistics 30, 420–447.
Kraus, D., 2007. Smooth Tests of Equality of Cumulative Incidence Functions in Two Samples. Institute of Information Theory and Automation, Prague, Research Report 2197, pp. 1–12.
Lin, D.Y., 1997. Nonparametric inference for cumulative incidence functions in competing risks studies. Statistics in Medicine 16, 901–910.
Martin, M.A., 2007. Bootstrap hypothesis testing for some common statistical problems: a critical evaluation of size and power properties. Computational
Statistics & Data Analysis 51 (12), 6321–6342.
Martı́nez-Camblor, P., 2007. Central limit theorems for S-Gini and Theil inequality coefficients. Revista Colombiana de Estadı́stica 2, 287–300.
Martı́nez-Camblor, P., 2010a. Nonparametric k-sample test based on kernel density estimator for paired design. Computational Statistics & Data Analysis
54, 2035–2045.
Martı́nez-Camblor, P., 2010b. Comparing k-independent and right censored samples based on the likelihood ratio. Computational Statistics 25, 363–374.
Martı́nez-Camblor, P., 2011. Testing the equality among distribution functions from independent and right censored samples via Cramér–von Mises
criterion. Journal of Applied Statistics 38 (6), 1117–1131.
Martı́nez-Camblor, P., Carleos, C., Corral, N., 2011a. Cramér–von Mises statistic for paired samples, unpublished manuscript.
Martı́nez-Camblor, P., Corral, N., Vicente, D., 2011b. Statistical comparison of the genetic sequence type diversity of invasive Neisseria meningitidis isolates
in northern Spain (1997–2008). Ecological Informatics, in press. doi:10.1016/j.ecoinf.2011.06.001.
Martı́nez-Camblor, P., Corral, N., 2011c. Repeated measures analysis for functional data. Computational Statistics & Data Analysis 55 (12), 3244–3256.
Pepe, M.S., 1991. Inference for events with dependent risks in multiple endpoint studies. Journal of the American Statistical Association 86, 770–778.
Reid, N., 1981. Estimating the median survival time. Biometrika 68, 601–608.
Romano, J.P., 1988. A bootstrap revival of some nonparametric distance tests. Journal of the American Statistical Association 83, 698–708.
Romano, J.P., 1989. Bootstrap and randomization tests of some nonparametric hypotheses. Annals of Statistics 17, 141–159.
Shao, J., Tu, D., 1995. The Jackknife and the Bootstrap. Springer, New York.
Silverman, B.W., 1981. Using kernel density estimates to investigate multimodality. Journal of the Royal Statistical Society, Series B 43 (1), 97–99.
Van der Vaart, A.W., 1998. Asymptotic Statistics. Cambridge University Press, Cambridge.
Westfall, P.H., Young, S.S., 1993. Resampling-based Multiple Testing: Examples and Methods for p-value Adjustment. Wiley, New York.