
 

Sample Size Determination in Clinical Trials
HRM-733 Class Notes
Lehana Thabane, BSc, MSc, PhD
Biostatistician
Center for Evaluation of Medicines
St. Joseph's Healthcare
105 Main Street East, Level P1
Hamilton ON L8N 1G6
thabanl@mcmaster.ca
Fax: (905)528-7386
Tel: (905)522-1155 x3720
http://www.lehanathabane.com

Assistant Professor
Department of Clinical Epidemiology & Biostatistics
Faculty of Health Sciences
McMaster University
Hamilton ON

November 19, 2004


Contents

1 Introduction 1
   1.1 Learning Objectives 1
   1.2 Introductory Remarks 1
2 Why is sample size calculation important? 2
3 Approaches to sample size calculation 3
   3.1 Precision of Estimation: Precision Analysis 3
   3.2 Hypothesis Testing of effects/relationships: Power Analysis 3
4 Information required to calculate a sample size 4
   4.1 Factors that influence sample size calculation: A checklist 5
5 Explanation of Statistical Terms 7
6 Formulae for Sample Size Calculations 10
   6.1 Sample Size Adjustments 12
7 Reporting the results of sample size calculation in the protocol 13
8 Specific examples of sample size calculations 14
   8.1 Example 1: Comparing two proportions 14
   8.2 Example 2: Comparing two means 15
9 Inappropriate wording or reporting of Sample size calculations 17
10 Important Remarks About Achieving the required sample size 19
   10.1 Common Recruitment Strategies 19
   10.2 Reasons for failure to achieve the required sample size 19
   10.3 Possible or Common Solutions 20
11 Retrospective Sample Size Calculations 20
12 Important Rules of Thumb/Cautions 21
   12.1 General 21
   12.2 Rules of Thumb for Relationships/Association 23
13 Tips on Elicitation of Effect Sizes and Variances for Sample Size Calculations 24
14 Sample Size Calculations for Cluster Randomized Controlled Studies 25
   14.1 Reasons for Using Cluster-randomized Designs 25
   14.2 Sample Size Formulae for Cluster-randomized Designs 26
15 Sample Size Calculations for Other Types of Studies 27
   15.1 Analysis of Change From Baseline 27
   15.2 Analysis of Times to Failure 28
   15.3 Comparisons of Means for two Poisson Populations 28
   15.4 Testing for a single correlation coefficient 29
   15.5 Comparing Correlation Coefficients for Two Independent Samples 29
   15.6 Estimation Problems 29
16 Sample Size Calculation based on Transformations 30
17 Sample Size Calculations and Non-parametric Tests 30
18 Software for Sample Size Calculations 31
 

1 Introduction

1.1 Learning Objectives
Specific Learning Objectives

•   Learn about the important elements of a sample size calculation in the design of a
clinical trial

–  Why is the sample size calculation important?


–  How to calculate the required sample size?

•  Gain some knowledge about the basic statistical rules of thumb in sample size calcu-
lations

•  Learn how to report the results of a sample size calculation for a granting agency, research ethics board submission, etc.
1.2 Introductory Remarks
Sample size calculations form an integral part of a vast majority of quantitative studies.
There are three main parts to sample size calculation: (i) sample size estimation, which depends on a host of items (see Section 3.1); (ii) sample size justification, which often involves justification of the calculated number in the light of budgetary and other biological considerations; and (iii) sample size adjustment, increasing the sample to account for potential dropouts or effects of covariates.

•  Sample size calculations may not be required for some pilot or exploratory studies. It
is important to note that

–  A pilot study is a preliminary study intended to test the feasibility of a larger study, test data collection methods, and collect information for sample size calculations; it should therefore always lead to a main study.
–  In other words, pilot studies cannot exist on their own, but only in relation to a larger study that they are intended to facilitate.
–  A pilot study SHOULD NOT be regarded as a study that is too small to produce a definitive answer to the question of interest.


•  Sample size is just one part of a study design. There are several other parts that are needed to determine the sample size (see Section 4).

•   As part of sample size discussions, it is crucial to know what the consequences of 
’getting it wrong’ are: these may be ethical, economic or scientific
•  Sample size problems can be approached in two ways:
–  Patients I need approach: based on calculation of the sample size for a given power, level of significance and clinically meaningful difference
–  Patients I can get approach: based on calculation of power or detectable difference for a given sample size and level of significance

2 Why is sample size calculation important?
Sample size is important for two main reasons:

•  Economic reasons: See Altman (1980)

–  An undersized study may result in a waste of resources due to its incapability to yield useful results. Recall that without a large enough sample, an important relationship or effect/difference may exist, but the collected data may not be sufficient to detect it (i.e. the study may be under-powered to detect the effect).
–  An oversized study can result in unnecessary waste of resources, while at the same time yielding significant results that may not have much practical importance. Note that if a study is based on a very large sample, it will almost always lead to statistically significant results.

•  Ethical reasons: See Altman (1980)

–  An undersized study can expose subjects to unnecessary (sometimes potentially harmful or futile) treatments without the capability to advance knowledge.
–  An oversized study has the potential to expose an unnecessarily large number of subjects to potentially harmful or futile treatments.

•  Scientific reasons: See Moher et al (1994)

–  If a trial with negative results has a sufficient sample size to detect a clinically important effect, then the negative results are interpretable: the treatment did not have an effect at least as large as the effect considered to be clinically relevant.

–  If a trial with negative results has insufficient power (insufficient sample size), a clinically important (but statistically nonsignificant) effect is usually ignored or, worse, is taken to mean that the treatment under study made no difference.

Overall, sample size calculation is an important part of the study design to ensure the validity, accuracy, reliability, and scientific and ethical integrity of the study.

3 Approaches to sample size calculation
There are two major classical approaches to sample size calculations in the design of quan-
titative studies:

•  Precision of estimation of an unknown characteristic/parameter of a Population


•   Hypothesis testing of treatment effects/population parameters
3.1 Precision of Estimation: Precision Analysis
In studies concerned with estimating some parameter of a population (e.g. the prevalence of a medical condition in the population), sample size calculations are important to ensure that estimates are obtained with the required precision/accuracy or level of confidence. Recall that the smaller the margin of error in the estimation, the more informative or precise the estimate is. For example,


•  a prevalence of 10% from a sample of size 20 would have a 95% confidence interval
(CI) of (1%, 31%), which may not be considered very precise or informative.

•  However, a prevalence of 10% from a sample of size 400 would have a 95% CI of (7%,
13%), which may be considered more accurate or informative.
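To illustrate the contrast numerically, here is a minimal Python sketch using the large-sample normal approximation for a proportion's confidence interval. One caveat: the (1%, 31%) interval quoted above for n = 20 comes from an exact binomial method, which the normal approximation below does not reproduce for samples that small; the sketch reproduces only the n = 400 interval.

```python
from math import sqrt

def approx_ci(p, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (adequate for large n)."""
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Prevalence of 10% estimated from n = 400 subjects:
lo, hi = approx_ci(0.10, 400)
print(f"n=400: ({lo:.0%}, {hi:.0%})")  # roughly (7%, 13%), as quoted above

# The same prevalence from n = 20 gives a far wider margin of error
# (an exact binomial interval should be used at this sample size):
print(f"n=20 margin of error: about ±{1.96 * sqrt(0.1 * 0.9 / 20):.0%}")
```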

3.2 Hypothesis Testing of effects/relationships: Power Analysis
In studies concerned with detecting an effect (e.g. a difference between two treatments, or the relative risk of a diagnosis if a certain risk factor is present versus absent), sample size calculations are important to ensure that if an effect deemed to be clinically meaningful exists, then there is a high chance of it being detected, i.e. that the analysis will be statistically significant. If the sample is too small, then even if large differences are observed, it will be impossible to show that these are due to anything more than sampling variation. There are different types of hypothesis testing problems depending on the goal of the research. Let µS = mean of standard treatment, µT = mean of new treatment, and δ = the minimum clinically important difference.

1. Test for Equality: Here the goal is to detect a clinically meaningful difference/effect if such a difference/effect exists.

2. Test for Non-inferiority: To demonstrate that the new drug is not less effective than the standard treatment (i.e. the difference between the new treatment and the standard is less than the smallest clinically meaningful difference).

3. Test for Superiority: To demonstrate that the new treatment is superior to the standard treatment (i.e. the difference between the new treatment and the standard is greater than the smallest clinically meaningful difference).

4. Test for Equivalence: To demonstrate that the difference between the new treatment and the standard treatment has no clinical importance.

Test for          Null Hypothesis           Alternative Hypothesis
Equality          H0: µT − µS = 0           Ha: µT − µS ≠ 0
Non-inferiority   H0: µT − µS ≥ δ           Ha: µT − µS < δ
Superiority       H0: µT − µS ≤ δ           Ha: µT − µS > δ
Equivalence       H0: |µT − µS| ≥ δ         Ha: |µT − µS| < δ
It is important to note that


•  the test for superiority is often referred to as the test for clinical superiority

•  if δ = 0, it is called the test of statistical superiority

•  equivalence is taken to be the alternative hypothesis, and the null hypothesis is nonequivalence

4 Information required to calculate a sample size
It is highly recommended that you ask a professional statistician to conduct the sample size
calculation.


4.1 Factors that influence sample size calculation: A checklist
1. The objective(s) of the research: Is the research dealing with an estimation, hypothesis testing, or equivalence testing problem?

2. Are the control and intervention(s) described in detail?

3. The outcome(s) of the research:

•  Is/Are the outcome(s) categorical or continuous?


•  Is it a multiple or single outcome study?
•   What is(are) the primary outcome(s)?
•   What is(are) the secondary outcome(s)?
•  Are the outcomes clinically relevant?
•  Can the outcomes be measured for all subjects?
•  Are the frequency and duration of the outcome measurements explicit?
•   Are there any surrogate outcomes?
–  What is the rationale for using surrogate outcomes?
–  Will they accurately reflect the main outcomes?
–  How will the observed benefit or harm on surrogate outcomes translate into corresponding benefit or harm on the main outcome?

4. Are there any covariates or factors for which to control?

5. What is the unit of randomization? Is it individual subjects, family practices, hospital wards, communities, families, etc.?

6. What is the unit of analysis? Is it individual subjects or clusters (e.g. family practices, hospital wards, communities, families)?

7. What is the research design? Is it

•  a simple randomized controlled trial (RCT)


•  a cluster randomized trial
•  an equivalence trial

•  a non-randomized intervention study


•   an observational study
•  a prevalence study
•   a study measuring sensitivity and specificity
•  a paired design study (ie paired comparison)
•  a repeated-measures design study (ie does your study include repeated measures)?
The following additional factors are equally important:

•  are the groups of equal size?

•  are the data hierarchical?
8. Research subjects

•  what is the target population?


•  what are the inclusion and exclusion criteria?
•  what is the likely patient compliance rate?
•  what is the baseline risk (poor or good prognosis)?
•  what is the chance of treatment response?
•  what is the potential drop-out rate?
9. How long is the duration of the follow-up? Is it long enough to be of any clinical relevance?

10. What is the desired level of significance?

11. What is the desired power?

12. What type of summary or test statistic will be used for analysis? Will it be a one- or two-tailed test?

13. The smallest difference (see Spiegelhalter and Freedman [9] and Spiegelhalter et al [8])

•  Does this reflect the degree of benefit from the intervention against the control
over the specified time frame?
•  Is it stated as

–  the smallest clinically important difference? (Lachin [12])
–  the difference that investigators think is worth detecting? (Fleiss [13])
–  the difference that investigators think is likely to be detected? (Halperin et al [10]).
14. Justification: Most importantly, is justification provided on how the various prior estimates used in the calculations were obtained, and on their usefulness in the context of the study? This also deals with the clinical relevance of the estimates depending on the source (i.e. published data, previous work, review of records, expert opinions, etc.).

5 Explanation of Statistical Terms
Below are brief descriptions of some of the above statistical terms.

1. Null and alternative hypothesis: Many statistical analyses involve the comparison of two treatments, procedures or subgroups of subjects. The numerical value summarizing the difference of interest is called the effect. In other study designs the effect may be

•  Odds ratio (OR): H0: OR = 1

•  Relative risk (RR): H0: RR = 1

•  Risk difference (RD): H0: RD = 0

•  Difference between means (µ1 − µ2): H0: µ1 − µ2 = 0

•  Correlation coefficient (ρ): H0: ρ = 0

Note that usually the null hypothesis H0 states that there is no effect, and the alternative hypothesis states that there is an effect.

2. P-value of a test: The p-value is the probability of obtaining an effect as extreme as, or more extreme than, the one observed in the study if the null hypothesis of no effect is actually true. It is usually expressed as a proportion (e.g. p=0.001).

3. Significance level of a test: Also called the Type I error probability, the significance level is a cut-off point for the p-value, below which the null hypothesis will be rejected and it will be concluded that there is evidence of an effect. The conventional significance level is α = 0.05 or 5%.

4.  Power of a test : Power is the probability

•  that the null hypothesis will be correctly rejected i.e. rejected when there is indeed
a real difference or association
•  the test will detect a difference or association of a particular magnitude when it
exists
•  1 − β, where β is the P(Type II error), the chance of missing a clinically meaningful difference

The higher the power, the lower the chance of missing a real effect. Power is typically
set to be at least 80%.
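The conventional choices of significance level and power translate into the z-values used in the sample size formulae later in these notes. As a quick check, the standard normal quantile function in the Python standard library reproduces the familiar constants:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard-normal quantile function

alpha, power = 0.05, 0.80
z_alpha_2 = z(1 - alpha / 2)  # critical value for a two-sided test
z_beta = z(power)             # beta = 1 - power = P(Type II error)

print(round(z_alpha_2, 2), round(z_beta, 2))  # 1.96 0.84
```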

                 Reject H0 (Accept Ha)        Accept H0
H0 True          Type I error                 Correct decision
                 (Probability = α)
H0 False         Correct decision             Type II error
                                              (Probability = β)

5. Effect size of clinical importance: This is the smallest difference between the group means or proportions (or odds ratio/relative risk closest to unity) which would be considered to be clinically meaningful. The sample size should be estimated so that if such a difference exists, then the chance that a statistically significant result would be obtained is very high.

6. One-sided and two-sided tests of significance: In a two-sided test, the null hypothesis states there is no effect, and the alternative hypothesis is that a difference exists in either direction. In a one-sided test the alternative hypothesis does specify a direction, for example that an active treatment is better than a placebo, and the null hypothesis then includes both no effect and placebo better than active treatment.

•  Two-sided tests should be used unless there is a very good reason for doing oth-
erwise.
•  One-sided tests may be appropriate in situations where it is completely inconceivable that the results could go in either direction, or where the only true concern is with outcomes in one tail of the distribution.

(a) Examples include:
    i. toxicity studies
    ii. safety evaluations
    iii. analysis of occurrences of adverse drug reactions
    iv. risk analysis
(b) References:
    i. Bland JM, Altman DG. One and two sided tests of significance. BMJ 1994; 309: 248.
    ii. Dubey SD. Some Thoughts on the One-sided and Two-sided Tests. Journal of Biopharmaceutical Statistics 1991; 1: 139-150.
    iii. Chow S-C, Shao J, Wang H. Sample Size Calculations in Clinical Research. Marcel Dekker: New York, NY 2003.
The expectation that the difference will be in a particular direction is not adequate justification for one-sided tests.

6 Formulae for Sample Size Calculations
Table 1: Formulae for Sample Size Calculations for Comparisons Between Means

Design                 Hypothesis        H0                Ha                Basic Rule
One-sample             Equality          µ − µ0 = 0        µ − µ0 ≠ 0        n = (z_{α/2} + z_β)² σ² / (µ − µ0)²
                       Superiority       µ − µ0 ≤ δ        µ − µ0 > δ        n = (z_α + z_β)² σ² / (µ − µ0 − δ)²
                       Equivalence       |µ − µ0| ≥ δ      |µ − µ0| < δ      n = (z_α + z_{β/2})² σ² / (|µ − µ0| − δ)²
Two-sample Parallel    Equality          µ1 − µ2 = 0       µ1 − µ2 ≠ 0       ni = 2(z_{α/2} + z_β)² σ² / (µ1 − µ2)²
                       Non-inferiority   µ1 − µ2 ≥ δ       µ1 − µ2 < δ       ni = 2(z_α + z_β)² σ² / (µ1 − µ2 − δ)²
                       Superiority       µ1 − µ2 ≤ δ       µ1 − µ2 > δ       ni = 2(z_α + z_β)² σ² / (µ1 − µ2 − δ)²
                       Equivalence       |µ1 − µ2| ≥ δ     |µ1 − µ2| < δ     ni = 2(z_α + z_{β/2})² σ² / (|µ1 − µ2| − δ)²
Two-sample Crossover   Equality          µ1 − µ2 = 0       µ1 − µ2 ≠ 0       ni = (z_{α/2} + z_β)² σ² / (2(µ1 − µ2)²)
                       Non-inferiority   µ1 − µ2 ≥ δ       µ1 − µ2 < δ       ni = (z_α + z_β)² σ² / (2(µ1 − µ2 − δ)²)
                       Superiority       µ1 − µ2 ≤ δ       µ1 − µ2 > δ       ni = (z_α + z_β)² σ² / (2(µ1 − µ2 − δ)²)
                       Equivalence       |µ1 − µ2| ≥ δ     |µ1 − µ2| < δ     ni = (z_α + z_{β/2})² σ² / (2(|µ1 − µ2| − δ)²)

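As a sketch of how the Table 1 rules translate into code, the function below implements the two-sample parallel "test for equality" row; the illustrative inputs (a standard deviation of 7.7 and a difference of 5 points) are borrowed from Example 2 in Section 8.2, and exact rather than rounded z-values are used.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_means(mu1, mu2, sigma, alpha=0.05, power=0.80):
    """Two-sample parallel design, test for equality (Table 1):
    ni = 2 (z_{alpha/2} + z_beta)^2 sigma^2 / (mu1 - mu2)^2, rounded up."""
    z = NormalDist().inv_cdf
    numerator = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma ** 2
    return ceil(numerator / (mu1 - mu2) ** 2)

# Detect a 5-point difference when the outcome SD is 7.7:
print(n_per_group_means(0, 5, 7.7))  # 38 per group
```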
Table 2: Formulae for Sample Size Calculations for Comparisons Between Proportions

Design                 Hypothesis        H0                Basic Rule
One-sample             Equality          π − π0 = 0        n = (z_{α/2} + z_β)² π(1 − π) / (π − π0)²
                       Superiority       π − π0 ≤ δ        n = (z_α + z_β)² π(1 − π) / (π − π0 − δ)²
                       Equivalence       |π − π0| ≥ δ      n = (z_α + z_{β/2})² π(1 − π) / (|π − π0| − δ)²
Two-sample Parallel    Equality          π1 − π2 = 0       ni = (z_{α/2} + z_β)² [π1(1 − π1) + π2(1 − π2)] / (π1 − π2)²
                       Non-inferiority   π1 − π2 ≥ δ       ni = (z_α + z_β)² [π1(1 − π1) + π2(1 − π2)] / (π1 − π2 − δ)²
                       Superiority       π1 − π2 ≤ δ       ni = (z_α + z_β)² [π1(1 − π1) + π2(1 − π2)] / (π1 − π2 − δ)²
                       Equivalence       |π1 − π2| ≥ δ     ni = (z_α + z_{β/2})² [π1(1 − π1) + π2(1 − π2)] / (|π1 − π2| − δ)²
Two-sample Crossover   Equality          π1 − π2 = 0       ni = (z_{α/2} + z_β)² σd² / (2(π1 − π2)²)
                       Non-inferiority   π1 − π2 ≥ δ       ni = (z_α + z_β)² σd² / (2(π1 − π2 − δ)²)
                       Superiority       π1 − π2 ≤ δ       ni = (z_α + z_β)² σd² / (2(π1 − π2 − δ)²)
                       Equivalence       |π1 − π2| ≥ δ     ni = (z_α + z_{β/2})² σd² / (2(|π1 − π2| − δ)²)

Here σd² denotes the variance of the within-subject differences in the crossover design.

6.1 Sample Size Adjustments
It is important to note that sample-size problems will vary from study to study depending
on the context. The sample size may need to be adjusted to account for the effects of other
variables, and the uncertainty of predictable practical and ethical factors.
•  Which variables should be included in the sample size calculation? 
–  The sample size calculation should relate to the study’s primary outcome variable.
–  Ideally, separate sample size calculations should be provided for each important variable: the sample size should also be sufficient for the analyses of all important variables.
–  A simpler conservative approach is to estimate sample sizes for all important outcomes and then use the maximum estimate.
–  As a rule of thumb, when the correlation of a covariate with the response variable is ρ, then the sample size can be reduced by a factor of 1 − ρ². That is,

       n_new = n(1 − ρ²).

•  Multiplicity and Sample Size Adjustment: A multiplicity adjustment to the level of significance, using the Bonferroni method, should be made when at least one significant result (e.g. one of several primary outcomes, or several pairwise comparisons) is required to draw a conclusion.

•  Allowing for response rates and other losses to the sample

The sample size calculation should relate to the final, achieved sample. Therefore, the initial sample size may need to be adjusted in order to account for
–  the expected response rate

–  loss to follow up
–   lack of compliance
–  any other unforeseen reasons for loss of subjects

For example, to adjust the sample size for the anticipated loss to follow-up rate: suppose n is the total number of subjects in each group not accounting for loss to follow-up, and L is the loss to follow-up rate; then the adjusted sample size is given by

       n_new = n / (1 − L)
It is important to state clearly what factors were taken into consideration in the sample
size adjustment and the justification should also be explicit.

•  Adjustment for Unequal Group Size 



–  First calculate n, the per-group sample size assuming an equal number per group
–  If we require n1/n2 = k, then

       n1 = n(1 + k)/2   and   n2 = n(1 + 1/k)/2

–  The question of whether to use unequal sample sizes matters when multiple sizes can be obtained in one group (see Lachin 2000, van Belle 2003)

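The adjustments described in this subsection can be collected into small helper functions. The sketch below uses illustrative inputs (a per-group n of 146, a 20% loss rate, a covariate correlation of 0.5 and a 2:1 allocation are assumptions chosen for the example, not values from any specific study):

```python
from math import ceil

def adjust_for_loss(n, loss_rate):
    """Inflate the per-group size for anticipated loss to follow-up: n / (1 - L)."""
    return ceil(n / (1 - loss_rate))

def adjust_for_covariate(n, rho):
    """Rule-of-thumb reduction when a covariate has correlation rho with the
    response: n_new = n (1 - rho^2)."""
    return ceil(n * (1 - rho ** 2))

def unequal_groups(n, k):
    """Split a per-group size n into an n1/n2 = k allocation:
    n1 = n(1 + k)/2 and n2 = n(1 + 1/k)/2."""
    return ceil(n * (1 + k) / 2), ceil(n * (1 + 1 / k) / 2)

print(adjust_for_loss(146, 0.20))     # 183 per group after allowing for 20% loss
print(adjust_for_covariate(100, 0.5)) # 75 after the covariate rule of thumb
print(unequal_groups(100, 2))         # (150, 75) for 2:1 allocation
```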
7 Reporting the results of sample size calculation in the protocol
The protocol should provide sufficient details on how the sample size was determined. This
should cover
1. clear statements of the (primary) objective(s) of the study

2. the desired level of significance

3. the desired power

4. the type of summary or test statistic that will be used for analysis

5. whether the test will be one- or two-tailed

6. the smallest difference and a clear statement of whether it is

   •  the smallest clinically important difference
   •  the difference that investigators think is worth detecting
   •  the difference that investigators think is likely to be detected

7. justification provided on how the various prior estimates of the variance and the effect used in the calculations were obtained and their usefulness in the context of the study

8. clear statements about the assumptions made about the distribution or variability of the outcomes

9. a clear statement about the scheduled duration of the study

10. clear statements about how the sample size calculation was adjusted for

   •  the expected response rate
   •  loss to follow-up
   •  lack of compliance
   •  any other unforeseen reasons for loss of subjects

11. any other information that formed the basis for the sample size calculation.


8 Specific examples of sample size calculations
If your study requires the estimation of a single proportion, comparison of two means, or comparison of two proportions, the sample size calculations for these situations are (generally) relatively straightforward, and are therefore presented here. However, it is still strongly recommended that you ask a statistician to conduct the sample size calculation. The following examples (taken from St. George's Hospital Medical School website: http://www.sghms.ac.uk/depts/phs/guide/size.htm ) are meant to illustrate how to calculate, justify and report sample size calculations.

8.1 Example 1: Comparing two proportions
•   Goal:  The following calculation only applies when you intend to compare two groups
of the same size.
•  Scenario: A placebo-controlled randomized trial proposes to assess the effectiveness of colony stimulating factors (CSFs) in reducing sepsis in premature babies. A previous study has shown the underlying rate of sepsis to be about 50% in such infants around 2 weeks after birth, and a reduction of this rate to 34% would be of clinical importance.
•   Required information :
–  Primary outcome variable = presence/absence of sepsis at 14 days after treatment
(treatment is for a maximum of 72 hours after birth).
–  Hence, a categorical variable summarized by proportions.
–  Size of difference of clinical importance = 16%, or 0.16 (i.e. 50%-34%)
–  Significance level = 5%
–  Power = 80%

–  Type of test = two-sided


The formula for the sample size for comparison of two proportions (two-sided) is as follows:

       n = [z_{α/2} + z_β]² × [π1(1 − π1) + π2(1 − π2)] / (π1 − π2)²
where
–  n = the sample size required in each group (double this for total sample)
–   π1   = first proportion=0.50,
–   π2  = second proportion=0.34,
–   π1 π2  = size of difference of clinical importance = 0.16

–   z α depends on desired significance level = 1.96
2

14
 

–   z ββ   depends on desired power = 0.84

Inserting the required information into the formula gives:

n = (1.96 + 0.84)² × [(0.50 × 0.50) + (0.34 × 0.66)] / (0.16)² = 146

This gives the number required in each of the trial’s two groups. Therefore the total
sample size is double this, i.e. 292.

•   Suggested description of this sample size calculation :


”A sample size of 292 babies (146 in each of the treatment and placebo
groups) will be sufficient to detect a clinically important difference of 16%
between groups in the sepsis rate at 14 days, using a two-sided Z-test of the
difference between proportions with 80% power and a 5% significance level.
This 16% difference represents the difference between a 50% sepsis rate in
the placebo group and a 34% rate in the treatment group.”
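This calculation is easy to script and check. Below is a minimal Python sketch (the function
name is illustrative, not from any package; only the standard library is used):

```python
from math import ceil
from statistics import NormalDist  # stdlib standard-normal quantiles

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two proportions."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # 1.96 for a 5% two-sided significance level
    z_beta = z(power)           # 0.84 for 80% power
    numerator = (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(numerator / (p1 - p2) ** 2)

# Example 1: sepsis rate of 50% (placebo) vs 34% (treatment)
n = n_per_group_two_proportions(0.50, 0.34)  # 146 per group, 292 in total
```

Rounding is always upward (ceil), since rounding down would give slightly less than the
nominal power.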

8.2 Example 2: Comparing two means
•   Goal : The following calculation only applies when you intend to compare two groups
of the same size.

•   Scenario: A randomized controlled trial has been planned to evaluate a brief psy-
chological intervention in comparison to usual treatment in the reduction of suicidal
ideation amongst patients presenting at hospital with deliberate self-poisoning. Suici-
dal ideation will be measured on the Beck scale; the standard deviation of this scale
in a previous study was 7.7, and a difference of 5 points is considered to be of clini-
cal importance. It is anticipated that around one third of patients may drop out of
treatment (Guthrie et al. 2001)
•   Required information :

–  Primary outcome variable = The Beck scale for suicidal ideation.


–  A continuous variable summarized by means.
–  Standard deviation = 7.7 points
–  Size of difference of clinical importance = 5 points
–  Significance level = 5%
–  Power = 80%
–  Type of test = two-sided

The formula for the sample size for comparison of 2 means (2-sided) is as follows:

n = (z_{α/2} + z_β)² × 2σ² / δ²

where n = the sample size required in each group (double this for total sample).
σ = standard deviation of the primary outcome variable = 7.7.
δ = size of difference of clinical importance = 5.0.
z_{α/2} = 1.96.
z_β = 0.84.
Inserting the required information into the formula gives:

n = (1.96 + 0.84)² × 2 × 7.7² / 5.0² = 38

This gives the number required in each of the trial’s two groups. Therefore the total
sample size is double this, i.e. 76.

To allow for the predicted dropout rate of around one third, the sample size was
increased to 60 in each group, a total sample of 120.

•  Suggested wording of this sample size calculation :


” A sample size of 38 in each group will be sufficient to detect a clinically
important difference of 5 points on the Beck scale of suicidal ideation, as-
suming a standard deviation of 7.7 points, using a two-tailed t-test of the
difference between means, a power of 80%, and a significance level of 5%.
The calculation is based on the assumption that the measurements on the
Beck scale are normally distributed. This number has been increased to 60 per
group (total of 120), to allow for a predicted drop-out from treatment of
around one third”.
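As with Example 1, the arithmetic is easily scripted. A Python sketch (illustrative names,
standard library only), including the dropout inflation:

```python
from math import ceil
from statistics import NormalDist

def n_per_group_two_means(sd, delta, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means."""
    z = NormalDist().inv_cdf
    n = (z(1 - alpha / 2) + z(power)) ** 2 * 2 * sd ** 2 / delta ** 2
    return ceil(n)

# Example 2: SD of 7.7 on the Beck scale, clinically important difference of 5
n = n_per_group_two_means(7.7, 5.0)  # 38 per group
# Inflate for an anticipated one-third dropout: divide by (1 - dropout rate)
n_inflated = ceil(n / (1 - 1 / 3))   # 57; the notes round this up to 60
```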


•  Wording from Power and Precision 


Power for a test of the null hypothesis
One goal of the proposed study is to test the null hypothesis that the two
population means are equal. The criterion for significance (alpha) has been
set at 0.050. The test is 2-tailed, which means that an effect in either direc-
tion will be interpreted.
With the proposed sample size of 39 and 39 for the two groups, the study
will have power of 80.8% to yield a statistically significant result.
This computation assumes that the mean difference is 5.0 and the common
within-group standard deviation is 7.7.
This effect was selected as the smallest effect that would be important to
detect, in the sense that any smaller effect would not be of clinical or sub-
stantive significance. It is also assumed that this effect size is reasonable, in
the sense that an effect of this magnitude could be anticipated in this field
of research.

Precision for estimating the effect size


A second goal of this study is to estimate the mean difference between the
two populations. On average, a study of this design would enable us to report
the mean difference with a precision (95.0% confidence level) of plus/minus
3.46 points.
For example, an observed difference of 5.0 would be reported with a 95.0%
confidence interval of 1.54 to 8.46.
The precision estimated here is the median precision. Precision will vary as
a function of the observed standard deviation (as well as sample size), and
in any single study will be narrower or wider than this estimate.
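The quoted plus/minus 3.46 points can be reproduced approximately from the standard error
of a difference between two means. The sketch below uses the normal approximation, so it
comes out slightly narrower (about 3.42) than the t-based figure the software reports:

```python
from math import sqrt
from statistics import NormalDist

def ci_half_width_two_means(sd, n_per_group, conf=0.95):
    """Approximate half-width of the CI for a difference of two group means."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sd * sqrt(2 / n_per_group)  # z times the SE of the difference

half = ci_half_width_two_means(7.7, 39)  # about 3.42 points
```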

9 Inappropriate wording or reporting of sample size
calculations
1.  Example 1:  ”A previous study in this area recruited 150 subjects and found highly
significant results (p=0.014), and therefore a similar sample size should be sufficient
here.”

•  Why is this a problem? : Previous studies may have been ’lucky’ to find significant
results, due to random sampling variation
•   Solution: Calculations of sample size specific to the present, proposed study
should be provided
2.   Example 2:  ”Sample sizes are not provided because there is no prior information on
which to base them.”

•  If the study is a preliminary pilot aimed at assessing feasibility or gathering the
information required to calculate sample sizes for a full-scale study, then sample
size calculations are not necessary
•   Where prior information on standard deviations is unavailable, the standard
deviation can be estimated from the range as

Standard deviation = (Max − Min)/4.

Sample size calculations can then be given in very general terms, i.e. by giving
the size of difference that may be detected in terms of a number of standard
deviations

3.   Example 3: “The throughput of the clinic is around 50 patients a year, of whom
10% may refuse to take part in the study. Therefore over the 2 years of the study, the
sample size will be 90 patients.”

•  Although most studies need to balance feasibility with study power, the sample
size should not be decided on the number of available patients alone.
•  If the number of available patients is a known limiting factor, apply the patients
I can get approach to indicate either
(a) the power which the study will have to detect the desired difference of clinical
importance, or
(b) the difference which will be detected when the desired power is applied

Where the number of available patients is too small to provide sufficient power to detect
differences of clinical importance, you may wish to consider extending the length of 
the study, or collaborating with a colleague to conduct a multi-centre study.

4.   Other Examples

•   “The results of a pilot study have been submitted for publication and the reviewers
indicate that our sample size is adequate.”
•   “We aim to recruit 100 participants. This sample size was determined to detect
a small to moderate mean difference of xx points between the treatment groups
on at least one of the key outcomes with a 70% power.”

10 Important Remarks About Achieving the required
sample size

10.1 Common Recruitment Strategies
Recruitment of suitable subjects for participation in clinical trials can also be achieved through
use of effective recruitment strategies (taken from the 2000 Office of Inspector General (OIG)
of the US Department of Health Services Report):

1. through the use of financial and non-financial incentives

2. by physicians flagging patients in their practice through chart reviews or when they
appear for an appointment

3. by furnishing trial information to other local clinicians or to disease advocacy and other
groups

4. through advertising and promotion such as

•  media ads
•   press releases
•  televised segments
•  speakers at local health fairs
Recruitment through the Internet or Web is also increasingly becoming a popular option.

10.2 Reasons for failure to achieve the required sample size
The sample size required for a clinical trial may be very hard to recruit, or recruitment may
be much slower than anticipated. This is quite common in clinical studies. Sometimes this
can lead to premature ending of the trial, which could lead to inconclusive findings because of
lack of power. In order to avoid or address this problem, it is important to understand why
it happens:

•  patients’ refusal to consent to participate in the study


•  bad time of the study: snowy weather may also discourage potential patients from
participating, especially if the trial involves clinic visits

•  adverse media publicity: sometimes adverse media publicity about medicine in general
and trials in particular may discourage potential subjects from taking part in a research
endeavor

•  failure of recruiting staff to identify and approach potential research subjects



•  lack of genuine commitment to the project: this might be caused by
honest doubts about the safety or efficacy of the new treatment

•  poor recruitment may also be due to staffing problems: eg clinical unit is under pressure
from excessive patient numbers or understaffing

•  too many projects going after the same subjects


For further details on barriers to patient participation in clinical studies, see Ross et al 1999.

10.3 Possible or Common Solutions
Possible solutions include

•  Pilot studies are very helpful in providing insights into some of these issues
•   good planning: devise a clear plan of how to monitor recruitment. This may also
help the proposal for funding, since funders may well be impressed by a proposal which
shows that this issue has been considered by applicants and that there are plans to
deal with it

•  request to the funders for an extension in time or for an extension in funding


•  check with potential collaborators what their other trial commitments are
•   maintain regular visits to trial sites and good contact with staff who are responsible
for recruitment

•  to have recruitment targets (milestones) to enable the research team to monitor how
well recruitment is going, so that problems can be detected as soon as possible

11 Retrospective Sample Size Calculations
Sometimes, people try to estimate the sample size or perform power analysis after the study
is completed.

•   Avoid retrospective planning; it’s bad science!


•  IMPORTANT: ”Observed power” (ie power calculated based on the observed effect)
is a decreasing function of the p-value of the test (Hoenig and Heisey, 2001). That is,

–  the observed power increases as the p-value decreases


–   the higher the observed power, the greater the evidence against the null hypothesis

•  The problem with observed power:



–   It is associated with this common mis-interpretation or misconception: if the
test is nonsignificant (p-value is large), but the observed power is high, then this is
interpreted to mean that there is strong evidence in support of the null hypothesis
–   It causes great confusion because it is often used inappropriately to add interpre-
tation to a non-significant test result

12 Important Rules of Thumb/Cautions

12.1 General
1.  Multiplicity and Sample Size Adjustment : Multiplicity adjustment using the Bonferroni
method to the level of significance should be made when at least one significant result
(eg one of several primary outcomes or several pairwise comparisons) is required to
draw a conclusion
2.  Overlapping Confidence Intervals Do not imply non-significance 
Basic Rule: “Confidence intervals associated with statistics can overlap as much
as 29% and the statistics can still be significantly different” (van Belle 2002)
3.  Sample size calculations should be based on the statistics used in the analysis. For example,
if sample size calculations were based on the assumption that the outcome is continuous,
then dichotomizing the outcome for the analysis would not be appropriate. Why?
•  using a different statistic for analysis may alter the anticipated power
•  the anticipated treatment effect may no longer be meaningful in the scale of the
new statistic
4. The basic rule of thumb for estimating the sample size for testing equality of two means
is
n1 = n2 = 16σ²/δ²;   where δ = µ1 − µ2

5. The basic rule of thumb for estimating the sample size for testing equality of two
proportions is
n1 = n2 = 16π(1 − π)/(π1 − π2)²;   where π = (π1 + π2)/2
6. Since sample size calculations are estimates,
•   it is usually better to be conservative. For example, it is usually better to assume
a two-sided test than a one-sided test
•  it is better to adopt a simple approach even for complex problems. For example, it
is simpler to use the difference between proportions than logistic regression in sample
size calculation


7. Although larger sample sizes are generally desired, it is important to be aware of the
statistical (over-power), ethical, and economic consequences of too large a sample

8.  Rare Incidence rates: If the primary outcome is an extremely rare event (eg one per
10,000), then sample size calculations will indicate that a very large sample is required

9. It is worth noting that observational or non-randomized studies looking for differences
or associations will generally require a much larger sample in order to allow adjustment
for confounding factors within the analysis

10.   It is the absolute sample size which is of most interest, not the sample size as a
proportion of the whole population

11.   Consistency with study aims and statistical analysis 

•  The adequacy of a sample size should be assessed according to the purpose of the
study. Check whether the purpose of the study is
–  to test for no difference (equality)
–   to test for non-inferiority
–  to test for superiority
–  to test for equivalence
Note that the sample size required to demonstrate equivalence will be larger than
that required to demonstrate a difference.
•  Sample size calculations should relate to the study’s stated objectives, and be
based on the study’s primary outcome variable or all important outcome variables
•  Sample size calculations should also be consistent with the proposed method of 
analysis

12. The following rules of thumb have been recommended by vanVoorhis and Morgan 2001:

•  For moderate to large effect sizes (ie 0.50 ≤ effect size ≤ 0.80), 30 subjects per group
are required
•  For comparisons between three or more groups, then to detect an effect size of 0.5
(a) with 80% power, will require 14 subjects/group
(b) with 50% power, will require 7 subjects/group

13.  Sensitivity Analysis: It is best to create a sample size table for different values of the
level of significance (α), power or different effect sizes, and then ponder the table to
select the optimal sample size
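Rules of thumb 4 and 5 above are quick approximations of the exact formulas in Section 8:
the constant 16 is roughly 2(z_{α/2} + z_β)² at 80% power and a two-sided 5% level. A sketch
checking them against the worked examples (illustrative function names):

```python
from math import ceil

def rot_two_means(sd, delta):
    """Rule-of-thumb per-group n for two means: 16*sigma^2/delta^2."""
    return ceil(16 * sd ** 2 / delta ** 2)

def rot_two_proportions(p1, p2):
    """Rule-of-thumb per-group n for two proportions: 16*pi(1-pi)/(p1-p2)^2."""
    p_bar = (p1 + p2) / 2  # average of the two proportions
    return ceil(16 * p_bar * (1 - p_bar) / (p1 - p2) ** 2)

n_means = rot_two_means(7.7, 5.0)          # 38, matching Example 2 exactly
n_props = rot_two_proportions(0.50, 0.34)  # 153, close to the exact 146
```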


12.2 Rules of Thumb for Relationships/Association
In regression problems, power refers to the ability to find a specified regression coefficient
or level of R² statistically significant at a specified level of significance and specified sample
size.
1. For multiple regression: Hair et al, 2000 state

(a) that with 80% power and α = 0.05, one can detect
i.   R² ≥ 0.23 based on n = 50
ii.   R² ≥ 0.12 based on n = 100
(b) The general rule is that the ratio of number of subjects to number of independent
variables should be about 5:1. There is substantial risk of “overfitting” if it falls
below.
(c) The desired ratio is usually about 15 to 20 subjects for each independent variable

2. Sample size for examining relationships: Green (1991) recommends

(a) Rule 1: for testing multiple correlations

n > 50 + 8m

where m is the number of independent variables
(b) Rule 2: for testing the relationship of outcome with individual predictors

n > 104 + m

3. Harris (1985) recommends

(a) Rule 1: For 5 or fewer predictors, the number of subjects should exceed the number
of independent variables by 50

n > 50 + m

(b) Rule 2: For equations involving 6 or more predictors, an absolute number of 10
subjects per predictor is recommended

n > 10m

4. Large samples are needed (see Tabachnick and Fidell, 1996) if

(a) the dependent variable is skewed
(b) the effect size is small
(c) there is substantial measurement error
(d) stepwise regression is used
5. Rules for chi-squared testing
(a) Sample size should be such that no expected frequency in a cell drops below
5: small expected cell frequencies can limit power substantially and inflate Type
I error.
(b) Overall sample size should be at least 20
(c) The number of cells (ie degrees of freedom of the chi-squared test) is indirectly
related with power (see Cohen (1988))
6. Rules for Factor Analysis
(a) At least 300 cases/subjects (Tabachnick and Fidell, 1996)
(b) At least 50 participants/subjects per variable (Pedhazur and Schmelkin, 1991)
(c) Comrey and Lee (1992) guide:

•   n=50: very poor
•   n=100: poor
•   n=200: fair
•   n=300: good
•   n=500: very good
(d) The higher the cases-per-variable ratio, the smaller the chance of “overfitting” (ie
creating factors that are not generalizable beyond the specific sample)

13 Tips on Elicitation of Effect Sizes and Variances for
Sample Size Calculations
The following tips are taken from “Some Practical Guidelines for Effective Sample Size
Determination” by Lenth (Lenth, 2001)
1. Elicitation of information on effect sizes to calculate sample size: Important questions
•  What results do you expect (hope to) see?
•  Would an effect of half that magnitude [specify] be of any scientific interest?
•  Would an increase/decrease/difference of this magnitude [specify] be of any prac-
tical importance?
•  What is the range of clinical indifference?
•  If you were a patient, would the benefits of reducing/increasing the primary out-
come [specify] by this magnitude [specify] outweigh the cost, inconvenience and
potential side effects of this treatment?


2. Elicitation of information on standard deviations/variances to calculate sample size:
Important questions

•  What is the usual range of the primary outcome?
•  What are the smallest and largest values that you have seen?
•  Discuss stories about extreme observations to determine the extent to which they
may represent ordinary dispersion
•   Do you have any studies, your own or by others, that used this
outcome? Do you have any historical or pilot data?
•  What are the possible sources of variation based on past studies?
3. Effect sizes can also be expressed in terms of the standard deviation. For example, the
general sample size formula for comparing two means is

n = (z_{α/2} + z_β)² (σ1² + σ2²) / δ²

where δ = µ1 − µ2. If σ1 = σ2 = σ, this reduces to

n = (z_{α/2} + z_β)² × 2σ² / δ².

If δ = kσ for some constant k, the formula can be rewritten as

n = 2(z_{α/2} + z_β)² / k².

Thus, your elicitation about δ involves elicitation of k.

14 Sample Size Calculations for Cluster Randomized
Controlled Studies
Cluster randomized designs are increasingly used in community healthcare interventions and
health services research. Cluster randomized designs are designs in which intact social units
or clusters of other units are allocated to treatment groups or interventions.

14.1
14.1 Reason
Reasonss for Usin
Using
g Clu
Cluster
ster-ra
-rando
ndomiz
mized
ed Design
Designss
Reasons for using cluster-randomized designs include (from Hutton 2001; Donner and Klar
2000):

•  Scientific reasons

1.  Treatment contamination: In cases where the intervention is aimed at changing human
behavior or knowledge transmission, cluster randomized designs are used to avoid
contamination or influence of personal interactions among cluster members.
2.   Enhancing compliance/adherence : Informal discussions about the intervention ap-
plied to a family practice, school, community setting, etc, might enhance subject
adherence.
3.  Cluster level intervention: Some interventions, such as those aimed at family physi-
cians, can only be applied at the cluster level. The cluster effect or influence of
cluster-level covariates is such that individuals within a cluster are often treated
in a similar fashion or exposed to a common environment.
4.   Cluster action of an intervention: Interventions such as vaccines or treatment of
river blindness, applied at community level, reduce the likelihood of infections
or transmission of diseases within the community. This is because infections
tend to spread more quickly within communities/families than between commu-
nities/families.

•  Logistical and political reasons


1.  Administrative convenience : Using clusters facilitates the administration of a trial
in many ways: Fewer units (clusters) to contact; access to patients through family
practices is easier; recruitment and randomization of practices is easier and faster
than that of patients.
2.   Political: Sometimes community leaders, or local or national leaders, may have to
provide permission before individual subjects within a community can be con-
tacted for trial purposes
3.  Access to routine data:  Sometimes it’s easier to randomize practices or communities
in order to access relevant information

•   Ethical Reasons
1.  Randomizing part of the family: For trials that deal with vaccines or food inter-
ventions, it would appear unethical to randomize some members of a family or
community to one intervention instead of the whole family.

14.2 Sample Size Formulae for Cluster-randomized Designs
Let

k = number of clusters
m = average cluster size
ρ = intra-cluster correlation coefficient
IF = 1 + (m − 1)ρ = inflation factor

•  Testing Equality of Means:  µ1 − µ2 = 0

n1 = n2 = (z_{α/2} + z_β)² × 2σ² × IF / (µ1 − µ2)²

k = (z_{α/2} + z_β)² × 2σ² × IF / [m(µ1 − µ2)²]

•  Testing Equality of Proportions:  π1 − π2 = 0

n1 = n2 = (z_{α/2} + z_β)² × [π1(1 − π1) + π2(1 − π2)] × IF / (π1 − π2)²

k = (z_{α/2} + z_β)² × [π1(1 − π1) + π2(1 − π2)] × IF / [m(π1 − π2)²]

•  Testing Equality of Incidence Rates:  λ1 = λ2

n1 = n2 = (z_{α/2} + z_β)² × (λ1 + λ2) × IF_t / [t(λ1 − λ2)²]

where

IF_t = 1 + CV²(λ1² + λ2²)t / (λ1 + λ2)

CV = σ1/λ1 = σ2/λ2

and σi² is the between-cluster variation in incidence rates for the ith group, t is the
person-years, and CV is the coefficient of variation, which plays the same role as the
intra-class correlation coefficient. Note that if CV = 0, then IF_t = 1.
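For the two-means case the formulae can be sketched as follows (illustrative function name;
the inputs reuse Example 2’s SD and difference with a hypothetical cluster size and ICC):

```python
from math import ceil
from statistics import NormalDist

def cluster_n_and_k_two_means(sd, delta, m, rho, alpha=0.05, power=0.80):
    """Per-arm individuals (n) and clusters (k) for comparing two means."""
    z = NormalDist().inv_cdf
    inflation = 1 + (m - 1) * rho  # IF = 1 + (m - 1)*rho
    n_ind = (z(1 - alpha / 2) + z(power)) ** 2 * 2 * sd ** 2 / delta ** 2
    n = n_ind * inflation          # individually randomized size, inflated
    return ceil(n), ceil(n / m)    # k = n/m clusters per arm

# Hypothetical: SD 7.7, difference 5, clusters of size m=10, ICC rho=0.05
n, k = cluster_n_and_k_two_means(7.7, 5.0, m=10, rho=0.05)  # (54, 6)
```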

15 Sample Size Calculations for Other Types of Stud-
ies

15.1 Analysis of Change From Baseline
Let

y = Response variable
yb = Corresponding baseline measurement
µ = Mean of the distribution of the response variable
µb = Corresponding baseline mean
ρ = correlation between response and baseline measurements
d = y − yb = change from baseline
δ = µ − µb
σ = common standard deviation of the response and baseline measurements
(so that the variance of d is 2σ²(1 − ρ))

The sample size calculation for analysis based on d is given by

n = (z_{α/2} + z_β)² × 2σ²(1 − ρ) / δ²

Note that if ρ > 0.5, it is advantageous, in terms of the sample size, to evaluate the change
from baseline instead of comparing two groups in a parallel design.
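A sketch of this formula (illustrative function name) makes the role of the baseline correlation
concrete, reusing Example 2’s inputs:

```python
from math import ceil
from statistics import NormalDist

def n_change_from_baseline(sd, delta, rho, alpha=0.05, power=0.80):
    """Sample size for analysing change from baseline with correlation rho."""
    z = NormalDist().inv_cdf
    n = (z(1 - alpha / 2) + z(power)) ** 2 * 2 * sd ** 2 * (1 - rho) / delta ** 2
    return ceil(n)

n_zero = n_change_from_baseline(7.7, 5.0, rho=0.0)  # 38: no gain over parallel
n_high = n_change_from_baseline(7.7, 5.0, rho=0.7)  # 12: strong correlation helps
```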

15.2
15.2 An
Anal
alys
ysis
is o
off Times
Times to F
Fai
ailu
lure
re
Let

M t   = Mean survi


surviv
val tim
timee in T
Treatm
reatment
ent Group
M c   = Mean survi
surviv
val ttime
ime in C
Contr
ontrol
ol Gr
Group
oup

Assuming the survival times are exponentially distributed, the required sample size is given
by

n = 2(z_{α/2} + z_β)² / [ln(Mt/Mc)]²
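Under the exponential assumption the calculation is a one-liner; the mean survival times
below are hypothetical, chosen only to illustrate the call:

```python
from math import ceil, log
from statistics import NormalDist

def n_per_group_exp_survival(m_t, m_c, alpha=0.05, power=0.80):
    """Per-group sample size comparing exponential mean survival times."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / log(m_t / m_c) ** 2)

n = n_per_group_exp_survival(24, 18)  # e.g. mean survival 24 vs 18 months: 190
```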

15.3
15.3 Compar
Compariso
isons
ns of Means
Means for two
two Poisso
Poisson
n Pop
Popula
ulatio
tions
ns
Let

θ1   = Mea
Mean
n of Po
Popul
pulati
ation
on 1
θ1   = Mea
Mean
n of Po
Popul
pulati
ation
on 2

Using a two-sample test of equality of means based on samples from two Poisson populations,
the required number of observations per sample is

n = 4 / (√θ1 − √θ2)²
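This square-root (variance-stabilizing) formula is trivial to evaluate; the Poisson means
below are hypothetical:

```python
from math import ceil, sqrt

def n_per_group_poisson(theta1, theta2):
    """Per-group sample size for comparing two Poisson means."""
    return ceil(4 / (sqrt(theta1) - sqrt(theta2)) ** 2)

n = n_per_group_poisson(4.0, 2.25)  # e.g. mean counts 4.0 vs 2.25: 16 per group
```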


15.4
15.4 Test
esting
ing for
for a single
single cor
correl
relati
ation
on coefficie
coefficient
nt
Let

H 0   :   ρ  = 0

H a   :   ρ = 0

Using The Fisher’s arctanh (Z) transformation

 1
 
+ ρ
1 + ρ
Z   =  ln
2 1 ρ −
and the normal approximation, the required sample size is

n = 3 + [2(z_{α/2} + z_β)]² / [ln((1 + ρ)/(1 − ρ))]²

where ρ is regarded as the clinically meaningful value of the correlation coefficient. Note
that the same formula can be used to determine the sample size for testing that the slope of
a regression line is not equal to zero.
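Since arctanh(ρ) = ½ ln[(1 + ρ)/(1 − ρ)], the formula codes directly with math.atanh
(illustrative function name; ρ = 0.3 is a hypothetical input):

```python
from math import atanh, ceil
from statistics import NormalDist

def n_single_correlation(rho, alpha=0.05, power=0.80):
    """Sample size to detect a correlation of rho against zero (Fisher z)."""
    z = NormalDist().inv_cdf
    return ceil(3 + ((z(1 - alpha / 2) + z(power)) / atanh(rho)) ** 2)

n = n_single_correlation(0.3)  # 85 subjects to detect rho = 0.3 with 80% power
```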

15.5 Comparing Correlation Coefficients for Two Independent Sam-
ples
Let

H0 :  ρ1 = ρ2
Ha :  ρ1 ≠ ρ2
 
The required number of observations per sample is given by

n = 3 + [2(z_{α/2} + z_β)]² / [ln((1 + ρ1)/(1 − ρ1)) − ln((1 + ρ2)/(1 − ρ2))]²
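The same Fisher-z machinery handles the two-sample comparison; a sketch with hypothetical
correlations:

```python
from math import atanh, ceil
from statistics import NormalDist

def n_per_group_two_correlations(rho1, rho2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent correlations."""
    z = NormalDist().inv_cdf
    diff = atanh(rho1) - atanh(rho2)  # difference on the Fisher z scale
    return ceil(3 + ((z(1 - alpha / 2) + z(power)) / diff) ** 2)

n = n_per_group_two_correlations(0.5, 0.2)  # 69 per sample
```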

15.6 Estimation Problems
If there are no comparisons being made but a parameter is being estimated, then the confidence
interval approach is used in calculating the sample size. Here we require a prior estimate
of the variance and the margin of error or required accuracy. For estimating the population
mean µ, using a (1 − α)100% confidence interval and a desired margin of error of E, the
sample size is given by

n = (z_{α/2} σ / E)²

where σ is the prior estimate of the standard deviation of the population. The corresponding
formula for estimating the population proportion is given by

n = z_{α/2}² π(1 − π) / E²

where π is the prior estimate.
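Both estimation formulas in Python (illustrative names; the SD, proportion and margins are
hypothetical):

```python
from math import ceil
from statistics import NormalDist

def n_estimate_mean(sd, margin, conf=0.95):
    """Sample size to estimate a population mean to within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sd / margin) ** 2)

def n_estimate_proportion(p, margin, conf=0.95):
    """Sample size to estimate a population proportion to within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_mean = n_estimate_mean(sd=10, margin=2)           # 97
n_prop = n_estimate_proportion(p=0.5, margin=0.05)  # 385
```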

16 Sample Size Calculation based on Transformations
Most of the statistical testing and corresponding sample size calculation procedures are based
on the normal distribution or some specific distribution of the responses or data. However,
quite often the assumed distribution may not fit the data; changing the scale of the original
data (transformations) and assuming a distribution for the transformed data may provide
a solution. Thus, if the analysis of the data is to be done on transformed data, it is equally
important to base the sample size calculations on the scale of the transformed data.

1. For example, instead of using the risk difference p1 − p2, one could use the odds ratio

OR = p1(1 − p2) / [p2(1 − p1)].

In this case the required sample size to test H0 : OR = 1 versus Ha : OR ≠ 1 is

n = [(z_{α/2} + z_β)² / (log OR)²] × [1/(p1(1 − p1)) + 1/(p2(1 − p2))]
2. The distribution of certain outcomes such as duration of symptoms, cost, etc, is often
skewed, but the log-transformation may normalize the distribution, leading to a log-
normal distribution. Therefore it would be important to perform the sample size
calculations on the transformed scale which would be used for inferences.
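The odds-ratio version of Example 1’s comparison can be sketched as follows (standard
library only; function name illustrative). With the same 50% vs 34% rates it yields a slightly
larger n than the risk-difference formula:

```python
from math import ceil, log
from statistics import NormalDist

def n_per_group_odds_ratio(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for testing OR = 1 on the log-odds scale."""
    z = NormalDist().inv_cdf
    odds_ratio = p1 * (1 - p2) / (p2 * (1 - p1))
    var_terms = 1 / (p1 * (1 - p1)) + 1 / (p2 * (1 - p2))
    n = (z(1 - alpha / 2) + z(power)) ** 2 / log(odds_ratio) ** 2 * var_terms
    return ceil(n)

n = n_per_group_odds_ratio(0.50, 0.34)  # 151, vs 146 from the risk difference
```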

17 Sample Size Calculations and Non-parametric Tests
Use of non-parametric tests is also quite common in statistical analysis of RCT data.

•  Most of theunder
developed statistical proceduresofdiscussed
the assumption normalitysoor
far, including
some those under s, have been
other distribution


•  Non-parametric (also called distribution-free) methods are designed to avoid distribu-
   tional assumptions

•  Advantages of Non-parametric Methods

   1. Fewer assumptions are required (i.e., no distributional assumptions or assumptions
      about equality of variances)
   2. Only nominal (categorical) or ordinal (ranked) data are required, rather than
      numerical (interval) data

•  Disadvantages of Non-parametric Methods

   1. They are less efficient
      (a) less powerful than parametric counterparts
      (b) often lead to overestimation of variances of test statistics when there are large
          proportions of tied observations
   2. They don't lend themselves easily to CIs and sample size calculations
   3. Interpretation of non-parametric results is quite hard

For the latest developments in sample size calculations for nonparametric tests, see Chapter 11
of

•  Chow S-C, Shao J, Wang H. Sample Size Calculations in Clinical Research. Marcel
   Dekker: New York, NY 2003
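Although non-parametric tests do not lend themselves easily to direct sample size formulas, one common rule of thumb (not given in these notes) is to compute the parametric sample size and then inflate it by the asymptotic relative efficiency (ARE) of the rank test. For the Wilcoxon rank-sum test relative to the two-sample t-test, the ARE is 3/π ≈ 0.955 under normality, with 0.864 as the worst-case lower bound:

```python
from math import ceil, pi

def inflate_for_wilcoxon(n_parametric: int, are: float = 3 / pi) -> int:
    """Rule-of-thumb n for a Wilcoxon rank-sum test: divide the t-test sample
    size by the ARE (3/pi under normality; 0.864 in the worst case)."""
    return ceil(n_parametric / are)

# If a two-sample t-test needs 64 per group:
print(inflate_for_wilcoxon(64))             # assuming normality
print(inflate_for_wilcoxon(64, are=0.864))  # fully conservative
```

This is only a planning heuristic; simulation is the more defensible route when the anticipated distribution is known.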

18 Software for Sample Size Calculations
ns
The following article by Len Thomas and Charles J. Krebs provides an in-depth review of
sample size calculation software:

• Thomas L, Krebs CJ. A Review of Statistical power analysis software. Bulletin of the
  Ecological Society of America 1997; 78(2): 126-139.

•  Commercial Software. Many more options are provided by commercial computer
   packages, which include

   1. nQuery Advisor: http://www.statsol.ie/nquery/samplesize.htm
   2. Power and Precision: http://www.power-analysis.com/home.htm
   3. PASS 2002: http://www.ncss.com/pass.html

•  Freeware on the web (User Beware!)




http://www.stat.ucla.edu/~jbond/HTMLPOWER/index.html
http://www.health.ucalgary.ca/~rollin/stats/ssize/
http://www.stat.uiowa.edu/%7Erlenth/Power/index.html
http://www.dssresearch.com/SampleSize/
http://www.stat.ucla.edu/calculators/powercalc/
http://hedwig.mgh.harvard.edu/sample_size/size.html
http://www.bobwheeler.com/stat/SSize/ssize.html
http://www.math.yorku.ca/SCS/Online/power/
http://www.surveysystem.com/sscalc.htm
http://www.researchinfo.com/docs/calculators/samplesize.cfm
http://espse.ed.psu.edu/spsy/Watkins/Watkins3.ssi
http://www.mc.vanderbilt.edu/prevmed/ps/index.htm


References

[1] Chow S-C, Shao J, Wang H. Sample Size Calculations in Clinical Research. Marcel Dekker: New York, NY 2003.

[2] van Belle G. Statistical Rules of Thumb. Wiley: New York, NY 2002.

[3] Dubey SD. Some Thoughts on the One-sided and Two-sided Tests. Journal of Biopharmaceutical Statistics 1991; 1: 139-150.

[4] Moher D, Dulberg CS, Wells GA. Statistical Power, Sample Size, and Their Reporting in Randomized Controlled Trials. JAMA 1994; 272: 122-124.

[5] Altman DG. Statistics and ethics in medical research, III: how large a sample? BMJ 1980; 281: 1336-1338.

[6] Altman DG. Practical Statistics for Medical Research. Chapman and Hall, London, 1991.

[7] Lenth RV. Some Practical Guidelines for Effective Sample Size Determination. American Statistician 2001; 55: 187-93.

[8] Spiegelhalter DJ, Freedman LS, Parmar MKB. Bayesian approaches to randomized trials. In Bayesian Biostatistics. DA Berry, DK Stangl (eds). Marcel Dekker: New York, NY 1996.

[9] Spiegelhalter DJ, Freedman LS. A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion. Stat Med 1986; 5: 1-13.

[10] Halperin M, Lan KKG, Ware JH, Johnson NJ, DeMets DL. An aid to data monitoring in long-term clinical trials. Contr Clin Trials 1982; 3: 311-323.

[11] Lachin JM. Biostatistical Methods. Wiley and Sons, New York, NY 2000.

[12] Lachin JM. Introduction to sample size determination and power analysis for clinical trials. Contr Clin Trials 1981; 1: 13-28.

[13] Fleiss JL. Statistical methods for rates and proportions, 2nd ed. John Wiley & Sons: New York, NY 1981.

[14] Armitage P, Berry G, Matthews JNS. Statistical Methods in Medical Research, 4th ed. Blackwell, Oxford, 2002.

[15] Bland JM, Altman DG. One and two sided tests of significance. BMJ 1994; 309: 248.

[16] Bland M. An Introduction to Medical Statistics, 3rd ed. Oxford University Press, Oxford, 2000.

[17] Elashoff JD. nQuery Advisor Version 4.0 User's Guide. Los Angeles, CA, 2000.


[18] Guthrie E, Kapur N, Mackway-Jones K, Chew-Graham C, Moorey J, Mendel E, Marino-Francis F, Sanderson S, Turpin C, Boddy G, Tomenson B. Randomised controlled trial of brief psychological intervention after deliberate self poisoning. BMJ 2001; 323: 135-138.

[19] Lemeshow S, Hosmer DW, Klar J, Lwanga SK. Adequacy of sample size in health studies. John Wiley & Sons, Chichester, 1996.

[20] Thomas L, Krebs CJ. A Review of Statistical power analysis software. Bulletin of the Ecological Society of America 1997; 78(2): 126-139.

[21] Machin D, Campbell MJ, Fayers P, Pinol A. Statistical Tables for the Design of Clinical Studies, 2nd ed. Blackwell, Oxford, 1998.

[22] Pocock SJ. Clinical Trials: A Practical Approach. John Wiley and Sons, Chichester, 1983.

[23] Thomas M, McKinley RK, Freeman E, Foy C. Prevalence of dysfunctional breathing in patients treated for asthma in primary care: cross sectional survey. BMJ 2001; 322: 1098-1100.

[24] Whitehead J. The Design and Analysis of Sequential Clinical Trials, revised 2nd ed. Wiley, Chichester, 1997.

[25] St. George's Hospital Medical School. Statistics Guide for Research Grant Applicants. Available at http://www.sghms.ac.uk/depts/phs/guide/size.htm#which (Last accessed on September 1, 2003).

[26] Willan AR. Power function arguments in support of an alternative approach for analyzing management trials. Contr Clin Trials 1994; 15: 211-219.

[27] Cassagrande JT, Pike MC, Smith PG. The power function of the "exact" test for comparing two binomial distributions. Applied Statistics 1978; 27: 176-189.

[28] Freedman LS, Lowe D, Macaskill P. Stopping rules for clinical trials incorporating clinical opinion. Biometrics 1984; 40: 575-586.

[29] George SL, Desu MM. Planning the size and duration of a clinical trial studying the time to some critical event. J Chron Dis 1974; 27: 15-24.

[30] Rubinstein LV, Gail MH, Santner TJ. Planning the duration of a comparative clinical trial with loss to follow-up and a period of continued observation. J Chron Dis 1981; 34: 469-479.

[31] Shuster JJ. CRC Handbook of Sample Size Guidelines for Clinical Trials. CRC Press, 1990.

[32] Lachin JM. Introduction to sample size determination and power analysis for clinical trials. Contr Clin Trials 1981; 2: 93-113.
Last JM (Ed.). A Dictionary of Epidemiology, 3rd Edition. New York: Oxford University Press, Inc., 1995. ISBN: 0-19-509668-1.


[33] Aday LA. Chapter 7: Deciding how many will be in the sample. In Designing and Conducting Health Surveys: A Comprehensive Guide, 2nd ed. San Francisco: Jossey-Bass Publishers, 1996.

[34] Gordis L. Epidemiology. Philadelphia: W.B. Saunders Company, 1996.

[35] Fletcher RH, Fletcher SW, Wagner EH. Clinical Epidemiology, The Essentials, 3rd ed. Philadelphia: Williams & Wilkins, 1996.

[36] Campbell M, Grimshaw J, Steen N, for the Changing Professional Practice in Europe Group. Sample size calculations for cluster randomised trials. J Health Serv Res Policy 2000; 5(1): 12-16.

[37] Ray JG, Vermeulen MJ. Sample size estimation for the sorcerer's apprentice. Can Fam Phys 1999; 45: 1999.

[38] Lwanga S, Lemeshow S. Sample Size Determination in Health Studies: a Practical Manual. Geneva, Switzerland: World Health Organization, 1991.

[39] Schulz KF, Grimes DA. Sample size slippages in randomised trials: Exclusions and the lost and wayward. Lancet 2002; 358: 781-5.

[40] Boen JR, Zahn DA. The Human Side of Statistical Consulting. Lifetime Learning Publications, Belmont, CA, 1982.

[41] Borenstein M, Rothstein H, Cohen J. Power and Precision. Biostat, Teaneck, NJ. Software for MS-DOS systems, 1997.

[42] Castelloe J. Sample Size Computations and Power Analysis with the SAS System. In Proceedings of the Twenty-Fifth Annual SAS Users Group International Conference. Cary, NC: SAS Institute, Inc., 2000. Paper 265-25.

[43] Cohen J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Academic Press, New York, 1988.

[44] Desu MM, Raghavarao D. Sample Size Methodology. Academic Press, Boston, 1990.

[45] Elashoff J. nQuery Advisor Release 4.0. Statistical Solutions, Cork, Ireland. Software for MS-DOS systems, 2000.

[46] Freiman JA, Chalmers TC, Smith (Jr) H, Kuebler RR. The Importance of Beta, the Type II Error, and Sample Size in the Design and Interpretation of the Randomized Controlled Trial: Survey of 71 "Negative Trials". In Medical Uses of Statistics, eds. JC Bailar III and F Mosteller, chap. 14, pp. 289-304. NEJM Books, Waltham, Mass., 1986.

[47] Hintze J. PASS 2000. Number Cruncher Statistical Systems, Kaysville, UT. Software for MS-DOS systems, 2000.

[48] Hoenig JM, Heisey DM. The Abuse of Power: The Pervasive Fallacy of Power Calculations in Data Analysis. The American Statistician 2001; 55: 19-24.


[49] Kraemer HC, Thiemann S. How Many Subjects? Statistical Power Analysis in Research. Sage Publications, Newbury Park, CA, 1987.

[50] Lenth RV. Java applets for power and sample size. 2000. Available at http://www.stat.uiowa.edu/~rlenth/Power/ (Last accessed on September 7, 2003).

[51] Lipsey MW. Design Sensitivity: Statistical Power for Experimental Research. Sage Publications, Newbury Park, CA, 1990.

[52] Mace AE. Sample-size determination. Reinhold, New York, 1964.

[53] Muller KE, Benignus VA. Increasing scientific power with statistical power. Neurotoxicology and Teratology 1992; 14: 211-219.

[54] O'Brien RG. UnifyPow.sas Version 98.08.25. Department of Biostatistics and Epidemiology, Cleveland Clinic Foundation, Cleveland, OH, 1998. Available for download from http://www.bio.ri.ccf.org/power.html (Last accessed on September 7, 2003).

[55] Odeh RE, Fox M. Sample Size Choice: Charts for Experiments with Linear Models, 2nd ed. Marcel Dekker, New York, 1991.

[56] Schuirmann D. A compromise test for equivalence of average bioavailability. ASA Proceedings of the Biopharmaceutical Section 1987; 137-142.

[57] Taylor DJ, Muller KE. Computing Confidence Bounds for Power and Sample Size of the General Linear Univariate Model. The American Statistician 1995; 49: 43-47.

[58] Thomas L. Retrospective Power Analysis. Conservation Biology 1997; 11: 276-280.

[59] Thomas L. Statistical power analysis software. 1998. Available at http://www.forestry.ubc.ca/conservation/power/ (Last accessed on September 7, 2003).

[60] Thornley B, Adams C. Content and quality of 2000 controlled trials in schizophrenia over 50 years. BMJ 1998; 317: 1181-1184.

[61] Wheeler RE. Portable Power. Technometrics 1974; 16: 193-201.

[62] Wright TA. A simple algorithm for tighter exact upper confidence bounds with rare attributes in finite universes. Statistics and Probability Letters 1997; 36: 59-67.

[63] Whitley E, Ball J. Statistics review 4: Sample size calculations. Critical Care 2002; 6: 335-341.

[64] Dupont WD, Plummer WD, Jr. Power and Sample Size Calculations: A Review and Computer Program. Controlled Clinical Trials 1990; 11: 116-128.

[65] Dupont WD, Plummer WD, Jr. Power and Sample Size Calculations for Studies Involving Linear Regression. Controlled Clinical Trials 1998; 19: 589-601.

[66] Schoenfeld DA, Richter JR. Nomograms for calculating the number of patients needed for a clinical trial with survival as an endpoint. Biometrics 1982; 38: 163-170.


[67] Pearson ES, Hartley HO. Biometrika Tables for Statisticians, Vol. I, 3rd ed. Cambridge: Cambridge University Press, 1970.

[68] Schlesselman JJ. Case-Control Studies: Design, Conduct, Analysis. New York: Oxford University Press, 1982.

[69] Casagrande JT, Pike MC, Smith PG. An improved approximate formula for calculating sample sizes for comparing two binomial distributions. Biometrics 1978; 34: 483-486.

[70] Dupont WD. Power calculations for matched case-control studies. Biometrics 44: 1157-1168.

[71] Gore SM. Assessing clinical trials: trial size. BMJ 1981; 282: 1687-1689.

[72] Friedman L, Furberg C, DeMets D. Fundamentals of clinical trials, 3rd ed. New York: Springer-Verlag, 1998.

[73] Gebski V, Marschner I, Keech AC. Specifying objectives and outcomes for clinical trials. Med J Aust 2002; 176: 491-492.

[74] Kirby A, Gebski V, Keech AC. Determining the sample size in a clinical trial. MJA 2002; 177(5): 256-257.

[75] Beal SL. Sample Size Determination for Confidence Intervals on the Population Mean and on the Differences Between Two Population Means. Biometrics 1989; 45: 969-977.

[76] Diletti E, Hauschke D, Steinijans VW. Sample Size Determination for Bioequivalence Assessment by Means of Confidence Intervals. International Journal of Clinical Pharmacology, Therapy and Toxicology 1991; 29: 1-8.

[77] O'Brien R, Lohr V. Power Analysis For Linear Models: The Time Has Come. Proceedings of the Ninth Annual SAS Users Group International Conference 1984; 840-846.

[78] Owen DB. A Special Case of a Bivariate Non-central t-distribution. Biometrika 1965; 52: 437-446.

[79] Phillips KF. Power of the Two One-Sided Tests Procedure in Bioequivalence. Journal of Pharmacokinetics and Biopharmaceutics 1990; 18(2): 137-144.

[80] Gordon D, Finch SJ, Nothnagel M, Ott J. Power and Sample Size Calculations for Case-Control Genetic Association Tests when Errors Are Present: Application to Single Nucleotide Polymorphisms. Human Heredity 2002; 54: 22-33.

[81] Kirby A, Gebski V, Keech AC. Determining the Sample Size in a Clinical Trial. MJA 2002; 177: 256-257.

[82] Donner A, Klar N. Design and Analysis of Cluster Randomized Trials in Health Research. Arnold, London, 2000.

[83] Hutton LJ. Are Distinctive Ethical Principles Required for Cluster Randomized Controlled Trials? Statistics in Medicine 2001; 20: 473-488.


[84] Hayes RJ, Bennett S. Simple Sample Size Calculations for Cluster-randomized Trials. Int J Epidemiology 1999; 28: 319-326.

[85] Recruiting Human Subjects: Sample Guidelines for Practice. OEI-01-97-00196, June 2000.

[86] American Psychological Association. Publication manual of the American Psychological Association, 5th ed. Washington, DC: Author, 2001.

[87] Aron A, Aron EN. Statistics for psychology, 2nd ed. Upper Saddle River, NJ: Prentice Hall, 1999.

[88] Cohen J. Statistical power analysis for the behavioral sciences, 2nd ed. Hillsdale, NJ: Erlbaum, 1988.

[89] Cohen J. Things I have learned (so far). American Psychologist 1990; 45: 1304-1312.

[90] Cohen J. A power primer. Psychological Bulletin 1992; 112: 155-159.

[91] Cohen J, Cohen P. Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum, 1975.

[92] Comrey AL, Lee HB. A first course in factor analysis, 2nd ed. Hillsdale, NJ: Erlbaum, 1992.

[93] Green SB. How many subjects does it take to do a regression analysis? Multivariate Behavioral Research 1991; 26: 499-510.

[94] Guadagnoli E, Velicer WF. Relation of sample size to the stability of component patterns. Psychological Bulletin 1988; 103: 265-275.

[95] Harris RJ. A primer of multivariate statistics, 2nd ed. New York: Academic Press, 1985.

[96] Howell DC. Statistical methods for psychology, 4th ed. Belmont, CA: Wadsworth, 1997.

[97] Hoyle EH (Ed.). Statistical strategies for small sample research. Thousand Oaks, CA: Sage, 1999.

[98] Kraemer HC, Thiemann S. How many subjects? Statistical power analysis in research. Newbury Park, CA: Sage, 1987.

[99] Pedhazur EJ, Schmelkin LP. Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Erlbaum, 1991.

[100] Tabachnick BG, Fidell LS. Using multivariate statistics, 3rd ed. New York: HarperCollins, 1996.

[101] Wilkinson L, Task Force on Statistical Inference, APA Board of Scientific Affairs. Statistical methods in psychology journals: Guidelines and explanations. American Psychologist 1999; 54: 594-604.


[102] Wolins L. Research mistakes in the social and behavioral sciences. Ames: Iowa State University Press, 1982.

[103] Shuster JJ. Handbook of Sample Size Guidelines for Clinical Trials. CRC Press, Boca Raton, FL, 1990.

[104] Lemeshow S, Hosmer DW (Jr.), Klar J, Lwanga SK. Adequacy of Sample Size in Health Studies. World Health Organization, Wiley, New York, NY, 1990.

[105] Hair JF, Anderson RE, Tatham RL, Black WC. Multivariate Data Analysis, 5th ed. Prentice Hall, New Jersey, 1998.
