Sample Size Calculation: Research Week 2014 Health Cluster UKM
Introduction
• This is an abbreviated set of slides on how to calculate sample size.
• It focuses on studies that involve
– Measuring prevalence/incidence of an outcome
– A qualitative outcome (e.g. dead vs alive)
– A continuous outcome (e.g. drop in BP in mm Hg)
for study designs commonly used in PPUKM.
• Those who want the complete set of slides for calculating sample size, please refer to the next page.
© drtamil@gmail.com, 2014
Measuring prevalence
Prevalence
• If the objective of your study is to measure the prevalence of the outcome of interest, you will be conducting a cross-sectional study on a sample of your population. The sample size depends on the expected prevalence and on the precision you require.
• To estimate the expected prevalence, review the literature, ideally studies of populations similar to your own.
• Prevalence = P = 20%
• Absolute precision required = 5 percentage points (this means that if the calculated prevalence of obesity is 20%, the true prevalence is estimated to lie between 15% and 25%, with 95% confidence).
Calculate Manually
• $n = Z_{1-\alpha/2}^2 \, P(1-P) / D^2$, where
• $Z_{1-\alpha/2} = Z_{0.975} = 1.96$ (from the normal distribution table; 1.96 is the standard value for a 95% CI).
• P = 20% = 0.2 in this example
• D = 5% = 0.05 in this example
• $n = 1.96^2 \times 0.2(1-0.2) / 0.05^2 = 245.86$
• So the sample size required is 246 (rounded up).
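The manual calculation above can be checked with a few lines of Python. This is a minimal sketch: the function name `prevalence_sample_size` is made up for illustration, and z is fixed at 1.96 as on the slide.

```python
import math


def prevalence_sample_size(p, d, z=1.96):
    """Sample size to estimate a prevalence p with absolute precision d.

    Uses n = z^2 * p * (1 - p) / d^2 and rounds up to a whole subject.
    """
    n = z ** 2 * p * (1 - p) / d ** 2
    return math.ceil(n)


# Worked example from the slide: P = 20%, D = 5 percentage points
print(prevalence_sample_size(p=0.20, d=0.05))  # 246
```

Rounding up rather than to the nearest integer ensures the target precision is not undershot.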
Alternative to calculation
http://www.palmx.org/samplesize/Calc_Samplesize.xls
Reminder
• If the prevalence of the outcome of interest is less than 5%, you should not be doing a cross-sectional study; you should be doing a case-control study instead.
• If your supervisor still insists on a cross-sectional study, then the absolute precision should be set at half of the prevalence. For example, if the prevalence of HIV among STD patients is 4%, the precision (d) must be set at 2%. The required sample size is then 369, not the 59 obtained with d = 5%.
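The two figures quoted in this reminder can be reproduced with the same prevalence formula, n = z² P(1 − P) / d². A rough sketch, assuming z = 1.96 and a helper name (`prevalence_n`) invented for illustration:

```python
def prevalence_n(p, d, z=1.96):
    """Unrounded sample size for estimating prevalence p with precision d."""
    return z ** 2 * p * (1 - p) / d ** 2


p = 0.04                             # HIV prevalence among STD patients
n_half = prevalence_n(p, d=p / 2)    # precision set to half the prevalence (2%)
n_wide = prevalence_n(p, d=0.05)     # default 5-point precision, wider than the prevalence itself

print(round(n_half), round(n_wide))  # 369 59
```

With d = 5% the interval would run from below zero to 9%, which is why the smaller sample is not informative for a 4% prevalence. The second value is about 59.007 before rounding, so a strict round-up would give 60.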
Dichotomous Qualitative Outcome
Calculate sample size
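X-sectional vs cohort vs case control vs clinical trial
[Figure: four study-design diagrams]
• X-sectional: sample classified as RF+ or RF-, each split into D+ and D-; ratio not usually 1:1.
• Cohort: RF+ and RF- groups, each split into D+ and D-; ratio usually 1:1.
• Case-Control: D+ and D- groups, each split into RF+ and RF-; ratio usually 1:1.
• Clinical Trial: T+ and T- groups, each split into C+ and C-; ratio usually 1:1.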
Sample ratio (1:1)
• Overweight: DM + (32%), DM - (68%)
• Normal: DM + (7%), DM - (93%)
Rifas-Shiman SL et al. 2008. Diabetes and lipid screening among patients in primary care: A cohort study. BMC Health Services Research.
Calculate Manually
Calculate using these formulas (Fleiss JL. 1981. pp. 44-45)
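The formulas themselves did not survive on this slide, so the sketch below assumes the commonly cited Fleiss method for comparing two proportions with equal group sizes: an uncorrected sample size followed by a continuity correction. The function name is hypothetical. The inputs are the cohort example: DM in 32% of the overweight group and 7% of the normal-weight group, α = 0.05 (two-sided), power = 0.8.

```python
import math
from statistics import NormalDist


def fleiss_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Return (uncorrected, continuity-corrected) n per group, unrounded.

    Assumed formula (equal group sizes):
      n' = [z_a*sqrt(2*pbar*qbar) + z_b*sqrt(p1*q1 + p2*q2)]^2 / (p1 - p2)^2
      n  = n'/4 * [1 + sqrt(1 + 4 / (n' * |p1 - p2|))]^2
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # about 0.84 for power = 0.8
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n_plain = (
        z_a * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / diff ** 2
    n_corrected = n_plain / 4 * (1 + math.sqrt(1 + 4 / (n_plain * diff))) ** 2
    return n_plain, n_corrected


n_plain, n_corrected = fleiss_two_proportions(0.32, 0.07)
print(round(n_plain, 1), math.ceil(n_corrected))  # 38.2 46
```

Under these assumptions the continuity-corrected figure rounds up to 46 per group, while the uncorrected figure is about 38.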
Alternative to calculation
http://www.palmx.org/samplesize/Calc_Samplesize.xls
So you’ll need a sample size of 46 each for both groups. Total of 92.
Or use PS2
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
• So the sample size required for each group is 38. Total of 76.
• Excel = 92 vs PS2 = 76.
• The difference arises because the two tools use different formulas (PS2 here uses an uncorrected chi-squared test).
PS2
We are planning a study of independent cases and controls
with 1 control(s) per case. Prior data indicate that the
failure rate (DM) among controls (normal weight) is 0.07.
If the true failure rate (DM) for experimental (overweight)
subjects is 0.32, we will need to study 38 experimental
(overweight) subjects and 38 control (normal weight)
subjects to be able to reject the null hypothesis that the
failure rates (DM) for experimental (overweight) and
control (normal weight) subjects are equal with
probability (power) 0.8. The Type I error probability
associated with this test of this null hypothesis is 0.05. We
will use an uncorrected chi-squared statistic to evaluate
this null hypothesis.
• $n = 1 + 2C\,(s/d)^2$ per group, where
• s = standard deviation,
• d = the difference to be detected, and
• C = constant that depends on α and power; if α = 0.05 and 1-β = 0.8, then C = 7.85.
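The table of constants is not reproduced here. As a sketch, the values can be regenerated by assuming C = (z for 1-α/2 + z for 1-β)², which gives 7.85 for α = 0.05 and power 0.8. The function name `constant_c` is invented for illustration.

```python
from statistics import NormalDist


def constant_c(alpha, power):
    """C = (z_{1-alpha/2} + z_{power})^2 for a two-sided test (assumed definition)."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) ** 2


# Print C for a few common combinations of alpha and power
for alpha in (0.05, 0.01):
    for power in (0.80, 0.90):
        print(f"alpha={alpha}, power={power}: C = {constant_c(alpha, power):.2f}")
```

The first line printed is the C = 7.85 used in the worked example.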
Manual Calculation
• d = 10 mm Hg
• s = 20 mm Hg
$n = 1 + 2 \times 7.85 \times (20/10)^2 = 63.8$, rounded up to 64 per group
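The same arithmetic as a short Python sketch. The function name `n_per_group` is illustrative, and C defaults to the 7.85 used above (α = 0.05, power = 0.8).

```python
import math


def n_per_group(s, d, c=7.85):
    """Sample size per group for comparing two means: n = 1 + 2*C*(s/d)^2."""
    return math.ceil(1 + 2 * c * (s / d) ** 2)


# Worked example: detect a 10 mm Hg difference when the SD is 20 mm Hg
print(n_per_group(s=20, d=10))  # 64
```

Because n depends on (s/d)², halving the difference to be detected roughly quadruples the required sample.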
Alternative to table
http://www.palmx.org/samplesize/Calc_Samplesize.xls
The standardised difference: d/s = 10 mm Hg / 20 mm Hg = 0.5
Or you can use PS2
• We still end up with the same answer.
PS2
• We are planning a study of a continuous response
variable from independent control (placebo) and
experimental (treatment) subjects with 1 control(s)
per experimental subject. In a previous study the
response within each subject group was normally
distributed with standard deviation 20. If the true
difference in the experimental and control means is
10 (mm Hg), we will need to study 64 experimental
subjects and 64 control subjects to be able to reject
the null hypothesis that the population means of the
experimental and control groups are equal with
probability (power) 0.8. The Type I error probability
associated with this test of this null hypothesis is
0.05.
• $n = 1 + C\,(s/d)^2$, where
• s = standard deviation,
• d = the difference to be detected, and
• C = constant that depends on α and power; if α = 0.05 and 1-β = 0.8, then C = 7.85.
Manual Calculation
• d = 10 mm Hg
• s = 20 mm Hg
$n = 1 + 7.85 \times (20/10)^2 = 32.4$, rounded up to 33
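A brief sketch of this calculation, with the formula read off the arithmetic on this slide, n = 1 + C(s/d)². The slide does not say which study design it applies to, so none is assumed here. The function name is hypothetical.

```python
import math


def n_from_constant(s, d, c=7.85):
    """n = 1 + C*(s/d)^2, rounded up; C = 7.85 for alpha = 0.05, power = 0.8."""
    return math.ceil(1 + c * (s / d) ** 2)


print(n_from_constant(s=20, d=10))  # 33

# How the requirement changes with the difference to be detected (SD fixed at 20 mm Hg)
for d in (5, 10, 20):
    print(d, n_from_constant(s=20, d=d))
```

The loop shows that smaller differences are much more expensive to detect than larger ones.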