EstimatingPowerSampleSize ATrickey
[Pye, 2016]
Topics
• Statistical Power
• Components
• Assumptions
2. Outcome
• Response / Dependent variable (DV)
• Occurring after predictors
[Diagram: Cause → Effect; 1° IV → DV; Exposure → Outcome]
3. Confounders
• Related to both outcome and exposure
• Must be taken into account for internal validity
Variable Measurement Scales
Type of Measurement | Characteristics | Examples | Descriptive Stats | Information Content
Continuous | Ranked spectrum; quantifiable | Weight, BMI | Mean (SD) + all below | Highest
Discrete | Ordered intervals | Number of cigs / day | Mean (SD) + all below | High
Categorical Ordinal (Polychotomous) | Ordered categories | ASA Physical Status Classification | Median | Intermediate
Categorical Nominal (Polychotomous) | Unordered categories | Blood Type, Facility | Counts, Proportions | Lower
Categorical Binary (Dichotomous) | Two categories | Sex (M/F), Obese (Y/N) | Counts, Proportions | Low
[Hulley 2007]
Measures of Central Tendency
1. Mean = average
• Continuous, normal distribution
2. Median = middle
• Continuous, non-normal (skewed) distribution
3. Mode = most common
• Categorical
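The three measures above can be computed with Python's standard library. This is a minimal sketch; the length-of-stay and blood-type values are hypothetical and were invented only to illustrate why the median is preferred for skewed data.

```python
import statistics

# Hypothetical data, invented for illustration only.
length_of_stay_days = [2, 3, 3, 4, 5, 6, 21]        # continuous, right-skewed
blood_types = ["O", "A", "O", "B", "A", "O", "AB"]  # nominal categorical

mean_los = statistics.mean(length_of_stay_days)      # pulled upward by the outlier
median_los = statistics.median(length_of_stay_days)  # robust to the outlier
mode_type = statistics.mode(blood_types)             # most common category

print(f"mean = {mean_los:.2f} days")
print(f"median = {median_los} days")
print(f"mode = {mode_type}")
```

With one long stay in the data, the mean sits above the median, which is why the median is the usual summary for skewed continuous variables.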
Variability
Box Plot Components: [figure]
[Figure: distribution of the test statistic, centered at 0. One-tailed test: evaluate association in one direction.]
P-value Definition
Pitfalls:
• The statistical significance of the effect does not explain the size
of the effect
• Report descriptive statistics with p-values (N, %, means, SD, etc.)
• STATISTICAL significance does not equal CLINICAL significance
• P is not truly yes/no, all or none, but is actually a continuum
• P is highly dependent on sample size
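The last pitfall can be illustrated with a short sketch. It assumes a two-sample z-test with equal group sizes and a known common SD, which is a simplification chosen for illustration; the mean difference of 2 and SD of 10 are hypothetical values.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a two-sample z-test (equal n, known common SD)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed effect (difference of 2, SD of 10), different sample sizes.
for n in (20, 100, 500):
    p = two_sided_p(mean_diff=2.0, sd=10.0, n_per_group=n)
    print(f"n per group = {n:3d}  p = {p:.4f}")
```

The effect size is identical in every row; only the sample size changes, and the p-value shrinks as n grows. This is why descriptive statistics should be reported alongside p-values.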
Which Statistical Test?
1. Number of IVs
2. IV Measurement Scale
3. Independent vs. Matched Groups
4. DV Measurement Scale
Common Regression Models
Outcome Variable | Appropriate Model | Regression Coefficient
Continuous | Linear Regression | Slope (β): How much the outcome increases for every 1-unit increase in the predictor
Binary / Categorical | Logistic Regression | Odds Ratio (OR): How much the odds of the outcome increase for every 1-unit increase in the predictor
Time-to-Event | Cox Proportional-Hazards Regression | Hazard Ratio (HR): How much the rate of the outcome increases for every 1-unit increase in the predictor
Count | Poisson Regression or Negative Binomial Regression | Incidence Rate Ratio (IRR): How much the rate of the outcome increases for every 1-unit increase in the predictor
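Logistic, Cox, and Poisson models are usually fit on the log scale, so statistical software typically reports a coefficient β whose exponent is the OR, HR, or IRR. The sketch below shows that conversion; the helper name `ratio_from_coefficient` and the coefficient value are hypothetical, chosen only for illustration.

```python
from math import exp, log


def ratio_from_coefficient(beta, units=1.0):
    """Ratio (OR, HR, or IRR) implied by a log-scale coefficient
    for a `units`-unit increase in the predictor."""
    return exp(beta * units)


# Hypothetical logistic regression coefficient for a predictor.
beta = log(1.5)

print(ratio_from_coefficient(beta))           # ratio for a 1-unit increase
print(ratio_from_coefficient(beta, units=2))  # ratio for a 2-unit increase
```

Ratios are multiplicative: a 2-unit increase multiplies the odds by the 1-unit ratio squared, not doubled.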
Hierarchical / Mixed Effects Models
Correlated Data
• Repeated measures over time
• Multiple related outcomes
Can handle
• Missing data
• Nonuniform measures
Nested Data
• Grouping of subjects
• Level 3: Hospitals
• Level 2: Surgeons
• Outcome of interest
• Study design
• Effect Size
• Allocation ratio between groups
• Population variability
• Alpha (significance level, i.e. the p-value threshold; typically 0.05)
• Beta (1 - power; typically 0.1-0.2)
• 1- vs. 2-tailed test
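To show how these components interact, the sketch below approximates power for a comparison of two means. It assumes a normal approximation and a standardized effect size (expected difference divided by the population SD); the function name `approx_power` and the example inputs are hypothetical. Dedicated software or a biostatistician should be used for a real study.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n1, effect_size, alpha=0.05, allocation_ratio=1.0, two_tailed=True):
    """Approximate power for comparing two means (normal approximation).

    n1: size of group 1; group 2 has allocation_ratio * n1 subjects.
    effect_size: expected mean difference divided by the SD.
    """
    z = NormalDist()
    n2 = allocation_ratio * n1
    # Standard error of the standardized difference between group means.
    standard_error = sqrt(1 / n1 + 1 / n2)
    z_crit = z.inv_cdf(1 - alpha / 2) if two_tailed else z.inv_cdf(1 - alpha)
    # Probability that the test statistic exceeds the critical value in the
    # expected direction (the negligible opposite tail is ignored).
    return 1 - z.cdf(z_crit - effect_size / standard_error)


print(f"2-tailed, 1:1, n=64/group: {approx_power(64, 0.5):.2f}")
print(f"1-tailed, 1:1, n=64/group: {approx_power(64, 0.5, two_tailed=False):.2f}")
print(f"2-tailed, 1:3, n=32 + 96:  {approx_power(32, 0.5, allocation_ratio=3):.2f}")
```

With the total sample held at 128, the unequal 1:3 allocation yields less power than the 1:1 allocation, and a 1-tailed test yields more power than a 2-tailed test at the same alpha.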
Effect Size
• Cohen’s d: comparison between two means
• d = (m1 – m2) / pooled SD
• Small d=0.2; Medium d=0.5; Large d=0.8
• Expected values per group (e.g. complications: 10% open vs.
3% laparoscopic)
• Minimal clinically important difference (e.g. 10% improvement)
• What is the MCID that would lead a clinician to change his/her practice?
• Inverse relationship with sample size
• ↑ effect size, ↓ sample size
• ↓ effect size, ↑ sample size
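The inverse relationship between effect size and sample size can be sketched as follows. The formulas are standard normal-approximation sample size formulas for a two-sided test with equal group sizes, not necessarily the method used by the presenter; the function names and the defaults of alpha = 0.05 and power = 0.80 are assumptions made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(group1, group2):
    """Cohen's d = (m1 - m2) / pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd


def n_per_group_for_d(d, alpha=0.05, power=0.80):
    """Per-group n to detect a standardized mean difference d."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * z_sum**2 / d**2)


def n_per_group_for_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n to detect a difference between two expected proportions."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z_sum**2 * variance / (p1 - p2) ** 2)


# Small, medium, and large effects require very different sample sizes.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: n per group = {n_per_group_for_d(d)}")

# Expected complication rates from the slide: 10% open vs. 3% laparoscopic.
print("10% vs. 3%: n per group =", n_per_group_for_proportions(0.10, 0.03))
```

Because n scales with 1/d², halving the expected effect size roughly quadruples the required sample size.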
Confidence Interval-Based Power
• How precisely can you estimate
your measure of interest?
• Examples
• Diagnostic tests: Sensitivity / Specificity
• Care utilization rates
• Treatment adherence rates
• Calculation components
• N
• Variability
• α level
• Expected outcomes
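For a single proportion such as sensitivity or an adherence rate, the calculation components above can be combined as in the sketch below. It assumes the normal-approximation (Wald) confidence interval, which is only one of several interval methods; the function names and the expected sensitivity of 90% are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def ci_half_width(p, n, alpha=0.05):
    """Half-width of the Wald confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sqrt(p * (1 - p) / n)


def n_for_half_width(p, half_width, alpha=0.05):
    """Smallest n giving a Wald interval no wider than +/- half_width."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 * p * (1 - p) / half_width**2)


# Hypothetical goal: estimate an expected sensitivity of 90% to within +/- 5 points.
n = n_for_half_width(p=0.90, half_width=0.05)
print("required n:", n)
print(f"achieved half-width: {ci_half_width(0.90, n):.4f}")
```

Here N, variability (through p(1 - p)), the α level, and the expected outcome each enter the formula directly, matching the components listed above.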
Rule of Thumb Power Calculations
• Simulation studies
• Degrees of freedom (df) estimates
• df: the number of IV factors that can vary in your regression model
• Multiple linear regression: ~15 observations per df
• Multiple logistic regression: df = # events/15
• Cox regression: df = # events/15
• Best used with other hypothesis-based or confidence
interval-based methods
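The degrees-of-freedom rules of thumb above reduce to simple arithmetic. The sketch below applies the 15-per-df figures from the slide; the function names and the example cohort of 300 patients with 45 events are hypothetical.

```python
def max_df_linear(n_observations, per_df=15):
    """Multiple linear regression: about 15 observations per df."""
    return n_observations // per_df


def max_df_events(n_events, per_df=15):
    """Multiple logistic or Cox regression: df = number of events / 15."""
    return n_events // per_df


# Hypothetical cohort: 300 patients, 45 of whom experience the outcome.
n_patients, n_events = 300, 45
print("linear regression df:", max_df_linear(n_patients))
print("logistic / Cox df:   ", max_df_events(n_events))
```

For binary and time-to-event outcomes the number of events, not the number of patients, limits how many predictors the model can support.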
Collaboration with Biostatisticians
Biostatistics Collaboration
• 2001 Survey of BMJ & Annals of Internal Medicine re: statistical
and methodological collaboration
• Stats/methodological support – how often?
• Biostatistician 53%
• Epidemiologist 32%
• Authorship outcomes given significant contribution
• Biostatisticians 78%
• Epidemiologists 96%
• Publication outcomes
• Studies w/o methodological assistance more likely to be
rejected w/o review: 71% vs. 57%, p=0.001
[Altman, 2002]
Questions from your Biostatistician
• What is the research question?
• What is the study design?
• What effect do you expect to observe?
• What other variables may affect your results?
• How many patients are realistic?
• Do you have repeated measures per individual/analysis unit?
• What are your expected consent and follow-up completion rates?
• Do you have preliminary data?
• Previous studies / pilot data
• Published literature
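One reason a biostatistician asks about consent and follow-up completion rates is that the calculated sample size is the number of patients with complete data, not the number approached. A minimal sketch of that adjustment follows, assuming the two rates act independently; the function name and all numbers are hypothetical.

```python
from math import ceil


def patients_to_approach(n_required, consent_rate, followup_rate):
    """Patients to approach so that n_required complete the study."""
    if not (0 < consent_rate <= 1 and 0 < followup_rate <= 1):
        raise ValueError("rates must be in the interval (0, 1]")
    return ceil(n_required / (consent_rate * followup_rate))


# Hypothetical study: 126 completers needed, 60% consent, 85% complete follow-up.
print(patients_to_approach(126, consent_rate=0.60, followup_rate=0.85))
```

Comparing this figure with the number of realistically available patients is one way to judge feasibility early.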
Stages of Power Calculation
[Flowchart: Study Design; Hypothesis; Similar Literature; Simulation / Rules of Thumb; Sample Size; Feasible?; Important?; Other Considerations?]
[Pye, 2016]
Statistical Power Tips
• Seek biostatistician feedback early
• Calculations take time and typically a few iterations
• Without pilot data, it is helpful to identify previous research with
similar methods
• If absolutely no information is available from a reasonable comparison study,
you can estimate power from the minimal clinically important difference*
*[Revicki, 2008]
Authorship
International Committee of Medical Journal Editors (ICMJE) rules:
All authors must have…
1. Substantial contributions to the conception or design of the work; or the
acquisition, analysis, or interpretation of data for the work; AND
2. Drafting the work or revising it critically for important intellectual content; AND
3. Final approval of the version to be published; AND
4. Agreement to be accountable for all aspects of the work in ensuring that
questions related to the accuracy or integrity of any part of the work are
appropriately investigated and resolved.
• Epidemiologists/Biostatisticians typically qualify for authorship
• Sometimes an acknowledgement is appropriate
• Must be discussed
Consultation
S-SPIRE Biostatisticians
atrickey@stanford.edu
(650) 725-7003
References
1. Farrokhyar F, Reddy D, Poolman RW, Bhandari M. Practical Tips for Surgical Research: Why perform a priori sample size calculation? Canadian Journal of Surgery. 2013 Jun;56(3):207.
2. Revicki D, Hays RD, Cella D, Sloan J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. Journal of Clinical Epidemiology. 2008 Feb 1;61(2):102-9.
3. Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB. Designing Clinical Research. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2007.
4. Gordis L. Epidemiology. Philadelphia: Elsevier/Saunders; 2014.
5. Altman DG, Goodman SN, Schroter S. How statistical expertise is used in medical research. JAMA. 2002 Jun 5;287(21):2817-20.
6. Pye V, Taylor N, Clay-Williams R, Braithwaite J. When is enough, enough? Understanding and solving your sample size problems in health services research. BMC Research Notes. 2016 Dec;9(1):90.