Mixed Models Day 1 - 2023
Course Structure
• Morning Sessions (except today):
o 10:00 – 12:45: theory & practice
• Afternoon Sessions
o 13:30 – 17:00: theory & practice
• Daily Quiz (Days 1-4)
o available from 15:30 each day to 09:50 next day
• Day 1: introduction to multilevel modelling
• Day 2: longitudinal data I (modelling time as continuous)
• Day 3: longitudinal data II (modelling time as categorical)
• Day 4: beyond the linear mixed model
• Day 5: case studies
Course Objectives
• At the end of the course, the student will:
o understand the difference between fixed and random effects;
o be able to choose fixed and random effects based on the research
question and study design;
o know when to apply a mixed model in practice;
o be able to perform mixed model analyses using statistical software (R,
SPSS);
o be able to interpret the output of mixed model analyses in terms of the
context of the research question(s);
o know the most commonly used methods for checking model
appropriateness and model fit;
o be able to report the results of mixed model analyses to non-statistical
investigators.
Overview Day 1: Multilevel Modelling
• Introduction to multilevel data
• Example: multilevel data (children within schools)
• The problem, and some possible solutions
• The mixed model solution
• Adding random effects (random intercept, random slope)
• Adding fixed effects (school- and child-level) to the model
• Interpretation of mixed models
• Summary
Examples of multilevel data
• Effect of school environment on exam results
o Design: hierarchical, where the examination results of a random sample
of students within a random sample of schools are compared
• Influence of race and sex on fetal heartbeat during pregnancy
o Design: repeated measurements at different gestational ages during
pregnancy, where the gestational ages were not the same for all
women
• Multi-center hypertension trial
o Design: hierarchical, with 193 patients in 27 centers, diastolic blood
pressure (DBP) measured 5 times per patient over the course of several weeks
Characteristics of multilevel data
• Hierarchical structure of data
o children within (classrooms within) schools
o patients within centers
o measurements within patients
• Variation at all levels
• “Units” within the same higher-level unit (e.g. children in the same school) are expected to be correlated
• Variables can be measured at different levels
o Level 2:
• type of school (mixed vs. single-gender)
• university vs. community hospital
o Level 1:
• reading ability of child at intake
• gender of patient
Example: London Schools
• Data collected by Goldstein, Rasbash, et al (1993) on 4059 children in
65 schools in Inner London.
• Variables in dataset:
o School ID
o Student ID
o Normalised exam score (outcome variable)
o Standardised London Reading Test (LRT) score
o Student gender
o School gender
o School average of intake score
o Student level Verbal Reasoning (VR) score category at intake
o Category of students’ intake score (averaged)
London Schools
school   # boys   # children     school   # boys   # children
     1       45           73         13       26           64
     2        0           55         14       92          198
     3       29           52         15       47           91
     4       45           79         16        0           88
     5       16           35         17       31          126
     6        0           80         18        0          120
     7        0           88         19       33           55
     8        0          102         20       21           39
     9       21           34         21        0           73
    10       31           50         22       48           90
    11       62           62        ...      ...          ...
    12       23           47        ...      ...          ...
   ...      ...          ...
London Schools
[Figure: all schools; y-axis: normalized exam score]
London Schools:
1. linear regression, aggregated mean exam vs mean LRT
• Disadvantages:
o every school (regardless of sample size) given equal weight
o sample size reduced to N = 65 (schools), although 4059 children were measured
o school-level variables possible, but not child-level variables
o we can only make inference at school level, not child-level
o possibility of “ecological fallacy”
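As a minimal sketch of approach 1, the aggregation step and the school-level regression might look as follows in R. The data frame `sim` and all parameter values are simulated assumptions, not the London dataset; only the column names `school`, `standlrt` and `normexam` follow the course data.

```r
set.seed(1)
n_school <- 30   # hypothetical number of schools
n_child  <- 40   # hypothetical number of children per school
id <- rep(seq_len(n_school), each = n_child)
u0 <- rnorm(n_school, sd = 0.3)                  # school-specific intercept shifts
standlrt <- rnorm(n_school * n_child)
normexam <- u0[id] + 0.55 * standlrt + rnorm(n_school * n_child, sd = 0.75)
sim <- data.frame(school = factor(id), standlrt, normexam)

# Step 1: collapse to one row per school (mean exam score and mean LRT score)
agg <- aggregate(cbind(normexam, standlrt) ~ school, data = sim, FUN = mean)

# Step 2: ordinary linear regression on the school means only
fit_agg <- lm(normexam ~ standlrt, data = agg)

nrow(agg)                       # one row per school, so N equals the number of schools
summary(fit_agg)$coefficients   # slope and SE rest on the school means, not on all children
```

Every row of `agg` counts equally in `lm()`, whether the school contributed few or many children, which is the equal-weight disadvantage listed above.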
London Schools
2. linear regression, all schools together
• Disadvantages:
o inflates the sample size, especially for level-2 variables
• SEs of level-2 variables tend to be underestimated → p-values too small,
CIs too narrow (type I error inflated)
• SEs of level-1 variables may be over- or underestimated
o ignores correlated residuals (correlation of children within schools)
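The underestimated SE for a level-2 variable can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, `schtype` is a hypothetical 0/1 school-level covariate with no true effect, and the school and residual SDs are made-up values.

```r
library(nlme)  # ships with standard R installations

set.seed(2023)
n_school <- 30   # hypothetical number of schools
n_child  <- 40   # hypothetical number of children per school
id <- rep(seq_len(n_school), each = n_child)
schtype <- rep(0:1, length.out = n_school)   # hypothetical school-level covariate
u0 <- rnorm(n_school, sd = 0.3)              # children in one school share u0
sim <- data.frame(school   = factor(id),
                  schtype  = schtype[id],
                  normexam = u0[id] + rnorm(n_school * n_child, sd = 0.75))

# Approach 2: all children together, correlation within schools ignored
fit_lm  <- lm(normexam ~ schtype, data = sim)
# Mixed model with a random intercept per school
fit_lme <- lme(normexam ~ schtype, random = ~ 1 | school, data = sim)

se_lm  <- summary(fit_lm)$coefficients["schtype", "Std. Error"]
se_lme <- summary(fit_lme)$tTable["schtype", "Std.Error"]
c(lm = se_lm, lme = se_lme)   # compare the two standard errors
```

In this setup `lm()` treats all children as independent observations for a variable that only varies between schools, so its SE would be expected to come out smaller than the mixed-model SE.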
[Figure: two residual plots for the linear regression on all schools together; y-axis: residuals]
[Figure: two panels for the linear regression on all schools together; y-axis: normalized exam score, x-axis: standardized London Reading Test score]
London Schools
3. linear regression per school
• Disadvantages:
o 65 different regressions, how to combine the results?
• mean slope: every school has equal weight
• standard error of parameter estimate?
o child-level variables possible, but not school-level variables
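Approach 3 can be sketched with `lmList()` from the nlme package, which fits one regression per group. The data below are simulated under assumed parameter values (they are not the London data), and the unweighted mean slope at the end is only one possible way of combining the results.

```r
library(nlme)  # provides lmList(); ships with standard R installations

set.seed(3)
n_school <- 20   # hypothetical number of schools
n_child  <- 30   # hypothetical number of children per school
id <- rep(seq_len(n_school), each = n_child)
u0 <- rnorm(n_school, sd = 0.3)    # school-specific intercept shifts
u1 <- rnorm(n_school, sd = 0.12)   # school-specific slope shifts
standlrt <- rnorm(n_school * n_child)
normexam <- u0[id] + (0.55 + u1[id]) * standlrt +
  rnorm(n_school * n_child, sd = 0.75)
sim <- data.frame(school = factor(id), standlrt, normexam)

# One separate linear regression per school
fits <- lmList(normexam ~ standlrt | school, data = sim)

slopes <- coef(fits)[, "standlrt"]   # one estimated slope per school
length(slopes)
mean(slopes)   # unweighted mean: every school counts equally
```

The sketch returns a slope for each school but no single standard error for the mean slope, which is the difficulty named in the list above.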
London Schools
4. linear regression, all schools together, with main effects for school and school × LRT interactions
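A minimal sketch of approach 4 on simulated data (assumed sizes and parameter values, not the London data) shows how quickly the number of coefficients grows when school enters as a factor with interactions:

```r
set.seed(4)
n_school <- 10   # hypothetical number of schools
n_child  <- 25   # hypothetical number of children per school
id <- rep(seq_len(n_school), each = n_child)
standlrt <- rnorm(n_school * n_child)
normexam <- rnorm(n_school, sd = 0.3)[id] + 0.55 * standlrt +
  rnorm(n_school * n_child, sd = 0.75)
sim <- data.frame(school = factor(id), standlrt, normexam)

# Main effects for LRT and school, plus a separate LRT slope for each school
fit_int <- lm(normexam ~ standlrt * school, data = sim)

n_coef <- length(coef(fit_int))                        # intercept, slope, dummies, interactions
n_interactions <- sum(grepl(":", names(coef(fit_int))))
c(coefficients = n_coef, interactions = n_interactions)
```

With k schools this model has 2k coefficients: k - 1 school main effects and k - 1 interactions on top of the reference intercept and slope.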
Break & practice
• Please read notes at top of exercises!!
• Exercise 1 (R, SPSS, or both)
• See step-by-step in R if you need more help
London Schools
5. Mixed Models
• Advantages:
o sample size correct, account for correlation of children within schools
• so: correct SE’s/p-values/CI’s
o no need for 64 main effects and interactions
• differences between schools captured by one or more ‘variance components’
o both child-level and school-level variables simultaneously
• so: inference for both children and schools
• interactions between child- and school-level variables possible
o examine variation at different levels
o models work well in presence of missing outcomes (longitudinal)
Mixed Models
• Mixed models made up of
o fixed effects
o random effects
• Sometimes (inaccurately) called “random effects models”
• Also sometimes called “random coefficient” models
• Some variables (or: their coefficients) can be included as both “fixed”
(of interest) and “random” (random variation across the level-2 units)
Mixed Models: what is a “fixed effect”?
• Fixed effect: variable of interest
o overall intercept (not really of interest)
o overall slope for LRT (to help make predictions of exam performance)
o other fixed effects of interest:
• gender (difference between boys and girls?)
• type of school (boys’, girls’, mixed)
• “achievement level” of school
• ...
Mixed Models: what is a “random effect”?
Interlude: some notation
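• level-1 (child) model: $y_{ij} = b_{0i} + b_{1i} \cdot x_{1ij} + \varepsilon_{ij}$
• level-2 (school) model: $b_{0i} = \beta_0 + \upsilon_{0i}$ ; $b_{1i} = \beta_1 + \upsilon_{1i}$
• combine the two: $y_{ij} = \beta_0 + \upsilon_{0i} + \beta_1 \cdot x_{1ij} + \upsilon_{1i} \cdot x_{1ij} + \varepsilon_{ij}$
o rewrite: $y_{ij} = (\beta_0 + \upsilon_{0i}) + (\beta_1 + \upsilon_{1i}) \cdot x_{1ij} + \varepsilon_{ij}$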
Mixed Models: what is a “random effect”?
[Figure: two panels, “Random intercept only” and “Random intercept + random slope”; y-axis: normalized exam score, x-axis: standardized London Reading Test score]
London Schools
Graph per school (“spaghetti plot”):
Mixed Models: the model
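• $y_{ij} = (\beta_0 + \upsilon_{0i}) + (\beta_1 + \upsilon_{1i}) \cdot x_{1ij} + \cdots + \varepsilon_{ij}$
• Where:
o $y_{ij}$: outcome (exam score) for jth child in ith school
o $x_{1ij}$: first explanatory variable (LRT score) at level 1 (jth child in ith
school)
o $\beta_0, \beta_1, \ldots$: regression coefficients for explanatory variables (“fixed
effects”)
o $\upsilon_{0i}$: random effect for the intercept in ith school
o $\upsilon_{1i}$: random effect for the slope (for LRT) in ith school
o $\varepsilon_{ij}$: level-1 residual (jth child in ith school)
• Model assumptions:
o $\varepsilon_{ij} \sim N(0, \sigma_e^2)$ ; $\upsilon_{0i} \sim N(0, \sigma_0^2)$ ; $\upsilon_{1i} \sim N(0, \sigma_1^2)$
o $\varepsilon_{ij}$ independent
o $\mathrm{cov}(\upsilon_{0i}, \upsilon_{1i}) = \sigma_{01}$
o $\mathrm{cov}(\varepsilon_{ij}, \upsilon_{0i}) = \mathrm{cov}(\varepsilon_{ij}, \upsilon_{1i}) = 0$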
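One way to make the model and its assumptions concrete is to simulate data from them. The sketch below does this in base R; all sizes and parameter values are hypothetical choices for illustration, and the object names (`U`, `Sigma`, `sim`) are not from the course material.

```r
set.seed(42)
n_school <- 65; n_child <- 60        # hypothetical numbers of schools and children
beta0 <- 0;     beta1 <- 0.55        # hypothetical fixed intercept and slope
s0 <- 0.30; s1 <- 0.12; se <- 0.75   # hypothetical SDs: intercept, slope, residual
s01 <- 0.5 * s0 * s1                 # covariance of intercept and slope effects (corr 0.5)

# Random effects per school: bivariate normal with mean 0 and covariance Sigma.
# If Z has independent N(0, 1) entries, the rows of Z %*% chol(Sigma) have covariance Sigma.
Sigma <- matrix(c(s0^2, s01, s01, s1^2), nrow = 2)
U <- matrix(rnorm(2 * n_school), ncol = 2) %*% chol(Sigma)

id  <- rep(seq_len(n_school), each = n_child)   # school index i for every child
x   <- rnorm(n_school * n_child)                # level-1 explanatory variable
eps <- rnorm(n_school * n_child, sd = se)       # independent level-1 residuals

# y_ij = (beta0 + u_0i) + (beta1 + u_1i) * x_1ij + e_ij
y <- (beta0 + U[id, 1]) + (beta1 + U[id, 2]) * x + eps
sim <- data.frame(school = factor(id), standlrt = x, normexam = y)

cov(U)   # sample covariance of the simulated random effects, to compare with Sigma
```

The residuals are drawn independently of `U`, which corresponds to the last assumption in the list (zero covariance between level-1 residuals and random effects).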
Mixed models in R
Two packages used in this course
• Package nlme
o lme() for Gaussian models
o gls() function for models with correlated errors (day 2)
o approximate (Wald) CI’s via intervals() function in same package
• Package lme4
o lmer() for Gaussian models
o glmer() for generalized linear mixed models (day 4)
o “profile likelihood” CI’s via confint()
Watch out! R gives the standard deviation of the random effects, not
the variance. Var(rand int) = 0.3035² = 0.092; res var = 0.7521² = 0.566
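A small sketch of the SD-to-variance conversion, using `VarCorr()` from nlme on a random-intercept model. The data are simulated under assumed values (not the London data), so the numbers it prints will differ from those on the slides; only the last line uses a value quoted above.

```r
library(nlme)  # ships with standard R installations

set.seed(7)
n_school <- 40   # hypothetical number of schools
n_child  <- 30   # hypothetical number of children per school
id <- rep(seq_len(n_school), each = n_child)
standlrt <- rnorm(n_school * n_child)
normexam <- rnorm(n_school, sd = 0.3)[id] + 0.55 * standlrt +
  rnorm(n_school * n_child, sd = 0.75)
sim <- data.frame(school = factor(id), standlrt, normexam)

fit_ri <- lme(normexam ~ standlrt, random = ~ 1 | school, data = sim)

vc   <- VarCorr(fit_ri)                 # character matrix: Variance and StdDev columns
sds  <- as.numeric(vc[, "StdDev"])      # what summary() reports
vars <- as.numeric(vc[, "Variance"])    # the variance components themselves
cbind(SD = sds, SD_squared = sds^2, Variance = vars)

0.3035^2   # 0.09211225, i.e. the 0.092 quoted for the random intercept
```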
London Schools: mixed model
random intercept only
Fitted model
London Schools: mixed model
random intercept + random slope
Random effects:
 Formula: ~standlrt | school
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr
(Intercept) 0.3007313 (Intr)
standlrt    0.1205753 0.497
Residual    0.7440777
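Output of this form comes from an `lme()` call with a random intercept and a random slope. The sketch below fits such a model to simulated data; the sizes and parameter values are assumptions loosely modelled on the estimates above, the object names are hypothetical, and the printed numbers will not match the slide.

```r
library(nlme)  # ships with standard R installations

set.seed(11)
n_school <- 65; n_child <- 60        # hypothetical sizes
s0 <- 0.30; s1 <- 0.12; rho <- 0.5   # hypothetical SDs and correlation of the random effects
id <- rep(seq_len(n_school), each = n_child)
u0 <- rnorm(n_school, sd = s0)
# Build u1 so that sd(u1) = s1 and cor(u0, u1) = rho in the population
u1 <- rho * (s1 / s0) * u0 + sqrt(1 - rho^2) * rnorm(n_school, sd = s1)
standlrt <- rnorm(n_school * n_child)
normexam <- u0[id] + (0.55 + u1[id]) * standlrt +
  rnorm(n_school * n_child, sd = 0.75)
sim <- data.frame(school = factor(id), standlrt, normexam)

# Random intercept and random slope for standlrt, both varying across schools
fit_rs <- lme(normexam ~ standlrt, random = ~ standlrt | school,
              data = sim, method = "ML")

VarCorr(fit_rs)                            # SDs of the random effects and their correlation
ci_fixed <- intervals(fit_rs, which = "fixed")
ci_fixed                                   # approximate (Wald) CIs for the fixed effects
```

The `Corr` column in the output is the estimated correlation between the school-specific intercept and slope deviations.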
Fitted model
Break & practice
• Exercise 2 (R, SPSS, or both)
London Schools: comparing right & wrong models
Model                                             overall/fixed slope LRT   s.e.
1. aggregated data                                0.884                     0.116
2. disaggregated data                             0.595                     0.013
3. regr. per school                               0.425                     ??
4. school*LRT interactions                        ??                        ??
5a. mixed model (random intercept)                0.563                     0.012
5b. mixed model (random int + random slope LRT)   0.557                     0.020
London Schools data
Aside: coding of categorical variables
Random effects:
 Formula: ~standlrt | school
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr
(Intercept) 0.2936242 (Intr)
standlrt    0.1212575 0.533
Residual    0.7416710
London Schools:
adding a child-level covariate
London Schools
adding (fixed) school-level covariates
> library(nlme)
> sch.lme.4 <- lme(normexam ~ standlrt + factor(gender) + factor(schgend) + factor(schav),
+                  random = ~ standlrt | school, data = london, method = "ML")
> summary(sch.lme.4)
Random effects:
 Formula: ~standlrt | school
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr
(Intercept) 0.2660309 (Intr)
standlrt    0.1212542 0.499
Residual    0.7417279
London Schools:
Adding child- and school-level covariates
Effect                        estimate   se      p
Fixed effects
Intercept                     -0.265     0.082   0.0012
norm. LRT                      0.552     0.020   < 0.0005
girls (vs. boys)               0.167     0.034   < 0.0005
school avg: low (ref)                            0.100
school avg: mid                0.067     0.085   0.436
school avg: high               0.174     0.099   0.083
school gender: mixed (ref)
school gender: boys            0.187     0.098   0.061
school gender: girls           0.157     0.078   0.048
(Co)variance parameters
school intercept               0.266²
school slope                   0.121²
corr int-slope                 0.499
residual variance              0.742²
London Schools: conclusions (so far)
• The reading score is a significant predictor of exam score
o for every 1 SD higher on reading score, average increase of 0.552 SD
on exam score
• Girls do significantly better than boys on exam
o girls score, on average, 0.167 SD higher on exam than boys
• School “level” (average intake score) does not appear to be predictive
of exam score
• School gender may be predictive
o average exam score at girls’ schools is 0.157 SD higher than at mixed
schools
o average exam score at boys’ schools is 0.187 SD higher than at mixed
schools
• Note: these conclusions are based on the “Wald” p-values and are
not necessarily to be trusted!
• Because the LRT score has been centered, the estimate for the
intercept (-0.265) is the estimated average (normalized) exam score
for:
o a boy (ref) with
o avg LRT score from
o a school with low average score (ref) and
o mixed school (ref)
• The residual variance is 0.550, much larger than the variances for the
random intercept (0.071) and random slope (0.015), indicating more
variation within schools than between.
• Adding child- and school-level covariates explains some of the
variance between schools (SD of the random intercept: 0.31 → 0.27)
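The variances quoted here follow from squaring the standard deviations in the R output for sch.lme.4. As a worked check, with one added step that is an illustration rather than part of the slides: the last line gives the share of the total variance that lies between schools for a child with average LRT score (standlrt = 0), often called the intraclass correlation. With a random slope this share changes with the LRT score, so the figure holds only at that reference value.

```latex
\begin{aligned}
\hat\sigma_0^2 &= 0.2660309^2 \approx 0.071 \\
\hat\sigma_1^2 &= 0.1212542^2 \approx 0.015 \\
\hat\sigma_e^2 &= 0.7417279^2 \approx 0.550 \\
\frac{\hat\sigma_0^2}{\hat\sigma_0^2 + \hat\sigma_e^2}
  &\approx \frac{0.071}{0.071 + 0.550} \approx 0.11
\end{aligned}
```

Roughly a tenth of the unexplained variation at average LRT is between schools, consistent with the statement that there is more variation within schools than between.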
London Schools: still to do
• We’ve made model assumptions, need to check them!
o distribution of residuals
o distribution of random effects (?)
• How to choose among models?
• How to answer subquestion (does gender of school have influence
on effect of gender of pupil?)
Multilevel modelling, summary
• Account for correlation of measurements at different levels
o children within schools, measurements within patients
• Allow us to include variables measured at different levels
o child’s gender, school’s achievement or SES level
• We can model variation at different levels
o more variation within than between schools
• Longitudinal data is a specific example of multi-level data
o days 2 & 3: models for longitudinal data
• How to build models, check assumptions?
o days 2 & 3: model building (day 2), model checking (day 3)
• Outcomes don’t have to be continuous
o day 4: models for Poisson, binomial and survival data