Categorical Data Analysis
Geert Molenberghs
Ariel Alonso Abad
Fabián Santiago Tibaldi
Contents (excerpt)

1 Reading                                             1
3.2 Questions                                        11
3.3.1 Dropout                                        13
4.2 Taxonomy                                         24
4.6 Notation                                         31
4.13.2 Empirical Bayes estimates b̂i                 71
4.13.3 Shrinkage estimators b̂i                      73
5.1 Introduction                                     75
5.2 Questions                                        77
16.20 Example                                       363
16.21 Literature                                    364
Reading
University Press.
CHAPTER 2. GENERALIZED LINEAR MODELS 2
1. E(Yi) = µi
2. η(µi) = xi'β, with η(·) the link function
3. Var(Yi) = φ v(µi), where
• v(·) is a known variance function
• φ is a scale (overdispersion) parameter
Summary

Yi ∼ N(xi'β, σ²)

with

• v(µ) = 1
• φ = σ²
• θ = µ
• ψ(θ) = θ²/2
• c(y, φ) = −y²/(2φ) − (1/2) ln(2πφ)
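As a check, these exponential-family pieces reassemble into the N(µ, σ²) log-density. The notes use SAS throughout; the following is a small, purely illustrative Python sketch:

```python
import math

# Exponential-family components of the normal case (sketch of the
# summary above): psi(theta) = theta^2 / 2, phi = sigma^2, theta = mu.
def psi(theta):
    return theta ** 2 / 2

def log_density(y, theta, phi):
    # log f(y) = [y*theta - psi(theta)] / phi + c(y, phi)
    c = -y ** 2 / (2 * phi) - 0.5 * math.log(2 * math.pi * phi)
    return (y * theta - psi(theta)) / phi + c

mu, sigma2, y = 1.5, 4.0, 2.0   # illustrative values, not from the notes
# The exponential-family form must reproduce the N(mu, sigma^2) density:
normal_logpdf = -0.5 * math.log(2 * math.pi * sigma2) - (y - mu) ** 2 / (2 * sigma2)
assert abs(log_density(y, theta=mu, phi=sigma2) - normal_logpdf) < 1e-12
```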
P(Yi = 1) = exp(xi'β) / {1 + exp(xi'β)}

with

• η(µ) = ln{µ/(1 − µ)}
• v(µ) = µ(1 − µ)
• φ = 1
• θ = ln{µ/(1 − µ)}
• ψ(θ) = ln{1 + exp(θ)}
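In the logit case, ψ'(θ) must return the success probability. A small Python sketch (illustrative linear predictor, not from the notes):

```python
import math

# Bernoulli/logit components (sketch): theta = ln{mu/(1-mu)},
# psi(theta) = ln{1 + exp(theta)}, so psi'(theta) recovers P(Y=1).
def psi_prime(theta, h=1e-6):
    psi = lambda t: math.log1p(math.exp(t))
    return (psi(theta + h) - psi(theta - h)) / (2 * h)  # numerical derivative

xb = 0.7                                 # a hypothetical linear predictor xi'beta
p = math.exp(xb) / (1 + math.exp(xb))    # P(Yi = 1) from the model above
theta = math.log(p / (1 - p))            # canonical parameter: the logit of p
assert abs(theta - xb) < 1e-12           # logit is the canonical link
assert abs(psi_prime(theta) - p) < 1e-6  # mean = psi'(theta)
```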
ℓ(θ1, . . . , θN , φ) = (1/φ) Σ_{i=1}^N {yi θi − ψ(θi)} + Σ_{i=1}^N c(yi, φ)

S(βj) = Σ_{i=1}^N (∂θi/∂βj) ψ''(θi) [φ ψ''(θi)]⁻¹ {yi − ψ'(θi)},   (j = 1, . . . , p)
Remarks
CHAPTER 3. CASE STUDY: ANALGESIC TRIAL 10
|---------|------------------------------------------------------|----------|
| | GSA | |
| |----------|----------|----------|----------|----------| |
| |Very Good | Good | Moderate | Bad | Very Bad | All |
| |----|-----+----|-----+----|-----+----|-----+----|-----+----|-----|
| | N | % | N | % | N | % | N | % | N | % | N | % |
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|Time | | | | | | | | | | | | |
|---------| | | | | | | | | | | | |
|MONTH 3 | 55| 14.3| 112| 29.1| 151| 39.2| 52| 13.5| 15| 3.9| 385|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 6 | 38| 12.6| 84| 27.8| 115| 38.1| 51| 16.9| 14| 4.6| 302|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 9 | 40| 17.6| 67| 29.5| 76| 33.5| 33| 14.5| 11| 4.8| 227|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 12 | 30| 13.5| 66| 29.6| 97| 43.5| 27| 12.1| 3| 1.3| 223|100.0|
|---------|----|-----|----|-----|----|-----|----|-----|----|-----|----|-----|
3.2 Questions
• Investigation of dropout
3.3.1 Dropout
Dropout Time
Frequency|
Col Pct |MONTH 3 |MONTH 6 |MONTH 9 |MONTH 12| Total
---------+--------+--------+--------+--------+
No | 385 | 302 | 227 | 223 | 1137
| 78.41 | 61.51 | 46.23 | 45.42 |
---------+--------+--------+--------+--------+
Yes | 106 | 189 | 264 | 268 | 827
| 21.59 | 38.49 | 53.77 | 54.58 |
---------+--------+--------+--------+--------+
Total 491 491 491 491 1964
Dropout
Pattern Cumulative Cumulative
(redefined) Frequency Percent Frequency Percent
-------------------------------------------------------------
**** 96 19.55 96 19.55
-*** 63 12.83 159 32.38
--** 54 11.00 213 43.38
---* 55 11.20 268 54.58
---- 223 45.42 491 100.00
• Early dropout (did the subject drop out after the first
or the second visit)?
• Binary response
• PROC GENMOD can fit GLMs in general
• PROC LOGISTIC can fit models for binary (and
ordered) responses
• SAS code:
Model Information
Response Profile
Ordered Ordered
Level Value Count
1 0 271
2 1 115
Algorithm converged.
Model Information
Response Profile
Ordered Total
Value earlydrp Frequency
1 1 115
2 0 271
NOTE: 9 observations were deleted due to missing values for the response or
explanatory variables.
Intercept
Intercept and
Criterion Only Covariates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
CHAPTER 4. LINEAR (MIXED) MODELS FOR LONGITUDINAL DATA 23
4.2 Taxonomy
• Cross-sectional:
Yi1 = βC xi1 + εi1 (4.1)
βC : the average difference between two sub-populations
that differ by one unit in x.
• Repeated observations:
Yij = βC xi1 + βL(xij − xi1) + εij (4.2)
– j = 1: cross-sectional
– ⇒ βC retains interpretation
– In addition, βL can be studied
Subtract:
(Yij − Yi1) = βL(xij − xi1) + (εij − εi1)
βL: expected change in Y over time per unit
change in x.
4.6 Notation
• Random effects:
– These are effects which arise from the
characteristics of individual subjects.
– Some subjects may be intrinsically high
responders, others intrinsically low responders.
– The influence of a random effect extends over all
measurements of the same subject.
• Serial correlation:
– Measurements taken close together in time are
typically more strongly correlated than those taken
further apart in time.
– On a sufficiently small time-scale, this kind of
structure is almost inevitable.
• Measurement error:
– When measurements involve delicate
determinations, the results may show substantial
variation even when two measurements are taken
at the same time from the same subject.
– e.g. bio-assay of blood samples
Girl 8 10 12 14 Boy 8 10 12 14
1 21.0 20.0 21.5 23.0 1 26.0 25.0 29.0 31.0
2 21.0 21.5 24.0 25.5 2 21.5 22.5∗ 23.0 26.5
3 20.5 24.0∗ 24.5 26.0 3 23.0 22.5 24.0 27.5
4 23.5 24.5 25.0 26.5 4 25.5 27.5 26.5 27.0
5 21.5 23.0 22.5 23.5 5 20.0 23.5∗ 22.5 26.0
6 20.0 21.0∗ 21.0 22.5 6 24.5 25.5 27.0 28.5
7 21.5 22.5 23.0 25.0 7 22.0 22.0 24.5 26.5
8 23.0 23.0 23.5 24.0 8 24.0 21.5 24.5 25.5
9 20.0 21.0∗ 22.0 21.5 9 23.0 20.5 31.0 26.0
10 16.5 19.0∗ 19.0 19.5 10 27.5 28.0 31.0 31.5
11 24.5 25.0 28.0 28.0 11 23.0 23.0 23.5 25.0
12 21.5 23.5∗ 24.0 28.0
13 17.0 24.5∗ 26.0 29.5
14 22.5 25.5 25.5 26.0
15 23.0 24.5 26.0 30.0
16 22.0 21.5∗ 23.5 25.0
Var(β̂) = { Σ_{i=1}^N Xi' V⁻¹ Xi }⁻¹
and with
β = (β0, β1, β0,8, β0,10, β0,12, β1,8, β1,10, β1,12)
• Parameterization:
– Means for boys: β0 + β1 + β1,8
β0 + β1 + β1,10
β0 + β1 + β1,12
β0 + β1
– Means for girls: β0 + β0,8
β0 + β0,10
β0 + β0,12
β0
• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 1’;
class idnr sex age;
model measure = sex age*sex / s;
repeated / type = un subject = idnr r rcorr;
run;
5.0143 2.5156 3.6206 2.5095
2.5156 3.8748 2.7103 3.0714
3.6206 2.7103 5.9775 3.8248
2.5095 3.0714 3.8248 4.6164
and
β = (β0, β01, β10, β11).
• Parameterization:
– β0: intercept for boys
– β0 + β01: intercept for girls
– β10: slope for boys
– β11: slope for girls
• Predicted trends:
girls : Ŷj = 17.43 + 0.4764tj
and
β = (β0, β01, β1)
• SAS program:
• Predicted trends:
girls : Ŷj = 15.37 + 0.6747tj
• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 4’;
class sex idnr;
model measure = sex age*sex / s;
repeated / type = toep subject = idnr r rcorr;
run;
• SAS program:
• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 6’;
class sex idnr;
model measure = sex age*sex / s;
random intercept age / type = un subject = idnr g;
run;
• Estimate for V :

  Ẑ D̂ Ẑ' + σ̂² I =

    4.6216  2.8891  2.8727  2.8563
    2.8891  4.6839  3.0464  3.1251
    2.8727  3.0464  4.9363  3.3938
    2.8563  3.1251  3.3938  5.3788
• Subject-specific intercepts
Independence: Model 8
• SAS program:
Overview
Yi = Xiβ + Zibi + εi
bi ∼ N (0, D),
εi ∼ N (0, Σi),
b1, . . . , bN , ε1, . . . , εN independent,
• Distribution of bi:
bi ∼ N (0, D)
with density function f (bi)
• E(Yi) = Xiβ
  Var(Yi) = Vi = Zi D Zi' + Σi

• We denote θ = (β', α')
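The marginal covariance Vi = Zi D Zi' + Σi can be checked numerically. The notes use SAS; this is a small Python/NumPy sketch with made-up D and Σi for a random-intercept-and-slope model:

```python
import numpy as np

# Marginal covariance of the linear mixed model (sketch): with
# Yi = Xi beta + Zi bi + eps_i, bi ~ N(0, D), eps_i ~ N(0, Sigma_i),
# independence of bi and eps_i gives Var(Yi) = Zi D Zi' + Sigma_i.
t = np.array([8.0, 10.0, 12.0, 14.0])     # measurement times (growth-data style)
Zi = np.column_stack([np.ones(4), t])     # random intercept + random slope
D = np.array([[3.0, 0.2],
              [0.2, 0.05]])               # Var(bi), illustrative numbers
Sigma_i = 0.8 * np.eye(4)                 # measurement-error covariance
Vi = Zi @ D @ Zi.T + Sigma_i

# Vi must be symmetric and positive definite:
assert np.allclose(Vi, Vi.T)
assert np.all(np.linalg.eigvalsh(Vi) > 0)
```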
• Autocorrelation function:

  ρ(u) = corr(e(t), e(t − u)),   u ≥ 0

• It follows from

  (1/2) E{ei(t) − ej(t − u)}² = σ²,   for i ≠ j,

  that σ² can be estimated by averaging

  (rik − rjl)²/2   over all i ≠ j and all k, l.
Bayesian methods

f(bi | yi) ∝ f(yi | bi) f(bi)
          ∝ . . .
          ∝ exp{ −(1/2) [bi − D Zi' Vi⁻¹(yi − Xiβ)]' Λi⁻¹ [bi − D Zi' Vi⁻¹(yi − Xiβ)] }

Ŷi = Xiβ̂ + Zi D Zi' Vi⁻¹(yi − Xiβ̂)
   = (I_ni − Zi D Zi' Vi⁻¹) Xiβ̂ + Zi D Zi' Vi⁻¹ yi
   = Σi Vi⁻¹ Xiβ̂ + (I_ni − Σi Vi⁻¹) yi,

• Hence, Ŷi is a weighted mean of the population-averaged
  profile Xiβ̂ and the observed data yi, with weights
  Σi Vi⁻¹ and I_ni − Σi Vi⁻¹, respectively.
5.1 Introduction
5.2 Questions
• Prediction
# param. Comp.
Model Description Covar. Struct. Deviance Model G2 d.f. p-value
• Code:
X^(pj) = { X^pj    if pj ≠ 0,
         { ln(X)   if pj = 0.
Model Information
Dimensions
Covariance Parameters 9
Columns in X 5
Columns in Z Per Subject 3
Subjects 107
Max Obs Per Subject 7
Observations Used 513
Observations Not Used 250
Total Observations 763
Fit Statistics
CHAPTER 5. CASE STUDY: VACCINATION TRIAL 86
8 681.48 <.0001
Standard
Effect Estimate Error DF t Value Pr > |t|
Num Den
Effect DF DF F Value Pr > F
• Primary motivation
– True endpoint is rare and/or distant
– Surrogate endpoint is frequent and/or close in time
• Secondary motivation
True endpoint is
– invasive
– uncomfortable
– costly
– confounded
∗ by secondary treatments
∗ by competing risks
CHAPTER 6. CASE STUDY: SURROGATE MARKERS 92
Z: Interferon-α
• 0: placebo
• 1: 6MIU
N : 190
• 36 centers
• # patients per center ∈ [2; 18]
Visual Acuity
Validation of Surrogate Markers in Randomized Experiments
ARMD Data
N : 1194
• Individual data available on every randomized patient
• 952 (80%) have progression/death
• 50 units
• # patients per unit ∈ [2; 274]
CORFU Study
T : Survival time
S: Time to progression
N : 736
• Individual data available on every randomized patient
• 694 (94.3%) have progression/death
• 76 units
• # patients per unit ∈ [2; 38]
(S|treated) = (S|control)
(T |treated) = (T |control)
• Description:
4. The full effect of Z on T is explained by S
• Model:
Tij |Zij , Sij = µ̃T + βS Zij + γZ Sij + ε̃T ij ,
• Definition:
PE(T, S, Z) = (β − βS) / β
• Estimate:
– P E = 0.65 (95% C.I. [−0.22; 1.51])
6.7 Criticism
Choi et al (1993)
– Relative Effect
– Adjusted Association
• Description:
4A. The effect of Z on S predicts a clinically useful
effect of Z on T
• Definition:
RE(T, S, Z) = β / α
• Estimate:
– RE = 1.45 (95% C.I. [−0.48; 3.39])
• Description:
4B. The correlation between S and T after correction
for Z
• Definition:
ρZ = Corr(S, T |Z)
• Estimate:
– ρZ = 0.75 (95% C.I. [0.69; 0.82])
• BUT:
The RE is based on a single trial ⇒ regression
through the origin, based on one point !
• Context:
– multicenter trials
– meta analysis
– several meta analyses
• Extensions:
• Model:
Sij |Zij = µSi + αiZij + εSij
Tij |Zij = µT i + βiZij + εT ij
• Error structure:
– Individual level:
∗ Deviations εSij and εT ij are correlated
– Trial level:
∗ Treatment effects αi and βi are correlated
∗ (Information from intercepts µSi and µT i can be
used as well)
Statistical Model
• Model:
Sij |Zij = µSi + αiZij + εSij
Tij |Zij = µT i + βiZij + εT ij
• Error structure:

  Σ = ( σSS  σST )
      ( σST  σTT )

• Trial-specific effects:

  ( µSi )   ( µS )   ( mSi )
  ( µTi ) = ( µT ) + ( mTi )
  ( αi  )   ( α  )   ( ai  )
  ( βi  )   ( β  )   ( bi  )
Endpoints dimension:
• Both endpoints together
• Each endpoint separately
Center dimension:
• Center as fixed effect
• Center as random effect
Measurement error:
• No adjustment
• Adjustment by sample size per trial
• Full correction using Stijnen’s approach
[Figure: estimated treatment effects at 12 months]
• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)
• Estimate:

  – R²trial = 0.692 (95% C.I. [0.52; 0.86])
• Prediction:

  E(β + b0 | mS0, a0) = β + (dSb, dab) ( dSS  dSa )⁻¹ ( µS0 − µS )
                                       ( dSa  daa )   ( α0 − α  )

  Var(β + b0 | mS0, a0) = dbb − (dSb, dab) ( dSS  dSa )⁻¹ ( dSb )
                                           ( dSa  daa )   ( dab )

• Trial-level association:

  R²_{bi|mSi,ai} = (dSb, dab) ( dSS  dSa )⁻¹ ( dSb )  /  dbb
                              ( dSa  daa )   ( dab )

• Estimate:

  – R²_{bi|mSi,ai} = 0.692 (95% C.I. [0.52; 0.86])
[Figure: estimated treatment effects at 12 months]
• Individual-level association:

  ρZ = Rindiv = Corr(εT i, εSi)

• Estimate:

  – R²indiv = 0.483 (95% C.I. [0.38; 0.59])
• Conditional density:

  Tij |Zij , Sij ∼ N( µT i − σT S σSS⁻¹ µSi
                      + (βi − σT S σSS⁻¹ αi)Zij
                      + σT S σSS⁻¹ Sij ;   σT T − σ²T S σSS⁻¹ )
• Individual-level association:

  ρ²Z = R²_{εT i|εSi} = σ²ST / (σSS σT T )

• Estimate:

  – R²_{εT i|εSi} = 0.483 (95% C.I. [0.38; 0.59])
• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)
• Estimate:

  – R²trial = 0.940 (95% C.I. [0.91; 0.97])
• Individual-level association:

  ρZ = Rindiv = Corr(εT i, εSi)

• Estimate:

  – R²indiv = 0.886 (95% C.I. [0.87; 0.90])
• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)
• Estimate:

  – R²trial = 0.454 (95% C.I. [0.23; 0.68])
• Individual-level association:

  ρZ = Rindiv = Corr(εT i, εSi)

• Estimate:

  – R²indiv = 0.665 (95% C.I. [0.62; 0.71])
  – Rindiv = 0.815
  – ρZ = 0.805
Chapter 7
CHAPTER 7. CASE STUDY: THE PROSTATE CANCER DATA 122
Cancer Cases
Controls BPH cases L/R M
Number of participants 16 20 14 4
Age at diagnosis (years)
median 66 75.9 73.8 72.1
range 56.7-80.5 64.6-86.7 63.6-85.4 62.7-82.8
Years of follow up
median 15.1 14.3 17.2 17.4
range 9.4-16.8 6.9-24.1 10.6-24.9 10-25.3
Time between
measurements (years)
median 2 2 1.7 1.7
range 1.1-11.7 0.9-8.3 0.9-10.8 0.9-4.8
Number of measurements
per individual
median 8 8 11 9.5
range 4-10 5-11 7-15 7-12
Complications
Stage 1:
Stage 2:
Stage 1:
Stage 2:
β1i = β1 Agei + β2 Ci + β3 Bi + β4 Li + β5 Mi + b1i
β2i = β6 Agei + β7 Ci + β8 Bi + β9 Li + β10 Mi + b2i
β3i = β11 Agei + β12 Ci + β13 Bi + β14 Li + β15 Mi + b3i,
where

Agei = age at time of diagnosis
Ci = 1 if control, 0 otherwise
Bi = 1 if BPH case, 0 otherwise
Li = 1 if L/R cancer case, 0 otherwise
Mi = 1 if metastatic cancer case, 0 otherwise
• β7, β8, β9, β10 are the average slopes for time after
correction for age.
• β12, β13, β14, β15 are the average slopes for time2 after
correction for age.
Stage 1:

Yi = Zi βi + εi

where

     ( Yi1  )        ( εi1  )        ( 1  ti1   t²i1  )        ( β1i )
Yi = ( Yi2  ),  εi = ( εi2  ),  Zi = ( 1  ti2   t²i2  ),  βi = ( β2i )
     ( ...  )        ( ...  )        ( .  ...   ...   )        ( β3i )
     ( Yini )        ( εini )        ( 1  tini  t²ini )

Stage 2:

βi = Bi β + bi,

Yij = Yi(tij)

or equivalently

Yi = Xiβ + Zibi + εi
PSA example
• lnpsa = ln(PSA + 1)
• We assume Σi = σ² I_ni
SAS program
• MODEL statement:
– response variable
– fixed effects
– options similar to SAS regression procedures
• RANDOM statement:
– definition of random effects (including intercepts !)
– identification of the ‘subjects’: independence
across subjects
– structure of random-effects covariance matrix D
many structures available within SAS
• REPEATED statement:
– ordering of measurements within subjects
– the effect(s) specified must be of the factor-type
– identification of the ‘subjects’: independence
across subjects
– structure of Σi
the same structures available as for the RANDOM
statement
Overview of frequently used covariance structures which can be specified in the RANDOM
and REPEATED statements of the SAS procedure MIXED. The σ-parameters are used to
denote variances and covariances, while the ρ-parameters are used for correlations.
Structure                       Example

Unstructured                    σ1²     σ12     σ13
type=UN                         σ12     σ2²     σ23
                                σ13     σ23     σ3²

Banded                          σ1²     σ12     0
type=UN(2)                      σ12     σ2²     σ23
                                0       σ23     σ3²

First-order autoregressive      σ²      ρσ²     ρ²σ²
type=AR(1)                      ρσ²     σ²      ρσ²
                                ρ²σ²    ρσ²     σ²

Toeplitz                        σ²      σ12     σ13
type=TOEP                       σ12     σ²      σ12
                                σ13     σ12     σ²

Toeplitz (1)                    σ²      0       0
type=TOEP(1)                    0       σ²      0
                                0       0       σ²

Heterogeneous compound          σ1²     ρσ1σ2   ρσ1σ3
symmetry                        ρσ1σ2   σ2²     ρσ2σ3
type=CSH                        ρσ1σ3   ρσ2σ3   σ3²

Heterogeneous first-order       σ1²     ρσ1σ2   ρ²σ1σ3
autoregressive                  ρσ1σ2   σ2²     ρσ2σ3
type=ARH(1)                     ρ²σ1σ3  ρσ2σ3   σ3²

Heterogeneous Toeplitz          σ1²     ρ1σ1σ2  ρ2σ1σ3
type=TOEPH                      ρ1σ1σ2  σ2²     ρ1σ2σ3
                                ρ2σ1σ3  ρ1σ2σ3  σ3²

(1) Example: repeated timeclss / type = simple subject = id;
(2) Example: random intercept time time2 / type = simple subject = id;
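The matrices in this table are easy to generate directly. A Python sketch of three of the structures, with illustrative σ and ρ values (not estimates from any data set in the notes):

```python
import numpy as np

# Sketch of covariance structures from the table above, for n = 3 occasions.
n, sigma2, rho = 3, 2.0, 0.4
idx = np.arange(n)
lag = np.abs(idx[:, None] - idx[None, :])  # |j - k| for each pair of occasions

# First-order autoregressive, type=AR(1): cov(j,k) = sigma^2 * rho^|j-k|
ar1 = sigma2 * rho ** lag

# Toeplitz, type=TOEP: one covariance parameter per lag
sig = np.array([sigma2, 0.9, 0.3])         # lag-0, lag-1, lag-2 covariances
toep = sig[lag]

# Toeplitz(1), type=TOEP(1): diagonal, sigma^2 * I
toep1 = sigma2 * np.eye(n)

assert np.isclose(ar1[0, 2], sigma2 * rho ** 2)
assert toep[0, 1] == toep[1, 2] == 0.9     # constant along sub-diagonals
assert np.allclose(toep1, np.diag([2.0, 2.0, 2.0]))
```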
Overview of frequently used spatial covariance structures which can be
specified in the RANDOM and REPEATED statements of the SAS procedure MIXED. The
correlations are positive decreasing functions of the Euclidean distances dij between the
observations. The coordinates of the observations, used to calculate these distances, are
given by a set of variables whose names are specified in the list ‘list’. The variance is
denoted by σ², and ρ defines how fast the correlations decrease as functions of the dij.

Structure                 Example

Power                          ( 1      ρ^d12  ρ^d13 )
type=SP(POW)(list)         σ²  ( ρ^d12  1      ρ^d23 )
                               ( ρ^d13  ρ^d23  1     )

Exponential                    ( 1            exp(−d12/ρ)  exp(−d13/ρ) )
type=SP(EXP)(list)         σ²  ( exp(−d12/ρ)  1            exp(−d23/ρ) )
                               ( exp(−d13/ρ)  exp(−d23/ρ)  1           )

Gaussian                       ( 1              exp(−d12²/ρ²)  exp(−d13²/ρ²) )
type=SP(GAU)(list)         σ²  ( exp(−d12²/ρ²)  1              exp(−d23²/ρ²) )
                               ( exp(−d13²/ρ²)  exp(−d23²/ρ²)  1             )
Maximum likelihood and restricted maximum likelihood estimates (MLE and REMLE) and
standard errors for all fixed effects and all variance components in the PSA model
0 1 -259.0577593
1 2 -753.2423823 0.00962100
2 1 -757.9085275 0.00444385
. . ............ ..........
6 1 -760.8988784 0.00000003
7 1 -760.8988902 0.00000000
• Objective functions:

  ln(L_ML(θ)) = −(1/2) {n ln(2π) + OF_ML(θ)}

  ln(L_REML(θ)) = −(1/2) {(n − p) ln(2π) + OF_REML(θ)}
Description Value
Observations 463.0000
Variance Estimate 1.0000
Standard Deviation Estimate 1.0000
REML Log Likelihood -31.2350
Akaike’s Information Criterion -38.2350
Schwarz’s Bayesian Criterion -52.6018
-2 REML Log Likelihood 62.4700
Null Model LRT Chi-Square 501.8411
Null Model LRT DF 6.0000
Null Model LRT P-Value 0.0000
H0 : Lβ = 0,

F = β̂' L' [ L { Σ_{i=1}^N Xi' Vi⁻¹(α̂) Xi }⁻¹ L' ]⁻¹ L β̂ / rank(L),
– Satterthwaite’s approximation
– ...
Example 1

      β4 = β5
H0 :  β9 = β10
      β14 = β15 ,

i.e.

      ( 0 0 0 1 −1 0 0 0 0  0 0 0 0 0  0 )
H0 :  ( 0 0 0 0  0 0 0 0 1 −1 0 0 0 0  0 ) β = 0,
      ( 0 0 0 0  0 0 0 0 0  0 0 0 0 1 −1 )
Example 2
Example 3
Example 4
Model reduction
= −β8 + β9 + β14
7.6.1 Example
−2 ln λN = −2 ln { L(θ̂0) / L(θ̂1) },

where θ̂0 and θ̂1 are the ML or REML estimates under
the null and the alternative hypothesis, respectively.
• Test statistics:
Maximum likelihood
Asymptotic null distribution
Hypothesis −2 ln(λN ) Correct Naive
Model 2 versus Model 1 94.270 χ22:3 χ23
Model 3 versus Model 2 161.016 χ21:2 χ22
Model 4 versus Model 3 240.114 χ20:1 χ21
Restricted maximum likelihood
Asymptotic null distribution
Hypothesis −2 ln(λN ) Correct Naive
Model 2 versus Model 1 92.796 χ22:3 χ23
Model 3 versus Model 2 165.734 χ21:2 χ22
Model 4 versus Model 3 245.874 χ20:1 χ21
CHAPTER 8. PARAMETRIC MODELING FAMILIES 166
⇒ inconsistency
⇒ further assumptions:
βi = β + U i
with
• Likelihood methods:
– Multivariate Probit Model
Ashford and Sowden (1970)
– Bahadur Model
Bahadur (1962)
9.1 Notation
CHAPTER 9. MODELLING REPEATED CATEGORICAL DATA 176
• Advantages:
– The parameter vector is not constrained. All
values of θ ∈ IR yield nonnegative probabilities.
– Calculation of the joint probabilities is fairly
straightforward:
∗ ignore the normalizing constant
∗ evaluate the density for all possible sequences y
∗ sum all terms to yield c(θ)−1
• Drawbacks:
– Due to the above conditional interpretation, the
models are less useful for regression:
the dependence of E(Yij ) on covariates involves all
parameters, not only the main effects.
– The interpretation of the parameters depends on
the length ni of a sequence:
shorter sequences imply that one conditions on fewer
outcomes, so the interpretation changes with the length
of the sequence.
Remarks
An important sub-family of
η i = η i(µi)
is the log-contrast family:
η i(µi) = C ln(Aµi),
with
Pairwise Association
ψijk (Yi = 1)
=
ψijk (Yi = 0)
CHAPTER 10. CASE STUDY: NTP DATA 189
[Diagram: litter hierarchy — dam → implants (mi) → viable fetuses (ni)
and non-viable (ri); viable → malformation (zi, of types 1, . . . , K),
low weight, or death; non-viable → resorption]
10.2 Design
10.3 Goals
10.4 Issues
f(y i, θ i) = exp{ Σ_{j=1}^{ni} θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 − A(θ i) }

           = c(θ i) exp{ Σ_{j=1}^{ni} θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 }.
• NTP data
• Yij is malformation indicator for fetus j in litter i
• Code Yij as −1 or 1
• di is dose level at which litter i is exposed
• Simplification:
θij = θi = β0 + βddi,
θij1j2 = βa.
• Using

  Zi = Σ_{j=1}^{ni} Yij

  we obtain

  f(zi |θi , βa) = ( ni ) exp {θi zi + βa zi(ni − zi) − A(θi)} ,
                   ( zi )
with vi = Var(Yi).
(Here, Yi is scalar.)
CHAPTER 11. GENERALIZED ESTIMATING EQUATIONS 198
As N → ∞,

√N (β̂ − β) ∼ N(0, I0⁻¹)

where

I0 = Σ_{i=1}^N DiT [Vi(α)]⁻¹ Di
• (Unrealistic) Conditions:
– α is known
– the parametric form for V i(α) is known
• Solution: working correlation matrix
Write

Vi(β, α) = φ Ai^{1/2}(β) Ri(α) Ai^{1/2}(β).
• Independence:

  Corr(Yij , Yik) = 0   (j ≠ k).

  There are no parameters to be estimated.

• Exchangeable:

  Corr(Yij , Yik) = α   (j ≠ k).

  α̂ = (1/N) Σ_{i=1}^N { 1/[ni(ni − 1)] } Σ_{j≠k} eij eik.

• AR(1):

  Corr(Yij , Yi,j+t) = α^t   (t = 0, 1, . . . , ni − j).

  α̂ = (1/N) Σ_{i=1}^N { 1/(ni − 1) } Σ_{j≤ni−1} eij ei,j+1.

• Unstructured:

  Corr(Yij , Yik) = αjk   (j ≠ k).

  α̂jk = (1/N) Σ_{i=1}^N eij eik.
β^(t+1) = β^(t) + ( Σ_{i=1}^N DiT Vi⁻¹ Di )⁻¹ ( Σ_{i=1}^N DiT Vi⁻¹ (y i − µi) ).
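The moment estimators above are simple to implement. The notes fit GEEs with PROC GENMOD; as an illustration only, a Python sketch of the exchangeable estimator on simulated standardized residuals (all numbers made up):

```python
import numpy as np

# Moment estimator of the exchangeable working correlation (sketch):
# alpha-hat averages cross-products of residuals e_ij within clusters.
rng = np.random.default_rng(0)
N, n = 200, 4                      # N clusters, n observations per cluster
shared = rng.normal(size=(N, 1))   # shared cluster effect induces correlation
e = (shared + rng.normal(size=(N, n))) / np.sqrt(2)  # residuals, variance ~ 1

alpha_hat = 0.0
for i in range(N):
    s = 0.0
    for j in range(n):
        for k in range(n):
            if j != k:
                s += e[i, j] * e[i, k]
    alpha_hat += s / (n * (n - 1))
alpha_hat /= N

# True exchangeable correlation here is 0.5; the estimate should be near it.
assert 0.3 < alpha_hat < 0.7
```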
Model Information
Description Value
Parameter Information
Parameter Effect
PRM1 INTERCEPT
PRM2 DOSE
Description Value
Parameter
Number PRM1 PRM2
Parameter
Number PRM1 PRM2
. . . . .
. . . . .
. . . . .
NOTE: The scale parameter for GEE estimation was computed as the
square root of the normalized Pearson’s chi-square.
Description Value
Parameter
Number PRM1 PRM2
Parameter
Number PRM1 PRM2
• Classical approach:
– Estimating equation for β
– Moment-based estimation for α
– Liang and Zeger (1986)
– SAS PROC GENMOD
• Alternative approach GEE 1.5:
– Estimating equation for β
– Estimating equation for α
– Prentice (1988)
– SAS macro gee1corr.mac by Stuart Lipsitz
Form of Equations

Σ_{i=1}^N DiT Vi⁻¹ (Y i − µi) = 0,

Σ_{i=1}^N EiT Wi⁻¹ (Z i − δ i) = 0,

where

Zijk = (Yij − µij)(Yik − µik) / √{µij(1 − µij) µik(1 − µik)},

δijk = E(Zijk)
The joint asymptotic distribution of √N(β̂ − β) and
√N(α̂ − α) is normal, with variance-covariance matrix
consistently estimated by

N ( A  0 ) ( Λ11  Λ12 ) ( A  B' )
  ( B  C ) ( Λ21  Λ22 ) ( 0  C  ),
where

A = ( Σ_{i=1}^N DiT Vi⁻¹ Di )⁻¹,

B = ( Σ_{i=1}^N EiT Wi⁻¹ Ei )⁻¹ ( Σ_{i=1}^N EiT Wi⁻¹ ∂Z i/∂β ) ( Σ_{i=1}^N DiT Vi⁻¹ Di )⁻¹,

C = ( Σ_{i=1}^N EiT Wi⁻¹ Ei )⁻¹,

Λ11 = Σ_{i=1}^N DiT Vi⁻¹ Cov(Y i) Vi⁻¹ Di,

Λ12 = Σ_{i=1}^N DiT Vi⁻¹ Cov(Y i, Z i) Wi⁻¹ Ei,

Λ21 = Λ12',

Λ22 = Σ_{i=1}^N EiT Wi⁻¹ Cov(Z i) Wi⁻¹ Ei,
and
Statistic Estimator
Var(Y i) (Y i − µi)(Y i − µi)T
Cov(Y i, Z i) (Y i − µi)(Z i − δ i)T
Var(Z i) (Z i − δ i)(Z i − δ i)T
%include ’c:\sas\stat\sample\gee1corr.mac’;
%gee(data=m.dehp3,y=visceral,x=dose,id=litter,corr=exc);
%gee(data=m.dehp3,y=visceral,x=dose,id=litter,corr=ind);
• for gee1corr.mac
• for GLIMMIX macro
CORR SECORR Z P
0.1100235 0.0455011 2.4180411 0.0156043
Write
y i = µi + εi
with
η i = g(µi),
η i = Xiβ,
Var(y i) = Var(εi) = Σi.
Here,
11.8.1 Estimation
Solve iteratively:

Σ_{i=1}^N XiT Wi Xi β = Σ_{i=1}^N XiT Wi y*i,

where

Wi = Di Σi⁻¹ Di,
y*i = η̂ i + Di⁻¹(y i − µ̂i),
Di = ∂µi/∂η i,
Σi = Var(εi),
µi = E(y i).
Remarks:
%include ’c:\sas\stat\sample\glimmix.sas’;
%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=cs r;),
error=binomial,
link=logit,
title=’visceral, CS, Model Based’,
options=mixprintlast
);
%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=simple r;),
error=binomial,
link=logit,
title=’visceral, Independence, Model Based’,
options=mixprintlast
);
%glimmix(
data=m.dehp3,
procopt=method=reml empirical,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=cs r;),
error=binomial,
link=logit,
title=’visceral, CS, Empirically Corrected’,
options=mixprintlast
);
%glimmix(
data=m.dehp3,
procopt=method=reml empirical,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=simple r;),
error=binomial,
link=logit,
title=’visceral, Independence, Empirically Corrected’,
options=mixprintlast
);
• Described in
– Littell et al (1996)
– glimmix.sas
• Visceral malformation
• Exchangeable correlation (CS)
• Model based standard errors (the standard in PROC
MIXED)
0 1 4843.2167338
1 2 4823.1918629 0.00000048
2 1 4823.1907084 0.00000000
. . . . .
. . . . .
. . . . .
CS LITTER 0.07639306
Residual 0.92257140
Description Value
Observations 1082.000
Res Log Likelihood -3404.05
Akaike’s Information Criterion -3406.05
Schwarz’s Bayesian Criterion -3411.03
-2 Res Log Likelihood 6808.098
Null Model LRT Chi-Square 20.0260
Null Model LRT DF 1.0000
Null Model LRT P-Value 0.0000
Cov
Parm Estimate
CS 0.07639306
Description Value
Deviance 407.7891
Scaled Deviance 442.0136
Pearson Chi-Square 1076.0335
Scaled Pearson Chi-Square 1166.3418
Extra-Dispersion Scale 0.9226
Parameter Estimates
Description Value
Observations 1082.000
Res Log Likelihood -3415.38
Akaike’s Information Criterion -3416.38
Schwarz’s Bayesian Criterion -3418.88
-2 Res Log Likelihood 6830.770
Null Model LRT Chi-Square 0.0000
Null Model LRT DF 0.0000
Null Model LRT P-Value 1.0000
Cov
Parm Estimate
DIAG 0.99703847
Description Value
Deviance 407.5135
Scaled Deviance 407.5135
Pearson Chi-Square 1076.8015
Scaled Pearson Chi-Square 1076.8015
Extra-Dispersion Scale 1.0000
Parameter Estimates
Cov
Parm Estimate
CS 0.07639306
Description Value
Deviance 407.7891
Scaled Deviance 442.0136
Pearson Chi-Square 1076.0335
Scaled Pearson Chi-Square 1166.3418
Extra-Dispersion Scale 0.9226
Parameter Estimates
Cov
Parm Estimate
DIAG 0.99703847
Description Value
Deviance 407.5135
Scaled Deviance 407.5135
Pearson Chi-Square 1076.8015
Scaled Pearson Chi-Square 1076.8015
Extra-Dispersion Scale 1.0000
Parameter Estimates
GEE1 Estimates (Model Based Standard Errors; Robust Standard Errors) for
the DEHP Data. Exchangeable Working Assumptions.
GEE1 Estimates (Model Based Standard Errors; Robust Standard Errors) for
the DEHP Data. Independence Working Assumptions.
Remarks
11.11.2 Discussion
(Another GEE1.5.)
CHAPTER 12. CASE STUDY: ANALGESIC TRIAL 250
Model Information
Response Profile
Ordered Ordered
Level Value Count
1 0 206
2 1 931
Parameter Information
Parameter Effect
Prm1 Intercept
Prm2 pca0
Prm3 TIME
Prm4 TIME*TIME
Algorithm converged.
Algorithm converged.
IND                       EXCH

1  0      0      0        1  0.219  0.219  0.219
   1      0      0           1      0.219  0.219
          1      0                  1      0.219
                 1                         1

AR                        UN

1  0.247  0.061  0.015    1  0.177  0.248  0.202
   1      0.247  0.061       1      0.181  0.178
          1      0.247              1      0.459
                 1                         1
IND                       EXCH

1  0      0      0        1  0.219  0.219  0.219
   1      0      0           1      0.219  0.219
          1      0                  1      0.219
                 1                         1

AR                        UN

1  0.235  0.055  0.013    1  0.143  0.288  0.228
   1      0.235  0.055       1      0.220  0.098
          1      0.235              1      0.443
                 1                         1
12.3.1 Output
Model Information
Dimensions
Covariance Parameters 10
Columns in X 4
Columns in Z 0
Subjects 395
Max Obs Per Subject 4
Observations Used 1137
Observations Not Used 0
Total Observations 1137
Parameter Search
Iteration History
1 1 5271.50486573 0.00000000
Standard Z
Cov Parm Subject Estimate Error Value Pr Z
Fit Statistics
10 0.00 1.0000
Standard
Effect Estimate Error DF t Value Pr > |t|
Num Den
Effect DF DF F Value Pr > F
Description Value
Deviance 1065.2602
Scaled Deviance 1065.2602
Pearson Chi-Square 1101.4964
Scaled Pearson Chi-Square 1101.4964
Extra-Dispersion Scale 1.0000
Intercept 2.918 (0.463; 0.494) 2.940 (0.463; 0.494) 2.942 (0.463; 0.488)
Time -0.833 (0.328; 0.343) -0.843 (0.326; 0.334) -0.843 (0.326; 0.330)
Time2 0.177 (0.067; 0.070) 0.178 (0.066; 0.068) 0.178 (0.066; 0.067)
Basel. PCA -0.226 (0.095; 0.103) -0.230 (0.095; 0.105) -0.230 (0.095; 0.104)
ρ 0.219 0.260 (0.048) 0.264 (0.037†)
12.4.1 Output
Model Information
Response Profile
Ordered Ordered
Level Value Count
1 0 206
2 1 931
Algorithm converged.
Algorithm converged.
• Output:
The GENMOD Procedure
Model Information
Response Profile
Ordered Ordered
Level Value Count
1 0 206
2 1 931
Algorithm converged.
Parameter Group
Alpha1 (1, 2)
Alpha2 (1, 3)
Alpha3 (1, 4)
Alpha4 (2, 3)
Alpha5 (2, 4)
Alpha6 (3, 4)
Algorithm converged.
Random-Effects Models
CHAPTER 13. RANDOM-EFFECTS MODELS 267
• Quadrature:
– Select abscissas
– Construct weighted sum of function over abscissas
• Adaptive Quadrature:
– Typical for random effects distribution
– Integral centered at EB estimate of ui
– Number of quadrature points selected as a function
of the desired accuracy
• Pinheiro and Bates (1995)
13.4 Software
• Building blocks:
The binomial part: conditional on the success
probability πi in cluster i, the responses
Y i1, . . . , Yini are independent with common
probability πi.
The beta part: the πi are drawn from a beta
distribution with mean π and variance δπ(1 − π)
• The marginal distribution of Zi is then beta-binomial:

  f(zi | πi, ρ) = ( ni )  B(πi(ρ⁻¹ − 1) + zi, (1 − πi)(ρ⁻¹ − 1) + (ni − zi))
                  ( zi )  ─────────────────────────────────────────────────
                                B(πi(ρ⁻¹ − 1), (1 − πi)(ρ⁻¹ − 1))

  where B(·, ·) denotes the beta function.
• The moments are:
– E(Zi) = niµi
– Var(Zi) = niµi(1 − µi)[1 + (ni − 1)δ]
Williams (1975)
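These moments can be verified numerically against the density. A Python sketch with hypothetical ni, π, and ρ (under this parameterization, with a = π(ρ⁻¹ − 1) and b = (1 − π)(ρ⁻¹ − 1), the δ in the moment formulas equals ρ):

```python
from math import comb, lgamma, exp

# Beta-binomial moment check (sketch): pmf f(z) = C(n,z) B(a+z, b+n-z)/B(a,b).
def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

n, pi, rho = 10, 0.3, 0.2          # illustrative values, not from the NTP data
a = pi * (1 / rho - 1)
b = (1 - pi) * (1 / rho - 1)

pmf = [comb(n, z) * exp(log_beta(a + z, b + n - z) - log_beta(a, b))
       for z in range(n + 1)]
mean = sum(z * p for z, p in zip(range(n + 1), pmf))
var = sum((z - mean) ** 2 * p for z, p in zip(range(n + 1), pmf))

assert abs(sum(pmf) - 1) < 1e-12                        # a valid distribution
assert abs(mean - n * pi) < 1e-10                       # E(Z) = n * mu
assert abs(var - n * pi * (1 - pi) * (1 + (n - 1) * rho)) < 1e-9
```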
13.5.2 Discussion
• An extension of GEE1–Alternative 2.
• Random effects are included in the model.
Write
y i = µi + εi
with
η i = g(µi),
η i = Xiβ + Zibi,
Var(y i|bi) = Σi.
Here,
• µi = ψ'(θi)
• v(µi) = ψ''(θi)

∂Qi/∂µi = (yi − µi) / {φ v(µi)}.
• the mean µi
• the mean function θ(µi)
• the variance function v(µi)
• the scale parameter φ
where

Wi = Di Σi⁻¹ Di,
Di = ∂µi/∂η i,
Σi = Var(εi),

and W , D, and Σ are block-diagonal matrices built
from the Wi, Di, and Σi, respectively.
The estimates are:
β̂ = (X T V̂ −1X)−1X T V̂ −1y ∗,
b̂ = Ĝ∗Z T V̂ −1r̂.
7. Compute
µ̂i = g −1(Xiβ̂ + Zib̂i).
8. Iterate until convergence.
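With fixed effects only (no bi), the scheme above reduces to ordinary IRLS for a GLM. A self-contained Python sketch for a logistic model with intercept and slope, on made-up, non-separable data:

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def irls_logistic(xs, ys, n_iter=25):
    """IRLS for logit(mu) = b0 + b1*x: form the working variate
    z = eta + (y - mu)/w with weight w = mu(1-mu), solve the 2x2
    weighted least-squares equations, and iterate."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        S00 = S01 = S11 = t0 = t1 = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            mu = expit(eta)
            w = mu * (1.0 - mu)          # iterative GLM weight
            z = eta + (y - mu) / w       # working variate
            S00 += w
            S01 += w * x
            S11 += w * x * x
            t0 += w * z
            t1 += w * x * z
        det = S00 * S11 - S01 * S01      # solve the 2x2 normal equations
        b0 = (S11 * t0 - S01 * t1) / det
        b1 = (S00 * t1 - S01 * t0) / det
    return b0, b1

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [0, 0, 1, 0, 1, 1, 1]               # toy, non-separable data
b0, b1 = irls_logistic(xs, ys)
```

At convergence the score equations hold, so the fitted probabilities reproduce the observed total number of successes.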
data help;
set m.dehp2;
dose=dose/292;
collaps = ((visceral-1) or (skeletal-1) or (external-1));
if visceral=. then delete;
skeletal=skeletal-1;
visceral=visceral-1;
external=external-1;
run;
%include 'c:\sas\stat\sample\glimmix.sas';
%glimmix(
data=help,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model weight=dose / solution predmeans;
random intercept / subject=litter solution;),
error=normal,
link=identity,
title='GLIMMIX, dehp2, weight, random intercept',
options=mixprintlast
);
Description Value
Observations 1082.000
Res Log Likelihood 1014.980
Akaike’s Information Criterion 1012.980
Schwarz’s Bayesian Criterion 1007.995
-2 Res Log Likelihood -2029.96
...
INTERCEPT 203 0.02616233 0.02521825 974 1.04 0.2998
Predicted Means
Predicted Values
Description Value
Observations 1082.000
Res Log Likelihood 1014.980
Akaike’s Information Criterion 1012.980
Schwarz’s Bayesian Criterion 1007.995
-2 Res Log Likelihood -2029.96
Description Value
Observations 1082.000
Res Log Likelihood 1014.834
Akaike’s Information Criterion 1012.834
Schwarz’s Bayesian Criterion 1007.849
-2 Res Log Likelihood -2029.67
INTERCEPT 0.00552982
Parameter Estimates
Predicted Means
1 0 38 0.9665
10 0 49 0.9665
13.7.2 Discussion
Let us combine
• dose level
• predicted mean
• random intercept
• predicted value
Remarks
%include 'c:\sas\stat\sample\glimmix.sas';
%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model visceral=dose / solution predmeans predicted;
random intercept / subject=litter solution;
repeated / subject=litter type=simple;
),
error=binomial,
link=logit,
maxit=100,
options=mixprintlast,
title='GLIMMIX, dehp3, visceral, random intercept and SIMPLE'
);
%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model visceral=dose / solution predmeans predicted;
make 'Predicted' out=m.predvisc noprint; <------------
make 'PredMeans' out=m.prmvisc noprint;
make 'SolutionR' out=m.solrvisc noprint;
random intercept / subject=litter solution;
repeated / subject=litter type=simple;
),
error=binomial,
link=logit,
maxit=100,
options=mixprintlast,
title='GLIMMIX, dehp3, visceral, random intercept and SIMPLE'
);
Description Value
Observations 1082.000
Res Log Likelihood -3355.59
Akaike’s Information Criterion -3357.59
Schwarz’s Bayesian Criterion -3362.57
-2 Res Log Likelihood 6711.179
Predicted Means
0 38 0 -6.8176 -5.6019
0 49 0 -6.9235 -5.6019
0 57 1 -2.4654 0.1810
1 66 1 2.0691 0.1810
INTERCEPT 2.73470639
DIAG 0.34412800
Description Value
Deviance 282.3032
Scaled Deviance 282.3032
Pearson Chi-Square 353.0951
Scaled Pearson Chi-Square 353.0951
Extra-Dispersion Scale 1.0000
Parameter Estimates
13.7.5 Discussion
Let us combine
• dose level
• predicted mean
• random intercept
• predicted value
Remarks
• It is seen in plots
• It shows through main effects estimates
• Neuhaus and Jewell (1993)
GEE1 and GLIMMIX Estimates (Model Based Standard Errors; Robust Standard Errors) for the
DEHP Data. Exchangeable Working Assumptions/Random Intercept Model.
CHAPTER 14. CASE STUDY: ANALGESIC TRIAL 314
14.1.1 Output
Specifications
Description Value
Dimensions
Description Value
Parameters
Iteration History
Fit Statistics
Description Value
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower
Parameter Estimates
Additional Estimates
Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper
• Need to calculate
∫_{−∞}^{+∞} exp(xTijβ + u) / {1 + exp(xTijβ + u)} · 1/(√(2π) σu) · exp(−u²/(2σu²)) du.
• Take mean value of the covariates to evaluate this
expression.
• This gives fitted complete profiles, that is, what would
be obtained had all the patients stayed in the study.
*** use the following statement in PROC NLMIXED to get parameter estimates;
ods output parameterestimates=parmest;
proc iml;
*** read in fixed param. estimates;
use parmest;
read all var{estimate} into parmest;
beta=parmest[1:(nrow(parmest)-1)];
sig2=parmest[nrow(parmest)];
cif=probit(0.975);
do t=1 to 4;
xcov={1} // t // t**2 // {3};*** Note: 3 = median baseline PCA;
xbeta=t(xcov)*beta;
call quad(prc,"integr",{.M .P});
*** approximate confidence intervals (ignoring variability in the estimates);
low_prc=exp(xbeta-cif*sqrt(sig2))/(1+exp(xbeta-cif*sqrt(sig2)));
upp_prc=exp(xbeta+cif*sqrt(sig2))/(1+exp(xbeta+cif*sqrt(sig2)));
pdrespc=pdrespc // (t || prc || low_prc || upp_prc);
end;
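A Python transcription of the same computation may help (hypothetical parameter values in place of the PROC NLMIXED estimates; the intervals are the same naive plug-in ones as in the IML code, ignoring estimation uncertainty):

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def marginal_profile(beta, sig2, pca0=3.0, times=(1, 2, 3, 4)):
    """For each time t, approximate the marginal success probability
    integral of expit(x'beta + u) against the N(0, sig2) density by a
    fixed grid, plus naive expit(x'beta -/+ 1.96*sqrt(sig2)) bands."""
    sigma = math.sqrt(sig2)
    cif = 1.959963984540054               # Phi^{-1}(0.975)
    out = []
    for t in times:
        xbeta = beta[0] + beta[1] * t + beta[2] * t ** 2 + beta[3] * pca0
        npts, lo = 2001, -8.0 * sigma     # grid over u in [-8 sigma, 8 sigma]
        h = 16.0 * sigma / npts
        prc = sum(h * math.exp(-u * u / (2 * sig2))
                  / (math.sqrt(2 * math.pi) * sigma) * expit(xbeta + u)
                  for u in (lo + (k + 0.5) * h for k in range(npts)))
        low = expit(xbeta - cif * sigma)  # plug-in bands, as in the IML code
        upp = expit(xbeta + cif * sigma)
        out.append((t, prc, low, upp))
    return out

# made-up values standing in for the fitted beta0..beta3 and sigma_u^2
profile = marginal_profile(beta=[3.0, -0.8, 0.2, -0.2], sig2=4.0)
```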
• Code:
proc nlmixed data=gsa npoints=20 noad noadscale tech=newrap;
parms beta0=3 beta1=-0.8 beta2=0.2 beta3=-0.2 su=1;
eta = beta0 + beta1*time + beta2*time2 + beta3*pca0 + u;
expeta = exp(eta);
p = expeta/(1+expeta);
model gsabin ~ binary(p);
random u ~ normal(0,su**2) subject=patid;
estimate 'ICC' su**2/(arcos(-1)**2/3 + su**2);
run;
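The ESTIMATE statement uses arcos(−1) = π, so it computes the latent-scale intra-class correlation ICC = σu²/(π²/3 + σu²), with π²/3 the variance of the standard logistic distribution. In Python, with an arbitrary σu:

```python
import math

def logistic_icc(su):
    """Latent-scale intra-class correlation for a random-intercept
    logistic model: var(u) / (var(u) + pi^2/3)."""
    return su ** 2 / (math.pi ** 2 / 3 + su ** 2)

icc = logistic_icc(1.0)   # hypothetical sigma_u
```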
• Output:
The NLMIXED Procedure
Specifications
Description Value
Dimensions
Description Value
Parameters
Iteration History
Fit Statistics
Description Value
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower
Parameter Estimates
Additional Estimates
Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper
14.2.2 MIXOR
Numbers of observations
-----------------------
1 1 4 2 1 4 3 4 3 3 4 4 4 3 1 1 1 3 2
...
Starting values
---------------
mean 1.022
covariates 0.295 -0.066 0.079
var. terms 0.574
---------------------------------------------------------
* Final Results - Maximum Marginal Likelihood Estimates *
---------------------------------------------------------
Total Iterations = 10
Quad Pts per Dim = 20
Log Likelihood = -506.275
Deviance (-2logL) = 1012.549
Ridge = 0.000
• Code:
%glimmix(data=gsa, procopt=%str(method=ml noclprint covtest),
stmts=%str(
class patid timecls;
model gsabin = time|time pca0 / s;
repeated timecls / sub=patid type=un rcorr=3;
),
error=binomial);
• Output:
The Mixed Procedure
Model Information
Dimensions
Covariance Parameters 2
Columns in X 4
Columns in Z Per Subject 1
Subjects 395
Max Obs Per Subject 4
Observations Used 1137
Observations Not Used 0
Total Observations 1137
Parameter Search
Iteration History
1 1 5666.87638845 0.00000000
Standard Z
Cov Parm Subject Estimate Error Value Pr Z
Fit Statistics
1 0.00 1.0000
Standard
Effect Estimate Error DF t Value Pr > |t|
Num Den
Effect DF DF F Value Pr > F
Description Value
Deviance 564.5908
Scaled Deviance 1139.5937
Pearson Chi-Square 451.3494
Scaled Pearson Chi-Square 911.0227
Extra-Dispersion Scale 0.4954
Chapter 15
CHAPTER 15. ANALGESIC TRIAL: ORDINAL DATA 333
• Output:
Model Information
Response Profile
Ordered Total
Value GSA Frequency
1 Bad 163
2 Good 329
3 Moderate 439
4 Very Bad 43
5 Very Good 163
30.0808 9 0.0004
Intercept
Intercept and
Criterion Only Covariates
Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq
• Output:
Model Information
Response Profile
Ordered Ordered
Level Value Count
1 Bad 163
2 Good 329
3 Moderate 439
4 Very Bad 43
5 Very Good 163
Algorithm converged.
Algorithm converged.
• Output:
Specifications
Description Value
Dimensions
Description Value
Parameters
i1 i2 i3 i4 b1 b2 b3 sd
Parameters
NegLogLike
1716.69515
Iteration History
Iteration History
Fit Statistics
Description Value
Parameter Estimates
Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower
Parameter Estimates
i1 -0.4810 -0.00002
i2 2.0991 -0.00011
i3 4.9973 0.000138
i4 7.3920 -0.00003
b1 1.1463 0.00026
b2 0.009308 0.001023
b3 0.5898 0.000191
sd 2.3857 -0.0001
Additional Estimates
Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper
Missing Data
• Measurement Yij
• Dropout indicator:
Rij = 1 if Yij is observed, and 0 otherwise.
CHAPTER 16. MISSING DATA 344
Dropout
• Possible definition:
Di = 1 + Σ_{j=1}^{ni} Rij
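As a trivial sketch (monotone dropout assumed, Rij as defined above), Di is just one plus the number of observed occasions:

```python
def dropout_index(r):
    """D_i = 1 + sum_j R_ij: the occasion at which dropout occurs;
    equals n_i + 1 for a completer. Assumes monotone dropout,
    i.e. r looks like [1, ..., 1, 0, ..., 0]."""
    return 1 + sum(r)

d_complete = dropout_index([1, 1, 1, 1])   # completer: D_i = n_i + 1 = 5
d_dropout = dropout_index([1, 1, 0, 0])    # drops out at occasion 3
```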
Selection Models:
f (Yi|θ)f (Di|Yi, ψ)
Pattern-Mixture Models:
f (Yi|Di, θ)f (Di|ψ)
16.6 Ignorability
• Counterexamples:
– Generalized Estimating Equations (Liang and
Zeger)
– Least Squares
• EM algorithm
Match the data to the “complete” model
• Multiple Imputation
Accounts properly for uncertainty due to missingness
Almost. . .
Warnings
yi = Xiβ + Zibi + εi,
bi ∼ N (0, D) and εi ∼ N (0, Σi), independent,
⇒ Vi = Var(yi) = ZiDZi′ + Σi.
• Monotone dropout
• Dropout probability at occasion j:
P (Di = j|Di ≥ j, y i, Wi) = g(hij , yij )
• Dropout model:
logit[g(hij , yij )] = logit [P (Di = j|Di ≥ j, y i, Wi)]
= hij ψ + ωyij , i = 1, . . . , N
• MAR if ω = 0
• Non-random if ω ≠ 0
• Dropout probability:
f (di | yi, Wi, ψ) =
  ∏_{j=2}^{ni} [1 − g(hij , yij )]                 for Di = ni + 1 (completers),
  g(hid, yid) ∏_{j=2}^{d−1} [1 − g(hij , yij )]    for Di = d ≤ ni.
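The two branches can be checked numerically: over d = 2, . . . , ni + 1, the pattern probabilities must sum to one. A Python sketch with made-up conditional dropout probabilities standing in for g(hij, yij):

```python
def pattern_probability(g, d):
    """f(d) for monotone dropout: g[j] = P(D_i = j | D_i >= j) for
    occasions j = 2..n_i (g is a dict). Completers have d = n_i + 1."""
    n = max(g)                        # n_i: last occasion with a hazard
    if d == n + 1:                    # completer: survive every occasion
        prob = 1.0
        for j in range(2, n + 1):
            prob *= 1.0 - g[j]
        return prob
    prob = g[d]                       # drop out at occasion d ...
    for j in range(2, d):             # ... having survived 2..d-1
        prob *= 1.0 - g[j]
    return prob

g = {2: 0.2, 3: 0.3, 4: 0.4}          # made-up dropout hazards
probs = [pattern_probability(g, d) for d in range(2, 6)]
```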
16.14 A Paradox
∗ Pattern-mixture Models:
f (Yi1, . . . , Yid|Di = d) = f (Yi1, . . . , Yid|Di > d)
• Pattern-membership probabilities:
π1, . . . , πt, . . . , πT .
• Their variance (delta method):
Var(β1, . . . , βg ) = A V A′
where
V = [ Var(βt)    0     ]
    [    0    Var(πt)  ]
and
A = ∂(β1, . . . , βg ) / ∂(β11, . . . , βTg, π1, . . . , πT )
Shared-parameter models
16.20 Example
16.21 Literature
• Literature
– Robins (SiM 1997)
– Robins and Gill (SiM 1997)
– Rotnitzky and Robins (Scand J Stat 1995)
– Rotnitzky and Robins (SiM 1997)
– Robins, Rotnitzky and Zhao (JASA 1995)
– Robins and Rotnizky (JASA 1995)
– Robins, Rotnitzky, and Scharfstein (JASA 1998)
Chapter 17
CHAPTER 17. CASE STUDY: ANALGESIC TRIAL 369
Model Information
prevgsa 5 1 2 3 4 5
Response Profile
Ordered Ordered
Level Value Count
1 0 800
2 1 163
Algorithm converged.
Estimated working correlation matrices:

GEE                          WGEE
1  0.173  0.246  0.201       1  0.215  0.253  0.167
   1      0.177  0.113          1      0.196  0.113
          1      0.456                 1      0.409
                 1                            1
Chapter 18
PROC NLMIXED
18.1 Features
CHAPTER 18. PROC NLMIXED 371
18.2 Particularities
18.3 Limitations
18.4 MIXOR
• Program in the public domain specifically designed for
mixed-effects ordinal regression analysis. The program
can be downloaded at
http://www.uic.edu/~hedeker/mixreg.html
19.1 Introduction
CHAPTER 19. INTRODUCTION TO MULTILEVEL MODELING 375
• where:
– Ωe = cov[eijk ],
– Ωu = cov[uij ],
– Ωv = cov[vi].
Assumptions:
– Level 1 residuals are independent across level 1 units and are N (0, V3(1)), where V3(1) is diagonal with elements σ²e,ijk = Z(1)ijk′ Ωe Z(1)ijk.
– Level 2 residuals are independent across level 2 units and are N (0, V3(2)), where V3(2) is block-diagonal with blocks V3(2)ij = Z(2)ij′ Ωu Z(2)ij.
– Level 3 residuals are independent across level 3 units and are N (0, V3(3)), where V3(3) is block-diagonal with blocks V3(3)i = Z(3)i′ Ωv Z(3)i.
• Thus, cov[Y ] is block-diagonal, with ith block given by:
V3i = V3(3)i + Σj V3(2)ij + Σj,k σ²e,ijk.
with V ∗ = V ⊗ V.
19.3.2 Remarks
• β̂ = (X T V −1 X)−1 X T V −1 Y, with
    [ 1   x11  ]        [ y11  ]
    [ 1   x12  ]        [ y12  ]
X = [ .    .   ],   Y = [  .   ].
    [ .    .   ]        [  .   ]
    [ 1   xmnm ]        [ ymnm ]
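The estimator can be sketched with hand-rolled matrix algebra (a toy example: V = I, so GLS reduces to OLS, and the data lie exactly on a line):

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve2(A, b):
    """Solve a 2x2 system A beta = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def gls(X, Vinv, y):
    """beta-hat = (X' V^{-1} X)^{-1} X' V^{-1} Y."""
    XtV = matmul(transpose(X), Vinv)
    A = matmul(XtV, X)                                  # X' V^{-1} X (2x2)
    b = [sum(r * yi for r, yi in zip(row, y)) for row in XtV]
    return solve2(A, b)

X = [[1, 0], [1, 1], [1, 2]]                            # intercept + covariate
y = [2, 5, 8]                                           # exactly y = 2 + 3x
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]                  # V = identity
beta = gls(X, I3, y)
```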
19.5 Example
• Covariates:
– SEX: xi = 1 for boys and 0 for girls
– AGE: 8, 10, 12, and 14
• Model:
Yij = β0+β01xi+β10tj (1−xi)+β11tj xi+b0i+b1itj +εij ,
• Predicted means:
• Pros:
– The algorithm is quick and efficient (compared to full ML).
– It allows estimation of an overdispersion parameter (since the algorithm iteratively fits linear models).
Just write
Note
We will use the growth data to illustrate the built-in function lme(). SPlus Version 4.5 is used.
Apart from the references mentioned earlier which give the theoretical underpinning, there is
CHAPTER 20. THE USE OF SPLUS 391
ample documentation within SPlus. The on-line manual provides a 53-page discussion of linear
and nonlinear mixed-effects models. The function lme() is generic. The on-line help system of
SPlus provides a brief account of the syntax of this generic function. Methods functions are
being developed for specific classes of objects. The methods function lme.formula() comes
with ample documentation.
Fixed effects. The structure is specified by means of the fixed argument, using standard
formulas.
Clusters. The clusters (subjects, units, etc.) are defined using cluster.
Method of estimation. Both maximum likelihood and REML are provided. The user’s
preference can be specified by means of the argument est.method.
Other tools include subsetting, specifying the action to be undertaken on missing data, and
control over the estimation algorithm.
cluster = ~ IDNR,
data = growth5.df,
re.structure = "unstructured",
na.action = "na.omit",
est.method = "ML")
Call:
Fixed: MEASURE ~ 1 + MALE + MALEAGE + FEMAGE
Random: ~ 1 + AGE
Cluster: ~ (IDNR)
Data: growth5.df
Structure: unstructured
Parametrization: matrixlog
Standard Deviation(s) of Random Effect(s)
(Intercept) AGE
2.134752 0.1541473
Correlation of Random Effects
(Intercept)
AGE -0.6025632
Although the above output is rather brief, one can obtain a more extensive summary:
Call:
Fixed: MEASURE ~ 1 + MALE + MALEAGE + FEMAGE
Random: ~ 1 + AGE
Cluster: ~ (IDNR)
Data: growth5.df
Estimation Method: ML
Convergence at iteration: 6
Log-likelihood: -213.903
AIC: 443.806
BIC: 465.263
The estimates and standard errors coincide with those obtained with, for example, MLwiN. This
is immediately clear for the fixed-effects estimates, their standard errors, and the residual
variance. The components of the D matrix have to be derived from the standard deviations and
correlation of the random effects:
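For example (a sketch of the arithmetic only, using the standard deviations 2.134752 and 0.1541473 and the correlation −0.6025632 reported above):

```python
def d_matrix(sds, corr):
    """Random-effects covariance matrix D from the standard deviations
    and the (intercept, AGE) correlation, as reported by lme()."""
    s1, s2 = sds
    off = corr * s1 * s2          # covariance = correlation * sd1 * sd2
    return [[s1 * s1, off],
            [off, s2 * s2]]

D = d_matrix((2.134752, 0.1541473), -0.6025632)
# D[0][0] ~ 4.557, D[0][1] ~ -0.198, D[1][1] ~ 0.0238
```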
As is the case with MLwiN, SPlus in general, and lme() in particular, have extensive graphical
capabilities.