
Repeated Categorical Data Analysis

Geert Molenberghs
Ariel Alonso Abad
Fabián Santiago Tibaldi

CenStat, Limburgs Universitair Centrum

Cuba, December 2001

In collaboration with: Didier Renard, Geert Verbeke


Contents

1 Reading 1

1.1 Basic References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 Generalized Linear Models 1

2.1 GLM for Independent Responses: Review . . . . . . . . . . . . . . . . . . . . . 2

2.2 Example: Normal Linear Model . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2.3 Example: Bernoulli Logistic Model . . . . . . . . . . . . . . . . . . . . . . . . 5

2.4 Likelihood Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

3 Case Study: Analgesic Trial 9

3.1 Observed Frequencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.2 Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.3 Missingness Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.3.1 Dropout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


3.4 Summary Table: Logit Link . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

4 Linear (Mixed) Models for Longitudinal Data 22

4.1 Correlated Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

4.2 Taxonomy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

4.2.1 EXAMPLE: Reading ability and age . . . . . . . . . . . . . . . . . . . . 25

4.2.2 EXAMPLE: CD4+ cell counts . . . . . . . . . . . . . . . . . . . . . . . 26

4.2.3 EXAMPLE: Weight of Pigs . . . . . . . . . . . . . . . . . . . . . . . . 27

4.3 Scientific Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

4.4 Merits of Longitudinal Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.5 Advantages of Longitudinal Studies . . . . . . . . . . . . . . . . . . . . . . . . 30

4.6 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.7 Types of Longitudinal Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.7.1 Balanced Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.7.2 Unbalanced Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

4.8 Types of Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.9 Components of Variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.10 Full Multivariate Model For Balanced Data . . . . . . . . . . . . . . . . . . . . 35

4.10.1 Case Study: Growth Data . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.10.2 Model formulation and estimation . . . . . . . . . . . . . . . . . . . . . 39



4.10.3 Example: Growth data . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.11 Linear Mixed Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.12 Inference in the marginal linear mixed model . . . . . . . . . . . . . . . . . . . 64

4.12.1 The hierarchical versus marginal model . . . . . . . . . . . . . . . . . . 64

4.12.2 Notation and terminology . . . . . . . . . . . . . . . . . . . . . . . . . 66

4.12.3 The Autocorrelation Function . . . . . . . . . . . . . . . . . . . . . . . 67

4.12.4 The Variogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.13 Empirical Bayes Methods for the Random Effects . . . . . . . . . . . . . . . . . 70

4.13.1 Estimation of the Random Effects bi . . . . . . . . . . . . . . . . . . . 70

4.13.2 Empirical Bayes estimates b̂i . . . . . . . . . . . . . . . . . . . . . . . 71

4.13.3 Shrinkage estimators b̂i . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Case Study: Vaccination Trial 75

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.2 Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

5.3 Selection of a Covariance Structure . . . . . . . . . . . . . . . . . . . . . . . . 80

5.3.1 Note: Log-Linear Variance Model . . . . . . . . . . . . . . . . . . . . . 81

5.4 Simplification of the Mean Structure . . . . . . . . . . . . . . . . . . . . . . . 83

5.4.1 Note: Fractional Polynomials . . . . . . . . . . . . . . . . . . . . . . . 84

5.5 How Does the Model Fit? . . . . . . . . . . . . . . . . . . . . . . . . . . . 87



5.6 Prediction From the Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

6 Case Study: Surrogate Markers 91

6.1 Age-Related Macular Degeneration . . . . . . . . . . . . . . . . . . . . . . . . 92

6.2 Advanced Ovarian Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

6.3 Advanced Colorectal Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

6.4 Definition of Surrogate Endpoint . . . . . . . . . . . . . . . . . . . . . . . . . 100

6.5 ARMD: Prentice’s Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

6.6 Proportion Explained . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.7 Criticism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.8 Relative Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.9 Adjusted Association . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.10 Use of RE and Adjusted Association . . . . . . . . . . . . . . . . . . . . . . . 106

6.11 Analysis Based on Several Trials . . . . . . . . . . . . . . . . . . . . . . . . . . 107

6.12 Statistical Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

6.13 Methods of Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.14 ARMD: Trial-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

6.15 ARMD: Individual-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . . 114

6.16 Ovarian: Trial-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

6.17 Ovarian: Individual-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . . 117



6.18 Ovarian: Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

6.19 Colorectal: Trial-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . . . 119

6.20 Colorectal: Individual-Level Surrogacy . . . . . . . . . . . . . . . . . . . . . . . 120

7 Case Study: The Prostate Cancer Data 121

7.1 The use of PSA to detect prostate cancer . . . . . . . . . . . . . . . . . . . . . 121

7.1.1 The Baltimore Longitudinal Study of Aging (BLSA) . . . . . . . . . . . 123

7.1.2 The prostate data from the BLSA . . . . . . . . . . . . . . . . . . . . . 124

7.1.3 Descriptive statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

7.2 A two-stage model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

7.2.1 General idea of two-stage models . . . . . . . . . . . . . . . . . . . . . 126

7.2.2 Applied to the prostate data . . . . . . . . . . . . . . . . . . . . . . . . 127

7.2.3 Matrix notation for two-stage models . . . . . . . . . . . . . . . . . . . 131

7.3 Linear mixed models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

7.3.1 Stage 1 + Stage 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

7.4 Fitting linear mixed models in SAS . . . . . . . . . . . . . . . . . . . . . . . . 133

7.5 Inference for contrasts of fixed effects . . . . . . . . . . . . . . . . . . . . . . . 147

7.5.1 The CONTRAST statement . . . . . . . . . . . . . . . . . . . . . . . . 147

7.5.2 The ESTIMATE statement . . . . . . . . . . . . . . . . . . . . . . . . 157

7.6 Inference for variance components . . . . . . . . . . . . . . . . . . . . . . . . . 160



7.6.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

7.6.2 Wald tests for variance components . . . . . . . . . . . . . . . . . . . . 161

7.6.3 Likelihood ratio test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

8 Parametric Modeling Families 165

8.1 Continuous Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

8.1.1 Marginal Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

8.1.2 Random-Effects Models . . . . . . . . . . . . . . . . . . . . . . . . . . 166

8.1.3 Transition Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

8.2 Longitudinal Generalized Linear Models . . . . . . . . . . . . . . . . . . . . . . 168

8.2.1 Marginal Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

8.2.2 Random-effects Models . . . . . . . . . . . . . . . . . . . . . . . . . . 171

8.2.3 Conditional Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

8.2.4 Marginal Versus Conditional Models . . . . . . . . . . . . . . . . . . . . 173

8.3 Main Focus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

9 Modelling Repeated Categorical Data 175

9.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

9.1.1 The Standard (Regression) Notation . . . . . . . . . . . . . . . . . . . 176

9.1.2 The Table Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177



9.2 A Conditional Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

9.2.1 Interpretation of Parameters . . . . . . . . . . . . . . . . . . . . . . . . 180

9.3 Marginal Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

9.3.1 Link Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

10 Case Study: NTP Data 188

10.1 Data Structure of Developmental Toxicity Studies . . . . . . . . . . . . . . . . 189

10.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

10.3 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

10.4 Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

10.5 Quadratic Log-linear Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

10.6 The quadratic exponential model . . . . . . . . . . . . . . . . . . . . . . . . . 193

10.7 The linear exponential model . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

10.8 Specialized to Clustered Binary Data . . . . . . . . . . . . . . . . . . . . . . . 194

10.9 Quadratic Clustered Loglinear Model . . . . . . . . . . . . . . . . . . . . . . . 195

10.10 The Bahadur Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

11 Generalized Estimating Equations 197

11.1 Large Sample Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

11.2 Unknown Covariance Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 202



11.3 The Sandwich Estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

11.4 The Working Correlation Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 205

11.4.1 Estimation of Working Correlation . . . . . . . . . . . . . . . . . . . . . 206

11.5 Fitting GEE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

11.6 The NTP Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

11.6.1 PROC GENMOD Code . . . . . . . . . . . . . . . . . . . . . . . . . . 210

11.6.2 Discussion of Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

11.6.3 PROC GENMOD Output . . . . . . . . . . . . . . . . . . . . . . . . . 213

11.6.4 Discussion of Output . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

11.7 GEE: Alternative 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

11.7.1 gee1corr.mac Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

11.7.2 gee1corr.mac Output . . . . . . . . . . . . . . . . . . . . . . . . . . 224

11.8 GEE: Alternative 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

11.8.1 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

11.8.2 The Variance Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 228

11.8.3 GLIMMIX Macro Code . . . . . . . . . . . . . . . . . . . . . . . . . . 229

11.8.4 Discussion of Macro . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

11.8.5 GLIMMIX Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

11.8.6 Discussion of Output . . . . . . . . . . . . . . . . . . . . . . . . . . . 239



11.9 Comparison of GEE Estimates (Standard Errors) . . . . . . . . . . . . . . . . . 241

11.10 GEE2: Odds Ratios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

11.11 GEE2: Correlations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

11.11.1 The NTP Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

11.11.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

11.12 Alternating Logistic Regression . . . . . . . . . . . . . . . . . . . . . . . 248

12 Case Study: Analgesic Trial 249

12.1 Comparison of GEE Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

12.2 Comparison of GEE Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

12.2.1 Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255

12.3 Use of GLIMMIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

12.3.1 Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257

12.4 Alternating Logistic Regressions . . . . . . . . . . . . . . . . . . . . . . . . . . 261

12.4.1 Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

13 Random-Effects Models 266

13.1 The Marginal Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

13.2 Numerical Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

13.2.1 Adaptive Gaussian Quadrature . . . . . . . . . . . . . . . . . . . . . . . 268



13.2.2 First Order Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

13.3 Estimation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

13.4 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

13.5 The Beta-binomial Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

13.5.1 The NTP Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274

13.5.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275

13.6 Generalized Linear Mixed Models . . . . . . . . . . . . . . . . . . . . . . . . . 276

13.6.1 Quasi-likelihood Function . . . . . . . . . . . . . . . . . . . . . . . . . 277

13.6.2 Quasi-likelihood For Generalized Linear Mixed Models . . . . . . . . . . 279

13.6.3 Estimation Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 280

13.7 Linear Mixed Model Using GLIMMIX . . . . . . . . . . . . . . . . . . . . . . . 282

13.7.1 Selected Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

13.7.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293

13.7.3 GLIMMIX Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

13.7.4 Output From GLIMMIX . . . . . . . . . . . . . . . . . . . . . . . . . . 300

13.7.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

13.8 The NTP Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

13.9 Transition Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310

13.10 Differences Between Families of Models . . . . . . . . . . . . . . . . . . . 312



14 Case Study: Analgesic Trial 313

14.1 PROC NLMIXED Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

14.1.1 Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

14.1.2 Population Averaged Profiles . . . . . . . . . . . . . . . . . . . . . . . 317

14.1.3 Fitted Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

14.2 Comparison of Different Approaches/Programs . . . . . . . . . . . . . . . . . . 322

14.2.1 PROC NLMIXED (Gaussian quadrature and N-R) . . . . . . . . . . . . 322

14.2.2 MIXOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

14.2.3 PQL2 (MLwiN) without and with extra-dispersion parameter . . . . . . 327

14.2.4 PQL (MLwiN) without and with extra-dispersion parameter . . . . . . . 328

14.2.5 PQL (GLIMMIX) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

15 Analgesic Trial: Ordinal Data 332

15.1 Proportional odds model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332

15.2 GEE Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336

15.3 Random-effects Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

16 Missing Data 343

16.1 Missing Data Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

16.2 The Name of the Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345



16.3 Factorizing the Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

16.4 Selection Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347

16.5 Missing Data Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348

16.6 Ignorability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349

16.7 Ignorability ⇐ Separability . . . . . . . . . . . . . . . . . . . . . . . . . 350

16.8 Simple Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351

16.9 Three Likelihood Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . 352

16.10 An Ignorable Likelihood Analysis . . . . . . . . . . . . . . . . . . . . . . 353

16.11 A Selection Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354

16.12 Dropout Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

16.13 Contributions Combined . . . . . . . . . . . . . . . . . . . . . . . . . . 356

16.14 A Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

16.15 Pattern-Mixture Models . . . . . . . . . . . . . . . . . . . . . . . . . . 358

16.16 Pattern-Mixture Models . . . . . . . . . . . . . . . . . . . . . . . . . . 359

16.17 Pattern-Mixture Modeling . . . . . . . . . . . . . . . . . . . . . . . . . 360

16.18 Estimating Marginal Effects From PMM . . . . . . . . . . . . . . . . . . 361

16.19 Random-Coefficient Models . . . . . . . . . . . . . . . . . . . . . . . . . 362

16.20 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363

16.21 Literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

16.22 Non-Normal Outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . 365

16.23 Pros and Cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366

16.24 Less Parametric Approaches . . . . . . . . . . . . . . . . . . . . . . . . 367

17 Case Study: Analgesic Trial 368

17.1 Weighted GEE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

17.2 Analgesic Trial Example . . . . . . . . . . . . . . . . . . . . . . . . . . 370

17.2.1 Estimated working correlation structures . . . . . . . . . . . . . . . . . 373

18 PROC NLMIXED 370

18.1 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370

18.2 Particularities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371

18.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372

18.4 MIXOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

19 Introduction to Multilevel Modeling 374

19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374

19.2 Multilevel Model Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 376

19.3 Estimation Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

19.3.1 IGLS Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

19.3.2 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380



19.4 Illustration of the IGLS Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 381

19.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382

19.6 Multilevel Models for Discrete Response Data . . . . . . . . . . . . . . . . . . . 386

19.7 MQL/PQL Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387

20 The Use of SPlus 390

20.1 Fitting Mixed Models Using SPlus . . . . . . . . . . . . . . . . . . . . . . . . . 390


Chapter 1

Reading

1.1 Basic References

• Verbeke, G. and Molenberghs, G. (1997). Linear Mixed Models in Practice: A SAS-Oriented Approach. Lecture Notes in Statistics 126. New York: Springer-Verlag.

• Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer-Verlag.

• Fahrmeir, L. and Tutz, G. (2001). Multivariate Statistical Modelling Based on Generalized Linear Models (2nd edition). Springer Series in Statistics. New York: Springer-Verlag.

• Diggle, P.J., Liang, K.Y., and Zeger, S.L. (1994). Analysis of Longitudinal Data. Oxford: Oxford University Press.

1.2 Further Reading


• Brown, H. and Prescott, R. (1999). Applied Mixed Models in Medicine. Chichester: John Wiley.

• Crowder, M.J. and Hand, D.J. (1990). Analysis of Repeated Measures. London: Chapman and Hall.

• Davidian, M. and Giltinan, D.M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman and Hall.

• Goldstein, H. (1979). The Design and Analysis of Longitudinal Studies. London: Academic Press.

• Goldstein, H. (1995). Multilevel Statistical Models. London: Edward Arnold.

• Hand, D.J. and Crowder, M.J. (1995). Practical Longitudinal Data Analysis. London: Chapman and Hall.

• Jones, B. and Kenward, M.G. (1989). Design and Analysis of Cross-Over Trials. London: Chapman and Hall.

• Kshirsagar, A.M. and Smith, W.B. (1995). Growth Curves. New York: Marcel Dekker.

• Lindsey, J.K. (1993). Models for Repeated Measurements. Oxford: Oxford University Press.

• Longford, N.T. (1993). Random Coefficient Models. Oxford: Oxford University Press.

• McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models (2nd edition). London: Chapman and Hall.

• Pinheiro, J.C. and Bates, D.M. (2000). Mixed-Effects Models in S and S-PLUS. New York: Springer-Verlag.

• Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components. New York: John Wiley & Sons.

• Senn, S.J. (1993). Cross-over Trials in Clinical Research. Chichester: Wiley.

• Vonesh, E.F. and Chinchilli, V.M. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. Basel: Marcel Dekker.
Chapter 2

Generalized Linear Models

Generalized linear models (GLMs) provide a unifying theory for a wide range of settings:

• normal: linear models: multiple regression, ANOVA

• binary: probit and logit (logistic) regression

• categorical data: log-linear modelling

• counts: Poisson regression

• non-negative continuous data: survival analysis (possibly censored)

(McCullagh and Nelder 1989)


2.1 GLM for Independent Responses: Review

1. E(Yi) = µi

2. η(µi) = xiᵀβ, with η(·) the link function

3. Var(Yi) = φ v(µi), where

• v(·) is a known variance function

• φ is a scale parameter (sometimes called the overdispersion parameter)

4. exponential family p.d.f.

f(y | θi, φ) = exp{ φ⁻¹ [y θi − ψ(θi)] + c(y, φ) }

with θi the natural parameter and ψ(·) a function satisfying

– µi = ψ′(θi)

– v(µi) = ψ″(θi)

Summary

1.–2. specify the mean response as a known function of explanatory variables (covariates, regression parameters)

3. specify the variance of the response as a known function of the mean, multiplied by a scale parameter

4. specify the distribution of the response as a member of the exponential family (for large samples, this assumption can be relaxed)

2.2 Example: Normal Linear Model

Yi ∼ N(xiᵀβ, σ²)

with

• η(µ) = µ (identity link)

• v(µ) = 1

• φ = σ²

• θ = µ

• ψ(θ) = θ²/2

• c(y, φ) = −y²/(2φ) − ½ ln(2πφ)
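As a quick sanity check, the exponential-family form with these choices of θ, ψ, φ, and c reproduces the familiar N(µ, σ²) density. The sketch below is in Python rather than the SAS used elsewhere in these notes, and the numeric values are arbitrary:

```python
import math

def expfam_normal(y, mu, phi):
    """Normal density written in exponential-family form:
    theta = mu, psi(theta) = theta**2 / 2, phi = sigma**2."""
    theta = mu
    psi = theta**2 / 2
    c = -y**2 / (2 * phi) - 0.5 * math.log(2 * math.pi * phi)
    return math.exp((y * theta - psi) / phi + c)

def normal_pdf(y, mu, sigma2):
    """The usual N(mu, sigma^2) density, for comparison."""
    return math.exp(-(y - mu)**2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

y, mu, sigma2 = 1.3, 0.5, 2.0
print(expfam_normal(y, mu, sigma2), normal_pdf(y, mu, sigma2))
```

The two expressions agree because y θ − ψ(θ) − y²/2 = −(y − µ)²/2 when θ = µ.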

2.3 Example: Bernoulli Logistic Model

P(Yi = 1) = exp(xiᵀβ) / {1 + exp(xiᵀβ)}

with

• η(µ) = ln{µ/(1 − µ)}

• v(µ) = µ(1 − µ)

• φ = 1

• θ = ln{µ/(1 − µ)}

• ψ(θ) = ln{1 + exp(θ)}

• Verify the conditions on ψ(θ)
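The two conditions, ψ′(θ) = µ and ψ″(θ) = v(µ) = µ(1 − µ), can be checked numerically by finite differences. A small Python sketch (Python is used here purely for illustration; µ = 0.7 is an arbitrary choice):

```python
import math

# Cumulant function of the Bernoulli model
psi = lambda t: math.log(1 + math.exp(t))

mu = 0.7
theta = math.log(mu / (1 - mu))   # natural parameter theta = logit(mu)
h = 1e-5

# Central finite-difference approximations to psi'(theta) and psi''(theta)
d1 = (psi(theta + h) - psi(theta - h)) / (2 * h)
d2 = (psi(theta + h) - 2 * psi(theta) + psi(theta - h)) / h**2

print(d1, mu)             # psi'(theta) should equal mu
print(d2, mu * (1 - mu))  # psi''(theta) should equal v(mu) = mu(1 - mu)
```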



2.4 Likelihood Estimation

ℓ(θ1, . . . , θN, φ) = φ⁻¹ Σ_{i=1}^{N} {yi θi − ψ(θi)} + Σ_{i=1}^{N} c(yi, φ)

Since θi is modelled in terms of β: θi = θi(β), we find

∂ℓ/∂βj = φ⁻¹ Σ_{i=1}^{N} {yi − ψ′(θi)} ∂θi/∂βj

which can be rewritten in the classical score-equations form

S(βj) = Σ_{i=1}^{N} (∂θi/∂βj) ψ″(θi) [φ ψ″(θi)]⁻¹ {yi − ψ′(θi)},   (j = 1, . . . , p)

The MLE is found by solving the score equations S(β) = 0.

The score equations can be rewritten in a useful form, using two facts:

1. ψ′(θi) = µi and thus

∂µi/∂βj = ψ″(θi) ∂θi/∂βj

2. vi = Var(Yi) = φ ψ″(θi)

The score equations become

S(βj) = Σ_{i=1}^{N} (∂µi/∂βj) vi⁻¹ (yi − µi) = 0

or, in vector notation,

S(β) = Σ_{i=1}^{N} (∂µi/∂β)ᵀ vi⁻¹ (yi − µi) = 0.
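Solving S(β) = 0 for the Bernoulli logistic model by Fisher scoring can be sketched in a few lines of pure Python. The data below are made up for illustration (the course software is SAS); only the update step, score = Xᵀ(y − µ) and information = XᵀWX with W = diag{vi}, matters:

```python
import math

# Toy data: one covariate x plus an intercept; y is binary.
x = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y = [0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = 0.0, 0.0
for _ in range(25):
    mu = [1 / (1 + math.exp(-(b0 + b1 * xi))) for xi in x]  # E(Y_i)
    v = [m * (1 - m) for m in mu]                           # v(mu_i), phi = 1
    # Score S(beta) = sum_i (dmu_i/dbeta)^T v_i^{-1} (y_i - mu_i)
    s0 = sum(yi - mi for yi, mi in zip(y, mu))
    s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
    # Fisher information X' W X, written out for the 2x2 case
    i00 = sum(v)
    i01 = sum(vi * xi for vi, xi in zip(v, x))
    i11 = sum(vi * xi * xi for vi, xi in zip(v, x))
    det = i00 * i11 - i01 * i01
    d0 = (i11 * s0 - i01 * s1) / det
    d1 = (i00 * s1 - i01 * s0) / det
    b0, b1 = b0 + d0, b1 + d1
    if max(abs(d0), abs(d1)) < 1e-10:
        break

print(b0, b1)  # MLE: the betas at which S(beta) = 0
```

Each iteration is a weighted least-squares step, which is why the scheme is known as iteratively reweighted least squares.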

Remarks

• Estimation of β depends on the p.d.f. f(·) only through the means and variances of the responses Yi.

• A unified estimation scheme is obtainable in terms of the specified explanatory variables, link function, and variance function: iteratively (re-)weighted least squares. Newton-Raphson and Fisher scoring can be used equally well.

• The theory of quasi-likelihood shows that the usual asymptotic properties of β̂ hold when the means and variances of Yi are specified correctly, even if the distribution f(·) is not.

• In some applications φ is a known constant. If not, φ needs to be estimated in order to construct standard errors for β̂.
Chapter 3

Case Study: Analgesic Trial

• single-arm trial with 530 patients recruited (491 selected for analysis)

• analgesic treatment for pain caused by chronic nonmalignant disease

• treatment was to be administered for 12 months

• we will focus on the Global Satisfaction Assessment (GSA)

• the GSA scale goes from 1 = very good to 5 = very bad

• GSA was rated by each subject 4 times during the trial, at months 3, 6, 9, and 12


3.1 Observed Frequencies

|---------|------------------------------------------------------|----------|
| | GSA | |
| |----------|----------|----------|----------|----------| |
| |Very Good | Good | Moderate | Bad | Very Bad | All |
| |----|-----+----|-----+----|-----+----|-----+----|-----+----|-----|
| | N | % | N | % | N | % | N | % | N | % | N | % |
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|Time | | | | | | | | | | | | |
|---------| | | | | | | | | | | | |
|MONTH 3 | 55| 14.3| 112| 29.1| 151| 39.2| 52| 13.5| 15| 3.9| 385|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 6 | 38| 12.6| 84| 27.8| 115| 38.1| 51| 16.9| 14| 4.6| 302|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 9 | 40| 17.6| 67| 29.5| 76| 33.5| 33| 14.5| 11| 4.8| 227|100.0|
|---------+----+-----+----+-----+----+-----+----+-----+----+-----+----+-----|
|MONTH 12 | 30| 13.5| 66| 29.6| 97| 43.5| 27| 12.1| 3| 1.3| 223|100.0|
|---------|----|-----|----|-----|----|-----|----|-----|----|-----|----|-----|

3.2 Questions

• Evolution over time

• Relation with baseline covariates: age, sex, duration of the pain, type of pain, disease progression, Pain Control Assessment (PCA), . . .

• Investigation of dropout

3.3 Missingness Patterns


Missingness Cumulative Cumulative
Pattern Frequency Percent Frequency Percent
-------------------------------------------------------
**** 96 19.6 96 19.6
***- 2 0.4 98 20.0
**-* 1 0.2 99 20.2
*-** 3 0.6 102 20.8
*-*- 1 0.2 103 21.0
*--* 1 0.2 104 21.2
*--- 2 0.4 106 21.6
-*** 63 12.8 169 34.4
-**- 18 3.7 187 38.1
-*-* 2 0.4 189 38.5
-*-- 7 1.4 196 39.9
--** 51 10.4 247 50.3
--*- 30 6.1 277 56.4
---* 51 10.4 328 66.8
---- 163 33.2 491 100.0

3.3.1 Dropout
Dropout Time

Frequency|
Col Pct |MONTH 3 |MONTH 6 |MONTH 9 |MONTH 12| Total
---------+--------+--------+--------+--------+
No | 385 | 302 | 227 | 223 | 1137
| 78.41 | 61.51 | 46.23 | 45.42 |
---------+--------+--------+--------+--------+
Yes | 106 | 189 | 264 | 268 | 827
| 21.59 | 38.49 | 53.77 | 54.58 |
---------+--------+--------+--------+--------+
Total 491 491 491 491 1964

Dropout
Pattern Cumulative Cumulative
(redefined) Frequency Percent Frequency Percent
-------------------------------------------------------------
**** 96 19.55 96 19.55
-*** 63 12.83 159 32.38
--** 54 11.00 213 43.38
---* 55 11.20 268 54.58
---- 223 45.42 491 100.00
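The redefined (monotone) dropout patterns above can be reproduced from the full missingness-pattern table by classifying each subject according to the last observed visit. In the pattern codes, "-" marks an observed visit and "*" a missed one; this reading is consistent with the totals (the 106 subjects unobserved at month 3 are exactly the patterns beginning with "*"). A Python sketch of the bookkeeping (the course software is SAS; Python is used here only for illustration):

```python
from collections import Counter

# Full missingness patterns (months 3, 6, 9, 12) and their frequencies,
# copied from the table above. '-' = observed, '*' = missing.
patterns = {
    "****": 96, "***-": 2, "**-*": 1, "*-**": 3, "*-*-": 1, "*--*": 1,
    "*---": 2, "-***": 63, "-**-": 18, "-*-*": 2, "-*--": 7, "--**": 51,
    "--*-": 30, "---*": 51, "----": 163,
}

def monotonized(pattern):
    """Dropout pattern implied by the last observed visit: observed up to
    that visit, missing afterwards (never observed -> '****')."""
    last = max((i for i, c in enumerate(pattern) if c == "-"), default=-1)
    return "-" * (last + 1) + "*" * (3 - last)

dropout = Counter()
for pat, n in patterns.items():
    dropout[monotonized(pat)] += n

for pat in ["****", "-***", "--**", "---*", "----"]:
    print(pat, dropout[pat])
```

Running this reproduces the redefined table: 96, 63, 54, 55, and 223 subjects, summing to 491.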

Generalized Linear Models

• Early dropout (did the subject drop out after the first or the second visit)?
• Binary response
• PROC GENMOD can fit GLMs in general
• PROC LOGISTIC can fit models for binary (and
ordered) responses
• SAS code:

/*** Logit link ***/


proc genmod data=earlydrp;
model earlydrp = pca0 weight psychiat physfct / dist=b;
run;

proc logistic data=earlydrp descending;


model earlydrp = pca0 weight psychiat physfct;
run;

/*** Probit link ***/


proc genmod data=earlydrp;
model earlydrp = pca0 weight psychiat physfct / dist=b link=probit;
run;

proc logistic data=earlydrp descending;


model earlydrp = pca0 weight psychiat physfct / link=probit;
run;

The GENMOD Procedure

Model Information

Data Set WORK.EARLYDRP


Distribution Binomial
Link Function Logit
Dependent Variable earlydrp
Observations Used 386
Probability Modeled Pr( earlydrp = 1 )
Missing Values 9

Response Profile

Ordered Ordered
Level Value Count

1 0 271
2 1 115

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 381 437.9967 1.1496


Scaled Deviance 381 437.9967 1.1496
Pearson Chi-Square 381 384.1586 1.0083
Scaled Pearson X2 381 384.1586 1.0083
Log Likelihood -218.9984

Algorithm converged.

Analysis Of Parameter Estimates

Standard Wald 95% Chi-


Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq

Intercept 1 -1.0673 0.7328 -2.5037 0.3690 2.12 0.1453


PCA0 1 0.3981 0.1343 0.1349 0.6614 8.79 0.0030
WEIGHT 1 -0.0211 0.0072 -0.0353 -0.0070 8.55 0.0034
PSYCHIAT 1 0.7169 0.2871 0.1541 1.2796 6.23 0.0125
PHYSFCT 1 0.0121 0.0050 0.0024 0.0219 5.97 0.0145
Scale 0 1.0000 0.0000 1.0000 1.0000

NOTE: The scale parameter was held fixed.



The LOGISTIC Procedure

Model Information

Data Set WORK.EARLYDRP


Response Variable earlydrp
Number of Response Levels 2
Number of Observations 386
Link Function Logit
Optimization Technique Fisher’s scoring

Response Profile

Ordered Total
Value earlydrp Frequency

1 1 115
2 0 271

NOTE: 9 observations were deleted due to missing values for the response or
explanatory variables.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Intercept
Intercept and
Criterion Only Covariates

AIC 472.224 447.997


SC 476.179 467.776
-2 Log L 470.224 437.997

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 32.2269 4 <.0001


Score 31.6004 4 <.0001
Wald 28.3625 4 <.0001

Analysis of Maximum Likelihood Estimates

Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq

Intercept 1 -1.0674 0.7328 2.1214 0.1453


PCA0 1 0.3981 0.1343 8.7885 0.0030
WEIGHT 1 -0.0211 0.00723 8.5546 0.0034
PSYCHIAT 1 0.7169 0.2871 6.2338 0.0125
PHYSFCT 1 0.0121 0.00496 5.9706 0.0145

Odds Ratio Estimates

Point 95% Wald


Effect Estimate Confidence Limits

PCA0 1.489 1.144 1.937


WEIGHT 0.979 0.965 0.993
PSYCHIAT 2.048 1.167 3.595
PHYSFCT 1.012 1.002 1.022

Association of Predicted Probabilities and Observed Responses

Percent Concordant 67.1 Somers’ D 0.346


Percent Discordant 32.5 Gamma 0.347
Percent Tied 0.4 Tau-a 0.145
Pairs 31165 c 0.673

The GENMOD Procedure

Model Information

Data Set WORK.EARLYDRP


Distribution Binomial
Link Function Probit
Dependent Variable earlydrp
Observations Used 386
Probability Modeled Pr( earlydrp = 1 )
Missing Values 9

Response Profile

Ordered Ordered
Level Value Count

1 0 271
2 1 115

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 381 437.9255 1.1494


Scaled Deviance 381 437.9255 1.1494
Pearson Chi-Square 381 384.2600 1.0086
Scaled Pearson X2 381 384.2600 1.0086
Log Likelihood -218.9628

Algorithm converged.

Analysis Of Parameter Estimates

Standard Wald 95% Chi-


Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq

Intercept 1 -0.6485 0.4371 -1.5052 0.2082 2.20 0.1379


PCA0 1 0.2370 0.0791 0.0821 0.3920 8.99 0.0027
WEIGHT 1 -0.0126 0.0043 -0.0210 -0.0042 8.72 0.0031
PSYCHIAT 1 0.4300 0.1731 0.0908 0.7692 6.17 0.0130
PHYSFCT 1 0.0073 0.0030 0.0015 0.0132 6.06 0.0139
Scale 0 1.0000 0.0000 1.0000 1.0000

NOTE: The scale parameter was held fixed.



The LOGISTIC Procedure

Model Information

Data Set WORK.EARLYDRP


Response Variable earlydrp
Number of Response Levels 2
Number of Observations 386
Link Function Normit
Optimization Technique Fisher’s scoring

Response Profile

Ordered Total
Value earlydrp Frequency

1 1 115
2 0 271

NOTE: 9 observations were deleted due to missing values for the response or
explanatory variables.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Intercept
Intercept and
Criterion Only Covariates

AIC 472.224 447.926


SC 476.179 467.705
-2 Log L 470.224 437.926

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 32.2981 4 <.0001


Score 31.6004 4 <.0001
Wald 30.0157 4 <.0001

Analysis of Maximum Likelihood Estimates

Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq

Intercept 1 -0.6486 0.4354 2.2198 0.1363


PCA0 1 0.2371 0.0796 8.8714 0.0029
WEIGHT 1 -0.0126 0.00424 8.8833 0.0029
PSYCHIAT 1 0.4300 0.1741 6.0995 0.0135
PHYSFCT 1 0.00732 0.00297 6.0698 0.0138

Association of Predicted Probabilities and Observed Responses

Percent Concordant 67.1 Somers’ D 0.346


Percent Discordant 32.5 Gamma 0.347
Percent Tied 0.4 Tau-a 0.145
Pairs 31165 c 0.673
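The logit and probit fits above can be compared numerically. A well-known rule of thumb is that logit coefficients are roughly 1.6 to 1.8 times the corresponding probit coefficients; a quick check on the reported estimates (values copied from the SAS output):

```python
# Side-by-side slopes from the logit and probit fits above.
logit  = {"PCA0": 0.3981, "WEIGHT": -0.0211, "PSYCHIAT": 0.7169, "PHYSFCT": 0.0121}
probit = {"PCA0": 0.2370, "WEIGHT": -0.0126, "PSYCHIAT": 0.4300, "PHYSFCT": 0.00732}

# Classical rule of thumb: logit coefficients are roughly 1.6-1.8 times
# the corresponding probit coefficients (the two link functions differ
# mainly by a scale factor over the central range of probabilities).
ratios = {k: logit[k] / probit[k] for k in logit}
```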

3.4 Summary Table: Logit Link

Variable† Estimate S.E. p

Intercept -1.0674 0.7328 0.1453


PCA0 (PCA) 0.3981 0.1343 0.0030
WEIGHT (weight) -0.0211 0.0072 0.0034
PSYCHIAT (psychiatric disorder Yes/No) 0.7169 0.2871 0.0125
PHYSFCT (physical functioning) 0.0121 0.0050 0.0145
† All variables as measured at baseline.
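The odds ratios and Wald limits reported by PROC LOGISTIC follow directly from this summary table. A sketch reproducing them (estimates and standard errors copied from the table above):

```python
import math

# Logit-link estimates and standard errors from the summary table above.
est = {"PCA0": 0.3981, "WEIGHT": -0.0211, "PSYCHIAT": 0.7169, "PHYSFCT": 0.0121}
se  = {"PCA0": 0.1343, "WEIGHT": 0.0072, "PSYCHIAT": 0.2871, "PHYSFCT": 0.0050}

# Odds ratio = exp(estimate); 95% Wald CI = exp(estimate -/+ 1.96 * s.e.)
odds_ratio = {k: round(math.exp(b), 3) for k, b in est.items()}
wald_ci = {k: (round(math.exp(b - 1.96 * s), 3), round(math.exp(b + 1.96 * s), 3))
           for (k, b), s in zip(est.items(), se.values())}
```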
Chapter 4

Linear (Mixed) Models for Longitudinal


Data

4.1 Correlated Data


• One outcome, one sample
– height of human subjects
– mean, median, standard error, interquartile ranges

• One outcome, two samples (binary covariate)


– treated and untreated patients, two species,. . .
– Are means different across populations ?

CHAPTER 4. LINEAR (MIXED) MODELS FOR LONGITUDINAL DATA 23

• One outcome, more complex covariate


– several dose levels
– species of plants
– weight

⇒ Linear, logistic, Poisson,. . . regression


• One outcome, multiple covariates
– Most techniques extend easily
– Multi-way ANOVA
– Multiple regression
– Collinearity, confounding,. . .

4.2 Taxonomy

Classical design: (cross-sectional) A single outcome is


measured on each subject.

Multivariate design: A single outcome on several


variables is measured on each subject (HDL, LDL,
CHOL, APOA1, APOB).

Repeated measures design: Multiple outcomes of a


single quantity are measured on each subject (e.g.
the malformation index for all fetuses of the same
dam: clustered data).

Longitudinal design: Repeated measures over time.



4.2.1 EXAMPLE: Reading ability and age

• Panel (a) surprising ?


• First longitudinal interpretation: panel (b).
• Second longitudinal interpretation: panel (c).

4.2.2 EXAMPLE: CD4+ cell counts

• A cohort of 369 male HIV seroconverters.


• CD4+ cell-count is measured at approximately 6
month intervals.
• A variable number of measurements per subject.
• 2 distinct objectives:
– estimate the population average time course of
CD4+ cell depletion.
– estimate/predict time course for individual men.
• Substantial measurement error in CD4+ cell
determinations.
• CD4+ is highly variable due to sudden changes:
a mild infection, such as an ordinary cold, causes CD4+ to “shoot up”.

4.2.3 EXAMPLE: Weight of Pigs

• Plot for the weights of pigs


• “raw” and a “standardized residual” plot
• Raw plot:
– overall trend
– unclear picture of the random variation about the
trend
– variance is increasing over time
• The residuals have a tendency to remain “high” or
“low” for a given animal: tracking.
• Although still manageable, the plot is quite busy.

4.3 Scientific Questions

• Making inferences about the set of mean response


profiles (population averaged, marginal).
Does diet affect milk protein content ?
• Predicting individual response trajectories (individual
based, conditional, empirical Bayes).
Person-specific future course of CD4 cell depletion.
• Inferences about the variability between subjects
(sum of squares flavour).
Milk protein data: effect of diets.
• Inferences about the nature of the dependence
between measurements within subject (components
in the variability, the autocorrelation structure).

4.4 Merits of Longitudinal Studies

longitudinal designs ∼ study of change

cf. reading example

• Cross-sectional:
Yi1 = βC xi1 + εi1 (4.1)
βC : the average difference across two sub-populations
that differ by one unit in x.
• Repeated observations:
Yij = βC xi1 + βL(xij − xi1) + εij (4.2)
– j = 1: cross-sectional
– ⇒ βC retains interpretation
– In addition, βL can be studied
Subtract:
(Yij − Yi1) = βL(xij − xi1) + (εij − εi1)
βL: expected change in Y over time per unit
change in x.
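The identification argument above can be illustrated numerically. A minimal noise-free sketch with hypothetical covariate values: data are generated from model (4.2), and βL is recovered from within-subject differences alone:

```python
# A noise-free illustration of model (4.2), assuming hypothetical covariate
# values: each subject i has baseline x_i1 and later x_ij, and
#   Y_ij = beta_C * x_i1 + beta_L * (x_ij - x_i1).
beta_C, beta_L = 2.0, 0.5

subjects = {
    "i1": [1.0, 2.0, 3.0],   # x_i1, x_i2, x_i3 for subject 1
    "i2": [4.0, 5.0, 7.0],   # for subject 2
}

y = {i: [beta_C * x[0] + beta_L * (xj - x[0]) for xj in x]
     for i, x in subjects.items()}

# Subtracting the baseline removes beta_C entirely:
#   Y_ij - Y_i1 = beta_L * (x_ij - x_i1),
# so beta_L is identified from within-subject change only.
num = den = 0.0
for i, x in subjects.items():
    for j in range(1, len(x)):
        dx = x[j] - x[0]
        dy = y[i][j] - y[i][0]
        num += dx * dy
        den += dx * dx
beta_L_hat = num / den
```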

4.5 Advantages of Longitudinal Studies

• To study the longitudinal parameter with a


cross-sectional study: βL ≡ βC .
Very strong assumption.
• Longitudinal studies are more powerful:
The estimation of βL is based on changes of x
within subjects.
The subject serves as her own control.
• The distinction of between subjects variability and
within subjects variability is important:
– The between variability among CD4+ measures is
very high
⇒ less useful to predict an individual’s future
values.
– The past measurements of an individual may
contribute more information to predict an
individual’s future values than the measurements
of other subjects.

4.6 Notation

• Yij , yij : jth response on ith subject


• tij : measurement time at which Yij is taken
• xij : vector of explanatory variables
• Individual i: Y i = (Yi1, . . . , Yini )
• The entire dataset: Y = (Y 1, . . . , Y N )
• E(Y i) = µi
• Var(Y i) = V i
• E(Y ) = µ
• Var(Y ) = V

V has a block-diagonal structure:

        | V1  0   ...  0  |
        | 0   V2  ...  0  |
    V = | ..  ..  ...  .. |
        | 0   0   ...  VN |
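The block-diagonal structure of V can be sketched in code; the helper below and the 2 × 2 blocks are purely illustrative:

```python
import numpy as np

# Sketch: assemble the block-diagonal V = diag(V_1, ..., V_N) from
# per-subject covariance matrices (toy 2x2 blocks here).
def block_diag(blocks):
    n = sum(b.shape[0] for b in blocks)
    V = np.zeros((n, n))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        V[pos:pos + k, pos:pos + k] = b  # place V_i on the diagonal
        pos += k
    return V

V1 = np.array([[2.0, 0.5], [0.5, 2.0]])
V2 = np.array([[3.0, 1.0], [1.0, 3.0]])
V = block_diag([V1, V2])
```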

4.7 Types of Longitudinal Data

4.7.1 Balanced Data

• A fixed number of measurements per subject.


• Measurements taken at ± the same time.
• “holes”≡ missing data.
• A saturated model can be considered.

4.7.2 Unbalanced Data

• A variable, possibly random, number of measurements


per subject is taken.
• Measurement times: (random) variable
• It is hard to distinguish unbalancedness from genuine
missing data.

4.8 Types of Outcomes

• Linear (Mixed) Models:


– A continuous outcome (or transformation) drawn
from a ± normal distribution.
→ linear (mixed) model (Laird and Ware, 1982)
→ SAS PROC MIXED, BMDP5V, GENSTAT, SPlus
(lme and nlme, OSWALD), MLwiN
• Generalized Linear Models:
– McCullagh and Nelder (1989)
– Multivariate and longitudinal counterparts
– Remains a field of active research.
– Marginal, conditional, random effects models
– Breslow and Clayton (1993)
– Wolfinger and O’Connell (1995)
– Lee and Nelder (1996)
– Fahrmeir and Tutz (1994)
– Less software.
SAS (GENMOD:GEE, GLIMMIX), SPlus
(OSWALD, GEE).
• Survival Outcomes

4.9 Components of Variability

• Random effects:
– These are effects which arise from the
characteristics of individual subjects.
– Some subjects may be intrinsically high
responders, others intrinsically low responders.
– The influence of a random effect extends over all
measurements of the same subject.
• Serial correlation:
– Measurements taken close together in time are
typically more strongly correlated than those taken
further apart in time.
– On a sufficiently small time-scale, this kind of
structure is almost inevitable.
• Measurement error:
– When measurements involve delicate
determinations, the results may show substantial
variation even when two measurements are taken
at the same time from the same subject.
– e.g. bio-assay of blood samples

4.10 Full Multivariate Model For


Balanced Data

4.10.1 Case Study: Growth Data

• Potthoff and Roy (1964)


• Jennrich and Schluchter (1986)
• Little and Rubin (1987)
• Growth measurements for 11 girls and 16 boys.
• For each subject the distance from the center of the
pituitary to the pterygomaxillary fissure was recorded.
• At ages 8, 10, 12, and 14.
• Little and Rubin (1987) deleted 9 of the
[(11 + 16) × 4] measurements.
• 9 subjects are incomplete.
• Deletion is confined to the age 10 measurements.
• subjects with a low value at age 8 are more likely to
have a missing value at age 10.
• Balanced:
– common measurement times
– equally spaced measurements

• Later: missing data treatment, based on these data.



Growth data for 11 girls and 16 boys.


Age (in years) Age (in years)

Girl 8 10 12 14 Boy 8 10 12 14
1 21.0 20.0 21.5 23.0 1 26.0 25.0 29.0 31.0
2 21.0 21.5 24.0 25.5 2 21.5 22.5∗ 23.0 26.5
3 20.5 24.0∗ 24.5 26.0 3 23.0 22.5 24.0 27.5
4 23.5 24.5 25.0 26.5 4 25.5 27.5 26.5 27.0
5 21.5 23.0 22.5 23.5 5 20.0 23.5∗ 22.5 26.0
6 20.0 21.0∗ 21.0 22.5 6 24.5 25.5 27.0 28.5
7 21.5 22.5 23.0 25.0 7 22.0 22.0 24.5 26.5
8 23.0 23.0 23.5 24.0 8 24.0 21.5 24.5 25.5
9 20.0 21.0∗ 22.0 21.5 9 23.0 20.5 31.0 26.0
10 16.5 19.0∗ 19.0 19.5 10 27.5 28.0 31.0 31.5
11 24.5 25.0 28.0 28.0 11 23.0 23.0 23.5 25.0
12 21.5 23.5∗ 24.0 28.0
13 17.0 24.5∗ 26.0 29.5
14 22.5 25.5 25.5 26.0
15 23.0 24.5 26.0 30.0
16 22.0 21.5∗ 23.5 25.0

4.10.2 Model formulation and estimation

• Let yi be the vector of n repeated measurements for
the ith subject:

    yi = (yi1, yi2, . . . , yin)'

• The general multivariate model assumes that yi
satisfies a regression model

    yi = Xi β + εi

with

    Xi: matrix of covariates
    β: vector of regression parameters
    εi: vector of error components, εi ∼ N (0, Σ)

• We then have the following distribution for yi:

    yi ∼ N (Xi β, V )

• The mean structure Xi β is modelled as in classical
regression (ANOVA) models

• Usually, V is just a general (n × n) covariance matrix.
However, special structures for V can be assumed
(see later).

• Assuming independence across individuals, β and the
parameters in V can be estimated by maximizing

    LML = Π_{i=1}^{N} (2π)^{-n/2} |V|^{-1/2}
          × exp{ -(1/2) (yi − Xi β)' V^{-1} (yi − Xi β) }

• The MLE for β equals

    β̂ = ( Σ_{i=1}^{N} Xi' V^{-1} Xi )^{-1} Σ_{i=1}^{N} Xi' V^{-1} yi,

which has mean β and covariance matrix

    Var(β̂) = ( Σ_{i=1}^{N} Xi' V^{-1} Xi )^{-1}

• Inferences for β are obtained from replacing V by V̂
in the above equations, and from assuming normality
for β̂.

• If V is a general (n × n) covariance matrix, the MLE
equals

    V̂ = (1/N) Σ_i (yi − Xi β̂)(yi − Xi β̂)'

Otherwise, there is usually no analytic expression for
the MLE of the parameters in V .
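The closed-form estimator above is straightforward to code. A sketch with toy data (design and outcomes are illustrative); it also shows that with V = σ²I the GLS estimator reduces to ordinary least squares and is invariant to rescaling V:

```python
import numpy as np

# Sketch of the closed-form MLE above: the generalized least squares estimator
#   beta_hat = (sum_i X_i' V^{-1} X_i)^{-1} sum_i X_i' V^{-1} y_i
def gls(X_list, y_list, V):
    Vinv = np.linalg.inv(V)
    A = sum(X.T @ Vinv @ X for X in X_list)
    b = sum(X.T @ Vinv @ y for X, y in zip(X_list, y_list))
    return np.linalg.solve(A, b)

# Toy data: two subjects, intercept + time design, n = 3 occasions.
t = np.array([0.0, 1.0, 2.0])
X = np.column_stack([np.ones(3), t])
X_list = [X, X]
y_list = [np.array([1.0, 2.0, 3.0]), np.array([2.0, 3.0, 4.0])]

# With V = sigma^2 I, GLS reduces to ordinary least squares,
# and rescaling V leaves beta_hat unchanged.
beta_I = gls(X_list, y_list, np.eye(3))
beta_s = gls(X_list, y_list, 4.0 * np.eye(3))
```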

4.10.3 Example: Growth data

Model 1: Unstructured mean and covariance

• 8 parameters for mean structure

• Unstructured 4 × 4 covariance matrix

• We define xi = 0 for boys and xi = 1 for girls

• Age: ti = 8, 10, 12, 14

• Model 1 is given by:

Yi1 = β0 + β1xi + β0,8(1 − xi) + β1,8xi + εi1,

Yi2 = β0 + β1xi + β0,10(1 − xi) + β1,10xi + εi2,

Yi3 = β0 + β1xi + β0,12(1 − xi) + β1,12xi + εi3,

Yi4 = β0 + β1xi + εi4,



• In matrix notation, this equals

    Yi = Xi β + εi,

with

         | 1  xi  1−xi   0      0     xi   0    0  |
    Xi = | 1  xi   0    1−xi    0     0    xi   0  |
         | 1  xi   0     0     1−xi   0    0    xi |
         | 1  xi   0     0      0     0    0    0  |

and with

    β = (β0, β1, β0,8, β0,10, β0,12, β1,8, β1,10, β1,12)'

• Parameterization:
– Means for boys: β0 + β1 + β1,8
β0 + β1 + β1,10
β0 + β1 + β1,12
β0 + β1
– Means for girls: β0 + β0,8
β0 + β0,10
β0 + β0,12
β0

• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 1’;
class idnr sex age;
model measure = sex age*sex / s;
repeated / type = un subject = idnr r rcorr;
run;

• MLE’s and estimated standard errors for β

Parameter MLE (s.e.)


β0 24.0909 (0.6478)
β1 3.3778 (0.8415)
β0,8 -4.5938 (0.5369)
β0,10 -3.6563 (0.3831)
β0,12 -1.7500 (0.4290)
β1,8 -2.9091 (0.6475)
β1,10 -1.8636 (0.4620)
β1,12 -1.0000 (0.5174)

• Estimated covariance matrix V̂ :

    | 5.0143  2.5156  3.6206  2.5095 |
    | 2.5156  3.8748  2.7103  3.0714 |
    | 3.6206  2.7103  5.9775  3.8248 |
    | 2.5095  3.0714  3.8248  4.6164 |

• The corresponding correlation matrix is

    | 1.0000  0.5707  0.6613  0.5216 |
    | 0.5707  1.0000  0.5632  0.7262 |
    | 0.6613  0.5632  1.0000  0.7281 |
    | 0.5216  0.7262  0.7281  1.0000 |

Model 2: Linear average trends

• Linear average trend within each sex group

• Unstructured 4 × 4 covariance matrix

• Model 2 is given by:

Yij = β0 + β01xi + β10tj (1 − xi) + β11tj xi + εij ,

• In matrix notation, this equals

    Yi = Xi β + εi,

where the design matrix is

         | 1  xi   8(1−xi)   8xi  |
    Xi = | 1  xi  10(1−xi)  10xi  |
         | 1  xi  12(1−xi)  12xi  |
         | 1  xi  14(1−xi)  14xi  |

and

    β = (β0, β01, β10, β11)'.

• Parameterization:
– β0: intercept for boys
– β0 + β01: intercept for girls
– β10: slope for boys
– β11: slope for girls

• SAS program: delete age from the CLASS statement

• LR test Model 2 versus Model 1


Mean Covar par −2 Ref G2 df p
1 unstr. unstr. 18 416.509
2 = slopes unstr. 14 419.477 1 2.968 4 0.5632

• Predicted trends:
girls : Ŷj = 17.43 + 0.4764tj

boys : Ŷj = 15.84 + 0.8268tj



A View On the Data



Model 3: Parallel average profiles

• Linear average trend within each sex group

• The same slope for both groups

• Unstructured 4 × 4 covariance matrix

• Model 3 is given by:


Yij = β0 + β01xi + β1tj + εij .

• In matrix notation, this equals

    Yi = Xi β + εi,

where the design matrix is

         | 1  xi   8 |
    Xi = | 1  xi  10 |
         | 1  xi  12 |
         | 1  xi  14 |

and

    β = (β0, β01, β1)'

• The two slopes in Model 2 have been replaced by β1

• SAS program:

model measure = sex age / s;

• LR test: Model 3 versus Model 2


Mean Covar par −2 Ref G2 df p
1 unstr. unstr. 18 416.509
2 = slopes unstr. 14 419.477 1 2.968 4 0.5632
3 = slopes unstr. 13 426.153 2 6.676 1 0.0098

• Predicted trends:
girls : Ŷj = 15.37 + 0.6747tj

boys : Ŷj = 17.42 + 0.6747tj



Model 4: Toeplitz covariance structure

• Linear average trend within each sex group

• Elements of V of the form Vij = α_{|i−j|}:

        | α0  α1  α2  α3 |
    V = | α1  α0  α1  α2 |
        | α2  α1  α0  α1 |
        | α3  α2  α1  α0 |

• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 4’;
class sex idnr;
model measure = sex age*sex / s;
repeated / type = toep subject = idnr r rcorr;
run;

• LR test Model 4 versus Model 2:


Mean Covar par −2 Ref G2 df p
1 unstr. unstr. 18 416.509
2 = slopes unstr. 14 419.477 1 2.968 4 0.5632
4 = slopes banded 8 424.643 2 5.166 6 0.5227

• Estimated covariance matrix:

    | 4.9439  3.0507  3.4054  2.3421 |
    | 3.0507  4.9439  3.0507  3.4054 |
    | 3.4054  3.0507  4.9439  3.0507 |
    | 2.3421  3.4054  3.0507  4.9439 |

• Corresponding correlation matrix:

    | 1.0000  0.6171  0.6888  0.4737 |
    | 0.6171  1.0000  0.6171  0.6888 |
    | 0.6888  0.6171  1.0000  0.6171 |
    | 0.4737  0.6888  0.6171  1.0000 |

• Standard errors of the variance components estimates:


Covariance Parameter Estimates (MLE)

Cov Parm Subject Estimate Std Error

TOEP(2) IDNR 3.05070312 0.97907984


TOEP(3) IDNR 3.40540527 0.98115569
TOEP(4) IDNR 2.34212396 1.03583358
Residual IDNR 4.94388956 0.98687143

Model 5: AR(1) covariance structure

• Linear average trend within each sex group

• Elements of V of the form Vij = σ² ρ^{|i−j|}:

            | 1    ρ    ρ²   ρ³ |
    V = σ²  | ρ    1    ρ    ρ² |
            | ρ²   ρ    1    ρ  |
            | ρ³   ρ²   ρ    1  |

• SAS program:

repeated / type = AR(1) subject = idnr r rcorr;
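The AR(1) structure can be reconstructed from the fitted values reported below (σ̂² and the lag-1 element are read off the estimated covariance matrix); a sketch:

```python
import numpy as np

# Build the AR(1) covariance matrix V_ij = sigma^2 * rho^|i-j| and check it
# against the fitted values reported below for this model.
def ar1_cov(sigma2, rho, n):
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

sigma2 = 4.8903                  # estimated residual variance
rho = 2.9687 / 4.8903            # lag-1 covariance / variance, about 0.607
V = ar1_cov(sigma2, rho, 4)
```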



• LR test Model 5 versus Models 2 and 4:


Mean Covar par −2 Ref G2 df p
1 unstr. unstr. 18 416.509
2 = slopes unstr. 14 419.477 1 2.968 4 0.5632
4 = slopes banded 8 424.643 2 5.166 6 0.5227
5 = slopes AR(1) 6 440.681 2 21.204 8 0.0066
4 16.038 2 0.0003

• The estimated covariance matrix is

    | 4.8903  2.9687  1.8021  1.0940 |
    | 2.9687  4.8903  2.9687  1.8021 |
    | 1.8021  2.9687  4.8903  2.9687 |
    | 1.0940  1.8021  2.9687  4.8903 |

• The corresponding correlation matrix is

    | 1.0000  0.6070  0.3685  0.2237 |
    | 0.6070  1.0000  0.6070  0.3685 |
    | 0.3685  0.6070  1.0000  0.6070 |
    | 0.2237  0.3685  0.6070  1.0000 |

Model 6: Random intercepts and slopes

• Linear trend for each subject

• Linear average trend for each sex group

• V is assumed of the form

    V = Z D Z' + σ² I

where

        | 1   8 |
    Z = | 1  10 |
        | 1  12 |
        | 1  14 |

and where D is unstructured.

• SAS program:
proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 6’;
class sex idnr;
model measure = sex age*sex / s;
random intercept age / type = un subject = idnr g;
run;

• Estimates for D and σ²:

    D̂ = |  4.5569  −0.1983 |        σ̂² = 1.7162
         | −0.1983   0.0238 |,

• Estimate for V :

                     | 4.6216  2.8891  2.8727  2.8563 |
    Z D̂ Z' + σ̂² I =  | 2.8891  4.6839  3.0464  3.1251 |
                     | 2.8727  3.0464  4.9363  3.3938 |
                     | 2.8563  3.1251  3.3938  5.3788 |

• The corresponding estimated correlation matrix is

    | 1.0000  0.6209  0.6014  0.5729 |
    | 0.6209  1.0000  0.6335  0.6226 |
    | 0.6014  0.6335  1.0000  0.6586 |
    | 0.5729  0.6226  0.6586  1.0000 |
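The reported estimate of V can be reconstructed from D̂ and σ̂². A sketch (small discrepancies with the printed matrix are due to rounding of the reported estimates):

```python
import numpy as np

# Reconstruct the implied marginal covariance V = Z D Z' + sigma^2 I from the
# reported (rounded) estimates for Model 6.
Z = np.column_stack([np.ones(4), [8.0, 10.0, 12.0, 14.0]])
D = np.array([[4.5569, -0.1983],
              [-0.1983, 0.0238]])
sigma2 = 1.7162

V = Z @ D @ Z.T + sigma2 * np.eye(4)

# Values printed in the text; agreement is only up to rounding of D, sigma^2.
V_reported = np.array([
    [4.6216, 2.8891, 2.8727, 2.8563],
    [2.8891, 4.6839, 3.0464, 3.1251],
    [2.8727, 3.0464, 4.9363, 3.3938],
    [2.8563, 3.1251, 3.3938, 5.3788],
])
```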

Random Intercept/Compound Symmetry: Model 7

• Linear average trend for each sex group

• Subject-specific intercepts

• V is assumed of the form

    Z D Z' + σ² I4 = d J4 + σ² I4

where J4 is a (4 × 4) matrix of ones.

• This covariance structure is called exchangeable or


compound symmetry.

• All correlations are equal to d/(d + σ²)
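A quick numerical check, using the Model 7 estimates reported below (off-diagonal element d̂ and total variance d̂ + σ̂²):

```python
# Under compound symmetry, Var(Y_ij) = d + sigma^2 and Cov(Y_ij, Y_ik) = d,
# so every within-subject correlation equals d / (d + sigma^2).
# Values read off the Model 7 fit reported below.
d = 3.0306
total_var = 4.9052          # = d + sigma^2
sigma2 = total_var - d      # 1.8746

corr = d / (d + sigma2)
```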



• Possible SAS programs:


proc mixed data = growth method = ml covtest;
title ’Growth Data, Model 7’;
class sex idnr;
model measure = sex age*sex / s;
random intercept / type = un subject = idnr g;
run;

proc mixed data = growth method = ml covtest;


title ’Growth Data, Model 7’;
class sex idnr;
model measure = sex age*sex / s;
repeated / type = cs subject = idnr r rcorr;
run;

• LR test: Model 7 versus Models 1 and 6 (+random


slopes)

Mean Covar par −2 Ref G2 df p


1 unstr. unstr. 18 416.509
6 = slopes random 8 427.806 2 8.329 6 0.2150
7 = slopes CS 6 428.639 6 0.833 2 0.6594
6 0.833 1:2 0.5104

• The estimated covariance matrix is

    | 4.9052  3.0306  3.0306  3.0306 |
    | 3.0306  4.9052  3.0306  3.0306 |
    | 3.0306  3.0306  4.9052  3.0306 |
    | 3.0306  3.0306  3.0306  4.9052 |

• The corresponding correlation matrix equals

    | 1.0000  0.6178  0.6178  0.6178 |
    | 0.6178  1.0000  0.6178  0.6178 |
    | 0.6178  0.6178  1.0000  0.6178 |
    | 0.6178  0.6178  0.6178  1.0000 |

• The profiles, predicted by Model 7, are


girls : Ŷj = 17.37 + 0.4795tj ,

boys : Ŷj = 16.34 + 0.7844tj .

• While the average profiles are not exactly the same as


those from Model 2, they are extremely similar.

Independence: Model 8

• Linear average trend for each sex group

• Independence of all repeated measurements

• V is assumed of the form σ 2I

• SAS program:

repeated / type = simple subject = idnr r rcorr;

• LR test Model 8 versus Model 7


Mean Covar par −2 Ref G2 df p
1 unstr. unstr. 18 416.509
8 = slopes simple 5 478.242 7 49.603 1 <0.0001
7 49.603 0:1 <0.0001

Overview

Mean Covar par −2 Ref G2 df p


1 unstr. unstr. 18 416.509
2 = slopes unstr. 14 419.477 1 2.968 4 0.5632
3 = slopes unstr. 13 426.153 2 6.676 1 0.0098
4 = slopes Toepl. 8 424.643 2 5.166 6 0.5227
5 = slopes AR(1) 6 440.681 2 21.204 8 0.0066
4 16.038 2 0.0003
6 = slopes random 8 427.806 2 8.329 6 0.2150
7 = slopes CS 6 428.639 2 9.162 8 0.3288
4 3.996 2 0.1356
6 0.833 2 0.6594
6 0.833 1:2 0.5104
8 = slopes simple 5 478.242 7 49.603 1 <0.0001
7 49.603 0:1 <0.0001
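Several p-values in the overview table can be verified by hand. For an even number of degrees of freedom the chi-squared survival function has a closed form; a sketch:

```python
import math

# Recompute some p-values from the overview table. For an even number of
# degrees of freedom the chi-squared survival function has the closed form
#   P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
def chi2_sf_even(x, df):
    assert df % 2 == 0
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

p_model2 = chi2_sf_even(2.968, 4)   # Model 2 vs Model 1
p_model4 = chi2_sf_even(5.166, 6)   # Model 4 vs Model 2
p_model7 = chi2_sf_even(0.833, 2)   # Model 7 vs Model 6
```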

4.11 Linear Mixed Models

    Yi = Xi β + Zi bi + εi

    bi ∼ N (0, D),
    εi ∼ N (0, Σi),
    b1, . . . , bN , ε1, . . . , εN independent,

• Inference for the marginal model: ML and REML

• Inference for random effects: (Empirical) Bayes

4.12 Inference in the marginal linear


mixed model

4.12.1 The hierarchical versus marginal model

• Distribution of Yi given bi:


Yi|bi ∼ N (Xiβ + Zibi, Σi)
with density function f (yi|bi)

• Distribution of bi:
bi ∼ N (0, D)
with density function f (bi)

• The marginal density of Yi is then given by

    f (yi) = ∫ f (yi|bi) f (bi) dbi

which is the density function of an ni-dimensional
normal distribution with mean vector Xi β and with
covariance matrix Vi = Zi D Zi' + Σi

• The marginal model equals

    Yi ∼ N (Xi β, Zi D Zi' + Σi)

• E(Yi) = Xi β
  Var(Yi) = Vi = Zi D Zi' + Σi

• Hence, the random-effects structure implies a


covariance structure of a very specific form.
e.g. random intercepts and slopes for time
=⇒ variance is a quadratic function over time.

• Note that the hierarchical model implies the marginal


model, but not vice versa.

• Therefore, inferences based on the marginal model do


not explicitly assume the presence of random effects
representing the natural heterogeneity between
subjects.

4.12.2 Notation and terminology

• The only parameters in the marginal model are β, D


and the parameters in the Σi

• Let α contain all q(q + 1)/2 parameters in D, and all


parameters in the Σi.

• The elements of β are called fixed effects

• The elements of α are called variance components

• We denote θ = (β′, α′)′

4.12.3 The Autocorrelation Function

• Often, the correlation between two measurements


within a subject only depends on the time lag
|tij − tik | between these two measurements.

• Autocorrelation function:
ρ(u) = corr(e(t), e(t − u)), u≥0

• For data sets where all subjects are measured at the


same, equally spaced time points, ρ(u) can be studied
by calculating the correlation between all
measurements with a specific time lag u

• For unbalanced data, with constant variance function,


ρ(u) can be studied by means of the so-called
variogram.

4.12.4 The Variogram

• We assume e(t) to be (weakly) stationary:


– Constant mean (zero)

– Constant variance: σ 2(t) = σ 2

– Corr(e(t), e(t − u)) only depends on u

• One can then show that the variogram, defined by

    γ(u) = (1/2) E[ (ei(t) − ei(t − u))² ]

is equal to

    γ(u) = σ² [1 − ρ(u)]

• Hence, if γ(u) and σ² can be estimated from the
data, the autocorrelation function ρ(u) can be
studied.

• γ(u) is estimated by smoothing the scatterplot of

    vijk = (1/2) (rij − rik)²

versus the time lags

    uijk = |tij − tik|

The resulting estimate γ̂(u) is called the empirical or
sample variogram.

• It follows from

    (1/2) E[ (ei(t) − ej(t − u))² ] = σ²,   for i ≠ j,

that σ² can be estimated as the average of
(rik − rjl)²/2 over all pairs of residuals from
different subjects (i ≠ j).
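The building blocks of the sample variogram are easy to compute. A sketch on hypothetical residuals (the helper name and toy data are illustrative):

```python
import numpy as np

# Sketch of the empirical variogram ingredients: half-squared differences
# v_ijk between residuals of the same subject, paired with time lags u_ijk.
def variogram_points(times, residuals):
    """times, residuals: lists (one entry per subject) of 1-d arrays."""
    u, v = [], []
    for t, r in zip(times, residuals):
        n = len(t)
        for j in range(n):
            for k in range(j + 1, n):
                u.append(abs(t[j] - t[k]))
                v.append(0.5 * (r[j] - r[k]) ** 2)
    return np.array(u), np.array(v)

# Toy example: two subjects observed at three time points each.
times = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 2.0, 4.0])]
resid = [np.array([0.1, -0.2, 0.3]), np.array([0.0, 0.4, -0.1])]
u, v = variogram_points(times, resid)
```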

4.13 Empirical Bayes Methods for the


Random Effects

4.13.1 Estimation of the Random Effects bi

How can we estimate the bi,

which are unobservable random variables ?

Bayesian methods

1. We first assume θ = (β, α) known


=⇒ bi(θ) (= Bayes estimation)

2. Afterwards, we replace θ by its ML or REML


estimator
=⇒ bi = bi(θ) 
(= empirical Bayes estimation)

4.13.2 Empirical Bayes estimates bi

• The model for the data, conditional on bi:


yi | bi ∼ N (Xiβ + Zibi, Σi)

• The prior distribution for bi:


bi ∼ N (0, D)

• The posterior density then equals

    f (bi|yi) = f (yi|bi) f (bi) / f (yi)

              ∝ f (yi|bi) f (bi)

              ∝ . . .

              ∝ exp{ −(1/2) [bi − D Zi' Vi^{-1}(yi − Xi β)]'
                     Λi^{-1} [bi − D Zi' Vi^{-1}(yi − Xi β)] }

for some positive definite matrix Λi.



• Hence, the posterior distribution is given by

    bi | yi ∼ N ( D Zi' Vi^{-1}(yi − Xi β), Λi )

• The EB estimate for bi then equals

    b̂i = E(bi | yi) = D Zi' Vi^{-1}(yi − Xi β),

in which all parameters are replaced by their ML or
REML estimates.

• Histograms and scatterplots of certain components of
b̂i are frequently used to detect model deviations or
subjects with ‘exceptional’ evolutions over time.

• Required SAS code:

random intercept time time2


/ type = un subject = id g solution;
make ’solutionR’ out=randeff noprint;
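The EB formula can be illustrated on a toy random-intercept model with known parameters (all numbers below are hypothetical); it also exhibits the shrinkage discussed next, since the estimate pulls the subject's mean deviation towards zero:

```python
import numpy as np

# Toy random-intercept sketch of the EB formula
#   b_hat_i = D Z_i' V_i^{-1} (y_i - X_i beta)
# with known parameters, to show the shrinkage effect.
n = 4
Z = np.ones((n, 1))          # random intercept only
d, sigma2 = 2.0, 1.0         # D = [[d]], Sigma_i = sigma2 * I
D = np.array([[d]])
V = Z @ D @ Z.T + sigma2 * np.eye(n)

beta_mean = 5.0              # X_i beta, constant over time here
y = np.array([7.0, 6.5, 7.5, 7.0])

b_hat = (D @ Z.T @ np.linalg.inv(V) @ (y - beta_mean))[0]

# For this model the EB estimate is a shrunken subject mean:
#   b_hat = (n*d / (n*d + sigma2)) * (ybar - beta_mean)
ybar = y.mean()
b_closed = n * d / (n * d + sigma2) * (ybar - beta_mean)
```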

4.13.3 Shrinkage estimators b̂i

• Consider the prediction of the evolution of the ith
subject:

    Ŷi ≡ Xi β̂ + Zi b̂i

       = Xi β̂ + Zi D Zi' Vi^{-1}(yi − Xi β̂)

       = (Ini − Zi D Zi' Vi^{-1}) Xi β̂ + Zi D Zi' Vi^{-1} yi

       = Σi Vi^{-1} Xi β̂ + (Ini − Σi Vi^{-1}) yi,

• Hence, Ŷi is a weighted mean of the
population-averaged profile Xi β̂ and the observed
data yi, with weights Σi Vi^{-1} and Ini − Σi Vi^{-1}
respectively.

• Note that Xi β̂ gets much weight if the residual
variability is ‘large’ in comparison to the total
variability.

• This phenomenon is usually called shrinkage:

The observed data are shrunk towards the


prior average profile which is Xiβ,
since the prior mean of the random effects was zero.

• This is also reflected in the fact that for any linear
combination λ'bi of the random effects,

    var(λ' b̂i) ≤ var(λ' bi).
Chapter 5

Case Study: Vaccination Trial

5.1 Introduction

• Hepatitis A vaccination trial


– 120 patients recruited
– 109 selected for analysis

• Subjects taken from hospitals within the Antwerp


region

• Month 0–6 vaccination schedule

• Trial initiated in 1992 with yearly follow-up

• Response: (log10) antibody titer

• Covariates: lot, age, sex, weight, height, (BMI)


CHAPTER 5. CASE STUDY: VACCINATION TRIAL 76

5.2 Questions

• Difference between lots

• Relation with baseline covariates

• Prediction

For modeling purposes, we restrict the analysis to


post-vaccination data.

visit1 visit2 visit3 visit4 visit5 visit6 visit7


visit1 1.00000 0.70230 0.64406 0.70146 0.63716 0.67282 0.68626
102 77 66 72 63 64 57
visit2 0.70230 1.00000 0.92642 0.92258 0.91542 0.92323 0.90638
77 80 66 71 63 64 54
visit3 0.64406 0.92642 1.00000 0.94630 0.93187 0.93040 0.87529
66 66 70 62 55 59 47
visit4 0.70146 0.92258 0.94630 1.00000 0.97291 0.97037 0.92452
72 71 62 74 62 63 52
visit5 0.63716 0.91542 0.93187 0.97291 1.00000 0.97662 0.93980
63 63 55 62 66 59 51
visit6 0.67282 0.92323 0.93040 0.97037 0.97662 1.00000 0.94029
64 64 59 63 59 68 50
visit7 0.68626 0.90638 0.87529 0.92452 0.93980 0.94029 1.00000
57 54 47 52 51 50 60

5.3 Selection of a Covariance Structure

• Use of an over-specified model for the mean structure:


– INTERCEPT, AGE, SEX, BMI, LOGBASE
– all covariates are allowed to have time-varying
effects

• Initial random effects: intercept, time, time2

• Models with serial correlation (exponential, gaussian)


show convergence difficulties.

• Selection of the covariance structure:

  Model  Covariance structure              # param.  Deviance  Comp. model   G²    d.f.  p-value
  A      Int., time, time²                    7       -384.2
  B      Int., time                           4       -319.8        A        64.4    †   < 0.0001
  C      Int.                                 2       -239.6        B        80.2   ††   < 0.0001
  D      Int., time, time² + EXP(TIME)        9       -432.3        A        48.1    2   < 0.0001
  E      Int., time, time² + EXP(TIMECLS)    14       -440.8        D         8.5    4     0.075

† 50-50 mixture of χ²₂ and χ²₃ distributions.

†† 50-50 mixture of χ²₁ and χ²₂ distributions.
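The mixture null distributions in the footnotes arise because variance components are tested on their boundary; the corresponding p-value is easy to compute directly (a sketch using closed-form chi-square tail probabilities; the test statistic comes from the table above):

```python
import math

def chi2_sf(x, df):
    """Survival function of the chi-square distribution for df = 1, 2 or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("closed form implemented only for df in {1, 2, 3}")

def mixture_pvalue(g2, dfs, weights=(0.5, 0.5)):
    """P-value under a mixture of chi-square null distributions."""
    return sum(w * chi2_sf(g2, df) for w, df in zip(weights, dfs))

# Model C vs. B: 50-50 mixture of chi-square(1) and chi-square(2), G^2 = 80.2
print(mixture_pvalue(80.2, (1, 2)))   # far below 0.0001, as reported
```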

5.3.1 Note: Log-Linear Variance Model

• Linear mixed model:


Yij = Xij β + Zij γi + εij
with
– γi ∼ N (0, G)

– (εi1, . . . , ini ) ∼ N (0, Ri).

• Additional variance parameters can be incorporated


by adding a diagonal matrix to Ri.

• This could be as simple as σ²I, or more complex:

  σ² diag[exp(Uδ)]

• The latter is a log-linear variance model and produces


exponential local effects (also known as dispersion
effects).
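A numerical sketch of the log-linear variance model (the values of σ², δ and the covariate matrix U below are hypothetical):

```python
import numpy as np

sigma2, delta = 0.045, -0.17          # hypothetical variance parameters
U = np.array([[1.0], [6.0], [12.0]])  # e.g. months since vaccination

# Log-linear variance model: residual variances sigma2 * exp(U delta)
Ri = sigma2 * np.diag(np.exp(U @ np.array([delta])))
print(np.round(np.diag(Ri), 4))
```

With a negative δ the residual variances decrease multiplicatively in the covariate, which is the dispersion effect referred to above.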

5.4 Simplification of the Mean


Structure

• No time-varying effects of the covariates are needed.

• No covariates remain in the model after adjusting for


log antibody level at time of second vaccination.

• Time trend is modeled using fractional polynomials.

• Powers retained: time⁻² and log(time)

• Use time⁻² and log(time) as random effects instead of time and time²

• Code:

proc mixed data=postvacc method=ml covtest scoring=5;


class patid pvmthcls;
model loganti = logpvmth pvmthm2 logbase|logbase / s ;
random int logpvmth pvmthm2 / sub=patid type=un;
repeated pvmthcls / sub=patid local=exp(pvmonth);
run;

5.4.1 Note: Fractional Polynomials

Royston P. & Altman D. (1994). Regression using fractional polynomials of continuous

covariates: parsimonious parametric modeling. Applied Statistics, 43, 429–67.

• A fractional polynomial (FP) of degree m:

  φm(X; p) = β0 + Σ_{j=1}^{m} βj X^(pj),

  where p = (p1, . . . , pm) is a real-valued vector of powers with p1 < . . . < pm, and

  X^(pj) = X^pj   if pj ≠ 0,
         = ln(X)  if pj = 0.

• FP extend the family of traditional polynomials.


FP provide a wide range of functional forms.
FP are a very flexible tool for parametric modeling.

• Usually, m does not need to be large (m = 1, 2).

• Choose a predefined set of fixed powers (e.g. -2 to 2


by 0.5).

• Select the best combination of powers by checking


deviances compared to φ1(X; 1) (straight line).
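The power-selection recipe above can be sketched with a small grid search (simulated data; ordinary least-squares deviances stand in for the mixed-model deviances, and all data-generating values are hypothetical):

```python
import itertools
import math
import numpy as np

def fp_term(x, p):
    """Fractional-polynomial transform: x**p, with ln(x) for p == 0."""
    return np.log(x) if p == 0 else x ** p

def deviance(y, X):
    """-2 log-likelihood (up to a constant) of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return len(y) * math.log(rss / len(y))

rng = np.random.default_rng(0)
t = rng.uniform(0.5, 7, 200)                      # hypothetical follow-up times
y = 3.3 - 0.23 * np.log(t) - 0.13 * t**-2 + rng.normal(0, 0.2, 200)

powers = [-2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2]  # predefined grid
best = min(
    itertools.combinations(powers, 2),
    key=lambda pq: deviance(y, np.column_stack(
        [np.ones_like(t), fp_term(t, pq[0]), fp_term(t, pq[1])])),
)
print(best)   # powers selected by deviance
```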

• PROC MIXED output:

The Mixed Procedure

Model Information

Data Set WORK.POSTVACC


Dependent Variable LOGANTI
Covariance Structures Unstructured, Variance
Components, Local
Exponential
Subject Effects PATID, PATID
Estimation Method ML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment

Dimensions

Covariance Parameters 9
Columns in X 5
Columns in Z Per Subject 3
Subjects 107
Max Obs Per Subject 7
Observations Used 513
Observations Not Used 250
Total Observations 763

Covariance Parameter Estimates

Cov Parm Subject Estimate

UN(1,1) PATID 0.1211


UN(2,1) PATID -0.01740
UN(2,2) PATID 0.008859
UN(3,1) PATID -0.06872
UN(3,2) PATID 0.01606
UN(3,3) PATID 0.03872
pvmthcls PATID 0.008318
EXP pvmonth -0.1718
Residual 0.04505

Fit Statistics

Log Likelihood 189.9


Akaike’s Information Criterion 180.9
Schwarz’s Bayesian Criterion 168.9
-2 Log Likelihood -379.8

Null Model Likelihood Ratio Test

DF Chi-Square Pr > ChiSq

8 681.48 <.0001

Solution for Fixed Effects

Standard
Effect Estimate Error DF t Value Pr > |t|

Intercept 3.3133 0.09280 103 35.70 <.0001


logpvmth -0.2266 0.01373 86 -16.50 <.0001
pvmthm2 -0.1276 0.04115 80 -3.10 0.0027
logbase -0.1891 0.09867 239 -1.92 0.0565
logbase*logbase 0.1553 0.03166 239 4.91 <.0001

Type 3 Tests of Fixed Effects

Num Den
Effect DF DF F Value Pr > F

logpvmth 1 86 272.35 <.0001


pvmthm2 1 80 9.62 0.0027
logbase 1 239 3.67 0.0565
logbase*logbase 1 239 24.08 <.0001

5.5 How Does the Model Fit ?



5.6 Prediction From the Model

• After modeling was completed, new data were


gathered on 74 subjects (Month 84).

• These can be compared with predictions obtained


from the model:
Chapter 6

Case Study: Surrogate Markers

• Primary motivation
– True endpoint is rare and/or distant
– Surrogate endpoint is frequent and/or close in time

• Secondary motivation

True endpoint is
– invasive
– uncomfortable
– costly
– confounded
∗ by secondary treatments
∗ by competing risks

CHAPTER 6. CASE STUDY: SURROGATE MARKERS 92

6.1 Age-Related Macular Degeneration

Pharmacological Therapy for Macular Degeneration Study Group (1997)

Z: Interferon-α
• 0: placebo
• 1: 6MIU

T : Visual acuity at 6 months


• continuous
• binary: discretized version or loss of 2 lines of
vision

S: Visual acuity at 1 year


• continuous
• binary: discretized version or loss of 3 lines of
vision

N : 190
• 36 centers
• # patients per center ∈ [2; 18]

Visual Acuity

Visual Acuity = number of letters correctly read

[Figure: eye chart in which the letters spell “VALIDATION OF SURROGATE MARKERS IN RANDOMIZED EXPERIMENT”]

ARMD Data

6.2 Advanced Ovarian Cancer

4 randomized multicenter trials in advanced ovarian cancer

Ovarian Cancer Meta-Analysis Project (1991)

Z: Two treatment modalities


• 0: cyclophosphamide plus cisplatin (CP)
• 1: cyclophosphamide plus adriamycin plus cisplatin (CAP)

T : (Log of) Survival time


• continuous
• Time in weeks from randomization to death from any cause

S: (Log of) Time to progression


• continuous
• Time in weeks from randomization to clinical progression of the
disease or death due to the disease

N : 1194
• Individual data available on every randomized patient
• 952 (80%) have progression/death
• 50 units
• # patients per unit ∈ [2; 274]

Advanced Ovarian Cancer

6.3 Advanced Colorectal Cancer

CORFU Study

Z: Two treatment modalities


• 0: 5FU
• 1: 5FU plus folinic acid or 5FU plus interferon

T : Survival time

S: Time to progression

N : 736
• Individual data available on every randomized patient
• 694 (94.3%) have progression/death
• 76 units
• # patients per unit ∈ [2; 38]

Advanced Colorectal Cancer



6.4 Definition of Surrogate Endpoint

Prentice (Bcs 1989)

“A test of H0 of no effect of treatment on surrogate is


equivalent to a test of H0 of no effect of treatment on
true endpoint.”

(S | treated) = (S | control)  ⇐⇒  (T | treated) = (T | control)

6.5 ARMD: Prentice’s Criteria

Criterion 1: Treatment Z is prognostic for surrogate S

• Sij |Zij = µS + αZij + εSij

• α = 2.83 (s.e. 1.86, P = 0.13)

Criterion 2: Treatment Z is prognostic for true


endpoint T

• Tij |Zij = µT + βZij + εT ij

• β = 4.12 (s.e. 2.32, P = 0.079)

Criterion 3: Surrogate S is prognostic for true


endpoint T

• Tij |Sij = µ + γSij + εij

• γ = 0.95 (s.e. 0.06, P < 0.0001)



6.6 Proportion Explained

Freedman et al (SiM 1992)

• Description:
4. The full effect of Z on T is explained by S

• Model:
Tij |Zij , Sij = µ̃T + βS Zij + γZ Sij + ε̃T ij ,

• Definition:

  PE(T, S, Z) = (β − βS) / β

• Estimate:
– P E = 0.65 (95% C.I. [−0.22; 1.51])

• But: problems with PE
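The proportion explained can be computed from the two regressions above, of T on Z without and with adjustment for S; a sketch on simulated data (all effect sizes are hypothetical):

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 200
Z = rng.integers(0, 2, n).astype(float)       # randomized treatment
S = 2.8 * Z + rng.normal(0, 2, n)             # surrogate endpoint
T = 1.0 * Z + 0.9 * S + rng.normal(0, 1, n)   # true endpoint

ones = np.ones(n)
beta = ols(np.column_stack([ones, Z]), T)[1]        # effect of Z on T
beta_S = ols(np.column_stack([ones, Z, S]), T)[1]   # effect of Z on T, given S
PE = (beta - beta_S) / beta
print(round(PE, 2))
```

The ratio estimator makes the problems below apparent: nothing constrains beta_S to lie between 0 and beta, so PE can fall outside the unit interval.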



6.7 Criticism

• Prentice criteria neither necessary nor


sufficient
– except in the binary/binary case

• PE suffers from severe problems:

  – PE not restricted to the unit interval
Volberding et al (1990)

Choi et al (1993)

– confidence limits (Fieller or delta) tend to be wide


Lin, Fleming, and DeGruttola (1997)

∗ unless large sample sizes


∗ unless very strong effect of Z on T

• Proposal: two new criteria:


Buyse and Molenberghs (1998)

– Relative Effect
– Adjusted Association

6.8 Relative Effect

• Can we link the effect of Z on S to the effect of Z


on T ?

• Description:
4A. The effect of Z on S predicts a clinically useful
effect of Z on T

• Definition:

  RE(T, S, Z) = β / α

• Estimate:
– RE = 1.45 (95% C.I. [−0.48; 3.39])

6.9 Adjusted Association

• What is the association between S and T , after


correction for Z ?

• Description:
4B. The correlation between S and T after correction
for Z

• Definition:
ρZ = Corr(S, T |Z)

• Estimate:
– ρZ = 0.75 (95% C.I. [0.69; 0.82])
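The relative effect and the adjusted association can be computed analogously; a sketch on simulated data (hypothetical effect sizes, not the ARMD estimates):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
Z = rng.integers(0, 2, n).astype(float)
S = 2.8 * Z + rng.normal(0, 2, n)
T = 4.1 * Z + 0.9 * (S - 2.8 * Z) + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), Z])
coef_S = np.linalg.lstsq(X, S, rcond=None)[0]
coef_T = np.linalg.lstsq(X, T, rcond=None)[0]
alpha, beta = coef_S[1], coef_T[1]   # effects of Z on S and on T
RE = beta / alpha                    # relative effect

# Adjusted association: correlation of the residuals after removing Z
eps_S = S - X @ coef_S
eps_T = T - X @ coef_T
rho_Z = np.corrcoef(eps_S, eps_T)[0, 1]
print(round(RE, 2), round(rho_Z, 2))
```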

6.10 Use of RE and Adjusted


Association

• Criticism: PE not useful


Molenberghs and Buyse (1998)

• For normal endpoints:

  PE = δ · ρZ / RE
• The two new quantities have clear meaning
– Relative Effect: trial-level measure of surrogacy
Can we translate the treatment effect on the surrogate to the
treatment effect on the endpoint, in a sufficiently precise way ?

– Adjusted Association: individual-level measure


of surrogacy
After accounting for the treatment effect, is the surrogate endpoint
predictive for a patient’s true endpoint ?

• BUT:
The RE is based on a single trial ⇒ regression
through the origin, based on one point !

6.11 Analysis Based on Several Trials

• Context:
– multicenter trials
– meta analysis
– several meta analyses

• Extensions:

– Relative Effect −→ Trial-Level Surrogacy


How close is the relationship between the
treatment effects on the surrogate and true
endpoints, based on the various trials (units) ?

– Adjusted Association −→ Individual-Level


Surrogacy
How close is the relationship between the
surrogate and true outcome, after accounting for
trial and treatment effects ?

Is Considered a Useful Idea

Albert et al (SiM 1998)

“There has been little work on alternative


statistical approaches. A meta-analysis approach
seems desirable to reduce variability.
Nevertheless, we need to resolve basic problems
in the interpretation of measures of surrogacy
such as PE as well as questions about the
biologic mechanisms of drug action.”

6.12 Statistical Model

• Model:
Sij |Zij = µSi + αiZij + εSij
Tij |Zij = µT i + βiZij + εT ij

• Error structure:

– Individual level:
∗ Deviations εSij and εT ij are correlated

– Trial level:
∗ Treatment effects αi and βi are correlated
∗ (Information from intercepts µSi and µT i can be
used as well)

Statistical Model

• Model:
Sij |Zij = µSi + αiZij + εSij
Tij |Zij = µT i + βiZij + εT ij

• Error structure:

      ( σSS  σST )
  Σ = ( σST  σTT )

• Trial-specific effects:

  ( µSi )   ( µS )   ( mSi )
  ( µTi ) = ( µT ) + ( mTi )
  ( αi  )   ( α  )   ( ai  )
  ( βi  )   ( β  )   ( bi  )

• Error structure of random effects:

      ( dSS  dST  dSa  dSb )
  D = ( dST  dTT  dTa  dTb )
      ( dSa  dTa  daa  dab )
      ( dSb  dTb  dab  dbb )

6.13 Methods of Estimation

Endpoints dimension:
• Both endpoints together
• Each endpoint separately

Center dimension:
• Center as fixed effect
• Center as random effect

Measurement error:
• No adjustment
• Adjustment by sample size per trial
• Full correction using Stijnen’s approach

6.14 ARMD: Trial-Level Surrogacy


[Figure: estimated treatment effects on change in visual acuity at 12 months (vertical axis) versus at 6 months (horizontal axis), per center]

• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)

• Estimate:

  – R²trial = 0.692 (95% C.I. [0.52; 0.86])

ARMD: Trial-Level Surrogacy

• Prediction:

  E(β + b0 | mS0, a0)  = β + (dSb, dab) ( dSS  dSa ; dSa  daa )⁻¹ (µS0 − µS, α0 − α)′

  Var(β + b0 | mS0, a0) = dbb − (dSb, dab) ( dSS  dSa ; dSa  daa )⁻¹ (dSb, dab)′

• Trial-level association:

  R²_{bi|mSi,ai} = (dSb, dab) ( dSS  dSa ; dSa  daa )⁻¹ (dSb, dab)′ / dbb

• Estimate:

  – R²_{bi|mSi,ai} = 0.692 (95% C.I. [0.52; 0.86])
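The conditional-variance formulas above reduce to a few lines of linear algebra; a sketch with a hypothetical random-effects covariance matrix D (not the ARMD estimates):

```python
import numpy as np

# Hypothetical random-effects covariance matrix D, ordered (mS, mT, a, b)
D = np.array([
    [1.00, 0.60, 0.40, 0.50],
    [0.60, 1.20, 0.30, 0.70],
    [0.40, 0.30, 0.90, 0.60],
    [0.50, 0.70, 0.60, 1.10],
])

# Condition b (index 3) on (mS, a) = indices (0, 2)
idx = [0, 2]
cov_b_pred = D[3, idx]              # (d_Sb, d_ab)
V_pred = D[np.ix_(idx, idx)]        # [[d_SS, d_Sa], [d_Sa, d_aa]]

var_cond = D[3, 3] - cov_b_pred @ np.linalg.solve(V_pred, cov_b_pred)
R2_trial = 1 - var_cond / D[3, 3]   # = cov' V^{-1} cov / d_bb
print(round(R2_trial, 3))
```

R²_trial close to 1 means the trial-specific treatment effect on T is almost perfectly predictable from the surrogate information (mS0, a0).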

6.15 ARMD: Individual-Level Surrogacy


[Figure: residuals for change in visual acuity at 12 months (vertical axis) versus at 6 months (horizontal axis), per patient]

• Individual-level association:

  ρZ = Rindiv = Corr(εTi, εSi)

• Estimate:

  – R²indiv = 0.483 (95% C.I. [0.38; 0.59])
  – Rindiv = 0.69 (95% C.I. [0.62; 0.77])
  – Recall ρZ = 0.75 (95% C.I. [0.69; 0.82])



ARMD: Individual-Level Surrogacy

• Conditional density:

  Tij | Zij, Sij ∼ N( µTi − σTS σSS⁻¹ µSi + (βi − σTS σSS⁻¹ αi) Zij + σTS σSS⁻¹ Sij ;  σTT − σ²TS σSS⁻¹ )

• Individual-level association:

  ρ²Z = R²_{εTi|εSi} = σ²ST / (σSS σTT)

• Estimate:

  – R²_{εTi|εSi} = 0.483 (95% C.I. [0.38; 0.59])
  – R_{εTi|εSi} = 0.69 (95% C.I. [0.62; 0.77])
  – Recall ρZ = 0.75 (95% C.I. [0.69; 0.82])
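The individual-level measure follows directly from the residual covariance matrix Σ; a sketch with hypothetical values:

```python
import numpy as np

# Hypothetical residual covariance matrix of (S, T), given trial and treatment
Sigma = np.array([[4.0, 2.6],
                  [2.6, 3.1]])

sigma_SS, sigma_ST, sigma_TT = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]
R2_indiv = sigma_ST**2 / (sigma_SS * sigma_TT)   # squared adjusted association
rho_Z = sigma_ST / np.sqrt(sigma_SS * sigma_TT)  # adjusted association itself
print(round(R2_indiv, 3), round(rho_Z, 3))
```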



6.16 Ovarian: Trial-Level Surrogacy

• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)

• Estimate:

  – R²trial = 0.940 (95% C.I. [0.91; 0.97])

6.17 Ovarian: Individual-Level


Surrogacy

• Individual-level association:

  ρZ = Rindiv = Corr(εTi, εSi)

• Estimate:

  – R²indiv = 0.886 (95% C.I. [0.87; 0.90])
  – Rindiv = 0.941 (95% C.I. [0.93; 0.95])
  – ρZ = 0.944 (95% C.I. [0.94; 0.95])



6.18 Ovarian: Prediction

unit   # patients   α0             E(β + b0 | a0)   β + b0

6      17           -0.58 (0.33)   -0.45 (0.29)     -0.56 (0.32)
8      10            0.67 (0.76)    0.49 (0.57)      0.76 (0.39)
55     31            1.08 (0.56)    0.80 (0.44)      0.79 (0.45)
DAC    274           0.25 (0.15)    0.17 (0.13)      0.14 (0.14)
GON    125           0.15 (0.25)    0.10 (0.20)      0.03 (0.22)



6.19 Colorectal: Trial-Level Surrogacy

• Prediction:
– What do we expect ?
E(β + b0|mS0, a0)
– How precisely can we estimate it ?
Var(β + b0|mS0, a0)

• Estimate:

  – R²trial = 0.454 (95% C.I. [0.23; 0.68])

6.20 Colorectal: Individual-Level


Surrogacy

• Individual-level association:

  ρZ = Rindiv = Corr(εTi, εSi)

• Estimate:

  – R²indiv = 0.665 (95% C.I. [0.62; 0.71])
  – Rindiv = 0.815
  – ρZ = 0.805
Chapter 7

Case Study: The Prostate Cancer Data

7.1 The use of PSA to detect prostate


cancer

• U.S.: one of the most common and most costly


medical problems, and the second leading cause of
male cancer deaths.

• Important to look for markers which can detect the


disease in an early stage.

• Prostate-specific antigen (PSA): its level in the blood is proportional to the volume of prostate tissue.

• Still, an elevated PSA level is not necessarily an


indicator of prostate cancer because also patients with

CHAPTER 7. CASE STUDY: THE PROSTATE CANCER DATA 122

benign prostatic hyperplasia (BPH) have an enlarged


volume of prostate tissue and therefore also an
increased PSA level.

• Clinical practice suggests that the rate of change in


PSA level might be a more accurate method of
detecting prostate cancer in the early stages of the
disease.
=⇒ Longitudinal Data

7.1.1 The Baltimore Longitudinal Study of


Aging (BLSA)

• Started in 1958, still ongoing

• Over 1500 males and 900 females enrolled

• Volunteers, predominantly white (95 per cent),


well-educated (over 75 per cent have college degrees),
and financially comfortable (82 per cent)

• Participants return approximately every two years for


three days of biomedical and psychological
examinations.

• Data from repeated clinical examinations and a bank


of frozen blood samples.

• An average of 7 visits and 16 years of follow-up.



7.1.2 The prostate data from the BLSA

• Retrospective case-control study


• 18 prostate cancer patients (14+4)
20 BPH cases
16 controls
• Inclusion criteria:
1. seven or more years of follow-up prior to diagnosis
of prostate cancer, simple prostatectomy for BPH,
or exclusion of prostate disease by a urologist,
2. confirmation of the pathological diagnosis, and
3. no prostate surgery prior to diagnosis.

• To the extent possible, age at diagnosis and years of


follow-up was matched for the control, BPH and
cancer groups. However, due to the high prevalence
of BPH in men over age 50, it was difficult to find
age-matched controls with no evidence of prostate
disease. In fact, the control group remained
significantly younger at first visit and at diagnosis
compared to the BPH group, which makes it
necessary to control for age at diagnosis in all
statistical analyses.

7.1.3 Descriptive statistics

Cancer Cases
Controls BPH cases L/R M
Number of participants 16 20 14 4
Age at diagnosis (years)
median 66 75.9 73.8 72.1
range 56.7-80.5 64.6-86.7 63.6-85.4 62.7-82.8
Years of follow up
median 15.1 14.3 17.2 17.4
range 9.4-16.8 6.9-24.1 10.6-24.9 10-25.3
Time between
measurements (years)
median 2 2 1.7 1.7
range 1.1-11.7 0.9-8.3 0.9-10.8 0.9-4.8
Number of measurements
per individual
median 8 8 11 9.5
range 4-10 5-11 7-15 7-12

Complications

• Unequal number of repeated measurements per


individual: ni
• Measurements taken at arbitrary timepoints: tij

7.2 A two-stage model

7.2.1 General idea of two-stage models

Stage 1:

Assume that the data of each subject separately

can be well described by a linear regression model.

Stage 2:

Use regression techniques to investigate

the effects of age, diagnostic group, . . .

on the subject-specific regression coefficients

defined in the first stage.



7.2.2 Applied to the prostate data

Yij = Yi(tij) = ln(PSAi(tij) + 1)

We assume that each profile can be described

by a quadratic function over time

Stage 1:

Yij = β1i + β2i tij + β3i t²ij + εij ,   j = 1, . . . , ni

Stage 2:

















β1i = β1 Agei + β2 Ci + β3 Bi + β4 Li + β5 Mi + b1i
β2i = β6 Agei + β7 Ci + β8 Bi + β9 Li + β10 Mi + b2i
β3i = β11 Agei + β12 Ci + β13 Bi + β14 Li + β15 Mi + b3i,

where

  Agei = age at time of diagnosis

  Ci = 1 if control, 0 otherwise
  Bi = 1 if BPH case, 0 otherwise
  Li = 1 if L/R cancer case, 0 otherwise
  Mi = 1 if metastatic cancer case, 0 otherwise

• β2, β3, β4, β5 are the average intercepts after


correction for age.

• β7, β8, β9, β10 are the average slopes for time after
correction for age.

• β12, β13, β14, β15 are the average slopes for time2 after
correction for age.

7.2.3 Matrix notation for two-stage models:

Stage 1:

  Yi = Zi βi + εi,

where

       ( Yi1  )        ( εi1  )        ( 1  ti1   t²i1  )        ( β1i )
  Yi = ( Yi2  ),  εi = ( εi2  ),  Zi = ( 1  ti2   t²i2  ),  βi = ( β2i )
       (  ⋮   )        (  ⋮   )        ( ⋮   ⋮     ⋮    )        ( β3i )
       ( Yini )        ( εini )        ( 1  tini  t²ini )

Stage 2:

  βi = Bi β + bi,

where Bi is the appropriate (3 × 15) matrix of covariates,
β is equal to (β1, . . . , β15)′, and bi = (b1i, b2i, b3i)′.
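The two-stage procedure can be sketched numerically: fit a quadratic per subject by least squares, then regress the subject-specific coefficients on covariates (simulated data; a didactic OLS version, not the mixed-model fit, and all parameter values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n_subj, n_obs = 30, 8

# Simulate subject-specific quadratic profiles (hypothetical parameters)
age = rng.uniform(55, 85, n_subj)
true_coef = np.column_stack([
    0.02 * age + rng.normal(0, 0.3, n_subj),    # intercepts
    0.5 + rng.normal(0, 0.2, n_subj),           # slopes
    0.1 + rng.normal(0, 0.05, n_subj),          # curvatures
])

# Stage 1: per-subject OLS of Y on (1, t, t^2)
stage1 = np.empty((n_subj, 3))
for i in range(n_subj):
    t = np.sort(rng.uniform(0, 2.5, n_obs))
    Zi = np.column_stack([np.ones(n_obs), t, t**2])
    y = Zi @ true_coef[i] + rng.normal(0, 0.15, n_obs)
    stage1[i] = np.linalg.lstsq(Zi, y, rcond=None)[0]

# Stage 2: regress each estimated coefficient on covariates (here: age)
B = np.column_stack([np.ones(n_subj), age])
stage2 = np.linalg.lstsq(B, stage1, rcond=None)[0]
print(np.round(stage2[1], 3))   # estimated age effects on intercept, slope, curvature
```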

7.3 Linear mixed models

7.3.1 Stage 1 + Stage 2

Yij = Yi(tij )

= β1Agei + β2Ci + β3Bi + β4Li + β5Mi

+ (β6Agei + β7Ci + β8Bi + β9Li + β10Mi) × tij

+ (β11Agei + β12Ci + β13Bi + β14Li + β15Mi) × t2ij

+ b1i + b2i × tij + b3i × t2ij + εij

or equivalently

Yi = Xiβ + Zibi + εi

where Xi = ZiBi is a (ni × 15) matrix of covariates.



7.4 Fitting linear mixed models in SAS

PSA example

• Factor group defined by:


– control: group = 1
– BPH: group = 2
– local cancer: group = 3
– metastatic cancer: group = 4

• time and timeclss are time, expressed in decades


before diagnosis

• age is age at the time of diagnosis

• lnpsa = ln(P SA + 1)

• Model with age-corrected quadratic evolutions within


each diagnostic group.

• Random intercepts and slopes for time and time2.

• We assume Σi = σ²Ini

SAS program

proc mixed data = prostate method = ml;


class id group timeclss;
model lnpsa = group age group*time age*time
group*time2 age*time2 / noint solution;
random intercept time time2 / type = un subject = id;
repeated timeclss / type = simple subject = id;
run;

• PROC MIXED statement:


– calls procedure MIXED

– specifies data-set (records correspond to occasions)

– estimation method: ML, REML (default),


MIVQUE0

• CLASS statement: definition of the factors in the


model

• MODEL statement:
– response variable
– fixed effects
– options similar to SAS regression procedures

• RANDOM statement:
– definition of random effects (including intercepts !)
– identification of the ‘subjects’: independence across subjects
– structure of random-effects covariance matrix D
many structures available within SAS

• REPEATED statement:
– ordering of measurements within subjects
– the effect(s) specified must be of the factor-type
– identification of the ‘subjects’: independence across subjects
– structure of Σi
the same structures available as for the RANDOM
statement

Overview of frequently used covariance structures which can be specified in the RANDOM

and REPEATED statements of the SAS procedure MIXED. The σ-parameters are used to

denote variances and covariances, while the ρ-parameters are used for correlations.

Structure                     Example

Unstructured                  ( σ1²  σ12  σ13 )
type=UN                       ( σ12  σ2²  σ23 )
                              ( σ13  σ23  σ3² )

Simple (1) /                  ( σ²  0   0  )       ( σ1²  0    0   )
Variance components (2)       ( 0   σ²  0  )  or   ( 0    σ2²  0   )
type=SIMPLE, type=VC          ( 0   0   σ² )       ( 0    0    σ3² )

Compound symmetry             ( σ1²+σ²  σ1²      σ1²     )
type=CS                       ( σ1²     σ1²+σ²   σ1²     )
                              ( σ1²     σ1²      σ1²+σ²  )

Banded                        ( σ1²  σ12  0   )
type=UN(2)                    ( σ12  σ2²  σ23 )
                              ( 0    σ23  σ3² )

First-order autoregressive    ( σ²    ρσ²   ρ²σ² )
type=AR(1)                    ( ρσ²   σ²    ρσ²  )
                              ( ρ²σ²  ρσ²   σ²   )

Toeplitz                      ( σ²   σ12  σ13 )
type=TOEP                     ( σ12  σ²   σ12 )
                              ( σ13  σ12  σ²  )

Toeplitz (1)                  ( σ²  0   0  )
type=TOEP(1)                  ( 0   σ²  0  )
                              ( 0   0   σ² )

Heterogeneous compound        ( σ1²     ρσ1σ2  ρσ1σ3 )
symmetry                      ( ρσ1σ2   σ2²    ρσ2σ3 )
type=CSH                      ( ρσ1σ3   ρσ2σ3  σ3²   )

Heterogeneous first-order     ( σ1²      ρσ1σ2  ρ²σ1σ3 )
autoregressive                ( ρσ1σ2    σ2²    ρσ2σ3  )
type=ARH(1)                   ( ρ²σ1σ3   ρσ2σ3  σ3²    )

Heterogeneous Toeplitz        ( σ1²      ρ1σ1σ2  ρ2σ1σ3 )
type=TOEPH                    ( ρ1σ1σ2   σ2²     ρ1σ2σ3 )
                              ( ρ2σ1σ3   ρ1σ2σ3  σ3²    )

(1) Example: repeated timeclss / type = simple subject = id;
(2) Example: random intercept time time2 / type = simple subject = id;
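For intuition, some of the stationary structures in the table can be generated programmatically (a sketch; the parameter values are arbitrary):

```python
import numpy as np

def ar1_cov(n, sigma2, rho):
    """First-order autoregressive covariance: sigma2 * rho**|j-k|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def cs_cov(n, sigma1_sq, sigma2):
    """Compound symmetry: sigma1_sq everywhere, plus sigma2 on the diagonal."""
    return sigma1_sq * np.ones((n, n)) + sigma2 * np.eye(n)

V_ar1 = ar1_cov(3, sigma2=2.0, rho=0.5)
V_cs = cs_cov(3, sigma1_sq=1.5, sigma2=0.5)
print(V_ar1[0])   # first row: 2.0, 1.0, 0.5
```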

Overview of frequently used (stationary) spatial covariance structures, which can be

specified in the RANDOM and REPEATED statements of the SAS procedure MIXED. The

correlations are positive decreasing functions of the Euclidean distances dij between the

observations. The coordinates of the observations, used to calculate these distances are

given by a set of variables the names of which are specified in the list ‘list’. The variance is

denoted by σ 2 , and ρ defines how fast the correlations decrease as functions of the dij .

Structure                     Example

Power                              ( 1      ρ^d12  ρ^d13 )
type=SP(POW)(list)            σ² · ( ρ^d12  1      ρ^d23 )
                                   ( ρ^d13  ρ^d23  1     )

Exponential                        ( 1             exp(−d12/ρ)  exp(−d13/ρ) )
type=SP(EXP)(list)            σ² · ( exp(−d12/ρ)   1            exp(−d23/ρ) )
                                   ( exp(−d13/ρ)   exp(−d23/ρ)  1           )

Gaussian                           ( 1               exp(−d12²/ρ²)  exp(−d13²/ρ²) )
type=SP(GAU)(list)            σ² · ( exp(−d12²/ρ²)   1              exp(−d23²/ρ²) )
                                   ( exp(−d13²/ρ²)   exp(−d23²/ρ²)  1             )

SAS output: parameter estimates

Maximum likelihood and restricted maximum likelihood estimates (MLE and REMLE) and
standard errors for all fixed effects and all variance components in the PSA model

Effect Parameter MLE (s.e.) REMLE (s.e.)


Age effect β1 0.026 (0.013) 0.027 (0.014)
Intercepts:
control β2 -1.077 (0.919) -1.098 (0.976)
BPH β3 -0.493 (1.026) -0.523 (1.090)
L/R cancer β4 0.314 (0.997) 0.296 (1.059)
Met. cancer β5 1.574 (1.022) 1.549 (1.086)
Age×time effect β6 -0.010 (0.020) -0.011 (0.021)
Time effects:
control β7 0.511 (1.359) 0.568 (1.473)
BPH β8 0.313 (1.511) 0.396 (1.638)
L/R cancer β9 -1.072 (1.469) -1.036 (1.593)
Met. cancer β10 -1.657 (1.499) -1.605 (1.626)
2
Age×time effect β11 0.002 (0.008) 0.002 (0.009)
Time2 effects:
control β12 -0.106 (0.549) -0.130 (0.610)
BPH β13 -0.119 (0.604) -0.158 (0.672)
L/R cancer β14 0.350 (0.590) 0.342 (0.656)
Met. cancer β15 0.411 (0.598) 0.395 (0.666)
Covariance of bi :
var(b1i ) d11 0.398 (0.083) 0.452 (0.098)
var(b2i ) d22 0.768 (0.187) 0.915 (0.230)
var(b3i ) d33 0.103 (0.032) 0.131 (0.041)
cov(b1i , b2i ) d12 = d21 -0.443 (0.113) -0.518 (0.136)
cov(b2i , b3i ) d23 = d32 -0.273 (0.076) -0.336 (0.095)
cov(b3i , b1i ) d13 = d31 0.133 (0.043) 0.163 (0.053)
Residual variance:
var(εij ) σ2 0.028 (0.002) 0.028 (0.002)

• ML and REML estimates different for variance


components as well as for fixed effects

• REML estimates for variance components ‘larger’


than ML estimates

• Fitted average profiles (at median ages)

• Approximate t-tests available for fixed effects



SAS output: Iteration history

REML Estimation Iteration History

Iteration Evaluations Objective Criterion

0 1 -259.0577593
1 2 -753.2423823 0.00962100
2 1 -757.9085275 0.00444385
. . ............ ..........
6 1 -760.8988784 0.00000003
7 1 -760.8988902 0.00000000

Convergence criteria met.

• Objective functions:

  ln(L_ML(θ))   = −½ { n ln(2π) + OF_ML(θ) }

  ln(L_REML(θ)) = −½ { (n − p) ln(2π) + OF_REML(θ) }

• Number of times the objective function has been


evaluated during each iteration

• Criterion: measure of convergence: |gk′ Hk⁻¹ gk| / |fk|, where fk is the objective function, gk is the gradient, and Hk is the Hessian.

SAS output: Information on model fit

Model Fitting Information for LNPSA

Description Value

Observations 463.0000
Variance Estimate 1.0000
Standard Deviation Estimate 1.0000
REML Log Likelihood -31.2350
Akaike’s Information Criterion -38.2350
Schwarz’s Bayesian Criterion -52.6018
-2 REML Log Likelihood 62.4700
Null Model LRT Chi-Square 501.8411
Null Model LRT DF 6.0000
Null Model LRT P-Value 0.0000

• Observations: n = Σ_{i=1}^{N} ni = 463

• Variance and standard deviation depend on


program-specification

• Maximized REML log likelihood

• Information criteria: see later

• Test for the need for covariance modelling



SAS output: F -tests for fixed effects

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

GROUP 4 299 15.90 0.0001


AGE 1 299 3.48 0.0631
TIME*GROUP 4 299 7.85 0.0001
AGE*TIME 1 299 0.27 0.6026
TIME2*GROUP 4 299 4.44 0.0017
AGE*TIME2 1 299 0.07 0.7982

• For continuous covariates: equivalent with t test

• For factors: test whether any of the parameters


assigned to this factor is significantly different from
zero

• Details on F-tests: see later



First parameterization of mean structure

model lnpsa = group age group*time age*time


group*time2 age*time2 / noint solution;

Solution for Fixed Effects

Effect GROUP Estimate Std Error DF t Pr > |t|

GROUP 1 -1.09842483 0.97631046 299 -1.13 0.2615 BETA2


GROUP 2 -0.52284975 1.08953050 299 -0.48 0.6317 BETA3
GROUP 3 0.29640353 1.05870264 299 0.28 0.7797 BETA4
GROUP 4 1.54938619 1.08561192 299 1.43 0.1546 BETA5
AGEDIAG 0.02655078 0.01423433 299 1.87 0.0631 BETA1
TIME*GROUP 1 0.56806200 1.47250947 299 0.39 0.6999 BETA7
TIME*GROUP 2 0.39562209 1.63767403 299 0.24 0.8093 BETA8
TIME*GROUP 3 -1.03590942 1.59277722 299 -0.65 0.5159 BETA9
TIME*GROUP 4 -1.60490411 1.62575741 299 -0.99 0.3244 BETA10
AGEDIAG*TIME -0.01116548 0.02142364 299 -0.52 0.6026 BETA6
TIME2*GROUP 1 -0.12952337 0.61005089 299 -0.21 0.8320 BETA12
TIME2*GROUP 2 -0.15845944 0.67234237 299 -0.24 0.8138 BETA13
TIME2*GROUP 3 0.34191867 0.65628931 299 0.52 0.6028 BETA14
TIME2*GROUP 4 0.39506308 0.66604953 299 0.59 0.5535 BETA15
AGEDIAG*TIME2 0.00225933 0.00882934 299 0.26 0.7982 BETA11

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

GROUP 4 299 15.90 0.0001


AGEDIAG 1 299 3.48 0.0631
TIME*GROUP 4 299 7.85 0.0001
AGEDIAG*TIME 1 299 0.27 0.6026
TIME2*GROUP 4 299 4.44 0.0017
AGEDIAG*TIME2 1 299 0.07 0.7982

Source Null hypothesis


Group H1 : β2 = β3 = β4 = β5 = 0
Age H2 : β1 = 0
Time∗group H3 : β7 = β8 = β9 = β10 = 0
Age∗time H4 : β6 = 0
Time2∗group H5 : β12 = β13 = β14 = β15 = 0
Age∗time2 H6 : β11 = 0

Second parameterization of mean structure

model lnpsa = group age time group*time age*time


time2 group*time2 age*time2 / solution;

Solution for Fixed Effects

Parameter Estimate Std Error DDF T Pr > |T|

INTERCEPT 1.549386 1.085611 49 1.43 0.1599


GROUP 1 -2.647811 0.393111 300 -6.74 0.0001
GROUP 2 -2.072235 0.383595 300 -5.40 0.0001
GROUP 3 -1.252982 0.393223 300 -3.19 0.0016
GROUP 4 0.000000 . . . .
AGE 0.026550 0.014234 300 1.87 0.0631
TIME -1.604904 1.625757 49 -0.99 0.3284
TIME*GROUP 1 2.172966 0.583601 300 3.72 0.0002
TIME*GROUP 2 2.000526 0.567835 300 3.52 0.0005
TIME*GROUP 3 0.568994 0.579436 300 0.98 0.3269
TIME*GROUP 4 0.000000 . . . .
AGE*TIME -0.011165 0.021423 300 -0.52 0.6026
TIME2 0.395063 0.666049 50 0.59 0.5558
TIME2*GROUP 1 -0.524586 0.234146 300 -2.24 0.0258
TIME2*GROUP 2 -0.553522 0.223216 300 -2.48 0.0137
TIME2*GROUP 3 -0.053144 0.226748 300 -0.23 0.8149
TIME2*GROUP 4 0.000000 . . . .
AGE*TIME2 0.002259 0.008829 300 0.26 0.7982

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

GROUP 3 300 20.13 0.0001


AGE 1 300 3.48 0.0631
TIME 1 49 0.07 0.7885
TIME*GROUP 3 300 10.41 0.0001
AGE*TIME 1 300 0.27 0.6026
TIME2 1 50 0.03 0.8616
TIME2*GROUP 3 300 5.93 0.0006
AGE*TIME2 1 300 0.07 0.7982

Source Null hypothesis


Group H7 : β2 = β3 = β4 = β5
Age H8 : β1 = 0
Time H9 : (β7 + β8 + β9 + β10)/4 = 0
Time∗group H10 : β7 = β8 = β9 = β10
Age∗time H11 : β6 = 0
Time2 H12 : (β12 + β13 + β14 + β15)/4 = 0
Time2∗group H13 : β12 = β13 = β14 = β15
Age∗time2 H14 : β11 = 0

7.5 Inference for contrasts of fixed


effects

7.5.1 The CONTRAST statement

• Testing general linear hypotheses of the form

H0 : Lβ = 0,

for some specific known matrix L.

• The set of linear combinations Lβ is sometimes


called a contrast (or a set of contrasts) of the fixed
effects β.
• F-statistic:

      β̂^T L^T [ L ( Σ_{i=1}^N Xi^T Vi(α̂)^{-1} Xi )^{-1} L^T ]^{-1} L β̂
  F = ------------------------------------------------------------------ ,
                                rank(L)

• Null-distribution approximately F with rank(L)


numerator degrees of freedom.

• Several methods available to estimate the


denominator degrees of freedom:
– containment method (default)

– Satterthwaite’s approximation

– ...

• In the context of longitudinal data, the resulting


p-values are often very similar under different
approximation methods.
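For a one-row contrast L, the F-statistic above reduces to a scalar quadratic form that can be computed by hand. A minimal sketch, with invented numbers (these are not taken from the prostate fit):

```python
# Hypothetical estimates purely for illustration (not from the prostate model):
beta_hat = [2.17, 2.00]          # two slope estimates
cov_beta = [[0.34, 0.12],
            [0.12, 0.32]]        # their estimated covariance matrix
L = [1.0, -1.0]                  # contrast: beta1 - beta2 = 0

# L * beta_hat (a scalar here because L has a single row)
l_beta = sum(l * b for l, b in zip(L, beta_hat))

# L * Cov * L'  (the quadratic form, again a scalar for a one-row L)
var_l_beta = sum(L[i] * cov_beta[i][j] * L[j]
                 for i in range(2) for j in range(2))

# F statistic with rank(L) = 1 numerator degrees of freedom
F = l_beta**2 / var_l_beta / 1.0
```

The resulting F would then be referred to an F distribution with 1 numerator degree of freedom and denominator degrees of freedom chosen by, e.g., the containment method.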

Example 1

• Testing whether the local cancer cases evolve differently from the metastatic cancer cases.

• The null hypothesis is specified by

    H0 : β4 = β5, β9 = β10, β14 = β15,

  which is equivalent to testing

         [ 0 0 0 1 −1 0 0 0 0  0 0 0 0 0  0 ]
    H0 : [ 0 0 0 0  0 0 0 0 1 −1 0 0 0 0  0 ] β = 0.
         [ 0 0 0 0  0 0 0 0 0  0 0 0 0 1 −1 ]

• The contrast can be tested within PROC MIXED by


adding the following statement to the original
program:

contrast ’L/R can = Met can’ group 0 0 1 -1,


group*time 0 0 1 -1,
group*time2 0 0 1 -1;

• Several CONTRAST statements (with different


labels) are allowed

• Additional table in the output:

CONTRAST Statement Results

Source NDF DDF F Pr > F

L/R can = Met can 3 299 5.86 0.0007



Example 2

Default F -tests under the first

parameterization of the mean structure.

Null hypothesis Contrast statement


H1 : β2 = β3 = β4 = β5 = 0 contrast ’H1’ group 1 0 0 0,
group 0 1 0 0,
group 0 0 1 0,
group 0 0 0 1;
H2 : β1 = 0 contrast ’H2’ age 1;
H3 : β7 = β8 = β9 = β10 = 0 contrast ’H3’ group*time 1 0 0 0,
group*time 0 1 0 0,
group*time 0 0 1 0,
group*time 0 0 0 1;
H4 : β6 = 0 contrast ’H4’ age*time 1;
H5 : β12 = β13 = β14 = β15 = 0 contrast ’H5’ group*time2 1 0 0 0,
group*time2 0 1 0 0,
group*time2 0 0 1 0,
group*time2 0 0 0 1;
H6 : β11 = 0 contrast ’H6’ age*time2 1;

• The results are:

CONTRAST Statement Results

Source NDF DDF F Pr > F

H1 4 299 15.90 0.0001


H2 1 299 3.48 0.0631
H3 4 299 7.85 0.0001
H4 1 299 0.27 0.6026
H5 4 299 4.44 0.0017
H6 1 299 0.07 0.7982

• Exactly the same as the results obtained from the


default table ‘Tests of Fixed Effects’

Example 3

Testing group-differences under first

parameterization of the mean structure.

Null hypothesis Contrast statement


H7 : β2 = β3 = β4 = β5 contrast ’H7’ group 1 -1 0 0,
group 1 0 -1 0,
group 1 0 0 -1;
H8 : β1 = 0 contrast ’H8’ agediag 1;
H9 : β7 + β8 + β9 + β10 = 0 contrast ’H9’ group*time 1 1 1 1;
H10 : β7 = β8 = β9 = β10 contrast ’H10’ group*time 1 -1 0 0,
group*time 1 0 -1 0,
group*time 1 0 0 -1;
H11 : β6 = 0 contrast ’H11’ agediag*time 1;
H12 : β12 + β13 + β14 + β15 = 0 contrast ’H12’ group*time2 1 1 1 1;
H13 : β12 = β13 = β14 = β15 contrast ’H13’ group*time2 1 -1 0 0,
group*time2 1 0 -1 0,
group*time2 1 0 0 -1;
H14 : β11 = 0 contrast ’H14’ agediag*time2 1;

• The results are:

CONTRAST Statement Results

Source NDF DDF F Pr > F

H7 3 299 20.13 0.0001


H8 1 299 3.48 0.0631
H9 1 299 0.07 0.7876
H10 3 299 10.41 0.0001
H11 1 299 0.27 0.6026
H12 1 299 0.03 0.8610
H13 3 299 5.93 0.0006
H14 1 299 0.07 0.7982

• The same F -statistics as the ones obtained from the


default F -tests under the reparameterized mean
structure

• But different denominator degrees of freedom (due to


containment method)

• However, only small changes in p-values



Example 4

Model reduction

Hierarchical CONTRAST statements lead to the


following simplifications:

• no interaction age × time2

• no interaction age × time

• quadratic time effect the same for both cancer groups

• the quadratic time effect is not significant for the


non-cancer groups

• the linear time effect is not significant for the controls



• Simultaneous testing of all these hypotheses can be


done by adding the following CONTRAST statement
to the original program:

contrast ’Final model’ age*time2 1,


age*time 1,
group*time2 1 -1 0 0,
group*time2 0 0 1 -1,
group*time2 1 0 0 0,
group*time2 0 1 0 0,
group*time 1 0 0 0;

• This results in the following table in the output:

CONTRAST Statement Results

Source NDF DDF F Pr > F

Final model 6 299 0.56 0.7583

• Note that the matrix L is not of full rank:


the third row equals the fifth row minus the sixth row

7.5.2 The ESTIMATE statement

• Can cancer patients be better discriminated from


BPH cases using the rate of increase of PSA rather
than just one single measurement of PSA ?

• How can we estimate the average difference in ln(1 + PSA), as well as the average difference in the rate of increase of ln(1 + PSA), for example 5 years prior to diagnosis?

• We ignore the metastatic cancer cases

• The average difference, 5 years prior to diagnosis, equals

  DIFF(t = 5 years)
    = (β1 age + β4 + β9 t + β14 t²)|_{t=0.5} − (β1 age + β3 + β8 t)|_{t=0.5}
    = −β3 + β4 − 0.5 β8 + 0.5 β9 + 0.25 β14


• The average difference in rate of change, 5 years prior to diagnosis, equals

  DIFFRATE(t = 5 years)
    = ∂/∂t (β1 age + β4 + β9 t + β14 t²)|_{t=0.5} − ∂/∂t (β1 age + β3 + β8 t)|_{t=0.5}
    = −β8 + β9 + β14

• Both quantities are of the form Lβ, for specific


(1 × 15) matrices L.
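The two linear combinations can be checked numerically: evaluate the two group mean functions directly and compare with the Lβ expressions. The β values below are arbitrary placeholders, not the fitted estimates:

```python
# Arbitrary placeholder values for the beta's involved (illustration only)
b = {1: 0.03, 3: -1.25, 4: 0.0, 8: 0.57, 9: 0.0, 14: -0.05}
age, t = 65.0, 0.5

cancer = lambda s: b[1]*age + b[4] + b[9]*s + b[14]*s**2   # local cancer mean
bph    = lambda s: b[1]*age + b[3] + b[8]*s                # BPH mean

# Average difference at t = 0.5, directly and via the linear combination
diff_direct = cancer(t) - bph(t)
diff_formula = -b[3] + b[4] - 0.5*b[8] + 0.5*b[9] + 0.25*b[14]

# Average difference in rate of change: central finite difference vs formula
h = 1e-6
rate_direct = ((cancer(t+h) - bph(t+h)) - (cancer(t-h) - bph(t-h))) / (2*h)
rate_formula = -b[8] + b[9] + b[14]
```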

• Estimation of such linear combinations of fixed effects


can be done by adding the following ESTIMATE
statement:
estimate ’DIFF, t = 5yrs’ group 0 -1 1 0
bph*time -0.5
loccanc*time 0.5
cancer*time2 0.25
/ cl alpha = 0.05;

estimate ’DIFFRATE, t = 5yrs’ bph*time -1


loccanc*time 1
cancer*time2 1
/ cl alpha = 0.05;

• The matrices L are specified as in the CONTRAST


statement.

• Only matrices with one row are allowed

• The option ‘cl alpha=0.05’ to request an approximate


t-type 95% confidence interval

• A new table is added to the output:


ESTIMATE Statement Results

Parameter Estimate Std Error DDF T

DIFF, t = 5yrs 0.22081242 0.14573103 301 1.52


DIFFRATE, t = 5yrs -0.95067653 0.16587627 301 -5.73

Pr > |T| Alpha Lower Upper

0.1308 0.05 -0.0660 0.5076


0.0001 0.05 -1.2771 -0.6243

• 5 years prior to diagnosis, there is on average no significant difference, but the average rates of change are significantly different

7.6 Inference for variance components

7.6.1 Example

• Do we need subject-specific quadratic time-effects in


our model ?

• If not, then all b3i equal zero

• In the marginal model, this corresponds to setting


appropriate variance components equal to zero:
H0 : d13 = d23 = d33 = 0

• Note that, due to the fact that the hierarchical and


the marginal model are not equivalent, rejecting
above null-hypothesis does not imply the presence of
subject-specific quadratic time-effects

• The above H0 rather tests whether the covariance


structure Vi = Zi D Zi^T + Σi can be simplified.

7.6.2 Wald tests for variance components

• Adding the ‘covtest’ option to the PROC MIXED


statement, SAS reports standard errors and Wald
tests for all variance components:

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate Std Error Z Pr > |Z|

UN(1,1) XRAY 0.44315715 0.09348532 4.74 0.0001 D11


UN(2,1) XRAY -0.49032753 0.12385751 -3.96 0.0001 D12
UN(2,2) XRAY 0.84160121 0.20326977 4.14 0.0001 D22
UN(3,1) XRAY 0.14795060 0.04701598 3.15 0.0017 D13
UN(3,2) XRAY -0.29997791 0.08195018 -3.66 0.0003 D23
UN(3,3) XRAY 0.11415478 0.03453537 3.31 0.0009 D33
TIMECLSS XRAY 0.02837400 0.00227601 12.47 0.0001 sigma2

• In our context, d33 = 0 does not make sense if d13


and d23 are not zero

• The Wald tests assume asymptotic normality of the parameter estimates. However, this is not satisfied for the variances d11, d22, d33 and σ², due to boundary problems.

• Classical results on MLE’s don’t hold in this context



7.6.3 Likelihood ratio test

• No asymptotic chi-squared null-distribution for the


likelihood ratio test statistic, due to the boundary
problems

• However, some results are available for the case Σi = σ² Ini

• Typically, the asymptotic null-distribution is a mixture


of χ2 distributions, rather than a single χ2

• We denote the LR test statistic by

    −2 ln λN = −2 ln [ L(θ̂0) / L(θ̂1) ],

  where θ̂0 and θ̂1 are the ML or REML estimates under the null hypothesis and under the alternative hypothesis, respectively.

Case 1: No Random Effects versus One Random


Effect

• The hypothesis of interest is

    H0 : D = 0 versus H1 : D = d11,

  where d11 is a non-negative scalar.

• The asymptotic null distribution:

    −2 ln λN →_d (1/2) χ²_0 + (1/2) χ²_1

Case 2: One versus Two Random Effects

• The hypothesis of interest is

         [ d11  0 ]
    H0 : [  0   0 ]

  for some strictly positive d11, versus H1 that D is a (2 × 2) positive semi-definite matrix.

• The asymptotic null distribution:

    −2 ln λN →_d (1/2) χ²_1 + (1/2) χ²_2

Applied to the Prostate Data

• Model 1: random intercepts, time, time2


Model 2: random intercepts, time
Model 3: random intercepts
Model 4: no random effects

• Test statistics:

  Maximum likelihood:

  Hypothesis                −2 ln(λN)   Correct null   Naive null
  Model 2 versus Model 1       94.270   χ²_{2:3}       χ²_3
  Model 3 versus Model 2      161.016   χ²_{1:2}       χ²_2
  Model 4 versus Model 3      240.114   χ²_{0:1}       χ²_1

  Restricted maximum likelihood:

  Hypothesis                −2 ln(λN)   Correct null   Naive null
  Model 2 versus Model 1       92.796   χ²_{2:3}       χ²_3
  Model 3 versus Model 2      165.734   χ²_{1:2}       χ²_2
  Model 4 versus Model 3      245.874   χ²_{0:1}       χ²_1

  where χ²_{k:k+1} denotes the mixture (1/2) χ²_k + (1/2) χ²_{k+1}.

• The covariance structure cannot be simplified by deleting random effects from the model
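The mixture p-values in the table can be computed from closed-form chi-squared survival functions for small degrees of freedom. A minimal stdlib-only sketch (the closed forms for 1, 2 and 3 degrees of freedom are standard; the χ²_0 component is a point mass at zero):

```python
import math

def chi2_sf(x, df):
    """Survival function P(chi2_df > x) for small integer df only."""
    if df == 0:
        return 0.0                              # point mass at zero
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("df not supported in this sketch")

def mixture_p(x, df_low, df_high):
    """p-value under the 50:50 chi-squared mixture null distribution."""
    return 0.5 * chi2_sf(x, df_low) + 0.5 * chi2_sf(x, df_high)

# Model 4 versus Model 3 (ML): correct null is the chi2_{0:1} mixture
p_correct = mixture_p(240.114, 0, 1)
p_naive = chi2_sf(240.114, 1)

# Model 2 versus Model 1 (ML): correct null is the chi2_{2:3} mixture
p_2v1 = mixture_p(94.270, 2, 3)
```

With test statistics this large, both the correct and the naive p-values are essentially zero; the mixture matters near conventional significance thresholds, where it halves (or nearly halves) the naive p-value.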
Chapter 8

Parametric Modeling Families

8.1 Continuous Case

8.1.1 Marginal Models

E(Yij |xij ) = xTij β


cf. cross-sectional study

• CD4+: characterize the average CD4+ level as a


function of time.
• Assumptions about the correlation structure must be
included in the model.

CHAPTER 8. PARAMETRIC MODELING FAMILIES 166

8.1.2 Random-Effects Models

Correlation arises because regression coefficients vary


among individuals (heterogeneity).

E(Yij |β i, xij ) = xTij β i.


The number of parameters increases with the number of
subjects

⇒ inconsistency

⇒ further assumptions:

βi = β + U i
with

• β an unknown but constant regression parameter vector: fixed effects
• U i a random vector with mean zero: random
effects.
• Random effects are useful in describing individual
CD4+ trajectories.

8.1.3 Transition Models

The conditional expectation of Yij , given past outcomes


Yi1, . . . , Yi,j−1 and (past and present) covariates.

E(Yij | Yi,j−1, . . . , Yi1, xij) = xij^T β + α Yi,j−1.

This model could be fitted using regression software.
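Because the model conditions on the lagged outcome, pooled least squares on rows (xij, Yi,j−1, Yij) fits it. A minimal sketch with noise-free synthetic data (all numbers invented), so the true β = 2 and α = 0.5 are recovered exactly:

```python
# Each subject has a fixed covariate x and an initial outcome y0
subjects = [(0.2, 1.0), (0.5, 0.3), (0.9, 0.7)]   # (x, y0), invented
beta, alpha = 2.0, 0.5

rows = []
for x, y_prev in subjects:
    for _ in range(3):                  # three transitions per subject
        y = beta * x + alpha * y_prev   # deterministic transition mechanism
        rows.append((x, y_prev, y))
        y_prev = y

# Normal equations for regressing y on (x, y_prev), no intercept
sxx = sum(x * x for x, z, y in rows)
sxz = sum(x * z for x, z, y in rows)
szz = sum(z * z for x, z, y in rows)
sxy = sum(x * y for x, z, y in rows)
szy = sum(z * y for x, z, y in rows)
det = sxx * szz - sxz * sxz
beta_hat = (szz * sxy - sxz * szy) / det
alpha_hat = (sxx * szy - sxz * sxy) / det
```

With real (noisy) data the same pooled regression would give consistent estimates, provided the first-order Markov assumption holds.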



8.2 Longitudinal Generalized Linear


Models

• Also non-normal data can be measured repeatedly


(over time).
• GLMs need to be extended.
• Basic idea: account for correlation.
• Normal longitudinal data form a fairly recent branch
in statistics.
• Non-normal longitudinal data are an active area of
research.
• β denotes mean response parameters
• α is used for covariance parameters
• There are several ways to express generalized linear
models for longitudinal data.
• Unlike in the normal case, the choice made is crucial.
There is no transition from one family to the other in
terms of simple functions of the model parameters.

8.2.1 Marginal Models

• Specify E(Yj |X).


The probability of each outcome (or set of outcomes)
is directly modelled (integrating or summing the other
outcomes away).
• Sometimes called: population averaged method.
• Non-likelihood models:
– Empirical generalized least squares (EGLS)
∗ (Koch et al, 1977) (SAS CATMOD)
– Generalized estimating equations (GEE)
∗ Liang and Zeger (1986)
∗ Lipsitz, Laird and Harrington (1991)
∗ Liang, Zeger and Qaqish (1992)
∗ Zhao and Prentice (1992)
∗ Lipsitz and Zhao (1995)
∗ Rotnitzky, Robins, and Zhao (1995)
∗ ...
∗ SAS procedure GENMOD

• Likelihood methods:
– Multivariate Probit Model
Ashford and Sowden (1970)

– Bahadur Model
Bahadur (1962)

– Odds ratio model for bivariate data


Dale (1986)

– Odds ratio models for multivariate data


∗ constraint equations approach
Lang and Agresti (1994)
∗ multivariate Dale model
Molenberghs and Lesaffre (1994)
∗ multivariate logit type models
Glonek and McCullagh (1995)

8.2.2 Random-effects Models

• Specify E(Y |b, X).


The outcome(s), conditional on an unobserved
(latent) random effect or set of random effects.
• Beta-binomial model
– Skellam (1948)
– Kleinman (1973)
– Molenberghs, Declerck, and Aerts (1997)

• Generalized linear mixed models


– Engel and Keen (1992)
– Breslow and Clayton (1993)
– Wolfinger and O’Connell (1995)

• Hierarchical generalized linear models


– Lee and Nelder (1996)

• GLIMMIX macro in SAS


• NLMIXED procedure in SAS
• MIXOR
• MLwiN
• ...

8.2.3 Conditional Models

• Specify E(Yj |{Yk }, X).


An outcome or set of outcomes is modelled,
conditional on the other outcomes or at least a set of
other outcomes.
• In a longitudinal setting, a very relevant family is the
so-called set of transition models:
E(Yj |{Y1, . . . , Yj−1}, X),
(e.g., Markov models.)
• Rather than integrating over other outcomes, they
have to be conditioned upon.
• multivariate exponential family: Cox (1972)
this model forms the basis for the well known
loglinear model.

8.2.4 Marginal Versus Conditional Models

• Fitting marginal models is fairly involved.


• Marginal association parameters highly constrained.
• Marginal models are reproducible
(upward compatible).
This property is particularly relevant when sequences
are of unequal length.
• Some models combine “the best of both worlds”:
– mixed marginal-conditional model
Fitzmaurice and Laird (1993)

– Alternating logistic regressions


Carey, Diggle, Zeger (1994)

– 2nd order mixed parametrization/ GEE2


Molenberghs and Ritter (1996)
Molenberghs and Danielson (1999)

8.3 Main Focus

• Marginal: Generalized Estimating Equations


• Random Effects: Generalized Linear Mixed Models
Chapter 9

Modelling Repeated Categorical Data

9.1 Notation

It is useful to have a double notation for multivariate


categorical data:

• The standard (regression) notation


• The contingency table notation

CHAPTER 9. MODELLING REPEATED CATEGORICAL DATA 176

9.1.1 The Standard (Regression) Notation

This notation is in agreement with previous notation.

• Let the outcomes for subject i = 1, . . . , N be


denoted as (Yi1, . . . , Yini ).
– Binary data: each component is either 0 or 1.
– (Binary data: each component is either 1 or 2.)
– (Binary data: each component is either −1 or +1.)
– (Categorical data: Yij ∈ {1, . . . , c}.)
• The corresponding covariate vector is xij .

9.1.2 The Table Notation

It often happens that a lot of individuals have the same


covariate

(e.g. a trial with treatment or control as the only


covariate).

A contingency table approach is useful.

• Let Zi(k1, . . . , kn) be the cells in an n-way contingency table.
  Here, kj = 0, 1 for j = 1, . . . , n.
• Here, i denotes the covariate level or design level, grouping all individuals with covariate vector xi or covariate vectors xi1, . . . , xin.
  For ease of notation, no new index has been introduced.
• The corresponding cell probabilities are
µi(k1, . . . , kn).
• The tables and probabilities are summarized as Z i
and µi.

• Lower dimensional tables are found by summing over


an appropriate set of indices.
• The univariate probabilities are:
µij = µi(+, . . . , +, kj = 1, +, . . . , +)
with corresponding count Zij .
The shorthand notation µij and Zij is easier than the
+ notation.
• Note that
E(Yij ) = Pr(Yij = 1) = µij .

• We also use bivariate counts and probabilities:


– two-way counts Zijk
– two-way probabilities
µijk = E(Yij Yik ) = Pr(Yij = 1, Yik = 1).

This notation extends easily to categorical data.

Then, each component Yij assumes values 1, . . . , c for a


c category variable.

9.2 A Conditional Model

The Log-linear Model

Specifies the joint distribution of Y i in terms of a


multivariate exponential family:

  f(y_i; θ_i) = exp( Σ_{j=1}^n θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 + · · · + θi1...n yi1 · · · yin − A(θ_i) )

             = c(θ_i) exp( Σ_{j=1}^n θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 + · · · + θi1...n yi1 · · · yin ),

where A(θ i) and c(θ i) represent the same normalizing


constant, written in additive and multiplicative way,
respectively.

• θ i is the canonical parameter, consisting of first,


second, up to nth order components.
• The model was proposed by Cox (1972).

9.2.1 Interpretation of Parameters

The parameters have a conditional interpretation:

  θij = ln [ Pr(Yij = 1 | Yik = 0, k ≠ j) / Pr(Yij = 0 | Yik = 0, k ≠ j) ]

⇒ the first order parameters (main effects) are interpreted as conditional logits.

Similarly,

  θijk = ln [ Pr(Yij = 1, Yik = 1 | Yiℓ = 0, ℓ ≠ j, k) Pr(Yij = 0, Yik = 0 | Yiℓ = 0, ℓ ≠ j, k) /
              Pr(Yij = 1, Yik = 0 | Yiℓ = 0, ℓ ≠ j, k) Pr(Yij = 0, Yik = 1 | Yiℓ = 0, ℓ ≠ j, k) ]

These are conditional log odds ratios.



• Advantages:
– The parameter vector is not constrained. All
values of θ ∈ IR yield nonnegative probabilities.
– Calculation of the joint probabilities is fairly
straightforward:
∗ ignore the normalizing constant
∗ evaluate the density for all possible sequences y
∗ sum all terms to yield c(θ)−1
• Drawbacks:
– Due to above conditional interpretation, the
models are less useful for regression.
The dependence of E(Yij ) on covariates involves all
parameters, not only the main effects.
– The interpretation of the parameters depends on
the length ni of a sequence.
Shorter sequences imply that one conditions on less
outcomes, changing interpretation with length of sequence.

These drawbacks make marginal models or models


that combine marginal and conditional features better
suited.
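The enumeration recipe in the advantages bullet (evaluate the unnormalized density for every possible sequence, then sum to recover the normalizing constant) can be sketched for n = 3 binary outcomes; the θ values below are arbitrary, chosen only to show the computation:

```python
import math
from itertools import product

# Arbitrary quadratic log-linear parameters for n = 3 binary outcomes
theta_main = [-0.5, 0.2, 0.1]
theta_pair = {(0, 1): 0.3, (0, 2): 0.0, (1, 2): 0.4}

def kernel(y):
    """Unnormalized log-probability of a 0/1 sequence y."""
    s = sum(t * yj for t, yj in zip(theta_main, y))
    s += sum(t * y[j] * y[k] for (j, k), t in theta_pair.items())
    return s

seqs = list(product([0, 1], repeat=3))            # all 2^3 sequences
weights = [math.exp(kernel(y)) for y in seqs]
c_inv = sum(weights)                              # c(theta)^{-1}
probs = [w / c_inv for w in weights]              # joint probabilities
```

Every θ vector yields a valid (positive, summing-to-one) distribution, illustrating the unconstrained-parameter advantage; the cost is that every marginal probability involves all the θ's through the normalizing constant.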

9.3 Marginal Models

Whereas the probability mass function of the loglinear


model follows at once, a set of choices have to be made
in a marginal model:

• Description of mean profiles (univariate parameters)


and of association (bivariate and higher order
parameters)
• Degree of modelling:
– joint distribution fully specified ⇒ likelihood
procedures
– or only a limited number of moments ⇒ e.g.,
generalized estimating equations

Minimally, one specifies:

• η i(µi) = {ηi1(µi1), . . . , ηin(µin)}


• E(Y i) = µi and η i(µi) = X iβ
• var(Y i) = φv(µi) where v(.) is a known variance
function
• corr(Y i) = R(α)

Remarks

• Choosing logit links ⇒ (extensions of) logistic


regression.
• Although this is seemingly a straightforward
generalization of univariate GLMs, there are problems:
– the probability model is not fully specified:
∗ only the first moment is specified; the variances
usually follow immediately from the first
moment (cf. binary, counts);
∗ the covariances involve the correlations, they are
specified separately, using a different parameter
vector α
∗ estimation of α is still left blank
∗ still, the third and higher moments have not
been specified
– some of these models are severely constrained
(e.g., Bahadur)
– estimation can proceed through generalized
estimating equations

9.3.1 Link Functions

An important sub-family of
η i = η i(µi)
is the log-contrast family:
η i(µi) = C ln(Aµi),
with

• A a matrix defining an appropriate set of probabilities,


• C a matrix defining contrasts of log probabilities.
• Advantages:
– Facilitate computations
– Encompass popular links such as logit link

We will consider special cases of link functions.



Univariate Link Functions

(Sometimes called marginal links, although this term is


slightly misleading).

• The marginal logit link:


ηij = ln(µij ) − ln(1 − µij ) = logit(µij ).

• Some links, such as the probit link:


ηij = Φ1^{-1}(µij),

and the complementary log-log link are excluded from


this particular family.
However, this is not a major obstacle.

Pairwise Association

• Success probability approach (Ekholm 1991):
  logit link for the two-way probabilities,

    ηijk = ln(µijk) − ln(1 − µijk) = logit(µijk).

• Marginal correlation coefficient (Bahadur model):

    ρijk = (µijk − µij µik) / √( µij (1 − µij) µik (1 − µik) ),
    ηijk = ln(1 + ρijk) − ln(1 − ρijk).

• Marginal odds ratio (Dale model):

    ψijk = µijk (1 − µij − µik + µijk) / [ (µik − µijk)(µij − µijk) ]
         = [ Pr(Yij = 1, Yik = 1) Pr(Yij = 0, Yik = 0) ] / [ Pr(Yij = 0, Yik = 1) Pr(Yij = 1, Yik = 0) ],
    ηijk = ln(ψijk).

Observe that this odds ratio has the same structure as


the one in the log-linear model. However, we do not need
to condition on the other outcomes. Hence, the name
marginal odds ratio.
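For a single pair of binary outcomes, the three association links can be computed side by side from the marginal and joint success probabilities. The probabilities below are hypothetical, chosen only for illustration:

```python
import math

# Hypothetical pairwise probabilities for two binary outcomes
mu_j, mu_k = 0.3, 0.4          # marginal success probabilities
mu_jk = 0.18                   # joint success probability P(Yj = 1, Yk = 1)

# Success-probability (Ekholm) link: logit of the joint success probability
eta_ekholm = math.log(mu_jk / (1.0 - mu_jk))

# Bahadur link: Fisher-type transform of the marginal correlation
rho = (mu_jk - mu_j * mu_k) / math.sqrt(
    mu_j * (1 - mu_j) * mu_k * (1 - mu_k))
eta_bahadur = math.log((1 + rho) / (1 - rho))

# Dale link: log marginal odds ratio from the 2x2 table of probabilities
psi = (mu_jk * (1 - mu_j - mu_k + mu_jk)
       / ((mu_k - mu_jk) * (mu_j - mu_jk)))
eta_dale = math.log(psi)
```

All three links agree in sign here: the joint probability 0.18 exceeds the independence value 0.3 × 0.4 = 0.12, so each measure indicates positive association.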

Higher Order Association

• All three extend naturally to higher orders


• Working in terms of correlations leads to the Bahadur
model
• Using odds ratios all the way through yields the
multivariate Dale model (multivariate odds ratio
model)

For instance, let us consider the three-way odds ratio:

  ψijkℓ = [ Pr(Yij = 1, Yik = 1, Yiℓ = 1) Pr(Yij = 1, Yik = 0, Yiℓ = 0)
            × Pr(Yij = 0, Yik = 1, Yiℓ = 0) Pr(Yij = 0, Yik = 0, Yiℓ = 1) ]
        / [ Pr(Yij = 1, Yik = 1, Yiℓ = 0) Pr(Yij = 1, Yik = 0, Yiℓ = 1)
            × Pr(Yij = 0, Yik = 1, Yiℓ = 1) Pr(Yij = 0, Yik = 0, Yiℓ = 0) ]

        = ψijk(Yiℓ = 1) / ψijk(Yiℓ = 0).

Higher order odds ratios are defined similarly.


Chapter 10

Case Study: NTP Data

Developmental Toxicity Studies

• Research Triangle Institute US National Toxicology Program

• The effect in mice of 3 chemicals:


– DEHP: di(2-ethylhexyl)phthalate
– EG: ethylene glycol
– DYME: diethylene glycol dimethyl ether
• Implanted fetuses:
– death/resorbed
– viable:
∗ weight
∗ malformations: visceral, skeletal, external

CHAPTER 10. CASE STUDY: NTP DATA 189

10.1 Data Structure of Developmental


Toxicity Studies

[Schematic of the data structure: each dam carries mi implants; implants are either viable (ni) or non-viable (ri: death or resorption); viable fetuses are scored for weight and for malformation (zi), with malformations classified into types 1, . . . , K.]

10.2 Design

• Restrict attention to malformations:


– visceral
– skeletal
– external
– collapsed: any
• control group and 3 or 4 dose groups
  dose is a cluster-level covariate

• each group: 20 to 30 dams


• offspring per litter: 2 to 17 fetuses

10.3 Goals

• Describe dose-response relationship


• Test for dose effect
• Estimation of dose effect
• Determine a benchmark dose: BD
effective dose, virtually safe dose
• Use a fully quantitative approach

10.4 Issues

• Account for clustering (litter effect)


• Univariate versus multivariate approach:
– individual malformation types
– trivariate model
– collapsed outcome
• Fetus versus dam to mimic processes in humans
• Method of estimation ?
• Family of models ?

10.5 Quadratic Log-linear Model

Cox (1972) and others suggest that often the higher


order interactions can be neglected. This claim is
supported by empirical evidence.

10.6 The quadratic exponential model

  f(y_i; θ_i) = exp( Σ_{j=1}^n θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 − A(θ_i) )

             = c(θ_i) exp( Σ_{j=1}^n θij yij + Σ_{j1<j2} θij1j2 yij1 yij2 ).

10.7 The linear exponential model

  f(y_i; θ_i) = exp( Σ_{j=1}^n θij yij − A(θ_i) ),

so that this model reflects the assumption of independence.

• Equals logistic regression.


• Both conditional and marginal.

10.8 Specialized to Clustered Binary


Data

• NTP data
• Yij is malformation indicator for fetus j in litter i
• Code Yij as −1 or 1
• di is dose level at which litter i is exposed
• Simplification:
θij = θi = β0 + βddi,
θij1j2 = βa.

• Using

    Zi = Σ_{j=1}^{ni} Yij

  we obtain

    f(zi | θi, βa) = (ni choose zi) exp{ θi zi + βa zi (ni − zi) − A(θi) },
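As a sanity check, the clustered density can be normalized by direct enumeration over zi = 0, . . . , ni; the parameter values below are arbitrary, not fitted:

```python
import math

# Arbitrary illustrative parameters for one litter of size n_i
n_i, theta, beta_a = 10, -1.2, 0.15

def log_kernel(z):
    """Log of the unnormalized clustered-binary density at count z."""
    return (theta * z + beta_a * z * (n_i - z)
            + math.log(math.comb(n_i, z)))

# exp(A(theta)): sum of the unnormalized density over all counts
norm = sum(math.exp(log_kernel(z)) for z in range(n_i + 1))
probs = [math.exp(log_kernel(z)) / norm for z in range(n_i + 1)]
```

Because the normalizing constant is a sum of only ni + 1 terms, likelihood evaluation for this clustered model is cheap, unlike the general log-linear model whose constant involves 2^ni sequences.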

10.9 Quadratic Clustered Loglinear


Model

Maximum Likelihood Estimates (Standard Errors) for the Conditional Model.

Outcome Parameter DEHP EG DYME


External β0 -2.81(0.58) -3.01(0.79) -5.78(1.13)
βd 3.07(0.65) 2.25(0.68) 6.25(1.25)
βa 0.18(0.04) 0.25(0.05) 0.09(0.06)
Visceral β0 -2.39(0.50) -5.09(1.55) -3.32(0.98)
βd 2.45(0.55) 3.76(1.34) 2.88(0.93)
βa 0.18(0.04) 0.23(0.09) 0.29(0.05)
Skeletal β0 -2.79(0.58) -0.84(0.17) -1.62(0.35)
βd 2.91(0.63) 0.98(0.20) 2.45(0.51)
βa 0.17(0.04) 0.20(0.02) 0.25(0.03)
Collapsed β0 -2.04(0.35) -0.81(0.16) -2.90(0.43)
βd 2.98(0.51) 0.97(0.20) 5.08(0.74)
βa 0.16(0.03) 0.20(0.02) 0.19(0.03)

10.10 The Bahadur Model

Maximum Likelihood Estimates (Standard Errors) for the Bahadur Model.

Outcome Parameter DEHP EG DYME


External β0 -4.93(0.39) -5.25(0.66) -7.25(0.71)
βd 5.15(0.56) 2.63(0.76) 7.94(0.77)
βa 0.11(0.03) 0.12(0.03) 0.11(0.04)
Visceral β0 -4.42(0.33) -7.38(1.30) -6.89(0.81)
βd 4.38(0.49) 4.25(1.39) 5.49(0.87)
βa 0.11(0.02) 0.05(0.08) 0.08(0.04)
Skeletal β0 -4.67(0.39) -2.49(0.11) -4.27(0.61)
βd 4.68(0.56) 2.96(0.18) 5.79(0.80)
βa 0.13(0.03) 0.27(0.02) 0.22(0.05)
Collapsed β0 -3.83(0.27) -2.51(0.09) -5.31(0.40)
βd 5.38(0.47) 3.05(0.17) 8.18(0.69)
βa 0.12(0.03) 0.28(0.02) 0.12(0.03)
Chapter 11

Generalized Estimating Equations

We have seen that the score equations for estimating β in a classical (univariate) GLM take the form

  S(βj) = Σ_{i=1}^N (∂µi/∂βj) vi^{-1} (yi − µi) = 0

with vi = Var(Yi).

(Here, Yi is scalar.)

CHAPTER 11. GENERALIZED ESTIMATING EQUATIONS 198

The corresponding vector notation


S(β) = D^T V^{-1} (y − µ) = 0
where

• D is an N × p matrix with (i, j)th element ∂µi/∂βj
• V is an N × N diagonal matrix with non-zero
elements proportional to Var(Yi)
• y and µ are N -vectors with elements yi and µi

In the longitudinal setting, all of this carries over, except


that

• the single element scalars yi and µi are replaced by ni


element vectors y i and µi associated with the
sequences of ni observations on the ith subject
• the corresponding matrices V i = Var(Y i) involve a
set of nuisance parameters, α say, which determine
the covariance structure of Y i.

The score equations for the complete set of data, Y = (Y1, . . . , YN), from the N subjects can now be written as

  S(β) = Σ_{i=1}^N Di^T [Vi(α)]^{-1} (yi − µi) = 0.

• These equations have the same form as the score


equations of the full likelihood procedure of, e.g., the
odds ratio model.
• We restrict specification to the first moment only
(hence only Y i).
The second moment is only specified in the variances,
not in the correlations.
• Solution of the score equations uses a multivariate
version of the iteratively weighted least squares
algorithm, provided that the variance matrices Vi(α)
are known (including the numerical values of α).
• Alternatively, a Fisher scoring algorithm can be used
• Liang and Zeger (1986)

11.1 Large Sample Properties

As N → ∞,

  √N (β̂ − β) ∼ N(0, I0^{-1}),

where

  I0 = Σ_{i=1}^N Di^T [Vi(α)]^{-1} Di

• (Unrealistic) Conditions:
– α is known
– the parametric form for V i(α) is known
• Solution: working correlation matrix

11.2 Unknown Covariance Structure

Keep the score equations

  S(β) = Σ_{i=1}^N Di^T [Vi(α)]^{-1} (yi − µi) = 0
BUT

• suppose V i(.) is not the true variance of Y i but only


a plausible guess, a so-called working correlation
matrix
• specify correlations and not covariances, because the
variances follow from the mean structure
• the score equations are solved as before

The asymptotic normality results change to

  √N (β̂ − β) ∼ N(0, I0^{-1} I1 I0^{-1}),

  I0 = Σ_{i=1}^N Di^T [Vi(α)]^{-1} Di,
  I1 = Σ_{i=1}^N Di^T [Vi(α)]^{-1} Var(Yi) [Vi(α)]^{-1} Di.

11.3 The Sandwich Estimator

• This is the so-called sandwich estimator:


– I0 is the bread
– I1 is the filling (ham or cheese)
• the known-variance result is recovered when the guess
is actually equal to the true model
• the estimators β̂ are consistent even if the working
correlation matrix is incorrect

• An estimate is found by replacing the unknown


variance matrix Var(Y i) by
(Y i − µ̂i)(Y i − µ̂i)T
where µ̂i = µi(β̂)
Even if this estimator is bad for Var(Y i) it leads to a
good estimate of I1, provided that:
– the replication in the data is sufficiently large
– the same model for µi is fitted to groups of
subjects
– observation times do not vary too much between
subjects
• a bad choice of working correlation matrix can affect
the efficiency of β̂
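The sandwich construction can be sketched in the simplest possible case: a common mean for clustered binary data with identity link and working independence, so that I0 and I1 become scalars. The data are made up for illustration; because the clusters here are strongly homogeneous, the robust variance exceeds the model-based one:

```python
# Toy clustered binary data (invented); strong within-cluster homogeneity
clusters = [[1, 1, 1], [0, 0], [1, 0, 1], [0, 0, 0]]

n_tot = sum(len(c) for c in clusters)
mu = sum(sum(c) for c in clusters) / n_tot       # moment estimate of the mean
v = mu * (1 - mu)                                # binary variance function

# Bread: I0 = sum_i D_i' V_i^{-1} D_i ; D_i is a vector of ones (identity link)
I0 = sum(len(c) / v for c in clusters)

# Filling: I1 with Var(Y_i) replaced by the outer product of residuals,
# which collapses to the squared residual sum per cluster in the scalar case
I1 = sum((sum(y - mu for y in c) / v) ** 2 for c in clusters)

var_model = 1.0 / I0          # model-based ("naive") variance of mu-hat
var_sandwich = I1 / I0**2     # robust sandwich variance of mu-hat
```

Under working independence with truly independent observations the two variances would roughly agree; positive within-cluster correlation inflates the robust estimate, which is exactly the correction the sandwich provides.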

11.4 The Working Correlation Matrix

Write

  Vi(β, α) = φ Ai^{1/2}(β) Ri(α) Ai^{1/2}(β).

Variance function: Ai is (ni × ni) diagonal with


elements v(µij ), the known GLM variance function.

Working correlation: Ri(α), possibly dependent on a


different set of parameters α.

Overdispersion parameter: φ, assumed 1 or estimated


from the data.

The unknown quantities are expressed in terms of the


Pearson residuals

  eij = (yij − µij) / √v(µij).
Note that eij depends on β.

11.4.1 Estimation of Working Correlation

Liang and Zeger (1986) proposed moment-based


estimates for the working correlation.

Some of the more popular ones:

• Independence:
    Corr(Yij, Yik) = 0 (j ≠ k).
  There are no parameters to be estimated.

• Exchangeable:
    Corr(Yij, Yik) = α (j ≠ k),
    α̂ = (1/N) Σ_{i=1}^N [ 1/(ni(ni − 1)) ] Σ_{j≠k} eij eik.

• AR(1):
    Corr(Yij, Yi,j+t) = α^t (t = 0, 1, . . . , ni − j),
    α̂ = (1/N) Σ_{i=1}^N [ 1/(ni − 1) ] Σ_{j≤ni−1} eij ei,j+1.

• Unstructured:
    Corr(Yij, Yik) = αjk (j ≠ k),
    α̂jk = (1/N) Σ_{i=1}^N eij eik.

The dispersion parameter is estimated by

  φ̂ = (1/N) Σ_{i=1}^N (1/ni) Σ_{j=1}^{ni} e²ij.
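The exchangeable and dispersion moment estimators can be sketched for two small clusters of Pearson residuals (toy numbers, as if obtained from a fitted marginal mean model):

```python
# Toy Pearson residuals for N = 2 clusters of sizes 3 and 2 (invented)
residuals = [[0.5, -0.2, 0.1], [1.0, 0.8]]
N = len(residuals)

# Exchangeable alpha-hat: average within-cluster cross-product of residuals
alpha_hat = 0.0
for e in residuals:
    n_i = len(e)
    cross = sum(e[j] * e[k]
                for j in range(n_i) for k in range(n_i) if j != k)
    alpha_hat += cross / (n_i * (n_i - 1))
alpha_hat /= N

# Dispersion phi-hat: squared residuals averaged within, then across, clusters
phi_hat = sum(sum(ej**2 for ej in e) / len(e) for e in residuals) / N
```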

11.5 Fitting GEE

The standard procedure, implemented in the SAS


procedure GENMOD.

1. Compute initial estimates for β, using a univariate


GLM (i.e., assuming independence).
2. • Compute Pearson residuals eij .
• Compute estimates for α.
• Compute Ri(α).
• Compute estimate for φ.
1/2 1/2
• Compute Vi(β, α) = φAi (β)Ri(α)Ai (β).
3. Update estimate for β:

   β^(t+1) = β^(t) − [ Σ_{i=1}^N Di^T Vi^{-1} Di ]^{-1} [ Σ_{i=1}^N Di^T Vi^{-1} (yi − µi) ].

4. Iterate 2.–3. until convergence.

Estimates of precision by means of I0^{-1} and I0^{-1} I1 I0^{-1}.
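Under working independence with binary outcomes, step 3 reduces to Fisher scoring for ordinary logistic regression, so the fitting loop can be sketched in a few lines. The data are toy values (a real GEE fit would in addition compute the sandwich standard errors from the clusters):

```python
import math

# Toy (dose x, outcome y) pairs; clustering would only affect standard errors
data = [(0.0, 0), (0.0, 0), (0.0, 1), (0.5, 0),
        (0.5, 1), (1.0, 1), (1.0, 1), (1.0, 0)]

b0, b1 = 0.0, 0.0                     # step 1: initial estimates
for _ in range(25):                   # steps 2-4: iterate to convergence
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = mu * (1.0 - mu)           # binary variance function v(mu)
        g0 += (y - mu)                # score components D'V^{-1}(y - mu)
        g1 += x * (y - mu)
        h00 += w;  h01 += x * w;  h11 += x * x * w   # D'V^{-1}D
    det = h00 * h11 - h01 * h01
    b0 += ( h11 * g0 - h01 * g1) / det   # 2x2 solve of the update step
    b1 += (-h01 * g0 + h00 * g1) / det
```

At convergence the score equations are satisfied (both score components vanish), which is the defining property of the GEE estimate regardless of whether the working correlation is correct.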



11.6 The NTP Data

• Marginal model: interest in effect of exposure to dose.


• Correlation structure:
– AR(1) and unstructured meaningless.
– Exchangeable or independence are sensible choices.
• Variables in data set:
– dose: between 0 and 1
– litter: indicator of cluster
– visceral (0=normal, 1=malformed)
– skeletal
– external
– collapsed outcome

11.6.1 PROC GENMOD Code

proc genmod data=m.dehp3;


title ’Visceral, Exchangeable Working Assumptions’;
class litter;
model visceral=dose / dist=bin;
repeated subject=litter / type=exch covb corrw modelse;
run;

proc genmod data=m.dehp3;


title ’Visceral, Independence Working Assumptions’;
class litter;
model visceral=dose / dist=bin;
repeated subject=litter / type=ind covb modelse;
run;

11.6.2 Discussion of Program

• GEE estimation is invoked by the REPEATED


statement.

• It has been available since Version 6.12.

• The MODEL statement is classical (univariate GLM,


MIXED).
– The ‘dist=’ option specifies distribution and
default link.
– The ‘link=’ option specifies the link function.

• Useful options to the REPEATED statement:


– ‘type=’: specifies the correlation structure: exch,
ar(1), ind, un,. . .
– ‘covb’: requests covariance matrix of parameter
estimates
– ‘corrw’: requests working correlation matrix
– ‘modelse’: prints model based standard errors, in
addition to robust standard errors
– ‘obstats’: prints table containing response values,
predicted values, linear predictor, residuals for each
observation.

11.6.3 PROC GENMOD Output

Visceral, Exchangeable Working Assumptions

The GENMOD Procedure

Model Information

Description Value

Data Set M.DEHP3


Distribution BINOMIAL
Link Function LOGIT
Dependent Variable VISCERAL
Observations Used 1082
Number Of Events 72
Number Of Trials 1082

Parameter Information

Parameter Effect

PRM1 INTERCEPT
PRM2 DOSE

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 1080 407.5135 0.3773


Scaled Deviance 1080 407.5135 0.3773
Pearson Chi-Square 1080 1076.8015 0.9970
Scaled Pearson X2 1080 1076.8015 0.9970
Log Likelihood . -203.7567 .

Analysis Of Initial Parameter Estimates

Parameter DF Estimate Std Err ChiSquare Pr>Chi

INTERCEPT 1 -4.4692 0.2759 262.4482 0.0001


DOSE 1 4.4014 0.4280 105.7533 0.0001
SCALE 0 1.0000 0.0000 . .

NOTE: The scale parameter was held fixed.

GEE Model Information

Description Value

Correlation Structure Exchangeable


Subject Effect LITTER (108 levels)
Number of Clusters 108
Correlation Matrix Dimension 16
Maximum Cluster Size 16
Minimum Cluster Size 2

Covariance Matrix (Model-Based)


Covariances are Above the Diagonal and Correlations are Below

Parameter
Number PRM1 PRM2

PRM1 0.13427 -0.17695


PRM2 -0.88490 0.29780

Covariance Matrix (Empirical)


Covariances are Above the Diagonal and Correlations are Below

Parameter
Number PRM1 PRM2

PRM1 0.13445 -0.18714


PRM2 -0.86347 0.34936

Working Correlation Matrix

COL1 COL2 COL3 ...

ROW1 1.0000 0.0800 0.0800 ...


ROW2 0.0800 1.0000 0.0800 ...
ROW3 0.0800 0.0800 1.0000 ...

. . . . .
. . . . .
. . . . .

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Empirical 95% Confidence Limits


Parameter Estimate Std Err Lower Upper Z Pr>|Z|

INTERCEPT -4.4977 0.3667 -5.2163 -3.7790 -12.27 0.0000


DOSE 4.5506 0.5911 3.3922 5.7091 7.6991 0.0000
Scale 0.9974 . . . . .

NOTE: The scale parameter for GEE estimation was computed as the
square root of the normalized Pearson’s chi-square

Analysis Of GEE Parameter Estimates


Model-Based Standard Error Estimates

Model 95% Confidence Limits


Parameter Estimate Std Err Lower Upper Z Pr>|Z|

INTERCEPT -4.4977 0.3664 -5.2158 -3.7795 -12.27 0.0000


DOSE 4.5506 0.5457 3.4811 5.6202 8.3389 0.0000
Scale 0.9974 . . . . .

Visceral, Independence Working Assumptions

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 1080 407.5135 0.3773


Scaled Deviance 1080 407.5135 0.3773
Pearson Chi-Square 1080 1076.8015 0.9970
Scaled Pearson X2 1080 1076.8015 0.9970
Log Likelihood . -203.7567 .

Analysis Of Initial Parameter Estimates

Parameter DF Estimate Std Err ChiSquare Pr>Chi

INTERCEPT 1 -4.4692 0.2759 262.4482 0.0001


DOSE 1 4.4014 0.4280 105.7533 0.0001
SCALE 0 1.0000 0.0000 . .

NOTE: The scale parameter was held fixed.

GEE Model Information

Description Value

Correlation Structure Independent


Subject Effect LITTER (108 levels)
Number of Clusters 108
Correlation Matrix Dimension 16
Maximum Cluster Size 16
Minimum Cluster Size 2

Covariance Matrix (Model-Based)


Covariances are Above the Diagonal and Correlations are Below

Parameter
Number PRM1 PRM2

PRM1 0.07616 -0.10315


PRM2 -0.87292 0.18333

Covariance Matrix (Empirical)


Covariances are Above the Diagonal and Correlations are Below

Parameter
Number PRM1 PRM2

PRM1 0.12993 -0.18001


PRM2 -0.85509 0.34107

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Empirical 95% Confidence Limits


Parameter Estimate Std Err Lower Upper Z Pr>|Z|

INTERCEPT -4.4692 0.3605 -5.1757 -3.7627 -12.40 0.0000


DOSE 4.4014 0.5840 3.2568 5.5461 7.5365 0.0000
Scale 1.0004 . . . . .

NOTE: The scale parameter for GEE estimation was


computed as the square root of the normalized Pearson’s chi-square.

Analysis Of GEE Parameter Estimates


Model-Based Standard Error Estimates

Model 95% Confidence Limits


Parameter Estimate Std Err Lower Upper Z Pr>|Z|

INTERCEPT -4.4692 0.2760 -5.0101 -3.9283 -16.19 0.0000


DOSE 4.4014 0.4282 3.5622 5.2406 10.280 0.0000
Scale 1.0004 . . . . .

11.6.4 Discussion of Output

• Initial estimates are based on univariate GLM: logistic


regression.
• Initial and ‘independence’ estimates are the same.
• The ‘independence’ standard errors correct for
overdispersion.
• Compare:
– Exchangeable and Independence estimates and
standard errors.
– Model based and robust standard errors.

11.7 GEE: Alternative 1

• Classical approach:
– Estimating equation for β
– Moment-based estimation for α
– Liang and Zeger (1986)
– SAS PROC GENMOD
• Alternative approach GEE 1.5:
– Estimating equation for β
– Estimating equation for α
– Prentice (1988)
– SAS macro gee1corr.mac by Stuart Lipsitz

Form of Equations

Σ_{i=1}^{N} Di^T Vi^{-1} (Yi − µi) = 0,

Σ_{i=1}^{N} Ei^T Wi^{-1} (Zi − δi) = 0,

where

Zijk = (Yij − µij)(Yik − µik) / √[ µij(1 − µij) µik(1 − µik) ],

δijk = E(Zijk).

The joint asymptotic distribution of √N(β̂ − β) and
√N(α̂ − α) is normal, with variance-covariance matrix
consistently estimated by

      [ A  0 ] [ Λ11  Λ12 ] [ A  B^T ]
  N   [ B  C ] [ Λ21  Λ22 ] [ 0   C  ] ,

 −1
N

A =  DiT Vi−1 Di  ,
i=1
 −1   −1
N
 N
 N
∂Z i   
B =  EiT Wi−1 Ei   EiT Wi−1 DiT Vi−1 Di  ,
i=1 i=1 ∂β i=1
 −1
N

C =  EiT Wi−1 Ei  ,
i=1
N

Λ11 = DiT Vi−1 Cov(Y i )Vi−1 Di ,
i=1
N

Λ12 = DiT Vi−1 Cov(Y i , Z i )Wi−1 Ei ,
i=1

Λ21 = Λ12 ,
N

Λ22 = EiT Wi−1 Cov(Z i )Wi−1 Ei ,
i=1

and

Statistic       Estimator
Var(Yi)         (Yi − µi)(Yi − µi)^T
Cov(Yi, Zi)     (Yi − µi)(Zi − δi)^T
Var(Zi)         (Zi − δi)(Zi − δi)^T
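For one cluster, the Zijk statistics above can be computed directly from the outcomes and fitted marginal means; a small sketch (outcomes and means hypothetical):

```python
import numpy as np
from itertools import combinations

y = np.array([1.0, 0.0, 1.0])            # one cluster's binary outcomes (hypothetical)
mu = np.array([0.6, 0.3, 0.5])           # fitted marginal means (hypothetical)

# Z_ijk: standardized cross-products of the centred outcomes, one per pair (j, k)
z = np.array([(y[j] - mu[j]) * (y[k] - mu[k])
              / np.sqrt(mu[j] * (1 - mu[j]) * mu[k] * (1 - mu[k]))
              for j, k in combinations(range(len(y)), 2)])
```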

Special Case: Exchangeability

• Then, δijk = ρi, the correlation between any two


outcomes of the same cluster i.
• Fisher’s z transform: α = ln(1 + ρ) − ln(1 − ρ).
• Define

Zi = ( Yi1 Yi2, Yi1 Yi3, . . . , Yi,ni−1 Yini )^T .
Hence,

E(Zijk) = µijk = ρ √[ µij(1 − µij) µik(1 − µik) ] + µij µik,

Var(Zijk) = µijk(1 − µijk),

∂E(Zijk)/∂α = [ 2 exp(α)/(exp(α) + 1)² ] √[ µij(1 − µij) µik(1 − µik) ],

C = { Σ_{i} Σ_{j<k} [ 2 exp(α)/(exp(α) + 1)² ]² µij(1 − µij) µik(1 − µik) / [ µijk(1 − µijk) ] }^{-1}.

• Standard error for ρ under exchangeability is obtained


by multiplying the standard error for α by
2 exp(α)/(exp(α) + 1)² (the delta method).
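The back-transformation and the delta-method standard error above are a one-liner each. A small numeric sketch (the values of α̂ and its standard error are hypothetical):

```python
import math

alpha_hat, se_alpha = 0.5, 0.1           # hypothetical estimate and standard error

# Inverse Fisher z transform: rho = (exp(alpha) - 1) / (exp(alpha) + 1)
rho_hat = (math.exp(alpha_hat) - 1) / (math.exp(alpha_hat) + 1)

# d rho / d alpha = 2 exp(alpha) / (exp(alpha) + 1)^2
deriv = 2 * math.exp(alpha_hat) / (math.exp(alpha_hat) + 1) ** 2

# Delta method: se(rho) = se(alpha) * |d rho / d alpha|
se_rho = se_alpha * deriv
```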

11.7.1 gee1corr.mac Code

%include ’c:\sas\stat\sample\gee1corr.mac’;

%gee(data=m.dehp3,y=visceral,x=dose,id=litter,corr=exc);

%gee(data=m.dehp3,y=visceral,x=dose,id=litter,corr=ind);

It is essential to code the outcome as 0 and 1:

• for gee1corr.mac
• for GLIMMIX macro

11.7.2 gee1corr.mac Output

Visceral, exchangeable assumption

Correlation Structure: Exchangeable

PARAMETER ESTIMATES with naive variance

VARIABLE ESTIMATE SE_EST Z P


INTERCEP -4.507824 0.3958657 -11.38726 0
DOSE 4.5890696 0.5845582 7.8504921 4.108E-15

PARAMETER ESTIMATES with robust variance

VARIABLE ESTIMATE SE_EST Z P


INTERCEP -4.507824 0.3685713 -12.23053 0
DOSE 4.5890696 0.5932811 7.735068 1.033E-14

CORR SECORR Z P
0.1100235 0.0455011 2.4180411 0.0156043

Visceral, independence assumption

Correlation Structure: Independence

PARAMETER ESTIMATES with naive variance

VARIABLE ESTIMATE SE_EST Z P


INTERCEP -4.469209 0.2758728 -16.20025 0
DOSE 4.401443 0.4280043 10.283642 0

PARAMETER ESTIMATES with robust variance

VARIABLE ESTIMATE SE_EST Z P


INTERCEP -4.469209 0.3604581 -12.39869 0
DOSE 4.401443 0.5840154 7.5365191 4.829E-14

11.8 GEE: Alternative 2

• The previous approaches are formulated directly in terms
of the binary outcomes
• This approach is based on a linearization

Write
y i = µi + εi
with
η i = g(µi),
η i = Xiβ,
Var(y i) = Var(εi) = Σi.
Here,

• η i is a vector of linear predictors,


• g(.) is the (vector) link function.

11.8.1 Estimation

Nelder and Wedderburn (1972)

Solve iteratively:
Σ_{i=1}^{N} Xi^T Wi Xi β = Σ_{i=1}^{N} Xi^T Wi y*i,

where

Wi = Di Σi^{-1} Di,

y*i = η̂i + Di^{-1}(yi − µ̂i),

Di = ∂µi/∂ηi,

Σi = Var(εi),

µi = E(yi).
Remarks:

• y ∗i is called “working variable” or “pseudo data”.


• Given the pseudo data, β̂ can be determined using
PROC MIXED.
• For linear models, Di = Ini and standard generalized
least squares follows.
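The pseudo-data scheme can be sketched for an independent logistic model (simulated data, hypothetical names); each pass builds the working variable and solves the weighted normal equations. For the logit link, Di = diag(µi(1 − µi)) and Σi = diag(µi(1 − µi)) with φ = 1, so Wi reduces to diag(µi(1 − µi)):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # hypothetical design
y = rng.binomial(1, 0.4, n).astype(float)

beta = np.zeros(p)
for _ in range(30):
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    d = mu * (1.0 - mu)                  # dmu/deta; also v(mu) for the logit link
    z = eta + (y - mu) / d               # working variable ("pseudo data")
    w = d                                # W = D Sigma^{-1} D = diag(d) here (phi = 1)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
```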

11.8.2 The Variance Structure

The variance can be written as


Σi = φ Ai^{1/2}(β) Ri(α) Ai^{1/2}(β)
where

• φ is a scale (overdispersion) parameter,


• Ai = v(µi), expressing the mean-variance relation
(this is a function of β),
• Ri(α) describes the correlation structure:
– If independence is assumed then Ri(α) = Ini .
Equivalently, the scale parameter can be placed
along the diagonal.
– Other structures, such as compound symmetry,
AR(1),. . . can be assumed as well.

11.8.3 GLIMMIX Macro Code

%include ’c:\sas\stat\sample\glimmix.sas’;

%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=cs r;),
error=binomial,
link=logit,
title=’visceral, CS, Model Based’,
options=mixprintlast
);

%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=simple r;),
error=binomial,
link=logit,
title=’visceral, Independence, Model Based’,
options=mixprintlast
);

%glimmix(
data=m.dehp3,
procopt=method=reml empirical,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=cs r;),
error=binomial,
link=logit,
title=’visceral, CS, Empirically Corrected’,
options=mixprintlast
);

%glimmix(
data=m.dehp3,
procopt=method=reml empirical,
stmts=%str(
class litter;
model visceral=dose / solution;
repeated / subject=litter type=simple r;),
error=binomial,
link=logit,
title=’visceral, Independence, Empirically Corrected’,
options=mixprintlast
);

11.8.4 Discussion of Macro

• Described in
– Littell et al (1996)
– glimmix.sas

• Intended for generalized linear mixed models

• Random-effects applications are postponed

• “PROC MIXED” part of the macro:


– PROCOPT defines the PROC MIXED statement
options
– STMTS is a string that contains the body of the
MIXED procedure

• “GLM” part of the macro:


– GLM specified by means of ERROR and LINK
arguments
– OPTIONS=MIXPRINTLAST produces the output
of the underlying PROC MIXED call, produced at
convergence

• Double iteration scheme ⇒ slow

• PROC NLMIXED provides an alternative



11.8.5 GLIMMIX Output

The first output is for

• Visceral malformation
• Exchangeable correlation (CS)
• Model based standard errors (the standard in PROC
MIXED)

First, the MIXED output is produced.


We present a selection.

Visceral, CS, Model Based

The MIXED Procedure

REML Estimation Iteration History

Iteration Evaluations Objective Criterion

0 1 4843.2167338
1 2 4823.1918629 0.00000048
2 1 4823.1907084 0.00000000

Convergence criteria met.

R Matrix for LITTER 38


Weighted by _W

Row COL1 COL2 COL3 ...

1 91.60982268 7.00560953 7.00560953 ...


2 7.00560953 91.60982268 7.00560953 ...

3 7.00560953 7.00560953 91.60982268 ...

. . . . .
. . . . .
. . . . .

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate

CS LITTER 0.07639306
Residual 0.92257140

Model Fitting Information for _Z


Weighted by _W

Description Value

Observations 1082.000
Res Log Likelihood -3404.05
Akaike’s Information Criterion -3406.05
Schwarz’s Bayesian Criterion -3411.03
-2 Res Log Likelihood 6808.098
Null Model LRT Chi-Square 20.0260
Null Model LRT DF 1.0000
Null Model LRT P-Value 0.0000

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.49639983 0.36363994 106 -12.36 0.0001


DOSE 4.54563842 0.54219604 106 8.38 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 70.29 0.0001



Next, the GLIMMIX macro produces its own output.

Visceral, CS, Model Based

Covariance Parameter Estimates

Cov
Parm Estimate

CS 0.07639306

GLIMMIX Model Statistics

Description Value

Deviance 407.7891
Scaled Deviance 442.0136
Pearson Chi-Square 1076.0335
Scaled Pearson Chi-Square 1166.3418
Extra-Dispersion Scale 0.9226

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.4964 0.3636 106 -12.36 0.0001


DOSE 4.5456 0.5422 106 8.38 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 70.29 0.0001



We now present a similar analysis for independence


working assumptions.

Visceral, Independence, Model Based

The MIXED Procedure

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate

DIAG LITTER 0.99703847

Model Fitting Information for _Z


Weighted by _W

Description Value

Observations 1082.000
Res Log Likelihood -3415.38
Akaike’s Information Criterion -3416.38
Schwarz’s Bayesian Criterion -3418.88
-2 Res Log Likelihood 6830.770
Null Model LRT Chi-Square 0.0000
Null Model LRT DF 0.0000
Null Model LRT P-Value 1.0000

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.46920901 0.27546399 106 -16.22 0.0001


DOSE 4.40144298 0.42737005 106 10.30 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 106.07 0.0001



Visceral, Independence, Model Based

Covariance Parameter Estimates

Cov
Parm Estimate

DIAG 0.99703847

GLIMMIX Model Statistics

Description Value

Deviance 407.5135
Scaled Deviance 407.5135
Pearson Chi-Square 1076.8015
Scaled Pearson Chi-Square 1076.8015
Extra-Dispersion Scale 1.0000

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.4692 0.2755 106 -16.22 0.0001


DOSE 4.4014 0.4274 106 10.30 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 106.07 0.0001



Visceral, CS, Empirically Corrected

Covariance Parameter Estimates

Cov
Parm Estimate

CS 0.07639306

GLIMMIX Model Statistics

Description Value

Deviance 407.7891
Scaled Deviance 442.0136
Pearson Chi-Square 1076.0335
Scaled Pearson Chi-Square 1166.3418
Extra-Dispersion Scale 0.9226

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.4964 0.3664 106 -12.27 0.0001


DOSE 4.5456 0.5908 106 7.69 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 59.20 0.0001



Visceral, Independence, Empirically Corrected

Covariance Parameter Estimates

Cov
Parm Estimate

DIAG 0.99703847

GLIMMIX Model Statistics

Description Value

Deviance 407.5135
Scaled Deviance 407.5135
Pearson Chi-Square 1076.8015
Scaled Pearson Chi-Square 1076.8015
Extra-Dispersion Scale 1.0000

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -4.4692 0.3605 106 -12.40 0.0001


DOSE 4.4014 0.5840 106 7.54 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 106 56.80 0.0001



11.8.6 Discussion of Output

• GLIMMIX output copies parts from PROC MIXED


output:
– Parameter estimates, standard errors,. . .
– Covariance parameter and overdispersion scale:
∗ Compound symmetry:
· Cov. Par.: covariance between two
littermates (cf. random intercept variance)
· Extra-dispersion: residual variance
(“measurement error”)
∗ Independence:
· Cov. Par.: residual variance (“measurement
error”)
· Extra-dispersion: 1
– Tests of fixed effects

• GLIMMIX produces in addition model statistics


– deviance: treating all outcomes as if they were
independent
– scaled deviance=deviance/extra-dispersion
parameter
– They should not be trusted/used
• PROC MIXED output that should not be used:
– Model fitting information

11.9 Comparison of GEE Estimates


(Standard Errors)

GEE1 Estimates (Model Based Standard Errors; Robust Standard Errors) for
the DEHP Data. Exchangeable Working Assumptions.

Outcome Parameter GENMOD PRENTICE GLIMMIX (repeated)


External β0 -4.98(0.40;0.37) -4.99(0.46;0.37) -5.00(0.36;0.37)
βd 5.33(0.57;0.55) 5.32(0.65;0.55) 5.32(0.51;0.55)
φ 0.88 0.65
ρ 0.11 0.11(0.04) 0.06
Visceral β0 -4.50(0.37;0.37) -4.51(0.40;0.37) -4.50(0.36;0.37)
βd 4.55(0.55;0.59) 4.59(0.58;0.59) 4.55(0.54;0.59)
φ 1.00 0.92
ρ 0.08 0.11(0.05) 0.08
Skeletal β0 -4.83(0.44;0.45) -4.82(0.47;0.44) -4.82(0.46;0.45)
βd 4.84(0.62;0.63) 4.84(0.67;0.63) 4.84(0.65;0.63)
φ 0.98 0.86
ρ 0.12 0.14(0.06) 0.13
Collapsed β0 -4.05(0.32;0.31) -4.06(0.35;0.31) -4.04(0.33;0.31)
βd 5.84(0.57;0.61) 5.89(0.62;0.61) 5.82(0.58;0.61)
φ 1.00 0.96
ρ 0.11 0.15(0.05) 0.11

GEE1 Estimates (Model Based Standard Errors; Robust Standard Errors) for
the DEHP Data. Independence Working Assumptions.

Outcome Parameter GENMOD PRENTICE GLIMMIX (repeated)


External β0 -5.06(0.30;0.38) -5.06(0.33;0.38) -5.06(0.28;0.38)
βd 5.31(0.44;0.57) 5.31(0.48;0.57) 5.31(0.42;0.57)
φ 0.90 0.74
Visceral β0 -4.47(0.28;0.36) -4.47(0.28;0.36) -4.47(0.28;0.36)
βd 4.40(0.43;0.58) 4.40(0.43;0.58) 4.40(0.43;0.58)
φ 1.00 1.00
Skeletal β0 -4.87(0.31;0.47) -4.87(0.31;0.47) -4.87(0.32;0.47)
βd 4.89(0.46;0.65) 4.90(0.47;0.65) 4.90(0.47;0.65)
φ 0.99 1.02
Collapsed β0 -3.98(0.22;0.30) -3.98(0.22;0.30) -3.98(0.22;0.30)
βd 5.56(0.40;0.61) 5.56(0.40;0.61) 5.56(0.41;0.61)
φ 0.99 1.04

11.10 GEE2: Odds Ratios

Follows as a simplification from the likelihood methods


presented:

• with conditional working assumptions


mixed-marginal conditional model

• with marginal working assumptions


fully marginal odds ratio model

• Conditional working assumptions


Considering the quadratic exponential family
 
f(yi | Ψi) = exp[ Ψi^T vi − A(Ψi) ].

(Zhao and Prentice, 1990) yields a set of GEE2:

S(β) = Σ_{i=1}^{N} (∂µi/∂β)^T Mi^{-1} (vi − µi) = 0.
– likelihood inference: assume correct specification
– GEE inference (+ robust variance estimator)

• Marginal working assumptions


– Set the higher order interactions to zero in the
fully marginal model.
– A similar set of GEE2 arises.

Remarks

• Also close to GEE1–Alternative 1 (Prentice 1988).


• Performance of both approaches is virtually identical.
• Orthogonality property is an advantage for conditional
assumptions.
• Both versions extend to categorical outcomes.
• Marginal outcomes + relevant part of association
modelled.
• Computations are stable and relatively fast.

11.11 GEE2: Correlations

• Start from the Bahadur model:


• Model:
– Marginal logit πij
– Pairwise correlations ρijk
• (Independence) working assumptions:
– Zero third order correlations
– Zero fourth order correlations
• Application to NTP data:
– Clustering and constant correlation:
logit(πi) = β0 + βd di,

ln[ (1 + ρi) / (1 − ρi) ] = βa.

11.11.1 The NTP Data

GEE2 Estimates (Standard Errors) for the Bahadur Model.

Outcome Parameter DEHP EG DYME


External β0 -4.98(0.37) -5.63(0.67) -7.45(0.73)
βd 5.29(0.55) 3.10(0.81) 8.15(0.83)
βa 0.15(0.05) 0.15(0.05) 0.13(0.05)
Visceral β0 -4.49(0.36) -7.50(1.05) -6.89(0.75)
βd 4.52(0.59) 4.37(1.14) 5.51(0.89)
βa 0.15(0.06) 0.02(0.02) 0.11(0.07)
Skeletal β0 -5.23(0.40) -4.05(0.33)
βd 5.35(0.60) 4.77(0.43)
βa 0.18(0.02) 0.30(0.03)
Collapsed β0 -5.23(0.40) -4.07(0.71) -5.75(0.48)
βd 5.35(0.60) 4.89(0.90) 8.82(0.91)
βa 0.18(0.02) 0.26(0.14) 0.18(0.12)

11.11.2 Discussion

• Association parameter is higher than in likelihood


version.

• The likelihood version is highly constrained:


For instance, the allowable range of βa for the external outcome
in the DEHP data is (−0.0164; 0.1610) when β0 and βd are fixed
at their MLE. This range excludes the MLE under a
beta-binomial model. It translates to (−0.0082; 0.0803) on the
correlation scale.

• Dose effect estimate is comparable.

• A GEE2 estimate is valid as soon as the second, third,


and fourth order joint probabilities are nonnegative.
The likelihood analysis requires all joint probabilities
to be nonnegative.

• GEE2: simple likelihood ratio tests are unavailable


Robust Wald and score tests can be used

• GEE2: joint probabilities are unavailable

• GEE2: more robust against misspecification of higher


order association

11.12 Alternating Logistic Regression

When marginal odds ratios are used to model association,


α can be estimated using ALR, which is

• almost as efficient as GEE2


• almost as easy (computationally) as GEE1

(Another GEE1.5.)

Let µijk be as before and let αijk = ln(ψijk ) be the


marginal log odds ratio. Then
 
logit Pr(Yij = 1 | Yik = yik) = αijk yik + ln[ (µij − µijk) / (1 − µij − µik + µijk) ]

• αijk can be modelled in terms of predictors


• the second term is treated as an offset
• the estimating equations for β and α are solved in
turn, and the “alternating” between both sets is
repeated until convergence.
• this is needed because the offset clearly depends on β.

Diggle, Liang, and Zeger (1994)
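As an illustration of the offset term, the joint probability µijk can be obtained from the two marginal means and the odds ratio ψijk via Plackett's formula (a standard result, not derived in these notes); the helper names below are hypothetical:

```python
import math

def joint_prob(mu_j, mu_k, psi):
    """P(Y_j = 1, Y_k = 1) from the two marginal means and the odds
    ratio psi (Plackett's formula)."""
    if abs(psi - 1.0) < 1e-12:
        return mu_j * mu_k                       # independence
    a = 1.0 + (mu_j + mu_k) * (psi - 1.0)
    disc = a * a - 4.0 * psi * (psi - 1.0) * mu_j * mu_k
    return (a - math.sqrt(disc)) / (2.0 * (psi - 1.0))

def alr_offset(mu_j, mu_k, psi):
    """The ALR offset ln[(mu_j - mu_jk) / (1 - mu_j - mu_k + mu_jk)]."""
    mu_jk = joint_prob(mu_j, mu_k, psi)
    return math.log((mu_j - mu_jk) / (1.0 - mu_j - mu_k + mu_jk))
```

With αijk = ln ψijk, the conditional logit of Yij given Yik = 1 equals αijk plus this offset, and given Yik = 0 it equals the offset alone.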


Chapter 12

Case Study: Analgesic Trial

• GSA was dichotomized as follows:


GSABIN = 1 if GSA ≤ 3 (Very Good to Moderate),
         0 otherwise.
• GEE with UNstructured correlation structure:

CHAPTER 12. CASE STUDY: ANALGESIC TRIAL 250

proc genmod data=gsa;


ods listing exclude parameterestimates classlevels parminfo;
class patid timecls;
model gsabin = pca0 time|time / dist=b;
repeated subject=patid / type=un corrw within=timecls modelse;
run;

The GENMOD Procedure

Model Information

Data Set WORK.GSA


Distribution Binomial
Link Function Logit
Dependent Variable gsabin
Observations Used 1137
Probability Modeled Pr( gsabin = 1 )

Response Profile

Ordered Ordered
Level Value Count

1 0 206
2 1 931

Parameter Information

Parameter Effect

Prm1 Intercept
Prm2 pca0
Prm3 TIME
Prm4 TIME*TIME

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 1133 1064.1723 0.9393


Scaled Deviance 1133 1064.1723 0.9393
Pearson Chi-Square 1133 1136.8928 1.0034
Scaled Pearson X2 1133 1136.8928 1.0034
Log Likelihood -532.0862

Algorithm converged.

Analysis Of Initial Parameter Estimates

Standard Wald 95% Chi-


Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq

Intercept 1 2.8016 0.4902 1.8408 3.7624 32.66 <.0001


pca0 1 -0.2058 0.0864 -0.3751 -0.0365 5.67 0.0172
TIME 1 -0.7864 0.3874 -1.5456 -0.0271 4.12 0.0424
TIME*TIME 1 0.1774 0.0793 0.0219 0.3329 5.00 0.0254
Scale 0 1.0000 0.0000 1.0000 1.0000

NOTE: The scale parameter was held fixed.

GEE Model Information

Correlation Structure Unstructured


Within-Subject Effect timecls (4 levels)
Subject Effect PATID (395 levels)
Number of Clusters 395
Correlation Matrix Dimension 4
Maximum Cluster Size 4

GEE Model Information

Minimum Cluster Size 1

Algorithm converged.

Working Correlation Matrix

Col1 Col2 Col3 Col4

Row1 1.0000 0.1770 0.2481 0.2021


Row2 0.1770 1.0000 0.1811 0.1177
Row3 0.2481 0.1811 1.0000 0.4594
Row4 0.2021 0.1177 0.4594 1.0000

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept 2.8731 0.4592 1.9730 3.7731 6.26 <.0001


pca0 -0.2278 0.0959 -0.4157 -0.0398 -2.37 0.0176
TIME -0.7779 0.3234 -1.4117 -0.1440 -2.41 0.0162
TIME*TIME 0.1670 0.0656 0.0384 0.2955 2.55 0.0109

Analysis Of GEE Parameter Estimates


Model-Based Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept 2.8731 0.4836 1.9252 3.8209 5.94 <.0001


pca0 -0.2278 0.1025 -0.4287 -0.0268 -2.22 0.0263
TIME -0.7779 0.3275 -1.4198 -0.1359 -2.37 0.0176
TIME*TIME 0.1670 0.0665 0.0366 0.2973 2.51 0.0121
Scale 1.0000 . . . . .
NOTE: The scale parameter was held fixed.

12.1 Comparison of GEE Estimates

• under different working correlation structures


• models without dropout patterns

Variable INDependence EXCHangeable AutoRegressive UNstructured


Intercept 2.80 (0.469; 0.490) 2.92 (0.463; 0.494) 2.94 (0.469; 0.488) 2.87 (0.459; 0.484)
Time -0.79 (0.341; 0.387) -0.83 (0.328; 0.343) -0.90 (0.334; 0.352) -0.78 (0.323; 0.328)
Time2 0.18 (0.070; 0.079) 0.18 (0.067; 0.070) 0.20 (0.069; 0.072) 0.17 (0.066; 0.067)
Basel. PCA -0.21 (0.095; 0.086) -0.23 (0.095; 0.103) -0.22 (0.095; 0.099) -0.23 (0.096; 0.103)
Parameter estimates and standard errors (empirical; model-based).

Estimated working correlation structures:

IND                           EXCH
1  0      0      0            1  0.219  0.219  0.219
   1      0      0               1      0.219  0.219
          1      0                      1      0.219
                 1                             1

AR                            UN
1  0.247  0.061  0.015        1  0.177  0.248  0.202
   1      0.247  0.061           1      0.181  0.178
          1      0.247                  1      0.459
                 1                             1

12.2 Comparison of GEE Estimates


• under different working correlation structures
• models with dropout patterns

Variable INDependence EXCHangeable AutoRegressive UNstructured


Pattern 1 2.22 (0.584; 0.613) 2.29 (0.577; 0.614) 2.29 (0.577; 0.602) 2.26 (0.582; 0.625)
Pattern 2 3.13 (0.550; 0.574) 3.10 (0.548; 0.564) 3.09 (0.555; 0.562) 3.09 (0.543; 0.549)
Time Patt. 1 -0.62 (0.333; 0.367) -0.67 (0.324; 0.338) -0.67 (0.324; 0.336) -0.65 (0.327; 0.349)
Time Patt. 2 -0.91 (0.416; 0.462) -0.87 (0.413; 0.414) -0.88 (0.416; 0.425) -0.86 (0.409; 0.398)
Time2 Patt. 2 0.18 (0.083; 0.092) 0.17 (0.082; 0.083) 0.17 (0.083; 0.085) 0.17 (0.081; 0.080)
Basel. PCA -0.16 (0.097; 0.090) -0.16 (0.096; 0.108) -0.16 (0.097; 0.103) -0.16 (0.098; 0.107)
Parameter estimates and standard errors (empirical; model-based).


Estimated working correlation structures:

IND                           EXCH
1  0      0      0            1  0.219  0.219  0.219
   1      0      0               1      0.219  0.219
          1      0                      1      0.219
                 1                             1

AR                            UN
1  0.235  0.055  0.013        1  0.143  0.288  0.228
   1      0.235  0.055           1      0.220  0.098
          1      0.235                  1      0.443
                 1                             1

12.2.1 Fitted Model



12.3 Use of GLIMMIX

• Similar models can be fitted using SAS macro


GLIMMIX
• Its use is not recommended in general, though!
• Code (to fit UN structure):

%glimmix(data=gsa, procopt=%str(method=ml noclprint),


stmts=%str(
class patid timecls;
model gsabin = time|time pca0 / s;
repeated timecls / sub=patid type=un rcorr=3;
),
error=binomial);

12.3.1 Output

The Mixed Procedure

Model Information

Data Set WORK._DS


Dependent Variable _z
Weight Variable _w
Covariance Structure Unstructured
Subject Effect PATID
Estimation Method ML
Residual Variance Method None
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Between-Within

Dimensions

Covariance Parameters 10
Columns in X 4
Columns in Z 0
Subjects 395
Max Obs Per Subject 4
Observations Used 1137
Observations Not Used 0
Total Observations 1137

Parameter Search

CovP1 CovP2 CovP3 CovP4 CovP5 CovP6 CovP7 CovP8

1.0059 0.1865 0.9974 0.2923 0.2196 0.9470 0.2752 0.1697

Parameter Search

CovP9 CovP10 Log Like -2 Log Like

0.4186 0.9273 -2635.7524 5271.5049

Iteration History

Iteration Evaluations -2 Log Like Criterion



1 1 5271.50486573 0.00000000

Convergence criteria met.

Estimated R Correlation Matrix


for PATID 3/Weighted by _w

Row Col1 Col2 Col3 Col4

1 1.0000 0.1862 0.2995 0.2849


2 0.1862 1.0000 0.2260 0.1764
3 0.2995 0.2260 1.0000 0.4467
4 0.2849 0.1764 0.4467 1.0000

Covariance Parameter Estimates

Standard Z
Cov Parm Subject Estimate Error Value Pr Z

UN(1,1) PATID 1.0059 0.07251 13.87 <.0001


UN(2,1) PATID 0.1865 0.06167 3.02 0.0025
UN(2,2) PATID 0.9974 0.08152 12.23 <.0001
UN(3,1) PATID 0.2923 0.07012 4.17 <.0001
UN(3,2) PATID 0.2196 0.07256 3.03 0.0025
UN(3,3) PATID 0.9470 0.08969 10.56 <.0001
UN(4,1) PATID 0.2752 0.07490 3.67 0.0002
UN(4,2) PATID 0.1697 0.07264 2.34 0.0195
UN(4,3) PATID 0.4186 0.07169 5.84 <.0001
UN(4,4) PATID 0.9273 0.09065 10.23 <.0001

Fit Statistics

Log Likelihood -2635.8


Akaike’s Information Criterion -2645.8
Schwarz’s Bayesian Criterion -2665.6
-2 Log Likelihood 5271.5

PARMS Model Likelihood Ratio Test

DF Chi-Square Pr > ChiSq

10 0.00 1.0000

Solution for Fixed Effects

Standard
Effect Estimate Error DF t Value Pr > |t|

Intercept 2.8873 0.4848 393 5.96 <.0001


TIME -0.7774 0.3236 393 -2.40 0.0168
TIME*TIME 0.1646 0.06540 393 2.52 0.0123
pca0 -0.2314 0.1038 393 -2.23 0.0263

Type 3 Tests of Fixed Effects

Num Den
Effect DF DF F Value Pr > F

TIME 1 393 5.77 0.0168


TIME*TIME 1 393 6.33 0.0123
pca0 1 393 4.97 0.0263

GLIMMIX Model Statistics

Description Value

Deviance 1065.2602
Scaled Deviance 1065.2602
Pearson Chi-Square 1101.4964
Scaled Pearson Chi-Square 1101.4964
Extra-Dispersion Scale 1.0000

GEE1 Estimates with Standard Errors (Empirical;


Model-Based): Exchangeable Working Assumptions.

GENMOD PRENTICE GLIMMIX


Variable (repeated)

Intercept 2.918 (0.463; 0.494) 2.940 (0.463; 0.494) 2.942 (0.463; 0.488)
Time -0.833 (0.328; 0.343) -0.843 (0.326; 0.334) -0.843 (0.326; 0.330)
Time2 0.177 (0.067; 0.070) 0.178 (0.066; 0.068) 0.178 (0.066; 0.067)
Basel. PCA -0.226 (0.095; 0.103) -0.230 (0.095; 0.105) -0.230 (0.095; 0.104)
ρ 0.219 0.260 (0.048) 0.264 (0.037†)

Parameter estimates and standard errors (empirical; model-based).


† Standard error calculated by the delta method.

12.4 Alternating Logistic Regressions

• Consider the odds ratio instead of the correlation as a
measure of dependence
• Code to fit the EXCHangeable structure (a common log
odds ratio for all pairs):
proc genmod data=gsa;
where weight ne .;
class patid timecls;
model gsabin = time|time pca0 / dist=b;
repeated subject=patid / within=timecls logor=exch;
ods listing exclude classlevels parminfo parameterestimates;
run;
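ALR parameterizes the within-subject association through pairwise log odds ratios (the Alpha parameters in the output below). As a numerical sketch of what such a parameter measures — in Python rather than SAS, with hypothetical counts — the empirical log odds ratio of a 2×2 cross-classification of two measurement occasions is:

```python
from math import log

def log_odds_ratio(n11, n10, n01, n00):
    """Empirical log odds ratio of a 2x2 table of paired binary outcomes.

    n11 = success at both occasions, n00 = failure at both, etc.
    """
    return log((n11 * n00) / (n10 * n01))

# Hypothetical counts for one pair of measurement occasions:
print(round(log_odds_ratio(200, 40, 30, 25), 4))
```

A value of 0 corresponds to independence; LOGOR=EXCH constrains this quantity to be common over all pairs of occasions, while LOGOR=FULLCLUST gives each pair its own Alpha.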

12.4.1 Output

The GENMOD Procedure

Model Information

Data Set WORK.GSA


Distribution Binomial
Link Function Logit
Dependent Variable gsabin
Observations Used 1137
Probability Modeled Pr( gsabin = 1 )

Response Profile

Ordered Ordered
Level Value Count

1 0 206
2 1 931

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 1133 1064.1723 0.9393


Scaled Deviance 1133 1064.1723 0.9393
Pearson Chi-Square 1133 1136.8928 1.0034
Scaled Pearson X2 1133 1136.8928 1.0034
Log Likelihood -532.0862

Algorithm converged.

GEE Model Information

Log Odds Ratio Structure Exchangeable


Within-Subject Effect timecls (4 levels)
Subject Effect PATID (395 levels)
Number of Clusters 395
Correlation Matrix Dimension 4
Maximum Cluster Size 4
Minimum Cluster Size 1

Algorithm converged.

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept 2.9810 0.4621 2.0753 3.8866 6.45 <.0001


TIME -0.8689 0.3248 -1.5056 -0.2323 -2.67 0.0075
TIME*TIME 0.1830 0.0659 0.0539 0.3122 2.78 0.0055
pca0 -0.2352 0.0950 -0.4213 -0.0491 -2.48 0.0132
Alpha1 1.4307 0.2238 0.9921 1.8693 6.39 <.0001

• Code to fit the fully parameterized cluster structure
(a separate log odds ratio for each pair of occasions):
proc genmod data=gsa;
where weight ne .;
class patid timecls;
model gsabin = time|time pca0 / dist=b;
repeated subject=patid / within=timecls logor=fullclust;
ods listing exclude classlevels parminfo parameterestimates;
run;

• Output:
The GENMOD Procedure

Model Information

Data Set WORK.GSA


Distribution Binomial
Link Function Logit
Dependent Variable gsabin
Observations Used 1137
Probability Modeled Pr( gsabin = 1 )

Response Profile

Ordered Ordered
Level Value Count

1 0 206
2 1 931

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 1133 1064.1723 0.9393


Scaled Deviance 1133 1064.1723 0.9393
Pearson Chi-Square 1133 1136.8928 1.0034
Scaled Pearson X2 1133 1136.8928 1.0034
Log Likelihood -532.0862

Algorithm converged.

GEE Model Information

Log Odds Ratio Structure Fully Parameterized Clusters


Within-Subject Effect timecls (4 levels)
Subject Effect PATID (395 levels)
Number of Clusters 395
Correlation Matrix Dimension 4
Maximum Cluster Size 4
Minimum Cluster Size 1

Log Odds Ratio Parameter


Information

Parameter Group

Alpha1 (1, 2)
Alpha2 (1, 3)
Alpha3 (1, 4)
Alpha4 (2, 3)
Alpha5 (2, 4)
Alpha6 (3, 4)

Algorithm converged.

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept 2.9219 0.4583 2.0237 3.8201 6.38 <.0001


TIME -0.7980 0.3207 -1.4266 -0.1694 -2.49 0.0128
TIME*TIME 0.1683 0.0648 0.0412 0.2953 2.60 0.0094
pca0 -0.2359 0.0960 -0.4241 -0.0478 -2.46 0.0140
Alpha1 1.1280 0.3278 0.4856 1.7705 3.44 0.0006
Alpha2 1.5631 0.3865 0.8056 2.3206 4.04 <.0001
Alpha3 1.6035 0.4192 0.7819 2.4251 3.83 0.0001
Alpha4 1.1864 0.3680 0.4652 1.9077 3.22 0.0013
Alpha5 0.9265 0.4218 0.0997 1.7533 2.20 0.0281
Alpha6 2.4387 0.4805 1.4970 3.3805 5.08 <.0001
Chapter 13

Random-Effects Models

• In a pure random-effects model, one assumes that the
responses Yi1, . . . , Yini are conditionally independent
given an unobserved vector of random variables bi.
• The realized value of bi represents properties of the
given subject which vary randomly between subjects.
• E(Yij|bi) = µij
• η(µij) = xijᵀβ + zijᵀbi
• Var(Yij|bi) = φ v(µij)

266
CHAPTER 13. RANDOM-EFFECTS MODELS 267

13.1 The Marginal Likelihood

• To derive the marginal likelihood for this model, we
need to integrate over the assumed distribution of the
random-effects vector bi (which may be difficult unless
bi has low dimension).
• Thus, the distribution of the vector Y i of responses
from a single subject is given by

f(y i) = ∫ ∏(j=1..ni) f(yij|bi) g(bi) dbi

where g(.) denotes the joint p.d.f. of bi.


• If both the conditional distribution of Y i given bi and
the marginal distribution of bi are normal, the
integrations can be performed analytically, and the
resulting model is an example of the Laird and Ware
(1982) model.

13.2 Numerical Integration

13.2.1 Adaptive Gaussian Quadrature

• Quadrature:
– Select abscissas
– Construct weighted sum of function over abscissas
• Adaptive Quadrature:
– Typical for the random-effects distribution
– Integral centered at the EB estimate of bi
– Number of quadrature points selected as a function
of the desired accuracy
• Pinheiro and Bates (1995)
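As an illustrative sketch of the weighted-sum idea (plain fixed-grid integration in Python, not adaptive Gauss–Hermite, and with hypothetical β0 = 1, σ = 1), one can approximate the marginal success probability of a logistic random-intercept model:

```python
from math import exp, pi, sqrt

def expit(x):
    return 1.0 / (1.0 + exp(-x))

def norm_pdf(b, sd):
    return exp(-0.5 * (b / sd) ** 2) / (sd * sqrt(2.0 * pi))

def marginal_prob(beta0, sd, n=3201, half_width=8.0):
    """Approximate P(Y = 1) = integral of expit(beta0 + b) N(b; 0, sd^2) db
    by a weighted sum over a fixed grid of abscissas b_k with weights
    norm_pdf(b_k) * db (quadrature proper chooses abscissas and weights
    far more cleverly)."""
    lo = -half_width * sd
    db = 2.0 * half_width * sd / (n - 1)
    return sum(expit(beta0 + lo + k * db) * norm_pdf(lo + k * db, sd) * db
               for k in range(n))

p = marginal_prob(1.0, 1.0)
print(round(p, 3))  # attenuated toward 0.5 relative to expit(1.0) ≈ 0.731
```

Note that the marginal probability is pulled toward 0.5 relative to the conditional probability expit(β0): integrating over the random intercept attenuates the subject-specific effects.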

13.2.2 First Order Method

• For normal outcome density


• Conditional mean function is replaced by a first-order
Taylor series expansion
• For normal random effects, a closed form solution
then results
• Beal and Sheiner (1982, 1988)
• Sheiner and Beal (1985)

13.3 Estimation Methods

• In general, hierarchical generalized linear models are
obtained (Lee and Nelder 1996).
• A useful sub-class of models is obtained by assuming
that bi is normally distributed.
• Modern developments in Monte Carlo inference make
this class relatively tractable for higher dimensional bi.
• A useful tool is the EM algorithm
• Approximate inference methods have been developed:
– Pseudolikelihood: Wolfinger and O’Connell (1995)
– Laplace transform based: Breslow and Clayton
(1993)
• We study two approaches:
– The beta-binomial model: random effect on the
probability scale
– Generalized linear mixed models: random effects in
the linear predictor

13.4 Software

• SAS macro GLIMMIX


• SAS PROC NLMIXED
• MIXOR
• MLwiN
• WinBUGS
• ...

13.5 The Beta-binomial Model

• Useful for clustered data, e.g., NTP data, with
covariates at cluster level only.
Cluster i consists of:
– ni littermates
– zi of which are successes
– the dose level is di
• Assume each cluster has a random success probability
πi.

• Then, the intracluster correlation is assumed to arise
from natural heterogeneity in the success probability
across litters.
• In contrast, marginal models specify marginal means
and association separately.
• Still, the beta-binomial model is often considered to
be a marginal model.

• Building blocks:
The binomial part: conditional on the success
probability πi in cluster i, the responses
Y i1, . . . , Yini are independent with common
probability πi.
The beta part: the πi are drawn from a beta
distribution with mean π and variance δπ(1 − π)
• The marginal distribution of Zi is then beta-binomial
with
f(zi | πi, ρ) = B(πi(ρ⁻¹ − 1) + zi, (1 − πi)(ρ⁻¹ − 1) + (ni − zi))
              / B(πi(ρ⁻¹ − 1), (1 − πi)(ρ⁻¹ − 1))
where B(., .) denotes the beta function.
• The moments are:
– E(Zi) = niπi
– Var(Zi) = niπi(1 − πi)[1 + (ni − 1)δ]
Williams (1975)
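The density and the Williams moments can be checked numerically. The sketch below (Python, with hypothetical n, π, and δ = ρ) writes out the binomial coefficient that is implicit in the displayed density:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(z, n, p, rho):
    """Beta-binomial pmf with mean p and intracluster correlation rho."""
    a = p * (1.0 / rho - 1.0)
    b = (1.0 - p) * (1.0 / rho - 1.0)
    return comb(n, z) * exp(log_beta(a + z, b + n - z) - log_beta(a, b))

n, p, rho = 12, 0.3, 0.2          # hypothetical litter size, mean, correlation
probs = [beta_binomial_pmf(z, n, p, rho) for z in range(n + 1)]
mean = sum(z * pz for z, pz in enumerate(probs))
var = sum((z - mean) ** 2 * pz for z, pz in enumerate(probs))
# Williams moments: E(Z) = n*p and Var(Z) = n*p*(1-p)*[1 + (n-1)*rho]
print(round(mean, 4), round(var, 4))
```

Here δ coincides with ρ: the factor 1 + (n − 1)ρ is the variance inflation relative to the ordinary binomial.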

13.5.1 The NTP Data

Maximum Likelihood Estimates (Standard Errors) for the Beta-Binomial Model.

Outcome Parameter DEHP EG DYME


External β0 -4.91(0.42) -5.32(0.71) -7.27(0.74)
βd 5.20(0.59) 2.78(0.81) 8.01(0.82)
βa 0.21(0.09) 0.28(0.14) 0.21(0.12)
Visceral β0 -4.38(0.36) -7.45(1.17) -6.21(0.83)
βd 4.42(0.54) 4.33(1.26) 4.94(0.90)
βa 0.22(0.09) 0.04(0.09) 0.45(0.21)
Skeletal β0 -4.88(0.44) -2.89(0.27) -5.15(0.47)
βd 4.92(0.63) 3.42(0.40) 6.99(0.71)
βa 0.27(0.11) 0.54(0.09) 0.61(0.14)
Collapsed β0 -3.83(0.31) -2.51(0.09) -5.42(0.45)
βd 5.59(0.56) 3.05(0.17) 8.29(0.79)
βa 0.32(0.10) 0.28(0.02) 0.33(0.10)

13.5.2 Discussion

• Parameters have the same interpretation for:
– Bahadur (MLE)
– Bahadur (GEE2)
– Beta-binomial
(A different interpretation applies for the conditional model.)

• Restrictions on the parameter space are
– most severe for Bahadur (MLE)
– intermediate for GEE
– least severe for the beta-binomial (all positive
correlations allowed)
This is reflected in the estimates for βa
• Fitting the beta-binomial model is relatively easy

13.6 Generalized Linear Mixed Models

• An extension of GEE1–Alternative 2.
• Random effects are included in the model.

Write
y i = µi + εi
with
η i = g(µi),
η i = Xiβ + Zibi,
Var(y i|bi) = Σi.
Here,

• η i is a vector of linear predictors, given random


effects,
• g(.) is the (vector) link function,
• bi satisfies the following moment assumptions:
– E(bi) = 0,
– Cov(bi) = G.
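A minimal numerical sketch of this specification (Python, hypothetical parameter values) for a random-intercept logistic model, i.e. zij = 1 and g⁻¹ the inverse logit:

```python
from math import exp

def inv_logit(eta):
    return 1.0 / (1.0 + exp(-eta))

# Hypothetical random-intercept logit model: eta_ij = beta0 + beta1*x_ij + b_i
beta0, beta1 = -1.0, 2.0
b_i = 0.5                      # realized random intercept for subject i
x = [0.0, 0.25, 0.5, 1.0]      # within-subject covariate values

eta = [beta0 + beta1 * xij + b_i for xij in x]   # eta_i = X_i beta + Z_i b_i
mu = [inv_logit(e) for e in eta]                 # conditional means E(Y_ij | b_i)
print([round(m, 3) for m in mu])
```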

13.6.1 Quasi-likelihood Function

Recall: for the univariate exponential family:


 
f(y|θi, φ) = exp{ φ⁻¹ [yθi − ψ(θi)] + c(y, φ) }

with θi the natural parameter and ψ(.) a function
satisfying

• µi = ψ′(θi)
• v(µi) = ψ″(θi)

The quasi-likelihood function:

Q(µi, yi) = [yiθi − ψ(θi)] / φ

∂Qi/∂µi = (yi − µi) / [φ v(µi)]

This framework extends beyond exponential families.

One only needs:

• the mean µi
• the mean function θ(µi)
• the variance function v(µi)
• the scale parameter φ
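For instance, for a Bernoulli response one only needs µ, v(µ) = µ(1 − µ) and φ = 1; the quasi-score then coincides with the derivative of the true loglikelihood, as this Python check sketches:

```python
from math import log

def bernoulli_loglik(y, mu):
    return y * log(mu) + (1 - y) * log(1 - mu)

def quasi_score(y, mu, phi=1.0):
    """dQ/dmu = (y - mu) / (phi * v(mu)) with v(mu) = mu * (1 - mu)."""
    return (y - mu) / (phi * mu * (1.0 - mu))

# Compare the quasi-score with a numerical derivative of the loglikelihood:
y, mu, h = 1, 0.3, 1e-6
numeric = (bernoulli_loglik(y, mu + h) - bernoulli_loglik(y, mu - h)) / (2 * h)
print(round(quasi_score(y, mu), 6), round(numeric, 6))
```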

13.6.2 Quasi-likelihood For Generalized Linear Mixed Models

• The previous quasi-likelihood, in vectorized form, is
used for y i|bi: the conditional quasi-likelihood.
• The quasi-likelihood for bi follows from the kernel of
the normal loglikelihood:

−(1/2) biᵀ G⁻¹ bi

• The joint quasi-likelihood becomes:

Q(µ, b|y) = (yᵀ Σ⁻¹ θ − 1ᵀ Σ⁻¹ ψ) − (1/2) bᵀ G⁻¹ b

where all quantities have been vectorized or replaced
by block-diagonal matrices.

13.6.3 Estimation Algorithm

1. Obtain an initial estimate µ̂i.

2. Compute the pseudo data
y*i = η̂i + (y i − µ̂i) Di⁻¹.

3. Fit a weighted linear mixed model with
• Data y*i
• Covariate matrices Xi and Zi
• Weight matrix Pi = Ai⁻¹ Di⁻²

This yields Σ̂*i and Ĝ*.

4. Estimate φ by

φ̂ = (1/N*) Σ(i=1..N) r iᵀ V̂i⁻¹ r i,

where
• N* is N for ML and N − p for REML,
• Vi = Pi^(−1/2) Σ*i Pi^(−1/2) + Zi G* Ziᵀ
• r i = y*i − Xi (Σ(i=1..N) Xiᵀ Vi⁻¹ Xi)⁻¹ Xiᵀ Vi⁻¹ y*i

5. Determine Σ̂i = φ̂ Σ̂*i and Ĝ = φ̂ Ĝ*


6. Solve the mixed model equations.
They will be written in vectorized form, by stacking y*i, Xi, and
bi, and constructing block-diagonal matrices W, Z, and G*:

[ XᵀWX   XᵀWZ          ] [ β ]   [ XᵀWy* ]
[ ZᵀWX   ZᵀWZ + G*⁻¹   ] [ b ] = [ ZᵀWy* ]

where
Wi = Di Σi⁻¹ Di,
Di = ∂µi/∂ηi,
Σi = Var(εi),
and W, D, and Σ are block-diagonal matrices built
from Wi, Di, and Σi respectively.
The estimates are:
β̂ = (Xᵀ V̂⁻¹ X)⁻¹ Xᵀ V̂⁻¹ y*,
b̂ = Ĝ* Zᵀ V̂⁻¹ r̂.

7. Compute
µ̂i = g⁻¹(Xiβ̂ + Zib̂i).
8. Iterate until convergence.
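Step 2 above can be sketched numerically for the logit link, where Di = ∂µi/∂ηi = µ(1 − µ), so the working response is y* = η̂ + (y − µ̂)/[µ̂(1 − µ̂)] (Python sketch, hypothetical responses and linear predictors):

```python
from math import exp

def inv_logit(eta):
    return 1.0 / (1.0 + exp(-eta))

def pseudo_data(y, eta_hat):
    """Working response y* = eta_hat + (y - mu_hat) / D for the logit link,
    with D = d mu / d eta = mu * (1 - mu)."""
    mu = inv_logit(eta_hat)
    return eta_hat + (y - mu) / (mu * (1.0 - mu))

# Hypothetical binary responses and current linear predictors:
ys = [1, 0, 1]
etas = [0.2, -0.4, 1.0]
print([round(pseudo_data(y, e), 4) for y, e in zip(ys, etas)])
```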

13.7 Linear Mixed Model Using GLIMMIX

For continuous outcome birthweight in NTP:

• linear mixed model with random intercept (PROC


MIXED);
• equivalent GLIMMIX coding

data help;
set m.dehp2;
dose=dose/292;
collaps = ((visceral-1) or (skeletal-1) or (external-1));
if visceral=. then delete;
skeletal=skeletal-1;
visceral=visceral-1;
external=external-1;
run;

proc mixed data=help method=reml;


title ’PROC MIXED, dehp2, weight, random intercept’;
class litter;
id litter dose;
model weight=dose / solution predmeans predicted;
make ’Predicted’ out=m.predwt noprint;
make ’PredMeans’ out=m.prmwt noprint;
make ’SolutionR’ out=m.solrwt noprint;
random intercept / subject=litter solution;
run;

%include ’c:\sas\stat\sample\glimmix.sas’;

%glimmix(
data=help,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model weight=dose / solution predmeans;
random intercept / subject=litter solution;),
error=normal,
link=identity,
title=’GLIMMIX, dehp2, weight, random intercept’,
options=mixprintlast
);

proc print data=m.prmwt;


title ’Predicted Means’;
var dose litter _pred_;
proc print data=m.predwt;
title ’Predicted Values’;
var dose litter _pred_;
run;

13.7.1 Selected Output

Output from PROC MIXED call

PROC MIXED, dehp2, weight, random intercept

The MIXED Procedure

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate

INTERCEPT LITTER 0.00591700


Residual 0.00714764

Model Fitting Information for WEIGHT

Description Value

Observations 1082.000
Res Log Likelihood 1014.980
Akaike’s Information Criterion 1012.980
Schwarz’s Bayesian Criterion 1007.995
-2 Res Log Likelihood -2029.96

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT 0.96651850 0.01105760 106 87.41 0.0001


DOSE -0.13117707 0.02680668 974 -4.89 0.0001

Solution for Random Effects

Effect LITTER Estimate SE Pred DF t Pr > |t|

INTERCEPT 38 -0.12859215 0.02820009 974 -4.56 0.0001


INTERCEPT 39 -0.02065678 0.02642638 974 -0.78 0.4346
INTERCEPT 40 -0.06864851 0.03027538 974 -2.27 0.0236
INTERCEPT 49 -0.01571926 0.02336063 974 -0.67 0.5012
INTERCEPT 50 0.02420205 0.02947096 974 0.82 0.4117

...
INTERCEPT 203 0.02616233 0.02521825 974 1.04 0.2998

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 974 23.95 0.0001

Predicted Means

LITTER DOSE WEIGHT Predicted SE Pred L95 U95 Residual

38 0 0.8310 0.9665 0.0111 0.9448 0.9882 -0.1355


38 0 0.6800 0.9665 0.0111 0.9448 0.9882 -0.2865
38 0 0.9260 0.9665 0.0111 0.9448 0.9882 -0.0405
38 0 0.8800 0.9665 0.0111 0.9448 0.9882 -0.0865
38 0 0.8840 0.9665 0.0111 0.9448 0.9882 -0.0825
38 0 0.8460 0.9665 0.0111 0.9448 0.9882 -0.1205
38 0 0.8170 0.9665 0.0111 0.9448 0.9882 -0.1495
38 0 0.8400 0.9665 0.0111 0.9448 0.9882 -0.1265
38 0 0.6820 0.9665 0.0111 0.9448 0.9882 -0.2845
49 0 0.9890 0.9665 0.0111 0.9448 0.9882 0.0225
49 0 0.8980 0.9665 0.0111 0.9448 0.9882 -0.0685
49 0 0.9450 0.9665 0.0111 0.9448 0.9882 -0.0215
49 0 0.8990 0.9665 0.0111 0.9448 0.9882 -0.0675
49 0 0.9330 0.9665 0.0111 0.9448 0.9882 -0.0335
49 0 0.8420 0.9665 0.0111 0.9448 0.9882 -0.1245
49 0 0.8960 0.9665 0.0111 0.9448 0.9882 -0.0705
49 0 1.0060 0.9665 0.0111 0.9448 0.9882 0.0395
49 0 1.1150 0.9665 0.0111 0.9448 0.9882 0.1485
49 0 1.0070 0.9665 0.0111 0.9448 0.9882 0.0405
49 0 0.9580 0.9665 0.0111 0.9448 0.9882 -0.0085
49 0 0.9990 0.9665 0.0111 0.9448 0.9882 0.0325
49 0 0.9090 0.9665 0.0111 0.9448 0.9882 -0.0575
49 0 0.8480 0.9665 0.0111 0.9448 0.9882 -0.1185
49 0 0.9990 0.9665 0.0111 0.9448 0.9882 0.0325
...
39 0.15 0.9610 0.9468 0.0087 0.9296 0.9639 0.0142
39 0.15 0.9210 0.9468 0.0087 0.9296 0.9639 -0.0258
...
60 0.15 0.9560 0.9468 0.0087 0.9296 0.9639 0.0092
60 0.15 0.6240 0.9468 0.0087 0.9296 0.9639 -0.3228
...
40 0.31 0.8600 0.9256 0.0079 0.9101 0.9412 -0.0656

40 0.31 0.5590 0.9256 0.0079 0.9101 0.9412 -0.3666


...
52 0.31 1.0350 0.9256 0.0079 0.9101 0.9412 0.1094
52 0.31 0.9490 0.9256 0.0079 0.9101 0.9412 0.0234
...
53 0.65 0.7510 0.8807 0.0126 0.8560 0.9054 -0.1297
53 0.65 0.9020 0.8807 0.0126 0.8560 0.9054 0.0213
...
53 0.65 0.8750 0.8807 0.0126 0.8560 0.9054 -0.0057
53 0.65 0.9640 0.8807 0.0126 0.8560 0.9054 0.0833
...
70 0.65 0.8210 0.8807 0.0126 0.8560 0.9054 -0.0597
70 0.65 0.9870 0.8807 0.0126 0.8560 0.9054 0.1063
...
57 1 0.8890 0.8353 0.0207 0.7948 0.8759 0.0537
57 1 0.8940 0.8353 0.0207 0.7948 0.8759 0.0587
...
66 1 0.7060 0.8353 0.0207 0.7948 0.8759 -0.1293
66 1 0.7980 0.8353 0.0207 0.7948 0.8759 -0.0373
...

Predicted Values

LITTER DOSE WEIGHT Predicted SE Pred L95 U95 Residual

38 0 0.8310 0.8379 0.0265 0.7859 0.8899 -0.0069


38 0 0.6800 0.8379 0.0265 0.7859 0.8899 -0.1579
38 0 0.9260 0.8379 0.0265 0.7859 0.8899 0.0881
38 0 0.8800 0.8379 0.0265 0.7859 0.8899 0.0421
38 0 0.8840 0.8379 0.0265 0.7859 0.8899 0.0461
38 0 0.8460 0.8379 0.0265 0.7859 0.8899 0.0081
38 0 0.8170 0.8379 0.0265 0.7859 0.8899 -0.0209
38 0 0.8400 0.8379 0.0265 0.7859 0.8899 0.0021
38 0 0.6820 0.8379 0.0265 0.7859 0.8899 -0.1559
49 0 0.9890 0.9508 0.0210 0.9096 0.9920 0.0382
49 0 0.8980 0.9508 0.0210 0.9096 0.9920 -0.0528
49 0 0.9450 0.9508 0.0210 0.9096 0.9920 -0.0058
49 0 0.8990 0.9508 0.0210 0.9096 0.9920 -0.0518
49 0 0.9330 0.9508 0.0210 0.9096 0.9920 -0.0178
49 0 0.8420 0.9508 0.0210 0.9096 0.9920 -0.1088
49 0 0.8960 0.9508 0.0210 0.9096 0.9920 -0.0548
49 0 1.0060 0.9508 0.0210 0.9096 0.9920 0.0552
49 0 1.1150 0.9508 0.0210 0.9096 0.9920 0.1642
49 0 1.0070 0.9508 0.0210 0.9096 0.9920 0.0562
49 0 0.9580 0.9508 0.0210 0.9096 0.9920 0.0072
49 0 0.9990 0.9508 0.0210 0.9096 0.9920 0.0482
49 0 0.9090 0.9508 0.0210 0.9096 0.9920 -0.0418
49 0 0.8480 0.9508 0.0210 0.9096 0.9920 -0.1028
49 0 0.9990 0.9508 0.0210 0.9096 0.9920 0.0482
...
39 0.15 0.9610 0.9261 0.0253 0.8765 0.9757 0.0349
39 0.15 0.9210 0.9261 0.0253 0.8765 0.9757 -0.0051
...
60 0.15 0.9560 0.9259 0.0217 0.8834 0.9685 0.0301
60 0.15 0.6240 0.9259 0.0217 0.8834 0.9685 -0.3019
...
40 0.31 0.8600 0.8570 0.0295 0.7990 0.9149 0.0030
40 0.31 0.5590 0.8570 0.0295 0.7990 0.9149 -0.2980
...
52 0.31 1.0350 0.9257 0.0233 0.8800 0.9713 0.1093
52 0.31 0.9490 0.9257 0.0233 0.8800 0.9713 0.0233
...
53 0.65 0.7510 0.9122 0.0233 0.8665 0.9579 -0.1612
53 0.65 0.9020 0.9122 0.0233 0.8665 0.9579 -0.0102
...
70 0.65 0.8210 0.8431 0.0279 0.7883 0.8978 -0.0221

70 0.65 0.9870 0.8431 0.0279 0.7883 0.8978 0.1439


...
57 1 0.8890 0.7983 0.0266 0.7462 0.8505 0.0907
57 1 0.8940 0.7983 0.0266 0.7462 0.8505 0.0957
...
66 1 0.7060 0.7769 0.0342 0.7099 0.8440 -0.0709
66 1 0.7980 0.7769 0.0342 0.7099 0.8440 0.0211
...

Output from GLIMMIX call

GLIMMIX, dehp2, weight, random intercept

The MIXED Procedure

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate

INTERCEPT LITTER 0.00591700


Residual 0.00714764

Model Fitting Information for _Z


Weighted by _W

Description Value

Observations 1082.000
Res Log Likelihood 1014.980
Akaike’s Information Criterion 1012.980
Schwarz’s Bayesian Criterion 1007.995
-2 Res Log Likelihood -2029.96

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT 0.96651850 0.01105760 106 87.41 0.0001


DOSE -0.13117707 0.02680668 974 -4.89 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 974 23.95 0.0001



Covariance Parameter Estimates (MIVQUE0)

Cov Parm Subject Estimate

INTERCEPT LITTER 0.00552982


Residual 0.00726150

Model Fitting Information for _Z


Weighted by _W

Description Value

Observations 1082.000
Res Log Likelihood 1014.834
Akaike’s Information Criterion 1012.834
Schwarz’s Bayesian Criterion 1007.849
-2 Res Log Likelihood -2029.67

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT 0.96648041 0.01074208 106 89.97 0.0001


DOSE -0.13088310 0.02608864 974 -5.02 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 974 25.17 0.0001



GLIMMIX, dehp2, weight, random intercept

Covariance Parameter Estimates

Cov Parm Estimate

INTERCEPT 0.00552982

GLIMMIX Model Statistics

Extra-Dispersion Scale 0.0073

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT 0.9665 0.0107 106 89.97 0.0001


DOSE -0.1309 0.0261 974 -5.02 0.0001

Random Effects Estimates

Effect LITTER Estimate SE Pred DF t Pr > |t|

INTERCEPT 38 -0.1272 0.0281 974 -4.52 0.0001


INTERCEPT 39 -0.0205 0.0264 974 -0.77 0.4386
INTERCEPT 40 -0.0678 0.0303 974 -2.24 0.0252
INTERCEPT 49 -0.0156 0.0233 974 -0.67 0.5037
INTERCEPT 50 0.0240 0.0294 974 0.81 0.4154
...
INTERCEPT 203 0.0258 0.0251 974 1.03 0.3042

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 974 25.17 0.0001



Predicted Means

OBS DOSE LITTER _PRED_

1 0 38 0.9665
10 0 49 0.9665

331 0.15068 39 0.9468


341 0.15068 60 0.9468

619 0.31164 40 0.9257


626 0.31164 52 0.9257

1033 1.00000 57 0.8356


1042 1.00000 66 0.8356

13.7.2 Discussion

Let us combine

• dose level
• predicted mean
• random intercept
• predicted value

LITTER DOSE MEAN R.INT. PRED.

38.0000 0.0000 0.9665 -0.1286 0.8379


49.0000 0.0000 0.9665 -0.0157 0.9508
50.0000 0.0000 0.9665 0.0242 0.9907
61.0000 0.0000 0.9665 -0.0391 0.9274
62.0000 0.0000 0.9665 -0.1608 0.8057
73.0000 0.0000 0.9665 -0.0963 0.8703
85.0000 0.0000 0.9665 -0.0603 0.9062
86.0000 0.0000 0.9665 -0.0083 0.9582
97.0000 0.0000 0.9665 -0.0370 0.9295
98.0000 0.0000 0.9665 0.0518 1.0183
109.0000 0.0000 0.9665 0.0041 0.9706
110.0000 0.0000 0.9665 0.1109 1.0775
119.0000 0.0000 0.9665 -0.0717 0.8948
120.0000 0.0000 0.9665 0.0694 1.0359
129.0000 0.0000 0.9665 -0.0575 0.9090
130.0000 0.0000 0.9665 -0.0003 0.9663
139.0000 0.0000 0.9665 0.0704 1.0369
140.0000 0.0000 0.9665 0.0366 1.0031
149.0000 0.0000 0.9665 0.0001 0.9666
150.0000 0.0000 0.9665 0.0532 1.0197
159.0000 0.0000 0.9665 -0.0094 0.9571
160.0000 0.0000 0.9665 -0.0940 0.8725
169.0000 0.0000 0.9665 0.0010 0.9675
170.0000 0.0000 0.9665 -0.0828 0.8837

179.0000 0.0000 0.9665 -0.1467 0.8199


180.0000 0.0000 0.9665 0.0786 1.0451
189.0000 0.0000 0.9665 -0.0453 0.9212
190.0000 0.0000 0.9665 0.0428 1.0094
199.0000 0.0000 0.9665 -0.0407 0.9258
200.0000 0.0000 0.9665 0.0172 0.9837
39.0000 0.1507 0.9468 -0.0207 0.9261
60.0000 0.1507 0.9468 -0.0208 0.9259
63.0000 0.1507 0.9468 0.0376 0.9843
72.0000 0.1507 0.9468 -0.0589 0.8879
75.0000 0.1507 0.9468 -0.0873 0.8595
84.0000 0.1507 0.9468 0.0570 1.0037
87.0000 0.1507 0.9468 0.0515 0.9982
96.0000 0.1507 0.9468 -0.0296 0.9172
99.0000 0.1507 0.9468 0.0330 0.9797
108.0000 0.1507 0.9468 0.1777 1.1244
118.0000 0.1507 0.9468 -0.0960 0.8507
121.0000 0.1507 0.9468 -0.0287 0.9181
128.0000 0.1507 0.9468 -0.0060 0.9408
131.0000 0.1507 0.9468 0.0903 1.0371
138.0000 0.1507 0.9468 -0.0405 0.9063
141.0000 0.1507 0.9468 0.0231 0.9698
148.0000 0.1507 0.9468 -0.0264 0.9203
151.0000 0.1507 0.9468 0.0170 0.9638
158.0000 0.1507 0.9468 0.0549 1.0016
161.0000 0.1507 0.9468 0.0053 0.9520
171.0000 0.1507 0.9468 0.1582 1.1050
181.0000 0.1507 0.9468 0.0845 1.0313
188.0000 0.1507 0.9468 0.0293 0.9761
191.0000 0.1507 0.9468 0.0837 1.0304
198.0000 0.1507 0.9468 -0.0184 0.9284
201.0000 0.1507 0.9468 0.0104 0.9572
40.0000 0.3116 0.9256 -0.0686 0.8570
52.0000 0.3116 0.9256 0.0000 0.9257
64.0000 0.3116 0.9256 -0.0533 0.8724
71.0000 0.3116 0.9256 -0.0239 0.9018
76.0000 0.3116 0.9256 -0.0002 0.9254
83.0000 0.3116 0.9256 0.0861 1.0117
88.0000 0.3116 0.9256 -0.1606 0.7650
100.0000 0.3116 0.9256 -0.1162 0.8094
107.0000 0.3116 0.9256 -0.0496 0.8760
112.0000 0.3116 0.9256 -0.0135 0.9121
117.0000 0.3116 0.9256 -0.0415 0.8841
122.0000 0.3116 0.9256 -0.0326 0.8930
127.0000 0.3116 0.9256 -0.0242 0.9015

137.0000 0.3116 0.9256 0.0210 0.9467


142.0000 0.3116 0.9256 0.0043 0.9299
147.0000 0.3116 0.9256 0.0126 0.9383
152.0000 0.3116 0.9256 -0.0416 0.8841
157.0000 0.3116 0.9256 -0.0043 0.9213
162.0000 0.3116 0.9256 0.0523 0.9780
167.0000 0.3116 0.9256 0.1476 1.0732
172.0000 0.3116 0.9256 0.0184 0.9441
177.0000 0.3116 0.9256 -0.0386 0.8871
182.0000 0.3116 0.9256 0.0799 1.0056
187.0000 0.3116 0.9256 0.0070 0.9326
192.0000 0.3116 0.9256 0.2110 1.1367
197.0000 0.3116 0.9256 0.0494 0.9750
53.0000 0.6541 0.8807 0.0315 0.9122
70.0000 0.6541 0.8807 -0.0377 0.8431
82.0000 0.6541 0.8807 -0.0335 0.8472
89.0000 0.6541 0.8807 -0.1496 0.7311
94.0000 0.6541 0.8807 -0.0622 0.8185
101.0000 0.6541 0.8807 -0.0763 0.8044
106.0000 0.6541 0.8807 0.0702 0.9509
126.0000 0.6541 0.8807 0.0938 0.9745
136.0000 0.6541 0.8807 0.1445 1.0252
153.0000 0.6541 0.8807 0.0522 0.9330
163.0000 0.6541 0.8807 0.0734 0.9542
166.0000 0.6541 0.8807 0.0453 0.9260
173.0000 0.6541 0.8807 0.1232 1.0039
186.0000 0.6541 0.8807 -0.0381 0.8426
193.0000 0.6541 0.8807 -0.0065 0.8742
196.0000 0.6541 0.8807 0.0668 0.9475
203.0000 0.6541 0.8807 0.0262 0.9069
57.0000 1.0000 0.8353 -0.0370 0.7983
66.0000 1.0000 0.8353 -0.0584 0.7769
78.0000 1.0000 0.8353 -0.0890 0.7464
90.0000 1.0000 0.8353 -0.1149 0.7204
93.0000 1.0000 0.8353 -0.0192 0.8161
105.0000 1.0000 0.8353 -0.0376 0.7977
124.0000 1.0000 0.8353 0.0358 0.8711
154.0000 1.0000 0.8353 0.0907 0.9260
184.0000 1.0000 0.8353 -0.0606 0.7747
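On this (identity-link) linear scale the last column is simply the sum of the two preceding ones: predicted value = predicted mean + EB random intercept. A quick Python check against the litter 38 and litter 57 rows above:

```python
# (predicted mean, EB random intercept, predicted value), from the table above
rows = [
    (0.9665, -0.1286, 0.8379),   # litter 38, dose 0
    (0.8353, -0.0370, 0.7983),   # litter 57, dose 1
]
for mean, b_hat, pred in rows:
    assert round(mean + b_hat, 4) == pred
print("predicted value = predicted mean + random intercept")
```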

Remarks

• mean of random intercepts is zero


• mean of average weights over litters is 0.9275
• mean of predicted weights over litters is 0.9275
• consider histogram of random intercepts

13.7.3 GLIMMIX Code

We will now apply the same procedure to the binary
outcome VISCERAL.

%include ’c:\sas\stat\sample\glimmix.sas’;

%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model visceral=dose / solution predmeans predicted;
random intercept / subject=litter solution;
repeated / subject=litter type=simple;
),
error=binomial,
link=logit,
maxit=100,
options=mixprintlast,
title=’GLIMMIX, dehp3, visceral, random intercept and SIMPLE’
);

The following syntax DOES NOT WORK PROPERLY:

%glimmix(
data=m.dehp3,
procopt=method=reml,
stmts=%str(
class litter;
id litter dose;
model visceral=dose / solution predmeans predicted;
make ’Predicted’ out=m.predvisc noprint; <------------
make ’PredMeans’ out=m.prmvisc noprint;
make ’SolutionR’ out=m.solrvisc noprint;
random intercept / subject=litter solution;
repeated / subject=litter type=simple;
),
error=binomial,
link=logit,
maxit=100,
options=mixprintlast,
title=’GLIMMIX, dehp3, visceral, random intercept and SIMPLE’
);

• Predicted values are used internally by GLIMMIX


• Datasets can be obtained by copying from
SASWORKS

13.7.4 Output From GLIMMIX

GLIMMIX, dehp3, visceral, random intercept and SIMPLE

The MIXED Procedure

Covariance Parameter Estimates (REML)

Cov Parm Subject Estimate

INTERCEPT LITTER 2.73470639


DIAG LITTER 0.34412800

Model Fitting Information for _Z


Weighted by _W

Description Value

Observations 1082.000
Res Log Likelihood -3355.59
Akaike’s Information Criterion -3357.59
Schwarz’s Bayesian Criterion -3362.57
-2 Res Log Likelihood 6711.179

Solution for Fixed Effects

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -5.60186511 0.37338672 106 -15.00 0.0001


DOSE 5.78288569 0.70050956 974 8.26 0.0001

Tests of Fixed Effects

Source NDF DDF Type III F Pr > F

DOSE 1 974 68.15 0.0001



Predicted Means

VISCERAL LITTER DOSE _Z Predicted

0 38 0 -6.8176 -5.6019
0 49 0 -6.9235 -5.6019

0 39 0.17 -6.1193 -4.6381


0 60 0.17 -6.2357 -4.6381

0 40 0.33 -5.3822 -3.6742


0 52 0.33 -3.6386 -3.6742

0 57 1 -2.4654 0.1810
1 66 1 2.0691 0.1810

GLIMMIX, dehp3, visceral, random intercept and SIMPLE

Covariance Parameter Estimates

Cov Parm Estimate

INTERCEPT 2.73470639
DIAG 0.34412800

GLIMMIX Model Statistics

Description Value

Deviance 282.3032
Scaled Deviance 282.3032
Pearson Chi-Square 353.0951
Scaled Pearson Chi-Square 353.0951
Extra-Dispersion Scale 1.0000

Parameter Estimates

Effect Estimate Std Error DF t Pr > |t|

INTERCEPT -5.6019 0.3734 106 -15.00 0.0001


DOSE 5.7829 0.7005 974 8.26 0.0001

Random Effects Estimates

Effect LITTER Estimate SE Pred DF t Pr > |t|

INTERCEPT 38 -0.2128 1.5035 974 -0.14 0.8875


INTERCEPT 39 -0.4752 1.3660 974 -0.35 0.7280
INTERCEPT 40 -0.6953 1.2769 974 -0.54 0.5862
INTERCEPT 49 -0.3190 1.4432 974 -0.22 0.8251
INTERCEPT 50 -0.1929 1.5157 974 -0.13 0.8988
INTERCEPT 203 0.0390 0.5066 974 0.08 0.9387

13.7.5 Discussion

Let us combine

• dose level
• predicted mean
• random intercept
• predicted value

In this case, the probability scale is also relevant:

• Conversion of predicted means:

P(Yij = 1|Xi, β̂) = exp(Xiβ̂) / [1 + exp(Xiβ̂)]

• Conversion of predicted values:

P(Yij = 1|Xi, Zi, β̂, b̂i) = exp(Xiβ̂ + Zib̂i) / [1 + exp(Xiβ̂ + Zib̂i)]
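For example (a Python sketch), the litter 38 entries of the table that follows, with linear-scale mean −5.602 and predicted value −5.815, convert as:

```python
from math import exp

def inv_logit(eta):
    """P(Y_ij = 1) for a given value of the linear predictor."""
    return exp(eta) / (1.0 + exp(eta))

print(round(inv_logit(-5.602), 3))   # MEAN(P) for litter 38
print(round(inv_logit(-5.815), 3))   # PRED(P) for litter 38
```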

linear scale probability scale


------------ -----------------
LITTER DOSE MEAN R.INT. PRED MEAN(P) R.E.(P) PRED(P)

38.000 0.000 -5.602 -0.213 -5.815 0.004 -0.001 0.003


49.000 0.000 -5.602 -0.319 -5.921 0.004 -0.001 0.003
50.000 0.000 -5.602 -0.193 -5.795 0.004 -0.001 0.003
61.000 0.000 -5.602 -0.151 -5.753 0.004 -0.001 0.003
62.000 0.000 -5.602 -0.172 -5.774 0.004 -0.001 0.003
73.000 0.000 -5.602 -0.213 -5.815 0.004 -0.001 0.003
85.000 0.000 -5.602 -0.335 -5.937 0.004 -0.001 0.003
86.000 0.000 -5.602 2.749 -2.853 0.004 0.051 0.055
97.000 0.000 -5.602 2.496 -3.106 0.004 0.039 0.043
98.000 0.000 -5.602 -0.286 -5.888 0.004 -0.001 0.003
109.000 0.000 -5.602 -0.172 -5.774 0.004 -0.001 0.003
110.000 0.000 -5.602 -0.286 -5.888 0.004 -0.001 0.003
119.000 0.000 -5.602 -0.319 -5.921 0.004 -0.001 0.003
120.000 0.000 -5.602 -0.213 -5.815 0.004 -0.001 0.003
129.000 0.000 -5.602 -0.286 -5.888 0.004 -0.001 0.003
130.000 0.000 -5.602 -0.286 -5.888 0.004 -0.001 0.003
139.000 0.000 -5.602 3.684 -1.918 0.004 0.124 0.128
140.000 0.000 -5.602 3.106 -2.496 0.004 0.072 0.076
149.000 0.000 -5.602 -0.268 -5.870 0.004 -0.001 0.003
150.000 0.000 -5.602 -0.335 -5.937 0.004 -0.001 0.003
159.000 0.000 -5.602 -0.268 -5.870 0.004 -0.001 0.003
160.000 0.000 -5.602 -0.268 -5.870 0.004 -0.001 0.003
169.000 0.000 -5.602 -0.286 -5.888 0.004 -0.001 0.003
170.000 0.000 -5.602 -0.250 -5.853 0.004 -0.001 0.003
179.000 0.000 -5.602 -0.303 -5.905 0.004 -0.001 0.003
180.000 0.000 -5.602 -0.268 -5.870 0.004 -0.001 0.003
189.000 0.000 -5.602 -0.081 -5.683 0.004 -0.000 0.003
190.000 0.000 -5.602 -0.213 -5.815 0.004 -0.001 0.003
199.000 0.000 -5.602 -0.193 -5.795 0.004 -0.001 0.003
200.000 0.000 -5.602 -0.172 -5.774 0.004 -0.001 0.003
39.000 0.167 -4.638 -0.475 -5.113 0.010 -0.004 0.006
60.000 0.167 -4.638 -0.592 -5.231 0.010 -0.004 0.005
63.000 0.167 -4.638 -0.537 -5.175 0.010 -0.004 0.006
72.000 0.167 -4.638 -0.565 -5.203 0.010 -0.004 0.005
75.000 0.167 -4.638 -0.565 -5.203 0.010 -0.004 0.005
84.000 0.167 -4.638 -0.475 -5.113 0.010 -0.004 0.006
87.000 0.167 -4.638 -0.537 -5.175 0.010 -0.004 0.006
96.000 0.167 -4.638 -0.287 -4.925 0.010 -0.002 0.007
99.000 0.167 -4.638 -0.565 -5.203 0.010 -0.004 0.005
108.000 0.167 -4.638 -0.618 -5.257 0.010 -0.004 0.005
118.000 0.167 -4.638 -0.370 -5.008 0.010 -0.003 0.007

121.000 0.167 -4.638 -0.507 -5.145 0.010 -0.004 0.006


128.000 0.167 -4.638 -0.537 -5.175 0.010 -0.004 0.006
131.000 0.167 -4.638 -0.537 -5.175 0.010 -0.004 0.006
138.000 0.167 -4.638 -0.507 -5.145 0.010 -0.004 0.006
141.000 0.167 -4.638 -0.537 -5.175 0.010 -0.004 0.006
148.000 0.167 -4.638 -0.592 -5.231 0.010 -0.004 0.005
151.000 0.167 -4.638 -0.507 -5.145 0.010 -0.004 0.006
158.000 0.167 -4.638 2.018 -2.620 0.010 0.058 0.068
161.000 0.167 -4.638 -0.592 -5.231 0.010 -0.004 0.005
171.000 0.167 -4.638 -0.565 -5.203 0.010 -0.004 0.005
181.000 0.167 -4.638 -0.507 -5.145 0.010 -0.004 0.006
188.000 0.167 -4.638 -0.442 -5.080 0.010 -0.003 0.006
191.000 0.167 -4.638 -0.507 -5.145 0.010 -0.004 0.006
198.000 0.167 -4.638 -0.189 -4.828 0.010 -0.002 0.008
201.000 0.167 -4.638 -0.442 -5.080 0.010 -0.003 0.006
40.000 0.333 -3.674 -0.695 -4.370 0.025 -0.012 0.012
52.000 0.333 -3.674 1.113 -2.561 0.025 0.047 0.072
64.000 0.333 -3.674 -0.634 -4.308 0.025 -0.011 0.013
71.000 0.333 -3.674 1.039 -2.636 0.025 0.042 0.067
76.000 0.333 -3.674 -0.938 -4.612 0.025 -0.015 0.010
83.000 0.333 -3.674 -0.896 -4.570 0.025 -0.014 0.010
88.000 0.333 -3.674 -1.014 -4.689 0.025 -0.016 0.009
100.000 0.333 -3.674 3.337 -0.338 0.025 0.392 0.416
107.000 0.333 -3.674 -0.938 -4.612 0.025 -0.015 0.010
112.000 0.333 -3.674 -0.896 -4.570 0.025 -0.014 0.010
117.000 0.333 -3.674 1.283 -2.391 0.025 0.059 0.084
122.000 0.333 -3.674 -0.851 -4.526 0.025 -0.014 0.011
127.000 0.333 -3.674 1.194 -2.480 0.025 0.053 0.077
137.000 0.333 -3.674 1.283 -2.391 0.025 0.059 0.084
142.000 0.333 -3.674 2.008 -1.667 0.025 0.134 0.159
147.000 0.333 -3.674 -0.977 -4.652 0.025 -0.015 0.009
152.000 0.333 -3.674 -1.014 -4.689 0.025 -0.016 0.009
157.000 0.333 -3.674 -0.896 -4.570 0.025 -0.014 0.010
162.000 0.333 -3.674 -0.938 -4.612 0.025 -0.015 0.010
167.000 0.333 -3.674 -0.938 -4.612 0.025 -0.015 0.010
172.000 0.333 -3.674 1.748 -1.926 0.025 0.102 0.127
177.000 0.333 -3.674 2.231 -1.443 0.025 0.166 0.191
182.000 0.333 -3.674 -0.752 -4.426 0.025 -0.013 0.012
187.000 0.333 -3.674 1.913 -1.761 0.025 0.122 0.147
192.000 0.333 -3.674 -0.399 -4.073 0.025 -0.008 0.017
197.000 0.333 -3.674 2.366 -1.308 0.025 0.188 0.213
53.000 0.667 -1.747 -2.065 -3.811 0.148 -0.127 0.022
70.000 0.667 -1.747 1.157 -0.589 0.148 0.208 0.357
82.000 0.667 -1.747 1.393 -0.354 0.148 0.264 0.412
89.000 0.667 -1.747 0.311 -1.436 0.148 0.044 0.192
94.000 0.667 -1.747 -1.361 -3.108 0.148 -0.106 0.043
101.000 0.667 -1.747 2.235 0.489 0.148 0.471 0.620
106.000 0.667 -1.747 -2.065 -3.811 0.148 -0.127 0.022
126.000 0.667 -1.747 -0.396 -2.143 0.148 -0.043 0.105
136.000 0.667 -1.747 -2.065 -3.811 0.148 -0.127 0.022
153.000 0.667 -1.747 -0.978 -2.725 0.148 -0.087 0.062
163.000 0.667 -1.747 -1.793 -3.540 0.148 -0.120 0.028
166.000 0.667 -1.747 1.157 -0.589 0.148 0.208 0.357
173.000 0.667 -1.747 -0.292 -2.039 0.148 -0.033 0.115
186.000 0.667 -1.747 0.991 -0.756 0.148 0.171 0.319
193.000 0.667 -1.747 -0.174 -1.921 0.148 -0.021 0.128
196.000 0.667 -1.747 -0.292 -2.039 0.148 -0.033 0.115
203.000 0.667 -1.747 0.039 -1.708 0.148 0.005 0.153
57.000 1.000 0.181 -1.329 -1.148 0.545 -0.304 0.241
66.000 1.000 0.181 -0.531 -0.350 0.545 -0.132 0.413
78.000 1.000 0.181 2.348 2.529 0.545 0.381 0.926
90.000 1.000 0.181 0.789 0.970 0.545 0.180 0.725
93.000 1.000 0.181 1.848 2.029 0.545 0.339 0.884
105.000 1.000 0.181 1.250 1.431 0.545 0.262 0.807
124.000 1.000 0.181 -0.531 -0.350 0.545 -0.132 0.413
154.000 1.000 0.181 -2.004 -1.823 0.545 -0.406 0.139
184.000 1.000 0.181 0.468 0.649 0.545 0.112 0.657

Remarks

• On the linear scale:
  – mean of random intercepts is zero
  – mean of average over litters is −3.8171
  – mean of predicted value over litters is −3.8171

• On the probability scale:
  – mean of random effect is 0.0207
  – mean of average probabilities over litters is 0.0781
  – mean of predicted probabilities over litters is 0.0988

This property is well known:

  g^{−1}(X_i β̂) ≠ E[g^{−1}(X_i β̂ + Z_i b̂_i)]

• It is seen in plots
• It shows through main effects estimates
• Neuhaus and Jewell (1993)
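The gap between the two probability-scale means can be reproduced with a small Monte Carlo sketch (Python). The linear predictor −3.8171 is taken from the slide above; the random-intercept standard deviation is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

eta = -3.8171      # mean linear predictor over litters (from the slide)
sigma_b = 2.0      # illustrative random-intercept SD (assumption)

b = rng.normal(0.0, sigma_b, size=200_000)  # simulated random intercepts

plug_in = expit(eta)                # inverse link of the average linear predictor
marginal = expit(eta + b).mean()    # average of litter-specific probabilities

# Averaging after the inverse link gives a larger value than plugging in the
# average linear predictor: exactly the discrepancy noted above.
assert marginal > plug_in
```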

13.8 The NTP Data

GEE1 and GLIMMIX Estimates (Model Based Standard Errors; Robust Standard Errors) for the
DEHP Data. Exchangeable Working Assumptions/Random Intercept Model.

Outcome Parameter GENMOD PRENTICE GLIMMIX (rep) GLIMMIX (rand)


External β0 -4.98(0.40;0.37) -4.99(0.46;0.37) -5.00(0.36;0.37) -6.62(0.46;0.40)
βd 5.33(0.57;0.55) 5.32(0.65;0.55) 5.32(0.51;0.55) 7.25(0.81;0.64)
φ 0.88 0.65
ρ 0.11 0.11(0.04) 0.06
rand.int. 3.63
residual 0.25
Visceral β0 -4.50(0.37;0.37) -4.51(0.40;0.37) -4.50(0.36;0.37) -5.60(0.37;0.40)
βd 4.55(0.55;0.59) 4.59(0.58;0.59) 4.55(0.54;0.59) 5.78(0.70;0.71)
φ 1.00 0.92
ρ 0.08 0.11(0.05) 0.08
rand.int. 2.73
residual 0.34
Skeletal β0 -4.83(0.44;0.45) -4.82(0.47;0.44) -4.82(0.46;0.45) -6.63(0.48;0.53)
βd 4.84(0.62;0.63) 4.84(0.67;0.63) 4.84(0.65;0.63) 6.65(0.84;0.89)
φ 0.98 0.86
ρ 0.12 0.14(0.06) 0.13
rand.int. 3.63
residual 0.25
Collapsed β0 -4.05(0.32;0.31) -4.06(0.35;0.31) -4.04(0.33;0.31) -4.85(0.32;0.34)
βd 5.84(0.57;0.61) 5.89(0.62;0.61) 5.82(0.58;0.61) 7.20(0.67;0.72)
φ 1.00 0.96
ρ 0.11 0.15(0.05) 0.11
rand.int. 2.30
residual 0.48

13.9 Transition Models

• In this approach we model the conditional distribution of each Yij given its predecessors Yi1, . . . , Yi,j−1.

• Typically, this is achieved by including the predecessors as additional covariates in a classical GLM.

• Thus we assume
  – E(Yij | Yi1, . . . , Yi,j−1) = µij
  – η(µij) = x_ij^T β + y_(ij)^T α, where y_(ij) = (yi1, . . . , yi,j−1)
  – Var(Yij | Yi1, . . . , Yi,j−1) = φ v(µij)

• Then, construct a likelihood for β and α via

  f(y_i) = f(yi1) f(yi2 | yi1) f(yi3 | yi1, yi2) · · · f(yi,ni | yi1, . . . , yi,ni−1)

• In practice one will make the “recent past” assumption: only the k most recent measurements are needed:

  f(yij | yi1, . . . , yi,j−1) = f(yij | yi,j−k, . . . , yi,j−1),

  a Markov dependence of order k.

• The joint distribution then simplifies to

  f(y_i) = f(yi1) f(yi2 | yi1) · · · f(yik | yi1, . . . , yi,k−1) × ∏_{j=k+1}^{ni} f(yij | yi,j−k, . . . , yi,j−1)

• The transition model specifies the form of the conditional densities within the product sign. It does not explicitly specify the remaining terms in f(y_i), and these may be impossible to evaluate.

• Pragmatic solution: ignore the unspecified terms. This represents a loss of information which may or may not be serious, depending on the values of k and ni. It is a potentially serious disadvantage for data consisting of short sequences.
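For a binary outcome with Markov dependence of order k = 1, the conditional part of this likelihood can be sketched as follows (Python; the coefficients β0 and α are illustrative assumptions, not fitted values):

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical first-order (k = 1) transition model for a binary sequence:
# logit P(Y_j = 1 | Y_{j-1}) = beta0 + alpha * y_{j-1}
beta0, alpha = -1.0, 2.0

def conditional_loglik(y):
    """Log-likelihood of y_2, ..., y_n given y_1 (the unspecified f(y_1) is ignored)."""
    ll = 0.0
    for prev, cur in zip(y[:-1], y[1:]):
        p = expit(beta0 + alpha * prev)
        ll += cur * np.log(p) + (1 - cur) * np.log(1 - p)
    return ll

y = [0, 0, 1, 1, 0]
ll = conditional_loglik(y)
```

Ignoring f(y_1) is precisely the pragmatic solution described above; for short sequences the discarded term carries relatively more information.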

13.10 Differences Between Families of Models

The parameter β does not have the same substantive meaning in the three approaches:

• in marginal modelling, β is unequivocally a population parameter; it determines the effect of explanatory variables on the population mean response

• in conditional, transition, or random-effects modelling, β is still a population parameter, in the sense that it operates on all of the subjects, but it determines the effects of explanatory variables on the conditional mean response of an individual subject
  – given that subject’s measurement history (transition model), OR
  – given that subject’s own random characteristics Ui (random-effects model), OR
  – given that subject’s other outcomes (conditional model)

• the three classes of model are fundamentally different, and no easy conversion is possible
Chapter 14

Case Study: Analgesic Trial

14.1 PROC NLMIXED Code

proc nlmixed data=gsa;


parms beta0=3 beta1=-0.8 beta2=0.2 beta3=-0.2 s2u=1;
eta = beta0 + beta1*time + beta2*time2 + beta3*pca0 + u;
expeta = exp(eta);
p = expeta/(1+expeta);
model gsabin ~ binary(p);
random u ~ normal(0,s2u) subject=patid;
estimate ’ICC’ s2u/(arcos(-1)**2/3 + s2u);
run;


14.1.1 Output

The NLMIXED Procedure

Specifications

Description Value

Data Set WORK.GSA


Dependent Variable gsabin
Distribution for Dependent Variable Binary
Random Effects u
Distribution for Random Effects Normal
Subject Variable PATID
Optimization Technique Dual Quasi-Newton
Integration Method Gaussian Quadrature

Dimensions

Description Value

Observations Used 1137


Observations Not Used 0
Total Observations 1137
Subjects 395
Max Obs Per Subject 4
Parameters 5
Quadrature Points 20

Parameters

beta0 beta1 beta2 beta3 s2u NegLogLike

3 -0.8 0.2 -0.2 1 512.393718

Iteration History

Iter Calls NegLogLike Diff MaxGrad Slope

1 4 511.382562 1.011155 8.728454 -654.408


2 6 511.150086 0.232476 8.37513 -7.92042
3 8 507.712837 3.437249 8.276495 -4.42161
4 10 506.777649 0.935189 1.938972 -0.84523

5 12 506.53802 0.239629 4.142048 -0.11795


6 13 506.380646 0.157374 20.07718 -0.1081
7 15 506.287593 0.093053 2.582222 -0.15011
8 17 506.275845 0.011748 0.605879 -0.01344
9 19 506.274764 0.001081 0.292118 -0.00145
10 21 506.274723 0.000041 0.044318 -0.00007
11 23 506.274723 3.296E-7 0.002903 -5.24E-7

NOTE: GCONV convergence criterion satisfied.

Fit Statistics

Description Value

-2 Log Likelihood 1012.5


AIC (smaller is better) 1022.5
BIC (smaller is better) 1042.4
Log Likelihood -506.3
AIC (larger is better) -511.3
BIC (larger is better) -521.2

Parameter Estimates

Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower

beta0 4.0474 0.7097 394 5.70 <.0001 0.05 2.6521


beta1 -1.1600 0.4657 394 -2.49 0.0131 0.05 -2.0756
beta2 0.2445 0.09472 394 2.58 0.0102 0.05 0.05826
beta3 -0.2997 0.1428 394 -2.10 0.0365 0.05 -0.5805
s2u 2.5326 0.6764 394 3.74 0.0002 0.05 1.2027

Parameter Estimates

Parameter Upper Gradient

beta0 5.4428 0.000556


beta1 -0.2445 -0.00006
beta2 0.4307 -0.0029
beta3 -0.01893 0.001801
s2u 3.8624 -0.00006

Additional Estimates

Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper

ICC 0.4350 0.06564 394 6.63 <.0001 0.05 0.3059 0.5640
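The 'ICC' line in the Additional Estimates table can be reproduced directly from the reported random-intercept variance (a quick Python check):

```python
import math

s2u = 2.5326                           # random-intercept variance from the output above
icc = s2u / (s2u + math.pi**2 / 3.0)   # logistic residual variance is pi^2/3
# icc is approximately 0.4350, matching the Additional Estimates table
```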



14.1.2 Population Averaged Profiles

Example code to derive population-averaged fitted (complete) profiles

• Need to calculate

  ∫_{−∞}^{+∞} [exp(x_ij^T β + x) / (1 + exp(x_ij^T β + x))] · (1 / (√(2π) σu)) · exp(−x² / (2σu²)) dx.

• Take the mean value of the covariates to evaluate this expression.

• This gives fitted complete profiles, that is, what would be obtained had all the patients stayed in the study.

*** use the following statement in PROC NLMIXED to get parameter estimates;
ods output parameterestimates=parmest;

proc iml;
*** read in fixed param. estimates;
use parmest;
read all var{estimate} into parmest;
beta=parmest[1:(nrow(parmest)-1)];
sig2=parmest[nrow(parmest)];

*** module that evaluates the function to be integrated;


start integr(x) global(sig2,xbeta);
f=(exp(xbeta+x)/(1+exp(xbeta+x)))*exp(-0.5*(x**2)/sig2)/sqrt(2*arcos(-1)*sig2);
return(f);
finish;

cif=probit(0.975);
do t=1 to 4;
xcov={1} // t // t**2 // {3};*** Note: 3 = median baseline PCA;
xbeta=t(xcov)*beta;
call quad(prc,"integr",{.M .P});
*** approximate confidence intervals (ignoring variability in the estimates);
low_prc=exp(xbeta-cif*sqrt(sig2))/(1+exp(xbeta-cif*sqrt(sig2)));
upp_prc=exp(xbeta+cif*sqrt(sig2))/(1+exp(xbeta+cif*sqrt(sig2)));
pdrespc=pdrespc // (t || prc || low_prc || upp_prc);
end;
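Outside SAS/IML, the same integral can be approximated with Gauss-Hermite quadrature; a sketch in Python with numpy (the parameter estimates are those reported by PROC NLMIXED above, and pca0 = 3 is the median baseline PCA used in the IML code):

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

beta = np.array([4.0474, -1.1600, 0.2445, -0.2997])  # intercept, time, time^2, pca0
s2u = 2.5326                                         # random-intercept variance

# 20-point Gauss-Hermite rule: integral of exp(-z^2) f(z) dz ~ sum_i w_i f(z_i)
nodes, weights = np.polynomial.hermite.hermgauss(20)

def marginal_prob(t, pca0=3.0):
    """E_u[expit(x'beta + u)], u ~ N(0, s2u), via the substitution u = sqrt(2*s2u)*z."""
    xb = beta @ np.array([1.0, t, t * t, pca0])
    return float(np.sum(weights * expit(xb + np.sqrt(2.0 * s2u) * nodes)) / np.sqrt(np.pi))

p1 = marginal_prob(1.0)  # population-averaged probability at the first occasion
```

Note the quadrature points are placed relative to the random-effect distribution, which is why the substitution u = √(2·s2u)·z and the 1/√π factor appear.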

14.1.3 Fitted Model

(Figures: fitted population-averaged profiles; graphics not recoverable from the text extraction.)

14.2 Comparison of Different Approaches/Programs

14.2.1 PROC NLMIXED (Gaussian quadrature and N-R)

• Code:
proc nlmixed data=gsa npoints=20 noad noadscale tech=newrap;
parms beta0=3 beta1=-0.8 beta2=0.2 beta3=-0.2 su=1;
eta = beta0 + beta1*time + beta2*time2 + beta3*pca0 + u;
expeta = exp(eta);
p = expeta/(1+expeta);
model gsabin ~ binary(p);
random u ~ normal(0,su**2) subject=patid;
estimate ’ICC’ su**2/(arcos(-1)**2/3 + su**2);
run;

• Output:
The NLMIXED Procedure

Specifications

Description Value

Data Set WORK.GSA


Dependent Variable gsabin
Distribution for Dependent Variable Binary
Random Effects u
Distribution for Random Effects Normal
Subject Variable PATID
Optimization Technique Newton-Raphson
Integration Method Gaussian Quadrature

Dimensions

Description Value

Observations Used 1137


Observations Not Used 0


Total Observations 1137
Subjects 395
Max Obs Per Subject 4
Parameters 5
Quadrature Points 20

Parameters

beta0 beta1 beta2 beta3 su NegLogLike

3 -0.8 0.2 -0.2 1 512.393718

Iteration History

Iter Calls NegLogLike Diff MaxGrad Slope

1 14 506.413768 5.97995 11.45115 -11.1936


2 21 506.275166 0.138602 0.266335 -0.26772
3 28 506.274723 0.000443 0.0004 -0.00088
4 35 506.274723 6.782E-9 4.515E-9 -1.36E-8

NOTE: GCONV convergence criterion satisfied.

Fit Statistics

Description Value

-2 Log Likelihood 1012.5


AIC (smaller is better) 1022.5
BIC (smaller is better) 1042.4
Log Likelihood -506.3
AIC (larger is better) -511.3
BIC (larger is better) -521.2

Parameter Estimates

Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower

beta0 4.0474 0.7097 394 5.70 <.0001 0.05 2.6521


beta1 -1.1600 0.4657 394 -2.49 0.0131 0.05 -2.0755
beta2 0.2445 0.09472 394 2.58 0.0102 0.05 0.05826
beta3 -0.2997 0.1428 394 -2.10 0.0365 0.05 -0.5805
su 1.5914 0.2125 394 7.49 <.0001 0.05 1.1736

Parameter Estimates

Parameter Upper Gradient

beta0 5.4427 -1.56E-9


beta1 -0.2445 -2E-9
beta2 0.4307 1.313E-9
beta3 -0.01893 1.468E-9
su 2.0092 -4.52E-9

Additional Estimates

Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper

ICC 0.4350 0.06564 394 6.63 <.0001 0.05 0.3059 0.5640



14.2.2 MIXOR

MIXOR - The program for mixed-effects ordinal regression analysis (version 2)

Global Satisfaction Assessment

Response function: logistic

Random-effects distribution: normal

Covariate(s) and random-effect(s) mean subtracted from thresholds


==> positive coefficient = positive association between regressor
and ordinal outcome

Numbers of observations
-----------------------

Level 1 observations = 1137


Level 2 observations = 395

The number of level 1 observations per level 2 unit are:

1 1 4 2 1 4 3 4 3 3 4 4 4 3 1 1 1 3 2
...

Descriptive statistics for all variables


----------------------------------------

Variable Minimum Maximum Mean Stand. Dev.

GSAbin 0.00000 1.00000 0.81882 0.38534


intcpt 1.00000 1.00000 1.00000 0.00000
Time 1.00000 4.00000 2.25330 1.12238
Time2 1.00000 16.00000 6.33597 5.55444
PCA0 1.00000 5.00000 3.02375 0.89519

Categories of the response variable GSAbin


--------------------------------------------

Category Frequency Proportion

0.00 206.00 0.18118


1.00 931.00 0.81882

Starting values
---------------

mean 1.022
covariates 0.295 -0.066 0.079
var. terms 0.574

==> The number of level 2 observations with non-varying responses


= 284 ( 71.90 percent )

---------------------------------------------------------
* Final Results - Maximum Marginal Likelihood Estimates *
---------------------------------------------------------

Total Iterations = 10
Quad Pts per Dim = 20
Log Likelihood = -506.275
Deviance (-2logL) = 1012.549
Ridge = 0.000

Variable Estimate Stand. Error Z p-value


-------- ------------ ------------ ------------ ------------
intcpt 4.04741 0.71278 5.67835 0.00000 (2)
Time -1.16003 0.47453 -2.44457 0.01450 (2)
Time2 0.24449 0.09678 2.52624 0.01153 (2)
PCA0 -0.29971 0.15375 -1.94932 0.05126 (2)

Random effect variance term (standard deviation)


intcpt 1.59139 0.20578 7.73355 0.00000 (1)

note: (1) = 1-tailed p-value


(2) = 2-tailed p-value

Calculation of the intracluster correlation


-------------------------------------------
residual variance = pi*pi / 3 (assumed)
cluster variance = (1.591 * 1.591) = 2.533

intracluster correlation = 2.533 / ( 2.533 + (pi*pi/3)) = 0.435



14.2.3 PQL2 (MLwiN) without and with extra-dispersion parameter

(MLwiN output: shown as screenshots in the original slides; not recoverable from the text extraction.)

14.2.4 PQL (MLwiN) without and with extra-dispersion parameter

(MLwiN output: shown as screenshots in the original slides; not recoverable from the text extraction.)

14.2.5 PQL (GLIMMIX)

• Code:
%glimmix(data=gsa, procopt=%str(method=ml noclprint covtest),
stmts=%str(
class patid timecls;
model gsabin = time|time pca0 / s;
repeated timecls / sub=patid type=un rcorr=3;
),
error=binomial);

• Output:
The Mixed Procedure

Model Information

Data Set WORK._DS


Dependent Variable _z
Weight Variable _w
Covariance Structure Variance Components
Subject Effect PATID
Estimation Method ML
Residual Variance Method Profile
Fixed Effects SE Method Model-Based
Degrees of Freedom Method Containment

Dimensions

Covariance Parameters 2
Columns in X 4
Columns in Z Per Subject 1
Subjects 395
Max Obs Per Subject 4
Observations Used 1137
Observations Not Used 0
Total Observations 1137

Parameter Search

CovP1 CovP2 Variance Log Like -2 Log Like

3.1651 0.4954 0.4954 -2833.4382 5666.8764

Iteration History

Iteration Evaluations -2 Log Like Criterion

1 1 5666.87638845 0.00000000

Convergence criteria met.

Covariance Parameter Estimates

Standard Z
Cov Parm Subject Estimate Error Value Pr Z

Intercept PATID 3.1651 0.3911 8.09 <.0001


Residual 0.4954 0.02521 19.65 <.0001

Fit Statistics

Log Likelihood -2833.4


Akaike’s Information Criterion -2835.4
Schwarz’s Bayesian Criterion -2839.4
-2 Log Likelihood 5666.9

PARMS Model Likelihood Ratio Test

DF Chi-Square Pr > ChiSq

1 0.00 1.0000

Solution for Fixed Effects

Standard
Effect Estimate Error DF t Value Pr > |t|

Intercept 4.0292 0.5476 394 7.36 <.0001


TIME -1.2788 0.3341 739 -3.83 0.0001
TIME*TIME 0.2590 0.06800 739 3.81 0.0002


pca0 -0.2922 0.1300 739 -2.25 0.0249

Type 3 Tests of Fixed Effects

Num Den
Effect DF DF F Value Pr > F

TIME 1 739 14.65 0.0001


TIME*TIME 1 739 14.50 0.0002
pca0 1 739 5.05 0.0249

GLIMMIX Model Statistics

Description Value

Deviance 564.5908
Scaled Deviance 1139.5937
Pearson Chi-Square 451.3494
Scaled Pearson Chi-Square 911.0227
Extra-Dispersion Scale 0.4954
Chapter 15

Analgesic Trial: Ordinal Data

15.1 Proportional odds model

• Multinomial model for a response variable with K ordered categories (1, . . . , K):

  P[Yi ≤ r] = F(µr + x_i^T β) for r = 1, . . . , K − 1,

  where F is a cumulative distribution function.

• F = cdf of the logistic distribution ⇒ proportional odds model.
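The cumulative-logit structure can be illustrated with a short computation (Python; the intercepts and slopes are the PROC LOGISTIC estimates reported in the output of this section). Under proportional odds, the common slope guarantees the cumulative probabilities are monotone in r for every covariate value:

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fitted PROC LOGISTIC values: K - 1 = 4 intercepts, common slopes
mu = np.array([-1.7603, -0.2421, 1.5386, 1.8178])   # mu_r, r = 1..4
beta = np.array([0.1201, -0.0262, -0.0441])         # time, time^2, pca0

def category_probs(x):
    """Cell probabilities from cumulative logits P[Y <= r] = expit(mu_r + x'beta)."""
    cum = expit(mu + x @ beta)                   # P[Y <= r], r = 1..K-1
    cum = np.concatenate(([0.0], cum, [1.0]))    # add P[Y <= 0] = 0, P[Y <= K] = 1
    return np.diff(cum)

p = category_probs(np.array([1.0, 1.0, 3.0]))    # time = 1, time^2 = 1, pca0 = 3
assert abs(p.sum() - 1.0) < 1e-9
```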


• PROC LOGISTIC code (parameter estimates can serve as initial values for PROC NLMIXED):
proc logistic data=gsa;
class patid timecls;
model gsa = time|time pca0;
run;

• Output:

The LOGISTIC Procedure

Model Information

Data Set WORK.GSA


Response Variable GSA
Number of Response Levels 5
Number of Observations 1137
Link Function Logit
Optimization Technique Fisher’s scoring

Response Profile

Ordered Total
Value GSA Frequency

1 Bad 163
2 Good 329
3 Moderate 439
4 Very Bad 43
5 Very Good 163

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Score Test for the Proportional Odds Assumption

Chi-Square DF Pr > ChiSq



30.0808 9 0.0004

Model Fit Statistics

Intercept
Intercept and
Criterion Only Covariates

AIC 3207.617 3212.857


SC 3227.761 3248.110
-2 Log L 3199.617 3198.857

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 0.7597 3 0.8591


Score 0.7336 3 0.8653
Wald 0.7820 3 0.8538

Analysis of Maximum Likelihood Estimates

Standard
Parameter DF Estimate Error Chi-Square Pr > ChiSq

Intercept 1 -1.7603 0.3411 26.6377 <.0001


Intercept2 1 -0.2421 0.3359 0.5196 0.4710
Intercept3 1 1.5386 0.3395 20.5387 <.0001
Intercept4 1 1.8178 0.3413 28.3632 <.0001
TIME 1 0.1201 0.2697 0.1984 0.6561
TIME*TIME 1 -0.0262 0.0545 0.2305 0.6312
pca0 1 -0.0441 0.0601 0.5379 0.4633

Odds Ratio Estimates

Point 95% Wald


Effect Estimate Confidence Limits

pca0 0.957 0.850 1.077

Association of Predicted Probabilities and Observed Responses

Percent Concordant 44.3 Somers’ D 0.012


Percent Discordant 43.1 Gamma 0.013


Percent Tied 12.5 Tau-a 0.009
Pairs 468410 c 0.506

15.2 GEE Model

• Only the IND structure is available for the multinomial distribution in PROC GENMOD.
• Code:

proc genmod data=gsa;


ods listing exclude classlevels parminfo parameterestimates;
class patid timecls;
model gsa = time|time pca0 / dist=multinomial link=cumlogit;
repeated sub=patid / type=ind within=timecls;
run;

• Output:

The GENMOD Procedure

Model Information

Data Set WORK.GSA


Distribution Multinomial
Link Function Cumulative Logit
Dependent Variable GSA
Observations Used 1137

Response Profile

Ordered Ordered
Level Value Count

1 Bad 163
2 Good 329
3 Moderate 439
4 Very Bad 43
5 Very Good 163

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Log Likelihood -1599.4286

Algorithm converged.

GEE Model Information

Correlation Structure Independent


Within-Subject Effect timecls (4 levels)
Subject Effect PATID (395 levels)
Number of Clusters 395
Correlation Matrix Dimension 4
Maximum Cluster Size 4
Minimum Cluster Size 1

Algorithm converged.

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept1 -1.7598 0.3661 -2.4774 -1.0423 -4.81 <.0001


Intercept2 -0.2416 0.3703 -0.9674 0.4842 -0.65 0.5141
Intercept3 1.5392 0.3727 0.8087 2.2696 4.13 <.0001
Intercept4 1.8184 0.3841 1.0656 2.5711 4.73 <.0001
TIME 0.1201 0.2634 -0.3962 0.6365 0.46 0.6483
TIME*TIME -0.0262 0.0527 -0.1295 0.0772 -0.50 0.6196
pca0 -0.0443 0.0817 -0.2044 0.1158 -0.54 0.5876

15.3 Random-effects Model

Hedeker D. & Gibbons R. (1994). A random-effects ordinal regression model for multilevel analysis. Biometrics, 50, 933–44.

• Model assuming an underlying latent continuous variable ỹij (e.g. normal or logistic):

  ỹij = x_ij^T β + ui + εij

• Instead of observing ỹij, we observe only whether it falls in one of the intervals

  ]−∞, γ1], ]γ1, γ2], . . . , ]γK−2, γK−1], ]γK−1, +∞[

• The model can then be written as

  P[yij = r | ui] = F(γr − zij) − F(γr−1 − zij),   r = 1, . . . , K,

  with zij = x_ij^T β + ui, γ0 = −∞ and γK = +∞


• This has been implemented in the MIXOR program.
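A sketch of these category probabilities for one subject, using the threshold and slope estimates reported in the NLMIXED output of this section (Python):

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fitted NLMIXED thresholds and coefficients for the ordinal GSA model
gamma = np.array([-1.5585, 1.0292, 3.8916, 6.2144])   # i1..i4
b = np.array([0.5410, -0.1123, 0.3173])               # time, time^2, pca0

def probs_given_u(x, u):
    """P[y = r | u] = F(gamma_r - z) - F(gamma_{r-1} - z), z = x'b + u, F logistic."""
    z = x @ b + u
    cum = np.concatenate(([0.0], expit(gamma - z), [1.0]))  # gamma_0 = -inf, gamma_K = +inf
    return np.diff(cum)

p = probs_given_u(np.array([1.0, 1.0, 3.0]), u=0.0)   # time = 1, pca0 = 3, median subject
assert abs(p.sum() - 1.0) < 1e-9
```

Because the thresholds are increasing, the differences are automatically nonnegative; varying u shifts the whole distribution over the K categories.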

• PROC NLMIXED code:

proc nlmixed data=gsa qpoints=20;


parms i1=-1.8 i2=-0.2 i3=1.5 i4=1.8 b1=0.1 b2=0 b3=0 sd=1;
eta = b1*time + b2*time2 + b3*pca0 + u;
if gsa=1 then z = 1/(1+exp(-(i1-eta)));
else if gsa=2 then z = 1/(1+exp(-(i2-eta))) - 1/(1+exp(-(i1-eta)));
else if gsa=3 then z = 1/(1+exp(-(i3-eta))) - 1/(1+exp(-(i2-eta)));
else if gsa=4 then z = 1/(1+exp(-(i4-eta))) - 1/(1+exp(-(i3-eta)));
else z = 1 - 1/(1+exp(-(i4-eta)));
if z > 1e-8 then ll = log(z);
else ll = -1e100;
model gsa ~ general(ll);
random u ~ normal(0,sd*sd) subject=patid;
estimate ’var_u’ sd*sd;
estimate ’icc’ sd*sd/(sd*sd+arcos(-1)**2/3);
run;

• Output:

The NLMIXED Procedure

Specifications

Description Value

Data Set WORK.GSA


Dependent Variable GSA
Distribution for Dependent Variable General
Random Effects u
Distribution for Random Effects Normal
Subject Variable PATID
Optimization Technique Dual Quasi-Newton
Integration Method Adaptive Gaussian
Quadrature

Dimensions

Description Value

Observations Used 1137


Observations Not Used 0


Total Observations 1137
Subjects 395
Max Obs Per Subject 4
Parameters 8
Quadrature Points 20

Parameters

i1 i2 i3 i4 b1 b2 b3 sd

-1.8 -0.2 1.5 1.8 0.1 0 0 1

Parameters

NegLogLike

1716.69515

Iteration History

Iter Calls NegLogLike Diff MaxGrad Slope

1 3 1694.65105 22.0441 989.5756 -7996.23


2 4 1615.93733 78.71373 919.9434 -4222.5
3 5 1542.58202 73.3553 739.0983 -386.772
4 7 1498.82212 43.7599 200.5629 -245.556
5 8 1470.57325 28.24887 86.6201 -289.557
6 10 1459.63373 10.93952 101.1084 -30.3395
7 12 1454.03215 5.601585 42.71724 -4.98865
8 14 1451.81078 2.221367 39.48043 -1.31487

Iteration History

Iter Calls NegLogLike Diff MaxGrad Slope

9 16 1450.83262 0.978159 15.13727 -0.77169


10 18 1450.35766 0.474961 22.16116 -0.31726
11 19 1449.68318 0.674483 9.663066 -0.195
12 21 1447.25905 2.42413 5.373867 -0.8011
13 22 1445.77923 1.479822 19.19658 -2.05807
14 24 1445.25326 0.525968 6.559206 -0.92561
15 26 1445.24347 0.009789 1.730594 -0.01624
16 28 1445.24043 0.003041 2.766239 -0.00236
17 30 1445.23852 0.001914 0.297433 -0.00122
18 32 1445.23842 0.0001 0.238891 -0.00015


19 34 1445.23839 0.000023 0.041654 -0.00002
20 36 1445.23839 3.868E-7 0.001023 -7.06E-7

NOTE: GCONV convergence criterion satisfied.

Fit Statistics

Description Value

-2 Log Likelihood 2890.5


AIC (smaller is better) 2906.5
BIC (smaller is better) 2938.3
Log Likelihood -1445
AIC (larger is better) -1453
BIC (larger is better) -1469

Parameter Estimates

Standard
Parameter Estimate Error DF t Value Pr > |t| Alpha Lower

i1 -1.5585 0.5481 394 -2.84 0.0047 0.05 -2.6360


i2 1.0292 0.5442 394 1.89 0.0593 0.05 -0.04061
i3 3.8916 0.5624 394 6.92 <.0001 0.05 2.7860
i4 6.2144 0.5990 394 10.37 <.0001 0.05 5.0368
b1 0.5410 0.3078 394 1.76 0.0796 0.05 -0.06420
b2 -0.1123 0.06187 394 -1.82 0.0702 0.05 -0.2340
b3 0.3173 0.1386 394 2.29 0.0226 0.05 0.04476
sd 2.1082 0.1412 394 14.94 <.0001 0.05 1.8307

Parameter Estimates

Parameter Upper Gradient

i1 -0.4810 -0.00002
i2 2.0991 -0.00011
i3 4.9973 0.000138
i4 7.3920 -0.00003
b1 1.1463 0.00026
b2 0.009308 0.001023
b3 0.5898 0.000191
sd 2.3857 -0.0001

Additional Estimates

Standard
Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper

var_u 4.4447 0.5952 394 7.47 <.0001 0.05 3.2746 5.6148


icc 0.5747 0.03273 394 17.56 <.0001 0.05 0.5103 0.6390
Chapter 16

Missing Data

16.1 Missing Data Notation

• Subject i at occasion (time) j = 1, . . . , ni

• Measurement Yij

• Dropout indicator

  Rij = 1 if Yij is observed, and Rij = 0 otherwise.


• Group the Yij into a vector

  Y_i = (Yi1, . . . , Yini) = (Y_i^o, Y_i^m)

  where Y_i^o contains the Yij for which Rij = 1, and Y_i^m contains the Yij for which Rij = 0.

• Group the Rij into a vector R_i = (Ri1, . . . , Rini)

Dropout

• Monotone patterns only

• Di: time of dropout (relevant for monotone processes)

• Possible definition:

  Di = 1 + Σ_{j=1}^{ni} Rij
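For a monotone pattern this definition is simply one plus the number of observed measurements; for example (Python):

```python
def dropout_time(r):
    """D_i = 1 + sum_j R_ij; a value of n_i + 1 identifies a completer."""
    return 1 + sum(r)

assert dropout_time([1, 1, 0, 0]) == 3  # last observation at occasion 2, dropout at 3
assert dropout_time([1, 1, 1, 1]) == 5  # completer: D_i = n_i + 1
```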

16.2 The Name of the Game

Complete data Y_i: the scheduled measurements; the outcome vector that would be recorded if there were no missing data.

Missing data indicators R_i: also called the missing data process.

Full data (Y_i, R_i): the complete data, together with the missing data indicators.

Observed data Y_i^o.

Missing data Y_i^m.

One observes the observed measurements Y_i^o, together with the dropout indicators R_i.

16.3 Factorizing the Distribution

Consider the distribution of the full data:

  f(Yi, Di | θ, ψ)

• θ parametrizes the measurement distribution,

• ψ parametrizes the missingness process.

Several routes are possible:

Selection models:
  f(Yi | θ) f(Di | Yi, ψ)

Pattern-mixture models:
  f(Yi | Di, θ) f(Di | ψ)

Shared-parameter models:
  f(Yi, Di | bi, θ, ψ)

16.4 Selection Models

Most models are based on the following factorization:

  f(Yi, Di | θ, ψ) = f(Yi | θ) f(Di | Yi, ψ)

• the first factor is the marginal density of the measurement process

• the second factor is the density of the missingness process, given the outcomes.

This framework is called selection modeling: the second factor corresponds to the (self-)selection of individuals into the “observed” and “missing” groups.

16.5 Missing Data Processes

f(Di | Y_i, ψ) = f(Di | Y_i^o, Y_i^m, ψ).

Missing Completely At Random (MCAR): missingness is independent of the measurements: f(Di | ψ).

Missing At Random (MAR): missingness is independent of the unobserved (missing) measurements, possibly depending on the observed measurements: f(Di | Y_i^o, ψ).

Missing Not At Random (MNAR): missingness depends on the missing values.

The above terminology is independent of the statistical framework chosen to analyse the data. This is to be contrasted with the terms ignorable and nonignorable missing data.
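A small simulation helps fix ideas about MAR (Python; the data-generating and missingness coefficients are illustrative assumptions). Missingness of Y2 depends only on the observed Y1, which biases the complete-case mean of Y2 but leaves the conditional distribution of Y2 given Y1 intact:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

y1 = rng.normal(size=n)
y2 = 0.5 * y1 + rng.normal(size=n)

# MAR dropout: P(Y2 missing) depends on the observed y1 only, never on y2
p_miss = 1.0 / (1.0 + np.exp(1.0 - 1.5 * y1))
observed = rng.uniform(size=n) > p_miss

cc_mean = y2[observed].mean()                          # biased for E[Y2] = 0
slope = np.polyfit(y1[observed], y2[observed], 1)[0]   # still close to 0.5

assert cc_mean < -0.1           # complete-case mean is pulled down
assert abs(slope - 0.5) < 0.05  # conditional relationship preserved under MAR
```

This is why, under MAR, methods that model the conditional (likelihood) structure remain valid while naive complete-case summaries do not.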

16.6 Ignorability

Let us decide to use likelihood-based estimation.

The full data likelihood contribution for subject i:

  L*(θ, ψ | Y_i, Di) ∝ f(Y_i, Di | θ, ψ).

Base inference on the observed data:

  L(θ, ψ | Y_i^o, Di) ∝ f(Y_i^o, Di | θ, ψ)

with

  f(Y_i^o, Di | θ, ψ) = ∫ f(Y_i, Di | θ, ψ) dY_i^m
                      = ∫ f(Y_i^o, Y_i^m | θ) f(Di | Y_i^o, Y_i^m, ψ) dY_i^m.

Under a MAR process:

  f(Y_i^o, Di | θ, ψ) = ∫ f(Y_i^o, Y_i^m | θ) f(Di | Y_i^o, ψ) dY_i^m
                      = f(Y_i^o | θ) f(Di | Y_i^o, ψ).

The likelihood factorizes into two components.



16.7 Ignorability ←− Separability

If, in addition, θ and ψ are disjoint, then inference can be based on the marginal observed data density only.

Within the likelihood framework, ignorability is equivalent to the union of MAR and MCAR (assuming separability).

• Counterexamples:
  – Generalized estimating equations (Liang and Zeger)
  – Least squares

• General account: Rubin (Bka 1976)
  – Sampling distribution (frequentist) theory
  – Likelihood-based estimation
  – Bayesian inference

16.8 Simple Methods

• Rectangular matrix by deletion: complete case analysis

• Rectangular matrix by completion: imputation
  – Vertical: unconditional mean imputation
  – Horizontal: last observation carried forward
  – Diagonal: conditional mean imputation

• Using data as is: available case analysis
  – Frequentist: difficult and not generally valid
  – Likelihood: the thing to do – consistent with an ignorable analysis
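The two “rectangular matrix” devices are easy to state in code (a Python/numpy sketch on a toy monotone data set; neither is recommended over a valid ignorable likelihood analysis):

```python
import numpy as np

# Toy data: 3 subjects, 3 occasions, NaN = missing (monotone dropout)
y = np.array([[4.0, 5.0, 6.0],
              [3.0, 4.0, np.nan],
              [5.0, np.nan, np.nan]])

# Complete case analysis: delete every subject with any missing value
cc = y[~np.isnan(y).any(axis=1)]

# Last observation carried forward: fill horizontally with the last seen value
locf = y.copy()
for i in range(locf.shape[0]):
    for j in range(1, locf.shape[1]):
        if np.isnan(locf[i, j]):
            locf[i, j] = locf[i, j - 1]

assert cc.shape == (1, 3)                      # only the completer survives deletion
assert locf[1, 2] == 4.0 and locf[2, 2] == 5.0
```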

16.9 Three Likelihood Approaches

• Direct likelihood maximization
  – continuous: SAS PROC MIXED, . . .
  – categorical: generalized linear mixed models (SAS PROC NLMIXED, . . . )
  – NOT: generalized estimating equations !!!

• EM algorithm: match the data to the “complete” model

• Multiple imputation: accounts properly for uncertainty due to missingness

16.10 An Ignorable Likelihood Analysis

Likelihood-based inference is valid whenever

• the mechanism is MAR,

• the parameters describing the missingness mechanism are distinct from the measurement model parameters.

PROC MIXED gives valid inference! Almost. . .

Warnings

• When the research question is concerned with missingness parameters, a more complex analysis is needed.

• Precision estimation poses problems (Kenward and Molenberghs, Stat Sci 1997).

• MNAR is hard to rule out: OSWALD (PCMID), . . .

16.11 A Selection Model

Measurements: the linear mixed model

  y_i = X_i β + Z_i b_i + ε_i

with

  b_i ∼ N(0, D),   ε_i ∼ N(0, Σ_i),   b_i and ε_i independent,

so that

  y_i ∼ N_{ni}(X_i β, V_i),   V_i = Z_i D Z_i^T + Σ_i

(Laird and Ware, Bcs 1982)



16.12 Dropout Model

• Monotone dropout

• Dropout probability at occasion j:

  P(Di = j | Di ≥ j, y_i, Wi) = g(hij, yij)

• hij: vector with all responses prior to occasion j, and possibly covariates wij.

• Dropout model:

  logit[g(hij, yij)] = logit[P(Di = j | Di ≥ j, y_i, Wi)] = hij^T ψ + ω yij,   i = 1, . . . , N

• MAR if ω = 0

• Nonrandom (MNAR) if ω ≠ 0

16.13 Contributions Combined

• Dropout probability:

  f(di | y_i, Wi, ψ) =

    ∏_{j=2}^{ni} [1 − g(hij, yij)]                          for Di = ni + 1,

    ∏_{j=2}^{d−1} [1 − g(hij, yij)] · g(hid, yid)           for Di = d ≤ ni.
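These contributions can be sketched as follows (Python; the dropout model is a hypothetical one with logit g = ψ0 + ψ1·y_{i,j−1} + ω·y_ij, so ω = 0 corresponds to MAR). A useful sanity check is that, given the full outcome vector, the probabilities over the dropout times d = 2, . . . , n_i + 1 sum to one:

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical dropout model: logit g(h_ij, y_ij) = psi0 + psi1*y_{i,j-1} + omega*y_ij
psi0, psi1, omega = -2.0, 0.5, 0.0   # omega = 0 <=> MAR

def dropout_contribution(y, d):
    """f(d | y): survive occasions 2..d-1, then drop out at d (d = n+1: completer)."""
    n = len(y)
    contrib = 1.0
    for j in range(2, min(d, n + 1)):   # occasions survived
        contrib *= 1.0 - expit(psi0 + psi1 * y[j - 2] + omega * y[j - 1])
    if d <= n:                          # dropout actually occurred at occasion d
        contrib *= expit(psi0 + psi1 * y[d - 2] + omega * y[d - 1])
    return contrib

y = [1, 0, 1]
total = sum(dropout_contribution(y, d) for d in range(2, len(y) + 2))
assert abs(total - 1.0) < 1e-12   # dropout-time probabilities sum to one
```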

16.14 A Paradox

Glynn, Laird and Rubin (1986)

• Two measurements (Y1, Y2)

• Y1 always observed.

• Y2 observed (R = 1) or missing (R = 0).

• Selection model versus pattern-mixture model:

  f(y1, y2) g(r = 1 | y1, y2) = f1(y1, y2) p(r = 1)
  f(y1, y2) g(r = 0 | y1, y2) = f0(y1, y2) p(r = 0)

  or, writing g(y1, y2) for g(r = 1 | y1, y2) and p for p(r = 1):

  f(y1, y2) g(y1, y2) = f1(y1, y2) p
  f(y1, y2) [1 − g(y1, y2)] = f0(y1, y2) [1 − p]

  of which the ratio yields:

  f0(y1, y2) = [ (1 − g(y1, y2)) / g(y1, y2) ] · [ p / (1 − p) ] · f1(y1, y2)

• The right-hand side is identifiable

• The left-hand side is not. . .

16.15 Pattern-Mixture Models

f(Yi, Di | θ, ψ) = f(Yi | Di, θ) f(Di | ψ).

• Natural parameters of selection models and pattern-mixture models have different meanings.

  + SeM: useful framework for missing data processes.
  − SeM: MNAR ⇒ untestable assumptions.
  + PMM: identifiable parts are unambiguous.

• Little (JASA 1993) suggests the use of identifying relationships.

16.16 Pattern-Mixture Models

• Cohen and Cohen (1983)


• Muthén, Kaplan, and Hollis (Psychometrika 1987)
• Allison (Soc Method 1987)
• McArdle and Hamagani (Experimental Aging
Research 1992)
• Little (JASA 1993, Bka 1994, JASA 1995)
• Little and Wang (Bcs 1996)
• Hedeker and Gibbons (Psychological Methods 1997)
• Hogan and Laird (SiM 1997)
• Ekholm and Skinner (AppStat 1998)
• Molenberghs, Michiels, and Kenward (Biom J 1998)
• Verbeke, Lesaffre, and Spiessens (1998)
• Molenberghs, Michiels, and Lipsitz (CommStat 1999)
• Michiels, Molenberghs, and Lipsitz (Bcs 1999)

16.17 Pattern-Mixture Modeling

• Little (1993, 1994 and 1995):


Defines pattern-mixture models
Clear lack of information ⇒ Fair modeling
Identifying restrictions (CCMV, ...)

• Molenberghs, Michiels, Kenward, and Diggle (Stat Neerl 1998):
Classification possible (cf. selection modeling)
MAR ⇔ ACMV (for monotone patterns)
∗ Selection Models:
f (Di | Yi^o, ψ).

∗ Pattern-mixture Models:
f (Yi1, . . . , Yid|Di = d) = f (Yi1, . . . , Yid|Di > d)

16.18 Estimating Marginal Effects From PMM

• Pattern-membership probabilities:
π1, . . . , πt, . . . , πT .

• The marginal effects:

  βℓ = ∑_{t=1}^{T} βℓt πt ,     ℓ = 1, . . . , g

• Their variance:

  Var(β1, . . . , βg ) = A V Aᵀ

  where

  V = block-diag( Var(β̂t ), Var(π̂t ) )

  and

  A = ∂(β1, . . . , βg ) / ∂(β11, . . . , βgT , π1, . . . , πT )
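This delta-method combination can be sketched numerically for a single effect (g = 1) over three patterns; all numbers below (pattern effects, their variances, sample size) are hypothetical.

```python
import numpy as np

# Delta-method sketch for a pattern-averaged effect:
# beta = sum_t beta_t * pi_t, Var(beta) = A V A', with V block-diagonal
# in Var(beta_t-hat) and Var(pi_t-hat). All numbers are hypothetical.
beta_t = np.array([1.0, 1.5, 2.5])         # pattern-specific effects
pi_t = np.array([0.5, 0.3, 0.2])           # pattern probabilities (sum to 1)

beta = beta_t @ pi_t                       # marginal effect

var_beta_t = np.diag([0.10, 0.20, 0.40])   # assumed Var(beta_t-hat), independent
n = 100                                    # assumed sample size
# Multinomial covariance of the observed pattern proportions:
var_pi = (np.diag(pi_t) - np.outer(pi_t, pi_t)) / n

V = np.block([[var_beta_t, np.zeros((3, 3))],
              [np.zeros((3, 3)), var_pi]])
A = np.concatenate([pi_t, beta_t])         # partials of beta w.r.t. (beta_t, pi_t)
var_beta = A @ V @ A                       # delta-method variance
```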

16.19 Random-Coefficient Models

• So far, selection models and pattern-mixture models did not contain random effects, shared between the measurement and dropout models.

Shared-parameter models

• Often, one can assume a latent variable driving both processes
• Other term: Random-coefficient-based models
(Little 1995)
• versus outcome-based models

16.20 Example

• A latent variable Z (here the random effects αi) drives the individual’s response:

  Yij | α0i, α1i ∼ N (α0i + α1i ti ; σ²)

  and αi = (α0i, α1i)ᵀ satisfies αi ∼ N (α, Φ)

• Corresponding selection model:

  f (y, r | z) = f (y | z) P(r | z)

  and then

  f (y^o, r) = ∫ ∫ f (y | z) P(r | z) f (z) dz dy^m,

  and

  f (y^o, r) = ∫ { ∫ f (y | z) dy^m } P(r | z) f (z) dz
             = ∫ f (y^o | z) P(r | z) f (z) dz.
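The last integral can be approximated by Monte Carlo over the latent variable; the densities below (scalar z ∼ N(0, 1), a normal measurement model, a logistic missingness model) are hypothetical choices for illustration only.

```python
import numpy as np

# Monte Carlo sketch of the shared-parameter factorization
# f(y^o, r) = int f(y^o | z) P(r | z) f(z) dz, with scalar z ~ N(0, 1).
rng = np.random.default_rng(0)

def f_y_given_z(y, z, sigma=1.0):
    """Normal density of y given the latent z (hypothetical model)."""
    return np.exp(-0.5 * ((y - z) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def p_r_given_z(z):
    """P(observed | z), logistic in z (hypothetical model)."""
    return 1.0 / (1.0 + np.exp(-z))

z = rng.standard_normal(200_000)          # draws from f(z)
y_obs = 0.8
# Average of the integrand over the latent draws approximates f(y^o, r = 1):
f_joint = np.mean(f_y_given_z(y_obs, z) * p_r_given_z(z))
```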

16.21 Literature

• Continuous outcomes: Wu and Carroll (Bcs 1988):


– Gaussian random effects model
– PH, logistic, or probit for dropout times
• Joint models for time-to-event (dropout) and
measurements:
– Schluchter (SiM 1992)
– DeGruttola and Tu (Bcs 1994)
– Taylor, Cumberland, and Sy (JASA 1994)
– Tsiatis, DeGruttola, and Wulfsohn (JASA 1995)
– Satten and Longini (App Stat 1996)
– Faucett and Thomas (SiM 1996)
– Wulfsohn and Tsiatis (Bcs 1997)
– Bycott and Taylor (SiM 1998)

16.22 Non-Normal Outcomes

• generalized linear model for outcomes


• generalized linear model for dropout process
• shared random effects between them
• Literature:
– Wu and Bailey (SiM 1988)
– Wu and Bailey (Bcs 1989)
– Mori, Woolson, and Woodworth (Bcs 1994)
– Follmann and Wu (Bcs 1995)
– Pulkstenis et al (JASA 1998)
– Ten Have et al (Bcs 1998)
– Albert and Follmann (Bcs 2000)

16.23 Pros and Cons

• Easier to handle intermittent missingness
• May be viewed as a natural framework for the genesis of the data
• Share computational complexity with outcome-based selection models
• High dependence of inferences on modelling assumptions
• Modelling assumptions cannot be verified from the data to full satisfaction (cf. the paradox)

16.24 Less Parametric Approaches

• Generalized estimating equations

• Weights to reflect selection/dropout probability

• Literature
– Robins (SiM 1997)
– Robins and Gill (SiM 1997)
– Rotnitzky and Robins (Scand J Stat 1995)
– Rotnitzky and Robins (SiM 1997)
– Robins, Rotnitzky and Zhao (JASA 1995)
– Robins and Rotnitzky (JASA 1995)
– Robins, Rotnitzky, and Scharfstein (JASA 1998)
Chapter 17

Case Study: Analgesic Trial

17.1 Weighted GEE

• Strictly, GEE inference is correct under the MCAR missing data mechanism.
• A way to reduce bias in the parameter estimates when
the mechanism is MAR is to use Weighted GEE
(WGEE).
• References: Robins, Rotnitzky & Zhao (JASA, 1995) and Fitzmaurice, Molenberghs & Lipsitz (JRSSB, 1995).

CHAPTER 17. CASE STUDY: ANALGESIC TRIAL 369

• The idea is to weight each subject’s contribution in the GEEs by the inverse probability that the subject drops out at the time he dropped out.
• This can be calculated as

  P [Di = di ] = { ∏_{k=2}^{di −1} (1 − P [Rik = 0 | Ri2 = . . . = Ri,k−1 = 1]) } × P [Ri,di = 0 | Ri2 = . . . = Ri,di −1 = 1]^{I{di ≤ T }}
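The weight computation can be sketched directly from fitted conditional dropout probabilities; the hazards h[k] below are hypothetical values, not the fitted analgesic-trial model.

```python
# Sketch of inverse-probability weights for WGEE. h[k] is the fitted
# conditional dropout probability P(R_ik = 0 | still in study at k - 1);
# the values below are hypothetical.
def wgee_weight(h, di, T):
    """w_i = 1 / P(D_i = d_i): product of continuation probabilities up to
    occasion d_i - 1, times the dropout probability at d_i when d_i <= T."""
    p = 1.0
    for k in range(2, di):
        p *= 1.0 - h[k]
    if di <= T:                           # actual dropout, not a completer
        p *= h[di]
    return 1.0 / p

h = {2: 0.1, 3: 0.2, 4: 0.15}             # hazards at occasions 2..T, T = 4
w_dropout_at_3 = wgee_weight(h, di=3, T=4)    # P = 0.9 * 0.2
w_completer = wgee_weight(h, di=5, T=4)       # P = 0.9 * 0.8 * 0.85
```

Subjects with unlikely dropout times receive large weights, compensating for the similar subjects who are missing from the estimating equations.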

17.2 Analgesic Trial Example

• The analgesic data show some evidence of MAR.
• A model for the conditional dropout probability P [Di = j|Di ≥ j] shows dependence on previous GSA measurements.
• This model includes previous GSA, baseline PCA, physical functioning, and genetic/congenital disorder.

The GENMOD Procedure

Model Information

Data Set WORK.GSAC


Distribution Binomial
Link Function Logit
Dependent Variable dropout
Observations Used 963
Probability Modeled Pr( dropout = 1 )
Missing Values 15

Class Level Information

Class Levels Values

prevgsa 5 1 2 3 4 5

Response Profile

Ordered Ordered
Level Value Count

1 0 800
2 1 163

Criteria For Assessing Goodness Of Fit



Criterion DF Value Value/DF

Deviance 955 832.4611 0.8717


Scaled Deviance 955 832.4611 0.8717
Pearson Chi-Square 955 967.4301 1.0130
Scaled Pearson X2 955 967.4301 1.0130
Log Likelihood -416.2306

Algorithm converged.

Analysis Of Parameter Estimates

Standard Wald 95% Chi-


Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq

Intercept 1 -1.8043 0.4856 -2.7562 -0.8525 13.80 0.0002


prevgsa 1 1 -1.0183 0.4131 -1.8278 -0.2087 6.08 0.0137
prevgsa 2 1 -1.0374 0.3772 -1.7767 -0.2980 7.56 0.0060
prevgsa 3 1 -1.3439 0.3716 -2.0721 -0.6156 13.08 0.0003
prevgsa 4 1 -0.2636 0.3828 -1.0140 0.4867 0.47 0.4910
prevgsa 5 0 0.0000 0.0000 0.0000 0.0000 . .
pca0 1 0.2542 0.0986 0.0609 0.4474 6.65 0.0099
PHYSFCT 1 0.0090 0.0038 0.0015 0.0165 5.54 0.0186
gendis 1 0.5863 0.2420 0.1120 1.0607 5.87 0.0154
Scale 0 1.0000 0.0000 1.0000 1.0000

• PROC GENMOD code to fit WGEE (assuming the variable wi is the inverse of the probability that subject i drops out at time di):

proc genmod data=repbin.gsaw;


scwgt wi;
class patid timecls;
model gsabin = time|time pca0 / dist=b;
repeated subject=patid / type=un corrw within=timecls;
run;

  Variable        GEE              WGEE
  Intercept        2.950 (0.465)    2.166 (0.694)
  Time            -0.842 (0.325)   -0.437 (0.443)
  Time²            0.181 (0.066)    0.120 (0.089)
  Baseline PCA    -0.244 (0.097)   -0.159 (0.130)

  Parameter estimates and standard errors (empirical).
  Working correlation structure is UN.

17.2.1 Estimated working correlation structures:

            GEE                              WGEE

  [ 1  0.173  0.246  0.201 ]      [ 1  0.215  0.253  0.167 ]
  [     1     0.177  0.113 ]      [     1     0.196  0.113 ]
  [            1     0.456 ]      [            1     0.409 ]
  [                   1    ]      [                   1    ]
Chapter 18

PROC NLMIXED

18.1 Features

• New SAS (V.7 and later) procedure to fit nonlinear mixed models (i.e., models in which both fixed and random effects can enter nonlinearly).
• Relies on likelihood inference; that is, PROC NLMIXED maximizes an approximation (by numerical integration) to the likelihood integrated over the random effects.

CHAPTER 18. PROC NLMIXED 371

18.2 Particularities

• Different integral approximations are available, the principal one being (adaptive) Gaussian quadrature.
• Different optimization algorithms are available to
carry out the maximization of the likelihood.
• Constraints on parameters are also allowed in the
optimization process.
• The conditional distribution (given the random
effects) can be specified as Normal, Binomial,
Poisson, or as any distribution for which you can
specify the likelihood by programming statements.
• E-B estimates of the random effects can be obtained.

18.3 Limitations

• Only one RANDOM statement can be specified (i.e., it can handle 2-level models only).
• Only normal random effects are allowed (though this
is probably the most commonly used choice).
• Does not calculate automatic initial values.
• Make sure your data set is sorted by cluster ID !!!
• No missing values should be left in the (dependent or
independent) variables.

18.4 MIXOR
• Program in the public domain, specifically designed for mixed-effects ordinal regression analysis. The program can be downloaded at http://www.uic.edu/~hedeker/mixreg.html

• Performs numerical integration (Gaussian quadrature) and uses the Newton-Raphson algorithm to maximize the marginal likelihood.

• Differences/similarities with PROC NLMIXED:


– PROC NLMIXED can perform Gaussian
quadrature by using the options NOAD and
NOADSCALE. The number of quadrature points
can be specified with the option QPOINTS=m.
– PROC NLMIXED can maximize the marginal
likelihood using the Newton-Raphson algorithm by
specifying the option TECHNIQUE=NEWRAP.
– When comparing the output from both programs,
there will be some discrepancies in the standard
errors of the parameters. This is because MIXOR
uses an approximation to the (empirical)
information matrix, whereas PROC NLMIXED
uses numerical derivatives.
Chapter 19

Introduction to Multilevel Modeling

19.1 Introduction

• Data with a hierarchical or clustered structure


• Some examples of hierarchies:
– Longitudinal data
level 1: occasions, level 2: subjects
– Teratologic data
level 1: offspring, level 2: litters
– Education sciences
level 1: students, level 2: classrooms, level 3 : schools

CHAPTER 19. INTRODUCTION TO MULTILEVEL MODELING 375

• Units in a cluster are more alike than units from other clusters (correlation) ⇒ inference is misleading if the hierarchical structure is ignored.
• Multilevel modeling accounts for clustering through the sharing of latent variables (random effects).
• These latent variables can be introduced at any level in the hierarchy.
• Some references:
– Random Coefficient Models, Longford N. (1993)
– Multilevel Statistical Models, Goldstein H. (1995)
– Introducing Multilevel Modeling, Kreft I. & De Leeuw J. (1998)
– Multilevel Analysis, Snijders T. & Bosker R. (2000)

• On the web, see the Multilevel Models Project page at


http://www.ioe.ac.uk/multilevel

19.2 Multilevel Model Formulation

• We illustrate the general structure using a 3-level model:

  Y = Xβ + Z(3)v + Z(2)u + Z(1)e,

  or

  Yijk = Xijk β + ∑_{h=0}^{q3} Z(3)hijk vhi + ∑_{h=0}^{q2} Z(2)hijk uhij + ∑_{h=0}^{q1} Z(1)hijk ehijk ,

• where:
  – Ωe = cov[eijk ],
  – Ωu = cov[uij ],
  – Ωv = cov[vi ].

Assumptions:
  – Level 1 residuals are independent across level 1 units and are N (0, V3(1)), where V3(1) is diagonal with elements σ²e,ijk = Z(1)ᵀijk Ωe Z(1)ijk .
  – Level 2 residuals are independent across level 2 units and are N (0, V3(2)), where V3(2) is block-diagonal with blocks V3(2)ij = Z(2)ᵀijk Ωu Z(2)ijk .
  – Level 3 residuals are independent across level 3 units and are N (0, V3(3)), where V3(3) is block-diagonal with blocks V3(3)i = Z(3)ᵀijk Ωv Z(3)ijk .

• Thus, cov[Y ] is block-diagonal with ith block given by:

  V3i = V3(3)i + ∑_j V3(2)ij + ∑_{j,k} σ²e,ijk .

19.3 Estimation Procedure

• Multilevel data sets are typically big ⇒ need for efficient estimation methods!
• Iterative Generalized Least Squares (IGLS) algorithm:

19.3.1 IGLS Procedure

• Suppose we know the value of all (co)variance parameters. Then the usual GLS estimation procedure can be applied to estimate the fixed coefficients:

  β̂ = (Xᵀ V⁻¹ X)⁻¹ Xᵀ V⁻¹ Y.

• Suppose we know the value of β. We can form the cross-product matrix of residuals Ỹ Ỹ ᵀ, with Ỹ = Y − Xβ.
• If Y∗ denotes Ỹ Ỹ ᵀ, we have E[Y∗] = V.
• If Y∗∗ = Ỹ ⊗ Ỹ = vec(Ỹ Ỹ ᵀ), then E[Y∗∗] can be written as Z∗θ, where θ comprises all (co)variance parameters and Z∗ is a suitable design matrix.
• The vector θ can then be estimated using standard GLS estimation:

  θ̂ = (Z∗ᵀ V∗⁻¹ Z∗)⁻¹ Z∗ᵀ V∗⁻¹ Y∗∗,     with V∗ = V ⊗ V.
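The GLS step for the fixed effects can be sketched with numpy on toy data (all values hypothetical); the equivalence with ordinary least squares on whitened data serves as an internal check.

```python
import numpy as np

# One GLS step: given V, estimate beta by (X' V^{-1} X)^{-1} X' V^{-1} Y.
# The design, true coefficients, and covariance below are all hypothetical.
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.uniform(size=n)])
beta_true = np.array([2.0, -1.0])
V = 0.5 * np.eye(n) + 0.2                 # compound-symmetric covariance
L = np.linalg.cholesky(V)                 # V = L L'
Y = X @ beta_true + L @ rng.standard_normal(n)   # correlated errors

Vinv = np.linalg.inv(V)
beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ Y)
```

Whitening with L⁻¹ and running OLS on the transformed data gives the same estimate, which is a convenient way to validate a GLS implementation.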

19.3.2 Remarks

• Starting with initial values (OLS estimates for β), the IGLS algorithm alternates between the random and fixed parameter estimation until the procedure converges.
• Note:
  – The IGLS algorithm converges to ML estimates.
  – The IGLS algorithm can be modified to mimic REML estimation (by taking the sampling variation of β̂ into account).

19.4 Illustration of the IGLS Algorithm

Consider the following simple 2-level model:

  yij = β0 + β1 xij + u0i + e0ij .

• β̂ = (Xᵀ V⁻¹ X)⁻¹ Xᵀ V⁻¹ Y, with

  X = [ 1  x11  ]        Y = [ y11  ]
      [ 1  x12  ]            [ y12  ]
      [ ⋮   ⋮   ]            [  ⋮   ]
      [ 1  xmnm ]            [ ymnm ]

• Calculate ỹij = yij − β̂0 − β̂1 xij .
• We can write:

  [ ỹ²11    ]   [ σ²u0 + σ²e0 ]            [ 1 ]         [ 1 ]
  [ ỹ11 ỹ12 ] = [ σ²u0        ] + R = σ²u0 [ 1 ] + σ²e0 [ 0 ] + R
  [ ⋮        ]   [ ⋮           ]            [ ⋮ ]         [ ⋮ ]
  [ ỹ²mnm   ]   [ σ²u0 + σ²e0 ]            [ 1 ]         [ 1 ]

  where R is a residual vector.

• θ̂ = (Z∗ᵀ V∗⁻¹ Z∗)⁻¹ Z∗ᵀ V∗⁻¹ Y∗∗, with

  Z∗ = [ 1 1 ]        Y∗∗ = [ ỹ²11    ]
       [ 1 0 ]              [ ỹ11 ỹ12 ]
       [ ⋮ ⋮ ]              [ ⋮        ]
       [ 1 1 ]              [ ỹ²mnm   ]
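A moment-based sketch of this variance step (with β treated as known and the GLS weight matrix V∗ replaced by the identity, a simplification) recovers the two components from squared and cross-product residuals; the true variance components below are hypothetical.

```python
import numpy as np

# Moment sketch of the IGLS variance step for the toy 2-level model:
# E[y~_ij y~_ik] = sigma_u0^2 + sigma_e0^2 if j = k, and sigma_u0^2 otherwise.
# Simulated residuals with hypothetical variance components.
rng = np.random.default_rng(2)
m, n = 200, 5                                    # clusters, cluster size
su2, se2 = 1.0, 0.5                              # true sigma_u0^2, sigma_e0^2
u = rng.normal(0.0, np.sqrt(su2), size=(m, 1))   # level-2 residuals
resid = u + rng.normal(0.0, np.sqrt(se2), size=(m, n))   # y~_ij

diag_mean = np.mean(resid ** 2)                  # estimates su2 + se2
row = resid.sum(axis=1)
# Average of the n*(n-1) off-diagonal products within each cluster:
off_mean = np.mean((row ** 2 - (resid ** 2).sum(axis=1)) / (n * (n - 1)))
su2_hat = off_mean                               # estimates sigma_u0^2
se2_hat = diag_mean - off_mean                   # estimates sigma_e0^2
```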

19.5 Example

• GROWTH data set (2-level model)
• Model fit with MLwiN (see the MLwiN homepage at http://www.ioe.ac.uk/mlwin)
• Data:

• Covariates:
– SEX: xi = 1 for boys and 0 for girls
– AGE: 8, 10, 12, and 14

• Model:

  Yij = β0 + β01 xi + β10 tj (1 − xi) + β11 tj xi + b0i + b1i tj + εij ,

  Number   Effect        Fixed   Level 2   Level 1
  0        Intercept     β0      b0i       εij
  1        Male          β01
  2        Female∗Age    β10
  3        Male∗Age      β11
  4        Age                   b1i

• Predicted means:

• Predicted (E-B) individual profiles:



19.6 Multilevel Models for Discrete Response Data

• Model formulation (binary data):
  – logit(πij ) = Xijᵀ β + Zijᵀ ui
  – yij = πij + eij vij ,   vij = [πij (1 − πij )/nij ]^{1/2}
  – Take eij such that yij ∼ Bin(πij , nij ), with var(eij ) = 1.
• Parameter estimation:
– ML will be hardly feasible in general
⇒ approximate methods have been proposed to
avoid numerical integration
– Some references:
∗ MQL: Goldstein (1991)
∗ PQL: Breslow & Clayton (1993), Wolfinger & O’Connell (1993)
∗ PQL2: Goldstein & Rasbash (1996)

19.7 MQL/PQL Procedure

• Use a first-order Taylor expansion of the fixed and random parts of the mean function about Ht :
  – MQL: Ht = Xijᵀ βt (fixed part predictor)
    µij = f (Ht ) + Xijᵀ (β − βt ) f ′(Ht ) + Zijᵀ ui f ′(Ht )
  – PQL: Ht = Xijᵀ βt + Zijᵀ ûi (current predicted value)
    µij = f (Ht ) + Xijᵀ (β − βt ) f ′(Ht ) + Zijᵀ (ui − ûi ) f ′(Ht )
• Rewrite as a linear model:

  Yij − f (Ht ) + f ′(Ht ) Xijᵀ βt [ + f ′(Ht ) Zijᵀ ûi ] = f ′(Ht ) Xijᵀ β + f ′(Ht ) Zijᵀ ui + eij vij .

• Update the fixed and random parameters as in the


IGLS algorithm.
• Iterate until convergence.
• MQL2/PQL2 procedures: further add second-order
terms in the above Taylor expansion.

• Pros:
– The algorithm is quick and efficient (compared to
ML).
– Allows one to estimate an overdispersion parameter (since the algorithm iteratively fits linear models). Just write

  yij = πij + eij vij ,   vij = [πij (1 − πij )/nij ]^{1/2}

  with var(eij ) = σ²e.
• Cons:
– MQL/PQL gives biased estimates (downward),
mostly for variance parameters !!
– The bias is worst for binary data
– Bias increases with increasing variance components
– Bias increases with decreasing cluster size
– PQL2 less biased
– Convergence problems are common (especially
with PQL2 procedure).

Note

• PQL/MQL can be fitted by the SAS macro GLIMMIX.
• MQL, PQL, MQL2 and PQL2 can be fitted in the
MLwiN package.
Chapter 20

The Use of SPlus

20.1 Fitting Mixed Models Using SPlus


SPlus provides various ways to estimate mixed models. On the one hand, the built-in function lme() for linear mixed-effects models can be used. Note that there is a companion function for nonlinear mixed-effects models, nlme(). These functions are based on work by Lindstrom and Bates (1988), Laird and Ware (1982), Box, Jenkins, and Reinsel (1994), and Davidian and Giltinan (1995).

Figure 20.1: Growth Data. Predicted individual profiles.

We will use the growth data to illustrate the built-in function lme(). SPlus Version 4.5 is used.
Apart from the references mentioned earlier which give the theoretical underpinning, there is

CHAPTER 20. THE USE OF SPLUS 391

ample documentation within SPlus. The on-line manual provides a 53-page discussion of linear
and nonlinear mixed-effects models. The function lme() is generic. The on-line help system of
SPlus provides a brief account of the syntax of this generic function. Methods functions are
being developed for specific classes of objects. The methods function lme.formula() comes
with ample documentation.

Let us discuss the main arguments:

Fixed effects. The structure is specified by means of the fixed argument, using standard
formulas.

Random effects. The random-effects structure is specified through random. Additional arguments to tune the random-effects model are re.block (describing the blocking structure), re.structure (specifying the form of the D matrix), and re.paramtr (specifying how the D matrix is internally parameterized). The latter argument is included to improve numerical stability and to ensure that the resulting D matrix is positive definite. Values of this argument refer to the Cholesky decomposition, the matrix logarithm, and several others.

Serial correlation. This structure is defined by means of the argument serial.structure. In the case that a serial correlation structure depending on time is assumed, the arguments serial.covariate and serial.covariate.transformation can be used to specify this aspect of the serial process.

Residual variance. The residual variance function is defined by means of var.function. Fine-tuning can be done using var.covariate and var.estimate (indicating whether the variance parameters are to be estimated or to be kept fixed at their initial values).

Clusters. The clusters (subjects, units, etc.) are defined using cluster.

Method of estimation. Both maximum likelihood and REML are provided. The user’s
preference can be specified by means of the argument est.method.

Other tools include subsetting, specifying the action to be undertaken on missing data, and
control over the estimation algorithm.

Let us apply the function lme.formula() to fit Model 6 to the growth data.

The following program can be used.

my.lme <- lme.formula(
          fixed = MEASURE ~ 1 + MALE + MALEAGE + FEMAGE,
          random = ~ 1 + AGE,
          cluster = ~ IDNR,
          data = growth5.df,
          re.structure = "unstructured",
          na.action = "na.omit",
          est.method = "ML")

Printing the object my.lme produces

Call:
Fixed: MEASURE ~ 1 + MALE + MALEAGE + FEMAGE
Random: ~ 1 + AGE
Cluster: ~ (IDNR)
Data: growth5.df

Variance/Covariance Components Estimate(s):

Structure: unstructured
Parametrization: matrixlog
Standard Deviation(s) of Random Effect(s)
(Intercept) AGE
2.134752 0.1541473
Correlation of Random Effects
(Intercept)
AGE -0.6025632

Cluster Residual Variance: 1.716206

Fixed Effects Estimate(s):


(Intercept) MALE MALEAGE FEMAGE
17.37273 -1.032102 0.784375 0.4795455

Number of Observations: 108


Number of Clusters: 27

Although the above output is rather brief, one can obtain a more extensive summary:

> my.lme.2 <- summary(my.lme)
> my.lme.2

Call:
Fixed: MEASURE ~ 1 + MALE + MALEAGE + FEMAGE

Random: ~ 1 + AGE
Cluster: ~ (IDNR)
Data: growth5.df

Estimation Method: ML
Convergence at iteration: 6
Log-likelihood: -213.903
AIC: 443.806
BIC: 465.263

Variance/Covariance Components Estimate(s):


Structure: unstructured
Parametrization: matrixlog
Standard Deviation(s) of Random Effect(s)
(Intercept) AGE
2.134752 0.1541473
Correlation of Random Effects
(Intercept)
AGE -0.6025632

Cluster Residual Variance: 1.716206

Fixed Effects Estimate(s):


Value Approx. Std.Error z ratio(C)
(Intercept) 17.3727273 1.18203467 14.6973077
MALE -1.0321023 1.53550808 -0.6721568
MALEAGE 0.7843750 0.08275405 9.4783886
FEMAGE 0.4795455 0.09980513 4.8048175

Conditional Correlation(s) of Fixed Effects Estimates


(Intercept) MALE MALEAGE
MALE -7.698004e-001
MALEAGE 6.198039e-016 -5.617972e-001
FEMAGE -8.801671e-001 6.775530e-001 -1.691642e-016

Random Effects (Conditional Modes):


(Intercept) AGE
1 -0.68278894 -0.039972872
2 -0.45926352 0.071886460
3 -0.03109489 0.093020178
4 1.61182535 0.030832363
5 0.43850471 -0.043000835
.....
25 0.50935427 -0.055453935
26 -0.10573027 0.083999487
27 -0.89462307 -0.076992100

Standardized Population-Average Residuals:


Min Q1 Med Q3 Max
-3.335979 -0.4153858 0.01039114 0.4916851 3.858188

Number of Observations: 108


Number of Clusters: 27

The estimates and standard errors coincide with those obtained with, for example, MLwiN. This
is immediately clear for the fixed-effects estimates, their standard errors, and the residual
variance. The components of the D matrix have to be derived from the standard deviations and
correlation of the random effects:

d11 = 2.134752² = 4.557,
d12 = (−0.6025632)(2.134752)(0.1541473) = −0.198,
d22 = 0.1541473² = 0.024.
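These back-calculations take only a few lines to verify (numbers taken from the lme() output above):

```python
# Recovering the elements of D from the reported standard deviations and
# correlation of the random effects in the lme() output.
sd_int, sd_age, corr = 2.134752, 0.1541473, -0.6025632

d11 = sd_int ** 2                  # variance of the random intercept
d12 = corr * sd_int * sd_age       # intercept-slope covariance
d22 = sd_age ** 2                  # variance of the random slope
```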

As is the case with MLwiN, SPlus in general, and lme() in particular, have extensive graphical
capabilities.
