Survival Data Analysis HW3
Group members:
Melvin Estolano (2159122)
Quynh Long Khuong (2159280)
Jessa Mae Lastimoso (2159120)
Jose Carlos Cortiñas Porras (2055631)
Lecturer:
Prof. Dr. Tomasz Burzykowski
Group 4: Survival Data Analysis Homework 3
1 Introduction
The objective of this analysis is to use the accelerated failure time (AFT) model to investigate the effect
of adding Indinavir (IDV) to Zidovudine (ZDV) and Lamivudine (LAM) on the AIDS-free survival time
of patients.
2 Methodology
To answer the research question, the accelerated failure time (AFT) model was used to estimate the
treatment effect on AIDS-free survival time. In this model, the covariates have a multiplicative
effect on the survival time. Several parametric models were compared: the Generalized Gamma,
Weibull, Exponential, Log-normal, Log-logistic, and Generalized F models. The comparison was based
on the likelihood ratio test (LRT). The plausible models were then assessed through diagnostic
checks using Cox-Snell residuals.
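The multiplicative effect mentioned above can be made explicit. The following is a standard textbook formulation of the AFT model, written with the same symbols used for the final model in this report ($\mu$, $\beta$, $\sigma$, $\varepsilon$); $x_i$ denotes a generic covariate vector and $S_0$ a baseline survival function, both introduced here only for illustration.

```latex
\begin{align*}
\ln T_i &= \mu + x_i'\beta + \sigma \cdot \varepsilon_i
  \quad\Longleftrightarrow\quad
  T_i = e^{\mu}\, e^{x_i'\beta}\, e^{\sigma \varepsilon_i}, \\
S(t \mid x_i) &= S_0\!\left(t\, e^{-x_i'\beta}\right), \\
\frac{T \mid (x_j + 1)}{T \mid x_j} &= e^{\beta_j}
  \quad \text{(time ratio for a one-unit increase in covariate } j\text{)}.
\end{align*}
```

A time ratio above 1 therefore means that the covariate stretches the time scale, that is, it prolongs survival.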
Since patients were randomly assigned to the treatment and control groups, we would not expect the
treatment effect to be biased when covariates are omitted. However, to minimize bias in the scale parameter of the
residual errors, we included potential predictors of AIDS-free survival time: age, sex, race,
hemophilia diagnosis, Karnofsky performance scale, baseline CD4 cell count, and duration of ZDV use.
The full models were then simplified with a backward selection procedure to obtain the most parsimonious
model. The final model was reached when no further covariate could be removed.
$T \sim GF(\beta, \sigma, m_1, m_2), \quad \varepsilon \sim f_\varepsilon(w)$
Candidate models were identified by comparing each model to the Generalized F model with the
likelihood ratio test (LRT). Table 1 summarizes the LRT results for the 6 models. The Generalized Gamma,
Weibull, Exponential, Log-normal, and Log-logistic models were all retained for the initial analysis
because all p-values were non-significant.
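As a minimal sketch of how one such comparison works, the LRT statistic is twice the difference in maximized log-likelihoods, referred to a chi-square distribution with degrees of freedom equal to the number of extra parameters in the larger model. The helper name `lrt_pvalue` and the log-likelihood values below are hypothetical and are not taken from Table 1.

```r
# Hypothetical helper: p-value of a likelihood ratio test for nested models.
# loglik_full    : maximized log-likelihood of the larger model (e.g. Generalized F)
# loglik_reduced : maximized log-likelihood of the nested model (e.g. Weibull)
# df             : number of additional parameters in the larger model
lrt_pvalue <- function(loglik_full, loglik_reduced, df) {
  stat <- 2 * (loglik_full - loglik_reduced)
  pchisq(stat, df = df, lower.tail = FALSE)
}

# Made-up log-likelihoods, for illustration only
p_value <- lrt_pvalue(loglik_full = -100, loglik_reduced = -103, df = 2)
print(p_value)
```

A non-significant p-value means the simpler model does not fit significantly worse than the Generalized F model, which is the criterion used above to retain a candidate.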
All five candidate models were fitted and their assumptions were checked. We checked the assumptions of these
AFT models by plotting the Kaplan-Meier estimate of the survival function of the standardized residuals
against the residuals (Figure 1). Figure 1 shows that the Weibull and log-logistic models fitted the
data equally well: the model-based survival curves were almost identical to the Kaplan-Meier
survival curve of the standardized residuals. In contrast, the exponential and log-normal models
did not fit the data well, as the two survival curves deviated substantially from each other.
Similar results can be seen in the plot of the cumulative hazard of the residuals against the Cox-Snell
residuals (Figure 2). There are some deviations at the ends, especially for the log-normal and exponential
models. The Weibull and log-logistic models seem to fit the data well, as most of the points lie on a straight line
with zero intercept and unit slope. Taken together, we considered that both the Weibull and log-logistic models
could be used. For convenience, we chose the Weibull distribution for further analyses.
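The Cox-Snell check described above can be sketched as follows. This is not the code used for Figure 2: it is an illustrative example that assumes the `survival` package and uses its built-in `lung` data set as a stand-in for the study data, with `age` and `sex` as arbitrary covariates.

```r
library(survival)

# Stand-in data: survival::lung (status: 1 = censored, 2 = dead)
fit <- survreg(Surv(time, status == 2) ~ age + sex, data = lung, dist = "weibull")

# Cox-Snell residuals for a Weibull AFT model:
# r_i = -log S(t_i | x_i) = exp((log t_i - x_i' beta) / sigma)
lp   <- predict(fit, type = "lp")
r_cs <- exp((log(lung$time) - lp) / fit$scale)

# Cumulative hazard (Nelson-Aalen) of the residuals, treated as censored data
na_fit <- survfit(Surv(r_cs, lung$status == 2) ~ 1)

# If the model fits, the points should follow the line with intercept 0 and slope 1
plot(na_fit$time, na_fit$cumhaz,
     xlab = "Cox-Snell residual", ylab = "Estimated cumulative hazard")
abline(0, 1, lty = 2)
```

The residuals behave like a censored sample from a unit exponential distribution when the model is adequate, which is why the reference line has zero intercept and unit slope.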
$$\ln T_i = \mu + \beta_1\,\mathrm{Treatment}_i + \sum_{a=2}^{4} \beta_{2a}\,\mathrm{Karnof}_{ai} + \beta_3\,\mathrm{CD4}_i + \sigma \cdot \varepsilon_i$$
Parameter estimates for the final Weibull AFT model are shown in Table 2. Compared to HIV patients
in the control group, HIV patients in the experimental group had an AIDS-free survival time that was on
average 2.29 times as long (95% CI: 1.32 - 3.98). We also found that patients with higher Karnofsky performance
scores had longer AIDS-free survival times. Specifically, compared to patients with a score of 70, patients
with scores of 90 and 100 had AIDS-free survival times 4.32 and 7.52 times as long, respectively. Regarding the CD4 cell count,
each one-cell increase in the baseline count prolonged the AIDS-free survival time by an average of 2%.
Since the AFT model was fitted with a Weibull distribution, it can be converted into a proportional hazards (PH) model with
$\ln \lambda_T(t) = \ln \lambda_{T0}(t) - p X'\beta$, where $p = 1/\sigma$ is the Weibull shape parameter, so that the hazard ratios are $e^{-p\beta}$. The fitted model implies
a 48% lower hazard for patients in the treatment group than for patients in the control group.
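The conversion from a time ratio to a hazard ratio can be sketched in a few lines. The helper name `aft_to_hr` is hypothetical, and the scale value used in the first call is an assumption chosen to show the exponential special case ($\sigma = 1$), not the fitted scale from Table 2.

```r
# Hypothetical helper: hazard ratio implied by a Weibull AFT coefficient.
# beta  : AFT coefficient (log time ratio)
# sigma : AFT scale parameter; the Weibull shape is p = 1 / sigma
aft_to_hr <- function(beta, sigma) {
  p <- 1 / sigma
  exp(-p * beta)
}

# Special case sigma = 1 (exponential model): the hazard ratio is 1 / time ratio
hr_exponential <- aft_to_hr(beta = log(2.29), sigma = 1)

# Back-of-the-envelope shape implied by the rounded figures in the text
# (time ratio 2.29, hazard ratio 1 - 0.48 = 0.52); approximate because of rounding
implied_p <- -log(0.52) / log(2.29)
print(c(hr_exponential = hr_exponential, implied_p = implied_p))
```

Because the shape parameter rescales the coefficient, a time ratio and its hazard ratio are reciprocals only in the exponential case.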
4 Conclusion
In this analysis, we used the AFT model to evaluate the effect of adding IDV to ZDV+LAM on
the AIDS-free survival time of HIV-positive patients. The Weibull model was selected because it
fitted the data well. We found that the 3-drug regimen prolonged the time to AIDS
progression in HIV-infected patients by a factor of 2.29 compared to the 2-drug regimen
(95% CI: 1.32 - 3.98).
5 Appendix
R code (selected)
library(survival)  # Surv(), survreg()
library(flexsurv)  # flexsurvreg()
library(rms)       # psm(), npsurv(), survplot()

# ---------- Check distribution of survival time
# Generalized F (mu, sigma, Q, P)
AFT_GF <- flexsurvreg(Surv(time, censor) ~ idv_f + age + sex_f + race_f + ivdrug_f +
    hemo_f + karnof_f + zdv_0 + cd4_0, data = df, dist = "genf",
    control = list(fnscale = 2500))
# Weibull (Q = 1, P = 0)
AFT_WB <- flexsurvreg(Surv(time, censor) ~ idv_f + age + sex_f + race_f + ivdrug_f +
    hemo_f + karnof_f + zdv_0 + cd4_0, data = df, dist = "weibull")
# p-value for LRT: compare maximized log-likelihoods (2 extra parameters in the Generalized F)
1 - pchisq(2 * (AFT_GF$loglik - AFT_WB$loglik), df = 2)
# ---------- Checking assumption (KM survival function of residuals vs. residuals)
# Weibull; psm(), npsurv() and survplot() come from the rms package
aids_residW <- psm(Surv(time, censor) ~ idv_f + age + sex_f + race_f + ivdrug_f +
    hemo_f + karnof_f + zdv_0 + cd4_0, data = df, dist = "weibull", y = TRUE)
res.W <- resid(aids_residW, type = "cens")
survplot(npsurv(res.W ~ 1), conf = "none", ylab = "Survival probability (Weibull)",
    xlab = "Residual", col = "red")
lines(res.W, lty = 2, lwd = 2)
# ---------- Final model
AFT_W_reduce6 <- survreg(Surv(time, censor) ~ idv_f + karnof_f + cd4_0,
    data = df, dist = "weibull")
summary(AFT_W_reduce6)
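Time ratios such as those reported in the results can be obtained by exponentiating the `survreg` coefficients. The sketch below is illustrative only: it assumes the `survival` package and uses its built-in `lung` data with arbitrary covariates rather than the study data, and it builds Wald-type 95% confidence intervals on the log scale.

```r
library(survival)

# Stand-in model on survival::lung (status: 1 = censored, 2 = dead)
fit <- survreg(Surv(time, status == 2) ~ age + sex, data = lung, dist = "weibull")

est <- coef(fit)
# fit$var holds the regression coefficients first, then log(scale)
se  <- sqrt(diag(fit$var))[seq_along(est)]
z   <- qnorm(0.975)

# Time ratios with Wald 95% confidence limits
tr <- cbind(TR    = exp(est),
            lower = exp(est - z * se),
            upper = exp(est + z * se))
print(round(tr, 3))
```

The intercept row is the exponentiated baseline location rather than a time ratio, so only the covariate rows should be read as multiplicative effects on survival time.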