Hrqol
Abstract
Background: Health-related quality of life (HRQOL) is a multi-dimensional concept commonly used to examine the
impact of health status on quality of life. HRQOL is often measured by four core questions that ask about general
health status and number of unhealthy days in the Behavioral Risk Factor Surveillance System (BRFSS). Use of these
measures individually, however, may not provide a cohesive picture of overall HRQOL. To address this concern, this
study developed and tested a method for combining these four measures into a summary score.
Methods: Exploratory and confirmatory factor analyses were performed using BRFSS 2013 data to determine potential
numerical relationships among the four HRQOL items. We also examined the stability of our proposed one-factor
model over time by using BRFSS 2001–2010 and BRFSS 2011–2013 data sets.
Results: Both exploratory factor analysis and goodness of fit tests supported the notion that one summary factor
could capture overall HRQOL. Confirmatory factor analysis indicated acceptable goodness of fit of this model. The
predicted factor score showed good validity with all of the four HRQOL items. In addition, use of the one-factor
model showed stability, with no changes being detected from 2001 to 2013.
Conclusion: Instead of using four individual items to measure HRQOL, it is feasible to study overall HRQOL via
factor analysis with one underlying construct. The resulting summary score of HRQOL may be used for health
evaluation, subgroup comparison, trend monitoring, and risk factor identification.
Keywords: Health-related quality of life, Summary score, Factor analysis
Background
Health-related quality of life (HRQOL) is a useful indicator of overall health because it captures information on
the physical and mental health status of individuals, and on the impact of health status on quality of life [1, 2].
HRQOL is usually assessed via multiple indicators of self-perceived health status and physical and emotional
functioning. Together, these measures provide a comprehensive assessment of the burden of preventable diseases,
injuries, and disabilities [3].
To assess and measure HRQOL at the state and national levels, the Centers for Disease Control and Prevention
(CDC) developed a set of four “core” questions (CDC HRQOL-4): (1) Would you say that in general your health
is excellent, very good, good, fair, or poor? (2) Now thinking about your physical health, which includes physical
illness and injury, for how many days during the past 30 days was your physical health not good? (3) Now
thinking about your mental health, which includes stress, depression, and problems with emotions, for how many
days during the past 30 days was your mental health not good? (4) During the past 30 days, for about how many
days did poor physical or mental health keep you from doing your usual activities, such as self-care, work, or
recreation? [3–5].
These four items, which have demonstrated good retest reliability, validity, and responsiveness [6–8], have
been included in the Behavioral Risk Factor Surveillance System (BRFSS) in all 50 states since 1993. In addition,
the four items have also been included in other national surveys (e.g., National Health and Nutrition Examination
* Correspondence: wso3@cdc.gov
1 SciMetrika, LLC, 100 Capitola Drive, Durham, NC 27701, USA
Full list of author information is available at the end of the article
© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Yin et al. Population Health Metrics (2016) 14:22 Page 2 of 9
Survey (NHANES), Medicare Health Outcome Survey) and in various chronic disease assessments [7, 9, 10]. The CDC
HRQOL-4 items account for variance similar to that of the Patient-Reported Outcome Measurement Information System
(PROMIS) items (e.g., SF-36) [11–13]. However, the CDC items appear more appropriate for assessing burden of
disease for chronic conditions and are brief and easily interpretable [11].
In 1995, CDC added five additional questions related to quality of life to BRFSS, as part of an optional module.
The new questions asked about days experiencing pain, feeling sad or depressed, feeling worried or anxious, not
getting enough rest, or feeling healthy. However, the optional module was only used in a limited number of states
and years.
To assess HRQOL comprehensively, public health professionals have sought a means to summarize these HRQOL
measures. To combine the information on physically and mentally unhealthy days, some researchers have summed
the two measures in CDC HRQOL-4 to create an Unhealthy Days Index, with the sum of the two items being
truncated at 30 days [3, 14, 15]. This approach assumes an independent relationship between the two kinds of days.
Another approach is to view HRQOL as a latent (hidden) construct that can be quantified through factor analysis.
Factor analysis is a method for detecting relationships among variables, which often reduces the number of
variables. Previous studies found strong associations among the CDC HRQOL-4 questions, suggesting that these
items may be suitable for factor analysis [4]. Toet and colleagues found good internal consistency of the four
measures (the Cronbach’s alpha for the three unhealthy day measures was 0.77; a Cronbach’s alpha of 0.70 or more
is usually considered acceptable [16]) [13]. Horner-Johnson and colleagues, on the other hand, found a relatively
poor consistency between the mentally unhealthy day item and the three other items based on the “Cronbach’s alpha
increase if item removed” test [17]. They compared two alpha values: one based on all items; the other based on
the remaining items after a test item was removed. This analysis relies on the premise that if alpha increases
once the test item is removed, this may indicate poor consistency of the removed item. However, due to the lack
of a clear cutoff value for the increase, it is a somewhat subjective choice to remove a single item measure,
especially for situations in which the increase in the alpha values is minimal. Horner-Johnson and colleagues
found only a very slight increase (e.g., 0.001 when using BRFSS 2002 data), which may not be enough to undermine
the internal consistency of the mentally unhealthy day item with other HRQOL items [17]. Raykov and colleagues
warned that the “Cronbach’s alpha if item removed” test can be misleading for selecting construct components
[18, 19].
Two studies have conducted HRQOL factor analysis using the CDC HRQOL-4 plus the five optional HRQOL module
questions [7, 17]. Using data from BRFSS (2001 and 2002), both studies demonstrated that the nine HRQOL questions
have good internal consistency and could be reduced to two latent factors that correspond to the physical and
mental health aspects of HRQOL. However, data from the optional BRFSS module were only available for a few states
and years, which limits the application of these models in tracking HRQOL over the years or assessing HRQOL at
the national level.
This study proposes a method for creating a summary score of overall HRQOL based solely on CDC HRQOL-4. Public
health professionals could treat such a consolidated score as a “new” variable that could be used to describe
both community and population health, assess health disparities, monitor trends, and identify risk factors of
overall HRQOL at the local and/or national levels. Using the 2013 BRFSS data set, the study assesses whether
there is an underlying latent construct of HRQOL for the general population, and investigates the possibility of
reducing CDC HRQOL-4 to one summary score. It also provides an example of how this type of summary score could
be used in trend analysis using BRFSS 2001–2010 and 2011–2013 data sets.
Methods
Data sources
The BRFSS is a state-based random-digit-dialed telephone health survey system. The survey annually collects data
from non-institutionalized civilian adults (≥18 years of age) about their health-related risk behaviors, chronic
health conditions, and use of preventive services [20]. Starting in 2011, BRFSS changed its weighting methodology
and added cellular telephone users to its samples. Due to these changes, caution should be used when comparing
BRFSS data from before and after 2011 [21]. In our analyses, we included two groups of data sources: BRFSS 2013
data (as an experimental study for factor analysis) and BRFSS 2001–2013 data sets (to assess model stability and
perform trend analysis, one for 2001–2010 data sets and another for 2011–2013 data sets). Data on the four HRQOL
questions were available from all states for every year, except 2002, when data were available from 22 states
only.
Statistical analysis
To study the underlying structure of the CDC HRQOL-4, we conducted Cronbach’s alpha test, exploratory factor
analysis (EFA), and confirmatory factor analysis (CFA) using BRFSS 2013 data. We then assessed the stability of
the resulting model over years, and demonstrated its applications for trend analysis using BRFSS 2001–2010 and
2011–2013 data sets.
To analyze the internal consistency or reliability of the CDC HRQOL-4, we performed Cronbach’s alpha
test (a larger alpha value indicates greater internal
correlation). We used the traditional cutoff value of 0.70 or higher as being acceptable [16]. To reveal construct
dimensions, EFA was used, with factors with an eigenvalue (a number showing how much variance there is for that
underlying factor) larger than or equal to 1.0 being considered acceptable [22]. Principal axis factoring with
orthogonal varimax rotation was used, which can accommodate non-normal data distribution [23].
Based on the results of Cronbach’s alpha test and EFA, we hypothesized that it would be possible to summarize
the CDC HRQOL-4 items by using a single factor. To determine if the model adequately fit the data, we conducted
a goodness of fit test using CFA. We used an asymptotically distribution-free method to account for
non-normality of the data and ordinal data [24]. Five model fit statistics were used to evaluate model fit: root
mean squared error of approximation (RMSEA), comparative fit index (CFI), Tucker-Lewis index (TLI), standardized
root mean squared residual (SRMR), and coefficient of determination (CD). We followed commonly accepted criteria
regarding goodness of fit: RMSEA (≤0.06), CFI and/or TLI (≥0.95), SRMR (≤0.08), and CD close to 1 [25]. Using
one-factor model regression, we generated HRQOL factor score values. To confirm the validity of the HRQOL factor
scores, we compared the mean changes in the HRQOL factor scores with each level of the HRQOL measures.
After establishing the one-factor HRQOL model using BRFSS 2013 data, we assessed model stability over the
years using two data sets: BRFSS 2001–2010 (10 years) and 2011–2013 (3 years). To do so, we conducted a series
of hierarchical tests including factorial configural invariance (similar factor structure across groups), metric
invariance (equivalent factor loadings across groups), and scalar invariance (equivalent intercepts across groups)
[26]. In sequencing these tests (increasing constraints on model parameters), we followed the recommended
criteria, which suggest that the more restrictive nested model be accepted when the decrease in CFI is less than
or equal to 0.01 [27, 28]. Next, HRQOL factor scores for the 13 years were generated by model prediction. Survey
sampling design and weighting were considered in the analyses. The year 2000 US standardized population was
used for age standardization. All analyses were conducted using STATA 13.0 statistical software (College Station,
TX: StataCorp LP).
Results
Factor structure
Using BRFSS 2013 data, we first analyzed the correlation matrix and internal consistency of the CDC HRQOL-4
questions (Table 1). The Cronbach’s alpha value of the CDC HRQOL-4 was 0.76, which was within the acceptable
range [16]. The “alpha change if item removed” test indicated good consistency among items. Removing the
mentally unhealthy day item increased alpha by 1.3 %, which is consistent with Horner-Johnson’s results [6].
EFA (Table 2) showed that a single factor, with an eigenvalue larger than one, explained 99.9 % of the total
variance. Therefore, we propose a one-factor HRQOL model for the CDC HRQOL-4.
Factor model
An initial model with four paths from one factor to the four CDC HRQOL-4 items was first evaluated by CFA.
The four items had factor loadings that ranged from 0.46 to 0.87, larger than the minimal acceptable cutoff
value of ±0.3 [26]. The goodness of fit statistics indicate that the model is acceptable but could be improved
upon (RMSEA = 0.086, CFI = 0.90, TLI = 0.70, SRMR = 0.03, CD = 0.85). To determine whether the model could
be improved, a post-hoc model modification was performed. We found that adding an error correlation path
between the physically unhealthy day item and the mentally unhealthy day item substantially improved the
goodness of fit between model and data. Thus, a final model was proposed (Fig. 1). The minimal factor loading
increased from 0.46 to 0.54. The goodness of fit statistics were also greatly improved (RMSEA = 0.039,
CFI = 0.99, TLI = 0.94, SRMR = 0.01, CD = 0.89).
Factor scores
To quantify the overall HRQOL, weighted factor score values were predicted by the final CFA model. Factor
scores can be considered weighted sum scores (multiplying the score of each item by its factor loading and
then summing the products). Figure 2 shows the distribution of predicted factor scores using BRFSS 2013 data,
with a larger value indicating better quality of HRQOL. The “skewed left” distribution suggests that the majority
of the population is healthy in terms of HRQOL. To check the consistency of HRQOL factor scores with their
original measures, we summarized HRQOL factor scores for each level of CDC HRQOL-4 (Table 3). Either in one year
or across years, the overall means of HRQOL factor scores decrease as the CDC HRQOL-4 ratings become worse for
both male and female adults (we did an analysis stratified by sex, discussed later), indicating the validity of
factor scores in representing HRQOL.
Model stability
To test whether our HRQOL model was stable over time, we examined BRFSS data from 2001 to 2013. Table 4
summarizes the goodness of fit statistics of the model using a series of BRFSS data sets. For all the data sets,
whether the combined (2001–2010 or 2011–2013) or individual years were examined, our HRQOL model exhibited
acceptable goodness of fit (RMSEA = 0.035-0.05, CFI = 0.984-0.99,
Table 1 Correlation matrix and internal consistency of the CDC HRQOL-4 items, BRFSS 2013
Item                           General health  Physically      Mentally        Activity         Cronbach's alpha  Change (%)
                               status          unhealthy days  unhealthy days  limitation days  if item removed
General health status (a)      1                                                                0.733             -3.9
Physically unhealthy days (b)  0.52            1                                                0.651             -14.6
Mentally unhealthy days (c)    0.29            0.35            1                                0.773             +1.3
Activity limitation days (d)   0.43            0.65            0.44            1                0.656             -14
Overall construct                                                                               0.763
(a) Would you say that in general your health is excellent, very good, good, fair, or poor?
(b) Now thinking about your physical health, which includes physical illness and injury, for how many days during
the past 30 days was your physical health not good?
(c) Now thinking about your mental health, which includes stress, depression, and problems with emotions, for how
many days during the past 30 days was your mental health not good?
(d) During the past 30 days, for about how many days did poor physical or mental health keep you from doing your
usual activities, such as self-care, work, or recreation?
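As a rough cross-check of Table 1, the standardized form of Cronbach's alpha can be computed from the printed
inter-item correlations alone, as alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the
mean inter-item correlation. The sketch below is an illustration under stated assumptions, not the authors'
procedure: the source does not say which alpha variant the statistical software reported, the correlations are
used as rounded in the table, and the function names are ours.

```python
from itertools import combinations

# Inter-item correlations as printed in Table 1 (BRFSS 2013).
ITEMS = ["general", "physical", "mental", "activity"]
CORR = {
    frozenset(("general", "physical")): 0.52,
    frozenset(("general", "mental")): 0.29,
    frozenset(("physical", "mental")): 0.35,
    frozenset(("general", "activity")): 0.43,
    frozenset(("physical", "activity")): 0.65,
    frozenset(("mental", "activity")): 0.44,
}


def standardized_alpha(items):
    """Standardized Cronbach's alpha from the mean inter-item correlation."""
    pairs = [frozenset(pair) for pair in combinations(items, 2)]
    mean_r = sum(CORR[pair] for pair in pairs) / len(pairs)
    k = len(items)
    return k * mean_r / (1 + (k - 1) * mean_r)


def alpha_if_removed(item):
    """Alpha recomputed on the three items left after dropping one."""
    return standardized_alpha([i for i in ITEMS if i != item])


overall = standardized_alpha(ITEMS)
print(f"overall alpha: {overall:.3f}")
for item in ITEMS:
    alpha = alpha_if_removed(item)
    change = 100 * (alpha - overall) / overall
    print(f"without {item:8s}: {alpha:.3f} ({change:+.1f} %)")
```

Under these assumptions the overall value comes out close to the 0.763 reported in Table 1, and only the removal
of the mentally unhealthy day item raises alpha, which matches the sign pattern in the Change (%) column.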
TLI = 0.915-0.938, SRMR = 0.01-0.014, and CD = 0.868-0.885). To further examine this, we analyzed results from a
sequence of hierarchical tests (Table 5). For both of the combined data sets (2001–2010 and 2011–2013), all
models had acceptable goodness of fit statistics (RMSEA = 0.02-0.044, CFI = 0.977-0.987, TLI = 0.925-0.984,
SRMR = 0.011-0.014, and CD = 0.879-0.884). The decrease in CFI was no larger than 0.01 for each pairwise model
comparison, whether it involved full metric invariance versus full configural invariance, or full scalar
invariance versus full metric invariance. These results indicate that the new, single measure of HRQOL has strong
measurement invariance, holding fully equivalent factor patterns, fully equivalent factor loadings, and fully
equivalent intercepts over the years, from 2001 to 2010, and from 2011 to 2013.
We further assessed model stability across sex and age subgroups (Table 5). Results suggest that the
one-factor model has strong measurement invariance across sex, holding fully equivalent factor patterns, fully
equivalent factor loadings, and fully equivalent intercepts between male and female adults. When applied to
young (18-64) and old (65+) age subgroups, the one-factor model has full configural invariance, but full
equivalence of factor loadings is not supported because the CFI decrease is larger than 0.01. However, after
releasing the equivalent factor loading constraint for the mentally unhealthy day item, partial metric invariance
is tenable.
Model application: trend monitoring
The one-factor HRQOL model exhibits strong measurement invariance across year subgroups, which allows us to
analyze how the mean of HRQOL factor scores changes over years. Figure 3 shows the age-standardized weighted
means of HRQOL factor scores predicted for the 2001–2010 and 2011–2013 periods, respectively. The overall
HRQOL scores gradually declined from 2001 to 2004 and, in general, remained stable thereafter through 2010
(p < 0.001 for 2001 vs. 2004, adjusted Wald test). Compared with 2011 and 2012, the overall HRQOL scores increased
in 2013 (p < 0.001 for 2011 vs. 2013, adjusted Wald test). These findings were also confirmed with the changes from
the original CDC HRQOL-4 questions (Additional file 1 shows results of CDC HRQOL-4 changes for 2001 vs.
2004, and 2011 vs. 2013).
Table 2 Exploratory factor analysis of the CDC HRQOL-4, BRFSS 2013
Factor    Eigenvalue  Percentage of explained variance  Accumulated percentage of explained variance
Factor 1  1.76        99.9                              99.9
Factor 2  <0.01       0.1                               100.0
Discussion
In this study, we developed and tested a one-factor HRQOL model using a series of BRFSS data sets. To our
knowledge, this is the first report of an HRQOL factor analysis based solely on CDC HRQOL-4. Two previous
studies, which used data obtained from the optional BRFSS module, proposed a two-factor model [7, 17].
One report used summed z-scores from all items to represent physical and mental health, respectively. However,
it did not consider item factor loadings and removed one item due to cross-loading issues [17]. As the CDC
HRQOL-4 questions are more commonly used in BRFSS and other surveys, we performed HRQOL factor analysis
using only these four items. EFA revealed that the four items could be explained by one underlying factor: a
general health factor that encompasses both physical and mental health. As a result, this model could be used to
generate a one-factor score that represents the underlying construct of HRQOL.
In addition to EFA, we performed CFA to evaluate our one-factor model with more statistical options such as
goodness of fit, modification indices, and measurement invariance tests. Our post-hoc analysis found a negative
error correlation path between the physically unhealthy day item and the mentally unhealthy day item. This result
may have not only statistical support but also theoretical
Fig. 1 Final one-factor model for the CDC HRQOL-4, BRFSS 2013. Standardized factor loadings from the latent construct (represented by the large oval)
to its measures (represented by rectangles) are shown beside the single-headed arrows. The small ovals represent error variances unexplained by the
model. The curved double-headed arrow represents correlations between error variances
meaning. First, research has found that using similar question formats can affect survey responses [29]. The format
of the two questions is very similar, which may contribute to the covariance between the two items. Second, our
preliminary analysis (not shown) found that some individuals report no physically unhealthy days, but 30 mentally
unhealthy days. Our one-factor model may account for this distinction by indicating a negative relationship between
the error terms in the measures of physically unhealthy days and mentally unhealthy days.
Our one-factor model showed strong measurement invariance across year and sex subgroups. However, for
young and old age subgroups, only partial metric invariance was observed due to different factor loadings on the
mentally unhealthy day item. This may suggest that the mental health dimension differs between young and old
people, which is in accordance with previous reports [30, 31]. Further studies are needed to show how
stable the factor structure is with other demographics, socioeconomic characteristics, and chronic conditions.
Using BRFSS 2001–2010 and 2011–2013 data sets, we demonstrated that our one-factor HRQOL model is
stable over time, and could be used to monitor trends in HRQOL with a single summary score. This approach
Fig. 2 HRQOL summary score, BRFSS 2013. Histogram shows the distribution of the HRQOL summary score using BRFSS 2013 data set. Larger
value means better quality of HRQOL. The “skewed left” distribution suggests that the majority of the population is healthy in terms of HRQOL
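The text describes a factor score as a weighted sum: each item score is multiplied by its factor loading and the
products are added. A minimal sketch of that reading follows. The loadings are hypothetical (the source reports
only that they range from 0.54 to 0.87), the respondents are made up, each item is standardized and sign-flipped
so that larger means healthier, and the regression scoring used by the authors' software will in general produce
different weights and a different scale.

```python
from statistics import mean, pstdev

# HYPOTHETICAL loadings for illustration; only their range is reported in the text.
LOADINGS = {"general": 0.60, "physical": 0.87, "mental": 0.54, "activity": 0.80}


def standardize(values):
    """Convert raw values to z-scores using the population standard deviation."""
    center, spread = mean(values), pstdev(values)
    return [(value - center) / spread for value in values]


def weighted_sum_scores(rows):
    """Loading-weighted sum of standardized items, oriented so larger means healthier.

    rows: dicts with 'general' (1 = excellent ... 5 = poor) and three day counts (0-30).
    """
    columns = {
        # Negate first: a higher raw value means worse health on all four items.
        name: standardize([-row[name] for row in rows])
        for name in LOADINGS
    }
    return [
        sum(LOADINGS[name] * columns[name][i] for name in LOADINGS)
        for i in range(len(rows))
    ]


# Made-up respondents, from healthiest to least healthy.
respondents = [
    {"general": 1, "physical": 0, "mental": 0, "activity": 0},
    {"general": 3, "physical": 5, "mental": 2, "activity": 1},
    {"general": 5, "physical": 30, "mental": 20, "activity": 30},
]
scores = weighted_sum_scores(respondents)
print([round(score, 2) for score in scores])
```

With this orientation the healthiest respondent receives the largest score, which is the direction of the
summary score shown in Fig. 2.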
Table 3 The mean values of HRQOL summary scores by the CDC HRQOL-4 measures
BRFSS Data Sets 2005 2001-2010 2013 2011-2013
Sex Male Female Male Female Male Female Male Female
General Health Status
Excellent 3.10 4.04 3.14 4.05 3.32 4.17 3.35 4.18
Very good 1.81 2.58 1.84 2.59 2.09 2.84 2.13 2.84
Good -0.20 0.42 -0.18 0.44 -0.03 0.63 0.01 0.63
Fair -6.21 -5.60 -6.20 -5.58 -6.55 -5.84 -6.50 -5.84
Poor -16.91 -16.14 -16.89 -16.12 -17.34 -16.46 -17.27 -16.45
Physically Unhealthy Days
0 day 2.80 3.50 2.82 3.52 2.99 3.67 3.02 3.67
1-10 days -0.04 0.71 -0.01 0.73 0.07 0.79 0.11 0.79
11-20 days -7.98 -7.10 -7.96 -7.08 -7.85 -6.98 -7.79 -6.97
21-29 days -15.81 -14.72 -15.79 -14.71 -15.64 -14.60 -15.57 -14.57
30 days -21.17 -19.97 -21.16 -19.96 -20.85 -19.69 -20.77 -19.66
Mentally Unhealthy Days
0 day 1.11 1.77 1.12 1.80 1.30 2.00 1.34 1.99
1-10 days -0.24 0.52 -0.22 0.54 -0.31 0.43 -0.26 0.44
11-20 days -6.09 -5.05 -6.02 -5.06 -6.39 -5.50 -6.32 -5.47
21-29 days -10.32 -8.97 -10.18 -9.01 -10.91 -9.88 -10.83 -9.82
30 days -14.79 -13.17 -14.60 -13.23 -15.42 -14.24 -15.31 -14.15
Activity Limitation Days
0 day 1.58 2.30 1.60 2.32 1.80 2.51 1.84 2.51
1-10 days -2.00 -1.20 -1.97 -1.19 -1.95 -1.19 -1.90 -1.18
11-20 days -11.18 -10.19 -11.12 -10.17 -10.93 -10.02 -10.87 -9.99
21-29 days -17.33 -16.22 -17.24 -16.20 -16.98 -15.97 -16.93 -15.95
30 days -22.14 -21.01 -22.04 -20.96 -21.57 -20.51 -21.52 -20.49
Summary score with larger value means better quality of HRQOL
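The validity argument rests on the mean summary score falling as each CDC HRQOL-4 rating worsens. That pattern
can be checked mechanically; the sketch below does so for the 2013 columns only, with values copied from Table 3.
The data layout and function name are illustrative choices, not part of the original analysis.

```python
# Mean summary scores for BRFSS 2013 as (male, female), copied from Table 3 and
# ordered from the best response level to the worst.
TABLE3_2013 = {
    "General health status": [
        (3.32, 4.17), (2.09, 2.84), (-0.03, 0.63), (-6.55, -5.84), (-17.34, -16.46),
    ],
    "Physically unhealthy days": [
        (2.99, 3.67), (0.07, 0.79), (-7.85, -6.98), (-15.64, -14.60), (-20.85, -19.69),
    ],
    "Mentally unhealthy days": [
        (1.30, 2.00), (-0.31, 0.43), (-6.39, -5.50), (-10.91, -9.88), (-15.42, -14.24),
    ],
    "Activity limitation days": [
        (1.80, 2.51), (-1.95, -1.19), (-10.93, -10.02), (-16.98, -15.97), (-21.57, -20.51),
    ],
}


def strictly_decreasing(values):
    """True when every value is smaller than the one before it."""
    return all(earlier > later for earlier, later in zip(values, values[1:]))


results = {}
for item, rows in TABLE3_2013.items():
    male = [m for m, _ in rows]
    female = [f for _, f in rows]
    results[item] = strictly_decreasing(male) and strictly_decreasing(female)
    print(f"{item}: decreasing for both sexes = {results[item]}")
```

The same check could be repeated for the 2005, 2001-2010, and 2011-2013 columns by adding their values.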
Table 5 Multi-group confirmatory factor analysis for measurement invariance across years, sex, and age subgroups
Models χ2 (df) RMSEA CFI TLI SRMR CD ΔCFI
BRFSS 2001-2010 (year subgroups)
Full configural invariance 5430 (10) 0.041 0.987 0.925 0.011 0.88 −
Full metric invariance 5648 (37) 0.022 0.987 0.979 0.012 0.88 0
Full scalar invariance 10107 (64) 0.022 0.977 0.978 0.014 0.879 0.01
BRFSS 2011-2013 (year subgroups)
Full configural invariance 2685 (3) 0.044 0.987 0.923 0.012 0.884 −
Full metric invariance 2696 (9) 0.025 0.987 0.974 0.012 0.884 0
Full scalar invariance 2794 (15) 0.02 0.987 0.984 0.012 0.884 0
BRFSS 2001-2010 (male vs. female)
Full configural invariance 10443 (2) 0.057 0.997 0.981 0.01 0.879 −
Full metric invariance 12243 (5) 0.039 0.996 0.991 0.012 0.879 0.001
Full scalar invariance 23654 (8) 0.043 0.993 0.989 0.019 0.881 0.003
BRFSS 2011-2013 (male vs. female)
Full configural invariance 5159 (2) 0.06 0.997 0.98 0.011 0.884 −
Full metric invariance 5788 (5) 0.041 0.996 0.991 0.012 0.884 0.001
Full scalar invariance 9425 (8) 0.041 0.994 0.991 0.017 0.885 0.002
BRFSS 2001-2010 (young vs. old ages)
Full configural invariance 3587 (2) 0.034 0.999 0.994 0.007 0.877 −
Partial metric invariance 13086 (3) 0.053 0.996 0.984 0.014 0.87 0.003
Full metric invariance 62559 (5) 0.089 0.981 0.955 0.038 0.866 0.018
BRFSS 2011-2013 (young vs. old ages)
Full configural invariance 2256 (2) 0.04 0.999 0.991 0.008 0.883 −
Partial metric invariance 7096 (3) 0.058 0.996 0.982 0.013 0.879 0.003
Full metric invariance 29614 (5) 0.092 0.981 0.955 0.035 0.874 0.018
Full configural invariance – same factor structure across groups; full metric invariance-equivalent factor loadings across groups; full scalar invariance – equivalent intercepts
across groups; partial metric invariance – equivalent factor loadings across groups except factor loadings for mentally unhealthy day item. Age subgroups: young (18-64)
and old (65+)
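The decision rule from the Methods (accept the more constrained model when CFI drops by no more than 0.01) can be
applied to the CFI values in Table 5 as follows. This is a sketch of the rule only, not of the model fitting;
the variable names are ours, and decimal arithmetic is used so that a drop of exactly 0.010 is not misjudged
because of binary floating-point rounding.

```python
from decimal import Decimal

THRESHOLD = Decimal("0.01")  # largest CFI decrease tolerated, per the Methods

# (model, CFI) in order of increasing constraints, copied from Table 5 (BRFSS 2001-2010).
YEAR_SUBGROUPS = [("configural", "0.987"), ("metric", "0.987"), ("scalar", "0.977")]
AGE_PARTIAL_METRIC = [("configural", "0.999"), ("partial metric", "0.996")]
AGE_FULL_METRIC = [("configural", "0.999"), ("full metric", "0.981")]


def invariance_steps(models):
    """Return (model, delta_cfi, accepted) for each model after the first."""
    steps = []
    for (_, previous_cfi), (name, current_cfi) in zip(models, models[1:]):
        delta = Decimal(previous_cfi) - Decimal(current_cfi)
        steps.append((name, delta, delta <= THRESHOLD))
    return steps


for label, models in [
    ("year subgroups", YEAR_SUBGROUPS),
    ("age, partial metric", AGE_PARTIAL_METRIC),
    ("age, full metric", AGE_FULL_METRIC),
]:
    for name, delta, accepted in invariance_steps(models):
        print(f"{label}: {name} delta CFI = {delta} accepted = {accepted}")
```

For the age subgroups, both metric models are compared against the configural model, which reproduces the
0.003 and 0.018 entries in the table's last column.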
Fig. 3 Trend analysis of overall HRQOL, BRFSS 2001–2010 and 2011–2013. The weighted and age-adjusted HRQOL summary scores were predicted
by the model for the 2001–2010 and 2011–2013 periods, respectively. The mean HRQOL summary score for each year is shown from 2001 to 2013.
The 2000 US Census population was used for age standardization.
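Age standardization of a mean score, as used for Fig. 3, can be sketched as direct standardization: each
age-specific mean is weighted by that age group's share of a fixed standard population, so that years with
different age structures become comparable. Everything numeric below is hypothetical, including the age groups,
the standard-population shares (these are not the actual 2000 US standard population figures), and the yearly
means; survey weighting within age groups is also left out.

```python
def age_standardized_mean(group_means, standard_weights):
    """Direct standardization: weight age-specific means by standard-population shares."""
    if set(group_means) != set(standard_weights):
        raise ValueError("age groups in the data and in the standard must match")
    total = sum(standard_weights.values())
    return sum(
        group_means[group] * standard_weights[group] / total for group in group_means
    )


# HYPOTHETICAL standard-population shares by age group.
STANDARD = {"18-44": 0.50, "45-64": 0.30, "65+": 0.20}

# HYPOTHETICAL age-specific mean summary scores for two survey years.
year_a = {"18-44": 1.0, "45-64": 0.0, "65+": -1.0}
year_b = {"18-44": 0.8, "45-64": 0.0, "65+": -1.0}

adjusted_a = age_standardized_mean(year_a, STANDARD)
adjusted_b = age_standardized_mean(year_b, STANDARD)
print(f"year A: {adjusted_a:.2f}, year B: {adjusted_b:.2f}")
```

Because both years are weighted by the same standard shares, a difference between the two adjusted means
reflects changes in the age-specific scores rather than a shift in the age distribution.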