
Fit indices for structural equation modeling

Dr. Simon Moss

Overview

In structural equation modeling, the fit indices establish whether, overall, the model is acceptable.
If the model is acceptable, researchers then establish whether specific paths are significant.
Acceptable fit indices do not imply the relationships are strong. Indeed, high fit indices are often
easier to obtain when the relationships between variables are weak rather than strong--because
weak relationships leave little covariance that the model can fail to reproduce.

Many of the fit indices are derived from the chi-square value. Conceptually, the chi-square value,
in this context, represents the difference between the observed covariance matrix and the
predicted or model covariance matrix.
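To make this concrete: under maximum likelihood estimation, the chi-square equals (N - 1) times the ML discrepancy function between the two matrices (some programs use N rather than N - 1). A minimal numpy sketch, with an illustrative function name, matrices, and sample size:

```python
import numpy as np

def ml_chi_square(S, Sigma, n):
    """Likelihood-ratio chi-square from the ML discrepancy function.

    S     -- observed covariance matrix (p x p)
    Sigma -- covariance matrix implied by the model (p x p)
    n     -- sample size
    """
    p = S.shape[0]
    # ML discrepancy: F = ln|Sigma| - ln|S| + tr(S * Sigma^-1) - p
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    F = logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p
    return (n - 1) * F

# A model that reproduces the observed matrix exactly yields chi-square 0
S = np.array([[1.0, 0.4], [0.4, 1.0]])
print(abs(round(ml_chi_square(S, S, 200), 10)))  # -> 0.0
```

Any mismatch between the two matrices inflates F, and hence the chi-square, in proportion to the sample size.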

The fit indices can be classified into several classes. These classes include:

 Discrepancy functions, such as the chi-square test, relative chi-square, and RMS

 Tests that compare the target model with the null model, such as the CFI, NFI, TLI, and IFI

 Information theory goodness of fit measures, such as the AIC, BCC, BIC, and CAIC

 Non-centrality fit measures, such as the NCP.

Many researchers, such as Marsh, Balla, and Hau (1996), recommend that individuals utilize a
range of fit indices. Indeed, Jaccard and Wan (1996) recommend using indices from different
classes as well; this strategy overcomes the limitations of each index.

Summary of criteria that researchers often use

A model is regarded as acceptable if:

 The Normed Fit Index (NFI) exceeds .90 (Byrne, 1994) or .95 (Schumacker & Lomax, 2004)

 The Goodness of Fit Index exceeds .90 (Byrne, 1994)

 The Comparative Fit Index exceeds .93 (Byrne, 1994)

 RMSEA is less than .08 (Browne & Cudeck, 1993)--and ideally less than .05 (Steiger, 1990).
Alternatively, the upper bound of the RMSEA confidence interval should not exceed .08 (Hu &
Bentler, 1998)

 The relative chi-square is less than 2 or 3 (Kline, 1998; Ullman, 2001).
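As a quick illustration, these guideline cutoffs can be gathered into one check. The function name, dictionary keys, and example values below are hypothetical:

```python
def meets_fit_guidelines(fit):
    """Compare a dictionary of fit indices against the guideline cutoffs above."""
    checks = {
        "NFI > .90": fit["nfi"] > 0.90,
        "GFI > .90": fit["gfi"] > 0.90,
        "CFI > .93": fit["cfi"] > 0.93,
        "RMSEA < .08": fit["rmsea"] < 0.08,
        "relative chi-square < 3": fit["rel_chi2"] < 3,
    }
    return checks

fit = {"nfi": 0.94, "gfi": 0.92, "cfi": 0.95, "rmsea": 0.05, "rel_chi2": 1.8}
print(all(meets_fit_guidelines(fit).values()))  # -> True
```

Returning the individual checks, rather than a single verdict, matches the advice above to inspect a range of indices rather than any one criterion.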

These criteria are merely guidelines. To illustrate, in a field in which previous models generate CFI
values of only .70, a CFI value of .85 represents progress and thus should be acceptable (Bollen,
1989).

Discrepancy functions

Chi-square

The chi-square for the model is also called the discrepancy function, likelihood ratio chi-square, or
chi-square goodness of fit. In AMOS, the chi-square value is called CMIN.
If the chi-square is not significant, the model is regarded as acceptable. That is, the observed
covariance matrix is similar to the covariance matrix predicted by the model.

If the chi-square is significant, the model is regarded, at least sometimes, as unacceptable.


However, many researchers disregard this index when the sample size exceeds 200 or so and
other indices indicate the model is acceptable. This approach arises because the chi-square
index presents several problems:

 Complex models, with many parameters, will tend to generate an acceptable fit

 If the sample size is large, the model will usually be rejected, sometimes unfairly

 When the assumption of multivariate normality is violated, the chi-square fit index is
inaccurate. The Satorra-Bentler scaled chi-square, which is available in EQS, is often
preferred, because this index penalizes the chi-square for kurtosis.

Relative chi-square

The relative chi-square is also called the normed chi-square. This value equals the chi-square index
divided by the degrees of freedom. This index might be less sensitive to sample size. The criterion
for acceptance varies across researchers, ranging from less than 2 (Ullman, 2001) to less than 5
(Schumacker & Lomax, 2004).
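The computation is direct (the values below are illustrative):

```python
def relative_chi_square(chi2, df):
    """Normed chi-square: the model chi-square divided by its degrees of freedom."""
    return chi2 / df

# A chi-square of 84 on 42 df gives 2.0 -- acceptable under the more
# lenient cutoffs mentioned above (< 3 or < 5)
print(relative_chi_square(84.0, 42))  # -> 2.0
```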

Root mean square residual

The RMS, also called the RMR, represents the square root of the average or mean of the
covariance residuals--the differences between corresponding elements of the observed and
predicted covariance matrices. Zero represents a perfect fit, but the maximum is unlimited.

Because the maximum is unbounded, the RMS is difficult to interpret, and consensus has not been
reached on the levels that represent acceptable models. Some researchers use the standardized
version of the RMS (the SRMR) instead to circumvent this problem.

These cutoffs are usually stated for the related RMSEA: values less than .08 (Browne & Cudeck,
1993)--and ideally less than .05 (Steiger, 1990)--indicate acceptable fit. Alternatively, the upper
bound of the RMSEA confidence interval should not exceed .08 (Hu & Bentler, 1998).
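The averaging described above can be sketched directly. Here the unstandardized RMR is computed over the unique (lower-triangular) elements of the residual matrix; the matrices and values are illustrative:

```python
import numpy as np

def rmr(S, Sigma):
    """Root mean square residual over the unique (lower-triangular) elements."""
    rows, cols = np.tril_indices(S.shape[0])   # unique elements, incl. diagonal
    resid = S[rows, cols] - Sigma[rows, cols]  # covariance residuals
    return np.sqrt(np.mean(resid ** 2))

S = np.array([[1.0, 0.40], [0.40, 1.0]])
Sigma = np.array([[1.0, 0.34], [0.34, 1.0]])
print(round(rmr(S, Sigma), 4))  # -> 0.0346
```

Dividing each residual by the product of the corresponding standard deviations before averaging would give the standardized version.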

Indices that compare the target and null models

Comparative fit index (CFI)

The comparative fit index, like the IFI, NFI, BBI, TLI, and RFI, compares the model of interest with
some alternative, such as the null or independence model. The CFI is also known as the Bentler
Comparative Fit Index.

Specifically, the CFI compares the fit of a target model to the fit of an independent model--a model
in which the variables are assumed to be uncorrelated. In this context, fit refers to the difference
between the observed and predicted covariance matrices, as represented by the chi-square index.

In short, the CFI is based on the ratio of the discrepancy of the target model to the discrepancy
of the independence model. Roughly, the CFI thus represents the extent to which the model of
interest fits better than the independence model does. Values that approach 1 indicate
acceptable fit.
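A common formulation expresses the CFI through the non-centrality--chi-square minus df--of each model. A sketch, with illustrative values:

```python
def cfi(chi2_target, df_target, chi2_null, df_null):
    """Comparative fit index from the target- and null-model chi-squares."""
    d_target = max(chi2_target - df_target, 0.0)  # target non-centrality
    d_null = max(chi2_null - df_null, d_target)   # null non-centrality
    return 1.0 - d_target / d_null if d_null > 0 else 1.0

# Target model: chi-square 52 on 40 df; null model: chi-square 800 on 45 df
print(round(cfi(52.0, 40, 800.0, 45), 3))  # -> 0.984
```

The max(..., 0) clamps prevent a chi-square below its df from producing a CFI above 1.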
CFI is not too sensitive to sample size (Fan, Thompson, and Wang, 1999). However, CFI is not
effective if most of the correlations between variables approach 0--because there is, therefore,
less covariance to explain. Furthermore, Raykov (2000, 2005) argues that CFI is a biased measure,
based on non-centrality.

Incremental fit index (IFI)

The incremental fit index, also known as Bollen's IFI, is also relatively insensitive to sample size.
Values that exceed .90 are regarded as acceptable, although this index can exceed 1.

To compute the IFI, first calculate the difference between the chi-square of the independence
model--in which variables are uncorrelated--and the chi-square of the target model. Next,
calculate the difference between the chi-square of the independence model and the df of the
target model. The ratio of these two values represents the IFI.
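In symbols, IFI = (chi-square of independence model - chi-square of target model) / (chi-square of independence model - df of target model). A sketch, with illustrative values:

```python
def ifi(chi2_target, df_target, chi2_null):
    """Bollen's incremental fit index."""
    return (chi2_null - chi2_target) / (chi2_null - df_target)

# Target model: chi-square 52 on 40 df; null model: chi-square 800
print(round(ifi(52.0, 40, 800.0), 3))  # -> 0.984
```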

Normed fit index (NFI)

The NFI is also known as the Bentler-Bonett normed fit index. The index varies from 0 to 1--
where 1 is ideal. The NFI equals the difference between the chi-square of the null model and the
chi-square of the target model, divided by the chi-square of the null model. In other words, an NFI
of .90, for example, indicates the model of interest improves the fit by 90% relative to the null or
independence model.
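In symbols, NFI = (chi-square of null model - chi-square of target model) / chi-square of null model. A sketch, with illustrative values:

```python
def nfi(chi2_target, chi2_null):
    """Bentler-Bonett normed fit index."""
    return (chi2_null - chi2_target) / chi2_null

# The target model reduces the null-model discrepancy by 93.5%
print(nfi(52.0, 800.0))  # -> 0.935
```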

When the samples are small, the fit is often underestimated (Ullman, 2001). Furthermore, in
contrast to the TLI, the fit can be overestimated if the number of parameters is increased; the
NNFI overcomes this problem.

Tucker Lewis index (TLI) or Non-normed fit index (NNFI)

The TLI, sometimes called the NNFI, is similar to the NFI. However, the index is lower, and hence
the model is regarded as less acceptable, if the model is complex. To compute the TLI:

 First divide the chi-square for the target model and the null model by their corresponding
df values--which generates a relative chi-square for each model.

 Next, calculate the difference between these relative chi squares.

 Finally, divide this difference by the relative chi square for the null model minus 1.
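The steps above can be sketched as follows (values illustrative):

```python
def tli(chi2_target, df_target, chi2_null, df_null):
    """Tucker-Lewis index (NNFI), following the three steps above."""
    rel_target = chi2_target / df_target  # relative chi-square, target model
    rel_null = chi2_null / df_null        # relative chi-square, null model
    return (rel_null - rel_target) / (rel_null - 1.0)

# Target model: chi-square 52 on 40 df; null model: chi-square 800 on 45 df
print(round(tli(52.0, 40, 800.0, 45), 3))  # -> 0.982
```

Because the target model's chi-square enters only through its relative chi-square, adding parameters without a matching drop in chi-square lowers the index, which is the complexity penalty described above.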

According to Marsh, Balla, and McDonald (1988), the TLI is relatively independent of sample size.
The TLI is usually lower than the GFI--but values over .90 or over .95 are considered acceptable
(e.g., Hu & Bentler, 1999).

Information theory goodness of fit measures

Akaike Information Criterion

The AIC, like the BIC, BCC, and CAIC, is regarded as an information theory goodness of fit
measure--applicable when maximum likelihood estimation is used (Burnham & Anderson, 1998).
These indices are used to compare different models. The models that generate the lowest values
are optimal. The absolute AIC value is irrelevant--although values closer to 0 are ideal; only the
AIC value of one model relative to the AIC value of another model is meaningful.

Like the chi square index, the AIC also reflects the extent to which the observed and predicted
covariance matrices differ from each other. However, unlike the chi square index, the AIC
penalizes models that are too complex. In particular, the AIC equals the chi-square divided by n,
plus 2k / (n - 1). In this formula, k = .5v(v + 1) - df, where v is the number of variables and n is
the sample size.
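A direct transcription of that formula (the values are illustrative; scaling conventions differ across programs, so only comparisons between models fitted to the same data are meaningful):

```python
def aic(chi2, n, v, df):
    """AIC as given above: chi2/n + 2k/(n - 1), with k = .5 v (v + 1) - df."""
    k = 0.5 * v * (v + 1) - df   # number of free parameters
    return chi2 / n + 2.0 * k / (n - 1.0)

# The model with the lower AIC is preferred; absolute values carry no meaning
print(aic(52.0, 200, 10, 40) < aic(60.0, 200, 10, 38))  # -> True
```

Note that k here is the number of free parameters: with v variables there are .5v(v + 1) distinct covariance elements, and df is the number left over after estimation.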

Browne-Cudeck criterion (BCC) and Consistent AIC (CAIC)

The BCC is similar to the AIC. That is, the BCC and AIC both represent the extent to which the
observed covariance matrix differs from the predicted covariance matrix--like the chi square
statistic--but include a penalty if the model is complex, with many parameters. The BCC bestows
an even harsher penalty than does the AIC.

The BCC equals the chi-square divided by n, plus 2k / (n - v - 2). In this formula, k = .5v(v + 1) - df,
where v is the number of variables and n is the sample size.
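Transcribed the same way, the BCC's smaller denominator--n - v - 2 rather than n - 1--is what yields the harsher complexity penalty (values illustrative):

```python
def bcc(chi2, n, v, df):
    """BCC as given above: chi2/n + 2k/(n - v - 2), with k = .5 v (v + 1) - df."""
    k = 0.5 * v * (v + 1) - df   # number of free parameters
    return chi2 / n + 2.0 * k / (n - v - 2.0)

def aic(chi2, n, v, df):
    """AIC as given above, repeated here for comparison."""
    k = 0.5 * v * (v + 1) - df
    return chi2 / n + 2.0 * k / (n - 1.0)

# For the same model, the BCC exceeds the AIC because its penalty is harsher
print(bcc(52.0, 200, 10, 40) > aic(52.0, 200, 10, 40))  # -> True
```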

The CAIC is similar to the AIC as well. However, the CAIC also confers a penalty if the sample size is
small.

Bayesian Information Criterion (BIC)

The Bayesian Information Criterion is also known as Akaike's Bayesian Information Criterion (ABIC)
and the Schwarz Bayesian Criterion (SBC). This index is similar to the AIC, but the penalty against
complex models is especially pronounced--even more pronounced than in the BCC and CAIC
indices. Furthermore, like the CAIC, a penalty against small samples is included.

Raftery (1995) describes the use of the BIC in social research. Roughly, the BIC is the log of a
Bayes factor that compares the target model to the saturated model.

Determinants of which indices to use

Many other indices have also been developed. These indices include the GFI, AGFI, FMIN,
noncentrality parameter, and centrality index. The GFI and, to a lesser extent, the FMIN used to be
very popular, but their use has dwindled recently.

Some indices are especially sensitive to sample size. For example, several fit indices are biased
when the sample size is small--say, below 200. Nevertheless, RMSEA and CFI seem to be less
sensitive to sample size (Fan, Thompson, & Wang, 1999).

Source: https://www.sicotests.com/psyarticle.asp?id=277 accessed on 22 Feb. 18.

References

Anderson, J. C., & Gerbing, D. W. (1984). The effect of sampling error on convergence, improper
solutions and goodness-of-fit indices for maximum likelihood confirmatory factor
analysis. Psychometrika, 49, 155-173.

Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-
246.

Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of
covariance structures. Psychological Bulletin, 88, 588-606.

Bentler, P. M., & Mooijaart, A. (1989). Choice of structural model via parsimony: A rationale based
on precision. Psychological Bulletin, 106, 315-317.

Bollen, K. A. (1989). Structural equations with latent variables. NY: Wiley.


Bollen, K. A. (1990). Overall fit in covariance structure models: Two types of sample size
effects. Psychological Bulletin, 107, 256-259.

Browne, M. W., & Cudeck, R. (1989). Single sample cross-validation indices for covariance
structures. Multivariate Behavioral Research, 24, 445-455.

Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S.
Long (Eds.), Testing structural equation models (pp. 136-162). Newsbury Park, CA: Sage.

Burnham, K. P., & Anderson, D. R. (1998). Model selection and inference: A practical information-
theoretic approach. New York: Springer-Verlag.

Byrne, B. M. (1994). Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA:
Sage Publications.

Cheung, G. W. & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement
invariance. Structural Equation Modeling, 9, 233-255.

Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation method, and model
specification on structural equation modeling fit indexes. Structural Equation Modeling, 6, 56-83.

Hipp J. R., & Bollen K. A. (2003). Model fit in structural equation models with censored, ordinal, and
dichotomous variables: testing vanishing tetrads. Sociological Methodology, 33, 267-305.

Hu, L. T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation
modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.

Kline, R. B. (1998). Principles and practice of structural equation modeling. NY: Guilford Press.

Jaccard, J., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression.
Thousand Oaks, CA: Sage Publications.

Joreskog, K. G. (1993). Testing structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing
structural equation models (pp. 294-316). Newbury, CA: Sage.

Marsh, H. W., Balla, J. R., & Hau, K. T. (1996). An evaluation of incremental fit indexes: A clarification
of mathematical and empirical properties. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced
structural equation modeling techniques (pp. 315-353). Mahwah, NJ: Lawrence Erlbaum.

Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor
analysis: The effect of sample size. Psychological Bulletin, 103, 391-410.

Marsh, H. W., & Hau, K. T. (1996). Assessing goodness of fit: Is parsimony always desirable?
Journal of Experimental Education, 64, 364-390.

Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25,
111-163.

Raykov, T. (2000). On the large-sample bias, variance, and mean squared error of the conventional
noncentrality parameter estimator of covariance structure models. Structural Equation Modeling, 7,
431-441.
Raykov, T. (2005). Bias-corrected estimation of noncentrality parameters of covariance structure
models. Structural Equation Modeling, 12, 120-129.

Schumacker, R. E., & Lomax, R. G. (2004). A beginner's guide to structural equation modeling,
Second edition. Mahwah, NJ: Lawrence Erlbaum Associates.

Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation
approach. Multivariate Behavioral Research, 25, 173-180.

Steiger J. H. (2000). Point estimation, hypothesis testing and interval estimation using the RMSEA:
Some comments and a reply to Hayduk and Glaser. Structural Equation Modeling, 7, 149-162.

Tucker, L. R., & Lewis, C. (1973). The reliability coefficient for maximum likelihood factor
analysis. Psychometrika, 38, 1-10.

Ullman, J. B. (2001). Structural equation modeling. In B. G. Tabachnick & L. S. Fidell, Using
Multivariate Statistics (4th ed., pp. 653-771). Needham Heights, MA: Allyn & Bacon.
