Lisrel Guide

Uploaded by

sfenrisulfr
Copyright
© Attribution Non-Commercial (BY-NC)

5 Structural equation models

5.1 Introduction

The single most important feature of the LISREL program is its facility to deal with a wide variety of models for the analysis of latent variables (LVs). In the social sciences, and increasingly in biomedical and public health research, LV models have become an indispensable statistical tool. Because the whole framework of the LISREL model is based on relationships among LVs, it is worthwhile to briefly illustrate the concept of a latent variable.

Latent variables are ubiquitous in some research domains, while in other contexts they are seldom used. In alcohol abuse studies, for example, they are a major focus of attention. It is the complexity of attitudes and traits underlying the alcoholism syndrome that is of greatest concern, rather than any specific behavior. As an example, questionnaire items are frequently collected that deal with the functioning of the subjects in a particular domain. Subsets of these items are often correlated. This implies that the subset reflects a common theme. For example, consider the following items from a hypothetical survey, used in a hypothetical model as shown in Figure 5.1:
o Q1: How many alcoholic drinks do you generally consume on any occasion?
o Q2: How many days in a typical week do you consume alcohol?
o Q3: Do you frequently attend parties where alcohol is available?
o Q4: Do you have alcoholic beverages with meals?

Figure 5.1 Path Diagram for Hypothetical SEM (latent variable Tendency with indicators Q1, Q2, Q3 and Q4)

In this example, a possible LV would be Tendency to Use Alcohol. This is a LV because Tendency to Use is a kind of unmeasurable propensity that is more than the combination of these items. The higher the individual Tendency LV score is, the more likely that the person will endorse questionnaire items regarding use and abuse of alcoholic beverages. A LV is a statistical device used to summarize the information in a collection of correlated response variables. A LV describes the information of a set of items and reduces them to a single new measure. It is often assumed that the latent variable is superordinate to items on which it is based.


There are basically three major reasons for the utility of LV models. First, this kind of model can summarize information contained in many response variables by a few LVs. Consequently, the approach is parsimonious. Second, when properly specified, a LV model can minimize the biasing effects of errors of measurement in estimating treatment effects. This means that the approach is often more accurate than a traditional version of the same analysis. Third, LV models investigate effects between primary conceptual variables, rather than between any particular set of ordinary response variables. This means that a LV model is often viewed as more appropriate theoretically than a simpler analysis with response variables only.

A partial list of the sort of models that are subsumed under the framework of LISREL's general LV structure includes factor analysis, simultaneous equation models, standard growth curve processes, errors-in-variables models, virtually all forms of classical regression, univariate linear models and multivariate linear models, including the corresponding hypothesis tests on means and variances of classical experimental design. Literally hundreds of published articles appear each year that feature LV models, and an active program of statistical investigations on properties and extensions of LV models is carried out.

The hypothetical path diagram in Figure 5.2 shows seven $x$ variables as indicators of three latent $\xi$ variables. Note that $x_3$ is an indicator for both $\xi_1$ and $\xi_2$. There are two latent $\eta$ variables, each with two $y$ indicators. The model involves errors in equations (the $\zeta$s), and errors in variables (the $\epsilon$s and $\delta$s). A more detailed discussion of this model is given in Chapter 1 of the LISREL 8: User's Reference Guide (1996).

Figure 5.2 Path Diagram for Hypothetical SEM

The LISREL model for single samples (Jöreskog & Sörbom, 1996) is defined by two components, namely the structural equation model and the measurement model(s).


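The structural equation model

$$\eta = \alpha + B\eta + \Gamma\xi + \zeta \tag{5.1}$$

where $\eta$ is an $m \times 1$ vector of endogenous latent variables and where it is assumed that the $n \times 1$ vector $\xi$ of exogenous latent variables has mean $\kappa$ and covariance matrix $\Phi$, and that the $m \times 1$ vector $\zeta$ of error terms has zero mean and covariance matrix $\Psi$, and $\mathrm{Cov}(\xi, \zeta') = 0$. If $|I - B| \neq 0$, and setting $A = (I - B)^{-1}$, it follows that

$$E(\eta) = A(\alpha + \Gamma\kappa) \tag{5.2}$$

and

$$\mathrm{Cov}(\eta) = A(\Gamma\Phi\Gamma' + \Psi)A'. \tag{5.3}$$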

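Measurement models

The measurement models for the $p$ endogenous observed variables, represented by the vector $y$, and the $q$ exogenous observed variables, contained in the vector $x$, relate the observed (manifest) variables to the underlying factors (latent variables) and may be expressed as

$$y = \tau_y + \Lambda_y\eta + \epsilon, \quad E(\epsilon) = 0, \quad \mathrm{Cov}(\epsilon) = \Theta_\epsilon$$
$$x = \tau_x + \Lambda_x\xi + \delta, \quad E(\delta) = 0, \quad \mathrm{Cov}(\delta) = \Theta_\delta \tag{5.4}$$

respectively. The mean vectors of the observed variables are

$$\mu_y = \tau_y + \Lambda_y A(\alpha + \Gamma\kappa), \quad \mu_x = \tau_x + \Lambda_x\kappa. \tag{5.5}$$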


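In general, in a single population, $\tau_y$, $\tau_x$, $\alpha$ and $\kappa$ will not be identified without the imposition of further conditions. It further follows that

$$\Sigma_y = \Lambda_y A(\Gamma\Phi\Gamma' + \Psi)A'\Lambda_y' + \Theta_\epsilon, \tag{5.6}$$

$$\Sigma_x = \Lambda_x\Phi\Lambda_x' + \Theta_\delta \tag{5.7}$$

and

$$\Sigma_{yx} = \Lambda_y A\Gamma\Phi\Lambda_x'. \tag{5.8}$$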

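From (5.6) to (5.8), it follows that the covariance structure for the observed variables of the general model may be expressed as:

$$\Sigma = \mathrm{Cov}\begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} \Sigma_{yy} & \Sigma_{yx} \\ \Sigma_{xy} & \Sigma_{xx} \end{pmatrix} \tag{5.9}$$

where $\Sigma_{yy} = \Sigma_y$, $\Sigma_{xx} = \Sigma_x$ and $\Sigma_{xy} = \Sigma_{yx}'$.

From (5.5), the mean structure of the observed variables of the general LISREL model follows as:

$$\mu = E\begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} \mu_y \\ \mu_x \end{pmatrix}. \tag{5.10}$$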

LISREL fits the mean-and-covariance structure defined in (5.9) and (5.10) to the data on the observed variables of the LISREL model. In this regard, LISREL can handle simple random sample data as well as complex survey data.
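The mean-and-covariance structure can be checked numerically on a toy model. The following is a minimal pure-Python sketch, not LISREL's own computation: it evaluates the reduced-form moments of the latent variables and then the implied mean vector and covariance matrix of the observed variables for a model with two endogenous and one exogenous latent variable. All parameter values and helper names (`matmul`, `implied_moments`, and so on) are illustrative assumptions.

```python
# Sketch: model-implied moments of (y, x) for a tiny LISREL-type model.
# Every numeric value here is a made-up illustration, not taken from the guide.

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("I - B is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def implied_moments(B, Gamma, Phi, Psi, alpha, kappa,
                    Lambda_y, Lambda_x, Theta_eps, Theta_delta, tau_y, tau_x):
    """Return (mu, Sigma) of the stacked vector (y, x); assumes two eta variables."""
    I_minus_B = [[(1.0 if i == j else 0.0) - B[i][j] for j in range(2)]
                 for i in range(2)]
    A = inverse_2x2(I_minus_B)                                    # A = (I - B)^-1
    mean_eta = matmul(A, add(alpha, matmul(Gamma, kappa)))        # mean of eta
    inner = add(matmul(matmul(Gamma, Phi), transpose(Gamma)), Psi)
    cov_eta = matmul(matmul(A, inner), transpose(A))              # covariance of eta
    sigma_y = add(matmul(matmul(Lambda_y, cov_eta), transpose(Lambda_y)), Theta_eps)
    sigma_x = add(matmul(matmul(Lambda_x, Phi), transpose(Lambda_x)), Theta_delta)
    sigma_yx = matmul(matmul(matmul(matmul(Lambda_y, A), Gamma), Phi),
                      transpose(Lambda_x))
    mu_y = add(tau_y, matmul(Lambda_y, mean_eta))
    mu_x = add(tau_x, matmul(Lambda_x, kappa))
    # Stack the blocks: top rows [Sigma_y, Sigma_yx], bottom rows [Sigma_xy, Sigma_x].
    top = [ry + ryx for ry, ryx in zip(sigma_y, sigma_yx)]
    bottom = [rxy + rx for rxy, rx in zip(transpose(sigma_yx), sigma_x)]
    return mu_y + mu_x, top + bottom

mu, sigma = implied_moments(
    B=[[0.0, 0.0], [0.5, 0.0]], Gamma=[[1.0], [0.0]],
    Phi=[[2.0]], Psi=[[1.0, 0.0], [0.0, 1.0]],
    alpha=[[1.0], [0.0]], kappa=[[2.0]],
    Lambda_y=[[1.0, 0.0], [0.0, 1.0]], Lambda_x=[[1.0]],
    Theta_eps=[[0.5, 0.0], [0.0, 0.5]], Theta_delta=[[0.25]],
    tau_y=[[0.0], [0.0]], tau_x=[[0.0]],
)
for row in sigma:
    print(row)
```

With these assumed values the implied covariance matrix is symmetric, and its lower-left block is the transpose of the upper-right block, as the partitioned form requires.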

Special cases of the general LISREL model are obtained by fixing and constraining the parameters which are the elements in the 13 parameter matrices $\alpha$, $\kappa$, $\tau_x$, $\tau_y$, $\Lambda_y$, $\Lambda_x$, $B$, $\Gamma$, $\Phi$, $\Psi$, $\Theta_\epsilon$, $\Theta_\delta$ and $\Theta_{\delta\epsilon}$.

A large number of submodels is obtained by setting certain parameter matrices equal to the identity matrix or to zero. A few examples are:
o The measurement model for $x$,

$$x = \Lambda_x\xi + \delta.$$

o A structural equation model where $y$ and $x$ are observed without error ($\Lambda_y = I$, $\Lambda_x = I$, $\Theta_\epsilon = 0$, $\Theta_\delta = 0$),

$$y = By + \Gamma x + \zeta.$$

Kaplan (2000) pointed out that this model was a major innovation in econometric modeling. In the special case where $B = 0$, one obtains the multivariate multiple regression model

$$y = \Gamma x + \zeta.$$

The general form of the LISREL model, due to its flexible specification in terms of fixed and free parameters and simple equality constraints, has proven to be so rich that it can handle a large variety of problems. Using the inequality constraints feature in LISREL, users constantly discover new models, such as nonlinear growth curves (see du Toit & Cudeck, 2001) and vector time series models with ARMA residuals (du Toit & Browne, 2001) that can be handled within the LISREL framework.

There are many articles on structural equation modeling. Hayduk (1996), for example, gives a long list of substantive areas where structural equation models are being used: addictions, criminology, education, family studies, health, marketing, psychology, and sociology to mention just a few. A very large number of technical and substantive articles using structural equation models have appeared in dozens of journals.

The next section describes how to draw a path diagram and create syntax using the graphical user interface of LISREL. An overview of the SIMPLIS syntax, which is used to specify LISREL models, is given in Section 5.3. Thereafter, two illustrative examples are discussed in Section 5.4. In Section 5.5, a simulation study and empirical comparisons are used to assess the results produced by LISREL in the case of complex survey data. An overview of the statistical theory implemented in LISREL for the analysis of complex survey data concludes this chapter.


5.2 Graphical User Interface

5.2.1 The new PTH window

The path diagram component of the graphical user interface (GUI) of the LISREL module consists of the options and dialog boxes of the Setup menu on the PTH window of LISREL 8.7. This GUI component allows you to interactively generate the syntax file by means of a path diagram, which is a graphical representation of a structural equation model. The Setup menu on the PTH window of LISREL 8.7 is reviewed in the next section while the four dialog boxes are reviewed separately in the subsequent sections. Thereafter, the use of the graphic pane of the PTH window is outlined. The Setup menu on the PTH window provides access to a sequence of four dialog boxes that can be used to create a SIMPLIS or LISREL syntax file interactively by using a path diagram. A new PTH window is opened as follows. Open LISREL 8.7 and select the New option on the File menu to create the following window.

Click on the New option on the File menu to load the New dialog box and select the Path Diagram option from the New dialog box as shown below.


Click on the OK button to load the Save As dialog box and then enter, for example, the name demo.pth in the File name field to produce the following dialog box.

5.2.2 The Setup menu

Next, click on the Save button to open the PTH window for demo.pth and then click on the Setup menu to obtain the following window.


Typically, clicking on the Title and Comments option of the Setup menu will load the Title and Comments dialog box (see Section 5.2.3). However, you can click directly on the Groups, Variables or Data option to go to the Group Names dialog box (see Section 5.2.4), the Labels dialog box (see Section 5.2.5), or the Data dialog box (see Section 5.2.6). Once you have completed the four sequential dialog boxes and drawn the path diagram, the SIMPLIS syntax file or the LISREL syntax file is generated by clicking on the Build SIMPLIS Syntax or the Build LISREL Syntax option respectively.

5.2.3 The Title and Comments dialog box

The Title and Comments dialog box allows you to specify a title and additional comments for the analysis. It is accessed by selecting the Title and Comments option on the Setup menu. This selection loads the following Title and Comments dialog box.

Title <string> <comment1> <comment2> . . . <commentk>

Note that the Title and Comments dialog box corresponds with the Title command as shown above. Once you are done with the Title and Comments dialog box, click on the Next button to go to the Group Names dialog box.

5.2.4 The Group Names dialog box

The Group Names dialog box is usually accessed by clicking on the Next button of the Title and Comments dialog box. It is required for multiple group analysis and allows you to specify different group names. Note that the Group Names dialog box corresponds with the Group command as indicated on the image below. For single group analysis, you can skip this dialog box by simply clicking on the Next button.


Group <label1> Group <label2> . . . Group <labelk>

Once the Group Names dialog box has been completed, click the Next button to go to the Labels dialog box.

5.2.5 The Labels dialog box

The Labels dialog box allows you to specify the observed variables and latent variables of the model interactively. Access to this dialog box is obtained by clicking on the Next button of the Group Names dialog box. This selection loads the Labels dialog box as shown below. Note that the Labels dialog box corresponds with the Observed variables and Latent variables commands as shown in the image below. Note also that the Add/Read Variables dialog box, which is loaded by clicking on the Add/Read Variables button, corresponds with the System file from file and the Raw data file from file commands.

If a LISREL system file (DSF) or a PRELIS system file (PSF) is to be used, you can browse for the corresponding DSF or PSF by first selecting the LISREL System File option or the PRELIS System File option from the drop-down list box respectively and then clicking on the Browse button. Otherwise, you can add a list of variables by activating the Add list of variables radio button. When you are done with the Add/Read Variables dialog box, click the OK button to return to the Labels dialog box.

If the model includes any latent (unobservable) variables, you must specify labels for them by clicking on the Add Latent Variables button to load the Add Variables dialog box. Click the OK button after the label has been entered to return to the Labels dialog box. Once the labels for all the latent variables of the model have been specified, click on the Next button to go to the Data dialog box.

Latent variables <labels>

Observed variables <labels>

System file from file <dsfname> or Raw data file from file <psfname>

5.2.6 The Data dialog box

Specify the data to be analyzed by using the Data dialog box. It is usually accessed by clicking on the Next button of the Labels dialog box. This action loads the following Data dialog box.


System file from file <filename>

Covariance matrix from file <filename>

Sample size <number>

Asymptotic covariance matrix from file <filename>

Note that the Data dialog box corresponds with the System file from file, Covariance matrix from file, Sample size and Asymptotic covariance matrix from file commands as shown in the image above. If a DSF or a PSF is selected in the Labels dialog box and the covariance matrix is the matrix to be analyzed, the Data dialog box is redundant. In other words, in this case, you can click on the OK button to return to the PTH window without completing the Data dialog box.

In the case of a single-group analysis, the Groups drop-down list box is not accessible. In the case of a multiple group analysis, the Groups drop-down list box displays the labels of the different groups as specified in the Group Names dialog box. In this case, specify the data for each group by selecting the group name from the list box. If the latent variable means are to be compared across groups, click the Estimate latent means check box.

Select the desired data type from the Statistics from drop-down list box if the covariance matrix is not desired. Select the appropriate data file type from the File type drop-down list box if a DSF is not preferred and then use the Browse button to browse for the corresponding file. You can open the data file to be analyzed by clicking on the Edit button or specify the data by clicking on the New button. If the asymptotic covariance matrix or asymptotic variances of the sample moments are to be used in the analysis, you must check the Include weight matrix check box, select the type of weight matrix from the drop-down list box and browse for the appropriate file. If a DSF or a PSF is not to be processed, enter the number of observations in the Number of observations field. Select the desired moment matrix from the Matrix to be analyzed drop-down list box if a correlation matrix rather than a covariance matrix is to be analyzed.

After all four dialog boxes are completed, click on the OK button to return to the PTH window.

5.2.7 The graphic pane of the PTH window

Once you are done with the four dialog boxes of the Setup menu of the PTH window, the graphic pane of the PTH window is used to create a path diagram of the model to be fitted to the data. An example of the graphic pane of the PTH window is shown below.

Tools labeled in the image: Zoom tool, Text tool, Two-way path tool, Multi-segment path tool, One-way path tool and Select tool.

You may use the following sequential steps to create a path diagram of the structural equation model to be fitted to the data in the graphic pane of the window.
o Use the Select tool to click, drag and drop the observed variables, one at a time, from the Observed list box into the graphic pane of the window.
o If the model includes latent variables, use the Select tool to click, drag and drop the latent variables, one at a time, from the Latent list box into the graphic pane of the window.
o Use the One-way path tool to specify the regression relationships between the observed and latent variables of the model.
o If applicable, use the Two-way path tool to specify the covariance (correlation) relationships between the latent variables and the error variables of the model.
o If certain parameters of the model are fixed to specific values, certain parameters are set equal to each other, a path needs to be removed from the model or certain graphic properties are desired, the right-click menu of a path is used. This menu is activated by first selecting the path by using the Select tool and then by right-clicking to view the following menu.


The options on the menu above are used as follows.
o The Fix option is used to fix a parameter that was set free by default.
o The Free option is used to free a parameter that was fixed by default.
o The Set Value option is used to specify the value for a fixed parameter.
o The Set Equal to option is used to set a parameter to be equal to another parameter(s).
o The Cancel Setting Equal option is used to release an equality constraint that was specified.
o The Delete option is used to specify the deletion of a path or a selected object.
o The Characteristics option is used to obtain information about the parameter.
o The Options option is used to modify the graphic properties of the path.
o The Make Line Straight option is used to automatically straighten a one-way path.

Once the path diagram has been drawn, click on the Build SIMPLIS Syntax option or Build LISREL Syntax option on the Setup menu to generate the SIMPLIS syntax file or the LISREL syntax file respectively.

5.2.8 The Weight Cases and Survey Design dialog boxes


The Weight Cases and Survey Design dialog boxes can be accessed if a PSF file is the active window. This is accomplished by selecting the Data menu from the main menu bar.


The Weight Cases option is used to calculate weighted sample statistics, for example means, covariances, and asymptotic covariances. It is assumed that these weights are normalized in the sense that the sum of sample weights equals the sample size.
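The normalization assumption can be illustrated with a small sketch. The Python functions below are hypothetical helpers written for this illustration (they are not part of LISREL or PRELIS): one rescales raw weights so that they sum to the sample size, the other computes a weighted mean with the rescaled weights.

```python
# Sketch: rescale raw case weights so that they add up to the sample size.
# The function names and the data are illustrative assumptions.

def normalize_weights(weights):
    """Return weights rescaled so that sum(weights) equals len(weights)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n = len(weights)
    return [w * n / total for w in weights]

def weighted_mean(values, weights):
    """Weighted mean of values; unchanged by any rescaling of the weights."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

raw_weights = [2.0, 2.0, 4.0, 8.0]
scores = [1.0, 3.0, 2.0, 4.0]
normalized = normalize_weights(raw_weights)
print(normalized, sum(normalized))
print(weighted_mean(scores, normalized))
```

Rescaling leaves weighted means unchanged, but it keeps the effective number of cases equal to the actual sample size, which matters for statistics that depend on the sample size.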

The Survey Design dialog box shown below is used to define the stratification and cluster variables and to select a design weight. This information is stored within the PSF file and is retrieved whenever a SEM, based on a continuous outcome variable, is fitted to the data contained in the PSF file.


5.3 Syntax

5.3.1 The structure of the SIMPLIS syntax file

The SIMPLIS syntax file, which is generated by the graphical user interface of the LISREL module, can also be prepared manually by using the LISREL 8.7 text editor or any other text editor such as Notepad and WordPad. The general structure of the SIMPLIS syntax file depends on the data to be processed. If the raw data file to be processed is a PSF, the SIMPLIS syntax file has the following structure.
TITLE <string>                       Optional
RAW DATA FROM FILE <psfname>         Required
MISSING VALUE CODE <value>           Optional
STRATUM <label>                      Optional
CLUSTER <label>                      Optional
WEIGHT <label>                       Optional
CASEWEIGHT <label>                   Optional
$CLUSTER <label>                     Optional
$PREDICT <labels>                    Optional
LATENT VARIABLES <labels>            Required
RELATIONSHIPS <relationships>        Required
SET <instruction>                    Optional
LISREL OUTPUT <options>              Optional
PATH DIAGRAM                         Optional
END OF PROBLEM                       Optional

where <string> denotes a character string, <label> denotes a case-sensitive variable name used in the raw data or moment matrix file, <labels> denotes a list of case-sensitive variable names used in the raw data or moment matrix file or for the latent variables of the model, <psfname> denotes the complete name (including the drive and folder names) of the PSF, <value> denotes any real number, <relationships> denotes a list of model expressions (see Section 5.3.24) and <instruction> denotes a parameter statement (see Section 5.3.26). <options> denotes a list of options for the analysis, each of which has either the syntax

<keyword> = <selection>

where <keyword> is one of AD, AL, BE, EP, GA, IT, KA, LX, LY, MA, ME, ND, NP, PH, PS, PV, RC, SI, SL, SV, TD, TE, TH, TM, TV, TX, TY or XO and <selection> denotes a number, a value or a name (see Section 5.3.13), or the syntax

<option>

where <option> is one of ALL, AM, EF, FS, FT, MI, MR, NS, PC, PT, RO, RS, SC, SO, SS, WP, XA, XI or XM (see Section 5.3.14).

If the data to be analyzed are summarized in a DSF, the structure of the SIMPLIS syntax file is as follows.
TITLE <string>                       Optional
SYSTEM FILE FROM FILE <dsfname>      Required
LATENT VARIABLES <labels>            Required
RELATIONSHIPS <relationships>        Required
SET <instruction>                    Optional
LISREL OUTPUT <options>              Optional
PATH DIAGRAM                         Optional
END OF PROBLEM                       Optional

where <dsfname> denotes the complete name (including the drive and folder names) of the DSF (see Section 5.3.14). The SIMPLIS syntax file has the following structure if the data file to be processed is in the form of a text file.
TITLE <string>                                          Optional
OBSERVED VARIABLES <labels>                             Required
RAW DATA FROM FILE <filename>                           Required if COV. or CORR. MATRIX is not selected
MISSING VALUE CODE <value>                              Optional
STRATUM <label>                                         Optional
CLUSTER <label>                                         Optional
WEIGHT <label>                                          Optional
CASEWEIGHT <label>                                      Optional
$CLUSTER <label>                                        Optional
$PREDICT <labels>                                       Optional
COVARIANCE MATRIX FROM FILE <filename>                  Required if RAW DATA or CORR. MATRIX is not selected
CORRELATION MATRIX FROM FILE <filename>                 Required if RAW DATA or COV. MATRIX is not selected
ASYMPTOTIC COVARIANCE MATRIX FROM FILE <acmfilename>    Optional
MEANS FROM FILE <filename>                              Optional
STANDARD DEVIATIONS FROM FILE <filename>                Optional
SAMPLE SIZE <number>                                    Required
LATENT VARIABLES <labels>                               Required
RELATIONSHIPS <relationships>                           Required
SET <instruction>                                       Optional
LISREL OUTPUT <options>                                 Optional
PATH DIAGRAM                                            Optional
END OF PROBLEM                                          Optional

where <filename> denotes the complete name (including the drive and folder names) of a text file, <acmfilename> denotes the complete name (including the drive and folder names) of the binary file containing the estimated asymptotic covariance matrix of the sample moments and <number> denotes a positive integer.

The three general structures of the SIMPLIS syntax file listed here assume a single-group structural equation model. In the case of a multiple group structural equation model, these structures apply to each GROUP command (see Section 5.3.12). The only exception is the END OF PROBLEM command, in the sense that only one should be specified as the final command of the SIMPLIS syntax file for the multiple group analysis.

The SYSTEM FILE FROM FILE command is a required command only if a DSF is used. If the data to be analyzed do not come from a DSF or PSF, then the OBSERVED VARIABLES paragraph, the SAMPLE SIZE command, and one of the RAW DATA FROM FILE, COVARIANCE MATRIX FROM FILE, or the CORRELATION MATRIX FROM FILE commands are required. The LATENT VARIABLES paragraph is required only if the model includes latent variables. The RELATIONSHIPS or PATHS paragraph is required. The remaining SIMPLIS commands are all optional.

One of the SYSTEM FILE FROM FILE or RAW DATA FROM FILE commands or the OBSERVED VARIABLES paragraph should be the first command following the TITLE paragraph. If the END OF PROBLEM command is included, it must be the final command. The other commands and paragraphs can be entered in any order. In the following sections, the SIMPLIS commands and paragraphs are discussed separately in alphabetical order.
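The ordering rules for the PSF case can be sketched as a small generator script. This is a hypothetical Python helper, not a LISREL tool: `build_simplis_psf` and `quote_if_blank` are names invented here, the file name `alcohol.psf` is made up, and the form of the relationship expression (`Q1 Q2 Q3 Q4 = Tendency`) is an assumption, since model expressions are described in a section outside this excerpt. Quoting a file name that contains blank spaces follows the rule stated for other file commands and is assumed to apply to the PSF name as well.

```python
# Sketch: assemble a SIMPLIS syntax file for the PSF case in a valid order.
# Helper names, file names and the relationship expression are assumptions.

def quote_if_blank(name):
    """Wrap a file name in single quotes when it contains blank spaces."""
    return f"'{name}'" if " " in name else name

def build_simplis_psf(psf_name, relationships, title=None,
                      latent_variables=(), survey=None, path_diagram=False):
    """Return the text of a SIMPLIS syntax file that reads its data from a PSF."""
    if not relationships:
        raise ValueError("the RELATIONSHIPS paragraph is required")
    lines = []
    if title:
        lines.append(f"TITLE {title}")
    # The data command comes first after the optional title.
    lines.append(f"RAW DATA FROM FILE {quote_if_blank(psf_name)}")
    # Optional complex survey design commands.
    for keyword in ("STRATUM", "CLUSTER", "WEIGHT"):
        label = (survey or {}).get(keyword)
        if label:
            lines.append(f"{keyword} {label}")
    # Only needed when the model has latent variables.
    if latent_variables:
        lines.append("LATENT VARIABLES " + " ".join(latent_variables))
    lines.append("RELATIONSHIPS")
    lines.extend(relationships)
    if path_diagram:
        lines.append("PATH DIAGRAM")
    # If present, END OF PROBLEM must be the final command.
    lines.append("END OF PROBLEM")
    return "\n".join(lines)

syntax = build_simplis_psf(
    "alcohol.psf",
    ["Q1 Q2 Q3 Q4 = Tendency"],
    title="Tendency to use alcohol",
    latent_variables=["Tendency"],
    survey={"CLUSTER": "FACTYPE"},
    path_diagram=True,
)
print(syntax)
```

Generating the file from one function keeps the required commands in place and makes it harder to put a command after END OF PROBLEM by accident.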


5.3.2 $CLUSTER command

The $CLUSTER command is used to specify the variable that contains the cluster information of nested data for which a multilevel structural equation modeling analysis is desired. It is an optional command. For example, in the case of a standard structural equation modeling analysis, the $CLUSTER command is omitted.

Syntax

$CLUSTER <label>

where <label> denotes the label of the cluster variable.

Example

Suppose that the primary sampling units of the complex survey are facility types and that the variable FACTYPE is used to indicate the facility type for each observation. Then, the corresponding $CLUSTER command is

$CLUSTER FACTYPE

5.3.3 $PREDICT command

The $PREDICT command is used to specify the explanatory variables for the fixed part of a multilevel structural equation model. It is an optional command. For example, in the case of a standard structural equation modeling analysis, the $PREDICT command is omitted.

Syntax

$PREDICT <labels>

where <labels> denotes the labels of the explanatory variables.

Example

Suppose that the age (AGE) and gender (GENDER) of each respondent are to be used as predictors for the fixed part of a multilevel structural equation model. For this example, the corresponding $PREDICT command is

$PREDICT AGE GENDER


5.3.4 ASYMPTOTIC COVARIANCE MATRIX FROM FILE command

The ASYMPTOTIC COVARIANCE MATRIX FROM FILE command is used to specify the name of the binary file that contains the estimated asymptotic covariance matrix of the sample moments. In the case of the Robust Maximum Likelihood (RML), the Weighted Least Squares (WLS) and the Diagonally Weighted Least Squares (DWLS) methods, it is a required command.

Syntax

ASYMPTOTIC COVARIANCE MATRIX FROM FILE <acmfilename>

where <acmfilename> denotes the name of the binary file containing the estimated asymptotic covariance matrix of the sample moments. If <acmfilename> contains blank spaces, it should be given in single quotes.

Example

Suppose that the name of the binary file with the estimated asymptotic covariance matrix of the sample variances and covariances is NIH1.ACM and that it is located in the folder Projects\NIH1 on the E drive. In this case, the corresponding ASYMPTOTIC COVARIANCE MATRIX FROM FILE command is given by

ASYMPTOTIC COVARIANCE MATRIX FROM FILE E:\Projects\NIH1\NIH1.ACM

5.3.5 CASEWEIGHT command

The purpose of the CASEWEIGHT command is to allow the user to specify the variable containing the weights of the individual observations to be used to compute weighted means, sample variances and covariances (correlations) and asymptotic covariance matrices of the sample variances and covariances (correlations). It is assumed that these weights are normalized in the sense that they add up to the sample size. The CASEWEIGHT command is an optional command and corresponds with the selected variable on the Weight Cases dialog box (see Section 5.2.8).

Syntax

CASEWEIGHT <label>

where <label> denotes the label of the variable containing the case weights.

Example

Suppose that the variable NEWWGT contains the weight for each observation. For this example, the CASEWEIGHT command is given by

CASEWEIGHT NEWWGT


5.3.6 CLUSTER command

The CLUSTER command is used to specify the variable for the primary sampling units of the complex survey. It is an optional command. For example, in the case of a simple random sample, the CLUSTER command is omitted. The CLUSTER command corresponds with the CLUSTER variable section on the Survey Design dialog box (see Section 5.2.8).

Syntax

CLUSTER <label>

where <label> denotes the label of the cluster variable.

Example

Suppose that the primary sampling units of the complex survey are types of facility and that the variable FACTYPE is used to indicate the facility type for each observation. Then, the corresponding CLUSTER command is

CLUSTER FACTYPE

5.3.7 CORRELATION MATRIX paragraph

The correlation matrix to be processed can be specified as a part of the SIMPLIS syntax file by using the CORRELATION MATRIX paragraph. It is a required command only if the correlations to be processed are provided as part of the SIMPLIS syntax file. If the sample correlations are in the form of a text file, the CORRELATION MATRIX FROM FILE command rather than the CORRELATION MATRIX paragraph is used (see Section 5.3.8).

Syntax

CORRELATION MATRIX
<values>

where <values> denotes the sample correlations of the observed variables in free or fixed format. If the sample correlations are listed in a fixed format, a FORTRAN format statement should be included as the first line of the CORRELATION MATRIX paragraph.

Examples

CORRELATION MATRIX
1.000
0.257 1.000
0.521 0.245 1.000
0.533 0.346 0.218 1.000

CORRELATION MATRIX
(4F6.3)
 1.000
 0.257 1.000
 0.521 0.245 1.000
 0.533 0.346 0.218 1.000

5.3.8 CORRELATION MATRIX FROM FILE command

If the correlation matrix to be processed is in the form of a text file, the CORRELATION MATRIX FROM FILE command is a required command and is used to specify the name of the text file that contains the sample correlations of the observed variables of the model. It is also possible to specify the correlation matrix as part of the SIMPLIS syntax file. In this case, the CORRELATION MATRIX paragraph instead of the CORRELATION MATRIX FROM FILE command is used (see Section 5.3.7).

Syntax

CORRELATION MATRIX FROM FILE <filename>

where <filename> denotes the name of the text file containing the sample correlation matrix. If the filename contains blank spaces, it should be given in single quotes.

Example

Suppose that the sample correlations are contained in the text file SELECT.COR, which is located in the folder My Projects\SELECT on the D drive. Because the folder name contains a blank space, the file name is given in single quotes. In this case, the corresponding CORRELATION MATRIX FROM FILE command is given by

CORRELATION MATRIX FROM FILE 'D:\My Projects\SELECT\SELECT.COR'

5.3.9

COVARIANCE MATRIX paragraph

The COVARIANCE MATRIX paragraph is used to provide the sample covariance matrix as part of the SIMPLIS syntax file. If the covariance matrix to be analyzed is provided as part of the SIMPLIS syntax file, it is a required command. If the sample covariance matrix is in the form of a text file, the COVARIANCE MATRIX FROM FILE command rather than the COVARIANCE MATRIX paragraph is used (see Section 5.3.10).

Syntax

COVARIANCE MATRIX
<values>

where <values> denotes the sample variances and covariances of the observed variables in free format. If the sample variances and covariances are provided in a fixed format, a FORTRAN-type format statement should be included as the first line of the COVARIANCE MATRIX paragraph.

Examples

COVARIANCE MATRIX
25.001
33.257 57.251
26.385 32.674 61.323
39.533 38.552 44.227 72.052

COVARIANCE MATRIX
(4F6.3)
25.001
33.25757.251
26.38532.67461.323
39.53338.55244.22772.052
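The fixed-format example can be hard to read because adjacent fields run together. The following Python sketch is not part of LISREL; it only illustrates how a format statement such as (4F6.3) splits each line into fields of six characters. The function names are hypothetical, and the sketch assumes one row of the lower triangle per line and values written with explicit decimal points.

```python
import re


def parse_f_format(fmt):
    """Parse a simple FORTRAN format such as '(4F6.3)' into (count, width, decimals)."""
    match = re.fullmatch(r"\(\s*(\d+)\s*F\s*(\d+)\.(\d+)\s*\)", fmt.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unsupported format statement: {fmt!r}")
    count, width, decimals = (int(group) for group in match.groups())
    return count, width, decimals


def read_lower_triangle(fmt, lines):
    """Cut each line into fixed-width fields and build the full symmetric matrix."""
    count, width, _ = parse_f_format(fmt)
    rows = []
    for line in lines:
        fields = [line[i:i + width] for i in range(0, len(line), width)]
        values = [float(field) for field in fields if field.strip()]
        if len(values) > count:
            raise ValueError("more fields on a line than the format allows")
        rows.append(values)
    size = len(rows)
    full = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(rows):
        if len(row) != i + 1:
            raise ValueError(f"row {i + 1} should hold {i + 1} values")
        for j, value in enumerate(row):
            full[i][j] = value  # lower triangle as given
            full[j][i] = value  # mirrored into the upper triangle
    return full


# The four data lines of the fixed-format example above.
lines = [
    "25.001",
    "33.25757.251",
    "26.38532.67461.323",
    "39.53338.55244.22772.052",
]
cov = read_lower_triangle("(4F6.3)", lines)
for row in cov:
    print(row)
```

Read this way, the fixed-format lines hold the same ten values as the free-format example.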

5.3.10

COVARIANCE MATRIX FROM FILE command

The COVARIANCE MATRIX FROM FILE command is used to specify the name of the text file that contains the sample covariance matrix of the observed variables of the model. It is a required command only if the covariance matrix to be analyzed is in the form of a text file.
Syntax COVARIANCE MATRIX FROM FILE <filename>

where <filename> denotes the name of the text file containing the sample covariance matrix.
Example

Suppose that the name of the text file with the sample variances and covariances is NIH1.COV and that it is located in the folder Projects\NIH1 on the E drive. In this case, the corresponding COVARIANCE MATRIX FROM FILE command is given by
COVARIANCE MATRIX FROM FILE E:\Projects\NIH1\NIH1.COV

5.3.11

END OF PROBLEM command

The END OF PROBLEM command is usually the final command of a SIMPLIS syntax file and it indicates that no more commands or paragraphs are to be processed. It is an optional command.
Syntax END OF PROBLEM

5.3.12

GROUP command

The GROUP command is used to specify a model for each of the groups in a multiple-group structural equation model. A GROUP command is specified for each group to be included in the multiple-group analysis. If no RELATIONSHIPS or PATHS paragraph and no SET command is specified for a group after the first group, the structural equation model for that group is assumed to be identical (including equal parameters) to that of the previous group. In other words, if a parameter is to differ from its counterpart in the previous group, it has to be specified explicitly in the RELATIONSHIPS or PATHS paragraph or in a SET command for that specific group.
Syntax GROUP <string>

where <string> denotes the descriptive name of the group.


Examples

GROUP Freshmen
GROUP 1

5.3.13

LATENT VARIABLES paragraph

The LATENT VARIABLES paragraph is used to provide descriptive names to the latent variables of the model. It is a required command if the model includes latent variables. However, if the latent variable labels are in the form of a text file, the LATENT VARIABLES FROM FILE command instead of the LATENT VARIABLES paragraph is used (see Section 5.3.14).
Syntax LATENT VARIABLES <labels>

where <labels> denotes the descriptive names of the latent variables of the model. These names are provided in free or abbreviated format and only the first 8 characters of each name are utilized.
Examples

LATENT VARIABLES JobSat OrgCom Perform
LATENT VARIABLES FACTOR1 - FACTOR4
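Because only the first 8 characters of each name are utilized, two long labels that share their first 8 characters become indistinguishable. The Python sketch below is not part of LISREL; the helper names and sample labels are hypothetical. It shows one way to check a list of labels for such clashes before writing the paragraph.

```python
def effective_labels(labels, width=8):
    """Return the part of each label that is retained (the first `width` characters)."""
    return [label[:width] for label in labels]


def find_collisions(labels, width=8):
    """Group labels whose first `width` characters coincide."""
    groups = {}
    for label in labels:
        groups.setdefault(label[:width], []).append(label)
    return {short: names for short, names in groups.items() if len(names) > 1}


# Hypothetical labels: the first two share the prefix "JobSatis".
labels = ["JobSatisfaction", "JobSatisfied", "OrgCom"]
print(effective_labels(labels))
print(find_collisions(labels))
```

Short labels such as JobSat, OrgCom and Perform are unaffected.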

5.3.14

LATENT VARIABLES FROM FILE command

If the labels of the latent variables are in the form of a text file, the LATENT VARIABLES FROM FILE command is used to specify descriptive names for the latent variables of the model. In this specific case, it is a required command. The latent variable labels can also be specified as part of the SIMPLIS syntax file. In that case, the LATENT VARIABLES paragraph rather than the LATENT VARIABLES FROM FILE command is used (see Section 5.3.13).
Syntax LATENT VARIABLES FROM FILE <filename>

where <filename> denotes the name of the text file containing the descriptive names of the latent variables of the model.
Example

Suppose that the name of the text file containing the latent variable labels is SELECT.LAB and that it is located in the folder Projects\SELECT on the D drive. In this case, the corresponding LATENT VARIABLES FROM FILE command is given by
LATENT VARIABLES FROM FILE D:\Projects\SELECT\SELECT.LAB

5.3.15

LISREL OUTPUT command

The LISREL OUTPUT command is used to request the results to be printed in terms of the LISREL model used in the analysis, to specify special analyses and to request additional results. It is an optional command. If the results in terms of the LISREL model are not desired, the OPTIONS command may be used to specify special analyses and to request additional results (see Section 5.3.21).

Syntax LISREL OUTPUT <options>

where <options> denotes a list of options for the analysis each of which either has the syntax:
<keyword> = <selection>

where <keyword> is one of AD, AL, BE, DW, EP, GA, IT, KA, LX, LY, MA, ME, ND, NP, PH, PS, PV, RC, SI, SL, SV, TD, TE, TH, TM, TV, TX, TY or XO and <selection> denotes a number, a value or a name; or the syntax:
<option>

where <option> is one of ALL, AM, EF, FS, FT, MI, MR, NS, PC, PT, RO, RS, SC, SO, SS, WP, XA, XI or XM. Keywords and options may be specified in any order.
Examples

LISREL OUTPUT ND = 3 SC ME = DW
LISREL OUTPUT BE = BETA.TXT GA = GAMMA.TXT PV = PV.TXT SV = SV.TXT ND = 6

All the keywords and options of the LISREL OUTPUT command are optional.

AD keyword

The purpose of the AD keyword is to specify the iteration number at which the admissibility of the solution will be checked and the iterations will stop if the check fails.
Syntax AD = <number>

where <number> denotes the iteration number or OFF if the check is to be turned off.
Default AD = 20

AL keyword

The AL keyword is used to specify the name of the text file to which the estimates of the endogenous latent variable means are to be written.

Syntax AL = <filename>

where <filename> denotes the name of the text file to which the estimates are to be written.

ALL option

The purpose of the ALL option is to invoke the printing of all the results in the output file.
AM option

The automatic model modification procedure is invoked by specifying the AM option.

BE keyword

The purpose of the BE keyword is to specify the name of the text file for the estimate of the matrix of regression coefficients for the regression among the endogenous latent variables.
Syntax BE = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

EF option

The EF option is used to invoke the printing of the estimated total and indirect effects in the output file.

EP keyword

The convergence criterion for the iterative algorithm, which is used to obtain parameter and standard error estimates, is specified by using the EP keyword.
Syntax EP = <value>

where <value> denotes the convergence criterion.


Default EP = 0.000001

FS option

The FS option is used to request a factor scores regression analysis.

FT option

The purpose of the FT option is to request an external text file containing measures of fit based on four different chi-square test statistic values.

GA keyword

The GA keyword is used to specify the name of the text file for the estimated regression matrix of the regression of the endogenous latent variables on the exogenous latent variables.
Syntax GA = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

IT keyword

The purpose of the IT keyword is to specify the maximum number of iterations for the iterative algorithm, which is used to compute parameter and standard error estimates.
Syntax IT = <number>

where <number> denotes the maximum number of iterations.


Default IT = <5q>

where <5q> is five times the number of free parameters of the model.
KA keyword

The KA keyword is used to specify the name of the text file for the estimates of the means of the exogenous latent variables.

Syntax KA = <filename>

where <filename> denotes the name of the text file for the estimates.

LX keyword

The estimated matrix of factor loadings for the exogenous latent variables can be written to a text file by using the LX keyword.
Syntax LX = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

LY keyword

The purpose of the LY keyword is to specify the name of the text file for the estimated matrix of factor loadings for the endogenous latent variables.
Syntax LY = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

MA keyword

The MA keyword is used to specify the name of the text file for the moment matrix that was analyzed.
Syntax MA = <filename>

where <filename> denotes the name of the text file for the moment matrix that was analyzed.

ME keyword

If the maximum likelihood method is not desired, other methods to fit the LISREL model to the data can be specified by using the ME keyword.
Syntax ME = <method>

where <method> is one of the following:

IV   instrumental variables
TS   two-stage least squares
UL   unweighted least squares
GL   generalized least squares
ML   maximum likelihood
WL   weighted least squares
DW   diagonally weighted least squares

Default ME = ML

MI option

The printing of the model modification indices in the output file is invoked by specifying the MI option.

MR option

The purpose of the MR option is to specify a MINRES exploratory factor analysis.

ND keyword

The purpose of the ND keyword is to specify the number of decimals for the results.
Syntax ND = <number>

where <number> denotes the number of decimals desired.


Default ND = 2

NP keyword

The NP keyword is used to specify the number of decimals for external text files to be produced.
Syntax NP = <number>

where <number> denotes the number of decimals desired.


Default NP = 3

NS option

The NS option is used to suppress the computation of internal starting values.

PC option

The PC option is used to invoke the printing of both the estimated asymptotic covariance and correlation matrices of the parameter estimators in the output file.
PH keyword

The PH keyword is used to specify the name of the text file for the estimated covariance (correlation) matrix of the exogenous latent variables.
Syntax PH = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

PS keyword

The purpose of the PS keyword is to specify the name of the text file for the estimated covariance matrix of the error terms for the endogenous latent variables.
Syntax PS = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

PT option

The PT option is used to invoke the printing of the technical details of the estimation method in the output file.

PV keyword

The PV keyword is used to specify the name of the text file for the estimates of all the free parameters of the LISREL model.
Syntax PV = <filename>

where <filename> denotes the name of the text file to which the estimates are to be written.

RC keyword

The RC keyword is used to specify the ridge constant to be used if the matrix to be analyzed is not positive definite.
Syntax RC = <value>

where <value> denotes the ridge constant.


Default RC = 0.001

RO option

The purpose of the RO option is to invoke the use of the ridge constant for the moment matrix to be analyzed. The RO option will be invoked automatically if the matrix is not positive definite.

RS option

The RS option is used to invoke the printing of the residuals, standardized residuals, QQ-plot, and fitted covariance (or correlation, or moment) matrix in the output file.

SC option

The SC option is used to invoke the printing of the completely standardized solution in the output file.

SI keyword

The purpose of the SI keyword is to specify the name of the text file for the fitted moment matrix.
Syntax SI = <filename>

where <filename> denotes the name of the text file for the fitted moment matrix.

SL keyword

The SL keyword is used to specify the significance level, expressed as a percentage, for the automated model modification procedure. It applies only when that procedure is requested (see the AM option).
Syntax SL = <number>

where <number> denotes the significance level expressed as a percentage.


Default SL = 1

SO option

The SO option is used to suppress the automated checking of the scale setting for each latent variable. It is needed for very special models where scales for latent variables are defined in a different way.
SS option

The SS option is used to invoke the printing of the standardized solution in the output file.

SV keyword

The purpose of the SV keyword is to specify the name of the text file for the standard error estimates of all the free parameters of the LISREL model.

Syntax SV = <filename>

where <filename> denotes the name of the text file to which the standard error estimates are to be written.

TD keyword

The TD keyword is used to specify the name of the text file for the estimated covariance matrix of the measurement errors of the indicators of the exogenous latent variables.
Syntax TD = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

TE keyword

The purpose of the TE keyword is to specify the name of the text file for the estimated covariance matrix of the measurement errors of the indicators of the endogenous latent variables.
Syntax TE = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

TH keyword

The TH keyword is used to specify the name of the text file for the estimated covariance matrix between the measurement errors of the indicators of the endogenous latent variables and those of the exogenous latent variables.
Syntax TH = <filename>

where <filename> denotes the name of the text file for the matrix of estimates.

TM keyword

The TM keyword can be used to specify the maximum number of CPU seconds allowed for the current analysis.
Syntax TM = <number>

where <number> denotes the maximum number of CPU seconds allowed.


Default TM = 172800

TV keyword

The purpose of the TV keyword is to specify the name of the text file for the t values of all the free parameters of the LISREL model.
Syntax TV = <filename>

where <filename> denotes the name of the text file to which the t values are to be written.

TX keyword

The TX keyword is used to specify the name of the text file for the estimated vector of intercepts for the indicators of the exogenous latent variables.
Syntax TX = <filename>

where <filename> denotes the name of the text file for the vector of estimates.

TY keyword

The TY keyword is used to specify the name of the text file for the estimated vector of intercepts for the indicators of the endogenous latent variables.
Syntax TY = <filename>

where <filename> denotes the name of the text file for the vector of estimates.

WP option

The WP option is used to specify a column width of 132 for the output file. The default column width is 80 characters.
XA option

The purpose of the XA option is to suppress the computation and printing of the additional chi-square test statistic values. Only C1 (the minimum fit function chi-square value) will be computed. Standard error estimates are not affected. C1 is still an asymptotically correct chi-square for the GLS, ML, and WLS methods, but not for the ULS and DWLS methods. The option is intended only for those who have very large models and cannot afford (or do not want) to let the computer run for an extended period.

XI option

The XI option is used to suppress the printing of the numerous measures of fit to the output file. If the XI option is specified, only the chi-square value, degrees of freedom and corresponding p-value are printed.

XM option

The XM option is used to suppress the computation and printing of the modification indices. When a path diagram is requested, the indices will be computed, but will not be included in the output.

XO keyword

The purpose of the XO keyword is to specify the number of repetitions for which results should be written to the output file.
Syntax XO = <number>

where <number> denotes the number of repetitions for which results should be written to the output file.
Default XO = <nrep>

where <nrep> is the number of repetitions specified.

5.3.16

MEANS paragraph

The MEANS paragraph is used to provide the sample means of the observed variables of the model as part of the SIMPLIS syntax file. It is a required command only if a mean-and-covariance structure model is specified and the raw data file or DSF is not provided. Sample means can also be provided in the form of an external text file. In this case, the MEANS FROM FILE command rather than the MEANS paragraph is used (see Section 5.3.17).
Syntax MEANS <values>

where <values> denotes a list of sample means in free or fixed format. If a fixed format rather than a free format is used, a FORTRAN type format statement should be the first line of the MEANS paragraph.
Examples

MEANS
12.225 16.752 18.239 20.003 15.395

MEANS
(5F6.3)
12.22516.75218.23920.00315.395

5.3.17

MEANS FROM FILE command

The MEANS FROM FILE command is used to specify the name of the text file that contains the sample means of the observed variables of the model. It is a required command only if a mean-and-covariance structure model is specified and the raw data file or DSF is not provided. Sample means can also be provided as part of the SIMPLIS syntax file. This is accomplished by using the MEANS paragraph instead of the MEANS FROM FILE command (see Section 5.3.16).
Syntax MEANS FROM FILE <filename>

where <filename> denotes the name of the text file containing the sample means.

Example

Suppose that the name of the text file with the sample means is SELECT.MNS and that it is located in the folder Projects\SELECT on the D drive. In this case, the corresponding MEANS FROM FILE command is given by
MEANS FROM FILE D:\Projects\SELECT\SELECT.MNS

5.3.18

MISSING VALUE CODE command

If the raw data to be processed include missing values, the MISSING VALUE CODE command is used to specify the global missing value. It is a required command if the Full Information Maximum Likelihood (FIML) method for data with missing values is to be used and the global missing value is not specified in the PSF.
Syntax MISSING VALUE CODE <value>

where <value> denotes the global missing value.


Example

Suppose that the missing values in a text data file are all listed as -100. The MISSING VALUE CODE command is then
MISSING VALUE CODE -100
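To make the meaning of a global missing value concrete, the Python sketch below (not LISREL code; the data values and function names are hypothetical) marks every cell equal to the code as missing and counts the cases that remain complete. It only illustrates what the code flags; how the flagged cells are then handled is decided by the estimation method.

```python
import math


def mask_missing(rows, code):
    """Replace every cell equal to the global missing value code with NaN."""
    return [[math.nan if value == code else value for value in row] for row in rows]


def count_complete_cases(masked_rows):
    """Count the rows that contain no missing cells."""
    return sum(1 for row in masked_rows if not any(math.isnan(v) for v in row))


# Hypothetical raw data with -100 used as the global missing value.
data = [
    [12.0, -100.0, 3.5],
    [7.0, 8.0, 2.0],
    [-100.0, -100.0, 4.0],
]
masked = mask_missing(data, -100.0)
n_complete = count_complete_cases(masked)
print(n_complete, "of", len(data), "cases are complete")
```

The code should be a value that cannot occur as a legitimate observation, since every matching cell is treated as missing.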

5.3.19

OBSERVED VARIABLES paragraph

The OBSERVED VARIABLES paragraph is used to provide descriptive names to the observed variables of the model. It is a required command, unless a PSF or a DSF is used. If the labels of the observed variables are in the form of a text file, the OBSERVED VARIABLES FROM FILE command instead of the OBSERVED VARIABLES paragraph is used (see Section 5.3.20).
Syntax OBSERVED VARIABLES <labels>

where <labels> denotes the descriptive names of the observed variables of the model. These names are provided in free or abbreviated format and only the first 8 characters of each name are utilized.

Examples

OBSERVED VARIABLES Age Gender MSCORE SSCORE ESCORE
OBSERVED VARIABLES JS1 JS6 OC1 OC10

5.3.20

OBSERVED VARIABLES FROM FILE command

If the labels of the observed variables are in the form of a text file, the OBSERVED VARIABLES FROM FILE command is used to specify descriptive names for the observed variables of the model. If a DSF or a PSF is not used, it is a required command. The labels of the observed variables can also be specified as part of the SIMPLIS syntax file. In that case, the OBSERVED VARIABLES paragraph rather than the OBSERVED VARIABLES FROM FILE command is used (see Section 5.3.19).
Syntax OBSERVED VARIABLES FROM FILE <filename>

where <filename> denotes the name of the text file containing the descriptive names of the observed variables of the model.
Example

Suppose that the name of the text file containing the observed variable labels is ABUSE.LAB, which is located in the folder Projects\ABUSE on the C drive. In this case, the corresponding OBSERVED VARIABLES FROM FILE command is given by
OBSERVED VARIABLES FROM FILE C:\Projects\ABUSE\ABUSE.LAB

5.3.21

OPTIONS command

The OPTIONS command is used to specify special analyses and to request additional results. It is an optional command. If the results in terms of the LISREL model are preferred, the LISREL OUTPUT command should be used (see Section 5.3.15).
Syntax OPTIONS <options>

where <options> denotes a list of options for the analysis each of which either has the syntax:
<keyword> = <selection>

where <keyword> is one of AD, AL, BE, EP, GA, IT, KA, LX, LY, MA, ME, ND, NP, PH, PS, PV, RC, SI, SL, SV, TD, TE, TH, TM, TV, TX, TY or XO (see Section 5.3.15) and <selection> denotes a number, a value or a name; or the syntax:
<option>

where <option> is one of ALL, AM, DW, EF, FS, FT, MI, MR, NS, PC, PT, RO, RS, SC, SO, SS, WP, XA, XI or XM (see Section 5.3.15).
Example

OPTIONS ND = 3 SC ME = DW AD = OFF

5.3.22

PATH DIAGRAM command

The PATH DIAGRAM command is used to generate a PTH file in which the results of the analysis are summarized in the form of a path diagram. It is an optional command.
Syntax PATH DIAGRAM

5.3.23

PATHS paragraph

The PATHS paragraph may be used to specify the regression relationships of the structural equation model to be fitted to the data. Alternatively, the RELATIONSHIPS paragraph can be used to specify these relationships (see Section 5.3.26). It is a required command only if the RELATIONSHIPS paragraph is not used.
Syntax PATHS <paths>

where <paths> denotes a list of regression relationships each of which has the following syntax
<x> -> <y>

where <x> denotes a list of independent (exogenous) variable labels and <y> denotes a list of dependent (endogenous) variable labels. These lists can be in free format or in abbreviated form.

Example

PATHS
JS -> JS1 JS7
OC -> OC1 OC3 OC7
OC -> JS

5.3.24

RAW DATA paragraph

The RAW DATA paragraph is used to provide the raw data to be analyzed as part of the SIMPLIS syntax file. It is a required command only if the raw data are provided as part of the SIMPLIS syntax file. If the raw data are in the form of a text file or a PSF, the RAW DATA FROM FILE command instead of the RAW DATA paragraph is used (see Section 5.3.25).
Syntax RAW DATA <values>

where <values> denotes the rows of the raw data matrix. These rows can be provided in free or fixed formats. However, in the case of a fixed format, the first line of the RAW DATA paragraph should be a FORTRAN format statement.
Examples

RAW DATA
175 217 555 224 331 566 777 111 121 667

RAW DATA
(2F6.3)
12.34514.417
16.24519.205
10.33411.276
15.11416.267
13.24715.589

5.3.25

RAW DATA FROM FILE command

The RAW DATA FROM FILE command is used to specify the name of the PSF or the text file containing the raw data. It is a required command only if a PSF or a text data file is to be processed. The raw data matrix can also be specified as part of the SIMPLIS syntax file. In that case, the RAW DATA paragraph rather than the RAW DATA FROM FILE command is used (see Section 5.3.24).
Syntax RAW DATA FROM FILE <filename>

where <filename> denotes the name of the text data file or the PSF.
Example

Suppose that the name of the PSF containing the raw data is SELECT.PSF and that it is located in the folder Projects\SELECT on the E drive. In this case, the corresponding RAW DATA FROM FILE command is given by
RAW DATA FROM FILE E:\Projects\SELECT\SELECT.PSF

5.3.26

RELATIONSHIPS paragraph

The RELATIONSHIPS paragraph may be used to specify the regression relationships of the structural equation model. The PATHS paragraph can also be used to specify these relationships (see Section 5.3.23). It is a required paragraph only if the PATHS paragraph is not used.
Syntax RELATIONSHIPS <relationships>

where <relationships> denotes a list of regression relationships each of which has the following syntax
<y> = <x>

where <x> denotes a list of independent (exogenous) variable labels and <y> denotes a list of dependent (endogenous) variable labels. These lists can be in free format or in abbreviated form.

Example

RELATIONSHIPS
JS1 JS7 = JS
OC1 OC3 OC7 = OC
JS = OC
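The PATHS and RELATIONSHIPS paragraphs describe the same regressions with the two sides swapped: a path line of the form <x> -> <y> corresponds to a relationship line <y> = <x>. The Python sketch below is only an illustration of that correspondence, not a LISREL tool, and the function name is hypothetical. It rewrites the lines of the PATHS example of the PATHS paragraph section into the form used above.

```python
def paths_to_relationships(path_lines):
    """Rewrite '<x> -> <y>' lines as '<y> = <x>' lines."""
    relationships = []
    for line in path_lines:
        left, arrow, right = line.partition("->")
        if not arrow:
            raise ValueError(f"not a path specification: {line!r}")
        # The variables after the arrow are dependent and move to the left of '='.
        relationships.append(f"{right.strip()} = {left.strip()}")
    return relationships


paths = ["JS -> JS1 JS7", "OC -> OC1 OC3 OC7", "OC -> JS"]
for line in paths_to_relationships(paths):
    print(line)
```

Either paragraph can therefore be used to specify a given model, whichever notation is more convenient.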

5.3.27

SAMPLE SIZE command

The SAMPLE SIZE command is used to specify the number of cases to be processed. It is a required command, unless a DSF or PSF is used.
Syntax SAMPLE SIZE <number>

where <number> denotes the number of cases.


Example SAMPLE SIZE 388

5.3.28

SET command

The SET command is used to specify the status and/or the value(s) of a parameter(s) of the model. It is an optional command.
Syntax

SET the <parameter> equal to <value>
SET the <parameter> Free
SET the <parameter1> and the <parameter2> Equal

where <parameter> is one of

Path <label1> -> <label2>
Variance of <label>
Covariance of <label1> and <label2>
Error Variance of <label>
Error Covariance of <label1> and <label2>

where <label>, <label1> and <label2> denote labels of observed or latent variables of the model and <value> denotes a real number.

Examples

SET the Path Ses -> Alien67 and the Path Ses -> Alien71 Equal
SET the Variance of Ses equal to 1.0
SET the Error Covariance of ANOMIA67 and ANOMIA71 Free

5.3.29

STANDARD DEVIATIONS paragraph

The STANDARD DEVIATIONS paragraph is used to provide the sample standard deviations of the observed variables of the model as part of the SIMPLIS syntax file. It is a required command only if the covariance matrix is to be analyzed and a CORRELATION MATRIX FROM FILE command or CORRELATION MATRIX paragraph is used. Sample standard deviations can also be provided in the form of an external text file. In this case, the STANDARD DEVIATIONS FROM FILE command rather than the STANDARD DEVIATIONS paragraph is used (see Section 5.3.30).
Syntax STANDARD DEVIATIONS <values>

where <values> denotes a list of sample standard deviations of the observed variables of the model in free or fixed format. If a fixed format rather than a free format is used, a FORTRAN format statement should be the first line of the STANDARD DEVIATIONS paragraph.
Examples

STANDARD DEVIATIONS
13.61 14.76 14.13 14.90 10.90 3.749

STANDARD DEVIATIONS
(5F6.3)
12.22516.75218.23920.00315.395
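Standard deviations are needed alongside a correlation matrix because each covariance is the product of the correlation and the two standard deviations involved, that is, cov(i, j) = r(i, j) x s(i) x s(j). The Python sketch below illustrates this arithmetic only; it is not LISREL code, the function names are hypothetical, and pairing the four-variable correlation matrix of the CORRELATION MATRIX example with the first four standard deviations listed above is an assumption made purely for illustration.

```python
def expand_lower_triangle(rows):
    """Turn lower-triangular rows into a full symmetric matrix."""
    size = len(rows)
    full = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value
    return full


def covariance_from_correlation(corr, sds):
    """Scale each correlation by the two standard deviations involved."""
    size = len(sds)
    if len(corr) != size or any(len(row) != size for row in corr):
        raise ValueError("correlation matrix and standard deviations do not match")
    return [[corr[i][j] * sds[i] * sds[j] for j in range(size)] for i in range(size)]


corr = expand_lower_triangle([
    [1.000],
    [0.257, 1.000],
    [0.521, 0.245, 1.000],
    [0.533, 0.346, 0.218, 1.000],
])
sds = [13.61, 14.76, 14.13, 14.90]  # illustrative pairing, see the note above
cov = covariance_from_correlation(corr, sds)
for row in cov:
    print([round(value, 3) for value in row])
```

The diagonal of the result holds the variances, which are the squared standard deviations.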

5.3.30

STANDARD DEVIATIONS FROM FILE command

The STANDARD DEVIATIONS FROM FILE command is used to specify the name of the text file that contains the standard deviations of the observed variables of the model. It is a required command only if the covariance matrix is to be analyzed and a CORRELATION MATRIX FROM FILE command or a CORRELATION MATRIX paragraph is used. Sample standard deviations can also be provided as part of the SIMPLIS syntax file. This is accomplished by using the STANDARD DEVIATIONS paragraph instead of the STANDARD DEVIATIONS FROM FILE command (see Section 5.3.29).

Syntax STANDARD DEVIATIONS FROM FILE <filename>

where <filename> denotes the name of the text file containing the sample standard deviations.
Example

Suppose that the name of the text file with the sample standard deviations is Depresion.std and that it is located in the folder Projects\DEPRESSION on the E drive. In this case, the corresponding STANDARD DEVIATIONS FROM FILE command is given by
STANDARD DEVIATIONS FROM FILE E:\Projects\DEPRESSION\Depresion.std

5.3.31

STRATUM command

Complex surveys are typically obtained by stratifying the target population into subpopulations (strata). The STRATUM command allows the user to specify the stratification variable. Because surveys without stratification are also accommodated, the STRATUM command is an optional command. The STRATUM command corresponds with the STRATUM variable section on the Survey Design dialog box (see Section 5.2.8).
Syntax STRATUM <label>

where <label> denotes the label of the stratification variable.


Example

Suppose that the target population was stratified into census regions and that the variable CENREG is the variable used to indicate the census region for each observation. In this case, the STRATUM command is given by
STRATUM CENREG

5.3.32

SYSTEM FILE FROM FILE command

The SYSTEM FILE FROM FILE command is used to specify the DSF to be processed. It is a required command only if a DSF is to be processed.
Syntax SYSTEM FILE FROM FILE <filename>

where <filename> denotes the name of the DSF.


Example

Suppose that the DSF file, DEPRESSION.DSF, which is located in the folder Projects\DEPRESSION on the F drive, is to be processed. In this case, the corresponding SYSTEM FILE FROM FILE command is given by
SYSTEM FILE FROM FILE F:\Projects\DEPRESSION\DEPRESSION.DSF

5.3.33

TITLE paragraph

The TITLE paragraph is used to specify a descriptive heading for the analysis. It is an optional command. If the TITLE paragraph is used, avoid using any words that correspond to other SIMPLIS commands or paragraphs in the string field.
Syntax TITLE <string>

where <string> denotes a character string.


Example TITLE A SIMPLIS syntax file for Example 6

5.3.34

WEIGHT command

Design weights are constructed for the ultimate sampling units of complex surveys. The purpose of the WEIGHT command is to allow the user to specify the design weight variable. Since surveys without design weights are permitted, the WEIGHT command is an optional command. The WEIGHT command corresponds with the DESIGN weight section on the Survey Design dialog box (see Section 5.2.8).
Syntax WEIGHT <label>

where <label> denotes the label of the design weight variable.

Example

Suppose that the variable USUWGT is used to capture the design weight for each observation. For this example, the WEIGHT command is given by
WEIGHT USUWGT

5.4

Examples

5.4.1 A structural equation model for the 2001 Monitoring the Future data
The data

The Inter-University Consortium for Political and Social Research (ICPSR) at the University of Michigan has undertaken annual surveys designed to explore changes in important values, behaviors, and lifestyle orientations of contemporary American youth. The aims of these surveys are to provide a systematic, accurate description of the youth population of interest in a given year, and to explain relationships and trends observed over time. The Monitoring the Future surveys began in 1975. In the current example, data for 1608 respondents from the 2001 survey are used, and the focus is on relationships between the alcohol and marijuana use of respondents and traffic violations and/or accidents they were involved in. Data for the first 10 participants on most of the variables used in this section are shown below in the form of a PSF named select.psf which can be found in the TUTORIAL folder.

The following variables included in the PSF were selected from the survey data:

o school: This variable is used to indicate group membership of respondents within the 48 schools included in the survey.
o region: The 48 schools were drawn from 4 regions, and this variable indicates the region a school was drawn from.
o alclifs: The numerical response to the question "On how many occasions have you had alcoholic beverages to drink in your lifetime?"
o alc12mos: The numerical response to the question "On how many occasions have you had alcoholic beverages to drink in the past 12 months?"
o alc30ds: The numerical response to the question "On how many occasions have you had alcoholic beverages to drink in the past 30 days?"
o xmjlifs: The numerical response to the question "On how many occasions have you used marijuana in your lifetime?"
o xmj12mos: The numerical response to the question "On how many occasions have you used marijuana in the past 12 months?"
o xmj30ds: The numerical response to the question "On how many occasions have you used marijuana in the past 30 days?"
o tick12mo: The numerical response to the question "Within the last 12 months, how many times have you received a ticket (or been stopped and warned) for moving violations?"
o acci12mo: The numerical response to the question "Within the last 12 months, how many times you were involved in an accident while driving?"
o newwgt: The design weight of a student, computed as the inverse of the selection probability estimate of the region from which the student was selected. This selection probability estimate is merely the ratio of the sample frequency and the approximate population size of the region from which the student was selected.

The model

The five indicators or observed variables alclifs, alc30ds, xmjlifs, xmj12mos, and xmj30ds are modeled to measure alcohol and marijuana usage. Alcohol and marijuana usage, represented by the latent variables ALCUSAGE and MRJUSAGE in our proposed model, are modeled as causes of the number of moving violations and accidents, as represented by the Eta variables TICKETS and ACCIDENT respectively. These two variables, in turn, are measured without error by the two Y variables tick12mo and acci12mo. A path diagram of the model we intend to fit to the data is shown below.


Mathematical Model
Measurement model

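The measurement model for the latent variables ALCUSAGE, MRJUSAGE, ACCIDENT and TICKETS may be expressed as

$$\begin{bmatrix} \mathbf{y} \\ \mathbf{x} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Lambda}_y & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Lambda}_x \end{bmatrix} \begin{bmatrix} \boldsymbol{\eta} \\ \boldsymbol{\xi} \end{bmatrix} + \begin{bmatrix} \boldsymbol{\varepsilon} \\ \boldsymbol{\delta} \end{bmatrix}$$

where $\mathbf{y} = [\text{tick12mo}\ \ \text{acci12mo}]'$, $\mathbf{x} = [\text{alclifs}\ \ \text{alc30ds}\ \ \text{xmjlifs}\ \ \text{xmj12mos}\ \ \text{xmj30ds}]'$, $\boldsymbol{\eta} = [\text{TICKETS}\ \ \text{ACCIDENT}]'$, $\boldsymbol{\xi} = [\text{ALCUSAGE}\ \ \text{MRJUSAGE}]'$, $\boldsymbol{\delta} = [\delta_1\ \delta_2\ \delta_3\ \delta_4\ \delta_5]'$, $\boldsymbol{\varepsilon} = [\varepsilon_1\ \varepsilon_2]'$,

$$\boldsymbol{\Lambda}_y = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and

$$\boldsymbol{\Lambda}_x = \begin{bmatrix} \lambda_1 & 0 \\ \lambda_2 & 0 \\ 0 & \lambda_3 \\ 0 & \lambda_4 \\ 0 & \lambda_5 \end{bmatrix}$$

where $\delta_1, \delta_2, \delta_3, \delta_4, \delta_5, \varepsilon_1$ and $\varepsilon_2$ denote measurement errors, and where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ and $\lambda_5$ denote unknown factor loadings.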
Structural equation model

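The structural equation model for the latent variables ALCUSAGE, MRJUSAGE, ACCIDENT and TICKETS is given by

$$\boldsymbol{\eta} = \mathbf{B}\boldsymbol{\eta} + \boldsymbol{\Gamma}\boldsymbol{\xi} + \boldsymbol{\zeta}$$

where $\boldsymbol{\zeta} = [\zeta_1\ \zeta_2]'$,

$$\mathbf{B} = \begin{bmatrix} 0 & \beta \\ 0 & 0 \end{bmatrix}$$

and

$$\boldsymbol{\Gamma} = \begin{bmatrix} \gamma_1 & \gamma_2 \\ \gamma_3 & \gamma_4 \end{bmatrix}$$

where $\beta, \gamma_1, \gamma_2, \gamma_3$ and $\gamma_4$ denote unknown regression weights, and $\zeta_1$ and $\zeta_2$ denote error terms. The survey design variables region and school will be used as stratification and cluster variables respectively, while the design weight, represented by the variable newwgt, will also be included in the specification of the analysis, as illustrated next.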

Preparing the data and setting up the analysis

The model is fitted to the data in select.psf by using the path diagram component (PTH window) of the LISREL GUI (See Section 5.2). After drawing the proposed model as a path diagram, SIMPLIS syntax is created and submitted. However, before we can fit the model, we need to specify the details of the complex survey design for the data in select.psf. The first step is to open the PSF, which is accomplished as follows:
o Use the File, Open option to activate the display of an Open dialog box.
o Set the Files of type drop-down list box to Prelis Data (*.psf) and browse for the file select.psf in the TUTORIAL folder.
o Select the file and click the Open button to open the PSF in a PSF window.
Preparing the data

Click on the Survey Design option on the Data menu to load the Survey Design dialog box. Select the variable region from the Variables in data: list box and click on the Add button of the Stratification variable section. Next, select the variable school and add this variable to the Cluster variable section in a similar fashion. Finally, add the weight variable by selecting the variable newwgt from the Variables in data: list box and add this variable in the Design weight section. The completed Survey Design dialog box is shown below. Click on the OK button to return to the PSF window, and click the Save option on the File menu.


We now turn to creating a path diagram for the model to be fitted to these data. To open a new PTH window, select the New option on the File menu to load the New dialog box. Select the Path Diagram option from the list box on the New dialog box and provide a name for the path diagram, for example select.pth, in the File name string field of the Save As dialog box. Click on the Save button to open an empty PTH window.

Select the Title and Comments option on the Setup menu to load the Title and Comments dialog box. Enter the title A model for traffic tickets and accidents in the Title string field, and click on the Next button to load the Group Names dialog box.


Click on the Next button to load the Labels dialog box. Click on the Add/Read Variables button to load the Add/Read Variables dialog box, and select the PRELIS System File option in the Read from file: drop-down list box. Click on the Browse button to load the Browse dialog box and select the file select.psf in the TUTORIAL folder. Click on the OK button to return to the Labels dialog box.

Click on the Add Latent Variables button to load the Add Variables dialog box. Enter the label ALCUSAGE in the string field and click OK. Enter the labels MRJUSAGE, ACCIDENT, and TICKETS in the same way.


The completed Labels dialog box is shown below. Click on the OK button to return to the PTH window for select.pth.

Setting up the analysis

At this point, an empty PTH window is displayed, with variable names listed to the left. Check the Y check boxes of acci12mo and tick12mo respectively. Check the Eta check boxes of ACCIDENT and TICKETS respectively to obtain the window shown below.


Next, click, drag and drop the labels of the Y variables one at a time into the PTH window. Position these variables to the right of the PTH window. Click, drag and drop the labels of the latent variables ACCIDENT and TICKETS one at a time into the PTH window to obtain the window shown below. Note that labels of variables dragged to the PTH window are shown against a colored background.

We now add the rest of the observed variables (alclifs, alc30ds, xmjlifs, xmj12mos, and xmj30ds) one at a time into the PTH window, positioning them to the left of the PTH window. The last variables to be added are the latent variables ALCUSAGE and MRJUSAGE.


The next step is to add the paths between the variables dragged in the PTH window. Select the arrow icon on the Drawing toolbar, and click and drag indicator paths from the latent variable ALCUSAGE to alclifs and alc30ds respectively. To do so, start by clicking inside the ellipse representing ALCUSAGE and do not release the mouse button before the cursor is inside the rectangle representing alclifs or alc30ds. Click and drag similar indicator paths from the latent variable MRJUSAGE to xmjlifs, xmj12mos and xmj30ds respectively. Structural paths from the latent variable ALCUSAGE to both ACCIDENT and TICKETS, and from the latent variable MRJUSAGE to ACCIDENT and TICKETS are added in the same way. Also add indicator paths from the latent variable ACCIDENT to acci12mo, from the latent variable TICKETS to tick12mo, and from TICKETS to ACCIDENT. The model should now look like the image below.


The two indicator paths, from ACCIDENT to acci12mo and from TICKETS to tick12mo, have to be fixed to a value of 1.0. To do so, deselect the arrow icon on the Drawing toolbar by clicking on the selection icon to its left. Next, right click on the path from ACCIDENT to acci12mo and select the Set Value option from the pop-up menu that appears. Set the value to 1.0 and click OK to return to the PTH window.

Right click on this path again, and select the Fix option from the pop-up menu. Note that the color representing the path has changed in the PTH window. Set the path from TICKETS to tick12mo to 1.0 and fix it in the same way.

Finally, set the error variances of acci12mo and tick12mo to zero by right-clicking the error arrows and selecting the Fix option from the pop-up menu.


The last path to be added to the path diagram is the covariance between the measurement errors of xmj12mos and xmj30ds. Select the double arrow icon on the Drawing toolbar, and click and drag a path between the error arrows of xmj12mos and xmj30ds. Be sure to position the cursor over each arrow before pressing and releasing the mouse button.

The path diagram should look like the following image.


Select the Build SIMPLIS Syntax option on the Setup menu. The generated syntax is automatically displayed in a SPJ window, as shown below.

Click on the Run LISREL icon on the main toolbar to produce the following PTH window.


Discussion of results

Portions of the output file select.out are shown below. From the results, it is evident that the five factor loadings are statistically significant if a 1% level of significance is used. In addition, the error covariance for xmj12mos and xmj30ds is also significant at a 1% level of significance. In other words, the results do not indicate any misspecifications in the measurement model of the latent variables ALCUSAGE and MRJUSAGE.

Since $\hat{\beta} = 0.38$ ($t = 11.03$), it follows that a student's number of accidents exerts a significant positive influence ($p < 0.01$) on his/her number of traffic tickets. Thus an increase in the number of accidents corresponds to an increase in the number of traffic tickets. Similarly, it follows that the alcohol usage of a student is a significant antecedent ($p < 0.01$) of both the number of accidents and the number of traffic tickets of the student. On the other hand, marijuana usage is not a statistically significant antecedent of either the student's number of accidents or number of traffic tickets. The $R^2$ value for the number of accidents follows as $1 - 0.44 = 0.56$. In other words, the alcohol usage and marijuana usage of a student explain approximately 56% of the variation in the student's number of accidents. Similarly, it follows that they explain approximately 27% of the variation in the number of traffic tickets of the student.


From the results above, it is evident that the $\chi^2$ test statistic value for the null hypothesis of a perfect fit is significant if a 1% level of significance is used. There is sufficient evidence that the theoretical model does not fit the data perfectly. However, the RMSEA point estimate of 0.017 indicates that the model does provide a close fit to the data (Browne & Cudeck, 1993).

5.4.2 Implementation of sampling weights in a linear growth curve model


The data

A linear growth curve model with two dummy coded covariates (Lang1 and Lang2) is fitted to a simulated dataset in the PSF surveysem.psf, located in the MISSINGEX folder. It is assumed that the data are stratified according to 48 counties. Within each county three schools are selected as primary sampling units (PSUs). In school number 1 four students are selected; in schools 2 and 3, three students are selected from each school. Students were selected on the basis of their initial achievement in an aptitude test (Score1) and measurements were repeated over six time intervals for five students from each school and over four time intervals for the remaining five. The table below shows the calculation of the weights (Weight3) based on standardized initial scores.
Interval   Lower    Upper   % Expected   % Selected   Weight3
--------------------------------------------------------------
 1         -Inf     -1.00     15.87        10.00       1.587
 2         -1.00    -0.70      8.33        10.00       0.833
 3         -0.70    -0.20     17.88        10.00       1.788
 4         -0.20     0.00      7.93        10.00       0.793
 5          0.00     0.30     11.79        10.00       1.179
 6          0.30     1.00     22.34        10.00       2.234
 7          1.00     1.30      6.19        10.00       0.619
 8          1.30     1.80      6.09        10.00       0.609
 9          1.80     2.30      2.52        10.00       0.252
10          2.30     Inf       1.07        10.00       0.107
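
The Weight3 column appears to be the ratio of the percentage of a standard normal population expected in each interval to the 10% actually selected from it. The following is a minimal sketch of that calculation under this assumption; the function names `normal_cdf` and `interval_weights` are illustrative and are not part of LISREL.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Interval cut points on the standardized initial score, taken from the table.
CUTS = [-math.inf, -1.0, -0.7, -0.2, 0.0, 0.3, 1.0, 1.3, 1.8, 2.3, math.inf]

def interval_weights(cuts, pct_selected=10.0):
    """Return (pct_expected, weight) for each interval between cut points."""
    rows = []
    for lower, upper in zip(cuts[:-1], cuts[1:]):
        pct_expected = 100.0 * (normal_cdf(upper) - normal_cdf(lower))
        # Assumed definition: weight = % expected / % selected.
        rows.append((pct_expected, pct_expected / pct_selected))
    return rows

weights3 = [weight for _, weight in interval_weights(CUTS)]
```

Under this assumption the computed values agree with the tabulated Weight3 values to three decimals.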

Ten students were selected from each school as follows:


Four from racial group 1 with Weight2 = 7.0/4.0
Three from racial group 2 with Weight2 = 2.0/3.0
Three from racial group 3 with Weight2 = 1.0/3.0

Final weights are obtained as follows:


Final_wt = Weight3*Weight2*10.0.
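
As a hedged illustration of the final-weight formula, the sketch below computes Final_wt for a few hypothetical students and checks that a constant rescaling of the weights leaves a simple weighted mean unchanged. The weighted mean is only a stand-in chosen for illustration; it is not the estimator LISREL uses, and the scores are made up.

```python
# Weight2 by racial group, as listed above (4, 3 and 3 students per school).
WEIGHT2 = {1: 7.0 / 4.0, 2: 2.0 / 3.0, 3: 1.0 / 3.0}

def final_weight(weight3, racial_group, scale=10.0):
    """Final_wt = Weight3 * Weight2 * scale."""
    return weight3 * WEIGHT2[racial_group] * scale

def weighted_mean(values, weights):
    """Weighted mean; a common positive factor in the weights cancels."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

# Hypothetical students: (score, Weight3, racial group).
students = [(1.2, 1.587, 1), (0.4, 0.833, 2), (2.5, 0.252, 3)]
scores = [score for score, _, _ in students]
w_scaled = [final_weight(w3, group) for _, w3, group in students]
w_unscaled = [final_weight(w3, group, scale=1.0) for _, w3, group in students]

mean_scaled = weighted_mean(scores, w_scaled)
mean_unscaled = weighted_mean(scores, w_unscaled)
```

Here `mean_scaled` and `mean_unscaled` agree up to floating-point rounding, because the factor of 10.0 appears in both the numerator and the denominator.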


Multiplication of the weights by a factor of 10.0 was done to illustrate that a constant scaling of the weights does not affect parameter estimates, standard error estimates or the chi-square goodness of fit statistic value. The data were simulated according to the following model.
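$$\text{Score}_{ij} = a_{0i} + a_{1i} t_j + e_{ij}, \quad j = 1, 2, \ldots, 6$$

In the model, $t_j = (j - 1)$ and $i$ denotes student number $i$. In simulating the data set, it was assumed that

$$a_{0i} = \beta_0 + \gamma_1 \text{Lang1} + \gamma_2 \text{Lang2} + u_{0i}$$
$$a_{1i} = \beta_1 + u_{1i}$$

where

$$\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} 1.0 \\ 0.5 \end{bmatrix}, \quad \psi_{11} = \text{var}(u_{0i}) = 2, \quad \psi_{21} = \text{cov}(u_{0i}, u_{1i}) = 0.6, \quad \psi_{22} = \text{var}(u_{1i}) = 0.4,$$

$$\mathbf{e}_i \sim N(\mathbf{0}, \sigma^2 \mathbf{I}), \quad \sigma^2 = 1$$

and

$$\begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 1.0 \end{bmatrix}$$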
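
To make the data-generating model concrete, here is a minimal sketch that draws the six scores of one student from it, using the population values stated above (mean intercept 1.0, mean slope 0.5, Lang1 and Lang2 effects 0.5 and 1.0, random-effect variances 2 and 0.4 with covariance 0.6, error variance 1). The function name, the seed and the chosen covariate values are hypothetical; the sketch does not reproduce the sampling design or the missing-data pattern of surveysem.psf.

```python
import math
import random

# Population values stated in the text.
MEAN_INTERCEPT, MEAN_SLOPE = 1.0, 0.5
LANG1_EFFECT, LANG2_EFFECT = 0.5, 1.0
VAR_U0, COV_U0_U1, VAR_U1 = 2.0, 0.6, 0.4
ERROR_VAR = 1.0

# Cholesky factor of the 2x2 covariance matrix of (u0, u1).
L11 = math.sqrt(VAR_U0)
L21 = COV_U0_U1 / L11
L22 = math.sqrt(VAR_U1 - L21 ** 2)

def simulate_student(lang1, lang2, rng, n_times=6):
    """Return the scores of one student at t = 0, 1, ..., n_times - 1."""
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    u0 = L11 * z0                 # random intercept deviation
    u1 = L21 * z0 + L22 * z1      # random slope deviation, correlated with u0
    a0 = MEAN_INTERCEPT + LANG1_EFFECT * lang1 + LANG2_EFFECT * lang2 + u0
    a1 = MEAN_SLOPE + u1
    sd_e = math.sqrt(ERROR_VAR)
    return [a0 + a1 * t + rng.gauss(0.0, sd_e) for t in range(n_times)]

rng = random.Random(2004)  # arbitrary seed, for reproducibility only
scores = simulate_student(lang1=1, lang2=0, rng=rng)
```

The Cholesky factor is used so that the two random effects have exactly the stated variances and covariance.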

The first 10 records of the data set are shown below.

Note that even-numbered cases have missing values on Score5 and Score6. This is indicated with a missing value code of -9.00.


The model

The conceptual path diagram for the model is shown below. The paths from the latent variables intcept and time to the dependent variables Score1 to Score6 are shown in gray to indicate that the corresponding coefficients are fixed values.

The conceptual path diagram for the structural part of the model is given below and indicates that allowance is made for the latent variables to be correlated.

Mathematical Model
Measurement model

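The measurement model for the latent variables may be expressed as

$$\begin{bmatrix} \mathbf{y} \\ \mathbf{x} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Lambda}_y & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Lambda}_x \end{bmatrix} \begin{bmatrix} \boldsymbol{\eta} \\ \boldsymbol{\xi} \end{bmatrix} + \begin{bmatrix} \boldsymbol{\varepsilon} \\ \boldsymbol{\delta} \end{bmatrix}$$

where $\mathbf{y} = [\text{Score1}\ \ \text{Score2}\ \ \ldots\ \ \text{Score6}]'$, $\mathbf{x} = [\text{Lang1}\ \ \text{Lang2}]'$, $\boldsymbol{\eta} = [\text{intcept}\ \ \text{time}]'$, $\boldsymbol{\xi} = [\text{Lang1}\ \ \text{Lang2}]'$, $\boldsymbol{\delta} = [\delta_1\ \delta_2]'$, $\boldsymbol{\varepsilon} = [\varepsilon_1\ \ldots\ \varepsilon_6]'$, and $\text{Cov}(\boldsymbol{\varepsilon}) = \sigma^2 \mathbf{I}$. Also

$$\boldsymbol{\Lambda}_y = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \end{bmatrix}, \quad \text{Cov}(\boldsymbol{\xi}) = \begin{bmatrix} \phi_{11} & \phi_{21} \\ \phi_{21} & \phi_{22} \end{bmatrix}, \quad E(\boldsymbol{\xi}) = \begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix}, \quad \boldsymbol{\Lambda}_x = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

and $\text{Cov}(\boldsymbol{\delta}) = \mathbf{0}$.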
Structural equation model

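The structural equation model for the latent variables intcept and time is given by ($\mathbf{B} = \mathbf{0}$)

$$\boldsymbol{\eta} = \boldsymbol{\Gamma}\boldsymbol{\xi} + \boldsymbol{\zeta}$$

where $\boldsymbol{\zeta} = [\zeta_1\ \zeta_2]'$, with

$$\text{Cov}(\boldsymbol{\zeta}) = \begin{bmatrix} \psi_{11} & \psi_{21} \\ \psi_{21} & \psi_{22} \end{bmatrix}, \quad E(\boldsymbol{\zeta}) = \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix},$$

and

$$\boldsymbol{\Gamma} = \begin{bmatrix} \gamma_1 & \gamma_2 \\ 0 & 0 \end{bmatrix}$$

where $\gamma_1$ and $\gamma_2$ denote unknown regression weights. The thirteen unknown model parameters are

$\gamma_1, \gamma_2, \phi_{11}, \phi_{21}, \phi_{22}, \kappa_1, \kappa_2, \sigma^2, \psi_{11}, \psi_{21}, \psi_{22}, \alpha_1$, and $\alpha_2$.

The unrestricted model has 36 + 8 = 44 parameters, since there are 8 observed variables and the number of non-duplicated elements of a covariance matrix of order 8 is 36. Therefore, the number of degrees of freedom equals 44 − 13 = 31.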
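
The degrees-of-freedom count above can be checked with a few lines of arithmetic. The helper below is an illustrative sketch (its name is not a LISREL function): it counts the non-duplicated covariance elements and the means of the observed variables, then subtracts the number of free parameters.

```python
def sem_degrees_of_freedom(n_observed, n_free_parameters, with_means=True):
    """Degrees of freedom for a model fitted to means and covariances."""
    n_covariances = n_observed * (n_observed + 1) // 2  # non-duplicated elements
    n_means = n_observed if with_means else 0
    return n_covariances + n_means - n_free_parameters

# 8 observed variables (Score1 to Score6, Lang1, Lang2) and 13 free parameters.
df = sem_degrees_of_freedom(n_observed=8, n_free_parameters=13)
```

With 8 observed variables the helper counts 36 covariance elements and 8 means, so `df` equals 44 − 13 = 31.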
SIMPLIS syntax for the model is shown next. Note that 5.0*time, for example, indicates that the coefficient for the path from time to Score6 is fixed at a value of 5.0.

An experienced LISREL user may prefer to type the SIMPLIS commands to fit a specific model. However, we provide an outline of the steps required to build the syntax by drawing a path diagram. With more experience, users will find many shortcuts in developing the syntax. For example, it may be easier to type in the 1.0*, 2.0*, etc. values after syntax has been generated than to fix and set each path to a specific value.

Example: Implementation of sampling weights in a linear growth curve model


Describing the data

From the main menu bar, select the Data, Survey Design option and add the variable Final_wt in the Design weight: box. Click OK when done.


Next use the Data, Define Variables option to select the variables Score5 and Score6, then click the Missing Values button to invoke the Missing Values dialog box. Enter -9.0 as shown below. Click OK, then use the File, Save option to ensure that these changes are contained in the PSF file.


Setting up the analysis

To generate the SIMPLIS commands interactively, we proceed as follows. Using the File, New option, select the Path Diagram option from the New dialog box.

Save the new path diagram in the MISSINGEX folder as Survey1.pth.

Click Save when done. From the Setup menu, select Title and Comments.

This action loads the Title and Comments dialog box shown below. Enter a title and any (optional) comments as shown below.

Click Next to proceed to the Groups dialog and, since this is a single-group example, click Next again to activate the Labels dialog shown below. To add a list of observed variables, click the Add/Read Variables button below the Observed Variables list box.

From the Add/Read Variables dialog box, click the Read from file: radio button and select PRELIS System File. Next, use the Browse button to locate and select surveysem.psf.


Click the Open button once the desired PSF is selected. The observed variable names will be displayed in the Labels dialog box. To add a list of latent variables, click the Add Latent Variables button and type the names of the latent variables one at a time.

Click OK when the latent variables intcept and time are entered. The left hand side of the path diagram window should display the observed and latent variables. If not, select the View, Toolbars option and from the drop-down list Select Variables. Click on the check boxes on the right hand side of the variable names to define Score1 to Score6 as Y (dependent) variables and intcept and time as Eta (endogenous latent) variables.


We start drawing the path diagram by dragging the names Score1 to Score6 to the path diagram window. Next, drag intcept and time to the middle of the path diagram window. A variable is dragged to the path diagram window by left-clicking on the variable name and then moving it with the left mouse button held down. Finally, drag Lang1, Lang2 and CONST to the left of intcept and time. Click the one-sided arrow on the drawing bar and connect arrows from intcept to the variables Score1, ..., Score6.

With the left mouse button down, start in the ellipse and drag the arrow to within a rectangle representing one of the Score variables before releasing the mouse button. Unselect the arrow by clicking on the square on the drawing bar. Once this is done, move the mouse pointer to each arrow and right-click. From the pop-up menu, select Fix. Repeat this for each path from intcept to a Score variable.

Once all the paths are fixed, start with the path from intcept to Score1. Right-click on the arrow, and select the Set Value option from the pop-up menu.


Change the value of each path to 1 as shown below.

Repeat the above procedure by drawing paths from time to Score2, Score3, ..., Score6. Fix these paths and set the path coefficients to 1, 2, 3, 4, and 5 respectively, as shown below.

The path diagram is completed by drawing arrows from Lang1, Lang2 and the SIMPLIS variable CONST to intcept and time as shown. Once this is done, select the two-sided arrow (error covariance or factor correlation) to add a covariance path between Lang1 and Lang2.


To draw this path, left-click on the horizontal link between 0.00 and the Lang1 rectangle. Drag the path to the line connecting 0.00 and Lang2 before releasing the mouse button.

To build the corresponding SIMPLIS syntax, select Setup, Build SIMPLIS Syntax from the main menu.


The syntax shown below is generated.

In our model, we assume that the coefficient of the latent variable time is not influenced by Lang1 or Lang2 and we change the syntax by adding 0.00* in front of Lang1 and Lang2 as shown below.


Since it is assumed that the error variances associated with the dependent variables are homogeneous, we manually add the command
Equal Error Variances: Score1 - Score6

In addition, it is assumed that intcept and time are correlated. The equivalent SIMPLIS command is also typed in, as shown in the syntax file. Once these modifications to the command file are completed, click the Run LISREL icon to obtain a path diagram.
Discussion of results: implementation of sampling weights in a linear growth curve model

The path diagram shows the parameter estimates and $\chi^2$ goodness of fit statistic ($\chi^2 = 72.42$, df = 31) under the assumption that stratification and selection of clusters do not affect standard errors or model fit.

In the display below, the structural part of the model is shown. The numeric values are the t-values (t = parameter estimate / standard error) for each path. For example, the t-value for $\hat{\gamma}_1 = 0.41$ equals 2.28.


Example: Incorporating stratum and cluster variables in the model

Next, select the Survey Design option from the Data menu and add County and School respectively as stratification and cluster variables. Click OK and then select the File, Save option.

Once this is done, make survey1.spl the active window before clicking the Run LISREL icon.
Discussion of results: incorporating stratum and cluster variables in the model

From the path diagram below we note that the $\chi^2$ statistic has decreased from 72.42 to 31.34 and that the t-value corresponding to $\gamma_1$ decreased from 2.28 to 1.71. In the latter case, the t-value indicates a non-significant coefficient for the path from Lang1 to intcept.

Finally, selected parts of the LISREL output are shown below.

The 95% confidence intervals for the estimated parameters show that the population parameter values are all contained in the respective intervals. For example, the 95% confidence interval for the parameter estimated as 0.491 is $0.491 \pm 2(0.05) = (0.391;\ 0.591)$, which contains the corresponding population value of 0.5.
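
The interval quoted above uses the common approximation of the estimate plus or minus two standard errors. A minimal sketch of that arithmetic follows; the function name is illustrative, and the multiplier of 2 is taken from the text rather than from an exact normal quantile.

```python
def approx_ci95(estimate, std_error, multiplier=2.0):
    """Approximate 95% confidence interval: estimate +/- multiplier * SE."""
    half_width = multiplier * std_error
    return (estimate - half_width, estimate + half_width)

lower, upper = approx_ci95(0.491, 0.05)
covers_population_value = lower <= 0.5 <= upper
```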


The $\chi^2$ goodness-of-fit statistic equals 31.34, with a corresponding p-value of 0.45, indicating that the model fits the data quite well.


5.5 Evaluation

5.5.1 Simulation study based on a linear growth curve model

Introduction

In Section 4.5.2, a simulation study based on a linear growth model for continuous outcomes was discussed (see Asparouhov, 2004). This study is repeated by fitting a structural equation model to the data which is mathematically equivalent to a multilevel model.

The data

An unbalanced design, consisting of 500 univariate observations that are clustered within 100 level-two units, was used. Half of the level two units have four observations and the other half have six observations. The times of the observations are equally spaced starting at 0 and ending with 3 for the clusters with 4 observations and ending with 5 for the clusters with 6 observations. The linear growth model has random intercept and slope coefficients. For a complete description of the data, see Section 4.5.2.

The model

A path diagram of the model is shown below.


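The LISREL model is

$$\mathbf{x} = \boldsymbol{\Lambda}_x \boldsymbol{\xi} + \boldsymbol{\delta}$$

where

$$\boldsymbol{\Lambda}_x = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \\ 1 & 5 \end{bmatrix}, \quad E(\boldsymbol{\xi}) = \begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix}, \quad \text{Cov}(\boldsymbol{\xi}) = \begin{bmatrix} \phi_{11} & \phi_{21} \\ \phi_{21} & \phi_{22} \end{bmatrix},$$

and $\text{Cov}(\boldsymbol{\delta}) = \sigma^2 \mathbf{I}$.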
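
As a hedged illustration of what this model implies for the six repeated measures, the sketch below computes the model-implied mean vector and covariance matrix, i.e. the loading matrix times the factor means, and the loading matrix times the factor covariance matrix times its transpose plus the common error variance on the diagonal. It plugs in the true values reported in Table 5.1, assuming that its rows correspond, in order, to the two factor means, the two factor variances, their covariance and the error variance. The variable and function names are illustrative.

```python
# True values from Table 5.1 (the mapping of rows to parameters is an assumption).
MEAN_INTERCEPT, MEAN_SLOPE = 0.5, 0.1
VAR_INTERCEPT, VAR_SLOPE, COV_INT_SLOPE = 1.0, 0.2, 0.3
ERROR_VAR = 1.0
TIMES = [0, 1, 2, 3, 4, 5]  # second column of the loading matrix

def implied_mean(t):
    """Model-implied mean at time t: row [1, t] times the factor means."""
    return MEAN_INTERCEPT + MEAN_SLOPE * t

def implied_cov(s, t):
    """Model-implied covariance between the measures at times s and t."""
    common = VAR_INTERCEPT + (s + t) * COV_INT_SLOPE + s * t * VAR_SLOPE
    return common + (ERROR_VAR if s == t else 0.0)

mu = [implied_mean(t) for t in TIMES]
sigma = [[implied_cov(s, t) for t in TIMES] for s in TIMES]
```

Under these values the implied variance grows from 2.0 at time 0 to 10.0 at time 5, while the implied mean rises linearly from 0.5 to 1.0.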

Setting up the analysis

The SIMPLIS syntax for fitting this growth model is as follows.

The command intcept time = CONST indicates the estimation of the population intercept and slope coefficients. The command Equal Error Variances: Y1 - Y6 specifies a homogeneous error variance term on level 1 of the model.


Discussion of results

Results of the simulation study are summarized in Tables 5.1 and 5.2.
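Table 5.1: Bias and coverage in LISREL and Mplus

Parameter                    True value   Bias (LISREL)   Bias (Mplus)   Coverage (LISREL)   Coverage (Mplus)
--------------------------------------------------------------------------------------------------------------
Mean intercept                  0.5           0.017           0.017           0.908               0.908
Mean slope                      0.1           0.002           0.002           0.938               0.942
Intercept variance              1            -0.024          -0.024           0.845               0.848
Slope variance                  0.2          -0.006          -0.006           0.892               0.902
Intercept-slope covariance      0.3          -0.006          -0.006           0.936               0.940
Error variance $\sigma^2$       1            -0.008          -0.008           0.908               0.910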

The bias and coverage produced by LISREL and Mplus are virtually identical. As part of the simulation study, the unadjusted and adjusted $\chi^2$ goodness-of-fit statistics for each of the 500 simulations were computed. The adjusted $\chi^2$ was obtained by multiplying the unadjusted $\chi^2$ by the $\chi^2$ scale factor, as described in Section 5.6. The mean values and rejection rates of these statistics for the 500 samples are given in Table 5.2. The rejection rate denotes the percentage of samples in which the $\chi^2$ statistic exceeded $\chi^2_{21;0.05}$. The degrees of freedom, 21, is obtained as the number of non-duplicated elements of the covariance matrix plus the number of means minus the number of parameters estimated. Therefore $df = 21 + 6 - 6 = 21$.
Table 5.2: Mean values and rejection rates of $\chi^2$ goodness-of-fit statistics

                  Unadjusted $\chi^2$   Adjusted $\chi^2$
----------------------------------------------------------
Mean                    32.98                22.49
Rejection rate          32%                  9%

The expected mean for a $\chi^2_{21}$ random variate is 21. The mean for the adjusted $\chi^2$, which is higher than the expected mean, explains the rejection rate being higher than 5%. This result implies that more research on the correction factor of the $\chi^2$ test statistic under complex sampling is indicated.


5.5.2 Latent curve analysis with main and interaction effects

Introduction

Curran, Bauer & Willoughby (2004) considered the testing of main effects and interactions in latent curve analysis. Their goal was to illustrate that classic techniques, as applied in multiple regression, can be generalized to the case of latent curve analyses. As part of the paper, an example was used to illustrate the testing of a categorical by continuous interaction in an unbalanced latent curve model with missing data over time. In this section, the same model is fitted, with and without sampling weights, in order to evaluate the impact of ignoring weights on an analysis: an analysis option not available to the authors of the paper at the time of publication.

The data

The example in the paper was based on data from the National Longitudinal Study of Youth. Specific details on the selection of the sample can be found in Curran (1997). The sample consists of information on 405 children at 4 occasions. At the start of the study, children in the sample were between 6 and 8 years old. Information is not available for all children on all occasions: while 405 were interviewed initially, the second, third, and fourth occasions provided information on 374, 297, and 294 children respectively. Only 221 children were interviewed on all four occasions. As such, the participant attrition over time, combined with the variability in age at the start of the study and the fact that measurement occasions were approximately two years apart, makes this an example of an unbalanced design with missing data. In this section, the same data are used. Two analyses will be performed: a multilevel and a SEM analysis, the latter to verify the validity of the comparison of our results with those of Curran, Bauer & Willoughby (2004). In addition, models will be fitted with and without sampling weights. The data were used in different formats for the structural equation and multilevel models. A short description of each data format is given below.
Structural equation modeling

A few of the variables in the data set curran_NLSY_subset.psf are shown below for the first 10 observations.


The emotional support at home and the level of antisocial behavior exhibited by these children were of special interest. The authors focused on three questions of interest: the form of the mean developmental trajectory of antisocial behavior over time, the possibility of meaningful individual variability in trajectories around these mean values, and the possible effect of interaction between the gender of a child and the level of emotional support on antisocial behavior. The following variables included in the PSF were selected from the survey data:
o MOM_ID: This variable represents the identification number of the mother and serves as grouping variable for all measurements for a specific child. There are 405 mothers included in this subset of the NLSY data.
o MOM_WT: The sampling weight for each mother.
o antiy1 - antiy10: A measure of antisocial behavior in the child. For each of these variables, a continuous measure representing the sum of six items assessing child antisocial behavior over the previous 3 months was created with values ranging between 0 and 12, where a high value would indicate a higher level of antisocial behavior.
o genfemo: The gender of the child, coded 0 for a female, and 1 for a male.
o home_emo: A measure of emotional support of the child in the home. This continuous measure, ranging from 0 to 13 with higher values reflecting higher levels of support, was measured at the first measurement occasion. It is centered around the mean level of emotional support.
o home_cog: A measure of cognitive stimulation, computed as a summation of 14 dichotomously scored items reported by the mother, which ranges between 0 and 14.
o genxemo: A variable intended to represent the interaction between a child's gender and level of emotional support: genxemo = genfemo × home_emo.
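
The construction of the centered support measure and the interaction term can be sketched as follows. The four data values are hypothetical and are not taken from the NLSY subset; the function names are illustrative.

```python
def center(values):
    """Subtract the sample mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def interaction(gender, support_centered):
    """Element-wise product: genxemo = genfemo * home_emo."""
    return [g * s for g, s in zip(gender, support_centered)]

# Hypothetical values for four children.
genfemo = [0, 1, 1, 0]
home_emo_raw = [9.0, 11.0, 7.0, 13.0]

home_emo = center(home_emo_raw)          # mean-centered emotional support
genxemo = interaction(genfemo, home_emo)
```

Because genfemo is coded 0 for females, the product term is zero for every girl and equals the centered support score for every boy.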

Multilevel modeling

For the multilevel analysis, the PSF curran_mlev.psf was used as the basis of the analysis. Data on all the variables used in this model are shown below for the first 10 respondents. Note that, in contrast to the PSF used for the SEM model, antisocial behavior is now represented by a single variable containing the stacked measurements over the 4 measurement occasions. For the first child, for example, the values 2, 1, 0, and 2 respectively were observed at the measurement occasions, where the latter is indicated by the variable tim.

The following variables were used in the multilevel analysis:


o Mom_ID: This variable represents the identification number of the mother and serves as grouping variable for all measurements for a specific child. There are 405 mothers included in this subset of the NLSY data.
o Mom_Wt: The sampling weight for each mother.
o antiy: A measure of antisocial behavior in the child at a given measurement occasion. This continuous measure, representing the sum of six items assessing child antisocial behavior over the previous 3 months, was created with values ranging between 0 and 12, where a high value would indicate a higher level of antisocial behavior.
o tim: This variable indicates the time of measurement, and varies from 0 to 9.
o genfemo: The gender of the child, coded 0 for a female, and 1 for a male.
o home_emo: A measure of emotional support of the child in the home. This continuous measure, ranging from 0 to 13 with higher values reflecting higher levels of support, was measured at the first measurement occasion. It is centered around the mean level of emotional support.
o genxemo: A variable intended to represent the interaction between a child's gender and level of emotional support: genxemo = genfemo × home_emo.
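
The relationship between the two data layouts can be sketched as a wide-to-long conversion. The sketch assumes that antiy1 to antiy10 correspond to tim = 0 to 9 and that an unobserved occasion is simply absent from the record; the record itself and the function name are hypothetical.

```python
def wide_to_long(record, n_occasions=10):
    """Stack antiy1..antiy10 of one child into rows with columns tim and antiy."""
    rows = []
    for k in range(1, n_occasions + 1):
        value = record.get(f"antiy{k}")
        if value is None:          # occasion not observed for this child
            continue
        # Assumed mapping: antiy<k> is the measurement at tim = k - 1.
        rows.append({"Mom_ID": record["MOM_ID"], "tim": k - 1, "antiy": value})
    return rows

# Hypothetical child observed on four occasions (ID and time points are made up).
child = {"MOM_ID": 1, "antiy1": 2, "antiy3": 1, "antiy5": 0, "antiy7": 2}
long_rows = wide_to_long(child)
```

Each child contributes one row per observed occasion, so children with missing occasions simply contribute fewer rows to the stacked file.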

The model

Curran, Bauer & Willoughby (2004) show how a structural equation model-based latent curve analysis and a hierarchical linear model for these data can be formulated to produce equivalent results. They point out, however, that there are subtle but important differences in both model estimation and interpretation due to the way in which time is incorporated into the model. These differences are of particular importance in the case of conditional growth models, where one or more exogenous variables predict the random growth curve parameters. Main effects of the random trajectories imply that exogenous variables interact with time in the prediction of repeated measures for both cases. While both predictors and time are used as exogenous variables in the hierarchical linear model, the interaction between time and any predictor is explicitly modeled as a cross-level interaction. The latent curve analysis does not use time as a variable as such. Instead, it is incorporated into the model via the factor loading matrix. In this section, the two models and data sets constructed for use in the analyses will further illustrate this difference. To accommodate the differences in models fitted and data sets used, the structural equation model and the multilevel, or hierarchical linear, model will be discussed separately in the rest of this section.
Structural equation model

We first consider the structural equation model. The model shown below corresponds to Figure 2 in Curran, Bauer & Willoughby (2004), and represents a cohort-sequential conditional linear latent curve model with 10 time points, regressed on the main effect of gender, the main effect of emotional support, and the interaction between gender and emotional support. The variables intcept and slope represent the latent intercept and latent slope of the trajectory respectively.

In the Y part of the model, we include the dependent variables antiy1 to antiy10. It is assumed that antiy1 to antiy10 are indicators of the endogenous (ETA) latent variables intcept and slope. The covariates genfemo, home_emo, and genxemo are also assumed to have relationships with both the intercept and the slope of the trajectory and form the X part of the model. Finally, we allow the intcept and slope variables to be correlated. This path cannot be seen on the basic path diagram, which is the type of diagram used here (to view this path, select the Structural Model option from the Models: drop-down list in the PTH window).

Mathematical model

The LISREL model consists of a measurement and structural part.


Measurement model

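The measurement model may be expressed as

$$\begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} \Lambda_y & 0 \\ 0 & \Lambda_x \end{pmatrix} \begin{pmatrix} \eta \\ \xi \end{pmatrix} + \begin{pmatrix} \epsilon \\ \delta \end{pmatrix}$$

where

$$y = [\text{antiy1} \;\; \text{antiy2} \;\; \cdots \;\; \text{antiy10}]', \qquad x = [\text{genfemo} \;\; \text{home\_emo} \;\; \text{genxemo}]',$$

$$\epsilon = [\epsilon_1 \;\; \epsilon_2 \;\; \cdots \;\; \epsilon_{10}]', \qquad \delta = [\delta_1 \;\; \delta_2 \;\; \delta_3]', \qquad \xi = [\text{genfemo} \;\; \text{home\_emo} \;\; \text{genxemo}]',$$

$$\eta = \begin{pmatrix} \text{intcept} \\ \text{slope} \end{pmatrix} \quad \text{and} \quad \Lambda_y' = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{pmatrix}.$$

$\mathrm{Cov}(\epsilon)$ is a diagonal matrix with diagonal elements $\mathrm{var}(\epsilon_1), \mathrm{var}(\epsilon_2), \ldots, \mathrm{var}(\epsilon_{10})$, where we constrained these elements to be equal, while $\Lambda_x$ is a $3 \times 3$ identity matrix and

$$\mathrm{Cov}(\xi) = \Phi = \begin{pmatrix} \phi_{11} & \phi_{21} & \phi_{31} \\ \phi_{21} & \phi_{22} & \phi_{32} \\ \phi_{31} & \phi_{32} & \phi_{33} \end{pmatrix}.$$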
Structural equation model

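The structural equation model for the latent variables intcept and slope is given by ($B = 0$)

$$\eta = \Gamma \xi + \zeta,$$

where $\zeta = [\zeta_1 \;\; \zeta_2]'$, with

$$\mathrm{Cov}(\zeta) = \Psi = \begin{pmatrix} \psi_{11} & \psi_{21} \\ \psi_{21} & \psi_{22} \end{pmatrix}, \qquad E(\zeta) = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix},$$

and

$$\Gamma = \begin{pmatrix} \gamma_1 & \gamma_2 & \gamma_3 \\ \gamma_4 & \gamma_5 & \gamma_6 \end{pmatrix}.$$

The unknown model parameters are therefore $\gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_5, \gamma_6, \alpha_1, \alpha_2, \phi_{11}, \phi_{12}, \ldots, \phi_{33}, \psi_{11}, \psi_{12}, \psi_{22}$, and $\mathrm{var}(\epsilon_1)$.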


Multilevel model

A general two-level model for a response variable $y$ depending on a set of $r$ predictors $x_1, x_2, \ldots, x_r$ can be written in the form

$$y_{ij} = x'_{(f)ij} \beta + x'_{(2)i} u_i + x'_{(1)ij} e_{ij}$$

where $i = 1, 2, \ldots, N$ denotes the level-2 units, and $j = 1, 2, \ldots, n_i$ the level-1 units. Thus $y_{ij}$ represents the response of individual $j$, nested within level-2 unit $i$. The model shown here consists of a fixed and a random part. The fixed part of the model is represented by the vector product $x'_{(f)ij} \beta$, where $x'_{(f)ij}$ is a typical row of the design matrix of the fixed part of the model with, as elements, a subset of the $r$ predictors. The vector $\beta$ contains the fixed, but unknown, parameters to be estimated. The vector products $x'_{(2)i} u_i$ and $x'_{(1)ij} e_{ij}$ denote the random part of the model at levels 2 and 1 respectively. For example, $x'_{(2)i}$ represents a typical row of the design matrix of the random part at level 2, and $u_i$ the vector of random level-2 coefficients to be estimated. The product $x'_{(1)ij} e_{ij}$ serves the same purpose at level 1. It is assumed that $u_1, u_2, \ldots, u_N$ are i.i.d. with mean vector $0$ and covariance matrix $\Phi_{(2)}$, and that $e_{i1}, e_{i2}, \ldots, e_{in_i}$ are i.i.d. with mean vector $0$ and covariance matrix $\Phi_{(1)}$. Within this hierarchical framework, the model fitted to the data uses the participant's gender, level of emotional support at home, and the interaction between these variables to predict the variability in intercept and slope over time of antisocial behavior trajectories.

$$\begin{aligned} \text{antiy}_{ij} = {} & \beta_0 + \beta_1\, \text{genfemo}_{ij} + \beta_2\, \text{home\_emo}_{ij} + \beta_3\, \text{genxemo}_{ij} + \beta_4\, \text{tim}_{ij} \\ & + \beta_5 (\text{genfemo}_{ij})(\text{tim}_{ij}) + \beta_6 (\text{home\_emo}_{ij})(\text{tim}_{ij}) + \beta_7 (\text{genxemo}_{ij})(\text{tim}_{ij}) \\ & + u_{i0} + u_{i1}\, \text{tim}_{ij} + e_{ij} \end{aligned}$$

where $\beta_0$ denotes the average expected level of antisocial behavior for a female child at the first measurement occasion with a score of 0 on the measure of emotional support at home. The coefficients $\beta_1, \beta_2, \ldots, \beta_7$ are the coefficients associated with the fixed part of the model, which contains the predictor variables genfemo and home_emo, the interaction term genxemo, the variable tim, and the interactions of tim with each of these three predictors. The random part of the model is represented by $u_{i0}$, $u_{i1} \times \text{tim}_{ij}$, and $e_{ij}$, which denote the variation in average level of antisocial behavior between children, in slope over measurement occasions, and between measurements taken at different occasions, where the occasions form the lowest level of the hierarchy.

Setting up the analysis


Structural equation model

The SIMPLIS syntax for the model is shown below. Note that 5.0*slope, for example, indicates that the coefficient for the slope → antiy6 path is fixed at a value of 5.0.


An experienced LISREL user may prefer to type the SIMPLIS commands to fit a specific model. Alternatively, the syntax can be created by drawing a path diagram. This is done in the same way as shown in Section 5.4.2. To add a weight variable, as is the case in the second of the structural equation models considered here, select the Data, Survey Design option from the main menu bar and enter the variable MOM_WT in the Design weight field as shown below.

To run the model, click the Run LISREL icon button on the main menu bar.

Multilevel model

Specifying the multilevel model is straightforward, and proceeds as shown in Chapter 4. Briefly, the level-2 ID is identified as Mom_ID, the outcome is antiy, and the fixed part of the model consists of the variables genfemo, home_emo, genxemo and tim as shown in the two dialog boxes below. The weight variable Mom_Wt used in the second of the multilevel analyses discussed here is entered on the Identification Variables dialog box.


Note that, in the Select Response and Fixed Variables dialog box shown, the required interaction between tim and the three variables genfemo, home_emo and genxemo is not included; it will be added manually to the syntax file created via the dialog boxes.


To estimate both a random intercept and a random slope, add the variable tim to the Random Level 2 field on the Random Variables dialog box as shown below.

The syntax generated via the Finish button on the Random Variables dialog box is shown below:

Finally, type the additional interaction terms tim*genfemo tim*home_emo tim*genxemo into the syntax file to obtain the final syntax as shown below. The analysis is started by clicking the Run Prelis icon button on the main menu bar.


Discussion of results

The results of both the SEM and multilevel models are summarized in Table 5.3. The table also contains the results from Curran, Bauer & Willoughby (2004). These analyses did not take the sampling weight Mom_Wt into account. The results of the weighted analysis, in which this variable was incorporated into the estimation procedure, are reported in Table 5.4.
Table 5.3: Unweighted analyses: comparison of results

| Coefficient | Estimate: CBW paper | Estimate: Multilevel | Estimate: SEM | Std. error: CBW paper | Std. error: Multilevel | Std. error: SEM |
|---|---|---|---|---|---|---|
| genfemo ($\gamma_1$) | 0.829 | 0.829 | 0.829 | 0.161 | 0.161 | 0.161 |
| home_emo ($\gamma_2$) | -0.194 | -0.194 | 0.194 | 0.048 | 0.048 | 0.048 |
| genxemo ($\gamma_3$) | 0.044 | 0.044 | 0.044 | 0.070 | 0.070 | 0.070 |
| INTCPT (0) | 1.217 | 1.217 | 1.217 | 0.114 | 0.115 | 0.114 |
| tim × genfemo ($\gamma_4$) | 0.013 | 0.013 | 0.013 | 0.035 | 0.035 | 0.035 |
| tim × home_emo ($\gamma_5$) | 0.012 | 0.012 | 0.012 | 0.010 | 0.010 | 0.010 |
| tim × genfemo × home_emo ($\gamma_6$) | -0.029 | -0.029 | 0.029 | 0.015 | 0.015 | 0.015 |
| tim (1) | 0.066 | 0.066 | 0.066 | 0.024 | 0.025 | 0.025 |
| Var(intcept) ($\psi_{11}$) | * | 0.669 | 0.669 | * | 0.203 | 0.204 |
| Var(time slope) ($\psi_{22}$) | * | 0.019 | 0.019 | * | 0.009 | 0.009 |
| Cov(intcept, tim) ($\psi_{21}$) | * | 0.076 | 0.076 | * | 0.035 | 0.035 |
| $\sigma^2$ (var($\epsilon$)) | * | 1.758 | 1.758 | * | 0.097 | 0.098 |

* Not reported in the Curran et al. paper
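To make the fixed part of the multilevel model concrete, the following Python sketch evaluates the expected antisocial behavior score from the Multilevel estimates in Table 5.3. It is an illustration only: the function and dictionary key names are hypothetical, and the random effects are ignored (set to their mean of zero).

```python
# Fixed-effect estimates copied from the Multilevel column of Table 5.3.
ESTIMATES = {
    "intercept": 1.217, "genfemo": 0.829, "home_emo": -0.194, "genxemo": 0.044,
    "tim": 0.066, "tim_genfemo": 0.013, "tim_home_emo": 0.012, "tim_genxemo": -0.029,
}


def predicted_antiy(genfemo, home_emo, tim, est=ESTIMATES):
    """Expected antiy for given gender (0/1), centered home_emo, and time (0..9)."""
    genxemo = genfemo * home_emo  # interaction term, as defined for the data set
    intercept = (est["intercept"] + est["genfemo"] * genfemo
                 + est["home_emo"] * home_emo + est["genxemo"] * genxemo)
    slope = (est["tim"] + est["tim_genfemo"] * genfemo
             + est["tim_home_emo"] * home_emo + est["tim_genxemo"] * genxemo)
    return intercept + slope * tim


# Female and male child at the mean level of emotional support, first occasion.
print(round(predicted_antiy(0, 0.0, 0), 3), round(predicted_antiy(1, 0.0, 0), 3))
```

Written this way, the cross-level interactions are visible as the predictors' contributions to the slope on tim.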


Table 5.4: Weighted analyses: comparison of results

| Coefficient | Estimate: Multilevel | Estimate: SEM | Std. error: Multilevel | Std. error: SEM |
|---|---|---|---|---|
| genfemo ($\gamma_1$) | 0.901 | 0.900 | 0.201 | 0.205 |
| home_emo ($\gamma_2$) | -0.244 | -0.243 | 0.021 | 0.058 |
| genxemo ($\gamma_3$) | 0.122 | 0.122 | 0.112 | 0.115 |
| INTCPT (0) | 1.203 | 1.203 | 0.120 | 0.115 |
| tim × genfemo ($\gamma_4$) | 0.024 | 0.024 | 0.047 | 0.048 |
| tim × home_emo ($\gamma_5$) | 0.014 | 0.014 | 0.015 | 0.015 |
| tim × genfemo × home_emo ($\gamma_6$) | -0.020 | -0.020 | 0.026 | 0.038 |
| tim (1) | 0.073 | 0.073 | 0.030 | 0.030 |
| Var(intcept) ($\psi_{11}$) | 0.487 | 0.415 | 0.317 | 0.330 |
| Var(time slope) ($\psi_{22}$) | 0.230 | 0.022 | 0.014 | 0.015 |
| Cov(intcept, tim) ($\psi_{21}$) | 0.091 | 0.103 | 0.060 | 0.063 |
| $\sigma^2$ (var($\epsilon$)) | 1.966 | 1.996 | 0.234 | 0.217 |

The goodness-of-fit of the models fitted can also be compared. For the weighted structural equations model, the following path diagram was obtained:


From the path diagram, $\chi^2 = 99.48$, with 83 degrees of freedom. The corresponding $\chi^2$-statistic value for the unweighted model is 107.2978, with 83 degrees of freedom. For the analyses that included the weight variable, there are differences in parameter estimates if we compare the multilevel model results with those of the structural equation model. These differences are most evident in the covariance matrix of the latent variables. LISREL produced a warning message that this matrix is not positive definite. On further examination, it was found that the variable antiy10 contained only 8 non-missing observations. Refitting the models using the first 9 variables, antiy1 to antiy9, showed that the multilevel and SEM results are identical in this case.

5.5.3 Replicate weights

Introduction

Survey data sets often include a column $W_0$ of design weights and additional columns $W_1, W_2, \ldots, W_{R-1}$ of replicate weights. Typically, a researcher may repeatedly fit the same model to the data by working through the sequence of weight variables $W_0, W_1, \ldots, W_{R-1}$. Means of the $R$ sets of parameter estimates and their standard errors may subsequently be computed to obtain more accurate parameter and standard error estimates. In this section, we illustrate how to use LISREL in the case of replicate weights.
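The averaging step can be sketched in a few lines of Python. This is an illustration under assumptions: the function name and the numbers are hypothetical, and each inner list stands for the parameter estimates (or standard errors) obtained from one of the R weight variables.

```python
from statistics import fmean


def replicate_means(results):
    """Average each parameter over R fits; results[r][k] is parameter k from fit r."""
    return [fmean(column) for column in zip(*results)]


# Hypothetical estimates of two parameters from R = 3 weight variables.
estimates = [
    [0.50, 1.20],
    [0.54, 1.10],
    [0.52, 1.30],
]
print(replicate_means(estimates))
```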

The data

The data set used here is described in Section 3.4.1. A few of the variables are shown below for the first 10 observations in the data set.

The contents of the PSF are obtained by selecting the Statistics, Data Screening option. A portion of the output is shown below.

The following variables included in the PSF were selected from the survey data:

o CENREG: This variable indicates the census region and has four categories, these being "Northeast", "Midwest", "South", and "West" respectively.
o FACTYPE: The facility treatment type also has four categories, representing facilities with "residential treatment", "outpatient methadone treatment", "outpatient non-methadone treatment", and "more than one type of treatment" respectively.
o ALCEU: An indicator variable with value "1" if the respondent has ever used alcohol, and "0" otherwise.
o COCEU: An indicator variable with value "1" if the respondent has ever used cocaine, and "0" otherwise.
o MAREU: An indicator variable with value "1" if the respondent has ever used marijuana, and "0" otherwise.
o AGE: This variable denotes age at admission to a facility.
o GENDER: The respondent's gender is denoted by this indicator variable, which assumes a value of "1" for female respondents.
o RACE_D: The original variable RACE recoded so that "1" denotes white and "0" other ethnic groups.
o DEPR: This indicator variable is coded "1" if the respondent is depressed, and "0" otherwise.
o EDU: A categorical variable representing the respondent's level of education at admission. It has 5 categories, these being (from 1 to 5) "less than 8 years", "8-11 years or less than High School graduate", "High School graduate / GED", "some college", and "college graduate / postgraduate".
o JAILR: This indicator variable indicates whether the respondent had a prison or jail record prior to admission.
o NUMTE: A count variable, indicating the total number of treatment episodes prior to admission.
o A2TWA0 to A2TWA78: These variables are abstract final full sample weights. A more complete description follows below.

Variance estimation frequently relies on one of two techniques: Taylor series linearization or replicate weights. Replicate weights are based on the same ideas as the jackknife, and have recently come into use in US government surveys, where they are provided instead of information on PSUs. In such cases, replicate weights may be used to disguise and/or prevent identification of individuals within PSUs, thereby preserving privacy.
Handling of missing data

Missing values for the variables ALCEU, COCEU, ..., NUMTE in the file replicwts.psf are coded -9.0. For these variables, 20.8% of the possible values are not observed. Use of listwise deletion would result in retaining only 97 of the selected 5005 cases. Therefore, we use the full information maximum likelihood (FIML) procedure as described in Section 5.7. To define -9.0 as the global missing value code, select the Data, Define Variables option from the main menu. Select the variable ALCEU (or any other variable in the list) and click the Missing Values button to activate the Missing Values for dialog box. Type -9.0 in the Global missing value text box and select pairwise as the deletion method. Click OK when done.


Handling of zero weights

Descriptive statistics of the weight variables A2TWA0 to A2TWA78 revealed that these variables contain zero values. Each zero value encountered in a row of the data set was replaced by the average of all the non-zero weights in the corresponding row.
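The replacement rule can be sketched as follows in Python. This is a hypothetical helper, not part of LISREL, and the example weights are made up; a row is taken to be the list of weight values for one case.

```python
def replace_zero_weights(row):
    """Replace each zero weight by the mean of the non-zero weights in the same row."""
    nonzero = [w for w in row if w != 0]
    if not nonzero:  # nothing to average over; leave the row unchanged
        return list(row)
    mean_nonzero = sum(nonzero) / len(nonzero)
    return [w if w != 0 else mean_nonzero for w in row]


# Hypothetical weights for one case.
print(replace_zero_weights([2.0, 0.0, 4.0]))
```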

We found that multiple imputation failed in estimating weights if 0 was regarded as a missing value code. The reason for this is that all the weight variables are highly correlated and therefore the covariance matrix of the weight variables is essentially singular.


To define CENREG and FACTYPE as stratification and cluster variables, select the Data, Survey Design option from the main menu and add the variables as shown in the dialog box below. For our first analysis, select A2TWA0 as the Design weight variable.

The model

The path diagram representation of the model fitted to the data is shown below. It is assumed that each of the X-variables AGE, GENDER, RACE_D, DEPR, EDU and JAILR is a perfect indicator of the corresponding latent variable, these being age, gender, race_d, depr, educ and jailr. This implies that the error variances of the X-variables are zero, and that the path coefficients age → AGE, gender → GENDER, race_d → RACE_D, depr → DEPR, educ → EDU and jailr → JAILR are all equal to one. In the LISREL terminology, age, gender, race_d, depr, educ and jailr are exogenous (KSI) latent variables. In principle, several indicators of depression and education, if available, could be incorporated into this model.

In the Y part of the model, we include the dependent variables ALCEU, COCEU, MAREU and NUMTE. It is assumed that ALCEU, COCEU, and MAREU are indicators of the endogenous latent (ETA) variable subabuse, while NUMTE is a perfect indicator of the ETA variable numte. The path subabuse → ALCEU is set equal to 1 to fix the scale of the endogenous latent variable subabuse. Finally, we assume that numte can be predicted by age, ..., jailr, and that subabuse is in turn predicted by numte. In this context, the latent variable numte is a so-called mediating variable.

Mathematical model

The LISREL model consists of a measurement and structural part.

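Measurement model

$$\begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} \Lambda_y & 0 \\ 0 & \Lambda_x \end{pmatrix} \begin{pmatrix} \eta \\ \xi \end{pmatrix} + \begin{pmatrix} \epsilon \\ \delta \end{pmatrix}$$

where $x = (\text{AGE}, \text{GENDER}, \text{RACE\_D}, \text{DEPR}, \text{EDU}, \text{JAILR})'$, $\Lambda_x$ is a $6 \times 6$ identity matrix and $\xi = (\text{age}, \text{gender}, \text{race\_d}, \text{depr}, \text{educ}, \text{jailr})'$. Also, $\mathrm{Cov}(\delta) = 0$ and $\mathrm{Cov}(\xi) = \Phi$.

Furthermore,

$$\eta = \begin{pmatrix} \text{subabuse} \\ \text{numte} \end{pmatrix}, \qquad y = \begin{pmatrix} \text{ALCEU} \\ \text{COCEU} \\ \text{MAREU} \\ \text{NUMTE} \end{pmatrix} \qquad \text{and} \qquad \Lambda_y = \begin{pmatrix} 1 & 0 \\ \lambda_{21} & 0 \\ \lambda_{31} & 0 \\ 0 & 1 \end{pmatrix}.$$

Finally, $\mathrm{Cov}(\epsilon)$ is a diagonal matrix with diagonal elements $\mathrm{var}(\epsilon_{\text{ALCEU}})$, $\mathrm{var}(\epsilon_{\text{COCEU}})$, $\mathrm{var}(\epsilon_{\text{MAREU}})$ and 0.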
Structural equation model

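The structural model can be written as

$$\eta = B \eta + \Gamma \xi + \zeta,$$

where

$$B = \begin{pmatrix} 0 & \beta_{12} \\ 0 & 0 \end{pmatrix}$$

and

$$\Gamma = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ \gamma_{21} & \gamma_{22} & \gamma_{23} & \gamma_{24} & \gamma_{25} & \gamma_{26} \end{pmatrix},$$

that is,

$$\begin{aligned} \text{subabuse} &= \beta_{12}\, \text{numte} + \zeta_1 \\ \text{numte} &= \gamma_{21}\, \text{age} + \gamma_{22}\, \text{gender} + \gamma_{23}\, \text{race\_d} + \gamma_{24}\, \text{depr} + \gamma_{25}\, \text{educ} + \gamma_{26}\, \text{jailr} + \zeta_2. \end{aligned}$$

Also

$$\mathrm{Cov}(\zeta) = \Psi = \begin{pmatrix} \psi_{11} & \psi_{21} \\ \psi_{21} & \psi_{22} \end{pmatrix}.$$

The unknown model parameters are therefore $\lambda_{y,21}, \lambda_{y,31}, \beta_{12}, \gamma_{21}, \gamma_{22}, \gamma_{23}, \gamma_{24}, \gamma_{25}, \gamma_{26}, \phi_{11}, \phi_{21}, \phi_{22}, \ldots, \phi_{66}, \psi_{11}, \psi_{21}, \psi_{22}, \mathrm{var}(\epsilon_1), \mathrm{var}(\epsilon_2)$, and $\mathrm{var}(\epsilon_3)$. In subsequent output, these 36 parameters will be denoted by the symbols LY21, LY31, BETA12, GAMMA21, GAMMA22, ..., GAMMA26, PHI11, PHI21, ..., PHI66, PSI11, PSI21, PSI22, TE11, TE22 and TE33.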

Example: Setting up the analysis using SIMPLIS syntax

It is relatively easy to specify the model described above with SIMPLIS syntax. We start by indicating that the raw data are to be read from the file replicwts.psf. Note that this is followed by a REWIND command, which allows LISREL to repeatedly read the raw data from the same file. The program commands AGE = 1.0*age, ..., JAILR = 1.0*jailr specify that each of the X-variables is assumed to be exactly equal to the corresponding latent variable. Note that the part 1.0*latent variable indicates that the path coefficient is fixed at the value of 1.0. In contrast, a command such as COCEU = (0.5)*subabuse indicates that 0.5 is a starting value (preliminary estimate) of $\lambda_{y,21}$. Since NUMTE and the X-variables are assumed to measure the corresponding latent variables without error, we set the error variances of the variables NUMTE to JAILR to 0.


Next, we assume that subabuse is predicted by numte, and, in turn, numte is predicted by age, gender, race_d, depr, educ and jailr. This part of the syntax is shown below.

To allow for the estimation of the covariance of the errors between the ETA latent variables numte and subabuse, we SET the error covariance between these variables free.

Finally, we use the LISREL OUTPUT: command to specify the number of repetitions (RP = 1), number of decimals (ND = 4), admissibility check off (AD = OFF), and save the standard error estimates (SV = replic1.std) and parameter estimates (PV = replic1.par) to text files replic1.std and replic1.par respectively.

Discussion of results: SIMPLIS syntax

The estimated path coefficients for weight = A2TWA0 are shown in the path diagram given below. Although not presented here, all coefficients are statistically significant. The $\chi^2$-statistic for goodness of fit is 24.13, the degrees of freedom is 19, and the p-value is 0.19103.

Contents of the files replic1.par and replic1.std are given below. The estimates are preceded by three numbers N1, N2, and N3, where N1 = repetition number, while N2 and N3 are zero if convergence has been attained. The parameter estimate and standard error values are given in scientific notation. For example,

$$0.165949\text{D}{+}02 = 0.165949 \times 10^{2} = 16.5949$$
$$0.234136\text{D}{+}00 = 0.234136 \times 10^{0} = 0.234136$$

In general, D+k implies that the decimal point should be moved k positions to the right, whereas D-k implies the insertion of k zero values just after the decimal point. For example, 0.868283D-03 = 0.000868283.
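When such values are post-processed outside LISREL, the D exponent has to be converted first. A minimal Python sketch, with a hypothetical helper name, is given below; it handles a single value and assumes nothing about the layout of the .par and .std files beyond the notation described above.

```python
def parse_d_notation(text):
    """Convert a value such as '0.165949D+02' to a float."""
    # Python's float() understands 'E' exponents, so map 'D' to 'E'
    # and drop any blanks inside the token.
    return float(text.replace(" ", "").upper().replace("D", "E"))


print(parse_d_notation("0.165949D+02"))
print(parse_d_notation("0.868283D-03"))
```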


Example: Use of replicate weights

We now demonstrate the use of replicate weights in LISREL, and start by making the PSF window the active window in order to invoke the PSF menu bar. From the Data menu, select Survey Design and remove the stratum variable CENREG and the cluster variable FACTYPE by clicking on the corresponding Remove buttons. Select the File, Save option to save this change.


Next, edit the SIMPLIS syntax file by changing RP = 1 to RP = 79 (shown in bold typeface below) and use the filenames replic2.std and replic2.par for saving the standard errors and parameter estimates respectively. It is important to note that LISREL assumes that there are a total of 79 weight variables in the data set, starting with the Design weight variable name (in the present case A2TWA0) selected in the Survey Design dialog box. Additionally, it is assumed that the weight columns follow one another.

Discussion of results using replicate weights

The results below show the parameter estimates and parameter standard error estimates for the model fitted with design weight A2TWA0, including stratification (CENREG) and clustering (FACTYPE) variables, and the corresponding average values using replicate weights without CENREG and FACTYPE. In general, parameter estimates for the two estimation methods are quite close. Standard error estimates, on the other hand, tend to be larger for the replicate method. This is clearly a topic that requires further research.


5.6 Theory

5.6.1 Introduction

We assume that the population from which the sample data are obtained can be stratified into $H$ strata. Within each stratum $h$, $n_h$ clusters or primary sampling units (PSUs) are drawn, and within the $h$-th stratum and $k$-th cluster, $n_{hk}$ ultimate sampling units (USUs) are drawn with design weights $w_{hkl}$, where $l$ denotes the $l$-th USU within the $k$-th cluster, which in turn is nested within stratum $h$. In the subsequent sections we discuss parameter estimation and the Taylor linearization method employed in LISREL to produce robust standard error estimates under single stage sampling.

5.6.2 Parameter estimation

Assume that $y_{hkl}$ is distributed as a $p \times 1$ multivariate normal random vector with mean $\mu$ and covariance matrix $\Sigma$. In structural equation modeling, it is hypothesized that $\mu = \mu(\theta)$ and $\Sigma = \Sigma(\theta)$, where $\theta$ is a vector of $q$ unknown parameters to be estimated.

Example 1: CFA model with one factor

Consider the model $\Sigma(y) = \lambda \phi \lambda' + D_\psi$. For this model, we have $\theta = (\lambda_1, \lambda_2, \ldots, \lambda_p, \phi, \psi_1, \psi_2, \ldots, \psi_p)'$.

To avoid any indeterminacy in estimating the unknown parameters, one would typically set the variance ($\phi$) of the latent factor equal to 1, or alternatively fix one of the factor loadings, for example $\lambda_1$, to 1. In the latter case, $\theta = (\lambda_2, \ldots, \lambda_p, \phi, \psi_1, \psi_2, \ldots, \psi_p)'$.
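The indeterminacy can be seen numerically. The Python sketch below builds the covariance matrix implied by a one-factor model, assuming the structure "loadings times factor variance times loadings, plus a diagonal matrix of unique variances". The function name and all numerical values are hypothetical. Rescaling the loadings by 2 while dividing the factor variance by 4 leaves the implied matrix unchanged, which is why one of these quantities has to be fixed.

```python
def one_factor_sigma(loadings, factor_variance, unique_variances):
    """Implied covariance matrix of a one-factor model, as a list of rows."""
    p = len(loadings)
    return [[loadings[i] * loadings[j] * factor_variance
             + (unique_variances[i] if i == j else 0.0)
             for j in range(p)]
            for i in range(p)]


# Two hypothetical parameter sets that imply the same covariance matrix.
sigma_a = one_factor_sigma([1.0, 2.0], 4.0, [1.0, 1.0])
sigma_b = one_factor_sigma([2.0, 4.0], 1.0, [1.0, 1.0])
print(sigma_a == sigma_b)
```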

Complete data

In the case of complete data, parameter estimation is relatively straightforward and can be summarized by the following two steps.

Step 1

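Calculate the natural logarithm of the likelihood function, $\ln L$, where

$$\ln L = \sum_{h=1}^{H} \sum_{k=1}^{n_h} \sum_{l=1}^{n_{hk}} w_{hkl} \ln f(y_{hkl} \mid \theta) \qquad (5.11)$$

where

$$f(y_{hkl} \mid \theta) = (2\pi)^{-p/2}\, |\Sigma(\theta)|^{-1/2} \exp\left\{ -\tfrac{1}{2} \operatorname{tr}\left[ \Sigma^{-1}(\theta)\, G \right] \right\} \qquad (5.12)$$

and

$$G = \left( y_{hkl} - \mu(\theta) \right) \left( y_{hkl} - \mu(\theta) \right)' \qquad (5.13)$$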

Step 2
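Obtain an estimate $\hat{\theta}$ of $\theta$ by solving the set of simultaneous equations

$$\left. \frac{\partial \ln L}{\partial \theta} \right|_{\theta = \hat{\theta}} = 0 \qquad (5.14)$$

In general, no closed-form solution to the set of equations in (5.14) exists, and therefore parameter estimates are obtained iteratively using the Fisher scoring algorithm:

$$\theta^{(t+1)} = \theta^{(t)} + I_n^{-1}(\theta^{(t)})\, g(\theta^{(t)}) \qquad (5.15)$$

where $\theta^{(t)}$ denotes the parameter values at iteration $t$, $t = 1, 2, \ldots$, and where $g(\theta)$ denotes the gradient vector and $I_n(\theta)$ the Fisher information matrix for the elements of $\theta$. In other words,

$$g(\theta) = \frac{\partial \ln L}{\partial \theta} \qquad (5.16)$$

and

$$I_n(\theta) = -E\left[ \frac{\partial^2 \ln L}{\partial \theta\, \partial \theta'} \right] \qquad (5.17)$$

Iterations are continued until $\left| \theta_i^{(t+1)} - \theta_i^{(t)} \right| < \varepsilon$, $i = 1, 2, \ldots, q$, where $\varepsilon$ is a small scalar value, e.g. $10^{-6}$.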
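The iteration scheme above can be sketched on a deliberately simple stand-in problem. The Python example below does not use the structural equation likelihood. As an assumption made purely for illustration, it estimates the log-rate of weighted Poisson counts, for which the gradient and Fisher information have simple closed forms and the solution is known to be the log of the weighted mean. The function name and data are hypothetical.

```python
import math


def fisher_scoring_poisson(y, w, theta=0.0, tol=1e-6, max_iter=100):
    """Fisher scoring for theta = log(rate) under a weighted Poisson log-likelihood."""
    for iteration in range(1, max_iter + 1):
        rate = math.exp(theta)
        gradient = sum(wi * (yi - rate) for yi, wi in zip(y, w))  # d lnL / d theta
        information = sum(wi * rate for wi in w)                  # -E[d2 lnL / d theta2]
        new_theta = theta + gradient / information                # scoring update
        if abs(new_theta - theta) < tol:                          # convergence check
            return new_theta, iteration
        theta = new_theta
    raise RuntimeError("Fisher scoring did not converge")


# Hypothetical counts and design weights.
counts = [2, 3, 7]
weights = [1.0, 2.0, 0.5]
theta_hat, n_iter = fisher_scoring_poisson(counts, weights)
print(theta_hat, n_iter)
```

The update, the use of the expected rather than observed second derivatives, and the stopping rule mirror the algorithm described in the text; only the likelihood differs.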

Incomplete data

Since the case of no missing values may be considered as a special case of a general framework that handles missing values, specific expressions for the gradient vector and information matrix will be given in this section. Suppose that the data set contains $n$ cases, where $n = \sum_{h=1}^{H} \sum_{k=1}^{n_h} \sum_{l=1}^{n_{hk}} 1$; then one could alternatively write (5.11) as

$$\ln L = \sum_{\alpha=1}^{n} w_\alpha \ln f(y_\alpha \mid \theta),$$

where $\alpha$ denotes the subscript $hkl$.

Suppose that the data can be split into two subsets of sizes $n_1$ and $n_2$, such that subset 1 contains no missing values and subset 2 contains no values for the $p$-th element of $y$. Let $w_{1j}$ and $y_{1j}$, $j = 1, 2, \ldots, n_1$, denote the weights and observations in subset 1, while $w_{2j}$ and $y_{2j}$, $j = 1, 2, \ldots, n_2$, denote the weights and observations in subset 2. In this case, we can write $\ln L$ as

$$\ln L = \sum_{i=1}^{2} \sum_{j=1}^{n_i} w_{ij} \ln f(y_{ij} \mid \theta) \qquad (5.18)$$

Suppose that, in general, there are NPAT patterns of missingness, $i = 1, 2, \ldots, \text{NPAT}$, and that the number of observations within a pattern equals $n_i$, $j = 1, 2, \ldots, n_i$. Let $p_i$ indicate the number of variables with non-missing data.

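Example 2

| Pattern | $y_1$ | $y_2$ | $y_3$ | $y_4$ | $p_i$ | $w_{ij}$ |
|---|---|---|---|---|---|---|
| 1 | 19.3 | 36.4 | 18.7 | . | 3 | 1.8 |
| 1 | 17.6 | 34.3 | 19.1 | . | 3 | 1.7 |
| 1 | 16.4 | 32.8 | 17.9 | . | 3 | 1.4 |
| 2 | 20.1 | 40.2 | 19.6 | 57.8 | 4 | 0.8 |
| 2 | 19.4 | 39.3 | 20.1 | 58.9 | 4 | 0.7 |
| 3 | . | 36.5 | . | 62.4 | 2 | 1.1 |
| 4 | . | . | . | 60.2 | 1 | 1.3 |

For the data shown above, NPAT = 4, $n = 7$, $n_1 = 3$, $n_2 = 2$, $n_3 = 1$ and $n_4 = 1$.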
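The bookkeeping in Example 2 can be reproduced with a short Python sketch. The helper name is hypothetical, and a missing value (shown as "." in the example) is represented here by None. Cases are grouped by which variables are observed, and NPAT, the pattern sizes and the numbers of observed variables follow from the grouping.

```python
def missing_patterns(rows):
    """Group cases by missingness pattern (True = observed), in order of appearance."""
    patterns = {}
    for row in rows:
        key = tuple(value is not None for value in row)
        patterns.setdefault(key, []).append(row)
    return patterns


# The seven cases of Example 2 (y1, y2, y3, y4); None marks a missing value.
cases = [
    (19.3, 36.4, 18.7, None),
    (17.6, 34.3, 19.1, None),
    (16.4, 32.8, 17.9, None),
    (20.1, 40.2, 19.6, 57.8),
    (19.4, 39.3, 20.1, 58.9),
    (None, 36.5, None, 62.4),
    (None, None, None, 60.2),
]
groups = missing_patterns(cases)
npat = len(groups)
n_i = [len(members) for members in groups.values()]
p_i = [sum(key) for key in groups]
print(npat, n_i, p_i)
```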


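In general,

$$\ln L = -\frac{1}{2} \sum_{i=1}^{\text{NPAT}} \sum_{j=1}^{n_i} w_{ij} \left\{ p_i \ln 2\pi + \ln |\Sigma_i| + \operatorname{tr} \Sigma_i^{-1} (y_{ij} - \mu_i)(y_{ij} - \mu_i)' \right\} \qquad (5.19)$$

Let

$$w_{i.} = \sum_{j=1}^{n_i} w_{ij} \qquad (5.20)$$

and

$$\bar{y}_{wi} = \frac{1}{w_{i.}} \sum_{j=1}^{n_i} w_{ij}\, y_{ij} \qquad (5.21)$$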

Example 3

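For the data set given in Example 2, $w_{1.} = 1.8 + 1.7 + 1.4 = 4.9$, and

$$\bar{y}_{w1} = \frac{1}{4.9} \left\{ 1.8 \begin{pmatrix} 19.3 \\ 36.4 \\ 18.7 \end{pmatrix} + 1.7 \begin{pmatrix} 17.6 \\ 34.3 \\ 19.1 \end{pmatrix} + 1.4 \begin{pmatrix} 16.4 \\ 32.8 \\ 17.9 \end{pmatrix} \right\} = \begin{pmatrix} 17.88 \\ 34.64 \\ 18.61 \end{pmatrix}$$

Ignoring the weights, $\bar{y}_1 = (17.77 \;\; 34.50 \;\; 18.57)'$.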
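The arithmetic of Example 3 can be checked with a few lines of Python. The helper name is hypothetical; the observations and weights are those of pattern 1 in Example 2.

```python
def weighted_mean(observations, weights):
    """Return (sum of weights, weighted mean vector) for one missingness pattern."""
    total = sum(weights)
    p = len(observations[0])
    mean = [sum(w * y[k] for y, w in zip(observations, weights)) / total
            for k in range(p)]
    return total, mean


# Pattern 1 of Example 2: observed values of (y1, y2, y3) and their weights.
y_pattern1 = [(19.3, 36.4, 18.7), (17.6, 34.3, 19.1), (16.4, 32.8, 17.9)]
w_pattern1 = [1.8, 1.7, 1.4]

w_total, y_bar_w = weighted_mean(y_pattern1, w_pattern1)
_, y_bar = weighted_mean(y_pattern1, [1.0, 1.0, 1.0])  # unweighted mean
print(round(w_total, 1), [round(v, 2) for v in y_bar_w], [round(v, 2) for v in y_bar])
```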

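Use of (5.20) and (5.21) leads to the following expression for (5.19):

$$\begin{aligned} \ln L &= -\frac{1}{2} \sum_{i=1}^{\text{NPAT}} \left[ w_{i.} \left( p_i \ln 2\pi + \ln |\Sigma_i| \right) + \operatorname{tr} \Sigma_i^{-1} \sum_{j=1}^{n_i} w_{ij} \left( y_{ij} y_{ij}' - y_{ij} \mu_i' - \mu_i y_{ij}' + \mu_i \mu_i' \right) \right] \\ &= -\frac{1}{2} \sum_{i=1}^{\text{NPAT}} \left[ w_{i.} \left( p_i \ln 2\pi + \ln |\Sigma_i| \right) + \operatorname{tr} \Sigma_i^{-1} \left\{ \sum_{j=1}^{n_i} w_{ij} y_{ij} y_{ij}' - w_{i.} \bar{y}_{wi} \mu_i' - \mu_i w_{i.} \bar{y}_{wi}' - w_{i.} \bar{y}_{wi} \bar{y}_{wi}' + w_{i.} \bar{y}_{wi} \bar{y}_{wi}' + w_{i.} \mu_i \mu_i' \right\} \right] \end{aligned}$$

Therefore

$$\ln L = -\frac{1}{2} \sum_{i=1}^{\text{NPAT}} w_{i.} \left\{ \left( p_i \ln 2\pi + \ln |\Sigma_i| \right) + \operatorname{tr} \Sigma_i^{-1} \left( G_{wi} + S_{wi} \right) \right\} \qquad (5.22)$$

where

$$G_{wi} = \left( \bar{y}_{wi} - \mu_i \right) \left( \bar{y}_{wi} - \mu_i \right)' \qquad (5.23)$$

and

$$S_{wi} = \frac{1}{w_{i.}} \sum_{j=1}^{n_i} w_{ij} \left( y_{ij} - \bar{y}_{wi} \right) \left( y_{ij} - \bar{y}_{wi} \right)' \qquad (5.24)$$

or equivalently,

$$w_{i.} S_{wi} = \sum_{j=1}^{n_i} w_{ij}\, y_{ij} y_{ij}' - w_{i.}\, \bar{y}_{wi} \bar{y}_{wi}'$$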
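The step from (5.19) to (5.22) rests on the identity that, within a pattern, the weighted sum of the outer products of the deviations from the mean vector equals the weight total times the sum of the two matrices defined in (5.23) and (5.24). The Python sketch below checks this numerically for pattern 1 of Example 2. The helper names and the mean vector `mu` are hypothetical choices made for illustration only.

```python
def outer(a, b):
    """Outer product of two vectors, as a list of rows."""
    return [[x * y for y in b] for x in a]


def weighted_sum(matrices, weights):
    """Elementwise weighted sum of equally sized square matrices."""
    p = len(matrices[0])
    return [[sum(w * m[r][c] for m, w in zip(matrices, weights)) for c in range(p)]
            for r in range(p)]


def diff(a, b):
    return [x - y for x, y in zip(a, b)]


y = [(19.3, 36.4, 18.7), (17.6, 34.3, 19.1), (16.4, 32.8, 17.9)]
w = [1.8, 1.7, 1.4]
mu = [18.0, 35.0, 19.0]  # hypothetical model-implied mean vector

w_total = sum(w)
y_bar = [sum(wj * yj[k] for yj, wj in zip(y, w)) / w_total for k in range(3)]

# Left-hand side: weighted sum of (y - mu)(y - mu)' over the cases in the pattern.
lhs = weighted_sum([outer(diff(yj, mu), diff(yj, mu)) for yj in y], w)

# Right-hand side: weight total times (G + S), with G and S as in (5.23) and (5.24).
G = outer(diff(y_bar, mu), diff(y_bar, mu))
S = [[v / w_total for v in row]
     for row in weighted_sum([outer(diff(yj, y_bar), diff(yj, y_bar)) for yj in y], w)]
rhs = [[w_total * (G[r][c] + S[r][c]) for c in range(3)] for r in range(3)]

max_gap = max(abs(lhs[r][c] - rhs[r][c]) for r in range(3) for c in range(3))
print(max_gap < 1e-9)
```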

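Gradient vector

$$g(\theta_r) = \sum_{i=1}^{\text{NPAT}} \sum_{j=1}^{n_i} w_{ij} \left\{ (y_{ij} - \mu_i)' \Sigma_i^{-1} \frac{\partial \mu_i}{\partial \theta_r} + \frac{1}{2} \operatorname{tr} \Sigma_i^{-1} \left[ (y_{ij} - \mu_i)(y_{ij} - \mu_i)' - \Sigma_i \right] \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_r} \right\} \qquad (5.25)$$

Use of (5.20), (5.21), (5.22) and (5.23) gives

$$\begin{aligned} g(\theta_r) &= \sum_{i=1}^{\text{NPAT}} \left\{ \sum_{j=1}^{n_i} \left( w_{ij} y_{ij} - w_{ij} \mu_i \right)' \Sigma_i^{-1} \frac{\partial \mu_i}{\partial \theta_r} + \frac{1}{2} \operatorname{tr} \Sigma_i^{-1} \left[ w_{i.} S_{wi} + w_{i.} G_{wi} - w_{i.} \Sigma_i \right] \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_r} \right\} \\ &= \sum_{i=1}^{\text{NPAT}} w_{i.} \left\{ \left( \bar{y}_{wi} - \mu_i \right)' \Sigma_i^{-1} \frac{\partial \mu_i}{\partial \theta_r} + \frac{1}{2} \operatorname{tr} \Sigma_i^{-1} \left[ S_{wi} + G_{wi} - \Sigma_i \right] \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_r} \right\} \qquad (5.26) \end{aligned}$$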

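Information matrix

$$\begin{aligned} I_n(\theta_{rs}) = -E\left[ \frac{\partial^2 \ln L}{\partial \theta_r\, \partial \theta_s} \right] &= \sum_{i=1}^{\text{NPAT}} \sum_{j=1}^{n_i} w_{ij} \left\{ \frac{\partial \mu_i'}{\partial \theta_r} \Sigma_i^{-1} \frac{\partial \mu_i}{\partial \theta_s} + \frac{1}{2} \operatorname{tr} \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_r} \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_s} \right\} \\ &= \sum_{i=1}^{\text{NPAT}} w_{i.} \left\{ \frac{\partial \mu_i'}{\partial \theta_r} \Sigma_i^{-1} \frac{\partial \mu_i}{\partial \theta_s} + \frac{1}{2} \operatorname{tr} \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_r} \Sigma_i^{-1} \frac{\partial \Sigma_i}{\partial \theta_s} \right\} \qquad (5.27) \end{aligned}$$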

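Let

$$r_{ij} = \Sigma_i^{-1} \left( y_{ij} - \mu_i \right) \qquad (5.28)$$

and

$$P_{ij} = r_{ij} r_{ij}' - \Sigma_i^{-1}. \qquad (5.29)$$

From (5.26) it follows that

$$[g(\theta)]_r = \sum_{i=1}^{\text{NPAT}} \sum_{j=1}^{n_i} [g_{ij}]_r \qquad (5.30)$$

where

$$[g_{ij}]_r = w_{ij} \left\{ r_{ij}' \frac{\partial \mu_i}{\partial \theta_r} + \frac{1}{2} \operatorname{tr} P_{ij} \frac{\partial \Sigma_i}{\partial \theta_r} \right\}, \quad i = 1, 2, \ldots, \text{NPAT}; \; j = 1, 2, \ldots, n_i \qquad (5.31)$$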

Approximate covariance matrix of estimators

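An approximate covariance matrix of the parameter estimators is derived as follows. From (5.14), (5.16) and (5.30) it follows that $\hat{\theta}$ is the solution to the set of equations

$$w(\hat{\theta}) = \sum_{i=1}^{\text{NPAT}} \sum_{j=1}^{n_i} g_{ij}(\hat{\theta}) = 0 \qquad (5.32)$$

Note that each $g_{ij}$ is associated with a specific case $hkl$, where $h$ denotes strata, $k$ clusters and $l$ ultimate sample units. Using a first-order Taylor expansion of $w(\hat{\theta})$ at $\hat{\theta} = \theta$, it follows that

$$0 = w(\hat{\theta}) \approx w(\theta) + \frac{\partial w(\theta)}{\partial \theta'} \left( \hat{\theta} - \theta \right) \qquad (5.33)$$

Taking variances on both sides, it further follows that

$$\mathrm{Cov}\left( w(\theta) \right) \approx \frac{\partial w(\theta)}{\partial \theta'}\, \mathrm{Cov}(\hat{\theta}) \left[ \frac{\partial w(\theta)}{\partial \theta'} \right]'. \qquad (5.34)$$

Thus, provided that (cf. (5.32))

$$\frac{\partial g(\theta)}{\partial \theta'} = \frac{\partial^2 \ln L}{\partial \theta\, \partial \theta'}$$

is a non-singular matrix,

$$\mathrm{Cov}(\hat{\theta}) \approx \left[ \frac{\partial^2 \ln L}{\partial \theta\, \partial \theta'} \right]^{-1} \mathrm{Cov}\left( w(\theta) \right) \left[ \frac{\partial^2 \ln L}{\partial \theta\, \partial \theta'} \right]^{-1},$$

where $-E\left[ \dfrac{\partial^2 \ln L}{\partial \theta\, \partial \theta'} \right] = I_n(\theta)$.

Therefore, an approximate expression for the asymptotic covariance matrix of $\hat{\theta}$ is given by

$$\mathrm{Cov}(\hat{\theta}) \approx I_n^{-1}(\theta)\, G\, I_n^{-1}(\theta), \quad \text{where } G = \mathrm{Cov}\left( w(\theta) \right). \qquad (5.35)$$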

Using results derived by Binder (1983) and Fuller (1975), it follows that

$$G = \frac{n - 1}{n - q} \sum_{h=1}^{H} \frac{n_h (1 - f_h)}{n_h - 1} \sum_{i=1}^{n_h} \left( t_{hi.} - \bar{t}_{h..} \right) \left( t_{hi.} - \bar{t}_{h..} \right)' \qquad (5.36)$$

where

$n_h = \sum_{j=1}^{n_{hi}} m_{hij}$, with $m_{hij}$ the number of cases with identical response patterns within stratum $h$, cluster $i$, and USU $j$. If $f_{hij} = 1$ for all $h$, then $m_{hij} = 1$ for all $h$, $i$ and $j$.

$f_h = n_h / N_h$, the sampling rate for stratum $h$.

$t_{hij} = g_{hij}(\hat{\theta})$, where $g_{hij}(\hat{\theta})$ (cf. (5.31)) is the $hij$-th contribution to the gradient vector $g(\hat{\theta})$ as defined by (5.30).

$t_{hi.} = \sum_{j} t_{hij}$

$\bar{t}_{h..} = \dfrac{1}{n_h} \sum_{i=1}^{n_h} t_{hi.}$

In practice, we assume a zero contribution to G for strata that contain a single PSU (cluster). Additionally, if the data do not contain a stratification variable, the PSUs are assumed to be the strata, and the observations (ultimate sampling units) within each PSU, clusters. Likewise, if there is no variable to define clusters, the observations within each stratum are treated as being the primary sampling units.

Adjustment to the chi-square goodness of fit statistics


Simulation studies indicated that the $\chi^2_{LR}$-statistic based on the log-likelihood (cf. (5.22)) in general yields too high a rejection rate. Let (cf. (5.35))

$$d = \operatorname{tr}\left( I_n(\hat{\theta})\, \mathrm{Cov}(\hat{\theta}) \right), \qquad (5.37)$$

where $I_n$ denotes the information matrix defined by (5.27). A correction to the $\chi^2$-statistic for testing model fit is given by

$$\chi^2_{robust} = c\, \chi^2_{LR},$$

where

$$c = \frac{q}{d},$$

and where $q$ denotes the number of parameters to be estimated.
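The correction can be sketched in Python as follows. The helper names and all matrices are hypothetical. The example illustrates that no correction is applied (c = 1) when the covariance matrix of the estimators equals the inverse information matrix, and that the statistic is scaled down when the robust covariance matrix is larger.

```python
def trace_of_product(a, b):
    """tr(AB) for two square matrices given as lists of rows."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def robust_chi_square(chi2_lr, information, cov_theta):
    """Scale the LR chi-square by c = q / d with d = tr(I_n Cov(theta_hat))."""
    q = len(information)  # number of estimated parameters
    d = trace_of_product(information, cov_theta)
    c = q / d
    return c * chi2_lr, c


# Hypothetical 2-parameter example.
info = [[4.0, 0.0], [0.0, 2.0]]
cov_model_based = [[0.25, 0.0], [0.0, 0.5]]  # inverse of info: d = q, so c = 1
cov_robust = [[0.5, 0.0], [0.0, 1.0]]        # twice as large: d = 2q, so c = 0.5

print(robust_chi_square(10.0, info, cov_model_based))
print(robust_chi_square(10.0, info, cov_robust))
```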

5.7 References

Asparouhov, T. (2004). Weighting for Unequal Probability of Selection in Latent Variable Modeling. Mplus Web Notes: No. 7.

Binder, D.A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.

Browne, M.W. & Cudeck, R. (1993). Alternative ways of assessing model fit. In: K.A. Bollen & J.S. Long (Eds.), Testing Structural Equation Models. Sage Publications.

Curran, P.J. (1997). Comparing three modern approaches to longitudinal data analysis: An examination of a single developmental sample. Symposium conducted at the meeting of the Society for Research on Child Development, Washington, DC.

Curran, P.J., Bauer, D.J., & Willoughby, M.T. (2004). Testing Main Effects and Interactions in Latent Curve Analysis. Psychological Methods, 9(2), 220-237.

Du Toit, S.H.C. & Browne, M.W. (2001). The covariance structure of a vector ARMA time series. In: R. Cudeck, S.H.C. du Toit, & D. Sörbom (Eds.), Structural Equation Modeling: Present and Future. Lincolnwood, IL: Scientific Software International, Inc.

Du Toit, S.H.C. & Cudeck, R. (2001). The analysis of nonlinear random coefficient regression models with LISREL using constraints. In: R. Cudeck, S.H.C. du Toit, & D. Sörbom (Eds.), Structural Equation Modeling: Present and Future. Lincolnwood, IL: Scientific Software International, Inc.

Fuller, W.A. (1975). Regression Analysis for Sample Survey. Sankhya, Series C, 37, 117-132.

Hayduk, L. (1996). LISREL issues, debates and strategies. Baltimore: Johns Hopkins University Press.

Jöreskog, K.G. & Sörbom, D. (1996). LISREL 8: Structural Equation Modeling. Chicago: Scientific Software International.

Kaplan, D. (2000). Structural Equation Modeling: Foundations and Extensions (Advanced Quantitative Techniques in the Social Sciences). Sage.
