MULTIPLE REGRESSION:
Testing and Interpreting Interactions

Leona S. Aiken
Stephen G. West
Arizona State University

SAGE PUBLICATIONS
The International Professional Publishers
Newbury Park  London  New Delhi
Aiken, Leona S.
  Multiple regression: Testing and interpreting interactions / Leona
  S. Aiken and Stephen G. West.
    p. cm.
  Includes bibliographical references and index.
  ISBN 0-8039-3605-2. ISBN 0-7619-0712-2 (pbk.)
  1. Regression analysis. I. West, Stephen G. II. Title.
QA278.2.A34 1991
519.5'36—dc20                                                    91-2062
Contents
Preface

1. Introduction
   Summary

Interpreting the Regression Coefficients
   The Interaction Term XZ
   The First Order Terms X and Z
   A Geometric Interpretation
Standardized Solutions with Multiplicative Terms
   Appropriate Standardized Solution with Interaction Terms
   Simple Slope Analysis from the Standardized Solution
   Relationship Between Raw and "Standardized" Solution
   Summary
6. Model and Effect Testing with Higher Order Terms
   Some Issues in Testing Lower Order Effects in Models
      Containing Higher Order Terms
      Question 1: Interpretation of Lower Order Terms When b3 Is Significant
      Question 2: Should Lower Order Coefficients Be Tested in Reduced
         Models When b3 Is Nonsignificant?
   Exploring Regression Equations Containing Higher Order Terms
      with Global Tests
      Some Global Tests of Models with Higher Order Terms
   Structuring Regression Equations with Higher Order Terms
   Sequential Model Revision of Regression Equations Containing
      Higher Order Terms: Exploratory Tests
      Application of Sequential Testing Following a Global Test
      General Application of Sequential Testing
      Present Approach Versus That Recommended by Cohen (1978)
   Variable Selection Algorithms
   Summary
9. Conclusion: Some Contrasts Between ANOVA and MR in Practice

Appendix A: Mathematical Underpinnings

References
Preface
thank Susan Maxwell for her many insights about interactions. Finally, we are very appreciative of the encouragement, guidance, flexibility, and patience of our Sage Editor, Deborah Laughton, and the painstaking efforts of production editor Susan McElroy in preparing the book for publication.

The clerical and editorial efforts of Jane Hawthorne and Kathy Sidlik during various stages of the project are gratefully acknowledged. Andrea Fenaughty provided greatly appreciated assistance with the indexing and referencing. Support for graduate assistant Ray Reno, as well as Jane Hawthorne and Kathy Sidlik, was provided by the College of Liberal Arts and Sciences, Arizona State University. The efforts of Steve West were in part supported by National Institute of Mental Health grant P50MH39246.

L. S. A. and S. G. W.
Tempe, January 1991
1 Introduction
Ŷ = b1X + b2Z + b0    (1.1)
Over two decades have passed since the initial proposals of MR as a general data analytic strategy in the social sciences by Blalock (1965) and Cohen (1968). Other works have also periodically echoed Cohen's message (e.g., Cohen & Cohen, 1975, 1983; Darlington, 1990; Kenny, 1985; Neter, Wasserman, & Kutner, 1989; Pedhazur, 1982). Why have researchers been so slow to utilize these techniques in the analysis of studies involving two or more predictor variables? We believe this underutilization of MR approaches stems in large part from several impediments that arise when researchers actually attempt to utilize the general procedures that have been outlined and to interpret their results. The purpose of this book is to provide a detailed explanation of the procedures through which regression equations containing interactions and higher order nonlinear terms may be structured, tested, and interpreted. In pursuing this aim, we present an integration of recent work in the psychological, sociological, and statistical literatures that clarifies a number of earlier points of confusion in the literature that have also served as impediments to the use of the MR approach.
Chapter 2 addresses a number of issues involved in the interpretation of interactions between two continuous variables in MR. One major impediment to the use of MR has been that procedures for displaying and probing significant interactions have not been readily available. That is, once an interaction was found to be significant, exactly what should one do next? In this chapter, we present the graphical approaches to examining the interaction in equation 1.2 originally developed by Cohen and Cohen (1975, 1983), present analyses to answer questions about the form (ordinal versus disordinal) of the interaction, and derive procedures for post hoc statistical probing of interactions between continuous variables that closely parallel simple effects testing within the ANOVA framework.
Chapter 3 addresses another impediment to the use of MR with interaction terms: the lack of invariance of the MR results under simple linear transformations of the data. To understand this problem, suppose a researcher analyzes a data set using equation 1.2, which contains first order terms for X and Z and the linear interaction of X and Z. Two analyses are conducted: First, the data are analyzed in raw score form; second, the data are analyzed with X and Z centered (in deviation score form), with the interaction term created from the deviation score forms of X and Z. The results of these two analyses will differ, perhaps dramatically. Only the b3 coefficient for the interaction remains
the same in both equations (see Cohen, 1978). Such shifts in the results of the analyses of regression equations containing interactions or higher order variables under transformation are disturbing. This problem has led to substantial discussion in the social science literature (see, e.g., Friedrich, 1982, in political science; Cohen, 1978, and Sockloff, 1976, in psychology; Althauser, 1971, and Arnold & Evans, 1979, in sociology). Confusion and conflicting recommendations have resulted (e.g., Schmidt, 1973, versus Sockloff, 1976). Chapter 3 clarifies the source of the problem of failure to maintain invariance. Procedures through which researchers may work with equations containing higher order terms and maintain unambiguous interpretations of their effects are highlighted. The interpretation of all regression coefficients in equations containing interactions is explained. Finally, a standardized solution for equations containing interactions is presented.
Chapters 4 and 5 examine problems in testing interactions in more complex regression equations. Most discussions of interactions in MR have focused exclusively on the simple model involving two predictors and their interaction represented by equation 1.2. Chapter 4 generalizes the procedures for the treatment of interactions to the three predictor case. Methods for graphically displaying the three-way interaction and for conducting post hoc tests that are useful in probing the nature of the interaction are discussed.

Chapter 5 considers several complications that arise in structuring, testing, and interpreting regression equations containing higher order terms to represent curvilinear (quadratic) effects and their complex interactions. The methods for graphically portraying and conducting post hoc tests of interactions developed in earlier chapters are generalized to a variety of complex regression equations.
Chapters 5 and 6 also address another impediment to the use of complex MR models for researchers trained in the ANOVA tradition. In equation 1.2, where the XZ term with one degree of freedom (df) fully represents the interaction between X and Z, generalizing from ANOVA to regression is relatively easy. This generalization is less straightforward when interactions have more than one degree of freedom. In ANOVA there is always one source of variation for the whole interaction and one omnibus test for significance even when an interaction has several degrees of freedom. However, in more complex MR equations, a series of terms, each with one degree of freedom, may represent the interaction. For example, terms representing the linear X by linear Z component (XZ) and the curvilinear (quadratic) X by linear Z component (X²Z) of the
interaction between X and Z may need to be built into the regression equation. In such cases, the generalization from one source of variation and one omnibus test for the whole interaction in ANOVA to the MR framework is not generally familiar to researchers. Similar problems exist for testing multiple df "main effects" in MR in the absence of an interaction. Chapter 5 provides guidelines for the structuring and interpreting of these more complex regression equations.
Chapter 6 further extends the consideration of this issue by developing a variety of procedures for model and effect testing in complex regression equations. Strategies for testing and interpreting lower order effects in MR are developed. Global tests of a variety of hypotheses (e.g., the overall linearity of regression) that are based on sets of terms in the equation are discussed. Hierarchical effect-by-effect and term-by-term step-down testing strategies are presented for simplifying complex regression equations. Procedures are presented for identifying the scale-independent term(s) that can be legitimately tested at each step, yielding proper reduced equations that can be interpreted.
Chapter 7 generalizes our treatment of interactions to cases involving combinations of categorical and continuous predictor variables. Issues arising in the representation of categorical variables and the interpretation of the regression coefficients are discussed. Post hoc tests of the interaction are presented that examine differences between regression equations for the groups defined by the categorical variable.
Chapter 8 addresses the problem of measurement error in the predictor variables and its effect on interactions. Several methods of correcting for measurement error are presented, and their performance evaluated. The dramatic effect of measurement error on statistical power (the ability of statistical tests to detect true interactions) is shown.
Finally, Chapter 9 briefly contrasts the ANOVA and MR approaches as they have been used in practice. ANOVA was classically applied to experimental designs, whereas multiple regression was applied to measured variables. We explore some of the areas in ANOVA and MR in which lingering traditions have led to divergent practices: model specification, functional form, and examination of the tenability of assumptions.

We hope that this book will introduce investigators in a variety of social science disciplines to the major issues involved in the design and analysis of research involving one or more continuous predictor variables. We hope the book will provide an increased understanding of interactions in multiple regression and will help remove the impediments to the use of MR as a general data analytic strategy.
Notes
1. The ANOVA model is appropriate when (a) the levels of the predictor variable are discrete rather than continuous or (b) the relationship between the predictor and outcome variables is a step function rather than linear or curvilinear (Kenny, 1985). As Cohen and Cohen (1983; see also Chapter 7 of the present book) have shown, these cases can also be equally well represented using the MR approach.

2. When beginning this project, we reviewed the 1984 volumes of four major journals of the American Psychological Association: Journal of Abnormal Psychology, Developmental Psychology, Journal of Consulting and Clinical Psychology, and Journal of Personality and Social Psychology. These journals were selected because they are leading journals in psychology and they most frequently publish articles involving analyses of multiple continuous predictor (independent) variables. Our estimate that 23% of the articles involving two (or more) continuous variables used MR with interactions may be considered to reflect the quality of the best practice rather than average level of practice in these areas of psychology in 1984.
2 Interactions Between Continuous
Predictors in Multiple Regression
was added to the predicted scores to create observed criterion scores. Finally, the regression equation Ŷ = b1X + b2Z + b3XZ + b0 was estimated based on the three original predictors and the observed Y scores.

The results of the regression analysis and the post hoc probing of the XZ interaction that we will be discussing in this chapter are presented in Tables 2.1, 2.2, and 2.3 and Figure 2.1. Note that in this chapter we will only discuss the portions of the tables that pertain to "centered data"; the portions of the tables labeled "uncentered data" are discussed in Chapter 3. Of most importance at present, we see in Table 2.1c(ii) the regression equation containing the interaction:

Ŷ = 1.14X + 3.58Z + 2.58XZ + 2.54
In this equation, both the Z effect and the XZ interaction are significant. From the overall equation it appears that there is no overall effect of length of time in position (X) on self-assurance (Y), that there is a positive effect of managerial ability on self-assurance (Z), but that the relationship of time to self-assurance is modified by managerial ability (XZ). Also of interest, Table 2.1a presents the means and standard deviations for X and Z and the correlation matrix for X, Z, and XZ for the centered data.
Table 2.1
Centered versus Uncentered Regression Analyses Containing an Interaction

a. Correlation matrices

   Centered data                         Uncentered data
          X     Z    XZ     Y                   X'    Z'   X'Z'    Y
   X      -   .42   .10   .17            X'     -   .42   .81    .17
   Z            -   .04   .31            Z'           -   .86    .31
   XZ                 -   .21            X'Z'               -    .21
Table 2.1c(i) shows the regression equation when the interaction term is omitted.¹
clinical diagnostic test, a specific score represented a cutoff above which pathology were indicated, then that cutoff score, a higher score typical of the clinical condition, and a lower score typical of a normal population might be chosen. In a study involving income, the federal government's value of the poverty line for a family of four might be chosen. In other cases, such as our fictitious example of managerial ability, no social science based rationale will exist to guide the choice of several values of Z. In such cases, Cohen and Cohen (1983) have suggested as a guideline that researchers use the values ZM, ZH, and ZL, corresponding to the mean of Z, one standard deviation above the mean of Z, and one standard deviation below the mean of Z, respectively. Whatever values of Z are chosen, each is substituted into equation 2.2 to generate a series of simple regression equations of Ŷ on X at specific values of Z. These equations are plotted to display the interaction.
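For readers who wish to automate this step, the following is a minimal sketch in Python (our own illustration, not from the text: the simulated data, the variable names, and the use of the statsmodels library are all assumptions). It fits equation 2.1 with centered predictors and then substitutes ZL, ZM, and ZH into equation 2.2 to produce the simple regression equations.

```python
# Sketch: simple regression equations of Y on X at Z_L, Z_M, Z_H.
# The data are simulated here purely for illustration.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=n)
Z = rng.normal(size=n)
Y = 1.0 * X + 3.5 * Z + 2.5 * X * Z + rng.normal(size=n)

# Center the predictors, then form the crossproduct from centered scores
df = pd.DataFrame({"X": X - X.mean(), "Z": Z - Z.mean(), "Y": Y})
df["XZ"] = df["X"] * df["Z"]

fit = sm.OLS(df["Y"], sm.add_constant(df[["X", "Z", "XZ"]])).fit()
b0, b1, b2, b3 = fit.params[["const", "X", "Z", "XZ"]]

# Substitute each chosen value of Z into equation 2.2
for label, z in [("Z_L", -df["Z"].std()), ("Z_M", 0.0), ("Z_H", df["Z"].std())]:
    print(f"at {label}: Y-hat = {b1 + b3 * z:.2f} X + {b0 + b2 * z:.2f}")
```

The three printed lines are exactly the simple regression equations that would be plotted to display the interaction.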
Numerical Example
Ŷ = 6.82X + 10.41.
Table 2.2
Simple Regression Equations for Centered and Uncentered Data
ZL, and essentially no relationship between X and Y for ZM. Figure 2.1a reveals a complex pattern of regression of Y on X depending on the level of Z. If only the nonsignificant b1 coefficient in Table 2.1c(ii) had been examined, it would have been concluded that there was no relationship of X to Y.
The values s11 and s33 are the variances of b1 and b3, respectively, taken from Sb, the sample estimate of the variance-covariance matrix of the regression coefficients; s13 is the covariance between b1 and b3 taken from Sb. As the value of Z varies in the simple slope, the value of the standard error in equation 2.4 varies as well. Note that equation 2.4 pertains only to the simple slope (b1 + b3Z) in equation 2.2.

t-Tests for Simple Slopes. The t-test for whether a simple slope differs from zero is simply the value of the simple slope divided by its standard error with (n - k - 1) degrees of freedom, where n is the number of cases and k is the number of predictors, not including the regression constant (here k = 3).
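A sketch of this test in code, continuing the illustrative fit above (extracting Sb via cov_params() is an assumption about the reader's software, not the book's SPSS-X output):

```python
# Standard error of the simple slope (b1 + b3*Z) via equation 2.4,
# and the t-test with (n - k - 1) degrees of freedom.
from scipy import stats

Sb = fit.cov_params()          # sample covariance matrix of the coefficients
s11 = Sb.loc["X", "X"]
s13 = Sb.loc["X", "XZ"]
s33 = Sb.loc["XZ", "XZ"]

z_h = df["Z"].std()            # probe the simple slope at Z_H
simple_slope = b1 + b3 * z_h
se = np.sqrt(s11 + 2 * z_h * s13 + z_h**2 * s33)
t = simple_slope / se
dof = len(df) - 3 - 1          # n - k - 1, with k = 3 predictors
p = 2 * stats.t.sf(abs(t), dof)
print(f"slope = {simple_slope:.3f}, SE = {se:.3f}, t({dof}) = {t:.2f}, p = {p:.4f}")
```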
The computation involves three elements from Sb: s11 = 2.35, s13 = -0.08, and s33 = 0.40. Substituting ZH = 2.20 into equation 2.4 yields the standard error of the simple slope of Ŷ on X at ZH.
Table 2.3
Computation of Standard Errors and t-Tests for Simple Slopes
provided. These tests confirm the positive regression of Ŷ on X at ZH and the negative regression of Ŷ on X at ZL; the regression of Ŷ on X at ZM does not differ significantly from zero. For the regression of Ŷ on Z at values of X, the standard error of the simple slope (b2 + b3X) is

sb = [s22 + 2Xs23 + X²s33]^1/2    (2.5)

Once again, the t-test for whether a simple slope differs from zero is simply the value of the simple slope divided by its standard error with (n - k - 1) degrees of freedom.
The regression constant (intercept) from that analysis will be that for the simple regression equation. The standard error of b1 will be the standard error of the simple slope of Ŷ on X at CVZ, and the t-test will be that for the simple slope.

Similarly, if the simple slope of Ŷ on X one standard deviation below the mean of centered Z is sought (here CVZ = -2.20), then once again a new variable ZCV = Z - CVZ is calculated, here ZCV = Z - (-2.20), and the regression of Y on X, ZCV, and (X)(ZCV) is performed. The resulting b1 term, its standard error, and t-test form the simple slope analysis one standard deviation below the mean of Z, at CVZ = -2.20.
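The same re-centering trick is easy to reproduce in any regression program. A hedged sketch, continuing the example above (the statsmodels call stands in for the SPSS-X run the book describes):

```python
# Re-center Z at the conditional value CV_Z; the b1 coefficient, its
# standard error, and its t-test from this run ARE the simple slope analysis.
CV_Z = df["Z"].std()                 # e.g., one SD above the mean of centered Z
df["Zcv"] = df["Z"] - CV_Z
df["XZcv"] = df["X"] * df["Zcv"]

fit_cv = sm.OLS(df["Y"], sm.add_constant(df[["X", "Zcv", "XZcv"]])).fit()
print(fit_cv.params["X"],            # simple slope of Y on X at CV_Z
      fit_cv.bse["X"],               # its standard error (matches equation 2.4)
      fit_cv.tvalues["X"])           # its t-test
```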
Table 2.4 provides SPSS-X computer output for the simple slope analysis reported in Table 2.3a. First, the overall regression analysis with centered X, Z, and XZ is given, replicating Table 2.1c(ii), and two transformed variables, ZCV = Z - 2.20 and ZCV = Z - (-2.20), are calculated.
Note that this is identical to the t-test for the significance of the b1 coefficient in the overall analysis. In other words, given that Z is a continuous variable, the significance of the b1 coefficient in the overall analysis in
Table 2.4
Computation of Simple Slope Analysis by Computer for the XZ Interaction in
the Regression Equation Ŷ = b1X + b2Z + b3XZ + b0

         X          Z         XZ
X     2.34525   -0.41324   -0.08489
Z    -0.41469    0.42938   -0.00453
XZ   -0.08211   -0.00187    0.39895
crossing point falls outside the actual range of values on X are classified as being ordinal, whereas those whose crossing point falls inside the actual range of values on X are classified as being disordinal. An alternative approach in the absence of scale-based or data-based criteria is for the researcher to consider a meaningful range of the variable in terms of the system being characterized by the regression equation; this meaningful range has been referred to as the dynamic range of the variable in the context of sensory systems such as vision or audition (see Teghtsoonian, 1971). Those interactions for which the lines crossed within the meaningful range of the variable would be termed disordinal, whereas other interactions whose crossing point fell outside this range would be termed ordinal.
The reader should bear in mind that the classification of an interaction as ordinal versus disordinal is always with regard to a particular configuration of variables. An interaction may be ordinal in one direction, say the regression of Y on X at values of Z, and disordinal in the other direction (Y on Z at values of X). The question of whether to characterize an interaction in terms of Y and X at values of Z or in terms of Y and Z at values of X may be driven by theory and the specific predictions to be tested. For example, most theoretical discussions present life stress (X) as the predictor of health (Y), with social support (Z) being described as the variable that moderates this relationship; hence the regression of Y on X at values of Z is considered. However, in general, it is useful to examine both the regression of Y on X at levels of Z, and Y on Z at levels of X. Both castings provide potentially informative and complementary two-dimensional representations of what is in reality a three-dimensional regression surface.
The point at which two simple regression lines cross can be determined algebraically. For the regression of Y on X at values of Z, the simple regression equation is written at two specific values of Z, say ZH and ZL, yielding two simple regression equations:

ŶH = (b1 + b3ZH)X + (b0 + b2ZH)

ŶL = (b1 + b3ZL)X + (b0 + b2ZL)

The two equations are set equal to one another to determine the expression for the point at which the lines represented by these equations cross.
Here,
Xcross = -b2/b3    (2.6)
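A quick numeric check of equation 2.6, continuing the illustrative fit from earlier in this chapter (our own sketch, not the book's example):

```python
# The lines Y_H and Y_L cross where their predicted values are equal;
# solving gives X_cross = -b2 / b3, whatever Z_H and Z_L are chosen.
ZH, ZL = df["Z"].std(), -df["Z"].std()
slope_H, int_H = b1 + b3 * ZH, b0 + b2 * ZH
slope_L, int_L = b1 + b3 * ZL, b0 + b2 * ZL

x_cross = (int_L - int_H) / (slope_H - slope_L)
assert np.isclose(x_cross, -b2 / b3)
print(f"simple regression lines cross at X = {x_cross:.3f}")
```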
Numerical Example
that is, any regression equation that is linear in the regression coefficients (Kmenta, 1986). The approach pertains to all of the more complex regression equations in this book. Readers familiar with matrix algebra should find the exposition straightforward. For readers unfamiliar with matrix algebra, the central form of the expression for the variance of a simple slope (square of the standard error) is given in equation 2.10 below.

The starting point for deriving the standard errors of simple slopes is the observation that each simple slope is a linear combination of the original regression coefficients in the overall regression equation. In the equation Ŷ = (b1 + b3Z)X + (b0 + b2Z), the simple slope for the regression of Y on X is
(b1 + b3Z). Using the known properties of linear combinations, we can derive the sampling variance of the simple slope (b1 + b3Z).

Consider any linear combination U of variables b1 ... bp, weighted by w1 ... wp, respectively. In vector equation form, this may be expressed as U = w'b, or equivalently in algebraic form

U = w1b1 + w2b2 + ... + wpbp
Here the regression coefficients b' = [b1 b2 ... bp] are the elements of the combination and w' = [w1 w2 ... wp] are the weights that define the combination. The variance σ²U of the combination is a function of Σb, the variance-covariance matrix of the elements b1 ... bp, and of the weights themselves, as given by the quadratic form

σ²U = w'Σbw    (2.7)

The sample estimate is

s²b = w'Sbw    (2.8)
For the simple slope (b1 + b3Z), this yields

s²b = s11 + 2Zs13 + Z²s33    (2.9)

In general, for a linear combination of k regression coefficients,

s²b = Σi wi²sii + Σi Σj≠i wiwjsij    (2.10)

where wi and wj are the weights and sii and sij are the variances and covariances of the regression coefficients.
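In code, equations 2.8 through 2.10 collapse into a single quadratic form. A minimal sketch (the function name is ours, and the example reuses the fit from earlier in the chapter):

```python
# Variance of any linear combination w'b of regression coefficients:
# s²(w'b) = w' Sb w  (equations 2.8 and 2.10 in one expression).
def variance_of_combination(weights, Sb):
    w = np.asarray(weights, dtype=float)
    return float(w @ np.asarray(Sb) @ w)

# For the simple slope (b1 + b3*Z) the weights are [0, 1, 0, Z] when the
# coefficients are ordered (const, b1, b2, b3); this reproduces equation 2.9.
z_h = df["Z"].std()
var = variance_of_combination([0.0, 1.0, 0.0, z_h], fit.cov_params())
print(np.sqrt(var))   # same standard error as equation 2.4
```

The same function handles every simple slope in this book; only the weight vector changes.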
Summary
This chapter addressed the probing of significant interactions between two continuous variables X and Z in the regression equation Ŷ = b1X + b2Z + b3XZ + b0. First, the regression equation was rearranged to show the regression of the criterion on X at values of Z; the simple slope of that regression equation was defined. Post hoc probing of the interaction began with prescriptions for plotting the interaction. A t-test for the significance of the simple slopes was presented together with a simple, computer-based method for performing this test. The distinction between ordinal and disordinal (noncrossover versus crossover) interactions was presented for the interaction between two continuous variables, and the procedure for determining the crossing point of simple regression lines was illustrated. Finally, a more advanced optional section presented a general derivation of the standard errors of simple slopes in any OLS regression equation.
Notes
1. In practice there will typically be little difference between the b1 and b2 coefficients in the regression equations containing the interaction [e.g., Table 2.1c(ii)] and those coefficients in the regression equation not containing the interaction [e.g., Table 2.1c(i)] if predictors X and Z are centered and have an approximately bivariate normal distribution.

2. In much of the discussion we refer to the regression of Y on X at values of Z. The interaction may just as well be cast in terms of the regression of Y on Z at values of X. Our exposition is often confined to Y on X at values of Z for simplicity.
3. Conceptually, the population variance-covariance matrix of the regression coefficients, Σb, can be understood as follows. Imagine computing equation 2.1 for an infinite number of random samples from a given population. The variance of each regression coefficient (e.g., b1) across all the samples would be on the main diagonal of Σb. The covariance between pairs of regression coefficients (e.g., b1 with b2) across all samples would be the off-diagonal entries.
4. In SPSS-X the covariances among the estimates (s12, s13, s23) and the correlations among the estimates are printed in the same matrix, with the variances of the estimates (s11, s22, s33) on the main diagonal, covariances below the diagonal, and correlations above the diagonal. This matrix is obtained in SPSS-X REGRESSION with the BCOV keyword on the STATISTICS subcommand. To form the covariance matrix Sb in Table 2.3a(i), we have placed the covariances both below and above the diagonal; SPSS users should be certain to do so as well.

The covariance matrix of the estimates is obtained in SAS from PROC REG with the keyword COVB on the MODEL statement. The Sb matrix obtained in SAS contains variances and covariances, just as in Table 2.3a(i). No modification of the matrix is required.
3 The Effects of Predictor Scaling on
Coefficients of Regression Equations
1. In the case of linear regression with no higher order terms, that is, b3 = 0 in equation 3.1, rescaling by additive constants has no effect on the value of the regression coefficients.

2. In regression equations containing at least one higher order term, rescaling by additive constants leads to changes in all regression coefficients except for the highest order term.

3. Simple slopes of simple regression equations are unaffected by additive transformations.

4. Under additive scale transformation the interpretation of the interaction as ordinal versus disordinal remains unchanged. The important message of this exposition is that our prescriptions for plotting and post hoc probing of interactions between continuous variables do not suffer from the problem of lack of invariance, even though coefficients in the overall regression equation do.
and define two new variables X' = X + c and Z' = Z + f, where c and f are additive constants. (Note that if c and f are the arithmetic means of X and Z, respectively, and X and Z represent centered variables, then X' and Z' represent the uncentered forms of these same variables.) We rewrite the original centered variables as X = X' - c and Z = Z' - f and substitute these values into the simple regression equation, yielding

Ŷ = b1X' + b2Z' + (b0 - b1c - b2f)

Here, the coefficients b1 and b2 for the uncentered first order (X' and Z') terms are identical with those in equation 3.3 based on centered X and Z. Only the regression intercept (b0 - b1c - b2f) is changed from its original value.
Note that the original b1 coefficient of equation 3.1 becomes b1' = (b1 - b3f), the original b2 coefficient of equation 3.1 becomes b2' = (b2 - b3c), and the regression constant b0 becomes (b0 - b1c - b2f + b3cf). Only the interaction coefficient does not change: b3 thus retains its original value and interpretation.¹ The change in first order coefficients produced by linearly rescaling X and Z occurs when there is a nonzero interaction between these variables. The covariances between the interaction term XZ and each component (X and Z) depend in part upon the means of the individual predictors. Rescaling changes the means, thus changes predictor covariances, resulting in changes in b1 and b2 for the predictors
contained in the higher order function. This is true even if the individual predictors, X and Z, are uncorrelated with one another.² (The interested reader is referred to Appendix A, which provides algebraic expressions for the mean and variance of crossproduct terms XZ in terms of the means, variances, and covariances of its components X and Z. The covariance between a crossproduct term and its components is also explored.)
The simple slopes that are calculated from the interaction also remain constant under additive scale transformation. To see this algebraically, recall from equation 3.2 that (b1 + b3Z) is the general form for the simple slopes of Ŷ on X at values of Z from equation 3.1. Let us once again use the expressions X = X' - c and Z = Z' - f, and substitute them into equation 3.2 for the regression of Ŷ on X at levels of Z:

Ŷ = (b1 + b3Z)X' + [(b0 + b2Z) - (b1 + b3Z)c]    (3.8)

Note that the simple regression coefficient (b1 + b3Z) for the regression of Ŷ on X' at values of Z does not change from equation 3.2 to 3.8; rescaling predictors changes the regression constants but not the regression coefficients of the simple regression equations.
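These invariance claims are easy to verify numerically. A sketch continuing the simulated example of Chapter 2 (the constants c and f are chosen arbitrarily):

```python
# Rescale by additive constants, refit, and compare: b3 and the simple
# slopes are invariant, while b1 shifts by -b3*f as derived above.
c, f = 5.0, 10.0
un = pd.DataFrame({"X": df["X"] + c, "Z": df["Z"] + f})
un["XZ"] = un["X"] * un["Z"]
fit_un = sm.OLS(df["Y"], sm.add_constant(un[["X", "Z", "XZ"]])).fit()

b3_c, b3_u = fit.params["XZ"], fit_un.params["XZ"]
assert np.isclose(b3_c, b3_u)                                       # b3 unchanged
assert np.isclose(fit_un.params["X"], fit.params["X"] - b3_c * f)   # b1' = b1 - b3*f

# The same point on both scales: centered Z_H versus uncentered Z'_H = Z_H + f
zh = df["Z"].std()
slope_centered = fit.params["X"] + b3_c * zh
slope_uncentered = fit_un.params["X"] + b3_u * (zh + f)
assert np.isclose(slope_centered, slope_uncentered)                 # simple slope unchanged
```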
and f for X and Z, respectively. To show this, we use the expression b2' = (b2 - b3c) from equation 3.5 and recall that, for uncentered versus centered equation 3.1, b3' = b3. Substituting these expressions into equation 2.6, we find that the regression lines cross at the value

X'cross = -b2'/b3' = -(b2 - b3c)/b3 = -b2/b3 + c
Numerical Example
Centered Versus Uncentered Data
Our examination of Tables 2.1, 2.2, and 2.3 in the previous chapter was focused solely on those portions of the tables that report the results of analyses using centered variables X and Z. Also contained in these tables are the results of parallel analyses in which the variables have been transformed to their uncentered forms as follows: X' = X + 5 and Z' = Z + 10. This example permits the direct comparison of the regression analysis, plots, and post hoc probing based on centered versus uncentered data. It also introduces some of the desirable properties of centered solutions.
Correlations
Note in Table 2.1a that the correlations between the centered terms X and XZ and between Z and XZ are low, .10 and .04, respectively. However, in the uncentered case, large correlations are introduced between X' and X'Z' and between Z' and X'Z'. For example, the correlation between uncentered Z' and X'Z' is .86, instead of .04 for Z with XZ. This example illustrates how considerable multicollinearity can be introduced into a regression equation with an interaction when the variables are not centered (Marquardt, 1980). Very high levels of multicollinearity can lead to technical problems in estimating regression coefficients. Centering
CS CamScanner
1Effects of Predictor Scaling 33
i ■
variables will often help minimize these problems (Neter, Wasserman, & Kutner, 1989).
Regression Equations with no Higher Order Terms
The regression equations including first order terms only (no interaction) are given in Tables 2.1c(i) and 2.1d(i) for centered versus uncentered data. Note that the regression coefficients b1 = 1.67 and b2 = 3.59 are identical for the centered and uncentered equations. Only the regression constant reflects the change in scaling.
Simple Slopes
Equation 3.2 expresses the regression of Ŷ on X at particular values of Z. Using uncentered data, we compute the simple slope equations at the values Z'H = Z̄' + 1 standard deviation, Z'M = Z̄', and Z'L = Z̄' - 1 standard deviation. To calculate the simple slope equations in the uncentered case, we use the uncentered regression equation in Table 2.1d(ii) containing the interaction:

Ŷ = -24.68X' - 9.33Z' + 2.58X'Z' + 90.15
The standard errors of simple slopes and hence the t-tests are also invariant under additive transformation. The variance-covariance matrix of the uncentered regression coefficients, Sb', is given in Table 2.3b. This matrix was obtained from the regression program of the SPSS-X package applied to the uncentered data set. The square root of equation 2.4 for the variance of the simple slope, that is, sb = (s11 + 2Zs13 + Z²s33)^1/2, applies to both centered and uncentered data. For example, the estimate of the standard error of the simple slope of Ŷ on X' at Z'H = 12.20 is given as [43.88 + (2)(12.2)(-4.07) + 12.2²(0.40)]^1/2 = 1.98, where s11 = 43.88, s13 = -4.07, and s33 = 0.40 from the matrix Sb'. For centered ZH = 2.20, the standard error was found to be [2.35 + 2(2.2)(-0.08) + 2.2²(0.40)]^1/2 = 1.98. The simple slopes, standard errors, and t-tests based on centered versus uncentered data are identical, as is shown in Table 2.3.
Multicollinearity:
Essential Versus Nonessential Ill-Conditioning
If the first order variables X and Z are not centered, then product terms of the form XZ and power polynomial terms of the form X² are highly correlated with the variables of which they are comprised (the Pearson product moment correlation between X and X² can approach 1.0). When employed in regression analyses with lower order terms, the highest order term produces large standard errors for the regression coefficients of the lower order terms, though the standard error of the highest order term is unaffected. Cohen (1978) and Pedhazur (1982) acknowledge the computational problems that may arise from this multicollinearity.

The literature on regression with higher order terms contains many admonitions about the problems of multicollinearity. However, these problems are not the usual problems of multicollinearity in regression analysis in which two supposedly different predictors are very highly correlated. The multicollinearity in the context of regression with higher order terms is due to scaling, and can be greatly lessened by centering variables. The special cases we consider in this book (e.g., the relationship between X and X², or between X and Z and their product XZ) follow a general result. Uncentered X' and X'² will be highly correlated. But if instead we use centered predictor X and it is normally distributed, then the covariance between centered predictor X and X² is zero. Even if X is not normally distributed, the correlation between X and X² will be much lower than the correlation between X' and X'². Uncentered X' and Z' will both be highly correlated with their crossproduct X'Z'. But if X and Z are bivariate normal, then the covariance between each centered variable X and Z and the product term XZ is zero. When X and Z are centered, the only remaining correlation between first order and product terms or between first order and second order terms is that due to nonnormality of the variables. (We
again recommend Appendix A for the mathematical basis of these statements.)
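The scaling argument can be seen directly with a few lines of simulation (an illustration of the general result, not the book's data set):

```python
# Nonessential ill-conditioning: corr(X', X'^2) is near 1.0 when X' lies far
# from zero, and drops to (near) zero once X' is centered.
import numpy as np

rng = np.random.default_rng(1)
x_unc = rng.normal(loc=50.0, scale=10.0, size=100_000)   # uncentered X'
x_cen = x_unc - x_unc.mean()                             # centered X

print(np.corrcoef(x_unc, x_unc**2)[0, 1])   # ~0.998: the scaling artifact
print(np.corrcoef(x_cen, x_cen**2)[0, 1])   # ~0.0 for a symmetric X
```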
Marquardt (1980) refers to the problems of multicollinearity produced by noncentered variables as nonessential ill-conditioning, whereas those that exist because of actual relationships between variables in the population (e.g., between the age of a child and his/her developmental stage) are referred to as essential ill-conditioning. Nonessential ill-conditioning is eliminated by centering the predictors.

We recommend centering for computational reasons. Marquardt (1980), Smith and Sasaki (1979), and Tate (1984) provide clear discussions of approaches to reducing multicollinearity in interactive regression models (see also Lance, 1988).
overall centered regression equation is 2.35; the standard error is thus 2.35^1/2 = 1.53, and t = 1.14/1.53 = 0.74. These are precisely the same as the standard error and t value for the simple slope at ZM in Table 2.3. Thus there is a clear interpretation of each of the b coefficients in the centered regression equation.

In the uncentered equation the b1' coefficient does retain its interpretation as the regression of Ŷ on X' at Z' = 0. However, the value of zero is no longer at the center of the data, and, in fact, may not even exist on the scale of variable Z. Only in the special case in which uncentered variable Z' has a meaningful zero point is the regression coefficient for X' meaningful. Only when the uncentered X' variable has a meaningful 0 point is the b2' coefficient meaningful. Because the centered overall regression analysis provides regression coefficients for first order terms that may be informative, we recommend that the centered analysis be employed, echoing the recommendations of Finney et al. (1984) and Marquardt (1980). This also conforms to the familiar model used in ANOVA: Each main effect is estimated when the values of all other factors are equal to their respective means.
Interpretation in the Presence of an Interaction.
The b1 and b2 coefficients in equation 3.1 do not represent "main effects" as this term is usually used. Main effects are most typically defined as the constant effect of one variable across all values of another variable (Cramer & Appelbaum, 1980). Less commonly they are defined as the average effect of one variable across the range of other variables (Finney et al., 1984). Darlington (1990) defines average effect as the average of the simple slopes computed across all cases. In other words, if one were to substitute observed predictor scores for each case into the simple slope expression (b1 + b3Z) for the regression of Y on X, calculate for each case a value of the simple slope, and then average these across all cases, this would be considered the average effect of Y on X. For a centered regression equation, the average effect of Y on X is b1. Put another way, if one calculated a simple slope of Y on X at every value of Z, weighted each such slope by the number of cases with that value of Z, and took the weighted average of the simple slopes, the result would be the average simple slope, equal to b1 in the centered regression equation.

The b1 and b2 coefficients never represent constant effects of the predictors in the presence of an interaction. The b1 and b2 coefficients from centered equations always represent the effects of the predictors at the mean of the other predictors. In the centered equation, they may also be considered as the weighted average effect of each predictor.
A Geometric Interpretation
A geometric representation of the b1 and b2 conditional effects and the b3 interaction (Cleary & Kessler, 1982) provides insight into their meaning. The b2 coefficient for Z indicates how the predicted score Ŷ changes as a function of Z at X = 0. In Figure 2.1a, the value X = 0.0 is shown in the center of the X axis. Reading up from the X axis at X = 0.0 to the simple regression line at ZL and then across from the simple regression line to the Y axis yields a predicted score Ŷ at X = 0.0 and Z = ZL. The predicted scores at X = 0.0 are -5.33, 2.54, and 10.41 for ZL, ZM, and ZH, respectively.
To this point we have come to expect that changing the scale of predictor variables by additive constants, as in centering, will have no effect on the unstandardized regression coefficient for the interaction term. But what of the standardized regression coefficients (betas) for the interaction term associated with centered versus uncentered scores? Table 3.1 compares the unstandardized regression coefficients for the uncentered and centered solutions of the numerical examples (presented as Cases 1a and 2a in Table 3.1). Standardized regression coefficients associated with each of these analyses are also presented (Cases 1b and 2b of Table 3.1, respectively). The comparison shows one reassuring and two disconcerting findings.

1. The t-tests of the standardized regression coefficients for the interaction term in the centered versus noncentered analyses are identical (t = 4.087).

2. The standardized regression coefficients for the interaction term in the centered versus uncentered analyses differ substantially (1.61 versus 0.19).
Figure 3.1. Interaction Plotted from Uncentered Regression Equation: Ŷ = -24.68X' - 9.33Z' + 2.58X'Z' + 90.15. The three simple regression lines are Ŷ = 6.82X' - 23.6 (at Z'H), Ŷ = 1.14X' - 3.15 (at Z'M), and Ŷ = -4.54X' + 17.36 (at Z'L).
3. The simple slopes generated from the two standardized solutions are also substantially different. To illustrate, let us compute the simple slope for the regression of Ŷ on X, that is, (b1 + b3Z), at ZH, one standard deviation above the mean of Z (i.e., when Z = 1 for the standardized case). For solution 1b of Table 3.1, the simple slope is -0.832 + (1.613)(1) = 0.781; for solution 2b it is 0.038 + (0.192)(1) = 0.230.
Table 3.1
Raw and Standardized Solutions Based on Centered versus Uncentered Data

                               b1         b2         b3                   t-test   Simple slope at ZH
Analysis                    (for X)    (for Z)    (for XZ)      b0        for b3      (b1 + b3ZH)

1a. Raw uncentered Y,
    X, and Z               -24.67759   -9.32984    2.58141    90.15337    4.087
1b. Standardized solution
    associated with 1a
    (i.e., raw uncentered
    Y, X, and Z)            -0.83210   -0.73258    1.61337                4.087       0.78127*
2a. Raw centered X and Z     1.13648    3.57720    2.58141     2.53743a   4.087
2b. Standardized solution
    associated with 2a
    (i.e., raw centered
    X and Z)                 0.03832    0.28088    0.19218                4.087       0.23050*
3a. Standardized zY, zX,
    zZ, and (zXzZ) as
    predictors, raw
    analysis                 0.03832    0.28088    0.19150    -0.07930    4.087       0.22982
3b. Standardized solution
    associated with 3a       0.03832    0.28088    0.19218       —        4.087       0.23050*

a. Criterion Y is uncentered, in order to provide predicted scores in the original scale of the criterion.
*These outcomes are the result of inappropriate factorization and should not be employed; see text.
ẑY = b1*zX + b2*zZ + b3*zXzZ + b0*

ẑY = (b1* + b3*zZ)zX + b2*zZ + b0*
the simple slopes of zX at high (+1), moderate (0), and low (-1) levels of zZ. The standard errors and t-tests can then be performed by substituting the appropriate values in the formulas presented in Chapter 2 (equation 2.4 and following text). These results are presented in Table 3.2; Table 2.3a presents the results of the same analysis performed on the raw centered data.

There are simple algebraic relationships between the centered raw score analysis with centered X, Z, and their crossproduct XZ as predictors and the "standardized" analysis using the Friedrich procedure with zX, zZ, and zXzZ as predictors. These relationships are presented below and in Table 3.2 for the values related to tests of simple slopes.
1. The regression coefficients are related as follows:

bi* = bi(si/sY)

where bi* is the standardized regression coefficient associated with predictor i, bi is the unstandardized regression coefficient associated with predictor i, si is the standard deviation of predictor i, and sY is the standard deviation of the criterion Y. For example, from Table 3.1(2a), b3 = 2.58141; from Table 3.1(3a), b3* = 0.19150, sY = 28.01881, and sXZ = 2.08592; or b3* = 2.58141(2.08592/28.01881). This relationship holds for all the regression coefficients in the equation. It is the usual relationship that is obtained for any linear regression analysis involving only first order terms (see, e.g., Cohen & Cohen, 1983).
2. The regression constants (intercepts) in the two analyses have the following relationship:

b0* = (b0 - Ȳ)/sY

3. The elements of the variance-covariance matrix of the regression coefficients, Sb, in the two analyses are related as follows:

standardized element_ij = unstandardized element_ij (si sj / s²Y)
where i and j refer to any two predictors, si and sj are their respective standard deviations, and s²Y is the variance of the criterion. This relationship may be verified numerically by comparing the values for the standardized solution in Table 3.2(a) to those for the unstandardized solution in Table 2.3(a).
Table 3.2
Simple Slope Analysis Based on Predictors zX, zZ, and zXzZ (for Comparison
with Raw-Centered Simple Slope Analysis in Table 2.3a)
4. From (3), it follows that the standard errors of the regression coefficients are related as follows:

s*bi = sbi(si/sY)

where s*bi and sbi are the standard errors of the standardized and raw regression coefficient for predictor i, respectively, and si is the standard deviation of predictor i.
5. From (3), it also follows that the matrices of correlations among the predictors are identical for the unstandardized and standardized solutions.

6. From (1) and (4), it follows that the t-test values and p values for tests of the regression coefficients in the two analyses are identical.

7. Finally, Table 3.2 compares the simple slopes, the standard errors of the simple slopes, and the t-tests for simple slopes for the unstandardized and standardized solutions. As can be seen, both the simple slope and standard error of the simple slope are related to their unstandardized counterparts by identical algebraic expressions. The t-tests for the simple slopes for the standardized and unstandardized centered cases are identical.
In summary, using the Friedrich (1982) approach to standardization preserves the usual relationships between raw and standardized solutions found for regression equations that involve only linear terms. These relationships are also preserved for the simple slope analyses. Thus, of the four possible standardized solutions presented in Table 3.1, only the Friedrich approach (3a) is algebraically appropriate and bears a straightforward algebraic relationship to the unstandardized centered analysis in (2a). Remember in this approach that the predictors are all z-scores or their products to begin with; they should not be further standardized. The use of this approach avoids potential computational difficulties and ambiguities of interpretation.
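The Friedrich procedure amounts to three steps, sketched below as a continuation of the simulated example from Chapter 2 (the z-scoring helper is ours):

```python
# Friedrich (1982) standardized solution: z-score Y, X, and Z first, form
# the product of z-scores, then run an ORDINARY (raw) regression. The
# product z_x * z_z is deliberately NOT re-standardized.
def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

std = pd.DataFrame({"zx": zscore(df["X"]), "zz": zscore(df["Z"])})
std["zxzz"] = std["zx"] * std["zz"]          # left as a product of z-scores

fit_std = sm.OLS(zscore(df["Y"]), sm.add_constant(std)).fit()
print(fit_std.params)                        # analog of row 3a in Table 3.1
print(fit_std.tvalues["zxzz"])               # same t as the raw centered analysis
```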
Summary
This chapter has addressed the issue of scale invariance when there are interactions between continuous variables in MR. The discrepancies in the regression coefficients obtained from centered and uncentered analyses vanish when our prescriptions for probing the interaction are followed. Centered and uncentered analyses lead to identical slopes of the
simple regression equations and identical tests of the highest order interaction. The interpretation of first order terms in regression equations containing interactions is considered; such first order terms represent conditional rather than constant effects of single predictors. Our consideration of the interpretation of these coefficients clarifies an advantage of centering variables before analysis, as did our consideration of multicollinearity between lower and higher order terms. Problems associated with standardized solutions of regression equations containing interactions are discussed and an appropriate standardized solution is presented.
Notes
1. The constancy of the unstandardized b3 coefficient across the centered and uncentered analyses does not hold for b3 in the standardized solutions based on centered versus uncentered data (see the final section of this chapter).

2. These comments concerning the interaction pertain to the XZ term in equation 3.1. In equations with higher order terms such as X² and X²Z, considered in Chapter 5, only the regression coefficient for the highest order term is invariant under linear scale transformation.

3. The simple slope for the regression of Y on Z at values of X from equation 3.1 is (b2 + b3X) = (3.58 + 2.58X) in the numerical example. For every one unit increase in X there will be a 2.58 unit increase in the slope of Y on Z. Hence there is a symmetry of the regressions of Y on X at values of Z and Y on Z at values of X.
4 Testing and Probing
Three-Way Interactions
In this equation, the test of the b7 coefficient indicates whether the three-way interaction is significant. The two-way interactions (e.g., XZ) now represent conditional interaction effects, evaluated when the third variable (e.g., W) equals 0. They are affected by the scale of the predictor just as are first order terms X and Z in the presence of the XZ interaction. With centered predictor variables, the two-way interactions are interpreted as conditional interaction effects at the mean of the variable not involved in the interaction (e.g., the conditional XZ interaction at the mean of W). First order effects (e.g., X) may also be interpreted as conditional effects (e.g., when W and Z = 0; see pp. 37-40). If the XZW interaction in equation 4.1 is significant, then this interaction should be probed to assist in its interpretation. If the highest order interaction in the regression equation is not significant, readers may wish to use the step-down procedures presented in Chapter 6.
Ŷ = (b1 + b4Z + b5W + b7ZW)X + (b2Z + b3W + b6ZW + b0)    (4.2)
Numerical Example
A simulation involving three multivariate normal predictors X, Z, and W is used to illustrate the probing of the three-way interaction. Table 4.1 provides the regression analysis and computation of simple slopes. Table 4.3a provides the means and standard deviations of predictors and criterion; note that the first order predictors X, Z, and W are centered but the
criterion and the crossproduct terms are not. Table 4.1a gives the overall regression analysis; the three predictor XZW interaction term is significant. In Table 4.1b, the overall regression equation is rearranged according to equation 4.2 to show the regression of Y on X at levels of Z and W. In Table 4.1c, four separate simple regression equations are generated, one at each of the four combinations of Z and W one standard deviation above and below their means, that is, at combinations of ZL and ZH with WL and WH. For example, for the simple regression equation at ZH and WH with centered Z and W, and sZ = 3.096, sW = 1.045, the values ZH = 3.096 and WH = 1.045 were substituted into the equation in Table 4.1b. This substitution is shown below:

+ (14.2831)(1.045) + (-1.7062)(3.096)(1.045) + 4.5710]

Ŷ = 4.521X + 22.881
The result is the simple regression equation for Ŷ on X at ZH, WH, given in Table 4.1c(i). The results of the remaining three substitutions are given in Table 4.1c as well.
Table 4.1
Three Predictor Regression Analysis

c. Simple Regression Equations at Values One Standard Deviation Above and Below the
Means of Z and W, where sZ = 3.096, sW = 1.045

(i) At ZH and WH: Ŷ = 4.521X + 22.881
(ii) At ZL and WH: Ŷ = -3.843X + 16.112
These two values are used to draw the regression line of Ŷ on X at ZH and WH.

The reader should note that the analyst is not confined to the format of Figure 4.1. One might alternatively display the regression of Y on X at levels of W within each graph, with each graph confined to one level of Z. Or, one might plot the regression of Y on Z at levels of W within each graph, with each graph confined to one level of X. Plotting the interaction in various ways can often be useful in the interpretation of higher order interactions. However, theory may provide guidance in the organization of the plot. For example, in research on the relationship between life stress and health, life stress is typically seen as the "primary" independent variable whose effects may be modified by other variables (e.g., social support; perceived control over one's own health). Researchers would typically depict life stress on the X axis (abscissa) of the plot to emphasize the central importance of this variable.
Tests of simple slopes follow the same general procedures as for testing the XZ interaction in Chapter 2. For the three-predictor interaction, the simple slope for the regression of Y on X is (b1 + b4Z + b5W + b7ZW). The simple slope may be tested for significance at any combination of values of the continuous variables Z and W that are chosen. For readers familiar with ANOVA, these tests are analogous to tests of "simple simple main effects" (Winer, 1971).

The general expression for the standard error of the simple slope for Ŷ on X at values of Z and W is as follows:²

sb = [s11 + Z²s44 + W²s55 + Z²W²s77 + 2Zs14 + 2Ws15 + 2ZWs17 + 2ZWs45 + 2WZ²s47 + 2W²Zs57]^1/2
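Rather than expanding the ten-term scalar expression by hand, the variance can be computed as the quadratic form w'Sbw with w' = [1 0 0 Z W 0 ZW] (see note 2 at the end of this chapter). A sketch, assuming a statsmodels fit whose design columns are ordered (const, X, Z, W, XZ, XW, ZW, XZW); that ordering is our assumption, not something fixed by the book:

```python
# Standard error of the simple slope (b1 + b4*Z + b5*W + b7*Z*W) for the
# three-predictor interaction, via the quadratic form w' Sb w. The weight
# for the regression constant is 0.
import numpy as np

def three_way_simple_slope_se(fit, z, w):
    # weights for (const, X, Z, W, XZ, XW, ZW, XZW)
    wt = np.array([0.0, 1.0, 0.0, 0.0, z, w, 0.0, z * w])
    Sb = np.asarray(fit.cov_params())
    return float(np.sqrt(wt @ Sb @ wt))
```

Dividing the simple slope by this standard error gives the t-test, again with (n - k - 1) degrees of freedom.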
Table 4.2
Variances, Standard Errors, and t-Tests for Simple Slopes in the Three-Predictor
Interaction

w' = [1  0  0  3.096  1.045  0  (3.096)(1.045)]
d. Standard Errors and t-Tests for Simple Slopes of Y on X Shown in Figure 4.1

                     Simple Slope   Standard Error   t-test
(i)   At ZH, WH          4.521          1.424         3.17**
(ii)  At ZL, WH         -3.843          1.600        -2.40*
(iii) At ZH, WL         -2.693          1.565        -1.72+
(iv)  At ZL, WL         -0.811          1.478        -0.55
3. Taking each pair of ZCV with WCV (e.g., Z and W one standard deviation above their means) in turn, the criterion Y is regressed on X, ZCV, WCV, (X)(ZCV), (X)(WCV), (ZCV)(WCV), and (X)(ZCV)(WCV). The resulting regression coefficient b1 for X is the simple regression coefficient of Ŷ on X at the specific values ZCV and WCV. The t-test (using the reported standard error) of b1 provides the test of the simple slope. The regression constant is that for the simple regression in question.

The computer analysis of the XZW interaction explored in Table 4.2 is given in Table 4.3. The overall regression analysis is given in Table 4.3a; note that predictors X, Z, and W are centered with sX = 7.070, sZ =
Table 4.3
Computation of Simple Slope Analysis by Computer for the XZW Interaction in
the Regression Equation
Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW + b0
3.096, and sW = 1.045. Table 4.3b shows the computation of the new (transformed) variables from Step (1) above.
The crossing point of the simple regression lines for Y on X at values of Z, evaluated at a specified value of W, is

Xcross at specified W = -(b2 + b6W) / (b4 + b7W)    (4.4)

Note that the denominator of this expression will be zero when both the XZ and the XZW interactions do not exist, that is, when b4 and b7 are zero. When the denominator is zero, the simple regression lines are parallel. The fact that W appears in equation 4.4 indicates that the crossing point for the simple regressions of Y on X at values of Z depends on the specific value of W. Figure 4.1 illustrates this point nicely; at "W Low," the left-hand graph of Figure 4.1, the simple regression lines do not cross within the range of X that is portrayed; at "W High," they do cross. Now instead of a single cross-over point, there is a line of cross-over points, with each point on the line corresponding to a different value of W.

If equation 4.4 is evaluated at WH = 1.045, corresponding to the "W High" graph of Figure 4.1, then Xcross = -0.8093.
For WL = -1.045, corresponding to the "W Low" graph of Figure 4.1, Xcross at WL = 15.3310. Note that at WH, the crossing point is well within one standard deviation of the mean (sX = 7.07), whereas for WL, the crossing point is over two standard deviations above the mean.
Alternatively, if one has plotted the regression of Y on X at levels of W within separate graphs confined to particular levels of Z, then the value of interest is

Xcross at specified Z = -(b3 + b6Z) / (b5 + b7Z)    (4.5)
The denominator of this expression will be zero if both the XW and XZW interactions are zero.
The simple slope and variance expressions for equations of similar form follow orderly patterns. The regression of Y on X for equation 2.1 is pre-
Table 4.4
Expressions for Simple Slopes and Their Variances for Various Regression Equations
Containing Two- and Three-Predictor Interactions

Case  Equation                                                Regression  Simple Slope              Variance of Simple Slope (s²b)

(1a)  Ŷ = b1X + b2Z + b3XZ + b0                               Y on X      (b1 + b3Z)                s11 + 2Zs13 + Z²s33

(2)   Ŷ = b1X + b2Z + b3W + b4XZ + b6ZW + b0                  Y on X      (b1 + b4Z)                s11 + 2Zs14 + Z²s44

(3)   Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b0           Y on X      (b1 + b4Z + b5W)          s11 + Z²s44 + W²s55 + 2Zs14 + 2Ws15 + 2ZWs45

(4)   Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW + b0   Y on X      (b1 + b4Z + b5W + b7ZW)   s11 + Z²s44 + W²s55 + Z²W²s77 + 2Zs14 + 2Ws15 + 2ZWs17 + 2ZWs45 + 2WZ²s47 + 2W²Zs57
sented as Case 1a in Table 4.4. Case 2 of Table 4.4 presents the regression of Y on X in an equation with one level of increase in complexity over Case 1: the addition of a W first order term and a ZW interaction. Because neither of these new terms involves X, they have no effect on the expressions for the simple slope of Y on X or its variance. The simple slopes for Case 1a and Case 2 have the same structure. Case 3 of Table 4.4 is increased in complexity from Case 2 by the addition of the XW interaction; it contains all three two-way interactions among X, Z, and W, but not the three-way interaction. Since the XW interaction, tested with the b5 coefficient, does contain X, the b5 coefficient appears in the simple slope expression for Y on X; the variance of b5 and its covariance with other predictors appear in the expression for the variance of the simple slope. Finally, Case 4 is the complete equation including the XZW interaction, to which Chapter 4 is devoted. The reader may follow the patterns illustrated in Table 4.4 to generate simple slopes and variance expressions for equations of varying complexity but which contain only linear terms and products of linear terms.
Summary
Notes
1. Social science research areas differ in their position about the permissibility of omitting lower order terms in regression equations. The only case in which a justification for this practice may be offered is when strong theory dictates that a lower order effect must equal zero (see Fisher, 1988; Kmenta, 1986).

2. The weight vector used to generate this expression is w' = [1 0 0 Z W 0 ZW], and Sb is the 7 × 7 variance-covariance matrix of the regression coefficients. Then s²b = w'Sbw, as in equation 2.8.
5 Structuring Regression Equations to
Reflect Higher Order Relationships
proach to our presentation. First, we have illustrated what regression equations containing higher order terms look like in terms of the forms of relationships they represent. We begin with a relatively simple equation (Case 1 below) and gradually build in complexity (Cases 2, 3, and 4). Second, we consider the post hoc probing of equations presented in the first stage, again beginning with Case 1 and working through Cases 2, 3, and 4. The case numbers we have used throughout the chapter are the same as those in Table 5.1 for each equation. Table 5.1 summarizes the cases we will consider in depth here.
Our discussion of the representation of curvilinear relationships is limited to terms involving no more than second powers of the predictor variables. Certainly effects represented by still higher order powers may occur, and our prescriptions may be generalized to these relationships as well. However, at present, relationships with higher than second power terms are rarely, if ever, hypothesized to exist in the social sciences.

The X and X² terms represent the linear and quadratic components of the overall "main" effect of X, each with one degree of freedom. Note that both the X and the X² terms must be included in the equation, even if it is expected that there is only a quadratic relationship between X and Y. As is illustrated in Figure 5.1, this equation fits many different
appearances of the relationship between X and Y. Because the X and X² terms form a building block for more complex equations, the interpretation of the regression coefficients in equation 5.1 is explored here in some detail.

With centered predictors, the b1 coefficient indicates the overall linear trend (positive or negative) in the relationship between X and Y across the observed data. If the linear trend is predominantly positive, as in Figures 5.1a,b, b1 is positive; if the linear trend is predominantly negative, as in Figure 5.1c, then b1 is negative. For the completely symmetric U-shaped and inverted U-shaped relationships depicted in Figures 5.1d and 5.1e, respectively, b1 is zero. The interpretation of b1, then, is consistent with all previous interpretations in Chapters 2 through 4, when centered predictors are employed.

The b2 coefficient indicates the direction of curvature. If the relationship is concave upward, as in Figures 5.1a,d, then b2 is positive; if the relationship is concave downward, as in Figures 5.1b,c,e, then b2 is negative. When the curve is concave upward (b2 positive), we often are interested in the value of X at which Y takes on its lowest value, the minimum of the curve, as in Figures 5.1a,d. When the curve is concave downward (b2 negative), we may seek the value of X at which Y reaches its highest value, the maximum of the curve, as in Figures 5.1b,c,e. As we explain later, the maximum or minimum point of the function is reached when X = -b1/(2b2). If this value falls within the meaningful range of X, then the relationship is nonmonotonic and may appear as in Figures 5.1d,e. If this value falls outside the meaningful range of the data, then the relationship appears monotonic, as in Figures 5.1a,b,c. It should be noted that the distance from the maximum or minimum point to the mean of X is invariant under additive transformation.

When X bears a linear relationship to Y (i.e., no higher order term containing X appears in the equation), a one-unit change in X is associated with a constant change of b1 units in Y. In contrast, when X bears a curvilinear relationship to Y (i.e., there is a higher order term containing a power of X in the equation, such as X²), then the change in Y for a one-unit change in X depends upon the value of X. This can be verified by noting the change in Y as a function of X in Figure 5.1.
prescriptions for testing and post hoc probing of interactions employ throughout this chapter a single simulated data set. The bivariate normal pairs of scores used in Chapter 2 were employed here; predictors X and Z are centered and moderately correlated, rxz = 0.42. The higher order terms X², XZ, and X²Z were then generated from the centered terms, and all terms were used to produce Ŷ using the regression equation Ŷ = 0.2X + 5.0X² + 2.0Z + 5XZ + 1.5X²Z. Finally, observed Y scores were generated from the predicted Ŷ scores by the addition of normally distributed random error. Linking this simulation to a substantive example, predictor X represents an individual's self-concept (i.e., how well or poorly an individual evaluates himself or herself overall). The criterion Y represents an individual's level of self-disclosure, or the extent to which the individual shares personal information with others. Self-disclosure (Y) has been found to be a U-shaped function of self-concept (X); individuals with low or high self-concepts tend to disclose more about themselves than persons with moderate self-concepts. Predictor Z represents the amount of alcohol consumed in a social situation in which an individual has an opportunity to self-disclose. Self-disclosure is expected to increase with increased alcohol consumption; a linear relationship is assumed in the absence of theory specifying a more complex relationship.
Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0          (5.4)

This equation is again plotted in two ways in Figure 5.2c, illustrating first the regression of Y on X and then the regression of Y on Z. As is most
clearly seen in Figure 5.2c(1), the meaning of the quadratic by linear X²Z term is that the quadratic relationship between X and Y varies in form as a function of the value of Z. If we had hypothesized that the curvilinear relationship between self-disclosure and self-concept would be increasingly manifested as alcohol consumption increased, equation 5.4 would have been appropriate to test the prediction.
Case 5: Curvilinear X Relationship, Curvilinear Z Relationship, and Their Interactions

Suppose finally that both the X and Z predictors were expected to have curvilinear effects. In this case, both predictors would be treated as predictor X was in equation 5.2, as represented in the first four terms of equation 5.5. In addition, up to four components of the interaction may be included in the regression equation, namely the XZ, XZ², X²Z, and X²Z² terms.
Representation of Curvilinearity in ANOVA Versus MR
of interpretation, sometimes of substantial magnitude. The relationships in Figures 5.2a,b,c are all based on the same simulated data set. What differs between these portions of Figure 5.2 is the complexity of the regression equations used to fit the data. Recall that these data were simulated, so that the data were in fact generated by an equation of the form depicted in Figure 5.2c.
It may appear that the researcher must be more informed about the nature of the relationships of predictors to criteria to use MR than to use ANOVA. After all, the ANOVA automatically includes curvilinear components in the variance partitions. In actuality, in the planning of an experiment, the researcher must pick the number of levels of each factor based on some assumption or knowledge of the relationship of the factors to the criterion. If there is only a linear relationship expected, two levels suffice to estimate the linear effect (but three levels are required to test for nonlinearity). If curvilinear relationships are expected, at least three levels are required. In an analogous manner, the researcher using MR who is suspicious that there are nonlinearities in the relationships may explore these relationships by using equations containing appropriate higher order terms (see also Chapter 9).2
Ŷ = b1X + b2X² + b0
As before, we rewrite the equation to show the regression of Y on X:

dŶ/dX = b1 + 2b2X          (5.7)
If the value X1 is substituted into equation 5.7, the resulting value is the simple slope of Y on X at X1, as illustrated in Figure 5.3a. Otherwise stated, equation 5.7 is an expression for the change in Y associated with a one-unit change in X at any specific value of X.

The general definition of the simple slope as the first (partial) derivative is applicable to all the simple slope expressions in Chapters 2 through 4. Consider the regression equation with one XZ interaction, Y = b1X + b2Z + b3XZ + b0. We take the first (partial) derivative of this equation with respect to X:

∂Ŷ/∂X = b1 + b3Z          (5.8)

This is the same expression for the simple slope as was given in Chapter 2 and summarized in Table 4.4, Case 1a.

At this point we will formalize the definition of simple slopes to encompass all the cases throughout the text. The simple slope of the regression of Y on X is the first (partial) derivative of the overall regression equation with respect to the variable X. This function indicates the slope of the regression of Y on X at a particular value of at least one variable, either some other variable Z, the variable X itself, or some combination of variables.
Returning to the simple curvilinear equation, what does its first derivative tell us? Consider the regression equation illustrated in Figure 5.3a: Ŷ = 4.993X + 6.454X² + 3.197, where X is centered. Using the expression for the derivative, we see that at the mean of X (X = 0), the regression of Y on X is positive: 4.993 + 2(6.454)(0) = 4.993. At XH, one standard deviation above the mean (sx = 0.945), XH = 0.945, so that the regression of Y on X at XH is 4.993 + 2(6.454)(0.945) = 17.191. At XL = -0.945, the regression of Y on X is -7.205. Thus at one standard deviation below the mean (XL) there is a negative relationship of X to Y, whereas at one standard deviation above the mean (XH) the relationship is strongly positive.
that is, the value of X at which the predicted value of Y is its lowest. This point on the curve is where the tangent line to the curve (first derivative) has a slope equal to zero. Using expression 5.7, we solve for that value of X that causes expression 5.7 to equal zero:

b1 + 2b2X = 0
2b2X = -b1
X at minimum = -b1/(2b2)

For the numerical example, X at minimum = -4.993/[2(6.454)] = -0.387, the value at which the simple slope in Table 5.2 is zero.
Table 5.2
Simple Slopes at Values of X for the Regression Equation Ŷ = 4.993X + 6.454X² + 3.197

a. Variance-Covariance Matrix of Regression Coefficients

        b1        b2
b1   2.14449  -0.10432
b2  -0.10432   1.26567

b. Simple Slopes and Standard Errors

    X      Simple Slope (b1 + 2b2X)   Standard Error       t
 -0.945           -7.205                  2.657         -2.71**
 -0.387            0.000                  1.750          0.00
  0.000            4.993                  1.464          3.41***
  0.945           17.191                  2.504          6.87***

**p < .01. ***p < .001.
We now apply our prescriptions for post hoc probing to the progression of regression equations considered in the first section of this chapter. The equations, their simple slopes, and the simple slope variances are summarized in Table 5.1, Cases 2 through 4. Case 2 is a simple extension of Case 1. It is in Case 3 and Case 4 that we encounter the combination of higher order terms and crossproduct terms.
The (b1 + b4Z) coefficient in equation 5.11 takes on the same meaning in the simple regression equation as does b1 in equation 5.3 and indicates the overall linear trend in the regression of Y on X at one value of Z. If (b1 + b4Z) is positive, the simple regression has an overall upward linear trend; if it is negative, an overall downward linear trend. However, the nature of the curvature, measured by b2, is independent of Z because Z does not interact with X².
For ZL = -2.200:  Ŷ = [1.125 + 2.935(-2.200)]X + 3.563X² + 3.608(-2.200) + 3.246
                     = -5.332X + 3.563X² - 4.692
For ZM = 0.00:    Ŷ = 1.125X + 3.563X² + 3.246
Case 3a: Simple Slopes. The reader should recall the distinction between coefficients found by reexpression of the overall regression equation, as in equation 5.11, and simple slopes. To determine the simple slope of the regression of Y on X at any single value of X, we differentiate equation 5.3 with respect to X:

∂Ŷ/∂X = b1 + 2b2X + b4Z          (5.12)
0.848. Table 5.3 provides a matrix summarizing the simple slopes at all nine combinations of XL, XM, and XH crossed with ZL, ZM, and ZH. Row 1, column 3 of the table shows the simple slope at XL, ZH to be 0.848, as we just calculated. The second row shows the regression of Y on X at the mean of centered X (X = 0) for ZL, ZM, and ZH. Note that these values are identical to the first order coefficients of the simple regression equations presented above. When X = 0, the regression coefficient (b1 + b4Z) for X in reexpressed regression equation 5.11 and the simple slope of Y on X at a particular value of Z, that is, (b1 + 2b2X + b4Z) of equation 5.12, are equal. When X is centered, both equations indicate the regression of Y on X at the mean of X for particular values of Z, or the average slope of the regression of Y on X across all the cases in the sample.
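The full 3 x 3 grid of Table 5.3 can be generated directly from equation 5.12 and the estimates above; a short Python sketch:

b1, b2, b4 = 1.125, 3.563, 2.935           # from the equation in Table 5.3

xs = {"XL": -0.945, "XM": 0.0, "XH": 0.945}
zs = {"ZL": -2.200, "ZM": 0.0, "ZH": 2.200}

for xname, x in xs.items():
    # simple slope of Y on X (equation 5.12): b1 + 2*b2*X + b4*Z
    row = [f"{b1 + 2*b2*x + b4*z:7.3f}" for z in zs.values()]
    print(xname, row)
# Row 1, column 3 (XL, ZH) reproduces the value 0.848 computed in the text.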
The set of simple slopes presented in Table 5.3 leads to a useful summary of the outcome of the regression analysis. When X is low (row 1),
Table 5.3
Probing Simple Slopes in the Equation Ŷ = 1.125X + 3.563X² + 3.61Z + 2.935XZ + 3.246 (Simple Slopes Are Found Using the Expression (b1 + 2b2X + b4Z))

a. Variance-Covariance Matrix of Regression Coefficients

        b1        b2        b3        b4
b1   2.34649   0.01536  -0.41530  -0.08693
b2   0.01536   1.58396  -0.04381  -0.49227
b3  -0.41530  -0.04381   0.43079   0.01174
b4  -0.08693  -0.49227   0.01174   0.55213

*p < .05; **p < .01; ***p < .0001
Case 3a: Standard Error and t-Test. The variance of the simple slope is given in Table 5.1, Case 3a; its square root is the required standard error, and the t test has n - k - 1 df, where k = 4. The value of X at which each simple regression of Y on X reaches its minimum (or maximum) is

X = -(b1 + b4Z) / (2b2)          (5.13)
preted as disordinal. If, in contrast, the low value of X at which the curves cross is not meaningful, then the interaction would be interpreted as ordinal. In our example, if values below XL represented the self-concept scores of clinically depressed individuals (a different population), then we would state that the interaction was ordinal within the "normal" population above XL in self-concept. Note that if the b4 term for the XZ interaction were zero, then the simple regressions would not cross. In this case, the equation reduces to the simpler regression equation 5.2, Ŷ = b1X + b2X² + b3Z + b0, which is depicted in Figure 5.2a(1).
Case 3b: Simple Slopes, Standard Error, and t-Test. The equation for the regression of Y on Z at levels of X is found by differentiating equation 5.3 with respect to Z:

∂Ŷ/∂Z = b3 + b4X          (5.14)
The variance of this simple slope10 is given in Table 5.1, Case 3b. The standard error is the square root of this variance, and the t test follows, with n - k - 1 df, where k = 4. The forms of both the simple slope and its variance, given in Table 5.1, Case 3b, are identical to those for the regression of Y on Z at values of X in equation 2.1, Y = b1X + b2Z + b3XZ + b0 (see Table 4.4, Case 1b). This is so because in both equations 2.1 and 5.3 the relationship of Y to Z is linear both in the first order term and in the interaction. The more complex equation 5.3 differs from equation 2.1 only by the addition of the X² term, which does not enter into the regression of Y on Z.

Case 3b: Crossing Points. Figure 5.2b(2) shows that the simple regression lines do not cross at a single point, unlike our experience with previous simple linear by linear interactions. In equations containing second order terms in X, the crossing points of any two simple regression lines
for the simple regressions of Y on Z depend upon the specific values of X chosen. For equation 5.3, the value of Z at which two simple regression lines of Y on Z cross is

Zcross = -[b1 + b2(Xi + Xj)] / b4          (5.15)

where Xi and Xj are the specific values of X chosen for examination. Note again that if the b4 coefficient for the XZ interaction is zero, the simple regression lines are parallel, as in Figure 5.2a(2). Further, if the b2 coefficient for the X² term is zero, all simple regression lines will cross at the single value -b1/b4.

To illustrate the calculation of a crossing point, we substitute XH = 0.945 for Xi and XM = 0 for Xj into equation 5.15: Zcross = -[1.125 + 3.563(0.945 + 0)]/2.935 = -1.531. This case is graphically depicted in Figure 5.2b(2), where the lowest value of Z plotted is ZL = -2.20.
As in Case 3a, in equation 5.16 the (b1 + b4Z) coefficient provides the same information as the b1 coefficient in the overall equation. The (b1 + b4Z) coefficient represents the overall linear trend in the relationship of X to Y at a value of Z, paralleling its interpretation in equation 5.11. The
(b2 + b5Z) coefficient for X² in equation 5.16 conveys the same information as the b2 coefficient in the overall equation. It represents the nature of the curvilinearity of the simple regression lines of Y on X at specific values of Z. If the value of (b2 + b5Z) is positive, the curve is concave upward; if negative, it is concave downward.

Figure 5.2c(1) illustrates the regression equation Ŷ = -2.042X + 3.000X² + 2.138Z + 2.793XZ + 1.960X²Z + 3.502. The figure is presented as three simple regressions, based on the rearranged form of equation 5.16:

Ŷ = (3.000 + 1.960Z)X² + (-2.042 + 2.793Z)X + (2.138Z + 3.502)
Substituting the values -2.200, 0, and 2.200 for ZL, ZM, and ZH leads to values of both coefficients of equation 5.16 for each simple regression line. The value of (b1 + b4Z) varies from negative to positive as Z increases: -8.188 for ZL, -2.042 for ZM, and 4.104 for ZH. This is consistent with the generally negative linear trend at ZL, but generally positive linear trend at ZH, observed in Figure 5.2c(1). The values of (b2 + b5Z) are as follows: -1.312, 3.000, and 7.312 for ZL, ZM, and ZH, respectively. At ZL there is very slight downward curvature, but the curvature is upward for both ZM and ZH.

As a further aid to interpretation, both the (b1 + b4Z) coefficient and the (b2 + b5Z) coefficient may be tested for significance in each simple slope equation. Standard errors are computed according to the procedure given in Chapter 2 and employed throughout the text.11 For example, for (b2 + b5Z), at ZL, t = -0.786, ns; at ZM, t = 2.422, p < .05; and at ZH, t = 4.853, p < .01. These tests confirm the appearance of Figure 5.2c(1): the regression of Y on X shows no reliable curvature at ZL but becomes increasingly concave upward as Z increases.
The reader should be aware that the linear coefficient (b1 + b4Z) and the curvilinear coefficient (b2 + b5Z) in equation 5.16 are not simple slopes. Rather, these coefficients summarize the overall relationship of Y to X at particular values of Z. (In contrast, simple slopes measure the regression of Y on X only at a single pair of X and Z values.)

For all the simple slopes presented in this chapter, the computer method developed in Chapter 2 may be used to find standard errors and simple slopes. We provide examples later in the chapter for the simple slopes presented. However, the computer method as presented in this text is only applicable to simple slopes and cannot be used to find the standard errors of the general linear and curvilinear coefficients in equation 5.16. Instead,
the approach outlined in the optional section at the end of Chapter 2 and summarized in equation 2.10 must be used. The same is true for the general linear coefficient (b1 + b4Z) in equation 5.11.
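For the numerical example of Figure 5.2c(1), that matrix approach reproduces the t tests just reported. Using the covariance matrix of Table 5.4 and the weight vector of note 11, in Python:

import numpy as np

# Coefficient estimates and their covariance matrix (order b1 ... b5), Table 5.4
b = np.array([-2.042, 3.000, 2.138, 2.793, 1.960])
Sb = np.array([[ 2.78447,  0.11045, -0.14763, -0.05919, -0.33314],
               [ 0.11045,  1.53329,  0.00252, -0.46696, -0.05926],
               [-0.14763,  0.00252,  0.52842,  0.02239, -0.15468],
               [-0.05919, -0.46696,  0.02239,  0.52960, -0.01487],
               [-0.33314, -0.05926, -0.15468, -0.01487,  0.20618]])

for z in (-2.200, 0.0, 2.200):             # ZL, ZM, ZH
    w = np.array([0.0, 1.0, 0.0, 0.0, z])  # picks out (b2 + b5*Z), per note 11
    est = float(w @ b)
    se = float(np.sqrt(w @ Sb @ w))
    print(f"Z = {z:6.2f}: b2 + b5*Z = {est:6.3f}, t = {est/se:6.3f}")
# t = -0.785, 2.423, 4.853, matching (within rounding) the values in the text.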
Case 4a: Simple Slopes, Standard Errors, and t-Tests. To find the simple slope of Y on X in equation 5.4, we compute the first (partial) derivative of equation 5.4 with respect to X:

∂Ŷ/∂X = b1 + 2b2X + b4Z + 2b5XZ          (5.17)

Setting this derivative equal to zero and solving for X locates the minimum or maximum of each simple regression, so that

X = -(b1 + b4Z) / [2(b2 + b5Z)]          (5.18)
Table 5.4
Probing Simple Slopes in the Equation Ŷ = -2.042X + 3.000X² + 2.138Z + 2.793XZ + 1.960X²Z + 3.502 (Simple Slopes Found Using the Expression (b1 + 2b2X + b4Z + 2b5XZ))

a. Variance-Covariance Matrix of Regression Coefficients

        b1        b2        b3        b4        b5
b1   2.78447   0.11045  -0.14763  -0.05919  -0.33314
b2   0.11045   1.53329   0.00252  -0.46696  -0.05926
b3  -0.14763   0.00252   0.52842   0.02239  -0.15468
b4  -0.05919  -0.46696   0.02239   0.52960  -0.01487
b5  -0.33314  -0.05926  -0.15468  -0.01487   0.20618
When the b5 coefficient for the X²Z term is zero, the curves will not cross. However, even when b5 is nonzero, there may still be no point at which the two regression curves cross. In fact, this is true for the present case, as is illustrated in Figure 5.2c(1). Evaluation of equation 5.19 for the case does not produce a real number for a solution; the solution is an imaginary number.
Each pair of simple regression lines cross at values of Z that depend upon the values of X in question:

Zcross = -[b1 + b2(Xi + Xj)] / [b4 + b5(Xi + Xj)]          (5.21)

For XH = 0.945 and XM = 0,

Zcross = -[(-2.042) + 3.000(0.945 + 0)] / [2.793 + 1.960(0.945 + 0)] = -0.171
The simple slopes and the corresponding standard errors and t tests for all the analyses in Tables 5.2, 5.3, and 5.4 may be calculated by computer. The approach directly extends the three-step procedure presented in Chapter 2. We will consider the probing of equation 5.4, which in our example includes a significant X²Z interaction. The outcome of the simple slope analysis for the numerical example is presented in Table 5.4; the parallel computer analysis is presented in Table 5.5.

The significant X²Z interaction implies that each regression of Y on X depends on the specific values of both X and Z. Consequently, we need to specify these values, and we will use all combinations of XL, XM, and XH with ZL, ZM, and ZH as before. Recall that X and Z are centered; XM = 0 and ZM = 0, so that new transformed variables are not needed for these values.
1. Our first step is to transform the original X and Z variables so they are evaluated at the conditional values of interest. The transformed variables are created from X and Z by subtracting conditional values CVx and CVz, respectively. In this case we have the following:

(a) XABOVE = X - (0.945) for the regression of Y on X at CVx = 0.945, one standard deviation above the mean of X;
(b) XBELOW = X - (-0.945) for the regression of Y on X at CVx = -0.945, one standard deviation below the mean of X;
(c) ZABOVE = Z - (2.200) for the regression of Y on X at CVz = 2.200, one standard deviation above the mean of Z; and
(d) ZBELOW = Z - (-2.200) for the regression of Y on X at CVz = -2.200, one standard deviation below the mean of Z.
Table 5.5
Computation of Simple Slope Analysis by Computer for the X²Z Interaction in the Regression Equation Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0

COMPUTE XABOVE=X-(.945)
COMPUTE XBELOW=X-(-.945)
COMPUTE ZABOVE=Z-(2.20)
COMPUTE ZBELOW=Z-(-2.20)
COMPUTE X2A=XABOVE*XABOVE
COMPUTE X2B=XBELOW*XBELOW
COMPUTE XZA=X*ZABOVE
COMPUTE XZB=X*ZBELOW
COMPUTE XAZ=XABOVE*Z
COMPUTE XBZ=XBELOW*Z
COMPUTE XAZA=XABOVE*ZABOVE
COMPUTE XAZB=XABOVE*ZBELOW
COMPUTE XBZA=XBELOW*ZABOVE
COMPUTE XBZB=XBELOW*ZBELOW
COMPUTE X2ZA=X2*ZABOVE
COMPUTE X2ZB=X2*ZBELOW
COMPUTE X2AZA=X2A*ZABOVE
COMPUTE X2AZB=X2A*ZBELOW
COMPUTE X2BZA=X2B*ZABOVE
COMPUTE X2BZB=X2B*ZBELOW
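The same recentering trick carries over to any environment that fits OLS. A minimal numpy sketch (the data-generating values are placeholders, not the book's simulated data): refitting equation 5.4 with X and Z replaced by deviations from the conditional values makes the coefficient on the transformed X equal the simple slope of equation 5.17.

import numpy as np

rng = np.random.default_rng(0)
n = 400
x = rng.normal(0, 0.945, n)
z = rng.normal(0, 2.20, n)
y = -2.0*x + 3.0*x**2 + 2.1*z + 2.8*x*z + 2.0*x**2*z + 3.5 + rng.normal(0, 4, n)

def terms(xv, zv):
    # design matrix for equation 5.4: X, X^2, Z, XZ, X^2*Z, intercept
    return np.column_stack([xv, xv**2, zv, xv*zv, xv**2*zv, np.ones_like(xv)])

cvx, cvz = 0.945, -2.20                      # conditional values (XH, ZL)
bt = np.linalg.lstsq(terms(x - cvx, z - cvz), y, rcond=None)[0]
b = np.linalg.lstsq(terms(x, z), y, rcond=None)[0]
analytic = b[0] + 2*b[1]*cvx + b[3]*cvz + 2*b[4]*cvx*cvz   # equation 5.17
print(bt[0], analytic)                       # the two values agree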
c. Regression Analysis with XABOVE and ZBELOW, Yielding Simple Slope Analysis at XH and ZL (regression of Y on X one standard deviation above the mean of X and one standard deviation below the mean of Z)

N of Cases = 400

d. Regression Analysis with X and ZABOVE, Yielding Simple Slope Analysis at XM and ZH (regression of Y on X at the mean of X and one standard deviation above the mean of Z)
Ŷ = -2.042X + 3.000X² + 2.138Z + 2.793XZ + 1.960X²Z + 3.502

We note the dramatic change in the coefficients for the X and Z terms and the intercept, illustrating the bias that is introduced by omitting these nonzero higher order terms from the equation. Neter, Wasserman, and Kutner (1989) discuss several methods of plotting residuals that are useful in detecting this problem.
The inclusion of higher order terms whose true value is zero should not bias the estimates of lower order terms. In Chapter 3 we pointed out that when X and Z are centered and bivariate normally distributed, the correlations of X with X², Z with Z², and X and Z with XZ are zero. It would seem, then, that under these conditions including a higher order term, say X² in equation 5.2, or XZ in equation 2.1, should have essentially no effect on the estimates of the X and Z effects. However, with higher order terms, if the first order predictors are even moderately correlated, then first and third order terms (e.g., Z with X²Z) will be highly correlated, even for centered predictors. The same is true for second and fourth order14 terms (e.g., X² with XZ; X² with X²Z²). These interpredictor correlations will introduce instability into regression coefficients; the correlations exist even after variables are centered (Dunlap & Kemery, 1987; Marquardt, 1980). To illustrate, if we estimate equation 5.5, which contains all the terms from 5.4 plus three terms known to be zero in our simulated data set (i.e., Z², XZ², X²Z²), we find the following.
We note that (a) the coefficient for each of the five nonzero terms has changed somewhat from the values estimated above for equation 5.4 (though all significance levels remain highly similar), and (b) one of the zero terms in the population is significant in the sample (for XZ², p < .05). The source of the significance is revealed in the pattern of intercorrelations of XZ² with other variables. The XZ² term has a low positive zero-order correlation with the criterion (r = 0.246) but a very high zero-order correlation with the other third order term X²Z (r = 0.817), so that the XZ² term acts as a suppressor.15

Ŷ = b1X + b2Z + b0
This second equation is linear in the parameters; it would be estimated using ordinary least squares (OLS) regression and can be interpreted in the transformed [e.g., log(X)], but not the original X, scaling in terms of the prescriptions we have outlined in this book.

An example of a linearized equation is provided by Wonnacott and Wonnacott (1979), who note that economic theory describes the Cobb-Douglas production function as follows: Q = b0 K^b1 L^b2 u, where Q is quantity produced; K is capital; L is labor; b0, b1, and b2 are the nonlinear regression parameters to be estimated; and u is the multiplicative error term. If we take the logarithms of both sides of the equation, we discover

Y = b0* + b1X + b2Z + e

where Y = log(Q), X = log(K), Z = log(L), b0* = log(b0), and e = log(u).
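A quick numerical check of this linearization, in Python; the production-function parameter values below are invented for the demonstration:

import numpy as np

rng = np.random.default_rng(1)
n = 200
K = rng.lognormal(2.0, 0.5, n)              # capital
L = rng.lognormal(1.5, 0.5, n)              # labor
u = rng.lognormal(0.0, 0.1, n)              # multiplicative error
Q = 1.2 * K**0.6 * L**0.3 * u               # Cobb-Douglas with b1 = 0.6, b2 = 0.3

# OLS on the log scale recovers the parameters of the linearized equation
X = np.column_stack([np.log(K), np.log(L), np.ones(n)])
beta = np.linalg.lstsq(X, np.log(Q), rcond=None)[0]
print(beta)                                  # approx. [0.6, 0.3, log(1.2) = 0.182]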
Summary
Notes
1. In very rare instances, based on strong theory, only the quadratic component might be included. To illustrate, suppose we had a model for judgments of size that indicated that size was judged on the basis of area, and not on the basis of height or width of objects alone. Then in a regression equation predicting size judgments from area, one might wish to omit the first order terms of height and width. The first order terms might be omitted if both strong theory plus prior empirical evidence indicated that individuals do not rely at all on linear extent in the judgment of size (we thank David Kenny for this example).
2. If there are nonlinear relationships between predictors and the criterion, and these relationships are not reflected in the regression equation, they may be detected with regression diagnostics applied to residuals (see, for example, Belsley, Kuh, & Welsch, 1980; Bollen & Jackman, 1990; Cook & Weisberg, 1980; Stevens, 1984), and the regression equation appropriately respecified.
6. The curvilinear regression equation may be rewritten as

Ŷ = M + b2(X - F)²

where M is the minimum (or maximum) value of Ŷ and F is the value of X at which it occurs. For example, if one considers the U-shaped relationship between self-concept (X) and self-disclosure (Y), the rewritten polynomial equation would indicate the minimum value of self-disclosure (M) at the value of self-concept (F) at which that minimum M was achieved.
7. For the general matrix-based approach to the variance of the simple slope of Y on X in Case 2, w' = [1 2X 0] and Sb is 3 x 3.
8. For the general matrix-based approach to the variance of the simple slope of Y on X in Case 3a, w' = [1 2X 0 Z] and Sb is 4 x 4.
9. The second (partial) derivative of equation 5.3 with respect to X is 2b2. If this second derivative is positive, then a minimum has been identified; if it is negative, a maximum has been identified (see note 4 above).
10. For the general matrix-based approach to the variance of the simple slope of Y on Z in Case 3b, w' = [0 0 1 X] and Sb is the same 4 x 4 matrix as for Case 3a.
11. The variance of the coefficient (b1 + b4Z) is calculated using the general matrix approach, where w' = [1 0 0 Z 0] and Sb is 5 x 5. The t test has n - k - 1 df, where k = 5. For (b2 + b5Z), w' = [0 1 0 0 Z], and Sb is the same 5 x 5 matrix; t has the same df.
12. For the general matrix-based approach to the variance of the simple slope of Y on X in Case 4a, w' = [1 2X 0 Z 2XZ] and Sb is 5 x 5.
13. The second partial derivative of equation 5.4 with respect to X is

∂²Ŷ/∂X² = 2(b2 + b5Z)

This equation indicates that the direction of curvature depends on the value of Z. The value of X obtained from equation 5.18 will be a minimum when b2 + b5Z > 0 and a maximum when b2 + b5Z < 0. At ZL = -2.20, b2 + b5Z = 3.000 + (1.96)(-2.20) = -1.312, so the second derivative is negative, indicating a maximum; hence the regression of Y on X is slightly concave downward. For ZH, b2 + b5Z = 7.312, so the second derivative is positive; the regression of Y on X is concave upward.
14. For multivariate normal variables X, Z, and W, all odd moments vanish (e.g., XZW, X²Z, X²Z²W), whether or not X, Z, and W are intercorrelated. If X, Z, and W are intercorrelated, then even moments do not vanish (e.g., X²Z², X²Z²W²). See the appendix of Kenny and Judd (1984) for a summary of moments of multivariate normal distributions.
15. A suppressor variable is one that is generally uncorrelated with the criterion, is highly correlated with another predictor, and increases the predictability of the other predictor by its inclusion in the equation. The suppressor typically has a significant negative regression coefficient.
6 Model and Effect Testing
with Higher Order Terms
ance that is shared by terms in the equation could potentially be apportioned to the higher and lower order effects in several different ways. This issue has stimulated a sizeable literature comparing strategies for testing overlapping effects in MR (see, e.g., Allison, 1977; Cleary & Kessler, 1982; Cohen & Cohen, 1983; Darlington, 1990; Lane, 1981; Pedhazur, 1982; Peixoto, 1987). Interestingly, a large parallel literature in ANOVA discusses strategies for partitioning the variance and testing the effects in factorial designs with unequal (nonproportional) cell sizes (see, e.g., Appelbaum & Cramer, 1974; Cramer & Appelbaum, 1980; Herr & Gaebelein, 1978; Overall, Lee, & Hornick, 1981; Overall & Spiegel, 1969; Overall, Spiegel, & Cohen, 1975). These designs have the same problem that is typical of complex multiple regression equations: The partitioning of variance is not unambiguous.

To illustrate this issue and the questions it engenders, we will consider the simplest case, represented by the now familiar two-variable regression equation containing an interaction. This equation is reproduced as equation 6.1 below.

Ŷ = b1X + b2Z + b3XZ + b0          (6.1)

With this equation as our starting point, a number of simpler models can also be generated, presented here as equations 6.2, 6.3, and 6.4.

Ŷ = b1X + b2Z + b0          (6.2)
Ŷ = b1X + b0          (6.3)
Ŷ = b2Z + b0          (6.4)
Note that the b1X terms in equations 6.1, 6.2, and 6.3 will not generally be equal. In equation 6.3, b1 reflects all variance shared between X and Y. In equation 6.2, b1 reflects all variance shared between X and Y over and above the variance shared between Z and Y. Finally, in equation 6.1, b1 reflects the unique variance shared between X and Y after the effects of Z and XZ on Y have been removed.1 Thus the b1s in the three equations have different meanings and may vary substantially in magnitude, depending on interpredictor correlation and the distribution of the variables.

Thus far in the book we have discussed the interpretation of only the b1 and b2 coefficients of equation 6.1. Each lower order effect was interpreted assuming the presence of the interaction in the equation.2 Testing the significance of each term in equation 6.1 provides a test of the unique
and that conditional effects be tested. At the same time, we believe that authors have an obligation to the reader to explain the meaning of conditional effects and how they differ from more familiar constant main effects. Restricting the presentation of the tests of the conditional effects to the context of post hoc probing of the interaction would help minimize the possibility of misinterpretation of such effects.

The second question concerns the proper procedure to follow when the test of the b3 coefficient for the interaction is nonsignificant. This question centers on whether or not the interaction should be eliminated from the equation, with the first order X and Z terms now being tested using equation 6.2. To begin to answer this question, we need to consider the trade-off between two desirable properties of statistical estimators: unbiasedness and efficiency.

Unbiasedness means that the estimators, in this case the sample regression coefficients, will on the average equal the population value of the corresponding parameters. In standard regression models, the major source of bias is the omission of terms from the regression equation that represent true effects in the population (specification error). High efficiency means that the standard error of the estimator will be small relative to that of other estimators of the same parameter. Including additional terms in a regression equation that in fact have no relationship to the criterion has no effect on bias: All of the regression coefficients will continue to be unbiased estimators of their respective population values. However, the introduction into the equation of unnecessary terms having no relationship to the criterion in the population has the result of lowering, sometimes appreciably, the efficiency of the estimates of the regression coefficients. Otherwise stated, the estimates of the standard errors of the regression coefficients will be larger, making it more difficult for any true effects in the equation to attain statistical significance. Hence, terms that have a value of zero in the population should be removed from the regression equation to permit more powerful tests of the other effects.
Researchers rarely know whether an interaction is zero in the population. Statistically, all the researcher can typically do to evaluate this hypothesis is to test the interaction in the sample and show it does not differ from zero. Unfortunately, as we will see in Chapter 8, tests of interactions often have low statistical power and may fail to detect small true interaction effects that exist in the population.
This problem has led to conflicting recommendations in the literature. For example, in the ANOVA context, Cramer and Appelbaum (1980) have emphasized the increased efficiency that results when nonsignificant higher order terms are dropped from the model. They have argued that this gain more than compensates for the small amount of bias that may be introduced in the estimates of the lower order effects.3 In contrast, Overall et al. (1981) focused on the problem of bias and showed under the conditions studied in their simulation that the use of the full model (equation 6.1) resulted in less bias, greater precision, and equal power relative to the use of the reduced model (equation 6.2) for tests of the lower order effects.

However, these results, while informative, should not necessarily be treated as being applicable in all contexts. As pointed out by Finney et al. (1984), "in actual research situations, as opposed to simulation analyses, the degree of bias for each approach depends on unknown—and, in many cases, unknowable—factors, such as the main and interactive effects of relevant variables that were not included in the model" (p. 91). Instead, Finney et al. (1984) argued that researchers should focus on the distinction we introduced above: the theoretical versus exploratory basis of the interaction. When there are strong theoretical expectations of an interaction, they conclude that the interaction should be retained in the final regression equation. Doing so informs the literature and leads to the accumulation of knowledge about the theory in question. Even when nonsignificant, the estimates of the effect size for the interaction may be combined across multiple studies through meta-analysis. Further, it is logically inconsistent to report the estimates of constant effects for X and Z from equation 6.2 when strong theory postulates an interaction.

In practice, estimates of the lower order effects derived from equation 6.1 versus from equation 6.2 will often be quite similar when all predictors have been centered. Indeed, Finney et al. (1984) note three cases in which they are identical. In each of these cases, there is no overlapping variance between the two predictors and their interaction: (a) if X and Z are uncorrelated with one another and are centered, then each is uncorrelated with the XZ term; (b) if X and Z are bivariate normal, then again rX,XZ = rZ,XZ = 0; and (c) if the XZ pairs are balanced (i.e., for every [X, -Z] data point there is a corresponding [X, Z] data point, and for every [-X, Z] data point there is a corresponding [X, -Z] data point), then again rX,XZ = rZ,XZ = 0. Although these exact conditions do not often hold in observed data, the rX,XZ and rZ,XZ correlations are often low with centered X and Z, so that estimates from equations 6.1 and 6.2 will be quite similar.
Recommendation

Our own recommendation concurs closely with that of Finney et al. (1984). In cases in which there are strong theoretical grounds for expecting an interaction, the interaction, even if nonsignificant, should be retained in the final regression equation. Post hoc probing procedures can also be used as a supplemental guide to understanding how the interaction could potentially modify the overall results. Such probing is particularly important in cases in which the first order effects, though significant, are not large in magnitude and the statistical power of the test of the interaction is low. Conditional effects for the lower order terms from equation 6.1 may be reported where useful and appropriate, so long as centered X and Z have been used in the analysis and the nature of the effects is explained. However, in cases in which there is not a strong theoretical expectation of an interaction, step-down procedures should be used. The interaction should be dropped from the equation and the first order effects should be estimated using equation 6.2.

The remainder of this chapter focuses on a variety of global tests that can be useful in testing focused hypotheses about a variety of alternatives to the full regression model. We also present sequential step-down procedures that are useful in exploring complex regression equations. The analyst should always keep in mind the theoretical versus exploratory basis of the tests that are performed. This consideration informs the choice of data analytic strategy and the interpretation of the results. With Finney et al. (1984) we also strongly encourage researchers using these global and step-down testing procedures to report the findings of their preliminary analyses in order to place the main results in context.
F = [(R²in - R²out)/m] / [(1 - R²in)/(n - k - 1)]          (6.5)

In this equation, R²in is the squared multiple correlation of the model containing the terms in question; R²out is the squared multiple correlation from the reduced model with the terms in question removed; m is the number of terms in the set of terms being explored; n is the number of cases; and k is the number of predictors in the full regression model, from which R²in is derived. We will use a number of variants of this general procedure throughout the remainder of the chapter.
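Equation 6.5 translates directly into code. A small Python helper (the R² values in the example call are illustrative, not from the book):

from scipy.stats import f as f_dist

def f_gain(r2_in, r2_out, m, n, k):
    # F test for the gain in prediction from a set of m terms (equation 6.5)
    F = ((r2_in - r2_out) / m) / ((1 - r2_in) / (n - k - 1))
    return F, f_dist.sf(F, m, n - k - 1)    # F statistic and its p value

# e.g., testing the b2*Z and b3*XZ terms of equation 6.1 against equation 6.3:
print(f_gain(r2_in=0.30, r2_out=0.22, m=2, n=400, k=3))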
As an illustration, if we wish to determine whether the Z and XZ terms contribute to the prediction in equation 6.1, we can compare the full model (equation 6.1) with a reduced model that eliminates Z and XZ (equation 6.3). In this comparison, R²in is the squared multiple correlation from equation 6.1; R²out is that from equation 6.3; m = 2 for the b2Z and b3XZ terms in question; and k = 3, corresponding to the three predictors in equation
The test of the gain in prediction from equation 6.1 to equation 6.6 provides a test of whether the linear combination of regression coefficients (b2 + b5Z) for the X² term is different from zero. In equations involving Z² terms, a similar reexpression of the regression equation in terms of Z and Z² provides the basis for a similar test evaluating whether the relationship of Z to the criterion is curvilinear.

In equation 6.6, one may test whether there is any effect of one of the predictor variables, in either first order or higher order form. For a global test of variable X, equation 6.6 would be contrasted to equation 6.4, Ŷ = b2Z + b0, which drops all terms containing X. If this test of gain in prediction is nonsignificant, it is concluded that variable X has no effect in the regression model. For a global test of variable Z, equation 6.6 would be contrasted to equation 6.8, which drops all terms containing Z:

Ŷ = b1X + b2X² + b0          (6.8)
If there were no significant interaction, then both the b4XZ and b5X²Z terms would be eliminated, leaving equation 6.9 as the appropriate equation.

Global tests of effects analogous to those in ANOVA may be employed in series. Suppose that the test of the interaction was not significant, yielding reduced regression equation 6.9. Equation 6.9 represents the main effect of X, represented by the X and X² terms, plus the Z main effect. Equation 6.9 would now be contrasted to equation 6.4 to determine the overall contribution of X to prediction. This is analogous to the test of a main effect of X in ANOVA where X has three levels. Finally, the linear effect of Z would be tested by contrasting equation 6.9 with equation 6.8.

The above tests of the overall X effect (X plus X²) in the presence of Z, and of the Z effect in the presence of X and X², will be familiar to many ANOVA users. These procedures are identical to Appelbaum and Cramer's (1974) tests of main effects over and above other main effects. Appelbaum and Cramer (1974) referred to these tests, with regard to the A and B main effects of an ANOVA, as "A eliminating B" and "B eliminating A."

Appelbaum and Cramer (1974) also observed that in some instances the multiple correlation for an equation containing two main effects (e.g., equation 6.9) will be significant, but that neither individual effect will attain significance. They recommend further testing of the X and Z main effects in which each effect is considered in a single equation. In the present case, the joint test of b1 and b2 in equation 6.8 would provide the test of X, and the test of b2 in equation 6.4 would provide the test of Z. Such tests, termed "A ignoring B" and "B ignoring A" by Appelbaum and Cramer (1974), are recommended to clarify the impact of a single factor on a criterion. If only one of the pair of tests is significant, the associated predictor is taken as having an effect on (or association with) the criterion.
5. Global Test of the Equivalence of Two Regression Equations

Chapter 7 addresses the analysis of regression equations involving combinations of continuous and categorical predictor variables. In equation
In all of these global tests, the equations containing higher order terms are deliberately structured to contain all lower order terms of which the higher order terms are comprised. Tests of the contributions of higher order terms should consider prediction by these terms over and above the lower order terms. This is because higher order terms actually represent the effects they are intended to represent if and only if all lower order terms are partialed from them (Cohen, 1978). The XZW term covered in detail in Chapter 4 only represents the linear by linear by linear component of the XZW interaction if all lower order effects (i.e., X, Z, W, XZ, XW, and ZW) have been partialed from the XZW term. A global test of the curvilinearity of X in equation 6.6 would not be accomplished by determining the significance of the multiple correlation in the equation Ŷ = b2X² + b5X²Z + b0. In such an equation the higher order curvilinear effects would be confounded with portions of the linear X, linear Z, and the linear by linear XZ interaction. In sum, we recommend strongly that in structuring regression equations with higher order terms all lower order terms be included (see Allison, 1977; Cleary & Kessler, 1982; Cohen & Cohen, 1983; Darlington, 1990; Pedhazur, 1982; Peixoto, 1987; Stone & Hollenbeck, 1984, for further discussion of the necessity for the inclusion of lower order terms when higher order terms are tested).
Readers will occasionally encounter models in the research literature in which lower order terms have been omitted. Fisher (1988) described a theoretical model in which one predictor X has a direct effect on the criterion Y; a second predictor Z modifies the effect of X on Y, but has no direct effect on Y. This theorizing led to the regression equation Ŷ = b1X + b3XZ + b0, with the b2Z term for the direct effect of Z on Y omitted. We suggest that it is useful with such theorizing to test the direct effect of Z on Y and to show lack of support for its operation, rather than to omit it from the model. Hence, we advocate the use of models with all lower order effects included for theory testing. Demonstrations that effects predicted to be nonexistent by theory do not accrue are valuable in theory building.
Note that both equations 6.9 and 6.10 are hierarchically "well formulated" (Peixoto, 1987) in that all lower order terms are represented. Suppose further that the b2X² term were significant, whereas the b4XZ term
were not. Then equation 6.9 would be retained. The source of the deviation from linearity found in the global test would have been identified as resulting from a curvilinear regression of Y on X. This relationship would then be further characterized using the strategies for probing curvilinear relationships described in Chapter 5, Case 2.
The familiar automatic stepwise forward (build-up) and stepwise backward (tear-down) variable selection algorithms should not be confused with the hierarchical step-up and step-down procedures discussed here. The selection of predictors in stepwise procedures available in standard statistical packages is based solely on the predictive utility of individual predictors over and above other predictors. Their use in the context of complex regression equations containing higher order terms will lead to reduced regression equations that are not hierarchically well formulated, that is, in which not all necessary lower order terms are included. The identical problem holds for the all-possible-subsets regression algorithms in which all possible regression equations containing 1 through k predictors from a set of k predictors are generated and the "best" equation in terms of predictive utility is selected. With regression equations containing higher order terms, none of the typical automatic search procedures is appropriate. Only a procedure that preserves the hierarchy of variables at each stage should be employed (see also Peixoto, 1987).
Summary

This chapter initially considers two questions stemming from the nonindependence of terms in regression equations containing interactions. First, the interpretation of the lower order coefficients in the presence of an interaction is reviewed. Second, the trade-off between bias and efficiency in the tests of regression coefficients involved in dropping nonsignificant terms from a regression equation is discussed. The important role of the theoretical versus exploratory basis of the interaction is emphasized in the choice of testing procedure and in the interpretation of the results. A variety of global step-down tests of focused hypotheses (e.g., the presence of curvilinearity, the overall effect of a single variable) are introduced. A term-by-term strategy is presented for exploring complex regression equations through the step-down elimination of nonsignificant higher order predictors. Methods for the identification of scale-invariant terms in any regression equation with higher order terms are provided. An advantage of a hierarchical step-down procedure over step-up procedures is presented.
Notes
1. The percentage of variance uniquely shared between the predictor and the criterion in each regression equation is the square of the standardized regression coefficient.
2. An infrequently used alternative procedure is for the analyst to specify the order of tests in a hierarchical step-up procedure. The order of tests is based on strong theory, the temporal precedence of the predictor variables, or both. For example, in a study of the effect of students' race (X), high school GPA (Z), and their interaction (XZ) on college GPA, the researcher could argue that race precedes high school GPA and thus accounts for some of the variance in high school GPA. Race and high school GPA could also (though much more weakly) be argued to precede the race x GPA interaction. Under this strong set of assumptions, the test of b1 in equation 6.3 would provide the test of the race effect, the test of b2 in equation 6.2 would provide the test of the effect of high school GPA, and the test of b3 in equation 6.1 would provide the test of the race x high school GPA interaction. In this strategy, all of the variance shared between race, high school GPA, and their interaction is apportioned to race; the variance shared only between high school GPA and the interaction is apportioned to high school GPA. The test of b3 is once again a test of the unique variance of the interaction. In the absence of a strong theoretical claim that X causes Z, the attribution of the shared variance to X cannot be logically justified. Thus claims for the validity of this approach are strongly dependent on the judged adequacy of the underlying substantive theory.
3. Readers wishing more advanced treatments of the bias versus efficiency issue in sequential testing (pretest estimators) should consult Judge, Hill, Griffiths, Lütkepohl, and Lee (1982, particularly Chapter 21) and Judge and Bock (1978).
4. As will be explained in Chapter 7, this interpretation assumes that the group variable has been dummy coded. If effect coding has been used, this effect represents the difference of each group from the unweighted mean, again evaluated at the mean of the continuous variable.
5. We thank David Kenny for suggesting the computer test for scale invariance.
7 Interactions Between Categorical
and Continuous Variables
Table 7.1
Three Dummy Variable Coding Systems for College Data
Throughout this section we will use the first set of dummy variable codes depicted in Table 7.1a. In this coding system, the first dummy variable (D1) compares E with the LA comparison group, which is assigned a value of 0. The second dummy variable (D2) compares BUS with the LA comparison group. In dummy coding, (a) the comparison group is assigned a value of 0 on all dummy variables, (b) the group being contrasted to the comparison group is assigned a value of 1 on that dummy variable only, and (c) groups not involved in the contrast are also assigned a value of 0 on that dummy variable. Note that dummy codes are partial effects that are conditioned on all G - 1 dummy variables being present in the regression equation.1
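Constructing these codes by hand is mechanical; for example, in Python, with LA as the comparison group as in Table 7.1a (the group labels below are illustrative):

import numpy as np

colleges = np.array(["LA", "E", "BUS", "E", "LA", "BUS"])

D1 = (colleges == "E").astype(float)    # contrasts E with LA
D2 = (colleges == "BUS").astype(float)  # contrasts BUS with LA; LA is 0 on both
print(np.column_stack([D1, D2]))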
Continuing with our example, suppose the researcher has sampled a small number (N = 50) of university graduates and has recorded their college, grade point average (GPA), and starting salary. College and GPA are the predictor variables and starting salary is the outcome variable of interest. These data are presented in Table 7.2. As in Chapter 5, we will consider the interpretation of a series of regression equations of increasing complexity.
Table 7.2
Hypothetical Data: Starting Salaries in Three Colleges
[Individual records list Sub. No., College, GPA, and Salary ($); group means follow.]

College   GPA Mean   Salary Mean ($)
LA          2.80         21,000
E           2.40         28,000
BUS         3.00         24,000
Ŷ = b1D1 + b2D2 + b0          (7.1)

This equation compares the mean starting salaries (Ŷ) of graduates of the three colleges. Let us substitute the dummy codes for each college from Table 7.1a into this equation.

E:   Ŷ = b1(1) + b2(0) + b0 = b1 + b0
BUS: Ŷ = b1(0) + b2(1) + b0 = b2 + b0
LA:  Ŷ = b1(0) + b2(0) + b0 = b0
Table 7.3
Analyses of Progression of Regression Equations
Once again we substitute the dummy codes for each college into the equation:

LA:  Ŷ = b3 GPA + b0
E:   Ŷ = b1 + b3 GPA + b0
BUS: Ŷ = b2 + b3 GPA + b0
[Figure 7.1a,b. Starting salary plotted against GPA (centered and original scaling) for the Engineering, Business, and Liberal Arts colleges; panel a plots Ŷ = b1D1 + b2D2 + b0. NOTE: Simple regression lines for the three colleges are parallel.]
[Figure 7.1c. Starting salary plotted against GPA (centered and original scaling). NOTE: The simple regression lines for the three colleges now have different slopes.]
predicted increase in starting salary of $943. Note that in equation 7.2 the effects of college are statistically removed from the regression coefficient b3 for GPA. If this were not done (i.e., D1 and D2 were not included in the equation), b3 would equal -$1,080, reflecting the relatively low GPAs and high salaries of the engineers.
LA:  Ŷ = b3 GPA + b0
E:   Ŷ = b1 + (b3 + b4) GPA + b0
BUS: Ŷ = b2 + (b3 + b5) GPA + b0

This substitution makes it clear that each college now has its own linear regression line, with each line having a separate intercept and slope. The three regression lines corresponding to each of the colleges are depicted in Figure 7.1c. b3 represents the slope of the line for the LA graduates, b3 + b4 represents the slope of the line for the E graduates, and b3 + b5 represents the slope of the line for the BUS graduates. From Table 7.3a(iii), b3 = $790 is the slope for the LA graduates, b3 + b4 = $123 is the slope for the E graduates, and b3 + b5 = $1,872 is the slope for the BUS graduates.
b0 represents the intercept of the LA regression line, evaluated in the present centered case at the mean of the entire sample of 50 students. b1 represents the distance between the LA and E regression lines and b2 represents the distance between the LA and BUS regression lines, both distances evaluated at the mean GPA of the entire sample. From Table 7.3a(iii), we see that these values are b0 = $20,982, b1 = $7,065, and b2 = $2,619. (Note that the intercept for the LA group is b0 = $20,982, for the E group it is b1 + b0 = 7,064.71 + 20,981.52 = $28,046, and for the BUS group it is b2 + b0 = 2,618.57 + 20,981.52 = $23,600.09.) These coefficients do not equal the values obtained in estimating equation 7.2, which does not contain the interaction terms: Equation 7.2 attributes a portion of the interaction variance to the lower order terms. Note that the distance estimates are only meaningful when interpreted at the mean GPA of the sample, because the regression lines are not parallel.
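The arithmetic of recovering each college's line from the overall equation can be summarized in a few lines of Python. In this sketch, the b4 and b5 values are back-solved from the slope sums reported in the text (b3 + b4 = 123, b3 + b5 = 1,872) rather than read from Table 7.3a(iii), which did not survive reproduction:

    b0, b1, b2 = 20981.52, 7064.71, 2618.57   # intercept, E and BUS dummy coefficients
    b3 = 790.0                                # LA (comparison group) simple slope
    b4, b5 = 123.0 - b3, 1872.0 - b3          # back-solved interaction coefficients

    groups = {"LA": (0, 0), "E": (1, 0), "BUS": (0, 1)}
    for name, (d1, d2) in groups.items():
        slope = b3 + b4 * d1 + b5 * d2
        intercept = b0 + b1 * d1 + b2 * d2
        print(f"{name}: predicted salary = {slope:.0f} * centered GPA + {intercept:.2f}")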
To provide another perspective on these slope and intercept estimates,
we separately computed a simple linear regression of starting salary on GPA (centered) within each college. The results were as follows:

LA:  Ŷ = 790 GPA + 20,982
E:   Ŷ = 123 GPA + 28,046
BUS: Ŷ = 1,872 GPA + 23,600

Note that the estimates for the slope and intercept within each college are, within rounding error, equal to the values reported above based on a model containing the dummy variables, the continuous variable, and the interactions.
A joint test of b4 and b5 provides the overall test of the statistical significance of the interaction. As can be seen in Table 7.3a(iii), this test is significant, as are the individual tests of each of the contrasts involving a dummy variable in interaction with the continuous variable, GPA.
Finally, the point of intersection of any pair of the lines can be calculated to determine whether the lines cross within the useful range of the continuous variable, here GPA. Each pair of lines may cross at a different point. To calculate the point of intersection, the equations for the two regression lines are set equal to each other and solved for the continuous
variable (GPA). For the regression lines representing the LA and E groups above, setting 790(GPA) + 20,982 equal to 123(GPA) + 28,046 and solving yields a crossing point at a centered GPA of 10.59, an original-scale GPA of 13.37, far beyond any possible GPA. Interactions in which the points of intersection fall within the useful range of the continuous variable are termed disordinal; interactions whose lines do not cross within that range are termed ordinal. Two of the three points of intersection fall
outside the possible range of GPA; however, the LA-BUS intersection
(0.35) does fall within the theoretical 0.0-4.0 range of GPA. Note, however, that the LA-BUS intersection point does fall outside of the range of GPAs observed in our sample and very likely represents an impossible GPA for an actual graduate. Hence, the LA-BUS interaction should also be considered to be ordinal.
More generally, the point of intersection on the continuous variable can be calculated using the slopes and intercepts of the two regression lines according to the following equation:

Intersection point = (I2 - I1) / (S1 - S2)     (7.4)

In this equation, I1 is the intercept for group 1, I2 is the intercept for group 2, S1 is the slope for group 1, and S2 is the slope for group 2.
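Equation 7.4 reduces to a one-line helper; checking it against the LA and BUS lines reported above (with rounded inputs, so the result is approximate) reproduces the crossing point quoted in the text:

    def intersection_point(i1, s1, i2, s2):
        """Value of the continuous variable where two regression lines cross (eq. 7.4)."""
        return (i2 - i1) / (s1 - s2)

    x_centered = intersection_point(20982.0, 790.0, 23600.0, 1872.0)
    print(x_centered + 2.78)   # about 0.36 in original GPA units; the text reports 0.35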
Higher Order Effects and Interactions

As we saw in Chapter 5 with continuous variables, higher order terms can also be added to the equation. Because the recommendations of Chapter 5 can be applied here, we will consider only briefly two examples. As a first example, the potential linear and quadratic effects of GPA on starting salary, each of which is assumed to be identical across groups, can be examined by adding a GPA² term to equation 7.2. This results in equation 7.5:

Ŷ = b1D1 + b2D2 + b3GPA + b4GPA² + b0     (7.5)

Extending the example further, if the linear and quadratic effects of GPA are both permitted to differ among the groups, then additional terms must be added to equation 7.5, resulting in equation 7.6:

Ŷ = b1D1 + b2D2 + b3GPA + b4GPA² + b5(D1 x GPA) + b6(D2 x GPA) + b7(D1 x GPA²) + b8(D2 x GPA²) + b0     (7.6)
Equation 7.6 permits both the linear and quadratic components of the GPA-starting salary relationship to differ among the three colleges. Comparison of equation 7.6 with one in which the (D1 x GPA²) and (D2 x GPA²) terms are not included permits a test of the significance of the quadratic component of the college x GPA interaction.
The point(s) of intersection for the two groups can be determined by setting the equations for the two groups to be equal and solving for the continuous variable. For example, substitution into equation 7.6 and algebraic manipulation shows that the LA and E lines will cross at

Intersection points 1, 2 = [-b5 ± √(b5² - 4(b7)(b1))] / 2(b7)
If the solutions for intersection points 1 and 2 are equal, there is only one
intersection point. None, one, or both of the intersection points may occur
within the useful range of the continuous variable.
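Numerically, the crossing points of two quadratic curves are simply the real roots of the difference polynomial, which sidesteps the quadratic formula. A sketch, with arbitrary illustrative values standing in for the coefficient differences of equation 7.6:

    import numpy as np

    b1_diff, b5_diff, b7_diff = 2.0, -3.0, 1.0     # hypothetical LA-vs-E differences
    roots = np.roots([b7_diff, b5_diff, b1_diff])  # solves b7*x**2 + b5*x + b1 = 0
    print(roots[np.isreal(roots)].real)            # candidate intersection points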
In summary, just as was the case for continuous variables, higher order
terms can be added to the regression equation to test specific hypotheses.
Each of the dummy variables representing the categorical variable must
be included as first order terms and in all interactions. As with continuous
variables, all of the lower order terms involved in interactions must be
included in the equation. Quadratic (or other higher order) functions of
dummy variables are never included in the regression equation, as they
do not lead to interpretable effects.
Table 7.4
Unweighted Effects Codes for College Example

        E1    E2
LA      -1    -1
E        1     0
BUS      0     1
Substituting these values into equation 7.1, we obtain the following set of equations for each of the three colleges:

LA:  Ŷ = -b1 - b2 + b0
E:   Ŷ = b1 + b0
BUS: Ŷ = b2 + b0
within each college (see p. 125). The intercepts in the three colleges are $20,982 (LA), $28,046 (E), and $23,600 (BUS). The unweighted mean of the three intercepts is $24,209.3, which is identical to b0. The intercept for LA is -b1 - b2 + b0 = -$3,836.9 - (-$609.2) + $24,209.3 = $20,982. Substituting the appropriate values into the equations above for each of the colleges, the intercept for E is $3,836.9 + $24,209.3 = $28,046 and the intercept for BUS is -$609.2 + $24,209.3 = $23,600.
Similar logic can be applied to the calculation of the slopes. The slopes in the simple regressions calculated separately for each college were $790 for LA, $123 for E, and $1,872 for BUS. The unweighted mean of the slopes in the three colleges is $928.3, which equals b3. The slope for the LA group is b3 - b4 - b5 = 928.3 - (-805.4) - 943.7 = 790. The slope for the E group is b3 + b4 = 123 and the slope for the BUS group is b3 + b5 = 1,872. Thus the differences in the b coefficients between the dummy coding and effects coding analyses directly reflect the differences in meaning. It is important to note that the simple regression equations for each group are identical whether dummy coding or unweighted effects coding is used. This is yet another example of the point made in Chapter 3: Predictor scaling does not affect the simple slope analysis for post hoc probing.
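This invariance is easy to verify numerically. The simulated data below are loosely patterned on the college example (the generating values are arbitrary); fitting the full interaction model under Table 7.1a dummy codes and under Table 7.4 effects codes yields identical slope and intercept estimates for any given group:

    import numpy as np

    rng = np.random.default_rng(1)
    g = np.repeat([0, 1, 2], 30)                  # 0 = LA, 1 = E, 2 = BUS
    x = rng.normal(0.0, 0.5, g.size)              # centered GPA
    true_slopes = np.array([790.0, 123.0, 1872.0])
    true_ints = np.array([21000.0, 28000.0, 23600.0])
    y = true_slopes[g] * x + true_ints[g] + rng.normal(0, 500, g.size)

    def fit(codes):
        c1, c2 = codes[g].T
        M = np.column_stack([c1, c2, x, c1 * x, c2 * x, np.ones_like(x)])
        return np.linalg.lstsq(M, y, rcond=None)[0]

    dummy = np.array([[0, 0], [1, 0], [0, 1]], float)      # Table 7.1a
    effects = np.array([[-1, -1], [1, 0], [0, 1]], float)  # Table 7.4
    for b1, b2, b3, b4, b5, b0 in (fit(dummy), fit(effects)):
        print(b3 + b4, b0 + b1)   # E group slope and intercept: same under both codings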
Given that the two coding systems discussed above produce results that
reflect their different meanings, which coding system should be preferred?
When the interactions involve a categorical variable and a continuous var
iable, dummy variable coding produces immediately interpretable con
trasts with the comparison group, whereas simple effect coding does not.
Hence, if there is interest in contrasts between pairs of groups, dummy
variable coding will be more efficient. When the interactions of interest
involve two (or more) categorical variables, effect coding is preferred be
cause it produces results that are immediately comparable with standard
ANOVA procedures. For example, when there are equal ns in each cell,
effect coding produces main effects and interactions that are orthogonal
just as in ANOVA. However, dummy coding produces correlations be
tween the contrast vectors for the main effects and those for the interac
tions. Thus some (minor) adjustments are needed in the results of the
dummy coded analysis to produce orthogonal estimates of the variance
resulting from the main effects and interactions (see Pedhazur, 1982, p. 369).
Centering Revisited
After the emphasis on centering predictor variables in cases of interactions between two or more continuous variables, the failure to use centered dummy or effect variables (i.e., mean = 0) is striking. However, with categorical variables we are nearly always interested in regression of the predictor variable within the distinct groups themselves rather than at the value of the (weighted) mean of the groups. As we have seen, both dummy coding and effects coding lead to clearly interpretable results in terms of the slopes and intercepts for each group. If, however, we are interested in the average effect of the continuous predictor variable, another coding system, weighted effects coding, should be used. Weighted effects codes follow the same logic as the unweighted codes, but take each group's sample size into account. Darlington (1990) presents a discussion of weighted effects codes. Note that in the special case where the sample sizes in each group are equal, unweighted and weighted effects codes are equivalent.
Post Hoc Probing of Significant Interactions
for students who have a specific value of GPA, say 3.5, corresponding to the cutoff for Dean's list.
Third, we may be interested in identifying the region(s) of the continuous variable where two regression lines can be shown to differ significantly. For example, for what range of values of GPA do the E and BUS students differ in their starting salaries?
Computer Procedure
A very simple computer procedure can be used to test the simple slopes in each of the groups. In our example, when D1 = 0 and D2 = 0, the test of the b3 coefficient in the overall analysis including the categorical variable, the continuous variable, and their interaction [see Table 7.3a(iii)] provides the proper test of the simple slope in the comparison (LA) group. We can take advantage of this fact by noting that the simple slope of the comparison group in the particular dummy coding system is always properly tested in this case. In our example, if we recode the groups according to the dummy coding procedure shown in Table 7.1b, the E group is now the comparison group; its simple slope is b3 = 122.9 and the test of b3 in the regression analysis (t = 0.65, ns) provides the appropriate test of the simple slope. Similarly, if we recode the groups according to the dummy coding procedure shown in Table 7.1c, b3 = 1,872.0 is now the simple slope of the BUS group and the test of b3 (t = 11.29, p < .001) provides the appropriate test of significance. Thus, in our case involving three groups, conducting three separate regression runs in which each group in turn serves as the comparison group produces proper tests of each of the three simple slopes.
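A sketch of this rotation in Python (the helper below is hypothetical; any regression routine that reports coefficient t tests would serve) makes the procedure concrete:

    import numpy as np

    def simple_slope(g, x, y, comparison):
        """Simple slope of `comparison` group: refit the interaction model with
        that group as the all-zero comparison group and read the x coefficient."""
        others = [k for k in np.unique(g) if k != comparison]
        D = np.column_stack([(g == k).astype(float) for k in others])
        M = np.column_stack([D, x[:, None], D * x[:, None], np.ones_like(x)])
        return np.linalg.lstsq(M, y, rcond=None)[0][D.shape[1]]

Calling this once with each group as the comparison group reproduces the three simple slopes ($790, $123, and $1,872 in the example); the accompanying t tests come from whatever package performs the fit.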
Note that in each case, the b1 coefficient represents the distance between the regression lines for E versus BUS when the value of the GPA variable is 0.0. For (a), this distance is evaluated when original GPA is 0.0, which is likely to be of little usefulness. For (b), this distance is evaluated when original GPA is 2.78, which is the mean of the entire sample of students. For (c), this distance is evaluated when original GPA is 3.50, which corresponds to our point of interest, the Dean's list cutoff. Thus we find that at GPA = 3.5 the difference in the predicted starting salaries of E and BUS students is $3,186. The test of b1 is also reported in standard regression packages, t = 12.01, p < .001, and corresponds to the results of the Johnson-Neyman test of significance of the difference between two regression lines at a GPA of 3.5. This computer solution can also be extended to more complex problems such as testing the significance of the distance between two regression planes or two regression curves when specific values are given for each of the continuous predictor variables.
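The recentering trick itself is a single subtraction before refitting. A minimal sketch (hypothetical function name; 3.5 is the Dean's list cutoff of interest):

    import numpy as np

    def distance_at(g, gpa, y, point, comparison):
        """Distance between the comparison group's line and each other group's
        line at GPA = `point`, read off as the dummy-code coefficients."""
        x = gpa - point                          # recenter at the point of interest
        others = [k for k in np.unique(g) if k != comparison]
        D = np.column_stack([(g == k).astype(float) for k in others])
        M = np.column_stack([D, x[:, None], D * x[:, None], np.ones_like(x)])
        b = np.linalg.lstsq(M, y, rcond=None)[0]
        return dict(zip(others, b[:len(others)]))

With E as the comparison group and point = 3.5, the BUS entry estimates the $3,186 E-BUS difference discussed above.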
Several comments about this computer test should be noted.
1. The test should be used with dummy codes because the interest is in comparing differences between group regression lines. Recall that unweighted effects codes compare the regression line with the unweighted mean.
2. When more than two groups are employed, the test of the coefficient for each dummy code provides a test of the difference between the regression lines for the comparison group and the group specified by the dummy code. Note that these tests use the mean square residual (MSres) from the overall regression analysis based on all groups rather than just the MSres from the two groups used in the contrast.
3. Contrasts not involving the designated comparison group can be performed by respecifying the dummy coding as we earlier illustrated for tests of simple slopes.
4. When several pairs of regression lines are being compared, researchers may wish to use the Bonferroni procedure to adjust their obtained values for the number of different tests that are undertaken. Huitema (1980) presents an extensive discussion of the use of the Bonferroni procedure in this context.
For ease of interpretation, the two following regression analyses should be run using the original values (i.e., not centered) of the continuous predictor variable.
1. The outcome variable is regressed on the continuous predictor variable using only the data from group 1:
[The equations for steps II and III of the Potthoff procedure were not recovered from the scan; they involve the critical value 2F(2, N-4) divided by N - 4, the within-group sums of squares SSx(1) and SSx(2), and the difference in group slopes (b(1) - b(2)).]
Note that these equations do not always yield two solutions within the effective range of the predictor variable. Depending on the nature of the interaction, there may be 0, 1, or 2 regions within the possible range of
the predictor variable in which the predicted values of the two regression lines differ.
We will illustrate the calculation of the regions of significance for the two-group case using just the data from the E and BUS groups in our example (i.e., we assume E and BUS comprise the entire sample). The necessary data and their sources for the computations are presented below. [The data listing was not recovered from the scan.] When these values are substituted into the formulas above, and steps II and III are carried out, the values 5.19 and 5.45 are obtained. Recall that we calculated earlier in Chapter 7 (p. 125) that the regression lines for the E and BUS groups had a crossing point of 5.31. Thus, for values of GPA less than 5.19, the E group is predicted to have higher starting salaries than the BUS group; for values of GPA greater than 5.45, the BUS group is predicted to have higher starting salaries than the E group; and for values of GPA between 5.19 and 5.45, the starting salaries of the two groups are not predicted to differ. Given that the possible range of GPA is from 0.0 to 4.0, this means that the E group will always be predicted to have a higher starting salary than the BUS group in the possible range of GPA.
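Where the closed-form steps are unavailable, the boundaries can also be located numerically: find the predictor values at which the squared distance between the two lines equals its critical bound. The pointwise standard error below follows the standard two-group Johnson-Neyman setup with Potthoff's 2F criterion; this is an assumption of the sketch, not a transcription of the lost equations:

    import numpy as np
    from scipy import stats, optimize

    def region_boundaries(x1, y1, x2, y2, alpha=0.05):
        n1, n2 = len(x1), len(x2)
        b1, a1 = np.polyfit(x1, y1, 1)       # slope, intercept: group 1
        b2, a2 = np.polyfit(x2, y2, 1)       # slope, intercept: group 2
        ssx1 = np.sum((x1 - x1.mean()) ** 2)
        ssx2 = np.sum((x2 - x2.mean()) ** 2)
        resid = np.concatenate([y1 - (a1 + b1 * x1), y2 - (a2 + b2 * x2)])
        ms_res = np.sum(resid ** 2) / (n1 + n2 - 4)
        f_crit = 2 * stats.f.ppf(1 - alpha, 2, n1 + n2 - 4)   # Potthoff's 2F(2, N-4)

        def h(x):   # squared line difference minus its critical bound
            se2 = ms_res * (1 / n1 + 1 / n2 + (x - x1.mean()) ** 2 / ssx1
                            + (x - x2.mean()) ** 2 / ssx2)
            return ((a1 + b1 * x) - (a2 + b2 * x)) ** 2 - f_crit * se2

        grid = np.linspace(-10.0, 10.0, 2001)   # search window; widen if needed
        crossings = np.nonzero(np.diff(np.sign(h(grid))))[0]
        return [optimize.brentq(h, grid[i], grid[i + 1]) for i in crossings]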
A few final observations should be made about the Potthoff procedure
for determining regions of significance.
1. The calculations of steps II and III are quite tedious and are currently unavailable in most standard computer packages. Appendix C of this book contains a simple SAS program for Potthoff's extension of the Johnson-Neyman procedure for the simple two-group case. Borich (1971; Borich & Wunderlich, 1973) offers more extensive computer programs for the Johnson-Neyman procedure.
2. Even when regions of significance are obtained within the possible range of the predictor variable, caution should be taken in interpretation. If few or no data points actually fall in the regions, the result represents a serious extrapolation beyond the available data, raising concerns about the meaning of the obtained region. For example, a region of significance less than a cumulative GPA of 1.0 would not be particularly meaningful because few, if any, students ever graduate with such low GPAs. Finally, if the test does not identify any regions of significance within the range of the predictor variable, this indicates that the two regression lines differ for either (a) all values or (b) no values of the predictor variable. In case (a), the b1 coefficient for the group effect will typically be significant, whereas in case (b), it will not be significant.
3. When regions are being calculated for several pairs of regression lines, researchers may wish to substitute the more conservative Bonferroni F for the F-value listed to maintain the studywise error rate at the level claimed (e.g., alpha = .05).
4. Huitema (1980) includes an extensive discussion of applications of the basic Johnson-Neyman procedure to more complex situations. Note, however, that he presents the test that is appropriate when a priori values of the predictor variables have been selected. Once again, the Potthoff extension requires that 2F(2, N-4) be used in place of F(1, N-4) as the critical F-value in the equations.
5. Cronbach and Snow (1977) raise the important methodological point that the interpretation of the regions of significance is clearest in experimental settings in which subjects are randomly assigned to treatment groups. This practice eliminates the possibility that specification error in the regression equation biases the results. Cronbach and Snow present an excellent discussion of the design and analysis of research on aptitude x treatment interactions.
Notes
1. In the dummy coding system of Table 7.1a, dummy code D1 actually contrasts E to LA and E to BUS. Dummy code D2 actually contrasts BUS to E and BUS to LA. The two contrasts share in common the E versus BUS contrast. When D1 and D2 are entered into the same regression equation, the b1 regression coefficient (for D1) reflects that part of the D1 contrast that is independent of D2, that is, the E versus LA contrast. The b2 coefficient (for D2) reflects that part of the D2 contrast that is independent of D1, that is, the BUS versus LA contrast.
2. The simplest method of directly comparing the E and BUS groups is to rerun the regression analysis using a dummy coding system that designates one of these groups as the comparison group. In Table 7.1, both sections (b) and (c) include this comparison. This method can be used in the more complex models described below and is illustrated later in the chapter.
3. If we had not centered GPA, b0 would have represented the predicted value of starting salary for LA graduates with a GPA of 0.0. Presumably, none of these individuals would actually graduate, so that this predicted salary value would not be meaningful.
4. In previous instances of rescaling of first order terms by additive constants, there has been no resultant change in the coefficient for the interaction. The reader should be aware that the change from dummy codes to effect codes is not a mere change in scaling by additive constants. The interaction coefficients do change with a change in coding scheme.
5. The weight vector used to compute the standard errors is w' = [0 0 1 D1 D2], where the values of D1 and D2 are (0 0) for LA, (1 0) for E, and (0 1) for BUS. The general expression for the standard error of the simple slopes is √(w'Sb w), where Sb is the 5 x 5 variance-covariance matrix of the b coefficients available from standard computer packages.
8 Reliability and Statistical Power
duce an analysis of the reliability of interaction terms. We consider strategies that have been proposed to correct for the unreliability of product terms in regression analysis. We also consider the issue of whether spurious effects can be produced by measurement error in tests of interactions.
Reliability
3. that the covariance between the random errors and the true scores is zero, that is, C(e, T) = 0.

Then the reliability of the variable is defined as the proportion of total variance in X that is true score variance,

ρXX = σ²TX / σ²X
Thus, if there is measurement error in predictor X, then bYX is biased: bYX will be closer to zero than the population value it estimates. This bias occurs because the denominator of bYX is the observed variance of predictor X, which is inflated by measurement error. We will refer to bias in which estimates are closer to zero than corresponding parameters as attenuation.
In the two-predictor case, the partial regression coefficient of Y on X, holding Z constant, is

bYX.Z = (rYX - rYZ rXZ) / (1 - r²XZ)
If variable Z has reliability ρZZ less than 1.0, then the numerator of bYX.Z, corrected for measurement error, would be rZZ rYX - rYZ rXZ, where rZZ is the sample estimate of ρZZ. The observed value of the numerator of bYX.Z, taking into account the unreliability of the partialled predictor Z, can vary widely from the true value. A true nonzero relationship between X and Y in the population may not be observed at all in the sample; a true zero relationship between X and Y may appear to be nonzero; and even the sign of the observed regression of Y on X may vary from the true value in a particular sample. The reliability of the partialled variable (here Z) has a profound effect on the bias in the estimators for other variables. Even if one predictor in a set is perfectly reliable, its regression coefficient is subject to bias produced by error in other predictors. Only in the case in which the true scores underlying observed predictors are uncorrelated with one another is each coefficient guaranteed to be attenuated by measurement error, as in the single predictor case (Maddala, 1977).
The last three terms (TXeZ + TZeX + eXeZ) represent the measurement error component of the observed XZ product. These terms have nonzero covariance with the error in their components.

Bias in the Regression Coefficient for the Product Term. Once again the direction of bias depends upon the correlation between true scores of predictors, with the same vagaries as in multiple regression with no higher
The correlation between the observed crossproduct XZ and its observed component X is

ρXZ,X = [C(TXTZ, TX) √(ρXZ,XZ ρXX)] / (σXZ σX)     (8.10)

where TXTZ and TX are the true scores associated with observed variables XZ and X; σXZ and σX are the standard deviations of XZ and X, respectively; and μX and μZ are the population means of X and Z, respectively. Inspection of the numerator of equation 8.10 shows that this correlation goes to zero if X and Z are bivariate normal and centered in the population such that μX = μZ = 0. For the centered case in the population, when the crossproduct term is less than perfectly reliable, the regression coefficient for that term will be attenuated, just as in the one-predictor case. Only in the centered case in the population with bivariate normal predictors X and Z is the uncertainty of the direction of bias of this term removed; the interaction stands alone as does the single predictor, because the true scores of the interaction term and true scores of its components are uncorrelated, that is, C(TX, TXTZ) = C(TZ, TXTZ) = 0.
ρXZ,XZ = (ρ²XZ + ρXX ρZZ) / (ρ²XZ + 1)     (8.12)
Bohrnstedt and Marwell (1978) point out the disconcerting fact that ρXZ,XZ depends on the scaling of the variables. Nevertheless, expression 8.12 is instructive, in that it permits examination of the reliability of the crossproduct XZ in terms of the reliability of its components X and Z, when the predictor variables are centered in the population. Table 8.1 shows the reliability of X and Z required to produce a specified reliability of the crossproduct term, as the correlation between X and Z varies. When X and Z are uncorrelated, they must each have reliabilities of .89 in order that the crossproduct have a reliability of .8. When the individual predictors each have good reliabilities (.84), the reliability of the crossproduct XZ is only .70. Note that as the interpredictor correlation increases, the reliability of the crossproduct term increases slightly; or equivalently, slightly lower reliabilities of the X and Z variables are required to produce a crossproduct term with a specified reliability. What is clear from Table 8.1 is that the individual variables entering a crossproduct term must be highly reliable if even adequate reliability is to be achieved for the crossproduct term.
Table 8.1
Individual Variable Reliabilities Required To Produce a Specified Crossproduct Reliability, as a Function of the Interpredictor Correlation

[Table entries not recovered from the scan.]

NOTE: Each entry is the reliability of X and of Z required to produce a specified reliability of their crossproduct, ρXZ,XZ. This value depends upon the correlation between X and Z. For example, for a crossproduct reliability of .70, given ρXZ = .30, each variable must have a reliability of .82 (alternatively, the product of the reliabilities must be .673).
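Expression 8.12 is simple enough to tabulate directly; the helper below reproduces the examples just quoted (centered, bivariate normal case assumed):

    def crossproduct_reliability(rho_xx, rho_zz, rho_xz):
        """Reliability of XZ from expression 8.12."""
        return (rho_xz ** 2 + rho_xx * rho_zz) / (rho_xz ** 2 + 1.0)

    print(round(crossproduct_reliability(0.89, 0.89, 0.0), 2))  # 0.79, i.e., about .8
    print(round(crossproduct_reliability(0.84, 0.84, 0.0), 2))  # 0.71, the ".70" case
    print(round(crossproduct_reliability(0.82, 0.82, 0.3), 2))  # 0.70, as in the NOTE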
Variances of First Order Terms. Variances of the first order terms are corrected for error using equation 8.13:

σ²TX = σ²X - σ²eX     (8.13)

that is, an estimate of the error variance is subtracted from the observed variance of each predictor. Given known reliability, the error variance is given as σ²eX = σ²X(1 - ρXX).
Covariances between First Order Predictors and of First Order Predictors with the Criterion. Under the assumption of classical measurement theory that errors are uncorrelated, covariances between pairs of first order predictors and between first order predictors and the criterion are unaffected by measurement error (see equation 8.4). Hence no correction is required in the covariances among first order terms or of first order terms with the criterion.
Variance of the Product Term. The variance of the product term σ²XZ is a complex function of the means and variances of the components and the covariance between the components (see Appendix A for a derivation of the following expression):

[Equation 8.15 not fully recovered from the scan.]

To correct the variance of the crossproduct term for error, observed values are substituted into equation 8.15 to obtain an estimate of the error variance contained in the observed variance. Then the observed variance of the crossproduct term is reduced by subtracting the estimate of the error variance.
consistent with our prior usage). For example, the covariance between the two first order variables X and Z is given as follows:

C(X, Z) = C(TX, TZ) + C(eX, eZ)

The second and third terms in equation 8.17 represent the error portion of the covariance between the observed crossproduct and the component. A comparison of equations 8.17 and 8.9 shows that permitting errors to be correlated introduces an additional term reflecting the error covariances, namely, 2T̄X C(eX, eZ).
To correct the covariance matrix of the predictors for error, Heise sub
stituted sample means for true scores. He estimated error variances and
covariances from multiple measurements on each value of each predictor.
The mean observation on one such point was taken to represent a true
score. The variance of the observations on the single predictor value pro
vided an estimate of the error variance; the covariance between the repeated observations on single cases across two predictors provided an
estimate of the covariance between errors. These estimates were pooled
across all cases and were used to adjust the covariance matrix of the pre
dictors. Corrections for the variances of second order (XZ) and third order
(XZW) crossproducts and their covariances followed Heise’s derived
expressions. Reliabilities of the nine individual scales all exceeded .90
(except for one scale for one subgroup of raters). In this case the corrected
regression estimates did not differ dramatically from the uncorrected es
timates.
In a subsequent simulation study Heise varied sample size (n = 200, 350, 500) and the reliability of the first order predictors (.70, .90). Bias
was always reduced on average with the method, even for the smallest
sample size. However, the corrected estimates varied substantially across
replications, particularly with reliabilities of .70, and large sample size
did not compensate for unreliability.
We are not surprised at the unstable solutions obtained when the reli
of SES; that is, they contain measurement error. However, the common variance shared by all three indicators provides an excellent representation of the SES latent variable. A measurement model can be constructed to represent SES. The set of measurement equations would be as follows:
X1 = λ1(SES) + e1
X2 = λ2(SES) + e2
X3 = λ3(SES) + e3
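A small simulation conveys why the shared variance isolates SES: the indicators correlate with one another only through the latent variable. The loadings and error variance below are hypothetical:

    import numpy as np

    rng = np.random.default_rng(7)
    ses = rng.normal(0.0, 1.0, 5000)               # latent SES scores
    lam = np.array([0.9, 0.8, 0.7])                # hypothetical loadings
    X = ses[:, None] * lam + rng.normal(0.0, 0.5, (5000, 3))   # X_i = lam_i*SES + e_i
    print(np.corrcoef(X, rowvar=False).round(2))   # indicators correlate only via SES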
e.g., West, Sandler, Pillow, Baca, & Gersten, 1991, for an empirical illustration).
Kenny and Judd (1984) have developed methods for testing interactions and curvilinear effects involving continuous latent variables. They show that, by forming products of the indicator variables, all of the information is available that is needed to estimate models containing X*² and X*Z*, where X* and Z* are latent variables. For example, to estimate X*² in the curvilinear case with two measured variables, all information needed to estimate the variance and covariance terms in the model can be derived based on the products of the measured variables X1 and X2 (i.e., X1², X2², and X1X2). For the latent variable X*Z* interaction with two measures each of X* and Z*, the crossproducts of the measured variables X1, X2 and Z1, Z2 (i.e., X1Z1, X1Z2, X2Z1, and X2Z2) provide the starting point for the estimation of the model. In each case, the products of the measured variables become the indicators for the corresponding latent variable. In a simulation of the performance of the model in the presence of measurement error in the observed variables, Kenny and Judd showed that their approach provided good estimates of the parameters in a known underlying true model.
Bollen (1989) noted three obstacles to the use of the Kenny and Judd
approach. First, there was initially considerable difficulty in implement
ing the procedure in the widely available EQS and LISREL programs,
although Hayduk (1987) and Wong and Long (1987) have recently de
scribed successful methods. Second, the formation of the products of the
indicator variables violates the assumption of multivariate normality that
is necessary for the estimation procedure to produce correct standard er
rors. Alternative estimation procedures (Browne, 1984) exist in the EQS
and LISREL programs that do not make this assumption; however, these
procedures require large sample sizes to produce proper estimates (Bentler & Chou, 1988; West & Finch, in press). Third, Kenny and Judd assumed that the latent variables and disturbances (terms representing unexplained variation in a latent criterion variable) of the components of the latent
variables are normally distributed. These assumptions can be tested using
the EQS program. Again, violation of this assumption may require the
use of alternative estimation procedures or respecification of the model to
achieve proper estimates.
The Kenny and Judd approach to correction of measurement error in
regression equations involving curvilinear effects or interactions has shown
considerable promise to date in a small number of studies. However, the
approach has to date been difficult for most researchers to implement,
precluding its more frequent use in the literature.
Can Measurement Error Produce Spurious Effects?
We have focused on the extent to which measurement error attenuates regression estimates. It is also possible that measurement error might lead to effects being observed in the sample that do not actually exist in the population, that is, spurious effects.
error between predictors and criteria, and (c) the level of predictor reliability
(random measurement error). In his study using a large sample size (n =
760) there was no evidence that correlated measurement errors produced
spurious interactions, although these systematic errors did attenuate the
size of the estimates of interaction effects. Random measurement error
attenuated both first order and interactive effects. When the reliability of
the predictor variables comprising the interaction was .80, the variance
explained by the interaction effect was reduced by half relative to the true
variance in the population without measurement error.
Busemeyer and Jones (1983) and Darlington (1990) have noted one set of conditions in which spurious interactions can be produced in observed data even though none exist in the population. The conditions involve nonlinear measurement models, for example, X = k(TX)^1/2 + eX, that violate the fundamental assumption of classical test theory that observed scores must be linearly related to the underlying true scores. Thus measurement instruments that have only ordinal rather than interval level properties can produce spurious estimates of interactions and curvilinear effects. Advanced methods of estimating nonlinear measurement models do exist. The interested reader may consult Etezadi-Amoli and McDonald (1983) and Mooijaart and Bentler (1986) for discussion of one class of methods for estimating nonlinear measurement models.
Comment
Statistical Power
Many authors have commented on the weak power of tests for interaction terms in MR, particularly in the face of measurement error (e.g., Busemeyer & Jones, 1983; Dunlap & Kemery, 1988; Evans, 1985). The question has been raised with respect both to interactions involving two (or more) continuous variables as well as to interactions involving categorical and continuous variables (see Chaplin, 1991, in press; Cronbach, 1987; Cronbach & Snow, 1977; Dunlap & Kemery, 1987; Morris, Sherman, & Mansfield, 1986; Stone & Hollenbeck, 1989). In this section we explore the power of tests of the interaction in the equation Ŷ = b1X + b2Z + b3XZ + b0, closely following Cohen's (1988) approach. We begin by considering relationships among various measures of the impact of the interaction: effect size, partial regression coefficient, partial correlation, and gain in prediction (difference between squared multiple correlations or semipartial correlations). We examine sample size requirements necessary to detect the XZ interaction when X and Z are measured without error. We then explore the impact of measurement error on effect size, variance accounted for, power, and sample size requirements as a function of (a) the correlation between predictors X and Z and (b) the variance accounted for by the first order effects. Finally, the results of recent simulation studies of power of tests of interactions are presented.
Statistical power depends in part on the specific statistical test that is chosen (e.g., parametric tests that use all of the information in the data are typically more powerful than nonparametric alternatives). Cohen (1988) has suggested that .80 is a good standard for the minimum power necessary before undertaking an investigation. This suggestion has been accepted as a useful rule of thumb throughout the social sciences. In considering the power for the XZ interaction term in an equation
containing X and Z, we will treat the first order terms X and Z as a "set" of variables: set M for first order ("main") effects. The XZ term will constitute a second "set" I for interaction.⁴ With Y as the criterion, we define the following terms:

r²Y.MI: squared multiple correlation resulting from combined prediction by two sets of variables M and I, where M consists of X and Z, and I consists of XZ

r²Y.M: squared multiple correlation resulting from prediction by set M only

r²Y(I.M): the squared semipartial (or part) correlation of set I with the criterion; r²Y(I.M) = r²Y.MI - r²Y.M. This is the gain in the squared multiple correlation due to the addition of set I (the interaction) to an equation containing X and Z (set M). Put otherwise, it is the proportion of total variance accounted for by set I, over and above set M.

r²YI.M: the squared partial correlation of set I with the criterion, or the proportion of residual variance after prediction by set M that is accounted for by set I.

f²: effect size for set I over and above set M, where effect size is defined (Cohen, 1988) as the strength of a particular effect, specifically the proportion of systematic variance accounted for by the effect relative to unexplained variance in the criterion:

f² = (r²Y.MI - r²Y.M) / (1 - r²Y.MI)
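As a quick sketch of the computation:

    def f_squared(r2_y_mi, r2_y_m):
        """Effect size f^2 for set I over and above set M."""
        return (r2_y_mi - r2_y_m) / (1.0 - r2_y_mi)

    print(round(f_squared(0.30, 0.20), 3))   # a gain of .10 over r2_Y.M = .20 gives .143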
The reader should note that the numerators of the squared partial correlation and effect size are identical and are equal to r²Y(I.M), the squared semipartial correlation. However, the denominators of the squared partial correlation, r²YI.M, and the effect size, f², differ. The denominator of r²YI.M is the residual variance after prediction from set M; the denominator of f² is the residual variance after prediction from sets M and I. Most importantly, the reader should note that the squared semipartial correlation, or gain in squared multiple correlation with the addition of set I to set M, is not linearly related to effect size. It is the effect size f² (or the
Table 8.2
Effect Sizes Associated with Varying Combinations of r²Y.M and r²Y.MI (Adapted from Jaccard et al., 1990, Table 3.1)

[Table entries not recovered from the scan.]

NOTE: This table is adapted from Jaccard et al. (1990, Table 3.1), which provides sample size requirements for the test of the XZ interaction in the regression equation Y = b1X + b2Z + b3XZ + b0 for power = .80 at α = .05. Effect size estimates have been added.
*Effect size
**Sample size required for power .80 at α = .05
Table 8.3
Variation in Sample Size Requirements, Effect Size (f²), and Squared Partial Correlation (r²YI.M) at α = .05 for Constant Gain in Prediction or Semipartial Correlation [r²Y(I.M)]

[Table entries not recovered from the scan.]
The sample sizes required to detect the XZ interaction (set I) are n = 26, 55, and 392, for large, moderate, and small effect sizes, respectively. Readers should note that the n of 55 for moderate effect size exceeds the majority of ns in Jaccard et al.'s (1990) complete Table 3.1, because the majority of effect sizes in that table exceed f² = .15, the value that defines a moderate effect size. Readers should not be misled to think that small sample sizes suffice to detect interaction effects of the strength typically found in the social sciences.
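These sample sizes can be verified with a short power routine based on the noncentral F distribution, using Cohen's (1988) noncentrality approximation L = f²(u + v + 1); small rounding differences from the published tables are expected:

    import numpy as np
    from scipy import stats

    def power_xz(f2, n, n_predictors=3, alpha=0.05):
        """Power of the 1-df test of XZ in Y = b1X + b2Z + b3XZ + b0."""
        u, v = 1, n - n_predictors - 1          # numerator and denominator df
        lam = f2 * (u + v + 1)                  # noncentrality, Cohen's L
        f_crit = stats.f.ppf(1 - alpha, u, v)
        return 1.0 - stats.ncf.cdf(f_crit, u, v, lam)

    for f2, n in [(0.35, 26), (0.15, 55), (0.02, 392)]:
        print(f2, n, round(power_xz(f2, n), 2))   # each approximately .80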
is, rYX = rYZ), we solve for the values of these validities that produce r²Y.M.
We then introduce measurement error by assuming that predictors X and Z and the criterion Y have reliability .80. We attenuate the correlations involved in the power analysis (r²YX, r²YZ, r²XZ, r²Y.M, r²YI.M, r²Y(I.M)) for measurement error under the assumptions of classical test theory (Lord & Novick, 1968) and Bohrnstedt and Marwell's (1978) work on the reliability of the product term. Finally, we recompute the effect size f² for the test of the interaction based on the attenuated correlations. From f² we recalculate statistical power, assuming that the researcher had used the sample sizes required for power .80 in the error-free case (n = 26, 55, 392 for large, moderate, and small effect sizes, respectively). For moderate and large effect sizes, the analysis is repeated with reliability of X, Z, and Y of .70.
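The attenuation step itself rests on the classical result that an observed correlation is the true correlation multiplied by the square roots of the two reliabilities; a minimal sketch with a hypothetical validity:

    import numpy as np

    def attenuate(r_true, rel_a, rel_b):
        """Observed correlation implied by classical test theory."""
        return r_true * np.sqrt(rel_a * rel_b)

    print(attenuate(0.45, 0.80, 0.80))   # a hypothetical validity of .45 drops to .36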
Table 8.4 shows the effect of reduced reliability on effect sizes and variance accounted for by the interaction. Table 8.5 shows the effect of reduced reliability on the power of the test for the interaction assuming the ns necessary for the error-free case were utilized. It also shows the sample size required to produce power .80 for the interaction at α = .05.
Table 8.4
Impact of Reduced Reliability on Variance Accounted for [r²Y(I.M)] and Effect Size (f²) of the Interaction in the Regression Equation Ŷ = b1X + b2Z + b3XZ + b0

                r²Y.M = 0       r²Y.M = .20     r²Y.M = .50
Reliability
1.00            .13    .13      .10    .10      .07    .07
 .80            .07    .07      .05    .06      .03    .04
 .70            .04    .05      .04    .04      .02    .03

[The two column labels within each level of r²Y.M were not recovered from the scan.]
Table 8.5 addresses statistical power with reliabilities less than 1.00. The table is structured so that at reliability 1.00, the power for each effect size is .80. Note that the required sample sizes differ across effect sizes to produce constant power of .80. For the large effect size portion of Table 8.5, all power calculations are based on n = 26; for the moderate effect size portion of the table, on n = 55; for the small effect size portion, on n = 392. The pattern for loss of statistical power follows what we have already seen for effect size and variance accounted for. Power is reduced by up to half by having reliabilities of .80 rather than 1.00 and is reduced up to two thirds when reliabilities drop to .70.
The greater the proportion of variance accounted for by the first order effects, the sharper is the decline in the effect sizes, variance accounted for, and power of the test for the interaction term as reliability decreases. Required sample sizes increase accordingly. We saw earlier in this chapter that as interpredictor correlation increases, the reliability of the product
Table 8.5
Impact of Reduced Reliability on Power of the Test for the Interaction and on Sample Size Required for Power .80 at α = .05 in the Regression Equation Ŷ = b1X + b2Z + b3XZ + b0

                r²Y.M = 0       r²Y.M = .20     r²Y.M = .50
                rXZ = 0  .50    rXZ = 0  .50    rXZ = 0  .50

Reliability     Power at n = 55
1.00            .80    .80      .80    .80      .80    .80
 .80            .48    .52      .44    .49      .37    .41
 .70            .34    .40      .31    .36      .25    .29

[The remaining panels (power at n = 26 and n = 392, and required sample sizes) were not recovered from the scan.]
Model 1: Y = 0X + 0Z + 1XZ,
Model 2: Y = 1X + 1Z + 1XZ,
in which each first order term and the interaction share equally in prediction.
[Table comparing, for Model 1 (Y = 0X + 0Z + 1XZ) and Model 2 (Y = 1X + 1Z + 1XZ), the power observed in simulation with the power computed by Cohen's procedure, as a function of the predictor reliabilities ρXX and ρZZ; entries not recovered from the scan.]
sizes of n = 100. They reported that the interaction was detected in 100% of 1000 simulated samples. Measurement error was added to the criterion, but the amount was not specified. If we assume criterion reliability .70, equivalent to that in Dunlap and Kemery (1988), then we would expect 100% of the interactions to be detected, because the power approaches 1.00 for this test even with substantially smaller samples of n = 30. Even when the reliability of the criterion is .50, for n = 100 with perfectly measured bivariate normal and uncorrelated predictors, effect size is large (f² = .33), and power approaches 1.00. Once again, interactions are detected with 100% probability because they have very large effect sizes.
Paunonen and Jackson (1988) also provide simulations that match in structure the real world data of Morris et al. (1986), and hence are more realistic in terms of effect sizes. For these simulations, moderator effects were detected on average only 4.1% of the time with samples of size n = 100. Such a detection rate is associated with effect sizes substantially below the small effect size f² = .02. Indeed, for one case considered by Morris et al. (1986), the effect size f² equaled .001, according to a reanalysis by Cronbach (1987).
Two recommendations come from this review of simulations of power of interactions. First, it would be very useful for our understanding of the simulations if authors would report effect size measures. This practice would permit comparison both across studies and with normative expectations for effect sizes in social science data. Second, in the absence of such reports, readers should compute the effect sizes studied in the simulation so that they are not misled by reports of very high power in the absence of measures of strength of the interaction.
Finally, we offer a caution regarding the interpretation of tests of regression models containing random measurement error in the predictors. Measurement error takes a greater toll on the power of interaction effects, relative to first order effects. Measurement error also appears to produce spurious first order effects but not spurious interactions (Dunlap & Kemery, 1988; Evans, 1985). Taken together, these two factors will lead to greater apparent empirical support for theoretical predictions of main effects at the cost of support for theoretical predictions of interactions.
Morris et al. (1986) proposed principal component regression (PCR) as a more powerful approach than OLS multiple regression for the analysis of regression models whose predictors contain measurement error.
Summary
Notes
1. For those unfamiliar with path analysis, the basis of the analysis is typically familiar ordinary least squares regression. The terms path coefficient and structural coefficient refer to standardized and unstandardized regression coefficients, respectively (Duncan, 1975).
2. Common approaches to measuring reliability make different assumptions concerning the nature of the measures (e.g., test-retest correlations require parallel form equivalence; Cronbach's alpha requires parallel form or tau equivalence). Alwin and Jackson (1980) and Kenny (1979) present discussions of reliability of measures having varying properties.
3. The constraint on the structure of a covariance matrix is that C(X, Z) ≤ sX sZ. If all off-diagonal elements meet this condition, then the covariance matrix will be either positive definite or positive semidefinite. A positive definite (PD) matrix is of full rank; that is, it has all nonzero characteristic roots, a positive determinant, and may be inverted, providing the usual OLS solution for regression coefficients: b = S⁻¹xx sxy, where S⁻¹xx is the inverse of the predictor covariance matrix. A positive semidefinite (PSD) matrix is not of full rank; there are linear dependencies among the predictors on which the covariance matrix is based (e.g., entering three predictors that must together sum to 100 points). A PSD matrix has at least one zero characteristic root, a zero determinant, and cannot be inverted; hence there is no solution for the regression coefficients.
When a covariance matrix is adjusted for error a third condition may arise: matrix indefiniteness (Feucht, 1989). The condition C(X, Z) ≤ sX sZ is not met; there is at least one negative characteristic root and a nonzero (though negative) determinant. Such a matrix may be inverted, and hence a solution for "corrected" regression coefficients derived, even though "such a moment matrix violates the structure and assumptions of the general linear model, and is inadmissable in regression analysis" (Feucht, 1989, p. 80). Sometimes, but not always, an indefinite covariance matrix of predictors will generate negative standard errors. Whenever a corrected covariance matrix is employed, its determinant should be checked before beginning the analysis.
4. We are following Cohen's (1988) development and using similar notation for effect size and other terms; thus the reader will find Cohen's highly informative and useful treatment of power quickly accessible.
5. There is a direct relationship between the reliability of the product term (ρXZ,XZ) and r²Y(I.M): r²Y(I.M) = ρXZ,XZ(b²3 σ²TXTZ), where b3 is taken from Ŷ = b1X + b2Z + b3XZ + b0, and σ²TXTZ is the variance of the product of true scores TX and TZ. Thus the percentage of attenuation in the variance accounted for in the criterion by the product term is a direct function of product term reliability (Busemeyer & Jones, 1983).
6. PCR regression was developed by Mansfield, Webster, and Gunst (1977) to adjust for multicollinearity in predictor matrices. To begin, the characteristic roots λi (i = 1, ..., p) and characteristic vectors ai (i = 1, ..., p) of the p x p covariance matrix of the predictors are determined. Component scores are then formed on the set of p principal components by postmultiplication of the raw (n x p) data matrix X by the matrix of characteristic vectors, that is, ui = Xai, where ui is the vector of component scores on the ith principal component. Each ui is a linear combination of all the predictors. Those components (u1, u2, ..., uk) (k ≤ p) associated with the large characteristic roots (λ1, λ2, ..., λk) are retained for analysis. Components associated with very small characteristic roots are deleted. The criterion Y is regressed on the k orthogonal principal components that have been retained: Ŷ = d1u1 + d2u2 + ... + dkuk + d0. The regression coefficients from that analysis (d1, d2, ..., dk) are converted back into regression coefficients for the original predictors via the matrix of retained characteristic vectors, yielding the principal components regression equation in terms of the original predictors: Ŷ = bpcr1X + bpcr2Z + bpcr3XZ + bpcr0. The bpcr are biased estimates but they are efficient. See Mansfield et al. (1977) or Morris et al. (1986) for the complete derivation of the analysis.
The reader may recall from Chapter 4 that centering the predictors will remove most of the correlation between the crossproduct term and its component first order predictors. Hence, in simple regression equations containing an interaction, for example, Ŷ = b1X + b2Z + b3XZ + b0, multicollinearity is not the source of the low level of statistical power.
9 Conclusion:
Some Contrasts Between ANOVA and MR
in Practice
In this book we have shown how many of the impediments to the under
standing of interactions among continuous variables can be overcome.
The interpretation of first order effects and interactions within the MR
framework was presented in some depth for the simple case of two vari
ables having only linear first order effects and a linear by linear interac
tion. This interpretation was then extended to more complex cases having
more than two interacting variables, curvilinear effects, or combinations
of categorical and continuous predictor variables.
Post hoc methods for probing significant interactions by testing simple
slopes, determining the crossing point of the regression lines, and graph
ically displaying the interaction were presented for both simple and com
plex regression equations. The lack of invariance of regression coefficients under linear transformation was shown to have no impact whatever upon the form or interpretation of the interaction: Simple slopes and the status of the interaction as ordinal versus disordinal with regard to a predictor remain invariant under such transformations. The gain in interpretability that results from centering the variables prior to analysis was presented. A variety of tests for exploring regression equations containing higher order effects were explained, including both global tests of hypotheses and term by term step-down procedures that permit scale-free
to avoid lowering power, increasing Type I error, and increasing the complexity of understanding the results, particularly when the predictors to be introduced are highly correlated with those of theoretical interest.
In ANOVA applied to randomized experiments, researchers typically have not considered the precise functional form of systematic variation in model specification. The standard ANOVA analysis employs a fully saturated model in which all terms through the highest order possible are always included, whether or not these higher order effects are theoretically expected to occur. Unanticipated higher order effects are detected both during omnibus effect testing and in post hoc probing, for example, a significant interaction not expected from theory but uncovered during effect testing, or an unexpected curvilinear relationship uncovered with trend analysis where only a linear relationship was expected. Less frequently recognized by ANOVA researchers is that the failure to specify a functional form does extract a penalty in terms of efficiency, as the omnibus tests of the significance of main effects and interactions in ANOVA aggregate several different functional forms, only some of which may be of theoretical interest.
The apparent discrepancy between the need to consider model specification in ANOVA versus MR is further undermined when design considerations are introduced. ANOVA tests the aggregation of all possible functional forms of the main effects and interactions within the constraints imposed by the sampling of the levels of each factor. The choice of too few levels of a quantitative factor in ANOVA is the same specification error as failing to include nonlinear terms in a regression model. The choice of a 2 x 2 factorial design means that only linear main effects in X and Z and the linear X by linear Z interactions can be detected, just as in the simple regression equation containing an interaction we considered in Chapter 2. However, in experimental designs, misspecification can only be addressed by the redesign of the experiment and the collection of new data. Thus researchers conducting randomized factorial experiments have implicitly addressed the issue of functional form at the design phase of the research; at the analysis phase, ANOVA aggregates all possible functional forms within the constraints imposed by the design.
However, when ANOVA is employed in the situation addressed in this book in which the factors are comprised of two or more measured variables, the functional form problem now arises in the analysis phase of the research. The researcher must decide into exactly how many levels and where each variable must be split to represent adequately the expected functional form. The use of too few levels in ANOVA with measured
Appendix A: Mathematical Underpinnings
We are all familiar with the fact that if one makes simple additive transformations on a variable (i.e., adding a constant), the variance of the variable, and its covariances and correlations with other variables, remain unchanged. Only the mean changes, by the amount of the additive constant. Thus we expect that additive transformations of predictor variables will have no effect on the outcomes of multiple regression analysis. If predictor X is replaced with a variable X + c, where c is a constant, all regression coefficient estimates, and the variances and covariances of these estimates, are expected to remain constant. This conclusion is true so long as the regression equation contains only first order terms. In this case, only the regression constant will be affected by changes in predictor variables.
The same pattern of invariance does not hold for product terms. If a constant
is added to a variable involved in a product term, the variance of the product term
as well as the covariances and correlations of that product term with other terms
are changed. Thus regression analyses containing product terms are scale depen
dent. The estimates of the regression coefficients, their variances and covariances,
and their standard errors are altered by changes in scale. Only the raw regression
coefficient for the highest order term and its standard error remain unchanged
under additive transformations (Cohen, 1978).
Bohrnstedt and Goldberger (1969) provide a straightforward demonstration of the algebraic basis of this failure of invariance. Here we show (a) how the expected value (or mean) of a crossproduct term XZ depends on the expected values (or means) of the variables, X and Z, of which it is comprised; (b) how the variance of the crossproduct XZ term depends upon the expected values of X and Z; and (c) how the covariance of a crossproduct term XZ with another variable Y depends upon the expected values of X and Z. Having shown (c), it is easy to see how additive transformations of X and Z alter the results of regression analyses containing the crossproduct term XZ.
The Expected Value (Mean) of a Product Term
We begin with two variables X and Z, making no assumption about their distribution. Think of these as two predictors in a regression analysis. We define deviation (centered) scores for each of the predictors, x = X - E(X) and z = Z - E(Z). Their expected values (means) are E(X) and E(Z), with variances V(X) = E(x²) and V(Z) = E(z²), and covariance C(X, Z) = E(xz).
First, we will form the crossproduct term XZ just as would be done in a regret
sion analysis involving interaction. We form the prossproduct of the raw scores:
• • ■ ■ •
XZ = [x 4- E(X)][z 4- E(Z)] (A ir I
• . . ?. .’. Г.
But for deviation scores E(x) » E(z) = 0, so the expected value of the co
product is as follows:
Thus the mean of the crossproduct term depends in part upon the means of the
two variables. This expression holds regardless of the distributions of X and Z.
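For example (a numerical illustration of ours, not in the original): if E(X) = 2,
E(Z) = 3, and C(X, Z) = 1, then E(XZ) = 1 + (2)(3) = 7. Adding the constant
c = 5 to X leaves C(X, Z) unchanged but shifts the mean of the product:
E[(X + 5)Z] = 1 + (7)(3) = 22.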
The Variance of a Product Term

The variance of the crossproduct term is V(XZ) = E{XZ − E(XZ)}². We evaluate
this expression by substituting expression A.2 for XZ and A.4 for E(XZ). First, we
form the square as follows:

V(XZ) = E{xz + E(X)z + E(Z)x + E(X)E(Z) − [C(X, Z) + E(X)E(Z)]}²
      = E{xz + E(X)z + E(Z)x − C(X, Z)}²                       (A.6)

Expanding the square and taking expectations term by term yields the general result

V(XZ) = E(x²z²) − C²(X, Z) + [E(X)]²V(Z) + [E(Z)]²V(X)
        + 2E(X)E(Z)C(X, Z) + 2E(X)E(xz²) + 2E(Z)E(x²z)         (A.7)
This expression can be simplified if we assume that X and Z are bivariate normal;
this is the usual assumption in regression analysis. If variables X, Z, and W are
multivariate normal, then all odd moments (first, third, fifth, etc.) are zero
(e.g., E(x) = E(xzw) = E(xz²) = E(x²z) = 0). Moreover, E(x²z²) = V(X)V(Z)
+ 2C²(X, Z). Then equation A.7 simplifies to

V(XZ) = [E(X)]²V(Z) + [E(Z)]²V(X) + 2E(X)E(Z)C(X, Z)
        + V(X)V(Z) + C²(X, Z)                                  (A.8)

This is the same as expression (6) in Bohrnstedt and Goldberger. What is impor-
tant to note in expression A.8 is that V(XZ) depends upon the expected values
(or means) of X and Z. If constants are added to X, Z, or both, then V(XZ) will
change.
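As a numerical illustration (ours), let X and Z be bivariate normal with V(X) =
V(Z) = 1 and C(X, Z) = .5. If E(X) = E(Z) = 0, equation A.8 gives V(XZ) =
(1)(1) + (.5)² = 1.25. Adding c = 5 to X, so that E(X) = 5, gives V(XZ) =
(25)(1) + 0 + 0 + (1)(1) + (.5)² = 26.25: the variance of the product term has been
changed by a transformation that leaves V(X) itself untouched.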
The Covariance of a Product Term with Another Variable

Consider the covariance between a crossproduct term XZ and the criterion Y in
a regression analysis. By definition,

C(XZ, Y) = E{[XZ − E(XZ)][Y − E(Y)]}                           (A.9)
where

Y − E(Y) = y                                                   (A.10)

and

XZ − E(XZ) = xz + E(X)z + E(Z)x − C(X, Z)                      (A.11)

Note that expression A.11 above is formed by taking the difference between
expressions A.2 and A.4. We multiply expressions A.10 and A.11, which yields

[XZ − E(XZ)][Y − E(Y)] = xzy + E(X)zy + E(Z)xy − C(X, Z)y      (A.12)

We take expectations, making note that E(xy) = C(X, Y), E(zy) = C(Z, Y),
and E(y) = 0, so that

C(XZ, Y) = E(xzy) + E(X)C(Z, Y) + E(Z)C(X, Y)                  (A.13)

Expression A.13 shows that the covariance between a product term XZ and an-
other variable Y depends upon the expected values of the variables involved in
the product term but not of the other variable. Translating this into regression
with product terms, transforming the criterion Y by additive constants will
have no effect on the regression analysis.
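The dependence on E(X) and E(Z) can be made explicit (our illustration). If X is
replaced by X' = X + c, the deviation scores are unchanged, so by equation A.13,

C(X'Z, Y) = E(xzy) + [E(X) + c]C(Z, Y) + E(Z)C(X, Y) = C(XZ, Y) + cC(Z, Y)

An additive shift in a predictor thus changes the covariance of the product term
with the criterion by cC(Z, Y), whereas an additive shift in Y changes nothing.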
Centered Variables

We now apply these results to centered predictors. First, substituting E(X) = 0
and E(Z) = 0 into equation A.4 gives

E(XZ) = C(X, Z)                                                (A.14)

that is, the mean of the crossproduct term will equal the covariance between X
and Z. Note that even if X and Z are centered, the crossproduct XZ will not usually
be centered.
Second, with no distributional assumptions, we substitute E(X) = 0 and E(Z)
= 0 into equation A.7, finding

V(XZ) = E(x²z²) − C²(X, Z)                                     (A.15)

If, in addition, X and Z are bivariate normal, then E(x²z²) = V(X)V(Z) +
2C²(X, Z), so that

V(XZ) = V(X)V(Z) + C²(X, Z)                                    (A.16)

Third, substituting E(X) = 0 and E(Z) = 0 into equation A.13 gives C(XZ, Y)
= E(xzy) (A.17). If X, Z, and Y are multivariate normal, this third moment is
zero, so that

C(XZ, Y) = 0                                                   (A.18)
This result seems surprising. It says that when two predictors X and Z and a
criterion Y are multivariate normal, the covariance between the product XZ and
Y will be zero. Does this mean that there is necessarily no interaction if X, Z, and
Y are multivariate normal? Yes. Turning the logic around, if there exists an in-
teraction between X and Z in the prediction of Y, then, necessarily, the joint dis-
tribution of X, Z, and Y is not multivariate normal. Recall, however, that in fixed
effects multiple regression, the distributional requirement applies only to the cri-
terion. Otherwise stated, only the measurement error in the criterion must be nor-
mally distributed. Thus the result in equation A.18 does not present a problem
for significance testing.
In regression analysis we are very concerned with multicollinearity or very high
correlations among predictors. The covariance between a crossproduct term and
one of its components, say X, follows from replacing Y with X in equation A.13:

C(XZ, X) = E(x²z) + E(X)C(X, Z) + E(Z)V(X)                     (A.19)

For centered predictors this reduces to C(XZ, X) = E(x²z) (A.20), and if X and
Z are bivariate normal, this odd moment is zero as well, so that

C(XZ, X) = 0                                                   (A.21)
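Equation A.21 underlies the recommendation made throughout the book to center
predictors before forming crossproducts. As a sketch (ours; the data set, means,
and seed are arbitrary), the correlation of a predictor with its crossproduct term
can be compared under uncentered and centered scaling:

   DATA CENTERED;                       /* hypothetical demonstration         */
      SEED = 1991;
      DO I = 1 TO 500;
         X = 10 + 2*RANNOR(SEED);       /* uncentered predictor, mean 10      */
         Z = 5 + RANNOR(SEED);          /* uncentered predictor, mean 5       */
         XZ = X*Z;                      /* crossproduct, uncentered           */
         XC = X - 10;                   /* centered at the population mean    */
         ZC = Z - 5;
         XCZC = XC*ZC;                  /* crossproduct, centered             */
         OUTPUT;
      END;
   PROC CORR DATA=CENTERED;
      VAR X XZ XC XCZC;
   RUN;

The correlation of X with XZ is large; that of XC with XCZC is near zero, as
equation A.21 predicts for (approximately) centered, near-normal predictors.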
Appendix B: Algorithm for Identifying Scale-Independent Terms
Although additive transformations are examined here, the procedure can be gen-
eralized to multiplicative transformations such as those involved in standardiza-
tion (see Cohen, 1978).
Step 1. State the full regression equation under consideration, here equation
5.4:

Y = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0                       (5.4)

Step 2. Apply the additive transformations X' = X + c and Z' = Z + f; that is,
substitute X = X' − c and Z = Z' − f throughout the equation.
Step 3. Expand and collect the terms of the regression equation with trans-
formed variables:

Y = (b1 − 2b2c − b4f + 2b5cf)X' + (b2 − b5f)X'² + (b3 − b4c + b5c²)Z'
    + (b4 − 2b5c)X'Z' + b5X'²Z' + (b0 − b1c + b2c² − b3f + b4cf − b5c²f)

or equivalently

Y = b'1X' + b'2X'² + b'3Z' + b'4X'Z' + b'5X'²Z' + b'0

Step 4. Compare the transformed coefficients with the original coefficients, as
in the chart below. If the row for a coefficient contains any entries under "Mod-
ifications due to Transformation," the coefficient is scale dependent; for ex-
ample, b'1 = b1 − 2b2c − b4f + 2b5cf, so b1 is scale dependent. Only the row
for the b5 coefficient contains no modifications; only b5 is scale free.
Step 5. If a scale-free term is tested and found to be nonsignificant, it is dropped
from the equation. For any equation resulting from deletion of higher order terms,
delete the corresponding columns of "Modifications due to Transformation." For
example, if the X²Z term were deleted from equation 5.4, the following equation
would result:

Y = b1X + b2X² + b3Z + b4XZ + b0                               (B.5)

After deletion, those coefficients of the transformed equation that show no entries
under "Modifications due to Transformation," here b2 and b4, are scale free in the
reduced equation. Thus in equation B.5, both b2 and b4 are scale invariant.
If a joint test of the b4 and b5 terms were nonsignificant, leading both
terms to be dropped from equation 5.4, then both the b2 and b3 coefficients show
no entries under "Modifications due to Transformation." Hence b2 and b3 are
scale invariant in the equation Y = b1X + b2X² + b3Z + b0; both coefficients may
be tested for significance.

The present strategy is applicable to more complex equations such as equation
5.5, which includes two nonlinear effects and their interactions, or equations in-
volving three variables, X, Z, and W, and their interactions, such as equation 4.1.
Appendix Table B.2 provides useful summary charts for determining the scale-
free coefficients in these equations.
Original Equation:

Y = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0

Transformed Equation:

Y = b'1X' + b'2X'² + b'3Z' + b'4X'Z' + b'5X'²Z' + b'0

Transformations: X' = X + c;  Z' = Z + f

Coefficient Relationships:

Transformed  Original   Modifications due to Transformation
Equation     Equation   b1       b2       b3      b4       b5

b'1          b1                  -2b2c            -b4f     +2b5cf
b'2          b2                                            -b5f
b'3          b3                                   -b4c     +b5c²
b'4          b4                                            -2b5c
b'5          b5
b'0          b0         -b1c     +b2c²    -b3f    +b4cf    -b5c²f

NOTE: Coefficient relationships indicate, for example, that coefficient b'1 of the transformed equation
equals the value (b1 - 2b2c - b4f + 2b5cf), where the bi coefficients are taken from the original
equation.
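The chart can also be checked numerically. In the SAS sketch below (ours; the
population coefficients and the constants c = 2 and f = 3 are arbitrary), equation
5.4 is estimated in the original and in the transformed scaling; within rounding,
the fitted b'5 equals b5, and, for example, the fitted b'4 equals b4 - 2b5c.

   DATA CHARTCHK;                       /* hypothetical check of the chart    */
      SEED = 5404;
      DO I = 1 TO 1000;
         X = RANNOR(SEED);
         Z = RANNOR(SEED);
         Y = 1*X + 2*X**2 + 3*Z + 4*X*Z + 5*X**2*Z + RANNOR(SEED);
         X2 = X**2;  XZ = X*Z;  X2Z = X**2*Z;   /* original terms             */
         XT = X + 2; ZT = Z + 3;                /* X' = X + c, Z' = Z + f     */
         XT2 = XT**2; XTZT = XT*ZT; XT2ZT = XT**2*ZT;
         OUTPUT;
      END;
   PROC REG DATA=CHARTCHK;
      MODEL Y = X X2 Z XZ X2Z;                  /* original scaling           */
      MODEL Y = XT XT2 ZT XTZT XT2ZT;           /* transformed scaling        */
   RUN;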
[Appendix Table B.2 presents the corresponding summary charts for the more
complex equations: the equation containing curvilinear effects of both predictors
(with highest order term b7X²Z²) and the three-variable equation with terms in X,
Z, W, and their products (coefficients b1 through b7). The body of these charts is
not legible in the scan.]

NOTE: Coefficient relationships indicate in the three factor equation, for example, that the coefficient
b'1 of the transformed equation equals the value (b1 - b4f - b5h + b7fh), where the bi coefficients are
taken from the original equation.
Appendix C: SAS Program for Test of Critical Region(s)

This program is applicable to cases comparing regression lines in which there are
two groups and one continuous variable. It identifies critical regions where the
two regression lines differ significantly using Potthoff's (1964) extension of the
Johnson-Neyman procedure (see Chapter 7). Separate regression analyses within
each of the groups provide the data necessary for input to this program.

Variables are entered in the order below, separated by a space (free format).
The values of each of the variables for the example in Chapter 7 appear in lines
23 and 24 of the program. The program prints the name of the dependent variable,
the limit of region 1 (XL1), and the limit of region 2 (XL2).
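Although the original appendix does not spell this out, the computation can be
read directly from the program: lines 00016 through 00021 form the coefficients
of a quadratic and solve it. With A, B, and C as computed there, the region limits
printed as XL1 and XL2 are the roots of

A·X² + 2B·X + C = 0,   that is,   X = [−B ± √(B² − AC)]/A

The factor 2F/(ALLN − 4) carries the critical F value for the test (for the Chapter 7
example, F = 3.47 with N − 4 = 21 denominator df), and SSRES is the residual
sum of squares, pooled over the two within-group regressions, supplied as input.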
Program

00001   (local system Job Control Language [JCL])
00002   (local system JCL)
00003   (local system JCL)
00004   DATA JOHNNEYK;
00005   INPUT DEPVBL $ ALLN N1 N2 SXSQR1 SXSQR2 MEANX1 MEANX2 F
00006         SSRES B1 B01 B2 B02;
00007   MXSQR1 = MEANX1**2;
00008   MXSQR2 = MEANX2**2;
00009   SUM1 = (1/SXSQR1) + (1/SXSQR2);
00010   SUM2 = (MEANX1/SXSQR1) + (MEANX2/SXSQR2);
00011   SUM3 = (ALLN/(N1*N2)) + (MXSQR1/SXSQR1) + (MXSQR2/SXSQR2);
00012   SUMB1 = B1 - B2;
00013   SUMB0 = B01 - B02;
00014   SUMB1SQ = SUMB1**2;
00015   SUMB0SQ = SUMB0**2;
00016   A = (((-2*F)/(ALLN-4)) * SSRES * SUM1) + SUMB1SQ;
00017   B = ((( 2*F)/(ALLN-4)) * SSRES * SUM2) + (SUMB0 * SUMB1);
00018   C = (((-2*F)/(ALLN-4)) * SSRES * SUM3) + SUMB0SQ;
00019   SQRTB2AC = ((B**2) - (A*C))**.5;
00020   XL1 = (-B - SQRTB2AC)/A;
00021   XL2 = (-B + SQRTB2AC)/A;
00022   CARDS;
00023   SALARY 25 10 15 21768.4 6671180.4 2.40 2.99 3.47
00024   870923 122.9 27705.0 1872 18401.6
00025   PROC PRINT; VAR DEPVBL XL1 XL2;
00026   RUN;
00027   (local system JCL)
References

Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression diagnostics: Identifying influ-
ential data and sources of collinearity. New York: John Wiley.
Bentler, P. M. (1980). Multivariate analyses with latent variables: Causal modeling. In M.
R. Rosenzweig & L. W. Porter (Eds.), Annual Review of Psychology, 31. Palo Alto,
CA: Annual Reviews.
Bentler, P. M. (1989). EQS: Structural equations program manual. Los Angeles: BMDP
Statistical Software.
Bentler, P. M., & Chou, C. P. (1988). Practical issues in structural modeling. In J. S. Long
(Ed.), Common problems/proper solutions: Avoiding error in quantitative research
(pp. 161-192). Newbury Park, CA: Sage.
Berk, R. A. (1990). A primer on robust regression. In J. Fox & J. S. Long (Eds.), Modern
methods of data analysis (pp. 292-324). Newbury Park, CA: Sage.
Blalock, H. M., Jr. (1965). Theory building and the concept of interaction. American So-
ciological Review, 30, 374-381.
Bohrnstedt, G. W. (1983). Measurement. In P. H. Rossi, J. D. Wright, & A. B. Anderson
(Eds.), Handbook of survey research (pp. 69-121). New York: Academic Press.
Bohrnstedt, G. W., & Carter, T. M. (1971). Robustness in regression analysis. In H. L.
Costner (Ed.), Sociological methodology (pp. 118-146). San Francisco: Jossey-Bass.
Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of
random variables. Journal of the American Statistical Association, 64, 325-328.
Bohrnstedt, G. W., & Marwell, G. (1978). The reliability of products of two random vari-
ables. In K. F. Schuessler (Ed.), Sociological methodology. San Francisco: Jossey-
Bass.
Bollen, K. A. (1989). Structural equations with latent variables. New York: John Wiley.
Bollen, K. A., & Barb, K. H. (1981). Pearson's r and coarsely categorized measures. Amer-
ican Sociological Review, 46, 232-239.
Bollen, K. A., & Jackman, R. W. (1990). Regression diagnostics: An expository treatment
of outliers and influential cases. In J. Fox & J. S. Long (Eds.), Modern methods of
data analysis (pp. 257-291). Newbury Park, CA: Sage.
Borich, G. D. (1971). Interactions among group regressions: Testing homogeneity of group
regressions and plotting regions of significance. Educational and Psychological Mea-
surement, 31, 251-253.
Borich, G. D., & Wunderlich, K. W. (1973). Johnson-Neyman revisited: Determining
interactions among group regressions and plotting regions of significance in the case
of two groups, two predictors, and one criterion. Educational and Psychological Mea-
surement, 33, 155-159.
Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations (with discussion).
Journal of the Royal Statistical Society (Section B), 26, 211-246.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of co-
variance structures. British Journal of Mathematical and Statistical Psychology, 37,
62-83.
Busemeyer, J. R., & Jones, L. E. (1983). Analysis of multiplicative combination rules
when the causal variables are measured with error. Psychological Bulletin, 93, 549-
562.
Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor
covariance and mean structures: The issue of partial measurement invariance. Psycho-
logical Bulletin, 105, 456-466.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the
multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Champoux, J. E., & Peters, W. S. (1987). Form, effect size, and power in moderated
regression analysis. Journal of Occupational Psychology, 60, 243-255.
Chaplin, W. F. (1991). The next generation of moderator research in personality psychol-
ogy. Journal of Personality, 59, 143-178.
Chaplin, W. F. (in press). Personality, interactive relations and applied psychology. In S.
Briggs, R. Hogan, & W. H. Jones (Eds.), Handbook of personality psychology.
Orlando, FL: Academic Press.
Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7,
249-253.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,
NJ: Erlbaum.
Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook
for research on interactions. New York: Irvington.
Daniel, C., & Wood, F. S. (1980). Fitting equations to data (2nd ed.). New York: John
Wiley.
Darlington, R. B. (1990). Regression and linear models. New York: McGraw-Hill.
Domino, G. (1968). Differential predictions of academic achievement in conforming and
independent settings. Journal of Educational Psychology, 59, 256-260.
Domino, G. (1971). Interactive effects of achievement orientation and teaching style on
academic achievement. Journal of Educational Psychology, 62, 427-431.
Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic
Press.
Dunlap, W. P., & Kemery, E. R. (1987). Failure to detect moderating effects: Is multi-
collinearity the problem? Psychological Bulletin, 102, 418-420.
Dunlap, W. P., & Kemery, E. R. (1988). Effects of predictor intercorrelations and relia-
bilities on moderated multiple regression. Organizational Behavior and Human Deci-
sion Processes, 42, 248-258.
Judd, C. M., & McClelland, G. H. (1989). Data analysis: A model comparison approach.
San Diego: Harcourt Brace Jovanovich.
Judge, G. G., & Bock, M. E. (1978). The statistical implications of pre-test and Stein-rule
estimators in econometrics. Amsterdam: North-Holland.
Judge, G. G., Hill, R. C., Griffiths, W. E., Lütkepohl, H., & Lee, T. C. (1982). Introduc-
tion to the theory and practice of econometrics. New York: John Wiley.
Kenny, D. A. (1975). A quasi-experimental approach to assessing treatment effects in the
nonequivalent control group design. Psychological Bulletin, 82, 345-362.
Kenny, D. A. (1979). Correlation and causality. New York: John Wiley.
Kenny, D. A. (1985). Quantitative methods for social psychology. In G. Lindzey & E.
Aronson (Eds.), Handbook of social psychology (3rd ed., Vol. 1, pp. 487-596). New
York: Random House.
Kenny, D. A., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of
latent variables. Psychological Bulletin, 96, 201-210.
Kenny, D. A., & Judd, C. M. (1986). Consequences of violating the independence as-
sumption in analysis of variance. Psychological Bulletin, 99, 422-431.
Kirk, R. E. (1982). Experimental design: Procedures for the behavioral sciences (2nd ed.).
Belmont, CA: Brooks/Cole.
Kmenta, J. (1986). Elements of econometrics (2nd ed.). New York: Macmillan.
Lance, C. E. (1988). Residual centering, exploratory and confirmatory moderator analysis,
and decomposition of effects in path models containing interactions. Applied Psycho-
logical Measurement, 12, 163-175.
Lane, D. L. (1981). Testing main effects of continuous variables in nonadditive models.
Multivariate Behavioral Research, 16, 499-509.
LaRocco, J. M., House, J. S., & French, J. R. P., Jr. (1980). Social support, occupational
stress, and health. Journal of Health and Social Behavior, 21, 202-228.
Lautenschlager, G. J., & Mendoza, J. L. (1986). A step-down hierarchical multiple regres-
sion analysis for examining hypotheses about test bias in prediction. Applied Psycho-
logical Measurement, 10, 133-139.
Levine, D. W., & Dunlap, W. P. (1982). Power of the F test with skewed data: Should
one transform or not? Psychological Bulletin, 92, 272-280.
Long, J. S. (1983a). Confirmatory factor analysis: A preface to LISREL. Beverly Hills,
CA: Sage.
Long, J. S. (1983b). Covariance structure models: An introduction to LISREL. Beverly
Hills, CA: Sage.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading,
MA: Addison-Wesley.
Lubin, A. (1961). The interpretation of significant interaction. Educational and Psycholog-
ical Measurement, 21, 807-817.
Lubinski, D., & Humphreys, L. G. (1990). Assessing spurious "moderator effects": Illus-
trated substantively with the hypothesized ("synergistic") relation between spatial and
mathematical ability. Psychological Bulletin, 107, 385-393.
Maddala, G. S. (1977). Econometrics. New York: McGraw-Hill.
Mansfield, E. R., Webster, J. T., & Gunst, R. F. (1977). An analytic variable selection
technique for principal component regression. Applied Statistics, 26, 34-40.
Marascuilo, L. A., & Levin, J. R. (1984). Multivariate statistics in the social sciences: A
researcher's guide. Belmont, CA: Brooks/Cole.
Marquardt, D. W. (1980). You should standardize the predictor variables in your regression
models. Journal of the American Statistical Association, 75, 87-91.
Glossary of Symbols
Symbol            Definition                                                Page

s_ii              sample variance of unstandardized regression coef-         16
                  ficient b_i; ith diagonal element of S_b, for centered
                  predictors
s_ij              sample covariance between unstandardized regression        16
                  coefficients b_i and b_j; off-diagonal element of S_b,
                  for centered predictors
s*_ij             sample covariance between unstandardized regression        34
                  coefficients b*_i and b*_j; off-diagonal element of
                  S*_b, for uncentered predictors
s_L, s_M, s_H     standard errors of simple slopes of Y on X at Z_L,         17
                  Z_M, and Z_H, respectively (centered case)
s*_L, s*_M, s*_H  standard errors of simple slopes of Y on X at Z_L,         44
                  Z_M, and Z_H, respectively (uncentered case)
X̄                 sample mean of X                                           30
                  uncentered predictor X                                    153
                  latent variable X in structural equation model            153
                  the value of X at which simple regressions of Y on X       24
                  at values of Z cross, centered X and Z
                  the value of X at which simple regressions of Y on X       32
                  at values of Z cross, uncentered X and Z
xz                crossproduct of centered predictors x and z                30
XZ                crossproduct of uncentered predictors X and Z
                  product of latent variables X and Z in structural         153
                  equation model
x²z               crossproduct of the square of centered x with z
Author Index

Sandler, I., 153
Sasaki, M. S., 36
Schmidt, F. L., 6, 28
Shavelson, R. J., 152
Sherman, J. D., 156, 167-169, 171
Simonton, D. K., 3
Yerkes, R. M., 3, 62
Zeller, R. A., 98
Subject Index

ANOVA, comparison with multiple regression, 70-72, 172-176
Assumptions of regression analysis, 25
Bonferroni procedure, 133, 137
Centering: categorical predictor variables, 130; centered predictors defined, 9; centered versus uncentered predictors, 28-34; correlation between X and X², XZ, 35; criterion, 35; effect on interpredictor correlation, 32-33; effect on simple slopes, 33; expected value, variance, and covariance of product terms, 181-182; interpretation of first order regression coefficients, 37-41
Computer analysis: linear X by linear Z by linear W interaction, 58; linear X by linear Z interaction, 18-20; standard errors of simple slopes, 19, 58; t-tests for simple slopes, 19, 58
Conditional effects, 10, 37-38, 50, 76, 102-105; centering and, 37; conditional interactions, 50; first order conditional effects, 37-38, 102-105; interpretability and predictor scaling, 37-38. See also First order regression coefficients, Interactions
Conditional values of predictors, 18, 58, 89
Corrected estimates of regression coefficients, 145-154, 171
Effects coding, 127-130; comparison with dummy coding, 128-130; unweighted, 127-129; weighted, 130
Effect size, 157-161, 162; impact of measurement error on interaction, 161, 162; relation to semipartial correlation, 158-159; reliability, 161-162
Inclusion of lower order terms in equations with interactions, 49, 61, 93-95
Interactions, 9-27, 29, 36, 40, 42-44, 49-61, 69-70, 79-89, 100-102, 123-127; categorical and continuous variables, 123-127; conditional interactions, 50; crossover versus noncrossover, 22; curvilinear (higher order) interactions, 69-70, 79-89; inclusion of lower order terms in regression equation containing, 49, 61, 93-95; interpretation in regression, 9-10, 36; linear X by linear Z by linear W interaction, 49-61; linear X by linear Z interaction, 9-27; ordinal versus disordinal, 22-23, 83; overlapping variance with first order terms, 100-102; standardized regression analysis, 40, 42-44
Interpretation of first order regression coefficients. See First order regression coefficients
Interpretation of interactions, 9-10, 36; curvilinear (higher order) interactions, 69-70; linear X by linear Z interaction, 9-10, 36; rescaling of predictors, 29
Invariance of regression coefficients, 28-31, 33, 36, 40-45, 48, 183-187; algorithm for identifying invariant regression coefficients, 183-187; centered versus uncentered standardized solution, 40, 42-43; failure of invariance with higher order terms, 28-31, 33, 177; highest order term in regression equation, 29, 30, 33, 48, 177; invariance with no higher order terms, 29-30, 33; standardized solution with higher order terms, 36, 40-45; standardized versus unstandardized coefficients, 36. See also Centering
Johnson-Neyman technique, 132-137
Latent variable structural modeling, 151-153; interactions, 153; regression estimates corrected for measurement error, 151-153; structural coefficients, 152
Linear combinations of regression coefficients, 25
Main effects, 38-39, 63, 70-71, 103. See also First order regression coefficients
Matrices, 150, 171; matrix indefiniteness, 171; positive definite matrices, 150, 171; positive semidefinite matrices, 171
Matrix solution for standard error of simple regression coefficients, 25-27, 54, 61, 78, 79, 82, 85, 86, 98-99, 131, 138; categorical by continuous variable, 131, 138; curvilinear X, linear Z, and curvilinear X by linear Z interaction, 86, 99; curvilinear X, linear Z, and linear X by linear Z interaction, 82-83, 98; curvilinear X, linear Z relationship, 79, 98; curvilinear X relationship, 78, 98; linear and curvilinear coefficients, 85, 98; linear X by linear Z by linear W interaction, 54, 61; linear X by linear Z interaction, 25-27
Maximum or minimum of curve, 65, 74, 75-76, 79, 82, 86-88; curvilinear X, linear Z, and curvilinear X by linear Z relationship, 86-88; curvilinear X, linear Z, and linear X by linear Z relationship, 82; curvilinear X, linear Z relationship, 79; curvilinear X relationship, 74, 75-76
Measurement error, 139-155, 160-167; attenuation bias, 141-142; bias in regression coefficients, 141-144; classical measurement theory, 140-141; corrected estimates of regression coefficients, 145-154; correlated errors in product terms, 143; effect sizes for interactions, 161-162; regression with product terms, 142-144; sample size requirements for detecting interactions, 163-164; variance accounted for by interactions, 161, 162. See also Corrected estimates of regression coefficients, Reliability of predictors
Median splits on predictors, 4, 167-168
Multicollinearity, 32-33, 35-36, 49; effect of predictor scaling, 32-33, 35-36, 49; essential versus nonessential ill-conditioning, 35-36
Multiple correlation and effect size, 157-159
Numerical examples, 10-12, 13-18, 32-34, 50-53, 55-59, 67, 117-126
Ordinal versus disordinal interactions, 22-23, 29, 31-32, 34, 125-126; categorical by continuous variable interactions, 125-126; dynamic range of a variable, 23; invariance and predictor scaling, 29, 31-32, 34; meaningful range of a variable, 23
Ordinary least squares regression, 25; assumptions, 25
Partial correlation and effect size, 157-159
Path coefficients, 170
Plotting interactions, 12-14, 15, 52-53, 66, 68, 123; categorical by continuous variables, 123; centered, uncentered data compared, 15; curvilinear (higher order) interactions, 68; curvilinear (Z²) relationship, 66; linear X by linear Z by linear W interaction, 52-53; linear X by linear Z interaction, 12-14; theory and structuring graphs, 52
Post hoc probing of interactions, 12-24, 52-53, 72-89, 130-137; curvilinear (higher order) interactions, 72-89; plotting interactions, 12-14, 52-53; post hoc tests, 14-16, 130-137. See also Post hoc tests on interactions
Post hoc tests on interactions, 14-26, 54-58, 72-89; curvilinear (higher order) interactions, 72-89; differences between simple slopes, 19-20; linear X by linear Z by linear W interaction, 54-58; linear X by linear Z interaction, 14-26
Power (statistical), 95, 139-140, 156-169; impact of measurement error for interactions, 160-167; interactions, 139-140, 156-169; loss with dichotomization of predictors, 167-168; low power for interactions, 139, 156; predictor reliability and interactions, 160-167
Range of a variable, 22-23; dynamic range of a variable, 23; meaningful range of a variable, 23
Regression analysis, 3-5, 93-95, 96-97, 110-114; as general data analytic strategy, 3-5; exploratory, 96-97; inclusion of terms, 93-95, 110-111; sequential model revision, 111-113; step-up versus step-down approaches, 113; stepwise, 114
Regression coefficient, 9-10, 37-40, 42-43, 45; interpretation of first order coefficients, 9-10, 37-40; standardized solution, 40, 42-43, 45
Regression equation, 9-10, 62-99, 118-126; categorical and continuous variables, 119-123; categorical and continuous variables and their interactions, 123-126; categorical variable only, 118-119; other forms of nonlinearity, 95-96; with higher order (curvilinear) relationships, 62-99; with interaction, 9-10; with no interaction, 10
Reliability of predictors, 139-155, 160-167; definition of reliability, 141; reliability of crossproduct term, 144-145; statistical power of interactions, 161-167. See also Measurement error
Residuals, 25; assumptions, ordinary least squares regression, 25
Sample size requirements for detecting interactions, 158-159, 161, 163-164; tables, 159, 164
SAS computer package, 27, 137, 188-189; PROC REG, 27; program for Potthoff extension of Johnson-Neyman technique, 137, 188-189; variance-covariance matrix of predictors, 27
Scale free terms, algorithm for identification, 111-112, 183-187
Scaling of predictors. See Centering, Invariance of regression coefficients, Transformations of predictors
Semipartial correlation and effect size, 157-159
Simple regression equation, 12-14, 29, 31, 33, 36, 50-51, 73-75; categorical by continuous interactions, 124-125, 129; centered versus uncentered predictors, 29, 31, 33; choosing values of Z, regression of Y on X, 12-13; curvilinear X relationship, 73-75; effects of predictor transformation, 29, 31, 33; interpretation, 36; linear X by linear Z by linear W interaction, 50-51; linear X by linear Z interaction, 13-14
Simple slope, 12, 16-21, 22, 29, 31, 33, 36, 37-38, 42, 44-45, 48, 50-51, 54-58, 60, 64, 73-75; bias in tests of significance, 22; computer analysis, 18-21, 54-58; curvilinear (higher order) interactions, 73-75, 79, 80-83, 86; definitions of, 73-75; effect of predictor scaling, 29, 31, 33; first derivative and, 73-75; first order regression coefficients and, 37-38; interpretation, 36, 48; linear X by linear Z by linear W interaction, 50-51; linear X by linear Z interaction, 16-18; standardized regression equation, 42, 44-45; table, higher order interactions, 64; table, linear interactions, 60
SPSS-X computer package, 27, 92; REGRESSION, 27; variance-covariance matrix of predictors, 27
Spurious regression effects, 154-155
Standard error of simple slope, 16-18, 19, 24-26, 34, 46, 47, 54-58, 60, 64, 77-78, 81-83, 86-87, 89-92, 131; categorical by continuous variable interactions, 130-132; computer calculation, 19, 58, 89-92; curvilinear (higher order) interactions, 64, 77-78, 81-83, 86-87; derivation of standard error, 24-26; effect of predictor scaling, 34; linear X by linear Z by linear W interaction, 54-58; linear X by linear Z interaction, 16-18; regression of Y on X at Z, 16; regression of Y on Z at X, 16; standardized solution, 46, 47; table of, variances of simple slopes in curvilinear interactions, 64; table of, variances of simple slopes in linear interactions, 60
Standardized solution with interactions, 40-48; appropriate standardized solution, 44; crossproduct terms with standard scores, 43; failure of invariance of regression coefficients, 40, 42-43, 48; raw versus standardized solution, 45-47; simple slope analysis, 44-45; standardized solutions from centered versus uncentered data, 40, 42
Structural coefficients, 152, 170
Theory, role in structuring regression equations, 70-72, 93-96, 103-105, 110, 173-174
Transformations of predictors, 28-29, 32-33, 40, 177-187; additive transformations, 28-29, 32-33, 177-182; algorithm for identifying scale-independent regression coefficients, 40, 183-187. See also Centering
t-tests, 16-18, 19, 34, 40, 46, 47, 54, 58, 83, 86, 131; effect of predictor scaling on simple slope tests, 34; simple slopes in categorical by continuous variable interaction, 131; simple slopes in curvilinear (higher order) interactions, 83, 86; simple slopes in linear X by linear Z by linear W interactions, 54, 58; simple slopes in linear X by linear Z interaction, 16-18, 19; standardized regression coefficients, 40, 46, 47. See also Standard error of simple slopes
Unbiasedness versus efficiency, 103-104
Variance-covariance matrix of regression coefficients, 16-17, 25, 27, 45, 131; standardized solution, 45
Variance of simple slope, 24-26, 60, 64; general expression, 25-26; table of, curvilinear (higher order) interactions, 64; table of, linear interactions, 60