
MULTIPLE

REGRESSION:
Testing and
Interpreting
Interactions
Leona S. Aiken
Stephen G. West
Arizona State University

With contributions by Raymond R. Reno


University of Notre Dame

SAGE PUBLICATIONS
The International Professional Publishers
Newbury Park  London  New Delhi

Copyright © 1991 by Sage Publications, Inc.

All rights reserved. No part of this book may be reproduced or utilized in
any form or by any means, electronic or mechanical, including photocopying,
recording, or by any information storage and retrieval system, without
permission in writing from the publisher.

For information address:

SAGE Publications, Inc.
Thousand Oaks, California

SAGE Publications Ltd.
London, United Kingdom

SAGE Publications India Pvt. Ltd.
Greater Kailash I
New Delhi, India

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data

Aiken, Leona S.
  Multiple regression: Testing and interpreting interactions / Leona
  S. Aiken and Stephen G. West.
    p. cm.
  Includes bibliographical references and index.
  ISBN 0-8039-3605-2 (c)  ISBN 0-7619-0712-2 (pbk.)
  1. Regression analysis.  I. West, Stephen G.  II. Title.
QA278.2.A34  1991                                        91-2062
519.5'36—dc20

This book is printed on acid-free paper.

98 99 01 02 03 10 9 8 7 6 5

First paperback printing 1996

Sage Production Editor: Susan McElroy

Contents

Preface ix

1. Introduction 1

2. Interactions Between Continuous Predictors in Multiple
   Regression 9
   What Interactions Signify in Regression 9
   Data Set for Numerical Examples 10
   Probing Significant Interactions in Regression Equations 12
   Plotting the Interaction 12
   Post Hoc Probing 14
   Ordinal Versus Disordinal Interactions 22
   Optional Section: The Derivation of Standard Errors of Simple Slopes 24
   Summary

3. The Effects of Predictor Scaling on Coefficients of
   Regression Equations 28
   The Problem of Scale Invariance 29
   Linear Regression with No Higher Order Terms 30
   Regression Equations with Higher Order Terms 31
   Slopes of Simple Regression Equations 31
   Ordinal Versus Disordinal Interactions 32
   Numerical Example—Centered Versus Uncentered Data 35
   Should the Criterion Y Be Centered? 35
   Multicollinearity: Essential Versus Nonessential Ill-Conditioning
   Interpreting the Regression Coefficients 36
   The Interaction Term XZ 36
   The First Order Terms X and Z 37
   A Geometric Interpretation 39
   Standardized Solutions with Multiplicative Terms 40
   Appropriate Standardized Solution with Interaction Terms 43
   Simple Slope Analysis from the Standardized Solution 44
   Relationship Between Raw and "Standardized" Solution 45
   Summary 47

4. Testing and Probing Three-Way Interactions 49
   Specifying, Testing, and Interpreting Three-Way Interactions 49
   Probing Three-Way Interactions 50
   Simple Regression Equation 50
   Numerical Example 50
   Graphing the Three-Way Interaction 52
   Testing Simple Slopes for Significance 54
   Standard Errors by Computer 54
   Crossing Point of Simple Regression Equations with Three-Predictor
   Interaction 58
   Simple Slopes and Their Variances in a Series of Regression
   Equations 59
   Summary 61

5. Structuring Regression Equations to Reflect Higher Order
   Relationships 62
   Structuring and Interpreting Regression Equations Involving
   Higher Order Relationships 63
   Case 1: Curvilinear X Relationship 63
   A Progression of More Complex Equations with Curvilinear
   Relationships 65
   Representation of Curvilinearity in ANOVA Versus MR 70
   Post Hoc Probing of More Complex Regression Equations 72
   Case 1: Curvilinear X Equation 72
   The Progression of More Complex Curvilinear Equations Revisited 78
   Coefficients of Simple Slopes by Computer 89
   Three Final Issues 92
   Curvilinearity Versus Interaction 92
   What Terms Should Be Included in the Regression Equation? 93
   Other Methods of Representing Curvilinearity 95
   Summary 97

6. Model and Effect Testing with Higher Order Terms 100
   Some Issues in Testing Lower Order Effects in Models
   Containing Higher Order Terms 100
   Question 1: Interpretation of Lower Order Terms When b3 Is
   Significant
   Question 2: Should Lower Order Coefficients Be Tested in Reduced
   Models When b3 Is Nonsignificant?
   Exploring Regression Equations Containing Higher Order Terms
   with Global Tests
   Some Global Tests of Models with Higher Order Terms 107
   Structuring Regression Equations with Higher Order Terms 110
   Sequential Model Revision of Regression Equations Containing
   Higher Order Terms: Exploratory Tests 111
   Application of Sequential Testing Following a Global Test 112
   General Application of Sequential Testing 113
   Present Approach Versus That Recommended by Cohen (1978) 113
   Variable Selection Algorithms 114
   Summary

7. Interactions Between Categorical and Continuous Variables 116
   Coding Categorical Variables 116
   Dummy Variable Coding 116
   Unweighted Effects Coding 127
   Choice of Coding System 129
   Centering Revisited 130
   Post Hoc Probing of Significant Interactions 130
   Testing Simple Slopes Within Groups 131
   Computer Procedure 131
   Differences Between Regression Lines at a Specific Point 132
   Identifying Regions of Significance 134
   Summary 137

8. Reliability and Statistical Power
   Reliability
   Biased Regression Coefficients with Measurement Error 145
   Corrected Estimates of Regression Coefficients in Equations
   Containing Higher Order Terms 156
   Statistical Power 156
   Statistical Power Analysis 160
   The Effects of Measurement Error on Statistical Power 165
   Some Corroborative Evidence: Simulation Studies 167
   Median Split Approach: The Cost of Dichotomization 168
   Principal Component Regression 169
   Coming Full Circle 170
   Summary

9. Conclusion: Some Contrasts Between ANOVA and MR in
   Practice 172

Appendix A: Mathematical Underpinnings 177

Appendix B: Algorithm for Identifying Scale-Independent Terms 183

Appendix C: SAS Program for Test of Critical Region(s) 188

References 190

Glossary of Symbols 198

Author Index 204

Subject Index 207

About the Authors 212

Preface

Social scientists have long had interests in studying interactions between


variables. Whether hypothesized directly by theory or resulting from an
attempt to establish a boundary condition for a relationship, the testing of
interactions has formed an important part of their research.
In 1984 we began working on a number of research projects that in­
volved complex interactions between continuous variables. Although sev­
eral good articles and small sections of textbooks addressed interactions
in multiple regression models, none provided what we considered to be a
comprehensive treatment to which we could refer students and colleagues.
This situation contrasted sharply to the specialized procedures for examining
interactions between categorical variables. Procedures for testing, graphically
displaying, and post hoc probing of interactions between categorical variables
for a variety of realistic designs have for years been laboriously detailed in
Analysis of Variance textbooks. The prevalence of incomplete or nonoptimal
analyses of continuous variable interactions in several major journals in
psychology further confirmed our observation that psychologists did not have
clearcut guidelines for the analysis of interactions between continuous variables.
Indeed, the usual practice in 1984 was either to ignore continuous variable
interaction effects or to follow the nonoptimal practice of doing a median
split of the continuous variables followed by Analysis of Variance. Thus the
need for a comprehensive sourcebook on the treatment of interactions in
multiple regression seemed clear. Such a sourcebook would



be useful to both graduate students and researchers facing this statistical


problem.
This book provides clear prescriptions for the probing and interpreta­
tion of continuous variable interactions that are the analogs of existing
prescriptions for categorical variable interactions. We provide prescrip­
tions for probing and interpreting two- and three-way continuous variable
interactions, including those involving nonlinear components. The inter­
action of continuous and categorical variables, the hallmark of analysis
of covariance and related procedures, is treated as a special case of our
general prescriptions. The issue of power of tests for continuous variable
interactions, and the impact of measurement error on power are also ad­
dressed. Simple approaches for operationalizing the prescriptions for post
hoc tests of interactions with standard statistical computer packages are
provided.
The text is designed for researchers and graduate students who are fa­
miliar with multiple regression analysis involving simple linear relation­
ships of a set of continuous predictors to a criterion. Hence the material
is accessible to typical lower level graduate students in the social sciences,
education, and business. The text can usefully serve as a supplement to
introductory graduate level statistics courses. The required mathematical
background is high school algebra. Although there are smatterings of cal­
culus and matrix algebra, readers unfamiliar with these mathematical
forms will not be disadvantaged in their understanding of the material or
the application of the prescriptions to their own research.
Several individuals have made significant contributions to the devel­
opment of this monograph. Most notable are the major contributions of
Ray Reno. He provided all the simulations and computer analyses we
have reported throughout the text. His computer examples render our pre­
scriptions accessible to anyone with knowledge of the regression analysis
programs in standard statistical computer packages. We also appreciate
the input of a number of individuals who read and commented on versions
of the manuscript, among them Sanford Braver, Patrick Curran, Joseph
Hepworth, and Jenn-Yun Tein of Arizona State University; Richard Dar­
lington of Cornell University, Charles Judd of the University of Colorado,
and Herbert Marsh of the University of Western Sydney, Macarthur.
David Kenny of the University of Connecticut and James Jaccard of the
State University of New York, Albany provided thorough reviews of the
manuscript; their helpful input is strongly reflected in the final form of
the text. Special acknowledgment is due to David Kenny and Patrick Curran, who provided extremely detailed and probing commentaries. We also

CamScanner
preface xi

thank Susan Maxwell for her many insights about interactions. Finally,
we are very appreciative of the encouragement, guidance, flexibility, and
patience of our Sage Editor, Deborah Laughton, and the painstaking ef­
forts of production editor Susan McElroy in preparing the book for pub­
lication.
The clerical and editorial efforts of Jane Hawthorne and Kathy Sidlik
during various stages of the project are gratefully acknowledged. Andrea
Fenaughty provided greatly appreciated assistance with the indexing and
referencing. Support for graduate assistant Ray Reno, as well as Jane
Hawthorne and Kathy Sidlik, was provided by the College of Liberal Arts
and Sciences, Arizona State University. The efforts of Steve West were
in part supported by National Institute of Mental Health grant
P50MH39246.

L. S. A. and S. G. W.
Tempe, January 1991

1 Introduction

This book is concerned with a statistical problem commonly faced by


researchers in the social sciences, business, education, and communica­
tion: How to structure, test, and interpret complex multiple regression
models containing interactions, curvilinear effects, or both. To understand
this problem, consider the following example. A researcher wishes to investigate
the effects of life stress (X) and the amount of social support received by the
person (Z) on physical illness (Y). He obtains reliable continuous measures of
each of the variables on a large sample of subjects. Following the dictates of
what has become standard practice in many research areas, he enters the
predictor variables and the outcome variable into any of the standard regression
computer packages and estimates the familiar regression equation reproduced below:

Ŷ = b1X + b2Z + b0     (1.1)

The tests of the b1 and b2 coefficients are easily accomplished and inform
the researcher whether stress and social support each have a non-zero
linear relationship to physical illness in the population. The b0 coefficient,
which will appear as the final term in all regression equations in this book,
represents the regression constant or intercept and will only rarely be of
theoretical interest. Ŷ represents the predicted value of Y.
But, is this the regression model that the researcher really wished to
test? Like many other social scientists, his interest was not
so much in whether stress and social support have the linear and additive


effects on physical illness specified by equation 1.1. Rather, in line with


prior theorizing (Cobb, 1976; LaRocco, House, & French, 1980), he
wished to test the explicit hypothesis that social support buffers the effect
of stress on physical illness. That is. the researcher predicted that while
each individual’s level of stress is positively related to his or her level of
physical illness, the strength of this relationship weakens as the level of
social support received by the individual increases. Indeed, for individ­
uals with very high levels of social support, the researcher would predict
that there would be little, if any, relationship between stress and physical
health. This hypothesis implies that there should be an interaction of stress
and social support in predicting physical illness, a relationship represented
by equation 1.2 below:

Ŷ = b1X + b2Z + b3XZ + b0     (1.2)

Thus equation 1.1 does not test the researcher’s hypothesis.
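To make the contrast concrete, the sketch below fits both equations by ordinary least squares after forming the XZ product term explicitly. It is a minimal illustration in Python with hypothetical variable names and generating values; it is not the book's data or software.

import numpy as np

# Hypothetical illustration of equations 1.1 and 1.2: the only change is the
# added XZ column, which carries the stress-buffering interaction.
rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=n)                                 # life stress
Z = rng.normal(size=n)                                 # social support
Y = 2.0*X - 1.0*Z - 0.5*X*Z + rng.normal(size=n)       # illustrative buffering effect

def ols(design, y):
    # Ordinary least squares for a design matrix that already includes a constant.
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

ones = np.ones(n)
b_additive = ols(np.column_stack([X, Z, ones]), Y)        # equation 1.1: b1, b2, b0
b_interact = ols(np.column_stack([X, Z, X*Z, ones]), Y)   # equation 1.2: b1, b2, b3, b0
print("equation 1.1:", b_additive)
print("equation 1.2:", b_interact)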


Substantive theory such as the stress-buffering model described above
often specifies that the value of an outcome variable depends jointly upon
the value of two or more predictor variables. Also, in empirical work in
a new research area investigators often initially attempt to find general
causal relationships of the form X causes Y. When they have established
such relationships, they then attempt to specify conditions under which
the causal relationship is weakened (moderated) or strengthened (ampli­
fied). These are interactions.
Interest in complex hypotheses that are not adequately represented by
simple, additive linear regression equations is common in many disci­
plines. These complex hypotheses include not only interactions but also
curvilinear relationships between the predictor variables and the outcome.
The following examples selected from a variety of areas in the social
sciences, business, and education are illustrative. In each case, the regres­
sion equations must be structured to contain higher order terms repre­
senting interactions, curvilinear effects, or both in order to test the re­
searcher’s hypothesis properly.
1. The number of years of job experience is positively related to the
worker's salary. However, this relationship may be moderated by the percentage of female workers in the occupation: Occupations with a higher
percentage of female workers are predicted to have only small increases
in salary as a function of job experience relative to occupations with a
lower percentage of female workers (England, Farkas, Kilbourne, & Dou,
1988).


2. A number of researchers have hypothesized that superior classroom


performance results when the student’s personal style matches the nature
of the classroom environment (see Cronbach & Snow, 1977). As one
illustration, Domino (1968, 1971) hypothesized that the student’s degree
of personal independence (versus conformity) would interact with the di­
rectiveness of the instructor (encourages independence versus encourages
conformity) to predict the student's grade in the course.
3. Studies of workspace design find that the larger the number of peo­
ple in each office, the higher the rate of turnover of personnel in the office.
However, this relationship is expected to be weakened to the extent each
employee’s desk is enclosed by partitions (Oldham & Fried, 1987).
4. Several theories of leadership propose that the group’s performance
is a complex function of both the leader’s style and the nature of the
situation. For example, Fiedler (1967; Fiedler, Chemers, & Mahar, 1976)
proposed that in favorable situations (defined as having high task struc­
ture, good leader-member relations, high leader power) and in very un­
favorable situations (low values on each of these situational characteris­
tics), leaders with a task-oriented style would elicit the highest level of
performance from their groups. However, in mixed (moderate) situations,
leaders with a task-oriented style would elicit the lowest level of perfor­
mance.
5. In a study of vetoes of legislation by U.S. presidents, Simonton
(1987) hypothesized that the degree of success in sustaining the veto would
reflect an interaction between the president’s personal level of flexibility
and the percentage of the electoral college vote he received in the previous
election.
6. A classic “law" in psychology (Yerkes & Dodson, 1908) hypoth­
esizes that performance will show an inverted-U shaped relationship to
the person's level of physiological arousal. The point of maximum per­
formance and the exact shape of the curve are determined by the difficulty

• The general methods of structuring complex regression equations to test


such hypotheses explicitly were first proposed over two decades ago in
the social sciences. Cohen (1968) proposed multiple regression (MR)
analysis as a general data analytic strategy. According to this strategy,
any combination of categorical and continuous variables can be analyzed
within a multiple regression (MR) framework simply through the appro­
priate dummy coding of the categorical variables. Interactions can lie rep­
resented as product terms, and curvilinear relationships can be represented
through higher order terms in the regression equation. Other early pro-


posals for the structuring and testing of complex regression models involving interactions and/or higher order effects were made in sociology
by Allison (1977), Blalock (1965), and Southwood (1978) and in political
science by Friedrich (1982) and Wright (1976).
Despite the availability of general procedures for testing interactions
and curvilinear effects within an MR framework, the actual practice of
researchers in many areas of social science, business, and education indicates these strategies have only rarely been followed. Many researchers,
as in our example of stress, social support, and physical illness above,
have incorrectly utilized simple additive regression models that ignore
possible interaction effects.
Other researchers, originally trained in the use of analysis of variance
(ANOVA) models, have frequently utilized median splits of their contin
uous variables. This practice does allow data to be subjected to the fa­
miliar procedures of the 2 x 2 factorial ANOVA. Unfortunately, this
practice is also associated with substantial costs. Median splits of continuous variables throw away information, reducing the power of the statistical test. They make it much more difficult to detect significant effects
when in fact they do exist (Cohen, 1983). For an interaction effect of any
specified magnitude, a substantially larger sample of subjects will be
needed when the median split approach is used rather than the MR with
interactions approach in order to achieve adequate statistical power. The
median split approach may often be less informative in practice than MR
approaches if higher order relationships between the predictors and cri­
terion exist. Finally, the MR approach uses all of the information avail­
able in the predictor variables to provide direct estimates of the effect size
and percentage of variance accounted for.1
As an illustration of the magnitude of the problem of the nonutilization
of MR with interactions in one social science, we conducted a survey of
four psychological journals that frequently publish articles involving analyses of multiple continuous predictor (independent) variables. Three categories of analysis strategies for continuous predictors were tabulated: (a)
ANOVA with continuous variables broken into categories (ANOVA with
cutpoints, nearly always median splits), (b) MR without interactions, and
(c) MR with interactions. A total of 148 articles involving the analysis
of two (or more) continuous variables were located: In 77% of these articles one of the first two strategies was used rather than MR with interactions.2 Although we have not undertaken a formal survey of journals in
other social science disciplines, education, or business, our impression is
that the use of complex MR models with interactions continues to be rare
relative to analysis strategies (a) and (b).

Over two decades have passed since the initial proposals of MR as a
general data analytic strategy in the social sciences by Blalock (1965) and
Cohen (1968). Other works have also periodically echoed Cohen's message (e.g., Cohen & Cohen, 1975, 1983; Darlington, 1990; Kenny, 1985;
Neter, Wasserman, & Kutner, 1989; Pedhazur, 1982). Why have researchers been so slow to utilize these techniques in the analysis of studies
involving two or more predictor variables? We believe this underutilization of MR approaches stems in large part from several impediments that
arise when researchers actually attempt to utilize the general procedures
book is to provide a detailed explanation of the procedures through which
regression equations containing interactions and higher order nonlinear
tenns may be structured, tested, and interpreted. In pursuing this aim, we
present an integration of recent work in the psychological, sociological,
and statistical literatures that clarifies a number of earlier points of con­
fusion in the literature that have also served as impediments to the use of
the MR approach.
Chapter 2 addresses a number of issues involved in the interpretation
of interactions between two continuous variables in MR. One major impediment to the use of MR has been that procedures for displaying and
probing significant interactions have not been readily available. That is,
once an interaction was found to be significant, exactly what should one
do next? In this chapter, we present the graphical approaches to examining the interaction in equation 1.2 originally developed by Cohen and
Cohen (1975, 1983), present analyses to answer questions about the form
(ordinal versus disordinal) of the interaction, and derive procedures for
post hoc statistical probing of interactions between continuous variables
that closely parallel simple effects testing within the ANOVA framework.
Chapter 3 addresses another impediment to the use of interaction terms:
the lack of invariance of the MR results under simple linear transformations
of the data. To understand this problem, suppose a researcher is analyzing a
data set using equation 1.2, which contains first order terms for X and Z and
the linear interaction of X and Z:

Ŷ = b1X + b2Z + b3XZ + b0

Two analyses are conducted: First, the data are analyzed in raw score
form; second, the data are analyzed with X and Z centered (put in deviation score form), with the interaction created from the deviation score
forms of X and Z. The results of these two analyses will differ, perhaps
dramatically. Only the b3 coefficient for the interaction is

the same in both equations (see Cohen, 1978). Such shifts in the results
of the analyses of regression equations containing interactions or higher
order variables under transformation are disturbing. This problem has led
to substantial discussion in the social science literature (see, e.g., Fried­
rich, 1982, in political science; Cohen, 1978, and Sockloff, 1976, in psy­
chology; Althauser, 1971, and Arnold & Evans, 1979, in sociology).
Confusion and conflicting recommendations have resulted (e.g., Schmidt,
1973, versus Sockloff, 1976). Chapter 3 clarifies the source of the prob­
lem of failure to maintain invariance. Procedures through which research­
ers may work with equations containing higher order terms and maintain
unambiguous interpretations of their effects are highlighted. The interpre­
tation of all regression coefficients in equations containing interactions is
explained. Finally, a standardized solution for equations containing in­
teractions is presented.
Chapters 4 and 5 examine problems in testing interactions in more com­
plex regression equations. Most discussions of interactions in MR have
focused exclusively on the simple model involving two predictors and
their interaction represented by equation 1.2. Chapter 4 generalizes the
procedures for the treatment of interactions to the three predictor case.
Methods for graphically displaying the three-way interaction and for con­
ducting post hoc tests that are useful in probing the nature of the inter­
action are discussed.
Chapter 5 considers several complications that arise in structuring, test­
ing, and interpreting regression equations containing higher order terms
to represent curvilinear (quadratic) effects and their complex interactions.
The methods for graphically portraying and conducting post hoc tests of
interactions developed in earlier chapters are generalized to a variety of
complex regression equations.
Chapters 5 and 6 also address another impediment to the use of com­
plex MR models for researchers trained in the ANOVA tradition. In equa­
tion 1.2 where the XZ term with one degree of freedom (df) represents
fully the interaction between X and Z, generalizing from ANOVA to
regression is relatively easy. This generalization is less straightforward
when interactions have more than one degree of freedom. In ANOVA
there is always one source of variation for the whole interaction and one
omnibus test for significance even when an interaction has several degrees
of freedom. However, in more complex MR equations, a series of terms,
each with one degree of freedom, may represent the interaction. For example, terms representing the linear X by linear Z component (XZ) and
the curvilinear (quadratic) X by linear Z component (X2Z) of the inter-

action between X and Z may need to be built into the regression equation.
In such cases, the generalization from one source of variation and one
omnibus test for the whole interaction in ANOVA to the MR framework
is not generally familiar to researchers. Similar problems exist for testing
multiple df "main effects" in MR in the absence of an interaction. Chapter 5 provides guidelines for the structuring and interpreting of these more
complex regression equations.
Chapter 6 further extends the consideration of this issue by developing
a variety of procedures for model and effect testing in complex regression
equations. Strategies for testing and interpreting lower order effects in MR
are developed. Global tests of a variety of hypotheses (e.g., the overall
linearity of regression) that are based on sets of terms in the equation are
discussed. Hierarchical effect-by-effect and term-by-term step-down test­
ing strategies are presented for simplifying complex regression equations.
Procedures are presented for identifying the scale-independent term(s) that
can be legitimately tested at each step, yielding proper reduced equations
that can be interpreted.
Chapter 7 generalizes our treatment of interactions to cases involving
combinations of categorical and continuous predictor variables. Issues
arising in the representation of categorical variables and the interpretation
of the regression coefficients are discussed. Post hoc tests of the interac­
tion are presented that examine differences between regression equations
for the groups defined by the categorical variable.
Chapter 8 addresses the problem of measurement error in the predictor
variables and its effect on interactions. Several methods of correcting for
measurement error are presented, and their performance evaluated. The
dramatic effect of measurement error on statistical power (the ability of
statistical tests to detect true interactions) is shown.
Finally, Chapter 9 briefly contrasts the ANOVA and MR approaches
as they have been used in practice. ANOVA was classically applied to
experimental designs, whereas multiple regression was applied to measured variables. We explore some of the areas in ANOVA and MR in
which lingering traditions have led to divergent practices: model specification, functional form, and examination of the tenability of assumptions.
We hope that this book will introduce investigators in a variety of social
science disciplines to the major issues involved in the design and analysis
of research involving one or more continuous predictor variables. We
hope the book will provide an increased understanding of interactions in
multiple regression and will help remove the impediments to the use of
MR as a general data analytic strategy.


Notes

1. The ANOVA model is appropriate when (a) the levels of the predictor variable are
discrete rather than continuous or (b) the relationship between the predictor and outcome
variables is a step function rather than linear or curvilinear (Kenny, 1985). As Cohen and
Cohen (1983; see also Chapter 7 of the present book) have shown, these cases can also be
equally well represented using the MR approach.
2. When beginning this project, we reviewed the 1984 volumes of four major journals
of the American Psychological Association: Journal of Abnormal Psychology, Develop­
mental Psychology, Journal of Consulting and Clinical Psychology, and Journal of Per­
sonality and Social Psychology. These journals were selected because they are leading jour­
nals in psychology and they most frequently publish articles involving analyses of multiple
continuous predictor (independent) variables. Our estimate that 23% of the articles involv­
ing two (or more) continuous variables used MR with interactions may be considered to
reflect the quality of the best practice rather than average level of practice in these areas of
psychology in 1984.

2 Interactions Between Continuous
Predictors in Multiple Regression

In this chapter we begin by explaining what the interaction between two


continuous predictors in a regression analysis signifies about the relation­
ship of the predictors to the criterion. We then address the problem of
displaying and statistically probing interactions between continuous vari­
ables in MR for the simple case in which there is only one term (b3XZ)
involving both X and Z. This case is represented by equation 2.1:

Ŷ = b1X + b2Z + b3XZ + b0     (2.1)

For ease of presentation, the exposition in this chapter assumes that the
single predictor variables (here X and Z) have been centered (put in
deviation score form) so that their means are zero. The interaction term
has been formed by multiplying together the centered predictors.
Centering variables also yields desirable statistical properties, as we will
show in Chapter 3.
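As a minimal sketch of that preliminary step (the arrays below are illustrative, not the book's data), centering and forming the product term amounts to:

import numpy as np

# Put each predictor in deviation score form (mean zero), then build the
# interaction term from the centered predictors.
X_raw = np.array([3.0, 4.0, 5.0, 6.0, 7.0])
Z_raw = np.array([8.0, 9.0, 10.0, 11.0, 12.0])

x = X_raw - X_raw.mean()     # centered X
z = Z_raw - Z_raw.mean()     # centered Z
xz = x * z                   # interaction term formed from centered scores
print(x.mean(), z.mean())    # both 0.0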

What Interactions Signify in Regression

In multiple regression analysis, the relationship of each predictor to the
criterion is measured by the slope of the regression line of the criterion Y
on the predictor; the regression coefficients are these slopes. Consider first


the familiar two-predictor regression equation Ŷ = b1X + b2Z + b0, which
contains no interaction. In this equation b1 represents the slope of the
regression of Y on X; this slope is constant across the range of Z. If we calculated the
regression of Y on X for all cases at any single value of Z, the regression
coefficient for X in the subsample of cases would equal b1 in the overall
equation. Put another way, the regression of Y on X is independent of Z.
The regression of Y on Z is represented by b2, which has a constant value
across the range of X in this regression equation.
Now consider equation 2.1, which contains an XZ interaction. The XZ
interaction signifies that the regression of Y on X depends upon the spe­
cific value of Z at which the slope of Y on X is measured. There is a
different line for the regression of У on X at each and every value of Z.
The regressions of Y on X at specific values of Z form a family of regres­
sion lines. Each of these regression lines at one value of Z is referred to
as a simple regression line. Because the regression of Y on X depends
upon the value of Z, the effect of X in equation 2.1 is termed a conditional
effect (e.g., Darlington, 1990).
The XZ interaction is symmetrical. The presence of the XZ interaction
in equation 2.1 equivalently means that the effect of predictor Z is con­
ditional on X; there is a different regression of Y on Z at each value of X.

Data Set for Numerical Examples

To illustrate our prescriptions for post hoc probing of interactions, a


single data set is employed throughout this and the next chapter. In our
example we will predict the self-assurance of managers (criterion Y) based
on two predictors, their length of time in the managerial position (X) and
their managerial ability (Z). The data we use are artificial: They were
specifically constructed to include an interaction between X and Z. In these
data, individuals high in managerial ability increase in their self-assurance
with increased lime in the position, whereas individuals low in managerial
ability decrease in self-assurance with increased time in the position. As
we have described the expected relationships, the regression of Y on X
varies as a function of Z. All three variables, self-assurance (Y), time in
position (X), and managerial ability (Z) are continuous variables.
Our simulation is based on moderately correlated bivariate normal predictors X and Z and their interaction XZ; 400 cases were generated in all.
From these two predictors, the XZ cross-product term was formed. Predicted scores were generated from X, Z, and the interaction. Random error

was added to the predicted scores to create observed criterion scores. Finally, the regression equation Ŷ = b1X + b2Z + b3XZ + b0 was estimated
based on the three original predictors and the observed Y.
The results of the regression analysis and the post hoc probing of the
XZ interaction that we will be discussing in this chapter are presented in
Tables 2.1, 2.2, and 2.3 and Figure 2.1. Note that in this chapter we
only discuss the portions of the tables that pertain to "centered data"; the
portions of the tables labeled "uncentered data" are discussed in Chapter
3. Of most importance at present, we see in Table 2.1c(ii) the regression
equation containing the interaction:

Ŷ = 1.14X + 3.58Z + 2.58XZ + 2.54

In this equation, both the Z effect and the XZ interaction are significant.
From the overall equation it appears that there is no overall effect of length
of time in position (X) on self-assurance (Y), that there is a positive effect
of managerial ability on self-assurance (Z), but that the relationship of
time to self-assurance is modified by managerial ability (XZ). Also of
interest, Table 2.1a presents the means and standard deviations for X and
Z and the correlation matrix for X, Z, and XZ for the centered data. Table

Table 2.1
Centered Versus Uncentered Regression Analyses Containing an Interaction

a. Centered Data
   X̄ = 0 (sx = 0.95)     Z̄ = 0 (sz = 2.20)

   Correlation matrix:
            Z      XZ     Y
   X       .42    .10    .17
   Z              .04    .31
   XZ                    .21

b. Uncentered Data
   X̄' = 5 (sx' = 0.95)     Z̄' = 10 (sz' = 2.20)

   Correlation matrix:
            Z'     X'Z'   Y
   X'      .42    .81    .17
   Z'             .86    .31
   X'Z'                  .21

c. Regression Equations Based on Centered Data
   (i) No interaction:    Ŷ = 1.67X + 3.59Z** + 4.76
   (ii) With interaction: Ŷ = 1.14X + 3.58Z** + 2.58XZ** + 2.54

d. Regression Equations Based on Uncentered Data
   (i) No interaction:    Ŷ = 1.67X' + 3.59Z'** - 39.49
   (ii) With interaction: Ŷ = -24.68X'** - 9.33Z'** + 2.58X'Z'** + 90.15

**p < .01; *p < .05

2.1c(i) shows the regression equation when the interaction term is omitted.1
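The exact generating parameters of this artificial data set are not reported in the text. The sketch below illustrates the general recipe just described, taking the means, standard deviations, and X-Z correlation from Table 2.1a and using otherwise illustrative regression weights and error variance:

import numpy as np

# Sketch of the simulation recipe: correlated bivariate normal X and Z,
# an XZ cross-product, predicted scores, and added random error.
rng = np.random.default_rng(0)
n = 400
sx, sz, r = 0.95, 2.20, 0.42                 # from Table 2.1a
cov = [[sx**2, r*sx*sz], [r*sx*sz, sz**2]]
X, Z = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
XZ = X * Z
Y = 1.0*X + 3.5*Z + 2.5*XZ + rng.normal(scale=25.0, size=n)   # illustrative values

design = np.column_stack([X, Z, XZ, np.ones(n)])
b, *_ = np.linalg.lstsq(design, Y, rcond=None)
print("b1, b2, b3, b0 =", b)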

Probing Significant Interactions


in Regression Equations

Given that a significant interaction has been obtained, we now wish to


probe this interaction to sharpen our understanding of its meaning. The
primaty techniques for probing of this term arc plotting the interaction
and post hoc statistical testing.

Plotting the Interaction

Probing a significant interaction in MR begins with recasting the regres­


sion equation as the regression of the criterion on one predictor.2 For
example, the regression equation is restructured through simple algebra
to express the regression of Y on X at levels of Z:

Ŷ = (b1 + b3Z)X + (b2Z + b0)     (2.2)

In this restructured form of equation 2.1, the slope of the regression of Y
on X, (b1 + b3Z), depends upon the particular value of Z at which the
slope is considered. We refer to (b1 + b3Z) as the simple slope of the
regression of Y on X at Z. By simple slope we mean the slope of the
regression of Y on X at (conditional on) a single value of Z. Note that the
simple slope (b1 + b3Z) combines the regression coefficient of Y on X
(b1) with the interaction coefficient (b3). Readers familiar with ANOVA
may find it helpful to think of simple slopes as the analog in MR of simple
effects in ANOVA.
We must then choose several values of Z to substitute into equation 2.2
to generate a series of simple regression equations. If Z were categorical,
as for example when Z represents a dichotomous variable such as gender,
then we would compute two simple regression equations, one for men
and one for women. (Chapter 7 provides an in-depth consideration of
regression involving categorical and continuous variables.) On the other
hand, if Z is continuous, as in the example of managerial ability, then the
investigators arc free to choose any value within the full range of Z. In
some cases, theory, measurement considerations, or previous research may
suggest interesting values of Z that should be chosen. For example, in a

clinical diagnostic test, if a specific score represented a cutoff above which
pathology were indicated, then that cutoff score, a higher score typical of
the clinical condition, and a lower score typical of the normal population
might be chosen. In a study involving income, the federal government's
value of the poverty line for a family of four might be chosen. In other
cases, such as our fictitious example of managerial ability, no social sci­
ence based rationale will exist to guide the choice of several values of Z
In such cases, Cohen and Cohen (1983) have suggested as a guideline
that researchers use the values ZM, ZH, and ZL, corresponding to the mean of
Z, one standard deviation above the mean of Z, and one standard deviation below the mean of Z,
respectively. Whatever values of Z are chosen, each is substituted into
equation 2.2 to generate a series of simple regression equations of Y on
X at specific values of Z. These equations are plotted to display the inter-
action.

Numerical Example

Figure 2.1a depicts three simple regression lines of the regression of


self-assurance (У) on time in position (X) as a function of three values
of managerial ability, ZL, ZM, and ZH for our data set. Note the sym­
metrical pattern of the three simple regression lines that is characteristic
of equations with a significant XZ interaction term.
To generate these simple regression lines, the overall regression equa­
tion Ŷ = 1.14X + 3.58Z + 2.58XZ + 2.54 was rearranged to show the
regression of Y on X at levels of Z:

Ŷ = (1.14 + 2.58Z)X + (3.58Z + 2.54)     (2.3)

Then, following Cohen and Cohen (1983), values of Z were chosen to be
one standard deviation below the mean (ZL = -2.20), at the mean (ZM
= 0), and one standard deviation above the mean (ZH = 2.20). Simple
regression lines were then generated by substituting these values (-2.20,
0, 2.20) in turn into equation 2.3. For example, to generate the simple
regression for ZH = 2.20, the following substitution was made:

Ŷ = [1.14 + 2.58(2.20)]X + [3.58(2.20) + 2.54]

  = 6.82X + 10.41.

The results of the computations of the simple regression equations at ZH,
ZM, and ZL are given in Table 2.2a. The simple regression equations indicate a
positive regression of Y on X for ZH, a negative regression of Y on X for


Table 2.2
Simple Regression Equations for Centered and Uncentered Data

a. Regression of Y on X at Particular Values of Z for Centered Data
   In general: Ŷ = (1.14 + 2.58Z)X + (3.58Z + 2.54)
   At ZH =  2.20: Ŷ =  6.82X + 10.41
   At ZM =  0.00: Ŷ =  1.14X +  2.54
   At ZL = -2.20: Ŷ = -4.54X -  5.33

b. Regression of Y on X' at Particular Values of Z' for Uncentered Data
   In general: Ŷ = (-24.68 + 2.58Z')X' + (-9.33Z' + 90.15)
   At ZH' = 12.20: Ŷ =  6.82X' - 23.67
   At ZM' = 10.00: Ŷ =  1.14X' -  3.15
   At ZL' =  7.80: Ŷ = -4.54X' + 17.38

NOTE: Regression equation rearranged to show the regression of Y on X at levels of
Z: Ŷ = (b1 + b3Z)X + (b2Z + b0).

ZL, and essentially no relationship between X and Y for ZM. Figure 2.1a
reveals a complex pattern of regression of Y on X depending on the level
of Z. If only the nonsignificant b1 coefficient in Table 2.1c(ii) had been
examined, it would have been concluded that there was no relationship
of X to Y.
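The simple regression lines plotted in Figure 2.1a and listed in Table 2.2a follow directly from the coefficients of Table 2.1c(ii). A short computational check (Python used only for illustration):

# Evaluate Y-hat = (b1 + b3*Z)X + (b2*Z + b0) at ZL, ZM, and ZH using the
# unrounded coefficients reported in the SPSS-X output of Table 2.4a(iii).
b1, b2, b3, b0 = 1.136404, 3.577193, 2.581445, 2.537403
sd_z = 2.20

for label, z in [("ZH", sd_z), ("ZM", 0.0), ("ZL", -sd_z)]:
    slope = b1 + b3 * z
    intercept = b2 * z + b0
    print(f"at {label} = {z:5.2f}: Y-hat = {slope:6.2f} X + {intercept:6.2f}")
# prints 6.82X + 10.41, 1.14X + 2.54, and -4.54X - 5.33, matching Table 2.2a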

Post Hoc Probing

Once plotting is accomplished, two questions that parallel the probing


of ANOVA interactions with simple effects may be asked: (a) For a spec­
ified value of Z, is the regression of Y on X significantly different from
zero, and (b) for any pair of simple regression equations, do their slopes
differ from one another? (Similar questions may also be asked for the
regression of Y on Z for each value of X.)
Is the Slope of the Simple Regression Line Significantly Different
From 0?

The approach to probing interactions prescribed here permits the testing


of the significance of the simple slopes of regression lines at single values
of a second predictor. The approach was described in Friedrich (1982;
see also Darlington, 1990; Jaccard, Turrisi, & Wan, 1990) for the case


[Figure 2.1. Interaction Plotted from Centered and Uncentered Equations.
(a) Simple regression lines from the centered analysis: Ŷ = 1.14X + 3.58Z + 2.58XZ + 2.54.
(b) Simple regression lines from the uncentered analysis: Ŷ = -24.68X' - 9.33Z' + 2.58X'Z' + 90.15.]


of the XZ interaction. It involves the calculation of the standard errors of


the simple slopes of simple regression equations. Then t-tests for the sig-
nificance of the simple slopes are computed. We initially provide an over­
view and numerical example of this strategy. We then show how these
tests of simple slopes can easily be accomplished with available regres­
sion analysis computer programs. An optional section is presented at the
end of this chapter for more advanced readers who wish a general deri­
vation of standard errors of simple slopes.
To calculate the standard error of the simple slope we use values from
the variance-covariance matrix of the regression coefficients.3 Estimates
of these values are produced by regression programs in standard statistical
packages such as SPSS-X and SAS. Specific elements from this matrix
corresponding to terms in the simple slope are given a weight and then
combined to produce the estimate of the standard error of the simple
regression coefficient.
Returning to the simple slope (b1 + b3Z) in equation 2.2, its standard
error is given as

sb = √(s11 + 2Z s13 + Z² s33)     (2.4)

The values s11 and s33 are the variances of b1 and b3, respectively, taken
from Sb, the sample estimate of the variance-covariance matrix of the
regression coefficients; s13 is the covariance between b1 and b3 taken from Sb. As the value
of Z varies in the simple slope, the value of the standard error in equation
2.4 varies as well. Note that equation 2.4 pertains only to the simple slope
(b1 + b3Z) in equation 2.2.

t-Tests for Simple Slopes. The t-test for whether a simple slope differs
from zero is simply the value of the simple slope divided by its standard
error with (n - k - 1) degrees of freedom, where n is the number of
cases and k is the number of predictors, not including the regression constant (here k = 3).

Numerical Example. The covariance matrix of regression coefficients


Sb is given in Table 2.3a for the simulated data set with centered predic­
tors. This matrix was obtained from the regression program of the
SPSS-X package applied to the centered data set.4 Values from this matrix
are used in computing the standard errors of the simple slopes. Recall that
in Figure 2.1a and Table 2.2a, the simple slopes for the regression of Y
on X at ZH, ZM, and ZL were 6.82, 1.14, and -4.54, respectively. Equation 2.4 yields the standard errors associated with each slope. Each computa-

tion involves three elements from Sb: s11 = 2.35, s13 = -0.08, and
s33 = 0.40. Substituting ZH = 2.20 into equation 2.4 yields
the standard error of the simple slope of Y on X at ZH:

sH = √[2.35 + 2(2.20)(-0.08) + (2.20)²(0.40)] = √3.93 = 1.98

Similar substitutions of ZM = 0 and ZL = -2.20 into equation 2.4 yield the estimated
standard errors sM = 1.53 and sL = 2.15 reported in Table 2.3a.
Finally, in Table 2.3a, the t-tests of each simple slope against zero are

Table 2.3
Computation of Standard Errors and t-Tests for Simple Slopes

a. Analysis of Centered Data

(i) Covariance matrix of regression coefficients

   Sb:
            b1       b2       b3
   b1      2.35    -0.41    -0.08
   b2     -0.41     0.43    -0.00
   b3     -0.08    -0.00     0.40

(ii) Simple slopes, standard errors of simple slopes, and t-tests

   Simple slope      Standard error      t-test
   bH =  6.82        sH = 1.98           t =  6.82/1.98 =  3.45**
   bM =  1.14        sM = 1.53           t =  1.14/1.53 =  0.74
   bL = -4.54        sL = 2.15           t = -4.54/2.15 = -2.11*

b. Analysis of Uncentered Data

(i) Covariance matrix of regression coefficients

   Sb':
            b1'      b2'      b3'
   b1'    43.88    19.96    -4.07
   b2'    19.96    10.42    -2.00
   b3'    -4.07    -2.00     0.40

(ii) Simple slopes, standard errors of simple slopes, and t-tests

   Simple slope      Standard error      t-test
   bH' =  6.82       sH' = 1.98          t =  6.82/1.98 =  3.45**
   bM' =  1.14       sM' = 1.53          t =  1.14/1.53 =  0.74
   bL' = -4.54       sL' = 2.15          t = -4.54/2.15 = -2.11*

**p < .01; *p < .05


provided. These tests confirm the positive regression of Y on X at ZH and the
negative regression of Y on X at ZL; the regression of Y on X at ZM does

not differ from 0.

The Regression of Y on Z. The above presentation and numerical ex­


ample pertain to the regression of Y on X at levels of Z. If instead we
were interested in the regression of Ton Z at levels of X, the simple slope
equation would be expressed as Y = (b2 + b3X)Z + (btX + b0). For
this equation the simple slope is (b2 + b3X) and its standard error is

sb = 5/^22 + 2Хг2з + *1
2^зз
3 (2-5)

Once again, the t-test for whether a simple slope differs from zero is
simply the value of the simple slope divided by its standard error with
(n - k - 1) degrees of freedom.
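Equation 2.4 (and its analog, equation 2.5) is simple to evaluate once the covariance matrix of the regression coefficients is in hand. The following sketch reproduces the standard errors of Table 2.3a from the elements of Sb reported there; Python is used purely for illustration, whereas the book's own examples use SPSS-X and SAS:

import numpy as np

# Standard errors and t-tests of simple slopes (equation 2.4), using the
# coefficient variances and covariance reported in Table 2.3a.
b1, b3 = 1.14, 2.58
s11, s13, s33 = 2.35, -0.08, 0.40      # var(b1), cov(b1, b3), var(b3)
n, k = 400, 3                          # df = n - k - 1 = 396

for label, z in [("ZH", 2.20), ("ZM", 0.0), ("ZL", -2.20)]:
    slope = b1 + b3 * z
    se = np.sqrt(s11 + 2*z*s13 + z**2 * s33)
    t = slope / se
    print(f"{label}: slope = {slope:6.2f}, se = {se:4.2f}, t({n - k - 1}) = {t:5.2f}")
# reproduces sH = 1.98, sM = 1.53, sL = 2.15 of Table 2.3a (t-values agree to rounding)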

Simple Slope Analysis by Computer. A computer procedure using stan­


dard regression programs can be used to perform the entire simple slope
analysis (Darlington, 1990, personal communication 1990; Judd &
McClelland, 1989). Here we confine the presentation to the regression of
Ton X at values of Z in equation 2.2. The procedure generalizes to more
complex equations. Examples of its use are presented throughout the text.
Following Darlington (1990), we use the term conditional value of Z
(CVZ) to refer to the specific value of Z at which the regression of Y on
X is considered. Suppose we seek the simple slope for the regression of
Y on X at ZH = 2.20, one standard deviation above the mean of Z; then
CVZ = 2.20. To carry out the simple slope analysis by computer, pre­
dictor Z is transformed to a new variable ZCV by subtracting CVZ from Z
(i.e., Zcv = Z - CVZ). Then Zcv is used in the regression analysis in
lieu of Z. The value of b1 from that analysis is the simple slope of the
regression of Y on X at Z = 2.20.
In sum, three steps are required to carry out the simple regression anal­
ysis by computer:
1. Create a new variable Zcv, which is the original variable Z minus the con­
ditional value of interest, that is, Zcv = Z - CVZ;
2. form the crossproduct of the new variable with predictor X, that is,
(X)(ZCV); and

3. regress the criterion Y on X, Zcv, and (X)(ZCV).

The resulting regression coefficient will be the desired simple regres­


sion coefficient of Y on X at the conditional value CVZ of Z. The regres-


sion constant (intercept) from that analysis will be that for the simple
regression equation. The standard error of b1 will be the standard error of
the simple slope of Y on X at CVZ, and the t-test will be that for the simple
slope.
Similarly, if the simple slope of Y on X one standard deviation below
the mean of centered Z is sought (here CVZ = -2.20), then once again
a new variable ZCV = Z - CVZ is calculated, here ZCV = Z - (-2.20),
and the regression of Y on X, ZCV, and (X)(ZCV) is performed. The resulting b1 term, its standard error, and t-test form the simple slope analysis
one standard deviation below the mean of Z, at CVZ = -2.20.
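The same three steps can be scripted in any regression package. A sketch in Python (assuming X, Z, and Y are arrays holding the centered data; the function names here are illustrative, not part of any package):

import numpy as np

# Sketch of the three-step computer procedure: subtract the conditional value
# CVZ from Z, rebuild the product term, and refit the regression. The
# coefficient on X (and its standard error and t) from the refitted equation
# are those of the simple slope of Y on X at Z = CVZ.
def ols(design, y):
    # OLS coefficients and standard errors; design already contains a constant.
    xtx_inv = np.linalg.inv(design.T @ design)
    b = xtx_inv @ design.T @ y
    resid = y - design @ b
    df = len(y) - design.shape[1]
    cov_b = xtx_inv * (resid @ resid / df)
    return b, np.sqrt(np.diag(cov_b))

def simple_slope(X, Z, Y, cvz):
    z_cv = Z - cvz                                                   # step 1
    design = np.column_stack([X, z_cv, X * z_cv, np.ones(len(Y))])   # step 2
    b, se = ols(design, Y)                                           # step 3
    return b[0], se[0], b[0] / se[0]                                 # slope, SE, t

# e.g., simple_slope(X, Z, Y, 2.20) gives the regression of Y on X at ZH.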
Table 2.4 provides SPSS-X computer output for the simple slope anal­
ysis reported in Table 2.3a. First, the overall regression analysis with
'. centered X, Z, and XZ is given, replicating Table 2.1c(ii), and two trans­
formed variables are calculated:

1. ZABOVE = Z - (2.20), for the regression of Y on X at CVZ = 2.20, one


standard deviation above the mean of Z, and
2. ZBELOW = Z - (-2.20), for the regression of Y on X at CVZ = -2.20,
one standard deviation below the mean of Z.

Second, their crossproducts with X are calculated: XZABOVE and


XZBELOW. Third, the two regression analyses described above are per­
formed. In Table 2.4c, the regression analysis involving X, ZBELOW,
and XZBELOW is reported. The regression coefficient b\ and constant bQ
equal those for the simple regression equation of У on X at ZL in Table
2.3a; the standard error and r-test are identical as well. In Table 2.4d, the
regression analysis involving X, ZABOVE, and XZABOVE is reported
and corresponds to the simple regression of У on X at ZH in Table 2.3a.
This computer approach to simple slope analysis generalizes to more
complex regression analyses, for example, those involving three linear
predictors and their interactions and those involving higher order poly­
nomial terms (e.g., X2, X2Z), considered in Chapters 4 and 5, respec­
tively.
Do the Slopes of a Pair of Simple Regression Lines Differ From One
Another?
Having determined for which values of Z the regression of Y on X is
different from zero, the investigator may wish to determine whether the
simple slope of Y on X differs at two values of Z, say ZH versus ZL, as
previously defined. The simple slopes in question are (b1 + b3ZH) versus
(b1 + b3ZL), and their difference is simply a function of b3, that is, d =

b3ZH - b3ZL = b3(ZH - ZL). The t-test of the difference between the
slopes is given by

t = d/sd = b3(ZH - ZL) / √[(ZH - ZL)² s33] = b3/√s33

Note that this is identical to the t-test for the significance of the b3 coefficient in the overall analysis. In other words, given that Z is a continuous
variable, the significance of the b3 coefficient in the overall analysis in­

Table 2.4
Computation of Simple Slope Analysis by Computer for the XZ Interaction in
the Regression Equation Ŷ = b1X + b2Z + b3XZ + b0

a. Overall Analysis with Centered X and Centered Z

(i) Means and standard deviations

                 Mean      Std Dev
   Y             4.759      28.019
   X             0.000       0.945
   Z             0.000       2.200
   XZ            0.861       2.086

(ii) Variance-covariance matrix of regression coefficients (b)
     Below diagonal: covariances; above diagonal: correlations

                X           Z           XZ
   X         2.34525    -0.41324    -0.08489
   Z        -0.41469     0.42938    -0.00453
   XZ       -0.08211    -0.00187     0.39895

(iii) Regression analysis

   Variable          B          SE B         T       Sig T
   X              1.136404    1.531420     0.742     .4585
   Z              3.577193    0.655271     5.459     .0000
   XZ             2.581445    0.631627     4.087     .0001
   (Constant)     2.537403    1.418212     1.789     .0744

b. Computation of ZABOVE, ZBELOW, and crossproduct terms required for simple
slope analysis

   COMPUTE ZABOVE = Z - 2.20
   COMPUTE ZBELOW = Z - (-2.20)
   COMPUTE XZABOVE = X*ZABOVE
   COMPUTE XZBELOW = X*ZBELOW

c. Regression Analysis with ZBELOW and XZBELOW, Yielding Simple Slope
Analysis at ZL (Regression of Y on X One Standard Deviation Below the Mean of Z)

(i) Means and standard deviations

                 Mean      Std Dev
   Y             4.759      28.019
   X             0.000       0.945
   ZBELOW        2.200       2.200
   XZBELOW       0.861       3.082

(ii) Regression analysis

   Variable          B          SE B         T       Sig T
   X             -4.542777    2.153481    -2.110     .0355
   ZBELOW         3.577193    0.655271     5.459     .0000
   XZBELOW        2.581446    0.631627     4.087     .0001
   (Constant)    -5.332421    2.020502    -2.639     .0086

d. Regression Analysis with ZABOVE and XZABOVE, Yielding Simple Slope
Analysis at ZH (Regression of Y on X One Standard Deviation Above the Mean of Z)

(i) Means and standard deviations

                 Mean      Std Dev
   Y             4.759      28.019
   X             0.000       0.945
   ZABOVE       -2.200       2.200
   XZABOVE       0.861       2.801

(ii) Regression analysis

   Variable          B          SE B         T       Sig T
   X              6.815584    1.978606     3.445     .0006
   ZABOVE         3.577193    0.655271     5.459     .0000
   XZABOVE        2.581446    0.631627     4.087     .0001
   (Constant)    10.407229    2.024012     5.142     .0000

dicates that the regression of Y on X varies across the range of Z; no
further test is required of whether the simple slopes of Y on X differ from one
another as a function of the value of Z.
A Caution Concerning the Use of Simple Slope Tests
When simple slopes are evaluated using a priori values or population-based
values, as in the example of the clinical diagnostic test described

CamScanner
У) MULTIPLE REGRESSION
Ла Ла

above, the procedures described in this chapter provide proper values of


the test statistics. Similarly. when a high or low value of Z is chosen on
the basis of Cohen and Cohen’s (1983) or other guidelines and the re­
searcher’s interest is in making a statement about the simple slope at that
specific numeric value, the lest statistics arc unbiased. These cases are
the ones most likely to be encountered by researchers.
In contrast, if the researcher’s interest centers on making inferences
about the simple slope at a specific population-based value (e.g., the pop­
ulation mean; one standard deviation above the mean in the population),
the t-tests of simple slopes described in this and subsequent chapters are
positively biased. The magnitude of this bias decreases with increasing
sample size. Two remedies for this problem have been summarized by
West and Aiken (1990): (a) A procedure developed by Lane (1981) to
provide conservative tests of the value of the simple slopes may be used,
(b) Bootstrapping techniques (Darlington, 1990, personal communication
1990; Stine, 1990) may be used to provide empirical estimates of the
standard error. At the present time we know of no systematic investigation
of the extent of bias and the adequacy of the proposed remedies.
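Neither remedy is worked out in the text. As one illustration of the general idea behind remedy (b), a case-resampling bootstrap of the simple slope at one sample standard deviation above the mean of Z might look like the sketch below; this is an assumption-laden illustration in Python, not the procedure described by Darlington (1990) or Stine (1990):

import numpy as np

# Case-resampling bootstrap estimate of the standard error of the simple
# slope at "one SD above the mean of Z", where that value is re-estimated
# within each bootstrap sample. Assumes X, Z, Y are numpy arrays.
def fit_b(X, Z, Y):
    design = np.column_stack([X, Z, X * Z, np.ones(len(Y))])
    b, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return b                                   # b1, b2, b3, b0

def boot_simple_slope_se(X, Z, Y, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(Y)
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample cases with replacement
        xb, zb, yb = X[idx], Z[idx], Y[idx]
        b = fit_b(xb, zb, yb)
        slopes[i] = b[0] + b[2] * zb.std(ddof=1)   # simple slope at ZH per resample
    return slopes.std(ddof=1)                  # bootstrap standard error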

Ordinal Versus Disordinal Interactions

A useful distinction, borrowed from the ANOVA literature, is the clas­


sification of interactions as being disordinal versus ordinal (Lubin, 1961),
or, equivalently, crossover versus noncrossover, respectively. According
to this descriptive classification, the interaction is ordinal (noncrossover)
when the simple regression lines (or lines representing levels of one factor
with categorical variables) for an interaction do not cross within the pos­
sible range of the values of the other variable. Conversely, the interaction
is disordinal (crossover) when the simple regression lines cross within the
possible range of values of the other variable. For example, if the scale
measuring managerial ability in our example had a potential range of val­
ues from 1 to 7 and the crossing point were at 5, the interaction would be
disordinal (crossover). In contrast, if the lines were to cross at -2 or at
12, the interaction would be ordinal.
Potential difficulties with this descriptive classification can develop in
the MR context with interactions between continuous variables that have
no obvious extreme points that define the ends of the continuum. One
approach to such cases is for the researcher to examine the calculated
crossing point relative to the actual range of the data. Interactions whose


crossing point falls outside the actual range of values on X are classified
as being ordinal, whereas those whose crossing point falls inside the ac-
tual range of values on X are classified as being disordinal. An alternative
approach in the absence of scale-based or data-based criteria is for the
researcher to consider a meaningful range of the variable in terms of the
system being characterized by the regression equation; this meaningful
range has been referred to as the dynamic range of the variable in the
context of sensory systems such as vision or audition (see Teghtsoonian,
1971). Those interactions for which the lines crossed within the meaning­
ful range of the variable would be termed disordinal, whereas other in­
teractions whose crossing point fell outside this range would be termed
ordinal.
The reader should bear in mind that the classification of an interaction
as ordinal versus disordinal is always with regard to a particular config­
uration of variables. An interaction may be ordinal in one direction, say
the regression of Y on X at values of Z, and disordinal in the other direc­
tion (Y on Z at values of X). The question of whether to characterize an
interaction in terms of Y and X at values of Z or in terms of Y and Z at
values of X may be driven by theory and the specific predictions to be
tested. For example, most theoretical discussions present life stress (X)
as the predictor of health (Y), with social support (Z) being described as
the variable that moderates this relationship; hence the regression of Y on
X at values of Z is considered. However, in general, it is useful to examine
both the regression of Y on X at levels of Z, and Y on Z at levels of X.
Both castings provide potentially informative and complementary two-
dimensional representations of what is in reality a three-dimensional
regression surface.

Determining the Crossing Point in an Interaction

The point at which two simple regression lines cross can be determined
algebraically. For the regression of Y on X at values of Z, the simple
regression equation is written at two specific values of Z, say ZH and ZL,
yielding two simple regression equations:

YH = (b1 + b3ZH)X + (b2ZH + b0)

YL = (b1 + b3ZL)X + (b2ZL + b0)

The two equations are set equal to one another to determine the expression
for the point at which the lines represented by these equations cross.


Here,

Xcross = -b2/b3     (2.6)

Note that if there is no interaction (i.e., b3 = 0), the simple regression


lines do not cross.
The crossing point for the simple regression of Y on Z at values of X
can be derived in a parallel manner. This crossing point is Zcross =
-b1/b3.

Numerical Example

For the numerical example in Table 2.1c(ii) with b2 = 3.58 and b3 =
2.58, the simple regressions of Y on X at values of Z cross at the value
Xcross = -3.58/2.58 = -1.39. For centered X (mean 0.0 and sx = 0.95)
the simple regression lines cross (-1.39 - 0.0)/0.95 = -1.47 standard
deviations below the mean. Figure 2.1 portrays the interaction, but with
the range of X limited to one standard deviation on either side of its mean.
The interaction is ordinal within this range. If this is the meaningful range
of X, then we recommend classifying the interaction as ordinal and ex­
plaining the limits of X considered in so doing.
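To make this arithmetic concrete, the short sketch below (not from the original text) computes the crossing point -b2/b3, expresses it in standard-deviation units of centered X, and flags the interaction as disordinal only if the crossing point falls inside a user-supplied meaningful range; the function name and the example range are illustrative.

def classify_interaction(b2, b3, x_mean, x_sd, x_range):
    """Crossing point of the simple regressions of Y on X, and ordinal/disordinal status."""
    x_cross = -b2 / b3                        # equation 2.6
    sd_units = (x_cross - x_mean) / x_sd      # crossing point in SD units of X
    lo, hi = x_range
    status = "disordinal" if lo <= x_cross <= hi else "ordinal"
    return x_cross, sd_units, status

# Values from the numerical example: b2 = 3.58, b3 = 2.58, centered X with sd 0.95,
# and a meaningful range of one SD on either side of the mean of X.
print(classify_interaction(3.58, 2.58, 0.0, 0.95, (-0.95, 0.95)))
# crossing point about -1.39, roughly 1.5 SD below the mean of X: 'ordinal' within this range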

Optional Section: The Derivation of Standard Errors of


Simple Slopes
In this section we provide the general derivation of standard errors of
any simple slope in any ordinary least squares (OLS) regression equation
of the form

Y = b1X + b2Z + ... + bpW + b0

that is, any regression equation that is linear in the regression coefficients
(Kmenta, 1986). The approach pertains to all of the more complex regres-
sion equations in this book. Readers familiar with matrix algebra should
find the exposition straightforward. For readers unfamiliar with matrix
algebra, the general form of the expression for the variance of a simple
slope (square of the standard error) is given in equation 2.10 below.
The starting point for the derivation of standard errors of simple slopes
is the observation that each simple slope is a linear combination of the
original regression coefficients in the equation. In the equation
Y = (b1 + b3Z)X + (b0 + b2Z), the simple slope for the regression of Y on X is

(b1 + b3Z). Using the known properties of linear combinations, we can
derive the sampling variance of the simple slope (b1 + b3Z).
Consider any linear combination U of variables b1 ... bp, weighted
by w1 ... wp, respectively. In vector equation form, this may be ex-
pressed as U = w'b, or equivalently in algebraic form

U = w1b1 + w2b2 + ... + wpbp.

Here the regression coefficients b' = [b1 b2 ... bp] are the elements of
the combination and w' = [w1 w2 ... wp] are the weights that define
the combination. The variance of the combination is a function of Σb,
the variance-covariance matrix of the elements b1 ... bp, and of the
weights themselves, as given by the quadratic form

σ²U = w'Σb w     (2.7)

As already explained, in the case of usual ordinary least squares regres-
sion analysis, the variance-covariance matrix required is that of the
regression coefficients themselves, Sb, the sample estimate of Σb. Under
the typical assumptions of ordinary least squares (OLS) regression, namely
normally distributed residuals ei with mean zero [E(ei) = 0] and variance
σ²e, the least squares estimates of the regression coefficients are normally
distributed, with the estimate of their variance-covariance matrix(5) given
by the matrix equation Sb = MSres S⁻¹XX. In this equation, MSres is the
mean square residual from the overall analysis of regression (ANOReg),
and S⁻¹XX is the inverse of the covariance matrix of predictors. (See Mad-
dala, 1977, for a clear exposition of the sampling properties of OLS es-
timators in multiple regression analysis, and Morrison, 1976, for com-
ments on the properties of linear combinations of normally distributed
variables.)
The simple slope is written as a linear combination of all the coeffi-
cients in the equation except the constant b0. For (b1 + b3Z), it is re-
written as U = (1)b1 + (0)b2 + (Z)b3, with the weight vector w' =
[1 0 Z]. At any particular value of Z, the sample estimate of the variance
of the simple slope (b1 + b3Z) is then given as

s²b = [1 0 Z] Sb [1 0 Z]'     (2.8)


In this equation s11 is the variance of the estimate of regression coefficient
b1, and s13 is the covariance between the estimates of b1 and b3. Complet-
ing the multiplication yields the following expression for the variance of
the simple slope of Y on X at values of Z in equation 2.2. This is the
square of the standard error given in equation 2.4.

s²b = s11 + 2Z s13 + Z² s33     (2.9)

For the regression of Y on Z at values of X, the simple slope is (b2 +
b3X), and hence w' = [0 1 X]. With this weight vector used in place of
the vector [1 0 Z], the variance of the simple slope of Y on Z at values of
X is generated; it is the expression under the radical in equation 2.5.
Equation 2.7, the general equation for the variance of any linear com-
bination of regression coefficients, may also be expressed in algebraic
form (Stolzenberg & Land, 1983, p. 657). In terms of population values,
the general expression is as follows:

σ²b = Σ (j = 1 to k) w²j σjj  +  Σ Σ (i = 1 to k, j ≠ i) wi wj σij     (2.10)

where

k is the total number of regression coefficients for predictors excluding b0 in
the equation (for equation 2.1, k = 3);
wj are the weights used to define the combination in the weight vectors defined
above, and w²j are their squares;
σjj is the variance of regression coefficient bj, of which sjj is the sample estimate,
as in equation 2.5 above;
σij is the covariance between two regression coefficients bi and bj, of which sij
is the sample estimate, as in equation 2.5 above.

Throughout Chapters 4, 5, and 7 we state equations for the variances
of simple slopes in a variety of regression equations. All the equations
follow the form of equation 2.10. As we work through various regression
equations, we will indicate the particular weight vectors involved in de-
riving the simple slope variances. The reader may then follow equation
2.7 in matrix form or, equivalently, 2.10 in algebraic form, to confirm
the formulas for the simple slope variances.

Summary
This chapter addressed the probing of significant interactions between
two continuous variables X and Z in the regression equation Y = b1X +
b2Z + b3XZ + b0. First, the regression equation was rearranged to show
the regression of the criterion on X at values of Z; the simple slope of that
regression equation was defined. Post hoc probing of the interaction began
with prescriptions for plotting the interaction. A t-test for the significance
of the simple slopes was presented together with a simple, computer-based
method for performing this test. The distinction between ordinal and
disordinal (noncrossover versus crossover) interactions was presented for
the interaction between two continuous variables, and the procedure for
determining the crossing point of simple regression lines was illustrated.
Finally, a more advanced optional section presented a general derivation
of the standard errors of simple slopes in any OLS regression equation.

Notes

1. In practice there will typically be little difference between the b1 and b2 coefficients
in the regression equations containing the interaction [e.g., Table 2.1c(ii)] and those coef-
ficients in the regression equation not containing the interaction [e.g., Table 2.1c(i)] if
predictors X and Z are centered and have an approximately bivariate normal distribution.
2. In much of the discussion we refer to the regression of Y on X at values of Z. The
interaction may just as well be cast in terms of the regression of Y on Z at values of X. Our
exposition is often confined to Y on X and values of Z for simplicity.
3. Conceptually, the population variance-covariance matrix of the regression coeffi-
cients, Σb, can be understood as follows. Imagine computing equation 2.1 for an infinite
number of random samples from a given population. The variance of each regression coef-
ficient (e.g., b1) across all the samples would be on the main diagonal of Σb. The covariance
between pairs of regression coefficients (e.g., b1 with b2) across all samples would be the
off-diagonal entries.
4. In SPSS-X the covariances among the estimates (s12, s13, s23) and the correlations among
the estimates are printed in the same matrix, with variances of the estimates (s11, s22, s33)
on the main diagonal, covariances below the diagonal, and correlations above the diagonal.
This matrix is obtained in SPSS-X REGRESSION with the BCOV keyword on the STA-
TISTICS subcommand. To form the covariance matrix Sb in Table 2.3a(i), we have placed
the covariances both below and above the diagonal; SPSS users should be certain to do so
as well.
The covariance matrix of the estimates is obtained in SAS from PROC REG with the
keyword COVB on the MODEL statement. The Sb matrix obtained in SAS contains the
variances and covariances, just as in Table 2.3a(i). No modification of this matrix is needed.

5. In general, the covariance matrix of the parameters is the inverse of the Fisher informa-
tion matrix (Rao, 1973).

3  The Effects of Predictor Scaling on
Coefficients of Regression Equations

In Chapter 1, we introduced the problem of the lack of invariance of


regression coefficients in equations containing interactions even under
simple linear transformations of the data. However, to this point, we have
not specifically addressed this problem, which has led to considerable
confusion in the literature (see discussions by Friedrich, 1982; Schmidt,
1973; Sockloff, 1976). In this chapter we explore the problem both al-
gebraically and by numerical example for the case of the regression equa­
tion containing one XT interaction term. After the consideration of scaling
effects to provide the reader with the necessary understanding of centered
versus uncentered solutions, we then examine the interpretation of each
of the regression coefficients in equation 2.1, Y = b1X + b2Z + b3XZ
+ b0. Finally, we explore the relationship between the centered solution
and several potential standardized solutions, showing that only the pro­
cedure proposed by Friedrich (1982) produces a fully interpretable stan­
dardized solution.

The Problem of Scale Invariance

The problem of lack of scale invariance under linear transformation of


predictors is shown algebraically in this section. The scale transformation
we consider is rescaling by additive constants (i.e., adding or subtracting
28


constants from predictor scores). For example, suppose we have a raw


score X'; we subtract the mean (a constant) from each score, yielding the
centered score X. Or, we rescale a variable that ranges from -3 to +3
by adding 4 to each score, so that resulting scores will range from 1 to 7.
In our usual experience such rescaling has no effect on the correlational
properties of the rescaled variables, and hence will have no effect on lin­
ear regression. This is the desired state of affairs: We want the solution
to be identical for the original raw and transformed variables. However,
when there arc interactions in the regression equation, simply rescaling
by additive constants has a profound effect on regression coefficients.
’ We examine rescaling effects using the regression equation presented
in Chapter 2:

Y = b1X + b2Z + b3XZ + b0     (3.1)

or as rewritten to show the regression of Y on X at values of Z:

Y = (b1 + b3Z)X + (b2Z + b0)     (3.2)

There are four outcomes of rescaling by additive constants that will be


shown algebraically for this equation using the approach of Cohen (1978).

1. In the case of linear regression with no higher order terms, that is, b3 = 0
   in equation 3.1, rescaling by additive constants has no effect on the value
   of the regression coefficients.
2. In regression equations containing at least one higher order term, rescaling
   by additive constants leads to changes in all regression coefficients except
   for the highest order term.
3. Simple slopes of simple regression equations are unaffected by additive
   transformations.
4. Under additive scale transformation the interpretation of the interaction as
   ordinal versus disordinal remains unchanged. The important implication of
   this exposition is that our prescriptions for plotting and post hoc probing
   of interactions between continuous variables do not suffer from the problem
   of lack of invariance, even though coefficients in the overall regression
   equation do.

Linear Regression with no Higher Order Terms


Transformation by additive constants has no effect on regression coef-
ficients in equations containing only first order terms. To show this al-


gebraically, we take the simple regression equation

Y = b1X + b2Z + b0     (3.3)

and define two new variables X' = X + c and Z' = Z + f, where c and
f are additive constants. (Note that if c and f are the arithmetic means of
X and Z, respectively, and X and Z represent centered variables, then X'
and Z' represent the uncentered forms of these same variables). We re-
write the original centered variables as X = X' - c and Z = Z' - f and
substitute these values into the simple regression equation, yielding:

Y = b1(X' - c) + b2(Z' - f) + b0     or

Y = b1X' + b2Z' + (b0 - b1c - b2f)     (3.4)

Here, the coefficients b1 and b2 for the uncentered first order (X' and Z')
terms are identical with those in equation 3.3 based on centered X and Z.
Only the regression intercept (b0 - b1c - b2f) is changed from its orig-
inal value.

Regression Equations with Higher Order Terms

It is quite a different matter if the regression equation contains an in­


teraction or other higher order terms. Substituting the expressions X =
X' - c and Z = Z' - f into equation 3.1 and collecting terms yields the
following:

Y = (b1 - b3f)X' + (b2 - b3c)Z' + b3X'Z'
    + (b0 - b1c - b2f + b3cf)     (3.5)

Note that the original b1 coefficient of equation 3.1 becomes b'1 = (b1 -
b3f), the original b2 coefficient of equation 3.1 becomes b'2 = (b2 -
b3c), and the regression constant b0 becomes (b0 - b1c - b2f + b3cf).
Only the interaction coefficient does not change: b3 thus retains its original
value and interpretation.(1) The change in first order coefficients produced
by linearly rescaling X and Z occurs when there is a nonzero interaction
between these variables. The covariances between the interaction term XZ
and each component (X and Z) depend in part upon the means of the
individual predictors. Rescaling changes the means, thus changes predic-
tor covariances, resulting in changes in b1 and b2 for the predictors con-


tained in the higher order function. This is true even if the individual
predictors, X and Z, are uncorrelated with one another.(2) (The interested
reader is referred to Appendix A, which provides algebraic expressions
for the mean and variance of crossproduct terms XZ in terms of the means,
variances, and covariances of its components X and Z. The covariance
between a crossproduct term and its components is also explored.)

Simple Slopes of Simple Regression Equations

The simple slopes that are calculated from the interaction also remain
constant under additive scale transformation. To see this algebraically,
recall from equation 3.2 that (b1 + b3Z) is the general form for the simple
slopes of Y on X at values of Z from equation 3.1. Let us once again use
the expressions X = X' - c and Z = Z' - f, and substitute them into
equation 3.2 for the regression of Y on X at levels of Z:

Y = [b1 + b3(Z' - f)](X' - c) + [b2(Z' - f) + b0]     (3.6)

Expanding and collecting terms yields

Y = (b1 + b3Z' - b3f)X'
    + (-b1c - b3cZ' + b3cf + b2Z' - b2f + b0)     (3.7)

In order to compare the value of the simple regression coefficient (b1 +
b3Z) of equation 3.2 with the simple regression coefficient (b1 + b3Z'
- b3f) in equation 3.7, we substitute the expression Z' = Z + f into
equation 3.7 with the result that

Y = (b1 + b3Z)X' + (-b1c + b2Z - b3cZ + b0)     (3.8)

Note that the simple regression coefficient (b1 + b3Z) for the regression
of Y on X' at values of Z does not change from equation 3.2 to 3.8;
rescaling predictors changes the regression constants but not the regres-
sion coefficients of the simple regression equations.

Ordinal Versus Disordinal Interactions


Once we have rescaled X and Z, the point at which the simple regres
sion lines cross will move by the same factor as the additive constants c


and f for X and Z, respectively. To show this, we use the expression b'2
= (b2 - b3c) from equation 3.5 and recall that, for uncentered versus
centered equation 3.1, b'3 = b3. Substituting these expressions into equa-
tion 2.6, we find that the regression lines cross at the value

X'cross = -b'2/b'3 = -(b2 - b3c)/b3 = -b2/b3 + c     (3.9)

Thus transforming X by an additive constant moves the crossing points of


the simple regression lines of Y on X by precisely the same constant.
Hence the status of the interaction as ordinal versus disordinal is inde-
pendent of the predictor scaling.
The algebraic relationships we have shown allow us to reach an im­
portant conclusion: Any additive transformation of the original variables
has no effect on the overall interaction or on any aspect of the interaction
we might choose to examine.
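These invariance results are easy to check numerically. The sketch below (not part of the original text) simulates a small data set, fits the interaction equation with uncentered and then with centered predictors, and compares the highest order coefficient and the simple slope at one standard deviation above the mean of Z. All names and the simulated data are illustrative.

import numpy as np

rng = np.random.default_rng(1)
n = 400
xu = rng.normal(5, 1, n)                   # uncentered X'
zu = rng.normal(10, 2, n)                  # uncentered Z'
y = 1.0 * xu + 3.5 * zu + 2.5 * xu * zu + rng.normal(0, 20, n)

def fit(x, z, y):
    """OLS for Y = b1*X + b2*Z + b3*XZ + b0; returns (b1, b2, b3, b0)."""
    X = np.column_stack([x, z, x * z, np.ones(len(y))])
    return np.linalg.lstsq(X, y, rcond=None)[0]

bu = fit(xu, zu, y)                         # uncentered solution
xc, zc = xu - xu.mean(), zu - zu.mean()     # centered predictors
bc = fit(xc, zc, y)                         # centered solution

# b3 is identical, and so is the simple slope of Y on X one SD above the mean
# of Z, even though b1 and b2 themselves change with centering.
zH_u, zH_c = zu.mean() + zu.std(ddof=1), zc.std(ddof=1)
print(np.isclose(bu[2], bc[2]))                                   # True
print(np.isclose(bu[0] + bu[2] * zH_u, bc[0] + bc[2] * zH_c))     # True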

Numerical Example:
Centered Versus Uncentered Data

Our examination of Tables 2.1, 2.2, and 2.3 in the previous chapter
was focused solely on those portions of the tables that report the results
of analyses using centered variables X and Z. Also contained in these
tables are the results of parallel analyses in which the variables have been
transformed to their uncentered forms as follows: X' = X + 5 and Z' =
Z + 10. This example permits the direct comparison of the regression
analysis, plots, and post hoc probing based on centered versus uncentered
data. It also introduces some of the desirable properties of centered so-
lutions.
Correlations
Note in Table 2.1a that the correlations between the centered terms X
and XZ and between Z and XZ are low, .10 and .04, respectively. How-
ever, in the uncentered case, large correlations are introduced between X'
and X'Z' and between Z' and X'Z'. For example, the correlation be-
tween uncentered Z' and X'Z' is .86, instead of .04 for Z with XZ. This
example illustrates how considerable multicollinearity can be introduced
into a regression equation with an interaction when the variables are not
centered (Marquardt, 1980). Very high levels of multicollinearity can lead
to technical problems in estimating regression coefficients. Centering

variables will often help minimize these problems (Neter, Wasserman, &
Kutner, 1989).
Regression Equations with no Higher Order Terms

The regression equations including first order terms only (no interac-
tion) are given in Tables 2.1c(i) and 2.1d(i) for centered versus uncen-
tered data. Note that the regression coefficients b1 = 1.67 and b2 = 3.59
are identical for the centered and uncentered equations. Only the regres-
sion constant reflects the change in scaling.

Regression Equations Containing an Interaction


Comparing Tables 2. lc(ii) and 2. ld(ii), we note that the b3 coefficient
for the interaction term is identical in the uncentered and centered equa­
tions, as are the tests of significance. However, in the uncentered equa­
tion, both of the coefficients forX' and Z' are negative and are significant.
In shaip contrast, in the centered equation, both of these coefficients are
positive, with the coefficient for Z being significant. Such dramatically
different results with simple additive rescaling of the data highlight the
difficulties of regression with interactions. As has already been shown
algebraically, the equivalence of the centered and uncentered analyses is
clarified in the simple slope analysis.

Simple Slopes
Equation 3.2 expresses the regression of Y on X at particular values of
Z. Using uncentered data, we compute the simple slope equations at the
values Z'H = mean of Z' + 1 standard deviation, Z'M = mean of Z', and
Z'L = mean of Z' - 1 standard deviation. To calculate the simple slope
equations in the uncentered case, we use the uncentered regression
equation in Table 2.1d(ii) containing the interaction:

Y = -24.68X' - 9.33Z' + 2.58X'Z' + 90.15

This equation of Table 2.1d(ii) is reexpressed in Table 2.2b in the form
of equation 3.2; that is,

Y = (-24.68 + 2.58Z')X' + (-9.33Z' + 90.15)

Substituting the value for uncentered Z'H = 12.20, we have Y = [-24.68
+ 2.58(12.20)]X' + [(-9.33)(12.20) + 90.15] = 6.82X' - 23.67.
For centered ZH = 2.20, recall that Y = [1.14 + 2.58(2.20)]X +


[3.58(2.20) + 2.54] = 6.82X + 10.41. Comparison of the simple slopes
for ZH, ZM, and ZL for the centered versus uncentered solutions in Table
2.2a versus 2.2b shows an important result: The corresponding simple
slopes are identical. That is, the centered and uncentered simple slope
equations for a value of Z of the same relative standing across equations
(e.g., ZH one standard deviation above the mean) have the same slopes.
Hence the relationship between Y and X is unambiguously portrayed in
the simple slope equations regardless of whether these simple slope equa-
tions are generated from the centered or uncentered equations. A com-
parison of Figure 2.1a versus 2.1b verifies the equivalence of the simple
slopes from the interaction generated from centered versus uncentered
data.

Standard Errors of Simple Slopes and t-Tests

The standard errors of simple slopes and hence the t-tests are also in-
variant under additive transformation. The variance-covariance matrix of
the uncentered regression coefficients, Sb', is given in Table 2.3b. This
matrix was obtained from the regression program of the SPSS-X package
applied to the uncentered data set. The square root of equation 2.4 for the
variance of the simple slope, that is, [sb = (s11 + 2Zs13 + Z²s33)^1/2],
applies to both centered and uncentered data. For example, the esti-
mate of the standard error of the simple slope of Y on X' at Z'H = 12.20
is given as [43.88 + (2)(12.2)(-4.07) + 12.2²(0.40)]^1/2 = 1.98,
where s11 = 43.88, s13 = -4.07, and s33 = 0.40 from the matrix Sb'.
For centered ZH = 2.20, sbH was found to be [2.35 + 2(2.2)(-0.08)
+ 2.2²(0.40)]^1/2 = 1.98. The simple slopes, standard errors, and t-tests
based on centered versus uncentered data are identical, as is shown in
Table 2.3.
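For readers who want to reproduce the arithmetic, a two-line check (not in the original text) of the centered computation is:

s11, s13, s33, Z = 2.35, -0.08, 0.40, 2.20
print((s11 + 2 * Z * s13 + Z**2 * s33) ** 0.5)   # about 1.98, the standard error at Z_H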

Ordinal Versus Disordinal Interactions

In order to examine the effect of centering on the ordinal versus disor­


dinal status of the interaction, we must determine the effect of the trans­
formation on the crossing point of the simple regression lines. Applying
equation 2.6, that is, X'cross = -b'2/b'3, for the value of X' at which all
regressions of Y on X' will cross, we find X'cross = -(-9.33)/2.58 =
3.61. With the mean of X' = 5.0 and s'x = 0.95, we see that the regressions of Y on
X' cross at (3.61 - 5)/0.95 = -1.47 standard deviations below the
mean. We have already seen in Chapter 2 that for the centered solution
the crossing point also falls precisely 1.47 standard deviations below the
mean of X (see p. 24).


Should the Criterion Y be Centered?


Throughout Chapters 2 and 3 we have left the criterion Y uncentered,
even in analyses with centered predictors. Changing the scaling of the
criterion by additive constants has no effect on regression coefficients in
equations containing interactions. By leaving the criterion in its original
(usually uncentered) form, predicted scores conveniently are in the orig-
inal scale of the criterion. There is typically no reason to center the cri-
terion Y when centering predictors.

Multicollinearity:
Essential Versus Nonessential Ill-Conditioning

If the first order variables X and Z are not centered, then product terms
of the form XZ and power polynomial terms of the form X2 are highly
correlated with the variables of which they are comprised (the Pearson
product moment correlation between X and X2 can approach 1.0). When
employed in regression analyses with lower order terms, the highest order
term produces large standard errors for the regression coefficients of the
lower order terms, though the standard error of the highest order term is
unaffected. Cohen (1978) and Pedhazur (1982) acknowledge the compu­
tational problems that may arise from this multicollinearity.
The literature on regression with higher order terms contains many ad­
monitions about the problems of multicollinearity. However, these prob­
lems are not the usual problems of multicollinearity in regression analysis
in which two supposedly different predictors are very highly correlated.
The multicollinearity in the context of regression with higher order terms
is due to scaling, and can be greatly lessened by centering variables. The
special cases we consider in this book (e.g., the relationship between X
and X2, or between X and Z and their product XZ) follow a general result.
Uncentered X' and X'² will be highly correlated. But if instead we use
centered predictor X and it is normally distributed, then the covariance
between centered predictor X and X2 is zero. Even if X is not normally
distributed, the correlation between X and X2 will be much lower than the
correlation between X' and X'2. Uncentered X' and Z' will both be highly
correlated with their crossproduct X'Z'. But if X and Z are bivariate nor­
mal, then the covariance between each centered variable X and Z and the
product term XZ is zero. When X and Z are centered, the only remaining
correlation between first order and product terms or between first order
and second order terms is that due to nonnormality of the variables. (We

again recommend Appendix A for the mathematical basis of these state-
ments.)
Marquardt (1980) refers to the problems of multicollinearity produced
by noncentered variables as nonessential ill-conditioning, whereas those
that exist because of actual relationships between variables in the popu-
lation (e.g., between the age of a child and his/her developmental stage)
are referred to as essential ill-conditioning. Nonessential ill-conditioning
is eliminated by centering the predictors.
We recommend centering for computational reasons. Marquardt (1980),
Smith and Sasaki (1979), and Tate (1984) provide clear discussions of
approaches to reducing multicollinearity in interactive regression models
(see also Lance, 1988).
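The effect of centering on this nonessential ill-conditioning is easy to see directly. The sketch below (not part of the original text, with simulated data and illustrative names) compares the correlation of each predictor with the crossproduct term before and after centering.

import numpy as np

rng = np.random.default_rng(2)
xu = rng.normal(5, 1, 1000)                # uncentered X' with a nonzero mean
zu = rng.normal(10, 2, 1000)               # uncentered Z'

def r(a, b):
    return np.corrcoef(a, b)[0, 1]

xc, zc = xu - xu.mean(), zu - zu.mean()    # centered versions
print(r(xu, xu * zu), r(zu, xu * zu))      # large correlations with the product term
print(r(xc, xc * zc), r(zc, xc * zc))      # near zero after centering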

Interpreting the Regression Coefficients

The Interaction Term XZ

At the beginning of Chapter 2, we pointed out that an interaction be­


tween continuous predictors indicates that the regression of the criterion
on each of the predictors varies as a function of the value of the other
predictor. As seen in the simple slope expression contained in equation
3.2, the value of the b3 regression coefficient for the product term indi-
cates the amount of change in the slope of the regression of Y on X that
results from a one-unit change in Z (see also Cleary & Kessler, 1982;
Finney. Mitchell, Cronkite. & Moos, 1984; Judd & McClelland. 1989).
In terms of the simple regression equations3 in Table 2.2, the slope of the
regression of У on X at levels of Z increases by 2.58 units for every one
unit increase in Z, Note that (his change is monotonic and completely
uniform across the range of Z. The simple product term XZ represents an
interaction that always appears like a fan when plotted as a series of sim-
pic regression equations such as those in Figure 2.1. Interactions repre­
senting curvilinear and nonunifonn effects can also be built into regres­
sion equations; however, we defer the discussion of these more complex
interactions until Chapter 5.
We have already seen that the product term XZ in equation 3.1 is un­
affected by the scale of measurement. Hence, its interpretation holds across
additive scale transformations. Note, however, that this constancy per­
tains only to the unstandardized b3 coefficient; the standardized coefficient
(beta) for the interaction is affected by additive transformation as will be
shown later in this chapter.


The First Order Terms X and Z


Centered Versus Uncentered Variables

In a regression equation containing an interaction the regression coef­


ficients for the first order terms (i.e., b1 and b2 in equation 3.1) are ex-
amples of what have been labeled conditional effects (e.g., Cleary & Kes-
sler, 1982; Darlington, 1990). Conditional effects describe the effect of
one predictor on the criterion variable under the condition in which the
other predictor equals a specified value. For a conditional effect to be
useful, the point on the other predictor at which it is evaluated must be
meaningful. ,
In equation 3.1, the b1 coefficient represents the regression of Y on X
at Z = 0; the b2 coefficient represents the regression of Y on Z at X = 0.
These coefficients will not always be meaningful for uncentered predic-
tors. For example, if strength of athletes were predicted from their height
(X) and weight (Z), the regression coefficient predicting strength from
height (b1) would represent the regression of strength on height for ath-
letes weighing zero pounds. Often in social science research X and Z are
measured on interval scales in which the value zero has no meaning. If
some behavior were predicted from a measure of motivation (X) and a
7-point attitude scale (Z) ranging from 1 to 7, the regression coefficient
for Y on X would be the slope of Y on X at the value Z = 0, a value not
even defined on the scale! However, when predictors are centered, then
the value of 0 is the mean of each predictor. Hence, if Z is centered, then
the b1 coefficient for X represents the regression of Y on X at the mean of
the Z variable. Centering produces a value of zero on a continuous scale
that is typically meaningful.
The relationship between the b1 coefficient in a centered overall regres-
sion equation and the simple slope analysis may now be clarified. The b1
coefficient from the overall centered regression equation is equal to the
simple slope for the regression of Y on X at the mean of Z (ZM). In Table
2.3, the value bM = 1.14 is the simple slope of Y on X at ZM (equivalently,
from the uncentered equation, b'M = 1.14). We have already encountered
the concept of the regression of Y on X at the mean of Z in the simple
regression equation analysis portrayed in Tables 2.2 and 2.3. In Table
2.2, the simple slope of the regression of Y on X at ZM, or equivalently of
Y on X' at Z'M, is 1.14. From Table 2.1c(ii), the regression coefficient b1
is equal to 1.14 in the overall centered equation.
There is even more convergence between the b1 coefficient in the cen-
tered analysis and the simple regression analyses summarized in Table
2.3. From the centered Sb matrix, the variance of the b1 coefficient in the

overall centered regression equation is s11 = 2.35; the standard error is
thus 2.35^1/2 = 1.53, and t = 1.14/1.53 = 0.74. These are precisely the
same as the standard error and t value for the simple slope bM in
Table 2.3. Thus there is a clear interpretation of each of the b coefficients
in the centered regression equation.
In the uncentered equation the b'1 coefficient does retain its interpreta-
tion as the regression of Y on X' at Z' = 0. However, the value of zero
is no longer at the center of the data, and, in fact, may not even exist on
the scale of variable Z. Only in the special case in which uncentered
variable Z' has a meaningful zero point is the regression coefficient for
X' meaningful. Only when the uncentered X' variable has a meaningful
0 point is the b2 coefficient meaningful. Because the centered overall
regression analysis provides regression coefficients for first order terms
that may be informative, we recommend that the centered analysis be
employed, echoing the recommendations of Finney et al. (1984) and Mar-
quardt (1980). This also conforms to the familiar model used in ANOVA:
Each main effect is estimated when the value of all other factors are equal
to their respective means.
Interpretation in the Presence of an Interaction.
The b{ and b2 coefficients in equation 3.1 do not represent “main ef­
fects’’ as this term is usually used. Main effects are most typically defined
as the constant effect of one variable across all values of another variable
(Cramer & Appelbaum, 1980). Less commonly they are defined as av­
erage effect of one variable across the range of other variables (Finney et
al., 1984). Darlington (1990) defines average effect as the average of the
simple slopes computed across all cases. In other words, if one were к
substitute observed predictor scores for each case into the simple slope
expression (bt + b3Z) for the regression of Y on X, calculate for each
case a value of the simple slope, and then average these across all cases-
this would be considered the average effect of У on X. For a centered
regression equation, the average effect of У on X is bx. Put another was.
if one calculated a simple slope of У on X at every value of Z, weighted
each such slope by the number of cases with that value of Z, and took the
weighted average of the simple slopes, the result would be the average
simple slope, equal to b, in the centered regression equation.
The b\ and b2 coefficients never represent constant effects of the pre­
dictors in the presence of an interaction. The bt and b2 coefficients from
centered equations always represent the effects of the predictors at the
mean of the other predictors. In the centered equation, they may also be
considered as the weighted average effect of each predictor coefficient


across all observed values of the other predictor. The interpretation of b1
and b2 as conditional effects of predictors at the mean of other predictors
may well be useful in clarifying relationships under investigation. Hence
we agree with the position of Finney et al. (1984) that these effects should
not be disregarded simply because they are not constant effects. We also
echo the admonition of Cleary and Kessler (1982) that the interpretation
of these effects warrants careful consideration of the scale characteristics
of the variables they represent.
Given that the b1 coefficient for centered Z always represents the regres-
sion of Y on X at the mean of Z, the range of Z should be explored to
determine over what range of Z the relationship represented by the b1
coefficient holds. Consider the centered regression equation Y = 3X +
2Z + 0.5XZ + 5, rearranged as Y = (3 + 0.5Z)X + (2Z + 5), with
Z centered and sz = 1.5. The regression coefficient for Y on X equals 3
when Z = 0 (i.e., b1 = 3). As Z increases above zero, the regression of
Y on X becomes increasingly positive; the value of the simple slope (3 +
0.5Z) increases as Z increases. When Z decreases to Z = -1.5 (one
standard deviation below its mean), the simple slope for Y on X is still
positive [i.e., 3 + 0.5(-1.5) = 2.25], as it is when Z = -3.0, two
standard deviations below the mean of Z. Thus the conclusion that, on
average, the regression of Y on X is positive across the range of Z is quite
accurate. In contrast, the b1 coefficient in Table 2.1c(ii) represents neither
the regression of Y on X one standard deviation above the mean of Z nor
that regression at one standard deviation below the mean of Z.
We recommend that in characterizing the first order effects of a regres­
sion analysis containing an interaction, consideration be given to the range
of each variable over which the first order effect of the other variable holds
true. This eliminates the need to follow some rigid rule about considering
versus not considering the conditional effects of first order variables and
provides an accurate picture of the outcome.
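One simple way to carry out this check is to evaluate the simple slope (b1 + b3Z) at every observed value of Z and see whether its sign changes. The sketch below (not from the original text, with illustrative names) does exactly that for the hypothetical equation above.

import numpy as np

def simple_slope_range(b1, b3, z_values):
    """Simple slopes of Y on X over the observed values of (centered) Z."""
    slopes = b1 + b3 * np.asarray(z_values)
    same_sign = bool((slopes > 0).all() or (slopes < 0).all())
    return slopes.min(), slopes.max(), same_sign

z_obs = np.linspace(-3.0, 3.0, 7)            # e.g., centered Z spanning +/- 2 SD with sz = 1.5
print(simple_slope_range(3.0, 0.5, z_obs))   # (1.5, 4.5, True): positive over the whole range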

A Geometric Interpretation
A geometric representation of the b1 and b2 conditional effects and the
b3 interaction (Cleary & Kessler, 1982) provides insight into their mean-
ing. The b2 coefficient for Z indicates how the predicted score Y changes
as a function of Z at X = 0. In Figure 2.1a, the value X = 0.0 is shown
in the center of the X axis. Reading up from the X axis at X = 0.0 to the
simple regression line at ZL and then across from the simple regression
line to the Y axis yields a predicted score Y at X = 0.0 and Z = ZL. The
predicted scores at X = 0.0 are -5.33, 2.54, and 10.41 for ZL, ZM, and


ZH, respectively; these values are the regression constants (Y intercepts)
in the simple regression equations. Note that as Z increases, Y increases;
in the centered regression equation b2 = 3.58 is positive.
Now consider Figure 3.1, which is an expanded illustration of Figure
2.1b for the uncentered data. In Figure 3.1, we have extended the X axis
downward to 0.0 and have extended the simple regression lines to inter-
cept the Y axis. The b2 coefficient in the uncentered regression equation
still is interpreted as the regression of Y on Z' at X' = 0.0. Reading up
from X' = 0.0 to the three simple regression equations yields values of
Y at values of Z'. The predicted scores are 17.38, -3.15, and -23.67
for Z'L, Z'M, and Z'H, respectively. As Z' increases, Y decreases, and b2
= -9.33 is negative in the uncentered regression equation. Rescaling by
additive constants produces shifts in the origin of the regression plane; in
the example the change in sign of b2 from centered to uncentered equation
is occasioned by the fact that the simple regression lines do not cross
above zero in the centered case but do in the uncentered case. Finally,
this shift in origin explains the difference in Y intercepts of the simple
slope equations between the centered and uncentered solutions.
What of b3? The b3 coefficient represents the angle between the regres-
sion lines of Y on X at values of Z or Y on Z at values of X. These angles
do not change with shifts in the origins of the axes produced by rescaling.

Standardized Solutions with Multiplicative Terms

To this point we have come to expect that changing the scale of pre­
dictor variables by additive constants, as in centering, will have no effect
on the unstandardized regression coefficient for the interaction term. But
what of the standardized regression coefficients (betas) for the interaction
term associated with centered versus uncentered scores? Table 3.1 com­
pares the unstandardized regression coefficients for the uncentered and ;
centered solutions of the numerical examples (presented as Cases la and
2a in Table 3.1). Standardized regression coefficients associated with each
of these analyses are also presented (Cases lb and 2b of Table 3.1, re­
spectively). The comparison shows one reassuring and two disconcerting
findings.

1. The t-tests of the standardized regression coefficients for the interaction term
   in the centered versus noncentered analyses are identical (t = 4.087).
2. The standardized regression coefficients for the interaction term in the cen-
   tered versus uncentered analyses differ substantially (1.61 versus 0.19).

[Figure 3.1 appears here: the three simple regression lines Y = 6.82X' - 23.67, Y = 1.14X' - 3.15, and Y = -4.54X' + 17.38, plotted against X' ranging from 0.00 to 6.00.]

Figure 3.1. Interaction Plotted from Uncentered Regression Equation: Y = -24.68X' - 9.33Z' + 2.58X'Z' + 90.15

42

3. The simple slopes generated from the two standardized solutions are also
   substantially different. To illustrate, let us compute the simple slope for the
   regression of Y on X, that is, (b1 + b3Z), at ZH = Z one standard deviation
   above the mean (i.e., when Z = 1 for the standardized case). For solution
   1b of Table 3.1,

   (b1 + b3Z) = [-0.83210 + 1.61337(1)] = 0.78127

   whereas for solution 2b of Table 3.1,

   (b1 + b3Z) = [0.03832 + 0.19218(1)] = 0.23050

In this section, following the development of Friedrich (1982; see also


Jaccard, Turrisi, & Wan, 1990), we explore why standardized regression
coefficients (betas) for interaction terms and simple slopes do not display

Table 3.1
Raw and Standardized Solutions Based on Centered versus Uncentered Data

                                        b1         b2         b3                   t-test    Simple slope at ZH
Analysis                             (for X)    (for Z)    (for XZ)      b0        for b3    (b1 + b3ZH)

1a. Raw uncentered Y, X, and Z     -24.67759   -9.32984    2.58141    90.15337     4.087
1b. Standardized solution
    associated with 1a (i.e., raw
    uncentered Y, X, and Z)         -0.83210   -0.73258    1.61337                 4.087     0.78127 (b)
2a. Raw centered X and Z             1.13648    3.57720    2.58141     2.53743 (a) 4.087
2b. Standardized solution
    associated with 2a (i.e., raw
    centered X and Z)                0.03832    0.28088    0.19218                 4.087     0.23050 (b)
3a. Standardized zY, zX, zZ, and
    (zX zZ) as predictors, raw
    analysis                         0.03832    0.28088    0.19150    -0.07930     4.087     0.22982
3b. Standardized solution
    associated with 3a               0.03832    0.28088    0.19218       --        4.087     0.23050 (b)

(a) Criterion Y is uncentered, in order to provide predicted scores in the original scale of the criterion.
(b) These outcomes are the result of inappropriate factorization and should not be employed; see text.


the invariance properties we have come to expect. The strong implication


of this lack of invariance is that neither the traditional standardized cen­
tered nor the standardized uncentered solution (solutions 1b and 2b in
Table 3.1) should be used when interactions are present.

Appropriate Standardized Solution with Interaction Terms

The computation of standardized regression coefficients (betas) in a


typical computer analysis begins with the standardization of each of the
predictors. (Whereas this is correct for regression equations that do not
contain interactions, it is not so for regression equations containing inter­
actions, as we will see.) For the raw uncentered analyses, these predictors
are X'r Z', and X’Z', and the corresponding z-scores are zx<, z2-, and
Zx'z1- The final term zX'Z‘ is the z-score computed from the raw product
term X’Z'; that is, the raw term is formed, and then it is standardized. It
is not in general equal to the product of z-scores (Zx'^z
*
)- The same is
true for the centered raw score analysis. The input variables are centered
X, centered Z, and their product XZ? For the standardized solution, the
predictor for the interaction is zxz> the z-score computed from the product
of centered raw scores. Again, it is not necessarily equal to the product
of z-scores (zxzz).
To compute the simple slopes, we must be able to factor the product
terms. Recall that to find the expression for the simple slope of the regres­
sion of Y on X at levels of Z in the unstandardized case, we factor the XZ
term:

Y = b1X + b3XZ + b2Z + b0 = (b1 + b3Z)X + b2Z + b0

The same is true for the standardized analysis

zY = b*1 zX + b*2 zZ + b*3 zX zZ + b*0

zY = (b*1 + b*3 zZ)zX + b*2 zZ + b*0

where the b* are standardized regression coefficients and zY is the pre-
dicted standard score corresponding to Y. In order for the required fac-
toring to be performed, the predictor variable for the interaction term must
be the crossproduct of z-scores, but this is not the case in either standard-
ized solution 1b or 2b in Table 3.1 because the crossproduct terms have
been standardized.


Friedrich (1982) suggested a straightforward procedure that solves this


problem. One first calculates zX and zZ and then forms their crossproduct
zX zZ. These values are used as the predictors in a regression analysis,
with zY as the criterion. The unstandardized solution from that analysis is
the appropriate “standardized” solution for use with multiplicative terms.
This solution is given in Table 3.1 (3a). Note that in this analysis the
regression intercept b0 will typically be nonzero, though in traditional
standardized regression analyses this coefficient is always zero.
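A compact way to carry out this procedure is sketched below (not part of the original text): standardize Y, X, and Z, form the product of the two predictor z-scores, and report the unstandardized coefficients from that regression, which serve as the appropriate "standardized" estimates. Variable names are illustrative.

import numpy as np

def friedrich_standardized(x, z, y):
    """'Standardized' solution for Y = b1*X + b2*Z + b3*XZ + b0 following Friedrich (1982)."""
    zx = (x - x.mean()) / x.std(ddof=1)
    zz = (z - z.mean()) / z.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    # Regress zY on zX, zZ, and the product zX*zZ; report the raw (unstandardized) coefficients.
    X = np.column_stack([zx, zz, zx * zz, np.ones(len(y))])
    b = np.linalg.lstsq(X, zy, rcond=None)[0]
    return dict(b1=b[0], b2=b[1], b3=b[2], b0=b[3])   # b0 is typically nonzero (about -b3 * r_XZ)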
To understand this difference in intercepts, recall that in general b0 =
mean of Y - b1(mean of X1) - b2(mean of X2) - ... - bk(mean of Xk). In the
traditional additive standardized solution all variables have mean zero. In the
procedure suggested by Friedrich (1982) presented in Table 3.1(3a), the crossproduct term zX zZ
does not have a mean of zero. If two centered variables are crossmulti-
plied, the mean of the resulting crossproduct term equals the covariance
between the variables (see Appendix A). In the special case of two stan-
dardized variables, the mean of the crossproduct term equals their corre-
lation. Thus the mean of the crossproduct term zX zZ will be zero only
when X and Z are uncorrelated. Similarly, the variance of the crossprod-
uct term XZ is a function of both the variances of X and Z and the co-
variance between X and Z. (In the case of bivariate normal X and Z,
σ²XZ = σ²X σ²Z + Cov²XZ; see Appendix A.) The variance of the crossproduct of
two z-scores will equal 1 only if zX and zZ are uncorrelated. In all, then,
the crossproduct of two standardized variables zX and zZ is itself standard-
ized only if X and Z are uncorrelated. With the mean of the product term
zX zZ equal to rXZ, the correlation between X and Z, the value of the inter-
cept in solution 3a is -b*3 rXZ.
Note that the unstandardized solution must be used with Friedrich’s
procedure. The standardized solutions (betas) associated with this proce­
dure (Table 3.1 (3b)) present the same problems as those given in 1b and
2b: The interaction term is not a crossproduct of z-scores, rather it is the
z-score calculated from the crossproduct of z-scores. For the same reason
that solutions 1b and 2b are inappropriate as standardized solutions, so-
lution 3b will also be inappropriate.

Simple Slope Analysis from the Standardized Solution


The "standardized" solution from the Friedrich procedure is given in
Table 3.1 (3a) as follows:

zY = 0.03832 zX + 0.28088 zZ + 0.19150 zX zZ - 0.07930

Following the procedures developed in Chapter 2, we can treat this "stan-
dardized" solution just like any other regression equation and calculate
the simple slopes of zY on zX at high (+1), moderate (0), and low (-1) levels
of zZ. The standard errors and t-tests can then be performed by substituting
the appropriate values in the formulas presented in Chapter 2 (equations
2.4 and following text). These results are presented in Table 3.2; Table
2.3a presents the results of the same analysis performed on the raw cen­
tered data.

Relationship Between Raw and “Standardized” Solution

There are simple algebraic relationships between the centered raw score
analysis with centered X, Z, and their crossproduct XZ as predictors and
the "standardized" analysis using the Friedrich procedure with zX, zZ,
and zX zZ as predictors. These relationships are presented below and in
Table 3.2 for the values related to tests of simple slopes.
1. The regression coefficients are related as follows:

b*i = bi (si/sY)

where b*i is the standardized regression coefficient associated with pre-
dictor i, bi is the unstandardized regression coefficient associated with
predictor i, si is the standard deviation of predictor i, and sY is the standard
deviation of the criterion Y. For example, from Table 3.1(2a), b3 =
2.58141; from Table 3.1(3a) b*3 = 0.19150, sY = 28.01881, and sXZ =
2.08592; or b*3 = 2.58141 (2.08592/28.01881). This relationship holds
for all the regression coefficients in the equation. It is the usual relation­
ship that is obtained for any linear regression analysis involving only first
order terms (see, e.g., Cohen & Cohen, 1983).
2. The regression constants (intercepts) in the two analyses have the
following relationship:

b*0 = (b0 - mean of Y)/sY

where b*0 and b0 are the standardized and unstandardized constants, re-
spectively, and the mean of Y is the mean of the criterion scores. If Y has
been centered, then this reduces to b*0 = b0/sY.
3. The variance-covariance matrices of the regression coefficients in
the two solutions are related as follows:

standardized element_ij = unstandardized element_ij (si sj / s²Y)


where i and j refer to any two predictors, si and sj are their respective
standard deviations, and s²Y is the variance of the criterion. This relation-
ship may be verified numerically by comparing the values for the stan-
dardized solution in Table 3.2(a) to those for the unstandardized solution
in Table 2.3(a).

Table 3.2
Simple Slope Analysis Based on Predictors zX, zZ, and zX zZ (for Comparison
with Raw-Centered Simple Slope Analysis in Table 2.3a)

a. Covariance Matrix of Regression Coefficients

          b1         b2         b3      Relationship to covariance matrix in Table 2.3a:
b1     .00267    -.00110    -.00021
b2    -.00110     .00265    -.00001     Standardized element_ij = Raw element_ij (si sj / s²Y)
b3    -.00021    -.00001     .00220
                                        where si and sj are the standard deviations
                                        of the raw predictors

b. Simple Slopes

b*L = -.15318                           Relation to simple slopes of raw-centered
b*M =  .03832                           analysis in Table 2.3a:
b*H =  .22982
                                        Standardized simple slope of Y on X =
                                        raw simple slope of Y on X (sX/sY)

c. Standard Errors of Simple Slopes

s*L = .07267                            Relation to standard errors in raw-centered
s*M = .05167                            analysis in Table 2.3a:
s*H = .06678
                                        Standardized simple standard error =
                                        raw simple standard error (sX/sY)

d. t-tests

t*L = -2.11                             Relationship to t-tests in raw-centered
t*M =  0.74                             analysis in Table 2.3a:
t*H =  3.45
                                        t-tests are identical in raw-centered and
                                        standardized analyses

NOTE: sX = 0.94476; sZ = 2.20004; sXZ = 2.08592; sY = 28.01881

4. From (3), it follows that the standard errors of the regression coef-
ficients are related as follows:

s*b = sb (si/sY)

where s*b and sb are the standard errors of the standardized and raw regres-
sion coefficient for predictor i, respectively, and si is the standard devia-
tion of predictor i.
5. From (3), it also follows that the matrices of correlations among the
predictors are identical for the unstandardized and standardized solutions.
6. From (1) and (4), it follows that the t-test values and p values for
tests of the regression coefficients in the two analyses are identical.
7. Finally, Table 3.2 compares the simple slopes, the standard errors
of the simple slopes, and the t-tests for simple slopes for the unstandard-
ized and standardized solutions. As can be seen, both the simple slope
and standard error of the simple slope are related to their unstandardized
counterparts by identical algebraic expressions. The t-tests for the simple
slopes for the standardized and unstandardized centered cases are identi-
cal.
In summary, using the Friedrich (1982) approach to standardization
preserves the usual relationships between raw and standardized solutions
found for regression equations that involve only linear terms. These re­
lationships are also preserved for the simple slope analyses. Thus, of the
four possible standardized solutions presented in Table 3.1, only the
Friedrich approach (3a) is algebraically appropriate and bears a straight­
forward algebraic relationship to the unstandardized centered analysis in
(2a). Remember in this approach that the predictors are all z-scores or
their products to begin with; they should not be further standardized. The
use of this approach avoids potential computational difficulties and am­
biguities of interpretation.

Summary
This chapter has addressed the issue of scale invariance when there are
interactions between continuous variables in MR. The discrepancies in
the regression coefficients obtained from centered and uncentered anal-
yses vanish when our prescriptions for probing the interaction are fol-
lowed. Centered and uncentered analyses lead to identical slopes of the

r’I
cs CamScanner
48 MULTIPLE REGRESSION

simple regression equations and identical tests of the highest order inter-
action. The interpretation of first order terms in regression equations con-
taining interactions is considered; such first order terms represent condi-
tional rather than constant effects of single predictors. Our consideration
of the interpretation of these coefficients clarifies an advantage of center-
ing variables before analysis, as did our consideration of multicollinearity
between lower and higher order terms. Problems associated with stan­
dardized solutions of regression equations containing interactions are dis­
cussed and an appropriate standardized solution is presented.

Notes
1. The constancy of the unstandardized b3 coefficient across the centered and uncentered
analyses does not hold for b3 in the standardized solutions based on centered versus uncen-
tered data (see the final section of this chapter).
2. These comments concerning the interaction pertain to the XZ term in equation 3.1.
In equations with higher order terms such as X² and X²Z, considered in Chapter 5, only
the regression coefficient for the highest order term is invariant under linear scale transfor-
mation.
3. The simple slope for the regression of Y on Z at values of X from equation 3.1 is
(b2 + b3X) = (3.58 + 2.58X) in the numerical example. For every one unit increase in X
there will be 2.58 units of increase in the slope of Y on Z. Hence there is a symmetry of
the regressions of Y on X at Z and Y on Z at X.

4 Testing and Probing
Three-Way Interactions

Chapters 1-3 have focused exclusively on interactions between two pre­


dictor variables. The present chapter shows how the prescriptions for test-
ing, interpreting, and probing XZ interactions developed in previous chap-
ters generalize immediately to the three variable case. We limit our
treatment here to interactions involving only linear terms (XZW); the dis-
cussion of interactions involving higher order, curvilinear components is
deferred until Chapter 5.

Specifying, Testing, and Interpreting


Three-Way Interactions

The usual requirement for developing a regression equation that in­


cludes a three-way interaction is that all first order and second order terms
must be included in the equation.1 As before, each of the predictor vari­
ables should be centered to maximize interpretability and to minimize
problems of multicollinearity. The predictor for the three-way interaction
is formed by multiplying together the three predictors. These considera­
tions result in the following regression equation:

Y = b1X + b2Z + b3W + b4XZ + b5XW
    + b6ZW + b7XZW + b0     (4.1)
49


In this equation, the test of the b7 coefficient indicates whether the three-
way interaction is significant. The two-way interactions (e.g., XZ) now
represent conditional interaction effects, evaluated when the third variable
(e.g., W) equals 0. They are affected by the scale of the predictor just as
are first order terms X and Z in the presence of the XZ interaction. With
centered predictor variables, the two-way interactions are interpreted as
conditional interaction effects at the mean of the variable not involved in
the interaction (e.g., the conditional XZ interaction at the mean of W).
First order effects (e.g., X) may also be interpreted as conditional effects
(e.g., when W and Z = 0; see pp. 37-40). If the XZW interaction in
equation 4.1 is significant, then this interaction should be probed to assist
in its interpretation. If the highest order interaction in the regression equa­
tion is not significant, readers may wish to use the stepdown procedures
presented in Chapter 6.
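For readers who estimate equation 4.1 in a scripting environment rather than a menu-driven package, the following sketch shows one way to center the predictors, form the crossproduct terms, and fit the model. It is our illustration, not part of the original text; it assumes the Python libraries pandas and statsmodels and hypothetical column names x, z, w, and y.

# A minimal sketch of estimating equation 4.1 (assumes pandas and statsmodels).
import pandas as pd
import statsmodels.api as sm

def fit_three_way(df):
    """Center X, Z, W, build all crossproducts, and fit equation 4.1."""
    d = pd.DataFrame()
    for v in ("x", "z", "w"):
        d[v] = df[v] - df[v].mean()       # center each first order predictor
    d["xz"] = d["x"] * d["z"]             # two-way crossproduct terms
    d["xw"] = d["x"] * d["w"]
    d["zw"] = d["z"] * d["w"]
    d["xzw"] = d["x"] * d["z"] * d["w"]   # three-way crossproduct term
    X = sm.add_constant(d[["x", "z", "w", "xz", "xw", "zw", "xzw"]])
    return sm.OLS(df["y"], X).fit()       # the t-test on "xzw" tests b7, the three-way interaction

# usage: model = fit_three_way(my_data); print(model.summary())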

Probing Three-Way Interactions


Simple Regression Equation

We begin by developing the simple regression equation, rearranging


equation 4.1 to show the regression of Y on X:

Ŷ = (b1 + b4Z + b5W + b7ZW)X + (b2Z + b3W + b6ZW + b0)    (4.2)

Equation 4.2 shows that the regression of Y on X given in the expression


(b1 + b4Z + b5W + b7ZW) depends upon both the value of W and the
value of Z at which the Y on X relationship is considered. Equation 4.2
is now in the form of a simple regression equation, a direct generalization
from the simple regression equation Ŷ = (b1 + b3Z)X + b2Z + b0 for
the two predictor case. The expression (b1 + b4Z + b5W + b7ZW) is
the simple slope of the new simple regression equation.

Numerical Example
A simulation involving three multivariate normal predictors X, Z, and
W is used to illustrate the probing of the three-way interaction. Table 4.1
provides the regression analysis and computation of simple slopes. Table
4.3a provides the means and standard deviations of predictors and criteria;
note that the first order predictors X, Z, and W are centered but the cri-


terion and the crossproduct terms are not. Table 4.1a gives the overall
regression analysis; the three predictor XZW interaction term is signifi-
cant. In Table 4.1b, the overall regression equation is rearranged accord-
ing to equation 4.2 to show the regression of Y on X at levels of Z and
W. In Table 4.1c, four separate simple regression equations are generated,
one at each of the four combinations of Z and W one standard deviation
above and below their means, that is, at combinations of ZL and ZH with
WL and WH. For example, for the simple regression equation at ZH and
WH with centered Z and W, and sz = 3.096, sw = 1.045, the values ZH
= 3.096 and WH = 1.045 were substituted into the equation in Table
4.1b. This substitution is shown below:

Ŷ = [-0.7068 + (0.5234)(3.096) + (1.0007)(1.045)
    + (0.7917)(3.096)(1.045)]X + [(2.8761)(3.096)
    + (14.2831)(1.045) + (-1.7062)(3.096)(1.045)
    + 4.5710]

Ŷ = 4.521X + 22.881
The result is the simple regression equation for Y on X at ZH, WH, given
in Table 4.1c(i). The results of the remaining three substitutions are given
in Table 4.1c as well.
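The substitution itself is mechanical and can be scripted. The short sketch below is ours; the coefficient values are simply those reported in Table 4.1a, and the function returns the simple slope and simple intercept of the regression of Y on X at any chosen values of Z and W.

# Our sketch: simple regression of Y on X at chosen values of Z and W (equation 4.2).
def simple_regression_on_x(b, z, w):
    """b = (b1, b2, b3, b4, b5, b6, b7, b0); returns (simple slope, simple intercept)."""
    b1, b2, b3, b4, b5, b6, b7, b0 = b
    slope = b1 + b4 * z + b5 * w + b7 * z * w
    intercept = b2 * z + b3 * w + b6 * z * w + b0
    return slope, intercept

b = (-0.7068, 2.8761, 14.2831, 0.5234, 1.0007, -1.7062, 0.7917, 4.5710)
sz, sw = 3.096, 1.045
for z in (sz, -sz):
    for w in (sw, -sw):
        print(z, w, simple_regression_on_x(b, z, w))
# at Z = 3.096 and W = 1.045 this reproduces the equation of Table 4.1c(i), 4.521X + 22.881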

Table 4.1
Three Predictor Regression Analysis

a. Overall Regression Equation

Ŷ = -0.7068X + 2.8761Z + 14.2831W* + 0.5234XZ*
    + 1.0007XW - 1.7062ZW + 0.7917XZW*** + 4.5710

b. Regression Equation Rewritten to Show Simple Regression
Equation of Y on X at Values of Z and W

Ŷ = (-0.7068 + 0.5234Z + 1.0007W + 0.7917ZW)X
    + (2.8761Z + 14.2831W - 1.7062ZW + 4.5710)

c. Simple Regression Equations at Values One Standard Deviation Above and Below the
Means of Z and W, where sz = 3.096, sw = 1.045

(i) At ZH and WH: Ŷ = 4.521X + 22.881
(ii) At ZL and WH: Ŷ = -3.843X + 16.112
(iii) At ZH and WL: Ŷ = -2.693X + 4.070
(iv) At ZL and WL: Ŷ = -0.811X - 24.779

***p < .001; **p < .01; *p < .05.


Graphing the Three-Way Interaction


Casting of the three predictor interaction into a series of simple regres­
sion equations permits the plotting of the interaction. Figure 4.1 illustrates
the three predictor XWZ interaction. The plots shown should appear highly
familiar to readers who have worked with plots of three-factor interactions
in the ANOVA context. The two graphs of Figure 4.1 are generalizations
of Figure 2.1, in which the XZ interaction was plotted to show the regres­
sion of Y on X at levels of the second variable Z. Each graph in Figure
4.1 also shows the regression of Y on X at levels of the second variable
Z. The third variable W in the XZW interaction is included by creating a
series of graphs at different values of W. Thus the straightforward logic
of plotting simple regression equations of Y on X at levels of a second
variable Z from the two-predictor case generalizes directly to the three-
predictor case in which simple regressions of Y on X are plotted at levels
of two other variables.
To create Figure 4.1, values of XL and XH, one standard deviation be­
low and above the mean of X, were substituted into each of the four simple
regression equations of Table 4.1c. For example, for ZH and WH the sim­
ple regression equation from Table 4.1 is Ŷ = 4.521X + 22.881. Then with sx
= 7.070, the points on the graph for XL and XH are as follows:

For XL = -7.070, Ŷ = 4.521(-7.070) + 22.881 = -9.082

For XH = 7.070, Ŷ = 4.521(7.070) + 22.881 = 54.844



These two values are used to draw the regression line of Y on X at ZH and WH.

The reader should note that the analyst is not confined to the format of
Figure 4.1. One might alternatively display the regression of Y on X at
levels of W within each graph, with each graph confined to one level of
Z. Or, one might plot the regression of Y on Z at levels of W within each
graph, with each graph confined to one level of X. Plotting the interaction
in various ways can often be useful in the interpretation of higher order
interactions. However, theory may provide guidance in the organization
of the plot. For example, in research on the relationship between life
stress and health, life stress is typically seen as the "primary" indepen-
dent variable whose effects may be modified by other variables (e.g.,
social support; perceived control over one's own health). Researchers
would typically depict life stress on the X axis (abscissa) of the plot to
emphasize the central importance of this variable.
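Plots of this kind are also easy to script. The sketch below is ours rather than the authors'; it assumes matplotlib and uses the Table 4.1a coefficients, drawing the regression of Y on X at low and high Z in separate panels for low and high W.

# Our sketch of plotting simple regression lines of Y on X at Z_L/Z_H, one panel per level of W.
import numpy as np
import matplotlib.pyplot as plt

b1, b2, b3, b4, b5, b6, b7, b0 = (-0.7068, 2.8761, 14.2831, 0.5234,
                                  1.0007, -1.7062, 0.7917, 4.5710)   # Table 4.1a coefficients
sx, sz, sw = 7.070, 3.096, 1.045
x_points = np.array([-sx, sx])                        # X_L and X_H, one SD below and above the mean
fig, axes = plt.subplots(1, 2, sharey=True)
for ax, w in zip(axes, (-sw, sw)):                    # "W Low" and "W High" panels
    for z, label in ((-sz, "Z low"), (sz, "Z high")):
        slope = b1 + b4 * z + b5 * w + b7 * z * w     # simple slope from equation 4.2
        intercept = b2 * z + b3 * w + b6 * z * w + b0
        ax.plot(x_points, slope * x_points + intercept, label=label)
    ax.set_title("W = %.3f" % w)
    ax.set_xlabel("X (centered)")
axes[0].set_ylabel("Predicted Y")
axes[0].legend()
plt.show()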

[Figure 4.1 about here: simple regression lines of Y on X at ZL and ZH, plotted separately at "W Low" (left-hand graph) and "W High" (right-hand graph).]

Testing Simple Slopes for Significance

Tests of simple slopes follow the same general procedures as for test-
ing the XZ interaction in Chapter 2. For the three-predictor interaction,
the simple slope for the regression of Y on X is (b1 + b4Z + b5W +
b7ZW). The simple slope may be tested for significance at any combi-
nation of values of the continuous variables Z and W that are chosen. For
readers familiar with ANOVA, these tests are analogous to tests of "sim-
ple simple main effects" (Winer, 1971).
The general expression for the standard error of the simple slope for Y
on X at values of Z and W is as follows:2

sb = [s11 + Z²s44 + W²s55 + Z²W²s77 + 2Zs14 + 2Ws15
     + 2ZWs17 + 2ZWs45 + 2WZ²s47 + 2W²Zs57]^(1/2)    (4.3)

Substituting appropriate values from Sb and varying values of Z and W
into equation 4.3 yields the standard errors of the simple slopes.
Computation of the simple slope variance for the ZH, WH simple regres-
sion equation is shown for the three-predictor example in Table 4.2. As
before, the standard error sb becomes the denominator of the t-test for the
simple slope, with (n - k - 1) degrees of freedom, where k is the total
number of predictors not including the regression constant, here 7.
We have limited our simple slope analysis to the values ZH, ZL and WH,
WL, one standard deviation above and below Z and W, respectively. This
is not required: Analysts are free to choose any combinations of Z and W
that are meaningful in their own research area.
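Equation 4.3 is simply the quadratic form w'Sb w of note 2 written out term by term. When the covariance matrix of the regression coefficients is available, the computation can therefore be done in a line or two; the sketch below is ours and assumes that Sb is held as a 7 x 7 numpy array ordered b1 through b7.

# Our sketch: standard error of the simple slope of Y on X at chosen Z and W (equation 4.3).
import numpy as np

def simple_slope_se(S_b, z, w):
    """S_b is the 7 x 7 covariance matrix of b1..b7; returns the square root of w'S_b w."""
    wt = np.array([1.0, 0.0, 0.0, z, w, 0.0, z * w])   # weight vector of Table 4.2b
    return float(np.sqrt(wt @ S_b @ wt))

# usage, with Z and W one standard deviation above their means:
#   se = simple_slope_se(S_b, 3.096, 1.045)    # reproduces s_b = 1.424 of Table 4.2
#   t = simple_slope / se                      # df = n - k - 1, with k = 7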

Standard Errors by Computer

The computer approach to finding simple slopes, standard errors, and


t values that we developed for the two-predictor case also generalizes di-
rectly to the three-predictor case. The three steps outlined in Chapter 2
are followed.
1. For both Z and W, new variables Zcv and Wcv are computed. These
are Z and W minus the specific conditional values CVZ and CVW at which
the regression of Y on X will be examined, that is, Zcv = Z - CVZ and
Wcv = W - CVW. The conditional values typically would be ZH, ZL,
WH, and WL.
2. For each pair of transformed values Zcv and Wcv, the crossproducts
of these terms with each other and with X are formed: (X)(Zcv),
(X)(Wcv), (Zcv)(Wcv), and (X)(Zcv)(Wcv).

Table 4.2
Variances, Standard Errors, and t-Tests for Simple Slopes in the Three-Predictor
Interaction

a. Sb: Variance-Covariance Matrix of b

          b1        b2        b3        b4        b5        b6        b7
b1    0.71498  -0.74621  -0.46135   0.00744   0.02715  -0.06455  -0.01517
b2   -0.74621   3.89690  -1.28826  -0.00452  -0.10133  -0.08607  -0.06742
b3   -0.46135  -1.28826  33.54756  -0.01744  -1.17015  -1.93137  -0.48024
b4    0.00744  -0.00452  -0.01744   0.04979  -0.01563  -0.07925  -0.00338
b5    0.02715  -0.10133  -1.17015  -0.01563   0.64088  -0.74581  -0.01669
b6   -0.06455  -0.08607  -1.93137  -0.07925  -0.74581   3.40141   0.06244
b7   -0.01517  -0.06742  -0.48024  -0.00338  -0.01669   0.06244   0.03954

b. Weight Vector for ZH = 3.096 and WH = 1.045

w' = [1  0  0  3.096  1.045  0  (3.096)(1.045)]

c. Variance of Simple Slope of Y on X at ZH and WH (Using Expression 4.3)

s²b = 0.71498 + (3.096)²(0.04979) + (1.045)²(0.64088) + (3.096)²(1.045)²(0.03954)
      + 2(3.096)(0.00744) + 2(1.045)(0.02715) + 2(3.096)(1.045)(-0.01517)
      + 2(3.096)(1.045)(-0.01563) + 2(1.045)(3.096)²(-0.00338)
      + 2(1.045)²(3.096)(-0.01669)
s²b = 2.029     sb = 1.424

d. Standard Errors and t-Tests for Simple Slopes of Y on X Shown in Figure 4.1

                      Simple Slope    Standard Error    t-test
(i)   At ZH, WH           4.521           1.424          3.17**
(ii)  At ZL, WH          -3.843           1.600         -2.40*
(iii) At ZH, WL          -2.693           1.565         -1.72+
(iv)  At ZL, WL          -0.811           1.478         -0.55

**p < .01; *p < .05; +p < .10

3. Taking each pair of Zcv with Wcv (e.g., Z and W one standard
deviation above their means) in turn, the criterion Y is regressed on X,
Zcv, Wcv, (X)(Zcv), (X)(Wcv), (Zcv)(Wcv), and (X)(Zcv)(Wcv).
The resulting regression coefficient b1 for X is the simple regression coef-
ficient of Y on X at the specific values Zcv and Wcv. The t-test (using the
reported standard error) of b1 provides the test of the simple slope. The
regression constant is that for the simple regression in question.
The computer analysis of the XZW interaction explored in Table 4.2 is
given in Table 4.3. The overall regression analysis is given in Table 4.3a;
note that predictors X, Z, and W are centered with sx = 7.070, sz = 3.096, and sw = 1.045.


Table 4.3
Computation of Simple Slope Analysis by Computer for the XZW Interaction in
the Regression Equation
Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW + b0

a. Overall Analysis with Centered X, Z, and W

(i) means and standard deviations

Mean Std Dev


Y 11.670 106.967
X 0.000 7.070
Z 0.000 3.096
w 0.000 1.045
xz 11.299 23.598
xw 2.111 7.610
zw 0.993 3.317
xzw 0.968 30.203

(ii) Regression analysis

Variable В SEB T Sig T


X -0.706801 0.845566 -0.836 .4037
Z 2.876090 1.974057 1.457 .1459
w 14.283066 5.792025 2.466 .0141
XZ 0.523446 0.223133 2.346 .0195
xw 1.000706 0.800550 1.250 .2120
zw -1.706190 1.844291 -0.925 .3555
xzw 0.791742 0.198858 3.981 .0001
(Constant) 4.570963 5.668610 0.806 .4205

b. Computation of WABOVE, WBELOW, ZABOVE, ZBELOW, and Crossproduct
Terms Required for Simple Slope Analysis

COMPUTE WABOVE = W - 1.045
COMPUTE WBELOW = W - (-1.045)
COMPUTE ZABOVE = Z - 3.096
COMPUTE ZBELOW = Z - (-3.096)
COMPUTE XZB = X*ZBELOW
COMPUTE XZA = X*ZABOVE
COMPUTE XWB = X*WBELOW
COMPUTE XWA = X*WABOVE
COMPUTE ZBWB = ZBELOW*WBELOW
COMPUTE ZBWA = ZBELOW*WABOVE
COMPUTE ZAWB = ZABOVE*WBELOW
COMPUTE ZAWA = ZABOVE*WABOVE
COMPUTE XZBWB = X*ZBELOW*WBELOW
COMPUTE XZBWA = X*ZBELOW*WABOVE
COMPUTE XZAWB = X*ZABOVE*WBELOW
COMPUTE XZAWA = X*ZABOVE*WABOVE


Table 4.3, continued

c. Regression Analysis with ZABOVE, WABOVE, and Appropriate Crossproducts
Yielding Simple Slope Analysis at ZH and WH (Regression of Y on X One Standard
Deviation Above the Mean of Z and One Standard Deviation Above the Mean of W)

(i) Means and standard deviations


Mean Std Dev
Y 11.670 106.967
X 0.000 7.070
ZABOVE -3.096 3.096
WABOVE -1.045 1.045
XZA 11.299 32.225
XWA 2.111 10.749
ZAWA 4.229 6.004
XZAWA -17.376 56.826

(ii) Regression Analysis

Variable В SEB T SigT


X 4.521064 1.424386 3.174 .0016
ZABOVE 1.093121 2.726067 0.401 .6886
WABOVE 9.000700 7.361504 1.223 .2222
XZA 1.350816 0.293089 4.609 .0000
XWA 3.451939 0.957375 3.606 .0004
ZAWA -1.706190 1.844291 -0.925 .3555
XZAWA 0.791742 0.198858 3.981 .0001
(Constant) 22.881068 10.730287 2.132 .0336

d. Regression Analysis with ZABOVE, WBELOW, and Appropriate Crossproducts


Yielding Simple Slope Analysis at ZH and WL (Regression of Y on X One Standard
Deviation Above the Mean of Z and One Standard Deviation Below the Mean of W)

(i) Means and standard deviations

              Mean      Std Dev
Y            11.670     106.967
X             0.000       7.070
ZABOVE       -3.096       3.096
WBELOW        1.045       1.045
XZA          11.299      32.225
XWB           2.111      10.462
ZAWB         -2.242       5.114
XZAWB         6.239      45.539

(ii) Regression analysis

Variable            B          SEB         T      Sig T

X               -2.693489    1.565075   -1.721    .0860
ZABOVE           4.659059    2.791273    1.669    .0959
WBELOW           9.000700    7.361504    1.223    .2222
XZA             -0.303924    0.316296   -0.961    .3372
XWB              3.451939    0.957375    3.606    .0004
ZAWB            -1.706190    1.844291   -0.925    .3555
XZAWB            0.791742    0.198858    3.981    .0001
(Constant)       4.069605   12.138283    0.335    .7376


Table 4.3b shows the computation of the new
(transformed) variables from Step (1) above:

(a) WABOVE = W - (1.045), for the regression of Y on X at CVW = 1.045,
one standard deviation above the mean of W;
(b) WBELOW = W - (-1.045), for the regression of Y on X at CVW =
-1.045, one standard deviation below the mean of W;
(c) ZABOVE = Z - (3.096), for the regression of Y on X at CVZ = 3.096,
one standard deviation above the mean of Z; and
(d) ZBELOW = Z - (-3.096), for the regression of Y on X at CVZ = -3.096,
one standard deviation below the mean of Z.

Table 4.3b also shows the computation of crossproduct terms pre­


scribed in Step (2) above. Table 4.3c provides the regression of Y on X
at ZABOVE and WABOVE, that is, the regression of Y on X one standard
deviation above the means of Z and W. The b1 is the simple slope coef-
ficient, also given in Table 4.1c, equation (i), and again in Table 4.2.
The standard error of b1 is the same value as was found using equation
4.3 (see Table 4.2); the t-test for b1 is that for the simple slope. Finally,
Table 4.3d provides the simple slope analysis one standard deviation above
the mean of Z and one standard deviation below the mean of W, at
ZABOVE and WBELOW.
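The same three steps can be written as a small function instead of package syntax. The sketch below is our Python analogue of Table 4.3 (statsmodels assumed; column names hypothetical): the moderators are re-centered at the chosen conditional values, the crossproducts are rebuilt, and the coefficient, standard error, and t value for X are read off as the simple slope analysis.

# Our sketch of the recentering approach of Table 4.3 (assumes pandas and statsmodels).
import pandas as pd
import statsmodels.api as sm

def simple_slope_by_recentering(df, cv_z, cv_w):
    """df holds centered x, z, w and the criterion y; cv_z and cv_w are the conditional values."""
    d = pd.DataFrame({"x": df["x"],
                      "z_cv": df["z"] - cv_z,           # Step 1: recenter Z and W
                      "w_cv": df["w"] - cv_w})
    d["xz"] = d["x"] * d["z_cv"]                        # Step 2: rebuild the crossproducts
    d["xw"] = d["x"] * d["w_cv"]
    d["zw"] = d["z_cv"] * d["w_cv"]
    d["xzw"] = d["x"] * d["z_cv"] * d["w_cv"]
    fit = sm.OLS(df["y"], sm.add_constant(d)).fit()     # Step 3: rerun the regression
    return fit.params["x"], fit.bse["x"], fit.tvalues["x"]

# usage: slope, se, t = simple_slope_by_recentering(my_data, 3.096, 1.045)
# with the simulated data of this chapter this reproduces 4.521, 1.424, and 3.17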

Crossing Point of Simple Regression


Equations with Three-Predictor Interaction

We showed in Chapter 2 for the two-predictor regression equation Ŷ =
b1X + b2Z + b3XZ + b0 that the simple regression equations of Y on X
would cross at the value Xcross = -b2/b3. In the case in which the b3
coefficient for the interaction is zero, the simple regression equations do
not cross, that is, they are parallel and the value of Xcross is undefined.
We can easily extend this analysis and our criterion for ordinal versus
disordinal status to the three-predictor case. Consider two simple regres-
sion equations of Y on X at a fixed value of W but at differing values of
Z, say ZL and ZH. We substitute into equation 4.1 first the value of ZL
(expression 1) and then the value of ZH (expression 2). The two expres-
sions are then set equal to one another, yielding the value of X, the point
at which the two simple regression lines cross:

b1X + b2ZL + b3W + b4XZL + b5XW + b6ZLW + b7XZLW + b0
    = b1X + b2ZH + b3W + b4XZH + b5XW + b6ZHW + b7XZHW + b0


The value of X is:

Xcross at specified W = -(b2 + b6W) / (b4 + b7W)    (4.4)

Note that the denominator of this expression will be zero when both the
XZ and the XZW interactions do not exist, that is, when b4 and b7 are
zero. When the denominator is zero, the simple regression lines are par-
allel. The fact that W appears in equation 4.4 indicates that the crossing
point for the simple regressions of Y on X at values of Z depends on the
specific value of W. Figure 4.1 illustrates this point nicely; at "W Low,"
the left-hand graph of Figure 4.1, the simple regression lines do not cross
within the range of X that is portrayed; at "W High," they do cross. Now
instead of a single cross-over point, there is a line of cross-over points,
with each point on the line corresponding to a different value of W.
If equation 4.4 is evaluated at W = 1.045 corresponding to the "W
High" graph of Figure 4.1, then

Xcross at WH = -[2.8761 + (-1.7062)(1.045)] / [0.5234 + (0.7917)(1.045)]
             = -0.8093

For WL = -1.045 corresponding to the "W Low" graph of Figure 4.1,
Xcross at WL = 15.3310. Note that at WH, the crossing point is well within
one standard deviation of the mean (sx = 7.07), whereas for WL, the
crossing point is over two standard deviations above the mean.
Alternatively, if one has plotted the regression of Y on X at levels of W
within separate graphs confined to particular levels of Z, then the value
of interest is

Xcross at specified Z = -(b3 + b6Z) / (b5 + b7Z)    (4.5)

The denominator of this expression will be zero if both the XW and XZW
interactions are zero.
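Both crossing-point formulas are one-line computations. The sketch below is ours; it uses the Table 4.1a coefficients and reproduces the two values just reported.

# Our sketch: crossing points of the simple regression lines (equations 4.4 and 4.5).
def x_cross_at_w(b2, b4, b6, b7, w):
    return -(b2 + b6 * w) / (b4 + b7 * w)     # Y-on-X lines at values of Z, W held constant

def x_cross_at_z(b3, b5, b6, b7, z):
    return -(b3 + b6 * z) / (b5 + b7 * z)     # Y-on-X lines at values of W, Z held constant

print(x_cross_at_w(2.8761, 0.5234, -1.7062, 0.7917, 1.045))    # about -0.81 ("W High")
print(x_cross_at_w(2.8761, 0.5234, -1.7062, 0.7917, -1.045))   # about 15.33 ("W Low")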

Simple Slopes and their Variances in


a Series of Regression Equations

The simple slope and variance expressions for equations of similar form
follow orderly patterns. The regression of Y on X for equation 2.1 is pre-

Table 4.4
Expressions for Simple Slopes and Their Variances for Various Regression Equations
Containing Two- and Three-Predictor Interactions

Case (1a): Ŷ = b1X + b2Z + b3XZ + b0
  Regression of Y on X; simple slope (b1 + b3Z);
  variance of simple slope s11 + 2Zs13 + Z²s33

Case (1b): Ŷ = b1X + b2Z + b3XZ + b0
  Regression of Y on Z; simple slope (b2 + b3X);
  variance of simple slope s22 + 2Xs23 + X²s33

Case (2): Ŷ = b1X + b2Z + b3W + b4XZ + b5ZW + b0
  Regression of Y on X; simple slope (b1 + b4Z);
  variance of simple slope s11 + 2Zs14 + Z²s44

Case (3): Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b0
  Regression of Y on X; simple slope (b1 + b4Z + b5W);
  variance of simple slope s11 + Z²s44 + W²s55 + 2Zs14 + 2Ws15 + 2ZWs45

Case (4): Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW + b0
  Regression of Y on X; simple slope (b1 + b4Z + b5W + b7ZW);
  variance of simple slope s11 + Z²s44 + W²s55 + Z²W²s77 + 2Zs14 + 2Ws15
    + 2ZWs17 + 2ZWs45 + 2WZ²s47 + 2W²Zs57


sented as Case 1a in Table 4.4. Case 2 of Table 4.4 presents the regres-
sion of Y on X in an equation with one level of increase in complexity
over Case 1: The addition of a W first order term and a ZW interaction.
Because neither of these new terms involves X, they have no effect on the
expressions for the simple slope of Y on X or its variance. The simple
slopes for Case 1a and Case 2 have the same structure. Case 3 of Table
4.4 is increased in complexity from Case 2 by the addition of the XW
interaction; it contains all three two-way interactions among X, Z, and W,
but not the three-way interaction. Since the XW interaction, tested with
the b5 coefficient, does contain X, the b5 coefficient appears in the simple
slope expression for Y on X; the variance of b5 and its covariance with
other predictors appear in the expression for the variance of the simple
slope. Finally, Case 4 is the complete equation including the XZW inter-
action, to which Chapter 4 is devoted. The reader may follow the patterns
illustrated in Table 4.4 to generate simple slopes and variance expressions
for equations of varying complexity but which contain only linear terms
and products of linear terms.

Summary

The procedures for specifying, testing, and interpreting a three-way


XZW interaction are shown to be straightforward generalizations of the
procedures developed for the two-way XZ interaction in Chapter 2. Plot­
ting the three-way interaction is accomplished by plotting two-way inter­
actions at values of the third variable, just as in ANOVA with three-factor
interactions. The methods for generating tests of simple slopes are gen­
eralized to the three-predictor interaction, and the computer approach to
these tests is illustrated for the three-predictor case. General patterns in
the structure of simple slopes and their variances are illustrated for regres­
sion equations of increasing complexity.

Notes

1. Social science research areas differ in their position about the permissibility of omit-
ting lower order terms in regression equations. The only case in which a justification for
this practice may be offered is when strong theory dictates a lower order effect must equal
zero (see Fisher, 1988; Kmenta, 1986).
2. The weight vector used to generate this expression is w' = [1 0 0 Z W 0 ZW], and
Sb is the covariance matrix of the regression coefficients. Then s²b = w'Sb w, as in equation 2.8.

5 Structuring Regression Equations to
Reflect Higher Order Relationships

Many cases exist in the social sciences in which complex relationships


are expected between predictors and a criterion. These more complex re­
lationships often take the form of a monotonically increasing (or decreas­
ing) curvilinear relationship or a U-shaped or inverted U-shaped function.
For example, in psychology the well-known Yerkes-Dodson law (Yerkes
& Dodson, 1908) predicts that the relationship between physiological
arousal and performance will follow an inverted U-shaped function. To
examine these relationships, specific higher order terms must deliberately
be built into the regression equation. If these higher order terms are omit-
ted, nonlinearity will not be detected even when it does exist. Otherwise
stated, the use of the regression equation Ŷ = b1X + b2Z + b3XZ + b0
makes the assumption that only linear relationships of predictors to cri-
terion and a linear by linear interaction of the general form depicted in
Figure 2.1 potentially exist. The analogous situation in ANOVA is the
use of only two levels of each of two factors, a procedure that makes the
implicit assumption that the relationship between each factor and the cri-
terion can only be linear and that any possible interaction must be linear
by linear in form.
In this chapter we explore how higher order relationships are repre-
sented and tested in MR. The reader should be aware at the outset that
this chapter is relatively more complex than the previous ones; hence it
may go more slowly. To help the reader, we have taken a two-stage ap-

proach to our presentation. First, we have illustrated what regression
equations containing higher order terms look like in terms of the forms of
relationships they represent. We begin with a relatively simple equation
(Case 1 below) and gradually build in complexity (Cases 2, 3, and 4).
Second, we consider the post hoc probing of equations presented in the
first stage, again beginning with Case 1 and working through Cases 2, 3,
and 4. The case numbers we have used throughout the chapter are the
same as those in Table 5.1 for each equation. Table 5.1 summarizes the
cases we will consider in depth here.
Our discussion of the representation of curvilinear relationships is lim­
ited to terms involving no more than second powers of the predictor
variables. Certainly effects represented by still higher order powers may
occur and our prescriptions may be generalized to these relationships as
well. However, at present, relationships with higher than second power
terms are rarely, if ever, hypothesized to exist in the social sciences.

Structuring and Interpreting Regression Equations


Involving Higher Order Relationships

For purposes of clarity, throughout this section we will use effect to


signify a general source of variance that parallels the typical partition in
ANOVA (e.g., a main effect, an interaction) and component to signify a
single predictor term (e.g., X2Z) that is part of the effect. We will again
assume that the predictor variables have been centered to facilitate inter­
pretation of the regression coefficients.

Case 1: Curvilinear X Relationship


Suppose that we expect a single predictor X to have a curvilinear rela­
tionship with Y. In this case we would use the following regression equa­
tion to represent the relationship:

Ŷ = b1X + b2X² + b0    (5.1)

The X and X2 terms represent the linear and quadratic components of the
overall “main” effect of X, each with one degree of freedom. Note that
both the X and the X² terms must be included in the equation, even if it
is expected that there is only a quadratic relationship between X and Y.
As is illustrated in Figure 5.1, this equation fits many different appear-

[Figure 5.1 about here: illustrative forms of the curvilinear relationship between X and Y represented by equation 5.1 (panels a through e).]

ances of the relationship between X and Y. Because the X and X² terms
form a building block for more complex equations, the interpretation of
the regression coefficients in equation 5.1 is explored here in some detail.
With centered predictors, the b1 coefficient indicates the overall linear
trend (positive or negative) in the relationship between X and Y across the
observed data. If the linear trend is predominantly positive, as in Figures
5.1a,b, b1 is positive; if the linear trend is predominantly negative, as in
Figure 5.1c, then b1 is negative. For the completely symmetric U-shaped
and inverted U-shaped relationships depicted in Figures 5.1d and 5.1e,
respectively, b1 is zero. The interpretation of b1, then, is consistent with
all previous interpretations in Chapters 2 through 4, when centered pre-
dictors are employed.
The b2 coefficient indicates the direction of curvature. If the relation-
ship is concave upward, as in Figures 5.1a,d, then b2 is positive; if the
relationship is concave downward, as in Figures 5.1b,c,e, then b2 is neg-
ative. When the curve is concave upward (b2 positive), we often are in-
terested in the value of X at which Y takes on its lowest value, the mini-
mum of the curve, as in Figures 5.1a,d. When the curve is concave
downward (b2 negative), we may seek the value of X at which Ŷ reaches
its highest value, the maximum of the curve, as in Figures 5.1b,c,e. As
we explain later, the maximum or minimum point of the function is
reached when X = -b1/2b2. If this value falls within the meaningful
range of X, then the relationship is nonmonotonic and may appear as in
Figures 5.1d,e. If this value falls outside the meaningful range of the data,
then the relationship appears monotonic, as in Figures 5.1a,b,c. It should
be noted that the distance from the maximum or minimum point to the
mean of X is invariant under additive transformation.
When X bears a linear relationship to Y (i.e., no higher order term
containing X appears in the equation), a one-unit change in X is associated
with a constant change in Y, regardless of the value of X. In contrast,
when X bears a curvilinear relationship to Y (i.e., there is a higher order
term containing a power of X in the equation such as X²), then the change
in Y for a one-unit change in X depends upon the value of X. This can be
verified by noting the change in Y as a function of X in Figure 5.1.
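The point is easy to verify numerically. The snippet below is our illustration; it uses the example equation reported later in this chapter for Figure 5.3 (Ŷ = 4.993X + 6.454X² + 3.197) and shows that the same one-unit step in X produces very different changes in Ŷ depending on where the step starts.

# Our sketch: the change in predicted Y for a one-unit step in X depends on the starting X.
def y_hat(x, b1=4.993, b2=6.454, b0=3.197):    # example curvilinear equation from this chapter
    return b1 * x + b2 * x ** 2 + b0

for x0 in (-2.0, 0.0, 2.0):
    print(x0, y_hat(x0 + 1.0) - y_hat(x0))     # change in Y-hat for a one-unit increase in X
# prints roughly -14.4 at X = -2, 11.4 at X = 0, and 37.3 at X = 2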

A Progression of More Complex Equations with Curvilinear


Relationships
We now consider in turn each of a series of regression equations that
are used to represent specific higher order relationships. To illustrate our

Structuring: Regression Equations
67
prescriptions for testing and post hoc probing of interactions, we
employ throughout this chapter a single simulated data set. The
bivariate normal pairs of scores used in Chapter 2 were employed here;
X and Z are centered and moderately correlated,
rxz = 0.42. The higher order terms X², XZ, and X²Z were then generated
from the centered terms, and all terms were used to produce Ŷ using the
regression equation Ŷ = 0.2X + 5.0X² + 2.0Z + 5XZ + 1.5X²Z.
Finally, observed Y scores were generated from predicted Ŷ scores by the
addition of normally distributed random error. Linking this simulation to
a substantive example, predictor X represents an individual’s self-concept
(i.e., how well or poorly an individual evaluates himself or herself over­
all). The criterion У represents an individual’s level of self-disclosure, or
the extent to which the individual shares personal information with others
Self-disclosure (У) has been found to be a U-shaped function of self,
concept (X); individuals with low or high scif-conccpts tend to disclose
more about themselves than persons with moderate self-concepts. Predic­
tor Z represents the amount of alcohol consumed in a social situation in
which an individual has an opportunity to sclf-disclosc. Self-disclosure is
expected to increase with increased alcohol consumption: a linear rela­
tionship is assumed in the absence of theory specifying a more complex
relationship.
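A data set with this general structure can be simulated in a few lines. The sketch below is ours, not the authors' original simulation code; the sample size, error standard deviation, and random seed are arbitrary choices, and the generating coefficients follow our reading of the equation reported above.

# Our sketch of simulating data like the Chapter 5 example (numpy assumed).
import numpy as np

rng = np.random.default_rng(0)                         # arbitrary seed
n = 400                                                # arbitrary sample size
cov = [[1.0, 0.42], [0.42, 1.0]]                       # X and Z moderately correlated, r = .42
x, z = rng.multivariate_normal([0.0, 0.0], cov, size=n).T   # centered predictors
y_hat = 0.2 * x + 5.0 * x ** 2 + 2.0 * z + 5.0 * x * z + 1.5 * x ** 2 * z
y = y_hat + rng.normal(scale=5.0, size=n)              # observed Y = predicted Y + normal error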

Case 2: Curvilinear X Relationship


and Linear Z Relationship
In a two predictor equation, if it is expected that predictor Z will have
a linear effect on the outcome Y but that predictor X may have a curvilinear
effect on Y, then the following equation would be used:

Ŷ = b1X + b2X² + b3Z + b0    (5.2)

Figure 5.2a illustrates an equation of this form. The simple regression


lines are plotted in two different ways: First, the simple regression lines
of У on X at values of Z are illustrated in Figure 5.2a(i). Each curve
represents a single level of alcohol consumption (ZL, ZM, ZH). The same
curvilinear relationship of У to X is observed al each level of alcohol
consumption. The simple regression lines of У on Z at values ot X are
portrayed in Figure 5.2a(2). Note that the regression of У on X is curvi­
linear, whereas the regression of У on Z is linear. In both cases, the simple
regression lines are parallel, because there are no terms to represent an
interaction between X and Z. Because the relationship of Z to У is linear,
there is equal displacement of the simple regression lines of 1 on X at

CamScanner
CamScanner
Structuring Regression Equations 69

Case 3: Curvilinear X Relationship,
Linear Z Relationship, and
Linear by Linear XZ Interaction
Wc now add a term representing a simple linear by linear interaction
to the previously considered case» resulting in the following equation:

Ŷ = b1X + b2X² + b3Z + b4XZ + b0    (5.3)
* + byZ + btXZ + b„ (5.3)

In our example wc might hypothesize that the curvilinear relationship of


self-disclosure to self-concept exists at all levels of alcohol consumption,
but that alcohol consumption interacts with self-concept in determining
self-disclosure. At high alcohol consuumption, the high self-disclosure of
those with high self-concepts would even be increased (a bit of self-ag­
grandizement, perhaps), whereas at a very low alcohol consumption level,
the self-disclosure of those with high self-concepts would be diminished
(a bit of humility, perhaps). This is a complex hypothesis, but not out of
the realm of possibility.
The effect of the introduction of the linear by linear interaction com­
ponent is illustrated in Figure 5.2b for the simulated data set. Again, the
regressions are plotted in two ways. Figure 5.2b(l) shows the simple
regression lines depicting the curvilinear regression of У on X at levels of
Z. Because of the interaction, the curves are no longer parallel. However,
since the interaction only involves first order terms in both X and Z, the
curves are identical in shape, that is, all are mildly concave upward. But,
in keeping with the hypotheses of our example, at ZH there is an overall
increasing trend; while at ZL the trend is an overall decrease in self-dis­
closure. Figure 5.2b(2) shows the same interaction from the perspective
of the simple regression lines depicting the regression of Y on Z at levels
of X. Note that the simple regression lines no longer cross at one single
point, as they did in equation 2.1 (see also Figure 3.1).
Case 4: Curvilinear X Relationship,
Linear Z Relationship, and
a Curvilinear X by Linear Z Interaction
A quadratic X by linear Z interaction component is now introduced:

Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0    (5.4)

This equation is again plotted in two ways in Figure 5.2c, illustrating first
the regression of У on X and then the regression of У on Z. As is most

CS CamScanner
70 MULTIPLE REGRESSION J
■ '■ ■'
' 7 ■ .■ '
. 7< .7 '
. '

' "
. '
. ■ ■7 ; J.<
. '4

clearly seen in Figure 5.2c(l), the meaning of (he quadratic by linear Xl%
term is that the quadratic relationship between X and F varies in formal Ж
a function of the value of Z. If we had hypothesized that the curvilinear 3
relationship between self-disclosure and self-concept would be increas-Л
ingly manifested as alcohol consumption increased, equation 5.4 would
have been appropriate to test the prediction.
' ' 7 . w J
Case 5: Curvilinear X Relationship, J
Curvilinear Z Relationship,
and Their Interactions
Suppose finally that both the X and Z predictors were expected to have |
curvilinear effects. In this case, both predictors would be treated as pre- J
dictor X was in equation 5.2 as represented in the first four terms of equa- 5
tion 5.5. In addition, up to four components of the interaction may be
included in the regression equation, namely the XZ, XZ2, X2Z, and Х2'2
уЦ
*
terms: ''777:. M

Ŷ = b1X + b2Z + b3X² + b4Z² + b5XZ
    + b6XZ² + b7X²Z + b8X²Z² + b0    (5.5)


/ 7 ' ; : ■ ' ' . ' ■ ' 7-' - 7'1
The two plots of this equation would yield graphs like that in'FiguOw
5.2c(l) for both the regression of F on X at levels of Z and the regression: 1
of У on Z at levels of X. (We mention this case but do not consider it
■ further.) . 7.\A-|

7- 7 ' : . ■ ■ '• 7 ■ ■ 7
Representation of Curvilinearity in ANOVA Versus MR 1

In MR two separate predictors are required to represent a “main effect :


that involves a second order term, as is shown in equation 5.1 'Го repRw
sent a two-way interaction, up to four additional terms may be
as in equation 5.5. In contrast, in ANOVA we are accustomed to having •
one source of variation for each main effect and one soua'c of variation
for each interaction. Dejxmding on the number of levels of each facicr. L
the source of variation will have one or more than one degree of freedom
Regardless of the number of degrees of freedom associated with ejcli
main effect and interaction, each source of variation is separately
for significance with a single omnibus test.
The key to understanding the direct equivalence of MR and ANOVA .
lies in the fact that each of the overall main effects and interactions «г> ■'
ANOVA aggregates together all the linear and higher order nonlinear paf- Jj,

CamScanner
Structuring Regression Equations 71

titions of variation that arc represented by separate components in MR.


Consider a main effect in ANOVA with three ordered levels of a factor
and therefore two degrees of freedom. The main effect partition combines
all of the linear and second order variation into one source of variation.
If we follow the same procedure with MR equation 5.1, we find that the
total predictable variation based on both of the predictors and two degrees
of freedom is identical to the main effect variation of the ANOVA. In our
example, suppose an experiment were performed with three groups of
subjects, one with low self-concept, one with moderate self-concept, and
one with high self-concept. In an ANOVA there would be one main effect
of self-concept with two degrees of freedom that subsumed any linear
trend in the relationship of self-concept to self-disclosure as well as the
curvilinear relationship. (Note that we do not advocate splitting a contin­
uous variable into groups so that ANOVA may be used—this is only for
illustrative purposes. See Chapter 8 for a discussion of the costs of split­
ting continuous variables to create groups for an ANOVA.)
Alternatively, the main effect of the ANOVA can be partitioned into
linear and quadratic components, each with one degree of freedom, using
orthogonal polynomials (e.g., Kirk, 1982; Winer, 1971). Such equiva­
lence of MR and ANOVA can also be shown for more complex cases
such as equation 5.4. Here we would have a two-way ANOVA with three
levels of Factor X and two levels of Factor Z. Now, in addition to the
main effect of X with two degrees of freedom described above, there is
an interaction with two degrees of freedom. This interaction effect cor­
responds to the two interaction terms in equation 5.4. Once again the
overall ANOVA interaction could be partitioned into linear by linear and
a quadratic by linear components with orthogonal polynomials.
Note, however, that there is an important difference between ANOVA
and MR in usual practice. In ANOVA with multiple levels of a factor and
the use of usual approaches to variance partitioning, any curvilinear vari­
ation is automatically subsumed in the variance partitions. In contrast, in
MR the analyst specifically decides which terms need to be included:
Terms to represent curvilinear relationships must be built systematically
into the equation. This is not to say ANOVA and MR differ mathemati­
cally. Rather it is to say that the conventional partitions of variance op­
erationalized in common statistical packages for ANOVA are structured
so that all components of an effect are subsumed in the omnibus term for
that effect; in MR, the structuring of the components of each effect is left
entirely to the analyst.
- Failure to structure the regression equation to reflect the theoretically
] expected curvilinear relationships and their interactions can lead to errors
• ■ ■

/ ...

CamScanner
of interpretation, sometimes of substantial magnitude. The
relationships in Figures 5.2a,b,c are all based on the same data. What
differs between these portions of Figure 5.2 is the complexity of the equa-
tions used to fit the data. Recall that these data were simulated such
that the data were in fact generated by an equation of the form depicted
in Figure 5.2c.
nrcP
It may appear that the researcher must be more informed abo
nature of the relationships of predictors to criteria to use
ANOVA. After all, the ANOVA automatically includes curvilinear '
ponents in the variance partitions. In actuality, in the planning of JT
*
periment, the researcher must pick the number of levels of each fact *
based on some assumption or knowledge of the relationship of the facton
to the criterion. If there is only a linear relationship expected, two leveh
suffice to estimate the linear effect (but three levels arc required to test for
nonlinearity). If curvilinear relationships arc expected, at least three levels
are required. In an analogous manner, the researcher using MR whois
suspicious that there are nonlinearities in the relationships may explore
these relationships by using equations containing appropriate higher order
terms (see also Chapter 9).2

Post Hoc Probing of More


Complex Regression Equations

The procedures for probing interactions of the form XZ and XZR'de-


veloped in previous chapters directly generalize to more complex regret
sion equations. In this section we show how simple regression equations
can be derived and tested for the curvilinear X case and for each of the
progression of regression equations just considered. Indeed, as a useful
summary, expressions for the simple slopes and variances for each ot i
these equations are summarized in Table 5.1. We also show how to cal­
culate the crossing points for the simple regression lines corresponding 1°
several of the equations. Finally, we introduce new tools for probing sim­
ple regression lines that are curvilinear in form.

Case 1: Curvilinear X Equation

We begin with the simple curvilinear (second order) equation involving


only X and X terms represented by equation 5.1, which is reproduced
below:

Ŷ = b1X + b2X² + b0

CamScanner
Structuring Regression Equations
73
As before, we rewrite the equation to show the regression of yon X:

Ŷ = (b1 + b2X)X + b0    (5.6)

From equation 5.6 we see that the regression of Y on X depends on the
specific value of X; that is, at any particular value of X, the value of Ŷ
may be decreasing, not changing much at all, or increasing. Inspection
of Figure 5.3 also confirms this result.
The simple slope for the regression of Y on X in equation 5.1 is not
actually (b1 + b2X). Rather, the correct expression for the simple slope is (b1
+ 2b2X). The straightforward approach of rearranging regression equa-
tions to identify simple slopes does not generalize to equations containing
curvilinear components (e.g., X²) of first order terms. Finding simple
slopes for equations containing curvilinear components requires a bit of
calculus, which we provide below, with graphic illustration. We strongly
encourage readers to work through this presentation and the numerical
example in order to understand the remainder of the chapter.

Simple Slope Expressions as


Derivatives of Regression Equations

In considering the XZ and XZW interactions in previous chapters, we


have only dealt with linear changes in Y associated with linear changes in
X. But, how does one measure the regression of У on X at a single value
of X along a curve? A mathematical operation in calculus known as dif­
ferentiation (taking the first derivative) provides the answer. This opera­
tion provides the slope of a tangent line to a curve at any point on the
curve.3 Figure 5.3a illustrates a tangent line to a curve at one value of X,
namely Xz; the curve follows the form of equation 5.1. The tangent line
shows the regression of У on X at Xf. The slope of the tangent line to the
curve at X, measures the simple regression of У on X at that specific va ue
of X; thus the slope of the tangent line is the simple slope for the regres­
sion of У on X at one value of X. Putting this all together
votive of the curve with respect to X evaluated at one \a ue oj -
simple slope of Yon X at that value of X.
Using rules from calculus for differentiating a function ea s
derivative of equation 5.1 with respect to X:

dŶ/dX = b1 + 2b2X    (5.7)

This derivative is the general expression for,h® “Equation 5.1-


the curve, where the curve is defined by overa g

cs CamScanner
CamScanner
I
Structuring Regression Equations 75

If the value Xt is substituted into equation 5.7, the resulting value is the
value of the simple slope of У on X at Xh as illustrated in Figure 5.3a.
Otherwise stated, equation 5.7 is an expression for the linear change in Y
associated with a linear change in X at any specific value of X.
I The general definition of the simple slope as the first (partial) derivative
/, is applicable to all the simple slope expressions in Chapters 2 through 4.
Consider the regression equation with one XZ interaction, У - b}X +
b2Z + byXZ + b0. We take the first (partial) derivative of this equation
with respect to X:

∂Ŷ/∂X = b1 + b3Z    (5.8)
ал

J* This is the same expression for the simple slope as was given in Chapter
2 and summarized in Table 4.4, Case 1.
At this point we will formalize the definition of simple slopes to en-
" compass all the cases throughout the text. The simple slope of the regres­
sion of Y on X is the first (partial) derivative of the overall regression
equation with respect to the variable X. This function indicates the slope
of the regression of У on X at a particular value of at least one variable,
either some other variable Z, the variable X itself, or some combination
of variables.
Returning to the simple curvilinear equation, what does its first deriv-
’ ative tell us? Consider the regression equation illustrated in Figure 5.3a:
Y = 4.993Х + 6.454X2 + 3.197, where X is centered^ Using the expres-
i sion for the derivative, we see that at the mean of X (X = 0), the regres­
sion of У on X is positive:

+ 2b2X = 4.993 + 2 (6.454) (0) = 4.993

At XH, one standard deviation above the mean ($x = 0.945), XH = 0.945
so that the regression of У on X at XH is 4.993 + 2(6.454)(0.945) =
17.191. At XL = -0.945, the regression of У on X is -7.2051. Thus at
one standard deviation below the mean (Xu), there is a negative relation-
/ ship of X to У; whereas at one standard deviation above the mean (XH)
the relationship is strongly positive.
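Spot evaluations of this kind are easily scripted. The sketch below is ours, using the same example equation; it evaluates b1 + 2b2X at several values of X and also returns the value of X at which the slope is zero.

# Our sketch: evaluating the simple slope b1 + 2*b2*X of equation 5.1 at chosen values of X.
b1, b2 = 4.993, 6.454                       # example equation of Figure 5.3

def simple_slope(x):
    return b1 + 2.0 * b2 * x

for x in (-0.945, 0.0, 0.945):              # one SD below the mean, the mean, one SD above
    print(x, round(simple_slope(x), 3))     # about -7.205, 4.993, and 17.191
print(-b1 / (2.0 * b2))                     # about -0.387, where the slope is zero (the minimum)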

Minimum or Maximum of Curve


f At what point is the regression of У on X equal to zero? Inspection of
Figure 5.3b shows that the regression is zero at the minimum of the curve,

CS CamScanner
J

76 MULTIPLE REGRESSION J

that is. the value of X at which the predicted value of У is its lowest. This
point on the curve is where the tangent line to the curve (first derivative)
has a slope equal to zero. Using expression 5.7, we solve for that value
of X that causes expression 5.7 to equal zero:

b1 + 2b2X = 0
b1 = -2b2X
-b1/(2b2) = X at Ŷminimum    (5.9)
For these data, X = -4.993/(2)(6.454) = -0.387. At the value of X


~ -0.387, the regression of У on X is zero and the predicted value of У
is minimum. In our example, this would indicate that self-disclosure is
lowest just below the mean self-concept in the sample.
Note that for simple curvilinear regression equations representing in*
verted U-shaped functions the regression of У on X is zero at the maximum
predicted value of У. Applied to curves such as those of Figure 5.1b,c,e
equation 5.9 would yield the value of X at that maximum.4

Overall Trend in the Relationship ofXtoY

At this point we revisit the by coefficient in the overall regression equa­


tion 5.1 and contrast it with the simple slope of У on X at any value of X
The by coefficient, evaluated when centered X is equal to 0 (at the mean),
represents the regression of У on X at the mean of X, or the conditional '
effect at its mean. If it is positive, this indicates that in the region of the
mean, У is increasing; if negative, that У is decreasing. For our example
because by = 4.993 in Figure 5.3b, in the region of the mean of seffi |
concept, self-disclosure increases as self-concept increases. Further, Й |
one calculated the regression of У on X for the value of X of each case & J
the observed sample (giving each case equal weight), on average thi * ,1
regression would be positive, with an average slope of 4.993. Across the j
observed cases, self-disclosure is on average higher with high than wllh J
low self-concept. In contrast to the by coefficient, the simple slope, 1
before, measures the regression of Уоп X at a single value of X. Evaluate |
at X — 0 for centered X, the simple slope will equal the by coefficient, < |
has been true for the linear cases previously presented in Chapter®
through 4. N

CamScanner
у

Structurinj; Regression Equations 77


\
Testing Simple Slopes

The computation of simple slopes to characterize higher order relation


ships provides a useful way to summarize these relationships and to place
numerous results in context. The presence of curvilinear relationships, for
example, has at times produced great consternation in research literature.
Some researchers report positive relationships of X to K; others find neg­
ative relationships; and still others report no relationship whatever. The
basis of these discrepancies may well be in the range of X sampled across
studies. Returning to Figure 5.3, one can see that if a small range ofXat
the low end of the continuum were sampled, a negative relationship would
be observed. If a small range at the high end were sampled, a steeply
positive relationship would be reported. Untangling these apparently con­
flicting results requires determining the range of values of X for which the
relationship is negative, not different from zero, or positive. Simple slope
tests aid in this task.
We may wish to test whether simple slopes representing the regression
of Y on X at each value of X are significantly different from zero.5 The
standard error of the simple slope (bx 4- 2b2X) is

sb = [s11 + 4Xs12 + 4X²s22]^(1/2)    (5.10)

The /-test is then (b} 4- 2b2X)/sh with n - к - 1 df, where к = 2 for


X and X2.
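In matrix terms, equation 5.10 is again the quadratic form w'Sb w, here with weight vector w' = [1 2X] applied to the 2 x 2 covariance matrix of b1 and b2. The small sketch below is ours; the covariance values are those of Table 5.2a.

# Our sketch: simple slope b1 + 2*b2*X and its standard error (equation 5.10).
import numpy as np

b1, b2 = 4.993, 6.454
S = np.array([[2.14449, -0.10432],          # covariance matrix of b1 and b2 (Table 5.2a)
              [-0.10432, 1.26567]])

def slope_and_se(x):
    wt = np.array([1.0, 2.0 * x])
    return b1 + 2.0 * b2 * x, float(np.sqrt(wt @ S @ wt))

print(slope_and_se(0.945))   # standard error of about 2.504 matches Table 5.2; slope follows equation 5.7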
Table 5.2 provides an illustration of the analysis of simple slopes for
the regression equation in Figure 5.3. Simple slopes are given at two
standard deviations (-1.890) and one standard deviation (-0.945) below
the mean of X, at the minimum of the curve (X = -0.387), at X = 0.00,
and at one standard deviation (X = 0.945) above the mean of X. At the
mean of X and above, there is a strong positive relationship between X
and Y that increases with increasing X; whereas at the point two standard
deviations below X (X = -1.890), there is a strong negative relationship.
The interpretation of an equation such as equation 5.1, which contains
both linear and nonlinear functions of the same variable, is conceptually
problematic—one cannot think of varying the second order term (X2)
while holding the linear term (X) constant. The simple slope approach
handles this problem by repartitioning the predictable variation into a se­
ries of regressions of У on X.6

CS CamScanner
78 MULTIPLE REGRESSION |

Table 5.2
Simple Slopes at values of X for the Regression Equation V = 4.993X +
6.454X2 + 3.197
, ,. .... - .. . ---- _ -
a. Variance Covariance Matrix of Regression Coefficients л
b< bt I

2.14449 -0.10432
/>,.-0.10432 1.26567 J

b. Compulation of Standard Error of Simple Slope at XH = 0.945, where sx я 0.945

(i) Simple slope: b} + b3X = 4.993 + 6.454(0.945) = 11.092


(ii) Variance of simple slope: s2 = । + 4Xs)2 + 4X2j22
= 2.144 4- 4 (0.945)(-0.104)
+ 4 (.945)2 (1.266)
= 6.271
Standard error of simple slope: sh = >/6.271 = 2.504

c. /-Tests for Simple Slopes

X Simple slope Standard error


-1.890 -19.404 2.730

:
1

~—
-0.945 -1.106 3.657

О 4.
О
I


-0.387 0.000 1.750

—О
UJ О

1
0.000 4.993 1.464

4*

?
0.945 11,092 2.504

4-
u4*
—. я
♦•♦/> < .001, ■ . J
- ■: '■O=|

The Progression of More Complex Curvilinear Equations Revisited

We now apply our prescriptions for post hoc probing to the progression
of regression equations considered in the first section of this chapter, Tte |
equations, their simple slopes, and the simple slope variances are sunt-1
marized in Table 5.1, Cases 2 though 4. Case 2 is a simple extension M d
Case 1. It is in Case 3 and Case 4 that we encounter the combination
higher order terms and crossproduct terms.

Case 2: Curvilinear X Relationship й


and Linear Z Relationship
Equation 5.2 represents an equation of this form and is reproduced НЙЗ
Y - bxX + b2X2 b3Z + b0. This equation, illustrated in Figure
builds on the simple second order curvilinear equation just considered Ml
the addition of the first order Z term. Differentiating with respect to X J

CamScanner
Structuring Regression Equations 79

yields dY/dX = (b} 4- 2b2X), the same simple slope for Y on X as in


equation 5.1. This first (partial) derivative or simple slope does not con­
tain Z, because Z does not interact with X. The only effect of changing
the value of Z is to change the elevation of the whole curve, as can be
seen in Figure 5.2a. Changing the elevation of the curve is reflected in
changes in the intercept and in the value of Y at the maximum or minimum
point of the curve. The value of X at which the minimum or maximum is
reached is unaffected by Z.
The variance of the simple slope7 is given in Table 5.1, Case 2. It is
identical to that in the simple curvilinear regression equation (Case 1 of
Table 5.1). The r-test of simple slopes at values of X follows directly with
n - к - 1 df, where к = 3.

Case 3: Curvilinear X Relationship,


Linear Z Relationship,
and Linear by Linear XZ Interaction

This set of relationships is represented by equation 5.3, which is repro­


duced here: Y = bxX 4- b2X2 + b3Z 4- b4XZ 4- b0.
This equation adds a linear by linear interaction between X and Z to the
case we just considered. We first consider the regression of Y on X (Case
3a of Table 5.1); then we consider the regression of Y on Z for the same
equation (Case 3b).

Case 3a: Reexpressed Regression Equation Showing Regression of У


on X. It is difficult conceptually to characterize the relationship of X and
Zto Y from inspection of equation 5.3. Reordering and grouping terms in
equation 5.3 presents the equation as a simple regression of У on X at
values of Z, a much more easily interpretable form. The rearranged equa­
tion has the same second order polynomial form (linear term, squared
term, constant) as equation 5.1:

Ŷ = (b1 + b4Z)X + b2X² + (b3Z + b0)    (5.11)

J The (h{ 4- b4Z) coefficient in equation 5.11 takes on the same meaning
in the simple regression equation as does b} in equation 5.3 and indicates
the overall linear trend in the regression of У on У at one value of Z. If
- (Ь, + b4Z) is positive, the simple regression has an overall upward linear
Л .trend; if it is negative, an overall downward linear trend. However, the
; nature of the curvature, measured by b2, is independent of Z because Z ?
■\ does not interact with X2.
, ib J- ' Y-' ' v ■ ■ • ■ ■ ■■ ■ . > . . ■ . •- . ; ■ •
• -r ■ . ' • ' • ... ■ ■ *
’.Ж

CamScanner
MULTIPLE HEGRIINSIOH

We can use equation 5.11 to calculate the simple regression equations


characterizing each of the three curves in Figure 5.2b(l), The overall
regression equation is Г = 1.125X + 3.563X2 4- 3.608Z 4 2.935XZ +
3.246. With centered Z, and s7 = 2.200, we can substitute values of £
into equation 5.11 to calculate the simple regression equations at ZL, ZM(
and ZH:

In general, f = (1.125 4- 2.935Z)X 4- 3.563X2 + 3.608Z


4- 3.246
ForZL = -2.20, Y = [1.125 4- (2.935)(-2.20)]X 4- 3.563X2

4- 3.608(-2.200) 4- 3.246
Y = -5.332X 4- 3.563X2 - 4.512
ForZM = 0.00, Y = 1.125X 4- 3.563X2 4- 3.246

For ZH = 2.20, Y = 7.582X 4- 3.563X2 4- 11.364

Because X is centered, the coefficient of X in each of the three simple


regression equations can be interpreted as the conditional effect of X on
Y at the mean of X for one value of Z, or the overall linear trend of the
relationship of X to У at one value of Z. From an inspection of Figure
5.2b(l), the overall linear trend is negative at ZL, very slightly positive
at ZM, and more strongly positive at ZH.
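These substitutions can be scripted as well. The sketch below is ours, using the coefficients of the overall equation reported above; it returns the coefficient of X, that is, the overall linear trend of Y on X, at any chosen value of Z.

# Our sketch: coefficient of X in the simple regression of Y on X at a value of Z (equation 5.11).
def x_coefficient(z, b1=1.125, b4=2.935):
    return b1 + b4 * z                   # overall linear trend of Y on X at this value of Z

for z in (-2.2, 0.0, 2.2):               # Z_L, Z_M, Z_H: one SD below, at, and above the mean
    print(z, x_coefficient(z))
# reproduces the X coefficients -5.332, 1.125, and 7.582 of the simple regression equations above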

Case 3a: Simple Slopes. The reader should recall the distinction be­
tween coefficients found by reexpression of the overall regression equa­
tion, as in equation 5.11, and simple slopes. To determine the simple
slope of the regression of У on X at any single value of X, we differentiate
equation 5.3 with respect to X:

∂Ŷ/∂X = b1 + 2b2X + b4Z    (5.12)

The simple slope of У on X at any single value of X now depends on the


value of Z, as well as on the value of X. To interpret equation 5.3, we
would calculate the simple slopes at all nine combinations of XL, XM, and
XH with ZL, ZM, and Zu. To illustrate, consider the simple slope at Z1( s
2.200 and XL = —0.945. Substituting into equation 5.12, we find that
this simple slope is 1.125 4- 2(3.563)(—0.945) 4- 2.935(2.200) *

CamScanner
Structuring Regression Equations 81

0.848. Table 5.3 provides a matrix summarizing the simple slopes at all
nine combinations of XL, XM, and XH crossed with ZL, ZM, and ZH, Row
1, column 3 of the tabic shows the simple slope at Xlt ZM to be 0.848 as
we just calculated. Thje second row shows the regression of Ton X at the
mean of centered X (X *= 0) for ZL, ZM, and ZH. Note that these values
are identical to the first order coefficients of the simple regression equa­
tions presented above. When X = 0, the regression coefficient (b^ + ft4Z)
for X in reexprcssed regression equation 5.11 and the simple slope of Y
on X at a particular value of Z, that is, (+ 2b2X + b4Z) of equation
5.12, are equal.
When X is centered, both equations indicate the average regression of
У on X at the mean of X for particular values of Z, or the average slope
of the regression of Ton X across all the cases in the sample.
The set of simple slopes presented in Table 5.3 leads to a useful sum­
mary of the outcome of the regression analysis. When X is low (row 1),

Table 5.3 л ,
Probing Simple Slopes in the Equation Y = 1.125X + 3.563X^ + 3.61Z +
2.935XZ + 3.246 (Simple Slopes Are Found Using the Expression (b\ + 2bzX
+ btZ))

(1) (2) (3)


zL A, Zh
-2.200 0 2.200

Simple slope -12.065 -5.609 0.848


(1) XL = -0.945 standard error 2.628 2.819 3.786
t —4.591
*** *
-1.990 0.234

Simple slope -5.332 1.125 7.582


(2) XM » 0 standard error 2.324 1.532 2.153
/ *
-2.294 0.734 ***
3.522

Simple slope 1.403 7.859 14.316


(3) XH - 0.945 standard error 3.900 2.839 2.502
i 0.360 2.768 5,722
***

Sb; Covariance matrix of b coefficients:

b> Ьг bj
bl 2.34649 0.01536 -0.41530 -0.08693
*
s = h* 0.01536 1.58396 -0.04381 -0.49227
by 0.41530 -0.04381 0.43079 0.01174
-- . b* -0.08693 -0.49227 0.01174 0.55213

*P<
" .0001; "p < .01; *p < .05

CamScanner
82 MULTIPLE REGRESSION

the regression of Y on X moves from highly negative to close to zero as
Z increases. At the mean of X (row 2), the regression of Y on X passes
from significantly negative to significantly positive as Z increases. When
X is high (row 3), the regression of Y on X becomes increasingly positive
as Z increases.
Table 5.3 illustrates the dramatic differences in the regression of Y on
X as both X and Z vary. Once again, the use of simple slopes has led to
a repartitioning of three sources of variation, those due to X, X², and XZ,
leading to a straightforward interpretation of the regression analysis.

Case 3a: Standard Error and t-Test. The variance of the simple slope
is given in Table 5.1, Case 3a; its square root is the required standard
error; and the t-test has n - k - 1 df, where k = 4.
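These computations are easy to verify numerically. The fragment below is a minimal
sketch of ours (Python with NumPy and SciPy; the function name and layout are
illustrative, not part of the original analyses) that reproduces the XL, ZH cell of
Table 5.3 from equation 5.12 and the covariance matrix Sb reported with that table.

import numpy as np
from scipy import stats

b1, b2, b4 = 1.125, 3.563, 2.935        # coefficients of equation 5.3 (centered data)
S_b = np.array([                        # covariance matrix of b1-b4 from Table 5.3
    [ 2.34649,  0.01536, -0.41530, -0.08693],
    [ 0.01536,  1.58396, -0.04381, -0.49227],
    [-0.41530, -0.04381,  0.43079,  0.01174],
    [-0.08693, -0.49227,  0.01174,  0.55213]])

def simple_slope_y_on_x(x, z, n=400, k=4):
    slope = b1 + 2 * b2 * x + b4 * z            # equation 5.12
    w = np.array([1.0, 2 * x, 0.0, z])          # weight vector for Case 3a
    se = float(np.sqrt(w @ S_b @ w))
    t = slope / se
    return slope, se, t, 2 * stats.t.sf(abs(t), df=n - k - 1)

print(simple_slope_y_on_x(-0.945, 2.200))       # XL, ZH: slope 0.848, se 3.786, t 0.234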

Case 3a: Minimum or Maximum of Curve. Inspection of Figure 5.2b(1)
suggests that the value of X at which Ŷ has its minimum value for each
regression curve depends on the value of Z. Setting equation 5.12 to zero
and solving for X yields the following expression for the value of X at
which Ŷ is minimum for each of the regression curves:

X = -(b1 + b4Z) / (2b2)     (5.13)

Equation 5.13 produces the value of X corresponding to the minimum
predicted value for U-shaped regression curves.9 Substituting the value
for ZL = -2.20 into the equation, the minimum value of Ŷ is found at
X = -[1.125 + (2.935)(-2.20)]/[(2)(3.563)] = 0.75. For ZM and ZH, the
minimum values of the regression curves are found at X = -0.16 and
-1.06, respectively. In terms of our example, the value of self-concept
at which self-disclosure (Y) begins to increase occurs at ever lower levels
of self-concept (X) as alcohol consumption (Z) increases from ZL to ZH.
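The same arithmetic can be scripted directly; this brief fragment (ours, assuming the
coefficient values just given) evaluates equation 5.13 at ZL, ZM, and ZH.

b1, b2, b4 = 1.125, 3.563, 2.935
for z in (-2.20, 0.0, 2.20):                     # ZL, ZM, ZH
    x_min = -(b1 + b4 * z) / (2 * b2)            # equation 5.13
    print(round(x_min, 2))                       # 0.75, -0.16, -1.06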

Case 3a: Crossing Point. The curves corresponding to the simple
regression equations for the regression of Y on X will all cross at a single
point, X = -b3/b4. Substituting in the values of b3 and b4, we find that
the simple regressions cross at the value X = -3.608/2.935 = -1.230.
In Figure 5.2b(1), XL, the lower limit of X illustrated, is -0.95 (i.e., X̄
= 0 and sx = 0.95). Hence the three curves intersect just below the lowest
value of X illustrated. If the range of X within which the three simple
regressions of Y on X crossed were meaningful, then the interaction would
be interpreted as disordinal. If, in contrast, the low value of X at which the
curves cross is not meaningful, then the interaction would be interpreted as
ordinal. In our example, if values below XL represented the self-concept
scores of clinically depressed individuals (a different population), then we
would state that the interaction was ordinal within the "normal" population
above XL in self-concept. Note that if the b4 term for the XZ interaction
were zero, then the simple regressions would not cross. In this case, the
equation reduces to the simpler regression equation 5.2, Ŷ = b1X + b2X² +
b3Z + b0, which is depicted in Figure 5.2a(1).

Case 3b: The Regression of Y on Z. To this point we have considered
only the regression of Y on X. As we have seen in previous chapters,
probing the regression of Y on Z is also likely to be informative given the
presence of the XZ interaction. The interaction signifies that the regression
of Y on Z will also depend on values of X. For the present case of a simple
XZ interaction, the regressions of Y on Z will all be linear. The simple
regression lines of Y on Z at values of X are illustrated in Figure 5.2b(2).

Case 3b: Simple Slopes, Standard Error, and t-Test. The equation for
the regression of Y on Z at levels of X is found by differentiating equation
5.3 with respect to Z:

∂Ŷ/∂Z = b3 + b4X     (5.14)

The variance of this simple slope10 is given in Table 5.1, Case 3b. The
standard error is the square root of this variance, and the t-test follows,
with n - k - 1 df, where k = 4. The forms of both the simple slope and
its variance, given in Table 5.1, Case 3b, are identical to those for the
regression of Y on X at values of Z in equation 2.1, Ŷ = b1X + b2Z +
b3XZ + b0 (see Table 4.4, Case 1b). This is so because in both equations
2.1 and 5.3 the relationship of Y to Z is linear both in the first order term
and in the interaction. The more complex equation 5.3 differs from equa-
tion 2.1 only by the addition of the X² term, which does not enter into
the regression of Y on Z.

Case 3b: Crossing Points. Figure 5.2b(2) shows that the simple regres-
sion lines do not cross at a single point, unlike our experience with pre-
vious simple linear by linear interactions. In equations containing second
order terms in X, the crossing points of any two simple regression lines
for the simple regressions of Y on Z depend upon the specific values of X
chosen. For equation 5.3, the value of Z at which two simple regression
lines of Y on Z cross is

Zcross = -[b1 + b2(Xi + Xj)] / b4     (5.15)

where Xi and Xj are the specific values of X chosen for examination.
Note again that if the b4 coefficient for the XZ interaction is zero, the
simple regression lines are parallel, as in Figure 5.2a(2). Further, if the
b2 coefficient for the X² term is zero, all simple regression lines will cross
at the single value -b1/b4.
To illustrate the calculation of a crossing point, we substitute XH =
0.945 for Xi and XM = 0 for Xj into equation 5.15: Zcross = -[1.125 +
3.563(0.945 + 0)]/2.935 = -1.531. This case is graphically depicted
in Figure 5.2b(2), where the lowest value of Z plotted is ZL = -2.20.
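This crossing point can be checked by intersecting the two simple regression lines
directly. The short sketch below (ours; it assumes the coefficients of the Table 5.3
equation) recovers the same value.

b1, b2, b3, b4, b0 = 1.125, 3.563, 3.608, 2.935, 3.246

def line_y_on_z(x):
    # slope and intercept of the simple regression of Y on Z at a fixed X
    return b3 + b4 * x, b1 * x + b2 * x**2 + b0

slope_h, int_h = line_y_on_z(0.945)              # XH
slope_m, int_m = line_y_on_z(0.0)                # XM
print(round(-(int_h - int_m) / (slope_h - slope_m), 3))   # -1.531, as from equation 5.15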

Case 4: Curvilinear X Relationship, Linear Z Relationship, and
Curvilinear X by Linear Z Relationship

This set of relationships is represented by equation 5.4, which is repro-
duced here:

Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0

Equation 5.4 adds an X²Z term to equation 5.3.

Case 4a: Reexpressed Regression Equation to Show Regression of Y on
X. As with equation 5.3, it is difficult at best to characterize the relation-
ship of X and Z to Y from equation 5.4. To simplify interpretation, equa-
tion 5.4 may be rearranged into a simple regression equation showing the
regression of Y on X at values of Z. In the rearranged equation we see
that the regression coefficients for both X and X² vary as a function of the
value of Z:

Ŷ = (b1 + b4Z)X + (b2 + b5Z)X² + (b3Z + b0)     (5.16)

As in Case 3a, in equation 5.16 the (b1 + b4Z) coefficient provides the
same information as the b1 coefficient in the overall equation. The (b1 +
b4Z) coefficient represents the overall linear trend in the relationship of
X to Y at a value of Z, paralleling its interpretation in equation 5.11. The
(b2 + b5Z) coefficient for X² in equation 5.16 conveys the same infor-
mation as the b2 coefficient in the overall equation. It represents the nature
of the curvilinearity of the simple regression lines of Y on X at specific
values of Z. If the value of (b2 + b5Z) is positive, the curve is concave
upward; if negative, it is concave downward.
Figure 5.2c(1) illustrates the regression equation Ŷ = -2.042X + 3.000X²
+ 2.138Z + 2.793XZ + 1.960X²Z + 3.502. The figure is presented as
three simple regressions, based on the rearranged form of equation 5.16:

Ŷ = (-2.042 + 2.793Z)X + (3.000 + 1.960Z)X²
    + (2.138Z + 3.502)

Substituting the values -2.200, 0, and 2.200 for ZL, ZM, and ZH leads to
values of both coefficients of equation 5.16 for each simple regression
line. The value of (b1 + b4Z) varies from negative to positive as Z in-
creases: -8.188 for ZL, -2.042 for ZM, and 4.104 for ZH. This is con-
sistent with the generally negative linear trend at ZL, but generally posi-
tive linear trend at ZH, observed in Figure 5.2c(1). The values of (b2 +
b5Z) are as follows: -1.313, 3.000, and 7.312, for ZL, ZM, and ZH,
respectively. At ZL there is very slight downward curvature, but the cur-
vature is upward for both ZM and ZH.
As a further aid to interpretation, both the (b1 + b4Z) coefficient and
the (b2 + b5Z) coefficient may be tested for significance in each simple
regression equation. Standard errors are computed according to the procedure
given in Chapter 2 and employed throughout the text.11 For example, for
(b2 + b5Z), at ZL, t = -0.786, ns; at ZM, t = 2.422, p < .05; and at
ZH, t = 4.853, p < .01. These tests confirm what is apparent in Figure
5.2c(1): the simple regression of Y on X shows no reliable curvature at ZL,
but becomes increasingly concave upward as Z increases.
The reader should be aware that the linear coefficient (b1 + b4Z) and
the curvilinear coefficient (b2 + b5Z) in equation 5.16 are not simple
slopes. Rather, these coefficients summarize the overall relationship of Y
to X at particular values of Z. (In contrast, simple slopes measure the
regression of Y on X only at a single pair of X and Z values.)
For all the simple slopes presented in this chapter, the computer method
developed in Chapter 2 may be used to find standard errors and simple
slopes. We provide examples later in the chapter for the simple slopes
presented. However, the computer method as presented in this text is only
applicable to simple slopes and cannot be used to find the standard errors
of the general linear and curvilinear coefficients in equation 5.16. Instead,
the approach outlined in the optional section at the end of Chapter 2 and
summarized in equation 2.10 must be used. The same is true for the gen-
eral linear coefficient (b1 + b4Z) in equation 5.11.
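For readers who wish to program these tests, the following sketch (ours; Python with
NumPy, using the covariance matrix reported later in Table 5.4 and the full-precision
coefficients of Table 5.5a) reproduces the three tests of (b2 + b5Z) just reported,
with w′ = [0 1 0 0 Z] as described in note 11.

import numpy as np

b2, b5 = 2.999519, 1.960267                      # full-precision values from Table 5.5a
S_b = np.array([                                 # covariance matrix of b1-b5, Table 5.4
    [ 2.78447,  0.11045, -0.14763, -0.05919, -0.33314],
    [ 0.11045,  1.53329,  0.00252, -0.46696, -0.05926],
    [-0.14763,  0.00252,  0.52842,  0.02239, -0.15468],
    [-0.05919, -0.46696,  0.02239,  0.52960, -0.01487],
    [-0.33314, -0.05926, -0.15468, -0.01487,  0.20618]])

for z in (-2.20, 0.0, 2.20):                     # ZL, ZM, ZH
    coef = b2 + b5 * z                           # curvilinear coefficient in equation 5.16
    w = np.array([0.0, 1.0, 0.0, 0.0, z])
    se = float(np.sqrt(w @ S_b @ w))
    print(round(coef, 3), round(coef / se, 3))   # t = -0.786, 2.422, 4.853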

Case 4a: Simple Slopes, Standard Errors, and t-Tests. To find the sim-
ple slope of Y on X in equation 5.4, we compute the first (partial) deriv-
ative of equation 5.4 with respect to X:

∂Ŷ/∂X = b1 + 2b2X + b4Z + 2b5XZ     (5.17)

The simple slope of Y on X depends on the values of both X and Z. The
variance of the simple slope12 is given in Table 5.1, Case 4a. The t-test
follows with n - k - 1 df, where k = 5.
The simple slopes corresponding to all possible combinations of XL,
XM, and XH with ZL, ZM, and ZH are calculated by substituting the appro-
priate X and Z values into equation 5.17. The matrix containing the nine
simple slopes for our numerical example is presented in Table 5.4. For
example, for the combination XL, ZM (row 1, column 2 of Table 5.4), the
simple slope is -7.711. For the combination XM, ZM at the means of both
X and Z (row 2, column 2), the simple slope is -2.042, the same value
as the coefficient for X in the overall equation. The standard errors, and
the corresponding t-tests for each of the nine combinations of values of X
and Z, are also provided and should be compared with Figure 5.2c(1).
For XL, we observe that the simple slope of the regression curve becomes
increasingly negative as the value of Z increases. In contrast, for XM, the
simple slope of the regression curve has a negative value at ZL, does not
differ from 0 at ZM, and becomes increasingly positive thereafter as Z
increases. Finally, for XH, the simple slope of the regression curve is
highly negative at ZL and rapidly changes to become positive at ZM and
highly positive at ZH.
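Because equation 5.17 is a simple function of X and Z, the full matrix of simple
slopes in Table 5.4 can be generated in a few lines. The fragment below (ours, using
the full-precision coefficients from Table 5.5a) prints the nine slopes; their standard
errors follow from the same w′Sbw computation with w′ = [1 2X 0 Z 2XZ].

b1, b2, b4, b5 = -2.041992, 2.999519, 2.793482, 1.960267   # Table 5.5a estimates

def slope(x, z):
    return b1 + 2*b2*x + b4*z + 2*b5*x*z         # equation 5.17

for x in (-0.945, 0.0, 0.945):                   # XL, XM, XH
    print([round(slope(x, z), 3) for z in (-2.20, 0.0, 2.20)])
# rows reproduce the simple slopes of Table 5.4:
# [-5.706, -7.711, -9.716], [-8.188, -2.042, 4.104], [-10.669, 3.627, 17.924]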

Case 4a: Maximum and Minimum of Curves. To find the maximum or
minimum of each simple regression curve, we set simple slope equation
5.17 equal to 0 and solve for X,

b1 + 2b2X + b4Z + 2b5XZ = 0

so that

X = -(b1 + b4Z) / [2(b2 + b5Z)]     (5.18)


Table 5.4
Probing Simple Slopes in the Equation Ŷ = -2.042X + 3.000X² + 2.138Z + 2.793XZ +
1.960X²Z + 3.502 (Simple Slopes Are Found Using the Expression
(b1 + 2b2X + b4Z + 2b5XZ))

                                      (1)          (2)          (3)
                                      ZL           ZM           ZH
                                      -2.200       0            2.200

(1) XL = -0.945    Simple slope       -5.706       -7.711       -9.716
                   standard error     2.963        2.801        4.439
                   t                  -1.925       -2.753**     -2.189*

(2) XM = 0         Simple slope       -8.188       -2.042       4.104
                   standard error     2.368        1.669        2.256
                   t                  -3.457***    -1.224       1.819

(3) XH = 0.945     Simple slope       -10.669      3.627        17.924
                   standard error     4.731        2.946        2.586
                   t                  -2.255*      1.231        6.930***

Sb: Covariance matrix of b coefficients:

        b1          b2          b3          b4          b5
b1      2.78447     0.11045    -0.14763    -0.05919    -0.33314
b2      0.11045     1.53329     0.00252    -0.46696    -0.05926
b3     -0.14763     0.00252     0.52842     0.02239    -0.15468
b4     -0.05919    -0.46696     0.02239     0.52960    -0.01487
b5     -0.33314    -0.05926    -0.15468    -0.01487     0.20618

***p < .001; **p < .01; *p < .05
To illustrate, for the regression equation Ŷ = -2.042X + 3.000X² +
2.138Z + 2.793XZ + 1.960X²Z + 3.502, we substitute into equation
5.18 for ZL = -2.200:

X = -[(-2.042) + (2.793)(-2.200)] / 2[3.000 + (1.960)(-2.200)] = -3.120 for ZL

Inspection of Figure 5.2c(1) indicates that this point corresponds to the
maximum predicted value of Y. Similar calculations yield X = 0.34 for
ZM and X = -0.281 for ZH, both of which, from inspection of Figure
5.2c(1), correspond to the minimum predicted values of their respective
regression curves.13 These results, coupled with examination of the simple
regression curves of Figure 5.2c(1) and their simple slopes, suggest that
different relationships of self-concept to self-disclosure exist at low versus
moderate and higher levels of alcohol consumption.
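Again the arithmetic is easily scripted; this brief fragment (ours, with the rounded
coefficients of the example) evaluates equation 5.18 and uses the sign of (b2 + b5Z)
from note 13 to label each extremum.

b1, b2, b4, b5 = -2.042, 3.000, 2.793, 1.960
for z in (-2.20, 0.0, 2.20):                     # ZL, ZM, ZH
    x_ext = -(b1 + b4 * z) / (2 * (b2 + b5 * z)) # equation 5.18
    kind = "minimum" if b2 + b5 * z > 0 else "maximum"
    print(f"{kind} at X = {x_ext:.3f}")          # maximum -3.120; minimum 0.340; minimum -0.281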
'j
Case 4a: Crossing Point. We use the usual strategy to determine the
value of X at which the curves for the regression of Y on X at values of Z
cross. Two values of Z, Zi and Zj, are chosen and substituted into
regression equation 5.4; then the two equations are set equal. We find that
there are now two possible crossing points, represented by the following
equations:

Xcross(1) = [-b4 + (b4² - 4b3b5)^1/2] / (2b5)     (5.19)

Xcross(2) = [-b4 - (b4² - 4b3b5)^1/2] / (2b5)     (5.20)

When the b5 coefficient for the X²Z term is zero, the curves will not cross.
However, even when b5 is nonzero, there may still be no point at which
the two regression curves cross. In fact, this is true for the present case,
as is illustrated in Figure 5.2c(1). Evaluation of equation 5.19 for this
case does not produce a real number for a solution; the solution is an
imaginary number:

Xcross(1) = [-2.793 + [2.793² - 4(2.138)(1.960)]^1/2] / [2(1.960)]
          = [-2.793 + (-8.961)^1/2] / 3.92

Evaluation of equation 5.20 also fails to produce a real number solution.
Thus the regression curves do not cross.
On p. 82, the crossing point for the regression of Y on X in Case 3a
was given as X = -b3/b4. The crossing points in equations 5.19 and
5.20 for Case 4a do not appear to be straightforward generalizations of
that for Case 3a. However, the discontinuity is only apparent. The limit
of equations 5.19 and 5.20 as b5 approaches zero is, in fact, -b3/b4.
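Equations 5.19 and 5.20 are simply the two roots of the quadratic b5X² + b4X + b3 = 0,
so whether the curves cross at all can be read from the discriminant. A two-line check
(ours, with the coefficients of the example):

import cmath
b3, b4, b5 = 2.138, 2.793, 1.960
disc = b4**2 - 4 * b3 * b5                       # discriminant of b5*X^2 + b4*X + b3 = 0
print(round(disc, 3), cmath.sqrt(disc))          # -8.961: negative, so no real crossing point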

Case 4b: The Regression of Y on Z. Given the significant interaction
involving Z, probing the regression of Y on Z at levels of X is also likely
to be informative. The formulas for the simple slope of this regression
line and its variance are given in Table 5.1, Case 4b. As can be seen in
Figure 5.2c(2), the regressions of Y on Z at levels of X are linear in form.

Each pair of simple regression lines cross at values of Z that depend upon
the values of X in question:

Zcross = -[b1 + b2(Xi + Xj)] / [b4 + b5(Xi + Xj)]     (5.21)

For example, when Xi = XH = 0.945 and Xj = XM = 0,

Zcross = -[(-2.042) + 3.000(0.945 + 0)] / [2.793 + 1.960(0.945 + 0)] = -0.171

Coefficients of Simple Slopes by Computer

The simple slopes and the corresponding standard errors and t-tests for
all the analyses in Tables 5.2, 5.3, and 5.4 may be calculated by com-
puter. The approach directly extends the three step procedure presented
in Chapter 2. We will consider the probing of equation 5.4, which in our
example includes a significant X²Z interaction. The outcome of the simple
slope analysis for the numerical example is presented in Table 5.4; the
parallel computer analysis is presented in Table 5.5.
The significant X²Z interaction implies that each regression of Y on X
depends on the specific values of both X and Z. Consequently, we need
to specify these values, and we will use all combinations of XL, XM, and
XH with ZL, ZM, and ZH as before. Recall that X and Z are centered; XM
= 0 and ZM = 0, so that new transformed variables are not needed for
these values.
1. Our first step is to transform the original X and Z variables so they
are evaluated at the conditional values of interest. The transformed vari-
ables are created from X and Z by subtracting conditional values CVX and
CVZ, respectively. In this case we have the following:
(a) XABOVE = X - (0.945) for the regression of Y on X at CVX =
0.945, one standard deviation above the mean of X;
(b) XBELOW = X - (-0.945) for the regression of Y on X at CVX
= -0.945, one standard deviation below the mean of X;
(c) ZABOVE = Z - (2.200) for the regression of Y on X at CVZ =
2.200, one standard deviation above the mean of Z; and
(d) ZBELOW = Z - (-2.200) for the regression of Y on X at CVZ
= -2.200, one standard deviation below the mean of Z.


Table 5.5
Computation of Simple Slope Analysis by Computer for the X²Z Interaction in the
Regression Equation Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0 with Centered X
and Centered Z

a. Overall Analysis

(0 Means and standard deviations

Mean Std Uev


Y 8.944 29.101
X 0.000 0.945
X1 0.890 1.230
Z 0.000 2.200
xz 0.861 2.086
x’z 0.187 3.944

(iil Regression analysis

Variable В SEB T SigT


X -2.041992 1.668673 -1.224 .2218
X3 2.999519 1.238259 2.422 .0159
z 2.138031 0.726923 2.941 .0035
xz 2.793482 0.727738 3.839 .0001
x2z 1.960267 0.454067 4.317 .0000
(Constant) 3.501767 1.586818 2.207 .0279

b. Computation of XABOVE, XBELOW, ZABOVE, ZBELOW, and Crossproduct
Terms Required for Simple Slope Analysis

COMPUTE XABOVE=X-(.945)
COMPUTE XBELOW=X-(-.945)
COMPUTE ZABOVE=Z-(2.20)
COMPUTE ZBELOW=Z-(-2.20)
COMPUTE X2A=XABOVE*XABOVE
COMPUTE X2B=XBELOW*XBELOW
COMPUTE XZA=X*ZABOVE
COMPUTE XZB=X*ZBELOW
COMPUTE XAZ=XABOVE*Z
COMPUTE XBZ=XBELOW*Z
COMPUTE XAZA=XABOVE*ZABOVE
COMPUTE XAZB=XABOVE*ZBELOW
COMPUTE XBZA=XBELOW*ZABOVE
COMPUTE XBZB=XBELOW*ZBELOW
COMPUTE X2ZA=X2*ZABOVE
COMPUTE X2ZB=X2*ZBELOW
COMPUTE X2AZA=X2A*ZABOVE
COMPUTE X2AZB=X2A*ZBELOW
COMPUTE X2BZA=X2B*ZABOVE
COMPUTE X2BZB=X2B*ZBELOW


Table 5.5, continued

c. Regression Analysis with XABOVE and ZBELOW, Yielding Simple Slope Analysis at
XH and ZL (Regression of Y on X One Standard Deviation Above the Mean of X and One
Standard Deviation Below the Mean of Z)

(i) Means and standard deviations

Mean Std Dev


Y 8.944 29.101
XABOVE -0.945 0.945
X2A 1.783 2.103
ZBELOW 2.200 2.200
XAZB’ -1.218 3.139
X2AZB 2.484 5.678

N of Cases = 400

(ii) Regression analysis

Variable В SEB T Sig T


XABOVE -10.669354 4.730779 -2.255 0.0247
X2A -1.313069 1.670902 -0.786 0.4324
ZBELOW 6.528441 0.952239 6.856 0.0000
XAZB 6.498388 1.099950 5.908 0.0000
X2AZB 1.960268 0.454067 4.317 0.0000
(Constant) -10.111838 3.627950 -2.787 0.0056

d. Regression Analysis with X and ZABOVE, Yielding Simple Slope Analysis at XM and
ZH (Regression of Y on X at the Mean of X and One Standard Deviation Above the
Mean of Z)

(i) Means and standard deviations

Mean Std Dev


Y 8.944 29.101
X .000 .945
X2 .890 1.230
ZABOVE -2.200 2.200
XZA .861 2.801
X2ZA -1.771 4.403

(ii) Regression analysis

Variable          B           SEB        T       Sig T

X                 4.103670    2.255509   1.819   0.0696
X2                7.312106    1.506802   4.853   0.0000
ZABOVE            2.138032    0.726923   2.941   0.0035
XZA               2.793482    0.727738   3.839   0.0001
X2ZA              1.960267    0.454067   4.317   0.0000
(Constant)        8.205437    2.260164   3.630   0.0003


2. Crossproducts of transformed variables with themselves (for squared
terms), with each other, and with the original variables X and Z are formed.
The required crossproducts are those corresponding to each of the terms
in the simple slope equation under consideration. For example, for the
regression of Y on X one standard deviation above the mean of X (XH)
and one standard deviation below the mean of Z (ZL), the following
crossproducts are required:

X2A = XABOVE * XABOVE

XAZB = XABOVE * ZBELOW

X2AZB = XABOVE * XABOVE * ZBELOW = X2A * ZBELOW

3. The regression analysis is performed using the transformed vari-
ables and their crossproducts. In each case the b1 coefficient for the X
predictor represents the regression of Y on X at the conditional values of
X and Z that have been specified.
Table 5.5 illustrates the use of the SPSS-X regression program to test
the simple slopes at two of the nine combinations of XL, XM, and XH with
ZL, ZM, and ZH. Table 5.5a shows the overall regression analysis with the
centered predictors. Note that in the centered case, the test of b1 corre-
sponds to the test of the simple slope for the regression of Y on X evalu-
ated at the values XM and ZM. Table 5.5b shows computer code that will
generate all terms required to perform the nine tests of simple slopes
presented in Table 5.4. Table 5.5c presents the test of the simple slope of
the regression of Y on X at XH and ZL, corresponding to the results of the
parallel matrix-based test in Table 5.4 (row 3, column 1 of the matrix of
simple slopes). Finally, Table 5.5d shows the regression of Y on X at the
mean of X (XM) and ZH and corresponds to the test of simple slopes in
Table 5.4 (row 2, column 3). The identical results of the matrix-based
and computer-based analyses illustrate the equivalence of the two
procedures.
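The same three-step logic carries over to any OLS program. As a rough sketch only
(ours; it uses Python with pandas and statsmodels, and generates stand-in data from
equation 5.4 in place of the actual simulated data set), the analysis of Table 5.5c
would look like this:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"X": rng.normal(0, 0.945, 400), "Z": rng.normal(0, 2.20, 400)})
df["Y"] = (-2.042*df.X + 3.000*df.X**2 + 2.138*df.Z + 2.793*df.X*df.Z
           + 1.960*df.X**2*df.Z + 3.502 + rng.normal(0, 25, 400))   # stand-in criterion

df["XABOVE"] = df.X - 0.945            # step 1: recenter X one SD above its mean
df["ZBELOW"] = df.Z - (-2.20)          #         recenter Z one SD below its mean
df["X2A"] = df.XABOVE**2               # step 2: crossproducts for the terms of equation 5.4
df["XAZB"] = df.XABOVE * df.ZBELOW
df["X2AZB"] = df.X2A * df.ZBELOW

# step 3: refit; the XABOVE coefficient is the simple slope of Y on X at XH, ZL
fit = smf.ols("Y ~ XABOVE + X2A + ZBELOW + XAZB + X2AZB", data=df).fit()
print(fit.params["XABOVE"], fit.bse["XABOVE"])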


Three Final Issues

Curvilinearity Versus Interaction


Darlington (1990) has clearly pointed out the difficulty of distinguish-
ing between regression equations including an interaction term, Ŷ = b1X
+ b2Z + b3XZ + b0, and regression equations including a curvilinear
relationship, Ŷ = b1X + b2X² + b0, in data sets in which X and Z are highly
correlated. In such instances the X² and XZ terms will be highly corre-
lated. Thus it will often be difficult and sometimes impossible to distin-
guish between the two regression models on statistical grounds, even in
large samples. Lubinski and Humphreys (1990) provide an illustration of
this problem in attempting to distinguish between a curvilinear model and
an interaction model for predicting scores on an advanced mathematics
test for a large national sample of high school students.
Within a given sample, there is little researchers may do to distinguish
between the two interpretations. They may argue that the model supported
by the better articulated substantive theory should be retained. Or, they
may argue that the second model, Ŷ = b1X + b2X² + b0, involves only
one variable (X) and has one fewer parameter and should therefore be
accepted on the grounds of parsimony. A better solution is to locate a new
sample in which X and Z are less correlated or to sample cases from the
population in a manner designed to reduce the correlation between X and
Z. When X and Z contribute nonredundant information, then models rep-
resenting interaction versus curvilinearity can be distinguished.
resenting interaction versus curvilinearity can be distinguished.

What Terms Should Be Included in the Regression Equation?
The example used throughout this chapter was based on a simulation,
so that the actual equation that generated the data is known. Typically,
however, the researcher must decide which terms to include in the regres-
sion equation. This decision should be guided by prior theory and empir-
ical research in the substantive area. In addition, the analyst may also
have new hypotheses about potential effects that should be included in the
equation. Nonetheless, in the absence of definitive knowledge about the
area, the researcher can easily include too few or too many terms in the
equation relative to the "true" regression equation. Each of these out-
comes has different benefits and consequences.
Omitting higher order terms whose true effects are nonzero from the
equation biases the lower order coefficients. Each lower order term (e.g.,
X) that is tested includes unique variance that is due to the lower order
term plus all variance that is shared with the omitted nonzero higher order
terms. This problem can be illustrated by comparing two regression equa-
tions using our simulated data set. If we estimate an equation containing
only X and Z terms, we find

Ŷ = 1.923X + 3.726Z + 8.944


In contrast, if we estimate equation 5.4, which generated the data, we
find

Ŷ = -2.042X + 2.138Z + 2.996X² + 2.793XZ
    + 1.960X²Z + 3.502

We note the dramatic change in the coefficients for the X and Z terms and
the intercept, illustrating the bias that is introduced by omitting these non-
zero higher order terms from the equation. Neter, Wasserman, and Kutner
(1989) discuss several methods of plotting residuals that are useful in
detecting this problem.
The inclusion of higher order terms whose true value is zero should not
bias the estimates of lower order terms. In Chapter 3 we pointed out that
when X and Z are centered and bivariate normally distributed, then the
correlations of X with X², Z with Z², and X and Z with XZ are zero. It
would seem then that under these conditions including a higher order
term, say X² in equation 5.2, or XZ in equation 2.1, should have essen-
tially no effect on the estimates of the X and Z effects. However, with
higher order terms, if the first order predictors are even moderately cor-
related, then first and third order terms (e.g., Z with X²Z) will be highly
correlated, even for centered predictors. The same is true for second and
fourth order14 terms (e.g., X² with XZ; X² with X²Z²). These interpre-
dictor correlations will introduce instability into regression coefficients;
the correlations exist even after variables are centered (Dunlap & Kemery,
1987; Marquardt, 1980). To illustrate, if we estimate equation 5.5, which
contains all the terms from 5.4 plus three terms known to be zero in our
simulated data set (i.e., Z², XZ², X²Z²), we find

Ŷ = -1.027X + 2.412Z + 3.497X² + 0.411Z² + 1.929XZ
    - 0.534XZ² + 2.725X²Z - 0.003X²Z² + 1.795

We note that (a) the coefficients for each of the five nonzero terms have
changed somewhat from the values estimated above for equation 5.4
(though all significance levels remain highly similar), and (b) one of the
zero terms in the population is significant in the sample (for XZ², p <
.05). The source of the significance is revealed in the pattern of intercor-
relations of XZ² with other variables. The XZ² term has a low positive
zero order correlation with the criterion (r = 0.246) but a very high zero
order correlation with the other third order term X²Z (r = 0.817). The
significant negative coefficient for XZ² is due to an unexpected suppressor
effect.15
We thus caution researchers against routinely including higher order
terms not expected from theory on two grounds. As we have just seen,
such terms may introduce high interpredictor correlations that cause in-
stability of regression coefficients and anomalous results, even with cen-
tered predictors. In addition, the inclusion of extraneous terms lowers the
statistical power of the tests of all terms in the equation. Chapter 6 dis-
cusses the procedures and the conditions under which nonsignificant higher
order terms can be dropped from regression equations to insure higher
power tests of the remaining terms.

Other Methods of Representing Curvilinearity

In this book we have emphasized the use of regression analysis as a
means of testing models developed from strong theory in which the equa-
tions take the form of polynomial expressions. These equations represent
the forms typically hypothesized by current theory in most areas of the
social sciences. At the same time, other forms of nonlinear interactions
are occasionally hypothesized and regression analyses may be conducted
in a more exploratory vein. Below, we briefly note some of the ap-
proaches that may be taken to other forms of theoretically predicted non-
linear interactions and to the exploratory use of regression analysis in
practical prediction problems.

Other Forms of Theory-Based Nonlinear Interactions

The simplest approach to nonlinear regression problems, where possi-
ble, is to test models that are linear in the parameters or in which the
equation has been linearized. For example, suppose theory predicted that
X* = log(X) and Z* = (Z)^1/2. Furthermore, the theory predicts that
the transformed variables interact to predict Y:

Ŷ = b1X* + b2Z* + b3X*Z* + b0

This equation follows the standard form we have estimated throughout
the book. Note that an alternative representation of the equation is as
follows:

Ŷ = b0 + b1 log(X) + b2(Z)^1/2 + b3 log(X)(Z)^1/2

This second equation is linear in the parameters and would be estimated
using ordinary least squares (OLS) regression; it can be interpreted in
the transformed [e.g., log(X)], but not the original X, scaling in terms of
the prescriptions we have outlined in this book.
An example of a linearized equation is provided by Wonnacott and
Wonnacott (1979), who note that economic theory describes the Cobb-
Douglas production function as follows: Q = b0·K^b1·L^b2·u, where Q is
quantity produced; K is capital; L is labor; b0, b1, and b2 are the nonlinear
regression parameters to be estimated; and u is the multiplicative error
term. If we take the logarithms of both sides of the equation, we discover

log(Q) = log(b0) + b1 log(K) + b2 log(L) + log(u)

which is equivalent to the easily estimated linear regression equation:

Ŷ = b0′ + b1X + b2Z + e

The regression coefficients can be estimated through OLS regression and
the value of b0 can be calculated from b0′ = log(b0) (by taking the antilog).
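A small sketch (ours; Python with NumPy and statsmodels, fitting artificial capital
and labor series with known parameters) shows the linearization and the recovery of
b0 by the antilog:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
K = rng.lognormal(3.0, 0.5, 200)                 # capital (artificial data)
L = rng.lognormal(2.0, 0.5, 200)                 # labor (artificial data)
Q = 2.5 * K**0.3 * L**0.6 * rng.lognormal(0.0, 0.1, 200)   # Cobb-Douglas, known parameters

X = sm.add_constant(np.column_stack([np.log(K), np.log(L)]))
fit = sm.OLS(np.log(Q), X).fit()                 # log(Q) = log(b0) + b1 log(K) + b2 log(L)
print(np.exp(fit.params[0]), fit.params[1], fit.params[2])  # approximately 2.5, 0.3, 0.6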
More complex nonlinear regression equations that cannot be linearized
can also be estimated using iterative, numerical search procedures. These
procedures are beyond the scope of the present book. Introductions to the
estimation, testing, and interpretation of such equations can be found in
Judge, Hill, Griffiths, Lütkepohl, and Lee (1982), Kmenta (1986), and
Neter et al. (1989), while Gallant (1987) provides a more advanced treat­
ment. Finally, note that in each of the procedures outlined above, the
researcher’s interest is in developing unbiased or minimally biased esti­
mates and tests of the regression coefficients of a theoretically specified
nonlinear model. Such tests are conducted to provide evidence for or
against the viability of theory-based hypotheses.

Exploratory Regression Analyses in Practical Prediction
Researchers interested in practical prediction problems often encounter
data that are not amenable to simple linear regression. These researchers
have developed a variety of graphical and statistical techniques for sim­
plifying regression equations through nonlinear transformations of the
variables (e.g., Atkinson, 1985; Box & Cox, 1964; Daniel & Wood,
1980). These techniques minimize problems in the data (e.g., outliers)
and maximize R². They typically do not provide unbiased tests of the

regression coefficients in models based on substantive theory. Indeed, data


transformations in the interest of simplifying the regression equation can
often eliminate interesting curvilinear or interactive effects predicted on
the basis of theory. Thus researchers need to keep in mind the nature of
their work, theory’ testing or practical prediction, when they are choosing
a strategy for nonlinear regression analysis.

Summary

For complex regression equations involving significant higher order


terms and their interactions, interpretation is facilitated by repartitioning
the variance of the original regression equation. A three-stage process has
been employed for recasting regression equations. First, the overall
regression equation is rearranged to show the regression of the criterion
on one of the predictors, with the regression coefficients being expressed
as linear combinations of the coefficients of the overall regression equa­
tion. Equations 5.11 and 5.16 represent examples of such reexpressions
into X and X2 components for the regression of Y on X. Second, values
of the other predictors are then substituted into the reorganized expres­
sions to generate a series of simple regression equations, such as those
illustrated in Figure 5.2. Finally, simple slopes for the regression of У
onto a single predictor at specified values of that and/or other predictors
are computed, for example, the regression of Yon X at values of X and Z
in equations 5.3 and 5.4. These simple slopes provide insights into the
trends represented by each simple regression equation.

Notes

I. In very rare instances, based on strong theory, only the quadratic component might
be included. To illustrate, suppose we had a model for judgments of size that indicated that
size was judged on the basis of area, and not on the basis of height or width of objects
alone, Then in a regression equation predicting size judgments from area, one might wish
to omit the first order terms of height and width. The first order terms might be omitted if
both strong theory plus prior empirical evidence indicated that individuals do not rely at all
on linear extent in the judgment of size (we thank David Kenny for this example).
2. If there are nonlinear relationships between predictors and the criterion, and these
relationships are not reflected in the regression equation, they may be detected with regres-
sion diagnostics applied to residuals (see, for example, Belsley, Kuh, & Welsch, 1980; Bol-
len & Jackman, 1990; Cook & Weisberg, 1980; Stevens, 1984), and the regression equation
appropriately respecified.


3. Briefly, the operation of differentiation is applied separately to each term in the
regression equation. For polynomial expressions like the ones we are considering, the first
(partial) derivative (with respect to X) of the general term bjX^nZ^m is (bj)(n)X^(n-1)Z^m.
Standard calculus texts (e.g., Thomas, 1972) provide detailed discussions of this operation
for readers interested in pursuing this topic in more depth.
4. Readers familiar with calculus can confirm whether the value of X corresponds to
the maximum or minimum value of Ŷ by computing the value of the second derivative of
the original curvilinear regression equation, Ŷ = b1X + b2X² + b0. This quantity is equal
to 2b2 and indicates that Ŷ will be a minimum when b2 is positive, whereas Ŷ will be a
maximum when b2 is negative.
5. Our general matrix-based procedure for testing simple slopes applies here. The
expression (b1 + 2b2X) is a linear combination of regression coefficients. Thus an estimate
of its standard error is provided by the square root of the expression s² = w′Sbw. For the
simple curvilinear regression equation 5.1, the weight vector w′ for b1 and b2 is [1 2X],
and Sb is the 2 × 2 covariance matrix of the regression coefficients.
6. Stimson, Carmines, and Zeller (1978) have provided a method of recasting poly-
nomial regression equations to render them more interpretable. The simple polynomial
regression Ŷ = b1X + b2X² + b0 is rewritten as follows:

Ŷ = M + b2(F - X)²

where

M = minimum (or maximum) value of the criterion, M = b0 - b1²/(4b2)

F = value of X where the minimum (or maximum) is produced, F = -b1/(2b2)

For example, if one considers the U-shaped relationship between self-concept (X) and self-
disclosure (Y), the rewritten polynomial equation would indicate the minimum value of
self-disclosure (M) at the value of self-concept (F) at which that minimum M was achieved.
7. For the general matrix-based approach to the variance of the simple slope of Y on X
in Case 2, w′ = [1 2X 0] and Sb is 3 × 3.
8. For the general matrix-based approach to the variance of the simple slope of Y on X
in Case 3a, w′ = [1 2X 0 Z] and Sb is 4 × 4.
9. The second (partial) derivative of equation 5.3 with respect to X is 2b2. If this second
derivative is positive, then a minimum has been identified; if it is negative, a maximum has
been identified (see note 4 above).
10. For the general matrix-based approach to the variance of the simple slope of Y on Z
in Case 3b, w′ = [0 0 1 X] and Sb is the same 4 × 4 matrix as for Case 3a.
11. The variance of the coefficient (b1 + b4Z) is calculated using the general matrix
approach, where w′ = [1 0 0 Z 0] and Sb is 5 × 5. The t-test has n - k - 1 df, where
k = 5. For (b2 + b5Z), w′ = [0 1 0 0 Z], and Sb is the same 5 × 5 matrix; t has the
same df.


12. For the general matrix-based approach to the variance of the simple slope of Y on X
in Case 4a, w′ = [1 2X 0 Z 2XZ] and Sb is 5 × 5.
13. The second partial derivative of equation 5.4 with respect to X is

∂²Ŷ/∂X² = 2(b2 + b5Z)

This equation indicates that the direction of curvature depends on the value of Z. The value
of X obtained from equation 5.18 will be a minimum when b2 + b5Z > 0 and a maximum
when b2 + b5Z < 0. At ZL = -2.20, the second derivative is 2[3.000 + (1.96)(-2.20)]
= -2.624, indicating a maximum; hence the regression of Y on X at ZL is slightly concave
downward. For ZH, the second derivative is 14.62; the regression of Y on X is concave
upward.
14. For multivariate normal variables X, Z, and W, all odd moments vanish (e.g., XZW,
X²Z, X²Z²W), whether or not X, Z, and W are intercorrelated. If X, Z, and W are inter-
correlated, then even moments do not vanish (e.g., X²Z², X²Z²W²). See the appendix of
Kenny and Judd (1984) for a summary of moments of multivariate normal distributions.
15. A suppressor variable is one that is generally uncorrelated with the criterion, is
highly correlated with another predictor, and increases the predictability of the other pre­
dictor by its inclusion in the equation. The suppressor typically has a significant negative
regression coefficient.

6 Model and Effect Testing
with Higher Order Terms

The inclusion of nonlinear and interactive terms produces complex regres-
sion equations like those considered in Chapters 4 and 5. The discussion
thus far has focused on the probing and interpretation of the highest order
term in the full model in which all terms are included. In this chapter we
address in more detail the issues involved in simplifying regression equa-
tions by dropping nonsignificant higher order terms from the model. We
also consider several global tests designed to compare the full model with
reduced (simpler) models that may provide an equally good fit of the data.
Finally, we consider sequential testing strategies aimed at simplifying
complex regression equations containing higher order interactions.
Throughout this chapter it is important for the reader to keep in mind why
the specific term(s) being probed were included in the regression equa-
tion. Different analytic strategies and different interpretations may be ap-
propriate depending on whether the source of the term was strong theory
or data exploration.

Some Issues in Testing Lower Order Effects


in Models Containing Higher Order Terms

A central issue in testing complex regression equations is that the lower
order effects and the interactions are not typically independent. The vari-
ance that is shared by terms in the equation could potentially be appor-
tioned to the higher and lower order effects in several different ways. This
issue has stimulated a sizeable literature comparing strategies for testing
overlapping effects in MR (see, e.g., Allison, 1977; Cleary & Kessler,
1982; Cohen & Cohen, 1983; Darlington, 1990; Lane, 1981; Pedhazur,
1982; Peixoto, 1987). Interestingly, a large parallel literature in ANOVA
discusses strategies for partitioning the variance and testing the effects in
factorial designs with unequal (nonproportional) cell sizes (see, e.g., Ap-
pelbaum & Cramer, 1974; Cramer & Appelbaum, 1980; Herr & Gaebel-
ein, 1978; Overall, Lee, & Hornick, 1981; Overall & Spiegel, 1969;
Overall, Spiegel, & Cohen, 1975). These designs have the same problem
that is typical of complex multiple regression equations: The partitioning
of variance is not unambiguous.
To illustrate this issue and the questions it engenders, we will consider
the simplest case, represented by the now familiar two variable regression
equation containing an interaction. This equation is reproduced as equa-
tion 6.1 below.

Ŷ = b1X + b2Z + b3XZ + b0     (6.1)

With this equation as our starting point, a number of simpler models can
also be generated, presented here as equations 6.2, 6.3, and 6.4.

Ŷ = b1X + b2Z + b0     (6.2)

Ŷ = b1X + b0     (6.3)

Ŷ = b2Z + b0     (6.4)

Note that the b1X terms in equations 6.1, 6.2, and 6.3 will not generally
be equal. In equation 6.3, b1 reflects all variance shared between X and
Y. In equation 6.2, b1 reflects all variance shared between X and Y over
and above the variance shared between Z and Y. Finally, in equation 6.1,
b1 reflects the unique variance shared between X and Y after the effects of
Z and XZ on Y have been removed.1 Thus the b1s in the three equations
have different meanings and may vary substantially in magnitude, de-
pending on interpredictor correlation and the distribution of the variables.
Thus far in the book we have discussed the interpretation of only the
b1 and b2 coefficients of equation 6.1. Each lower order effect was inter-
preted assuming the presence of the interaction in the equation.2 Testing
the significance of each term in equation 6.1 provides a test of the unique
102 MULTIPLE REGRESSION

variance attributable to each of the effects. This strategy is always the
method of choice if b3 is significant. The central question in this case is
what interpretation, if any, should be given to the lower order terms. This
first question, which we have already considered and whose answer we
will briefly review below, has been the focus of an extensive debate in
the literature (Appelbaum & Cramer, 1974; Cohen, 1978; Darlington,
1990; Finney, Mitchell, Cronkite, & Moos, 1984; Overall et al., 1981;
Pedhazur, 1982).
Continuing with our example, suppose that b3 is not significant. This
outcome leads to a second question that has also been the focus of con-
siderable debate in the literature (e.g., Cramer & Appelbaum, 1980; Fin-
ney et al., 1984; Overall et al., 1981): Should the XZ term be dropped
from the equation and the b1 and b2 coefficients be estimated using re-
duced equation 6.2? In answering this question, it will be useful to dis-
tinguish between two different cases: (a) Strong theory makes predictions
only about the X and Z effects, but the analyst included the XZ term to
explore what would happen. (b) Strong theory makes predictions about
the XZ effect that were not supported in the present study.

Question 1. Interpretation of Lower
Order Terms when b3 Is Significant
The interpretation of lower order terms in the presence of an interaction
has been previously discussed in Chapters 3 and 5. To review briefly, the
first order effects do not represent a constant effect across the range of the
second variable. Rather, for centered variables, they represent the con-
ditional effect of the variable at the mean of the second variable. They
may also be interpreted as the average effect defined as the mean of the
simple slopes evaluated for each case in the sample (Darlington, 1990;
Finney et al., 1984). Thus, so long as the predictor variables are centered,
the lower order effects have a clear interpretation.
There are many situations in which the average value of the regression
of Y on X at the mean value of Z (i.e., the conditional effect of X at Z̄)
in the sample will be useful. For example, if poor school children receive
or do not receive a hot lunch program (X) regardless of the quality of
their home diet (Z), it is useful to know whether, on average in the sample
at least, there is a salutary effect of the hot lunch program on some health
outcome (Y). Stated in terms of the interpretation of b1 as a conditional
effect at Z = 0 for centered Z, it is useful to know whether there is a
salutary effect of the hot lunch program for children with an average (typ-
ical) home diet. Consequently, we recommend that variables be centered

Model and l\fl'eet Testing 103

and that conditional effects be tested. At the same time, we believe that
authors have an obligation to the reader to explain the meaning of con-
ditional effects and how they differ from more familiar constant main ef-
fects. Restricting the presentation of the tests of the conditional effects to
the context of post hoc probing of the interaction would help minimize
the possibility of misinterpretation of such effects.

Question 2. Should Lower Order Coefficients
Be Tested in Reduced Models
when b3 Is Nonsignificant?

The second question concerns the proper procedure to follow when the
test of the by coefficient for the interaction is nonsignificant. This question
centers on whether or not the interaction should be eliminated from the
equation, with the first order X and Z terms now being tested using equa­
tion 6.2. To begin to answer this question, we need to consider the trade­
off between two desirable properties of statistical estimators: unbiased­
ness and efficiency.
Unbiasedness means that the estimators, in this case the sample regres­
sion coefficients, will on the average equal the population value of the
corresponding parameters. In standard regression models, the major source
of bias is the omission of terms from the regression equation that represent
true effects in the population (specification error). High efficiency means
that the size of the standard error of the estimator relative to other esti­
mates of the same standard error will be small. Including additional terms
in a regression equation that in fact have no relationship to the criterion
has no effect on bias: All of the regression coefficients will continue to be
unbiased estimators of their respective population values. However, the
introduction into the equation of unnecessary terms having no relationship
to the criterion in the population has the result of lowering, sometimes
appreciably, the efficiency of the estimates of the regression coefficients.
Otherwise stated, the estimates of the standard errors of the regression
. coefficients will be larger, making it more difficult for any true effects in
I the equation to attain statistical significance. Hence, terms that have a
value of zero in the population should be removed from the regression
i equation to permit more powerful tests of the other eff ects.
/ Researchers rarely know whether an interaction is zero in the popula-
/ tion. Statistically, all the researcher can typically do to evaluate this hy-
; pothesis is to test the interaction in the sample and show it does not differ
у from zero. Unfortunately, as we will see in Chapter 8, tests of interactions
у often have low statistical power and may fail to detect small true inter-
f • ,

r4
cs CamScanner
104 MULTIPLE REGRESSION

action effects that exist in the population. This problem has led to con­
flicting recommendations in the literature. For example, in the ANOVA
context, Cramer and Appclbamn (1980) have emphasized the increased
efficiency that results when nonsignificant higher order terms arc dropped
from the model. They have argued that this gain more than compensates
for the small amount of bias that may be introduced in the estimates of
the lower order effects/ In contrast, Overall ct al. (1981) focused on the
problem of bias and showed under the conditions studied in their simu­
lation that the use of the full model (equation 6.1) resulted in less bias,
greater precision, and equal power relative to the use of the reduced model
(equation 6.2) for tests of the lower order effects.
However, these results, while informative, should not necessarily be
treated as being applicable in all contexts. As pointed out by Finney et
al. (1984), “in actual research situations, as opposed to simulation anal­
yses, the degree of bias for each approach depends on unknown—and, in
many cases, unknowable—factors, such as the main and interactive effects
of relevant variables that were not included in the model” (p. 91). in­
stead, Finney et al. (1984) argued that researchers should focus on the
distinction we introduced above: the theoretical versus exploratory basis
of the interaction. When there are strong theoretical expectations of an
interaction, they conclude that the interaction should be retained in the
final regression equation. Doing so informs the literature and leads to the
accumulation of knowledge about the theory in question. Even when non­
significant, the estimates of the effect size for the interaction may be com­
bined across multiple studies through meta-analysis. Further, it is logi­
cally inconsistent to report the estimates of constant effects for X and Z
from equation 6.2 when strong theory postulates an interaction.
In practice, estimates of the lower order effects derived from equation
6.1 versus from equation 6.2 will often be quite similar when all predic­
tors have been centered. Indeed, Finney et al. (1984) note three cases in
which they are identical. In each of these cases, there is no overlapping
variance between the two predictors and their interaction, (a) If X and I
are uncorrelated with one another and are centered, then each is uncor *
related with the XZ term, (b) If X and Z are bivariate normal, then again
rx xz ~ rz xz = 0- (c) XZ pairs are balanced (i.e., for every 4
-Z] data point there is a corresponding [X, Z] data point, and for ever) I-
[-X, Z] data point there is a corresponding [X, -Z] data point), theft |
again rXtxz ~ rz,xz = 0. Although these exact conditions do not often
hold in observed data, the rxxz and rx xz correlations are often low with 4
centered X and Z, so that estimates from equations 6.1 and 6,2 will be
quite similar. - - ///Ж
' ' Ж

CamScanner
Model and Effect Testing 105

Recommendation
Our own recommendation concurs closely with that of Finney ct al.
(1984). In eases in which there arc arc strong theoretical grounds for ex­
pecting an interaction, the interaction, even if nonsignificant, should be
retained in the final regression equation. Post hoc probing procedures can
also be used as a supplemental guide to understanding how the interaction
could potentially modify the overall results. Such probing is particularly
important in eases in which the first order effects, though significant, are
not large in magnitude and the statistical power of the test of the inter­
action is low. Conditional effects for the lower order terms from equation
6.1 may be reported where useful and appropriate, so long as centered X
and Z have been used in the analysis and the nature of the effects is ex­
plained. However, in eases in which there is not a strong theoretical ex­
pectation of an interaction, step-down procedures should be used. The
interaction should be dropped from the equation and the first order effects
should be estimated using equation 6.2.
The remainder of this chapter focuses on a variety of global tests that
can be useful in testing focused hypthescs about a variety of alternatives
to the full regression model. We also present sequential step-down pro­
cedures that are useful in exploring complex regression equations. The
analyst should always keep in mind the theoretical versus exploratory ba­
sis of the tests that are performed. This consideration informs the choice
of data analytic strategy and the interpretation of the results. With Finney
ct al, (1984) we also strongly encourage researchers using these global
and step-down testing procedures to report the findings of their prelimi­
nary analyses in order to place the main results in context.

Exploring Regression Equations Containing


Higher Order Terms with Global Tests

There are many instances in which researchers may wish to use MR to


conduct global tests of hypotheses. Such tests may be based on competing
theoretical perspectives. For example, theory I may propose that only X
atxl X2 predict Y, whereas theory 2 may propose that Z and its interactions
with X and X2 also contribute to the prediction of У. Or, the tests of the
urne hypotheses may be purely exploratory. Hypotheses about interac­
tions and higher order effects may also be based on hunches about the
prior data snooping, desires to identify or rule out possible moder­
ators of an obtained effect, as well as many other possible sources. Such

CamScanner
"”LTm=

exploration is common in the ease of orthogonal ANOVA. Resear.


often explore the main and interactive effects of demographic factors (
gender) in preliminary analyses in the absence of strong theoretical6 8’’
dictions. Complex factorial ANOVA models often include effect^
which the researcher docs not have specific hypotheses. For example *
complete, four factor ANOVA design generates a total of 15 effects-^*
main effects, six two-way interactions, four three-way interactions, and
one four-way interaction—even though at most only a few of these effects
have been predicted. Such exploratory analyses can yield new and inter
esting hypotheses that can then be tested in subsequent research. How­
ever, strong caution must be exercised in interpreting the results of the
exploratory' analyses within the original sample given the high experi­
mentwise error rate (inflated alpha level) that results from conducting large
numbers of tests. This problem can be minimized through the Bonferoni
or other procedures that adjust the level of significance for the number of
tests that have been conducted.
In the regression context, a variety of hypotheses about nonlinear and
interactive effects may be tested through a general model comparison pro­
cedure. Model 1, the full model, contains all of the lower order terms
plus the specific additional terms that are in question. Model 2, the re­
duced model, contains only the lower order terms. To compare the two
models, a test of significance of the gain in prediction due to the inclusion
of the additional terms is given as follows, with (m, n - k - 1) df:

F = [(R²in - R²out)/m] / [(1 - R²in)/(n - k - 1)]     (6.5)

In this equation, R²in is the squared multiple correlation of the model con-
taining the terms in question; R²out is the squared multiple correlation from
the reduced model with the terms in question removed; m is the number
of terms in the set of terms being explored; n is the number of cases; and
k is the number of predictors in the full regression model, from which
R²in is derived. We will use a number of variants of this general procedure
throughout the remainder of the chapter.
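The test is easy to script; the following helper (a sketch of ours in Python with SciPy,
with placeholder R² values rather than results from the text) computes equation 6.5 and
its p value.

from scipy import stats

def gain_in_prediction_F(r2_in, r2_out, m, n, k):
    # F test of equation 6.5 for the gain due to a set of m additional terms
    F = ((r2_in - r2_out) / m) / ((1 - r2_in) / (n - k - 1))
    p = stats.f.sf(F, m, n - k - 1)
    return F, p

# e.g., full model 6.1 (k = 3 predictors) versus reduced model 6.3, which drops
# the Z and XZ terms (m = 2); the R-squared values below are placeholders
print(gain_in_prediction_F(r2_in=0.30, r2_out=0.20, m=2, n=400, k=3))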
As an illustration, if we wish to determine whether the Z and XZ terms
contribute to the prediction in equation 6.1, we can compare the full model
(equation 6.1) with a reduced model that eliminates Z and XZ (equation
6.3). In this comparison, R²in is the squared multiple correlation from equation
6.1; R²out is that from equation 6.3; m = 2 for the b2Z and b3XZ terms in
question; and k = 3, corresponding to the three predictors in equation

CamScanner
Model and Effect Testing 107

6.1. Of importance, all tests of the gain in prediction accounted for by a


set of terms in a regression equation arc scale Invariant. This statement
is true even when not all of the terms making up the predictor set are scale
invariant. If this omnibus test is not significant, then all terms involved
in the set arc deleted (i.c., pooled into the error term) and the revised
model is tentatively accepted.
Differences in the interpretation of the results and in the subsequent
statistical procedure may appear at this point depending on the source of
the reduced model. If the reduced model was based on substantive theory,
it will be preferred over the alternative, full model on the grounds of
parsimony. No further testing to eliminate terms from the reduced model
would normally be conducted. In contrast, if the reduced model were
exploratory, this revised model could now receive further scrutiny. In
cases in which multiple terms remain in the reduced model, consideration
would be given to the possibility of dropping additional terms from the
reduced model. The purpose of such exploratory tests of complex regres­
sion models is to eliminate predictors for which there is no support and
to simplify the model as much as is possible.

Some Global Tests of Models


with Higher Order Terms
A variety of global tests may be made based on a single regression
model. We will use the now familiar equation

Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0     (6.6)

to illustrate several of the interpretable types of global tests that may be


performed. The test chosen will ideally depend on substantive theory. In
more exploratory cases, the test will depend on the specific questions that
more exploratory cases, the test will depend on the specific questions that
the researchers desire to answer.
1. Global Test of Linearity of Regression
Equation 6.6 may be contrasted to equation 6.2, Ŷ = b1X + b2Z +
b0, which contains only linear terms, to determine whether the regression
model can be treated as being purely linear. If there is no significant gam
in prediction from equation 6,2 to 6.6, then it is concluded that the sim
pier linear model of equation 6.2 is appropriate. Then the tests of 6; and
b2 coefficients within equation 6.2 provide information about the signtfi
'/ cance of the linear effects of X and Z, respectively (see Darlington, 1990,
h p. 336; Pedhazur, 1982, p. 426).
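As a hedged modern illustration of this model comparison, the Python sketch below contrasts equation 6.2 with equation 6.6 on simulated data; the variable names and the data are hypothetical. The same compare_f_test pattern implements the other global tests in this chapter by swapping in the appropriate reduced model.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
df["y"] = 2.0 * df["x"] + 1.0 * df["z"] + rng.normal(size=n)   # hypothetical data

reduced = smf.ols("y ~ x + z", data=df).fit()                                # equation 6.2
full = smf.ols("y ~ x + z + I(x**2) + x:z + I(x**2):z", data=df).fit()       # equation 6.6

f_value, p_value, df_diff = full.compare_f_test(reduced)
print(f_value, p_value, df_diff)   # a nonsignificant gain favors the linear model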


2. Global Test of Curvilinearity of Regression



To determine whether the relationship of X to the criterion is curvilinear,
the gain in prediction from equation 6.1 to equation 6.6 is tested. To
see this, consider the reexpression of equation 6.6 in the form of a polynomial
regression equation, given in equation 5.16 and reproduced here:

Y = (b1 + b4Z)X + (b2 + b5Z)X² + (b3Z + b0)          (6.7)

The test of the gain in prediction from equation 6.1 to equation 6.6 provides
a test of whether the linear combination of regression coefficients
(b2 + b5Z) for the X² term is different from zero. In equations involving
Z² terms, a similar reexpression of the regression equation in terms of Z
and Z² provides the basis for a similar test evaluating whether the relationship
of Z to the criterion is curvilinear.

3. Global Test of the Effect of One Variable

In equation 6.6, one may test whether there is any effect of one of the
predictor variables, in either first order or higher order form. For a global
test of variable X, equation 6.6 would be contrasted to equation 6.4, Y =
b2Z + b0, which drops all terms containing X. If this test of gain in
prediction is nonsignificant, it is concluded that variable X has no effect
in the regression model. For a global test of variable Z, equation 6.6
would be contrasted to equation 6.8, which drops all terms containing Z:

Y = b1X + b2X² + b0          (6.8)



There are k such tests in a regression model based on k original predictors
(here k = 2, corresponding to X and Z; see also Darlington, 1990, p. 336).
Note that in these tests all interactions involving the variable in question
are dropped, whereas interactions not involving the variable in question
are retained.

4. Global Test of an ANOVA-Like Effect

As pointed out in Chapter 5, factors having more than 2 levels produce
main effects and interactions in ANOVA with multiple degrees of freedom
(df). For example, an ANOVA with 3 levels of X and 2 levels of Z yields
an XZ interaction with 2 df. If these data were analyzed using MR, the
interaction variance from the ANOVA would be accounted for by the XZ
plus X²Z terms in equation 6.6. Given this parallel structure, the same
partitions of variation may be used in MR as in ANOVA. Thus a test of
the overall XZ interaction analogous to that in ANOVA would result from


comparing the full model represented by equation 6.6 with a reduced
model in which the two terms involving XZ are dropped. This reduced
model is represented by equation 6.9:

Y = b1X + b2X² + b3Z + b0          (6.9)

If there were no significant interaction, then both the b4XZ and b5X²Z
terms would be eliminated, leaving equation 6.9 as the appropriate equation.
Global tests of effects analogous to those in ANOVA may be employed
in series. Suppose that the test of the interaction was not significant, yielding
reduced regression equation 6.9. Equation 6.9 represents the main
effect of X, represented by the X and X² terms, plus the Z main effect.
Equation 6.9 would now be contrasted to equation 6.4 to determine the
overall contribution of X to prediction. This is analogous to the test of a
main effect of X in ANOVA where X has three levels. Finally, the linear
effect of Z would be tested by contrasting equation 6.9 with equation 6.8.
The above tests of the overall X effect (X plus X2) in the presence of
Z and the Z effect in the presence of X and X2 will be familiar to many
ANOVA users. These procedures are identical to Appelbaum and
Cramer’s (1974) tests of main effects over and above other main effects.
Appelbaum and Cramer (1974) referred to these tests with regard to the
A and В main effects of an ANOVA as “A eliminating B” and “B elim­
inating A.”
Appelbaum and Cramer (1974) also observed that in some instances
the multiple correlation for an equation containing two main effects (e.g.,
equation 6.9) will be significant, but that neither individual effect will
attain significance. They recommend further testing of the X and Z
main effects in which each effect is considered in a single equation. In the
present case, the joint test of b1 and b2 in equation 6.8 would provide the
test of X, and the test of b2 in equation 6.4 would provide the test of Z.
Such tests, termed "A ignoring B" and "B ignoring A" by Appelbaum
and Cramer (1974), are recommended to clarify the impact of a single
factor on a criterion. If only one of the pair of tests is significant, the
associated predictor is taken as having an effect on (or association with)
the criterion.
5. Global Test of the Equivalence of Two Regression Equations

Chapter 7 addresses the analysis of regression equations involving combinations
of continuous and categorical predictor variables. In equation


6.1, let us assume that X is a continuous variable and Z is a dichotomous
predictor variable representing two different groups (e.g., men versus
women). The XZ interaction term in equation 6.1 then represents any
difference in the slopes of the regression lines of Y on X between the two
groups, here men versus women. Predictor Z represents differences in the
Y scores between the two groups, evaluated at the mean of the full sample
on the continuous variable.4 If equation 6.1 is contrasted to equation 6.3,
this provides a global test of whether there is any difference between the
simple regression lines in the two groups (see Cohen & Cohen, 1983, pp.
312-313; Lautenschlager & Mendoza, 1986; Neter, Wasserman, &
Kutner, 1989, p. 368).

Structuring Regression Equations with Higher Order Terms


In all of these global tests, the equations containing higher order terms
are deliberately structured to contain all lower order terms of which the
higher order terms are comprised. Tests of the contributions of higher
order terms should consider prediction of these terms over and above the
lower order terms. This is because higher order terms actually represent
the effects they are intended to represent if and only if all lower order
terms are partialed from them (Cohen, 1978). The XZW term covered in
detail in Chapter 4 only represents the linear x linear x linear component
of the XZW interaction if all lower order effects (i.e., X, Z, W, XZ, XW,
and ZW) have been partialed from the XZW term. A global test of the
curvilinearity of X in equation 6.6 would not be accomplished by determining
the significance of the multiple correlation in the equation Y =
b2X² + b5X²Z + b0. In such an equation the higher order curvilinear
effects would be confounded with portions of the linear X, linear Z, and
the linear by linear XZ interaction. In sum, we recommend strongly that
in structuring regression equations with higher order terms all lower order
terms be included (see Allison, 1977; Cleary & Kessler, 1982; Cohen &
Cohen, 1983; Darlington, 1990; Pedhazur, 1982; Peixoto, 1987; Stone
& Hollenbeck, 1984, for further discussion of the necessity for the inclusion
of lower order terms when higher order terms are tested).
Readers will occasionally encounter models in the research literature in
which lower order terms have been omitted. Fisher (1988) described a
theoretical model in which one predictor X has a direct effect on the criterion
Y; a second predictor Z modifies the effect of X on Y, but has no
direct effect on Y. This theorizing led to the regression equation Y = b1X


+ b3XZ + b0, with the b2Z term for the direct effect of Z on Y omitted.
We suggest that it is useful with such theorizing to test the direct effect of
Z on Y and to show a lack of support for its operation, rather than to omit
it from the model. Hence, we advocate the use of models with all lower
order effects included for theory testing. Demonstrations that effects predicted
to be nonexistent by theory do not accrue are valuable in theory
building.

Sequential Model Revision of Regression Equations Containing Higher Order Terms: Exploratory Tests


There are two instances in which the examination of complex regres­


sion equations will proceed one term at a time. First, in tests of explor­
atory hypotheses, if a global test involving several terms is significant,
then the individual terms in the set are tested to characterize further the
nature of the effect studied in the global test. The second instance is in
the absence of global tests. One may adopt a strategy of sequentially ex­
ploring a complex regression equation term by term without having first
performed more global tests.
We showed in Chapter 3 that all regression coefficients except that for
the highest order term(s) are scale dependent, and that their significance
levels will vary widely with changes in scaling of the first order predic­
tors. Thus what is required is an approach to model exploration using
sequential term-by-term testing that involves tests only of scale-invariant
terms at each stage of analyses. A step-down hierarchical examination of
the regression equation with model respecification after each step satisfies
this requirement. The approach begins with the full equation; nonsignifi­
cant terms are then omitted sequentially in stages beginning with the high­
est order term in the equation. At each step the scale-invariant terms are
identified and only these terms are tested for statistical significance. Such
an approach is outlined for equation 6.6 by Peixoto (1987).
In any regression equation a term will be scale free if the complete
combination of letters and superscripts of the term is not included in any
other term. In equation 6.1 the XZ term is scale free; in equation 6.6 it is
not scale free because it is included in the X²Z term. Likewise, X² is
scale free in equation 6.9 but not in equation 6.6. Appendix B presents
an algorithm for identifying scale-free terms in regressions through the
complexity of equations 4.1 and 5.5, containing XZW or X²Z². It also
shows algebraically that the terms so identified are scale free. A simple,


alternative computer-based procedure that is useful for complex regression
equations is to estimate the regression equation twice. In the first
run, the original values of the predictor variables are used; in the second
run, a constant is added to each of the predictor variables (e.g., X* = X
+ 10). Those terms in the regression equation that do not change between
the two runs are scale invariant. Note that each time a term is dropped
from the equation in model revision, new terms become scale free, so that
this computer checking method would have to be repeated at each step.5
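The Python sketch below illustrates this computer check on simulated data; the variable names, the data, and the shift constant are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300
df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
df["y"] = df.x + df.z + 0.5 * df.x * df.z + rng.normal(size=n)   # hypothetical data

formula = "y ~ x + z + I(x**2) + x:z + I(x**2):z"   # equation 6.6
fit1 = smf.ols(formula, data=df).fit()

shifted = df.assign(x=df.x + 10, z=df.z + 10)        # e.g., X* = X + 10
fit2 = smf.ols(formula, data=shifted).fit()

# Coefficients that do not change between the two runs belong to scale-invariant
# terms (here only the highest order term, I(x**2):z, should remain unchanged).
print(pd.DataFrame({"original": fit1.params, "shifted": fit2.params}).round(4))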

Application of Sequential Testing


Following a Global Test
If a global test involving more than one term is significant, then the
terms involved in the global test should be explored individually begin­
ning with the highest order term of the set, the only term that is scale
invariant. If nonsignificant, the highest order term is eliminated and the
model is revised. Those remaining terms of the global test set that are
scale free in the revised equation are then tested for significance. For
example, we described on page 107 a global test of the linearity of regres­
sion that contrasted equations 6.6 and 6.2. If this global test were signif­
icant, then the tenns of the set X2, XZ, and X2Z would be candidates for
follow-up tests. The X2Z term would be tested for significance first, by
testing the significance of the b5 coefficient in equation 6.6, or equiva­
lently by testing the gain in prediction of equation 6.6 over equation 6.10:

Y = b1X + b2X² + b3Z + b4XZ + b0          (6.10)



If the b5 coefficient were significant, this interaction would be probed
following the prescriptions of Chapter 5, Case 4a. If the coefficient were
nonsignificant and were dropped from the equation, then in the resulting
equation 6.10, there are two scale-independent terms, b2X² and b4XZ.
Each of these terms is tested separately. The b4XZ term is tested by contrasting
equation 6.10 to equation 6.9; the b2X² term is tested by contrasting
equation 6.10 to equation 6.11, presented below:

Y = b1X + b3Z + b4XZ + b0          (6.11)


Note that both equations 6.9 and 6.10 are hierarchically "well formulated"
(Peixoto, 1987) in that all lower order terms are represented. Suppose
further that the b2X² term were significant, whereas the b4XZ term


were not. Then equation 6.9 would be retained. The source of the deviation
from linearity found in the global test would have been identified as
resulting from a curvilinear regression of Y on X. This relationship would
then be further characterized using the strategies for probing curvilinear
relationships described in Chapter 5, Case 2.

General Application of Sequential Testing

An alternative strategy for exploring complex regression equations is


to simplify the model on a term-by-term basis without first having per­
formed global tests. The researcher simply begins with the highest order
term in the regression equation and steps down through the hierarchy fol­
lowing the algorithm for scale-independent terms. The procedure stops
when all nonsignificant higher order terms arc eliminated from the equa­
tion. If the final reduced equation still contains higher order terms, then
these terms should be probed using the prescriptions from Chapters 2
through 5, or from Chapter 7 if combinations of discrete and continuous
predictors are involved.

Present Approach Versus That Recommended by Cohen (1978)


Cohen (1978) recommended a hierarchical step-up approach to exam­


ining regression equations that include higher order terms. In his ap­
proach, lower order effects are tested in equations containing only these
effects; interactions are then tested for their contribution over and above
main effects. Cohen’s approach handles the scale invariance problem as
does the step-down approach in that lower order scale-dependent terms
are never tested in the presence of higher order terms. However, Cohen's
step-up approach can lead to the interpretational problem of considering
"main effects" before the analyst has determined whether interactions
exist. Indeed, investigators have misused the step-up approach by inter­
preting the coefficients from an equation containing only first order effects
as main effects when the subsequent test showed the interaction to be
significant. The step-down approach handles the scale invariance problem
and also helps insure that lower order effects will be interpreted as con­
ditional or average effects once a higher order effect has been shown to
exist. The step-down hierarchical approach is consistent with the position
taken by Cohen that first order terms should not be tested if there is a
significant interaction.


Variable Selection Algorithms

The familiar automatic stepwise forward (build up) and stepwise back­
ward (tear down) variable selection algorithms should not be confused
with the hierarchical step-up and step-down procedures discussed here.
The selection of predictors in stepwise procedures available in standard
statistical packages is based solely on the predictive utility of individual
predictors over and above other predictors. Their use in the context of
complex regression equations containing higher order terms will lead to
reduced regression equations that are not hierarchically well formulated,
that is, in which not all necessary lower order terms are included. The identical
problem holds for the all-possible-subset regression algorithms in
which all possible regression equations containing 1 through k predictors
from a set of k predictors are generated and the "best" equation in terms
of predictive utility is selected. With regression equations containing
higher order terms, none of the typical automatic search procedures is appropriate.
Only a procedure that preserves the hierarchy of variables at each
stage should be employed (see also Peixoto, 1987).

Summary

This chapter initially considers two questions stemming from the non-
independence of terms in regression equations containing interactions.
First, the interpretation of the lower order coefficients in the presence of
an interaction is reviewed. Second, the trade-off between bias and effi­
ciency in the tests of regression coefficients involved in dropping nonsig­
nificant terms from a regression equation is discussed. The important role
of the theoretical versus exploratory basis of the interaction is emphasized
in the choice of testing procedure and in the interpretation of the results.
A variety of global step-down tests of focused hypotheses (e.g., the pres­
ence of curvilinearity, the overall effect of a single variable) are intro­
duced. A term-by-term strategy is presented for exploring complex
regression equations through the step-down elimination of nonsignificant
higher order predictors. Methods for the identification of scale-invariant
terms in any regression equation with higher order terms are provided.
An advantage of a hierarchical step-down procedure over step-up proce­
dures is presented.


Notes

1. The percentage of variance uniquely shared between the predictor and the criterion
in each regression equation is the square of the standardized regression coefficient.
2. An infrequently used alternative procedure is for the analyst to specify the order of
tests in a hierarchical step-up procedure. The order of tests is based on strong theory, the
temporal precedence of the predictor variables, or both. For example, in a study of the
effect of students' race (X), high school GPA (Z), and their interaction (XZ) on college
GPA, the researcher could argue that race precedes high school GPA and thus accounts for
some of the variance in high school GPA. Race and high school GPA could also (though
much more weakly) be argued to precede the race x GPA interaction. Under this strong
set of assumptions, the test of b\ in equation 6.3 would provide the test of the race effect,
the test of b2 in equation 6.2 would provide the test of the effect of high school GPA, and the
test of b3 in equation 6.1 would provide the test of the race x high school GPA interaction.
In this strategy, all of the variance shared between race, high school GPA, and their inter­
action is apportioned to race; the variance shared only between high school GPA and the
interaction is apportioned to high school GPA. The test of b3 is once again a test of the
unique variance of the interaction. In the absence of a strong theoretical claim that X causes
Z, the attribution of the shared variance to X cannot be logically justified. Thus claims for
the validity of this approach are strongly dependent on the judged adequacy of the under­
lying substantive theory.
3. Readers wishing more advanced treatments of the bias versus efficiency issue in se­
quential testing (pretest estimators) should consult Judge, Hill, Griffiths, Lutkepohl, and Lee
(1982, particularly Chapter 21) and Judge and Bock (1978).
4. As will be explained in Chapter 7, this interpretation assumes that the group variable
has been dummy coded. If effect coding has been used, this effect represents the difference
of each group from the unweighted mean, again evaluated at the mean of the continuous
variable.
5. We thank David Kenny for suggesting the computer test for scale invariance.

7 Interactions Between Categorical
and Continuous Variables

Thus far we have focused on the treatment of interactions between con­


tinuous predictor variables. We now consider problems in which categor­
ical predictor variables having two or more levels interact with continuous
predictor variables. We also discuss techniques for post hoc probing to
aid in the interpretation of significant interactions involving categorical
and continuous variables.

Coding Categorical Variables

A number of methods for coding categorical variables have been pro­


posed (see e.g., Cohen & Cohen, 1983; Darlington, 1990; Pedhazur,
1982). Two methods are considered here: (a) dummy variable coding and
(b) unweighted effects coding.

Dummy Variable Coding

Dummy variable coding is the most frequently utilized procedure in the


literature for representing categorical variables in regression equations.
Contrary to our recommendations for continuous variables, the dummy
coding procedure does not center the comparisons involving the categor­
ical variable(s). Nonetheless, as will be seen below, the results of this
procedure are easily interpretable.


To illustrate the use of dummy coding, imagine a researcher is studying


the starting salaries of bachelor degree graduates in three colleges: Liberal
Arts (LA), Engineering (E), and Business (BUS). College is the categor­
ical variable in this example. One of the colleges (e.g., LA) is designated
as the comparison group; this designation may be arbitrary, based on the­
ory, or because of a special interest in comparing the other two colleges
with this baseline. In general, G — 1 dummy variables will be needed,
where G is the number of groups (levels of the categorical variable). With
three levels of college, 3-1=2 dummy variables will be needed. Three
possible sets of dummy variable codes for comparing these three colleges
are presented in Table 7.1.

Table 7.1
Three Dummy Variable Coding Systems for College Data

a. LA as Comparison Group     b. E as Comparison Group     c. BUS as Comparison Group
        D1   D2                       D1   D2                        D1   D2
LA      0    0                LA      1    0                 LA      1    0
E       1    0                E       0    0                 E       0    1
BUS     0    1                BUS     0    1                 BUS     0    0

Throughout this section we will use the first set of dummy variable
codes depicted in Table 7.1a. In this coding system, the first dummy variable
(D1) compares E with the LA comparison group, which is assigned
a value of 0. The second dummy variable (D2) compares BUS with the
LA comparison group. In dummy coding, (a) the comparison group is
assigned a value of 0 in all dummy variables, (b) the group being con­
trasted to the comparison group is assigned a value of 1 for that dummy
variable only, and (c) groups not involved in the contrast are also assigned
a value of 0 for that dummy variable. Note that dummy codes are partial
effects that are conditioned on all G - 1 dummy variables being present
in the regression equation.1
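A minimal Python sketch (ours, with toy data) of constructing the Table 7.1a dummy codes by hand is given below.

import pandas as pd

college = pd.Series(["LA", "E", "BUS", "E", "LA"])   # hypothetical toy data
codes = pd.DataFrame({
    "D1": (college == "E").astype(int),    # E vs. the LA comparison group
    "D2": (college == "BUS").astype(int),  # BUS vs. the LA comparison group
})
print(pd.concat([college.rename("college"), codes], axis=1))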
Continuing with our example, suppose the researcher has sampled a
small number (N = 50) of university graduates and has recorded their
college, grade point average, and starting salary. College and GPA are
the predictor variables and starting salary is the outcome variable of in­
terest. These data are presented in Table 7.2. As in Chapter 5, we will
consider the interpretation of a series of regression equations of increasing
complexity.


Table 7.2
Hypothetical Data: Starting Salaries in Three Colleges

Sub. No.  College  GPA   Salary ($)     Sub. No.  College  GPA   Salary ($)
 1        LA       2.54  21,140         26        E        2.18  28,219
 2        LA       2.25  20,667         27        E        1.93  27,946
 3        LA       2.69  21,003         28        E        2.31  28,053
 4        LA       2.84  21,269         29        E        2.45  28,209
 5        LA       2.73  20,831         30        E        2.35  27,899
 6        LA       2.83  21,370         31        E        2.44  28,295
 7        LA       2.48  20,435         32        E        2.13  27,672
 8        LA       2.58  20,584         33        E        2.22  27,756
 9        LA       3.95  21,604         34        E        3.41  28,065
10        LA       3.00  20,937         35        E        2.58  27,885
11        LA       2.59  20,625         36        BUS      2.78  23,942
12        LA       2.27  20,389         37        BUS      2.50  23,205
13        LA       3.14  21,490         38        BUS      2.92  23,962
14        LA       2.95  21,007         39        BUS      3.08  24,369
15        LA       2.67  21,063         40        BUS      2.96  23,840
16        LA       2.67  21,003         41        BUS      3.06  24,452
17        LA       2.67  20,586         42        BUS      2.72  23,218
18        LA       2.89  21,084         43        BUS      2.82  23,455
19        LA       2.94  21,256         44        BUS      4.00  25,790
20        LA       3.43  21,651         45        BUS      3.22  24,206
21        LA       2.75  20,794         46        BUS      2.83  23,506
22        LA       1.93  20,380         47        BUS      2.52  22,961
23        LA       3.04  20,961         48        BUS      3.36  24,868
24        LA       3.13  21,796         49        BUS      3.18  24,223
25        LA       3.05  21,075         50        BUS      2.91  24,004

          GPA Mean    Salary Mean ($)
LA        2.80        21,000
E         2.40        28,000
BUS       3.00        24,000

NOTE: LA = Liberal Arts; E = Engineering; BUS = Business Administration.
GPA is on a 4-point system where A = 4.0, B = 3.0, C = 2.0, D = 1.0, F = 0.0.

Categorical Variable Only


Consider the simple regression equation containing only the dummy
variables:

Y = b1D1 + b2D2 + b0          (7.1)


This equation compares the mean starting salaries (Y) of graduates of the
three colleges. Let us substitute the dummy codes for each college from
Table 7.1a into this equation.

LA: Y = b1(0) + b2(0) + b0 = b0

E: Y = b1(1) + b2(0) + b0 = b1 + b0

BUS: Y = b1(0) + b2(1) + b0 = b2 + b0

From these substitutions, we see that b0 represents the mean (predicted)
value of Y for the comparison group, here the LA graduates. b0 + b1
represents the mean value of Y for the E graduates, and b0 + b2 represents
the mean value of Y for the BUS graduates. Table 7.3a(i) presents the
results of the analysis for the college salary data. The estimates for the
three b coefficients are $21,000 for b0, $6,999.90 for b1, and $3,000.10
for b2. As can be seen in Table 7.2, b0 = mean for LA graduates, b1 +
b0 = mean for E graduates ($27,999.90), and b2 + b0 = mean for BUS
graduates ($24,000.10). These data are depicted in Figure 7.1a.
The joint test of b1 and b2 (see Chapter 6) compares the mean starting
salaries of the three colleges and is equivalent to a one-way ANOVA.
The SSregression (predictable sum of squares) from the regression analysis
in Table 7.3a(i) is exactly equal to SStreatment in the one-way ANOVA.
The F tests from the two analyses are equivalent as well. The results of
the joint test in the regression analysis show significant differences among
the groups. The test of b1 compares the means for the E and LA groups;
the test of b2 compares the means for the BUS and LA groups. Both
contrasts are significant.2 (See also note 1, p. 138.)
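A hedged Python sketch of equation 7.1 is given below; the tiny data frame is a hypothetical stand-in for data like Table 7.2. The intercept recovers the comparison-group mean, the dummy coefficients recover the deviations from it, and the overall F test reproduces the one-way ANOVA.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({                       # toy data, not the Table 7.2 values
    "college": ["LA", "LA", "E", "E", "BUS", "BUS"],
    "salary":  [21000, 21200, 28000, 27900, 24000, 24100],
})
df["D1"] = (df.college == "E").astype(int)
df["D2"] = (df.college == "BUS").astype(int)

fit = smf.ols("salary ~ D1 + D2", data=df).fit()
print(fit.params)      # Intercept = LA mean; D1, D2 = E and BUS deviations from LA
print(fit.f_pvalue)    # joint test of b1 and b2 (one-way ANOVA F test)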

Categorical and Continuous Variables

We now add an effect of GPA on starting salary to the equation, re­


sulting in

Y = b1D1 + b2D2 + b3GPA + b0          (7.2)

As usual in our treatment of continuous variables, we will center GPA


throughout this section:

GPA = Original GPA - Mean GPA for entire sample


Table 7.3
Analyses of Progression of Regression Equations

a. Dummy Variable Coding

(i) Test of dummy variables only

Y = b1D1 + b2D2 + b0
Y = (6,999.9)(D1) + (3,000.1)(D2) + 21,000
Joint test of b1, b2: R² = 0.969, F(2, 47) = 744, p < .001
Test of b1: t(47) = 38.0, p < .001
Test of b2: t(47) = 18.7, p < .001
SSreg = 360,492,000; SSres = 11,386,717

(ii) Test of dummy variables and continuous variable

Y = b1D1 + b2D2 + b3GPA + b0
Y = (7,377)(D1) + (2,821)(D2) + 943(GPA) + 20,978
Joint test of b1, b2, b3: R² = 0.987, F(3, 46) = 1,124, p < .001
Test of b3: R²change = 0.017, F(1, 46) = 58.62, p < .001
SSreg = 366,872,239; SSres = 5,006,479

(iii) Test of dummy variables, continuous variable, and interaction

Y = b1D1 + b2D2 + b3GPA + b4(D1 x GPA) + b5(D2 x GPA) + b0
Y = (7,065)(D1) + (2,619)(D2) + (790)(GPA) + (-667)(D1 x GPA)
    + (1,082)(D2 x GPA) + 20,982
Joint test of b1-b5: R² = 0.994, F(5, 44) = 1,412, p < .001
Joint test of b4, b5: R²change = 0.007, F(2, 44) = 25.80, p < .001
Test of b3: t(44) = 6.76, p < .001
Test of b4: t(44) = -2.98, p < .01
Test of b5: t(44) = 5.33, p < .001
SSreg = 369,574,652; SSres = 2,304,066

Once again we substitute the dummy codes for each college into the
equation:

LA: Y = b3GPA + b0

E: Y = b1 + b3GPA + b0

BUS: Y = b2 + b3GPA + b0

As can be seen, equation 7.2 implies that each college is represented by
a separate regression line, with each line having an identical slope, b3.
This means the three regression lines will be parallel to one another, as
is illustrated in Figure 7.1b. b0 represents the predicted salary value for

Table 7.3, continued

b. Unweighted Effects Coding

(i) Test of effects variables only


Y = bfEi + b2E2 + b0
F= $3,666.58 (К,) + (-333.26) (E2) + $24,333.32
Joint test of b2: R2 = 0.969, F(2, 47) = 744, p < .001
Test of b,: t(47) = 31.4, p < .001
Test of b2: t(47) *= -3.2, p < .01
SSXft = 360,492,000; SSre, = 11,386,718

(ii) Test of effect variables and continuous variable


F = btE\ + b2E2 — b3GPA + bQ
= 3,978.1 (E,) + (-578.7)(E2) + 943(GPA) + 24,377.7
Joint test of b(, b2, bj. R2 = 0.987, F(3, 46) = 1,124, p < .001
Test ofb3: Л2Ьпре = 0.017, F(l, 46) = 58.62, p < .001
SSrep = 366,872,239; SSre, = 5,006,479

(iii) Test of effect variables, continuous variables, and interaction


У = b,£t + b2E2 + b3GPA + b4 (£, X GPA) + b5(E2 X GPA) + bQ
Y == (3,836.9)E| + (—609.2)E2 + 928.3 GPA
+ (-805.4)(E, x GPA) + (943.7)(E2 x GPA) + 24,209.3
Joint test of brb5: R2 = 0.994, F(5, 44) = 1,412, p < .001
Joint test of b4, b5: Fc2hange = 0.007, F(2, 44) = 25.80, p < .001
Test of bp f(44) = 10.00, p < .001
Test of b4: t(44) = -5.60, p < .01
Test of bs: r(44) = 7.08, p < .001
SS„g = 369.574,652; SSre$ = 2,304,066

LA graduates having the GPA variable equal to 0. Because we have centered
GPA, this corresponds to the mean GPA of the entire sample. In
Table 7.3a(ii) note that the value of b0 ($20,978) has changed from that
in our first equation ($21,000). b0 now represents the predicted starting
salary of LA graduates at the mean value of GPA for the entire sample
(2.78), in contrast to its earlier interpretation as the mean starting salary for the LA
graduates.
b1 represents the distance between the LA and E regression lines, and
b2 represents the distance between the BUS and LA regression lines. b1
and b2 represent the difference between mean starting salaries for E and
LA graduates and BUS and LA graduates, respectively, but now adjusted
for (conditioned on) GPA. The differences are now attributable to the
differences between the colleges above and beyond (independent of) differences
in GPA between the colleges. These changes in the meaning of

[Figure 7.1. Simple Regression Lines for Three Colleges.
Panel a: Y = b1D1 + b2D2 + b0. The predicted salary is the mean for the college regardless of the level of GPA.
Panel b: Y = b1D1 + b2D2 + b3GPA + b0. NOTE: Simple regression lines for the three colleges are parallel.
(Vertical axis: starting salary; horizontal axis: GPA, centered and original.)]

the b1 and b2 coefficients are reflected in the changed values of these
coefficients seen in Table 7.3a(i) and (ii). b1 now equals $7,377 and b2
now equals $2,821.
The b3 coefficient represents the slope relating GPA to starting salary.
Recall that in equation 7.2 we have assumed this relationship to be identical
in each of the colleges. b3 = $943 indicates that graduates could expect
a change in starting salary of $943 if their GPA changed 1.0 grade point.
For example, an increase in GPA from 2.78 to 3.78 would lead to a

[Figure 7.1, continued.
Panel c: Y = b1D1 + b2D2 + b3GPA + b4D1GPA + b5D2GPA + b0; each college's line is labeled with its within-college regression equation.
NOTE: The simple regression lines for the three colleges now have different slopes.
(Vertical axis: starting salary; horizontal axis: GPA, centered and original.)]

predicted increase in starting salary of $943. Note that in equation 7.2 the
effects of college are statistically removed from the regression coefficient
b3 for GPA. If this were not done (i.e., if D1 and D2 were not included in
the equation), b3 would equal -$1,080, reflecting the relatively low GPAs
and high salaries of the engineers.

Categorical and Continuous Variables and Their Interaction

Finally, let us consider a regression equation in which the slopes rep­
resenting the relationship between GPA and starting salary are permitted
to differ among the colleges. This means that the regression lines may not
be parallel, indicating a potential interaction between the categorical and
continuous variables. In general, the interaction between a continuous and
a categorical variable is formed by multiplying the continuous variable by
each of the dummy variables comprising the categorical variable. In the
present example, the following equation adds two terms to represent the
interaction:

Y = b1D1 + b2D2 + b3GPA + b4(D1 x GPA) + b5(D2 x GPA) + b0          (7.3)


To understand the meaning of these equations, we substitute the dummy
codes for each of the colleges into this equation. This substitution produces
three simple regression equations, analogous to those derived in
Chapters 2 through 5. Terms in which the value of either dummy variable
equals 0 are omitted.

LA: Y = b3GPA + b0

E: Y = b1(1) + b3GPA + b4GPA + b0
     = (b1 + b0) + (b3 + b4)GPA

BUS: Y = b2(1) + b3GPA + b5GPA + b0
     = (b2 + b0) + (b3 + b5)GPA

This substitution makes it clear that each college now has its own linear
regression line, with each line having a separate intercept and slope. The
three regression lines corresponding to each of the colleges are depicted
in Figure 7.1c. b3 represents the slope of the line for the LA graduates,
b3 + b4 represents the slope of the line for the E graduates, and b3 + b5
represents the slope of the line for the BUS graduates. From Table
7.3a(iii), b3 = $790 is the slope for the LA graduates, b3 + b4 = $123
is the slope for the E graduates, and b3 + b5 = $1,872 is the slope for
the BUS graduates.
b0 represents the intercept of the LA regression line, evaluated in the
present centered case at the mean of the entire sample of 50 students. b1
represents the distance between the LA and E regression lines and b2
represents the distance between the LA and BUS regression lines, both
distances evaluated at the mean GPA of the entire sample. From Table
7.3a(iii), we see that these values are b0 = $20,982, b1 = $7,065, and
b2 = $2,619. (Note that the intercept for the LA group is b0 = $20,982,
for the E group it is b1 + b0 = 7,064.71 + 20,981.52 = $28,046, and
for the BUS group it is b2 + b0 = 2,618.57 + 20,981.52 = $23,600.09.)
These coefficients do not equal the values obtained in estimating equation
7.2, which does not contain the interaction terms: Equation 7.2 attributes
a portion of the interaction variance to the lower order terms. Note that
the distance estimates are only meaningful when interpreted at the mean
GPA of the sample, because the regression lines are not parallel.
To provide another perspective on these slope and intercept estimates,
we separately computed a simple linear regression of starting salary on
GPA (centered) within each college. The results were as follows:

LA: Y = 790 GPA + 20,982

E: Y = 123 GPA + 28,046

BUS: Y = 1,872 GPA + 23,600

Note that the estimates for the slope and intercept within each college are,
within rounding error, equal to the values reported above based on a model
containing the dummy variables, the continuous variable, and the interactions.
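A hedged Python sketch of equation 7.3 follows; the toy data frame and variable names are ours, not the Table 7.2 values. It shows how the within-college slopes and intercepts are recovered from the dummy-coded interaction model.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({                                    # hypothetical toy data
    "college": ["LA", "LA", "LA", "E", "E", "E", "BUS", "BUS", "BUS"],
    "gpa":     [2.5, 3.0, 2.8, 2.2, 2.6, 2.4, 2.8, 3.4, 3.0],
    "salary":  [20800, 21300, 21000, 27950, 28050, 27990, 23600, 24700, 24000],
})
df["gpa_c"] = df.gpa - df.gpa.mean()                   # center GPA on the full sample
df["D1"] = (df.college == "E").astype(int)
df["D2"] = (df.college == "BUS").astype(int)

fit = smf.ols("salary ~ D1 + D2 + gpa_c + D1:gpa_c + D2:gpa_c", data=df).fit()
b = fit.params
print("LA  slope:", b["gpa_c"])                        # b3
print("E   slope:", b["gpa_c"] + b["D1:gpa_c"])        # b3 + b4
print("BUS slope:", b["gpa_c"] + b["D2:gpa_c"])        # b3 + b5
print("LA  intercept:", b["Intercept"])                # b0
print("E   intercept:", b["Intercept"] + b["D1"])      # b0 + b1
print("BUS intercept:", b["Intercept"] + b["D2"])      # b0 + b2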
A joint test of b4 and b5 provides the overall test of the statistical significance
of the interaction. As can be seen in Table 7.3a(iii), this test is
significant, as are the individual tests of each of the contrasts involving a
dummy variable in interaction with the continuous variable, GPA.
Finally, the point of intersection of any pair of the lines can be calculated
to determine whether the lines cross within the useful range of the
continuous variable, here GPA. Each pair of lines may cross at a different
point. To calculate the point of intersection, the equations for the two
regression lines are set equal to each other and solved for the continuous
variable (GPA). For the regression lines representing the LA and E groups
above,

b3GPA + b0 = b1 + b3GPA + b4GPA + b0

So the intersection point for the LA and E lines is at GPA = -b1/b4 =
-7,065/-667 = 10.59 (see Table 7.3a(iii) for values of b1 and b4).
Setting the equations equal for the other possible pairs of groups produces
two additional intersection points: the intersection of LA and BUS is
-2.42 and the intersection of BUS and E is 2.54. Recall that we have
centered GPA. To place these values back in the original metric, the sample
mean must be added to these values, leading to the following intersection
points on the original GPA scale: LA-E = 13.36, LA-BUS = 0.35,
and E-BUS = 5.31. Figure 7.1c illustrates the three simple regression
lines corresponding to each college.
As discussed previously in Chapter 2, interactions in which all of the
points of intersection fall outside of the useful range of the continuous
variable are termed ordinal, whereas interactions in which at least one of

the points of intersection fall within the useful range of the continuous
variable are termed disordinal. Two of the three points of intersection fall
outside the possible range of GPA; however, the LA-BUS intersection
(0.35) does fall within the theoretical 0.0-4.0 range of GPA. Note, however,
that the LA-BUS intersection point does fall outside of the range of
GPAs observed in our sample and very likely represents an impossible
GPA for an actual graduate. Hence, the LA-BUS interaction should also
be considered to be ordinal.
More generally, the point of intersection on the continuous variable can
be calculated using the slopes and intercepts of the two regression lines
according to the following equation:

Intersection point = (I1 - I2)/(S2 - S1)          (7.4)

In this equation, I1 is the intercept for group 1, I2 is the intercept for group
2, S1 is the slope for group 1, and S2 is the slope for group 2.
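A one-line Python sketch of equation 7.4 is given below, illustrated with the LA and E slope and intercept values reported in the text.

def intersection_point(i1, s1, i2, s2):
    """X value at which two regression lines Y = s*X + i cross (equation 7.4)."""
    return (i1 - i2) / (s2 - s1)

# LA: slope 790, intercept 20,982; E: slope 123, intercept 28,046 (centered GPA).
print(intersection_point(20982, 790, 28046, 123))   # about 10.59 on the centered GPA scale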
Higher Order Effects and Interactions
As we saw in Chapter 5 with continuous variables, higher order terms
can also be added to the equation. Because the recommendations of Chap­
ter 5 can be applied here, we will consider only briefly two examples.
As a first example, the potential linear and quadratic effects of GPA on
starting salary, each of which is assumed to be identical across groups,
can be examined by adding a GPA² term to equation 7.2. This results in
equation 7.5:

Y = b1D1 + b2D2 + b3GPA + b4GPA² + b0          (7.5)

Extending the example further, if the linear and quadratic effects of GPA
are both permitted to differ among the groups, then additional terms must
be added to equation 7.5, resulting in equation 7.6:

Y = b1D1 + b2D2 + b3GPA + b4GPA² + b5(D1 x GPA)
    + b6(D2 x GPA) + b7(D1 x GPA²)
    + b8(D2 x GPA²) + b0          (7.6)

Equation 7.6 permits both the linear and quadratic components of the <
GPA-starting salary relationship to differ among the three colleges. Com- A


parison of equation 7.6 with one in which the (D1 x GPA²) and (D2 x
GPA²) terms are not included permits a test of the significance of the
quadratic component of the college x GPA interaction.
The point(s) of intersection for the two groups can be determined by
setting the equations for the two groups to be equal and solving for the
continuous variable. For example, substitution into equation 7.6 and al­
gebraic manipulation shows that the LA and E lines will cross at

Intersection point 1 = [-b5 + (b5² - 4(b7)(b1))^1/2] / (2 b7)

Intersection point 2 = [-b5 - (b5² - 4(b7)(b1))^1/2] / (2 b7)
If the solutions for intersection points 1 and 2 are equal, there is only one
intersection point. None, one, or both of the intersection points may occur
within the useful range of the continuous variable.
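A small Python sketch of this quadratic intersection formula follows; the coefficient values plugged in are hypothetical.

import math

def quadratic_intersections(b1, b5, b7):
    """GPA values where b1 + b5*GPA + b7*GPA**2 = 0, i.e., where the two curves cross."""
    disc = b5 ** 2 - 4 * b7 * b1
    if disc < 0:
        return ()                     # the two curves never cross
    root = math.sqrt(disc)
    return ((-b5 + root) / (2 * b7), (-b5 - root) / (2 * b7))

print(quadratic_intersections(b1=500.0, b5=-300.0, b7=40.0))   # hypothetical values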
In summary, just as was the case for continuous variables, higher order
terms can be added to the regression equation to test specific hypotheses.
Each of the dummy variables representing the categorical variable must
be included as first order terms and in all interactions. As with continuous
variables, all of the lower order terms involved in interactions must be
included in the equation. Quadratic (or other higher order) functions of
dummy variables are never included in the regression equation, as they
do not lead to interpretable effects.

Unweighted Effects Coding

Unweighted effects coding is a second, relatively frequently used cod­


ing system. In this coding system the two groups involved in the contrast
specified by the code are compared with the unweighted mean of all of
the groups. As was defined in Chapter 2, the comparison group is always
coded -1, the group being contrasted to the comparison group is assigned
a value of 1, and groups not involved in the contrast are assigned a value
of 0. Returning to our college-salary example, we again arbitrarily des­
ignate LA as the comparison group. Two variables (effects codes) are
again needed to contrast LA to E and BUS, respectively. Table 7.4 pro­
vides the unweighted effects codes for these contrasts. Substituting these


Table 7.4
Unweighted Effects Codes for College Example

        E1   E2
LA      -1   -1
E        1    0
BUS      0    1

values into equation 7.1, we obtain the following set of equations for each
of the three colleges:

LA: Y = -b1 - b2 + b0

E: Y = b1 + b0

BUS: Y = b2 + b0

In these equations, b0 represents the unweighted grand mean of the three
groups, that is, b0 = [Mean(LA) + Mean(E) + Mean(BUS)]/3 =
[$21,000 + $27,999.9 + $24,000.1]/3 = $24,333.3. b1 represents the
deviation of the mean of the group labeled 1 in the E1 contrast, here
Mean(E), from the unweighted grand mean. This value is $27,999.9 -
$24,333.3 = $3,666.6. b2 represents the deviation of the mean of the
group labeled 1 in the E2 contrast (BUS) from the unweighted grand mean.
Table 7.3b presents the results for the unweighted effects code analysis
for several of the regression equations discussed above. A comparison of
these results with those using dummy coding shows both similarities and
differences in the results. (a) Overall tests of R² or change in R² corresponding
to adding a categorical, continuous, or interaction variable are
identical across the two coding systems. (b) Tests of the coefficients of
the continuous variable (b3) and each term of the interaction (b4, b5) are
different.4 (c) Tests of the intercept and the dummy (or unweighted effect)
variables differ. These differences reflect the dissimilar interpretations of
the b coefficients in the two coding systems. In dummy variable coding,
the contrasts are with the comparison group; in unweighted effects coding,
the contrasts are with the unweighted mean of the sample.
To illuminate further the difference in interpretations, it is instructive
to compare the b coefficients in the full model with interactions (Table
7.3b(iii)) with the results of the simple regressions separately computed


within each college (see p. 125). The intercepts in the three colleges are
$20,982 (LA), $28,046 (E), and $23,600 (BUS). The unweighted mean
of the three intercepts is $24,209.3, which is identical to b0. The intercept
for LA is -b1 - b2 + b0 = -$3,836.9 - (-$609.2) + $24,209.3 =
$20,982. Substituting the appropriate values into the equations above for
each of the colleges, the intercept for E is $3,836.9 + $24,209.3 =
$28,046 and the intercept for BUS is -$609.2 + $24,209.3 = $23,600.
Similar logic can be applied to the calculation of the slopes. The slopes
in the simple regressions calculated separately for each college were $790
for LA, $123 for E, and $1,872 for BUS. The unweighted mean of the
slopes in the three colleges is $928.3, which equals b3. The slope for the
LA group is b3 - b4 - b5 = 928.3 - (-805.4) - 943.7 = 790. The
slope for the E group is b3 + b4 = 123 and the slope for the BUS group
is b3 + b5 = 1,872. Thus the differences in the b coefficients between the
dummy coding and effect coding analyses directly reflect the differences
in meaning. It is important to note that the simple regression equations
for each group are identical whether dummy coding or unweighted effects
coding is used. This is yet another example of the point made in Chapter
3: Predictor scaling does not affect the simple slope analysis for post hoc
probing.

Choice of Coding System

Given that the two coding systems discussed above produce results that
reflect their different meanings, which coding system should be preferred?
When the interactions involve a categorical variable and a continuous var­
iable, dummy variable coding produces immediately interpretable con­
trasts with the comparison group, whereas simple effect coding does not.
Hence, if there is interest in contrasts between pairs of groups, dummy
variable coding will be more efficient. When the interactions of interest
involve two (or more) categorical variables, effect coding is preferred be­
cause it produces results that are immediately comparable with standard
ANOVA procedures. For example, when there are equal ns in each cell,
effect coding produces main effects and interactions that are orthogonal
just as in ANOVA. However, dummy coding produces correlations be­
tween the contrast vectors for the main effects and those for the interac­
tions. Thus some (minor) adjustments are needed in the results of the
dummy coded analysis to produce orthogonal estimates of the variance


resulting from the main effects and interactions (see Pedhazur, 1982, p.
369).

Centering Revisited
After the emphasis on centering predictor variables in cases of interactions
between two or more continuous variables, the failure to use centered
dummy or effect variables (i.e., mean = 0) is striking. However,
with categorical variables we are nearly always interested in regression of
the predictor variable within the distinct groups themselves rather than at
the value of the (weighted) mean of the groups. As we have seen, both
dummy coding and effects coding lead to clearly interpretable results in
terms of the slopes and intercepts for each group. If, however, we are
interested in the average effect of the continuous predictor variable, another
coding system, weighted effects coding, should be used. Weighted
effects codes follow the same logic as the unweighted codes, but take each
group's sample size into account. Darlington (1990) presents a discussion
of weighted effects codes. Note that in the special case where the sample
sizes in each group are equal, unweighted and weighted effects codes are
equivalent.

Post Hoc Probing of Significant Interactions

The significant overall (joint) test of the interaction of a categorical and
continuous variable tells us only that there is an overall difference in the
slopes of the regression lines. As was the case for a significant interaction
between two continuous variables, we now wish to probe further the interaction
to assist in its interpretation. Three sets of tests that address
different questions can be performed.
First, we may test the simple slopes for the regression of the outcome
variable (e.g., starting salary) on the continuous variable (e.g., GPA). Because
one of the predictor variables is categorical, the simple slopes of
interest will be those evaluated at values of the dummy (or effect) variables
that correspond to the separate groups. Thus, in our example, we
will be interested in evaluating whether the simple slopes for the LA, E,
and BUS groups each differ from zero.
Second, we may test the difference between the predicted values in any
pair of groups for a specific value of the continuous variable. For example,
we may wish to test whether the E and BUS regression lines differ
for students who have a specific value of GPA, say 3.5, corresponding to
the cutoff for Dean’s list.
Third, we may be interested in identifying the region(s) of the continuous
variable where two regression lines can be shown to differ signifi-
cantly. For example, for what range of values of GPA do the E and BUS
students differ in their starting salaries?

Testing Simple Slopes Within Groups


To test the significance of the simple slopes within each level of the
categorical variable (groups), the general procedures developed in Chapter
2 are followed. For our starting salary example, b3 is the simple slope
for the LA group, (b3 + b4) is the simple slope for the E group, and (b3
+ b5) is the simple slope for the BUS group. The corresponding standard
errors for the three groups are (s33)^1/2, (s33 + s44 + 2s34)^1/2, and (s33
+ s55 + 2s35)^1/2, respectively, where s33, s44, s55, s34, and s35 are taken
from Sb, the variance-covariance matrix of the regression coefficients.5 The t tests
can be computed by dividing the simple slope by its corresponding standard
error, with df = n - k - 1, where k is the number of terms in
the regression equation not including the intercept (here k = 5).
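A small Python sketch of this standard-error formula is given below; all numerical values are hypothetical stand-ins for entries of Sb and for the residual degrees of freedom, not the values from the college example.

import math
from scipy import stats

b3, b4 = 800.0, -650.0                       # hypothetical coefficients
s33, s44, s34 = 9000.0, 16000.0, -9000.0     # hypothetical entries of Sb
df_resid = 44                                # n - k - 1

simple_slope = b3 + b4                       # simple slope for one group
se = math.sqrt(s33 + s44 + 2 * s34)          # (s33 + s44 + 2*s34)^1/2
t = simple_slope / se
p = 2 * stats.t.sf(abs(t), df_resid)
print(simple_slope, se, t, p)

In practice, Sb can be obtained directly from a regression program (e.g., the coefficient covariance matrix reported by most packages).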

Computer Procedure

A very simple computer procedure can be used to test the simple slopes
in each of the groups. In our example, when D1 = 0 and D2 = 0, the test
of the b3 coefficient in the overall analysis including the categorical variable,
the continuous variable, and their interaction (see Table 7.3a(iii))
provides the proper test of the simple slope in the comparison (LA) group.
We can take advantage of this fact by noting that the simple slope of the
comparison group in the particular dummy coding system is always properly
tested in this case. In our example, if we recode the groups according
to the dummy coding procedure shown in Table 7.1b, the E group is now
the comparison group; its simple slope is b3 = 122.9 and the test of b3 in
the regression analysis (t = 0.65, ns) provides the appropriate test of the
simple slope. Similarly, if we recode the groups according to the dummy
coding procedure shown in Table 7.1c, b3 = 1,872.0 is now the simple
slope of the BUS group and the test of b3 (t = 11.29, p < .001) provides
the appropriate test of significance. Thus, in our case involving three
groups, conducting three separate regression runs in which each group in
turn serves as the comparison group produces proper tests of each of the
three simple slopes.


Differences Between Regression Lines at a Specific Point


A second way to probe significant interactions is to test whether the
predicted values for any pair of the groups differ at a specified value of
the continuous variable. The Johnson-Neyman technique (Johnson & Fay,
1950; Johnson & Neyman, 1936; Rogosa, 1980, 1981; see Huitema,
1980, for an excellent presentation of the basic technique and its extensions
to more complex problems) has long offered a solution to this question.
Although the calculations using this technique are straightforward,
they are fairly tedious; computer software to perform this analysis is not
available, to our knowledge, in the major statistical packages. We offer here
a simple alternative to the Johnson-Neyman technique that draws on our
computer method for testing simple slopes.
Let us simplify our example and consider only two groups, E and BUS.
Suppose we are interested in determining whether there is a significant
difference between the two regression lines at the point where GPA =
3.5 (Dean's list). We will arbitrarily call BUS the comparison group so
that our dummy variable D1 = 1 for E and 0 for BUS. The following
regression equation describes the two-group case with a potential interaction:

Y = b1D1 + b2GPA + b3(D1 x GPA) + b0

We can estimate this equation using several different transformations of
GPA, obtaining different values for the lower order coefficients while the
b3 coefficient remains constant. Below we report three solutions: (a) untransformed
GPA on the usual 0.0-4.0 scale; (b) GPA-C, centered for the
entire sample of 50 students (GPA - 2.78), to be maximally comparable
with the centered analyses reported above; and (c) transformed GPA-D
= GPA - 3.50, which sets the 0.0 value for transformed GPA at
the untransformed value of 3.50, the Dean's list cutoff. The results of the
three equations are
(a) GPA: Y = 9,303.4(D1) + 1,872.0(GPA)
             + (-1,749.1)(D1 x GPA) + 18,401.5

(b) GPA-C: Y = 4,446.1(D1) + 1,872.0(GPA-C)
               + (-1,749.1)(D1 x GPA-C) + 23,600.1


(c) GPA-D: Y = 3,185.5(D1) + 1,872.0(GPA-D)
               + (-1,749.1)(D1 x GPA-D) + 24,953.5

Note that in each case, the b1 coefficient represents the distance between
the regression lines for E versus BUS when the value of the GPA variable
is 0.0. For (a), this distance is evaluated when original GPA is 0.0, which
is likely to be of little usefulness. For (b), this distance is evaluated when
original GPA is 2.78, which is the mean of the entire sample of students.
For (c), this distance is evaluated when original GPA is 3.50, which cor­
responds to our point of interest, the Dean’s list cutoff. Thus we find that
at GPA = 3.5 the difference in the predicted starting salaries of E and
BUS students is $3,186. The test of b1 is also reported in standard regres-
sion packages, t = 12.01, p < .001, and corresponds to the results of
the Johnson-Neyman test of significance of the difference between two
regression lines at a GPA of 3.5. This computer solution can also be
extended to more complex problems such as testing the significance of
the distance between two regression planes or two regression curves when
specific values are given for each of the continuous predictor variables.
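A hedged Python sketch of this re-centering trick is given below; the two-group data frame and its values are hypothetical, not the E/BUS data.

import pandas as pd
import statsmodels.formula.api as smf

df2 = pd.DataFrame({                                    # toy two-group data
    "group":  ["E", "E", "E", "BUS", "BUS", "BUS"],
    "gpa":    [2.2, 2.6, 3.4, 2.8, 3.0, 3.6],
    "salary": [27950, 28050, 28150, 23600, 24000, 25100],
})
df2["D1"] = (df2.group == "E").astype(int)              # BUS is the comparison group
df2["gpa_d"] = df2.gpa - 3.5                            # 0 now corresponds to GPA = 3.5

fit = smf.ols("salary ~ D1 + gpa_d + D1:gpa_d", data=df2).fit()
# The D1 coefficient is now the estimated E - BUS difference at GPA = 3.5,
# and its t test is the test of that difference.
print(fit.params["D1"], fit.pvalues["D1"])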
Several comments about this computer test should be noted.
1. The test should be used with dummy codes because the interest is
in comparing differences between group regression lines. Recall that un­
weighted effects codes compare the regression line with the unweighted
mean.
2. When more than two groups are employed, the test of the coefficient
for each dummy code provides a test of the difference between the regression
lines for the comparison group and the group specified by the dummy
code. Note that these tests use the mean square residual (MSres) from the
overall regression analysis based on all groups rather than just the MSres
from the two groups used in the contrast.
3. Contrasts not involving the designated comparison group can be
performed by respecifying the dummy coding as we earlier illustrated for
tests of simple slopes.
4. When several pairs of regression lines are being compared, researchers
may wish to use the Bonferroni procedure to adjust their obtained
values for the number of different tests that are undertaken. Huitema
(1980) presents an extensive discussion of the use of the Bonferroni procedure
in this context.


Identifying Regions of Significance

Potthoff (1964) has extended the basic Johnson-Neyman procedure to
identify regions in which the two regression lines are significantly different
for all possible points. The goal of the test is to identify regions such
that "in the long run, not more than 5% of such regions which are calculated
will contain any points at all for which the two groups are equal
in expected criterion score" (Potthoff, 1964, p. 244). The test thus allows
for the fact that there will be variability from sample to sample in the
estimates of the slopes and intercepts of the two regression lines, so that
the point at which they begin to differ will also vary concomitantly. The
traditional Johnson-Neyman test provides an appropriate test of the difference
between two regression lines at a particular value. The equation
for the Potthoff test closely follows that of the original Johnson-Neyman
test, except that 2F(2, N - 4) replaces F(1, N - 4) as the critical F value. Discussions
and derivations of the two procedures can be found in Potthoff
(1964) and Rogosa (1980, 1981). Below we illustrate this analysis by
comparing the E and BUS groups from our example.
To perform the analysis, three separate steps are performed following
the layout recommended by Pedhazur (1982). First, two separate regres­
sion analyses are performed. Second, the results of the regression anal­
yses are used to calculate three intermediate quantities. Third, the cutoff
values for the regions of significance are calculated.
I

Step I: Regression Analyses '

For ease of interpretation, the two following regression analyses should be run using the original values (i.e., not centered) of the continuous predictor variable.
1. The outcome variable is regressed on the continuous predictor variable using only the data from group 1:

For E only: Ŷ(E) = b1(E)(GPA) + b0(E)

2. The outcome variable is regressed on the continuous predictor variable using only the data from group 2:

For BUS only: Ŷ(BUS) = b1(BUS)(GPA) + b0(BUS)




Step II: Calculation of Intermediate Quantities


Three intermediate quantities, which are conventionally labeled A, B,
and C, must then be calculated. The variables involved in the equations
are defined immediately below the formulas.

A = -[2F2,N-4/(N - 4)] SSres [1/SSX(1) + 1/SSX(2)] + [b1(1) - b1(2)]²

B = [2F2,N-4/(N - 4)] SSres [X̄1/SSX(1) + X̄2/SSX(2)] + [b1(1) - b1(2)][b0(1) - b0(2)]

C = -[2F2,N-4/(N - 4)] SSres [N/(n1n2) + X̄1²/SSX(1) + X̄2²/SSX(2)] + [b0(1) - b0(2)]²

F2,N-4 is the tabled F-value with 2 and N - 4 degrees of freedom.
N is the number of subjects in the entire sample; n1 is the number of subjects in group 1; n2 is the number of subjects in group 2; N = n1 + n2.
SSres is the total residual sum of squares. It is computed by summing the residual sums of squares from the two within-group regression analyses (i.e., regression analysis 1 and regression analysis 2).
SSX(1) is the sum of squares associated with the predictor variable in group 1 (regression analysis 1); SSX(2) is the sum of squares associated with the predictor variable in group 2 (regression analysis 2).
X̄1 is the mean of group 1; X̄2 is the mean of group 2.
b1(1) is the slope for group 1 (analysis 1); b1(2) is the slope for group 2.
b0(1) is the intercept for group 1 (analysis 1); b0(2) is the intercept for group 2.

Step III. Calculation of Region Cutoffs


The final step is the calculation of the cutoff values for the regions of
significance, the formula for which is given below.

X = [-B + (B² - AC)^1/2]/A     and     X = [-B - (B² - AC)^1/2]/A

Note that these equations do not always yield two solutions within the
effective range of the predictor variable. Depending on the nature of the
interaction, there may be 0, 1, or 2 regions within the possible range of


the predictor variable in which the predicted values of the two regression lines differ.
We will illustrate the calculation of the regions of significance for the two-group case using just the data from the E and BUS groups in our example (i.e., we assume E and BUS comprise the entire sample). The necessary data and their source for the computations are presented below.

n1 = 10; n2 = 15; N = n1 + n2 = 25 (from Table 7.2)

F2,N-4 = F2,21 = 3.47 for alpha = .05 (from F-table)

From Regression Analysis 1 for E group

Mean GPA(1) = 2.40; b1(1) = 122.9; b0(1) = 27,705.0;

SSX(1) = 21,768.4; SSres(1) = 357,394.5

From Regression Analysis 2 for BUS group

Mean GPA(2) = 2.99; b1(2) = 1,872.0; b0(2) = 18,401.6;

SSX(2) = 6,671,180.4; SSres(2) = 613,528.5

From Regression Analyses 1 and 2 combining the groups

SSres = SSres(1) + SSres(2) = 970,923

When these values are substituted into the formulas above, and steps II and III are carried out, the values 5.19 and 5.45 are obtained. Recall that we calculated earlier in Chapter 7 (p. 125) that the regression lines for the E and BUS groups had a crossing point of 5.31. Thus, for values of GPA less than 5.19, the E group is predicted to have higher starting salaries than the BUS group; for values of GPA greater than 5.45, the BUS group is predicted to have higher starting salaries than the E group; and for values of GPA between 5.19 and 5.45, the starting salaries of the two groups are not predicted to differ. Given that the possible range of GPA is from 0.0 to 4.0, this means that the E group will always be predicted to have a higher starting salary than the BUS group in the possible range of GPA.
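For readers who prefer to script these calculations, the following minimal Python sketch (our own illustration, not the SAS program of Appendix C; the function and argument names are ours) carries out Steps II and III from the within-group summary statistics listed above. With the values from the E and BUS example it recovers region boundaries of approximately 5.19 and 5.45, up to rounding in the summary statistics.

    import math

    def potthoff_regions(F_crit, N, n1, n2, ss_res, ss_x1, ss_x2,
                         mean1, mean2, b1_1, b1_2, b0_1, b0_2):
        # Step II: intermediate quantities A, B, and C
        k = 2.0 * F_crit / (N - 4)
        A = -k * ss_res * (1/ss_x1 + 1/ss_x2) + (b1_1 - b1_2)**2
        B = k * ss_res * (mean1/ss_x1 + mean2/ss_x2) + (b1_1 - b1_2) * (b0_1 - b0_2)
        C = (-k * ss_res * (N/(n1*n2) + mean1**2/ss_x1 + mean2**2/ss_x2)
             + (b0_1 - b0_2)**2)
        # Step III: region cutoffs X = [-B +/- sqrt(B^2 - AC)] / A
        disc = B**2 - A * C
        if disc < 0:
            return None          # no real boundaries within the predictor's range
        root = math.sqrt(disc)
        return sorted([(-B + root) / A, (-B - root) / A])

    print(potthoff_regions(F_crit=3.47, N=25, n1=10, n2=15, ss_res=970923,
                           ss_x1=21768.4, ss_x2=6671180.4,
                           mean1=2.40, mean2=2.99,
                           b1_1=122.9, b1_2=1872.0,
                           b0_1=27705.0, b0_2=18401.6))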
A few final observations should be made about the Potthoff procedure
for determining regions of significance.


1. The calculations of steps II and III are quite tedious and are cur­
rently unavailable in most standard computer packages. Appendix C of
this book contains a simple SAS program for Potthoff's extension of the Johnson-Neyman procedure for the simple two-group case. Borich (1971; Borich & Wunderlich, 1973) offers more extensive computer programs
for the Johnson-Neyman procedure.
2. Even when regions of significance are obtained within the possible
range of the predictor variable, caution should be taken in interpretation.
If few or no data points actually fall in the regions, the result represents
a serious extrapolation beyond the available data, raising concerns about
the meaning of the obtained region. For example, a region of significance
less than a cumulative GPA of 1.0 would not be particularly meaningful
because few, if any, students ever graduate with such low GPAs. Finally,
if the test does not identify any regions of significance within the range
of the predictor variable, this indicates that the two regression lines differ
for either (a) all values or (b) no values of the predictor variable. In case
(a), the bj coefficient for the group effect will typically be significant,
whereas in case (b), it will not be significant.
3. When regions are being calculated for several pairs of regression lines, researchers may wish to substitute the more conservative Bonferroni F for the F-value listed to maintain the studywise error rate at the level claimed (e.g., alpha = .05).
4. Huitema (1980) includes an extensive discussion of applications of the basic Johnson-Neyman procedure to more complex situations. Note, however, that he presents the test that is appropriate when a priori values of the predictor variables have been selected. Once again, the Potthoff extension requires that 2F2,N-4 be used in place of F1,N-4 as the critical F-value in the equations.
5. Cronbach and Snow (1977) raise the important methodological point
that the interpretation of the regions of significance is clearest in experimental settings in which subjects are randomly assigned to treatment groups. This practice eliminates the possibility that specification error in the regression equation biases the results. Cronbach and Snow present an excellent discussion of the design and analysis of research on aptitude × treatment interactions.


tation of the coefficients in a series of regression equations of increasing


complexity using two coding systems for the categorical variable, dummy
coding and simple effect coding. We have also discussed post hoc probing
of significant interactions including testing of simple slopes, differences between predicted values for pairs of groups, and determination of critical
regions of significance. As we have seen, the procedures developed in
previous chapters for the interpretation of interactions between two or
more continuous predictor variables generalize nicely to interactions in­
volving categorical and continuous variables. Finally, although not dis­
cussed in the present chapter, the prescriptions developed in Chapter 6
for model and effect testing with higher order terms are directly applicable
to regression equations containing categorical predictor variables.

Notes

1. In the dummy coding system of Table 7.1a, dummy code D1 actually contrasts E to LA and E to BUS. Dummy code D2 actually contrasts BUS to E and BUS to LA. The two contrasts share in common the E versus BUS contrast. When D1 and D2 are entered into the same regression equation, the b1 regression coefficient (for D1) reflects that part of the D1 contrast that is independent of D2, that is, the E versus LA contrast. The b2 coefficient (for D2) reflects that part of the D2 contrast that is independent of D1, that is, the BUS versus LA contrast.
2. The simplest method of directly comparing the E and BUS groups is to rerun the
regression analysis using a dummy coding system that designates one of these groups as the
comparison group. In Table 7.1, both sections (b) and (c) include this comparison. This .
method can be used in the more complex models described below and is illustrated later in
the chapter.
3. If we had not centered GPA, b0 would have represented the predicted value of starting salary for LA graduates with a GPA of 0.0. Presumably, none of these individuals would actually graduate, so that this predicted salary value would not be meaningful.
4. In previous instances of rescaling of first order terms by additive constants, there has
been no resultant change in the coefficient for the interaction. The reader should be aware
that the change from dummy codes to effect codes is not a mere change in scaling by additive
constants. The interaction coefficients do change with a change in coding scheme.
5. The weight vector used to compute the standard errors is w' = [0 0 1 D1 D2], where the values of D1 and D2 are (0, 0) for LA, (1, 0) for E, and (0, 1) for BUS. The general expression for the standard error of the simple slopes is (w'Sb w)^1/2, where Sb is the 5 × 5 variance-covariance matrix of the b coefficients available from standard computer packages.

8 Reliability and Statistical Power

Throughout the previous chapters we have made no mention of measurement error in the predictors in regression analysis. We have assumed a regression model in which predictors are measured without error. In simple linear regression analysis with no higher order terms, measurement error in predictors introduces bias into the estimates of regression coefficients (i.e., the expected values of the regression coefficient estimates no longer equal the population parameters). The same is true in more complex models. Measurement error in individual predictors produces a dramatic reduction in the reliability of the higher order terms constructed from them. In turn, reduced reliability of higher order terms increases their standard errors and consequently reduces the power of their statistical tests.
Moreover, the power of statistical tests for higher order terms, even in the absence of measurement error, is expected to be low. Most recently, these concerns have been a particular focus in the literature on moderated multiple regression (Chaplin, in press-a), which addresses tests of models involving interactions. Recent discussions of issues of power and reliability in moderated multiple regression are given in Arvey, Maxwell, and Abraham (1985), Champoux and Peters (1987), Cronbach (1987), Dunlap and Kemery (1987, 1988), Evans (1985), Lubinski and Humphreys (1990), and Paunonen and Jackson (1988). The concerns expressed in this literature and the findings have implications for all equations considered in this text.
In this chapter we first review classic concepts of reliability and intro-

duce an analysis of the reliability of interaction terms. We consider strategies that have been proposed to correct for the unreliability of product terms in regression analysis. We also consider the issue of whether spurious effects can be produced in tests of interactions. We consider first the number of cases required to achieve adequate power in the case of no measurement error in predictors. Then we consider the impact of measurement error on effect sizes, statistical power, and sample size requirements.
The first section of the chapter, Reliability, presents the theoretical basis for the effects of measurement error on regression estimates. Once again, readers may find this section to be slower going than the remainder of the book. The second section, Statistical Power, shows the actual impact of measurement error on statistical power and sample size requirements for tests of the interaction. This section can be read without a full understanding of the theoretical material in the first section.

Reliability

Biased Regression Coefficients


with Measurement Error
Our discussion of measurement error will initially focus only on the reliability of the predictor variables, because measurement error in the criterion does not introduce bias into unstandardized regression coefficients (see, e.g., Duncan, 1975). We begin with a brief review of the definition of reliability and consideration of the effect of unreliability of a predictor in the simple one-predictor regression analysis (see Bohrnstedt, 1983; Duncan, 1975; Heise, 1975; Kenny, 1979).

The Single Predictor Case

In classical measurement theory (Gulliksen, 1987), an observed score (X) is defined as being comprised of the true score on the variable (TX) plus random error (εX), that is, X = TX + εX. This definition implies a linear relationship between X and TX. Classical measurement theory assumes that:

1. The mean or expected value of the random error in the population is zero, that is, E(εX) = 0;
2. that the errors are normally distributed;


3. that the covariance between the random errors and the true scores is zero, that is, C(TX, εX) = 0.

From these assumptions, it follows that the variance of observed scores is comprised of two components, true score variance and error variance, as follows:

σ²X = σ²TX + σ²εX     (8.1)

Then the reliability of the variable is defined as the proportion of total variance in X that is true score variance,

ρXX = σ²TX/σ²X     (8.2)

Bias Due to Unreliability in the Regression Coefficient and Correlations. How does measurement error in the predictor X and the criterion Y affect the covariance between them? Given X = TX + εX and Y = TY + εY, the covariance between X and Y is given as follows:

C(X, Y) = C(TX, TY) + C(TY, εX) + C(TX, εY) + C(εX, εY)     (8.3)

However, this expression may be simplified with two further assumptions of classical measurement theory: Random errors of the predictor and criterion are uncorrelated with true scores, so that C(TX, εY) = C(TY, εX) = 0, and the important assumption that all random errors are uncorrelated, that is, E(εXεY) = 0. Under these assumptions the observed covariance equals the true covariance:

C(X, Y) = C(TX, TY)     (8.4)

Under classical measurement theory, covariances are unaffected by measurement error. The unstandardized coefficient for the regression of Y on X is given as

bYX = C(X, Y)/σ²X = C(TX, TY)/[σ²TX + σ²εX]     (8.5)

Thus, if there is measurement error in predictor X, then bYX is biased: bYX will be closer to zero than the population value it estimates. This bias occurs because the denominator of bYX is the observed variance of predictor X, which is inflated by measurement error. We will refer to bias in which estimates are closer to zero than corresponding parameters as at-

tenuation bias or simply attenuation throughout this discussion. Measurement error also produces attenuation in simple (zero order) correlations (ρXY = C(X, Y)/σXσY), because the denominator of the correlation coefficient contains the observed standard deviations of the variables.
lions (pK) * ('(X. F)/«h«i}), because the denominator of the correlation f
cocfficcnl contains the observed standard deviations of the variables. /

Multiple Regression with no Product Terms

Under classical measurement theory assumptions, the variances of the predictors are inflated by measurement error, but the observed covariances are not. This does not mean, however, that measurement error invariably causes attenuation of regression estimates in the multiple predictor case. With several fallible first order predictors, the extent and direction of bias in each regression coefficient depends upon the intercorrelations among the true scores underlying measured variables (see demonstrations by Bohrnstedt & Carter, 1971; Cohen & Cohen, 1983; Kenny, 1979).
Cohen and Cohen (1983) provide a very clear demonstration of this point. In the two-predictor regression of Y on X and Z, the standardized partial regression coefficient for Y on X is given as:

bYX·Z = (rYX - rYZrXZ)/(1 - r²XZ)

If variable Z has reliability ρZZ less than 1.0, then the numerator of bYX·Z, corrected for measurement error, would be rZZrYX - rYZrXZ, where rZZ is the sample estimate of ρZZ. The observed value of the numerator of bYX·Z, taking into account the unreliability of the partialled predictor Z, can vary widely from the true value. A true nonzero relationship between X and Y in the population may not be observed at all in the sample; a true zero relationship between X and Y may appear to be nonzero; and even the sign of the observed regression of Y on X may vary from the true value in a particular sample. The reliability of the partialled variable (here Z) has a profound effect on the bias in the estimators for other variables. Even if one predictor in a set is perfectly reliable, its regression coefficient is subject to bias produced by error in other predictors. Only in the case in which the true scores underlying observed predictors are uncorrelated with one another is each coefficient guaranteed to be attenuated by measurement error, as in the single predictor case (Maddala, 1977).

Regression with Product Terms

Measurement error in product terms (e.g., XZ, X²Z) that represent interactions in regression requires special consideration. Of particular con-

cern is the covariance between errors of measurement in the product terms


and in the predictors of which the product terms are comprised.

Covariance Between Errors in Product Term and Components. Rcgres-


sion models containing product terms have all the problems of unreli­
ability presented thus far. However, product terms also introduce another
complexity: Even under the assumptions of classical measurement theory,
the covariances between the product term and its components arc affected
by measurement error (see, c.g., Marsden, 1981).
Consider the following expression for the observed crossproduct between two variables, taken from Bohrnstedt & Marwell (1978, p. 65):

XZ = (TX + εX)(TZ + εZ) = TXTZ + TXεZ + TZεX + εXεZ     (8.6)

The last three terms (TXεZ + TZεX + εXεZ) represent the measurement error component (εXZ) of the observed XZ product. These terms have nonzero covariance with the error in their components. For example:

C(εX, εXZ) = C[εX, (TXεZ + TZεX + εXεZ)]
           = TXC(εX, εZ) + TZσ²εX + C(εX, εXεZ)     (8.7)

Under the assumptions of bivariate normality and uncorrelated errors, this expression reduces to

C(εX, εXZ) = TZσ²εX     (8.8)

so that the observed covariance between a crossproduct term and a component is as follows:

C(XZ, X) = C(TXTZ, TX) + TZσ²εX     (8.9)

The covariance of errors in XZ and errors in Z will also produce bias in


regression coefficients for the crossproduct term and its components. The
bias is introduced into the regression coefficients, because C(XZ, X) and
C(XZ, Z), both of which contain measurement error, are terms in these
coefficients.

. Bias in the Regression Coefficient for the Product Term. Once again
the direction of bias depends upon the correlation between true scores of
- predictors, with the same vagaries as in multiple regression with no higher


order terms. Under the assumption of bivariate normality, the following expression from Bohrnstedt and Marwell (1978) gives the correlation between true scores in XZ and in X:

ρTXTZ,TX = [μZ ρXX σ²X + μX C(X, Z)] / [σXZ σX (ρXZ,XZ ρXX)^1/2]     (8.10)

where TXTZ and TX are the true scores associated with observed variables XZ and X; σ²XZ and σ²X are the variances of XZ and X, respectively; and μX and μZ are the population means of X and Z, respectively. Inspection of the numerator of equation 8.10 shows that this correlation goes to zero if X and Z are bivariate normal and centered in the population such that μX = μZ = 0. For the centered case in the population, when the crossproduct term is less than perfectly reliable, the regression coefficient for that term will be attenuated, just as in the one-predictor case. Only in the centered case in the population with bivariate normal predictors X and Z is the uncertainty of the direction of bias of this term removed; the interaction stands alone as does the single predictor, because the true scores of the interaction term and true scores of its components are uncorrelated, that is, C(TX, TXTZ) = C(TZ, TXTZ) = 0.

Reliability of the Crossproduct Term. Bohrnstedt and Marwell (1978) derive an expression for the reliability of the crossproduct XZ under the assumptions of classical measurement theory plus the assumption of bivariate normality of the predictors:

ρXZ,XZ = [θ²XρZZ + θ²ZρXX + 2θXθZρXZ + ρ²XZ + ρXXρZZ] / [θ²X + θ²Z + 2θXθZρXZ + ρ²XZ + 1]     (8.11)

where ρXZ,XZ is the reliability of the crossproduct term; θX = μX/σX is the ratio of the mean to the standard deviation of X; θZ = μZ/σZ is the ratio of the mean to the standard deviation of Z; ρXX and ρZZ are the reliabilities of X and Z; and ρXZ is the correlation between X and Z.
When X and Z are centered in the population, then μX = μZ = 0, and this expression reduces to the simpler form given in Busemeyer and Jones (1983):

ρXZ,XZ = (ρ²XZ + ρXXρZZ)/(ρ²XZ + 1)     (8.12)

If X and Z are uncorrelated, then the reliability of the product term is simply the product of the reliabilities of the components. Bohrnstedt and

t
Marwell (1978) point out the disconcerting fact that ρXZ,XZ depends on the scaling of the variables. Nevertheless, expression 8.12 is instructive, in that it permits examination of the reliability of the crossproduct XZ in terms of the reliability of its components X and Z, when the predictor variables are centered in the population. Table 8.1 shows the reliability of X and Z required to produce a specified reliability of the crossproduct term, as the correlation between X and Z varies. When X and Z are uncorrelated, they must each have reliabilities of .89 in order that the crossproduct have a reliability of .80. When the individual predictors each have good reliabilities (.84), the reliability of the crossproduct XZ is only .70. Note that as the interpredictor correlation increases, the reliability of the crossproduct term increases slightly; or equivalently, slightly lower reliabilities of the X and Z variables are required to produce a crossproduct term with a specified reliability. What is clear from Table 8.1 is that the individual variables entering a crossproduct term must be highly reliable if even adequate reliability is to be achieved for the crossproduct term.
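Equation 8.11 is straightforward to evaluate directly. The short Python functions below (our own illustration; the function names are ours) compute the crossproduct reliability from the component reliabilities and, for the centered case of equation 8.12 with equal component reliabilities, solve for the component reliability needed to reach a desired crossproduct reliability, reproducing entries of Table 8.1.

    import math

    def crossproduct_reliability(rel_x, rel_z, r_xz, theta_x=0.0, theta_z=0.0):
        # Equation 8.11; theta_x and theta_z are mean/SD ratios (0 when centered).
        num = (theta_x**2 * rel_z + theta_z**2 * rel_x
               + 2 * theta_x * theta_z * r_xz + r_xz**2 + rel_x * rel_z)
        den = theta_x**2 + theta_z**2 + 2 * theta_x * theta_z * r_xz + r_xz**2 + 1
        return num / den

    def required_component_reliability(desired, r_xz):
        # Equation 8.12 solved for equal component reliabilities (centered predictors).
        return math.sqrt(desired * (r_xz**2 + 1) - r_xz**2)

    print(round(required_component_reliability(0.70, 0.30), 2))  # about .82, as in Table 8.1
    print(round(crossproduct_reliability(0.84, 0.84, 0.0), 2))   # two uncorrelated .84-reliable
                                                                 # predictors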

Corrected Estimates of Regression Coefficients


in Equations Containing Higher Order Terms

One approach researchers have taken to the problem of measurement


error in regression analysis is to attempt to correct regression coefficients
for measurement error. The typical strategy is to correct the covariance
or correlation matrix of the predictors for the error, yielding matrices said

Table 8.1
Individual Variable Reliabilities Required To Produce a Specified Crossproduct Reliability, as a Function of the Interpredictor Correlation

                                     Interpredictor Correlation
Desired Crossproduct
Reliability                        0        .10        .30        .50

    .90                           .95       .95        .94        .94
    .80                           .89       .89        .88        .87
    .70                           .84       .83        .82        .79
    .60                           .77       .77        .75        .71
    .50                           .71       .70        .67        .61
    .40                           .63       .63        .59        .50

NOTE: Each entry is the reliability of X and of Z required to produce a specified reliability of their crossproduct ρXZ,XZ. This value depends upon the correlation between X and Z. For example, for a crossproduct reliability of .70, given ρXZ = .30, each variable must have a reliability of .82 (alternatively, the product of the reliabilities must be .82²).


to be corrected for attenuation (Fuller & Hidiroglou, 1978). These ma-


trices arc then used to generate corrected regression coefficient
.
* The ap- f
preach historically has been most commonly used in path analysis (Kenny,
1979) to correct structural or path coefficients.1 These coefficients are crit­
ical for theory testing, because the sizes of these coefficients arc used to
infer the magnitude of causal effects of the variables. These approaches
require estimates of the reliabilities, which arc provided for the first order
predictors by work in measurement theory and test construction (Gullik- f
sen, 1987, Lord & Novick, 1968; Nunnally, 1978).2 Then an expression ,
for the reliability of the crossproduct term can be determined based on its ,
component reliabilities. '
A second approach to obtaining corrected estimates of regression cocf- '
ficicnts for product terms is through the use of latent variable structural f
models (Kenny & Judd, 1984). In this approach theoretically error-free
latent variables are estimated, which then are used to compute the regres­
sion coefficients (structural coefficients).

Correcting the Covariance Matrix:


Estimates from Classical Measurement Theory

This approach uses classical measurement theory to attempt to disat-*


tenuate the covariance matrix of the predictors (i.e., to eliminate error-
from the observed variances and covariances). The corrected covariance i
matrix is then used in the regression analysis (Busemeyer & Jones, 1983).;

Variances of First Order Terms. Variances of the first order terms are
corrected for error using equation 8.1: ,
I

σ²TX = σ²X - σ²εX
(8.13)’
л

that is, an estimate of the error variance is subtracted from the observed^
variance of each predictor. Given known reliability the error variance
is given as σ²εX = σ²X(1 - ρXX).
4 -i
■ к
Covariances between First Order Predictors and of First Order Pre-^
dictors with the Criterion. Under the assumption of classical measurement
theory that errors are uncorrelated, covariances between pairs of first or-\
der predictors and between first order predictors and the criterion are un-\
affected by measurement error (see equation 8.4). Hence no correction is
required in the covariances among first order terms or of first order terms
*
with the criterion. V

CamScanner
Reliability and Statistical Power 147

Variance of the Product Tenn. The variance of the product term a2x/ is
a complex function of the means and variances of the components and the
covariance between the components (sec Appendix A for a derivation of
the following expression):

σ²XZ = μ²Xσ²Z + μ²Zσ²X + 2μXμZC(X, Z) + σ²Xσ²Z + C²(X, Z)     (8.14)

A portion of the observed variance of the crossproduct term is due to


error. The error is also a complex function of the means and variances of
each of the first order components, as well as of their reliabilities. Bohm­
stedt and Marwell (1978) provide an expression for error variance con­
tained in the crossproduct term in terms of parameters of the first order
terms:
2 _ t2 „2
~ + m2
X°t
* „2z +
, _2 _2 л
OzPzz

+ (fyxPxx + (8J5)

To correct the variance of the crossproduct term for error, observed values
are substituted into equation 8.15 to obtain an estimate of the error vari­
ance contained in the observed variance. Then the observed variance of
the crossproduct term is reduced by subtracting the estimate of the error
variance.

Covariances of Product Terms with Components. Covariances of cross­


product terms with their components must also be corrected, because, as
we have shown (equation 8.9), the observed covariance between a cross­
product term and a component contains measurement error. Equation 8.8
provides an expression for the error variance contained in the covariance
between a crossproduct and its component. An estimate of this error
variance is again obtained by substituting observed values for parameters.
Then the estimate of the error variance is subtracted from the observed
covariance between the crossproduct and a component. Note that there
will be separate estimates for correcting C(XZ, X) and C(XZ, Z).

Covariance of Product Terms with Other Variables. If the components


X and Z of the XZ crossproduct term and another predictor W are multi­
variate normal, then the covariance of the product term with the other
predictor W need not be corrected. This is because third moments of mul­
tivariate normal distributions vanish (see Appendix A, equation A.l).
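To summarize the correction strategy in one place, the following Python sketch (our own illustration, not the procedure of any particular package; the function name is ours) applies these classical-test-theory corrections using sample statistics in place of parameters. It assumes uncorrelated errors and bivariate normal predictors, and the expression used for the error variance of the product term follows the logic of equation 8.15.

    import numpy as np

    def corrected_predictor_moments(x, z, rel_x, rel_z):
        mx, mz = x.mean(), z.mean()
        vx, vz = x.var(ddof=1), z.var(ddof=1)
        err_x = vx * (1 - rel_x)              # error variance in X (equation 8.13 logic)
        err_z = vz * (1 - rel_z)
        xz = x * z
        # error variance of the observed product term under the stated assumptions
        err_xz = ((mx**2 + rel_x * vx) * err_z
                  + (mz**2 + rel_z * vz) * err_x
                  + err_x * err_z)
        return {
            "var_x": vx - err_x,
            "var_z": vz - err_z,
            "cov_x_z": np.cov(x, z, ddof=1)[0, 1],                  # needs no correction
            "var_xz": xz.var(ddof=1) - err_xz,
            "cov_xz_x": np.cov(xz, x, ddof=1)[0, 1] - mz * err_x,   # equation 8.8 error term
            "cov_xz_z": np.cov(xz, z, ddof=1)[0, 1] - mx * err_z,
        }

The corrected moments would then replace the observed ones in computing the regression coefficients; as the cautions below emphasize, the resulting matrix is not guaranteed to be positive definite.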

CamScanner
1^8 MULTIPLE REGRESSION /'

Cautions. There are substantial cautions concerning regression esti­


mates derived from a covariance matrix corrected as specified above.
I. There is the strong assumption that errors arc uncorrclatcd, though
there is considerable evidence in the social sciences that errors of mca-
surcmcnt arc correlated when measures arc gathered with a common f
methodology (e g.. Campbell & Fiske. 1959; West & Finch, in press), j
2. If the correction is incomplete, that is. the variances and covariances
of some variables measured with error arc not corrected, then the partial '
correction introduces, rather than eliminates, bias in the coefficient esti­
mates (Won, 1982).
3. The covariance matrix created by the corrections may not be posi- >
live definite, meaning that there is no solution for the regression coeffi­
cients.
4. Regression coefficients derived from disattenuated covariance ma­
trices tend to be ovcrcorncctcd (Cohen & Cohen, 1983).
5. Regression coefficients derived from disattenuated covariance ma­
trices cannot be tested for significance, even if the corrections are made ;
using the reliability of each variable in the population. Fortunately, im- *
proved methods have recently been developed for correcting regression I
with multiplicative terms for measurement error (Feucht, 1989; Fuller, -
1987; Heise, 1986). ,

Correcting the Covariance Matrix: '


Correlated Errors Assumed
Heise (1986) proposed a correction for measurement error in order to ■
improve estimates of regression coefficients, including those for multipli- e
cative terms, but with a cost of increased variance of the estimates. He
assumed that errors were not independent of one another, that is, E(€Xq) ,
* 0, while retaining all other assumptions of classical measurement the- ч
ory. Permitting correlated measurement errors produces a divergent result
from that of classical measurement theory: The covariances among pairs
of first order terms are now affected by error. Heise provided expressions
for the error structure of product terms up through products of six van- j
ables in terms of sums of squares and crossproducts of observed scores, v
plus error variances and covariances. This work extends the algebraic
developments of Bohmstedt and Marwell (1978) for the two variable case. t
All the expressions are given in terms of sums of squares and cross­
products of observed scores, plus error variances and covariances. Two ?
expressions (8.16 and 8.17) are provided here to show the effect of the •
assumption of correlated errors (we have modified Heise’s notation to be

CamScanner
Reliability and Statistical Power 149

consistent with our prior usage). For example, the covariance between the
two first order variables X and Z is given as follows:

C(X, Z) = C(TX, TZ) + C(εX, εZ)     (8.16)

where C(e.v, ez) is the covariance between the measurement errors in X


and Z. Note that under classical measurement theory, covariances be­
tween first order terms were unaffected by measurement error.
The covariance of a crossproduct term with a component is given as
follows:

C(XZ, X) = C(TXTZ, TX) + TZσ²εX + 2TXC(εX, εZ)     (8.17)

The second and third terms in equation 8.17 represent the error portion
of the covariance between the observed crossproduct and the component.
A comparison of equations 8.17 and 8.9 shows that permitting errors to
be correlated introduces an additional term reflecting the error covari­
ances, namely, 2TxC(ex, e2).
To correct the covariance matrix of the predictors for error, Heise sub­
stituted sample means for true scores. He estimated error variances and
covariances from multiple measurements on each value of each predictor.
The mean observation on one such point was taken to represent a true
score. The variance of the observations on the single predictor value pro­
vided an estimate of the error variance; the covariance between (he re­
peated observations on single cases across two predictors provided an
estimate of the covariance between errors. These estimates were pooled
across all cases and were used to adjust the covariance matrix of the pre­
dictors. Corrections for the variances of second order (XZ) and third order
(XZW) crossproducts and their covariances followed Heise’s derived
expressions. Reliabilities of the nine individual scales all exceeded .90
(except for one scale for one subgroup of raters). In this case the corrected
regression estimates did not differ dramatically from the uncorrected es­
timates.
In a subsequent simulation study Heise varied sample size (л — 200,
350, 500) and the reliability of the first order predictors (.70, .90). Bias
was always reduced on average with the method, even for the smallest
sample size. However, the corrected estimates varied substantially across
replications, particularly with reliabilities of .70, and large sample size
did not compensate for unreliability.
We are not surprised at the unstable solutions obtained when the reli­

CamScanner
150
MULTIPLE REGRESSION

ability of the predictor was .70. If the predictors arc uncorrelated th


under the assumption of uncorrclatcd errors the reliability of the sec
order crossproduct (XZ) term would be .702 = .49, and the reliability f
a third order crossproduct (XZJV) term, .703 = .34. No correction method
can be expected to salvage analyses containing variables so fraught with
measurement error. Yet, if our first order predictors have reliabilities of
even .70, these arc the levels of measurement error in our higher order
terms. As a final note, Heise’s correction may lead to covariance matrices
that arc not positive definite; then there is no proper solution for the cor­
rected regression coefficients.3

Correcting the Covariance Matrix:


Constraint for Proper Solution
The work of Fuller (1980, 1987; Fuller & Hidiroglou, 1978) provides
an improvement upon Heise’s approach. The corrections applied to the
covariance matrix are constrained so that the corrected covariance matrix
is positive definite, that is, it has the statistical structure of moment (co­
variance, correlation) matrices that will yield proper solutions for cor­
rected regression coefficients. The procedure also takes into account the
error of estimation in reliabilities when population values of reliability
coefficients are unknown. The procedure produces efficient estimates of
regression coefficients and consistent estimates of the corrected covari­
ance matrix of the regression coefficients, from which corrected standard
errors of the regression coefficients are derived.
Feucht (1989) provides a Monte Carlo simulation study comparing the
results of Heise’s corrected estimator approach, Fuller’s corrected/con­
strained estimator approach, and the standard ordinary least squares (OLS)
regression approach (without correction) in small samples (n = 60, 90)
with varying predictor reliability (all .90, all .60, or a mixture of 60, :
.90). The regression equation examined contained three first order terms
and one two-way interaction. .
A problem known as matrix indefiniteness (see note 3) of the covan-
ance matrix of predictors resulted from the corrected estimator procedure-
The prevalence of the problem increased as sample size and reliability .
decreased, to 54% of all samples with the lowest reliability (.60 for
three first order predictors) and smaller sample size (n = 60).
indefiniteness produces improper regression estimates. With regard to л
of estimates, the corrected/constrained approach produced less bi a
estimates, and the corrected approach more biased estimates relative
OLS, particularly for the crossproduct term. <

cs CamScanner
Eeiiabiliiy and Statistical Power 151

Corrected/eonstrain cd estimates showed slight attenuation bias. pro


Adding a conservative correction, rather than the overadjustment of other
methods. In terms of the efficiency of the estimates, OLS produced the
most efficient estimates (smallest variance of the estimates across sam­
ples). with the corrcctcd/constraincd approach being less efficient fn
contrast. Heise’s corrected approach produced very large variances, con­
sistent with the findings of Heise’s (1986) simulation, fn general, perfor­
mance of all estimators, in terms of both bias and efficiency, deteriorated
dramatically with decreases in reliability. Interestingly, mixed reliabilities
m the predictors in some cases led to worse performance than uniformly
low reliability. Reflecting the importance of sample size, an increase from
rt « 60 to n = 90 produced marked improvements in bias and efficiency
of all estimates. Corrcctcd/constraincd estimators were the best of the
three alternative approaches with small samples and low reliability

Latent Variable Approaches to Estimates


of Regression Coefficients in the
Presence of Measurement Error

An important alternative approach to the correction of estimates for


measurement error is provided by structural equation models with latent
variables. These models are the focus of a large and complex literature
(seec.g., Long, 1983a, 1983b, for an introduction; Bollen, 1989; Hay-
fluL 1987; Joreskog &, Sorbom, 1979, for more advanced treatments) and
require specialized statistical software to provide estimates and standard
errors of parameters. Here, we will only outline the basic procedure and
assumptions of the approach for equations containing curvilinear effects
or interactions.
The structural equation modeling approach conceptually develops two
seis of equations: (a) measurement equations that describe the relationship
«f each of the measured variables to an underlying latent variable; (b)
itructund equations that describe the relationship of each of the latent
predictor variables to the criterion variable(s). The structural coefficients
otaained from (b) correspond to regression coefficients that have been
ewrected for measurement error.
Io understand the measurement portion of the model, imagine a re-
ieiwher is attempting to measure socioeconomic status (STS). Sfc'.S is a
latent variable that cannot be measured directly, so typically three indo
cators of the latent variable аге measured: income (/), occupation (<7),
^nd education (£). Each of these single indicators is an imperfect measure

CamScanner
152 MULTIPLE REGRESSION'

of SES, that is, they contain measurement error. However, the comnw
variance shared by all three indicators provides an excellent representation
of the SES latent variable. A measurement model can be constructed &
represent SES. The set of measurement equations would be as follows.

/ =5 X,(SES) + e,
о « XjCSES) 4-

X3(SES) + c3

In these equations, the Xs represent factor loadings, the correlation be­


tween the measured variable and the latent variable; the cs represent mea­
surement error. The factor loadings and the errors are estimated ushf
specialized software such as LISREL 7 (Joreskog & Sorbom, 1989) tst
EQS (Bcntler, 1989). With four or more indicators of the underlying la­
tent factor, tests may be performed to determine if a single factor is ssf-
ficicnt to account for the data. The factors produced through this approach
are corrected for measurement error.
These factors are then used in the structural part of the model. Ccn-
ceptually, a regression analysis is performed using the factor scores as j
predictors and the criterion providing theoretically error-free estimates of
the structural (regression) coefficients. In practice, the measurement xvd
structural portions of the model are estimated simultaneously. The esc * i
mation technique assumes multivariate normality, that the latent variable
*
are not correlated with the errors, and typically that errors of measures
^
* •
are uncorrelated. This last assumption can be relaxed through the speci­
fication of a precise pattern of correlated errors in some models.
Some work in this area has addressed interactions and curvilinear ef­
fects in the structural portion of the model. Alwin and Jackson (1ШХ |
and Byrne, Shavelson, and Muthdn (1989), and Joreskog (1971) ..1
considered methods of comparing models across groups, a case cores j
sponding to the interactions between a categorical (group) and contm^
(factor) variable considered in Chapter 7. The approach involves :
parison of two models: In model 1, the unstandardized structural (re^'
sion) coefficients are set to be equal in each group (no interaction^ ® ■
model 2, the structural coefficients are estimated separately in each
A comparison of the goodness of fit of the two models provides a test |
the interaction term. This approach also permits tests of the cquiv^^ Ж
of the measurement portion of the model across the groups. Thisa^n^’ |
is easily implemented, and several examples exist in the literature ($*** |

CamScanner
Reliability and Statistical Power 153

e.g., West, Sandler, Pillow, Baca, & Gcrstcn, 1991, for an empirical
illustration).
Kenny and Judd (1984) have developed methods fortesting interactions
and curvilinear effects involving continuous latent variables. They show
that, by forming products of the indicator variables, all of the information
is available that is needed to estimate models containing X * 2 and X ,
*
Z
where X * and Z * arc latent variables. For example, to estimate X * 2 in
the curvilinear case with two measured variables, all information needed
to estimate the variance and covariance terms in the model can be derived
based on the products of the measured variables X| and X2 (i.e., X2, X2,
and XjX2). For the latent variable X Z
* interaction with two measures
each of X * and Z *
, the crossproducts of the measured variables X(, X2
and ZH Z2 0-e-» ^1^1» ^1^2» X2Z|, and X2Z2) provide the starting point
for the estimation of the model. In each case, the products of the measured
variables become the indicators for the corresponding latent variable. In
a simulation of the performance of the model in the presence of measure­
ment error in the observed variables, Kenny and Judd showed that their
approach provided good estimates of the parameters in a known under­
lying true model.
Bollen (1989) noted three obstacles to the use of the Kenny and Judd
approach. First, there was initially considerable difficulty in implement­
ing the procedure in the widely available EQS and LISREL programs,
although Hayduk (1987) and Wong and Long (1987) have recently de­
scribed successful methods. Second, the formation of the products of the
indicator variables violates the assumption of multivariate normality that
is necessary for the estimation procedure to produce correct standard er­
rors. Alternative estimation procedures (Browne, 1984) exist in the EQS
and LISREL programs that do not make this assumption; however, these
procedures require large sample sizes to produce proper estimates (Bender
&Chou, 1988; West & Finch, in press). Third, Kenny and Judd assumed
that the latent variables and disturbances (term representing unexplained
variation in a latent criterion variable) of the components of the latent
variables are normally distributed. These assumptions can be tested using
the EQS program. Again, violation of this assumption may require the
use of alternative estimation procedures or respecification of the model to
achieve proper estimates.
The Kenny and Judd approach to correction of measurement error in
regression equations involving curvilinear effects or interactions has shown
considerable promise to date in a small number of studies. However, the
approach has to date been difficult for most researchers to implement,
precluding its more frequent use in the literature.

CamScanner
■Л
154 MULTIPLE REGRESSION
<4
. J
Can Measurement Error Produce Spurious Effects?
We have focused on the extent to which measurement error attenuates
regression estimates. It is also possible that measurement error might lead
to effects being observed in the sample that do not actually exist in the
population, that is, spurious effects. ■ ■ ■ ■ \

Spurious First Order Effects? In equations containing interactions, first


order effects become biased and unstable when predictors contain mea­
surement error. We should not be surprised at this conclusion, given the
low reliability of the product term relative to the first order effects. Cohen
and Cohen (1983) point out that it is the reliability of the partialled vari­
able (in this context XZ) that has a profound effect on estimates of other
variables (here X and Z). Evans (1985) found substantial lability in the
joint variance accounted for by first order effects in regression equations
containing interactions when there was measurement error in the predic­
tors. Feucht’s (1989) simulations also showed that in the regression equa­
tion Y = b}X + b2Z + b3W + b4XZ + bQ, the variances of the estimates
of b} and b2 from each of three approaches (OLS, corrected, and cor-
rected/constrained) exceeded those of b3. Low reliability had a more del­
eterious effect on both bias and stability of all three estimates of bt and
b2 than on the estimates of b3. Finally, Dunlap and Kemery (1988) found
inflated Type I error estimates for the first order coefficients of compo­
nents of the interaction. These findings suggest that estimates of condi­
tional effects of first order terms in equations containing interactions (see
Chapters 3 and 5) are likely to be positively biased in the presence of 1
measurement error. That is, the estimates of the first order terms will be ;
further from zero than their corresponding population values.

Spurious Interactions? Although spurious interactions have not been j


shown to occur when there are random errors of measurment in the pre- |
dictors, the effects of other types of measurement error on interactions ।
should also be considered. Evans (1985) investigated the effects of cor­
related (systematic) measurement error between the predictors and the
criterion on estimates of interactions. Correlated measurement error may *
be expected when similar methods are used to collect measures on the
predictor and criterion variables (e.g., all measures are self-report ques' i
tionnaires). Evans conducted Monte Carlo simulations that varied (a) the S
magnitude of the interactions in the regression equation (no interactions, ‘
weak interactions, and strong interactions), (b) the level of correlated er-

CamScanner
I
Reliability and Statistical Power 155

; ror between predictors and criteria, and (c) the level of predictor reliability
(random measurement error). In his study using a large sample size (n =
760) there was no evidence that correlated measurement errors produced
spurious interactions, although these systematic errors did attenuate the
size of the estimates of interaction effects. Random measurement error
attenuated both first order and interactive effects. When the reliability of
the predictor variables comprising the interaction was .80, the variance
explained by the interaction effect was reduced by half relative to the true
variance in the population without measurement error.
Busemeyer and Jones (1983) and Darlington (1990) have noted one set
of conditions in which spurious interactions can be produced in observed
data even though none exist in the population. The conditions involve
nonlinear measurement models, for example, X = k(Tx)]^2 + e* , that
ь violate the fundamental assumption of classical test theory that observed
„ scores must be linearly related to the underlying true scores. Thus mea­
surement instruments that have only ordinal rather than interval level
properties can produce spurious estimates of interactions and curvilinear
effects. Advanced methods of estimating nonlinear measurement models
1 do exist. The interested reader may consult Etezadi-Amoli & McDonald
(1983) and Mooijaart & Bentler (1986) for discussion of one class of
methods for estimating nonlinear measurement models.
Г Comment

As anticipated by Cohen and Cohen (1983), the statistical and social


? science literatures have increasingly addressed the effects of measurement
error in regression analysis. More will be learned about the effects of
/ measurement error and behavior of corrected estimates as future simula­
tion studies build on the basic work of Dunlap & Kemery (1987), Evans
(1985), and Fcucht (1989). New theoretical and empirical developments
/ should continue and the approaches to correcting estimators for measure-
■4 ment error should become more accessible to researchers. Nonetheless,
as better and more accessible methods of deriving corrected regression
estimates in the presence of measurement error become available, this
// does not mean that social scientists can relax their concerns about im-
V proving measuring instruments. Each of the approaches to correcting for
У measurement error is based on strong assumptions. If these assumptions
•7 are seriously violated, then the corrected regression coefficients will be
seriously biased. Highly reliable measures in studies with adequate sam-
Z pie sizes produce the strongest social science.

■fe till
cs CamScanner
156 MULTIPLE REGRESSION

Statistical Power

Many authors have commented on the weak power of tests for inter­
action terms in MR, particularly in the face of measurement error (e.g.,
Busemeyer and Jones, 1983; Dunlap & Kemery, 1988; Evans, 1985). The
question has been raised with respect both to interactions involving two
(or more) continuous variables as well as to interactions involving cate­
gorical and continuous variables (see Chaplin, 1991; in press; Cronbach.
1987; Cronbach & Snow, 1977; Dunlap & Kemery, 1987; Morris, Sher­
man, & Mansfield, 1986; Stone & Hollenbeck, 1989). In this section we explore the power of tests of the interaction in the equation Y = b1X + b2Z + b3XZ + b0, closely following Cohen's (1988) approach. We begin
by considering relationships among various measures of the impact of the
interaction: effect size, partial regression coefficient, partial correlation,
and gain in prediction (difference between squared multiple correlations
or semipartial correlations). We examine sample size requirements nec­
essary to detect the XZ interaction when X and Z are measured without
error. We then explore the impact of measurement error on effect size,
variance accounted for, power, and sample size requirements as a func­
tion of (a) the correlation between predictors X and Z and (b) the variance
accounted for by the first order effects. Finally, the results of recent sim­
ulation studies of power of tests of interactions are presented.

Statistical Power Analysis


The power of a statistical test is the probability that the test will detect an effect in a sample when, in fact, a true effect exists in the population. As shown by Cohen (1988), the power of the statistical test depends on
several parameters:

1. The specific statistical test that is chosen (e.g., parametric tests that use al

available information are more powerful than nonparametric tests),

2. the level of significance chosen (e.g., a = .01, or .05);


3. the magnitude of the true effect in the population;
4. sample size (n).

Cohen (1988) has suggested that .80 is a good standard for the minimum
power necessary before undertaking an investigation. This suggestion has
been accepted as a useful rule of thumb throughout the social sciences-
In considering the power for the XZ interaction term in an equation


containing A’ and Z, wc will treat the first order terms X and 7, ач a “mu*'
of variables: set M for first order (“main”) effects. The X7. term will
constitute a second “set” I for interaction/ With Y as the criterion, we
define the following terms:
r j mi* squared multiple correlation resulting from combined predic­
tion by two sets of variables M and I, where M consists of X
and F, and 1 consists of XY
rj M: squared multiple correlation resulting from prediction by set
M only
rj
* n mv the squared semipartial (or part) correlation of set f with the
criterion; r>4! M) «= r^Mf - r^M. This is the gain in the
squared multiple correlation due to the addition of set I (the
interaction) to an equation containing X and Z (set M). Put
otherwise, it is the proportion of total variance accounted for
by set I, over and above set M.
r}q M: the squared partial correlation of set 1 with the criterion, or
the proportion of residual variance after prediction by set M
that is accounted for by set I,

effect size for set I over and above set M. where effect size is
defined (Cohen, 1988) as the strength of a particular effect,
specifically the proportion of systematic variance accounted
for by the effect relative to unexplained variance in the cn
tenon:

r2 - r2
j _ > У-Mi rГМ

1 “• Г > Ml

The reader should note that the numerators of the squared partial corre­
lation and effect size are identical and are equal tv r w>, the squared
sernipartiul correlation. However, the denominators of the squared partial
correlation, r2n M, and the effect size, f2> differ. The denominator of
f п.м is the residual variance after prediction from set M; the denominator
of/2 is the residual variance after prediction from sets .M and I. Most
importantly, the reader should note that the squared sentipartial correla­
tion, or gain in squared multiple correlation with the addition of set 1 to
set M, is not linearly related to effect size, Il is the effect size fz (or the


closely related squared partial correlation, Ги.м) that *s directly related


to statistical power and not the squared semipartial correlation. This re *
alization will clarify findings in the literature such as that of Evans (1985)
in a Monte Carlo simulation “that an interaction term explaining 1% Of
the variance [squared semipartial correlation] is likely to be significant
when the preceding first order effects have used up 80% of the variance
in the dependent variable. A similar interaction is likely to be insignificant
if the first order effects only absorb 10% of the dependent variable” (p
317). With 80% of the variance explained by set M, a 1% increase in
predictable variance has effect size .05; with 10% of the variance ex *
plained by set M, the 1% gain has effect size .01.5
Jaccand, Turrisi, and Wan (1990) have provided a useful table (their
Table 3.1, p. 37, is partially reproduced here as Table 8.2) for determin­
ing sample sizes required to achieve power .80 for a test of XZ interaction
at a ~ .05. The rows of the table are r|<M, the squared multiple com *
lations from set M, and the columns the squared multiple correla­
tion from both the “main effects” and the interaction. The entries are
numbers of eases required under these conditions. For a constant differ­
ence between г^м1 and r2yM, that is, a constant squared semipartial cor­
relation the sample size requirements decrease systematically as
r2r M increases, as can be seen on the diagonals of the table. Table 8.2
adds effect size estimates to Jaccard et al.’s (1990) sample size require­
ments. An examination of the diagonal entries shows that as r|M in­
creases, effect size increases, accounting for the systematic decrease in
sample size requirements with increasing r2M. Table 8.3 explores the
variation in effect size f2 and the squared partial correlation for
constant squared semipartial correlation r2(LM) as r|M increases, again
showing the increase in effect size for constant r|(1 M) with increasing
r2rM. The reader should not lose sight of the distinction between the J
squared semipartial correlation, г х(1.м>» and effect size, /2, in reviewing
literature on statistical power of the interaction.
Cohen (1988) has provided some useful guidelines for interpreting cf- ■
feet sizes in the social sciences. Effect sizes around f2 =» .02 or squared |
partial correlations r2yi M = .02 are termed “small,” around/2 « ,15 or ?
ги.м ~ -13 are termed “moderate,” and aroundf2 « .35 or r|<M |
are termed “large.” Cohen’s reviews, as well as comprehensive meta- 'r’
analytic reviews, have indicated that large effect sizes are rarely obtained |
in most areas of tire literature in social science, education, and business- |
Using Cohen’s power tables for power ,80 at a = .05, and assuming no
measurement error in the predictors, the number of cases required to de- j

■ . ■ ■ ■ • . *


Table 8.2
Effect Sizes Associated with Varying Combinations of r²Y,M and r²Y,MI (Adapted from Jaccard et al., 1990, Table 3.1)

                                          r²Y,MI
r²Y,M       .10     .15     .20     .25     .30     .35     ...     .50

  .05      .06a    .12     .19     .27     .36     .46             .90
           143b     68      43      32      24      19              10

  .10              .06     .12     .20     .29     .38             .80
                   135      65      41      29      22              10

  .15                      .06     .13     .21     .31             .70
                           127      60      39      27              13

  .20                              .07     .14     .23             .60
                                   119      57      36              15

  .25                                      .07     .15             .50
                                           111      53              17

  .30                                              .08             .40
                                                   103              22

NOTE: This table is adapted from Jaccard et al. (1990, Table 3.1), which provides sample size requirements for the test of the XZ interaction in the regression equation Y = b1X + b2Z + b3XZ + b0, for power = .80 at α = .05. Effect size estimates have been added.
a. Effect size
b. Sample size required for power .80 at α = .05

Table 8,3
Variation in Sample Size Requirements, Effect Size (/2), and Squared Partial
Correlation (Hi m) at a ~ .05 for Constant Gain in Prediction or Semiparttai
Correlation [r|lLM)l
Л 't . • . *

ГК»«
'г M ... ыл /' гр/.м

■30 .05 .25 24 36 .26


. <35 .10 .25 22 .38 .28
40 .15 .25 21 .42 .29
• 45 .20 .25 19 45 .30
-50 , .25 -25 17 .50 .33
: <■' - - ■
'■ ■ ■. ‘ ■ ■'' . ■ ■ ■
Required, for « .80, from Jaccard et al. (1990. Table 3.1. p. 37)
gfe Ш \ '■ - ■

CamScanner
160 MULTIPLE REGRESSION !

tcct the XZ interaction (set I) are n = 26, 55, and 392, for large, mod­
erate, and small effect sizes, respectively. Readers should note that then
of 55 for moderate effect size exceeds the majority of ns in Jaccard et
al.’s (1990) complete Table 3.1, because the majority of effect sizes in
that table exceed f2 = .15, the value that defines a moderate effect size.
Readers should not be misled to think that small sample sizes suffice to
detect interaction effects of the strength typically found in the social sci­
ences.

The Effects of Measurement


Error on Statistical Power

In this section we examine the effects of measurement error in the pre­


dictors on several indices of effect size and statistical power. We have
previously shown that the reliability of the product term increases with
increases in the correlation between the components (rx z). We have also
pointed out that the percentage of variance accounted for by first order
effects (r2 M) impacts upon the effect size of the interaction. Hence, both
r|M and r2x Y are varied in our study.
The basis of our examination is Cohen’s (1988) procedures for calcu­
lating the statistical power of tests in MR. This treatment of power as­
sumes normal distributions of all predictors. We follow this tradition,
assuming that X and Z are bivariate normal. However, the crossproduct
term XZ will not be normally distributed: The product of two normally
distributed variables does not have a normal distribution. Hence, our es­
timates of power, effect size, and variance accounted for are probably a
bit high and our estimates of sample size requirements are probably a bit
low relative to the true values (Jaccard et al., 1990). However, because
the purpose of power calculations is to provide a “ballpark” estimate of
the number of subjects required for an adequate study, this is not a serious i
limitation.
We begin by assuming no measurement error in the predictor variables. j
We use three effect sizes (/2 » .35, .15, .02, or large, medium, and |
small), three levels of variance accounted for by the combined first order J
terms (r2 M = 0, .20, .50), and two values of interpredictor correlation
(rx.z = 0» -50). From these values we compute Гу.мп the variance ac- j
counted for by the first order effects plus interactions, and Гуп.мр -I
squared semipartial correlation or variance accounted for by the interne- |
tion over and above main effects. Under the further assumption that the |
predictors have identical correlations with the criterion (validities), that j

CamScanner
Reliability and Statistical Power 161

is, = rr,z> we solve for the values of these validities that produce
„2
Г r.M’
We then introduce measurement error by assuming that predictors X
and Z and the criterion Y have reliability .80. We attenuate the correla­
XiZ, rYM, Гпм» ''hi m))
tions involved in the power analysis (r2KX, r2yz, r2
for measurement error under the assumptions of classical test theory (Lord
& Novick, 1968) and Bohmstedt and Marwell’s (1978) work on the re­
liability of the product term. Finally, we recompute the effect size/2 for
the test of the interaction based on the attenuated correlations. From/2
we recalculate statistical power, assuming that the researcher had used the
sample sizes required for power .80 in the error-free case (n — 26, 55,
392 for large, moderate, and small effect sizes, respectively). For mod­
erate and large effect sizes, the analysis is repeated with reliability of X,
Z, and Y of .70.
Table 8.4 shows the effect of reduced reliability on effect sizes and
variance acounted for by the interaction. Table 8.5 shows the effect of
reduced reliability on the power of the test for the interaction assuming
the ns necessary for the error-free case were utilized. It also shows the
sample size required to produce power .80 for the interaction at a = .05.

Effect Size and Measurement Error


Consider first the reduction in effect size for the interaction from an
initial large effect size (/2 = 0.35, Table 8.4a). The effect size/2 is
shown in row 1 across the six combinations of variance accounted for by
the main effects (r2M - 0, .20, .50) and interpredictor correlation
(rjr.z = 0, .50). If the reliabilities of predictors and the criterion are re­
duced to .80 (row 2 of large effect size section in Table 8.4), then the
effect size decreases to half of its original value for M ~ 0 and to one
third of its original value for r2YM — .50. For moderate effect size (/“ -
.15, Table 8.4b) and small effect size (/2 = .02, Table 8.4c) the pro­
portional decreases in effect size are very similar. The general pattern
that emerges is that when reliabilities drop from 1.00 to .80, the effect
size is reduced by a minimum of 50%; when reliabilities drop from 1.00
to .70, effect size is approximately 33% of its original size.

Variance Accounted for and Measurement Error


2/he percentage of variance accounted for by the interaction term (i.e.,
rr<i.M)) over and above first order effects follows a similar pattern. Again
consider large effect size (/2 = .35) in Table 8.4. Row 4 of Table 8.4a
' for large effect size shows that when f2 is held constant at .35, and the
■„ ■■■ J : > ‘ ; ; ‘

CamScanner
162 MULTIPLE REGRESSION

Table 8.4
Impact of Reduced Reliability on Variance Accounted for [r ki.mJ and Effect
Size (/2) of the Interaction in the Regression Equation
P = b\X + b2Z + byXZ + b0

r* 0 .20 .50
' ».M

rx.z 0 .50 0 .50 0 .50

a. Large Effect Size f2 = .35 (r2M M = .26)

Reliability Actual effect size at n = 26


LOO .35 .35 .35 .35 .35 .35
.80 .15 .17 .14 .16 .11 .13
.70 .10 .12 .09 .11 .06 .08

Reliability Actual r r(i.M) at n = 26


1.00 .26 .26 .20 .20 .13 .13
.80 .13 .15 .11 .12 .07 .07
.70 .09 .11 .07 .09 .04 .05

b. Moderate Effect Size f2 = .15 (r2n M — .13)

Reliability Actual effect size at л = 55 ■ ">

1.00 .15 .15 .15 .15 .15 .15


.80 .07 .08 .07 .07 .05 .06
_• V/3
.70 .05 .06 .04 .05 .03 .04
' •4

Reliability Actual at л = 55
1.00 .13 .13 .10 .10 .07 .07 ■ -i
.80 .07 .07 .05 .06 .03 .04
.70 .04 .05 .04 .04 .02 .03
■ --'Я
■ ■ •■’MM

. 1

c. Small Effect Size/2 = ,02 (r2n M = .02)



Reliability Actual effect size at n - 392 . -1-*i

1.00 .02 .02 .02 .02 .02 .02


■ '-"1
.80 .01 .01 »<h5 .01 .01 .01

Reliability Actual r^i.M) at л = 392 . • •


1.00 .02 .02 .02 .02 ♦oi .01
.80 .01 .01 .01 ,oi .01 .01

CamScanner
Reliability and Statistical Power 163

reliability of the predictors is perfect (1.00), the variance accounted for


by the interaction is .26 when r2.M = 0, .20 when r2,M - .20, and .13
when Гу.м = .50. When reliability drops to .80, the variance accounted
for the interaction decreases by 50%. When reliabilities are .70, the
variance accounted for by the interaction is only 33% to 50% of that
accountedfor when reliabilities are 1.00. This pattern is true for moderate
and small effect sizes as well.

Statistical Power and Measurement Error

Table 8.5 addresses statistical power with reliabilities less than 1.00.
The table is structured so that at reliability 1.00, the power for each effect
size is .80. Note that the required sample sizes differ across effect sizes
to produce constant power of. 80. For the large effect size portion of Table
8.5, all power calculations are based on n = 26; for the moderate effect
size portion of the table, on n = 55; for the small effect size portion, on
n = 392. The pattern for loss of statistical power follows what we have
already seen for effect size and variance accounted for. Power is reduced
by up to half by having reliabilities of. 80 rather than 1.00 and is reduced
up to two thirds when reliabilities drop to . 70.

Sample Size Requirements and Measurement Error

The sample size required to produce power of .80 at a = .05 increases


dramatically as reliability decreases (see Table 8.5). When reliabilities
drop from 1.00 to . 80, the sample size required to reach power . 80 at
a - .05 is slightly more than doubled. When reliabilities drop to. 70, the
sample size requirement is over three times higher than when reliabilities
are 1.00. For example, for moderate effect size, whereas n - 55 is re­
quired to detect an interaction when predictors are error free, sample sizes
of over 200 may be required when predictor reliabilities are .70. The cost
of measurement error on research implementation is enormous if adequate
statistical power is to be achieved.

A Note on Variance Accounted for by


t
Г&Л ■ ■ 1.■■■ ■ .
Main Effects and Interpredictor Correlation
. . . ■

;; :The greater the proportion of variance accounted for by the fust order
effects, the sharper is the decline in the effect sizes, variance accounted
for, and power of the test for the interaction term as reliability decreases.
JB Required sample sizes increase accordingly. We saw earlier in this chap-
■ terthat as inteipredictor correlation increases, the reliability of the product

CamScanner
164 MULTIPLE REGRESSION

Tabic 8.5
Impact of Reduced Reliability on Power of the Test for the Interaction and on
Sample Size Required for Power .80 at a « .05 in the regression equation P =s
h,X + b2Z + MZ 4- bo

4
* 0 .20 .50
0 .50 0 .50 0
ГД 7 .50

a. Large Effect Size/2 = .35 (r2n M = .26)

Reliability Power at n = 26 . i
. ;1

1.00 .80 .80 .80 .80 .80


.80
.80 .45 .49 .41 .46 .34
.38
.70 .31 .37 .29 .34 .21
.26

Reliability Required n for power = .80


LOO 26 26 26 26 26 '■'■J

47 59 52 26
.80 54 75
84 68 94 75 64
.70 127
■ Л

100
'■

b. Moderate Effect Size f2 = .15 (r^ M = .13)

Reliability Power at n = 55
LOO ’
.80 .80 .80 .80
.80
.80 .48 .52 .44 .49 .80
.37
.70 .34 .40 •31 .36 .41
.25
.29

Reliability Required n for power = .80


..,J
LOO 55 55 55 55
55 55
.80 109 99 122 Ю8
153
132
.70 169 139 192 155
257 207
■ '
'.’•X

c. Small Effect Size/2 = .02 (r2„ M = .02) •Я

Reliability Power at n = 392


1.00 .80 .80 .80 .80 .80 ■ •:
.80
.80 41 .55 •47 .52 .39 .44
..3
Reliability Required n for power = .80
LOO 392 392 392 392 392 392
.80 774 692 841 752 1056 909

■' .
■;
*
■■'"•'Л
::

CamScanner
Reliability and Statistical Power 165

term increases. Increases in rx z very slightly offset the loss of statistical


power.

Some Corroborative Evidence:


Simulation Studies

Our presentation of the effects of measurement error on power is based


on Cohen’s (1988) power calculations and assumptions of classical mea­
surement theory. Evans (1985) has provided some corroborative evidence
using Monte Carlo simulations. Effect sizes were found to be radically
reduced by the addition of measurement error. Large effect sizes were
reduced to medium effect sizes, just as is shown in Table 8.4. Evans
reported that interactions accounting for 1% of the variance would be
detectable if the variance accounted for by the first order effects rJ-M were
.80; the sample size in his simulation was n — 760 cases per sample,
which is in the range of sample sizes reported in Table 8.5 as being nec­
essary to detect an interaction having a small effect size accounting for
1% of the variance. As a final note, Evans reported a decrease by half in
variance accounted for by the interaction when predictor reliabilities de­
clined from 1.00 to .80; this is the magnitude of decrease reported in
Table 8.4.
Two additional simulation studies provide useful models for the study
of power in interactions (Dunlap & Kemeiy, 1988; Faunonen & Jackson,
1988). Dunlap and Kemery (1988) provide an extensive simulation study
of the effects of measurement error on the power of tests for first order
effects and interactions in small samples (n - 30). Criterion reliability
was held constant at .70; predictor reliability for X and Z is varied (.20,
•50, .80, or 1.00 for both predictors, as well as all combinations of mixed
reliabilities of X and Z). Among the regression models tested were the
following:

Model 1: Y = OX + 0Z 4- 1XZ,

a pure interaction model, and

Model 2: Y = IX + 1Z + 1XZ,

in which each first order term and the interaction share equally in prcdic-
tion.

CamScanner
166 MULTIPLE REORESS|q /

For each model and each combination of predictor reliabilities, the /


portion of significant interactions in 10,000 replications was reponed '
ble 8.6 provides a small sample of the results. In the table, “Obse^ /
Power" refers to the simulation results and "Cohen power" refers to tk^ /
result in Table 8.5. Note that with predictor reliabilities of 1.00, the^ j
served power approaches 1.00 in model 1 and exceeds .90 in model 2
When both predictors have reliability .80, observed power exceeds ,9q j ’ ■
model 1 and is .69 in model 2. These power levels seem very high, gjvJJ
the demonstration in Tables 8.4 and 8.5—and they are. The high levej
of statistical power are the direct result of the very large effect sizes th^ ,
зге considered in the simulation.
We computed the effect sizes (reported in Table 8.6) based on Coherfs
(19S8) procedure for six of the conditions explored by Dunlap and Kem.
ery (the six combinations of models 1 and 2 with 1.00, .80, and .50
reliability). Wrhen the reliability was .80 or better, the effect sizes
ranged from .31 to 2.45, levels that are extraordinarily rare in practice in
the social sciences’ Finally, we calculated power for the interaction terms
according to the approach we used to generate Table 8.5. The theoretical
power calculations (given as "Cohen power’’ in Table 8.6) are similar to
the observed power from the simulation.
Paunonen and Jackson (1988) also conducted a simulation study of the
power for the interaction in model 2 with error-free predictors and sample
1
Table 8.6
Comparison of Observed Power in a Simulation (Dunlap & Kemery, 1988) and
Power Calculations Based on Cohen (1988)

Model 1
Predictor
Reliability Y = OX + OZ + JXZ У ~ IX + IZ + IXZ
J
1
Observed * *Cohen Observed * *Cohen ■5
Pxx Pzz Power Power /
* Power Power

1.0" 1.00 -*1.00 -*1.00 2.45 .93 -*1.00 .78


.80 .80 93 .88 .39 .69 .80 .31
.08
.50 .50 .54 .66 .22 .29 .31
'j
NOTE: Predictors X and 2 are uncorrelated (one of the conditions of Dunlap and Kemery). The criterion
У has reliability .70 throughout, and n - 30 per sample.
•Power found in Dunlap and Kemery (1988) simulation under condition that rx z =s 0.
bPower calculated according to Cohen (1988) with appropriate disattenuation of correlations for unreli­
ability- . . .
c Effect sizes based on Cohen’s (1988) formulation. .

CamScanner
Reliability and Statistical Power 167

sizes of n - 100. They reported that the interaction was detected in 100%
of 1000 simulated samples, Measurement error was added to the criterion,
but the amount not specified. If we assume criterion reliability .70, equiv­
alent to that in Dunlap and Kcmcry (1988), then we would expect 100%
of the interactions to be detected, because the power approaches 1.00 for
this test even with substantially smaller samples of n ™ 30. Even when
the reliability of the criterion is .50, for n » 100 with perfectly measured
bivariate normal and uncorrclatcd predictors, effect size is large (f2
.33), and power approaches 1.00. Once again, interactions are detected
with 100% probability because they have very large effect sizes.
Paunonen and Jackson (1988) also provide simulations that match in
structure the real world data of Morris ct al. (1986), and hence are more
realistic in terms of effect sizes. For these simulations, moderator effects
were detected on average only 4.1% of the time with samples of size n
- 100. Such a detection rate is associated with effect sizes substantially
below small effect size/2 = .02. Indeed, for one case considered by
Morris et al. (1986), the effect size/2 equalled .001, according to a re­
analysis by Cronbach (1987).
Two recommendations come from this review of simulations of power
of interactions. First, it would be very useful for our understanding of the
simulations if authors would report effect size measures. This practice
would permit comparison both across studies and with normative expec­
tations for effect sizes in social science data. Second, in the absence of
such reports, readers should compute the effect sizes studies in the sim­
ulation so that they are not misled by reports of very high power in the
absence of measures of strength of the interaction.
Finally, we offer a caution regarding the interpretation of tests of
regression models containing random measurement error in the predic­
tors. Measurement error takes a greater toll on the power of interaction
effects, relative to first order effects. Measurement error also appears to
produce spurious first order effects but not spurious interactions (Dunlap
& Kemery, 1988; Evans, 1985). Taken together, these two factors will
lead to a greater apparent empirical support for theoretical predictions of
main effects at the cost of support for theoretical predictors of interac­
tions.

The Median Split Approach:


The Cost of Dichotomization
A commonly used alternative to the multiple regression approach pre­
sented in this book is to perform median splits on each of the predictor

CamScanner
168 MULTIPLE REGRESSION

variables and then to perform an ANOVA. This strategy loses information


from each of the predictor variables, thus adding a new source of niea- ‘
surcment error—error due to dlchotomi/ation. The effects of this procc- ■
dure on simple correlations have been extensively studied in the literature ’
on coarse categorization of variables (eg.. Bollen & Barb, 1981; Cohen,
1083).
Cohen (1083) provides data comparing the correlations, lvalues, and
power that could be expected when A’ and Fare both continuous versus
when A’ is dichotomized. He shows analytically that, when two variables
arc sampled from a bivariate normal population, the value of the simple
correlation coefficient r between a dichotomized predictor and continuous
criterion is .798 of the value obtained when both variables arc continuous.
When the correlation between the variables is rXY “* .20; the Most value
for the dichotomized ease is reduced to .78 of the /-value obtained when
both variables arc continuous; when rXY = .70. t is reduced to .62 of the
value obtained with the continuous predictor. For a moderate effect size
(r «= .30), the power of the test for n — 80 and a = .05 is reduced from
.78 to .55. Thus, for the single predictor case, dichotimization leads to a
substantial drop in statistical power.
Similar analytical work has not been done for tests of interaction in
multiple regression because of the difficulty introduced by the nonnomul
distribution of the interaction term. As a first peek, we have conducted a
small scale simulation to examine the loss of power for the lest of the ,
interaction. Using the population regression equation, Y - 2.00X +
O.OOZ 4- 1.00XZ + .01 in which X and Z are correlated .32, we took
five samples of size 200 and estimated the regression equation. For each
sample, we also performed median splits on X and Z and subjected the
resulting data to a 2 X 2 ANOVA. The /-values for the test of the inter­
action obtained from the ANOVA based on median splits (/ with m de­
grees of freedom = F2 with (1, in) degrees of freedom) were approxi­
mately .67 of the /-values obtained using the standard MR procedure.
Although the exact amount of reduction in the the /-values will depend
on several parameters (e.g., the size of the interaction effect), this ex­
ample corroborates Cohen’s (1983; Cohen & Cohen, 1983) admonitions
concerning the cost of dichotomization on statistical power.

Principal Component Regression \


Is not a Cure for Power Woes J

' "V
Morris et al. (1986) proposed principal component regression (PCR) as A
a more powerful approach than OLS multiple regression for the analysis *
is ■ \

CamScanner
Reliability and Statistical Power 169

of interaction effects (see Cronbach, 1987, for a clear conceptual presen­


tation of the analysis). Morris et al. were particularly concerned about the
effects on power of the high levels of multicollinearity that often occur
between the interaction and its components in uncentered equations.6 In
an analysis of 12 real world data sets, they showed that OLS multiple
regression detected an interaction in only one case, whereas parallel anal­
yses using PCR found highly significant interactions in 10 of the 12 data
sets. However, PCR, though accepted as a method for handling multi­
collinearity of first order predictors, is not appropriate for multiple regres­
sion models containing interactions. PCR works by eliminating some por­
tion of the variance of each predictor. In PCR, the test of the interaction,
according to Cronbach (1987), is based on a comparison of two multiple
correlations. The first is that derived from prediction by those portions of
the X, Z, and XZ retained in the analysis; the second is that derived from
prediction by those portions of X and Z retained but from which the XZ
product term has been partialled. This can be thought of as crediting the
interaction with all predictable variation it shares with the first order terms.
Such a procedure stands in direct contrast to the usual methods of appor­
tioning variance among terms (those which apportion only unique vari­
ance to each effect, and those which apportion variance shared between
the first order effects and the interaction to the first order effects; see Over­
all & Spiegel, 1969). Hence, theoretically the interaction effect should be
overestimated in PCR. In a simulation, Paunonen and Jackson (1988)
showed that with a nominal a = .05 for the test of the interaction, the
observed a rate was .377. With simulated data structured to match the
real world data of Morris et al. (1986), significance was found in 61 % of
cases with PCR, but only 4% of cases with OLS. PCR does not provide
an acceptable method for improving the power of tests of interactions in
multiple regression (see also Dunlap & Kemery, 1987).

Coming Full Circle

In this chapter we have explored in detail the issues of reliability of


measures and statistical power, as they relate to the detection of interac­
tions in multiple regression. Measurement error severely lowers the sta­
tistical power of the test of the interaction and dramatically increases the
sample size required to detect interactions. In simulation studies where
evidence of high power for interactions has been reported, effect sizes
have been substantially larger than those experienced in practice. For ex­
ample, Chaplin (in press-а, in press-b) reviews literature on moderator

CamScanner
170 MULTIPLE REGRESSION

effects in three broad areas of psychology with a clear finding: Observed


effect sizes for interactions are very small, accounting for about 1% of
variance in outcomes. Similarly, Champoux and Peters (1987) report av­
erage percentages of variance accounted for by interactions in the job
design literature of 3%. To detect such effects, very large samples are
required. Using large samples will ameliorate problems of power that are
produced by measurement error. Large sample sizes will not, however,
decrease bias in regression coefficients that is produced by measurement
error. The social scientist is forewarned.

Summary

Chapter 8 considers the effects of random measurement error in the


predictor variables on the bias and power of tests of the terms in the
regression equation. Classical measurement theory is reviewed and is used
to illustrate the effects of measurement error on the covariance matrix of
the predictors, from which estimates of regression coefficients are de­
rived. Methods of correcting the covariance matrix for random measure­
ment error are presented, including methods recently proposed by Hiese
and Fuller. Another approach proposed by Kenny and Judd based on
structural equation modeling of latent variables is outlined. The concept
of statistical power is introduced and the relation of several commonly
used measures in regression analysis to power is considered. The power
of tests of the interactions and sample size requirements for adequate
power in MR are considered both for the error-free case and for the more
usual situation in which the predictors are measured with less than perfect
reliability. Power tables are presented that will be useful for researchers
in planning their investigations. Measurement error is shown in both our
theoretical analysis and in several simulation studies to lead to an appre­
ciable decrease in the ability of MR to detect interactions.

Notes
,
1. For those unfamiliar with path analysis, the basis of the analysis is typically ftunilUt
i
ordinary least squares regression. The terms path coefficient and structural ^efficient refcf
to standardized and unstandardized regression coefficients, respectively (Duncan, 1975).
2. Common approaches to measuring reliability make different assumptions concerning
the nature of the measures (e.g., test-retest correlations require parallel fonn equivalence;
Cronbach’s alpha requires parallel form or tau equivalence). Alwin and Jackson (|9S0)* !
it
Kenny (1979) present discussions of reliability of measures having varying phqxrties.

CamScanner
Reliability and Statistical Power 171

3. The constraint on the structure of a covariance matrix is that C(X,Z) s sxs2. If all
off-diagonal elements meet this condition, then the covariance matrix will be cither positive
definite or positive semidcfinitc. Л positive definite (PD) matrix is of full rank, that is, it
has all nonzero characteristic roots, a positive dctcnninnnt, and may be inverted, providing
the usual OLS solution for regression coefficients: b « tixJsXyt where SJx is the inverse
of the predictor covariance matrix. Л positive semidcfinitc (PSD) matrix is not of full rank;
there are linear dependencies among the predictors on which the covariance matrix is based
(e.g., entering three predictors that must together sum to 100 points). A PSD matrix has at
least one zero characteristic root, a zero determinant, and cannot be inverted; hence there
is no solution for the regression coefficients.
When a covariance matrix is adjusted for error a third condition may arise: matrix in­
definiteness (Fcucht, 1989). The condition C(X, Z) s sxsz is not met; there is at least one
negative characteristic root and a nonzero (though negative) determinant. Such a matrix
may be inverted; and hence a solution for "corrected” regression coefficients derived, even
though "such a moment matrix violates the structure and assumptions of the general linear
model, and is inadmissable in regression analysis.” (Fcucht, 1989, p. 80). Sometimes, but
not always, an indefinite covariance matrix of predictors will generate negative standard
errors. Whenever a corrected covariance matrix is employed, its determinant should be
checked before beginning the analysis.
4. We are following Cohen’s (1988) development and using similar notation for effect
size and other terras; thus the reader will find Cohen’s highly informative and useful treat­
ment of power quickly accessible.
5. There is a direct relationship between the reliability of the product term (Pxz.xz) and
rr«.M) = Pxz.xz(b3cTllTz)> where b] is taken from У = btX + b2Z 4- b3XZ 4 bQ,
and aTiTz is the variance of the product of true scores Tx and Tz. Thus the percentage of
attenuation in the variance accounted for in the criterion by the product term is a direct
function of product term reliability (Busemeyer & Jones, 1983).
6. PCR regression was developed by Mansfield, Webster, and Gunst (1977) to adjust
for multicollinearity in predictor matrices. To begin, the characteristic roots X, (/ = 1, p)
and characteristic vectors a{(i = 1, p) of the p X p covariance matrix of the predictors are
determined. Component scores are then formed on the set of p principal components, by
postmultiplication of the raw (n X p) data matrix X by the matrix of characteristic vectors,
that is, ut - Xalt where u, is the vector of component scores on the i th principal component.
Each u, is a linear combination of all the predictors. Those components (wt,«. • • • ut) (k
s p) associated with the large characteristics roots (Xt, Хг , • • • Xt) are retained for
analysis. Components associated with very small characteristic roots are deleted. The cri­
terion Y is regressed on the к orthogonal principal components that have been retained: ?
= 4 d2u2 • • • <13и3 + dQ. The regression coefficients from that analysis (</h d2, • • ♦
4t) are converted into regression coefficients for the original predictors by the expression
bpcr. = d' u, where bfK„ is the principal component regression coefficient for predictor i of
the original predictor set. This yields the principal components regression equation in terms
of the original predictors: «= b^X 4- b^,:Z 4 b^XZ 4 b^. The b^,, are biased
estimates but they are efficient. See Mansfield et al. (1977) or Morris el al. (1986) for the
complete derivation of the analysis.
The reader may recall from Chapter 4 that centering the predictors will remove most of
the correlation between the crossproduct term and its component first order predictors. Hence,
-■f:--- simple regression equations containing an interaction, for example, P = b}X + b2Z +
b2XZ + b0, multicollinearity is not the source of the low level of statistical power.

CS CamScanner
I

9 Conclusion:
Some Contrasts Between ANOVA and MR
in Practice

In this book we have shown how many of the impediments to the under­
standing of interactions among continuous variables can be overcome.
The interpretation of first order effects and interactions within the MR
framework was presented in some depth for the simple case of two vari­
ables having only linear first order effects and a linear by linear interac­
tion. This interpretation was then extended to more complex cases having
more than two interacting variables, curvilinear effects, or combinations
of categorical and continuous predictor variables.
Post hoc methods for probing significant interactions by testing simple
slopes, determining the crossing point of the regression lines, and graph­
ically displaying the interaction were presented for both simple and com­
plex regression equations. The lack of invariance of regression coeffi­
cients under linear transformation was shown to have no impact whatever
upon the form or interpretation of the interaction: Simple slopes and the
4
status of the interaction as ordinal versus disordinal with regard to a pre­
dictor remain invariant under such transformations. The gain in inter- j
pretability that results from centering the variables prior to analysis was
presented. A variety of tests for exploring regression equations containing
higher order effects were explained, including both global tests of hy­
potheses and term by term step-down procedures that permit scale-free

172

CS CamScanner
Conclusion 173

testing of the effects in the regression equation. Finally, the effects of


unreliability of the predictor variables on the bias, efficiency, and power
of tests of first order and interaction terms in MR were considered in
depth, and several methods of correcting for unreliability were reviewed.
Taken together, the procedures developed in this book allow researchers
to overcome each of the impediments that have been identified to the use
of the MR approach to the testing of interactions between continuous
variables.
Embedded within this book, however, arc two important differences in
philosophy and procedure from the standard ANOVA approach, of which
the ANOVA user should be aware. First is the approach to a priori spec­
ification of a model of systematic variation. Second is the emphasis on
the examination of the error structure of the data to assess the adequacy
of the model. These differences reflect traditions arising from the dispa­
rate origins and continued typical areas of utilization of MR versus
ANOVA. ANOVA was originally developed for the analysis of planned
experiments, whereas MR was originally developed for the analysis of
nonexperimental observational and survey data. Although ANOVA may
be mathematically considered to be a special case of MR (Cohen, 1968;
Tatsuoka, 1975), the traditions associated with the two approaches have
led to important differences in the practice of researchers.
In multiple regression the researcher must necessarily specify each of
the terms to be included in the regression equation. This necessary spec­
ification emphasizes the importance of letting previous theory and re­
search guide the development of the model to be tested, a requirement
that will be unfamiliar to many ANOVA researchers. It also encourages
the careful examination of the existing literature to determine if other,
alternative models can also be developed, which can then often be directly
compared with the model preferred by the investigator.
In MR when theory is unclear about the nature of a “main” (first order)
or interaction effect of some of the variables, additional terms may be
introduced into the equation to represent such potential influences. Such
additional terms do not bias the results if, in fact, they have no influence
on the criterion in the population, and are relatively uncorrelated with
other predictors, although extraneous terms do decrease the efficiency of
the statistical tests of the other terms and introduce Type I error. The step­
down procedures described in Chapter 6 handle the efficiency problem by
testing terms when all nonsignificant higher order terms have been re­
moved. With Cohen and Cohen (1983), we encourage researchers to be
cautious in introducing theoretically unexpected terms into the equation

CamScanner
I74 MULTIPLE REGRESSION

io avoid lowering power, increasing Type I error, and increasing the com- ’
plcxity ol underManding the results, particularly when the predictors to
be introduced are highly correlated with those of theoretical interest.
In ANOVA applied to randomized experiments, researchers typically
have not considered the precise functional form of systematic variation in
model specification. The standard ANOVA analysis employs a fully sat­
urated model in which all terms through (he highest order possible are
always included, whether or not these higher order effects are theoreti­
cal!} expected to occur. Unanticipated higher order effects arc detected
both during omnibus effect testing and in post hoc probing, for example,
a significant interaction not expected from theory but uncovered during
cflcct testing, or an unexpected curvilinear relationship uncovered with
trend analysis where only a linear relationship was expected. Less fre­
quently recognized by ANOVA researchers is that the failure to specify
a functional form docs extract a penalty in terms of efficiency, as the
omnibus tests of the significance of main effects and interactions in
ANOVA aggregate several different functional forms, only some of which
may be of theoretical interest.
The apparent discrepancy between the need to consider model specifi­
cation in ANOVA versus MR is further undermined when design consid­
erations arc introduced. ANOVA tests the aggregation of all possible
functional forms of the main effects and interactions within the constraints
imposed by the sampling of the levels of each factor. The choice of too
few levels of a quantitative factor in ANOVA is the same specification
error as failing to include nonlinear terms in a regression model. The
choice of a 2 x 2 factorial design means that only linear main effects in
X and Z and the linear X by linear Z interactions can be detected, just as
in the simple regression equation containing an interaction we considered
in Chapter 2. However, in experimental designs, misspecification can only
be addressed by the redesign of the experiment and the collection of new
data. Thus researchers conducting randomized factorial experiments have
implicitly addressed die issue of functional form at the design phase of
the research; al the analysis phase, ANOVA aggregates all possible func­
tional forms within die constraints imposed by the design.
However, when ANOVA is employed in the situation addressed in this
book in which the factors are comprised of two or more measured vari­
ables, die functional form problem now arises in the analysis phase of the
research. The researcher must decide into exactly how many levels and
where each variable must be split to represent adequately the expected
functional form. The use of too few levels in ANOVA with measured

------ '
. ... ..i,
?i■£

CamScanner
Conclusion 175

variables is equivalent to the omission of a higher order term in MR and


typically results in biased estimates of effects. Beyond this problem of
adequate modeling of the functional form of the relationship of variables
to the criterion is the problem of the omission from the analysis of an
important variable or interaction (hat is correlated with both other predic­
tors (factors) and the criterion (dependent variable); such an omission
leads to precisely the same specification errors in ANOVA and MR ap­
plied to continuous variables (sec Kmcnta, 1986). In sum, specification
error is no less a problem in ANOVA than in MR; rather researchers using
ANOVA have dwelled less upon the problem than have MR users because
the specification of the model is addressed at the design rather than the
analysis phase in randomized experiments.
The MR literature has placed heavy emphasis on the assumptions un­
derlying the analysis, particularly normality, homosccdasticity, reliability
of measurement of predictors, and independence of observations. Viola­
tions of assumptions often result from misspecification of the regression
model. A substantial technology has emerged for the detection of such
problems within regression analysis through the examination of residuals
(e.g., Atkinson, 1985; Bollen & Jackman, 1990; Daniel & Wood, 1980).
These authors have offered guidelines in the use of this technology to aid
in the respecification of regression models so that they more adequately
fit the data. Other researchers have developed methods of identifying ap­
propriate transformations of problematic data (e.g., temporal data with
autocorrelated residuals) so that appropriate tests of the hypothesized
model may be performed (Judge, Hill, Griffiths, Lutkepul, & Lee, 1982;
Kmenta, 1986; McCleary & Hay, 1980). Finally, the impact of influential
data points on individual regression estimates has received careful atten­
tion (e.g., Atkinson, 1985; Belsley, Kuh, & Welsh, 1980; Cook & Weis­
berg, 1980; Stevens, 1984; Velleman & Welsh, 1981). Alternative esti­
mation techniques that are less subject to the vagaries of individual stray
data points have been proposed (Berk, 1990; Huynh, 1982).
Such careful examination of the tenability of the model has historically
rarely been practiced by ANOVA researchers because of the robustness
of the procedure in terms of Type I errors to violations of assumptions in
between-subject, randomized experiments with equal cell ns. However,
even in this optimal case, violation of the normality and equal variance
assumptions can lead to decreased efficiency in tests of the effects (Levine
& Dunlap, 1982). And, if the researcher uses other than randomized,
between-subject designs, violations of the assumptions can easily lead to
misestimation of treatment effects (Kenny & Judd, 1986; O’Brien &

CamScanner
По MtH.I'll'l U RIUIRUSSION

Kaiser, N85), When ivseaivhois (mu to tivuling Hicusiitett variables such


as those discussed heivin with ANOVA, the muse of lobustncss cannot
be invoked with the same impunity.
tn sum. it has been the ease that ANOVA utters have spent less effort
on model specification and examination of the error structure of data th in
have MR asci's, This might lead ANOVA users to steer away from MR,
even when it is more appivprhite, because It appears that MR requires
more effort. The same care in model specification and in examination of
error structures is necessary in ANOVA applied to continuous variables
in older that accurate effect-size estimates be achieved. That the MR lit *
erature has dwelled so extensively on the problems of parameter cstinu-
lion has resulted in the availability of statistical packages that support
regression diagnostics (e.g., SAS, SPSS-X), thereby facilitating the ef­
forts of the MR user to assure model specification accuracy and freedom
from error structure violations.
We have provided a complete armamentarium with which to understand
interactions among continuous variables and between categorical and con­
tinuous variables in regression analysis. We hope that with these inter
pretationd tools at their disposal, researchers familiar with the basic tech­
niques of MR will begin testing theoretically interesting interaction terms
in their equations. Wc also hope that researchers originally trained in
ANOVA will generalize well-learned strategies and utilize the more pow­
erful and more appropriate multiple regression framework to test inter
actions between continuous variables.

r4
cs CamScanner
, ■ ■

Appendix A: Mathematical Underpinnings

We are all familiar with the fact that if one makes simple additive transformations
on a variable (i.e., adding a constant), the variance of the variable, and its co­
variances and correlations with other variables, remain unchanged. Only the mean
changes, by a factor of the additive constant. Thus we expect that additive trans­
formations of predictor variables will have no effect on the outcomes of multiple
regression analysis. If predictor X is replaced with a variable X + c where c is a
constant, all regression coefficient estimates, and the variances and covariances
of these estimates, are expected to remain constant. This conclusion is true so
long as the regression equation contains only first order terms. In this case, only
the regression constant will be affected by changes in predictor variables.
The same pattern of invariance does not hold for product terms. If a constant
is added to a variable involved in a product term, the variance of the product term
as well as the covariances and correlations of that product term with other terms
are changed. Thus regression analyses containing product terms are scale depen­
dent. The estimates of the regression coefficients, their variances and covariances,
and their standard errors are altered by changes in scale. Only the raw regression
coefficient for the highest order term and its standard error remain unchanged
under additive transformations (Cohen, 1978).
Bohmstedt and Goldberger (1969) provide a straightforward demonstration of
:■ the algebraic basis of this failure of invariance. Here we show (a) how the ex­
peeled value (or mean) of a crossproduct term XZ depends on the expected values
(or means) of the variables, X and Z, of which it is comprised; (b) how the vari-
1 ance of the crossproduct XZ term depends upon the expected values of X and Z;
and (c) how the covariance of a crossproduct term XZ with another variable У
depends upon the expected values of X and Z. Having shown (c), it is easy to
'■';?/- ■ ' . ‘ ; ■ .; . : . л. ■ .
’ --/■Г-. > f • . ' f, . ’■ " ■ ■ . . ; ' '' ■. ■
i. : 177- .
i i ■ ■ ■: . -r,i ... : ■■ ■ ■ ■ ■ ■ .■ • ■

CamScanner
178 MULTIPLE REGRESSION

show how the covanance of XZ with Л’ or Z depends upon expected values of y


and Z as well.
Our demonstration will follow the development of Bohmstedt and Goldbergs
(1969) and work in terms of expected values. (Readers wishing a review of the
algebra of expectations should see Hays, 1988. Appendix B.) Those readers wfo
arc onh interested in the result of the demonstration should examine exprexxiont
A 4, A.8, and A 13 for the expected value (or mean) of n cmssproduct term V/
its variance, and its covariance with a criterion variable Y, respectively. Note th«
in each eave, the expression contains the terms E(X) and E(Z), the expected
values or means of the variables forming the crossproduct and hence the source
of the scale dependence.


1
The Expected Value (Mean) of a Product Term • .•
We begin with two variables X and Z, making no assumption about their dis­
tribution. Think of these as two predictors in a regression analysis. We define
deviation (centered) scores for each of the predictors,

x = X - E(X) and z = Z - E(Z)


. , , ... ..' 1
or equivalently,

X = E(X) + x and Z = E(Z) + г

Their expected values (means) are E(X) and E(Z), with variances V(X) « E(.n
and V(Z) = Efz2). and covariance C(X, Z) = E(xz).
First, we will form the crossproduct term XZ just as would be done in a regret
sion analysis involving interaction. We form the prossproduct of the raw scores:
• • ■ ■ •

XZ = [x 4- E(X)][z 4- E(Z)] (A ir I

XZ = [xz 4- zE(X) + xE(Z) 4- E(X)E(Z)) (A 2)


.■ . ■ , ■ ■ .... . ■ . ■■■■■. ',i
The expected value or mean of the crossproduct of the raw scores is then

E(XZ) = E(xz) + E(z)E(X) + E(x)E(Z) 4- E(X)E(Z) (А.Я

• . . ?. .’. Г.
But for deviation scores E(x) » E(z) = 0, so the expected value of the co­
product is as follows:

E(XZ) = C(X, Z) 4* E(X)E(Z) (A-4)


. • ■ '... " '
which is the same as expression (3) of Bohmstedt and Goldberger (1969). As
would expect, the mean of the crossproduct depends in an orderly way ел

CamScanner
Appendix A 179

means of the two variables. This expression holds regardless of the distributions
of A’and Z.

Variance of a Product Tenn

To'find the variance of the crossproduct tenn, we evaluate the expression

V(XZ) « [XZ - E(XZ)f (A.5)

substituting expression A.2 for XZ and A.4 for H(XZ). First, we form the «pare
as follows:

*Z)
V( - {« - [C(X, Z) + E(X)E(Z)]}’

. • (« + *B(Z) + tB(X) (A.B)


+ E(X)E(Z) - C(X. Z) - E(X)E(Z))’

Now we take expectations, obtaining the following expression for V(XZ):

V(XZ) • V(Z)E’(X) + V(X)E2(Z) + E(??) + 2E(X)E(x?)


+ 2E(Z)E(?z) + 2C(X, Z)E(X)E(Z) - C!|XZ) (A-7)

This expression can be simplified if we assume that X and Z are bivariate normal;
this is the usual assumption in regression analysis. If variables X, Z, and W are
multivariate normal, then all odd moments (first, third, fifth, etc.) are zero, (e.g..
EU) = E(xzw) = E(rz) ~ E(x2zM = 0). Moreover, E(??) = V(X)V(Z)
•+ 2C’(X, Z). Then equation A.7 simplifies to

V(XZ) = V(Z)E2(X) + V(X)E2(Z) + 2C(X, Z)E(X)E(Z)

+ V(X)V(Z) + C2(X, Z) (A.8)

This is the same as expression (6) in Bohmstedt and Gokibcrger. What is impor­
tant to note in expression A.8 is that k(XZ) depends upon the expected values
(or means) of X and Z. If constants are added to X, Z, or both, then V(XZ) will

Covariance of a Product Tenn with Another Tenn

< Consider the covariance between a crossproduct tenn XZ and the criterion )’ in
^regression analysis.

C(XZ, Г) - e{(XZ - E(XZ)|[r> E(T)H (A.9)

\ A . l' : < ; л : ’■’< • ’ . ' ‘ }\Л

;■ .• ‘ 't . ■ ■

•;/. ’•••? ■ • ■’ / - • ; • - -■ ■' ■ • ■ . . •. . ■ ■ ■ ’ »■ • • • ’. ' ' ■

CamScanner
180 MULTIPLE REGRESSION

where

У-Е(У)=у (AJO)

and .

XZ - E(XZ) = xz + xE(Z) 4* zE(X) - C(X, Z) (A J I)

Note that expression AJI above is formed by taking the difference between
expressions A.2 and Л.4.
We multiply expressions AJO and A.11, which yields

C(XZ, У) = E[xyz + xyE(Z) + zyE(X) - yC(X, Z)]

We take expectations, making note that E(xy) = C(X, У), E(zy) = C(Z, Y),
and E(y) = 0, so that

C(XZ, У) = E(xyz) + C(X, y)E(Z) + C(Z, У)Е(Х) (A.I2)

If we assume multivariate normality, then the third moment E(xyz) vanishes,


yielding the expression

C(XZ, У) = C(X, y)E(Z) + C(Z, У)Е(Х) (AJ3)

Expression A J3 shows that the covariance between a product term XZ and an­
other variable Y depends upon the expected values of the variables involved in
the product term but not of the other variable. Translating this into regression
with product variables, transforming the criterion У by additive constants will
have no effect on the regression analysis.

The Covariance of a Crossproduct with a Component

In a regression analysis containing as predictors X, Z, and XZ, we are also


concerned with the covariance of the crossproduct XZ with each of its component
variables. With no distributional assumptions,

C(XZ, X) - Е(л’г) + V(X)E(Z) + C(Z, X)E(X) (А.И)

With X and Z bivariate normal, the expression reduces to

C(XZ, X) = V(X)E(Z) + C(Z, X)E(X) (A l5)

The covariance of a crossproduct with one of its components depends upon


expected value of the two variables entering the crossproduct term.

CamScanner
Appendix d 181

Centered Variables

In Chapter 3 we introduced reg re «amn equation


* including interaction term
*.
We began by centering the variable
*. that i*
. setting their mean
* to zero. Here we
gagmine the expression
* for expected values, variances, and covariance
* given in
expression A 4, A.R, A 13. and A 15 for E(X) - E(Z) ** 0 First, with no
distributional assumptions. we substitute B(X) ~ Oand 11(7) ** 0 into equation
д.4, obtaining

E(XZ) « C(X. 7.), (A. 16)

that is. the mean of the crossproduct terms will equal the covariance between X
and 7. Note that even if X and Z are centered, the crossproduct XZ will not usually
be centered.
Second, with no distributional assumptions, we substitute E(X) » Oand E(Z)
« 0 into equation A.7. finding

V(XZ) « E(??) - C2(X. Z) (A. 17)

If we assume bivariate normality, this expression simplifies to

V(XZ) - V(X)V(Z) + C2(X,Z)

Third, with no distributional assumptions, we substitute E(X) ~ 0 and E(Z) *


0 imo equation A. 12, obtaining

C(XZ, Г) « Е(луг) (AJ8)

This expression reduces under the assumption of multivariate normality to

C(XZ, Y) ~ 0 (AN)

*
Tim result seems surprising. It says that when /wo predictor X and 2 and a
criterion Y are multivariate normal, the covariance between the product XZ and
7 will be ierv. Does this mean that (here is necessarily no interaction if X, Z, and
}лш- multivariate normal? Yes. Turning the logic around, if there exists an in­
teraction between X and Z in the prediction of Y, then, necessarily, the joint dis-
ttibutkin of X, Z. and Y is not multivariate iwnnal. Recall, however, that in fixed
r etivcix multiple regression, the distnbutional requirement applies only to the его
terion Otherwise stated, only the measurement error in the criterion must be nor-
\ «telly distributed, Thus the result in equation A. N does not present a problem
tai significance testing.
^« regression analysis we are very concerned with multicollincarity or very high
'among predictors. The covariance between a crossproduct term and a

CamScanner
182 MULTIPLE REGRESSION

component variable is substantially reduced by centering variables. With no dis­


tributional assumption, from expression A. 14,

C(XZ, X) = Е(Л) (A.20)

Assuming bivariate normality of the prediction,

C(XZ, X) = 0 (A,21)

If one compares expression A. 14 with expression A.20 it is seen that the co


variance between a product term and its component is dramatically reduced by
centering the predictor variables. As was shown in Chapter 4, centering versus
not centering has no effect on the highest order interaction term in multiple regres­
sion with product variables. However, centering may be useful in avoiding com­
putational difficulties.

CamScanner
Appendix В: Algorithm for Identifying
Scale-Independent Terms

■ •

. ■' •

Using an hierarchical step-down procedure to simplify regression equations with


к higher order terms, as was recommended in Chapter 6, requires that at each step
the scale-independent term(s) be identified. The algebraic strategy presented in
Chapter 3 to examine the effect of additive transformations of the variables can
be used to identify he scale-independent terms in any regression equation. We
present here a five-step algorithm that identifies the scale-free terms to be tested.
The algorithm is illustrated using equation 5.4:

: Y = b{X + b2X2 + byZ + bxXZ + fc3X2Z + d0 (5.4)

■ Although additive transformations are examined here, the procedure can be gen-
k- eralized to multiplicative transformations such as those involved in standardiza-
lion (see Cohen, 1978).
Step 1. State the full regression equation under consideration, here equation
5.4. k: .-..k.

Step 2. Rewrite the equation in terms of transformed predictor variables. For


additive transformations, use X* *= X4 c and Z’ = Z + /, or equivalently X =
й; ■ X’ - cand Z ~ Z' Substituting the expressions for the transformations into
c equation 5.4 produces the following result:
f ' •/' ' ' ■ г ■ ■ ' ' '
- c) 4 b2(X> -c)2 + MZ
* -/)
+ b,(X' - c)(Z- -/) + b,(X’~ C)!(Z- -/) + b„ (B.l)

S If к
чк
■f-’j*
;•/
V 5', •"' ■
■f/-'-i ■'■■.5
Г. ;
A." ' -4.
.-xV./rvi

'
;
. . ••
'* • * .
. . ■. < . к.'
' г’*’
-
|
'

/.'-Л'' ”i. ■
.
■■
i *
;
. ■
.■ ’

". .
'
'.
*

*
.
'
. .

. ■ ■
*
■ ■
■. ;• . '
...
*

”.
joi
-• .

Io.’

CamScanner
184 MULTIPLE REGRESSION

Step Expand and collect the terms of the regression equation with tranv
3.

formed variables:

Y ~ (bt - 2b2c - *
f
b 4 2/>5(/)X’ 4 (b) — b$f)X'2

+ (6, - b,c + + (/>, - 21>,с)Х'7.' + M'’Z'

+ (ft» - />,<■ + />;C! - />,/ + btcf - /',<?/) (Ij j)

or equivalently

f = b\X' + b[X’2 4- b^Z' 4- b’4X'Z' 4- £>£X'2Z' 4- bi (B 3)

Step Enumerate the relationship between each original coefficient and gj


4.

corresponding transformed coefficient. Each coefficient £>/ in equation B.3 that


would result from analysis of data transformed by the expressions X' « x + c

and Z' - Z 4 / is shown in equation B.2 as a function of the coefficients of the


original untransformed data and the scaling constants (c and/). Forex-
(b's)

ample,

b\ = bi - 2b2c - b4f + 2b$cf (B.4)

Appendix Table B.l presents the full enumeration of relationships between


coefficients of the original and transformed regression equations. In the portion
of Table B.l labeled “Coefficient Relationships’’ each coefficient of the trans­
formed equation is shown as a function of the coefficient of the original equation
plus the modifications due to transformation. The presence of terms in the col­
umns through labeled “Modifications due to Transformation” indicates that
bi b5

the coefficient is scale dependent. Only the row for the coefficient contains no b$

terms under “Modifications due to Transformation,” hence only the coefficient b\

is scale free.
Step5. If a scale-free term is tested and found to be nonsignificant, it is dropped 1
from the equation. For any equation resulting from deletion of higher order terms,
delete the corresponding columns of “Modifications due to Transformation.” For j
example, if the term were deleted from equation 5.4, the following equation 1
would result:
■ ''I
Y -4- />,№ + />,Z +
b,X (B.5)
b
XZ
* + b„

The column of “Modifications due to Transformation” would be deleted. After


bs

deletion, those coefficients of the transformed equation that show no entries under .<?
“Modifications due to Transformation,” here and are scale free in the
b2 b4,

reduced equation. Thus in equation B.5, both and are scale invariant. b2 b4
' . ’ . . ’■ . .... •

CamScanner
1S5

П a joint test of both the b4 and b4 terms were nonsignificant, leading both
tcnnsto be dropped from equation 5.4. then both the b> and b\ coefficients show
пл entries under "Modifications due to Transformation’* Hence b» and b, are
scale invariant in equation V ~ b,X ♦ b,,VJ t b»Z > bv, both coefficients may
be tested tor significance.
Theprcscnt strategy is applicable to more complex equations such ax equation
5.5 which includes two nonlincat etlccts and their interactions, or equations in *
voicing three variables, ,\\ Z, and IF. and their interactions such as equation 4. I.
Appendix Table 112 provides useful summary charts for determining the xcate-

ftvc terms for these equations.

Appendix Table B.1
Coefficients of Higher Order Regression Equation after Linear Transformation of Original Variables for Equation 5.4

Original Equation:
Ŷ = b1X + b2X² + b3Z + b4XZ + b5X²Z + b0

Transformed Equation:
Ŷ = b1'X' + b2'X'² + b3'Z' + b4'X'Z' + b5'X'²Z' + b0'

Transformations: X' = X + c; Z' = Z + f

Coefficient Relationships:

Transformed   Original              Modifications due to Transformation
 Equation     Equation      b1        b2        b3        b4        b5
   b1'           b1                   -2b2c               -b4f      +2b5cf
   b2'           b2                                                 -b5f
   b3'           b3                                       -b4c      +b5c²
   b4'           b4                                                 -2b5c
   b5'           b5
   b0'           b0         -b1c      +b2c²     -b3f      +b4cf     -b5c²f

NOTE: Coefficient relationships indicate, for example, that coefficient b1' of the transformed equation equals the value (b1 - 2b2c - b4f + 2b5cf), where the bi coefficients are taken from the original equation.


Appendix Table B.2
Coefficients of Higher Order Regression Equation after Linear Transformation of Original Variables for Equations 5.5 and 4.1, Where X' = X + c, Z' = Z + f, W' = W + h

a. Two Factor Equation Containing a Quadratic x Quadratic Interaction

Original equation:    Ŷ = b1X + b2Z + b3X² + b4Z² + b5XZ + b6XZ² + b7X²Z + b8X²Z² + b0

Transformed equation: Ŷ = b1'X' + b2'Z' + b3'X'² + b4'Z'² + b5'X'Z' + b6'X'Z'² + b7'X'²Z' + b8'X'²Z'² + b0'

Coefficient Relationships:

Transformed  Original                       Modifications due to Transformation
 Equation    Equation    b1      b2      b3       b4       b5       b6        b7        b8
   b1'          b1                       -2b3c             -b5f     +b6f²     +2b7cf    -2b8cf²
   b2'          b2                                -2b4f    -b5c     +2b6cf    +b7c²     -2b8c²f
   b3'          b3                                                            -b7f      +b8f²
   b4'          b4                                                  -b6c                +b8c²
   b5'          b5                                                  -2b6f     -2b7c     +4b8cf
   b6'          b6                                                                      -2b8c
   b7'          b7                                                                      -2b8f
   b8'          b8
   b0'          b0        -b1c    -b2f    +b3c²    +b4f²    +b5cf   -b6cf²    -b7c²f    +b8c²f²
Appendix Table B.2, continued

b. Three Factor Equation Containing All Linear Terms

(i) Original equation:     Ŷ = b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW + b0

(ii) Transformed equation: Ŷ = b1'X' + b2'Z' + b3'W' + b4'X'Z' + b5'X'W' + b6'Z'W' + b7'X'Z'W' + b0'

(iii) Coefficient relationships:

Transformed  Original                 Modifications due to Transformation
 Equation    Equation    b1      b2      b3      b4       b5       b6       b7
   b1'          b1                               -b4f     -b5h              +b7fh
   b2'          b2                               -b4c              -b6h     +b7ch
   b3'          b3                                        -b5c     -b6f     +b7cf
   b4'          b4                                                          -b7h
   b5'          b5                                                          -b7f
   b6'          b6                                                          -b7c
   b7'          b7
   b0'          b0        -b1c    -b2f    -b3h    +b4cf    +b5ch    +b6fh    -b7cfh

NOTE: Coefficient relationships indicate, in the three factor equation, for example, that the coefficient b1' of the transformed equation equals the value (b1 - b4f - b5h + b7fh), where the bi coefficients are taken from the original equation.
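As with Table B.1, these relationships can be checked symbolically. The short sketch below is added here for illustration and is not part of the original table; it uses Python's sympy library with illustrative names (Xp, Zp, Wp stand for X', Z', W') to expand the three factor equation in part b and report, for each transformed coefficient, its expression and whether it is free of the scaling constants c, f, and h. Only the X'Z'W' coefficient turns out to be scale free, as the table indicates.

import sympy as sp

b = sp.symbols('b0:8')                        # b[0] is the intercept; b[1]..b[7] match the table
Xp, Zp, Wp, c, f, h = sp.symbols('Xp Zp Wp c f h')
X, Z, W = Xp - c, Zp - f, Wp - h              # since X' = X + c, Z' = Z + f, W' = W + h

Y = sp.expand(b[1]*X + b[2]*Z + b[3]*W + b[4]*X*Z + b[5]*X*W
              + b[6]*Z*W + b[7]*X*Z*W + b[0])
poly = sp.Poly(Y, Xp, Zp, Wp)

for mono, label in [(Xp, "X'"), (Zp, "Z'"), (Wp, "W'"), (Xp*Zp, "X'Z'"),
                    (Xp*Wp, "X'W'"), (Zp*Wp, "Z'W'"), (Xp*Zp*Wp, "X'Z'W'")]:
    coef = poly.coeff_monomial(mono)
    print(label, '=', coef, '| scale free:', not ({c, f, h} & coef.free_symbols))
print('intercept =', Y.subs({Xp: 0, Zp: 0, Wp: 0}))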

Appendix C: SAS Program for Test of Critical Region(s)

Written by Jenn-Yun Tein, Arizona State University

This program is applicable to cases comparing regression lines in which there are two groups and one continuous variable. It identifies critical regions where the two regression lines differ significantly using Potthoff's (1964) extension of the Johnson-Neyman procedure (see Chapter 7). Separate regression analyses within
each of the groups provide the data necessary for input to this program.
Variables are entered in the order below separated by a space (free format).
The values of each of the variables for the example in chapter 7 appear in lines
23 and 24 of the program. The program prints the name of the dependent variable,
the limit of region 1 (XL1), and the limit of region 2 (XL2).

DEPVAR (short name of dependent variable)
ALLN = N (total N combining two groups)
N1 = n1 (number of subjects in group 1)
N2 = n2 (number of subjects in group 2)
SXSQR1 = SSX(1) (sum of squares of predictor in group 1)
SXSQR2 = SSX(2) (sum of squares of predictor in group 2)
MEAN1 = X̄(1) (mean of predictor in group 1)
MEAN2 = X̄(2) (mean of predictor in group 2)
F = F(2, N-4) (value of F from table)
SSRES = SSres (sum of squares residual—add up values of SSres from groups 1 and 2)
B1 = B1(1) (slope for group 1)
B01 = B0(1) (intercept for group 1)
B2 = B1(2) (slope for group 2)
B02 = B0(2) (intercept for group 2)

Program

00001  (local system Job Control Language [JCL])
00002  (local system JCL)
00003  (local system JCL)
00004  DATA JOHNNEYK;
00005  INPUT DEPVBL $ ALLN N1 N2 SXSQR1 SXSQR2 MEANX1 MEANX2 F
00006        SSRES B1 B01 B2 B02;
00007  MXSQR1 = MEANX1**2;
00008  MXSQR2 = MEANX2**2;
00009  SUM1 = (1/SXSQR1) + (1/SXSQR2);
00010  SUM2 = (MEANX1/SXSQR1) + (MEANX2/SXSQR2);
00011  SUM3 = (ALLN/(N1*N2)) + (MXSQR1/SXSQR1) + (MXSQR2/SXSQR2);
00012  SUMB1 = B1 - B2;
00013  SUMB0 = B01 - B02;
00014  SUMB1SQ = SUMB1**2;
00015  SUMB0SQ = SUMB0**2;
00016  A = (((-2*F)/(ALLN-4)) * SSRES * SUM1) + SUMB1SQ;
00017  B = ((( 2*F)/(ALLN-4)) * SSRES * SUM2) + (SUMB0 * SUMB1);
00018  C = (((-2*F)/(ALLN-4)) * SSRES * SUM3) + SUMB0SQ;
00019  SQRTB2AC = ((B**2) - (A*C))**.5;
00020  XL1 = (-B - SQRTB2AC)/A;
00021  XL2 = (-B + SQRTB2AC)/A;
00022  CARDS;
00023  SALARY 25 10 15 21768.4 6671180.4 2.40 2.99 3.47
00024  870923 122.9 27705.0 1872 18401.6
00025  PROC PRINT; VAR DEPVBL XL1 XL2;
00026  RUN;
00027  (local system JCL)
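For readers who wish to check the arithmetic outside SAS, the following sketch is added here and is not part of the original appendix; it mirrors statements 00007 through 00021 of the program above in Python. The function name and argument names are arbitrary; the arguments correspond, in order, to the variables of the INPUT statement with the character-valued dependent-variable name omitted.

from math import sqrt

def jn_region_limits(alln, n1, n2, sxsqr1, sxsqr2, meanx1, meanx2,
                     f, ssres, b1, b01, b2, b02):
    # Potthoff (1964) extension of the Johnson-Neyman procedure:
    # returns (XL1, XL2), the limits of the region(s) in which the two
    # group regression lines differ significantly.
    sum1 = 1.0 / sxsqr1 + 1.0 / sxsqr2
    sum2 = meanx1 / sxsqr1 + meanx2 / sxsqr2
    sum3 = alln / (n1 * n2) + meanx1**2 / sxsqr1 + meanx2**2 / sxsqr2
    db1 = b1 - b2                      # difference between slopes
    db0 = b01 - b02                    # difference between intercepts
    a = (-2.0 * f / (alln - 4)) * ssres * sum1 + db1**2
    b = ( 2.0 * f / (alln - 4)) * ssres * sum2 + db0 * db1
    c = (-2.0 * f / (alln - 4)) * ssres * sum3 + db0**2
    disc = sqrt(b**2 - a * c)
    return (-b - disc) / a, (-b + disc) / a

Called with the fourteen values from lines 23 and 24 of the program (minus the dependent-variable name), it should reproduce the XL1 and XL2 limits that PROC PRINT reports.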
/ '< ■■ ■; ■ . : J. ■■■. , у- ■ у-...- ' . .. • '• ... .. ' . • ■

' - - ' . ..■ / ■ _ ■ ■ . ■■ ■ ' ' < . 1 . “ ‘


CS CamScanner
References

Allison, P. D. (1977). Testing for interaction in multiple regression. American Journal of


Sociology, 83, 144-153.
Althauser, R. P. (1971). Multicollinearity and non-additive regression models. In H. M.
Blalock (Ed.), Causal models in the social sciences. Chicago: Aldine.
Alwin, D. F., & Jackson, D. J. (1980). Measurement models for response errors in surveys:
Issues and applications. In K. F. Schuessler (Ed.), Sociological methodology. San
Francisco: Jossey-Bass.
Alwin, D. F., & Jackson, D. J. (1981). Applications of simultaneous factor analysis to
issues of factorial invariance. In D. J. Jackson & E. F. Borgotta (Eds.), Factor analysis
and measurement in sociological research (pp. 249-279). Beverly Hills, CA: Sage.
Anderson, L. R., & Ager, J. W. (1978). Analysis of variance in small group research.
Personality and Social Psychology Bulletin, 4, 341-345.
Appelbaum, M. I., & Cramer, E. M. (1974). Some problems in the nonorthogonal analysis
of variance. Psychological Bulletin, 81, 335-343.
Arnold, H. J., & Evans, M. G. (1979). Testing multiplicative models does not require ratio
scales. Organizational Behavior and Human Performance, 24, 41-59.
Arvey, R. D., Maxwell, S. E., & Abraham, L. M. (1985). Reliability artifacts in compa­
rable worth procedures. Journal of Applied Psychology, 70, 695-705.
Atkinson, A. C. (1985). Plots, transformations, and regression. Oxford, UK: Clarendon $

Press.
Belsley, D. A., Kuh, E., & Welsh, R. E. (1980). Regression diagnostics: Identifying injlu-
ential data and sources of collinearity. New York: John Wiley.
Bentler, P. M. (1980). Multivariate analyses with latent variables: Causal modeling. In M. ■

R. Rosenzweig & L. W. Porter (Eds.), Annual Review of Psychology, 31. Palo Alto,
CA: Annual Reviews.
Bentler, P. M. (1989). EQS: Structural equations program manual. Los Angeles: BMDP Statistical Software.

Bender, P. M., & Chou. С. P. (1988). Practical issues in structural modeling. In J. S. Long
(Ed.), Common problems /proper solutions: Avoiding error in quantitative research
(pp. 161-192). Newbury Park, CA: Sage.
Berk. R. A. (1990). A primer on robust regression. In J, Pox & J. S. Ix>ng (Eds.), Modern
methods of data analysis (pp. 292-324). Newbury Park. CA: Sage.
Blalock, H. M., Jr. (1965). Theory building and the concept of interaction. American So­
ciological Review, 30. 374-381.
Bohmstedt. G. W. (1983). Measurement. In P. H. Rossi, J. I). Wright, & A. B. Anderson
(Eds.), Handbook of Survey Research (pp. 69-121). New York: Academic Press.
Bohmstedt, G. W., & Carter, T. M. (1971). Robustness in regression analysis. In H. L.
Costner (Ed.), Sociological Methodology (pp. 118-146). San Francisco: Jossey-Bass.
Bohmstedt, G. W., & Goldbcrgcr, A. S. (1969). On the exact covariance of products of
random variables. Journal of the American Statistical Association, 64, 325-328.
Bohmstedt. G. W„ & Marwell, G. (1978). The reliability of products of two random var­
iables. In K. F. Schuesslcr (Ed.), Sociological methodology. San Francisco: Josscy-
Bass.
Bollen. K. A. (1989). Structural equations with latent variables. New York: John Wiley.
Bollen, K. A., & Barb, К. H. (1981). Pearson’s rand coarsely categorized measures. Amer­
ican Sociological Review, 46, 232-239.
Bollen, K. A. & Jackman, R. W. (1990). Regression diagnostics: An expository treatment
of outliers and influential cases. In J. Fox and J. S. Long (Eds.), Modern methods of
data analysis (pp. 257-291). Newbury Park, CA: Sage.
Borich, G. D. (1971). Interactions among group regressions: Testing homogeneity of group
regressions and plotting regions of significance. Educational and Psychological Mea­
surement, 31, 251-253.
Borich, G. D., & Wunderlich, K. W. (1973). Johnson-Neyman revisited: Determining
interactions among group regressions and plotting regions of significance in the case
of two groups, two predictors, and one criterion. Educational and Psychological Mea­
surement, 33, 155-159.
Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations (with discussion).
Journal of the Royal Statistical Society (Section B), 26, 211-246.
Browne, M. W. (1984). Asymptotic distribution free methods in analysis of covariance
structures. British Journal of Mathematical and Statistical Psychology, 37, 62-83.
Busemeyer, J. R., & Jones, L. E. (1983). Analyses of multiplicative combination rules
when the causal variables are measured with error. Psychological Bulletin, 93, 549-
562.
Byrne, В. M., Shavelson, R. J., & Muthdn, B. (1989). Testing for the equivalence of factor
covariance and mean structures: The issue of partial measurement invariance. Psycho­
logical Bulletin, 105, 456-466.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the
multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Champoux, J. E., & Peters, W. S. (1987). Form, effect size, and power in moderated
regression analysis. Journal of Occupational Psychology, 60, 243-255.
Chaplin, W. F. (1991). The next generation of moderator research in personality psychol­
ogy. Journal of Personality, 59, 143-178.
Chaplin. W. F. (in press). Personality, interactive relations and applied psychology. In S.
Briggs, S. R. Hogan, & W. H. Jones (Eds.), Handbook of Personality Psychology.
Orlando, FL: Academic Press.

Cleary, P. D., & Kessler, R. C. (1982). The estimation and interpretation of modifier effects. Journal of Health and Social Behavior, 23, 159-169.
Cobb, S. (1976). Social support as a moderator of life stress. Psychosomatic Medicine, 38, 300-314.
Cohen, J. (1968). Multiple regression as a general data analytic system. Psychological Bulletin, 70, 426-443.
Cohen, J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press.
Cohen, J. (1978). Partialed products are interactions; partialed vectors are curve components. Psychological Bulletin, 85, 858-866.
Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249-253.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences (1st ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cook, R. D., & Weisberg, S. (1980). Characterization of an empirical influence (шщв
tor detecting influential cases in regression. Technometrics, 22(4), 495-508.
Cramer. E. M., & Appelbaum, M. 1. (1980). Nonorthogonal analysis of varunce-wa»

again Psychological Bulletin, 87, 51-57.


Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses newly

proposed. Psychological Bulletin, 102, 414-417.


Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods. New York: Irvington.
Daniel, C., & Wood, F. S. (1980). Fitting equations to data (2nd ed.). New York: John Wiley.
Darlington, R. B. (1990). Regression and linear models. New York: McGraw-Hill.
Domino, G. (1968). Differential predictions of academic achievement in conforming and independent settings. Journal of Educational Psychology, 59, 256-260.
Domino, G. (1971). Interactive effects of achievement orientation and teaching style on academic achievement. Journal of Educational Psychology, 62, 427-431.
Duncan, O. D. (1975). Introduction to structural equation models. New York: Academic Press.
Dunlap, W. P., & Kemery, E. R. (1987). Failure to detect moderator effects: Is multicollinearity the problem? Psychological Bulletin, 102, 418-420.
Dunlap, W. P., & Kemery, E. R. (1988). Effects of predictor intercorrelations and reliabilities on moderated multiple regression. Organizational Behavior and Human Decision Processes, 41, 248-258.
England, P., Farkas, G., Kilbourne, B. S., & Dou, T. (1988). Explaining occupational sex segregation and wages: Findings from a model with fixed effects. American Sociological Review, 53, 544-558.
Etezadi-Amoli, J., & McDonald, R. P. (1983). A second generation nonlinear factor analysis. Psychometrika, 48, 315-342.
Evans, M. G. (1985). A Monte Carlo study of the effects of correlated method variance in moderated multiple regression analysis. Organizational Behavior and Human Decision Processes, 36, 305-323.


Feucht, T. E. (1989). Estimating multiplicative regression terms in the presence of mea­


surement error. Sociological Methods & Research, 17, 257-282.
Fiedler, F. E. (1967). A theory of leadership effectiveness. New York: McGraw-Hill.
Fiedler, F. E., Chemcrs, M. M., & Mahar, L. (1976). Improving leadership effectiveness:
The leader match concept. New York: John Wiley.
Finney, J. W., Mitchell, R. E., Cronkitc, R. C., & Moos, R. H. (1984). Methodological
issues in estimating main and interactive effects: Examples from coping/social support
and stress field. Journal of Health and Social Behavior, 25, 85-98.
Fisher, G. A. (1988). Problems in the use and interpretation of product variables. In J. Scott Long (Ed.), Common problems/proper solutions: Avoiding error in quantitative research (pp. 84-107). Newbury Park, CA: Sage.
Friedrich, R. J. (1982). In defense of multiplicative terms in multiple regression equations.
American Journal of Political Science, 26, 797-833.
Fuller, W. A. (1980). Properties of some estimators for the errors-in- variables model. The
Annals of Statistics, 8, 407-422.
Fuller, W. A. (1987). Measurement error models. New York: John Wiley.
Fuller, W. A., & Hidiroglou, M. A. (1978). Regression estimation after correcting for
attenuation. Journal of the American Statistical Association, 73, 99-104.
Gallant, A. R. (1987). Nonlinear statistical models. New York: John Wiley.
Gulliksen, H. (1987). Theory of mental tests. Hillsdale, NJ: Lawrence Erlbaum. (Originally
published by John Wiley, 1950).
Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and ad­
vances. Baltimore, MD: Johns Hopkins Press.
Hays, W. L. (1988). Statistics (4th ed.). New York: Holt, Rinehart, & Winston.
Heise, D. R. (1975). Causal analysis. New York: John Wiley.
Heise, D. R. (1986). Estimating nonlinear models. Sociological Methods and Research,
14, 447-472.
Herr, D. G., & Gaebelein, J. (1978). Nonorthogonal two-way analysis of variance. Psy­
chological Bulletin, 85, 201-216.
Huitema, B. E. (1980). The analysis of covariance and alternatives. New York: John Wiley.
Huynh, H. (1982). A comparison of four approaches to robust regression. Psychological

Bulletin, 92, 505-512. ,


Jaccard, J., Turrisi, R., & Wan, С. K. (1990). Interaction effects in multiple regression.
Newbury Park, CA: Sage.
Johnson, P. O., & Fay, L. C. (1950). The Johnson-Neyman technique, its theory and
application. Psychometrika, 15, 349-367. , .
Johnson, P. O., & Neyman, J. (1936). Tests of certain linear hypotheses and their appli­
cations to some educational problems. Statistical Research Memoirs, I, 57-93.
Joreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychome-
trika, 36, 409-426.
Joreskog, K. G., & Sorbom, D. (1979). Advances in factor analysis and structural equation
modeling. Cambridge, MA: Abt,
Joreskog, K. G., & Sorbom, D. (1981). LISREL 6: Analysis of linear structural relation­
ships by the method of maximum likelihood. Chicago: National Educational Resources.
Joreskog, K. G., & Sorbom, D. (1988). LISREL 7: A guide to the program and applica­
tions. Chicago: SPSS.
Jbreskog, K. G., & Sorbom, D. (1989). LISREL 7: User’s reference guide. Mooresville,
IN: Scientific Software.


Judd, C. M., & McClelland, G. H. (1989). Data analysis: A model comparison approach. San Diego: Harcourt, Brace, Jovanovich.
Judge, G. G., & Bock, M. E. (1978). The statistical implications of pre-test and Stein-rule estimators in econometrics. Amsterdam: North Holland.
Judge, G. G., Hill, R. C., Griffiths, W. E., Lütkepohl, H., & Lee, T. C. (1982). Introduction to the theory and practice of econometrics. New York: John Wiley.
Kenny, D. Л. (1975). Л quasbcxperimcntal approach Io assessing treatment effect
* rni&e
noncquivalcnt control group design. Psychological Bulletin, 82, 345-362,
Kenny, D. A. (1979). Correlation and causality. New York: John Wiley,
Kenny, D. A. (1985). Quantitative methods for social psychology. In G, IJndzey Д £
Aronson (Eds.), Handbook of Social Psychology (3rd cd., Vol. I., pp. 487-5%). >1
*-»
York: Random House.
Kenny, D.,& Judd,C. M. (1984). Estimating the nonlinear and interactive effect
* of
variables. Psychological Bulletin, 96, 201-210.
Kenny, D. A., & Judd, С. M. (1986). Consequences of violating the independence
sumption in analysis of variance. Psychological Bulletin, 99, 422-431.
Kirk, R. E. (1982). Experimental design: Procedures for the behavioral sciences (2nd ed /.
Belmont, CA: Brooks/Cole.
Kmenta, J. (1986). Elements of econometrics (2nd Ed.). New York: Macmillan.
Lance, С. E. (1988). Residual centering, exploratory and confirmatory moderator analjsir,
and decomposition of effects in path models containing interactions. Applied Psycho­
logical Measurement, 12, 163-175.
Lane, D. L. (1981). Testing main effects of continuous variables in nonadditive models.
Multivariate Behavioral Research, 16, 499-509.
LaRocco, J. M., House, J. S., & French, J. R. P., Jr. (1980). Social support, occupational
stress, and health. Journal of Health and Social Behavior, 21, 202-228.
Lautenschlager, G. J. & Mendoza, J. L. (1986). A step-down hierarchical multiple regres­
sion analysis for examining hypotheses about test bias in prediction. Journal ofApplied
Measurement, 10, 133-139.
Levine, D. W., & Dunlap, W, P. (1982). Power of the F test with skewed data. Should
one transform or not? Psychological Bulletin, 92, 272-280.
Long, J. S. (1983a). Confirmatory factor analysis: A preface to L1SREL. Beverly Hdls.
CA: Sage.
Long, J. S. (1983b). Covariance structure models: An introduction to USREL. Beverly
Hills, CA: Sage.
Lord, F, M,, & Novick, M, R. (1968), Statistical theories of mental test scores. Reading,
MA: Addison-Wesley.
Lubin, A. (1961). The interpretation of significant interaction. Educational and Psycholog­
ical Measurement, 21, 807-817.
Lubinski, D., & Humphreys, L. G. (1990). Assessing spurious “moderator effects” III»»'
trated substantively with the hypothesized (‘synergistic’) relation between spatial and
mathematical ability. Psychological Bulletin, 107, 385-393.
Maddala, G. S. (1977). Econometrics. New York: McGraw-Hill.
Mansfield, E, R,, Webster, J. T., & Gunst, R. F. (1977). An analytic variable selector
technique for principal component regression. Applied Statistics. 26, 34-40, J
Marascuilo, L. A., & Levin, J. R. (1984). Multivariate statistics in the social sciences-
Belmont, CA: Brooks/Cole.
Marquardt, D. W. (1980). You should standardize the predictor variables in your regression
models. Journal of the American Statistical Association, 75, 87-91.

Marsden, P. V, (1981). Conditional effects in regression models. In P. V. Mardsen (Ed.),


Unear Models in Social Research (pp. 97-116). Beverly Hills, СЛ: Sage.
McCIcaty, R. & Hay, R. A., Jr. (1980). Applied time series analysis. Beverly Hills, CA:
f Sage.
Mooijaart, A., & Bcntler, P. M. (1986). Random polynomial factor analysis. In E. Diday
et al. (Eds.), Data analysis and informatics (pp. 241-250). Amsterdam: Elsevier Sei-
ence.
Morris, J. H., Sherman, J. D., & Mansfield, E. R. (1986). Failures to detect moderating
effects with ordinary least squares—moderated multiple regression: Some reasons and
a remedy. Psychological Bulletin, 99, 282-288.
Morrison, D.F. (1976). Multivariate statistical methods (2nd cd.). New York: McGraw-
Hill.
Mostcller, F., & Tukcy, J. W. (1977). Data analysis and regression: A second course in
statistics. Reading, MA: Addison-Wesley.
Myers, J. L. (1979). Fundamentals of experimental design. Boston: Allyn & Bacon.
Netcr, J., Wasserman, W., & Kutner, M. H. (1989). Applied Linear Regression Models
(2nd ed.). Homewood, IL: Irwin.
Nunnally, J. C. (1978). Psychometric Methods (2nd ed.). New York: McGraw-Hill.
O’Brien, R. G., & Kaiser, M. D. (1985). MANOVA method for analyzing repeated meas­
ures designs: An extensive primer. Psychological Bulletin, 97, 316-333.
Oldham, G. R., & Fried, Y. (1987). Employee reactions to workplace characteristics. Jour­
nal of Applied Psychology, 72, 75-80.
Overall, J. E., Lee, D. M., & Hornick, C. W. (1981). Comparisons of two strategies
for analysis of variance in nonorthogonal designs. Psychological Bulletin, 90, 367-
375.
Overall, J. E., & Spiegel, D. K. (1969). Concerning least squares analysis of experimental
data. Psychological Bulletin, 72, 311-322.
Overall, J. E., Spiegel, D. K., & Cohen, J. (1975). Equivalence of orthogonal and non-
orthogonal analysis of variance. Psychological Bulletin, 82, 182-186.
Paunonen, S. V., & Jackson, D. N. (1988). Type I error rates for moderated multiple
regression analysis. Journal of Applied Psychology, 73, 569-573.
Pedhazur, E. J. (1982). Multiple regression in behavioral research. New York: Holt, Rine­
hart &. Winston.
Peixoto, J. L. (1987). Hierarchical variable selection in polynomial regression models. The
American Statistician, 41, 311-313.
Potthoff, R. F. (1964). On the Johnson-Neyman technique and some extensions thereof.
Psychometrika, 29, 241-256.
Rao, C. R. (1973). Linear statistical inference and its applications. New York: John Wiley.
Rogosa, D. (1980). Comparing nonparallel regression lines. Psychological Bulletin, 88,
. 307-321.
Rogosa, D. (1981), On the relationship between the Johnson-Neyman region of significance
and statistical tests of parallel within group regressions. Educational and Psychological
Measurement, 41, 73-84.
Schmidt, F. L. (1973), Implications of a measurement problem for expectancy theory re­
search. Organizational Behavior and Human Performance, 10, 243-251.
Simonton, D. K. (1987). Presidential inflexibility and veto behavior: Two individual-situ­
ational interactions. Journal of Personality, 55, 1-18.
; Smith, K, W. & Sasaki, M. S, (1979). Decreasing multicollinearity: A method for models
with multiplicative functions. Sociological Methods and Research, 8, 35-56.


Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In K. F. Schuessler (Ed.), Sociological methodology. San Francisco: Jossey-Bass.
Sockloff, A. L. (1976). The analysis of nonlinearity via linear regression with polynomial and product variables: An examination. Review of Educational Research, 46, 267-291.
Southwood, K. E. (1978). Substantive theory and statistical interaction: Five models. American Journal of Sociology, 83, 1154-1203.
Specht, D. A., & Warren, R. D. (1975). Comparing causal models. In D. R. Heise (Ed.), Sociological methodology. San Francisco: Jossey-Bass.
Stevens, J. P. (1984). Outliers and influential data points in regression analysis. Psychological Bulletin, 95(2), 334-344.
Stimson, J. A., Carmines, E. G., & Zeller, R. A. (1978). Interpreting polynomial regression. Sociological Methods and Research, 6, 515-524.
Stine, R. (1990). An introduction to bootstrap methods. In J. Fox & J. S. Long (Eds.), Modern methods of data analysis (pp. 325-374). Newbury Park, CA: Sage.
Stolzenberg, R. M. (1979). The measurement and decomposition of causal effects in nonlinear and nonadditive models. In K. F. Schuessler (Ed.), Sociological methodology. San Francisco: Jossey-Bass.
Stolzenberg, R. M., & Land, К, C. (1983), Causal modeling and survey research, In p
H. Rossi, J. D. Wright, & A. B. Anderson (Eds.), Handbook of survey research Ipp
613-675). New York: Academic Press.
Stone, E. F., & Hollenbeck, J. R. (1984). Some issues associated with the use of moderated
regression. Organizational Behavior and Human Performance, 34, 195-213,
Stone, E. F., & Hollenbeck, J. R, (1989), Clarifying some controversial issues surrounding
statistical procedures for detecting moderator variables: Empirical evidence and related
matters. Journal of Applied Psychology, 74, 3-10.
Tate, R. L. (1984). Limitations of centering for interactive models. Sociological Methods
and Research, 13, 251 -271.
Tatsuoka, M. M. (1975). The general linear model: A "new" trend in analysis of variance
Champaign, IL: Institute for Personality and Ability Testing,
Teghtsoonian, R. (1971). On the exponents in Stevens’ law and the constant in Ekman's
law. Psychological Review, 78, 71-80.
Thomas, G. B. (1972). Calculus and analytic geometry (4th ed.). Reading, MA. Addison-
Wesley.
Velleman, P. F., & Welsh, R. E. (1981), Efficient computing of regression diagnostics,
American Statistician, 35, 234-242.
Wenger, B., (Ed.). (1982). Social attitudes and psychophysical measurement. Hillsdale,
NJ: Lawrence Erlbaum.
West, S. G. &. Aiken, L. S. (1990). Conservative tests of simple effects. Unpublished man­
uscript, Arizona State University, Tempe, AZ.
West, S. G., & Finch, J. F. (in press). Measurement and analysis issues in the investigation
of personality structure. In S. Briggs, R. Hogan, & W. Jones (Eds.), Handbook of
Personality Psychology. New York: Academic Press.
West, S. G., Sandler, L, Pillow, D. R., Baca, L., & Gersten, J. C. (in press), The use of
structural equation modeling in generative research. American Journal of Community
Psychology.
Winer, B. J. (1971). Statistical principles in experimental design (2nd ed.). New York
McGraw-Hill. . - ...........

Won, E. Y. T. (1982). Incomplete correction for regressor unreliabilities. Sociological Methods and Research, 10, 271-284.
Wong, S. K., & Long, J. S. (1987). Parameterizing nonlinear constraints in models with latent variables. Unpublished manuscript, Indiana University, Department of Sociology, Bloomington, IN.
Wonnacott, R. J., & Wonnacott, T. H. (1979). Econometrics (2nd ed.). New York: John Wiley.
Wright, G. C., Jr. (1976). Linear models for evaluating conditional relationships. American Journal of Political Science, 20, 349-373.
Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit formation. Journal of Comparative Neurology and Psychology, 18, 459-482.

Glossary of Symbols

Symbol Definition Page

Covxz covariance between X and Z 44


C(X, Y)   covariance between X and Y   141
CVZ   conditional value of Z, the value of Z at which the simple regression of Y on X is considered   18
CVW   conditional value of W, the value of W at which the simple regression of Y on X is considered   54
d vector of regression coefficients from principal 171
component regression
df degrees of freedom 6
D, dummy code for identifying group membership 117
effect code for identifying group membership 128
f additive constant 30
f2 effect size 157
G 117
number of levels of a categorical (group) variable
A 126
intercept for group i in slope/intercept computa­
tion
к 16
number of predictors in a regression equation, not
including the regression constant b(i
Mr
multiple regression
mean square residual from analysis of regression 25
n
number of cases in a sample 16
OLS
ordinary least squares %
K2R
principal component regression 166
hi.M, squared semi-partial (or part) correlation of set 1 157
i with the criterion with set M part tailed out
hM
squared partial correlation of set 1 with the trite 157
rl rion with set M partiailed out
squared multiple correlation resulting horn pre 157
diction of criterion by a set of variables M
Vi'/ •••: ' ,, •.
'
• ' ■ . ■ ■. ,
squared multiple correlation resulting from com 157
bined prediction of criterion by a set of variables
1 Ш M plus their interaction 1
;■ ■ ■ ■ .■
'Лй . - ■ ■" ' . ' ■ ■ in hierarchical regression in which predictor j is 106
- !-> : ■ :
■ ' ■' ■ added to predictor i to predict die criterion, the
••• ... .... ...... ■- ■ • , ■ .. •


squared multiple correlation with both predictor i


and j in the equation
in hierarchical regression in which predictor j is 106
R CH»’ added to predictor i to predict the criterion, the
squared multiple correlation with only predictor i
in the equation and j out of the equation
sample estimate of population reliability pxx of 142
rxx» rzz predictor X, of population reliability Pzz of pre­
dictor Z, respectively
zero order correlation between X and У 142
rxr
zero order correlation between X and Y 160
ГХЛ
sample variance covariance matrix of regression 25
Sb
coefficients
standard error of a simple slope 16
</
Sh
standard error of the difference between two sim­ 20
5j
ple slopes
standard deviation of predictor i 45
Si

Si slope for group i in slope/intercept computation 126

Sii
sample variance of unstandardized regression coef- 16
ficient by, i th diagonal element of Sfc, for centered
predictors
sample covariance between unstandardized 16
regression coefficients bj and by, off-diagonal ele­
ment of S/, for centered predictors
sample covariance between unstandardized 34
regression coefficients 6/ and bj; off-diagonal ele­
ment of Sb for uncentered predictors
5|_, ^M»
standard errors of simple slopes of Y on X at ZL. 17
ZM, and ZH, respectively (centered case)
si, 5м, 5h standard errors of simple slopes of Y on X at Zi, 44
Zm, and Z^, respectively (uncentered case)
*
c *
c *
r

5l» 5м» 5h standard errors of simple slopes of Y on X at val- 46 J


ues of standardized Z = -1.0, 0.0, and 1.0, re­
spectively, (i.e., one standard deviation below the
mean of standardized predictor Z, at the mean of
standardized predictor Z, and one standard devia­
tion above the mean of standardized predictor Z)
i ; -л-.’ ; ,-s ■■ ■;«


sample variance-covariance matrix of predictors


Sxx 25
vector of covariances of each predictor with the
Say criterion 171

standard deviation of criterion Y


Sy 45
Tx true score on variable X
140
и a linear combination of regression coefficients 25
Ui vector of component scores on principal compo­ 171
nent i of Sxx 1
weight applied to regression coefficient j in form­ 25
ing a simple slope
JF centered predictor W (in deviation form) 49
weight vector applied to vector of regression coef­ 25
ficients to form a simple slope
WABOVE (W - CVir), where СУц, = 1 standard deviation 58
of PF
(IV - CV>v), where CV(v = -1 standard devia­ 58
WBELOW
tion of W
(IV — CV^), predictor W from which a condi­ 54
tional value CV,v has been subtracted
score on centered predictor Ж one standard devia­ 51
tion below mean of W, one standard deviation
above mean of W, respectively
1
centered predictor X (in deviation form)

sample mean of X 30
uncentered predictor X 153
latent variable X in structural equation m
24
the value of X at which>X
of Y on X at values of Z era , * 32
*
the value of X at which two sW
^entered X
ofYonXatvaluesofZcm^centeredZ v 2
the crossproduct of cente t<)rs A-alld Z' 30
crossproduct of uncen‘^ed P d z. structural 153
product of latent variables л
equation model . _ . % with z 6
crossproduct of square of cente


Х7» civssptvduct of three centered predictors X, Z, and 49


W
XZABOVE pwxhict of X with /.ABOVE 19
XZBFl 0W ptvduct of X with ZBELOW 19
>' predicted score in unstandardized regression cqua- 1
lion
•ч»
<4 centered predictor Z (in deviation form) 1
Чр * unecntcred predictor 7.
4. 30
*Z latent variable Z in structural equation model 153
/ABOVE (7 - CV,.)» where CVZ « 1 standard deviation 19
of Z
ZBELOW v” OV A, where CVZ =« — I standard deviation 19
of Z
the value of Z at which two simple regression lines 24
♦ of Eon Z at values of A’ cross, for centered Z
Zcv (Z - CVZ), predictor Z from which a conditional 18
value CV? has been subtracted
Zj.» *-М » *- ,tt scores on centered predictor Z one standard de­ и
viation below the mean of Z. at the mean of Z.
and one standard deviation above the mean of Z.
respectively •
zi.ZM.z;, score on uncentered predictor Z' one standard de­ 33
viation below the mean of Z‘, at the mean of Z‘, -■'■J
and one standard deviation above the mean of 7.\ ■ '-'I
respectively J
; -я
.
t* к/ standardized predictor«Vand Z, respectively, based 43 ■ 3
on centered X and Z
standardized predictor A’ and Z, respectively, based 43
on unecntcred X and Z
*А 'z crossproduct of z-scores on centered predictors X 43
and Z
«
*r predicted standardized score from standardized 43
regression equation
residual (У - F) 25
■ ■is
<x measurement error in observed score on variable 140
X
.i


X. factor loading in measurement model (structural 152


equation modeling)
\ characteristic roots of covariance matrix of pre - 171
dictors
Hz population means of X and Z, respectively 144
Pxx, Pzz population reliabilities of X and Z, respectively 141
PXY population zero order correlation between X and Y 142
Pxz.xz population reliability of crossproduct term XZ 144
population variance-covariance matrix of the 25
regression coefficients
population variance of a simple slope 25
population variance of the residuals, L(Y — Y)2 2S
population variance of the measurement errors on 141
predictor X and Z, respectively
OiXZ population variance of the measurement errors on 147
crossproduct XZ
au population covariance between two regression 26
coefficients /?, and bj
population variance of regression coefficient j 26
°r, ■ variance of true scores on predictor X 141
”x> /т
а2
Qz2 population variance of predictors A", Z, respec­ 44 .
tively
„2
°XZ population variance of crossproduct term XZ 44

Author Index

Abraham, L. M., 139 Campbell, D. T., 148,191


Aiken, L S., 22
Carmines, E. G., 98
Allison, P. D., 4,101,110 Carter, T.M., 142
Althauser, R. P., 6
Champoux, J. E., 139,170
Alwin, D. F., 152,170
Chaplin, W.F., 139,156,169
Appelbaum, M. I., 38,101,102,104,109 Chemers, M. M., 3
Arnold, H. J., 6
Chou, С. P., 153
Arvey, R. D., 139 Cleary, P. D., 36,37,39,101,110
Atkinson, A. C., 96,175 Cobb, S., 2
Cohen, J,, 3-6,8,13,22,29,35,45, lOl-UE.
110, 113, 116, 142, 148, 154-158, 160, |
Baca, L, 153
165-166,168,171,173,177,183
Barb, K.H., 168 Cohen, P., 5, 8, 13, 22, 45, 101, 110,116.
Bclslcy, D. A., 97,175
142,148,154-155,168,173
Bender, P.M, 152,153,155 Cook, R. D„ 97,175
Berk, R. A., 175 Cox, D. R., 96
Blalock, H.M., Jr., 4,5 Cramer, E. M., 38,101-102,104,109
Воск, M.E., 115 Cronbach, L, J„ 3,137,139,156,167,169
Bohmstedt, G. W., 140, 142-144, 147-148, Cronkite, R. C„ 36,38-39,102,104-105
161,177-178 I
Bollen, K. A., 97,151,153,168,175
Borich, G. D., 137 Daniel, С.» 96,175
Box, G. E. P., 96 Darlington, R. B., 5, 10, 14, 18, 22, 37S&
Browne, M. W., 153 92,101-102,107-108,110,116,130,155 |
Busemeyer, J. R., 144,146,155-156,171 Dodson, J. D„ 3,62
Byrne, В. M., 152 Domino, G.,3 W


Dou, T., 2 Huynh, II., 175


Duncan, O. D., 140,170
Dunlap, W. P., 94, 139, 154-156, 165-167,
169,175 Jaccard, J., 14,42,158-160
Jackman, R. W., 97,175
Jackson, D. J., 152,170
England, P., 2 Jackson, D. N., 139,165-167,169
Etezadi-Amoli, J., 155 Johnson, P. O., 132-134,136-137,188
Evans, M. G., 6,139,154 156,158,165,167 Jones, L. E., 144,146,155-156,171
JOrcskog, K. G., 151-152
Judd, С. M., 18,36,99,146,153,170,175
Farkas, G., 2 Judge, G.G.,96,115,175
Fay, L.C., 132
Feucbt,T. E., 148,150,154-155,171
Fiedler, F. E., 3 Kaiser, M. D., 175
Finch, J. F., 148,153 Kemery, E. R., 94, 139, 154-156, 165-167,
Finney, J. W., 36,38-39,102,104-105 169
Fisher, G. A., 61,110 Kenny, D. A., 5, 8, 99, 140, 142, 146, 153,
Fiske, D. W., 148 170,175
French, J. R. P., Jr., 2 Kessler, R. C., 36-37,39,101,110
Fried, Y.,3 Kilbourne, B. S., 2
Friedrich, R. J., 4,6,14, 28,42, 44-45,47 Kirk, R. E.,71
Fuller, W. A., 146,148,150,170 Kmenta, J., 24,61,96,175
Kuh, E., 97,175
Kutner, M. H., 5,33,94,96,110
Gaebelein, J., 101
Gallant, A. R., 96
Gersten, J. C., 153 Lance, С. E., 36
Goldberger, A. S., 177-178 Land, К. C., 26
Griffiths, W. E., 96,115 Lane, D. L.,22,101
Gulliksen, H, 140,146 LaRocco, J. M., 2
Gunst, R. F., 171 Lautenschlager, G. J., 110
Lee.D.M., 101-102,104
Lee,T, C., 96,115,175
Hay, R. A., Jr., 175 Levine, D. W., 175
Hayduk, L. A., 151,153 Long, J. S., 151,153
Hays,W. L., 178 Lord, F. M., 146,161
Heise, D.R., 140,148-151,170 Lubin, A., 22
Herr, D, G., 101 Lubinski, D„ 93,139
Hidiroglou, M. A., 146,150 Lulkepul, H.,96,115,175
Hill, R. C., 96,115,175
Hollenbeck, J. R., 110,156
Hornick, C. W., 101-102,104 Maddala, G. S., 25,142
House, J. S., 2 Mahar, L„ 3
Huitema, В. E., 133,137,139 Mansfield, E. R., 156,167-169,171
Humphreys, L. G., 93,139 Marquardt, D. W., 32,36,38,94


Marsden. P. V.» 143 Smith, K. W., 36


MatweU.G„ 143-145,147-148,161 Snow, R. E., 3,137,156
Maxwell. S.Ik 139 Sockloff, A, L„ 6,28
McCleary, R., 175 Sdrbom.D.. 151-152
McClelland, G. H, 18,36 Southwood, К. E., 4
McDonald. R. Р.» 155 Spiegel, D. K., 101,169
Mendoza, J. U, 110 Stevens, J. P., 97,175
Mitchell, R. Е.» 36,3S-39,102,104-105 Stimson, J. A., 98
Mooijaart, A., 155 Stine, R., 22
Moos, R. H„ 36,38-39,102,104-105 Stolzcnbcrg, R. M., 26
Moms, J. H„ 156,167-169,171 Stone, E.F.,110,156
Morrison, D. F., 25
MutMn,B., 152
Tate, R. L., 36
Tatsuoka, M. M., 173
Net», J., 5,33,94,96,110 Teghtsoonian, R., 23
Novick, M.R., 146,161 Thomas, G. B., 98
Nunnally, J. Q, 146 Turrisi, R., 14,42,158-160

O’Brien, R. G., 175 Velleman, P. F., 175


Oldham, G. R., 3
Overall, J. E., 101-102,104,169
Wan, C.K., 14,42,158-16
Wasserman, W., 5,33,94,
Paunonen, S. V., 139,165-167,169 Webster, J. T., 171
Pedhazur, E. J., 5, 35, 101-102, 107, 110, Weisberg, S., 97,175
116,130,134 Welsh, R. E„ 97,175
Peters, W. S., 139,170 West, S. O., 22,148,153
Piexoto, J. L, 101,110-112,114 Winer, B. J., 54,71
Pillow, D. R., 153 Won, E. Y.T., 148
Potthoff, R. F., 134,137,188 Wong, S. K., 153
Wonnacott, R. J., 96
Wonnacott, T. H., 96
Rao, C. R., 27 Wood, F. S., 96,175
Rogosa, D., 132,134 Wright, G.C., Jr., 4
Wunderlich, K. W., 137

Sandler,!., 153
Sasaki, M. S., 36 Yerkes, R. M., 3,62
Schmidt, F. L., 6,28
Shavelson, R. J., 152
Sherman, J. D., 156,167-169,171 Zeller, R. A., 98
Simonton, D. K., 3

Subject Index

ANOVA, comparison with multiple regres­ and X2, XZ, 35; criterion, 35; effect on
sion, 70-72,172-176 interpredictor correlation, 32-33; effect
Assumptions of regression analysis, 25 on simple slopes, 33; expected value,
variance, and co^ir-ззсе cf product
terms, 181-182; interpretation
Bonferoni procedure, 133,137 of first order coefficients, 39-
41; interprstetiG
* of order regression
coefficients, 37-39; hsesr by linear
X Z

Categorical by continuous variable inter­ interaction, 11-18; muluccilinearity in


actions, 109-110, 116-138; crossing point, predictor covariance matrix, 35-36, 49;
124-127; curvilinear (higher order) inter­ numerical example, centered predictors,
actions, 126-127; ordinal versus disordinal 11-18; reliability of crossproduct term,
interactions, 125-126; PotthofTs extension 144; standardized solution, 40,42-43. See
of Johnson-Neyman technique, 134-137; also Invariance of regression coefficients,
regions of significance, 134-137; standard Transformations of predictors
errors of simple slopes, 131-138; test of Comparison group (dummy variable coding),
differences between regression lines, 132- 117,131,133,138
133; r-test simple slopes, 131
of Computer analysis of simple slopes, 18-20,
Categorical predictor variables, 109-110, 54-58, 89-92, 131-133; categorical by
И6-138 continuous variable interactions, 131-
Centering (deviation scores), 9, 11-18, 28- 133; conditional values, 18; curvilinear
43, 49, 144, 181-182; additive trans­ (higher order) interactions, 89-92; linear
formation of predictors, 30; advantage for X by linear by linear IV interaction, 54-
Z

interpretation of first order regression 58; linear Xby linear interaction, 18-
Z

coefficients, 37-38; categorical predictor 20; standard errors of simple slopes, 19,
variables, 130; centered predictors 58; r-test for simple slopes, 19,58
defined, 9; centered versus uncentered Conditional effects, 10, 37-38, 50, 76, 102-
predictors, 28-34; correlation between X 105; centering and, 37; conditional inter-

207

actions, 50; first order conditional effects, Effects coding, 127-130; comparison with
37-38, 102-105; interpretability and pre­ dummy coding, 128-130; unweighted,
dictor scaling, 37-38. Sec also First order 127-129; weighted, 130
regression coefficients, Interactions Effect size, 157-161, 162; impact of
Conditional values of predictors, 18,58,89 measurement error on Interaction, 161,
Corrected estimate of regression coefficients, 162; relation to semipartial correlation,
145-154, 171; correction assuming 158-159; reliability, 161-162; small'

correlated measurement error, 148-150; moderate, large defined, 158. See


correction based on classical measure­ Measurement error, Power
ment theory, 146-148; correction based Expected values of crossproduct terms, 173­
on latent variable models, 151-153; 182; covariance of crossproduct term with
correction with constrained covariance another variable, 179-180, covariance of
matrix, 150-151; Fuller’s correctcd/con- crossproduct term with component, 13Q.
strained estimators, 150-151; Heise’s 181; mean of crossproduct term, 173;
corrected estimators, 148-151; problem of mean, variance, and covariances of сгок-
ill-conditioned matrices, 150, 171; product terms with centered variables, 18L
strategies for correction, 145-153. See 182; variance of crossproduct term, 179
also Measurement error
Crossing point of simple regression lines,
23-24, 32, 58-59, 82-84, 88-89, 124­ First order regression coefficients, 28-31,33,
127; categorical £ by continuous Z 37-41, 49, 61,101-105, Ш-113; as aver­
interaction, 124-227; centered versus age effects, 38-39; as conditional effects,
uncentered prefer, 32; curvilinear X 37-39; as main effects, 38; failure of in­
(higher order) interactions, 82-84, 88­ variance with higher order terms, 28-3L
89; linear X by fexr Z by linear W 33; geometric interpretation, 39-41; inda-
interaction, 58-59; linear X by linear Z sion of in equation containing interaction,
interaction, 23-24; regression of Y on X 49-61; interpretation in presence of inter­
at Z in XZ interaction, 24; regression of action, 37-40,101-102; testing in reduced
Y on Z at X in XZ interaction, 24; two models, 103-105, 111-113. See also In­
linear continuous variables, 23-24 variance of regression coefficients
Crossover versus noncrossover interaction, Fuller’s correction for measurement error,
22-23. See Ordinal versus disordinal 150-151
interactions
Curvature, direction of, 65-66,76,79,98
Curvilinearity versus iteraction, 92-93 Geometric interpretation of regression
CurvilinearX, linear Z, and linear X by linear coefficients, 39-40
Z interaction, 68-69,79-84 Global tests, 105-112; ANOVA-likc effect.
Curvilinear X, linear Z, curvilinear X by 108-109; curvilinearity of regressk'fi,
linear Z interaction, 69-70,84-89 108; effect of one variable, 108; equival­
Curvilinear X and linear Z relationship, 67­ ence of two regression equations, 109­
68,78-79 110; linearity of regression, 107-112
Curvilinear A' relationship (X2), 63-65,72-78 ■

Ill-conditioned predictor covariance matri­


Dichotomization of continuous predictors, 4, ces, 35-36; essential versus nonessential ,
167-168 ill-conditioning, 35-36. See also Multi- )
Dummy variable coding, 116-117 collinearity


Inclusion of lower order terms In equations Main effect», 38-39,63, 70-71,103. See aho
with interactions, 49, 61, 93-95 Inter­ First order regression coefficients
actions, 9-27, 29, 36, 40. 42-44, 49-61, Matrices, 150, 171; matrix Indefinite-new,
*7
69 0. 79-89, 100-102, 123-127; 171; positive definite matrices, 150,171;
categorical and continuous variables, positive wmldefinlte matrices, 171
123-127; conditional interactions, 50; Matrix solution for standard error of simple
crossover versus noncrossover, 22; regression coefficients, 25-27,54,61,78,
curvilinear (higher order) Interactions, 79, 82, 85, 86,98-99,131, 138; categori­
69-70, 70-89; inclusion of lower order cal by continuous variable, 131, 138;
terms in regression equation containing, curvilinear X, linear Z, and curvilinear X
49,61,93-95; interpretation in regression, by linear Z interaction, 86,99; curvilinear
9-10, 36; linear A
* by linear Z by linear IV X, linear Z, and linear X by linear Z inter­
interaction, 49-61; linear X by linear Z action, 82-83, 98; curvilinear X, linear Z
interaction, 9-27; ordinal versus disordi- relationship, 79, 98; curvilinear X
nal, 22-23, 83; overlapping variance with relationship, 78,98; linear and curvilinear
first order terms, 100-102; standardized coefficients, 85, 98; linear X by linear Z
regression analysis, 40,42-44 by linear IV interaction, 54, 61; linear X
Interpretation of first order regression by linear Z interaction, 25-27
coefficients. Sec First order regression Maximum or minimum of curve, 65, 74,75-
coefficients 76, 79, 82, 86-88; curvlincar X, linear Z,
Interpretation of interactions, 9-10, 36; and curvilinear X by linear Z relationship,
curvilinear (higher order) interactions, 86-88; curvilinear X, linear Z, and linear
69-70; linear X by linear Z interaction, 9- X by linear Z relationship, 82; curvilinear
10,36; rescaling of predictors, 29 X, linear Z relationship, 79; curvilinear X
invariance of regression coefficients, 28-31, relationship, 74,75-76
33,36,40-45,48,183-187; algorithm for Measurement error, 139-155, 160-167;
identifying invariant regression attenuation bias, 141-142; bias in regres­
coefficients, 183-187; centered versus un- sion coefficients, 141-144; classical
centered standardized solution, 40,42-43; measurement theory, 140-141; corrected
failure of invariance with higher order estimates of regression coefficients, 145-
terms, 28-31, 33, 177; highest order term 154; correlated errors in product terms,
in regression equation, 29, 30, 33, 48, 143; effect sizes for Interactions, 161-
177; invariance with no higher order 162; regression with product terms, 142»
terms, 29-30, 33; standardized solution 144; sample size requirements for
with higher order terms, 36, 40-45; detecting interactions, 163-164; variance
standardized versus unstandardized accounted for by interactions, 161, 162.
coefficients, 36. See oho Centering See aho Corrected estimates of regres­
sion coefficients, Reliability of predictors
Median splits on predictors, 4,167-168
Johnson-Neyman technique, 132-137 Muhicollinearhy, 32-33, 35-36, 49; effect of
predictor scaling, 32-33,35
*36,49, essen­
tial versus nonessential ill-conditioning,
Latent variable structural modeling, 151- 35-36
153; interactions, 153; regression es­ Multiple correlation and effect size, 157-159
timates corrected for measurement error,
151-153; structural coefficients, 152
? Linear combinations of regression Numerical examples, 10-12, 13-18, 32-34,
r, coefficients, 25 50-53, 55-59, 67, 117-126; categorical


and continuous variables, 117-126; cen­ requirements for detecting interactions,


tered versus uncentered predictors, 32-34; 158-159, 161, 163-164; simulations for
curvilinear (higher order) interactions, 67; power of interaction, 165-167; statistical
linear X by linear Z by linear IF interac­ power analysis, 156-160
tion, 50-53, 55-59; linear X by linear Z Principal component regression analysis,
interaction, 10-12,13-18,33-34 168-169

Ordinal versus disordinal interactions, 22-23, Range of a variable, 22-23; dynamic range
29, 31-32, 34, 125-126; categorical by of a variable, 23; meaningful range of a
continuous variable interactions, 125­ variable, 23
126; dynamic range of a variable, 23; in­ Regression analysis, 3-5, 93-95,96-97,110­
variance and predictor scaling, 29,31-32, 114; as general data analytic strategy, 3.
34; meaningful range of a variable, 23 5; exploratory, 96-97; inclusion of terms,
Ordinary least squares regression, 25; as­ 93-95, 110-111; sequential model revi­
sumptions, 25 sion, 111-113; step-up versus step-down
approaches, 113; stepwise, 114
Regression coefficient, 9-10, 37-40, 42-43,
Partial correlation and effect size, 157-159 45; interpretation of first order
Path coefficients, 170 coefficients, 9-10, 37-40; standardized
Plotting interactions, 12-14, 15, 52-53, 66, solution, 40,42-43,45
68, 123; categorical by continuous vari­ Regression equation, 9-10, 62-99, 118-126;
ables, 123; centered, uncentered data categorical and continuous variables, 119­
compared, 15; curvilinear (higher order) 123; categorical and continuous variables
interactions, 68; curvilinear (Z2) relation­ and their interactions, 123-126; categorical
ship, 66; linear X by linear Z by linear W variable only, 118-119; other forms of non­
interaction, 52-53; linear X by linear Z linearity, 95-96; with higher order
interaction, 12-14; theory and structuring (curvilinear) relationships, 62-99; with
graphs, 52 interaction, 9-10; with no interaction, 10
Post hoc probing of interactions, 12-24, 52­ Reliability of predictors, 139-155, 160-167;
53, 72-79, 130-137; curvilinear (higher definition of reliability, 141; reliability of
order) interactions, 72-89; plotting inter­ crossproduct term, 144-145; statistical
actions, 12-14, 52-53; post hoc tests, 14­ power of interactions, 161-167. See also
16, 130-137. See also Post hoc tests on Measurement error
interactions Residuals, 25; assumptions, ordinary least J
Post hoc tests on interactions, 14-26, 54-58, squares regression, 25
. ■ ■ S lllll я
72-89; curvilinear (higher order) inter­
actions, 72-89; differences between sim­ . ■ ? ■ ■ ■ ■ ■
ple slopes, 19-20; linear X by linear Z by Sample size requirements for detecting inter- j
linear IV interaction, 54-58; linear Д' by actions, 158-159, 161, 163-164; tables
linear Z interaction, 14-26 159,164
Power (statistical), 95, 139-140, 156-169; SAS computer package, 27, 137, 188-189; J
impact of measurement error for inter­ PROC REG, 27; program for Potthoff ex« j
actions, 160-167; interactions, 139-140, tension of Johnson-Neyman technique,
156-169; loss with dichotomization of 137,188-189; variance covariance matrix |
predictors, 167-168; low power for inter­ of predictors, 27 ।
actions, 139,156; predictor reliability and Scale free terms, algorithm for identificato g
interactions, 160-167; sample size Ш-112,183-187 3

Scaling of predictors. See Centering. Invari­
of, variances of simple slopes in
ance of regression coefficients, Trans­
curvilinear interactions, 64; table of, vari­
formations of predictors
ances of simple slopes in linear inter­
Semipartial correlation and effect size, i 57­
actions, 60
159 Standardized solution with interactions, 40
*
Simple regression equation, 12-14, 29, 31,
48; appropriate standardized solution, 44;
*75;
33, 36, 50-51, 73 categorical by con­
crossptoduct terms with standard scores,
tinuous interactions, 124-125, 129; ccn- 43; failure of invariance of regression
tered versus uncentcred predictors, 29,
coefficients, 40, 42-43, 48; raw versus
31, 33; choosing values of Z, regression
*47;
standardized solution, 45 simple
of Y on A', 12-13; curvilinear A' rclation- slope analysis, 44-45; standardized
ship, 73
*75; effects of predictor trans­
solutions from centered versus uncentcred
formation, 29, 31, 33; interpretation, 36; data, 40,42
linear A’ by linear Z by linear IV interac­ Structural coefficients, 152,170
tion, 50-51; linear X by linear Z interac­
tion, 13-14
Simple slope, 12, 16-21, 22, 29, 31, 33, 36,
Theory, role in structuring regression
*38,
37 42, 44-45, 48, 50-51, 54-58, 60,
equations, 70-72, 93-96, 103-105, 110,
64, 73-75; bias in tests of significance,
173-174
22; computer analysis, 18-21, 54-58; Transformations of predictors, 28-29,32-33,40,
curvilinear (higher order) interactions,
177-187; additive transformations, 28-29,
73-75, 79, 80-83, 86; definitions of, 73­
32-33, 177-182; algorithm for identifying
75; effect of predictor scaling, 29,31,33;
scale-independent regression coefficient, 40,
first derivative and, 73-75; first order
183-187, See also Centering
regression coefficients and, 37-38; inter­
r-tests, 16-18, 19, 34, 40, 46, 47, 54, 58, 83,
* by linear Z by
pretation, 36, 48; linear A 86, 131; effect of predictor scaling on
linear X interaction, 50-51; linear X by simple slope tests, 34; simple slopes in
linear 2 inteaction, 16-18; standardized
categorical by continuous variable inter­
regression equation, 42, 44-45; table, action, 131; simple slopes in curvilinear
higher order interactions, 64; table, linear
(higher order) interactions, 83,86; simple
interactions, 60
slopes in linear Af by linear Z by linear W
SPSS-X computer package, 27, 92;
interactions, 54, 58; simple slopes in
REGRESSION, 27; variance-covariance
linear ATby linear Z interaction, 16-18,19;
matrix of predictors, 27
standardized regression coefficients, 40,
Spurious regression effects, 154-155
46, 47. See also Standard error of simple
*18,
Standard error of simple slope, 16 19,
• -■ slopes ’ /•. ■;/'■'? './■.■ Д:;Д
24-26, 34, 46, 47, 54-58, 60, 64, 77-78,
1’83, 86-87/89-92, 131; categorical by
continuous variable interactions, 130­
Unbiasedness versus efficiency, 103-104
‘32; computer calculation, 19, 58, 89-92,
1» curvilinear (higher order) inter-
««ions, 64, 77-78, 81-83, 86-87; deriva-
Variance-covariance matrix of regression
l’on of standard error, 24-26; effect of
coefficients, 16
*17, 25, 27, 45, 131;
<^‘ct9r s^lhg, 34- iiQBar x by linear Z
standardized solution, 45
У linear Winteraction, 54-58;;4loear X
Variance of simple slope, 24-26, 60, 64;
* near ‘nterac,’on».16-18; regression
of v j general expression, 25 *26; table of,
on AT at Z, 16; regression of Y on Z at curvilinear (higher order) interactions, 64;
/ standardized solution, 46, 47; table table of, linear interactions, 60
