Midwest Political Science Association is collaborating with JSTOR to digitize, preserve and extend access to
American Journal of Political Science.
Modeling Multilevel Data Structures
Multilevel data are structures that consist of multiple units of analysis, one nested within the other. Such data are becoming quite common in political science and provide numerous opportunities for theory testing and development. Unfortunately, this type of data typically generates a number of statistical problems, of which clustering is particularly important. To exploit the opportunities offered by multilevel data, and to solve the statistical problems inherent in them, special statistical techniques are required. In this article, we focus on a technique that has become popular in educational statistics and sociology: multilevel analysis. In multilevel analysis, researchers build models that capture the layered structure of multilevel data, and determine how layers interact and impact a dependent variable of interest. Our objective in this article is to introduce the logic and statistical theory behind multilevel models, to illustrate how such models can be applied fruitfully in political science, and to call attention to some of the pitfalls in multilevel analysis.

Political scientists frequently work with data measured at multiple levels, for example, individual-level survey data, and aggregate contextual or demographic data. As such, the use of multilevel data structures is common in political science research. Moreover, many theories and hypotheses in political science hinge on the presumption that "something" observed at one level is related to "something" observed at another level. Yet despite the prevalence of cross-level or multilevel data and theories of political behavior, statistical methods to exploit the information found in multilevel data structures have not been widely used.

In this article, we discuss the statistical modeling issues associated with the analysis of multilevel data. Our intention is both to provide an introduction to multilevel modeling and to review the relevant statistical literature on multilevel models. The article is structured as follows. First, we consider the substantive motivation leading to models for multilevel data. Next, we discuss the statistical consequences of ignoring the multilevel structure of data. In the next section, we explain the problems of conventional approaches to analyzing these kinds of data. This is followed by a discussion of the multilevel model, its different variants and model estimation. We then provide an application of a multilevel model, examining the issue of public support for the European Union. We conclude by discussing some general considerations in multilevel modeling.

Marco R. Steenbergen is Associate Professor of Political Science, University of North Carolina at Chapel Hill, 308 Hamilton Hall, CB# 3265, Chapel Hill, North Carolina 27599-3265 (Marco_Steenbergen@unc.edu). Bradford S. Jones is Associate Professor of Political Science, University of Arizona, 315 Social Sciences, Tucson, Arizona 85721 (bsjones@email.arizona.edu).
An earlier version of this paper was presented at the 14th Annual Meeting of the Political Methodology Society, Columbus, OH. The authors would like to thank the participants in this meeting, in particular, Gary King and Jeff Lewis, for their helpful comments. We also thank Bill Dixon, Bob Erikson, David Lowery, Stuart Macdonald, Bill Mishler, George Rabinowitz, Michael Sobel, and Jim Stimson for comments on various versions of this paper. Finally, we thank the anonymous reviewers for their very insightful comments.

American Journal of Political Science, Vol. 46, No. 1, January 2002, Pp. 218-237
©2002 by the Midwest Political Science Association
The main strength of interactive models is that they allow the analysis of substantively interesting predictors in order to account for causal heterogeneity. Thus, interactive models exploit the theoretical opportunities that multilevel data structures offer. This is a major improvement over dummy variable models. On another dimension, however, interactive models are inferior to dummy variable models. Implicitly, interactive models assume that the subgroup level predictors fully account for subgroup differences. This is so because interactive models incorporate random error only at the lowest level of analysis; at the higher levels of analysis (i.e., subgroups) the error components are assumed to be zero. This is a very strong assumption that will usually prove to be false. Unfortunately, when false, the interactive model does not avoid the statistical issues associated with clustering of the data.

It appears, then, that existing models make trade-offs between the opportunities and challenges of multilevel data structures. Dummy variable models meet the statistical challenges, but they fail to exploit the theoretical opportunities multilevel data offer. Interactive models take advantage of these theoretical opportunities, but they usually fall short on solving the statistical problems. Clearly, both of these conventional approaches have limitations. Unfortunately, combining these approaches to offset their weaknesses is not possible because there are insufficient degrees of freedom to estimate both contextual effects and dummy variables. Fortunately, there is an alternative modeling approach, which we now consider.

The Multilevel Model

Recent developments in educational research (e.g., Bryk and Raudenbush 1992; De Leeuw and Kreft 1986; Goldstein 1995; Kreft and De Leeuw 1998; Longford 1993; Snijders and Bosker 1999) and sociology (e.g., Mason, Wong, and Entwisle 1983) have resulted in the formulation of multilevel or hierarchical models. In the political methodology literature, the pioneering work of Jackson (1992) led to the consideration, development, and presentation of a random coefficients model, which provides the foundation for the models considered here.8 These models are closely related to several older methodological traditions such as random coefficient models (see Rubin 1950; Swamy 1970; Swamy and Tavlas 1995) and variance components analysis (see Searle, Casella, and McCullogh 1992).

To fix ideas, we first consider a multilevel model for the case involving two levels of analysis. We refer to these levels as "level-1" and "level-2" and assume the level-1 observations are nested within level-2 units. In applied research, level-1 observations may refer to individuals and level-2 units may refer to contextual units, for example, countries. We start by considering a model with one level-1 and one level-2 predictor, which are observable and have a linear relationship with the level-1 dependent variable (also observable). Later we generalize this presentation to encompass more complex model specifications, although the focus in this paper remains exclusively on linear models.9

A Simple Linear Multilevel Model

We start by positing the following level-1 model:10

$$y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \epsilon_{ij} \tag{1}$$

Here $y_{ij}$ is the level-1 dependent variable for a level-1 unit $i$ ($= 1, \ldots, N_j$) nested in level-2 unit $j$ ($= 1, \ldots, J$). Further, $x_{ij}$ is the level-1 predictor and $\epsilon_{ij}$ is a level-1 disturbance term. This model is identical to the simple regression model with the important difference that the regression parameters are not fixed but, instead, vary across level-2 units (as indicated by the $j$-subscripts on the regression parameters $\beta$). The introduction of these variable coefficients sets multilevel models apart from most other statistical models used in political science.

To see these differences more clearly, we can directly model the variation of the level-1 regression parameters in (1) as a function of level-2 predictors, such that

$$\beta_{0j} = \gamma_{00} + \gamma_{01} z_j + \delta_{0j}, \tag{2}$$

8 Other political methodologists who have contributed to multilevel analysis include Jones, Johnston, and Pattie (1992), Achen and Shively (1995), Beck and Katz (1995), and Western (1998). Zorn's (2001) work also follows in this tradition.

9 There has been tremendous progress in the development of nonlinear and latent variable multilevel models. An adequate discussion of these models would require another paper, however, so we leave the reader with some references to the relevant literature. Goldstein (1991, 1995), Hedeker and Gibbons (1994), and Gibbons and Hedeker (1997) develop multilevel models for categorical data. A discussion of multilevel event count and event history models can be found in Goldstein (1995). The idea of multilevel latent variable models was first probed in papers by Joreskog (1971) and Sorbom (1974). Recently, Muthen (1989, 1991, 1994), McDonald (1994), and Chou, Bentler, and Pentz (2000) have expanded the statistical theory for such models, making it possible to conduct multilevel factor analyses (see Li et al. 1998, Muthen 1991) and multilevel simultaneous equation analyses.

10 Our development and notation of the multilevel model draw liberally from the excellent discussion by Bryk and Raudenbush (1992).
and

$$\beta_{1j} = \gamma_{10} + \gamma_{11} z_j + \delta_{1j}. \tag{3}$$

Taken together, (2) and (3) comprise the level-2 model. Here the $\gamma$-parameters denote the fixed level-2 parameters and $z_j$ denotes a level-2 predictor. The $\delta$-parameters in this model are disturbances. Thus, in (2) and (3), no assumption is made that the level-2 predictor accounts perfectly for the variation in the level-1 parameters.

The multilevel model is fully characterized by the level-1 model, (1), and the level-2 model shown in (2) and (3). A single-equation expression of the model is derived by substituting (2) and (3) into (1):

$$\begin{aligned}
y_{ij} &= (\gamma_{00} + \gamma_{01} z_j + \delta_{0j}) + (\gamma_{10} + \gamma_{11} z_j + \delta_{1j}) x_{ij} + \epsilon_{ij} \\
&= \gamma_{00} + \gamma_{01} z_j + \gamma_{10} x_{ij} + \gamma_{11} z_j x_{ij} + \delta_{0j} + \delta_{1j} x_{ij} + \epsilon_{ij},
\end{aligned} \tag{4}$$

where $\gamma_{00}$ denotes the intercept or constant, $\gamma_{01}$ is the effect of the level-2 predictor, $\gamma_{10}$ is the effect of the level-1 predictor, and $\gamma_{11}$ is the effect of the cross-level interaction between the level-1 and level-2 predictors. The disturbance term consists of $\delta_{0j}$, $\delta_{1j}$, and $\epsilon_{ij}$, which are the random parameters of the model. Specifically, $\delta_{0j}$ gives the residual level-2 variation in the level-1 intercept that remains after controlling for $z_j$, $\delta_{1j}$ gives the residual level-2 variation in the level-1 slope for $x_{ij}$ after controlling for $z_j$, and $\epsilon_{ij}$ is the level-1 disturbance capturing omitted level-1 predictors, measurement error in $y_{ij}$, and any idiosyncratic sources of variation in $y_{ij}$ attributable to level-1 units. We can thus think of $\delta_{0j}$ and $\delta_{1j}$ as comprising parameter noise and of $\epsilon_{ij}$ as level-1 noise. Therefore, prediction error in the general multilevel model has two sources: imperfect level-1 modeling of the response variable ($\epsilon_{ij}$); and imperfect level-2 modeling of the level-1 parameters ($\delta_{0j}$ and $\delta_{1j}$).

The specification of the multilevel model in (4) is incomplete without specifying the assumptions concerning the disturbances. The following assumptions are common in multilevel analysis.

(1) $E[\delta_{0j}] = E[\delta_{1j}] = E[\epsilon_{ij}] = 0$. This implies that there is no systematic parameter noise or level-1 noise.

(2) $\mathrm{Var}[\delta_{0j}] = \tau_{00}$, $\mathrm{Var}[\delta_{1j}] = \tau_{11}$, $\mathrm{Var}[\epsilon_{ij}] = \sigma^2$. This stipulates that the level-1 and level-2 disturbances have a constant variance.11 An important objective of multilevel analysis is to estimate these variance components and to make inferences about them.

(3) $\mathrm{Cov}[\delta_{0j}, \delta_{1j}] = \tau_{01}$. This means that the level-2 disturbances on the intercepts and slopes may be correlated. It is common to find that level-2 units with large slopes also have large intercepts or, conversely, have small intercepts. The covariance term $\tau_{01}$ captures this relationship between the intercepts and slopes and, in general, one should always estimate this term (Snijders and Bosker 1999).

(4) $\delta_{0j}$ and $\delta_{1j}$ are normally distributed, as is $\epsilon_{ij}$.12 Taken together, assumptions (1)-(4) imply that the level-2 disturbances are drawn from a bivariate normal distribution with mean 0 and a variance-covariance matrix

$$\begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{bmatrix}. \tag{5}$$

The level-1 disturbances are drawn from a normal distribution with mean 0 and variance $\sigma^2$.

(5) $\mathrm{Cov}[\delta_{0j}, \epsilon_{ij}] = \mathrm{Cov}[\delta_{1j}, \epsilon_{ij}] = 0$. This implies that the errors in the location of slopes and intercepts are uncorrelated with errors that affect where level-1 units are located on the dependent variable. This assumption is usually necessary to obtain an identified model. It implies that the effects of omitted level-1 predictors are assumed to be fixed, which, in the absence of clear contradictory information, is a reasonable assumption.

With the major assumptions specified, we now consider the statistical features of the multilevel model. To begin, we denote the multilevel disturbance term as $u_{ij} = \delta_{0j} + \delta_{1j} x_{ij} + \epsilon_{ij}$. This expression has several important characteristics. First, it can be shown that $u_{ij}$ does not have constant variance:13

$$\begin{aligned}
\mathrm{Var}[u_{ij}] &= E[(\delta_{0j} + \delta_{1j} x_{ij} + \epsilon_{ij})^2] \\
&= E[\delta_{0j}^2] + 2 x_{ij} E[\delta_{0j} \delta_{1j}] + x_{ij}^2 E[\delta_{1j}^2] + E[\epsilon_{ij}^2] \\
&= \tau_{00} + 2 x_{ij} \tau_{01} + x_{ij}^2 \tau_{11} + \sigma^2.
\end{aligned} \tag{6}$$

In (6), it is clear that $\mathrm{Var}[u_{ij}]$ (and therefore $\mathrm{Var}[y_{ij}]$) is in part a function of the level-1 predictor; hence $u_{ij}$ has nonconstant variance (even though $\delta_{0j}$, $\delta_{1j}$, and $\epsilon_{ij}$ have constant variances by assumption (2)). Constant variance will only be achieved when $\delta_{1j}$ is 0, which means that $z_j$

11 This assumption may be relaxed for the level-1 disturbances (see Browne et al. 2000; Snijders and Bosker 1999). We are also aware of an application in which the level-2 units were characterized by different (co)variance structures (Thum 1997).

12 This last assumption is appropriate for the linear multilevel models on which this paper focuses. Of course, models for categorical, count, or duration data require a different specification of the distribution of the level-1 disturbances.

13 For this proof, we make an additional assumption that $\mathrm{Cov}[\epsilon_{ij}, \epsilon_{kl}] = 0$ for $i \neq j$, $k \neq l$. This assumption is included merely for convenience. It can and should be relaxed in applications of the multilevel model to time series of pooled cross-sections (Goldstein 1995).
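The variance function in (6) can be checked numerically. The following is a minimal sketch, assuming hypothetical values for the variance components ($\tau_{00} = 1$, $\tau_{01} = 0.3$, $\tau_{11} = 0.5$, $\sigma^2 = 2$) that do not come from the article. The function names `disturbance_variance` and `simulate_disturbances` are illustrative only. The simulation draws the level-2 disturbances from the bivariate normal distribution in (5) and the level-1 disturbance independently, as in assumptions (1)-(5).

```python
import random
import statistics


def disturbance_variance(x, tau00, tau01, tau11, sigma2):
    """Var[u_ij] from equation (6) at level-1 predictor value x."""
    return tau00 + 2.0 * x * tau01 + x ** 2 * tau11 + sigma2


def simulate_disturbances(x, tau00, tau01, tau11, sigma2, n_draws, seed=0):
    """Draw u = delta0 + delta1 * x + eps, one fresh level-2 unit per draw."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 level-2 covariance matrix in (5),
    # used to give (delta0, delta1) the covariance tau01.
    l11 = tau00 ** 0.5
    l21 = tau01 / l11
    l22 = (tau11 - l21 ** 2) ** 0.5
    draws = []
    for _ in range(n_draws):
        z0 = rng.gauss(0.0, 1.0)
        z1 = rng.gauss(0.0, 1.0)
        delta0 = l11 * z0
        delta1 = l21 * z0 + l22 * z1
        eps = rng.gauss(0.0, sigma2 ** 0.5)
        draws.append(delta0 + delta1 * x + eps)
    return draws


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    params = dict(tau00=1.0, tau01=0.3, tau11=0.5, sigma2=2.0)
    for x in (0.0, 1.0, 2.0):
        theory = disturbance_variance(x, **params)
        draws = simulate_disturbances(x, n_draws=50_000, **params)
        # Prints x, the variance implied by (6), and the simulated variance.
        print(x, round(theory, 3), round(statistics.pvariance(draws), 3))
```

Under these assumed values the variance implied by (6) grows with the level-1 predictor, and the simulated variance should track it closely, which is the heteroskedasticity the text describes.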
The random coefficients model. In the random coefficients model (see Swamy 1970; Swamy and Tavlas 1995), which is widely used in pooled cross-sections and time-series analysis (see Dielman 1989; Sayrs 1989; Stimson 1985) and also in other areas of political analysis (e.g., Jackson 1992), the level-2 predictors are dropped from (10) and (11). Thus, the level-2 model for the random coefficients model is

$$\beta_{0j} = \gamma_{00} + \delta_{0j} \tag{13}$$

and

$$\mathrm{Var}[u_{ij}] = \tau_{00} + 2 \sum_{p} \tau_{0p} x_{pij} + \sum_{p} \sum_{r} \tau_{pr} x_{pij} x_{rij} + \sigma^2, \tag{16}$$

Substitution into (18) gives

$$y_{ij} = \gamma_{00} + \sum_{q=1}^{Q} \gamma_{0q} z_{qj} + \delta_{0j} + \epsilon_{ij}. \tag{19}$$

Since $E[y_{ij}] = \mu_j = \gamma_{00} + \sum_q \gamma_{0q} z_{qj}$, the model treats the level-2 means of the dependent variable, $\mu_j$, as a function of the level-2 predictors (whence the name means-as-outcomes model). The disturbances of this model, $u_{ij} = \delta_{0j} + \epsilon_{ij}$, possess compound symmetry (i.e., have an exchangeable covariance structure); their variances are given by $\mathrm{Var}[u_{ij}] = \tau_{00} + \sigma^2$ and their covariances by $\mathrm{Cov}[u_{ij}, u_{kj}] = \tau_{00}$, resulting in an intra-class correlation of $\tau_{00} / (\tau_{00} + \sigma^2)$.
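The compound symmetry structure and the intra-class correlation described above can be written out directly. This is a small sketch; the function names and the numeric values in the usage lines are assumptions made for illustration, not quantities from the article.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of disturbance variance due to level-2 units: tau00 / (tau00 + sigma2)."""
    return tau00 / (tau00 + sigma2)


def cluster_covariance(n_units, tau00, sigma2):
    """Compound-symmetric covariance matrix of u_ij within one level-2 unit.

    Diagonal entries are Var[u_ij] = tau00 + sigma2; off-diagonal entries
    are Cov[u_ij, u_kj] = tau00 for two different level-1 units i and k.
    """
    return [
        [tau00 + sigma2 if i == k else tau00 for k in range(n_units)]
        for i in range(n_units)
    ]


if __name__ == "__main__":
    # Hypothetical variance components.
    print(intraclass_correlation(tau00=1.0, sigma2=3.0))
    for row in cluster_covariance(n_units=3, tau00=1.0, sigma2=3.0):
        print(row)
```

Every pair of level-1 units in the same level-2 unit shares the same covariance, which is why the structure is called exchangeable. Units in different level-2 units are uncorrelated under this model, so the full covariance matrix is block diagonal.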
ity, the disturbance term is simply $u_{ij} = \epsilon_{ij}$. Given our assumptions, this implies that there is no clustering and that the variance is constant. Thus, the interactive model is, in fact, a special case of the general multilevel model in (12), which arises under stringent and usually unrealistic assumptions about the level-2 disturbances.

Models with More Than Two Levels

The multilevel model can be extended to data structures that have more than two levels. From a conceptual perspective such an extension is straightforward: effects at the highest level are considered fixed, while effects at lower levels may vary over units at higher levels in the hierarchical data structure (although they do not have to). However, in terms of interpretation, these models can be very complex because of the rapid increase in possible interaction terms, including higher-level interactions. For instance, a three-level model may contain three sets of two-way cross-level interactions (between level-1 and level-2, level-1 and level-3, and level-2 and level-3), in addition to a set of three-way interactions (between level-1, level-2, and level-3) and three sets of main effects (one for each level). Not only can the interpretation of all of these terms be complicated, but the demands on the data may be quite steep if one is to achieve sufficient statistical power to unravel all of the interactions. Of course, not every model will contain all possible interactions because there may not be theoretical reasons to do this. An example of a relatively simple three-level model is presented in our application.

Statistical Inference for the Multilevel Model

Our discussion so far has focused on the conceptual development of multilevel models and the purposes they serve. An unanswered question is how these models are estimated. We now turn to this question, drawing a distinction between estimation of the fixed effects and variance components, on the one hand, and of the level-1 coefficients, on the other.

Fixed effects and variance components. The dominant approach to estimating the fixed effects and variance components is maximum likelihood estimation (MLE).14 MLE requires the specification of a density for the level-1 and level-2 disturbances. The most common choice of density in the context of linear multilevel models is the normal density (see assumption (4)). This distributional assumption informs the likelihood function, which is maximized to obtain estimators of the fixed effects and the variance components (for detailed discussions see Bryk and Raudenbush 1992; Goldstein 1995; Longford 1993; Raudenbush 1988; Searle, Casella, and McCullogh 1992).

There are several algorithms for implementing MLE. In a pioneering paper on maximum-likelihood estimation of variance components models, Dempster, Rubin, and Tsutakawa (1981) proposed the EM algorithm (see also Bryk and Raudenbush 1992; Mason, Wong, and Entwisle 1983). Goldstein (1986) developed an iterative generalized least squares (IGLS) algorithm, while Longford (1987) proposed a Fisher scoring algorithm. Under the assumption of normality IGLS and Fisher scoring are equivalent (see Goldstein 1995). The advantage of EM is guaranteed convergence, while Fisher scoring (IGLS) has the advantage of being fast (see Kreft and De Leeuw 1998; Kreft, De Leeuw, and Van Der Leeden 1994).

MLE has certain desirable statistical properties. When the normality assumption holds, MLE is consistent and asymptotically efficient (Bryk and Raudenbush 1992; Goldstein 1995; Longford 1993; Raudenbush 1988; Searle, Casella, and McCullogh 1992). MLE remains consistent in the estimation of the fixed effects even when the normality assumption is violated (Goldstein 1995).15 While the ideal circumstances for estimating multilevel models are balanced designs (i.e., equal numbers of level-1 units per level-2 unit), Raudenbush (1988) comments that MLE performs well even with unbalanced designs, which are common in political analysis. In terms of finite sample properties, a series of Monte Carlo simulation studies has demonstrated that MLE fixed effects estimators are consistent (Kreft 1996).

If there is a weakness to MLE it may be in the estimation of the variance components in small samples. MLE variance components tend to be biased downwardly in

14 However, there is a wide range of alternative approaches. Among these alternatives are two-step OLS (see Chou, Bentler, and Pentz 2000; Kackar and Harville 1981; Van Den Eeden 1988), generalized least squares (De Leeuw and Kreft 1986), Bayesian estimation (Lindley and Smith 1972; Rubin 1981; see also Browne and Draper 2000; Seltzer, Wong, and Bryk 1996; Zeger and Karim 1991), and generalized estimating equations (see Goldstein 1995; Liang and Zeger 1986; Zorn 2001). Of interest are also the minimum variance (MINVAR) and minimum norm quadratic unbiased estimation (MINQUE) estimators of the variance components, which are often used in the context of random effects ANOVA (see Searle, Casella, and McCullogh 1992).

15 However, the standard errors of the fixed effects will tend to be biased. In addition, the estimates of the variance components may also be severely biased. One solution to these problems is to use a Monte Carlo Markov Chain procedure (e.g., Gibbs sampling) or to rely on bootstrapping (see Goldstein 1995; Kuk 1995; Meijer, Van Der Leeden, and Busing 1995).
such cases because no correction is made for the degrees of freedom consumed by estimation of the fixed effects (Swallow and Monahan 1984). An alternative estimator of the variance components is restricted maximum-likelihood estimation (REML), which corrects this problem (see Bryk and Raudenbush 1992; Goldstein 1995; Kreft and De Leeuw 1998; Longford 1993; Searle, Casella, and McCullogh 1992).16 Although REML and MLE are asymptotically equivalent (Richardson and Welsh 1994), the premise is that REML variance component estimates are less biased than MLE variance component estimates in small samples of level-2 units. Evidence from Monte Carlo simulation studies lends support to this premise. However, this evidence also points to a potential problem with REML. In the simulations, REML was sometimes less efficient than MLE. Consequently, the mean squared error (which is the sum of the squared bias and the variance of an estimator) did not always favor REML over MLE (Kreft 1996).

Level-1 coefficients. In multilevel modeling, attention is usually not restricted to estimating fixed effects and variance components. Random level-1 coefficients are also of considerable interest because these coefficients shed light on the behavior of the level-1 model in each of the level-2 units. The question is how one should estimate these coefficients.

One option is to estimate level-1 coefficients separately for the different level-2 units (e.g., using OLS). The advantage of this approach is that unbiased estimators of the level-1 coefficients can be obtained for each level-2 unit (assuming that the level-1 sample sizes are sufficient to allow estimation). Unfortunately, these estimators often lack precision because they are based on limited data, namely, the data for a particular level-2 unit. Especially when the sample sizes for the level-2 units are small, the variance of the separate regression estimators can be large.

An alternative estimation approach is pooled regression, i.e., regression analysis of the level-1 units combined across all level-2 units. The pooled regression estimator of the effect of a level-1 predictor may be biased for a particular level-2 unit, in that the true effect of the predictor in this unit is different from the pooled estimate. On the other hand, there are efficiency gains in relying on pooled estimators because all of the data is used and fewer parameters are estimated (see Bartels 1996).

Neither a separate regression approach nor a pooled approach, then, is ideal. The question is whether there is a way in which the approaches can be combined that results in a "compromise estimator" (Morris 1983, 47) with more desirable properties. This question has been addressed in the literature on empirical Bayes (EB) inference, which shows that combining estimators is indeed useful (Carlin and Louis 1996; Effron and Morris 1975; Morris 1983).

The EB estimator of a level-1 coefficient $\beta_{pj}$ is a weighted combination of the separate regression estimator of that coefficient and $\hat{\gamma}_{p0} + \sum_q \hat{\gamma}_{pq} z_{qj}$. The weights are proportional to the precision of these two estimators. That is, the more precise an estimator is, the more important its contribution is to the EB estimator. Thus, there is "shrinkage" in the direction of the most precise estimator. The resulting EB estimator has the desirable property that its MSE is typically smaller than the MSE of the estimators on which it is based (Carlin and Louis 1996; Effron and Morris 1975; Morris 1983).

One of the biggest advantages of EB estimation is that the estimators of the level-1 parameters are no longer rooted only in the data for a particular level-2 unit. Because the EB estimator is a function of the fixed effects, which are estimated from the pooled data, the level-1 parameter estimates, in effect, borrow information from the data for other level-2 units. Borrowing strength is a major advantage of multilevel modeling (see Bryk and Raudenbush 1992; Kreft and De Leeuw 1998; Raudenbush 1988; for a political science example see Western 1998). It makes it possible to draw inferences about level-2 units even in the light of sparse data for those units. Of course, the need to borrow strength is inversely related to the sample sizes for level-2 units. And the ability to borrow strength depends on how much the level-2 units have in common.

Discussion

Now that the logic and estimation of multilevel models have been discussed, it is natural to ask what these models have to offer for political analysis compared to alternative approaches. First, the notion of borrowing strength that is embedded in EB estimation offers a major improvement over separate regression estimation. When faced with causal heterogeneity, political scientists often resort to separate estimation of level-1 units for each level-2 unit. However, this is not an efficient estimation strategy and we expect that in many cases EB estimation is a much better choice. In this regard, a multilevel modeling approach to causal heterogeneity can be very beneficial.

16 More specifically, REML partitions the likelihood function into two parts, one of which contains the fixed effects. Variance component estimates are based on the part of the likelihood function that does not include the fixed effects.
A second major benefit of multilevel models is that they permit the analysis of substantive contextual effects while still allowing for heterogeneity between contextual units. This is an important improvement over the alternatives of dummy variables models and contextual models. Dummy variables models can account for level-2 heterogeneity, but they contain no substantive explanation of this heterogeneity. Contextual models provide substantive explanation, but make the unrealistic assumption that this explanation eliminates any remaining level-2 heterogeneity. Multilevel models combine substance with (more) reasonable assumptions about level-2 heterogeneity.

A third major benefit of multilevel models concerns statistical inference. The multilevel character of much of political science data is often ignored. Political scientists often treat multilevel data structures as if no hierarchy between units of analysis existed. Consequently, observations are treated as independent, whereas in fact they are to some extent dependent because of the hierarchical nesting structure. This can easily lead to incorrect inferences, such as rejecting the null hypothesis of no effect too frequently. Multilevel models improve inferences because the hierarchical data structure is explicitly taken into consideration.

We now illustrate these points with an empirical application of multilevel analysis.

Application

We illustrate multilevel models through an analysis of public support for the European Union (EU). Studies of EU support typically rely on multilevel data. As such, the topic of EU support is an excellent test case for illustrating multilevel models. Our goals in this section are twofold. First, we show how to develop multilevel models from substantive research questions and how to interpret the results. Second, we show the implications of ignoring the multilevel data structure.

The literature on EU support discusses at least three different levels of analysis. First, a large number of studies concentrate on the individual level; these studies assess the nature and determinants of inter-individual differences in attitudes toward the EU (e.g., Inglehart, Rabier, and Reif 1991). Second, a sizable literature documents the nature of cross-national differences in EU support (e.g., Inglehart, Rabier, and Reif 1991). More recently, scholars have become interested in national political parties as a third, intermediate level of analysis. The argument here is that political parties are an important context that shapes the opinions of party supporters. As a result supporters from the same party tend to share similar views about the EU (e.g., Franklin, Marsh, and McLaren 1994), suggesting clustering.17

Considering these three levels of analysis together, what emerges is a multilevel data structure: individuals (to the extent that they are party supporters) are nested within political parties, which in turn are nested within EU member states. We can model this data structure through a three-level multilevel model. The dependent variable in this model can be written as $\mathit{Support}_{ijk}$, which denotes the level of support for the EU for an individual $i$ who supports party $j$ in country $k$.

Theory

Three substantive questions arise when we think of EU support as a three-level phenomenon. First, what is the importance of each of the three levels for understanding EU support? Second, how do we account for EU support at the different levels? That is, what predictors can explain EU support? Finally, is there causal heterogeneity in the effects of predictors? These questions follow a logical progression: later questions presuppose an answer to earlier questions. Let us now outline how one would translate these questions into appropriate multilevel models.

First, does support for the EU vary across the three levels of analysis that we have identified? We can answer this question by way of an ANOVA that decomposes the variance in EU support:18

$$\mathit{Support}_{ijk} = \gamma_{000} + \nu_{00k} + \delta_{0jk} + \epsilon_{ijk}. \tag{22}$$

17 A fourth level of analysis is time: how do opinions about the EU vary over time (e.g., Eichenberg and Dalton 1993)? For the sake of simplicity, we will not consider this level in this paper, focusing instead on one particular point in time, namely 1996.

18 To derive this model, consider the individual-level model

$$\mathit{Support}_{ijk} = \alpha_{0jk} + \epsilon_{ijk},$$

where $\alpha_{0jk}$ is the mean level of EU support in political party $j$ in country $k$, and $\epsilon_{ijk}$ is individual-level variation around this mean. We model the mean by way of the party-level model

$$\alpha_{0jk} = \beta_{00k} + \delta_{0jk}.$$

Here $\beta_{00k}$ is a national mean for EU support and $\delta_{0jk}$ is party-level variation around this mean. Finally,

$$\beta_{00k} = \gamma_{000} + \nu_{00k},$$

where $\gamma_{000}$ is the overall mean of EU support and $\nu_{00k}$ is cross-national variation around this mean. Back-substitution of this formula yields

$$\alpha_{0jk} = \gamma_{000} + \nu_{00k} + \delta_{0jk}.$$

Further back-substitution yields (22).
In this model, $\gamma_{000}$ is the grand mean of EU support (i.e., the mean across individuals, parties, and countries). The sources of cross-national variation, which cause particular EU member states to deviate from the grand mean, are contained in $\nu_{00k}$. Similarly, $\delta_{0jk}$ contains sources of cross-party variation. Finally, $\varepsilon_{ijk}$ captures inter-individual differences. The variances of these different sources of variation are given by $\omega_{00}$, $\tau_{00}$, and $\sigma^2$, respectively. In order to argue that all three levels of analysis are important for EU support, we should find that all three of these variance components are statistically significant.

If EU support varies significantly at all three levels, the next question to ask is how we should account for this variance. The EU literature suggests several important covariates that help predict EU support. At the individual level, scholars have discussed three major predictors of EU support (see Gabel 1998). First, an extensive literature describes the impact of economic considerations on EU support. A key finding in this literature is that bad economic conditions for an individual tend to reduce support for the EU, while good economic conditions tend to increase such support (e.g., Gabel 1998). In this article, we use a person's income as an indicator of economic circumstances. Specifically, we distinguish between people in the bottom income quartile, the top income quartile, and the two middle income quartiles (our baseline category) in the income distribution of an EU member state. We expect people in the bottom income quartile to be the least supportive of the EU.

Second, political orientations appear to be associated with EU support. There is considerable debate about the relative importance of these orientations, but it appears that citizens with a leftist ideological orientation are more opposed to the EU than those with a rightist orientation (Inglehart, Rabier, and Reif 1991). Thus, we include ideology as an individual-level predictor.

Third, Inglehart (1970) has argued that EU support is higher among opinion leaders than among the general public. The reason is that opinion leaders have a better understanding of the EU, which makes them feel less threatened by it. The impact of opinion leadership is assumed to be message independent: opinion leadership enhances EU support regardless of the information environment (Inglehart, Rabier, and Reif 1991). We put this assumption to the test later in this application.

Moving to the party level, a literature has recently emerged about partisan cue-taking effects. Such effects arise when parties take a position on an issue such as European integration, which is then used by party supporters to inform their own opinions about the issue. As a result, the supporters assimilate their opinions to those of the party. There is now good empirical evidence that this indeed has happened in the case of public opinion toward the EU (Franklin, Marsh, and McLaren 1994; Franklin, Marsh, and Wlezien 1994; Franklin, Van Der Eijk, and Marsh 1995; cf. Siune and Svensson 1993). As suggested before, cue-taking effects induce clustering at the party level because the supporters of a particular party will tend to assimilate to the same cue.

Finally, consider the national level of analysis. Here we focus on two predictors. First, we consider the date at which a country acceded to the EU. Accession to the EU has been largely an elite-driven process. It is not uncommon to find that public opinion toward the EU is very negative in countries that have recently acceded to the union. However, public opinion toward the EU often becomes more favorable as time passes. Hence, we expect EU support to be greater in older EU member states than in more recent member states.

Second, individuals may look at the dependence of the national economy on the EU in determining whether to support or oppose the union. As the national economy becomes more dependent on EU membership, EU support may increase because people realize how important the EU is for their country. One indicator of this dependence is the percentage of the trade-flow (imports and exports) that is intra-EU (e.g., Eichenberg and Dalton 1993). We expect that higher percentages of intra-EU trade are associated with greater EU support.

Having defined the relevant theoretical factors predicting EU support, we now consider how to bring these different predictors together in a multilevel model. We start by defining an individual-level model:

\[ \text{Support}_{ijk} = \alpha_{0jk} + \alpha_{1jk}\text{LI}_{ijk} + \alpha_{2jk}\text{HI}_{ijk} + \alpha_{3jk}\text{ID}_{ijk} + \alpha_{4jk}\text{OL}_{ijk} + \alpha_{5jk}\text{Male}_{ijk} + \alpha_{6jk}\text{Age}_{ijk} + \varepsilon_{ijk}. \tag{23} \]

Here $\text{LI}_{ijk}$ is a dummy variable for the lowest income quartile, $\text{HI}_{ijk}$ is a dummy variable for the highest income quartile, $\text{ID}_{ijk}$ is a person's ideology, and $\text{OL}_{ijk}$ is opinion leadership. We include two demographic control variables in the model: $\text{Male}_{ijk}$ is a dummy variable for gender (1 is male) and $\text{Age}_{ijk}$ is an individual's age.

By modeling the individual-level constant, $\alpha_{0jk}$, we can introduce the party-level predictor that we identified:

\[ \alpha_{0jk} = \beta_{00k} + \beta_{01k}\text{Cue}_{jk} + \delta_{0jk}, \tag{24} \]

where $\text{Cue}_{jk}$ stands for party cue. Further, by modeling the party-level constant, $\beta_{00k}$, we can introduce the country-level predictors:

\[ \beta_{00k} = \gamma_{000} + \gamma_{001}\text{Tenure}_{k} + \gamma_{002}\text{Trade}_{k} + \nu_{00k}, \tag{25} \]
where $\text{Tenure}_{k}$ denotes the length of a country's EU membership and $\text{Trade}_{k}$ denotes the percentage of the country's trade that is intra-EU.

Substitution of (25) into (24) gives

\[ \alpha_{0jk} = \gamma_{000} + \gamma_{001}\text{Tenure}_{k} + \gamma_{002}\text{Trade}_{k} + \beta_{01k}\text{Cue}_{jk} + \nu_{00k} + \delta_{0jk}. \]

If we make the assumption that the effect of party cues is fixed (i.e., $\beta_{01k} = \gamma_{010}$) and that the effect of the individual-level predictors is fixed as well (i.e., $\alpha_{pjk} = \gamma_{p00}$ for $p \neq 0$), then substitution of this result into (23) yields:

\[ \text{Support}_{ijk} = \gamma_{000} + \gamma_{001}\text{Tenure}_{k} + \gamma_{002}\text{Trade}_{k} + \gamma_{010}\text{Cue}_{jk} + \gamma_{100}\text{LI}_{ijk} + \gamma_{200}\text{HI}_{ijk} + \gamma_{300}\text{ID}_{ijk} + \gamma_{400}\text{OL}_{ijk} + \gamma_{500}\text{Male}_{ijk} + \gamma_{600}\text{Age}_{ijk} + \nu_{00k} + \delta_{0jk} + \varepsilon_{ijk}. \tag{26} \]

This model has several notable features. First, it is comprehensive in that it brings together the predictors at different levels. Second, (26) makes no assumption that these predictors fully account for the variation in EU support at the different levels. Thus, the model implies variance components ($\sigma^2$ for $\varepsilon_{ijk}$, $\tau_{00}$ for $\delta_{0jk}$, and $\omega_{00}$ for $\nu_{00k}$). These features make the model appropriate for answering the second question that the EU data raise, namely, how do we account for EU support at different levels of analysis?

An important limitation of (26) is the assumption that the individual-level predictors have fixed effects. For one of the predictors, namely opinion leadership, there is an a priori reason to question this assumption.19 Although Inglehart has argued that the effect of opinion leadership on EU support is uniformly positive (Inglehart, Rabier, and Reif 1991), there is theoretical reason to expect that the effect of this predictor varies as a consequence of party cues.20 In other words, there is causal heterogeneity in the effect of opinion leadership.

The theoretical rationale for this hypothesis comes from Zaller's (1992) theory of mass opinion. This theory claims that individuals are persuaded by a message to the extent that they receive the message and accept it. National political parties are an important source of messages, or cues, about the EU. Opinion leaders are likely to receive these cues because they are tuned into the communication flow about the EU (which is usually not true of the general public). Moreover, opinion leaders are also likely to accept these cues if they come from the party they support. After all, this party is a trusted source in the eyes of the opinion leaders. Thus, we expect that opinion leaders are likely to be persuaded by the party cues.

But if this is true, opinion leaders' support for the EU depends on the nature of the party cues to which they are exposed. If the cue that the party sends is pro-EU, then opinion leaders should also be pro-EU (and more so than the general public). On the other hand, if the party cue is anti-EU, then opinion leaders should oppose the EU. Thus, instead of predicting a uniformly positive effect of opinion leadership on EU support, we hypothesize an effect that varies with party cues.

How do we model this effect? We do this by dropping the assumption in (23) that the effect of opinion leadership is fixed. Instead, we model this effect as follows:

\[ \alpha_{4jk} = \beta_{40k} + \beta_{41k}\text{Cue}_{jk} + \delta_{4jk}. \tag{27} \]

This model stipulates that the effect of opinion leadership, which is given by $\alpha_{4jk}$ in (23), varies as a function of party cues. We do not expect these cues to explain all of the cross-party variation in $\alpha_{4jk}$. Our theory does not suggest that the right-hand side coefficients in (27) vary across countries, so we assume these coefficients to be fixed (i.e., $\beta_{40k} = \gamma_{400}$ and $\beta_{41k} = \gamma_{410}$). If we retain our earlier assumption that the remaining individual-level and party-level predictors have fixed effects, we obtain:

\[ \text{Support}_{ijk} = \gamma_{000} + \gamma_{001}\text{Tenure}_{k} + \gamma_{002}\text{Trade}_{k} + \gamma_{010}\text{Cue}_{jk} + \gamma_{100}\text{LI}_{ijk} + \gamma_{200}\text{HI}_{ijk} + \gamma_{300}\text{ID}_{ijk} + \gamma_{400}\text{OL}_{ijk} + \gamma_{500}\text{Male}_{ijk} + \gamma_{600}\text{Age}_{ijk} + \gamma_{410}\text{Cue}_{jk} \times \text{OL}_{ijk} + \nu_{00k} + \delta_{0jk} + \delta_{4jk}\text{OL}_{ijk} + \varepsilon_{ijk}. \tag{28} \]

This model contains the same variance components as (26). In addition, it contains a variance component $\tau_{44}$ (for $\delta_{4jk}$) and a covariance component $\tau_{04}$ (which captures the relationship between $\delta_{0jk}$ and $\delta_{4jk}$). In terms of the fixed effects, the model contains a cross-level interaction, $\text{Cue}_{jk} \times \text{OL}_{ijk}$. We expect this interaction to have a positive effect. Because of the presence of this cross-level interaction, (28) allows us to answer the third question that we posed, namely whether the effect of the predictors is uniform or heterogeneous.

19 We also tested the assumption for the other individual-level predictors and for party cues. For these predictors we had no theoretical expectation that their effects vary. The tests do not suggest such variation either, so we retain the assumption of fixed effects for these predictors.

20 Inglehart (1970) alluded to the possibility that the effect of opinion leadership is related to the content of elite messages, but did not pursue it.
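To make the structure of (27) and (28) concrete, the following is a minimal Python sketch, not the authors' estimation procedure. It evaluates the right-hand side of (28) and draws nested data (respondents within parties within countries). Every number in it is an assumption made for illustration: the `GAMMA` values, the standard deviations, the correlation `rho04`, and the ranges of the simulated predictors are hypothetical and are not estimates or data from the article.

```python
import math
import random

# Hypothetical fixed effects, chosen only for illustration; they are not
# estimates reported in the article. Keys follow the gamma subscripts.
GAMMA = {
    "000": 4.0,    # grand mean
    "001": 0.02,   # Tenure_k
    "002": 1.0,    # Trade_k
    "010": 0.30,   # Cue_jk
    "100": -0.30,  # LI_ijk
    "200": 0.30,   # HI_ijk
    "300": 0.05,   # ID_ijk
    "400": 0.20,   # OL_ijk
    "500": 0.10,   # Male_ijk
    "600": -0.01,  # Age_ijk
    "410": 0.15,   # Cue_jk x OL_ijk (cross-level interaction)
}


def ol_effect(cue, delta4=0.0, g=GAMMA):
    """Effect of opinion leadership in a party, as in (27) with fixed betas."""
    return g["400"] + g["410"] * cue + delta4


def support(x, nu00=0.0, delta0=0.0, delta4=0.0, eps=0.0, g=GAMMA):
    """Right-hand side of (28) for one respondent with predictor dict x."""
    fixed = (g["000"] + g["001"] * x["tenure"] + g["002"] * x["trade"]
             + g["010"] * x["cue"] + g["100"] * x["li"] + g["200"] * x["hi"]
             + g["300"] * x["id"] + g["500"] * x["male"] + g["600"] * x["age"])
    # gamma_400*OL + gamma_410*Cue*OL + delta_4jk*OL, collected into one slope
    return fixed + ol_effect(x["cue"], delta4, g) * x["ol"] + nu00 + delta0 + eps


def simulate(n_countries=3, n_parties=4, n_people=5, seed=0,
             sd_nu=0.5, sd_d0=0.4, sd_d4=0.1, rho04=0.3, sd_eps=1.5):
    """Draw nested data: people within parties within countries."""
    rng = random.Random(seed)
    rows = []
    for k in range(n_countries):
        nu00 = rng.gauss(0, sd_nu)                      # country-level deviation
        tenure, trade = rng.uniform(-20, 20), rng.uniform(-0.2, 0.2)
        for j in range(n_parties):
            # Correlated party-level intercept and slope deviations.
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            delta0 = sd_d0 * z1
            delta4 = sd_d4 * (rho04 * z1 + math.sqrt(1 - rho04 ** 2) * z2)
            cue = rng.uniform(-3, 3)                    # centered party cue
            for _ in range(n_people):
                income = rng.choice(["low", "mid", "mid", "high"])
                x = {"tenure": tenure, "trade": trade, "cue": cue,
                     "li": int(income == "low"), "hi": int(income == "high"),
                     "id": rng.randint(0, 9), "ol": rng.uniform(-1.5, 1.5),
                     "male": rng.randint(0, 1), "age": rng.randint(18, 90)}
                y = support(x, nu00, delta0, delta4, rng.gauss(0, sd_eps))
                rows.append({"country": k, "party": j, "support": y, **x})
    return rows
```

With these hypothetical values, `ol_effect(2.0)` is positive while `ol_effect(-2.0)` is negative, which is the pattern the cue-taking hypothesis describes: the slope for opinion leadership changes sign as the party cue moves from pro-EU to anti-EU.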
Data

Data for the dependent variable and for the individual-level predictors come from Eurobarometer survey 46.0, which was fielded in the fifteen EU member states in October and November of 1996.21 We consider only the subset of respondents who declared support for one of the parties in our set (see below) and who were of voting age (at least 18 years old) at the time of the survey. As a party supporter we count anyone who indicated in the survey that they would be a likely voter for that party in the next general elections (the respondents could give only one choice). A total of 6354 respondents meet these requirements. Our measures of the dependent variable and the predictors are described in Table 2.22

TABLE 2  Variable Descriptions

Variable   Description

Support    A composite of two items: (1) "Generally speaking, do you think that [our country's] membership of the European Union is (a bad thing, neither good nor bad, a good thing)?" and (2) the desired speed of European integration (1 = integration should be brought to a "standstill"; 7 = integration should run "as fast as possible"). Inter-item r = .474; standardized item α = .643. The composite ranges between 0 and 8, with higher scores indicating greater support for the EU. Source: Eurobarometer 46.0.

LI         A dummy variable indicating that a respondent falls in the bottom quartile of the income distribution of his/her country. Source: Eurobarometer 46.0.

HI         A dummy variable indicating that a respondent falls in the top quartile of the income distribution of his/her country. Source: Eurobarometer 46.0.

ID         Respondent's ideological self-placement on a 10-point scale (0 = left; 9 = right). Source: Eurobarometer 46.0.

OL         Respondent's level of opinion leadership on a 4-point scale. This measure captures the respondents' potential for active political involvement (for measurement details see Inglehart 1977). The original scale ranges from 1 (high opinion leadership) to 4 (low opinion leadership). We reversed this scale and centered it around the sample mean. Source: Eurobarometer 46.0.

Male       Respondent's gender (1 = male). Source: Eurobarometer 46.0.

Age        Respondent's age (in years). Source: Eurobarometer 46.0.

Cue        Party cue as measured by the following item: "[What is] the overall orientation of the party leadership toward European integration?" (1 = strongly opposed to integration; 7 = strongly in favor of integration). The responses to this item came from party experts for the different EU member states. We averaged the responses of all experts who evaluated a particular party. For purposes of the analysis this average was centered around its mean. Source: Ray (1999).

Tenure     The number of years a country has been an EU member state. This variable was centered around the mean.

Trade      The ratio of a country's 1995 intra-EU trade balance (in 1000 ECUs) over the country's total trade balance (in ECUs). This variable was centered around the mean. Source: 1997 International Statistical Yearbook.

At the party level, we consider only those parties for which we have data about their EU position. These data were collected via an expert survey (Ray 1999). The survey instrument consists of a seven-point rating scale of the position of a party's leadership vis-a-vis the EU in 1996 (the scale runs from "strongly opposed to European integration" to "strongly in favor of European integration"). The survey was sent to party experts for each of the EU member states, who provided ratings for the most important parties in that country. We use the average expert rating for a party as a measure of the EU cue that the party was sending at that time.23 We have information on party cues for 100 different parties.24

The data for intra-EU trade are for 1995, the year prior to the Eurobarometer survey. We introduce this one-year lag because we believe that individuals base

21 Mellich, Anna. 1996. Eurobarometer 46.0: Personal Health, Energy, Development Aid, and the Common European Currency (computer file), 2nd release, Brussels: INRA Europe (producer), 1997; Koln: Zentral Archiv fur Empirische Sozial Forschung (distributor), 1999; Ann Arbor, MI: Inter-University Consortium for Political and Social Research (distributor), 1999.

22 We have centered several of the predictors, including those that form the cross-level interaction in (28). We centered about the grand mean of these predictors, i.e., their mean across individuals, parties, and countries. On the topic of centering see Kreft, De Leeuw, and Aiken (1995) and Hofmann and Gavin (1998).

23 Here we make two assumptions. First, the average expert rating is a reasonable approximation of a party's stance vis-a-vis the EU. Second, a party's stance is public information so that it can serve as a cue. Both assumptions seem reasonable. First, with very few exceptions, there was a remarkable agreement among experts in positioning parties on the issue of the EU. Moreover, there also was agreement between the expert judgments and party manifesto data (Ray 1999). Second, we are not aware of any cases in which parties tried to hide their EU position from the public.

24 The number of parties per EU member state ranges between 4 and 10. The number of supporters per party ranges between 1 and 383.
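The recoding steps described for the opinion leadership and party cue measures (reversing the original 1 to 4 scale, averaging expert ratings per party, and centering) can be sketched as follows. This is an illustration only: the function names and the small data sets are hypothetical, and `center` simply subtracts the mean of whatever list it is given, so centering about the grand mean assumes the list holds one value per respondent across all parties and countries.

```python
from statistics import mean


def reverse_scale(values, low=1, high=4):
    """Flip a scale so that the highest raw score becomes the lowest."""
    return [low + high - v for v in values]


def center(values):
    """Subtract the mean of the supplied values from each value."""
    m = mean(values)
    return [v - m for v in values]


def party_cues(expert_ratings):
    """Average the expert ratings (1 to 7) given to each party."""
    return {party: mean(r) for party, r in expert_ratings.items()}


# Hypothetical raw data, invented for illustration only.
raw_ol = [1, 4, 2, 3, 1]                  # 1 = high ... 4 = low opinion leadership
ol = center(reverse_scale(raw_ol))        # higher values now mean more leadership
cues = party_cues({"A": [6, 7, 6], "B": [2, 3]})
```

After centering, a value of zero corresponds to a respondent at the mean, which is what gives the main effects in a model with a cross-level interaction their usual interpretation as effects at the mean of the other variable.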
provide an imperfect explanation at best. Finally, is the effect of these predictors heterogeneous? The answer to this question was affirmative for one of our predictors, opinion leadership, and resulted in an important modification of existing theory about the effect of this predictor.

While this particular example is interesting because of the obvious multilevel character of EU opinion data, the lessons that this application teaches are much more general. This application demonstrates all of the advantages of multilevel modeling that we discussed before. First, it demonstrates the ease with which causal heterogeneity is introduced in multilevel models. Our exploration of the varying effect of opinion leadership across political parties required only a few changes from the basic model, and the interpretation was as easy as interpreting any interaction term.

Second, our application shows the advantages of multilevel modeling in comparison to dummy variable models and interactive models. Instead of including dummy variables at the party and national levels of analysis, we were able to include substantively interesting predictors. Unlike interactive models, however, our multilevel models did not assume that these predictors account perfectly for cross-party and cross-national variation in EU support, which would have been incorrect.

Finally, this application shows the statistical benefits of multilevel models. Too often political scientists ignore the multilevel character of the data they frequently work with. As we have demonstrated here, this can have pernicious effects on the statistical inferences that are drawn from those data. Multilevel models offer a statistical tool that can capture the data structure and thereby produce correct inferences.

Conclusion

Researchers who wish to analyze multilevel data face the task of selecting a methodology that addresses the challenges such data pose and that allows exploitation of the opportunities offered by those data. In this paper, we have discussed one such methodology: multilevel modeling. We think multilevel analysis, which is widely applied in other social sciences, is a powerful tool that can greatly benefit political science if used appropriately. Appropriate use is based on an awareness of the requirements and limitations of multilevel analysis, and on a sense of when this modeling strategy can be used most beneficially.

First, researchers should be aware that multilevel models are data intensive. A growing body of statistical literature suggests that sufficient power to test hypotheses about cross-level interactions and variance components hinges on the availability of sizable numbers of contextual units (see Kreft 1996; Kreft and De Leeuw 1998; Raudenbush 1998; Snijders and Bosker 1994, 1999). This is a steep requirement that is not always met in political science data.27 Multilevel models also place a hefty premium on valid and reliable measurements. Bad measures in multilevel models "get worse" because such a heavy demand is placed on the data in terms of estimating coefficients and variance components. In the absence of adequate measurement, it may be impossible to exploit the benefits of multilevel analysis (see Bryk and Raudenbush 1992).

Second, we should be equally aware that multilevel models are theory intensive. The specification and interpretation of multilevel models hinge on a theoretical understanding of the relevant levels of analysis (see Hox, Van Den Eeden, and Hauer 1990; Lazarsfeld and Menzel 1969; Opdenakker and Van Damme 2000; Van Den Eeden and Huttner 1982) and of the processes at work at each level of analysis and between (or among) levels of analysis. What complicates matters is that theory bridging the gap between micro and macro levels is still relatively scarce in political science. Microscopic and macroscopic explanations in political science, say of voting behavior, often have developed next to each other with few points of contact. Fortunately, this situation is changing, and we expect that cross-level theories will become more common.

Third, multilevel models increase the number of assumptions that one has to make about the data. Not only do we have to assume a distribution for the dependent variable, we also have to assume distributions for one or more of the parameters that link predictors to this variable. When these assumptions are incorrect, this could adversely affect statistical inference.

Multilevel models, then, make heavy demands on theory and on data. Thus, we caution researchers against "blindly" using these models in data analysis. Instead, we urge them to consider the full range of methods for handling clustered data that is now available. These methods include replicated sampling techniques (Lee, Forthofer, and Lorimor 1989), sandwich estimation of the standard errors (Huber 1967), generalized estimating equations (GEE; Liang and Zeger 1986; Zorn 2001), and multilevel modeling.

An important consideration in choosing between these methods is whether data clustering is merely a sta-

27 Of course, if an interaction or variance component is large it will probably be detected even when the number of contextual units is small.
Hox, Joop, Pieter Van Den Eeden, and Joost Hauer. 1990. "Preliminary Notes." In Theory and Model in Multilevel Research: Convergence or Divergence? ed. Pieter Van Den Eeden, Joop Hox, and Joost Hauer. Amsterdam: SISWO.

Huber, Paul J. 1967. "The Behaviour of Maximum Likelihood Estimates Under Non-Standard Conditions." Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability 1:221-233.

Huckfeldt, Robert, and John Sprague. 1987. "Networks in Context: The Social Flow of Political Information." American Political Science Review 81:1197-1216.

Huckfeldt, Robert, and John Sprague. 1993. "Citizens, Contexts, and Politics." In Political Science: The State of the Discipline II, ed. Ada Finifter. Washington, D.C.: American Political Science Association.

Inglehart, Ronald. 1970. "Cognitive Mobilization and European Identity." Comparative Politics 3:45-70.

Inglehart, Ronald. 1977. The Silent Revolution: Changing Values and Political Styles among Western Publics. Princeton: Princeton University Press.

Inglehart, Ronald, Jacques-Rene Rabier, and Karlheinz Reif. 1991. "The Evolution of Public Attitudes toward European Integration: 1970-86." In Eurobarometer: The Dynamics of European Public Opinion, ed. Karlheinz Reif and Ronald Inglehart. London: Macmillan.

Iversen, Gudmund R. 1991. Contextual Analysis. Newbury Park, CA: Sage.

Jackson, John E. 1992. "Estimation of Models with Variable Coefficients." Political Analysis 3:27-49.

Janssen, Joseph I. H. 1991. "Postmaterialism, Cognitive Mobilization, and Public Support for European Integration." British Journal of Political Science 21:443-468.

Jones, Kelvin, Ronald John Johnston, and Charles J. Pattie. 1992. "People, Places and Regions: Exploring the Use of Multi-Level Modelling in the Analysis of Electoral Data." British Journal of Political Science 22:343-380.

Joreskog, Karl G. 1971. "Simultaneous Factor Analysis in Several Populations." Psychometrika 36:409-426.

Kackar, Raghu, and David A. Harville. 1981. "Unbiasedness of Two-Stage Estimation and Prediction Procedures for Mixed Linear Models." Communications in Statistics-Theory and Methods 10:1249-1261.

King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press.

Kreft, Ita G.G. 1996. "Are Multilevel Techniques Necessary? An Overview, Including Simulation Studies." Multilevel Models Project at the Institute of Education, University of London.

Kreft, Ita G.G., and Jan De Leeuw. 1998. Introducing Multilevel Modeling. London: Sage.

Kreft, Ita G.G., Jan De Leeuw, and Leona S. Aiken. 1995. "The Effect of Different Forms of Centering in Hierarchical Linear Models." Multivariate Behavioral Research 30:1-21.

Kreft, Ita G.G., Jan De Leeuw, and Rien Van Der Leeden. 1994. "Review of Five Multilevel Analysis Programs: BMDP-5V, GENMOD, HLM, ML3, and VARCL." The American Statistician 48:324-335.

Kuk, Anthony Y.C. 1995. "Asymptotically Unbiased Estimation in Generalized Linear Models With Random Effects." Journal of the Royal Statistical Society, Series B 57:395-407.

Lau, Richard R. 1989. "Individual and Contextual Influences on Group Identification." Social Psychology Quarterly 52:220-231.

Lazarsfeld, Paul F., and Herbert Menzel. 1969. "On the Relation Between Individual and Collective Properties." In A Sociological Reader on Complex Organizations, ed. Amitai Etzioni. New York: Holt, Rinehart & Winston.

Lee, Eun Sul, Ronald N. Forthofer, and Ronald J. Lorimor. 1989. Analyzing Complex Survey Data. Newbury Park, CA: Sage.

Li, Fuzhong, Terry E. Duncan, Peter Harmer, Alan Acock, and Mike Stoolmiller. 1998. "Analyzing Measurement Models of Latent Variables through Multilevel Confirmatory Factor Analysis and Hierarchical Linear Modeling Approaches." Structural Equation Modeling 5:294-306.

Liang, Kung-Yee, and Scott L. Zeger. 1986. "Longitudinal Data Analysis Using Generalized Linear Models." Biometrika 73:13-22.

Lindley, Dennis V., and A.F.M. Smith. 1972. "Bayes Estimates for the Linear Model (with Discussion)." Journal of the Royal Statistical Society, Series B 34:1-41.

Lijphart, Arend. 1971. "Comparative Politics and the Comparative Method." American Political Science Review 65:682-693.

Longford, Nicholas T. 1987. "A Fast Scoring Algorithm for Maximum Likelihood Estimation in Unbalanced Mixed Models with Nested Random Effects." Biometrika 74:817-827.

Longford, Nicholas T. 1993. Random Coefficient Models. Oxford: Clarendon Press.

MacKuen, Michael, and Courtney Brown. 1987. "Political Context and Attitude Change." American Political Science Review 91:471-490.

Mason, William M., George Y. Wong, and Barbara Entwisle. 1983. "Contextual Analysis Through the Multilevel Linear Model." Sociological Methodology 1983-1984:72-103.

McDonald, Roderick P. 1994. "The Bilevel Reticular Action Model for Path Analysis with Latent Variables." Sociological Methods & Research 22:399-413.

Meijer, Erik, Rien Van Der Leeden, and Frank M.T.A. Busing. 1995. "Implementing the Bootstrap for Multilevel Models." Multilevel Modelling Newsletter 7:7-11.

Morris, Carl. 1983. "Parametric Empirical Bayes Inference: Theory and Applications." Journal of the American Statistical Association 78:47-65.

Muthen, Bengt O. 1989. "Latent Variable Modeling in Heterogeneous Populations." Psychometrika 54:557-585.

Muthen, Bengt O. 1991. "Multilevel Factor Analysis of Class and Student Achievement Components." Journal of Educational Measurement 28:338-354.

Muthen, Bengt O. 1994. "Multilevel Covariance Structure Analysis." Sociological Methods & Research 22:376-398.

Opdenakker, Marie-Christine, and Jan Van Damme. 2000. "The Importance of Identifying Levels in Multilevel Analysis: An Illustration of the Effects of Ignoring the Top or Intermediate Levels in School Effectiveness Research." School Effectiveness and School Improvement 11:103-130.

Przeworski, Adam, and Henry Teune. 1970. The Logic of Comparative Social Inquiry. New York: Wiley-Interscience.
Quillian, Lincoln. 1995. "Prejudice as a Response to Perceived Group Threat: Population Composition and Anti-Immigrant and Racial Prejudice in Europe." American Sociological Review 60:586-611.

Rasbash, Jon, William Browne, Harvey Goldstein, Min Yang, Ian Plewis, Michael Healy, Geoff Woodhouse, David Draper, Ian Langford, and Toby Lewis. 2000. "A User's Guide to MLwiN." London: Multilevel Models Project at the Institute of Education, University of London.

Raudenbush, Stephen W. 1988. "Educational Applications of Hierarchical Linear Models: A Review." Journal of Educational Statistics 13:85-116.

Ray, Leonard. 1999. "Measuring Party Orientations towards European Integration: Results from an Expert Survey." European Journal of Political Research 36:283-306.

Richardson, A.M., and Alan H. Welsh. 1994. "Asymptotic Properties of Restricted Maximum Likelihood (REML) Estimates for Hierarchical Linear Models." Australian Journal of Statistics 36:31-43.

Rokkan, Stein. 1966. "Comparative Cross-National Research: The Context of Current Efforts." In Comparing Nations: The Use of Quantitative Data in Cross-National Research, ed. Richard L. Merritt and Stein Rokkan. New Haven: Yale University Press.

Rubin, Donald B. 1981. "Estimation in Parallel Randomized Experiments." Journal of Educational Statistics 6(4):377-400.

Sayrs, Lois W. 1989. Pooled Time Series Analysis. Newbury Park, CA: Sage.

Searle, Shayle R., George Casella, and Charles E. McCullogh. 1992. Variance Components. New York: John Wiley & Sons.

Seltzer, Michael H., Wing Hung Wong, and Anthony S. Bryk. 1996. "Bayesian Analysis in Applications of Hierarchical Models: Issues and Methods." Journal of Educational and Behavioral Statistics 21:131-167.

Siune, Karen, and Palle Svensson. 1993. "The Danes and the Maastricht Treaty: The Danish EC Referendum of June 1992." Electoral Studies 12:99-111.

Snijders, Tom A.B., and Roel J. Bosker. 1994. "Standard Errors and Sampling Sizes for Two-Level Research." Journal of Educational Statistics 18:237-261.

Snijders, Tom A.B., and Roel J. Bosker. 1999. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. London: Sage.

Sorbom, Dag. 1974. "A General Method for Studying Differences in Factor Means and Factor Structure between Groups." British Journal of Mathematical and Statistical Psychology 27:229-239.

Steenbergen, Marco R., Whitt Killburn, and Jenny Wolak. 2001. "Affective Moderators of On-Line and Memory-Based Processing in Candidate Evaluation." Presented at the annual meeting of the Midwest Political Science Association.

Stimson, James A. 1985. "Regression in Space and Time: A Statistical Essay." American Journal of Political Science 29:914-947.

Swallow, William H., and John F. Monahan. 1984. "Monte Carlo Comparisons of ANOVA, MINQUE, REML, and ML Estimators of Variance Components." Technometrics 26:47-57.

Swamy, P.A.V.B. 1970. "Efficient Inference in a Random Coefficient Regression Model." Econometrica 38:311-323.

Swamy, P.A.V.B., and George S. Tavlas. 1995. "Random Coefficient Models: Theory and Applications." Journal of Economic Surveys 9:165-196.

Thum, Yeow Meng. 1997. "Hierarchical Linear Models for Multivariate Outcomes." Journal of Educational and Behavioral Statistics 22:77-108.

Van Den Eeden, Pieter. 1988. "A Two-Step Procedure for Analysing Multi-Level Structured Data." In Sociometric Research 2: Data Analysis, ed. Willem E. Saris and Irmtraud N. Gallhofer. New York: St. Martin's Press.

Van Den Eeden, Pieter, and Harry J.M. Huttner. 1982. "Multi-level Research." Current Sociology 20:1-178.

Western, Bruce. 1998. "Causal Heterogeneity in Comparative Research: A Bayesian Hierarchical Modeling Approach." American Journal of Political Science 42:1233-1259.

Winer, Benjamin J. 1971. Statistical Principles in Experimental Design. New York: McGraw-Hill.

Zaller, John R. 1992. The Nature and Origins of Mass Opinion. Cambridge: Cambridge University Press.

Zeger, Scott L., and Rezaul Karim. 1991. "Generalized Linear Models with Random Effects: A Gibbs Sampling Approach." Journal of the American Statistical Association 86:79-95.

Zorn, Christopher J.W. 2001. "Generalized Equation Models for Correlated Data: A Review with Applications." American Journal of Political Science 45:470-90.