Summary

Results from classic linear regression regarding the effect of adjusting for covariates upon the precision of an estimator of exposure effect are often assumed to apply more generally to other types of regression models. In this paper we show that such an assumption is not justified in the case of logistic regression, where the effect of adjusting for covariates upon precision is quite different. For example, in classic linear regression the adjustment for a non-confounding predictive covariate results in improved precision, whereas such adjustment in logistic regression results in a loss of precision. However, when testing for a treatment effect in randomized studies, it is always more efficient to adjust for predictive covariates when logistic models are used, and thus in this regard the behavior of logistic regression is the same as that of classic linear regression.

Key words: Adjustment for covariates; Asymptotic relative efficiency; Classic linear regression; Logistic regression; Omitted covariate; Precision.
1 Introduction
The ability of covariance adjustment to improve the precision of estimates is a
long-standing idea in statistics that originated with R.A. Fisher (1932). In particular, in a
randomized experiment, when the assumptions of 'classic' linear regression apply,
adjustment for covariates that are associated with the response variable is not required to
obtain a valid estimate of the treatment effect, but nonetheless is desirable, as it will
improve the precision of the treatment effect estimate. This improvement in precision can
be explained in terms of a reduction of residual variance, an intuitive notion so persuasive
that it has become the conventional wisdom to assume that similar gains in precision will
be achieved with respect to regression models other than the classic, such as logistic
regression (Mantel & Haenszel, 1959; Mantel, 1989).
Recently, however, some authors have recognized that in some situations the
conventional wisdom regarding covariate adjustment does not apply. Wickramaratne &
Holford (1989) give a specific 2 x 2 x 2 contingency table example (for which a logistic
regression analysis is appropriate) in which the pooled and stratum specific (log) odds
ratio estimates are equal, and where the variance of the pooled (log) odds ratio estimate
is less than that of the stratified estimate. This point was also addressed by Breslow &
Day (1987).
In this paper it will be proven that adjustment for covariates always leads to a loss (or
at best no gain) of precision with respect to logistic regression models. Section 2 outlines
the details of the classic linear regression model, which are then used for comparison with
logistic model results obtained in later sections. In § 3 logistic regression models which parametrize the 2 × 2 × 2 contingency table situation are introduced, and asymptotic variance formulae for the pooled and adjusted estimates of exposure effect are stated. In § 4 it is demonstrated that the variance of the pooled estimate is always less than or equal to the variance of the adjusted estimate, and this result is then extended to the more general case of several strata. In § 5 it is demonstrated that the result of § 4 also applies to common finite sample estimates of the asymptotic variances. In § 6 a simple argument, involving the symmetric nature of logistic regression, is given which demonstrates that the conventional wisdom cannot apply to logistic regression. In § 7 the effects of certain key factors which influence precision are examined. In § 8, it is shown that it is always as or more efficient to adjust for covariates when testing for the presence of a treatment effect in randomized studies, in the context of a logistic regression model, despite the associated loss of precision demonstrated in § 4.
Here $\rho_{12}$ is the simple correlation between the variables $X_1$ and $X_2$, and $\rho_{Y2\cdot 1}$ is the partial correlation between the variables $Y$ and $X_2$ conditional on fixed $X_1$.
The comparison of asymptotic variances is of particular interest when there is no confounding, i.e. when $b_1^* = b_1$, and hence both estimators $\hat b_1^*$ and $\hat b_1$ are estimating the same unknown population parameter. For the classic linear regression models described above, there will be no confounding if one or both of the following two conditions holds.
Condition 1. $\rho_{12} = 0$.

Condition 2. $\rho_{Y2\cdot 1} = 0$.
We now consider the behavior of the $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ more generally, not restricting ourselves to conditions of no confounding. This provides important insight into key factors which influence the precision of the estimator $\hat b_1$. In general, the value of $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ is seen to be (i) less than, (ii) equal to, or (iii) greater than 1 depending on whether $\rho^2_{Y2\cdot 1}$ is (i) less than, (ii) equal to, or (iii) greater than $\rho^2_{12}$. Thus a strong association between $Y$ and $X_2$ has a beneficial effect upon the precision of $\hat b_1$, whereas a strong association between $X_1$ and $X_2$ has a detrimental effect, and hence the precision of $\hat b_1$ reflects the competing effects of these $Y$–$X_2$ and $X_1$–$X_2$ relationships.
It is the above behavior of the $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$, and more generally of the precision of $\hat b_1$, that we loosely refer to as the conventional wisdom. The purpose of this paper is to demonstrate that the conventional wisdom breaks down with respect to the logistic regression model.
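This competing-effects behavior of the classic linear model is easy to check by simulation. The short sketch below (illustrative only; the sample size, coefficients and correlation are our own choices, and Python is used purely for convenience) compares the empirical variances of the unadjusted and adjusted least squares estimates of the $X_1$ coefficient in the two extreme situations just described.

    import numpy as np

    rng = np.random.default_rng(1)

    def slope_variances(b2, rho, n=200, reps=2000):
        """Empirical variances of the unadjusted and adjusted estimates of b1
        in Y = b1*X1 + b2*X2 + e, where corr(X1, X2) = rho and b1 = 1."""
        unadj, adj = [], []
        for _ in range(reps):
            x1 = rng.standard_normal(n)
            x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
            y = 1.0 * x1 + b2 * x2 + rng.standard_normal(n)
            # Unadjusted: regress Y on X1 alone (with intercept).
            unadj.append(np.linalg.lstsq(np.column_stack([np.ones(n), x1]), y, rcond=None)[0][1])
            # Adjusted: regress Y on X1 and X2.
            adj.append(np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0][1])
        return np.var(unadj), np.var(adj)

    # Predictive covariate, uncorrelated with X1: adjustment improves precision.
    print(slope_variances(b2=1.0, rho=0.0))
    # Non-predictive covariate, correlated with X1: adjustment loses precision.
    print(slope_variances(b2=0.0, rho=0.6))

In the first call the adjusted variance is roughly half the unadjusted one; in the second it is roughly $1/(1-\rho_{12}^2)$ times larger, in line with the discussion above.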
For the 2 × 2 × 2 situation, the pooled and adjusted logistic regression models are

$$\log\left[\frac{\mathrm{pr}(Y=1 \mid X_1)}{1 - \mathrm{pr}(Y=1 \mid X_1)}\right] = a^* + b_1^* X_1, \qquad (3)$$

$$\log\left[\frac{\mathrm{pr}(Y=1 \mid X_1, X_2)}{1 - \mathrm{pr}(Y=1 \mid X_1, X_2)}\right] = a + b_1 X_1 + b_2 X_2. \qquad (4)$$
Model (3) always provides a valid description of the relationship between the dichotomous variables $Y$ and $X_1$, whereas model (4) imposes an assumption of no interaction (i.e. the variables $X_1$ and $X_2$ are assumed to have additive effects with respect to the log odds).
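By way of illustration of the contrast discussed in this paper, the following sketch (a hypothetical simulation, not taken from the paper; it assumes the Python packages numpy and statsmodels are available) fits models (3) and (4) to data generated with $X_1$ and $X_2$ independent and $b_2 \ne 0$, and compares the reported standard errors of the $X_1$ coefficient.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 20000
    x1 = rng.binomial(1, 0.5, n)          # exposure (randomized)
    x2 = rng.binomial(1, 0.5, n)          # predictive covariate, independent of x1
    a, b1, b2 = -1.0, 0.7, 1.5            # hypothetical true coefficients
    p = 1.0 / (1.0 + np.exp(-(a + b1 * x1 + b2 * x2)))
    y = rng.binomial(1, p)

    # Model (3): pooled logistic regression of Y on X1 only.
    pooled = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)
    # Model (4): adjusted logistic regression of Y on X1 and X2.
    adjusted = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0)

    print("SE of X1 coefficient, pooled  :", pooled.bse[1])
    print("SE of X1 coefficient, adjusted:", adjusted.bse[1])

Here the adjusted standard error is the larger of the two, even though $X_2$ is independent of $X_1$ and predictive of $Y$; this is the behavior established formally in § 4.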
Suppose now that simple random samples of $N_1$ exposed and $N_0$ unexposed individuals are obtained, and that both logistic regression models are fit via the method of maximum likelihood, resulting in respective estimators $\hat b_1^*$ and $\hat b_1$. Standard likelihood theory techniques result in the following asymptotic variance formulae (Gart, 1962):
$$\mathrm{var}(\hat b_1^* \mid X_1) = \frac{1}{N_1 p_1} + \frac{1}{N_1 q_1} + \frac{1}{N_0 p_0} + \frac{1}{N_0 q_0}, \qquad (5)$$

$$\mathrm{var}(\hat b_1 \mid X_1, X_2) = \left\{\left[\frac{1}{N_{11} p_{11}} + \frac{1}{N_{11} q_{11}} + \frac{1}{N_{01} p_{01}} + \frac{1}{N_{01} q_{01}}\right]^{-1} + \left[\frac{1}{N_{10} p_{10}} + \frac{1}{N_{10} q_{10}} + \frac{1}{N_{00} p_{00}} + \frac{1}{N_{00} q_{00}}\right]^{-1}\right\}^{-1}. \qquad (6)$$

Here $p_i = \mathrm{pr}(Y = 1 \mid X_1 = i)$ and $q_i = 1 - p_i$, $p_{ij} = \mathrm{pr}(Y = 1 \mid X_1 = i, X_2 = j)$ and $q_{ij} = 1 - p_{ij}$, and $N_{ij}$ denotes the number of sampled individuals with $X_1 = i$ and $X_2 = j$, so that $N_{11} + N_{10} = N_1$ and $N_{01} + N_{00} = N_0$. Also, here and throughout, the term asymptotic refers to both $N_0$ and $N_1$ tending to infinity.
The second variance formula given above is conditional on both $X_1$ and $X_2$. However, in accordance with the sampling scheme, in which prespecified numbers of exposed and unexposed individuals are sampled, but where the distribution of $X_2$ is allowed to vary, for the purpose of defining the asymptotic relative efficiency we shall require the variance conditional only on $X_1$, that is $\mathrm{var}(\hat b_1 \mid X_1)$, and thus must take the expectation of $\mathrm{var}(\hat b_1 \mid X_1, X_2)$ with respect to $X_2$. This results in the following formula:
$$\mathrm{var}(\hat b_1 \mid X_1) = \left\{\left[\frac{1}{N_1 c_{11} p_{11}} + \frac{1}{N_1 c_{11} q_{11}} + \frac{1}{N_0 c_{01} p_{01}} + \frac{1}{N_0 c_{01} q_{01}}\right]^{-1} + \left[\frac{1}{N_1 c_{10} p_{10}} + \frac{1}{N_1 c_{10} q_{10}} + \frac{1}{N_0 c_{00} p_{00}} + \frac{1}{N_0 c_{00} q_{00}}\right]^{-1}\right\}^{-1}. \qquad (7)$$

In the above formula $c_{ij} = \mathrm{pr}(X_2 = j \mid X_1 = i)$ for $i, j = 0, 1$.
Table 1 gives the set of tables which represent the outcomes expected to result from the sampling scheme described above. In all tables in this paper D will denote 'diseased', D̄ 'non-diseased', E 'exposed', and Ē 'unexposed'. To avoid technical difficulties, we will assume that the population contains no 'structural zeroes' (McCullagh & Nelder, 1983, p. 61), so that none of the expected cell entries are 0.
The entries in the pooled table equal the sum of the corresponding entries in the two sub-tables, for example

$$N_1 c_{10} p_{10} + N_1 c_{11} p_{11} = N_1 [c_{10} p_{10} + c_{11} p_{11}] = N_1 p_1.$$

Furthermore, we see that $\mathrm{var}(\hat b_1^* \mid X_1)$ simply equals the sum of the inverses of the expected cell entries of the pooled table, and that $\mathrm{var}(\hat b_1 \mid X_1)$ can be expressed as $[V_1^{-1} + V_0^{-1}]^{-1}$, where $V_j$ equals the sum of the inverses of the expected cell entries of the sub-table $X_2 = j$, for $j = 0, 1$.
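This description translates directly into a short computation. The sketch below (our own illustration; the values of $N_i$, $c_{ij}$ and $p_{ij}$ are hypothetical) builds the expected cell entries of Table 1 and evaluates formulae (5) and (7) as sums of inverses of expected cells.

    # Hypothetical design and population values (not from the paper).
    N = {1: 150, 0: 150}                      # numbers of exposed / unexposed sampled
    c = {(1, 0): 0.4, (1, 1): 0.6,            # c_ij = pr(X2 = j | X1 = i)
         (0, 0): 0.7, (0, 1): 0.3}
    p = {(1, 0): 0.30, (1, 1): 0.55,          # p_ij = pr(Y = 1 | X1 = i, X2 = j)
         (0, 0): 0.15, (0, 1): 0.35}

    # Pooled table: expected cells N_i p_i and N_i q_i, with p_i = c_i0 p_i0 + c_i1 p_i1.
    p_marg = {i: c[i, 0] * p[i, 0] + c[i, 1] * p[i, 1] for i in (0, 1)}
    var_pooled = sum(1.0 / (N[i] * p_marg[i]) + 1.0 / (N[i] * (1 - p_marg[i]))
                     for i in (0, 1))                                   # formula (5)

    # V_j = sum of inverses of the expected cells of sub-table X2 = j; formula (7).
    V = {j: sum(1.0 / (N[i] * c[i, j] * p[i, j]) + 1.0 / (N[i] * c[i, j] * (1 - p[i, j]))
                for i in (0, 1))
         for j in (0, 1)}
    var_adjusted = 1.0 / (1.0 / V[0] + 1.0 / V[1])

    print(var_pooled, var_adjusted, var_pooled <= var_adjusted)         # ARP <= 1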
The estimator $\hat b_1$ referred to above is the maximum likelihood estimator. Another estimator commonly used is the 'inverse variance weighted, stratified estimator' (Weinberg, 1985), in which the parameter $b_1$ is estimated separately from the two sub-tables (using observed proportions), and then a weighted average of the two estimates is obtained, the weights being inversely proportional to their respective estimated variances. This estimator, also referred to as the 'Woolf estimator' (Woolf, 1955), can easily be computed by hand, whereas the maximum likelihood estimator generally requires an iterative scheme, and hence the use of a computer. It can be shown that the variance formulae given for the maximum likelihood estimator $\hat b_1$ also apply to the Woolf estimator (Gart, 1962).
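A minimal sketch of the Woolf estimator just described is given below; the cell counts are hypothetical, and each stratum variance is estimated in the usual way by the sum of reciprocal cell counts.

    import math

    def woolf(strata):
        """Woolf (inverse variance weighted) estimate of a common log odds ratio.
        Each stratum is a 2x2 table (a, b, c, d) = (exposed diseased, exposed
        non-diseased, unexposed diseased, unexposed non-diseased)."""
        log_ors, weights = [], []
        for a, b, c, d in strata:
            log_ors.append(math.log(a * d / (b * c)))
            weights.append(1.0 / (1 / a + 1 / b + 1 / c + 1 / d))  # inverse estimated variance
        estimate = sum(w * lor for w, lor in zip(weights, log_ors)) / sum(weights)
        variance = 1.0 / sum(weights)      # estimated variance of the weighted estimate
        return estimate, variance

    # Hypothetical sub-tables for X2 = 0 and X2 = 1.
    print(woolf([(20, 30, 10, 45), (30, 20, 15, 30)]))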
Table 1
Expected cell frequencies for pooled and sub-tables from cohort design.

Pooled table:           D            D̄
       E             N1 p1        N1 q1
       Ē             N0 p0        N0 q0

Sub-table X2 = 0:       D            D̄
       E          N1 c10 p10   N1 c10 q10
       Ē          N0 c00 p00   N0 c00 q00

Sub-table X2 = 1:       D            D̄
       E          N1 c11 p11   N1 c11 q11
       Ē          N0 c01 p01   N0 c01 q01
In order to compare the precision of the two estimators we consider the asymptotic relative precision (ARP) of $\hat b_1$ to $\hat b_1^*$, defined as the ratio of the inverses of their asymptotic variances:

$$\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) = \frac{[\mathrm{var}(\hat b_1 \mid X_1)]^{-1}}{[\mathrm{var}(\hat b_1^* \mid X_1)]^{-1}} = \frac{\mathrm{var}(\hat b_1^* \mid X_1)}{\mathrm{var}(\hat b_1 \mid X_1)}.$$

Here the asymptotic variances are those stated in § 3. Note again that both of these asymptotic variances are conditional on $X_1$, in accordance with the sampling scheme.
The main result of this paper is that $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) \le 1$, with equality occurring if and only if the variable $X_2$ is independent of $(Y, X_1)$. Since $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) \le 1$ is equivalent to $[\mathrm{var}(\hat b_1^* \mid X_1)]^{-1} \ge [\mathrm{var}(\hat b_1 \mid X_1)]^{-1}$, we must show that

$$\left[\frac{1}{N_1 p_1} + \frac{1}{N_1 q_1} + \frac{1}{N_0 p_0} + \frac{1}{N_0 q_0}\right]^{-1} \ge \left[\frac{1}{N_1 c_{10} p_{10}} + \frac{1}{N_1 c_{10} q_{10}} + \frac{1}{N_0 c_{00} p_{00}} + \frac{1}{N_0 c_{00} q_{00}}\right]^{-1} + \left[\frac{1}{N_1 c_{11} p_{11}} + \frac{1}{N_1 c_{11} q_{11}} + \frac{1}{N_0 c_{01} p_{01}} + \frac{1}{N_0 c_{01} q_{01}}\right]^{-1}.$$
This result follows readily as an application of Minkowski's inequality (Hardy, Littlewood & Pólya, 1952, pp. 30-31), which for our purposes may be stated as follows: assume all $a_{ij}$ positive, for $i = 1, \ldots, I$ and $j = 0, \ldots, J-1$. For finite $r < 1$, but not equal to 0, we have the following:

$$\left[\sum_{i=1}^{I}\left(\sum_{j=0}^{J-1} a_{ij}\right)^{r}\right]^{1/r} \ge \sum_{j=0}^{J-1}\left[\sum_{i=1}^{I} a_{ij}^{r}\right]^{1/r},$$

with equality occurring if and only if $a_{ij} = k\, a_{ij'}$ for all $i$ and all choices of $j \ne j'$ and for some finite $k > 0$, where the value of $k$ depends on the specific choice of $j$ and $j'$.

For our particular application, we restate the above theorem for the specific case of $I = 4$, $J = 2$, and $r = -1$. This yields

$$\left[\frac{1}{a_{10} + a_{11}} + \frac{1}{a_{20} + a_{21}} + \frac{1}{a_{30} + a_{31}} + \frac{1}{a_{40} + a_{41}}\right]^{-1} \ge \left[\frac{1}{a_{10}} + \frac{1}{a_{20}} + \frac{1}{a_{30}} + \frac{1}{a_{40}}\right]^{-1} + \left[\frac{1}{a_{11}} + \frac{1}{a_{21}} + \frac{1}{a_{31}} + \frac{1}{a_{41}}\right]^{-1},$$

with equality occurring if and only if $a_{i0} = k\, a_{i1}$ for $i = 1, \ldots, 4$ and for some finite $k > 0$.
From this we can immediately conclude, identifying the $a_{ij}$ with the expected cell entries of sub-table $X_2 = j$ in Table 1, that $[\mathrm{var}(\hat b_1^* \mid X_1)]^{-1} \ge [\mathrm{var}(\hat b_1 \mid X_1)]^{-1}$, that is $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) \le 1$, with equality occurring if and only if the expected cell entries of the two sub-tables are proportional, which is the case if and only if $X_2$ is independent of $(Y, X_1)$.
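The specific instance of the inequality used here is easy to check numerically. The sketch below draws positive cell entries $a_{ij}$ at random (arbitrary illustrative values) and compares the two sides.

    import random

    random.seed(0)
    for _ in range(5):
        a = [[random.uniform(0.5, 50.0) for j in range(2)] for i in range(4)]
        # Left side: inverse of the sum of inverses of the pooled entries a_i0 + a_i1.
        lhs = 1.0 / sum(1.0 / (row[0] + row[1]) for row in a)
        # Right side: sum over j of the inverse of the sum of inverses of column j.
        rhs = sum(1.0 / sum(1.0 / a[i][j] for i in range(4)) for j in range(2))
        print(round(lhs, 4), ">=", round(rhs, 4), lhs >= rhs - 1e-12)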
For the logistic regression models (3) and (4) stated in § 3 there will be no confounding, that is $b_1^* = b_1$, if one or both of the following two conditions holds (Gail, 1986).

Condition 1'. $X_1$ and $X_2$ are independent given $Y$.

Condition 2'. $Y$ and $X_2$ are independent given $X_1$ (note this is equivalent to $b_2 = 0$).

Condition 2' is very much analogous to the no confounding Condition 2, $\rho_{Y2\cdot 1} = 0$, of § 2. In particular, for classic linear regression the condition '$Y$ and $X_2$ independent given $X_1$' does in fact imply that $\rho_{Y2\cdot 1} = 0$. When Condition 2' alone holds, $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) < 1$, which is the same result as was obtained for classic linear regression when the analogous no confounding Condition 2 holds. Thus we see that, for both logistic and classic linear regression, adjustment for a non-predictive covariate $X_2$ which is associated with the predictor variable $X_1$ results in a loss of precision.
Condition 1' may also be regarded as analogous to the no confounding Condition 1, $\rho_{12} = 0$, of § 2, in that both conditions refer to a lack of an association between the variables $X_1$ and $X_2$. However, for logistic regression the absence of association is conditional on $Y$, which is not the case for classic linear regression. In particular, it is not Condition 1' which implies Condition 1, $\rho_{12} = 0$, with respect to classic linear regression, but rather the condition '$X_1$ and $X_2$ independent'. When Condition 1' alone holds, $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) < 1$, which is not consistent with the analogous result from classic linear regression, where we saw that $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) > 1$ when the no confounding Condition 1 holds. Thus, whereas adjusting for a non-confounding covariate $X_2$ which is associated with the dependent variable $Y$ (conditional on $X_1$) results in a gain in precision with respect to classic linear regression, it results in a loss of precision with respect to logistic regression.
When both Conditions 1' and 2' hold, the variable $X_2$ is independent of $(Y, X_1)$, and thus $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) = 1$. Furthermore, note that when $X_2$ is independent of $(Y, X_1)$, with respect to classic linear regression, both $\rho_{12} = 0$ and $\rho_{Y2\cdot 1} = 0$. Thus, for both logistic and classic linear regression, $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*) = 1$ when the variable $X_2$ is independent of $(Y, X_1)$.

More generally, for classic linear regression we saw that the value of $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ can be less than, equal to, or greater than 1 depending on the relative strengths of the $Y$–$X_2$ and $X_1$–$X_2$ relationships, whereas for logistic regression the value of $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ is always less than or equal to 1 (again with equality occurring if and only if $X_2$ is independent of $(Y, X_1)$). This suggests that, unlike classic linear regression, where the $Y$–$X_2$ and $X_1$–$X_2$ relationships have opposing effects which compete with each other to determine the relative precision of $\hat b_1$, with respect to logistic regression these two relationships have similar effects which combine to cause an automatic loss of precision. Sections 6 and 7 give additional insight into the behavior of the $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ with respect to logistic regression.
The result we have obtained for the case of two strata, i.e. two levels of the variable
X2, can be extended to the more general case of J > 2 strata in a straightforward manner.
In particular, the asymptotic variance of the maximum likelihood estimator of $b_1$ can be shown to equal (Gart, 1962)

$$\mathrm{var}(\hat b_1 \mid X_1) = \left\{\sum_{j=0}^{J-1}\left[\frac{1}{N_1 c_{1j} p_{1j}} + \frac{1}{N_1 c_{1j} q_{1j}} + \frac{1}{N_0 c_{0j} p_{0j}} + \frac{1}{N_0 c_{0j} q_{0j}}\right]^{-1}\right\}^{-1}.$$

The result $[\mathrm{var}(\hat b_1^* \mid X_1)]^{-1} \ge [\mathrm{var}(\hat b_1 \mid X_1)]^{-1}$ then follows as an application of Minkowski's inequality with $I = 4$, $r = -1$, and $J$ = the number of strata, in a manner completely analogous with the two strata case. This also allows extension of the asymptotic relative precision result to the case of adjustment for a set of discrete covariates.
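In computational terms the extension is immediate: the adjusted precision is obtained by summing $V_j^{-1}$ over however many strata there are. A minimal sketch, with hypothetical expected cells for $J = 3$ strata, follows.

    def arp(pooled_cells, stratum_cells):
        """ARP of the adjusted to the pooled estimator, computed from expected
        cell entries; stratum_cells is a list of J strata whose cells sum,
        cell by cell, to pooled_cells."""
        var_pooled = sum(1.0 / x for x in pooled_cells)
        precision_adjusted = sum(1.0 / sum(1.0 / x for x in cells) for cells in stratum_cells)
        return var_pooled * precision_adjusted      # = var(pooled) / var(adjusted)

    strata = [[20, 30, 10, 45], [30, 20, 15, 30], [25, 25, 20, 40]]   # hypothetical, J = 3
    pooled = [sum(col) for col in zip(*strata)]
    print(arp(pooled, strata))                                        # always <= 1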
In § 4 it was proven that $\mathrm{var}(\hat b_1^* \mid X_1) \le \mathrm{var}(\hat b_1 \mid X_1)$, a result which pertains to the asymptotic variances. Suppose now that an actual set of data is obtained, and from that data set estimates of $b_1^*$ and $b_1$ are computed. In this section we will consider both the maximum likelihood estimator and the Woolf estimator of $b_1$. Typically an investigator will also obtain estimates of the variances $\mathrm{var}(\hat b_1^* \mid X_1)$ and $\mathrm{var}(\hat b_1 \mid X_1)$. Although the maximum likelihood estimator and the Woolf estimator of $b_1$ have the same asymptotic variance, the method by which this variance is estimated is generally different. In this section we will examine the question of whether the result of § 4, which pertains to asymptotic variances, extends to their common finite sample estimates.
The estimation of $\mathrm{var}(\hat b_1^* \mid X_1)$ given by (5) is very straightforward. There is only one commonly used method for estimating $\mathrm{var}(\hat b_1^* \mid X_1)$, namely to substitute the maximum likelihood estimates $\hat p_1$ and $\hat p_0$ for $p_1$ and $p_0$, respectively. These maximum likelihood estimates are the observed proportions of diseased individuals (that is $Y = 1$) among exposed (that is $X_1 = 1$) and unexposed (that is $X_1 = 0$) individuals. Suppose now that the data is as given in Table 2. From this data set we obtain the estimated variance of $\hat b_1^*$, denoted by $\widehat{\mathrm{var}}(\hat b_1^*)$, as

$$\widehat{\mathrm{var}}(\hat b_1^*) = \frac{1}{a_{10} + a_{11}} + \frac{1}{a_{20} + a_{21}} + \frac{1}{a_{30} + a_{31}} + \frac{1}{a_{40} + a_{41}}.$$
Let us now examine the issue of estimating the variance of $\hat b_1$. Usually, we further condition on $X_2$ in calculating an estimated variance of $\hat b_1$. Therefore, regardless of whether the maximum likelihood estimator or the Woolf estimator has been used for $\hat b_1$, the estimate of $\mathrm{var}(\hat b_1 \mid X_1, X_2)$, which we shall denote by $\widehat{\mathrm{var}}(\hat b_1)$, is obtained by substituting estimates $\hat p_{ij}$ (for the unknown probabilities $p_{ij}$, for $i, j = 0, 1$) into the asymptotic variance formula (6).

When the method of maximum likelihood is used to obtain an estimate $\hat b_1$, typically the estimates $\hat p_{ij}$ which are substituted into the asymptotic variance formula are the maximum likelihood estimates based on the regression model. When the Woolf estimator is used, typically the estimates $\hat p_{ij}$ which are substituted are the observed proportions, which, in this case, are not the same as the maximum likelihood estimates. Note, however, that it is actually valid to substitute either set of estimates $\hat p_{ij}$ into the asymptotic variance formula regardless of which estimator $\hat b_1$ has been obtained (although the maximum likelihood estimates $\hat p_{ij}$ will generally not be available when the Woolf estimator has been obtained).

First consider the $\widehat{\mathrm{var}}(\hat b_1)$ obtained by substituting the observed proportions in (6);
Table 2
Data for pooled and sub-tables arising from cohort studies.

Pooled table:           D            D̄
       E          a10 + a11    a20 + a21
       Ē          a30 + a31    a40 + a41

Sub-table X2 = 0:       D            D̄
       E              a10          a20
       Ē              a30          a40

Sub-table X2 = 1:       D            D̄
       E              a11          a21
       Ē              a31          a41
then

$$\widehat{\mathrm{var}}(\hat b_1) = \left\{\left[\frac{1}{a_{10}} + \frac{1}{a_{20}} + \frac{1}{a_{30}} + \frac{1}{a_{40}}\right]^{-1} + \left[\frac{1}{a_{11}} + \frac{1}{a_{21}} + \frac{1}{a_{31}} + \frac{1}{a_{41}}\right]^{-1}\right\}^{-1}.$$

Thus it follows immediately from Minkowski's inequality that when the observed proportions $\hat p_{ij}$ are used, $\widehat{\mathrm{var}}(\hat b_1^*) \le \widehat{\mathrm{var}}(\hat b_1)$, and hence we see that, in this case, the relationship between the asymptotic variances extends to these estimated variances.
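As a numerical illustration, the sketch below evaluates both estimated variances for a hypothetical data set laid out as in Table 2.

    # Hypothetical observed counts a_ij: rows i = 1,...,4 correspond to the cells
    # (E, D), (E, D-bar), (E-bar, D), (E-bar, D-bar), and columns j = 0, 1 to the
    # sub-tables X2 = 0 and X2 = 1.
    a = {(1, 0): 18, (1, 1): 34, (2, 0): 32, (2, 1): 16,
         (3, 0): 9,  (3, 1): 21, (4, 0): 41, (4, 1): 29}

    var_pooled_hat = sum(1.0 / (a[i, 0] + a[i, 1]) for i in (1, 2, 3, 4))
    var_adjusted_hat = 1.0 / sum(1.0 / sum(1.0 / a[i, j] for i in (1, 2, 3, 4))
                                 for j in (0, 1))
    print(var_pooled_hat, var_adjusted_hat, var_pooled_hat <= var_adjusted_hat)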
Now consider the case where the maximum likelihood estimates $\hat p_{ij}$ are used. It is a well known property of maximum likelihood estimation that the fitted sub-table cell frequencies must sum to the pooled table (Breslow & Day, 1980). Thus we have, for example, $N_{10}\hat p_{10} + N_{11}\hat p_{11}$ equalling the total number of diseased, exposed (that is $Y = 1$, $X_1 = 1$) individuals, which in the above data set is $a_{10} + a_{11}$. However, this total number equals $N_1 \hat p_1$, and thus we have $N_{10}\hat p_{10} + N_{11}\hat p_{11} = N_1 \hat p_1$. Similarly we have $N_{10}\hat q_{10} + N_{11}\hat q_{11} = N_1 \hat q_1$, etc. Thus, Minkowski's inequality also applies when the maximum likelihood estimates $\hat p_{ij}$ are used, and once again we have the result $\widehat{\mathrm{var}}(\hat b_1^*) \le \widehat{\mathrm{var}}(\hat b_1)$.
Furthermore, the symmetric nature immediately suggests that the $Y$–$X_2$ and $X_1$–$X_2$ relationships have similar effects which combine to influence precision, in contrast to the situation observed with respect to classic linear regression, as was discussed previously in § 4.
The previous argument was stated specifically with respect to logistic regression models (3), (4), (8), and (9), for which the covariate $X_2$ is dichotomous, and where it was assumed that the variables $X_1$ and $X_2$ are independent given $Y$. However, the validity of the argument applies more generally to situations where there is confounding, and to situations involving adjustment for a set of covariates, some of which may be continuous. Thus, we strongly suspect that for logistic regression, when the risk factor of primary interest is dichotomous, adjustment for any set of covariates will result in a loss (or at best no gain) of precision.
Suppose now that we vary the parameter $m$ while holding the remaining four parameters fixed. In fixing the first three parameters, we have fixed the pooled table, and hence $\mathrm{var}(\hat b_1^* \mid X_1)$ also. Since $b_2$ is a function of $m$, it varies with $m$. Since the distribution of the pooled table into the sub-tables varies with $m$, we see that $\mathrm{var}(\hat b_1 \mid X_1)$, and hence $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$, also vary as $m$ varies. Consider now the population probabilities shown in Table 3.
Table 3
Population probabilities that form the basis of Fig. 1.

Pooled table:           D          D̄        Total
       E              0.250      0.250      0.500
       Ē              0.125      0.375      0.500

Sub-table X2 = 0:       D             D̄
       E          0.250(1 - m)      0.150
       Ē          0.125(1 - m)      0.225

Sub-table X2 = 1:       D             D̄
       E            0.250 m         0.100
       Ē            0.125 m         0.150
Figure 1. Plot of the asymptotic relative precision (ARP) of the adjusted estimator $\hat b_1$ to the pooled estimator $\hat b_1^*$, against $b_2$, holding all other parameters fixed at the values shown in Table 3.
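The quantities plotted in Fig. 1 can be recomputed directly from Table 3. The sketch below is our own reconstruction: it reads $m$ as $\mathrm{pr}(X_2 = 1 \mid Y = 1)$, takes $\mathrm{pr}(X_2 = 1 \mid Y = 0) = 0.4$ as implied by the fixed D̄ column of Table 3, and uses the corresponding expression for $b_2$ under the conditional-independence parametrization; the sample size cancels in the ratio, so the ARP depends only on the population probabilities.

    import math

    def arp_from_m(m):
        # Joint population probabilities of the two sub-tables in Table 3, listed as
        # (E, D), (E, D-bar), (E-bar, D), (E-bar, D-bar).
        sub0 = [0.250 * (1 - m), 0.150, 0.125 * (1 - m), 0.225]   # X2 = 0
        sub1 = [0.250 * m,       0.100, 0.125 * m,       0.150]   # X2 = 1
        pooled = [x + y for x, y in zip(sub0, sub1)]
        var_pooled = sum(1.0 / x for x in pooled)
        precision_adjusted = sum(1.0 / sum(1.0 / x for x in sub) for sub in (sub0, sub1))
        return var_pooled * precision_adjusted      # sample size cancels in this ratio

    k = 0.4                                          # pr(X2 = 1 | Y = 0) implied by Table 3
    for m in (0.05, 0.2, 0.4, 0.6, 0.8, 0.95):
        b2 = math.log(m * (1 - k) / (k * (1 - m)))   # our reading of the parametrization
        print(round(b2, 2), round(arp_from_m(m), 3))

At $m = 0.4$ the two sub-tables are proportional, $b_2 = 0$, and the ARP equals 1; as $|b_2|$ grows the ARP falls below 1, which is the shape traced out in Fig. 1.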
Table 4
Population probabilities that form the basis of Fig. 2.

Pooled table:           D               D̄          Total
       E            0.500 p1      0.500(1 - p1)     0.500
       Ē            0.500 p0      0.500(1 - p0)     0.500

Sub-table X2 = 0:       D               D̄
       E            0.200 p1      0.400(1 - p1)
       Ē            0.200 p0      0.400(1 - p0)

Sub-table X2 = 1:       D               D̄
       E            0.300 p1      0.100(1 - p1)
       Ē            0.300 p0      0.100(1 - p0)
We now examine the effect of the marginal distribution of $Y$ upon the asymptotic relative precision by varying $\mathrm{pr}(Y = 1)$, while the values of the other parameters are held fixed. Again we will assume that the variables $X_1$ and $X_2$ are independent given $Y$, so that the parametrization of the previous example also applies here. In this case we will fix the values of the parameters $\mathrm{pr}(X_1 = 1)$, $m$, and $k$, and then vary both $p_1$ and $p_0$ in such a way as to hold

$$b_1^* = \log\left[\frac{p_1/(1 - p_1)}{p_0/(1 - p_0)}\right]$$
fixed. As $p_1$ and $p_0$ vary, so does the overall incidence of disease $\mathrm{pr}(Y = 1)$. Table 4 gives an example of certain population probabilities.

In Table 4 we have set $\mathrm{pr}(X_1 = 1) = 0.50$, $k = 0.20$, and $m = 0.60$ (and thus $b_2 = \log 6$). Again we consider a total sample size of 200. We now vary both $p_1$ and $p_0$ so as to hold $b_1^*$ fixed at $\log 3$, to obtain a graph of $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ versus $\mathrm{pr}(Y = 1)$, shown in Fig. 2.

From Fig. 2 we see that for both small and large values of $\mathrm{pr}(Y = 1)$, the $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ is relatively close to the maximum value of 1, while for values of $\mathrm{pr}(Y = 1)$ closer to 0.5 the $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ is further from 1. Thus, this particular example suggests that the potential for loss of precision due to adjustment for covariates will tend to be greater in cohort studies where the disease is relatively common. It must be noted, however, that the minimum $\mathrm{ARP}(\hat b_1 \text{ to } \hat b_1^*)$ value does not occur exactly at $\mathrm{pr}(Y = 1) = 0.5$, and that in fact the minimum can occur at values of $\mathrm{pr}(Y = 1)$ quite far from 0.5 when the marginal distribution of $X_2$ is very skewed. Nonetheless, in most studies the marginal distribution
of $X_2$ will not be highly skewed, so that the conclusion reached above remains valid. The above example also suggests that the potential for loss of precision will tend to be particularly great for case-control studies, where the oversampling of cases ensures a relatively high frequency of disease in the sample.
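A parallel computation for Table 4 reproduces the behavior just summarized from Fig. 2. In the sketch below (our own reconstruction, with an arbitrary grid of $p_0$ values), $p_1$ is obtained from $p_0$ by holding $b_1^*$ fixed at $\log 3$, and the total sample size again cancels in the ratio.

    def arp_table4(p0):
        # p1 determined by holding the pooled log odds ratio b1* fixed at log 3.
        odds1 = 3.0 * p0 / (1.0 - p0)
        p1 = odds1 / (1.0 + odds1)
        # Sub-table probabilities from Table 4, listed as (E, D), (E, D-bar), (E-bar, D), (E-bar, D-bar).
        sub0 = [0.200 * p1, 0.400 * (1 - p1), 0.200 * p0, 0.400 * (1 - p0)]   # X2 = 0
        sub1 = [0.300 * p1, 0.100 * (1 - p1), 0.300 * p0, 0.100 * (1 - p0)]   # X2 = 1
        pooled = [x + y for x, y in zip(sub0, sub1)]
        var_pooled = sum(1.0 / x for x in pooled)
        precision_adjusted = sum(1.0 / sum(1.0 / x for x in sub) for sub in (sub0, sub1))
        incidence = 0.500 * p1 + 0.500 * p0          # overall pr(Y = 1)
        return incidence, var_pooled * precision_adjusted

    for p0 in (0.02, 0.10, 0.30, 0.50, 0.70, 0.90):
        incidence, arp = arp_table4(p0)
        print(round(incidence, 3), round(arp, 3))

The ARP is close to 1 for small and large values of $\mathrm{pr}(Y = 1)$ and dips toward its minimum for intermediate values, as described above.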
For testing the hypothesis of no treatment effect, that is $b_1 = 0$, the asymptotic relative efficiency of the adjusted to the pooled estimator is

$$\mathrm{ARE}(\hat b_1 \text{ to } \hat b_1^* \text{ at } b_1 = 0) = \left[\lim_{b_1 \to 0} \frac{db_1^*}{db_1}\right]^{-2} \lim_{b_1 \to 0} \frac{\mathrm{var}(\hat b_1^* \mid X_1)}{\mathrm{var}(\hat b_1 \mid X_1)}.$$
Here we are considering the parameter $b_1^*$ as a function of $b_1$. Notationally, let $E(A_j) = \sum_j \mathrm{pr}(X_2 = j) A_j$, that is, expectation over the distribution of $X_2$. According to this notation, $p_1 = E(p_{1j})$ and $p_0 = E(p_{0j})$. Consequently, $b_1^*$ can be expressed as

$$b_1^* = \log\left[\frac{E(p_{1j})\, E(q_{0j})}{E(p_{0j})\, E(q_{1j})}\right],$$

and $\mathrm{var}(\hat b_1^* \mid X_1)$ can be expressed as

$$\mathrm{var}(\hat b_1^* \mid X_1) = [N_1 E(p_{1j}) E(q_{1j})]^{-1} + [N_0 E(p_{0j}) E(q_{0j})]^{-1}.$$

Also, using the independence of $X_1$ and $X_2$, the formula for $\mathrm{var}(\hat b_1 \mid X_1)$ given by (7) can be expressed as

$$\mathrm{var}(\hat b_1 \mid X_1) = \left\{E\!\left(\left[\frac{1}{N_1 p_{1j} q_{1j}} + \frac{1}{N_0 p_{0j} q_{0j}}\right]^{-1}\right)\right\}^{-1}.$$
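This efficiency comparison can be checked numerically from the definition of the asymptotic relative efficiency as the limiting ratio of the noncentrality parameters $b_1^2/\mathrm{var}(\hat b_1 \mid X_1)$ and $b_1^{*2}/\mathrm{var}(\hat b_1^* \mid X_1)$. The sketch below is our own construction, with hypothetical parameter values, for a randomized design in which $\mathrm{pr}(X_2 = 1)$ is the same in both arms; it evaluates the ratio at a small value of $b_1$.

    import math

    def expit(z):
        return 1.0 / (1.0 + math.exp(-z))

    def are_at_zero(a=-0.5, b2=1.5, c1=0.4, N1=1000, N0=1000, b1=1e-4):
        """Ratio of noncentrality parameters (adjusted to pooled) at a small b1,
        for a randomized design with pr(X2 = 1) = c1 in both arms."""
        c = {0: 1 - c1, 1: c1}
        p = {(i, j): expit(a + b1 * i + b2 * j) for i in (0, 1) for j in (0, 1)}
        p1 = sum(c[j] * p[1, j] for j in (0, 1))      # E(p_1j)
        p0 = sum(c[j] * p[0, j] for j in (0, 1))      # E(p_0j)
        b1_star = math.log(p1 * (1 - p0) / (p0 * (1 - p1)))
        var_pooled = 1.0 / (N1 * p1 * (1 - p1)) + 1.0 / (N0 * p0 * (1 - p0))
        V = {j: sum(1.0 / (n * c[j] * p[i, j]) + 1.0 / (n * c[j] * (1 - p[i, j]))
                    for i, n in ((1, N1), (0, N0)))
             for j in (0, 1)}
        var_adjusted = 1.0 / sum(1.0 / V[j] for j in (0, 1))
        return (b1 ** 2 / var_adjusted) / (b1_star ** 2 / var_pooled)

    print(are_at_zero())   # exceeds 1: adjustment does not lose efficiency for testing b1 = 0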
9 Discussion
For classic linear regression models, the precision of the estimator $\hat b_1$ depends upon the relative strengths of the $Y$–$X_2$ and $X_1$–$X_2$ associations. In particular, a strong $Y$–$X_2$ association has a beneficial effect upon precision, whereas a strong $X_1$–$X_2$ association has a detrimental effect. It has been, heretofore, conventional wisdom to assume that the above behavior of classic linear regression with respect to precision applies more generally to other types of regression models. In this paper, however, we have shown that the behavior of logistic regression with respect to precision is quite different from that of classic linear regression. In particular, while a strong $X_1$–$X_2$ association again has a detrimental effect upon precision for logistic regression, a strong $Y$–$X_2$ association also has a detrimental effect. Consequently, whereas in classic linear regression adjustment for
predictive covariates can result in either increased or decreased precision, adjustment for
predictive covariates will always result in a loss of precision for logistic regression.
However, we have seen that for logistic regression, as for classic regression, adjustment
for predictive covariates results in greater efficiency when testing for a treatment effect in
randomized studies.
In any particular investigation, one may be interested in estimating $b_1^*$, $b_1$, or both. Given the behavior of logistic regression with respect to precision, when the parameter of interest is $b_1$ it seems plausible that in some situations it might be preferable to use the biased but more precise $\hat b_1^*$ to estimate $b_1$, rather than the unbiased but less precise $\hat b_1$, as the estimator $\hat b_1^*$ may result in greater accuracy, as measured by mean square error.
Acknowledgements
We thank W.J. Redfearn for several illuminating discussions, and S. Selvin for showing us an early proof of a special case of the result of § 4, as well as numerous invaluable discussions. We are also indebted to P. Armitage for comments with regard to the material in § 8. Support for this research was provided in part by Grant AI29162 from the National Institute of Allergy and Infectious Diseases. The manuscript was completed while the second author visited the Department of Statistics, University of Oxford, with support from a travel grant from the Burroughs Wellcome Fund.
References
Bishop, Y.M.M., Fienberg, S.E. & Holland, P.W. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge, Mass.: MIT Press.
Breslow, N.E. & Day, N.E. (1980). Statistical Methods in Cancer Research, 1: The Analysis of Case-Control Studies. Lyon, France: IARC Scientific Publications.
Breslow, N.E. & Day, N.E. (1987). Statistical Methods in Cancer Research, 2: The Design and Analysis of Cohort Studies. Lyon, France: IARC Scientific Publications.
Breslow, N.E. & Powers, W. (1978). Are there two logistic regressions for retrospective studies? Biometrics 34, 100-105.
Cox, D.R. & Hinkley, D.V. (1974). Theoretical Statistics. London: Chapman and Hall.
Fisher, R.A. (1932). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd (13th ed., 1958).
Gail, M.H. (1986). Adjusting for covariates that have the same distribution in exposed and unexposed cohorts. In Modern Statistical Methods in Chronic Disease Epidemiology, Ed. S.H. Moolgavkar and R.L. Prentice, pp. 3-18. New York: Wiley.
Gail, M.H., Tan, W.Y. & Piantadosi, S. (1988). Tests for no treatment effects in randomized clinical trials. Biometrika 75, 57-64.
Gart, J.J. (1962). On the combination of relative risks. Biometrics 18, 601-610.
Hardy, G.H., Littlewood, J.E. & Pólya, G. (1952). Inequalities. London: Cambridge University Press.
Mantel, N. (1989). Confounding in epidemiologic studies. Biometrics 45, 1317-1318.
Mantel, N. & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. J. Nat. Cancer Inst. 22, 719-748.
McCullagh, P. & Nelder, J.A. (1983). Generalized Linear Models. London: Chapman and Hall.
Weinberg, C.R. (1985). On pooling across strata when frequency matching has been followed in a cohort study. Biometrics 41, 117-127.
Wickramaratne, P.J. & Holford, T.R. (1989). Confounding in epidemiologic studies. Response. Biometrics 45, 1319-1322.
Woolf, B. (1955). On estimating the relationship between blood group and disease. Ann. Human Genetics 19, 251-253.
Résumé

Results from classic linear regression concerning the effect of adjustment for covariates upon the precision of an estimator of exposure effect are often assumed to apply more generally to other types of regression models. In this paper we show that such an assumption is not justified in the case of logistic regression, where the effect of adjusting for covariates upon precision is quite different. For example, in classic linear regression, adjustment for a non-confounding predictive covariate results in improved precision, whereas the same adjustment in logistic regression results in a loss of precision. However, when a treatment effect is tested in a randomized study, it is always more efficient to adjust for predictive covariates when a logistic model is used, and thus in this respect the behavior of logistic regression is the same as that of classic linear regression.