Gologit 2 Part 1
Richard Williams
Department of Sociology
University of Notre Dame
Last updated March 27, 2019
https://www.nd.edu/~rwilliam/
Key features of gologit2
Backwards compatible with Vincent Fu’s original
gologit program – but offers many more features
Can estimate models that are less restrictive than
ologit (whose assumptions are often violated)
Can estimate models that are more parsimonious
than non-ordinal alternatives, such as mlogit
Specifically, gologit2 can estimate:
Proportional odds models (same as ologit –
all variables meet the proportional odds/
parallel lines assumption)
Generalized ordered logit models (same as the
original gologit – no variables need to meet
the parallel lines assumption)
Partial proportional odds models (some but
not all variables meet the parallel lines assumption)
Example: Proportional Odds Assumption Violated
(Adapted from Long & Freese, 2003 – Data from the
1977 & 1989 General Social Survey)
Respondents are asked to evaluate the following
statement: “A working mother can establish just as
warm and secure a relationship with her child as a
mother who does not work.”
1 = Strongly Disagree (SD)
2 = Disagree (D)
3 = Agree (A)
4 = Strongly Agree (SA).
Explanatory variables are
yr89 (survey year; 0 = 1977, 1 = 1989)
male (0 = female, 1 = male)
white (0 = nonwhite, 1 = white)
age (measured in years)
ed (years of education)
prst (occupational prestige scale).
Ologit results
. ologit warm yr89 male white age ed prst
The assumption can be examined with a series of binary logistic regressions. First it is 1 versus 2, 3, 4; then 1, 2
versus 3, 4; then 1, 2, 3 versus 4.
If the proportional odds/parallel lines assumption were not violated, all of the
coefficients (except the intercepts) would be the same across these regressions, apart from sampling
variability.
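The three splits described above can be sketched as follows. This is a minimal illustration, not part of the original slides: the helper name `cumulative_splits` and the six example responses are hypothetical.

```python
def cumulative_splits(y, categories):
    """Return one 0/1 indicator list per split: 1 if the response is above j, else 0."""
    cats = sorted(categories)
    # The highest category has nothing above it, so there are M - 1 splits.
    return {j: [1 if value > j else 0 for value in y] for j in cats[:-1]}

# Hypothetical responses on the 1-4 scale (1 = SD, 2 = D, 3 = A, 4 = SA).
warm = [1, 2, 3, 4, 2, 3]
splits = cumulative_splits(warm, [1, 2, 3, 4])

for j, indicator in splits.items():
    print(f"Y > {j}: {indicator}")
```

Each indicator list would serve as the dependent variable in one of the binary logistic regressions.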
Dealing with violations of assumptions
Just ignore it! (A fairly common practice)
Go with a non-ordinal alternative, such as
mlogit
Go with an ordinal alternative, such as the
original gologit & the default gologit2 (see
next slide)
Try an in-between approach: partial
proportional odds
. gologit warm yr89 male white age ed prst
Generalized Ordered Logit Estimates Number of obs = 2293
Model chi2(18) = 350.92
Prob > chi2 = 0.0000
Log Likelihood = -2820.3109918 Pseudo R2 = 0.0586
------------------------------------------------------------------------------
warm | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mleq1 |
yr89 | .95575 .1547185 6.18 0.000 .6525073 1.258993
male | -.3009775 .1287712 -2.34 0.019 -.5533645 -.0485906
white | -.5287267 .2278446 -2.32 0.020 -.975294 -.0821595
age | -.0163486 .0039508 -4.14 0.000 -.0240921 -.0086051
ed | .1032469 .0247377 4.17 0.000 .0547618 .151732
prst | -.0016912 .0055997 -0.30 0.763 -.0126665 .009284
_cons | 1.856951 .3872576 4.80 0.000 1.09794 2.615962
-------------+----------------------------------------------------------------
mleq2 |
yr89 | .5363707 .0919074 5.84 0.000 .3562355 .716506
male | -.7179949 .0894852 -8.02 0.000 -.8933827 -.5426072
white | -.3492339 .1391882 -2.51 0.012 -.6220378 -.07643
age | -.0249764 .0028053 -8.90 0.000 -.0304747 -.0194782
ed | .0558691 .0183654 3.04 0.002 .0198737 .0918646
prst | .0098476 .0038216 2.58 0.010 .0023575 .0173377
_cons | .7198119 .265235 2.71 0.007 .1999609 1.239663
-------------+----------------------------------------------------------------
mleq3 |
yr89 | .3312184 .1127882 2.94 0.003 .1101577 .5522792
male | -1.085618 .1217755 -8.91 0.000 -1.324294 -.8469423
white | -.3775375 .1568429 -2.41 0.016 -.684944 -.070131
age | -.0186902 .0037291 -5.01 0.000 -.025999 -.0113814
ed | .0566852 .0251836 2.25 0.024 .0073263 .1060441
prst | .0049225 .0048543 1.01 0.311 -.0045918 .0144368
_cons | -1.002225 .3446354 -2.91 0.004 -1.677698 -.3267524
------------------------------------------------------------------------------
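As a quick arithmetic check on the header of this output, the sketch below recovers the intercept-only log likelihood from the reported model chi-square and then recomputes the pseudo R2. It assumes the reported chi-square is a likelihood ratio statistic against the intercept-only model and that the pseudo R2 is McFadden's, 1 - LL(model)/LL(null). The variable names are illustrative.

```python
# Values copied from the gologit output header.
ll_model = -2820.3109918
model_chi2 = 350.92

# LR chi-square = 2 * (LL_model - LL_null), so LL_null = LL_model - chi2 / 2.
ll_null = ll_model - model_chi2 / 2

# McFadden's pseudo R2 (assumed to be the statistic reported).
pseudo_r2 = 1 - ll_model / ll_null

# The 18 model degrees of freedom are 6 variables times 3 equations.
model_df = 6 * 3

print(round(pseudo_r2, 4), model_df)
```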
The gologit model
Note that the gologit results are very similar
to what we got with the series of binary
logistic regressions and can be interpreted
the same way.
The gologit model can be written as

$$P(Y_i > j) = \frac{\exp(\alpha_j + X_i \beta_j)}{1 + \exp(\alpha_j + X_i \beta_j)}, \quad j = 1, 2, \ldots, M - 1$$
Note that the logit model is a special case of the gologit
model, where M = 2. When M > 2, you get a series of
binary logistic regressions, e.g. 1 versus 2, 3, 4, then 1, 2
versus 3, 4, then 1, 2, 3 versus 4.
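To make the formula concrete, the sketch below plugs the coefficients from the gologit output into it for a single respondent. The respondent's values (yr89 = 1, male = 0, white = 1, age = 40, ed = 12, prst = 40) are an assumption chosen purely for illustration, and the function name `prob_greater` is hypothetical.

```python
import math

# Coefficients copied from the gologit output (equations mleq1, mleq2, mleq3).
coefs = {
    1: {"yr89": 0.95575, "male": -0.3009775, "white": -0.5287267,
        "age": -0.0163486, "ed": 0.1032469, "prst": -0.0016912, "_cons": 1.856951},
    2: {"yr89": 0.5363707, "male": -0.7179949, "white": -0.3492339,
        "age": -0.0249764, "ed": 0.0558691, "prst": 0.0098476, "_cons": 0.7198119},
    3: {"yr89": 0.3312184, "male": -1.085618, "white": -0.3775375,
        "age": -0.0186902, "ed": 0.0566852, "prst": 0.0049225, "_cons": -1.002225},
}

def prob_greater(j, x):
    """P(Y > j) = exp(a_j + X b_j) / (1 + exp(a_j + X b_j))."""
    xb = coefs[j]["_cons"] + sum(coefs[j][name] * value for name, value in x.items())
    return math.exp(xb) / (1 + math.exp(xb))

# Hypothetical respondent (illustrative values, not from the source).
respondent = {"yr89": 1, "male": 0, "white": 1, "age": 40, "ed": 12, "prst": 40}

cum = [prob_greater(j, respondent) for j in (1, 2, 3)]

# Category probabilities follow by differencing the cumulative probabilities.
probs = [1 - cum[0], cum[0] - cum[1], cum[1] - cum[2], cum[2]]

for label, p in zip(["SD", "D", "A", "SA"], probs):
    print(f"P({label}) = {p:.4f}")
```

The four probabilities sum to 1 by construction, since each is a difference of adjacent cumulative probabilities.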
The ologit model is also a special case of the gologit model,
where the betas are the same for each j (NOTE: ologit
actually reports cut points, which equal the negatives of the
alphas used here)

$$P(Y_i > j) = \frac{\exp(\alpha_j + X_i \beta)}{1 + \exp(\alpha_j + X_i \beta)}, \quad j = 1, 2, \ldots, M - 1$$
A key enhancement of gologit2 is that it allows some of the
beta coefficients to be the same for all values of j, while
others can differ, i.e. it can estimate partial proportional
odds models. For example, in the following the betas for X1
and X2 are constrained but the betas for X3 are not.
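
$$P(Y_i > j) = \frac{\exp(\alpha_j + X_{1i} \beta_1 + X_{2i} \beta_2 + X_{3i} \beta_{3j})}{1 + \exp(\alpha_j + X_{1i} \beta_1 + X_{2i} \beta_2 + X_{3i} \beta_{3j})}, \quad j = 1, 2, \ldots, M - 1$$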
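A small sketch of what the constraint means in practice, using the coefficients from the gologit2 autofit output that appears later in this deck. Changing a constrained variable shifts the logit by the same amount in every equation, while changing an unconstrained variable shifts it by a different amount in each. The respondent values and the helper name `logits` are assumptions made for illustration.

```python
# Coefficients copied from the gologit2 autofit output (equations SD, D, A).
shared = {"white": -0.3832583, "age": -0.0216325, "ed": 0.0670703, "prst": 0.0059146}
free = {
    "yr89": [0.98368, 0.534369, 0.3258098],
    "male": [-0.3328209, -0.6932772, -1.097615],
}
alphas = [2.12173, 0.6021625, -1.048137]

def logits(x):
    """Return alpha_j + X*beta for each of the M - 1 equations."""
    common = sum(beta * x[name] for name, beta in shared.items())
    return [
        alphas[j] + common + sum(betas[j] * x[name] for name, betas in free.items())
        for j in range(len(alphas))
    ]

# Hypothetical respondents (illustrative values, not from the source).
base = {"yr89": 1, "male": 0, "white": 1, "age": 40, "ed": 12, "prst": 40}
older = dict(base, age=50)   # differs only on a constrained variable
man = dict(base, male=1)     # differs only on an unconstrained variable

age_shift = [b - a for a, b in zip(logits(base), logits(older))]
male_shift = [b - a for a, b in zip(logits(base), logits(man))]

print("age shift per equation: ", [round(s, 4) for s in age_shift])
print("male shift per equation:", [round(s, 4) for s in male_shift])
```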
gologit2/ partial proportional odds
Either mlogit or the original gologit can be
overkill – both generate many more
parameters than ologit does.
All variables are freed from the proportional odds
constraint, even though the assumption may only
be violated by one or a few of them
gologit2, with the autofit option, will only
relax the parallel lines constraint for those
variables where it is violated
gologit2 with autofit
. gologit2 warm yr89 male white age ed prst, auto lrforce
--------------------------------------------------------------------------
Testing parallel lines assumption using the .05 level of significance...
gologit2 is going through a stepwise process here. Initially no variables are constrained to
have proportional effects. Then Wald tests are done. Variables which pass the tests (i.e.
variables whose effects do not significantly differ across equations) have proportionality
constraints imposed.
------------------------------------------------------------------------------
( 1) [SD]white - [D]white = 0
( 2) [SD]ed - [D]ed = 0
( 3) [SD]prst - [D]prst = 0
( 4) [SD]age - [D]age = 0
( 5) [D]white - [A]white = 0
( 6) [D]ed - [A]ed = 0
( 7) [D]prst - [A]prst = 0
( 8) [D]age - [A]age = 0
• Note: with ologit, there were 6 degrees of freedom; with gologit &
mlogit there were 18; and with gologit2 using autofit there are 10.
The 8 d.f. difference is due to the 8 constraints above.
------------------------------------------------------------------------------
warm | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
SD |
yr89 | .98368 .1530091 6.43 0.000 .6837876 1.283572
male | -.3328209 .1275129 -2.61 0.009 -.5827417 -.0829002
white | -.3832583 .1184635 -3.24 0.001 -.6154424 -.1510742
age | -.0216325 .0024751 -8.74 0.000 -.0264835 -.0167814
ed | .0670703 .0161311 4.16 0.000 .0354539 .0986866
prst | .0059146 .0033158 1.78 0.074 -.0005843 .0124135
_cons | 2.12173 .2467146 8.60 0.000 1.638178 2.605282
-------------+----------------------------------------------------------------
D |
yr89 | .534369 .0913937 5.85 0.000 .3552406 .7134974
male | -.6932772 .0885898 -7.83 0.000 -.8669099 -.5196444
white | -.3832583 .1184635 -3.24 0.001 -.6154424 -.1510742
age | -.0216325 .0024751 -8.74 0.000 -.0264835 -.0167814
ed | .0670703 .0161311 4.16 0.000 .0354539 .0986866
prst | .0059146 .0033158 1.78 0.074 -.0005843 .0124135
_cons | .6021625 .2358361 2.55 0.011 .1399323 1.064393
-------------+----------------------------------------------------------------
A |
yr89 | .3258098 .1125481 2.89 0.004 .1052197 .5464
male | -1.097615 .1214597 -9.04 0.000 -1.335671 -.8595579
white | -.3832583 .1184635 -3.24 0.001 -.6154424 -.1510742
age | -.0216325 .0024751 -8.74 0.000 -.0264835 -.0167814
ed | .0670703 .0161311 4.16 0.000 .0354539 .0986866
prst | .0059146 .0033158 1.78 0.074 -.0005843 .0124135
_cons | -1.048137 .2393568 -4.38 0.000 -1.517268 -.5790061
------------------------------------------------------------------------------
• At first glance, it appears there are just as many parameters as before – but 8 of them are
duplicates because of the proportionality constraints that have been imposed.
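The degrees-of-freedom arithmetic in the notes above can be spelled out as a short sketch. The variable names are illustrative; the counts come from the slides.

```python
n_vars = 6          # yr89, male, white, age, ed, prst
n_categories = 4    # SD, D, A, SA
n_equations = n_categories - 1

# ologit: one coefficient per variable.
df_ologit = n_vars

# Original gologit (and mlogit): one coefficient per variable per equation.
df_gologit = n_vars * n_equations

# autofit constrained 4 variables (white, ed, prst, age); each constrained
# variable removes n_equations - 1 free coefficients.
n_constrained_vars = 4
n_constraints = n_constrained_vars * (n_equations - 1)

df_autofit = df_gologit - n_constraints

print(df_ologit, df_gologit, n_constraints, df_autofit)
```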
Interpretation of the gologit2 results
Effects of the constrained variables (white, age, ed,
prst) can be interpreted pretty much the same as they
were in the earlier ologit model.
For yr89 and male, the differences from before are
largely a matter of degree. People became more
supportive of working mothers across time, but the
greatest effect of time was to push people away from
the most extremely negative attitudes. For gender,
men were less supportive of working mothers than
were women, but they were especially unlikely to
have strongly favorable attitudes.
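One way to see the pattern described above is to exponentiate the yr89 and male coefficients from the gologit2 autofit output, which gives an odds ratio for each equation. This is a minimal sketch; the dictionary layout is an illustrative choice.

```python
import math

# Coefficients copied from the gologit2 autofit output, in equation order SD, D, A.
coefs = {
    "yr89": [0.98368, 0.534369, 0.3258098],
    "male": [-0.3328209, -0.6932772, -1.097615],
}

# exp(beta_j) is the odds ratio for being above category j versus at or below it.
odds_ratios = {name: [math.exp(b) for b in betas] for name, betas in coefs.items()}

for name, ratios in odds_ratios.items():
    print(name, [round(r, 3) for r in ratios])
```

The yr89 odds ratios are all above 1 and largest in the first equation (SD versus D, A, SA). The male odds ratios are all below 1 and smallest in the last equation (SD, D, A versus SA).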
Example: Imposing and testing constraints
Rather than use autofit, you can use the pl and npl
options to specify which variables are or are not
constrained to meet the proportional odds/ parallel
lines assumption
Gives you more control over model specification &
testing
Lets you use LR chi-square tests rather than Wald tests
Could use BIC or AIC comparisons rather than chi-square tests
when deciding on constraints, if you prefer
pl with no variable list will produce the same results as ologit
Other types of linear constraints can also be
specified, e.g. you can constrain two variables to
have equal effects
The store option will cause the command estimates
store to be run at the end of the job, making it
slightly easier to do LR chi-square contrasts
Here is how we could do tests to see if we agree with
the model produced by autofit:
LR chi-square contrasts using gologit2
. * Least constrained model - same as the original gologit
. quietly gologit2 warm yr89 male white age ed prst, store(gologit)
. * Ologit clone
. quietly gologit2 warm yr89 male white age ed prst, store(ologit) pl
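The arithmetic behind an LR chi-square contrast between two stored models can be sketched as below. This illustrates the calculation only and is not how Stata implements it. The unconstrained log likelihood is taken from the gologit output earlier in the deck, but the constrained log likelihood is a HYPOTHETICAL value, since the slides reproduced here do not report one. The chi-square tail function uses a closed form that is valid only for even degrees of freedom, which covers the 8 d.f. contrast used here.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def lr_test(ll_restricted, ll_full, df):
    """LR statistic 2 * (LL_full - LL_restricted) and its p-value."""
    stat = 2.0 * (ll_full - ll_restricted)
    return stat, chi2_sf_even_df(stat, df)

ll_gologit = -2820.3109918   # from the gologit output in this deck
ll_constrained = -2826.0     # HYPOTHETICAL value, for illustration only

# 8 d.f. = 18 (unconstrained) - 10 (autofit model), as noted in the slides.
stat, p_value = lr_test(ll_constrained, ll_gologit, df=8)
print(f"LR chi2(8) = {stat:.2f}, p = {p_value:.4f}")
```

A large p-value would suggest the constraints are not rejected, while a small one would suggest the constrained model is too restrictive.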
• Females are less likely to report poor health than are males (see the
positive female coefficient in the poor panel), but they are also less
likely to report higher levels of health (see the negative female
coefficients in the other panels), i.e. women tend to be less at the
extremes of health than men are. Such a pattern would be
obscured in a straight proportional odds (ologit) model.
Other gologit2 features of interest
The predict command can easily compute predicted
probabilities
Despite its name, gologit2 also supports the logit,
probit, cloglog, loglog, and cauchit links.
As of October 2014, gologit2 supports factor
variables, the margins command, and the svy: prefix.
(NOTE: Long and Freese 2014 came out before this
was done. The example they give on pp. 371-377
can now be done much more easily.)
The lrforce option (now the default) causes Stata to
report a Likelihood Ratio Statistic under certain
conditions when it ordinarily would report a Wald
statistic. Stata is being cautious but LR statistics are
appropriate for most common gologit2 models
gologit2 uses an unconventional but seemingly
effective way to label the model equations. If
problems occur, the nolabel option can be used.
Most other standard options (e.g. robust, cluster,
level) are supported.
For more information, see:
http://www.stata-journal.com/article.html?article=st0097
https://www.tandfonline.com/doi/full/10.1080/0022250X.2015.1112384
http://www.statalist.org/forums/forum/general-stata-discussion/general/296459-major-update-to-gologit2-now-available
https://www.nd.edu/~rwilliam/gologit2
https://www3.nd.edu/~rwilliam/gologit2/tsfaq.html