Bgpev2 LDV
© A. Colin Cameron
Univ. of Calif. - Davis
Advanced Econometrics
Bavarian Graduate Program in Economics
Based on A. Colin Cameron and Pravin K. Trivedi (2005),
Microeconometrics: Methods and Applications (MMA), C.U.P.
A. Colin Cameron and Pravin K. Trivedi (2009, 2010),
Microeconometrics using Stata (MUS), Stata Press.
1. Introduction
Abbreviated handout: assumes previous exposure to nonlinear models.
Binary outcomes
- y takes only one of two values, say 0 or 1.
- model Pr[y = 1|x]
- logit and probit are standard
Multinomial outcomes
- y takes one of m possible outcomes.
- model Pr[y = j|x] for j = 1, ..., m
- many models including multinomial logit.
Censored and truncated models (e.g. Tobit) and selection models
- Considerably more difficult conceptually.
- Sample is not reflective of the population (selection on y)
- Standard methods rely on strong distributional assumptions.
Treatment evaluation
© A. Colin Cameron, Univ. of Calif. - Davis. Lectures in Microeconometrics: Brief LDV (Advanced Econometrics, Bavarian Graduate Program in Economics), July 22-26, 2013.
1. Introduction
Outline
1 Introduction
2 Logit and Probit Models
3 Multinomial Models
4 Censored and truncated data (Tobit)
5 Sample selection models
6 Treatment Evaluation
Pr[y = 1] = p
Pr[y = 0] = 1 - p.
Example
A single regressor example allows a nice plot.
Compare predictions of Pr[y = 1|x] from logit, probit and OLS.
- Scatterplot of y = 0 or 1 (jittered) on scalar x (data are generated).
[Figure: predicted Pr[y=1|x] from logit, probit and OLS, with the jittered data, plotted against regressor x.]
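Log-likelihood function:
\[
\begin{aligned}
\ln L(\beta) &= \ln \prod_{i=1}^{N} f(y_i \mid x_i) \\
&= \sum_{i=1}^{N} \ln f(y_i \mid x_i) \\
&= \sum_{i=1}^{N} \ln \left\{ p_i^{y_i} (1 - p_i)^{1 - y_i} \right\} \\
&= \sum_{i=1}^{N} \left\{ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right\}
\end{aligned}
\]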
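MLE solves ∂ ln L(β)/∂β = 0. After considerable algebra
\[
\begin{aligned}
\text{Logit: } p_i = \Lambda(x_i'\beta): &\quad \sum_{i=1}^{N} \left( y_i - \Lambda(x_i'\beta) \right) x_i = 0 \\
\text{Probit: } p_i = \Phi(x_i'\beta): &\quad \sum_{i=1}^{N} \left( y_i - \Phi(x_i'\beta) \right) \frac{\phi(x_i'\beta)}{\Phi(x_i'\beta)\left( 1 - \Phi(x_i'\beta) \right)} x_i = 0,
\end{aligned}
\]
where φ = Φ' is the standard normal density.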
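The logit first-order condition can be checked numerically. The following is a minimal sketch, not the handout's code: it assumes a generated sample with an intercept and one regressor (true intercept 0 and slope 1 are illustrative choices), solves the condition by Newton-Raphson, and confirms that the score is zero at the estimate. The function names are hypothetical.

```python
import math
import random

def logistic(z):
    """Logistic cdf Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def score_and_information(b0, b1, x, y):
    """Logit score and information matrix for an intercept and one regressor."""
    g0 = g1 = 0.0            # score: sum_i (y_i - p_i) * (1, x_i)'
    h00 = h01 = h11 = 0.0    # information: sum_i p_i (1 - p_i) * (1, x_i)'(1, x_i)
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)

def logit_mle(x, y, iterations=25):
    """Newton-Raphson: beta <- beta + information^{-1} * score."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), (h00, h01, h11) = score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det   # first row of the 2x2 inverse
        step1 = (h00 * g1 - h01 * g0) / det   # second row of the 2x2 inverse
        b0 += step0
        b1 += step1
    return b0, b1

# Generated data (assumed d.g.p.: intercept 0, slope 1, x standard normal).
random.seed(12345)
n = 2000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 if random.random() < logistic(0.0 + 1.0 * xi) else 0.0 for xi in x]

b0_hat, b1_hat = logit_mle(x, y)
(score0, score1), (h00, h01, h11) = score_and_information(b0_hat, b1_hat, x, y)
det = h00 * h11 - h01 * h01
se_b1 = math.sqrt(h00 / det)   # slope standard error from the inverse information
print(b0_hat, b1_hat, se_b1, score0, score1)
```

The weights p_i(1 - p_i) that drive the Newton step are the same ones that appear in the logit information matrix, so the standard error comes at no extra cost.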
Properties of MLE
The distribution is necessarily Bernoulli
- If Pr[y_i = 1|x_i] = p_i then necessarily Pr[y_i = 0|x_i] = 1 - p_i since the two probabilities must sum to one.
- Only possible error is in p_i.
So the MLE is consistent if p_i is correctly specified
- p_i = Λ(x_i'β) for logit and p_i = Φ(x_i'β) for probit.
The information matrix equality necessarily holds if data are independent over i and p_i is correctly specified. For logit
\[
\hat{\beta}_{ML} \overset{a}{\sim} N\left[ \beta, \left( \sum_{i=1}^{N} \Lambda(x_i'\beta)\left( 1 - \Lambda(x_i'\beta) \right) x_i x_i' \right)^{-1} \right]
\]
Default ML standard errors implement this by using β̂ in place of β.
- For independent data there is no need for robust se's in this case.
2. Logit and Probit models Data example: Private health insurance
. bysort ins: summarize retire age hstatusg hhincome educyear married hisp, sep(0)
-> ins = 0
-> ins = 1
ins=1 more likely if retired, older, good health status, richer, more
educated, married and nonhispanic.
2. Logit and Probit models Logit data example
. * Logit regression
. logit ins retire age hstatusg hhincome educyear married hisp
. margins, dydx(*)
Warning: cannot perform check for estimable functions.
Delta-method
dy/dx Std. Err. z P>|z| [95% Conf. Interval]
. regress ins retire age hstatusg hhincome educyear married hisp, vce(robust)
Robust
ins Coef. Std. Err. t P>|t| [95% Conf. Interval]
legend: b/t
. predict plogit, p
. quietly probit ins retire age hstatusg hhincome educyear married hisp
. predict pprobit, p
. quietly regress ins retire age hstatusg hhincome educyear married hisp
Average probabilities are very close (and for logit and OLS = ȳ).
Range similar for logit and probit but OLS gives p̂_i < 0 and p̂_i > 1.
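Why the logit and OLS averages equal ȳ exactly can be seen from the first-order conditions, assuming (as in the regressions here) that the model includes an intercept. The following short derivation is a sketch added for clarity rather than part of the original slides.

```latex
% Logit: the element of the first-order condition for the intercept (x_i = 1) is
\sum_{i=1}^{N} \left( y_i - \Lambda(x_i'\hat{\beta}) \right) = 0
\quad \Longrightarrow \quad
\frac{1}{N} \sum_{i=1}^{N} \hat{p}_i = \bar{y}.

% OLS: residuals sum to zero when an intercept is included, so
\sum_{i=1}^{N} \left( y_i - x_i'\hat{\beta}_{OLS} \right) = 0
\quad \Longrightarrow \quad
\frac{1}{N} \sum_{i=1}^{N} \hat{p}_i = \bar{y}.

% Probit: the first-order condition weights each residual by
% \phi(x_i'\hat\beta) / [\Phi(x_i'\hat\beta)(1 - \Phi(x_i'\hat\beta))],
% so the average fitted probability is only approximately \bar{y}.
```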
2. Logit and Probit models Marginal effects: Approximations
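\[
\begin{aligned}
\hat{\beta}_{Logit} &\simeq 4 \hat{\beta}_{OLS} \\
\hat{\beta}_{Probit} &\simeq 2.5 \hat{\beta}_{OLS} \\
\hat{\beta}_{Logit} &\simeq 1.6 \hat{\beta}_{Probit}.
\end{aligned}
\]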
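One common way to rationalize these rules of thumb, offered here as an assumption since the handout does not give the derivation, is to equate marginal effects at x'β = 0. There the logit marginal effect is Λ'(0)β = 0.25β, the probit marginal effect is φ(0)β ≈ 0.4β, and the OLS marginal effect is β itself. A minimal sketch of the arithmetic:

```python
import math

def logistic_density(z):
    """Derivative of the logistic cdf: Lambda(z) * (1 - Lambda(z))."""
    lam = 1.0 / (1.0 + math.exp(-z))
    return lam * (1.0 - lam)

def normal_density(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

logit_slope = logistic_density(0.0)    # marginal effect per unit of beta at index 0
probit_slope = normal_density(0.0)

logit_over_ols = 1.0 / logit_slope             # scale of beta_logit relative to beta_OLS
probit_over_ols = 1.0 / probit_slope           # scale of beta_probit relative to beta_OLS
logit_over_probit = probit_slope / logit_slope

print(round(logit_over_ols, 2), round(probit_over_ols, 2), round(logit_over_probit, 2))
```

The factors are only approximations because marginal effects are evaluated at a single point; away from x'β = 0 the ratios differ.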
Which model?
Logit: binary model most often used by statisticians.
- generalizes simply to multinomial data (> two outcomes)
- β̂_j measures the change in the log-odds ln[p/(1 - p)] due to a change in x_j.
Probit: binary model most often used by economists.
- motivated by a latent normal random variable.
- generalizes to Tobit models and multinomial probit.
Empirically: either logit or probit can be used
- give similar predictions and marginal effects
- greatest difference is in prediction of probabilities close to 0 or 1.
Complementary log-log model
- sometimes used when outcomes are mostly 0 or mostly 1.
OLS: can be useful for preliminary data analysis
- but final results should use probit or logit.
p_j = Pr[y = j], j = 1, ..., m.
Regression Model
Introduce individual characteristics
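- parameterize p_ij in terms of observed data x_i and parameters β:
\[ p_{ij} = \Pr[y_i = j] = F_j(x_i, \beta), \quad j = 1, ..., m. \]
- these probabilities should lie between 0 and 1 and sum over j to one.
MLE maximizes the log-likelihood function
\[
\begin{aligned}
\ln L(\beta) &= \ln \prod_{i=1}^{N} f(y_i) = \ln \prod_{i=1}^{N} \prod_{j=1}^{m} p_{ij}^{y_{ij}} \\
&= \sum_{i=1}^{N} \sum_{j=1}^{m} y_{ij} \ln p_{ij},
\end{aligned}
\]
where y_ij = 1 if y_i = j and y_ij = 0 otherwise.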
Variable definitions
. describe
Data organization
- here wide form with one observation per individual
- each observation has data for all the possible alternatives.
Summary statistics
- Columns y = 1, ..., 4 give sample means for those with y = 1, ..., 4.
Sub-sample averages
Explanatory variable            y=1      y=2      y=3      y=4      All y
                                Beach    Pier     Private  Charter  Overall
Income ($1,000's per month)     4.052    3.387    4.654    3.881    4.099
Price beach ($)                 36       31       138      121      103
Price pier ($)                  36       31       138      121      103
Price private ($)               98       82       42       45       55
Price charter ($)               125      110      71       75       84
Catch rate beach                0.28     0.26     0.21     0.25     0.24
Catch rate pier                 0.22     0.20     0.13     0.16     0.16
Catch rate private              0.16     0.15     0.18     0.18     0.17
Catch rate charter              0.52     0.50     0.65     0.69     0.63
Sample probability              0.113    0.151    0.354    0.382    1.000
Observations                    134      178      418      452      1182
                 Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
pier
  income     -.1434029   .0532882    -2.69   0.007    -.2478459    -.03896
  _cons       .8141503   .2286316     3.56   0.000     .3660405    1.26226
private
  income      .0919064   .0406638     2.26   0.024     .0122069   .1716059
  _cons       .7389208   .1967309     3.76   0.000     .3533352   1.124506
charter
  income     -.0316399   .0418463    -0.76   0.450    -.1136571   .0503774
  _cons       1.341291   .1945167     6.90   0.000     .9600457   1.722537
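To see what these coefficients imply, the sketch below converts them into predicted probabilities and income marginal effects at the overall mean income of 4.099. It assumes that beach is the omitted base category (the output lists only pier, private and charter) and that the model has the multinomial logit form p_j = exp(a_j + b_j income) / Σ_k exp(a_k + b_k income). The function names are hypothetical.

```python
import math

# (intercept, income coefficient) copied from the output fragment; the base
# category (assumed to be beach) is normalized to zero.
coefficients = {
    "beach": (0.0, 0.0),
    "pier": (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}

def mnl_probabilities(income):
    """Multinomial logit probabilities for each fishing mode."""
    expo = {alt: math.exp(a + b * income) for alt, (a, b) in coefficients.items()}
    total = sum(expo.values())
    return {alt: value / total for alt, value in expo.items()}

def income_marginal_effects(income):
    """dp_j/d(income) = p_j * (b_j - sum_k p_k b_k)."""
    p = mnl_probabilities(income)
    weighted_b = sum(p[alt] * coefficients[alt][1] for alt in p)
    return {alt: p[alt] * (coefficients[alt][1] - weighted_b) for alt in p}

probs = mnl_probabilities(4.099)
effects = income_marginal_effects(4.099)
for alt in coefficients:
    print(alt, round(probs[alt], 3), round(effects[alt], 4))
```

The predicted probabilities at mean income are close to, but not identical to, the sample probabilities in the summary table, and the marginal effects sum to zero across alternatives because the probabilities always sum to one.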
Delta-method
dy/dx Std. Err. z P>|z| [95% Conf. Interval]
Further details
β̂ is consistent and asymptotically normal by the usual asymptotic theory if the d.g.p. is correctly specified.
- The distribution is necessarily multinomial.
- So key is correct specification of p_ij = F_j(x_i, β).
- And no need to use vce(robust) option if independent data.
Distinguish between two different types of regressors.
- Individual-specific or case-specific or alternative-invariant regressors do not vary across alternatives.
  - e.g. income (in our example), gender.
- Alternative-varying regressors may vary across alternatives.
  - e.g. price.
- Multinomial logit: all regressors are individual-specific.
- Conditional logit: same as multinomial logit except that regressors are alternative-varying.
Unordered models
Unordered model: no obvious ordering of alternatives.
Additive random utility model (ARUM) specifies utility of each alternative (of m) as
\[
\begin{aligned}
U_1 &= V_1 + \varepsilon_1 \\
U_2 &= V_2 + \varepsilon_2 \\
&\;\;\vdots \\
U_m &= V_m + \varepsilon_m
\end{aligned}
\]
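The link from utilities to choice probabilities is not shown here. The standard next step, stated as an assumption about where the argument goes, is that the individual chooses the alternative with the highest utility:

```latex
\Pr[y = j] = \Pr[U_j \ge U_k \text{ for all } k \ne j]
           = \Pr[\varepsilon_k - \varepsilon_j \le V_j - V_k \text{ for all } k \ne j].
```

Different distributional assumptions on the errors ε_1, ..., ε_m then give different multinomial models.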
Stata commands
Command      Model
mlogit       multinomial logit
asclogit     conditional logit
clogit       older command for conditional logit
nlogit       nested logit (ARUM version)
mprobit      multinomial probit
asmprobit    multinomial probit
mixlogit     random parameters logit (Stata add-on)
\[
y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0. \end{cases}
\]
Scatterplot & true regression curves (derived later) for three samples:
- truncated (top), censored (middle) and completely observed (bottom).
[Figure: "Different Conditional Means". Censored mean and uncensored mean plotted against x (natural logarithm of wage).]
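\[
\begin{aligned}
E[y \mid y > 0] &= E[x'\beta + \varepsilon \mid x'\beta + \varepsilon > 0] && \text{as } y^* = x'\beta + \varepsilon \\
&= x'\beta + E[\varepsilon \mid \varepsilon > -x'\beta] && \text{as } x \text{ and } \varepsilon \text{ independent} \\
&= x'\beta + \sigma E\left[ \frac{\varepsilon}{\sigma} \,\Big|\, \frac{\varepsilon}{\sigma} > \frac{-x'\beta}{\sigma} \right] && \text{transform to } \varepsilon/\sigma \sim N[0, 1] \\
&= x'\beta + \sigma \lambda\left( \frac{x'\beta}{\sigma} \right) && \text{using key result for } N[0, 1],
\end{aligned}
\]
where the key result is E[z | z > c] = φ(c)/[1 - Φ(c)] for z ~ N[0, 1], so that λ(z) = φ(z)/Φ(z) is the inverse Mills ratio.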
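The truncated-mean formula can be checked by simulation. The sketch below assumes illustrative values x'β = 1 and σ = 2 (not taken from the handout), draws y* = x'β + ε with normal ε, keeps the positive draws, and compares their average with x'β + σλ(x'β/σ).

```python
import math
import random

def normal_density(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cdf Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z)."""
    return normal_density(z) / normal_cdf(z)

def truncated_mean(xb, sigma):
    """E[y | y > 0] = x'b + sigma * lambda(x'b / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)

random.seed(2013)
xb, sigma = 1.0, 2.0                          # assumed illustrative values
latent = [xb + random.gauss(0.0, sigma) for _ in range(200000)]
positives = [value for value in latent if value > 0]

simulated_mean = sum(positives) / len(positives)
formula_mean = truncated_mean(xb, sigma)
print(round(simulated_mean, 3), round(formula_mean, 3))
```

Both numbers exceed x'β = 1 because truncation removes the low draws, which is why a linear regression on the truncated sample does not recover β.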
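\[
\begin{aligned}
E[y] &= E_{y^*}\left[ E[y \mid y^*] \right] \\
&= \Pr[y^* \le 0] \times 0 + \Pr[y^* > 0] \times E[y \mid y > 0] \\
&= \Phi(x'\beta/\sigma) \left\{ x'\beta + \sigma \frac{\phi(x'\beta/\sigma)}{\Phi(x'\beta/\sigma)} \right\}
\end{aligned}
\]
so
\[ E[y \mid x] = \Phi(x'\beta/\sigma) x'\beta + \sigma \phi(x'\beta/\sigma), \]
using earlier result for the truncated mean E[y | y > 0].
This conditional mean is again nonlinear.
- OLS of y on x is inconsistent for β
- Need NLS or MLE for consistent estimates.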
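The censored mean can be verified in the same way. This sketch assumes σ = 2 and a few illustrative values of x'β (not from the handout), censors simulated latent outcomes at zero, and compares the sample average with Φ(x'β/σ)x'β + σφ(x'β/σ). It also shows the nonlinearity in x'β.

```python
import math
import random

def normal_density(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cdf Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_mean(xb, sigma):
    """E[y | x] = Phi(x'b/sigma) * x'b + sigma * phi(x'b/sigma)."""
    z = xb / sigma
    return normal_cdf(z) * xb + sigma * normal_density(z)

random.seed(53)
sigma = 2.0                                   # assumed illustrative value
xb = 1.0
censored = [max(0.0, xb + random.gauss(0.0, sigma)) for _ in range(200000)]
simulated_mean = sum(censored) / len(censored)
formula_mean = censored_mean(xb, sigma)
print(round(simulated_mean, 3), round(formula_mean, 3))

# Equal steps in x'b give unequal steps in the censored mean (nonlinearity).
low, mid, high = (censored_mean(v, sigma) for v in (-2.0, 0.0, 2.0))
print(round(mid - low, 3), round(high - mid, 3))
```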
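\[
y_1 = \begin{cases} 1 & \text{if } y_1^* > 0 \\ 0 & \text{if } y_1^* \le 0. \end{cases}
\]
- Outcome: Only positive values of y2 are observed
\[
y_2 = \begin{cases} y_2^* & \text{if } y_1^* > 0 \\ 0 & \text{if } y_1^* \le 0. \end{cases}
\]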
ε2 = δ ε1 + v ,
- Need standard errors that correct for w_i heteroskedastic and λ̂_i estimated. Stata command heckman does this.
Exclusion restriction:
- desirable to include some regressors in participation equation (x1) that can be excluded from the outcome equation (x2)
- otherwise identification solely from nonlinearity.
Selection on observables only
- If Cov[ε1, ε2] = 0 then there is no longer selection on unobservables
- Model reduces to a two-part model
  - Probit for whether y > 0
  - Regular OLS for the positives.
  - Can be reasonable for individual's hospital expenditure data.
                 Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
lny
  age          .202124   .0242974     8.32   0.000     .1545019   .2497462
  female      .2891575    .073694     3.92   0.000     .1447199   .4335951
  educ        .0119928   .0116839     1.03   0.305    -.0109072   .0348928
  blhisp     -.1810582   .0658522    -2.75   0.006    -.3101261  -.0519904
  totchr      .4983315   .0494699    10.07   0.000     .4013724   .5952907
  ins        -.0474019   .0531541    -0.89   0.373     -.151582   .0567782
  _cons       5.302572   .2941363    18.03   0.000     4.726076   5.879069
dy
  age          .097315   .0270155     3.60   0.000     .0443656   .1502645
  female      .6442089   .0601499    10.71   0.000     .5263172   .7621006
  educ        .0701674   .0113435     6.19   0.000     .0479345   .0924003
  blhisp     -.3744867   .0617541    -6.06   0.000    -.4955224  -.2534509
  totchr      .7935208   .0711156    11.16   0.000     .6541367   .9329048
  ins         .1812415   .0625916     2.90   0.004     .0585642   .3039187
  _cons      -.7177087   .1924667    -3.73   0.000    -1.094937  -.3404809
mills
  lambda     -.4801696   .2906565    -1.65   0.099    -1.049846   .0895067
rho           -0.37130
sigma        1.2932083
lambda      -.4801696   .2906565
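The three summary quantities in this output are linked: in the two-step estimator the coefficient on the inverse Mills ratio estimates the covariance between the two errors, which equals ρσ when the participation-equation error has unit variance. A small arithmetic check using the reported numbers (variable names are illustrative):

```python
# Values copied from the two-step output.
rho = -0.37130
sigma = 1.2932083
lambda_reported = -0.4801696

# Coefficient on the inverse Mills ratio implied by rho and sigma.
lambda_implied = rho * sigma
print(round(lambda_implied, 5), lambda_reported)
```

The z statistic of -1.65 on lambda is then also a test of ρ = 0, that is, of no selection on unobservables.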
Stata commands
Command           Model
tobit             Tobit MLE (censored)
truncreg          Tobit MLE (truncated)
cnreg             Tobit (varying known threshold)
intreg            Interval normal data (e.g. $1-$100, $101-$200, ...)
heckman, mle      Sample selection MLE
heckman, 2step    Sample selection two step
B. Difference in differences
- variation of preceding that does not require panel data.
- suppose treatment occurs only in second time period (not in first)
  - use α̂ = Δȳ_treated - Δȳ_untreated = (ȳ_{1,tr} - ȳ_{0,tr}) - (ȳ_{1,untr} - ȳ_{0,untr}).
  - more generally OLS on Δy_i = α d_i + Δx_i'β + u_i
  - requires common time trend for treated and untreated groups
- Extends to more time periods (model in levels with d_it)
- Extends to contrasts other than in time e.g. male/female
- Extension is event history analysis.
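The basic difference-in-differences estimate is just a contrast of four group means. The sketch below uses made-up outcome values (hypothetical, not data from the handout) for treated and untreated groups in periods 0 and 1.

```python
def mean(values):
    """Sample average."""
    return sum(values) / len(values)

# Hypothetical outcomes keyed by (group, period).
outcomes = {
    ("treated", 0): [10.0, 12.0],
    ("treated", 1): [15.0, 17.0],
    ("untreated", 0): [8.0, 10.0],
    ("untreated", 1): [9.0, 11.0],
}

def diff_in_diff(data):
    """(ybar_1,tr - ybar_0,tr) - (ybar_1,untr - ybar_0,untr)."""
    change_treated = mean(data[("treated", 1)]) - mean(data[("treated", 0)])
    change_untreated = mean(data[("untreated", 1)]) - mean(data[("untreated", 0)])
    return change_treated - change_untreated

alpha_hat = diff_in_diff(outcomes)
print(alpha_hat)  # 4.0
```

Here the treated group's mean rises by 5 and the untreated group's by 1. Under the common-trend assumption the untreated change of 1 is what the treated group would have experienced without treatment, leaving an estimated effect of 4.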
6. Treatment effects models Selection on Observables and Unobservables
C. Instrumental variables
- IV estimation with instrument z_i in y_i = α d_i + x_i'β + u_i
- consistent if Cov(u_i, z_i|x_i) = 0
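A small simulation illustrates why IV helps when treatment is selected on unobservables. This is a sketch under assumed values: there are no controls x, the true effect is α = 2, and the treatment dummy depends on an instrument z and on an unobservable v that also enters the outcome error. With one instrument and an intercept, the IV estimate is Cov(z, y)/Cov(z, d).

```python
import random

def mean(values):
    """Sample average."""
    return sum(values) / len(values)

def cov(a, b):
    """Sample covariance (divisor n)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

random.seed(53)
n = 20000
alpha = 2.0                      # assumed true treatment effect
z, d, y = [], [], []
for _ in range(n):
    zi = random.gauss(0.0, 1.0)                 # instrument, independent of u
    vi = random.gauss(0.0, 1.0)                 # unobservable driving treatment
    ui = 0.8 * vi + 0.6 * random.gauss(0.0, 1.0)  # outcome error, correlated with v
    di = 1.0 if zi + vi > 0 else 0.0            # treatment dummy
    z.append(zi)
    d.append(di)
    y.append(alpha * di + ui)

alpha_ols = cov(d, y) / cov(d, d)   # inconsistent: d is correlated with u
alpha_iv = cov(z, y) / cov(z, d)    # consistent: z is uncorrelated with u
print(round(alpha_ols, 2), round(alpha_iv, 2))
```

In this design OLS overstates the effect because people with high v are both more likely to be treated and have higher outcomes, while the IV estimate should be close to 2.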
D. Fuzzy regression discontinuity design
- in fuzzy design not everyone with s_i above the threshold gets the treatment.
- this introduces a role for unobservables.
E. Parametric model e.g. Roy model:
- introduce latent variables d_i*, y_1i*, y_0i* for d_i, y_1i, y_0i.
- then
\[
\begin{aligned}
E[y_{1i}] = E[y_{1i}^* \mid d_i = 1] &= E[y_{1i}^* \mid d_i^* > 0] \\
&= E[x_{1i}'\beta + \varepsilon_{1i} \mid z_i'\gamma + v_i > 0] \\
&= x_{1i}'\beta + E[\varepsilon_{1i} \mid v_i > -z_i'\gamma]
\end{aligned}
\]
- so E[y_1i] = x_1i'β + δ_1 λ(z_i'γ), where λ(·) is the inverse Mills ratio, if ε_1i = δ_1 v_i + ξ_i, v_i ~ N[0, 1], ξ_i independent.
F. LATE (local average treatment effects)
- allows α to vary with i and applies to many estimators.
- for example consider IV interpreted as local effect
  - e.g. in earnings-education regression with instrument law change that increased school leaving age, the earnings effect is for those with low levels of education.
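For a binary instrument the local interpretation can be written down explicitly. The expression below is the standard Wald form of the IV estimand, given as a sketch under the usual LATE assumptions (instrument independence and monotonicity), which this handout does not spell out.

```latex
\alpha_{IV} = \frac{E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0]}
                   {E[d_i \mid z_i = 1] - E[d_i \mid z_i = 0]}
```

The denominator is the share of individuals whose treatment status is changed by the instrument, so α_IV is the average effect for that group only. In the schooling example these are the people who stayed in school longer because of the law change.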