Day 5

Limited Dependent Variable Models (Brief)


Binary, multinomial, censored, treatment effects

© A. Colin Cameron
Univ. of Calif. - Davis

Advanced Econometrics
Bavarian Graduate Program in Economics

Based on A. Colin Cameron and Pravin K. Trivedi (2005),
Microeconometrics: Methods and Applications (MMA), C.U.P.
A. Colin Cameron and Pravin K. Trivedi (2009, 2010),
Microeconometrics using Stata (MUS), Stata Press.

July 22-26, 2013

© A. Colin Cameron, Univ. of Calif. - Davis (Advanced Econometrics, Bavarian Graduate Program in Economics). Lectures in Microeconometrics: Brief LDV. July 22-26, 2013.

1. Introduction
Abbreviated handout: assumes previous exposure to nonlinear models.
Binary outcomes
  - y takes only one of two values, say 0 or 1.
  - model $\Pr[y = 1 \mid x]$
  - logit and probit are standard
Multinomial outcomes
  - y takes only one of m possible outcomes.
  - model $\Pr[y = j \mid x]$ for $j = 1, \dots, m$
  - many models, including multinomial logit.
Censored and truncated models (e.g. Tobit) and selection models
  - Considerably more difficult conceptually.
  - Sample is not reflective of the population (selection on y).
  - Standard methods rely on strong distributional assumptions.
Treatment evaluation

Outline

1 Introduction
2 Logit and Probit Models
3 Multinomial Models
4 Censored and truncated data (Tobit)
5 Sample selection models
6 Treatment Evaluation


2. Logit model: Definition

Data: y takes only one of two values, say 0 or 1.
  - OLS has the problem that $E[y_i \mid x_i] = x_i'\beta > 1$ or $< 0$ is possible.
  - And OLS is inefficient (its efficiency rests on homoskedasticity and normality, neither of which holds for binary y).
  - So what do we do?
Starting point from statistics is Bernoulli (binomial with 1 trial):

$$\Pr[y = 1] = p, \qquad \Pr[y = 0] = 1 - p,$$

  - with $E[y] = p$ and $V[y] = p(1 - p)$.

For regression the probability $0 < p_i < 1$ varies with regressors $x_i$:
Logit:  $p_i = \Lambda(x_i'\beta) = \dfrac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$, where $\Lambda(\cdot)$ is the logistic c.d.f.
Probit: $p_i = \Phi(x_i'\beta)$, where $\Phi(\cdot)$ is the standard normal c.d.f.
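
The two link functions can be evaluated directly. The following is a minimal Python sketch, not part of the original handout; the intercept, slope and regressor value are hypothetical numbers chosen only for illustration.

```python
import math

def logistic_cdf(z):
    """Logistic c.d.f. Lambda(z) = exp(z) / (1 + exp(z)), written as 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal c.d.f. Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical values (assumptions): intercept -1.0, slope 0.5, regressor x = 3.0.
index = -1.0 + 0.5 * 3.0           # the index x'beta, here 0.5
p_logit = logistic_cdf(index)      # logit probability Pr[y = 1 | x]
p_probit = normal_cdf(index)       # probit probability Pr[y = 1 | x]
print(p_logit, p_probit)
```

Both functions map any real index into (0, 1), which is what rules out the out-of-range predictions that OLS can produce.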


Example
A single-regressor example allows a nice plot.
Compare predictions of $\Pr[y = 1 \mid x]$ from logit, probit and OLS.
  - Scatterplot of y = 0 or 1 (jittered) on scalar x (data are generated).

[Figure: predicted Pr[y=1|x] against regressor x, showing the jittered data together with the logit, probit and OLS fits.]

Logit is similar to probit, with predictions between 0 and 1.
OLS predicts outside the (0, 1) interval.

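Logit and Probit MLE

Useful notation: the Bernoulli density can be written in compact notation as
$$f(y_i \mid x_i) = p_i^{y_i} (1 - p_i)^{1 - y_i}.$$

Log-likelihood function:
$$\begin{aligned}
\ln L(\beta) &= \ln \prod_{i=1}^N f(y_i \mid x_i) \\
&= \sum_{i=1}^N \ln f(y_i \mid x_i) \\
&= \sum_{i=1}^N \ln \left\{ p_i^{y_i} (1 - p_i)^{1 - y_i} \right\} \\
&= \sum_{i=1}^N \left\{ y_i \ln p_i + (1 - y_i) \ln(1 - p_i) \right\}.
\end{aligned}$$

MLE solves $\partial \ln L(\beta)/\partial\beta = 0$. After considerable algebra:
Logit ($p_i = \Lambda(x_i'\beta)$): $\sum_{i=1}^N (y_i - \Lambda(x_i'\beta)) x_i = 0$
Probit ($p_i = \Phi(x_i'\beta)$): $\sum_{i=1}^N (y_i - \Phi(x_i'\beta)) \dfrac{\Phi'(x_i'\beta)}{\Phi(x_i'\beta)(1 - \Phi(x_i'\beta))} x_i = 0.$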
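
As a rough check on the logit first-order conditions, the sketch below (not from the handout) codes the log-likelihood and the score for an intercept plus one regressor and compares the score with a finite-difference derivative. The data and the trial value of beta are made up for illustration.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit_loglik(beta, xs, ys):
    """ln L(beta) = sum_i { y_i ln p_i + (1 - y_i) ln(1 - p_i) }, p_i = Lambda(b0 + b1 x_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(beta[0] + beta[1] * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def logit_score(beta, xs, ys):
    """Score sum_i (y_i - Lambda(x_i'beta)) x_i with x_i = (1, x_i)."""
    s0 = s1 = 0.0
    for x, y in zip(xs, ys):
        resid = y - logistic_cdf(beta[0] + beta[1] * x)
        s0 += resid
        s1 += resid * x
    return [s0, s1]

# Made-up data and trial parameter value (assumptions, for illustration only).
xs = [-1.0, 0.0, 0.5, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]
beta = [0.2, 0.4]

score = logit_score(beta, xs, ys)

# Central finite differences of ln L should agree with the analytic score.
h = 1e-6
numeric = [
    (logit_loglik([beta[0] + h, beta[1]], xs, ys)
     - logit_loglik([beta[0] - h, beta[1]], xs, ys)) / (2 * h),
    (logit_loglik([beta[0], beta[1] + h], xs, ys)
     - logit_loglik([beta[0], beta[1] - h], xs, ys)) / (2 * h),
]
print(score, numeric)
```

The MLE is the value of beta at which both elements of the score are zero; in practice this is found iteratively, as the Stata iteration logs in the data examples show.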


Properties of MLE
The distribution is necessarily Bernoulli
  - If $\Pr[y_i = 1 \mid x_i] = p_i$ then necessarily $\Pr[y_i = 0 \mid x_i] = 1 - p_i$, since the two probabilities must sum to one.
  - Only possible error is in $p_i$.
So the MLE is consistent if $p_i$ is correctly specified
  - $p_i = \Lambda(x_i'\beta)$ for logit and $p_i = \Phi(x_i'\beta)$ for probit.
The information matrix equality necessarily holds if data are independent over i, and
Logit:  $\hat\beta_{ML} \overset{a}{\sim} N\left[\beta,\ \left(\sum_{i=1}^N \Lambda(x_i'\beta)(1 - \Lambda(x_i'\beta))\, x_i x_i'\right)^{-1}\right]$
Probit: $\hat\beta_{ML} \overset{a}{\sim} N\left[\beta,\ \left(\sum_{i=1}^N \dfrac{(\Phi'(x_i'\beta))^2}{\Phi(x_i'\beta)(1 - \Phi(x_i'\beta))}\, x_i x_i'\right)^{-1}\right].$
Default ML standard errors implement this by using $\hat\beta$ in place of $\beta$.
  - For independent data there is no need for robust se's in this case.

Data Example: Private health insurance


ins=1 if have private health insurance.
Summary statistics (sample is 50-86 years from 2000 HRS)

. describe ins retire age hstatusg hhincome educyear married hisp

storage display value


variable name type format label variable label

ins float %9.0g 1 if have private health insurance


retire double %12.0g 1 if retired
age double %12.0g age in years
hstatusg float %9.0g 1 if health status good of better
hhincome float %9.0g household annual income in $000's
educyear double %12.0g years of education
married double %12.0g 1 if married
hisp double %12.0g 1 if hispanic

. summarize ins retire age hstatusg hhincome educyear married hisp

Variable Obs Mean Std. Dev. Min Max

ins 3206 .3870867 .4871597 0 1


retire 3206 .6247661 .4842588 0 1
age 3206 66.91391 3.675794 52 86
hstatusg 3206 .7046163 .4562862 0 1
hhincome 3206 45.26391 64.33936 0 1312.124

educyear 3206 11.89863 3.304611 0 17


married 3206 .7330006 .442461 0 1
hisp 3206 .0726762 .2596448 0 1


Summary statistics: by whether or not have private health insurance.

. bysort ins: summarize retire age hstatusg hhincome educyear married hisp, sep(0)

-> ins = 0

Variable Obs Mean Std. Dev. Min Max

retire 1965 .5938931 .49123 0 1


age 1965 66.8229 3.851651 52 86
hstatusg 1965 .653944 .4758324 0 1
hhincome 1965 37.65601 58.98152 0 1197.704
educyear 1965 11.29313 3.475632 0 17
married 1965 .6814249 .4660424 0 1
hisp 1965 .1007634 .3010917 0 1

-> ins = 1

Variable Obs Mean Std. Dev. Min Max

retire 1241 .6736503 .469066 0 1


age 1241 67.05802 3.375173 53 82
hstatusg 1241 .7848509 .4110914 0 1
hhincome 1241 57.31028 70.3737 .124 1312.124
educyear 1241 12.85737 2.755311 2 17
married 1241 .8146656 .3887253 0 1
hisp 1241 .0282031 .1656193 0 1

ins=1 more likely if retired, older, good health status, richer, more
educated, married and nonhispanic.

Logit data example


Stata command logit gives the logit MLE ($p = \Lambda(x'\beta)$).
  - $ME_j = \dfrac{\partial \Pr[y = 1 \mid x]}{\partial x_j} = \Lambda'(x'\beta)\beta_j = \Lambda(x'\beta)(1 - \Lambda(x'\beta))\beta_j$

. * Logit regression
. logit ins retire age hstatusg hhincome educyear married hisp

Iteration 0: log likelihood = -2139.7712


Iteration 1: log likelihood = -1998.8563
Iteration 2: log likelihood = -1994.9129
Iteration 3: log likelihood = -1994.8784
Iteration 4: log likelihood = -1994.8784

Logistic regression Number of obs = 3206


LR chi2(7) = 289.79
Prob > chi2 = 0.0000
Log likelihood = -1994.8784 Pseudo R2 = 0.0677

ins Coef. Std. Err. z P>|z| [95% Conf. Interval]

retire .1969297 .0842067 2.34 0.019 .0318875 .3619718


age -.0145955 .0112871 -1.29 0.196 -.0367178 .0075267
hstatusg .3122654 .0916739 3.41 0.001 .1325878 .491943
hhincome .0023036 .000762 3.02 0.003 .00081 .0037972
educyear .1142626 .0142012 8.05 0.000 .0864288 .1420963
married .578636 .0933198 6.20 0.000 .3957327 .7615394
hisp -.8103059 .1957522 -4.14 0.000 -1.193973 -.4266387
_cons -1.715578 .7486219 -2.29 0.022 -3.18285 -.2483064


Average marginal effect

$$AME_j = \frac{1}{N}\sum_{i=1}^N \frac{\partial \Pr[y_i = 1 \mid x_i]}{\partial x_j} = \frac{1}{N}\sum_{i=1}^N \Lambda(x_i'\beta)(1 - \Lambda(x_i'\beta))\beta_j$$
Compute AME after logit using Stata 11 margins, dydx(*) or
Stata 10 add-on command margeff.

. margins, dydx(*)
Warning: cannot perform check for estimable functions.

Average marginal effects Number of obs = 3206


Model VCE : OIM

Expression : Pr(ins), predict()


dy/dx w.r.t. : retire age hstatusg hhincome educyear married hisp

Delta-method
dy/dx Std. Err. z P>|z| [95% Conf. Interval]

retire .0427616 .018228 2.35 0.019 .0070354 .0784878


age -.0031693 .0024486 -1.29 0.196 -.0079686 .00163
hstatusg .0678058 .0197778 3.43 0.001 .0290419 .1065696
hhincome .0005002 .0001646 3.04 0.002 .0001777 .0008228
educyear .0248111 .0029705 8.35 0.000 .0189891 .0306332
married .1256459 .0198205 6.34 0.000 .0867985 .1644933
hisp -.175951 .0421962 -4.17 0.000 -.258654 -.0932481

Marginal effects: 0.043, -0.003, 0.068, 0.0005, 0.025, 0.126, -0.176 vs.
Coefficients: 0.197, -0.015, 0.312, 0.0023, 0.114, 0.579, -0.810.
  - Marginal effect here is about one-fifth the size of the coefficient.
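
The average marginal effect formula for logit is simple to compute by hand. The sketch below is illustrative only: the regressor values and the coefficients in `xs`, `beta0` and `beta1` are hypothetical, while the final ratio uses the `retire` coefficient and marginal effect from the Stata output above.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit_ame(beta0, beta1, xs):
    """Average over observations of Lambda(x_i'b) * (1 - Lambda(x_i'b)) * b1."""
    effects = []
    for x in xs:
        p = logistic_cdf(beta0 + beta1 * x)
        effects.append(p * (1.0 - p) * beta1)
    return sum(effects) / len(effects)

# Hypothetical regressor values and coefficients (assumptions).
xs = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0, 3.0]
beta0, beta1 = -0.5, 0.8
ame = logit_ame(beta0, beta1, xs)

# Ratio of the reported AME to the reported logit coefficient for `retire`.
ratio_retire = 0.0427616 / 0.1969297
print(ame, ratio_retire)
```

Because Lambda(1 - Lambda) never exceeds 0.25, the AME of a logit slope can be at most one quarter of the coefficient in magnitude; the ratio for `retire` is a little above one-fifth.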

Probit data example


Stata command probit gives the probit MLE.

. probit ins retire age hstatusg hhincome educyear married hisp

Iteration 0: log likelihood = -2139.7712


Iteration 1: log likelihood = -1996.0367
Iteration 2: log likelihood = -1993.6288
Iteration 3: log likelihood = -1993.6237

Probit regression Number of obs = 3206


LR chi2(7) = 292.30
Prob > chi2 = 0.0000
Log likelihood = -1993.6237 Pseudo R2 = 0.0683

ins Coef. Std. Err. z P>|z| [95% Conf. Interval]

retire .1183567 .0512678 2.31 0.021 .0178737 .2188396


age -.0088696 .006899 -1.29 0.199 -.0223914 .0046521
hstatusg .1977357 .0554868 3.56 0.000 .0889836 .3064877
hhincome .001233 .0003866 3.19 0.001 .0004754 .0019907
educyear .0707477 .0084782 8.34 0.000 .0541308 .0873646
married .362329 .0560031 6.47 0.000 .2525651 .472093
hisp -.4731099 .1104385 -4.28 0.000 -.6895655 -.2566544
_cons -1.069319 .4580791 -2.33 0.020 -1.967138 -.1715009

Scaled differently from logit but with similar t-statistics (see below).



OLS data example


OLS estimates for private health insurance
  - If using OLS, heteroskedasticity-robust standard errors are needed.

. regress ins retire age hstatusg hhincome educyear married hisp, vce(robust)

Linear regression Number of obs = 3206


F( 7, 3198) = 58.98
Prob > F = 0.0000
R-squared = 0.0826
Root MSE = .46711

Robust
ins Coef. Std. Err. t P>|t| [95% Conf. Interval]

retire .0408508 .0182217 2.24 0.025 .0051234 .0765782


age -.0028955 .0023254 -1.25 0.213 -.0074549 .0016638
hstatusg .0655583 .0190126 3.45 0.001 .0282801 .1028365
hhincome .0004921 .0001874 2.63 0.009 .0001247 .0008595
educyear .0233686 .0027081 8.63 0.000 .0180589 .0286784
married .1234699 .0186521 6.62 0.000 .0868987 .1600411
hisp -.1210059 .0269459 -4.49 0.000 -.1738389 -.068173
_cons .1270857 .1538816 0.83 0.409 -.1746309 .4288023


Compare logit, probit and OLS estimates


Coefficients in different models are not directly comparable!
  - Though the t-statistics are similar.
. * Compare coefficient estimates across models with default and robust standard e
. estimates table blogit bprobit bols blogitr bprobitr bolsr, ///
> stats(N ll) b(%7.3f) t(%7.2f) stfmt(%8.2f)

Variable blogit bprobit bols blogitr bprobitr bolsr

retire 0.197 0.118 0.041 0.197 0.118 0.041


2.34 2.31 2.24 2.32 2.30 2.24
age -0.015 -0.009 -0.003 -0.015 -0.009 -0.003
-1.29 -1.29 -1.20 -1.32 -1.32 -1.25
hstatusg 0.312 0.198 0.066 0.312 0.198 0.066
3.41 3.56 3.37 3.40 3.57 3.45
hhincome 0.002 0.001 0.000 0.002 0.001 0.000
3.02 3.19 3.58 2.01 2.21 2.63
educyear 0.114 0.071 0.023 0.114 0.071 0.023
8.05 8.34 8.15 7.96 8.33 8.63
married 0.579 0.362 0.123 0.579 0.362 0.123
6.20 6.47 6.38 6.15 6.46 6.62
hisp -0.810 -0.473 -0.121 -0.810 -0.473 -0.121
-4.14 -4.28 -3.59 -4.18 -4.36 -4.49
_cons -1.716 -1.069 0.127 -1.716 -1.069 0.127
-2.29 -2.33 0.79 -2.36 -2.40 0.83

N 3206 3206 3206 3206 3206 3206


ll -1994.88 -1993.62 -2104.75 -1994.88 -1993.62 -2104.75

legend: b/t


Compare predicted probabilities from models


Predicted probabilities $\frac{1}{N}\sum_{i=1}^N F(x_i'\hat\beta)$ for different models.

. * Comparison of predicted probabilities from logit, probit and OLS


. quietly logit ins retire age hstatusg hhincome educyear married hisp

. predict plogit, p

. quietly probit ins retire age hstatusg hhincome educyear married hisp

. predict pprobit, p

. quietly regress ins retire age hstatusg hhincome educyear married hisp

. quietly predict pOLS

. summarize ins plogit pprobit pOLS

Variable Obs Mean Std. Dev. Min Max

ins 3206 .3870867 .4871597 0 1


plogit 3206 .3870867 .1418287 .0340215 .9649615
pprobit 3206 .3861139 .1421416 .0206445 .9647618
pOLS 3206 .3870867 .1400249 -.1557328 1.197223

Average probabilities are very close (and for logit and OLS equal $\bar y$).
Range is similar for logit and probit, but OLS gives some $\hat p_i < 0$ and some $\hat p_i > 1$.

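Marginal effects: Approximations for logit and probit

In general for $p = F(x'\beta)$, $ME_j = \partial p/\partial x_j = F'(x'\beta)\beta_j$.
  - For OLS: $ME_j = \hat\beta_j$.
  - For logit: $ME_j \le 0.25\hat\beta_j$ (in magnitude) as $F'(x'\beta) = \Lambda(x'\beta)(1 - \Lambda(x'\beta)) \le 0.25$.
  - For probit: $ME_j \le 0.40\hat\beta_j$ (in magnitude) as $F'(x'\beta) = \phi(x'\beta) \le 1/\sqrt{2\pi} \simeq 0.40$.

This leads to the following rule of thumb for slope parameters
$$\hat\beta_{Logit} \simeq 4\hat\beta_{OLS}, \qquad \hat\beta_{Probit} \simeq 2.5\hat\beta_{OLS}, \qquad \hat\beta_{Logit} \simeq 1.6\hat\beta_{Probit}.$$

Also for logit a useful approximation is $ME_j \simeq \bar y(1 - \bar y)\hat\beta_j$.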
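
The constants behind the rule of thumb can be reproduced in a few lines. This sketch is an illustration rather than part of the handout; the sample mean of `ins` and the logit coefficient on `retire` are copied from the Stata output earlier in these notes.

```python
import math

# Maximum of the logistic density Lambda(z)(1 - Lambda(z)), attained at z = 0.
max_logit_density = 0.5 * (1.0 - 0.5)
# Maximum of the standard normal density phi(z), attained at z = 0.
max_probit_density = 1.0 / math.sqrt(2.0 * math.pi)

ratio_logit_ols = 1.0 / max_logit_density                      # rule of thumb: about 4
ratio_probit_ols = 1.0 / max_probit_density                    # rule of thumb: about 2.5
ratio_logit_probit = max_probit_density / max_logit_density    # rule of thumb: about 1.6

# Approximation ME_j ~ ybar (1 - ybar) b_j, using the reported sample mean of ins
# and the reported logit coefficient on retire.
ybar = 0.3870867
b_retire = 0.1969297
approx_me_retire = ybar * (1.0 - ybar) * b_retire
print(ratio_logit_ols, ratio_probit_ols, ratio_logit_probit, approx_me_retire)
```

The approximate marginal effect for `retire` comes out close to, though not identical with, the average marginal effect of 0.0428 reported by `margins`.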


Which model?
Logit: binary model most often used by statisticians.
  - generalizes simply to multinomial data (> two outcomes)
  - $\hat\beta_j$ measures the change in the log-odds $\ln(p/(1 - p))$ due to a change in $x_j$.
Probit: binary model most often used by economists.
  - motivated by a latent normal random variable.
  - generalizes to Tobit models and multinomial probit.
Empirically: either logit or probit can be used
  - give similar predictions and marginal effects
  - greatest difference is in prediction of probabilities close to 0 or 1.
Complementary log-log model
  - sometimes used when outcomes are mostly 0 or mostly 1.
OLS: can be useful for preliminary data analysis
  - but final results should use probit or logit.


3. Multinomial models: Definition

There are m mutually exclusive alternatives.
  - y takes value j if the outcome is alternative j, $j = 1, \dots, m$.
  - Probability that the outcome is alternative j is
$$p_j = \Pr[y = j], \quad j = 1, \dots, m.$$
Introduce m binary variables for each observed y
$$y_j = \begin{cases} 1 & \text{if } y = j \\ 0 & \text{if } y \ne j. \end{cases}$$
  - $y_j = 1$ if alternative j is chosen and $y_j = 0$ for all non-chosen alternatives.
  - For an individual exactly one of $y_1, y_2, \dots, y_m$ will be non-zero.
Density for one observation is conveniently written as
$$f(y) = p_1^{y_1} \times p_2^{y_2} \times \cdots \times p_m^{y_m} = \prod_{j=1}^m p_j^{y_j}.$$


Regression Model
Introduce individual characteristics
  - parameterize $p_{ij}$ in terms of observed data $x_i$ and parameters $\beta$:
$$p_{ij} = \Pr[y_i = j] = F_j(x_i, \beta), \quad j = 1, \dots, m.$$
  - these probabilities should lie between 0 and 1 and sum over j to one.
MLE maximizes the log-likelihood function
$$\ln L(\beta) = \ln \prod_{i=1}^N f(y_i) = \ln \prod_{i=1}^N \prod_{j=1}^m p_{ij}^{y_{ij}} = \sum_{i=1}^N \sum_{j=1}^m y_{ij} \ln p_{ij}.$$
Different models have different specifications for $p_{ij}$.
  - e.g. multinomial logit
$$p_{ij} = \Pr[y_i = j] = \frac{\exp(x_i'\beta_j)}{\sum_{k=1}^m \exp(x_i'\beta_k)}, \quad j = 1, \dots, m, \quad \beta_1 = 0.$$
  - nested logit, multinomial probit, ordered logit, ... use different $p_{ij}$.

Data example: Fishing site

Multinomial variable y has outcome one of
  - y = 1 if fish from beach
  - y = 2 if fish from pier
  - y = 3 if fish from private boat
  - y = 4 if fish from charter boat
Regressors are
  - price: varies by alternative and individual
  - catch rate: varies by alternative and individual
  - income: varies by individual but not alternative


Variable definitions

. describe

Contains data from mus15data.dta


obs: 1,182
vars: 16 12 May 2008 20:46
size: 85,104 (99.2% of memory free)

storage display value


variable name type format label variable label

mode float %9.0g modetype Fishing mode


price float %9.0g price for chosen alternative
crate float %9.0g catch rate for chosen
alternative
dbeach float %9.0g 1 if beach mode chosen
dpier float %9.0g 1 if pier mode chosen
dprivate float %9.0g 1 if private boat mode chosen
dcharter float %9.0g 1 if charter boat mode chosen
pbeach float %9.0g price for beach mode
ppier float %9.0g price for pier mode
pprivate float %9.0g price for private boat mode
pcharter float %9.0g price for charter boat mode
qbeach float %9.0g catch rate for beach mode
qpier float %9.0g catch rate for pier mode
qprivate float %9.0g catch rate for private boat mode
qcharter float %9.0g catch rate for charter boat mode
income float %9.0g monthly income in thousands $


Data organization
  - here wide form, with one observation per individual
  - each observation has data for all the possible alternatives.

. list mode d* p* income in 1/2, clean

mode dbeach dpier dprivate dcharter price pbeach ppier pprivat


> e pcharter pmlogit1 pmlogit2 pmlogit3 pmlogit4 income
1. charter 0 0 0 1 182.93 157.93 157.93 157.9
> 3 182.93 .1125092 .0919656 .4516733 .3438518 7.083332
2. charter 0 0 0 1 34.534 15.114 15.114 10.53
> 4 34.534 .1122198 .2117394 .2635553 .4124855 1.25

Here person 2 chose charter fishing (mode=charter or dcharter=1) when beach, pier, private and charter fishing cost, respectively, 15.11, 15.11, 10.53 and 34.53.


Summary statistics
  - Columns y = 1, ..., 4 give sample means for those with y = 1, ..., 4.

                                  Sub-sample averages
Explanatory variable           y=1      y=2      y=3      y=4      All y
                               Beach    Pier     Private  Charter  Overall
Income ($1,000's per month)    4.052    3.387    4.654    3.881    4.099
Price beach ($)                36       31       138      121      103
Price pier ($)                 36       31       138      121      103
Price private ($)              98       82       42       45       55
Price charter ($)              125      110      71       75       84
Catch rate beach               0.28     0.26     0.21     0.25     0.24
Catch rate pier                0.22     0.20     0.13     0.16     0.16
Catch rate private             0.16     0.15     0.18     0.18     0.17
Catch rate charter             0.52     0.50     0.65     0.69     0.63
Sample probability             0.113    0.151    0.354    0.382    1.000
Observations                   134      178      418      452      1182

On average a person chooses to fish where it is cheapest to fish.


Multinomial logit of fishing mode regressed on intercept and income

  - $\Pr[y_{ij} = 1] = \dfrac{e^{\alpha_j + \beta_j \text{income}_i}}{\sum_{k=1}^4 e^{\alpha_k + \beta_k \text{income}_i}}$, $j = 1, 2, 3, 4$, with $\alpha_1 = 0$, $\beta_1 = 0$.
  - normalization that base outcome is beach fishing (y = 1)

. * Multinomial logit with base outcome alternative 1


. mlogit mode income, baseoutcome(1)

Iteration 0: log likelihood = -1497.7229


Iteration 1: log likelihood = -1477.5265
Iteration 2: log likelihood = -1477.1514
Iteration 3: log likelihood = -1477.1506

Multinomial logistic regression Number of obs = 1182


LR chi2( 3) = 41.14
Prob > chi2 = 0.0000
Log likelihood = -1477.1506 Pseudo R2 = 0.0137

mode Coef. Std. Err. z P>|z| [95% Conf. Interval]

pier
income -.1434029 .0532882 -2.69 0.007 -.2478459 -.03896
_cons .8141503 .2286316 3.56 0.000 .3660405 1.26226

private
income .0919064 .0406638 2.26 0.024 .0122069 .1716059
_cons .7389208 .1967309 3.76 0.000 .3533352 1.124506

charter
income -.0316399 .0418463 -0.76 0.450 -.1136571 .0503774
_cons 1.341291 .1945167 6.90 0.000 .9600457 1.722537

(mode==beach is the base outcome)


Predicted probabilities of each outcome:

$$\widehat{\Pr}[y_{ij} = 1] = \frac{e^{\hat\alpha_j + \hat\beta_j \text{income}_i}}{\sum_{k=1}^4 e^{\hat\alpha_k + \hat\beta_k \text{income}_i}}$$

. * Compare average predicted probabilities to sample average frequencies


. predict pmlogit1 pmlogit2 pmlogit3 pmlogit4, pr

. summarize pmlogit* dbeach dpier dprivate dcharter, separator(4)

Variable Obs Mean Std. Dev. Min Max

pmlogit1 1182 .1133672 .0036716 .0947395 .1153659


pmlogit2 1182 .1505922 .0444575 .0356142 .2342903
pmlogit3 1182 .3536379 .0797714 .2396973 .625706
pmlogit4 1182 .3824027 .0346281 .2439403 .4158273

dbeach 1182 .1133672 .3171753 0 1


dpier 1182 .1505922 .3578023 0 1
dprivate 1182 .3536379 .4783008 0 1
dcharter 1182 .3824027 .4861799 0 1

As expected, average predicted probabilities sum to one.
Furthermore, the average predicted probability of each outcome equals the sample frequency of that outcome.
  - Property of multinomial logit and conditional logit
  - Analog of OLS residuals summing to zero, so that the average of $\hat y$ equals $\bar y$.
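
The predicted probabilities can be reproduced by hand from the estimated coefficients. The sketch below (not from the handout) plugs the `mlogit` estimates reported above into the multinomial logit formula for person 2 in the earlier data listing, whose income is 1.25; the function and variable names are illustrative.

```python
import math

# (intercept, income slope) for each outcome, copied from the mlogit output.
# Beach is the base outcome, so its coefficients are normalized to zero.
coefs = {
    "beach":   (0.0, 0.0),
    "pier":    (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}

def mlogit_probs(income):
    """Multinomial logit probabilities exp(a_j + b_j income) / sum_k exp(a_k + b_k income)."""
    weights = {mode: math.exp(a + b * income) for mode, (a, b) in coefs.items()}
    denom = sum(weights.values())
    return {mode: w / denom for mode, w in weights.items()}

probs = mlogit_probs(1.25)   # person 2: monthly income 1.25 (thousands of dollars)
print(probs)
```

The results should agree, up to rounding of the displayed coefficients, with the values pmlogit1 to pmlogit4 shown for person 2 in the listing (.1122198, .2117394, .2635553, .4124855).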

Parameter interpretation is complex.

There are many marginal effects: one for each outcome value.
  - Here $ME_{ij} = \partial p_{ij}/\partial x_i = p_{ij}(\beta_j - \bar\beta_i)$, where $\bar\beta_i = \sum_l p_{il}\beta_l$.
  - e.g. the average marginal effect (AME) of income on the probability of fishing from a private boat (the third outcome): a $1,000 increase in monthly income increases Pr[private boat fishing] by 0.032.

. * AME of income change for outcome 3


. margins, dydx(*) predict(outcome(3))
Warning: cannot perform check for estimable functions.

Average marginal effects Number of obs = 1182


Model VCE : OIM

Expression : Pr(mode==3), predict(outcome(3))


dy/dx w.r.t. : income

Delta-method
dy/dx Std. Err. z P>|z| [95% Conf. Interval]

income .0317562 .0052589 6.04 0.000 .021449 .0420633
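
The marginal effect formula for multinomial logit, p_ij times (beta_j minus the probability-weighted average of the betas), can be evaluated at a single income value. The sketch below is an illustration using the reported `mlogit` coefficients; it gives the effect at income 1.25 only, not the sample average that `margins` reports.

```python
import math

# (intercept, income slope) for each outcome, copied from the mlogit output.
coefs = {
    "beach":   (0.0, 0.0),
    "pier":    (0.8141503, -0.1434029),
    "private": (0.7389208, 0.0919064),
    "charter": (1.341291, -0.0316399),
}

def mlogit_probs(income):
    weights = {mode: math.exp(a + b * income) for mode, (a, b) in coefs.items()}
    denom = sum(weights.values())
    return {mode: w / denom for mode, w in weights.items()}

def income_marginal_effects(income):
    """ME_j = p_j * (beta_j - sum_l p_l * beta_l) for each outcome j."""
    probs = mlogit_probs(income)
    beta_bar = sum(probs[mode] * coefs[mode][1] for mode in coefs)
    return {mode: probs[mode] * (coefs[mode][1] - beta_bar) for mode in coefs}

me = income_marginal_effects(1.25)   # hypothetical evaluation point
print(me)
```

The effects sum to zero across outcomes, because the probabilities always sum to one. Note also that the sign of a marginal effect need not match the sign of the corresponding coefficient.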


Further details
$\hat\beta$ is consistent and asymptotically normal by the usual asymptotic theory if the d.g.p. is correctly specified.
  - The distribution is necessarily multinomial.
  - So key is correct specification of $p_{ij} = F_j(x_i, \beta)$.
  - And no need to use vce(robust) option if independent data.
Distinguish between two different types of regressors.
  - Individual-specific or case-specific or alternative-invariant regressors do not vary across alternatives.
      * e.g. income (in our example), gender.
  - Alternative-varying regressors may vary across alternatives.
      * e.g. price.
  - Multinomial logit: all regressors are individual-specific.
  - Conditional logit: same as multinomial logit except that regressors may be alternative-varying.


Unordered models
Unordered model: no obvious ordering of alternatives.
Additive random utility model (ARUM) specifies utility of each alternative (of m) as
$$U_1 = V_1 + \varepsilon_1, \quad U_2 = V_2 + \varepsilon_2, \quad \dots, \quad U_m = V_m + \varepsilon_m$$
  - Here $V_j$ is the deterministic part of utility, e.g. $V_j = x'\beta_j$ or $x_j'\beta$, and $\varepsilon_j$ are errors.
Then j is chosen if it has the highest utility
$$\Pr[y = j] = \Pr[U_j \ge U_k, \text{ all } k \ne j] = \Pr[\varepsilon_k - \varepsilon_j \le -(V_k - V_j), \text{ all } k \ne j]$$
Different error distributions lead to different multinomial models.
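
The link between the ARUM and a specific model can be checked by simulation. The sketch below, which is not part of the handout, draws i.i.d. type I extreme value errors, picks the alternative with the highest utility, and compares choice frequencies with the logit formula exp(V_j) / sum_k exp(V_k). The utilities in `V`, the seed and the number of draws are arbitrary assumptions.

```python
import math
import random

def gumbel_draw(rng):
    """Type I extreme value draw via the inverse c.d.f.: -ln(-ln(U)), U uniform on (0, 1)."""
    u = rng.random()
    while u <= 0.0:          # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

V = [0.0, 0.5, 1.0]          # hypothetical deterministic utilities for 3 alternatives
rng = random.Random(12345)
n = 100_000
counts = [0] * len(V)
for _ in range(n):
    utilities = [v + gumbel_draw(rng) for v in V]
    counts[utilities.index(max(utilities))] += 1   # alternative with highest utility

simulated = [c / n for c in counts]
denom = sum(math.exp(v) for v in V)
logit = [math.exp(v) / denom for v in V]
print(simulated, logit)
```

With this many draws the simulated frequencies should lie close to the closed-form logit probabilities; replacing the error distribution would give a different model.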

Examples of unordered models

1. Multinomial logit and conditional logit:
  - errors $\varepsilon_j$ are i.i.d. type I extreme value.
2. Nested logit:
  - $\varepsilon_j$ are correlated type I extreme value.
3. Random parameters logit:
  - $\varepsilon_j$ are i.i.d. type I extreme value
  - but additionally parameters $\beta_i$ are multivariate normal
  - no analytical solution for $p_{ij}$.
4. Multinomial probit:
  - $\varepsilon_j$ are correlated multivariate normal
  - no analytical solution for $p_{ij}$.


Model 1: multinomial logit, conditional logit
  - attraction is that it is tractable (easy to estimate), but it is too limited
  - independence of irrelevant alternatives
      * $\Pr[y_{ik} = 1 \mid y_{ik} = 1 \text{ or } y_{ij} = 1]$ depends only on alternatives j and k
      * assumes $\varepsilon_{ij}$ independent of $\varepsilon_{ik}$
      * red bus - blue bus problem.
Model 2: nested logit
  - richer and still easy, but requires specifying the error correlation structure
  - two versions, only one of which is consistent with ARUM
Model 3: random parameters logit
  - currently very popular (use simulated ML or Bayesian methods)
Model 4: multinomial probit
  - potentially rich but hard to estimate and fits poorly.


Ordered multinomial models

For outcomes for which there is a natural ordering
  - e.g. y is a person's health status.
We observe poor or fair ($y_i = 1$), good ($y_i = 2$) or excellent ($y_i = 3$).
Model is based on a single latent variable $y_i^* = x_i'\beta + u_i$.
Multinomial outcomes depend on magnitude of $y_i^*$. For 3 outcomes:
$$y_i = \begin{cases} 1 & \text{if } y_i^* \le \alpha_1 \\ 2 & \text{if } \alpha_1 < y_i^* \le \alpha_2 \\ 3 & \text{if } y_i^* > \alpha_2. \end{cases}$$
Ordered probit model specifies $u_i \sim N[0, 1]$. Then
$$\begin{aligned}
p_1 &= \Pr[y_i^* \le \alpha_1] = \Pr[x_i'\beta + u_i \le \alpha_1] = \Phi(\alpha_1 - x_i'\beta) \\
p_2 &= \Pr[\alpha_1 < x_i'\beta + u_i \le \alpha_2] = \Phi(\alpha_2 - x_i'\beta) - \Phi(\alpha_1 - x_i'\beta) \\
p_3 &= 1 - p_1 - p_2.
\end{aligned}$$
  - ML estimation is straightforward.
  - Ordered logit model specifies u logistic: replace $\Phi(\cdot)$ above by $\Lambda(\cdot)$.
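
The three ordered probit probabilities follow mechanically from the two thresholds. A minimal sketch, with a hypothetical index value and hypothetical thresholds chosen only for illustration:

```python
import math

def normal_cdf(z):
    """Standard normal c.d.f. Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, alpha1, alpha2):
    """Return (p1, p2, p3) for index xb = x'beta and thresholds alpha1 < alpha2."""
    p1 = normal_cdf(alpha1 - xb)
    p2 = normal_cdf(alpha2 - xb) - normal_cdf(alpha1 - xb)
    p3 = 1.0 - p1 - p2
    return p1, p2, p3

# Hypothetical values (assumptions): index 0, thresholds -1 and 1.
p1, p2, p3 = ordered_probit_probs(0.0, -1.0, 1.0)
print(p1, p2, p3)
```

Raising the index x'beta shifts probability mass from the lowest category towards the highest one, while the effect on the middle category can go either way.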

Stata commands
Command      Model
mlogit       multinomial logit
asclogit     conditional logit
clogit       older command for conditional logit
nlogit       nested logit (ARUM version)
mprobit      multinomial probit
asmprobit    multinomial probit
mixlogit     random parameters logit (Stata add-on)

Commands mlogit and mprobit are for individual-specific regressors only
  - data in wide form (one obs is all alternatives for individual)
Other commands allow alternative-varying regressors (e.g. price)
  - data in long form (one obs is one alternative for individual)
  - command reshape moves data from wide to long form.

4. Censored data: Tobit

Problem with censored or truncated data:
  - The incomplete sample is not representative of the population. Instead, the sample is selected on the basis of y (whereas selection on x is okay).
  - Simple estimators are inconsistent and give wrong marginal effects.
So alternative estimators are needed. These require strong assumptions.
Censored data: For part of the range of y we observe only that y is in that range, rather than observing the exact value of y.
  - e.g. Annual income top-coded at $75,000 (censored from above).
  - e.g. Expenditures or hours worked bunched at 0 (censored from below).
Truncated data: For part of the range of y we do not observe y at all.
  - e.g. Sample excludes those with annual income > $75,000.
  - e.g. Those with expenditures of $0 are not observed.


Tobit Model Definition

Latent dependent variable $y^*$ follows regular linear regression
$$y^* = x'\beta + \varepsilon, \qquad \varepsilon \sim N[0, \sigma^2]$$
  - But this latent variable is only partially observed.
Censored regression (from below at 0): we observe
$$y = \begin{cases} y^* & \text{if } y^* > 0 \\ 0 & \text{if } y^* \le 0. \end{cases}$$
Truncated regression (from below at 0): we observe only
$$y = y^* \quad \text{if } y^* > 0.$$
In either case can estimate by MLE (skip this)
  - very fragile: e.g. inconsistent if $\varepsilon$ is nonnormal or is heteroskedastic.
We focus on conditional means, for intuition and later work.

Tobit example with Simulated Data


Specify a linear relationship between
I y : annual hours worked, and
I x : log hourly wage.
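Desired hours of work, $y^*$, generated by model
\[ y_i^* = -2500 + 1000 x_i + \varepsilon_i, \qquad i = 1, \dots, 250, \]
\[ \varepsilon_i \sim N[0, 1000^2], \qquad x_i \sim N[2.75, 0.6^2] \ (\Rightarrow w_i \sim [18.73, 12.32^2]). \]
Tobit model: Instead of observing $y^*$ we observe $y$ where
\[ y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0. \end{cases} \]
I Here if desired hours are negative people do not work and $y = 0$.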


Scatterplot & true regression curves (derived later) for three samples:
I truncated (top), censored (middle) and completely observed (bottom).

[Figure: "Tobit: Censored and Truncated Means". Scatter of the actual latent variable with the truncated mean, censored mean and uncensored mean curves. Vertical axis: different conditional means (about -2000 to 4000). Horizontal axis: x (natural logarithm of wage), 1 to 5.]

With censored and truncated data the model is now nonlinear
I and a fitted linear model will be a flatter line than the true line ($\hat{\beta} \simeq 0.5\beta$).
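
The flattening of the fitted line can be checked with a small simulation. This is a sketch only: it assumes the data-generating process of the simulated example (intercept -2500, slope 1000, error standard deviation 1000, log wage drawn from N[2.75, 0.6^2]), but uses a larger sample (20,000 rather than 250) and an arbitrary seed so that the slopes are estimated with little noise. The helper `ols_slope` is a hypothetical name for a plain bivariate OLS slope.

```python
import random


def ols_slope(x, y):
    """Slope from OLS of y on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(12345)          # arbitrary seed, for reproducibility only
n = 20000                           # larger than the slides' 250 to reduce noise
x = [rng.gauss(2.75, 0.6) for _ in range(n)]
ystar = [-2500.0 + 1000.0 * xi + rng.gauss(0.0, 1000.0) for xi in x]

# Censored sample: negative desired hours are recorded as zero.
y_cens = [max(v, 0.0) for v in ystar]

# Truncated sample: observations with nonpositive desired hours are dropped.
kept = [(xi, v) for xi, v in zip(x, ystar) if v > 0]
x_trunc = [p[0] for p in kept]
y_trunc = [p[1] for p in kept]

slope_latent = ols_slope(x, ystar)
slope_cens = ols_slope(x, y_cens)
slope_trunc = ols_slope(x_trunc, y_trunc)

print(round(slope_latent), round(slope_cens), round(slope_trunc))
```

Under these assumptions the latent-data slope should be close to 1000, while the censored and truncated slopes come out much smaller, in line with the flatter line described above.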

Truncated Mean in Tobit model

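Truncated mean: We observe $y$ only when $y^* > 0$.

The truncated conditional mean (suppressing conditioning on x) is
\[
\begin{aligned}
E[y \mid y^* > 0]
 &= E[x'\beta + \varepsilon \mid x'\beta + \varepsilon > 0] && \text{as } y^* = x'\beta + \varepsilon \\
 &= x'\beta + E[\varepsilon \mid \varepsilon > -x'\beta] && \text{as } x \text{ and } \varepsilon \text{ independent} \\
 &= x'\beta + \sigma E\left[\tfrac{\varepsilon}{\sigma} \,\middle|\, \tfrac{\varepsilon}{\sigma} > -\tfrac{x'\beta}{\sigma}\right] && \text{transform to } \varepsilon/\sigma \sim N[0, 1] \\
 &= x'\beta + \sigma \lambda\left(\tfrac{x'\beta}{\sigma}\right) && \text{using next slide: key result for } N[0, 1].
\end{aligned}
\]
I where $\lambda(z) = \phi(z)/\Phi(z)$ is called the inverse Mills ratio.

The regression function is not just $x'\beta$ (and is nonlinear).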
I OLS of y on x is inconsistent for β
I Need NLS or MLE for consistent estimates.


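Derivation: Truncated mean $E[z \mid z > c]$ for the standard normal

I key result used in the previous slide
I consider $z \sim N[0, 1]$, with density $\phi(z)$ and c.d.f. $\Phi(z)$.
I conditional density of $z \mid z > c$ is $\phi(z)/(1 - \Phi(c))$.
I truncated conditional mean is
\[
\begin{aligned}
E[z \mid z > c]
 &= \int_c^{\infty} z \, \frac{\phi(z)}{1 - \Phi(c)} \, dz \\
 &= \int_c^{\infty} z \, \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} z^2\right) dz \Big/ (1 - \Phi(c)) \\
 &= \left[ -\frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} z^2\right) \right]_c^{\infty} \Big/ (1 - \Phi(c)) \\
 &= \frac{\phi(c)}{1 - \Phi(c)} \\
 &= \frac{\phi(-c)}{\Phi(-c)} \\
 &= \lambda(-c), \quad \text{where } \lambda(c) = \phi(c)/\Phi(c).
\end{aligned}
\]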
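
The key result can be checked numerically. The sketch below compares the closed form with a Monte Carlo average of standard normal draws that exceed c; the function names, the cutoff c = 0.5, the number of draws and the seed are illustrative assumptions.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N[0, 1]


def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)


def truncated_mean_formula(c):
    """Closed form E[z | z > c] = phi(c) / (1 - Phi(c)) = lambda(-c)."""
    return inverse_mills(-c)


def truncated_mean_simulated(c, n=200000, seed=1):
    """Average of the N[0, 1] draws that exceed c."""
    rng = random.Random(seed)
    kept = [z for z in (rng.gauss(0.0, 1.0) for _ in range(n)) if z > c]
    return sum(kept) / len(kept)


c = 0.5
formula_value = truncated_mean_formula(c)
simulated_value = truncated_mean_simulated(c)
print(round(formula_value, 3), round(simulated_value, 3))
```

The two numbers should agree up to simulation noise, which shrinks as the number of draws grows.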


Tobit Model: Censored Mean

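Censored mean: We observe $y = 0$ if $y^* \le 0$ and $y = y^*$ otherwise.

The censored conditional mean (suppressing conditioning on x) is
\[
\begin{aligned}
E[y] &= E_{y^*}\left[ E[y \mid y^*] \right] \\
     &= \Pr[y^* \le 0] \times 0 + \Pr[y^* > 0] \times E[y^* \mid y^* > 0] \\
     &= \Phi(x'\beta/\sigma) \left\{ x'\beta + \sigma \frac{\phi(x'\beta/\sigma)}{\Phi(x'\beta/\sigma)} \right\} \\
E[y \mid x] &= \Phi(x'\beta/\sigma) \, x'\beta + \sigma \phi(x'\beta/\sigma),
\end{aligned}
\]
using earlier result for the truncated mean $E[y^* \mid y^* > 0]$.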
This conditional mean is again nonlinear.
I OLS of y on x is inconsistent for β
I Need NLS or MLE for consistent estimates.
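
To see how the three conditional means differ, the following sketch evaluates the latent, truncated and censored means at a few values of x. It assumes the parameters of the simulated hours example (intercept -2500, slope 1000, sigma 1000); the function names and the evaluation points are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N[0, 1]


def latent_mean(x, b0=-2500.0, b1=1000.0):
    """Uncensored mean E[y* | x] = x'beta."""
    return b0 + b1 * x


def truncated_mean(x, sigma=1000.0):
    """E[y | x, y* > 0] = x'beta + sigma * lambda(x'beta / sigma)."""
    xb = latent_mean(x)
    z = xb / sigma
    return xb + sigma * STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def censored_mean(x, sigma=1000.0):
    """E[y | x] = Phi(x'beta / sigma) * x'beta + sigma * phi(x'beta / sigma)."""
    xb = latent_mean(x)
    z = xb / sigma
    return STD_NORMAL.cdf(z) * xb + sigma * STD_NORMAL.pdf(z)


for x in (1.0, 2.75, 4.0):
    print(x, round(latent_mean(x), 1), round(censored_mean(x), 1),
          round(truncated_mean(x), 1))
```

At every x the truncated mean lies above the censored mean, which lies above the latent mean, and the gaps narrow as x'beta grows, matching the ordering of the three curves in the earlier scatterplot.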


Tobit MLE: Data Example


Data from 2001 Medical Expenditure Survey (MUS chapter 16).
I ambexp (ambulatory expenditure = physician and hospital outpatient).
I dambexp (=1 if ambexp>0 and =0 if ambexp=0).
I Regressors: age (in tens of years), female, educ (years of completed
schooling), blhisp (=1 if black or hispanic) , totchr (number of
chronic conditions), and ins (=1 if PPO or HMO health insurance).

Variable    Obs       Mean   Std. Dev.    Min     Max
ambexp     3328   1386.519    2530.406      0   49960
dambexp    3328   .8419471    .3648454      0       1
age        3328   4.056881    1.121212    2.1     6.4
female     3328   .5084135    .5000043      0       1
educ       3328   13.40565    2.574199      0      17
blhisp     3328   .3085938    .4619824      0       1
totchr     3328   .4831731    .7720426      0       5
ins        3328   .3650841    .4815261      0       1

16% of sample are censored (since dambexp has mean 0.84).



Stata command tobit, ll(0) yields

. * Tobit on censored data
. tobit ambexp age female educ blhisp totchr ins, ll(0)

Tobit regression                                  Number of obs   =       3328
                                                  LR chi2(6)      =     694.07
                                                  Prob > chi2     =     0.0000
Log likelihood = -26359.424                       Pseudo R2       =     0.0130

      ambexp       Coef.   Std. Err.       t   P>|t|     [95% Conf. Interval]
         age    314.1479    42.63358    7.37   0.000     230.5572    397.7387
      female    684.9918    92.85445    7.38   0.000     502.9341    867.0495
        educ     70.8656    18.57361    3.82   0.000     34.44873    107.2825
      blhisp    -530.311    104.2667   -5.09   0.000    -734.7443   -325.8776
      totchr    1244.578    60.51364   20.57   0.000      1125.93    1363.226
         ins   -167.4714    96.46068   -1.74   0.083    -356.5998    21.65696
       _cons   -1882.591    317.4299   -5.93   0.000    -2504.969   -1260.214

      /sigma    2575.907    34.79296                     2507.689    2644.125

Obs. summary:        526  left-censored observations at ambexp<=0
                    2802     uncensored observations
                       0 right-censored observations

Question: How do we interpret the coefficients?

I Uncensored mean: $\partial E[y^* \mid x] / \partial x_j = \beta_j$
I Censored mean: $\partial E[y \mid x] / \partial x_j = \Phi(x'\alpha) \beta_j$, where $\alpha = \beta/\sigma$, after some algebra
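
As a worked illustration of the censored-mean marginal effect, the sketch below computes the scale factor Phi(x'beta/sigma) using the Tobit coefficients, the estimate of sigma and the sample means reported in the tables above. Evaluating at the sample means of the regressors is an assumption made here for illustration (averaging the factor over observations is another common choice), and the variable names are hypothetical.

```python
from statistics import NormalDist

# Tobit estimates copied from the output above.
coef = {
    "age": 314.1479, "female": 684.9918, "educ": 70.8656,
    "blhisp": -530.311, "totchr": 1244.578, "ins": -167.4714,
}
constant = -1882.591
sigma = 2575.907

# Sample means copied from the summary statistics table.
means = {
    "age": 4.056881, "female": 0.5084135, "educ": 13.40565,
    "blhisp": 0.3085938, "totchr": 0.4831731, "ins": 0.3650841,
}

# Index x'beta evaluated at the sample means (an illustrative choice).
index = constant + sum(coef[k] * means[k] for k in coef)

# Scale factor Phi(x'beta / sigma) that multiplies each beta_j.
scale = NormalDist().cdf(index / sigma)

# Marginal effects on the censored mean E[y | x].
marginal_effects = {k: scale * b for k, b in coef.items()}

print(round(index, 1), round(scale, 2))
for name, effect in marginal_effects.items():
    print(name, round(effect, 1))
```

Under this convention each censored-mean marginal effect is roughly two-thirds of the corresponding Tobit coefficient.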


The Tobit model is very fragile

I MLE is inconsistent if errors are nonnormal, and even if they are normal
but heteroskedastic.
I This has led to semiparametric estimators.
In particular the censored least absolute deviations (CLAD) estimator
I Basic idea is that censoring and truncation affect the mean, but not
the median (if less than 50% censored)
I LAD is the regression analog of the median estimate
I Censored LAD can work well, particularly for top-coded data.
Also when there is censoring from below at zero, the process for
zeroes can differ from that for nonzeroes.
I We consider this next.
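
The intuition behind CLAD, that censoring shifts the mean but leaves the median alone when fewer than half the observations are censored, can be illustrated without any regressors. This is only a sketch of the unconditional case, not the CLAD estimator itself; the distribution N[500, 1000^2], the sample size and the seed are hypothetical.

```python
import random
from statistics import mean, median

rng = random.Random(2013)   # arbitrary seed
latent = [rng.gauss(500.0, 1000.0) for _ in range(10001)]

# Censor from below at zero, as in the Tobit setup.
censored = [max(v, 0.0) for v in latent]

share_censored = sum(v <= 0 for v in latent) / len(latent)

print(round(share_censored, 2))
print(round(mean(latent), 1), round(mean(censored), 1))      # means differ
print(round(median(latent), 1), round(median(censored), 1))  # medians coincide
```

Because the censored share is below one half, the middle order statistic is an uncensored value and the sample median is unchanged, whereas the sample mean is pushed upward.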


5. Sample Selection Model: Overview

There are many generalizations of standard Tobit, often involving
sample selection or self-selection.
We consider the most common, Heckman's sample selection model
I Also called type 2 Tobit, Tobit with stochastic threshold, Tobit with
probit selection.
I For censoring below this is often more realistic than standard Tobit,
as it allows different equations for participation and the outcome.


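Sample Selection Model: Definition

Define two latent variables as follows:
\[ \text{Participation: } y_1^* = x_1'\beta_1 + \varepsilon_1 \]
\[ \text{Outcome: } y_2^* = x_2'\beta_2 + \varepsilon_2 \]
Neither $y_1^*$ nor $y_2^*$ is completely observed.
I Participation: We observe whether $y_1^*$ is positive or negative
\[ y_1 = \begin{cases} 1 & \text{if } y_1^* > 0 \\ 0 & \text{if } y_1^* \le 0. \end{cases} \]
I Outcome: $y_2^*$ is observed only for participants ($y_1^* > 0$)
\[ y_2 = \begin{cases} y_2^* & \text{if } y_1^* > 0 \\ 0 & \text{if } y_1^* \le 0. \end{cases} \]
MLE is used if error terms are specified to be joint normal
I $(\varepsilon_1, \varepsilon_2) \sim N[(0, 0), (\sigma_1^2 = 1, \sigma_{12}, \sigma_2^2)]$
I Fragile: e.g. inconsistent if $\varepsilon$ is nonnormal or is heteroskedastic.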

Sample Selection Model: Heckman 2-step estimator


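Assume instead that errors $(\varepsilon_1, \varepsilon_2)$ satisfy
\[ \varepsilon_2 = \delta \varepsilon_1 + v, \]
where $\varepsilon_1 \sim N[0, 1]$ and $v$ is independent of $\varepsilon_1$.
I This is implied by $(\varepsilon_1, \varepsilon_2)$ joint normal.
I But it is a weaker assumption.
Then $y_2 = x_2'\beta_2 + \varepsilon_2$ if $y_1^* > 0$ implies
\[
\begin{aligned}
E[y_2 \mid y_1^* > 0]
 &= x_2'\beta_2 + E[\varepsilon_2 \mid x_1'\beta_1 + \varepsilon_1 > 0] \\
 &= x_2'\beta_2 + E[(\delta \varepsilon_1 + v) \mid \varepsilon_1 > -x_1'\beta_1] \\
 &= x_2'\beta_2 + \delta E[\varepsilon_1 \mid \varepsilon_1 > -x_1'\beta_1] \\
 &= x_2'\beta_2 + \delta \lambda(x_1'\beta_1)
\end{aligned}
\]
where the third equality uses $v$ independent of $\varepsilon_1$ (with mean zero) and $\lambda(c) = \phi(c)/\Phi(c)$
is the inverse Mills ratio.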
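
The selection term can be checked by simulation. The sketch below draws errors satisfying the assumption that the outcome error equals delta times the participation error plus independent noise, keeps only participants, and compares the average outcome error with delta times the inverse Mills ratio. The index value 0.3, delta = -0.5, the noise scale 0.5, the number of draws and the seed are all hypothetical.

```python
import random
from statistics import NormalDist


def selection_term_formula(index, delta):
    """delta * lambda(index), with lambda(c) = phi(c) / Phi(c)."""
    nd = NormalDist()
    return delta * nd.pdf(index) / nd.cdf(index)


def selection_term_simulated(index, delta, n=200000, seed=7):
    """Mean of eps2 = delta * eps1 + v among draws with index + eps1 > 0."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        eps1 = rng.gauss(0.0, 1.0)   # participation error, N[0, 1]
        v = rng.gauss(0.0, 0.5)      # independent mean-zero noise (hypothetical scale)
        if index + eps1 > 0:         # participant: outcome is observed
            kept.append(delta * eps1 + v)
    return sum(kept) / len(kept)


index, delta = 0.3, -0.5
formula_value = selection_term_formula(index, delta)
simulated_value = selection_term_simulated(index, delta)
print(round(formula_value, 3), round(simulated_value, 3))
```

With a nonzero delta the mean of the outcome error among participants is not zero, which is why a regression on the selected sample that ignores this term is biased.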

For the observed outcomes:
\[ E[y_2 \mid y_1^* > 0] = x_2'\beta_2 + \delta \lambda(x_1'\beta_1). \]
I OLS of $y_2$ on $x_2$ only is inconsistent as regressor $\lambda(x_1'\beta_1)$ is omitted.
I Heckman included an estimate of $\lambda(x_1'\beta_1)$ as an additional regressor.
Heckman's two-step procedure:
I 1. Estimate $\beta_1$ by probit for $y_1^* > 0$ or $y_1^* \le 0$ with regressors $x_{1i}$.
I Calculate $\hat{\lambda}_i = \lambda(x_{1i}'\hat{\beta}_1) = \phi(x_{1i}'\hat{\beta}_1)/\Phi(x_{1i}'\hat{\beta}_1)$.
I 2. For observed $y_2$ estimate $\beta_2$ and $\delta$ in the OLS regression
\[ y_{2i} = x_{2i}'\beta_2 + \delta \hat{\lambda}_i + w_i. \]
I Need standard errors that correct for $w_i$ being heteroskedastic and $\hat{\lambda}_i$ being
estimated. Stata command heckman does this.


Exclusion restriction:
I desirable to include some regressors in participation equation (x1 ) that
can be excluded from the outcome equation (x2 )
I otherwise identification comes solely from nonlinearity.
Selection on observables only
I If $\mathrm{Cov}[\varepsilon_1, \varepsilon_2] = 0$ then there is no longer selection on
unobservables
I Model reduces to a two-part model
F Probit for whether y > 0
F Regular OLS for the positives.
F Can be reasonable for individual's hospital expenditure data.

Logs for the outcome


I Often the outcome is expenditure
I Then better to use a log model for the outcome
I But will then need to transform to levels for prediction.
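
A hedged illustration of the retransformation step, for the two-part case with no selection on unobservables. Suppose, as an assumption that the slides do not make, that for the positives $\ln y = x'\beta + u$ with $u \sim N[0, \sigma^2]$ independent of $x$. Then the prediction in levels is not simply $\exp(x'\beta)$:

```latex
\[
E[y \mid x, y > 0]
  = \exp(x'\beta)\, E\!\left[e^{u}\right]
  = \exp\!\left(x'\beta + \tfrac{1}{2}\sigma^{2}\right).
\]
```

Under this assumption, using $\exp(x'\hat{\beta})$ alone would underpredict the level of expenditure, and the adjustment depends on normality and homoskedasticity of $u$.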


Heckman 2-step: Data Example


Two-step estimates where the outcome is ln y.

. * Heckman 2-step without exclusion restrictions
. heckman lny $xlist, select(dy = $xlist) twostep

Heckman selection model -- two-step estimates   Number of obs      =      3328
(regression model with sample selection)        Censored obs       =       526
                                                Uncensored obs     =      2802

                                                Wald chi2(6)       =    189.46
                                                Prob > chi2        =    0.0000

                   Coef.   Std. Err.       z   P>|z|     [95% Conf. Interval]
lny
         age     .202124    .0242974    8.32   0.000     .1545019    .2497462
      female    .2891575     .073694    3.92   0.000     .1447199    .4335951
        educ    .0119928    .0116839    1.03   0.305    -.0109072    .0348928
      blhisp   -.1810582    .0658522   -2.75   0.006    -.3101261   -.0519904
      totchr    .4983315    .0494699   10.07   0.000     .4013724    .5952907
         ins   -.0474019    .0531541   -0.89   0.373     -.151582    .0567782
       _cons    5.302572    .2941363   18.03   0.000     4.726076    5.879069

dy
         age     .097315    .0270155    3.60   0.000     .0443656    .1502645
      female    .6442089    .0601499   10.71   0.000     .5263172    .7621006
        educ    .0701674    .0113435    6.19   0.000     .0479345    .0924003
      blhisp   -.3744867    .0617541   -6.06   0.000    -.4955224   -.2534509
      totchr    .7935208    .0711156   11.16   0.000     .6541367    .9329048
         ins    .1812415    .0625916    2.90   0.004     .0585642    .3039187
       _cons   -.7177087    .1924667   -3.73   0.000    -1.094937   -.3404809

mills
      lambda   -.4801696    .2906565   -1.65   0.099    -1.049846    .0895067

         rho    -0.37130
       sigma   1.2932083
      lambda   -.4801696    .2906565
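
The coefficient labelled lambda in this output plays the role of delta in the two-step derivation. Since the participation error has unit variance, delta equals the covariance of the two errors, which is rho times sigma. A quick arithmetic check using the numbers printed above (the variable names are illustrative):

```python
# Values copied from the two-step output above.
rho = -0.37130
sigma = 1.2932083
lambda_coef = -0.4801696
lambda_se = 0.2906565

# delta = Cov(eps1, eps2) = rho * sigma when Var(eps1) = 1.
implied_lambda = rho * sigma

# z statistic for the test that the selection term has a zero coefficient.
z_stat = lambda_coef / lambda_se

print(round(implied_lambda, 4), round(z_stat, 2))
```

The implied value matches the reported lambda up to rounding of rho, and the z statistic reproduces the reported -1.65, so there is only weak evidence of selection on unobservables in this specification.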


Stata commands

Command             Model
tobit               Tobit MLE (censored)
truncreg            Tobit MLE (truncated)
cnreg               Tobit (varying known threshold)
intreg              Interval normal data (e.g. $1-$100, $101-$200, ...)
heckman, mle        Sample selection MLE
heckman, twostep    Sample selection two step


6. Treatment effects models

What is the effect of a binary treatment?
Outcome y (e.g. earnings) depends on whether or not the individual gets
treatment d (e.g. training).
Model
\[ \text{Treatment: } d_i = 0 \text{ or } d_i = 1 \]
\[ \text{Outcome: } y_i = \begin{cases} y_{1i} & \text{if } d_i = 1 \\ y_{0i} & \text{if } d_i = 0 \end{cases} \]
Problem: We want the treatment effect $y_{1i} - y_{0i}$.
I But we observe only one of $y_{1i}$ and $y_{0i}$.
I And people self-select into training
F not randomized like an experiment.

Solutions: many. Key distinction between
I selection on observables only (just $x$'s)
I selection on observables and unobservables ($x$'s and $\varepsilon$'s)


Selection on observables only

A. Naive: Compare means
I use $\bar{y}_1 - \bar{y}_0$
I same as $\hat{\alpha}$ in OLS of $y_i = \alpha d_i + u_i$
I consistent if $\mathrm{Cov}(u_i, d_i) = 0$
I method for a randomized experiment, otherwise likely invalid.
B. Control function
I add $x_i$'s to control for $d_i$ being chosen
I use $\hat{\alpha}$ in OLS of $y_i = \alpha d_i + x_i'\beta + u_i$
I consistent if $\mathrm{Cov}(u_i, d_i \mid x_i) = 0$
C. Propensity score matching
I propensity score $p = \Pr[\text{treated} \mid x] = \Pr[d = 1 \mid x]$
I calculate using a very flexible logit model (interactions ...)
I compare $y_1$'s (treated) with $y_0$'s (untreated) for those with similar $p$.
I practical variation of matching those with similar $x$'s.
D. Sharp regression discontinuity design
I suppose $y_i = f(s_i) + \alpha d_i + x_i'\beta + u_i$ and $d_i = 1(s_i > s_i^*)$.
I compare $y_i$ for those with $s_i$ either side of threshold $s_i^*$
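
For method A, the equivalence between the difference in means and the OLS coefficient on the treatment dummy can be verified directly. The sketch assumes the regression includes an intercept, and the seven observations (earnings in thousands, training indicator) are hypothetical.

```python
from statistics import mean


def ols_slope_with_intercept(d, y):
    """Slope on d from OLS of y on a constant and d."""
    dbar, ybar = mean(d), mean(y)
    sdy = sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y))
    sdd = sum((di - dbar) ** 2 for di in d)
    return sdy / sdd


# Hypothetical data: d = 1 if trained, y = earnings in $1000s.
d = [1, 1, 1, 0, 0, 0, 0]
y = [32.0, 41.0, 38.0, 30.0, 27.0, 35.0, 28.0]

treated = [yi for yi, di in zip(y, d) if di == 1]
untreated = [yi for yi, di in zip(y, d) if di == 0]

diff_means = mean(treated) - mean(untreated)
alpha_hat = ols_slope_with_intercept(d, y)

print(diff_means, alpha_hat)
```

The two numbers coincide by construction. Whether they estimate a causal effect still depends on the treatment being uncorrelated with the error, which self-selection typically violates.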

Selection on observables and unobservables

A. Panel data
I $y_{it} = \alpha d_{it} + x_{it}'\beta + v_i + \varepsilon_{it}$
I first difference (or mean difference) gets rid of $v_i$
F OLS on $\Delta y_{it} = \alpha \Delta d_{it} + \Delta x_{it}'\beta + \Delta \varepsilon_{it}$
I consistent if $\mathrm{Cov}(\varepsilon_{it}, d_{it} \mid x_{it}) = 0$ but allows $\mathrm{Cov}(v_i, d_{it} \mid x_{it}) \neq 0$
F okay if treatment correlated only with time invariant part of the error

B. Difference in differences
I variation of preceding that does not require panel data.
I suppose treatment occurs only in second time period (not in first)
F use $\hat{\alpha} = \Delta \bar{y}_{treated} - \Delta \bar{y}_{untreated} = (\bar{y}_{1,tr} - \bar{y}_{0,tr}) - (\bar{y}_{1,untr} - \bar{y}_{0,untr})$.
F more generally OLS on $\Delta y_i = \alpha d_i + \Delta x_i'\beta + u_i$
F requires common time trend for treated and untreated groups
I Extends to more time periods (model in levels with $d_{it}$)
I Extends to contrasts other than in time e.g. male/female
I Extension is event history analysis.
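
The two-period difference-in-differences estimate is just a contrast of four group means. A minimal sketch with hypothetical outcomes for treated and untreated groups in period 0 (before) and period 1 (after); the function name and data are illustrative.

```python
from statistics import mean


def diff_in_diff(y1_tr, y0_tr, y1_untr, y0_untr):
    """(change in treated mean) minus (change in untreated mean)."""
    change_treated = mean(y1_tr) - mean(y0_tr)
    change_untreated = mean(y1_untr) - mean(y0_untr)
    return change_treated - change_untreated


# Hypothetical outcomes by group and period.
y0_tr = [20.0, 22.0, 24.0]      # treated group, before treatment
y1_tr = [27.0, 29.0, 31.0]      # treated group, after treatment
y0_untr = [18.0, 20.0, 22.0]    # untreated group, first period
y1_untr = [21.0, 23.0, 25.0]    # untreated group, second period

alpha_hat = diff_in_diff(y1_tr, y0_tr, y1_untr, y0_untr)
print(alpha_hat)
```

In this made-up example the treated mean rises by 7 and the untreated mean by 3, so the estimate attributes 4 to treatment, which is valid only under the common time trend requirement listed above.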

C. Instrumental variables
I IV estimation with instrument $z_i$ in $y_i = \alpha d_i + x_i'\beta + u_i$
I consistent if $\mathrm{Cov}(u_i, z_i \mid x_i) = 0$
D. Fuzzy regression discontinuity design
I in fuzzy design not everyone with $s_i > s_i^*$ gets the treatment.
I this introduces a role for unobservables.
E. Parametric model, e.g. Roy model:
I introduce latent variables $d_i^*, y_{1i}^*, y_{0i}^*$ for $d_i, y_{1i}, y_{0i}$.
I then
\[
\begin{aligned}
E[y_{1i}] &= E[y_{1i}^* \mid d_i = 1] = E[y_{1i}^* \mid d_i^* > 0] \\
          &= E[x_{1i}'\beta + \varepsilon_{1i} \mid z_i'\gamma + v_i > 0] = x_{1i}'\beta + E[\varepsilon_{1i} \mid v_i > -z_i'\gamma]
\end{aligned}
\]
I so $E[y_{1i}] = x_{1i}'\beta + \delta_1 \lambda(z_i'\gamma)$ where $\lambda(\cdot)$ is the inverse Mills ratio
if $\varepsilon_{1i} = \delta_1 v_i + \xi_i$, $v_i \sim N[0, 1]$, $\xi_i$ independent.
F. LATE (local average treatment effects)
I allows $\alpha$ to vary with $i$ and applies to many estimators.
I for example consider IV interpreted as a local effect
F e.g. in earnings-education regression with instrument law change that
increased school leaving age, the earnings effect is for those with low
levels of education.
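
The local interpretation of IV can be illustrated with a simulation in which treatment effects differ across individuals. Everything in this sketch is hypothetical: a binary instrument, three behavioural types (compliers, always-takers, never-takers) with shares 0.5, 0.25 and 0.25 and effects 2, 5 and 1, and the simple Wald ratio standing in for IV with no other regressors.

```python
import random


def simulate_wald(n=100000, seed=53):
    """Wald estimate: (difference in mean y by z) / (difference in mean d by z)."""
    rng = random.Random(seed)
    totals = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}   # per z: sum of y, sum of d, count
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0          # randomly assigned instrument
        u = rng.random()
        if u < 0.5:                                 # complier: treated only if z = 1
            d, effect = z, 2.0
        elif u < 0.75:                              # always-taker: treated regardless
            d, effect = 1, 5.0
        else:                                       # never-taker: never treated
            d, effect = 0, 1.0
        y = 10.0 + effect * d + rng.gauss(0.0, 1.0)
        totals[z][0] += y
        totals[z][1] += d
        totals[z][2] += 1
    ybar = {z: t[0] / t[2] for z, t in totals.items()}
    dbar = {z: t[1] / t[2] for z, t in totals.items()}
    return (ybar[1] - ybar[0]) / (dbar[1] - dbar[0])


wald = simulate_wald()
population_average_effect = 0.5 * 2.0 + 0.25 * 5.0 + 0.25 * 1.0

print(round(wald, 2), population_average_effect)
```

In this setup the Wald ratio settles near the compliers' effect rather than the population average effect, because only compliers change treatment status when the instrument changes.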