
Chapter 2: Discrete Choice

Joan Llull

Quantitative Statistical Methods II - Part II


Barcelona School of Economics
Introduction

In this chapter we analyze some models for discrete outcomes: models in which one of $m$ mutually exclusive categories is selected.
This section: binary outcomes.

For notational convenience: $y = 1\{A \text{ is selected}\}$:


It allows us to write the likelihood in a very compact way.
What happens with $N^{-1}\sum_{i=1}^{N} y_i$? Why is it important?



The linear probability model
Simple approach: linear regression model.
OLS regression of y on x provides consistent estimates of sample-average marginal effects ⇒ nice exploration tool.

Becoming popular in the treatment effects literature.


Two important drawbacks:

Predicted probabilities $\hat{p}(x) = x'\hat{\beta}$ are not bounded between 0 and 1.

The error term is heteroscedastic and has discrete support (given x). See the sketch below.
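A minimal simulation can make both drawbacks concrete. The sketch below fits the LPM by OLS and checks how many fitted probabilities leave the unit interval; all names and the data-generating process are illustrative assumptions, not from the course.

```python
# Sketch: linear probability model on simulated data (illustrative DGP).
import numpy as np

rng = np.random.default_rng(0)
N = 1000
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])              # constant + regressor
p_true = 1 / (1 + np.exp(-(-0.5 + 2.0 * x)))      # a logit DGP
y = (rng.uniform(size=N) < p_true).astype(float)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS coefficients
p_hat = X @ beta_ols                              # fitted "probabilities"

# Drawback 1: predictions can leave [0, 1].
print("share of p_hat outside [0, 1]:", np.mean((p_hat < 0) | (p_hat > 1)))
# Drawback 2: Var(u|x) = p(x)(1 - p(x)) is heteroscedastic by construction,
# so inference should use heteroscedasticity-robust standard errors.
```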




The General Binary Outcome Model
The conditional probability of choosing A given x is $p(x) \equiv \Pr[y = 1|x] = F(x'\beta)$.

These are single-index models.


This general notation is useful to derive general results that are common across
models.
This model includes the linear model, the Probit, and the Logit as special cases:

Linear model: $F(x'\beta) = x'\beta$.

Logit: $F(x'\beta) = \Lambda(x'\beta) = \dfrac{e^{x'\beta}}{1 + e^{x'\beta}}$.

Probit: $F(x'\beta) = \Phi(x'\beta) = \displaystyle\int_{-\infty}^{x'\beta} \phi(z)\,dz$.



Maximum Likelihood Estimation
Given the binomial nature of the data, we know the distribution of the outcome: it is Bernoulli,

$$g(y|x) = p^{y}(1 - p)^{1-y} = \begin{cases} p & \text{if } y = 1, \\ 1 - p & \text{if } y = 0, \end{cases}$$

where $p = F(x'\beta)$.

Therefore, the conditional log-likelihood is:

$$\mathcal{L}_N(\beta) = \sum_{i=1}^{N} \left\{ y_i \ln F(x_i'\beta) + (1 - y_i) \ln\left[1 - F(x_i'\beta)\right] \right\}.$$

And the first order condition is:

$$\frac{\partial \mathcal{L}_N}{\partial \beta} \equiv \sum_{i=1}^{N} \frac{y_i - F(x_i'\hat{\beta})}{F(x_i'\hat{\beta})\left(1 - F(x_i'\hat{\beta})\right)} f(x_i'\hat{\beta})\, x_i = 0,$$

where $f(z) \equiv \partial F(z)/\partial z$.

No explicit solution. Newton-Raphson converges quickly because the log-likelihood is globally concave for the Probit and the Logit (see the sketch below).
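As an illustration of why Newton-Raphson works well here, a minimal sketch for the Logit, where $F = \Lambda$ and the score and Hessian have closed forms; the function name and the simulated data are illustrative assumptions.

```python
# Sketch: Newton-Raphson for the Logit ML first order condition.
import numpy as np

def logit_mle(X, y, tol=1e-10, max_iter=100):
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1 / (1 + np.exp(-(X @ beta)))           # Lambda(x_i' beta)
        grad = X.T @ (y - p)                        # score: sum (y_i - p_i) x_i
        hess = -(X * (p * (1 - p))[:, None]).T @ X  # -sum p_i(1-p_i) x_i x_i'
        step = np.linalg.solve(hess, grad)          # Newton direction
        beta = beta - step
        if np.max(np.abs(step)) < tol:              # converged
            return beta
    return beta

# Illustrative check on simulated data:
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(5000), rng.normal(size=5000)])
beta0 = np.array([-0.5, 1.0])
y = (rng.uniform(size=5000) < 1 / (1 + np.exp(-(X @ beta0)))).astype(float)
print(logit_mle(X, y))  # close to beta0; global concavity makes this reliable
```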
Consistency
We know that the distribution of y is Bernoulli ⇒ consistency additionally requires $p = F(x'\beta_0)$.

The true parameter vector is the solution of:

$$\max_{\beta} \; \mathrm{E}\left[ y \ln F(x'\beta) + (1 - y) \ln\left(1 - F(x'\beta)\right) \right].$$

The first order condition is:

$$\mathrm{E}\left[ \frac{y - F(x'\beta)}{F(x'\beta)\left(1 - F(x'\beta)\right)} f(x'\beta)\, x \right]_{p = F(x'\beta_0)} = 0.$$

It holds at $\beta = \beta_0$: since $\mathrm{E}[y|x] = F(x'\beta_0)$, the term in brackets has conditional mean zero.



Asymptotic distribution

From Chapter 1: $\hat{\beta} \rightarrow_d N(\beta_0, \Omega_0/N)$.

We may use the information matrix equality to get $\Omega_0$:

$$\Omega_0 = \left(-\mathrm{E}\left[\frac{\partial^2 \mathcal{L}}{\partial \beta \partial \beta'}\right]\right)^{-1} = \left(\mathrm{E}\left[\frac{\partial \mathcal{L}}{\partial \beta}\frac{\partial \mathcal{L}}{\partial \beta'}\right]\right)^{-1} = \left(\mathrm{E}\left[\frac{1}{F(x'\beta)\left(1 - F(x'\beta)\right)} f(x'\beta)^2\, x x'\right]\right)^{-1}.$$

Note that this is of the form $\mathrm{E}[\omega\, x x']^{-1}$ (see the sketch below).
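A minimal sketch of the plug-in variance estimator built from this $\mathrm{E}[\omega\, xx']^{-1}$ form, with the CDF F and density f passed in as callables; that design and the function names are illustrative assumptions.

```python
# Sketch: estimated variance [sum_i w_i x_i x_i']^{-1},
# with w_i = f(x_i'b)^2 / {F(x_i'b)[1 - F(x_i'b)]}.
import numpy as np
from scipy.stats import norm
from scipy.special import expit

def binary_avar(X, beta_hat, F, f):
    z = X @ beta_hat
    w = f(z) ** 2 / (F(z) * (1 - F(z)))   # the weight omega_i
    info = (X * w[:, None]).T @ X         # estimated information matrix
    return np.linalg.inv(info)            # variance of beta_hat

# Probit: binary_avar(X, b, norm.cdf, norm.pdf)
# Logit:  binary_avar(X, b, expit, lambda z: expit(z) * (1 - expit(z)))
# Standard errors are np.sqrt(np.diag(...)).
```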



Marginal effects

Marginal effects are given by:

$$\frac{\partial \Pr[y = 1|x]}{\partial x_k} = f(x'\beta)\beta_k.$$

In the linear probability model, $f(x'\beta) = 1$.

In non-linear models, marginal effects depend on x (we can compute several alternatives; see the sketch below).

Coefficients are still informative about the sign of the marginal effect.

Interesting property: ratios of marginal effects are constant:

$$\frac{\partial \Pr[y = 1|x]/\partial x_k}{\partial \Pr[y = 1|x]/\partial x_l} = \frac{f(x'\beta)\beta_k}{f(x'\beta)\beta_l} = \frac{\beta_k}{\beta_l}.$$

In the case of a dichotomous regressor, the marginal effect is:

$$F(x_{-k}'\beta_{-k} + \beta_k) - F(x_{-k}'\beta_{-k}).$$
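A minimal sketch of the usual summaries for a non-linear model, using the Logit for concreteness: the average marginal effect, the effect at the mean of x, and the discrete effect of a dummy. Function and variable names are illustrative assumptions.

```python
# Sketch: marginal-effect summaries for the Logit, f(z) = Lambda(z)(1 - Lambda(z)).
import numpy as np
from scipy.special import expit  # Lambda(z)

def logit_f(z):
    lam = expit(z)
    return lam * (1 - lam)

def marginal_effects(X, beta_hat):
    ame = logit_f(X @ beta_hat).mean() * beta_hat               # averaged over sample
    me_at_mean = logit_f(X.mean(axis=0) @ beta_hat) * beta_hat  # at mean covariates
    return ame, me_at_mean

def dummy_effect(X_mk, beta_mk, beta_k):
    """F(x_{-k}'b_{-k} + b_k) - F(x_{-k}'b_{-k}), averaged over the sample."""
    z = X_mk @ beta_mk
    return (expit(z + beta_k) - expit(z)).mean()
```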



The Logit Model

The Logit model is given by:

$$F(x'\beta) = \Lambda(x'\beta) = \frac{e^{x'\beta}}{1 + e^{x'\beta}}.$$

Nice property of the logistic function: $\partial \Lambda(z)/\partial z = \Lambda(z)\left(1 - \Lambda(z)\right)$.

Therefore, the first order condition of the ML estimator reduces to:

$$\sum_{i=1}^{N} \left( y_i - \Lambda(x_i'\hat{\beta}) \right) x_i = 0.$$

And the asymptotic variance to:

$$\Omega_0 = \mathrm{E}\left[ \Lambda(x'\beta)\left(1 - \Lambda(x'\beta)\right) x x' \right]^{-1}.$$

Marginal effects are:

$$\frac{\partial \Pr[y = 1|x]}{\partial x_k} = \Lambda(x'\beta)\left(1 - \Lambda(x'\beta)\right)\beta_k.$$

And another interesting property: $\ln\dfrac{p}{1 - p} = x'\beta$, so each coefficient has a constant effect on the log-odds (see the sketch below).
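A minimal numerical cross-check of these Logit results, assuming the statsmodels library is available (its Logit routine solves the same first order condition); the simulated data are illustrative.

```python
# Sketch: Logit fit; exp(beta_k) is the multiplicative effect on the odds p/(1-p).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
X = sm.add_constant(x)
y = (rng.uniform(size=2000) < 1 / (1 + np.exp(-(0.3 + 0.8 * x)))).astype(float)

res = sm.Logit(y, X).fit(disp=0)
print(res.params)                    # close to (0.3, 0.8)
print(np.exp(res.params[1]))         # odds ratio for a unit change in x
print(np.abs(X.T @ (y - res.predict(X))).max())  # FOC sum (y_i - Lambda_i) x_i ~ 0
```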


The Probit Model

The Probit model is given by:

$$F(x'\beta) = \Phi(x'\beta) = \int_{-\infty}^{x'\beta} \phi(z)\,dz.$$

Therefore, the first order condition of the ML estimator is:

$$\sum_{i=1}^{N} \frac{y_i - \Phi(x_i'\hat{\beta})}{\Phi(x_i'\hat{\beta})\left(1 - \Phi(x_i'\hat{\beta})\right)} \phi(x_i'\hat{\beta})\, x_i = 0.$$

And the asymptotic variance is:

$$\Omega_0 = \mathrm{E}\left[ \frac{\phi(x'\beta)^2}{\Phi(x'\beta)\left(1 - \Phi(x'\beta)\right)} x x' \right]^{-1}.$$

Marginal effects are:

$$\frac{\partial \Pr[y = 1|x]}{\partial x_k} = \phi(x'\beta)\beta_k.$$

Since the first order condition has no explicit solution, estimation uses a numerical optimizer (see the sketch below).
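A minimal sketch of Probit ML via a generic optimizer; the simulated data and names are illustrative assumptions.

```python
# Sketch: Probit ML by minimizing the negative log-likelihood with BFGS.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_negll(beta, X, y):
    z = X @ beta
    # log(1 - Phi(z)) = log Phi(-z); logcdf is numerically stable in the tails
    return -np.sum(y * norm.logcdf(z) + (1 - y) * norm.logcdf(-z))

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(2000), rng.normal(size=2000)])
beta0 = np.array([0.2, 0.7])
y = (rng.uniform(size=2000) < norm.cdf(X @ beta0)).astype(float)

res = minimize(probit_negll, np.zeros(2), args=(X, y), method="BFGS")
print(res.x)                                # close to beta0
print(norm.pdf(X @ res.x).mean() * res.x)   # average marginal effects phi(x'b) b
```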



Latent Variable Representation
One way to give a more structural interpretation to the model is in terms of a
latent measure of utility.

A latent variable is a variable that is not completely observed.


Two alternative ways in this context:
Index function model: a threshold of the latent variable determines the observed decision.
Random utility model: the decision is based on the comparison of the utilities
obtained from each alternative.
Index Function Model
Let $y^*$ be the latent variable of interest, such that:

$$y^* = x'\beta + u, \qquad u \sim F(\cdot).$$

We only observe:

$$y = \begin{cases} 1 & \text{if } y^* > 0, \\ 0 & \text{if } y^* \leq 0. \end{cases}$$

The probability of observing y = 1 is:

$$\Pr[y = 1|x] = \Pr[x'\beta + u > 0] = \Pr[u > -x'\beta] = F(x'\beta),$$

where the last equality uses the symmetry of $f(\cdot)$.

This model delivers the Logit if $F(\cdot) = \Lambda(\cdot)$ and the Probit if $F(\cdot) = \Phi(\cdot)$.

The threshold is normalized to 0 because it is not separately identified from the constant.

Similarly, all parameters are identified up to scale, since $\Pr[u > -x'\beta] = \Pr[ua > -x'\beta a]$ for any $a > 0$ ⇒ we have to impose restrictions on the variance of u (see the sketch below).
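A minimal simulation of the scale normalization (the numbers are illustrative): multiplying the coefficients and the standard deviation of u by the same constant leaves the choice probabilities, and hence the distribution of the observed data, unchanged.

```python
# Sketch: (a*beta, a*sd(u)) generates the same Pr[y = 1|x] for every a > 0.
import numpy as np

rng = np.random.default_rng(0)
index = 0.8                          # a hypothetical value of x'beta
for a in [1.0, 2.0, 10.0]:
    u = a * rng.normal(size=10**6)   # u ~ N(0, a^2)
    y = (a * index + u > 0)          # latent index scaled by the same a
    print(a, y.mean())               # ~ Phi(0.8) = 0.788 in every case
```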


Random Utility Model
Consider the utility of the two alternatives:

$$U_0 = V_0 + \varepsilon_0, \qquad U_1 = V_1 + \varepsilon_1.$$

We only observe $y = 1$ if $U_1 > U_0$ and $y = 0$ otherwise.

The probability of observing y = 1 is:

$$\Pr[y = 1|x] = \Pr[U_1 > U_0|x] = \Pr[\varepsilon_0 - \varepsilon_1 < V_1 - V_0|x] = F(V_1 - V_0).$$


We typically express $V_1 - V_0$ as a single index:

$V_1 = x'\beta_1$ and $V_0 = x'\beta_0$ ⇒ $V_1 - V_0 = x'(\beta_1 - \beta_0)$.

$V_1 = w'\beta_1$ and $V_0 = z'\beta_0$ ⇒ $V_1 - V_0 = x'(\beta_1 - \beta_0)$ with some $\beta_{jk} = 0$.

$V_j = z_j'\alpha + x'\beta_j$ for $j = 0, 1$ ⇒ $V_1 - V_0 = (z_1 - z_0)'\alpha + x'(\beta_1 - \beta_0)$.

Different distributional assumptions deliver different models (see the sketch below):

$\varepsilon_1, \varepsilon_0 \sim$ i.i.d. $N$ ⇒ $\varepsilon_0 - \varepsilon_1 \sim N$; the variance is not identified.

$f(\varepsilon_j) = e^{-\varepsilon_j}\exp\{-e^{-\varepsilon_j}\}$, $j = 0, 1$ (i.e., Type I EV) ⇒ $\varepsilon_0 - \varepsilon_1 \sim \Lambda(\cdot)$.

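A minimal simulation of the last claim (illustrative draws): the difference of two i.i.d. Type I extreme value (Gumbel) errors is logistic, which is exactly what delivers the Logit.

```python
# Sketch: Pr[eps0 - eps1 < v] should match Lambda(v) for Gumbel draws.
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
eps0 = rng.gumbel(size=10**6)
eps1 = rng.gumbel(size=10**6)
for v in [-1.0, 0.0, 1.5]:           # v plays the role of V1 - V0
    print(v, np.mean(eps0 - eps1 < v), expit(v))  # empirical vs Lambda(v)
```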


Endogenous Variables



Endogeneity

When the number of endogenous regressors is small enough, we proceed with a Multivariate Probit model.

We discuss two cases:


Continuous endogenous regressor.
Discrete endogenous regressor.

When Probit is infeasible, we may use GMM.


Continuous endogenous variable
Consider the model:

$$y_1 = 1\{x'\alpha + \beta y_2 + \varepsilon \geq 0\},$$
$$y_2 = z'\gamma + \nu, \qquad z = \begin{pmatrix} x \\ z_2 \end{pmatrix}, \qquad \begin{pmatrix} \varepsilon \\ \nu \end{pmatrix} \bigg|\, z \sim N\left(0, \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^2 \end{pmatrix}\right).$$

Endogeneity arises when $\rho \neq 0$.
As in Exercise 1, we can factorize the conditional likelihood: $f(y_1|z, y_2)\, f(y_2|z)$.

Then, given $\varepsilon\,|\,z, \nu \sim N\left(\frac{\rho}{\sigma}\nu,\; 1 - \rho^2\right)$, the log-likelihood is:

$$\mathcal{L}_N(\alpha, \beta, \rho, \sigma, \gamma) \propto \sum_{i=1}^{N} \left\{ y_{1i} \ln \Phi(a_i) + (1 - y_{1i}) \ln\left[1 - \Phi(a_i)\right] - \ln \sigma - \frac{(y_{2i} - z_i'\gamma)^2}{2\sigma^2} \right\},$$

where $a_i = \dfrac{x_i'\alpha + \beta y_{2i} + \frac{\rho}{\sigma}(y_{2i} - z_i'\gamma)}{\sqrt{1 - \rho^2}}$.

We can estimate it by FIML or LIML.
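A minimal sketch of the FIML objective above, coded directly from the formula; the parameter packing, the tanh/exp reparametrizations that keep $\rho \in (-1, 1)$ and $\sigma > 0$, and all names are illustrative assumptions.

```python
# Sketch: negative FIML log-likelihood for the continuous endogenous regressor.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def negll(theta, y1, y2, X, Z):
    kx, kz = X.shape[1], Z.shape[1]
    alpha, beta = theta[:kx], theta[kx]
    gamma = theta[kx + 1:kx + 1 + kz]
    rho, sigma = np.tanh(theta[-2]), np.exp(theta[-1])
    v = y2 - Z @ gamma                                 # first-stage residual
    a = (X @ alpha + beta * y2 + (rho / sigma) * v) / np.sqrt(1 - rho**2)
    ll = (y1 * norm.logcdf(a) + (1 - y1) * norm.logcdf(-a)
          - np.log(sigma) - v**2 / (2 * sigma**2))
    return -ll.sum()

# Estimate with, e.g.:
# res = minimize(negll, theta_start, args=(y1, y2, X, Z), method="BFGS")
```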


Discrete endogenous variable
Consider the model:

$$y_1 = 1\{x'\alpha + \beta y_2 + \varepsilon \geq 0\},$$
$$y_2 = 1\{z'\gamma + \nu \geq 0\}, \qquad z = \begin{pmatrix} x \\ z_2 \end{pmatrix}, \qquad \begin{pmatrix} \varepsilon \\ \nu \end{pmatrix} \bigg|\, z \sim N\left(0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right).$$

Endogeneity arises when $\rho \neq 0$. This is the bivariate binomial Probit.


There is no LIML procedure here.
The conditional log-likelihood is:

$$\mathcal{L}_N(\alpha, \beta, \gamma, \rho) = \sum_{i=1}^{N} \left\{ y_{1i} y_{2i} \ln P_{11i} + (1 - y_{1i}) y_{2i} \ln P_{01i} + y_{1i}(1 - y_{2i}) \ln P_{10i} + (1 - y_{1i})(1 - y_{2i}) \ln P_{00i} \right\},$$

where:

$P_{00} \equiv \Pr[y_1 = 0, y_2 = 0|z] = \Phi_2(-x'\alpha, -z'\gamma; \rho)$.

$P_{10} \equiv \Pr[y_1 = 1, y_2 = 0|z] = \Phi(-z'\gamma) - P_{00}$.

$P_{01} \equiv \Pr[y_1 = 0, y_2 = 1|z] = \Phi(-x'\alpha - \beta) - \Phi_2(-x'\alpha - \beta, -z'\gamma; \rho)$.

$P_{11} \equiv \Pr[y_1 = 1, y_2 = 1|z] = 1 - P_{00} - P_{10} - P_{01}$ (see the sketch below).
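A minimal sketch of these four cell probabilities, with the bivariate normal CDF $\Phi_2$ taken from scipy; scalar indices are used for clarity, and all names are illustrative assumptions.

```python
# Sketch: bivariate Probit cell probabilities; they sum to 1 by construction.
import numpy as np
from scipy.stats import norm, multivariate_normal

def cell_probs(xa, zg, beta, rho):
    """xa = x'alpha, zg = z'gamma (scalars here)."""
    def Phi2(a, b):
        cov = [[1.0, rho], [rho, 1.0]]
        return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([a, b])
    P00 = Phi2(-xa, -zg)
    P10 = norm.cdf(-zg) - P00
    P01 = norm.cdf(-xa - beta) - Phi2(-xa - beta, -zg)
    P11 = 1 - P00 - P10 - P01
    return P00, P01, P10, P11

print(sum(cell_probs(0.3, -0.2, 0.5, 0.4)))  # = 1.0
```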


Moment Estimation

When ML is infeasible, we rely on moment-based estimation.


If the number of external instruments equals the number of endogenous variables (the problem is just identified), the GMM estimator solves:

$$\sum_{i=1}^{N} (y_i - p_i)\, z_i = 0.$$

If the problem is overidentified, we minimize a quadratic form in this expression (see the sketch below).
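A minimal sketch of both cases, assuming a Probit-type index $p_i = \Phi(x_i'\theta)$ and instruments $z_i$; the specification and names are illustrative assumptions.

```python
# Sketch: just-identified case solved as a root-finding problem; the
# overidentified case minimizes a quadratic form in the averaged moments.
import numpy as np
from scipy.optimize import fsolve, minimize
from scipy.stats import norm

def g(theta, y, X, Z):
    return Z.T @ (y - norm.cdf(X @ theta)) / len(y)  # sample moments

# Just identified (dim z = dim theta): solve g(theta) = 0 directly.
# theta_hat = fsolve(g, theta_start, args=(y, X, Z))

# Overidentified: minimize g' W g for a weight matrix W (identity here).
# obj = lambda t: g(t, y, X, Z) @ g(t, y, X, Z)
# theta_hat = minimize(obj, theta_start, method="BFGS").x
```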

