Chapter 17 of ECO 344 covers various econometric models including Logit, Probit, Tobit, and Poisson regression models, focusing on their specifications, estimations, and interpretations. It discusses the limitations of the Linear Probability Model (LPM) and emphasizes the importance of using specialized models for binary and count data. Additionally, the chapter addresses sample selection issues and introduces the Heckman two-step correction method for handling non-random sample data.


ECO 344 - Applied Econometrics II

Chapter 17

Muhammad Salman Khalid

School of Economics & Social Sciences

May 5, 2025

Salman (IBA) Chapter 17 1 / 26


Outline

1 The Logit & Probit Model

2 The Tobit Model

3 The Poisson Regression Model

4 Sample Selection Model



The Logit & Probit Model



Linear Probability Model (LPM)

Definition: OLS regression applied to a binary (0/1) dependent
variable. We model P(Y = 1 | X) = E(Y | X) = Xβ (linear in
parameters).
Interpretation: Coefficient βk represents the change in the
probability of Y = 1 for a one-unit increase in xk, holding other
variables fixed.
Example: βk = 0.10 means a 1-unit increase in xk is associated with a
0.10 (10 percentage point) increase in the probability that Y = 1.
Estimated by OLS. Easy to implement, and the coefficients are easy to
interpret as approximate probability changes.



LPM: Issues and Limitations
Range Problem: X β might predict probabilities outside the logical
[0,1] range (e.g., negative probabilities or above 100%). This is a
conceptual flaw of the linear model for probabilities.
Heteroskedasticity: The error term in LPM has variance
Var(Y | X ) = P(Y = 1 | X )[1 − P(Y = 1 | X )] = X β(1 − X β),
which depends on X . Thus, LPM errors are inherently
heteroskedastic.
Consequences: OLS remains unbiased (if E [Y |X ] truly linear) but no
longer BLUE; standard errors need correction. Always use
heteroskedasticity-robust SE for hypothesis testing in LPM.
Nonlinearity: Probability effect may not be constant. LPM imposes
a constant marginal effect of x on P(Y = 1). In reality, impact of x
on probability might diminish as it approaches 0 or 1.
Bottom Line: LPM is a simple approximation. For binary outcomes,
specialized nonlinear models (Logit/Probit) ensure valid probability
predictions and often fit better.
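To make the range problem concrete, here is a minimal Python sketch with made-up LPM coefficients (b0 = 0.20, b1 = 0.15, not estimates from any real data): the fitted line leaves [0, 1] for moderately large or small x.

```python
# Hypothetical LPM fit: P(Y = 1 | x) = 0.20 + 0.15*x (illustrative coefficients only)
b0, b1 = 0.20, 0.15

def lpm_prob(x):
    """Fitted 'probability' from the linear probability model."""
    return b0 + b1 * x

print(lpm_prob(2))   # ~0.50, a sensible probability
print(lpm_prob(6))   # above 1: the LPM range problem
print(lpm_prob(-2))  # negative: impossible as a probability
```

Logit and probit avoid this by construction, since their fitted probabilities always lie strictly between 0 and 1.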
Transformation

We model:

P(y = 1|x) = G(β0 + β1 x1 + · · · + βk xk) = G(x′β)

G(·) is a cumulative distribution function (CDF) with the following
properties:
0 < G(z) < 1 for all z
G′(z) > 0 (strictly increasing)
Domain: (−∞, ∞)
Range: (0, 1)
The unobserved index variable underlying this transformation is known
as a latent variable.



The Logit Model

Logistic CDF:

G(z) = exp(z) / (1 + exp(z))

Model:

P(y = 1|x) = exp(x′β) / (1 + exp(x′β))

Log-odds (Logit):

log[ P(y = 1|x) / (1 − P(y = 1|x)) ] = x′β



Logit Model: Specification
Idea: Instead of a linear probability, assume the log-odds of Y = 1 is
linear in X . The logit model is:
log[ P(Y = 1 | X) / (1 − P(Y = 1 | X)) ] = Xβ,

which implies

P(Y = 1 | X) = Λ(Xβ) = exp(Xβ) / (1 + exp(Xβ)).

(Here Λ(·) is the logistic CDF.)
P(Y = 1 | X ) is automatically between 0 and 1 for all X (S-shaped
probability curve).
Estimation: Coefficients β are estimated via maximum likelihood
(MLE) since the model is nonlinear. (No closed-form solution like
OLS.)
MLE yields consistent and asymptotically normal estimates under
correct specification. Use asymptotic z-tests for coefficient significance.
Interpreting Logit Coefficients
Sign: If βk > 0, increasing xk increases the probability of Y = 1 (and
vice versa for βk < 0).
Odds Ratio: exp(βk) is the factor by which the odds of Y = 1
change for a one-unit increase in xk. For example, if βk = 0.7, then
e^0.7 ≈ 2.01, meaning the odds of Y = 1 roughly double when xk
increases by 1 (holding other variables constant).
Marginal Effects: The effect of xk on the predicted probability
depends on the current probability level:

∂P(Y = 1 | X)/∂xk = βk P(Y = 1 | X)[1 − P(Y = 1 | X)].
This is highest when P(Y = 1 | X ) ≈ 0.5, and near zero when
P(Y = 1 | X ) is near 0 or 1.
In practice, we often evaluate marginal effects at the sample mean or
compute average marginal effects to interpret the size of βk in
probability terms.
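The odds-ratio and marginal-effect arithmetic above can be sketched in a few lines of Python; the coefficient 0.7 echoes the slide's example, and the probability levels are illustrative:

```python
import math

def logistic(z):
    """Logistic CDF: G(z) = exp(z) / (1 + exp(z))."""
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical logit coefficient on x_k (the 0.7 mirrors the slide's example)
beta_k = 0.7

# Odds ratio: a one-unit increase in x_k multiplies the odds of Y = 1 by exp(beta_k)
odds_ratio = math.exp(beta_k)  # ~2.01: the odds roughly double

def marginal_effect(p, beta):
    """Logit marginal effect dP/dx_k = beta * p * (1 - p) at probability level p."""
    return beta * p * (1.0 - p)

me_mid = marginal_effect(0.5, beta_k)    # largest at p = 0.5
me_tail = marginal_effect(0.95, beta_k)  # much smaller near p = 1
```

Note that the marginal effect shrinks toward zero in the tails, exactly as the slide describes.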
The Probit Model

Standard Normal CDF:

G(z) = Φ(z) = ∫ from −∞ to z of (1/√(2π)) exp(−t²/2) dt

Model:

P(y = 1|x) = Φ(x ′ β)

Interpretation: Like logit, but the slope of the probability curve
varies according to the normal density rather than the logistic one



Probit Model: Specification and Estimation

Idea: Similar to logit, but assume a normal cumulative distribution
for the probability. Formally:

P(Y = 1 | X ) = Φ(X β) .

(Here Φ(·) is the CDF of the standard normal distribution.)


Ensures 0 < P(Y = 1 | X ) < 1 for all X , with an S-shaped curve
(probit curve).
Often derived from an underlying latent variable model:
Y ∗ = X β + u, with u ∼ N(0, 1). Then Y = 1 if Y ∗ > 0, Y = 0
otherwise. This yields the probit form for P(Y = 1).
Estimation: Coefficients are estimated by MLE (no closed-form).
Interpretation of sign/significance is similar to logit. Use z-tests for
coefficients.



Interpreting Probit Coefficients
Sign: βk ’s sign indicates direction of effect (like logit/LPM). If
βk > 0, higher xk increases P(Y = 1).
Magnitude: Probit coefficients are in units of underlying z-scores.
We cannot interpret βk as a direct probability change.
Partial Effect: For a continuous variable xk , the marginal effect on
probability is
∂P(Y = 1 | X)/∂xk = βk ϕ(Xβ).
(Here ϕ(·) denotes the standard normal density.) This depends on X ;
often evaluated at the mean of X or averaged across the sample.
For a binary independent variable, the effect can be found by
computing P(Y = 1) when xk = 1 versus xk = 0 (holding others
fixed).
Comparison to Logit: Probit and logit generally yield similar
qualitative results. Logit coefficients tend to be about 1.6 times larger
(due to scale differences), but after computing marginal effects or
predicted probabilities, both models agree closely. The choice between
them is largely a matter of convention and convenience.
Salman (IBA) Chapter 17 12 / 26
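A minimal Python sketch of the probit partial effect, building Φ and ϕ from the standard library's erf; the index value and coefficient are hypothetical, and the 1.6 scaling is the rough rule of thumb from the text:

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical probit index X*beta and coefficient beta_k (illustrative values only)
xb, beta_k = 0.4, 0.25

prob = norm_cdf(xb)               # P(Y = 1 | X) = Phi(X beta)
marginal = beta_k * norm_pdf(xb)  # partial effect: beta_k * phi(X beta)

# Rough scale comparison mentioned in the text: logit coefficients ~1.6x probit's
logit_scale = 1.6 * beta_k
```

In practice one would evaluate `marginal` at the sample mean of X, or average it over all observations, as the slides recommend.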
Comparison: Logit vs Probit

Both constrain probabilities between 0 and 1


Both are based on a latent (unobserved) variable approach
Logit assumes logistic errors; Probit assumes normal errors
In practice, results are often very similar



The Tobit Model



Tobit Model: Censored Dependent Variable
Use case: When the dependent variable is continuous but censored
at a known limit (often 0). Example: yi = hours worked (many
people have 0 hours, and others have positive hours).
Specification: There is an unobserved latent variable yi∗ . We assume:
yi∗ = Xi β + ui ,   ui ∼ N(0, σ²),

and the observed yi is:

yi = yi∗ if yi∗ > 0,   yi = 0 if yi∗ ≤ 0.

(Censoring at 0; this is sometimes called a corner solution outcome.)
If we run OLS on yi ignoring the censoring, the predicted values can
be negative.
Estimation: Tobit uses MLE, combining information from the
probability of being censored and the distribution of y ∗ for
noncensored observations.
Interpreting Tobit Coefficients
The Tobit coefficients β relate to the latent variable y∗. A positive βk
means that xk increases the latent outcome y∗.
Implication: If βk > 0, increasing xk increases the probability that
yi∗ > 0 (i.e. that yi is uncensored) and increases the expected value
of yi when it is above 0.
Partial effects: The effect of xk on the observed E[yi] has two
components:
1 The change in the probability of yi being positive (extensive margin).
2 The change in the predicted yi, given that it is positive (intensive margin).
Due to the non-linearity, the marginal effect of xk on E[y] is

∂E[y | x]/∂xk = βk Φ(xi′β/σ).

It will be smaller in magnitude than βk because not all observations
are affected (some remain at 0).
Similarly, the marginal effect on the probability that y is uncensored
(the extensive margin) is

∂P(y > 0 | x)/∂xk = (βk/σ) ϕ(xi′β/σ).
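Both margins can be computed directly once Φ and ϕ are available. A small Python sketch, with entirely hypothetical values for βk, σ, and the index x′β:

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical Tobit estimates: coefficient, error s.d., and index value x'beta
beta_k, sigma, xb = 2.0, 5.0, 3.0
z = xb / sigma

# Extensive margin: effect of x_k on P(y > 0) is (beta_k / sigma) * phi(x'beta / sigma)
extensive = (beta_k / sigma) * norm_pdf(z)

# Effect on E[y | x]: beta_k * Phi(x'beta / sigma), damped toward 0 relative to beta_k
effect_on_Ey = beta_k * norm_cdf(z)
```

Because Φ(·) < 1, `effect_on_Ey` is always smaller in magnitude than βk, matching the point above.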
Specification Issue in Tobit

Limitation: Tobit assumes that the same underlying process
determines both the probability of being above the limit and the level
of y∗ (aside from the error). If this is not true (e.g., different factors
affect participation vs. outcome), Tobit may be misspecified.
Like other MLE models, Tobit assumes normality and
homoskedasticity of ui. Violations can lead to incorrect inference and
prediction.



The Poisson Regression Model





The Poisson Regression Model
Used when the dependent variable is a count variable: nonnegative
integers (0, 1, 2, . . . )
Examples:
Number of children born
Arrests per year
Patents filed by a firm
Often includes many zeros; linear models may not fit well
Solution: Exponential Mean Function
Instead of a linear model, use:

E (y |x1 , . . . , xk ) = exp(b0 + b1 x1 + · · · + bk xk )

Ensures positive predicted values


Taking logs:

log E (y |x) = b0 + b1 x1 + · · · + bk xk





Poisson Distribution
For count data, use Poisson distribution:

P(y = h|x) = exp(−λ) λ^h / h! ,   λ = exp(xb)
Fully determined by mean λ = E (y |x)
Interpretation:
Approximate: 100 · bj ≈ % change in E (y |x) from a one-unit increase
in xj
Exact change for a dummy variable switching from 0 to 1: exp(bk) − 1

If xj = log(zj ), then bj is an elasticity


Partial effect: ∂E(y|x)/∂xj = exp(xb) bj
APE: ȳ · b̂j
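A short Python sketch of these interpretation rules, using made-up coefficients:

```python
import math

# Hypothetical Poisson regression coefficients (illustrative only)
b0, b1 = 0.5, 0.12   # b1 on a continuous x_1
b_d = 0.30           # coefficient on a 0/1 dummy d

def mean_y(x1, d):
    """Exponential mean function: E(y|x) = exp(b0 + b1*x1 + b_d*d)."""
    return math.exp(b0 + b1 * x1 + b_d * d)

# Approximate rule: 100*b1 ~ percent change in E(y|x) per unit of x_1
approx_pct = 100 * b1                      # ~12%
exact_ratio = mean_y(1, 0) / mean_y(0, 0)  # exp(b1): the exact proportional change

# Exact effect of the dummy switching 0 -> 1: exp(b_d) - 1
dummy_effect = math.exp(b_d) - 1.0         # ~0.35, i.e. about a 35% increase
```

The approximate rule (12%) and the exact ratio (exp(0.12) ≈ 1.1275, a 12.75% increase) diverge more as the coefficient grows, which is why the exact exp(bk) − 1 formula matters for dummies with large coefficients.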


Estimation and Inference

Use Maximum Likelihood Estimation (MLE)


Poisson MLE is consistent even if Poisson assumption fails
Called Quasi-MLE (QMLE) in that case
Variance Assumptions
Poisson: Var(y |x) = E (y |x)
Often violated in practice (overdispersion)
Generalized form:
Var(y |x) = σ 2 E (y |x)
Estimate σ 2 and adjust standard errors
Inference
t-tests for individual coefficients.
Likelihood ratio test for exclusion restrictions.
Quasi-likelihood ratio: divide the LR statistic by σ̂².
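A sketch of the dispersion adjustment in Python, under the assumption that σ² is estimated as the sum of squared residuals scaled by the fitted means, (n − k − 1)⁻¹ Σ ûᵢ²/ŷᵢ; the data and fitted values are invented:

```python
import math

# Toy outcomes and Poisson fitted values yhat_i = exp(x_i'b); all numbers invented
y    = [0, 2, 1, 4, 3, 0, 5, 2]
yhat = [0.8, 1.5, 1.2, 3.1, 2.4, 0.6, 4.2, 1.9]
k = 2  # number of slope coefficients in the hypothetical model

n = len(y)
# Dispersion estimate: sigma2 = (1/(n-k-1)) * sum of (y_i - yhat_i)^2 / yhat_i
sigma2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, yhat)) / (n - k - 1)

# With overdispersion (sigma2 > 1), multiply the usual Poisson standard
# errors by sqrt(sigma2); these toy numbers happen to give sigma2 < 1.
se_factor = math.sqrt(sigma2)
```

If σ̂² ≈ 1, the Poisson variance assumption looks reasonable; values well above 1 signal overdispersion and call for the adjusted standard errors.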
Sample Selection Model



Sample Selection Problem

Occurs when the data on the dependent variable are missing (not just
censored at a value) in a non-random way, typically due to an
endogenous selection mechanism.
Example: We only observe wages for people who choose to work.
The decision to work (and thus to have wage observed) may depend
on factors related to the wage outcome, causing a non-random
sample of wages.
Model Setup:
Selection equation: Di∗ = Zi γ + vi , Di = 1 if Di∗ > 0 (observation is
selected), otherwise Di = 0.
Outcome equation: yi = Xi β + ui , observed only if Di = 1.
Assume (ui , vi ) are correlated with correlation ρ. If ρ ̸= 0, then
E [ui | Di = 1] ̸= 0 (violating OLS assumption of zero conditional mean
in selected sample).



Heckman Two-Step Correction
Step 1: Selection Equation. Estimate a probit model for Di
(selection) using Zi as regressors. From this, obtain the Inverse Mills
Ratio for each observation:
λi = ϕ(Zi γ̂) / Φ(Zi γ̂) .
(Here ϕ and Φ denote the standard normal PDF and CDF,
respectively. λi is computed for observations with Di = 1.)
Step 2: Outcome Equation. For the subsample with Di = 1,
estimate the outcome equation by OLS, including λi as an additional
regressor:
yi = Xi β + δ λi + error.
Here, δ accounts for the selection bias. If δ ̸= 0 significantly, it
indicates sample selection was present (and now corrected for).
Intuition: λi is high when an observation had a low probability of
selection (but was selected anyway), indicating a likely large positive
ui . Controlling for λ adjusts for this non-randomness.
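Step 1's inverse Mills ratio is straightforward to compute from the fitted probit index; a Python sketch with hypothetical index values:

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(index):
    """Inverse Mills ratio: lambda = phi(Z*gamma) / Phi(Z*gamma)."""
    return norm_pdf(index) / norm_cdf(index)

# Hypothetical step-1 probit indices Z_i * gamma_hat for three selected observations
lam_likely   = inverse_mills(1.5)   # selection was likely  -> small lambda
lam_even     = inverse_mills(0.0)   # 50/50 selection
lam_unlikely = inverse_mills(-1.0)  # selection was unlikely -> large lambda
```

In step 2, these λi values would enter the selected-sample OLS regression of yi on Xi as the extra regressor whose coefficient δ tests for selection bias.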
Heckman Model: Assumptions & Notes

Identification: It’s important to have at least one variable in Zi
(selection equation) that is not in Xi (outcome equation). This
exclusion restriction helps identify the model (gives variation in
selection independent of the outcome).
When a valid exclusion restriction is present, it improves the reliability
of the λ correction and the precision of estimates.
The model is applicable to scenarios of incidental truncation (sample
selection), not to be confused with censoring. In censoring (Tobit),
values at the limit are observed (as the limit); in selection, missing
outcomes are not observed at all.



THANK YOU
Hardships often prepare ordinary people for an extraordinary destiny.

