Applied Microeconometrics

Unobserved heterogeneity

Yutec Sun

ENSAI
Unobserved heterogeneity in panel data

What is panel data?

• Repeated observation of the same subjects (called panel) over time


• Specifically, {yit , xi1t , ..., xiKt } observed for each individual i at time t
• Many DID analyses are applied to 2-period panel data.[1]

In panel data analysis, one of the most widely used models looks like

yit = β0 + xit β + ci + uit ,

where ci is the effect of unobserved heterogeneous factors that persist over time.

[1] For example, the minimum-wage experiment of Card and Krueger (1994) and the water source experiment of Snow (1855).

Data structure in difference-in-differences analysis

Water suppliers

             Southwark & Vauxhall    S-V & Lambeth
             (Control group)         (Treatment group)
Before       135                     130
After        147                      85
Difference    12                     -45

Table 1: Average deaths per 10,000 before/after Lambeth moved the water source

• Panel of sub-districts for treated & control groups over 2 time periods

Difference-in-differences estimator

Suppose that the death outcome is generated from

Yit = α + βTt + γTt Di + ci + εit ,

where Tt is an indicator for the time after treatment, and Di is the treatment indicator:

• Di = 1 if sub-district i belongs to the treatment group.
• Tt = 1 if time period t is after the treatment occurred.

Parameter ci captures unobserved heterogeneity of each group.
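
As a worked check, the four cell means of Table 1 (135, 147, 130, 85) are enough to recover γ in the model above: differencing over time within each group removes ci, and differencing across groups removes the common time effect β. A minimal sketch; the variable names are illustrative and not from the slides.

```python
# Cell means from Table 1 (average deaths per 10,000).
control_before, control_after = 135.0, 147.0
treated_before, treated_after = 130.0, 85.0

# Within-group change over time; the group effect c_i cancels in each difference.
diff_control = control_after - control_before   # estimates beta
diff_treated = treated_after - treated_before   # estimates beta + gamma

# Difference-in-differences: removes the common time effect beta.
gamma_hat = diff_treated - diff_control
print(diff_control, diff_treated, gamma_hat)
```

The two within-group changes reproduce the "Difference" row of the table, and their gap is the DID estimate of γ.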

Why account for unobserved heterogeneity?

Failing to account for the unobserved heterogeneity can lead to:

• Inefficient estimator
• Omitted variables bias
• Poor prediction of individual behavior

These problems motivate the use of random effects, fixed effects, and hierarchical Bayesian methods.

Illustration of omitted variables bias in panel data

Suppose that our panel data model is

yit = β0 + xit β + ci + uit ,

where ci is the unobserved heterogeneity and

Cov(xit , ci ) ≠ 0,

which can arise when one of xit is correlated with ci : Cov(xikt , ci ) ≠ 0 for some k.
Two strategies are available to solve this endogeneity problem: control function and IV methods.
But in panel data, we have one more option available.
We will see panel data methods based on the textbook of Wooldridge (2010).

Illustration of difference method

Let’s consider estimating a 2-period panel model (i index omitted):

yt = β0 + xt β + c + ut ,  t = 1, 2.

The error ut is usually assumed to satisfy

E(ut |xt , c) = 0.[2]   (1)

Let ∆y = y2 − y1 , ∆x = x2 − x1 , ∆u = u2 − u1 . Then c can be differenced out as

∆y = ∆xβ + ∆u.

The OLS regression can estimate β consistently if for x ∈ R^K

E(∆x′ ∆u) = 0   (2)

rank E(∆x′ ∆x) = K.   (3)

Does this approach always work? Let’s look more closely at the consistency conditions (2) and (3).

[2] This is often called the mean independence condition since the conditional mean of u does not depend on x and c.
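
The differencing argument can be checked numerically. The sketch below simulates a 2-period panel in which x is correlated with c, then compares pooled OLS with first-difference OLS. The data-generating process (normal draws, β = 2, β0 = 1, strictly exogenous errors) is an assumption made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 5000, 2.0

c = rng.normal(size=n)                       # unobserved heterogeneity
x = c[:, None] + rng.normal(size=(n, 2))     # x_t is correlated with c, t = 1, 2
u = rng.normal(size=(n, 2))                  # errors independent of x and c
y = 1.0 + beta * x + c[:, None] + u          # beta0 = 1

# Pooled OLS of y on (1, x): c stays in the error, so the slope is biased.
X_pooled = np.column_stack([np.ones(2 * n), x.ravel()])
b_pooled = np.linalg.lstsq(X_pooled, y.ravel(), rcond=None)[0][1]

# First-difference OLS without intercept: c and beta0 drop out.
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
b_fd = (dx @ dy) / (dx @ dx)

print(b_pooled, b_fd)
```

In this design the pooled slope should drift away from 2 while the first-difference slope stays close to it, because the simulated errors satisfy the stronger exogeneity condition by construction.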

Exogeneity condition

What kind of exogeneity is needed?

1. Expanding the exogeneity condition (Equation 2) with ∆x = x2 − x1 and ∆u = u2 − u1 , we get

E(x2′ u2 + x1′ u1 − x2′ u1 − x1′ u2 ) = 0.

This shows

• the mean independence condition (Eq 1) is not enough because it guarantees only

E(x1′ u1 ) = E(x2′ u2 ) = 0.

• we need a stronger version of exogeneity

E(xt′ us ) = 0 ∀ t ≠ s.

Rank conditions

What kind of rank condition is needed?

• If x includes a constant, its difference in ∆x is 0, and the full rank condition (Eq 3) fails:

rank E(∆x′ ∆x) < K.

• Hence, the constant term in x cannot be separately identified from unobserved heterogeneity c.
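
A small numerical illustration of the rank failure, assuming a regressor vector made of a constant and one time-varying variable (K = 2); the simulated values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two periods; the regressors are a constant and one time-varying variable.
x1 = np.column_stack([np.ones(n), rng.normal(size=n)])
x2 = np.column_stack([np.ones(n), rng.normal(size=n)])

dx = x2 - x1                       # the constant column differences to zero
moment = dx.T @ dx / n             # sample analogue of E(dx' dx)
rank = np.linalg.matrix_rank(moment)
print(rank)
```

The moment matrix has a zero row and column, so its rank is 1 rather than K = 2.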

Taxonomy of unobserved effects

The panel data model can be written as

yit = x′it β + ci + uit ,  t = 1, ..., T,  i = 1, ..., N,   (4)

where xit is a vector of observable variables, ci is the effect of unobserved heterogeneity, and uit is the idiosyncratic error.[3]
There exist two views of the unobserved heterogeneity ci .[4]

1. Random effect: This view assumes

E(ci |xi1 , ..., xiT ) = 0,

which implies that the random effect ci is uncorrelated with xit for all t.

2. Fixed effect: no assumption is imposed on the distribution of ci . Arbitrary correlation with xit is allowed.

[3] The name idiosyncratic implies that uit is independently distributed.
[4] Historically, ci is considered a random variable in the random effects model and a parameter in the fixed effects model. Distinguishing the two views in this way has little implication for estimation.

Random effects model
Motivation

The random effects (RE) approach

• accounts for the unobserved heterogeneity
• by allowing for heteroscedasticity in the unobservables to improve estimation efficiency
• when unobserved heterogeneity does not generate endogeneity bias.
• Failure to account for this will only lead to less precise estimates.

Heteroscedastic error model

The RE model puts the unobserved heterogeneity into the error term by writing the panel data model (Eq 4) as

yit = xit β + vit ,  where vit = ci + uit .   (5)

The RE estimation needs exogeneity of xit , which can be decomposed into two parts:

Assumption 1.

(a) (Strict exogeneity) E(uit |xi1 , ..., xiT , ci ) = 0 for all t.
(b) (Exogeneity) E(ci |xi1 , ..., xiT ) = 0.

Why do we need Assumption 1?

• xit is uncorrelated with vit (no endogeneity).
• Covariance matrix of Vi = [vi1 , ..., viT ]′ is heteroscedastic.[5]
• Therefore, the generalized least squares (GLS) method can estimate the model more efficiently.

[5] The error term vit is serially correlated due to ci .
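
To see where the serial correlation comes from, it helps to write out the covariance of Vi in a common special case. The expression below additionally assumes (this is not stated on the slide) that uit is homoscedastic and serially uncorrelated with variance σ_u², that ci has variance σ_c², and that ci and uit are uncorrelated:

```latex
\Omega = E(V_i V_i') = \sigma_u^2 I_T + \sigma_c^2 \iota_T \iota_T',
\qquad \iota_T = [1, \ldots, 1]' \in \mathbb{R}^T,
\qquad\text{so}\qquad
\operatorname{Var}(v_{it}) = \sigma_c^2 + \sigma_u^2, \quad
\operatorname{Cov}(v_{it}, v_{is}) = \sigma_c^2 \quad (t \neq s).
```

Every pair of periods shares the same ci, so the off-diagonal elements are nonzero and OLS is not efficient.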

Strict exogeneity

Assumption 1. (repeated)

(a) (Strict exogeneity) E(uit |xi1 , ..., xiT , ci ) = 0 for all t.
(b) (Exogeneity) E(ci |xi1 , ..., xiT ) = 0.

This implies:

(a) Cor(uit , xis ) = 0 for all t, s.
(b) Cor(ci , xit ) = 0 for all t.

It assumes that none of the unobservables creates endogeneity bias for the coefficients of xit .

Strict exogeneity

Let’s examine some implications of the strict exogeneity:

• Given (xit , ci ), E(yit |xi1 , ..., xiT , ci ) does not depend on xis for s ≠ t since

E(yit |xi1 , ..., xiT , ci ) = xit β + ci .

• yt must not have feedback from past {xτ }τ ≤t−1 or into future {xτ }τ ≥t+1 .
• It’s stronger than the standard exogeneity condition E(uit |xit ) = 0 since it implies

E(uit |xi1 , ..., xiT ) = 0.

Example 1: Program evaluation

Consider a model for the effect of a work training program on wages:

log(wageit ) = θt + zit γ + δ1 progit + ci + uit ,

where θt is a time-varying intercept, zit is i’s observable characteristics, and ci is unobserved ability.
progit = 1 if i participates in the program at time period t, and zero otherwise.

Will Assumption 1 hold?

• If progit is chosen based on unobserved skills ci ,[6] Assumption 1 (b) fails.
• What if the contemporaneous wage shock uit affects future participation progit+1 ?[7] If so, strict exogeneity fails.

[6] Either by the individual or by the administrator.
[7] This can happen, for example, when a low uit prompts individuals to participate in a future job training program.

Example 2: Distributed lag model

Consider a model of R&D and technological innovation:

patentsit = θt + zit γ + δ0 RDit + δ1 RDit−1 + · · · + δ5 RDit−5 + ci + uit ,

where RDit is firm i’s R&D spending at time t, and zit contains firm size.[8] ci is unobserved firm heterogeneity that may be correlated with current, past & future R&D.

Validity of Assumption 1

• Will today’s shocks uit to patents affect future R&D spending?
• Will RD be allowed to depend on ci ?

[8] Often measured by sales revenue or employees.

Example 3: Lagged dependent variable

Consider another simple model of wage:

log(wageit ) = β1 log(wageit−1 ) + ci + uit .

It estimates state dependence β1 in wage after controlling for unobserved heterogeneity ci .[9] Write this model as

yit = β1 xit + ci + uit ,

where yit = log(wageit ), xit = log(wageit−1 ), and

E[uit |yit−1 , ..., yi0 , ci ] = 0.

Will Assumption 1 be satisfied or not?

E[uit |xis , ci ] = 0 for s ≠ t?

E[ci |xit ] = 0?

[9] The answer to this question boils down to nature vs nurture in a causal relationship.
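
A short derivation suggests the answer to both questions is no. It uses only the model yit = β1 xit + ci + uit with xit = yit−1 and the stated condition that uit is mean independent of past y and ci:

```latex
x_{i,t+1} = y_{it} = \beta_1 x_{it} + c_i + u_{it}
\;\Longrightarrow\;
\operatorname{Cov}(x_{i,t+1}, u_{it}) = \operatorname{Var}(u_{it}) > 0,
\qquad
x_{it} = y_{i,t-1} = \beta_1 y_{i,t-2} + c_i + u_{i,t-1}
\;\Longrightarrow\;
\operatorname{Cov}(x_{it}, c_i) \neq 0 \ \text{in general}.
```

The first implication violates strict exogeneity (part (a)), since today's shock enters tomorrow's regressor; the second violates part (b), since the lagged outcome contains ci.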

How to estimate the random effects model?

To simplify, let’s write the RE model (Eq 5) in vector form as

\[
Y_i = X_i \beta + V_i, \qquad
Y_i = \begin{bmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{bmatrix}, \quad
X_i = \begin{bmatrix} x_{i1} \\ \vdots \\ x_{iT} \end{bmatrix}_{T \times K}, \quad
V_i = \begin{bmatrix} v_{i1} \\ \vdots \\ v_{iT} \end{bmatrix},
\]

\[
V_i = c_i \iota_T + U_i, \qquad \iota_T = [1, \ldots, 1]' \in \mathbb{R}^T, \qquad U_i = [u_{i1}, \ldots, u_{iT}]'.
\]

Assumption 2.
rank E(Xi′ Ω⁻¹ Xi ) = K, where Ω (T × T) = E(Vi Vi′ ).

Random effects estimator

The RE approach consistently estimates β by using the feasible GLS method.

The RE estimator is defined as

\[
\hat{\beta}_{RE} = \left( \sum_{i=1}^{N} X_i' \hat{\Omega}^{-1} X_i \right)^{-1} \left( \sum_{i=1}^{N} X_i' \hat{\Omega}^{-1} Y_i \right),
\]

where Ω̂ is a consistent estimate of Ω.[10]

Under Assumptions 1 & 2, the RE estimator satisfies

\[
\sqrt{N} \, (\hat{\beta}_{RE} - \beta) \to_d N\left( 0, \left[ E(X_i' \Omega^{-1} X_i) \right]^{-1} \right)
\]

as N → ∞.
In this way, the RE estimator can improve the estimation efficiency if the covariates xit do not depend on the unobserved heterogeneity. But when they do, endogeneity bias will arise.
What if we cannot rule out such a possibility?

[10] Ω̂ can be obtained from the residuals of an OLS regression in the first stage.
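
The two-stage procedure can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the estimator as implemented in any package: the data are simulated so that Assumption 1 holds, there is no intercept, and Ω̂ is left unrestricted (the average outer product of first-stage OLS residuals) rather than restricted to a two-variance-component form.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta = 2000, 4, np.array([1.0, -0.5])

# Simulated panel satisfying Assumption 1: c_i and u_it are independent of x_it.
X = rng.normal(size=(N, T, 2))
c = rng.normal(size=(N, 1))
U = rng.normal(size=(N, T))
Y = X @ beta + c + U                        # shape (N, T)

# First stage: pooled OLS, then residuals V_hat_i = Y_i - X_i b_ols.
b_ols = np.linalg.lstsq(X.reshape(N * T, 2), Y.reshape(N * T), rcond=None)[0]
V_hat = Y - X @ b_ols                       # shape (N, T)

# Unrestricted estimate of Omega = E(V_i V_i').
Omega_hat = V_hat.T @ V_hat / N
Omega_inv = np.linalg.inv(Omega_hat)

# Second stage: sum X_i' Omega^{-1} X_i and X_i' Omega^{-1} Y_i over i, then solve.
A = np.einsum("itk,ts,isl->kl", X, Omega_inv, X)
b = np.einsum("itk,ts,is->k", X, Omega_inv, Y)
beta_re = np.linalg.solve(A, b)
print(beta_re)
```

With exogenous covariates, both the first-stage OLS and the GLS estimates should be close to the true β; the gain from GLS is in precision, not in consistency.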

Fixed effects model

In the fixed effects (FE) model

yit = xit β + ci + uit ,   (6)

the exogeneity of xit with respect to ci is no longer assumed.[11]

Assumption 1. (Strict exogeneity)

(a) E(uit |xi1 , ..., xiT , ci ) = 0 for t = 1, ..., T .

Remarks

• E(ci |Xi ) is allowed to depend on Xi (no Assumption 1 (b) of the RE model).
• Relying on a weaker assumption, the FE estimation is more robust than the RE.
• But the tradeoff is that xit can no longer include time-constant factors.[12]

[11] Now xit is allowed to depend on ci .
[12] For example, gender, race, industry, and city-specific attributes. See next slide.

Differencing out the fixed effects

Averaging Equation 6 over t = 1, ..., T yields

ȳi = x̄i β + ci + ūi ,

where ȳi = (1/T) Σ_{t=1}^{T} yit , x̄i = (1/T) Σ_{t=1}^{T} xit , and ūi = (1/T) Σ_{t=1}^{T} uit . Subtracting this from Eq. 6,

yit − ȳi = (xit − x̄i )β + uit − ūi .

The differencing cancels out unobserved heterogeneity ci as well as all time-constant terms in xit .
Therefore, the source of endogeneity bias is eliminated, and we can apply the simple OLS for estimation.
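
A minimal sketch of the within transformation on simulated data, where xit is built to depend on ci. The design (scalar x, β = 1.5, normal draws, no intercept) is assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 2000, 4, 1.5

c = rng.normal(size=(N, 1))
x = 0.8 * c + rng.normal(size=(N, T))       # x_it depends on c_i
u = rng.normal(size=(N, T))
y = beta * x + c + u

# Within transformation: subtract each individual's time average.
x_tilde = x - x.mean(axis=1, keepdims=True)
y_tilde = y - y.mean(axis=1, keepdims=True)

# OLS on the demeaned data versus OLS on the raw data (no intercept).
b_within = (x_tilde * y_tilde).sum() / (x_tilde ** 2).sum()
b_pooled = (x * y).sum() / (x ** 2).sum()   # ignores c_i
print(b_within, b_pooled)
```

Demeaning removes ci exactly, so the within slope should sit near 1.5, while the raw OLS slope absorbs part of the correlation between x and c.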

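Fixed effects estimator

For estimation, the equation can be written alternatively as

\[
\tilde{Y}_i = \tilde{X}_i \beta + \tilde{U}_i \quad\text{for}\quad
\tilde{Y}_i = \begin{bmatrix} \tilde{y}_{i1} \\ \vdots \\ \tilde{y}_{iT} \end{bmatrix}, \quad
\tilde{X}_i = \begin{bmatrix} \tilde{x}_{i1} \\ \vdots \\ \tilde{x}_{iT} \end{bmatrix}_{T \times K}, \quad
\tilde{U}_i = \begin{bmatrix} \tilde{u}_{i1} \\ \vdots \\ \tilde{u}_{iT} \end{bmatrix},
\]

where ỹit = yit − ȳi , x̃it = xit − x̄i , and ũit = uit − ūi . Then the FE estimator[13] is

\[
\hat{\beta}_{FE} = \left( \sum_{i=1}^{N} \tilde{X}_i' \hat{\Omega}^{-1} \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^{N} \tilde{X}_i' \hat{\Omega}^{-1} \tilde{Y}_i \right), \qquad (7)
\]

where

\[
\hat{\Omega} = \frac{1}{N} \sum_{i=1}^{N} \hat{\tilde{U}}_i \hat{\tilde{U}}_i', \qquad
\hat{\tilde{U}}_i = \tilde{Y}_i - \tilde{X}_i \hat{\beta}_{FEOLS},
\]

and the first-stage FE estimator β̂FEOLS is obtained by setting Ω̂ = I in Equation 7.

[13] This FE estimator is called the within estimator since it uses the time variation within each panel i.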

Consistency of FE estimator

For asymptotic inference, the following standard conditions are needed.

Assumption 2.

rank E(X̃i′ Ω⁻¹ X̃i ) = K.[14]

Assumption 3.

E(Ũi Ũi′ |Xi , ci ) = Ω.

Under Assumptions 1-3, β̂FE is consistent since the strict exogeneity implies

E(x̃′it ũit ) = 0,  t = 1, ..., T.

β̂FE is also robust to the heteroscedastic covariance of Ũi .

[14] This shows why time-constant variables are not allowed in xit : the corresponding columns in X̃i would be zero for all i.

Alternative FE approach

Consider again the differenced model

yit − ȳi = (xit − x̄i )β + uit − ūi .

Is this the only way to difference out the unobserved heterogeneity ci ?

In fact, we can achieve the same by taking a first difference as

yit − yit−1 = (xit − xit−1 )β + uit − uit−1 .

The form of the FE estimator β̂FE remains unchanged once ỹit , x̃it , and ũit are redefined as first differences.
Under unrestricted covariance of Ũi , the two differencing methods are asymptotically equivalent (Wooldridge, 2010).
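
For T = 2 the link between the two transformations is exact: each demeaned value is plus or minus half the first difference, so the OLS slopes coincide numerically. A quick check on simulated data (the design is an arbitrary assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 500

c = rng.normal(size=(N, 1))
x = c + rng.normal(size=(N, 2))             # x_it correlated with c_i, T = 2
y = 0.7 * x + c + rng.normal(size=(N, 2))

# Within (demeaned) OLS.
xt = x - x.mean(axis=1, keepdims=True)
yt = y - y.mean(axis=1, keepdims=True)
b_within = (xt * yt).sum() / (xt ** 2).sum()

# First-difference OLS.
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
b_fd = (dx @ dy) / (dx @ dx)
print(b_within, b_fd)
```

With T > 2 the two OLS estimates generally differ in finite samples, which is where the choice of covariance weighting starts to matter.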

Which to choose between RE and FE?

There are two possibilities.

1. xit is exogenous to ci (Assumption 1 (b) in the RE model).

• Both RE & FE are consistent, and RE is more efficient.

2. xit is endogenous to ci .

• RE is inconsistent, but FE is consistent.
• In this case, we should expect to see a difference between RE & FE.
• Hausman proposes to test for this.

Hausman statistic

\[
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})' \left[ \widehat{\mathrm{Avar}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Avar}}(\hat{\beta}_{RE}) \right]^{-1} (\hat{\beta}_{FE} - \hat{\beta}_{RE}) \sim \chi^2_K .
\]

Idea

• If xit is endogenous to ci , the difference metric H would be large.
• It is not a test of strict exogeneity.
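
Given FE and RE estimates and their estimated asymptotic variances, the statistic is a single quadratic form. The helper below is a hypothetical illustration (the function name and all numbers are made up), not a routine from any econometrics package.

```python
import numpy as np

def hausman(b_fe, b_re, avar_fe, avar_re):
    """Return the Hausman statistic for the FE and RE estimates."""
    d = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v = np.asarray(avar_fe, dtype=float) - np.asarray(avar_re, dtype=float)
    # Quadratic form d' v^{-1} d, computed with a linear solve.
    return float(d @ np.linalg.solve(v, d))

# Hypothetical estimates for K = 2 coefficients, for illustration only.
H = hausman([1.0, 2.0], [0.8, 2.1],
            np.diag([0.04, 0.09]), np.diag([0.02, 0.05]))
print(H)
```

Here H is about 2.25, below the 5% critical value of χ² with 2 degrees of freedom (about 5.99), so these made-up numbers would not reject exogeneity of xit with respect to ci.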

Hierarchical Bayesian model

Often, we want to estimate each individual’s heterogeneity to improve model prediction.

Some examples

• Which movie to recommend to an individual Netflix viewer?
• Which products to display on the Louis Vuitton online store?
• Which product should be discounted for an individual shopper at a Hyper U store or on Amazon?

How can we estimate each individual’s βi in

yit = xit βi + uit ?

The classical approach allows limited heterogeneity in βi as a function of observed demographics.
The hierarchical Bayesian approach revolutionized the estimation of a rich heterogeneity distribution for each βi .
How?

Hierarchical Bayesian model

1. Likelihood
yit = xit βi + uit , uit ∼ F
2. Prior
βi ∼ N (µi , σi )
3. Hyperprior
µi ∼ N (µ0 , σ0 ), σi ∼ IG (inverse gamma)
HB model makes it possible to estimate posterior for each individual βi .


The priors (µi , σi )i are grouped by similar individuals automatically.
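
The mechanics of an individual-level posterior are easiest to see in a stripped-down case. The sketch below is a simplification, not the full hierarchical model: it assumes xit = 1, a normal likelihood with known noise variance, and a fixed normal prior (no hyperprior), so the posterior of βi is available in closed form. The function name and data are hypothetical.

```python
import numpy as np

def posterior_beta(y_i, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of beta_i in y_it = beta_i + u_it (x_it = 1)."""
    n = len(y_i)
    # Precisions add: n observations plus the prior.
    precision = n / noise_var + 1.0 / prior_var
    mean = (np.sum(y_i) / noise_var + prior_mean / prior_var) / precision
    return mean, 1.0 / precision

# Four hypothetical observations for one individual.
post_mean, post_var = posterior_beta(np.array([2.0, 3.0, 2.5, 2.5]), 0.0, 1.0, 1.0)
print(post_mean, post_var)
```

The individual's sample mean (2.5) is shrunk toward the prior mean (0), giving a posterior mean of 2.0; with few observations per individual, the prior does more of the work. In the full model the prior parameters are themselves estimated, typically by simulation methods.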

Illustration: Demand estimation

Rossi et al. (1996) estimate the consumer utility for canned tuna

uij = xj βi + ui

for consumer i and tuna product j.

For a consumer with demographic variable zi , the coefficient is

βi = ∆zi + vi ,  vi ∼ N (0, Vβ ).

Consumer preference heterogeneity by demographics

Consumer preference heterogeneity estimated across datasets

Distribution of choice probabilities

Choice prediction targeted at the individual level is made possible by the HB model.

Even with only 4 samples, it can deliver individual point predictions and even the entire shape of the distribution.

Application: Domestic violence
A panel data model

Aizer (2010) estimates the effect of the wage gap between married partners on domestic violence using the model

DVit = α + β1 WageGapit + β2 Unemployedit + β3 Incomeit + β4 Racei + β5 Populationit + β6 ViolentCrimeit + ci + uit ,

where DVit is domestic violence for each county i in year t.

Impact of wage gaps on domestic violence

Figure 1: Panel data regressions (Aizer, 2010)

Application: Airbnb
Airbnb’s impact on housing prices

Figure 2: Airbnb rentals in Paris (insideairbnb.com)

Did Airbnb cause the rise in housing prices and rents?

Panel data model

Barron et al. (2021) analyze panel data of U.S. zip codes for 2012–2016.
For zip code i in CBSA c in year-month t:

ln Yict = α + βAirbnbict + γAirbnbict × oorateic + Xict η + δi + θct + ξict

where

• Yict : housing price (ZHVI) or rent (ZRI) from Zillow.com
• Airbnb: Airbnb supply
• oorate: owner occupancy rate in 2010
• Xict : observed factors in demand for housing (school quality, local economy)
• ξict : shocks to local housing demand & supply

Estimating the effect of Airbnb on rental rates ln(ZRI)

Figure 3: Barron et al. (2021)

• Column (1) OLS estimate without controls: elasticity 0.098
• Column (2) with zip code & CBSA fixed effects: lower Airbnb effect

Questions?

References

Aizer, Anna, “The Gender Wage Gap and Domestic Violence,” American Economic Review, September 2010, 100 (4), 1847–59.
Barron, Kyle, Edward Kung, and Davide Proserpio, “The Effect of Home-Sharing on House Prices and Rents: Evidence from Airbnb,” Marketing Science, 2021, 40 (1), 23–47.
Card, David and Alan B. Krueger, “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania,” American Economic Review, 1994, 84 (4), 772–793.
Rossi, Peter E., Robert E. McCulloch, and Greg M. Allenby, “The Value of Purchase History Data in Target Marketing,” Marketing Science, 1996, 15 (4), 321–340.
Wooldridge, Jeffrey M., Econometric Analysis of Cross Section and Panel Data, MIT Press, 2010.
