Chapter 5

Panel data analysis involves measuring the same collection of individuals or objects over time. This allows researchers to address more complex problems than would be possible with only cross-sectional or time series data. There are two main approaches to modeling panel data: fixed effects models and random effects models. Fixed effects models control for time-invariant characteristics of individuals through dummy variables, while random effects models treat these characteristics as random variables. The appropriate model depends on whether the time-invariant characteristics are correlated with the independent variables.
Panel Data Analysis

The Nature of Panel Data

• Panel data, also known as longitudinal data, have both time series and cross-sectional dimensions.
• They arise when we measure the same collection of people or objects over a period of time.
• Econometrically, the setup is
  $y_{it} = \alpha + \beta x_{it} + u_{it}$
  where $y_{it}$ is the dependent variable, $\alpha$ is the intercept term, $\beta$ is a $k \times 1$ vector of parameters to be estimated on the explanatory variables, $x_{it}$; $t = 1, \ldots, T$; $i = 1, \ldots, N$.

The Advantages of using Panel Data

• There are a number of advantages to using a full panel technique when a panel of data is available.
• We can address a broader range of issues and tackle more complex problems with panel data than would be possible with pure time series or pure cross-sectional data alone.
• It is often of interest to examine how variables, or the relationships between them, change dynamically (over time).
• By structuring the model in an appropriate way, we can remove the impact of certain forms of omitted variables bias in regression results.

Fixed Effects Models

• The fixed effects model for some variable $y_{it}$ may be written
  $y_{it} = \alpha + \beta x_{it} + \mu_i + v_{it}$
• We can think of $\mu_i$ as encapsulating all of the variables that affect $y_{it}$ cross-sectionally but do not vary over time, for example the sector that a firm operates in, a person's gender, or the country where a bank has its headquarters.
• Thus we would capture the heterogeneity that is encapsulated in $\mu_i$ by a method that allows for different intercepts for each cross-sectional unit.
• This model could be estimated using dummy variables, which would be termed the least squares dummy variable (LSDV) approach.

Fixed Effects Models (Cont’d)

• The LSDV model may be written
  $y_{it} = \beta x_{it} + \mu_1 D1_i + \mu_2 D2_i + \mu_3 D3_i + \cdots + \mu_N DN_i + v_{it}$
  where $D1_i$ is a dummy variable that takes the value 1 for all observations on the first entity (e.g., the first firm) in the sample and zero otherwise, $D2_i$ is a dummy variable that takes the value 1 for all observations on the second entity (e.g., the second firm) and zero otherwise, and so on. The common intercept $\alpha$ is dropped here because it would be perfectly collinear with the full set of N dummies (the dummy variable trap).
• The LSDV can be seen as just a standard regression model and therefore it can be estimated using OLS.
• The model given by the equation above has N + k parameters to estimate.

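The LSDV idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are hypothetical and noise-free, there is a single regressor, and the helper functions `solve` and `ols` are written here with the standard library rather than taken from any package. It builds one dummy per entity, estimates the model by OLS, and compares the slope with a pooled regression that ignores the entity effects.

```python
# Minimal LSDV sketch: one regressor plus one dummy per entity, no common intercept.
# All data are hypothetical and noise-free so the result is easy to check.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

BETA = 1.5                       # hypothetical true slope
MU = [10.0, 20.0, 30.0]          # hypothetical entity effects
x_by_entity = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0], [5.0, 6.0, 7.0, 8.0]]

X_lsdv, X_pooled, y = [], [], []
for i, xs in enumerate(x_by_entity):
    for x in xs:
        dummies = [1.0 if j == i else 0.0 for j in range(len(MU))]
        X_lsdv.append([x] + dummies)      # regressor followed by N entity dummies
        X_pooled.append([1.0, x])         # common intercept and regressor only
        y.append(BETA * x + MU[i])

lsdv = ols(X_lsdv, y)        # [slope, mu_1, mu_2, mu_3]
pooled = ols(X_pooled, y)    # [intercept, slope]
print("LSDV slope and entity intercepts:", [round(v, 6) for v in lsdv])
print("Pooled OLS intercept and slope:", [round(v, 6) for v in pooled])
```

In this constructed example the entity effects rise with the level of x, so the pooled slope is pushed away from the true value while the LSDV slope is not.
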
Time Fixed Effects Models

• It is also possible to have a time-fixed effects model rather than an entity-fixed effects model.
• We would use such a model where we think that the average value of $y_{it}$ changes over time but not cross-sectionally.
• Hence with time-fixed effects, the intercepts would be allowed to vary over time but would be assumed to be the same across entities at each given point in time.

Time Fixed Effects Models

• We could write a time-fixed effects model as
  $y_{it} = \alpha + \beta x_{it} + \lambda_t + v_{it}$
  where $\lambda_t$ is a time-varying intercept that captures all of the variables that affect y and that vary over time but are constant cross-sectionally.
• An example would be where the regulatory environment or tax rate changes part-way through a sample period.
• In such circumstances, this change of environment may well influence y, but in the same way for all firms.

Time Fixed Effects Models (Cont’d)

• Time-variation in the intercept terms can be allowed for in exactly the same way as with entity fixed effects. That is, a least squares dummy variable model could be estimated:
  $y_{it} = \beta x_{it} + \lambda_1 D1_t + \lambda_2 D2_t + \cdots + \lambda_T DT_t + v_{it}$
  where $D1_t$, for example, denotes a dummy variable that takes the value 1 for the first time period and zero elsewhere, and so on.

Time Fixed Effects Models (Cont’d)

• The only difference is that now the dummy variables capture time variation rather than cross-sectional variation.
• Similarly, to avoid estimating a model containing all T dummies, a within transformation can be conducted to subtract away the cross-sectional averages from each observation.
• Finally, it is possible to allow for both entity fixed effects and time fixed effects within the same model. Such a model would be termed a two-way error component model.

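The within transformation for time fixed effects can be sketched as follows. This is a minimal illustration under stated assumptions: hypothetical noise-free data, a single regressor, and a helper `demean_by_period` defined here for the purpose. For each period, the average across entities is subtracted from y and from x, and the slope is then estimated from the demeaned data without any dummies.

```python
# Sketch of the within transformation for time fixed effects.
# Data are hypothetical and noise-free; x[i][t] is entity i in period t.

BETA = 0.8                         # hypothetical true slope
LAMBDA = [5.0, -2.0, 7.0]          # hypothetical time effects, one per period
x = [[1.0, 4.0, 2.0], [2.0, 6.0, 5.0], [4.0, 5.0, 9.0], [7.0, 9.0, 8.0]]
y = [[BETA * x[i][t] + LAMBDA[t] for t in range(3)] for i in range(4)]

def demean_by_period(panel):
    """Subtract, for every period, the average across entities."""
    n, t_len = len(panel), len(panel[0])
    means = [sum(panel[i][t] for i in range(n)) / n for t in range(t_len)]
    return [[panel[i][t] - means[t] for t in range(t_len)] for i in range(n)]

x_dm, y_dm = demean_by_period(x), demean_by_period(y)

# OLS slope on the demeaned data (no intercept is needed after demeaning)
num = sum(xv * yv for xr, yr in zip(x_dm, y_dm) for xv, yv in zip(xr, yr))
den = sum(xv * xv for xr in x_dm for xv in xr)
beta_within = num / den
print("within estimate of beta:", round(beta_within, 6))
```

Because the time effect is the same for every entity in a given period, it drops out of the demeaned data, which is why no time dummies have to be estimated.
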
The Random Effects Model

• An alternative to the fixed effects model described above is the random effects model, which is sometimes also known as the error components model.
• As with fixed effects, the random effects approach proposes different intercept terms for each entity and again these intercepts are constant over time, with the relationships between the explanatory and explained variables assumed to be the same both cross-sectionally and temporally.

The Random Effects Model

• However, the difference is that under the random effects model, the intercepts for each cross-sectional unit are assumed to arise from a common intercept $\alpha$ (which is the same for all cross-sectional units and over time), plus a random variable $\epsilon_i$ that varies cross-sectionally but is constant over time.
  $y_{it} = \alpha + \beta x_{it} + \omega_{it}, \quad \omega_{it} = \epsilon_i + v_{it}$
• $\epsilon_i$ measures the random deviation of each entity’s intercept term from the common intercept $\alpha$.

How the Random Effects Model Works

• Unlike the fixed effects model, there are no dummy variables to capture the heterogeneity (variation) in the cross-sectional dimension.
• Instead, this occurs via the $\epsilon_i$ terms.
• Note that this framework requires the assumptions that the new cross-sectional error term, $\epsilon_i$, has zero mean, is independent of the individual observation error term $v_{it}$, has constant variance, and is independent of the explanatory variables.

How the Random Effects Model Works

• The parameters ($\alpha$ and the $\beta$ vector) are estimated consistently but inefficiently by OLS, and the conventional formulae would have to be modified as a result of the cross-correlations between error terms for a given cross-sectional unit at different points in time.
• Instead, a generalised least squares (GLS) procedure is usually used. The transformation involved in this GLS procedure is to subtract a weighted mean of the $y_{it}$ over time (i.e. part of the mean rather than the whole mean, as was the case for fixed effects estimation).

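The "part of the mean" in the GLS transformation can be made concrete with a short sketch. It assumes the standard quasi-demeaning weight $\theta = 1 - \sigma_v / \sqrt{T\sigma_\epsilon^2 + \sigma_v^2}$, where $\sigma_v^2$ and $\sigma_\epsilon^2$ are the variances of $v_{it}$ and $\epsilon_i$; the slides do not state this formula, and the variance values and data below are hypothetical. The function names `gls_theta` and `quasi_demean` are chosen here for illustration.

```python
import math

def gls_theta(sigma2_v, sigma2_eps, T):
    """Quasi-demeaning weight: theta = 1 - sigma_v / sqrt(T*sigma_eps^2 + sigma_v^2)."""
    return 1.0 - math.sqrt(sigma2_v) / math.sqrt(T * sigma2_eps + sigma2_v)

def quasi_demean(series, theta):
    """Subtract the fraction theta of the entity's time average from each observation."""
    mean = sum(series) / len(series)
    return [v - theta * mean for v in series]

y_i = [4.0, 6.0, 8.0]     # hypothetical observations on one entity, T = 3
theta = gls_theta(sigma2_v=1.0, sigma2_eps=1.0, T=3)
y_star = quasi_demean(y_i, theta)
print("theta:", theta, "quasi-demeaned y:", y_star)

# Limiting cases: no entity-level variance gives theta = 0 (no demeaning, i.e. pooled OLS);
# a very large entity-level variance pushes theta towards 1 (the fixed effects transform).
theta_pooled = gls_theta(sigma2_v=1.0, sigma2_eps=0.0, T=3)
theta_near_fe = gls_theta(sigma2_v=1.0, sigma2_eps=1e12, T=3)
```

The same transformation would be applied to each explanatory variable before running OLS on the transformed data.
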
Fixed or Random Effects?

• It is often said that the random effects model is more appropriate when the entities in the sample can be thought of as having been randomly selected from the population, but a fixed effects model is more plausible when the entities in the sample effectively constitute the entire population.
• More technically, the transformation involved in the GLS procedure under the random effects approach will not remove the explanatory variables that do not vary over time, and hence their impact can be estimated.

Fixed or Random Effects?

• Also, since there are fewer parameters to be estimated with the random effects model (no dummy variables or within transform to perform), and therefore degrees of freedom are saved, the random effects model should produce more efficient estimation than the fixed effects approach.
• However, the random effects approach has a major drawback, which arises from the fact that it is valid only when the composite error term $\omega_{it}$ is uncorrelated with all of the explanatory variables.

Fixed or Random Effects? (Cont’d)

• This assumption is more stringent than the corresponding one in the fixed effects case, because with random effects we thus require both $\epsilon_i$ and $v_{it}$ to be independent of all of the $x_{it}$.
• This can also be viewed as a consideration of whether any unobserved omitted variables (that were allowed for by having different intercepts for each entity) are uncorrelated with the included explanatory variables. If they are uncorrelated, a random effects approach can be used; otherwise the fixed effects model is preferable.

Fixed or Random Effects? (Cont’d)

• A test for whether this assumption is valid for the random effects estimator is based on a slightly more complex version of the Hausman test.
• If the assumption does not hold, the parameter estimates will be biased and inconsistent.
• To see how this arises, suppose that we have only one explanatory variable, $x_{2it}$, that varies positively with $y_{it}$, and also with the error term, $\omega_{it}$. The estimator will ascribe all of any increase in y to x when in reality some of it arises from the error term, resulting in biased coefficients.

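The bias argument above can be checked numerically with a toy example. The numbers are hypothetical and chosen so the arithmetic is exact; `ols_slope` is a helper written here for a single-regressor OLS slope. In the first case the error term rises with x, in the second it is uncorrelated with x by construction.

```python
def ols_slope(x, y):
    """Slope of a simple regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

TRUE_BETA = 2.0                                # hypothetical true coefficient
x = [1.0, 2.0, 3.0, 4.0, 5.0]

omega_corr = [-1.0, -0.5, 0.0, 0.5, 1.0]       # error term that rises with x
omega_uncorr = [1.0, -2.0, 2.0, -2.0, 1.0]     # error term uncorrelated with x

y_corr = [TRUE_BETA * a + w for a, w in zip(x, omega_corr)]
y_uncorr = [TRUE_BETA * a + w for a, w in zip(x, omega_uncorr)]

slope_corr = ols_slope(x, y_corr)
slope_uncorr = ols_slope(x, y_uncorr)
print("slope with correlated error:  ", slope_corr)
print("slope with uncorrelated error:", slope_uncorr)
```

With the correlated error, the estimated slope absorbs the part of the movement in y that is really due to the error term, so it overstates the true coefficient.
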
Fixed or Random Effects? (Cont’d)

• If the regressors are correlated with the individual effects $\epsilon_i$, the FE estimator is consistent but the RE estimator is not consistent.
• If the regressors are uncorrelated with the individual effects $\epsilon_i$, the FE estimator is still consistent, albeit inefficient, whereas the RE estimator is consistent and efficient.

Fixed or Random Effects? (Cont’d)

• Step 1: run a fixed effects model
    * fixed effects (within) estimator
    xtreg lnfdi lngdphome lngdphost, fe
    estimates store fe
• Step 2: run a random effects model
    * random effects (GLS) estimator
    xtreg lnfdi lngdphome lngdphost, re
    estimates store ran
• Step 3: conduct Hausman’s test
    * consistent (FE) estimates first, efficient (RE) estimates second
    hausman fe ran
• Step 4: make a decision as to which specification you should use. If the corresponding probability is < 0.05, the Hausman test’s null hypothesis that the RE estimator is consistent is soundly rejected.
• In that case, the individual effects do appear to be correlated with the regressors.

Fixed or Random Effects? (Cont’d)

• Step 1: run a fixed effects model
    xtreg trade_openess lnarea landlocked lnpop lngdp_pc lntot, fe
    estimates store fix
• Step 2: run a random effects model
    xtreg trade_openess lnarea landlocked lnpop lngdp_pc lntot, re
    estimates store ran
• Step 3: conduct Hausman’s test
    hausman fix ran
• Step 4: make a decision as to which specification you should use. If the corresponding probability is < 0.05, the Hausman test’s null hypothesis that the RE estimator is consistent is rejected, i.e., the individual effects are correlated with the regressors.

Fixed or Random Effects? (Cont’d)

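Using the macro data, run the following Hausman test:

    * rename the entity identifier and order the data
    rename exporter country
    sort country year
    sort id year
    * declare the panel structure: id is the entity, year is the time index
    tsset id year
    * fixed effects estimates, stored as "fix"
    xtreg trade_openess lnarea landlocked lnpop lngdp_pc lntot, fe
    estimates store fix
    * random effects estimates, stored as "ran"
    xtreg trade_openess lnarea landlocked lnpop lngdp_pc lntot, re
    estimates store ran
    * Hausman test: consistent (FE) estimates first, efficient (RE) second
    hausman fix ran
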
Fixed or Random Effects? (Cont’d)

                 ---- Coefficients ----
             |      (b)          (B)          (b-B)      sqrt(diag(V_b-V_B))
             |      fix          ran        Difference          S.E.
-------------+--------------------------------------------------------------
      lnarea |  -.1430533    -.1846615      .0416082         .8741601
       lnpop |   .6770179     .0990702      .5779477         .0751476
    lngdp_pc |   .2283795     .2091023      .0192772          .037618
       lntot |  -.0824428     .0548649     -.1373078          .014989
----------------------------------------------------------------------------
    b = consistent under Ho and Ha; obtained from xtreg
    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test: Ho: difference in coefficients not systematic

        chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                = 75.51
      Prob>chi2 = 0.0000
      (V_b-V_B is not positive definite)

• Conclusion: the Hausman test’s null hypothesis that the RE estimator is consistent is rejected, i.e., country fixed effects do appear to be correlated with the regressors. We therefore apply the fixed effects model.

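The reported probability can be checked from the test statistic alone. The sketch below is a minimal illustration that uses the closed-form upper-tail probability of a chi-squared distribution, which is available only for even degrees of freedom; the function name `chi2_sf_even_df` is chosen here and is not part of Stata or of any library. It takes the values chi2(4) = 75.51 from the output above and applies the 0.05 decision rule.

```python
import math

def chi2_sf_even_df(stat, df):
    """Upper-tail probability P(X > stat) for chi-squared X; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    # sum_{j=0}^{df/2 - 1} (stat/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(75.51, 4)      # statistic and df taken from the Hausman output
use_fixed_effects = p_value < 0.05       # reject the null that RE is consistent
print(f"p-value = {p_value:.3e}; prefer fixed effects: {use_fixed_effects}")
```

The p-value is far below 0.05, which is why Stata displays it as 0.0000 to four decimal places.
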
Dynamic Models

• All of the models we have considered so far have been static, e.g.
  $y_t = \beta_1 + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + u_t$
• But we can easily extend this analysis to the case where the current value of $y_t$ depends on previous values of y or one of the x’s, e.g.
  $y_t = \beta_1 + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + \gamma_1 y_{t-1} + \gamma_2 x_{2t-1} + \cdots + \gamma_k x_{kt-1} + u_t$
• We could extend the model even further by adding extra lags, e.g. $x_{2t-2}$, $y_{t-3}$.

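Building the regressors for such a dynamic model amounts to lining up each observation with its own lagged values. The sketch below uses short hypothetical series and a helper `lag` defined here for illustration; it shows that including one lag costs the first observation of the sample.

```python
def lag(series, k=1):
    """Return the series shifted back k periods; the first k entries are missing (None)."""
    if k <= 0:
        return list(series)
    return [None] * k + list(series[:-k])

y = [10.0, 12.0, 11.0, 15.0]     # hypothetical dependent variable
x2 = [1.0, 2.0, 3.0, 4.0]        # hypothetical explanatory variable

y_lag1, x2_lag1 = lag(y), lag(x2)

# Keep only the periods for which every regressor is available:
# each row is (y_t, x2_t, y_{t-1}, x2_{t-1})
rows = [
    (y[t], x2[t], y_lag1[t], x2_lag1[t])
    for t in range(len(y))
    if y_lag1[t] is not None and x2_lag1[t] is not None
]
print(rows)
```

Adding a second lag would drop one further observation from the start of the sample.
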
Why Might We Want/Need to Include Lags in a Regression?

• Inertia of the dependent variable
• Over-reactions
• However, other problems with the regression could cause the null hypothesis of no autocorrelation to be rejected:
  • Omission of relevant variables, which are themselves autocorrelated.
  • A “misspecification” error committed by using an inappropriate functional form.
  • Autocorrelation resulting from unparameterised seasonality.

Models in First Difference Form

• Another way to sometimes deal with the problem of autocorrelation is to switch to a model in first differences.
• Denote the first difference of $y_t$, i.e. $y_t - y_{t-1}$, as $\Delta y_t$; similarly for the x-variables, $\Delta x_{2t} = x_{2t} - x_{2t-1}$, etc.
• The model would now be
  $\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \cdots + \beta_k \Delta x_{kt} + u_t$
• Sometimes the change in y is purported to depend on previous values of y or $x_t$ as well as changes in x:
  $\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \beta_3 x_{2t-1} + \beta_4 y_{t-1} + u_t$

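First differencing is a simple transformation of the data, sketched here on short hypothetical series. The helper `first_difference` is defined for illustration only. The last step lines up $\Delta y_t$ with $\Delta x_{2t}$, $x_{2t-1}$ and $y_{t-1}$, the regressors that appear in the final equation above.

```python
def first_difference(series):
    """Return [s_t - s_{t-1}] for t = 1, ..., T-1; one observation is lost."""
    return [b - a for a, b in zip(series[:-1], series[1:])]

y = [10.0, 12.0, 11.0, 15.0]     # hypothetical dependent variable
x2 = [1.0, 2.0, 3.0, 4.0]        # hypothetical explanatory variable

dy = first_difference(y)
dx2 = first_difference(x2)

# Each row is (dy_t, dx2_t, x2_{t-1}, y_{t-1}) for t = 1, ..., T-1
rows = list(zip(dy, dx2, x2[:-1], y[:-1]))
print(rows)
```
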
