Panel 101

This document provides an overview of panel data analysis using Stata. It discusses fixed effects models, which control for time-invariant characteristics of observational units. The document explains that panel data is structured with entities observed over time, and discusses transforming wide format data to long format for analysis in Stata. It also covers setting up the data as a panel in Stata once in long format.


Panel Data Analysis

Fixed and Random Effects

using Stata
(v. 6.0)

Oscar Torres-Reyna
otorres@princeton.edu

December 2007 http://www.princeton.edu/~otorres/

What panel data looks like…
Panel data (also known as longitudinal or cross-sectional time-series data)
is a dataset in which the behavior of entities (i) is observed across time (t).

(Xit, Yit), i=1,…n; t=1,…T

These entities could be states, companies, families, individuals, countries, etc.

Entity  Year  Y   X1  X2  X3  …
1       1     #   #   #   #   …
1       2     #   #   #   #   …
1       3     #   #   #   #   …
:       :     :   :   :   :   :
2       1     #   #   #   #   …
2       2     #   #   #   #   …
2       3     #   #   #   #   …
:       :     :   :   :   :   :
3       1     #   #   #   #   …
3       2     #   #   #   #   …
3       3     #   #   #   #   …

OTR See Stock and Watson, Introduction to Econometrics, chapter 10 “Regression with Panel Data”. 2
Usage
Panel data deals with omitted variable bias due to heterogeneity in
the data. It does this by controlling for variables that we cannot
observe, that are not available, and/or that cannot be measured but
are correlated with the predictors. Two types:
1. Variables that do not change over time but vary across entities
(cultural factors, differences in business practices across
companies, etc.) → Entity fixed effects.
2. Variables that change over time but not across entities (e.g.
national policies, federal regulations, international
agreements, etc.) → Time fixed effects.
Some drawbacks when working with panel data are data collection
issues (e.g. sampling design, coverage), non-response in the case of
micro panels, or cross-country dependency in the case of macro
panels (i.e. correlation between countries).
For a comprehensive list of advantages and disadvantages of panel data see Baltagi, Econometric
Analysis of Panel Data (chapter 1).
FIXED-EFFECTS MODEL
(Covariance Model, Within
Estimator, Individual Dummy
Variable Model, Least Squares
Dummy Variable Model)

The fixed effects idea
Entities have individual characteristics that may
or may not influence the outcome and/or
predictor variables. For example, the business
practices of a company may influence its stock
price or level of spending; attitudes or policies
towards guns in a particular state may affect its
levels of gun violence. Business practices and
cultural or political variables are, most of the
time, unavailable or hard to measure.
The fixed effects idea
Since individual characteristics are not random
and may impact the predictor or outcome
variables, we need to control for them. In this
way, the effect of the predictors will not be
influenced by those fixed characteristics.*
Entity fixed effects assume that the entity’s error
term is correlated with the predictor variables.
However, one entity’s fixed effects cannot be
correlated with another entity’s.
* See Stock and Watson, 2003, p. 289-290
The model (1)
The entity fixed effects regression model is
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡
i = 1…n ; t = 1….T
Where:
𝑌𝑖𝑡 outcome variable (for entity i at time t).
𝛼𝑖 is the unknown intercept for each entity (n entity-specific intercepts).
𝑋𝑖𝑡 is a vector of predictors (for entity i at time t) .
𝑢𝑖 within-entity error term ; 𝑒𝑖𝑡 overall error term.

Interpretation of the 𝛽 coefficient: for a given entity, when a
predictor changes one unit over time, the outcome will
increase/decrease by 𝛽 units (assuming no transformation is
applied).* Here, 𝛽 represents a common effect across entities
controlling for individual heterogeneity.
* See Bartels, Brandon, “Beyond ‘Fixed Versus Random Effects’: A framework for improving substantive and statistical
analysis of panel, time-series cross-sectional, and multilevel data”, Stony Brook University, working paper, 2008
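To see why the entity-specific terms stop influencing 𝛽, it may help to write out the within (demeaning) transformation. This is a sketch under one assumption not stated on the slide: the two entity-level terms are collected into a single time-invariant constant, written here as the hypothetical symbol c_i = 𝛼𝑖 + 𝑢𝑖. Bars denote averages over time for entity i.

```latex
\begin{aligned}
Y_{it} &= c_i + \beta X_{it} + e_{it}, \qquad c_i \equiv \alpha_i + u_i \\
\bar{Y}_i &= c_i + \beta \bar{X}_i + \bar{e}_i \\
Y_{it} - \bar{Y}_i &= \beta\,(X_{it} - \bar{X}_i) + (e_{it} - \bar{e}_i)
\end{aligned}
```

Subtracting the entity average removes c_i, so 𝛽 is identified only from variation within each entity over time.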
The model (2)
The entity and time fixed effects regression model is
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝛿𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡
i = 1…n ; t = 1….T
Where:
𝑌𝑖𝑡 outcome variable (for entity i at time t).
𝛼𝑖 is the unknown intercept for each entity (n entity-specific intercepts).
𝑋𝑖𝑡 is a vector of predictors (for entity i at time t) .
𝛿𝑡 is the unknown coefficient for the time regressors (t).
𝑢𝑖 within-entity error term ; 𝑒𝑖𝑡 overall error term.

Interpretation of a 𝛽 coefficient: for a given entity, when a

columns.
• Entity and time in rows.
This format is known as long form.

1  3  #  #  #  #  …
:  :  :  :  :  :  :
2  1  #  #  #  #  …
2  2  #  #  #  #  …
2  3  #  #  #  #  …
:  :  :  :  :  :  :
3  1  #  #  #  #  …
3  2  #  #  #  #  …
3  3  #  #  #  #  …
Wide form data (time in columns)
If your dataset is in wide format (either entities or time periods
are in columns), you need to reshape it to long format
(you can do this in Stata).
Beware that Stata does not accept numbers as column
names (variable names cannot start with a digit). You
need to add a letter to the numbers before importing
into Stata. If you have something like the following:
Wide form data (time in columns)
Add a letter to the numeric column names, for example,
an ‘x’ before the year:

Import into Stata

Reshaping from wide to long
Once in Stata, you can reshape it using the command reshape:

gen id = _n
order id
reshape long x , i(id) j(year)
rename x gdp

Type help reshape for more details.
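As a minimal sketch of the steps above, the following builds a toy wide dataset and reshapes it. The country names and values are made up for illustration; only the commands come from the slide.

```stata
* Toy wide dataset: one row per country, one column per year (values are made up)
clear
input str10 country x2000 x2001 x2002
"A" 1.0 1.1 1.2
"B" 2.0 2.1 2.2
end

gen id = _n                      // numeric row identifier, one per entity
order id
reshape long x, i(id) j(year)    // x2000 x2001 x2002 become x, with year = 2000, 2001, 2002
rename x gdp
list, sepby(id)                  // 6 rows: 2 countries x 3 years
```

The stub name passed to reshape long (here x) is the letter added in front of the year, and j(year) names the new variable that receives the numeric suffix.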
Wide form data (entity in columns)
If the wide format data has the entities in columns
and time in rows, like this example:
Wide form data (entity in columns)
Import it into Stata:

Reshape wide to long format
Once in Stata, you can reshape it using the command reshape:

* Adding the prefix ‘gdp’ to column names.
* Command ‘renvars’ is user-written, you need to install it, see note below.
renvars A-G, pref(gdp)
gen id = _n
order id
reshape long gdp , i(id) j(country) str

Type help reshape for more details.

You need to install renvars, type:
search renvars
Click on the link for dm88_* then install.
Setting data as panel
Once the data is in long form, we need to set it as panel so we can
use Stata’s panel data xt commands and the time series operators.
Using the example from the previous page type:

xtset country year

string variables not allowed in varlist;
Country is a string variable

Given the error, we need to have ‘country’ in numeric format.
Type

encode country, gen(country1)

Then using ‘country1’ type

xtset country1 year
Panel variable: country1 (strongly balanced)
Time variable: year, 1995 to 2005
Delta: 1 unit

Balanced panel: all entities are observed across all times.
Unbalanced panel: some entities are not observed in some years.
Stata algorithms automatically account for this.
Assign numbers to strings

The encode command used in the previous slide assigns a
number to each value of the string variable, in
alphabetical order.
The new variable is a labeled variable where the labels are
the original strings assigned to specific numbers.
Notice that string variables have the color red, while
labeled variables have color blue.
Type help encode for more info.
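A short sketch of what encode does, using three made-up country names (the data here is illustrative, not the slide's dataset):

```stata
* Toy string variable, deliberately not in alphabetical order
clear
input str10 country
"Peru"
"Chile"
"Brazil"
end

encode country, gen(country1)
list country country1, nolabel   // nolabel shows the underlying numeric codes
label list country1              // value label: 1 Brazil, 2 Chile, 3 Peru
```

Because the codes follow alphabetical order rather than row order, the number attached to an entity can change if the set of entities changes; the labels are what identify each entity.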
Visualizing panel data
Once the data is set as panel, you can use a series of xt commands
to analyze it. For more information type:
help xt
A useful visualization command is xtline, type:
xtline gdp

Visualizing panel data
* All in one, type:
xtline gdp, overlay

Data example
The data used in the following slides was extracted from the World
Development Indicators database:
https://databank.worldbank.org/source/world-development-indicators

Selected variables since 2000, all countries only:

• GDP per capita (constant 2015 US$)
• Exports of goods and services (constant 2015 US$)
• Imports of goods and services (constant 2015 US$)
• Labor force, total

Data was further cleaned to remove regions, subregions, and missing values
across years and variables, resulting in 126 countries.
Variable ‘trade’ was created by adding imports + exports.
Data example – histograms

hist gdppc
hist labor
hist trade

Data example – transformations

To log-transform a variable use the function ln():

gen ln_gdppc = ln(gdppc)
gen ln_labor = ln(labor)
gen ln_trade = ln(trade)

If the variable has zero or negative values, you need to add a value high
enough so the minimum value is over zero (preferably 1, since the
natural log of 1 is zero). For example, if the lowest value in ‘varX’
is -1, then type:

gen ln_varX = ln(varX + 2)
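The shift can also be computed from the data instead of typed by hand. The sketch below assumes a toy variable named varX (values made up) and uses the minimum stored by summarize in r(min):

```stata
* Toy variable with a negative minimum (values are made up)
clear
input varX
-1
0
3
10
end

quietly summarize varX
gen ln_varX = ln(varX - r(min) + 1)   // the minimum is mapped to ln(1) = 0
list
```

With a minimum of -1 this is the same as ln(varX + 2). Note that adding a constant before taking logs changes the interpretation of the coefficients, so the shift should be reported alongside the results.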
Data example – histograms

hist ln_gdppc
hist ln_labor
hist ln_trade

Setting data as panel
The panel variable (country) is in string format (red color; type
browse country to see it). We need to convert it to labeled
format (numbers with labels, blue color):

encode country, gen(country1)

Then using ‘country1’ type

xtset country1 year
Panel variable: country1 (strongly balanced)
Time variable: year, 2000 to 2021
Delta: 1 unit
Descriptive statistics
. sum gdppc trade labor // Pooled data

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       gdppc |      2,772    14925.78       19561   261.0194   112417.9
       trade |      2,772    2.39e+11    5.33e+11   1.28e+08   5.58e+12
       labor |      2,772    1.70e+07    4.54e+07      85987   4.89e+08

. xtsum gdppc trade labor // Heterogeneity by panel and time

Variable         |      Mean   Std. dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
gdppc    overall |  14925.78      19561   261.0194   112417.9 |     N =    2772
         between |             19404.61   293.4895   104003.7 |     n =     126
         within  |             2991.204  -14918.74   52165.38 |     T =      22
                 |                                            |
trade    overall |  2.39e+11   5.33e+11   1.28e+08   5.58e+12 |     N =    2772
         between |             5.20e+11   3.14e+08   4.33e+12 |     n =     126
         within  |             1.27e+11  -1.14e+12   1.49e+12 |     T =      22
                 |                                            |
labor    overall |  1.70e+07   4.54e+07      85987   4.89e+08 |     N =    2772
         between |             4.54e+07     132657   4.53e+08 |     n =     126
         within  |              3154440  -4.24e+07   5.27e+07 |     T =      22

See https://www.stata.com/manuals/xtxtsum.pdf
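The negative minimum in the "within" rows can look odd for variables that are always positive. It arises because xtsum reports the within variation as each value minus its entity mean plus the overall mean. A minimal sketch on toy data (the variable names id, year, x and the values are made up) reproduces the within minimum and maximum by hand:

```stata
* Toy panel: 2 entities x 3 years (values are made up)
clear
input id year x
1 2000  10
1 2001  12
1 2002  14
2 2000 100
2 2001 110
2 2002 120
end
xtset id year
xtsum x

* Rebuild the "within" series: deviation from the entity mean, re-centered on the overall mean
bysort id: egen x_mean_i = mean(x)
egen x_mean = mean(x)
gen x_within = x - x_mean_i + x_mean
summarize x_within               // min and max should match the "within" row of xtsum
```

The "between" row summarizes the entity means (one value per entity), so it describes variation across entities rather than over time.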

Fixed effects regression using xtreg, fe
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡

. xtreg ln_gdppc ln_trade ln_labor, fe robust

Fixed-effects (within) regression               Number of obs     =      2,772
Group variable: country1                        Number of groups  =        126

R-squared:                                      Obs per group:
     Within  = 0.6267                                         min =         22
     Between = 0.3872                                         avg =       22.0
     Overall = 0.3906                                         max =         22

                                                F(2,125)          =      87.57
corr(u_i, Xb) = 0.1067                          Prob > F          =     0.0000

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |   .3603947   .0737076     4.89   0.000     .2145182    .5062712
    ln_labor |    .053167   .1608747     0.33   0.742     -.265224     .371558
       _cons |  -.9384681   1.075791    -0.87   0.385    -3.067592    1.190656
-------------+----------------------------------------------------------------
     sigma_u |  1.1155513
     sigma_e |  .10989953
         rho |  .99038791   (fraction of variance due to u_i)
------------------------------------------------------------------------------

• Command: ln_gdppc is the outcome; ln_trade and ln_labor are the predictor(s); fe is the fixed effects option; robust controls for heteroskedasticity.
• Number of obs is the total number of cases (rows); Number of groups is the total number of entities (i).
• corr(u_i, Xb): the within-entity errors ui are correlated with the regressors in the fixed effects model.
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time. In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).
• sigma_u = sd of residuals within groups 𝑢𝑖; sigma_e = sd of residuals (overall error term) 𝑒𝑖𝑡.
• Intraclass correlation (rho) shows how much of the variance in the output is explained by the difference across entities: rho = sigma_u^2 / (sigma_u^2 + sigma_e^2). In this example it is 99%.
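As a quick check of the rho formula, the values of sigma_u and sigma_e reported in the output above can be plugged in directly. The scalar names below are only for illustration:

```stata
* Values copied from the xtreg, fe output on this slide
scalar sigma_u = 1.1155513
scalar sigma_e = .10989953
display sigma_u^2 / (sigma_u^2 + sigma_e^2)   // should be close to the reported rho, about .99

* Right after running xtreg, the same quantities are available as stored results:
* display e(sigma_u)^2 / (e(sigma_u)^2 + e(sigma_e)^2)
* display e(rho)
```

A rho this close to 1 says that almost all of the unexplained variation in the outcome sits between countries rather than within a country over time.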
Entity and time fixed effects regression using xtreg, fe
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝛿𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡

. xtreg ln_gdppc ln_trade ln_labor i.year, fe robust

Fixed-effects (within) regression               Number of obs     =      2,772
Group variable: country1                        Number of groups  =        126

R-squared:                                      Obs per group:
     Within  = 0.7083                                         min =         22
     Between = 0.7977                                         avg =       22.0
     Overall = 0.7581                                         max =         22

                                                F(23,125)         =      34.28
corr(u_i, Xb) = 0.7525                          Prob > F          =     0.0000

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |   .2401329   .0695213     3.45   0.001     .1025416    .3777242
    ln_labor |  -.2958837    .081081    -3.65   0.000     -.456353   -.1354145
             |
        year |
       2001  |   .0119809   .0042779     2.80   0.006     .0035144    .0204475
        ...  |        ...        ...      ...     ...          ...         ...
        ...  |        ...        ...      ...     ...          ...         ...
       2021  |   .2878247   .0705454     4.08   0.000     .1482065    .4274428
             |
       _cons |   7.213881   1.961627     3.68   0.000     3.331578    11.09619
-------------+----------------------------------------------------------------
     sigma_u |  1.0561892
     sigma_e |  .09753735
         rho |  .99154389   (fraction of variance due to u_i)
------------------------------------------------------------------------------

• Command: ln_gdppc is the outcome; ln_trade and ln_labor are the predictor(s); i.year adds the time fixed effects; fe is the fixed effects option; robust controls for heteroskedasticity.
• Number of obs is the total number of cases (rows); Number of groups is the total number of entities (i).
• corr(u_i, Xb): the within-entity errors ui are correlated with the regressors in the fixed effects model.
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time. In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).
• sigma_u = sd of residuals within groups 𝑢𝑖; sigma_e = sd of residuals (overall error term) 𝑒𝑖𝑡.
• Intraclass correlation (rho) shows how much of the variance in the output is explained by the difference across entities: rho = sigma_u^2 / (sigma_u^2 + sigma_e^2). In this example it is 99%.
Fixed effects regression using xtreg, fe (with lags on predictors)
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡−1 + 𝑢𝑖 + 𝑒𝑖𝑡

. xtreg ln_gdppc l1.ln_trade l1.ln_labor, fe robust

Fixed-effects (within) regression               Number of obs     =      2,646
Group variable: country1                        Number of groups  =        126

R-squared:                                      Obs per group:
     Within  = 0.6054                                         min =         21
     Between = 0.3771                                         avg =       21.0
     Overall = 0.3799                                         max =         21

                                                F(2,125)          =      81.17
corr(u_i, Xb) = 0.1265                          Prob > F          =     0.0000

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |
         L1. |   .3385586   .0703993     4.81   0.000     .1992297    .4778875
             |
    ln_labor |
         L1. |   .0581167   .1566956     0.37   0.711    -.2520033    .3682367
             |
       _cons |  -.4600892   1.082489    -0.43   0.672     -2.60247    1.682291
-------------+----------------------------------------------------------------
     sigma_u |  1.1260807
     sigma_e |  .10685653
         rho |  .99107579   (fraction of variance due to u_i)
------------------------------------------------------------------------------

• Command: ln_gdppc is the outcome; l1.ln_trade and l1.ln_labor are the predictor(s), lagged one year; fe is the fixed effects option; robust controls for heteroskedasticity.
• Number of obs is the total number of cases (rows); Number of groups is the total number of entities (i).
• corr(u_i, Xb): the within-entity errors ui are correlated with the regressors in the fixed effects model.
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time (a year before, “L1.”). In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time (a year before, “L1.”), the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).
• sigma_u = sd of residuals within groups 𝑢𝑖; sigma_e = sd of residuals (overall error term) 𝑒𝑖𝑡.
• Intraclass correlation (rho) shows how much of the variance in the output is explained by the difference across entities: rho = sigma_u^2 / (sigma_u^2 + sigma_e^2). In this example it is about 99%.
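The drop from 2,772 to 2,646 observations follows from how the lag operator works: the first year of each of the 126 panels has no previous year, so its lag is missing and the row is excluded. A minimal sketch on toy data (variable names id, year, x and the values are made up):

```stata
* Toy panel: 2 entities x 3 years (values are made up)
clear
input id year x
1 2000 10
1 2001 12
1 2002 15
2 2000  7
2 2001  9
2 2002  8
end

xtset id year                    // required before using time-series operators such as L1.
gen x_lag = L1.x                 // previous year's value within the same entity
list, sepby(id)                  // x_lag is missing in the first year of each entity
```

Lags are taken within each panel, so the last value of one entity never leaks into the first row of the next.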
Entity fixed effects regression using reghdfe
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡

. reghdfe ln_gdppc ln_trade ln_labor , absorb(country1) vce(cluster country1)

(MWFE estimator converged in 1 iterations)

HDFE Linear regression                            Number of obs   =      2,772
Absorbing 1 HDFE group                            F(   2,    125) =      87.57
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.9943
                                                  Adj R-squared   =     0.9940
                                                  Within R-sq.    =     0.6267
Number of clusters (country1) =        126        Root MSE        =     0.1099

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |   .3603947   .0737076     4.89   0.000     .2145182    .5062712
    ln_labor |    .053167   .1608747     0.33   0.742     -.265224     .371558
       _cons |  -.9384681   1.075791    -0.87   0.385    -3.067592    1.190656
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
    country1 |       126         126           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

• Command: ln_gdppc is the outcome; ln_trade and ln_labor are the predictor(s); absorb(country1) is the fixed effects option; vce(cluster country1) controls for correlation within panels.
• Number of obs is the total number of cases (rows); Number of clusters is the total number of entities (i).
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• R-squared shows the percent of the variance in the outcome explained by the model. The Adj R-squared accounts for the number of variables and their significant contribution to explaining the variation in the output variable.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time. In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).

NOTE: Use reghdfe when controlling for multiple fixed effects or when xtreg,fe cannot run due to the number
of panels.
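reghdfe is a user-written command, so it has to be installed before its first use. A minimal sketch, assuming the packages are obtained from SSC (reghdfe depends on the ftools package):

```stata
* One-time installation from SSC (requires an internet connection)
ssc install ftools
ssc install reghdfe
```

Type help reghdfe after installing for the full list of options.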
Entity and time fixed effects regression using reghdfe
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡 + 𝛿𝑡 + 𝑢𝑖 + 𝑒𝑖𝑡

. reghdfe ln_gdppc ln_trade ln_labor , absorb(country1 year) vce(cluster country1)

(MWFE estimator converged in 2 iterations)

HDFE Linear regression                            Number of obs   =      2,772
Absorbing 2 HDFE groups                           F(   2,    125) =      11.37
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.9955
                                                  Adj R-squared   =     0.9953
                                                  Within R-sq.    =     0.3050
Number of clusters (country1) =        126        Root MSE        =     0.0976

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |   .2401329   .0695213     3.45   0.001     .1025416    .3777242
    ln_labor |  -.2958837    .081081    -3.65   0.000     -.456353   -.1354145
       _cons |   7.381277   1.999695     3.69   0.000     3.423632    11.33892
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
    country1 |       126         126           0    *|
        year |        22           0          22     |
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

• Command: ln_gdppc is the outcome; ln_trade and ln_labor are the predictor(s); absorb(country1 year) is the fixed effects option (entity and time); vce(cluster country1) controls for correlation within panels.
• Number of obs is the total number of cases (rows); Number of clusters is the total number of entities (i).
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• R-squared shows the percent of the variance in the outcome explained by the model. The Adj R-squared accounts for the number of variables and their significant contribution to explaining the variation in the output variable.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time. In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).
Entity fixed effects regression with lags using reghdfe
𝑌𝑖𝑡 = 𝛼𝑖 + 𝛽𝑋𝑖𝑡−1 + 𝑢𝑖 + 𝑒𝑖𝑡

. reghdfe ln_gdppc l1.ln_trade l1.ln_labor , absorb(country1 ) vce(cluster country1)

(MWFE estimator converged in 1 iterations)

HDFE Linear regression                            Number of obs   =      2,646
Absorbing 1 HDFE group                            F(   2,    125) =      81.17
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.9946
                                                  Adj R-squared   =     0.9943
                                                  Within R-sq.    =     0.6054
Number of clusters (country1) =        126        Root MSE        =     0.1069

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |
         L1. |   .3385586   .0703993     4.81   0.000     .1992297    .4778875
             |
    ln_labor |
         L1. |   .0581167   .1566956     0.37   0.711    -.2520033    .3682367
             |
       _cons |  -.4600892   1.082489    -0.43   0.672     -2.60247    1.682291
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
    country1 |       126         126           0    *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation

• Command: ln_gdppc is the outcome; l1.ln_trade and l1.ln_labor are the predictor(s), lagged one year; absorb(country1) is the fixed effects option; vce(cluster country1) controls for correlation within panels.
• Number of obs is the total number of cases (rows); Number of clusters is the total number of entities (i).
• Prob > F: if this number is < 0.05 then your model is ok. This is an F-test to see whether all the coefficients in the model are jointly different from zero.
• R-squared shows the percent of the variance in the outcome explained by the model. The Adj R-squared accounts for the number of variables and their significant contribution to explaining the variation in the output variable.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time. In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases 1% over time, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its t-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).

NOTE: you must type xtset country1 year before using lags in reghdfe.
A note on fixed effects
“...The fixed-effects model controls for all time-invariant
differences between the individuals, so the estimated coefficients
of the fixed-effects models cannot be biased because of omitted
time-invariant characteristics...[like culture, religion, gender, race,
etc].
One side effect of the features of fixed-effects models is that they
cannot be used to investigate time-invariant causes of the
dependent variables. Technically, time-invariant characteristics of
the individuals are perfectly collinear with the person [or entity]
dummies. Substantively, fixed-effects models are designed to
study the causes of changes within a person [or entity]. A time-
invariant characteristic cannot cause such a change, because it is
constant for each person.” [(Underline is mine) Kohler, Ulrich,
Frauke Kreuter, Data Analysis Using Stata, 2nd ed., p.245]

RANDOM-EFFECTS MODEL
(Random Intercept, Partial
Pooling Model)

The random effects idea
The rationale behind the random effects model is that, unlike the
fixed effects model, the variation across entities is assumed
to be random and uncorrelated with the predictor or
independent variables included in the model:
“...the crucial distinction between fixed and random effects is
whether the unobserved individual effect embodies elements that
are correlated with the regressors in the model, not whether these
effects are stochastic or not” [Greene, 2008, p.183]
If you have reason to believe that differences across entities
have some influence on your dependent variable then you
should use random effects. An advantage of random effects is
that you can include time-invariant variables (e.g. gender). In
the fixed effects model these variables are absorbed by the
intercept.
The random effects idea
Random effects assume that the entity’s error term is not
correlated with the predictors, which allows for time-
invariant variables to play a role as explanatory variables.
In random effects you need to specify those individual
characteristics that may or may not influence the
predictor variables. The problem with this is that some
variables may not be available, therefore leading to
omitted variable bias in the model.
RE allows you to generalize the inferences beyond the sample
used in the model.
Random effects regression using xtreg, re
𝑌𝑖𝑡 = 𝛼 + 𝛽𝑋𝑖𝑡 + 𝑢𝑖𝑡 + 𝑒𝑖𝑡

. xtreg ln_gdppc ln_trade ln_labor, re robust

Random-effects GLS regression                   Number of obs     =      2,772
Group variable: country1                        Number of groups  =        126

R-squared:                                      Obs per group:
     Within  = 0.6110                                         min =         22
     Between = 0.7295                                         avg =       22.0
     Overall = 0.7212                                         max =         22

                                                Wald chi2(2)      =     192.71
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 126 clusters in country1)
------------------------------------------------------------------------------
             |               Robust
    ln_gdppc | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
    ln_trade |   .4175909   .0760404     5.49   0.000     .2685543    .5666274
    ln_labor |  -.1597685   .1312262    -1.22   0.223    -.4169671    .0974302
       _cons |   .9295612   .6361615     1.46   0.144    -.3172923    2.176415
-------------+----------------------------------------------------------------
     sigma_u |  .41594682
     sigma_e |  .10989953
         rho |  .93474564   (fraction of variance due to u_i)
------------------------------------------------------------------------------

• Command: ln_gdppc is the outcome; ln_trade and ln_labor are the predictor(s); re is the random effects option; robust controls for heteroskedasticity.
• Number of obs is the total number of cases (rows); Number of groups is the total number of entities (i).
• corr(u_i, X) = 0 (assumed): the between-entity errors uit are uncorrelated with the regressors in the random effects model.
• Prob > chi2: if this number is < 0.05 then your model is ok. This is a Wald chi-squared test to see whether all the coefficients in the model are jointly different from zero.
• Beta coefficients indicate the change in the output (y) when the predictors change one unit over time and across entities (average effect). In this example, all the variables are log-transformed, so the interpretation is: when the predictor increases, on average, 1%, the output (y) changes 𝛽% (elasticity).
• Two-tail p-values test the hypothesis that each coefficient is different from 0 (according to its z-value). A value lower than 0.05 rejects the null, and we conclude that the predictor has a significant effect on the outcome (at the 5% significance level).
• sigma_u = sd of residuals within groups 𝑢𝑖; sigma_e = sd of residuals (overall error term) 𝑒𝑖𝑡.
• Intraclass correlation (rho) shows how much of the variance in the output is explained by the difference across entities: rho = sigma_u^2 / (sigma_u^2 + sigma_e^2). In this example it is about 93%.
FIXED OR RANDOM?

Which to choose?
Whenever there is a clear idea that individual characteristics of
each entity or group affect the regressors, use fixed effects. An
example is macroeconomic data collected for most countries
over time. There might be a good reason to believe that
countries’ economic performance may be affected by their
own internal characteristics: type of government, political
environment, cultural characteristics, type of public policies,
etc.
Random effects is used whenever there is reason to believe
that individual characteristics have no effect on the regressors
(uncorrelated).
Which to choose?
The Hausman test checks whether the individual characteristics are correlated with the regressors
(see Greene, 2008, chapter 9). The null hypothesis is that they are not (random effects).

xtreg ln_gdppc ln_trade ln_labor, fe
estimates store fixed
xtreg ln_gdppc ln_trade ln_labor, re
estimates store random
hausman fixed random, sigmamore

. hausman fixed random, sigmamore

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference       Std. err.
-------------+----------------------------------------------------------------
    ln_trade |    .3603947     .4175909       -.0571962        .0026039
    ln_labor |     .053167    -.1597685        .2129354         .012825
------------------------------------------------------------------------------
                          b = Consistent under H0 and Ha; obtained from xtreg.
           B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

    chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            = 484.43
Prob > chi2 = 0.0000

If Prob > chi2 is < 0.05, use fixed effects.
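The same sequence can be tried end to end on one of Stata's example datasets. This is only a sketch: it assumes the nlswork dataset loaded with webuse (which needs an internet connection), and the variables ln_wage, tenure and hours belong to that dataset, not to the data used in these slides.

```stata
* Hausman test on Stata's nlswork example data (not the slides' dataset)
webuse nlswork, clear
xtset idcode year

xtreg ln_wage tenure hours, fe
estimates store fixed
xtreg ln_wage tenure hours, re
estimates store random

hausman fixed random, sigmamore
display "chi2 = " r(chi2) "   p-value = " r(p)   // stored results left behind by hausman
```

The fixed effects model is listed first because it is the estimator that stays consistent under both hypotheses; reversing the order changes what the test compares.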
TESTS / DIAGNOSTICS

Do we need time fixed effects?
To see if time fixed effects are needed when running a FE model use
the command testparm. It is a joint F-test to if all years jointly
equal to 0 (type help testparm for more details).

xtreg ln_gdppc ln_trade ln_labor i.year, fe robust

testparm i.year
. testparm i.year

 ( 1)  2001.year = 0
 ( 2)  2002.year = 0
 ( 3)  2003.year = 0
 ( 4)  2004.year = 0
 ( 5)  2005.year = 0
 ( 6)  2006.year = 0
 ( 7)  2007.year = 0
 ( 8)  2008.year = 0
 ( 9)  2009.year = 0
 (10)  2010.year = 0
 (11)  2011.year = 0
 (12)  2012.year = 0
 (13)  2013.year = 0
 (14)  2014.year = 0
 (15)  2015.year = 0
 (16)  2016.year = 0
 (17)  2017.year = 0
 (18)  2018.year = 0
 (19)  2019.year = 0
 (20)  2020.year = 0
 (21)  2021.year = 0

       F( 21,   125) =    4.44
            Prob > F =    0.0000

Prob > F is < 0.05, so we reject the null that the coefficients for
the years are jointly equal to zero. In this case, time fixed effects are needed.
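
A short sketch of how the test result can be reused, assuming the same variables as above. testparm stores the F statistic, its degrees of freedom, and the p-value in r(F), r(df), r(df_r) and r(p). Because the model uses robust standard errors (clustered by country), the denominator degrees of freedom equal the number of countries minus one, which is why the output reports F(21, 125) for 126 countries. The local macro names are illustrative.

```stata
quietly xtreg ln_gdppc ln_trade ln_labor i.year, fe robust
testparm i.year

* Copy the stored results into locals before another command overwrites r()
local F    = r(F)
local df_n = r(df)
local df_d = r(df_r)
local p    = r(p)

display "F(`df_n', `df_d') = " %6.2f `F' ",  Prob > F = " %6.4f `p'
if `p' < 0.05 display "Year effects are jointly significant: keep i.year in the model"
```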
Do we need random effects?
The LM test helps you decide between a random effects regression
and a simple OLS regression. The null hypothesis in the LM test is
that the variance across entities is zero. That is, there is no significant
difference across units (i.e. no panel effect). The command in Stata
is xttest0; type it right after running the random effects model:

xtreg ln_gdppc ln_trade ln_labor, re robust

xttest0
. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        ln_gdppc[country1,t] = Xb + u[country1] + e[country1,t]

        Estimated results:
                         |       Var     SD = sqrt(Var)
                ---------+-----------------------------
                ln_gdppc |   2.022383       1.422105
                       e |   .0120779       .1098995
                       u |   .1730118       .4159468

        Test: Var(u) = 0
                             chibar2(01) = 19981.51
                          Prob > chibar2 =   0.0000

Prob > chibar2 is < 0.05, so we reject the null hypothesis and conclude
that random effects are needed.
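
To illustrate what the variance components in this output imply, the share of the composite error variance attributable to the entity effect u can be computed by hand. This sketch simply reuses the two variance estimates printed above; the scalar names are hypothetical.

```stata
* Variance components copied from the xttest0 output above
scalar var_u = .1730118
scalar var_e = .0120779

* Share of the composite error variance due to differences across entities
scalar share_u = var_u / (var_u + var_e)
display "Share of variance due to u = " %5.3f share_u
```

With these estimates the share is roughly 0.93, which is consistent with the test rejecting Var(u) = 0.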
Are the panels correlated? [B-P/LM test]
According to Baltagi, cross-sectional dependence is a problem in macro panels
with long time series (over 20-30 years). The null hypothesis in the B-P/LM test of
independence is that residuals across entities are not correlated. The user-written
command to run this test is xttest2 (run it after xtreg, fe):

ssc install xttest2

xtreg ln_gdppc ln_trade ln_labor, fe robust

xttest2

. xttest2

Correlation matrix of residuals:

[OMITTED]

Breusch-Pagan LM test of independence: chi2(7875) = 73886.228, Pr = 0.0000

Based on 22 complete observations over panel units

Pr < 0.05, so we reject the null hypothesis and conclude that the panels are
correlated (cross-sectional dependence).

Are the panels correlated? [Pesaran CD test]
As mentioned in the previous slide, cross-sectional dependence is more of an issue in macro panels
with long time series (over 20-30 years) than in micro panels.
The Pesaran CD (cross-sectional dependence) test is used to test whether the residuals are
correlated across entities*. Cross-sectional dependence (also called contemporaneous correlation)
can lead to bias in test results. The null hypothesis is that residuals are not correlated. The command
for the test is xtcsd; to install it, type:

ssc install xtcsd

xtreg ln_gdppc ln_trade ln_labor, fe robust

xtcsd, pesaran abs

. xtcsd, pesaran abs

Pesaran's test of cross sectional independence = 9.266, Pr = 0.0000

Average absolute value of the off-diagonal elements = 0.588

Pr < 0.05, so we reject the null hypothesis and conclude that the panels
are correlated (cross-sectional dependence).

If cross-sectional dependence is present, Hoechle suggests using Driscoll and Kraay standard errors
with the command xtscc (install it by typing ssc install xtscc). Type help xtscc for more
details.
*Source: Hoechle, Daniel, “Robust Standard Errors for Panel Regressions with Cross-Sectional Dependence”,
http://fmwww.bc.edu/repec/bocode/x/xtscc_paper.pdf
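
A minimal sketch of the Driscoll and Kraay suggestion, assuming xtscc has been installed from SSC and the data are already xtset. The fe option requests the fixed-effects (within) estimator with Driscoll-Kraay standard errors. The lag length of 2 in the second call is an arbitrary illustrative value, not a recommendation from the source.

```stata
ssc install xtscc

* Fixed-effects regression with Driscoll-Kraay standard errors (default lag selection)
xtscc ln_gdppc ln_trade ln_labor, fe

* Same model with a user-chosen maximum lag for the autocorrelation structure
xtscc ln_gdppc ln_trade ln_labor, fe lag(2)
```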

Testing for heteroskedasticity
A test for heteroskedasticity is available for the fixed-effects model using the
command xttest3. The null hypothesis is homoskedasticity (or constant
variance). This is a user-written program; to install it, type:
ssc install xttest3
xtreg ln_gdppc ln_trade ln_labor, fe robust
xttest3
. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (126)  =   3.3e+05
Prob>chi2 =      0.0000

We reject the null and conclude heteroskedasticity.

NOTE: Use the option ‘robust’ to obtain heteroskedasticity-robust standard errors (also known
as Huber/White or sandwich estimators).
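
A brief sketch of the note above, assuming the same model. In current versions of Stata, xtreg with the fe option treats robust as clustering on the panel variable, so the two commands below should report the same standard errors. country1 is the numeric panel identifier used in the output on the earlier slides.

```stata
* Heteroskedasticity-robust standard errors via the robust option
xtreg ln_gdppc ln_trade ln_labor, fe robust

* Equivalent explicit form: cluster on the panel variable
xtreg ln_gdppc ln_trade ln_labor, fe vce(cluster country1)
```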
Testing for serial correlation
Serial correlation tests apply to macro panels with long time series (over 20-30 years).
It is not a problem in micro panels (with very few years). Serial correlation causes the
estimated standard errors of the coefficients to be smaller than they actually are and the
R-squared to be higher. The Wooldridge test for serial correlation is available using the command
xtserial. This is a user-written program; to install it, type:
ssc install xtserial
xtreg ln_gdppc ln_trade ln_labor, fe robust
xtserial ln_gdppc ln_trade ln_labor
. xtserial ln_gdppc ln_trade ln_labor

Wooldridge test for autocorrelation in panel data

H0: no first order autocorrelation
    F(  1,     125) =    289.854
           Prob > F =      0.0000

The null is no serial correlation. Above we reject the null and conclude that the data have
first-order autocorrelation. Type help xtserial for more details.
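
If serial correlation is present, two common responses can be sketched as follows. Neither is prescribed by the slide, so both are shown as assumptions. Clustering the standard errors by panel (which the robust option of xtreg, fe already does) makes inference robust to within-panel autocorrelation, and Stata's xtregar command fits a fixed-effects model with AR(1) disturbances.

```stata
* Option 1: cluster-robust standard errors, valid under within-country serial correlation
xtreg ln_gdppc ln_trade ln_labor, fe vce(cluster country1)

* Option 2: fixed-effects model that explicitly allows AR(1) disturbances
xtregar ln_gdppc ln_trade ln_labor, fe
```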
Source: Hoechle, Daniel, “Robust Standard Errors for Panel Regressions with Cross-Sectional
Dependence”, page 4, http://fmwww.bc.edu/repec/bocode/x/xtscc_paper.pdf
Suggested books / references
• Introduction to econometrics / James H. Stock, Mark W. Watson. 2nd ed., Boston: Pearson
Addison Wesley, 2007.
• Econometric Analysis of Panel Data, Badi H. Baltagi, Wiley, 2008.
• Econometric Analysis / William H. Greene. 6th ed., Upper Saddle River, N.J. : Prentice Hall, 2008.
• An Introduction to Modern Econometrics Using Stata/ Christopher F. Baum, Stata Press, 2006.
• Data analysis using regression and multilevel/hierarchical models / Andrew Gelman, Jennifer Hill.
Cambridge ; New York : Cambridge University Press, 2007.
• Data Analysis Using Stata / Ulrich Kohler, Frauke Kreuter, 2nd ed., Stata Press, 2009.
• Statistics with Stata / Lawrence Hamilton, Thomson Brooks/Cole, 2006.
• Statistical Analysis: an interdisciplinary introduction to univariate & multivariate methods / Sam
Kachigan, New York : Radius Press, c1986
• “Beyond “Fixed Versus Random Effects”: A framework for improving substantive and statistical
analysis of panel, time-series cross-sectional, and multilevel data” / Brandon Bartels
http://polmeth.wustl.edu/retrieve.php?id=838
• “Robust Standard Errors for Panel Regressions with Cross-Sectional Dependence” / Daniel
Hoechle, http://fmwww.bc.edu/repec/bocode/x/xtscc_paper.pdf
• Designing Social Inquiry: Scientific Inference in Qualitative Research / Gary King, Robert
O.Keohane, Sidney Verba, Princeton University Press, 1994.
• Unifying Political Methodology: The Likelihood Theory of Statistical Inference / Gary King,
Cambridge University Press, 1989.

