Econometrics I - Lecture 8 (Wooldridge)
ECONOMETRICS I
National Economics University
2019
MA Hai Duong
HETEROSKEDASTICITY
RECAP.
We have studied the multiple regression model and learnt that when:
1. the model is linear in parameters: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$
2. the conditional mean of the errors is zero: $E(u \mid x_1, \dots, x_k) = 0$
3. the columns of $X$ are linearly independent (no perfect collinearity),
then the OLS estimator $\hat{\beta}$ is an unbiased estimator of $\beta$.
If, in addition,
4. the sample is a random sample and the errors are homoskedastic: $\operatorname{Var}(u \mid x_1, \dots, x_k) = \sigma^2$,
then the OLS estimator is the best linear unbiased estimator (BLUE); and if
5. the errors are normally distributed, then conditional on $X$, $\hat{\beta}$ is normally distributed, and we can use the usual t and F tests to make inferences based on the OLS estimator.
LECTURE OUTLINE
Heteroskedasticity:
1. Definition of heteroskedasticity and its consequences for OLS
(textbook reference 8-1)
2. Testing for heteroskedasticity (textbook reference 8-3)
2.1 Breusch-Pagan test
2.2 White test
3. Heteroskedasticity robust standard errors (a simplified version of 8-2)
4. Weighted least squares when heteroskedasticity is known up to a
multiplicative constant (textbook reference 8-4a)
We will not cover heteroskedasticity-robust LM tests (the last part of section 8-2), feasible GLS and the consequences of wrong specification of the variance function (sections 8-4b, 8-4c), or the linear probability model.
HETEROSKEDASTICITY
Sometimes there is a good reason to doubt the assumption of equal
variance for all errors. Here are some examples:
In the study of food consumption, income is an important explanatory
variable. It is unreasonable to assume that the variance of food
consumption is the same for poor and rich people
In many cases we do not have individual data (for confidentiality
reasons), but we get information on averages over groups of individuals.
For example, we can get incidences of crime per 1000 people,
employment rate and income per capita in each district. These are
averages, but different districts have different populations, so there is a
good reason to believe that variances of these averages depend
inversely on the population of each district
In finance, unpredicted news increases the volatility of the market (i.e. the variance of the market return), and this can last for several days.
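To make the district-averages point precise, here is a one-line derivation (a sketch assuming, for simplicity, that the $n_d$ individuals in district $d$ have i.i.d. errors with common variance $\sigma^2$):
$$\operatorname{Var}(\bar{u}_d) = \operatorname{Var}\Big(\frac{1}{n_d}\sum_{i=1}^{n_d} u_{id}\Big) = \frac{\sigma^2}{n_d},$$
so the error variance of a district-level average is inversely proportional to the district's population.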
3D GRAPHICAL REPRESENTATION
OF HETEROSKEDASTICITY
DETECTING HTSK
As always, step 1: think about the problem!
If we only have one explanatory variable $x$, a scatter plot of $y$ against $x$ can give us a clue. (In fact, we hardly ever have only one $x$.)
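As a concrete illustration of this eyeballing step, here is a minimal Python sketch (simulated data with variance growing in x; variable names are hypothetical): plot the OLS residuals against the suspect regressor and look for a fanning-out pattern.

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
u = rng.normal(0, 1, n) * np.sqrt(x)      # error standard deviation grows with x
y = 2 + 3 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
plt.scatter(x, res.resid)                 # residuals fan out as x grows
plt.xlabel("x")
plt.ylabel("OLS residual")
plt.show()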
TESTING FOR HTSK
Since $E(u \mid \mathbf{x}) = 0$, we have $\operatorname{Var}(u \mid \mathbf{x}) = E(u^2 \mid \mathbf{x})$.
If we suspect that the variance can change with some subset of the independent variables, or even with some exogenous variables that do not affect the mean but can affect the variance, then, if we had $u$, we could square it and estimate the conditional expectation function of $u^2$. But $u$ is unknown.
The Australian econometricians Trevor Breusch and Adrian Pagan, and the American econometrician Hal White, showed that we can use the OLS residuals $\hat{u}$ instead, and in large samples this will give us reliable results.
TESTING FOR HTSK
The Breusch-Pagan test is based on the auxiliary model $u^2 = \delta_0 + \delta_1 z_1 + \dots + \delta_q z_q + v$, with null hypothesis $H_0: \delta_1 = \dots = \delta_q = 0$, where $z_1, \dots, z_q$ are a subset of $x_1, \dots, x_k$. In fact the variables $z_1, \dots, z_q$ can include some variables that do not appear in the conditional mean but may affect the variance (note the difference with the book).
STEPS OF BREUSCH-PAGAN
TEST
1. Estimate the model by OLS as usual. Obtain the residuals $\hat{u}_i$ for $i = 1, \dots, n$ and square them.
2. Regress $\hat{u}_i^2$ on a constant and $z_1, \dots, z_q$. Denote the $R^2$ of this auxiliary regression by $R^2_{\hat{u}^2}$.
3. Under $H_0$, the statistic $LM = n R^2_{\hat{u}^2}$ has a $\chi^2$ distribution with $q$ degrees of freedom in large samples. This statistic is called the Lagrange multiplier (LM) statistic for HTSK.
4. Given the desired level of significance, we obtain the critical value of the test from the $\chi^2_q$ table and reject $H_0$ if the value of the test statistic is larger than the critical value.
Under $H_0$, the F test for the overall significance of the auxiliary regression has an $F_{q, n-q-1}$ distribution, and it can also be used to test for HTSK.
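As a concrete illustration, here is a minimal Python sketch of these four steps on simulated data (the variable names items and net_sales are hypothetical stand-ins; the EViews workfile itself is used on the next slides):

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
n = 100
items = rng.integers(1, 15, n).astype(float)
net_sales = 15 + 6 * items + rng.normal(0, 1, n) * np.sqrt(items)   # Var(u) grows with items

# Step 1: estimate by OLS and square the residuals
X = sm.add_constant(items)
u2 = sm.OLS(net_sales, X).fit().resid ** 2

# Step 2: auxiliary regression of u^2 on a constant and the z's (here z = items), note its R^2
aux = sm.OLS(u2, X).fit()

# Step 3: LM = n * R^2, chi-squared with q = 1 degree of freedom under H0
LM = n * aux.rsquared
print("LM =", LM, " 5% critical value =", stats.chi2.ppf(0.95, df=1))

# Step 4 (alternative): F test for overall significance of the auxiliary regression
print("F =", aux.fvalue, " p-value =", aux.f_pvalue)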
EXAMPLE FOR BP TEST
Regress Net sales (in $) of 100 customers on the number of Items they purchased and a constant. The data are in the workfile pelicanstores.wf1.
EXAMPLE FOR BP TEST
We can save the residuals and run the OLS regression of $\hat{u}^2$ on a constant and all the explanatory variables ourselves, or we can use the built-in EViews test.
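For the "do it yourself vs. built-in" choice, statsmodels offers a packaged version of the same LM/F computation (shown here as an assumed stand-in for the EViews menu option; het_breuschpagan takes the OLS residuals and the auxiliary regressors including a constant):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
n = 100
items = rng.integers(1, 15, n).astype(float)                 # hypothetical stand-in for Items
net_sales = 15 + 6 * items + rng.normal(0, 1, n) * np.sqrt(items)

X = sm.add_constant(items)                                    # constant + all explanatory variables
resid = sm.OLS(net_sales, X).fit().resid
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(resid, X)
print(f"LM = {lm_stat:.2f} (p = {lm_pval:.4f}), F = {f_stat:.2f} (p = {f_pval:.4f})")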
EXAMPLE FOR BP TEST
Choosing the Breusch-Pagan test in EViews produces the output below. (When doing your practice, it is better to carry out the LM steps yourself rather than to use the EViews options, in order to make sure you understand the mechanics of the test.)
AN EXAMPLE FOR BP TEST
WHITE TEST FOR HTSK
(1980)
White’s test of course tests the same null hypothesis, $H_0: \operatorname{Var}(u \mid x_1, \dots, x_k) = \sigma^2$, but its alternative is that the variance is an unknown function of the explanatory variables.
Hal White showed that a regression of $\hat{u}^2$ on a constant, $x_1$ to $x_k$, $x_1^2$ to $x_k^2$, and all pairwise cross products of $x_1$ to $x_k$, has the power to detect this general form of heteroskedasticity in large samples.
Similar to the B-P test, White’s test statistic is $LM = n R^2_{\hat{u}^2}$, which, under the null, has a $\chi^2$ distribution with degrees of freedom equal to the number of explanatory variables in the auxiliary regression. We can use the F test of the overall significance of the auxiliary regression as well.
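A minimal Python sketch of White's test with two regressors (simulated stand-ins for Items and Age; statsmodels' het_white builds the squares and cross products of the regressors internally):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(2)
n = 100
items = rng.integers(1, 15, n).astype(float)
age = rng.uniform(20, 70, n)
net_sales = 10 + 6 * items + 0.3 * age + rng.normal(0, 1, n) * np.sqrt(items)

X = sm.add_constant(np.column_stack([items, age]))
resid = sm.OLS(net_sales, X).fit().resid
# auxiliary regression on the levels, squares and cross product is formed internally
lm_stat, lm_pval, f_stat, f_pval = het_white(resid, X)
print(f"White LM = {lm_stat:.2f} (p = {lm_pval:.4f}), F = {f_stat:.2f} (p = {f_pval:.4f})")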
KOENKER-BASSETT TEST
(1981)
A concern with the White test is that there are $k(k+3)/2$ regressors in the auxiliary regression, which is a very large number of restrictions.
Recall that the fitted values from OLS, $\hat{y}_i$, are a linear function of all the $x_j$'s.
Thus, $\hat{y}_i^2$ will be a function of the squares and cross products of all the $x_j$'s, so $\hat{y}_i$ and $\hat{y}_i^2$ can proxy for all of the $x_j$'s, their squares and their cross products.
A special form of the White test is to regress the squared residuals on $\hat{y}_i$ and $\hat{y}_i^2$ and use the $R^2$ of this regression to form an F or LM statistic, as in the sketch below.
Note that we are only testing 2 restrictions now (in many cases we test only one restriction, the zero influence of $\hat{y}^2$ on $\hat{u}^2$), no matter how many independent variables we have.
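Here is the sketch referred to above: regress the squared OLS residuals on the fitted values and their squares, then form the LM statistic with 2 degrees of freedom (simulated data, hypothetical variable names):

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(3)
n = 100
items = rng.integers(1, 15, n).astype(float)
age = rng.uniform(20, 70, n)
net_sales = 10 + 6 * items + 0.3 * age + rng.normal(0, 1, n) * np.sqrt(items)

fit = sm.OLS(net_sales, sm.add_constant(np.column_stack([items, age]))).fit()
u2 = fit.resid ** 2
yhat = fit.fittedvalues

# auxiliary regression: u^2 on a constant, yhat and yhat^2 (only 2 restrictions under H0)
Z = sm.add_constant(np.column_stack([yhat, yhat ** 2]))
aux = sm.OLS(u2, Z).fit()
LM = n * aux.rsquared
print("LM =", LM, " chi-squared(2) 5% critical value =", stats.chi2.ppf(0.95, df=2))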
AN EXAMPLE FOR WHITE
TEST
We start from the original regression of net sales on the number of items purchased by customers and the customers' age.
AN EXAMPLE FOR WHITE
TEST (NO CROSS TERMS)
AN EXAMPLE FOR WHITE TEST
(INCLUDE CROSS TERMS)
AN EXAMPLE FOR KOENKER
BASSETT TEST
EViews does not have a built-in command for this alternative form of the White test (the KB test), so we need to save the OLS residuals and the OLS fitted values (continuing with the data in pelicanstores.wf1, we name these variables U_hat and NS_hat) and then run the auxiliary regression:
AN EXAMPLE FOR KOENKER
BASSETT TEST
The value of the $LM = n R^2$ statistic is larger than the critical value of the $\chi^2$ distribution with 2 degrees of freedom, so we reject the null hypothesis of homoskedasticity.
SOLUTION 1: ROBUST
STANDARD ERRORS
Since the OLS estimator is still unbiased, we may be happy to live with OLS even if it is not BLUE. But the real practical problem is that t and F statistics based on the usual OLS standard errors are unusable.
Recall the derivation of $\operatorname{Var}(\hat{\beta}_1)$ in the model with a single regressor:
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \operatorname{Var}(\hat{\beta}_1 \mid \mathbf{x}) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sigma_i^2}{\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^2}.$$
With homoskedasticity, $\sigma_i^2 = \sigma^2$ for all $i$, so this simplifies to
$$\operatorname{Var}(\hat{\beta}_1 \mid \mathbf{x}) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{SST_x}.$$
With HTSK, the $\sigma_i^2$ differ across $i$ and the simplification fails, so the usual standard error formula is wrong:
$$\operatorname{Var}(\hat{\beta}_1 \mid \mathbf{x}) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}.$$
SOLUTION 1: ROBUST
STANDARD ERRORS
White proved that replacing the unknown $\sigma_i^2$ with the squared OLS residuals gives a valid large-sample variance estimator:
$$\widehat{\operatorname{Var}}(\hat{\beta}_1) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2}.$$
The square root of this quantity is the heteroskedasticity-robust (White) standard error of $\hat{\beta}_1$.
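In practice the robust variance is computed for us; a hedged Python sketch (the slides themselves use the EViews Options tab) requesting an HC1-type White covariance matrix:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 100
items = rng.integers(1, 15, n).astype(float)
net_sales = 15 + 6 * items + rng.normal(0, 1, n) * np.sqrt(items)

X = sm.add_constant(items)
usual = sm.OLS(net_sales, X).fit()                   # conventional OLS standard errors
robust = sm.OLS(net_sales, X).fit(cov_type="HC1")    # heteroskedasticity-robust (White) standard errors
print("usual SEs: ", usual.bse.round(3))
print("robust SEs:", robust.bse.round(3))            # same coefficients, different standard errors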
SOLUTION 1: ROBUST
STANDARD ERRORS
Back to the example. The option of robust standard errors is under
the Options tab of the equation window:
SOLUTION 1: ROBUST
STANDARD ERRORS
With this option, we get the following results, which we compare with the original regression results:
SOLUTION 2: TRANSFORM
THE MODEL
A logarithmic transformation of y may do the trick: if the population model has $\log(y)$ as the dependent variable but we have used $y$, this kind of mis-specification can show up as heteroskedastic errors. So, if the log transformation is admissible (i.e. if $y$ is strictly positive), moving to a log model may solve the problem, and the OLS estimator of the log-transformed model will then be BLUE and its standard errors will be usable. Of course, when we consider transforming $y$, we should think about whether a log-level or a log-log model makes better sense.
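A short sketch of this idea under an assumed data-generating process in which the true model is log-level with homoskedastic errors: the Breusch-Pagan test typically rejects in the level regression but not in the log regression.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(5)
n = 200
x = rng.uniform(1, 10, n)
log_y = 1.0 + 0.5 * x + rng.normal(0, 0.3, n)   # true model: log(y) linear in x, constant variance
y = np.exp(log_y)                               # the level of y then has variance that grows with x

X = sm.add_constant(x)
for dep, label in [(y, "level model"), (np.log(y), "log model")]:
    resid = sm.OLS(dep, X).fit().resid
    lm, pval, _, _ = het_breuschpagan(resid, X)
    print(f"{label}: BP p-value = {pval:.4f}")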
WEIGHTED LEAST SQUARES
– AN EXAMPLE
Returning to our example: based on the results of the previous tests, we hypothesize that the error variance is proportional to the number of items, $\operatorname{Var}(u \mid items) = \sigma^2 \cdot items$. We create the weight series with the EViews command series w=1/@sqrt(items) and run the weighted regression, in which every variable (including the constant) is multiplied by w, as in the sketch below.
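A hedged Python sketch of the same WLS idea: under the assumed variance function Var(u | items) = sigma^2 * items, statsmodels' WLS takes weights proportional to the inverse variance, i.e. 1/items, which is equivalent to running OLS after multiplying every variable by 1/sqrt(items).

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 100
items = rng.integers(1, 15, n).astype(float)
net_sales = 15 + 6 * items + rng.normal(0, 1, n) * np.sqrt(items)   # Var(u | items) = sigma^2 * items

X = sm.add_constant(items)
ols = sm.OLS(net_sales, X).fit()
wls = sm.WLS(net_sales, X, weights=1.0 / items).fit()   # weights = 1 / h(items)
print("OLS:", ols.params.round(3), ols.bse.round(3))
print("WLS:", wls.params.round(3), wls.bse.round(3))    # WLS is BLUE under the assumed variance function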
WEIGHTED LEAST SQUARES
– AN EXAMPLE
Testing for HTSK after the remedial process:
The standard errors are now reliable for inference and for forming confidence intervals.
WEIGHTED LEAST SQUARES
– IN EVIEWS
EViews also has a built-in WLS command under the Options tab of the equation window. We need to enter the name of the weight series.
WEIGHTED LEAST SQUARES
– IN EVIEWS
The only advantage of the EViews built-in command is that it also produces a set of statistics for the original (untransformed) model.
SUMMARY
The assumption of equal conditional variances for each observation
may not be appropriate
But OLS will still be unbiased even if the errors are heteroskedastic; however, the usual OLS standard errors will not be correct
We learnt how to test for heteroskedasticity
If HTSK is found, we can still use OLS, but calculate standard errors that
are robust to HTSK, and use those for inference
If we have a reasonable idea that the HTSK is proportional to a single
variable, we can use WLS, which will provide the best linear unbiased
estimator for the parameters and a set of standard errors that can be
used for inference
TEXTBOOK EXERCISES –
CHAP 7
Problems 2, 3, 4 (page 268)
Problem 5 (page 269)
Problem 8 (page 270)
COMPUTER EXERCISES
C2, C3 (page 270)
C4, C5, C8 (page 271)
C9 (page 272)
THANK YOU FOR YOUR ATTENTION - Q & A