2A.3 Lecture Slides8 Heteroskedasticity
Lecture 8:
Heteroskedasticity
Oleg I. Kitov
oik22@cam.ac.uk
Lecture outline
Heteroskedasticity, OLS and Gauss-Markov
[Figure: res_wage2 (vertical axis, 0 to 5,000,000) plotted against years of education (left panel) and years of work experience (right panel).]
Unbiasedness of OLS
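• Consider $Y_i = \beta_0 + \beta_1 X_i + u_i$ for a random sample $(Y_i, X_i)$, $i = 1, \dots, n$.
• Take the expectation of $\hat\beta_1$ conditional on all observations $X = (X_1, \dots, X_n)$:
$$
\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (X_i - \bar X)\, u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2}
\;\Longrightarrow\;
E\left[\hat\beta_1 \mid X\right] = \beta_1 + E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar X)\, u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \,\middle|\, X \right].
$$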
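The remaining expectation can be evaluated in one more step. The sketch below assumes the zero conditional mean condition $E[u_i \mid X] = 0$, which is not stated on this slide; given $X$, the terms $X_i - \bar X$ are constants and can be taken outside the expectation.

```latex
\begin{aligned}
E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar X)\, u_i}{\sum_{i=1}^{n} (X_i - \bar X)^2} \,\middle|\, X \right]
&= \frac{\sum_{i=1}^{n} (X_i - \bar X)\, E\left[u_i \mid X\right]}{\sum_{i=1}^{n} (X_i - \bar X)^2} = 0
\;\Longrightarrow\;
E\left[\hat\beta_1 \mid X\right] = \beta_1 .
\end{aligned}
```

Nothing in this step uses $Var(u_i \mid X)$, so the argument goes through whether or not the errors are homoskedastic.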
Conditional variance of the OLS estimator
$$
S^2_{\hat\beta_1} = \widehat{Var}\left(\hat\beta_1 \mid X\right) = \frac{S_u^2}{\sum_{i=1}^{n} (X_i - \bar X)^2} = \frac{S_u^2}{SST_X}
$$
• Under heteroskedasticity, $Var(u_i \mid X) = \sigma_i^2$ with $\sigma_i^2 \neq \sigma_j^2$ for some $i \neq j$:
- the error variance estimator $S_u^2$ is wrong [$\sigma_i^2$ varies across individuals $i$];
- the OLS variance estimator $S^2_{\hat\beta_1}$ is wrong for $Var(\hat\beta_1)$;
- the asymptotic distribution $\hat\beta_1 \overset{a}{\sim} N(\beta_1, S^2_{\hat\beta_1})$ has the wrong variance;
- testing significance, $H_0: \beta_1 = 0$, will give wrong inference.
• OLS is not optimal in the presence of heteroskedasticity since
- OLS gives equal weight to all observations, but...
- observations with larger error variance contain less information, and those with smaller error variance contain more.
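The points above can be illustrated with a small Monte Carlo experiment. This is a minimal sketch under assumed values: the true parameters, the regressor grid and the error standard deviation $x_i^2$ are all hypothetical choices made only for illustration. It holds the regressors fixed, redraws the errors many times, and compares the average conventional standard error with the actual spread of the slope estimates.

```python
import math
import random

def ols_slope_and_se(x, y):
    """Return the OLS slope and its conventional standard error S_u / sqrt(SST_X)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2_u = ssr / (n - 2)  # usual error variance estimator
    return b1, math.sqrt(s2_u / sst_x)

rng = random.Random(42)                    # fixed seed for reproducibility
n, reps = 200, 2000
beta0, beta1 = 1.0, 2.0                    # hypothetical true parameters
x = [4.0 * i / (n - 1) for i in range(n)]  # regressors held fixed across replications

slopes, ses = [], []
for _ in range(reps):
    # Assumed error structure: sd(u_i | X) = x_i^2, so the variance rises with x
    y = [beta0 + beta1 * xi + rng.gauss(0.0, xi ** 2) for xi in x]
    b1, se = ols_slope_and_se(x, y)
    slopes.append(b1)
    ses.append(se)

mean_b1 = sum(slopes) / reps
emp_sd = math.sqrt(sum((b - mean_b1) ** 2 for b in slopes) / (reps - 1))
avg_se = sum(ses) / reps

print(f"mean of slope estimates:      {mean_b1:.3f} (true value {beta1})")
print(f"empirical sd of the slope:    {emp_sd:.3f}")
print(f"average conventional SE:      {avg_se:.3f}")
```

In this design the mean of the estimates should sit close to the true slope, while the conventional standard error should come out noticeably smaller than the empirical standard deviation. Other variance patterns can push the conventional standard error in the opposite direction.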
Causes of heteroskedasticity
Goldfeld-Quandt test for heteroskedasticity
• Consider: $wage_i = \beta_0 + \beta_1\, educ_i + \beta_2\, IQ_i + \beta_3\, exper_i + u_i$.
• Test whether the error variance is monotonic [increasing/decreasing] in $educ_i$.
• Split the sample into two sub-samples [with/without higher education]:
$$
\begin{aligned}
educ_i \le 12: \quad wage_i &= \beta_0^{(1)} + \beta_1^{(1)} educ_i + \beta_2^{(1)} IQ_i + \beta_3^{(1)} exper_i + u_i, \qquad n_1,\; SSR_1 \\
educ_i > 12: \quad wage_i &= \beta_0^{(2)} + \beta_1^{(2)} educ_i + \beta_2^{(2)} IQ_i + \beta_3^{(2)} exper_i + v_i, \qquad n_2,\; SSR_2
\end{aligned}
$$
$$
W = \frac{SSR_2 / (n_2 - k - 1)}{SSR_1 / (n_1 - k - 1)} \sim F_{n_2 - k - 1,\; n_1 - k - 1}.
$$
$$
w = \frac{81856465 / 451}{45312537 / 478} = 1.91
$$
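The arithmetic behind the observed statistic can be checked directly. The helper below is a hypothetical illustration that takes the sums of squared residuals and degrees of freedom reported on the slide as inputs.

```python
def goldfeld_quandt_stat(ssr1, df1, ssr2, df2):
    """Ratio of the two sub-sample error variance estimates, SSR2/df2 over SSR1/df1."""
    return (ssr2 / df2) / (ssr1 / df1)

# Numbers taken from the slide: sub-sample 1 (educ <= 12) and sub-sample 2 (educ > 12)
w = goldfeld_quandt_stat(ssr1=45312537, df1=478, ssr2=81856465, df2=451)
print(f"w = {w:.2f}")  # w = 1.91
```

A p-value would come from the upper tail of the $F_{451,\,478}$ distribution, for instance via `scipy.stats.f.sf(w, 451, 478)` if SciPy is available.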
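White's test for heteroskedasticity
• Auxiliary regression for $\hat u_i^2$: include the regressors $X_{ij}$, their squares and cross-terms.
• Auxiliary regression [simpler]: $\hat u_i^2 = \delta_0 + \delta_1 \hat Y_i + \delta_2 \hat Y_i^2 + \varepsilon_i$.
• $H_0: (\delta_1 = 0) \cap (\delta_2 = 0)$, homoskedasticity, $Var(u_i) = \sigma^2$.
• $H_1: (\delta_1 \neq 0) \cup (\delta_2 \neq 0)$, heteroskedasticity, $Var(u_i) = f(\hat Y_i)$.
• $LM = nR^2 \sim \chi^2_2$ under $H_0$, where $R^2$ is from the auxiliary regression; reject $H_0$ if the test statistic is greater than $\chi^2_{1-\alpha,\,2}$.
• Could also include higher powers of $\hat Y_i$, e.g. the cube $\hat Y_i^3$.
• STATA adds squares and cross-terms explicitly to the auxiliary regression.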
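The simpler, fitted-value version of the test can be sketched in a few lines. This is an illustrative implementation on simulated data, not the STATA routine: the helper names, the data-generating process and the sample size are assumptions. It uses only the standard library, solves the least-squares problems through the normal equations, and relies on the fact that the upper tail of a $\chi^2_2$ distribution is $e^{-x/2}$.

```python
import math
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fitted(X, b):
    return [sum(bj * xj for bj, xj in zip(b, row)) for row in X]

def r_squared(y, yhat):
    ybar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ssr / sst

def white_test_simplified(X, y):
    """Return (LM, p-value) from regressing squared residuals on Y-hat and Y-hat squared."""
    yhat = fitted(X, ols(X, y))
    u2 = [(yi - fi) ** 2 for yi, fi in zip(y, yhat)]
    Z = [[1.0, fi, fi ** 2] for fi in yhat]
    lm = len(y) * r_squared(u2, fitted(Z, ols(Z, u2)))
    return lm, math.exp(-lm / 2.0)  # chi-squared(2) upper tail probability

rng = random.Random(0)
x = [rng.uniform(1.0, 5.0) for _ in range(500)]
X = [[1.0, xi] for xi in x]
y_het = [1.0 + 2.0 * xi + rng.gauss(0.0, xi) for xi in x]   # error sd grows with x
y_hom = [1.0 + 2.0 * xi + rng.gauss(0.0, 2.0) for xi in x]  # constant error sd

lm_het, p_het = white_test_simplified(X, y_het)
lm_hom, p_hom = white_test_simplified(X, y_hom)
print(f"heteroskedastic errors: LM = {lm_het:.2f}, p-value = {p_het:.4f}")
print(f"homoskedastic errors:   LM = {lm_hom:.2f}, p-value = {p_hom:.4f}")
```

With errors whose spread grows in $x$, the statistic should far exceed the 5% critical value of $\chi^2_2$ (about 5.99). With constant-variance errors it would exceed that value only about 5% of the time.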
• White’s advantages compared with Breusch-Pagan:
- relaxes the assumption of normally distributed errors;
- flexible functional form, identifying nearly any pattern of heteroskedasticity.
• White’s disadvantages compared with Breusch-Pagan:
- cannot determine the explicit functional form of heteroskedasticity;
- loses power quickly when the number of regressors goes up.
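One way to see the last point is to count the terms in the full auxiliary regression. With $k$ regressors there are $k$ levels, $k$ squares and $k(k-1)/2$ pairwise cross-terms, assuming none of them is redundant (squares of dummy variables, for example, would be). The function name below is hypothetical.

```python
def white_aux_terms(k):
    """Slope terms in the full auxiliary regression: levels, squares and cross-terms."""
    levels = k
    squares = k
    cross_terms = k * (k - 1) // 2
    return levels + squares + cross_terms

for k in (3, 5, 10):
    print(k, white_aux_terms(k))
# 3 9
# 5 20
# 10 65
```

The degrees of freedom of the test grow roughly with the square of the number of regressors, whereas the fitted-value version always tests two restrictions.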
Breusch-Pagan and White’s tests in wage regression
[Figure: res_lwage2 (vertical axis, 0 to 4) plotted against years of education (left panel) and years of work experience (right panel).]
Dealing with heteroskedasticity: model specification [2/2]
Dealing with heteroskedasticity: robust errors [2/3]
             (1) iid errors    (2) robust errors
             wage              wage
educ         58.10***          58.10***
             (7.056)           (7.427)
IQ           5.069***          5.069***
             (0.941)           (0.897)
exper        17.42***          17.42***
             (3.116)           (3.104)
const        -539.4***         -539.4***
             (116.7)           (115.0)
n            935               935
R^2          0.162             0.162
Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
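The coefficients are identical in both columns; only the standard errors, and therefore the t-statistics, change. The implied t-statistics can be recomputed from the reported numbers, as in this small sketch (the dictionary names are illustrative).

```python
# Coefficients and standard errors copied from the table
coef = {"educ": 58.10, "IQ": 5.069, "exper": 17.42, "const": -539.4}
se_iid = {"educ": 7.056, "IQ": 0.941, "exper": 3.116, "const": 116.7}
se_robust = {"educ": 7.427, "IQ": 0.897, "exper": 3.104, "const": 115.0}

# t-statistic for H0: coefficient = 0 is the estimate divided by its standard error
t_iid = {name: coef[name] / se_iid[name] for name in coef}
t_robust = {name: coef[name] / se_robust[name] for name in coef}

for name in coef:
    print(f"{name:6s} t (iid) = {t_iid[name]:6.2f}   t (robust) = {t_robust[name]:6.2f}")
```

For `educ`, for example, the t-statistic falls from about 8.23 to about 7.82 once robust standard errors are used; the robust standard error is larger for `educ` but smaller for the other coefficients.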
Dealing with heteroskedasticity: robust errors [3/3]
$$
Var\left(\hat\beta_j\right) = \frac{\sum_{i=1}^{n} \hat\varepsilon_{ij}^2\, \hat u_i^2}{SSR_j^2}
$$
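For a regression with a single regressor the formula is easy to compute by hand. The sketch below assumes the usual textbook reading, which the slide does not spell out: $\hat\varepsilon_{ij}$ is the residual from regressing $X_j$ on the other regressors, $SSR_j$ is the sum of squared residuals from that regression, and $\hat u_i$ is the OLS residual of the main regression. With only a constant as the other regressor, $\hat\varepsilon_{i1} = X_i - \bar X$ and $SSR_1 = SST_X$. The simulated data and parameter values are hypothetical.

```python
import math
import random

def slope_standard_errors(x, y):
    """Return (slope, conventional SE, heteroskedasticity-robust SE) in simple regression."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Residuals from regressing x on a constant are the deviations from the mean
    e = [xi - xbar for xi in x]
    ssr_x = sum(ei ** 2 for ei in e)  # SSR_j, equal to SST_X here
    b1 = sum(ei * (yi - ybar) for ei, yi in zip(e, y)) / ssr_x
    b0 = ybar - b1 * xbar
    u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
    s2_u = sum(ui ** 2 for ui in u) / (n - 2)
    conventional = math.sqrt(s2_u / ssr_x)
    robust = math.sqrt(sum(ei ** 2 * ui ** 2 for ei, ui in zip(e, u)) / ssr_x ** 2)
    return b1, conventional, robust

rng = random.Random(1)
n = 4000
x = [4.0 * i / (n - 1) for i in range(n)]
# Assumed error structure: sd(u_i | X) = x_i^2
y = [1.0 + 2.0 * xi + rng.gauss(0.0, xi ** 2) for xi in x]

b1, se_conv, se_rob = slope_standard_errors(x, y)
print(f"slope = {b1:.3f}, conventional SE = {se_conv:.3f}, robust SE = {se_rob:.3f}")
```

In this design the error variance is largest where $(X_i - \bar X)^2$ is also large, so the robust standard error should exceed the conventional one.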