Chapter 3

Chapter 3 discusses the implications of heteroskedasticity and cluster sampling for statistical inference in business analytics. It highlights that violations of homoskedasticity and independence bias the usual variance estimators and invalidate standard inference procedures, and presents solutions such as heteroskedasticity-corrected covariance matrix estimators and cluster-robust estimators. The chapter emphasizes the importance of using these robust methods to ensure valid results in regression analysis.


Statistical Foundations of Business Analytics

Chapter 3: Heteroskedasticity and Cluster Sampling

Tim Ederer

Mini 2, 2024
Tepper Business School
Introduction

With Chapters 1 and 2, we are now equipped to make inferences about β


• Relies on assumptions EXO, RANK, IID, and HOMOSKEDASTICITY

What happens when HOMOSKEDASTICITY is not satisfied?


• Var(ε_i | x_i) = σ_i² instead of Var(ε_i | x_i) = σ²

What happens when IID is not satisfied?


• Cov(ε_i, ε_j | X) ≠ 0 instead of Cov(ε_i, ε_j | X) = 0

1 / 16
Reminder: Variance of OLS Estimator

Remember that the variance of β̂ has the following expression

$$
\mathrm{Var}(\hat\beta \mid X) = (X'X)^{-1} X' \,\mathrm{Var}(\varepsilon \mid X)\, X \,(X'X)^{-1}
$$

Problem: Var(ε|X ) is a complex object


 
$$
\mathrm{Var}(\varepsilon \mid X) =
\begin{pmatrix}
\mathrm{Var}(\varepsilon_1 \mid X) & \mathrm{Cov}(\varepsilon_1, \varepsilon_2 \mid X) & \cdots & \mathrm{Cov}(\varepsilon_1, \varepsilon_n \mid X) \\
\mathrm{Cov}(\varepsilon_1, \varepsilon_2 \mid X) & \mathrm{Var}(\varepsilon_2 \mid X) & \cdots & \mathrm{Cov}(\varepsilon_2, \varepsilon_n \mid X) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(\varepsilon_1, \varepsilon_n \mid X) & \mathrm{Cov}(\varepsilon_2, \varepsilon_n \mid X) & \cdots & \mathrm{Var}(\varepsilon_n \mid X)
\end{pmatrix}
$$

2 / 16
Reminder: Homoskedasticity and IID

HOMOSKEDASTICITY + IID greatly simplify the problem!


 2 
σ 0 ... 0
 0 σ2 . . . 0
2 2 ′ −1
Var(ε|X ) =  . ..  = σ In =⇒ Var(β̂|X ) = σ (X X )
 
.. . .
 .. . . .
0 0 ... σ2

This has two important consequences


• β̂ is BLUE
• Var(β̂|X ) is easy to estimate
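In code, this estimate is just σ̂²(X′X)⁻¹ with σ̂² the usual degrees-of-freedom-corrected residual variance. A minimal numpy sketch on simulated data (variable names are illustrative, not from the course):

```python
# Classical OLS covariance under homoskedasticity: sigma2_hat * (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
y = X @ np.array([0.0, 1.0]) + rng.normal(size=n)      # homoskedastic errors, sigma = 1

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])          # unbiased estimator of sigma^2
cov_classical = sigma2_hat * XtX_inv                   # estimate of Var(beta_hat | X)
se = np.sqrt(np.diag(cov_classical))                   # classical standard errors
```

This is exactly the covariance matrix that standard regression output reports, and it is only valid under HOMOSKEDASTICITY.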

Let’s start by seeing what happens when we relax HOMOSKEDASTICITY

3 / 16
Heteroskedasticity
Heteroskedasticity: Definition

Heteroskedasticity means that the variance of εi is not constant across i

Var(ε_i | x_i) = σ_i² ≠ σ²

Examples
• Volatility in earnings increases with education: Var(ε_i | x_i) = γ_1 + γ_2 educ_i
• House price variance is higher in neighborhood A vs neighborhood B

4 / 16
Heteroskedasticity: Visualisation

Under HOMOSKEDASTICITY, plotting the residuals ε̂_i against x_i should show an even band of points around zero
• The variance of the residuals should not depend on x_i

5 / 16
Heteroskedasticity: Visualisation
Under heteroskedasticity, plotting the residuals ε̂_i against x_i can show a fanning-out pattern
• The variance of the residuals is increasing with x_i in this case

Why is this a problem?


6 / 16
Heteroskedasticity: Consequences

Assume now that the variance of εi is not constant across i


 2 
σ1 0 . . . 0
 0 σ22 . . . 0 
Var(ε|X ) =  . ..  = Ω
 
.. . .
 .. . . .
0 0 ... σn2

The variance of β̂ now has the following expression

$$
\mathrm{Var}(\hat\beta \mid X) = (X'X)^{-1} X' \Omega X (X'X)^{-1} \neq \sigma^2 (X'X)^{-1}
$$

7 / 16
Heteroskedasticity: Consequences

Two important consequences


• β̂ is not BLUE anymore (but it is still unbiased and consistent!)
• Our estimator for Var(β̂|X ) is biased

Under heteroskedasticity our inference procedure collapses


• The distribution of test statistics is no longer known
• Confidence intervals are wrong

This will eventually lead you to wrong conclusions!

8 / 16
Illustration in R

 
Consider the model y_i = β_1 + β_2 x_i + ε_i with β = (0, 1)′

• We assume that x_i ∼ N(0, 1) and ε_i | x_i ∼ N(0, σ_i²)
• Set σ_i = 1 + 0.5x_i + 0.1x_i²

Draw 10000 samples of n = 100 observations


• For each sample: compute β̂_2 and its confidence interval at level α = 5%
• The confidence interval should contain 1 for 95% of the samples

Results
• It contains 1 for only 88% of the samples
• Confidence intervals are narrower than they should be because the standard errors are biased!
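The slides run this experiment in R; the following is an equivalent sketch in Python/numpy under the same design (not the course code, and using 2,000 draws instead of 10,000 to keep it quick):

```python
# Monte Carlo: naive 95% CIs for beta_2 undercover when errors are heteroskedastic.
import numpy as np

rng = np.random.default_rng(42)
n, n_sims = 100, 2000
covered = 0
for _ in range(n_sims):
    x = rng.normal(size=n)
    sigma_i = 1 + 0.5 * x + 0.1 * x**2        # always positive for this polynomial
    eps = sigma_i * rng.normal(size=n)        # heteroskedastic errors
    y = 0 + 1 * x + eps                       # true beta = (0, 1)
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)
    se2 = np.sqrt(s2 * XtX_inv[1, 1])         # naive (homoskedastic) std error
    covered += (b[1] - 1.96 * se2 <= 1.0 <= b[1] + 1.96 * se2)
coverage = covered / n_sims                   # noticeably below the nominal 0.95
```

The coverage rate comes out well below 95%, matching the undercoverage reported on the slide.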

9 / 16
Testing for the Presence of Heteroskedasticity

The presence of heteroskedasticity can be visualized/tested


• We do not observe ε_i = y_i − x_i′β, but we do observe the residuals ε̂_i = y_i − x_i′β̂

Method 1: plot ε̂i against xi


• If the variance of ε̂i varies with xi this is evidence of heteroskedasticity
• Problem: not easy to visualize when xi is multidimensional

Method 2: White (1980) test


 
• Step 1: run the auxiliary regression ε̂_i² = γ_0 + z_i′γ_1 + ν_i where z_i = (x_i, x_i²)′
• Step 2: test H_0 : γ_1 = 0 ⇐⇒ H_0 : E[ε̂_i² | z_i] = Var(ε̂_i | z_i) = γ_0 (homoskedasticity)
• Rejecting H0 is evidence of the presence of heteroskedasticity
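The two steps can be sketched in numpy on simulated data; one common way to carry out Step 2 is the LM statistic n·R² from the auxiliary regression, which is χ²(2) under H_0 (the critical value is hard-coded below):

```python
# White-style test: regress squared residuals on z_i = (x_i, x_i^2).
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = 2 + 3 * x + rng.normal(size=n) * (1 + np.abs(x))  # heteroskedastic by construction

# Residuals from the main regression
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid2 = (y - X @ beta_hat) ** 2

# Step 1: auxiliary regression of squared residuals on z_i = (x_i, x_i^2)
Z = np.column_stack([np.ones(n), x, x**2])
gamma_hat = np.linalg.lstsq(Z, resid2, rcond=None)[0]
fitted = Z @ gamma_hat
r2 = 1 - ((resid2 - fitted) ** 2).sum() / ((resid2 - resid2.mean()) ** 2).sum()

# Step 2: LM statistic n * R^2 is chi2(2) under H0 (gamma_1 = 0)
lm_stat = n * r2
crit_5pct = 5.991          # chi2(2) critical value at the 5% level
reject = lm_stat > crit_5pct
```

With errors this strongly heteroskedastic, the test rejects H_0 comfortably.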

10 / 16
Heteroskedasticity Robust Variance Estimators

Good news: there exists a very simple solution to this problem


• Our estimator for the variance of β̂ is biased...
• Why not simply find another estimator for Var(β̂|X )?

Solution: Heteroskedasticity-Corrected Covariance Matrix Estimators (HCCME)


 2 
ε̂1 0 . . . 0
HC
 0 ε̂22 . . . 0 
′ −1 ′ b ′ −1
Var (β̂|X ) = (X X ) X ΩX (X X ) where Ω =  .
 
c b .. . . .. 
 .. . . .
0 0 ... ε̂2n

The HC estimator is unbiased under both heteroskedasticity and homoskedasticity


• Can derive robust standard errors: $\widehat{\mathrm{s.e.}}^{HC}(\hat\beta_k) = \sqrt{\widehat{\mathrm{Var}}^{HC}(\hat\beta \mid X)_{(k,k)}}$

11 / 16
Summary

Heteroskedasticity is a major problem for inference


• Introduces bias in estimator of variance of β̂
• Interpretation of results can be severely affected

You can test for the presence of heteroskedasticity


• Either visually or formally

But more importantly you can change your estimator for Var(β̂|X )!
• HC estimator is unbiased under homoskedasticity AND heteroskedasticity
• It can be computed at no cost in any statistical software
• There is no excuse for not using it next time you run a regression!

12 / 16
Cluster Sampling
Relaxing IID

What happens when individuals are not sampled independently?


• Introduces dependence between observations
• Cov(ε_i, ε_j | X) = σ_ij ≠ 0

The variance covariance matrix of ε becomes very complex


 2 
σ1 σ12 . . . σ1n
σ12 σ22 . . . σ2n 
Var(ε|X ) =  .
 
.. .. .. 
 .. . . . 
σ1n σ2n . . . σn2

13 / 16
Cluster Sampling

Focus on case where you sample groups (or clusters) instead of individuals
• Example: you sample households or villages instead of individuals
• Independence across clusters but dependence within clusters
• Cov(ε_i, ε_j | X) = σ_ij ≠ 0 for i and j in the same cluster c = 1, ..., C

The variance covariance matrix of ε has a block structure


   2 
Ω1 0 . . . 0 σ1 σ12 ... σ1nc
 0 Ω2 . . . 0   σ12 σ22 ... σ2nc 
Var(ε|X ) =  . ..  where Ωc =  ..
   
.. . . .. .. .. 
 .. . . .   . . . . 
0 0 . . . ΩC σ1nc σ2nc ... σn2c

14 / 16
Consequences

Same consequences as heteroskedasticity


• β̂ is not BLUE anymore
• Our estimator for Var(β̂|X ) is biased

Solution: Cluster-Robust Variance Covariance Estimator


$$
\widehat{\mathrm{Var}}^{CR}(\hat\beta \mid X) = (X'X)^{-1} \left( \sum_{c=1}^{C} X_c' \widehat{\Omega}_c X_c \right) (X'X)^{-1}
\quad \text{where} \quad
\widehat{\Omega}_c =
\begin{pmatrix}
\hat\varepsilon_1^2 & \hat\varepsilon_1 \hat\varepsilon_2 & \cdots & \hat\varepsilon_1 \hat\varepsilon_{n_c} \\
\hat\varepsilon_1 \hat\varepsilon_2 & \hat\varepsilon_2^2 & \cdots & \hat\varepsilon_2 \hat\varepsilon_{n_c} \\
\vdots & \vdots & \ddots & \vdots \\
\hat\varepsilon_1 \hat\varepsilon_{n_c} & \hat\varepsilon_2 \hat\varepsilon_{n_c} & \cdots & \hat\varepsilon_{n_c}^2
\end{pmatrix}
$$

The Cluster-Robust estimator is approximately unbiased when the number of clusters C is large


• Also robust to the presence of heteroskedasticity!
• Can derive cluster-robust standard errors: $\widehat{\mathrm{s.e.}}^{CR}(\hat\beta_k) = \sqrt{\widehat{\mathrm{Var}}^{CR}(\hat\beta \mid X)_{(k,k)}}$
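A numpy sketch of the cluster-robust sandwich, summing X_c′ε̂_c ε̂_c′X_c over clusters (the function name and the simulated cluster design are illustrative, not from the course):

```python
# Cluster-robust sandwich: the meat sums X_c' e_c e_c' X_c over clusters c.
import numpy as np

def cluster_cov(X, y, cluster_ids):
    """Return (beta_hat, cluster-robust covariance matrix)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    k = X.shape[1]
    meat = np.zeros((k, k))
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        s = X[mask].T @ e[mask]        # X_c' e_c, a k-vector
        meat += np.outer(s, s)         # X_c' e_c e_c' X_c
    return b, XtX_inv @ meat @ XtX_inv

rng = np.random.default_rng(3)
C, nc = 50, 10                          # 50 clusters of 10 observations each
cluster_ids = np.repeat(np.arange(C), nc)
u = rng.normal(size=C)[cluster_ids]     # shared within-cluster shock
x = rng.normal(size=C * nc)
y = 1 + 2 * x + u + rng.normal(size=C * nc)
X = np.column_stack([np.ones(C * nc), x])
beta_hat, cov_cr = cluster_cov(X, y, cluster_ids)
se_cr = np.sqrt(np.diag(cov_cr))        # cluster-robust standard errors
```

Because the within-cluster shock u makes errors correlated inside each cluster, the cluster-robust standard errors are the appropriate ones here, not the classical or HC ones.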

15 / 16
Summary

Important issues arise when HOMOSKEDASTICITY and IID are not satisfied
• Estimator of Var(β̂|X ) is biased
• Inference procedure breaks down: confidence intervals are wrong, tests are unreliable

But there is an easy fix!


• Heteroskedasticity: use HCCME for Var(β̂|X )
• Cluster sampling: use cluster-robust estimator for Var(β̂|X )
• These fixes can be implemented at no cost on any software!

Only important assumption remaining: EXO


• Chapter 4 studies what we should do when EXO does not hold

16 / 16
