Heteroscedasticity Lecture 2023
One of the assumptions of the classical linear regression model is that the error terms have the same variance. In many practical situations this assumption is not fulfilled, and we have the problem of heteroscedasticity. Heteroscedasticity does not destroy the unbiasedness and consistency properties of the ordinary least squares estimators, but these estimators no longer have the minimum-variance property. Recall that OLS assumes that V(𝜖ᵢ) = 𝜎² for all i; that is, the variance of the error term is constant (homoscedasticity). If the error terms do not have constant variance, they are said to be heteroscedastic. The term means "differing variance" and comes from the Greek "hetero" (different) and "scedasis" (dispersion).
Figure 1: Heteroscedasticity present in the data set.
Figure 2: Homoscedasticity present in the data set.
Examples:
1. The range in family income between the poorest and richest families in a town is the classic example of heteroscedasticity.
2. The range in annual sales between a corner drug store and a general store.
Example of heteroscedasticity
Let’s take a look at a classic example of heteroscedasticity. If you model household
consumption based on income, you’ll find that the variability in consumption
increases as income increases. Lower income households are less variable in
absolute terms because they need to focus on necessities and there is less room for
different spending habits. Higher income households can purchase a wide variety of
luxury items, or not, which results in a broader spread of spending habits.
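As a rough illustration of this pattern, the following sketch (hypothetical, simulated data; only numpy is assumed) generates consumption figures whose error spread grows with income and compares the spread in a low-income and a high-income group:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical incomes between 20 and 200 (thousands)
income = rng.uniform(20, 200, size=500)

# Consumption = 5 + 0.8*income + error whose standard deviation grows with income
# (the heteroscedastic part: the sd of the error is proportional to income)
error = rng.normal(0, 0.05 * income)
consumption = 5 + 0.8 * income + error

# Compare the spread of the disturbances in a low-income and a high-income group
low = error[income < 60]
high = error[income > 160]
print("sd of errors, low-income group :", low.std().round(2))
print("sd of errors, high-income group:", high.std().round(2))
```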
Why fix this problem? There are two big reasons why you want
homoscedasticity:
o While heteroscedasticity does not cause bias in the coefficient estimates, it does
make them less precise. Lower precision increases the likelihood that the
coefficient estimates are further from the correct population value.
o Heteroscedasticity tends to produce p-values that are smaller than they should
be. This effect occurs because heteroscedasticity increases the variance of the
coefficient estimates but the OLS procedure does not detect this increase.
Consequently, OLS calculates the t-values and F-values using an
underestimated amount of variance. This problem can lead you to conclude that
a model term is statistically significant when it is actually not significant.
There are several reasons why the variances of 𝒖𝒊 may be variable, some of
which are as follows.
1. Following the error-learning models, as people learn, their errors of behavior become smaller over time, so 𝜎ᵢ² is expected to decrease. Consider, for example, the relationship between the number of typing errors made in a given time period on a test and the hours put into typing practice. As the number of hours of typing practice increases, the average number of typing errors, as well as their variance, decreases.
2. As incomes grow, people have more discretionary income and hence more scope
for choice about the disposition of their income. Hence, 𝜎𝑖2 is likely to increase
with income. Thus, in the regression of savings on income, one is likely to find
𝜎𝑖2 increasing with income because people have more choices about their savings
behavior. Similarly, companies with larger profits are generally expected to show
greater variability in their dividend policies than companies with lower profits.
Also, growth-oriented companies are likely to show more variability in their
dividend payout ratio than established companies.
3. As data collecting techniques improve, 𝜎𝑖2 is likely to decrease. Thus, banks that
have sophisticated data processing equipment are likely to commit fewer errors
in the monthly or quarterly statements of their customers than banks without such
facilities.
4. Heteroscedasticity can also arise from specification errors such as omitting relevant variables: the residuals from a misspecified model may give the impression that the error variance is not constant. But if the omitted variables are included in the model, that impression may disappear.
Applying the usual formula, the OLS estimator of 𝛽₂ is

$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} \qquad (1)$$
Recall that 𝛽̂₂ is the best linear unbiased estimator (BLUE) if the assumptions of the classical model, including homoscedasticity, hold. Is it still BLUE when we drop only the homoscedasticity assumption and replace it with the assumption of heteroscedasticity? It is easy to prove that 𝛽̂₂ is still linear and unbiased.
As a matter of fact, to establish the unbiasedness of 𝛽̂2 it is not necessary that the
disturbances (𝑢𝑖 ) be homoscedastic. In fact, the variance of 𝑢𝑖 , homoscedastic or
heteroscedastic, plays no part in the determination of the unbiasedness property. We
showed that 𝛽̂2 is a consistent estimator under the assumptions of the classical linear
regression model. Although we will not prove it, it can be shown that 𝛽̂2 is a
consistent estimator despite heteroscedasticity, that is, as the sample size increases
indefinitely, the estimated 𝛽̂2 converges to its true value.
Furthermore, it can also be shown that under certain conditions (called regularity conditions), 𝛽̂₂ is asymptotically normally distributed. Of course, what we have said about 𝛽̂₂ also holds true for the other parameters of a multiple regression model.
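To see why unbiasedness does not depend on homoscedasticity, here is a short sketch of the standard argument in the two-variable model, written in deviation form with 𝑥ᵢ = 𝑋ᵢ − 𝑋̄ treated as nonstochastic; only E(𝑢ᵢ) = 0 is used:

$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2} = \beta_2 + \sum_i k_i u_i, \qquad k_i = \frac{x_i}{\sum x_i^2}, \qquad E(\hat{\beta}_2) = \beta_2 + \sum_i k_i\,E(u_i) = \beta_2 .$$

Nothing in this argument involves the variance of 𝑢ᵢ, so the result holds whether the errors are homoscedastic or heteroscedastic.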
Granted that 𝛽̂₂ is still linear, unbiased, and consistent, is it "efficient" or "best"? Does it have minimum variance in the class of linear unbiased estimators? And is that minimum variance given by the usual homoscedastic variance formula? The answer to both questions is no: 𝛽̂₂ is no longer best, and its minimum variance is not given by that formula.
Consequences of Heteroscedasticity:
1. The OLS estimators, and the regression predictions based on them, remain unbiased and consistent.
2. The OLS estimators are no longer the BLUE (Best Linear Unbiased
Estimators) because they are no longer efficient, so the regression predictions
will be inefficient too.
3. Because the usual estimator of the covariance matrix of the regression coefficients is biased and inconsistent, the tests of hypotheses (t test, F test) are no longer valid. A small simulation illustrating these points is sketched below.
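The following Monte Carlo sketch (simulated data; numpy and statsmodels are assumed to be available) illustrates these consequences: the slope estimates stay centered on the true value, but the conventional OLS standard errors understate the true sampling variability, so the t test rejects a true hypothesis far more often than the nominal 5%.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, reps, true_beta = 100, 2000, 0.5
x = np.linspace(1, 10, n)
X = sm.add_constant(x)

betas, reject = [], 0
for _ in range(reps):
    u = rng.normal(0, x)              # heteroscedastic errors: sd grows with x
    y = 2 + true_beta * x + u
    res = sm.OLS(y, X).fit()
    betas.append(res.params[1])
    # test H0: beta = true value; with valid SEs this should reject about 5% of the time
    t = (res.params[1] - true_beta) / res.bse[1]
    reject += abs(t) > 1.96

betas = np.array(betas)
print("mean of beta-hat     :", betas.mean().round(3))   # close to 0.5 -> still unbiased
print("true sd of beta-hat  :", betas.std().round(3))
print("rejection rate at 5% :", reject / reps)            # noticeably above 0.05
```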
Detection of Heteroscedasticity:
Graphical Method:
If there is no a priori or empirical information about the nature of the heteroscedasticity, in practice one can do the regression analysis on the assumption that there is no heteroscedasticity and then do a post-mortem examination of the squared residuals 𝑢̂ᵢ² to see whether they exhibit any systematic pattern.
In figure (a) there is no systematic pattern between the two variables, which suggests that no heteroscedasticity is present in the data. In figures (b), (c), (d), and (e), however, there are systematic patterns between the two variables, suggesting that heteroscedasticity is present in the data set.
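A minimal sketch of this post-mortem check on simulated data, assuming statsmodels and matplotlib are available:

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, 200)
y = 2 + 0.5 * x + rng.normal(0, x)        # spread of the error grows with x

res = sm.OLS(y, sm.add_constant(x)).fit() # estimate ignoring heteroscedasticity
u2 = res.resid ** 2                       # squared residuals, proxy for sigma_i^2

plt.scatter(res.fittedvalues, u2, s=10)
plt.xlabel("fitted values $\\hat{Y}_i$")
plt.ylabel("squared residuals $\\hat{u}_i^2$")
plt.title("A fan-shaped pattern suggests heteroscedasticity")
plt.show()
```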
Formal Methods
Park Test
Park formalizes the graphical method by suggesting that 𝜎ᵢ² is some function of the explanatory variable 𝑋ᵢ. The functional form he suggested was

$$\sigma_i^2 = \sigma^2 X_i^{\beta} e^{v_i}$$

or

$$\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i \qquad (1)$$
where 𝑣𝑖 is the stochastic disturbance term.
Since 𝜎ᵢ² is generally not known, Park suggests using 𝑢̂ᵢ² as a proxy and running the following regression:

$$\ln \hat{u}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i = \alpha + \beta \ln X_i + v_i \qquad (2)$$
If 𝛽 turns out to be statistically significant, it suggests that heteroscedasticity is present in the data; if it turns out to be insignificant, we may accept the assumption of homoscedasticity. The hypothesis to be tested is H₀: there is no heteroscedasticity, and a t test is used to assess the significance of 𝛽. If the test is significant, we reject the null hypothesis; otherwise we do not reject it.
The Park test is thus a two-stage procedure. In the first stage we run the OLS regression disregarding the heteroscedasticity question and obtain 𝑢̂ᵢ from this regression; in the second stage we run regression (2).
Although empirically appealing, the Park test has some problems. Goldfeld and
Quandt have argued that the error term 𝑣𝑖 entering into (2) may not satisfy the OLS
assumptions and may itself be heteroscedastic. Nonetheless, as a strictly exploratory
method, one may use the Park test.
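A sketch of the two-stage Park test on simulated data, assuming statsmodels is available:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, 200)
y = 2 + 0.5 * x + rng.normal(0, x)

# Stage 1: OLS ignoring heteroscedasticity, keep the residuals
stage1 = sm.OLS(y, sm.add_constant(x)).fit()
u_hat = stage1.resid

# Stage 2: regress ln(u_hat^2) on ln(X); a significant slope suggests heteroscedasticity
stage2 = sm.OLS(np.log(u_hat ** 2), sm.add_constant(np.log(x))).fit()
print("beta estimate :", stage2.params[1].round(3))
print("t statistic   :", stage2.tvalues[1].round(3))
print("p-value       :", stage2.pvalues[1].round(4))  # small p-value -> reject H0 of homoscedasticity
```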
Glejser Test
The Glejser test is similar in spirit to the Park test. After obtaining the residuals 𝑢̂𝑖
from the OLS regression, Glejser suggests regressing the absolute values of 𝑢̂𝑖 on
the 𝑋 variable that is thought to be closely associated with 𝜎𝑖2 . In his experiments,
Glejser used the following functional forms:
$$|\hat{u}_i| = \beta_1 + \beta_2 X_i + v_i$$
$$|\hat{u}_i| = \beta_1 + \beta_2 \sqrt{X_i} + v_i$$
$$|\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{X_i} + v_i$$
$$|\hat{u}_i| = \beta_1 + \beta_2 \frac{1}{\sqrt{X_i}} + v_i$$
$$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i$$
$$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$$
Again as an empirical or practical matter, one may use the Glejser approach. But
Goldfeld and Quandt point out that the error term 𝑣𝑖 has some problems in that its
expected value is nonzero, it is serially correlated, and ironically it is
heteroscedastic. An additional difficulty with the Glejser method is that models such as

$$|\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i \quad \text{and} \quad |\hat{u}_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$$

are nonlinear in the parameters and therefore cannot be estimated with the usual OLS procedure.
Glejser has found that for large samples the first four of the preceding models give generally satisfactory results in detecting heteroscedasticity. As a practical matter, therefore, the Glejser technique may be used for large samples, and in small samples strictly as a qualitative device to learn something about heteroscedasticity. In either case, the significance of 𝛽₂ is tested with the usual t test.
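A sketch of the Glejser test using the first functional form, |𝑢̂ᵢ| = 𝛽₁ + 𝛽₂𝑋ᵢ + 𝑣ᵢ, on simulated data and assuming statsmodels; the other forms only change the regressor:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(1, 10, 200)
y = 2 + 0.5 * x + rng.normal(0, x)

u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Glejser: regress |u_hat| on X; a significant beta_2 suggests heteroscedasticity
glejser = sm.OLS(np.abs(u_hat), sm.add_constant(x)).fit()
print("beta_2  :", glejser.params[1].round(3))
print("t value :", glejser.tvalues[1].round(3))
print("p-value :", glejser.pvalues[1].round(4))
```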
Spearman's Rank Correlation Test
Step 1. Fit the regression of 𝑌 on 𝑋 to the data and obtain the residuals 𝑢̂ᵢ.
Step 2. Ignoring the sign of 𝑢̂ᵢ, that is, taking their absolute values |𝑢̂ᵢ|, rank both |𝑢̂ᵢ| and 𝑋ᵢ (or 𝑌̂ᵢ) in ascending or descending order and compute Spearman's rank correlation coefficient given previously.
Step 3. Assuming that the population rank correlation coefficient 𝜌ₛ is zero and 𝑛 > 8, the significance of the sample 𝑟ₛ can be tested by the 𝑡 test as follows:

$$t = \frac{r_s \sqrt{n-2}}{\sqrt{1 - r_s^2}} \qquad (2)$$

with df = 𝑛 − 2.
If the computed 𝑡 value exceeds the critical 𝑡 value, we may accept the hypothesis
of heteroscedasticity; otherwise we may reject it. If the regression model involves
more than one 𝑋 variable, 𝑟𝑠 can be computed between |𝑢̂𝑖 | and each of the 𝑋
variables separately and can be tested for statistical significance by the 𝑡 test given
in Eq. (2).
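A sketch of these three steps on simulated data, assuming statsmodels and scipy are available; spearmanr supplies 𝑟ₛ and the t statistic of Eq. (2) is formed by hand:

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import spearmanr, t as t_dist

rng = np.random.default_rng(5)
n = 60
x = rng.uniform(1, 10, n)
y = 2 + 0.5 * x + rng.normal(0, x)

# Step 1: fit the regression and obtain residuals
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid

# Step 2: Spearman rank correlation between |u_hat| and X
r_s, _ = spearmanr(np.abs(u_hat), x)

# Step 3: t test of H0: rho_s = 0 with n - 2 df
t_stat = r_s * np.sqrt(n - 2) / np.sqrt(1 - r_s ** 2)
p_value = 2 * t_dist.sf(abs(t_stat), df=n - 2)
print("r_s =", round(r_s, 3), " t =", round(t_stat, 3), " p =", round(p_value, 4))
```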
Goldfeld-Quandt Test:
This popular method is applicable if one assumes that the heteroscedastic
variance 𝜎𝑖2 , is positively related to one of the explanatory variables in the regression
model. For simplicity, consider the usual two-variable model:
𝑌𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + 𝑢𝑖
Suppose 𝜎𝑖2 is positively related to 𝑋𝑖 as
𝜎𝑖2 = 𝜎 2 𝑋𝑖2 … (1)
where 𝜎 2 is a constant.
Assumption (1) postulates that 𝜎𝑖2 is proportional to the square of the 𝑋 variable.
Such an assumption has been found quite useful by Prais and Houthakker in their
study of family budgets.
If (1) is appropriate, it would mean 𝜎𝑖2 would be larger, the larger the values of 𝑋𝑖 .
If that turns out to be the case, heteroscedasticity is most likely to be present in the
model. To test this explicitly, Goldfeld and Quandt suggest the following steps:
Step 1: Order or rank the observations according to the values of 𝑋𝑖 , beginning with
the lowest 𝑋 value.
Step 2: Omit 𝑐 central observations, where c is specified a priori, and divide the
remaining (𝑛 − 𝑐) observations into two groups each of (𝑛 − 𝑐)/2 observations.
Step 3: Fit separate OLS regressions to the first (𝑛 − 𝑐)/2 observations and the last
(𝑛 − 𝑐)/2 observations, and obtain the respective residual sums of squares RSS1
and RSS2, RSS1 representing the RSS from the regression corresponding to the
smaller 𝑋𝑖 values (the small variance group) and RSS2, that from the larger 𝑋𝑖 values
(the large variance group). These RSS each have
$$\frac{n-c}{2} - k \quad \text{or} \quad \frac{n - c - 2k}{2} \ \text{df}$$
where 𝑘 is the number of parameters to be estimated, including the intercept. For
the two-variable case 𝑘 is of course 2.
Step 4: Compute the ratio

$$\lambda = \frac{\mathrm{RSS}_2 / \mathrm{df}}{\mathrm{RSS}_1 / \mathrm{df}}$$

If the disturbances are normally distributed and homoscedasticity holds, 𝜆 follows the F distribution with (𝑛 − 𝑐 − 2𝑘)/2 df in both the numerator and the denominator. If in an application the computed 𝜆 (= 𝐹) is greater than the critical 𝐹 at the chosen level of significance, we can reject the hypothesis of homoscedasticity (H₀: there is no heteroscedasticity); that is, we can say that heteroscedasticity is very likely.
Before illustrating the test, a word about omitting the 𝑐 central observations is in
order. These observations are omitted to sharpen or accentuate the difference
between the small variance group (i.e., RSS1) and the large variance group (i.e.,
RSS2). But the ability of the Goldfeld-Quandt test to do this successfully depends
on how 𝑐 is chosen. For the two-variable model the Monte Carlo experiments done
by Goldfeld and Quandt suggest that 𝑐 is about 8 if the sample size is about 30, and
it is about 16 if the sample size is about 60. But Judge et al. note that 𝑐 = 4 if 𝑛 =
30 and 𝑐 = 10 if 𝑛 is about 60 have been found satisfactory in practice.
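A sketch of the Goldfeld-Quandt steps on simulated data, assuming numpy, statsmodels, and scipy are available (statsmodels also provides het_goldfeldquandt, which automates the same computation):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import f as f_dist

rng = np.random.default_rng(6)
n, c, k = 30, 4, 2                     # c = 4 central observations dropped (Judge et al. for n = 30)
x = rng.uniform(1, 10, n)
y = 2 + 0.5 * x + rng.normal(0, x)

# Step 1: order the observations by X
order = np.argsort(x)
x_s, y_s = x[order], y[order]

# Step 2: drop the c central observations and split the rest into two equal groups
m = (n - c) // 2
x1, y1 = x_s[:m], y_s[:m]              # small-X (small variance) group
x2, y2 = x_s[-m:], y_s[-m:]            # large-X (large variance) group

# Step 3: separate regressions and their residual sums of squares
rss1 = sm.OLS(y1, sm.add_constant(x1)).fit().ssr
rss2 = sm.OLS(y2, sm.add_constant(x2)).fit().ssr

# Step 4: lambda = (RSS2/df) / (RSS1/df) ~ F(df, df) under homoscedasticity
df = m - k
lam = (rss2 / df) / (rss1 / df)
print("lambda =", round(lam, 3), "  critical F(0.05) =", round(f_dist.ppf(0.95, df, df), 3))
```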
Solution of Heteroscedasticity:
Once heteroscedasticity is detected, the appropriate solution is to transform the original model in such a way that the transformed error term has constant variance; OLS can then be applied to the transformed model. The adjustment depends on the form of the relationship between Var(𝑢ᵢ) and the explanatory variable 𝑋.
a) If 𝝈𝟐𝒊 is known:
If heteroscedasticity is suspected and Var(𝜖ᵢ) = 𝜎ᵢ², where 𝜎ᵢ² is known for each observation, then we use weighted least squares (WLS), which is a special case of a more general econometric technique known as generalized least squares (GLS).
Suppose the error variance is proportional to a known variable 𝑧ᵢ, say Var(𝜖ᵢ) = 𝜎ᵢ² = 𝑘𝑧ᵢ. Dividing every term of the model by √𝑧ᵢ gives

$$\frac{y_i}{\sqrt{z_i}} = \beta_0 \frac{1}{\sqrt{z_i}} + \beta_1 \frac{X_{1i}}{\sqrt{z_i}} + \cdots + \beta_k \frac{X_{ki}}{\sqrt{z_i}} + \frac{\epsilon_i}{\sqrt{z_i}}$$

Now we can transform our variables as

$$y_i^{*} = \frac{y_i}{\sqrt{z_i}}, \qquad x_{ji}^{*} = \frac{x_{ji}}{\sqrt{z_i}}, \quad j = 1, 2, \ldots, k, \qquad \epsilon_i^{*} = \frac{\epsilon_i}{\sqrt{z_i}}$$
In the transformed model,

$$\mathrm{Var}(\epsilon_i^{*}) = \mathrm{Var}\!\left(\frac{\epsilon_i}{\sqrt{z_i}}\right) = \frac{1}{z_i}\,\mathrm{Var}(\epsilon_i) = \frac{1}{z_i}\,\sigma_i^2 = \frac{1}{z_i}\,k z_i = k$$
The transformed model satisfies all the assumptions of the classical linear regression model, and thus this procedure yields parameter estimators that are consistent, efficient, and BLUE.
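A sketch of this weighted least squares fit on simulated data, assuming statsmodels; 𝑧ᵢ here is a hypothetical known variance driver, and using weights 1/𝑧ᵢ in sm.WLS is equivalent to dividing every variable by √𝑧ᵢ and applying OLS:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
x = rng.uniform(1, 10, n)
z = x ** 2                                  # assumed known: Var(eps_i) = k * z_i
y = 2 + 0.5 * x + rng.normal(0, np.sqrt(z))

X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()                    # unbiased but inefficient under heteroscedasticity
wls = sm.WLS(y, X, weights=1.0 / z).fit()   # weights 1/z_i <=> dividing the model by sqrt(z_i)

print("OLS slope:", ols.params[1].round(3), " se:", ols.bse[1].round(3))
print("WLS slope:", wls.params[1].round(3), " se:", wls.bse[1].round(3))
```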
Heteroscedasticity can also arise from other sources, such as unobserved variables or sample selection bias. Identifying the underlying causes of heteroscedasticity is essential for selecting the appropriate corrective measures.