
MULTICOLLINEARITY

DISMAS ALEX
Dept of Economics and Tax Management
The Institute of Finance Management
In this lecture we take a critical look at the classical assumption of no
multicollinearity among the regressors by seeking answers to the following questions:
1. What is the nature of multicollinearity?
2. Is multicollinearity really a problem?
3. What are its practical consequences?
4. How does one detect it?
5. What remedial measures can be taken to alleviate the problem of
multicollinearity?
INTUITION
• Consider the following regression:

Salaryi = β0 + β1Experiencei + β2Agei + ei

Interpretations of the regression coefficients:

β1 - the marginal effect on salary of 1 additional year of experience, holding other variables constant
β2 - the marginal effect on salary of 1 additional year of age, holding other variables constant

• The catch: experience and age tend to move closely together, so it is hard to hold one constant while varying the other.
DEFINITION OF
MULTICOLLINEARITY
Perfect multicollinearity is a violation of the classical assumption that no explanatory
variable is a perfect linear function of any other explanatory variable(s).
•Perfect (or Exact) Multicollinearity
If two or more independent variables have an exact linear relationship
between them then we have perfect multicollinearity.
• Multicollinearity occurs when the X variables are themselves related
• Here is an example of perfect multicollinearity in a model with
two explanatory variables:
𝑌i = 𝛽0 + 𝛽1X1i + 𝛽2X2i + ei

X1i = 𝛼0 + 𝛼1X2i
MULTICOLLINEARITY CONT…
•Consequence: OLS cannot generate estimates of the regression coefficients (the software reports an error message).

•Why? OLS cannot estimate the marginal effect of X1 on Y while holding X2 constant, because X2 moves exactly when X1 moves!

•Solution: Easy! Drop one of the variables (a numerical sketch follows below).
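Here is a minimal numerical sketch of this failure (not from the original slides; the variable names and data are made up). Because x1 is an exact linear function of x2, the design matrix loses a rank and the normal equations X′Xβ = X′y have no unique solution; dropping one column restores estimability:

```python
import numpy as np

rng = np.random.default_rng(0)
x2 = rng.uniform(20.0, 60.0, size=30)        # e.g. age
x1 = 2.0 * x2                                # EXACT linear function of x2
y = 10 + 2 * x1 + 3 * x2 + rng.normal(size=30)

X = np.column_stack([np.ones(30), x1, x2])   # design matrix with intercept
print("rank:", np.linalg.matrix_rank(X))     # 2, not 3 -> X'X is singular
print("cond:", np.linalg.cond(X.T @ X))      # astronomically large

# Solving the normal equations either raises LinAlgError or returns
# meaningless numbers, depending on rounding during the factorization.
try:
    print(np.linalg.solve(X.T @ X, X.T @ y))
except np.linalg.LinAlgError as err:
    print("OLS fails:", err)

# Remedy: drop one of the collinear columns.
X_r = np.column_stack([np.ones(30), x2])
print(np.linalg.solve(X_r.T @ X_r, X_r.T @ y))   # unique, stable estimates
```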
Consequences of Multicollinearity
1. Difficult to identify separate effects of each variable in the model.
2. The variances and the standard errors of the regression coefficient estimates
will increase. This means lower t-statistics.
3. Estimates of parameters may not appear significantly different from zero
even though the F-test for the joint significance of the correlated variables may be
large.
4. Regression coefficients will be sensitive to specifications. Regression
coefficients can change substantially when variables are added or dropped.
5. The overall fit of the regression equation will be largely unaffected by
multicollinearity. This also means that forecasting and prediction will be
largely unaffected.
(Read also Gujarati, D. N. (2004), p. 350. A simulation sketch of consequences 2 and 5 follows below.)
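The following simulation sketch (all numbers illustrative, written for this handout) demonstrates consequences 2 and 5: as x2 becomes nearly collinear with x1, the standard error of β̂1 balloons and its t-statistic collapses, while R² barely moves:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)

for noise in (1.0, 0.01):                    # weakly vs nearly collinear regressors
    x2 = x1 + noise * rng.normal(size=n)     # corr(x1, x2) rises as noise shrinks
    y = 1 + 2 * x1 + 2 * x2 + rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    res = sm.OLS(y, X).fit()
    print(f"noise={noise}: R2={res.rsquared:.3f}  "
          f"se(b1)={res.bse[1]:.3f}  t(b1)={res.tvalues[1]:.2f}")
```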
Why Care?
•What does multicollinearity do to my regression results?
Why Care? Cont….
• In passing, note that multicollinearity, as we have defined it, refers only to linear
relationships among the X variables. It does not rule out nonlinear relationships
among them. For example, consider the following regression model:
• Yi = β0 + β1Xi + β2Xi² + β3Xi³ + ui
• where, say, Y = total cost of production and X = output. The variables Xi² (output
squared) and Xi³ (output cubed) are obviously functionally related to Xi, but the
relationship is nonlinear (a numerical sketch follows this slide).
• Why does the classical linear regression model assume that there is no
multicollinearity among the X’s? The reasoning is this:
1. If multicollinearity is perfect, the regression coefficients of the X variables are
indeterminate and their standard errors are infinite.
2. If multicollinearity is less than perfect, the regression coefficients, although
determinate, possess large standard errors which means the coefficients cannot
be estimated with great precision or accuracy.
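A quick sketch (hypothetical cost data) of the point above: X, X² and X³ are functionally but not linearly related, so the design matrix keeps full rank and OLS runs, even though the pairwise correlations among the terms can be very high:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 10.0, size=50)                          # output
y = 100 + 5*x - 0.8*x**2 + 0.1*x**3 + rng.normal(size=50)    # total cost

X = sm.add_constant(np.column_stack([x, x**2, x**3]))
print("rank:", np.linalg.matrix_rank(X))                     # 4: full rank, OLS runs
print("corr(x, x^2):", round(np.corrcoef(x, x**2)[0, 1], 3)) # still very high
print(sm.OLS(y, X).fit().params)
```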
Sources of multicollinearity
There are several sources of multicollinearity.
1. The data collection method employed, for example, sampling over a limited range of the
values taken by the regressors in the population.
2. Constraints on the model or in the population being sampled. For example, in the
regression of electricity consumption on income (X2) and house size (X3), families with
higher incomes generally live in larger houses, so high X2 almost always accompanies high X3.
3. Model specification, for example, adding polynomial terms to a regression model,
especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has more explanatory variables
than the number of observations.
• An additional reason for multicollinearity, especially in time series data, may be that the
regressors included in the model share a common trend, that is, they all increase or
decrease over time.
The Detection of Multicollinearity
•High Correlation Coefficients
Pairwise correlations among independent variables might be high (in
absolute value). Rule of thumb: If the correlation > 0.9 then severe
multicollinearity may be present.
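A minimal pandas sketch of this check (column names and data are made-up placeholders):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
income = rng.normal(50, 10, size=40)
df = pd.DataFrame({
    "income": income,
    "wealth": 10 * income + rng.normal(0, 5, size=40),  # nearly collinear with income
    "age":    rng.uniform(25, 60, size=40),
})

corr = df.corr()
print(corr.round(3))

# Flag off-diagonal pairs breaching the |r| > 0.9 rule of thumb
cols = list(df.columns)
for i, a in enumerate(cols):
    for b in cols[i + 1:]:
        if abs(corr.loc[a, b]) > 0.9:
            print(f"possible severe multicollinearity: {a} vs {b} "
                  f"(r = {corr.loc[a, b]:.3f})")
```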
The Detection of Multicollinearity
•High R² with Low t-Statistic Values
It is possible for the individual regression coefficients to be insignificant
while the overall fit of the equation is high.
•High Variance Inflation Factors (VIFs)
A VIF measures the extent to which multicollinearity has increased
the variance of an estimated coefficient. It looks at the extent to
which an explanatory variable can be explained by all the other
explanatory variables in the equation. For regressor Xj,

VIF(β̂j) = 1 / (1 − Rj²)

where Rj² is the R² from the auxiliary regression of Xj on all the other
explanatory variables. A common rule of thumb is that VIF > 10
(equivalently, Rj² > 0.9) signals severe multicollinearity (a statsmodels
sketch follows).
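A sketch of the VIF check using statsmodels' variance_inflation_factor on the same kind of made-up data as above (only the data are assumptions; the function itself is part of statsmodels):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
income = rng.normal(50, 10, size=40)
X = pd.DataFrame({
    "income": income,
    "wealth": 10 * income + rng.normal(0, 5, size=40),  # nearly collinear
    "age":    rng.uniform(25, 60, size=40),
})
X = sm.add_constant(X)   # include the intercept when computing VIFs

for j, name in enumerate(X.columns):
    if name != "const":
        print(f"VIF({name}) = {variance_inflation_factor(X.values, j):.1f}")
```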
Remedies for Multicollinearity

No single solution exists that will eliminate multicollinearity. Certain approaches may be useful:
1. Do Nothing
Live with what you have.
2. Drop a Redundant Variable: If a variable is redundant, it should have never
been included in the model in the first place. So dropping it actually is just
correcting for a specification error. Use economic theory to guide your choice of
which variable to drop.
3. Transform the Multicollinear Variables: Sometimes you can reduce
multicollinearity by re-specifying the model, for instance, by creating a combination of
the multicollinear variables. As an example, rather than including the variables
GDP and population in the model, include GDP/population (GDP per capita)
instead (see the sketch after this list).
4. Increase the Sample Size
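To illustrate remedy 3, here is a brief sketch (hypothetical GDP and population figures) of replacing two collinear regressors with a single combined variable:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
pop = rng.uniform(1.0, 100.0, size=50)                   # population, millions
df = pd.DataFrame({
    "population": pop,
    "gdp": pop * rng.uniform(35.0, 45.0, size=50),       # GDP scales with population
})
print("corr:", round(df["gdp"].corr(df["population"]), 3))  # strongly collinear

# Remedy 3: replace the two collinear regressors with a single combination
df["gdp_per_capita"] = df["gdp"] / df["population"]
```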
AN ILLUSTRATIVE EXAMPLE: CONSUMPTION
EXPENDITURE
IN RELATION TO INCOME AND WEALTH
• Let us consider the consumption–income example in Table 10.5, where consumption
expenditure (Y) is regressed on income (X2) and wealth (X3). We obtain the following regression:

Ŷi = 24.7747 + 0.9415X2i − 0.0424X3i
se = (6.7525)   (0.8229)    (0.0807)
t  = (3.6690)   (1.1442)    (−0.5261)                    (10.6.1)
R² = 0.9635    R̄² = 0.9531    df = 7
• Regression (10.6.1) shows that income and wealth together explain about 96
percent of the variation in consumption expenditure, and yet neither of the slope
coefficients is individually statistically significant.
• The wealth variable has the wrong sign. Although βˆ2 and βˆ3 are individually
statistically insignificant, if we test the hypothesis that β2 = β3 = 0 simultaneously,
this hypothesis can be rejected, as Table 10.6 shows.
Table 10.6
Source of variation    SS            df    MSS
Due to regression      8,565.5541     2    4,282.7770
Due to residual          324.4459     7       46.3494

• Under the usual assumptions we obtain:
• F = 4,282.7770 / 46.3494 = 92.4019    (10.6.2)
• This F value is obviously highly significant. Our example shows dramatically what
multicollinearity does.
• The fact that the F test is significant but the t values of X2 and X3 are individually
insignificant means that the two variables are so highly correlated that it is impossible
to isolate the individual impact of either income or wealth on consumption.
• Dropping wealth and regressing Y on income X2 alone makes income highly
significant (regression 10.6.4, not reproduced here). If instead of regressing Y on X2, we regress it on X3, we obtain
Ŷi = 24.411 + 0.0498X3i
se = (6.874)   (0.0037)                                  (10.6.5)
t  = (3.551)   (13.29)         R² = 0.9567
• We see that wealth now has a significant impact on consumption
expenditure, whereas in (10.6.1) it appeared to have no effect on consumption
expenditure.
• Regressions (10.6.4) and (10.6.5) show very clearly that in situations
of extreme multicollinearity dropping the highly collinear variable will
often make the other X variable statistically significant. This result
would suggest that a way out of extreme collinearity is to drop the
collinear variable.
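The whole diagnosis can be reproduced in a few lines of statsmodels. The ten observations below are transcribed from Gujarati (2004), Table 10.5; treat them as assumed values and verify against the text before relying on them:

```python
import numpy as np
import statsmodels.api as sm

# Consumption (Y), income (X2) and wealth (X3); values transcribed from
# Gujarati (2004), Table 10.5 -- verify against the text.
Y  = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150])
X2 = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260])
X3 = np.array([810, 1009, 1273, 1425, 1633, 1876, 2052, 2201, 2435, 2686])

full = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
print(full.fvalue, full.tvalues)   # F highly significant, slope t-stats are not

wealth_only = sm.OLS(Y, sm.add_constant(X3)).fit()
print(wealth_only.tvalues)         # wealth alone is now highly significant
```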
THANK YOU
