7 Multiple Regression 3
Having discussed the classical normal linear regression model, we now revisit
some of its assumptions, explore situations in which they are violated, and
seek remedies (where any exist) should violations occur.
We focus on revisiting the following two assumptions:
No linear relationship exists between two or more of the independent variables
(no multicollinearity).
The error variances are identical and constant (homoscedasticity).
If perfect collinearity exists, the regression estimators
$\hat{a}, \hat{b}_1, \cdots, \hat{b}_k$ are not well-defined.
Intuitively, recall that the coefficient $b_i$ measures the change in $Y$ when
$x_i$ is shifted by one unit, with all other variables held constant.
However, if a linear relationship exists between two or more of the
independent variables, it would be impossible to change the value of one of
them without changing the value(s) of some of the rest.
Hence the previous interpretation is no longer valid.
$$s^2_{\hat{b}_u} = \frac{s^2}{S_{x_u x_u}\,(1 - r^2)}, \qquad u = 1, 2$$

$$\mathrm{Cov}(\hat{b}_1, \hat{b}_2) = \frac{-s^2\, r}{(1 - r^2)\sqrt{S_{x_1 x_1} S_{x_2 x_2}}},$$

where $r = \dfrac{S_{x_1 x_2}}{\sqrt{S_{x_1 x_1} S_{x_2 x_2}}}$ is the simple
correlation between $x_1$ and $x_2$.
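These formulas make the cost of collinearity concrete: as $|r| \to 1$, the factor $1/(1-r^2)$ inflates the estimator variances without bound. A minimal numerical sketch, on synthetic data (the regressors, sample size, and the value used for $s^2$ are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two regressors with controllable correlation rho (illustrative data)
for rho in [0.0, 0.9, 0.99]:
    z = rng.standard_normal(n)
    x1 = z
    x2 = rho * z + np.sqrt(1 - rho**2) * rng.standard_normal(n)

    # Sums of squares and cross-products, as in the formulas above
    Sx1x1 = np.sum((x1 - x1.mean())**2)
    Sx2x2 = np.sum((x2 - x2.mean())**2)
    Sx1x2 = np.sum((x1 - x1.mean()) * (x2 - x2.mean()))
    r = Sx1x2 / np.sqrt(Sx1x1 * Sx2x2)

    s2 = 1.0  # pretend the error-variance estimate is 1, for illustration
    var_b1 = s2 / (Sx1x1 * (1 - r**2))
    print(f"rho = {rho:4.2f}   sample r = {r:+.3f}   Var(b1_hat) = {var_b1:.5f}")
```

The printed variance grows sharply as the sample correlation approaches 1, even though the data-generating coefficients never change.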
$Y_i = a + b x_i + \epsilon_i$ for $i = 1, \cdots, n$
An informal but useful way to detect heteroscedasticity is to examine the
pattern of the residuals, e.g. a plot of the squared residuals
$\hat{\epsilon}_i^2$ against time for a time-series model.
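As a sketch of this diagnostic, the code below simulates a model whose error standard deviation grows over time, fits a simple regression, and summarises the squared-residual pattern numerically in place of an actual plot (the data-generating process and the half-sample trend check are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
t = np.arange(n)                       # "time" index (illustrative)
x = 1.0 + 0.05 * t                     # regressor growing over time
eps = rng.standard_normal(n) * x       # heteroscedastic: sd proportional to x
y = 2.0 + 3.0 * x + eps

# OLS fit of y on x (np.polyfit returns highest-degree coefficient first)
b, a = np.polyfit(x, y, 1)
e2 = (y - (a + b * x))**2              # squared residuals

# In a plot of e2 against t, a rising trend suggests heteroscedasticity;
# here we summarise that trend by comparing the two halves of the sample.
early_mean = e2[:n // 2].mean()
late_mean = e2[n // 2:].mean()
print(f"mean squared residual: first half {early_mean:.2f}, "
      f"second half {late_mean:.2f}")
```

With matplotlib one would plot `e2` against `t` directly; the half-sample comparison is just a crude numerical stand-in for eyeballing that plot.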
For the specific alternative hypothesis that $\sigma_i^2 = C x_i^2$, the
Goldfeld-Quandt test can be used.
Idea:
calculate two regression lines, one using data thought to be associated with
low variance errors, and the other using data thought to be associated with
high variance errors.
If the residual variances associated with each regression line are approximately
equal, the homoscedasticity assumption cannot be rejected.
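The steps above can be sketched as follows, on synthetic data whose error variance grows with $x_i$; the sample size, the choice to drop the middle third, and the use of `scipy.stats.f` for the p-value are illustrative choices, not prescribed in the notes:

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(2)
n = 90
x = np.sort(rng.uniform(1, 10, n))
eps = rng.standard_normal(n) * x       # error sd proportional to x_i
y = 2.0 + 3.0 * x + eps

# Goldfeld-Quandt idea: with the data ordered by x, drop the middle
# observations, then fit separate regressions on the low-x group
# (thought to have low-variance errors) and the high-x group
# (thought to have high-variance errors).
k = n // 3

def rss(xs, ys):
    """Residual sum of squares from a simple linear fit."""
    b, a = np.polyfit(xs, ys, 1)
    return np.sum((ys - (a + b * xs))**2)

rss_lo = rss(x[:k], y[:k])
rss_hi = rss(x[-k:], y[-k:])

# If the two residual variances are about equal, F is near 1 and
# homoscedasticity cannot be rejected; a large F suggests rejection.
F = rss_hi / rss_lo
df = k - 2                             # two parameters estimated in each fit
p = f_dist.sf(F, df, df)
print(f"F = {F:.2f}, p-value = {p:.4f}")
```

Because the simulated errors really are heteroscedastic, the high-variance group's residual variance dominates and the test rejects; on homoscedastic data F would hover near 1.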