Multiple Regression Analysis: Estimation
Võ Đức Hoàng Vũ
University of Economics HCMC
June 2015
Võ Đức Hoàng Vũ (UEH), Applied Econometrics, June 2015
Estimation

The model is: y = β0 + β1x1 + β2x2 + … + βkxk + u
β0 is still the intercept
β1 through βk are all called slope parameters
u is still the error term (or disturbance)
We still need to make a zero conditional mean assumption, so now assume that E(u | x1, x2, …, xk) = 0
We are still minimizing the sum of squared residuals, so we have k + 1 first order conditions
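As a minimal numerical sketch (not from the slides, using simulated data), OLS for a model with two regressors can be computed by solving the k + 1 normal equations; `np.linalg.lstsq` does exactly that. All coefficient values below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: estimating y = b0 + b1*x1 + b2*x2 + u by OLS,
# i.e. by minimizing the sum of squared residuals. All data are simulated.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

# Design matrix with a column of ones for the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])

# lstsq solves the k+1 first order conditions (the normal equations X'Xb = X'y).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to the true values [1.0, 2.0, -0.5]
```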
The estimated equation is ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk, so
Δŷ = β̂1Δx1 + β̂2Δx2 + … + β̂kΔxk,
so holding x2, …, xk fixed implies that
Δŷ = β̂1Δx1, that is,
each β̂ has a ceteris paribus interpretation.
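The ceteris paribus reading above can be checked numerically: after fitting, raising x1 by one unit while holding x2 fixed moves the fitted value by exactly β̂1. This is a sketch on simulated data; all numbers are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of the ceteris paribus interpretation of a fitted slope.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 3.0 + 1.5 * x1 + 0.8 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

yhat_base = b @ np.array([1.0, 2.0, 5.0])  # fitted value at x1 = 2, x2 = 5
yhat_step = b @ np.array([1.0, 3.0, 5.0])  # x1 raised by one, x2 held fixed
print(yhat_step - yhat_base, b[1])         # identical up to rounding
```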
Goodness-of-fit

We can think of each observation as being made up of an explained part and an unexplained part, yi = ŷi + ûi. We then define the following:

Σ(yi − ȳ)² is the total sum of squares (SST)
Σ(ŷi − ȳ)² is the explained sum of squares (SSE)
Σûi² is the residual sum of squares (SSR)

Then SST = SSE + SSR.
How do we think about how well our sample regression line fits our sample data?
We can compute the fraction of the total sum of squares (SST) that is explained by the model; call this the R-squared of the regression:
R² = SSE/SST = 1 − SSR/SST
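The decomposition and the two R² formulas can be verified on simulated data. This is a sketch, not from the slides; the data-generating values are illustrative assumptions.

```python
import numpy as np

# Illustrative check of SST = SSE + SSR and of the two R-squared formulas.
rng = np.random.default_rng(2)
n = 150
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.7 * x1 - 1.2 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b          # explained part
uhat = y - yhat       # residuals (unexplained part)

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((yhat - y.mean()) ** 2)
SSR = np.sum(uhat ** 2)

print(np.isclose(SST, SSE + SSR))            # True: the decomposition holds
print(np.isclose(SSE / SST, 1 - SSR / SST))  # True: both R² forms agree
```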
Goodness-of-fit (cont)

We can also think of R² as being equal to the squared correlation coefficient between the actual yi and the fitted values ŷi:

R² = [Σ(yi − ȳ)(ŷi − ŷ̄)]² / [Σ(yi − ȳ)² · Σ(ŷi − ŷ̄)²]

where ŷ̄ is the sample average of the fitted values.
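This equivalence is easy to check numerically: the R² from the sums of squares matches the squared sample correlation between y and ŷ. A sketch on simulated (illustrative) data:

```python
import numpy as np

# Illustrative check that R² equals the squared correlation between y and ŷ.
rng = np.random.default_rng(3)
n = 120
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 + 1.1 * x1 + 0.4 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

R2 = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
r = np.corrcoef(y, yhat)[0, 1]
print(np.isclose(R2, r ** 2))  # True
```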
Omitted Variable Bias

Suppose the true model contains both x1 and x2, but we omit x2 and regress y on x1 alone. The resulting slope estimate is

β̃1 = Σ(xi1 − x̄1)yi / Σ(xi1 − x̄1)²

Recall the true model, so that yi = β0 + β1xi1 + β2xi2 + ui; the numerator then becomes

Σ(xi1 − x̄1)(β0 + β1xi1 + β2xi2 + ui) = β1Σ(xi1 − x̄1)² + β2Σ(xi1 − x̄1)xi2 + Σ(xi1 − x̄1)ui
Omitted Variable Bias (cont)

Dividing through by Σ(xi1 − x̄1)² gives

β̃1 = β1 + β2 · Σ(xi1 − x̄1)xi2 / Σ(xi1 − x̄1)² + Σ(xi1 − x̄1)ui / Σ(xi1 − x̄1)²

Since E(ui) = 0, taking expectations we have

E(β̃1) = β1 + β2 · Σ(xi1 − x̄1)xi2 / Σ(xi1 − x̄1)²

Consider the regression of x2 on x1:

x̃2 = δ̃0 + δ̃1x1, then δ̃1 = Σ(xi1 − x̄1)xi2 / Σ(xi1 − x̄1)²

so E(β̃1) = β1 + β2δ̃1
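The algebra behind this result can be checked in-sample: the short-regression slope equals the long-regression slope plus the long coefficient on x2 times the x2-on-x1 slope, and the identity holds exactly, not just in expectation. A sketch on simulated data; all parameter values are illustrative assumptions.

```python
import numpy as np

# Illustrative check of the omitted variable algebra: the slope from the
# short regression (omitting x2) equals b1 + b2*delta1, where b1, b2 are the
# long-regression estimates and delta1 is the slope of x2 on x1.
rng = np.random.default_rng(4)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)           # x2 correlated with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

d1 = x1 - x1.mean()
beta1_short = np.sum(d1 * y) / np.sum(d1 ** 2)   # slope when x2 is omitted
delta1 = np.sum(d1 * x2) / np.sum(d1 ** 2)       # x2-on-x1 slope

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]         # long regression

print(np.isclose(beta1_short, b[1] + b[2] * delta1))  # True: exact identity
print(beta1_short)  # well above 2.0: positive bias, since b2 > 0 and Corr(x1, x2) > 0
```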
Summary of the direction of bias:

            Corr(x1, x2) > 0    Corr(x1, x2) < 0
β2 > 0      Positive bias       Negative bias
β2 < 0      Negative bias       Positive bias
Variance of the OLS Estimators

Var(β̂j) = σ² / [SSTj(1 − Rj²)], where
SSTj = Σ(xij − x̄j)² and Rj² is the R² from regressing xj on all of the other x's.
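This variance formula can be checked against the matrix expression σ²(X′X)⁻¹, whose j-th diagonal entry is Var(β̂j). The sketch below treats σ² = 1 as known for simplicity and uses simulated, illustrative data.

```python
import numpy as np

# Illustrative check: Var(β̂1) = σ²/(SST1·(1 − R1²)) matches the corresponding
# diagonal entry of σ²(X'X)⁻¹. σ² = 1 is assumed known here for simplicity.
rng = np.random.default_rng(5)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)   # correlated regressors, so R1² > 0
sigma2 = 1.0
X = np.column_stack([np.ones(n), x1, x2])

# SST1 and R1² from regressing x1 on the other regressors (constant and x2).
SST1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
x1hat = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
R1sq = np.sum((x1hat - x1.mean()) ** 2) / SST1

var_formula = sigma2 / (SST1 * (1 - R1sq))
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
print(np.isclose(var_formula, var_matrix))  # True
```

Note how multicollinearity enters: as R1² approaches 1, the denominator shrinks and Var(β̂1) blows up.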
Misspecified Models