
UNIT 3: MULTIPLE LINEAR REGRESSION

3.1 INTRODUCTION
We have studied the two-variable model extensively in the previous unit. In economics, however, you rarely find that a variable is affected by only one explanatory variable. For example, the demand for a commodity depends on the price of the commodity itself, the prices of competing or complementary goods, the income of consumers, the number of consumers in the market, and so on. Hence, the two-variable model is often inadequate in practical work, and we therefore need to discuss multiple regression models. Multiple linear regression is concerned with the relationship between a dependent variable (Y) and two or more explanatory variables (X1, X2, …, Xn).

3.2 SPECIFICATION OF THE MODEL


Let us start our discussion with the simplest multiple regression model, i.e., a model with two explanatory variables:

Y = f(X1, X2)

Example: The demand for a commodity may be influenced not only by the price of the commodity but also by the consumer's income.

Since economic theory does not specify the mathematical form of the demand function, we assume that the relationship between Y, X1 and X2 is linear. Hence we may write the three-variable Population Regression Function (PRF) as follows:

Yi = β0 + β1X1i + β2X2i + Ui

where Y is the quantity demanded,
X1 and X2 are the price and income respectively,
β0 is the intercept term,
β1 is the coefficient of X1 and its expected sign is negative (remember the law of demand),
β2 is the coefficient of X2 and its expected sign is positive, assuming that the good is a normal good.

The coefficients β1 and β2 are called the partial regression coefficients. We will discuss the meaning of these coefficients later.
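
For concreteness, suppose (with purely illustrative numbers that are not part of the original text) that an estimated demand function were Ŷi = 100 − 2X1i + 0.5X2i. Then, holding income constant, a one-unit rise in price lowers expected demand by 2 units, while, holding price constant, a one-unit rise in income raises expected demand by 0.5 units. This is the sense in which β1 and β2 are "partial" coefficients.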

3.3 ASSUMPTIONS
To complete the specification of our simple model we need some assumptions about the random variable U. These assumptions are the same as those already explained for the two-variable model in unit 2.

Assumptions of the model

1. Zero mean value of Ui

The random variable U has a zero mean value for each Xi

E(Ui/X1i, X2i) = 0 for each i.

2. Homoscedasticity

The variance of each Ui is the same for all the Xi values:
Var(Ui) = E(Ui²) = σu²

3. Normality

The values of each Ui are normally distributed:
Ui ~ N(0, σu²)

4. No serial correlation (serial independence of the U’s)

The values of Ui (corresponding to Xi) are independent from the values of any other Uj (corresponding to Xj):
Cov(Ui, Uj) = 0 for i ≠ j

5. Independence of Ui and Xi

Every disturbance term Ui is independent of the explanatory variables. That is, there is zero covariance between Ui and each of the X variables:
Cov(Ui, X1i) = Cov(Ui, X2i) = 0
or E(Ui X1i) = E(Ui X2i) = 0
Here the values of the X’s are a set of fixed numbers in all hypothetical samples (refer to the assumptions of OLS in unit 2).


6. No collinearity between the X variables (no multicollinearity)

The explanatory variables are not perfectly linearly correlated; there is no exact linear relationship between X1 and X2.

7. Correct specification of the model

The model has no specification error in that all the important explanatory variables appear explicitly in the function and the mathematical form is correctly defined.

The rationale for the above assumptions is the same as in unit 2.
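
As an informal illustration (not part of the original text), the following Python sketch simulates a sample that satisfies these assumptions: fixed X values, disturbances drawn independently from a normal distribution with zero mean and constant variance, independence of U from the X's, no exact linear relationship between X1 and X2, and a correctly specified linear model. All numerical values (β0 = 100, β1 = −2, β2 = 0.5, σu = 5, n = 50) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 50                                   # sample size (hypothetical)
beta0, beta1, beta2 = 100.0, -2.0, 0.5   # hypothetical "true" parameters
sigma_u = 5.0                            # hypothetical std. deviation of U

# Fixed explanatory variables: price (X1) and income (X2); drawn separately,
# so there is no exact linear relationship between them (assumption 6).
X1 = rng.uniform(10, 30, size=n)
X2 = rng.uniform(50, 150, size=n)

# Disturbances: zero mean, constant variance, normally distributed,
# serially independent, and generated independently of X1 and X2
# (assumptions 1-5).
U = rng.normal(0.0, sigma_u, size=n)

# Correctly specified linear PRF (assumption 7).
Y = beta0 + beta1 * X1 + beta2 * X2 + U
```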

3.4 ESTIMATION
We have specified our model in the previous subsection and stated the required assumptions in subsection 3.3. Now suppose we have n sample observations on Y, X1 and X2 and wish to obtain estimates of the true parameters β0, β1 and β2:

Yi     X1i     X2i
Y1     X11     X21
Y2     X12     X22
Y3     X13     X23
⋮      ⋮       ⋮
Yn     X1n     X2n

The sample regression function (SRF) can be written as

Yi = β̂0 + β̂1X1i + β̂2X2i + Ûi

where β̂0, β̂1 and β̂2 are estimates of the true parameters β0, β1 and β2, and Ûi is the residual term. But since Ui is unobservable, the estimated regression line is

Ŷi = β̂0 + β̂1X1i + β̂2X2i

As discussed in unit 2, the estimates are obtained by choosing the values of the unknown parameters that minimize the sum of squared residuals (OLS requires that ΣÛi² be as small as possible). Symbolically,

Min ΣÛi² = Σ(Yi − Ŷi)² = Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²

A necessary condition for a minimum is that the partial derivatives of the above expression with respect to the unknowns (i.e. β̂0, β̂1 and β̂2) are set to zero:

∂[Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²] / ∂β̂0 = 0

∂[Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²] / ∂β̂1 = 0

∂[Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²] / ∂β̂2 = 0

After differentiating, we get the following normal equations:

ΣYi = nβ̂0 + β̂1ΣX1i + β̂2ΣX2i

ΣX1iYi = β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i

ΣX2iYi = β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i²

After solving the above normal equations we can obtain values for β̂0, β̂1 and β̂2:

β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2

β̂1 = [(Σx1iyi)(Σx2i²) − (Σx2iyi)(Σx1ix2i)] / [(Σx1i²)(Σx2i²) − (Σx1ix2i)²]

β̂2 = [(Σx2iyi)(Σx1i²) − (Σx1iyi)(Σx1ix2i)] / [(Σx1i²)(Σx2i²) − (Σx1ix2i)²]

where the variables x and y are in deviation form, i.e.

yi = Yi − Ȳ,  x1i = X1i − X̄1,  x2i = X2i − X̄2

Note: The values for the parameter estimates (β̂0, β̂1 and β̂2) can also be obtained by using other methods (e.g. Cramer’s rule).

The variance of the disturbance term is estimated by

σ̂u² = ΣÛi² / (n − k)

k being the total number of parameters that are estimated (in the above three-variable model, k = 3). As before, x1 and x2 are in deviation form.
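
To make the estimation formulas concrete, here is a minimal Python sketch (not part of the original text) that computes β̂0, β̂1, β̂2 and σ̂u² using the deviation-form expressions above. It assumes numpy arrays Y, X1 and X2 such as the hypothetical simulated data from section 3.3.

```python
import numpy as np

def ols_three_variable(Y, X1, X2):
    """OLS estimates for Y = b0 + b1*X1 + b2*X2 + U via the deviation-form formulas."""
    n = len(Y)
    y  = Y  - Y.mean()            # deviations from the sample means
    x1 = X1 - X1.mean()
    x2 = X2 - X2.mean()

    denom = np.sum(x1**2) * np.sum(x2**2) - np.sum(x1 * x2)**2
    b1 = (np.sum(x1 * y) * np.sum(x2**2) - np.sum(x2 * y) * np.sum(x1 * x2)) / denom
    b2 = (np.sum(x2 * y) * np.sum(x1**2) - np.sum(x1 * y) * np.sum(x1 * x2)) / denom
    b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()

    resid = Y - (b0 + b1 * X1 + b2 * X2)      # residuals U_hat
    sigma2_u = np.sum(resid**2) / (n - 3)     # k = 3 parameters estimated
    return b0, b1, b2, sigma2_u, resid

# Example (using the simulated data from the sketch in section 3.3):
# b0, b1, b2, s2, resid = ols_three_variable(Y, X1, X2)

# Equivalently, the normal equations (X'X) b = X'Y can be solved directly:
# X = np.column_stack([np.ones_like(X1), X1, X2])
# b0, b1, b2 = np.linalg.solve(X.T @ X, X.T @ Y)
```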

3.5 THE MULTIPLE COEFFICIENT OF DETERMINATION


In unit 2 we saw that the coefficient of determination (r²) measures the goodness of fit of the regression equation. This notion of r² can easily be extended to regression models containing more than two variables.

In the three-variable model we would like to know the proportion of the variation in Y explained by the variables X1 and X2 jointly. The quantity that gives this information is known as the multiple coefficient of determination. It is denoted by R², with the variables whose relationship is being studied shown as subscripts.

Example: R²y.X1X2 shows the percentage of the total variation of Y explained by the regression plane, that is, by changes in X1 and X2:

R²y.X1X2 = Σŷi² / Σyi² = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)²

= 1 − ΣÛi² / Σyi² = 1 − RSS/TSS

where: RSS – residual sum of squares

TSS – total sum of squares

Recall that the value of R² lies between 0 and 1. The higher the R², the greater the percentage of the variation of Y explained by the regression plane, that is, the better the goodness of fit of the regression plane to the sample observations. The closer R² is to zero, the worse the fit.
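
Continuing the hypothetical Python sketch, R² can be computed directly from the residuals and the total variation of Y; the function below assumes the resid array returned by the estimation sketch in section 3.4.

```python
import numpy as np

def r_squared(Y, resid):
    """R^2 = 1 - RSS/TSS, with RSS = sum of squared residuals, TSS = sum((Y - Ybar)^2)."""
    rss = np.sum(resid**2)
    tss = np.sum((Y - Y.mean())**2)
    return 1.0 - rss / tss
```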

The Adjusted R2
Note that as the number of regressors (explanatory variables) increases, the coefficient of multiple determination will usually increase. To see this, recall the definition of R²:

R² = 1 − ΣÛi² / Σyi²

Now yi2 is independent of the number of X variables in the model because it is simply (yi -

Y )2. The residual sum of squares (RSS), Ui2, however depends on the number of explanatory

variables present in the model. It is clear that as the number of X variables increases, Ui2 is

bound to decrease (at least it will not increase), hence R2 will increase. Therefore, in comparing

two regression models with the same dependent variable but differing number of X variables,

one should be very wary of choosing the model with the highest R2. An explanatory variable

which is not statistically significant may be retained in the model if one looks at R2 only.

Therefore, to correct for this defect we adjust R2 by taking into account the degrees of freedom,

which clearly decrease as new repressors are introduced in the function

U n  k 
2
2 i
R =1–
 y n  1
2
i

2
or R = 1 – (1 – R2)
n 1
nk
where k = the number of parameters in the model (including the intercept term)

n = the number of sample observations

R2 = is the unadjusted multiple coefficient of determination

As the number of explanatory variables increases, the adjusted R² becomes increasingly smaller than the unadjusted R². The adjusted R² (R̄²) can be negative, although R² is necessarily non-negative; in that case its value is taken as zero.

If n is large, R̄² and R² will not differ much. But with small samples, if the number of regressors (X’s) is large in relation to the number of sample observations, R̄² will be much smaller than R².
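
A corresponding sketch for the adjusted R², using the second formula above (k counts all estimated parameters, including the intercept); the numbers in the comment are purely illustrative.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

# Example: R^2 = 0.90 with n = 20 observations and k = 3 parameters gives
# adjusted R^2 = 1 - 0.10 * 19/17 ≈ 0.888, slightly below the unadjusted value.
```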

3.6 TEST OF SIGNIFICANCE IN MULTIPLE REGRESSIONS


The principle involved in testing multiple regressions is identical with that of simple regression.

3.6.1 Hypothesis Testing about Individual Partial Regression Coefficients


We can test whether a particular variable, X1 or X2, is significant or not, holding the other variable constant. The t test is used to test a hypothesis about any individual partial regression coefficient. The partial regression coefficient measures the change in the mean value of Y, E(Y/X1, X2), per unit change in X1, holding X2 constant.

t = β̂i / S(β̂i) ~ t(n − k)   (i = 0, 1, 2, …, k)

This is the observed (or sample) value of the t ratio, which we compare with the theoretical value of t obtainable from the t-table with n − k degrees of freedom. The theoretical values of t (at the chosen level of significance) are the critical values that define the critical region in a two-tail test, with n − k degrees of freedom.

Now let us postulate that

H0: βi = 0
H1: βi ≠ 0, or one-sided (βi > 0 or βi < 0)

The null hypothesis states that, holding X2 constant, X1 has no (linear) influence on Y.

If the computed t value exceeds the critical t value at the chosen level of significance, we may reject the null hypothesis; otherwise, we may accept it (β̂1 is not significant at the chosen level of significance and hence the corresponding regressor does not appear to contribute to the explanation of the variations in Y).

Look at the following figure. Assume α = 0.05; the critical value is t = 2.179 for 12 df.

[Fig 3.6.1: 95% confidence interval for t. The acceptance region covers the central 95% of the t distribution; the critical (rejection) regions of 2.5% each lie in the two tails.]



Note that the greater the calculated value of t, the stronger is the evidence that βi is significant. For degrees of freedom higher than 8, the critical value of t (at the 5% level of significance) for the rejection of the null hypothesis is approximately 2.
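
As an illustrative sketch (not from the original text), the t test for an individual coefficient can be carried out as follows. The standard error se_bi is assumed to be available; its formula is not derived in this unit.

```python
from scipy import stats

def t_test(b_i, se_bi, n, k, alpha=0.05):
    """Two-tailed t test of H0: beta_i = 0 against H1: beta_i != 0."""
    t_calc = b_i / se_bi                             # observed t ratio
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)    # critical value with n - k df
    return t_calc, t_crit, abs(t_calc) > t_crit      # True => reject H0

# With n - k = 12 df and alpha = 0.05, t_crit ≈ 2.179, matching the figure above.
```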

3.6.2 Testing the Overall Significance of a Regression


This test aims at finding out whether the explanatory variables (X1, X2, …, Xk) actually have any significant influence on the dependent variable. The test of the overall significance of the regression implies testing the null hypothesis

H0: β1 = β2 = … = βk = 0

against the alternative hypothesis

H1: not all βi’s are zero.

If the null hypothesis is true, then there is no linear relationship between Y and the regressors. The above joint hypothesis can be tested by the analysis of variance (AOV) technique. The following table summarizes the idea.

Source of variation        Sum of squares (SS)   Degrees of freedom (df)   Mean square (MSS)
Due to regression (ESS)    Σŷi²                  k − 1                     Σŷi² / (k − 1)
Due to residual (RSS)      ΣÛi²                  n − k                     ΣÛi² / (n − k)
Total (total variation)    Σyi²                  n − 1

Therefore, to undertake the test, first find the calculated value of F and compare it with the tabulated F value. The calculated value of F can be obtained by using the following formula:

F = [Σŷi² / (k − 1)] / [ΣÛi² / (n − k)] = [ESS / (k − 1)] / [RSS / (n − k)]

which follows the F distribution with k − 1 and n − k df, where

k − 1 refers to the degrees of freedom of the numerator
n − k refers to the degrees of freedom of the denominator
k is the number of parameters estimated

Decision Rule: If Fcalculated > Ftabulated (Fα(k − 1, n − k)), reject H0; otherwise you may accept it, where Fα(k − 1, n − k) is the critical F value at the α level of significance with (k − 1) numerator df and (n − k) denominator df.

Note that there is a relationship between the coefficient of determination R² and the F test used in the analysis of variance.

F = [R² / (k − 1)] / [(1 − R²) / (n − k)]

[Figure: density f(F) of the F distribution, with the 5% rejection area in the upper tail.]

When R² = 0, F is zero. The larger the R², the greater the F value. In the limit, when R² = 1, F is infinite. Thus the F test, which is a measure of the overall significance of the estimated regression, is also a test of significance of R²: testing the null hypothesis H0: β1 = β2 = … = βk = 0 is equivalent to testing the null hypothesis that (the population) R² is zero. The F test expressed in terms of R² is easy to compute.
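
Finally, a hedged Python sketch of the overall F test, computed from R² as in the last formula; the numbers in the comment are illustrative only.

```python
from scipy import stats

def f_test(r2, n, k, alpha=0.05):
    """Overall significance test: H0 states that all slope coefficients are zero."""
    f_calc = (r2 / (k - 1)) / ((1 - r2) / (n - k))    # F computed from R^2
    f_crit = stats.f.ppf(1 - alpha, dfn=k - 1, dfd=n - k)
    return f_calc, f_crit, f_calc > f_crit            # True => reject H0

# Example: R^2 = 0.90, n = 20, k = 3 gives F = (0.90/2) / (0.10/17) = 76.5,
# far above the 5% critical value F(2, 17) ≈ 3.59, so H0 would be rejected.
```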
