
Chapter 2

Multiple Regression
(Lec 5-7)



Outline
1. Multiple Regression Equation
2. The Three-Variable Model: Notation and
Assumptions
3. OLS Estimation for the three-variable
model
4. Properties of OLS estimators
5. Goodness of fit: R2 and adjusted R2
6. Partial correlation coefficients
7. More on Functional Form
8. Hypothesis Testing in Multiple Regression
LEC 9

1. Multiple regression equation


$Y_i = \beta_1 + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$

• Y = one dependent variable (criterion)
• X = two or more independent variables (predictor variables)
• $u_i$ = the stochastic disturbance term
• Sample size: at least 50 (at least 10 times as many cases as independent variables)
• $\beta_1$ is the intercept
• $\beta_k$ measures the change in Y with respect to $X_k$, holding other factors fixed.
Example- Multiple regression equation
• Problem: A labor economist would like to examine the effects of job training on worker productivity. In this case, there is little need for formal economic theory. Basic economic understanding tells us that factors such as education and experience affect worker productivity. Also, economists are well aware that workers are paid commensurate with their productivity.


Example- Multiple regression equation
• Model: wage = f(educ, exper)
Where:
wage = hourly wage
educ: years of formal education
exper: years of workforce experience

$wage = \beta_1 + \beta_2\,educ + \beta_3\,exper + u$


2. The Three-Variable Model: Notation and Assumptions
Assumptions for the model $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$:
1. Linear regression model, or linear in the parameters.
2. Zero mean value of disturbance $u_i$: $E(u_i \mid X_{2i}, X_{3i}) = 0$
3. No serial correlation between the disturbances: $Cov(u_i, u_j) = 0,\ i \neq j$
4. Homoscedasticity or constant variance of $u_i$: $Var(u_i) = \sigma^2$
5. Zero covariance between $u_i$ and each X variable: $Cov(u_i, X_{2i}) = Cov(u_i, X_{3i}) = 0$
6. No specification bias, i.e., the model is correctly specified.
7. No exact collinearity between the X variables.


3. OLS Estimation for the three-variable model

• To find the OLS estimators, first write the sample regression function (SRF) as:

$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + \hat u_i$   (7.4.1)

• The OLS estimates make the residual sum of squares (RSS), $\sum \hat u_i^2$, as small as possible:

$\sum \hat u_i^2 = \sum (Y_i - \hat Y_i)^2 = \sum \left(Y_i - \hat\beta_1 - \hat\beta_2 X_{2i} - \hat\beta_3 X_{3i}\right)^2 \rightarrow \min$


3. OLS Estimation for the three-variable model

• Setting the partial derivatives of the RSS with respect to $\hat\beta_1, \hat\beta_2, \hat\beta_3$ to zero gives the normal equations:

$\sum Y_i = n\hat\beta_1 + \hat\beta_2 \sum X_{2i} + \hat\beta_3 \sum X_{3i}$

$\sum X_{2i} Y_i = \hat\beta_1 \sum X_{2i} + \hat\beta_2 \sum X_{2i}^2 + \hat\beta_3 \sum X_{2i} X_{3i}$

$\sum X_{3i} Y_i = \hat\beta_1 \sum X_{3i} + \hat\beta_2 \sum X_{2i} X_{3i} + \hat\beta_3 \sum X_{3i}^2$
3. OLS Estimation for the three-variable model

• If we denote the deviations from sample means as:

$y_i = Y_i - \bar Y,\qquad x_{2i} = X_{2i} - \bar X_2,\qquad x_{3i} = X_{3i} - \bar X_3$

then

$\sum x_{2i}^2 = \sum X_{2i}^2 - n\bar X_2^2,\qquad \sum x_{3i}^2 = \sum X_{3i}^2 - n\bar X_3^2$

$\sum y_i^2 = \sum Y_i^2 - n\bar Y^2,\qquad \sum x_{2i} x_{3i} = \sum X_{2i} X_{3i} - n\bar X_2 \bar X_3$

$\sum y_i x_{2i} = \sum Y_i X_{2i} - n\bar Y \bar X_2,\qquad \sum y_i x_{3i} = \sum Y_i X_{3i} - n\bar Y \bar X_3$
3. OLS Estimation for the three-variable model

• Solving the normal equations, we obtain:

$\hat\beta_1 = \bar Y - \hat\beta_2 \bar X_2 - \hat\beta_3 \bar X_3$

$\hat\beta_2 = \dfrac{\left(\sum y_i x_{2i}\right)\left(\sum x_{3i}^2\right) - \left(\sum y_i x_{3i}\right)\left(\sum x_{2i} x_{3i}\right)}{\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2}$

$\hat\beta_3 = \dfrac{\left(\sum y_i x_{3i}\right)\left(\sum x_{2i}^2\right) - \left(\sum y_i x_{2i}\right)\left(\sum x_{2i} x_{3i}\right)}{\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2}$


3. OLS Estimation for the three-variable model

• Example: We have a following data


Y X2 X3
20 8 3
19 7 4
18 6 5
17 5 5
16 5 6
15 5 6
15 4 7
14 4 7
14 3 8
12 3 9
3. OLS Estimation for the three-variable model

• We obtain:

$\sum Y_i = 160,\qquad \sum Y_i^2 = 2616$
$\sum X_{2i} = 50,\qquad \sum X_{2i}^2 = 274$
$\sum X_{3i} = 60,\qquad \sum X_{3i}^2 = 390$
$\bar Y = 16,\qquad \sum Y_i X_{2i} = 835$
$\bar X_2 = 5,\qquad \sum Y_i X_{3i} = 920$
$\bar X_3 = 6,\qquad \sum X_{2i} X_{3i} = 274$
3. OLS Estimation for the three-variable model

• and, in deviation form:

$\sum y_i^2 = \sum Y_i^2 - n\bar Y^2 = 56$
$\sum x_{2i}^2 = \sum X_{2i}^2 - n\bar X_2^2 = 24$
$\sum x_{3i}^2 = \sum X_{3i}^2 - n\bar X_3^2 = 30$
$\sum y_i x_{2i} = \sum Y_i X_{2i} - n\bar Y \bar X_2 = 35$
$\sum y_i x_{3i} = \sum Y_i X_{3i} - n\bar Y \bar X_3 = -40$
$\sum x_{2i} x_{3i} = \sum X_{2i} X_{3i} - n\bar X_2 \bar X_3 = -26$
3. OLS Estimation for the three-variable model

• and finally:

$\hat\beta_2 = 0.2272,\qquad \hat\beta_3 = -1.1363,\qquad \hat\beta_1 = 21.6818$

$\hat Y_i = 21.6818 + 0.2272\,X_{2i} - 1.1363\,X_{3i}$
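The same estimates can be reproduced in Stata. A minimal sketch, assuming the variables are named Y, X2, X3 as in the table above:

* Sketch: reproduce the three-variable OLS example
clear
input Y X2 X3
20 8 3
19 7 4
18 6 5
17 5 5
16 5 6
15 5 6
15 4 7
14 4 7
14 3 8
12 3 9
end
* Coefficients should match: intercept 21.6818, X2 0.2272, X3 -1.1363
regress Y X2 X3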


3. OLS Estimation for the three-variable model

• Variances and Standard Errors of OLS Estimators

$Var(\hat\beta_1) = \left[\dfrac{1}{n} + \dfrac{\bar X_2^2 \sum x_{3i}^2 + \bar X_3^2 \sum x_{2i}^2 - 2\bar X_2 \bar X_3 \sum x_{2i} x_{3i}}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2}\right] \sigma^2$

$Var(\hat\beta_2) = \dfrac{\sum x_{3i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2}\,\sigma^2$

$Var(\hat\beta_3) = \dfrac{\sum x_{2i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2}\,\sigma^2$


3. OLS Estimation for the three-variable model

• or, equivalently,

$Var(\hat\beta_2) = \dfrac{\sigma^2}{\sum x_{2i}^2 (1 - r_{23}^2)},\qquad se(\hat\beta_2) = \sqrt{Var(\hat\beta_2)}$

$Var(\hat\beta_3) = \dfrac{\sigma^2}{\sum x_{3i}^2 (1 - r_{23}^2)},\qquad se(\hat\beta_3) = \sqrt{Var(\hat\beta_3)}$

where $r_{23}$ is the sample coefficient of correlation between X2 and X3.

• In all these formulas, $\sigma^2$ is the variance of the population disturbances $u_i$, estimated by

$\hat\sigma^2 = \dfrac{\sum \hat u_i^2}{n-3}$   (7.4.19)
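In Stata these quantities can be read off the stored results after regress. A sketch, assuming the example regression above is still in memory:

* After: regress Y X2 X3
* Root MSE squared is the unbiased estimate sigma^2 = RSS/(n-3)
display "sigma^2 hat = " e(rmse)^2
display "RSS = " e(rss) ", residual df = " e(df_r)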
Example- Stata output
• Model: wage = f(educ,exper )
. reg wage educ exper

Source SS df MS Number of obs = 526


F( 2, 523) = 75.99
Model 1612.2545 2 806.127251 Prob > F = 0.0000
Residual 5548.15979 523 10.6083361 R-squared = 0.2252
Adj R-squared = 0.2222
Total 7160.41429 525 13.6388844 Root MSE = 3.257

wage Coef. Std. Err. t P>|t| [95% Conf. Interval]

educ .6442721 .0538061 11.97 0.000 .5385695 .7499747


exper .0700954 .0109776 6.39 0.000 .0485297 .0916611
_cons -3.390539 .7665661 -4.42 0.000 -4.896466 -1.884613
4. Properties of OLS estimators
• The sample regression line (surface) passes through the means $(\bar Y, \bar X_2, \ldots, \bar X_k)$.
• The mean value of the estimated $\hat Y_i$ is equal to the mean value of the actual $Y_i$: $\bar{\hat Y} = \bar Y$
• The sum of the residuals is equal to 0: $\sum_{i=1}^{n} \hat u_i = 0$
• The residuals are uncorrelated with each $X_{ki}$: $\sum_{i=1}^{n} X_{ki}\hat u_i = 0$
• The residuals are uncorrelated with $\hat Y_i$: $\sum_{i=1}^{n} \hat Y_i \hat u_i = 0$
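These properties are easy to verify numerically. A sketch, assuming a regression such as the earlier example has just been run:

* After any regression, e.g.: regress Y X2 X3
predict yhat, xb          // fitted values
predict uhat, residuals   // residuals
summarize uhat            // mean is zero up to rounding
correlate uhat X2 X3 yhat // correlations with regressors and fitted values ~ 0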


4. Properties of OLS estimators
Gauss–Markov Theorem: $\hat\beta_1, \hat\beta_2, \ldots, \hat\beta_k$ are the best linear unbiased estimators (BLUEs) of $\beta_1, \beta_2, \ldots, \beta_k$.
• An estimator $\tilde\beta_j$ is an unbiased estimator of $\beta_j$ if $E(\tilde\beta_j) = \beta_j$.
• An estimator $\tilde\beta_j$ of $\beta_j$ is linear if and only if it can be expressed as a linear function of the data on the dependent variable: $\tilde\beta_j = \sum_{i=1}^{n} w_{ij} y_i$
• "Best" is defined as smallest variance.

4. Properties of OLS estimators


Standard errors of the OLS estimators
• Since $\sigma^2 = E(u_i^2)$, an unbiased estimator of $\sigma^2$ would be $\sum_{i=1}^{n} u_i^2 / n$.
→ This is not a feasible estimator, because we cannot observe the $u_i$.
• The unbiased estimator of $\sigma^2$ based on the residuals is:

$\hat\sigma^2 = \dfrac{RSS}{n-k} = \dfrac{\sum \hat u_i^2}{n-k}$

• $RSS/\sigma^2$ follows a $\chi^2$ distribution with df = number of observations − number of estimated parameters = n − k.
• The positive square root $\hat\sigma$ is called the standard error of the regression (SER) (or Root MSE). SER is an estimator of the standard deviation of the error term.
4. Properties of OLS estimators
2
Var ( ˆ j ) 
SST j (1  R ) 2
j

• Where SST j   ( xij  x j ) 2


is total sample
i 1
variation in xj and R 2
j is the R-squared from
regressing xj on all other independent
variables (and including an intercept).
• Since  is unknown, we replace it with its
estimator ̂ . Standard error:
ˆ
se( j )   /[ SST j (1  R j )]
ˆ 2 1/ 2
Nguyen Thu Hang, BMNV, FTU CS2 21
5. Goodness-of-fit or coefficient of determination R2
• The total sum of squares (TSS):

$TSS = \sum y_i^2 = \sum (Y_i - \bar Y)^2 = \sum Y_i^2 - n\bar Y^2$

• The explained sum of squares (ESS):

$ESS = \sum \hat y_i^2 = \sum (\hat Y_i - \bar Y)^2 = \hat\beta_2 \sum y_i x_{2i} + \hat\beta_3 \sum y_i x_{3i}$

• The residual sum of squares (RSS):

$RSS = \sum (Y_i - \hat Y_i)^2 = \sum \hat u_i^2 = TSS - ESS$

• Goodness of fit, the coefficient of determination R2:

$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS}$

→ The fraction of the sample variation in Y that is explained by X2 and X3.
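For the numerical example above, ESS = 0.2272(35) + (−1.1363)(−40) ≈ 53.41 and TSS = 56, so R2 ≈ 0.954. A sketch of the arithmetic:

* R-squared for the worked example, from the deviation sums
display "ESS = " 0.2272*35 + (-1.1363)*(-40)
display "R2  = " (0.2272*35 + (-1.1363)*(-40))/56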
Example- Goodness of fit
• Determinants of college GPA:
- The variables in GPA1.dta include the college
grade point average (colGPA), high school GPA
(hsGPA) and achievement test score (ACT) for
a sample of 141 students from a large
university



Example- Goodness of fit
• Determinants of college GPA:
- "D:\Bai giang\Kinh te luong\datasets\GPA1.DTA", clear
. use

. reg colGPA hsGPA ACT

Source SS df MS Number of obs = 141


F( 2, 138) = 14.78
Model 3.42365506 2 1.71182753 Prob > F = 0.0000
Residual 15.9824444 138 .115814814 R-squared = 0.1764
Adj R-squared = 0.1645
Total 19.4060994 140 .138614996 Root MSE = .34032

colGPA Coef. Std. Err. t P>|t| [95% Conf. Interval]

hsGPA .4534559 .0958129 4.73 0.000 .2640047 .6429071


ACT .009426 .0107772 0.87 0.383 -.0118838 .0307358
_cons 1.286328 .3408221 3.77 0.000 .612419 1.960237
Output interpretation
• hsGPA and ACT together explain about 17.6%
of the variation in college GPA for this sample
of students.
• There are many other factors, including family background, personality, quality of high school education, and affinity for college, that contribute to a student's college performance.


5. Goodness-of-fit or coefficient of determination R2

• Note that R2 lies between 0 and 1.


o If it is 1, the fitted regression line explains 100 percent of
the variation in Y
o If it is 0, the model does not explain any of the variation
in Y.
• The fit of the model is said to be "better" the closer R2 is to 1.
• As the number of regressors increases, R2 almost invariably increases and never decreases.


R2 and the adjusted R2
• An alternative coefficient of determination:

$\bar R^2 = 1 - \dfrac{RSS/(n-k)}{TSS/(n-1)}$

$\bar R^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k}$

where k = the number of parameters in the model including the intercept term.
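As a check against the GPA regression output above (R2 = 0.1764, n = 141, k = 3), a one-line sketch:

* Adjusted R2 from R2, n and k; should print ~0.1645 as in the output
display "adj R2 = " 1 - (1 - 0.1764)*(141 - 1)/(141 - 3)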


R2 and the adjusted R2
• It is good practice to use adjusted R2 rather than R2, because R2 tends to give an overly optimistic picture of the fit of the regression, particularly when the number of explanatory variables is not very small compared with the number of observations.


The game of maximizing adjusted R2
• Sometimes researchers play the game of maximizing adjusted R2, that is, choosing the model that gives the highest adjusted R2. → This may be dangerous.
• In regression analysis, our objective is not to obtain a high adjusted R2 per se, but rather to obtain dependable estimates of the true population regression coefficients and draw statistical inferences about them.
• Researchers should be more concerned about the logical or theoretical relevance of the explanatory variables to the dependent variable and their statistical significance.


Comparing Coefficients of Determination R2

• It is crucial to note that in comparing two models on the basis of the coefficient of determination, whether adjusted or not:
• the sample size n must be the same;
• the dependent variable must be the same;
• the explanatory variables may take any form.
Thus, for the models
lnYi = β1 + β2X2i + β3X3i + ui (7.8.6)
Yi = α1 + α2X2i + α3X3i + ui (7.8.7)
the computed R2 terms cannot be compared.
6. Partial correlation coefficients
• Example: we have a regression model with three variables:
Y, X2 and X3.
• The coefficient of correlation r measures the degree of linear association between two variables: r12 (correlation coefficient between Y and X2), r13 (correlation coefficient between Y and X3), and r23 (correlation coefficient between X2 and X3). These are called simple (gross) correlation coefficients, or correlation coefficients of zero order.
• Does r12 in fact measure the "true" degree of (linear) association between Y and X2 when X3 may be associated with both of them?
→ We need a correlation coefficient that is independent of the influence of X3 on X2 and Y: the partial correlation coefficient.


6. Partial correlation coefficients
• r12,3 =partial correlation coefficient between Y and X2,
holding X3 constant.
• r13,2 =partial correlation coefficient between Y and X3,
holding X2 constant.
• r23,1 =partial correlation coefficient between X2 and X3,
holding Y constant.
→ These are called first-order correlation coefficients (the order = the number of secondary subscripts).

$r_{12,3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}\qquad r_{13,2} = \dfrac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}$

$r_{23,1} = \dfrac{r_{23} - r_{12} r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}}$
Example- Partial correlation coefficients
• Y= crop yield, X2= rainfall, X3= temperature.
Assume r12=0, there is no association between
crop yield and rainfall. Assume r13 is positive,
r23 is negative  r12,3 will be positive 
Holding temperature constant, there is a
positive association between yield and rainfall.
Since temperature X3 affects both yield Y and
rainfall, we need to remove the influence of
the nuisance variable temperature.
• In Stata: pcorr Y X2 X3

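The same logic can be checked by hand from the first-order formula. A sketch with made-up zero-order correlations (r12 = 0, r13 = 0.6, r23 = -0.5 are illustrative values only, chosen to match the signs in the story above):

* First-order partial correlation from zero-order correlations
scalar r12 = 0
scalar r13 = 0.6
scalar r23 = -0.5
scalar r12_3 = (r12 - r13*r23)/sqrt((1 - r13^2)*(1 - r23^2))
display "r12,3 = " r12_3   // positive: yield and rainfall associate once temperature is held fixed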


LEC 11
7. More on Functional Form
The Cobb–Douglas Production Function

• The Cobb–Douglas production function, in its stochastic form, may be expressed as:

$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$   (7.9.1)
where Y = output
X2 = labor input
X3 = capital input
u = stochastic disturbance term
e = base of natural logarithm
• if we log-transform this model, we obtain:
ln Yi = ln β1 + β2 lnX2i + β3lnX3i + ui
= β0 + β2lnX2i + β3lnX3i + ui (7.9.2)
where β0 = ln β1.
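Estimation is then ordinary OLS on the logged variables. A minimal sketch, assuming variables named output, labor, and capital are in memory (the names are illustrative):

* Cobb-Douglas by OLS on logs
gen lnY  = ln(output)
gen lnX2 = ln(labor)
gen lnX3 = ln(capital)
regress lnY lnX2 lnX3   // slope coefficients are the output elasticities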
EXAMPLE 7.3 ValueAdded, Labor Hours, and Capital Input in
the Manufacturing Sector

[Data table shown as an image in the original slides.]


Regression

[Estimated regression output shown as an image in the original slides.]


7. More on Functional Form
The Cobb–Douglas Production Function

[Estimated regression (7.9.4) shown as an image in the original slides.]

• The output elasticities of labor and capital were 0.4683 and 0.5213, respectively.
• Holding the capital input constant, a 1 percent increase in the labor input led on average to about a 0.47 percent increase in output.
• Similarly, holding the labor input constant, a 1 percent increase in the capital input led on average to about a 0.52 percent increase in output.
7. More on Functional Form
Polynomial Regression Models

Figure 7.1 [shown as an image in the original slides]: the U-shaped marginal cost curve shows that the relationship between MC and output is nonlinear.


7. More on Functional Form
Polynomial Regression Models
• Geometrically, the MC curve depicted in Figure 7.1 represents a parabola. Mathematically, the parabola is represented by the following equation:

$Y = \beta_0 + \beta_1 X + \beta_2 X^2$   (7.10.1)

which is called a quadratic function.
• The general kth-degree polynomial regression may be written as

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_k X_i^k + u_i$   (7.10.3)


7. More on Functional Form
Polynomial Regression Models
EXAMPLE 7.4 Estimating the Total Cost Function

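The example's data and estimates appear as images in the original slides. In Stata, such a polynomial model is fit by generating the powers of the regressor first. A sketch, assuming variables named cost and output (illustrative names):

* A cubic total cost function, in the spirit of Example 7.4
gen output2 = output^2
gen output3 = output^3
regress cost output output2 output3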


8. Hypothesis Testing in Multiple Regression

Hypothesis testing assumes several interesting forms, such as the following:
8.1. Testing hypotheses about an individual partial regression
coefficient
8.2. Testing the overall significance of the estimated multiple regression
model, that is, finding out if all the partial slope coefficients are
simultaneously equal to zero.
8.3. Testing that two or more coefficients are equal to one another
8.4. Testing that the partial regression coefficients satisfy certain
restrictions
8.5. Testing the stability of the estimated regression model over time or
in different cross-sectional units
8.6. Testing the functional form of regression models
8.1. Hypothesis testing about individual regression
coefficients: the null hypothesis in most applications
• A hypothesis about any individual partial regression
coefficient.
H0: j = 0
H1: j  0
•  Xj has no effect on the expected value of Y.
• If the computed t value > critical t value at the chosen level
of significance, we may reject the null hypothesis;
otherwise, we may not reject it
• Where: ˆ j  0
t
se( ˆ )
j

Nguyen Thu Hang, BMNV, FTU CS2 42


Example : Determinants of college GPA

• The null hypothesis states that:


– ACT held constant,
– hsGPA has no influence on colGPA
H0: β2 = 0 and H1: β2 ≠ 0
• t test:

$t = \dfrac{0.4534 - 0}{0.0958} = 4.73$

• The critical t value is 2.61 for a two-tail test at the 1% significance level (look up tα/2 for 138 df).
• At the 1% significance level, reject the null hypothesis that hsGPA has no effect on colGPA.
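The critical value can also be computed rather than looked up. A sketch:

* t statistic and its 1% two-tail critical value with 138 df
display "t      = " (0.4534 - 0)/0.0958
display "t crit = " invttail(138, 0.005)   // alpha/2 = 0.005 for a two-tail 1% test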


Example 2: Determinants of college GPA

. use "D:\Bai giang\Kinh te luong\datasets\GPA1.DTA", clear

. reg colGPA hsGPA ACT

Source SS df MS Number of obs = 141


F( 2, 138) = 14.78
Model 3.42365506 2 1.71182753 Prob > F = 0.0000
Residual 15.9824444 138 .115814814 R-squared = 0.1764
Adj R-squared = 0.1645
Total 19.4060994 140 .138614996 Root MSE = .34032

colGPA Coef. Std. Err. t P>|t| [95% Conf. Interval]

hsGPA .4534559 .0958129 4.73 0.000 .2640047 .6429071


ACT .009426 .0107772 0.87 0.383 -.0118838 .0307358
_cons 1.286328 .3408221 3.77 0.000 .612419 1.960237
A reminder on the language of
classical hypothesis testing
• When H0 is not rejected, say "we fail to reject H0 at the x% level"; do not say "H0 is accepted at the x% level".
• Statistical significance vs economic significance: statistical significance is determined by the size of the t statistic, whereas economic significance is related to the size and sign of the estimates.


Testing Hypotheses on the coefficients

Hypotheses H0   Alternative hypothesis H1   Rejection region
βj = 0          βj ≠ 0 (two tail)           |t0| > t(n−k),α/2
βj = 0          βj > 0 (right tail)         t0 > t(n−k),α
βj = 0          βj < 0 (left tail)          t0 < −t(n−k),α


8.2. Testing the Overall Significance of
the Sample Regression
For $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki} + u_i$
→ To test the hypothesis
H0: β2 = β3 = ... = βk = 0 (all slope coefficients are simultaneously zero)
(this is also a test of the significance of R2)
H1: Not all slope coefficients are simultaneously zero

$F = \dfrac{R^2 (n-k)}{(1 - R^2)(k-1)}$   (8.5.7)

(k = total number of parameters to be estimated, including the intercept)
If F > F critical = Fα,(k−1, n−k), reject H0; otherwise, do not reject it.
Example : Testing the Overall Significance of
the Sample Regression

• Determinants of college GPA:

$F = \dfrac{0.1764 \times 138}{(1 - 0.1764) \times 2} = 14.78$

• We have F > F critical = F0.05,(2,138) = 3.062 → reject H0.
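The same numbers in a short sketch:

* Overall-significance F and its 5% critical value (GPA example)
display "F      = " (0.1764*138)/((1 - 0.1764)*2)
display "F crit = " invFtail(2, 138, 0.05)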


8.3. Testing the Equality of Two Regression Coefficients
• Suppose in the multiple regression
Yi = β1 + β2X2i + β3X3i + β4X4i + ui
we want to test the hypotheses
H0: β3 = β4 or (β3 − β4) = 0
H1: β3 ≠ β4 or (β3 − β4) ≠ 0
that is, the two slope coefficients β3 and β4 are equal.

• If the t variable exceeds the critical t value at the designated


level of significance for given df, then you can reject the null
hypothesis; otherwise, you do not reject it
8.3. Testing the Equality of Two Regression Coefficients

Option 1: t-test

$t = \dfrac{\hat\beta_3 - \hat\beta_4}{se(\hat\beta_3 - \hat\beta_4)},\qquad se(\hat\beta_3 - \hat\beta_4) = \sqrt{Var(\hat\beta_3) + Var(\hat\beta_4) - 2\,Cov(\hat\beta_3, \hat\beta_4)}$

• If the t variable exceeds the critical t value at the designated level of significance for the given df, then you can reject the null hypothesis; otherwise, you do not reject it.


8.3. Testing the Equality of Two Regression Coefficients

• Review:

$Var(\hat\beta_2) = \dfrac{\sigma^2 \sum x_{3i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2} = \dfrac{\sigma^2}{(1 - r_{23}^2)\sum x_{2i}^2}$

$Var(\hat\beta_3) = \dfrac{\sigma^2 \sum x_{2i}^2}{\sum x_{2i}^2 \sum x_{3i}^2 - \left(\sum x_{2i} x_{3i}\right)^2} = \dfrac{\sigma^2}{(1 - r_{23}^2)\sum x_{3i}^2}$

$Cov(\hat\beta_2, \hat\beta_3) = \dfrac{-r_{23}\,\sigma^2}{(1 - r_{23}^2)\sqrt{\sum x_{2i}^2}\sqrt{\sum x_{3i}^2}}$


Example- Stata output
• Model: wage = f(educ,exper, tenure )
. reg wage educ exper tenure

Source SS df MS Number of obs = 526


F( 3, 522) = 76.87
Model 2194.1116 3 731.370532 Prob > F = 0.0000
Residual 4966.30269 522 9.51398984 R-squared = 0.3064
Adj R-squared = 0.3024
Total 7160.41429 525 13.6388844 Root MSE = 3.0845

wage Coef. Std. Err. t P>|t| [95% Conf. Interval]

educ .5989651 .0512835 11.68 0.000 .4982176 .6997126


exper .0223395 .0120568 1.85 0.064 -.0013464 .0460254
tenure .1692687 .0216446 7.82 0.000 .1267474 .2117899
_cons -2.872735 .7289643 -3.94 0.000 -4.304799 -1.440671
Example- Stata output
• Model: wage = f(educ,exper, tenure )
. estat vce

Covariance matrix of coefficients of regress model

e(V) educ exper tenure _cons

educ .00263
exper .00019406 .00014537
tenure -.0001254 -.00013218 .00046849
_cons -.03570219 -.0042369 .00143314 .53138894
Example- Stata output
• For H0: β3 = β4 (exper vs tenure): $se(\hat\beta_3 - \hat\beta_4) = \sqrt{0.00014537 + 0.00046849 - 2(-0.00013218)} = 0.029635$
• $t = \dfrac{0.0223395 - 0.1692687}{0.029635} = -4.958$; since $|t| > t_{0.025,\,522} \approx 1.96$ → Reject H0.
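Stata can do this in one line after the regression. A sketch:

* After: regress wage educ exper tenure
lincom exper - tenure   // estimate, std. err., and t for (b_exper - b_tenure)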


8.3. Testing the Equality of Two Regression Coefficients

Option 2: F-test
• If the F variable exceeds the critical F value at the designated
level of significance for given df, then you can reject the null
hypothesis; otherwise, you do not reject it
$F_{1,\,n-k} = \left[\dfrac{(\hat\beta_3 - \hat\beta_4) - (\beta_3 - \beta_4)}{se(\hat\beta_3 - \hat\beta_4)}\right]^2$


Example
• F = 24.58
• F(0.05, 1, 522) = 3.85
• → We reject the hypothesis that the two effects are equal.


8.3. Testing the Equality of Two Regression Coefficients

Option 3: Stata output F-test

. test exper=tenure
( 1) exper - tenure = 0

F( 1, 522) = 24.58
Prob > F = 0.0000
→ We reject the hypothesis that the two effects are equal.
8.4. Restricted Least Squares: Testing Linear Equality Restrictions (LEC 13)

• Now consider the Cobb–Douglas production function:


$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$   (8.6.1)
where Y = output
X2 = labor input
X3 = capital input
• Written in log form, the equation becomes
ln Yi = ln β1 + β2 lnX2i + β3lnX3i + ui
= β0 + β2lnX2i + β3lnX3i + ui 8.6.2
where β0 = ln β1.



8.4. Restricted Least Squares: Testing Linear Equality
Restrictions

• Now if there are constant returns to scale (equiproportional


change in output for an equiproportional change in the
inputs), economic theory would suggest that:
β2 + β3 = 1 (8.6.3)
which is an example of a linear equality restriction.
• How can we test whether the restriction (8.6.3) is valid? There are two approaches:
– The t-Test Approach
– The F-Test Approach


8.4. Restricted Least Squares: Testing Linear Equality
Restrictions
The t-Test Approach
• The simplest procedure is to estimate Eq. (8.6.2) in the usual manner.
• A test of the hypothesis or restriction can then be conducted by the t test:

$t = \dfrac{(\hat\beta_2 + \hat\beta_3) - 1}{se(\hat\beta_2 + \hat\beta_3)} = \dfrac{(\hat\beta_2 + \hat\beta_3) - 1}{\sqrt{Var(\hat\beta_2) + Var(\hat\beta_3) + 2\,Cov(\hat\beta_2, \hat\beta_3)}}$   (8.6.4)

• If the computed t value exceeds the critical t value at the chosen level of significance, we reject the hypothesis of constant returns to scale;
• Otherwise we do not reject it.
8.4. Restricted Least Squares: Testing Linear Equality
Restrictions

The F-Test Approach


• Under the restriction, we see that β2 = 1 − β3,
• so we can write the Cobb–Douglas production function as
lnYi = β0 + (1 − β3) lnX2i + β3 lnX3i + ui
     = β0 + lnX2i + β3(lnX3i − lnX2i) + ui
or (lnYi − lnX2i) = β0 + β3(lnX3i − lnX2i) + ui (8.6.7)
or ln(Yi/X2i) = β0 + β3 ln(X3i/X2i) + ui (8.6.8)
where Yi/X2i = the output/labor ratio and X3i/X2i = the capital/labor ratio.
Eq. (8.6.7) or Eq. (8.6.8) is known as restricted least squares (RLS).
8.4. Restricted Least Squares: Testing Linear Equality
Restrictions
• We want to test the hypothesis
H0: β2 + β3 = 1 (the restriction is valid)
using the F statistic

$F = \dfrac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-k)}$

where
RSSUR = RSS of the unrestricted regression (8.6.2)
RSSR = RSS of the restricted regression (8.6.7) or (8.6.8)
m = number of linear restrictions (1 in the present example)
k = number of parameters in the unrestricted regression
n = number of observations
• If the computed F value > the critical F value at the chosen level of significance, we reject the hypothesis H0.
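A Stata sketch of both regressions and the F statistic, assuming logged variables lnY, lnX2, lnX3 are in memory (illustrative names):

* F test of constant returns to scale: beta2 + beta3 = 1
regress lnY lnX2 lnX3            // unrestricted (8.6.2)
scalar rss_ur = e(rss)
scalar df_ur  = e(df_r)
gen lnYL = lnY - lnX2            // ln(Y/X2)
gen lnKL = lnX3 - lnX2           // ln(X3/X2)
regress lnYL lnKL                // restricted (8.6.8)
scalar rss_r = e(rss)
display "F = " ((rss_r - rss_ur)/1)/(rss_ur/df_ur)
* Equivalently, right after the unrestricted regression: test lnX2 + lnX3 = 1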


EXAMPLE 8.3 The Cobb–Douglas Production Function for the
Mexican Economy,1955–1974 (Table 8.8)
GDP Employment Fixed Capital
Year Millions of 1960 pesos. Thousands of people. Millions of 1960 pesos.
1955 114043 8310 182113
1956 120410 8529 193749
1957 129187 8738 205192
1958 134705 8952 215130
1959 139960 9171 225021
1960 150511 9569 237026
1961 157897 9527 248897
1962 165286 9662 260661
1963 178491 10334 275466
1964 199457 10981 295378
1965 212323 11746 315715
1966 226977 11521 337642
1967 241194 11540 363599
1968 260881 12066 391847
1969 277498 12297 422382
1970 296530 12955 455049
1971 306712 13338 484677
1972 329030 13738 520553
1973 354057 15924 561531
1974 374977 14154 609825
Example

[Estimated unrestricted regression output shown as an image in the original slides.]

Fk−1,n−k,α = F2,17,0.05 = 3.59


Example

• F = 3.75 < Fm,n−k,α = F1,17,0.05 = 4.45 → we cannot reject H0 (constant returns to scale).
Example

General F Testing
• In Exercise 7.19, you were asked to consider the following demand function for chicken:
lnYt = β1 + β2 lnX2t + β3 lnX3t + β4 lnX4t + β5 lnX5t + ut (8.6.19)
where Y = per capita consumption of chicken, lb
X2 = real disposable per capita income, $
X3 = real retail price of chicken per lb
X4 = real retail price of pork per lb
X5 = real retail price of beef per lb


Example

• Suppose that chicken consumption is not affected by the prices of pork and beef:
H0: β4 = β5 = 0 (8.6.21)
• Therefore, the constrained regression becomes
lnYt = β1 + β2 lnX2t + β3 lnX3t + ut (8.6.22)


Example

• F = 1.1224 < F0.05(2,18) = 3.55. Therefore, there is no reason to reject the null hypothesis: the demand for chicken does not depend on pork and beef prices.
• In short, we can accept the constrained regression (8.6.24) as representing the demand function for chicken.
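In Stata, this joint restriction is one command after the unrestricted regression. A sketch, assuming logged variables with illustrative names:

* General F test of H0: beta4 = beta5 = 0
regress lnY lnX2 lnX3 lnX4 lnX5
test lnX4 lnX5    // joint F test that both coefficients are zero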
8.5. Testing for Structural or Parameter Stability of
Regression Models: The Chow Test
• Now we have three possible regressions:
Time period 1970–1981: Yt = λ1 + λ2Xt + u1t (8.7.1)
Time period 1982–1995: Yt = γ1 + γ2Xt + u2t (8.7.2)
Time period 1970–1995: Yt = α1 + α2Xt + ut (8.7.3)
• Regression (8.7.3) assumes there is no difference between the two time periods. The mechanics of the Chow test are as follows:
1. Estimate regression (8.7.3) and obtain RSS3 with df = (n1 + n2 − k). We call RSS3 the restricted residual sum of squares (RSSR), because it is obtained by imposing the restrictions that λ1 = γ1 and λ2 = γ2, that is, that the subperiod regressions are not different.
2. Estimate Eq. (8.7.1) and obtain its residual sum of squares, RSS1, with df = (n1 − k).
3. Estimate Eq. (8.7.2) and obtain its residual sum of squares, RSS2, with df = (n2 − k).
8.5. Testing for Structural or Parameter Stability of
Regression Models: The Chow Test
4. The unrestricted residual sum of squares (RSSUR) is RSSUR = RSS1 + RSS2, with df = (n1 + n2 − 2k).
5. F ratio:

$F = \dfrac{(RSS_R - RSS_{UR})/k}{RSS_{UR}/(n_1 + n_2 - 2k)}$

6. If the computed F value exceeds the critical F value, we reject the hypothesis of parameter stability and conclude that regressions (8.7.1) and (8.7.2) are different.
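A sketch of the whole procedure for the savings–income data (variable names savings, income, year are illustrative):

* Chow test: savings on income, break after 1981 (k = 2, n1 + n2 = 26)
regress savings income                    // pooled 1970-1995: RSS_R
scalar rss_r = e(rss)
regress savings income if year <= 1981
scalar rss1 = e(rss)
regress savings income if year >= 1982
scalar rss2 = e(rss)
scalar rss_ur = rss1 + rss2
display "Chow F = " ((rss_r - rss_ur)/2)/(rss_ur/(26 - 2*2))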


TABLE 8.9 Savings and Personal Disposable Income (billions of dollars),
United States, 1970–1995

Observation Savings Income Observation Savings Income


1970 61.00 727.1     1983 167.0 2,522.4
1971 68.67 790.2     1984 235.7 2,810.0
1972 63.62 855.3     1985 206.2 3,002.0
1973 89.65 965.0     1986 196.5 3,187.6
1974 97.64 1,054.2   1987 168.4 3,363.1
1975 104.41 1,159.2  1988 189.1 3,640.8
1976 96.48 1,273.0   1989 187.8 3,894.5
1977 92.57 1,401.4   1990 208.7 4,166.8
1978 112.64 1,580.1  1991 246.4 4,343.7
1979 130.16 1,769.5  1992 272.6 4,613.7
1980 161.84 1,973.3  1993 214.4 4,790.2
1981 199.14 2,200.2  1994 189.4 5,021.7
1982 205.53 2,347.3  1995 249.3 5,320.8
8.5. Testing for Structural or Parameter Stability of
Regression Models: The Chow Test
• For the data given in Table 8.9, the empirical counterparts of the
preceding three regressions are as follows:

[Estimated regressions shown as images in the original slides.]


8.5. Testing for Structural or Parameter Stability of
Regression Models: The Chow Test
• RSSUR = RSS1 + RSS2 = (1,785.032 + 10,005.22) = 11,790.252
• RSSR = RSS3 = 23,248.30

$F = \dfrac{(23{,}248.30 - 11{,}790.252)/2}{11{,}790.252/22} = 10.69$

• From the F tables, we find that for 2 and 22 df the 1 percent critical F value is 5.72.
• The Chow test therefore seems to support our earlier hunch that the savings–income relation has undergone a structural change in the United States over the period 1970–1995.


8.6. Testing the Functional Form of Regression: Choosing
between Linear and Log–Linear Models
• We can use a test proposed by MacKinnon, White, and Davidson,
which for brevity we call the MWD test, to choose between the two
models
H0: The true model is linear
H1: The true model is Log–Linear
Step I: Estimate the linear model and obtain the fitted Y values; call them Yf.
Step II: Estimate the log–linear model and obtain the fitted lnY values; call them lnf.
Step III: Obtain Z1 = ln(Yf) − lnf.
Step IV: Regress Y on the X's and Z1. Reject H0 if the coefficient of Z1 is statistically significant by the usual t test.
Step V: Obtain Z2 = antilog(lnf) − Yf.
Step VI: Regress lnY on the logs of the X's and Z2. Reject H1 if the coefficient of Z2 is statistically significant by the usual t test.
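The whole test is a few lines in Stata. A sketch, assuming Y, X2, X3 and their logs lnY, lnX2, lnX3 are in memory (illustrative names):

* MWD test: linear vs log-linear
regress Y X2 X3
predict Yf, xb               // Step I: fitted Y from the linear model
regress lnY lnX2 lnX3
predict lnf, xb              // Step II: fitted lnY from the log-linear model
gen Z1 = ln(Yf) - lnf        // Step III
regress Y X2 X3 Z1           // Step IV: t test on Z1
gen Z2 = exp(lnf) - Yf       // Step V
regress lnY lnX2 lnX3 Z2     // Step VI: t test on Z2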


EXAMPLE 8.5 The Demand for Roses

• Refer to Exercise 7.16 where we have presented data on the demand


for roses in the Detroit metropolitan area for the period 1971–III to
1975–II.
Linear model: Yt = α1 + α2X2t + α3X3t + ut (8.10.1)
Log–linear model: lnYt = β1 + β2lnX2t + β3lnX3t + ut (8.10.2)
• Where Y is the quantity of roses in dozens
X2 is the average wholesale price of roses ($/dozen),
X3 is the average wholesale price of carnations ($/dozen).
• A priori: α2 and β2 are expected to be negative (why?)
α3 and β3 are expected to be positive
• As we know, the slope coefficients in the log–linear model are
elasticity coefficients.
EXAMPLE 8.5 The Demand for Roses
• Steps I and II: [estimated linear and log–linear regressions shown as images in the original slides].
• Step III: Obtain Z1 = ln(Yf) − lnf.
• Step IV: [regression of Y on X2, X3, and Z1 shown as an image in the original slides].


EXAMPLE 8.5 The Demand for Roses

• The coefficient of Z1 is not statistically significant (t test), so we do not reject the hypothesis that the true model is linear.
• Step V: Obtain Z2 = antilog(lnf) − Yf.
• Step VI: [regression of lnY on lnX2, lnX3, and Z2 shown as an image in the original slides].
• The coefficient of Z2 is not statistically significant (t test), so we also cannot reject the hypothesis that the true model is log–linear at the 5% level of significance.
• Conclusion: As this example shows, it is quite possible that in a given situation we cannot reject either of the specifications.
Assignments
• Problems 7.16, 7.17, 7.18, 7.19, 7.20 in p25-240, Gujarati.
• Problems 3.1-3.6 in p. 105-107, Wooldridge.
• Computer exercises C3.1-C3.7 in p. 110-111,
Wooldridge.

