Practice 3 Multiple Regression 2023 03-16-09!06!38

This document summarizes the results of multiple linear regressions analyzing factors that influence earnings. It reports coefficients and test statistics from 3 regressions using data on over 10,000 individuals. The regressions examine the relationship between earnings and variables like education level, gender, age, and geographic region. Statistical tests are used to analyze the significance and importance of each variable in determining earnings.

Uploaded by

Pavonis
Copyright
© All Rights Reserved

Quantitative Methods III - Practice 3

Multiple Linear Regression

Prof. Lorenzo Cavallo: lorenzo.cavallo.480084@uniroma2.eu


Prof. Marianna Brunetti: marianna.brunetti@uniroma2.it

Exercise

Given the table:

Regressor                (1)       (2)       (3)

College (X1)            0.352     0.373     0.371
                       (0.021)   (0.021)   (0.021)
Male (X2)               0.458     0.457     0.451
                       (0.021)   (0.020)   (0.020)
Age (X3)                          0.011     0.011
                                 (0.001)   (0.001)
North (X4)                                  0.175
                                           (0.037)
South (X5)                                  0.103
                                           (0.033)
East (X6)                                  -0.102
                                           (0.043)
Intercept              12.840    12.471    12.390
                       (0.018)   (0.049)   (0.057)

Summary and joint tests
F-statistic                                 21.87
SER                     1.026     1.023     1.020
Adjusted R² (R̄²)       0.0710    0.0761    0.0814
n                       10973     10973     10973

The table reports the results of a series of regressions estimated on more than
10,000 individuals who worked full-time for the whole year in a developing country.

The analysis concerned both quantitative and dummy variables¹:


• AHE = logarithm of average hourly earnings (in 2007 units)
• College = dummy variable (1 if graduate, 0 otherwise)
• Male = dummy variable (1 if male, 0 if female)
• Age = age (in years)
• North = dummy variable (1 if Region = North, 0 otherwise)
• East = dummy variable (1 if Region = East, 0 otherwise)
• South = dummy variable (1 if Region = South, 0 otherwise)
• West = dummy variable (1 if Region = West, 0 otherwise)
1. For each of the three regressions, add ∗ (5% level) and ∗∗ (1% level) to the
table to indicate the significance of the coefficients.
2. Answer using the regression results reported in column (2):
(a) Is age an important determinant of earnings?
(b) Use an appropriate test statistic or confidence interval to justify your
answer.
(c) Suppose Alvo is a 30-year-old college graduate and Kal is a 40-year-old
college graduate. Construct a 95% confidence interval for the difference
between expected wages.
3. Using the values reported in the table:
(a) Construct the R² for each of the regressions.
(b) Show how to construct the F statistic for testing the hypothesis that
β4 = β5 = β6 = 0 in the regression shown in column (3).
(c) Is the statistic significant at the 1% level?
(d) Show how to test the hypothesis that β4 = β5 = β6 = 0 in the
regression shown in column (3) using the Bonferroni test.
(e) Construct a 99% confidence interval for β1 in the regression shown in
column (3).
¹ A dummy variable is one that takes the values 0 or 1 to indicate the absence or presence of
some categorical effect that may be expected to shift the outcome.

Solution

1. Calculating the test statistic

\[ t_i = \frac{\beta_i}{SE(\beta_i)} \]

and comparing it with the critical values at the 5% level (1.96) and at the 1%
level (2.58), the following table is obtained (standard errors are shown in
round brackets and test statistics in square brackets):

Regressor                (1)         (2)         (3)

College (X1)            0.352∗∗     0.373∗∗     0.371∗∗
                       (0.021)     (0.021)     (0.021)
                      [16.762]    [17.762]    [17.667]
Male (X2)               0.458∗∗     0.457∗∗     0.451∗∗
                       (0.021)     (0.020)     (0.020)
                      [21.810]    [22.850]    [22.550]
Age (X3)                            0.011∗∗     0.011∗∗
                                   (0.001)     (0.001)
                                  [11.000]    [11.000]
North (X4)                                      0.175∗∗
                                               (0.037)
                                               [4.730]
South (X5)                                      0.103∗∗
                                               (0.033)
                                               [3.121]
East (X6)                                      -0.102∗
                                               (0.043)
                                              [−2.372]
Intercept              12.840∗∗    12.471∗∗    12.390∗∗
                       (0.018)     (0.049)     (0.057)
                     [713.333]   [254.510]   [217.368]
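The starring rule can be checked mechanically. The sketch below is only an illustration, not part of the original solution: it recomputes the t-ratios for column (3) from the coefficients and standard errors in the table and applies the normal critical values 1.96 and 2.58. The names `coefficients`, `significance_stars` and `results` are introduced here for the example.

```python
# Coefficients and standard errors copied from column (3) of the table.
coefficients = {
    "College": (0.371, 0.021),
    "Male": (0.451, 0.020),
    "Age": (0.011, 0.001),
    "North": (0.175, 0.037),
    "South": (0.103, 0.033),
    "East": (-0.102, 0.043),
    "Intercept": (12.390, 0.057),
}


def significance_stars(t_stat, crit_5=1.96, crit_1=2.58):
    """Return '**' (1% level), '*' (5% level) or '' using normal critical values."""
    if abs(t_stat) > crit_1:
        return "**"
    if abs(t_stat) > crit_5:
        return "*"
    return ""


# t-ratio = coefficient / standard error, followed by its significance marker.
results = {
    name: (coef / se, significance_stars(coef / se))
    for name, (coef, se) in coefficients.items()
}

for name, (t_stat, stars) in results.items():
    print(f"{name:<10} t = {t_stat:8.3f} {stars}")
```

Swapping in the values from columns (1) or (2) gives the remaining entries of the table in the same way.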

2. (a) Yes, age is an important determinant of wages.

Using the test statistic, we get:

\[ t = \frac{0.011}{0.001} = 11 \]

The test statistic is greater than 2.58, which implies that the age
coefficient is statistically significant at the 1% level.

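(b) The 100(1 − α)% confidence interval for a coefficient βi is:

\[ CI_{1-\alpha}(\beta_i) = [\beta_i \pm z_{1-\alpha/2} \times SE(\beta_i)] \]

In this case, the 95% confidence interval for β3 is:

\[ CI_{95\%}(\beta_3) = [\beta_3 \pm z_{0.975} \times SE(\beta_3)] = [0.011 \pm 1.96 \times 0.001] = [0.009;\ 0.013] \]

The interval does not contain zero, which confirms that the age coefficient
is statistically significant at the 5% level.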
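As a rough check of this interval, the following sketch computes a normal-approximation confidence interval using only the Python standard library. The helper name `confidence_interval` is an assumption made for this example; the inputs are the age coefficient and standard error from column (2).

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Two-sided normal-approximation confidence interval for a coefficient."""
    # Critical value z_{1 - alpha/2}, where alpha = 1 - level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Age coefficient and standard error from column (2).
low, high = confidence_interval(0.011, 0.001, level=0.95)
print(f"95% CI for the age coefficient: [{low:.4f}; {high:.4f}]")
```

The same function with `level=0.99` reproduces the 2.58 critical value used elsewhere in the solution, up to rounding.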

(c) To construct the confidence interval, we need the age difference
between the two subjects (∆Age = 40 − 30 = 10 years).
The confidence interval is:

\[ \Delta Age \times [\beta_3 \pm z_{0.975} \times SE(\beta_3)] = 10 \times [0.011 \pm 1.96 \times 0.001] = [0.0904;\ 0.1296] \]

Because the dependent variable is in logarithms, this corresponds to a
difference in expected wages of approximately 9.04% to 12.96%.
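The percentage reading of this interval relies on the usual approximation for a log dependent variable. The sketch below is an illustration under two assumptions that are not stated in the solution: that the two individuals differ only in age, and that the exact percentage difference implied by a log difference d is exp(d) − 1. Variable names are introduced for the example.

```python
import math

# Age coefficient and standard error from column (2); 1.96 is the 5% critical value.
beta_age, se_age = 0.011, 0.001
delta_age = 40 - 30
z = 1.96

# Interval for the difference in expected log earnings.
low = delta_age * (beta_age - z * se_age)
high = delta_age * (beta_age + z * se_age)

# Exact percentage differences implied by the log-point bounds.
exact_low = math.exp(low) - 1
exact_high = math.exp(high) - 1

print(f"log points: [{low:.4f}; {high:.4f}]")
print(f"exact percentage change: [{exact_low:.2%}; {exact_high:.2%}]")
```

For differences of this size the approximation and the exact figures are close, with the exact bounds slightly larger than the log-point bounds.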

3. (a) R² can be recovered from the formula for the adjusted R² (R̄²). In
particular, knowing that

\[ \bar{R}^2 = 1 - \frac{n-1}{n-k} \times (1 - R^2) = 1 - \frac{n-1}{n-k} \times \frac{RSS}{TSS} \quad \rightarrow \quad R^2 = 1 - \frac{n-k}{n-1}\,(1 - \bar{R}^2), \]

where k is the number of estimated coefficients (including the intercept),
it follows that:

• column (1): R² = 1 − (10973 − 3)/(10973 − 1) × (1 − 0.0710) = 0.0712
• column (2): R² = 1 − (10973 − 4)/(10973 − 1) × (1 − 0.0761) = 0.0764
• column (3): R² = 1 − (10973 − 7)/(10973 − 1) × (1 − 0.0814) = 0.0819
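The three conversions above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `r2_from_adjusted` is an assumption for the example, and k counts the intercept along with the slope coefficients, as in the calculations above.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Recover R^2 from adjusted R^2; k counts all coefficients, intercept included."""
    return 1 - (n - k) / (n - 1) * (1 - adj_r2)


n = 10973
# column number -> (adjusted R^2 from the table, number of estimated coefficients)
columns = {1: (0.0710, 3), 2: (0.0761, 4), 3: (0.0814, 7)}

r2 = {col: r2_from_adjusted(adj, n, k) for col, (adj, k) in columns.items()}

for col, value in r2.items():
    print(f"column ({col}): R^2 = {value:.4f}")
```

With n close to 11,000 the correction factor (n − k)/(n − 1) is very close to 1, which is why R² and R̄² differ only in the fourth decimal place.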

(b) H0: β4 = β5 = β6 = 0    H1: βi ≠ 0 for at least one i = 4, 5, 6.

Knowing that:

\[ F = \frac{(R^2_{unrestricted} - R^2_{restricted})/q}{(1 - R^2_{unrestricted})/(n - k_{unrestricted})}, \]

we can test the null hypothesis starting from the R² of the regressions.
For the unrestricted regression, the R² is that of column (3), while for the
restricted regression we use the R² of column (2).
With q = 3 restrictions and n = 10973, it follows that

\[ F = \frac{(0.0819 - 0.0764)/3}{(1 - 0.0819)/(10973 - 7)} = 21.898. \]
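The calculation can be sketched in Python as follows. The function name `f_statistic` and its parameter names are assumptions for this example; the inputs are the rounded R² values derived in part (a).

```python
def f_statistic(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F statistic computed from the two R^2 values."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k_unrestricted)
    return numerator / denominator


# Column (3) is the unrestricted regression, column (2) the restricted one.
f_stat = f_statistic(0.0819, 0.0764, q=3, n=10973, k_unrestricted=7)
print(f"F = {f_stat:.3f}")
```

The result differs slightly from the 21.87 reported in the table, presumably because the R² values used here are rounded to four decimal places.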

(c) The 1% critical value, F3,∞ = 3.78, is smaller than the F statistic, so we
reject the null hypothesis at the 1% significance level.
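The critical value can be cross-checked without statistical tables. The sketch below rests on an assumption not spelled out in the solution: in large samples q·F is approximately chi-squared with q degrees of freedom, and for q = 3 the upper-tail probability has a closed form in terms of the complementary error function. The function name `chi2_sf_3df` is introduced for this example and is valid for 3 degrees of freedom only.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


q = 3

# Tail probability at the tabulated critical value: should be close to 0.01.
p_at_critical = chi2_sf_3df(q * 3.78)

# Approximate p-value of the F statistic computed in part (b).
p_value = chi2_sf_3df(q * 21.898)

print(f"P(F > 3.78)   = {p_at_critical:.4f}")
print(f"P(F > 21.898) = {p_value:.2e}")
```

The p-value is far below 0.01, in line with the rejection of the null hypothesis.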

(d) The Bonferroni test allows testing joint hypotheses on q coefficients
starting from the t statistics for the individual hypotheses, but correcting
the critical value as follows:

\[ c = z_{1 - \frac{\alpha/q}{2}} \]

H0 is rejected if at least one of the individual t-ratios is, in absolute
value, greater than the adjusted critical value c.

In our case, H0 is a joint hypothesis on 3 coefficients, therefore q = 3.
Setting α = 1%, we have:

\[ c = z_{1 - \frac{\alpha/q}{2}} = z_{1 - \frac{0.01/3}{2}} = z_{0.9983} = 2.93 \]

The t-statistics of the individual coefficients under the null hypothesis
are t4 = 4.730, t5 = 3.121 and t6 = −2.372. Since |t4| > c and |t5| > c
(even though |t6| < c), the null hypothesis is rejected.
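The Bonferroni decision rule can be sketched in Python using the standard library's normal quantile function. Variable names are introduced for this example; the t-ratios are those of North, South and East in column (3).

```python
from statistics import NormalDist

alpha, q = 0.01, 3

# Bonferroni-adjusted two-sided critical value: z_{1 - (alpha/q)/2}.
critical = NormalDist().inv_cdf(1 - (alpha / q) / 2)

# Individual t-ratios of the coefficients restricted under H0.
t_stats = {"North": 4.730, "South": 3.121, "East": -2.372}

# H0 is rejected if at least one |t| exceeds the adjusted critical value.
exceeds = {name: abs(t) > critical for name, t in t_stats.items()}
reject_h0 = any(exceeds.values())

print(f"critical value c = {critical:.3f}")
print(exceeds)
print("reject H0" if reject_h0 else "do not reject H0")
```

Only one t-ratio above the adjusted threshold is needed to reject, so the conclusion does not depend on the East coefficient.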

(e) 99%CI(β1) = [β1 ± 2.58 × SE(β1)] = [0.371 ± 2.58 × 0.021] = [0.3168; 0.4252]
