
Problem Set 4

This document provides a multiple linear regression analysis of factors affecting average hourly earnings using data from the 1998 Current Population Survey. It reports regression results with different combinations of independent variables and asks the reader to interpret the results, perform statistical tests of hypotheses, and construct confidence intervals. It also asks the reader to compare the results to a replication of the analysis using 1992 data and apply the regression approach to additional data sets analyzing college course evaluations and the effect of distance to college on educational attainment.

Uploaded by

Luca Vanz

Universidad Carlos III de Madrid

ME-MIEM
Econometrics I
Multiple Linear Regression II: Inference I
Problem Set 4

1. Explain how you would test the null hypothesis that $\beta_1 = 0$ in the multiple regression model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$. Explain how you would test the joint hypothesis that $\beta_1 = 0$ and $\beta_2 = 0$.
Why isn't the result of the joint test implied by the results of the first two tests?

The following results were obtained using data from the 1998 Current Population Survey (CPS). The data
set consists of information on 4000 full-time, full-year workers. The highest educational achievement for each
worker was either a high school diploma or a bachelor's degree. The workers' ages ranged from 25 to 34 years.
The data set also contained information on the region of the country where the person lived, marital status,
and number of children:
AHE = average hourly earnings (in 1998 dollars)
College = binary variable (1 if college, 0 if high school)
Female = binary variable (1 if female, 0 if male)
Age = age (in years)
Ntheast = binary variable (1 if Region = Northeast, 0 otherwise)
Midwest = binary variable (1 if Region = Midwest, 0 otherwise)
South = binary variable (1 if Region = South, 0 otherwise)
West = binary variable (1 if Region = West, 0 otherwise)

Dependent Variable: Average Hourly Earnings (AHE)
(standard errors in parentheses below each coefficient)

Regressor                                   (1)       (2)       (3)
College (X1)                               5.46      5.48      5.44
                                          (0.21)    (0.21)    (0.21)
Female (X2)                               -2.64     -2.62     -2.62
                                          (0.20)    (0.20)    (0.20)
Age (X3)                                             0.29      0.29
                                                    (0.04)    (0.04)
Northeast (X4)                                                 0.69
                                                              (0.30)
Midwest (X5)                                                   0.60
                                                              (0.28)
South (X6)                                                    -0.27
                                                              (0.26)
Intercept                                 12.69      4.40      3.75
                                          (0.14)    (1.05)    (1.06)

Summary Statistics
SER                                        6.27      6.22      6.21
R^2                                       0.176     0.190     0.194
n                                         4,000     4,000     4,000
F-statistic for regional effects = 0                           6.10
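
The significance of each coefficient in the table follows from its t-ratio, the coefficient divided by its standard error. The following is a minimal sketch for column (3), assuming large-sample normal critical values (1.96 for 5%, 2.58 for 1%). The helper name `stars` is hypothetical, and the negative signs on Female and South are assumptions about the table (the stars depend only on the absolute t-ratio, so they do not affect the result).

```python
# Coefficients and standard errors transcribed from column (3) of the table.
# The minus signs on Female and South are assumed; stars depend only on |t|.
column3 = {
    "College": (5.44, 0.21),
    "Female": (-2.62, 0.20),
    "Age": (0.29, 0.04),
    "Northeast": (0.69, 0.30),
    "Midwest": (0.60, 0.28),
    "South": (-0.27, 0.26),
    "Intercept": (3.75, 1.06),
}


def stars(coef, se):
    """Return '**' if |t| > 2.58 (1%), '*' if |t| > 1.96 (5%), else ''."""
    t = abs(coef / se)
    if t > 2.58:
        return "**"
    if t > 1.96:
        return "*"
    return ""


for name, (coef, se) in column3.items():
    # Print the t-ratio next to the significance marker for each regressor.
    print(f"{name:<10} t = {coef / se:7.2f} {stars(coef, se)}")
```

The same loop can be run on columns (1) and (2) by swapping in their coefficients and standard errors.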

2. Add "*" (5%) and "**" (1%) to the table to indicate the statistical significance of the coefficients.
3. Using the regression results in column (1):

(a) Is the college-high school earnings difference estimated from this regression statistically significant at the
5% level? Construct a 95% confidence interval for the difference.
(b) Is the male-female earnings difference estimated from this regression statistically significant at the 5%
level? Construct a 95% confidence interval for the difference.

4. Using the regression results in column (2):

(a) Is age an important determinant of earnings? Use an appropriate statistical test and/or a confidence
interval to explain your answer.
(b) Sally is a 29-year-old female college graduate. Betsy is a 34-year-old female college graduate. Construct
a 95% confidence interval for the expected difference between their earnings.

5. Using the regression results in column (3):

(a) Do there appear to be important regional differences? Use an appropriate hypothesis test to explain your
answer.
(b) Juanita is a 28-year-old female college graduate from the South. Molly is a 28-year-old female college
graduate from the West. Jennifer is a 28-year-old female college graduate from the Midwest.
(i) Construct a 95% confidence interval for the difference in expected earnings between Juanita and Molly.
(ii) Explain how you would construct a 95% confidence interval for the difference in expected earnings
between Juanita and Jennifer. (Hint: What would happen if you included West and excluded Midwest
from the regression?)

6. The regression shown in column (2) was estimated again, this time using data from 1992 (4000 observations
selected at random from the March 1993 CPS, converted into 1998 dollars using the consumer price index).
The results are
$$\widehat{AHE} = \underset{(0.98)}{0.77} + \underset{(0.20)}{5.29}\,College - \underset{(0.18)}{2.59}\,Female + \underset{(0.03)}{0.40}\,Age,$$
$$SER = 5.85, \quad R^2 = 0.21.$$

Comparing this regression to the regression for 1998 shown in column (2), is there a statistically significant
change in the coefficient on College?
7. Evaluate the following statement: "In all of the regressions, the coefficient on Female is negative, large, and
statistically significant. This provides strong statistical evidence of gender discrimination in the U.S. labor
market."
8. Consider the regression model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$. Transform the regression so that you can use a
t-statistic to test:

(a) $\beta_1 = \beta_2$.
(b) $\beta_1 + a\beta_2 = 0$, where $a$ is a constant.
(c) $\beta_1 + \beta_2 = 1$. (Hint: You must redefine the dependent variable in the regression.)
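
One standard way to approach these transformations is to add and subtract a term so that the restriction becomes a single coefficient, which can then be tested with an ordinary t-statistic. The sketch below is offered as a worked illustration, not as the official solution. Each line can be checked by expanding the right-hand side back into $\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$.

```latex
% (a) H0: beta_1 = beta_2. Add and subtract beta_2 X_{1i}:
Y_i = \beta_0 + (\beta_1 - \beta_2) X_{1i} + \beta_2 (X_{1i} + X_{2i}) + U_i
% Regress Y_i on X_{1i} and (X_{1i} + X_{2i}); test that the coefficient on X_{1i} is zero.

% (b) H0: beta_1 + a beta_2 = 0. Add and subtract a beta_2 X_{1i}:
Y_i = \beta_0 + (\beta_1 + a\beta_2) X_{1i} + \beta_2 (X_{2i} - a X_{1i}) + U_i
% Regress Y_i on X_{1i} and (X_{2i} - a X_{1i}); test that the coefficient on X_{1i} is zero.

% (c) H0: beta_1 + beta_2 = 1. Subtract X_{1i} from both sides, then add and subtract beta_2 X_{1i}:
Y_i - X_{1i} = \beta_0 + (\beta_1 + \beta_2 - 1) X_{1i} + \beta_2 (X_{2i} - X_{1i}) + U_i
% Regress (Y_i - X_{1i}) on X_{1i} and (X_{2i} - X_{1i}); test that the coefficient on X_{1i} is zero.
```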

9. Using the data set TeachingRatings, carry out the following exercises:

(a) Run a regression of Course_Eval on the variable that measures the professor's Beauty (Beauty).
Construct a 95% confidence interval for the effect of Beauty on Course_Eval.
(b) Consider the different control variables in the data set. Which do you think should be included in the
regression? Using a table, examine the robustness of the confidence interval that you constructed in (a).
What is a reasonable confidence interval for the effect of Beauty on Course_Eval?

10. Using the data set CollegeDistance, answer the following questions:

(a) An education advocacy group argues that, on average, a person's educational attainment would increase
by approximately 0.15 year if distance to the nearest college is decreased by 20 miles. Run a regression
of years of completed education (ED) on distance to the nearest college (Dist). Is the advocacy group's
claim consistent with the estimated regression? Explain.
(b) Other factors also affect how much college a person completes. Does controlling for these other factors
change the estimated effect of distance on college years completed? To answer this question, construct
a table, including a simple specification (as the one in (a)), a base specification (that includes a set of
important control variables), and several modifications of the base specification. Discuss how the estimated
effect of Dist on ED changes across the specifications.

(c) It has been argued that, controlling for other factors, blacks and hispanics complete more college than
whites. Is this result consistent with the regressions that you constructed in part (b)?

SOLUTIONS:

3. a) $5.46 \pm 1.96 \times 0.21$; b) Yes, $-2.64 \pm 1.96 \times 0.20$.


4. a) Yes, $0.29 \pm 1.96 \times 0.04$; b) ($1.06, $1.84).
5. a) Yes, the regional effects are significant at 1%.
b) i) $-0.27 \pm 1.96 \times 0.26$.
ii) The expected difference between Juanita and Jennifer is
$(X_{5,Juanita} - X_{5,Jennifer})\beta_5 + (X_{6,Juanita} - X_{6,Jennifer})\beta_6 = -\beta_5 + \beta_6$.
A 95% confidence interval could easily be built by omitting Midwest from the regression and
replacing it with $X_5 = West$. In this new regression the coefficient on South measures the South-Midwest earnings
difference, and the 95% confidence interval is computed directly.
6. $t = 0.6552$.
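
The arithmetic behind solutions 3 to 6 can be reproduced with a short script. This is a sketch under stated assumptions: the helper `ci95` is a hypothetical name, the critical value 1.96 is the large-sample normal value, the Female and South coefficients are taken as negative, and the 1992 and 1998 samples are treated as independent, so the variance of the difference in the College coefficients is the sum of the two variances.

```python
import math

Z95 = 1.96  # large-sample two-sided 5% critical value


def ci95(estimate, se):
    """Return (lower, upper) for estimate +/- 1.96 * se."""
    half_width = Z95 * se
    return (estimate - half_width, estimate + half_width)


# 3(a), 3(b): column (1) coefficients on College and Female (Female sign assumed negative).
college_ci = ci95(5.46, 0.21)
female_ci = ci95(-2.64, 0.20)

# 4(b): Betsy is 5 years older than Sally, so the expected difference is
# 5 * beta_Age and its standard error is 5 * SE(beta_Age).
age_gap = 34 - 29
age_diff_ci = ci95(age_gap * 0.29, age_gap * 0.04)

# 5(b)(i): West is the omitted region, so Juanita (South) minus Molly (West)
# is the South coefficient itself (sign assumed negative).
south_ci = ci95(-0.27, 0.26)

# 6: difference in the College coefficient, 1998 vs. 1992, assuming
# independent samples so the variances add.
t_college = (5.48 - 5.29) / math.sqrt(0.21**2 + 0.20**2)

print("3(a) College CI:", [round(x, 2) for x in college_ci])
print("3(b) Female CI: ", [round(x, 2) for x in female_ci])
print("4(b) Age-gap CI:", [round(x, 2) for x in age_diff_ci])
print("5(b)(i) South CI:", [round(x, 2) for x in south_ci])
print("6 t-statistic:  ", round(t_college, 4))
```

Because the t-statistic in 6 is well below 1.96, the change in the College coefficient between the two years is not statistically significant at the 5% level.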

9. (a) $0.13 \pm 0.03 \times 1.96$, or 0.07 to 0.20.


(b) Intro is not significant, but the other variables are significant.
A reasonable 95% confidence interval is $0.17 \pm 1.96 \times 0.03$, or 0.11 to 0.23.
10. (a) Yes. The group's claim is that the coefficient on Dist is $-0.075$, and the 95% confidence interval for the
coefficient on Dist in a simple regression model is $-0.073 \pm 1.96 \times 0.013$, or $-0.100$ to $-0.047$.
(b) Yes. Use a base specification controlling for other important factors (bytest, female, black, hispanic,
incomehi, ownhome, dadcoll, Momcoll, cue80, stwmfg80) to obtain the 95% confidence interval for the
coefficient on Dist: $-0.031 \pm 1.96 \times 0.012$, or $-0.055$ to $-0.008$. Also use additional variables (Urban, Tuition).
(c) Yes.
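
The comparison in solution 10 amounts to checking whether the advocacy group's implied coefficient lies inside each confidence interval. The sketch below assumes Dist is measured in tens of miles, as the implied coefficient of -0.075 in solution 10(a) suggests, and that the Dist coefficients are negative. The helpers `ci95` and `contains` are hypothetical names. Since the inputs are the rounded estimates, the computed endpoints may differ slightly in the third decimal from the printed ones.

```python
Z95 = 1.96  # large-sample two-sided 5% critical value


def ci95(estimate, se):
    """Return (lower, upper) for estimate +/- 1.96 * se."""
    half_width = Z95 * se
    return (estimate - half_width, estimate + half_width)


def contains(interval, value):
    """Return True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


# Claim: +0.15 years of education when distance falls by 20 miles.
# With Dist in tens of miles (assumption), that is a change of -2 units.
claimed_coef = 0.15 / -2

simple_ci = ci95(-0.073, 0.013)  # simple regression, solution 10(a)
base_ci = ci95(-0.031, 0.012)    # base specification, solution 10(b)

print("claimed coefficient:", claimed_coef)
print("simple CI:", [round(x, 3) for x in simple_ci],
      "contains claim:", contains(simple_ci, claimed_coef))
print("base CI:  ", [round(x, 3) for x in base_ci],
      "contains claim:", contains(base_ci, claimed_coef))
```

Under these assumptions the claim falls inside the simple-regression interval but outside the base-specification interval, which is why controlling for other factors changes the conclusion.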
