Problem Set 4
ME-MIEM
Econometrics I
Multiple Linear Regression II: Inference I
1. Explain how you would test the null hypothesis that $\beta_1 = 0$ in the multiple regression model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$. Explain how you would test the joint hypothesis that
$\beta_1 = 0$ and $\beta_2 = 0$. Why isn’t the result of the joint test implied by the results of the first two tests?
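As a reminder of the objects involved, the following sketch writes out the single-coefficient t-statistic and the large-sample F-statistic for two restrictions. The symbols $t_1$, $t_2$ (the individual t-statistics) and $\hat{\rho}_{t_1,t_2}$ (an estimate of the correlation between them) are notation introduced here for illustration, not taken from the problem statement.

```latex
t_j = \frac{\hat{\beta}_j - 0}{SE(\hat{\beta}_j)}, \qquad j = 1, 2,
\qquad\qquad
F = \frac{1}{2}\,
    \frac{t_1^2 + t_2^2 - 2\,\hat{\rho}_{t_1,t_2}\, t_1 t_2}
         {1 - \hat{\rho}_{t_1,t_2}^{\,2}} .
```

Because $F$ depends on the correlation between the two t-statistics as well as on their individual values, knowing only whether each $|t_j|$ exceeds 1.96 does not determine whether $F$ exceeds the 5% critical value of the $F_{2,\infty}$ distribution (3.00).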
The following results were obtained using data from the 1998 Current Population Survey (CPS). The data
set consists of information on 4000 full-time full-year workers. The highest educational achievement for each
worker was either a high school diploma or a bachelor’s degree. The workers’ ages ranged from 25 to 34 years.
The data set also contained information on the region of the country where the person lived, marital status
and number of children:
AHE = average hourly earnings (in 1998 dollars)
College = binary variable (1 if college, 0 if high school)
Female = binary variable (1 if female, 0 if male)
Age = age (in years)
Ntheast = binary variable (1 if Region = Northeast, 0 otherwise)
Midwest = binary variable (1 if Region = Midwest, 0 otherwise)
South = binary variable (1 if Region = South, 0 otherwise)
West = binary variable (1 if Region = West, 0 otherwise)
2. Add "*" (5%) and "**" (1%) to the table to indicate the statistical significance of the coefficients.
3. Using the regression results in column (1):
(a) Is the college-high school earnings difference estimated from this regression statistically significant at the
5% level? Construct a 95% confidence interval for the difference.
(b) Is the male-female earnings difference estimated from this regression statistically significant at the 5%
level? Construct a 95% confidence interval for the difference.
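The two tasks in this question (a 5% significance test and a 95% confidence interval) use the same ingredients: the coefficient estimate and its standard error. A minimal sketch, assuming the large-sample normal critical value 1.96; the function names and the example numbers are hypothetical and are not taken from the results table.

```python
def confidence_interval(coef, se, crit=1.96):
    """Return (lower, upper) for coef +/- crit * se (large-sample normal approximation)."""
    half_width = crit * se
    return coef - half_width, coef + half_width


def is_significant(coef, se, crit=1.96):
    """Two-sided test of H0: coefficient = 0; True if |t| exceeds the critical value."""
    return abs(coef / se) > crit


# HYPOTHETICAL estimate and standard error, for illustration only.
example_coef, example_se = 2.0, 0.5
low, high = confidence_interval(example_coef, example_se)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
print("significant at 5%:", is_significant(example_coef, example_se))
```

The two answers always agree: the coefficient is significant at the 5% level exactly when the 95% interval excludes zero.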
4. (a) Is age an important determinant of earnings? Use an appropriate statistical test and/or a confidence
interval to explain your answer.
(b) Sally is a 29-year-old female college graduate. Betsy is a 34-year-old female college graduate. Construct
a 95% confidence interval for the expected difference between their earnings.
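When two workers differ only in age and the regression is linear in Age, the expected earnings difference is the age gap times the Age coefficient, so both the point estimate and the standard error scale by that gap. A small sketch under those assumptions; the function name and the coefficient values are hypothetical placeholders, not entries from the results table.

```python
def ci_for_scaled_effect(coef, se, delta, crit=1.96):
    """CI for delta * beta: the estimate and its standard error both scale by delta."""
    estimate = delta * coef
    half_width = crit * abs(delta) * se
    return estimate - half_width, estimate + half_width


age_gap = 34 - 29  # Betsy's age minus Sally's age, from the problem statement

# HYPOTHETICAL Age coefficient and standard error, for illustration only.
age_coef, age_se = 0.30, 0.04
lower, upper = ci_for_scaled_effect(age_coef, age_se, age_gap)
print(f"95% CI for the expected difference: [{lower:.3f}, {upper:.3f}]")
```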
5. (a) Do there appear to be important regional differences? Use an appropriate hypothesis test to explain your
answer.
(b) Juanita is a 28-year-old female college graduate from the South. Molly is a 28-year-old female college
graduate from the West. Jennifer is a 28-year-old female college graduate from the Midwest.
(i) Construct a 95% confidence interval for the difference in expected earnings between Juanita and Molly.
(ii) Explain how you would construct a 95% confidence interval for the difference in expected earnings
between Juanita and Jennifer. (Hint: What would happen if you included West and excluded Midwest
from the regression?)
6. The regression shown in column (2) was estimated again, this time using data from 1992 (4000 observations
selected at random from the March 1993 CPS, converted into 1998 dollars using the consumer price index).
The results are
\[
\widehat{AHE} = \underset{(0.98)}{0.77} + \underset{(0.20)}{5.29}\,College - \underset{(0.18)}{2.59}\,Female + \underset{(0.03)}{0.40}\,Age,
\qquad SER = 5.85, \quad R^2 = 0.21.
\]
Comparing this regression to the regression for 1998 shown in column (2), is there a statistically significant
change in the coefficient of College?
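Since the 1992 and 1998 samples are drawn independently, one common approach is to treat the two College estimates as independent, so that the variance of their difference is the sum of the two squared standard errors. The sketch below follows that assumption. The 1992 numbers are those reported above; the 1998 numbers are hypothetical placeholders to be replaced with the column (2) entries from the results table.

```python
import math


def coef_change_t_stat(b_a, se_a, b_b, se_b):
    """t-statistic for H0: equal population coefficients, assuming independent samples."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (b_a - b_b) / se_diff


# 1992 College estimate and standard error, as reported in the text.
b_1992, se_1992 = 5.29, 0.20

# HYPOTHETICAL placeholders for the 1998 column (2) values (not from the table).
b_1998, se_1998 = 5.50, 0.20

t = coef_change_t_stat(b_1998, se_1998, b_1992, se_1992)
print(f"t = {t:.2f}; reject equality at 5%: {abs(t) > 1.96}")
```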
7. Evaluate the following statement: "In all of the regressions, the coefficient on Female is negative, large, and
statistically significant. This provides strong statistical evidence of gender discrimination in the U.S. labor
market."
8. Consider the regression model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$. Transform the regression so that you can use a
t-statistic to test:
(a) $\beta_1 = \beta_2$.
(b) $\beta_1 + a\beta_2 = 0$, where $a$ is a constant.
(c) $\beta_1 + \beta_2 = 1$. (Hint: You must redefine the dependent variable in the regression.)
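One way to see what "transforming the regression" means for case (a) is to add and subtract $\beta_2 X_{1i}$, which gives $Y_i = \beta_0 + (\beta_1 - \beta_2) X_{1i} + \beta_2 (X_{1i} + X_{2i}) + U_i$. The numerical sketch below checks only the coefficient identity on simulated data (the data-generating values, seed, and the helper `ols` are assumptions made for illustration); it does not compute standard errors, which regression software would report for the coefficient on $X_{1i}$ in the transformed model.

```python
import random


def ols(columns, y):
    """OLS coefficients for y on an intercept plus the given regressor columns,
    obtained by solving the normal equations with Gauss-Jordan elimination."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        A[p] = [v / A[p][p] for v in A[p]]
        for r in range(k):
            if r != p:
                factor = A[r][p]
                A[r] = [vr - factor * vp for vr, vp in zip(A[r], A[p])]
    return [A[r][k] for r in range(k)]


# Simulated data with arbitrary (hypothetical) true coefficients.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(200)]
x2 = [rng.gauss(0, 1) for _ in range(200)]
y = [1.0 + 2.0 * a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

b0, b1, b2 = ols([x1, x2], y)            # original regression
w = [a + b for a, b in zip(x1, x2)]      # new regressor W = X1 + X2
g0, gamma, g2 = ols([x1, w], y)          # transformed regression

print(f"b1 - b2 = {b1 - b2:.6f}, coefficient on X1 after transforming = {gamma:.6f}")
```

In the transformed regression, the usual t-statistic on the $X_{1i}$ coefficient tests $\beta_1 - \beta_2 = 0$ directly.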
9. Using the data set TeachingRatings, carry out the following exercises:
(a) Run a regression of Course_Eval on the variable that measures the professor’s Beauty (Beauty).
Construct a 95% confidence interval for the effect of Beauty on Course_Eval.
(b) Consider the different control variables in the data set. Which do you think should be included in the
regression? Using a table, examine the robustness of the confidence interval that you constructed in (a).
What is a reasonable confidence interval for the effect of Beauty on Course_Eval?
10. Using the data set CollegeDistance, answer the following questions:
(a) An education advocacy group argues that, on average, a person’s educational attainment would increase
by approximately 0.15 year if distance to the nearest college is decreased by 20 miles. Run a regression
of years of completed education (ED) on distance to the nearest college (Dist). Is the advocacy group’s
claim consistent with the estimated regression? Explain.
(b) Other factors also affect how much college a person completes. Does controlling for these other factors
change the estimated effect of distance on college years completed? To answer this question, construct
a table that includes a simple specification (such as the one in (a)), a base specification (that includes a set of
important control variables), and several modifications of the base specification. Discuss how the estimated
effect of Dist on ED changes across the specifications.
(c) It has been argued that, controlling for other factors, blacks and Hispanics complete more college than
whites. Is this result consistent with the regressions that you constructed in part (b)?
SOLUTIONS: