Week 3 - Multiple Regression Solutions
Problem 1
The owner of Showtime Movie Theaters, Inc., would like to estimate weekly gross revenue
as a function of advertising expenditures. Historical data for a sample of eight weeks follow.
Webfile: Showtime
Regression Equation
Weekly Gross Revenue ($1000s) = 88.64 + 1.604 Television Advertising ($1000s)
Regression Equation
Weekly Gross Revenue ($1000s) = 83.23 + 2.290 Television Advertising ($1000s)
+ 1.301 Newspaper Advertising ($1000s)
No,
Coefficients
Term                              Coef  SE Coef  T-Value  P-Value   VIF
Constant                         88.64     1.58    56.02    0.000
Television Advertising ($1000s)  1.604    0.478     3.36    0.015  1.00
Coefficients
Term                              Coef  SE Coef  T-Value  P-Value   VIF
Constant                         83.23     1.57    52.88    0.000
Television Advertising ($1000s)  2.290    0.304     7.53    0.001  1.45
Newspaper Advertising ($1000s)   1.301    0.321     4.06    0.010  1.45
d. What is the estimate of the weekly gross revenue for a week when $3500 is spent on
television advertising and $1800 is spent on newspaper advertising?
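The estimate in part (d) comes from substituting into the two-variable equation above. Both predictors are measured in $1000s, so $3500 and $1800 enter as 3.5 and 1.8. A minimal sketch in Python (the function name is illustrative and not part of the original solution):

```python
# Coefficients taken from the two-variable Minitab output above.
B0, B_TV, B_NEWS = 83.23, 2.290, 1.301

def predict_revenue(tv, news):
    """Estimated weekly gross revenue in $1000s; both inputs are in $1000s."""
    return B0 + B_TV * tv + B_NEWS * news

# $3500 on television and $1800 on newspaper advertising.
revenue = predict_revenue(3.5, 1.8)
print(f"Estimated weekly gross revenue: ${revenue * 1000:,.0f}")
```

The arithmetic is 83.23 + 2.290(3.5) + 1.301(1.8) = 93.5868, or roughly $93,587.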
Problem 2
The National Football League (NFL) records a variety of performance data for individuals
and teams. To investigate the importance of passing on the percentage of games won by a
team, the following data show the conference (Conf), average number of passing yards per
attempt (Yds/Att), the number of interceptions thrown per attempt (Int/Att), and the
percent of games won (Win%) for a random sample of 16 NFL teams for the 2011 season
(NFL website, February 12, 2012). Webfile: NFLPassing
Regression Equation
Win% = -58.8 + 16.39 Yds/Att
Regression Equation
Win% = 97.5 - 1600 Int/Att
c. Develop an estimated regression equation that can be used to predict the
percentage of games won given the average number of passing yards per attempt
and the number of interceptions thrown per attempt.
Regression Equation
Win% = -5.8 + 12.95 Yds/Att - 1084 Int/Att
d. The average number of passing yards per attempt for the Kansas City Chiefs was 6.2
and the number of interceptions thrown per attempt was .036. Use the estimated
regression equation developed in part (c) to predict the percentage of games won by
the Kansas City Chiefs. (Note: For the 2011 season the Kansas City Chiefs’ record
was 7 wins and 9 losses). Compare your prediction with the actual percentage of
games won by the Kansas City Chiefs.
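The part (d) prediction can be checked by substituting 6.2 and 0.036 into the part (c) equation and comparing with the actual 7-9 record. A short sketch (variable names are illustrative):

```python
# Coefficients from the part (c) equation: Win% = -5.8 + 12.95 Yds/Att - 1084 Int/Att
B0, B_YDS, B_INT = -5.8, 12.95, -1084

predicted = B0 + B_YDS * 6.2 + B_INT * 0.036  # Kansas City inputs from part (d)
actual = 7 / (7 + 9) * 100                    # 7 wins in 16 games

print(f"Predicted Win%: {predicted:.2f}")
print(f"Actual Win%:    {actual:.2f}")
print(f"Difference (actual - predicted): {actual - predicted:.2f}")
```

The equation predicts about 35.5%, while the actual figure is 43.75%, so the model underestimates the Chiefs' winning percentage by roughly 8 percentage points.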
a. Did the estimated regression equation that uses only the average number of passing
yards per attempt as the independent variable to predict the percentage of games
won provide a good fit?
Model Summary
S R-sq R-sq(adj) R-sq(pred)
15.8732 57.71% 54.69% 44.88%
b. Discuss the benefits of using both the average number of passing yards per attempt
and the number of interceptions thrown per attempt to predict the percentage of
games won.
Model Summary
S R-sq R-sq(adj) R-sq(pred)
12.6024 75.25% 71.44% 60.51%
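The two Model Summary outputs can be compared on adjusted R-sq, which penalizes the extra predictor. The sketch below applies the standard formula R-sq(adj) = 1 - (1 - R-sq)(n - 1)/(n - p - 1), assuming n = 16 teams as stated in the problem, and reproduces the reported values:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

N = 16  # sample size, taken from the problem statement (assumption for this check)

one_var = adjusted_r2(0.5771, N, 1)  # Yds/Att only
two_var = adjusted_r2(0.7525, N, 2)  # Yds/Att and Int/Att

print(f"One-variable model R-sq(adj): {one_var:.2%}")
print(f"Two-variable model R-sq(adj): {two_var:.2%}")
```

Adjusted R-sq rises from 54.69% to 71.44% and S falls from 15.87 to 12.60, so adding Int/Att improves the fit even after accounting for the additional variable.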
Problem 4
The National Football League (NFL) records a variety of performance data for individuals
and teams. A portion of the data showing the average number of passing yards per game on
offense (OffPassYds/G), the average number of yards given up per game on defense
(DefYds/G), and the percentage of games won (Win%), for the 2011 season follows
(ESPN website, November 3, 2012). Webfile: NFL2011
Regression Equation
Win% = 60.5 + 0.3186 OffPassYds/G - 0.2413 DefYds/G
b. Use the F test to determine the overall significance of the relationship. What is your
conclusion at the 0.05 level of significance?
H0: β1=β2=0
Ha: β1≠0 or β2≠0 (at least one of the coefficients is not equal to zero)
Analysis of Variance
Source          DF  Adj SS   Adj MS  F-Value  P-Value
Regression       2    6179   3089.6    13.18    0.000
  OffPassYds/G   1    6079   6079.5    25.94    0.000
  DefYds/G       1    1713   1712.6     7.31    0.011
Error           29    6797    234.4
Total           31   12976
Since the P-value (0.000) is less than 0.05, reject the null hypothesis. The overall
regression relationship is significant: at least one of the regression coefficients is
different from zero.
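The F statistic can be rebuilt from the sums of squares in the ANOVA table. The sketch below also computes the P-value with a closed form that holds only when the numerator has 2 degrees of freedom, P(F > f) = (1 + 2f/d2)^(-d2/2); this shortcut is used here to avoid a statistics library and is not part of the original solution.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_reg, df_reg = 6179, 2
ss_err, df_err = 6797, 29

ms_reg = ss_reg / df_reg   # mean square regression
ms_err = ss_err / df_err   # mean square error
f_stat = ms_reg / ms_err

# Exact upper-tail probability of F(2, df_err); valid only for 2 numerator df.
p_value = (1 + df_reg * f_stat / df_err) ** (-df_err / 2)

alpha = 0.05
print(f"F = {f_stat:.2f}, p-value = {p_value:.6f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

The small gap between 6179/2 = 3089.5 and the table's 3089.6 comes from rounding in the printed sums of squares.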
c. Use the t test to determine the significance of each independent variable. What is
your conclusion at the 0.05 level of significance?
Coefficients
Term             Coef  SE Coef  T-Value  P-Value   VIF
Constant         60.5     28.4     2.14    0.041
OffPassYds/G   0.3186   0.0626     5.09    0.000  1.21
DefYds/G      -0.2413   0.0893    -2.70    0.011  1.21
H0: β1=0
Ha: β1≠0
For OffPassYds/G, T-value = 5.09 and P-value = 0.000. Since P < alpha, reject the null hypothesis.
OffPassYds/G is a significant independent variable at alpha = 0.05.
H0: β2=0
Ha: β2≠0
For DefYds/G, T-value = -2.70 and P-value = 0.011. Since P < alpha, reject the null hypothesis.
DefYds/G is a significant independent variable at alpha = 0.05.
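Each T-value is the coefficient divided by its standard error, and the same conclusions follow from the critical-value approach. The sketch below assumes the two-tailed critical value t(0.025, 29) = 2.045 read from a standard t table; that value does not appear in the Minitab output.

```python
T_CRIT = 2.045  # two-tailed critical value for alpha = 0.05, 29 error df (from a t table)

# (coefficient, standard error) pairs from the Coefficients table.
terms = {
    "OffPassYds/G": (0.3186, 0.0626),
    "DefYds/G": (-0.2413, 0.0893),
}

results = {}
for name, (coef, se) in terms.items():
    t_value = coef / se
    results[name] = (t_value, abs(t_value) > T_CRIT)
    verdict = "significant" if abs(t_value) > T_CRIT else "not significant"
    print(f"{name}: t = {t_value:.2f} -> {verdict}")
```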
d. Does the model have multicollinearity problems? Explain.
Discussion in class.
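As one input to that discussion, the VIF values in the Coefficients table can be translated into a correlation. With exactly two independent variables, VIF = 1/(1 - r²), where r is the sample correlation between them. The sketch below inverts that relation; the threshold of 5 is a common rule of thumb assumed here, not a value given in the source.

```python
import math

vif = 1.21            # VIF reported for both OffPassYds/G and DefYds/G
VIF_THRESHOLD = 5.0   # assumed rule-of-thumb cutoff, not from the source

# With two predictors, VIF = 1 / (1 - r^2), so r^2 = 1 - 1 / VIF.
r_squared = 1 - 1 / vif
abs_r = math.sqrt(r_squared)

print(f"Implied r^2 between predictors: {r_squared:.4f}")
print(f"Implied |r| between predictors: {abs_r:.3f}")
print("VIF exceeds threshold" if vif > VIF_THRESHOLD else "VIF is below threshold")
```

An implied |r| of about 0.42 is a moderate correlation, and a VIF of 1.21 is well below the assumed cutoff.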
Problem 5
A 10-year study conducted by the American Heart Association provided data on how age,
blood pressure, and smoking relate to the risk of strokes. Assume that the following data
are from a portion of this study. Risk is interpreted as the probability (times 100) that the
patient will have a stroke over the next 10-year period. For the smoking variable, define a
dummy variable with 1 indicating a smoker and 0 indicating a nonsmoker. Webfile: Stroke
Regression Equation
Smoker
No Risk = -91.8 + 1.077 Age + 0.2518 Pressure
Or
Regression Equation
Risk = -91.8 + 1.077 Age + 0.2518 Pressure + 8.74 Smoker
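The dummy variable shifts the intercept: for a smoker the constant becomes -91.8 + 8.74, while the Age and Pressure slopes are unchanged. A brief sketch follows; the patient values (age 68, pressure 175) are hypothetical inputs chosen only to illustrate the calculation.

```python
# Coefficients from the estimated regression equation above.
B0, B_AGE, B_PRESSURE, B_SMOKER = -91.8, 1.077, 0.2518, 8.74

def predict_risk(age, pressure, smoker):
    """Estimated 10-year stroke risk (probability times 100)."""
    dummy = 1 if smoker else 0  # 1 = smoker, 0 = nonsmoker
    return B0 + B_AGE * age + B_PRESSURE * pressure + B_SMOKER * dummy

smoker_intercept = B0 + B_SMOKER  # constant term when Smoker = Yes

# Hypothetical patient: 68 years old, blood pressure 175.
risk_smoker = predict_risk(68, 175, smoker=True)
risk_nonsmoker = predict_risk(68, 175, smoker=False)

print(f"Smoker intercept: {smoker_intercept:.2f}")
print(f"Risk if smoker:    {risk_smoker:.2f}")
print(f"Risk if nonsmoker: {risk_nonsmoker:.2f}")
```

Whatever the age and pressure, the estimated risk for a smoker is 8.74 points higher than for an otherwise identical nonsmoker.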
Coefficients
Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   -91.8     15.2    -6.03    0.000
Age        1.077    0.166     6.49    0.000  1.46
Pressure  0.2518   0.0452     5.57    0.000  1.25
Smoker
  Yes       8.74     3.00     2.91    0.010  1.36
H0: β1=0,
Ha: β1≠0
Discussion in class.