This Study Resource Was: Scatterplot of Attendance Vs Team Salary
Refer to the Baseball 2016 data, which reports information on the 2016 Major League Baseball
season. Let attendance be the dependent variable and total team salary be the independent variable.
Determine the regression equation and answer the following questions.
a. Draw a scatter diagram. From the diagram, does there seem to be a direct relationship between the
two variables?
b. What is the expected attendance for a team with a salary of $100.0 million?
c. If the owners pay an additional $30 million, how many more people could they expect to attend?
d. At the .05 significance level, can we conclude that the slope of the regression line is positive? Conduct
the appropriate test of hypothesis.
f. Determine the correlation between attendance and team batting average and between attendance
and team ERA. Which is stronger? Conduct an appropriate test of hypothesis for each set of variables.
Answer:
a.
[Figure: Scatterplot of Attendance vs Team Salary. Vertical axis: Attendance, 0 to 4,000,000. Horizontal axis: Team Salary, 40.00 to 240.00.]
The scatterplot above shows that as team salary increases, the corresponding attendance increases. There appears to be a direct (positive) relationship between the two variables.
This study source was downloaded by 100000829508145 from CourseHero.com on 08-09-2021 23:48:50 GMT -05:00
https://www.coursehero.com/file/61871519/week7docx/
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y} = \frac{494162269}{(30 - 1)(40.58)(594112.41)} = 0.70677$$

$$b = r\left(\frac{s_y}{s_x}\right) = 0.70677\left(\frac{594112.41}{40.58}\right) = 10347$$
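$$a = \bar{y} - b\bar{x} = 2458866.7 - 10347(122) \approx 1196919$$

Here the mean attendance is 2458866.7; the mean salary and the slope are shown rounded, so evaluating the expression with these rounded values differs slightly from the software intercept of 1196919.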
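The correlation and slope arithmetic can be checked with a short Python sketch. It assumes only the summary statistics quoted in the solution (the cross-product sum, the two standard deviations, and n = 30); the variable names are illustrative and do not come from the original. Because the inputs are rounded, the results may differ from the software output in the last digit or two.

```python
# Summary statistics quoted in the solution (variable names are illustrative).
n = 30
sum_cross = 494162269   # sum of (x - x_bar) * (y - y_bar)
s_x = 40.58             # standard deviation of team salary ($ millions)
s_y = 594112.41         # standard deviation of attendance

# Pearson correlation from the cross-product sum.
r = sum_cross / ((n - 1) * s_x * s_y)

# Least-squares slope: b = r * (s_y / s_x).
b = r * (s_y / s_x)

print(f"r = {r:.4f}")
print(f"b = {b:.1f}")
```

The slope printed here is within about one unit of the 10347 reported in the solution, which is the size of discrepancy one would expect from rounded inputs.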
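The regression equation is Attendance = 1196919 + 10347 (team salary), with team salary in millions of dollars.
b. For a team salary of $100.0 million: Attendance = 1196919 + 10347(100) = 2231619.
c. For a team salary of $130 million: Attendance = 1196919 + 10347(130) = 2542029.
The expected increase in attendance is 2542029 - 2231619 = 310410.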
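The two predictions and their difference can be reproduced with a small sketch. The function name `predict_attendance` is hypothetical; the intercept and slope are the rounded values from the solution, and salary is assumed to be in millions of dollars.

```python
def predict_attendance(salary_millions, intercept=1196919, slope=10347):
    """Expected attendance from the fitted line quoted in the solution."""
    return intercept + slope * salary_millions

at_100 = predict_attendance(100)   # salary of $100 million
at_130 = predict_attendance(130)   # salary of $130 million
extra = at_130 - at_100            # effect of an additional $30 million

print(at_100, at_130, extra)       # 2231619 2542029 310410
```

The increase depends only on the slope: 30 times 10347 gives the same 310410, whatever the starting salary.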
d. The null hypothesis is H0: β = 0, tested against the alternative H1: β > 0. From the Excel regression output, the t-statistic is 5.29.
The P-value is 0.000.
The level of significance is 0.05.
The null hypothesis is rejected at the 0.05 level of significance since the P-value is less than the level of significance. There is sufficient evidence to indicate that the slope of the regression line is positive.
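The t-statistic from the Excel output can be cross-checked by hand. In simple regression, the test of the slope is equivalent to the test of the correlation, so t = r√(n − 2) / √(1 − r²). The sketch below assumes r = 0.70677 and n = 30 from the solution; the critical value 1.701 is the one-tailed .05 entry for 28 degrees of freedom from a standard t table, not a value taken from the original.

```python
import math

n = 30
r = 0.70677          # correlation reported in the solution
df = n - 2           # degrees of freedom for simple regression

# t-statistic for H0: slope = 0 (equivalently, correlation = 0).
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: one-tailed critical value for alpha = .05, df = 28, from a t table.
t_crit = 1.701
reject_h0 = t_stat > t_crit

print(f"t = {t_stat:.2f}, r^2 = {r ** 2:.4f}, reject H0: {reject_h0}")
```

The same r also gives r² of about 0.4995, which matches the 49.95% of variation reported from the regression output.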
e. From the Excel regression output, the coefficient of determination is 49.95%, so 49.95% of the variation in attendance is explained by team salary.
f. The null hypothesis is that the correlation in the population is 0. From the Excel regression output, the t-statistic is 0.662.
The null hypothesis is not rejected at the 0.05 level of significance since the P-value is greater than the level of significance. There is not enough evidence to indicate that there is a correlation between the two variables.
CH13- 64. Refer to the Lincolnville School bus data. Develop a regression equation that expresses the
relationship between age of the bus and maintenance cost. The age of the bus is the independent
variable.
a. Draw a scatter diagram. What does this diagram suggest as to the relationship between the two
variables? Is it direct or indirect? Does it appear to be strong or weak?
b. Develop a regression equation. How much does an additional year add to the maintenance cost?
What is the estimated maintenance cost for a 10-year-old bus?
c. Conduct a test of hypothesis to determine whether the slope of the regression line is greater than
zero. Use the .05 significance level. Interpret your findings from parts (a), (b), and (c) in a brief report.
Answer:
a.
[Figure: Scatterplot of Age vs Maintenance Cost. Vertical axis: Maintenance Cost, 0 to 12000. Horizontal axis: Age, 0 to 16.]
From the diagram above, there appears to be a positive relationship between the two variables. The
relationship is direct, though it appears to be weak.
b. The slope is b = 603.2.
The intercept is a = ȳ - b x̄ = 4551.8875 - 603.2(6.9875) = 337.
The required regression equation is Maintenance cost = 337 + 603.2 (Age).
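Part (b) also asks what an additional year adds and what a 10-year-old bus is expected to cost. A minimal sketch, assuming the means and slope quoted above (the function name `predict_cost` is hypothetical), evaluates the fitted line directly:

```python
x_bar = 6.9875       # mean age of the buses (years), from the solution
y_bar = 4551.8875    # mean maintenance cost, from the solution
b = 603.2            # slope: added maintenance cost per additional year of age

# Intercept from a = y_bar - b * x_bar.
a = y_bar - b * x_bar

def predict_cost(age_years, intercept=337, slope=b):
    """Estimated maintenance cost using the rounded equation 337 + 603.2 * Age."""
    return intercept + slope * age_years

print(f"a = {a:.2f}")
print(f"10-year-old bus: {predict_cost(10):.1f}")
```

Under these figures each additional year adds about 603.2 to the estimated cost, and the estimate for a 10-year-old bus is 337 + 603.2(10), or about 6369.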
The null hypothesis is rejected at 0.05 level of significance since the P-value is less than the level
of significance. There is sufficient evidence to indicate that the slope of the regression line is
positive. The result is statistically significant.
CH14- 35. Refer to the Lincolnville School District bus data. First, add a variable to change the type of
engine (diesel or gasoline) to a qualitative variable. If the engine type is diesel, then set the qualitative
variable to 0. If the engine type is gasoline, then set the qualitative variable to 1. Develop a regression
equation using statistical software with maintenance cost as the dependent variable and age, odometer
miles, miles since last maintenance, and engine type as the independent variables.
a. Develop a correlation matrix. Which independent variables have strong or weak correlations with the
dependent variable? Do you see any problems with multicollinearity?
b. Use a statistical software package to determine the multiple regression equation. How did you select
the variables to include in the equation? How did you use the information from the correlation analysis?
Show that your regression equation shows a significant relationship. Write out the regression equation
and interpret its practical application. Report and interpret the R-square.
c. Develop a histogram or a stem-and-leaf display of the residuals from the final regression equation
developed in part (f). Is it reasonable to conclude that the normality assumption has been met?
d. Plot the residuals against the fitted values of Y from the final regression equation developed in part (f).
Plot the residuals on the vertical axis and the fitted values on the horizontal axis.
Answer:
a.
From the correlation matrix in Excel, the dependent variable, maintenance cost, has a strong
correlation with age. In addition, two independent variables, age and odometer miles, are strongly
correlated with each other, which may indicate a multicollinearity problem.
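The dummy coding and the correlation matrix can be sketched in plain Python. The rows below are HYPOTHETICAL and are not the Lincolnville data; they only illustrate the mechanics. The helper names `pearson` and `engine_dummy` are illustrative, and the "miles since last maintenance" variable is left out to keep the example short.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def engine_dummy(engine_type):
    """Code diesel as 0 and gasoline as 1, as the exercise specifies."""
    return 0 if engine_type.strip().lower() == "diesel" else 1

# HYPOTHETICAL rows for illustration only; not the Lincolnville data.
buses = [
    {"cost": 2100, "age": 3, "odometer": 31000, "engine": "Diesel"},
    {"cost": 3400, "age": 5, "odometer": 52000, "engine": "Gasoline"},
    {"cost": 4300, "age": 7, "odometer": 70000, "engine": "Diesel"},
    {"cost": 5200, "age": 8, "odometer": 83000, "engine": "Diesel"},
    {"cost": 6900, "age": 10, "odometer": 99000, "engine": "Gasoline"},
    {"cost": 7400, "age": 12, "odometer": 121000, "engine": "Diesel"},
]

columns = {
    "cost": [row["cost"] for row in buses],
    "age": [row["age"] for row in buses],
    "odometer": [row["odometer"] for row in buses],
    "engine": [engine_dummy(row["engine"]) for row in buses],
}

# Pairwise correlations between every pair of variables.
names = list(columns)
matrix = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

for a in names:
    print(f"{a:>9}", "  ".join(f"{matrix[a][b]:6.2f}" for b in names))
```

A large correlation between two independent variables in such a matrix, as between age and odometer miles here, is the usual warning sign for multicollinearity.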
c. For the frequency table, we need to determine the number of classes. Using the "2 to the k" rule,
k = 6 gives 2^6 = 64, which is smaller than n = 80 observations, so let k = 7, which gives 2^7 = 128
and is greater than n = 80. Next we need the class interval. Using i ≥ (H - L)/k, we get
i ≥ (2005.361283 - (-1611.329037))/7 = 516.67, so we can use 550 as the interval. Since the minimum value
is -1611.329037, the lower limit of the first class can be -1650.
Class   Residual interval      Frequency
1       -1650 up to -1100      3
2       -1100 up to -550       10
3       -550 up to 0           30
4       0 up to 550            23
5       550 up to 1100         10
6       1100 up to 1650        3
7       1650 up to 2200        1
Total                          80
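The class construction can be expressed as a short sketch. It assumes n = 80 residuals with the minimum and maximum quoted in the solution; the chosen interval of 550 and starting limit of -1650 are the solution's own rounding choices, and the function name `number_of_classes` is illustrative.

```python
def number_of_classes(n):
    """Smallest k such that 2**k exceeds n (the "2 to the k" rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

n = 80
low, high = -1611.329037, 2005.361283    # minimum and maximum residuals

k = number_of_classes(n)                 # 2**6 = 64 <= 80 < 128 = 2**7
min_interval = (high - low) / k          # smallest acceptable class width

# Rounded choices made in the solution.
interval, start = 550, -1650
limits = [start + interval * i for i in range(k + 1)]

frequencies = [3, 10, 30, 23, 10, 3, 1]  # counts from the frequency table

print(k, round(min_interval, 2))         # 7 516.67
print(limits)
print(sum(frequencies))                  # 80
```

The final limit of 2200 lies above the maximum residual, so the seven classes cover all 80 observations.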
d.
[Figure: Residuals against the fitted values. Vertical axis: Residuals, -2000 to 2500. Horizontal axis: Fitted Value, -2000 to 10000.]
There is no apparent pattern in the residuals, but the residual variation may be increasing
with larger fitted values.
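One informal way to look at whether the residual spread grows with the fitted values is to compare the residual variance in the lower and upper halves of the fitted range. The sketch below uses HYPOTHETICAL fitted values and residuals, not the Lincolnville output, and the function name `spread_by_half` is illustrative. It is a rough descriptive check, not a formal test of equal variance.

```python
from statistics import variance

def spread_by_half(fitted, residuals):
    """Sample variance of residuals for the lower and upper half of fitted values."""
    pairs = sorted(zip(fitted, residuals))   # order observations by fitted value
    mid = len(pairs) // 2
    low_half = [res for _, res in pairs[:mid]]
    high_half = [res for _, res in pairs[mid:]]
    return variance(low_half), variance(high_half)

# HYPOTHETICAL values for illustration only.
fitted = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
residuals = [-100, 120, -150, 130, -600, 700, -900, 800]

low_var, high_var = spread_by_half(fitted, residuals)
print(f"lower half variance: {low_var:.0f}")
print(f"upper half variance: {high_var:.0f}")
```

A much larger variance in the upper half, as in this made-up example, would be consistent with residual variation that increases with the fitted values.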