
Causality: Causes y

The document discusses linear regression and correlation. It explains that while a significant regression relationship between variables x and y is observed, other unknown variables not included in the analysis could actually be causing the relationship. It then provides exercises involving linear regression analysis on various data sets relating different variables.

Uploaded by

Thanh Nhi

03758_13_ch12_p482-529.qxd 9/7/11 1:06 PM Page 500

500 ❍ CHAPTER 12 LINEAR REGRESSION AND CORRELATION

Causality
When there is a significant regression of y and x, it is tempting to conclude that x
causes y. However, it is possible that one or more unknown variables that you have
not even measured and that are not included in the analysis may be causing the observed
relationship. In general, the statistician reports the results of an analysis but
leaves conclusions concerning causality to scientists and investigators who are experts
in these areas. These experts are better prepared to make such decisions!
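
The point about unmeasured variables can be illustrated with a small simulation. The sketch below is not part of the text: the lurking variable z, the sample size, the unit coefficients, and the random seed are all assumptions chosen for illustration. Both x and y are generated from z alone, so x has no effect on y, yet the regression of y on x still gives a large t statistic. Once the part of each variable explained by z is removed, the slope is close to zero.

```python
import math
import random


def fit_line(x, y):
    """Least-squares fit of y on x; returns (intercept, slope, t statistic for the slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = s_yy - slope * s_xy           # residual sum of squares
    mse = sse / (n - 2)                 # estimate of the error variance
    t_stat = slope / math.sqrt(mse / s_xx)
    return intercept, slope, t_stat


def residuals(x, y):
    """Residuals of y after a least-squares fit on x."""
    a, b, _ = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


rng = random.Random(0)                    # arbitrary fixed seed (assumption)
n = 500                                   # arbitrary sample size (assumption)
z = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical unmeasured variable
x = [zi + rng.gauss(0, 1) for zi in z]    # x is driven by z only
y = [zi + rng.gauss(0, 1) for zi in z]    # y is driven by z only, not by x

# Regression of y on x with z ignored.
_, slope_raw, t_raw = fit_line(x, y)

# Regression of y on x after removing what z explains in each variable.
_, slope_adj, _ = fit_line(residuals(z, x), residuals(z, y))

print(f"z ignored:       slope = {slope_raw:.3f}, t = {t_raw:.2f}")
print(f"z accounted for: slope = {slope_adj:.3f}")
```

A significant slope in the first fit says only that x and y move together; it cannot say why.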

12.5 EXERCISES

BASIC TECHNIQUES

12.19 Refer to Exercise 12.6. The data are reproduced below.

x   -2   -1    0    1    2
y    1    1    3    5    5

a. Do the data present sufficient evidence to indicate that y and x are linearly related? Test the hypothesis that β = 0 at the 5% level of significance.
b. Use the ANOVA table from Exercise 12.6 to calculate F = MSR/MSE. Verify that the square of the t statistic used in part a is equal to F.
c. Compare the two-tailed critical value for the t-test in part a with the critical value for F with α = .05. What is the relationship between the critical values?

12.20 Refer to Exercise 12.19. Find a 95% confidence interval for the slope of the line. What does the phrase “95% confident” mean?

12.21 Refer to Exercise 12.7. The data, along with the MINITAB analysis of variance table are reproduced below.

x   1     2     3     4     5     6
y   5.6   4.6   4.5   3.7   3.2   2.7

MINITAB ANOVA table for Exercise 12.21

Regression Analysis: y versus x

Analysis of Variance
Source           DF   SS       MS       F        P
Regression        1   5.4321   5.4321   152.10   0.000
Residual Error    4   0.1429   0.0357
Total             5   5.5750

a. Do the data provide sufficient evidence to indicate that y and x are linearly related? Use the information in the MINITAB printout to answer this question at the 1% level of significance.
b. Calculate the coefficient of determination r². What information does this value give about the usefulness of the linear model?

12.22 Refer to Exercise 12.8. The data, along with the MS Excel analysis of variance table are reproduced below:

x   1     2     3     4     5     6
y   9.7   6.5   6.4   4.1   2.1   1.0

MS Excel ANOVA table for Exercise 12.22

ANOVA
             df   SS         MS         F        Significance F
Regression    1   49.72857   49.72857   111.45   0.000
Residual      4   1.78476    0.4462
Total         5   51.51333

a. Do the data provide sufficient evidence to indicate that y and x are linearly related? Use the information in the printout to answer this question at the 5% level of significance.
b. Calculate the coefficient of determination r². What information does this value give about the usefulness of the linear model?

APPLICATIONS

12.23 Chirping Crickets (data set EX1223) In Exercise 3.18, we found that male crickets chirp by rubbing their front wings together, and their chirping is temperature dependent. The table below shows the number of chirps per second for a cricket, recorded at 10 different temperatures:

Chirps per Second   20   16   19   18   18   16   14   17   15   16
Temperature         88   73   91   85   82   75   69   82   69   83

a. Use the formulas given in this chapter to find the least-squares regression line relating the number of chirps to temperature. Compare to the results obtained in Exercise 3.18.
b. Do the data provide sufficient evidence to indicate that there is a linear relationship between number of chirps and temperature?
c. Calculate r². What does this value tell you about the effectiveness of the linear regression analysis?
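
The relationship asked about in Exercise 12.19, parts b and c, can be checked numerically. The following is a minimal sketch using only the five data points reproduced in that exercise; the variable names are illustrative, and the two critical values are taken from standard t and F tables for 3 error degrees of freedom rather than computed.

```python
import math

# Data reproduced in Exercise 12.19
x = [-2, -1, 0, 1, 2]
y = [1, 1, 3, 5, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
total_ss = sum((yi - y_bar) ** 2 for yi in y)

b = s_xy / s_xx                 # least-squares slope
a = y_bar - b * x_bar           # least-squares intercept

ssr = b * s_xy                  # regression sum of squares (1 df)
sse = total_ss - ssr            # error sum of squares (n - 2 df)
mse = sse / (n - 2)

se_b = math.sqrt(mse / s_xx)    # standard error of the slope
t_stat = b / se_b               # t statistic for H0: beta = 0
f_stat = ssr / mse              # F = MSR / MSE, with MSR = SSR / 1

# Tabled critical values (assumed from standard tables, 3 error df)
t_crit = 3.182                  # two-tailed t, alpha = .05
f_crit = 10.13                  # F with 1 and 3 df, alpha = .05

ci = (b - t_crit * se_b, b + t_crit * se_b)   # 95% interval for the slope

print(f"slope = {b:.3f}, intercept = {a:.3f}")
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
print(f"t_crit^2 = {t_crit ** 2:.3f}, F_crit = {f_crit}")
print(f"95% CI for slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

For these data the script gives F = 27 and t of about 5.196, so t² = F, and the squared t critical value agrees with the F critical value up to table rounding. The two tests therefore always reach the same conclusion in simple linear regression.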

12.5 TESTING THE USEFULNESS OF THE LINEAR REGRESSION MODEL ❍ 501

12.24 Gestation Times and Longevity (data set EX1224) The table below, a subset of the data given in Exercise 3.33, shows the gestation time in days and the average longevity in years for a variety of mammals in captivity.⁴

Animal             Gestation (days)   Avg Longevity (yrs)
Baboon             187                20
Bear (black)       219                18
Bison              285                15
Cat (domestic)      63                12
Elk                250                15
Fox (red)           52                 7
Goat (domestic)    151                 8
Gorilla            258                20
Horse              330                20
Monkey (rhesus)    166                15
Mouse (meadow)      21                 3
Pig (domestic)     112                10
Puma                90                12
Sheep (domestic)   154                12
Wolf (maned)        63                 5

a. If you want to estimate the average longevity of an animal based on its gestation time, which variable is the response variable and which is the independent predictor variable?
b. Assume that there is a linear relationship between gestation time and longevity. Calculate the least-squares regression line describing longevity as a linear function of gestation time.
c. Plot the data points and the regression line. Does it appear that the line fits the data?
d. Use the appropriate statistical tests and measures to explain the usefulness of the regression model for predicting longevity.

12.25 Professor Asimov, continued Refer to the data in Exercise 12.9, relating x, the number of books written by Professor Isaac Asimov, to y, the number of months he took to write his books (in increments of 100). The data are reproduced below.

Number of Books, x   100   200   300   400   490
Time in Months, y    237   350   419   465   507

a. Do the data support the hypothesis that β = 0? Use the p-value approach, bounding the p-value using Table 4 of Appendix I. Explain your conclusions in practical terms.
b. Use the ANOVA table in Exercise 12.9, part c, to calculate the coefficient of determination r². What percentage reduction in the total variation is achieved by using the linear regression model?
c. Plot the data or refer to the plot in Exercise 12.9, part b. Do the results of parts a and b indicate that the model provides a good fit for the data? Are there any assumptions that may have been violated in fitting the linear model?

12.26 Refer to the sleep deprivation experiment described in Exercises 12.11 and 12.12 and data set EX1211. The data and the MINITAB and MS Excel printout are reproduced here.

Number of Errors, y                8, 6   6, 10   8, 14   14, 12   16, 12
Number of Hours without Sleep, x   8      12      16      20       24

MINITAB output for Exercise 12.26

Regression Analysis: y versus x

The regression equation is
y = 3.00 + 0.475 x

Predictor   Coef     SE Coef   T      P
Constant    3.000    2.127     1.41   0.196
x           0.4750   0.1253    3.79   0.005

S = 2.24165   R-Sq = 64.2%   R-Sq(adj) = 59.8%

Analysis of Variance
Source           DF   SS        MS       F       P
Regression        1   72.200    72.200   14.37   0.005
Residual Error    8   40.200    5.025
Total             9   112.400

MS Excel output for Exercise 12.26

ANOVA
             df   SS      MS      F        Significance F
Regression    1   72.2    72.2    14.368   0.005
Residual      8   40.2    5.025
Total         9   112.4

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   3              2.1266           1.4107   0.1960    -1.9040     7.9040
x           0.475          0.1253           3.7905   0.0053    0.1860      0.7640

a. Do the data present sufficient evidence to indicate that the number of errors is linearly related to the number of hours without sleep? Identify the two test statistics in the printout that can be used to answer this question.
b. Would you expect the relationship between y and x to be linear if x varied over a wider range (say, x = 4 to x = 48)?

c. How do you describe the strength of the relationship between y and x?
d. What is the best estimate of the common population variance σ²?
e. Find a 95% confidence interval for the slope of the line.

12.27 Strawberries II The following data (Exercise 12.18 and data set EX1218) were obtained in an experiment relating the dependent variable, y (texture of strawberries), with x (coded storage temperature). Use the information from Exercise 12.18 to answer the following questions:

x   -2    -2    0     2     2
y   4.0   3.5   2.0   0.5   0.0

a. What is the best estimate of σ², the variance of the random error ε?
b. Do the data indicate that texture and storage temperature are linearly related? Use α = .05.
c. Calculate the coefficient of determination, r².
d. Of what value is the linear model in increasing the accuracy of prediction as compared to the predictor, ȳ?

12.28 Laptops and Learning (data set EX1228) In Exercise 1.61 we described an informal experiment conducted at McNair Academic High School in Jersey City, New Jersey. Two freshman algebra classes were studied, one of which used laptop computers at school and at home, while the other class did not. In each class, students were given a survey at the beginning and end of the semester, measuring their technological level. The scores were recorded for the end of semester survey (x) and the final examination (y) for the laptop group.⁵ The data and the MINITAB printout are shown here.

Student   Posttest   Final Exam      Student   Posttest   Final Exam
 1        100         98             11        88         84
 2         96         97             12        92         93
 3         88         88             13        68         57
 4        100        100             14        84         84
 5        100        100             15        84         81
 6         96         78             16        88         83
 7         80         68             17        72         84
 8         68         47             18        88         93
 9         92         90             19        72         57
10         96         94             20        88         83

MINITAB output for Exercise 12.28

Regression Analysis: y versus x

The regression equation is
y = -26.8 + 1.26 x

Predictor   Coef     SE Coef   T       P
Constant    -26.82   14.76     -1.82   0.086
x           1.2617   0.1685    7.49    0.000

S = 7.61912   R-Sq = 75.7%   R-Sq(adj) = 74.3%

Analysis of Variance
Source           DF   SS       MS       F       P
Regression        1   3254.0   3254.0   56.05   0.000
Residual Error   18   1044.9   58.1
Total            19   4299.0

a. Construct a scatterplot for the data. Does the assumption of linearity appear to be reasonable?
b. What is the equation of the regression line used for predicting final exam score as a function of the posttest score?
c. Do the data present sufficient evidence to indicate that final exam score is linearly related to the posttest score? Use α = .01.
d. Find a 99% confidence interval for the slope of the regression line.

12.29 Laptops and Learning, continued Refer to Exercise 12.28.

a. Use the MINITAB printout to find the value of the coefficient of determination, r². Show that r² = SSR/Total SS.
b. What percentage reduction in the total variation is achieved by using the linear regression model?

12.30 Armspan and Height II In Exercise 12.17 (data set EX1217), we measured the armspan and height of eight people with the following results:

Person             1    2       3    4
Armspan (inches)   68   62.25   65   69.5
Height (inches)    69   62      65   70

Person             5    6    7    8
Armspan (inches)   68   69   62   60.25
Height (inches)    67   67   63   62

a. Do the data provide sufficient evidence to indicate that there is a linear relationship between armspan and height? Test at the 5% level of significance.
b. Construct a 95% confidence interval for the slope of the line of means, β.
c. If Leonardo da Vinci is correct, and a person’s armspan is roughly the same as the person’s height, the slope of the regression line is approximately equal to 1. Is this supposition confirmed by the confidence interval constructed in part b? Explain.
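
The interval reasoning in Exercise 12.30, parts b and c, can be sketched in a few lines. This is one possible way to organize the calculation, not the text's own solution: it assumes armspan is taken as x and height as y, uses the eight data pairs listed in the exercise, and takes the critical value t = 2.447 (two-tailed, α = .05, 6 df) from a standard t table. The variable names are illustrative.

```python
import math

# Data from Exercise 12.30 (armspan as x, height as y is an assumption)
armspan = [68, 62.25, 65, 69.5, 68, 69, 62, 60.25]
height = [69, 62, 65, 70, 67, 67, 63, 62]
n = len(armspan)

x_bar = sum(armspan) / n
y_bar = sum(height) / n
s_xx = sum((xi - x_bar) ** 2 for xi in armspan)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(armspan, height))
s_yy = sum((yi - y_bar) ** 2 for yi in height)

b = s_xy / s_xx                       # estimated slope
sse = s_yy - b * s_xy                 # error sum of squares
mse = sse / (n - 2)                   # estimate of the error variance
se_b = math.sqrt(mse / s_xx)          # standard error of the slope
t_stat = b / se_b                     # test statistic for H0: beta = 0

t_crit = 2.447                        # tabled t value, alpha/2 = .025, 6 df
ci_low = b - t_crit * se_b
ci_high = b + t_crit * se_b

print(f"slope = {b:.4f}, SE = {se_b:.4f}, t = {t_stat:.2f}")
print(f"95% CI for slope: ({ci_low:.3f}, {ci_high:.3f})")
print("interval contains 1:", ci_low < 1 < ci_high)
```

If the interval contains 1, the data are consistent with a slope of 1 but do not prove it; any other value inside the interval is equally compatible with the sample.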
