
Exercises: 03758 - 13 - ch12 - p482-529.qxd 9/7/11 1:06 PM Page 517

The document covers correlation analysis and linear regression. It gives examples of calculating correlation coefficients and interpreting their values, notes that the squared correlations of two predictors with y cannot simply be added to obtain the reduction in variation achieved by using both predictors together, and closes with exercises on these topics.

Uploaded by

Thanh Nhi

CHAPTER 12 LINEAR REGRESSION AND CORRELATION

12.8 CORRELATION ANALYSIS

If the linear coefficients of correlation between $y$ and each of two variables $x_1$ and $x_2$ are calculated to be .4 and .5, respectively, it does not follow that a predictor using both variables will account for $[(.4)^2 + (.5)^2] = .41$, or a 41% reduction in the sum of squares of deviations. Actually, $x_1$ and $x_2$ might be highly correlated and therefore contribute virtually the same information for the prediction of $y$.

Finally, remember that $r$ is a measure of linear correlation and that $x$ and $y$ could be perfectly related by some nonlinear function when the observed value of $r$ is equal to 0. The problem of estimating or predicting $y$ using information given by several independent variables, $x_1, x_2, \ldots, x_k$, is the subject of Chapter 13.
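
The two cautions above can be checked numerically. The sketch below is illustrative only and is not taken from the text. It relies on the standard identity for the fraction of variation explained by two predictors, $R^2 = (r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}) / (1 - r_{12}^2)$, where $r_{12}$ is the correlation between $x_1$ and $x_2$. The value $r_{12} = .8$ and the data set $y = x^2$ are hypothetical choices, and the function names are made up for this example.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Fraction of variation in y explained by a least-squares fit on x1 and x2,
    computed from the three pairwise correlations (standard identity)."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Caution 1: the squared correlations add up to .41 only if x1 and x2 are uncorrelated.
r2_uncorrelated = r_squared_two_predictors(0.4, 0.5, 0.0)
# Hypothetical case: x1 and x2 are highly correlated (r_12 = .8).
r2_correlated = r_squared_two_predictors(0.4, 0.5, 0.8)

# Caution 2: y is an exact (nonlinear) function of x, yet r is 0.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]
r_nonlinear = pearson_r(x, y)

print(f"R^2 with r_12 = 0:  {r2_uncorrelated:.2f}")
print(f"R^2 with r_12 = .8: {r2_correlated:.2f}")
print(f"r for y = x^2:      {r_nonlinear:.2f}")
```

With $r_{12} = .8$ the two predictors together explain about 25% of the variation, no more than $x_2$ explains alone, which is the situation the paragraph describes.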

12.8 EXERCISES

BASIC TECHNIQUES

12.47 How does the coefficient of correlation measure the strength of the linear relationship between two variables $y$ and $x$?

12.48 Describe the significance of the algebraic sign and the magnitude of $r$.

12.49 What value does $r$ assume if all the data points fall on the same straight line in these cases?
a. The line has positive slope.
b. The line has negative slope.

12.50 You are given these data:

    x   -2   -1    0    1    2
    y    2    2    3    4    4

a. Plot the data points. Based on your graph, what will be the sign of the sample correlation coefficient?
b. Calculate $r$ and $r^2$ and interpret their values.

12.51 You are given these data:

    x    1    2    3    4    5    6
    y    7    5    5    3    2    0

a. Plot the six points on graph paper.
b. Calculate the sample coefficient of correlation $r$ and interpret.
c. By what percentage was the sum of squares of deviations reduced by using the least-squares predictor $\hat{y} = a + bx$ rather than $\bar{y}$ as a predictor of $y$?

12.52 Reverse the slope of the line in Exercise 12.51 by reordering the $y$ observations, as follows:

    x    1    2    3    4    5    6
    y    0    2    3    5    5    7

Repeat the steps of Exercise 12.51. Notice the change in the sign of $r$ and the relationship between the values of $r^2$ of Exercise 12.51 and this exercise.

APPLICATIONS

12.53 Lobster (EX1253) The table gives the numbers of Octolasmis tridens and O. lowei barnacles on each of 10 lobsters.⁸ Does it appear that the barnacles compete for space on the surface of a lobster?

    Lobster Field Number   O. tridens   O. lowei
    AO61                      645           6
    AO62                      320          23
    AO66                      401          40
    AO70                      364           9
    AO67                      327          24
    AO69                       73           5
    AO64                       20          86
    AO68                      221           0
    AO65                        3         109
    AO63                        5         350

a. If they do compete, do you expect the number $x$ of O. tridens and the number $y$ of O. lowei barnacles to be positively or negatively correlated? Explain.
b. If you want to test the theory that the two types of barnacles compete for space by conducting a test of the null hypothesis "the population correlation coefficient $\rho$ equals 0," what is your alternative hypothesis?
c. Conduct the test in part b and state your conclusions.

12.54 Social Skills Training (EX1254) A social skills training program was implemented with seven mildly challenged students in a study to determine whether the program caused improvement in pre/post
measures and behavior ratings. For one such test, the pre- and posttest scores for the seven students are given in the table.⁹

    Subject    Pretest   Posttest
    Earl         101       113
    Ned           89        89
    Jasper       112       121
    Charlie      105        99
    Tom           90       104
    Susie         91        94
    Lori          89        99

a. What type of correlation, if any, do you expect to see between the pre- and posttest scores? Plot the data. Does the correlation appear to be positive or negative?
b. Calculate the correlation coefficient, $r$. Is there a significant positive correlation?

12.55 Hockey G. W. Marino investigated the variables related to a hockey player's ability to make a fast start from a stopped position.¹⁰ In the experiment, each skater started from a stopped position and attempted to move as rapidly as possible over a 6-meter distance. The correlation coefficient $r$ between a skater's stride rate (number of strides per second) and the length of time to cover the 6-meter distance for the sample of 69 skaters was $-.37$.
a. Do the data provide sufficient evidence to indicate a correlation between stride rate and time to cover the distance? Test using $\alpha = .05$.
b. Find the approximate p-value for the test.
c. What are the practical implications of the test in part a?

12.56 Hockey II Refer to Exercise 12.55. Marino calculated the sample correlation coefficient $r$ for the stride rate and the average acceleration rate for the 69 skaters to be .36. Do the data provide sufficient evidence to indicate a correlation between stride rate and average acceleration for the skaters? Use the p-value approach.

12.57 Geothermal Power (EX1257) Geothermal power is an important source of energy. Since the amount of energy contained in 1 pound of water is a function of its temperature, you might wonder whether water obtained from deeper wells contains more energy per pound. The data in the table are reproduced from an article on geothermal systems by A.J. Ellis.¹¹

    Location of Well              Average (max.)         Average (max.)
                                  Drill Hole Depth (m)   Temperature (°C)
    El Tateo, Chile                      650                  230
    Ahuachapan, El Salvador             1000                  230
    Namafjall, Iceland                  1000                  250
    Larderello (region), Italy           600                  200
    Matsukawa, Japan                    1000                  220
    Cerro Prieto, Mexico                 800                  300
    Wairakei, New Zealand                800                  230
    Kizildere, Turkey                    700                  190
    The Geysers, United States          1500                  250

Is there a significant positive correlation between average maximum drill hole depth and average maximum temperature?

12.58 Ice Cream, Anyone? (EX1258) As much as Americans try to avoid high fat, high calorie foods, the demand for a cold, creamy ice cream cone on a hot day is hard to resist. The popular ice cream franchise Coldstone Creamery posted the nutritional information for its ice cream offerings in three serving sizes ("Like it", "Love it", and "Gotta Have it") on their website.¹² A portion of that information for the "Like it" serving size is shown in the table.

    Flavor                   Calories   Total Fat (grams)
    Cake Batter                340            19
    Cinnamon Bun               370            21
    French Toast               330            19
    Mocha                      320            20
    OREO® Crème                440            31
    Peanut Butter              370            24
    Strawberry Cheesecake      320            21

a. Should you use the methods of linear regression analysis or correlation analysis to analyze the data? Explain.
b. Analyze the data to determine the nature of the relationship between total fat and calories in Coldstone Creamery ice cream.

12.59 Body Temperature and Heart Rate (EX1259) Is there any relationship between these two variables? To find out, we randomly selected 12 people from a data set constructed by Allen Shoemaker (Journal of Statistics Education) and recorded their body temperature and heart rate.¹³

    Person                            1      2      3      4      5      6
    Temperature (degrees)           96.3   97.4   98.9   99.0   99.0   96.8
    Heart Rate (beats per minute)    70     68     80     75     79     75

    Person                            7      8      9     10     11     12
    Temperature (degrees)           98.4   98.4   98.8   98.8   99.2   99.3
    Heart Rate (beats per minute)    74     84     73     84     66     68

a. Find the correlation coefficient $r$, relating body temperature to heart rate.
b. Is there sufficient evidence to indicate that there is a correlation between these two variables? Test at the 5% level of significance.

12.60 Baseball Stats (EX1260) Does a team's batting average depend in any way on the number of home runs hit by the team? The data in the table show the number of team home runs and the overall team batting average for eight selected major league teams for the 2010 season.¹⁴

    Team                    Total Home Runs   Team Batting Average
    Atlanta Braves               139                .258
    Baltimore Orioles            133                .259
    Boston Red Sox               211                .268
    Chicago White Sox            177                .268
    Houston Astros               108                .247
    LA Dodgers                   120                .252
    Philadelphia Phillies        166                .260
    Seattle Mariners             101                .236
    Source: ESPN.com

a. Plot the points using a scatterplot. Does it appear that there is any relationship between total home runs and team batting average?
b. Is there a significant positive correlation between total home runs and team batting average? Test at the 5% level of significance.
c. Do you think that the relationship between these two variables would be different if we had looked at the entire set of major league franchises?

CHAPTER REVIEW

Key Concepts and Formulas


I. A Linear Probabilistic Model
1. When the data exhibit a linear relationship, the appropriate model is $y = \alpha + \beta x + \epsilon$.
2. The random error $\epsilon$ has a normal distribution with mean 0 and variance $\sigma^2$.

II. Method of Least Squares
1. Estimates $a$ and $b$, for $\alpha$ and $\beta$, are chosen to minimize SSE, the sum of squared deviations about the regression line, $\hat{y} = a + bx$.
2. The least-squares estimates are $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$.

III. Analysis of Variance
1. Total SS $=$ SSR $+$ SSE, where Total SS $= S_{yy}$ and SSR $= (S_{xy})^2/S_{xx}$.
2. The best estimate of $\sigma^2$ is MSE $=$ SSE$/(n - 2)$.

IV. Testing, Estimation, and Prediction
1. A test for the significance of the linear regression, $H_0: \beta = 0$, can be implemented using one of two test statistics:
   $t = \dfrac{b}{\sqrt{\text{MSE}/S_{xx}}}$ or $F = \dfrac{\text{MSR}}{\text{MSE}}$
2. The strength of the relationship between $x$ and $y$ can be measured using
   $r^2 = \dfrac{\text{SSR}}{\text{Total SS}}$
   which gets closer to 1 as the relationship gets stronger.
3. Use residual plots to check for nonnormality, inequality of variances, or an incorrectly fitted model.
4. Confidence intervals can be constructed to estimate the intercept $\alpha$ and slope $\beta$ of the regression line and to estimate the average value of $y$, $E(y)$, for a given value of $x$.
5. Prediction intervals can be constructed to predict a particular observation, $y$, for a given value of $x$. For a given $x$, prediction intervals are always wider than confidence intervals.
