Chapter 3 Notes 2024 2025 PDF

Chapter 3 discusses the analysis of relationships between quantitative variables using scatterplots, correlation, and least-squares regression. It emphasizes the importance of visualizing data, identifying explanatory and response variables, and interpreting the strength and direction of relationships. The chapter also covers practical applications, such as predicting outcomes based on regression models and evaluating the appropriateness of linear models using residual plots.

Uploaded by

Nancy Skocik

Chapter 3: Describing Relationships

3.1 Scatterplots and Correlation

3.2 Least-Squares Regression

Name___________________________________
What information do the park rangers at Yellowstone use to predict when Old Faithful’s next eruption will be?

Identify the explanatory and the response variables in this situation.

What kind of graph do we use to show the relationship between two quantitative variables?

Here is a scatterplot that plots the interval between consecutive eruptions of Old Faithful against the duration of the
previous eruption, for the month prior to the Starnes visit.

1. Describe the direction of the relationship.

2. What form does the relationship take?

3. How strong is the relationship?

4. Are there any unusual features?

5. If the previous eruption lasted 3 minutes and 42 seconds, how long do you think it would be until the next eruption?

3.1 Describing Relationships (Read pgs. 143-149)
In Chapter 1, we explored the relationships between categorical variables. In this chapter we investigate the
relationships between two quantitative variables.

The principles that guide our work remain the same:

• Plot the data, then add numerical summaries.

• Look for overall patterns and departures from those patterns.
• When there’s a regular overall pattern, use a simplified model to describe it.

A response variable _____________________________________________________________________.

An explanatory variable _____________________________________________________________________________

_____________________________________________.

Check Your Understanding

Identify the explanatory and response variables in each setting.

1. How does drinking beer affect the level of alcohol in people’s blood? The legal limit for driving in all states is 0.08%.
In a study, adult volunteers drank different numbers of cans of beer. Thirty minutes later, a police officer measured
their blood alcohol levels.

2. The National Student Loan Survey provides data on the amount of debt for recent college graduates, their current
income, and how stressed they feel about college debt. A sociologist looks at the data with the goal of using amount
of debt and income to explain the stress caused by college debt.

A ___________________________________ shows the relationship between two quantitative variables measured on

the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable
appear on the vertical axis. Each individual in the data appears as a point in the graph.

Always plot the ___________________________________________, if there is one, on the horizontal axis (the x-axis) of
a scatterplot. Note: you don’t always start at (0, 0).

What is the easiest way to lose points when making a scatterplot? Failure to label your axes.

What four characteristics should you consider when interpreting a scatterplot?

The following scatterplot displays the average number of points scored per game and the number of wins for college
football teams in the Southeastern Conference. Describe what the scatterplot reveals about the relationship between
points per game and wins.

Study the scatterplots below and evaluate the direction and strength of each relationship. Complete the table below.

            Strong    Moderate    Weak
Negative
Positive

[Scatterplots A through F]

3.1 Measuring Linear Association: Correlation

The correlation r measures the ________________________________________________________________________

between two quantitative variables.

Facts about Correlation

• The correlation r is always between ______ and ______.

• r > 0 indicates a _______________ relationship.

• r < 0 indicates a _______________ relationship.

• Values near 0 indicate a ____________ linear relationship.

• Values near 1 or -1 indicate a _______________ linear relationship.

1. The table shows the weight (in pounds) and cost (in dollars) of a sample of 11 stand mixers (from Consumer Reports, November 2005).

Weight (lb)   Price ($)
23            180
28            250
19            300
17            150
25            300
26            370
21            400
32            350
16            200
17            150
8             30

a. Enter the data into List 1 and List 2 of your calculator, sketch a scatterplot of the data, and describe the scatterplot. (Discuss strength, form, direction, and outliers in context.)

b. Follow the steps below to calculate the correlation.

1. Select MODE and make sure your stats diagnostics are turned on.

2. To calculate the correlation, r, use the following keystrokes: STAT, CALC, 8:LinReg(a+bx)

r = ______

c. The last mixer in the table is from Walmart; put an x through this point. What happens to the correlation when
you remove the point?

d. What happens to the correlation if the Walmart mixer weighs 25 pounds instead of 8 pounds? Add the point
(25, 30) and recalculate the correlation. (Make it a star)

e. What happens to the correlation if the Walmart mixer weighs 25 pounds and costs $310? Add the point (25,
310) and recalculate the correlation.

f. Suppose that a new titanium mixer was introduced that weighed 8 pounds, but the cost was $500. Remove the
point (25, 30) and add the point (8, 500), circle it, and recalculate the correlation.

g. When a point is added that is far away from the other points but still fits the linear pattern what happens to the
correlation?

h. When a point is added that is far away from the other points and doesn’t fit the linear pattern what happens to
the correlation?
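
The point-by-point experiments in parts (c) and (d) can also be checked outside the calculator. The sketch below assumes the usual definition of correlation as the sum of the products of z-scores divided by n - 1. The function name `correlation` and the variable names are illustrative and do not come from these notes.

```python
import math

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = [((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Stand mixer data from the table: weight (lb) and price ($).
weight = [23, 28, 19, 17, 25, 26, 21, 32, 16, 17, 8]
price = [180, 250, 300, 150, 300, 370, 400, 350, 200, 150, 30]

r_all = correlation(weight, price)                             # all 11 mixers
r_without = correlation(weight[:-1], price[:-1])               # part (c): drop (8, 30)
r_moved = correlation(weight[:-1] + [25], price[:-1] + [30])   # part (d): use (25, 30)

print(round(r_all, 3), round(r_without, 3), round(r_moved, 3))
```

Swapping in the points from parts (e) and (f) works the same way: replace the last pair and call `correlation` again.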

The formula for correlation is as follows:

2. Would switching the explanatory and response variables change the correlation?

3. Would switching the units (for example feet to inches) of one or both of the variables change the correlation?

4. a. For the data below, is the relationship between x and y linear?

x    y
1    1
2    4
3    9
4    16
5    25

b. Find the correlation.

c. Does a correlation close to one always imply that the relationship is linear?
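
Questions 2 through 4 can be explored numerically. This is a minimal sketch, assuming the standard deviation-based form of the correlation formula. The helper `correlation` is an illustrative name, and the factor of 12 stands in for a feet-to-inches conversion.

```python
import math

def correlation(xs, ys):
    """Correlation r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # y = x^2, a curved relationship

r_xy = correlation(x, y)
r_yx = correlation(y, x)                         # question 2: swap the roles of x and y
r_scaled = correlation([12 * v for v in x], y)   # question 3: change the units of x

print(r_xy, r_yx, r_scaled)
```

All three values agree, and the shared value is close to 1 even though the points lie on a curve, which is the point of question 4c.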

3.2 Least-Squares Regression
How Much Is That Truck Worth?

Everyone knows that cars and trucks lose value the more they are driven. Can we predict the price of a used Ford F-150
SuperCrew 4 × 4 if we know how many miles it has on the odometer? A random sample of 16 used Ford F-150
SuperCrew 4 × 4s was selected from among those listed for sale at autotrader.com. The number of miles driven and
price (in dollars) were recorded for each of the trucks. Here are the data:

1. Identify the explanatory variable.

2. Identify the response variable.

3. Sketch a scatterplot of the data. Then describe what the scatterplot reveals about the relationship between miles
driven and price.

Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the
horizontal axis). A regression line relating y to x has an equation of the form

$$\hat{y} = a + bx$$
In this equation,

• _______ (read “y hat”) is the ______________________ value of the response variable y for a given value of
the explanatory variable x.

• b is the _____________, the amount by which y is predicted to change when x increases by one unit.

• a is the ______________, the predicted value of y when x = 0.

4. Calculate the regression equation for predicting the price of a Ford F-150 based on number of miles driven.

5. Identify and interpret the slope.

6. Identify and interpret the y-intercept.

7. Predict the price of a Ford F-150 that has been driven 100,000 miles.

8. Can we predict the price of a Ford F-150 with 300,000 miles driven?

Residuals and the Least Squares Regression Line

A ___________________ is the difference between an observed value of the response variable and the value
predicted by the regression line. That is,

residual = ____________________________________________

Find and interpret the residual for the Ford F-150 that
had 70,583 miles driven and a price of $21,994.

The least-squares regression line of y on x is the line that makes ______________________________________

_____________________________________________________________________________________________
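
The definitions of a residual and of the least-squares line can be illustrated with code. Because the truck data table is not reproduced in these notes, the sketch below reuses the stand mixer data from Section 3.1. It assumes the standard closed-form solution for the slope and intercept. The function names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using the closed-form solution."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed y - predicted y)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Stand mixer data: weight (lb) and price ($).
weight = [23, 28, 19, 17, 25, 26, 21, 32, 16, 17, 8]
price = [180, 250, 300, 150, 300, 370, 400, 350, 200, 150, 30]

a, b = least_squares_line(weight, price)

# Residual = observed y - predicted y, here for the first mixer (23 lb, $180).
residual_first = price[0] - (a + b * weight[0])

best = sum_squared_residuals(a, b, weight, price)
# Nudging the intercept or slope in either direction makes the total larger.
nudged = [sum_squared_residuals(a + da, b + db, weight, price)
          for da, db in [(5, 0), (-5, 0), (0, 0.5), (0, -0.5)]]

print(a, b, residual_first, best, nudged)
```

Four nudges do not prove the line is optimal, but they show what "minimizes the sum of squared residuals" means in practice.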

Determining Whether a Linear Model Is Appropriate: Residual Plots

A residual plot is a scatterplot of the residuals against the explanatory variable. Residual plots help us assess whether a
linear model is appropriate. When an obvious curved pattern exists in a residual plot, the linear model is not
appropriate.

[Scatterplot and residual plot 1] Because there is only random scatter in the residual plot, we know the linear model is
appropriate.

[Scatterplot and residual plot 2] There is a definite curved pattern in the residual plot. A linear model is not appropriate.

Construct and interpret a residual plot for the Ford F-150 data. Your calculator calculates the residuals for you AFTER
finding the regression equation. Set up your Stat Plot as shown, then select 9:ZoomStat. (To access the list of residuals
press 2nd STAT)

Sketch the residual plot. Does the linear model seem appropriate? Explain.

A residual plot is a graphical tool for determining if a least-squares regression line is an appropriate model for a
relationship between two variables. Once we have determined that a least-squares regression line is appropriate, it
makes sense to ask: How well does the line work? If we use the least-squares regression line to make predictions, how
good will these predictions be?

If we use a least-squares line to predict the values of a response variable y from an explanatory variable x, the standard
deviation of the residuals (s) is given by

$$s = \sqrt{\frac{\sum \text{residuals}^2}{n-2}} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}}$$

Calculate the standard deviation of the residuals for the F-150 data and interpret this value in context.

There is another numerical quantity that tells us how well the least-squares line predicts the values of the response
variable y. It is r², the coefficient of determination.

Suppose that we randomly selected an additional used Ford F-150 that was on sale. What should we predict for its
price? Figure 3.14 shows a scatterplot of the truck data that we have studied throughout this section, including the
least-squares regression line. Another horizontal line has been added at the mean y-value, ȳ = $27,834. If we don’t
know the number of miles driven for the additional truck, we can’t use the regression line to make a prediction. What
should we do? Our best strategy is to use the mean price of the other 16 trucks as our prediction.

Calculate the ratio of the sum of the squared residuals from the least-squares regression line to the sum of the squared
residuals from the mean line to determine the percent of variation that is unaccounted for by the least-squares
regression line.

What are some other factors, besides miles driven, that may determine the price of the truck?

What is the proportion of the total variation in price that is accounted for by the least squares regression line? Interpret
this value in context.

The coefficient of determination r² is the fraction of the variation in the values of y that is accounted for by the
least-squares regression line of y on x. We can calculate r² using the formula:

$$r^2 = 1 - \frac{\sum \text{residuals}^2}{\sum (y_i - \bar{y})^2}$$

r² is the percent of variation in the (y variable) that is accounted for by the linear model relating (y variable)
to (x variable).

How is r² related to r? How is r² related to s?

• r² and s both measure how well the least-squares regression line models the data (how much scatter there is
from the least-squares regression line).
• s is measured in the units of the response variable; r² is on a standard scale (no units).
• Neither addresses form!
• Careful: if you want to find r from r², you must take into account the direction of the association.
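
The link between s, r², and r can be made concrete with a short computation. The truck data table is not reproduced in these notes, so this sketch reuses the stand mixer data from Section 3.1. The variable names `sse` and `sst` are illustrative labels for the two sums of squares in the r² formula.

```python
import math

# Stand mixer data: weight (lb) and price ($).
weight = [23, 28, 19, 17, 25, 26, 21, 32, 16, 17, 8]
price = [180, 250, 300, 150, 300, 370, 400, 350, 200, 150, 30]

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(price) / n
sxx = sum((x - mean_x) ** 2 for x in weight)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, price))
b = sxy / sxx                 # slope of the least-squares line
a = mean_y - b * mean_x       # y-intercept

residuals = [y - (a + b * x) for x, y in zip(weight, price)]
sse = sum(e ** 2 for e in residuals)           # squared residuals about the line
sst = sum((y - mean_y) ** 2 for y in price)    # squared deviations about the mean

s = math.sqrt(sse / (n - 2))    # in dollars, the units of the response variable
r_squared = 1 - sse / sst       # no units
# To recover r from r^2, take the square root and attach the sign of the slope.
r = math.copysign(math.sqrt(r_squared), b)

print(s, r_squared, r)
```

The `copysign` step is one way to apply the final bullet: the square root alone is never negative, so the direction has to come from the slope.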

3.2 Interpreting Computer Output

Does seat location affect grades?

Many people believe that students learn better if they sit closer to the front of the classroom. Does sitting closer cause
higher achievement, or do better students simply choose to sit in the front? To investigate, an AP Statistics teacher
randomly assigned students to seat locations in her classroom for a particular chapter. At the end of the chapter, she
recorded the row number (Row 1 is closest to the front) and test score for each student. Least-squares regression was
performed on the data. A scatterplot with the regression line added, a residual plot, and some computer output from
the regression are shown below.
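[Figure: scatterplot with regression line, residual plot, and computer output annotated with the slope, y-intercept, r², and standard deviation of the residuals]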

(a) Is this an observational study or an experiment? Explain.

(b) Identify the type and scope of inference.

(d) What is the equation of the least-squares regression line? Define any variables you use.

(e) Interpret the slope of the least-squares regression line.

(f) What is the correlation?

(g) Is a linear model appropriate for this data? Explain.

(h) Calculate and interpret the residual for a student who was seated in the fourth row and had a test score of 75.

(i) Interpret the value of r² in context.

(j) Interpret s in context.

(k) Would it be reasonable to use the fitted regression equation to predict the test score for a student sitting in row 50
of a lecture hall? Explain.

3.2 How to Calculate the Least-Squares Regression Line

The least-squares regression line is the line ____________________________________ with slope _______________

and y-intercept _______________________.

1. A random sample of 15 high school students was selected from the CensusAtSchool database. The foot length
(in centimeters) and height (in centimeters) of each student in the sample were recorded. The mean and
standard deviation of the foot lengths are x̄ = 24.76 cm and s_x = 2.71 cm. The mean and standard deviation of the
heights are ȳ = 171.43 cm and s_y = 10.69 cm. The correlation between foot length and height is r = 0.697.
Find the equation of the least-squares regression line for predicting height from foot length.

2. The mean height of married American women in their early twenties is 64.5 inches and the standard deviation is
2.5 inches. The mean height of married men the same age is 68.5 inches, with standard deviation of 2.7 inches.
The correlation between the heights of husbands and wives is about r = 0.5. Find the equation of the least-
squares regression line for predicting a husband’s height from his wife’s height for married couples in their early
20s.
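
Both problems above supply only summary statistics. A minimal sketch of one common approach follows, assuming the standard relationships slope b = r(s_y / s_x) and intercept a = ȳ - b x̄, where the second comes from the line passing through (x̄, ȳ). The function name `lsrl_from_summary` is illustrative.

```python
def lsrl_from_summary(mean_x, s_x, mean_y, s_y, r):
    """Return (a, b) for y-hat = a + b*x from summary statistics."""
    b = r * s_y / s_x          # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (x-bar, y-bar)
    return a, b

# Problem 1: predicting height (cm) from foot length (cm).
a1, b1 = lsrl_from_summary(24.76, 2.71, 171.43, 10.69, 0.697)

# Problem 2: predicting a husband's height (in) from his wife's height (in).
a2, b2 = lsrl_from_summary(64.5, 2.5, 68.5, 2.7, 0.5)

print(f"height-hat = {a1:.2f} + {b1:.3f}(foot length)")
print(f"husband-hat = {a2:.2f} + {b2:.2f}(wife)")
```

The argument order matters: the explanatory variable's mean and standard deviation come first, then the response variable's.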

3.2 Linear Regression Practice

Can we predict the number of wins for an MLB team from their run differential (runs scored – runs allowed)?

The scatterplot below shows the run differential and number of wins for all 30 MLB teams for the 2023 season. A
residual plot and some computer output are also provided.

(a) Describe the relationship between run differential and # of wins.

(b) Interpret the slope and y-intercept of the least-squares regression line.

(d) Interpret the value of r² in context.

(e) Interpret s in context.

(f) Statistician Bill James developed the Pythagorean Winning Percentage formula to predict the number of games a
team “should” win based on its total number of runs scored versus its number of runs allowed. The initial formula
for Pythagorean winning percentage was as follows: (runs scored^2)/(runs scored^2 + runs allowed^2). Since then,
other analysts have attempted to find an even more accurate formula. For instance, Baseball-Reference.com uses
1.83 as its exponent of choice. Use both versions of the formula to predict how many games the Cubs should have
won. They scored 819 runs and allowed 723 runs in 162 games.
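
The two versions of the formula in part (f) differ only in the exponent, so one function can cover both. This is a sketch under that reading of the problem. The function name `pythagorean_wins` and its parameter names are illustrative.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins: games times the Pythagorean winning percentage."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# Cubs: 819 runs scored, 723 runs allowed, 162 games.
cubs_original = pythagorean_wins(819, 723, 162)                  # exponent 2
cubs_bbref = pythagorean_wins(819, 723, 162, exponent=1.83)      # exponent 1.83

print(round(cubs_original, 1), round(cubs_bbref, 1))
```

The function returns a fractional number of wins. Rounding to a whole number of games is left to the reader.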

How Do Outliers Affect the LSRL?

1. Use the Correlation and Regression applet at tinyurl.com/regressionapplet

• Click on the graphing area to add 10 points in the center so that the correlation is about r = 0.90.

• Check the box to show the least-squares line.

[Figure: applet scatterplot with the locations of points (a), (b), (c), and (d) marked]

2. Predict if the slope, y-intercept and correlation will increase, decrease, or stay about the same. Check your
prediction, then delete the point (click on it) before moving to the next part.

(a) Adding point (a) as shown in the picture above.

slope: y-intercept: correlation:

(b) Adding point (b) as shown in the picture above.

slope: y-intercept: correlation:

(c) Adding point (c) as shown in the picture above.

slope: y-intercept: correlation:

(d) Adding point (d) as shown in the picture above.

slope: y-intercept: correlation:

3. Why are points (a), (b), (c), and (d) considered outliers?

4. Which outliers had the greatest impact on the LSRL: vertical or horizontal outliers?

As you learned in this activity, unusual points may or may not have an influence on the least-squares regression line and
the correlation r. The same is true for the coefficient of determination r² and the standard deviation of the residuals s. Here
are 4 scatterplots that summarize the possibilities. In all four scatterplots, the 8 points in the lower left are the same.

Case 1: No unusual points

Case 2: A point far from the other points in the x direction, but in the same pattern.

Compared to case 1, determine whether the following stayed the same, increased, decreased or changed from positive
to negative.

Slope: y-intercept: r:

r²: s:

Case 3: A point that is far from the other points in the x direction, and not in the same pattern.

Compared to case 1, determine whether the following stayed the same, increased, decreased or changed from positive
to negative.

Slope: y-intercept: r:

r²: s:

Case 4: A point that is far from the other points in the y direction, and not in the same pattern.

Compared to case 1, determine whether the following stayed the same, increased, decreased or changed from positive
to negative.

Slope: y-intercept: r:

r²: s:

In Cases 2 and 3, the unusual point had a much bigger x value than the other points. Points whose x values are much
smaller or much larger than the other points in a scatterplot have high leverage. In Case 4, the unusual point had a very
large residual. Points with large residuals are called outliers. All three of these unusual points are considered influential
points because adding them to the scatterplot substantially changed either the equation of the least-squares regression
line or one or more of the other summary statistics (r, r², s).

Definitions

Points with high leverage in regression have much larger or much smaller x values than the other points in the data set.

An outlier in regression is a point that does not follow the pattern of the data and has a large residual.

An influential point in regression is any point that, if removed, substantially changes the slope, y-intercept, correlation,
coefficient of determination, or standard deviation of the residuals.
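
The four cases can be reproduced with a small experiment. The data below are hypothetical values chosen only for illustration and are not the points in the scatterplots. The helper name `fit` is also illustrative.

```python
import math

def fit(points):
    """Return (slope, intercept, r) of the least-squares line for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

# Hypothetical base data: 8 points with a positive linear trend.
base = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]

case1 = fit(base)                  # no unusual points
case2 = fit(base + [(20, 20)])     # high leverage, follows the pattern
case3 = fit(base + [(20, 2)])      # high leverage, breaks the pattern
case4 = fit(base + [(4.5, 20)])    # large residual near the middle of the x values

for label, result in zip(["case 1", "case 2", "case 3", "case 4"],
                         [case1, case2, case3, case4]):
    print(label, [round(v, 3) for v in result])
```

In this made-up data set the in-pattern point strengthens r, the out-of-pattern high-leverage point nearly flattens the line, and the outlier in the y direction leaves the slope alone while raising the intercept and weakening r. Other data sets can behave differently in degree.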

Example 10

Here is a scatterplot showing the cost in dollars and the battery life in hours
for a sample of netbooks (small laptop computers).

[Scatterplot: battery life (hours) versus cost (dollars)]

What effect do the two netbooks that cost $500 have on the following?

Slope:

y-intercept:

correlation r:

coefficient of determination r²:

standard deviation of the residuals:

2007B

Matching Graphs & Coefficients of determination

A circled point was added to graph A to create graphs B, C and D. Likewise, a circled point was added to graph E to
create graphs F, G and H. Match the graphs with the regression equations and coefficients of determination. NO
CALCULATORS!

1) _______ ŷ = −0.366x + 3; r² = 0.012

2) _______ ŷ = −0.436x + 4.53; r² = 0.33

3) _______ ŷ = 0.327x + 1.3; r² = 0.72

4) _______ ŷ = −0.0617x + 3.3; r² = 0.026

5) _______ ŷ = −0.888x + 6.7; r² = 0.34

6) _______ ŷ = 0.231x + 2.3; r² = 0.28

7) _______ ŷ = 0.536x + 0.31; r² = 0.65

8) _______ ŷ = 0.53x + 1.1; r² = 0.14

[Scatterplots A through H]

Developing an equation for estimating body height from linear body measurements of Ethiopian adults

Alemayehu Digssie, Alemayehu Argaw, and Tefera Belachew https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6258443/

Measurements of erect height in older people, hospitalized and bedridden patients, and people with skeletal deformity
is difficult. As a result, using body mass index for assessing nutritional status is not valid. Height estimated from linear
body measurements such as arm span, knee height, and half arm span was shown to be useful surrogate measures of
stature. However, the relationship between linear body measurements and stature varies across populations implying
the need for the development of population-specific prediction equation. The objective of this study was to develop a
formula that predicts height from arm span, half arm span, and knee height for Ethiopian adults and assess its
agreement with measured height.

A cross-sectional study was conducted from March 15 to April 21, 2016 in Jimma University among a total of 660 (330
females and 330 males) subjects aged 18–40 years. A two-stage sampling procedure was employed to select study
participants. Data were collected using interviewer-administered questionnaire and measurement of anthropometric
parameters. The data were edited and entered into Epi Data version 3.1 and exported to SPSS for windows version 20
for cleaning and analyses. Linear regression model was fitted to predict height from knee height, half arm span, and arm
span. Bland-Altman analysis was employed to see the agreement between actual height and predicted heights. P values
< 0.05 was used to declare as statistically significance.

On multivariable linear regression analyses after adjusting for age and sex, arm span (β = 0.63, p < 0.001, R² = 87%), half
arm span (β = 1.05, p < 0.001, R² = 83%), and knee height (β = 1.62, p < 0.001, R² = 84%) predicted height significantly.
The Bland-Altman analyses showed a good agreement between measured height and predicted height using all the
three linear body measurements.

The findings imply that in the context where height cannot be measured, height predicted from arm span, half arm span,
and knee height is a valid proxy indicator of height. Arm span was found to be the best predictor of height. The
prediction equations can be used to assess the nutritional status of hospitalized and/or bedridden patients, people with
skeletal deformity, and elderly population in Ethiopia.

1. Identify the explanatory and response variables.

2. Identify the population and the sample.

3. “P values < 0.05 was used to declare as statistically significant.” What does this mean?

4. According to this study, which measurement is best for predicting height in Ethiopian adults?

5. Interpret the slope for the arm span equation. (Note: β = slope)

6. Interpret the r² value for the arm span data.

How much candy can you grab?

Can students with a larger handspan grab more candy than those with smaller handspans?
Today we will investigate this question.

1. Measure the span of your dominant hand to the nearest half centimeter (cm). Handspan is the distance from the tip
of the thumb to the tip of the pinkie finger on your fully stretched-out hand. Handspan = ________ cm

2. Use the same hand to grab as many candies as possible from the container. You must grab the candies with your
fingers pointing down (no scooping!) and hold the candies for 2 seconds before counting them. After counting, put
the candy back into the container. Record your data on the board.

3. Sketch a scatterplot of the relationship and describe what the scatterplot reveals about the relationship between
hand span and number of candies held.

4. Using technology determine the least squares regression equation for predicting the number of candies grabbed
from hand size. Add the line to the scatterplot.

5. Record the correlation coefficient below and interpret its value.

6. What is the slope of the line? Interpret the slope in context.

7. What is the y-intercept of the line? Interpret the y-intercept in context.

8. LeBron James has a handspan of 23.5 cm. Use the equation to predict how many candies LeBron can grab.

9. Suppose LeBron actually grabbed 38 candies. Find and interpret the residual.

10. Suppose we did not gather information about your hand sizes and had simply used the mean number of candies
grabbed for your class to make a prediction for any given student. Use your data to find the mean number of candies
grabbed in this class. Record the mean.

11. Using the mean you found in #10, draw a horizontal line at the value on the graph in #3. For the majority of the
students in the class, does it appear that the mean line or the LSRL better predicts the number of candies a given
student can grab? Explain.

12. The coefficient of determination, r², is a measure of the improvement in prediction when using the LSRL to predict
the value of a response variable rather than simply using the average value of the response variable. Find and
interpret the value of r² for the LSRL of number of candies grabbed and hand size.

Chapter 3 Review

Exercises 1-5 refer to the following setting.

Measurements on young children in Mumbai, India, found this least-squares line for predicting height y from arm span x:

$$\hat{y} = 6.4 + 0.93x$$

Measurements are in centimeters (cm).

1. By looking at the equation of the least-squares regression line, you can see that the correlation between height
and arm span is

(a) greater than zero.

(b) less than zero.
(c) 0.93.
(d) 6.4.
(e) Can’t tell without seeing the data.

2. In addition to the regression line, the report on the Mumbai measurements says that r² = 0.95. This suggests
that

(a) although arm span and height are correlated, arm span does not predict height very accurately.
(b) height increases by .97 cm for each additional centimeter of arm span.
(c) 95% of the relationship between height and arm span is accounted for by the regression line.
(d) 95% of the variation in height is accounted for by the regression line.
(e) 95% of the height measurements are accounted for by the regression line.

3. One child in the Mumbai study had height 59 cm and arm span 60 cm. This child’s residual is

(a) −3.2 cm.

(b) −2.2 cm.
(c) −1.3 cm.
(d) 3.2 cm.
(e) 62.2 cm.

4. Suppose that a tall child with arm span 120 cm and height 118 cm was added to the sample used in this study.
What effect will adding this child have on the correlation and the slope of the least-squares regression line?
(a) Correlation will increase, slope will increase.
(b) Correlation will increase, slope will stay the same.
(c) Correlation will increase, slope will decrease.
(d) Correlation will stay the same, slope will stay the same.
(e) Correlation will stay the same, slope will increase.

5. Suppose that the measurements of arm span and height were converted from centimeters to meters by dividing
each measurement by 100. How will this conversion affect the values of r² and s?

(a) r² will increase, s will increase.

(b) r² will increase, s will stay the same.
(c) r² will increase, s will decrease.
(d) r² will stay the same, s will stay the same.
(e) r² will stay the same, s will decrease.

6. You have data for many years on the average price of a barrel of oil and the average retail price of a gallon of
unleaded regular gasoline. If you want to see how well the price of oil predicts the price of gas, then you should
make a scatterplot with _______ as the explanatory variable.

(a) the price of oil

(b) the price of gas
(c) the year
(d) either oil price or gas price
(e) time

7. In a scatterplot of the average price of a barrel of oil and the average retail price of a gallon of gas, you expect to
see

(a) very little association.

(b) a weak negative association.
(c) a strong negative association.
(d) a weak positive association.
(e) a strong positive association.

8. If women always married men who were 2 years older than themselves, what would the correlation between
the ages of husband and wife be?

(a) 2
(b) 1
(c) 0.5
(d) 0
(e) Can’t tell without seeing the data

9. Which of the following is not a characteristic of the least-squares regression line?

(a) The slope of the least-squares regression line is always between −1 and 1.
(b) The least-squares regression line always goes through the point (x̄, ȳ).
(c) The least-squares regression line minimizes the sum of squared residuals.
(d) The slope of the least-squares regression line will always have the same sign as the correlation.
(e) The least-squares regression line is not resistant to outliers.

10. The figure to the right is a scatterplot of reading test scores against IQ test scores for 14 fifth-grade children.
There is one low outlier in the plot. What effect does this low outlier have on the correlation?

(a) It makes the correlation closer to 1.

(b) It makes the correlation closer to 0 but still positive.
(c) It makes the correlation equal to 0.
(d) It makes the correlation negative.
(e) It has no effect on the correlation.

11. The manager of a grocery store selected a random sample of 11 customers to investigate the relationship
between the number of customers in a checkout line and the time to finish checkout. As soon as the selected
customer entered the end of the checkout line, data were collected on the number of customers in line who
were in front of the selected customer and the time, in seconds until the selected customer was finished with
the checkout. The data are shown in the following scatterplot along with the corresponding LSRL and computer
output.

a. Describe what the scatterplot reveals about the relationship between number of customers in line and the
time it takes to checkout.

b. What is the equation of the LSRL?

c. Identify and interpret in context the estimate of the slope for the LSRL.

d. Identify and interpret in context the estimate of the intercept for the LSRL.

e. Calculate and interpret the residual for a customer who was in a line with 3 people and finished checking out
after 200 seconds.

f. Identify and interpret the standard deviation of the residuals in context.

g. Identify and interpret in context the coefficient of determination, r².

h. One of the data points was determined to be an outlier. Circle the point on the scatterplot and explain why
it is considered an outlier. If this point were removed from the plot, what effect would it have on the
correlation?

12. Each year, students in an elementary school take a standardized math test at the end of the school year. For a
class of fourth-graders, the average score was 55.1 with a standard deviation of 12.3. In the third grade, these
same students had an average score of 61.7 with a standard deviation of 14.0. The correlation between the two
sets of scores is r = 0.95. Calculate the equation of the least-squares regression line for predicting a fourth-grade
score from a third-grade score.

