
Linear Regression and Correlation

This document covers linear regression and correlation, focusing on their applications in predicting variable values and making informed decisions using statistical data. It explains the least-squares regression line, scatter plots, and the linear correlation coefficient, providing examples and formulas for practical applications. The document also includes exercises for calculating predictions and analyzing relationships between variables.

Lesson 4: Linear Regression and Correlation

December, 2024

Intended Learning Outcomes
1. Use the methods of linear regression and correlation to predict the value of a variable given certain conditions.
2. Advocate the use of statistical data in making important decisions.
3. Apply linear regression and correlation analysis to analyze real-world data and solve practical problems in fields like economics, science, and social sciences.
Examples of situations where regression analysis could be applied:
• Educators are interested in determining how the number of hours a student studies can be used to predict the student’s score on a particular exam.
• Researchers want to determine if the daily caffeine intake (in milligrams) is related to the level of heart damage.
• Researchers want to determine if a person’s age is related to their blood pressure.
Linear Regression is a statistical
method used to analyze the
relationship between two or more
variables, typically by fitting a straight
line to the data points.
Linear Regression is used to predict the
value of the dependent variable y based on
the independent variable x by fitting a
straight line to the data. It also helps to
understand the relationship between x and
y, quantifying how changes in x affect y.
The least-squares regression line is the straight line that best fits a set of data points: it makes the sum of the squared vertical differences between the actual data points and the points predicted by the line as small as possible. It helps to show the relationship between two variables and can be used to make predictions.
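As a sketch of what "as small as possible" usually means here, the least-squares criterion can be written as a function of the slope a and intercept b. The index i and the name S are notation introduced for this illustration only; the fitted line is the pair (a, b) that minimizes S.

```latex
S(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2
```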
A scatter plot, also known as a scatter diagram, is
a type of mathematical diagram used to display
the relationship between two variables.
In a scatter plot, each data point represents the
values of two variables, one plotted along the
horizontal axis (x-axis) and the other along the
vertical axis (y-axis).
Scatter plots are often used to visually assess the correlation or relationship between two variables.
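The Formula for the Least-Squares Line

The equation of the least-squares line for the n ordered pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is

$$\hat{y} = ax + b,$$

where

$$a = \frac{n \sum xy - \left( \sum x \right) \left( \sum y \right)}{n \sum x^2 - \left( \sum x \right)^2}$$

and

$$b = \bar{y} - a\bar{x}.$$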
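The slope and intercept of the least-squares line can be computed directly from the sums of x, y, x², and xy. The following is a minimal Python sketch; the function name `least_squares_line` and the three sample points are illustrative assumptions, not part of the lesson.

```python
def least_squares_line(pairs):
    """Return (a, b) for the line y-hat = a*x + b fitted to (x, y) pairs."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    sum_xy = sum(x * y for x, y in pairs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # All x values are identical, so no unique slope exists.
        raise ValueError("slope is undefined when every x value is the same")

    a = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = sum_y / n - a * (sum_x / n)                 # intercept: y-bar - a * x-bar
    return a, b


# Hypothetical check: three points that lie exactly on y = 2x + 1.
a, b = least_squares_line([(0, 1), (1, 3), (2, 5)])
print(a, b)
```

Because the sample points lie exactly on a line, the fitted slope and intercept should reproduce that line.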
Hours Spent (x) and Achievement Grade (y)

Let us solve for the value of a:

a = 6.25
What is the predicted achievement grade of a student who spent 4 hours studying the subject?

Solution:
Substitute x = 4 hours into the linear regression equation y = 6.25x + 72.26 and solve for y.

y = 6.25 (4) + 72.26
y = 25 + 72.26
y = 97.26

Therefore, a student who spent 4 hours studying the subject has a predicted grade of 97.26.
2. Find the equation of the least-squares regression line and predict the average speed of an adult man for each of the following stride lengths. Round your results to the nearest tenth of a meter per second:
a) 2.8 m
b) 4.8 m

The ordered pairs (stride length in meters, average speed in meters per second) are
(2.5, 3.4), (3.0, 4.9), (3.3, 5.5), (3.5, 6.6), (3.8, 7.0), (4.0, 7.7), (4.2, 8.3), (4.5, 8.7)
        x       y       x²       y²       xy
        2.5     3.4     6.25     11.56    8.5
        3.0     4.9     9.00     24.01    14.7
        3.3     5.5     10.89    30.25    18.15
        3.5     6.6     12.25    43.56    23.1
        3.8     7.0     14.44    49       26.6
        4.0     7.7     16.00    59.29    30.8
        4.2     8.3     17.64    68.89    34.86
        4.5     8.7     20.25    75.69    39.15
Sum     28.8    52.1    106.72   362.25   195.86

n = 8, ∑x = 28.8, ∑y = 52.1, ∑x² = 106.72, ∑y² = 362.25, ∑xy = 195.86
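The column totals for the stride-length data can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative, and the totals are rounded to two decimals only to hide floating-point noise.

```python
# Stride length (m) and average speed (m/s) pairs from the problem.
pairs = [(2.5, 3.4), (3.0, 4.9), (3.3, 5.5), (3.5, 6.6),
         (3.8, 7.0), (4.0, 7.7), (4.2, 8.3), (4.5, 8.7)]

n = len(pairs)
sum_x = round(sum(x for x, _ in pairs), 2)
sum_y = round(sum(y for _, y in pairs), 2)
sum_x2 = round(sum(x * x for x, _ in pairs), 2)
sum_y2 = round(sum(y * y for _, y in pairs), 2)
sum_xy = round(sum(x * y for x, y in pairs), 2)

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
```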

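Find the value of the slope a.

$$a = \frac{n \sum xy - \left( \sum x \right) \left( \sum y \right)}{n \sum x^2 - \left( \sum x \right)^2} = \frac{(8)(195.86) - (28.8)(52.1)}{(8)(106.72) - (28.8)^2}$$

$$a \approx 2.7303$$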
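Find $\bar{x}$ and $\bar{y}$, using $\sum x = 28.8$, $\sum y = 52.1$, and n = 8.

$$\bar{x} = \frac{28.8}{8} = 3.6, \qquad \bar{y} = \frac{52.1}{8} = 6.5125$$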
Find the intercept b.

$$\hat{y} = ax + b$$
$$b = \bar{y} - a\bar{x}$$
$$b = 6.5125 - (2.7303)(3.6)$$
$$b = -3.31658$$
If a and b are each rounded to the nearest tenth, to reflect the accuracy of the original data, then we have as our equation of the least-squares line:

$$\hat{y} = 2.7x - 3.3$$
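The slope and intercept for the stride-length data can be checked numerically from the sums given earlier. The snippet below is a minimal sketch that assumes n = 8 ordered pairs and uses illustrative variable names.

```python
# Sums for the stride-length data (n = 8 ordered pairs).
n = 8
sum_x, sum_y = 28.8, 52.1
sum_x2, sum_xy = 106.72, 195.86

a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
b = sum_y / n - a * (sum_x / n)                               # intercept

print(round(a, 4))               # slope to four decimal places
print(round(b, 4))               # intercept computed from the unrounded slope
print(round(a, 1), round(b, 1))  # coefficients rounded to the nearest tenth
```

Carrying the unrounded slope gives an intercept of about −3.3164, while rounding the slope to 2.7303 first gives −3.31658. Both round to −3.3, so the equation of the line is unaffected.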
[Figure: scatter plot of y against x with fitted trendline f(x) = 2.730263157895 x − 3.316447368421, R² = 0.987469217736318]
Use the equation of the least-squares line to
predict the average speed of an adult man for each
of the following stride lengths. Round your results
to the nearest tenth of a meter per second.
a) 2.8 m
b) 4.8 m
a. Substitute 2.8 for x:

$$\hat{y} = 2.7(2.8) - 3.3 = 4.26 \approx 4.3$$

Thus 4.3 m/s is the predicted average speed for an adult man with a stride length of 2.8 m.

b. Substitute 4.8 for x:

$$\hat{y} = 2.7(4.8) - 3.3 = 9.66 \approx 9.7$$

Thus 9.7 m/s is the predicted average speed for an adult man with a stride length of 4.8 m.
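The two predictions can be reproduced by evaluating the rounded least-squares line at each stride length. This is a minimal sketch; the function name `predict_speed` is an illustrative assumption, and the default coefficients are the rounded values 2.7 and −3.3.

```python
def predict_speed(stride_m, a=2.7, b=-3.3):
    """Predicted average speed (m/s) for a stride length in meters."""
    return a * stride_m + b


speed_a = round(predict_speed(2.8), 1)  # part (a)
speed_b = round(predict_speed(4.8), 1)  # part (b)
print(speed_a, speed_b)
```

Note that 4.8 m lies outside the range of stride lengths in the data (2.5 m to 4.5 m), so the second prediction is an extrapolation and should be read with more caution.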
Find the equation of the least-squares line for predicting guest satisfaction scores based on room cleanliness and staff friendliness.
The ordered pairs are: (4, 5), (3, 4), (5, 5), (2, 3), and (5, 4).
The Linear Correlation Coefficient (denoted as r) is a
numerical measure that describes the strength and direction
of the linear relationship between two variables. It ranges
from -1 to 1:
r = 1 indicates a perfect positive linear relationship, where as
one variable increases, the other increases proportionally.
r = -1 indicates a perfect negative linear relationship, where
as one variable increases, the other decreases proportionally.
r = 0 indicates no linear relationship between the variables.
• When r > 0, there is a positive linear relationship, and when r < 0, there is a negative linear relationship.
• Values close to 1 or -1 (greater than 0.9 or less than -0.9) indicate a strong linear relationship, with r = 1 or r = -1 showing a perfect relationship.
• Values close to 0 (between -0.3 and 0.3) indicate a weak relationship, suggesting little to no linear correlation between the variables.
• Values greater than 1 or less than -1 are not possible and usually indicate a calculation or data error. The correlation coefficient must always fall within the range of -1 to 1.
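These rules of thumb can be collected into a small helper that labels a value of r. This is only a sketch: the function name `describe_r` is hypothetical, the 0.9 and 0.3 cutoffs are the ones quoted in the text, and the "moderate" label for values in between is an assumption, since the text does not name that range.

```python
def describe_r(r):
    """Describe the direction and strength of a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        # Values outside [-1, 1] signal a calculation or data error.
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"

    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size > 0.9:
        strength = "strong"
    elif size < 0.3:
        strength = "weak"
    else:
        strength = "moderate"  # assumed label; not named in the text
    return f"{strength} {direction} linear relationship"


print(describe_r(0.99))
print(describe_r(-0.2))
```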
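For the n ordered pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, the linear correlation coefficient is:

$$r = \frac{n \sum xy - \left( \sum x \right) \left( \sum y \right)}{\sqrt{\left[ n \sum x^2 - \left( \sum x \right)^2 \right] \left[ n \sum y^2 - \left( \sum y \right)^2 \right]}}$$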
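The correlation coefficient formula translates directly into code. The following Python sketch uses an illustrative function name, `linear_correlation`, and two hypothetical three-point data sets chosen so that the expected values of r are easy to verify by hand.

```python
import math


def linear_correlation(pairs):
    """Return the linear correlation coefficient r for (x, y) pairs."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    sum_y2 = sum(y * y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # Happens when every x value or every y value is the same.
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


# Hypothetical checks: points exactly on a rising line and on a falling line.
r_up = linear_correlation([(0, 1), (1, 3), (2, 5)])
r_down = linear_correlation([(0, 5), (1, 3), (2, 1)])
print(r_up, r_down)
```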
If the linear correlation coefficient r is positive, the
relationship between the variables has a positive
correlation. In this case, if one variable increases, the other
variable also tends to increase.
The plot shown suggests a positive relationship, since as the number of cars rented increases, revenue tends to increase also.

[Figure: Car Rentals]
If r is negative, the linear relationship between the
variables has a negative correlation. In this case, if one
variable increases, the other variable tends to decrease.
The data shown suggests a negative relationship, since as the number of absences increases, the final grade decreases.

[Figure: Absences and Final Grades]
The plot of the data shows no clear relationship, as no visible pattern can be identified.

[Figure: Age and Health]
2. Find the linear correlation coefficient for the stride length and speed of an adult man. Round your result to the nearest hundredth.

The ordered pairs are
(2.5, 3.4), (3.0, 4.9), (3.3, 5.5), (3.5, 6.6), (3.8, 7.0), (4.0, 7.7), (4.2, 8.3), (4.5, 8.7)
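n = 8, ∑x = 28.8, ∑y = 52.1, ∑x² = 106.72, ∑y² = 362.25, ∑xy = 195.86

$$r = \frac{n \sum xy - \left( \sum x \right) \left( \sum y \right)}{\sqrt{\left[ n \sum x^2 - \left( \sum x \right)^2 \right] \left[ n \sum y^2 - \left( \sum y \right)^2 \right]}}$$

$$r = \frac{8(195.86) - (28.8)(52.1)}{\sqrt{\left[ 8(106.72) - (28.8)^2 \right] \left[ 8(362.25) - (52.1)^2 \right]}}$$

$$r \approx 0.99$$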
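As a numerical check, r can be computed from the sums for the eight stride-length pairs. The sketch below assumes n = 8, the number of ordered pairs listed, and uses illustrative variable names.

```python
import math

# Sums for the stride-length data (n = 8 ordered pairs).
n = 8
sum_x, sum_y = 28.8, 52.1
sum_x2, sum_y2, sum_xy = 106.72, 362.25, 195.86

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 2))      # correlation coefficient to the nearest hundredth
print(round(r * r, 4))  # r squared, comparable to the trendline's R² value
```

Squaring r gives about 0.9875, which agrees with the R² value reported for the fitted trendline in the stride-length plot.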
What is the significance of the fact that the linear correlation coefficient is positive?
• It indicates a positive correlation between a man’s stride length and his speed. That is, as a man’s stride length increases, his speed also tends to increase.
Activity: (3 members in a group)
Find the equation of the least-squares line and the linear correlation coefficient for the given data. Round the constants a, b, and r to the nearest hundredth.
{(−7, −11.7), (−5, −9.8), (−3, −8.1), (1, −5.9), (2, −5.7)}
