Lesson 3

This document covers the Pearson correlation coefficient, detailing its interpretation, calculation, and significance testing. It explains the strength and direction of relationships between quantitative variables, and when to use Pearson versus Spearman's correlation coefficients. Additionally, it provides a step-by-step guide for hypothesis testing related to correlation coefficients.

Uploaded by

euziaheunice

Correlational and regression analysis
MS102 – Lesson 3
Learning outcomes
At the end of the lesson, students must have:
▪ interpreted correlation and regression on various
datasets,
▪ calculated and interpreted the correlation and
regression, and
▪ performed data mining.
Pearson correlation coefficient (r)
➢ The most widely used correlation coefficient, known by many names:
✓ Pearson’s r
✓ Bivariate correlation
✓ Pearson product-moment correlation coefficient (PPMCC)
✓ The correlation coefficient
➢ As a descriptive statistic, r summarizes the characteristics of a dataset.
➢ Specifically, it describes the strength and direction of the linear relationship between two quantitative variables.
Pearson correlation coefficient (r) value    Strength    Direction
Greater than .5                              Strong      Positive
Between .3 and .5                            Moderate    Positive
Between 0 and .3                             Weak        Positive
0                                            None        None
Between 0 and –.3                            Weak        Negative
Between –.3 and –.5                          Moderate    Negative
Less than –.5                                Strong      Negative
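The table above can be read as a simple lookup rule. The sketch below is one possible encoding of it in Python; the function name `describe_r` is hypothetical, and because the table does not say which category the exact boundary values .3 and .5 belong to, the code assumes they count as "Moderate".

```python
def describe_r(r: float) -> tuple[str, str]:
    """Return (strength, direction) for a Pearson r, following the table above.

    Assumption: |r| exactly equal to .3 or .5 is labeled "Moderate",
    since the table leaves those boundary values unassigned.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return ("None", "None")
    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)
    if size > 0.5:
        strength = "Strong"
    elif size >= 0.3:
        strength = "Moderate"
    else:
        strength = "Weak"
    return (strength, direction)


print(describe_r(0.47))   # ('Moderate', 'Positive')
print(describe_r(-0.8))   # ('Strong', 'Negative')
```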
➢ As an inferential statistic, r can also be used to test statistical hypotheses.
➢ Specifically, we can test whether two variables have a significant relationship.
Visualizing the Pearson correlation coefficient
➢ Another way to think of the Pearson correlation coefficient (r) is as a measure of how close the observations are to a line of best fit.
➢ When the slope is negative, r is negative.
➢ When the slope is positive, r is positive.
When the correlation coefficient r = 1, it indicates a perfect positive linear relationship between two variables.

Example: If you were comparing the number of hours studied and test scores, r = 1 would mean that for every additional hour of study, the test score increases by a constant amount.
When the correlation coefficient r = −1, it indicates a perfect negative linear relationship between two variables.

Example: If you were comparing the amount of time spent watching TV and test scores, r = −1 would mean that for every additional hour spent watching TV, the test score decreases by a constant amount.
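To make the two extreme cases concrete, here is a minimal Python sketch. The hours and scores are made-up illustrative numbers, not data from the lesson, and `pearson_r` is a hypothetical helper that uses the usual sums-based formula for r.

```python
import math


def pearson_r(x, y):
    """Pearson correlation computed from sums of x, y, x^2, y^2 and x*y."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


hours = [1, 2, 3, 4, 5]                       # hypothetical hours per day
study_scores = [50 + 5 * h for h in hours]    # score rises 5 points per hour studied
tv_scores = [90 - 4 * h for h in hours]       # score falls 4 points per hour of TV

r_study = pearson_r(hours, study_scores)      # perfect positive relationship
r_tv = pearson_r(hours, tv_scores)            # perfect negative relationship
print(r_study, r_tv)
```

Any constant change per hour gives |r| = 1; the size of the change (5 points or 4 points) affects the slope of the line, not the correlation.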
When r is greater than .5 or less than –.5, the points are close to the line of best fit:
➢ If r > .5, it suggests a strong positive relationship between the two variables.
➢ If r < –.5, it suggests a strong negative relationship between the two variables.
When r is between 0 and .3 or between 0 and –.3, the points are far from the line of best fit:
➢ A weak positive correlation suggests a very slight tendency for one variable to increase as the other increases, but the relationship is weak and there is a lot of variation or noise in the data.
➢ A weak negative correlation means a very slight tendency for one variable to decrease as the other increases, but again, the relationship is weak.
When r is 0, a line of best fit is not helpful in describing the relationship between the variables.
When to use the Pearson correlation coefficient?
The Pearson correlation coefficient is a good choice when all of the following are true:
➢ Both variables are quantitative: You will need to use a different method if either of the variables is qualitative.
➢ The variables are normally distributed: You can create a histogram of each variable to verify whether the distributions are approximately normal. It’s not a problem if the variables are a little non-normal.
➢ The data have no outliers: Outliers are observations that don’t follow the same patterns as the rest of the data. A scatterplot is one way to check for outliers: look for points that are far away from the others.
➢ The relationship is linear: “Linear” means that the relationship between the two variables can be described reasonably well by a straight line. You can use a scatterplot to check whether the relationship between two variables is linear.
Pearson vs. Spearman’s rank correlation coefficients
Spearman’s rank correlation coefficient is another widely used correlation coefficient. It’s a better choice than the Pearson correlation coefficient when one or more of the following is true:
➢ The variables are ordinal.
➢ The variables aren’t normally distributed.
➢ The data include outliers.
➢ The relationship between the variables is non-linear and monotonic.
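The last condition can be illustrated with a small Python sketch. It assumes the standard definition of Spearman's coefficient as the Pearson correlation of the ranks; the data (y = x³) and the helper names are made up for illustration and do not come from the lesson.

```python
import math


def pearson_r(x, y):
    """Pearson correlation from sums of x, y, x^2, y^2 and x*y."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = list(range(1, 11))
y = [v ** 3 for v in x]          # monotonic but clearly non-linear

pearson_value = pearson_r(x, y)
spearman_value = spearman_rho(x, y)
print(round(pearson_value, 3), round(spearman_value, 3))
```

Because y always increases with x, the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's r falls below 1 because the points do not lie on a straight line.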
Calculating the Pearson correlation coefficient
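The worked example later in this lesson builds up the sums of x, y, x², y² and the cross product xy. Those quantities correspond to the computational form of the Pearson formula, written out here on that assumption, with n as the sample size:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```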
Testing for the significance of the Pearson correlation coefficient
➢ The Pearson correlation of the sample is r.
➢ It is an estimate of rho (ρ), the Pearson correlation of the population.
➢ Knowing r and n (the sample size), we can infer whether ρ is significantly different from 0.
✓ Null hypothesis (H0): ρ = 0
✓ Alternative hypothesis (Ha): ρ ≠ 0
Steps to test the hypothesis:
Step 1: Calculate the t value (a test statistic)
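For reference, the usual test statistic for H0: ρ = 0 is given below. It is stated here as an assumption about what Step 1 computes; it has n − 2 degrees of freedom, and with r ≈ .47 and n = 10 it gives t ≈ 1.506.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
```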
Step 2: Find the critical value of t
You can find the critical value of t (t*) in a t table. To use the table, you need to know three things:
➢ The degrees of freedom (df): For Pearson correlation tests, the formula is df = n – 2.
➢ Significance level (α): By convention, the significance level is usually .05.
➢ One-tailed or two-tailed: Most often, two-tailed is an appropriate choice for correlations.
For example, in a test with α = .05, the critical t value is found in the row for your df and in the column for .05 under the one-tailed or two-tailed heading, whichever applies.
Example: Finding the critical value of t
For a two-tailed test of significance at α = .05 and df = 8, the critical value of t (t*) is 2.306.
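If no t table is at hand, the critical value can be approximated numerically. The following is a rough Python sketch using only the standard library: it integrates the Student's t density with Simpson's rule and finds t* by bisection. The function names are hypothetical, and the search assumes t* is below 100, which holds for common choices of α and df; a statistics library would normally be used instead.

```python
import math


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """P(|T| > |t|), from Simpson's rule on the density between 0 and |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps                      # steps is even, as Simpson's rule needs
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total     # area between -|t| and +|t|
    return 1 - central_area


def critical_t(alpha: float, df: int) -> float:
    """Two-tailed critical value t*, found by bisection on two_tailed_p."""
    lo, hi = 0.0, 100.0                    # assumption: t* lies below 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > alpha:
            lo = mid                       # tails still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


t_star = critical_t(alpha=0.05, df=8)
print(round(t_star, 3))   # 2.306
```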
Step 3: Compare the t value to the critical value
Determine if the absolute t value is greater than the critical value of t. “Absolute” means that if the t value is negative you should ignore the minus sign.
➢ Example: Comparing the t value to the critical value of t (t*)
t = 1.506, t* = 2.306
The t value is less than the critical value of t.
Step 4: Decide whether to reject the null hypothesis
➢ If the absolute t value is greater than the critical value, the relationship is statistically significant (p < α). The data allow you to reject the null hypothesis and support the alternative hypothesis.
➢ If the absolute t value is less than the critical value, the relationship is not statistically significant (p > α). The data don’t allow you to reject the null hypothesis and don’t provide support for the alternative hypothesis.
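The steps can be strung together in a few lines of Python. This is a sketch, not a definitive implementation: the function name `correlation_t_test` is hypothetical, the t statistic is assumed to be t = r√(n − 2) / √(1 − r²), and the critical value is passed in as read from a t table (Step 2).

```python
import math


def correlation_t_test(r: float, n: int, t_critical: float) -> dict:
    """Test H0: rho = 0 against Ha: rho != 0 using a critical value from a t table."""
    if n < 3:
        raise ValueError("n must be at least 3 so that df = n - 2 is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)   # Step 1: t value
    reject_h0 = abs(t) > t_critical                 # Step 3: compare |t| with t*
    return {"t": t, "df": df, "reject_h0": reject_h0}  # Step 4: decision


# r = .47 and n = 10 are assumed here so that df = 8, as in the example above.
result = correlation_t_test(r=0.47, n=10, t_critical=2.306)
print(round(result["t"], 3), result["reject_h0"])   # 1.506 False
```

With these inputs the absolute t value is below t*, so the null hypothesis is not rejected.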
https://www.scribbr.com/statistics/pearson-correlation-coefficient/
Imagine that you’re studying the relationship between newborns’ weight and length. You have the weights and lengths of the 10 babies born last month at your local hospital. After you convert the imperial measurements to metric, you enter the data in a table:

Weight (kg)    Length (cm)
3.63           53.1
3.02           49.7
3.82           48.4
3.42           54.2
3.59           54.9
2.87           43.7
3.03           47.2
3.46           45.2
3.36           54.4
3.3            50.4
Step 1: Calculate the sums of x and y
Start by renaming the variables to “x” and “y.” It doesn’t matter which variable is called x and which is called y: the formula will give the same answer either way.

Next, add up the values of x and y. (In the formula, this step is indicated by the Σ symbol, which means “take the sum of”.)
Example: Calculating the sums of x and y
Weight = x
Length = y

Σx = 3.63 + 3.02 + 3.82 + 3.42 + 3.59 + 2.87 + 3.03 + 3.46 + 3.36 + 3.30
Σx = 33.5

Σy = 53.1 + 49.7 + 48.4 + 54.2 + 54.9 + 43.7 + 47.2 + 45.2 + 54.4 + 50.4
Σy = 501.2
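The same sums can be checked in Python. The lists below hold the ten weights and lengths from the table; the variable names are illustrative.

```python
# Newborn data from the table: weight in kg (x) and length in cm (y).
x = [3.63, 3.02, 3.82, 3.42, 3.59, 2.87, 3.03, 3.46, 3.36, 3.30]
y = [53.1, 49.7, 48.4, 54.2, 54.9, 43.7, 47.2, 45.2, 54.4, 50.4]

sum_x = sum(x)   # Σx
sum_y = sum(y)   # Σy

# Rounding hides tiny floating-point errors in the printed values.
print(round(sum_x, 2), round(sum_y, 1))   # 33.5 501.2
```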
Step 2: Calculate x² and y² and their sums
Create two new columns that contain the squares of x and y. Take the sums of the new columns.

Step 3: Calculate the cross product and its sum
In a final column, multiply together x and y (this is called the cross product). Take the sum of the new column.
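Steps 2 and 3 amount to building three extra columns and summing each. A minimal Python sketch with the newborn data follows; the column names are illustrative.

```python
# Newborn data from the table: weight in kg (x) and length in cm (y).
x = [3.63, 3.02, 3.82, 3.42, 3.59, 2.87, 3.03, 3.46, 3.36, 3.30]
y = [53.1, 49.7, 48.4, 54.2, 54.9, 43.7, 47.2, 45.2, 54.4, 50.4]

x_squared = [xi ** 2 for xi in x]                    # Step 2: column of x²
y_squared = [yi ** 2 for yi in y]                    # Step 2: column of y²
cross_product = [xi * yi for xi, yi in zip(x, y)]    # Step 3: column of xy

sum_x2 = sum(x_squared)       # Σx²
sum_y2 = sum(y_squared)       # Σy²
sum_xy = sum(cross_product)   # Σxy

print(round(sum_x2, 4), round(sum_y2, 2), round(sum_xy, 3))
# 113.0432 25264.0 1684.121
```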
Step 4: Calculate r
Use the formula and the numbers you calculated in the previous steps to find r.
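Putting the sums together gives r. The sketch below assumes the computational formula r = [nΣxy − ΣxΣy] / √([nΣx² − (Σx)²][nΣy² − (Σy)²]), which matches the sums built in Steps 1 to 3; the helper name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson r from the sums of x, y, x², y² and xy (Steps 1 to 3)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v ** 2 for v in x)
    sum_y2 = sum(v ** 2 for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Newborn data from the table: weight in kg and length in cm.
weight = [3.63, 3.02, 3.82, 3.42, 3.59, 2.87, 3.03, 3.46, 3.36, 3.30]
length = [53.1, 49.7, 48.4, 54.2, 54.9, 43.7, 47.2, 45.2, 54.4, 50.4]

r = pearson_r(weight, length)
print(round(r, 2))   # 0.47
```

Swapping the two arguments gives the same value, in line with the note in Step 1 that it does not matter which variable is called x.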