Lesson 3
Regression analysis
MS102 – Lesson 3
Learning outcomes
At the end of the lesson, students must have:
▪ interpreted correlation and regression on various
datasets,
▪ calculated and interpreted the correlation and
regression, and
▪ performed data mining.
Pearson correlation
coefficient (r)
➢ The Pearson correlation coefficient is the most widely
used correlation coefficient and is known by many names:
✓ Pearson’s r
✓ Bivariate correlation
✓ Pearson product-moment correlation coefficient
(PPMCC)
✓ The correlation coefficient
Pearson correlation
coefficient (r)
➢ As a descriptive statistic, r summarizes the
characteristics of a dataset.
➢ Specifically, it describes the strength
and direction of the linear relationship
between two quantitative variables.
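For reference, one standard way to write the coefficient is given here. The slides do not state a formula at this point, so the notation is an assumption: x_i and y_i are the paired observations, and \bar{x} and \bar{y} are their means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator carries the direction (its sign), and dividing by the denominator scales the result to lie between -1 and 1.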
Pearson correlation coefficient (r) value    Strength    Direction
Greater than .5                              Strong      Positive
Between .3 and .5                            Moderate    Positive
Between 0 and .3                             Weak        Positive
0                                            None        None
Between 0 and –.3                            Weak        Negative
Between –.3 and –.5                          Moderate    Negative
Less than –.5                                Strong      Negative
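The rule-of-thumb table above can be sketched as a small lookup function. The name describe_r is hypothetical, and the table does not say which row a value of exactly .3 belongs to, so the code assumes it counts as moderate.

```python
def describe_r(r):
    """Return (strength, direction) for a Pearson r, following the table above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return ("None", "None")

    size = abs(r)
    if size > 0.5:
        strength = "Strong"
    elif size >= 0.3:  # assumption: exactly .3 is treated as moderate
        strength = "Moderate"
    else:
        strength = "Weak"

    direction = "Positive" if r > 0 else "Negative"
    return (strength, direction)


# Hypothetical values, chosen only to exercise each row of the table.
strong_positive = describe_r(0.72)
moderate_negative = describe_r(-0.4)
weak_positive = describe_r(0.1)
no_relationship = describe_r(0)
```

These cut-offs are a convention for reading r, not a statistical test; what counts as a "strong" correlation can differ between fields.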
Pearson correlation
coefficient (r)
➢ As an inferential statistic, r can also be
used to test statistical hypotheses.
➢ Specifically, we can test whether two variables
have a statistically significant linear relationship.
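One common way to carry out such a test, sketched here as an assumption since the slides do not name a procedure, is the t-test for the null hypothesis that the population correlation is zero. The statistic is t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The function name t_statistic and the sample values are hypothetical.

```python
import math


def t_statistic(r, n):
    """t statistic for testing H0: population correlation = 0.

    r is the sample Pearson correlation and n is the number of paired
    observations. The result is compared against a t distribution with
    n - 2 degrees of freedom.
    """
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical sample: r = 0.6 from n = 27 pairs.
t = t_statistic(0.6, 27)
degrees_of_freedom = 27 - 2
```

For this hypothetical sample t is 3.75 with 25 degrees of freedom. The two-tailed 5% critical value from a t table is about 2.06, so the relationship would be judged statistically significant at that level.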
Visualizing the Pearson
correlation coefficient
➢ Another way to think of the Pearson
correlation coefficient (r) is as a measure of
how close the observations are to a line of
best fit.
➢ When the slope is negative, r is negative.
➢ When the slope is positive, r is positive.
When the correlation coefficient r = 1, it indicates a perfect
positive linear relationship between two variables.
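The points above (the sign of r follows the sign of the slope, and r = 1 when every point lies on a rising line) can be checked with a short sketch. The function name slope_and_r and all three datasets are made up for illustration; the slope is the least-squares slope of y on x.

```python
import math


def slope_and_r(xs, ys):
    """Return (least-squares slope of y on x, Pearson r) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return slope, r


# Hypothetical datasets: rising, falling, and exactly on the line y = 2x + 1.
rising = slope_and_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
falling = slope_and_r([1, 2, 3, 4, 5], [9, 7, 6, 4, 2])
perfect = slope_and_r([1, 2, 3, 4], [3, 5, 7, 9])
```

Slope and r share the numerator sxy and their denominators are always positive, which is why the two can never differ in sign.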
Σx = 33.5
Σy = 501.2
Step 2: Calculate x² and y² and their sums
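The later steps of this worked example are not shown here. As a sketch of where such a calculation usually leads, the sums can be combined with the computational formula r = (nΣxy − ΣxΣy) / sqrt([nΣx² − (Σx)²][nΣy² − (Σy)²]). The use of Σxy and the function name pearson_r_from_sums are assumptions, and the data are hypothetical, not the dataset behind Σx = 33.5 and Σy = 501.2.

```python
import math


def pearson_r_from_sums(xs, ys):
    """Pearson r computed from raw sums, mirroring a hand calculation."""
    n = len(xs)
    # Sums of x and y.
    sum_x = sum(xs)
    sum_y = sum(ys)
    # Sums of x squared and y squared.
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    # Assumed further step: sum of the products xy.
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical data for illustration only.
r = pearson_r_from_sums([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For these made-up values the sums are Σx = 15, Σy = 20, Σx² = 55, Σy² = 86 and Σxy = 66, giving r = 30 / sqrt(1500), roughly 0.77.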