Correlation and Regression

CORRELATION AND REGRESSION
BY: MS NIPAS
CORRELATION
A correlation exists between two variables when one of them is related to the other in some way.

- Correlation is the degree of relationship or association (co-variation) between two variables.
CORRELATION
If the paired values of x and y are plotted and all the points lie exactly on a straight line, then there is a perfect correlation between the two variables.

In statistics, a graph that uses the rectangular coordinate system to locate the paired values of the two variables being investigated is known as a scatter diagram.
CORRELATION
Linear Correlation
Linear correlation is the study of the degree of relationship between two variables.

In linear correlation, this relationship is described through a linear equation of the form y = mx + c.
CORRELATION

Types of Linear Correlation


CORRELATION
Positive Linear Correlation
[Figure: three scatter diagrams of y against x: (a) Positive, (b) Strong positive, (c) Perfect positive]
CORRELATION
Negative Linear Correlation
[Figure: three scatter diagrams of y against x: (d) Negative, (e) Strong negative, (f) Perfect negative]
CORRELATION
No Linear Correlation
[Figure: two scatter diagrams of y against x: (g) No Correlation, (h) Nonlinear Correlation]
CORRELATION

STEPS IN SOLVING CORRELATION

1.) Determination of the Sample Coefficient of Correlation (r) for 2 variables
Parametric: Pearson (r)
Non-parametric: Spearman Rank (r_s)

CORRELATION

STEPS IN SOLVING CORRELATION

2.) Testing the Significance of the Relationship
CORRELATION

Properties of the Linear Correlation Coefficient r

1.) -1 ≤ r ≤ 1

2.) The value of r does not change if all values of either variable are converted to a different scale.
CORRELATION

Properties of the Linear Correlation Coefficient r

3.) The value of r is not affected by the choice of x and y. Interchange x and y and the value of r will not change.

4.) r measures the strength of a linear relationship.
CORRELATION

Pearson Product-Moment Correlation


Coefficient (Pearson r)

The Pearson r is a technique that is commonly used to determine the relationship between two sets of data. It is applicable when the data to be compared are measured on an interval or ratio scale.

CORRELATION

Pearson Product-Moment Correlation


Coefficient (Pearson r)

A parametric tool that can be used to determine the relationship or association between two variables.
It was first derived by the British statistician Karl Pearson.

CORRELATION – Pearson ( r )

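$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
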
CORRELATION – Pearson ( r )
Notations:
n = number of pairs of data presented

∑ denotes the addition of the items indicated.

∑x denotes the sum of all x values.

∑x² indicates that each x score should be squared and then those squares added.

(∑x)² indicates that the x scores should be added and the total then squared.

∑xy indicates that each x score should first be multiplied by its corresponding y score. After obtaining all such products, find their sum.

r represents the linear correlation coefficient for a sample.

ρ represents the linear correlation coefficient for a population.
CORRELATION – Pearson ( r )

Example. A group of nursing students surveyed 12 women to determine the relationship between their age and their systolic blood pressure. The table below shows the data they gathered:

Let x be the age
y be the systolic blood pressure
CORRELATION – Pearson ( r )

Women  Age (x)  BP (y)
1      55       148
2      41       125
3      71       159
4      36       117
5      62       148
6      46       128
7      54       149
8      48       144
9      37       115
10     41       140
11     67       151
12     59       154

a.) Can you determine the degree of relationship between the age and the systolic blood pressure of these women?
b.) Is this degree of relationship true for all women? (Use the 0.05 level of significance.)
CORRELATION – Pearson ( r )

The Scatter Diagram

[Figure: scatter diagram titled "Linear Correlation of Age and BP of the 12 Women", with AGE on the horizontal axis (0 to 80), BP on the vertical axis (0 to 180), and a trend line through the 12 data points]
CORRELATION – Pearson ( r )
Compute for r

Women  Age (x)  BP (y)  xy     x²     y²
1      55       148     8140   3025   21904
2      41       125     5125   1681   15625
3      71       159     11289  5041   25281
4      36       117     4212   1296   13689
5      62       148     9176   3844   21904
6      46       128     5888   2116   16384
7      54       149     8046   2916   22201
8      48       144     6912   2304   20736
9      37       115     4255   1369   13225
10     41       140     5740   1681   19600
11     67       151     10117  4489   22801
12     59       154     9086   3481   23716
∑      617      1678    87986  33243  237066
CORRELATION – Pearson ( r )
On your calculator, press the following:

(12 x 87986 - 617 x 1678) ÷ √((12 x 33243 - 617²) x (12 x 237066 - 1678²)) =

r = 0.89
Finding:
A correlation coefficient of +0.89 means that there is a high positive relationship between the age and the systolic blood pressure of the 12 women.
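The calculator steps can also be checked with a short script. This is a minimal sketch that uses only the 12 pairs from the example. The variable names are chosen here for illustration and are not part of the original slides.

```python
import math

# Age (x) and systolic BP (y) of the 12 women in the example.
age = [55, 41, 71, 36, 62, 46, 54, 48, 37, 41, 67, 59]
bp = [148, 125, 159, 117, 148, 128, 149, 144, 115, 140, 151, 154]

n = len(age)
sum_x = sum(age)
sum_y = sum(bp)
sum_xy = sum(x * y for x, y in zip(age, bp))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in bp)

# Raw-score form of the Pearson r: numerator over the square root of the
# product of the two bracketed terms.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 2))
```

The printed sums should match the ∑ row of the table, and the rounded r should match the value reported in the finding.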
CORRELATION – Pearson ( r )

“Note that it is not meaningful to predict BP in terms of age if r is not significant. What we can deduce initially is the degree of the relationship between these variables. To come up with a more meaningful conclusion, we have to test the significance of r.”
CORRELATION
The Significance Test for the Coefficient of Correlation

The correlation coefficient merely describes the direction and the magnitude of the relationship between the two variables in the sample on which it was calculated. But the true correlation (ρ) in the entire population from which the sample was taken can be higher, lower, zero, or even in the other direction.
CORRELATION
The Significance Test for the Coefficient of Correlation

Null Hypothesis
Ho: ρ = 0 (to mean that there is no correlation between x and y),
and the

Alternative Hypothesis
Ha: ρ ≠ 0 (to mean that there is a correlation between x and y)
CORRELATION
The Significance Test for the Coefficient of Correlation
The test statistic for the significance of the correlation coefficient r is given by the following methods:
Method 1:

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$ with df = n - 2, if n < 30
z = r … , if n > 30

or by Method 2 (uses fewer calculations)

CORRELATION
The Significance Test for the Coefficient of Correlation

Method 2: Comparison between the Pearson (r) critical value and the computed r (no degrees of freedom)

[Figure: number line for r from -1 to 1, marked at r = -crit. value, 0, and r = +crit. value. Reject ρ = 0 when r falls below -crit. value or above +crit. value; fail to reject ρ = 0 in between.]

Sample data:
r = ______
CORRELATION
The Significance Test for the Coefficient of Correlation

The Critical Regions / Areas of Rejection
Decision Rule:
Reject Ho if the computed r value is
a) < -crit. value or
b) > +crit. value.
Otherwise, do not reject Ho.
CORRELATION
Start
Let H0: ρ = 0
H1: ρ ≠ 0
Select a significance level α.

Determine the critical value.

METHOD 1
The test statistic is
$$t = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$$
Critical values of t are taken from the t table with n - 2 degrees of freedom.

METHOD 2
The test statistic is r.
Critical values of r are taken from the Pearson critical-value table.

If the absolute value of the test statistic exceeds the critical value, reject H0: ρ = 0. Otherwise, fail to reject H0.

If H0 is rejected, conclude that there is a significant linear correlation. If you fail to reject H0, then there is not sufficient evidence to conclude that there is a significant linear correlation.
CORRELATION
The Significance Test for the Coefficient of Correlation
Example: From the above example
Step 1: Ho: ρ = 0
(There is no significant relationship between the age and systolic blood pressure of women.)
Ha: ρ ≠ 0
(There is a significant relationship between the age and systolic blood pressure of women.)
CORRELATION
The Significance Test for the Coefficient of Correlation
Example: From the above example
Step 2: Level of significance α = 0.05.

Step 3: Method 1: Since n < 30, use the t-test.

The critical values of t with α = 0.05 and df = n - 2 = 12 - 2 = 10 are +2.228 and -2.228.
CORRELATION
The Significance Test for the Coefficient of Correlation

Step 4: The Critical Regions

Decision Rule:
Reject Ho if the computed t value is
a) < -2.228 or
b) > 2.228.
Otherwise, do not reject Ho.
[Figure: t distribution with rejection regions below -2.228 and above 2.228]
CORRELATION
The Significance Test for the Coefficient of Correlation
Step 5: Compute the t-value:
Method 1:
$$t = r\sqrt{\frac{n-2}{1-r^2}}$$
Method 2:
$$t = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$$
With r = 0.89 and n = 12:

t = 6.17

Sample r = 0.89
CORRELATION
The Significance Test for the Coefficient of Correlation

Step 6: Decision:
Since the computed t-value (6.17) is greater than the tabular t-value (2.228), Ho should be rejected.
Step 7: Interpretation:
Rejection of Ho means that there is a significant relationship between the age and systolic blood pressure of women, based on the sample of 12 women, using the 0.05 level of significance.

Note that in both methods Ho is rejected.
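
Steps 5 and 6 of the t-test can be sketched in Python as a check on the arithmetic. The function names `t_statistic` and `reject_h0` are hypothetical helpers introduced here. The critical value is passed in by hand because it is read from the t table, as in Step 3.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def reject_h0(t, critical_t):
    """Two-tailed rule: reject Ho when |t| exceeds the tabular critical value."""
    return abs(t) > critical_t

# Values from the worked example: r = 0.89, n = 12,
# tabular t = 2.228 for df = 10 at the 0.05 level.
t = t_statistic(0.89, 12)
decision = reject_h0(t, 2.228)

print(round(t, 2), decision)
```

Taking the absolute value of t covers both tails of the decision rule in one comparison.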


Spearman’s Rank Correlation
Coefficient
CORRELATION - Spearman’s Rank Correlation Coefficient

Spearman's rank correlation coefficient

• It is named after Charles Spearman.


• It is a non-parametric measure of statistical
dependence between two variables.
CORRELATION -Spearman’s Rank Correlation Coefficient

The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the ranked variables (ordinal data).
CORRELATION -Spearman’s Rank Correlation Coefficient

Example 2. A study was conducted to determine whether there is a significant relationship between the academic performance (GWAG) and the OBE awareness of freshman students.
*For the OBE level of awareness, a five-point response scale was used, in which 1 is the lowest and 5 is the highest.

The data are the following:


CORRELATION -Spearman’s Rank Correlation Coefficient
Let x be the Academic Performance
y be the Level of Awareness in OBE

Student  Academic Performance (GWAG)  Level of Awareness in OBE
A        1.50                         5
B        2.75                         3
C        3.0                          3
D        1.50                         4
E        2.00                         5
F        1.50                         5
G        1.0                          2
H        1.25                         1
I        1.75                         1
J        1.75                         4
CORRELATION -Spearman’s Rank Correlation Coefficient
Rank x column and y column
Student X Rank X Y Rank Y
A 1.50 7 5 9
B 2.75 2 3 4.5
C 3.0 1 3 4.5
D 1.50 7 4 6.5
E 2.00 3 5 9
F 1.50 7 5 9
G 1.0 10 2 3
H 1.25 9 1 1.5
I 1.75 4.5 1 1.5
J 1.75 4.5 4 6.5
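
The ranking step, including the averaging of tied ranks, can be sketched as follows. `average_ranks` is a hypothetical helper written for this illustration. It assumes the ordering used in the table above: x is ranked from the largest value (rank 1) to the smallest, and y from the smallest value (rank 1) to the largest.

```python
def average_ranks(values, descending=False):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end hold ranks start+1..end+1; tied values get their mean.
        mean_rank = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

# Students A to J, in the order of the data table.
gwag = [1.50, 2.75, 3.0, 1.50, 2.00, 1.50, 1.0, 1.25, 1.75, 1.75]
awareness = [5, 3, 3, 4, 5, 5, 2, 1, 1, 4]

rank_x = average_ranks(gwag, descending=True)
rank_y = average_ranks(awareness)
print(rank_x)
print(rank_y)
```

The three students with a GWAG of 1.50 occupy ranks 6, 7 and 8, so each receives the mean rank of 7.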
CORRELATION -Spearman’s Rank Correlation Coefficient
Determine the difference (D) between rank x and rank y, then square the difference.

Student  Rx   Ry   D    D²
A        7    9    2    4
B        2    4.5  2.5  6.25
C        1    4.5  3.5  12.25
D        7    6.5  0.5  0.25
E        3    9    6    36
F        7    9    2    4
G        10   3    7    49
H        9    1.5  7.5  56.25
I        4.5  1.5  3    9
J        4.5  6.5  2    4

Determine the sum of D²: ∑D² = 181
CORRELATION -Spearman’s Rank Correlation Coefficient
Compute for r_s:

$$r_s = 1 - \frac{6\sum D^2}{n(n^2-1)}$$

On your calculator, press the following:

1 - (6 x 181 ÷ (10 x (10² - 1))) = -0.097

Finding:
A correlation coefficient of -0.097 means that there is
a negligible negative relationship between academic
performance and level of awareness of the 10
freshmen students.
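
The computation of the Spearman coefficient can be checked with a few lines of Python. This sketch takes the ranks directly from the slides and applies the difference-based shortcut formula 1 - 6∑D² / (n(n² - 1)). The data contain tied ranks, for which this shortcut is generally treated as an approximation, so other software may report a slightly different value.

```python
# Ranks of students A to J, copied from the ranking table.
rank_x = [7, 2, 1, 7, 3, 7, 10, 9, 4.5, 4.5]
rank_y = [9, 4.5, 4.5, 6.5, 9, 9, 3, 1.5, 1.5, 6.5]

n = len(rank_x)

# Sum of squared rank differences; the sign of D does not matter once squared.
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))
```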
CORRELATION -Spearman’s Rank Correlation Coefficient
Test the Significance of r_s

Step 1: Ho: ρ = 0
(There is no significant relationship between the academic performance and level of awareness in OBE of all freshmen.)
Ha: ρ ≠ 0
(There is a significant relationship between the academic performance and level of awareness in OBE of all freshmen.)
CORRELATION
The Significance Test for the Coefficient of Correlation

Step 2: Level of significance α = 0.05.

Step 3: Method 1: Since n < 30, use the t-test.

The critical values of t with α = 0.05 and df = n - 2 = 10 - 2 = 8 are +2.3060 and -2.3060.
CORRELATION
The Significance Test for the Coefficient of Correlation

Step 4: The Critical Regions

Decision Rule:
Reject Ho if the computed t value is
a) < -2.3060 or
b) > 2.3060.
Otherwise, do not reject Ho.
[Figure: t distribution with rejection regions below -2.3060 and above 2.3060]
CORRELATION
The Significance Test for the Coefficient of Correlation
Step 5: Compute the t-value:
Method 1:
$$t = r_s\sqrt{\frac{n-2}{1-r_s^2}} = -0.097\sqrt{\frac{10-2}{1-(-0.097)^2}}$$

t = -0.2756

CORRELATION
The Significance Test for the Coefficient of Correlation

Step 6: Decision:
Since the computed t-value (-0.2756) lies between the tabular t-values (-2.3060 and 2.3060), Ho should not be rejected.
Step 7: Interpretation:
Non-rejection of Ho means that there is no significant relationship between the academic performance and the level of awareness in OBE of all freshmen, based on a sample of 10 freshmen, using the 0.05 level of significance.
REGRESSION

Regression analysis is a methodology that is used to make predictions.

One of the main purposes of regression analysis is to develop a statistical model that can be used to predict the value of the dependent variable based on the values of the independent variable.

REGRESSION
Regression Equation:

y = a + bx
Where:
a = y-intercept (the ordinate of the point where the regression line crosses the y-axis)

REGRESSION
Regression Equation:

y = a + bx
Where:
b = beta weight or the slope of the line

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$
REGRESSION
Guidelines for Using The
Regression Equation
1. If there is no significant linear
correlation, don’t use the regression
equation to make predictions.

2. When using the regression equation


for predictions, stay within the scope of
the available sample data.
REGRESSION
Guidelines for Using The
Regression Equation

3. A regression equation based on old data is not necessarily valid now.

4. Don’t make predictions about a population that is different from the population from which the sample data was drawn.
REGRESSION
Example
Determine the simple linear regression equation for the age and systolic blood pressure of the women.

Women  Age (x)  BP (y)
1      55       148
2      41       125
3      71       159
4      36       117
5      62       148
6      46       128
7      54       149
8      48       144
9      37       115
10     41       140
11     67       151
12     59       154
REGRESSION

Compute for a and b

Women  Age (x)  BP (y)  xy     x²     y²
1      55       148     8140   3025   21904
2      41       125     5125   1681   15625
3      71       159     11289  5041   25281
4      36       117     4212   1296   13689
5      62       148     9176   3844   21904
6      46       128     5888   2116   16384
7      54       149     8046   2916   22201
8      48       144     6912   2304   20736
9      37       115     4255   1369   13225
10     41       140     5740   1681   19600
11     67       151     10117  4489   22801
12     59       154     9086   3481   23716
∑      617      1678    87986  33243  237066
REGRESSION

a = 81.9878
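REGRESSION

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{12(87986) - (617)(1678)}{12(33243) - (617)^2}$$

= 1.1250
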
REGRESSION
Then the Regression Equation is

y = 81.9878 + 1.1250x

Using the equation above, if a woman is 49 years old, then her predicted systolic blood pressure is approximately 137.
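
The regression coefficients and the prediction can be reproduced with the sketch below. Two points are assumptions rather than content from the slides. First, the intercept is computed as a = (∑y - b∑x) / n, a standard least-squares identity, because the slides show only the value of a. Second, `predict_bp` is a hypothetical helper that refuses ages outside the sampled range, following guideline 2.

```python
# Age (x) and systolic BP (y) of the 12 women in the example.
age = [55, 41, 71, 36, 62, 46, 54, 48, 37, 41, 67, 59]
bp = [148, 125, 159, 117, 148, 128, 149, 144, 115, 140, 151, 154]

n = len(age)
sum_x, sum_y = sum(age), sum(bp)
sum_xy = sum(x * y for x, y in zip(age, bp))
sum_x2 = sum(x * x for x in age)

# Slope, then intercept (intercept formula is an assumption; see lead-in).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

def predict_bp(x):
    """Predict systolic BP from age, staying within the sampled age range."""
    if not min(age) <= x <= max(age):
        raise ValueError("age is outside the range of the sample data")
    return a + b * x

predicted = predict_bp(49)
print(round(a, 4), round(b, 4), round(predicted))
```

Raising an error for out-of-range ages is one possible design. A warning would also satisfy the guideline while still returning a value.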
Thank You
