Correlation and Simple Regression

The document discusses correlation and simple linear regression. It defines the Pearson correlation coefficient 'r' as a measure of the linear relationship between two variables. The value of r ranges from -1 to 1, where -1 is a perfect negative correlation, 0 is no correlation, and 1 is a perfect positive correlation. Simple linear regression uses the correlation coefficient r to predict the value of a dependent variable y given the independent variable x through the equation y = bx + a, where b is the slope and a is the y-intercept. The document provides examples of calculating r and using a linear regression equation to make a prediction.

Uploaded by

Marjorie Hou

Correlation and Simple Regression

Pearson Product Moment Coefficient of Correlation


→ an index of relationship between two variables
→ x = independent variable, y = dependent variable
→ the value of r ranges from -1 to +1

if r = +1 or -1, there is a perfect correlation

if r = 0, there is no linear correlation between x and y

r > 0 (up to r = +1)

if the trend of the line graph is going upward, the value of r is positive

this indicates that as the value of x increases the value of y also increases

x and y being positively correlated [direct]

ex. as the waistline gets bigger, weight also tends to increase

r < 0 (down to r = -1)

if the trend of the line graph is going downward, the value of r is negative

indicates that as the value of x increases the corresponding value of y decreases

x and y being negatively correlated [inverse]

as x increases, y decreases

r = 0

if the trend of the line graph cannot be established either upward or downward, then r = 0

indicating that there is no correlation between the x and y variables

Why do we use ‘r’?


→ to analyze if a relationship exists between two variables
→ if a relationship exists between x and y, we can determine the extent to which x influences y using the coefficient of determination, which is equal to the square of r multiplied by 100%
→ this explains how much the independent variable influences the dependent variable, or how much y depends on x (the percentage of the variation in y accounted for by x)
→ this is the degree of relationship between x and y, which cannot be seen in other statistical tests of relationship
→ a more powerful test of relationship compared with nonparametric tests
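
As a rough worked illustration of the coefficient of determination, take r = 0.949, the value computed for the midterm and final grades example later in these notes:

$$r^2 \times 100\% = (0.949)^2 \times 100\% \approx 90.1\%$$

Read this way, about 90% of the variation in y is accounted for by x in that example.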

When do we use ‘r’?

→ the value of r ranges from +1 through zero to -1
→ there is a perfect positive correlation if r = +1; likewise, there is a perfect negative correlation if r = -1
→ however, if r = 0, there is no correlation between the two variables x and y
→ positive correlation: as x increases, y also increases, and vice versa
→ negative correlation: as x decreases, y increases, and vice versa

💡 - ‘r’ tells you whether or not there is a relationship between x and y

- the coefficient of determination tells you how much x depends on y or how much y depends on x (square r and multiply by 100%)

Formula for ‘r’

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

where:
r = Pearson product moment coefficient of correlation
n = sample size
Σxy = sum of the products of x and y
ΣxΣy = product of the sum of x and the sum of y
Σx² = sum of squares of x
Σy² = sum of squares of y

example—
x - 75 70 65 90 85 85 80 70 65 90

y - 80 75 65 95 90 85 90 75 70 90
solve for ‘r’

using the formula, the answer is:
r = 0.949
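
The computation can be sketched in Python as a check on the hand calculation. This is a minimal sketch using only the standard library; the function name `pearson_r` is a hypothetical helper, not something from the notes. It applies the raw-sums formula for r, with the square root taken over the product in the denominator. For this data the intermediate sums are Σx = 775, Σy = 815, Σxy = 64000, Σx² = 60925, and Σy² = 67325.

```python
from math import sqrt

# Midterm (x) and final (y) grades from the example above
x = [75, 70, 65, 90, 85, 85, 80, 70, 65, 90]
y = [80, 75, 65, 95, 90, 85, 90, 75, 70, 90]


def pearson_r(xs, ys):
    """Pearson r via the raw-sums (computational) formula. Hypothetical helper."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(round(r, 3))  # 0.949
```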

Solving by Stepwise Method


step 1—problem
step 2—hypotheses

step 3—level of significance


step 4—test statistics / computation
step 5—decision rule
step 6—conclusion / implication

example—
below are the midterm (x) and final (y) grades

x - 75 70 65 90 85 85 80 70 65 90
y - 80 75 65 95 90 85 90 75 70 90

step 1 - problem
→ is there a significant relationship between the midterm and the final grades of 10 students in Mathematics?
step 2 - hypotheses
→ Ho = there IS NO significant relationship between the midterm and the final grades of 10 students in Mathematics

→ Ha = there IS a significant relationship between the midterm and the final grades of 10 students in Mathematics
step 3 - level of significance
→ n = 10
→ α = 0.05
→ df = n - 2 = 8
→ tabular value: r0.05 = 0.632
step 4 - test statistic / computation

→ r = 0.949
step 5 - decision rule
→ if the computed r value is greater than the tabular value, disconfirm Ho

0.949 > 0.632 (tabular value at 0.05 level of significance with 8 degrees of freedom)

null hypothesis [Ho] is disconfirmed

step 6 - conclusion / implication


→ there is a significant relationship between the midterm and the final grades of 10 students in mathematics
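
Steps 3 to 5 can be sketched in Python as follows. The notes read the tabular value 0.632 from a table of critical r values; the sketch instead assumes the two-tailed t-table value 2.306 (α = 0.05, df = 8) and converts it to a critical r, which is one common way such tables are derived. The variable names are illustrative only.

```python
from math import sqrt

n = 10
df = n - 2        # degrees of freedom for a correlation test
r = 0.949         # computed r from step 4
t_crit = 2.306    # assumed: two-tailed t-table value at alpha = 0.05, df = 8

# Critical r implied by the critical t value
r_crit = t_crit / sqrt(t_crit ** 2 + df)

# Equivalent t statistic for the computed r
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)

# Decision rule: reject Ho when the computed value exceeds the tabular value
reject_h0 = abs(r) > r_crit
print(round(r_crit, 3), reject_h0)  # 0.632 True
```

Comparing `t_stat` with `t_crit` leads to the same decision as comparing r with the tabular r.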

Simple Linear Regression Analysis
→ predicts the value of y given the value of x

Why do we use simple linear regression?


→ we are interested in predicting the value of y, the dependent variable; this is used for forecasting and prediction

When do we use simple linear regression?

→ when there is a relationship between x and y variables


→ the data should be normally distributed and measured at the interval or ratio level

Formula

y = bx + a

to get b:

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$$

to get a:

$$a = \bar{y} - b\bar{x}$$

where:
y = dependent variable
x = independent variable
a = y-intercept
b = slope of the line
ȳ = average of y
x̄ = average of x

example—

below are the midterm (x) and final (y) grades


x - 75 70 65 90 85 85 80 70 65 90

y - 80 75 65 95 90 85 90 75 70 90
suppose the midterm grade is x = 88, what is the predicted value of the final grade?

r = 0.949

x̄ = 77.5

ȳ = 81.5
b = 0.971

a = 6.25

y = 0.971(88) + 6.25 = 91.7, or a final grade of about 92
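
The same prediction can be sketched in Python using only the standard library. The helper name `fit_line` is hypothetical; it computes the slope b from the raw sums and then the intercept a from the two means, following the formulas given above.

```python
# Midterm (x) and final (y) grades from the example above
x = [75, 70, 65, 90, 85, 85, 80, 70, 65, 90]
y = [80, 75, 65, 95, 90, 85, 90, 75, 70, 90]


def fit_line(xs, ys):
    """Return (a, b) for the line y = bx + a. Hypothetical helper."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(p * q for p, q in zip(xs, ys))
    sum_x2 = sum(p * p for p in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * (sum_x / n)                                # y-intercept
    return a, b


a, b = fit_line(x, y)
predicted = b * 88 + a  # predicted final grade for a midterm grade of 88
print(round(b, 3), round(a, 2), round(predicted, 1))  # 0.971 6.25 91.7
```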

💡 Ho: μ1 = μ2 = μ3 = μ4 [stated in the negative because NO DIFFERENCE; all the same]

Ha: μ1 ≠ μ2 ≠ μ3 ≠ μ4 [stated in the positive because there IS a difference]

Before [if doing it manually, coming from the center]

computed value ≥ tabular value: reject Ho

Now [if with the help of a computer, coming from one side]

p-value ≤ alpha value: reject Ho
