RS4 Lecture 1 Overview and Correlation Slides 2023
• Completely anonymous
Overview of Research Skills 4
Learning Objectives
The module has two related components:
1) Two practicals
Hand-in dates
• Practical 1: Tues 7 March 2023 (noon)
• Practical 2: Fri 5 May 2023 (noon)
3 Multiple regression II
4 Preparing for your Major Project
5 Samples and Populations
8 Factor analysis
9 More advanced techniques: power analysis and meta-analysis
The British Psychological Society
[Figure: scatterplot, r = .93]
“X predicts Y”
[Figure: scatterplot with fitted line Y = a + b*X]
What if we tested “X causes Y” empirically?
# people waiting for the train on platform
Correlation (association) versus Causation
Source
Correlational Research Design
1 F 18 135 8
2 M 19 120 4
3 M 25 100 8
4 F 21 105 3
The basic correlational design is one in which the researcher measures two (X,Y) or more (Z)
different variables at the same time using a single group of cases/participants
These variables might be age, IQ (Y), and self-esteem (X)
It investigates whether or not there is a correlation (a statistical association) between the
chosen variables.
n of the relationship between 2 variables X and Y
Positive correlation between two variables
Negative correlation between two variables
Near-zero correlation between two variables
Influence of outliers on a correlation
What is a correlation coefficient?
• It is the result of a statistical test measuring the extent to which two variables X and Y are statistically related
• It is a standardised statistical index describing that relationship
[Figure: scale of correlation coefficient values from 0.0 down to -1.0 in steps of 0.1]
0 = no relationship
Types of Correlation Coefficients
Categorical
Correlation X,Y
Two binary φ
Basic Questions
Which association test? It depends on the variables’ scales
Variable Composition | Appropriate Tests
Two numerical variables | Correlation coefficient (e.g. Pearson, Spearman) or regression
Two categorical variables | Contingency tables (e.g. Chi-square, Fisher or McNemar tests)
One categorical IV and one numerical DV | t-test (two levels) or ANOVA (three or more levels)
Andy Field
IF there were a relationship between these 2 variables X and Y, then as one variable deviates
from its mean, the other variable should deviate from its mean in the same or opposite way
(from Field 8.2) The more they covary and change in a similar way the more covariance
there will be.
Participant 1 2 3 4 5 Mean SD
Advertisements watched (X) 5 4 4 6 8 5.4 1.67
Toffee packets bought (Y) 8 9 10 13 15 11.0 2.92
Mean of Y = 11
Mean of X = 5.4
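As a minimal sketch of the covariance idea described above, the following Python works through the five participants in the table: it takes each score's deviation from its mean, averages the cross-products to get the covariance, and standardises by the two SDs to get Pearson's r. The function names are illustrative only, and dividing by N - 1 is an assumption, chosen because it reproduces the SDs shown in the table.

```python
from math import sqrt

ads = [5, 4, 4, 6, 8]        # Advertisements watched (X)
toffee = [8, 9, 10, 13, 15]  # Toffee packets bought (Y)


def mean(values):
    return sum(values) / len(values)


def sample_covariance(x, y):
    """Sum of cross-product deviations divided by N - 1 (assumed divisor)."""
    mx, my = mean(x), mean(y)
    cross_products = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cross_products / (len(x) - 1)


def sample_sd(values):
    # The variance is a variable's covariance with itself; the SD is its root.
    return sqrt(sample_covariance(values, values))


def pearson_r(x, y):
    # Standardise the covariance by the product of the two SDs.
    return sample_covariance(x, y) / (sample_sd(x) * sample_sd(y))


cov_xy = sample_covariance(ads, toffee)
r_xy = pearson_r(ads, toffee)
print(f"covariance = {cov_xy:.2f}")  # covariance = 4.25
print(f"r = {r_xy:.3f}")             # r = 0.871
```

Because r divides the covariance by both SDs, it does not depend on the units in which X and Y were measured, unlike the covariance itself.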
Example of a scatterplot and line of best fit
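A line of best fit of the form Y = a + b*X can be sketched with ordinary least squares. The example below reuses the advertisements and toffee data from the table; the function name `fit_line` is hypothetical, and least squares is assumed as the fitting method since the slides do not name one.

```python
ads = [5, 4, 4, 6, 8]        # Advertisements watched (X)
toffee = [8, 9, 10, 13, 15]  # Toffee packets bought (Y)


def fit_line(x, y):
    """Return intercept a and slope b of the least-squares line Y = a + b*X."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx    # slope: change in Y per unit change in X
    a = my - b * mx  # intercept: the line passes through (mean X, mean Y)
    return a, b


a, b = fit_line(ads, toffee)
print(f"Y = {a:.2f} + {b:.2f} * X")  # Y = 2.80 + 1.52 * X
```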
Correlation – how to
Interpretation of Pearson’s r (SPSS Output)
r = .1 (small effect)
r = .3 (medium effect)
r = .5 (large effect)
Coefficient of determination R²
R²: variance in one variable shared by the other.
For example, a coefficient of r = 0.6 indicates that 36% of the variance of X and Y is shared
[Figure: overlap between X and Y]
r = 0.1: 1%
r = 0.3: 9%
r = 0.5: 25%
r = 0.9: 81%
More overlap = More shared variance
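The percentages above come from squaring r. A short sketch (the helper name `shared_variance_percent` is illustrative only):

```python
def shared_variance_percent(r):
    """Coefficient of determination expressed as a percentage."""
    return r ** 2 * 100


for r in (0.1, 0.3, 0.5, 0.6, 0.9):
    print(f"r = {r:.1f} -> {shared_variance_percent(r):.0f}% shared variance")
```

Because r is squared, a negative correlation of the same size shares the same proportion of variance: r = -0.6 also gives 36%.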
Point-Biserial correlation: Analysis and Interpretation
Use when one of the variables is a dichotomous variable (categorical: nominal with only two categories)
• Is someone a student?
• Treatment / control
and the other variable is continuous.
Calculation is simple – Same as Pearson’s in SPSS
Example: The relationship between gender (X) and estimation of own IQ (Y)
code gender (1=male, 2=female) own IQ
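Since the point-biserial correlation is calculated the same way as Pearson's r, it can be sketched by running an ordinary Pearson formula on the numeric group codes. The scores below are made-up illustration values, not data from the lecture, and `pearson_r` is a hand-written helper rather than any package's function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical illustration data (NOT from the lecture):
group = [1, 1, 1, 2, 2, 2]                # dichotomous variable coded 1 / 2
score = [110, 120, 115, 100, 105, 112]    # continuous variable

r_coded_12 = pearson_r(group, score)
r_coded_01 = pearson_r([g - 1 for g in group], score)  # same groups coded 0 / 1
r_flipped = pearson_r([3 - g for g in group], score)   # codes swapped (1 <-> 2)

print(f"coded 1/2: r = {r_coded_12:.3f}")
print(f"coded 0/1: r = {r_coded_01:.3f}")
print(f"flipped:   r = {r_flipped:.3f}")
```

The two coding schemes give the same r, and swapping which category gets the higher code only reverses the sign. The sign of a point-biserial r therefore depends on the coding, so interpret it with the codes in mind.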
Kendall’s tau-b
2x2 design
Categorical data
Frequencies: How many?
Standardization: it’s important that our measurement tool is tested for its:
• Reliability
• Validity
[Figure: three IQ measurement panels labelled Unreliable, Reliable, and Reliable + Valid]
Exercises
Survey.sav
As perceived stress decreases, so does the amount of control over internal states
Interpretation: Is the r statistically significant?
What statistical analysis? childaggression.sav