Chap 15
THE DEGREE OF
RELATIONSHIP BETWEEN
VARIABLES
CORRELATION
• Correlation examines the strength of a
connection between two characteristics
belonging to the same individual, event
or piece of equipment
• The concept of correlation does not
include the proposition that one thing is
the cause and the other the effect
• We merely say that two things are
systematically connected
RELATIONSHIPS
• Two variables can be positively correlated - an increase
in one variable coincides with an increase in another
variable, e.g. the more electricity used the higher the
power bill.
Fig 15.1(a) Perfect positive correlation +1.00
Perfect Negative Correlation
Fig 15.1(b) Perfect inverse (negative) correlation -1.00
High Positive Correlation
Fig 15.1(c) Positive correlation +0.867
Moderate Negative Correlation
Fig 15.1(d) Negative correlation -0.576
Random Correlation
Fig 15.1(e) Random correlation -0.079
DIRECTION AND STRENGTH OF
CORRELATION COEFFICIENTS
2. Select your variables (e.g. current salary, start salary, age and
number of days absent in 2006) and click the arrow button which
places them in the Variables box.
5. Should you wish, you can also obtain Means and Standard
Deviations by clicking Options then selecting these items in the
Statistics box.
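Outside SPSS, the same quantities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the five salary pairs are invented for the example (they are not the chapter's data set), and the standard deviation uses the sample (n - 1) formula.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical salaries in $1000's, invented for illustration only.
start_salary = [20, 25, 30, 35, 40]
current_salary = [42, 50, 58, 71, 79]

r = pearson_r(start_salary, current_salary)
print(f"mean start = {mean(start_salary):.1f}, SD = {sample_sd(start_salary):.2f}")
print(f"mean current = {mean(current_salary):.1f}, SD = {sample_sd(current_salary):.2f}")
print(f"Pearson r = {r:.3f}")
```

For these made-up values the coefficient is very close to +1, because current salary rises almost in step with starting salary.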
Pearson Correlation SPSS
• Additionally, you should create a
scattergraph to display the relationship
visually.
• It lets you see whether you have a linear
or nonlinear relationship. Correlation
assumes a linear or closely linear
relationship.
• If the scattergraph shows a curvilinear
shape then you cannot use Pearson.
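A small sketch shows why a curvilinear pattern defeats Pearson's coefficient. The data are invented for the illustration: Y is completely determined by X (Y = X squared), yet the linear coefficient comes out as zero because the rising and falling halves of the curve cancel.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical U-shaped relationship: y depends entirely on x, but not linearly.
x = list(range(-5, 6))
y = [value ** 2 for value in x]

r_curved = pearson_r(x, y)
print(f"Pearson r for the U-shaped data = {r_curved:.3f}")
```

A scattergraph of these points would show the U shape at once, which is why the plot should be inspected before the coefficient is trusted.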
PRODUCING A SCATTERGRAPH
• 1. Click Graph and then Scatter.
• 2. Simple is the default mode and will provide a
single scattergraph for one pair of variables.
Click Define. (If you wish to obtain a number of
scattergraphs simultaneously click on Matrix
then Define.)
• 3. Move two variables (e.g. current and starting
salary) into the box with the arrow button, then
select OK to produce the scattergraph
• 4. With a correlation it does not really matter
which variable you place on which axis
Scatterplot using SPSS
Each point on the scatterplot represents the intersection for a pair
of data. The Y axis represents values of the dependent variable
(annual current salary) and the X axis the independent (annual
starting salary). Approximate linearity is demonstrated.
[Scatterplot: annual current salary in $1000's (Y axis) against annual starting salary in $1000's (X axis)]
[Figure: level of performance (Y axis) against stress levels (X axis, low to high), with the negative correlation area marked]
Output Inter-correlation Table
Pearson Correlations

                                                        no. of days        annual starting    annual current     age
                                                        absent in 2006     salary in $1000's  salary in $1000's
no. of days absent in 2006         Pearson Correlation  1                  .070               .027               -.148
                                   Sig. (2-tailed)      .                  .666               .866               .364
                                   N                    40                 40                 40                 40
annual starting salary in $1000's  Pearson Correlation  .070               1                  .735(**)           .056
                                   Sig. (2-tailed)      .666               .                  .000               .733
                                   N                    40                 40                 40                 40
annual current salary in $1000's   Pearson Correlation  .027               .735(**)           1                  .354(*)
                                   Sig. (2-tailed)      .866               .000               .                  .025
                                   N                    40                 40                 40                 40
Spearman's rho

                                          RANK of SALNOW   RANK of SALES3
RANK of SALNOW   Correlation Coefficient  1.000            .010
                 Sig. (2-tailed)          .                .953
                 N                        40               40
RANK of SALES3   Correlation Coefficient  .010             1.000
                 Sig. (2-tailed)          .953             .
                 N                        40               40
INTERPRETATION OF SPEARMAN’S RHO
PRINTOUT
e.g. r² = .985² = .970
COEFFICIENT OF DETERMINATION
[Figure: circles showing the total variance of X and the total variance of Y; where the X and Y circles overlap, the shared variance is 64%]
• If the correlation (r) between variable X and
variable Y = 1, then the coefficient of
determination = 1² = 1, or 1 × 100 = 100%. 100%
of the factors accounting for variability are
common to both variables.
[Figure: the X and Y circles overlap completely; shared variance 100%]
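The arithmetic behind the coefficient of determination is just squaring r and expressing the result as a percentage. The sketch below uses hypothetical function names. The values .985 and 1 come from the examples above, while r = .8 is included on the assumption that it is the correlation behind a 64% shared variance (.8 squared is .64).

```python
def coefficient_of_determination(r):
    """Square the correlation coefficient to get the proportion of shared variance."""
    return r * r

def shared_variance_percent(r):
    """Shared variance expressed as a percentage."""
    return coefficient_of_determination(r) * 100

for r in (0.985, 1.0, 0.8):
    print(f"r = {r:.3f} -> r squared = {coefficient_of_determination(r):.3f} "
          f"-> shared variance = {shared_variance_percent(r):.1f}%")
```

Because r is squared, a negative correlation of the same size gives the same shared variance: r = -.8 also corresponds to 64%.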
PROBLEMS IN INTERPRETING A
CORRELATION
• Whenever a correlation is computed from
values that do not represent the full range
of the distribution caution must be used in
interpreting it.
• A high positive correlation can be
obscured if only a limited range of data is
obtained (following slide)
• Restricting the range can also inflate an
overall zero ‘r’ to a positive value
(following slide)
Restricted Range producing Zero ‘r’
Restricting Range to Produce +ve ‘r’
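The first of these effects can be reproduced with a small sketch. The data are invented for the illustration: Y rises steadily with X across the full range, with a fixed wobble of plus or minus 3 added. Over the whole range the correlation is high, but if only a narrow middle slice is sampled the wobble dominates and the correlation all but disappears.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y follows x, plus 3 for even x and minus 3 for odd x.
x_full = list(range(20))
y_full = [x + (3 if x % 2 == 0 else -3) for x in x_full]
full_r = pearson_r(x_full, y_full)

# Restrict the range: keep only the cases with x between 8 and 11.
pairs = [(x, y) for x, y in zip(x_full, y_full) if 8 <= x <= 11]
x_mid = [p[0] for p in pairs]
y_mid = [p[1] for p in pairs]
restricted_r = pearson_r(x_mid, y_mid)

print(f"full range r = {full_r:.2f}")
print(f"restricted range r = {restricted_r:.2f}")
```

With these made-up numbers r is roughly +0.88 over the full range but close to zero (about -0.08) in the restricted slice, even though the underlying relationship has not changed.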
PARTIAL CORRELATION
• Partialling out a variable is used when you wish
to eliminate the influence of a third variable on
the correlation between two other variables. It
simply means controlling the influence of that
variable.
• Other terms for partialling out are ‘holding
constant’, and ‘controlling for’.
• The partial correlation coefficient, like
other correlations, takes values between -1 and
+1. It is essentially an ordinary bivariate
correlation, except that a third variable is
being controlled for.
EXAMPLE OF PARTIAL CORRELATION
• We will illustrate this using Chap 13 data set with
data from 40 employees on their age, their
current salary, and their starting salary.
            STARTSAL    SALNOW
STARTSAL    1.0000      .7662
            (    0)     (   37)
            P= .        P= .000

These correlations now reflect the partialling out of age. Note the
correlation has increased marginally from +0.735 to +0.766.
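The partialled figure can be checked by hand with the standard first-order partial correlation formula, which the slides themselves do not show: r(xy.z) = (r(xy) - r(xz) × r(yz)) / √((1 - r(xz)²)(1 - r(yz)²)). The sketch below is a hypothetical helper, not SPSS's procedure. It takes the three bivariate correlations from the Pearson output table (starting salary with current salary .735, starting salary with age .056, current salary with age .354).

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Bivariate correlations as printed (rounded to three decimals) in the Pearson table.
r_start_now = 0.735   # starting salary with current salary
r_start_age = 0.056   # starting salary with age
r_now_age = 0.354     # current salary with age

r_partial = partial_r(r_start_now, r_start_age, r_now_age)
print(f"partial r (age held constant) = {r_partial:.3f}")
```

Using the rounded table values gives about .766, in line with the .7662 in the printout. The 37 in brackets is presumably the degrees of freedom, N - 3 = 40 - 3.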