
Spearman and Kendalls Tau B

Spearman's rank-order correlation (SROC) is a non-parametric measure of the strength and direction of the monotonic relationship between two ordinal or quantitative variables. It can be used as an alternative to Pearson's product-moment correlation coefficient (PPMC) when the assumptions of PPMC are violated, such as non-normality. SROC calculates the correlation based on the ranked values of each variable rather than the raw values. It ranges from -1 to 1, with the magnitude indicating the strength of the monotonic relationship and the sign indicating the direction.

Spearman’s rank-order correlation (SROC)

§ It is a non-parametric measure of the strength and direction of the
monotonic¹/linear relationship between two variables measured at least on an
ordinal scale (so interval and ratio variables are also included).
o 𝑥 is ordinal and 𝑦 is interval/ratio, or vice versa
o Both are ordinal
§ It can also be used when both 𝑥 and 𝑦 are interval/ratio variables that were
meant to be analyzed by PPMC, but one or more assumptions of PPMC are violated.

For example, the two variables below comply with the data and linearity
assumptions of PPMC. Both variables have outliers, but the results remain
statistically significant with or without them, so the outliers are not
influential. However, based on the 𝑧-scores of skewness and kurtosis, and as
confirmed by the Shapiro-Wilk test, the normality assumption is violated.
In this case, SROC can be used.

Descriptives

                        X          Y
N                       16         16
Missing                 0          0
Mean                    137.500    1000.000
Median                  85.000     425.000
Standard deviation      234.023    2409.703
IQR                     75.000     375.000
Minimum                 10         50
Maximum                 1000       10000
Skewness                3.769      3.945
Std. error skewness     0.564      0.564
Kurtosis                14.701     15.689
Std. error kurtosis     1.091      1.091
Shapiro-Wilk W          0.446      0.354
Shapiro-Wilk p          < .001     < .001
25th percentile         47.500     237.500
50th percentile         85.000     425.000
75th percentile         122.500    612.500
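
The 𝑧-scores mentioned above can be obtained by dividing each statistic by its standard error. The snippet below is a minimal sketch using the values from the Descriptives table; the cutoff of 1.96 is a commonly used convention assumed here for illustration (it is not stated in this handout), and the names `stats`, `z_score` and `Z_CUTOFF` are hypothetical.

```python
# Skewness and kurtosis with their standard errors, copied from the Descriptives table.
stats = {
    "X": {"skewness": 3.769, "se_skewness": 0.564, "kurtosis": 14.701, "se_kurtosis": 1.091},
    "Y": {"skewness": 3.945, "se_skewness": 0.564, "kurtosis": 15.689, "se_kurtosis": 1.091},
}

Z_CUTOFF = 1.96  # assumed conventional cutoff; not taken from the handout


def z_score(statistic, standard_error):
    """Divide a statistic by its standard error."""
    return statistic / standard_error


z_scores = {}
for name, s in stats.items():
    z_scores[name] = {
        "skewness": z_score(s["skewness"], s["se_skewness"]),
        "kurtosis": z_score(s["kurtosis"], s["se_kurtosis"]),
    }

for name, z in z_scores.items():
    for measure, value in z.items():
        verdict = "exceeds" if abs(value) > Z_CUTOFF else "is within"
        print(f"{name} {measure}: z = {value:.2f} ({verdict} the assumed cutoff of {Z_CUTOFF})")
```

Under this assumed cutoff, all four 𝑧-scores are far beyond the threshold, which agrees with the Shapiro-Wilk results in the table.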

¹ Monotonic means that the two variables move in the same or in opposite directions, but not necessarily at the same rate or in a linear
manner; note that not all monotonic relationships are linear.
§ Features:
o Range: from −1 to +1.
o Sign: indicates the direction of the relationship (which can be linear or
monotonic).
o Magnitude: indicates the strength of the relationship, interpreted in the same
way as for PPMC. The coefficient is also symmetric.
o Advantages: 1) it does not assume that the underlying relationship
between 𝑥 and 𝑦 is linear; 2) it makes no assumption of normality
regarding the distributions of 𝑥 and 𝑦.
§ Assumptions:
1. Level of measurement of the variables should be at least ordinal. Why is
ordinality a requisite? The formula for Spearman’s is provided below.

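$$r_s \text{ or } \rho = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \dfrac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}}$$

$$r_s \text{ or } \rho = \frac{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}}{\sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}\,\sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}}}$$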

Where 𝑥ᵢ and 𝑦ᵢ are the ranks of the 𝑖th data pair. So, we do not use the actual
values (say, ordinal categories or numerical observations) but their ranks relative
to the other observations in each variable. This means that we need to rank
the observations per variable (with rank 1 being the lowest²). Since the values
that enter the formula are ranks, we can perform arithmetic operations on them
even when the original observations are ordinal categories.

The second formula has the same form as our PPMC formula; the difference between
PPMC and SROC lies in the values of the arrays. PPMC uses the actual observations,
while SROC uses their ranks.
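
To illustrate the point that SROC is PPMC applied to ranks, here is a minimal Python sketch. The data (`scores`, `grades`) are made up for illustration, and the helper names `average_ranks`, `pearson` and `spearman` are hypothetical. Tied observations receive the mean of the ranks they occupy.

```python
from math import sqrt


def average_ranks(values):
    """Rank in ascending order (rank 1 = lowest); tied values share the mean rank."""
    return [
        sum(v < u for v in values) + (sum(v == u for v in values) + 1) / 2
        for u in values
    ]


def pearson(a, b):
    """Deviation form of the correlation; the (n - 1) terms cancel out."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's coefficient computed as Pearson's r on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative data (not from the handout).
scores = [10, 20, 20, 40, 50]
grades = [1, 3, 2, 4, 5]

print(average_ranks(scores))  # [1.0, 2.5, 2.5, 4.0, 5.0]
print(round(spearman(scores, grades), 4))
```

The two tied scores of 20 occupy ranks 2 and 3, so each receives the average rank 2.5.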

2. Monotonicity assumption. At least a monotonic relationship should exist between
the variables. Note that all linear relationships are monotonic but not all monotonic
relationships are linear. See the scatterplot below:

² jamovi has a function to rank the observations in a variable in ascending order.
This is not linear but monotonic. So, if linearity is violated, but there is
monotonicity in the relationship of the variables, SROC can be used. If
monotonicity is violated, you need to consider other statistical tests to
determine the relationship of the variables. See scattergram below for an
example of a non-monotonic relationship.

Both variables are quantitative. There are no outliers. Based on the
Shapiro-Wilk test, the assumption of normality is met. However, there
is a curvilinear (concave) relationship between the variables, so both the
linearity and the monotonicity assumptions are violated. Neither PPMC nor
SROC is appropriate. One should consider categorical data analysis to test for
association or independence between the variables.
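
The contrast between a monotonic but non-linear relationship and a non-monotonic (concave) one can be sketched numerically. The example below uses synthetic data chosen purely for illustration (a cubic curve and an inverted parabola); the helper names are hypothetical, and the rank-then-correlate approach follows the second SROC formula above.

```python
from math import sqrt


def average_ranks(values):
    """Rank in ascending order (rank 1 = lowest); tied values share the mean rank."""
    return [
        sum(v < u for v in values) + (sum(v == u for v in values) + 1) / 2
        for u in values
    ]


def pearson(a, b):
    """Pearson's r in deviation form."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


# Synthetic data, assumed for illustration only.
x = list(range(1, 10))
y_monotonic = [v ** 3 for v in x]        # always increasing, but not linear
y_concave = [-(v - 5) ** 2 for v in x]   # rises, then falls: not monotonic

pearson_monotonic = pearson(x, y_monotonic)
spearman_monotonic = pearson(average_ranks(x), average_ranks(y_monotonic))
spearman_concave = pearson(average_ranks(x), average_ranks(y_concave))

print(f"Monotonic curve: Pearson = {pearson_monotonic:.3f}, Spearman = {spearman_monotonic:.3f}")
print(f"Concave curve:   Spearman = {spearman_concave:.3f}")
```

For the cubic curve the ranks of 𝑥 and 𝑦 agree perfectly, so the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1. For the symmetric concave curve the Spearman coefficient is essentially zero even though the variables are clearly related, which is why SROC is not appropriate there.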

Note: SROC is non-parametric and can be somewhat flexible. When it comes
to the monotonicity requirement, it can allow for a tolerable deviation.
For example, examine the scatterplot below:
This can be considered “almost” compliant with the monotonicity assumption because,
from level 1 to level 5 of the respondents' educational attainment, you can
observe that the knowledge of the respondents increases. So there is a
positive monotonic relationship. There is a deviation only at levels 6 and 7,
where knowledge goes slightly downward. But since the majority of the levels
show a monotonic relationship, this can be considered almost
monotonic. So, the use of SROC is permitted, provided that we meet the
ordinality assumption.

Note on the formula:

§ Developed by Charles Spearman in 1904
§ When there are no tied ranks, the formula for SROC given above can be
algebraically reduced to:

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

Where: $d_i^2 = (x_i - y_i)^2$

Note: This shortcut formula gives the same output as the general formula
if and only if there are no tied ranks. Even with a small number of tied ranks, there
will be minor discrepancies. So, use this formula only when there are no tied ranks.
If there are tied ranks, but not too many, it is better to use the general formula.
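
The effect of ties on the shortcut formula can be checked with a small sketch. The rank vectors below are made up for illustration, and the function names `spearman_shortcut` and `spearman_general` are hypothetical; the general version uses the deviation form of the formula given earlier.

```python
from math import sqrt


def spearman_shortcut(rank_x, rank_y):
    """Shortcut formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def spearman_general(rank_x, rank_y):
    """General formula: Pearson's r computed on the ranks."""
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    return cov / sqrt(var_x * var_y)


# Made-up ranks, assumed for illustration only.
rank_x = [1, 2, 3, 4, 5]
rank_y_no_ties = [2, 1, 4, 3, 5]
rank_y_with_ties = [1.5, 1.5, 3, 4, 5]  # the two lowest observations are tied

for label, rank_y in [("no ties", rank_y_no_ties), ("with ties", rank_y_with_ties)]:
    shortcut = spearman_shortcut(rank_x, rank_y)
    general = spearman_general(rank_x, rank_y)
    print(f"{label}: shortcut = {shortcut:.5f}, general = {general:.5f}")
```

Without ties the two formulas agree (both give 0.8 here). With a single tied pair they already differ slightly, which is the minor discrepancy described in the note.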

Example: It is of interest to determine the relationship between age and educational
attainment. The table below shows the observations of the two variables.

 𝑖    Age   Educational   Age        Educational   𝑥ᵢ𝑦ᵢ                    𝑥ᵢ²              𝑦ᵢ²
            attainment    (ranked)   attainment
                          𝑥ᵢ         (ranked) 𝑦ᵢ
 1    19    6             2          10.5          (2)(10.5) = 21          (2)² = 4         (10.5)² = 110.25
 2    26    6             6.5        10.5          (6.5)(10.5) = 68.25     (6.5)² = 42.25   (10.5)² = 110.25
 3    26    3             6.5        4             (6.5)(4) = 26           (6.5)² = 42.25   (4)² = 16
 4    41    3             10         4             (10)(4) = 40            (10)² = 100      (4)² = 16
 5    58    7             14         14            (14)(14) = 196          (14)² = 196      (14)² = 196
 6    51    2             12         2             (12)(2) = 24            (12)² = 144      (2)² = 4
 7    53    4             13         6             (13)(6) = 78            (13)² = 169      (6)² = 36
 8    25    1             5          1             (5)(1) = 5              (5)² = 25        (1)² = 1
 9    20    5             3          7.5           (3)(7.5) = 22.5         (3)² = 9         (7.5)² = 56.25
 10   21    5             4          7.5           (4)(7.5) = 30           (4)² = 16        (7.5)² = 56.25
 11   30    7             8          14            (8)(14) = 112           (8)² = 64        (14)² = 196
 12   35    6             9          10.5          (9)(10.5) = 94.5        (9)² = 81        (10.5)² = 110.25
 13   43    7             11         14            (11)(14) = 154          (11)² = 121      (14)² = 196
 14   18    3             1          4             (1)(4) = 4              (1)² = 1         (4)² = 16
 15   59    6             15         10.5          (15)(10.5) = 157.5      (15)² = 225      (10.5)² = 110.25
 Sum                      ∑𝑥ᵢ = 120  ∑𝑦ᵢ = 120     ∑𝑥ᵢ𝑦ᵢ = 1,032.75        ∑𝑥ᵢ² = 1,239.5   ∑𝑦ᵢ² = 1,230.5

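Now, let us use the formula for SROC.

$$r_s \text{ or } \rho = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \dfrac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}}$$

$$= \frac{1{,}032.75 - \dfrac{(120)(120)}{15}}{\sqrt{\left[1{,}239.5 - \dfrac{(120)^2}{15}\right]\left[1{,}230.5 - \dfrac{(120)^2}{15}\right]}} = 0.26458088014 \approx 0.265$$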

In MS Excel, the CORREL function applied to the ranked values generates the same
correlation coefficient. The jamovi output is also the same.
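
As a cross-check of the manual computation, the sketch below ranks the 15 observations from the example, rebuilds the column totals, and applies the first (computational) SROC formula. It is a minimal illustration in plain Python; the helper name `average_ranks` and the variable names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank in ascending order (rank 1 = lowest); tied values share the mean rank."""
    return [
        sum(v < u for v in values) + (sum(v == u for v in values) + 1) / 2
        for u in values
    ]


# Observations copied from the example table, in the same row order.
ages = [19, 26, 26, 41, 58, 51, 53, 25, 20, 21, 30, 35, 43, 18, 59]
education = [6, 6, 3, 3, 7, 2, 4, 1, 5, 5, 7, 6, 7, 3, 6]

x = average_ranks(ages)
y = average_ranks(education)
n = len(x)

# Column totals used in the computational formula.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a ** 2 for a in x)
sum_y2 = sum(b ** 2 for b in y)

numerator = sum_xy - sum_x * sum_y / n
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
rho = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 120.0 120.0 1032.75 1239.5 1230.5
print(round(rho, 3))  # 0.265
```

The totals and the coefficient match the values obtained by hand in the table and the substitution above.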
Kendall’s Tau-b
• This bivariate correlational test can be used when you intended to use PPMC but its
assumptions are violated, and there are many tied ranks, which makes SROC
inappropriate.
• When one or both variables are ordinal and there are tied ranks, Kendall’s
tau-b is more appropriate than SROC.
• Monotonicity assumption: Kendall’s tau-b also requires this assumption. As with
SROC, a small deviation can be tolerated.
• Tied-ranks assumption: when more than 1/3 to 1/2 of the observations are tied,
Kendall’s tau-b is more appropriate than SROC. This is because the formula of
Kendall’s tau-b handles ties systematically, as can be seen in the
formula below:

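$$\tau = \frac{n_c - n_d}{\sqrt{\left[\dfrac{n!}{2!\,(n-2)!} - \sum_{i} \dfrac{t_i!}{2!\,(t_i-2)!}\right]\left[\dfrac{n!}{2!\,(n-2)!} - \sum_{j} \dfrac{u_j!}{2!\,(u_j-2)!}\right]}}$$

$$\tau = \frac{n_c - n_d}{\sqrt{\left[\dbinom{n}{2} - \sum_{i} \dbinom{t_i}{2}\right]\left[\dbinom{n}{2} - \sum_{j} \dbinom{u_j}{2}\right]}}$$

Where:
§ $n_c$ = number of concordant pairs
§ $n_d$ = number of discordant pairs
§ $t_i$ = number of tied values in the 𝑖th group of ties of the variable 𝑥
§ $u_j$ = number of tied values in the 𝑗th group of ties of the variable 𝑦
§ $\binom{n}{2} = \dfrac{n!}{2!\,(n-2)!}$
§ $\binom{t_i}{2} = \dfrac{t_i!}{2!\,(t_i-2)!}$
§ $\binom{u_j}{2} = \dfrac{u_j!}{2!\,(u_j-2)!}$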

For example, consider the two variables below. We need to convert them into ranks so that we
can calculate the coefficient. To do this in Excel, use the =RANK.AVG function. Make
sure to use absolute referencing for the “ref” argument of the function.

 𝑖    Age   Educational   Age (ranked),   Educational attainment
            attainment    variable 𝑥      (ranked), variable 𝑦
 1    18    3             1               4
 2    19    6             2               10.5
 3    20    5             3               7.5
 4    21    5             4               7.5
 5    25    1             5               1
 6    26    6             6.5             10.5
 7    26    3             6.5             4
 8    30    7             8               14
 9    35    6             9               10.5
 10   41    3             10              4
 11   43    7             11              14
 12   51    2             12              2
 13   53    4             13              6
 14   58    7             14              14
 15   59    6             15              10.5

Now, compare the observations pair by pair, starting with the pair (1,2). If 𝑥₁ < 𝑥₂, assign +1.
If 𝑥₁ > 𝑥₂, assign −1. If 𝑥₁ = 𝑥₂, assign 0. Do this for the variable
𝑦 as well: if 𝑦₁ < 𝑦₂, assign +1; if 𝑦₁ > 𝑦₂, assign −1; if 𝑦₁ = 𝑦₂, assign 0. Then multiply
the two assigned values for the pair. If the product is +1, the pair (1,2) is concordant. If the
product is −1, the pair (1,2) is discordant. If the product is 0, the pair (1,2) is tied. Repeat
this for every pair of observations. You can then use Excel to count the frequency of
concordant, discordant and tied pairs using the =COUNTIF function. Here, the number of
concordant pairs is 55, the number of discordant pairs is 36, and the number of tied pairs
is 14. Together they sum to 105, which is equal to the total number of pairs.
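
The pair-by-pair procedure can also be sketched in Python instead of a spreadsheet. The snippet below is a minimal illustration using the ranked values from the table; the helper `sign` and the variable names are hypothetical.

```python
from itertools import combinations

# Ranked values copied from the table (x = age, y = educational attainment).
x_ranks = [1, 2, 3, 4, 5, 6.5, 6.5, 8, 9, 10, 11, 12, 13, 14, 15]
y_ranks = [4, 10.5, 7.5, 7.5, 1, 10.5, 4, 14, 10.5, 4, 14, 2, 6, 14, 10.5]


def sign(value):
    """Return +1, -1 or 0, mirroring the assignment rule in the text."""
    return (value > 0) - (value < 0)


concordant = discordant = tied = 0
for (x1, y1), (x2, y2) in combinations(zip(x_ranks, y_ranks), 2):
    product = sign(x2 - x1) * sign(y2 - y1)
    if product > 0:
        concordant += 1
    elif product < 0:
        discordant += 1
    else:
        tied += 1  # tied on x, on y, or on both

print(concordant, discordant, tied)    # 55 36 14
print(concordant + discordant + tied)  # 105
```

`combinations(..., 2)` visits each of the 105 unordered pairs exactly once, so no pair is counted twice.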

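To determine the total number of pairs,

$$\binom{n}{2} = \frac{n!}{2!\,(n-2)!} = \frac{15!}{2!\,(15-2)!} = 105$$

For the tied values in the variable 𝑥, where $t_i$ is the number of tied values in the 𝑖th group
(here there is only one group, with two tied values),

$$\sum_{i} \binom{t_i}{2} = \sum_{i} \frac{t_i!}{2!\,(t_i-2)!} = \frac{2!}{2!\,(2-2)!} = 1$$

For the tied values in the variable 𝑦, where $u_j$ is the number of tied values in the 𝑗th group
(here there are four groups, with 3, 2, 4 and 3 tied values),

$$\sum_{j} \binom{u_j}{2} = \sum_{j} \frac{u_j!}{2!\,(u_j-2)!} = \frac{3!}{2!\,(3-2)!} + \frac{2!}{2!\,(2-2)!} + \frac{4!}{2!\,(4-2)!} + \frac{3!}{2!\,(3-2)!} = 3 + 1 + 6 + 3 = 13$$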
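
The tie terms can be verified by grouping identical ranks and counting the pairs within each group. The following is a minimal sketch that assumes the ranked values from the table; the function name `tied_pairs` is hypothetical.

```python
from collections import Counter
from math import comb

# Ranked values copied from the table (x = age, y = educational attainment).
x_ranks = [1, 2, 3, 4, 5, 6.5, 6.5, 8, 9, 10, 11, 12, 13, 14, 15]
y_ranks = [4, 10.5, 7.5, 7.5, 1, 10.5, 4, 14, 10.5, 4, 14, 2, 6, 14, 10.5]


def tied_pairs(ranks):
    """Sum C(t, 2) over every group of t identical ranks (C(1, 2) is 0)."""
    return sum(comb(group_size, 2) for group_size in Counter(ranks).values())


total_pairs = comb(len(x_ranks), 2)
ties_x = tied_pairs(x_ranks)
ties_y = tied_pairs(y_ranks)

print(total_pairs, ties_x, ties_y)  # 105 1 13
```

Untied observations form groups of size 1 and contribute nothing, so only the genuine tie groups affect the totals.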

Now, we can calculate the value of our Kendall’s tau-b correlation coefficient.

$$\tau = \frac{n_c - n_d}{\sqrt{\left[\dfrac{n!}{2!\,(n-2)!} - \sum_{i} \dfrac{t_i!}{2!\,(t_i-2)!}\right]\left[\dfrac{n!}{2!\,(n-2)!} - \sum_{j} \dfrac{u_j!}{2!\,(u_j-2)!}\right]}}$$

$$= \frac{55 - 36}{\sqrt{(105 - 1)(105 - 13)}} = 0.19424194478 \approx 0.194$$

Our manual calculation matches the jamovi output.
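
For readers who want to reproduce the whole computation in one step, here is a minimal end-to-end sketch of the tau-b formula in plain Python. The function name `kendall_tau_b` is hypothetical. It works on the raw observations rather than the ranks, on the assumption (which holds here) that ranking preserves both the ordering and the ties, so the pair comparisons are unchanged.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (n_c - n_d) / sqrt((C(n,2) - ties_x) * (C(n,2) - ties_y))."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means the pair is tied and adds to neither count

    total_pairs = comb(len(x), 2)
    ties_x = sum(comb(t, 2) for t in Counter(x).values())
    ties_y = sum(comb(u, 2) for u in Counter(y).values())
    return (concordant - discordant) / sqrt((total_pairs - ties_x) * (total_pairs - ties_y))


# Raw observations copied from the example table.
ages = [18, 19, 20, 21, 25, 26, 26, 30, 35, 41, 43, 51, 53, 58, 59]
education = [3, 6, 5, 5, 1, 6, 3, 7, 6, 3, 7, 2, 4, 7, 6]

tau = kendall_tau_b(ages, education)
print(round(tau, 3))  # 0.194
```

The result agrees with the manual calculation above.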
