Spearman and Kendall's Tau-b
For example, the two variables below comply with the data and linearity
assumptions of PPMC. There are outliers in both variables, but the results remain
statistically significant with and without them, so the outliers are not
influential. However, based on the 𝑧-scores of skewness and kurtosis, and as
validated by the results of the Shapiro-Wilk test, the normality assumption is
violated. In this case, the SROC can be used.
Descriptives
             X     Y
N           16    16
Missing      0     0
Minimum     10    50
¹ It means that the two variables move in the same or in opposite directions, but not necessarily at the same rate or in a linear
manner; note that not all monotonic relations are linear in nature.
§ Features:
o Range: from −1 to +1;
o Sign: indicates the direction of the relationship (can be linear or
monotonic)
o Magnitude: indicates the strength of the relationship, interpreted the same
way as in PPMC. The coefficient is also symmetric.
o Advantages: 1) does not always assume that the underlying relationship
between 𝑥 and 𝑦 is linear; 2) no assumptions of normality are made
regarding the distributions of 𝑥 and 𝑦.
§ Assumptions:
1. The level of measurement of the variables should be at least ordinal. Why is
ordinality a requisite? The formula for Spearman’s coefficient is provided below,
where $x_i$ and $y_i$ are the ranks of the $i$th data pair. So, we do not use the actual
values (say ordinal categories or numerical observations) but their ranks relative
to the other observations in each of the variables. This means that we need to rank
the observations per variable (with rank 1 being the lowest²). Since the values
used in the formula’s computations are ranks, we can perform arithmetic
and non-binary operations on them.
The second formula is similar to our PPMC formula; the difference between
PPMC and SROC lies in the values of the arrays. PPMC uses the actual observations,
while SROC uses their ranks.
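The idea that SROC is the PPMC computation applied to ranks can be sketched in Python. This is only a minimal illustration, not the jamovi or Excel procedure: the helper names `average_ranks`, `pearson`, and `spearman` and the data are hypothetical, and tied values are assumed to share the mean of the ranks they occupy.

```python
from math import sqrt

def average_ranks(values):
    """Rank in ascending order (rank 1 = lowest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(a, b):
    """PPMC coefficient computed from two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

def spearman(x, y):
    """SROC: the same PPMC computation, but on the ranks instead of the raw values."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y increases with x monotonically but not linearly.
x = [1, 2, 3, 4, 5, 6]
y = [1, 8, 27, 64, 125, 216]
print("PPMC on raw values:", pearson(x, y))
print("SROC on ranks:     ", spearman(x, y))
```

Because the hypothetical relationship is perfectly monotonic, the rank-based coefficient reaches its maximum while the coefficient on the raw values stays below it.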
² jamovi has a function to rank the observations in a variable in ascending order.
This is not linear but monotonic. So, if linearity is violated, but there is
monotonicity in the relationship of the variables, SROC can be used. If
monotonicity is violated, you need to consider other statistical tests to
determine the relationship of the variables. See scattergram below for an
example of a non-monotonic relationship.
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
Note: This shortcut formula gives the same result as the general formula if and
only if there are no tied ranks. If there is a small number of tied ranks, there
will be minor discrepancies. So, use this formula only when there are no tied ranks;
if there are tied ranks, but not too many, it is better to use the general formula.
In MS Excel, applying the CORREL function to the ranked values generates the same
correlation coefficient. The jamovi output is also the same.
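The note on the shortcut formula can be checked with a short Python sketch, where $d_i$ is taken to be the difference between the two ranks of the $i$th pair. The function names and both data sets are hypothetical, and tied values are assumed to receive average ranks.

```python
from math import sqrt

def avg_ranks(values):
    # Ascending average ranks: a value gets the mean of the 1-based positions it occupies.
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def rho_shortcut(x, y):
    # Shortcut formula: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)).
    rx, ry = avg_ranks(x), avg_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def rho_general(x, y):
    # General formula: the PPMC computation applied to the ranks.
    rx, ry = avg_ranks(x), avg_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den

# Hypothetical data without ties: both formulas should agree.
no_ties_x, no_ties_y = [3, 1, 4, 2, 5], [2, 1, 5, 3, 4]
# Hypothetical data with ties: the two formulas should drift apart slightly.
tied_x, tied_y = [1, 2, 2, 3, 4], [1, 3, 2, 2, 4]

print(rho_shortcut(no_ties_x, no_ties_y), rho_general(no_ties_x, no_ties_y))
print(rho_shortcut(tied_x, tied_y), rho_general(tied_x, tied_y))
```

With the tied data the two values differ in the second decimal place, which is the kind of minor discrepancy the note describes.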
Kendall’s Tau-b
• This bivariate correlational test can be used when you want to run PPMC but its
assumptions are not met, and there are many tied rank values, which makes SROC
inappropriate.
• When one or both variables are ordinal and there are tied rank values, Kendall’s
tau-b is more appropriate than SROC.
• Monotonicity assumption: Kendall’s tau-b also requires this assumption. As with
SROC, a small deviation can be tolerated.
• Tied-ranks assumption: when more than 1/3 to 1/2 of the observations are tied,
Kendall’s tau-b is more appropriate than SROC. This is because the formula of
Kendall’s tau-b handles ties more systematically, as can be seen in the
formula below:
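$$\tau = \frac{n_c - n_d}{\sqrt{\left[\frac{n!}{2!\,(n-2)!} - \sum_{i=1}^{n} \frac{t_i!}{2!\,(t_i-2)!}\right]\left[\frac{n!}{2!\,(n-2)!} - \sum_{j=1}^{n} \frac{u_j!}{2!\,(u_j-2)!}\right]}}$$

$$\tau = \frac{n_c - n_d}{\sqrt{\left[\binom{n}{2} - \sum_{i=1}^{n} \binom{t_i}{2}\right]\left[\binom{n}{2} - \sum_{j=1}^{n} \binom{u_j}{2}\right]}}$$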
Where:
§ $n_c$ = number of concordant pairs
§ $n_d$ = number of discordant pairs
§ $\binom{n}{2} = \frac{n!}{2!\,(n-2)!}$
§ $\binom{t_i}{2} = \frac{t_i!}{2!\,(t_i-2)!}$
§ $\binom{u_j}{2} = \frac{u_j!}{2!\,(u_j-2)!}$
For example, consider two variables. We need to convert them into ranks so that we
can calculate the coefficient. To do this in Excel, use the =RANK.AVG function. Make
sure to use absolute referencing for the “ref” argument of the function.
Now, if $x_1 < x_2$, assign +1. If $x_1 > x_2$, assign -1. If $x_1 = x_2$, assign 0. Do this for the variable
𝑦 as well. If $y_1 < y_2$, assign +1. If $y_1 > y_2$, assign -1. If $y_1 = y_2$, assign 0. Then multiply
the assigned values for the pair. If the product is +1, the pair (1,2) is concordant. If the
product is -1, the pair (1,2) is discordant. If the product is 0, the pair (1,2) is tied. Repeat
this for every pair of observations. You can then use Excel to count the frequency of
concordant, discordant, and tied pairs using the =COUNTIF function. Here, the number of
concordant pairs is 55, the number of discordant pairs is 36, and the number of tied
pairs is 14. Together they sum to 105, which is equal to the total number of pairs.
$$\binom{n}{2} = \frac{n!}{2!\,(n-2)!} = \frac{15!}{2!\,(15-2)!} = 105$$
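The sign-and-multiply procedure described above can be sketched in Python as follows. The function names and the five-observation data set are hypothetical (they are not the data of the worked example), so the counts differ from 55, 36, and 14.

```python
from itertools import combinations

def sign(v):
    # +1 if v is positive, -1 if negative, 0 if zero.
    return (v > 0) - (v < 0)

def classify_pairs(x, y):
    """Count concordant, discordant, and tied pairs over all pairs of observations."""
    concordant = discordant = tied = 0
    for i, j in combinations(range(len(x)), 2):
        # Multiply the sign assigned to the x comparison by the sign for the y comparison.
        product = sign(x[j] - x[i]) * sign(y[j] - y[i])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        else:
            tied += 1
    return concordant, discordant, tied

# Hypothetical data with one tie in each variable.
x = [1, 2, 2, 3, 4]
y = [1, 3, 2, 2, 4]
n_c, n_d, n_tied = classify_pairs(x, y)
print(n_c, n_d, n_tied)  # 7 1 2
print(n_c + n_d + n_tied == len(x) * (len(x) - 1) // 2)  # True: the counts cover all pairs
```

The final check mirrors the one in the text: the three counts must add up to the total number of pairs, n(n − 1)/2.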
For the number of tied values in the 𝑖th group of the variable 𝑥,
$$\sum_{i=1}^{n} \binom{t_i}{2} = \sum_{i=1}^{n} \frac{t_i!}{2!\,(t_i-2)!} = \frac{2!}{2!\,(2-2)!} = 1$$
For the number of tied values in the 𝑗th group of the variable 𝑦,
$$\sum_{j=1}^{n} \binom{u_j}{2} = \sum_{j=1}^{n} \frac{u_j!}{2!\,(u_j-2)!} = \frac{3!}{2!\,(3-2)!} + \frac{2!}{2!\,(2-2)!} + \frac{4!}{2!\,(4-2)!} + \frac{3!}{2!\,(3-2)!} = 13$$
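The two tie corrections can be reproduced with a few lines of Python. The function name `tied_pair_count` and the value lists are hypothetical: the lists are not the original data, only stand-ins built so that their tie groups give the totals 1 and 13 reported above.

```python
from collections import Counter
from math import comb

def tied_pair_count(values):
    """Sum C(t, 2) over the groups of equal values; groups of size 1 contribute 0."""
    return sum(comb(t, 2) for t in Counter(values).values())

# Hypothetical x: 15 values with a single group of two tied values.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14]
# Hypothetical y: 15 values with tie groups of sizes 3, 2, 4, and 3.
y_values = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 6, 7]

print(tied_pair_count(x_values))  # 1
print(tied_pair_count(y_values))  # 13
```

Untied values form groups of size 1 and add nothing, so only the tied groups matter in each sum.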
Now we can calculate the value of our Kendall’s tau-b correlation coefficient.
$$\tau = \frac{n_c - n_d}{\sqrt{\left[\frac{n!}{2!\,(n-2)!} - \sum_{i=1}^{n} \frac{t_i!}{2!\,(t_i-2)!}\right]\left[\frac{n!}{2!\,(n-2)!} - \sum_{j=1}^{n} \frac{u_j!}{2!\,(u_j-2)!}\right]}} = \frac{55 - 36}{\sqrt{(105 - 1)(105 - 13)}} = 0.19424194478 \approx 0.194$$
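As a quick arithmetic check, the final step can be reproduced in Python from the counts given in the example (55 concordant pairs, 36 discordant pairs, 15 observations, and tie corrections of 1 and 13). The function name and its parameter names are hypothetical; this is a sketch of the formula above, not of any particular software routine.

```python
from math import comb, sqrt

def kendall_tau_b(n_c, n_d, n, tied_pairs_x, tied_pairs_y):
    """Kendall's tau-b from pair counts and the two tie-correction sums."""
    total_pairs = comb(n, 2)  # n! / (2! (n - 2)!)
    denominator = sqrt((total_pairs - tied_pairs_x) * (total_pairs - tied_pairs_y))
    return (n_c - n_d) / denominator

tau = kendall_tau_b(n_c=55, n_d=36, n=15, tied_pairs_x=1, tied_pairs_y=13)
print(round(tau, 3))  # 0.194
```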