09 Nonparametric Test Li Wenyun1730862642

Chapter 9

Nonparametric Test Based on Ranks


(Rank sum test)

Li Wenyun

Division of Medical Statistics


Jinan University
What is a Parametric Test?
· A parametric test assumes that the random
variable follows a certain distribution (e.g. the normal
distribution) and tests whether the
population parameter (e.g. the mean) equals a specified
value, or whether the parameters (means) of
two populations are equal. Because the equality of
parameters is tested, these tests are called
parametric tests.
· t tests and ANOVA are parametric tests.
What is a Nonparametric Test?
· When data are not normally distributed, or the
distribution is unknown, the distribution cannot
be characterized by a few parameters, so t tests
and ANOVA are not suitable.
· Nonparametric tests are distribution-free tests:
they make no assumption about the specific form of the distribution.
· The chi-square test for an R×C contingency table is
one kind of nonparametric test.
· The rank sum test is another kind of nonparametric
test.
When are nonparametric tests used?
· The distributions are extremely skewed or
unknown;
· The data are ordered or ranked;
· The data are imprecise (e.g. X<0.5);
  · measured with some imprecision due to the
  limit of detection of the instruments or assays
· There are outliers, which inflate the standard
deviation and render parametric tests less
sensitive.
  · Outliers are data points that lie far from
  the main body of the data.
Outliers: values below Q1-1.5(Q3-Q1) or
above Q3+1.5(Q3-Q1)
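The outlier rule above can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical (not taken from the chapter), the function name `outlier_fences` is made up for this sketch, and the quartile convention (`method="inclusive"`) is an assumption, since textbooks and software packages compute Q1 and Q3 in slightly different ways.

```python
import statistics

def outlier_fences(values):
    """Return the fences (Q1 - 1.5*IQR, Q3 + 1.5*IQR), where IQR = Q3 - Q1."""
    # Quartile conventions differ between sources; "inclusive" is an
    # assumption made for this sketch.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical measurements, chosen so that one value is clearly extreme.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
lower, upper = outlier_fences(sample)
outliers = [x for x in sample if x < lower or x > upper]
print(lower, upper, outliers)
```

With a different quartile convention the fences shift slightly, so borderline points may or may not be flagged.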

Figure 1 DNA content in gastric mucosal cells among four kinds of persons
Parametric vs. nonparametric tests
If the data are suitable for parametric tests:
· Compared with parametric tests, nonparametric
tests have lower power to detect a real
difference;
  · that is, a difference actually exists (H0 does not hold), but the
  test result indicates no significant difference
  (H0 is not rejected).
· Parametric tests not only have higher power
but also produce more usable results, such
as estimates of the mean, the standard deviation and
the 95% CI of the mean difference.
Idea for rank test
[Figure: observations replaced by their ranks, 1 to 16]
9.1 Wilcoxons Signed Rank Test
· It is used for paired quantitative data,
which do not follow normal distributions.
· Extension to paired-sample t test
· Example 9-1: A physician designed to
study whether there is significant
difference in interleukin-6 between skin
lesion and uninvolved normal skin among
patients with vitiligo. Data from 9 patients
are in Table 9-1.
· Paired-sample t test:
whether the mean of the differences in the test
variable between the two populations
equals 0
· Wilcoxon Signed Rank Test:
whether the median (center of the
distribution) of the differences in the test
variable between the two populations
equals 0
The dispersion of the two samples and of their differences
is large, which suggests that the data are not normally distributed
(1) State the Hypotheses and Select
the Level of Significance
In nonparametric tests, the distributions of the data
are usually skewed, so the mean is not a suitable
measure of central tendency. Hypothesis
testing is based on the median.
· H0: The median of the difference is 0
· H1: The median of the difference is not 0
· α = 0.05
If the levels of interleukin-6 in skin lesion and in
normal skin do not differ, their differences
should be centered at 0
(2) Select an appropriate test and
calculate the test statistic
· Calculate the difference for each pair;
· Rank the absolute differences (omitting zeros); if
two or more of the absolute differences are
equal (a tie), assign the average rank
to the tied values;
· Give each rank the sign of its difference;
· Calculate the sums of the positive and the negative
ranks separately;
· Choose the smaller absolute rank sum as the test
statistic (T- = 3).
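The steps above can be sketched in Python using the Table 9-1 values. Two things are assumptions of this sketch rather than statements from the slides: the helper name `average_ranks`, and the sign convention d = normal skin minus skin lesion, chosen because it gives a negative rank sum of 3 as quoted above.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

# Table 9-1: interleukin-6 levels for the 9 patients.
lesion = [40.03, 97.13, 80.32, 25.32, 19.61, 14.50, 49.63, 44.56, 20.10]
normal = [88.57, 80.00, 123.72, 39.03, 24.37, 92.75, 121.57, 89.76, 20.10]

# Assumed sign convention: d = normal - lesion (rounded to remove float noise).
diffs = [round(b - a, 2) for a, b in zip(lesion, normal)]
nonzero = [d for d in diffs if d != 0]          # zero differences are omitted
ranks = average_ranks([abs(d) for d in nonzero])

t_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
t_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
n_valid = len(nonzero)
t_stat = min(t_plus, t_minus)
print(n_valid, t_plus, t_minus, t_stat)
```

Patient 9 has identical values in both columns, so that pair is dropped and n becomes 8. Reversing the sign convention swaps T+ and T- but leaves the test statistic unchanged.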
(3) Determine P value, and state
the conclusion
· n: number of valid pairs (non-zero differences)
· From Table 9-6, for n=8, the critical values are (3,
33).
  · If T≤3 or T≥33, then P<0.05;
  · if 3<T<33, then P≥0.05.
· Tcalc=3, so Tcalc = Tcrit (on the boundary), P<0.05, and H0 is rejected.
· There was a significant difference in the level of
interleukin-6 between skin lesion and
uninvolved normal skin among the patients
with vitiligo. The levels of interleukin-6 are
related to skin lesion in patients with vitiligo.
0.01952=0.039
Probability of T≤3 or
T≥33 is 0.039;
Probability of 3<T<33
is 1-0.039=0.961
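The value 0.0195 can be checked by brute force. The sketch below assumes that, under H0, each of the ranks 1 to 8 is equally likely to carry a plus or a minus sign, and that there are no ties among the absolute differences (which holds for the Table 9-1 data); it then enumerates all 2^8 sign patterns.

```python
from itertools import product

n = 8       # number of valid pairs in Example 9-1
t_obs = 3   # observed smaller rank sum
ranks = range(1, n + 1)

# Count the sign patterns whose positive-rank sum is at most t_obs.
count = 0
for signs in product((0, 1), repeat=n):
    t = sum(r for r, s in zip(ranks, signs) if s)
    if t <= t_obs:
        count += 1

p_lower = count / 2 ** n      # P(T <= 3) under H0
p_two_sided = 2 * p_lower     # by symmetry, P(T <= 3 or T >= 33)
print(count, p_lower, p_two_sided)
```

Only 5 of the 256 patterns give a sum of 3 or less, which is where 5/256 ≈ 0.0195 and the two-sided 0.039 come from.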
· When n>25, Table 9-10 cannot be used, so we
turn to the normal approximation.
· When n is large enough, the distribution of the
statistic T is close to normal.
· Calculate the Z statistic; under H0, Z~N(0,1).
· tp is the number of pairs
sharing the p-th tied value
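A minimal sketch of the normal approximation follows. It assumes the commonly used form with mean n(n+1)/4 and variance n(n+1)(2n+1)/24 − Σ(tp³ − tp)/48, and no continuity correction; some textbooks subtract 0.5 from |T − mean|, which would give a slightly smaller Z. The function name `signed_rank_z` is hypothetical. It is applied to Example 9-1 (n=8, T=3) purely for illustration, since the approximation is intended for larger n.

```python
import math

def signed_rank_z(t, n, tie_sizes=()):
    """Z statistic for the signed rank sum t with n non-zero pairs.

    tie_sizes lists tp, the number of pairs sharing each tied value.
    Assumed form: no continuity correction.
    """
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    var -= sum(tp ** 3 - tp for tp in tie_sizes) / 48
    return (t - mean) / math.sqrt(var)

z = signed_rank_z(t=3, n=8)               # no ties in Example 9-1
p = math.erfc(abs(z) / math.sqrt(2))      # two-sided P from N(0,1)
print(round(z, 2), round(p, 3))
```

Under these assumptions the sketch gives |Z| = 2.10 and P = 0.036, the values quoted in the SPSS output for this example.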
9.2 Wilcoxons Rank-sum Test
(Mann-Whitney U Test)
· It is used for two independent samples,
which do not follow normal distributions.
· Example 9-2: The survival time (minute)
of 5 cats and 14 rabbits without oxygen
are listed in Table 9-3. Now we try to
compare the survival times of cats and
rabbits in an environment without
oxygen.
(1) State the hypotheses and select
the significance level
· H0: The medians or distributions of the two
populations are the same (Md1 = Md2)
· H1: The two medians or two distributions
are not the same (Md1 ≠ Md2)
· α = 0.05
(2) Select an appropriate test and
calculate the test statistic
· Pool the two samples and rank all the
observations from the smallest to the largest
while keeping track of the sample to which each
observation belongs;
· For equal values (ties), assign the mean rank: there are
two values of 25, which occupy ranks 9 and 10, so
each receives the rank (9+10)/2 = 9.5;
· Compute the rank sum for each sample;
· The rank sum of the smaller sample (smaller n),
T=T1=78.5, is the test statistic. When n1=n2, let
T=min(T1, T2).
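The pooling and ranking steps can be sketched as follows, using the survival times listed for the cats and rabbits in this chapter. The helper `average_ranks` is a hypothetical name introduced for this sketch.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

cats = [25, 34, 44, 46, 46]
rabbits = [15, 15, 16, 17, 19, 21, 21, 23, 25, 27, 28, 28, 30, 35]

# Pool the samples, remembering that the first len(cats) entries are cats.
pooled_ranks = average_ranks(cats + rabbits)
t1 = sum(pooled_ranks[:len(cats)])   # rank sum of the smaller sample
t2 = sum(pooled_ranks[len(cats):])   # rank sum of the larger sample
print(t1, t2)
```

A quick sanity check is that T1 + T2 must equal n(n+1)/2 = 19×20/2 = 190 for the pooled sample.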
(3) Determine P value, and state
the conclusion
· Here T=78.5, n1=5, n2=14 and n2-n1=9.
· From Table 9-7, using n1 (<n2) and n2-n1,
T0.05,5,9=28~72.
· If T is outside the range or on the boundary,
P<0.05; if T is inside the range, P≥0.05.
· T=78.5 is outside the range, so P<0.05 and H0 is rejected.
· The Wilcoxon rank-sum test indicated that the
survival times of cats and rabbits in the
environment without oxygen differed
significantly (P<0.05).
· It can be proven that when the sample size is
large enough, the distribution of the
statistic T is close to normal, with

n=n1+n2

· There are five ties: 1.5,1.5; 6.5,6.5;
9.5,9.5; 12.5,12.5; 18.5,18.5 in Table 9-3.
· “tp” is the number of individuals sharing the p-th
tied value. If there are 4 values of 15,
occupying ranks 1, 2, 3, 4, each of the 4
values receives the rank 2.5, and “tp”
is 4.
· When H0 is true, the test statistic Z
follows a standard normal distribution.
Statistic T is called W in SPSS.
· For the data in Table 9-3, T1=78.5, n1=5,
T2=111.5, n2=14, the Z value is calculated
according to the SPSS procedure as follows:

· If we take T2=111.5 as T, the result is the
same as for T1=78.5:
· Wilcoxon's rank-sum test is also called the
Mann-Whitney U test; the two tests have
different statistics (W or U) but the same P
value. Z is the test statistic used to determine the P
value (Z=2.64, P=0.008).
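The Z calculation can be sketched as below. The sketch assumes the usual tie-corrected form, with mean n_group(n+1)/2 and variance n1·n2·(n+1)/12 · [1 − Σ(tp³ − tp)/(n³ − n)], and no continuity correction; a version that subtracts 0.5 in the numerator would give a slightly smaller Z. The function name `z_from_rank_sum` is hypothetical, and the U values use the standard relation U = T − n_group(n_group+1)/2.

```python
import math
from collections import Counter

cats = [25, 34, 44, 46, 46]
rabbits = [15, 15, 16, 17, 19, 21, 21, 23, 25, 27, 28, 28, 30, 35]
n1, n2 = len(cats), len(rabbits)
n = n1 + n2
t1, t2 = 78.5, 111.5  # rank sums quoted above

# tp = size of each group of tied values in the pooled sample.
tie_sizes = [c for c in Counter(cats + rabbits).values() if c > 1]
tie_term = sum(tp ** 3 - tp for tp in tie_sizes)
variance = n1 * n2 * (n + 1) / 12 * (1 - tie_term / (n ** 3 - n))

def z_from_rank_sum(t, n_group):
    """Z for the rank sum t of a group of size n_group (no continuity correction)."""
    mean = n_group * (n + 1) / 2
    return (t - mean) / math.sqrt(variance)

z1 = z_from_rank_sum(t1, n1)
z2 = z_from_rank_sum(t2, n2)
p = math.erfc(abs(z1) / math.sqrt(2))   # two-sided P from N(0,1)

u1 = t1 - n1 * (n1 + 1) / 2             # Mann-Whitney U computed from T1
u2 = n1 * n2 - u1                       # U for the other group
print(round(z1, 2), round(z2, 2), round(p, 3), u1, u2)
```

Under these assumptions Z is 2.64 when computed from T1 and -2.64 when computed from T2: the sign flips but the magnitude, and hence the two-sided P value of 0.008, is the same.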
9.3 Kruskal-Wallis H Test
· The Kruskal-Wallis H test is used to compare
multiple independent samples that do
not follow a normal distribution.
· Example 9-3: DNA content in gastric
mucosal cells among four kinds of
persons is shown in Table 9-5. Was there
a significant difference in the DNA content of
gastric mucosal cells among the 4
populations?
(1) State the hypotheses and select
the significance level
· H0: The medians of four populations are
equal.
· H1: The medians of four populations are
not all equal.
· =0.05
n=n1+n2+n3+n4=37
(2) Select an appropriate test and
calculate the test statistic
· Combine the four samples;
· Rank all observations from the smallest
to the largest while keeping track of the
sample to which each observation
belongs. If equal values appear (ties), assign
the mean rank;
· Calculate the rank sums for the four
samples, denoted by T1, T2, T3, T4
respectively.
The test statistic is as follows:

· There are three ties: 13.4, 13.4; 16.4,
16.4; 27.2, 27.2 in Table 9-5. The formula
should be further adjusted as
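A common textbook form of the statistic is H = 12/[n(n+1)] · Σ(Ti²/ni) − 3(n+1), with the tie adjustment Hc = H / [1 − Σ(tp³ − tp)/(n³ − n)]. The sketch below assumes these forms. The function names are hypothetical, and the numbers are made-up toy values used only to exercise the code; they are not the Table 9-5 data.

```python
from collections import Counter

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, Hc): the unadjusted and the tie-adjusted statistic."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        t_i = sum(ranks[start:start + len(g)])   # rank sum of this group
        total += t_i ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_term = sum(c ** 3 - c for c in Counter(pooled).values() if c > 1)
    return h, h / (1 - tie_term / (n ** 3 - n))

# Toy data without ties: H and Hc coincide.
h_a, hc_a = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Toy data with one tie (the value 2 appears twice): Hc is larger than H.
h_b, hc_b = kruskal_wallis_h([[1, 2], [2, 3]])
print(h_a, hc_a, h_b, hc_b)
```

Because the adjustment divides by a factor below 1, ties can only increase the statistic; with few ties the change is small.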
(3) Determine P value, and state
the conclusion
· The test statistic H approximately follows a chi-square
distribution with k-1 degrees of freedom
(k denotes the number of groups).
· The test statistic Hcalc=24.07, df = k-1 = 4-1 = 3.
χ²0.001,3=16.27 in Table 8-9 (critical values of the χ²
distribution). Hcalc>χ²0.001,3, so P<0.001. H0 is
rejected at the level of α=0.05.
· The results indicated that there was a
statistically significant difference in DNA
content in gastric mucosal cells among the four
kinds of persons (χ² = 24.07, P<0.001).
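Instead of a table lookup, the upper-tail probability can be computed directly. The sketch below relies on the closed form that exists for 3 degrees of freedom, P(χ² > x) = erfc(√(x/2)) + √(2x/π)·e^(−x/2); the function name `chi2_sf_df3` is hypothetical, and the formula is valid only for df = 3.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_h = chi2_sf_df3(24.07)      # P value for Hcalc = 24.07
p_crit = chi2_sf_df3(16.27)   # should be close to 0.001, the tabulated level
print(p_h, p_crit)
```

The second value serves as a check that 16.27 is indeed the critical value at the 0.001 level, and the first confirms that P is well below 0.001.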
SPSS Procedure
· Wilcoxon's signed rank sum
test (2 related samples)
· Wilcoxon's rank-sum test
(2 independent samples)
· Kruskal-Wallis H test
(k independent samples)
(1) Wilcoxon's signed rank sum test
(Related samples)
· Example 9-1: A physician designed a study to examine
whether there is a significant difference in
interleukin-6 between skin lesion and
uninvolved normal skin among the patients
with vitiligo.
Table 9-1. Levels of interleukin-6 (u/ml) in skin lesion
and uninvolved normal skin among the patients with vitiligo
ID of patient  1      2      3       4      5      6      7       8      9
Skin lesion    40.03  97.13  80.32   25.32  19.61  14.50  49.63   44.56  20.10
Normal skin    88.57  80.00  123.72  39.03  24.37  92.75  121.57  89.76  20.10
Analyze Non-parametric Test
 2 related samples
Click variable lesion and normal, move
them together to Test Pair(s) List

Test type, Select:


Wilcoxon
Output and Interpretation

· The Wilcoxon Signed-Rank test indicated that
there was a significant difference in the level of
interleukin-6 between skin lesion and
uninvolved normal skin among the patients
with vitiligo (Z = 2.10, P = 0.036).
(2) Wilcoxons rank-sum test
(2 Independent samples )
· The survival time (minute) of 5 cats and 14 rabbits
without oxygen are listed in Table 7-4. Now we try
to compare the difference of survival times of cats
and rabbits in the environment without oxygen.
The researcher was concerned about the lack of
normality of the underlying distribution of the
data and so decided to use a nonparametric test.

Table 7-4. Survival times (minute) of cats and rabbits without oxygen
Cats 25 34 44 46 46
Rabbits 15 15 16 17 19 21 21 23 25 27 28 28 30 35
Analyze Non-parametric Test
 2 independent samples
Move time to Test Variable List
Move group to Grouping Variable box
Click Define Groups, type “1” in the box of Group 1 and
type “2” in the box of Group 2
Test type, select Mann-Whitney U
Output and Interpretation

· The Wilcoxon rank-sum test indicated that there was a
statistically significant difference in survival time
between cats and rabbits in the environment without
oxygen (Z = 2.64, P = 0.008). You can determine which
group has the higher ranks by looking at “Mean Rank”.
The mean rank of cats is higher than that of rabbits, so we
conclude that cats survive longer than rabbits in the
environment without oxygen.
(3) Kruskal-Wallis Test
(k Independent Samples)
· DNA content in gastric mucosal cells among four
kinds of persons is shown as follows. Was there a
significant difference in the DNA content of gastric
mucosal cells among the different kinds of persons?
Analyze Non-parametric Test
 k independent samples
Move dna to Test Variable List
Move group to Grouping Variable box
Click Define Groups, type “1” in the box of Minimum
and type “4” in the box of Maximum
Test type, select Kruskal-Wallis H
Output and Interpretation

· The Kruskal-Wallis H Test indicated that there
was a significant difference in DNA content
in gastric mucosal cells among the four kinds of
persons (χ² = 24.07, P < 0.001).
