Non Parametric Tests
Introduction
A statistical test in which the hypothesis
deals with the population distribution or
its parameters is called a parametric test.
When it is not possible to make any
assumption about the distribution of the
population, we use a non-parametric test.
Meaning
• A non-parametric test is one that is not
concerned with testing parameters. Non-parametric
tests do not depend on the particular form
of the distribution of the population. That
is, they make no assumption regarding the
form of the population. For this reason they
are also called distribution-free tests.
Situations where non-parametric tests are applied
A non-parametric test is used:
• when the researcher concludes that a parametric test is
not applicable;
• when the hypothesis does not involve a parameter of the
population;
• when the observations are not as accurate as required
for a parametric test;
• when the assumptions necessary for the validity of a
parametric test are not clearly and correctly known.
Advantages of Non-Parametric Tests
1. Non-parametric tests are distribution free, i.e. they do not
require any assumption that the population follows a normal
or any other distribution.
2. They are generally simple to understand and easy to apply when
the sample sizes are small.
3. Most non-parametric tests do not require lengthy and laborious
computations and hence are less time consuming.
4. NP tests are applicable to all types of data, including nominal
and ordinal data. So NP tests have applications in sociology,
educational statistics, etc.
5. Many non-parametric tests make it possible to work with very
small samples. This is particularly helpful to a researcher
collecting pilot-study data or to a medical researcher working with
a rare disease.
Important NP Tests
1. Sign tests (one-sample, and two-sample, i.e. the sign test
for paired data)
2. Signed rank test (Wilcoxon matched-pairs test):
small sample (25 pairs or fewer) and large sample (more than 25 pairs)
3. Rank sum tests (Mann-Whitney U test and
Kruskal-Wallis H test)
4. Run tests
5. Chi-square test
1. Sign Tests
• When the sample is small and the assumption that the
population is normal does not hold, we can still test the
hypothesis that the sample belongs to the population.
For this, we can use the sign test. (When the sample is small and
the population is normal, the t-test is used to determine
whether the given population mean is true or not.)
• When the samples are small and the assumption that the two
populations are normal does not hold, we can test the
hypothesis that the two populations are identical. For this,
we can use the sign test. (When the samples are small, both
populations are normal and the variances of the two
populations are equal, the t-test is used to determine
whether the two population means are equal.)
• The sign test is based on the direction (plus or
minus sign) of the observations in a
sample and not on their numerical
magnitudes. The sign test may be
(a) a one-sample sign test; or
(b) a two-sample sign test.
(a) One-sample sign test
• The one-sample sign test is a very simple NP
test, applicable when
(1) the sample is taken from a continuous
population, and
(2) P(sample value < mean) = ½ and
P(sample value > mean) = ½.
Test procedure
• Let µ0 be the hypothesised mean. Replace the value of each item
in the sample with a plus (+) sign if it is greater
than µ0, and with a minus (-) sign if it is less
than µ0.
• If a value happens to be equal to µ0, we do not
assign any sign; the item is ignored.
• After this, we find the proportion of
plus signs out of the total number of (+) and (-)
signs.
In the one-sample sign test, we want to test whether the
sample belongs to the population. We test the null
hypothesis µ = µ0 [or P = ½] against an appropriate
alternative hypothesis.
Answer
Value   Sign (+ if greater than 204, - if less)
202     -
210     +
200     -
203     -
193     -
203     -
204     ignored (exactly 204)
195     -
199     -
202     -
201     -
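• Total signs n = 10, + signs = 1, proportion of + signs p = 1/10 = 0.1
• H0: µ = 204 [or P = ½]
• H1: µ ≠ 204 [or P ≠ ½]
• Test statistic: $z = \frac{p - P}{SE}$, where $SE = \sqrt{\frac{PQ}{n}}$
• $SE = \sqrt{\frac{0.5 \times 0.5}{10}} = 0.158$
• $z = \frac{0.1 - 0.5}{0.158} = -2.53$, i.e. 2.53 numerically.
• Since the calculated value is greater than 1.96, the
null hypothesis is rejected, i.e. the average is
not 204.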
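The one-sample procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `sign_test_z` is hypothetical, and it uses the same normal approximation as the worked answer (P = Q = ½, values equal to µ0 dropped).

```python
import math

def sign_test_z(values, mu0):
    """z statistic for the one-sample sign test (normal approximation)."""
    plus = sum(1 for v in values if v > mu0)    # count of + signs
    minus = sum(1 for v in values if v < mu0)   # count of - signs
    n = plus + minus                            # values equal to mu0 are ignored
    p = plus / n                                # observed proportion of + signs
    se = math.sqrt(0.5 * 0.5 / n)               # SE = sqrt(PQ/n) with P = Q = 1/2
    return plus, n, (p - 0.5) / se

# Data from the worked example (hypothesised mean 204)
values = [202, 210, 200, 203, 193, 203, 204, 195, 199, 202, 201]
plus, n, z = sign_test_z(values, 204)
print(plus, n, round(z, 2))   # 1 10 -2.53
```

Comparing |z| with 1.96 then gives the 5% two-sided decision used in the answer.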
Question
On 15 occasions Mr X had to wait
9, 5, 3, 8, 8, 6, 9, 7, 2, 10, 7, 7, 6, 10 and 6 minutes for
the bus. Use the sign test at the 5% level of significance to
test the claim that on average Mr X
has to wait 5 minutes.
Answer
• H0: µ = 5 minutes [or P = ½]
• H1: µ ≠ 5 minutes [or P ≠ ½]
Value   Sign (+ if greater than 5, - if less)
9       +
5       ignored (exactly 5)
3       -
8       +
8       +
6       +
9       +
7       +
2       -
10      +
7       +
7       +
6       +
10      +
6       +
• Total signs n = 14, + signs = 12, so p = 12/14 = 0.857
• Test statistic: $z = \frac{p - P}{SE}$, where $SE = \sqrt{\frac{PQ}{n}} = \sqrt{\frac{0.25}{14}} = 0.134$
• $z = \frac{0.857 - 0.5}{0.134} = 2.66$
• Since the calculated value is greater than 1.96, the
null hypothesis is rejected, i.e. the average is
not 5 minutes. The claim that on
average Mr X has to wait 5 minutes is not
correct.
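With only 14 usable signs, the normal approximation can be cross-checked against the exact binomial distribution of the number of + signs under H0 (P = ½). The sketch below is an assumption-laden illustration rather than part of the method described here: `sign_test_exact_p` is a hypothetical helper, and doubling the upper tail is one common way to form a two-sided p-value.

```python
from math import comb

def sign_test_exact_p(plus, minus):
    """Two-sided exact p-value for the sign test under H0: P = 1/2."""
    n = plus + minus
    k = max(plus, minus)                       # the more extreme count
    # P(X >= k) for X ~ Binomial(n, 1/2)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)                  # double the tail, cap at 1

# Bus waiting-time example: 12 plus signs and 2 minus signs
p_value = sign_test_exact_p(12, 2)
print(round(p_value, 4))   # 0.0129
```

The exact p-value is below 0.05, which agrees with the decision reached from the z statistic.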
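(b) Two-sample sign test (sign test for paired data)
Salesman II 10 13 14 11 10 7 15 11 10 9 8
• P = Q = 0.5
• n = 10
• $SE = \sqrt{\frac{PQ}{n}} = \sqrt{\frac{0.25}{10}} = 0.158$
• Test statistic: $\frac{p - P}{SE} = \frac{0.4 - 0.5}{0.158} = \frac{-0.1}{0.158} = -0.63$, i.e. 0.63 numerically.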
Machine B 51 41 43 41 47 32 24 58 43 53 52 57 44 57 40 68
Answer
Null hypothesis: There is no difference between
the performances of the two machines.
Alternative hypothesis: There is a difference
between the performances of the two machines.
Machine A   Machine B   d (A - B)   Rank of |d|   Signed rank
73          51           22         13            +13
43          41            2         2.5           +2.5
47          43            4         4.5           +4.5
53          41           12         11            +11
58          47           11         10            +10
47          32           15         12            +12
52          24           28         15            +15
58          58            0         -             -
38          43           -5         6             -6
61          53            8         8             +8
56          52            4         4.5           +4.5
56          57           -1         1             -1
34          44          -10         9             -9
55          57           -2         2.5           -2.5
65          40           25         14            +14
75          68            7         7             +7
Total of positive ranks = 101.5; total of negative ranks = 18.5
• Calculated value of T = 18.5 (the smaller of
101.5 and 18.5)
• Table value of T for n = 15 is 25 (the pair with zero
difference is not counted)
• Since the calculated T is less than the table value, we reject
the null hypothesis and conclude that there is a
difference between the performances of the
two machines.
Ex 3 A
• Given below are 16 pairs of values showing
the performance of two machines. Test
whether there is a difference between the
performances (use the Wilcoxon matched-pairs
test at the 5% level of significance).
Machine A 73 43 47 53 58 47 52 58 38 61 56 56 34 55 50 75
Machine B 71 41 43 50 51 45 48 59 39 60 52 57 36 57 65 70
Answer
Null hypothesis: There is no difference between
the performances of the two machines.
Alternative hypothesis: There is a difference
between the performances of the two machines.
Machine A   Machine B   d (A - B)   Rank of |d|   Signed rank
73          71            2         7             +7
43          41            2         7             +7
47          43            4         12            +12
53          50            3         10            +10
58          51            7         15            +15
47          45            2         7             +7
52          48            4         12            +12
58          59           -1         2.5           -2.5
38          39           -1         2.5           -2.5
61          60            1         2.5           +2.5
56          52            4         12            +12
56          57           -1         2.5           -2.5
34          36           -2         7             -7
55          57           -2         7             -7
50          65          -15         16            -16
75          70            5         14            +14
Total of positive ranks = 98.5; total of negative ranks = 37.5
• Calculated value of T = 37.5 (the smaller of 98.5
and 37.5)
• Table value of T for n = 16 is 30
• Since the calculated T is more than the table value, we
accept the null hypothesis and conclude that
there is no significant difference between
the performances of the two machines.
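The ranking step is where hand calculations most often go wrong, so a short Python sketch of it may help. The helper names `average_ranks` and `wilcoxon_rank_sums` are hypothetical; the sketch assumes tied |d| values share the mean of the ranks they occupy and that zero differences are dropped, as in the tables above.

```python
def average_ranks(xs):
    """Rank values from 1 upward; tied values get the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def wilcoxon_rank_sums(a, b):
    """Return (sum of positive ranks, sum of negative ranks)."""
    d = [x - y for x, y in zip(a, b) if x != y]   # drop zero differences
    ranks = average_ranks([abs(v) for v in d])
    t_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    t_minus = sum(r for v, r in zip(d, ranks) if v < 0)
    return t_plus, t_minus

# Data from Ex 3 A
machine_a = [73, 43, 47, 53, 58, 47, 52, 58, 38, 61, 56, 56, 34, 55, 50, 75]
machine_b = [71, 41, 43, 50, 51, 45, 48, 59, 39, 60, 52, 57, 36, 57, 65, 70]
t_plus, t_minus = wilcoxon_rank_sums(machine_a, machine_b)
T = min(t_plus, t_minus)
print(t_plus, t_minus, T)   # 98.5 37.5 37.5
```

T is then compared with the table value for the number of non-zero differences.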
Case 2: When the number of matched pairs > 25
• We apply the Z test (where Z follows the standard
normal distribution).
• Test statistic: $Z = \frac{T - \mu}{\sigma}$
• $\mu = \frac{n(n+1)}{4}$
• $\sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}}$
• When the numerical value of Z is less than the table value,
we accept the null hypothesis that there is no difference;
otherwise we reject it.
Question
• The following are the weights in kg of 26 babies
before and after a diet. Use the signed rank test
to test, at the 5% level of significance, whether there is
any significant difference between the data before
and after the diet.
Before: 7 3.5 2.1 1.6 7.5 6.3 7 5.4 7.7 8.2 6.8 1.9 1.3 7.2 7.8 1.7
        2.4 3.5 4.5 8.0 1.5 2.0 5.8 6.5 3.5 5.2
After:  7.9 6.2 9 3.7 3.5 1.4 2.6 3.2 9 5.4 8.5 4.4 8.3 9 9.2 3.2
        3.4 2.8 3.4 7.9 3.5 3.2 6.2 6.3 3.0 6.8
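Answer
• We apply the Z test, as the number of matched pairs is more than 25 (n = 26).
• Test statistic: $Z = \frac{T - \mu}{\sigma}$, with $\mu = \frac{n(n+1)}{4}$ and $\sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}}$
• T = 128 (the smaller rank sum)
• $\mu = \frac{26 \times 27}{4} = 175.5$
• $\sigma = \sqrt{\frac{26 \times 27 \times 53}{24}} = 39.37$
• $Z = \frac{128 - 175.5}{39.37} = -1.21$, i.e. 1.21 numerically.
• Since 1.21 is less than the table value 1.96, we accept the null
hypothesis: there is no significant difference between the
weights before and after the diet.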
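The large-sample calculation can be checked with a short Python sketch. The function name `signed_rank_z` is hypothetical; it simply evaluates the µ, σ and Z formulas of Case 2, taking T = 128 and n = 26 from the answer above.

```python
import math

def signed_rank_z(t, n):
    """Normal approximation for the Wilcoxon signed rank statistic T."""
    mu = n * (n + 1) / 4                                # mean of T under H0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation of T
    return mu, sigma, (t - mu) / sigma

mu, sigma, z = signed_rank_z(128, 26)
print(mu, round(sigma, 2), round(z, 2))   # 175.5 39.37 -1.21
```

The sign of Z only reflects which rank sum was taken as T; the decision uses its numerical value.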
3. Rank Sum Tests
• In a rank sum test, we replace the values
by ranks. All values are taken together and
assigned ranks. Rank sum tests are
applied to test whether the populations are
identical. Two important rank sum tests are
A. the Wilcoxon-Mann-Whitney test (U test)
B. the Kruskal-Wallis test (H test)
3.A. Wilcoxon-Mann-Whitney test (U test), also called the
Mann-Whitney U test
Test of identicalness of two populations
• This is a two-sample rank sum test used as an alternative to
the t-test when the assumptions that the t-test makes about the
populations cannot be made.
• The values are replaced by ranks. All values from both
samples are taken together and assigned ranks, that is, the
data are ranked jointly.
• Here we want to test whether the two populations are
identical, and in particular whether their means are equal.
• Test statistic: $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$, where
R1 = sum of the ranks of sample 1, and n1 and n2 are the sizes of sample 1 and sample 2.
• For the normal approximation, U has mean $\frac{n_1 n_2}{2}$ and $SE = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$.
Question
There are two samples.
First, contains the observations:
[54,39,70,58,47,40,74,49,74,75, 61 and 79].
The second contains:
[45,41,62,53,33,45,71,42,68,73,54 and 73].
• Apply the rank sum test to test, at the 5% level, the
hypothesis that they come from populations with
the same mean.
Value   Rank    Sample (I or II)   Rank of Sample I
33      1       II
39      2       I                  2
40      3       I                  3
41      4       II
42      5       II
45      6.5     II
45      6.5     II
47      8       I                  8
49      9       I                  9
53      10      II
54      11.5    I                  11.5
54      11.5    II
58      13      I                  13
61      14      I                  14
62      15      II
68      16      II
70      17      I                  17
71      18      II
73      19.5    II
73      19.5    II
74      21.5    I                  21.5
74      21.5    I                  21.5
75      23      I                  23
79      24      I                  24
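R1 = 167.5
n1 = 12, n2 = 12
$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 144 + 78 - 167.5 = 54.5$
$SE = \sqrt{\frac{12 \times 12 \times (12 + 12 + 1)}{12}} = 17.32$
Test statistic = (72 - 54.5)/17.32 = 1.01, where 72 = n1 n2 / 2 is the mean of U
Table value at the 5% level of significance = 1.96
Since 1.01 is less than 1.96, we accept the null hypothesis that
the two samples come from identical populations.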
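The joint ranking and the U statistic for this example can be reproduced with the Python sketch below. The names `average_ranks` and `mann_whitney_z` are hypothetical, and the sketch assumes the form U = n1 n2 + n1(n1 + 1)/2 - R1, which is consistent with the figures R1 = 167.5 and U = 54.5 in the answer.

```python
import math

def average_ranks(xs):
    """Rank values from 1 upward; tied values get the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_z(s1, s2):
    """Return (R1, U, z) using the normal approximation."""
    n1, n2 = len(s1), len(s2)
    ranks = average_ranks(list(s1) + list(s2))   # rank both samples jointly
    r1 = sum(ranks[:n1])                         # rank sum of sample 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    se = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, u, (u - mean_u) / se

sample_1 = [54, 39, 70, 58, 47, 40, 74, 49, 74, 75, 61, 79]
sample_2 = [45, 41, 62, 53, 33, 45, 71, 42, 68, 73, 54, 73]
r1, u, z = mann_whitney_z(sample_1, sample_2)
print(r1, u, round(z, 2))   # 167.5 54.5 -1.01
```

The sketch computes (U - mean)/SE, so z comes out negative; its numerical value matches the 1.01 in the answer.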
Homework 1: Apply the U test to test
whether the two samples come from identical populations.
Group A Group B
7 8
11 9
9 13
4 14
8 11
6 10
12 12
11 14
9 13
10 9
11 10
11 8
Homework 2
• Sample 1: 10, 13, 12, 15, 16, 8, 6
• Sample 2: 20, 14, 7, 9, 17, 18, 19, 25, 24
Use the U test to test whether the two samples come from
identical populations.
3.B. The Kruskal-Wallis Test (H Test)
• This test is used to test the null hypothesis that k
independent samples come from identical
populations, that is, that all the population means are
equal. The alternative hypothesis is that the
means are not all equal. The test is
similar to one-way analysis of variance.
• Method: Data are ranked jointly.
Test Statistic
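A minimal Python sketch of the computation follows, assuming the usual form of the statistic, H = 12/(N(N + 1)) × Σ(Ri²/ni) - 3(N + 1), where Ri is the rank sum of sample i, ni its size and N the total number of observations, with no correction for ties. The function names and the three small samples are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def average_ranks(xs):
    """Rank values from 1 upward; tied values get the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1        # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(*samples):
    """H statistic (no tie correction) for k independent samples."""
    pooled = [v for s in samples for v in s]   # rank all samples jointly
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        r = sum(ranks[start:start + len(s)])   # rank sum R_i of this sample
        total += r * r / len(s)                # R_i^2 / n_i
        start += len(s)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical illustration: rank sums are 6, 15 and 24, N = 9
h = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h, 2))   # 7.2
```

H is usually compared with the chi-square table value for k - 1 degrees of freedom.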
Gasoline B 39 40 35 26 34 45 32 22 23 18
Gasoline C 28 30 25 31 41 38 36 44 19 50