
Non Parametric Tests

Non-parametric tests are statistical methods that do not rely on assumptions about population distribution, making them applicable when parametric tests are unsuitable. They are advantageous for small sample sizes, ease of use, and applicability to various data types. Key non-parametric tests include the Sign Test, Signed Rank Test, and Rank Sum Tests, which are used to assess hypotheses without requiring normal distribution assumptions.

Uploaded by

sureshvishal6

Non-Parametric Tests

Introduction
Statistical tests in which the hypothesis deals with the
population distribution or its parameters are called
parametric tests. When it is not possible to make any
assumptions about the distribution of the population,
we use a non-parametric test.
Meaning
• A non-parametric test is not concerned with
testing parameters. Non-parametric tests do not
depend on the particular form of the distribution
of the population. That is, they make no
assumption regarding the form of the population.
For this reason they are also called
distribution-free tests.
Situations where Non-parametric tests are applied

They are used:
• when the researcher concludes that a parametric test is
not applicable;
• when the hypothesis does not involve a parameter of the
population;
• when the observations are not as accurate as required
for a parametric test;
• when the assumptions necessary for the validity of a
parametric test are not clearly and correctly known.
Advantages of Non-Parametric Test
1. Non-parametric tests are distribution free, i.e. they do not
require any assumption that the population follows the normal
or any other distribution.
2. Generally they are simple to understand and easy to apply when
the sample sizes are small.
3. Most non-parametric tests do not require lengthy and laborious
computations and hence are less time consuming.
4. NP tests are applicable to all types of data, including nominal
and ordinal data. So NP tests have applications in sociology,
educational statistics, etc.
5. Many non-parametric tests make it possible to work with very
small samples. This is particularly helpful to a researcher
collecting pilot study data or to a medical researcher working
with a rare disease.
Important NP Tests
1. Sign Tests (one-sample, and two-sample, i.e. sign test for paired data)
2. Signed Rank Test (Wilcoxon matched-pairs test): small sample (25 pairs or fewer) and large sample (more than 25 pairs)
3. Rank Sum Tests (Mann-Whitney U test and Kruskal-Wallis H test)
4. Runs Test
5. Chi-square Test
1. Sign Tests
• When the sample is small and the assumption that the
population is normal does not hold, the sign test can be
used to test the hypothesis that the sample belongs to the
population. (When the sample is small and the population
is normal, the t-test is used to determine whether the
given population mean is true or not.)
• When the samples are small and the assumption that the two
populations are normal does not hold, the sign test can be
used to test the hypothesis that the two populations are
identical. (When the samples are small, both populations
are normal and their variances are identical, the t-test is
used to determine whether the two population means are equal.)
• The sign test is based on the direction (plus or
minus sign) of the observations in a sample and
not on their numerical magnitudes. The sign
tests may be
(a) One-sample sign test; and
(b) Two-sample sign test
a) One-sample sign test
• The one-sample sign test is a very simple NP
test, applicable when
(1) the sample is taken from a continuous
population;
(2) P(sample value < mean) = 1/2 and
P(sample value > mean) = 1/2.
Test procedure
• µ0 is the mean. Replace the value of each item
in the sample with a plus(+) sign if it is greater
than µ0, and with a minus(-) sign if it is less
than µ0.
• If the value happens to be equal to µ0, we do not
assign any sign.
• After doing this, we find the proportion of
plus signs out of the total number of (+) and (-)
signs.
In the one-sample sign test, we want to test whether the
sample belongs to the population. We test the null
hypothesis µ = µ0 [or P = 1/2] against an appropriate
alternative hypothesis µ ≠ µ0 [or P ≠ 1/2] on the basis
of a random sample of size n.
Test statistic is Z = (p − P) / SE
where p is the proportion of plus signs out of the
total signs, P = Q = 1/2 and SE = √(PQ/n) = √(0.25/n).
• If the numerical (absolute) value of the test
statistic is less than the table value, we accept
the null hypothesis; otherwise we reject it.
• When the null hypothesis is accepted, we
conclude that the sample belongs to the
population, so that the given population
mean is true.
Question No 1
• The scores of 11 professional golfers are
202, 210, 200, 203, 193, 203, 204, 195, 199,
202, 201. Use the sign test at the 5% level of
significance to test the null hypothesis that the
professional golfers' average is 204.
Answer
A value greater than 204 gets a + sign; a value less than 204 gets a - sign.
Value   Sign
202     -
210     +
200     -
203     -
193     -
203     -
204     Ignore as it is exactly 204
195     -
199     -
202     -
201     -
• Total signs n = 10, + signs = 1, proportion of + signs to total signs p = 1/10 = 0.1
• H0: µ = 204 [or P = 1/2]
• H1: µ ≠ 204 [or P ≠ 1/2]
• SE = √(PQ/n) = √(0.25/10) = 0.158
• Test statistic = (p − P)/SE = (0.1 − 0.5)/0.158 = −2.53, i.e. 2.53 numerically.
• Since the calculated value is greater than the table value 1.96,
the null hypothesis is rejected, i.e. the average is
not 204.
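The arithmetic of the golf-score question above can be checked with a few lines of Python. This is only a minimal sketch of the large-sample sign test described in these notes; the function name `one_sample_sign_test` and its return layout are illustrative choices, not part of the source.

```python
import math

def one_sample_sign_test(values, mu0):
    """Return (n, p, z) for the one-sample sign test of H0: mean = mu0."""
    plus = sum(1 for v in values if v > mu0)    # values given a + sign
    minus = sum(1 for v in values if v < mu0)   # values given a - sign
    n = plus + minus                            # values equal to mu0 get no sign
    p = plus / n                                # observed proportion of + signs
    se = math.sqrt(0.25 / n)                    # sqrt(PQ/n) with P = Q = 1/2
    return n, p, (p - 0.5) / se

golf_scores = [202, 210, 200, 203, 193, 203, 204, 195, 199, 202, 201]
n, p, z = one_sample_sign_test(golf_scores, 204)
print(n, p, round(z, 2))   # 10 0.1 -2.53
```

Comparing `abs(z)` with 1.96 gives the same decision as the worked answer: the null hypothesis is rejected at the 5% level.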
Qn
On 15 occasions Mr X had to wait
9, 5, 3, 8, 8, 6, 9, 7, 2, 10, 7, 7, 6, 10 and 6 minutes for
the bus. Use the sign test at the 5% level of significance to
test the bus company's claim that on the average Mr X
has to wait 5 minutes.
Answer
A value greater than 5 gets a + sign; a value less than 5 gets a - sign.
Value   Sign
9       +
5       Ignore
3       -
8       +
8       +
6       +
9       +
7       +
2       -
10      +
7       +
7       +
6       +
10      +
6       +
• H0: µ = 5 minutes [or P = 1/2]
• H1: µ ≠ 5 minutes [or P ≠ 1/2]
• Total signs n = 14, + signs = 12, p = 12/14 = 0.857
• SE = √(PQ/n) = √(0.25/14) = 0.134
• Test statistic = (p − P)/SE = (0.857 − 0.5)/0.134 = 2.66
• Since the calculated value is greater than 1.96,
the null hypothesis is rejected, i.e. the average is
not 5 minutes. The claim that on the
average Mr X has to wait 5 minutes is not
correct.
b) Two-sample sign test (sign test for paired data)

• This test checks whether two populations are identical or not.
• Suppose X and Y are two variables and n values of each are
known. Then we get n pairs of values, the first value of each pair
being a value of X and the second a value of Y. That is, if (x1, y1)
is a pair, then x1 belongs to X and y1 belongs to Y.
• In such problems, each pair can be replaced by a + or - sign. If
in a pair the first value is greater than the second, we put a + sign.
If the first value is less than the second, we put a - sign. If
both are equal, the pair concerned is discarded.
• The null hypothesis is that the two populations are
identical (means are equal).
That is, H0: P = 1/2
• Test statistic is Z = (p − P) / SE
• SE = √(PQ/n)
• P = 0.5 and Q = 0.5, so SE = √(0.25/n)
• p is the proportion of + signs and n is the number
of pairs compared (after discarding ties).
• When H0 is accepted, we conclude that
two populations are identical so that their
means are equal.
Ex 2
• The following are the numbers of tickets issued
by two salesmen on 11 days.
Salesman I    7  10  14  12   6   9  11  13   7   6  10
Salesman II  10  13  14  11  10   7  15  11  10   9   8
• Use the sign test at the 1% level of significance to test
the null hypothesis that on the average the two
salesmen issue equal numbers of tickets.
Answer
x    y    Sign
7    10   -
10   13   -
14   14   (tie, discarded)
12   11   +
6    10   -
9    7    +
11   15   -
13   11   +
7    10   -
6    9    -
10   8    +
Number of + signs = 4
Number of - signs = 6
Number of pairs compared n = 10
Observed proportion of + signs = p = 4/10 = 0.4
H0: P = 1/2 and H1: P ≠ 1/2
P = Q = 0.5, n = 10
SE = √(PQ/n) = √(0.25/10) = 0.158
Test statistic = (p − P)/SE
= (0.4 − 0.5)/0.158
= −0.1/0.158
= −0.63, i.e. 0.63 numerically

• Table value at the 1% level of significance = 2.576
• The calculated value is less than the table value.
• We accept H0, i.e. P = 1/2.
• The two salesmen issue equal numbers of tickets on the average.
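The paired version differs from the one-sample test only in how the signs are produced. The sketch below, under the same large-sample assumptions as the notes, recomputes the ticket example; `paired_sign_test` is a hypothetical helper name introduced here for illustration.

```python
import math

def paired_sign_test(x, y):
    """Return (plus, n, z) for the sign test on paired data."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # tied pairs are discarded
    plus = sum(1 for d in diffs if d > 0)             # first value > second value
    n = len(diffs)                                    # number of pairs compared
    p = plus / n
    se = math.sqrt(0.25 / n)                          # sqrt(PQ/n) with P = Q = 1/2
    return plus, n, (p - 0.5) / se

salesman_1 = [7, 10, 14, 12, 6, 9, 11, 13, 7, 6, 10]
salesman_2 = [10, 13, 14, 11, 10, 7, 15, 11, 10, 9, 8]
plus, n, z = paired_sign_test(salesman_1, salesman_2)
print(plus, n, round(z, 2))   # 4 10 -0.63
```

Since `abs(z)` is below 2.576, the sketch agrees with the worked answer that H0 is accepted at the 1% level.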
2. Signed Rank Test (Wilcoxon Matched-Pairs Test)

• In the case of two related samples, when
we can determine both the direction and the
magnitude of the difference between matched
values, we can use the Signed Rank Test to
find the significance of the difference.
Steps
• Set the null hypothesis that there is no difference between the two
samples (or populations).
• Find the difference between each pair of values and assign
ranks to the differences from the smallest to the largest without
regard to sign. (No rank is given if there is no difference.)
• The actual sign of each difference is then attached to the
corresponding rank and the test statistic T is calculated.
• T is the smaller of the two sums, viz. the sum of the negative ranks
and the sum of the positive ranks (both taken without sign).
• Use the table of T values of the Wilcoxon Signed Rank test (end of
the chapter in the Text Book of LR Potti) if the number of matched
pairs is less than or equal to 25. If it is more than 25, use the Z table.
• When the T value > table value, we accept the null
hypothesis that there is no difference; otherwise we
reject it.
• Note:
• While applying this test we may have situations
where some matched pairs are equal, i.e. the
difference between the values is zero. In this case we
drop these pairs.
• Sometimes two or more pairs may have the same
difference (same rank). In such a case we assign
the average of the ranks to each of these pairs.
Ex 3
• Given below are 16 pairs of values showing
the performance of two machines. Test
whether there is a difference between the
performances (use the Wilcoxon matched-pairs
test at the 5% level of significance).
Machine A  73 43 47 53 58 47 52 58 38 61 56 56 34 55 65 75
Machine B  51 41 43 41 47 32 24 58 43 53 52 57 44 57 40 68
Answer
Null Hypothesis: There is no difference between
the performances
Alternative Hypothesis: There is a difference
between the performances
Machine A  Machine B  d (difference)  Rank of |d|  Rank with sign
73         51          22             13           +13
43         41           2             2.5          +2.5
47         43           4             4.5          +4.5
53         41          12             11           +11
58         47          11             10           +10
47         32          15             12           +12
52         24          28             15           +15
58         58           0             -            -
38         43          -5             6            -6
61         53           8             8            +8
56         52           4             4.5          +4.5
56         57          -1             1            -1
34         44         -10             9            -9
55         57          -2             2.5          -2.5
65         40          25             14           +14
75         68           7             7            +7
Total: sum of positive ranks = 101.5, sum of negative ranks = 18.5
• Calculated value of T = 18.5 (smaller of
101.5 and 18.5)
• Table value of T for n = 15 pairs (the tied pair is dropped) is 25
• Since T is less than the table value, we reject the
null hypothesis and conclude that there is a
difference between the performances of the
two machines.
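The ranking steps (drop zero differences, rank |d| with average ranks for ties, then sum the ranks by sign) can be sketched in Python as follows. The function name `wilcoxon_t` is an illustrative choice, and the table value 25 is taken from the example rather than computed.

```python
def wilcoxon_t(a, b):
    """Return (positive rank sum, negative rank sum, T) for matched pairs."""
    diffs = [x - y for x, y in zip(a, b) if x != y]   # drop zero differences
    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}                                      # |d| -> average rank
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2         # mean of ranks i+1 .. j
        i = j
    pos = sum(rank_of[abs(d)] for d in diffs if d > 0)
    neg = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return pos, neg, min(pos, neg)

machine_a = [73, 43, 47, 53, 58, 47, 52, 58, 38, 61, 56, 56, 34, 55, 65, 75]
machine_b = [51, 41, 43, 41, 47, 32, 24, 58, 43, 53, 52, 57, 44, 57, 40, 68]
pos, neg, t = wilcoxon_t(machine_a, machine_b)
print(pos, neg, t)   # 101.5 18.5 18.5
```

Here T = 18.5 is below the table value 25 quoted in the example, so the null hypothesis is rejected, matching the worked answer.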
Ex 3 A
• Given below are 16 pairs of values showing
the performance of two machines. Test
whether there is a difference between the
performances (use the Wilcoxon matched-pairs
test at the 5% level of significance).
Machine A  73 43 47 53 58 47 52 58 38 61 56 56 34 55 50 75
Machine B  71 41 43 50 51 45 48 59 39 60 52 57 36 57 65 70
Answer
Null Hypothesis: There is no difference between
the performances
Alternative Hypothesis: There is a difference
between the performances
Machine A  Machine B  d (difference)  Rank of |d|  Rank with sign
73         71           2             7            +7
43         41           2             7            +7
47         43           4             12           +12
53         50           3             10           +10
58         51           7             15           +15
47         45           2             7            +7
52         48           4             12           +12
58         59          -1             2.5          -2.5
38         39          -1             2.5          -2.5
61         60           1             2.5          +2.5
56         52           4             12           +12
56         57          -1             2.5          -2.5
34         36          -2             7            -7
55         57          -2             7            -7
50         65         -15             16           -16
75         70           5             14           +14
Total: sum of positive ranks = 98.5, sum of negative ranks = 37.5
• Calculated value of T = 37.5 (smaller of 98.5
and 37.5)
• Table value of T for n = 16 pairs is 30
• Since T is more than the table value, we
accept the null hypothesis and conclude that
there is no significant difference between
the performances of the two machines.
Case 2: When the number of matched pairs > 25
• We apply the Z test (where Z follows the standard
normal distribution).
• Test statistic is Z = (T − µ)/σ
• µ = n(n + 1)/4
• σ = √[n(n + 1)(2n + 1)/24]
• When the numerical value of Z < table value, we accept
the null hypothesis that there is no difference;
otherwise we reject it.
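Question
• The following are the weights in kg of 26 babies
before and after a diet. Use the Signed Rank Test
to test, at the 5% level of significance, whether there is
any significant difference between the data before
and after the diet.
Before  7    3.5  2.1  1.6  7.5  6.3  7    5.4  7.7  8.2  6.8  1.9  1.3  7.2  7.8  1.7
After   7.9  6.2  9    3.7  3.5  1.4  2.6  3.2  9    5.4  8.5  4.4  8.3  9    9.2  3.2

Before (contd.)  2.4  3.5  4.5  8.0  1.5  2.0  5.8  6.5  3.5  5.2
After (contd.)   3.4  2.8  3.4  7.9  3.5  3.2  6.2  6.3  3.0  6.8
Answer
• We apply the Z test as the number of matched pairs > 25, i.e. n = 26
• Test statistic is Z = (T − µ)/σ
• µ = n(n + 1)/4
• σ = √[n(n + 1)(2n + 1)/24]
• T = 128 (sum of the positive ranks, the smaller of the two sums)
• µ = 26 × 27/4 = 175.5
• σ = √(26 × 27 × 53/24) = 39.37
• Z = (128 − 175.5)/39.37 = −1.21, i.e. 1.21 numerically
• Since the calculated value is less than the table value 1.96, we
accept the null hypothesis that there is no significant difference
between the weights before and after the diet.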
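The normal approximation used in this answer is easy to recompute. The following is a minimal sketch that takes T and n as given (T = 128 and n = 26 from the answer above) rather than deriving T from the raw weights; `wilcoxon_z` is a hypothetical name.

```python
import math

def wilcoxon_z(t, n):
    """Large-sample Z for the signed rank test: returns (mu, sigma, z)."""
    mu = n * (n + 1) / 4                               # mean of T under H0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # standard deviation of T
    return mu, sigma, (t - mu) / sigma

mu, sigma, z = wilcoxon_z(128, 26)
print(mu, round(sigma, 2), round(z, 2))   # 175.5 39.37 -1.21
```

Because `abs(z)` is 1.21, which is less than 1.96, the sketch leads to accepting the null hypothesis at the 5% level.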
3. Rank Sum Tests
• In a rank sum test, we replace the values
by ranks. All values are taken together and
assigned ranks. Rank sum tests are
applied to test whether the populations are
identical. Two important rank sum tests are
A. Wilcoxon-Mann-Whitney test (U test)
B. The Kruskal-Wallis test (H test)
3.A. Wilcoxon-Mann-Whitney test (U test) / Mann-Whitney
test (U test) /
Test of identicalness of two populations
• This is a two-sample rank sum test used as an alternative to
the t-test when the assumptions of the t-test about the
populations cannot be made.
• We replace the values by ranks. All values of both samples
are taken together and assigned ranks. That is, the
data are ranked jointly.
• Here we want to test whether the populations are
identical, that is, whether their means are equal.
• U = n1n2 + n1(n1 + 1)/2 − R1
• SE = √[n1n2(n1 + n2 + 1)/12]
• Test statistic is Z = (n1n2/2 − U)/SE
• R1 = sum of the ranks of the values of sample 1; n1 and n2 are the sizes of sample 1 and sample 2.
Question
There are two samples.
The first contains the observations:
[54, 39, 70, 58, 47, 40, 74, 49, 74, 75, 61 and 79].
The second contains:
[45, 41, 62, 53, 33, 45, 71, 42, 68, 73, 54 and 73].
• Apply the rank sum test at the 5% level to test the
hypothesis that they come from populations with
the same mean.
Value  Rank   Sample (I or II)  Rank of Sample I
33     1      II
39     2      I                 2
40     3      I                 3
41     4      II
42     5      II
45     6.5    II
45     6.5    II
47     8      I                 8
49     9      I                 9
53     10     II
54     11.5   I                 11.5
54     11.5   II
58     13     I                 13
61     14     I                 14
62     15     II
68     16     II
70     17     I                 17
71     18     II
73     19.5   II
73     19.5   II
74     21.5   I                 21.5
74     21.5   I                 21.5
75     23     I                 23
79     24     I                 24
R1 = 167.5
n1 = 12, n2 = 12
U = n1n2 + n1(n1 + 1)/2 − R1 = 144 + 78 − 167.5 = 54.5
SE = √[12 × 12 × (12 + 12 + 1)/12] = 17.32
Test statistic = (72 − 54.5)/17.32 = 1.01, where 72 = n1n2/2
Table value at the 5% level of significance = 1.96
Since the calculated value is less than the table value, we accept
the null hypothesis that the two samples come from identical populations.
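The joint ranking and the U statistic in this example can be reproduced with a short Python sketch. It assumes the formula U = n1n2 + n1(n1 + 1)/2 − R1, which is not printed in the notes but reproduces the quoted U = 54.5; the helper names `average_ranks` and `mann_whitney` are illustrative.

```python
import math

def average_ranks(values):
    """Map each distinct value to its rank, averaging the ranks of ties."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    return rank_of

def mann_whitney(sample1, sample2):
    """Return (R1, U, z) with the data ranked jointly."""
    n1, n2 = len(sample1), len(sample2)
    rank_of = average_ranks(sample1 + sample2)
    r1 = sum(rank_of[v] for v in sample1)        # rank sum of sample 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1         # assumed U formula (see lead-in)
    se = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, u, (n1 * n2 / 2 - u) / se

sample_1 = [54, 39, 70, 58, 47, 40, 74, 49, 74, 75, 61, 79]
sample_2 = [45, 41, 62, 53, 33, 45, 71, 42, 68, 73, 54, 73]
r1, u, z = mann_whitney(sample_1, sample_2)
print(r1, u, round(z, 2))   # 167.5 54.5 1.01
```

The value 1.01 is below 1.96, so the sketch reaches the same decision as the worked example.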
Home work No. 1: Apply U Test to test
whether two samples are identical.
Group A Group B
7 8
11 9
9 13
4 14
8 11
6 10
12 12
11 14
9 13
10 9
11 10
11 8
Hw 2
• Sample 1: 10,13,12,15,16,8,6
• Sample 2: 20,14,7,9,17,18,19,25,24
Use U Test to test whether two samples are
identical.
3.B. The Kruskal-Wallis Test (H Test)
• This test is used to test the null hypothesis that k
independent samples come from identical
populations, that is, that all the means are
equal. The alternative hypothesis is that the
means are not all equal. This test is
similar to one-way analysis of variance.
• Method: Data are ranked jointly.
Test Statistic
• H = [12/(n(n + 1))] × Σ(Ri²/ni) − 3(n + 1)
• Ri is the sum of the ranks assigned to the observations in the ith
sample.
• n is the total number of observations.
• ni is the number of observations in the ith sample.
• If each sample has at least 5 items, H can be compared with the
chi-square table value at k − 1 degrees of freedom.
Question
• Use the Kruskal-Wallis Test (H Test) at the 1% level
of significance to test whether the four
salesmen have performed equally in their
sales drive.
Salesman A  171 182 157 148 162
Salesman B  152 175 202 168 176
Salesman C  160 155 139 146 166
Salesman D  179 142 197 170 158
Answer
Value  Salesman  Rank
202    B         1
197    D         2
182    A         3
179    D         4
176    B         5
175    B         6
171    A         7
170    D         8
168    B         9
166    C         10
162    A         11
160    C         12
158    D         13
157    A         14
155    C         15
152    B         16
148    A         17
146    C         18
142    D         19
139    C         20
• Null Hypothesis: There is no significant
difference between the samples (i.e. the four
salesmen have performed equally in their
sales drive).
• R1 = 52; R2 = 37; R3 = 75; R4 = 46
• n is the total number of observations = 5 × 4 = 20
• n1 = 5; n2 = 5; n3 = 5; n4 = 5
• H = [12/(20 × 21)] × (52² + 37² + 75² + 46²)/5 − 3 × 21 = 4.51
• k = number of samples = 4
• Degrees of freedom = k − 1 = 4 − 1 = 3
• Table value at the 1% level of significance for 3 degrees of
freedom is 11.345
• Since the test statistic H is less than the table value, we
accept the null hypothesis. That is, the four salesmen have
performed equally in their sales drive.
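As a check on the rank sums and on H = 4.51, here is a minimal Python sketch. It assumes the usual Kruskal-Wallis formula H = 12/(n(n + 1)) × Σ(Ri²/ni) − 3(n + 1), which reproduces the quoted value, and it ranks from largest to smallest as the example does; `kruskal_wallis_h` is a hypothetical name.

```python
def kruskal_wallis_h(groups):
    """Return (rank sums, H) with the data ranked jointly, largest value = rank 1."""
    pooled = sorted((v for g in groups for v in g), reverse=True)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2      # average rank for ties
        i = j
    n = len(pooled)
    rank_sums = [sum(rank_of[v] for v in g) for g in groups]
    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)  # assumed H formula (see lead-in)
    return rank_sums, h

sales = [
    [171, 182, 157, 148, 162],   # Salesman A
    [152, 175, 202, 168, 176],   # Salesman B
    [160, 155, 139, 146, 166],   # Salesman C
    [179, 142, 197, 170, 158],   # Salesman D
]
rank_sums, h = kruskal_wallis_h(sales)
print(rank_sums, round(h, 2))   # [52.0, 37.0, 75.0, 46.0] 4.51
```

Ranking from smallest to largest instead would change the individual rank sums but not H, so the direction of ranking does not affect the decision.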
Qn 15 (HW 3)
• The following are the kilometres per gallon which a test
driver got for ten tankfuls each of three kinds of
gasoline.
Gasoline A  43 27 43 29 42 49 48 21 37
Gasoline B  39 40 35 26 34 45 32 22 23 18
Gasoline C  28 30 25 31 41 38 36 44 19 50
Use the Kruskal-Wallis Test at the 5% level of significance
to test the null hypothesis that there is no difference in
the average kilometre yield of the three types of gasoline.
4. One Sample Runs Test
• The one-sample runs test is used to judge the
randomness of a sample on the basis of the order in
which the observations are taken.
• The null hypothesis is "There is randomness".
• Test statistic is Z = (r − µ)/σ
• where r is the number of runs,
µ = 2n1n2/(n1 + n2) + 1 and
σ = √[2n1n2(2n1n2 − n1 − n2) / ((n1 + n2)²(n1 + n2 − 1))].
• A run is a succession of identical letters which is
followed or preceded by different letters or no letters
at all. (E.g. MM FFF M FFFF has four runs.)
E.g. AA BBB A BBBB. Here n1 is the number of As and n2 is the number
of Bs.
Ex 7
• The following is an arrangement of men (M) and
women (W). Test the null hypothesis of
randomness at the 5% level of significance.
• MWMWMMMWMWMMMWWMMMMWW
MWMMMWMMMWWWMWMMMWMWM
MMMWWM
Ex
• The following is an arrangement of 25 men (M)
and 15 women (W) lined up to purchase tickets for
a premier picture show:
• M WW MMM W MM W M W M WWW MMM W
MM WWW MMMMMM WWW MMMMMM
• Test for randomness at the 5% level of
significance.
Answer
• H0: There is randomness
• n1 = 25, n2 = 15, r = 17
• µ = 2 × 25 × 15/40 + 1 = 19.75
• σ = 2.92
• Z = (r − µ)/σ = (17 − 19.75)/2.92 = −0.94, i.e. 0.94 numerically
• Since this value is less than 1.96 (at the 5% level of
significance), the null hypothesis is accepted.
Hence, there is no real evidence to suggest that
the arrangement is not random.
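Counting runs by hand is error-prone, so a short Python sketch may help. It assumes the standard large-sample formulas µ = 2n1n2/(n1 + n2) + 1 and σ = √[2n1n2(2n1n2 − n1 − n2)/((n1 + n2)²(n1 + n2 − 1))], which the notes do not print but which reproduce Z = −0.94; `runs_test` is an illustrative name.

```python
import math

def runs_test(sequence):
    """Return (n1, n2, r, z) for a two-symbol sequence; spaces are ignored."""
    symbols = [s for s in sequence if not s.isspace()]
    first = symbols[0]
    n1 = sum(1 for s in symbols if s == first)   # count of the first symbol
    n2 = len(symbols) - n1                       # count of the other symbol
    # a new run starts wherever two neighbouring symbols differ
    r = 1 + sum(1 for a, b in zip(symbols, symbols[1:]) if a != b)
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    sigma = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                      / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return n1, n2, r, (r - mu) / sigma

queue = "M WW MMM W MM W M W M WWW MMM W MM WWW MMMMMM WWW MMMMMM"
n1, n2, r, z = runs_test(queue)
print(n1, n2, r, round(z, 2))   # 25 15 17 -0.94
```

With `abs(z)` below 1.96, the hypothesis of randomness is accepted, as in the worked answer.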
Qn
• The following are the speeds at which every fifth
passenger car was timed at a certain checkpoint:
46,58,60,56,70,66,48,54,62,41,39,52,45,62,53,69,65,65,
67,76,52,52,59,59,67,51,46,61,40,43,42,77,67,63,59,63,
63,72,57,59,42,56,47,62,67,70,63,66,69 and 73.
• Test the null hypothesis of randomness at the 5% level
of significance. (Given: the median speed is 59.5 km per
hour.)
Answer
• Each speed is classed as above or below the median 59.5, and
a new run starts (marked /) whenever the class changes:
46,58 / 60 / 56 / 70,66 / 48,54 / 62 / 41,39,52,45 /
62 / 53 / 69,65,65,67,76 / 52,52,59,59 / 67 / 51,46 /
61 / 40,43,42 / 77,67,63 / 59 / 63,63,72 /
57,59,42,56,47 / 62,67,70,63,66,69,73
• H0: There is randomness
• n1 = 25 (above the median), n2 = 25 (below the median), r = 20
• Z = (r − µ)/σ
• µ = 26
• σ = 3.5
• Z = (20 − 26)/3.5 = −1.71, i.e. 1.71 numerically
• Table value = 1.96
• Since the calculated value is less than the table value, H0 is
accepted. That is, there is randomness.
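For numerical data the only extra step is to turn each value into "above" or "below" the median before counting runs. The sketch below does this for the speed data, using the median 59.5 given in the question and the same assumed large-sample formulas for µ and σ (they reproduce µ = 26 and σ = 3.5); the variable names are illustrative.

```python
import math

speeds = [46, 58, 60, 56, 70, 66, 48, 54, 62, 41, 39, 52, 45, 62, 53, 69, 65, 65,
          67, 76, 52, 52, 59, 59, 67, 51, 46, 61, 40, 43, 42, 77, 67, 63, 59, 63,
          63, 72, 57, 59, 42, 56, 47, 62, 67, 70, 63, 66, 69, 73]
median = 59.5   # given in the question

# "A" = above the median, "B" = below it (no speed equals 59.5)
labels = ["A" if s > median else "B" for s in speeds]
n1 = labels.count("A")
n2 = labels.count("B")
r = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)   # number of runs

mu = 2 * n1 * n2 / (n1 + n2) + 1
sigma = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                  / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
z = (r - mu) / sigma
print(n1, n2, r, mu, round(sigma, 1), round(z, 2))   # 25 25 20 26.0 3.5 -1.71
```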
