
NON PARAMETRIC TESTS

SUBMITTED BY: SARAH ZAHIR


SUBMITTED TO: PROF. VAITHIYANATHAN
• In a statistical test, two kinds of assertions are involved viz., an assertion directly related to the purpose of the investigation and other assertions needed to make a probability statement

• The former is the assertion to be tested and is technically called a hypothesis, whereas the set of all other assertions is called the model

• When we apply a test (to test the hypothesis) without a model, it is known as a distribution-free test, or a nonparametric test
• Non-parametric tests do not make assumptions about the parameters of the population and thus do not make use of the parameters of the distribution

• In other words, under non-parametric or distribution-free tests we do not assume that a particular distribution is applicable, or that a certain value is attached to a parameter of the population

• In fact, there is a growing use of such tests in situations where the normality assumption is open to doubt

• As a result, many distribution-free tests have been developed that do not depend on the shape of the distribution or deal with the parameters of the underlying population
CHARACTERISTICS OF DISTRIBUTION-FREE OR NON-PARAMETRIC TESTS

1. They do not suppose any particular distribution and the consequential assumptions.

2. They are rather quick and easy to use i.e., they do not require laborious computations, since in many cases the observations are replaced by their rank order and in many others we simply use signs.

3. They are often not as efficient or 'sharp' as the parametric tests of significance.

4. When our measurements are not as accurate as is necessary for standard tests of significance, non-parametric methods come to our rescue and can be used fairly satisfactorily.

5. Parametric tests cannot be applied to ordinal or nominal scale data, but non-parametric tests do not suffer from any such limitation.

6. The parametric tests of difference like 't' or 'F' make assumptions about the homogeneity of the variances, whereas this is not necessary for non-parametric tests of difference.
IMPORTANT NONPARAMETRIC OR DISTRIBUTION-FREE TESTS

• Tests of hypotheses based on 'order statistics' or 'nonparametric statistics' or 'distribution-free' statistics are known as nonparametric or distribution-free tests

• The following distribution-free tests are important and generally used:

(i) Test of a hypothesis concerning some single value for the given data (such as the one-sample sign test).

(ii) Test of a hypothesis concerning no difference among two or more sets of data (such as the two-sample sign test, Fisher-Irwin test, Rank sum test, etc.).

(iii) Test of a hypothesis of a relationship between variables (such as Rank correlation, Kendall's coefficient of concordance and other tests for dependence).

(iv) Test of a hypothesis concerning variation in the given data i.e., a test analogous to ANOVA.

(v) Tests of randomness of a sample based on the theory of runs viz., the one-sample runs test.

(vi) Test of a hypothesis to determine if categorical data show dependency or if two classifications are independent viz., the chi-square test.
• Tests used in practice are listed as follows:

1. Sign tests:

• The sign test is one of the easiest non-parametric tests. Its name comes from the fact that it is based on the direction of the plus or minus signs of observations in a sample and not on their numerical magnitudes

• The sign test may be one of the following two types:

(a) One sample sign test
(b) Two sample sign test
(a) One sample sign test:

• The one sample sign test is a very simple non-parametric test applicable when we sample a continuous symmetrical population, in which case the probability of getting a sample value less than the mean is 1/2 and the probability of getting a sample value greater than the mean is also 1/2

• To test the null hypothesis μ = μH0 against an appropriate alternative on the basis of a random sample of size 'n', we replace the value of each and every item of the sample with a plus (+) sign if it is greater than μH0, and with a minus (–) sign if it is less than μH0

• But if the value happens to be equal to μH0, then we simply discard it

• After doing this, we test the null hypothesis that these + and – signs are values of a random variable having a binomial distribution with p = 1/2

• For performing the one sample sign test when the sample is small, we can use tables of binomial probabilities, but when the sample happens to be large, we use the normal approximation to the binomial distribution
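
For illustration, the small-sample case can be carried out as an exact binomial test on the signs. The following is a minimal sketch using SciPy's binomtest; the data values and the hypothesized value μH0 = 146 are made-up assumptions, not taken from the slides.

```python
# One-sample sign test: a minimal sketch using an exact binomial test.
# The data and the hypothesized value mu0 are illustrative assumptions.
from scipy.stats import binomtest

data = [142, 140, 130, 135, 170, 157, 151, 137, 144, 160]
mu0 = 146  # hypothesized population value under H0

diffs = [x - mu0 for x in data if x != mu0]  # discard values equal to mu0
n_plus = sum(1 for d in diffs if d > 0)      # number of plus (+) signs

# Under H0 the + signs follow a binomial distribution with p = 1/2
result = binomtest(n_plus, len(diffs), p=0.5, alternative='two-sided')
print(f"plus signs: {n_plus} of {len(diffs)}, p-value = {result.pvalue:.4f}")
```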
(b) Two sample sign test:

• The sign test has important applications in problems where we deal with paired data. In such problems, each pair of values can be replaced with a plus (+) sign if the value from the first sample (say X) is greater than the corresponding value from the second sample (say Y), and with a minus (–) sign if the value of X is less than the value of Y

• In case the two values are equal, the pair concerned is discarded. (In case the two samples are not of equal size, then some of the values of the larger sample left over after the random pairing will have to be discarded.)

• The testing technique remains the same as stated in the case of the one sample sign test
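
The paired case works the same way, applied to the signs of the pair differences. A minimal sketch, again with illustrative made-up data:

```python
# Two-sample (paired) sign test sketch; the X and Y values are illustrative.
from scipy.stats import binomtest

x = [12, 15, 9, 14, 11, 10, 13, 8]
y = [10, 15, 7, 16, 9, 8, 11, 9]

diffs = [a - b for a, b in zip(x, y) if a != b]  # discard tied pairs
n_plus = sum(1 for d in diffs if d > 0)          # pairs where X > Y
result = binomtest(n_plus, len(diffs), p=0.5, alternative='two-sided')
print(f"plus signs: {n_plus} of {len(diffs)}, p-value = {result.pvalue:.4f}")
```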
Wilcoxon Matched-pairs Test (or Signed Rank Test):

• While applying this test, we first find the differences (di) between each pair of values and assign ranks to the differences from the smallest to the largest without regard to sign

• The actual sign of each difference is then attached to the corresponding rank, and the test statistic T is calculated, which is the smaller of the two sums viz., the sum of the negative ranks and the sum of the positive ranks

• While using this test, we may come across two types of tie situations

• One situation arises when the two values of some matched pair(s) are equal i.e., the difference between the values is zero, in which case we drop the pair(s) from our calculations

• The other situation arises when two or more pairs have the same difference value, in which case we assign ranks to such pairs by averaging their rank positions. For instance, if two pairs would occupy the rank positions 5 and 6, we assign the rank (5 + 6)/2 = 5.5 to each pair and rank the next largest difference as 7
• For this test, the calculated value of T must be equal to or smaller than the table value in order to reject the null hypothesis. In case the number of pairs exceeds 25, the sampling distribution of T is taken as approximately normal with mean μT = n(n + 1)/4 and standard deviation

σT = √[n(n + 1)(2n + 1)/24]
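
SciPy provides this test directly as scipy.stats.wilcoxon. The minimal sketch below assumes illustrative before/after scores; with zero_method='wilcox', zero differences are dropped before ranking, matching the rule stated above.

```python
# Wilcoxon matched-pairs (signed rank) test sketch; scores are illustrative.
from scipy.stats import wilcoxon

before = [72, 75, 68, 80, 74, 77, 70, 69, 81, 73]
after  = [78, 74, 73, 86, 79, 77, 75, 72, 84, 76]

# zero_method='wilcox' drops zero differences before ranking
stat, p = wilcoxon(before, after, zero_method='wilcox')
print(f"T = {stat}, p-value = {p:.4f}")  # T is the smaller of the two rank sums
```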
Mann-Whitney U test:

• This is a very popular test amongst the rank sum tests

• This test is used to determine whether two independent samples have been drawn from the same population

• This test applies under very general conditions and requires only that the populations sampled are continuous

• However, in practice even the violation of this assumption does not affect the results very much
• To perform this test, we first of all rank the data jointly, taking them as belonging to a single sample, in either an increasing or a decreasing order of magnitude

• We usually adopt the low-to-high ranking process, which means we assign rank 1 to the item with the lowest value, rank 2 to the next higher item, and so on

• In case there are ties, we assign each of the tied observations the mean of the ranks which they jointly occupy. For example, if the sixth, seventh and eighth values are identical, we would assign each the rank (6 + 7 + 8)/3 = 7. After this we find the sum of the ranks assigned to the values of the first sample (and call it R1) and also the sum of the ranks assigned to the values of the second sample (and call it R2). Then we work out the test statistic U, which is a measure of the difference between the ranked observations of the two samples: U = n1·n2 + n1(n1 + 1)/2 – R1, where n1 and n2 are the sizes of the two samples
• In applying the U test we take the null hypothesis that the two samples come from identical populations

• If this hypothesis is true, it seems reasonable to suppose that the means of the ranks assigned to the values of the two samples should be more or less the same

• Under the alternative hypothesis, the means of the two populations are not equal; if this is so, then most of the smaller ranks will go to the values of one sample while most of the higher ranks will go to those of the other sample
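
A minimal sketch using scipy.stats.mannwhitneyu; the two independent samples are illustrative assumptions.

```python
# Mann-Whitney U test sketch for two independent samples; data are illustrative.
from scipy.stats import mannwhitneyu

sample1 = [25, 30, 28, 34, 24, 29, 32]
sample2 = [31, 35, 29, 38, 36, 33, 40]

u, p = mannwhitneyu(sample1, sample2, alternative='two-sided')
print(f"U = {u}, p-value = {p:.4f}")
```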
Kruskal-Wallis test (or H test)

• This test is conducted in a way similar to the U test described above

• This test is used to test the null hypothesis that 'k' independent random samples come from identical universes against the alternative hypothesis that the means of these universes are not equal

• This test is analogous to the one-way analysis of variance, but unlike the latter it does not require the assumption that the samples come from approximately normal populations or universes having the same standard deviation

• In this test, like the U test, the data are ranked jointly from low to high or high to low as if they constituted a single sample. The test statistic for this test is H, worked out as H = [12 / (N(N + 1))] ∑(Rj²/nj) – 3(N + 1), where N is the total number of observations, nj is the size of the jth sample and Rj is the sum of the ranks in the jth sample

• If the null hypothesis is true that there is no difference between the


sample means and each sample has at least five items, then the
sampling distribution of H can be approximated with a chi-square
distribution with (k – 1) degrees of freedom

• As such we can reject the null hypothesis at a given level of


significance if H value calculated, as stated above, exceeds the
concerned table value of chi-square.
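
A minimal sketch using scipy.stats.kruskal for three illustrative groups; the resulting p-value comes from the chi-square approximation with k – 1 = 2 degrees of freedom described above.

```python
# Kruskal-Wallis H test sketch for k = 3 independent samples; data are illustrative.
from scipy.stats import kruskal

group_a = [27, 31, 29, 35, 33]
group_b = [22, 25, 24, 28, 26]
group_c = [30, 36, 34, 38, 32]

h, p = kruskal(group_a, group_b, group_c)
print(f"H = {h:.3f}, p-value = {p:.4f}")  # p from chi-square with k - 1 df
```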
Spearman's Rank Correlation:

• When the data are not available in numerical form for doing correlation analysis but the information is sufficient to rank the data as first, second, third, and so forth, we quite often use the rank correlation method and work out the coefficient of rank correlation

• In fact, the rank correlation coefficient is a measure of the correlation that exists between the two sets of ranks

• In other words, it is a measure of association that is based on the ranks of the observations and not on the numerical values of the data

• For calculating the rank correlation coefficient, first of all the actual observations are replaced by their ranks, giving rank 1 to the highest value, rank 2 to the next highest value, and so on in this very order for all values

• If two or more values happen to be equal, then the average of the ranks
which should have been assigned to such values had they been all
different, is taken and the same rank (equal to the said average) is
given to concerning values

• The second step is to record the difference between ranks (or ‘d’) for
each pair of observations, then square these differences to obtain a
total of such differences which can symbolically be stated as ∑di 2
• The rank correlation coefficient is then worked out as: rs = 1 – [6∑di² / (n(n² – 1))], where n is the number of paired observations

• The value of Spearman's rank correlation coefficient will always vary between +1 and –1, with +1 indicating a perfect positive correlation and –1 indicating a perfect negative correlation between the two variables

• All other values of the correlation coefficient indicate different degrees of correlation
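
A minimal sketch using scipy.stats.spearmanr; the two illustrative rankings are assumptions. spearmanr handles tied observations by averaging ranks, as described above.

```python
# Spearman rank correlation sketch; the paired rankings are illustrative.
from scipy.stats import spearmanr

ranks_x = [1, 2, 3, 4, 5, 6]
ranks_y = [2, 1, 4, 3, 6, 5]

rho, p = spearmanr(ranks_x, ranks_y)
print(f"rs = {rho:.3f}, p-value = {p:.4f}")
```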
Kendall's Coefficient of Concordance:

• Kendall's coefficient of concordance, represented by the symbol W, is an important non-parametric measure of relationship

• It is used for determining the degree of association among several (k) sets of rankings of N objects or individuals

• When there are only two sets of rankings of N objects, we generally work out Spearman's coefficient of correlation, but Kendall's coefficient of concordance (W) is considered an appropriate measure for studying the degree of association among three or more sets of rankings
• The basis of Kendall's coefficient of concordance is to imagine how the given data would look if there were no agreement among the several sets of rankings, and then to imagine how they would look if there were perfect agreement among the several sets

• For instance, in the case of, say, four interviewers interviewing, say, six job applicants and assigning rank orders on suitability for employment, if perfect agreement is observed amongst the interviewers, then one applicant would be assigned rank 1 by all four and the sum of his ranks would be 1 + 1 + 1 + 1 = 4

• Another applicant would be assigned rank 2 by all four and the sum of his ranks would be 2 + 2 + 2 + 2 = 8

• The sums of ranks for the six applicants would be 4, 8, 12, 16, 20 and 24 (not necessarily in this very order)

• In general, when perfect agreement exists among the ranks assigned by k judges to N objects, the rank sums are k, 2k, 3k, …, Nk. The total sum of the N ranks for k judges is kN(N + 1)/2 and the mean rank sum is k(N + 1)/2
• The degree of agreement between the judges reflects itself in the variation of the rank sums

• When all the judges agree, this variation is a maximum. Disagreement between the judges reflects itself in a reduction in the variation of the rank sums

• For maximum disagreement, the rank sums will tend to be more or less equal. This provides the basis for the definition of the coefficient of concordance. When perfect agreement exists between the judges, W equals 1

• When maximum disagreement exists, W equals 0

• It may be noted that W does not take negative values because, with more than two judges, complete disagreement cannot take place

• Thus, the coefficient of concordance (W) is an index of the divergence of the actual agreement shown in the data from perfect agreement
• The procedure for computing and interpreting Kendall's coefficient of concordance (W) is as follows:

• (a) All the N objects should be ranked by all k judges in the usual fashion, and this information may be put in the form of a k by N matrix

• (b) For each object, determine the sum of the ranks (Rj) assigned by all the k judges

• (c) Determine the mean of the Rj and then obtain the value of s as the sum of the squared deviations of the rank sums from that mean:

s = ∑(Rj – R̄)², where R̄ = k(N + 1)/2

• (d) Work out the value of W using the following formula:

W = 12s / [k²(N³ – N)]
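
SciPy has no built-in function for W, so the sketch below implements the formula above directly with NumPy; the 4-by-6 matrix of ranks (four judges, six applicants, no tied ranks) is an illustrative assumption.

```python
# Kendall's coefficient of concordance (W): a direct NumPy implementation of
# W = 12s / [k^2 (N^3 - N)], assuming no tied ranks. Data are illustrative.
import numpy as np

# k = 4 judges (rows) ranking N = 6 applicants (columns)
ranks = np.array([
    [1, 2, 3, 4, 5, 6],
    [2, 1, 3, 5, 4, 6],
    [1, 3, 2, 4, 6, 5],
    [2, 1, 4, 3, 5, 6],
])
k, N = ranks.shape

R = ranks.sum(axis=0)              # rank sum Rj for each object
s = ((R - R.mean()) ** 2).sum()    # s = sum of squared deviations of the Rj
W = 12 * s / (k**2 * (N**3 - N))   # 1 = perfect agreement, 0 = max disagreement
print(f"W = {W:.3f}")
```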
THANK YOU
