Topic 1: Topic 2: Topic 3:: This Course Is Designed To Deepen Students'
TOPIC 1:
INTRODUCTION TO NON-PARAMETRIC STATISTICS

TOPIC 2:
PARAMETRIC VS NON-PARAMETRIC TESTS
You are already familiar with parametric statistics from M111, which covers descriptive statistics and some inferential statistics, mainly parametric. Below is a comparison between parametric and nonparametric statistics.

PARAMETRIC STATISTICS
Definition:
Parametric statistics apply to data that are assumed to have been drawn from a particular distribution and that are analyzed with a parametric test.
Advantage:
The advantage of using a parametric test is having more statistical power than a nonparametric test. In other words, a parametric test is more able to lead to a rejection of H0.
Assumption/Conditions:
▪ Normality: Data have a normal distribution (or at least are symmetric)
▪ Homogeneity of variances: Data from multiple groups have the same variance
▪ Linearity: Data have a linear relationship
▪ Independence: Data are independent

NONPARAMETRIC STATISTICS
Definition:
A nonparametric test (sometimes called a distribution-free test) does not assume anything about the underlying distribution.
Advantage:
Nonparametric tests are more robust than parametric tests. In other words, they are valid in a broader range of situations (fewer conditions of validity).
Assumption/Conditions:
▪ Normality: Data need not have a normal distribution
▪ Homogeneity of variances: Data from multiple groups might not have the same variance
▪ Linearity: Data need not have a linear relationship
▪ Independence: Data are independent

Two Alternative Names which are Frequently Given to Nonparametric Tests

Distribution-Free:
Non-parametric tests are “distribution-free”. They do not assume that the scores under analysis are drawn from a population distributed in a certain way, e.g., from a normally distributed population.

Ranking Tests:
Alternatively, many of these tests are identified as “ranking tests”, and this title suggests their other principal merit: non-parametric techniques may be used with scores which are not exact in any numerical sense, but which in effect are simply ranks.

Precautions in using Non-Parametric Tests
In the use of non-parametric tests, the student is cautioned against the following lapses:
1. When measurements are in terms of interval and ratio scales, the transformation of the measurements to nominal or ordinal scales will lead to the loss of much information. Hence, as far as possible, parametric tests should be applied in such situations. In using a non-parametric method as a shortcut, we are throwing away dollars in order to save pennies.
2. In situations where the assumptions underlying a parametric test are satisfied and both parametric and non-parametric tests can be applied, the choice should be the parametric test because most parametric tests have greater power in such situations.
3. Non-parametric tests, no doubt, provide a means for avoiding the assumption of normality of distribution. But these methods do nothing to avoid the assumptions of independence or homoscedasticity wherever applicable.
4. Behavioral scientists should specify the null hypothesis, alternative hypothesis, statistical test, sampling distribution, and level of significance in advance of the collection of data. Hunting around for a statistical test after the data have been collected tends to maximize the effects of any chance differences which favor one test over another. As a result, the possibility of rejecting the null hypothesis when it is true (Type I error) is greatly increased. However, this caution is applicable equally to parametric as well as non-parametric tests.
5. We do not have the problem of choosing statistical tests for categorical variables. Non-parametric tests alone are suitable for enumerative data.
6. The F and t tests are generally considered to be robust tests because the violation of the underlying assumptions does not invalidate the inferences. It is customary to justify the use of a normal theory test in a situation where normality cannot be guaranteed, by arguing that it is robust under non-normality.
Types or Characteristics of Data for Nonparametric Statistics

1. The underlying data do not meet the assumptions about the population sample

Generally, the application of parametric tests requires various assumptions to be satisfied. For example, the data follows a normal distribution and the population variance is homogeneous. However, some data samples may show skewed distributions.

The skewness makes the parametric tests less powerful because the mean is no longer the best measure of central tendency, since it is strongly affected by the extreme values. At the same time, nonparametric tests work well with skewed distributions and distributions that are better represented by the median.

2. The population sample size is too small

The sample size is an important assumption in selecting the appropriate statistical method. If a sample size is reasonably large, the applicable parametric test can be used. However, if a sample size is too small, it is possible that you may not be able to validate the distribution of the data. Thus, the application of nonparametric tests is the only suitable option.

TOPIC 3:
LEVELS OF MEASUREMENT AND TYPES OF DATA USED IN A NONPARAMETRIC TEST

Nominal Scale Level
Data that is measured using a nominal scale is qualitative. Categories, colors, names, labels and favourite foods along with yes or no responses are examples of nominal level data. Nominal scale data are not ordered. Nominal scale data cannot be used in calculations.

Example:
❖ To classify people according to their favorite food, like pizza, spaghetti, and sushi. Putting pizza first and sushi second is not meaningful.
❖ Smartphone companies are another example of nominal scale data. Some examples are Sony, Motorola, Nokia, Samsung and Apple. This is just a list and there is no agreed upon order. Some people may favor Apple but that is a matter of opinion.
Interval Scale Level
Data that is measured using the interval scale is similar to ordinal level data because it has a definite ordering, but in addition the differences between data values are meaningful. The differences between interval scale data can be measured, though the data does not have a starting point (a true zero).

Temperature scales like Celsius (C) and Fahrenheit (F) are measured by using the interval scale. In both temperature measurements, 40° is equal to 100° minus 60°. Differences make sense. But 0 degrees does not because, in both scales, 0 is not the absolute lowest temperature. Temperatures like -10° F and -15° C exist and are colder than 0.

Interval level data can be used in calculations, but ratios cannot be meaningfully compared. 80° C is not four times as hot as 20° C (nor is 80° F four times as hot as 20° F). There is no meaning to the ratio of 80 to 20 (or four to one).

Example:
❖ Monthly income of 2000 part-time students in Texas
❖ Highest daily temperature in Odessa

Ratio Scale Level
Data that is measured using the ratio scale takes care of the ratio problem and gives you the most information. Ratio scale data is like interval scale data, but it has a 0 point and ratios can be calculated. You will not have a negative value in ratio scale data.

For example, four multiple choice statistics final exam scores are 80, 68, 20 and 92 (out of a possible 100 points, given that the exams are machine-graded). The data can be put in order from lowest to highest: 20, 68, 80, 92. There are no negative final exam scores, as the lowest possible score is 0 points.

The differences between the data have meaning. The score 92 is more than the score 68 by 24 points. Ratios can be calculated. The smallest score is 0. So, 80 is four times 20. If one student scores 80 points and another student scores 20 points, the student who scores higher has 4 times as many points as the student who scores lower.

Example:
❖ Weight of 200 cancer patients in the past 5 months
❖ Height of 549 newborn babies
❖ Diameter of 150 donuts

TOPIC 4:
THE USE OF STATISTICAL TESTS IN RESEARCH

TOPIC 5:
THE NULL AND ALTERNATIVE HYPOTHESIS

What are Statistical Tests?
A statistical test provides a mechanism for making quantitative decisions about a process or processes. The intent is to determine whether there is enough evidence to "reject" a conjecture or hypothesis about the process. The conjecture is called the null hypothesis. Not rejecting may be a good result if we want to continue to act as if we "believe" the null hypothesis is true. Or it may be a disappointing result, possibly indicating we may not yet have enough data to "prove" something by rejecting the null hypothesis.

Statistical treatments are basically used for hypothesis testing, that is, to decide whether to accept or reject a certain statement. As mentioned above, they are used particularly for testing the null hypothesis, but there is another type of hypothesis called the alternative hypothesis. A comparison between the two is given below.

Null Hypothesis

Definition:
The null hypothesis is a general statement that there is no relationship between two phenomena under consideration or that there is no association between two groups.

Symbol:
• The symbol for the null hypothesis is H0, and it is read as H-null, H-zero, or H-naught.
• The null hypothesis is usually associated with just the ‘equals to’ sign, as a null hypothesis can either be accepted or rejected.
• 𝐻0 : µ1 = µ2

Examples:
The following are some examples of null hypotheses:
1. If the hypothesis is that “the consumption of a particular medicine reduces the chances of heart arrest”, the null hypothesis will be “the consumption of the medicine doesn’t reduce the chances of heart arrest.”
2. If the hypothesis is that, “If random test scores are collected from men and women, does the score of one group differ from the other?” a possible null hypothesis will be that the mean test score of men is the same as that of the women.

𝐻0 : µ1 = µ2
𝐻0 = null hypothesis
µ1 = mean score of men
µ2 = mean score of women

Alternative Hypothesis

Symbol:
• The symbol of the alternative hypothesis is either H1 or Ha, and it is stated using less than, greater than or not equal signs.
• 𝐻𝑎: 𝑋1 ≠ 𝑋2 Two-tailed Test
• 𝐻𝑎: 𝑋1 < 𝑋2 One-tailed Test
• 𝐻𝑎: 𝑋1 > 𝑋2 One-tailed Test

Examples:
The following are some examples of alternative hypotheses:
1. If a researcher is assuming that the bearing capacity of a bridge is more than 10 tons, then the hypotheses under this study will be:
▪ Null hypothesis 𝐻0: µ = 10 tons
▪ Alternative hypothesis 𝐻𝑎: µ > 10 tons
2. Under another study that is trying to test whether there is a significant difference between the effectiveness of medicine against heart arrest, the alternative hypothesis will be that there is a relationship between the medicine and chances of heart arrest.

Sample Problems with Their Null and Alternative Hypothesis

Example 1
The life span of 100 W light bulbs manufactured by a particular company follows a normal distribution with a standard deviation of 120 hours and its half-life is guaranteed under warranty for a minimum of 800 hours. At random, a sample of 50 bulbs from a lot is selected and it is revealed that the half-life is 750 hours. With a significance level of 0.01, should the lot be rejected by not honoring the warranty?
𝐻0 : µ = 800 There is no significant difference between the sample mean and the population mean (800).

Example 2
A manufacturer of electric lamps is testing a new production method that will be considered acceptable if the lamps produced by this method result in a normal population with an average life of 2,400 hours and a standard deviation equal to 300. A sample of 100 lamps produced by this method has an average life of 2,320 hours. Can the hypothesis of validity for the new manufacturing process be accepted with a risk equal to or less than 5%?
𝐻0 : 𝜇 = 2,400 There is no significant difference between the sample mean and the population mean (2400).

Example 3
The quality control division of a factory that manufactures batteries suspects defects in the production of a model of mobile phone battery which results in a lower life for the product. Until now, the time duration in phone conversation for the battery followed a normal distribution with a mean of 300 minutes and a standard deviation of 30. However, in an inspection of the last batch produced before sending it to market, it was found that the average time spent in conversation was 290 minutes in a sample of 60 batteries. Assuming that the time is still normal with the same standard deviation: Can it be concluded that the quality control suspicions are true at a significance level of 1%?
𝐻0 : µ = 300 There is no significant difference between the sample mean and the population mean (300).
𝐻1 : µ < 300 The sample mean is significantly less than the population mean (300).
Example 4
It is believed that the average level of prothrombin in a normal population is 20 mg/100 ml
of blood plasma with a standard deviation of 4 milligrams/100 ml. To verify this, a sample is
taken from 40 individuals in whom the average is 18.5 mg/100 ml. Can the hypothesis be
accepted with a significance level of 5%?
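
Example 4 poses the question without working it. One way to answer it, assuming the stated standard deviation of 4 is the known population value and the alternative is two-sided (H1: µ ≠ 20), is a one-sample z test. The sketch below uses only the Python standard library; the function name one_sample_z is illustrative and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: mu = mu0 with known sigma."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Two-sided p-value: probability of a |Z| at least this large under H0
    p_two_sided = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_sided


# Example 4: hypothesized mean 20, sigma 4, n = 40, observed mean 18.5
z, p = one_sample_z(18.5, 20, 4, 40)
alpha = 0.05
print(f"z = {z:.3f}, p = {p:.4f}, reject H0 at alpha={alpha}: {p < alpha}")
```

Under these assumptions the p-value falls below 0.05, so the hypothesis that the mean is 20 mg/100 ml would be rejected at the 5% level. A one-sided alternative would call for a one-tailed p-value instead.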
TOPIC 6:
Every nonparametric test has its own conditions and assumptions. Below are charts that may guide you in choosing which nonparametric test is applicable and appropriate to the study or research you are undertaking.
The charts above categorize the nonparametric tests according to the type of data required and the number of groups being compared. Below are examples of problems with the type of data and the number of groups identified.

Example 1
A study assessed the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. A total of 8 children with autism enroll in the study, and the amount of time that each child is engaged in repetitive behavior during three-hour observation periods is measured both before treatment and then again after taking the new medication for a period of 1 week. The data are shown below.

Child   Before   After
1       85       75
2       70       50
3       40       50
4       65       40
5       80       20
6       75       65
7       55       40
8       20       25

Variable/s: number of times a child is engaged in repetitive behavior
Type of data: ordinal (although the data shown are ratio, they will be transformed to ordinal)
No. of Groups: 2
Independence of groups: Dependent/Paired (there is a before and after)
Nonparametric Statistical Treatment: Wilcoxon Signed-rank Test

Example 2
You want to find out how test anxiety affects actual test scores. The independent variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety. The dependent variable is the exam score, rated from 0 to 100%.

Variable/s:
a. Test Anxiety Levels
b. Exam Score Percentage
Type of data:
a. Ordinal
b. Ratio
No. of Groups: 3 (the sample is grouped into low, medium and high anxiety)
Independence of groups: Independent
Nonparametric Statistical Treatment: Kruskal-Wallis Test

Example 3
A school teacher wanted to examine whether pass rates increased as students had more time to study. In this hypothetical study, 60 students were recruited to take part. All students were first given a "surprise exam" to test their current knowledge. They were then given a "mock exam" two weeks later before they took a "final exam" a further two weeks later. Students' performance in the exams was assessed in terms of a "pass" or "fail".

Variable/s: students either “pass” or “fail”
Type of data: categorical or nominal (the data is dichotomous and binary)
No. of Groups: 3 (surprise exam, mock exam, final exam)
Nonparametric Statistical Treatment: Cochran Q Test
Example 4
A researcher wanted to investigate the impact of an intervention on smoking. In this hypothetical study, 50 participants were recruited to take part, consisting of 25 smokers and 25 non-smokers. All participants watched an emotive video showing the impact that deaths from smoking-related cancers had on families. Two weeks after this video intervention, the same participants were asked whether they remained smokers or non-smokers.

Variable/s: smoker or non-smoker
Type of data: categorical or nominal (the data is dichotomous and binary)
No. of Groups: 2 (Nonsmoker Group, Smoker Group)
Independence of groups: Dependent/Paired (there is a before and after)
Nonparametric Statistical Treatment: McNemar Test

The test of the population median can be one-tailed (right or left tailed) or two-tailed, depending on the hypothesis.

▪ Left tailed test- H0: median ≥ hypothesized value k; H1: median < k
▪ Right tailed test- H0: median ≤ hypothesized value k; H1: median > k
▪ Two tailed test- H0: median = hypothesized value k; H1: median ≠ k

Assumptions:
✓ Data is non-normally distributed.
✓ A random sample of independent measurements for a population with unknown median
✓ The variable of interest is continuous
✓ The one-sample test handles a non-symmetric data set, that is, one skewed either to the right or the left
Example of One Sample Sign Test
The manager of the Bank of America branch in West Palm Beach, FL indicates that the median number of savings account customers per day is 64. A clerk from the same branch claims that it is more than 64. The clerk collected the number of savings account customers per day for 10 random days. Can we reject the branch manager’s claim at the 0.05 significance level?
Solution:
1. Null Hypothesis H0: Savings account customer median = 64;
   Alternative Hypothesis H1: Savings account customer median > 64
2. Level of significance = 0.05
3. Assign observations less than 64 a – sign and observations above 64 a + sign.
4. Critical value
   Look at the Binomial table (10, 0.5)
   • Note: 10 is the number of trials
   • 0.5 is the 50% chance of being more than the median value and the 50% chance of being less than the median value
   At 0.05 significance level
Since the test statistic 2 is in the acceptance region for H0, do not reject the null hypothesis. So, there is no significant evidence that the savings account customers per day are more than 64.
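
The binomial reasoning in the sign test above can be sketched in Python. The ten daily counts below are hypothetical, chosen only so that 8 days fall above 64 and 2 below, which matches the test statistic of 2 quoted in the solution. The function name sign_test_upper is illustrative.

```python
from math import comb


def sign_test_upper(data, median0):
    """One-sample sign test for H1: median > median0.

    Returns (plus, minus, p_value). Observations equal to median0 are dropped.
    """
    plus = sum(1 for x in data if x > median0)
    minus = sum(1 for x in data if x < median0)
    n = plus + minus
    # Under H0 the number of + signs follows Binomial(n, 0.5);
    # the p-value is the chance of seeing at least `plus` + signs.
    p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n
    return plus, minus, p_value


# Hypothetical daily customer counts (not taken from the course material)
customers = [60, 66, 65, 70, 68, 72, 46, 76, 77, 75]
plus, minus, p = sign_test_upper(customers, 64)
print(plus, minus, round(p, 4))
```

With these illustrative counts the one-sided p-value is 56/1024, slightly above 0.05, which agrees with the decision not to reject H0.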
TOPIC 9:
X² ONE-SAMPLE TEST

Definition:

E_i = n p_i

where E_i is the expected frequency count for the i-th level of the categorical variable, n is the total sample size, and p_i is the hypothesized proportion of observations in level i.
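
As a small illustration of the expected-frequency formula, the sketch below uses the eight face counts from the octahedral die example in this topic (7, 10, 11, 9, 12, 10, 14, 7) with a hypothesized proportion of 1/8 per face. The statistic computed is the usual Pearson form, the sum of (observed − expected)² / expected, which is assumed here to be the one the topic intends. The function name chi_square_gof is illustrative.

```python
def chi_square_gof(observed, proportions):
    """Return (expected counts, chi-square statistic) for a one-sample test."""
    n = sum(observed)
    expected = [n * p for p in proportions]  # E_i = n * p_i
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic


# Observed counts for the eight faces of the die
observed = [7, 10, 11, 9, 12, 10, 14, 7]
expected, chi_sq = chi_square_gof(observed, [1 / 8] * 8)
df = len(observed) - 1  # degrees of freedom = number of levels - 1
print(expected[0], round(chi_sq, 2), df)
```

Each expected count comes out as 10 and the statistic as 4.0 with df = 7, matching the values quoted in the example.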
5. The expected frequencies are based on a uniform distribution.
   (1/8)(7 + 10 + 11 + 9 + 12 + 10 + 14 + 7) = 10
7. The Chi-square value, 4.0, with degrees of freedom df = 7, does not lie in the critical region, so the null hypothesis is not rejected.
8. Therefore, the octahedral die is fair.

TOPIC 10:
WILCOXON SIGNED-RANKS TEST

Null hypothesis: 𝐻0 : 𝑀1 − 𝑀2 = 0

Alternative hypothesis     Decision Rule
𝐻1 : 𝑀1 − 𝑀2 ≠ 0           Reject 𝐻0 if 𝑊 < 𝑤_{𝛼/2} or 𝑊 > 𝑤_{1−𝛼/2}
𝐻1 : 𝑀1 − 𝑀2 > 0           Reject 𝐻0 if 𝑊 > 𝑤_{1−𝛼}
𝐻1 : 𝑀1 − 𝑀2 < 0           Reject 𝐻0 if 𝑊 < 𝑤_{𝛼}

Note: 𝑤_{𝛼} is the critical value of 𝑊 given in Table J of the Appendix, with
𝑤_{1−𝛼/2} = 𝑛(𝑛+1)/2 − 𝑤_{𝛼/2} and 𝑤_{1−𝛼} = 𝑛(𝑛+1)/2 − 𝑤_{𝛼}
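
The signed-rank statistic 𝑊 can be sketched in Python as follows, using the pre-test and post-test scores from the worked example that follows in this topic. This sketch assumes the common convention of dropping zero differences before ranking and giving tied absolute differences their average rank. The helper names average_ranks and signed_rank_sums are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(first, second):
    """Return (n, W+, W-) for paired samples, dropping zero differences."""
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return len(diffs), w_plus, w_minus


pre = [45, 56, 38, 52, 45, 25, 36, 28, 34, 53]
post = [61, 53, 40, 52, 38, 30, 32, 28, 42, 54]
n, w_plus, w_minus = signed_rank_sums(pre, post)
print(n, w_plus, w_minus)
```

The two sums always add up to 𝑛(𝑛+1)/2 for the 𝑛 non-zero differences, which is the same identity the note above uses to obtain the upper critical value from the lower one.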
Example:
Pre-test and post-test scores were obtained from a group of 10 high school seniors as shown below. The pre-test scores were collected before the experiment was conducted and the post-test scores were obtained at the conclusion of the experiment. Is there a difference between the pre-test and post-test scores?

Student           1    2    3    4    5    6    7    8    9   10
Pre-test Score   45   56   38   52   45   25   36   28   34   53
Post-test Score  61   53   40   52   38   30   32   28   42   54

Solution:
1. 𝐻0 : 𝑀pre − 𝑀post = 0 There is no difference in the population median scores between pre-test and post-test.
   𝐻1 : 𝑀pre − 𝑀post ≠ 0 There is a difference in the population median scores between pre-test and post-test.
2. Level of significance: 𝛼 = 0.05
3. Test statistic: Wilcoxon signed-rank test. 𝑊 = ∑(𝑅𝑖 where 𝐷𝑖 is positive)
4. Rejection region: Reject 𝐻0 if 𝑊 < 9 or 𝑊 > 46
5. Computation: The differences 𝐷𝑖 = post-test − pre-test are computed, the two zero differences (students 4 and 8) are dropped, and the remaining absolute differences are ranked from 1 (smallest) to 8 (largest). Each rank then takes the sign of its difference.

Student           1    2    3    4    5    6    7    8    9   10
Pre-test Score   45   56   38   52   45   25   36   28   34   53
Post-test Score  61   53   40   52   38   30   32   28   42   54
𝐷𝑖              +16   -3   +2    0   -7   +5   -4    0   +8   +1
𝑅𝑖               +8   -3   +2    .   -6   +5   -4    .   +7   +1

𝑊 = 8 + 2 + 5 + 7 + 1 = 23

6. Statistical Decision: Do not reject 𝐻0 since the computed value of 𝑊 is 23 and is not less than 9 and not greater than 46. (Dropping the zero differences leaves 8 ranked pairs; reading the critical values for 𝑛 = 8 instead of 𝑛 = 10 does not change this decision.)
7. Interpretation/Conclusion: We do not have sufficient evidence to conclude that the true median scores in the pre-test and post-test are different.

TOPIC 11:
MANN-WHITNEY TEST

The Mann-Whitney Test is a non-parametric alternative to the t-test of two independent samples. It requires only that the variable be continuous and that the data be measured on at least an ordinal level of measurement. The test does not require that the samples, which are to be compared, be of the same size. The interpretation of the test is essentially identical to that of the t-statistic for independent samples.

If there are no ties, or just a few ties, the test statistic for a small sample in each group (𝑚 ≤ 20, 𝑛 ≤ 20) is shown below.

𝑈 = ∑_{𝑖=1}^{𝑛} 𝑅(𝑋𝑖)

Where: 𝑈 = computed Mann-Whitney statistic
       𝑅(𝑋𝑖) = ranks assigned to the sample from the first group

The procedure involves ranking the data in both groups together by assigning rank 1 to the lowest observation, rank 2 to the next lowest observation, rank 3 to the third lowest observation, …, and rank 𝑁 = 𝑛 + 𝑚 to the highest observation. For tied observations, assign the average of the tied ranks.

The test of hypothesis when the sample sizes are small in each group can be summarized as follows:

Null hypothesis: 𝐻0 : 𝑀1 − 𝑀2 = 0

Alternative hypothesis     Decision Rule
𝐻1 : 𝑀1 − 𝑀2 ≠ 0           Reject 𝐻0 if 𝑈 < 𝑤_{𝛼/2} or 𝑈 > 𝑤_{1−𝛼/2}
𝐻1 : 𝑀1 − 𝑀2 > 0           Reject 𝐻0 if 𝑈 > 𝑤_{1−𝛼}
𝐻1 : 𝑀1 − 𝑀2 < 0           Reject 𝐻0 if 𝑈 < 𝑤_{𝛼}

Note: 𝑤_{𝛼} is the critical value of 𝑈 given in Table I of the Appendix, with
𝑤_{1−𝛼/2} = 𝑛(𝑛 + 𝑚 + 1) − 𝑤_{𝛼/2} and 𝑤_{1−𝛼} = 𝑛(𝑛 + 𝑚 + 1) − 𝑤_{𝛼}.
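
The joint ranking used by the Mann-Whitney statistic can be sketched in Python with the male and female scores from the example that follows. Here 𝑈 is taken, as defined in this topic, to be the sum of the ranks of the first group. The helper names are illustrative, and the lower critical value 18 is simply the one quoted in that example's rejection region.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_first_group(first, second):
    """Rank both groups together and sum the ranks of the first group."""
    ranks = average_ranks(list(first) + list(second))
    return sum(ranks[:len(first)])


male = [19, 14, 17, 13, 15]
female = [11, 16, 18, 20, 12]
u = rank_sum_first_group(male, female)

n, m = len(male), len(female)
w_lower = 18                         # lower critical value quoted in the example
w_upper = n * (n + m + 1) - w_lower  # upper value from the note above
print(u, w_lower, w_upper)
```

The computed upper value reproduces the 37 used in the example's rejection region, and the rank sum for the male group lies between the two critical values.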
Male   : 19 14 17 13 15
Female : 11 16 18 20 12

Solution:

1. 𝐻0 : 𝑀1 − 𝑀2 = 0 There is no difference in the population median scores between males and females.
   𝐻1 : 𝑀1 − 𝑀2 ≠ 0 There is a difference in the population median scores between males and females.
2. Level of significance: 𝛼 = 0.05
3. Test statistic: Mann-Whitney test, 𝑈 = ∑_{𝑖=1}^{𝑛} 𝑅(𝑋𝑖)
4. Rejection region: Reject 𝐻0 if 𝑈 < 18 or 𝑈 > 37
5. Computation:

Male   RankM   Female   RankF
19     9       11       1
14     4       16       6
17     7       18       8
13     3       20       10
15     5       12       2

𝑈 = 9 + 4 + 7 + 3 + 5 = 28

6. Statistical Decision: Do not reject 𝐻0 since the computed value of 𝑈 is 28 and is not less than 18 and not greater than 37.

This statistic has two applications that can appear very different, but are really just two variations of the same statistical question. In one application the same quantitative variable is measured at two or more different times from the same sample (or from two or more samples that have been matched on one or more important variables). In the other application, two or more comparable quantitative variables are measured from the same sample (usually at the same time). In both applications, Friedman's test is used to compare the distributions of the two or more quantitative variables.

Thus, it is applied in the same data situation as an ANOVA for dependent samples except that it is used when the data are either from a too-small sample, are importantly non-normally distributed, or the measurement scale of the dependent variable is ordinal (not interval or ratio). It is important to remember the null hypothesis, and to differentiate it from the null for the dependent ANOVA. There are two specific versions of the H0:, depending upon whether one characterizes the k conditions as representing a single population under two or more different circumstances (e.g., comparing treated vs. not treated or comparing different treatments -- some consider this a representation of two or more different populations) or as representing comparable variables measured from a single population (as in the example below). Here are versions of the H0: statement for each of these characterizations.

To reject H0: is to say that the distributions of the variables are different in some way: center, spread and/or shape. When the forms of the distributions are similar (as is often the case -- check the size and symmetry of the IQR), then rejecting the H0: is interpreted to mean that the variables have some pattern of larger and smaller scores (medians) among them.
The data: In this analysis the one variable is the type of animal (fish, reptiles, or mammals), and the response variable is the number of animals on display. From our database, we use three variables: reptnum (number of reptiles on display), fishnum (number of fish on display) and mamlnum (number of mammals on display). These scores are shown for the 12 stores below (reptnum, fishnum, mamlnum).

Research Hypothesis: The data come from the Pet shop database. The researcher hypothesized that stores would tend to display more fish than other types of animals, fewer reptiles, and an intermediate number of mammals.

H0: for this analysis: Pet stores display the same number of reptiles, fish and mammals.

Step 1 Rearrange the data so that scores from each subject are in the appropriate columns, one for each condition.

Step 2 Rank order the scores SEPARATELY FOR EACH SUBJECT'S DATA with the smallest score getting a value of 1. If there are ties (within the scores for a subject), each receives the average rank they would have received.

Step 3 Compute the sum of the ranks for each condition.
For small samples (k < 6 AND N < 14):

Step 7 Determine the critical value of F by looking at the table of critical values for Friedman's test

𝐹(𝑘 = 3, 𝑁 = 12, 𝛼 = .05) = 8.67

Step 8 Compare the obtained F and the critical F values to determine whether to retain or reject the null hypothesis.

-- if the obtained F value (from Step 6) is larger than the critical value of F, then reject H0:

-- if the obtained F value is less than or equal to the critical value of F, then retain H0:

For large samples (k > 5 OR N > 13):

Step 9 Determine the critical value of X² by looking at the table of critical values for the Chi-Square test (𝑑𝑓 = 𝑘 − 1). (As an example, here is how you would apply this version of the significance test to these data.)

Step 10 Compare the obtained F and the critical X² values to determine whether to retain or reject the null hypothesis.

-- if the obtained F value (from Step 6) is less than or equal to the critical value of X², then retain H0:

-- if the obtained F value is larger than the critical value of X², then reject H0:

For the example data, we would decide to reject the null hypothesis, because the obtained value of F (15.526) is greater than the critical X² value (5.99).

Step 11 IF you reject the null hypothesis, determine whether the pattern of the data completely supports, partially supports, or does not support the research hypothesis.

-- IF you reject the null hypothesis, AND if the pattern of data agrees exactly with the research hypothesis, then the research hypothesis is completely supported.

-- IF you reject the null hypothesis, AND if part of the pattern of the data agrees with the research hypothesis, BUT part of the pattern of the data does not, then the research hypothesis is partially supported.

-- IF you retain the null hypothesis, OR you reject the null BUT NO PART of the pattern of the data agrees with the research hypothesis, then the research hypothesis is not at all supported.

By the way: To properly determine whether the hypothesized pattern of differences was found, one should perform pairwise comparisons (using Friedman's test); the report of the results given below makes use of these follow-up tests (although the computations are not shown).

By the way: Usually the researcher hypothesizes that there is a difference between the conditions. Sometimes, however, the research hypothesis is that there is NO difference between the conditions. If so, the research hypothesis and H0: are the same! When this is the case, retaining H0: provides support for the research hypothesis, whereas rejecting H0: provides evidence that the research hypothesis is incorrect.

For the example data, we would decide that the research hypothesis is partially supported, because the null hypothesis was rejected, and because, as hypothesized, there were fewer reptiles displayed than mammals or fish. However, there was not a significant difference between the number of fish and mammals displayed at these stores.

Step 12 Reporting the results

You will want to compute medians and IQR values to help describe the data before reporting the results of the significance test. With multiple-group designs it is often easier to present these data in a table. As for the other statistical tests, the report includes the "wordy" part and the statistical values upon which you based
your statistical decision. If you reject H0:, be sure to describe how the groups differed, rather than just reporting that there was "a difference".

Table 1 summarizes the data for the numbers of animals displayed at the stores. There was a significant difference among the distributions of the three types of animals (based on Friedman's test, X²(2) = 15.526, p = .0003). Pairwise Friedman's tests (p < .05) revealed that, as hypothesized, fewer reptiles were displayed than either fish or mammals. However, contrary to the research hypothesis, there was not a difference in the mean numbers of fish and mammals displayed.

Fr Critical values for Friedman's two-way analysis of variance by ranks
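
The ranking steps of Friedman's test (rank within each subject, then sum the ranks per condition) can be sketched in Python. The 12-store data table is not reproduced here, so the four stores below are hypothetical and will not give the 15.526 reported for the real data. The statistic uses the standard large-sample form 12 / (N k (k + 1)) × (sum of squared rank sums) − 3 N (k + 1), which is assumed here to be the formula behind the obtained F value. The helper names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """rows: one tuple of k condition scores per subject.

    Returns (rank sums per condition, Friedman statistic).
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Rank separately within each subject's scores
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    stat = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return rank_sums, stat


# Hypothetical (reptnum, fishnum, mamlnum) counts for four stores
stores = [(2, 10, 5), (3, 12, 6), (1, 8, 9), (4, 11, 7)]
rank_sums, fr = friedman_statistic(stores)
print(rank_sums, fr)
```

For a large-sample decision the resulting value would be compared with the critical X² value for df = k − 1, as described in the steps for large samples.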
TOPIC 13:
The Kruskal-Wallis test is used to compare the distributions of scores on a quantitative variable obtained from 2 or more groups. Thus, it is applied in the same data situation as an ANOVA for independent samples, except that it is used when the data are either importantly nonnormally distributed, the measurement scale of the dependent variable is ordinal (not interval or ratio), or from a too-small sample. It is important to remember the null hypothesis for this test, and to differentiate it from the nulls for the t-test and the median test.

H0: The populations represented by the k conditions (groups, samples) have the same distribution of scores on the quantitative response variable.

The data: This analysis involves the grouping variable chain (1 = chain store, 2 = privately owned store, 3 = coop owned store) and the response variable fishnum (number of fish on display). Below are the scores for the 12 stores (chain, fishnum).

3,32 3,41 3,31 3,38 1,21 1,13 2,17 1,22 2,24 1,11 2,17 1,20

Research Hypothesis: The researcher hypothesized that Coop stores would have the most fish on display, Chain stores would display the least, and Private pet stores would display an intermediate amount.

H0: for this analysis: The three different types of pet shops have the same distribution of the number of fish displayed.

Step 1 Rearrange the data from lowest to highest score while keeping track of group membership, and assign a rank to each score. If there is a tie, all of the scores that tie receive the average rank of that set of scores.

Step 2 Compute the sum of the ranks for each of the groups.

Step 3 Compute the square of the sum of the ranks for each of the groups.

Step 5 Calculate H using the following formula

Step 6 Determine the critical value of 𝑋². The sampling distribution of 𝐻 approximates that of 𝑋² (more so when there are at least 5 data points in each group), so we can use it as a critical value for the purpose of testing H0:. To find the critical value of 𝑋², use df = number of groups − 1.

𝑋²(𝑑𝑓 = 2, 𝑝 = .05) = 5.99

Step 7 Compare the obtained H value and the critical 𝑋² value to determine whether to retain or reject the null hypothesis.
-- If the obtained H value is smaller than the critical 𝑋² value, then retain H0:

-- If the obtained H value is larger than the critical value of 𝑋², then reject H0:

For the example data, we would decide to reject the null hypothesis, because the obtained value of H (7.52) is greater than the critical 𝑋² value (5.99).

Step 8 IF you reject the null hypothesis, determine whether the data completely support, partially support or do not support the research hypothesis.

▪ IF you reject the null hypothesis AND the pattern of group differences that was hypothesized was found, then the research hypothesis is supported.

▪ IF you reject the null hypothesis, AND if the pattern of group differences agrees with part, but not all, of the research hypothesis, then the research hypothesis is partially supported.

▪ IF you retain the null hypothesis OR you reject the null hypothesis BUT the pattern of group differences was different from what was hypothesized, then the research hypothesis is not supported.

By the way: Usually the researcher hypothesizes that there is a difference between the k conditions. Sometimes, however, the research hypothesis is that there is NO difference between the k conditions. If so, the research hypothesis and H0: are the same! When this is the case, retaining H0: provides support for the research hypothesis, whereas rejecting H0: provides evidence that the research hypothesis is incorrect.

For the example data, we would decide that the research hypothesis is partially supported, because the null hypothesis was rejected, and because, as hypothesized, coop stores had the most fish. However, there was not a difference in the number of fish in the private and chain stores.

Step 9 Reporting the results

You will want to compute medians and IQR values to help describe the data before reporting the results of the significance test. With multiple-group designs it is often easier to present these data in a table. As for the other statistical tests, the report includes the "wordy" part and the statistical values upon which you based your statistical decision. If you reject H0:, be sure to describe how the groups differed, rather than just reporting that there was "a difference".

The number of fish displayed at each type of store is summarized in Table 1. The distributions of the number of fish displayed were significantly different among the three types of stores, using Kruskal-Wallis, 𝑋² = 7.52, 𝑝 < .05. Pairwise comparisons using the Kruskal-Wallis test (𝑝 = .05) revealed that, as hypothesized, Coop stores displayed the most fish. However, contrary to the hypothesis, there was no difference between the number of fish displayed by Chain and Private pet stores.
X² Critical values of Chi-Square
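The tabled critical value used in Step 6 of the Kruskal-Wallis example can be double-checked by hand. The sketch below relies on a property of the chi-square distribution that holds only for df = 2 (three groups): the upper-tail probability is exp(-x / 2), so the critical value is -2 ln(alpha). The function names are illustrative, and other df values would need a table or a statistics package.

```python
import math

def chi2_critical_df2(alpha):
    """Critical chi-square value for df = 2 only: solve exp(-x / 2) = alpha for x."""
    return -2.0 * math.log(alpha)

def chi2_p_value_df2(x):
    """Upper-tail probability P(X2 > x) for df = 2 only."""
    return math.exp(-x / 2.0)

critical = chi2_critical_df2(0.05)      # about 5.99, as in the chi-square table
h_obtained = 7.52                       # obtained H from the worked example
p_value = chi2_p_value_df2(h_obtained)  # approximate p for the obtained H
reject_h0 = h_obtained > critical       # Step 7 decision rule

print(round(critical, 2), round(p_value, 3), reject_h0)
```

Because the obtained H exceeds the critical value, the decision rule in Step 7 leads to rejecting H0:, which matches the conclusion reached in the worked example.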
TOPIC 14:
The data: The variables for this analysis are fishnum (number of fish
displayed) and fishgood (rating of fish quality on a 1-10 scale).
Each pair below is fishnum,fishgood for one store:
32,6 41,5 31,3 38,3 21,7 13,9 17,9 22,8 24,6 11,9 17,7 20,8
H0: for this analysis: There is no rank order relationship between the
number of fish displayed in pet stores and the quality rating of the fish.
Step 1 Rearrange the data so that scores from each subject are in the appropriate
columns, one for each variable.
Step 2 Rank order the scores SEPARATELY FOR EACH VARIABLE with the smallest score getting a value of 1. Cases with the same score each receive the average rank they would have received. For example, there are two scores with values of 17. We wouldn't want to rank them 3 and 4, because it makes no sense to give different ranks to values that are the same! Instead, we will assign the average rank ( [3 + 4] / 2 = 3.5) to both.
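The ranking rule in Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the handout's own procedure: the helper name `average_ranks` is made up here, and the fishnum values are read from the data line on the assumption that the first number of each pair is fishnum.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) upward; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of the ranks the tied scores would occupy
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

fishnum = [32, 41, 31, 38, 21, 13, 17, 22, 24, 11, 17, 20]
fishnum_ranks = average_ranks(fishnum)
print(fishnum_ranks)
```

The two stores with 17 fish would have occupied ranks 3 and 4, so both receive 3.5, as described in Step 2.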
Step 3 For each pair of scores, compute the difference (𝑑) between the ranks, compute the square of this difference (𝑑²) and then find the sum of these squared differences (∑ 𝑑²).
Step 6 Look up the critical value of r for the appropriate sample size.
𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 𝑟 (𝑁 = 12, 𝛼 = .05) = .588
Step 7 Compare the obtained r and critical r values and determine whether to retain or reject the null hypothesis (that there is no rank order relationship between the variables in the population represented by the sample). Remember that correlation values can be positive or negative, and so we will compare the absolute value of the obtained r to the critical r.
▪ If the absolute value of the obtained r is less than the critical r, then retain the null hypothesis and conclude that there is no rank order relationship between the two variables, in the population represented by the sample.
▪ If the absolute value of the obtained r is greater than the critical r, then reject the null hypothesis and conclude that there is a rank order relationship between the variables in the population represented by the sample.
For the example data, we would decide to reject the null hypothesis, because the absolute value of the obtained 𝑟 (.86) is larger than the critical 𝑟 (.588).
Step 8 IF you reject the null hypothesis, determine whether the data support or do not support the research hypothesis.
▪ IF you reject the null hypothesis AND the direction of the rank order relationship ( + or - value of r) agrees with the direction of the research hypothesis, then the research hypothesis is supported.
▪ IF you retain the null hypothesis OR if you reject the null BUT the direction of the rank order relationship ( + or - value of r) disagrees with the direction of the research hypothesis, then the research hypothesis is not supported.
By the way: Usually the researcher hypothesizes that there is a rank order relationship between the variables. Sometimes, however, the research hypothesis is that there is no rank order relationship between the variables. If so, the research hypothesis and H0: are the same! When this is the case, retaining H0: provides support for the research hypothesis, whereas rejecting H0: provides evidence that the research hypothesis is incorrect.
For the example data, we would decide that the research hypothesis is supported, because we rejected the null hypothesis, and the negative obtained r value agrees with the negative rank-order relationship hypothesized by the researcher.
You will want to compute medians and IQR values for both variables to help describe the data before reporting the results of the significance test. As for the other statistical tests, the report includes the "wordy" part and the statistical values upon which you made your statistical decision. Be sure to describe the pattern of the data that led to the positive, no, or negative relationship between the variables.
Spearman's correlation between the number of fish displayed in these stores (𝑀𝑑𝑛 = 21.5, 𝐼𝑄𝑅 = 17 − 31.5) and the quality rating for the fish (𝑀𝑑𝑛 = 7, 𝐼𝑄𝑅 = 5.25 − 8.75) was 𝑟 = −.886 (𝑝 < .05). This result supports the research hypothesis that those stores with fewer fish tended to have healthier fish, whereas those stores with more fish tended to have fish with lower health quality.
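The Spearman steps can be traced end to end on the example data. The sketch below makes two assumptions that are not stated in the text: that each data pair is fishnum,fishgood, and that the hand calculation uses the common shortcut formula r = 1 - 6∑𝑑² / (N(N² - 1)). All function names are illustrative. It also computes the Pearson correlation of the ranks, which is what many statistical packages report when ties are present.

```python
import math

def average_ranks(scores):
    """Rank from smallest (rank 1) upward; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + 1 + end + 1) / 2
        pos = end + 1
    return ranks

def spearman_shortcut(x, y):
    """Return (sum of squared rank differences, r from the shortcut formula)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def pearson_on_ranks(x, y):
    """Pearson correlation of the two rank columns (accounts for ties)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

fishnum = [32, 41, 31, 38, 21, 13, 17, 22, 24, 11, 17, 20]
fishgood = [6, 5, 3, 3, 7, 9, 9, 8, 6, 9, 7, 8]

sum_d2, r_shortcut = spearman_shortcut(fishnum, fishgood)
r_ranks = pearson_on_ranks(fishnum, fishgood)
reject_h0 = abs(r_shortcut) > 0.588  # critical r for N = 12, alpha = .05

print(sum_d2, round(r_shortcut, 3), round(r_ranks, 3), reject_h0)
```

Under these assumptions the shortcut formula gives an r of about -.86 and the rank-based Pearson calculation gives about -.886, so both values quoted for this example can be reproduced. The two differ only because the shortcut formula does not adjust for tied ranks.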
TOPIC 15:
❖ 0 is no relationship,
❖ 1 is a perfect relationship.
A quirk of this test is that it can also produce negative values (i.e. from -1 to 0). Unlike a linear graph, a negative relationship doesn't mean much with ranked columns (other than you perhaps switched the columns around), so just remove the negative sign when you're interpreting Tau.
• Tau-A and Tau-B are usually used for square tables (with equal columns and rows). Tau-B will adjust for tied ranks.
• Tau-C is usually used for rectangular tables. For square tables, Tau-B and Tau-C are essentially the same.
• Most statistical packages have Tau-B built in, but you can use the following formula to calculate it by hand:
Kendall's Tau = (C – D) / (C + D)
Where C is the number of concordant pairs and D is the number of discordant pairs.
Example Problem
Step 1: Make a table of rankings. The first column, "Candidate", is optional and for reference only. The rankings for Interviewer 1 should be in ascending order (from least to greatest).
Step 2: Count the number of concordant pairs, using the second column. Concordant pairs are how many larger ranks are below a certain rank. For example, the first rank in the second interviewer's column is a "1", so all 11 ranks below it are larger.
However, going down the list to the third row (a rank of 4), the rank immediately below (3) is smaller, so it doesn't count for a concordant pair.
When all concordant pairs have been counted, it looks like this:
Step 3: Count the number of discordant pairs and insert them into the next column. The number of discordant pairs is similar to Step 2, only you're looking for smaller ranks, not larger ones.
Step 4: Sum the values in the two columns:
Kendall's Tau = (C – D) / (C + D)
The Tau coefficient is .85, suggesting a strong relationship between the rankings.
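The counting procedure in Steps 2 to 4 can be sketched as follows. The five rankings used here are hypothetical and are not the table from the example problem; the function names are also made up for illustration. The sketch assumes Interviewer 1's rankings are already in ascending order (1, 2, 3, ...) and that there are no tied ranks.

```python
def kendall_counts(second_ranks):
    """Count concordant and discordant pairs going down the second column.

    Assumes the first column is already sorted 1..n, so each rank is compared
    only with the ranks listed below it.
    """
    concordant = discordant = 0
    for i, rank in enumerate(second_ranks):
        below = second_ranks[i + 1:]
        concordant += sum(1 for r in below if r > rank)  # larger ranks below
        discordant += sum(1 for r in below if r < rank)  # smaller ranks below
    return concordant, discordant

def kendall_tau(second_ranks):
    """Kendall's Tau = (C - D) / (C + D)."""
    c, d = kendall_counts(second_ranks)
    return (c - d) / (c + d)

# Hypothetical rankings from Interviewer 2 for five candidates.
interviewer_2 = [1, 2, 4, 3, 5]
c, d = kendall_counts(interviewer_2)
tau = kendall_tau(interviewer_2)
print(c, d, tau)
```

In this made-up table only one pair is out of order (the 4 listed above the 3), so C = 9, D = 1 and Tau = 8 / 10 = 0.8.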