09 Introduction To Nonparametric Methods
NONPARAMETRIC METHODS | 2
DEPARTMENT of STATISTICS
Parametric vs. Nonparametric Statistical Analysis
Advantages of Nonparametric Methods
1. They can be used to test population parameters when the variable is not
normally distributed.
2. They can be used when the data are nominal or ordinal.
3. They can be used to test hypotheses that do not involve population
parameters.
4. In some cases, the computations are easier than those for the parametric
counterparts.
5. They are easy to understand.
6. There are fewer assumptions that have to be met, and the assumptions
are easier to verify.
Disadvantages of Nonparametric Methods
1. They are less sensitive than their parametric counterparts when the assumptions of
the parametric methods are met. Therefore, larger differences are needed before
the null hypothesis can be rejected.
2. They tend to use less information than the parametric tests. For example, the sign
test requires the researcher to determine only whether the data values are above or
below the median, not how much above or below the median each
value is.
3. They are less efficient than their parametric counterparts when the assumptions of
the parametric methods are met. That is, larger sample sizes are needed to
overcome the loss of information. For example, the nonparametric sign test is
about 60% as efficient as its parametric counterpart, the z test. Thus, a sample size
of 100 is needed for the sign test, compared with a sample size of 60 for the z test,
to obtain comparable results (that is, the same power).
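The sample sizes in item 3 can be read as a ratio. The line below is a sketch that assumes efficiency is defined as the ratio of the sample sizes the two tests need for the same power; the symbols n_z and n_sign are introduced here for illustration only.

```latex
\text{efficiency} = \frac{n_{z}}{n_{\text{sign}}} = \frac{60}{100} = 0.60
```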
Assumptions of Nonparametric Statistics
Remarks:
• If the parametric assumptions can be met, the parametric methods are
preferred.
• When parametric assumptions cannot be met, the nonparametric
methods are a valuable tool for analyzing the data.
Selection of statistical tools

Conditions/Purposes | Parametric Test (Normal Distribution) | Nonparametric Test (Non-normal Distribution)
Compare a mean with a standard value | One-sample z-test (if σ is known) or one-sample t-test (if σ is unknown) | Wilcoxon test
Compare two means of unpaired data sets | Two-independent-samples z-test (if σ1 and σ2 are known) or two-independent-samples t-test (if σ1 and σ2 are unknown) | Mann-Whitney test
Compare two means of paired data sets | Paired-sample t-test | Wilcoxon test
Compare >2 means of unmatched data sets | One-way ANOVA | Kruskal-Wallis test
Selection of statistical tools (continued)

Conditions/Purposes | Parametric Test (Normal Distribution) | Nonparametric Test (Non-normal Distribution)
Compare >2 means of matched data sets | Multi-factor ANOVA | Friedman test
3. Normal quantile plot: If the histogram is basically symmetric and there is at most one
outlier, use technology to generate a normal quantile plot. Use the following criteria
to determine whether or not the distribution is normal. (These criteria can be used
loosely for small samples, but they should be used more strictly for large samples.)
Normal Distribution:
The population distribution is normal if the pattern of the points is reasonably close to a straight line
and the points do not show some systematic pattern that is not a straight-line pattern.
Example (Uniform)
The second case shows a histogram of data having a uniform distribution. The
corresponding normal quantile plot suggests that the data are not normally
distributed because the points show a systematic pattern that is not a
straight-line pattern. These sample values are not from a population having a
normal distribution.
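The points of a normal quantile plot can be computed without special software. The sketch below is only an illustration: it assumes the plotting positions (i − 0.5)/n, which is one common convention, and uses a made-up, evenly spread sample rather than data from the slides. The helper names `normal_quantile_points` and `straightness` are hypothetical. The correlation it reports is a crude numeric summary; the plot itself should still be inspected for a systematic curve.

```python
from statistics import NormalDist, mean


def normal_quantile_points(sample):
    """Return (theoretical z score, sorted data value) pairs for a normal quantile plot."""
    n = len(sample)
    ordered = sorted(sample)
    # Assumed plotting positions: (i - 0.5) / n for the i-th smallest value.
    z_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, ordered))


def straightness(points):
    """Pearson correlation of the plotted points; closer to 1 means closer to a line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical uniform-like sample (not from the slides).
uniform_like = [i / 10 for i in range(1, 51)]
points = normal_quantile_points(uniform_like)
r = straightness(points)
print(round(r, 3))
```

Passing `points` to any plotting tool shows the shape directly; for uniform-like data the ends of the pattern bend away from a straight line.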
Spearman Rank Correlation Coefficient
Assumptions for Spearman’s Rank Correlation Coefficient
1. The sample is a random sample.
2. The data consist of two measurements or observations taken on the
same individual.
Formula for Computing the Spearman Rank Correlation Coefficient
\[ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]
where d = difference in ranks
n = number of data pairs
Find the Spearman rank correlation coefficient for the following data, which
represent the number of hospitals and nursing homes in each of seven randomly
selected states. At the 0.05 level of significance, is there enough evidence to
conclude that there is a correlation between the two?

Hospitals   Nursing Homes
107         230
61          134
202         704
133         376
145         431
117         538
108         373
Example 1
Critical Values for the Rank Correlation Coefficient
Example 1
4. Computation
a.) Rank each data set as shown in the table. Let X1 be the hospitals and
X2 be the nursing homes, and let d be the difference between the rank of X1
and the rank of X2.

Hospitals (X1)   Rank of X1   Nursing Homes (X2)   Rank of X2   d    d²
107              2            230                  2            0    0
61               1            134                  1            0    0
202              7            704                  7            0    0
133              5            376                  4            1    1
145              6            431                  5            1    1
117              4            538                  6            -2   4
108              3            373                  3            0    0
                                                                Σd² = 6
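The ranking and the formula for r_s can be checked with a few lines of code. This is a minimal sketch, assuming there are no tied values (true for this example); the function names `ranks` and `spearman_rs` are hypothetical helpers, not part of the slides.

```python
def ranks(values):
    """Rank values from smallest (1) to largest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman rank correlation via r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = [(rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y))]
    return 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))


hospitals = [107, 61, 202, 133, 145, 117, 108]
nursing_homes = [230, 134, 704, 376, 431, 538, 373]

print(ranks(hospitals))       # [2, 1, 7, 5, 6, 4, 3]
print(ranks(nursing_homes))   # [2, 1, 7, 4, 5, 6, 3]
r_s = spearman_rs(hospitals, nursing_homes)
print(round(r_s, 3))          # 0.893
```

With Σd² = 6 and n = 7, the formula gives r_s = 1 − 36/336 ≈ 0.893, which is then compared with the tabled critical value for n = 7 at the 0.05 level.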
Example 1
6. Conclusion
At the 5% level of significance, the sample data support the claim that there is
a correlation between the number of hospitals and the number of nursing homes.
(see next slide for the wording of the final conclusion)
Claim: ρ ≠ 0
Decision: Reject Ho
Example 2
The following data show the final term exam scores in English and Math of 10
students. At the 0.01 level of significance, is there enough evidence to
conclude that there is a correlation between the two variables?

English   Math
56        66
75        70
45        40
71        60
62        65
64        56
58        59
80        77
76        67
61        63
Example 2
Critical Values for the Rank Correlation Coefficient
Claim: There is a correlation between the two variables.
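The computation for this example can be sketched from the English and Math scores. The ranks and the squared rank differences below were worked out here from those scores (no ties occur), listed in the order the students appear; they are not copied from the slides.

```latex
\sum d^2 = 25 + 1 + 0 + 9 + 1 + 16 + 0 + 0 + 1 + 1 = 54

r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}
    = 1 - \frac{6(54)}{10(10^2 - 1)}
    = 1 - \frac{324}{990}
    \approx 0.673
```

This value is then compared with the tabled critical value for n = 10 at the 0.01 level.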
6. Conclusion
At the 1% level of significance, there is not sufficient sample evidence to support
the claim that there is a correlation between the English and Math scores of the
students. (see next slide for the wording of the final conclusion)
Claim: ρ ≠ 0
Decision: Fail to reject Ho
Chi-Squared Test of Independence
• tests the null hypothesis that the row variable and column variable in a
contingency table are not related
Ho: The row variable and column variable are not related
Ha: The row variable and column variable are related
Assumptions:
• The sample data are randomly selected.
• For every cell in the contingency table, the expected frequency is at least 5.
Decision Rule: Reject Ho if $\chi^2_c > \chi^2_{\alpha,\,(r-1)(c-1)}$, where $\chi^2_c$ is the
computed test statistic, r is the number of rows, and c is the number of columns
in the contingency table.
Chi-Squared Test of Independence (slide 27)
Example 1
Based on the table below, is there evidence to suggest that sex is related to
whether a person is left-handed or right-handed? Test at the 0.05 level of
significance.

         Hand Preference
Sex      Left   Right   Total
Female   12     108     120
Male     24     156     180
Total    36     264     300
Example 1
Claim: Sex and hand preference are related.
1. Ho: Sex and hand preference are not related.
Ha: Sex and hand preference are related. (claim)
2. Test: Chi-Squared Test
df = (r − 1)(c − 1) = (2 − 1)(2 − 1) = 1 (see slide 27)
α = 0.05, critical value = 3.841
\[ \chi^2_c = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576 \]
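The expected frequencies and the test statistic can be reproduced from the observed table. The sketch below assumes the usual rule that each expected frequency is (row total × column total) / grand total; the variable names are illustrative only.

```python
observed = [
    [12, 108],   # Female: left, right
    [24, 156],   # Male: left, right
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_square, 4), df)   # 0.7576 1
```

Every expected frequency here is at least 5, so the assumption stated for the test is met.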
Example 1
5. Decision
Since 0.7576 < 3.841 (the critical value), we fail to reject Ho.
6. Conclusion
At the 5% level of significance, there is not sufficient sample evidence to
support the claim that sex and hand preference are related.