Nonparametric Statistics and Model Selection
5.1 Estimating Distributions and Distribution-Free Tests
In Chapter 2, we learned about the t-test and its variations. These were designed to compare
sample means, and relied heavily on assumptions of normality. We were able to apply them to
non-Gaussian populations by using the central limit theorem, but that only really works for
the mean (since the central limit theorem holds for averages of samples). Sometimes, we’re
interested in computing other sample statistics and evaluating their distributions (remember
that all statistics computed from samples are random variables, since they’re functions of the
random samples) so that we can obtain confidence intervals for them. In other situations, we
may not be able to use the central limit theorem due to small sample sizes and/or unusual
distributions.
In this chapter, we’ll focus on techniques that don’t require these assumptions. Such methods
are usually called nonparametric or distribution-free. We’ll first look at some statistical tests,
then move to methods outside the testing framework.
So far, we’ve only used “eyeballing” and visual inspection to see if distributions are similar.
In this section, we’ll look at more quantitative approaches to this problem. Despite this,
don’t forget that visual inspection is usually an excellent place to start!
We’ve seen that it’s important to pay attention to the assumptions inherent in any test.
The methods in this section make fewer assumptions and will help us test whether our
assumptions are accurate.
Figure 5.1: Two Kolmogorov-Smirnov test plots (right column) with histograms of the data
being tested (left column). On the top row, the empirical CDF (green) matches the test
CDF (blue) closely, and the largest difference (dotted vertical red line, near 0.5) is very
small. On the bottom, the empirical CDF is quite different from the test CDF, and the
largest difference is much larger.
Recall that the cumulative distribution function (CDF) of a random variable gives the probability that the variable is less than or equal to some value. To be a bit more precise, it's a function F such that F(a) = P(x ≤ a). When talking about data, it's often useful to look at empirical CDFs: F_n(a) = (1/n) Σ_i I(x_i ≤ a)¹ is the CDF of n observed data points.
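As a quick illustration, here's a minimal sketch (in Python with NumPy; the data are made up) of computing the empirical CDF at a single point:

    import numpy as np

    def empirical_cdf(data, a):
        # F_n(a): the fraction of observed points x_i that are <= a.
        data = np.asarray(data)
        return np.mean(data <= a)

    x = np.random.normal(size=1000)   # a made-up sample
    print(empirical_cdf(x, 0.0))      # roughly 0.5 for standard normal data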
Now suppose we want to compare two CDFs, F_1 and F_2. They might be empirical CDFs (to compare two different datasets and see whether they're significantly different), or one might be a reference CDF (to see whether a particular distribution is an appropriate choice for a dataset). The Kolmogorov-Smirnov test computes the statistic D_n:

    D_n = max_a |F_1(a) − F_2(a)|.

This compares the two CDFs and looks at the point of maximum discrepancy; see Figure 5.1 for an example. We can theoretically show that if F_1 is the empirical distribution of x and F_2 is the true distribution x was drawn from, then lim_{n→∞} D_n = 0. Similarly, if the two distributions have no overlap at all, the maximum difference will be 1 (when one CDF is 1 and the other is 0). Therefore, we can test distribution equality by comparing the statistic D_n to 0 (if D_n is significantly larger than 0 and close to 1, then we might conclude that the distributions are not equal).
Notice that this method is only defined for one-dimensional random variables: although there are extensions to multiple random variables, they are more complex than simply comparing joint CDFs.

¹Remember that I is a function that returns 1 when its argument is true and 0 when its argument is false.
Also notice that this test is sensitive to any difference at all between two distributions: two distributions with the same mean but significantly different shapes will produce a large value of D_n.
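In practice, this test is available in standard libraries. Here's a rough sketch using SciPy's kstest (one sample against a reference CDF) and ks_2samp (two samples); the data are made up for illustration:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)                    # sample to test
    y = rng.uniform(low=-3, high=3, size=200)   # a differently shaped sample

    # One-sample test: compare x's empirical CDF against the standard normal CDF.
    res1 = stats.kstest(x, 'norm')
    print(res1.statistic, res1.pvalue)   # small D, large p: consistent with normality

    # Two-sample test: compare the empirical CDFs of x and y.
    res2 = stats.ks_2samp(x, y)
    print(res2.statistic, res2.pvalue)   # larger D, small p: the samples look different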
The Wilcoxon signed-rank test applies to matched pairs of observations (for example, before-and-after measurements on the same subjects)² and tests whether the median difference between the pairs is zero. Here's how it works:

(1) For each pair i, compute the difference, and keep its absolute value d_i and its sign S_i (where S_i ∈ {−1, 0, +1}). We'll exclude pairs with S_i = 0.

(2) Sort the absolute values d_i from smallest to largest, and rank them accordingly. Let R_i be the rank of pair i (for example, if the fifth pair had the third smallest absolute difference, then R_5 = 3).

(3) Compute the test statistic W = Σ_i S_i R_i, the sum of the signed ranks.

²For unmatched pairs, we can use the Mann-Whitney U test, described in the next section.
W has a known distribution. In fact, if N is greater than about 10, it’s approximately
normally distributed (if not, it still has a known form). So, we can evaluate the probability
of observing it under a null hypothesis and thereby obtain a significance level.
Intuitively, if the median difference is 0, then half the signs should be positive and half should be negative, and the signs shouldn't be related to the ranks. If the median difference is nonzero, W will be large in magnitude (the sum will be either a large negative value or a large positive value). Notice that once we constructed the rankings and defined R_i, we never used the actual differences!
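As a sketch of how this looks in practice, SciPy's wilcoxon function runs this test on paired data (it reports a closely related rank-sum statistic rather than the signed sum above, but the resulting test is equivalent); the before/after numbers here are made up:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    before = rng.normal(loc=5.0, scale=1.0, size=30)
    after = before + rng.normal(loc=0.5, scale=1.0, size=30)   # paired measurements

    # Signed-rank test on the paired differences (before - after).
    res = stats.wilcoxon(before, after)
    print(res.statistic, res.pvalue)   # a small p-value suggests a nonzero median difference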
For unmatched samples (i.e., two independent groups), the Mann-Whitney U test³ asks whether values in one group tend to be larger than values in the other. It works as follows:

(1) Combine all data points and rank them (largest to smallest or smallest to largest).

(2) Add up the ranks for data points in the first group; call this R_1. Find the number of points in the group; call it n_1. Compute U_1 = R_1 − n_1(n_1 + 1)/2. Compute U_2 similarly for the second group.

(3) Take the test statistic U to be the smaller of U_1 and U_2.
As with W from the Wilcoxon test, U has a known distribution. If n_1 and n_2 are reasonably large, it's approximately normally distributed with mean n_1 n_2 / 2 under the null hypothesis. If the two medians are very different, U will be close to 0, and if they're similar, U will be close to n_1 n_2 / 2. Intuitively, here's why:

• If the values in the first sample were all bigger than the values in the second sample, then R_1 = n_1(n_1 + 1)/2⁴: this is the smallest possible value for R_1. U_1 would then be 0.

• If the ranks between the two groups aren't very different, then U_1 will be close to U_2. With a little algebra, you can show that the sum U_1 + U_2 will always be n_1 n_2. If they're both about the same, then they'll both be near half this value, or n_1 n_2 / 2.
³Under some reasonable assumptions about the distributions of the data (see the Mann-Whitney U article on Wikipedia for more details), this test can be used with a null hypothesis of equal medians and a corresponding alternative hypothesis of a significant difference in medians.

⁴R_1 = n_1(n_1 + 1)/2 because in this case, the ranks for all the values from the first dataset would be 1 through n_1, and the sum of these values is n_1(n_1 + 1)/2.
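Here's a corresponding sketch with SciPy's mannwhitneyu (different SciPy versions use slightly different conventions for which U they report, so rely on the p-value); again the data are made up:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    group1 = rng.normal(loc=0.0, size=40)
    group2 = rng.normal(loc=1.0, size=55)   # unmatched groups of different sizes

    res = stats.mannwhitneyu(group1, group2, alternative='two-sided')
    print(res.statistic, res.pvalue)   # small p: values in one group tend to be larger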
All the approaches we've described involve computing a test statistic from data and measuring how unlikely our data are based on the distribution of that statistic. If we don't know enough about the distribution of our test statistic, we can use the data to tell us about that distribution: this is exactly what resampling-based methods do. Permutation tests "sample" different relabelings of the data in order to give us a sense for how significant the true labeling's result is. The bootstrap creates "new" datasets by resampling several times from the data itself, and treats those as separate samples. The next example illustrates a real-world situation where these methods are useful.
Example: Chicago teaching scandal
In 2002, economists Steven Levitt and Brian Jacob investigated cheating in Chicago public schools, but not in the way you might think: they decided to investigate cheating by teachers, which usually took the form of changing student answers after the students had taken standardized tests.ᵃ
So, how’d they do it? Using statistics! They went through test scores from thousands of classrooms in
Chicago schools, and for each classroom, computed two measures:
(1) How unexpected is that classroom’s performance? This was computed by looking at every student’s
performance the year before and the year after. If many students had an unusually high score one
year that wasn’t sustained the following year, then cheating was likely.
(2) How suspicious are the answer sheets? This was computed by looking at how similar the A-B-C-D
patterns on different students’ answer sheets were.
Unfortunately, computing measures like performance and answer sheet similarity is tricky, and results
in quantities that don’t have well-defined distributions! As a result, it isn’t easy to determine a null
distribution for these quantities, but we still want to evaluate how unexpected or suspicious they are.
To solve this problem, Levitt and Jacob used two nonparametric methods to determine appropriate null
distributions as a way of justifying these measures. In particular:
• They assume (reasonably) that most classrooms have teachers who don’t cheat, so by looking at
the 50th to 75th percentiles of both measures above, they can obtain a null distribution for the
correlation between the two.
• In order to test whether the effects they observed are because of cheating teachers, they randomly
re-assign all the students to new, hypothetical classrooms and repeat their analysis. As a type of
permutation test, this allows them to establish a baseline level for these measures by which they
can evaluate the values they observed.
While neither of these methods is exactly like what we’ll discuss here, they’re both examples of a key
idea in nonparametric statistics: using the data to generate a null hypothesis rather than assuming any
kind of distribution.
What’d they find? 3.4% of classrooms had teachers who cheated on at least one standardized test when
the two measures above were thresholded at the 95th percentile. They also used regression with a variety
of classroom demographics to determine that academically poorer classrooms were more likely to have
cheating teachers, and that policies that put more weight on test scores correlated with increased teacher
cheating.
ᵃSee Jacob and Levitt, "Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating." For more economic statistics, see Steven Levitt's book with Stephen Dubner, Freakonomics.
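To make the permutation idea concrete, here's a minimal sketch (plain NumPy, made-up data) that repeatedly relabels which points belong to which group in order to build a null distribution for a difference in means:

    import numpy as np

    rng = np.random.default_rng(3)
    group_a = rng.normal(loc=0.0, size=30)
    group_b = rng.normal(loc=0.8, size=30)

    observed = group_a.mean() - group_b.mean()
    pooled = np.concatenate([group_a, group_b])

    n_perm = 10000
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)                 # random relabeling of the data
        diff = perm[:30].mean() - perm[30:].mean()     # statistic under that relabeling
        if abs(diff) >= abs(observed):
            count += 1

    p_value = (count + 1) / (n_perm + 1)   # how extreme is the true labeling?
    print(observed, p_value)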
5.2.2 Bootstrap
Suppose we have some complicated statistic y that we computed from our data x. If we want to provide a confidence interval for this statistic, we need to know its variance. When our statistic was simply x̄, we could compute the statistic's standard deviation (i.e., the standard error of that statistic) from our estimated standard deviation using s_x/√n. But for more complicated statistics, where we don't know the distributions, how do we provide a confidence interval?
Figure 5.2: An illustration of bootstrap sampling. The top figure shows the true distribution that
our data points are drawn from, and the second figure shows a histogram of the particular data
points we observed (N = 50). The bottom row shows various bootstrap resamplings of our data
(with n = N = 50). Even though they were obtained from our data, they can be thought of as
samples from the true distribution (top).
One approach is a method called bootstrap. The key idea here is that we can resample
points from our data, compute a statistic, and repeat several times to look at the variance
across different resamplings.
Recall that our original data (N points) are randomly generated from some true distribution.
If we randomly sample n points (n ≤ N, and often n = N) from our data with replacement⁵,
these points will also be random samples from our true distribution, as shown in Figure 5.2.
So, we can compute our statistic over this smaller random sample and repeat many times,
measuring the variance of the statistic across the different sample runs.
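For concreteness, here's a minimal sketch of bootstrapping the standard error and a rough confidence interval for a sample median (plain NumPy; the data and choice of statistic are made up):

    import numpy as np

    rng = np.random.default_rng(4)
    data = rng.exponential(scale=2.0, size=50)   # N = 50 observed points

    n_boot = 5000
    medians = np.empty(n_boot)
    for b in range(n_boot):
        # Resample n = N points from the data *with* replacement.
        resample = rng.choice(data, size=data.size, replace=True)
        medians[b] = np.median(resample)

    print(np.median(data))                        # the statistic on the original data
    print(medians.std(ddof=1))                    # bootstrap estimate of its standard error
    print(np.percentile(medians, [2.5, 97.5]))    # a simple 95% percentile interval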
Everything we’ve talked about so far has been based on the idea of trying to approximating
the true distribution of observed data with samples. Bootstrap takes this a step further and
samples from the samples to generate more data.
A related method, known as the jackknife, uses a similar process, but looks at N − 1 points taken without replacement each time instead of n points with replacement. Put more simply, we remove one point at a time and test the model. Notice that we've seen a similar idea before: our initial definition of Cook's distance was based on the idea of removing one point at a time. In practice, the bootstrap is more widely used than the jackknife; the jackknife also has very different theoretical properties.
⁵This means that a single data point can be sampled more than once.
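A corresponding jackknife sketch (leaving out one point at a time; here the statistic is just the mean, so the answer should match the usual s_x/√n formula as a sanity check):

    import numpy as np

    rng = np.random.default_rng(5)
    data = rng.exponential(scale=2.0, size=50)
    n = data.size

    # Statistic computed on each leave-one-out subset of N - 1 points.
    loo_means = np.array([np.delete(data, i).mean() for i in range(n)])

    # Jackknife estimate of the statistic's standard error.
    se_jack = np.sqrt((n - 1) / n * np.sum((loo_means - loo_means.mean()) ** 2))
    print(se_jack)
    print(data.std(ddof=1) / np.sqrt(n))   # identical for the mean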
When fitting models (such as regression) to real data, we’ll often have a choice to make for
model complexity: a more complex model might fit the data better but be harder to interpret,
while a simpler model might be more interpretable but produce a larger error. We’ve seen
this before when looking at polynomial regression models and LASSO in Chapters 3 and 4.
In this section, we’ll learn how to pick the “best” model for some data among several choices.
To do this, we'll typically split our data into three sets:

• The training set is what we'll use to fit the model for each possible value of any manually set parameters,

• the validation set is what we'll use to choose the values of those manually set parameters,

• and the test set is what we'll use to evaluate our results for reporting, and to get a sense for how well our model will do on new data in the real world.
It’s critically important to properly separate the test set from the training and validation
sets! At this point you may be wondering: why do we need separate test and validation
sets? The answer is that we choose a model based on its validation set performance: if we
really want to see generalization error, we need to see how it does on some new data, not
data that we used to pick it.
A good analogy is to think of model fitting and parameter determination as a student learning
and taking a practice exam respectively, and model evaluation as that student taking an
actual exam. Using the test data in any way during the training or validation process is like
giving the student an early copy of the exam: it’s cheating!
Figure 5.3 illustrates a general trend we usually see in this setup: as we increase model complexity, the training error (i.e., the error of the model on the training set) will go down, while the validation error will hit a "sweet spot" and then start increasing due to overfitting.
For example, if we’re using LASSO (linear regression with sum-of-absolute-value penalty)
as described in Chapter4, we need to choose our regularization parameter λ. Recall that λ
controls model sparsity/complexity: small values of λ lead to complex models, while large
values lead to simpler models. One approach is:
(a) Choose several possibilities for λ and, for each one, compute coefficients using the training
set.
Figure 5.3: Training and validation error from fitting a polynomial to data. The data were generated
from a fourth-order polynomial. The validation error is smallest at this level, while the training
error continues to decrease as more complex models overfit the training data.
(b) Then, look at how well each one does on the validation set. Separating training and
validation helps guard against overfitting: if a model is overfit to the training data, then
it probably won’t do very well on the validation data.
(c) Once we’ve determined the best value for λ (i.e., the one that achieves minimum error in
step (b)), we can fit the model on all the training and validation data, and then see how
well it does on the test data. The test data, which the model/parameters have never
seen before, should give a measure of how well the model will do on arbitrary new data
that it sees.
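Here is a rough sketch of steps (a)-(c) using scikit-learn's Lasso (which calls the regularization parameter alpha); the data, split sizes, and candidate values of λ are all made up:

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(6)
    X = rng.normal(size=(200, 10))
    y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

    # A made-up split: 120 training, 40 validation, 40 test points.
    X_train, y_train = X[:120], y[:120]
    X_val, y_val = X[120:160], y[120:160]
    X_test, y_test = X[160:], y[160:]

    lambdas = [0.001, 0.01, 0.1, 1.0]
    val_errors = []
    for lam in lambdas:
        model = Lasso(alpha=lam).fit(X_train, y_train)                        # (a) fit on training set
        val_errors.append(mean_squared_error(y_val, model.predict(X_val)))    # (b) validate

    best = lambdas[int(np.argmin(val_errors))]

    # (c) refit on training + validation data, then evaluate once on the held-out test set.
    final = Lasso(alpha=best).fit(X[:160], y[:160])
    print(best, mean_squared_error(y_test, final.predict(X_test)))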
The procedure described above completely separates our fitting and evaluation processes,
but it does so at the cost of preventing us from using much of the data. Recall from last
week that using more data for training typically decreases the variance of our estimates, and
helps us get more accurate results. We also need to have enough data for validation, since
using too little will leave us vulnerable to overfitting.
One widely used solution that lets us use more data for training is cross-validation. Here’s
how it works:
(1) First, divide the non-test data into K uniformly sized blocks, often referred to as folds.
This gives us K training-validation pairs: in each pair, the training set consists of K − 1
blocks, and the validation set is the remaining block.
(2) For each training/validation pair, repeat steps (a) and (b) above: this gives us K different
errors for each value of λ. We can average these together to get an average error for each
λ, which we’ll then use to select a model.
(3) Repeat step (c) above to obtain the test error as an evaluation of the model.
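A sketch of the cross-validated version of steps (1) and (2), using scikit-learn's KFold on the non-test data (again with made-up data and candidate values of λ):

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(7)
    X = rng.normal(size=(160, 10))    # non-test data only
    y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=160)

    lambdas = [0.001, 0.01, 0.1, 1.0]
    kf = KFold(n_splits=5, shuffle=True, random_state=0)   # K = 5 folds

    avg_errors = []
    for lam in lambdas:
        fold_errors = []
        for train_idx, val_idx in kf.split(X):
            model = Lasso(alpha=lam).fit(X[train_idx], y[train_idx])
            fold_errors.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))
        avg_errors.append(np.mean(fold_errors))   # average error across the K folds

    best = lambdas[int(np.argmin(avg_errors))]
    print(best)   # refit with this value on all non-test data, then evaluate on the test set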
Although these examples were described with respect to LASSO and the parameter λ, the
procedures are much more general: we could have easily replaced “value for λ” above with
a different measure of model complexity.
Also note that we could use a bootstrap-like approach here too: instead of deterministically
dividing our dataset into K parts, we could have randomly subsampled the non-test data K
different times and applied the same procedure.
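One way to do this (a sketch, using scikit-learn's ShuffleSplit in place of KFold) is to draw K random training/validation splits:

    import numpy as np
    from sklearn.model_selection import ShuffleSplit

    X = np.arange(40).reshape(20, 2)   # placeholder non-test data

    # K = 5 random splits, each holding out 20% of the points for validation.
    ss = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
    for train_idx, val_idx in ss.split(X):
        print(len(train_idx), len(val_idx))   # fit and validate here, as in the K-fold loop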
(Figure: the probability assigned to each possible value by two hypotheses, H1 and H2, over the range 1 to 4.)
If we observe the number 2, the likelihood of this observation under H1 is 0.5, while the likelihood under H2 is 0.25. Therefore, by choosing the model that places the highest probability on our observation, we're also choosing the simpler model.
Intuitively, the more values a model allows for, the more it has to spread out its probability, and the less it can place on each particular value.