Chapter 12 Statistical Analysis Tools
A chi-square probability (p-value) of 0.05 is the criterion used to
accept or reject the test of difference between the empirical and
theoretical distributions: a probability of 0.05 or less indicates a
significant difference between the two.
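As a minimal sketch of this decision rule, the snippet below computes the chi-square statistic for a set of binned counts and compares it with the tabulated critical value. Comparing the statistic with the 0.05 critical value is equivalent to comparing the chi-square probability with 0.05. The counts are hypothetical, and the 3 degrees of freedom assume 4 bins with no parameters estimated from the data.

```python
# Sketch of a chi-square goodness-of-fit decision (hypothetical data).

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in 4 bins; both lists sum to 100.
observed = [18, 30, 32, 20]
expected = [25, 25, 25, 25]

chi2 = chi_square_statistic(observed, expected)

# Tabulated chi-square critical value for 3 degrees of freedom at the
# 0.05 level. A statistic above it corresponds to a probability below 0.05.
CRITICAL_0_05_DF3 = 7.815

significant_difference = chi2 > CRITICAL_0_05_DF3
print(f"chi2 = {chi2:.2f}, significant difference: {significant_difference}")
```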
Chi-Square test: General Algorithm
http://en.wikipedia.org/wiki/Inverse-chi-square_distribution
(KS-TEST) KOLMOGOROV – SMIRNOV TEST
FOR CONTINUOUS MODELS
In statistics, the Kolmogorov–Smirnov test (K–S test) quantifies a
distance between the empirical distribution function of the sample
and the cumulative distribution function of the expected
distribution, or between the empirical distribution functions of two
samples.
n = number of elements in the sample
$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)$
where $I(X_i \le x)$ is 1 if $X_i \le x$ and 0 otherwise.
$D_n = \sup_x |F_n(x) - F(x)|$
Kolmogorov–Smirnov Statistic
• The Kolmogorov-Smirnov test statistic for a given function F(x) is
$D_n = \sup_x |F_n(x) - F(x)|$
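The statistic can be computed directly from a sorted sample, because the empirical distribution function only jumps at the data points. The sketch below is one way to do it, assuming F is continuous. The function name `ks_statistic` and the three-point sample are illustrative, not part of the original material.

```python
def ks_statistic(sample, cdf):
    """D_n = sup_x |F_n(x) - F(x)| for a continuous theoretical CDF.

    F_n is a step function, so the largest gap occurs at a data point,
    either just before the jump (i / n) or just after it ((i + 1) / n).
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


# Hypothetical sample compared against the uniform distribution.
d_n = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(f"D_n = {d_n:.3f}")
```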
Facts
• By the Glivenko-Cantelli theorem, if the sample comes from a
distribution F(x), then Dn converges to 0 almost surely.
• In other words, if X1, X2, …, Xn really come from the distribution
with CDF F(x), the distance Dn should be small.
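A small simulation can illustrate this convergence. The sketch below draws samples of increasing size from the uniform distribution on [0, 1], for which F(x) = x, and prints the resulting distance for each size. The helper `ks_uniform`, the seed and the sample sizes are arbitrary choices made for illustration.

```python
import random


def ks_uniform(sample):
    """D_n against the uniform(0, 1) distribution, whose CDF is F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    # D_n should tend to shrink as n grows.
    print(n, round(ks_uniform(sample), 4))
```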
Example
[Figure: Dmax, the largest distance between the empirical and theoretical distribution functions]
Example: Grade Distribution?
• We would like to know the distribution of the
Grades of students.
– First, determine the empirical distribution
– Second, compare to Normal and Poisson
distributions
• Data sample: 50 grades in a course, from which the
empirical distribution was computed
– Mean = 63
– Standard Deviation = 15
Example: Grade Distribution?
$D_n = \sup_x |F_n(x) - F(x)|$
Frequency(X ≤ grade) = Number of grades ≤ grade
Empirical Distribution = $F(X \le \text{grade}) = p(X \le \text{grade}) = \frac{\text{Frequency}(X \le \text{grade})}{\text{Sample Size}}$
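The empirical distribution can be tabulated by counting, for each distinct grade, how many grades in the sample are less than or equal to it and dividing by the sample size. The sketch below shows one way to do this. The five grades are made up for illustration and are not the 50 grades of the example.

```python
from collections import Counter


def empirical_cdf(grades):
    """Map each distinct grade g to (number of grades <= g) / sample size."""
    n = len(grades)
    counts = Counter(grades)
    cdf = {}
    running = 0  # cumulative frequency so far
    for g in sorted(counts):
        running += counts[g]
        cdf[g] = running / n
    return cdf


grades = [55, 60, 60, 70, 85]  # hypothetical sample
F = empirical_cdf(grades)
for g, p in F.items():
    print(g, p)
```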
Example: Grade Distribution?
Dmax,Poisson= 0.153
Dmax,Normal= 0.119
$D_n = \sup_x |F_n(x) - F(x)|$
Kolmogorov–Smirnov Acceptance Criteria
So, the test is accepted if $D_n \le \frac{t_\alpha}{\sqrt{n}}$,
where $t_\alpha$ is the critical value at significance level $\alpha$.
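As a rough check, the criterion can be applied to the two distances from the grade example (n = 50), reading it as "accept if D_n is at most t_alpha divided by the square root of n". The sketch assumes the commonly tabulated large-sample critical value t_alpha = 1.358 for alpha = 0.05. Exact small-sample tables give slightly different thresholds, and `ks_accept` is an illustrative name.

```python
import math


def ks_accept(d_n, n, t_alpha=1.358):
    """Accept the fit if D_n <= t_alpha / sqrt(n).

    t_alpha = 1.358 is the asymptotic critical value for alpha = 0.05
    (an assumption of this sketch, not stated in the source).
    """
    return d_n <= t_alpha / math.sqrt(n)


n = 50
threshold = 1.358 / math.sqrt(n)
print(f"threshold = {threshold:.3f}")
for name, d_max in (("Poisson", 0.153), ("Normal", 0.119)):
    print(name, d_max, "accepted" if ks_accept(d_max, n) else "rejected")
```

Under these assumptions both distances fall below the threshold, and the Normal distribution, with the smaller Dmax, is the closer fit.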
Kolmogorov–Smirnov test
$S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$
Best-fitting line: $\hat{y} = a + bx$, where
$b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$
Example
The table shows the math achievement test scores for a
random sample of n = 10 college freshmen, along with
their final calculus grades.
Student             1   2   3   4   5   6   7   8   9  10
Math test, x       39  43  21  64  57  47  28  75  34  52
Calculus grade, y  65  78  52  82  92  89  73  98  56  75
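The best-fitting line for this table can be computed from the sums of squares and cross-products. The sketch below uses the data above and takes S_xx to be the sum of x squared minus (sum of x) squared over n, the usual counterpart of S_xy. The variable names are illustrative.

```python
# Least-squares line y_hat = a + b*x for the math test / calculus grade data.
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]  # math test scores
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]  # calculus grades
n = len(x)

# S_xy = sum(xy) - (sum x)(sum y) / n
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
# S_xx = sum(x^2) - (sum x)^2 / n
s_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

b = s_xy / s_xx                  # slope
a = sum(y) / n - b * sum(x) / n  # intercept: y_bar - b * x_bar

print(f"S_xy = {s_xy}, S_xx = {s_xx}")
print(f"y_hat = {a:.2f} + {b:.4f} x")
```

For this data the sketch gives S_xy = 1894 and S_xx = 2474, so the fitted line is approximately y_hat = 40.78 + 0.77x.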
Player       1    2    3    4    5    6    7    8    9   10
Height, x   73   71   75   72   72   75   67   69   71   69
Weight, y  185  175  200  210  190  195  150  170  180  175
[Figure: scatter plot of Weight (150 to 210) against Height (66 to 75)]
r = .8261: strong positive correlation. As the player's height
increases, so does his weight.
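The correlation coefficient for the height and weight data can be reproduced with the same sums of squares and cross-products. The sketch below assumes the usual definition r = S_xy / sqrt(S_xx * S_yy). The variable names are illustrative.

```python
import math

# Height / weight data for the 10 players.
height = [73, 71, 75, 72, 72, 75, 67, 69, 71, 69]
weight = [185, 175, 200, 210, 190, 195, 150, 170, 180, 175]
n = len(height)

s_xy = sum(h * w for h, w in zip(height, weight)) - sum(height) * sum(weight) / n
s_xx = sum(h ** 2 for h in height) - sum(height) ** 2 / n
s_yy = sum(w ** 2 for w in weight) - sum(weight) ** 2 / n

# Pearson correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.4f}")
```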
Some Correlation Patterns
r = 0; No correlation
r = .931; Strong positive correlation
r = 1; Linear relationship
r = -.67; Weaker negative correlation