
SPC 2308 Modeling and Simulation

Chapter 10: Statistical Analysis Tools
Goals of Today

• Know how to compare between two distributions
• Know how to evaluate the relationship between two random variables
Outline
• Comparing Distributions: Tests for Goodness-of-Fit
– Chi-Square Distribution (for discrete models: PMF)
– Kolmogorov-Smirnov Test (for continuous models:
CDF)
• Evaluating the relationship
– Linear Regression
– Correlation
GOODNESS-OF-FIT
Statistical tests enable us to compare two distributions; this comparison is
known as testing goodness-of-fit.

• The goodness-of-fit of a statistical model describes how well it fits a set
of observations.
• Measures of goodness of fit typically summarize the discrepancy between
observed values and the values expected under the model in question.
PEARSON'S χ²-TEST
CHI-SQUARE TESTS FOR DISCRETE MODELS

• Pearson's chi-square test compares the probability mass functions of two
distributions.
• If the difference value (error) is greater than the critical value, the two
distributions are said to be different: the first distribution does not fit
the second distribution (well).
• If the difference is smaller than the critical value, the first distribution
fits the second distribution well.
(Pearson's) Chi-Square test
• Pearson's chi-square is used to assess two types of comparison:
– tests of goodness of fit: establishes whether or not an observed frequency
distribution differs from a theoretical distribution.
– tests of independence: assesses whether paired observations on two
variables are independent of each other.
• For example, whether people from different regions differ in the frequency
with which they report that they support a political candidate.
• If the chi-square p-value is less than or equal to 0.05, we reject the
hypothesis that
– the two distributions are equal (goodness-of-fit), or that
– the row variable is unrelated (that is, only randomly related) to the
column variable (test of independence).
Chi-Square Distribution
http://en.wikipedia.org/wiki/Chi-square_distribution
(Pearson's) Chi-Square test
• The chi-square test, in general, can be used to check whether an empirical
distribution follows a specific theoretical distribution.
• Chi-square is calculated by finding the difference between each observed (O)
and theoretical or expected (E) frequency for each possible outcome, squaring
it, dividing by the expected frequency, and summing the results.
• For n possible outcomes (observations), the chi-square statistic is defined
as:

\[ \chi^2_{n-1} = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \]

where
– O_i = the observed frequency for outcome i;
– E_i = the expected (theoretical) frequency for outcome i;
– n = the number of possible outcomes of each event.
(Pearson's) Chi-Square test

A chi-square probability of 0.05 or less is the usual criterion for deciding
whether the empirical and theoretical distributions differ.
Chi-Square test: General Algorithm
http://en.wikipedia.org/wiki/Inverse-chi-square_distribution

We say that the observed (empirical) distribution fits the expected
(theoretical) distribution well if:

\[ \chi^2_0 \le \chi^{2*}_{\alpha,\,k-1-c} \]

which means

\[ P\left( \chi^2_0 \le \chi^{2*}_{\alpha,\,k-1-c} \right) = 1 - \alpha \]

where the critical value is

\[ \chi^{2*}_{\alpha,\,k-1-c} = \mathrm{idfChiSquare}(k-1-c,\ \alpha) \]

• (k − 1 − c) is the number of degrees of freedom, where k is the number of
possible outcomes and c is the number of estimated parameters
• 1 − α is the confidence level (typically, we use α = 0.05)
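The algorithm above can be sketched in a few lines of Python. The observed
counts below are hypothetical (100 throws of a supposedly fair 10-sided die),
and the critical value χ² for α = 0.05 with 9 degrees of freedom, 16.919, is
taken from standard chi-square tables.

```python
def chi_square_stat(observed, expected):
    """Chi-square statistic: sum of (O_i - E_i)^2 / E_i over all outcomes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 100 throws of a supposedly fair 10-sided die.
observed = [12, 9, 11, 8, 10, 13, 7, 10, 9, 11]
expected = [10.0] * 10          # uniform model: 100 throws / 10 outcomes

stat = chi_square_stat(observed, expected)

# Degrees of freedom: k - 1 - c = 10 - 1 - 0 = 9 (no estimated parameters).
# Critical value for alpha = 0.05, df = 9 is 16.919 (standard table).
fits = stat <= 16.919           # True: the uniform model is not rejected
```

With real data the critical value would be looked up in a table (or computed,
e.g. with scipy.stats.chi2.ppf) for the appropriate degrees of freedom.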
Chi-Square test: Example
Uniform distribution in [0 .. 9]: the fit PASSES the test.
(KS-TEST) KOLMOGOROV–SMIRNOV TEST FOR CONTINUOUS MODELS

• In statistics, the Kolmogorov–Smirnov test (K–S test) quantifies a distance
between the empirical distribution function of the sample and the cumulative
distribution function of the expected distribution, or between the empirical
distribution functions of two samples.
• It can be used for both continuous and discrete models.
• Basic idea: compute the maximum distance between the two cumulative
distribution functions and compare it to a critical value.
– If the maximum distance is smaller than the critical value, the first
distribution fits the second distribution.
– If the maximum distance is greater than the critical value, the first
distribution does not fit the second distribution.
Kolmogorov–Smirnov test
• In statistics, the Kolmogorov–Smirnov test is used to determine whether two
one-dimensional probability distributions differ, or whether a probability
distribution differs from a hypothesized distribution, in either case based
on finite samples.
• The Kolmogorov–Smirnov test statistic measures the largest vertical distance
between an empirical CDF calculated from a data set and a theoretical CDF.
• The one-sample KS-test compares the empirical distribution function with a
cumulative distribution function.
• The main applications are testing goodness-of-fit with the normal and
uniform distributions.
Kolmogorov–Smirnov Statistic
• Let X1, X2, …, Xn be iid random variables with CDF equal to F(x).
• The empirical distribution function Fn(x) based on the sample
X1, X2, …, Xn is a step function defined by:

\[ F_n(x) = \frac{\text{number of elements in the sample} \le x}{n} = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x) \]

where I(A) is the indicator of event A: I(X_i ≤ x) = 1 if X_i ≤ x, and 0
otherwise.

• The Kolmogorov–Smirnov test statistic for a given CDF F(x) is

\[ D_n = \sup_x \left| F_n(x) - F(x) \right| \]
Kolmogorov–Smirnov Statistic
• The Kolmogorov–Smirnov test statistic for a given CDF F(x) is

\[ D_n = \sup_x \left| F_n(x) - F(x) \right| \]

Facts
• By the Glivenko–Cantelli theorem, if the sample comes from a distribution
F(x), then Dn converges to 0 almost surely.
• In other words, if X1, X2, …, Xn really come from the distribution with CDF
F(x), the distance Dn should be small.
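The definitions above translate directly into code. Since Fn is a step
function, the supremum is attained at the sample points, so it is enough to
check Fn just before and just after each order statistic. This is a minimal
sketch with a made-up sample tested against the Uniform(0, 1) CDF F(x) = x.

```python
def ks_statistic(sample, cdf):
    """D_n = sup_x |F_n(x) - F(x)| for a sample against a theoretical CDF.

    For the step function F_n, the supremum is attained at a sample point,
    so we evaluate F_n on both sides of each order statistic.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d

# Made-up sample checked against the Uniform(0, 1) CDF F(x) = x:
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
d = ks_statistic([0.1, 0.3, 0.5, 0.7, 0.9], uniform_cdf)   # -> 0.1
```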
Example
[Figure: empirical CDF vs. theoretical CDF, with the maximum vertical
distance Dmax marked.]
Example: Grade Distribution?
• We would like to know the distribution of students' grades.
– First, determine the empirical distribution.
– Second, compare it to the Normal and Poisson distributions.
• Data sample: 50 grades in a course, from which the empirical distribution
was computed:
– Mean = 63
– Standard Deviation = 15
Example: Grade Distribution?

\[ D_n = \sup_x \left| F_n(x) - F(x) \right| \]

\[ \mathrm{Frequency}(X_{\mathrm{grade}}) = \text{number of grades} \le X_{\mathrm{grade}} \]

\[ \text{Empirical Distribution} = F(X_{\mathrm{grade}}) = p(X \le X_{\mathrm{grade}}) = \frac{\mathrm{Frequency}(X_{\mathrm{grade}})}{\text{Sample Size}} \]
Example: Grade Distribution?

Dmax,Poisson = 0.153
Dmax,Normal = 0.119

\[ D_n = \sup_x \left| F_n(x) - F(x) \right| \]
Kolmogorov–Smirnov Acceptance Criteria

• Rejection criterion: we consider that the two distributions are not equal
if the empirical CDF is too far from the theoretical CDF of the proposed
distribution.
• This means: we reject if Dn is too large.
• But the question is: what does "large" mean? For which values of Dn should
we accept the distribution?
Kolmogorov–Smirnov test
http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test

In the 1930s, Kolmogorov and Smirnov showed that

\[ \lim_{n \to \infty} P\left( \sqrt{n}\, D_n \le t \right) = 1 - 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 t^2} \]

So, for large sample sizes, you can assume

\[ P\left( \sqrt{n}\, D_n \le t \right) \approx 1 - 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 t^2} \]

α-level test: find the critical value t_α such that

\[ \alpha = 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 t_\alpha^2} \]

So, the test is accepted if

\[ D_n \le \frac{t_\alpha}{\sqrt{n}} \]
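One way to sanity-check this series is to solve for t_α numerically. The
sketch below truncates the series and uses simple bisection (the tail
probability decreases in t); it is an illustration, not how the published
tables were actually produced. For α = 0.05 it recovers the coefficient
1.3581.

```python
import math

def ks_tail(t, terms=100):
    """Asymptotic P(sqrt(n) * D_n > t): 2 * sum (-1)^(i-1) exp(-2 i^2 t^2)."""
    return 2 * sum((-1) ** (i - 1) * math.exp(-2 * i * i * t * t)
                   for i in range(1, terms + 1))

def t_alpha(alpha, lo=0.2, hi=3.0):
    """Solve alpha = ks_tail(t) by bisection; ks_tail is decreasing in t."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if ks_tail(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t05 = t_alpha(0.05)   # ≈ 1.3581
```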
Kolmogorov–Smirnov test

• For small samples, critical values have been worked out and tabulated, but
there is no nice closed-form solution:
– J. Pomeranz (1973)
– J. Durbin (1968)
• For large samples, good approximations for n > 40:

α               0.20        0.10        0.05        0.02        0.01
critical value  1.0730/√n   1.2239/√n   1.3581/√n   1.5174/√n   1.6276/√n
Example: Grade Distribution?

• For our example, we have n = 50.
• The critical value for α = 0.05 is:

\[ D_{\mathrm{critical}} = \frac{1.3581}{\sqrt{50}} \approx 0.192 \]

Dmax,Normal = 0.119 ≤ Dcritical = 0.192 → ACCEPT
Dmax,Poisson = 0.153 ≤ Dcritical = 0.192 → ACCEPT
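The large-sample rule can be wrapped in a small helper; the coefficients come
from the table above (valid for n > 40), and the D values are the ones from
the grade example.

```python
import math

# Large-sample (n > 40) K-S critical value: coefficient / sqrt(n).
KS_COEFF = {0.20: 1.0730, 0.10: 1.2239, 0.05: 1.3581, 0.02: 1.5174, 0.01: 1.6276}

def ks_critical(n, alpha=0.05):
    return KS_COEFF[alpha] / math.sqrt(n)

def ks_accept(d_max, n, alpha=0.05):
    """Accept the fit if D_n is at most the critical value."""
    return d_max <= ks_critical(n, alpha)

# Grade-distribution example: n = 50, alpha = 0.05.
crit = ks_critical(50)              # ≈ 0.192
normal_ok = ks_accept(0.119, 50)    # ACCEPT
poisson_ok = ks_accept(0.153, 50)   # ACCEPT
```

Re-running with n = 100 gives a critical value of about 0.1358, which still
accepts the Normal fit but rejects the Poisson one.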


Example: Grade Distribution?

• If we get the same distribution for n = 100:
• The critical value for α = 0.05 is:

\[ D_{\mathrm{critical}} = \frac{1.3581}{\sqrt{100}} \approx 0.1358 \]

Dmax,Normal = 0.119 ≤ Dcritical = 0.1358 → ACCEPT
Dmax,Poisson = 0.153 > Dcritical = 0.1358 → REJECT


LINEAR REGRESSION: LEAST SQUARES METHOD
http://en.wikipedia.org/wiki/Linear_regression

In statistics, linear regression is a form of regression analysis in which
the relationship between one or more independent variables and another
variable, called the dependent variable, is modeled by a least-squares
function, called the linear regression equation. This function is a linear
combination of one or more model parameters, called regression coefficients.
A linear regression equation with one independent variable represents a
straight line. The results are subject to statistical analysis.
The Method of Least Squares
• The equation of the best-fitting line is calculated using a set of n pairs
(xi, yi).
• We choose estimates a and b of the intercept and slope so that the vertical
distances of the points from the line are minimized.

Best-fitting line:

\[ \hat{y} = a + bx \]

Choose a and b to minimize the Sum of Squared Errors (SSE):

\[ SSE = \sum (y - \hat{y})^2 = \sum (y - a - bx)^2 \]
Least Squares Estimators
Calculate the sums of squares:

\[ S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \qquad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} \qquad S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \]

Best-fitting line: \( \hat{y} = a + bx \), where

\[ b = \frac{S_{xy}}{S_{xx}} \qquad \text{and} \qquad a = \bar{y} - b\bar{x} \]
Example
The table shows the math achievement test scores for a random sample of
n = 10 college freshmen, along with their final calculus grades.

Student            1   2   3   4   5   6   7   8   9  10
Math test, x      39  43  21  64  57  47  28  75  34  52
Calculus grade, y 65  78  52  82  92  89  73  98  56  75

Use your calculator to find the sums and sums of squares:
Σx = 460, Σy = 760, Σx² = 23634, Σy² = 59816, Σxy = 36854
x̄ = 46, ȳ = 76
Example

\[ S_{xx} = 23634 - \frac{(460)^2}{10} = 2474 \]

\[ S_{yy} = 59816 - \frac{(760)^2}{10} = 2056 \]

\[ S_{xy} = 36854 - \frac{(460)(760)}{10} = 1894 \]

\[ b = \frac{1894}{2474} = 0.76556 \qquad a = 76 - 0.76556 \times 46 = 40.78 \]

Best-fitting line: \( \hat{y} = 40.78 + 0.77x \)
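The worked example can be reproduced from the raw data. This short Python
sketch recomputes the sums of squares and the coefficients, matching
b ≈ 0.7656 and a ≈ 40.78 above.

```python
# Math test scores (x) and calculus grades (y) for the 10 students above.
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]
n = len(x)

Sxx = sum(v * v for v in x) - sum(x) ** 2 / n                 # 2474.0
Sxy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n  # 1894.0

b = Sxy / Sxx                    # ≈ 0.76556
a = sum(y) / n - b * sum(x) / n  # ≈ 40.78
```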
Correlation Analysis
• The strength of the relationship between x and y is measured using the
coefficient of correlation:

\[ r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} \]

• The sign of r indicates the direction of the relationship;
• r near 0 indicates no linear relationship;
• r near 1 or −1 indicates a strong linear relationship.
• A test of the significance of the correlation coefficient is identical to
the test of the slope b.
Example
The table shows the heights and weights of n = 10 randomly selected college
football players.

Player      1    2    3    4    5    6    7    8    9   10
Height, x  73   71   75   72   72   75   67   69   71   69
Weight, y 185  175  200  210  190  195  150  170  180  175

Use your calculator to find the sums and sums of squares:
Sxy = 328, Sxx = 60.4, Syy = 2610

\[ r = \frac{328}{\sqrt{(60.4)(2610)}} = 0.8261 \]
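The correlation coefficient for this table can likewise be recomputed from
the raw data; a minimal sketch:

```python
import math

# Heights (x) and weights (y) of the 10 players above.
x = [73, 71, 75, 72, 72, 75, 67, 69, 71, 69]
y = [185, 175, 200, 210, 190, 195, 150, 170, 180, 175]
n = len(x)

Sxx = sum(v * v for v in x) - sum(x) ** 2 / n
Syy = sum(v * v for v in y) - sum(y) ** 2 / n
Sxy = sum(u * v for u, v in zip(x, y)) - sum(x) * sum(y) / n

r = Sxy / math.sqrt(Sxx * Syy)   # ≈ 0.8261
```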
Football Players
[Scatterplot of Weight vs. Height: r = 0.8261, a strong positive
correlation. As a player's height increases, so does his weight.]
Some Correlation Patterns
[Four example scatterplots:]
• r = 0: no correlation
• r = 0.931: strong positive correlation
• r = 1: perfect linear relationship
• r = −0.67: weaker negative correlation
