Non-Parametric Methods: Goodness of Fit Tests: (Chi-Square Applications)

This document discusses non-parametric methods, specifically focusing on chi-square applications for hypothesis testing. It outlines the goals of conducting tests for population variance, characteristics of the chi-square distribution, and various applications in fields like economics and marketing. Additionally, it explains the goodness-of-fit test for comparing observed frequencies to expected distributions, providing examples and formulas for conducting these tests.

Uploaded by

dashing.hamza256

Non-parametric methods:
Goodness of fit tests
(Chi-Square Applications)

Chapter 17

McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.
GOALS

1. Conduct a test of hypothesis for a single population variance.
2. List the characteristics of the chi-square distribution.
3. Conduct a test of hypothesis comparing an observed set of frequencies to an expected distribution.
4. Conduct a test of hypothesis to determine whether two classification criteria are related.

17-2
Chi-square application in
hypothesis testing

• A chi-square test can be used to test whether the variance of a population is equal to a specified value. This test can be either two-sided or one-sided. The two-sided version tests against the alternative that the true variance is either less than or greater than the specified value. The one-sided version tests in one direction only. The choice between a two-sided and a one-sided test is determined by the problem. For example, if we are testing a new process, we may only be concerned if its variability is greater than the variability of the current process.

17-3
Is the packing machine working
properly?
• Suppose people have lodged complaints about the weight of the 12.5 kg mealie-meal bags.
• A consultant took a sample of mealie-meal bags and did not find any problem with the average weight. That is, she could not reject the null hypothesis that the population mean weight is $\mu = 12.5$ kg.

What could be the problem?

17-4
Why study variance?

• Although the mean is OK in the above example, there could be a problem with the variance.
• Packaging plants are designed to operate within a certain specified precision.
• Ideally, the machine would pack exactly 12.5 kg in every bag, but this is practically impossible. So a certain pre-specified variation is tolerated.
17-5
Testing for a single variance

• After years of operation it is always important to check whether the machine variation $\sigma^2$ is still at the initially set level of precision (say $\sigma_0^2$).
• This implies testing the hypothesis
$H_0: \sigma^2 = \sigma_0^2$
against the alternative
$H_1: \sigma^2 \neq \sigma_0^2$ (or a one-sided alternative, depending on the problem).

17-6
Other applications

Other applications where testing for variance may be important include the following:
• Foreign exchange stability is important in any economy. Too much variation in a currency is not good.
• Price stability of other commodities is also important.
Question:
Can you name other possible areas of application where testing that the variation remains stable at a pre-set value is important?
17-7
Characteristics of the Chi-Square
Distribution

The major characteristics of the chi-square distribution:
• It is positively skewed.
• It is non-negative.
• It is based on degrees of freedom.
• When the degrees of freedom change, a new distribution is created.

17-8
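The characteristics listed above can be checked numerically. The sketch below is only an illustration, not part of the original slides: it relies on the standard fact that a chi-square variable with df degrees of freedom is a sum of df squared independent standard normal variables, and the helper names, the seed, and the sample size are arbitrary choices made here.

```python
import random
import statistics

def sample_chi_square(df, rng):
    """Draw one chi-square variate as a sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def sample_skewness(values):
    """Moment-based sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
summary = {}
for df in (2, 5, 20):
    draws = [sample_chi_square(df, rng) for _ in range(20000)]
    # Store (minimum, mean, skewness) for each choice of degrees of freedom.
    summary[df] = (min(draws), statistics.fmean(draws), sample_skewness(draws))
    print(df, summary[df])
```

Each choice of df gives a different distribution: the minimum never falls below zero, the sample mean sits near df, and the skewness stays positive while shrinking as df grows.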
Chi-square Table look

17-9
The chi-square test

• This test applies when we want to test for a single variance.
• The null hypothesis is of the form
$H_0: \sigma^2 = \sigma_0^2$

We need to test this against one of the alternatives
$H_1: \sigma^2 \neq \sigma_0^2$
or $H_1: \sigma^2 > \sigma_0^2$
or $H_1: \sigma^2 < \sigma_0^2$
The test is based on the comparison between $s^2$ and $\sigma_0^2$, using the ratio $\frac{(n-1)s^2}{\sigma_0^2}$.

17-10
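Conducting the test

$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$

When $H_0$ is true and the population is normally distributed, this statistic follows a chi-square distribution with $n-1$ degrees of freedom.

17-11
Conducting the test

17-12
17-13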
Back to Example

• Suppose the mealie-meal packaging machine is designed to operate with a precision of $\sigma_0^2 = 0.0016\ \text{kg}^2$.
• Suppose that data from a sample of 12 mealie-meal bags gave $s^2 = 0.0025\ \text{kg}^2$.
• Does the data indicate a significant increase in the variation?

17-14
Test computations and results

$\chi^2 = (n-1)\frac{s^2}{\sigma_0^2} = \frac{11 \times 0.0025}{0.0016} \approx 17.2$

17-15
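The computation for the mealie-meal example can be reproduced with a few lines of Python. This is a minimal sketch: the function name is made up here, and the significance level (0.05, upper-tailed) is an assumption, since the slide does not state one. The critical value 19.675 is the standard chi-square table entry for 11 degrees of freedom with 0.05 in the upper tail.

```python
def variance_test_statistic(n, sample_var, hypothesized_var):
    """Chi-square statistic (n - 1) * s^2 / sigma_0^2 for a single variance."""
    return (n - 1) * sample_var / hypothesized_var

n = 12               # number of bags sampled (from the slide)
s2 = 0.0025          # sample variance in kg^2 (from the slide)
sigma0_sq = 0.0016   # design variance in kg^2 (from the slide)

chi2_stat = variance_test_statistic(n, s2, sigma0_sq)

# ASSUMPTION: alpha = 0.05 and an upper-tailed test (H1: variance increased).
# 19.675 is the chi-square table value for df = 11, upper-tail area 0.05.
critical_value = 19.675
reject_h0 = chi2_stat > critical_value

print(round(chi2_stat, 4), reject_h0)
```

Under that assumed 0.05 level the statistic stays below the critical value, so the sample would not show a significant increase in variation; a different level could change the decision.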
EXAMPLE

• A nutritionist claims that the variance of the number of calories in 1 tbsp of the major brands of pancake syrup is greater than 3600. A sample of 18 major brands of syrup yielded a variance of 4175. Test this claim, assuming the number of calories is normally distributed.
• Compute a 95% confidence interval for the variance too.

17-16
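One way to work the syrup example is sketched below. The function names are illustrative, and the 0.05 significance level for the test is an assumption (the slide only specifies 95% for the interval). The three critical values are standard chi-square table entries for 17 degrees of freedom.

```python
def variance_test_statistic(n, sample_var, hypothesized_var):
    """Chi-square statistic (n - 1) * s^2 / sigma_0^2."""
    return (n - 1) * sample_var / hypothesized_var

def variance_confidence_interval(n, sample_var, chi2_upper, chi2_lower):
    """Interval for sigma^2: ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower)."""
    numerator = (n - 1) * sample_var
    return numerator / chi2_upper, numerator / chi2_lower

n, s2, sigma0_sq = 18, 4175, 3600   # values from the slide

# Upper-tailed test of H0: sigma^2 = 3600 against H1: sigma^2 > 3600.
chi2_stat = variance_test_statistic(n, s2, sigma0_sq)
# ASSUMPTION: alpha = 0.05; table value for df = 17, upper-tail area 0.05.
reject_h0 = chi2_stat > 27.587

# 95% interval: table values for df = 17 with upper-tail areas 0.025 and 0.975.
ci_low, ci_high = variance_confidence_interval(n, s2, 30.191, 7.564)

print(round(chi2_stat, 3), reject_h0)
print(round(ci_low, 1), round(ci_high, 1))
```

The larger table value goes in the denominator of the lower limit, which is why the two critical values appear in "reversed" order in the interval.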
EXAMPLE

• In a study in which the subjects were 15 patients suffering from pulmonary sarcoid disease, blood gas determinations were made. The variance of the sample was 450. Test the hypothesis that the population variance is less than 250.

17-17
EXAMPLE

• A tire manufacturer claims that the variance of the distribution in a certain tire model is 8.6. A random sample of 10 tires has a variance of 4.3. At α = 0.01, is there enough evidence to reject the manufacturer's claim?

17-18
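The tire example can be read as a two-sided test, since the claim is that the variance equals 8.6; that reading is an assumption made for this sketch. The two critical values are standard chi-square table entries for 9 degrees of freedom with 0.005 in each tail.

```python
def variance_test_statistic(n, sample_var, hypothesized_var):
    """Chi-square statistic (n - 1) * s^2 / sigma_0^2."""
    return (n - 1) * sample_var / hypothesized_var

n, s2, sigma0_sq = 10, 4.3, 8.6   # values from the slide
alpha = 0.01                      # stated in the problem

chi2_stat = variance_test_statistic(n, s2, sigma0_sq)

# ASSUMPTION: two-sided test, so alpha / 2 = 0.005 goes in each tail.
# Table values for df = 9: lower critical 1.735, upper critical 23.589.
lower_critical, upper_critical = 1.735, 23.589
reject_h0 = chi2_stat < lower_critical or chi2_stat > upper_critical

print(round(chi2_stat, 3), reject_h0)
```

With these values the statistic lies between the two critical values, so under the two-sided reading the manufacturer's claim would not be rejected at the 0.01 level.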
EXAMPLE

• A large candy manufacturer produces, packages and sells packs of candy targeted to weigh 52 grams. A quality control manager working for the company was concerned that the variation in the actual weights of the targeted 52-gram packs was larger than acceptable. That is, he was concerned that some packs weighed significantly less than 52 grams and some weighed significantly more than 52 grams. In an attempt to estimate σ, the standard deviation of the weights of all of the 52-gram packs the manufacturer makes, he took a random sample of n = 10 packs off the factory line. The random sample yielded a sample variance of 4.2 square grams. Use the random sample to derive a 90% confidence interval for σ.

17-19
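A possible way to obtain the requested interval is sketched below: build the interval for the variance first, then take square roots to get the interval for σ. The function name is illustrative, and the critical values are standard chi-square table entries for 9 degrees of freedom (upper-tail areas 0.05 and 0.95).

```python
import math

def sd_confidence_interval(n, sample_var, chi2_upper, chi2_lower):
    """Interval for sigma: square roots of the limits of the sigma^2 interval."""
    numerator = (n - 1) * sample_var
    return math.sqrt(numerator / chi2_upper), math.sqrt(numerator / chi2_lower)

n, s2 = 10, 4.2   # sample size and sample variance from the slide

# 90% interval: table values for df = 9 with upper-tail areas 0.05 and 0.95.
sd_low, sd_high = sd_confidence_interval(n, s2, 16.919, 3.325)

print(round(sd_low, 3), round(sd_high, 3))
```

The interval is not symmetric around the sample standard deviation because the chi-square distribution is skewed.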
What is the use of the chi-square test in
economics?

• In econometrics, the chi-squared distribution is used extensively. It is useful for comparing variance values estimated from a sample with values based on theoretical assumptions. It is therefore typically used to develop confidence intervals and hypothesis tests for a population variance.

17-21
What is the use of the chi-square test
in Marketing?

• The chi-square test is often used in research studies to test the relationship between a variable pertaining to behaviour or attitude and a variable pertaining to classification, for instance the relationship between the consumption of a product and income level, location, or age. The variables are cross-tabulated and then tested. The test will reveal whether a relationship exists between the two.
17-22
Goodness-of-Fit Test: Equal Expected
Frequencies

Conduct a test of hypothesis comparing an observed set of frequencies to an expected distribution.

The goodness-of-fit test is one of the most commonly used statistical tests. It is particularly useful because it requires only the nominal level of measurement. So we are able to conduct a test of hypothesis on data that has been classified into groups. Our first illustration of this test involves the case when the expected cell frequencies are equal. As the full name implies, the purpose of the goodness-of-fit test is to compare an observed distribution to an expected distribution.

17-23
Goodness-of-Fit Test: Equal Expected
Frequencies

• Let $f_o$ and $f_e$ be the observed and expected frequencies, respectively.
• $H_0$: There is no difference between the observed and the expected frequencies.
• $H_1$: There is a difference between the observed and the expected frequencies.
• The test statistic is:

$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$

• The critical value is a chi-square value with $(k-1)$ degrees of freedom, where $k$ is the number of categories.
17-24
EXAMPLE

• Bubba’s Fish and Pasta is a chain of restaurants located along the Gulf Coast of Florida. Bubba, the owner, is considering adding steak to his menu. Before doing so, he decides to hire Magnolia Research, LLC, to conduct a survey of adults as to their favorite meal when eating out. Magnolia selected a sample of 120 adults and asked each to indicate their favorite meal when dining out. The results are reported below. Is it reasonable to conclude there is no preference among the four entrées? Is the difference in the number of times each entrée is selected due to chance, or should we conclude that the entrées are not equally preferred?

17-25
EXAMPLE

Ms. Jan Kilpatrick is the marketing manager for a manufacturer of sports cards. She plans to begin selling a series of cards with pictures and playing statistics of former Major League Baseball players. One of the problems is the selection of the former players. At a baseball card show at Southwyck Mall last weekend, she set up a booth and offered cards of the following six Hall of Fame baseball players: Tom Seaver, Nolan Ryan, Ty Cobb, George Brett, Hank Aaron, and Johnny Bench. At the end of the day she had sold a total of 120 cards. The number of cards sold for each old-time player is shown in the table on the right. Can she conclude the sales are not the same for each player? Use the 0.05 significance level.

17-26
Goodness-of-Fit Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no difference between $f_o$ and $f_e$.
H1: There is a difference between $f_o$ and $f_e$.

Step 2: Select the level of significance.
α = 0.05, as stated in the problem.

Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as $\chi^2$.

Step 4: Formulate the decision rule.
Reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,k-1}$:

$\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,5} = 11.070$

17-27
Goodness-of-Fit Example

Step 5: Compute the value of the chi-square statistic and make a decision.

• The computed $\chi^2$ of 34.40 is larger than the critical value of 11.070. The decision, therefore, is to reject H0 at the .05 level.

17-28
Goodness-of-Fit Example

• Conclusion: The difference between the observed and the expected frequencies is not due to chance. Rather, the differences between $f_o$ and $f_e$ are large enough to be considered significant. It is unlikely that card sales are the same among the six players.

17-29
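The five steps for the equal-expected-frequency case can be sketched in Python. The sales table did not survive in this copy of the slides, so the per-player counts below are ILLUSTRATIVE values chosen only because they sum to 120 and reproduce the reported statistic of 34.40; they should not be read as the original table. The function name is also made up for this sketch.

```python
def chi_square_gof(observed, expected):
    """Goodness-of-fit statistic: sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# ILLUSTRATIVE counts (not taken from the slides): six categories, 120 cards.
observed = [13, 33, 14, 7, 36, 17]

k = len(observed)
total = sum(observed)
expected = [total / k] * k     # equal expected frequencies: 120 / 6 = 20 each

chi2_stat = chi_square_gof(observed, expected)
critical_value = 11.070        # chi-square, alpha = .05, df = k - 1 = 5 (slide)
reject_h0 = chi2_stat > critical_value

print(round(chi2_stat, 2), reject_h0)
```

With equal expected frequencies, every category shares the same denominator, so the statistic is just the sum of squared deviations divided by that common expected count.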
Goodness-of-Fit Test: Unequal
Expected Frequencies

• Let $f_o$ and $f_e$ be the observed and expected frequencies, respectively.

The Hypothesis
• $H_0$: There is no difference between the observed and the expected frequencies.
• $H_1$: There is a difference between the observed and the expected frequencies.

The test statistic is computed using the following formula:

$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$

17-30
EXAMPLE

The American Hospital Administrators Association (AHAA) reports the following information concerning the number of times senior citizens are admitted to a hospital during a one-year period. Forty percent are not admitted; 30 percent are admitted once; 20 percent are admitted twice, and the remaining 10 percent are admitted three or more times.
A survey of 150 residents of Bartow Estates, a community devoted to active seniors located in central Florida, revealed 55 residents were not admitted during the last year, 50 were admitted to a hospital once, 32 were admitted twice, and the rest of those in the survey were admitted three or more times.

Can we conclude the survey at Bartow Estates is consistent with the information suggested by the AHAA? Use the .05 significance level.

17-31
Goodness-of-Fit Test: Unequal Expected
Frequencies - Example

Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no difference between local and national experience for hospital admissions.
H1: There is a difference between local and national experience for hospital admissions.

Step 2: Select the level of significance.
α = 0.05, as stated in the problem.

Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as $\chi^2$.

Step 4: Formulate the decision rule.
Reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,k-1}$:

$\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,4-1} = \chi^2_{.05,\,3} = 7.815$

17-32
Goodness-of-Fit Test: Unequal Expected
Frequencies - Example

Step 5: Compute the statistic and make a decision.

$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]$

• The computed $\chi^2$ of 1.3723 is NOT greater than the critical value of 7.815, so we cannot reject the null hypothesis. The difference between the observed and the expected frequencies can be attributed to chance.
• We conclude that there is no evidence of a difference between the local and national experience for hospital admissions.
17-33
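The hospital-admissions computation can be reproduced from the numbers given in the example. In this sketch the function and variable names are illustrative; the observed counts, the national percentages, and the critical value all come from the slides. The last observed count is derived as "the rest" of the 150 residents.

```python
def chi_square_gof(observed, expected):
    """Goodness-of-fit statistic: sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

n = 150
# Admitted 0, 1, 2 times; the remaining residents were admitted 3 or more times.
observed = [55, 50, 32]
observed.append(n - sum(observed))

# National (AHAA) percentages for 0, 1, 2, and 3-or-more admissions.
percentages = [40, 30, 20, 10]
expected = [pct * n / 100 for pct in percentages]   # unequal expected counts

chi2_stat = chi_square_gof(observed, expected)
critical_value = 7.815      # chi-square, alpha = .05, df = 4 - 1 = 3 (slide)
reject_h0 = chi2_stat > critical_value

print(observed, expected)
print(round(chi2_stat, 4), reject_h0)
```

The unrounded sum comes out at about 1.3722; the 1.3723 on the slide presumably results from rounding each term to four decimals before adding.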
Perform a chi-square test for independence on a
contingency table.

• We can use the chi-square statistic to formally test for a relationship between two nominal-scaled variables. To put it another way, is one variable independent of the other? Here are some examples where we are interested in testing whether two variables are related.
• Ford Motor Company operates an assembly plant in Dearborn, Michigan. The plant operates three shifts per day, 5 days a week. The quality control manager wishes to compare the quality level on the three shifts. Vehicles are classified by quality level (acceptable, unacceptable) and shift (day, afternoon, night). Is there a difference in the quality level on the three shifts? That is, is the quality of the product related to the shift when it was manufactured? Or is the quality of the product independent of the shift on which it was manufactured?
• A sample of 100 drivers who were stopped for speeding violations was classified by gender and whether or not they were wearing a seat belt. For this sample, is wearing a seat belt related to gender?
• Does a male released from federal prison make a different adjustment to civilian life if he returns to his hometown or if he goes elsewhere to live? The two variables are adjustment to civilian life and place of residence. Note that both variables are measured on the nominal scale.
17-34
Contingency Table Analysis

A contingency table is used to investigate whether two traits or characteristics are related. Each observation is classified according to two criteria. We use the usual hypothesis testing procedure.
• The number of degrees of freedom is equal to:

(number of rows - 1)(number of columns - 1).

The expected frequency is computed as:

$f_e = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$

We can use the chi-square statistic to formally test for a relationship between two nominal-scaled variables. To put it another way, is one variable independent of the other?

17-35
Contingency Analysis - Example

The Federal Correction Agency is investigating the question “Does a male released from federal prison make a different adjustment to civilian life if he returns to his hometown or if he goes elsewhere to live?” To put it another way, is there a relationship between adjustment to civilian life and place of residence after release from prison? Use the .01 significance level.

The agency’s psychologists interviewed 200 randomly selected former prisoners. Using a series of questions, the psychologists classified the adjustment of each individual to civilian life as outstanding, good, fair, or unsatisfactory.

The classifications for the 200 former prisoners were tallied as follows. Joseph Camden, for example, returned to his hometown and has shown outstanding adjustment to civilian life. His case is one of the 27 tallies in the upper left box (circled).
17-36
Example

17-37
Contingency Analysis - Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no relationship between adjustment to civilian life and where the individual lives after being released from prison.
H1: There is a relationship between adjustment to civilian life and where the individual lives after being released from prison.

Step 2: Select the level of significance.
α = 0.01, as stated in the problem.

Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as $\chi^2$.

Step 4: Formulate the decision rule.
Reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)}$:

$\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,(2-1)(4-1)} = \chi^2_{.01,\,(1)(3)} = \chi^2_{.01,\,3} = 11.345$

17-38
Computing Expected Frequencies ($f_e$)

$f_e = \frac{(120)(50)}{200} = 30$

17-39
Computing the Chi-square Statistic

17-40
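The expected frequencies and the chi-square statistic for a contingency table can be computed as sketched below. The tally table did not survive in this copy of the slides, so the 2 x 4 table used here is ILLUSTRATIVE: it was chosen to agree with the figures that do appear (27 in the upper-left cell, a row total of 120, a column total of 50, 200 observations, and a statistic near 5.729), and should not be read as the original data. Function names are also made up for this sketch.

```python
def expected_frequencies(table):
    """fe = (row total)(column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return the chi-square statistic and its degrees of freedom."""
    expected = expected_frequencies(table)
    stat = sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(table, expected)
               for fo, fe in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# ILLUSTRATIVE table (not taken from the slides).
# Rows: hometown, elsewhere. Columns: outstanding, good, fair, unsatisfactory.
table = [[27, 35, 33, 25],
         [13, 15, 27, 25]]

expected = expected_frequencies(table)
chi2_stat, df = chi_square_independence(table)
critical_value = 11.345     # chi-square, alpha = .01, df = 3 (slide)
reject_h0 = chi2_stat > critical_value

print(expected[0][1])       # (120)(50) / 200
print(round(chi2_stat, 3), df, reject_h0)
```

Computing every expected count from the marginal totals is what builds the independence assumption into the test: each cell gets the share it would have if row and column classifications were unrelated.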
Conclusion

The computed χ2 of 5.729 is in the “Do not reject H0” region. The
null hypothesis is not rejected at the .01 significance level.
We conclude there is no evidence of a relationship between
adjustment to civilian life and where the prisoner resides after being
released from prison. For the Federal Correction Agency’s advisement
program, adjustment to civilian life is not related to where the ex-
17-41
Contingency Analysis - Minitab

17-42
LIMITATIONS OF CHI SQUARE

• If there is an unusually small expected frequency in a cell, chi-square (if applied) might result in an erroneous conclusion. This can happen because $f_e$ appears in the denominator, and dividing by a very small number makes the quotient quite large! Two generally accepted policies regarding small cell frequencies are:
• 1. If there are only two cells, the expected frequency in each cell should be at least 5. The computation of chi-square would be permissible in the following problem, involving a minimum $f_e$ of 6.

17-43
Limitations of Chi-Square

• 2. For more than two cells, chi-square should not be used if more than 20 percent of the cells have expected frequencies less than 5. According to this policy, it would not be appropriate to use the goodness-of-fit test on the following data. Three of the seven cells, or 43 percent, have expected frequencies ($f_e$) of less than 5.

17-44
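The two policies on small expected frequencies can be written as a simple check. This is a sketch under stated assumptions: the function name is invented here, and the example lists of expected frequencies are HYPOTHETICAL, since the tables these slides refer to did not survive in this copy.

```python
def chi_square_applicable(expected):
    """Apply the two small-cell policies to a list of expected frequencies.

    Two cells: every expected frequency must be at least 5.
    More than two cells: at most 20 percent of the cells may fall below 5.
    """
    k = len(expected)
    small_cells = sum(1 for fe in expected if fe < 5)
    if k <= 2:
        return small_cells == 0
    return small_cells / k <= 0.20

# HYPOTHETICAL expected frequencies, for illustration only.
two_cells = [6, 94]                           # minimum fe is 6
seven_cells = [2, 4, 3, 18, 32, 20, 14]       # 3 of 7 cells (43%) below 5
five_cells = [4, 10, 12, 15, 9]               # 1 of 5 cells (20%) below 5

print(chi_square_applicable(two_cells))
print(chi_square_applicable(seven_cells))
print(chi_square_applicable(five_cells))
```

When the check fails, a common remedy is to combine adjacent categories until the expected frequencies are large enough, although the slides shown here do not go into that step.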
EXAMPLE

• A social scientist sampled 140 people and classified them according to income level and whether or not they played a state lottery in the last month. The sample information is reported below. Is it reasonable to conclude that playing the lottery is related to income level? Use the .05 significance level.

• At the 0.05 level of significance, can we conclude that there is no relationship between income and playing the state lottery?

17-45
