Analysis of variance (ANOVA) is a hypothesis testing method used to compare means across more than two groups or factors. It is particularly useful for analyzing data with both discrete and continuous variables, and involves validating several assumptions before conducting the test. The document provides examples of real-world applications of ANOVA, including customer satisfaction analysis and process improvement in manufacturing.

Uploaded by

pimpom.loor
Copyright
© © All Rights Reserved

Analysis of variance, often referred to as ANOVA, is a hypothesis test that deals with more
than two populations or factors of X. If you recall from chapter 21, we discussed a variety of
hypothesis tests for dealing with single factors of X or two factors of X. For example, you
learned to conduct a hypothesis test to determine if the mean of a sample was statistically
equal to a target mean. You also learned to compare the means of two groups of data with
each other for the same reason. ANOVA lets you perform this test with more than two
groups of data.

Analysis of variance is often used when you have a combination of both discrete and
continuous variables. If the independent variable is discrete (a list of employees, for
example) and the response variable is continuous (a list of errors or length of time worked),
ANOVA can be a valuable tool for analysis.

Before learning more about ANOVA, let’s review the other tools for testing means.

- When testing for one mean, usually against a target, you would use the Z test or T
test. Remember, the Z test is generally used for large samples (a common rule of thumb is
30 or more observations), where the sample standard deviation can stand in for a
population standard deviation you don’t know; the T test is the usual choice for smaller
samples. The Z test often provides the “lay of the land,” letting you gather some
information so you can run more accurate analysis with other tests.
- When testing for two means – which involves comparing two samples against each
other – you would use the 2 sample T test or the paired T test.

ANALYSIS OF VARIANCE (1-WAY ANOVA)

- When testing for three or more means, you can use a specific ANOVA test called the
1-Way ANOVA.
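The selection rule in the list above can be sketched as a small lookup. This is only an illustration in Python; the function name `choose_mean_test` and its `paired` flag are hypothetical and are not part of Minitab or of this chapter.

```python
def choose_mean_test(n_groups: int, paired: bool = False) -> str:
    """Return the test suggested by the list above for comparing means."""
    if n_groups < 1:
        raise ValueError("need at least one group of data")
    if n_groups == 1:
        # One sample compared against a target value.
        return "Z test or 1-sample T test"
    if n_groups == 2:
        # Two samples compared against each other.
        return "paired T test" if paired else "2-sample T test"
    # Three or more samples.
    return "1-Way ANOVA"


for groups in (1, 2, 5):
    print(groups, "->", choose_mean_test(groups))
```

The rule only covers the choice by number of groups; the assumptions behind each test still have to be checked separately.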

To better understand these designations, consider the real-world applications below.

A Six Sigma team is working to reduce the process time in a manufacturing process. The
team was provided with historic baseline metrics for the process. Specifically, in the last
quarter, the process averaged 35.8 minutes per output. After working through Define,
Measure, and Analyze phases, the team has put an improvement in place they believe will
reduce the average time per output. After piloting the change in Improve, the team takes
new measurements. To compare the new sample to the historical baseline, the team would
use the 1-sample T test.
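As a rough sketch of the calculation behind the scenario above, the 1-sample T statistic can be computed with the Python standard library. The pilot measurements below are hypothetical (the source gives only the 35.8-minute baseline), and the critical value is read from a standard t table rather than computed.

```python
import math
import statistics

BASELINE_MEAN = 35.8  # historic minutes per output, from the scenario above

# Hypothetical pilot measurements (minutes per output); not from the source.
pilot_times = [34.1, 33.5, 35.0, 32.8, 34.6, 33.9, 35.2, 33.1]

n = len(pilot_times)
sample_mean = statistics.mean(pilot_times)
sample_sd = statistics.stdev(pilot_times)  # sample standard deviation (n - 1)

# 1-sample t statistic: distance from the baseline in standard-error units.
t_stat = (sample_mean - BASELINE_MEAN) / (sample_sd / math.sqrt(n))

# One-sided critical value for alpha = 0.05 and n - 1 = 7 degrees of freedom,
# taken from a standard t table.
T_CRITICAL = 1.895

# A t statistic below -T_CRITICAL supports a reduction in mean process time.
reduced = t_stat < -T_CRITICAL
print(f"mean = {sample_mean:.3f}, t = {t_stat:.2f}, reduction supported: {reduced}")
```

Statistical software such as Minitab reports a p-value instead of comparing against a table value, but the decision is the same.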

In a different scenario, a Six Sigma team is working to increase the number of calls that can
be handled by a call center team in a given day. Call center management has repeatedly
asked for additional employees, but executive leadership wants to find out whether
efficiency improvements can increase production without adding additional employees.
After the first half of the DMAIC process, a Six Sigma team is ready with possible solutions.
To verify these solutions, the team decides to implement the changes in half the call center.
Over the course of one month, the team measures performance for both halves of the call
center. At the end of the month, the team has two sets of data and wants to answer the
question: Is the average production for the altered group greater than the average for the
unaltered group? In this case, the team would use the 2-sample T test.

The two examples above are in line with many of the examples used in the chapter on
hypothesis testing. Consider the scenario below, which is slightly different.

A restaurant chain wants to know if its branding and customer-facing activities are working
equally across all locations. Specifically, the company decides to look at the customer
satisfaction scores for each restaurant location. The chain includes five locations.
Satisfaction scores are collected via phone, web, and written surveys at each location. The
company averages the scores each week and reports those numbers. After several months,
the company has five sets of averages. If the company wants to know whether the scores for
any location differ statistically from the scores for other locations, then the correct test is
the 1-way ANOVA.

The way a Six Sigma expert approaches a 1-Way ANOVA is very similar to the way he or
she would approach any other hypothesis test. First, you begin with a real-world problem:
a practical or business problem. In the example above, that problem is whether there is
variation in customer satisfaction among the various restaurant locations.

SIX SIGMA: A COMPLETE STEP-BY-STEP GUIDE

As with any hypothesis test, a null and an alternative hypothesis are required. The null
hypothesis is that there is no difference in the means of the samples.

The alternative hypothesis is that at least one of the means is not statistically the same.

Between versus Within Sample Variance

ANOVA calculations compare the between sample variance with the within sample
variance. Between sample variance is the variance that occurs across all of the samples
being analyzed. Within sample variance is the variance that occurs within a single sample or
group.

The reason this distinction is important is that the within sample variance obviously impacts
the between sample variance. If you are dealing with five separate samples and each
sample has a big variance, then the variance across the samples is likely to be large as well.
Part of the ANOVA calculation compares between sample variance to determine if it is
large enough relative to within sample variance to denote a statistical difference.

After stating the null and alternative hypothesis, a Six Sigma expert should verify that any
assumptions within the model are appropriate. This includes assumptions about errors as
well as some basic assumptions about the ANOVA model.

First, the samples used for the ANOVA must be randomly selected. Remember, this is
always a necessary assumption for inferential statistics.

After ensuring samples are randomly selected, Six Sigma experts must validate six other
assumptions about data before running a 1-way ANOVA test.
1. The dependent variable, or outcome, must be continuous in nature. This means that it
is a ratio or an interval. In previous examples, the dependent variable included the
customer satisfaction score (numerical), the number of calls handled per day, and the
cycle time for a specific output. All of these are examples of continuous data.
Anything that can be measured in time, temperature, feet and inches (or centimeters
and meters), money, or ratios is typically continuous in nature.
2. The independent variable contains two or more unrelated groups or categories.
Teams A, B, and C are three groups of one independent variable (team). If you are
measuring performance for five different workers, then the independent variable has
five groups. In the example about restaurants, the five different restaurants are
the groups. Categorical variables can include people, times, shifts,
teams, departments, locations, various demographic groups (age, gender,
ethnicity), or professions.
3. Observations are independent. First, you must ensure there are no dependencies
between groups. If you are measuring the performance of three different teams in a
department but several employees work on multiple teams, then the results are not
completely independent. Second, you must ensure that observations within groups
are not dependent on one another.
4. Significant outliers don’t exist in your data. Outliers throw off the accuracy of the 1-
way ANOVA, but you don’t have to scrap the entire test because of a single
explainable outlier. As a Six Sigma expert, you do have to review your data,
investigate outliers, and discard them appropriately. For example, if a Six Sigma
expert is reviewing the response times for customer emails, he or she might review
data samples from nine different employees. The data for one employee includes
three outliers where emails were responded to after a much longer amount of time
than all other samples seem to indicate. The Six Sigma expert might investigate this
and note that the employee in question was on a short medical leave, which
skewed results. The Six Sigma expert could remove those three data points from his
or her calculations, but it would be important to note these outliers. The fact that
this particular problem can happen is still important to the overall process
improvement. The Six Sigma team might recommend instituting a process change
that addresses this issue, but they might analyze data minus these outliers to draw
other conclusions.
5. The dependent variable (as described in number 1 above) should be normally
distributed or approximate the normal curve within each group. In the example
above, the Six Sigma team gathers data for nine employees. Within each of these
nine data sets, the data should be normally distributed.
6. The variance within each set has to be statistically equal to the variances of other
sets – also called homogeneity of variances. The test for equal variance can be run
in Minitab and will be covered later in this chapter.


After validating assumptions and setting up the hypotheses, a Six Sigma expert can run the
1-way ANOVA manually or using statistical analysis software. In this chapter, we’ll use
Minitab to run the tests. When using statistical analysis software, you typically spend more
time on validating some of the assumptions than you do running the test itself.
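For readers who want to run the test manually, the between sample versus within sample comparison is conventionally summarized as an F ratio. The notation below is the standard textbook form and is not taken from this chapter: k is the number of groups, n_i the number of observations in group i, N the total number of observations, \bar{x}_i the mean of group i, and \bar{x} the mean of all observations.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\displaystyle \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \,/\, (k - 1)}
         {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \,/\, (N - k)}
```

A large F means the group means are spread further apart than the variation inside the groups would explain; the p-value is the upper-tail probability of F with k - 1 and N - k degrees of freedom.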

To understand how to validate all the assumptions, run the 1-Way ANOVA, and interpret
the results, consider a real-world example.

A mail-order book company wants to improve customer satisfaction with deliveries. A Six
Sigma team working on the process has identified packing materials as a possible factor in
negative customer satisfaction scores. The mail-order company currently ships books via
cardboard boxes with no other protection, which the team believes is contributing to an
increased amount of damage to shipments before they reach the customer or when the
customer opens the package.

To test this theory and pilot two proposed solutions, the team implements additional
packing options. Some books will be shipped in the regular packaging – the box without any
additional materials. Others will be shipped in boxes with packing peanuts and still others in
padded envelopes. The team implements the different packing methods at three different
shipping stations and records an overall customer satisfaction score for each shipment
based on a scale from 1 to 10 with 10 being the most satisfied. That data is presented
below.


Original Packaging    Padded Envelope    Packing Peanuts
5                     5                  6
5                     5                  8
4                     3                  8
6                     6                  6
4                     6                  5
6                     8                  5
6                     7                  8
6                     6                  7
5                     7                  8
4                     5                  6
6                     6                  6
6                     5                  10
5                     6                  7
4                     4                  9
5                     6                  6

To follow along with the 1-Way ANOVA for this example, copy the above data into Minitab.

Is the dependent variable continuous?

Yes, it is measured in interval form.

Is the independent variable made up of two or more unrelated groups?

Yes, the independent variable is how the book is packaged.

Is there independence of observation?

Yes, the three packing methods don’t have anything to do with each other and are not
combined. The team wouldn’t pack one book in a padded envelope and also put it in a box,
for example.

It is worth noting here that the team has implemented the various packing methods at three
different stations. It would be a good idea for the team to ensure that the packing stations
were otherwise treated identically to keep any other factors from influencing the outcome
of the pilot.

Are there any significant outliers in the data?

Graphical analysis can be an easy way to check for outliers. The box plot (also called the box
and whisker plot) is a good way to check quickly for outliers. Boxplotting in Minitab was
covered in Unit 6, but you can run one on the data above now.

1. Click Graph > Boxplot.
2. Select multiple Ys.
3. Click OK.
4. Select all three data columns into the Graph Variables box.
5. Click OK.

[Figure: Boxplot of Original Packaging, Padded Envelope, Packing Peanuts. Y-axis: Data.]

Minitab returns the graph above, which shows no outliers for any of the three sets of data.
What if someone recording the customer satisfaction scores made a mistake and entered
the number 20 for one shipment? The boxplot analysis changes.


[Figure: Boxplot of Original Packaging, Padded Envelope, Packing Peanuts, with the mistaken score of 20 included. Y-axis: Data.]

In the graph above, you can see an outlier dot high above the box for packing peanuts. A Six
Sigma expert reviewing this graph would see that outlier and realize that it was a
measurement error – remember, the scale for customer satisfaction was only supposed to
go up to 10. Because there is an explanation for the outlier, it can be removed.
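The outlier check can also be sketched in code. The example below applies the common box plot convention of flagging points more than 1.5 times the interquartile range beyond the quartiles. Two things are assumptions: that this convention and the quartile method of Python's `statistics.quantiles` behave like Minitab's box plot, and which score was mistyped (the source does not say, so the last Packing Peanuts score is replaced with 20 purely for illustration).

```python
import statistics

# Packing Peanuts scores from the data table in this chapter.
peanuts = [6, 8, 8, 6, 5, 5, 8, 7, 8, 6, 6, 10, 7, 9, 6]

# Assumption: the last score was mistyped as 20 (the source does not say which).
peanuts_with_typo = peanuts[:-1] + [20]


def find_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


print(find_outliers(peanuts))            # no values flagged
print(find_outliers(peanuts_with_typo))  # flags the 20
```

A flagged value is only a prompt to investigate; as described above, it should be removed only when there is an explanation for it.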

Is the data in each group normal?

The quickest way to test normality for each set of data is probably to run a graphical
analysis in Minitab. This was also covered in Unit 6.

1. Select Stat > Basic Statistics > Graphical Summary.
2. Select the column for the first set of data into the Variables box.
3. Make sure the confidence level is set to 95.0.
4. Click OK.


Minitab returns the following chart and data.

Summary Report for Original Packaging

Anderson-Darling Normality Test
  A-Squared       0.36
  P-Value         0.405

Mean              5.1244
StDev             0.7449
Variance          0.5549
Skewness         -0.18142
Kurtosis         -1.03701
N                 15

Minimum           3.7832
1st Quartile      4.3648
Median            5.1199
3rd Quartile      5.7692
Maximum           6.3274

95% Confidence Interval for Mean      4.7119   5.5369
95% Confidence Interval for Median    4.4348   5.7557
95% Confidence Interval for StDev     0.5454   1.1748

The p-value for the Anderson-Darling normality test is 0.405. Remember, when testing for
normality, the null hypothesis is that there is no difference between the data and the
normal curve. Since the p-value is above our alpha level (0.05), we fail to reject the null
hypothesis and treat the data as normal.
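The 95% confidence interval for the mean in the Original Packaging summary can be reproduced from the reported Mean, StDev, and N. The sketch below assumes the usual t-based interval (mean plus or minus the critical value times the standard error); the critical value for 14 degrees of freedom is read from a standard t table rather than computed.

```python
import math

# Values reported in the summary for Original Packaging.
mean = 5.1244
st_dev = 0.7449
n = 15

# Two-sided 95% critical value for n - 1 = 14 degrees of freedom,
# taken from a standard t table.
T_CRITICAL = 2.1448

std_error = st_dev / math.sqrt(n)
margin = T_CRITICAL * std_error
ci_low, ci_high = mean - margin, mean + margin

print(f"variance = {st_dev ** 2:.4f}")
print(f"95% CI for mean = ({ci_low:.4f}, {ci_high:.4f})")
```

The result agrees with the interval of 4.7119 to 5.5369 and the variance of 0.5549 shown in the summary.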

The graphs generated for the other two sets of data are included below. You can see that
the p-values for each (0.625 and 0.444) are also above 0.05, so we again fail to reject the
null hypothesis and validate this assumption. Yes, the data for each group approximates
the normal distribution.


Summary Report for Padded Envelope

Anderson-Darling Normality Test
  A-Squared       0.27
  P-Value         0.625

Mean              5.6755
StDev             1.1386
Variance          1.2965
Skewness         -0.374726
Kurtosis          0.236287
N                 15

Minimum           3.4066
1st Quartile      5.1728
Median            5.7797
3rd Quartile      6.4068
Maximum           7.5639

95% Confidence Interval for Mean      5.0449   6.3061
95% Confidence Interval for Median    5.1809   6.2630
95% Confidence Interval for StDev     0.8336   1.7958

Summary Report for Packing Peanuts

Anderson-Darling Normality Test
  A-Squared       0.34
  P-Value         0.444

Mean              7.0374
StDev             1.4858
Variance          2.2075
Skewness          0.352309
Kurtosis         -0.011739
N                 15

Minimum           4.6152
1st Quartile      6.2027
Median            6.6021
3rd Quartile      8.0132
Maximum           10.0000

95% Confidence Interval for Mean      6.2146   7.8602
95% Confidence Interval for Median    6.2459   8.0083
95% Confidence Interval for StDev     1.0878   2.3432


Is there a homogeneity of variances? Are the variances between groups relatively equal?

Minitab includes an option for testing for equal variances that provides two different
p-values. One is the p-value for the Multiple Comparisons test and one is for Levene’s test.
Levene’s test is commonly used to test for equal variances when data within each group
is normal. This test was not previously covered in Unit 6.

1. Select Stat > ANOVA > Test for Equal Variances.
2. In the drop down box, select “Response data are in a separate column for each factor level.”
3. Select all the data columns into the Responses box.
4. Click OK.

Minitab returns an interval graph of all the columns of data along with two statistical values.

[Figure: Test for Equal Variances: Original Pac, Padded Envel, Packing Pean. Multiple comparison intervals for the standard deviation, α = 0.05, for Original Packaging, Padded Envelope, and Packing Peanuts.]

Multiple Comparisons P-Value    0.109
Levene’s Test P-Value           0.155

If intervals do not overlap, the corresponding stdevs are significantly different.

For the test for equal variance, you are usually concerned with the p-value for Levene’s
Test. In this case, it is above 0.05 (Minitab’s default alpha setting), so you fail to reject
the null hypothesis, which is that there is no difference in the variance among the groups. In
this case, even a look at the interval graph helps you make this determination. The ranges
are certainly different, but the intervals are not extremely different in length.
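For readers without Minitab, the Levene's test value can be approximated with the Python standard library. This sketch assumes the median-centered form of Levene's test (absolute deviations from each group median, followed by a one-way F ratio on those deviations). The p-value uses a closed-form upper tail of the F distribution that holds only when there are exactly three groups, which happens to fit this data set.

```python
import statistics

groups = {
    "Original Packaging": [5, 5, 4, 6, 4, 6, 6, 6, 5, 4, 6, 6, 5, 4, 5],
    "Padded Envelope": [5, 5, 3, 6, 6, 8, 7, 6, 7, 5, 6, 5, 6, 4, 6],
    "Packing Peanuts": [6, 8, 8, 6, 5, 5, 8, 7, 8, 6, 6, 10, 7, 9, 6],
}


def levene_median(samples):
    """Levene statistic using absolute deviations from each group median."""
    devs = []
    for sample in samples:
        median = statistics.median(sample)
        devs.append([abs(x - median) for x in sample])
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total
    means = [statistics.mean(d) for d in devs]
    between = sum(len(d) * (m - grand_mean) ** 2 for d, m in zip(devs, means))
    within = sum(sum((x - m) ** 2 for x in d) for d, m in zip(devs, means))
    df_between, df_within = k - 1, n_total - k
    w = (between / df_between) / (within / df_within)
    return w, df_between, df_within


w_stat, df1, df2 = levene_median(list(groups.values()))

# Closed-form upper tail of the F distribution; valid only for df1 == 2.
if df1 != 2:
    raise ValueError("this shortcut for the p-value needs exactly three groups")
p_value = (df2 / (df2 + 2 * w_stat)) ** (df2 / 2)

print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
```

With the table data this gives a p-value of about 0.155, in line with the Levene's Test value in the Minitab output, so the conclusion is the same: fail to reject the hypothesis of equal variances.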


To consider a contrasting example, copy the following data table into Minitab.

A B
4.01 10.00
4.91 10.06
6.38 10.04
7.98 10.00
6.00 10.06
10.00 9.97
6.50 9.96
4.14 10.01
9.00 10.05
6.45 10.08

Run the test for equal variance under the ANOVA menu using just the two columns of data
from the above table.


[Figure: Test for Equal Variances: A, B. Multiple comparison intervals for the standard deviation, α = 0.05.]

Multiple Comparisons P-Value    0.000
Levene’s Test P-Value           0.002

If intervals do not overlap, the corresponding stdevs are significantly different.

You can quickly see that these two sets of data do not have equal variance. First, the
intervals make it fairly obvious. The variance in set B is much smaller. The p-value for
Levene’s Test is much smaller than the alpha value, which means you reject the null
hypothesis and accept the alternative hypothesis that there is a difference in the variance.

You can change the alpha value for the tests by selecting Options on the Test for Equal
Variances dialogue box.


You can change the type of graph displayed by selecting Graphs from the Test for Equal
Variances dialogue box.


Here are the original three sets of data graphed via box plot.

[Figure: Boxplot of Original Pac, Padded Envel, Packing Pean, drawn horizontally for Original Packaging, Padded Envelope, and Packing Peanuts. Axis: Data.]

Now that all the assumptions are verified, you can run the 1-Way ANOVA test.

Select Stat > ANOVA > One-Way.

Select the drop down option that response data are in separate columns.

Select all three of the data columns (the data for the original packing, padded envelopes,
and packing peanuts) into the Responses box.


You can click Options to set the confidence interval or ensure it is still set at 95%, if desired.
Then click OK.

Click OK on the main dialogue box. Minitab will perform the 1-Way ANOVA calculations.
Depending on how much data is in columns when you run this test, it can take a few
seconds.

Minitab defaults to returning two things: a graphical analysis of the data and the ANOVA
results in the session window.


[Figure: Interval Plot of Original Pac, Padded Envel, ... 95% CI for the Mean, for Original Packaging, Padded Envelope, and Packing Peanuts. Y-axis: Data.]

The pooled standard deviation is used to calculate the intervals.


One-way ANOVA: Original Packaging, Padded Envelope, Packing Peanuts

The data from the session window is included above. Highlighted are the hypothesis and the
p-value. The null hypothesis is that all the means are equal. The alternative is that at least
one of the means is statistically different.


The p-value is less than the alpha value of 0.05, which means you reject the null hypothesis
and accept the alternative hypothesis that at least one of the means is different.
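As a cross-check on the decision above, the F ratio can be computed by hand from the data table using only the Python standard library. This is a sketch of the standard one-way ANOVA arithmetic, not Minitab's implementation. The p-value relies on a closed-form upper tail of the F distribution that is valid only when there are exactly three groups.

```python
import statistics

groups = {
    "Original Packaging": [5, 5, 4, 6, 4, 6, 6, 6, 5, 4, 6, 6, 5, 4, 5],
    "Padded Envelope": [5, 5, 3, 6, 6, 8, 7, 6, 7, 5, 6, 5, 6, 4, 6],
    "Packing Peanuts": [6, 8, 8, 6, 5, 5, 8, 7, 8, 6, 6, 10, 7, 9, 6],
}


def one_way_anova(samples):
    """Return the F ratio and its degrees of freedom for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [statistics.mean(s) for s in samples]
    # Between sample variation: group means around the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within sample variation: observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in s) for s, m in zip(samples, means))
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_stat, df1, df2 = one_way_anova(list(groups.values()))

# Closed-form upper tail of the F distribution; valid only for df1 == 2.
if df1 != 2:
    raise ValueError("this shortcut for the p-value needs exactly three groups")
p_value = (df2 / (df2 + 2 * f_stat)) ** (df2 / 2)

print(f"F = {f_stat:.2f}, p = {p_value:.4f}, reject H0: {p_value < 0.05}")
```

The calculation gives an F ratio of roughly 9.5 with a p-value well below 0.05, which is consistent with rejecting the null hypothesis.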

In business terms, the Six Sigma team now knows that “one of these things is not like the
other.” That might prompt them to work on additional analysis or indicate that there might
be an issue with one or more processes. If the team was comparing mean production times
between various teams on the same process, they might assume that the teams would
complete work in roughly the same time. If the 1-Way ANOVA indicates that isn’t true, the
Six Sigma team can then ask itself: What is it that makes the outcome statistically different?
Is there a team doing it better than everyone else? If so, can the Six Sigma team implement
solutions from that group across the other groups? Is there a team performing worse than
everyone else? If so, what is going on with that team and how can an improvement be
made?

If you can’t validate the sixth assumption about equal variances between your subgroups,
you can still run a 1-Way ANOVA test in Minitab. Minitab uses a different statistical
calculation, known as Welch’s test, to provide a p-value in such a case.

Use the data we previously found to have unequal variances to run this test in Minitab.

A B
4.01 10.00
4.91 10.06
6.38 10.04
7.98 10.00
6.00 10.06
10.00 9.97
6.50 9.96
4.14 10.01
9.00 10.05
6.45 10.08

Select Stat > ANOVA > One-Way.

Select the drop down option that response data are in separate columns.


Select the data columns for A and B into the Responses box.

Select Options.

Uncheck the box for assume equal variances.

Click OK. Click OK again.


As in the previous example, Minitab generates both a graphical analysis and the test data in
the session window.

[Figure: Interval Plot of A, B. 95% CI for the Mean. Y-axis: Data.]

Individual standard deviations are used to calculate the intervals.

One-way ANOVA: A, B


You can see from the Minitab results that equal variance was not assumed for this data. You
also have a p-value that is effectively 0, which means you reject the null hypothesis and
accept the alternative that one of the means is statistically different, a conclusion we had
already reached just by looking at the graphical interpretation of this data. As stated
previously, data doesn’t always look so different when viewed graphically, which often
necessitates statistical analysis.
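The unequal-variance calculation for columns A and B can be sketched by hand. With only two groups, Welch's F statistic equals the square of Welch's t statistic, so the example below computes that t statistic and the Welch-Satterthwaite degrees of freedom with the Python standard library. It is not Minitab's implementation, and instead of a p-value it compares against a critical value read from a standard t table.

```python
import math
import statistics

a = [4.01, 4.91, 6.38, 7.98, 6.00, 10.00, 6.50, 4.14, 9.00, 6.45]
b = [10.00, 10.06, 10.04, 10.00, 10.06, 9.97, 9.96, 10.01, 10.05, 10.08]

# Squared standard error of each mean, using each group's own variance
# (no pooling, because equal variances are not assumed).
se2_a = statistics.variance(a) / len(a)
se2_b = statistics.variance(b) / len(b)

t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)

# Welch-Satterthwaite approximation for the degrees of freedom.
df = (se2_a + se2_b) ** 2 / (
    se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
)

# With two groups, Welch's F statistic is the square of Welch's t.
f_welch = t_stat ** 2

# Two-sided critical value for alpha = 0.05 and 9 degrees of freedom,
# taken from a standard t table (df comes out very close to 9 here).
T_CRITICAL = 2.262

print(f"t = {t_stat:.2f}, df = {df:.2f}, Welch F = {f_welch:.1f}")
print("means differ:", abs(t_stat) > T_CRITICAL)
```

The t statistic is far beyond the critical value, which matches the very small p-value described above.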

At the end of the last unit, we noted that you could use the help menu in Minitab to look up
information about all of the functions and read in-depth instructions and summaries on
various types of tests. You can also use the Minitab Assistant to help you choose a
hypothesis test – including a 1-Way ANOVA – and run that test.

Select Assistant > Hypothesis Tests.


You can have Minitab help you choose a test to compare one sample with a target, two
samples with each other, or more than two samples. Minitab will apply the same
information included in this chapter and the previous chapter on hypothesis testing to help
you make this choice, but this is a good tool to use if you get confused, don’t remember
which test to choose, or think you might need to use a less common test that wasn’t
covered in detail in this book.

If you click “Help me choose” under the option for comparing more than two samples,
Minitab presents you with a second diagram.


If you are comparing the means for more than two samples of continuous data, you would
use the 1-Way ANOVA described in this chapter. If you are comparing standard deviations,
however, you use a different test. If you are comparing attribute data, then you would use
the chi-square tests. You can hover over any of the tests and click, and Minitab will open the
dialogue box for performing that test.


Minitab offers several assistant wizards. For example, below is a screenshot of the graphical
analysis assistant menu.

Here, Minitab helps you choose a graph or visual analysis that best matches your data and
purposes. You can use the assistant tool for help with regression analysis, measurement
system analysis, control charts, and design of experiments, which will be covered in the next
chapter.
