OT - Stat - LL Ch06 ANOVA
The F Distribution
The probability distribution used in this chapter is the F distribution. It was named to honor Sir Ronald
Fisher, one of the founders of modern-day statistics. This probability distribution is used as the
distribution of the test statistic for several situations. It is used to test whether two samples are from
populations having equal variances, and it is also applied when we want to compare several population
means simultaneously. The simultaneous comparison of several population means is called analysis of
variance (ANOVA). In both of these situations, the populations must follow a normal distribution, and
the data must be at least interval-scale.
The characteristics of the F distribution
i. There is a "family" of F distributions. A particular member of the family is determined by two
parameters: the degrees of freedom in the numerator and the degrees of freedom in the denominator.
ii. The F distribution is continuous. This means that it can assume an infinite number of values between
zero and positive infinity.
iii. The F distribution cannot be negative. The smallest value F can assume is 0.
iv. It is positively skewed. The long tail of the distribution is to the right-hand side. As the number of
degrees of freedom increases in both the numerator and the denominator, the distribution approaches a
normal distribution.
v. It is asymptotic. As the values of X increase, the F curve approaches the X-axis but never touches it.
This is similar to the behavior of the normal distribution.
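The characteristics listed above can be checked numerically. The following is a minimal Python sketch, using only the standard library, that evaluates the F density from its standard closed form. The function names `f_pdf` and `area`, the choice of 6 and 7 degrees of freedom, and the crude midpoint integration are illustrative assumptions and are not part of the original notes.

```python
import math

def f_pdf(x, df1, df2):
    """Density of the F distribution with df1 (numerator) and df2 (denominator) degrees of freedom."""
    if x <= 0:
        return 0.0  # F cannot be negative (characteristic iii)
    # log of the Beta function B(df1/2, df2/2), computed via log-gamma for stability
    log_beta = (math.lgamma(df1 / 2) + math.lgamma(df2 / 2)
                - math.lgamma((df1 + df2) / 2))
    log_pdf = ((df1 / 2) * math.log(df1 / df2)
               + (df1 / 2 - 1) * math.log(x)
               - ((df1 + df2) / 2) * math.log(1 + df1 * x / df2)
               - log_beta)
    return math.exp(log_pdf)

def area(lo, hi, df1, df2, steps=100_000):
    """Midpoint-rule approximation of the area under the F curve on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f_pdf(lo + (i + 0.5) * width, df1, df2) for i in range(steps))

df1, df2 = 6, 7  # one member of the "family" (illustrative choice)

total_area = area(0.0, 200.0, df1, df2)          # should be close to 1
grid = [i / 100 for i in range(1, 1001)]          # x = 0.01, 0.02, ..., 10.00
peak = max(grid, key=lambda x: f_pdf(x, df1, df2))  # location of the highest point
far_tail = f_pdf(50.0, df1, df2)                  # tiny but still positive (asymptotic)

print(f"area under the curve ~ {total_area:.4f}")
print(f"peak located near x = {peak:.2f}")
print(f"density at x = 50: {far_tail:.2e}")
```

The peak sitting well to the left of the long right tail is what "positively skewed" means here, and the small but nonzero density far out in the tail illustrates the asymptotic behavior.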
Notation
Let k represent the number of groups. Then we'll set things up as follows:
Let µ1, µ2, …, µk represent the true population means of the response variable for the subjects in each
group. As usual, these population parameters are what we're really interested in, but we don't know
their values.
We call each observation in the sample Xij , where i is a number from 1 to k that identifies the group
number, and j identifies the individual within that group. (For example, X12 represents the response
variable value of the second individual in the first group.)
We can calculate the sample means for each group, which we'll call $\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots, \bar{X}_k$. We can use these
known sample means as estimates of the corresponding unknown population means.
$\bar{\bar{X}}$ represents the overall sample mean of all the data from all groups combined.
N is the total number of observations, and ni is the number of observations in the ith group. (So
n1 + n2 + … + nk = N.)
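The notation can be made concrete with a short Python sketch. The data values below are made up purely for illustration (they do not come from the notes), and the variable names are hypothetical; the point is only to show how k, ni, N, the group means, and the overall mean relate to one another.

```python
# Hypothetical data: k = 3 groups with unequal sizes (values are invented).
groups = [
    [4.0, 6.0, 5.0],        # group 1: X11, X12, X13
    [7.0, 9.0],             # group 2: X21, X22
    [3.0, 5.0, 4.0, 4.0],   # group 3: X31, X32, X33, X34
]

k = len(groups)                                  # number of groups
n = [len(g) for g in groups]                     # n1, n2, ..., nk
N = sum(n)                                       # total number of observations
group_means = [sum(g) / len(g) for g in groups]  # sample mean of each group
grand_mean = sum(sum(g) for g in groups) / N     # overall mean of all data combined

# X12 is the second individual in the first group.
# Python indices start at 0, so X12 is groups[0][1].
x_12 = groups[0][1]

print("k =", k, " n =", n, " N =", N)
print("group means:", group_means)
print("overall mean:", round(grand_mean, 4))
print("X12 =", x_12)
```

Note that the overall mean is the mean of all N observations, which equals the average of the group means only when every group has the same size.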
Example: Anbessa city bus offers transport services from Adama city to Bole airport in Addis. Ato
Alemu, the president of Anbessa city bus, is considering two routes. One is via Selam bus and the other is via
Higer bus. He wants to study the time it takes to drive to the airport using each route and then compare
the results. He collected the following sample data, which is reported in minutes. Using the 0.10
significance level, is there a difference in the variation in the driving times for the two routes?
Selam bus 52 67 56 45 70 54 64
Higer bus 59 60 61 51 56 63 57 65
Solution:
The mean driving times along the two routes are nearly the same. The mean time is 58.29 minutes for
the Selam bus and 59 minutes along the Higer bus route. However, in evaluating travel times, Mr.
Alemu is also concerned about the variation in the travel times. The first step is to compute the two
sample variances. We'll use the following formula to compute the sample mean & standard deviations.
To obtain the sample variances, we square the standard deviations.
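SELAM BUS
$\bar{X} = \frac{\sum x_i}{n} = \frac{408}{7} = 58.29, \qquad s_1 = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{485.43}{7 - 1}} = 8.9947$
HIGER BUS
$\bar{X} = \frac{\sum x_i}{n} = \frac{472}{8} = 59, \qquad s_2 = \sqrt{\frac{\sum (x_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{134}{8 - 1}} = 4.3753$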
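The sample means and standard deviations for the two routes can be reproduced with Python's standard `statistics` module. This is only a checking sketch; the variable names are illustrative, and `statistics.stdev` uses the n − 1 divisor, which matches the sample formula used in the solution.

```python
import statistics

# Driving times in minutes, as given in the example.
selam = [52, 67, 56, 45, 70, 54, 64]
higer = [59, 60, 61, 51, 56, 63, 57, 65]

mean_selam = statistics.mean(selam)
mean_higer = statistics.mean(higer)

# Sample standard deviations (divisor n - 1).
s1 = statistics.stdev(selam)
s2 = statistics.stdev(higer)

print(f"Selam bus: mean = {mean_selam:.2f}, s1 = {s1:.4f}")
print(f"Higer bus: mean = {mean_higer:.2f}, s2 = {s2:.4f}")
```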
There is more variation, as measured by the standard deviation, in the Selam bus route than in the Higer
bus route. This is somewhat consistent with his knowledge of the two routes; the Selam bus route
contains more stoplights, whereas Higer bus is a limited-access interstate highway. However, the Higer
bus route is several miles longer. It is important that the service offered be both timely and consistent, so
he decides to conduct a statistical test to determine whether there really is a difference in the variation of
the two routes. The usual five-step hypothesis-testing procedure will be employed.
Step 1: We begin by stating the null hypothesis and the alternative hypothesis. The test is two-tailed
because we are looking for a difference in the variation of the two routes. We are not trying to show that
one route has more variation than the other.
H0: $\sigma_1^2 = \sigma_2^2$ and H1: $\sigma_1^2 \neq \sigma_2^2$.
Step 2: We selected the 0.10 significance level.
Step 3: The appropriate test statistic follows the F distribution.
Step 4: The critical value is obtained from the F-distribution table. Because we are conducting a two-tailed
test, the tabled significance level is 0.05, found by α/2 = 0.10/2 = 0.05. There are n1 - 1 = 7 - 1 = 6
degrees of freedom in the numerator, and n2 -1 = 8 - 1 = 7 degrees of freedom in the denominator. To
find the critical value, move horizontally across the top portion of the F table for the 0.05 significance
level to 6 degrees of freedom in the numerator. Then move down that column to the critical value
opposite 7 degrees of freedom in the denominator. The critical value is 3.87. Thus, the decision rule is:
Reject the null hypothesis if the ratio of the sample variances exceeds 3.87.
Step 5: The final step is to take the ratio of the two sample variances, determine the value of the test
statistic, and make a decision regarding the null hypothesis. Note that the test statistic uses the
sample variances, but we calculated the sample standard deviations. We need to square the standard
deviations to determine the variances.
$F = \frac{s_1^2}{s_2^2} = \frac{(8.9947)^2}{(4.3753)^2} = 4.23$
The decision is to reject the null hypothesis, because the computed F value (4.23) is larger than the
critical value (3.87); whenever Fcal > Fcritical, we reject H0. We conclude that there is a difference in the
variation of the travel times along the two routes.
As noted, the usual practice is to determine the F ratio by putting the larger of the two sample variances
in the numerator. This will force the F ratio to be at least 1. This allows us to always use the right tail of
the F distribution, thus avoiding the need for more extensive F tables.
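The variance-ratio test for the two bus routes can be sketched in Python as follows. The critical value 3.87 is copied from the F table lookup in Step 4 and is valid only for 6 and 7 degrees of freedom at the 0.05 tabled level; the variable names are illustrative assumptions.

```python
import statistics

selam = [52, 67, 56, 45, 70, 54, 64]
higer = [59, 60, 61, 51, 56, 63, 57, 65]

# Sample variances (divisor n - 1).
var_selam = statistics.variance(selam)
var_higer = statistics.variance(higer)

# Put the larger sample variance in the numerator so that F >= 1
# and only the right tail of the F distribution is needed.
if var_selam >= var_higer:
    f_ratio = var_selam / var_higer
    df_num, df_den = len(selam) - 1, len(higer) - 1
else:
    f_ratio = var_higer / var_selam
    df_num, df_den = len(higer) - 1, len(selam) - 1

# Critical value taken from the text (alpha/2 = 0.05, df = 6 and 7).
F_CRITICAL = 3.87
reject_h0 = f_ratio > F_CRITICAL

print(f"F = {f_ratio:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

For other sample sizes or significance levels, the hard-coded critical value would have to be replaced by the matching table entry.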
Sums of Squares
The most basic quantities that ANOVA uses to describe different kinds of variability are the sums of
squares, abbreviated SS. One-way ANOVA involves two sums of squares:
The group sum of squares, SSB, measures the variability between the groups by looking at how the
sample means for each group, $\bar{X}_i$, vary around $\bar{\bar{X}}$, the overall mean. Its formula is
$SSB = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{\bar{X}})^2$
The error sum of squares, SSW, measures the variability within the groups by looking at how each
Xij value varies around $\bar{X}_i$, the sample mean for its group. Its formula is
$SSW = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$
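The two sums of squares can be computed directly from their definitions. The sketch below uses invented data with unequal group sizes, and the function name `sums_of_squares` is hypothetical; it is meant only to show how the group sizes weight the between-group term and how the within-group term accumulates over every observation.

```python
def sums_of_squares(groups):
    """Return (SSB, SSW) for a list of groups, each a list of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ssb = 0.0  # between-groups: group means around the overall mean
    ssw = 0.0  # within-groups: observations around their own group mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ssb += len(g) * (group_mean - grand_mean) ** 2
        ssw += sum((x - group_mean) ** 2 for x in g)
    return ssb, ssw

# Hypothetical data (values are invented for illustration).
groups = [
    [4.0, 6.0, 5.0],
    [7.0, 9.0],
    [3.0, 5.0, 4.0, 4.0],
]

ssb, ssw = sums_of_squares(groups)
print(f"SSB = {ssb:.4f}")
print(f"SSW = {ssw:.4f}")
```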
Assumptions of ANOVA:
The one-way ANOVA F test makes four assumptions:
i. The data come from a random sample or randomized experiment.
   a. In an observational study, the subjects in each group should be a random sample from
   that group.
   b. In an experiment, the subjects should be randomly assigned to the groups.
ii. The data for each group should be independent. For example, we wouldn't want to reuse the
same subject for measurements in more than one group.
iii. For each group, the population distribution of the response variable is normal. To
check this assumption, there are a couple of things we should look for:
   a. The shape of the data should look at least roughly normal.
   b. There should be no outliers.
iv. The population distribution of the response variable has the same standard deviation for each
group. Of course, we don't know the population standard deviations, but we can still check this
assumption by comparing the sample standard deviations for each group.
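Comparing the sample standard deviations, as the last assumption suggests, takes only a few lines of Python. The data are the memory scores from the caffeine example later in this chapter. The cutoff of 2 for the ratio of the largest to the smallest standard deviation is a commonly quoted rule of thumb and is an assumption of this sketch, not something stated in these notes.

```python
import statistics

# Memory scores from the caffeine example in this chapter.
groups = {
    "Group 1 (34 mg)": [7, 8, 10, 12, 7],
    "Group 2 (100 mg)": [11, 14, 14, 12, 10],
    "Group 3 (160 mg)": [14, 12, 10, 16, 13],
}

# Sample standard deviation (divisor n - 1) for each group.
std_devs = {name: statistics.stdev(values) for name, values in groups.items()}
for name, s in std_devs.items():
    print(f"{name}: s = {s:.3f}")

# Rule-of-thumb check (an assumption, not from the notes):
# largest s divided by smallest s should be below about 2.
ratio = max(std_devs.values()) / min(std_devs.values())
RULE_OF_THUMB_LIMIT = 2.0
looks_equal = ratio < RULE_OF_THUMB_LIMIT

print(f"largest / smallest = {ratio:.2f}")
print("Equal-spread assumption looks reasonable" if looks_equal
      else "Spreads differ noticeably; treat the F test with caution")
```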
Hypotheses
The null hypothesis for the one-way ANOVA F test is that the factor has no effect, and the alternative is
that it does. In terms of parameters, we can write these hypotheses as follows:
H0: µ1, µ2, …, µk are all equal.
Ha: µ1, µ2, …, µk are not all equal.
Test Statistic
If we're testing whether or not µ1, µ2, …, µk are all equal, then it seems reasonable to look at our
estimates of those quantities and see if those are all "close enough" to each other. So we want to look at
whether $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ are all "close enough" to each other.
We measure the closeness of the group means by taking the ratio of MSB, the variability between groups, to MSW, the
variability within groups:
$F = \frac{MSB}{MSW}$
When MSB is large compared to MSW, F will be large. So larger F values represent more evidence that
there is a difference between the group population means; in other words, more evidence against H0 and
in favor of Ha.
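The notes use MSB and MSW without writing out how they are obtained from the sums of squares. Under the standard one-way ANOVA definitions, which agree with the k − 1 and N − k degrees of freedom used in the caffeine example in this chapter, each mean square is a sum of squares divided by its degrees of freedom. The symbols $df_B$ and $df_W$ follow the dfB and dfW notation of the notes.

```latex
\[
MSB = \frac{SSB}{df_B} = \frac{SSB}{k - 1},
\qquad
MSW = \frac{SSW}{df_W} = \frac{SSW}{N - k},
\qquad
F = \frac{MSB}{MSW}
\]
```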
An F distribution has the following properties:
i. It is skewed to the right.
ii. Things with an F distribution can't be negative, and in the ANOVA F test only large values of F count
as evidence against H0, so we use only the right tail. (We never need to double any tail probabilities
in the ANOVA F test.)
iii. The peak of the F distribution is usually somewhere around 1, or a little less.
iv. The exact shape of the F distribution is determined by two different degrees of freedom: the
numerator degrees of freedom, or df1, and the denominator degrees of freedom, or df2.
If H0 is true, our test statistic, F, has an F distribution with df1 = dfB = k − 1 and df2 = dfW = N − k. This is easy to
remember, since the formula for F is
$F = \frac{MSB}{MSW}$
and the numerator and denominator degrees of freedom are just the degrees of freedom associated with
the quantities in the numerator and denominator of F.
Remember that we said the larger values of F are the values that are more supportive of Ha.
Decision: We make a decision the same way we always do for any hypothesis test: by rejecting H0 if
the calculated F value is greater than the critical F value, and failing to reject H0 if the calculated F
value is less than the critical F value. Remember that the hypotheses we're testing are
H0: µ1, µ2, …, µk are all equal.
Ha: µ1, µ2, …, µk are not all equal.
So let's think about what our decision really represents.
If we reject H0, then we're concluding that at least some of the group population means are different.
If we fail to reject H0, then we're concluding that it's reasonable that all the group population means
are the same.
Example: Suppose a statistics class wanted to test whether or not the amount of caffeine consumed
affected memory. The variable caffeine is called a factor, and the students wanted to study how three levels
of that factor affected the response variable, memory. Fifteen students were recruited to take part in the
study. The participants were divided into three groups of five and randomly assigned to one of the
following drinks:
A. Coca-Cola Classic (34 mg caffeine)
B. McDonald's coffee (100 mg caffeine)
C. Jolt Energy (160 mg caffeine).
After drinking the caffeinated beverage, the participants were given a memory test (words remembered
from a list). The results are given in the table below.
Observation    Group 1 (34 mg)    Group 2 (100 mg)    Group 3 (160 mg)
                     Xi                  Xi                  Xi
     1                7                  11                  14
     2                8                  14                  12
     3               10                  14                  10
     4               12                  12                  16
     5                7                  10                  13
Solution:
Step 1: Set the null hypothesis and the alternative hypothesis.
For an ANOVA, the null hypothesis is that the population means among the groups are the same. In this
case, H0: µ1 = µ2 = µ3, where µ1 is the population mean number of words recalled after people drink
Coca-Cola, and similarly for µ2 and µ3. The alternative or research hypothesis is that there is some
inequality among the three means. Notice that there is a lot of variation in the number of words
remembered by the participants. We break that variation into two components:
(1) variation in the number of words recalled among the three groups, also called between-groups
variation.
(2) variation in the number of words among participants within each group, also called within-groups
variation.
To measure each of these components, we'll compute two different variances, the mean
square between groups (MSB) and the mean square within groups (MSW). The basic idea in gathering evidence to
reject the null hypothesis is to show that the between-groups variation is substantially larger than the
within-groups variation, and we do that by forming the ratio, which we call F:
$F = \frac{MSB}{MSW}$
Step 2: We selected the 0.05 significance level.
Step 3: The appropriate test statistic follows the F distribution.
Step 4: The critical value is obtained from the F-distribution table at the α = 0.05 significance level.
The F distribution has k − 1 and N − k degrees of freedom: k − 1 = 3 − 1 = 2 degrees of freedom for the
numerator (df1), and N − k = 15 − 3 = 12 degrees of freedom for the denominator (df2). The critical value is
3.89. Thus, the decision rule is: Reject the null hypothesis if the computed F ratio (MSB/MSW) exceeds
3.89.
Step 5: Since Fcal = 5.78 > F(2, 12), 0.05 = 3.89, the null hypothesis is rejected. Hence, we
conclude that the amount of caffeine consumed has a significant effect on the
mean memory score.
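The value Fcal = 5.78 is stated without its arithmetic. A worked version, using the group totals (44, 61, 65), the group means (8.8, 12.2, 13), and the within-group sums of squared deviations (18.8, 12.8, 20) from the computation table for this example, and assuming the standard definitions MSB = SSB/(k − 1) and MSW = SSW/(N − k), runs as follows:

```latex
\begin{align*}
\bar{\bar{X}} &= \frac{44 + 61 + 65}{15} = \frac{170}{15} \approx 11.33 \\
SSB &= 5(8.8 - 11.33)^2 + 5(12.2 - 11.33)^2 + 5(13 - 11.33)^2 \approx 49.73 \\
SSW &= 18.8 + 12.8 + 20 = 51.6 \\
MSB &= \frac{SSB}{k - 1} = \frac{49.73}{2} \approx 24.87 \\
MSW &= \frac{SSW}{N - k} = \frac{51.6}{12} = 4.30 \\
F &= \frac{MSB}{MSW} = \frac{24.87}{4.30} \approx 5.78
\end{align*}
```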
Computation table (X̄ denotes the mean of the group's own observations):
Observation        Group 1                         Group 2                         Group 3
               Xi   (Xi − X̄)  (Xi − X̄)²       Xi   (Xi − X̄)  (Xi − X̄)²       Xi   (Xi − X̄)  (Xi − X̄)²
     1          7     -1.8       3.24          11     -1.2       1.44          14      1         1
     2          8     -0.8       0.64          14      1.8       3.24          12     -1         1
     3         10      1.2       1.44          14      1.8       3.24          10     -3         9
     4         12      3.2      10.24          12     -0.2       0.04          16      3         9
     5          7     -1.8       3.24          10     -2.2       4.84          13      0         0
ni                   n1 = 5                          n2 = 5                          n3 = 5
ΣXi                    44                              61                              65
X̄ = ΣXi / n           8.8                            12.2                             13
Σ(Xi − X̄)²           18.8                            12.8                             20
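The table entries and the resulting F ratio can be reproduced with a short Python sketch. It assumes the standard one-way ANOVA definitions MSB = SSB/(k − 1) and MSW = SSW/(N − k); the critical value 3.89 is the one read from the F table in Step 4, and the variable names are illustrative.

```python
# Memory scores from the caffeine example.
groups = {
    "Group 1 (34 mg)": [7, 8, 10, 12, 7],
    "Group 2 (100 mg)": [11, 14, 14, 12, 10],
    "Group 3 (160 mg)": [14, 12, 10, 16, 13],
}

k = len(groups)
N = sum(len(values) for values in groups.values())
grand_mean = sum(sum(values) for values in groups.values()) / N

ssb = 0.0
ssw = 0.0
for name, values in groups.items():
    group_mean = sum(values) / len(values)
    # Sum of squared deviations from the group's own mean (last row of the table).
    within = sum((x - group_mean) ** 2 for x in values)
    print(f"{name}: total = {sum(values)}, mean = {group_mean:.1f}, "
          f"sum of squared deviations = {within:.1f}")
    ssb += len(values) * (group_mean - grand_mean) ** 2
    ssw += within

msb = ssb / (k - 1)   # mean square between groups
msw = ssw / (N - k)   # mean square within groups
f_cal = msb / msw

F_CRITICAL = 3.89     # from the text: alpha = 0.05, df = (2, 12)
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}")
print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_cal:.2f}")
print("Reject H0" if f_cal > F_CRITICAL else "Fail to reject H0")
```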
Problem 1: To test whether or not all professors teach the same material in different sections of the
introductory statistics class, 4 sections of the same course were selected and a common test was
administered to 5 students selected at random from each section. The scores for each student from each
section were noted and are given below. At α = 0.05, test for any differences in learning, as reflected
in the average scores for each section.
Student    Section 1      Section 2      Section 3      Section 4
           Scores (X1)    Scores (X2)    Scores (X3)    Scores (X4)
   1            8             12             10             12
   2           10             12             13             15
   3           12             10             11             13
   4           10              8             12             10
   5            5             13             14             10
Total          45             55             60             60
Problem 2.
Awash insurance company wants to test whether three of its salesmen, A, B, and C, in a given territory
make similar number of appointments with prospective customers during a given period of time. A
record of previous four months showed the following results for the number of appointments made by
each salesman for each month.
               Salesman
Month      (A)    (B)    (C)
  1          8      6     14
  2          9      8     12
  3         11     10     18
  4         12      4      8
Totals      40     28     52
Do you think that, at the 95% confidence level (α = 0.05), there is a significant difference in the average number of
appointments made by the three salesmen per month?
Problem 3
A department store chain is considering building a new store at one of the three locations. An important
factor in making such a decision is the household income in these areas. If the average income per
household is similar then they can pick any one of these three locations. A random survey of various
households in each location is undertaken and their annual combined income is recorded. This data is
tabulated as follows:
Annual Household Income ($1,000)
         Area (1)    Area (2)    Area (3)
            70         100          60
            72         110          65
            75         108          57
            80         112          84
            83         113          84
             -         120          70
             -         100           -
Total      380         763         420
Test if the average income per household in all these localities can be considered as the same at α =
0.01.