Chapter 3 Analysis of Variance (ANOVA) - 240624 - 105926

Uploaded by

Ashley Lau
ANALYSIS OF VARIANCE

(ANOVA)
ANOVA tests whether the means of three or more populations are
all equal.

Analysis of Variance (ANOVA)
• The F-distribution is used to test the equality of more than two means
using a technique called ANOVA (analysis of variance).
• ANOVA compares the means of different groups by analyzing the
variation between and within those groups.
• It consists of statistical models and related procedures in which the
observed variance of a variable is partitioned into components
arising from the various sources of variation.
• ANOVA provides a statistical test of whether the means of several
groups are all equal, and hence generalizes the two-sample t-test to
more than two groups.
Analysis of Variance (ANOVA) (cont.)

• ANOVA has an obvious advantage over the two-sample t-test. Comparing
three or more means with t-tests requires multiple pairwise tests,
which raises the overall probability of a Type I error, whereas ANOVA
compares all the means in a single test.
• The term ‘treatment’ is used to identify the different populations
being examined. A treatment is defined as a cause, or specific
source, of variation in a set of data.
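
The inflated Type I error risk from repeated t-tests can be illustrated with a short calculation. The sketch below assumes, for simplicity, that the pairwise tests are independent (they are not exactly independent in practice, so the figures are approximations), and the function name `familywise_error_rate` is illustrative rather than taken from the slides.

```python
from math import comb

def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one Type I error when every pair
    of k group means is compared with its own t-test at level alpha.

    Assumes the pairwise tests are independent (a simplification).
    """
    m = comb(k, 2)                  # number of pairwise comparisons
    return 1 - (1 - alpha) ** m     # P(at least one false rejection)

for k in (3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 4))
```

With three groups there are three pairwise tests, and under this assumption the overall error rate is already about 14% rather than the nominal 5%.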

Assumptions of ANOVA

• Independence: The samples are independent and randomly selected
from the populations.
• Normality: The populations being studied are normally distributed.
• Equality (or homogeneity) of variances: The populations have equal
variances.

Analysis of Variance (ANOVA) (cont.)
• Null hypothesis, $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
• Alternative hypothesis, $H_1$: Not all of the $k$ population means are equal
(at least one population mean differs from the others)
• Total sum of squares (SSTO) = Treatment sum of squares (SSTR) + Error
sum of squares (SSE)
• Let $x_{ij}$ denote the $j$th observation from the $i$th treatment, and let $T_i$ be the
total of all observations in the sample from the $i$th treatment. The total
number of observations is $n = n_1 + n_2 + \cdots + n_k$, where $k$ is the
number of different samples or treatments and $n_i$ is the size of the $i$th sample.
Analysis of Variance (ANOVA) (cont.)
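• $SSTO = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \dfrac{\left(\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}\right)^2}{n}$ = Total sum of squares
• $SSTR = \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \cdots + \dfrac{T_k^2}{n_k} - \dfrac{\left(\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}\right)^2}{n}$ = Treatment sum of squares
• $SSE = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \left(\dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \cdots + \dfrac{T_k^2}{n_k}\right)$ = Error sum of squares
or $SSE = SSTO - SSTR$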
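
The three formulas above translate almost line for line into code. The following is a minimal sketch, assuming the data arrive as one list of observations per treatment. The function name `sums_of_squares` and the toy data are invented for illustration.

```python
def sums_of_squares(groups):
    """Return (SSTO, SSTR, SSE) for a list of samples, one list per treatment."""
    n = sum(len(g) for g in groups)                       # total observations
    grand_total = sum(sum(g) for g in groups)             # sum of all x_ij
    correction = grand_total ** 2 / n                     # (sum of all x_ij)^2 / n
    sum_sq = sum(x * x for g in groups for x in g)        # sum of all x_ij^2
    between = sum(sum(g) ** 2 / len(g) for g in groups)   # sum of T_i^2 / n_i

    ssto = sum_sq - correction
    sstr = between - correction
    sse = sum_sq - between          # equivalently, SSTO - SSTR
    return ssto, sstr, sse

# Toy data, invented purely to exercise the function
ssto, sstr, sse = sums_of_squares([[1, 2, 3], [2, 4, 6], [5, 5, 5]])
print(ssto, sstr, sse)
```

For the toy data the three values satisfy SSTO = SSTR + SSE, which is a quick sanity check on any hand calculation.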

Analysis of Variance (ANOVA) (cont.)

• Test statistic:
$F = \dfrac{MSTR}{MSE}$, $df_1 = k - 1$, $df_2 = n - k$
where $MSTR = \dfrac{SSTR}{k - 1}$ and $MSE = \dfrac{SSE}{n - k}$.

• Reject $H_0$ if $F > F_{\alpha, df_1, df_2}$.

Analysis of Variance (ANOVA) (cont.)
• ANOVA table

Source of variation   Sum of squares   Degrees of freedom   Mean square   F
Treatment             SSTR             k - 1                MSTR          F = MSTR/MSE
Error                 SSE              n - k                MSE
Total                 SSTO             n - 1
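
The table layout can also be produced programmatically once SSTR and SSE are known. This is a sketch only: `anova_table` and `show` are hypothetical helper names, and the input values are placeholders chosen to exercise the code, not results from the slides.

```python
def anova_table(sstr, sse, k, n):
    """Build the rows of a one-way ANOVA table.

    sstr, sse: treatment and error sums of squares
    k: number of treatments, n: total number of observations
    """
    mstr = sstr / (k - 1)      # MSTR = SSTR / (k - 1)
    mse = sse / (n - k)        # MSE  = SSE / (n - k)
    return [
        ("Treatment", sstr, k - 1, mstr, mstr / mse),
        ("Error", sse, n - k, mse, None),
        ("Total", sstr + sse, n - 1, None, None),
    ]

def show(rows):
    """Print the table, leaving cells blank where no value is defined."""
    print(f"{'Source':<10}{'SS':>12}{'df':>5}{'MS':>12}{'F':>10}")
    for source, ss, df, ms, f in rows:
        ms_text = "" if ms is None else f"{ms:.4f}"
        f_text = "" if f is None else f"{f:.4f}"
        print(f"{source:<10}{ss:>12.4f}{df:>5}{ms_text:>12}{f_text:>10}")

# Hypothetical inputs, chosen only to exercise the functions
rows = anova_table(sstr=14.0, sse=10.0, k=3, n=9)
show(rows)
```

Keeping the rows as plain tuples separates the arithmetic from the printing, so the same rows can be reused for a decision rule or exported elsewhere.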

Example
Susan Sound predicts that students will learn most effectively with a
constant background sound, as opposed to an unpredictable sound or no
sound at all. She randomly divides twenty-four students into three groups
of eight. All students study a passage of text for 30 minutes. Those in group
1 study with background sound at a constant volume. Those in group 2
study with noise that changes volume periodically. Those in group 3 study
with no sound at all. After studying, all students take a 10-point
multiple-choice test on the material. Their scores follow:
Group            Test scores        $\sum x$   $\sum x^2$   $\bar{x}$
Constant sound   7 4 6 8 6 6 2 9    48         322          6
Random sound     5 5 3 4 4 7 2 2    32         148          4
No sound         2 4 7 1 2 1 5 5    27         125          3.375
Total                               107        595
Solution
Test the hypothesis that there is a difference in mean learning in the three groups. Use
𝛼 = 0.05.
Step 1: Hypotheses
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all three population means are equal (at least one $\mu_i$ differs, $i = 1, 2, 3$)

Step 2: 𝛼 = 0.05

Step 3: Test statistic

$F = \dfrac{MSTR}{MSE}$

$SSTO = 595 - \dfrac{107^2}{24} = 117.9583$

$SSTR = \dfrac{48^2}{8} + \dfrac{32^2}{8} + \dfrac{27^2}{8} - \dfrac{107^2}{24} = 30.0833$

$SSE = 117.9583 - 30.0833 = 87.875$


Solution
Source of variation   Sum of squares   Degrees of freedom   Mean square   F
Treatment             30.0833          2                    15.0417       3.5946
Error                 87.875           21                   4.1845
Total                 117.9583         23

Step 4: Reject $H_0$ if $F > F_{0.05, 2, 21} = 3.467$.

Since $F = 3.5946 > 3.467$, we reject $H_0$.
Step 5: We can conclude that there is a difference in mean learning in the three
groups.
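
As a cross-check on the hand calculation, the same sums of squares can be computed from their definitional (deviation) forms, $SSTR = \sum_i n_i(\bar{x}_i - \bar{x})^2$ and $SSE = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2$, which are algebraically equivalent to the shortcut formulas used in Step 3. A minimal sketch using only the test scores from the example (the variable names are illustrative):

```python
from statistics import mean

# Test scores from the example
groups = {
    "Constant sound": [7, 4, 6, 8, 6, 6, 2, 9],
    "Random sound":   [5, 5, 3, 4, 4, 7, 2, 2],
    "No sound":       [2, 4, 7, 1, 2, 1, 5, 5],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)

# Definitional (deviation) forms of the sums of squares
sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
sse = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
ssto = sum((x - grand_mean) ** 2 for x in all_scores)

k, n = len(groups), len(all_scores)
f_stat = (sstr / (k - 1)) / (sse / (n - k))

print(round(ssto, 4), round(sstr, 4), round(sse, 4), round(f_stat, 4))
```

The printed values should agree with the ANOVA table to four decimal places, which confirms that the shortcut formulas and the deviation forms give the same partition of the total variation.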
Example: Parsley plants

A farmer wants to know whether the weight of parsley plants is
influenced by using a fertilizer. He selects 90 plants and randomly
divides them into three groups of 30 plants each. He applies a
biological fertilizer to the first group, a chemical fertilizer to the
second group and no fertilizer at all to the third group. After a
month he weighs all plants (in grams).
Example: Parsley plants (cont.)

The mean weights are the core of our output. After all, our main
research question is whether these differ for different fertilizers. On
average, parsley plants weigh some 51 grams if no fertilizer was used.
Biological fertilizer results in an average weight of some 54 grams
whereas chemical fertilizer does best with a mean weight of 57 grams.
Example: Parsley plants (cont.)

The null hypothesis is usually rejected if p < 0.05. Since the p-value
(denoted by “Sig.”) is .028, which is less than 0.05, we reject the
null hypothesis and conclude that the mean weights of the three groups
of plants are not all equal. In other words, there is a difference in
mean weight for different fertilizers.
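
The p-value is the upper-tail area of the F distribution beyond the observed statistic. The parsley F statistic is not reproduced in the text, but for any three-group design ($df_1 = 2$) the tail area has the closed form $p = (1 + 2F/df_2)^{-df_2/2}$. This is a standard property of the F distribution, not something stated in the slides. A minimal sketch, applied to the study-sound example ($F = 3.5946$, $df_2 = 21$), with an illustrative function name:

```python
def p_value_three_groups(f_stat: float, df2: int) -> float:
    """Upper-tail probability P(F > f_stat) for an F(2, df2) distribution.

    Valid only when df1 = 2 (three treatments); other values of df1
    need an F table or a statistics library.
    """
    return (1 + 2 * f_stat / df2) ** (-df2 / 2)

# F statistic and error degrees of freedom from the study-sound example
p = p_value_three_groups(3.5946, 21)
print(round(p, 4), "reject H0" if p < 0.05 else "do not reject H0")
```

The resulting p-value is roughly 0.045, just under 0.05, which matches the critical-value decision reached in that example.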
Example
A consumer agency wants to study the time taken (in minutes) by each of
three drugs to provide relief from a headache. The table below shows the
time taken by each patient to get relief from a headache after taking the
medicine. Determine whether the mean time taken to provide relief from
a headache differs among the three drugs at the 5% significance level.
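Drug A    Drug B    Drug C
25        15        44
38        21        39
42        19        54
65        25        58
47                  73
52

Where:
Drug A: $n = 6$, $\sum x = 269$, $\sum x^2 = 12971$, $\bar{x} = 44.8333$
Drug B: $n = 4$, $\sum x = 80$, $\sum x^2 = 1652$, $\bar{x} = 20$
Drug C: $n = 5$, $\sum x = 268$, $\sum x^2 = 15066$, $\bar{x} = 53.6$
Total: $n = 15$, $\sum x = 617$, $\sum x^2 = 29689$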
Solution
Step 1: Hypotheses
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all three population means are equal (at least one $\mu_i$ differs, $i = 1, 2, 3$)

Step 2: 𝛼 = 0.05

Step 3: Test statistic

$F = \dfrac{MSTR}{MSE}$

$SSTO = 29689 - \dfrac{617^2}{15} = 4309.7333$

$SSTR = \dfrac{269^2}{6} + \dfrac{80^2}{4} + \dfrac{268^2}{5} - \dfrac{617^2}{15} = 2645.7$

$SSE = 4309.7333 - 2645.7 = 1664.0333$


Solution
Source of variation   Sum of squares   Degrees of freedom   Mean square   F
Treatment             2645.7000        2                    1322.85       9.5396
Error                 1664.0333        12                   138.6694
Total                 4309.7333        14
Step 4: Reject $H_0$ if $F > F_{0.05, 2, 12} = 3.885$.
Since $F = 9.5396 > 3.885$, we reject $H_0$.
Step 5: In conclusion, the mean time taken to provide relief from a headache
differs among the three drugs.
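
Because the sample sizes are unequal here, it is easy to slip when dividing each $T_i^2$ by its own $n_i$. The sketch below recomputes the table from the per-drug summary statistics ($n$, $\sum x$, $\sum x^2$) given in the example. The critical value 3.885 is the table value quoted in Step 4, and the variable names are illustrative.

```python
# (n_i, sum of x, sum of x^2) per drug, copied from the example summary
summary = {
    "Drug A": (6, 269, 12971),
    "Drug B": (4, 80, 1652),
    "Drug C": (5, 268, 15066),
}

k = len(summary)
n = sum(n_i for n_i, _, _ in summary.values())
total = sum(t for _, t, _ in summary.values())
total_sq = sum(sq for _, _, sq in summary.values())

correction = total ** 2 / n                    # (sum of all x)^2 / n
ssto = total_sq - correction
sstr = sum(t ** 2 / n_i for n_i, t, _ in summary.values()) - correction
sse = ssto - sstr
f_stat = (sstr / (k - 1)) / (sse / (n - k))

f_critical = 3.885  # F(0.05; 2, 12), the table value quoted in Step 4
print(f"SSTO={ssto:.4f} SSTR={sstr:.4f} SSE={sse:.4f} F={f_stat:.4f}")
print("Reject H0" if f_stat > f_critical else "Do not reject H0")
```

Working from summary statistics rather than raw data mirrors the hand calculation, so each printed value can be compared directly with the corresponding entry in the ANOVA table.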

Exercise
Suppose the National Transportation Safety Board (NTSB) wants to
examine the safety of compact cars, midsize cars and full-size cars.
It collects a sample of three for each of the treatments (car
types). Test whether the mean pressure applied to the driver’s
head during a crash test is equal for each type of car at the 1%
significance level.
Compact cars   Midsize cars   Full-size cars
643            469            484
655            427            456
702            525            402
Solution
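
One possible way to work the exercise, following the same five steps as the earlier examples, is sketched below. The hypotheses are $H_0: \mu_1 = \mu_2 = \mu_3$ against $H_1$: not all three means are equal, with $\alpha = 0.01$, $df_1 = 2$ and $df_2 = 6$. Instead of an F table, the sketch uses the closed form $F_{\alpha, 2, df_2} = \frac{df_2}{2}\left(\alpha^{-2/df_2} - 1\right)$, which holds only when $df_1 = 2$. It is a standard property of the F distribution rather than something taken from the slides, and all variable names are illustrative.

```python
# Head pressure readings from the exercise table
cars = {
    "Compact":   [643, 655, 702],
    "Midsize":   [469, 427, 525],
    "Full-size": [484, 456, 402],
}
alpha = 0.01
k = len(cars)
n = sum(len(v) for v in cars.values())

total = sum(sum(v) for v in cars.values())
correction = total ** 2 / n                    # (sum of all x)^2 / n
ssto = sum(x * x for v in cars.values() for x in v) - correction
sstr = sum(sum(v) ** 2 / len(v) for v in cars.values()) - correction
sse = ssto - sstr
f_stat = (sstr / (k - 1)) / (sse / (n - k))

# Closed-form critical value, valid only because df1 = k - 1 = 2
df2 = n - k
f_critical = (df2 / 2) * (alpha ** (-2 / df2) - 1)

print(f"SSTR={sstr:.4f} SSE={sse:.4f} F={f_stat:.4f} critical={f_critical:.3f}")
print("Reject H0" if f_stat > f_critical else "Do not reject H0")
```

Under these assumptions the computed F statistic (about 25.18) far exceeds the critical value (about 10.92), so $H_0$ would be rejected: the mean head pressure is not the same for all three car types at the 1% significance level.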
