INE340 Advanced Statistics PPT Part 3
Week Course outline Relevant Chapters
1 Introduction 1
2 Simple Comparative Experiments 2
3 Simple Comparative Experiments 2
4 The Analysis of Variance 3
5 The Analysis of Variance 3
6 Randomized Blocks, Latin Squares and Related Designs 4
7 Randomized Blocks, Latin Squares and Related Designs 4
8 Midterm
9 Introduction to Factorial Design 5
10 The 2^k Factorial Design 6
11 Blocking and Confounding in the 2^k Factorial Design 7
12 Blocking and Confounding in the 2^k Factorial Design 7
13 Two-Level Fractional Factorial Designs 8
14 Fitting Regression Models 10
15 Exercises
Chapter 3:
Chapter 3 overview:
Recall from chapter 1
Experimental Design
• a plan and a structure to test hypotheses in which
the researcher controls or manipulates one or more variables.
Introduction to Design of Experiments (Recall from chapter 1)
Independent Variable
• Treatment variable - one that the experimenter controls or modifies in the
experiment.
• Classification variable - a characteristic of the experimental subjects that was
present prior to the experiment, and is not a result of the experimenter’s
manipulations or control.
• Levels or Classifications - the subcategories of the independent variable used
by the researcher in the experimental design.
• Independent variables are also referred to as factors.
• Dependent Variable is also called the response to the different levels of the
independent variables.
Three Types of Experimental Designs
1) Completely Randomized Design
• The completely randomized design contains only one independent variable with
two or more treatment levels
• If only two treatment levels of the independent variable are present, the design is the
same as that used to test the difference between the means of two independent
populations, where the t test is used to analyze the data.
• A technique has been developed that analyzes all the sample means at one time
and precludes the buildup of error rate: ANOVA
• A completely randomized design is analyzed by one way analysis of variance.
Introduction to Design of Experiments ANOVA
What If There Are More Than Two Factor Levels?
• There are lots of practical situations where there are either more than two
levels of interest, or there are several factors of simultaneous interest
• The ANOVA was developed by Fisher in the early 1920s, and initially
applied to agricultural experiments
Example 1:
An engineer is interested in investigating the relationship between the RF power
setting and the etch rate for this tool. The objective of an experiment like this is to
model the relationship between etch rate and RF power, and to specify the power
setting that will give a desired target etch rate.
• She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four
levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at
each level of RF power.
• This is an example of a single-factor experiment with a = 4 levels of the factor and
n = 5 replicates.
• Does changing the power change the mean etch rate?
Let’s examine experimental data graphically.
The first figure presents box plots for etch rate at each level of RF power
The second figure represents a scatter diagram of etch rate versus RF power.
▪ Both graphs indicate that etch rate increases as the power setting increases.
▪ There is no strong evidence to suggest that the variability in etch rate around the
average depends on the power setting.
▪ On the basis of this simple graphical analysis, we strongly suspect that (1) RF power
setting affects the etch rate and (2) higher power settings result in increased etch rate.
• A t-test for all six possible pairs of means inflates the type I error.
• The appropriate procedure for testing the equality of several means is the analysis of
variance. It is probably the most useful technique in the field of statistical inference.
Models for the Data
One way to write this model is the means model:
y_ij = µ_i + ε_ij,  i = 1, …, a;  j = 1, …, n
where y_ij is the ijth observation, µ_i is the mean of the ith factor level or treatment, and
ε_ij is a random error component.
• The name “analysis of variance” stems from a partitioning of the total
variability in the response variable into components that are consistent with a
model for the experiment
• Both means model and the effect models are linear statistical models
• objectives:
- test hypotheses about the treatment means
- estimate the model parameters: µ, τ_i, σ²
Analysis of Variance
• The null hypothesis states that the population means for all treatment levels
are equal
• If even one of the population means differs from the others, the null hypothesis
is rejected
• Testing the hypothesis is done by partitioning the total variance of the data into the
following two variances:
• Variance resulting from the treatment (columns)
• Error variance, the portion of the total variance unexplained by the treatment
The null hypothesis and the alternative hypothesis are:
H0: µ1 = µ2 = µ3 = … = µk
Ha: At least one of the means is different from the others
If F > Fc, reject H0.
total sum of squares = between sum of squares + error sum of squares
SST = SSC + SSE
Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij − X̄)² = Σ_{j=1}^{C} n_j (X̄_j − X̄)² + Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij − X̄_j)²
where X_ij = individual value
One-Way ANOVA: Computational Formulas

SSC = Σ_{j=1}^{C} n_j (X̄_j − X̄)²  with df_C = C − 1
SSE = Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij − X̄_j)²  with df_E = N − C
SST = Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij − X̄)²  with df_T = N − 1

MSC = SSC / df_C
MSE = SSE / df_E
F = MSC / MSE

Where:
i: a particular member of a treatment level
j: a treatment level
C: number of treatment levels
n_j: number of observations in treatment level j
X̄: grand mean
X̄_j: column (treatment) mean
X_ij: individual value
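The computational formulas above can be sketched directly in code. The helper below is an illustrative sketch (not from the slides) that builds the one-way ANOVA quantities from a list of samples:

```python
def one_way_anova(groups):
    """Return (SSC, SSE, SST, dfC, dfE, MSC, MSE, F) for a list of samples."""
    all_obs = [x for g in groups for x in g]
    N, C = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / N
    col_means = [sum(g) / len(g) for g in groups]
    # Between-treatment (column) sum of squares: SSC = sum_j n_j (Xbar_j - Xbar)^2
    ssc = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, col_means))
    # Error sum of squares: SSE = sum_ij (X_ij - Xbar_j)^2
    sse = sum((x - m) ** 2 for g, m in zip(groups, col_means) for x in g)
    sst = ssc + sse                       # SST = SSC + SSE
    df_c, df_e = C - 1, N - C
    msc, mse = ssc / df_c, sse / df_e
    return ssc, sse, sst, df_c, df_e, msc, mse, msc / mse

# Tiny check with numbers that are easy to verify by hand:
ssc, sse, sst, df_c, df_e, msc, mse, f = one_way_anova([[1, 2, 3], [2, 3, 4]])
```

Comparing the resulting F to the critical value F_{α, C−1, N−C} gives the decision rule described above.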
Example 2:
In a company there are 3 divisions: Division 1, Division 2, and Division 3. We want to
study whether there is any difference in mean employee age among the 3 divisions.
Here are the given data:
Let’s construct the ANOVA table for the given data:
We have: N=15, K=3 , 𝑋ത1 = 28.2 ; 𝑋ത2 = 32 ; 𝑋ത3 =24.8 and 𝑋തG = 28.333.
F= 39.715
To find the critical value from an F distribution you must know the numerator
(MSTR) and denominator (MSE) degrees of freedom, along with the
significance level. F critical has df1 and df2 degrees of freedom, where:
- df1 is the numerator degrees of freedom equal to c-1, here, df1 = k-1= 3-1= 2
- df2 is the denominator degrees of freedom equal to N-c. Here df2 = N-k =
15-3 = 12
If α = 10%, then F2,12 = 2.81.
Decision Rule: We reject the null hypothesis if: F (observed value) > F critical
(critical value).
At least one mean age is different from others.
a- Find the mean and the standard deviation for each category of cars.
To find the critical value from an F distribution you must know the numerator
(MSTR) and denominator (MSE) degrees of freedom, along with the significance
level. F critical has df1 and df2 degrees of freedom, where:
- df1 is the numerator degrees of freedom equal to c-1, here, df1 = 3 - 1 = 2
- df2 is the denominator degrees of freedom equal to N-c. Here df2 = 9 - 3 = 6.
So F2,6 = 5.14.
Decision Rule: We reject the null hypothesis if: F (observed value) > F critical
(critical value).
The ANOVA table is:
d- Calculate the appropriate test statistic by constructing the ANOVA table.
e- According to the ANOVA table, shall we reject the null hypothesis? Interpret
your result.
Since the F value 25.17 > 5.14 (Fcritical), so we reject the null hypothesis.
Interpretation: Since we rejected the null hypothesis, we are 95% confident (1-α )
that the mean head pressure is not statistically equal for compact, midsize, and full
size cars.
However, since only one mean must be different to reject the null, we do not yet
know which mean(s) is/are different.
In short, an ANOVA test will tell us that at least one mean is different, but an
additional test must be conducted to determine which mean(s) is/are different.
f- Determine which mean(s) are different .
If you fail to reject the null hypothesis in an ANOVA then you are done. You know,
with some level of confidence, that the treatment means are statistically equal.
However, if you reject the null then you can conduct a separate test to determine which
mean(s) is/are different. There are several techniques for testing the differences
between means, but the most common test is the Least Significant Difference Test.
LSD = √(2 × MSE × F_{1,N−c} / n)
where MSE is the mean square error and n is the number of observations in each treatment.
In the example above, LSD = √(2 × 1709 × 5.99 / 3) = 82.61
Thus, if the absolute value of the difference between any two treatment means is greater
than 82.61, we may conclude that they are not statistically equal.
1) Compact cars vs. Midsize cars: |666.67 − 473.67| = 193. Since 193 > 82.61, head
pressure is statistically different between compact and midsize cars.
2) Midsize cars vs. Full-size cars: |473.67 − 447.33| = 26.34. Since 26.34 < 82.61, head
pressure is statistically equal between midsize and full-size cars.
3) Compact vs. Full-size cars: |666.67 − 447.33| = 219.34 > 82.61, so head pressure is
statistically different between compact and full-size cars.
Since 1 & 2 are different, 2 & 3 are equal, and 1 & 3 are different, we conclude that
group 1 differs from groups 2 and 3.
Conclusion: Compact cars' head pressure is statistically different from that of midsize
and full-size cars.
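The LSD rule above can be sketched in a few lines, using the values given in the example (the pair labels are illustrative):

```python
import math

# Car head-pressure example: MSE = 1709, F_{0.05;1,6} = 5.99 from the F table,
# n = 3 observations per treatment.
mse, f_crit, n = 1709, 5.99, 3
lsd = math.sqrt(2 * mse * f_crit / n)          # about 82.61

means = {"compact": 666.67, "midsize": 473.67, "fullsize": 447.33}
pairs = [("compact", "midsize"), ("midsize", "fullsize"), ("compact", "fullsize")]
# A pair differs significantly when |difference of means| exceeds the LSD:
different = {(a, b): abs(means[a] - means[b]) > lsd for a, b in pairs}
```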
Least Significant Difference (LSD) for a balanced sample:
LSD = t_{α/2, N−a} √(2 MSE / n)
Where
t_{α/2, N−a} is the critical value from the t-distribution with N − a degrees of freedom
(same df as SSE)
MSE is the mean square error from the ANOVA table
n is the number of observations in each group (since the sample size is balanced).
Example 3 (revisited):
We have α = 5%, N − a = 9 − 3 = 6, MSE = 1709, and t_{α/2, N−a} = 2.447
LSD = t_{α/2, N−a} √(2 MSE / n) = 2.447 × √(2 × 1709 / 3) = 82.596
The pairwise comparisons are the same as before: |666.67 − 473.67| = 193 > 82.6 and
|666.67 − 447.33| = 219.34 > 82.6, while |473.67 − 447.33| = 26.34 < 82.6.
Conclusion: as with the F-based LSD, compact cars' head pressure is statistically
different from that of midsize and full-size cars.
Example:
Suppose we conduct an experiment with three different fertilizers (A, B, C) on plant
growth and want to see if there's a significant difference between the mean plant heights
for each fertilizer. After running an ANOVA, we find that the overall test is significant,
meaning there is a difference somewhere between the groups. Now, we perform the LSD
test to determine which fertilizers differ significantly from each other. Use α = 5%.
Let’s assume:
MSE= 2.5 (from the ANOVA table)
The sample size for each group is n=10
Degrees of freedom for error = N-a = 30 – 3 = 27
And t_{α/2, N−a} = 2.052
Using the formula:
LSD = t_{α/2, N−a} √(2 MSE / n) = 2.052 × √(2 × 2.5 / 10) = 1.45
The least significant difference is 1.45 units. If the difference between any two fertilizer
means exceeds 1.45, we would conclude that the difference is statistically significant.
For example, if the mean heights for Fertilizer A, B, and C are 15, 14, and 12
respectively:
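Completing the comparison that the example sets up (MSE, n, and the t value are from the example; the group means 15, 14, 12 are the assumed values stated above):

```python
import math

# Fertilizer example: t_{0.025,27} = 2.052 (from the t table), MSE = 2.5, n = 10.
t_crit, mse, n = 2.052, 2.5, 10
lsd = t_crit * math.sqrt(2 * mse / n)          # about 1.45

means = {"A": 15, "B": 14, "C": 12}
sig = {pair: abs(means[pair[0]] - means[pair[1]]) > lsd
       for pair in [("A", "B"), ("A", "C"), ("B", "C")]}
# A vs B: |15 - 14| = 1 < 1.45  -> not significant
# A vs C: |15 - 12| = 3 > 1.45  -> significant
# B vs C: |14 - 12| = 2 > 1.45  -> significant
```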
Example 4:
Three different traffic routes are tested for mean driving time. The entries in the table
are the driving times in minutes on the three different routes.
a- Construct the one-way ANOVA table.
b- State the null and alternative hypothesis and deduce the optimal route using 5% level
of significance.
Analysis of the Fixed Effects Model
Example 1: (revisited)
An engineer is interested in investigating the relationship between the RF power setting
and the etch rate for this tool. The objective of an experiment like this is to model the
relationship between etch rate and RF power, and to specify the power setting that will
give a desired target etch rate.
Note that:
▪ The RF power or between-treatment mean square (22290.18) is many times larger than
the within-treatment or error mean square (333.70). This indicates that it is unlikely that
the treatment means are equal.
▪ Suppose that the experimenter has selected α =0.05. So critical F0.05,3,16 = 3.24.
Because 66.80 > 3.24, we reject H0 and conclude that the treatment means differ; that is,
the RF power setting significantly affects the mean etch rate.
Graphical interpretation:
Estimation of the Model Parameters
If we assume that the errors ε_ij are normally distributed, each treatment average is
distributed NID(µ_i, σ²/n).
A confidence interval estimate of the ith treatment mean may be determined using
the least squares estimation method.
Estimation of the Model Parameters
df = N − k
Example 4: (revisited)

Route 1  Route 2  Route 3
30       27       16
32       29       41
27       28       22
35       36       31

Let's find the CI estimate of the first treatment mean.
We have: Ȳ1 = 31; MSE = 49; n = 4; t = 2.262; df = N − k = 12 − 3 = 9.
A 95% confidence interval for the mean of route 1:
[31 − 2.262 √(49/4) ; 31 + 2.262 √(49/4)] = [23.083 ; 38.917]
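The interval above can be reproduced numerically, a minimal sketch using the values given:

```python
import math

# 95% CI for the route-1 mean: ybar1 = 31, MSE = 49, n = 4, t_{0.025,9} = 2.262.
ybar, mse, n, t = 31, 49, 4, 2.262
half_width = t * math.sqrt(mse / n)            # 2.262 * 3.5 = 7.917
ci = (ybar - half_width, ybar + half_width)    # (23.083, 38.917)
```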
In the t table, ν is the degrees of freedom and 1 − α is the confidence level.
Example 1: (revisited)
The general mean
The analysis of variance described above may still be used, but slight modifications
must be made in the sum of squares formulas.
Model Adequacy Checking
The use of the partitioning to test formally for no differences in treatment means
requires that certain assumptions be satisfied.
These assumptions are that the observations are adequately described by the model
𝑦𝑖𝑗 = µ + 𝜏𝑖 + ε𝑖𝑗
and that the errors are normally and independently distributed with mean zero and
constant but unknown variance σ2 .
If these assumptions are valid, the analysis of variance procedure is an exact test of
the hypothesis of no difference in treatment means. However, these assumptions
will usually not hold exactly.
It is usually unwise to rely on the analysis of variance until the validity of these
assumptions has been checked.
Violations of the basic assumptions and model adequacy can be easily investigated
by the examination of residuals.
We define the residual for observation j in treatment i as:
e_ij = y_ij − ŷ_ij
where ŷ_ij = ȳ_i; that is, the estimate of any observation in the ith treatment is just
the corresponding treatment mean.
Examination of the residuals should be an automatic part of any analysis of variance.
If the model is adequate, the residuals should be structureless; that is, they should
contain no obvious patterns.
Through analysis of residuals, many types of model inadequacies and violations of the
underlying assumptions can be discovered.
We show how model diagnostic checking can be done easily by graphical analysis of
residuals and how to deal with several commonly occurring abnormalities:
The table shows the original data and the residuals for the etch rate data in Example1.
Using 𝑒𝑖𝑗 = 𝑦𝑖𝑗 - 𝑦ො𝑖𝑗 for example: 575 – 551.2 = 23.8 and 565-587.4 = -22.4
The normal probability plot.

i   residual x(i)   f_i = (i − 0.5)/20   100·f_i   z score
1   −25.4           0.025                2.5       −1.96
2   −22.4           0.075                7.5       −1.44
3   −22.0           0.125                12.5      −1.15
4   −21.2
5   −15.4
6   −12.2
7   −9.2
8   −8.4
9   −7.0
10  2.6
11  3.0
12  3.6
13  5.6
14  8.0
15  11.6
16  18.0
17  18.8
18  22.6
19  23.8
20  25.6            0.975                97.5      1.96
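The plotting positions follow f_i = (i − 0.5)/n; a short sketch using Python's `statistics.NormalDist` for the inverse normal reproduces the tabled z scores:

```python
from statistics import NormalDist

# Plotting positions and z scores for the n = 20 ordered residuals:
n = 20
f = [(i - 0.5) / n for i in range(1, n + 1)]      # f[0] = 0.025, ..., f[19] = 0.975
z = [NormalDist().inv_cdf(p) for p in f]          # z[0] = -1.96, z[19] = +1.96
```

Plotting the ordered residuals against these z values gives the normal probability plot; an approximately straight line supports the normality assumption.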
Checking for outliers may be done by examining the standardized residuals:
d_ij = e_ij / √MSE
Thus, about 68 percent of the standardized residuals should fall within the limits ±1,
about 95 percent of them within ±2, and virtually all of them within ±3.
A residual bigger than 3 or 4 standard deviations from zero is a potential outlier.
For example, for the observation y = 651: d = (651 − 625.4)/√333.70 = 25.6/18.27 = 1.40
Bartlett’s test:
H0: σ1² = σ2² = … = σk²
H1: not true for at least one σi²
The test statistic is χ0² = 2.3026 q/c
The quantity q:
- is large when the sample variances S_i² differ greatly
- is equal to zero when all the sample variances are equal.
We reject H0 when χ0² > χ²_{α, a−1}
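The statistic can be sketched as follows; the expressions for q and c are not shown on the slide, so the standard textbook forms are assumed here:

```python
import math

# Assumed standard forms:
#   q = (N - a) log10(Sp^2) - sum_i (n_i - 1) log10(S_i^2)
#   c = 1 + (1 / (3(a - 1))) * (sum_i 1/(n_i - 1) - 1/(N - a))
# where Sp^2 is the pooled sample variance.
def bartlett_stat(groups):
    a = len(groups)
    sizes = [len(g) for g in groups]
    N = sum(sizes)
    variances = [sum((x - sum(g) / len(g)) ** 2 for x in g) / (len(g) - 1)
                 for g in groups]
    sp2 = sum((n - 1) * v for n, v in zip(sizes, variances)) / (N - a)
    q = (N - a) * math.log10(sp2) - sum((n - 1) * math.log10(v)
                                        for n, v in zip(sizes, variances))
    c = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (N - a)) / (3 * (a - 1))
    return 2.3026 * q / c   # chi0^2, compared against chi^2_{alpha, a-1}
```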
Example 1: (revisited)
In the plasma etch experiment, the normality assumption is not in question, so we can
apply Bartlett’s test to the etch rate data.
Using χ0² = 2.3026 q/c:
Example 5: Bartlett’s Test
Suppose a professor wants to know if three different studying techniques lead to different
average exam scores.
She randomly assigns 10 students to use each technique for one week, then makes each student
take an exam of equal difficulty.
The exam scores of the 30 students are shown below.
Conduct Bartlett’s Test to verify that the three groups have equal variances.
Brief solution:
Statistical Tests for Equality of Variance (Modified Levene test)
This test:
➢ Is robust to departures from normality.
➢ Uses the absolute deviation of the observations y_ij in each treatment from the
treatment median ỹ_i to test the hypothesis of equal variances in all treatments.
➢ These deviations are: d_ij = |y_ij − ỹ_i|
➢ It evaluates whether or not the means of these deviations are equal for all
treatments.
➢ If the mean deviations are equal, the variances of the observations in all
treatments will be the same.
➢ The test statistic for Levene’s test is simply the usual ANOVA F statistic
for testing equality of means applied to the absolute deviations.
➢ The Levene test rejects the null hypothesis if F_statistic > F_{α, k−1, N−k}
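A minimal sketch of the procedure just described: compute the absolute deviations from the treatment medians, then apply the usual one-way F computation to them:

```python
from statistics import median

def modified_levene_f(groups):
    """F statistic of a one-way ANOVA applied to d_ij = |y_ij - median_i|."""
    devs = [[abs(y - median(g)) for y in g] for g in groups]
    all_d = [d for g in devs for d in g]
    N, k = len(all_d), len(devs)
    grand = sum(all_d) / N
    means = [sum(g) / len(g) for g in devs]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(devs, means))
    ss_error = sum((d - m) ** 2 for g, m in zip(devs, means) for d in g)
    return (ss_between / (k - 1)) / (ss_error / (N - k))
# Reject H0 of equal variances when this F exceeds F_{alpha, k-1, N-k}.
```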
Example 6:
A civil engineer is interested in determining whether four different methods of
estimating flood flow frequency produce equivalent estimates of peak discharge
when applied to the same watershed.
Each procedure is used six times on the watershed, and the resulting discharge
data (in cubic feet per second) are shown in the next Table.
The analysis of variance for the data, implies that there is a difference in mean
peak discharge estimates given by the four procedures.
The plot of residuals versus fitted values, is disturbing because the outward-
opening funnel shape indicates that the constant variance assumption is not
satisfied.
We will apply the modified Levene test to the peak discharge data.
The upper panel of Table 3.7 contains the treatment medians and the lower panel contains
the deviations dij around the medians.
The F test statistic that results from this is F0 = 4.55, for which the P-value is P = 0.0137.
Therefore, Levene’s test rejects the null hypothesis of equal variances, essentially
confirming the diagnosis we made from visual examination of Figure 3.7.
The peak discharge data are a good candidate for data transformation.
The residuals and deviations in the table are computed as e_ij = y_ij − ŷ_i and
d_ij = |y_ij − ỹ_i|.
Here the fitted values are ŷ_i = 0.71, 2.63, 7.93, 14.72.
1- A Regression Model
The experimenter is interested in determining the differences, if any, between the levels of
the factors.
In fact, the analysis of variance treats the design factor as if it were qualitative or categorical.
The method often used to estimate the parameters in a model such as this is the method of
least squares.
In the previous example: 𝑦ො = 137.62 + 2.527X
The quadratic model appears to be superior to the linear model because it provides a
better fit at the higher power settings.
In general, we would like to fit the lowest order polynomial that adequately describes
the system or process.
In this example, the quadratic polynomial seems to fit better than the linear model, so
the extra complexity of the quadratic model is justified.
Selecting the order of the approximating polynomial is not always easy, however, and it
is relatively easy to overfit, that is, to add high-order polynomial terms that do not
really improve the fit but increase the complexity of the model and often damage its
usefulness as a predictor or interpolation equation.
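The linear fit ŷ = 137.62 + 2.527X quoted above can be reproduced from the four treatment means of Example 1; with equal replication, fitting the means gives the same least-squares line as fitting the raw data:

```python
# Least-squares slope/intercept from the four RF-power treatment means.
power = [160, 180, 200, 220]
etch = [551.2, 587.4, 625.4, 707.0]            # treatment means from Example 1

xbar = sum(power) / len(power)
ybar = sum(etch) / len(etch)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(power, etch))
sxx = sum((x - xbar) ** 2 for x in power)
slope = sxy / sxx                              # 2.527
intercept = ybar - slope * xbar                # 137.62
```

A quadratic fit would require solving the full normal equations (or a library routine); the point here is only that the stated linear coefficients follow from the data.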
Contrasts
Because the null hypothesis was rejected, we know that some power settings produce
different etch rates than others, but which ones actually cause this difference? We might
suspect at the outset of the experiment that 200 W and 220 W produce the same etch rate,
implying that we would like to test the hypothesis:
H0 : µ 3 = µ 4
Ha : µ 3 ≠ µ 4
It is equivalent to:
H0 : µ 3 - µ 4 = 0
Ha : µ3 − µ4 ≠ 0
If we had suspected at the start of the experiment that the average of the lowest levels of
power did not differ from the average of the highest levels of power, then the hypothesis
would have been:
H0 : µ1 + µ2 = µ3 + µ4
Ha : µ1 + µ2 ≠ µ3 + µ4
Which is equivalent to:
H0 : µ1 + µ2 − µ3 − µ4 = 0
Ha : µ1 + µ2 - µ3 − µ4 ≠ 0
In general, a contrast Γ = Σ_{i=1}^{a} c_i µ_i is tested with the hypotheses:
H0: Σ_{i=1}^{a} c_i µ_i = 0
Ha: Σ_{i=1}^{a} c_i µ_i ≠ 0
The contrast constants for the hypotheses in our previous example are:
for H0: µ3 = µ4, they are c1 = c2 = 0, c3 = 1, and c4 = −1;
for H0: µ1 + µ2 = µ3 + µ4, they are c1 = c2 = 1 and c3 = c4 = −1.
Two basic ways for testing hypotheses involving contrasts:
The first approach uses a t test. The contrast is estimated by C = Σ_{i=1}^{a} c_i ȳ_i
with variance V(C) = (σ²/n) Σ_{i=1}^{a} c_i².
If H0 is true, the statistic C / √V(C) = Σ_{i=1}^{a} c_i ȳ_i / √((σ²/n) Σ_{i=1}^{a} c_i²)
follows a normal distribution N(0,1).
Replacing σ² by its estimate MSE, the following test statistic is used:
t0 = Σ_{i=1}^{a} c_i ȳ_i / √((MSE/n) Σ_{i=1}^{a} c_i²)
The null hypothesis is rejected if |t0| > t_{α/2, N−a}.
The second approach uses an F test.
The square of a t random variable with ν degrees of freedom is an F random variable
with 1 numerator and ν denominator degrees of freedom, so
F0 = t0² = (Σ_{i=1}^{a} c_i ȳ_i)² / ((MSE/n) Σ_{i=1}^{a} c_i²)
We can write the test statistic as F0 = MSC/MSE = (SSC/1)/MSE, where the
single-degree-of-freedom contrast sum of squares is
SSC = (Σ_{i=1}^{a} c_i ȳ_i)² / ((1/n) Σ_{i=1}^{a} c_i²)
The confidence interval of the contrast C = Σ_{i=1}^{a} c_i µ_i is:
Σ c_i ȳ_i − t_{α/2, N−a} √((MSE/n) Σ c_i²) ≤ Σ c_i µ_i ≤ Σ c_i ȳ_i + t_{α/2, N−a} √((MSE/n) Σ c_i²)
When more than one contrast is of interest, it is often useful to evaluate them on
the same scale. One way to do this is to standardize the contrast so that it has
variance σ².
Two contrasts with coefficients {c_i} and {d_i} are orthogonal if
Σ_{i=1}^{a} c_i d_i = 0 (balanced design), or
Σ_{i=1}^{a} n_i c_i d_i = 0 (unbalanced design).
➢ There are many ways to choose the orthogonal contrast coefficients for a set of
treatments.
➢ Usually, something in the nature of the experiment should suggest which
comparisons will be of interest.
➢ For example, if there are a = 3 treatments, with treatment 1 a control and treatments
2 and 3 actual levels of the factor of interest to the experimenter, appropriate
orthogonal contrasts might be as follows:
Note that contrast 1 with c_i = (−2, 1, 1) compares the average effect of the factor with
the control, whereas contrast 2 with d_i = (0, −1, 1) compares the two levels of the
factor of interest.
Example 1: (revisited)
Consider the plasma etching experiment. There are four treatment means and three
degrees of freedom between these treatments.
Suppose that prior to running the experiment the following set of comparisons among
the treatment means (and their associated contrasts) were specified:
Using the data from the table and the formula SSC = (Σ_{i=1}^{a} c_i ȳ_i)² / ((1/n) Σ_{i=1}^{a} c_i²),
the numerical values of the contrasts and their sums of squares are:

C1 = ȳ1 − ȳ2 = 1(551.2) − 1(587.4) = −36.2 and SSC1 = (−36.2)² / ((1/5)(2)) = 3276.10

C2 = ȳ1 + ȳ2 − ȳ3 − ȳ4 = 1(551.2) + 1(587.4) − 1(625.4) − 1(707.0) = −193.8
and SSC2 = (−193.8)² / ((1/5)(4)) = 46948.05

C3 = ȳ3 − ȳ4 = 1(625.4) − 1(707.0) = −81.6 and SSC3 = (−81.6)² / ((1/5)(2)) = 16646.40

These contrast sums of squares completely partition the treatment sum of squares.
Since F0 = MSC/MSE = (SSC/1)/MSE, we have, for example:
F0 (treatments) = 22290.18/333.70 = 66.80
F0 (C1) = 3276.10/333.70 = 9.82
We conclude from the P-values that there are significant differences in mean etch rates
between levels 1 and 2 and between levels 3 and 4 of the power settings, and that the
average of levels 1 and 2 does differ significantly from the average of levels 3 and 4 at
the 0.05 level.
F critical is 𝐹α ;1;𝑁−𝑎
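The three contrast sums of squares computed above can be checked in a few lines:

```python
# Contrast SS for Example 1: SS_C = (sum c_i * ybar_i)^2 / (sum c_i^2 / n).
means = [551.2, 587.4, 625.4, 707.0]   # treatment means, n = 5 each
n = 5
contrasts = {
    "C1": [1, -1, 0, 0],               # mu1 - mu2
    "C2": [1, 1, -1, -1],              # mu1 + mu2 - mu3 - mu4
    "C3": [0, 0, 1, -1],               # mu3 - mu4
}
ss = {}
for name, c in contrasts.items():
    value = sum(ci * m for ci, m in zip(c, means))
    ss[name] = value ** 2 / (sum(ci ** 2 for ci in c) / n)
# SS_C1 = 3276.10, SS_C2 = 46948.05, SS_C3 = 16646.40;
# together they partition the treatment sum of squares, 66870.55.
```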
More about contrasts
The contrast coefficients must sum to zero: Σ_{i=1}^{k} c_i = 0
Contrasts and Hypothesis testing
A given contrast will test a specific set of hypotheses:
H0: Σ_{i=1}^{k} c_i µ_i = 0
Ha: Σ_{i=1}^{k} c_i µ_i ≠ 0
using C = Σ_{i=1}^{k} c_i Ȳ_i.
For example, each of two drugs is supposed to work by binding to the receptor for
adrenalin. Propranolol is such a drug, sometimes used for hypertension or anxiety.
The two contrasts:
Contrast 1 tests whether or not the Control group differs from the groups
which block the adrenalin receptors.
Contrast 2 tests whether or not the two drugs differ in their effect.
Orthogonal Contrasts
Two contrasts with coefficients {c_i} and {c′_i} are orthogonal if Σ_{i=1}^{k} c_i c′_i = 0
Orthogonal contrasts allow the treatment sums of squares to be decomposed,
which can be used to test the hypotheses in the example. The a priori structure in
the treatments can be tested for significance in a more powerful way.
Why?
If all of the differences in the means are described by one of the contrasts, say the first
contrast, then
F = SSC1 / MSE ≈ SSTrt / MSE
Example 7:
Suppose we are testing three treatments, T1, T2 and T3 (control) with treatment
means m1, m2, and m3 (two d.f.).
Since there are two degrees of freedom for treatments, there are in principle two
independent comparisons that can be made.
For example, one could in principle test the two hypotheses that µ1 and µ2 are not
significantly different from the control: µ1 = µ3 and µ2 = µ3.
However, these contrasts are not orthogonal, because we should have Σ_{i=1}^{a} c_i d_i = 0.
An orthogonal pair of contrasts (for example c = (1, 1, −2) and d = (1, −1, 0)) defines
the hypotheses:
1) The average of the two treatments is equal to the control (i.e., is there a
significant average treatment effect?); and
2) The two treatments are equal to one another (i.e., is one treatment significantly
different from the other?).
There are two general kinds of linear combinations:
▪ Class comparisons
▪ Trend comparisons.
Extra example :
To illustrate orthogonal contrasts in class comparisons we will use the following
data represented in the table showing results (mg shoot dry weight) of an
experiment (CRD) to determine the effect of seed treatment by acids on the
early growth of rice seedlings.
The analysis of variance for this experiment is given the following table:
The treatment structure of this experiment suggests that the investigator had
several specific questions in mind from the beginning:
In the following table, coefficients are shown that translate these three questions
into contrasts.
The table shows orthogonal coefficients for partitioning the SST into 3
independent tests.
The 1st contrast (first row) compares the control group to the average of the three
acid-treated groups, as can be seen from the following manipulations:
• 3μCont - 1μHCl - 1μProp - 1μBut = 0
• μCont = (1/3)*(1μHCl + 1μProp + 1μBut)
• Mean of the control group = Mean of all acid-treated groups
The H0 for this 1st contrast is that there is no average effect of acid treatment on
seedling growth. Since this Ho involves only two group means, it costs 1 df.
The 2nd contrast (second row) compares the inorganic acid group to the average of the
two organic acid groups:
The H0 for this second contrast is that the effect of the inorganic acid treatment on
seedling growth is no different from the average effect of the organic acid treatments.
Since this null hypothesis involves only two group means (different means than
before), it also costs 1 df.
Finally, the third contrast (third row of coefficients) compares the two organic acid
groups to each other:
The H0 for this third contrast is that the effect of the propionic acid treatment on
seedling growth is no different from the effect of butyric acid treatment.
Since this null hypothesis involves only two group means (different means than
before), it also costs 1 df.
At this point, we have spent all our available degrees of freedom (dftrt = t – 1 = 4 –
1 = 3).
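The claim that these three rows of coefficients form an orthogonal set can be checked mechanically: each row sums to zero, and every pair of rows has a zero dot product (which suffices for equal sample sizes):

```python
# Coefficient rows from the comparison table:
rows = {
    "control vs. acid":      [3, -1, -1, -1],
    "inorganic vs. organic": [0, -2,  1,  1],
    "between organics":      [0,  0,  1, -1],
}
names = list(rows)
# Every row is a contrast (sums to zero):
is_contrast = all(sum(c) == 0 for c in rows.values())
# Every pair is orthogonal (zero dot product):
is_orthogonal = all(sum(a * b for a, b in zip(rows[names[i]], rows[names[j]])) == 0
                    for i in range(3) for j in range(i + 1, 3))
```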
Because each of these questions is a contrast (each row of coefficients sums to
zero) and because the set of three questions is orthogonal, these three questions
perfectly partition SST into three components, each with 1 df.
The SS associated with each of these contrasts serve as the numerators for three
separate F tests, one for each comparison. The critical F values for these single df
tests are based on 1 df in the numerator and dfError in the denominator.
All of this can be seen in the expanded ANOVA table below representing orthogonal
partitioning of SST via contrasts.
Source                df   SS      MS      F
Total                 19   1.0113
Treatment              3   0.8738  0.2912  33.87
 1. Control vs. acid   1   0.7415  0.7415  86.22
 2. Inorg. vs. Org.    1   0.1129  0.1129  13.13
 3. Between Org.       1   0.0194  0.0194   2.26
Error                 16   0.1376  0.0086
Comparison              Control   HCl     Propionic   Butyric
Control vs. acid         +3       -1      -1          -1
Inorganic vs. organic     0       -2      +1          +1
Between organics          0        0      +1          -1
Totals                   20.95    19.34   18.64       18.2
Means                     4.19     3.87    3.73        3.64
Using the formula SSC = (Σ_{i=1}^{a} c_i ȳ_i)² / ((1/n) Σ_{i=1}^{a} c_i²) we get:
SS1 (control vs. acid) = [3(4.19) − 3.87 − 3.73 − 3.64]² / (12/5) = 0.74
SS2 (inorg. vs. org.) = [−2(3.87) + 3.64 + 3.73]² / (6/5) = 0.11
SS3 (between org.) = [3.73 − 3.64]² / (2/5) = 0.02
We have:
SStreatment = SS1 + SS2 + SS3 = 0.87125
SStotal = 298.46 − (77.13)²/20 = 1.008155
SSerror = 0.138155
Conclusion:
From this analysis, we conclude that in this experiment acids significantly reduce
seedling growth (F = 86.22, p < 0.01), that the organic acids cause significantly
more reduction than the inorganic acid (F = 13.13, p < 0.01), and that the
difference between the organic acids is not significant (F = 2.26, p > 0.05).
Scheffé’s Method for Comparing All Contrasts
In many situations, experimenters may not know in advance which contrasts they
wish to compare, or they may be interested in more than a -1 possible comparisons.
Scheffé (1953) has proposed a method for comparing any and all possible contrasts
between treatment means.
In the Scheffé method, the type I error is at most α for any of the possible
comparisons.
Example 1: (revisited)
Suppose that the contrasts of interest are:
Γ1 = µ1 + µ2 − µ3 − µ4 and Γ2 = µ1 − µ4
The tested values are:
C1 = ȳ1. + ȳ2. − ȳ3. − ȳ4. = 551.2 + 587.4 − 625.4 − 707.0 = −193.8 (the tested value for Γ1)
and C2 = ȳ1. − ȳ4. = 551.2 − 707.0 = −155.8 (the tested value for Γ2)
The 1 percent critical values are S0.01,1 = 65.09 and S0.01,2 = 45.97.
Conclusion:
Because |C1|> 𝑆0.01,1 (193.8 > 65.09) we conclude that the contrast Г1 = µ1 + µ2 -
µ3 - µ4 does not equal zero; that is, we conclude that the mean etch rates of power
settings 1 and 2 as a group differ from the means of power settings 3 and 4 as a
group.
Because |C2|> 𝑆0.01,2 (155.8 > 45.97) we conclude that the contrast Г2 = µ1 - µ4
does not equal zero; that is, the mean etch rates of treatments 1 and 4 differ
significantly.
Exercise 1: (problem 3.25)
Four chemists are asked to determine the percentage of methyl alcohol in a certain
chemical compound. Each chemist makes three determinations, and the results are
the following:
c)
Exercise 2: (problem 3.26)
Three brands of batteries are under study. It is suspected that the lives (in weeks) of the
three brands are different. Five batteries of each brand are tested with the following
results:
Week of life
Brand 1 Brand 2 Brand 3
100 76 108
96 80 100
92 75 96
96 84 98
92 82 100
(a) Are the lives of these brands of batteries different?
The Model F-value of 38.34 > 6.93 implies the model is significant. There is only
a 0.01% chance that a "Model F-Value" this large could occur due to noise. Yes, at
least one of the brands is different.
(b) Analyze the residuals from this experiment.
Using the formula e_ij = y_ij − ŷ_ij and data from the table, the residuals are:

Week of life
Brand 1            Brand 2            Brand 3
100 − 95.2 = 4.8   76 − 79.4 = −3.40  108 − 100.4 = 7.60
96 − 95.2 = 0.8    0.60               −0.40
92 − 95.2 = −3.2   −4.40              −4.40
96 − 95.2 = 0.8    4.60               −2.40
92 − 95.2 = −3.2   2.60               −0.40
(c) Construct a 95% interval estimate on the mean life of battery brand 2.
Construct also a 99% interval estimate on the mean difference between the lives
of battery brands 2 and 3.
(d) Which brand would you select for use? If the manufacturer will replace without
charge any battery that fails in less than 85 weeks, what percentage would the
company expect to replace?
Choose brand 3 for the longest life. The mean life of this brand is 100.4 weeks, and the variance of life is estimated by MSE = 15.60.
Φ((85 - 100.4)/√15.60) = Φ(-3.90) = 0.00005
That is, about 5 out of 100,000 batteries will fail before 85 weeks.
134
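The replacement-fraction calculation above can be reproduced with the standard library alone; the mean (100.4 weeks) and variance estimate (MSE = 15.60) are the slide's figures:

```python
# Fraction of brand-3 batteries expected to fail before 85 weeks,
# treating life as normal with mean 100.4 and variance MSE = 15.60.
from math import sqrt
from statistics import NormalDist

z = (85 - 100.4) / sqrt(15.60)   # ≈ -3.90
frac = NormalDist().cdf(z)       # ≈ 0.00005
print(z, frac)
```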
(e) Use the modified Levene test to determine if the assumption of equal variances is
satisfied. Use α = 0.05. Did you reach the same conclusion regarding the equality of
variances by examining the residual plots?
Solution:
The absolute deviation of each battery life from its brand median is d_ij = |y_ij - ỹ_i|, where ỹ_i is the median of brand i:
135
The analysis of variance on these deviations indicates that there is no difference between the brands, and therefore the assumption of equal variances is satisfied.
136
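A sketch of the same test, assuming SciPy is available; with `center='median'`, `scipy.stats.levene` is exactly the modified (Brown–Forsythe) Levene test, an ANOVA on the absolute deviations from each brand's median:

```python
# Modified Levene test for equality of variances across battery brands.
from scipy.stats import levene

brand1 = [100, 96, 92, 96, 92]
brand2 = [76, 80, 75, 84, 82]
brand3 = [108, 100, 96, 98, 100]

stat, p = levene(brand1, brand2, brand3, center='median')
print(stat, p)   # large p-value -> no evidence of unequal variances
```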
Comparing Pairs of Treatment Means
In many practical situations, we wish to compare only pairs of means.
Although the Scheffé method could easily be applied to this problem, it is not the most sensitive procedure for such comparisons.
Suppose that we are interested in comparing all pairs of the a treatment means and that the null hypotheses we wish to test are H0 : µi = µj for all i ≠ j.
➢ The Tukey procedure controls the experimentwise or “family” error rate at the selected level α.
➢ Tukey’s procedure makes use of the distribution of the studentized range statistic.
q = (ȳmax - ȳmin) / √(MSE/n)
where ȳmax and ȳmin are the largest and smallest sample means.
138
For equal sample sizes, Tukey’s test declares two means significantly different if the
absolute value of their sample differences exceeds:
Tα = qα(a, f) √(MSE/n)
139
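A sketch of Tukey's critical difference for the etch example, assuming SciPy ≥ 1.7 (for `scipy.stats.studentized_range`) and the example's MSE = 333.70 with a = 4 treatments, f = 16 error degrees of freedom, and n = 5:

```python
# Tukey critical difference T_alpha = q_alpha(a, f) * sqrt(MSE/n).
from math import sqrt
from scipy.stats import studentized_range

MSE, a, f, n = 333.70, 4, 16, 5
q_crit = studentized_range.ppf(0.95, a, f)   # q_0.05(4, 16) ≈ 4.05
T = q_crit * sqrt(MSE / n)                   # ≈ 33.1
print(q_crit, T)
```

Any pair of treatment means whose absolute difference exceeds T would be declared significantly different.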
140
141
We can construct a set of 100(1 - 𝛼) percent confidence intervals for all pairs of
means as follows:
142
Note that: The unequal sample size version is sometimes called the Tukey–Kramer
procedure.
143
Example 1 (revisited):
144
The Fisher Least Significant Difference (LSD) Method.
The Fisher method for comparing all pairs of means controls the error rate for each
individual pairwise comparison but does not control the experimentwise or family error
rate.
t0 = (ȳi. - ȳj.) / √(MSE (1/ni + 1/nj))
The quantity called the least significant difference is:
LSD = t(α/2, N-a) √(MSE (1/ni + 1/nj))
145
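A sketch of the LSD for the etch example (MSE = 333.70, N - a = 16 error degrees of freedom, equal sample sizes n = 5), assuming SciPy for the t quantile:

```python
# Fisher LSD with equal sample sizes: LSD = t_{alpha/2, N-a} * sqrt(2*MSE/n).
from math import sqrt
from scipy.stats import t

MSE, n, df = 333.70, 5, 16
t_crit = t.ppf(1 - 0.05 / 2, df)   # t_{0.025,16} ≈ 2.120
LSD = t_crit * sqrt(2 * MSE / n)   # ≈ 24.49
print(LSD)
```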
Example 1 (revisited):
146
Problems 3.7, 3.8, 3.9, 3.22, 3.25, 3.26 and 3.49 from the book
147
Comparing Treatment Means with a Control (Dunnett )
In many experiments, one of the treatments is a control, and the analyst is interested in comparing each of the other a - 1 treatment means with the control.
Suppose that treatment a is the control and we wish to test the hypotheses:
H0 : µi = µa
H1 : µi ≠ µa
for i = 1, 2, …, a - 1.
148
Dunnett’s procedure is a modification of the usual t-test.
For each hypothesis, we compute the observed difference in the sample means |ȳi. - ȳa.| and reject H0 if |ȳi. - ȳa.| exceeds dα(a-1, f) √(MSE (1/ni + 1/na)).
149
Example 1: (revisited)
To illustrate Dunnett’s test, consider the experiment with etching rate with treatment 4
considered as the control. In this example, a = 4, f = N - a = 16, and n = 5. At the 5 percent
level, we find from Appendix Table VIII that
dα(a-1, f) = d0.05(3, 16) = 2.59.
The critical difference becomes dα(a-1, f) √(MSE (1/ni + 1/na)) = 2.59 √(333.7 (1/5 + 1/5)) = 29.92
150
Any treatment mean that differs in absolute value from the control by more than 29.92 would be
declared significantly different.
The observed differences are:
1 vs. 4: ȳ1. - ȳ4. = 551.2 - 707.0 = -155.8
2 vs. 4: ȳ2. - ȳ4. = 587.4 - 707.0 = -119.6
3 vs. 4: ȳ3. - ȳ4. = 625.4 - 707.0 = -81.6
Note that all differences are significant. Thus, we would conclude that all power settings are
different from the control.
151
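The Dunnett comparison above can be sketched as follows, using the table value d0.05(3, 16) = 2.59 quoted on the slide rather than computing the Dunnett quantile itself:

```python
# Dunnett comparisons against the control (power setting 4).
from math import sqrt

MSE, n, d_crit = 333.70, 5, 2.59
crit_diff = d_crit * sqrt(MSE * (1 / n + 1 / n))   # ≈ 29.92

means = {1: 551.2, 2: 587.4, 3: 625.4, 4: 707.0}
for i in (1, 2, 3):
    diff = means[i] - means[4]
    # Significant if the absolute difference exceeds the critical difference.
    print(i, diff, abs(diff) > crit_diff)
```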
152
153
Example 1 (revisited):
154
Tables
155
The cumulative standard normal
distribution table.
156
ν is the degrees of freedom
1 - α is the confidence level
157
158
159
160
161
162
Formulas
Treatment: SSC = Σ_{j=1}^{C} n_j (X̄_j - X̄)²  with df_C = C - 1
Error: SSE = Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij - X̄_j)²  with df_E = N - C
Total: SST = Σ_{j=1}^{C} Σ_{i=1}^{n_j} (X_ij - X̄)²  with df_T = N - 1
MSC = SSC / df_C
MSE = SSE / df_E
F = MSC / MSE
163
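The decomposition above can be computed by hand; a minimal sketch using the battery data from Exercise 2 (so the result can be checked against the slide's F = 38.34):

```python
# One-way ANOVA sums of squares computed directly from the definitions.
groups = [[100, 96, 92, 96, 92], [76, 80, 75, 84, 82], [108, 100, 96, 98, 100]]
N = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / N

# Between-treatment and error sums of squares.
SSC = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
SSE = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

MSC = SSC / (len(groups) - 1)   # df_C = C - 1
MSE = SSE / (N - len(groups))   # df_E = N - C
F = MSC / MSE
print(F)   # ≈ 38.34
```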
Formulas
LSD = √(2 × MSE × F_{α,1,N-C} / r)  (r observations per treatment)
χ₀² = 2.3026 q/c  (Bartlett's test for equality of variances)
164
Contrasts
C = Σ_{i=1}^{a} c_i ȳ_i.
V(C) = (σ²/n) Σ_{i=1}^{a} c_i²
t0 = Σ_{i=1}^{a} c_i ȳ_i. / √((MSE/n) Σ_{i=1}^{a} c_i²)
SS_C = (Σ_{i=1}^{a} c_i ȳ_i.)² / ((1/n) Σ_{i=1}^{a} c_i²)
SS_Total = Σ y_ij² - (Σ y_ij)² / N
F0 = MSC / MSE = (SS_C / 1) / MSE
165
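The contrast t-statistic above, evaluated as a sketch for Γ2 = µ1 - µ4 from the etch example (MSE = 333.70, n = 5 per treatment, means from the slides):

```python
# t-statistic for a single contrast: t0 = C / sqrt((MSE/n) * sum(c_i^2)).
from math import sqrt

means = [551.2, 587.4, 625.4, 707.0]
c = [1, 0, 0, -1]            # coefficients for Gamma2 = mu1 - mu4
MSE, n = 333.70, 5

C = sum(ci * m for ci, m in zip(c, means))          # -155.8
t0 = C / sqrt((MSE / n) * sum(ci ** 2 for ci in c))
print(C, t0)
```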
Γ_j = Σ c_i µ_i
C_j = Σ c_i ȳ_i.
Tα = qα(a, f) √(MSE/n)
LSD = t(α/2, N-a) √(2 MSE / n)
Dunnett critical difference: dα(a-1, f) √(MSE (1/n_i + 1/n_a))
166