Stat 408 Analysis of Experimental Design PDF
For example:
• Fish may be collected from three different regions of a lake in order to compare
  their weights across the three locations.
• Children from three different schools may be compared on their performance on
  an achievement test.
• Households from three suburbs are surveyed to compare their incomes and
  political opinions.
For example:
• Twenty plots of carrots are grown in a field. Each plot is randomly allocated to
  one of five fertilizers, with four plots for each fertilizer. At the end of the
  experiment, the carrots from each plot are weighed. The yield of carrots under
  different fertilizers is being studied.
• Twenty children from a class are each randomly assigned to one of five different
  teaching methods, four children to each method. After three weeks of teaching,
  each child is tested for understanding of the material taught. The different
  teaching methods are being compared.
• People with a certain disease are randomly allocated to three different drugs. The
  drugs are being compared for their influence on the progress of the disease.
The goal of a study is to find out the relationships between certain explanatory factors
and response variables.
• An observational study usually can only answer whether there is an association between
  the explanatory factor and the response variable. In general, external evidence is
  required to rule out possible alternative explanations before claiming a cause-and-effect
  relationship.
• Regression models can include both qualitative and quantitative explanatory variables.
  – Regression models assume that there is some sort of linear relationship between the
    quantitative explanatory variables (or transformations of them) and the response.
• Analysis of variance (ANOVA) models assume all explanatory variables (quantitative and
  qualitative) enter the model as qualitative variables.
• There is effectively no difference between ANOVA models and regression models with
  qualitative explanatory variables.
Analysis of Variance
• We must consider the method of analysis when designing a study. The method of analysis
  depends on the nature of the data and the purpose of the study.
• ANOVA is typically used when the effects of one or more explanatory variables are of
  interest.
• The goal of ANOVA is to determine whether there is a difference between the mean
  responses associated with the factor levels or treatments and, if there is a difference,
  to determine its nature.
Basic Concepts
• We shall start with a simple real-life problem that many of us face.
• Nowadays most of us use gas for cooking purposes. Most gas users are customers
  of gas companies.
• The customers get their refills (filled gas cylinders) through the agents of these
  companies.
• One of the customers, Mrs. Mensah, who buys her gas from the ABC gas agent, has faced a
  problem in the recent past.
• She observed that her cylinders were not lasting as long as they used to.
• So she suspected that the amount of gas in the refills was less than what she
  used to get in the past. She knew that she was supposed to get 14.2 kg of gas in every
  refill.
• She explained her problem to the customers' complaints section of the ABC gas
  company.
• Subsequently, the company made a surprise check on this agent.
• They took 25 cylinders that were being supplied to customers from this agency and
  measured the amount of gas in each of these cylinders.
• The 25 observations were statistically analyzed, and through a simple test of hypothesis
  it was inferred that the mean amount of gas in the cylinders supplied by the ABC agent
  was significantly lower than 14.2 kg.
• On investigation, it was revealed that the agent was tapping gas from cylinders before
  they were supplied to the customers.
• There were five agents of the company in the town where Mrs. Mensah was living.
• To protect customers' interests, the company decided to carry out surprise checks on all
  the agents from time to time.
• During each check, they picked up 7 cylinders at random from each of the five agents,
  resulting in the data given in the table below. Is it possible to test from these data
  whether the mean amount of gas per cylinder differs from agent to agent?
• It is possible to carry out a simple test of hypothesis for each of the agents separately,
  but there is a better statistical procedure for testing all of them simultaneously. We
  shall see how this can be done.
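As a sketch of the two approaches just described, the code below runs a separate one-sample t-test for each agent against the nominal 14.2 kg, and then a single simultaneous one-way ANOVA across all five agents. The weights are made-up illustrative values, not the data from the table.

```python
# Illustrative only: simulated weights (kg) for 7 cylinders from each of
# 5 agents; the real data are in the table referenced in the notes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
agents = [rng.normal(loc=m, scale=0.15, size=7)
          for m in (14.2, 14.2, 13.9, 14.2, 14.1)]  # hypothetical means

# Approach 1: a separate one-sample t-test per agent vs the nominal 14.2 kg
for i, w in enumerate(agents, start=1):
    t, p_t = stats.ttest_1samp(w, popmean=14.2)
    print(f"Agent {i}: t = {t:.2f}, p = {p_t:.3f}")

# Approach 2: a single simultaneous test, the one-way ANOVA
F, p = stats.f_oneway(*agents)
print(f"ANOVA: F = {F:.2f}, p = {p:.4f}")
```

The ANOVA answers one question ("do any agent means differ?") with a single test, which avoids the multiple-testing problem of running five separate tests.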
Source of Variation
• You know that variation is inevitable in almost all the variables (measurable
  characteristics) that we come across in practice.
• For example, the amount of gas in two refills is not the same, irrespective of whether
  gas has been tapped or not.
• Consider the data in the table below.
• We have the weights of gas in 35 cylinders taken at random, seven from each of the five
  agents.
• These 35 weights exhibit variation. You will agree that some of the possible reasons for
  this variation are one or more of the following:
  – The gas refilling machine at the company does not fill every cylinder with exactly
    the same amount of gas.
  – There may be a leakage problem in some of the cylinders.
  – The agents might have tapped gas from some of these cylinders.
  – Not all 35 cylinders were filled by the same filling machine.
• Thus, the variation in the 35 weights might have come from different sources.
• Though the variation is attributable to several sources, depending upon the situation,
  we will be interested in analyzing whether most of this variation can be attributed to
  differences in one (or more) of the sources.
For instance, in the above example, the company will be interested in identifying whether
there are any differences among the agents. So the source of variation of interest here is
AGENTS. In other words, we are interested in a one-factor, or one-way, analysis of variance.
• Now that you know what a source of variation is, you can think of different types of
  sources.
• In the gas company example, agents form one type of source.
• If the cylinders under consideration were refilled by different filling machines, then
  filling machines would be another type of source of variation.
When the data are classified with respect to only one type of source of variation, we say
that we have one-way classification data.
In many situations, one conducts experiments to study the effect of a single factor on a
variable under study. Such experiments, known as one-factor experiments, lead to one-way
classification data.
Classification of Data
The process of arranging data into homogeneous groups or classes according to some common
characteristics present in the data is called classification.
For example: in the process of sorting letters in a post office, the letters are classified
according to their cities and further arranged according to streets.
Types of Classification:
(1) One-way classification:
If we classify observed data keeping in view a single characteristic, this type of
classification is known as one-way classification.
(2) Two-way classification:
If we consider two characteristics at a time in order to classify the observed data, then
we are said to have two-way classification.
Single-Factor Experiments
• We generally classify scientific experiments into two broad categories, namely
  single-factor experiments and multifactor experiments.
• Model I: This is a model where the factor levels are fixed by the researcher. Conclusions
  will pertain only to the means associated with each of the fixed factor levels.
• Model II: This is a model where the factor levels are random; that is, the levels are
  randomly selected by the researcher from a population of factor levels. Conclusions will
  extend to the population of factor levels.
Notation
In general, we have a single factor with k ≥ 2 levels (treatments) and n_i replicates for
the i-th treatment.
Model Assumptions
• ε_ij ~ iid N(0, σ²)
• equivalently, the y_ij are independent with y_ij ~ N(μ_i, σ²)
Parameters
The parameters of the model are (μ_1, μ_2, ..., μ_k, σ²).
Estimates
For each level i, we compute the sample mean and an estimate of the variance:

    \bar{y}_{i\cdot} = \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i}, \qquad
    s_i^2 = \frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{n_i - 1}

We combine these s_i² to get an estimate of σ² in the following way.
Pooled Estimate of σ²
The pooled estimate is

    s^2 = \frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{\sum_{i=1}^{k} (n_i - 1)}
        = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{N - k}
        = MSE

In the special case that there are an equal number of observations per group (n_i = n),
then N = nk and this becomes

    s^2 = \frac{(n-1) \sum_{i=1}^{k} s_i^2}{nk - k} = \frac{1}{k} \sum_{i=1}^{k} s_i^2,

a simple average of the s_i².
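A minimal numerical check of the pooled estimate, using made-up balanced data: with equal group sizes, the pooled estimate reduces to the simple average of the group variances.

```python
# Illustrative data: k = 3 groups, n = 5 observations each (balanced).
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18]),
          np.array([14., 18, 18, 19, 19])]

s2 = [g.var(ddof=1) for g in groups]                     # per-group s_i^2
num = sum((len(g) - 1) * v for g, v in zip(groups, s2))  # sum (n_i - 1) s_i^2
den = sum(len(g) - 1 for g in groups)                    # N - k
pooled = num / den

print(pooled, np.mean(s2))  # equal n_i: pooled estimate = average of s_i^2
```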
Hypothesis Tests
The hypothesis that all treatments are equally effective becomes H₀: μ_1 = μ_2 = ... = μ_k.
Effects Model:

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, k; \; j = 1, 2, \ldots, n_i

where \sum_{i=1}^{k} \tau_i = 0 (balanced design) or \sum_{i=1}^{k} n_i \tau_i = 0
(unbalanced design).
Parameters
The parameters of the factor effects model are (μ, τ_1, τ_2, ..., τ_k, σ²). There are k + 2
of these. The most popular method of estimation is the method of least squares (LS), which
determines the estimators of μ and the τ_i by minimizing the sum of squares of the errors:

    L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij}^2
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu - \tau_i)^2

We use the "^" (hat) notation to represent least squares estimators, as well as predicted
(or fitted) values.
Minimization of L via partial differentiation (with the zero-sum constraint
\sum_{i=1}^{k} \tau_i = 0) provides the estimates:

    \hat{\mu} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}}{N}
              = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot}
    \qquad \text{and} \qquad
    \hat{\tau}_i = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}
Proof
Consider the fixed effects one-way ANOVA model

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}

where μ and the τ_i are fixed, but unknown, parameters and the ε_ij's are independent
random errors. The least squares estimators, μ̂ and τ̂_i, of the parameters μ and τ_i are
obtained by minimizing the sum of squares of the errors (the ε_ij's).
We have ε_ij = y_ij − μ − τ_i. Let the sum of squared errors be

    L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij}^2
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu - \tau_i)^2

A solution can be found by using the normal equations, which are obtained by equating the
partial derivatives to zero and then solving:

    \frac{\partial L}{\partial \hat{\mu}}
      = -2 \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) \qquad (1)

    \frac{\partial L}{\partial \hat{\tau}_i}
      = -2 \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i),
      \qquad i = 1, \ldots, k \qquad (2)

Setting (1) equal to zero:

    -2 \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0
    \;\Rightarrow\;
    \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \hat{\mu}
      + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \hat{\tau}_i
    \;\Rightarrow\;
    y_{\cdot\cdot} = N\hat{\mu} + \sum_{i=1}^{k} n_i \hat{\tau}_i \qquad (3)

where N = \sum_{i=1}^{k} n_i.
Setting each of the equations in (2) equal to zero, the least squares estimators τ̂_i for
i = 1, ..., k satisfy

    -2 \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0
    \;\Rightarrow\;
    \sum_{j=1}^{n_i} y_{ij} = n_i \hat{\mu} + n_i \hat{\tau}_i,
    \qquad i = 1, \ldots, k

There is no unique solution to these equations, as they are not linearly independent:
summing the k equations in (2) over i reproduces equation (1). To get unique solutions for
μ̂ and the τ̂_i we impose the constraint

    \sum_{i=1}^{k} n_i \hat{\tau}_i = 0

Substituting the constraint into (3) yields y_·· = Nμ̂, or

    \hat{\mu} = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot},

and then each equation in (2) gives τ̂_i = ȳ_i· − μ̂ = ȳ_i· − ȳ··.
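A quick numerical check of the least squares solution, on made-up data: the grand mean estimates μ, the group-mean deviations estimate the τ_i, and the constraint Σ n_i τ̂_i = 0 holds by construction.

```python
# Illustrative check of mu_hat = grand mean, tau_hat_i = group mean - grand mean.
import numpy as np

y = [np.array([7., 7, 15, 11, 9]),
     np.array([12., 17, 12, 18, 18])]

mu_hat = np.concatenate(y).mean()            # ybar..
tau_hat = [g.mean() - mu_hat for g in y]     # ybar_i. - ybar..

# The constraint sum_i n_i * tau_hat_i = 0 is satisfied automatically:
balance = sum(len(g) * t for g, t in zip(y, tau_hat))
print(mu_hat, tau_hat, balance)
```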
Hypothesis Tests
• The cell means model hypotheses were
      H₀: μ_1 = μ_2 = ... = μ_k
      H₁: μ_i ≠ μ_j for at least one pair i, j (not all the μ_i are equal)
• For the factor effects model these translate to
      H₀: τ_1 = τ_2 = ... = τ_k = 0
      H₁: τ_i ≠ 0 for at least one i
Thus, the one-way ANOVA for testing the equality of treatment effects is identical to the
ANOVA for testing the equality of treatment means.
Sample Layout
The typical data layout for a one-way ANOVA is shown below. For each treatment i we record

    y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij} \quad \text{(i-th treatment sample sum)}, \qquad
    \bar{y}_{i\cdot} = \frac{y_{i\cdot}}{n_i} \quad \text{(i-th treatment mean)}

and overall, with N = \sum_{i=1}^{k} n_i observations,

    \bar{y}_{\cdot\cdot} = \frac{y_{\cdot\cdot}}{N} \quad \text{(grand mean)}.

Each observation's deviation from the grand mean can be split into two parts:

    y_{ij} - \bar{y}_{\cdot\cdot}
      = (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) + (y_{ij} - \bar{y}_{i\cdot})

Squaring and summing over all observations,

    \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
    = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
    + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
    + 2 \underbrace{\sum_{i=1}^{k} \sum_{j=1}^{n_i}
        (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{i\cdot})}_{=\,0}

so that

    \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
    = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
    + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2

Expressing the above sums of squares symbolically, we have SST = SSTR + SSE.
SST has N − 1 d.f., SSTR has k − 1 d.f., and SSE has N − k d.f., so we also have a
decomposition of the total d.f.
The degrees of freedom (d.f.) for a sum of squares counts the number of independent pieces
of information that go into that quantification of variability.
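The ANOVA identity above can be verified numerically. The sketch below uses made-up data and checks both SST = SSTR + SSE and the matching d.f. decomposition.

```python
# Illustrative verification of SST = SSTR + SSE on made-up data.
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18]),
          np.array([14., 18, 18, 19, 19])]
all_y = np.concatenate(groups)
grand = all_y.mean()
N, k = len(all_y), len(groups)

SST  = ((all_y - grand) ** 2).sum()                            # total SS
SSTR = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)   # treatment SS
SSE  = sum(((g - g.mean()) ** 2).sum() for g in groups)        # error SS

print(SST, SSTR + SSE)          # the two should agree
print(N - 1, (k - 1) + (N - k)) # d.f. decompose the same way
```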
Notice that

    SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
        = \sum_{i=1}^{k} (n_i - 1) s_i^2

where s_i² is the sample variance within the i-th treatment, so

    MSE = \frac{SSE}{N-k}
        = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 + \cdots + (n_k - 1) s_k^2}
               {(n_1 - 1) + \cdots + (n_k - 1)}
        = \frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{\sum_{i=1}^{k} (n_i - 1)},

which generalizes the pooled estimate s_p² of σ² used when k = 2.
Computational Formulae
We have defined SST, SSTR and SSE as sums of squared deviations. Equivalent formulas for
SST and SSTR for computational purposes are as follows:

    SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
        = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{\cdot\cdot}^2}{N}

    SSTR = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
         = \sum_{i=1}^{k} \frac{y_{i\cdot}^2}{n_i} - \frac{y_{\cdot\cdot}^2}{N}

SSE is computed by subtraction: SSE = SST − SSTR.
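A sketch checking that the computational (shortcut) formulas agree with the definitional sums of squared deviations, again on made-up data.

```python
# Illustrative check: shortcut formulas vs definitional formulas.
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18])]
all_y = np.concatenate(groups)
N = len(all_y)
ydd = all_y.sum()                       # y.. (grand total)

# Shortcut formulas
SST_short  = (all_y ** 2).sum() - ydd ** 2 / N
SSTR_short = sum(g.sum() ** 2 / len(g) for g in groups) - ydd ** 2 / N
SSE = SST_short - SSTR_short            # by subtraction

# Definitional formulas
grand = all_y.mean()
SST_def  = ((all_y - grand) ** 2).sum()
SSTR_def = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)

print(SST_short, SST_def, SSTR_short, SSTR_def)
```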
Mean Squares
The ratios of sums of squares to their degrees of freedom result in mean squares.
• MSTR, the treatment mean square, is defined as MSTR = SSTR/(k − 1).
• MSE, the mean square error, is defined as MSE = SSE/(N − k).
In general,

    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i \tau_i^2}{k-1}
    \qquad \text{or equivalently} \qquad
    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i (\mu_i - \mu)^2}{k-1}

where

    \mu = \frac{n_1 \mu_1 + n_2 \mu_2 + \cdots + n_k \mu_k}{N}
        = \sum_{i=1}^{k} \frac{n_i \mu_i}{N}
    \qquad \text{and} \qquad \tau_i = \mu_i - \mu,

and

    E(MSE) = \sigma^2
The F-test
Since E(MSE) = σ² and

    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i \tau_i^2}{k-1},

if H₀: μ_1 = μ_2 = ... = μ_k (equivalently H₀: τ_1 = τ_2 = ... = τ_k = 0) is true, then
Σ n_i τ_i² = 0 and E(MSTR) = σ² + 0 = σ².
• Therefore, if H₀ is true, F = MSTR/MSE should be close to 1.
• However, when H₀ is false, it can be shown that MSTR estimates something larger than σ²
  (i.e. E(MSTR) > E(MSE) when some treatment means are different, that is, when real
  treatment effects exist).
• That is, if MSTR/MSE >> 1, then it makes sense to reject H₀.
• Therefore, to determine whether H₀ is true or not, we look at how much larger than 1
  MSTR/MSE is.
For pairwise comparisons following a significant F-test, two treatment means ȳ_i· and ȳ_j·
are declared significantly different if |ȳ_i· − ȳ_j·| exceeds the least significant
difference, where

    LSD = t_{\alpha/2,\, N-k} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}
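A minimal sketch of the LSD computation. The MSE, sample sizes, and α below are illustrative values, not results from the notes' data.

```python
# Hypothetical ANOVA results: MSE from a design with N = 25, k = 5,
# equal group sizes n_i = n_j = 5, tested at alpha = 0.05.
import math
from scipy import stats

MSE, N, k = 8.06, 25, 5
ni = nj = 5
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df=N - k)      # t_{alpha/2, N-k}
LSD = t_crit * math.sqrt(MSE * (1 / ni + 1 / nj))
print(round(LSD, 3))

# Declare means i and j different if |ybar_i. - ybar_j.| > LSD
```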
Random Effects Model
• A random effects factor is one that has many possible levels, where the interest is in
  the variability of the response over the entire population of levels, but we include only
  a random sample of levels in the experiment.
• The factor levels are meant to be representative of a general population of possible
  levels. We are interested in whether that factor has a significant effect in explaining
  the response, but only in a general way. For example, we are not interested in a detailed
  comparison of level 2 vs. level 3, say.
The mathematical representation of the model is the same as for the fixed effects model:

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, \ldots, k; \; j = 1, \ldots, n_i

where the y_ij, τ_i and ε_ij are random variables and μ is an unknown fixed parameter, the
overall mean.
Model Assumptions
1. The ε_ij's (random errors) come independently from a N(0, σ²) distribution
   [i.e. ε_ij ~ iid N(0, σ²)].
2. The random effects τ_i's are independent random variables with the same N(0, σ_τ²)
   distribution [i.e. we assume that τ_1, τ_2, ..., τ_k ~ iid N(0, σ_τ²)].
Variance Components
• In the random effects model, the variance of y_ij is no longer just σ². The equation for
  y_ij now has two random variables on the right: there is the residual unexplained
  variability σ² as before, plus the variability from randomly selecting τ_i from a
  N(0, σ_τ²) distribution.
The two variances σ_τ² and σ² are called variance components (or components of variance),
as the variance of one observation is

    \operatorname{Var}(y_{ij}) = \sigma_\tau^2 + \sigma^2

These two components may be estimated from the MS column of the ANOVA table.
Hypotheses
For the random effects model, testing the hypothesis that the individual treatment effects
are zero is meaningless. It is more appropriate to test hypotheses about the variance of
the τ_i. Since we are interested in the bigger population of treatments, the hypotheses of
interest associated with the random effects τ_i are:

    H₀: σ_τ² = 0  vs  H₁: σ_τ² > 0

• If σ_τ² = 0, then all random treatment effects are identical, but
• if σ_τ² > 0, significant variability exists among the randomly selected treatments (that
  is, the variability observed among the randomly selected treatments is significantly
  larger than the variability that can be attributed to random error).
• Under the alternative hypothesis σ_τ² > 0, and for n_i = n, the expected value of MSTR
  (the mean square for treatments) is

    E(MSTR) = E\!\left( \frac{SSTR}{k-1} \right) = \sigma^2 + n \sigma_\tau^2
Unbalanced Design
For unequal sample sizes (i.e. unequal n_i's, an unbalanced design), n is replaced by n₀,
where

    n_0 = \frac{1}{k-1} \left[ \sum_{i=1}^{k} n_i
          - \frac{\sum_{i=1}^{k} n_i^2}{\sum_{i=1}^{k} n_i} \right]
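A sketch of the n₀ computation for hypothetical group sizes, with a sanity check that n₀ reduces to n when the design is balanced.

```python
# n0 for an unbalanced random effects design; group sizes are illustrative.
def n0(n):
    """Effective replication n0 = [sum(n_i) - sum(n_i^2)/sum(n_i)] / (k-1)."""
    k, N = len(n), sum(n)
    return (N - sum(v * v for v in n) / N) / (k - 1)

n0_unbal = n0([7, 5, 6, 7])   # hypothetical unbalanced design
n0_bal = n0([5, 5, 5])        # balanced design: n0 equals n = 5
print(n0_unbal, n0_bal)
```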
The ANOVA Identity
• The ANOVA decomposition of total variability is still valid;
• that is, the ANOVA identity is still SST = SSTR + SSE, as for the fixed effects model,
  and the formulae for computing the sums of squares remain unchanged.
• The computational procedure and construction of the ANOVA table for the random effects
  model are identical to the fixed effects case.
The conclusions, however, are quite different, because they apply to the entire population
of treatments.
Testing
Testing is performed using the same F statistic that we used for the fixed effects model:

    F^* = \frac{MSTR}{MSE}

If F* > F_{α, k−1, N−k}, then reject H₀; otherwise do not reject H₀.
If H₀ is true, then σ_τ² = 0 and the expected F-value is 1. That is,

    E(MSTR) = \sigma^2 + n_0 (0) = \sigma^2,
    \qquad \text{so} \qquad
    \frac{E(MSTR)}{E(MSE)} = \frac{\sigma^2}{\sigma^2} = 1

However, when real variability among the random treatments does exist, that is, when
σ_τ² > 0, then

    \frac{E(MSTR)}{E(MSE)} = \frac{\sigma^2 + n_0 \sigma_\tau^2}{\sigma^2}
    = 1 + \frac{n_0 \sigma_\tau^2}{\sigma^2}
    \quad (= 1 + \text{a positive quantity})

Therefore, the larger the variability among the random treatment effects τ_i, the larger
this ratio becomes.
Unbiased Estimators
The parameters of the one-way random effects model are μ, σ² and σ_τ².
Mean
As in the fixed effects case, we estimate μ by

    \hat{\mu} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}}{N}
              = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot}

Estimation of σ² and σ_τ²
Usually, we also want to estimate the variance components (σ² and σ_τ²) in the model. The
procedure consists of equating the expected mean squares to their observed values in the
ANOVA table and solving for the variance components:

    \hat{\sigma}^2 = MSE

• Since E(MSTR) = σ² + n₀ σ_τ²,

    E\!\left( \frac{MSTR - MSE}{n_0} \right)
      = \frac{(n_0 \sigma_\tau^2 + \sigma^2) - \sigma^2}{n_0} = \sigma_\tau^2,
    \qquad \text{so} \qquad
    \hat{\sigma}_\tau^2 = \frac{MSTR - MSE}{n_0}

• Note that σ̂_τ² ≥ 0 if and only if MSTR ≥ MSE, which is equivalent to F ≥ 1.
• A negative variance estimate σ̂_τ² occurs only if the value of the F statistic is less
  than 1. Obviously the null hypothesis H₀ is not rejected when F ≤ 1. Since a variance
  cannot be negative, a negative variance estimate is replaced by 0. This does not mean
  that σ_τ² is zero; it simply means that there is not enough information in the data to
  get a good estimate of σ_τ².
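A minimal sketch of the method-of-moments estimation just described, using illustrative MSTR and MSE values (n₀ = n for a balanced design); the truncation of a negative estimate at 0 is included.

```python
# Hypothetical balanced-design ANOVA results (illustrative values only).
MSTR, MSE, n0 = 29.73, 1.90, 4

sigma2_hat = MSE                                   # sigma^2_hat = MSE
sigma_tau2_hat = max((MSTR - MSE) / n0, 0.0)       # truncate at 0 if negative
print(sigma2_hat, sigma_tau2_hat)
```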
Confidence Interval for σ²
Since SSE/σ² ~ χ²(N − k),

    \Pr\left( \chi^2_{1-\alpha/2}(N-k) \le \frac{SSE}{\sigma^2}
              \le \chi^2_{\alpha/2}(N-k) \right) = 1 - \alpha

Inverting all three terms in the inequality just reverses the ≤ signs to ≥ signs:

    \Pr\left( \frac{1}{\chi^2_{1-\alpha/2}(N-k)} \ge \frac{\sigma^2}{SSE}
              \ge \frac{1}{\chi^2_{\alpha/2}(N-k)} \right) = 1 - \alpha

    \Rightarrow \Pr\left( \frac{SSE}{\chi^2_{1-\alpha/2}(N-k)} \ge \sigma^2
              \ge \frac{SSE}{\chi^2_{\alpha/2}(N-k)} \right) = 1 - \alpha

so a 100(1 − α)% confidence interval for σ² is

    \left[ \frac{SSE}{\chi^2_{\alpha/2}(N-k)}, \;
           \frac{SSE}{\chi^2_{1-\alpha/2}(N-k)} \right]
It turns out that it is a good bit more complicated to derive a confidence interval for
σ_τ². However, we can more easily find exact CIs for the intra-class correlation
coefficient

    \rho = \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2} = \frac{\sigma_\tau^2}{\sigma_Y^2}

and for the ratio of the variance components θ = σ_τ²/σ².
Confidence Interval for θ = σ_τ²/σ²
Here θ represents the ratio of the between-treatment variance to the within-treatment (or
error) variance. Since

    MSTR \sim (\sigma^2 + n_0 \sigma_\tau^2) \, \frac{\chi^2(k-1)}{k-1}
    \qquad \text{and} \qquad
    MSE \sim \sigma^2 \, \frac{\chi^2(N-k)}{N-k},

an argument similar to the one we used to obtain our CI for σ² gives the 100(1 − α)%
interval [Lower, Upper] for θ, where

    \text{Lower} = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times \frac{1}{F_{\alpha/2,\, k-1,\, N-k}} - 1 \right] = L

    \text{Upper} = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times F_{\alpha/2,\, N-k,\, k-1} - 1 \right]
                 = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times \frac{1}{F_{1-\alpha/2,\, k-1,\, N-k}} - 1 \right] = U
Confidence Interval for ρ = σ_τ²/(σ_τ² + σ²)
The intra-class correlation coefficient ρ = σ_τ²/σ_Y² represents the proportion of the
total variance that is the result of differences between treatments.
Since ρ = θ/(1 + θ), we can transform the endpoints of the interval for θ to get an
interval for ρ:

    1 - \alpha = \Pr[\, L \le \sigma_\tau^2/\sigma^2 \le U \,]
               = \Pr[\, 1 + L \le 1 + \sigma_\tau^2/\sigma^2 \le 1 + U \,]
               = \Pr\left[ 1 + L \le \frac{\sigma^2 + \sigma_\tau^2}{\sigma^2}
                           \le 1 + U \right]
               = \Pr\left[ \frac{1}{1+L} \ge \frac{\sigma^2}{\sigma^2 + \sigma_\tau^2}
                           \ge \frac{1}{1+U} \right]
               = \Pr\left[ 1 - \frac{1}{1+L}
                           \le 1 - \frac{\sigma^2}{\sigma^2 + \sigma_\tau^2}
                           \le 1 - \frac{1}{1+U} \right]
               = \Pr\left[ \frac{L}{1+L} \le \frac{\sigma_\tau^2}{\sigma^2 + \sigma_\tau^2}
                           \le \frac{U}{1+U} \right]

Thus, a 100(1 − α)% confidence interval for ρ is

    \left[ \frac{\text{Lower}}{1 + \text{Lower}}, \;
           \frac{\text{Upper}}{1 + \text{Upper}} \right]
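A sketch of both intervals, using illustrative balanced-design ANOVA results (not the notes' data): first the [L, U] interval for θ, then the transformed interval for ρ.

```python
# Hypothetical balanced design: k = 4 treatments, n = 4 reps, N = 16, n0 = n.
from scipy import stats

MSTR, MSE, k, N, n0 = 29.73, 1.90, 4, 16, 4
alpha = 0.05
ratio = MSTR / MSE

F_up = stats.f.ppf(1 - alpha / 2, k - 1, N - k)   # F_{alpha/2, k-1, N-k}
F_lo = stats.f.ppf(alpha / 2, k - 1, N - k)       # F_{1-alpha/2, k-1, N-k} reciprocal form

L = (ratio / F_up - 1) / n0       # Lower endpoint for theta
U = (ratio / F_lo - 1) / n0       # Upper endpoint for theta
rho_ci = (L / (1 + L), U / (1 + U))
print((L, U), rho_ci)
```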
Example 1:
We are to investigate the formulation of a new synthetic fibre that will be used to
make cloth for shirts. The cotton content varies from 10%–40% by weight (the one
factor is cotton content), and the experimenter chooses 5 levels of this factor: 15%,
20%, 25%, 30%, 35%. The response variable is Y = tensile strength (time to break
when subjected to a stress). There are 5 replicates (complete repetitions of the
experiment). In each replicate five shirts, one with each cotton content, are
randomly chosen from the five populations of shirts. The 25 tensile strengths are
measured in random order.
Tensile Strength Data
Cotton Percentage
15% 20% 25% 30% 35%
7 12 14 19 7
7 17 18 25 10
15 12 18 22 11
11 18 19 19 15
9 18 19 23 11
Does changing the cotton content (level) change the mean strength?
Carry out an analysis of variance (ANOVA) at α = 0.01.
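One way to carry out this fixed effects ANOVA is with `scipy.stats.f_oneway` on the tensile strength data from the table above:

```python
# One-way fixed effects ANOVA for the tensile strength data (alpha = 0.01).
from scipy import stats

strength = {15: [7, 7, 15, 11, 9],
            20: [12, 17, 12, 18, 18],
            25: [14, 18, 18, 19, 19],
            30: [19, 25, 22, 19, 23],
            35: [7, 10, 11, 15, 11]}

F, p = stats.f_oneway(*strength.values())
print(f"F = {F:.2f}, p = {p:.6f}")   # reject H0 at alpha = 0.01 if p < 0.01
```

Here p is far below 0.01, so we reject H₀: changing the cotton content does change the mean tensile strength.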
Example 2
A textile company weaves a fabric on a large number of looms. They would like the looms to be
homogeneous so that they obtain a fabric of uniform strength. The process engineer suspects
that, in addition to the usual variation in strength within samples of fabric from the same loom,
there may also be significant variations in strength between looms. To investigate this, he selects
four looms at random and makes four strength determinations on the fabric manufactured on
each loom. The data are given in the following table:
Observations
Looms 1 2 3 4
1 98 97 99 96
2 91 90 93 92
3 96 95 97 95
4 95 96 99 98
Use α = 0.05.
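A sketch of the random effects analysis for the loom data above: the F-test of H₀: σ_τ² = 0 at α = 0.05, followed by the method-of-moments variance component estimates (n₀ = n = 4 since the design is balanced).

```python
# Random effects one-way ANOVA for the loom strength data (alpha = 0.05).
import numpy as np
from scipy import stats

looms = [np.array([98., 97, 99, 96]),
         np.array([91., 90, 93, 92]),
         np.array([96., 95, 97, 95]),
         np.array([95., 96, 99, 98])]
k, n = len(looms), len(looms[0])
N = k * n
all_y = np.concatenate(looms)
grand = all_y.mean()

SSTR = sum(n * (g.mean() - grand) ** 2 for g in looms)
SSE  = sum(((g - g.mean()) ** 2).sum() for g in looms)
MSTR, MSE = SSTR / (k - 1), SSE / (N - k)

F = MSTR / MSE
F_crit = stats.f.ppf(0.95, k - 1, N - k)
print(f"F = {F:.2f}, F_crit = {F_crit:.2f}")   # reject H0 if F > F_crit

# Variance components (balanced design: n0 = n)
sigma2_hat = MSE
sigma_tau2_hat = (MSTR - MSE) / n
print(f"sigma^2_hat = {sigma2_hat:.3f}, sigma_tau^2_hat = {sigma_tau2_hat:.3f}")
```

Since F greatly exceeds the critical value, we conclude that significant loom-to-loom variability exists, and a sizeable share of the total variance is attributable to differences between looms.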