Stat 408 Analysis of Experimental Design PDF
For example:
• Fish may be collected from three different regions of a lake in order to compare
  their weights across the three locations.
• Children from three different schools may be compared on their performance on
  an achievement test.
• Households from three suburbs are surveyed to compare their incomes and
  political opinions.
For example:
• Twenty plots of carrots are grown in a field. Each plot is randomly allocated to
  one of five fertilizers, with four plots for each fertilizer. At the end of the
  experiment, the carrots from each plot are weighed. The yield of carrots under
  different fertilizers is being studied.
• Twenty children from a class are each randomly assigned to one of five different
  teaching methods, four children to each method. After three weeks of teaching,
  each child is tested for understanding of the material taught. The different
  teaching methods are being compared.
• People with a certain disease are randomly allocated to three different drugs. The
  drugs are being compared for their influence on the progress of the disease.
The goal of a study is to find out the relationships between certain explanatory factors
and response variables.
• An observational study usually can only answer whether there is an association between
  the explanatory factor and the response variable. In general, external evidence is
  required to rule out possible alternative explanations before claiming a cause-and-effect
  relationship.
• Regression models can include both qualitative and quantitative explanatory variables.
  – Regression models assume that there is some sort of linear relationship between the
    quantitative explanatory variables (or transformations of them) and the response.
• Analysis of variance (ANOVA) models assume all explanatory variables (quantitative and
  qualitative) enter the model as qualitative variables.
• There is effectively no difference between ANOVA models and regression models with
  qualitative explanatory variables.
Analysis of Variance
• We must consider the method of analysis when designing a study. The method of analysis
  depends on the nature of the data and the purpose of the study.
• ANOVA is typically used when the effects of one or more explanatory variables are of
  interest.
• The goal of ANOVA is to determine whether there is a difference between the mean
  responses associated with the factor levels or treatments and, if there is a difference,
  to determine its nature.
Basic Concepts
• We shall start with a simple real-life problem that many of us face.
• Nowadays most of us use gas for cooking purposes. Most gas users are customers
  of gas companies.
• The customers get their refills (filled gas cylinders) through the agents of these
  companies.
• One of the customers, Mrs. Mensah, who buys her gas from the ABC gas agent, has faced a
  problem in the recent past.
• She observed that her cylinders were not lasting as long as they used to.
• So she suspected that the amount of gas in the refills was less than what she
  used to get in the past. She knew that she was supposed to get 14.2 kg of gas in every
  refill.
• She explained her problem to the customers' complaints section of the ABC gas
  company.
• Subsequently, the company made a surprise check on this agent.
• They took 25 cylinders that were being supplied to customers from this agency and
  measured the amount of gas in each of these cylinders.
• The 25 observations were statistically analyzed, and through a simple test of hypothesis
  it was inferred that the mean amount of gas in the cylinders supplied by the ABC agent
  was significantly lower than 14.2 kg.
• On investigation, it was revealed that the agent was tapping gas from cylinders before
  they were supplied to the customers.
• There were five agents of the company in the town where Mrs. Mensah was living.
• To protect customers' interests, the company decided to carry out surprise checks on all
  the agents from time to time.
• During each check, they picked up 7 cylinders at random from each of the five agents,
  resulting in the data given in the table below. Is it possible to test from these data
  whether the mean amount of gas per cylinder differs from agent to agent?
• It is possible to carry out a simple test of hypothesis for each of the agents separately,
  but there is a better statistical procedure for testing all of them simultaneously. We
  shall see how this can be done.
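As a sketch of the two approaches just described, the code below runs a separate one-sample t-test for each agent against the nominal 14.2 kg, and then a single simultaneous one-way ANOVA across all five agents. The weights are made-up illustrative values, not the data from the table.

```python
# Illustrative only: simulated weights (kg) for 7 cylinders from each of
# 5 agents; the real data are in the table referenced in the notes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
agents = [rng.normal(loc=m, scale=0.15, size=7)
          for m in (14.2, 14.2, 13.9, 14.2, 14.1)]  # hypothetical means

# Approach 1: a separate one-sample t-test per agent vs the nominal 14.2 kg
for i, w in enumerate(agents, start=1):
    t, p_t = stats.ttest_1samp(w, popmean=14.2)
    print(f"Agent {i}: t = {t:.2f}, p = {p_t:.3f}")

# Approach 2: a single simultaneous test, the one-way ANOVA
F, p = stats.f_oneway(*agents)
print(f"ANOVA: F = {F:.2f}, p = {p:.4f}")
```

The ANOVA answers one question ("do any agent means differ?") with a single test, which avoids the multiple-testing problem of running five separate tests.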
Source of Variation
• You know that variation is inevitable in almost all the variables (measurable
  characteristics) that we come across in practice.
• For example, the amount of gas in two refills is not the same, irrespective of whether
  gas has been tapped or not.
• Consider the data in the table below.
• We have the weights of gas in 35 cylinders taken at random, seven from each of the five
  agents.
• These 35 weights exhibit variation. You will agree that some of the possible reasons for
  this variation are one or more of the following:
  – The gas refilling machine at the company does not fill every cylinder with exactly
    the same amount of gas.
  – There may be a leakage problem in some of the cylinders.
  – The agents might have tapped gas from some of these cylinders.
  – Not all 35 cylinders were filled by the same filling machine.
• Thus, the variation in the 35 weights might have come from different sources.
• Though the variation is attributable to several sources, depending upon the situation,
  we will be interested in analyzing whether most of this variation can be attributed to
  differences in one (or more) of the sources.
For instance, in the above example, the company will be interested in identifying whether
there are any differences among the agents. So the source of variation of interest here is
AGENTS. In other words, we are interested in a one-factor, or one-way, analysis of variance.
• Now that you know what a source of variation is, you can think of different types of
  sources.
• In the gas company example, agents form one type of source.
• If the cylinders under consideration were refilled by different filling machines, then
  filling machines would be another type of source of variation.
When the data are classified with respect to only one type of source of variation, we say
that we have one-way classification data.
In many situations, one conducts experiments to study the effect of a single factor on a
variable under study. Such experiments, known as one-factor experiments, lead to one-way
classification data.
Classification of Data
The process of arranging data into homogeneous groups or classes according to some common
characteristics present in the data is called classification.
For example: in the process of sorting letters in a post office, the letters are classified
according to their cities and further arranged according to streets.
Types of Classification:
(1) One-way classification:
If we classify observed data keeping in view a single characteristic, this type of
classification is known as one-way classification.
(2) Two-way classification:
If we consider two characteristics at a time in order to classify the observed data, then
we are said to have two-way classification.
Single-Factor Experiments
• We generally classify scientific experiments into two broad categories, namely
  single-factor experiments and multifactor experiments.
• Model I: This is a model where the factor levels are fixed by the researcher. Conclusions
  will pertain only to the means associated with each of the fixed factor levels.
• Model II: This is a model where the factor levels are random; that is, the levels are
  randomly selected by the researcher from a population of factor levels. Conclusions will
  extend to the population of factor levels.
Notation
In general, we have a single factor with k ≥ 2 levels (treatments) and n_i replicates for
the i-th treatment.
Model Assumptions
• ε_ij ~ iid N(0, σ²)
• equivalently, the y_ij are independent with y_ij ~ N(μ_i, σ²)
Parameters
The parameters of the model are (μ_1, μ_2, ..., μ_k, σ²).
Estimates
For each level i, we compute the sample mean and an estimate of the variance:

    \bar{y}_{i\cdot} = \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i}, \qquad
    s_i^2 = \frac{\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{n_i - 1}

We combine these s_i² to get an estimate of σ² in the following way.
Pooled Estimate of σ²
The pooled estimate is

    s^2 = \frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{\sum_{i=1}^{k} (n_i - 1)}
        = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2}{N - k}
        = MSE

In the special case that there are an equal number of observations per group (n_i = n),
then N = nk and this becomes

    s^2 = \frac{(n-1) \sum_{i=1}^{k} s_i^2}{nk - k} = \frac{1}{k} \sum_{i=1}^{k} s_i^2,

a simple average of the s_i².
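A minimal numerical check of the pooled estimate, using made-up balanced data: with equal group sizes, the pooled estimate reduces to the simple average of the group variances.

```python
# Illustrative data: k = 3 groups, n = 5 observations each (balanced).
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18]),
          np.array([14., 18, 18, 19, 19])]

s2 = [g.var(ddof=1) for g in groups]                     # per-group s_i^2
num = sum((len(g) - 1) * v for g, v in zip(groups, s2))  # sum (n_i - 1) s_i^2
den = sum(len(g) - 1 for g in groups)                    # N - k
pooled = num / den

print(pooled, np.mean(s2))  # equal n_i: pooled estimate = average of s_i^2
```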
Hypothesis Tests
The hypothesis that all treatments are equally effective becomes H₀: μ_1 = μ_2 = ... = μ_k.
Effects Model:

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, k; \; j = 1, 2, \ldots, n_i

where \sum_{i=1}^{k} \tau_i = 0 (balanced design) or \sum_{i=1}^{k} n_i \tau_i = 0
(unbalanced design).
Parameters
The parameters of the factor effects model are (μ, τ_1, τ_2, ..., τ_k, σ²). There are k + 2
of these. The most popular method of estimation is the method of least squares (LS), which
determines the estimators of μ and the τ_i by minimizing the sum of squares of the errors:

    L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij}^2
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu - \tau_i)^2

We use the "^" (hat) notation to represent least squares estimators, as well as predicted
(or fitted) values.
Minimization of L via partial differentiation (with the zero-sum constraint
\sum_{i=1}^{k} \tau_i = 0) provides the estimates:

    \hat{\mu} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}}{N}
              = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot}
    \qquad \text{and} \qquad
    \hat{\tau}_i = \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}
Proof
Consider the fixed effects one-way ANOVA model

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}

where μ and the τ_i are fixed, but unknown, parameters and the ε_ij's are independent
random errors. The least squares estimators, μ̂ and τ̂_i, of the parameters μ and τ_i are
obtained by minimizing the sum of squares of the errors (the ε_ij's).
We have ε_ij = y_ij − μ − τ_i. Let the sum of squared errors be

    L = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij}^2
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu - \tau_i)^2

A solution can be found by using the normal equations, which are obtained by equating the
partial derivatives to zero and then solving:

    \frac{\partial L}{\partial \hat{\mu}}
      = -2 \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) \qquad (1)

    \frac{\partial L}{\partial \hat{\tau}_i}
      = -2 \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i),
      \qquad i = 1, \ldots, k \qquad (2)

Setting (1) equal to zero:

    -2 \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0
    \;\Rightarrow\;
    \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}
      = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \hat{\mu}
      + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \hat{\tau}_i
    \;\Rightarrow\;
    y_{\cdot\cdot} = N\hat{\mu} + \sum_{i=1}^{k} n_i \hat{\tau}_i \qquad (3)

where N = \sum_{i=1}^{k} n_i.
Setting each of the equations in (2) equal to zero, the least squares estimators τ̂_i for
i = 1, ..., k satisfy

    -2 \sum_{j=1}^{n_i} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0
    \;\Rightarrow\;
    \sum_{j=1}^{n_i} y_{ij} = n_i \hat{\mu} + n_i \hat{\tau}_i,
    \qquad i = 1, \ldots, k

There is no unique solution to these equations, as they are not linearly independent:
summing the k equations in (2) over i reproduces equation (1). To get unique solutions for
μ̂ and the τ̂_i we impose the constraint

    \sum_{i=1}^{k} n_i \hat{\tau}_i = 0

Substituting the constraint into (3) yields y_·· = Nμ̂, or

    \hat{\mu} = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot},

and then each equation in (2) gives τ̂_i = ȳ_i· − μ̂ = ȳ_i· − ȳ··.
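A quick numerical check of the least squares solution, on made-up data: the grand mean estimates μ, the group-mean deviations estimate the τ_i, and the constraint Σ n_i τ̂_i = 0 holds by construction.

```python
# Illustrative check of mu_hat = grand mean, tau_hat_i = group mean - grand mean.
import numpy as np

y = [np.array([7., 7, 15, 11, 9]),
     np.array([12., 17, 12, 18, 18])]

mu_hat = np.concatenate(y).mean()            # ybar..
tau_hat = [g.mean() - mu_hat for g in y]     # ybar_i. - ybar..

# The constraint sum_i n_i * tau_hat_i = 0 is satisfied automatically:
balance = sum(len(g) * t for g, t in zip(y, tau_hat))
print(mu_hat, tau_hat, balance)
```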
Hypothesis Tests
• The cell means model hypotheses were
      H₀: μ_1 = μ_2 = ... = μ_k
      H₁: μ_i ≠ μ_j for at least one pair i, j (not all the μ_i are equal)
• For the factor effects model these translate to
      H₀: τ_1 = τ_2 = ... = τ_k = 0
      H₁: τ_i ≠ 0 for at least one i
Thus, the one-way ANOVA for testing the equality of treatment effects is identical to the
ANOVA for testing the equality of treatment means.
Sample Layout
The typical data layout for a one-way ANOVA is shown below. For each treatment i we record

    y_{i\cdot} = \sum_{j=1}^{n_i} y_{ij} \quad \text{(i-th treatment sample sum)}, \qquad
    \bar{y}_{i\cdot} = \frac{y_{i\cdot}}{n_i} \quad \text{(i-th treatment mean)}

and overall, with N = \sum_{i=1}^{k} n_i observations,

    \bar{y}_{\cdot\cdot} = \frac{y_{\cdot\cdot}}{N} \quad \text{(grand mean)}.

Each observation's deviation from the grand mean can be split into two parts:

    y_{ij} - \bar{y}_{\cdot\cdot}
      = (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) + (y_{ij} - \bar{y}_{i\cdot})

Squaring and summing over all observations,

    \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
    = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
    + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
    + 2 \underbrace{\sum_{i=1}^{k} \sum_{j=1}^{n_i}
        (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{i\cdot})}_{=\,0}

so that

    \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
    = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
    + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2

Expressing the above sums of squares symbolically, we have SST = SSTR + SSE.
SST has N − 1 d.f., SSTR has k − 1 d.f., and SSE has N − k d.f., so we also have a
decomposition of the total d.f.
The degrees of freedom (d.f.) for a sum of squares counts the number of independent pieces
of information that go into that quantification of variability.
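The ANOVA identity above can be verified numerically. The sketch below uses made-up data and checks both SST = SSTR + SSE and the matching d.f. decomposition.

```python
# Illustrative verification of SST = SSTR + SSE on made-up data.
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18]),
          np.array([14., 18, 18, 19, 19])]
all_y = np.concatenate(groups)
grand = all_y.mean()
N, k = len(all_y), len(groups)

SST  = ((all_y - grand) ** 2).sum()                            # total SS
SSTR = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)   # treatment SS
SSE  = sum(((g - g.mean()) ** 2).sum() for g in groups)        # error SS

print(SST, SSTR + SSE)          # the two should agree
print(N - 1, (k - 1) + (N - k)) # d.f. decompose the same way
```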
Notice that

    SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
        = \sum_{i=1}^{k} (n_i - 1) s_i^2

where s_i² is the sample variance within the i-th treatment, so

    MSE = \frac{SSE}{N-k}
        = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 + \cdots + (n_k - 1) s_k^2}
               {(n_1 - 1) + \cdots + (n_k - 1)}
        = \frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{\sum_{i=1}^{k} (n_i - 1)},

which generalizes the pooled estimate s_p² of σ² used when k = 2.
Computational Formulae
We have defined SST, SSTR and SSE as sums of squared deviations. Equivalent formulas for
SST and SSTR for computational purposes are as follows:

    SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
        = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{\cdot\cdot}^2}{N}

    SSTR = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
         = \sum_{i=1}^{k} \frac{y_{i\cdot}^2}{n_i} - \frac{y_{\cdot\cdot}^2}{N}

SSE is computed by subtraction: SSE = SST − SSTR.
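A sketch checking that the computational (shortcut) formulas agree with the definitional sums of squared deviations, again on made-up data.

```python
# Illustrative check: shortcut formulas vs definitional formulas.
import numpy as np

groups = [np.array([7., 7, 15, 11, 9]),
          np.array([12., 17, 12, 18, 18])]
all_y = np.concatenate(groups)
N = len(all_y)
ydd = all_y.sum()                       # y.. (grand total)

# Shortcut formulas
SST_short  = (all_y ** 2).sum() - ydd ** 2 / N
SSTR_short = sum(g.sum() ** 2 / len(g) for g in groups) - ydd ** 2 / N
SSE = SST_short - SSTR_short            # by subtraction

# Definitional formulas
grand = all_y.mean()
SST_def  = ((all_y - grand) ** 2).sum()
SSTR_def = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)

print(SST_short, SST_def, SSTR_short, SSTR_def)
```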
Mean Squares
The ratios of sums of squares to their degrees of freedom result in mean squares.
• MSTR, the treatment mean square, is defined as MSTR = SSTR/(k − 1).
• MSE, the mean square error, is defined as MSE = SSE/(N − k).
In general,

    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i \tau_i^2}{k-1}
    \qquad \text{or equivalently} \qquad
    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i (\mu_i - \mu)^2}{k-1}

where

    \mu = \frac{n_1 \mu_1 + n_2 \mu_2 + \cdots + n_k \mu_k}{N}
        = \sum_{i=1}^{k} \frac{n_i \mu_i}{N}
    \qquad \text{and} \qquad \tau_i = \mu_i - \mu,

and

    E(MSE) = \sigma^2
The F-test
Since E(MSE) = σ² and

    E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{k} n_i \tau_i^2}{k-1},

if H₀: μ_1 = μ_2 = ... = μ_k (equivalently H₀: τ_1 = τ_2 = ... = τ_k = 0) is true, then
Σ n_i τ_i² = 0 and E(MSTR) = σ² + 0 = σ².
• Therefore, if H₀ is true, F = MSTR/MSE should be close to 1.
• However, when H₀ is false, it can be shown that MSTR estimates something larger than σ²
  (i.e. E(MSTR) > E(MSE) when some treatment means are different, that is, when real
  treatment effects exist).
• That is, if MSTR/MSE >> 1, then it makes sense to reject H₀.
• Therefore, to determine whether H₀ is true or not, we look at how much larger than 1
  MSTR/MSE is.
For pairwise comparisons following a significant F-test, two treatment means ȳ_i· and ȳ_j·
are declared significantly different if |ȳ_i· − ȳ_j·| exceeds the least significant
difference, where

    LSD = t_{\alpha/2,\, N-k} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}
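A minimal sketch of the LSD computation. The MSE, sample sizes, and α below are illustrative values, not results from the notes' data.

```python
# Hypothetical ANOVA results: MSE from a design with N = 25, k = 5,
# equal group sizes n_i = n_j = 5, tested at alpha = 0.05.
import math
from scipy import stats

MSE, N, k = 8.06, 25, 5
ni = nj = 5
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df=N - k)      # t_{alpha/2, N-k}
LSD = t_crit * math.sqrt(MSE * (1 / ni + 1 / nj))
print(round(LSD, 3))

# Declare means i and j different if |ybar_i. - ybar_j.| > LSD
```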
Random Effects Model
• A random effects factor is one that has many possible levels, where the interest is in
  the variability of the response over the entire population of levels, but we include only
  a random sample of levels in the experiment.
• The factor levels are meant to be representative of a general population of possible
  levels. We are interested in whether that factor has a significant effect in explaining
  the response, but only in a general way. For example, we are not interested in a detailed
  comparison of level 2 vs. level 3, say.
The mathematical representation of the model is the same as for the fixed effects model:

    y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, \ldots, k; \; j = 1, \ldots, n_i

where the y_ij, τ_i and ε_ij are random variables and μ is an unknown fixed parameter, the
overall mean.
Model Assumptions
1. The ε_ij's (random errors) come independently from a N(0, σ²) distribution
   [i.e. ε_ij ~ iid N(0, σ²)].
2. The random effects τ_i's are independent random variables with the same N(0, σ_τ²)
   distribution [i.e. we assume that τ_1, τ_2, ..., τ_k ~ iid N(0, σ_τ²)].
Variance Components
• In the random effects model, the variance of y_ij is no longer just σ². The equation for
  y_ij now has two random variables on the right: there is the residual unexplained
  variability σ² as before, plus the variability from randomly selecting τ_i from a
  N(0, σ_τ²) distribution.
The two variances σ_τ² and σ² are called variance components (or components of variance),
as the variance of one observation is

    \operatorname{Var}(y_{ij}) = \sigma_\tau^2 + \sigma^2

These two components may be estimated from the MS column of the ANOVA table.
Hypotheses
For the random effects model, testing the hypothesis that the individual treatment effects
are zero is meaningless. It is more appropriate to test hypotheses about the variance of
the τ_i. Since we are interested in the bigger population of treatments, the hypotheses of
interest associated with the random effects τ_i are:

    H₀: σ_τ² = 0  vs  H₁: σ_τ² > 0

• If σ_τ² = 0, then all random treatment effects are identical, but
• if σ_τ² > 0, significant variability exists among the randomly selected treatments (that
  is, the variability observed among the randomly selected treatments is significantly
  larger than the variability that can be attributed to random error).
• Under the alternative hypothesis σ_τ² > 0, and for n_i = n, the expected value of MSTR
  (the mean square for treatments) is

    E(MSTR) = E\!\left( \frac{SSTR}{k-1} \right) = \sigma^2 + n \sigma_\tau^2
Unbalanced Design
For unequal sample sizes (i.e. unequal n_i's, an unbalanced design), n is replaced by n₀,
where

    n_0 = \frac{1}{k-1} \left[ \sum_{i=1}^{k} n_i
          - \frac{\sum_{i=1}^{k} n_i^2}{\sum_{i=1}^{k} n_i} \right]
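A sketch of the n₀ computation for hypothetical group sizes, with a sanity check that n₀ reduces to n when the design is balanced.

```python
# n0 for an unbalanced random effects design; group sizes are illustrative.
def n0(n):
    """Effective replication n0 = [sum(n_i) - sum(n_i^2)/sum(n_i)] / (k-1)."""
    k, N = len(n), sum(n)
    return (N - sum(v * v for v in n) / N) / (k - 1)

n0_unbal = n0([7, 5, 6, 7])   # hypothetical unbalanced design
n0_bal = n0([5, 5, 5])        # balanced design: n0 equals n = 5
print(n0_unbal, n0_bal)
```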
The ANOVA Identity
• The ANOVA decomposition of total variability is still valid;
• that is, the ANOVA identity is still SST = SSTR + SSE, as for the fixed effects model,
  and the formulae for computing the sums of squares remain unchanged.
• The computational procedure and construction of the ANOVA table for the random effects
  model are identical to the fixed effects case.
The conclusions, however, are quite different, because they apply to the entire population
of treatments.
Testing
Testing is performed using the same F statistic that we used for the fixed effects model:

    F^* = \frac{MSTR}{MSE}

If F* > F_{α, k−1, N−k}, then reject H₀; otherwise do not reject H₀.
If H₀ is true, then σ_τ² = 0 and the expected F-value is 1. That is,

    E(MSTR) = \sigma^2 + n_0 (0) = \sigma^2,
    \qquad \text{so} \qquad
    \frac{E(MSTR)}{E(MSE)} = \frac{\sigma^2}{\sigma^2} = 1

However, when real variability among the random treatments does exist, that is, when
σ_τ² > 0, then

    \frac{E(MSTR)}{E(MSE)} = \frac{\sigma^2 + n_0 \sigma_\tau^2}{\sigma^2}
    = 1 + \frac{n_0 \sigma_\tau^2}{\sigma^2}
    \quad (= 1 + \text{a positive quantity})

Therefore, the larger the variability among the random treatment effects τ_i, the larger
this ratio becomes.
Unbiased Estimators
The parameters of the one-way random effects model are μ, σ² and σ_τ².
Mean
As in the fixed effects case, we estimate μ by

    \hat{\mu} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}}{N}
              = \frac{y_{\cdot\cdot}}{N} = \bar{y}_{\cdot\cdot}

Estimation of σ² and σ_τ²
Usually, we also want to estimate the variance components (σ² and σ_τ²) in the model. The
procedure consists of equating the expected mean squares to their observed values in the
ANOVA table and solving for the variance components:

    \hat{\sigma}^2 = MSE

• Since E(MSTR) = σ² + n₀ σ_τ²,

    E\!\left( \frac{MSTR - MSE}{n_0} \right)
      = \frac{(n_0 \sigma_\tau^2 + \sigma^2) - \sigma^2}{n_0} = \sigma_\tau^2,
    \qquad \text{so} \qquad
    \hat{\sigma}_\tau^2 = \frac{MSTR - MSE}{n_0}

• Note that σ̂_τ² ≥ 0 if and only if MSTR ≥ MSE, which is equivalent to F ≥ 1.
• A negative variance estimate σ̂_τ² occurs only if the value of the F statistic is less
  than 1. Obviously the null hypothesis H₀ is not rejected when F ≤ 1. Since a variance
  cannot be negative, a negative variance estimate is replaced by 0. This does not mean
  that σ_τ² is zero; it simply means that there is not enough information in the data to
  get a good estimate of σ_τ².
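A minimal sketch of the method-of-moments estimation just described, using illustrative MSTR and MSE values (n₀ = n for a balanced design); the truncation of a negative estimate at 0 is included.

```python
# Hypothetical balanced-design ANOVA results (illustrative values only).
MSTR, MSE, n0 = 29.73, 1.90, 4

sigma2_hat = MSE                                   # sigma^2_hat = MSE
sigma_tau2_hat = max((MSTR - MSE) / n0, 0.0)       # truncate at 0 if negative
print(sigma2_hat, sigma_tau2_hat)
```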
Confidence Interval for σ²
Since SSE/σ² ~ χ²(N − k),

    \Pr\left( \chi^2_{1-\alpha/2}(N-k) \le \frac{SSE}{\sigma^2}
              \le \chi^2_{\alpha/2}(N-k) \right) = 1 - \alpha

Inverting all three terms in the inequality just reverses the ≤ signs to ≥ signs:

    \Pr\left( \frac{1}{\chi^2_{1-\alpha/2}(N-k)} \ge \frac{\sigma^2}{SSE}
              \ge \frac{1}{\chi^2_{\alpha/2}(N-k)} \right) = 1 - \alpha

    \Rightarrow \Pr\left( \frac{SSE}{\chi^2_{1-\alpha/2}(N-k)} \ge \sigma^2
              \ge \frac{SSE}{\chi^2_{\alpha/2}(N-k)} \right) = 1 - \alpha

so a 100(1 − α)% confidence interval for σ² is

    \left[ \frac{SSE}{\chi^2_{\alpha/2}(N-k)}, \;
           \frac{SSE}{\chi^2_{1-\alpha/2}(N-k)} \right]
It turns out that it is a good bit more complicated to derive a confidence interval for
σ_τ². However, we can more easily find exact CIs for the intra-class correlation
coefficient

    \rho = \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma^2} = \frac{\sigma_\tau^2}{\sigma_Y^2}

and for the ratio of the variance components θ = σ_τ²/σ².
Confidence Interval for θ = σ_τ²/σ²
Here θ represents the ratio of the between-treatment variance to the within-treatment (or
error) variance. Since

    MSTR \sim (\sigma^2 + n_0 \sigma_\tau^2) \, \frac{\chi^2(k-1)}{k-1}
    \qquad \text{and} \qquad
    MSE \sim \sigma^2 \, \frac{\chi^2(N-k)}{N-k},

an argument similar to the one we used to obtain our CI for σ² gives the 100(1 − α)%
interval [Lower, Upper] for θ, where

    \text{Lower} = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times \frac{1}{F_{\alpha/2,\, k-1,\, N-k}} - 1 \right] = L

    \text{Upper} = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times F_{\alpha/2,\, N-k,\, k-1} - 1 \right]
                 = \frac{1}{n_0} \left[ \frac{MSTR}{MSE}
                   \times \frac{1}{F_{1-\alpha/2,\, k-1,\, N-k}} - 1 \right] = U
Confidence Interval for ρ = σ_τ²/(σ_τ² + σ²)
The intra-class correlation coefficient ρ = σ_τ²/σ_Y² represents the proportion of the
total variance that is the result of differences between treatments.
Since ρ = θ/(1 + θ), we can transform the endpoints of the interval for θ to get an
interval for ρ:

    1 - \alpha = \Pr[\, L \le \sigma_\tau^2/\sigma^2 \le U \,]
               = \Pr[\, 1 + L \le 1 + \sigma_\tau^2/\sigma^2 \le 1 + U \,]
               = \Pr\left[ 1 + L \le \frac{\sigma^2 + \sigma_\tau^2}{\sigma^2}
                           \le 1 + U \right]
               = \Pr\left[ \frac{1}{1+L} \ge \frac{\sigma^2}{\sigma^2 + \sigma_\tau^2}
                           \ge \frac{1}{1+U} \right]
               = \Pr\left[ 1 - \frac{1}{1+L}
                           \le 1 - \frac{\sigma^2}{\sigma^2 + \sigma_\tau^2}
                           \le 1 - \frac{1}{1+U} \right]
               = \Pr\left[ \frac{L}{1+L} \le \frac{\sigma_\tau^2}{\sigma^2 + \sigma_\tau^2}
                           \le \frac{U}{1+U} \right]

Thus, a 100(1 − α)% confidence interval for ρ is

    \left[ \frac{\text{Lower}}{1 + \text{Lower}}, \;
           \frac{\text{Upper}}{1 + \text{Upper}} \right]
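A sketch of both intervals, using illustrative balanced-design ANOVA results (not the notes' data): first the [L, U] interval for θ, then the transformed interval for ρ.

```python
# Hypothetical balanced design: k = 4 treatments, n = 4 reps, N = 16, n0 = n.
from scipy import stats

MSTR, MSE, k, N, n0 = 29.73, 1.90, 4, 16, 4
alpha = 0.05
ratio = MSTR / MSE

F_up = stats.f.ppf(1 - alpha / 2, k - 1, N - k)   # F_{alpha/2, k-1, N-k}
F_lo = stats.f.ppf(alpha / 2, k - 1, N - k)       # F_{1-alpha/2, k-1, N-k} reciprocal form

L = (ratio / F_up - 1) / n0       # Lower endpoint for theta
U = (ratio / F_lo - 1) / n0       # Upper endpoint for theta
rho_ci = (L / (1 + L), U / (1 + U))
print((L, U), rho_ci)
```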
Example 1:
We are to investigate the formulation of a new synthetic fibre that will be used to
make cloth for shirts. The cotton content varies from 10%–40% by weight (the one
factor is cotton content), and the experimenter chooses 5 levels of this factor: 15%,
20%, 25%, 30%, 35%. The response variable is Y = tensile strength (time to break
when subjected to a stress). There are 5 replicates (complete repetitions of the
experiment). In each replicate five shirts, one with each cotton content, are
randomly chosen from the five populations of shirts. The 25 tensile strengths are
measured in random order.
Tensile Strength Data
Cotton Percentage
15% 20% 25% 30% 35%
7 12 14 19 7
7 17 18 25 10
15 12 18 22 11
11 18 19 19 15
9 18 19 23 11
Does changing the cotton content (level) change the mean strength?
Carry out an analysis of variance (ANOVA) at α = 0.01.
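One way to carry out this fixed effects ANOVA is with `scipy.stats.f_oneway` on the tensile strength data from the table above:

```python
# One-way fixed effects ANOVA for the tensile strength data (alpha = 0.01).
from scipy import stats

strength = {15: [7, 7, 15, 11, 9],
            20: [12, 17, 12, 18, 18],
            25: [14, 18, 18, 19, 19],
            30: [19, 25, 22, 19, 23],
            35: [7, 10, 11, 15, 11]}

F, p = stats.f_oneway(*strength.values())
print(f"F = {F:.2f}, p = {p:.6f}")   # reject H0 at alpha = 0.01 if p < 0.01
```

Here p is far below 0.01, so we reject H₀: changing the cotton content does change the mean tensile strength.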
Example 2
A textile company weaves a fabric on a large number of looms. They would like the looms to be
homogeneous so that they obtain a fabric of uniform strength. The process engineer suspects
that, in addition to the usual variation in strength within samples of fabric from the same loom,
there may also be significant variations in strength between looms. To investigate this, he selects
four looms at random and makes four strength determinations on the fabric manufactured on
each loom. The data are given in the following table:
Observations
Looms 1 2 3 4
1 98 97 99 96
2 91 90 93 92
3 96 95 97 95
4 95 96 99 98
Use α = 0.05.
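A sketch of the random effects analysis for the loom data above: the F-test of H₀: σ_τ² = 0 at α = 0.05, followed by the method-of-moments variance component estimates (n₀ = n = 4 since the design is balanced).

```python
# Random effects one-way ANOVA for the loom strength data (alpha = 0.05).
import numpy as np
from scipy import stats

looms = [np.array([98., 97, 99, 96]),
         np.array([91., 90, 93, 92]),
         np.array([96., 95, 97, 95]),
         np.array([95., 96, 99, 98])]
k, n = len(looms), len(looms[0])
N = k * n
all_y = np.concatenate(looms)
grand = all_y.mean()

SSTR = sum(n * (g.mean() - grand) ** 2 for g in looms)
SSE  = sum(((g - g.mean()) ** 2).sum() for g in looms)
MSTR, MSE = SSTR / (k - 1), SSE / (N - k)

F = MSTR / MSE
F_crit = stats.f.ppf(0.95, k - 1, N - k)
print(f"F = {F:.2f}, F_crit = {F_crit:.2f}")   # reject H0 if F > F_crit

# Variance components (balanced design: n0 = n)
sigma2_hat = MSE
sigma_tau2_hat = (MSTR - MSE) / n
print(f"sigma^2_hat = {sigma2_hat:.3f}, sigma_tau^2_hat = {sigma_tau2_hat:.3f}")
```

Since F greatly exceeds the critical value, we conclude that significant loom-to-loom variability exists, and a sizeable share of the total variance is attributable to differences between looms.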