Module 5 Class

Class Notes by professor for Module 5 of BCS301 CSE


Module - 5: Design of Experiments & ANOVA

Topic 1: Basic Principles of Experimental Design & One-Way ANOVA or CRD

Dr. P. Rajendra

Professor, Dept. of Maths

CMRIT, Bengaluru.

Design of experiments and its basic principles:

The Design of Experiments (DOE) is a structured, methodical approach
used to study the effects of various factors on outcomes. DOE is crucial in
scientific research, engineering, and data science because it helps uncover
cause-and-effect relationships and provides a framework for testing
hypotheses, optimizing processes, and improving quality.
(i). Factors and Levels
Factors: Independent variables or parameters that we test to observe
their impact on the outcome.
Levels: Specific values or settings that a factor can take.

Example
In an image classification model, factors could be:
Learning rate with levels 0.001 and 0.01
Optimizer type with levels SGD and Adam

(ii). Treatments: Specific combinations of factor levels applied in an
experiment. Each unique combination of factor levels constitutes a
treatment.
Example
With factors as:
Learning rate (levels: 0.001, 0.01)
Optimizer type (levels: SGD, Adam)
Possible treatments would be:
Learning rate 0.001 + SGD
Learning rate 0.001 + Adam
Learning rate 0.01 + SGD
Learning rate 0.01 + Adam
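
With two factors at two levels each, a full set of treatments contains 2 × 2 = 4 combinations. A minimal Python sketch of enumerating them follows; the dictionary keys `learning_rate` and `optimizer` are illustrative names chosen here, not part of the notes.

```python
from itertools import product

# Factors and their levels, taken from the example above.
factors = {
    "learning_rate": [0.001, 0.01],
    "optimizer": ["SGD", "Adam"],
}

# Each treatment is one combination of levels, one level per factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for number, treatment in enumerate(treatments, start=1):
    print(number, treatment)
```

Adding a third factor or a further level only requires extending the `factors` dictionary; the number of treatments is the product of the numbers of levels.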

(iii). Response Variable: The outcome or metric we measure in response
to applying treatments.
Example
Testing different data preprocessing methods with the response variable as
model accuracy helps identify the optimal steps.
(iv). Control: Keeping certain variables constant to ensure changes in the
response are due to the treatments.
Example
When testing different model architectures, control for factors like random
seed and number of epochs.

(v). Randomization: Randomly assigning treatment order or
experimental units to minimize bias.
Example
Randomizing the order of model training sessions can prevent order-based
biases.

(vi). Replication: Repeating experiments to ensure results are consistent
and reliable.
Example
Train and evaluate the same model multiple times to average out
performance fluctuations.

(vii). Blocking: Grouping experimental units to account for variability due
to known factors.
Example
Comparing algorithms across different datasets by blocking on dataset
type minimizes external influences.
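
Randomization, replication, and blocking can be combined in a single run schedule. The sketch below is one possible way to do this in Python, under stated assumptions: the function name `blocked_run_order`, the block labels, and the algorithm labels are all hypothetical, and a fixed seed is used only so the schedule can be reproduced.

```python
import random

def blocked_run_order(blocks, treatments, replicates, seed=0):
    """Build a run schedule: every treatment is replicated inside each block,
    and the order of runs within a block is shuffled."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    schedule = []
    for block in blocks:
        # Replication: each treatment appears `replicates` times per block.
        runs = [(block, t, rep) for t in treatments for rep in range(1, replicates + 1)]
        rng.shuffle(runs)  # Randomization: run order within the block
        schedule.extend(runs)
    return schedule

schedule = blocked_run_order(
    blocks=["image data", "text data"],        # hypothetical dataset types
    treatments=["algorithm A", "algorithm B"],  # hypothetical algorithms
    replicates=3,
)
for run in schedule:
    print(run)
```

Because all treatments are compared inside each block, differences between dataset types do not get mixed into the comparison between algorithms.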

Example 1: Plant Growth Experiment
Objective: Understand how sunlight exposure affects plant growth.
Experimental Unit: Individual plants
Treatments: Different sunlight exposure levels
Randomization: Random assignment of plants to sunlight levels
Replication: Multiple plants at each sunlight level
Local Control: Uniform soil conditions across all plants
Example 2: Improving a Recommendation System
Objective: To evaluate how different recommendation algorithms affect user
engagement on a platform.
Experimental Unit: Individual users on the platform.
Treatments: Different recommendation algorithms like Collaborative
Filtering, Content-based Filtering and Hybrid Model
Randomization: Users are randomly assigned to one of the
recommendation algorithms to avoid biases.
Replication: Large groups of users for each algorithm to ensure
reliable results.
Local Control: Users are grouped based on similar engagement levels
to control for existing behavior patterns.
3. Introduction to Analysis of Variance (ANOVA):
A statistical method used to test differences between two or more
sample means.
The term “Analysis of Variance” refers to making inferences about
means by analyzing variance.
Typically applied when multiple sample cases or treatments are
involved, allowing us to determine if samples come from populations
with the same mean.
ANOVA involves two estimates of population variance:
1 Between Samples Variance (Cause Variance or Treatment Variance)
2 Within Samples Variance (Chance Variance or Error Variance)
These two estimates are compared using the F-test:
$$F = \frac{\text{Estimate of population variance based on between-samples variance}}{\text{Estimate of population variance based on within-samples variance}}$$
4. Completely Randomized Design (CRD):
Treatments are assigned completely at random to experimental units,
ensuring each unit has an equal chance of receiving any treatment.
Example: In a plant growth study, plants are randomly assigned one
of three fertilizers, making any growth differences attributable to
fertilizers.
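
A completely randomized assignment can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper `assign_crd`, the plant labels, and the fertilizer labels are hypothetical, and units are dealt out in turn after shuffling so that the group sizes stay balanced.

```python
import random

def assign_crd(units, treatments, seed=0):
    """Randomly assign experimental units to treatments (balanced group sizes)."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    shuffled = list(units)
    rng.shuffle(shuffled)      # every unit is equally likely to land in any position
    # Deal the shuffled units out to the treatments in turn.
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

plants = [f"plant_{i}" for i in range(1, 13)]  # hypothetical unit labels
fertilizers = ["F1", "F2", "F3"]               # hypothetical treatment labels
assignment = assign_crd(plants, fertilizers)

for plant in plants:
    print(plant, "->", assignment[plant])
```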

One-Way ANOVA or CRD:
Examines one factor to determine if there are differences among the
levels of that factor.
Experimental units are randomly assigned to different levels of a
single factor.
Example: To compare mean marks of three student groups, use
one-way ANOVA to check for statistically significant differences.

Steps in One-Way ANOVA:
1. Define the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$, where $k$ is the number of samples (treatments).
2. Let $n_i$ be the number of items in the $i$-th sample, and $N = \sum_i n_i$ the total number of observations.
3. Calculate the sum of observations in each sample, $T_i$, and the grand total $T = \sum_i T_i$.
4. Compute the Correction Factor (CF):
$$CF = \frac{T^2}{N}$$
5. Find the Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - CF$$
6. Calculate the Sum of Squares Between Samples (SST):
$$SST = \sum_i \frac{T_i^2}{n_i} - CF$$
7. The Sum of Squares Within Samples (SSE) is: $SSE = TSS - SST$
8. Degrees of freedom (d.o.f.): for TSS, $N - 1$; for SST, $k - 1$; for SSE, $N - k$.
9. Mean Sum of Squares: for treatments, $S_1^2 = \frac{SST}{k-1}$; for error, $S_2^2 = \frac{SSE}{N-k}$.
10. ANOVA Table: The ANOVA table summarizes the calculations:

Source of Variation   Sum of Squares   d.f.    M.S.S                       F-Ratio
Between Samples       SST              k - 1   $S_1^2 = \frac{SST}{k-1}$   $F = \frac{S_1^2}{S_2^2}$
Within Samples        SSE              N - k   $S_2^2 = \frac{SSE}{N-k}$   -

11. Calculating the F-Ratio: $F = \frac{S_1^2}{S_2^2}$. If the variance within treatments is
greater than the variance between treatments, swap numerator and
denominator, adjusting the degrees of freedom accordingly.
12. Critical F-Value: Obtain from the F-distribution table for $(k - 1, N - k)$
degrees of freedom at the 5% level of significance.
13. Inference: If the calculated F is less than the table F, accept $H_0$: there is no
significant difference between treatments. If the calculated F is greater than
the table F, reject $H_0$: the difference between treatments is significant.
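
The short-cut method in the steps above can be sketched in Python as follows. This is a minimal illustration, not a library routine: the function name `one_way_anova` and the dictionary keys are choices made here, and the sample data are the school scores from Problem 1 that follows.

```python
def one_way_anova(samples):
    """One-way ANOVA by the correction-factor (short-cut) method.

    `samples` is a list of lists, one inner list per treatment; the
    samples may have different sizes.
    """
    k = len(samples)                              # number of treatments
    n_total = sum(len(s) for s in samples)        # N
    grand_total = sum(sum(s) for s in samples)    # T
    cf = grand_total ** 2 / n_total               # CF = T^2 / N
    tss = sum(x * x for s in samples for x in s) - cf
    sst = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = tss - sst
    ms_between = sst / (k - 1)                    # S1^2
    ms_within = sse / (n_total - k)               # S2^2
    return {
        "CF": cf, "TSS": tss, "SST": sst, "SSE": sse,
        "df": (k - 1, n_total - k),
        "F": ms_between / ms_within,
    }

schools = [
    [9, 7, 6, 5, 8],   # School I
    [7, 4, 5, 4, 5],   # School II
    [6, 5, 6, 7, 6],   # School III
]
result = one_way_anova(schools)
print(result)
```

For these scores the sketch gives CF = 540, TSS = 28, SST = 10, SSE = 18 and F ≈ 3.33 on (2, 12) degrees of freedom. The comparison with the critical value is left to an F-table, as in the steps above.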
Problem 1: A test was given to five students taken at random from the
fifth class of three schools of a town. The individual scores are:
School I 9 7 6 5 8
School II 7 4 5 4 5
School III 6 5 6 7 6
Carry out the analysis of variance.
Solution: To carry out ANOVA, we calculate the necessary totals and
sums of squares for each school.

School       Observations   $T_i$      $T_i^2$
School I     9 7 6 5 8      35         1225
School II    7 4 5 4 5      25         625
School III   6 5 6 7 6      30         900
                            T = 90     -

Table of squares of individual observations:

School       $x_{ij}^2$        $\sum_j x_{ij}^2$
School I     81 49 36 25 64    255
School II    49 16 25 16 25    131
School III   36 25 36 49 36    182
                               $\sum_i \sum_j x_{ij}^2 = 568$
Null Hypothesis ($H_0$): $\mu_1 = \mu_2 = \mu_3$, i.e., there is no significant
difference between the performance of schools.
Alternative Hypothesis ($H_1$): Not all of $\mu_1, \mu_2, \mu_3$ are equal.
Level of Significance: $\alpha = 0.05$
Correction factor (C.F.):
$$C.F. = \frac{T^2}{N} = \frac{90^2}{15} = \frac{8100}{15} = 540$$
Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - C.F. = 568 - 540 = 28$$
Sum of Squares Between Schools (SST):
$$SST = \frac{T_1^2}{5} + \frac{T_2^2}{5} + \frac{T_3^2}{5} - C.F. = \frac{2750}{5} - 540 = 10$$
Sum of Squares Due to Error (SSE):
$$SSE = TSS - SST = 28 - 10 = 18$$
ANOVA Table

Source of Variation   d.f.          S.S   M.S.S          F-Ratio
Between Schools       3 - 1 = 2     10    10/2 = 5.0     5.0/1.5 = 3.33
Error                 15 - 3 = 12   18    18/12 = 1.5    -
Total                 15 - 1 = 14   28    -              -

Conclusion:
Table Value: The table value of F for (2, 12) degrees of freedom at the
5% level of significance is 3.8853.
Inference: Since the calculated F (3.33) is less than the table value
(3.8853), we accept $H_0$ and conclude that there is no significant
difference between the performances of the schools.
Problem 2: Three processes A, B, and C are tested to see whether their
outputs are equivalent. The following observations of outputs are made:

A 10 12 13 11 10 14 15 13
B 9 11 10 12 13 - - -
C 11 10 15 14 12 13 - -

Carry out the analysis of variance and state your conclusion.


Solution: To carry out ANOVA, we calculate the necessary totals and
sums of squares for each process.

Process   Observations              $T_i$      $T_i^2$
A         10 12 13 11 10 14 15 13   98         9604
B         9 11 10 12 13             55         3025
C         11 10 15 14 12 13         75         5625
                                    T = 228    -
Table of squares of individual observations:

Process   $x_{ij}^2$                         $\sum_j x_{ij}^2$
A         100 144 169 121 100 196 225 169    1224
B         81 121 100 144 169                 615
C         121 100 225 196 144 169            955
                                             $\sum_i \sum_j x_{ij}^2 = 2794$

Null Hypothesis ($H_0$): The means of the outputs of processes A, B,
and C are equal.
Alternative Hypothesis ($H_1$): The means of the outputs of
processes A, B, and C are not all equal.
Level of Significance: $\alpha = 0.05$
Correction Factor (C.F.):
$$C.F. = \frac{T^2}{N} = \frac{(228)^2}{19} = \frac{51984}{19} = 2736$$
Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - C.F. = 2794 - 2736 = 58$$
Sum of Squares Between Processes (SST):
$$SST = \frac{T_1^2}{8} + \frac{T_2^2}{5} + \frac{T_3^2}{6} - C.F. = \frac{9604}{8} + \frac{3025}{5} + \frac{5625}{6} - 2736$$
$$= 1200.5 + 605 + 937.5 - 2736 = 2743 - 2736 = 7$$
Sum of Squares Due to Error (SSE):
$$SSE = TSS - SST = 58 - 7 = 51$$
ANOVA Table

Source of Variation   d.f.          S.S        M.S.S           F-Ratio
Between Processes     3 - 1 = 2     SST = 7    7/2 = 3.5       3.5/3.19 = 1.10
Error                 19 - 3 = 16   SSE = 51   51/16 = 3.19    -
Total                 19 - 1 = 18   TSS = 58   -               -

Inference: Since the calculated F-ratio (1.10) is less than the table
value $F_{0.05}(2, 16) = 3.63$, we accept the null hypothesis ($H_0$).
Conclusion: There is no significant difference between the outputs of
processes A, B, and C.
Problem 3: Three different kinds of food are tested on three groups of
rats for 5 weeks. The objective is to check the difference in mean weight
(in grams) of the rats per week. Apply one-way ANOVA using a 0.05
significance level to the following data:

Food 1 8 12 19 8 6 11
Food 2 4 5 4 6 9 7
Food 3 11 8 7 13 7 9

Solution: To carry out the ANOVA, we form the following tables:

Food     Observations          Total ($T_i$)   Square ($T_i^2$)
Food 1   8, 12, 19, 8, 6, 11   $T_1 = 64$      $T_1^2 = 4096$
Food 2   4, 5, 4, 6, 9, 7      $T_2 = 35$      $T_2^2 = 1225$
Food 3   11, 8, 7, 13, 7, 9    $T_3 = 55$      $T_3^2 = 3025$
Total                          $T = 154$       -

Food     Squares of Observations      Total of Squares ($\sum_j x_{ij}^2$)
Food 1   64, 144, 361, 64, 36, 121    790
Food 2   16, 25, 16, 36, 81, 49       223
Food 3   121, 64, 49, 169, 49, 81     533
Total                                 $\sum_i \sum_j x_{ij}^2 = 1546$

Correction Factor (CF):
$$CF = \frac{T^2}{N} = \frac{(154)^2}{18} = \frac{23716}{18} = 1317.55$$
Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - CF = 1546 - 1317.55 = 228.45$$
Sum of Squares Between Treatments (SST):
$$SST = \sum_i \frac{T_i^2}{n_i} - CF = \frac{4096}{6} + \frac{1225}{6} + \frac{3025}{6} - 1317.55 = 73.45$$
Sum of Squares Error (SSE):
$$SSE = TSS - SST = 228.45 - 73.45 = 155$$
ANOVA Table (Analysis of Variance)

Source of Variation   d.f.          S.S            Mean of S.S
Between Treatments    3 - 1 = 2     SST = 73.45    $S_1^2 = \frac{73.45}{2} = 36.725$
Error                 18 - 3 = 15   SSE = 155      $S_2^2 = \frac{155}{15} = 10.33$
Total                 18 - 1 = 17   TSS = 228.45   -

F Ratio:
$$F = \frac{S_1^2}{S_2^2} = \frac{36.725}{10.33} = 3.55$$
Critical Value: $F(2, 15)$ at the 0.05 level of significance = 3.68
Since 3.55 < 3.68, the null hypothesis is accepted. There is no significant
difference in the mean weights among the three groups.
Assignment Problems:
(1). A trial was run to check the effects of different diets. Positive
numbers indicate weight loss and negative numbers indicate weight gain.
Check if there is an average difference in the weight of people following
different diets using an ANOVA Table.
Low Fat Low Calorie Low Protein Low Carbohydrate
8 2 3 2
9 4 5 2
6 3 4 -1
7 5 2 0
3 1 3 3
(2). Three types of fertilizers are used on three groups of plants for 5
weeks. We want to check if there is a difference in the mean growth of
each group. Apply a one-way ANOVA test at a significance level of 0.05.
Fertilizer 1 6 8 4 5 3 4
Fertilizer 2 8 12 9 11 6 8
Fertilizer 3 13 9 11 8 7 12
Topic 2: Two-Way ANOVA (Randomized Block Design)

Two-Way ANOVA or Randomized Block Design:
Two-way ANOVA is used when the data are classified on the basis of two
factors. Experimental units are grouped into blocks based on one factor,
and treatments are randomly assigned within each block.
Examples:
. The agricultural output may be classified on the basis of different
varieties of seeds and also on the basis of different varieties of
fertilizers used.
. A business firm may have its sales data classified on the basis of
different salesmen and also on the basis of sales in different regions.
. In a factory, the various units of a product produced during a certain
period may be classified on the basis of different varieties of machines
used and also on the basis of different grades of labour
Designs of Experiments: A two-way design may have repeated
measurements of each factor or may not have repeated values. We shall
now explain the two-way ANOVA technique in the context of both the said
designs with the help of examples.
Steps in Two-Way ANOVA (No Repeated Values)
Step 1: Define the Null Hypothesis
$H_0$: No significant difference between row means or column means.
Step 2: Total Number of Observations
$$N = \sum_i n_i$$
where $n_i$ is the number of items in the $i$-th sample.
Step 3: Calculate Totals
$$T_i = \sum_j x_{ij}, \qquad T = \sum_i T_i = \sum_i \sum_j x_{ij} \quad \text{(Grand Total)}$$
Step 4: Correction Factor
$$CF = \frac{T^2}{N}$$
Step 5: Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - CF$$
Step 6: Sum of Squares for Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - CF$$
where $T_i$ is the total for the $i$-th row.
Step 7: Sum of Squares for Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - CF$$
where $T_j$ is the total for the $j$-th column.
Step 8: Sum of Squares for Error (SSE)
$$SSE = TSS - (SSR + SSC)$$
Step 9: Degrees of Freedom (d.f.)
. For total sum of squares: $N - 1$
. For variance between rows: $r - 1$
. For variance between columns: $c - 1$
. For error: $(r - 1)(c - 1)$
Step 10: Mean Sum of Squares (MS)
Mean square for rows:
$$S_R^2 = \frac{SSR}{r - 1}$$
Mean square for columns:
$$S_C^2 = \frac{SSC}{c - 1}$$
Mean square for error:
$$S_E^2 = \frac{SSE}{(r - 1)(c - 1)}$$
Step 11: Two-Way ANOVA Table

Source of Variation   SS    d.f.             MSS       F-ratio
Between Rows          SSR   r - 1            $S_R^2$   $F_R = S_R^2 / S_E^2$
Between Columns       SSC   c - 1            $S_C^2$   $F_C = S_C^2 / S_E^2$
Residual/Error        SSE   (r - 1)(c - 1)   $S_E^2$   -
Total                 TSS   N - 1            -         -

Step 12: F-Ratio Calculation
F-ratio for rows:
$$F_R = \frac{S_R^2}{S_E^2}$$
F-ratio for columns:
$$F_C = \frac{S_C^2}{S_E^2}$$
Step 13: Conclusion
. If $F_R > F_{\text{table}}$, row means are significantly different.
. If $F_C > F_{\text{table}}$, column means are significantly different.

Important Note: In these two cases, if the numerator variance is less
than the denominator variance, then numerator and denominator should
be interchanged and degrees of freedom should be adjusted accordingly.
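
The steps above translate directly into a short Python sketch for a table with one observation per cell. The function name `two_way_anova` and the dictionary keys are choices made here for illustration; the data are the wheat-yield figures from Problem 2 of this topic (rows are plots, columns are varieties).

```python
def two_way_anova(table):
    """Two-way ANOVA without repeated values (one observation per cell).

    `table` is a list of rows; every row has the same number of columns.
    """
    r, c = len(table), len(table[0])
    n_total = r * c                                            # N
    grand_total = sum(sum(row) for row in table)               # T
    cf = grand_total ** 2 / n_total                            # CF = T^2 / N
    tss = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf         # each row has c items
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf   # each column has r items
    sse = tss - (ssr + ssc)
    ms_error = sse / ((r - 1) * (c - 1))
    return {
        "TSS": tss, "SSR": ssr, "SSC": ssc, "SSE": sse,
        "F_rows": (ssr / (r - 1)) / ms_error,
        "F_cols": (ssc / (c - 1)) / ms_error,
    }

wheat = [
    [6, 5, 5],   # plot 1: varieties A, B, C
    [7, 5, 4],   # plot 2
    [3, 3, 3],   # plot 3
    [8, 7, 4],   # plot 4
]
result = two_way_anova(wheat)
print(result)
```

For this table the sketch returns SSR = 18, SSC = 8, SSE = 6, F for rows = 6 and F for columns = 4, which agrees with the ANOVA table worked out by hand in that problem.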
Coding Method

The coding method is based on an important property of the F-ratio: its value
does not change if all the n item values are multiplied or divided by
a common figure, or if a common figure is added to or subtracted from
each of the given n item values. Through this method big figures are
reduced in magnitude by division or subtraction, and the computation is
simplified without any effect on the F-ratio. This method should be
used especially when the given figures are big or otherwise inconvenient. Once
the given figures are converted with the help of some common value,
all the steps of the short-cut method (for both one-way ANOVA and
two-way ANOVA) stated above can be adopted for obtaining and
interpreting the F-ratio.
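
The invariance that the coding method relies on can be checked numerically. The sketch below is an illustration under assumptions: the helper `f_ratio` is a one-way short-cut computation written here, and the `raw` figures are hypothetical data chosen only because they are inconveniently large.

```python
def f_ratio(samples):
    """One-way ANOVA F-ratio via the correction-factor (short-cut) method."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / n_total
    tss = sum(x * x for s in samples for x in s) - cf
    sst = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = tss - sst
    return (sst / (k - 1)) / (sse / (n_total - k))

# Hypothetical data with large values.
raw = [[1020, 1050, 1030], [1080, 1070, 1100], [1040, 1060, 1050]]
shifted = [[x - 1000 for x in s] for s in raw]   # subtract a common figure
scaled = [[x / 10 for x in s] for s in raw]      # divide by a common figure

print(f_ratio(raw), f_ratio(shifted), f_ratio(scaled))
```

All three calls print the same F-ratio up to floating-point rounding, while the coded figures are much easier to square and add by hand.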
Problem 1: The following data represents the number of units of
production per day turned out by different workers using 4 different types
of machines.
Machine Types
Workers A B C D
1 44 38 47 36
2 46 40 52 43
3 34 36 44 32
4 43 38 46 33
5 38 42 49 39
(1) Test whether the five workers differ with respect to mean productivity.
(2) Test whether the mean productivity is the same for the four different
machine types.
Solution: The Null Hypotheses are given by:
H0 : The 5 workers (row factors) do not differ with respect to mean
productivity.
H0 : The mean productivity is the same for the four different
machines (column factors).
To simplify the calculation, use the coding method: subtract 40 from each
value. The transformed values are:

               Machine Types
Workers        A     B     C     D     Row Total
1              4    -2     7    -4       5
2              6     0    12     3      21
3             -6    -4     4    -8     -14
4              3    -2     6    -7       0
5             -2     2     9    -1       8
Column Total   5    -6    38   -17      20
Correction Factor:
$$C.F. = \frac{T^2}{N} = \frac{(20)^2}{20} = 20$$
Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - C.F. = 594 - 20 = 574$$
Sum of Squares for Rows (SSR):
$$SSR = \sum_i \frac{T_i^2}{n_i} - C.F. = 181.5 - 20 = 161.5$$
Sum of Squares for Columns (SSC):
$$SSC = \sum_j \frac{T_j^2}{n_j} - C.F. = 358.8 - 20 = 338.8$$
Sum of Squares for Error (SSE):
$$SSE = TSS - (SSR + SSC) = 574 - 161.5 - 338.8 = 73.7$$
Degrees of freedom (d.f.):
. Total (TSS): $N - 1 = 19$
. Rows (SSR): $r - 1 = 4$
. Columns (SSC): $c - 1 = 3$
. Error (SSE): $(r - 1)(c - 1) = 12$
Mean Squares:
$$S_R^2 = \frac{SSR}{r - 1} = \frac{161.5}{4} = 40.375$$
$$S_C^2 = \frac{SSC}{c - 1} = \frac{338.8}{3} = 112.93$$
$$S_E^2 = \frac{SSE}{(r - 1)(c - 1)} = \frac{73.7}{12} = 6.14$$
ANOVA Table

Source of Variation      SS      d.f.   MSS      F-ratio
Rows (Workers)           161.5   4      40.375   6.576
Columns (Machine Type)   338.8   3      112.93   18.393
Error                    73.7    12     6.14     -
Total                    574     19     -        -

Conclusion:
. For rows (workers), $F_R = 6.576 > F_{0.05}(4, 12) = 3.26$. Reject $H_0$:
The workers differ in mean productivity.
. For columns (machines), $F_C = 18.393 > F_{0.05}(3, 12) = 3.49$. Reject
$H_0$: The mean productivity is not the same for all machines.
Problem 2: Analyze the following per acre production data for three
varieties of wheat grown on four plots of land to determine:
(i). If the differences between the wheat varieties (column factors) are
significant.
(ii). If the differences between the plots (row factors) are significant.

Varieties
Plot A B C
1 6 5 5
2 7 5 4
3 3 3 3
4 8 7 4

Solution: To carry out the two-way ANOVA, we need the following data
table.

               Varieties
Plot           A     B     C     Row Total
1              6     5     5     16
2              7     5     4     16
3              3     3     3      9
4              8     7     4     19
Column Total   24    20    16    60

Null Hypotheses:
. $H_0$: The mean production for the three wheat varieties does not
differ.
. $H_0$: The mean production for the four plots does not differ.
Correction Factor (C.F.):
$$C.F. = \frac{T^2}{N} = \frac{(60)^2}{12} = 300$$
Total Sum of Squares (TSS):
$$TSS = \sum_i \sum_j x_{ij}^2 - C.F.$$
$$\sum_i \sum_j x_{ij}^2 = 6^2 + 5^2 + 5^2 + 7^2 + 5^2 + 4^2 + 3^2 + 3^2 + 3^2 + 8^2 + 7^2 + 4^2 = 332$$
$$\therefore TSS = 332 - 300 = 32$$
Sum of Squares for Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - C.F.$$
$$\therefore SSR = \frac{16^2}{3} + \frac{16^2}{3} + \frac{9^2}{3} + \frac{19^2}{3} - 300 = 318 - 300 = 18$$
Sum of Squares for Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - C.F.$$
$$\therefore SSC = \frac{24^2}{4} + \frac{20^2}{4} + \frac{16^2}{4} - 300 = 308 - 300 = 8$$
Sum of Squares for Error (SSE)
$$SSE = TSS - (SSR + SSC)$$
$$\therefore SSE = 32 - (18 + 8) = 6$$
ANOVA Table:

Source                SS   d.f.   MSS        F-ratio
Rows (Plots)          18   3      18/3 = 6   6/1 = 6
Columns (Varieties)   8    2      8/2 = 4    4/1 = 4
Error                 6    6      6/6 = 1    -
Total                 32   11     -          -

Conclusion: The critical values are
$F_{0.05}(3, 6) = 4.76$, $F_{0.05}(2, 6) = 5.14$

. For rows (plots), $F_R = 6 > 4.76$: Reject $H_0$. The plots differ
significantly.
. For columns (varieties), $F_C = 4 < 5.14$: Do not reject $H_0$. The wheat
varieties do not differ significantly.
Assignment Problems: (1). Perform ANOVA and test whether there are
differences in the detergent or in the engines for the following data:
Engine
Detergent I II III
A 45 43 51
B 47 46 52
C 48 50 55
D 42 37 49
(2). Three varieties of coal were analyzed by four chemists, and the ash
content in the varieties was found to be as follows:
Chemist
Variety 1 2 3 4
A 8 5 5 7
B 7 6 4 4
C 3 6 5 4
(i). Test whether there are significant differences in the ash content
between the coal varieties.
(ii). Analyze the impact of different chemists on the measurements.
Topic 3: Two-Way ANOVA (With Repetitions)

Two-Way ANOVA Technique: Repeated Values
In a two-way design with repeated measurements for all categories, a
separate independent measure of the smallest or inherent variation
can be obtained. This measure is calculated in the same way as the sum of
squares for variance within samples in one-way ANOVA.
Calculations: The total sum of squares (SS), the SS between columns, and
the SS between rows are determined. The left-over sum of squares and
degrees of freedom are attributed to the 'interaction variation', which
measures the interaction between the two classifications. After these
calculations, the ANOVA table is set up to draw inferences.
Steps in Two-Way ANOVA in case of Repeated Values:
Step 1: Define the Null Hypothesis
$H_0$: No significant difference between row means or column means.
Step 2: Total Number of Observations
$$N = \sum_i n_i$$
where $n_i$ is the number of items in the $i$-th sample.
Step 3: Calculate Totals
$$T_i = \sum_j x_{ij}, \qquad T = \sum_i T_i = \sum_i \sum_j x_{ij} \quad \text{(Grand Total)}$$
Step 4: Correction Factor
$$CF = \frac{T^2}{N}$$
Step 5: Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - CF$$
Step 6: SS for Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - CF$$
Step 7: SS for Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - CF$$
Step 8: SS for Error (SSE)
The error sum of squares is the variation within the cells, computed as for
the within-samples variance in one-way ANOVA:
$$SSE = \sum (x - \bar{x})^2$$
where $\bar{x}$ is the mean of the cell (row-column combination) containing $x$,
and the sum runs over all observations.
Step 9: SS for Interaction (SSI)
$$SSI = TSS - (SSR + SSC + SSE)$$
Step 10: Degrees of Freedom (d.f.)
Total: $N - 1$, Rows: $r - 1$, Columns: $c - 1$
Interaction: $(r - 1)(c - 1)$
Error: $N - rc$
With two observations per cell ($N = 2rc$), the last two become
$\frac{N}{2} - c - r + 1$ and $\frac{N}{2}$ respectively.
Step 11: Mean Sum of Squares
Rows: $S_R^2 = \frac{SSR}{r - 1}$, Columns: $S_C^2 = \frac{SSC}{c - 1}$,
Interaction: $S_I^2 = \frac{SSI}{(r - 1)(c - 1)}$,
Error: $S_E^2 = \frac{SSE}{N - rc}$
Step 12: ANOVA Table

Source        SS    d.f.             MS        F-ratio
Columns       SSC   c - 1            $S_C^2$   $F_C = S_C^2 / S_E^2$
Rows          SSR   r - 1            $S_R^2$   $F_R = S_R^2 / S_E^2$
Interaction   SSI   (r - 1)(c - 1)   $S_I^2$   $F_I = S_I^2 / S_E^2$
Error         SSE   N - rc           $S_E^2$   -
Total         TSS   N - 1            -         -

Step 13: F-ratio Calculations
Rows: $F_R = \frac{S_R^2}{S_E^2}$
Columns: $F_C = \frac{S_C^2}{S_E^2}$
Interaction: $F_I = \frac{S_I^2}{S_E^2}$
Note: If the numerator variance is less than the denominator variance,
interchange them and adjust the degrees of freedom.
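
A compact Python sketch of the repeated-values procedure is given below. It assumes a balanced layout with the same number of observations in every cell; the function name `two_way_anova_replicated` and the dictionary keys are illustrative choices, and the data are the blood-pressure figures from Problem 1 that follows (rows are groups of people, columns are drugs).

```python
def two_way_anova_replicated(cells):
    """Two-way ANOVA with repeated values.

    `cells[i][j]` is the list of observations in row i, column j; every
    cell is assumed to hold the same number of observations.
    """
    r, c, m = len(cells), len(cells[0]), len(cells[0][0])
    n_total = r * c * m
    observations = [x for row in cells for cell in row for x in cell]
    cf = sum(observations) ** 2 / n_total
    tss = sum(x * x for x in observations) - cf
    ssr = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (c * m) - cf
    ssc = sum(sum(sum(cells[i][j]) for i in range(r)) ** 2 for j in range(c)) / (r * m) - cf
    # Error: squared deviations of each observation from its own cell mean.
    sse = sum((x - sum(cell) / m) ** 2 for row in cells for cell in row for x in cell)
    ssi = tss - (ssr + ssc + sse)
    df = {"rows": r - 1, "cols": c - 1, "interaction": (r - 1) * (c - 1), "error": r * c * (m - 1)}
    ms_error = sse / df["error"]
    return {
        "TSS": tss, "SSR": ssr, "SSC": ssc, "SSE": sse, "SSI": ssi, "df": df,
        "F_rows": (ssr / df["rows"]) / ms_error,
        "F_cols": (ssc / df["cols"]) / ms_error,
        "F_interaction": (ssi / df["interaction"]) / ms_error,
    }

blood_pressure = [
    [[14, 15], [10, 9], [11, 11]],   # group A: drugs X, Y, Z
    [[12, 11], [7, 8], [10, 11]],    # group B
    [[10, 11], [11, 11], [8, 7]],    # group C
]
result = two_way_anova_replicated(blood_pressure)
for name, value in result.items():
    print(name, value)
```

Computing the error term from the cell means first, and only then obtaining the interaction as the remainder, mirrors the order of Steps 8 and 9.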
Problem 1: Set up an ANOVA table for the following data relating to
three drugs tested for their effectiveness in reducing blood pressure for
three groups of people:

Drug
Group of people X Y Z
A 14, 15 10, 9 11, 11
B 12, 11 7, 8 10, 11
C 10, 11 11, 11 8, 7

Do the drugs act differently? Are the different groups of people affected
differently? (Use a significance level of 5%).
Solution:
Number of observations in each row: $n_i = 6$
Number of observations in each column: $n_j = 6$
Total number of observations: $N = 18$
Grand total: $T = \sum_i T_i = \sum_j T_j = 187$
Correction factor (CF): $CF = \frac{T^2}{N} = \frac{187 \times 187}{18} = 1942.72$
Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - CF = 2019 - 1942.72 = 76.28$$
Sum of Squares Between Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - CF = \frac{70^2}{6} + \frac{59^2}{6} + \frac{58^2}{6} - 1942.72 = 14.78$$
Sum of Squares Between Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - CF = \frac{73^2}{6} + \frac{56^2}{6} + \frac{58^2}{6} - 1942.72 = 28.77$$
Error Deviations (SSE)
$$SSE = \sum (x - \bar{x})^2 = 3.50$$
Interaction Variations (SSI)
$$SSI = TSS - (SSR + SSC + SSE) = 76.28 - (14.78 + 28.77 + 3.50) = 29.23$$
ANOVA Table

Source of Variation       SS      d.f.   MS       F-ratio
Between Columns (Drugs)   28.77   2      14.385   36.98
Between Rows (People)     14.78   2      7.390    19.0
Interaction               29.23   4      7.308    18.786
Within (Error)            3.50    9      0.389    -
Total                     76.28   17     -        -

All F-ratios are significant at the 5% level.
This indicates that:
Drugs act differently.
Different groups of people are affected differently.
The interaction term is significant.
A significant interaction term means that the effect of a drug depends on
the group of people, so differences between drugs or between groups
should not be interpreted in isolation.
Problem 2: The following table gives the monthly sales (in thousand
rupees) of a certain firm in three states by its four salesmen.
States Salesman A Salesman B Salesman C Salesman D
X 5, 3 4, 5 4, 9 7, 8
Y 7, 3 8, 8 5, 7 4, 5
Z 9, 5 6, 4 6, 3 7, 1
Determine if the differences in sales among the four salesmen are
significant.
Determine if the differences in sales among the three states are
significant. (Use a significance level of 5%.)
Solution:
Total sum: $T = 133$, Total observations: $N = 24$
Correction Factor (CF):
$$CF = \frac{T^2}{N} = \frac{133 \times 133}{24} = 737.04$$
Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - CF = 839 - 737.04 = 101.96$$
Sum of Squares Between Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - CF = \frac{32^2}{6} + \frac{35^2}{6} + \frac{34^2}{6} + \frac{32^2}{6} - 737.04 = 1.13$$
Sum of Squares Between Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - CF = \frac{45^2}{8} + \frac{47^2}{8} + \frac{41^2}{8} - 737.04 = 2.33$$
Error Sum of Squares (SSE)
$$SSE = \sum (x - \bar{x})^2 = 58.5$$
Sum of Squares for Interaction (SSI)
$$SSI = TSS - (SSC + SSR + SSE) = 101.96 - (1.13 + 2.33 + 58.5) = 40.00$$
ANOVA Table

Source of Variation          SS       d.f.   MS      F-ratio
Between Columns (Salesmen)   1.13     3      0.377   0.077
Between Rows (States)        2.33     2      1.165   0.239
Interaction                  40.00    6      6.667   1.368
Within (Error)               58.5     12     4.875   -
Total                        101.96   23     -       -

All F-ratios are compared to the table values at a significance level of
5%: $F_{0.05}(3, 12) = 3.49$, $F_{0.05}(2, 12) = 3.89$, $F_{0.05}(6, 12) = 3.00$.
Since no calculated F-ratio exceeds the critical value, the differences
in sales among the salesmen and among the states are not significant.
Interaction is also not significant.
Topic 4: ANOVA in Latin-Square Design (LSD)

Introduction to Latin-Square Design:
Treatments are allocated so that no treatment occurs more than once
in any one row or any one column.
ANOVA technique splits variance into four parts:
1 Variance between columns.
2 Variance between rows.
3 Variance between varieties.
4 Residual variance (Error).
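
The defining property of a Latin square, each treatment exactly once in every row and every column, can be checked programmatically. The sketch below is illustrative only: the helpers `cyclic_latin_square` and `is_latin_square` are names chosen here, and the cyclic construction is just one simple way to obtain a valid square (in practice rows, columns, and treatment labels would also be randomized).

```python
def cyclic_latin_square(treatments):
    """Build an n x n Latin square by cyclically shifting the treatment list."""
    n = len(treatments)
    return [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    return (
        len(symbols) == n
        and all(len(row) == n and set(row) == symbols for row in square)
        and all(set(col) == symbols for col in zip(*square))
    )

square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))

# Layout of the wheat experiment in Problem 1 of this topic.
wheat_layout = [
    ["C", "B", "A", "D"],
    ["A", "D", "C", "B"],
    ["B", "A", "D", "C"],
    ["D", "C", "B", "A"],
]
print(is_latin_square(square), is_latin_square(wheat_layout))
```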

Steps for ANOVA in Latin-Square Design:


Step 1. Define thePnull hypothesis.
P
Step 2. Find N = ni = nj , the total number of observations.
Step 3. Find the sum of observations in the i-th row (Ti ) and the grand
total is: X X
T = Ti = xij .
Step 4. Compute the correction factor (CF ):
T2
CF = .
N
Dr. P. Rajendra (Professor, Dept. of Maths)Topic 4: ANOVA in Latin-Square Design (LSD) CMRIT, Bengaluru. 2 / 13
Step 5. Find the Total Sum of Squares (TSS):
XX
TSS = (xij )2 − CF .
i j

Step 6. Compute Sum of Squares for Rows (SSR):


X T2
i
SSR = − CF ,
ni
i
where Ti is the sum of observations in the i-th row.
Step 7. Compute Sum of Squares for Columns (SSC):
X Tj2
SSC = − CF ,
nj
where Tj is the sum of observations in the j-th column.
Step 8. Compute Sum of Squares for Variance Between Varieties (SSV or
SSL):
X T2
v
SSL = − CF ,
nv
where Tv is the sum of observations of variety v .
Dr. P. Rajendra (Professor, Dept. of Maths)Topic 4: ANOVA in Latin-Square Design (LSD) CMRIT, Bengaluru. 3 / 13
Step 9. Compute the Sum of Squares for Residual Variance (SSE):
$$SSE = TSS - (SSR + SSC + SSL).$$
Step 10. Degrees of freedom:
For TSS: $N - 1$, SSR: $r - 1$, SSC: $c - 1$, SSL: $v - 1$.
For SSE: $(c - 1)(c - 2)$, where $c = r = v$ in a Latin-square design.
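The residual degrees of freedom in Step 10 can be obtained by subtraction. A sketch of the arithmetic, using only the facts that a $c \times c$ Latin square has $N = c^2$ observations and $r = c = v$:

```latex
\begin{aligned}
\text{d.f.}(SSE) &= (N - 1) - (r - 1) - (c - 1) - (v - 1) \\
                 &= (c^2 - 1) - 3(c - 1) \\
                 &= c^2 - 3c + 2 = (c - 1)(c - 2).
\end{aligned}
```

For a $4 \times 4$ square this gives $15 - 9 = 6$ residual degrees of freedom.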
Step 11. ANOVA Table:
Source of Var    SS     d.f.              MS                            F-Ratio
B/w Columns      SSC    c − 1             $S_C^2 = SSC/(c-1)$           $F_C = S_C^2/S_E^2$
B/w Rows         SSR    r − 1             $S_R^2 = SSR/(r-1)$           $F_R = S_R^2/S_E^2$
B/w Varieties    SSL    c − 1             $S_L^2 = SSL/(c-1)$           $F_L = S_L^2/S_E^2$
Error            SSE    (c − 1)(c − 2)    $S_E^2 = SSE/((c-1)(c-2))$    -
Total            TSS    N − 1             -                             -
Step 12. Compare the calculated F-values ($F_C$, $F_R$, $F_L$) with the tabulated F-value for $(c - 1,\ (c - 1)(c - 2))$ degrees of freedom at the chosen significance level, and draw conclusions about the significance of the variances.
Problem 1: Analyze and interpret the following statistics concerning the
output of wheat per field obtained from an experiment testing four
varieties of wheat (A, B, C, and D) under a Latin-square design.

C 25 B 23 A 20 D 20
A 19 D 19 C 21 B 18
B 19 A 14 D 17 C 20
D 17 C 20 B 21 A 15
Solution:
By the coding method, subtract 20 from all values to simplify calculations:

Row/Column    1       2       3       4       $T_i$    $T_i^2$
1             C 5     B 3     A 0     D 0     8        64
2             A −1    D −1    C 1     B −2    −3       9
3             B −1    A −6    D −3    C 0     −10      100
4             D −3    C 0     B 1     A −5    −7       49
$T_j$         0       −4      −1      −7      T = −12
$T_j^2$       0       16      1       49

Correction Factor (CF)
$$CF = \frac{T^2}{N} = \frac{(-12)^2}{16} = 9.$$
Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - CF = 122 - 9 = 113.$$
Variance Between Rows (SSR)
$$SSR = \sum_i \frac{T_i^2}{n_i} - CF = \frac{64}{4} + \frac{9}{4} + \frac{100}{4} + \frac{49}{4} - 9 = 46.5.$$
Variance Between Columns (SSC)
$$SSC = \sum_j \frac{T_j^2}{n_j} - CF = \frac{0}{4} + \frac{16}{4} + \frac{1}{4} + \frac{49}{4} - 9 = 7.5.$$
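The coded totals and the sums of squares for Problem 1 can be checked with a short script. This sketch also illustrates why coding is harmless: subtracting a constant from every observation shifts the totals but leaves the sums of squares unchanged. The helper name `total_ss` is illustrative.

```python
# Raw wheat yields for Problem 1, read row by row from the problem statement.
raw = [[25, 23, 20, 20],
       [19, 19, 21, 18],
       [19, 14, 17, 20],
       [17, 20, 21, 15]]


def total_ss(table):
    """Total sum of squares: sum of x^2 minus the correction factor T^2 / N."""
    flat = [x for row in table for x in row]
    return sum(x * x for x in flat) - sum(flat) ** 2 / len(flat)


# Coding: subtract the same constant (20) from every observation.
coded = [[x - 20 for x in row] for row in raw]

row_totals = [sum(row) for row in coded]
col_totals = [sum(col) for col in zip(*coded)]

cf = sum(row_totals) ** 2 / 16                   # correction factor on coded data
ssr = sum(t * t for t in row_totals) / 4 - cf    # between rows
ssc = sum(t * t for t in col_totals) / 4 - cf    # between columns

print(row_totals, col_totals, sum(row_totals))
print(total_ss(raw), total_ss(coded), ssr, ssc)
```

The total sum of squares is the same for the raw and the coded table, so the F-ratios computed later do not depend on the constant chosen for coding.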

Variance Between Varieties or Treatments (SSL): Rearranging coded data by variety:
Treatments    Obs. 1    Obs. 2    Obs. 3    Obs. 4    $T_v$
A             −1        −6        0         −5        −12
B             −1        3         1         −2        1
C             5         0         1         0         6
D             −3        −1        −3        0         −7
$$SSL = \sum_v \frac{T_v^2}{n_v} - CF = \frac{144}{4} + \frac{1}{4} + \frac{36}{4} + \frac{49}{4} - 9 = 48.5.$$
Residual Variance (SSE)
$$SSE = TSS - (SSC + SSR + SSL) = 113 - (7.5 + 46.5 + 48.5) = 10.5.$$
Degrees of Freedom:
d.f. for columns: $c - 1 = 3$,
d.f. for rows: $r - 1 = 3$,
d.f. for varieties: $v - 1 = 3$,
d.f. for residuals: $(c - 1)(c - 2) = 6$,
d.f. for total: $N - 1 = 15$.
ANOVA Table:

Source of Variation     SS        d.f.    MS       F-Ratio
Rows                    46.50     3       15.50    8.86
Columns                 7.50      3       2.50     1.43
Varieties/Treatments    48.50     3       16.17    9.24
Residual (Error)        10.50     6       1.75     −
Total                   113.00    15      −        −

The tabulated value at the 5% level is $F_{3,6} = 4.76$. The variances between rows ($F_R = 8.86$) and between varieties ($F_L = 9.24$) are significant ($F > 4.76$). The variance between columns ($F_C = 1.43$) is not significant ($F < 4.76$). Row effects and variety effects influence yield, but column effects do not.
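The F-ratios and the resulting decisions for Problem 1 follow mechanically from the sums of squares. A minimal sketch, taking the sums of squares, the degrees of freedom, and the tabulated value 4.76 from the text above (the variable names are illustrative):

```python
# Sums of squares and degrees of freedom from the Problem 1 ANOVA table.
sums_of_squares = {"rows": 46.5, "columns": 7.5, "varieties": 48.5}
df_effect = 3          # c - 1 for rows, columns and varieties
sse, df_error = 10.5, 6
f_critical = 4.76      # tabulated F(3, 6) used in the text

mse = sse / df_error   # residual mean square
f_ratios = {src: (ss / df_effect) / mse for src, ss in sums_of_squares.items()}
significant = {src: f > f_critical for src, f in f_ratios.items()}

for src, f in f_ratios.items():
    label = "significant" if significant[src] else "not significant"
    print(f"{src:10s} F = {f:.2f}  ({label})")
```

Dividing the unrounded mean squares, rather than their two-decimal roundings, avoids small discrepancies in the last digit of each F-ratio.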

Problem 2: Analyze the variance in the following Latin square of yields of
paddy where A, B, C , D denote the different methods of cultivation:
D 122 A 121 C 123 B 122
B 124 C 123 A 122 D 125
A 120 B 119 D 120 C 121
C 122 D 123 B 121 A 122
Examine whether the different methods of cultivation give significantly
different yields, given $F_{3,6} = 4.76$.
Solution: By the coding method, subtract 120 from all values to simplify
calculations:
Row/Column    1      2       3      4      $T_i$    $T_i^2$
1             D 2    A 1     C 3    B 2    8        64
2             B 4    C 3     A 2    D 5    14       196
3             A 0    B −1    D 0    C 1    0        0
4             C 2    D 3     B 1    A 2    8        64
$T_j$         8      6       6      10     T = 30
$T_j^2$       64     36      36     100
Correction Factor
$$N = 16, \quad CF = \frac{T^2}{N} = \frac{30^2}{16} = 56.25.$$
Total Sum of Squares (TSS)
$$TSS = \sum_i \sum_j x_{ij}^2 - \frac{T^2}{N} = 92 - 56.25 = 35.75.$$
Variance Between Rows (SSR)
$$SSR = \frac{8^2 + 14^2 + 0^2 + 8^2}{4} - 56.25 = 24.75.$$
Variance Between Columns (SSC)
$$SSC = \frac{8^2 + 6^2 + 6^2 + 10^2}{4} - 56.25 = 2.75.$$

Variance Between Varieties or Treatments (SSL): Group Data by Letters

Treatments    Obs. 1    Obs. 2    Obs. 3    Obs. 4    $T_v$
A             1         2         0         2         5
B             2         4         1         −1        6
C             3         3         1         2         9
D             2         5         0         3         10
$$SSL = \sum_v \frac{T_v^2}{n_v} - CF = \frac{5^2 + 6^2 + 9^2 + 10^2}{4} - 56.25 = 4.25.$$
Residual Variance (SSE)
$$SSE = TSS - (SSR + SSC + SSL) = 35.75 - (24.75 + 2.75 + 4.25) = 4.$$
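Grouping the observations by letter is the step most prone to slips when done by hand. The sketch below repeats it for the coded Problem 2 data with a dictionary keyed by method; TSS, SSR, and SSC are taken from the calculations above rather than recomputed, and the variable names are illustrative.

```python
from collections import defaultdict

# Coded Problem 2 data (yield minus 120) and the method grown in each plot.
layout = ["DACB", "BCAD", "ABDC", "CDBA"]
coded = [[2, 1, 3, 2],
         [4, 3, 2, 5],
         [0, -1, 0, 1],
         [2, 3, 1, 2]]

# Group the coded observations by method letter to obtain the totals T_v.
totals = defaultdict(int)
for letters, row in zip(layout, coded):
    for letter, x in zip(letters, row):
        totals[letter] += x

n_obs = 16
cf = sum(totals.values()) ** 2 / n_obs            # T^2 / N with T = 30
ssl = sum(t * t for t in totals.values()) / 4 - cf

# TSS, SSR and SSC as computed in the text above.
tss, ssr, ssc = 35.75, 24.75, 2.75
sse = tss - (ssr + ssc + ssl)

print(dict(totals), ssl, sse)
```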

ANOVA Table:

Source of Variation    SS       d.f.    MS      F-Ratio
Between Columns        2.75     3       0.92    1.375
Between Rows           24.75    3       8.25    12.375
Between Varieties      4.25     3       1.42    2.125
Residual (Error)       4.00     6       0.67    −
Total                  35.75    15      −       −

The F-ratios are computed from the unrounded mean squares.

Critical Value: $F_{3,6} = 4.76$.

Columns: Hypothesis accepted ($F_C = 1.375 < 4.76$). No significant
difference between columns.
Rows: Hypothesis rejected ($F_R = 12.375 > 4.76$). Significant
difference between rows.
Varieties or Treatments: Hypothesis accepted ($F_L = 2.125 < 4.76$).
No significant difference between methods of cultivation.
Overall: Only rows exhibit significant variance.

Assignment Problems:
1. Present your conclusions after doing analysis of variance to the
following results of the Latin-square design experiment conducted in
respect of five fertilizers which were used on plots of different fertility.

A 16 B 10 C 11 D 9 E 9
E 10 C 9 A 14 B 12 D 11
B 15 D 8 E 8 C 10 A 18
D 12 E 6 B 13 A 13 C 12
C 13 A 11 D 10 E 7 B 14
2. Five varieties of paddy are tried. The plan, the varieties shown in each
plot, and the yields obtained (in kg) are given in the following table.
A 95 B 85 C 139 D 117 E 97
E 90 C 89 A 75 B 146 D 87
B 116 D 95 E 92 C 89 A 74
D 85 E 130 B 90 A 81 C 77
C 87 A 65 D 99 E 89 B 93
Test whether there is a significant difference between rows and columns.
Topic 5: Analysis of Covariance (ANCOVA)
(Optional - No problems will be asked in VTU Exam)

Dr. P. Rajendra

Professor, Dept. of Maths

CMRIT, Bengaluru.

Introduction to ANCOVA
ANCOVA is an extension of Analysis of Variance (ANOVA).
Combines regression and ANOVA concepts.
Used to compare group means while controlling for the effect of one
or more covariates.
Purpose: Adjust group means to account for external variability due to
covariates.
Components of ANCOVA
Dependent Variable (Y): The outcome being measured.
Independent Variable (X): The categorical variable representing
groups (e.g., treatment vs. control).
Covariate (Z): A continuous variable included to adjust for its
influence on Y .
Statistical Model:
$$Y = B_0 + B_1 X + B_2 Z + \epsilon$$
where:
$B_0$: Intercept (mean of $Y$ when $X = 0$, $Z = 0$).
$B_1$: Effect of the independent variable.
$B_2$: Effect of the covariate.
$\epsilon$: Residual error.
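To make the model concrete, the coefficients can be estimated by ordinary least squares. The following is a minimal pure-Python sketch under stated assumptions: a two-group factor coded $X = 0$ or $1$, a single covariate $Z$, and hypothetical noise-free data generated from $B_0 = 2$, $B_1 = 3$, $B_2 = 0.5$ so that the fit can be checked by eye. The helper names `solve_linear` and `fit_ancova` are illustrative and do not come from any library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_ancova(x_group, z_cov, y):
    """Least-squares estimates (B0, B1, B2) for Y = B0 + B1*X + B2*Z + error."""
    design = [[1.0, xg, zc] for xg, zc in zip(x_group, z_cov)]
    # Normal equations: (D^T D) beta = D^T y
    dtd = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    dty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]
    return solve_linear(dtd, dty)


# Hypothetical data: X = 0 (control) or 1 (treatment), Z = pretest score.
# Y is generated without noise from B0 = 2, B1 = 3, B2 = 0.5.
x_group = [0, 0, 0, 0, 1, 1, 1, 1]
z_cov = [10, 12, 14, 16, 11, 13, 15, 17]
y = [2 + 3 * xg + 0.5 * zc for xg, zc in zip(x_group, z_cov)]

b0, b1, b2 = fit_ancova(x_group, z_cov, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Here the estimate of $B_1$ is the difference between the two groups after adjusting for the covariate, which is the quantity ANCOVA is designed to compare.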
Why Use ANCOVA?
Adjusts for pre-existing differences among groups.
Reduces error variance, improving the accuracy of group comparisons.
Controls for potential confounding factors.

Example: In an educational study, controlling for pretest scores helps ensure
fair posttest comparisons.
Key Assumptions of ANCOVA
1. Linearity: The relationship between the covariate and dependent
variable is linear.
2. Homogeneity of Slopes: The effect of the covariate is consistent
across groups.
3. Normality: The residuals are normally distributed.
4. Independence: Observations are independent.
5. Homogeneity of Variance: Variance within each group is
approximately equal.
