MCSL044 Section 3 CRC
3.0 Introduction
3.1 Objectives
3.2 ANOVA Test
3.2.1 One-Way Classification
3.2.2 Two-Way Classification
3.3 Summary
3.4 Exercises
3.5 Solutions/Answers
3.0 INTRODUCTION
By now you must have become familiar with hypothesis testing based on test
statistics such as the t-test and the F-test in the earlier sessions. (Please refer to
Section 2, Book 3 or BCS 040 for details.) Recall that you learned to test the
significance of the difference between two sample means earlier. In addition to this,
there are situations in which we are interested in testing the significance of the
difference among several means, or equivalently the equality of more than two means.
For example, an industrial manufacturing unit may be interested in testing the quality
of welding done by workers who work in three different shifts, viz. morning,
evening and night. In order to assess the quality of welding carried out by these
workers, data is collected by floor managers using an advanced imaging technique.
The goal is to test for a difference in the average welding quality. In other
words, we seek an answer to the query: Is there a significant difference in the average
welding quality of the workers who work in the three shifts? Notice that whereas
a t-test can compare only two means at a time, ANOVA tests the
hypothesis concerning differences between two or more means. An advantage of using
ANOVA rather than multiple t-tests is that it reduces the probability of a Type I error.
ANOVA is a technique that works by partitioning the total sum of squares into
components associated with the terms of the model under consideration. It may further
be noted that ANOVA is “concerned not with analyzing the variances, but with analyzing
the variation in means.” It is recommended that you revise BCS 040 Unit 8 before
starting the sections below, as we analyse the data using the Excel
tool Data Analysis ToolPak.
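The partitioning mentioned above can be written out for the one-way case. The following is a sketch in the usual textbook notation, which is assumed here rather than taken from this unit: $y_{ij}$ is the $j$th observation under the $i$th of $k$ treatments, $n_i$ is the number of observations under treatment $i$, $N = \sum_i n_i$, $\bar{y}_{i.}$ is the mean of treatment $i$ and $\bar{y}_{..}$ is the grand mean.

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{..}\right)^2}_{\text{Total SS}}
=
\underbrace{\sum_{i=1}^{k} n_i\left(\bar{y}_{i.}-\bar{y}_{..}\right)^2}_{\text{Between Groups SS}}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{i.}\right)^2}_{\text{Within Groups SS}},
\qquad
F = \frac{\text{Between Groups SS}/(k-1)}{\text{Within Groups SS}/(N-k)}
```

The labels under the braces correspond to the row names that Excel prints in its single-factor ANOVA output.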
3.1 OBJECTIVES
After going through this unit you will be able to:
3.2 ANOVA
Analysis of variance is a technique due to Sir Ronald Fisher which can address
questions such as the one mentioned in the example above. It makes use of
the F statistic you learned earlier.
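3.2.1 One-way Classification
The one-way classified data obtained in an experiment has the following layout, with
$n_i$ observations under the $i$th treatment:
Treatment 1    $y_{11}$  $y_{12}$  …  $y_{1n_1}$
Treatment 2    $y_{21}$  $y_{22}$  …  $y_{2n_2}$
…
Treatment k    $y_{k1}$  $y_{k2}$  …  $y_{kn_k}$
Recall the linear mathematical model for ANOVA you studied in section 8.3, Unit 8
in the form
$y_{ij} = \mu + \alpha_i + e_{ij}$, $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n_i$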
Based on the tabulated data, the company desires to investigate whether there is any
significant difference in the average number of streamers sold by the dealers.
Some of the dealers are making efforts to promote their sales. To promote the
sales of the streamers, one of the dealers has appointed four salesmen. These
salesmen are guided to visit five localities of the same town randomly in a month
and sell the product.
The locality-wise sales record of each salesman is tabulated below:

DEALER - 1 : LOCALITY-WISE SALES RECORD OF SALESMEN

             LOCALITY 1   LOCALITY 2   LOCALITY 3   LOCALITY 4   LOCALITY 5
SALESMAN-1   22           33           9            31           18
SALESMAN-2   13           23           13           11           8
SALESMAN-3   7            15           4            24           15
SALESMAN-4   31           44           13           31           23

As a study, the dealer wants to test whether the salesmen differ in their ability of
salesmanship, and whether the locality has any influence on the sales of streamers.

ANALYSIS

Based on the case study, objectives are identified with respect to the company and
the dealer.
In the additive model, $\alpha_i$ is termed the additional effect of the $i$th
treatment, and the hypothesis to be tested is $H_0: \mu_1 = \mu_2 = \dots = \mu_k$, or
equivalently $H_0: \alpha_i = 0$, $i = 1, 2, \dots, k$. Now, we demonstrate how to use
the Excel tool to perform a test of this hypothesis.
Steps:
1. Tabulate the DEALER SALES RECORD as given above in an Excel spreadsheet
(see the screenshot below).
2. Click DATA TAB → DATA ANALYSIS → ANOVA: Single Factor → OK
“For activation and usage of the Data Analysis ToolPak, refer to the earlier unit on
Correlation and Regression. The snapshots are readily available there.”
However, we give some of the relevant screenshots here.
Notice from the screenshot above that, while selecting the cells, only the row labels
are included (the column labels are excluded). This is because, in the case of one-way
classification, the data is classified by only one factor, viz. the dealers in the
present case. Data on the sales volume of the different dealers is recorded row-wise,
with the dealer names entered in the first column. Thus, we check the option “Labels in
First Column”. Further, the level of significance Alpha for the test is by default set at
0.05, i.e. 5%, which can be altered to another level (for example 0.01, i.e. 1%) as per
requirement. We also have to specify the output cell address where the results are to
be placed. Following is the result of the procedure discussed above.
Anova: Single Factor

SUMMARY
Groups       Count   Sum   Average   Variance
DEALER -1    5       209   41.8      213.2
DEALER -2    4       157   39.25     158.9167
DEALER -3    6       228   38        374
DEALER -4    5       167   33.4      46.8

ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Between Groups        184.2     3    61.4       0.290072   0.831921   3.238872
Within Groups         3386.75   16   211.6719
Total                 3570.95   19
Now refer to Table 3 on page 12 of BCS 040, Block 3, Unit 8 (ANOVA) for a
comparison of the results. Notice that the form of that table matches the
column headings shown above.
“F crit” in the Excel output is the critical value of the F-distribution at the stated
level of significance, which can be obtained from the table. The P-value for the
calculated value of the F-statistic is also generated in the Excel output.
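The ANOVA rows of the Excel output can be cross-checked from the SUMMARY rows alone. The following is a minimal sketch in Python, assuming only the counts, sums and variances printed above; the function name `one_way_anova_from_summary` and the dictionary keys are hypothetical names chosen for illustration. It does not compute the P-value or F crit, since those need an F-distribution routine that is not in the Python standard library.

```python
# One-way ANOVA table rebuilt from group summaries: (count, sum, variance).
# The figures are copied from the SUMMARY part of the Excel output above.
summary = {
    "DEALER -1": (5, 209, 213.2),
    "DEALER -2": (4, 157, 158.9167),
    "DEALER -3": (6, 228, 374.0),
    "DEALER -4": (5, 167, 46.8),
}


def one_way_anova_from_summary(groups):
    """Return SS, df, MS and F for a one-way layout given per-group summaries."""
    k = len(groups)
    n_total = sum(count for count, _, _ in groups.values())
    grand_mean = sum(total for _, total, _ in groups.values()) / n_total

    # Between groups: weighted squared distance of each group mean
    # from the grand mean.
    ss_between = sum(
        count * (total / count - grand_mean) ** 2
        for count, total, _ in groups.values()
    )
    # Within groups: each sample variance times its own degrees of freedom.
    ss_within = sum((count - 1) * var for count, _, var in groups.values())

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "df_between": df_between,
        "ms_between": ms_between,
        "ss_within": ss_within,
        "df_within": df_within,
        "ms_within": ms_within,
        "ss_total": ss_between + ss_within,
        "df_total": n_total - 1,
        "f": ms_between / ms_within,
    }


anova = one_way_anova_from_summary(summary)
for name, value in anova.items():
    print(f"{name:>12}: {value:.4f}")
```

The printed sums of squares, degrees of freedom, mean squares and F value should agree with the Between Groups, Within Groups and Total rows of the Excel table up to rounding.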
Data Interpretation
A test of the hypothesis, in the present case, can be carried out based on either of the
following two approaches (see chapter 7, Book 3).
Calculated value of the F-statistic
Based on the calculated value of the F-statistic, the rule of thumb is “if the
calculated value of the F-statistic is less than the critical value of F, i.e. F crit, at
the desired level of significance, do not reject the null hypothesis, else reject
the null hypothesis”
P-Value
Based on the P-Value, the thumb rule is “if the P-Value is less than the desired
level of significance, reject the null hypothesis, else do not reject the null
hypothesis”
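The two thumb rules above can be written as two small functions. This is an illustrative sketch only: the function names `decide_by_f` and `decide_by_p` are hypothetical, the inputs are the F, F crit and P-value figures from the single-factor output above, and the significance level of 0.05 is assumed to be the Excel default used for that run.

```python
def decide_by_f(f_value, f_crit):
    """F rule: do not reject H0 when the calculated F is below F crit."""
    return "do not reject H0" if f_value < f_crit else "reject H0"


def decide_by_p(p_value, alpha):
    """P-value rule: reject H0 when the P-value is below the significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# Figures taken from the Anova: Single Factor output; alpha = 0.05 is assumed.
print(decide_by_f(f_value=0.290072, f_crit=3.238872))
print(decide_by_p(p_value=0.831921, alpha=0.05))
```

Both rules are applied at the same significance level, so they lead to the same decision for a given table.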
Analyze the summary statistics of the ANOVA: Single Factor table given above and
answer the following:
3) Compare the value of the F statistic with the critical value of F. Use the
corresponding thumb rule and comment on the Null Hypothesis constructed to study the
company objective i.e., to test “whether there is significant difference between the
average number of streamers sold by the dealers.”
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
4) Use the thumb rule for the P-Value and comment on the Acceptance or Non-Acceptance
of the Null Hypothesis laid down for the study of the company objective.
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
3.2.2 Two-way Classification
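The two-way classified data obtained in an experiment has the following layout with
one observation in each cell:
Treatment 1    $y_{11}$  $y_{12}$  …  $y_{1n}$
Treatment 2    $y_{21}$  $y_{22}$  …  $y_{2n}$
…
Treatment k    $y_{k1}$  $y_{k2}$  …  $y_{kn}$
The linear mathematical model for ANOVA in this case is of the form
$y_{ij} = \mu + \alpha_i + \beta_j + e_{ij}$, $i = 1, 2, \dots, k$ and $j = 1, 2, \dots, n$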
Now, let us extend our discussion to the two-way, or two-factor, ANOVA test. Based on
the case analysis, the company has a single objective to study and the dealer has two,
which are to be tested simultaneously. Thus, a two-factor ANOVA test is to be performed
for the dealer.
Dealer Objective
1. To Test “whether the salesmen differ in their ability of salesmanship”
2. To Test “whether the locality has any influence on the sales of streamers.”
From the objectives, we identify that to apply the ANOVA test we need to establish
two null hypotheses, $H_{01}$ and $H_{02}$, which are to be tested simultaneously. For
$H_{01}$, let $\mu$ be the grand mean and $\alpha_i$ be that part of the observation
$y_{ij}$ due to the $i$th salesman, and for $H_{02}$, let $\beta_j$ be that part of
$y_{ij}$ due to the $j$th locality. Thus, the null hypotheses to be tested are
$H_{01}: \alpha_i = 0$ for all $i$ and $H_{02}: \beta_j = 0$ for all $j$.
Now, we will learn how to use Excel to test these hypotheses.
Data analysis through Excel
Perform Following Steps:
1. Tabulate the DEALER SALES RECORD as given above in an Excel
spreadsheet.
2. Click DATA TAB → DATA ANALYSIS → Anova: Two-Factor Without Replication → OK
“For activation and usage of the Data Analysis ToolPak, refer to the earlier unit on
Correlation and Regression. The snapshots are readily available there.”
However, we give some of the relevant screenshots here.
Recall that the details concerning alpha, labels and output range have already been
discussed for the single-factor ANOVA test. We now proceed to analyze the results
obtained when OK is clicked, and conclude whether or not to reject the formulated
hypotheses.
ANOVA
Source of Variation   SS       df   MS            F             P-value       F crit
Rows                  829.2    3    276.4         8.015466409   0.003373221   3.490295
Columns               867.8    4    216.95        6.291445143   0.005736052   3.259167
Error                 413.8    12   34.48333333
Total                 2110.8   19
Note:
F crit is the critical value of the F-distribution at the respective level of
significance, which you can get from the table mentioned earlier (Appendix,
Unit 11). In addition, the P-value is generated by Excel for additional data
interpretation.
Notice that the column headings of the ANOVA table are the same as earlier, except
for the additional row for the columns, which in the present case corresponds to the
localities.
We can interpret the result based on the value of F or the P-value, as stated earlier,
for both the hypotheses.
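The Rows, Columns and Error lines of the two-factor table can be reproduced from the salesman-by-locality sales record given earlier. The following is a minimal sketch in plain Python, assuming one observation per cell; the function name `two_way_anova` and the dictionary keys are hypothetical names used for illustration. P-values and F crit are left out because they need an F-distribution routine that is not part of the standard library.

```python
# Sales record: one row per salesman, one column per locality (1 to 5).
sales = {
    "SALESMAN-1": [22, 33, 9, 31, 18],
    "SALESMAN-2": [13, 23, 13, 11, 8],
    "SALESMAN-3": [7, 15, 4, 24, 15],
    "SALESMAN-4": [31, 44, 13, 31, 23],
}


def two_way_anova(table):
    """Two-factor ANOVA without replication (one observation per cell)."""
    data = list(table.values())
    r, c = len(data), len(data[0])          # number of rows and columns
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / (r * c)  # correction factor

    ss_total = sum(x * x for row in data for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in data) / c - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*data)) / r - correction
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_rows": df_rows,
        "df_cols": df_cols,
        "df_error": df_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


result = two_way_anova(sales)
for name, value in result.items():
    print(f"{name:>9}: {value:.4f}")
```

Here the rows are the salesmen and the columns are the localities, so `f_rows` relates to the first dealer objective and `f_cols` to the second.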
Analyze the summary statistics of the Anova: Two-Factor Without Replication table
given above and Answer the following:
3) Compare the F value with the Critical value of F. Use the thumb rule for F Value
and comment on the Acceptance or Rejection of Null Hypothesis H01 laid for the
study of dealer objective i.e. to Test “whether the salesmen differ in their ability
of salesmanship.”
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
4) Compare the F value with the Critical value of F. Use the thumb rule for F Value
and comment on the Acceptance or Rejection of Null Hypothesis H02 laid for the
study of dealer objective i.e. to Test “whether the locality has any influence on the
sales of streamers.”
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
5) Use the thumb rule for P-Value and comment on the Acceptance or Rejection of
Null Hypothesis laid for the study of dealer objective.
.........................................................................................................................................
.........................................................................................................................................
.........................................................................................................................................
3.3 SUMMARY
The practical sessions covered in this unit enabled you to use the Data
Analysis ToolPak for ANOVA tests. They also enriched your understanding by
correlating the concepts you studied in BCS 040 with their practical implementation
in MS Excel. It is important to understand that MS Excel, or any other software,
merely enables the user to get the standard results and tables without actually
writing the formulas, a task that would require complete knowledge of
the mathematical expressions involved. The interpretation of results
generated through any such software is the sole task of the user, for which a complete
appraisal of the problem is necessary and hence unavoidable. It has been our effort in
this unit (and earlier) to explain the concept of data analysis through suitable examples,
computation and interpretation, which by no means is the end. The journey of
data analysis and interpretation is yet to begin against this background.
3.4 EXERCISES
Exercise 1
One important factor in selecting software for word processing and database
management systems is the time required to learn how to use a particular system. In
order to evaluate three database management systems, a firm devised a test to see how
many training hours were needed for five of its word processing operators to become
proficient in each of the three systems.
System A 16 19 14 13 18 hours
System B 16 17 13 12 17 hours
System C 24 22 19 18 22 hours
Using a 5% significance level, investigate whether there are any differences between
the training times needed for the three systems.
Perform the analysis using the Excel concepts you learned in this course.
3.5 SOLUTIONS
Check Your Progress -1
1) Alpha, i.e. $\alpha = 0.05$, so the level of confidence is 95%
2) Critical value of F (F crit) = 3.238872
3) Because F = 0.290072 and F crit = 3.238872. Since F < F crit, we do not reject
the Null Hypothesis.
4) P-value = 0.831921, which is more than the desired significance level, so we do not
reject the null hypothesis
Check Your Progress -2
1) Alpha, i.e. $\alpha = 0.05$, thus the level of confidence is 95%
2) Critical values of F (F crit) = 3.490295 and 3.259167 respectively
3) F = 8.015466; F crit = 3.490295; Since F > F crit, we Reject the Null Hypothesis
4) F = 6.291445; F crit = 3.259167; Since F > F crit, we Reject the Null Hypothesis
5) Since the P-values (0.003373 and 0.005736) are less than Alpha, we Reject the Null
Hypotheses
Exercise 1
Total 175.3 15 - 1 = 14
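The "Total" row above can be checked directly from the Exercise 1 data. This is a minimal sketch in plain Python; the variable names are hypothetical, and it covers only the total sum of squares and its degrees of freedom, not the other rows of the ANOVA table.

```python
# Training hours from Exercise 1, one list per system.
training_hours = {
    "System A": [16, 19, 14, 13, 18],
    "System B": [16, 17, 13, 12, 17],
    "System C": [24, 22, 19, 18, 22],
}

values = [h for hours in training_hours.values() for h in hours]
n = len(values)
grand_mean = sum(values) / n

# Total sum of squares: squared deviations of every observation
# from the grand mean.
ss_total = sum((h - grand_mean) ** 2 for h in values)
df_total = n - 1

print(f"Total SS = {ss_total:.1f}, df = {df_total}")
```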