MBA Quantitative Techniques and Analytics 03

This unit covers various measures of dispersion, including range, quartiles, standard deviation, variance, and mean deviation, explaining their significance in statistical analysis. It aims to help learners understand and assess absolute and relative measures of dispersion, evaluate the range, and calculate quartile and standard deviations. The unit also emphasizes the importance of dispersion in making comparisons between different data sets.

Uploaded by

ravins.chemical

UNIT

03 Measure of Variation

Names of Sub-Units

Different Measures of Dispersion, Range, Quartile and Interquartile Range, Standard Deviation and
Variance.

Overview

The unit begins by explaining the concept of Measures of Dispersion and Significance of Dispersion.
Further, it describes the Range and Standard Deviation. The unit explains the concept of Mean
Deviation and Quartile Deviation. It also discusses the Variance and Coefficient of Variation.

Learning Objectives

In this unit, you will learn to:

• Define dispersion
• Explain the range and interquartile range
• Discuss the meaning of mean deviation
• Explain variance
• Elaborate on the coefficient of variation
JGI JAIN DEEMED-TO-BE UNIVERSITY
Quantitative Techniques and Analytics

Learning Outcomes
At the end of this unit, you would:
• Assess the basis for absolute and relative measures of dispersion
• Evaluate the range
• Appraise the quartile deviation
• Examine the standard deviation and mean deviation
• Assess the coefficient of variation

3.1 INTRODUCTION
Dispersion means the deviation, difference or spread of certain values from their central value. In relation
to a statistical series, it means the deviations of the various items of the series from its central value.
According to A.L. Bowley, “Dispersion is the measure of variation of the items.” There are two types of
measures of dispersion, both of which you will study in this unit. The concept of mean deviation, which
represents the extent to which values deviate from the mean, is also discussed in this unit.
According to Clark and Schkade, average deviation is the average amount of scatter of the items in a
distribution from either the mean or the median, ignoring the signs of the deviations. The average that
is taken of the scatter is an arithmetic mean, which accounts for the fact that this measure is often
called the mean deviation. Mean deviation is used to measure variability across a data series.

3.2 DIFFERENT MEASURES OF DISPERSION


Dispersion is the state of getting dispersed or spread. Statistical dispersion means the extent to which
numerical data is likely to vary about an average value. In other words, dispersion helps to understand
the distribution of the data.
Using different measures of central tendency, you can find out the mean value, but these measures do
not explain the scattering of values near the mid-value in a data series. The measures of dispersion can
be used to study the dispersed values near the mean value. Figure 1 shows the measures of dispersion:

[Figure 1: Measures of Dispersion, divided into Absolute and Relative measures]


The two main types of measures of dispersion in statistics are:
• Absolute Measures of Dispersion: An absolute measure of dispersion contains the same unit as the
original data set. The absolute dispersion method expresses the variations in terms of the average
of deviations of observations, such as standard or mean deviations. It includes:
  ◦ Range
  ◦ Variance
  ◦ Standard deviation
  ◦ Quartile and quartile deviation
  ◦ Mean and mean deviation
• Relative Measures of Dispersion: The relative measures of dispersion are used to compare the
distribution of two or more data sets. This measure compares values without units. Common relative
dispersion methods include:
  ◦ Coefficient of Range
  ◦ Coefficient of Variation
  ◦ Coefficient of Standard Deviation
  ◦ Coefficient of Quartile Deviation
  ◦ Coefficient of Mean Deviation

3.2.1 Significance of Dispersion


Measures of dispersion are also known as the averages of the ‘second order’. This is because, in the precise
study of dispersion, the deviations of the size of items from a measure of central tendency are calculated
(ignoring the signs) and then these deviations are averaged. This averaged deviation or dispersion is
nothing else but the average of the second order. Thus, these second order averages represent the series
and help in comparisons with other similar series. Dispersion is important for the following reasons:
• It makes comparison between different groups possible.
• It serves as a useful check against drawing wrong conclusions from the comparison of averages or
measures of central tendency.
• It has great value in statistical analysis, provided relative measures (coefficients of dispersion) are put
into practice.

3.3 RANGE
Range represents the difference between the highest value and the lowest value in a data series. It is
considered a rough measure of variability because it depends only on the two extreme values of the data
series. When the highest (H) and/or the lowest (L) data point in a data series changes, the range also changes.
The formula used to calculate range is as follows:
Range = (Highest value of data series – Lowest value of data series)
Let us learn to calculate range with the help of the following example in which a group of 17 people
rated a book on a 5-point scale, where 1 is the lowest rating and 5 is the highest rating. The ratings
given by the 17 people are as follows:
2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4
Now, you want to calculate the range for the data series.
To do so, you need to find the highest and lowest values of the data series. In the present case,
Highest value of data series = 5
Lowest value of data series = 1

Therefore, the range would be:

Range = (Highest value of data series – Lowest value of data series)
Range = (5 – 1)
Range = 4

Therefore, the range of the ratings given by 17 people to a book is 4.
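
The range calculation above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `data_range` is an illustrative choice and not part of the unit.

```python
# Ratings given by the 17 people in the example above.
ratings = [2, 5, 3, 4, 1, 5, 4, 3, 1, 2, 5, 4, 3, 2, 1, 5, 4]


def data_range(values):
    """Return the highest value minus the lowest value of a data series."""
    return max(values) - min(values)


print(data_range(ratings))  # 4
```

Because only the two extreme values enter the calculation, changing any rating other than a highest or lowest one leaves the result unchanged.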

3.4 QUARTILE AND INTERQUARTILE RANGE


Quartiles are the three values that divide an ordered list of numbers into four equal parts, or quarters.
The middle quartile marks the central point of the distribution. The lower quartile is the midpoint of the
half of the data that falls below the median, and the upper quartile is the midpoint of the remaining half,
which falls above the median. Taken together, the quartiles depict the distribution or dispersion of the
data set. The three quartiles, first, second and third, are represented by Q1, Q2 and Q3, respectively. Q2 is
nothing but the median; since it indicates the position of an item in the list, it is a positional average.
To find the quartiles of a group of data, we have to arrange the data in ascending order.
• Q1 is the lower quartile, the median of the lower half of the data set.
• Q2 is the median.
• Q3 is the upper quartile, the median of the upper half of the data set.

Suppose we have n items in a data set. Then the quartiles are given by:
Q1 = [(n+1)/4]th item
Q2 = [(n+1)/2]th item
Q3 = [3(n+1)/4]th item
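For grouped data, the formula for the rth quartile can be given by:

$$Q_r = l_1 + \frac{r\left(\frac{N}{4}\right) - c}{f}\,(l_2 - l_1)$$

Where Qr is the rth quartile

• l1 is the lower limit of the quartile class
• l2 is the upper limit of the quartile class
• f is the frequency of the quartile class
• c is the cumulative frequency of the class preceding the quartile class
• N is the total number of observations (the sum of all frequencies)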
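
The grouped-data formula can be sketched in Python as follows. The frequency table used here is hypothetical and chosen only to illustrate the arithmetic; it does not come from the unit. The function name `grouped_quartile` and the tuple layout (lower limit, upper limit, frequency) are likewise assumptions made for this sketch.

```python
def grouped_quartile(classes, r):
    """Return the r-th quartile (r = 1, 2 or 3) of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    """
    total = sum(freq for _, _, freq in classes)  # N
    target = r * total / 4                       # r * N / 4
    cumulative = 0                               # c, frequency before the quartile class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            # This class contains the quartile: apply l1 + ((rN/4 - c) / f) * (l2 - l1).
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("classes must be non-empty and r must be 1, 2 or 3")


# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40.
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
q1 = grouped_quartile(table, 1)
q3 = grouped_quartile(table, 3)
print(q1, round(q3, 2))  # 13.125 27.92
```

For this table N = 30, so Q1 sits at the 7.5th observation, which falls in the 10-20 class with c = 5 and f = 8, giving 10 + (2.5/8) × 10 = 13.125.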

The interquartile range (IQR) is the difference between the upper and lower quartiles of a given data set
and is also called the midspread. It is a measure of statistical dispersion that describes the spread of the
middle half of the data. If Q1 is the first quartile and Q3 is the third quartile, then the IQR formula is
given by:
IQR = Q3 – Q1
Let us understand quartiles with the help of an example.


Example: From the given data, find the quartiles.

4, 6, 7, 8, 10, 23, 34

Solution: Here the numbers are arranged in ascending order and the number of items, n = 7.
Lower quartile, Q1 = [(n+1)/4]th item
Q1 = [(7+1)/4]th item
= 2nd item = 6
Median, Q2 = [(n+1)/2]th item
Q2 = [(7+1)/2]th item
= 4th item = 8
Upper quartile, Q3 = [3(n+1)/4]th item
Q3 = [3(7+1)/4]th item
= 6th item
= 23
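
As a cross-check, the same quartiles can be obtained from Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.quantiles` is available; its `"exclusive"` method places the rth quartile at position r(n+1)/4, which matches the rule used in the example. The IQR line is an extra step that applies the formula IQR = Q3 – Q1 to the same data.

```python
import statistics

data = [4, 6, 7, 8, 10, 23, 34]

# n=4 asks for quartiles; "exclusive" uses the (n+1)-based positions.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 6.0 8.0 23.0
print(iqr)         # 17.0
```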

3.5 STANDARD DEVIATION


Standard deviation is used to measure the scattering of values in a given dataset. The symbol used to
represent standard deviation is sigma (σ). Standard Deviation (SD) is the square root of the variance of
a data series. The average of the squared differences from the mean is called the variance.
S.D. = √Variance
Table 1 shows the formula of standard deviation:

Table 1: Standard Deviation Formula

| Population | Sample |
|---|---|
| $\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$ | $s = \sqrt{\dfrac{\sum (X - \bar{x})^2}{n - 1}}$ |
| X – The value in the data distribution | X – The value in the data distribution |
| μ – The population mean | x̄ – The sample mean |
| N – Total number of observations | n – Total number of observations |

Source: https://www.k2analytics.co.in/

The coefficient of SD can be calculated by dividing the SD by the mean of the series. It is a relative measure
of dispersion.
Let us understand the concepts of SD, the coefficient of SD, and the coefficient of variation with the help
of an example.


Suppose you want to calculate the standard deviation of the weights of five friends. The following table
shows the data used to calculate the standard deviation and the coefficient of standard deviation:

| People | Weight (kg) (Xi) | (Xi – X̄) | (Xi – X̄)² |
|---|---|---|---|
| Jenny | 35 | –3 | 9 |
| Robert | 40 | 2 | 4 |
| Ella | 34 | –4 | 16 |
| Andy | 39 | 1 | 1 |
| Eliza | 42 | 4 | 16 |
| Total | | | Σ(Xi – X̄)² = 46 |

The calculation of standard deviation is as follows:

$$\bar{X} = \frac{35 + 40 + 34 + 39 + 42}{5} = 38$$

$$\sigma = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}} = \sqrt{\frac{46}{5}} = \sqrt{9.2} = 3.033$$

The calculation of coefficient of SD is as follows:
Coefficient of Standard Deviation = SD/X̄
= 3.033/38 = 0.0798
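
The worked example can be checked step by step in Python. This is a minimal sketch that divides by n, as the worked example does (the population form of the formula); the variable names are illustrative.

```python
import math
import statistics

# Weights (kg) of the five friends from the example.
weights = {"Jenny": 35, "Robert": 40, "Ella": 34, "Andy": 39, "Eliza": 42}
values = list(weights.values())

mean = sum(values) / len(values)                       # 38.0
squared_deviations = [(x - mean) ** 2 for x in values]  # 9, 4, 16, 1, 16
sd = math.sqrt(sum(squared_deviations) / len(values))  # divide by n, as in the example
coefficient_of_sd = sd / mean

print(round(sd, 3))                 # 3.033
print(round(coefficient_of_sd, 4))  # 0.0798

# The standard library's population SD agrees with the manual calculation.
assert math.isclose(sd, statistics.pstdev(values))
```

If the five friends were treated as a sample rather than a whole population, `statistics.stdev(values)` would divide by n – 1 instead and give a slightly larger value.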

3.5.1 Mean Deviation


The mean deviation is a statistical measure of the average deviation of the values in a data set from
their mean. The formulas for calculating mean deviation are as follows:

For ungrouped data:

$$\text{M.D. (mean)} = \frac{\sum |x - \mu|}{N}$$

For grouped data:

$$\text{M.D. (mean)} = \frac{\sum f\,|x - \mu|}{N}$$

Where:
Σ represents the addition of values
x represents each value in the data set
µ represents the mean of the data set
f represents the frequency of each value (for grouped data)
N represents the number of data values
| | represents the absolute value, which ignores the “-” symbol

The mean deviation of the data values can be calculated using the following procedure.
Step 1: Find the mean value for the given data values.
Step 2: Subtract the mean value from each of the data values (Note: ignore the minus symbol).
Step 3: Find the mean of the values obtained in step 2.
Example: Determine the mean deviation for the data values 5, 3, 7, 8, 4, 9.
Solution:
The given data values are 5, 3, 7, 8, 4, 9.
Follow the procedure for calculating the mean deviation.
First, find the mean for the given data:
Mean, µ = (5 + 3 + 7 + 8 + 4 + 9)/6
µ = 36/6
µ = 6
Therefore, the mean value is 6.
Now, subtract the mean from each data value, and ignore the minus symbol, if any:
|5 – 6| = 1
|3 – 6| = 3
|7 – 6| = 1
|8 – 6| = 2
|4 – 6| = 2
|9 – 6| = 3
Now, the obtained data set is 1, 3, 1, 2, 2, 3.
Finally, find the mean value for the obtained data set.
Therefore, the mean deviation is
= (1 + 3 + 1 + 2 + 2 + 3)/6
= 12/6 = 2
Hence, the mean deviation for 5, 3, 7, 8, 4, 9 is 2.
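
The three steps of the procedure map directly onto a short Python function. This is a minimal sketch for ungrouped data; the function name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(values):
    """Return the mean of the absolute deviations from the mean (ungrouped data)."""
    mean = sum(values) / len(values)                     # Step 1: find the mean
    absolute_deviations = [abs(x - mean) for x in values]  # Step 2: |x - mean|
    return sum(absolute_deviations) / len(values)        # Step 3: average them


print(mean_deviation([5, 3, 7, 8, 4, 9]))  # 2.0
```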

3.5.2 Quartile Deviation


Quartile deviation is defined as half of the distance between the third and the first quartile. It is also
called the semi-interquartile range. If Q1 is the first quartile and Q3 is the third quartile, then the formula
for quartile deviation is given by:
Quartile deviation = (Q3 – Q1)/2


Example: From the given data, find the quartiles and the quartile deviation.

23, 13, 37, 16, 26, 35, 26, 35

Solution:
Arrange the data in ascending order,
i.e., 13, 16, 23, 26, 26, 35, 35, 37
n = 8
Q1 = [(n+1)/4]th item
Q1 = (8+1)/4 = 9/4
= 2.25th term

From the quartile formula we can write:

Q1 = 2nd term + 0.25(3rd term – 2nd term)
Q1 = 16 + 0.25(23 – 16)
Q1 = 17.75

Similarly,
Q2 = [(n+1)/2]th item
Q2 = (8+1)/2 = 9/2
= 4.5th term
Q2 = 4th term + 0.5(5th term – 4th term)
Q2 = 26 + 0.5(26 – 26)
Q2 = 26

And,
Q3 = [3(n+1)/4]th item
Q3 = 3(8+1)/4 = 6.75th term
Q3 = 6th term + 0.75(7th term – 6th term)
Q3 = 35 + 0.75(35 – 35)
Q3 = 35
Q.D. = (Q3 – Q1)/2
= (35 – 17.75)/2
= 17.25/2
= 8.625
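
The interpolation between neighbouring terms can be written out explicitly in Python. This is a sketch of the [r(n+1)/4]th-item rule under the assumption that the data set has at least three values; the helper name `quartile` is hypothetical. With the 2nd and 3rd terms of the sorted data being 16 and 23, it gives Q1 = 16 + 0.25(23 – 16) = 17.75 and a quartile deviation of 8.625.

```python
def quartile(sorted_values, r):
    """Return the r-th quartile using the [r(n+1)/4]th-item rule.

    Assumes sorted_values is in ascending order and has at least 3 items.
    """
    n = len(sorted_values)
    position = r * (n + 1) / 4      # 1-based position, possibly fractional
    whole = int(position)           # e.g. 2 for the 2.25th term
    fraction = position - whole     # e.g. 0.25 for the 2.25th term
    lower_term = sorted_values[whole - 1]
    if fraction == 0:
        return lower_term
    upper_term = sorted_values[whole]
    # Move the fractional part of the way from the lower term to the upper term.
    return lower_term + fraction * (upper_term - lower_term)


data = sorted([23, 13, 37, 16, 26, 35, 26, 35])
q1, q2, q3 = (quartile(data, r) for r in (1, 2, 3))
quartile_deviation = (q3 - q1) / 2

print(q1, q2, q3)          # 17.75 26.0 35.0
print(quartile_deviation)  # 8.625
```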

3.6 VARIANCE
The variance is a measure of variability. It is calculated by taking the average of squared deviations
from the mean.
Variance tells you the degree of spread in your data set. The more spread the data, the larger the variance
is in relation to the mean. Variance is expressed in squared units (for example, meters squared).
Since the units of variance are the square of those of a typical value of a data set, it is harder to
interpret the variance number intuitively. That is why standard deviation is often preferred as a main
measure of variability. The formulas for calculating variance are as follows:

Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

σ² = population variance
xi = value of ith element
μ = population mean
N = population size

Sample variance:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

s² = sample variance
xi = value of ith element
x̄ = sample mean
n = sample size

The properties of variance are as follows:

• Variance cannot be negative because squared deviations are either positive or zero.
• The variance of a constant value is equal to zero.
• Variance remains invariant when a constant value is added to all the figures in the data set.
• If the values are multiplied by a constant, the variance is scaled by the square of that constant.
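
The last two properties follow directly from the population variance formula. The short derivation below is a sketch in which a and c denote arbitrary constants (symbols introduced here for illustration) and y_i denotes the transformed values.

$$
\begin{aligned}
y_i = x_i + a &\;\Rightarrow\; \mu_y = \mu + a
  \;\Rightarrow\; \sigma_y^2 = \frac{\sum_{i=1}^{N} (x_i + a - \mu - a)^2}{N} = \sigma^2 \\
y_i = c\,x_i &\;\Rightarrow\; \mu_y = c\,\mu
  \;\Rightarrow\; \sigma_y^2 = \frac{\sum_{i=1}^{N} (c\,x_i - c\,\mu)^2}{N} = c^2\,\sigma^2
\end{aligned}
$$

Adding a constant shifts every value and the mean by the same amount, so the deviations are unchanged. Multiplying by a constant scales every deviation by c, so each squared deviation, and hence the variance, is scaled by c². The standard deviation is therefore scaled by |c|.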
Example: Find the variance of the following data using the variance formula: 24, 53, 53, 36, 21, 84, 64, 34,
77, 54
Population Size (N) = 10

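| xi | (xi – x̄) | (xi – x̄)² |
|---|---|---|
| 24 | –26 | 676 |
| 53 | 3 | 9 |
| 53 | 3 | 9 |
| 36 | –14 | 196 |
| 21 | –29 | 841 |
| 84 | 34 | 1156 |
| 64 | 14 | 196 |
| 34 | –16 | 256 |
| 77 | 27 | 729 |
| 54 | 4 | 16 |
| Σxi = 500 | | Σ(xi – x̄)² = 4084 |

Mean: x̄ = Σxi/10 = 500/10 = 50 units
Variance: σ² = Σ(xi – x̄)²/10 = 4084/10 = 408.4 units²

The variance of the given data is 408.4 units².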
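
The same calculation can be sketched in Python. The manual steps mirror the table, and the standard library functions `statistics.pvariance` and `statistics.variance` are used only as a cross-check. The sample variance line is an addition for comparison and is not part of the worked example.

```python
import math
import statistics

data = [24, 53, 53, 36, 21, 84, 64, 34, 77, 54]

mean = sum(data) / len(data)                          # 50.0
squared_deviations = [(x - mean) ** 2 for x in data]  # third column of the table
population_variance = sum(squared_deviations) / len(data)    # divide by N
sample_variance = sum(squared_deviations) / (len(data) - 1)  # divide by n - 1

print(population_variance)        # 408.4
print(round(sample_variance, 2))  # 453.78

# Cross-check against the standard library.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```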


3.6.1 Requisites of a Good Measure of Variation


The requisites of a good measure of variation are as follows:
• A good measure of variation is simple to understand.
• A good measure of variation is easy to compute.
• A good measure of variation is rigidly defined.
• A good measure of variation is based on every item of the distribution.
• A good measure of variation has sampling stability.
• A good measure of variation is not affected by the extreme items.

3.6.2 Coefficient of Variation


The coefficient of variation is a measure of relative variability. It is the ratio of the standard deviation
to the mean (average).
The coefficient of variation is particularly useful when you want to compare results from two different
surveys or tests that have different measures or values, for example, two tests that have different
scoring mechanisms. If sample A has a coefficient of variation of 12% and sample B has a coefficient of
variation of 25%, you would say that sample B has more variation, relative to its mean.
The formula for the coefficient of variation is:
Coefficient of Variation = (Standard Deviation / Mean) × 100
In symbols: CV = (SD/x̄) × 100
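
To illustrate the comparison, the sketch below applies the formula to two data sets that appear earlier in this unit: the five weights from the standard deviation example and the ten values from the variance example. It assumes the population standard deviation (dividing by n), as those worked examples do; the function name `coefficient_of_variation` is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Return (population SD / mean) x 100, as a percentage."""
    return statistics.pstdev(values) / statistics.fmean(values) * 100


weights = [35, 40, 34, 39, 42]                       # SD example
scores = [24, 53, 53, 36, 21, 84, 64, 34, 77, 54]    # variance example

cv_weights = coefficient_of_variation(weights)
cv_scores = coefficient_of_variation(scores)

print(round(cv_weights, 2))  # 7.98
print(round(cv_scores, 2))   # 40.42
```

Although the two series are on different scales, the percentages are directly comparable: the second series varies far more, relative to its mean, than the first.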

3.7 CONCLUSION

• Statistical dispersion means the extent to which numerical data is likely to vary about an average value.
• An absolute measure of dispersion contains the same unit as the original data set.
• The relative measures of dispersion are used to compare the distribution of two or more data sets.
• Measures of dispersion are also known as the averages of the ‘second order’.
• Range represents the difference between the highest value and the lowest value in a data series.
• The quartiles are values that divide a list of numbers into quarters.
• The interquartile range (IQR) is the difference between the upper and lower quartiles of a given data
set and is also called the midspread.
• Standard deviation is used to measure the scattering of values in a given dataset. The symbol used
to represent standard deviation is sigma (σ).
• The mean deviation is a statistical measure of the average deviation of the values in a data set from
their mean.
• Quartile deviation is defined as half of the distance between the third and the first quartile.
• The variance is a measure of variability. It is calculated by taking the average of squared deviations
from the mean.
• The coefficient of variation is particularly useful when you want to compare results from two
different surveys or tests that have different measures or values.

3.8 GLOSSARY

• Statistical dispersion: The extent to which numerical data is likely to vary about an average value
• Variance: A measure of variability, calculated as the average of squared deviations from the mean
• Coefficient of variation: A measure of relative variability. It is the ratio of the standard deviation to
the mean (average)

3.9 CASE STUDY: QUALITY STANDARDS IN A SERVICE SECTOR COMPANY

Case Objective
The case study explains the importance of quality standards.
TPR Inc. was a multi-cuisine restaurant based in India. It had several outlets in the major Indian cities.
The restaurant management wanted to find out if its various outlets were meeting the established
standards of quality and customer service. It hired a consultancy firm for the purpose.
The consultants collected a large volume of data with the help of questionnaires, interviews, and
observations in the restaurant’s outlets. Then, they carefully followed the data processing steps to
analyse it and retrieve relevant and meaningful information from it.
While processing the responses in the questionnaires, they found that quite a large number of
questions were left unanswered. Instead of ignoring such questions, they proceeded systematically.
Each questionnaire comprised a series of interval questions, closed-ended questions and open-ended
questions.
In the case of interval questions, they gave a mid-value to the unanswered questions. In the case of
open-ended questions, they went back to the customers and requested them to fill in the answers.
After retrieving sufficient data from the questionnaires, they classified the collected data. To do so, they
combined customers’ responses from different cities and then sub-grouped them according to their
cities.
Next, they formed a table to analyse the relationship between customers’ satisfaction and the sales of
the company:

Calculating the Correlation between Customer Satisfaction and Sales of the Company

| Observation Number | Customer Satisfaction (Xi) | Sales of Company (Yi) | Xi² | Yi² | XiYi |
|---|---|---|---|---|---|
| 1 | 4 | 5 | 16 | 25 | 20 |
| 2 | 6 | 6 | 36 | 36 | 36 |
| 3 | 7 | 6 | 49 | 36 | 42 |
| 4 | 8 | 4 | 64 | 16 | 32 |
| 5 | 9 | 6 | 81 | 36 | 54 |
| 6 | 10 | 9 | 100 | 81 | 90 |
| 7 | 8 | 10 | 64 | 100 | 80 |
| 8 | 7 | 2 | 49 | 4 | 14 |
| 9 | 1 | 3 | 1 | 9 | 3 |
| 10 | 2 | 4 | 4 | 16 | 8 |
| 11 | 9 | 9 | 81 | 81 | 81 |
| 12 | 8 | 8 | 64 | 64 | 64 |
| 13 | 7 | 9 | 49 | 81 | 63 |
| 14 | 10 | 11 | 100 | 121 | 110 |
| 15 | 6 | 5 | 36 | 25 | 30 |
| 16 | 9 | 12 | 81 | 144 | 108 |
| 17 | 8 | 15 | 64 | 225 | 120 |
| 18 | 10 | 12 | 100 | 144 | 120 |
| 19 | 9 | 16 | 81 | 256 | 144 |
| 20 | 8 | 20 | 64 | 400 | 160 |
| 21 | 10 | 20 | 100 | 400 | 200 |
| 22 | 4 | 6 | 16 | 36 | 24 |
| 23 | 5 | 8 | 25 | 64 | 40 |
| 24 | 10 | 14 | 100 | 196 | 140 |
| 25 | 10 | 19 | 100 | 361 | 190 |
| Total | 185 | 239 | 1525 | 2957 | 1973 |

The correlation between the customers’ satisfaction and the sales of the company is as follows:

$$r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$$

r = (25 × 1973 – 185 × 239) / √[(25 × 1525 – 185 × 185) (25 × 2957 – 239 × 239)]
r = 5110/8095.41
r = 0.63
Since the correlation coefficient is positive (about 0.63), it indicates that the relationship between
the customers’ satisfaction and the sales is positive and reasonably strong. Similarly, the consultants
studied the relationship between different variables, such as quality of service and customer satisfaction,
quality of service and established standards, and so on. Finally, they concluded that the satisfaction level
of the restaurant’s customers was positive and strong. However, the restaurant’s service level was far
behind the established quality standards.
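
The correlation in the case study can be recomputed from the 25 observations in the table. This is a minimal sketch of the sums-based correlation formula; the function name `pearson_r` and the list names are illustrative, and the data are copied from the table columns Xi and Yi.

```python
import math

# Customer satisfaction (Xi) and sales (Yi) for the 25 observations in the table.
satisfaction = [4, 6, 7, 8, 9, 10, 8, 7, 1, 2, 9, 8, 7,
                10, 6, 9, 8, 10, 9, 8, 10, 4, 5, 10, 10]
sales = [5, 6, 6, 4, 6, 9, 10, 2, 3, 4, 9, 8, 9,
         11, 5, 12, 15, 12, 16, 20, 20, 6, 8, 14, 19]


def pearson_r(xs, ys):
    """Correlation coefficient computed from the column totals of the table."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(satisfaction, sales)
print(round(r, 2))  # 0.63
```

The numerator evaluates to 5110 and the denominator to about 8095.41, so r is about 0.63, which rounds to 0.6 at one decimal place.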

Questions
1. What are the different steps of data processing used in the case study?
(Hint: The consultants used all the steps of data processing, that is, first they extracted the relevant
data. Then, they classified and organised the information and studied the relationship between
variables.)
2. Which type of measure is used in analysing the table and what type of analysis is used?
(Hint: The measure of relationship is used to analyse the table.)
3. What was done to unanswered questions of the questionnaires filled by customers?
(Hint: Unanswered questions were not ignored and a systematic procedure was followed to retrieve
sufficient data.)
4. How was the data retrieved from the questionnaires classified?
(Hint: The customers’ responses from different cities were combined and then sub grouped according
to their cities.)
5. How was the relationship between customers’ satisfaction and the sales of the company derived?
(Hint: By forming a table and calculating correlation between customers’ satisfaction and the sales
of the company)

3.10 SELF-ASSESSMENT QUESTIONS

A. Essay Type Questions


1. What is the measure of dispersion?
2. Explain the types of measures of dispersion.
3. Define range.
4. Describe variance and standard deviation.
5. Elaborate the concept of quartiles.

3.11 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS

A. Hints for Essay Type Questions


1. Dispersion is the state of getting dispersed or spread. Statistical dispersion means the extent to
which numerical data is likely to vary about an average value. In other words, dispersion helps to
understand the distribution of the data. Using different measures of central tendency, you can find
out the mean value, but these measures do not explain the scattering of values near the mid-value
in a data series. The measures of dispersion can be used to study the dispersed values near the mean
value. Refer to Section Different Measures of Dispersion
2. There are 2 types of measures of dispersion.
An absolute measure of dispersion contains the same unit as the original data set.
The relative measures of dispersion are used to compare the distribution of two or more data sets.
Refer to Section Different Measures of Dispersion
3. Range represents the difference between the highest value and the lowest value in a data series. It is
considered a rough measure of variability because it depends only on the two extreme values of the
data series. Refer to Section Range
4. Standard deviation is used to measure the scattering of values in a given dataset. The symbol used
to represent standard deviation is sigma (σ). The variance is a measure of variability. It is calculated
by taking the average of squared deviations from the mean. Refer to Section Standard Deviation
5. Quartiles are the three values that divide an ordered list of numerical data into four equal parts. The
middle quartile marks the central point of the distribution. The lower quartile is the midpoint of the
half of the data that falls below the median, and the upper quartile is the midpoint of the remaining
half, which falls above the median. Taken together, the quartiles depict the distribution or dispersion
of the data set. The three quartiles, first, second and third, are represented by Q1, Q2 and Q3,
respectively. Refer to Section Quartile and Interquartile Range

3.12 POST-UNIT READING MATERIAL

• https://www.youtube.com/watch?v=wDAd_QHKoOg
• https://www.youtube.com/watch?v=sOb9b_AtwDg

3.13 TOPICS FOR DISCUSSION FORUMS

• Discuss the difference between mean deviation and standard deviation.
