Measures of Dispersion

Measures of dispersion

Uploaded by

gathungwadavis2
Copyright
© © All Rights Reserved
Quantitative Techniques

The mode can be used when considering the popularity of given attributes e.g. the most popular
car in a given town.
A distribution can have more than one mode. When there are two modes, it is said to be bi-modal.
The mode is denoted Xm.
Example
What is the mode in the following data set? 2, 11, 25, 11, 2, 5, 17, 38, 25, 17, 25, 13
Xm = 25 (since it appears 3 times, more often than any other value)
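
The count behind this answer can be checked with a short Python sketch. It is only an illustration using the standard library; the variable names are chosen here and are not part of the text.

```python
from collections import Counter
from statistics import multimode

# Data set from the example above
data = [2, 11, 25, 11, 2, 5, 17, 38, 25, 17, 25, 13]

counts = Counter(data)      # how many times each value appears
modes = multimode(data)     # list of all values with the highest count

print("Mode(s):", modes)
print("Times the mode appears:", counts[modes[0]])
```

`multimode` returns a list, so a bi-modal data set would simply give two values.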

Mode for grouped data


The mode for grouped data falls in the modal class. After the modal class is determined, the
mode is found as:

$$X_m = L_m + \frac{d_1}{d_1 + d_2} \times i$$

Where Lm = The lower limit of the modal class
d1 = The difference between the frequency of the modal class and that of the class just before it
d2 = The difference between the frequency of the modal class and that of the class just after it
i = The class width of the modal class

Example
Compute the modal mark in the business statistics class last semester
Mark        Frequency
30 – 40     3
40 – 50     10
50 – 60     19
60 – 70     31
70 – 80     11
Modal class: 60 – 70, with a frequency of 31

$$X_m = 60 + \frac{(31 - 19)}{(31 - 19) + (31 - 11)} \times 10 = 60 + \frac{12}{32} \times 10$$

Xm = 63.75
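
The same calculation can be sketched in Python. This is a minimal illustration, assuming equal class widths and a single modal class; the function name `grouped_mode` is chosen here for the example.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode for grouped data: Lm + d1 / (d1 + d2) * i."""
    # Index of the modal class (the class with the highest frequency)
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    # Neighbouring frequencies; treated as 0 when the modal class is at an end
    before = frequencies[m - 1] if m > 0 else 0
    after = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    d1 = frequencies[m] - before
    d2 = frequencies[m] - after
    return lower_limits[m] + d1 / (d1 + d2) * width

# Marks table from the example above
lower_limits = [30, 40, 50, 60, 70]
frequencies = [3, 10, 19, 31, 11]

modal_mark = grouped_mode(lower_limits, frequencies, 10)
print(modal_mark)
```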

ii) Measures of dispersion

Measures of central tendency give us values that may be considered typical of the samples or populations from which they are computed.
Measures of dispersion tell us how closely or how widely the observed values are spread around the average. They show the extent to which such values differ from the average value (usually the mean). When observed values are close to the mean, we say there is low dispersion; when they lie far from it, dispersion is high.
Descriptive Statistics 85

Dispersion is also known as spread, scatter or variation. Some of the most commonly used
measures of dispersion are the range, the variance and the standard deviation.

Range

It is the difference between the highest and the lowest values in a data set.
Therefore, R = Xmax – Xmin
The range is the simplest measure of dispersion because it uses only two values. It is most useful
as a quick check in cases where values change erratically.
Example
What is the range in the following exchange rates of the shilling to the US dollar?
75, 74, 77, 68, 69, 70, 73, 74, 68.5, 75.5, 69, 78.5, 70
Range = Max – Min
= 78.5 – 68
= 10.5
Example
The following data shows salaries earned by the top management of Kabete International Ltd.
105,000; 2,000,000; 300,000; 250,000; 120,000; 350,000; 130,000
Range = Max – Min
= 2,000,000 – 105,000
= 1,895,000
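
Both examples can be reproduced with a few lines of Python. The sketch below is illustrative only; the helper name `value_range` is an assumption made here. The last line shows how strongly a single extreme salary drives the result.

```python
def value_range(values):
    """Range R = Xmax - Xmin of a non-empty collection of numbers."""
    return max(values) - min(values)

exchange_rates = [75, 74, 77, 68, 69, 70, 73, 74, 68.5, 75.5, 69, 78.5, 70]
salaries = [105_000, 2_000_000, 300_000, 250_000, 120_000, 350_000, 130_000]

rate_range = value_range(exchange_rates)
salary_range = value_range(salaries)

# Range of the salaries once the single largest value is left out
range_without_top = value_range([s for s in salaries if s != max(salaries)])

print(rate_range, salary_range, range_without_top)
```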

Weakness
Range depends on only two values. This means that it can be influenced by extreme values that
may be considered to be outliers.
It does not give an indication as to how the values are spread in a distribution.
To overcome this weakness, we use the inter-quartile range (IR).
The inter-quartile range is the difference between the upper quartile and the lower quartile.
IR = Q3 – Q1
Q3 – The value below which ¾ (75%) of the observations lie and above which the remaining ¼ (25%) lie.
Q1 – The value below which ¼ (25%) of the observations lie and above which the remaining ¾ (75%) lie.

Quartiles, Deciles, percentiles

Another way to describe variation in data is to determine the location of the values that divide a set
of observations into equal parts. These values include the median, quartiles, deciles and
percentiles.
Quartiles

They divide an ordered data set into four equal parts. The first quartile Q1 is the value below which
25% of the observations lie and Q3 is the value below which 75% of the observations lie. When
computing quartiles for grouped data, the first step is to locate the quartile class. The location of
the quartile is found as:
Location of Qj = (n+1) j/4, where Qj is the jth quartile and j = 1, 2, 3.
The quartile value is then found as:

$$Q_j = L_j + \frac{\left(\frac{jn}{4} - cf\right)}{f} \times i$$

Where Lj - the lower limit of the quartile class
n - the total number of observations
cf - Cumulative frequency up to the class before the quartile class
f - the frequency of the quartile class
i - width of the quartile class
Example
Compute the values of Q1 and Q3 for the scores in the Business Statistics course last semester
Marks     f      cf     Observation positions (counting from 0)
30-40     3      3      0-2
40-50     10     13     3-12
50-60     19     32     13-31 (18)
60-70     31     63     32-62
70-80     11     74     63-73
Solution
Find the location of Q1.
Location of Q1 = (74+1) × ¼ = 18.75, so Q1 is the 18.75th observation. This falls in the class 50-60.
Q1 will therefore be

$$Q_1 = 50 + \frac{\frac{1}{4}(74) - 13}{19} \times 10 = 52.89$$

This means that 25% of the students scored less than 52.89% in the course.
Location of Q3 = (n+1) × 3/4
= (74+1) × 3/4
= 75 × 3/4 = 56.25
This falls in the class 60-70.

$$Q_3 = 60 + \frac{\frac{3}{4}(74) - 32}{31} \times 10 = 67.58$$

This means that 75% of the students scored less than 67.58%.
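
As a check on the arithmetic, the two steps (locate the quartile class, then interpolate within it) can be sketched in Python. This is a minimal illustration under the assumption of equal class widths; the function name `grouped_quartile` is chosen here.

```python
def grouped_quartile(j, lower_limits, frequencies, width):
    """Quartile Qj for grouped data: Lj + (j*n/4 - cf) / f * i."""
    n = sum(frequencies)
    location = (n + 1) * j / 4      # position of the quartile
    cf = 0                          # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cf + f >= location:      # the quartile class has been reached
            return lower + (j * n / 4 - cf) / f * width
        cf += f
    raise ValueError("quartile location lies beyond the last class")

# Marks table from the example above
lower_limits = [30, 40, 50, 60, 70]
frequencies = [3, 10, 19, 31, 11]

q1 = grouped_quartile(1, lower_limits, frequencies, 10)
q3 = grouped_quartile(3, lower_limits, frequencies, 10)
inter_quartile_range = q3 - q1      # IR = Q3 - Q1

print(round(q1, 2), round(q3, 2), round(inter_quartile_range, 2))
```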

Percentiles

They divide an ordered data set into 100 equal parts.

Given a distribution X1, X2, X3, ..., Xn, the kth percentile Pk is the value of X such that k% of
the observations are less than Pk and (100 − k)% of the observations are greater than Pk.
Example 1
P40 is the 40th percentile: 40% of the observations are less than P40 and 60% of the observations
are greater than P40.
The location of the percentile has to be determined before we get the percentile. The percentile
class is the one that contains the (n+1) k/100th observation, where k = 1, 2, 3, ..., 99.
The percentile will then be found as:

$$P_k = L_k + \frac{\left(\frac{kn}{100} - cf\right)}{f} \times i$$

Where Lk - The lower limit of the percentile class
cf - Cumulative frequency up to the class just before the percentile class
f - Frequency of the percentile class
i - Width of the percentile class

Example 2
Compute the following percentiles for the scores in the Business Statistics course last
semester: P10, P25, P50, P60, P75
P10: location = (74+1) × 10/100 = 75 × 0.1 = 7.5
We look for the class with the 7.5th observation, which is the class 40-50.

$$P_{10} = 40 + \frac{\left(\frac{10 \times 74}{100} - 3\right)}{10} \times 10 = 44.4$$
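
The remaining percentiles follow the same steps. The Python sketch below is only an illustration, assuming equal class widths and reusing the marks table of the Business Statistics example; the function name `grouped_percentile` is chosen here.

```python
def grouped_percentile(k, lower_limits, frequencies, width):
    """Percentile Pk for grouped data: Lk + (k*n/100 - cf) / f * i."""
    n = sum(frequencies)
    location = (n + 1) * k / 100    # position of the percentile
    cf = 0                          # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cf + f >= location:      # the percentile class has been reached
            return lower + (k * n / 100 - cf) / f * width
        cf += f
    raise ValueError("percentile location lies beyond the last class")

lower_limits = [30, 40, 50, 60, 70]
frequencies = [3, 10, 19, 31, 11]

percentiles = {
    k: grouped_percentile(k, lower_limits, frequencies, 10)
    for k in (10, 25, 50, 60, 75)
}
for k, value in percentiles.items():
    print(f"P{k} = {value:.2f}")
```

Note that P25 and P75 coincide with the quartiles Q1 and Q3, since both use the same interpolation.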

Variance and standard deviation

The two most common measures of dispersion are the variance and the standard deviation.
A data set that is more variable will have a larger variance than one that is relatively homogeneous.
The variance is the sum of the squared deviations divided by the number of observations. It is the
average of the squares of the deviations of the individual values from their mean. For any set of
values, the sum of squared deviations from the mean is smaller than the sum of squared deviations
from any other point.
Population variance is denoted σ² (a parameter).
Sample variance is denoted S² (a statistic).
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Where Xi = individual observed values
µ = population mean
N = No. of observations

                      Population                   Sample
Mean                  µ = ∑x / N                   x̄ = ∑x / n
Size                  N                            n
Standard deviation    σ = √( ∑(x − µ)² / N )       s = √( ∑(x − x̄)² / (n − 1) )

Standard deviation = √σ² = √variance


Example
The following values were observed from a population:
30, 32, 40, 48, 50
Compute the variance for this data

Solution

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

$$\mu = \frac{200}{5} = 40$$

Value xi    Deviation (xi − µ)    (xi − µ)²
30          −10                   100
32          −8                    64
40          0                     0
48          8                     64
50          10                    100
Total ∑     200        0          328
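$$\sigma^2 = \frac{328}{5} = 65.6$$

If the same five values are treated as a sample rather than a population, the divisor becomes n − 1:

Sample variance,
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{328}{4} = 82$$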
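
The difference between dividing by N and by n − 1 can be seen with Python's standard `statistics` module, which provides `pvariance` (population) and `variance` (sample). The sketch below simply re-runs the example; the variable names are chosen here.

```python
from statistics import mean, pvariance, variance

values = [30, 32, 40, 48, 50]

mu = mean(values)
# Sum of squared deviations from the mean
sum_sq = sum((x - mu) ** 2 for x in values)

population_var = pvariance(values)   # divides by N
sample_var = variance(values)        # divides by n - 1

print(mu, sum_sq, population_var, sample_var)
```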
For grouped data, we only get an approximation (estimate) of the variance.

$$S^2 = \frac{\sum_{i=1} f_i (X_i - \bar{X})^2}{\sum f_i - 1}$$

where Xi is the midpoint of class i and fi is its frequency.

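The standard deviation is the square root of the variance. It is expressed in the same units as
the original data.

Population standard deviation (the square root of the population variance σ²),
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Sample standard deviation,
$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

For grouped data,
$$S = \sqrt{\frac{\sum_{i=1} f_i (x_i - \bar{x})^2}{\sum f_i - 1}}$$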
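
The grouped-data formula can be illustrated with the marks table used in the earlier examples. The Python sketch below is not part of the original worked examples: it assumes that every observation in a class sits at the class midpoint, which is why the result is only an estimate.

```python
from math import sqrt

# Marks table from the Business Statistics example (classes 30-40, ..., 70-80)
lower_limits = [30, 40, 50, 60, 70]
frequencies = [3, 10, 19, 31, 11]
width = 10

# Assumption: each class is represented by its midpoint
midpoints = [lower + width / 2 for lower in lower_limits]

n = sum(frequencies)
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Sum of f * (x - mean)^2, divided by (sum of f) - 1
grouped_variance = sum(
    f * (x - grouped_mean) ** 2 for f, x in zip(frequencies, midpoints)
) / (n - 1)
grouped_std_dev = sqrt(grouped_variance)

print(round(grouped_mean, 2), round(grouped_variance, 2), round(grouped_std_dev, 2))
```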

Coefficient of variation

Coefficient of variation is useful when comparing the levels of variability in sets of data. It is a
relative measure of variability. It is especially useful when comparing sets that are not measured
in the same units e.g. in weights of people vs. income, or when comparing data with means that
are of different magnitudes, or risk of projects. The coefficient of variation is dimensionless (free
of units). It is generally expressed in percentage or in decimal form.
$$CV = \frac{s}{\bar{x}} \quad \text{or} \quad CV = \frac{s}{\bar{x}} \times 100\%$$

The higher the coefficient of variation, the higher the variability.


Example
Which of these 2 sets of data has greater variability?
                      A                        B
Mean                  x̄ = 150 kg              x̄ = 0.85 cm
Standard deviation    S = 30.5 kg              S = 0.015 cm
CV                    30.5 / 150 = 0.203       0.015 / 0.85 = 0.018
Set A has greater variability than set B.
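
Because the units cancel, the two sets can be compared directly even though one is in kilograms and the other in centimetres. A minimal Python sketch of the comparison, with a helper name chosen here for illustration:

```python
def coefficient_of_variation(std_dev, mean):
    """CV = s / x-bar, returned as a decimal (multiply by 100 for a percentage)."""
    return std_dev / mean

cv_a = coefficient_of_variation(30.5, 150)     # set A, measured in kg
cv_b = coefficient_of_variation(0.015, 0.85)   # set B, measured in cm

more_variable = "A" if cv_a > cv_b else "B"
print(f"CV(A) = {cv_a:.3f}, CV(B) = {cv_b:.3f}, more variable: set {more_variable}")
```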

Measures of normality/shape

A normal distribution is one whose data form a symmetrical bell curve. Measures of normality tell us
more about the way data is distributed. Distributions may have the same averages and measures of
dispersion but have different shapes, e.g. figures A, B and C below. Measures of normality
include the coefficient of skewness and the coefficient of kurtosis.

Skewness

Skewness describes the degree of symmetry in a distribution. When data are uni-modal and
symmetrical, the mean, mode and median will be almost the same value. In a skewed distribution,
higher frequencies occur towards one end of the distribution, e.g.

[Figures A, B and C: distribution curves, not reproduced here]

Graphs A and C represent skewed distributions:

A distribution skewed to the right (longer tail on the right) is positively skewed.
A distribution skewed to the left (longer tail on the left) is negatively skewed.
In a positively skewed distribution the mean > median > mode
In a negatively skewed distribution the mean < median < mode
When data are skewed, the mean will be pulled towards the skew (the longer tail). The degree of
skewness is computed by using either the 1st or the 2nd Pearsonian coefficient of skewness.

First coefficient of skewness,
$$SK_1 = \frac{\bar{X} - X_m}{S}$$

Second coefficient of skewness,
$$SK_2 = \frac{3(\bar{X} - X_{0.5})}{S}$$

where
X̄ = mean
Xm = mode
S = standard deviation
X0.5 = median
If SK1 or SK2 = 0, the distribution is symmetrical (as a normal distribution is).
If SK > 0, the distribution is positively skewed.
If SK < 0, the distribution is negatively skewed.
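
To see the two coefficients at work, the Python sketch below applies them to a small made-up data set with a long right tail. The data are hypothetical and chosen only for illustration; the sample standard deviation (divisor n − 1) is assumed for S.

```python
from statistics import mean, median, mode, stdev

# Hypothetical data with a long right tail (not taken from the text)
data = [1, 2, 2, 2, 3, 4, 5, 9]

x_bar = mean(data)
x_mode = mode(data)
x_median = median(data)
s = stdev(data)                       # sample standard deviation

sk1 = (x_bar - x_mode) / s            # first Pearsonian coefficient
sk2 = 3 * (x_bar - x_median) / s      # second Pearsonian coefficient

print(x_bar, x_median, x_mode)
print(round(sk1, 3), round(sk2, 3))
```

Here mean > median > mode, so both coefficients come out positive, which matches the rule for a positively skewed distribution.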

Kurtosis

Kurtosis describes the degree of peakedness or steepness in a distribution.

Example

[Figure: three curves labelled Leptokurtic (highly peaked), Mesokurtic (normal distribution) and Platykurtic; not reproduced here]

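The degree of kurtosis can be measured by the percentile coefficient of kurtosis:

$$K = \frac{\frac{1}{2}(Q_3 - Q_1)}{P_{90} - P_{10}}$$

For a normal distribution, K ≈ 0.263, i.e. mesokurtic.
If K < 0.263, the distribution is leptokurtic: its long tails widen P90 − P10 relative to the quartile spread.
If K > 0.263, the distribution is platykurtic.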
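
The reference value for the normal distribution can be checked numerically. The Python sketch below evaluates K for a standard normal curve using `statistics.NormalDist` (available from Python 3.8); it is an illustration only and gives a value of about 0.263.

```python
from statistics import NormalDist

def percentile_kurtosis(q1, q3, p10, p90):
    """K = 0.5 * (Q3 - Q1) / (P90 - P10)."""
    return 0.5 * (q3 - q1) / (p90 - p10)

z = NormalDist()  # standard normal: mean 0, standard deviation 1

k_normal = percentile_kurtosis(
    q1=z.inv_cdf(0.25),
    q3=z.inv_cdf(0.75),
    p10=z.inv_cdf(0.10),
    p90=z.inv_cdf(0.90),
)
print(round(k_normal, 3))
```

The same function can be applied to quartiles and percentiles computed from grouped data.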
Empirical Rule
The empirical rule says that if a sample or population of measurements has a normal
distribution, then:
i. Approximately 68% of the observations lie within one standard deviation of the mean
ii. Approximately 95% of the observations lie within two standard deviations of the mean
iii. Approximately 99.7% of the observations lie within three standard deviations of the
mean.
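
These percentages can be verified from the normal curve itself. The short Python sketch below uses `statistics.NormalDist` to compute the share of observations within 1, 2 and 3 standard deviations of the mean; the helper name is chosen here for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```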

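Diagram 1.1: Normal curve with the horizontal axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ and μ + 3σ, showing the bands that contain 68%, 95% and 99.7% of the observations.
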
Chapter Summary

Statistics is the art and science of getting information from data or numbers to help in decision
making.
The following are some characteristics of index numbers:
1. They are specialised averages used to obtain a typical measure, much like a measure of
central tendency. The items must be comparable and the unit of measurement must be the
same
2. They measure the change in the level of a phenomenon
3. They measure the effect of changes over a period of time
Counting techniques may be classified into:
i. Probability trees
ii. Permutations
iii. Combinations
Chapter Quiz

1. Define Mean
2. Which of the following is the odd one out?
i. Mean
ii. Mode
iii. Median
iv. Range
3. What is the importance of Kurtosis?
4. ………… describes the degree of symmetry in a distribution. When data are uni-modal
and symmetrical, the mean, mode and median will be almost the same value.
5. List three counting techniques.

Answers to Chapter Quiz

1. It is the sum of all the values divided by the number of values.


2. Range – it is not a measure of central tendency
3. Kurtosis describes the degree of peakedness or steepness in a distribution.
4. Skewness
5. i) Probability trees
ii) Permutations
iii)Combinations

Questions from previous exams

1. The weights of 15 parcels recorded at the GPO were as follows:


16.2, 17, 20, 25(Q1) 29, 32.2, 35.8, 36.8(Q2) 40, 41, 42, 44(Q3) 49, 52, 55 (in kgs)
Required
Determine the semi interquartile range for the above data


2. The following table shows the levels of retirement benefits given to a group of workers in a
given establishment.
Retirement benefits (£ '000)    No. of retirees (f)    UCB     cf
20 – 29                         50                     29.5    50
30 – 39                         69                     39.5    119
40 – 49                         70                     49.5    189
50 – 59                         90                     59.5    279
60 – 69                         52                     69.5    331
70 – 79                         40                     79.5    371
80 – 89                         11                     89.5    382

Required
a) Determine the semi interquartile range for the above data
b) Determine the minimum value for the top ten per cent.(10%)
c) Determine the maximum value for the lower 40% of the retirees

3. The following information was obtained from an NGO which was giving small loans to some
small scale business enterprises in 1996. The loans are in thousands of Kshs.

Loans      Units (f)   Midpoint (x)   x − A = d   d/c = u   fu      fu²     UCB     cf
46 – 50    32          48             −15         −3        −96     288     50.5    32
51 – 55    62          53             −10         −2        −124    248     55.5    94
56 – 60    97          58             −5          −1        −97     97      60.5    191
61 – 65    120         63 (A)         0           0         0       0       65.5    311
66 – 70    92          68             5           +1        92      92      70.5    403
71 – 75    83          73             10          +2        166     332     75.5    486
76 – 80    52          78             15          +3        156     468     80.5    538
81 – 85    40          83             20          +4        160     640     85.5    578
86 – 90    21          88             25          +5        105     525     90.5    599
91 – 95    11          93             30          +6        66      396     95.5    610
Total      610                                              428     3086

Required
Using the Pearsonian measure of skewness, calculate the coefficients of skewness and comment
briefly on the nature of the distribution of the loans.

4. a) Distinguish between discrete and continuous data.
b) What is dispersion and what is the formula for the standard deviation?
c) What is the measure of relative dispersion?
