
Unit 16

Uploaded by

Rahul Gupta
Copyright
© © All Rights Reserved

UNIT 16 MEASURES OF SKEWNESS

Structure
16.0 Objectives
16.1 Introduction
16.2 Meaning of Skewness
16.3 Positive and Negative Skewness
16.4 Difference between Dispersion and Skewness
16.5 Tests of Skewness
16.6 Measures of Skewness
16.7 Some Illustrations
16.8 Properties of Normal Curve
16.9 Let Us Sum Up
16.10 Key Words and Symbols
16.11 Answers to Check Your Progress
16.12 Terminal Questions/Exercises

16.0 OBJECTIVES
After studying this unit, you should be able to :
• distinguish between skewness and dispersion
• differentiate between symmetrical, positively skewed and negatively skewed data
• calculate skewness by different methods
• decide which of the methods of computing skewness is suitable in a given situation
• appreciate the role of the normal curve in the analysis of data and discuss its properties.

16.1 INTRODUCTION
As you know, there are three main characteristics to study when analysing any
numerical data: 1) central tendency, i.e., a value around which many other items of
the data congregate, 2) dispersion, i.e., how much the items deviate from the central
tendency, and 3) skewness, i.e., how the items are distributed about the central
tendency.
In Units 10 to 13 you have studied the measures of central tendency, viz., arithmetic
mean, median, mode, geometric mean, harmonic mean and moving average. In Units
14 and 15 you have studied the measures of dispersion, viz., range, quartile deviation,
mean deviation, standard deviation and Lorenz curve. In this unit you will learn
about the third characteristic, i.e., skewness. You will study the meaning, purpose and
methods of computing skewness. You will also study the role and properties of the
normal curve in the analysis of data. In fact, there is one more characteristic called
kurtosis, i.e., concentration of frequencies in the central part of the data, which is not
within the scope of this course.

16.2 MEANING OF SKEWNESS


A frequency distribution is said to be 'symmetrical' if the frequencies are
symmetrically distributed about the central value, i.e., when values of the variable
which are at an equal distance from the middle have equal frequencies. Study the
following two distributions.

A) X : 10   15   20   25   30
   f :  5    8   26    8    5
Here X = 20 is the middle item.

B) X : 5-9   9-13   13-17   17-21   21-25
   f :  7     18      25      18       7
Here the middle group is 13-17.

You can easily see that these are symmetrical distributions. You should also note
(and can verify by calculation) that for each set the mean, median and mode have the
same value. In fact, for any symmetrical distribution in which frequencies steadily
rise and then steadily fall (i.e., bell shaped), mean, median and mode are equal.
Study Figure 16.1 for the shape of such data on graph paper.
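
The equality of the three averages can be checked by calculation. The following is a minimal Python sketch, using only the standard library, that expands distribution A into individual observations and computes mean, median and mode. The variable names are illustrative and not part of the unit.

```python
from statistics import mean, median, multimode

# Distribution A from the text: values of X and their frequencies f.
x_values = [10, 15, 20, 25, 30]
frequencies = [5, 8, 26, 8, 5]

# Expand the frequency table into a list of individual observations.
observations = [x for x, f in zip(x_values, frequencies) for _ in range(f)]

dist_mean = mean(observations)
dist_median = median(observations)
dist_modes = multimode(observations)  # list of the most frequent value(s)

print(dist_mean, dist_median, dist_modes)
```

All three averages come out at the middle value of X, as the text states for a bell shaped symmetrical distribution.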

Figure 16.1: Symmetrical Distribution

If the graph of perfectly symmetrical data is folded along the line passing through
the mean, one side of the curve coincides perfectly with the other side. You can say
that one side is the mirror image of the other side.
In general, however, frequency distributions are not perfectly symmetrical; some may
be slightly asymmetrical and some others may be highly asymmetrical. Consider the
following two asymmetrical (or skewed) distributions:

B) X : 5-9   9-13   13-17   17-21   21-25
   f :  7     28      15      10       2

Here the frequencies are not symmetrically distributed about the middle. In
distribution A the extent of asymmetry is small, while in distribution B it is
comparatively larger.
The word 'skewness' is used to denote the 'extent of asymmetry' in the data. When the
frequency distribution is not symmetrical, it is said to be 'skewed'. The word
'skewness' literally denotes 'asymmetry' or 'lack of symmetry', and the word 'skewed'
denotes 'asymmetrical'. A symmetrical distribution therefore has zero skewness.
A distribution can be symmetrical even if frequencies first steadily fall and then
steadily rise. Consider the following distribution:

Size of Items : 10-20  20-30  30-40  40-50  50-60  60-70  70-80
Frequency     :   40     27     15     10     15     27     40

This is also a case of symmetrical distribution. But in this case there will be two
values of the mode, and both of them will be different from the arithmetic mean and
median, which will be in the middle group. You may notice that in such symmetrical
distributions, which are called bimodal or U-shaped, only mean and median are equal.
Look at Figure 16.2 and study the shape of such data on graph paper.

Figure 16.2: Bimodal or U-Shaped Distribution (Mode 1 and Mode 2 at the two ends, Mean = Median in the middle)

A bimodal distribution can also be a skewed distribution, as in the following example:

Size of Items : 10-15  15-20  20-25  25-30  30-35  35-40  40-45
Frequency     :   27     18     10      5     17     17     30

Here also the distribution of items around the middle group or central value is not
the same on both sides. Thus, we can also say that the study of skewness is the study
of the distribution of items around the central tendency.
Analysing the skewness of data serves the following main purposes:

1) It helps in finding out the nature and the degree of concentration, i.e., whether
   it is in the higher or the lower values.
2) The empirical relationship between mean, median and mode, i.e.,
   Mode = 3 Median - 2 Mean, is based on a moderately skewed distribution. The
   measure of skewness will reveal to what extent such an empirical relationship
   holds good.
3) It helps in knowing whether the distribution is normal or not. You will learn
   about the normal distribution later in this unit.

16.3 POSITIVE AND NEGATIVE SKEWNESS


Whenever data is skewed there can be two possibilities: 1) the skewness may be
positive or 2) it may be negative. In bell shaped or unimodal data, which is also the
most common in nature, it is quite easy to understand the concept of positive and
negative skewness, i.e., the direction of the skewness. Mode plays an important role
in this connection. The spread of the data on either side of the mode helps in deciding
the direction of skewness. Consider the two sets of data given below:

A) Size of Items : 2-4  4-6  6-8  8-10  10-12  12-14  14-16
   Frequency     :  5   12   27    10     8      3      1

B) Frequency     :  2    5   12    18    30     21      6
As in Set B, if there is a longer tail towards the lower values or left hand side, i.e.,
a larger spread on the lower side of the mode, the skewness is negative or left handed.
In such a case Mean < Median < Mode. Look at Figure 16.3 to understand the shape of
such data on graph paper.

As in the case of Set A, if there is a longer tail of the distribution towards the higher
values or right hand side, i.e., a larger spread on the higher side of the mode, the
skewness is positive or right handed. In this case Mean > Median > Mode. The shape of
such data on graph paper would be as shown in Figure 16.4.
Such data is termed 'elongated bell shaped data'. The case of extreme positive
skewness would arise when frequencies are highest in the lowest values and then
steadily fall as the values increase. Similarly, extreme negative skewness would
arise when frequencies are lowest in the lower values and steadily increase as the
values increase, the highest frequency representing the highest values. Such data
is called 'J' shaped data. Consider the following two sets of data:
A) Size of Items : 10-12  12-14  14-16  16-18  18-20
   Frequency     :   27     20     12      6      3
B) Size of Items : 10-12  12-14  14-16  16-18  18-20
   Frequency     :    3      6     12     20     27
Set A shows very high positive skewness and Set B shows high negative skewness.
Their shape on graph paper will be as shown in Figures 16.5A and 16.5B.

Figure 16.5A: J-Shaped Positively Skewed Distribution
Figure 16.5B: J-Shaped Negatively Skewed Distribution

Note : For the data to be skewed or symmetrical, deviations from central
tendency must exist in the data.

Check Your Progress A


1) Distinguish between symmetrical data and skewed data.

2) Differentiate between positive skewness and negative skewness.

3) Distinguish between high skewness and moderate skewness.

4) Differentiate between bell shaped and U-shaped data.

5) State whether the following statements are True or False.

   i) All distributions can be classified as negatively or positively skewed.
   ii) Two halves of a symmetrical distribution are mirror images of each other.
   iii) The sum of positive and negative deviations from median is always equal
        to zero in a symmetrical distribution.
   iv) J-shaped distribution indicates moderate skewness.
   v) It is possible that for some data Arithmetic Mean = Median = Mode, still
      it is not perfectly symmetrical.
   vi) Positive skewness implies that mean value is less than mode.
   vii) Median can never be equal to mean in a skewed distribution.
   viii) Greater the difference between mean and mode, the more skewed is the
         distribution.
   ix) U-shaped data has two modes.
   x) A longer tail to the right means data is negatively skewed.
   xi) U-shaped distributions are always symmetrical.
   xii) Highly skewed data is always positively skewed.

6) Comment on the nature of the following distributions:

   i) 14, 14, 14, 14, 14
   ii) 11, 12, 14, 16, 17
   iii) 1, 3, 6, 18, 42

16.4 DIFFERENCE BETWEEN DISPERSION AND SKEWNESS

It has been explained in Units 14 and 15 that dispersion relates to the scatteredness
or spread or the deviation of the items of a series from its central value. You also
know that the measure of dispersion shows the degree of scatteredness or the average
deviation of the items from the central tendency. On the other hand, skewness relates
to the departure of the items of a series from symmetry, and the measure of skewness
shows the degree of imbalance in the distribution of items around the central
tendency. The distinguishing features are tabulated below:

Aspect                   Dispersion                            Skewness

1) Measure of            scatter of individual values;         departure from symmetry of distribution;
                         how much items can deviate from       in what manner items are distributed
                         central tendency                      about central tendency

2) Judges                the extent of representativeness      the difference between any two of the
                         of any of the three averages:         three averages: Mean, Median and Mode
                         Mean, Median and Mode

3) For a symmetrical     may have any value                    zero
   distribution

4) Useful to find        variability in data                   concentration in higher or lower values

16.5 TESTS OF SKEWNESS

How can we say whether a particular distribution is skewed or not? We can say skewness
is present in a distribution if it has the following features:
1) Mean, median and mode do not coincide.
2) The sum of the positive deviations from the median is not equal to the sum of the
   negative deviations.
3) Frequencies and their spread on either side of the mode are not equal.
4) Quartiles are not equidistant from the median, i.e., (Q3 - Md) is not equal to
   (Md - Q1).
5) When the observations in the series are plotted on graph paper, they do not
   yield a symmetrical curve. This means that when the graph is divided vertically
   through the median or mean and folded, the two halves of the curve do not
   coincide perfectly.
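
Some of the tests listed above can be applied mechanically to raw observations. The sketch below is a minimal illustration in Python covering tests 1 (mean against median only), 2 and 4; the function name `skewness_tests` is hypothetical, and the quartiles come from the standard library's default method, which is only one of several quartile conventions. The two data sets are taken from question 6 of Check Your Progress A.

```python
from statistics import mean, median, quantiles

def skewness_tests(data):
    """Apply tests 1, 2 and 4 to a list of raw observations."""
    avg = mean(data)
    md = median(data)
    # Test 2: total deviation above the median versus total deviation below it.
    positive = sum(x - md for x in data if x > md)
    negative = sum(md - x for x in data if x < md)
    # Test 4: quartiles by the standard-library default ('exclusive') method.
    q1, _, q3 = quantiles(data, n=4)
    return {
        "mean_differs_from_median": avg != md,
        "deviations_unbalanced": positive != negative,
        "quartiles_not_equidistant": (q3 - md) != (md - q1),
    }

print(skewness_tests([11, 12, 14, 16, 17]))  # symmetrical series
print(skewness_tests([1, 3, 6, 18, 42]))     # skewed series
```

For real data measured with decimals, exact equality is too strict a criterion; a small tolerance would be a sensible refinement.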

16.6 MEASURES OF SKEWNESS

To study the extent and direction of asymmetry in a series, various measures of
skewness are employed. These measures of skewness can be either absolute or relative.

Absolute Measures of Skewness

Absolute measures tell us the extent of asymmetry and whether it is positive or
negative.
The first absolute measure of skewness is based on the difference between mean and
mode or mean and median. Symbolically, i) Absolute Sk = Mean - Mode or
ii) Absolute Sk = Mean - Median. If the value of mean is greater than the mode or
median, skewness is positive, otherwise it is negative. It may be noted that for a
positively skewed distribution, the value of the mean is the greatest and the value of
mode is the least of the three measures. Likewise, for a negatively skewed
distribution, mode has the maximum value and mean has the least value. In both the
cases median lies in between the mean and the mode.
The second measure of skewness, based on quartiles, depends upon the fact that
normally for a symmetrical distribution Q1 and Q3 are equidistant from the median,
i.e., Q3 - Md = Md - Q1. But if a distribution is asymmetrical, then the quartile lying
on the longer tail side will be farther from the median than the other quartile. In such
a case the absolute measure of skewness can be measured by the following formula:

Absolute Skewness = (Q3 - Md) - (Md - Q1) = Q3 + Q1 - 2 Md

The formula shows that if (Q3 - Md) is greater than (Md - Q1), skewness is positive,
otherwise it is negative. This is true because Q3 - Md > Md - Q1 implies that the
difference between Q3 and Md is greater than the difference between Md and Q1. This
in turn means that there is a longer tail on the Q3 side or right hand side, i.e.,
skewness is right handed or positive.
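
The two absolute measures are simple differences, as the following minimal Python sketch shows. The function names are hypothetical; the figures used are the wages before the dispute quoted in Illustration 12 of this unit (mean 850, mode 760, quartiles 750 and 920, median 820).

```python
def absolute_skewness_mean_mode(mean, mode):
    """First absolute measure: Mean - Mode."""
    return mean - mode

def absolute_skewness_quartiles(q1, median, q3):
    """Second absolute measure: (Q3 - Md) - (Md - Q1) = Q3 + Q1 - 2 Md."""
    return q3 + q1 - 2 * median

def direction(value):
    """Translate the sign of an absolute measure into the direction of skewness."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "none"

first = absolute_skewness_mean_mode(850, 760)
second = absolute_skewness_quartiles(750, 820, 920)
print(first, direction(first))
print(second, direction(second))
```

Both results are expressed in the units of the data (rupees here), which is why they cannot be used to compare series measured in different units.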

Relative Measures of Skewness

In order to make comparison between the skewness in two or more distributions, the
coefficient of skewness is computed for the given series or distributions. The
following are the two important methods of measuring relative skewness:
1) Karl Pearson's Coefficient of Skewness: This method is most frequently used for
measuring skewness and is based on the first absolute measure. The formula for
measuring skewness is as follows:

$$Sk_P = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} = \frac{\bar{X} - M_o}{\sigma}$$

i.e., the first absolute measure of skewness is divided by the standard deviation.
Thus, this value will be free of the units of the data. The value of this coefficient
would be zero in a symmetrical distribution. If mean is greater than mode, the
coefficient of skewness would be positive, otherwise negative. In practice, the value
of this coefficient usually lies between ±3.
If the mode is ill-defined, then using the approximate relationship:
Mode = 3 Median - 2 Mean
the above formula reduces to

$$Sk_P = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}} = \frac{3(\bar{X} - M_d)}{\sigma}$$

Note : As mean and standard deviation are calculated by using values of all the items
of the data, Karl Pearson's method measures skewness utilising all the items of the
data.
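
The two forms of Karl Pearson's coefficient translate directly into code. The following is a minimal Python sketch with hypothetical function names; the summary figures are those quoted elsewhere in this unit (mean 41, mode 45, S.D. 8 from Check Your Progress B, and mean 62, median 65, S.D. 24 from Illustration 2).

```python
def pearson_skewness_mode(mean, mode, sd):
    """Sk_P = (Mean - Mode) / S.D."""
    return (mean - mode) / sd

def pearson_skewness_median(mean, median, sd):
    """Sk_P = 3 (Mean - Median) / S.D., used when the mode is ill-defined."""
    return 3 * (mean - median) / sd

sk_from_mode = pearson_skewness_mode(41, 45, 8)
sk_from_median = pearson_skewness_median(62, 65, 24)
print(sk_from_mode, sk_from_median)
```

Because both results are pure numbers, they can be compared across series measured in different units.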
To understand the application of Karl Pearson's method clearly, let us consider some
illustrations.

Illustration 1
From the marks secured by 120 students in Sections A and B of a class, the following
measures are obtained:

Section A : X̄ = 46.83, σ = 14.8, Mode = 51.67
Section B : X̄ = 47.83, σ = 14.8, Mode = 47.07

Determine which distribution of marks is more skewed.

Solution
Section A

$$Sk_P = \frac{\bar{X} - \text{Mode}}{\sigma} = \frac{46.83 - 51.67}{14.8} = \frac{-4.84}{14.8} = -0.327$$

Section B

$$Sk_P = \frac{\bar{X} - \text{Mode}}{\sigma} = \frac{47.83 - 47.07}{14.8} = \frac{0.76}{14.8} = 0.051$$

Hence the distribution of marks in Section A is more skewed. The skewness for
Section A is negative, while that of B is positive.

Illustration 2
The following statistical measures are given for a data set. Find the value of the
standard deviation.
Coefficient of skewness is -0.375, Mean is 62 and Median is 65.

Solution
The coefficient of skewness that depends upon Mean, Median and Standard
Deviation is Karl Pearson's coefficient of skewness.

$$Sk_P = \frac{3(\bar{X} - M_d)}{\sigma}$$

Substituting the given values,

$$-0.375 = \frac{3(62 - 65)}{\sigma} \quad\text{or}\quad \sigma = \frac{-9}{-0.375} = 24$$

Standard Deviation is 24.

2) Bowley's Coefficient of Skewness: This method is based on quartiles, i.e., the
second absolute measure of skewness. The formula for calculating skewness is:

$$Sk_B = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$

This method is particularly useful in case of open end distributions, where extreme
values are present, or when class intervals are unequal. Skewness should also be
measured by Bowley's method when positional measures are called for.
If the value of this coefficient is zero, it is a symmetrical distribution. For a positive
value, it is a positively skewed distribution and for a negative value it is a negatively
skewed distribution. The range of variation under this formula is ±1. But the main
drawback of this measure is that it is based on the central 50% of the data and it
ignores the remaining 50% of the data, i.e., 25% of the data below Q1 and 25% of the
data above Q3. To understand the application of Bowley's method clearly, study
Illustrations 3 and 4.
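
Bowley's formula needs only the three quartile values. The following is a minimal Python sketch with a hypothetical function name; the guard against Q3 = Q1 is an addition of the sketch, not something the unit discusses. The first call uses the symmetrical quartiles 45, 60, 75 from Check Your Progress B and the second uses made-up values chosen to reach the upper limit of +1.

```python
def bowley_skewness(q1, median, q3):
    """Sk_B = (Q3 + Q1 - 2 Md) / (Q3 - Q1)."""
    spread = q3 - q1
    if spread == 0:
        # Assumption of this sketch: treat coincident quartiles as an error.
        raise ValueError("Q3 and Q1 coincide; the coefficient is undefined")
    return (q3 + q1 - 2 * median) / spread

symmetric_case = bowley_skewness(45, 60, 75)   # quartiles equidistant from median
extreme_case = bowley_skewness(10, 10, 20)     # median coincides with Q1
print(symmetric_case, extreme_case)
```

The second call shows why the coefficient cannot exceed 1 in magnitude: the median can at most coincide with one of the quartiles.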

Illustration 3
For a given data, Q1 = 58, Md = 59 and Q3 = 61. Find the coefficient of skewness.

Solution

$$Sk_B = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1} = \frac{61 + 58 - 2(59)}{61 - 58} = \frac{1}{3} = 0.33$$

Illustration 4
In a frequency distribution, the coefficient of skewness based upon quartiles is 0.6.
If the sum of the upper and lower quartiles is 100 and the median is 38, find the value
of the upper quartile.
Solution
Bowley's coefficient of skewness based on quartiles is given by:

$$Sk_B = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$

Substituting the given values,

$$0.6 = \frac{100 - 2(38)}{Q_3 - Q_1}$$

or $Q_3 - Q_1 = \frac{24}{0.6} = 40$ ... (i)
Also it is given that $Q_3 + Q_1 = 100$ ... (ii)
Adding (i) and (ii), we get

$$2Q_3 = 140 \quad\text{or}\quad Q_3 = 70$$

Hence the upper quartile is 70.

16.7 SOME ILLUSTRATIONS


Illustration 5

Calculate an appropriate measure of skewness from the following data.

Payment of Commission (Rs.)    No. of Salesmen
1000 - 1200                          7
1200 - 1400                         15
1400 - 1600                         18
1600 - 1800                         20
1800 - 2000                         25
2000 - 2200                         10
2200 - 2400                          5

Solution
Since the given distribution is not open-ended and the mode can also be determined,
it is appropriate to apply Karl Pearson's formula as given below:

$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

Payment of           Mid-point    No. of Salesmen    d' = (X - 1700)/200     fd'      fd'^2
Commission (Rs.)        (X)            (f)
1000 - 1200            1100             7                  -3               -21        63
1200 - 1400            1300            15                  -2               -30        60
1400 - 1600            1500            18                  -1               -18        18
1600 - 1800            1700            20                   0                 0         0
1800 - 2000            1900            25                  +1                25        25
2000 - 2200            2100            10                  +2                20        40
2200 - 2400            2300             5                  +3                15        45

Total                               n = 100                            Σfd' = -9   Σfd'^2 = 251

$$\bar{X} = A + \frac{\sum fd'}{N} \times i = 1700 + \frac{-9}{100} \times 200 = 1682$$

$$\text{Mode} = L + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i$$

Clearly the modal group is 1800 - 2000. Substituting the values,

$$\text{Mode} = 1800 + \frac{25 - 20}{(25 - 20) + (25 - 10)} \times 200 = 1800 + 50 = 1850$$

Now calculating the standard deviation,

$$\sigma = i \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 200 \times \sqrt{\frac{251}{100} - \left(\frac{-9}{100}\right)^2} = 1.582 \times 200 = 316.4$$

Now coefficient of skewness,

$$Sk_P = \frac{1682 - 1850}{316.4} = -0.531$$

This value of the coefficient of skewness indicates that the distribution is negatively
skewed and hence there is a greater concentration towards the higher commission.

Illustration 6
Calculate the coefficient of skewness based on mean and median from the following
distribution:

Class Interval    Frequency
 0 - 10               6
10 - 20              12
20 - 30              22
30 - 40              48
40 - 50              56
50 - 60              32
60 - 70              18
70 - 80               6

Solution
Calculations for Mean, Median and S.D.

Class       Mid-point    d' = (X - 35)/10     f      fd'     fd'^2    Cum.
Interval       (X)                                                    Freq.
 0 - 10         5              -3             6      -18       54       6
10 - 20        15              -2            12      -24       48      18
20 - 30        25              -1            22      -22       22      40
30 - 40        35               0            48        0        0      88
40 - 50        45               1            56       56       56     144
50 - 60        55               2            32       64      128     176
60 - 70        65               3            18       54      162     194
70 - 80        75               4             6       24       96     200

Total                                        200      134      566

$$\bar{X} = A + \frac{\sum fd'}{N} \times i = 35 + \frac{134}{200} \times 10 = 41.7$$

Median has N/2 observations or 100 observations below it. Therefore, median lies in
the 40-50 class.

$$M_d = L + \frac{N/2 - c}{f} \times i = 40 + \frac{100 - 88}{56} \times 10 = 42.14$$

$$\sigma = i \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 10 \times \sqrt{\frac{566}{200} - \left(\frac{134}{200}\right)^2} = 1.543 \times 10 = 15.43$$

Karl Pearson's coefficient of skewness based on mean and median is given by:

$$Sk_P = \frac{3(\bar{X} - M_d)}{\sigma} = \frac{3(41.7 - 42.14)}{15.43} = -0.085$$

Hence, the distribution is negatively skewed with a very low degree of skewness.
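
The calculation in Illustration 6 can be cross-checked with a short program. The following is a minimal Python sketch that works from the class limits and frequencies directly instead of the step-deviation shortcut; the variable and function names are illustrative, and the median is interpolated within its class as in the solution above.

```python
from math import sqrt

# Class limits and frequencies of Illustration 6: (lower, upper, frequency).
classes = [(0, 10, 6), (10, 20, 12), (20, 30, 22), (30, 40, 48),
           (40, 50, 56), (50, 60, 32), (60, 70, 18), (70, 80, 6)]

n = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

grouped_mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - grouped_mean) ** 2 for x, f in zip(midpoints, freqs)) / n
grouped_sd = sqrt(variance)

def grouped_median(table, total):
    """Interpolate the median inside the class where cumulative frequency reaches N/2."""
    cumulative = 0
    for lo, hi, f in table:
        if f > 0 and cumulative + f >= total / 2:
            return lo + (total / 2 - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("empty frequency table")

md = grouped_median(classes, n)
sk = 3 * (grouped_mean - md) / grouped_sd
print(round(grouped_mean, 2), round(md, 2), round(grouped_sd, 2), round(sk, 3))
```

Working with unrounded intermediate values, the coefficient differs from the hand calculation only in the third decimal place, which reflects the rounding of the median to 42.14 in the worked solution.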

Illustration 7
~aicula'tethe coefficient of skewness based on quartiies from the following data:
Monthly Mary No. of Employ&
1000-1200 5
Solution
Computation of Quartlles

Monthly Salary Frequency Comulative Frequency

1000-1200 5 5
12W - 1400 14 19

Q,has 9observations or 50 observations below it. It lies in the class 1600 - 1800.

1
N observations or 100 observations below it. So it lies in the class
Qz (= Mddian) has -
2
1800 - 2000.

( Q3 has 4
observations or 150 observations below it. So it lies in the class

1
I
Coefficient of Sk =
Q3 + Q1-2Md
j
QJ - Q;
Illustration 8
The following table gives the distribution of monthly income of 500 workers in a
factory:

Monthly Income (Rs.)    No. of Employees
Below 1000                    10
1000 - 1500                   25
1500 - 2000                  145
2000 - 2500                  220
2500 - 3000                   70
3000 and above                30

i) Obtain the limits of income of the central 50 per cent of the observed employees.
ii) Calculate Bowley's coefficient of skewness.
Solution
i) For obtaining the limits of the central 50% of the workers, calculate Q1 and Q3.

Calculations for Quartiles

Monthly Income (Rs.)    Frequency    Cumulative Frequency
Below 1000                  10               10
1000 - 1500                 25               35
1500 - 2000                145              180
2000 - 2500                220              400
2500 - 3000                 70              470
3000 and above              30              500

Q1 has N/4 or 125 observations below it. So it lies in the class 1500 - 2000.

$$Q_1 = L + \frac{N/4 - c}{f} \times i = 1500 + \frac{125 - 35}{145} \times 500 = 1500 + 310.3 = 1810.3$$

Q3 has 3N/4 or 375 observations below it. So it lies in the class 2000 - 2500.

$$Q_3 = L + \frac{3N/4 - c}{f} \times i = 2000 + \frac{375 - 180}{220} \times 500 = 2000 + 443.18 = 2443.18$$

Hence the incomes of the central 50% of workers lie between Rs. 1810.3 and
Rs. 2443.18.

ii) Bowley's coefficient of skewness is given by:

$$Sk_B = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1}$$

Md has N/2 or 250 observations below it. So it lies in the 2000 - 2500 class.

$$M_d = 2000 + \frac{250 - 180}{220} \times 500 = 2000 + 159.09 = 2159.09$$

$$Sk_B = \frac{2443.18 + 1810.3 - 2(2159.09)}{2443.18 - 1810.3} = \frac{-64.70}{632.88} = -0.102$$

The negative coefficient (-0.102) indicates that the distance between Q3 and Md is
smaller than that between Md and Q1, i.e., the distribution is skewed to the left.

Illustration 9
Calculate Karl Pearson's coefficient of skewness from the following data:

Incomes (Rs. per day)    No. of Shops
Above 0                      150
Above 100                    140
Above 200                    100
Above 300                     80
Above 400                     80
Above 500                     70
Above 600                     30
Above 700                     14
Above 800                      0

Solution
Converting the cumulative frequency distribution to an ordinary frequency
distribution, we have:

Income (Rs. per day)    No. of Shops
  0 - 100                   10
100 - 200                   40
200 - 300                   20
300 - 400                    0
400 - 500                   10
500 - 600                   40
600 - 700                   16
700 - 800                   14

As it is a U-shaped distribution, skewness will be calculated by using Mean and
Median.

Calculations for Coefficient of Skewness

Income           Mid-point     f     d' = (X - 350)/100     fd'     fd'^2    Cum.
(Rs. per day)       (X)                                                      Freq.
  0 - 100            50        10           -3              -30       90      10
100 - 200           150        40           -2              -80      160      50
200 - 300           250        20           -1              -20       20      70
300 - 400           350         0            0                0        0      70
400 - 500           450        10            1               10       10      80
500 - 600           550        40            2               80      160     120
600 - 700           650        16            3               48      144     136
700 - 800           750        14            4               56      224     150

Total                         150                            64      808

Calculation of Mean

$$\bar{X} = A + \frac{\sum fd'}{N} \times i = 350 + \frac{64}{150} \times 100 = 392.67$$

Calculation of Median

Md has N/2 or 75 observations below it. So it lies in the class 400 - 500.

$$M_d = L + \frac{N/2 - c}{f} \times i = 400 + \frac{75 - 70}{10} \times 100 = 450$$

Calculation of Standard Deviation

$$\sigma = i \times \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} = 100 \times \sqrt{\frac{808}{150} - \left(\frac{64}{150}\right)^2} = 228.1$$

$$Sk_P = \frac{3(\bar{X} - M_d)}{\sigma} = \frac{3(392.67 - 450)}{228.1} = -0.75$$

Illustration 10
Find standard deviation, mode and median when mean = 50, coefficient of
variation = 40%, skewness = -0.4.

Solution
Substituting the values of mean and C.V. in the formula

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100, \text{ we get } 40 = \frac{\text{S.D.}}{50} \times 100 \quad\text{or}\quad \text{S.D.} = 20$$

Again using Karl Pearson's formula

$$Sk_P = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

-0.4 = (50 - Mode) / 20
Mode = 50 + 20 × 0.4
     = 58

Using the empirical relationship, we obtain

Mean - Mode = 3 (Mean - Median)
50 - 58 = 3 (50 - Median)
-8 = 150 - 3 Median
3 Median = 150 + 8
Median = 52.67
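
The chain of substitutions in this solution can be written as three one-line rearrangements. The following is a minimal Python sketch; the function name `solve_from_cv_and_skewness` is hypothetical, and the last step relies on the empirical relationship, which holds only approximately for moderately skewed data.

```python
def solve_from_cv_and_skewness(mean, cv_percent, skewness):
    """Recover S.D., mode and median from mean, C.V. (%) and Pearson's Sk_P."""
    sd = cv_percent * mean / 100        # from C.V. = S.D. / Mean x 100
    mode = mean - skewness * sd         # from Sk_P = (Mean - Mode) / S.D.
    median = mean - (mean - mode) / 3   # from Mean - Mode = 3 (Mean - Median)
    return sd, mode, median

sd, mode, median = solve_from_cv_and_skewness(mean=50, cv_percent=40, skewness=-0.4)
print(sd, mode, round(median, 2))
```

The ordering Mean < Median < Mode in the result is what a negative coefficient of skewness leads us to expect.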
Illustration 11
Find the appropriate measure of skewness from the following data:

Sales (Rs. in Lakhs)    No. of Companies    Cumulative Frequency
Below 50                       8                     8
50 - 60                       12                    20
60 - 80                       20                    40
80 - 100                      25                    65
100 and above                 15                    80

Solution
Here class intervals are unequal and open. So the appropriate method of determining
skewness is Bowley's method.

Now Q1 has N/4 observations or 20 observations below it. So it lies in the
class 50 - 60.

$$Q_1 = L + \frac{N/4 - c}{f} \times i = 50 + \frac{20 - 8}{12} \times 10 = 60$$

Q2 (= median) has N/2 observations or 80/2 or 40 observations below it. So it lies in
the class 60 - 80.

$$M_d = L + \frac{N/2 - c}{f} \times i = 60 + \frac{40 - 20}{20} \times 20 = 80$$

Q3 has 3N/4 or 60 observations below it. So it lies in the class 80 - 100.

$$Q_3 = L + \frac{3N/4 - c}{f} \times i = 80 + \frac{60 - 40}{25} \times 20 = 96$$

$$Sk_B = \frac{Q_3 + Q_1 - 2M_d}{Q_3 - Q_1} = \frac{96 + 60 - 2(80)}{96 - 60} = \frac{-4}{36} = -0.11$$

This value of the coefficient of skewness indicates that the distribution is slightly
skewed to the left and, therefore, there is a greater concentration of the sales at the
higher values than at the lower values of the distribution.
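
The quartile interpolation used in this illustration can be expressed as one reusable routine. The following is a minimal Python sketch; the function name `grouped_quantile` and the use of `None` to mark an open class limit are conventions of the sketch, not of the unit. It shows why open-ended classes do not hinder Bowley's method so long as no quartile falls inside them.

```python
def grouped_quantile(classes, fraction):
    """Interpolate the value below which `fraction` of the observations lie.

    `classes` is a list of (lower, upper, frequency); None marks an open limit.
    """
    total = sum(f for _, _, f in classes)
    target = fraction * total
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            if lower is None or upper is None:
                raise ValueError("quantile falls in an open-ended class")
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")

# Sales data of Illustration 11 (Rs. in lakhs).
sales = [(None, 50, 8), (50, 60, 12), (60, 80, 20), (80, 100, 25), (100, None, 15)]

q1 = grouped_quantile(sales, 0.25)
md = grouped_quantile(sales, 0.50)
q3 = grouped_quantile(sales, 0.75)
sk_b = (q3 + q1 - 2 * md) / (q3 - q1)
print(q1, md, q3, round(sk_b, 2))
```

A mean or standard deviation, by contrast, could not be computed from this table without assuming limits for the two open classes.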

Illustration 12
The following facts were gathered from a firm before and after an industrial dispute:

                        Before Dispute    After Dispute
Mean Wages (Rs.)             850               900
Median Wages (Rs.)           820               800
Modal Wages (Rs.)            760               600
Quartiles (Rs.)           750 & 920         750 & 950
S.D. (Rs.)                    30               110
Number Employed              600               550

By making use of the above data, compare the position of the firm before and after
the dispute as fully as possible.

Solution
a) The number of workers has decreased by 50, from 600 to 550, as a result of the
   dispute.
b) Although the mean wage has slightly increased, the firm saves Rs. 15,000 (after
   the dispute) in respect of the monthly salary bill:
   Total Wages before Dispute (600 × 850) = Rs. 5,10,000
   Total Wages after Dispute (550 × 900)  = Rs. 4,95,000
   Difference                             = Rs. 15,000

c) The median and modal wages have decreased. Before the dispute, 50% of the
   workers used to get Rs. 820 and above. But after the dispute, workers in this
   category are less than 50%. Similarly, most of the workers are being paid around
   Rs. 600 (after dispute) as against Rs. 760 (before dispute).
d) The first quartile Q1 has not changed. The second quartile Q2 (i.e., Median) has
   decreased slightly, but the third quartile Q3 has increased. The significance of the
   information is as shown below:

                                          Wages (Rs.)
   Category of Workers             Before Dispute    After Dispute
   A. Lowest Paid 25%                 Upto 750          Upto 750
   B. Next Higher Group of 25%        750 - 820         750 - 800
   C. Next Higher Group of 25%        820 - 920         800 - 950
   D. Highest Paid 25%                Above 920         Above 950

   Category (A) workers are not affected. The next higher category (B) workers
   are now confined to a narrower range of salary. But the highest paid categories
   (C) and (D) are now generally paid more after the dispute.
e) Standard deviation has increased from Rs. 30 to Rs. 110, implying thereby that
   the variability in individual wages has increased after the dispute. For proper
   comparison, we have:
   C.V. (before dispute) = 30/850 × 100 = 3.53%
   C.V. (after dispute)  = 110/900 × 100 = 12.2%
   The variability relative to the mean has also increased.
f) Measures of skewness are:
                        Before Dispute                              After Dispute
   Pearson's Measure    (850 - 760)/30 = 3                          (900 - 600)/110 = 2.73
   Bowley's Measure     (920 + 750 - 2(820))/(920 - 750) = 0.18     (950 + 750 - 2(800))/(950 - 750) = 0.5

   Pearson's measure of skewness after the dispute has decreased while Bowley's
   measure has increased, both being positive. This means that for the middle 50% of
   workers, concentration in lower wages has increased. But when we consider all
   the workers, the relative concentration of frequencies on the lower values side
   is lower.

Note: There is nothing wrong if one formula gives a result indicating an increase in
skewness while the other gives a decrease in skewness. In fact, there can be
situations when one formula gives positive skewness while the other may give
negative skewness. This is because Bowley's method is based on only the middle 50%
of the data while Pearson's method relates to the entire data.

Check Your Progress B

1) State the formulas of Karl Pearson's and Bowley's methods of measuring
   skewness.

2) What is skewness?

3) Differentiate between skewness and dispersion.

4) State whether the following statements are True or False.
   i) Skewness judges the extent of representativeness of any average.
   ii) For a positively skewed distribution, concentration of frequencies is on the left.
   iii) Only the relative value of skewness is used for comparison even though the
        standard deviation is the same.
   iv) Skewness cannot be calculated for open end class intervals.
   v) Skewness does not exist in a bimodal distribution.
   vi) Two distributions having different coefficients of variation have
       different skewness.

5) Fill in the blanks:
   i) If the mean and the mode of a given distribution are equal then its coefficient
      of skewness is .........................
   ii) Skewness is positive when mean is ......................... mode.
   iii) In a symmetrical distribution the mean, median and mode are
        .........................
   iv) Median can never be equal to ......................... in case of skewed
       distribution.
   v) If the mean, mode and standard deviation of a frequency distribution are 41,
      45 and 8 respectively, then its Pearson's coefficient of skewness is
      .........................
   vi) In a perfectly symmetrical distribution, 50% items are above 60 and 75%
       items are below 75. Therefore Md = .........................,
       Q3 = ........................., Q1 = ........................., coefficient of quartile
       deviation is ........................., and coefficient of skewness is .........................

16.8 PROPERTIES OF NORMAL CURVE


It has been observed that the frequency distributions of most of the phenomena that
occur in nature, such as measurements of human characteristics (height, weight, IQ,
etc.), measurements relating to industrial production and agricultural production,
etc., are symmetrical in nature. Normally, they all have an almost fixed rate of rise
and fall of frequencies from one group to another group. Their shape is like that in
Figure 16.1. Statisticians have tried to express these distributions by a single
mathematical formula. As this formula describes most of the distributions which occur
in nature, it has been called the 'Normal Curve'. At this stage, it is not necessary for
you to know the exact mathematical expression that gives the normal curve. But the
properties that are exhibited by that formula are very useful in the analysis of data.
Following are the main properties of the normal curve:
1) It is perfectly symmetrical about the mean and is bell shaped.
2) Mean = Median = Mode
3) It has only one mode, i.e., it is unimodal.
4) The quartiles Q1 and Q3 are equidistant from the median or mean and are given
   by
   Q1 = A.M. - 0.6745 S.D.
   Q3 = A.M. + 0.6745 S.D.
   Q.D. = 5/6 M.D. (approximately) = 2/3 S.D. (approximately)
5) The mean deviation about the mean is 4/5 × S.D. (approximately).
6) One of the most fundamental properties of the normal probability curve is the
   area property.
   i) Mean ± 0.6745 SD covers 50% area, i.e., 25% on each side.
   ii) Mean ± 2.5758 SD covers 99% area, i.e., 49.5% on each side.
   iii) Mean ± 1.96 SD covers 95% area, i.e., 47.5% on each side.
   iv) Mean ± 1 SD covers 68.27% area, i.e., 34.13% on each side.
   v) Mean ± 2 SD covers 95.4% area, i.e., 47.7% on each side.
   vi) Mean ± 3 SD covers 99.7% area, i.e., 49.85% on each side.
Let us take one example to point out the usefulness of these properties.

Suppose the mean height of 100 persons selected from a big group is 68 inches and
the standard deviation is 1.5 inches.
i) What is the range of height of the middle 95% of persons in the whole group?
ii) What would be the expected values of mode, Q.D. and M.D. for the whole group?

Solution
i) 95% of the items have values in the range Mean ± 1.96 SD. So the required range
   is 68 ± 1.96 × 1.5, or 65.06 inches to 70.94 inches.
ii) Mean = Mode. Therefore the mode is also 68 inches.
    QD = 2/3 SD approximately. So QD = 2/3 × 1.5 = 1 inch approximately.
    MD = 4/5 SD approximately. So MD = 4/5 × 1.5 = 1.2 inches approximately.
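
The area property and the worked example can be checked numerically. The following is a minimal Python sketch that uses `statistics.NormalDist` from the standard library (Python 3.8 or later); the helper name `central_area` is hypothetical.

```python
from statistics import NormalDist

standard = NormalDist()  # normal curve with mean 0 and standard deviation 1

def central_area(k):
    """Share of a normal distribution lying within Mean ± k SD."""
    return standard.cdf(k) - standard.cdf(-k)

# Percentage of area covered for each multiple of SD listed in the area property.
for k in (0.6745, 1, 1.96, 2, 2.5758, 3):
    print(k, round(100 * central_area(k), 2))

# Worked example: heights with mean 68 inches and SD 1.5 inches.
heights = NormalDist(mu=68, sigma=1.5)
low, high = heights.inv_cdf(0.025), heights.inv_cdf(0.975)
quartile_deviation = (heights.inv_cdf(0.75) - heights.inv_cdf(0.25)) / 2
print(round(low, 2), round(high, 2), round(quartile_deviation, 2))
```

The quartile deviation obtained this way is close to, but not exactly, 2/3 of the standard deviation, which is why the rule is stated as an approximation.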

In fact, the normal curve is very useful in drawing statistical inference. It is also
used as a standard to find out the extent of concentration of frequencies in the central
part of the given data. This is the fourth main characteristic in the analysis of data,
called kurtosis, the details of which are outside the scope of this course.

16.9 LET US SUM UP


The mehsures of central tendency and variation do not reveal all the characteristics
of a data set. Two distributions may have the same mean and standard deviation, but
may differ widely in the shape of their distribution. If the distribution of data is not
symmetrical, it is called asymmetrical or skewed. Skewness refere to the lack of
symmetry in distribution. Different methods of measuring skewness are as follows:
   Absolute Measure     Relative Measure            Limits    Given by

1. Mean - Mode          (Mean - Mode) / SD          ±3        Karl Pearson
2. Mean - Median        3 (Mean - Median) / SD      ±3        Karl Pearson
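As an illustration of Karl Pearson's two relative measures, here is a minimal Python sketch. The function names are hypothetical; the first uses the mode-based form (Mean - Mode) / SD and the second the median-based form 3 (Mean - Median) / SD. The input figures are taken from the answers to the terminal exercises of this unit.

```python
def pearson_mode_skewness(mean, mode, sd):
    """Karl Pearson's coefficient of skewness based on the mode."""
    return (mean - mode) / sd


def pearson_median_skewness(mean, median, sd):
    """Karl Pearson's coefficient of skewness based on the median."""
    return 3 * (mean - median) / sd


# Mean = 715.5, Mode = 669.23, SD = 190.2 (life time of tubes exercise)
print(round(pearson_mode_skewness(715.5, 669.23, 190.2), 3))

# Mean = 23, Median = 24, SD = 10 (empirical relationship exercise)
print(round(pearson_median_skewness(23, 24, 10), 3))
```

A positive result indicates a positively skewed distribution and a negative result a negatively skewed one.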

In highly skewed data, the highest frequency lies at one extreme of the data. A
positively skewed distribution has a long tail on the right-hand side of the data and is also
termed right-handed skew. A negatively skewed distribution has a long tail on the left-hand
side of the data and is also termed left-handed skew. When the graph of perfectly
symmetrical data, whether bell-shaped or U-shaped, is folded along the line at the mean, the two sides of
the curve coincide perfectly with one another.
Most of the data that occur in nature resemble the normal distribution. The normal
curve is perfectly symmetrical and bell-shaped. It has fixed percentages of
frequencies lying in different ranges from the mean. These percentages help us
in deciding whether the given data are normal or not.

16.10 KEY WORDS AND SYMBOLS


Bell-shaped Data : Frequencies steadily rise, reach a maximum and then steadily fall.
J-shaped Data : Starts with the highest and ends with the lowest frequency, with a steady
rate of fall in between, or vice versa.
Skewness : Refers to the lack of symmetry.
Symmetrical Data : When values of the variable equidistant from the middle have equal
frequencies.
U-shaped Data : Data has high frequencies in the beginning and end, and the lowest
frequencies in the middle.
List of Symbols
Coefficient of Skewness : Bowley's - SkB
Coefficient of Skewness : Pearson's - SkP
Skewness - Absolute Measure : Sk

16.11 ANSWERS TO CHECK YOUR PROGRESS

A) 5) i) False ii) True iii) True iv) False v) False vi) False vii) True viii) True
ix) True x) False xi) False xii) False.
6) i) no variation ii) symmetrical iii) skewed

B) 4) i) True ii) True iii) True iv) False v) False
vi) May or may not be true.
5) i) zero ii) greater than iii) equal iv) mean v) -0.5 vi) Md = 60,
Q3 = 75, Q1 = 45, coefficient of QD = 0.25, SkB = 0.

16.12 TERMINAL QUESTIONS/EXERCISES

Questions

1) Give the absolute and relative measures of skewness.

2) Central tendency, dispersion and skewness are three different measures to
analyse numerical data. Comment.
Exercises
1) From the following frequency distribution of marks of students in an
examination, calculate the value of Karl Pearson's coefficient of skewness:
Marks less than : 10 20 30 40 50 60 70 80

(Answer : 53, SD = 17.66, SkP = 0.453)

2) Calculate Pearson's Coefficient of Skewness from the table given below:


Life Time (in Hours)    No. of Tubes

1000-1100               22
1100-1200               6

(Answer : SkP = (715.5 - 669.23) / 190.2 = 0.243)
3) The following data show the daily sales at a petrol station. Calculate the mean,
median, standard deviation and coefficient of skewness.
Quantity sold (in Litres)    No. of Days

(Answer : Mean = 1426, Md = 1600, SD = 447.35, SK = -1.167)
4) The following table gives the distribution of daily travelling allowance of
salesmen in a company. Compute Bowley's Coefficient of Skewness and
comment on its value.
Travelling Allowance (in Rs.)    No. of Salesmen

100 - 120                        14
120 - 140                        16
140 - 160                        20
160 - 180                        18
180 - 200                        15
200 - 220                        7

(Answer : SkB = [189.33 + 133.75 - (2 × 160)] / (189.33 - 133.75) = 0.145)
5) Calculate an appropriate measure of skewness for the data given below:
Age (Years)    No. of Employees

Below 20       13
20 - 25        29
6) Find a suitable measure of skewness from the following distribution:

Annual Sales
(Rs. in '000) : 0-20  20-50  50-100  100-250  250-500  500-1000
No. of Firms  : 20    50     69      30       22       19

(Answer : SkB = [203.75 + 39.95 - (2 × 76.45)] / (203.75 - 39.95) = 0.554)
7) You are given below the details relating to the wages in respect of two factories.
From this it is concluded that the skewness and variability are the same in both
the factories. Point out the mistake or wrong inference in the above statement.

                    Factory A    Factory B
                    (Rs.)        (Rs.)
Arithmetic Mean     50           45
Mode                45           50
Variance            100          100
8) Calculate Karl Pearson's coefficient of skewness based on the empirical
relationship that exists between the central tendencies in a moderately
asymmetrical distribution:
Mean = 23, Median = 24, Standard Deviation = 10.
Is this distribution negatively or positively skewed?
(Answer : SkP = -0.3)

9) The following is the position in a factory before and after the settlement of an
industrial dispute. Comment on the gains or losses from the point of view of the
workers and that of the management:

                         Before    After
No. of Workers           3,000     2,900
Mean of Wages (Rs.)      220       230
Median of Wages (Rs.)    250       240
Standard Deviation       30        26

Note : These questions and exercises will help you to understand the unit
better. Try to write answers for them, but do not submit your answers to the
University. These are for your practice only.

SOME USEFUL BOOKS


Elhance, D.N. and Veena Elhance, 1988, Fundamentals of Statistics, Kitab Mahal :
Allahabad. (Chapters 9, 10 & 18)
Gupta, C.B., An Introduction to Statistical Methods, Vikas Publishing House :
New Delhi. (Chapters 10, 11 & 17)
Gupta, S.P., 1989, Elementary Statistical Methods, Sultan Chand & Sons : New Delhi.
(Chapters 8 & 9)
Sancheti, D.C., and Kapoor, V.K., 1989, Statistics Theory Methods and Applications,
Sultan Chand & Sons : New Delhi. (Chapters 5, 7 & 16)
Shenoy, G.V., Srivastava, V.K., and Sharma, S.C., 1989, Business Statistics, Wiley
Eastern : New Delhi. (Chapters 5, 6 & 11)
Simpson, G., and Kafka, F., Basic Statistics, Oxford & IBH Publishing : New Delhi.
(Chapters 13, 16 & 21)