Dispersion

The document discusses measures used to quantify the dispersion or variability in a data set beyond just the average. It introduces range, quartile deviation, mean deviation, and standard deviation as absolute measures of dispersion, each with a corresponding relative measure. Quartile deviation is defined as half the interquartile range (Q3 - Q1). Mean deviation averages the absolute deviations from the mean, median or mode, while standard deviation squares the deviations from the mean, averages them, and takes the square root. Standard deviation is described as the most important measure of dispersion.

Uploaded by

Fareed Rathod
Copyright
© All Rights Reserved

Chapter 4: MEASURES OF DISPERSION

We have seen how to get an average for a given distribution. The average represents a given
distribution but when we want to study the given distribution, knowing only the average value is not
enough. For instance, though it is useful to have an average of wages of workers in a factory, this
value may not be sufficient to indicate the wage conditions in the factory. We should also know the
differences in individual wages. The average gives no idea about the spread or scatter of the
data.
The same average may be found in two distributions, yet they may differ widely in the scatter of their
values. In the following example we have three series. The arithmetic mean and the median are both
60 for all three.
A B C
60 50 0
60 55 30
60 60 60
60 65 90
60 70 120
Here we can see that though the averages are the same, the three series are widely different from
each other. If we consider only the average, the conclusion will be misleading, as the same number
represents all three series.
The first series A has all equal observations, so there is no variability. Consecutive observations in
series B differ by 5, while consecutive observations in series C differ by 30. The
variability or scatter in series C is more than that in series B. To estimate the extent to which the
data vary from the average, and to measure the spread or scatter of the data, we compute measures of
dispersion, so that by referring to a single number we can find whether a distribution is compact or
spread out.
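
As a rough check of the three series above, the short Python sketch below computes the mean, median, range and standard deviation of each. It uses only the standard `statistics` module; the variable names are illustrative, and `pstdev` (divisor n) is an assumption chosen to match the standard deviation defined later in this chapter.

```python
from statistics import mean, median, pstdev

# The three series from the table above.
series = {
    "A": [60, 60, 60, 60, 60],
    "B": [50, 55, 60, 65, 70],
    "C": [0, 30, 60, 90, 120],
}

summary = {}
for name, values in series.items():
    summary[name] = {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        # Population standard deviation (divisor n), an assumption of this sketch.
        "std_dev": pstdev(values),
    }
    print(name, summary[name])
```

All three series share the same mean and median, while the range and the standard deviation grow from A to B to C.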
Dispersion is an important characteristic and must be measured for the information it gives about the
data. Two students may have the same average marks, but one may have marks near the
average in all the subjects while the other may have low marks in some subjects and very high
marks in others. A manufacturer who wants to control the quality of his product is interested in
providing articles of uniform quality and therefore wants to prevent variability. For him, uniformly
high quality is better than a high average. A manufacturer who produces electric bulbs will be happier
with an average life of 1600 hours for his bulbs with uniform quality than an average life of 1700
hours with some bulbs lasting for less than 1000 hours and some for more than 2000 hours.
For measuring dispersion, we have various measures, each with different characteristics.
As in the case of averages, measures of dispersion should have certain qualities so that they give
a proper idea of the scatter of the data. The following are the characteristics of a good measure of
dispersion.
1. It should be rigidly defined.
2. It should be based on all the observations.
3. It should be easy to calculate and understand.
4. It should be capable of further algebraic treatment.
5. It should not be affected much by sampling fluctuations.

Measures of Dispersion

Absolute Measures          Relative Measures
1. Range                   1. Coefficient of Range
2. Quartile Deviation      2. Coefficient of Quartile Deviation
3. Mean Deviation          3. Coefficient of Mean Deviation
4. Standard Deviation      4. Coefficient of Variation

Range
An elementary measure of dispersion is range. It is the easiest of all measures of dispersion. It is
defined as the difference between the highest and the lowest values taken by the variable.
i.e. Range = Maximum value – Minimum value
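The corresponding relative measure, the coefficient of range, is given by

$$\text{Coefficient of Range} = \frac{\text{Maximum value} - \text{Minimum value}}{\text{Maximum value} + \text{Minimum value}}$$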
Example: Calculate the range for the following data giving the daily sales of a shop for a week.
Sales in Rs.: 160, 130, 125, 127, 143, 150, 155
Here the lowest value is Rs.125 and the highest value is Rs.160.
Range = 160 – 125 = 35.
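
The same calculation can be sketched in Python. The coefficient of range is taken here as (maximum - minimum) / (maximum + minimum), the usual textbook definition; the variable names are illustrative.

```python
# Daily sales of the shop for a week, in Rs. (from the example above).
sales = [160, 130, 125, 127, 143, 150, 155]

highest, lowest = max(sales), min(sales)
value_range = highest - lowest

# Relative measure: (max - min) / (max + min), assumed standard definition.
coefficient_of_range = value_range / (highest + lowest)

print(value_range)
print(round(coefficient_of_range, 4))
```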

Range indicates nothing concerning the usual spread of the items. Therefore, it is most useful when it
is known that the extreme items are not exceptional in nature. Stock prices and interest rates are
often stated in terms of their range. Range is used in statistical quality control to study the variation
in quality of manufactured units. Saving in computation time is an important factor in favor of range.
However, range is not suitable for precise studies. It is only a rough measure of dispersion.

QUARTILE DEVIATION
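Range is affected by extreme values. To avoid this, we consider the range of the middle 50 per cent of
the observations, i.e., Q3 – Q1. This is called the interquartile range. Quartile deviation is half of
the interquartile range, and is therefore also known as the semi-interquartile range.

Quartile deviation is defined as

$$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$$

where Q1 and Q3 are the first and the third quartiles respectively. The corresponding relative measure is

$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$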

PROBLEMS:
1. Calculate the quartile deviation for the following data giving the age distribution of 1500
women. Also find the coefficient of Q.D.
Age in years: 16-20 20-24 24-28 28-32 32-36 36-40
No. of women: 200 250 400 300 250 100
[ Answer: 4.44 years and 0.16]

2. Calculate the quartile deviation for the following data.
Sales (’00 Rs.): 100-110 110-120 120-130 130-140 140-150 150-160
No. of Shops: 4 7 20 9 6 4
[ Answer: 8.24]
3. Calculate quartile deviation for the following distribution of ages of 800 persons. Also find
the coefficient of quartile deviation.
Age in years: 20-25 25-30 30-35 35-40 40-45 45-50 50-55 55-60
No. of persons: 50 70 100 180 150 120 70 60
[ Answer: 6.54 and 0.1613]
4. Find the quartile deviation and the coefficient of Q.D.
C.I. 1500-1700 1700-1900 1900-2100 2100-2300 2300-2500 2500-2700
Freq.: 70 100 120 150 100 60
[ Answer: 230 and 0.11]
5. Find the quartile deviation and the coefficient of Q.D.
Age (less than): 10 20 30 40 50 60 70 80
No. of persons: 15 30 53 75 100 110 115 125
[ Answer: 13.4783 and 0.3962]
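
Problem 1 above can be checked with a short Python sketch. It assumes the usual grouped-data interpolation for a quartile, L + ((pN - c.f.) / f) × h, where L is the lower boundary of the quartile class, c.f. the cumulative frequency before it, f its frequency and h the class width. The helper `grouped_quantile` is a hypothetical name introduced for this illustration.

```python
def grouped_quantile(lower_bounds, width, freqs, p):
    """Quantile of grouped data by linear interpolation inside the class
    that contains the p-th fraction of the observations."""
    target = p * sum(freqs)
    cumulative = 0
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("p must lie between 0 and 1")

# Problem 1: age distribution of 1500 women, classes of width 4 years.
lower_bounds = [16, 20, 24, 28, 32, 36]
freqs = [200, 250, 400, 300, 250, 100]

q1 = grouped_quantile(lower_bounds, 4, freqs, 0.25)
q3 = grouped_quantile(lower_bounds, 4, freqs, 0.75)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(round(q1, 2), round(q3, 2))
print(round(quartile_deviation, 2), round(coefficient_qd, 2))
```

Under these assumptions the quartile deviation comes out at about 4.43 years and the coefficient at about 0.16, close to the stated answer; the small difference in the second decimal is presumably due to rounding of the quartiles.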

Interquartile Range
A measure of variability that overcomes the dependency on extreme values is the
interquartile range (IQR). This measure of variability is the difference between the third
quartile, Q3, and the first quartile, Q1. In other words, the interquartile range is the range for
the middle 50% of the data.

IQR = Q3 - Q1

MEAN DEVIATION

The previous two measures of dispersion, viz. range and quartile deviation, do not consider the
deviations of the observations from a central value. The mean deviation takes these deviations in
absolute value and averages them. Mean deviation considers all the observations and is therefore
superior to these two measures. Although the deviations can in theory be taken from any average, the
median is the best choice because the mean deviation from the median is the least.
Mean Deviation is calculated as follows:

Raw Data:

$$\text{M.D.} = \frac{\sum_{i=1}^{n} |x_i - A|}{n}$$

Frequency Distribution:

$$\text{M.D.} = \frac{\sum f_i\,|x_i - A|}{N}, \qquad N = \sum f_i$$

$$\text{Coefficient of Mean Deviation} = \frac{\text{M.D.}}{A}$$

where A is the mean, median or mode from which the deviations are taken.
Problems:
1. Calculate mean deviation from median and the coefficient of M.D for the following
distribution of ages of 500 persons.
Age in years: 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 – 50
No. of persons: 70 80 180 100 50 20
[ Answer: 4.8896 and 0.1492]
2. Find the mean deviation from mode and the corresponding coefficient of mean deviation for
the following data.
Income in Rs.: 800-1000 1000-1200 1200-1400 1400-1600 1600-1800
No. of persons: 16 34 60 37 13
[ Answer: 163.545 and 0.1252]
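
As an illustration, Problem 1 above (mean deviation from the median) can be sketched in Python. The sketch assumes that class midpoints stand in for the observations and that the median of grouped data is found by linear interpolation within the median class; all variable names are illustrative.

```python
# Problem 1: ages of 500 persons, classes of width 5 years.
lower_bounds = [20, 25, 30, 35, 40, 45]
width = 5
freqs = [70, 80, 180, 100, 50, 20]

total = sum(freqs)
midpoints = [lower + width / 2 for lower in lower_bounds]

# Median by interpolation inside the class that contains the (N/2)-th value.
cumulative = 0
for lower, f in zip(lower_bounds, freqs):
    if cumulative + f >= total / 2:
        median = lower + (total / 2 - cumulative) / f * width
        break
    cumulative += f

# Frequency-weighted average of the absolute deviations from the median.
mean_deviation = sum(f * abs(x - median) for x, f in zip(midpoints, freqs)) / total
coefficient_md = mean_deviation / median

print(round(median, 2))
print(round(mean_deviation, 4), round(coefficient_md, 4))
```

This gives a median of about 32.78, a mean deviation of about 4.889 and a coefficient of about 0.149, in line with the stated answer up to rounding.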

STANDARD DEVIATION
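It is the most important and widely used of all the measures of dispersion. In mean deviation,
algebraic signs are ignored. In standard deviation, the deviations are squared to get positive values.
The deviations from the arithmetic mean are squared and averaged, and the square root of the
resulting quantity is taken. Therefore, this is also known as the ‘root-mean-square deviation’.

If $x_1, x_2, \ldots, x_n$ are n observations, then their standard deviation, denoted by ‘σ’, is given by

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2}$$

In the case of a frequency distribution the standard deviation is calculated as follows:

$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}} = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}, \qquad N = \sum f_i, \quad \bar{x} = \frac{\sum f_i x_i}{N}$$

The corresponding relative measure known as the coefficient of variation is calculated as

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$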

Problems:

1. Find the standard deviation for the following sets of values:


i. 15, 20, 17, 8, 9, 12, 18, 10
ii. 652, 672, 670, 639, 642, 670
iii. 85, 35, 43, 75, 42, 41
iv. 52, 57, 49, 48, 35, 37
[ Answer: i. 4.2112 ii. 13.7568 iii. 19.1289 iv. 7.867]
2. From the following distribution, find the standard deviation.
i.  xi: 11 12 13 14 15 16 17
    fi: 3 6 10 8 5 3 2
ii. xi: 20 30 40 50 60 70 80 90
    fi: 5 8 12 9 7 5 2 1
[ Answer: i. 1.5657 ii. 17.189]
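
The answers above are consistent with a divisor of n (not n - 1). A minimal Python sketch under that assumption: `statistics.pstdev` handles the raw data, and the hypothetical helper `grouped_std_dev` applies the shortcut form sqrt(Σf·x²/N - mean²) to a frequency distribution.

```python
from math import sqrt
from statistics import pstdev

# Problem 1 (i): raw data, population standard deviation (divisor n).
raw_values = [15, 20, 17, 8, 9, 12, 18, 10]
sd_raw = pstdev(raw_values)

def grouped_std_dev(values, freqs):
    """Standard deviation of a frequency distribution: sqrt(sum(f*x^2)/N - mean^2)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / n
    return sqrt(mean_of_squares - mean ** 2)

# Problem 2 (i) and (ii): frequency distributions.
sd_2i = grouped_std_dev([11, 12, 13, 14, 15, 16, 17], [3, 6, 10, 8, 5, 3, 2])
sd_2ii = grouped_std_dev([20, 30, 40, 50, 60, 70, 80, 90], [5, 8, 12, 9, 7, 5, 2, 1])

print(round(sd_raw, 4), round(sd_2i, 4), round(sd_2ii, 3))
```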

All types of Dispersion Problems:

1) A bowler’s scores for six games were 182, 168, 184, 190, 170, and 174.
Using these data as a sample, compute the following descriptive statistics:
a. Range b. Variance c. Standard deviation d. Coefficient of variation

2) A home theater in a box is the easiest and cheapest way to provide surround sound for
a home entertainment center. A sample of prices is shown here (Consumer Reports
Buying Guide, 2004). The prices are for models with a DVD player and for models
without a DVD player.
a) Compute the mean price for models with a DVD player and the mean price for
models without a DVD player. What is the additional price paid to have a DVD
player included in a home theater unit?
b) Compute the range, variance, and standard deviation for the two samples. What
does this information tell you about the prices for models with and without a DVD
player?

3) The Los Angeles Times regularly reports the air quality index for various areas of
Southern California. A sample of air quality index values for Pomona provided the
following data: 28, 42, 58, 48, 45, 55, 60, 49, and 50.
a) Compute the range and interquartile range.
b) Compute the sample variance and sample standard deviation.
c) A sample of air quality index readings for Anaheim provided a sample mean of
48.5, a sample variance of 136, and a sample standard deviation of 11.66. What
comparisons can you make between the air quality in Pomona and that in
Anaheim based on these descriptive statistics?
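
These problems treat the data as a sample, so the variance is usually computed with a divisor of n - 1 rather than n. A minimal sketch for the bowler's scores in Problem 1, assuming that convention; `statistics.variance` and `statistics.stdev` use the n - 1 divisor.

```python
from statistics import mean, stdev, variance

# Problem 1: a bowler's scores for six games, treated as a sample.
scores = [182, 168, 184, 190, 170, 174]

score_range = max(scores) - min(scores)
sample_variance = variance(scores)   # divisor n - 1
sample_sd = stdev(scores)            # square root of the sample variance
coefficient_of_variation = sample_sd / mean(scores) * 100

print(score_range, sample_variance)
print(round(sample_sd, 2), round(coefficient_of_variation, 2))
```

Replacing `variance` and `stdev` with `pvariance` and `pstdev` would give the divisor-n versions used in the standard deviation section.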

___________________________________________________________________________

Measures of Distribution Shape, Relative Location, and Detecting Outliers
Shape
Shape is the pattern of the distribution of data values throughout the entire range of all the
values. A distribution is either symmetrical or skewed. In a symmetrical distribution, the
values below the mean are distributed exactly as the values above the mean. In this case, the
low and high values balance each other out. In a skewed distribution, the values are not
symmetrical around the mean. This skewness results in an imbalance of low values or high
values.
Shape influences the relationship of the mean to the median in the
following ways:
Mean < median: negative, or left-skewed
Mean = median: symmetric, or zero skewness
Mean > median: positive, or right-skewed

If Q1 - Min > Max - Q3, then the data is left skewed.
If Mean < Median, then the data is left skewed.
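
The mean-versus-median comparison can be sketched in Python under the usual convention that a right-skewed (positively skewed) distribution has its mean above its median. This is only a rule of thumb, the function name `skew_direction` is hypothetical, and the first two data sets are made up for illustration.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule-of-thumb skew direction from the position of the mean relative to the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "symmetric"

# Hypothetical data: one very large value pulls the mean above the median.
print(skew_direction([1, 2, 3, 4, 20]))
# Hypothetical data: one very small value pulls the mean below the median.
print(skew_direction([1, 17, 18, 19, 20]))
# Series B from earlier in the chapter: mean and median coincide.
print(skew_direction([50, 55, 60, 65, 70]))
```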
Outlier: Sometimes a data set will have one or more observations with unusually large or
unusually small values. These extreme values are called outliers.

It is a good idea to check for outliers before making decisions based on data analysis. Errors
are often made in recording data and entering data into the computer. Outliers should not
necessarily be deleted, but their accuracy and appropriateness should be verified.

To identify an outlier:

Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is treated as an outlier.

Any value outside the range of mean ± 3σ is treated as an outlier.
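
The 1.5 × IQR rule can be sketched in Python as follows. Quartiles are taken from `statistics.quantiles` with its default method; other quartile conventions can give slightly different fences. The function name `iqr_outliers` and the second data set are hypothetical; the first data set is the Pomona air quality sample from the problems above.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Values lying below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Pomona air quality index sample.
pomona = [28, 42, 58, 48, 45, 55, 60, 49, 50]
print(iqr_outliers(pomona))

# Hypothetical data with one unusually large value.
hypothetical = [10, 12, 13, 14, 15, 16, 18, 60]
print(iqr_outliers(hypothetical))
```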

Z Score
An extreme value or outlier is a value located far away from the mean. Z scores are useful in
identifying outliers. The larger the absolute value of the Z score, the greater the distance from the
value to the mean. The Z score is the difference between the value and the mean, divided by the
standard deviation:

$$Z = \frac{x - \bar{x}}{s}$$

where $\bar{x}$ is the mean and $s$ is the standard deviation.

Generally, a value is considered an outlier if its Z score is less than -3.0 or greater than +3.0.
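
A minimal Python sketch of the Z score calculation, using the sample 10, 20, 12, 17, 16 from the problems later in this section. It assumes the sample standard deviation (divisor n - 1) because the data are described as a sample; the function name `z_scores` is illustrative.

```python
from statistics import mean, stdev

def z_scores(values):
    """Z score of each value: (value - mean) / sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation, divisor n - 1
    return [(v - centre) / spread for v in values]

sample = [10, 20, 12, 17, 16]
scores = z_scores(sample)
print(scores)

# No value in this sample has a Z score beyond -3 or +3.
flagged = [v for v, z in zip(sample, scores) if abs(z) > 3]
print(flagged)
```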
The Chebyshev Rule

The Chebyshev rule states that for any data set, regardless of shape, the percentage of values
that are found within a distance of k standard deviations from the mean must be at least
$(1 - 1/k^2) \times 100\%$

You can use this rule for any value of k greater than 1. Consider k = 2. The Chebyshev rule
states that at least $[1 - (1/2)^2] \times 100\% = 75\%$ of the values must be found within 2 standard
deviations of the mean. The Chebyshev rule is very general and applies to any type of
distribution. The rule indicates at least what percentage of the values fall within a given
distance from the mean. However, if the data set is approximately bell shaped, the empirical
rule will more accurately reflect the greater concentration of data close to the mean. Table 3.6
compares the Chebyshev and empirical rules.

CHEBYSHEV’S THEOREM: At least $(1 - 1/z^2)$ of the data values must be within z standard
deviations of the mean, where z is any value greater than 1.
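
As an illustration, the bound can be evaluated in Python for a mean of 30 and a standard deviation of 5, the values used in the problems at the end of this section. The function name `chebyshev_min_fraction` is hypothetical, and the sketch assumes each range is symmetric about the mean.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the Chebyshev bound is only informative for k > 1")
    return 1 - 1 / k ** 2

mean_value, std_dev = 30, 5
bounds = {}
for low, high in [(20, 40), (15, 45), (22, 38)]:
    # Each range is symmetric about the mean, so k is half its width in standard deviations.
    k = (high - mean_value) / std_dev
    bounds[(low, high)] = chebyshev_min_fraction(k)
    print(low, high, k, round(bounds[(low, high)] * 100, 2))
```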
The Empirical Rule:

In most data sets, a large portion of the values tend to cluster somewhat near the median. In
right-skewed data sets, this clustering occurs to the left of the mean, that is, at a value less
than the mean. In left-skewed data sets, the values tend to cluster to the right of the mean, that
is, at a value greater than the mean. In symmetrical data sets, where the median and mean are
the same, the values often tend to cluster around the median and mean, producing a bell-shaped
distribution. You can use the empirical rule to examine the variability in bell-shaped
distributions:

- Approximately 68% of the values are within a distance of 1 standard deviation from
the mean.
- Approximately 95% of the values are within a distance of 2 standard deviations from
the mean.
- Approximately 99.7% are within a distance of 3 standard deviations from the mean.

Note: The Chebyshev Rule should be used for interpreting nonsymmetrical data, while the
Empirical Rule should be used for symmetrical, bell-shaped data.
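
The empirical-rule percentages are those of an exactly normal distribution. A small sketch using `statistics.NormalDist` from the Python standard library shows the same figures alongside the Chebyshev lower bound for k = 2 and k = 3; the function name `fraction_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    normal_pct = fraction_within(k) * 100
    if k > 1:
        chebyshev_pct = (1 - 1 / k ** 2) * 100
        print(k, round(normal_pct, 1), round(chebyshev_pct, 1))
    else:
        # The Chebyshev bound gives no information for k = 1.
        print(k, round(normal_pct, 1))
```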

Problems:

1) Consider a sample with data values of 10, 20, 12, 17, and 16. Compute the z-score for
each of the five observations.

2) Consider a sample with a mean of 500 and a standard deviation of 100. What are the
z-scores for the following data values: 520, 650, 500, 450, and 280?

3) Consider a sample with a mean of 30 and a standard deviation of 5. Use Chebyshev’s
theorem to determine the percentage of the data within each of the following ranges:
a) 20 to 40
b) 15 to 45
c) 22 to 38
4) Suppose the data have a bell-shaped distribution with a mean of 30 and a standard
deviation of 5. Use the empirical rule to determine the percentage of data within each of
the following ranges:
a) 20 to 40
b) 15 to 45
c) 25 to 35

Kurtosis is a statistical measure that describes the shape and characteristics of the probability
distribution of a dataset. It provides insights into the tails and overall distribution of data
points relative to the shape of a standard normal distribution (bell curve).

In simpler terms, kurtosis helps us understand whether the data in a dataset has heavy tails
(outliers or extreme values) or light tails (values that cluster around the mean), compared to a
normal distribution.

There are generally three main types of Kurtosis.

Mesokurtic: A dataset with mesokurtic kurtosis has a distribution similar to a normal
distribution. The tails are neither too heavy nor too light. The kurtosis value for a normal
distribution is 3, so a mesokurtic distribution would have a kurtosis close to 3.

Leptokurtic: A dataset with leptokurtic kurtosis has heavier tails than a normal distribution.
This indicates that the dataset has more extreme values or outliers compared to a normal
distribution. The kurtosis value for a leptokurtic distribution is greater than 3.

Platykurtic: A dataset with platykurtic kurtosis has lighter tails than a normal distribution.
This suggests that the dataset has fewer extreme values or outliers than a normal distribution.
The kurtosis value for a platykurtic distribution is less than 3.

Kurtosis is calculated using the fourth standardized moment of a dataset, which involves
raising each deviation from the mean to the fourth power. The formula for calculating sample kurtosis is:

$$\text{Sample Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n \cdot s^4}$$
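
A minimal Python sketch of this formula follows. The text does not say whether s uses a divisor of n or n - 1; the sketch assumes the divisor n (population variance), which is the convention under which a normal distribution has kurtosis 3. The function name `sample_kurtosis` and the data are hypothetical.

```python
from statistics import mean, pvariance

def sample_kurtosis(values):
    """Sum of fourth-power deviations from the mean, divided by n * s^4."""
    n = len(values)
    centre = mean(values)
    fourth_power_sum = sum((v - centre) ** 4 for v in values)
    # s^4 is the squared variance; divisor n is an assumption of this sketch.
    s_fourth = pvariance(values) ** 2
    return fourth_power_sum / (n * s_fourth)

# Hypothetical, evenly spread data: lighter tails than a normal distribution.
flat = [1, 2, 3, 4, 5]
print(round(sample_kurtosis(flat), 2))
```

A value below 3, as here, points to a platykurtic shape.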

It's important to note that kurtosis, while informative, is not the sole indicator of a dataset's
distribution. Other measures, like skewness (asymmetry) and graphical techniques, are often
used alongside kurtosis to fully understand the characteristics of a dataset's distribution.
Additionally, the interpretation of kurtosis can depend on the context of the data and the goals
of the analysis.
DLLE
Group Members
1. Priya (Leader) - 29
2. Rhea - 7
3. Rishab - 16
4. Fareed - 40
5. Ebrahim - 46
6. Akshay - 39
7. Devang - 52
8. Nicole - 6

Subject: DLLE
Topic Name: Gender Equality
Sustainable Development Goal No.: 5
Date: 28th August 2023
Trainers: Priya (Leader), Rhea, Rishab, Fareed, Ebrahim, Akshay, Devang, Nicole
Session Date: 28th August 2023
Session Semester: 1
Time Duration: 45 min - 1 hr
Location: 205
Reference:
https://jerrytompkin.blogspot.com/2022/03/gender-equality-lesson-plans.html
https://www.livescience.com/22037-pink-girls-blue-boys.html

Objective: Gender sensitization
Details: 5 students will be selected and will be posed with 10 questions surrounding opportunities received
Learning Resources: https://www.unwomen.org/en/news/stories/2021/6/feature-seven-ways-to-change-the-world
Learning Activity: Observation by attendees, participants and information by session team
Assessment: Assessment 1

Objective: Business Leadership
Details: Class asked to write down F/M depending on the roles they associate the position with
Learning Resources: https://jerrytompkin.blogspot.com/2022/03/gender-equality-lesson-plans.html
Learning Activity: Gender stereotypes will be addressed catering to gender roles. Information by session team
Assessment: Assessment 2

Objective: Meaning and classification
Details: The team will say 10 words and the class has to write down the possible meanings associated with it
Learning Resources: Self prepared
Learning Activity: Gender, sex, expectation, majority, obvious, independent, fir, society, gentle
Assessment: Assessment 3

Objective: Global goals for SDG
Details: Actions to SDG
Learning Resources: https://sdgs.un.org/goals/goal5
Learning Activity: The articles and their association with assessment no. 3
Assessment: Assessment 4

Assessment 1
Fareed and Priya
Step forward if Yes/ Step back if No
1. Are you involved in monetary decisions of the family?
2. Do you drive a car?
3. Do you workout in an outdoor place/gym?
4. Have you been dropped home by a friend, just because it is late at night?
5. Are you concerned (bothered) about your looks?
6. Have you ever had an identity crisis?
7. Do you cook at home?
8. Have you ever taken time off from work or college to help prepare the house
when a guest is expected?
9. Do you have your own bank account?
10. Have you ever felt the need to change your gender?
Information:
Gender inequality plays a very subtle role in our lives and often goes unnoticed unless it is
voiced. This activity uses an exercise to bring about gender sensitization.
We now move into the second part, which deals with Business Leadership.

Assessment 2
Ebrahim and Akshay
Note down F for Female and M for Male
(Say it fast so the class writes down the first thing that they think of and don't discuss)
1. Prime Minister
2. Represents nation’s parliament
3. Principal
4. Doctor
5. Boss of your parent
6. Head of local police
7. Local bank manager
8. Newsreader on TV
9. Lead singer of your favourite song
10. Sports coach/ trainer
Information:
Count the number of F and M. Very often we stereotype, that is, we make assumptions about people
and situations based on history and not on their calibre or ability (e.g. asking a girl to cook and
assuming she knows how, even though cooking is a life skill that everyone should know). We
need to give people a platform to be confident, and not demotivate them based
on assumptions.

Assessment 3
Devang and Rhea
Meaning and classification
(Ask the class to write down what they feel the meanings of these words are. Then reveal the
meanings covering 5 points each. Rhea you could introduce this)
1. Gender: by nurture; society and environment impact; Eg: Girl, Boy, third etc.
2. Sex - nature; biology; female, male, LGBTQIA+
3. Expectation - by society, external effect
4. Majority - maximum
5. Obvious - without thinking
6. Independent - having no dependence
7. Society - people we live with, grow with and interact with
8. Gentle - mannerism
9. Dominant - common or direction
10. Leaders - person capable to lead
Information:
In the next section we will learn more about how these words come into play in real life and
how they can lead to hurdles or bridges in gender equality.

Assessment 4
Rishab and Nicole
Global Goals
Rishab
The Global Goals for Sustainable Development are a plan developed by the United Nations
and agreed upon by all countries to work towards by 2030 to:
1. Fight against Gender Inequality
2. End extreme poverty
3. And respect our planet
- Global plan is for everyone no matter who they are or where they live - to find
solutions to the most pressing issues for people and the planet
- Observe this video to understand what SDG 5 talks about and how Companies have
taken steps to include it
- SDG 5 - https://youtu.be/vz7IUDOYvXk?si=qdzRISJX768CtQzm
- Standard Chartered - https://youtu.be/bM64NrqVMq8?list=PLAm6_yeZLsSSYG9C3c3aVhDZF0WiaAbHQ
- What did you observe?
Nicole
1. Roles and responsibilities - Gender, sex, expectation, majority, obvious, independent,
fir, society, gentle, dominant, leader
2. Actionable SDG 5 -
SDG.com
Target 5.1 End all forms of discrimination against all women and girls everywhere

Indicator 5.1.1: Whether or not legal frameworks are in place to promote, enforce and
monitor equality and non-discrimination on the basis of sex

Target 5.2 Eliminate all forms of violence against all women and girls in the public and
private spheres, including trafficking and sexual and other types of exploitation

Indicator 5.2.1: Proportion of ever-partnered women and girls aged 15 years and older
subjected to physical, sexual or psychological violence by a current or former intimate
partner in the previous 12 months, by form of violence and by age
Indicator 5.2.2: Proportion of women and girls aged 15 years and older subjected to sexual
violence by persons other than an intimate partner in the previous 12 months, by age and
place of occurrence

Target 5.3 Eliminate all harmful practices, such as child, early and forced marriage and
female genital mutilation

Indicator 5.3.1: Proportion of women aged 20-24 years who were married or in a union
before age 15 and before age 18
Indicator 5.3.2: Proportion of girls and women aged 15-49 years who have undergone female
genital mutilation/cutting, by age
Target 5.4 Recognize and value unpaid care and domestic work through the provision of
public services, infrastructure and social protection policies and the promotion of shared
responsibility within the household and the family as nationally appropriate

Indicator 5.4.1: Proportion of time spent on unpaid domestic and care work, by sex, age and
location

Target 5.5 Ensure women’s full and effective participation and equal opportunities for
leadership at all levels of decision making in political, economic and public life

Indicator 5.5.1: Proportion of seats held by women in (a) national parliaments and (b) local
governments
Indicator 5.5.2: Proportion of women in managerial positions

Target 5.6 Ensure universal access to sexual and reproductive health and reproductive rights
as agreed in accordance with the Programme of Action of the International Conference on
Population and Development and the Beijing Platform for Action and the outcome documents
of their review conferences

Indicator 5.6.1: Proportion of women aged 15-49 years who make their own informed
decisions regarding sexual relations, contraceptive use and reproductive health care
Indicator 5.6.2: Number of countries with laws and regulations that guarantee full and equal
access to women and men aged 15 years and older to sexual and reproductive health care,
information and education

Target 5.A Undertake reforms to give women equal rights to economic resources, as well as
access to ownership and control over land and other forms of property, financial services,
inheritance and natural resources, in accordance with national laws

Indicator 5.A.1: (a) Proportion of total agricultural population with ownership or secure rights
over agricultural land, by sex; and (b) share of women among owners or rights-bearers of
agricultural land, by type of tenure
Indicator 5.A.2: Proportion of countries where the legal framework (including customary
law) guarantees women’s equal rights to land ownership and/or control

Target 5.B Enhance the use of enabling technology, in particular information and
communications technology, to promote the empowerment of women

Indicator 5.B.1: Proportion of individuals who own a mobile telephone, by sex

Target 5.C Adopt and strengthen sound policies and enforceable legislation for the
promotion of gender equality and the empowerment of all women and girls at all levels

Indicator 5.C.1: Proportion of countries with systems to track and make public allocations for
gender equality and women’s empowerment

___________________________________________________________________________