
Stats Lecture 03. Summarizing of Data - New

The document discusses various measures of central tendency and variability used to describe data sets. It defines the mean, median and mode as common measures of central tendency, and explains how to calculate each. For measures of variability, it introduces the range, standard deviation, variance and coefficient of variation. It provides examples of calculating and interpreting these measures to compare the spread of different data sets. The document serves to explain the key concepts and calculations for summarizing and comparing the characteristics of numerical data.

Measures of Central Tendency and Variability

Shair Muhammad Hazara


PhD Public Health (Fellow) (HSA, NIH, Islamabad)
MSPH (Health Services Academy, NIH, Islamabad)
MSBE (Dow University of Health Sciences Karachi)
BSN (PRN) The Aga Khan University, Karachi
E-mail address: hazara_27@Hotmail.com
Measure of Central Tendency

Given a data set, a measure of central tendency is a value about which the observations tend to cluster. In other words, it is a value around which the data set is centered.

The three most common measures of central tendency are the mean, the median, and the mode.

Mean

The mean is the arithmetic average of a set of numbers.

Applicable for interval and ratio data

Not applicable for nominal or ordinal data

Affected by each value in the data set, including extreme values

Computed by summing all values in the data set and dividing the sum by the number of values in the data set
Sample Mean

Ages of the patients coming to the clinic: 57, 86, 42, 38, 90, 66

$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n} = \frac{57 + 86 + 42 + 38 + 90 + 66}{6} = \frac{379}{6} \approx 63.167$$
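The sample-mean arithmetic can be checked with a few lines of Python. This is a minimal sketch using the six patient ages from the slide; the variable names are illustrative and do not come from the lecture.

```python
from statistics import mean

# Ages of the six patients from the Sample Mean slide
ages = [57, 86, 42, 38, 90, 66]

total = sum(ages)          # sum of all values
n = len(ages)              # number of values
sample_mean = total / n    # X-bar = (sum of X) / n

print(total, n, round(sample_mean, 3))  # 379 6 63.167

# The standard-library helper gives the same value
assert abs(sample_mean - mean(ages)) < 1e-9
```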
The Median

Middle value in an ordered array of numbers.


Applicable for ordinal, interval, and ratio data
Not applicable for nominal data
Unaffected by extremely large and extremely small values.
Median: Computational Procedure
Arrange the observations in an ordered array.
If there is an odd number of terms, the median is the middle term of
the ordered array.
If there is an even number of terms, the median is the average of the
middle two terms
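The computational procedure can be sketched in Python as follows. The helper name `median_of` is hypothetical, and the two arrays are the patient-age arrays used in this lecture's odd and even median examples.

```python
def median_of(values):
    """Median by the slide's procedure: sort, then take the middle term(s)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)           # step 1: ordered array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                     # odd n: the middle term
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of middle two

odd_ages = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
even_ages = odd_ages[:-1]              # same array without the final 22

print(median_of(odd_ages))             # 15
print(median_of(even_ages))            # 14.5

# Replacing the largest value by an extreme one leaves the median unchanged
print(median_of(odd_ages[:-1] + [100]))  # 15
```

The last call illustrates why the median is described as unaffected by extremely large or small values: only the middle position matters.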
Median: Example with an Odd Number of Terms
Ordered array of the ages of the patients coming to the clinic:
3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22
There are 17 terms in the ordered array.
Position of median = (n+1)/2 = (17+1)/2 = 9
The median is the 9th term, 15.
If the 22 is replaced by 100, the median is still 15.
If the 3 is replaced by -103, the median is still 15.
Median: Example with an Even Number of Terms
Ordered array of the ages of the patients coming to the clinic:
3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21
There are 16 terms in the ordered array.
Position of median = (n+1)/2 = (16+1)/2 = 8.5
The median lies between the 8th and 9th terms (14 and 15): (14 + 15)/2 = 14.5.
If the 21 is replaced by 100, the median is still 14.5.
If the 3 is replaced by -88, the median is still 14.5.
The Mode

The mode is the observation that occurs most frequently.

For a sample of five salaries

$6,000; $10,000; $14,000; $50,000; $10,000

the mode is equal to $10,000.

It should be noted that there can be more than one mode for
a data set.
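A short Python sketch of the mode, using the five salaries from the slide. The second list is a made-up illustration (not from the lecture) of a data set with more than one mode. It assumes Python 3.8 or later, where `statistics.multimode` is available.

```python
from statistics import multimode

# The five salaries from the slide
salaries = [6000, 10000, 14000, 50000, 10000]
print(multimode(salaries))        # [10000]

# Hypothetical data with two equally frequent values (not from the lecture)
two_modes = [1, 2, 2, 3, 3]
print(multimode(two_modes))       # [2, 3]
```

`multimode` returns a list, so it handles the single-mode and multi-mode cases in the same way.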

Measures of Variation

Knowing the central tendency of a data set is helpful, but it is not enough. For example, the following two data sets have the same mean (10):

Data set 1: 5, 6, 8, 10, 12, 14, 15
Data set 2: 1, 4, 8, 10, 12, 16, 19

The difference, however, is that the second data set has more spread. The same point is illustrated by the following distributions, which have the same mean but different spread.

[Figure: distributions with the same mean but different spread]
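The claim about the two data sets can be verified numerically. The sketch below uses the sample standard deviation (introduced later in this lecture) as the measure of spread; the variable names are illustrative.

```python
from statistics import mean, stdev

set_1 = [5, 6, 8, 10, 12, 14, 15]
set_2 = [1, 4, 8, 10, 12, 16, 19]

# Both data sets have the same mean
print(mean(set_1), mean(set_2))                       # 10 10

# The sample standard deviations differ: the second set is more spread out
print(round(stdev(set_1), 2), round(stdev(set_2), 2))  # 3.87 6.35
```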

Measures of Variability:

Measures of variability describe the spread or the dispersion of a set of data.
Common Measures of Variability
Range
Interquartile Range
Variance and Standard Deviation
Coefficient of Variation

Range
The difference between the largest and the smallest values in a set of data
Simple to compute
Ignores all data points except the two extremes

Example:

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48

Range = Largest - Smallest = 48 - 35 = 13

The range is quick to compute but is of limited use, since it considers only the extreme values and does not take into consideration the bulk of the observations. It is not widely used.
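As a quick check of the range example, the sketch below lists the 24 values shown on the slide and applies Range = Largest - Smallest. The ordering of the values in the list is an assumption. It does not affect the result, because only the two extremes are used.

```python
# The 24 observations shown on the Range slide
values = [
    35, 41, 44, 45,
    37, 41, 44, 46,
    37, 43, 44, 46,
    39, 43, 44, 46,
    40, 43, 44, 46,
    40, 43, 45, 48,
]

largest = max(values)
smallest = min(values)
data_range = largest - smallest

print(largest, smallest, data_range)  # 48 35 13
```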
Sample Variance

Average squared distance of the values from the arithmetic mean (here the mean is 1,773), computed with n - 1 in the denominator

         X      X - X̄     (X - X̄)²
     2,398        625      390,625
     1,844         71        5,041
     1,539       -234       54,756
     1,311       -462      213,444
Sum  7,092          0      663,866

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$

Sample Standard Deviation

Square root of the sample variance

         X      X - X̄     (X - X̄)²
     2,398        625      390,625
     1,844         71        5,041
     1,539       -234       54,756
     1,311       -462      213,444
Sum  7,092          0      663,866

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$

$$S = \sqrt{S^2} = \sqrt{221{,}288.67} \approx 470.41$$
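The variance and standard deviation tables can be reproduced step by step. This is a minimal sketch that follows the slide's columns (X, deviation from the mean, squared deviation); the variable names are illustrative.

```python
import math

values = [2398, 1844, 1539, 1311]
n = len(values)

x_bar = sum(values) / n                       # 7092 / 4 = 1773.0
deviations = [x - x_bar for x in values]      # 625, 71, -234, -462
squared = [d ** 2 for d in deviations]        # 390625, 5041, 54756, 213444
sum_of_squares = sum(squared)                 # 663866.0

sample_variance = sum_of_squares / (n - 1)    # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 2))  # 221288.67
print(round(sample_sd, 2))        # 470.41
```

Note that the deviations always sum to zero, which is why they are squared before averaging.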
EXAMPLE. Find the standard deviation of the average temperatures
recorded over a five-day period last winter: 18, 22, 19, 25, 12
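One way to work this exercise is sketched below. It assumes the five temperatures are treated as a sample, so the n - 1 formula from the preceding slides applies. The slide states no answer, so the printed values are this sketch's own result.

```python
from statistics import mean, variance, stdev

temperatures = [18, 22, 19, 25, 12]

t_mean = mean(temperatures)      # 96 / 5 = 19.2
t_var = variance(temperatures)   # sample variance: 94.8 / 4 = 23.7
t_sd = stdev(temperatures)       # square root of the sample variance

print(t_mean, round(t_var, 2), round(t_sd, 2))  # 19.2 23.7 4.87
```

If the five days were instead regarded as the whole population, `statistics.pstdev` (which divides by n) would be the corresponding call.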

Coefficient of variation
It is a dimensionless measure of the relative variation.
– Constructed by dividing the standard deviation by the mean and multiplying by 100.
CV = (s / x̄) × 100

Coefficient of variation
• Used to compare the variability in one data set with
that in another when a direct comparison of
standard deviation is not appropriate.

Coefficient of variation

            Adults     Children
Mean age    25 yrs     11 yrs
Mean wt     145 lbs    80 lbs
SD (wt)     10 lbs     10 lbs
CV (wt)     6.9%       12.5%
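The CV values in the table follow directly from CV = (SD / mean) × 100. A minimal sketch, with a hypothetical helper named `coefficient_of_variation`:

```python
def coefficient_of_variation(sd, mean):
    """Return the CV as a percentage; the mean must be non-zero."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100

cv_adults = coefficient_of_variation(sd=10, mean=145)
cv_children = coefficient_of_variation(sd=10, mean=80)

print(round(cv_adults, 1), round(cv_children, 1))  # 6.9 12.5
```

Both groups have the same standard deviation (10 lbs), but relative to their mean weight the children's weights vary almost twice as much, which is the kind of comparison the CV is meant for.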
• Example: Two plants, C and D, of a factory show the following results for the number of workers and the wages paid to them.

                         Plant C    Plant D
No. of workers           5000       6000
Average monthly wages    $2500      $2500
Standard deviation       9          10

Using the coefficient of variation formula, find which plant, C or D, has greater variability in individual wages.
To Find: Which plant has greater variability.
For this, we need to find the coefficient of variation. The plant that has a higher coefficient of variation will have greater variability.

Coefficient of variation for plant C:
Using the coefficient of variation formula, CV = (σ/μ) × 100, μ ≠ 0
CV = (9/2500) × 100
CV = 0.36%

Now, CV for plant D:
CV = (σ/μ) × 100
CV = (10/2500) × 100
CV = 0.4%

Plant C has CV = 0.36% and plant D has CV = 0.4%.

Hence plant D has greater variability in individual wages.