Central Tendency + Dispersion
Explain three important measures of central tendency.
• Measures of central tendency are scores that represent the center of the distribution.
• Three of the most common measures of central tendency are:
– Mean
– Median
– Mode
The Mean
The mean is the arithmetic average of the scores in a distribution:
$$\bar{X} = \frac{\sum_{i} X_i}{N}$$
Mean Example
Exam Scores
75 82 72 68 89
91 78 94 88 75
$\sum X$ = sum of all scores
$n$ = total number of scores for the sample
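As a quick check of the formula on the exam scores above, the following minimal Python sketch sums the scores and divides by their count. The variable names are illustrative and are not part of the original slides.

```python
# Exam scores from the Mean Example slide.
exam_scores = [75, 82, 72, 68, 89, 91, 78, 94, 88, 75]

total = sum(exam_scores)   # sum of all scores
n = len(exam_scores)       # total number of scores in the sample
mean_score = total / n     # arithmetic average

print(f"sum = {total}, n = {n}, mean = {mean_score}")
```

For these ten scores the sum is 812, so the mean works out to 81.2.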
Pros and cons of using mean
• Pros
– Summarizes data in a way that is easy to understand.
– Uses all the data
– Used in many statistical applications
• Cons
– Affected by extreme values
  • E.g., average salary at a company
    – 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 20,000; 390,000
    – Mean = $44,167
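The effect of the two large salaries can be seen by comparing the mean with the median of the same list. This is a small sketch using Python's standard `statistics` module. The comparison with the median is an added illustration, not something stated on the slide.

```python
import statistics

# Salary list from the slide: ten salaries of 12,000 plus two larger ones.
salaries = [12_000] * 10 + [20_000, 390_000]

mean_salary = statistics.mean(salaries)      # pulled upward by 390,000
median_salary = statistics.median(salaries)  # average of the two middle salaries

print(f"mean   = {round(mean_salary):,}")
print(f"median = {median_salary:,.0f}")
```

The mean rounds to 44,167, while the median stays at 12,000, which is what ten of the twelve employees actually earn.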
Median
• The middle score of the distribution when all the scores have been ranked.
• If there is an even number of scores, the median is the average of the two middle scores.
Central Tendency Example: Median
• 52, 76, 100, 136, 186, 196, 205, 150, 257, 264,
264, 280, 282, 283, 303, 313, 317, 317, 325, 373,
384, 384, 400, 402, 417, 422, 472, 480, 643, 693,
732, 749, 750, 791, 891
• The median is the middle value when observations
are ordered.
– To find the middle, count in (N+1)/2 scores when
observations are ordered lowest to highest.
• Median hotel rate:
– (35+1)/2 = 18
– 317
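The counting rule above can be sketched in Python as follows. The list is copied as it appears on the slide, so it is sorted first. The (N+1)/2 position rule is applied directly, which only works as written when N is odd, as it is here.

```python
import statistics

# Hotel rates as listed on the slide (not perfectly ordered, so we sort them).
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264,
    264, 280, 282, 283, 303, 313, 317, 317, 325, 373,
    384, 384, 400, 402, 417, 422, 472, 480, 643, 693,
    732, 749, 750, 791, 891,
]

ranked = sorted(hotel_rates)       # order lowest to highest
n = len(ranked)
assert n % 2 == 1                  # the simple position rule assumes odd N

position = (n + 1) // 2            # count in (N+1)/2 scores
median_rate = ranked[position - 1] # lists are 0-indexed

# Cross-check against the standard library.
assert median_rate == statistics.median(hotel_rates)
print(f"N = {n}, position = {position}, median = {median_rate}")
```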
Median (cont'd)
Number of Words Recalled in Performance Study
2 2 3 3 4 4 4 4 4 10
With 10 scores, the median is the average of the 5th and 6th ranked scores: (4 + 4)/2 = 4.
Pros and Cons of Median
• Pros
– Not influenced by extreme scores or skewed distributions.
– Good with ordinal data.
– Easier to compute than the mean.
• Cons
– May not exist in the data.
– Doesn't take actual values into account.
The Mode
The mode is the most frequently occurring score.
Number of Words Recalled in Performance Study
2 2 3 3 4 4 4 4 4 10
The mode is 4.
Mode (cont'd)
72 72 73 76 78
81 83 85 85 86
87 88 90 91 92
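Both mode examples can be checked with Python's `statistics` module. This is a sketch that assumes Python 3.8 or later, where `statistics.multimode` is available. It returns every value tied for the highest count, which is useful when a data set has more than one mode.

```python
import statistics

# Words recalled in the performance study.
words_recalled = [2, 2, 3, 3, 4, 4, 4, 4, 4, 10]

# Scores from the second mode slide.
scores = [72, 72, 73, 76, 78,
          81, 83, 85, 85, 86,
          87, 88, 90, 91, 92]

mode_words = statistics.mode(words_recalled)  # single most frequent value
modes_scores = statistics.multimode(scores)   # all values tied for most frequent

print("mode of words recalled:", mode_words)
print("modes of scores:", modes_scores)
```

In the second data set, 72 and 85 each occur twice and every other score occurs once, so the distribution has two modes.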
Demonstration
Red Blue Green Yellow
• Pros
– Good for nominal data.
– Good when there are two “typical” scores.
– Easiest to compute and understand.
– The score comes from the data set.
• Cons
– Ignores most of the information in a distribution.
– Small samples may not have a mode.
Scales of Measurement
• Nominal scale = mode
• Ordinal scale = median
• Interval (Discrete) scale = mean, median, or mode
• Ratio (Continuous) scale = mean, median, or mode
What is dispersion?
Explain two important measures of
dispersion.
Measures of Dispersion
Why Study Dispersion?
• An average, such as the mean or the median, only locates the centre of the data.
• An average does not tell us anything about the spread of the data.
• A small value for a measure of dispersion indicates that the data are clustered closely around the centre (the mean is therefore representative of the data).
• A large value for a measure of dispersion indicates that the data are widely spread (the mean is therefore less representative of the data).
[Figure: two charts of Daily Computer Production, one with axis values 47 to 53 and one with axis values 44 to 56]
WHAT IS DISPERSION?
Dispersion is the measure of the variation of
the items.
Measures of Dispersion are:
– Range
– Quartile Deviation
– Mean Deviation
– Standard Deviation
– Variance
The Range
The simplest measure of dispersion is the range
For ungrouped data, the range is the difference
between the highest and lowest values in a set of
data.
RANGE = Highest Value - Lowest Value
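As a small illustration of the range formula, the sketch below applies it to the ten exam scores from the Mean Example. Reusing that data set here is a choice made for this example, as the slide gives no data for the range.

```python
# Exam scores reused from the Mean Example (ungrouped data).
exam_scores = [75, 82, 72, 68, 89, 91, 78, 94, 88, 75]

highest = max(exam_scores)
lowest = min(exam_scores)
score_range = highest - lowest   # RANGE = Highest Value - Lowest Value

print(f"range = {highest} - {lowest} = {score_range}")
```

Only the two extreme scores enter the calculation, so the range says nothing about how the remaining scores are distributed between them.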
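Mean Deviation
$$\bar{x} = \frac{\sum f x}{\sum f}, \qquad MD = \frac{\sum f\,|x - \bar{x}|}{\sum f}$$
x    f    fx    |x - x̄|    f|x - x̄|
0    1    0     2          2
1    9    9     1          9
2    7    14    0          0
3    3    9     1          3
4    4    16    2          8
Total: Σf = 24, Σfx = 48, Σf|x - x̄| = 22
Mean = 48/24 = 2
MD = 22/24 = 0.92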
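The mean deviation calculation for this frequency table can be reproduced with a short Python sketch. The names `values` and `freqs` are illustrative labels for the first two columns of the table (the value x and its frequency f).

```python
# Frequency table from the slide: each value x occurs f times.
values = [0, 1, 2, 3, 4]
freqs = [1, 9, 7, 3, 4]

total_f = sum(freqs)                                      # sum of f
sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of f*x
mean = sum_fx / total_f                                   # weighted mean

# Sum of frequency-weighted absolute deviations from the mean.
sum_f_abs_dev = sum(f * abs(x - mean) for x, f in zip(values, freqs))
mean_deviation = sum_f_abs_dev / total_f

print(f"sum f = {total_f}, sum fx = {sum_fx}, mean = {mean}")
print(f"MD = {sum_f_abs_dev}/{total_f} = {mean_deviation:.2f}")
```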
Standard Deviation
Standard deviation is the most commonly
used measure of dispersion
Similar to the mean deviation, the standard
deviation takes into account the value of
every observation
The values of the mean deviation and the
standard deviation should be relatively
similar
3.3 Standard Deviation
A. Standard Deviation for Ungrouped Data
$$\text{Standard deviation} = \sqrt{\frac{(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2}{n}} = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}}$$
Notes:
1. Two sets of data may have the same mean but different standard deviations.
2. The larger the standard deviation, the more spread out the data is.
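Both notes can be illustrated with a minimal Python sketch of the ungrouped formula, which divides by n. The two data sets below are hypothetical and chosen only so that they share the same mean. The function name `population_std` is also an illustrative choice.

```python
import math

def population_std(data):
    """Standard deviation for ungrouped data, dividing by n."""
    n = len(data)
    mean = sum(data) / n
    squared_devs = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_devs) / n)

# Hypothetical data sets with the same mean but different spread.
set_a = [4, 5, 6]
set_b = [1, 5, 9]

mean_a = sum(set_a) / len(set_a)
mean_b = sum(set_b) / len(set_b)
std_a = population_std(set_a)
std_b = population_std(set_b)

print(f"set A: mean = {mean_a}, std = {std_a:.3f}")
print(f"set B: mean = {mean_b}, std = {std_b:.3f}")
```

Both sets have a mean of 5, but set B is more spread out and therefore has the larger standard deviation.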
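3.3 Standard Deviation
B. Standard Deviation for Grouped Data
$$\text{Standard deviation} = \sqrt{\frac{f_1(x_1-\bar{x})^2 + f_2(x_2-\bar{x})^2 + \cdots + f_n(x_n-\bar{x})^2}{f_1 + f_2 + \cdots + f_n}} = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i-\bar{x})^2}{\sum_{i=1}^{n} f_i}}$$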
where $f_i$ is the frequency of the $i$th group of data, $x_i$ is the value of that group, $\bar{x}$ is the mean and $n$ is the number of groups (the total number of data is $\sum f_i$).
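A sketch of the grouped formula in Python is shown below. Applying it to the frequency table from the mean deviation example (values 0 to 4 with frequencies 1, 9, 7, 3, 4) is an assumption made for illustration, since the slide gives no data for this formula. The function name `grouped_std` is hypothetical.

```python
import math

def grouped_std(values, freqs):
    """Standard deviation for grouped data: each value x_i occurs f_i times."""
    total_f = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total_f
    weighted_sq_devs = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return math.sqrt(weighted_sq_devs / total_f)

# Frequency table reused from the mean deviation example (an assumption).
values = [0, 1, 2, 3, 4]
freqs = [1, 9, 7, 3, 4]

std_grouped = grouped_std(values, freqs)
print(f"grouped standard deviation = {std_grouped:.3f}")
```

For this table the weighted sum of squared deviations is 32 and the total frequency is 24, so the result is the square root of 32/24, about 1.155.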
Variance = Square of Standard Deviation
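The relationship between the two quantities can be checked numerically: the variance equals the standard deviation squared, and the standard deviation is the square root of the variance. The sketch below uses the population versions from Python's `statistics` module (`pvariance` and `pstdev`, which divide by n) on the exam scores from the Mean Example. Reusing that data set is a choice made for this example.

```python
import statistics

# Exam scores reused from the Mean Example.
exam_scores = [75, 82, 72, 68, 89, 91, 78, 94, 88, 75]

variance = statistics.pvariance(exam_scores)  # mean of squared deviations
std_dev = statistics.pstdev(exam_scores)      # square root of the variance

print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.3f}")
print(f"standard deviation squared = {std_dev ** 2:.2f}")
```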