Central Tendency

This document provides information about measures of central tendency and dispersion in statistics. It discusses the mean, mode, and median as the three main measures of central tendency. The mean is the average value, the mode is the most frequently occurring value, and the median divides the data set into equal halves. It also discusses how to calculate each of these measures from a data set. Finally, it briefly introduces the concept of data dispersion as a measure of how spread out the values are in a data set.

Uploaded by

rchandra2473

Certification Course on

Quality Assurance and Statistical Quality Techniques


Course Level: A
Title: Central Tendency and Dispersion
Code 1.03: Central Tendency
Issue No.: 01
Effective Date: 15-04-2014

CENTRAL TENDENCY & DISPERSION

The term central tendency refers to a middle value of a data distribution. Three types of
middle value are in common use, the mean, the mode and the median, and each serves a distinct purpose.

The Mean is the arithmetic average of the data set. It is the balance point of the aggregated values:
the deviations of the values above the mean add up to the same total as the deviations of the values
below it. The mean is used to get an idea of the average level of the parameter being studied.
It gives no idea, however, of the data variation or dispersion, and for that reason it can sometimes be
misleading. A sample mean is often studied to get an idea of the entire population, although, as we will
study later, it is subject to sampling errors, which also need to be evaluated when extrapolating the
sample mean to the larger population. For example, suppose we conduct a study in one street and find
that the average height of the persons living there is 165 cm. Would it be correct to state that the
global average height of human beings is 165 cm? Obviously not, because the sample, a single street in
one town, is not representative of the global population. If we take similar samples in a Caucasian
society or in a Hispanic society, the averages may be completely different. Drawing inferences from a
mean therefore needs caution, and the user should be well conversant with the data set and what it
represents.

The Mode is a measure of central tendency that represents the category or class interval with the highest
occurrence in a data set distribution. The mode may not always be in the middle, and we often see
distributions (histograms) where the mode is shifted to the left or right of the total range. In extreme
cases, the mode may lie in the leftmost or rightmost category of the range. We are
interested in the mode so that we can quickly focus our attention on the group or category that
shows the highest occurrence in relation to the others.

The Median is the third measure of central tendency. It divides the entire data set into two halves by
occurrence (not by aggregated values). Thus, if we have a data set of 110 parts arranged in order, the
median denotes the 55th part (strictly, the average of the 55th and 56th parts, since the count is even),
or, if the data were divided into class intervals of, say, 10 each, the median class is
whichever class contains the 55th part. The median (and its extensions, quartiles and percentiles) divides
the population into segments. The median class need not be in the center of the range. In a classroom
of 40 children, the 50th percentile (or median) may lie in the 70-80 % marks range even though the
entire range starts at 40 % and goes up to 100 %.

Let us now see how we calculate all three central tendencies from a data set:

MEAN VALUE: $\bar{x}$

The mean value, or mean, is the simple arithmetic average of the sample values: their total divided by the number of values.

Ungrouped data

Direct calculation

Example : Five samples are taken from a lot of pins. The length of each sample is measured (units =
mm), with the results shown in Table-1a.

STEP 1: Arrange the measured values (x) as shown in Table 1.

Table 1 (general form)

No.       Measured value x
1         $x_1$
2         $x_2$
.         .
.         .
n         $x_n$
Total     $\sum x$
Average   $\bar{x}$

Table 1a (measured data)

No.       x (mm)
1         52.14
2         52.03
3         52.10
4         52.25
5         52.16
Total     260.68
Average   52.136

STEP 2: Find the total (summation, $\sum x$) of x.

$$\sum x = x_1 + x_2 + \dots + x_n = 260.68$$

STEP 3: Find the mean by dividing $\sum x$ by n.

$$\bar{x} = \frac{\sum x}{n} = \frac{260.68}{5} = 52.136 \text{ (mm)}$$
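The direct calculation can be sketched in a few lines of Python. This is a minimal illustration using the five lengths from Table 1a; the names `lengths_mm` and `arithmetic_mean` are chosen here for illustration and are not part of the course material.

```python
# Lengths of the five pin samples from Table 1a, in mm.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]


def arithmetic_mean(values):
    """Return the total of the values divided by their count."""
    return sum(values) / len(values)


total = sum(lengths_mm)                # STEP 2: summation of x
mean_mm = arithmetic_mean(lengths_mm)  # STEP 3: divide the total by n

print(f"total = {total:.2f} mm")
print(f"mean  = {mean_mm:.3f} mm")
```

Rounded to the precision used in the table, the two printed values should agree with the Total and Average rows of Table 1a.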

Calculation using Data Transformation

In this case, a constant a (52) is subtracted from each measured value to make a smaller number, which
is then multiplied by b (100) to eliminate the decimal places. The formulas are modified accordingly.

$$u = (x - a) \times b = (x - 52) \times 100$$

STEP 1: Transform each of the measured values and arrange them as shown in Table 2.

Table 2 (general form)

No.       x        u = (x - a) × b
1         $x_1$    $u_1$
2         $x_2$    $u_2$
.         .        .
.         .        .
n         $x_n$    $u_n$
Total     -        $\sum u$
Average   -        $\bar{u}$

Table 2a (measured data)

No.       x (mm)   u = (x - 52) × 100
1         52.14    14
2         52.03    3
3         52.10    10
4         52.25    25
5         52.16    16
Total     -        68
Average   -        13.6

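STEP 2: Find the mean of the transformed values.

$$\bar{u} = \frac{\sum u}{n} = \frac{68}{5} = 13.6$$

STEP 3: Convert back to the original units, with a = 52 and b = 100.

$$\bar{x} = \bar{u} \times \frac{1}{b} + a$$

$$\bar{x} = 13.6 \times \frac{1}{100} + 52 = 52.136 \text{ (mm)}$$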
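The transformation route can be checked the same way. The sketch below is illustrative only; the rounding of each u to an integer is an assumption added here because binary floating point cannot store values such as 52.14 exactly, and the variable names are not from the source.

```python
# Lengths of the five pin samples from Table 1a, in mm.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]
a = 52    # constant subtracted from every measured value
b = 100   # multiplier that removes the decimal places

# STEP 1: u = (x - a) * b, rounded to the nearest integer to absorb
# floating-point noise (an assumption of this sketch).
u_values = [round((x - a) * b) for x in lengths_mm]

# STEP 2: mean of the transformed values.
u_mean = sum(u_values) / len(u_values)

# STEP 3: undo the transformation to return to millimetres.
x_mean = u_mean / b + a

print("u values:", u_values)
print(f"mean of u = {u_mean}")
print(f"mean of x = {x_mean:.3f} mm")
```

The transformed route gives the same mean as the direct calculation; only the size of the numbers handled along the way changes.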

MODE VALUE : X

The mode is the most frequent score in a data set. On a histogram or bar chart it corresponds to the
highest bar. You can, therefore, sometimes consider the mode as being the most popular option.
An example of a mode is given below. Normally, the mode is used for categorical data where we wish to
know which is the most common category.

Example: A data set has the following values:

N=16: 33 35 36 37 38 38 38 39 39 39 39 40 40 41 41 45

The values are plotted against their frequency on a histogram. As we see on the histogram, the value 39
appears 4 times, which is the maximum in comparison to the other values.

The mode = 39

[Figure: histogram of Frequency (0 to 4) against Score (33 to 45); the highest bar is at score 39]
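A frequency count makes the mode easy to verify. The following Python sketch is an illustration, not part of the course material: the helper name `modes` is hypothetical, and it returns every value that shares the highest frequency so that a data set with two modes is reported faithfully.

```python
from collections import Counter

# The 16 scores from the example above.
scores = [33, 35, 36, 37, 38, 38, 38, 39, 39, 39, 39, 40, 40, 41, 41, 45]


def modes(values):
    """Return (sorted list of most frequent values, their frequency)."""
    counts = Counter(values)
    highest = max(counts.values())
    most_frequent = sorted(v for v, c in counts.items() if c == highest)
    return most_frequent, highest


mode_values, frequency = modes(scores)
print("mode(s):", mode_values, "frequency:", frequency)

# A rough text histogram: one '#' per occurrence of each score.
counts = Counter(scores)
for score in range(min(scores), max(scores) + 1):
    print(f"{score:2d} | {'#' * counts[score]}")
```

For this data set the helper reports a single mode, 39, with a frequency of 4.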

We can have data sets where two values or class intervals show the highest frequency, as shown in
the figure below. The most probable cause is that the data are mixed from two sources, each
having a similar distribution but in a different range.

In data distributions that have a rectangular (uniform) shape, as in the figure below, there may be no
mode. Slight variations in frequency in such distributions should not be mistaken for a mode.

MEDIAN VALUE: $\tilde{x}$

The median value is found by arranging the data values in order and determining the middle value. It is
sometimes used in place of the mean.

For an Odd Number of Values

STEP 1: List the measured values in order of their size.

Example: Determine the value of the median $\tilde{x}$ from the data given in Table 1a.

52.03 52.10 52.14 52.16 52.25

Table 1a (values in ascending order)

STEP 2: The value at the center is taken as the median value: $\tilde{x} = 52.14$ (mm)

For an Even Number of Values

STEP 1: List the measured values in order of their size.

Example: Determine the value of $\tilde{x}$ when $x_6 = 52.24$ is added to the data given in Table 1a.

52.03 52.10 52.14 52.16 52.24 52.25

STEP 2: The two central values are determined, and their average is taken as the median.

$$\tilde{x} = \frac{52.14 + 52.16}{2} = 52.15 \text{ (mm)}$$
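Both cases, odd and even, can be handled by one small function. This is a minimal Python sketch of the two-step procedure described above; the function name `median` and the variable names are illustrative choices.

```python
def median(values):
    """Sort the values, then return the middle one (odd count)
    or the average of the two central ones (even count)."""
    ordered = sorted(values)          # STEP 1: list in order of size
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: single central value
        return ordered[middle]
    # even count: average of the two central values
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_data = [52.14, 52.03, 52.10, 52.25, 52.16]   # Table 1a
even_data = odd_data + [52.24]                   # Table 1a plus x6

print(f"median, odd count  = {median(odd_data):.2f} mm")
print(f"median, even count = {median(even_data):.2f} mm")
```

Note that the input does not need to be sorted beforehand; the function sorts a copy and leaves the original list unchanged.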

Data dispersion

Quantitative characteristics of Distribution


Data distributions are usually evaluated quantitatively by looking at an average value or at the spread of
the distribution of the values.

Spread of the distribution


RANGE : R
The range is the simplest measure of variability to calculate. It is the difference between the smallest and
largest values in the sample. It is calculated simply by subtraction.

STEP 1: Determine the maximum value $x_{max}$ and the minimum value $x_{min}$ among the measured
values.

Example: In the case of Table 1a,

$x_{max} = 52.25$; $x_{min} = 52.03$

STEP 2: Subtract the minimum from the maximum: $R = x_{max} - x_{min}$

$$R = 52.25 - 52.03 = 0.22 \text{ (mm)}$$
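As a quick check, the range of the Table 1a data can be computed in Python. This is only a sketch with illustrative variable names.

```python
# Lengths of the five pin samples from Table 1a, in mm.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]

x_max = max(lengths_mm)       # STEP 1: largest measured value
x_min = min(lengths_mm)       # STEP 1: smallest measured value
data_range = x_max - x_min    # STEP 2: R = x_max - x_min

print(f"x_max = {x_max}, x_min = {x_min}")
print(f"R = {data_range:.2f} mm")
```

Because the range uses only the two extreme values, a single outlier changes it completely, which is one reason the variance and standard deviation are also needed.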

VARIANCE 

The variance and standard deviation are two measures of variability that indicate how much the scores
are spread out around the mean.

The population variance is the true or actual variance of the population of scores.

The formula for the population variance is:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

where $\sigma^2$ is the variance, $\mu$ is the mean, and n is the population size.

The sample variance is the average of the squared deviations of scores around the sample mean

If the variance in a sample is used to estimate the variance in a population, then the previous
formula underestimates the variance and the following formula should be used:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

where $s^2$ is the estimate of the variance and $\bar{x}$ is the sample mean. Note that $\bar{x}$ is the mean of a sample
taken from a population with a mean of μ. Since, in practice, the variance is usually computed in a
sample, this formula is most often used.

There are alternate formulas that can be easier to use if you are doing manual calculations. You should
note that these formulas are subject to rounding error if your values are very large and/or you have an
extremely large number of observations.

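$$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n} \quad \text{and} \quad s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$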
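The warning about rounding error can be illustrated with a short Python sketch. It compares the sample variance computed from squared deviations with the alternate (shortcut) form on the Table 1a data, and then on the same data shifted by a very large constant. The function names and the shift of 1e9 are assumptions made for this illustration; shifting every value by a constant leaves the true variance unchanged.

```python
def sample_variance_deviations(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Alternate form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]   # Table 1a
shifted = [x + 1e9 for x in lengths_mm]            # same spread, huge values

print("small values, deviations:", sample_variance_deviations(lengths_mm))
print("small values, shortcut:  ", sample_variance_shortcut(lengths_mm))
print("large values, deviations:", sample_variance_deviations(shifted))
print("large values, shortcut:  ", sample_variance_shortcut(shifted))
```

On the small values the two forms agree closely. On the shifted values the shortcut subtracts two enormous, nearly equal numbers, so its result can be far from the true variance, while the deviation form stays close to it.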

Calculating variance for Ungrouped Data

STEP 1: The measured value x and the square of the measured value $x^2$ are shown in Table 3.

Table 3 (general form)

No.       x          $x^2$
1         $x_1$      $x_1^2$
2         $x_2$      $x_2^2$
.         .          .
.         .          .
n         $x_n$      $x_n^2$
Total     $\sum x$   $\sum x^2$

Table 3a (measured data)

No.       x (mm)     $x^2$
1         52.14      2,718.5796
2         52.03      2,707.1209
3         52.10      2,714.4100
4         52.25      2,730.0625
5         52.16      2,720.6656
Total     260.68     13,590.8386

STEP 2: Calculate the sum of squares S (the sum of the squared deviations from the mean).

$$S = \sum x^2 - \frac{(\sum x)^2}{n} = 13{,}590.8386 - \frac{(260.68)^2}{5} = 0.02612$$

STEP 3: Calculate the variance V.

$$V = \frac{S}{n - 1} = \frac{0.02612}{5 - 1} = 0.00653 \text{ (mm}^2\text{)}$$
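The three steps can be followed line by line in Python. This sketch uses the Table 1a data; the variable names are illustrative, and the cross-check with the standard library function `statistics.variance` (which also divides by n - 1) is an addition made here, not part of the course material.

```python
import statistics

# Lengths of the five pin samples from Table 1a, in mm.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]
n = len(lengths_mm)

# STEP 1: totals of x and of x squared (the Total row of Table 3a).
sum_x = sum(lengths_mm)
sum_x2 = sum(x ** 2 for x in lengths_mm)

# STEP 2: sum of squares S.
S = sum_x2 - sum_x ** 2 / n

# STEP 3: variance V, dividing by n - 1.
V = S / (n - 1)

print(f"sum of x   = {sum_x:.2f}")
print(f"sum of x^2 = {sum_x2:.4f}")
print(f"S = {S:.5f}")
print(f"V = {V:.5f} mm^2")
print(f"statistics.variance = {statistics.variance(lengths_mm):.5f} mm^2")
```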

Calculation Using Data Transformation


To simplify the calculations, a (52) is subtracted and the result multiplied by b (100) to eliminate decimal
places. The formulas are modified accordingly.

STEP 1: Transform each of the measured values and arrange them as shown in Table 4.

Table 4 (general form)

No.       x        u = (x - a) × b   $u^2$
1         $x_1$    $u_1$             $u_1^2$
2         $x_2$    $u_2$             $u_2^2$
.         .        .                 .
.         .        .                 .
n         $x_n$    $u_n$             $u_n^2$
Total     -        $\sum u$          $\sum u^2$

Table 4a (measured data)

No.       x (mm)   u = (x - 52) × 100   $u^2$
1         52.14    14                   196
2         52.03    3                    9
3         52.10    10                   100
4         52.25    25                   625
5         52.16    16                   256
Total     -        68                   1,186

STEP 2: Calculate the sum of the squares.

$$S = \frac{1}{b^2} \left[ \sum u^2 - \frac{(\sum u)^2}{n} \right] = \frac{1}{(100)^2} \left[ 1{,}186 - \frac{(68)^2}{5} \right] = 0.02612$$

STEP 3: Calculate the variance V.

$$V = \frac{S}{n - 1} = \frac{0.02612}{5 - 1} = 0.00653 \text{ (mm}^2\text{)}$$
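The transformed calculation can be sketched in Python as follows. The integer rounding of u is an assumption of this sketch (it absorbs floating-point noise), and the variable names are illustrative. Dividing by b squared at the end is what returns S to the original units.

```python
# Lengths of the five pin samples from Table 1a, in mm.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]
a, b = 52, 100
n = len(lengths_mm)

# STEP 1: transformed values u and their squares (Table 4a).
u_values = [round((x - a) * b) for x in lengths_mm]
sum_u = sum(u_values)
sum_u2 = sum(u * u for u in u_values)

# STEP 2: sum of squares, scaled back by 1 / b^2.
S = (sum_u2 - sum_u ** 2 / n) / b ** 2

# STEP 3: variance, dividing by n - 1.
V = S / (n - 1)

print(f"sum of u = {sum_u}, sum of u^2 = {sum_u2}")
print(f"S = {S:.5f}")
print(f"V = {V:.5f} mm^2")
```

Working with the small integers 14, 3, 10, 25 and 16 avoids the subtraction of two large, nearly equal totals that the untransformed shortcut requires.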

STANDARD DEVIATION:

The standard deviation indicates the “average deviation” from the mean, the consistency in the scores,
and how far the scores are spread out around the mean.
The sample standard deviation is the square root of the sample variance. It is denoted by s.
The population standard deviation is the true or actual standard deviation of the population of scores. It is
denoted by σ.

The standard deviation is an especially useful measure of variability when the distribution is normal or
approximately normal because the proportion of the distribution within a given number of standard
deviations from the mean can be calculated.

In a normal distribution, approximately 68% of the distribution lies within plus or minus one standard
deviation of the mean, approximately 95% lies within plus or minus two standard deviations of the mean, and
99.73% lies within plus or minus three standard deviations of the mean.
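These proportions can be reproduced from the error function available in Python's standard library. For a normal distribution, the fraction lying within plus or minus k standard deviations of the mean is erf(k / √2). The function name `proportion_within` is a hypothetical helper introduced for this sketch.

```python
import math


def proportion_within(k):
    """Fraction of a normal distribution within +/- k standard
    deviations of the mean, computed as erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/- {k} sd: {proportion_within(k) * 100:.2f}%")
```

The three printed percentages are 68.27%, 95.45% and 99.73%, which the text rounds to 68%, 95% and 99.73%.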

Calculating standard deviation

STEP 1: Determine the variance V.


Example: According to the previous calculation, V = 0.00653.

STEP 2: Calculate the standard deviation (s).

$$s = \sqrt{V} = \sqrt{0.00653} = 0.0808 \text{ (mm)}$$
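The square-root step is a one-line calculation in Python. The sketch below also cross-checks the result against `statistics.stdev`, the standard library's sample standard deviation, applied to the raw Table 1a data; that cross-check is an addition made here for illustration.

```python
import math
import statistics

V = 0.00653                  # sample variance from the previous step, mm^2
s = math.sqrt(V)             # STEP 2: standard deviation, mm

# Cross-check from the raw data of Table 1a.
lengths_mm = [52.14, 52.03, 52.10, 52.25, 52.16]
s_check = statistics.stdev(lengths_mm)

print(f"s from V        = {s:.4f} mm")
print(f"s from raw data = {s_check:.4f} mm")
```

Unlike the variance, the standard deviation is expressed in the same unit as the measurements (mm), which makes it easier to compare with the mean.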

Determination of mean and s.d for grouped data


STEP 1: Making a frequency table

Make a frequency table with f, u, uf, and $u^2 f$ columns on its right side.

Example : The lengths of eighty component samples were measured (unit : mm) and the frequency of
the various measurements was recorded on the frequency table (Table -5).

No.     Section boundary values (mm)   Central value (mm)   f     u     uf    $u^2 f$
1       29.05 - 29.25                  29.15                2     -4    -8    32
2       29.25 - 29.45                  29.35                4     -3    -12   36
3       29.45 - 29.65                  29.55                8     -2    -16   32
4       29.65 - 29.85                  29.75                14    -1    -14   14
5       29.85 - 30.05                  29.95                23    0     0     0
6       30.05 - 30.25                  30.15                10    1     10    10
7       30.25 - 30.45                  30.35                12    2     24    48
8       30.45 - 30.65                  30.55                6     3     18    54
9       30.65 - 30.85                  30.75                1     4     4     16
Total                                                       80    -     6     242

Table 5: Frequency Table for Calculation of $\bar{x}$ and s

STEP 2: Filling in the u column

u = 0 marks the chosen center of the distribution, which normally corresponds to the class with the highest
frequency f. Classes above this central class have ascending u values of 1, 2, 3, etc.; classes
below it have descending u values of -1, -2, -3, etc. In this case, set u to 0 at
class no. 5, which has the highest frequency and appears to be near the center of the distribution. On the
basis of u = 0 for class 5, assign u = 1, 2, 3, etc., to classes 6, 7, 8, etc., and u = -1, -2, -3, etc., to
classes 4, 3, 2, etc.

STEP 3: Calculating uf and the summation of uf

Calculate the product of u and f and write it in the uf column. Determine the total of uf, $\sum uf$. For
example:
No. 1: uf = 2 × (-4) = -8; No. 2: uf = 4 × (-3) = -12; and so on, to the result $\sum uf = 6$

STEP 4: Calculating $u^2 f$ and the summation of $u^2 f$

Calculate the product of u and uf and write it in the $u^2 f$ column. Determine the summation (total) of
$u^2 f$, $\sum u^2 f$. For example:

No. 1: $u^2 f$ = (-4) × (-8) = 32; No. 2: $u^2 f$ = (-3) × (-12) = 36; and so on, to the result $\sum u^2 f = 242$

STEP 5: Calculating the average value $\bar{x}$

$$\bar{x} = x_0 + \frac{\sum uf}{N} \times h$$

$x_0$: the central value of the class for which the value of u is 0
N: number of measured values ($\sum f$)
h: section width

Thus, with the central value of class no. 5, for which u = 0,

$x_0 = 29.95$, $N = \sum f = 80$, $h = 0.2$

Accordingly, the average value is

$$\bar{x} = 29.95 + \frac{6}{80} \times 0.2 = 29.95 + 0.015 = 29.965 \text{ (mm)}$$

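STEP 6: Calculating the standard deviation

$$s = h \times \sqrt{\frac{\sum u^2 f - \frac{(\sum uf)^2}{N}}{N - 1}}$$

$$s = 0.2 \times \sqrt{\frac{242 - \frac{6^2}{80}}{80 - 1}} = 0.2 \times \sqrt{3.0576} = 0.2 \times 1.749 \approx 0.350 \text{ (mm)}$$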

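The grouped-data procedure (Steps 2 to 6) can be condensed into a short Python sketch. It takes the central values and frequencies from Table 5, assigns u relative to class no. 5, and applies the two formulas. The variable names and the list layout are illustrative assumptions, not part of the course material.

```python
import math

# (central value in mm, frequency f) for classes 1 to 9 of Table 5.
classes = [
    (29.15, 2), (29.35, 4), (29.55, 8), (29.75, 14), (29.95, 23),
    (30.15, 10), (30.35, 12), (30.55, 6), (30.75, 1),
]
h = 0.2               # section (class) width in mm
zero_index = 4        # list position of class no. 5, where u = 0
x0 = classes[zero_index][0]

# STEP 2: u = ..., -2, -1, 0, 1, 2, ... counted from the u = 0 class.
u_values = [i - zero_index for i in range(len(classes))]

# STEPS 3 and 4: totals of f, uf and u^2 f.
N = sum(f for _, f in classes)
sum_uf = sum(u * f for u, (_, f) in zip(u_values, classes))
sum_u2f = sum(u * u * f for u, (_, f) in zip(u_values, classes))

# STEP 5: mean. STEP 6: standard deviation.
mean = x0 + sum_uf / N * h
s = h * math.sqrt((sum_u2f - sum_uf ** 2 / N) / (N - 1))

print(f"N = {N}, sum uf = {sum_uf}, sum u^2 f = {sum_u2f}")
print(f"mean = {mean:.3f} mm")
print(f"s    = {s:.3f} mm")
```

Choosing a different class for u = 0 changes the intermediate totals but not the final mean or standard deviation, so the choice only affects how convenient the hand calculation is.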
