Statistic
Contents
1. Introduction
2. Mean
3. Median
4. Mode
5. Cumulative frequency Curve (Ogive)
1. Introduction
In modern times, Statistics has a much wider meaning. It is considered a science that deals
with the collection, representation, analysis, and interpretation of data collected from the
surroundings.
Primary data: Data collected by the investigator himself are called primary data, e.g., notes, lists.
Secondary data: Data that the investigator does not collect himself but obtains from other
sources are called secondary data, e.g., published reports and official statistics collected by
the Government on various facts.
Marks        10 − 20   20 − 30   30 − 40   40 − 50   50 − 60   60 − 70
Frequency    5         4         6         7         3         5
Frequency: The number of observations in each class is called the frequency of that class. In the
table above, the frequency of the class 40 − 50 is 7.
Class-Intervals and Class Limits: In the frequency table, 20 − 30 is called a "class-interval" and
the end numbers, 20 and 30, are called "class limits"; the smaller number 20 is the lower-class
limit and the larger number 30 is the upper-class limit.
The size or width of a Class Interval: The size or width of a class-interval is the difference
between the lower- and upper-class boundaries.
e.g., size = 40 − 30 = 10
Class Mark: The class-mark is the mid-point of the class-interval, e.g., the class mark for the
interval 30 − 40 is $\frac{30+40}{2} = \frac{70}{2} = 35$.
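As a quick illustration of the two definitions above, the following Python sketch computes the width and the class mark of an interval. The function names `class_width` and `class_mark` are hypothetical helpers introduced here for illustration only.

```python
def class_width(lower, upper):
    """Size of a class interval: upper boundary minus lower boundary."""
    return upper - lower


def class_mark(lower, upper):
    """Mid-point of a class interval."""
    return (lower + upper) / 2


print(class_width(30, 40))  # 10
print(class_mark(30, 40))   # 35.0
```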
Grouped data: The data (or information) given in the form of class intervals such as 0-20, 20-40 and so
on.
Ungrouped data: The data given as individual points (i.e., values or numbers) such as 15,63,34,20,25,
and so on.
We generally observe that data of a variable tends to cluster around some central value. This
clustering of data around the central value is called central tendency. The measures of the central
tendency are:
1. Mean.
2. Median.
3. Mode.
2. Mean
Mean is the average of the given numbers and is calculated by dividing the sum of given numbers by the
total number of numbers.
Mean $= \frac{\sum_{i=1}^{n} x_i}{n}$
Question: Find the mean of 6, 10, 0, 7, 9.
Solution: We know that the mean of five variates $x_1, x_2, x_3, x_4, x_5$ is given by
$A = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5}$
$= \frac{6 + 10 + 0 + 7 + 9}{5}$
$= \frac{32}{5}$
$= 6.4$
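The same calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` is an illustrative choice and not something defined in these notes.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


marks = [6, 10, 0, 7, 9]
print(mean(marks))  # 6.4
```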
$N = \sum_{i=1}^{n} f_i = f_1 + f_2 + \cdots + f_n$
Solution: We prepare the table as below:
Hence, mean $(\bar{x}) = \frac{\sum f_i x_i}{\sum f_i} = \frac{625}{25} = 25$.
Question: The mean of the following distribution is 18. Find the frequency $f$ of the class 19–21:
19–21    $f$                      20    $20f$
21–23    5                        22    110
23–25    4                        24    96
Total    $\sum f_i = 40 + f$            $\sum f_i x_i = 704 + 20f$
$\therefore$ The mean $= \frac{\sum f_i x_i}{\sum f_i}$
$\Rightarrow 18 = \frac{704 + 20f}{40 + f}$
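The last equation is linear in $f$, so it can be solved directly. The sketch below rearranges $(S + m f)/(n + f) = M$ into $f = (S - M n)/(M - m)$, where $S = 704$, $n = 40$, $m = 20$ and $M = 18$ are read from the totals above. The function name `missing_frequency` and its parameter names are hypothetical, chosen here for illustration.

```python
def missing_frequency(sum_fx_known, sum_f_known, class_mark, target_mean):
    """Solve (sum_fx_known + class_mark*f) / (sum_f_known + f) = target_mean for f."""
    denominator = target_mean - class_mark
    if denominator == 0:
        raise ValueError("the class mark equals the mean, so f cannot be determined")
    return (sum_fx_known - target_mean * sum_f_known) / denominator


f = missing_frequency(sum_fx_known=704, sum_f_known=40, class_mark=20, target_mean=18)
print(f)  # 8.0
```

Substituting back gives $(704 + 160)/(40 + 8) = 864/48 = 18$, which matches the stated mean.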
Question: Find the missing frequencies $f_1$ and $f_2$ in the table given below, given that
the mean of the frequency distribution is 50.
80–100 19 90 1710
Total    $\sum f_i = 68 + f_1 + f_2$    $\sum f_i x_i = 3480 + 30f_1 + 70f_2$
By the question,
$68 + f_1 + f_2 = 120$
$\Rightarrow f_1 + f_2 = 52$ ……(1)
and $\frac{\sum f_i x_i}{\sum f_i} = 50$
$\Rightarrow \frac{3480 + 30f_1 + 70f_2}{120} = 50$
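The two conditions above form a pair of linear equations: $f_1 + f_2 = 52$ and $30f_1 + 70f_2 = 50 \times 120 - 3480 = 2520$. A minimal sketch of solving them by Cramer's rule follows; the helper `solve_2x2` is a hypothetical name introduced for this illustration.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations do not have a unique solution")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Equation (1): f1 + f2 = 52
# Mean condition: 30*f1 + 70*f2 = 50*120 - 3480
f1, f2 = solve_2x2(1, 1, 52, 30, 70, 50 * 120 - 3480)
print(f1, f2)  # 28.0 24.0
```

As a check, $28 + 24 = 52$ and $(3480 + 30 \times 28 + 70 \times 24)/120 = 6000/120 = 50$.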
Question: Find the mean of the following frequency distribution by Short-cut Method.
(Assumed mean method)
$= 35 + \frac{160}{60} = 35 + 2.67 = 37.67$
Question: Calculate arithmetic mean from the following data by Short-cut Method:
Solution:
Here, the assumed mean is decided from the mid-points. Let us take 37.5 as the assumed
mean. So, we calculate the deviations of the mid-values from 37.5
$\bar{X} = A + \frac{\sum fd}{\sum f}$
(iii) Step Deviation Method
Since this method is an extension of the assumed mean method, the formula is:
Mean (step-deviation method) $= A + h\left(\frac{\sum u_i f_i}{\sum f_i}\right)$,
where
• A is the assumed mean
• h is the class size
• 𝑢𝑖 = 𝑑𝑖 /ℎ
• 𝑓𝑖 is the frequency
• 𝑑𝑖 = 𝑥𝑖 − 𝐴
• 𝑥𝑖 is the midpoint of the class interval
Question: Find the mean of the following using the step-deviation method.
C.I.     $x_i$     $f_i$              $u_i = \frac{x_i - 35}{10}$    $f_i u_i$
0–10     5         4                  –3                             –12
10–20    15        4                  –2                             –8
20–30    25        7                  –1                             –7
30–40    35 = A    10                 0                              0
40–50    45        12                 1                              12
50–60    55        8                  2                              16
60–70    65        5                  3                              15
Total              $\sum f_i = 50$                                   $\sum f_i u_i = 16$
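Applying the step-deviation formula to the totals in this table gives $35 + 10 \times \frac{16}{50} = 38.2$. The sketch below recomputes this from the class marks and frequencies; the function name `step_deviation_mean` is an illustrative assumption, not a name used in these notes.

```python
def step_deviation_mean(class_marks, freqs, assumed_mean, class_size):
    """Mean = A + h * (sum of f_i*u_i / sum of f_i), with u_i = (x_i - A) / h."""
    total_f = sum(freqs)
    total_fu = sum(
        f * (x - assumed_mean) / class_size for x, f in zip(class_marks, freqs)
    )
    return assumed_mean + class_size * total_fu / total_f


class_marks = [5, 15, 25, 35, 45, 55, 65]
freqs = [4, 4, 7, 10, 12, 8, 5]
print(round(step_deviation_mean(class_marks, freqs, assumed_mean=35, class_size=10), 2))  # 38.2
```

The direct method gives the same value, since $\sum f_i x_i = 1910$ and $1910 / 50 = 38.2$.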
Question: Find the mean of the following frequency distribution using step-deviation method
Solution: Let us choose $a = 25$, $h = 10$; then $d_i = x_i - 25$ and $u_i = \frac{x_i - 25}{10}$.
= 25.8
Thus, the mean is 25.8
Question: Find the mean marks of students from the following cumulative frequency
distribution:
Solution: Here we have a cumulative frequency distribution, so we first convert it into an ordinary
frequency distribution. We observe that there are 80 students getting marks greater than or equal to 0,
and 77 students have secured 10 or more marks. Therefore, the number of students getting marks between 0
and 10 is 80 − 77 = 3.
Similarly, the number of students getting marks between 10 and 20 is 77 − 72 = 5, and so on. Thus, we
obtain the following frequency distribution.
70 and above 16
80 and above 10
90 and above 8
100 and above 0
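The conversion described in the solution (subtracting each "and above" count from the one before it) can be sketched as follows. The function name `frequencies_from_more_than` is hypothetical, and the example uses only the cumulative counts that appear in the text: 80, 77, 72 for the first classes, and 16, 10, 8, 0 for the last rows shown.

```python
def frequencies_from_more_than(cumulative):
    """Class frequencies from 'more than or equal to' cumulative counts.

    cumulative[i] is the number of observations at or above the i-th lower
    limit, so each class frequency is the drop to the next count.
    """
    return [current - following for current, following in zip(cumulative, cumulative[1:])]


# Counts for "0 and above", "10 and above", "20 and above"
print(frequencies_from_more_than([80, 77, 72]))     # [3, 5]
# Counts for "70 and above" through "100 and above"
print(frequencies_from_more_than([16, 10, 8, 0]))   # [6, 2, 8]
```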
Computation of Mean
3. Median
Median is the value of the middle term of a series arranged in ascending or descending order of
magnitude.
(i) Median of Ungrouped Data
(a) If $n$ is odd, the median $= \left(\frac{n+1}{2}\right)^{\text{th}}$ term
Question: The heights (in cm ) of 11 players of a team are as follows. Find Median.
160, 158, 158, 159, 160, 160, 162, 165, 166, 167, 170
Solution: Arranging the variates in the ascending order, we get
158,158,159,160,160,160,162,165,166,167,170.
The number of variates = 11, which is odd.
Therefore, median $= \left(\frac{11+1}{2}\right)^{\text{th}}$ variate = 6th variate = 160.
= 7.
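The heights example can be reproduced with a short Python sketch. The function name `median` is an illustrative choice; for an even number of values it averages the two middle terms, which is the usual convention for ungrouped data.

```python
def median(values):
    """Median of ungrouped data: the middle term of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                          # the ((n+1)/2)th term
    return (ordered[middle - 1] + ordered[middle]) / 2  # average of the two middle terms


heights = [160, 158, 158, 159, 160, 160, 162, 165, 166, 167, 170]
print(median(heights))  # 160
```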
(ii) Median of Grouped Data
Type 1
Step (i): Arrange the data in ascending order along with their frequencies.
Step (ii): Add a column of cumulative frequency.
Step (iii): When the total frequency $N$ is odd, the median is the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation.
Step (iv): When the total frequency $N$ is even, the median is the average of the $\left(\frac{N}{2}\right)^{\text{th}}$ and
$\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.
𝐱𝐢 𝐟𝐢
15 3
21 5
27 6
30 7
35 8
Solution: The given data, arranged in ascending order with a cumulative frequency column, are
as below:
𝒙𝒊 𝒇𝒊 𝒄𝒇𝒊
15 3 3
21 5 3+5=8
27 6 8 + 6 = 14
30 7 14 + 7 = 21
35 8 21 + 8 = 29
Total N = 29
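Here, $N = 29$, which is odd.
$\therefore$ Median $= \left(\frac{N+1}{2}\right)^{\text{th}}$ observation
$= \left(\frac{29+1}{2}\right)^{\text{th}}$, i.e., the 15th observation.
From the cumulative frequency column, the 15th observation is 30, so the median is 30.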
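Steps (i) to (iv) of Type 1 can be sketched in Python using the table from this example. The function name `median_from_frequency_table` is hypothetical; it finds the k-th observation by walking the cumulative frequency column.

```python
from itertools import accumulate


def median_from_frequency_table(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(values, freqs))                 # Step (i): ascending order
    xs = [x for x, _ in pairs]
    cumulative = list(accumulate(f for _, f in pairs))  # Step (ii): cumulative frequency
    n = cumulative[-1]

    def kth_observation(k):
        # The k-th observation is the first value whose cumulative frequency reaches k.
        for x, cf in zip(xs, cumulative):
            if cf >= k:
                return x

    if n % 2 == 1:                                     # Step (iii): N odd
        return kth_observation((n + 1) // 2)
    return (kth_observation(n // 2) + kth_observation(n // 2 + 1)) / 2  # Step (iv): N even


print(median_from_frequency_table([15, 21, 27, 30, 35], [3, 5, 6, 7, 8]))  # 30
```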
Type 2
Step (i): Find the cumulative frequencies of all classes and $\frac{N}{2}$.
Step (ii): The median class is the first class whose cumulative frequency is greater than or
equal to $\frac{N}{2}$, i.e., Median class = the class that contains the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.
Question: Find the median daily wages from the following frequency distribution:
Solution: The cumulative frequency table is as below:
We have $N = 44$, so that $\frac{N}{2} = 22$.
The cumulative frequency just greater than 22 is 34, which corresponds to the class 250–300.
Thus, the median class is 250–300.
Here, $l = 250$, $f = 20$, $h = 50$, $N = 44$ and $C = 14$.
$\therefore$ Median $= l + \frac{\frac{N}{2} - C}{f} \times h$
$= 250 + \frac{22 - 14}{20} \times 50 = 250 + 20 = 270$.
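The median formula used above translates directly into code. This is a minimal sketch with the values from this solution; the function name `grouped_median` and its parameter names are assumptions made for illustration.

```python
def grouped_median(lower_limit, class_size, class_freq, cum_freq_before, total):
    """Median = l + ((N/2 - C) / f) * h for a continuous grouped distribution.

    lower_limit     -- l, lower limit of the median class
    class_size      -- h, width of the median class
    class_freq      -- f, frequency of the median class
    cum_freq_before -- C, cumulative frequency of the class before the median class
    total           -- N, total frequency
    """
    if class_freq == 0:
        raise ValueError("the median class must have a non-zero frequency")
    return lower_limit + (total / 2 - cum_freq_before) * class_size / class_freq


print(grouped_median(lower_limit=250, class_size=50, class_freq=20,
                     cum_freq_before=14, total=44))  # 270.0
```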
Solution: To find the median let us put the data in the table given below:
10–20 16 24
20–30 36 60
30–40 34 94
40–50 6 100
Total N = ∑ 𝑓𝑖 = 100
Now,
$N = 100 \Rightarrow \frac{N}{2} = 50$.
The cumulative frequency just greater than 50 is 60, and the corresponding class is 20–30.
Thus, the median class is 20–30.
$\therefore\ l = 20$, $h = 10$, $N = 100$, $f = 36$ and c.f. $= 24$.
Now,
Median $= l + \frac{\frac{N}{2} - \text{c.f.}}{f} \times h$
$= 20 + \frac{50 - 24}{36} \times 10$
$= 20 + \frac{260}{36}$
$= 27.22$
Thus, the median is 27.22.
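Locating the median class can also be automated. The sketch below walks the classes, accumulates frequencies, and applies the median formula to the first class whose cumulative frequency reaches $N/2$. The function name `median_from_classes` is hypothetical. The first row of the table is not shown in the text; a class 0–10 with frequency 8 is assumed here because the cumulative frequency of 10–20 is 24 with a class frequency of 16.

```python
def median_from_classes(classes):
    """Median of continuous classes given as (lower, upper, frequency) in ascending order."""
    total = sum(freq for _, _, freq in classes)
    if total == 0:
        raise ValueError("the distribution has no observations")
    half = total / 2
    cum_before = 0
    for lower, upper, freq in classes:
        if cum_before + freq >= half:            # this is the median class
            return lower + (half - cum_before) * (upper - lower) / freq
        cum_before += freq


classes = [
    (0, 10, 8),     # assumed first row, inferred from the cumulative frequencies
    (10, 20, 16),
    (20, 30, 36),
    (30, 40, 34),
    (40, 50, 6),
]
print(round(median_from_classes(classes), 2))  # 27.22
```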
Class-interval    Frequency
0–100             2
100–200           5
200–300           $x$
300–400           12
400–500           17
500–600           20
600–700           $y$
700–800           9
800–900           7
900–1000          4
Total             100
⇒ 25 = (14 − 𝑥) × 5
⇒ 5 = 14 − 𝑥
⇒ 𝑥 = 9.
Now, N = 100
⇒ 76 + 𝑥 + 𝑦 = 100
⇒ 𝑥 + 𝑦 = 24
⇒ 9 + 𝑦 = 24
⇒ 𝑦 = 15.
Hence, x = 9 and y = 15.
Question: The lengths of 40 leaves of a plant are measured correct to the nearest millimetre, and
the data obtained is represented in the following table:
Solution: The data need to be converted to continuous classes for finding the median, since the
formula assumes continuous classes. The classes then become
117.5–126.5, 126.5–135.5, ..., 171.5–180.5.
153.5–162.5 5 34
162.5–171.5 4 38
171.5–180.5 2 40
n = 40
$\because n = 40$, $\therefore \frac{n}{2} = \frac{40}{2} = 20$.
Median $= 144.5 + \left(\frac{20 - 17}{12}\right) \times 9$
$= 144.5 + \frac{9}{4}$
$= 146.75$
Question: A life insurance agent found the following data for the distribution of ages
of 100 policy holders. Calculate the median age, if policies are given only to persons aged
18 years onwards but less than 60 years.
Solution:
4. Mode
(i) Mode of Ungrouped Data
Mode: It is the value of the variate that occurs most often.
Number of wickets 0 1 2 3 4 5 6
Number of matches 1 1 3 2 1 1 1
Clearly, 2 is the number of wickets taken by the bowler in the maximum
number of matches, i.e., 3.
Hence, the mode of the data is 2.
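The same reading of the table can be done in code: pick the value with the highest frequency. This is a minimal sketch; the function name `modes` is hypothetical, and it returns a list because a data set may have more than one mode.

```python
def modes(frequency_table):
    """Return every value that shares the highest frequency."""
    highest = max(frequency_table.values())
    return [value for value, freq in frequency_table.items() if freq == highest]


# Number of wickets -> number of matches
wickets = {0: 1, 1: 1, 2: 3, 3: 2, 4: 1, 5: 1, 6: 1}
print(modes(wickets))  # [2]
```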
Class Frequency
25–30 12
30–35 6
35–40 14
40–45 8
45–50 9
Solution: Since, the maximum frequency = 14 and the class corresponding to this frequency is
35–40.
So, the modal class is 35–40.
Here, $l = 35$, $f_1 = 14$, $f_0 = 6$, $f_2 = 8$ and $h = 5$.
$\therefore$ Mode $= l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h = 35 + \frac{14 - 6}{28 - 6 - 8} \times 5$
$= 35 + \frac{40}{14} = 35 + 2.86 = 37.86$.
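The grouped-mode formula can be sketched in Python using the frequency table of this example. The function name `grouped_mode` is an illustrative assumption. Treating a missing neighbour as frequency 0 when the modal class is the first or last class is also an assumption of this sketch, not something stated in these notes.

```python
def grouped_mode(classes):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for (lower, upper, frequency) classes."""
    index = max(range(len(classes)), key=lambda k: classes[k][2])  # modal class
    lower, upper, f1 = classes[index]
    f0 = classes[index - 1][2] if index > 0 else 0                  # class before (assumed 0 at the edge)
    f2 = classes[index + 1][2] if index < len(classes) - 1 else 0   # class after (assumed 0 at the edge)
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("the formula is undefined when 2*f1 = f0 + f2")
    return lower + (f1 - f0) * (upper - lower) / denominator


classes = [(25, 30, 12), (30, 35, 6), (35, 40, 14), (40, 45, 8), (45, 50, 9)]
print(round(grouped_mode(classes), 2))  # 37.86
```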
Question: The following table shows the marks obtained by 100 students of class X in a school
during a particular academic session. Find the mode of this distribution.
CI       fi                 c.fi
0–10     7                  7
10–20    21 – 7 = 14        21
20–30    34 – 21 = 13       34
30–40    46 – 34 = 12       46
40–50    66 – 46 = 20       66
50–60    77 – 66 = 11       77
60–70    92 – 77 = 15       92
70–80    100 – 92 = 8       100
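The class 40–50 has the maximum frequency, i.e., 20; therefore, the modal class is 40–50.
Now, $l = 40$, $h = 10$, $f_1 = 20$, $f_2 = 11$ and $f_0 = 12$.
$\therefore$ Mode $= l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$
$\Rightarrow$ Mode $= 40 + \frac{20 - 12}{2(20) - 12 - 11} \times 10$
$\Rightarrow$ Mode $= 40 + \frac{8}{40 - 12 - 11} \times 10$
$\Rightarrow$ Mode $= 40 + \frac{8}{17} \times 10$
$\Rightarrow$ Mode $= 40 + \frac{80}{17}$
$\Rightarrow$ Mode $= 40 + 4.71$
$\Rightarrow$ Mode $= 44.71$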
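The whole solution, from the cumulative column to the mode, can be checked with a short script. This is a sketch under the assumption that the classes are 0–10 through 70–80 with class size 10, as in the table; the helper name `frequencies_from_less_than` is hypothetical.

```python
def frequencies_from_less_than(cumulative):
    """Recover class frequencies from 'less than' cumulative frequencies."""
    previous = [0] + list(cumulative[:-1])
    return [cf - prev for cf, prev in zip(cumulative, previous)]


lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
class_size = 10
cumulative = [7, 21, 34, 46, 66, 77, 92, 100]

freqs = frequencies_from_less_than(cumulative)
print(freqs)  # [7, 14, 13, 12, 20, 11, 15, 8]

i = freqs.index(max(freqs))                        # position of the modal class
f1, f0, f2 = freqs[i], freqs[i - 1], freqs[i + 1]  # valid here: the modal class is an interior class
mode = lower_limits[i] + (f1 - f0) * class_size / (2 * f1 - f0 - f2)
print(lower_limits[i], round(mode, 2))  # 40 44.71
```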
Question: If the mode of the following distribution is 57.5, find the value of ‘x’.
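$\Rightarrow 57.5 = 50 + \frac{16 - 10}{32 - 10 - x} \times 10$
$\Rightarrow 57.5 = 50 + \frac{60}{22 - x}$
$\Rightarrow 57.5 - 50 = \frac{60}{22 - x}$
$\Rightarrow 7.5 = \frac{60}{22 - x}$
$\Rightarrow 165 - 7.5x = 60$
$\Rightarrow 7.5x = 165 - 60$
$\Rightarrow 7.5x = 105$
Hence, $x = \frac{105}{7.5} = 14$.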
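The result can be verified by rearranging the mode formula for the unknown frequency $f_2 = x$, which gives $x = 2f_1 - f_0 - \frac{(f_1 - f_0)h}{\text{Mode} - l}$. The sketch below uses the values that appear in the working ($l = 50$, $h = 10$, $f_1 = 16$, $f_0 = 10$); the function names are hypothetical.

```python
def mode_value(l, h, f1, f0, f2):
    """Grouped-data mode: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    return l + (f1 - f0) * h / (2 * f1 - f0 - f2)


def succeeding_frequency(l, h, f1, f0, mode):
    """Solve the mode formula for f2 (the frequency after the modal class)."""
    if mode == l:
        raise ValueError("mode equal to l gives no information about f2")
    return 2 * f1 - f0 - (f1 - f0) * h / (mode - l)


x = succeeding_frequency(l=50, h=10, f1=16, f0=10, mode=57.5)
print(x)                              # 14.0
print(mode_value(50, 10, 16, 10, x))  # 57.5
```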
The empirical relationship between the three measures of central tendency in an asymmetrical
distribution is:
Mode = 3 Median − 2 Mean
5. Cumulative frequency Curve (Ogive)
Less than Type
Step (i): Mark the upper-class limits on the 𝑥-axis on a suitable scale.
Step (ii): Mark the corresponding cumulative frequencies on the 𝑦-axis on a suitable scale.
Scale may be different on both axes.
Step (iii): Plot the points $(x_i, f_i')$, where $x_i$ is the upper limit of a class and $f_i'$ is the
corresponding cumulative frequency.
Step (iv): (a) Joining these points successively by line segments, we get cumulative frequency
polygon.
(b) Joining these points successively by a free hand smooth curve, we get cumulative frequency
curve or an ogive.
Question: Draw the cumulative frequency polygon and cumulative frequency curve (ogive) for
the following frequency distribution by less than type method:
Solution: The cumulative frequency distribution table by less than type is as below:
More than Type
For constructing cumulative frequency polygon or cumulative frequency curve (ogive), we have
the following guidance:
Step (i): Mark the lower-class limits on the 𝑥-axis on a suitable scale.
Step (ii): Mark the corresponding cumulative frequencies on the 𝑦-axis on a suitable scale.
Scale may be different on both axes.
Step (iii): Plot the points $(x_i, f_i')$, where $x_i$ is the lower limit of a class and $f_i'$ is the
corresponding cumulative frequency.
Step (iv): (a) Joining these points successively by line segments, we get cumulative frequency
polygon.
(b) Joining these points successively by a free hand smooth curve, we get cumulative frequency
curve or an ogive.
Solution: We write the given distribution using actual limits as under:
Question: During the medical check-up of 35 students of a class their weights are recorded as
follows:
Weight (in kg)        38–40   40–42   42–44   44–46   46–48   48–50   50–52
Number of students    3       2       4       5       14      4       3
Draw a less than type and more than type ogive from given data. Hence, obtain the median
weight from the graph.
Solution: Here, less than type distribution table is as below:
Joining these points by a free hand curve, more than ogive is shown.
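The points to plot for both ogives can be computed from the weight table in the question. The sketch below builds the less than and more than points and then estimates the median by linear interpolation along the less than polygon at height $N/2$. This interpolation is a stand-in for reading the value off a hand-drawn graph, so it only approximates what the free hand curve would give. The function names are hypothetical.

```python
from itertools import accumulate


def ogive_points(lowers, uppers, freqs):
    """Return (less_than_points, more_than_points) as lists of (limit, cumulative frequency)."""
    total = sum(freqs)
    less_cum = list(accumulate(freqs))
    less_than = list(zip(uppers, less_cum))          # upper limit vs. count below it
    counts_below = [0] + less_cum[:-1]
    more_than = [(lo, total - below) for lo, below in zip(lowers, counts_below)]
    return less_than, more_than


def median_from_less_than(points, start, total):
    """Interpolate linearly on the less than polygon for cumulative frequency total/2."""
    half = total / 2
    prev_x, prev_cf = start, 0
    for x, cf in points:
        if cf >= half:
            return prev_x + (half - prev_cf) * (x - prev_x) / (cf - prev_cf)
        prev_x, prev_cf = x, cf


lowers = [38, 40, 42, 44, 46, 48, 50]
uppers = [40, 42, 44, 46, 48, 50, 52]
freqs = [3, 2, 4, 5, 14, 4, 3]

less_than, more_than = ogive_points(lowers, uppers, freqs)
print(less_than)  # [(40, 3), (42, 5), (44, 9), (46, 14), (48, 28), (50, 32), (52, 35)]
print(more_than)  # [(38, 35), (40, 32), (42, 30), (44, 26), (46, 21), (48, 7), (50, 3)]
print(median_from_less_than(less_than, start=38, total=35))  # 46.5
```

The value 46.5 agrees with the median formula: $46 + \frac{17.5 - 14}{14} \times 2 = 46.5$.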