
Episode 2

The document explains frequency distribution, which summarizes how often different values occur in a dataset, aiding in data analysis, population estimation, and statistical computations. It details types of frequency distributions, including discrete and continuous, and provides examples of constructing frequency tables and histograms. Additionally, it discusses methods for classifying data into intervals and visualizing distributions through histograms and frequency polygons.

Uploaded by

tjhfgkk256
Copyright
© © All Rights Reserved

Presentation of data: Tables, charts and graphs

Frequency Distribution

A frequency distribution is a summary of how often different values or ranges of values
occur within a dataset. It organizes data to show the frequency of occurrence of the
various values of a single phenomenon.

Purpose of Constructing a Frequency Distribution:

1. Data Analysis Simplification: It helps in organizing raw data into a structured
format, making it easier to analyze and interpret patterns or trends.

2. Population Estimation: It allows for the estimation of frequencies in an unknown
population based on the distribution observed in sample data.

3. Statistical Computations: It provides a foundation for calculating various
statistical measures, such as mean, median, mode, variance, and standard
deviation.

Types of Frequency Distribution

a. Discrete (or ungrouped) frequency distribution:

In a discrete frequency distribution, the data are organized to show the frequency of
individual, distinct values. This type of distribution is used when the data consist of
discrete variables, that is, values that are countable and finite, with no intermediate
values possible.
Examples of Discrete Data:
• Number of rooms in a house.
• Number of companies registered in Nigeria.
• Number of children in a family.
Why Use Discrete Frequency Distribution?
It is particularly useful when dealing with small datasets or when the exact values
of the data points are important for analysis.

Example 1: In a survey of 40 families in a village, the number of children per family was
recorded and the following data were obtained.

1 0 3 2 1 5 6 2

2 1 0 3 4 2 1 6

3 2 1 5 3 3 2 4

2 2 3 0 2 1 4 5

3 3 4 4 1 2 4 5

Represent the data in the form of a discrete frequency distribution.
Solution:

Frequency distribution of the number of children
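The counting step for Example 1 can be sketched in Python. This is a minimal illustration, not part of the original solution: the names `children` and `discrete_frequency` are made up for this sketch, and the data are copied from the 40 observations listed in Example 1.

```python
from collections import Counter

# Number of children per family, copied from Example 1 (40 families).
children = [
    1, 0, 3, 2, 1, 5, 6, 2,
    2, 1, 0, 3, 4, 2, 1, 6,
    3, 2, 1, 5, 3, 3, 2, 4,
    2, 2, 3, 0, 2, 1, 4, 5,
    3, 3, 4, 4, 1, 2, 4, 5,
]


def discrete_frequency(values):
    """Return a {value: frequency} mapping ordered by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


table = discrete_frequency(children)

# Print the distribution as a two-column table with a total row.
print("Children (x)  Frequency (f)")
for x, f in table.items():
    print(f"{x:>12}  {f:>13}")
print(f"{'Total':>12}  {sum(table.values()):>13}")
```

For the data above, the frequencies for 0 to 6 children come out as 3, 7, 10, 8, 6, 4 and 2, which add up to the 40 families surveyed.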

Example 2: In a survey of 30 students, the number of siblings each student has was
recorded, and the following data was obtained:

2 1 0 3 2 1 4 2

1 0 2 3 1 2 1 4

3 2 1 0 2 3 1 2

4 1 2 3 1 2

Solution
The distinct values in the data are: 0, 1, 2, 3, 4.
Count the frequency of each value
We count how many times each value appears in the dataset:

Number of Siblings (x)    Frequency (f)
0                         3
1                         9
2                         10
3                         5
4                         3

Add up all the frequencies to ensure they match the total number of
students surveyed (30):

3 + 9 + 10 + 5 + 3 = 30. Correct!

Final Discrete Frequency Distribution:

Number of Siblings (x)    Frequency (f)
0                         3
1                         9
2                         10
3                         5
4                         3
Total                     30

Interpretation:

• 3 students have 0 siblings.

• 9 students have 1 sibling.

• 10 students have 2 siblings.

• 5 students have 3 siblings.

• 3 students have 4 siblings.



b. Continuous (or grouped) frequency distribution:

Sometimes the data collected are so numerous that they cannot easily be managed; as a
result it becomes necessary to group the data into intervals. When data are organised
into intervals (class intervals), the organised data are called grouped data. The
advantage of a grouped frequency distribution is that it enables a very large array of
data to be reduced to a smaller, more manageable size.

Example 2: Consider the wage distribution of 100 employees.

Weekly Wages (N)    Number of Employees
50-100              4
100-150             12
150-200             22
200-250             33
250-300             16
300-350             8
350-400             5
Total               100

Nature of class

The following are some basic technical terms used when a continuous frequency distribution
is formed or when data are classified according to class intervals.

a. Class limits:

The class limits are the lowest and the highest values that can be included in the class.
For example, take the class 50-100. The lowest value of the class is 50 and the highest
value is 100. These two values are known as the lower limit and the upper limit of the
class. In statistical calculations, the lower class limit is denoted by L and the upper
class limit by U.

b. Class Interval:
The class interval may be defined as the size of each grouping of data.
For example, 50 -75, 75 -100, 100 -125…are class intervals. Each grouping begins with
the lower limit of a class interval and ends at the lower limit of the next succeeding class
interval.

c. Width or size of the class interval:

If a class interval is exclusive (continuous), its width or size is the difference
between the upper and lower class limits, U - L, and is denoted by 'C'.

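d. Range:
The difference between the largest and smallest values of the observations is called the
range and is denoted by 'R', i.e.
R = Largest value - Smallest value
R = L - S
Here L stands for the largest value and S for the smallest value (not the class limits
defined above).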

e. Mid-value or mid-point:
The central point of a class interval is called the mid-value or mid-point. It is found by
adding the upper and lower limits of a class and dividing the sum by 2, i.e.
Mid-value = (L + U)/2.
For example, if the class interval is 20-30 then the mid-value is (20 + 30)/2 = 25.
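As a quick check of the range and mid-value definitions, here is a small Python sketch. The function names `value_range` and `mid_value` are hypothetical and chosen only for this illustration. It also shows why the brackets in the mid-value formula matter.

```python
def value_range(observations):
    """Range R = largest value - smallest value."""
    return max(observations) - min(observations)


def mid_value(lower, upper):
    """Mid-value of a class: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Mid-value of the class 20-30.
print(mid_value(20, 30))

# Without brackets only the upper limit is halved, giving a different number.
print(20 + 30 / 2)
```

The first call gives 25.0, the mid-value quoted above; the unbracketed expression gives 35.0, which is not the centre of the class.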
f. Number of class intervals:

The number of class intervals in a frequency distribution is a matter of importance. It
should be neither too large nor too small. For an ideal frequency distribution, the
number of class intervals can vary from 5 to 15. To decide the number of class intervals
for the whole data, we take the lowest and the highest of the values. The difference
between them helps us to decide the class intervals. Thus the number of class intervals
can be fixed arbitrarily, keeping in view the nature of the problem under study, or it can
be decided with the help of Sturges' Rule.

According to Sturges, the number of classes can be determined by the formula

K = 1 + 3.322 log10(N)

where

N = total number of observations

log10 = logarithm to base 10

K = number of class intervals.

Example: if the number of observations is 10, then the number of class intervals is
K = 1 + 3.322 log10(10) = 4.322 ≈ 4
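Sturges' rule is easy to evaluate with a short Python sketch. The function name `sturges_classes` is an assumption made for this illustration; the function returns the unrounded value of K so that the rounding decision stays with the reader.

```python
import math


def sturges_classes(n):
    """Unrounded number of classes from Sturges' rule: K = 1 + 3.322 * log10(n)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    return 1 + 3.322 * math.log10(n)


k_for_10 = sturges_classes(10)    # the worked example above
k_for_100 = sturges_classes(100)  # a larger sample for comparison

print(round(k_for_10, 3), round(k_for_100, 3))
```

For N = 10 this gives 4.322, matching the example above, and for N = 100 it gives 7.644. Because K grows with the logarithm of N, multiplying the sample size by ten adds only about three classes.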

g. Size of the class interval:

The size of the class interval is inversely proportional to the number of class intervals
in a given distribution. The approximate value of the size (or width or magnitude) of the
class interval 'C' is therefore obtained by using Sturges' rule as

Size of class interval = C = Range / Number of class intervals

= Range / (1 + 3.322 log10(N))

where Range = Largest value - Smallest value in the distribution.

Types of Class Intervals

There are three methods of classifying the data according to class intervals, namely

a. Exclusive (Continuous) method

b. Inclusive (Discrete) method

c. Open-end classes

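a. Exclusive (Continuous) method

In this type of class interval, the upper limit of one class is the same as the lower
limit of the next class, so the stated limits overlap. A value equal to the upper limit of
a class is excluded from that class and counted in the next one. The following data are
classified on this basis.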

Expenditure (N)     Number of families
0 - 5000            60
5000 - 10000        95
10000 - 15000       122
15000 - 20000       83
20000 - 25000       40
Total               400

The first class covers all values from 0 to 4999.99; 5000 is not included in the first
class. The second class covers all values from 5000 to 9999.99; 10000 is not included in
it but is transferred to the third class, and so on.
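The exclusive rule amounts to treating each class as a half-open interval: lower limit included, upper limit excluded. The sketch below illustrates this for the expenditure classes above. The names `lower_limits`, `upper_end` and `exclusive_class` are made up for this example, and the use of `bisect_right` is one possible way to do the lookup, not a method taken from the handout.

```python
from bisect import bisect_right

# Lower limits of the expenditure classes above, and the upper limit of the last class.
lower_limits = [0, 5000, 10000, 15000, 20000]
upper_end = 25000


def exclusive_class(value):
    """Return the (lower, upper) class containing value, where lower <= value < upper."""
    if not (lower_limits[0] <= value < upper_end):
        raise ValueError("value lies outside all classes")
    # bisect_right counts how many lower limits are <= value;
    # the last of those is the lower limit of the class we want.
    i = bisect_right(lower_limits, value) - 1
    upper = lower_limits[i + 1] if i + 1 < len(lower_limits) else upper_end
    return (lower_limits[i], upper)


print(exclusive_class(4999.99))  # falls in the first class
print(exclusive_class(5000))     # boundary value goes to the second class
print(exclusive_class(10000))    # boundary value goes to the third class
```

A value exactly on a shared limit, such as 5000, therefore lands in the class that starts at that limit, as described above.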

b. Inclusive (Discrete) Method

In this method, the overlapping of the class intervals is avoided. Both the lower and upper
limits are included in the class interval. This type of classification may be used for a
grouped frequency distribution of a discrete variable, such as the number of members in a
family or the number of workers in a factory, where the variable may take only integral
values. It cannot be used for variables that take fractional values, such as age, height
or weight.

This method may be illustrated as follows:

Class Interval (C.I)    Frequency
5-9                     7
10-14                   12
15-19                   15
20-24                   21
25-29                   10
30-34                   5
Total                   70

Thus, to decide whether to use the inclusive method or the exclusive method, it is
important to determine whether the variable under observation is a continuous or a discrete
one. In the case of continuous variables, the exclusive method must be used. The inclusive
method should be used in the case of discrete variables.

c. Open-end classes:

In open-end classes, a class limit is missing either at the lower end of the first class
interval or at the upper end of the last class interval, or both are not specified. The
necessity of open-end classes arises in a number of practical situations, particularly
with economic and medical data, when there are a few very high values or a few very low
values which are far apart from the majority of observations. An example of open-end
classes is as follows:

Salary Range        Number of Workers
Below 2000          7
2000-4000           5
4000-6000           6
6000-8000           4
8000 and above      3
Total               25

Preparation of frequency table:

The presentation of data in the form of a frequency distribution describes the basic
pattern which the data assume in the mass. A frequency distribution gives a better picture
of the pattern of the data if the number of items is large. If the identity of the
individuals about whom particular information is taken is not relevant, then the first
step of condensation is to divide the observed range of the variable into a suitable
number of class intervals and to record the number of observations in each class.

Example 1: Given below are the numbers of tools produced by workers in a factory.

43 18 25 18 39 44 19 20 20 26

40 45 38 25 13 14 27 41 42 17

34 31 32 27 33 37 25 26 32 25

33 34 35 46 29 34 31 34 35 24

28 30 41 32 29 28 30 31 30 34

31 35 36 29 26 32 36 35 36 37

32 23 22 29 33 37 33 27 24 36

23 42 29 37 29 23 44 41 45 39

21 21 42 22 28 22 15 16 17 28

22 29 35 31 27 40 23 32 40 37

Using Sturges' rule, determine the number of class intervals and prepare a frequency
distribution table.

Solution

Number of class intervals

K = 1 + 3.322 log10(N)

K = 1 + 3.322 log10(100) = 7.644 ≈ 7.6

Size of class interval = C = Range / Number of class intervals

C = (46 - 13) / 7.6 = 4.34

The class size of 4.34 is rounded up to 5.

Sturges' rule suggests about 8 classes. However, taking the magnitude of the class
intervals as 5 and starting the first class at the smallest value, 13, seven classes are
enough to cover all the observations: 13-17, 18-22, …, 43-47. These are classes of the
inclusive type. Using tally marks, the required frequency distribution is obtained in the
following table.

Histogram

A frequency distribution can be represented in the form of graphs and charts. The
histogram is also called a block frequency diagram. It shows the pattern of the
distribution of the data, whether symmetrical or skewed. A histogram represents a
continuous distribution, and therefore if the class intervals are discrete (inclusive), we
need to adjust them to continuous ones before the histogram is drawn, by subtracting 0.5
from the lower class limits and adding 0.5 to the upper class limits. The histogram is
constructed by placing the class boundaries on the horizontal (X) axis and the frequency
on the vertical (Y) axis.
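The adjustment from class limits to class boundaries can be sketched as follows. The function name `class_boundaries` and its `gap` parameter are assumptions for this illustration; `gap` is the distance between the upper limit of one class and the lower limit of the next, which is 1 for whole-number data and leads to the 0.5 correction described above.

```python
def class_boundaries(lower_limit, upper_limit, gap=1):
    """Convert inclusive class limits to continuous class boundaries.

    Half of the gap between adjacent classes is subtracted from the lower
    limit and added to the upper limit.
    """
    half = gap / 2
    return (lower_limit - half, upper_limit + half)


# Illustrative inclusive classes of whole-number scores.
limits = [(126, 130), (131, 135), (136, 140)]
boundaries = [class_boundaries(lo, hi) for lo, hi in limits]

for (lo, hi), (b_lo, b_hi) in zip(limits, boundaries):
    print(f"{lo}-{hi}  ->  {b_lo} - {b_hi}")
```

After the adjustment, the upper boundary of each class coincides with the lower boundary of the next, so the bars of the histogram touch with no gaps between them.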

Example 2: The scores of thirty students in Statistics examination were given as follows

126 145 137 145 140 146

131 143 127 133 134 144

136 135 128 130 137 142

141 139 147 149 150 148

146 150 148 151 153 155

Use the above information to obtain the histogram of the distribution

Class Interval    Class Boundary     Upper Class Boundary    Frequency
0-125             0 - 125.5          125.5                   0
126-130           125.5 - 130.5      130.5                   4
131-135           130.5 - 135.5      135.5                   4
136-140           135.5 - 140.5      140.5                   5
141-145           140.5 - 145.5      145.5                   6
146-150           145.5 - 150.5      150.5                   8
151-155           150.5 - 155.5      155.5                   3
156-160           155.5 - 160.5      160.5                   0
Total                                                        30

[Figure: Histogram of the scores of thirty students in Statistics examination. Horizontal
axis: upper class boundaries (125.5 to 160.5). Vertical axis: frequency (0 to 9).]

Frequency Polygon

A frequency polygon is obtained by plotting the midpoint of each class against the
corresponding frequency of that class. It can also be obtained by joining the midpoints of
the tops of the rectangles of the histogram and extending the lines to meet the X-axis. A
polygon drawn in this way will have the same area as the corresponding histogram if the
class intervals are all of the same width.
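The points of a frequency polygon can be computed with a short Python sketch. The function name `polygon_points` is hypothetical, and the classes and frequencies are those of the thirty examination scores in Example 2. One assumption should be noted: the sketch closes the polygon by adding an empty class of the same width on each side, which is a common convention and only one way of extending the lines to meet the X-axis.

```python
# Inclusive classes and frequencies for the thirty examination scores.
classes = [(126, 130), (131, 135), (136, 140), (141, 145), (146, 150), (151, 155)]
frequencies = [4, 4, 5, 6, 8, 3]


def polygon_points(classes, frequencies):
    """Return (mid-value, frequency) points, closed on the X-axis at both ends."""
    # Width of an inclusive whole-number class, e.g. 126-130 holds 5 values.
    width = classes[0][1] - classes[0][0] + 1
    mids = [(lo + hi) / 2 for lo, hi in classes]
    points = list(zip(mids, frequencies))
    # Assumed convention: empty neighbouring classes of equal width close the polygon.
    return [(mids[0] - width, 0)] + points + [(mids[-1] + width, 0)]


points = polygon_points(classes, frequencies)
for mid, freq in points:
    print(mid, freq)
```

The mid-values of the six occupied classes are 128, 133, 138, 143, 148 and 153. Joining the points in order with straight lines gives the polygon.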

Using the data in example 2 plot the frequency polygon of the distribution

Class Interval    Class Boundary     Upper Class Boundary    Frequency    Mid-value
0-125             0 - 125.5          125.5                   0            62.75
126-130           125.5 - 130.5      130.5                   4            128.00
131-135           130.5 - 135.5      135.5                   4            133.00
136-140           135.5 - 140.5      140.5                   5            138.00
141-145           140.5 - 145.5      145.5                   6            143.00
146-150           145.5 - 150.5      150.5                   8            148.00
151-155           150.5 - 155.5      155.5                   3            153.00
156-160           155.5 - 160.5      160.5                   0            158.00
Total                                                        30

[Figure: Histogram with the frequency polygon superimposed. Horizontal axis: upper class
boundaries (125.5 to 160.5). Vertical axis: frequency.]

The frequency polygon can be singled out as:

[Figure: Frequency polygon of the scores of thirty students in Statistics examination.
Horizontal axis: mid-value (62.75, 128, 133, 138, 143, 148, 153, 158). Vertical axis:
frequency (0 to 10).]
