Chapter 3 - Data Collection and Presentation
3.1. Data Collection
Data collection – is the process of gathering and
measuring data on variables of interest, in an
established systematic fashion that enables one to
answer a stated research question.
Sources of data:
1) Primary data sources – data which are
originally collected by the researcher for the first
time
2) Secondary data sources – data already collected
by others, taken from published & unpublished sources
3.2. Methods of Collecting Data
1) Direct observation – involves counting the data of
interest in person
2) Personal interview – involves contacting the desired
people (respondents) in person and asking for their
opinions concerning the area of interest
3) Telephone interview – contacting the desired people
through telephone lines
4) Written questionnaire – written questions are mailed
to individuals to fill in and send back their answers
Strengths & Weaknesses of Data Collection Methods
Method                Strength                      Weakness
Direct observation    Avoids biases                 Not always possible to directly observe or count the data
Personal interview    High rate of response         Time consuming; cost of training interviewers
Telephone interview   Reduces the cost of           Respondents may not have telephone lines, or may not be
                      individual contact            available to take telephone calls
3.3. Classification of Data
Data classification – is a systematic grouping
of units according to their common
characteristics.
Objectives of data classification
• To simplify and make data more concise,
meaningful or comprehensible
• To bring out points of similarity &
dissimilarity
• To compare characteristics
• To prepare data for tabulation
Two Methods of Data Presentation
(a). Tabular Presentation
• Orderly and logical arrangement of data in
a table
(b). Diagrammatic or Graphic Presentation
a) Histogram
b) Frequency polygon
c) Bar chart
d) Pie chart
3.4. Frequency Distribution
(1). Tabular Presentation of Data
Frequency distribution – is a table in which the values of a
variable are grouped into classes, together with the number of
observed values falling into each class.
Grouped frequency – the number of observed values that belong
to a class is called the frequency of that class
Cumulative frequency – is the sum of the frequency of a class and the
frequencies of all classes below it in a frequency distribution, that is,
a running total of the frequencies.
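The two definitions above can be sketched in a few lines of Python. This is only an illustration: the observations are hypothetical, and `frequency_table` is a helper name made up for this sketch.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations, used only for illustration.
observations = [2, 3, 3, 5, 5, 5, 7]

def frequency_table(data):
    """Return rows of (value, frequency, cumulative frequency), sorted by value."""
    counts = Counter(data)                 # frequency of each distinct value
    values = sorted(counts)
    freqs = [counts[v] for v in values]
    cum_freqs = list(accumulate(freqs))    # running total of the frequencies
    return list(zip(values, freqs, cum_freqs))

table = frequency_table(observations)
print("value  frequency  cumulative")
for value, freq, cum in table:
    print(f"{value:>5}  {freq:>9}  {cum:>10}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table.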
Example:
Examples: Constructing Frequency Distribution
Constructing Grouped Frequency Table
Key Concepts:
Class Limits: the values which determine the upper and lower
limits of a class.
Lower Class Limit = smallest data value that can be included
in the class
Upper Class Limit = largest data value that can be included in
the class
Class Interval is the numerical width of any class in a particular
distribution
Class Interval = Upper Class Limit (UCL) – Lower Class
Limit (LCL)
With class limits, the upper limit of one class and the lower limit
of the next class are not equal (e.g. 15 – 19 is followed by 20 – 24).
…key concepts (Cont’d)
Class Boundary
There is a gap between the upper limit of one class and
the lower limit of the next class. The halfway points of
these gaps are called class boundaries.
Class boundaries are the values which separate
adjacent classes
Usually applicable to continuous variables
They are not part of the classes or the dataset
With class boundaries, the upper boundary of one
class and the lower boundary of the next
class are equal.
Class Limit vs. Class Boundary
Class Limit     Class Boundary
15 – 19         14.5 – 19.5
20 – 24         19.5 – 24.5
25 – 29         24.5 – 29.5
30 – 34         29.5 – 34.5
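The conversion from class limits to class boundaries can be sketched as follows. The function name `class_boundaries` is an assumption made for this illustration, and the sketch assumes the gap between consecutive classes is the same throughout the table.

```python
def class_boundaries(limits):
    """Convert inclusive (lower, upper) class limits to class boundaries.

    Assumes at least two classes and an equal gap between consecutive
    classes. Half of that gap is subtracted from each lower limit and
    added to each upper limit.
    """
    gap = limits[1][0] - limits[0][1]      # e.g. 20 - 19 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]

limits = [(15, 19), (20, 24), (25, 29), (30, 34)]
boundaries = class_boundaries(limits)
for (lcl, ucl), (lcb, ucb) in zip(limits, boundaries):
    print(f"{lcl} - {ucl}    {lcb} - {ucb}")
```

Each upper boundary coincides with the lower boundary of the next class, as in the table.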
Rules for Forming Grouped Frequency Distributions
1) There are no hard and fast rules for the number of classes one should
use.
2) The number of classes to use depends largely on how many
measurements or observations we have.
3) Make sure that each observation falls into one and only one class.
4) Do not use too few classes or too many classes
5) Whenever possible, we make classes over equal ranges, that is, equal
class width for all classes. Reason: to allow meaningful comparisons
between different classes and to make the data easier to present
in diagrams.
Example
The prices of 20 items (to the nearest Birr) are given below
46 62 60 47 38 48 51 53 42 60
67 46 54 42 43 38 38 54 46 51
a) Construct a grouped frequency distribution with five classes.
b) Determine the class boundaries and the class marks
c) What is the relative frequency of the 3rd and 5th class?
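One possible way to work through parts (a) to (c) is sketched below. It rests on assumptions that are not stated in the example: the first class starts at the smallest price, all classes have equal integer width, and the width is chosen as (range // number of classes) + 1 so that the largest price is covered. Other valid class choices exist, and `grouped_distribution` is a hypothetical helper name.

```python
prices = [46, 62, 60, 47, 38, 48, 51, 53, 42, 60,
          67, 46, 54, 42, 43, 38, 38, 54, 46, 51]

def grouped_distribution(data, num_classes):
    """Build a grouped frequency table for integer data (illustrative sketch)."""
    low, high = min(data), max(data)
    width = (high - low) // num_classes + 1    # assumed rule for the class width
    n = len(data)
    rows = []
    for i in range(num_classes):
        lcl = low + i * width                  # lower class limit
        ucl = lcl + width - 1                  # upper class limit (inclusive)
        freq = sum(lcl <= x <= ucl for x in data)
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - 0.5, ucl + 0.5),   # data are whole numbers
            "mark": (lcl + ucl) / 2,                # class mark = midpoint
            "frequency": freq,
            "relative": freq / n,                   # relative frequency
        })
    return rows

rows = grouped_distribution(prices, 5)
for r in rows:
    print(r["limits"], r["boundaries"], r["mark"], r["frequency"], r["relative"])
```

Under these assumptions the classes are 38 – 43, 44 – 49, 50 – 55, 56 – 61 and 62 – 67, and the relative frequencies of the 3rd and 5th classes are read from `rows[2]["relative"]` and `rows[4]["relative"]`.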
3.5. Graphical Presentation of Data
Graphs or charts of a frequency distribution are useful
because they emphasize and clarify the inherent
characteristics and patterns of the data that are not so
readily visible in frequency tables.
a) Histogram – consists of a set of rectangles having heights
equal to the class frequencies and bases equal to the class
width.
It is used to chart a continuous frequency distribution.
Usually the markings on the horizontal scale (X-axis) are
the class boundaries and the markings on the vertical
scale (Y-axis) are the class frequencies.
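A rough text rendering of a histogram is sketched below, turned sideways so that each class is one row and the bar length equals the class frequency. The frequencies are hypothetical and `text_histogram` is a name made up for this sketch. A plotting library would normally be used instead.

```python
def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one text line per class: boundaries, then a bar of length = frequency."""
    lines = []
    for (lower, upper), freq in zip(boundaries, frequencies):
        bar = symbol * freq                # bar length equals the class frequency
        lines.append(f"{lower:>5.1f} - {upper:<5.1f} | {bar} ({freq})")
    return lines

# Class boundaries of equal width; the frequencies are hypothetical.
boundaries = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5)]
frequencies = [3, 7, 5, 2]

lines = text_histogram(boundaries, frequencies)
print("\n".join(lines))
```

Because adjacent classes share a boundary, the bars of a histogram touch each other, unlike the separated bars of a bar chart.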
…graphical presentation …(Cont’d)
b) Frequency polygon
A line graph of class frequencies plotted against
class marks.
Classes with zero frequencies are added at both
ends of the distribution to connect the graph with the
horizontal scale
Frequency polygon can also be obtained by
connecting the mid-points of the tops of the
rectangles in a histogram by straight lines.
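The points of a frequency polygon can be computed as in the sketch below, which adds one zero-frequency class at each end. It assumes equally spaced class marks. The frequencies are hypothetical and `polygon_points` is a name chosen for this illustration.

```python
def polygon_points(class_marks, frequencies):
    """Return the (class mark, frequency) points of a frequency polygon.

    Assumes equally spaced class marks. One extra class with zero
    frequency is added at each end so the graph meets the horizontal axis.
    """
    width = class_marks[1] - class_marks[0]
    xs = [class_marks[0] - width] + list(class_marks) + [class_marks[-1] + width]
    ys = [0] + list(frequencies) + [0]
    return list(zip(xs, ys))

# Class marks of the classes 15-19, 20-24, 25-29, 30-34; frequencies are hypothetical.
class_marks = [17, 22, 27, 32]
frequencies = [3, 7, 5, 2]

points = polygon_points(class_marks, frequencies)
print(points)
```

Joining consecutive points with straight lines gives the polygon.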
…graphical presentation …(Cont’d)
c) Bar Charts – are constructed when the data are either
discrete or qualitative
Difference between Histogram and Bar Chart
Histogram Bar Chart
d) Pie Chart
Example: Pie Chart
Quarterly sales of Abyssinia Coffee PLC for the year 2020
were the following. Present this data using
a) Bar chart, and
b) Pie chart
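For a pie chart, each category gets a sector whose angle is its share of the total multiplied by 360 degrees. The sketch below shows that calculation only. The sales figures of the example are not reproduced here, so the numbers used are hypothetical placeholders, and `pie_angles` is a name made up for this sketch.

```python
def pie_angles(values):
    """Map each label to its sector angle in degrees (share of total x 360)."""
    total = sum(values.values())
    return {label: value * 360 / total for label, value in values.items()}

# Hypothetical quarterly figures, NOT the data of the example above.
sales = {"Q1": 20, "Q2": 30, "Q3": 10, "Q4": 40}

angles = pie_angles(sales)
for quarter, angle in angles.items():
    print(f"{quarter}: {angle:.1f} degrees")
```

The sector angles always add up to 360 degrees. For the bar chart, the same values are used directly as the bar heights.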
END OF CHAPTER 3