Collection and Presentation of Data
A parameter is a numerical characteristic of the population, such as the population mean, population
standard deviation, or population variance. It is usually unknown and is estimated by the
corresponding statistic computed from the sample data.
Types of Data:
1. Qualitative data – are categorical data which take the form of categories or attributes such as sex,
course, year level, race, religion, etc.
2. Quantitative data – are numerical data which are obtained from measurements like heights,
weights, ages, scores, temperature, IQ and other measurable quantities.
Measurement Scales
Qualitative data can be converted to quantitative data through a process called measurement. In
measurement, numbers are used to code objects so that they can be treated statistically. There are four
types of measurement.
1. Nominal measurements – are used only for identification or classification purposes. For example,
a group of students under investigation is classified according to course, such as:
1 - engineering, 2 - education, 3 - commerce and 4 - secretarial. There is no meaning attached
to the magnitude of the numbers assigned to the courses of the respondents.
2. Ordinal measurements – do not only classify items. They also give the order or ranks of classes,
items, or objects. Examples are the ranks given to winners in contests such as oratorical,
beauty, and essay writing contests.
3. Interval measurements – numbers are assigned to the items or objects to identify and rank them.
They also measure the degree of difference between any two classes, but the zero point is arbitrary.
Examples are temperatures in degrees Celsius, IQ, achievement grades, test scores, etc.
4. Ratio measurements – the ratio of the numbers assigned in the measurement shows the ratio in
the amount of the property being measured, because zero means none of the property. Examples are
weights and heights.
Sampling Techniques - are utilized to test the validity of conclusions or inferences from the sample to
the population. A representative sample of 100 is generally preferable to an
unrepresentative sample of 1,000.
Random Sample refers to a limited number of individuals chosen from the population. Every individual
has an equal chance of being selected in the sample before the selection is done.
1. Simple Random Sampling – the simplest method of random sampling is through lottery.
2. Stratified Random Sampling – is done by dividing the population into categories or strata and
drawing members at random from each stratum or sub-group in proportion to its size.
3. Systematic Random Sampling – refers to a process of selecting every kth element in the population
until the desired sample size is acquired.
4. Cluster Sampling – is an advantageous procedure when the population is spread out over a wide
geographical area. It is also a practical sampling technique when a complete list
of the members of the population is not available. A cluster refers to an intact group which
has common characteristics.
5. Multi-stage Sampling – a more complex sampling technique, which includes the following steps:
a) Divide the population into strata.
b) Divide each stratum into clusters.
c) Draw a sample from each cluster using the simple random sampling technique.
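The first three techniques can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the population of 100 numbered members, the two strata, and the fixed seed are all hypothetical and are not part of the lesson.

```python
import random

def simple_random_sample(population, n, rng):
    """Lottery method: every member has an equal chance of being drawn."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Take every k-th member after a random start, with k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    """Sample each stratum in proportion to its share of the population.

    Rounding each share can make the total differ slightly from n when
    the proportions are not whole numbers.
    """
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

rng = random.Random(2021)  # hypothetical fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population of 100 members
strata = {
    "first year": list(range(1, 61)),     # 60 members
    "second year": list(range(61, 101)),  # 40 members
}

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 10, rng)
```

With a sample of 10, the stratified draw takes 6 members from the stratum of 60 and 4 from the stratum of 40, which keeps the sample proportionate to the population.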
n = 475 / (1 + 475 (0.05)^2)
n = 475 / 2.1875
n = 217 (rounded to the nearest whole number)
Hence, at 95% accuracy we can take a sample of 217 respondents from a population of 475.
2. A researcher is conducting an investigation regarding the organizational climate of 375 faculty
members of a certain university in the National Capital Region (NCR). How many of the faculty members will
be taken as respondents if the researcher wants to have a margin of error of 1%?
Solution:
Since the population is the entire faculty of a certain university in the NCR,
N is 375 and the margin of error is 1% (0.01):
n = N / (1 + N e^2)
n = 375 / (1 + 375 (0.01)^2)
n = 375 / 1.0375
n = 361 (rounded to the nearest whole number)
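The arithmetic in these sample-size problems can be checked with a short Python sketch. It assumes the formula in use is n = N / (1 + N e^2), commonly known as Slovin's formula, where N is the population size and e is the margin of error written as a decimal. The function name sample_size is hypothetical.

```python
def sample_size(population_size, margin_of_error):
    """Return n = N / (1 + N * e**2), rounded to the nearest whole number."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return round(n)

print(sample_size(475, 0.05))  # population of 475 at a 5% margin of error
print(sample_size(375, 0.01))  # population of 375 at a 1% margin of error
```

A smaller margin of error pushes the sample size toward the full population: at 1% the sample for 375 faculty members is 361, nearly everyone.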
Methods of Collecting Data
There are many methods of collecting data. However, no single method is best for every investigation.
The choice of appropriate methods depends on the following factors: the nature of the problem,
the population under investigation, and the time and materials available. Thus, to obtain the needed accurate information at a
minimum cost and in the least possible time, a combination of the following methods of data gathering may be applied.
a. The Direct or Interview Method – is one of the most effective methods of collecting original data. To
obtain accurate responses, the interview should be done by well-trained interviewers. The interviewer can
be of great help to the respondents in answering questions which the respondents could not understand.
b. The Indirect or Questionnaire Method – is one of the easiest methods of data gathering. It takes time
to prepare because questionnaires need to be attractive. It can include illustrations, pictures, and
sketches. Its contents, especially the directions, must be precise, clear and self-explanatory.
c. The Registration Method – the respondents provide information in compliance with certain laws,
policies, rules, regulations, decrees or standard practices. Data which can be collected by registration
method are as follows: marriage contracts, birth certificates, motor registration, licenses of firearms,
registration of corporations, real estate, voters, etc.
d. Other Methods
a. Observation Method – is utilized to gather data regarding the attitudes, behavior, values, and cultural
patterns of the sample under investigation.
b. Telephone Interview – is employed if the questions to be asked are brief and few. An example is
the checks made on listeners of certain radio programs, such as asking a listener what program the
radio is tuned to. This method is used to find the most popular T.V. or radio programs.
c. Experiment – is applied to collect or gather data if the investigator wants to control the factors
affecting the variable studied. An example is when the researcher aims to determine the different
FREQUENCY DISTRIBUTION
When the researcher gathers all the needed data, the next task is to organize and present them with the use of
appropriate tables and graphs. Frequency distribution is one system used to facilitate the description of important features
of the data.
4. Class Size – refers to the difference between the upper class boundary and the lower class boundary of a
class interval. For the class boundaries 4.5 and 9.5, the class size is 5 since 9.5 minus 4.5 is equal to 5.
5. Class Frequency – means the number of observations belonging to a class interval.
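The class-size computation can be sketched in Python. The class limits 5 and 9 are an assumption chosen to match the boundaries 4.5 and 9.5 in the example; the half-unit adjustment assumes scores are recorded as whole numbers. The function names are hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Extend each class limit by half a unit of measurement."""
    half = unit / 2
    return lower_limit - half, upper_limit + half

def class_size(lower_boundary, upper_boundary):
    """Class size is the upper class boundary minus the lower class boundary."""
    return upper_boundary - lower_boundary

low, high = class_boundaries(5, 9)  # assumed class limits 5 - 9
size = class_size(low, high)
print(low, high, size)
```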
Steps in Constructing the Grouped Frequency Distribution for the Statistics Test Scores
This means that the 1st class interval or class limit will be 90 – 94.
Step 5. Construct the Frequency Distribution Table.
The Grouped Frequency Distribution Table of 50 Statistics Scores
Class Interval   Class Frequency   Class Mark   Cumulative Frequency   Cumulative Frequency
(C.I.)           (f)               (X)          (CumF <)               (CumF >)
90 – 94           2                92           50                      2
85 – 89           6                87           48                      8
80 – 84           3                82           42                     11
75 – 79           8                77           39                     19
70 – 74           5                72           31                     24
65 – 69           2                67           26                     26
60 – 64          10                62           24                     36
55 – 59           3                57           14                     39
50 – 54           4                52           11                     43
45 – 49           3                47            7                     46
40 – 44           4                42            4                     50
                 N = 50
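The two cumulative frequency columns are running totals of the class frequencies, taken in opposite directions. A minimal Python sketch, using the class frequencies from the table above (the variable names are hypothetical):

```python
from itertools import accumulate

# Class frequencies from the table, listed from the highest class (90 - 94)
# down to the lowest class (40 - 44).
frequencies = [2, 6, 3, 8, 5, 2, 10, 3, 4, 3, 4]

# "Greater than" cumulative frequency: running total from the highest class down.
cum_greater = list(accumulate(frequencies))

# "Less than" cumulative frequency: running total from the lowest class up,
# reversed so that it lines up with the top-to-bottom order of the table.
cum_less = list(accumulate(reversed(frequencies)))[::-1]

print(cum_less)
print(cum_greater)
```

Both columns must reach N = 50 at one end, which is a quick check that no class frequency was miscounted.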
Histogram – is made up of vertical bars that are joined together, making it an appropriate graph for continuous data. The
base of each bar or rectangle spans the class boundaries of a class interval, and its height corresponds to the class frequency.
The steps in constructing a histogram are as follows:
a. Prepare the x-axis and y-axis.
b. Lay off the x-axis and y-axis to represent the class intervals and the class frequencies, respectively.
c. Draw each bar with a height equal to the class frequency of its class interval.
d. Plot the base of each bar on the x-axis so that its width corresponds to the real limits or class
boundaries of the class interval and the center of the base falls on the midpoint of the class interval.
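The geometry of the bars can be sketched in Python without drawing anything. The sketch assumes whole-number scores, so each class boundary lies half a unit beyond the class limit; the function name histogram_bars is hypothetical.

```python
def histogram_bars(class_limits, frequencies):
    """Return one (left, right, midpoint, height) tuple per class interval."""
    bars = []
    for (lower, upper), f in zip(class_limits, frequencies):
        left, right = lower - 0.5, upper + 0.5  # class boundaries
        bars.append((left, right, (left + right) / 2, f))
    return bars

# Three of the class intervals from the table of 50 Statistics scores.
bars = histogram_bars([(40, 44), (45, 49), (50, 54)], [4, 3, 4])
for left, right, midpoint, height in bars:
    print(f"base {left} to {right}, centered at {midpoint}, height {height}")
```

Because the right edge of one bar equals the left edge of the next, the bars join with no gaps, which is what distinguishes a histogram from a bar chart.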
Frequency Polygon – is commonly called a line graph. It is a very useful device for showing changes in values over
successive periods of time. The steps in constructing a frequency polygon are as follows.
1. Represent the x-axis by utilizing the class marks of the class intervals.
Edited 1ST sem 2021-22
a) N = 1500, e = 5%
n = 1500 / (1 + 1500 (0.05)^2)
n = 315.79
n = 316
c) N = 6075, e = 10%
n = 6075 / (1 + 6075 (0.1)^2)
n = 98.38
n = 98
d) N = 500, e = 1%
n = 500 / (1 + 500 (0.01)^2)
n = 476.19
n = 476
Given the test scores of 50 students in Statistics, construct the grouped frequency distribution table.
47 41 29 28 25
26 23 46 38 28
46 37 28 23 28
20 27 44 26 37
29 36 26 43 21
27 18 29 34 42
29 43 34 19 27
25 40 28 32 14
29 32 40 13 24
41 11 31 24 27
Class Interval   Class Frequency   Class Mark   Cumulative Frequency   Cumulative Frequency
(C.I.)           (f)               (x)          (Cumf <)               (Cumf >)
45 – 47           3                46           50                      3
42 – 44           4                43           47                      7
39 – 41           4                40           43                     11
36 – 38           4                37           39                     15
33 – 35           2                34           35                     17
30 – 32           3                31           33                     20
27 – 29          14                28           30                     34
24 – 26           7                25           16                     41
21 – 23           3                22            9                     44
18 – 20           3                19            6                     47
15 – 17           0                16            3                     47
12 – 14           2                13            3                     49
 9 – 11           1                10            1                     50
                 N = 50
(25 points)
6. The real limits or class boundaries of the highest class interval are 44.5 – 47.5.
7. The class mark or midpoint of the lowest class interval is 10.
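The tally behind the grouped frequency distribution of the 50 test scores can be reproduced in Python. The class width of 3 and the lowest class limit of 9 are read off the table above; the function name grouped_frequencies is hypothetical.

```python
scores = [
    47, 41, 29, 28, 25, 26, 23, 46, 38, 28,
    46, 37, 28, 23, 28, 20, 27, 44, 26, 37,
    29, 36, 26, 43, 21, 27, 18, 29, 34, 42,
    29, 43, 34, 19, 27, 25, 40, 28, 32, 14,
    29, 32, 40, 13, 24, 41, 11, 31, 24, 27,
]

def grouped_frequencies(data, lowest_limit, width, classes):
    """Count scores per class interval, listed from the highest class down."""
    table = []
    for i in reversed(range(classes)):
        lower = lowest_limit + i * width
        upper = lower + width - 1
        f = sum(lower <= x <= upper for x in data)  # class frequency
        table.append((lower, upper, f, (lower + upper) / 2))  # last item: class mark
    return table

table = grouped_frequencies(scores, lowest_limit=9, width=3, classes=13)
for lower, upper, f, mark in table:
    print(f"{lower:>2} - {upper:>2}  f = {f:>2}  class mark = {mark:g}")
```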
Construct the following graphs based on the Statistics test results of 50 students in the grouped
frequency distribution. (30 points)
a. histogram
b. frequency polygon
c. cumulative frequency polygon
a. HISTOGRAM
[Figure: histogram of the grouped frequency distribution]
b. FREQUENCY POLYGON
39 – 41           4                40
36 – 38           4                37
33 – 35           2                34
30 – 32           3                31
27 – 29          14                28
24 – 26           7                25
21 – 23           3                22
18 – 20           3                19
15 – 17           0                16
12 – 14           2                13
 9 – 11           1                10
                 N = 50
[Figure: frequency polygon of the grouped frequency distribution]
Class Boundaries   Class Interval   (f)   (x)   (Cumf <)   (Cumf >)
44.5 – 47.5        45 – 47           3    46    50          3
41.5 – 44.5        42 – 44           4    43    47          7
38.5 – 41.5        39 – 41           4    40    43         11
35.5 – 38.5        36 – 38           4    37    39         15
32.5 – 35.5        33 – 35           2    34    35         17
29.5 – 32.5        30 – 32           3    31    33         20
26.5 – 29.5        27 – 29          14    28    30         34
23.5 – 26.5        24 – 26           7    25    16         41
20.5 – 23.5        21 – 23           3    22     9         44
17.5 – 20.5        18 – 20           3    19     6         47
14.5 – 17.5        15 – 17           0    16     3         47
11.5 – 14.5        12 – 14           2    13     3         49
 8.5 – 11.5         9 – 11           1    10     1         50
                   N = 50
[Figure: cumulative frequency polygon; x-axis: class boundaries from 8.5 to 47.5; y-axis: cumulative frequency from 0 to 50]