Day 4 Data Collection Methods-1
DAY 4
Learning Objectives
o Population: the total set of units. It could be all the citizens in a country,
all farms in a region, or all children under the age of five living without
running water in a particular area.
o A population is an accessible group of people who meet a well-defined
set of eligibility criteria.
o It consists of the totality or aggregate of the observations with which the
researcher is concerned.
o The most important point in selecting a population is that the population
should be clearly defined so that the sample can be accurately identified.
The specific population types are:
- Target population, which is the group of individuals who meet the
eligibility criteria.
When we begin planning our data collection strategy, we have to decide whether it is
possible to collect data from the entire population we intend to study.
If we are able to do that, we can then make an accurate report.
If we collect all the data accurately and reliably, then there is little chance of error.
However, most often we are unable to collect data from the entire population. It takes
too much time and costs too much.
Instead, we take a sample, which is a subset of the entire population.
If we select a sample, we may be able to draw inferences (extrapolations,
interpretations) about a population based on our sample results; that is, we can
estimate what the population is like based on our sample results.
We call this “generalizing to a population.”
We use samples all the time. For example, when we have a blood test to check on our
health, the laboratory takes a sample rather than all our blood.
Sampling theory:
o This is developed to determine mathematically the most effective way to
acquire a sample that would accurately reflect the population under
study.
The key concepts of a sampling theory include:
1. A sampling unit, which refers to a single element (such as a person, a
household, or a specific place or location) that can be selected during the
sampling process
2. A sampling frame which describes the complete list of sampling units
from which you can select your sample.
The steps involved in sampling include:
How Large a Sample Do You Need?
Sample size is a function of the size of the population of interest,
the desired confidence level, and the level of precision.
You can use a formula to determine the appropriate
sample size, or you can use a tool such as the one shown in the
slide below, a Guide to Minimum Sample Size.
This table shows the sample size needed when estimating a
population percentage (or proportion) at the 95%
confidence level with a margin of error of plus or minus 5 percentage
points.
As you can see, the smaller the population, the higher the
proportion of cases you will need.
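The relationship between population size and required sample size can be sketched in a few lines of Python. This is only an illustration: it assumes Cochran's formula with a finite population correction, z = 1.96 for the 95% confidence level, and the most conservative proportion p = 0.5. The guide table itself may have been built with a different method, and the function name `min_sample_size` is made up for this example.

```python
import math

def min_sample_size(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Minimum sample size for estimating a proportion.

    Assumption: Cochran's formula with a finite population correction.
    z=1.96 corresponds to the 95% confidence level; p=0.5 is the most
    conservative guess for the population proportion.
    """
    # Sample size needed for a very large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    # Reduce n0 to account for the finite population size.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

# Show how the required share of the population shrinks as it grows.
for size in (100, 1000, 10000):
    n = min_sample_size(size)
    print(f"population {size:>5}: sample {n:>3} ({n / size:.1%} of population)")
```

Under these assumptions a population of 100 needs 80 cases (80% of the population), while a population of 10,000 needs 370 (under 4%).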
Sample size.
o Prior to the selection of the sampling technique, the evaluator must first
determine the sample size.
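o A sample size can be determined using Slovin’s (1960) formula,
which is as follows:
$$ n = \frac{N}{1 + N e^{2}} $$
where n is the sample size, N is the population size, and e is the margin
of error expressed as a decimal (for example, 0.05).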
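As a worked illustration, Slovin's formula (sample size equals N divided by 1 plus N times e squared) can be computed as follows. The function name `slovin_sample_size` and the example figures are assumptions chosen for this sketch. The result is rounded up, since a sample cannot contain a fraction of a unit.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Slovin's formula, n = N / (1 + N * e^2), rounded up to a whole unit."""
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need a positive N and a margin of error between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)

# Example: a population of 1,000 with a 5% margin of error.
print(slovin_sample_size(1000, 0.05))  # 1000 / 3.5 = 285.7..., rounded up to 286
```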
There are two kinds of samples: random (probability) and non-random
(non-probability).
Random samples are samples in which each unit in the population has an equal
chance of being selected.
You can take a random sample of files, roads, farms, or people.
An appropriately sized random sample should be representative of the population as a
whole, enabling you to generalize to the population from which the sample was
drawn.
A complete list of every unit in the population of interest, called a sampling frame, is
needed to select a random sample.
These units are selected using a random schedule; typically, we would number every
unit in the sampling frame, use a table of random numbers, and select the units whose
numbers come up until we reach the sample size we set.
Random numbers can be generated using any major spreadsheet program.
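The same selection can be done in code instead of a spreadsheet. The sketch below assumes the sampling frame is available as a Python list; the farm identifiers and the function name `simple_random_sample` are hypothetical. It uses the standard library's `random.sample`, which draws without replacement so that no unit is selected twice.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw a simple random sample, without replacement, from a sampling frame."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: a complete list of 200 farms.
frame = [f"farm-{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 20, seed=42)
print(sample)
```

Recording the seed alongside the results lets someone else reproduce exactly the same sample later.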
Types of Random Samples
Sometimes we want to make sure specific groups are included that might otherwise
be missed by using a simple random sample; those groups are usually a small
proportion of the population.
In this case, we would divide the population into strata based on some meaningful
characteristic.
This kind of sample is called a stratified random sample.
For example, you may want to make sure you have enough people from rural areas in
your study.
If selected by a simple random sample, you may not get enough people from rural
areas if they are a small proportion of all the people in the area.
This is especially important if you want to have sufficient numbers in each stratum so
you can make meaningful comparisons.
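A stratified random sample can be sketched as follows: split the population into strata, then draw a simple random sample within each one. The data, the `area` field, and the function name `stratified_sample` are invented for illustration. The sketch also assumes the same number of units is drawn from every stratum, which is only one of several possible allocation rules.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample of up to n_per_stratum units from each stratum."""
    rng = random.Random(seed)
    # Group the units by stratum.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Sample independently within each stratum.
    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 1,000 people, of whom 10% live in rural areas.
people = [{"id": i, "area": "rural" if i % 10 == 0 else "urban"} for i in range(1000)]
chosen = stratified_sample(people, lambda person: person["area"], n_per_stratum=50, seed=1)
print({area: len(group) for area, group in chosen.items()})
```

Here the rural stratum is deliberately over-represented (50 of 100 rural people versus 50 of 900 urban people), which supports comparisons between strata. Estimates for the whole population would then need to be weighted by each stratum's share of the population.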
Cluster and Multi-stage Samples
Cluster sampling is another form of random sampling. A “cluster” is any
naturally occurring aggregate of the units that are to be sampled.
Cluster samples are most often used when:
You do not have a complete list of everyone in the population of interest
but do have a complete list of the clusters in which they occur, or
You have a complete list of everyone, but they are so widely dispersed
that it would be too time-consuming and expensive to send data
collectors out to a simple random sample.
In a cluster sample, clusters (such as towns or households) are randomly
sampled and then data are collected on all the target units within each
selected cluster. For
instance, if the evaluation needs to collect data on the height and weight
of children ages 2-5 in the program sites scattered across a large rural
region, the evaluators might randomly sample 20 villages from the 100
villages receiving the program, and then collect data on all the children
ages 2-5 in those villages.
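The village example can be sketched in code: randomly select whole clusters, then keep every target unit inside the selected clusters. The village names, child identifiers, and the function name `cluster_sample` are made up for this illustration, and each hypothetical village is given five children only to keep the example small.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select n_clusters clusters and return all units in each one.

    clusters: dict mapping a cluster name to the list of units it contains.
    """
    rng = random.Random(seed)
    # Sample from the list of clusters, not from the individual units.
    selected = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in selected}

# Hypothetical data: 100 program villages, each with five children ages 2-5.
villages = {
    f"village-{v:03d}": [f"child-{v:03d}-{c}" for c in range(1, 6)]
    for v in range(1, 101)
}
sampled = cluster_sample(villages, 20, seed=7)
children = [child for kids in sampled.values() for child in kids]
print(len(sampled), "villages,", len(children), "children to measure")
```

Only a list of villages is needed to draw this sample, not a list of every child in the region.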
Multi-stage random sampling