Sampling
What is sampling?
Sampling involves the selection of a number of study units from a defined study population.
The population is usually too large for us to consider collecting information from all its
members. Instead, we select a sample of individuals, hoping that the sample is representative
of the population.
Definitions
Study population (population sampled): Population from which the sample actually
was drawn and about which a conclusion can be made. For practical reasons the
study population is often more limited than the target population. In some
instances, the target population and the population sampled are identical.
Sampling unit: The unit of selection in the sampling process. For example, in a
etc.
Study unit: The unit on which the observations will be collected. For example,
N.B. The sampling unit is not necessarily the same as the study unit.
Sample design: The scheme for selecting the sampling units from the study population.
Sampling frame: The list of units from which the sample is to be selected.
Research methodology
Sampling methods
An important issue influencing the choice of the most appropriate sampling method is
whether a sampling frame is available, that is, a listing of all the units that compose the study
population.
Examples:
units that happen to be available at the time of data collection are selected.
2. Quota sampling: a method that ensures that a certain number of sample units
that all these characteristics are represented. In this method the investigator
interviews as many people in each category of study unit as he can find until he
whom we select strategically so that their in-depth information will give optimal
insight into an issue about which little is known. This is called purposeful
sampling.
population.
which factors are contributing significantly to a certain problem, we have to be sure that we
can generalise the findings obtained from a sample to the total study population. Then,
purposeful sampling methods are inadequate, and probability or random sampling methods
have to be used.
that each unit of the sample is chosen on the basis of chance. All units of the study
population should have an equal or at least a known chance of being included in the
sample.
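The equal-chance requirement can be illustrated with a minimal Python sketch. It assumes a hypothetical sampling frame of 1000 numbered units and a sample of 100; the function name simple_random_sample and the seed are illustrative choices, not part of the original text.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame without replacement, each with equal chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1 to 1000 (an assumption for illustration).
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=1)

# Under this scheme every unit has the same inclusion probability n / N.
inclusion_probability = len(sample) / len(frame)
```

Because the draw is made without replacement, no unit can appear twice in the sample.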
1. Simple Random Sampling (SRS): This is the most basic scheme of random
Make a numbered list of all the units in the population from which you
every 5th, 10th, etc.) from the sampling frame. Ideally we randomly select a number
to tell us where to start selecting individuals from the list. For example, a
of the first student to be included in the sample is chosen randomly by picking one
out of the first ten pieces of paper, numbered 1 to 10. If number 5 is picked, every
tenth student will be included in the sample, starting with student number 5, until
100 students are selected. Students with the following numbers will be included in
the sample: 5, 15, 25, and so on.
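The student example can be sketched in Python as follows. The population size of 1000 is an assumption inferred from the interval of 10 and the sample of 100; the function name systematic_sample is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit, where k = population_size // sample_size."""
    interval = population_size // sample_size
    if start is None:
        # Random start between 1 and the sampling interval, as in the paper-slip draw.
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return [start + i * interval for i in range(sample_size)]

# Assumed population of 1000 students, sample of 100, random start fixed at 5.
selected = systematic_sample(1000, 100, start=5)
```

With a start of 5 and an interval of 10, the list begins 5, 15, 25 and contains exactly 100 student numbers.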
Should not be used if there is any sort of cyclic pattern in the ordering
available).
groups of study units with specific characteristics (for example, residents from
urban and rural areas), then the sampling frame must be divided into groups, or
predetermined size will then have to be obtained from each group (stratum).
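A minimal sketch of drawing a sample of predetermined size from each stratum is given below. The urban and rural frame sizes (300 and 700) and the sample sizes (30 and 70) are invented for illustration, and the function name stratified_sample is hypothetical.

```python
import random

def stratified_sample(frame, stratum_of, sizes, seed=None):
    """Split the frame into strata, then draw a random sample of fixed size from each."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # sizes maps each stratum name to the predetermined sample size for that stratum.
    return {name: rng.sample(strata[name], n) for name, n in sizes.items()}

# Hypothetical frame: each unit is a (residence, id) pair.
frame = [("urban", i) for i in range(1, 301)] + [("rural", i) for i in range(1, 701)]
sample = stratified_sample(frame, lambda unit: unit[0], {"urban": 30, "rural": 70}, seed=1)
```

Each stratum is sampled independently, so both groups are guaranteed to be represented in the final sample.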
and rural
Conditions may suggest that prevalence rates will vary between strata: the
stratification is used.
Administrative reasons may make it easier to carry out the survey through
the selection of study units individually is called cluster sampling. Clusters are
clinics).
and widely scattered. The number of stages of sampling is the number of times
• The primary sampling unit (PSU) is the sampling unit (or unit of
• The secondary sampling unit (SSU) is the sampling unit in the second sampling stage,
etc.
e.g.
individuals may be carried out within each household selected. This constitutes
two-stage sampling, with the PSU being households and the SSU being individuals.
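Two-stage sampling with households as PSUs and individuals as SSUs can be sketched as below. The household listing (20 households of 4 members each), the sample sizes, and the function name two_stage_sample are assumptions made for illustration only.

```python
import random

def two_stage_sample(households, n_households, n_per_household, seed=None):
    """Stage 1: sample households (PSUs). Stage 2: sample individuals (SSUs) within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(households), n_households)
    result = {}
    for household in chosen:
        members = households[household]
        # Take all members if the household is smaller than the requested size.
        k = min(n_per_household, len(members))
        result[household] = rng.sample(members, k)
    return result

# Hypothetical listing: 20 households, each with 4 members.
households = {f"H{h}": [f"H{h}-P{p}" for p in range(1, 5)] for h in range(1, 21)}
sample = two_stage_sample(households, n_households=5, n_per_household=2, seed=1)
```

Note that individuals only need to be listed for the households chosen in the first stage.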
Advantages: less costly; we only need to draw up a list of individuals in the clusters
When we take a sample, our results will not exactly equal the correct results for the whole
population. That is, our results will be subject to errors. This error has two components:
Random error, the opposite of reliability (i.e., precision or repeatability), consists of random
deviations from the true value, which can occur in any direction.
Sampling error (random error) can be minimized by increasing the size of the sample.
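The effect of sample size can be illustrated with the standard error of a sample mean, sd / sqrt(n). This is a sketch under the assumption of simple random sampling; the standard deviation of 10 and the sample sizes are hypothetical values chosen for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean under simple random sampling."""
    return sd / math.sqrt(n)

# Hypothetical population standard deviation (an assumption for illustration).
sd = 10.0
errors = {n: standard_error(sd, n) for n in (25, 100, 400)}
# errors[25] is 2.0, errors[100] is 1.0, errors[400] is 0.5
```

In this sketch, quadrupling the sample size halves the standard error, so gains in precision become progressively more expensive.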
Reliability (or precision): This refers to the repeatability of a measure, i.e., the degree of
closeness between repeated measurements of the same value. Reliability addresses the
question, if the same thing is measured several times, how close are the measurements to
each other?
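One possible way to quantify repeatability is the standard deviation of repeated measurements. The sketch below uses that summary as an assumption (the text does not prescribe a particular statistic), and the readings for the two devices are invented.

```python
import statistics

def spread(measurements):
    """Sample standard deviation of repeated measurements; smaller means more repeatable."""
    return statistics.stdev(measurements)

# Hypothetical repeated readings of the same quantity from two devices.
device_a = [120, 121, 119, 120, 120]
device_b = [110, 130, 115, 125, 120]

# True when device A's repeated readings are more tightly clustered than device B's.
more_reliable_is_a = spread(device_a) < spread(device_b)
```

Both devices average 120 here, so the comparison isolates repeatability and says nothing about how close either device is to the true value.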
a) Variation in the characteristic of the subject being measured. Example: blood pressure
observation
It is possible to eliminate or reduce the non-sampling error (bias) by careful design of the
sampling procedure.
Validity: This refers to the degree of closeness between a measurement and the true value
of what is being measured. Validity addresses the question, how close is the measured
To be accurate, a measuring device must be both valid and reliable. However, if one cannot
have both, validity is more important in situations when we are interested in the absolute
value of what is being measured. Reliability on the other hand is more important when it is
not essential to know the absolute value, but rather we are interested in finding out if there is
Bias resulting from incompleteness of the sampling frame: accessibility bias, seasonability
Non-response bias refers to failure to obtain information on some of the subjects included in
the sample to be studied. It results in significant bias when both of the following
conditions are fulfilled.
The issue of non-response should be considered during the planning stage of the
study:
Methods that may help keep non-response at a low level include:
• Training data collectors to initiate contact with study subjects in a respectful way and
convince them about the importance of the given study (this minimizes the refusal type
of non-response)
• Making repeated attempts (at least 3 times) to contact study subjects who were