Econ 522 - Chapter 4
SAMPLING TECHNIQUES
Sampling
Why sample?
Cost in terms of money, time and manpower
Accessibility
Utility, e.g. a diagnostic laboratory test does not require
drawing all of a patient’s blood.
Sampling…..
Sampling is the process of selecting a representative sample
from a population.
It involves selecting cases (elements), or locating people (or other units of analysis),
from a target population in order to study that population.
[Figure: sampling draws a sample from the population; inference generalizes from the sample back to the population]
Cont’d
The process of obtaining information from a subset (sample) of a larger
group (population)
The results for the sample are then used to make estimates of the larger
group
Faster and cheaper than asking the entire population
Two keys
1. Selecting the right people
Have to be selected scientifically so that they are representative of the population
2. Selecting the right number of the right people
To minimize sampling error, i.e. choosing the wrong people by chance
Characteristics of Good Samples
o Representation
Sample surveys are almost never conducted for the
purposes of describing the particular sample under
study. Rather they are conducted for purposes of
understanding the larger population from which the
sample was initially selected
A great deal of work has been done over the years in
developing sampling methods that provide
representative samples for the general population.
cont’d
Census: A census is a sample consisting of the entire
population.
It has the following disadvantages:
Expensive
Takes a long time
Cumbersome and therefore prone to inaccuracy (a careful
sample can produce more accurate data than a census).
Sample survey: study a sample and draw conclusions about the
population.
Cheaper in terms of cost,
Practical & convenient in terms of technicalities
Saves time & energy.
Errors in Statistical Study
o Sampling (random) errors
o Non-sampling (systematic) errors
a. Sampling error
– Random error: the sample selected is not
representative of the population due to chance
– The uncertainty associated with an estimate that is based
on data gathered from a sample of the population rather
than the full population is known as sampling error.
– Sampling errors are the random variations in the sample
estimates around the true population parameters.
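A small simulation can make this random variation concrete. The sketch below is only an illustration: the population of 10,000 values is hypothetical and generated with Python's `random` module, and the function name `sample_mean_errors` is introduced here for the example. It draws repeated simple random samples and records how far each sample mean falls from the true population mean.

```python
import random
import statistics

def sample_mean_errors(population, n, draws, seed=0):
    """Return (sample mean - population mean) for `draws` random samples of size n."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    return [
        statistics.fmean(rng.sample(population, n)) - true_mean
        for _ in range(draws)
    ]

# Hypothetical population (illustrative values only, not real data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(500, 100) for _ in range(10_000)]

errors_small = sample_mean_errors(population, n=25, draws=500)
errors_large = sample_mean_errors(population, n=400, draws=500)

# The spread of the errors is the sampling error; it shrinks as n grows.
print("spread of errors, n=25: ", round(statistics.pstdev(errors_small), 2))
print("spread of errors, n=400:", round(statistics.pstdev(errors_large), 2))
# The errors go in both directions, so their average is close to zero.
print("average error, n=25:    ", round(statistics.fmean(errors_small), 2))
```

The individual errors are positive for some samples and negative for others, and their spread is smaller for the larger sample size.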
The cause of sampling error
Chance: the main cause of sampling error; it is the error that
occurs purely because of bad luck.
b. Non-Sampling Error
It is a type of systematic error in the design or conduct of a
sampling procedure which results in distortion of the sample, so
that it is no longer representative of the reference population.
cont’d…
o The basic types of non-sampling error
Non-response error
Response or data error
o A non-response error occurs when units selected as part of the
sampling procedure do not respond in whole or in part
If non-respondents are not different from those that did
respond, there is no non-response error
When non-respondents constitute a significant proportion of
the sample (about 15% or more)
cont’d …
Random error can distort the results in any given direction but
tends to balance out on average
Types of Sampling
Types of Sampling Methods
cont’d …
In probability sampling
A sampling frame exists or can be compiled.
Every element of the population should have an equal, or at least a known and nonzero, chance
of being included in the sample.
Generalization is possible (from sample to population)
Probability sampling methods:
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster Sampling
Multistage Sampling
a. Simple Random Sampling(SRS)
b. Systematic Sampling
• N = 1200 and n = 60
sampling interval k = N/n = 1200/60 = 20
• List persons from 1 to 1200
• Randomly select a number between 1 and 20
(e.g. 8)
• 1st person selected = the 8th on the list
• 2nd person = 8 + 20 = the 28th on the list, etc.
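The selection steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming that N is an exact multiple of n; the function name `systematic_sample` and its `start` parameter are introduced here for the example only.

```python
import random

def systematic_sample(N, n, start=None, rng=None):
    """Select n positions from a list numbered 1..N at a fixed interval k = N // n.

    Assumes N is an exact multiple of n. If `start` is not given,
    it is drawn at random between 1 and k (inclusive).
    """
    k = N // n                          # sampling interval
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, k)       # random start between 1 and k
    return [start + i * k for i in range(n)]

# Reproduce the slide's example: N = 1200, n = 60, random start = 8.
fixed = systematic_sample(1200, 60, start=8)
print(fixed[:3])        # first three selected positions

# With a random start, every position is still k = 20 apart.
selected = systematic_sample(1200, 60, rng=random.Random(1))
print(selected[:3], "...", selected[-1])
```

Only the first position is random; once it is chosen, the rest of the sample is fully determined by the interval.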
c. Stratified Random Sampling
Stratified random sampling is used when we have subgroups in
our population that are likely to differ substantially in their
responses or behavior (i.e. if the population is heterogeneous).
For example, if responses are expected to differ by sex, you divide
the population into male and female members and
randomly select the required sample size within each subgroup
(or "stratum")
There are two methods for allocating the study subjects among the subgroups:
proportional allocation or
equal allocation.
We use the proportional allocation technique when the subgroups vary dramatically in size
in the population
Let $N$ be the total population and $N_1, N_2, \dots, N_k$ be the population sizes of strata $1, 2, \dots, k$
respectively. Moreover, let $n$ be the total sample size and $n_1, n_2, \dots, n_k$ be the subsamples for strata
$1, 2, \dots, k$ respectively, so that $N = N_1 + N_2 + \dots + N_k$ and $n = n_1 + n_2 + \dots + n_k$.
Then the subsample $n_i$ to be selected from subgroup $N_i$ can be computed by
$$ n_i = \frac{n \, N_i}{N}, \qquad i = 1, 2, 3, \dots, k $$
The higher the population in the subgroup, the higher the
sample size will be.
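As a rough illustration of proportional allocation, the sketch below applies the rule that each stratum's subsample equals n times N_i divided by N. The stratum names and sizes are hypothetical, and the helper name `proportional_allocation` is introduced here only for the example.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of size n across strata in proportion to their size.

    strata_sizes maps a stratum name to its population size N_i.
    Returns unrounded subsample sizes n_i = n * N_i / N.
    """
    N = sum(strata_sizes.values())
    return {name: n * N_i / N for name, N_i in strata_sizes.items()}

# Hypothetical strata (illustrative numbers only): N = 1200, n = 60.
strata = {"male": 700, "female": 500}
allocation = proportional_allocation(strata, n=60)
print(allocation)
```

When the results are not whole numbers they have to be rounded, and the rounded subsamples may then need a small adjustment so that they still add up to n.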
d. Cluster Random Sampling
Non-Probability Sampling Method
When constraints prevent the use of probability sampling
strategies, the alternative is a non-probability
sampling method.
Cont’d...
Advantages
Cheaper and faster than probability sampling
Reasonably representative if collected in a thorough manner
a. Judgment Sampling/ Purposive sampling
b. Convenience Sampling
c. Quota sampling
d. Snowball sampling
Sample Size Determination
The required sample size depends on the aims, nature and scope of the
study and on the expected result, all of which should be
carefully considered at the planning stage.
Sample size
o If the sample size (“n”) is large
Increased accuracy
Costly / complex
o If the sample size is small
Decreased accuracy
Less costly
o Take the optimum sample size. How?
Factors to determine sample size
Size of population
Resources – subjects, financial, manpower
Method of Sampling- random, stratified
Degree of difference to be detected
Variability (S.D.) – pilot study, historical
Degree of Accuracy (or errors)
- Type I error (alpha) p<0.05
- Type II error (beta) less than 0.2 (20%)
- Power of the test : more than 0.8 (80%)
Statistical Formulae
Dropout rate
Cont’d
There are three possible categories of outcome variables.
The first is where the variable of interest has only two
alternative responses: yes/no, dead/alive, vaccinated/not
vaccinated and so on.
The second category covers outcome variables with
multiple, mutually exclusive alternative responses, such as
marital status, religion, blood group and so on.
For these two categories of outcome variables, the data are
generally expressed as percentages or rates.
So we can use percentages to compute the sample size.
The third category covers continuous response variables
such as birth weight, age at first marriage, blood
pressure and serum uric acid level, for which
numerical measurements are usually made.
Cont’d
There are several approaches to determining the sample size.
Sample Size for Single Population Mean
To estimate the sample size for a single survey using
simple or systematic random sampling, we need to
know:
o Estimate of the prevalence of the outcome
o Precision desired
o Design effect
o Size of total population
o Level of confidence (commonly 95%)
Maximum acceptable difference (w): This is the maximum
amount of error that you are willing to accept.
Desired confidence level ($z_{\alpha/2}$): this is your level of certainty that
the sample mean does not differ from the true population mean
by more than the maximum acceptable difference. Commonly
we use a 95% confidence level.
Then the sample size determination formula for a single
population mean is defined by:
$$ n = \frac{z_{\alpha/2}^{2} \, \sigma^{2}}{w^{2}} $$
where $\sigma$ is the standard deviation of the population.
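To show how the formula for a single population mean is applied, the sketch below computes n from the standard deviation, the maximum acceptable difference w and the confidence level. The numeric inputs (a standard deviation of 15 and w = 2) are hypothetical, and the function name `sample_size_mean` is introduced here for illustration; the z value comes from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, w, confidence=0.95):
    """Sample size for estimating a single population mean.

    sigma: standard deviation of the population (e.g. from a pilot study)
    w: maximum acceptable difference between sample and population mean
    """
    # Two-sided critical value z_{alpha/2}; about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * sigma^2 / w^2, rounded up to a whole subject.
    return math.ceil((z * sigma / w) ** 2)

# Hypothetical inputs: sigma = 15, w = 2, 95% confidence.
n_mean = sample_size_mean(sigma=15, w=2)
print(n_mean)
```

The result is rounded up because rounding down would give slightly less precision than requested.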
Sample Size for Single Population Proportion
Then the formula for the sample size of a single population
proportion is defined as:
$$ n = \frac{z_{\alpha/2}^{2} \, p \, (1 - p)}{w^{2}} $$
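The proportion formula can be evaluated in the same way. The sketch below is illustrative only: the inputs p = 0.5 and w = 0.05 are hypothetical, and the function name `sample_size_proportion` is an assumption made for the example.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, w, confidence=0.95):
    """Sample size for estimating a single population proportion.

    p: anticipated proportion (between 0 and 1)
    w: maximum acceptable difference (margin of error), as a proportion
    """
    # Two-sided critical value z_{alpha/2}; about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * p * (1 - p) / w^2, rounded up to a whole subject.
    return math.ceil(z ** 2 * p * (1 - p) / w ** 2)

# Hypothetical inputs: p = 0.5, w = 0.05, 95% confidence.
n_prop = sample_size_proportion(p=0.5, w=0.05)
print(n_prop)
```

Because p(1 - p) is largest at p = 0.5, using 0.5 gives the most conservative (largest) sample size for a given w.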
Incorrect sample size will lead to
Wrong conclusions
Poor quality research (Errors)
Type II error can be minimized by increasing the sample
size
Waste of resources
Loss of money
Ethical problems
Delay in completion