Lecture 11 (Sampling & Size)
METHODS
LEARNING OBJECTIVES
SAMPLING…….
SAMPLING BREAKDOWN
[Figure: TARGET POPULATION, STUDY POPULATION, SAMPLE]
SAMPLING…….
• Two general approaches to sampling are used in social science research:
probability sampling and non-probability sampling.
• With probability sampling, all elements (e.g., persons, households) in the
population have some opportunity of being included in the sample, and the
mathematical probability that any one of them will be selected can be calculated.
• With nonprobability sampling, in contrast, population elements are selected on
the basis of their availability (e.g., because they volunteered) or because of the
researcher's personal judgment that they are representative.
• The consequence is that an unknown portion of the population is excluded (e.g.,
those who did not volunteer).
• Because some members of the population have no chance of being sampled, the
extent to which a convenience sample – regardless of its size – actually represents
the entire population cannot be known.
Types of Samples
• Probability (Random) Samples
– Simple random sample
– Systematic random sample
– Stratified random sample
– Multistage sample
– Multiphase sample
– Cluster sample
• Non-Probability Samples
– Convenience sample
– Purposive sample
– Quota sample
– Snowball sample
Process
• The sampling process comprises several stages:
– Defining the population of concern
– Specifying a sampling frame, a set of items or
events possible to measure
– Specifying a sampling method for selecting
items or events from the frame
– Determining the sample size
– Implementing the sampling plan
– Sampling and data collecting
– Reviewing the sampling process
Population definition
• A population can be defined as including all
people or items with the characteristic one
wishes to understand.
• Because there is very rarely enough time or
money to gather information from everyone
or everything in a population, the goal
becomes finding a representative sample (or
subset) of that population.
Population definition…….
• Note also that the population from which the
sample is drawn may not be the same as the
population about which we actually want
information. Often there is large but not
complete overlap between these two groups
due to frame issues etc.
• Sometimes they may be entirely separate -
for instance, we might study rats in order to
get a better understanding of human health,
or we might study records from people born
in 2008 in order to make predictions about
people born in 2009.
SAMPLING FRAME
• In the most straightforward case, such as the
sentencing of a batch of material from production
(acceptance sampling by lots), it is possible to
identify and measure every single item in the
population and to include any one of them in our
sample. However, in the more general case this is not
possible. There is no way to identify all rats in the
set of all rats. Where voting is not compulsory,
there is no way to identify which people will actually
vote at a forthcoming election (in advance of the
election).
• As a remedy, we seek a sampling frame which has
the property that we can identify every single
element and include any of them in our sample.
• The sampling frame must be representative of the
population.
PROBABILITY SAMPLING
NON PROBABILITY SAMPLING
• Any sampling method where some elements of the population
have no chance of selection (these are sometimes
referred to as 'out of coverage'/'undercovered'), or
where the probability of selection can't be accurately
determined. It involves the selection of elements based
on assumptions regarding the population of interest,
which form the criteria for selection. Hence, because
the selection of elements is nonrandom, nonprobability
sampling does not allow the estimation of sampling errors.
NONPROBABILITY SAMPLING…….
• Nonprobability Sampling includes:
Convenience Sampling, Quota Sampling and
Purposive Sampling.
• In addition, nonresponse effects may turn
any probability design into a nonprobability
design if the characteristics of
nonresponse are not well understood, since
nonresponse effectively modifies each
element's probability of being sampled.
SIMPLE RANDOM SAMPLING
• Applicable when population is small,
homogeneous & readily available
• All subsets of the frame are given an equal
probability. Each element of the frame
thus has an equal probability of selection.
• It provides for the greatest number of
possible samples. This is done by assigning
a number to each unit in the sampling
frame.
• A table of random numbers or a lottery
system is used to determine which units
are to be selected.
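The lottery procedure described above can be sketched in Python. This is only an illustration under stated assumptions: the frame of 500 numbered units, the sample size of 20 and the helper name `simple_random_sample` are hypothetical and do not come from the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)      # seeded only so the draw can be repeated
    return rng.sample(frame, n)

# Hypothetical sampling frame: each unit is assigned a number from 1 to 500.
frame = list(range(1, 501))
sample = simple_random_sample(frame, 20, seed=42)
print(sorted(sample))
```

Each unit here has the same selection probability, 20/500, which is what makes the design an equal-probability one.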
SIMPLE RANDOM SAMPLING……..
• Estimates are easy to calculate.
• Simple random sampling is always an EPS (equal probability of
selection) design, but not all EPS designs are simple random sampling.
• Disadvantages
• If the sampling frame is large, this method is impracticable.
• Minority subgroups of interest in the population may not be
present in the sample in sufficient numbers for study.
REPLACEMENT OF SELECTED UNITS
SYSTEMATIC SAMPLING……
• ADVANTAGES:
• Sample easy to select
• Suitable sampling frame can be identified easily
• Sample evenly spread over entire reference population
• DISADVANTAGES:
• Sample may be biased if hidden periodicity in population
coincides with that of selection.
• Difficult to assess precision of estimate from one survey.
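A minimal sketch of systematic selection, and of the hidden-periodicity problem listed above, is given here. It assumes the usual scheme of a random start followed by every k-th unit; the function name and the 70-day frame with a weekly cycle are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then take every k-th unit."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    k = len(frame) // n                      # sampling interval
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: 70 consecutive days, so the data repeat with period 7.
days = list(range(70))
sample = systematic_sample(days, 10, seed=3)   # interval k = 7
weekdays = {day % 7 for day in sample}
print(sample, weekdays)
```

Because the interval (7) coincides with the weekly cycle, every selected day falls on the same weekday, so the sample says nothing about the other six.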
STRATIFIED SAMPLING
• Where the population embraces a number of distinct
categories, the frame can be organized into
separate "strata." Each stratum is then sampled as
an independent sub-population, out of which
individual elements can be randomly selected.
• Every unit in a stratum has same chance of being
selected.
• Using same sampling fraction for all strata
ensures proportionate representation in the
sample.
• Adequate representation of minority subgroups of
interest can be ensured by stratification & varying
sampling fraction between strata as required.
STRATIFIED SAMPLING……
• Finally, since each stratum is treated as an
independent population, different sampling
approaches can be applied to different strata.
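The effect of the sampling fraction can be sketched as follows. The strata ("urban", "rural"), their sizes and the fractions are invented for illustration, and `stratified_sample` is a hypothetical helper, not a method prescribed in the lecture.

```python
import random

def stratified_sample(strata, fractions, seed=None):
    """Draw a simple random sample inside each stratum.

    strata:    dict mapping stratum name -> list of units
    fractions: dict mapping stratum name -> sampling fraction for that stratum
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * fractions[name])   # units to draw in this stratum
        sample[name] = rng.sample(units, n)
    return sample

# Hypothetical population: 800 urban and 200 rural units.
strata = {
    "urban": [f"U{i}" for i in range(800)],
    "rural": [f"R{i}" for i in range(200)],
}

# Same fraction everywhere gives proportionate representation (80 urban, 20 rural).
proportionate = stratified_sample(strata, {"urban": 0.10, "rural": 0.10}, seed=1)

# A larger fraction in the small stratum secures enough rural units (40 urban, 60 rural).
oversampled = stratified_sample(strata, {"urban": 0.05, "rural": 0.30}, seed=1)
```

When fractions differ between strata, estimates for the whole population would need to be weighted to undo the oversampling.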
CLUSTER SAMPLING
• Cluster sampling is an example of 'two-stage
sampling'.
• In the first stage, a sample of areas is chosen.
• In the second stage, a sample of respondents within
those areas is selected.
• Population divided into clusters of homogeneous
units, usually based on geographical contiguity.
• Sampling units are groups rather than individuals.
• A sample of such clusters is then selected.
• All units from the selected clusters are studied.
CLUSTER SAMPLING…….
• Advantages :
• Cuts down on the cost of preparing a
sampling frame.
• This can reduce travel and other
administrative costs.
• Disadvantages: sampling error is higher
than for a simple random sample of the same
size.
• Often used to evaluate vaccination
coverage in EPI (Expanded Programme on Immunization).
CLUSTER SAMPLING…….
• There are two types of cluster sampling methods.
• One-stage sampling: all of the elements
within selected clusters are included in
the sample.
• Two-stage sampling: a subset of
elements within selected clusters is
randomly selected for inclusion in the
sample.
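The difference between the two methods can be sketched in Python. The clusters (12 villages of 30 households each) and both function names are hypothetical; the sketch assumes clusters are chosen by simple random sampling.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters at random and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Select clusters at random, then a random subset of elements within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical frame of clusters: 12 villages with 30 households each.
clusters = {f"village_{v}": [f"v{v}_h{h}" for h in range(30)] for v in range(12)}

one_stage = one_stage_cluster_sample(clusters, 3, seed=5)     # 3 x 30 households
two_stage = two_stage_cluster_sample(clusters, 3, 5, seed=5)  # 3 x 5 households
```

Only the list of clusters is needed up front; households have to be listed only inside the selected villages.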
Difference Between Strata and Clusters
MULTISTAGE SAMPLING
MULTISTAGE SAMPLING……..
• This technique is essentially the process of taking
random samples of preceding random samples.
• Not as effective as true random sampling, but
probably solves more of the problems inherent to
random sampling.
• An effective strategy because it banks on multiple
randomizations. As such, extremely useful.
• Multistage sampling is used frequently when a complete
list of all members of the population does not exist or is
inappropriate.
• Moreover, by avoiding the use of all sample units in all
selected clusters, multistage sampling avoids the
large, and perhaps unnecessary, costs associated with
traditional cluster sampling.
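"Random samples of preceding random samples" can be sketched with three stages. The district, village and household hierarchy, its sizes and the function name are all hypothetical; each stage is assumed to use simple random sampling.

```python
import random

def multistage_sample(population, n_districts, n_villages, n_households, seed=None):
    """Three-stage draw: districts, then villages within them, then households."""
    rng = random.Random(seed)
    selected = []
    for district in rng.sample(sorted(population), n_districts):    # stage 1
        villages = population[district]
        for village in rng.sample(sorted(villages), n_villages):    # stage 2
            selected.extend(rng.sample(villages[village], n_households))  # stage 3
    return selected

# Hypothetical population: 5 districts x 6 villages x 40 households.
population = {
    f"D{d}": {
        f"D{d}-V{v}": [f"D{d}-V{v}-H{h}" for h in range(40)] for v in range(6)
    }
    for d in range(5)
}

sample = multistage_sample(population, 2, 3, 10, seed=7)
print(len(sample))   # 2 districts x 3 villages x 10 households
```

A household list is needed only for the six selected villages, not for all 30, which is the cost saving the bullets above describe.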
MULTI PHASE SAMPLING
• Part of the information collected from whole sample & part from
subsample.
QUOTA SAMPLING
• The population is first segmented into mutually exclusive sub-
groups, just as in stratified sampling.
• Then judgment used to select subjects or units from each
segment based on a specified proportion.
• For example, an interviewer may be told to sample 200
females and 300 males between the ages of 45 and 60.
• It is this second step which makes the technique one of non-
probability sampling.
• In quota sampling the selection of the sample is non-random.
• For example, interviewers might be tempted to interview those
who look most helpful. The problem is that these samples may
be biased because not everyone gets a chance of selection.
This non-random element is its greatest weakness, and quota
versus probability sampling has been a matter of controversy for many
years.
CONVENIENCE SAMPLING
• Sometimes known as grab or opportunity sampling or accidental
or haphazard sampling.
• A type of nonprobability sampling which involves the sample being
drawn from that part of the population which is close to hand.
That is, readily available and convenient.
• The researcher using such a sample cannot scientifically make
generalizations about the total population from this sample
because it would not be representative enough.
• For example, if an interviewer were to conduct a survey at a
shopping center early in the morning on a given day, the people
that he/she could interview would be limited to those present
at that time. Their views would not represent those of other
members of society in the area, as they might if the survey were
conducted at different times of day and several times per week.
• This type of sampling is most useful for pilot testing.
• In social science research, snowball sampling is a similar technique,
where existing study subjects are used to recruit more subjects
into the sample.
Judgmental sampling or Purposive sampling
• The researcher chooses the sample
based on who they think would be
appropriate for the study. This is used
primarily when there is a limited number
of people that have expertise in the
area being researched.
PANEL SAMPLING
Sample Size
• There are no hard and fast rules for sample
size in qualitative research (7-10
samples?).
• The size of the sample depends on WHAT you
try to find out, and from what different
informants or perspectives you try to find
that out.
Sample Size cont…
• Quantitative Research:
$$n = \frac{N}{1 + N e^{2}}$$
Let N = 2000 and e = 0.05:
$$n = \frac{2000}{1 + 2000(0.05)^{2}} = \frac{2000}{6} = 333.33 \approx 333$$
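The calculation above can be checked with a short Python sketch. The function name `sample_size` is hypothetical, and reading e as the margin of error is an assumption; the slide uses the value 0.05 without naming it.

```python
import math

def sample_size(N, e):
    """n = N / (1 + N * e^2) for a population of size N.

    e is assumed here to be the margin of error (0.05 in the slide's example).
    """
    return N / (1 + N * e ** 2)

n = sample_size(2000, 0.05)
print(round(n, 2))     # 333.33
print(math.floor(n))   # 333, rounding down as the slide does
```

For a fixed e the result levels off as N grows (it can never exceed 1/e², i.e. 400 for e = 0.05), so very large populations need only slightly larger samples.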
TABLE FOR DETERMINING SAMPLE SIZE FROM A GIVEN POPULATION
N        S      N        S      N        S      N        S      N        S
10       10     100      80     280      162    800      260    2800     338
15       14     110      86     290      165    850      265    3000     341
20       19     120      92     300      169    900      269    3500     346
25       24     130      97     320      175    950      274    4000     351
30       28     140      103    340      181    1000     278    4500     354
35       32     150      108    360      186    1100     285    5000     357
40       36     160      113    380      191    1200     291    6000     361
45       40     170      118    400      196    1300     297    7000     364
50       44     180      123    420      201    1400     302    8000     367
55       48     190      127    440      205    1500     306    9000     368
60       52     200      132    460      210    1600     310    10000    370
65       56     210      136    480      214    1700     313    15000    375
70       59     220      140    500      217    1800     317    20000    377
75       63     230      144    550      226    1900     320    30000    379
80       66     240      148    600      234    2000     322    40000    380
85       70     250      152    650      242    2200     327    50000    381
90       73     260      155    700      248    2400     331    75000    382
95       76     270      159    750      254    2600     335    1000000  384
“N” is population size
“S” is sample size.
Krejcie, Robert V., Morgan, Daryle W., “Determining Sample Size for Research Activities”, Educational and Psychological
Measurement, 1970.
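Values of this kind can be generated from a formula rather than looked up. The sketch below uses the formula as it is commonly stated for the Krejcie and Morgan paper, with constants that are assumptions here, not taken from the slide: chi-square 3.841 (1 degree of freedom, 95% confidence), population proportion P = 0.5 and margin of error d = 0.05. The function name is hypothetical.

```python
def krejcie_morgan(N, chi2=3.841, P=0.5, d=0.05):
    """S = chi2*N*P*(1-P) / (d^2*(N-1) + chi2*P*(1-P)), rounded to a whole unit.

    chi2, P and d are assumed defaults (95% confidence, P = 0.5, 5% error).
    """
    numerator = chi2 * N * P * (1 - P)
    denominator = d ** 2 * (N - 1) + chi2 * P * (1 - P)
    return round(numerator / denominator)

for N in (10, 100, 1000, 2000):
    print(N, krejcie_morgan(N))
```

For N = 2000 this gives 322, somewhat below the 333 from the simpler formula earlier, because the two formulas rest on different assumptions.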
Size of sample for factor analysis
The size of the sample for factor analysis
is determined with the suggestions of
different researchers in mind.
Arrindell and van der Ende (1985)
proposed that factor analysis yields
stable factors when the size of the
sample is 20 times the number of
expected factors.
For Unknown Population
• Comrey and Lee (1992) gave a graded series of
sample sizes for inferential statistics. A sample of
fewer than 50 participants is considered a
weaker sample; a sample of 100 is
weak; 200 is adequate; 300
is good; 500 is very good;
and 1000 is excellent.
Unknown Population cont…
• According to Hair et al. (2006), the size of the
sample should depend on the number of
items developed for some specific
characteristic.
• It was suggested that each item should be
represented by 5 samples.
For Regression Analysis
• It was suggested by Field (2005) that in regression
analysis, a sample of 15 units representing every
independent variable will be appropriate.
• (Number of independent variables x 15).
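The rules of thumb in the last few slides reduce to simple arithmetic, sketched below. All function names are hypothetical. The labels in `comrey_lee_label` are the ones used in this lecture, and applying the label of the highest benchmark reached to sizes that fall between benchmarks is an assumption, since the slide lists only the benchmark values.

```python
def n_for_factor_analysis(expected_factors):
    """Arrindell and van der Ende (1985), as cited above: 20 per expected factor."""
    return 20 * expected_factors

def n_for_items(num_items):
    """Hair et al. (2006), as cited above: 5 per item."""
    return 5 * num_items

def n_for_regression(num_independent_vars):
    """Field (2005), as cited above: 15 per independent variable."""
    return 15 * num_independent_vars

def comrey_lee_label(n):
    """Label a sample size by the highest benchmark it reaches (assumed mapping)."""
    benchmarks = [(1000, "excellent"), (500, "very good"), (300, "good"),
                  (200, "adequate"), (100, "weak")]
    for threshold, label in benchmarks:
        if n >= threshold:
            return label
    return "weaker"

print(n_for_factor_analysis(4))   # 80
print(n_for_items(30))            # 150
print(n_for_regression(6))        # 90
print(comrey_lee_label(333))      # good
```

When several rules apply to one study, a cautious reading is to take the largest of the resulting sizes.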
Questions???