Sampling Techniques: Third Edition
WILLIAM G. COCHRAN
Professor of Statistics, Emeritus
Harvard University
Stratified Random Sampling
5.1 DESCRIPTION
In stratified sampling the population of $N$ units is first divided into subpopulations of $N_1, N_2, \ldots, N_L$ units, respectively. These subpopulations are nonoverlapping, and together they comprise the whole of the population, so that

$$N_1 + N_2 + \cdots + N_L = N$$

The subpopulations are called strata. To obtain the full benefit from stratification, the values of the $N_h$ must be known. When the strata have been determined, a sample is drawn from each, the drawings being made independently in different strata. The sample sizes within the strata are denoted by $n_1, n_2, \ldots, n_L$, respectively.
If a simple random sample is taken in each stratum, the whole procedure is described as stratified random sampling.
Stratification is a common technique. There are many reasons for this; the principal ones are the following.
[...] into subpopulations, each of which is internally homogeneous. This is suggested by the name strata, with its implication of a division into layers. If each stratum is homogeneous, in that the measurements vary little from one unit to another, a precise estimate of any stratum mean can be obtained from a small sample in that stratum. These estimates can then be combined into a precise estimate for the whole population.
The theory of stratified sampling deals with the properties of the estimates from a stratified sample and with the best choice of the sample sizes $n_h$ to obtain maximum precision. In this development it is taken for granted that the strata have already been constructed. The problems of how to construct strata and of how many strata there should be are postponed to a later stage (section 5A.7).
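The sampling procedure described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stratified_random_sample`, the dictionary layout, and the toy values are hypothetical and do not come from the text. The one feature taken from the text is that a simple random sample is drawn independently, without replacement, in each stratum.

```python
import random


def stratified_random_sample(strata, sample_sizes, seed=None):
    """Draw an independent simple random sample in every stratum.

    strata       -- dict: stratum label -> list of the units in that stratum
    sample_sizes -- dict: stratum label -> n_h, the sample size for the stratum
    (Both layouts are hypothetical choices made for this sketch.)
    """
    rng = random.Random(seed)
    sample = {}
    for label, units in strata.items():
        n_h = sample_sizes[label]
        if not 0 < n_h <= len(units):
            raise ValueError(f"n_h for stratum {label!r} must lie between 1 and N_h")
        # random.Random.sample draws without replacement, i.e. a simple random sample.
        sample[label] = rng.sample(units, n_h)
    return sample


# Toy population with two strata (made-up values).
strata = {
    "large": [900, 822, 781, 805],
    "small": [364, 209, 113, 317, 183, 115],
}
sample = stratified_random_sample(strata, {"large": 2, "small": 3}, seed=1)
print(sample)
```

Because each stratum is sampled with its own call, the draws in different strata are independent, which is the condition the later variance results rely on.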
5.2 NOTATION
The suffix $h$ denotes the stratum and $i$ the unit within the stratum. The notation is a natural extension of that previously used. The following symbols all refer to stratum $h$.

$N_h$    total number of units
$n_h$    number of units in sample
$y_{hi}$    value obtained for the $i$th unit
$W_h = N_h / N$    stratum weight
$f_h = n_h / N_h$    sampling fraction in the stratum
$\bar{Y}_h = \sum_{i=1}^{N_h} y_{hi} / N_h$    true mean
$\bar{y}_h = \sum_{i=1}^{n_h} y_{hi} / n_h$    sample mean
$S_h^2 = \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h)^2 / (N_h - 1)$    true variance

Note that the divisor for the variance is $(N_h - 1)$.
5.3 PROPERTIES OF THE ESTIMATES
For the population mean per unit, the estimate used in stratified sampling is $\bar{y}_{st}$ (st for stratified), where

$$\bar{y}_{st} = \frac{\sum_{h=1}^{L} N_h \bar{y}_h}{N} = \sum_{h=1}^{L} W_h \bar{y}_h \qquad (5.1)$$

where $N = N_1 + N_2 + \cdots + N_L$.
The estimate $\bar{y}_{st}$ is not in general the same as the sample mean. The sample mean, $\bar{y}$, can be written as

$$\bar{y} = \frac{\sum_{h=1}^{L} n_h \bar{y}_h}{n} \qquad (5.2)$$

The difference is that in $\bar{y}_{st}$ the estimates from the individual strata receive their correct weights $N_h/N$. It is evident that $\bar{y}$ coincides with $\bar{y}_{st}$ provided that in every stratum

$$\frac{n_h}{n} = \frac{N_h}{N} \quad\text{or}\quad \frac{n_h}{N_h} = \frac{n}{N} \quad\text{or}\quad f_h = f$$

This means that the sampling fraction is the same in all strata. This stratification is described as stratification with proportional allocation of the $n_h$. It gives a self-weighting sample. If numerous estimates have to be made, a self-weighting sample is time-saving.
The principal properties of the estimate $\bar{y}_{st}$ are outlined in the following theorems. The first two theorems apply to stratified sampling in general and are not restricted to stratified random sampling; that is, the sample from any stratum need not be a simple random sample.
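A small numerical sketch may help to see the difference between the weighted estimate (5.1) and the plain sample mean (5.2). The stratum sizes 16 and 48 are those of the worked example in this chapter; the stratum sample means and the function names are hypothetical, chosen only to keep the arithmetic exact.

```python
from fractions import Fraction


def stratified_mean(N_h, ybar_h):
    """Equation (5.1): weight each stratum sample mean by W_h = N_h / N."""
    N = sum(N_h)
    return sum(Fraction(Nh, N) * yb for Nh, yb in zip(N_h, ybar_h))


def sample_mean(n_h, ybar_h):
    """Equation (5.2): the overall sample mean weights each stratum by n_h / n."""
    n = sum(n_h)
    return sum(Fraction(nh, n) * yb for nh, yb in zip(n_h, ybar_h))


N_h = [16, 48]                            # stratum sizes
ybar_h = [Fraction(600), Fraction(200)]   # hypothetical stratum sample means

equal = [12, 12]          # equal sample sizes
proportional = [6, 18]    # n_h / n = N_h / N in every stratum

print(stratified_mean(N_h, ybar_h))         # 300
print(sample_mean(equal, ybar_h))           # 400
print(sample_mean(proportional, ybar_h))    # 300
```

With equal sample sizes the unweighted mean gives the first stratum too much weight; with proportional allocation the two estimates coincide, which is what "self-weighting" means here.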
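Theorem 5.1. If in every stratum the sample estimate $\bar{y}_h$ is unbiased, then $\bar{y}_{st}$ is an unbiased estimate of the population mean $\bar{Y}$.
Proof.

$$E(\bar{y}_{st}) = E \sum_{h=1}^{L} W_h \bar{y}_h = \sum_{h=1}^{L} W_h \bar{Y}_h$$

since the estimates are unbiased in the individual strata. But the population mean $\bar{Y}$ may be written

$$\bar{Y} = \frac{\sum_{h=1}^{L} \sum_{i=1}^{N_h} y_{hi}}{N} = \frac{\sum_{h=1}^{L} N_h \bar{Y}_h}{N} = \sum_{h=1}^{L} W_h \bar{Y}_h$$

This completes the proof.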
Theorem 5.2. If the samples are drawn independently in different strata,

$$V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 V(\bar{y}_h) \qquad (5.3)$$

where $V(\bar{y}_h)$ is the variance of $\bar{y}_h$ over repeated samples from stratum $h$.
Proof. Since

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h \qquad (5.4)$$

$\bar{y}_{st}$ is a linear function of the $\bar{y}_h$ with fixed weights $W_h$. Hence we may quote the result in statistics for the variance of a linear function.

$$V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 V(\bar{y}_h) + 2 \sum_{h=1}^{L} \sum_{j>h} W_h W_j \,\mathrm{Cov}(\bar{y}_h, \bar{y}_j) \qquad (5.5)$$

But since samples are drawn independently in different strata, all covariance terms vanish. This gives the result (5.3).
To summarize theorems 5.1 and 5.2: if $\bar{y}_h$ is an unbiased estimate of $\bar{Y}_h$ in every stratum, and sample selection is independent in different strata, then $\bar{y}_{st}$ is an unbiased estimate of $\bar{Y}$ with variance $\sum W_h^2 V(\bar{y}_h)$.
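The summary of theorems 5.1 and 5.2 can be checked by brute force on a tiny population. The sketch below is only an illustration: the two strata and their values are made up, and simple random sampling within strata is assumed so that every sample is equally likely. It enumerates all possible samples and compares the exact expectation and variance of the weighted estimate with the values the theorems predict.

```python
from fractions import Fraction
from itertools import combinations, product


def mean(values):
    return Fraction(sum(values), len(values))


def srs_means(units, n):
    """Sample means of all equally likely simple random samples of size n."""
    return [mean(s) for s in combinations(units, n)]


def moments(values):
    """Expectation and variance of a list of equally likely outcomes."""
    e = sum(values) / len(values)
    v = sum((x - e) ** 2 for x in values) / len(values)
    return e, v


strata = [[1, 2, 6], [10, 14, 15, 21]]   # hypothetical population, two strata
n_h = [2, 2]

N = sum(len(s) for s in strata)
W = [Fraction(len(s), N) for s in strata]
per_stratum = [srs_means(s, n) for s, n in zip(strata, n_h)]

# Independent selection: every pairing of one sample per stratum is equally likely.
yst_values = [sum(w * m for w, m in zip(W, combo)) for combo in product(*per_stratum)]

e_yst, v_yst = moments(yst_values)
pop_mean = mean([y for s in strata for y in s])
v_from_theorem = sum(w ** 2 * moments(m)[1] for w, m in zip(W, per_stratum))

print(e_yst == pop_mean)          # theorem 5.1: unbiased
print(v_yst == v_from_theorem)    # theorem 5.2: variance is sum of W_h^2 V(ybar_h)
```

Exact fractions are used so that the comparisons are equalities rather than floating-point approximations.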
The important point about this result is that the variance of $\bar{y}_{st}$ depends only on the variances of the estimates of the individual stratum means $\bar{Y}_h$. If it were possible to divide a highly variable population into strata such that all items had the same value within a stratum, we could estimate $\bar{Y}$ without any error. Equation (5.4) shows that it is the use of the correct stratum weights $N_h/N$ in making the estimate $\bar{y}_{st}$ that leads to this result.
Theorem 5.3. For stratified random sampling, the variance of the estimate $\bar{y}_{st}$ is

$$V(\bar{y}_{st}) = \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h} = \sum_{h=1}^{L} W_h^2 \frac{S_h^2}{n_h} (1 - f_h) \qquad (5.6)$$

Proof. Since $\bar{y}_h$ is an unbiased estimate of $\bar{Y}_h$, theorem 5.2 can be applied. Furthermore, by theorem 2.2, applied to an individual stratum,

$$V(\bar{y}_h) = \frac{S_h^2}{n_h} \cdot \frac{N_h - n_h}{N_h}$$

By substitution into the result of theorem 5.2, we obtain

$$V(\bar{y}_{st}) = \frac{1}{N^2} \sum_{h=1}^{L} N_h^2 V(\bar{y}_h) = \frac{1}{N^2} \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h} = \sum_{h=1}^{L} W_h^2 \frac{S_h^2}{n_h} (1 - f_h)$$

Some particular cases of this formula are given in the following corollaries.
Corollary 1. If the sampling fractions $n_h/N_h$ are negligible in all strata,

$$V(\bar{y}_{st}) = \frac{1}{N^2} \sum_{h=1}^{L} \frac{N_h^2 S_h^2}{n_h} = \sum_{h=1}^{L} \frac{W_h^2 S_h^2}{n_h}$$

This is the appropriate formula when finite population corrections can be ignored.
Corollary 2. With proportional allocation, we substitute

$$n_h = \frac{n N_h}{N}$$

in (5.6). The variance reduces to

$$V(\bar{y}_{st}) = \sum_{h=1}^{L} \frac{N_h}{N} \cdot \frac{S_h^2}{n} \left( \frac{N - n}{N} \right) = \frac{1 - f}{n} \sum_{h=1}^{L} W_h S_h^2 \qquad (5.8)$$

Corollary 3. If sampling is proportional and the variances in all strata have the same value, $S_w^2$, we obtain the simple result

$$V(\bar{y}_{st}) = \frac{S_w^2}{n} \left( \frac{N - n}{N} \right)$$

Theorem 5.4. If $\hat{Y}_{st} = N \bar{y}_{st}$ is the estimate of the population total $Y$, then

$$V(\hat{Y}_{st}) = \sum_{h=1}^{L} N_h (N_h - n_h) \frac{S_h^2}{n_h}$$

This follows at once from theorem 5.3.
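The variance formulas of theorem 5.3, corollary 2, and theorem 5.4 translate directly into code. The following is a minimal sketch, not a library routine: the function names and the three-stratum inputs are hypothetical, and the stratum variances $S_h^2$ are treated as known. It checks that the general formula (5.6) reduces to (5.8) when the allocation is exactly proportional.

```python
from fractions import Fraction


def var_stratified_mean(N_h, n_h, S2_h):
    """Theorem 5.3, equation (5.6): sum of W_h^2 * S_h^2 / n_h * (1 - f_h)."""
    N = sum(N_h)
    return sum(
        Fraction(Nh, N) ** 2 * Fraction(S2) / nh * (1 - Fraction(nh, Nh))
        for Nh, nh, S2 in zip(N_h, n_h, S2_h)
    )


def var_proportional(N_h, n, S2_h):
    """Corollary 2, equation (5.8): (1 - f) / n * sum of W_h * S_h^2."""
    N = sum(N_h)
    f = Fraction(n, N)
    return (1 - f) / n * sum(Fraction(Nh, N) * S2 for Nh, S2 in zip(N_h, S2_h))


def var_stratified_total(N_h, n_h, S2_h):
    """Theorem 5.4: the estimated total is N * ybar_st, so its variance is N^2 * V(ybar_st)."""
    return sum(N_h) ** 2 * var_stratified_mean(N_h, n_h, S2_h)


# Hypothetical inputs: three strata with known variances.
N_h = [30, 50, 20]
S2_h = [4, 9, 25]
n = 10
n_h = [n * Nh // sum(N_h) for Nh in N_h]   # exactly proportional: 3, 5, 2

print(var_stratified_mean(N_h, n_h, S2_h) == var_proportional(N_h, n, S2_h))  # True
print(var_stratified_total(N_h, n_h, S2_h))                                   # 9630
```

The integer division in `n_h` is exact only because these particular sizes divide evenly; in general proportional allocation has to be rounded and (5.8) then holds only approximately.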
Example. Table 5.1 shows the 1920 and 1930 number of inhabitants, in thousands, of 64 large cities in the United States. The data were obtained by taking the cities which ranked fifth to sixty-eighth in the United States in total number of inhabitants in 1920. The cities are arranged in two strata, the first containing the 16 largest cities and the second the remaining 48 cities.
The total number of inhabitants in all 64 cities in 1930 is to be estimated from a sample of size 24. Find the standard error of the estimated total for (1) a simple random sample, (2) a stratified random sample with proportional allocation, (3) a stratified random sample with 12 units drawn from each stratum.
This population resembles the populations of many types of business enterprise in that some units (the large cities) contribute very substantially to the total and display much greater variability than the remainder.
The stratum totals and sums of squares are given under Table 5.1. Only the 1930 data are used in this example: the 1920 data appear in a later example.
For the complete population in 1930, we find

$$Y = 19{,}568, \qquad S^2 = 52{,}448$$

The three estimates of $Y$ are denoted by $\hat{Y}_{ran}$, $\hat{Y}_{prop}$, and $\hat{Y}_{equal}$.
1. For simple random sampling

$$V(\hat{Y}_{ran}) = \frac{N^2 S^2}{n} \cdot \frac{N - n}{N} = \frac{(64)^2 (52{,}448)}{24} \cdot \frac{40}{64} = 5{,}594{,}453$$

$$\sigma(\hat{Y}_{ran}) = 2365$$
TABLE 5.1
Sizes of 64 cities (in thousands) in 1920 and 1930

            1920 size                        1930 size
  Stratum 1     Stratum 2          Stratum 1     Stratum 2
     797      314   172   121         900      364   209   113
     773      298   172   120         822      317   183   115
     748      296   163   119         781      328   163   123
     734      258   162   118         805      302   253   154
     588      256   161   118         670      288   232   140
     577      243   159   116        1238      291   260   119
     507      238   153   116         573      253   201   130
     507      237   144   113         634      291   147   127
     457      235   138   113         578      308   292   100
     438      235   138   110         487      272   164   107
     415      216   138   110         442      284   143   114
     401      208   138   108         451      255   169   111
     387      201   136   106         459      270   139   163
     381      192   132   104         464      214   170   116
     324      180   130   101         400      195   150   122
     315      179   126   100         366      260   143   134

Note. Cities are arranged in the same order in both years.
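The population quantities quoted in the example can be recomputed from the 1930 columns of Table 5.1. The sketch below is only an illustrative check: the variable names are hypothetical, and the data are keyed in from the table as printed above. It uses `statistics.variance`, which divides by the number of values minus one, matching the $(N_h - 1)$ divisor in the definition of $S_h^2$.

```python
from statistics import variance

# 1930 sizes (in thousands) from Table 5.1.
stratum_1 = [900, 822, 781, 805, 670, 1238, 573, 634,
             578, 487, 442, 451, 459, 464, 400, 366]
stratum_2 = [364, 317, 328, 302, 288, 291, 253, 291,
             308, 272, 284, 255, 270, 214, 195, 260,
             209, 183, 163, 253, 232, 260, 201, 147,
             292, 164, 143, 169, 139, 170, 150, 143,
             113, 115, 123, 154, 140, 119, 130, 127,
             100, 107, 114, 111, 163, 116, 122, 134]

Y = sum(stratum_1) + sum(stratum_2)       # population total
S2 = variance(stratum_1 + stratum_2)      # variance of the complete population
S2_1 = variance(stratum_1)                # variance within stratum 1
S2_2 = variance(stratum_2)                # variance within stratum 2

print(Y, round(S2), round(S2_1), round(S2_2))   # 19568 52448 53843 5581
```

Treating the full list of 64 cities as the "sample" passed to `variance` is deliberate: the quantities here are population variances defined with the $N - 1$ divisor, not estimates.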
2. For the individual strata the variances are $S_1^2 = 53{,}843$ and $S_2^2 = 5581$. Note that the stratum with the largest cities has a variance nearly 10 times that of the other stratum.
In proportional allocation, we have $n_1 = 6$, $n_2 = 18$. From (5.8), multiplying by $N^2$, we have

$$V(\hat{Y}_{prop}) = \frac{N - n}{n} \sum N_h S_h^2 = \frac{40}{24} \left[ (16)(53{,}843) + (48)(5581) \right] = 1{,}882{,}293$$

$$\sigma(\hat{Y}_{prop}) = 1372$$

3. For $n_1 = n_2 = 12$ we use the general formula of theorem 5.4:

$$V(\hat{Y}_{equal}) = \sum N_h (N_h - n_h) \frac{S_h^2}{n_h} = \frac{(16)(4)(53{,}843)}{12} + \frac{(48)(36)(5581)}{12} = 1{,}090{,}827$$

$$\sigma(\hat{Y}_{equal}) = 1044$$

In this example equal sample sizes in the two strata are more precise than proportional allocation. Both are greatly superior to simple random sampling.
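The three standard errors in this example can be reproduced with a short script. This is a sketch under the example's own figures ($N_1 = 16$, $N_2 = 48$, $S_1^2 = 53{,}843$, $S_2^2 = 5581$, $S^2 = 52{,}448$, $n = 24$); the function names are hypothetical.

```python
from math import sqrt


def var_total_srs(N, n, S2):
    """Variance of the estimated total under simple random sampling."""
    return N ** 2 * S2 / n * (N - n) / N


def var_total_stratified(N_h, n_h, S2_h):
    """Theorem 5.4: sum of N_h * (N_h - n_h) * S_h^2 / n_h."""
    return sum(Nh * (Nh - nh) * S2 / nh for Nh, nh, S2 in zip(N_h, n_h, S2_h))


N_h = [16, 48]
S2_h = [53843, 5581]
N, n, S2 = 64, 24, 52448

se_ran = sqrt(var_total_srs(N, n, S2))
se_prop = sqrt(var_total_stratified(N_h, [6, 18], S2_h))     # proportional allocation
se_equal = sqrt(var_total_stratified(N_h, [12, 12], S2_h))   # 12 units per stratum

print(round(se_ran), round(se_prop), round(se_equal))   # 2365 1372 1044
```

The proportional case is computed here with the general formula rather than with (5.8); the two agree because $n_1 = 6$ and $n_2 = 18$ are exactly proportional to the stratum sizes.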
5.4 THE ESTIMATED VARIANCE AND CONFIDENCE LIMITS
If a simple random sample is taken within each stratum, an unbiased estimate of $S_h^2$ (from theorem 2.4) is

$$s_h^2 = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{hi} - \bar{y}_h)^2$$
The value of $n_e$ always lies between the smallest of the values $(n_h - 1)$ and their sum. The approximation takes account of the fact that $S_h^2$ may vary from stratum to stratum. It requires the assumption that the $y_{hi}$ are normal, since it depends on the result that the variance of $s_h^2$ is $2\sigma_h^4/(n_h - 1)$. If the distribution of $y_{hi}$ has positive kurtosis, the variance of $s_h^2$ will be larger than this and formula (5.16) overestimates the effective degrees of freedom.
In stratified sampling the values of the sample sizes $n_h$ in the respective strata are chosen by the sampler. They may be selected to minimize $V(\bar{y}_{st})$ for a specified cost of taking the sample or to minimize the cost for a specified value of $V(\bar{y}_{st})$.
The simplest cost function is of the form

$$\text{cost} = C = c_0 + \sum c_h n_h \qquad (5.17)$$

Within any stratum the cost is proportional to the size of sample, but the cost per unit $c_h$ may vary from stratum to stratum. The term $c_0$ represents an overhead cost. This cost function is appropriate when the major item of cost is that of taking the measurements on each unit. If travel costs between units are substantial, empirical and mathematical studies suggest that travel costs are better represented by the expression $\sum t_h \sqrt{n_h}$, where $t_h$ is the travel cost per unit (Beardwood et al., 1959). Only the linear cost function (5.17) is considered here.
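The two cost expressions mentioned above are easy to compare numerically. The sketch below is illustrative only: the function names and all cost figures are hypothetical, and the travel-cost function implements just the expression $\sum t_h \sqrt{n_h}$ quoted in the text.

```python
from math import sqrt


def linear_cost(c0, c_h, n_h):
    """Equation (5.17): overhead plus a per-unit cost in each stratum."""
    return c0 + sum(c * n for c, n in zip(c_h, n_h))


def travel_cost(t_h, n_h):
    """The expression sum of t_h * sqrt(n_h) suggested for travel costs."""
    return sum(t * sqrt(n) for t, n in zip(t_h, n_h))


# Hypothetical figures: overhead 100, two strata.
n_h = [25, 16]
print(linear_cost(100, [4, 9], n_h))   # 344
print(travel_cost([3, 2], n_h))        # 23.0
```

Doubling every $n_h$ doubles the variable part of the linear cost but raises the travel expression only by a factor of $\sqrt{2}$, which is why the two forms can lead to different allocations.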