STA 308 Lecture Notes

The STA 308 course on Sampling Techniques covers both probability and non-probability sampling methods, including various techniques such as convenience, quota, and snowball sampling. Students will learn to distinguish between different sampling methods, estimate sample sizes, and understand the advantages and disadvantages of each technique. The course includes quizzes and is supported by recommended literature for further study.

Uploaded by

Ollo

STA 308:

SAMPLING TECHNIQUES

delivered by

PROF. NATHANIEL HOWARD

Course Description
Non probability and probability sampling methods.
Convenience sampling, quota sampling, snowball
sampling, etc. Simple random sampling (with/without
replacement). Estimation of sample size; estimation of
population parameters e.g., total and proportion; ratio
estimators of population means, totals, etc. Stratified
random sampling - proportional and optimum allocations.
Cluster sampling, systematic sampling, multistage
sampling.

Pre-requisite: STA 201 & STA 202/HND Statistics


Recommended Literature
1. Raj, D. (1981). Design of Sample Surveys. McGraw-Hill, New York, USA.
2. Barnett, V. (1974). Elements of Sampling Theory. English Universities Press, London, UK.
3. Cochran, W.G. (1977). Sampling Techniques, 3rd Ed. J. Wiley, New York, USA.
4. Kish, L. (1972). Survey Sampling. J. Wiley, New York, USA.

Lecture Schedules
Wednesdays: 7:30 am to 8:30 am SW 12
Wednesdays: 8:30 am to 9:30 am SW 12
Thursdays: 8:30 am to 9:30 am LT 21

COURSE OBJECTIVES
At the end of the course, the students should be able to:
• distinguish between a population and a sample;
• distinguish between probability (or random) and non-probability (or non-random) sampling;
• list at least 3 major probability sampling techniques;
• describe simple random sampling (with or without replacement);
• describe at least 2 common methods for selecting a simple random sample;
• select all possible simple random samples of size n from a given population;
• state at least two advantages and disadvantages of SRS;
• explain the factors required in the determination of an appropriate sample size;
• estimate an appropriate sample size;
• estimate population parameters using simple random samples;
• explain the justification for stratified random sampling;
• select stratified random samples of size n using proportional allocation;
• select stratified random samples of size n using optimum allocation;
• state at least two advantages and disadvantages of stratified random sampling;
• estimate population parameters using stratified random samples;
• explain the justification for systematic sampling;
• state at least two advantages and disadvantages of systematic sampling;
• estimate population parameters using systematic samples.

Quiz Schedules

Quiz 1 (Week 5): 12th June to 16th June, 2023 [20%]

Quiz 2 (Week 10): 17th July to 22nd July, 2023 [20%]

PROBABILITY AND NON-PROBABILITY SAMPLING

The choice of sampling technique depends on a number of issues, for
example the objectives of the survey and the characteristics of the
population. There are several established methods of sampling
(probability and non-probability), each with its own merits and demerits.
Basically, there are two kinds of samples:
(i) non-probability (or non-random) sample; and
(ii) probability (or random) sample.

NON PROBABILITY SAMPLING
A non-probability sample is one in which the units are selected
on the basis of the judgment of the investigator, or by
convenience, or by some non random process. The
composition of a non random sample is influenced greatly by
the personal judgment of the one responsible for the selection.
The procedure is not objective, neither is it based on the
principles of probability theory. Consequently, it is not
possible to measure the degree of reliability of the sample
results.
Much of the sampling in marketing research is non random in
nature. Examples are judgmental (or purposive) sampling,
convenience (or accessibility) sampling and quota sampling.

Judgmental (or Purposive) Sampling
This approach to choosing a sample involves selecting units
by considering available auxiliary information more or less subjectively
with a view to ensuring a reflection of the population in the sample. This
approach, therefore, involves selecting units that are known to be typical
with respect to certain characteristics, rather than selecting the units
randomly. The assumption is made that if the units are typical in this
respect, they will be typical with respect to the characteristics being
studied.

For example, in a household expenditure study in a given district, a
sample of households might be selected in such a manner that the
average size of the households in the sample would be the same as the
average for the district, and the distribution of the sample households
among the different types would parallel the distribution throughout the
district.

The quality of the results can be evaluated only as a judgment of
someone who has expert knowledge of the situation. The range of
variation observed in such a sample does not give a good idea of the
variability in the population.

Advantages
1. Those people who are unsuitable for the study or who do not fit the
bill would have been already eliminated, so only the most suitable
candidates remain.

2. As the most appropriate people/objects for the study would have been
selected, this process becomes a lot less time consuming.

3. With fewer time constraints and a more accurate subject, the costs for
carrying out the sampling project are greatly reduced.

4. The results of purposive sampling are usually expected to be more
accurate than those achieved with an alternative form of sampling.

5. If you are looking for a very rare or much sought-after group of
people for a particular study, purposive sampling may be the only
way you can track them down.

A purposive sample understates the variability in the population because
the sampler is unlikely to select units which are too small or too large,
although such units exist in the population. It is not possible to get
strictly valid estimates of the population values and their sampling
errors, because of the risk of bias in selection and the lack of
information on the probability of selection of the units.

Consequently, this type of sampling is rarely used in large-scale surveys,
though a good deal of business data has been collected by using samples
of this kind.

Disadvantages
1. The vast array of inferential statistical procedures is not valid here.
Inferential statistics lets you generalize from a particular sample to a
larger population and make statements about how sure you are that
you are right, or about how accurate you are.

2. The sample may not entirely match the population that the researcher
is trying to reach. Moreover, because such samples are often small, a
small variation in the sample can cause a large deviation in the results.

3. Due to the narrow range of the purposive sample, it is possible to put
undue weight on the data obtained purely because the sample is so
small.

Convenience Sampling
The main stimulus is administrative convenience; the sample is chosen
with the sole concern being its ease of access. To obtain information
quickly and inexpensively, a convenience sample can be employed. The
procedure is simply to contact sampling units that are convenient:
shoppers at a market on a particular day, a few friends and neighbours,
or football fans after a game. A convenience sample is often used to
pre-test a questionnaire.

Advantages

Expedited Data Collection: When time is of the essence, many
researchers turn to convenience sampling for data collection because
they can swiftly gather data. Very little preparation is needed to
effectively use convenience sampling for data collection. Thus it is useful
in time-sensitive research. It is also useful when researchers need to poll
a large number of people, as they can quickly reach their desired number
of participants by utilizing this method to draw from the nearby
population. By rapidly gathering information, researchers and scientists
can isolate growing trends, or extrapolate generalized information from
local public opinion.

Ease of Research: Because researchers are not looking for an accurate
sampling, they can simply collect their information and move on to other
aspects of their study. Setting up for this type of sampling can be done
by simply creating a questionnaire and distributing it to their targeted
group. Through this method, researchers can easily finish collecting their
data in a matter of hours, free from worry about whether it is an accurate
representation of the population. This allows for great ease of research,
letting researchers focus on analyzing the data rather than interviewing
and carefully selecting each participant.

Readily Available: Since most convenience samples are collected from
the populations on hand, data are readily available for the researcher to
collect. Researchers do not typically have to travel great distances to
collect the data, but simply pull from whatever environment is nearby.
Having a sample group readily available is important for meeting quotas
quickly, and allows the researcher to do multiple studies in an
expeditious fashion. A researcher with a deadline to meet may well turn
to this type of sampling in preference to any other.

Cost Effective: One of the most important aspects of this method is its
cost effectiveness. There is minimal overhead because there is no
elaborate setup, and researchers can pull from local population groups.
This method allows for funds to be distributed to other aspects of the
project. Often this method of sampling is used to gain funding for a
larger, more thorough research project.

In this instance, funds are not available for a more complete survey, so a
quick selection from the population is used to demonstrate a need for
the completed project.

Disadvantages
Biased Results: One of the major drawbacks of this sampling method is
the opportunity for bias to cloud the results of the survey. For example, a
researcher looking to predict who will win the next election may only
survey an area close to them, and ignore the fact that the region is located
in northern Ghana and therefore will likely have a stronger NDC slant.

Personal prejudices may also creep in. For example, a surveyor may not
distribute questionnaires to certain ethnic groups. These factors often
lead to skewed data collection, rendering the data useless for tracing
trends throughout the entire population.

Misrepresentation of Data: Occasionally researchers will ignore the fact
that they did not complete a random survey, and will use the information
to prove facts that are not necessarily true. One example of this
misrepresentation is when the subscribers of a particular magazine are
polled on political opinion, and the magazine appeals to a more liberal
segment of the populace. Because a national magazine will have a large
pool of subscribers and the people surveyed could be selected at random,
researchers might be tempted to use the results of this poll as a
representation of the entire population's view. Nevertheless, because the
magazine appeals to only a select group of people, it is still a
convenience sample and will have a liberal slant.

Incomplete Conclusions: Because of the flaws found in this form of
sampling, scientists cannot draw concrete conclusions from their data. To
be truly accurate, a statistical study must cover all facets of the
population. By missing portions of the general populace, researchers are
only able to form incomplete conclusions. One example would be polling
only knitting clubs, which are typically frequented by women.
Researchers would not be able to draw conclusions on what most men
thought, as they would be underrepresented in this arena, and any data
collected would not allow them to reach accurate conclusions.

Quota Sampling
A wide variety of procedures go under the name of quota sampling, but
what distinguishes them fundamentally from probability sampling is
that, once the general breakdown of the sample is decided, and the quota
assignments are allocated to interviewers, the choice of the actual sample
units to fit into this framework is left to the interviewers. Quota
sampling is, therefore, a kind of stratified sampling in which the
selection within category is non-random. It is this non-random property
that constitutes its greatest weakness. Since speed is an important
consideration in surveys of public opinions, it has been common to use
the method of quota sampling for the selection of the sample.
Quota sampling starts with the premise that a sample should be well
spread geographically over the population and that it should contain the
same fraction of individuals having certain characteristics as does the
population.
The characteristics usually taken into account are sex, age, occupation,
economic level, and size of place, in addition to geographical control.
Data taken from a recent census are used as the standard. In a sense the
population is divided up into a number of strata whose weights are
obtained from the census or a large-scale survey. Interviewers are then
assigned quotas for the number of interviews to be taken from each
stratum.
Thus, an interviewer will be asked to select so many males and so many
females, so many young and so many old persons, and similarly for the
other characteristics used as control. The interviewer is free to choose
his sample as he likes provided the quota requirements are fulfilled.
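
As a rough illustration of how such quotas might be derived, the sketch below splits a fixed number of interviews across strata in proportion to their census counts. The stratum labels, the counts, and the `quota_allocation` helper are hypothetical, and the largest-remainder rounding is only one reasonable choice, not a rule taken from these notes.

```python
def quota_allocation(census_counts, total):
    """Split `total` interviews across strata in proportion to census counts.

    Integer arithmetic is used throughout; any interviews left over after
    rounding down go to the strata with the largest remainders.
    """
    pop = sum(census_counts.values())
    quotas = {k: total * c // pop for k, c in census_counts.items()}
    remainders = {k: total * c % pop for k, c in census_counts.items()}
    leftover = total - sum(quotas.values())
    for k in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[k] += 1
    return quotas

# Hypothetical census counts for four sex-by-age strata.
census_counts = {
    "male, under 35": 2200,
    "male, 35 and over": 2600,
    "female, under 35": 2400,
    "female, 35 and over": 2800,
}
quotas = quota_allocation(census_counts, 150)
print(quotas)
```

The quotas fix only how many people of each type are interviewed; which individuals fill each quota is still left to the interviewer, which is where the non-randomness enters.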

The essential difference between quota sampling and stratified random
sampling is that in the former case, the selection of the sample within the
stratum is not strictly random. The interviewer may decide to omit
certain parts of the area entirely if it suits his convenience. He may not
like to approach certain kinds of people and so on. In general,
statisticians have criticized the method for this theoretical weakness,
while market and opinion researchers have defended it for its cheapness
and administrative convenience.

Advantages
1. It is inexpensive and useful when time is an issue.
2. It is quick and easy to arrange.

Disadvantages
1. The sample may not be representative of the population.
2. Because the sample is non-random, it is impossible to assess the
possible sampling error.
3. It limits the decisions of the researcher as it does not allow much
variety.

Snowball Sampling
A snowball design is a form of judgmental sampling that is very
appropriate when it is necessary to reach a small specialized (typical)
population. Suppose a long range planning group wanted to sample
people who were very knowledgeable about a new specialized
technology, such as the use of solar energy in manufacturing industries.
Even specialized magazines would have a small percentage of readers in
this category. Furthermore, the target group may be employed by diverse
organizations, like the government, universities, research institutions, and
industrial firms.
Under a snowball design each respondent is asked to identify one or
more potential respondents in that particular field.

In short, snowball sampling is a procedure in which initial respondents
are selected randomly. Additional respondents are then obtained from
referrals or other information provided by the initial respondents.
One major purpose of snowball sampling is to estimate various
characteristics that are rare in the total population. This design can be
used to reach any small population, such as off-shore fishermen, people
confined to wheelchairs, families with twins, etc. One problem is that it
is only those who are visible that are likely to be selected.

Advantages

Useful for Sampling Special Population Segments: In some cases,
members of a special population segment must be sampled but prove
difficult to locate. In such cases, snowball sampling is often the most
practical sampling method to choose.

For example, studies on Chinese "galamsey" operators in Ghana can be
done through snowball sampling. One subject is identified and studied
and then asked to recruit other subjects from the person's acquaintances
and contacts.

Low Cost: The cost of locating sample subjects and researching them is
not high. The researcher does not spend time and money trying to find
the sample subjects; rather, they are brought to him/her.

Disadvantages
Sampling Bias: Snowball sampling is based on researching one subject
and using the subject to recruit more members for sampling. These
people are known to the initial subject, who is more than likely to
nominate people they know very well. These people will more often than
not share similar traits and behavior characteristics. What the sampler
may end up with is a small subgroup of the entire population. Since this
sampling method is used predominantly in race sampling, one subject
can very well nominate an entire family, close friends and other
acquaintances, all of whom may exhibit the same traits and
characteristics, leading to sampling bias.
Lack of Control over the Sampling Method: In the snowball
sampling method, the researcher actually has very little control over the
sampling method. The type of subjects that the researcher can secure for
sampling is mainly dependent on the original subjects that were
researched. After the first set of original subjects is researched, the
researcher may lose control over the sampling method, because
the original subjects are tasked with adding to the sampling pool by
nominating people they know.
Another problem is that representativeness of the sample is not
guaranteed, because the researcher has no idea of the true distribution
of the population.

Probability (or Random) Sampling
Random samples are defined as those samples in which every
element in the population has a known, non-zero probability of
selection. It is not necessary that the probabilities of selection
be equal for all elements. The advantage of random sampling
is that if it is done properly, it provides a bias-free method of
selecting sample units and permits the measurement of
sampling error in addition to providing valid estimates of the
population values.
Examples of random sampling methods are simple random
sampling, stratified random sampling, systematic sampling
and cluster sampling.

Simple Random Sampling

SRS is the simplest of the probability sampling methods. Although not
much used in practice, its understanding is helpful in the study of other
more complex methods. Simple random samples are those samples in
which every unit has an equal chance of being selected and each distinct
possible sample of size n has the same chance of being chosen. Sampling
is done in one stage with the units of the sample selected, unit by unit,
independently of one another.

To select a sample of size n from a population consisting of N units, one
first prepares a comprehensive list of all the N units (the sampling frame),
and then numbers these units serially from 1 to N.

With the aid of either a table of random numbers or a computer, a series of
random numbers from 1 to N (both inclusive) is drawn until the
required sample is obtained.

The above procedure of selecting a sample requires elimination of
random numbers that are larger than N and, in the case of sampling
without replacement, of those that are repeated.

Probably the most difficult task in drawing a simple random sample is
finding the appropriate sampling frame. For example, there is no
complete list of all adults, households or stores in Ghana.

The non-existence of an appropriate list of sampling units is one of the
major reasons why simple random samples are not widely used in
practice. Also, in situations where the sampling units vary greatly in size,
if units are selected by simple random sampling, the large variation in
size will result in a sharp increase in sampling variance. Such could be the
case when stores, for example, are the sampling units: the large stores
are characterized by wide variation in the size of the various departments
that comprise them.

On the other hand, there are smaller corner stores. One would probably
want to use a sampling scheme that gives the large stores a
considerably greater chance of being selected than the smaller stores.

The samples can be drawn in two possible ways:
• Without replacement: the sampling units, once chosen, are not placed
back in the population.
• With replacement: the selected units are placed back in the population
before the next draw.

Simple random sampling without replacement:
SRS without replacement is a method of selecting n units out of the N
units one by one such that, at any stage of selection, each of the
remaining units has the same chance of being selected. At the kth draw
this chance is 1/(N - k + 1) among the units still remaining, which works
out to an unconditional chance of 1/N for any given unit at any given draw.

Simple random sampling with replacement:
SRS with replacement is a method of selecting n units out of the N
units one by one such that, at each stage of selection, each unit has an
equal chance of being selected (i.e. 1/N).
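
The two selection schemes can be sketched with Python's standard `random` module, which here stands in for the table of random numbers. The frame of 20 serial numbers, the seeds, and the helper names are assumptions made for illustration only.

```python
import random

def srs_without_replacement(frame, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def srs_with_replacement(frame, n, seed=None):
    """Draw n units one at a time; each draw picks any unit with chance 1/N."""
    rng = random.Random(seed)
    return [rng.choice(frame) for _ in range(n)]

frame = list(range(1, 21))  # serial numbers 1..N, with N = 20 (hypothetical frame)
wor = srs_without_replacement(frame, 5, seed=1)
wr = srs_with_replacement(frame, 5, seed=1)
print("without replacement:", wor)
print("with replacement:   ", wr)
```

Because `random.sample` never returns the same position twice, there is no need to discard repeated random numbers by hand, as one would with a printed table.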

Symbols and Notations

We shall assume that the characteristic observed on any unit of the
population has a unique value.

Capital letters are used to denote population values and lower
case letters are used to denote corresponding sample values.

$N$ is the number of sampling units in the population,

$U_i$ is the $i$th unit in the population,

$Y_i$ is the value of the characteristic $Y$ for $U_i$,

$Y = \sum_{i=1}^{N} Y_i$ is the population total of the characteristic $Y$,

$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ is the population mean of the characteristic $Y$,

$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$ is the variance of $Y_i$ about the population mean $\bar{Y}$,

$n$ is the sample size,

$y_i$ is the value of the characteristic $Y$ for the $i$th sample unit,

$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ is the sample mean,

$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2$ is the sample variance,

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ is the unbiased estimator of the population variance (of $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$ under sampling without replacement, and of $\sigma^2$ under sampling with replacement).

Sampling without replacement

Properties of Estimates

In order to demonstrate why the sample mean $\bar{y}$ of a random sample
may be used to estimate the population mean $\bar{Y}$, we consider a
population consisting of five units labeled $U_1, U_2, U_3, U_4$, and $U_5$
with corresponding values $Y_1 = 16$, $Y_2 = 13$, $Y_3 = 17$, $Y_4 = 11$, and $Y_5 = 20$.
We wish to estimate the population mean

$$\bar{Y} = \frac{1}{5}(16 + 13 + 17 + 11 + 20) = 15.4$$

and the population variance (with divisor $N - 1 = 4$)

$$S^2 = \frac{1}{4}\left[(16 - 15.4)^2 + (13 - 15.4)^2 + (17 - 15.4)^2 + (11 - 15.4)^2 + (20 - 15.4)^2\right] = 12.3.$$

Suppose that we wish to select two units without replacement. Then the
first unit could be selected in five different ways, and the second in four
different ways. This means that there are $5 \times 4 = 20$ possible ways of
selecting the first two units. Note, however, that these selections are
ordered, for example $(U_2, U_5)$ and $(U_5, U_2)$. But the two samples are
the same. Consequently, there are $(5 \times 4)/2 = 10$ distinct samples that
could arise and, since the selection is random, any one of these samples
is equally likely to be drawn. In general, the number of distinct samples
of size $n$ that could arise from a population of size $N$ is

$${}^{N}C_{n} = \frac{N!}{n!\,(N - n)!},$$

where $N! = N(N - 1)(N - 2) \cdots (1)(0)!$ with $0! = 1$. It can easily be
shown that the probability of selecting any one of the possible samples
is $1/{}^{N}C_{n}$. Consider a sample consisting of $n$ distinct units. Then the
probability of drawing any one of the $n$ units at the first draw is $n/N$. At
the second draw, any one of the remaining $(n - 1)$ units must be selected
from the remaining $(N - 1)$ units, the probability of which is
$(n - 1)/(N - 1)$, and so on. Hence, the probability of getting a particular
sample of $n$ distinct units is

$$\frac{n}{N} \cdot \frac{n - 1}{N - 1} \cdot \frac{n - 2}{N - 2} \cdots \frac{1}{N - n + 1} = \frac{1}{{}^{N}C_{n}}.$$

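
The draw-by-draw product can be checked numerically. The sketch below multiplies the factors $(n - k)/(N - k)$ exactly with `fractions.Fraction` and compares the result with $1/{}^{N}C_{n}$ from `math.comb`; the function name `prob_of_sample` is made up for this illustration.

```python
from fractions import Fraction
from math import comb

def prob_of_sample(N, n):
    """Chance that n successive draws without replacement all fall inside
    one particular set of n units: product of (n - k)/(N - k), k = 0..n-1."""
    p = Fraction(1)
    for k in range(n):
        p *= Fraction(n - k, N - k)
    return p

p = prob_of_sample(5, 2)
print(p, Fraction(1, comb(5, 2)))
```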
For instance, when $N = 5$ and $n = 2$, the number of possible
samples is

$${}^{5}C_{2} = \frac{5!}{2!\,(5 - 2)!} = \frac{5 \times 4 \times 3!}{2 \times 1 \times 3!} = 10,$$

and the probability of selecting any one of these 10 possible samples is
$1/{}^{5}C_{2} = 1/10$. The 10 possible samples are listed in the table below.

The set of all 10 equally likely possible values of the mean is called the
sampling distribution of the mean. It simply describes all the possible
values of the sample mean that could result from the sampling scheme
being used.
That is, it is simply a frequency distribution of all possible values of the
mean under the same sampling scheme. Since the sample mean has a
sampling distribution, we say that it is a random variable. The study of a
sampling distribution tells how a sample value may deviate from the true
value. So if the sampling distribution is very much spread out, with a
large variance, the sampling procedure is more apt to be misleading than
it would be if the sampling distribution has a small variance.

Sampling distribution of $\bar{y}$, $\hat{\sigma}^2$, and $s^2$.

Sample    Units in    Values in    $\bar{y}$    $\hat{\sigma}^2$    $s^2$
number    sample      sample
1         U1, U2      16, 13       14.5         2.25                4.5
2         U1, U3      16, 17       16.5         0.25                0.5
3         U1, U4      16, 11       13.5         6.25                12.5
4         U1, U5      16, 20       18.0         4.00                8.0
5         U2, U3      13, 17       15.0         4.00                8.0
6         U2, U4      13, 11       12.0         1.00                2.0
7         U2, U5      13, 20       16.5         12.25               24.5
8         U3, U4      17, 11       14.0         9.00                18.0
9         U3, U5      17, 20       18.5         2.25                4.5
10        U4, U5      11, 20       15.5         20.25               40.5
Total                              154.0        61.50               123.0
Average                            15.4         6.15                12.3
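
The table can be reproduced by enumerating every sample of size 2. The sketch below is one possible way to do so with the Python standard library, where `pvariance` uses the divisor $n$ (matching $\hat{\sigma}^2$) and `variance` uses $n - 1$ (matching $s^2$). The variable names are illustrative.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

Y = [16, 13, 17, 11, 20]  # population values Y1..Y5

# One row per distinct sample: (values, sample mean, sigma-hat^2, s^2)
rows = [(s, mean(s), pvariance(s), variance(s)) for s in combinations(Y, 2)]

avg_mean = mean(r[1] for r in rows)
avg_sigma_hat2 = mean(r[2] for r in rows)
avg_s2 = mean(r[3] for r in rows)

for number, (values, y_bar, sigma_hat2, s2) in enumerate(rows, start=1):
    print(number, values, y_bar, sigma_hat2, s2)
print("averages:", avg_mean, avg_sigma_hat2, avg_s2)
```

The average of the $s^2$ column coincides with the population variance computed with divisor $N - 1$, whereas the average of the $\hat{\sigma}^2$ column is smaller.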
We note that each of the 5 units appears in exactly 4 of the possible
samples. Hence, the proportion of samples in which a particular unit
$U_i$ appears is $4/10 = 2/5 = n/N$. In general, $U_i$ appears in
${}^{N-1}C_{n-1}$ samples, that is, in a proportion
${}^{N-1}C_{n-1} / {}^{N}C_{n} = n/N$ of all samples. The ratio $n/N$ is
sometimes called the overall chance of selection since it gives the
proportion of samples in which $U_i$ appears, either by being selected on
the first draw or the second and so on. In simple random sampling
without replacement, each unit in the population has an equal chance,
$n/N$, of entering the sample.
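
For completeness, the ratio of the two counts simplifies as follows, using only the factorial definition of ${}^{N}C_{n}$:

$$\frac{{}^{N-1}C_{n-1}}{{}^{N}C_{n}} = \frac{(N-1)!}{(n-1)!\,(N-n)!} \cdot \frac{n!\,(N-n)!}{N!} = \frac{n!}{(n-1)!} \cdot \frac{(N-1)!}{N!} = \frac{n}{N}.$$

With $N = 5$ and $n = 2$ this gives ${}^{4}C_{1} / {}^{5}C_{2} = 4/10 = 2/5$, in agreement with the count from the table.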

In practice, only one sample will be selected and observed, so any one
of the means in the table above could turn out to be our estimate of the
population mean. The smallest estimate of the population mean is 12.0,
found in sample number 6, which would mislead us by the amount
$|15.4 - 12.0| = 3.4$. The largest estimate, 18.5, is obtained from sample
number 9 and differs from the true mean by the amount
$|15.4 - 18.5| = 3.1$. However, the mean of all the sample means is
exactly equal to the population mean, as shown in the table above. This
mean of the sample means is called the expected value of the sample
mean and is written $E(\bar{y})$.

The expected value of the sample mean is given by

$$E(\bar{y}) = \sum_{i} \bar{y}_i \, p_i,$$

where $\bar{y}_i$ is the mean obtained from the $i$th sample and $p_i$ is the
probability of selecting that particular sample. In simple random sampling
without replacement, $p_i = 1/{}^{N}C_{n}$ for all possible samples of size $n$.

The sample mean $\bar{y}$ is said to be an unbiased estimator of the
population mean if the expected value of the sample mean is equal to the
population mean. That is, $\bar{y}$ is an unbiased estimator of $\bar{Y}$ if
$E(\bar{y}) = \bar{Y}$. This does not mean that a single sample gives a
completely accurate estimate of the population mean.

It can be seen from the above table that many of the sample means
differ substantially from the population mean. The method of
estimating is unbiased in the sense that if it is used repeatedly, the
average of all the estimates will equal the true population mean.

The variance of the sample mean

In general, the estimates obtained from different samples of the same size,
taken from the same population, will vary among themselves and will
only by accident coincide with the population value. This is because only
a part of the population is covered in a sample. This variability due to
sampling is measured by the sampling variance of the estimator.

Since the sampling distribution of the mean is the set of all the possible
means that could result from the sampling scheme being used, its
variance can be calculated directly as

$$\mathrm{Var}(\bar{y}) = \frac{1}{{}^{N}C_{n}} \sum_{i=1}^{{}^{N}C_{n}} (\bar{y}_i - \bar{Y})^2.$$

For our example, the variance of the sample mean is obtained as

$$\mathrm{Var}(\bar{y}) = \frac{1}{10}\left[(14.5 - 15.4)^2 + (16.5 - 15.4)^2 + \cdots + (15.5 - 15.4)^2\right] = 3.69.$$

This direct calculation of $\mathrm{Var}(\bar{y})$ requires knowledge of the
sampling distribution of the mean. $\mathrm{Var}(\bar{y})$ can alternatively be
calculated without knowledge of the sampling distribution of the
mean.

Theorem

The variance of the mean $\bar{y}$ from SRS without replacement is

$$\mathrm{Var}(\bar{y}) = \frac{N - n}{N} \cdot \frac{S^2}{n} = (1 - f)\frac{S^2}{n},$$

where $f = n/N$ is the sampling fraction.

In general, the value of $S^2$ is not known. Its estimate from the sample
is used.
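
As a numerical check on the five-unit example, the sketch below computes the variance of the sample mean in two ways: directly from the 10 possible sample means, and from the formula $(1 - f)S^2/n$. It assumes the same population values used earlier; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean, variance

Y = [16, 13, 17, 11, 20]
N, n = len(Y), 2

# Direct route: spread of all possible sample means about the population mean.
Y_bar = mean(Y)
sample_means = [mean(s) for s in combinations(Y, n)]
var_direct = sum((m - Y_bar) ** 2 for m in sample_means) / len(sample_means)

# Formula route: (1 - f) * S^2 / n, where S^2 uses the divisor N - 1.
S2 = variance(Y)
f = n / N
var_formula = (1 - f) * S2 / n

print(var_direct, var_formula)
```

Both routes give approximately 3.69 for this population, but only the second can be estimated in practice, since it needs $S^2$ (replaced by $s^2$) rather than the full list of possible samples.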

Uses of the variance of the sample mean

1. To assess the precision of the estimator $\bar{y}$.

2. To compare $\bar{y}$ with other estimators of $\bar{Y}$.

3. To determine the sample size needed to yield a desired precision.

Sampling with replacement

We have learnt that if we select a simple random sample of size $n$ from
a population of size $N$ without replacement, there will be ${}^{N}C_{n}$
possible samples, since no unit is repeated and the order of selection is
not taken into consideration.

In SRS with replacement, however, a unit can appear more than once in
a sample and the order of selection is taken into consideration. There are
$N^n$ samples and the probability of selecting any one is $1/N^n$.

Theorem

In SRS with replacement, the sample mean $\bar{y}$ is an unbiased estimator of
the population mean $\bar{Y}$ (that is, $E(\bar{y}) = \bar{Y}$) with variance

$$\mathrm{Var}(\bar{y}) = \frac{\sigma^2}{n} = \frac{N - 1}{N} \cdot \frac{S^2}{n}.$$

Proof

Since repetition of units is allowed, any unit $U_i$ may appear
$0, 1, 2, \ldots, n$ times in the sample. Writing $r_i$ for the number of times
$U_i$ appears in the sample, the mean $\bar{y}$ may be written as

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} r_i Y_i.$$

Thus

$$E(\bar{y}) = E\left(\frac{1}{n}\sum_{i=1}^{N} r_i Y_i\right) = \frac{1}{n}\sum_{i=1}^{N} Y_i \, E(r_i).$$

Clearly, $r_i$ is a random variable taking the values $j = 0, 1, 2, \ldots, n$
and is distributed as a binomial variate with $n$ trials and $p = 1/N$. Thus

$$E(r_i) = np = \frac{n}{N}.$$

Hence

$$E(\bar{y}) = \frac{1}{n} \cdot \frac{n}{N} \sum_{i=1}^{N} Y_i = \bar{Y}.$$

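
The with-replacement theorem can be checked on the same five-unit population by listing all $N^n = 25$ ordered samples of size 2. This is a sketch under that assumption, with illustrative variable names; `pvariance` (divisor equal to the number of values) is appropriate here because all 25 samples are equally likely.

```python
from itertools import product
from statistics import mean, pvariance, variance

Y = [16, 13, 17, 11, 20]
N, n = len(Y), 2

ordered_samples = list(product(Y, repeat=n))  # N**n equally likely ordered samples
sample_means = [mean(s) for s in ordered_samples]

expected_mean = mean(sample_means)   # E(y-bar)
var_of_mean = pvariance(sample_means)  # Var(y-bar) over all 25 samples
sigma2 = pvariance(Y)                # population variance with divisor N
S2 = variance(Y)                     # population variance with divisor N - 1

print(expected_mean, var_of_mean, sigma2 / n, (N - 1) / N * S2 / n)
```

For this population the expected value is 15.4 and the variance is about 4.92, which is larger than the 3.69 obtained without replacement because the factor $(N - 1)/N$ exceeds $(N - n)/N$ whenever $n > 1$.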
Confidence interval for the population mean

The sample mean $\bar{y}$ is an unbiased estimator of the population mean $\bar{Y}$.
However, $\bar{y}$ varies from one sample to the other and may never coincide
with the population mean. The variation of the mean among the samples
results in the sampling distribution and is the cause of error in estimation.
As a result, if a single sample is selected and its mean calculated, we can
be sure of some error. What we do not know is how large or how small
this error may be.



It is therefore useful to provide an interval around $\bar{y}$ that reflects one's
judgment on the extent of the error. The likelihood that the error will be
as large as, or larger than the stated interval is expressed in terms of
probability. The probability that a specified interval contains the
unknown population mean is called confidence. The interval itself is
called a confidence interval.

A confidence interval is generally defined by an upper confidence limit

and a lower confidence limit. The following must be specified in order

to obtain desired confidence limits:

1. The value of the sample mean;

2. The standard error of the sample mean; and

3. A confidence coefficient which specifies the degree of confidence

that one can attach to the interval. If the confidence coefficient is

expressed as a percentage, then it is referred to as the confidence level.


An approximate $100(1-\alpha)\%$ confidence interval for the population mean $\bar{Y}$ is given by $\bar{y} \pm z_{\alpha/2}\, s_{\bar{y}}$, or

\[ \bar{y} - z_{\alpha/2}\, s_{\bar{y}} < \bar{Y} < \bar{y} + z_{\alpha/2}\, s_{\bar{y}}, \]

where
$\bar{y}$ is the sample mean;
$s_{\bar{y}}$ is the standard error of the sample mean;
$z_{\alpha/2}$ is the value from the standard normal distribution that indicates within how many standard errors of the mean lie approximately $100(1-\alpha)\%$ of the possible values of $\bar{y}$. This value is referred to as the reliability coefficient;
$z_{\alpha/2}\, s_{\bar{y}}$ is referred to as the bound on the error of estimation.

If sampling is done with replacement, then the $100(1-\alpha)\%$ confidence interval is

\[ \bar{y} \pm z_{\alpha/2}\frac{\hat{\sigma}}{\sqrt{n}}; \]

and if sampling is done without replacement, the interval is

\[ \bar{y} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}\sqrt{1-f}. \]
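As a rough sketch of how these limits might be computed, the function below returns the interval for a given sample. The function name, its arguments and the five data values are assumptions made for illustration; $\hat{\sigma}$ is taken to be the sample standard deviation $s$ (divisor $n-1$), and $f = n/N$ is applied only when a population size is supplied.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95, N=None):
    """Approximate CI for the population mean from a simple random sample.

    If N is given, sampling is treated as without replacement and the
    standard error is multiplied by sqrt(1 - f), where f = n / N.
    """
    n = len(sample)
    ybar = mean(sample)
    s = stdev(sample)  # sample standard deviation, divisor n - 1
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    standard_error = s / sqrt(n)
    if N is not None:
        standard_error *= sqrt(1 - n / N)
    bound = z * standard_error  # bound on the error of estimation
    return ybar - bound, ybar + bound


sample = [12, 15, 11, 14, 13]  # hypothetical observations
print(mean_confidence_interval(sample))        # with replacement
print(mean_confidence_interval(sample, N=20))  # without replacement, N = 20
```

The interval without replacement is always the narrower of the two, because $\sqrt{1-f} < 1$ whenever $n > 0$.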
Estimation of Proportion of Units

Sometimes we need to estimate the total number or the proportion (P)

of units in the population that fall into a certain class or possess certain

qualitative characteristics or attributes. For example, in a household

survey we might want to know the proportion of unemployed persons

in a district or the proportion of the population under 5 years old. We

may classify the N units in the population into two groups as follows:

Group 1: Units which possess the attribute

Group 2: Units which do not possess the attribute

If $A$ is the number of units belonging to Group 1, then the proportion of the population with the attribute is $P = A/N$.

Let us define our study variable as

\[ y_i = \begin{cases} 1, & \text{if } U_i \text{ is in Group 1} \\ 0, & \text{if } U_i \text{ is in Group 2} \end{cases} \]
Then, it can be seen that the population total is

\[ Y = \sum_{i=1}^{N} y_i = A \]

and the population mean is

\[ \bar{Y} = \frac{1}{N}\sum_{i=1}^{N} y_i = \frac{A}{N} = P. \]

It follows therefore that estimation of $P$ and $A$ reduces to the estimation of a population mean and a population total, both of which we have discussed earlier.
Since $y_i$ takes only the values 1 or 0, $y_i^2$ will also take only the values 1 or 0, and hence

\[ \sum_{i=1}^{N} y_i^2 = A = NP. \]

Subsequently, we have

\[ S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(y_i^2 - 2y_i\bar{Y} + \bar{Y}^2\right) = \frac{1}{N-1}\left[\sum_{i=1}^{N} y_i^2 - N\bar{Y}^2\right] \]

\[ = \frac{1}{N-1}\left[NP - NP^2\right] = \frac{NP(1-P)}{N-1} \quad\text{or}\quad \frac{NPQ}{N-1}, \quad\text{where } Q = 1 - P. \]
If in a SR sample of $n$ units, $a$ units possess the attribute, then the sample mean is

\[ \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{a}{n} = p \]

and since the sample mean is an unbiased estimator of the population mean, it follows that the sample proportion $p$ is an unbiased estimator of the population proportion $P$. If sampling is without replacement, then the variance of the sample mean is $(1-f)S^2/n$.
It follows that the variance of $p$ is given by

\[ \operatorname{Var}(p) = (1-f)\frac{NPQ}{n(N-1)} = \frac{N-n}{N-1}\cdot\frac{PQ}{n}. \]

We have already seen that $s^2$ is an unbiased estimator of $S^2$; it therefore follows that an unbiased estimator of the variance of $p$ is of the form

\[ \operatorname{var}(p) = (1-f)\frac{s^2}{n}. \]
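But

\[ s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right] = \frac{1}{n-1}\left[np - np^2\right] = \frac{np(1-p)}{n-1} \quad\text{or}\quad \frac{npq}{n-1}, \quad\text{where } q = 1 - p. \]

Hence, an unbiased estimator of the variance of $p$ is

\[ \operatorname{var}(p) = (1-f)\frac{pq}{n-1}. \]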
Remarks:

1. If sampling is done with replacement, then

\[ \operatorname{Var}(p) = \frac{\sigma^2}{n} = \frac{1}{Nn}\left[\sum_{i=1}^{N} y_i^2 - N\bar{Y}^2\right] = \frac{PQ}{n} \]

and an unbiased estimator of the variance of the sample proportion is

\[ \operatorname{var}(p) = \frac{pq}{n-1}. \]
2. For SRS, the number of sampled units possessing the attribute, $a = np$, follows the binomial distribution when selection is done with replacement and the hypergeometric distribution when selection is done without replacement.
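The results for sampling without replacement can be verified by enumeration. The sketch below assumes a hypothetical population of $N = 6$ units of which $A = 2$ possess the attribute, and a sample size $n = 3$; none of these values come from the notes. It lists every possible sample and checks that $p$ is unbiased for $P$, that $\operatorname{Var}(p) = \frac{N-n}{N-1}\cdot\frac{PQ}{n}$, and that $(1-f)\,pq/(n-1)$ is unbiased for that variance.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 1, 0, 0, 0, 0]  # hypothetical: y_i = 1 if unit i has the attribute
N, n = len(population), 3
P = Fraction(sum(population), N)
Q = 1 - P
f = Fraction(n, N)  # sampling fraction

# Every sample of n distinct units is equally likely under SRS without replacement.
samples = list(combinations(population, n))
p_values = [Fraction(sum(s), n) for s in samples]
m = len(samples)

mean_p = sum(p_values) / m
var_p = sum((p - mean_p) ** 2 for p in p_values) / m
mean_var_hat = sum((1 - f) * p * (1 - p) / (n - 1) for p in p_values) / m

theory = Fraction(N - n, N - 1) * P * Q / n
print(mean_p == P)             # p is unbiased for P
print(var_p == theory)         # Var(p) matches the formula
print(mean_var_hat == theory)  # (1 - f) pq / (n - 1) is unbiased for Var(p)
```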
Confidence Limits for Proportions

A $100(1-\alpha)\%$ confidence interval for a population proportion ($P$) is given by

\[ p \pm z_{\alpha/2}\, s_p \]

where
$z_{\alpha/2}$ is the point of the standard normal distribution beyond which $100(\alpha/2)\%$ of the values lie; and

$s_p = \sqrt{pq/(n-1)}$ in the case of sampling with replacement, and

$s_p = \sqrt{(1-f)\,pq/(n-1)}$ in the case of sampling without replacement.
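A minimal sketch of these limits follows. The function name and the figures in the usage lines (30 units with the attribute in a sample of 100, and a population of 400) are hypothetical; the finite population correction is applied only when $N$ is supplied.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(a, n, confidence=0.95, N=None):
    """Approximate CI for a population proportion.

    a: number of sampled units with the attribute; n: sample size.
    If N is given, the factor (1 - f) with f = n / N is applied.
    """
    p = a / n
    q = 1 - p
    f = 0.0 if N is None else n / N
    s_p = sqrt((1 - f) * p * q / (n - 1))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return p - z * s_p, p + z * s_p


print(proportion_confidence_interval(30, 100))         # with replacement
print(proportion_confidence_interval(30, 100, N=400))  # without replacement
```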
Determination of Sample Size
The size of the sample to be used in every survey has to be determined
beforehand. Usually, one of the first questions answered in a new survey
design is “How large should the sample be?” This important question has
to be addressed because the decision on sample size is directly related to
the cost of the survey. It is a difficult question to answer precisely since it
is related to a clear understanding of the concept of sampling
distribution. There are several approaches to determine the sample size,
the easiest being the empirical approach – to find out what sample sizes
have been used by others in similar studies.
The Confidence Interval Approach

The objectives in interval estimation are to obtain narrow intervals with

high reliability. The components of a confidence interval indicate that the

length of the interval is determined by the magnitude of the precision of

the estimate. That is,

\[ \text{Precision} = (\text{reliability coefficient}) \times (\text{standard error}). \]

This is half of the total length of the interval. For a given reliability coefficient, the length of the interval can be reduced by reducing the standard error. On the other hand, for a given population variance, $\sigma^2$, the only way to obtain a smaller standard error is to take a larger sample.
With this in mind, the sample size can be estimated by
specifying the following three factors:
1. A margin of error that expresses how close to the population
parameter the sample value should be;
2. A preset confidence with which it is required to know that
the margin will not be exceeded; and
3. A measure of the variability of the characteristic of interest
in the population.
These specifications will depend on a trade-off between the value of

more accurate information and the cost of an increased sample size. The

margin of error d, which is termed tolerable error, represents the

maximum acceptable difference between the estimate and the population

value for a specified confidence level, and it is expressed as absolute

percentage point. The magnitude of the margin depends on how precise

an estimate is to be. The more precise an estimate is, the smaller the

margin of error. A smaller margin of error requires a larger sample size.
The determination of the value of the margin of error is a subjective

decision on the part of the investigator. The margin may be two-sided or

one-sided. If it is two-sided, one generally adopts equal values on either side so that the margins are $\pm d$. We may, however, decide that only underestimation needs to be guarded against, or vice versa. In that case, the margin of error will be one-sided so that the margin is either $-d$ or $+d$.
In fixing the percentage confidence level for asserting that the margins of error will not be exceeded, the distinction between one-sided and two-sided margins needs attention. A 95% confidence interval for a two-sided margin implies a 5% probability that the error will exceed one or the other margin. This 5% divides into a 2.5% probability of going below the lower margin $-d$ and a 2.5% probability of exceeding the upper margin $+d$.
If the investigator is interested in one margin only, the requirement of 95% confidence that a two-sided margin $\pm d$ will not be breached corresponds to a 97.5% confidence that the same one-sided margin $d$ will not be breached. In most cases, the population variance of the characteristic of interest is unknown and has to be estimated. This can be done by using a value based on past experience with similar studies. Another possibility is to conduct a pilot survey, taking a small sample and using the results to estimate the population variance.


Sampling for the Mean

Suppose that we want an interval that extends $d$ units on either side of the mean. Then we can write

\[ d = (\text{reliability coefficient}) \times (\text{standard error}). \]

If sampling is to be done with replacement, from an infinite population, or from a population that is sufficiently large to warrant our ignoring the finite population correction factor, then the size of the margin of error can be expressed symbolically as

\[ d = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \]

which when solved for $n$ gives

\[ n = \frac{z_{\alpha/2}^2\,\sigma^2}{d^2}. \tag{1} \]

When sampling is without replacement and the sampling fraction $n/N$ is not negligible, that is, if the calculated value of $n$ is found to be more than 10% of the population size $N$, a more appropriate estimate of the sample size is

\[ n_0 = \frac{n}{1 + n/N}, \]

where $n$ is the value obtained from Equation (1).
Example

A nutritionist wishes to conduct a survey among a population of teenage

girls to determine their average protein intake. Let us assume that the

nutritionist would like an interval about 10 units wide. That is, the

estimate should be within 5 units of the true value in either direction. Let

us also assume that a confidence interval of 95% is decided on, and that

from past experience, the nutritionist feels that the population standard

deviation is probably 20 gm. What sample size is required?


Solution

The necessary information is: $z_{0.025} = 1.96$, $\sigma = 20$ and $d = 5$.

Ignoring the finite population correction factor, we have

\[ n = (1.96)^2\,\frac{20^2}{5^2} = 61.47. \]

Thus, the required sample size is 62.
Example 2

An efficiency expert wishes to determine the average time, in minutes, it

takes to drill three holes in a certain metal clamp. How large a sample

will be needed to be 95% confident that the sample mean will be within

15 seconds of the true mean? Assume that it is known from previous

studies that $\sigma = 40$ seconds.
Solution

We have

\[ n = (1.96)^2\,\frac{40^2}{15^2} = 27.32. \]

Hence the sample size is 28.
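The two worked examples can be reproduced with a short function. This is a sketch only: the function name, the `threshold` argument and the population size $N = 200$ in the last line are assumptions introduced for illustration. It applies Equation (1) and, when the result exceeds 10% of $N$, the adjustment $n_0 = n/(1 + n/N)$.

```python
from math import ceil


def sample_size_for_mean(z, sigma, d, N=None, threshold=0.10):
    """Sample size from Equation (1), adjusted when n / N exceeds the threshold."""
    n = (z * sigma / d) ** 2
    if N is not None and n / N > threshold:
        n = n / (1 + n / N)  # n_0 = n / (1 + n/N)
    return n


# Sample sizes are rounded up to the next whole unit.
print(ceil(sample_size_for_mean(1.96, 20, 5)))         # protein intake example
print(ceil(sample_size_for_mean(1.96, 40, 15)))        # drilling time example
print(ceil(sample_size_for_mean(1.96, 20, 5, N=200)))  # hypothetical finite population
```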

Determination of Sample Size for Proportions

The procedure for determining the appropriate sample size for estimating

proportions is similar to the one outlined above. We have seen that the sampling variance of a sample proportion $p$ based on sampling with replacement is

\[ \operatorname{Var}(p) = \frac{PQ}{n} \quad\text{or}\quad \operatorname{Var}(p) = \frac{P(1-P)}{n}, \]

where $P$ is the population proportion and $n$ is the sample size.
Using this information in Equation (1), the required sample size $n$ is given by

\[ n = \frac{z_{\alpha/2}^2\,P(1-P)}{d^2}. \tag{2} \]

As a usual modification, if the calculated value of $n$ is found to be more than 10% of the population size $N$, a more appropriate estimate of the sample size is

\[ n_0 = \frac{n}{1 + (n-1)/N}. \]
Equation (2) depends on P which in most practical situations is not

known, and thus must be estimated. The method of estimating the

population proportion is to use a sample proportion obtained from a

previous comparable study or from a pilot survey. Another approach is

based on the fact that the maximum value of $P(1-P)$ is 0.25, attained when $P = 0.5$. If $P$ is greater or smaller than 0.5, $P(1-P)$ is smaller than 0.25. Therefore, substituting 0.5 for $P$ in Equation (2) results in the largest value of $n$.


This is a conservative approach since the sample size would be larger
than required, and thus the desired accuracy will be exceeded. The logic
is that it is all right to make an error on the side of being too accurate.
Thus, if $P$ is not known and no preliminary survey data are available, the
sample size should always be estimated for $P = 0.5$. Equation (2)
therefore reduces to

\[ n = \frac{z_{\alpha/2}^2\,(0.5)(0.5)}{d^2} = \frac{(0.25)\,z_{\alpha/2}^2}{d^2}. \tag{3} \]
Example 3

A survey is being planned to determine what proportion of families in a

certain district are medically indigent. It is felt that the proportion cannot

be greater than 0.35. A 95% confidence interval is desired with a tolerable error of 0.05. What sample size of families should be selected?
Solution

If we can ignore the finite population correction factor, then

\[ n = \frac{(1.96)^2\,(0.35)(0.65)}{(0.05)^2} = 349.59. \]

Hence the necessary sample size is 350.
Example 4

According to the Ministry of Transport, the DVLA registered 2.5m cars

during the year 2007 and 17% of the cars operate on both petrol and gas.

The ministry is aware that cars which were originally meant to operate on petrol have been converted to operate on both petrol and gas. How many cars should be randomly selected and examined in order to estimate the proportion of cars in the country that operate on both petrol and gas with a 90% confidence interval of width 0.07?


Solution

We know that the margin of error d is half the width of the confidence

interval. Thus, $d = 0.035$, $P = 0.17$ and $z_{0.05} = 1.645$.

Substituting these into Equation (2), we obtain

\[ n = \frac{(1.645)^2\,(0.17)(0.83)}{(0.035)^2} = 311.69. \]

Hence, the required sample size is 312, since the sampling fraction $n/N = 312/2{,}500{,}000$ is negligible.
Example 5

The Ghana Tourist Board plans to survey visitors to determine if they

plan to stay in the country for more than a month. The Board would like

to be 95% confident of its estimate. How many tourists should be

interviewed if the Board wants the sample proportion of visitors staying

longer than one month to be within 0.03 of the true proportion?

Solution

Since the Board has no idea what proportion of visitors plan to stay for more than one month, Equation (3) is used to solve for the sample size, using 0.5 as an estimate for the population proportion.

\[ n = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1067.11 \]

Rounding up, as in the earlier examples, a sample size of 1068 will be required.

The Board cannot afford to take a sample of size 1068. What would happen to the tolerable error if the sample size were to be reduced to 600? Solving for $d$ in Equation (2), we obtain

\[ d = z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}} = (1.96)\sqrt{\frac{(0.5)(0.5)}{600}} = 0.04. \]

Thus, reducing the sample size to 600 increases the tolerable error from 0.03 to 0.04.
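The proportion examples above can be checked with the following sketch. The function names and the `threshold` argument are assumptions made for illustration. The first function implements Equation (2), falling back to the conservative $P = 0.5$ of Equation (3) when no value of $P$ is supplied; the second solves Equation (2) for $d$.

```python
from math import ceil, sqrt


def sample_size_for_proportion(z, d, P=0.5, N=None, threshold=0.10):
    """Sample size from Equation (2); P = 0.5 gives the conservative Equation (3)."""
    n = z ** 2 * P * (1 - P) / d ** 2
    if N is not None and n / N > threshold:
        n = n / (1 + (n - 1) / N)  # n_0 = n / (1 + (n - 1)/N)
    return n


def margin_of_error(z, n, P=0.5):
    """Tolerable error d obtained by solving Equation (2) for d."""
    return z * sqrt(P * (1 - P) / n)


print(ceil(sample_size_for_proportion(1.96, 0.05, P=0.35)))                # Example 3
print(ceil(sample_size_for_proportion(1.645, 0.035, P=0.17, N=2_500_000)))  # Example 4
print(ceil(sample_size_for_proportion(1.96, 0.03)))                        # Example 5
print(round(margin_of_error(1.96, 600), 2))                                # n reduced to 600
```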
Remark

The issue of sample size is only partly technical. The larger the sample, the more elaborate the analyses that can be sustained. This being so, one cannot speak of an adequate sample size, still less of an optimal sample size. The choice of sample size involves balancing the demands (often insatiable) of the analyst against the constraints of funding and national capability.
In any discussion of the sample size appropriate for a particular survey, it should be borne in mind that a survey with a large sample is more difficult to manage and supervise, especially if the option of spreading the fieldwork over a longer period is considered unacceptable. This fact argues for caution in allowing inflated sample sizes, particularly in countries where survey capability is weak.
Stratified Random Sampling

Stratified random sampling begins with the division of the entire study
population into subsets, called strata (singular form: stratum). Sampling
is then carried out independently in each stratum. This strategy reduces
the sampling error to the extent that the variable which defines the
strata is correlated with the survey variable of interest.

For example, in a living standards survey, if one were to classify
households into upper-income, middle-income and lower-income, then
the sampling error would be reduced for such a variable as household
expenditure, which is correlated with income.
The principal objective of stratification is to reduce sampling error. In a
stratified sample, the sampling error depends on the population
variance existing within the strata but not between the strata.

Stratified samples have the following general characteristics:

1. The entire population is first divided into an exhaustive set of strata, using some
external source, such as census data, to form the strata.
2. Within each stratum a separate random sample is selected. This implies that the
sample designer is at liberty to use different sampling fractions in different
strata.
3. From each separate sample, a statistic (e.g. the mean) is calculated and
properly weighted to form an overall estimated mean for the whole population.

4. Sample variances are also calculated within each stratum and appropriately
weighted to yield a combined estimate for the whole population.

Reasons for stratification
Though the main advantage of using stratified sampling is the possible increase in
efficiency per unit of cost in estimating the population characteristics, it is also likely
to be useful in a number of situations.

Firstly, the population may consist of a number of distinctive components and
inference may be necessary not only for the whole population but also for each of
these components. For example, in national sample surveys estimates are generally
required for specified components such as administrative regions, districts, rural and
urban areas, etc. It is obvious that this can be done only if the sample contains
representative units from each of these components. For the sample to contain
representative units from each component, each component must be separately
sampled.
Secondly, if the population is divided into a number of strata which are
more homogeneous than the population with respect to the characteristic
under study, the sampling units within each stratum will be less varied
compared to the sampling units in the whole population. Hence,
estimates for the stratum characteristics obtained from random samples
from each would be more precise compared to similar estimates of the
population characteristic from unrestricted samples of the same size
from the whole population. That is, estimates of the characteristic
obtained by combining the estimates for the different strata would be
more precise than that obtained directly from a sample of the population
without stratification.
Apart from stratification that is compelled by the nature of the survey, stratification
requires some prior knowledge of the characteristic under study. This is
usually given by the auxiliary information on some related characteristics
which are recorded in the sampling frame. In stratified sampling, the
following inter-related points need careful consideration:
1. choice of sampling design within each stratum;
2. choice of stratification variable;
3. number of strata;
4. allocation of sample size to strata; and
5. demarcation of strata.
In fact, the best method of utilizing stratified sampling effectively consists of
determining an optimum composite choice of the possible solutions to the five points
mentioned above.

Suppose we have a population of $N = 6$ students who are classified
according to whether they are nationals or foreigners. Let $N_1 = 3$
students be foreigners and $N_2 = 3$ students be nationals. We are
interested in the mean number of books these students have. Consider
the following data.
          Foreigners ($N_1$)              Nationals ($N_2$)
          $Y_{1i}$                        $Y_{2i}$
          $Y_{11} = 2$ books              $Y_{21} = 8$ books
          $Y_{12} = 4$ books              $Y_{22} = 12$ books
          $Y_{13} = 6$ books              $Y_{23} = 16$ books

Total     $Y_1 = 12$ books                $Y_2 = 36$ books

Mean      $\bar{Y}_1 = 4$ books           $\bar{Y}_2 = 12$ books
The number of strata is $L = 2$, and

\[ N = \sum_{h=1}^{2} N_h = N_1 + N_2 = 3 + 3 = 6 \]

is the total number of sampling units (students) in the frame. $Y_{hi}$ shows the number of books owned by the $i$th student in the $h$th stratum, and $Y_h$ shows the number of books in stratum $h$. Hence, for our example, $Y_1 = Y_{11} + Y_{12} + Y_{13} = 12$ is the number of books in the first stratum (foreigners), and $Y_2 = Y_{21} + Y_{22} + Y_{23} = 36$ is the number of books in the second stratum (nationals).

In general, the stratum total is

\[ Y_h = \sum_{i=1}^{N_h} Y_{hi}. \]
The population total is the sum of the stratum totals, and is
$Y = Y_1 + Y_2 = 12 + 36 = 48$, which is the total number of books.

In general, the population total is given by

\[ Y = \sum_{h=1}^{L} Y_h = \sum_{h=1}^{L}\sum_{i=1}^{N_h} Y_{hi}. \]

The stratum mean is defined as

\[ \bar{Y}_h = \frac{Y_h}{N_h}. \]
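For the first and second strata in our example, we have

\[ \bar{Y}_1 = \frac{Y_1}{N_1} = \frac{12}{3} = 4 \text{ books per foreigner} \quad\text{and}\quad \bar{Y}_2 = \frac{Y_2}{N_2} = \frac{36}{3} = 12 \text{ books per national.} \]

The population mean is defined as

\[ \bar{Y} = \frac{Y}{N} = \frac{Y_1 + Y_2}{N_1 + N_2} = \frac{12 + 36}{3 + 3} = \frac{48}{6} = 8, \]

and this shows the average is 8 books per student. Since $Y_1 = N_1\bar{Y}_1$ and $Y_2 = N_2\bar{Y}_2$, the population mean may also be written as

\[ \bar{Y} = \frac{N_1\bar{Y}_1 + N_2\bar{Y}_2}{N_1 + N_2} = \frac{\sum_{h=1}^{L} N_h\bar{Y}_h}{N}. \]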
Estimation of the population total $Y$ and the population mean $\bar{Y}$
The basic idea is to first estimate the population total $Y$ and to divide the
result by $N$ to estimate $\bar{Y}$. The population total is the sum of the stratum
totals. Hence, an estimate of the population total is obtained by
summing the estimates of the stratum totals. An estimate of the stratum
total may be obtained by $\hat{Y}_h = N_h\bar{y}_h$, where $\bar{y}_h$ is the sample mean of a
random sample of size $n_h$ from stratum $h$. For our example, suppose we
select random samples of size $n_1 = 2$ and $n_2 = 2$ from the first and
second strata of our population, thus making $\bar{y}_1$ and $\bar{y}_2$ the sample
means of these random samples, respectively.
An estimate of the population total $Y$ is the sum of the estimates of the
stratum totals. For our example it will be:

\[ \hat{Y}_{st} = N_1\bar{y}_1 + N_2\bar{y}_2 = \sum_{h=1}^{L} N_h\bar{y}_h. \]

Hence, an estimate of the population mean becomes

\[ \bar{y}_{st} = \frac{\hat{Y}_{st}}{N} = \frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N_1 + N_2}. \]

In general, this may be written as

\[ \bar{y}_{st} = \frac{\sum_{h=1}^{L} N_h\bar{y}_h}{N}. \]

Note: $E(\bar{y}_{st}) = \bar{Y} = Y/N \Rightarrow E(N\bar{y}_{st}) = Y$. But $N\bar{y}_{st} = \sum_{h=1}^{L} N_h\bar{y}_h = \hat{Y}_{st}$. That is, the estimator $\hat{Y}_{st}$ is an unbiased estimator of $Y$: $E(\hat{Y}_{st}) = Y$.
Example
Let us illustrate these results using our hypothetical population. We can
select $\binom{3}{2} = 3$ possible samples without replacement from each stratum.
Hence, there are altogether $\binom{3}{2}\binom{3}{2} = 3 \times 3 = 9$ possible samples of size
$n = n_1 + n_2 = 2 + 2 = 4$.
Suppose the first sample consists of (2, 4, 8, 12), of which (2, 4) is from stratum 1 and (8, 12) is from stratum 2. The total of
the sample (2, 4) from stratum 1 is $y_1 = 2 + 4 = 6$. The sample mean of
the sample (2, 4) from stratum 1 is $\bar{y}_1 = y_1/n_1 = 6/2 = 3$. Hence, an estimate
of the total number of books in stratum 1 based on the sample (2, 4) is $N_1\bar{y}_1 = (3)(3) = 9$.
Similarly,

\[ y_2 = 8 + 12 = 20 \quad\text{and}\quad \bar{y}_2 = \frac{y_2}{n_2} = \frac{20}{2} = 10. \]

Hence, an estimate of the total number of books in stratum 2 based on
the sample (8, 12) is $N_2\bar{y}_2 = (3)(10) = 30$. Thus, the estimate of the
total number of books in the population based on the first sample
(2, 4, 8, 12) is $\hat{Y}_{st} = N_1\bar{y}_1 + N_2\bar{y}_2 = 9 + 30 = 39$ and the estimate of
the population mean is $\bar{y}_{st} = \hat{Y}_{st}/N = 39/6 = 6.5$ books.

Following the above procedure for all other possible samples, we obtain
the results summarized in the table below.
Stratum 1   Stratum 2   $y_1$   $y_2$   $\bar{y}_1$   $\bar{y}_2$   $N_1\bar{y}_1$   $N_2\bar{y}_2$   $\hat{Y}_{st}$   $\bar{y}_{st}$
2, 4        8, 12       6       20      3             10            9                30               39               6.5
2, 4        8, 16       6       24      3             12            9                36               45               7.5
2, 4        12, 16      6       28      3             14            9                42               51               8.5
2, 6        8, 12       8       20      4             10            12               30               42               7.0
2, 6        8, 16       8       24      4             12            12               36               48               8.0
2, 6        12, 16      8       28      4             14            12               42               54               9.0
4, 6        8, 12       10      20      5             10            15               30               45               7.5
4, 6        8, 16       10      24      5             12            15               36               51               8.5
4, 6        12, 16      10      28      5             14            15               42               57               9.5
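Now,

\[ E(\bar{y}_{st}) = \frac{1}{9}\left(\frac{39}{6} + \frac{45}{6} + \frac{51}{6} + \cdots + \frac{57}{6}\right) = 8. \]

But the population mean $\bar{Y} = 8$ and therefore $E(\bar{y}_{st}) = \bar{Y}$. Thus, $\bar{y}_{st}$ is
an unbiased estimator of $\bar{Y}$. We also note that

\[ E(\hat{Y}_{st}) = \frac{39 + 45 + \cdots + 57}{9} = 48 = Y \]

and $\hat{Y}_{st}$ is an unbiased estimator of $Y$.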
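The table and the two expectations can be reproduced by enumerating every stratified sample. The sketch below uses the book data of the example; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

strata = {"foreigners": [2, 4, 6], "nationals": [8, 12, 16]}
n_h = {"foreigners": 2, "nationals": 2}


def stratified_estimates(strata, n_h):
    """Yield (sample, Y_hat_st, ybar_st) for every possible stratified sample."""
    N = sum(len(units) for units in strata.values())
    choices = [combinations(units, n_h[name]) for name, units in strata.items()]
    for sample in product(*choices):
        # Y_hat_st = sum over strata of N_h * (sample mean in stratum h)
        y_hat = sum(len(units) * Fraction(sum(s), len(s))
                    for units, s in zip(strata.values(), sample))
        yield sample, y_hat, y_hat / N


rows = list(stratified_estimates(strata, n_h))
totals = [y_hat for _, y_hat, _ in rows]
means = [ybar for _, _, ybar in rows]

print(len(rows))                  # number of possible samples
print(sum(totals) / len(totals))  # E(Y_hat_st)
print(sum(means) / len(means))    # E(ybar_st)
```

Because each of the nine samples is equally likely, the plain averages of the two columns are the expectations.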

Population and stratum variances
There are two ways of defining the stratum variance, which shows the
variance within each stratum. One is

\[ \sigma_h^2 = \frac{1}{N_h}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2 \]

and the other is

\[ s_h^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2. \]

Note that when $N_h$ is large, $N_h - 1 \approx N_h$ and hence $\sigma_h^2 \approx s_h^2$.

The population variance is defined as

\[ \sigma^2 = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}\right)^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \bar{Y})^2. \]

The population variance shows the variation of the individual values from
the population mean, $\bar{Y}$.
Example
Calculate the within stratum variances and the population variance for
our hypothetical example.

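Note that,

\[ \sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2 = \sum_{i=1}^{N_h}\left(Y_{hi}^2 - 2Y_{hi}\bar{Y}_h + \bar{Y}_h^2\right) \]

\[ = \sum_{i=1}^{N_h} Y_{hi}^2 - 2N_h\bar{Y}_h^2 + N_h\bar{Y}_h^2 \quad\text{since } \bar{Y}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} Y_{hi} \]

\[ = \sum_{i=1}^{N_h} Y_{hi}^2 - N_h\bar{Y}_h^2 \]

\[ = \sum_{i=1}^{N_h} Y_{hi}^2 - \frac{1}{N_h}\left(\sum_{i=1}^{N_h} Y_{hi}\right)^2. \tag{1} \]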
Also, we have

\[ \sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}\right)^2 = \sum_{i=1}^{N_h}\left(Y_{hi}^2 - 2\bar{Y}Y_{hi} + \bar{Y}^2\right) = \sum_{i=1}^{N_h} Y_{hi}^2 - 2\bar{Y}\sum_{i=1}^{N_h} Y_{hi} + N_h\bar{Y}^2. \tag{2} \]
Now, we have

          Stratum 1 ($N_1$)               Stratum 2 ($N_2$)
          $Y_{1i}$        $Y_{1i}^2$      $Y_{2i}$        $Y_{2i}^2$
          $Y_{11} = 2$    4               $Y_{21} = 8$    64
          $Y_{12} = 4$    16              $Y_{22} = 12$   144
          $Y_{13} = 6$    36              $Y_{23} = 16$   256
Total     12              56              36              464
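From

\[ \sigma_h^2 = \frac{1}{N_h}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2 \]

we have

\[ \sigma_1^2 = \frac{1}{N_1}\sum_{i=1}^{N_1}\left(Y_{1i} - \bar{Y}_1\right)^2 = \frac{1}{N_1}\left[\sum_{i=1}^{N_1} Y_{1i}^2 - \frac{1}{N_1}\left(\sum_{i=1}^{N_1} Y_{1i}\right)^2\right] = \frac{1}{3}\left[56 - \frac{1}{3}(12)^2\right] = \frac{8}{3} \]

and

\[ s_1^2 = \frac{1}{N_1 - 1}\sum_{i=1}^{N_1}\left(Y_{1i} - \bar{Y}_1\right)^2 = \frac{1}{N_1 - 1}\left[\sum_{i=1}^{N_1} Y_{1i}^2 - \frac{1}{N_1}\left(\sum_{i=1}^{N_1} Y_{1i}\right)^2\right] = \frac{1}{2}\left[56 - \frac{1}{3}(12)^2\right] = 4. \]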
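Similarly,

\[ \sigma_2^2 = \frac{1}{N_2}\left[\sum_{i=1}^{N_2} Y_{2i}^2 - \frac{1}{N_2}\left(\sum_{i=1}^{N_2} Y_{2i}\right)^2\right] = \frac{1}{3}\left[464 - \frac{1}{3}(36)^2\right] = \frac{32}{3} \]

and

\[ s_2^2 = \frac{1}{N_2 - 1}\left[\sum_{i=1}^{N_2} Y_{2i}^2 - \frac{1}{N_2}\left(\sum_{i=1}^{N_2} Y_{2i}\right)^2\right] = \frac{1}{2}\left[464 - \frac{1}{3}(36)^2\right] = 16. \]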
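For the population variance,

\[ \sigma^2 = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}\right)^2 = \frac{1}{N}\left[\sum_{i=1}^{N_1}(Y_{1i} - \bar{Y})^2 + \sum_{i=1}^{N_2}(Y_{2i} - \bar{Y})^2\right]. \]

But

\[ \sum_{i=1}^{N_1}(Y_{1i} - \bar{Y})^2 = \sum_{i=1}^{N_1} Y_{1i}^2 - 2\bar{Y}\sum_{i=1}^{N_1} Y_{1i} + N_1\bar{Y}^2 = 56 - 2(8)(12) + (3)(8)^2 = 56 \]

and

\[ \sum_{i=1}^{N_2}(Y_{2i} - \bar{Y})^2 = \sum_{i=1}^{N_2} Y_{2i}^2 - 2\bar{Y}\sum_{i=1}^{N_2} Y_{2i} + N_2\bar{Y}^2 = 464 - 2(8)(36) + (3)(8)^2 = 80. \]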
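Therefore,

\[ \sigma^2 = \frac{1}{6}(56 + 80) = \frac{136}{6}. \]

Consider the following:

\[ \sigma^2 = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}\right)^2 = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h + \bar{Y}_h - \bar{Y}\right)^2 \]

\[ = \frac{1}{N}\left[\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2 + \sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(\bar{Y}_h - \bar{Y}\right)^2 + 2\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)\left(\bar{Y}_h - \bar{Y}\right)\right] \]

\[ = \frac{1}{N}\left[\sum_{h=1}^{L}\sum_{i=1}^{N_h}\left(Y_{hi} - \bar{Y}_h\right)^2 + \sum_{h=1}^{L} N_h\left(\bar{Y}_h - \bar{Y}\right)^2\right] \]

\[ = \frac{1}{N}\sum_{h=1}^{L} N_h\sigma_h^2 + \frac{1}{N}\sum_{h=1}^{L} N_h\left(\bar{Y}_h - \bar{Y}\right)^2. \]

The cross-product term vanishes because $\sum_{i=1}^{N_h}(Y_{hi} - \bar{Y}_h) = 0$ within every stratum.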
Thus, the overall population variance is given by the sum of the weighted stratum
variances and the variance among the stratum means. Using our example,
these terms may be calculated as follows:

\[ \frac{1}{N}\sum_{h=1}^{L} N_h\sigma_h^2 = \frac{1}{6}\left[3\left(\frac{8}{3}\right) + 3\left(\frac{32}{3}\right)\right] = \frac{40}{6}. \]

\[ \frac{1}{N}\sum_{h=1}^{L} N_h\left(\bar{Y}_h - \bar{Y}\right)^2 = \frac{1}{6}\left[3(4 - 8)^2 + 3(12 - 8)^2\right] = 16. \]

\[ \therefore\ \sigma^2 = \frac{40}{6} + 16 = \frac{136}{6}. \]
Note that $\sigma^2$ is simply the population variance, and may thus be
obtained as

\[ \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2 = \frac{1}{6}\left[(2-8)^2 + (4-8)^2 + (6-8)^2 + (8-8)^2 + (12-8)^2 + (16-8)^2\right] = \frac{136}{6}. \]

For convenience, let us set

\[ \sigma_w^2 = \frac{1}{N}\sum_{h=1}^{L} N_h\sigma_h^2 \quad\text{and}\quad \sigma_b^2 = \frac{1}{N}\sum_{h=1}^{L} N_h(\bar{Y}_h - \bar{Y})^2, \]

and call $\sigma_w^2$ the within stratum variance and $\sigma_b^2$ the between strata
variance.
The overall variance may then be shown as $\sigma^2 = \sigma_w^2 + \sigma_b^2$. The
implication of this result is that, since $\sigma^2$ is fixed for a given population, when $\sigma_w^2$ is small, $\sigma_b^2$ will be large, and
vice versa. That is, when the population is stratified into homogeneous
strata, the main source of variation will be the between stratum
variation. Conversely, if the population is stratified into heterogeneous
strata, the main source of variation will be the within stratum variation.
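The decomposition $\sigma^2 = \sigma_w^2 + \sigma_b^2$ can be checked numerically. In the sketch below the function name is hypothetical; the first stratification is the one used in the example, while the second regrouping of the same six values is an assumption added to show how a heterogeneous stratification shifts variation from between strata to within strata.

```python
from fractions import Fraction


def variance_components(strata):
    """Return (within, between, total) variances, all with divisor N."""
    N = sum(len(units) for units in strata)
    grand_mean = Fraction(sum(sum(units) for units in strata), N)
    within = between = Fraction(0)
    for units in strata:
        N_h = len(units)
        mean_h = Fraction(sum(units), N_h)
        sigma2_h = sum((y - mean_h) ** 2 for y in units) / N_h
        within += N_h * sigma2_h / N
        between += N_h * (mean_h - grand_mean) ** 2 / N
    total = sum((y - grand_mean) ** 2 for units in strata for y in units) / N
    return within, between, total


homogeneous = [[2, 4, 6], [8, 12, 16]]    # stratification used in the example
heterogeneous = [[2, 12, 16], [4, 6, 8]]  # assumed alternative grouping

for strata in (homogeneous, heterogeneous):
    within, between, total = variance_components(strata)
    print(within, between, total, within + between == total)
```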

The variance of $\bar{y}_{st}$
The variance of $\bar{y}_{st}$ is, by definition,

\[ \sigma_{\bar{y}_{st}}^2 = \frac{1}{m}\sum^{m}\left(\bar{y}_{st} - \bar{Y}\right)^2. \]

Stratum 1   Stratum 2   $\hat{Y}_{st}$   $\bar{y}_{st}$   $\bar{y}_{st} - \bar{Y}$   $(\bar{y}_{st} - \bar{Y})^2$
2, 4        8, 12       39               39/6             -9/6                       81/36
2, 4        8, 16       45               45/6             -3/6                       9/36
2, 4        12, 16      51               51/6             3/6                        9/36
2, 6        8, 12       42               42/6             -6/6                       36/36
2, 6        8, 16       48               48/6             0                          0
2, 6        12, 16      54               54/6             6/6                        36/36
4, 6        8, 12       45               45/6             -3/6                       9/36
4, 6        8, 16       51               51/6             3/6                        9/36
4, 6        12, 16      57               57/6             9/6                        81/36

There are $m = \binom{3}{2}\binom{3}{2} = 9$ possible samples of size $n = n_1 + n_2 = 2 + 2 = 4$
that we can select. Hence, there are $m = 9$ possible sample means, $\bar{y}_{st}$.
Thus,

\[ \sigma_{\bar{y}_{st}}^2 = \frac{1}{9}\left(\frac{270}{36}\right) = \frac{30}{36} = \frac{5}{6}. \]

This is the basic definition of the variance of $\bar{y}_{st}$. We wish to find the
variance of $\bar{y}_{st}$ in terms of the stratum variance $s_h^2$. Note that when
the population and the samples are fairly large, the number of possible samples

\[ m = \binom{N_1}{n_1}\binom{N_2}{n_2} \]

will be so large that calculation of $\operatorname{var}(\bar{y}_{st})$ from its basic definition will
be practically difficult.
We know that

\[ \bar{y}_{st} = \frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N} = w_1\bar{y}_1 + w_2\bar{y}_2, \quad\text{where } w_h = \frac{N_h}{N} \]

are called stratum weights. Since the samples are selected by random sampling and are independent, we have

\[ \operatorname{var}(\bar{y}_{st}) = w_1^2\operatorname{var}(\bar{y}_1) + w_2^2\operatorname{var}(\bar{y}_2) = w_1^2\,\frac{N_1 - n_1}{N_1}\cdot\frac{s_1^2}{n_1} + w_2^2\,\frac{N_2 - n_2}{N_2}\cdot\frac{s_2^2}{n_2} \]

\[ = \sum_{h=1}^{L}\left(\frac{N_h}{N}\right)^2\frac{N_h - n_h}{N_h}\cdot\frac{s_h^2}{n_h} \tag{1} \]

\[ = \frac{1}{N^2}\sum_{h=1}^{L} N_h^2\,\frac{N_h - n_h}{N_h}\cdot\frac{s_h^2}{n_h} \tag{2} \]

\[ = \frac{1}{N^2}\sum_{h=1}^{L}\frac{N_h - n_h}{N_h}\cdot\frac{N_h^2 s_h^2}{n_h}. \tag{3} \]
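In our present illustration, we have $s_1^2 = 4$ and $s_2^2 = 16$. Hence, we
have

\[ \operatorname{var}(\bar{y}_{st}) = \frac{1}{N^2}\left[\frac{N_1 - n_1}{N_1}\cdot\frac{N_1^2 s_1^2}{n_1} + \frac{N_2 - n_2}{N_2}\cdot\frac{N_2^2 s_2^2}{n_2}\right] = \frac{1}{6^2}\left[\frac{3 - 2}{3}\cdot\frac{3^2\cdot 2^2}{2} + \frac{3 - 2}{3}\cdot\frac{3^2\cdot 4^2}{2}\right] = \frac{5}{6}. \]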
Example
Given the following data where samples of size $n_1 = 2$ and $n_2 = 2$ are
selected from each stratum, calculate $\operatorname{var}(\bar{y}_{st})$.

          $X_{1i}$        $X_{1i}^2$      $X_{2i}$         $X_{2i}^2$
          $X_{11} = 1$    1               $X_{21} = 10$    100
          $X_{12} = 3$    9               $X_{22} = 16$    256
          $X_{13} = 5$    25              $X_{23} = 22$    484
Total     9               35              48               840
The equation

\[ \operatorname{var}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{L}\frac{N_h - n_h}{N_h}\cdot\frac{N_h^2 s_h^2}{n_h} \]

shows that $\operatorname{var}(\bar{y}_{st})$ is dependent on $s_h^2$ when $N$, $N_h$ and $n_h$ are
given. The $s_h^2$ shows the variance within each stratum. Hence, we
obtain the very important conclusion that when the variance within
each stratum is small, $\operatorname{var}(\bar{y}_{st})$ will be small, and therefore, the
precision of $\bar{y}_{st}$ will be high.
Different ways of allocating a sample among strata
Various ways exist for allocating samples among strata. We shall learn
about two of them and how to estimate given population parameters
under their allocations.

Proportional allocation
The simplest and most frequently used way of allocating a sample
among strata is to allocate it proportionally to the size of the strata. For
example, if a sample of size $n = 50$ is to be selected from a population of
size $N = 500$, it means that the sampling fraction is to be
$n/N = 50/500 = 0.1$.
That means 10% of each stratum is to be selected for the sample. Then

\[ \frac{n_1}{N_1} = \frac{n_2}{N_2} = \cdots = \frac{n_L}{N_L} = 10\%. \]

This method compares favourably with other methods in terms of
precision, and is both simpler and more convenient to use.

Example
Consider the population

Stratum 1          Stratum 2
$Y_{11} = 2$       $Y_{21} = 8$
$Y_{12} = 4$       $Y_{22} = 12$
$Y_{13} = 6$       $Y_{23} = 16$

If we wish to select a simple random sample of size $n = 4$ by proportional allocation, find $n_h$.
Solution
We have

\[ \frac{n_1}{N_1} = \frac{n_2}{N_2} = f, \qquad f = \frac{n}{N} = \frac{4}{6}. \]

\[ \therefore\ n_1 = N_1 \times f = 3 \times \frac{4}{6} = 2 \quad\text{and}\quad n_2 = N_2 \times f = 3 \times \frac{4}{6} = 2. \]

Since we are using simple random sampling in each stratum, the
probability of any sampling unit in stratum $h$ being included in the
subsample of size $n_h$ is $n_h/N_h = f$. Hence, since $n_h/N_h = f$ for all
strata, any unit in the population has the same probability $f = 4/6$ of
being included in the sample.
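Estimator of $\bar{Y}$
The unbiased estimator of $\bar{Y}$ for stratified random sampling is

\[ \bar{y}_{st} = \frac{\sum_{h=1}^{L} N_h\bar{y}_h}{\sum_{h=1}^{L} N_h}. \]

Substituting $N_h = n_h/f$ into the above, we obtain

\[ \bar{y}_{st} = \frac{\sum_{h=1}^{L}\frac{n_h}{f}\bar{y}_h}{\sum_{h=1}^{L}\frac{n_h}{f}} = \frac{\sum_{h=1}^{L} n_h\bar{y}_h}{n} = \frac{\sum_{h=1}^{L}\sum_{i=1}^{n_h} y_{hi}}{n}. \]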
Thus, the estimator $\bar{y}_{st}$ is the sample mean of the whole sample of size $n$, and is an
unbiased estimator of $\bar{Y}$. Suppose we select the sample

$n_1$: $y_{11} = 2$, $y_{12} = 6$.

$n_2$: $y_{21} = 8$, $y_{22} = 12$.

Then, an estimate of $\bar{Y}$ is

\[ \bar{y}_{st} = \frac{2 + 6 + 8 + 12}{4} = \frac{28}{4} = 7. \]

Example
Given the following population, estimate the average number of
cigarettes a person smokes by selecting a stratified random sample of
size $n = 5$ by proportional allocation.
Stratum 1 (Males)     Stratum 2 (Females)
$Y_{11} = 20$         $Y_{21} = 10$
$Y_{12} = 25$         $Y_{22} = 12$
$Y_{13} = 35$         $Y_{23} = 8$
$Y_{14} = 30$         $Y_{24} = 6$
$Y_{15} = 24$
$Y_{16} = 26$

Solution
The sampling fraction is $f = n/N = 5/10 = 1/2$. Hence, we select
$n_1 = N_1 \times f = 6 \times \frac{1}{2} = 3$ samples from stratum 1 and $n_2 = N_2 \times f = 4 \times \frac{1}{2} = 2$
samples from stratum 2.
Suppose we select
$n_1$: $y_{11} = 25$, $y_{12} = 20$, $y_{13} = 35$ from Stratum 1
and
$n_2$: $y_{21} = 10$, $y_{22} = 6$ from Stratum 2.
Then the estimate $\bar{y}_{st}$ is

\[ \bar{y}_{st} = \frac{1}{5}(25 + 20 + 35 + 10 + 6) = 19.2 \text{ cigarettes.} \]

The true average is

\[ \bar{Y} = \frac{1}{10}(20 + 25 + \cdots + 8 + 6) = \frac{1}{10}(196) = 19.6 \text{ cigarettes.} \]
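The cigarette example can be traced in code. The two helper functions below are hypothetical names introduced for illustration: the first computes $n_h = n N_h / N$ (which may need rounding when the result is not a whole number), and the second computes $\bar{y}_{st} = \sum N_h\bar{y}_h / N$ from the selected units.

```python
from statistics import mean


def proportional_allocation(N_h, n):
    """Allocate a total sample of size n in proportion to the stratum sizes."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]


def stratified_mean(N_h, samples):
    """ybar_st = sum over h of N_h * ybar_h, divided by N."""
    N = sum(N_h)
    return sum(Nh * mean(s) for Nh, s in zip(N_h, samples)) / N


N_h = [6, 4]                         # males, females
print(proportional_allocation(N_h, 5))

samples = [[25, 20, 35], [10, 6]]    # the units selected in the example
pooled = [y for s in samples for y in s]
print(stratified_mean(N_h, samples), mean(pooled))
```

Under proportional allocation the two printed estimates agree (up to floating-point rounding), which is the point made above: $\bar{y}_{st}$ reduces to the plain mean of the pooled sample.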

Optimum allocation
In some cases, it may be necessary to conduct a sample survey with a
fixed budget, but with varying costs of selecting sampling units from
different strata. For example, when families are stratified into urban and
rural classifications in order to survey their average income, the cost of
selecting sampling units from urban and rural families will usually differ.
The cost function is given by
\[ c = c_o + \sum_{h=1}^{L} c_h n_h, \]
where $c_o$ is the fixed cost and $c_h$ is the variable cost. The fixed costs
include office rent, fixed administrative costs, equipment costs, etc., and
the variable cost $c_h$ is the cost per sampling unit in stratum $h$. Hence,
$c_h n_h$ is the cost of selecting $n_h$ families from stratum $h$. Now,
\[ C = \sum_{h=1}^{L} c_h n_h, \qquad (1) \]
where C c  co . Thus, our problem is as follows: Given a fixed budget


C, select a sample of size n, and allocate this sample into n1 , n2 ,  , n L
so that the sampling Lvariance of the average
2
income
1 N h  nh s h
var ( y st )  2
N

h 1
N 2
h
Nh
 ................................................( 2)
nh
is minimum. In other words, find the values n h that will minimize (2)
N h sh ch
subject to (1). The result is nh  n
 N s  h h ch

150
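The allocation result is stated without proof. One standard route to it, sketched here under the assumption that the $n_h$ may be treated as continuous variables, is a Lagrange-multiplier argument. The multiplier $\lambda$ and the function $\mathcal{L}$ are symbols introduced only for this sketch.

```latex
\begin{align*}
\operatorname{var}(\bar{y}_{st})
  &= \frac{1}{N^2}\sum_{h=1}^{L}\frac{N_h^2 s_h^2}{n_h}
   - \frac{1}{N^2}\sum_{h=1}^{L} N_h s_h^2
   && \text{(only the first sum depends on } n_h\text{)} \\
\mathcal{L}
  &= \frac{1}{N^2}\sum_{h=1}^{L}\frac{N_h^2 s_h^2}{n_h}
   + \lambda\Bigl(\sum_{h=1}^{L} c_h n_h - C\Bigr) \\
\frac{\partial \mathcal{L}}{\partial n_h}
  &= -\frac{N_h^2 s_h^2}{N^2 n_h^2} + \lambda c_h = 0
  \;\Longrightarrow\;
  n_h = \frac{1}{N\sqrt{\lambda}}\cdot\frac{N_h s_h}{\sqrt{c_h}} \\
n = \sum_{h=1}^{L} n_h
  &= \frac{1}{N\sqrt{\lambda}}\sum_{h=1}^{L}\frac{N_h s_h}{\sqrt{c_h}}
  \;\Longrightarrow\;
  \frac{n_h}{n} = \frac{N_h s_h/\sqrt{c_h}}{\sum_{h=1}^{L} N_h s_h/\sqrt{c_h}}
\end{align*}
```

Substituting the expression for $n_h$ into the budget constraint (1) also fixes the total sample size: $n = C \sum_h (N_h s_h/\sqrt{c_h}) \big/ \sum_h (N_h s_h \sqrt{c_h})$.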
Note that $n_h$ is proportional to $N_h s_h / \sqrt{c_h}$: a stratum receives a
larger share of the sample when it is larger, more variable, or cheaper to sample.

Example
Given the following population, allocate the sample of size $n = 4$
by optimum allocation. Assume $c_1 =$ GHc 1.00 and $c_2 =$ GHc 4.00.
Stratum 1    Stratum 2
2            8
2            8
4            12
6            16
6            16

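Solution
The stratum means are $\bar{Y}_1 = 20/5 = 4$ and $\bar{Y}_2 = 60/5 = 12$, so
\[ s_1^2 = \frac{1}{N_1 - 1} \sum_{i=1}^{N_1} (Y_{1i} - \bar{Y}_1)^2
= \frac{1}{5 - 1}\left[(2-4)^2 + (2-4)^2 + (4-4)^2 + (6-4)^2 + (6-4)^2\right]
= \frac{1}{4}(16) = 4, \]
\[ s_2^2 = \frac{1}{N_2 - 1} \sum_{i=1}^{N_2} (Y_{2i} - \bar{Y}_2)^2
= \frac{1}{5 - 1}\left[(8-12)^2 + (8-12)^2 + (12-12)^2 + (16-12)^2 + (16-12)^2\right]
= \frac{1}{4}(64) = 16. \]
$\therefore\; s_1 = 2$ and $s_2 = 4$.
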
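Hence, we set up the following table.
Stratum   $N_h$   $s_h$   $s_h^2$   $c_h$   $\sqrt{c_h}$   $N_h s_h$   $N_h s_h / \sqrt{c_h}$
I         5       2       4         1       1              10          10
II        5       4       16        4       2              20          10

\[ n_1 = n \, \frac{N_1 s_1 / \sqrt{c_1}}{\sum N_h s_h / \sqrt{c_h}} = \frac{10}{20}(4) = 2 \]
\[ n_2 = n \, \frac{N_2 s_2 / \sqrt{c_2}}{\sum N_h s_h / \sqrt{c_h}} = \frac{10}{20}(4) = 2 \]
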
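The table can be reproduced with a few lines of Python. This is a minimal sketch: `optimum_allocation` is a hypothetical helper name, and `statistics.stdev` is used because it divides by $N_h - 1$, matching the definition of $s_h$ in the solution.

```python
from math import sqrt
from statistics import stdev

def optimum_allocation(strata, costs, n):
    """Allocate n across strata in proportion to N_h * s_h / sqrt(c_h)."""
    weights = [len(y) * stdev(y) / sqrt(c) for y, c in zip(strata, costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

stratum_1 = [2, 2, 4, 6, 6]
stratum_2 = [8, 8, 12, 16, 16]

# Costs per unit are GHc 1.00 and GHc 4.00, as in the example.
allocation = optimum_allocation([stratum_1, stratum_2], costs=[1.0, 4.0], n=4)
print(allocation)
```

Both strata receive two units here because the larger spread in Stratum 2 is exactly offset by its higher unit cost. With other data the values returned are usually not whole numbers and must be rounded.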
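Suppose we select
         I                  II
         $y_{11} = 4$       $y_{21} = 8$
         $y_{12} = 6$       $y_{22} = 16$
Total    $y_1 = 10$         $y_2 = 24$
Mean     $\bar{y}_1 = 5$    $\bar{y}_2 = 12$
Then
\[ \bar{y}_{st} = \frac{N_1 \bar{y}_1 + N_2 \bar{y}_2}{N} = \frac{5(5) + 5(12)}{10} = 8.5. \]
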
SYSTEMATIC SAMPLING
We will now learn about systematic sampling, which is more convenient
than simple random sampling and which ensures that each unit has an equal
chance of being included in the sample.

Introduction
In systematic sampling, we select every $k$th unit, starting with the unit that
corresponds to a number $r$ chosen at random from 1 to $k$, where $k$ is an
integer such that $k = N/n$. The sample consists of the units
numbered $r, r + k, r + 2k, \ldots, r + (n - 1)k$.
The random number r is called the random start and k is called
the sampling interval. A sample selected by this procedure is
called a systematic sample with a random start. One can easily
see that r determines the entire sample. In this procedure, we
select with equal probability one of the k possible groups or
samples. Besides the operational convenience, systematic
sampling provides estimators that are more efficient than those
provided by SRS under certain conditions that are reasonable in
practice. Systematic sampling provides a useful alternative to
SRS for the following reasons:
1. Systematic sampling is easier to perform in the field and hence is less
subject to the selection errors by field workers than are simple
random samples especially if a good frame is not available.
2. Systematic sampling can provide greater information per unit cost
than simple random sampling can provide. A systematic sample is
generally more uniformly spread over the entire population and thus
may provide more information about the population than an
equivalent amount of data contained in a simple random sample.
Systematic sampling often suggests itself when there is a sequence
of units occurring naturally in space (trees in a forest) or time
(landing of fishing boats on the coast).
Sample Selection Procedure
A sample which is obtained by systematic sampling may be expressed as
“a 1 in 5 sample” or “a 1 in 10 sample”. Thus, in general, a sample which
is obtained by systematic sampling is expressed as “a 1 in $k$ sample”. This
means that the sampling fraction is $1/k$. There are two common
methods for selecting a sample by systematic sampling, which we shall
call Methods A and B.

Method A
Suppose we have a population
$Y_1 \; Y_2 \; Y_3 \; Y_4 \; Y_5 \; Y_6 \; Y_7 \; Y_8 \; Y_9 \; Y_{10} \; Y_{11} \; Y_{12}$
and wish to select a 1 in $k = 3$ systematic sample. The procedure is to
randomly select a sampling unit from the first three sampling units.
Suppose this is the second unit, $Y_2$. Then every third unit from $Y_2$ is
selected into the sample. Hence, we obtain a sample of size 4 given by
$Y_2, Y_5, Y_8, Y_{11}$. Thus, the population has been divided into
$n = \frac{N}{k} = \frac{12}{3} = 4$ strata, and from each stratum we select 1 sampling unit.

Note that N nk , and therefore, the population is an exact multiple of k.
Since we select 1 sampling unit from each stratum, k 3 shows the
number of possible systematic samples that can be selected, and n 4
is the size of each sample. The k 3 possible samples that can be
obtained from the given population are shown in the table below.
Sample 1 Sample 2 Sample 3
Y1 Y2 Y3
Y4 Y5 Y6
Y7 Y8 Y9
Y10 Y11 Y12
160
Since the starting sampling unit is randomly selected from the first $k = 3$
units, the probability of selecting any one of these $k = 3$ sampling units is
$1/k = 1/3$, and the probability of selecting any one of these systematic
samples is also $1/k = 1/3$.

Let us consider the case where $N \neq nk$. Suppose we wish to select a 1 in
$k = 5$ sample. Then
\[ \frac{N}{k} = \frac{12}{5} = 2\tfrac{2}{5}. \]
Hence, the sample size will be either 2 or 3. That is,
\[ 2 < \frac{12}{5} = 2\tfrac{2}{5} < 3. \]
The procedure is to select a random starting point from the first $k = 5$
sampling units, and then select every 5th sampling unit from the
starting point. The samples are shown in the table below.
Sample 1    Sample 2    Sample 3    Sample 4    Sample 5
$Y_1$       $Y_2$       $Y_3$       $Y_4$       $Y_5$
$Y_6$       $Y_7$       $Y_8$       $Y_9$       $Y_{10}$
$Y_{11}$    $Y_{12}$

The probability of selecting any one of these samples is $1/5$.

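Method A translates directly into code. The sketch below uses unit labels 1 to $N$ in place of $Y_1, \ldots, Y_N$; the function names are illustrative only. It lists every possible 1 in $k$ sample and also draws one sample with a random start.

```python
import random

def systematic_samples(N, k):
    """List the k possible 1-in-k systematic samples as unit labels 1..N."""
    return [list(range(r, N + 1, k)) for r in range(1, k + 1)]

def draw_method_a(N, k, rng=random):
    """Method A: pick a random start r in 1..k, then take every k-th unit."""
    r = rng.randint(1, k)          # randint includes both endpoints
    return list(range(r, N + 1, k))

exact_multiple = systematic_samples(12, 3)   # N = nk: every sample has 4 units
not_multiple = systematic_samples(12, 5)     # N != nk: sizes are 3, 3, 2, 2, 2
one_draw = draw_method_a(12, 5)

print(exact_multiple)
print(not_multiple)
print(one_draw)
```

When $N$ is not a multiple of $k$, the samples with the smaller starting numbers pick up the extra unit, which is why the sample size depends on the random start.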
Example
Select a 1 in 8 systematic sample from a population of size $N = 37$.

Solution
The sample size is obtained as
\[ \frac{N}{k} = \frac{37}{8} = 4\tfrac{5}{8}. \]
Hence, the sample size will be either 4 or 5. That is,
\[ 4 < 4\tfrac{5}{8} < 5. \]
The samples are listed in the table below.
I II III IV V VI VII VIII
Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8
Y9 Y10 Y11 Y12 Y13 Y14 Y15 Y16
Y17 Y18 Y19 Y20 Y21 Y22 Y23 Y24
Y25 Y26 Y27 Y28 Y29 Y30 Y31 Y32
Y33 Y34 Y35 Y36 Y37

Method B (Remainder Method)
Assume that $N = nk = 12$, and suppose that we wish to select a 1 in
$k = 3$ sample. A sampling unit (say, the $j$th unit) is randomly selected
from the population. Let $j$ be the 8th unit. Then
\[ \frac{j}{k} = \frac{8}{3} = 2 \text{ with remainder } r = 2. \]

Note that r 2  k 3 and that the values r can take will be 0, 1, and 2.
When r 1, select Y1 ; when r 2 , select Y2 ; and when r 0, select
Y3 k 3 rd

as the starting point. Then select every sampling unit from


the starting point. The systematic samples obtained are given in the
Sample 1 Sample 2 Sample 3
table below.
Y1 Y2 Y3
Y4 Y5 Y6
Y7 Y8 Y9
Y10 Y11 Y12
166
Now, consider the case where $N \neq nk$ and assume that $N = 11$. If we
wish to select a 1 in $k = 3$ sample, then we find that
\[ \frac{j}{k} = \frac{8}{3} = 2 \text{ with remainder } r = 2. \]
Following the same procedure as for $N = nk$, we obtain the samples
shown in the table below.
Sample 1    Sample 2    Sample 3
$Y_1$       $Y_2$       $Y_3$
$Y_4$       $Y_5$       $Y_6$
$Y_7$       $Y_8$       $Y_9$
$Y_{10}$    $Y_{11}$
Note that in both cases of $N = nk$ and $N \neq nk$, the samples are the
same for the two methods.

The main characteristic of this procedure is that the probability of
selecting a systematic sample is $n/N$, where $n$ is the size of that sample,
and not $1/k$ as in the case of Method A. For example, the probability of
selecting each of $Y_2$, $Y_5$, $Y_8$, or $Y_{11}$ is $1/11$. When any one of
these is selected, we obtain the systematic sample 2. Hence, the probability
of selecting this sample is
\[ \frac{1}{11} + \frac{1}{11} + \frac{1}{11} + \frac{1}{11} = \frac{4}{11}. \]
Similarly, the probability of selecting the systematic sample $\{Y_3, Y_6, Y_9\}$
will be $3/11$.
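The remainder rule and the resulting selection probabilities can be sketched as follows. The helper names are hypothetical, and the probabilities are obtained simply by counting how many of the $N$ equally likely units lead to each starting point.

```python
from fractions import Fraction

def method_b_start(j, k):
    """Map a randomly chosen unit j (1..N) to its starting unit via j mod k."""
    r = j % k
    return r if r != 0 else k      # remainder 0 means start at unit k

def method_b_probabilities(N, k):
    """Probability of each systematic sample when j is uniform on 1..N."""
    counts = {start: 0 for start in range(1, k + 1)}
    for j in range(1, N + 1):
        counts[method_b_start(j, k)] += 1
    return {start: Fraction(c, N) for start, c in counts.items()}

start_for_8 = method_b_start(8, 3)            # 8 = 2*3 + 2, so start at unit 2
probs_11 = method_b_probabilities(11, 3)      # the N = 11 case in the text
probs_12 = method_b_probabilities(12, 3)      # the N = nk = 12 case

print(start_for_8)
print(probs_11)
print(probs_12)
```

For $N = 12$ all three samples come out equally likely, so Methods A and B agree; for $N = 11$ the two four-unit samples are favoured over the three-unit sample.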
Estimator of the population mean
If the sampling procedure of Method A is used when $N = nk$, the
sample mean of the systematic sample is an unbiased estimator of the
population mean. When $N \neq nk$, the sample mean is a biased
estimator of the population mean. However, if the sampling procedure
of Method B is used, the sample mean of the systematic sample will be
an unbiased estimator of the population mean regardless of whether
$N = nk$ or $N \neq nk$.

To show that the sample mean of a systematic sample is an unbiased
estimator of the population mean when using Method A with $N = nk$,
let
\[ \bar{y}_i = \frac{1}{n} \sum_{j=1}^{n} y_{ij} \qquad (1) \]
be the sample mean for the $i$th systematic sample. Then
\[ E(\bar{y}_{sys}) = \frac{1}{k}(\bar{y}_1 + \bar{y}_2 + \cdots + \bar{y}_k) \]
since there are only $k$ possible samples we can select, and the probability
of selecting a certain systematic sample is $1/k$. Using our example with
$N = 12$, $k = 3$, and $n = 4$, we have
\[ E(\bar{y}_{sys}) = \frac{1}{k}(\bar{y}_1 + \bar{y}_2 + \bar{y}_3)
= \frac{1}{k} \cdot \frac{1}{n}(Y_1 + Y_2 + \cdots + Y_N)
= \frac{1}{12}(Y_1 + Y_2 + \cdots + Y_{12})
= \bar{Y}. \]
Hence,
\[ E(\bar{y}_{sys}) = \bar{Y}. \]

Example 1
Given the number of books $N = 9$ children have, select a 1 in 3 sample
by systematic sampling and estimate the population mean.
1, 2, 3, 4, 5, 6, 7, 8, 9

Solution
Note that $N = 9 = 3 \times 3 = nk$. There are 3 systematic samples that can
be selected, namely:

         Sample 1    Sample 2    Sample 3
         1           2           3
         4           5           6
         7           8           9
Total    12          15          18

Hence, the sample means are $\bar{y}_1 = 12/3 = 4$, $\bar{y}_2 = 15/3 = 5$, and
$\bar{y}_3 = 18/3 = 6$, which are the estimates of the population mean.

Example 2
Using the data in Example 1 above, show that the sample mean of
the systematic sample is an unbiased estimator of the population mean,
that is, $E(\bar{y}_{sys}) = \bar{Y}$.

Solution
We have $E(\bar{y}_{sys}) = \frac{1}{3}(4 + 5 + 6) = 5$.
But $\bar{Y} = \frac{1}{9}(1 + 2 + \cdots + 9) = \frac{45}{9} = 5$.

Therefore, $E(\bar{y}_{sys}) = \bar{Y}$.
Example 3
Suppose that the data in Example 1 are arranged in the following order:
6, 3, 4, 9, 2, 5, 1, 7, 8
Then we have the 3 systematic samples
         Sample 1    Sample 2    Sample 3
         6           3           4
         9           2           5
         1           7           8
Total    16          12          17

The 3 sample means are $\bar{y}_1 = 16/3$, $\bar{y}_2 = 12/3$, and $\bar{y}_3 = 17/3$.
Therefore,
\[ E(\bar{y}_{sys}) = \frac{1}{3}\left(\frac{16}{3} + \frac{12}{3} + \frac{17}{3}\right) = \frac{45}{9} = 5 = \bar{Y}. \]
That is, $\bar{y}_{sys}$ is an unbiased estimator of $\bar{Y}$.

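Examples 1 to 3 can be verified numerically. The sketch below (with an illustrative function name) averages the $k$ sample means with equal weight $1/k$, which is the Method A expectation when $N = nk$, and applies it to both orderings of the book data.

```python
from fractions import Fraction

def expected_mean_method_a(values, k):
    """Return (E(ybar_sys), sample means) with each sample having probability 1/k."""
    means = [Fraction(sum(values[r::k]), len(values[r::k])) for r in range(k)]
    return sum(means) / k, means

books = [1, 2, 3, 4, 5, 6, 7, 8, 9]        # order used in Examples 1 and 2
reordered = [6, 3, 4, 9, 2, 5, 1, 7, 8]    # order used in Example 3

expect_1, means_1 = expected_mean_method_a(books, 3)
expect_3, means_3 = expected_mean_method_a(reordered, 3)
population_mean = Fraction(sum(books), len(books))

print(means_1, expect_1)
print(means_3, expect_3)
print(population_mean)
```

The individual sample means change with the ordering of the list, but their average equals the population mean in both cases.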
Example 4
Given the population $Y_1 \; Y_2 \; Y_3 \; Y_4 \; Y_5 \; Y_6 \; Y_7 \; Y_8 \; Y_9 \; Y_{10}$, select
a 1 in 3 systematic sample by Method A and show that the mean of the
systematic sample is a biased estimator of $\bar{Y}$.

Solution
We have
\[ \frac{N}{k} = \frac{10}{3} = 3\tfrac{1}{3}. \]
Hence, the sample size is 3 or 4, and the samples are
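Sample 1    Sample 2    Sample 3
$Y_1$       $Y_2$       $Y_3$
$Y_4$       $Y_5$       $Y_6$
$Y_7$       $Y_8$       $Y_9$
$Y_{10}$
Hence,
\[ E(\bar{y}_{sys}) = \frac{1}{3}\left[\frac{1}{4}(Y_1 + Y_4 + Y_7 + Y_{10}) + \frac{1}{3}(Y_2 + Y_5 + Y_8) + \frac{1}{3}(Y_3 + Y_6 + Y_9)\right]
\neq \frac{1}{10}(Y_1 + Y_2 + \cdots + Y_{10}) = \bar{Y}, \]
and so $\bar{y}_{sys}$ is a biased estimator of $\bar{Y}$.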
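Example 4 is symbolic, so a numerical illustration needs made-up values. The ten numbers below are hypothetical (the Example 3 arrangement with a tenth value of 10 appended) and were chosen to be irregular: for the evenly spaced values 1, 2, ..., 10 the two sides happen to coincide, so such data would not reveal the bias.

```python
from fractions import Fraction

def sample_means(values, k):
    """Means of the k systematic samples (start positions 1..k)."""
    return [Fraction(sum(values[r::k]), len(values[r::k])) for r in range(k)]

def expectation_method_a(values, k):
    """Method A gives every sample probability 1/k, whatever its size."""
    return sum(sample_means(values, k)) / k

# Hypothetical population of N = 10 values; not taken from the notes.
values = [6, 3, 4, 9, 2, 5, 1, 7, 8, 10]

expectation = expectation_method_a(values, 3)
population_mean = Fraction(sum(values), len(values))
bias = expectation - population_mean

print(sample_means(values, 3))
print(expectation, population_mean, bias)
```

The bias arises because the four-unit sample and the three-unit samples are given the same probability $1/3$, so units in the smaller samples carry more weight than units in the larger one.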
To show that the sample mean of a systematic sample is an unbiased
estimator of the population mean when using Method B with $N \neq nk$,
suppose we have the population
$Y_1 \; Y_2 \; Y_3 \; Y_4 \; Y_5 \; Y_6 \; Y_7 \; Y_8 \; Y_9 \; Y_{10} \; Y_{11}$
If $k = 3$, then $n = 3$ or 4. Thus, we obtain the samples
Sample 1    Sample 2    Sample 3
$Y_1$       $Y_2$       $Y_3$
$Y_4$       $Y_5$       $Y_6$
$Y_7$       $Y_8$       $Y_9$
$Y_{10}$    $Y_{11}$
The probabilities of samples 1, 2, and 3 are $4/11$, $4/11$, and $3/11$,
respectively. Hence,
\[ E(\bar{y}_{sys}) = \frac{4}{11}\bar{y}_1 + \frac{4}{11}\bar{y}_2 + \frac{3}{11}\bar{y}_3 \]
\[ = \frac{4}{11}\left[\frac{1}{4}(Y_1 + Y_4 + Y_7 + Y_{10})\right] + \frac{4}{11}\left[\frac{1}{4}(Y_2 + Y_5 + Y_8 + Y_{11})\right] + \frac{3}{11}\left[\frac{1}{3}(Y_3 + Y_6 + Y_9)\right] \]
\[ = \frac{1}{11}(Y_1 + Y_2 + Y_3 + Y_4 + Y_5 + Y_6 + Y_7 + Y_8 + Y_9 + Y_{10} + Y_{11}) = \bar{Y}. \]
Hence, $\bar{y}_{sys}$ is an unbiased estimator of $\bar{Y}$. This will also be the case
when $N = nk$.

Example 5
Given the number of books $N = 8$ children have, select a 1 in 3 sample by
systematic sampling and estimate the population mean by using Method
B (Remainder method).
1, 2, 3, 4, 5, 6, 7, 8

Solution
Assume we select the $j = 7$th unit. Then $\frac{j}{k} = \frac{7}{3} = 2$ with remainder $r = 1$.

The 3 systematic samples are
         Sample 1    Sample 2    Sample 3
         1           2           3
         4           5           6
         7           8
Total    12          15          9
The probabilities associated with these samples are $3/8$, $3/8$, and $2/8$.

Hence, the expected value of $\bar{y}_{sys}$ is
\[ E(\bar{y}_{sys}) = \frac{3}{8}\left(\frac{12}{3}\right) + \frac{3}{8}\left(\frac{15}{3}\right) + \frac{2}{8}\left(\frac{9}{2}\right) = 4.5. \]
The population mean is $\bar{Y} = 36/8 = 4.5$ and so $E(\bar{y}_{sys}) = \bar{Y}$. That is,
$\bar{y}_{sys}$ is an unbiased estimator of $\bar{Y}$.
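The calculation in Example 5 can be reproduced by weighting each sample mean by its Method B probability $n_i/N$, where $n_i$ is the size of sample $i$. The function name below is illustrative only.

```python
from fractions import Fraction

def expectation_method_b(values, k):
    """E(ybar_sys) when sample i is drawn with probability n_i / N."""
    N = len(values)
    expectation = Fraction(0)
    for r in range(k):
        sample = values[r::k]                     # units r+1, r+1+k, ...
        prob = Fraction(len(sample), N)           # n_i / N
        expectation += prob * Fraction(sum(sample), len(sample))
    return expectation

books = [1, 2, 3, 4, 5, 6, 7, 8]
expectation = expectation_method_b(books, 3)
population_mean = Fraction(sum(books), len(books))

print(float(expectation), float(population_mean))
```

Because each sample mean is multiplied by $n_i/N$, the sample sizes cancel and every unit contributes with weight $1/N$, which is why Method B is unbiased whether or not $N$ is a multiple of $k$.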
It should not be concluded that random sampling always yields results
that are superior to those of non-random sampling, nor that the samples
obtained by non-random methods are necessarily less “representative” of
the population under study. The choice between random and non-random
sampling ultimately turns on a judgment of the relative size of the
sampling error of the random sample versus the combined sampling error
and selection bias of a non-random sample. For a given cost, one will
normally be able to select a larger non-random sample than random
sample. This means that the sampling error should be lower in the non-
random sample.
However, a selection bias will have been introduced by the non-random
process used for selecting the sample.

The choice between probability and non-probability sampling should be
based on the nature of the research, the degree of error tolerance, the
relative magnitude of sampling and non-sampling errors, the variability
in the population, and statistical and operational considerations.
