ST 318 Test 2-3

Sampling

UNIVERSITY OF DAR ES SALAAM

College of Social Science (CoSS)

Department of Statistics

POPULATION WITH TREND

GROUP 1 REJECTED QUESTION
a) If the random start in drawing the sample is i = 19 with N = 23 and n = 5, then units 19, 1, 6, 11, 16
constitute the sample, and the first and last members are Y1 and Y19 respectively. Find the weights applied to the first and last members.
Solution
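When $i + (n-1)k > N$, the weights applied to the first and last members are

$$1 \pm \frac{n}{2(N-k)}\left[2i + (n-1)k - (N+1) - \frac{2 n_2 N}{n}\right]$$

For $N = 23$, $n = k = 5$, $i = 19$ and $n_2 = 4$ we have $i + (n-1)k = 39 > 23$, so

$$1 \pm \frac{5}{2(23-5)}\left[(2 \times 19) + (5-1)5 - (23+1) - \frac{2 \times 4 \times 23}{5}\right] = 1 \pm \frac{5}{2(18)}\left[-\frac{14}{5}\right] = 1 \pm \left[-\frac{7}{18}\right]$$

So,

Weight of Y1 $= 1 + \left[-\frac{7}{18}\right] = \frac{11}{18}$

Weight of Y19 $= 1 - \left[-\frac{7}{18}\right] = \frac{25}{18}$

The weight of the first member is 11/18.

The weight of the last member is 25/18.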
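The arithmetic in this solution can be checked with a short script. This is only a sketch of the weight formula as it is used in the solution above; the function name `end_weights` and its argument names are illustrative assumptions, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def end_weights(N, n, k, i, n2):
    """Weights of the first and last sample members, using the
    formula quoted in the solution (hypothetical helper name)."""
    bracket = 2 * i + (n - 1) * k - (N + 1) - Fraction(2 * n2 * N, n)
    correction = Fraction(n, 2 * (N - k)) * bracket
    return 1 + correction, 1 - correction

first, last = end_weights(N=23, n=5, k=5, i=19, n2=4)
print(first, last)  # 11/18 25/18
```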
GROUP 1 NEW QUESTION
Question
POPULATION WITH TREND
a) What is population with linear trend?

Answer
Population with linear trend: a population, such as a time series, in which the variable changes by a
constant amount in each time period.

b) The human resources department at company ABC has 42 workers in it. In order to find out
some information about the department as a whole, we want to take a sample of 7 of those workers to
interview. If we have an ordered list of workers numbered 1 through 42, and we start at worker
number 3, which workers would be included in our sample?
Solution.
N = 42
n = 7
The sampling interval is k = N/n
k = 42/7
k = 6
Starting at worker number 3 and adding k = 6 each time gives
3, 9, 15, 21, 27, 33, 39
Therefore, the workers included in our sample are 3, 9, 15, 21, 27, 33 and 39.
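The selection rule used here (start at a given unit, then take every k-th unit) can be sketched in a few lines. The function name `systematic_sample` is an assumption for illustration, and the sketch assumes N is an exact multiple of n and that the start is no larger than k.

```python
def systematic_sample(N, n, start):
    """Linear systematic sample: start, start + k, start + 2k, ...
    Assumes N is divisible by n and 1 <= start <= k."""
    k = N // n  # sampling interval
    return [start + j * k for j in range(n)]

sample = systematic_sample(N=42, n=7, start=3)
print(sample)  # [3, 9, 15, 21, 27, 33, 39]
```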
ST 318 GROUP 2 TEST 2

Instructions.

Answer all questions

a) What is probability proportional to size sampling? What are the methods used in this sampling
technique?

Probability proportional to size (PPS) sampling is a sampling method in which units have
unequal chances of being selected. Units are selected according to their size: a
unit with a large size has a higher probability of being selected than a unit with a small size.

The methods used in this sampling technique are:

• Cumulative selection method

• Rejection method

b) Given the dataset of the 500 BC population with income (xi) and expenditure (yi), with N = 100,
X = 150 and n = 5, as shown below.
i    xi    yi
1    50    100
2    40    90
3    30    80
4    20    70
5    10    60

Estimate Ŷpps and its variance.

GROUP 3 TEST QUESTION
a) In multi-stage sampling, what do we care about most when optimally selecting samples from different
groups?
In multi-stage sampling, we care most about minimizing the cost of selecting samples
from the different groups and minimizing the variance of the resulting estimates, to ensure
precision and relevance.

b)
GROUP 4: ST 318
a) Define double sampling
Refers to the design in which initially a sample of units is selected for obtaining auxiliary
information only, and then a second sample is selected in which the variable of interest is
observed in addition to the auxiliary information.

b) Given
N = 600, n = 12, n1 = 4, n2 = 8, $S_1^2 = 696.97$ and $S_2^2 = 4169.70$. A large sample of 100 is selected
using SRS, so that n' = 100, n1' = 40, n2' = 60. In the second phase, n1 = 4 units are drawn
from n1' = 40, giving y1i = 30, 40, 60, 70, and n2 = 8 units are drawn from n2' = 60, giving
y2i = 50, 70, 80, 100, 110, 140, 150, 180. Estimate ȳst and its variance.
SOLUTION
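From

$$\bar{y}_{st} = \sum_{h=1}^{2} w_h \bar{y}_h, \quad \text{where } w_h = \frac{n'_h}{n'}$$

$$w_1 = \frac{40}{100} = 0.4, \qquad w_2 = \frac{60}{100} = 0.6$$

$$\bar{y}_1 = \frac{30+40+60+70}{4} = 50$$

$$\bar{y}_2 = \frac{50+70+80+100+110+140+150+180}{8} = 110$$

$$\bar{y}_{st} = w_1 \bar{y}_1 + w_2 \bar{y}_2 = 0.4 \times 50 + 0.6 \times 110 = 86$$

∴ Therefore $\bar{y}_{st} = 86$
➢ To estimate the variance of the estimate
From

$$Var(\bar{y}_{st}) = \sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} - \sum_{h=1}^{2} \frac{w_h S_h^2}{N} + \frac{g'}{n'} \sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2$$

Hence

➢ $$\sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} = \frac{w_1^2 S_1^2}{n_1} + \frac{w_2^2 S_2^2}{n_2} = \frac{0.4^2 \times 696.97}{4} + \frac{0.6^2 \times 4169.70}{8} = \frac{2155153}{10000}$$

➢ $$\sum_{h=1}^{2} \frac{w_h S_h^2}{N} = \frac{w_1 S_1^2 + w_2 S_2^2}{N} = \frac{1}{600}(0.4 \times 696.97 + 0.6 \times 4169.70) = \frac{2780608}{600000}$$

➢ $$\frac{g'}{n'} = \frac{N - n'}{n'(N-1)} = \frac{600 - 100}{100(600 - 1)} = \frac{5}{599}$$

➢ $$\sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2 = w_1 (\bar{y}_1 - \bar{y}_{st})^2 + w_2 (\bar{y}_2 - \bar{y}_{st})^2 = 0.4(50 - 86)^2 + 0.6(110 - 86)^2 = 864$$

Then

$$Var(\bar{y}_{st}) = \frac{2155153}{10000} - \frac{2780608}{600000} + \frac{5}{599} \times 864 = 218.09$$

∴ The estimated variance of the estimate is 218.09

But the first answer from the question provider was 217.9319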
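To make the three-term variance formula easier to re-check, here is a minimal Python sketch of the double-sampling-for-stratification estimate and its variance as used in this solution. The function name `double_sampling_stratified` and the tuple layout of `strata` are assumptions made for illustration; the stratum variances are passed in as given in the question rather than recomputed from the data.

```python
def double_sampling_stratified(N, n_prime, strata):
    """strata: list of (n_h_prime, n_h, ybar_h, s2_h) tuples, one per stratum.
    Returns (ybar_st, estimated variance), following the formula above."""
    w = [nhp / n_prime for nhp, _, _, _ in strata]          # phase-1 weights
    ybar_st = sum(wh * yb for wh, (_, _, yb, _) in zip(w, strata))
    term1 = sum(wh ** 2 * s2 / nh for wh, (_, nh, _, s2) in zip(w, strata))
    term2 = sum(wh * s2 for wh, (_, _, _, s2) in zip(w, strata)) / N
    g = (N - n_prime) / (N - 1)                              # g' in the text
    term3 = g / n_prime * sum(
        wh * (yb - ybar_st) ** 2 for wh, (_, _, yb, _) in zip(w, strata)
    )
    return ybar_st, term1 - term2 + term3

# Data from the question: (n_h', n_h, ybar_h, S_h^2)
strata = [(40, 4, 50, 696.97), (60, 8, 110, 4169.70)]
ybar_st, var_st = double_sampling_stratified(N=600, n_prime=100, strata=strata)
print(round(ybar_st, 2), round(var_st, 2))
```

The same function can be reused for any two-phase stratified problem of this form by changing `N`, `n_prime` and the `strata` list.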
GROUP 5 QUESTION

PART A
Explain the following terms, which provide a basis for discussing optimum allocation and
estimated variance in double sampling for stratification.
i) Neyman allocation: An allocation strategy that minimizes the variance of the estimator
under the assumption of known stratum variances. Neyman allocation is often used as a
benchmark for evaluating the efficiency of alternative allocation strategies.

ii) Proportional allocation: A type of allocation where the sample size allocated to each
stratum is proportional to the size of the stratum in the population. Proportional allocation
is often used as a starting point for determining the optimum allocation.

iii) Efficiency of an estimator: The ability of an estimator to provide accurate estimates using the
minimum amount of resources or sample units. Optimum allocation aims to maximize
efficiency by minimizing the variance of the estimator.
PART B
In a simple random sample of 374 households from a large district, 292 were occupied by
white families and 82 by nonwhite families. A sample of about one in four households gave
the following data on ownership.
            Owned   Rented   Total
White       31      43       74
Nonwhite    4       14       18
GROUP 6 QUESTION

a) What is probability proportional to size sampling?

Probability proportional to size (PPS) sampling is a sampling method in which units have
unequal chances of being selected. Units are selected according to their size: a
unit with a large size has a higher probability of being selected than a unit with a small size.

OR: It is the sampling process in which each element of the population has a chance Pi of being
selected into the sample at each draw.

b) Consider the following information about income (yi) and expenditure (Xi) for 10
observations from Mitumba Village, which has 150 people.
S/N   1    2    3     4    5    6     7     8    9    10
yi    78   74   104   88   96   109   102   72   93   84
Xi    26   29   56    31   52   55    71    31   54   40

From the above table, estimate Ŷ and its variance.

Solution:N=150, X=445

i     yi    Xi   Pi = Xi/X   yi/Pi      (yi/Pi − Ŷ)²
1     78    26   0.058       1344.828   145535.383
2     74    29   0.065       1138.462   30668.766
3     104   56   0.126       825.397    19027.444
4     88    31   0.07        1257.143   86321.966
5     96    52   0.117       820.513    20398.695
6     109   55   0.124       879.032    7107.333
7     102   71   0.16        637.5      106169.751
8     72    31   0.07        1028.571   4255.475
9     93    54   0.121       768.595    37924.447
10    84    40   0.09        933.333    900.24
Sum                          9633.374   458309.498

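a) Estimate of Ŷ

$$\hat{Y} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{P_i} = \frac{1}{10}(9633.374) = 963.337$$

b) Variance of Ŷ

Note: the variance as a whole also carries a hat (^) symbol, since it is an estimate.

$$\widehat{Var}(\hat{Y}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left(\frac{y_i}{P_i} - \hat{Y}\right)^2$$

$$= \frac{1}{10 \times 9}(458{,}309.498) = 5{,}092.328$$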
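The table and the two formulas can be reproduced with a short script. This is a sketch only: the function name `hansen_hurwitz` is an illustrative assumption, and the selection probabilities are rounded to three decimals because that is what the table does (using unrounded Pi = Xi/X would give slightly different figures).

```python
def hansen_hurwitz(y, p):
    """PPS (with replacement) estimate and its estimated variance:
    mean of y_i/p_i, and sum of squared deviations over n(n-1)."""
    n = len(y)
    ratios = [yi / pi for yi, pi in zip(y, p)]
    estimate = sum(ratios) / n
    variance = sum((r - estimate) ** 2 for r in ratios) / (n * (n - 1))
    return estimate, variance

y = [78, 74, 104, 88, 96, 109, 102, 72, 93, 84]
x = [26, 29, 56, 31, 52, 55, 71, 31, 54, 40]
X = sum(x)                                # 445, as in the solution
p = [round(xi / X, 3) for xi in x]        # rounded to 3 d.p. to match the table
estimate, variance = hansen_hurwitz(y, p)
print(round(estimate, 3), round(variance, 3))
```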
QUESTION GROUP 7: Two stage sampling.
a) What is multistage sampling and what is two-stage sampling?
Answers
Multistage sampling is sampling carried out in more than one stage. Clusters are randomly selected
at several stages; observation is conducted at the ultimate stage, and the estimates from that
stage are used for the next stage, that is, the bigger cluster from which the elements of the
ultimate stage were selected. For instance, if a researcher wants to estimate the average number
of statistics graduates in Dar es Salaam, sampling can be done by first identifying universities in
Dar es Salaam that offer statistics, then identifying the statistics courses offered in those
universities. For simplicity, the size of the clusters is assumed to be constant, that is, all clusters
are taken to have the same size.

Two-stage sampling

This is a sampling plan in which clusters are randomly selected from a population and, from
each selected cluster, elements are randomly selected to constitute the final sample. This is an efficient
way of sampling when the clusters are so large that collecting data from every unit in a cluster
is very expensive. It is more reliable when there is no large variation
between the elements within the clusters, so that the final sample is representative.
The first step is to randomly select clusters, which are referred to as primary units.
A sample is then chosen from each primary unit.

b) A set of 20,000 records is stored in 400 drawers, each containing 50 records. In a two-stage
sample, five records are drawn at random from each of 80 randomly selected drawers.
For one item the estimated variances were $s_1^2 = 362$ and $s_2^2 = 805$. Compute the standard error
of the mean per record from this sample.

Solution.
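The estimated variance of the estimated population mean in two-stage sampling is:

$$v(\bar{\bar{y}}) = \frac{(1 - f_1)}{n} s_1^2 + \frac{f_1 (1 - f_2)}{nm} s_2^2$$

where

$$s_1^2 = \frac{\sum_{i=1}^{n} (\bar{y}_i - \bar{\bar{y}})^2}{n-1}, \qquad s_2^2 = \frac{\sum_{i} \sum_{j} (y_{ij} - \bar{y}_i)^2}{n(m-1)}$$

N = 400, M = 50, n = 80, m = 5, $s_1^2 = 362$ and $s_2^2 = 805$

f1 = n/N = 80/400 = 0.2
f2 = m/M = 5/50 = 0.1

$$v(\bar{\bar{y}}) = \frac{(1 - 0.2)}{80}(362) + \frac{0.2(1 - 0.1)}{5 \times 80}(805) = 3.98225$$

$$\text{Standard error} = \sqrt{v(\bar{\bar{y}})} = \sqrt{3.98225} = 1.9955\ldots \approx 1.996$$

The standard error of the mean per record is 1.996.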
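As a quick check of this calculation, the variance formula for equal-sized clusters can be written as a small function. The name `two_stage_se` and its parameter names are assumptions for illustration; the formula itself is the one stated in the solution.

```python
import math

def two_stage_se(N, M, n, m, s1_sq, s2_sq):
    """Standard error of the mean per element in two-stage sampling
    with N primary units of equal size M, n sampled, m subsampled in each."""
    f1 = n / N  # first-stage sampling fraction
    f2 = m / M  # second-stage sampling fraction
    v = (1 - f1) / n * s1_sq + f1 * (1 - f2) / (n * m) * s2_sq
    return math.sqrt(v)

se = two_stage_se(N=400, M=50, n=80, m=5, s1_sq=362, s2_sq=805)
print(round(se, 3))  # 1.996
```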
GROUP 8 QUESTION
Question 01; A
➢ What is double sampling?
✓ It refers to the design in which a sample of units is initially selected for obtaining auxiliary
information only, and then a second sample is selected in which the variable of interest
is observed in addition to the auxiliary information.
➢ What is the reason for conducting double sampling?
✓ The reason for conducting double sampling is to obtain a better estimator by using the
relationship between the auxiliary variable and the variable of interest.
Question 01; B.
In a survey to estimate average household monthly medical expenses, 500 households were
selected at random from a population of 5000 households. Of the selected households, 336
had children in the household and 164 had no children. A stratified subsample of 112
households with children and 41 households without children was then selected, and monthly
medical expenditure data were collected from households in the subsample. For the
households with children, the sample mean expenditure was $280 with a sample standard
deviation of 160; for the households without children, the respective figures were $110 and
60. Estimate mean monthly medical expenditure for households in the population, and
estimate the variance of the estimate.
Solution
Let
❖ Stratum (1) represent households with children
❖ Stratum (2) represent households without children
Hence, the information above is summarized as follows:
Stratum   n'h   nh    ȳh      Sh
1         336   112   $ 280   160
2         164   41    $ 110   60
Total     500   153
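➢ To estimate the mean monthly medical expenditure for households in the population.
From

$$\bar{y}_{st} = \sum_{h=1}^{2} w_h \bar{y}_h, \quad \text{where } w_h = \frac{n'_h}{n'}$$

With $w_1 = 336/500 = 0.672$ and $w_2 = 164/500 = 0.328$,

$$\bar{y}_{st} = w_1 \bar{y}_1 + w_2 \bar{y}_2 = 0.672 \times 280 + 0.328 \times 110 = \$\,224.24$$

∴ The mean monthly medical expenditure for households in the population is $ 224.24
➢ To estimate the variance of the estimate
From

$$Var(\bar{y}_{st}) = \sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} - \sum_{h=1}^{2} \frac{w_h S_h^2}{N} + \frac{g'}{n'} \sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2$$

Hence

➢ $$\sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} = \frac{w_1^2 S_1^2}{n_1} + \frac{w_2^2 S_2^2}{n_2} = \frac{0.672^2 \times 160^2}{112} + \frac{0.328^2 \times 60^2}{41} = \frac{70416}{625}$$

➢ $$\sum_{h=1}^{2} \frac{w_h S_h^2}{N} = \frac{w_1 S_1^2 + w_2 S_2^2}{N} = \frac{1}{5000}(0.672 \times 160^2 + 0.328 \times 60^2) = \frac{2298}{625}$$

➢ $$\frac{g'}{n'} = \frac{N - n'}{n'(N-1)} = \frac{5000 - 500}{500(5000 - 1)} = \frac{9}{4999}$$

➢ $$\sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2 = 0.672(280 - 224.24)^2 + 0.328(110 - 224.24)^2 = \frac{3981264}{625}$$

Then

$$Var(\bar{y}_{st}) = \frac{70416}{625} - \frac{2298}{625} + \frac{9}{4999} \times \frac{3981264}{625} = 120.4571$$

∴ The estimated variance of the estimate is 120.4571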
GROUP 9 QUESTION:
a) What is probability proportional to size sampling without replacement?

It refers to a sampling method where the probability of selecting a specific item from a population is
directly proportional to its size or weight, and each selection is made without returning the selected
item to the population.
b) For a population with N = 3, Zi = 1/2, 1/3, 1/4 and Yi = 7, 5, 2, two units are drawn without
replacement, the first with probability proportional to Zi and the second with probability proportional to
the remaining sizes. For this method of sample selection, compare the variances of ŶHT and ŶM. Use the
variance formulas.

Answers;

To compare the variances of ŶHT (the Horvitz-Thompson estimator) and ŶM (Murthy's estimator) under
the given method of sample selection, we evaluate the variance formula of each estimator for n = 2.
Both are estimators of the population total Y = 7 + 5 + 2 = 14.

Let's define the variables first:

N = 3 (population size), n = 2 (number of units drawn)

Zi = 1/2, 1/3, 1/4 (relative sizes)

Yi = 7, 5, 2 (values of the population units)

The Zi as given sum to 13/12, so the single-draw probabilities are taken as
$p_i = Z_i / \sum Z_i = 6/13,\ 4/13,\ 3/13$.

The probability that the sample consists of units i and j is

$$\pi_{ij} = p_i p_j \left[\frac{1}{1 - p_i} + \frac{1}{1 - p_j}\right]$$

π12 = 384/819 ≈ 0.4689, π13 = 306/910 ≈ 0.3363, π23 = 228/1170 ≈ 0.1949

The inclusion probabilities $\pi_i = \sum_{j \neq i} \pi_{ij}$ are

π1 ≈ 0.8051, π2 ≈ 0.6637, π3 ≈ 0.5311

Variance of ŶHT (Yates-Grundy form):

$$V(\hat{Y}_{HT}) = \sum_{i<j} (\pi_i \pi_j - \pi_{ij}) \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^2$$

With y1/π1 ≈ 8.6943, y2/π2 ≈ 7.5331 and y3/π3 ≈ 3.7655:

= 0.0655 (8.6943 − 7.5331)² + 0.0914 (8.6943 − 3.7655)² + 0.1577 (7.5331 − 3.7655)²

= 0.088 + 2.220 + 2.238

≈ 4.55

Therefore, the variance of ŶHT is approximately 4.55.

Variance of ŶM for n = 2:

$$V(\hat{Y}_{M}) = \sum_{i<j} p_i p_j \, \frac{1 - p_i - p_j}{2 - p_i - p_j} \left(\frac{y_i}{p_i} - \frac{y_j}{p_j}\right)^2$$

= 1/32 + 18/17 + 49/38

= 0.031 + 1.059 + 1.289

≈ 2.38

Therefore, the variance of ŶM is approximately 2.38.

In conclusion, V(ŶM) ≈ 2.38 is smaller than V(ŶHT) ≈ 4.55, so for this method of sample selection
Murthy's estimator is the more precise of the two.
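Because this population has only three possible samples of size two, the comparison in part (b) can also be checked by direct enumeration rather than through the variance formulas. The sketch below makes two assumptions that are not stated in the question: the Zi are rescaled to sum to one before being used as draw probabilities, and ŶHT and ŶM are taken to be the Horvitz-Thompson and Murthy estimators of the population total. All names in the code are illustrative.

```python
from fractions import Fraction
from itertools import combinations

z = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
y = [7, 5, 2]
p = [zi / sum(z) for zi in z]  # assumption: rescale sizes so they sum to 1

def sample_prob(i, j):
    """Probability that the unordered sample {i, j} is drawn."""
    return p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))

pairs = list(combinations(range(len(y)), 2))
prob = {s: sample_prob(*s) for s in pairs}
# Inclusion probability of unit i = total probability of samples containing i
pi = [sum(pr for s, pr in prob.items() if i in s) for i in range(len(y))]

def ht(i, j):
    """Horvitz-Thompson estimate of the total."""
    return y[i] / pi[i] + y[j] / pi[j]

def murthy(i, j):
    """Murthy's estimate of the total for n = 2."""
    num = (1 - p[j]) * y[i] / p[i] + (1 - p[i]) * y[j] / p[j]
    return num / (2 - p[i] - p[j])

def moments(estimator):
    mean = sum(prob[s] * estimator(*s) for s in pairs)
    var = sum(prob[s] * (estimator(*s) - mean) ** 2 for s in pairs)
    return mean, var

mean_ht, var_ht = moments(ht)
mean_m, var_m = moments(murthy)
print(float(mean_ht), float(var_ht))
print(float(mean_m), float(var_m))
```

Both means should equal the true total of 14, which is a useful sanity check on the selection probabilities before the two variances are compared.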

GROUP 10
a) What is probability proportional to size (PPS)?
Answer:
Probability proportional to size (PPS), also known as probability proportional to volume (PPV)
or probability proportional to weight (PPW), is a sampling technique used in statistics and
survey research. It is commonly employed when selecting a sample from a population in which
the units vary in size or weight.

In PPS sampling, each unit in the population is assigned a probability of selection that is
proportional to its size or weight relative to the total size or weight of the population. The
larger or heavier units have a higher probability of being selected than smaller or
lighter units.

b) Consider a population of 10 individuals numbered from 1 to 10. The individuals have varying
sizes as follows:
Element Size
1 10
2 5
3 15
4 8
5 4
6 10
7 6
8 9
9 11
10 7
Using PPS sampling without replacement, if we want to select a sample of size 5, determine
the probability of selecting the following elements: 1, 3, 7, 4 and 10.

Solution
$\sum X_i = 85$, and $P(X_i) = X_i / \sum X_i$

Element   Xi   Pi
1         10   0.118
2         5    0.059
3         15   0.176
4         8    0.0941
5         4    0.047
6         10   0.118
7         6    0.071
8         9    0.1059
9         11   0.129
10        7    0.082

Treating the draws as independent, that is, multiplying the single-draw probabilities without
adjusting for the units already removed, the probability of selecting elements 1, 3, 7, 4 and 10 is

$$\prod p(X_i) = 0.118 \times 0.176 \times 0.0941 \times 0.071 \times 0.082 = 0.000011377$$
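The product above multiplies single-draw probabilities. As a hedged illustration of how the figure changes when the draws are made strictly without replacement, the sketch below also computes the probability of drawing the five elements in the listed order, with the remaining total size shrinking after each draw. The function names are assumptions for illustration, and the choice of a specific draw order is an assumption too, since the question does not say whether order matters.

```python
from fractions import Fraction

sizes = {1: 10, 2: 5, 3: 15, 4: 8, 5: 4, 6: 10, 7: 6, 8: 9, 9: 11, 10: 7}

def prob_independent(order):
    """Product of single-draw probabilities X_i / sum(X), as in the solution."""
    total = sum(sizes.values())
    result = Fraction(1)
    for unit in order:
        result *= Fraction(sizes[unit], total)
    return result

def prob_ordered_without_replacement(order):
    """Probability of drawing the units in this exact order when each
    drawn unit's size is removed from the remaining total."""
    remaining = sum(sizes.values())
    result = Fraction(1)
    for unit in order:
        result *= Fraction(sizes[unit], remaining)
        remaining -= sizes[unit]
    return result

order = [1, 3, 7, 4, 10]
p_indep = prob_independent(order)
p_wor = prob_ordered_without_replacement(order)
print(float(p_indep), float(p_wor))
```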

QUESTION FROM GROUP 11

a) Define optimum sampling and subsampling fraction. What are the factors that affect the optimum
sampling and subsampling fractions?
Answer: Optimum sampling is the process of determining the optimal sample size and sampling
fraction for a survey. OR

The optimum sampling fraction is the value that minimizes the cost of the survey while still
achieving the desired level of precision. The goal of optimum sampling is to achieve a desired
level of precision (accuracy) at the lowest possible cost.

The subsampling fraction is the proportion of nonrespondents in a survey who are selected for a
more intensive follow-up in a second phase of data collection.
The subsampling fraction is used to increase the response rate of a survey and to improve the
accuracy of the estimates.

The optimal subsampling fraction is the value that maximizes the response rate and accuracy
of the estimates while still being cost-effective.
Here are some factors that can affect the optimum sampling fraction and subsampling fraction:
• The cost of data collection: the more expensive it is to collect data, the lower the optimum
sampling fraction will be.
• The desired level of precision: the more precise the estimates need to be, the higher the
optimum sampling fraction will be.
• The expected response rate: the lower the expected response rate, the higher the optimum
subsampling fraction will be.
b) Find the optimum sampling,
given that $C_1 = 10C_2$ and $S_2 = 1.3S_\mu$
SOLUTION
By using the Cauchy-Schwarz inequality formula

$$M_{opt} = \frac{S_2}{S_\mu} \times \sqrt{\frac{C_1}{C_2}}$$

Since $S_2 = 1.3S_\mu$ and $C_1 = 10C_2$,

$$M_{opt} = \frac{1.3S_\mu}{S_\mu} \times \sqrt{\frac{10C_2}{C_2}} = 1.3\sqrt{10} \approx 4.11$$

Therefore the optimum sampling is 4.11
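Since only the two ratios S2/Sµ and C1/C2 enter the result, the calculation can be sketched as a one-line function. The name `m_opt` and the choice to pass ratios rather than the individual quantities are assumptions for illustration.

```python
import math

def m_opt(s2_over_su, c1_over_c2):
    """Optimum value from the formula M_opt = (S2 / Su) * sqrt(C1 / C2)."""
    return s2_over_su * math.sqrt(c1_over_c2)

value = m_opt(s2_over_su=1.3, c1_over_c2=10)
print(round(value, 2))  # 4.11
```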

QUESTION FROM GROUP 12
(a) What are the objectives of phase 1 and phase 2 in double sampling for stratification?
Answer
Phase 1: the objective is to estimate the weight of each stratum.
Phase 2: the objective is to estimate the value of the population mean.

(b) Given N = 600, n = 12, $n_1 = 4$, $n_2 = 8$.

Phase 1: a large sample of 100 is selected from the population by SRS, so that $n' = 100$, $n'_1 = 40$ and
$n'_2 = 60$.
Phase 2: a random sample of $n_1 = 4$ is picked from $n'_1 = 40$, giving $y_{1i} = 3, 4, 6, 7$. A random
sample of $n_2 = 8$ is picked from $n'_2 = 60$, giving $y_{2i} = 5, 7, 8, 10, 11, 14, 15, 18$.
i. Estimate $\bar{y}_{st}$
ii. Estimate the variance of $\bar{y}_{st}$
Solution.

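Phase 1: the objective is to estimate the weight of each stratum, from $w_h = \dfrac{n'_h}{n'}$.

With $n'_1 = 40$, $n'_2 = 60$ and $n' = 100$:

$$w_1 = \frac{n'_1}{n'} = \frac{40}{100} = 0.4, \qquad w_2 = \frac{n'_2}{n'} = \frac{60}{100} = 0.6$$

Phase 2: the objective is to estimate the population mean.

$$\bar{y}_h = \frac{\sum_{i=1}^{n_h} y_{hi}}{n_h}$$

$$\bar{y}_1 = \frac{\sum_{i=1}^{n_1} y_{1i}}{n_1} = \frac{3+4+6+7}{4} = 5$$

$$\bar{y}_2 = \frac{\sum_{i=1}^{n_2} y_{2i}}{n_2} = \frac{5+7+8+10+11+14+15+18}{8} = 11$$

i. $$\bar{y}_{st} = \sum_{h=1}^{k} w_h \bar{y}_h = \sum_{h=1}^{2} w_h \bar{y}_h = w_1 \bar{y}_1 + w_2 \bar{y}_2 = 0.4 \times 5 + 0.6 \times 11 = 8.6$$

ii. $$Var(\bar{y}_{st}) = \sum_{h=1}^{k} \frac{w_h^2 s_h^2}{n_h} - \frac{\sum_{h=1}^{k} w_h s_h^2}{N} + \frac{g'}{n'} \sum_{h=1}^{k} w_h (\bar{y}_h - \bar{y}_{st})^2$$

where $g' = \dfrac{N - n'}{N - 1}$.

From $$s_h^2 = \frac{1}{n_h - 1}\left[\sum y_{hi}^2 - n_h \bar{y}_h^2\right]$$

$$s_1^2 = \frac{1}{4-1}\left[110 - 4 \times 5^2\right] = \frac{10}{3} = 3.3333 \quad (n_1 = 4,\ \sum y_{1i}^2 = 110,\ \bar{y}_1 = 5)$$

$$s_2^2 = \frac{1}{8-1}\left[1104 - 8 \times 11^2\right] = \frac{136}{7} = 19.4286 \quad (n_2 = 8,\ \sum y_{2i}^2 = 1104,\ \bar{y}_2 = 11)$$

$$g' = \frac{600 - 100}{600 - 1} = 0.8347$$

$$Var(\bar{y}_{st}) = \left[\frac{0.4^2 (3.33)}{4} + \frac{0.6^2 (19.4286)}{8}\right] - \left[\frac{0.4(3.33) + 0.6(19.4286)}{600}\right] + \frac{0.8347}{100}\left[0.4(5 - 8.6)^2 + 0.6(11 - 8.6)^2\right]$$

= 1.007487 − 0.0216486 + 0.07211808
= 1.058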
GROUP 13 Two-stage cluster sampling
a) Define the following terms as used in sampling:
1 Probability proportional to size (PPS)
A method of sampling from a finite population in which a size measure is available
for each unit before sampling and the probability of selecting a unit is proportional
to its size.
2 Effective sample size
The number of distinct units in the sample.
3 Sampling error
The statistical error that occurs when an analyst does not select a sample that represents
the entire population.

b) There are 36 departments in a small liberal arts college. One wants to estimate the average
amount of money the students spent on textbooks last semester. Since the size of each department
varies very much, two-stage cluster sampling using probability proportional to size for the
primary units is carried out. The results are listed in the tables below.
Table 1.
Department   Mi   mi   Textbook expenses in Tshs for last semester
1.           10   4    326, 400, 423, 443
2.           20   8    278, 312, 450, 350, 227, 438, 512, 403
3.           30   12   512, 256, 332, 402, 512, 309, 411, 610, 422, 630, 550, 470
4.           15   6    426, 312, 512, 440, 342, 533

Table 2.
Variable   SE Mean   StDev   Variance
Dept1      25.6      51.1    2612.7
Dept2      34.1      96.3    9277.4
Dept3      33.9      117.6   13828.8
Dept4      36.1      88.4    7815.9

Find:
i. Means for each department (ȳ)
ii. Estimate the total sample mean (µ)
iii. Variance of the total sample mean (var (µ))
Note: Use the Hansen-Hurwitz approach
Solution

Means for each department (ȳ): $\bar{y} = \dfrac{\sum y_i}{n}$

• ȳ for dept1 = (326+400+423+443) ÷ 4 = 398.0

• ȳ for dept2 = (278+312+450+350+227+438+512+403) ÷ 8 = 371.3
• ȳ for dept3 = (512+256+332+402+512+309+411+610+422+630+550+470) ÷ 12 = 451.3
• ȳ for dept4 = (426+312+512+440+342+533) ÷ 6 = 427.5
Solution for µ and var (µ) is given by;

i.
ii.
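The remaining two parts are not worked out here. As a hedged sketch only, and assuming the usual Hansen-Hurwitz form for two-stage PPS sampling of primary units with replacement (the overall estimate is the simple average of the sampled department means, and its estimated variance is the sum of squared deviations of those means divided by n(n − 1)), the computation could look like this. The variable names are illustrative.

```python
from statistics import mean

# Textbook expenses (Tshs) for the four sampled departments, from Table 1
expenses = {
    1: [326, 400, 423, 443],
    2: [278, 312, 450, 350, 227, 438, 512, 403],
    3: [512, 256, 332, 402, 512, 309, 411, 610, 422, 630, 550, 470],
    4: [426, 312, 512, 440, 342, 533],
}

dept_means = [mean(values) for values in expenses.values()]
n = len(dept_means)  # number of sampled primary units

# Assumed Hansen-Hurwitz estimator: average of the department means
mu_hat = sum(dept_means) / n
# Assumed variance estimator: between-department spread over n(n - 1)
var_mu_hat = sum((m - mu_hat) ** 2 for m in dept_means) / (n * (n - 1))

print(round(mu_hat, 2), round(var_mu_hat, 2))
```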
GROUP 14. TEST 2 QUESTION.
Topic 5: Multi–Stage Sampling Ref. Cochran pg274.
5.1 Two-stage sampling, means and variances in two stage sampling and variance of the estimated mean
a) Explain what is meant by two-stage sampling and describe the steps involved.
Suppose that each unit in the population can be divided into a number of smaller units. If
subunits within a selected unit give similar results, it seems uneconomical to measure them all.
A common practice is to select and measure a sample of the subunits in any chosen unit.
This is known as two-stage sampling because the sample is taken in two steps.

• the first is to select a sample of units, often called the primary units,

• the second is to select a sample of second-stage units or subunits from each chosen
primary unit.
b) A garment manufacturer has N = 90 plants located throughout the United States and wants
to estimate the average number of hours that the sewing machines were down for repairs in
the past months. Because the plants are widely scattered, she decides to use cluster sampling,
specifying each plant as a cluster of machines. Each plant contains many machines, and
checking the repair record for each machine would be time-consuming. Therefore, she uses
two-stage cluster sampling. Enough time and money are available to sample n = 10 plants
and approximately 20% of the machines in each plant. The resulting data are given in the
table below.

We want to estimate the average downtime per machine, and we know that the total
number of machines in all plants is K = 4500.
The table represents the downtime of the sewing machines.

Using the data in the table above, estimate the average downtime per machine and its variance.
The manufacturer knows she has a combined total of 4500 machines in all plants.
FIRST SOLUTION

$$\sum_{i=1}^{10} M_i \bar{y}_i = 50 \times 5.4 + 65 \times 4 + 45 \times 5.67 + 48 \times 4.8 + 52 \times 4.3 + 58 \times 3.83 + 42 \times 5 + 66 \times 3.85 + 40 \times 4.88 + 56 \times 5 = 2400.59$$
GROUP 15
a) Define probability proportional to size (PPS)
PPS is a method of sampling from a finite population in which a size measure is
available for each population unit before sampling and the probability of selecting a
unit is proportional to its size.

b) There are 36 departments in a small liberal arts college. One wants to estimate the average
amount of money students spent on textbooks last semester. Since the size of each
department varies much, two-stage sampling using PPS for the primary units is carried out. The
results are below.

Find the estimate of the population mean using the PPS estimator and estimate the variance of that
estimator.
SOLN
GROUP 16 QUESTION
1) Given the data from a series of samples, what are three kinds of quantity for which we may
wish to obtain estimates?
i. The change in Y from one occasion to the next.
ii. The average value of Y over all occasions.
iii. The average value of Y for the most recent occasion.
2) In a survey to estimate average household monthly medical expenses, 500 households were
selected at random from a population of 5000 households. Of the selected households, 336
had children in the household and 164 had no children. A stratified subsample of 112
households with children and 41 households without children was then selected, and monthly
medical expenditure data were collected from households in the subsample. For the
households with children, the sample mean expenditure was $280 with a sample standard
deviation of 160; for the households without children, the respective figures were $110 and
60. Estimate mean monthly medical expenditure for households in the population, and
estimate the variance of the estimate
Solution
Let
❖ Stratum (1) represent households with children
❖ Stratum (2) represent households without children
Hence, the information above is summarized as follows:
Stratum   n'h   nh    ȳh      Sh
1         336   112   $ 280   160
2         164   41    $ 110   60
Total     500   153
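➢ To estimate the mean monthly medical expenditure for households in the population.
From

$$\bar{y}_{st} = \sum_{h=1}^{2} w_h \bar{y}_h, \quad \text{where } w_h = \frac{n'_h}{n'}$$

With $w_1 = 336/500 = 0.672$ and $w_2 = 164/500 = 0.328$,

$$\bar{y}_{st} = w_1 \bar{y}_1 + w_2 \bar{y}_2 = 0.672 \times 280 + 0.328 \times 110 = \$\,224.24$$

∴ The mean monthly medical expenditure for households in the population is $ 224.24
➢ To estimate the variance of the estimate
From

$$Var(\bar{y}_{st}) = \sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} - \sum_{h=1}^{2} \frac{w_h S_h^2}{N} + \frac{g'}{n'} \sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2$$

Hence

➢ $$\sum_{h=1}^{2} \frac{w_h^2 S_h^2}{n_h} = \frac{w_1^2 S_1^2}{n_1} + \frac{w_2^2 S_2^2}{n_2} = \frac{0.672^2 \times 160^2}{112} + \frac{0.328^2 \times 60^2}{41} = \frac{70416}{625}$$

➢ $$\sum_{h=1}^{2} \frac{w_h S_h^2}{N} = \frac{w_1 S_1^2 + w_2 S_2^2}{N} = \frac{1}{5000}(0.672 \times 160^2 + 0.328 \times 60^2) = \frac{2298}{625}$$

➢ $$\frac{g'}{n'} = \frac{N - n'}{n'(N-1)} = \frac{5000 - 500}{500(5000 - 1)} = \frac{9}{4999}$$

➢ $$\sum_{h=1}^{2} w_h (\bar{y}_h - \bar{y}_{st})^2 = 0.672(280 - 224.24)^2 + 0.328(110 - 224.24)^2 = \frac{3981264}{625}$$

Then

$$Var(\bar{y}_{st}) = \frac{70416}{625} - \frac{2298}{625} + \frac{9}{4999} \times \frac{3981264}{625} = 120.4571$$

∴ The estimated variance of the estimate is 120.4571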

GROUP 17 QUESTION
a. How can one determine the optimal allocation of sample units in double sampling for
stratification?
Solution:

• Neyman allocation: This involves allocating sample units between the first and second phases
in a way that minimizes the variance of the estimated population parameter under certain
assumptions. Neyman allocation takes into account the variances and covariances of the estimators and
the costs associated with both phases.

• Efficiency criteria: Efficiency criteria aim to maximize the precision or efficiency of the
estimators. Criteria such as mean square error, relative efficiency or design effect can be used to
evaluate allocation strategies. These criteria compare the precision of the estimators under
different allocation scenarios and guide the selection of the optimal allocation.

• Analytical approaches: These can be used to find the allocation that maximizes precision. These
methods involve formulating an objective function that represents the desired allocation
goals, incorporating relevant constraints, and solving the optimization problem to obtain the
optimal allocation.

• Simulations: By simulating the sampling process and estimating the population parameters
under different allocation scenarios, one can compare the precision and accuracy of the
estimates. This helps to identify the allocation strategy that yields the best results for a specific
study population.

b. A shoe store wants to estimate the average number of pairs of shoes owned by the students
who live in a certain college town neighborhood. They think that a stratified sample based
on gender is a good approach to take but do not know the makeup of the gender in that
neighborhood. They also do not know the gender of the respondent until after contacting
them. So, they use double sampling by first contacting 160 randomly selected students in that
neighborhood and asking them about their gender. It turns out that 64 are males and 96 are
females. They then randomly sample 8 males and 12 females, and provide them a $10.00
incentive for going home to count the number of pairs of shoes, and report them.
Compute 𝐲̅𝐬𝐭 and its estimated standard deviation.
The data are given in the table below:

Male     5   6   9   5   9   7   5   8
Female   17  19  13  16  8   11  15  19  12  13  33  20

Variable   N    Mean    St. Dev
Male       8    6.750   1.753
Female     12   16.33   6.37

2 pages
Occupational Health and Safety Management Systems Tcm18 240421
100% (2)
Occupational Health and Safety Management Systems Tcm18 240421
6 pages
Sampling Theory: Double Sampling (Two Phase Sampling)
No ratings yet
Sampling Theory: Double Sampling (Two Phase Sampling)
12 pages
Self Assessment Tool-1
No ratings yet
Self Assessment Tool-1
4 pages
MPC PDF
No ratings yet
MPC PDF
12 pages
Greenwich Associates - Order and Execution Management Systems Increasingly Indispensable - 2019-06-17
No ratings yet
Greenwich Associates - Order and Execution Management Systems Increasingly Indispensable - 2019-06-17
2 pages
Hebrews 4 12-13
No ratings yet
Hebrews 4 12-13
6 pages
Lecture No7 Pipeline Systems
No ratings yet
Lecture No7 Pipeline Systems
4 pages
Worked Examples in Mathematics for Scientists and Engineers
From Everand
Worked Examples in Mathematics for Scientists and Engineers
G. Stephenson
No ratings yet
Trigonometric Ratios to Transformations (Trigonometry) Mathematics E-Book For Public Exams
From Everand
Trigonometric Ratios to Transformations (Trigonometry) Mathematics E-Book For Public Exams
Mohmmad Khaja Shareef
5/5 (1)
Multiplication Tables and Flashcards: Times Tables for Children
From Everand
Multiplication Tables and Flashcards: Times Tables for Children
Jack Goldstein
4/5 (1)
De Moiver's Theorem (Trigonometry) Mathematics Question Bank
From Everand
De Moiver's Theorem (Trigonometry) Mathematics Question Bank
Mohmmad Khaja Shareef
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

ST 318 Test 2-3

Uploaded by

ST 318 Test 2-3

Uploaded by

The methods used in this sampling technique are:

• Cumulative selection method

Estimate Ŷpps and its variance

∴ The estimated variance of the estimate is 218.09

a) What is probability proportional to size sampling?

From the above table estimate ŷ and its variance.

Note: the whole variance also has a ^ symbol.

Two-stage sampling; meaning.

(ȳ_i − ӳ)^2    ∑_i ∑_j (y_ij − ȳ_i)^2

N = 400, M = 50, n = 80, m = 5, s1^2 = 362 and s2^2 = 805

Standard error = √v(ӳ)
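As a rough check on the two-stage figures quoted above, the sketch below assumes the textbook variance estimator for the mean per element with equal-sized first-stage units, v = (1 − f1)/n · s1^2 + f1(1 − f2)/(nm) · s2^2, with f1 = n/N and f2 = m/M. The function name is hypothetical, and the formula is an assumption about what the solution used, since only fragments of it survive here.

```python
import math


def two_stage_variance_of_mean(N, M, n, m, s1_sq, s2_sq):
    """Estimated variance of the sample mean per element (assumed formula).

    N, M: number of first-stage units and elements per unit in the population.
    n, m: number of first-stage units sampled and elements sampled per unit.
    s1_sq, s2_sq: between-unit and within-unit sample variances.
    """
    f1 = n / N  # first-stage sampling fraction
    f2 = m / M  # second-stage sampling fraction
    between = (1 - f1) / n * s1_sq
    within = f1 * (1 - f2) / (n * m) * s2_sq
    return between + within


v = two_stage_variance_of_mean(N=400, M=50, n=80, m=5, s1_sq=362, s2_sq=805)
standard_error = math.sqrt(v)
print(v, standard_error)
```

If the original solution used a different estimator (for example, one that ignores the finite population corrections), its figures will differ from this sketch.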

Let's define the variables first:

Zi = 1/2, 1/3, 1/4 (sampling probabilities)

Yi = 7, 5, 2 (values of the population units)

YHT = N * (Y1/Z1 + Y2/Z2) / (1/Z1 + 1/Z2)

Var (YHT) = N^2 * (1 - n/N) * (Y1^2 / Z1 + Y2^2 / Z2) / (n - 1)

where n is the number of units drawn.

Substituting the given values:

Var (YHT) = 3^2 * (1 - 2/3) * (7^2 / (1/2) + 5^2 / (1/3)) / (2 - 1)

= 9 * (1/3) * (49/ (1/2) + 25/ (1/3))

= 9 * (1/3) * (98 + 75)

Therefore, the variance of YHT is 519.
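The arithmetic in this substitution can be checked with exact fractions. The sketch below only reproduces the expression as it is written in this solution, reading the flattened exponents as squares (N squared, Y1 squared, Y2 squared); it is not offered as a general Horvitz-Thompson variance routine, and the variable names are illustrative.

```python
from fractions import Fraction

N, n = 3, 2                               # population size and units drawn
y = [Fraction(7), Fraction(5)]            # Y1, Y2
z = [Fraction(1, 2), Fraction(1, 3)]      # Z1, Z2

# Y1^2 / Z1 + Y2^2 / Z2 = 98 + 75
weighted_squares = sum(yi ** 2 / zi for yi, zi in zip(y, z))

# N^2 * (1 - n/N) * (Y1^2/Z1 + Y2^2/Z2) / (n - 1), as written in the solution
var_yht = N ** 2 * (1 - Fraction(n, N)) * weighted_squares / (n - 1)
print(weighted_squares, var_yht)
```

Using `Fraction` avoids floating-point rounding, so the intermediate value 173 and the final result can be compared exactly with the hand calculation.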

YM = N * (Y1/ Z1 + Y2/Z2) / (N/ Z1 + (N-1) / Z2)

Var (YM) = N^2 * (1 - (N-n)/N) * (Y1^2 / Z1 + Y2^2 / Z2) / (n * (N - n))

Substituting the given values:

Var (YM) = 3^2 * (1 - 1/3) * (7^2 / (1/2) + 5^2 / (1/3)) / (2 * (3 - 2))

= 9 * (2/3) * (49/ (1/2) + 25/ (1/3)) / 2

= 9 * (2/3) * (98 + 75) / 2

Therefore, the variance of YM is also 519.
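The same kind of check can be applied to the Var (YM) expression. This is a minimal sketch that evaluates the formula exactly as this solution states it, with N = 3 and n = 2; the function name `var_ym_as_written` is hypothetical, and the formula itself is taken from the solution rather than from a reference text.

```python
from fractions import Fraction


def var_ym_as_written(N, n, y, z):
    """Evaluate N^2 * (1 - (N-n)/N) * sum(Y^2/Z) / (n * (N - n))."""
    weighted_squares = sum(Fraction(yi) ** 2 / zi for yi, zi in zip(y, z))
    retained_fraction = 1 - Fraction(N - n, N)
    return N ** 2 * retained_fraction * weighted_squares / (n * (N - n))


var_ym = var_ym_as_written(
    N=3,
    n=2,
    y=[7, 5],
    z=[Fraction(1, 2), Fraction(1, 3)],
)
print(var_ym)
```

With these inputs the denominator n * (N - n) equals 2, which matches the division by 2 in the worked lines.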

QUESTION FROM GROUP 11

Since S2 = 1.3Sµ and C1 = 10C2,

Mopt = 1.3√10 ≈ 4.11. Therefore, the optimum sampling is 4.11.
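The value of Mopt can be reproduced numerically. The sketch below assumes the result comes from the product of the standard-deviation ratio (1.3) and the square root of the cost ratio C1/C2 (10), which is what the expression 1.3√10 suggests; the variable names are illustrative.

```python
import math

sd_ratio = 1.3      # S2 / Sµ, from S2 = 1.3Sµ
cost_ratio = 10     # C1 / C2, from C1 = 10C2

m_opt = sd_ratio * math.sqrt(cost_ratio)
print(round(m_opt, 2))
```

In practice a subsample size has to be a whole number, so the unrounded value would normally be rounded to a neighbouring integer.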

(b) Given N = 600, n = 12, n1 = 4, n2 = 8.

But n'1 = 40, n' = 100

But n'2 = 60, n' = 100

i. ȳ_st = ∑_{h=1}^{k} w_h ȳ_h = ∑_{h=1}^{2} w_h ȳ_h

• ȳ for dept 1 = (326 + 400 + 423 + 443) ÷ 4 = 398.0
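The department 1 mean and the stratum weights can be checked as follows. This is a sketch under one assumption: that the weights are w_h = n'_h / n', which is how the values n'1 = 40, n'2 = 60 and n' = 100 appear to be used. Only the department 1 observations are listed here, so the weighted mean itself is left as a helper function rather than evaluated.

```python
dept1_values = [326, 400, 423, 443]
dept1_mean = sum(dept1_values) / len(dept1_values)

# Assumed stratum weights: w_h = n'_h / n'
n_prime = 100
w1 = 40 / n_prime
w2 = 60 / n_prime


def stratified_mean(weights, means):
    """Weighted mean: sum of w_h * ybar_h over the strata."""
    return sum(w * m for w, m in zip(weights, means))


print(dept1_mean, w1, w2)
```

Once the department 2 mean is available, `stratified_mean([w1, w2], [dept1_mean, dept2_mean])` gives ȳ_st, where `dept2_mean` is a placeholder for that value.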

∴ The estimated variance of the estimate is 120.4571

Variable N Mean St.
