Unit-III: Statistical Estimation Theory Unbiased Estimates
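And $\sigma_{\bar X} = \dfrac{\sigma}{\sqrt{N}} \sqrt{\dfrac{N_P - N}{N_P - 1}}$ …(finite population, sampling without replacement),
where $N_P$ is the population size and $N$ is the sample size.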
$P \pm z_c \sqrt{\dfrac{pq}{N}} \sqrt{\dfrac{N_P - N}{N_P - 1}}$
Qn) Suppose that the heights of 100 male students at XYZ University represent a random sample
of the heights of all 1546 students in the university, with sample mean $\bar X = 67.45$ and
$\sigma = 2.93$ (in inches).
Find the (a) 95% and (b) 99% confidence intervals for estimating the mean height.
Solution:
(a) The 95% confidence limits are $\bar X \pm 1.96\,\dfrac{\sigma}{\sqrt{N}} = 67.45 \pm 1.96\,\dfrac{2.93}{\sqrt{100}} = 67.45 \pm 0.57$
Thus the 95% confidence interval for the population mean $\mu$ is 66.88 to 68.02 inches,
which can be denoted by $66.88 < \mu < 68.02$.
We can therefore say that the probability that the population mean height lies between
66.88 and 68.02 inches is about 95%, or 0.95.
In symbols we write $\Pr\{66.88 < \mu < 68.02\} = 0.95$.
This is equivalent to saying that we are 95% confident that the population mean (or true
mean) lies between 66.88 and 68.02 inches.
(b) The 99% confidence limits are
$\bar X \pm 2.58\,\dfrac{\sigma}{\sqrt{N}} = 67.45 \pm 2.58\,\dfrac{2.93}{\sqrt{100}} = 67.45 \pm 0.76$
Thus the 99% confidence interval for the population mean $\mu$ is 66.69 to 68.21 inches,
which can be denoted by $66.69 < \mu < 68.21$.
We can therefore say that the probability that the population mean height lies between
66.69 and 68.21 inches is about 99%, or 0.99.
In symbols we write $\Pr\{66.69 < \mu < 68.21\} = 0.99$.
This is equivalent to saying that we are 99% confident that the population mean (or true
mean) lies between 66.69 and 68.21 inches.
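The two intervals above can be checked numerically. The following is a minimal sketch, assuming the table values 1.96 and 2.58 used in these notes; the function name `mean_confidence_interval` is a hypothetical helper introduced only for illustration.

```python
import math

def mean_confidence_interval(sample_mean, sigma, n, z_c):
    """Return (lower, upper) confidence limits: sample_mean +/- z_c * sigma / sqrt(n)."""
    margin = z_c * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Data from the height problem; z_c values are the table values quoted in the notes.
ci_95 = mean_confidence_interval(67.45, 2.93, 100, 1.96)
ci_99 = mean_confidence_interval(67.45, 2.93, 100, 2.58)

print(f"95% interval: {ci_95[0]:.2f} to {ci_95[1]:.2f} inches")
print(f"99% interval: {ci_99[0]:.2f} to {ci_99[1]:.2f} inches")
```

Passing the critical value `z_c` explicitly keeps the sketch tied to the table values rather than to a more precise quantile, so the limits agree with the hand calculation.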
Qn) In measuring reaction time, a psychologist estimates that the standard deviation is 0.05
seconds. How large a sample of measurements must he take in order to be (a) 95% and (b)
99% confident that the error of his estimate will not exceed 0.01 seconds?
Solution:
(a) The 95% confidence limits are $\bar X \pm 1.96\,\dfrac{\sigma}{\sqrt{N}}$,
where $1.96\,\dfrac{\sigma}{\sqrt{N}}$ is the error of the estimate.
This error will be less than 0.01 if $1.96\,\dfrac{\sigma}{\sqrt{N}} < 0.01$
⇒ $1.96\,\dfrac{0.05}{\sqrt{N}} < 0.01$
⇒ $1.96\,\dfrac{0.05}{0.01} < \sqrt{N}$
⇒ $9.8 < \sqrt{N}$
⇒ $96.04 < N$
Thus we can be 95% confident that the error of the estimate will be less
than 0.01 seconds if $N$ is 97 or larger.
(b) H.W.
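The inequality solved in part (a) can be turned into a small calculation. This is only a sketch under the same assumptions as the solution (error $z_c \sigma / \sqrt{N}$ strictly less than the allowed error); `min_sample_size` is a hypothetical name, not something defined in these notes.

```python
import math

def min_sample_size(sigma, max_error, z_c):
    """Smallest integer N satisfying z_c * sigma / sqrt(N) < max_error."""
    bound = (z_c * sigma / max_error) ** 2  # N must be strictly greater than this
    return math.floor(bound) + 1

# Part (a): sigma = 0.05 s, allowed error 0.01 s, z_c = 1.96 for 95% confidence.
n_95 = min_sample_size(0.05, 0.01, 1.96)
print(n_95)
```

The same function can be called with $z_c = 2.58$ to check the homework answer for part (b).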
Qn) A random sample of 50 mathematics grades out of a total of 200 showed a mean of 75
and a standard deviation of 10.
(a) What are the 95% confidence limits for estimates of the mean of the 200 grades?
(b) With what degree of confidence could we say that the mean of all 200 grades is 75 ± 1?
Solution:
(a) Since the population size is not very large compared with the sample size, the 95%
confidence limits are
$\bar X \pm 1.96\,\dfrac{\sigma}{\sqrt{N}} \sqrt{\dfrac{N_P - N}{N_P - 1}} = 75 \pm 1.96\,\dfrac{10}{\sqrt{50}} \sqrt{\dfrac{200 - 50}{200 - 1}} = 75 \pm 2.4$
(b) The confidence limits can be represented by
$\bar X \pm z_c\,\dfrac{\sigma}{\sqrt{N}} \sqrt{\dfrac{N_P - N}{N_P - 1}} = 75 \pm 1.23\,z_c$
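The coefficient 1.23 and the margin 2.4 both come from the standard error with the finite population correction. A minimal sketch, assuming the data of the grades problem ($\sigma = 10$, $N = 50$, $N_P = 200$); the helper name `fpc_standard_error` is hypothetical.

```python
import math

def fpc_standard_error(sigma, n, population_size):
    """Standard error of the mean with the finite population correction factor."""
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return sigma / math.sqrt(n) * correction

se = fpc_standard_error(10, 50, 200)   # multiplier of z_c in part (b)
margin_95 = 1.96 * se                  # margin used in part (a)

print(f"standard error = {se:.2f}, 95% margin = {margin_95:.1f}")
```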
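(a) The 95% confidence limits for the population are
$P \pm 1.96\,\sigma_P = P \pm 1.96 \sqrt{\dfrac{pq}{N}} = 0.55 \pm 1.96 \sqrt{\dfrac{(0.55)(0.45)}{100}} = 0.55 \pm 0.10$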
(b) The 99% confidence limits for the population are
$P \pm 2.58\,\sigma_P = P \pm 2.58 \sqrt{\dfrac{pq}{N}} = 0.55 \pm 2.58 \sqrt{\dfrac{(0.55)(0.45)}{100}} = 0.55 \pm 0.13$
(c) The 99.73% confidence limits for the population are
$P \pm 3\,\sigma_P = P \pm 3 \sqrt{\dfrac{pq}{N}} = 0.55 \pm 3 \sqrt{\dfrac{(0.55)(0.45)}{100}} = 0.55 \pm 0.15$
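The margins for the proportion differ only in the critical value $z_c$. The sketch below recomputes them, assuming the sample proportion 0.55 and sample size 100 that appear in the working; `proportion_margin` is a hypothetical helper name.

```python
import math

def proportion_margin(p_hat, n, z_c):
    """Margin z_c * sqrt(p_hat * (1 - p_hat) / n) for a sample proportion."""
    return z_c * math.sqrt(p_hat * (1 - p_hat) / n)

# Critical values for the 95%, 99% and 99.73% levels as used in the notes.
margins = {z_c: proportion_margin(0.55, 100, z_c) for z_c in (1.96, 2.58, 3.0)}

for z_c, margin in margins.items():
    print(f"z_c = {z_c}: 0.55 +/- {margin:.2f}")
```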
Confidence Intervals for Standard Deviations
The confidence limits for the standard deviation of a normally distributed population,
as estimated from a sample with standard deviation 𝑠, are given by
$s \pm z_c\,\sigma_s = s \pm z_c\,\dfrac{\sigma}{\sqrt{2N}}$
Probable Error:
The 50% confidence limits of the population parameters corresponding to a statistic $S$
are given by $S \pm 0.6745\,\sigma_S$.
The quantity $0.6745\,\sigma_S$ is known as the probable error of the estimate.
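The constant 0.6745 is the value of $z$ for which the central area under the standard normal curve is 50%, that is, the 75th percentile. As a quick check (a sketch using Python's standard-library `statistics.NormalDist`; the variable name is illustrative only):

```python
from statistics import NormalDist

# Central 50% of the standard normal leaves 25% in each tail,
# so the critical value is the 0.75 quantile.
z_probable_error = NormalDist().inv_cdf(0.75)

print(f"{z_probable_error:.4f}")
```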
Qn) The standard deviation of the breaking strengths of 100 cables tested by a company was
180 lb. Find the (a) 95%, (b) 99%, and (c) 99.73% confidence limits for the standard
deviation of all cables produced by the company.
Solution:
Here $s = 180$ lb (used as an estimate of $\sigma$) and $N = 100$
(a) The 95% confidence limits are $s \pm 1.96\,\dfrac{\sigma}{\sqrt{2N}} = 180 \pm 1.96\,\dfrac{180}{\sqrt{2 \times 100}} = 180 \pm 24.9$ lb
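The margin in part (a) can be reproduced with a few lines of code. This is a sketch only, assuming (as in the solution) that the sample value $s$ stands in for $\sigma$; `sd_confidence_margin` is a hypothetical name.

```python
import math

def sd_confidence_margin(s, n, z_c):
    """Margin z_c * s / sqrt(2n), with the sample s used as the estimate of sigma."""
    return z_c * s / math.sqrt(2 * n)

# Cable data: s = 180 lb, N = 100, z_c = 1.96 for the 95% level.
margin_95 = sd_confidence_margin(180, 100, 1.96)
print(f"180 +/- {margin_95:.1f} lb")
```

Calling the same function with $z_c = 2.58$ or $z_c = 3$ gives the margins for parts (b) and (c).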
Special Test:
Mean:
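Here $S = \bar X$, the sample mean
$\mu_S = \mu_{\bar X} = \mu$, the population mean
$\sigma_S = \sigma_{\bar X} = \dfrac{\sigma}{\sqrt{N}}$, where $\sigma$ is the population standard deviation and $N$ is the sample size
The $z$ score is given by $z = \dfrac{\bar X - \mu}{\sigma / \sqrt{N}}$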
Proportions
Here $S = P$, the proportion of successes in a sample
$\mu_S = \mu_P = p$, where $p$ is the population proportion of successes
$\sigma_S = \sigma_P = \sqrt{\dfrac{pq}{N}}$, where $q = 1 - p$ and $N$ is the sample size
The $z$ score is given by $z = \dfrac{P - p}{\sqrt{pq / N}}$
Reject 𝐻0
∴ The new technique has increased the breaking strength.
Qn) On an examination given to students at a large number of different schools, the mean
grade was 74.5 and the standard deviation was 8.0. At one particular school where 200 students
took the examination, the mean grade was 75.9. Discuss the significance of this result at the
0.05 level from the viewpoint of
a. One-tailed test
b. Two-tailed test
Solution:
a) 𝐻0 : Performance of school is same as the population.
And 𝐻1 :Performance of school is better than the population.
Given $\sigma = 8$, $N = 200$, $\bar X = 75.9$, l.o.s. $\alpha = 0.05$
$Z = \dfrac{\bar X - \mu}{\sigma / \sqrt{N}} = \dfrac{75.9 - 74.5}{8 / \sqrt{200}} = 2.4748 > 1.645$
Reject 𝐻0 .
∴ Performance of school is better than the population.
b) Two-tailed test
𝐻0 : 𝜇 = 74.5
𝐻1 : 𝜇 ≠ 74.5
𝑙. 𝑜. 𝑠. 𝛼 = 0.05
𝑍 = 2.4748 > 1.96
Reject 𝐻0
Performance of school differs from the population.
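Both parts of this problem use the same $z$ score and differ only in the critical value. A minimal sketch of the comparison, assuming the table values 1.645 (one-tailed) and 1.96 (two-tailed) quoted above; `z_score_mean` is a hypothetical helper.

```python
import math

def z_score_mean(sample_mean, mu0, sigma, n):
    """z = (sample_mean - mu0) / (sigma / sqrt(n)) for a test of the mean."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

z = z_score_mean(75.9, 74.5, 8.0, 200)

reject_one_tailed = z > 1.645      # H1: mu > 74.5
reject_two_tailed = abs(z) > 1.96  # H1: mu != 74.5

print(f"z = {z:.2f}, one-tailed reject: {reject_one_tailed}, two-tailed reject: {reject_two_tailed}")
```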
Test of proportion using Normal Distribution:
Qn) The manufacturer of a patent medicine claims that it is 90% effective in relieving an
allergy for a period of 8 hours. In a sample of 200 people who had the allergy, the medicine
provided relief for 160 people. Determine whether the manufacturer’s claim is legitimate at
5% level of significance.
Solution:
$H_0 : p = 90\% = 0.9$ (Manufacturer’s claim is valid)
$H_1 : p < 0.9$ (Manufacturer’s claim is not valid)
One-tailed test, l.o.s. $\alpha = 0.05$ (table value 1.645)
Since $p = 0.90$, $q = 1 - p = 0.10$
Given $N = 200$
⇒ $P = \dfrac{160}{200} = 0.8$
$Z = \dfrac{P - p}{\sqrt{pq / N}} = \dfrac{0.8 - 0.9}{\sqrt{(0.9)(0.1) / 200}} = -4.7140 < -1.645$
∴ Reject 𝐻0
∴ Manufacturer’s claim is not valid.
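The test statistic for a single proportion can be computed directly from the counts. The following sketch assumes the left-tailed test set up above, with table value $-1.645$; `z_score_proportion` is a hypothetical name.

```python
import math

def z_score_proportion(successes, n, p0):
    """z = (P - p0) / sqrt(p0 * q0 / n), where P = successes / n and q0 = 1 - p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Medicine problem: 160 of 200 people relieved, claimed proportion 0.9.
z = z_score_proportion(160, 200, 0.9)
reject = z < -1.645  # left-tailed test at the 5% level

print(f"z = {z:.3f}, reject H0: {reject}")
```

Note that the standard error uses the hypothesised proportion $p = 0.9$, not the sample proportion, because the test is carried out assuming $H_0$ is true.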
Qn) A pair of dice is tossed 100 times and it is observed that 23 times sum of numbers
appearing on uppermost faces is 7. Test the hypothesis that the dice are fair by using a two-
tailed test at 5% significance level.
Solution:
We know that, Number of elements in sample space i.e. 𝑛(𝑆) = 36
Let 𝐴 be the event of getting 7 as sum⇒ 𝑛(𝐴) = 6
$H_0 : p = \dfrac{1}{6}$ (i.e. the dice are fair)
$H_1 : p \neq \dfrac{1}{6}$ (the dice are not fair)
Accept 𝐻0
There is no significant difference in the bulbs of A and B.
Test involving difference of Proportions using Normal Distributions:
Qn) Two groups A and B consist of 100 people each who have a disease. A serum is given
to group A but not to group B; otherwise the two groups are treated identically. It is found
that in groups A and B, 75 and 65 people respectively recover from the disease. Test the
hypothesis that the serum helps to cure the disease at the 1% level.
Solution:
𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 > 𝑝2
One-tailed test (l.o.s. $\alpha = 0.01$, table value $= 2.33$)
$N_1 = 100$, $P_1 = \dfrac{75}{100} = 0.75$, $N_2 = 100$, $P_2 = \dfrac{65}{100} = 0.65$
$p = \dfrac{N_1 P_1 + N_2 P_2}{N_1 + N_2} = \dfrac{(100 \times 0.75) + (100 \times 0.65)}{100 + 100} = 0.7 \Rightarrow q = 1 - p = 0.3$
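$Z = \dfrac{P_1 - P_2}{\sqrt{pq\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}} = \dfrac{0.75 - 0.65}{\sqrt{0.7 \times 0.3 \times \left(\dfrac{1}{100} + \dfrac{1}{100}\right)}} = 1.5430 < 2.33$
⇒ Accept $H_0$
At the 1% level there is no significant evidence that the serum helps to cure the disease.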
Qn) Random samples of 200 bolts manufactured by machine A and of 100 bolts
manufactured by machine B showed 19 and 5 defective bolts respectively. Test the
hypothesis that
a. The two machines are showing different qualities of performance
b. Machine B is performing better than A
Use 5% significance level.
Solution:
Consider proportion of defective bolts
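$N_1 = 200$, $P_1 = \dfrac{19}{200} = 0.095$, $N_2 = 100$, $P_2 = \dfrac{5}{100} = 0.05$
$p = \dfrac{N_1 P_1 + N_2 P_2}{N_1 + N_2} = \dfrac{(200 \times 0.095) + (100 \times 0.05)}{200 + 100} = 0.08$
⇒ $q = 1 - p = 0.92$
$Z = \dfrac{(P_1 - P_2) - (p_1 - p_2)}{\sqrt{pq\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}} = \dfrac{(0.095 - 0.05) - 0}{\sqrt{0.08 \times 0.92 \times \left(\dfrac{1}{200} + \dfrac{1}{100}\right)}} = 1.3543$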
a. 𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 ≠ 𝑝2
Two tailed test(𝑙. 𝑜. 𝑠. 𝛼 = 5%, 𝑡𝑎𝑏𝑙𝑒 𝑣𝑎𝑙𝑢𝑒 = 1.96)
1.3543 lies between ±1.96
⇒ Accept 𝐻0
There is no significant difference in the performance.
b. 𝐻0 : 𝑝1 = 𝑝2
𝐻1 : 𝑝1 > 𝑝2
One tailed test(𝑙. 𝑜. 𝑠. 𝛼 = 5%, 𝑡𝑎𝑏𝑙𝑒 𝑣𝑎𝑙𝑢𝑒 = 1.645)
1.3543 < 1.645
⇒ Accept 𝐻0
Machine B is not performing significantly better than machine A.
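The pooled two-proportion $z$ statistic used for the serum and bolt problems can be sketched as a single function. This is an illustration under the assumptions of these notes (pooled estimate $p$ under $H_0 : p_1 = p_2$); the name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2, given success counts x1, x2 and sample sizes n1, n2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion, equal to (N1*P1 + N2*P2) / (N1 + N2)
    q = 1 - p
    return (p1_hat - p2_hat) / math.sqrt(p * q * (1 / n1 + 1 / n2))

z_bolts = two_proportion_z(19, 200, 5, 100)   # defective bolts, machines A and B
z_serum = two_proportion_z(75, 100, 65, 100)  # recoveries, groups A and B

print(f"bolts: z = {z_bolts:.4f}")
print(f"serum: z = {z_serum:.4f}")
```

Each value is then compared with the table value for the chosen test: 1.96 or 1.645 for the bolts at the 5% level, and 2.33 for the serum at the 1% level.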