Applied Statistics Lecture 11
(a) Calculate the 95% confidence interval for the mean acidity
level in the soil of this agricultural region.
(b) Calculate the 99% confidence interval for the mean acidity
level in the soil of this agricultural region.
Solution
(a) With $t_{0.025,60} = 2.000$, the interval is
\[ 5.8 \pm 2.000 \cdot \frac{0.6}{\sqrt{61}} = 5.8 \pm 0.154, \]
i.e., (5.646, 5.954).
Thus, we are 95% confident that the mean acidity level in the
soil lies between pH 5.646 and pH 5.954.
(b) With $t_{0.005,60} = 2.660$, the interval is
\[ 5.8 \pm 2.660 \cdot \frac{0.6}{\sqrt{61}} = 5.8 \pm 0.204, \]
i.e., (5.596, 6.004).
Thus, we are 99% confident that the mean acidity level in the
soil lies between pH 5.596 and pH 6.004.
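The arithmetic in both parts can be checked with a short script. This is a minimal sketch, not part of the lecture: the helper name `t_interval` is hypothetical, and the critical values are passed in from a t-table (60 degrees of freedom) rather than computed.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary values used in the solution: sample mean 5.8, S = 0.6, n = 61.
ci95 = t_interval(5.8, 0.6, 61, t_crit=2.000)  # t_{0.025,60} from a table
ci99 = t_interval(5.8, 0.6, 61, t_crit=2.660)  # t_{0.005,60} from a table

print([round(v, 3) for v in ci95])  # [5.646, 5.954]
print([round(v, 3) for v in ci99])  # [5.596, 6.004]
```

Raising the confidence level from 95% to 99% only changes the critical value, so the interval widens around the same centre.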
Thus,
\[ n \approx \frac{(z_{\alpha/2})^2 S^2}{E^2}. \]
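As an illustration of the sample-size formula, the sketch below rounds the result up to the next whole observation. The function name `sample_size` and the inputs (S = 0.6, E = 0.1) are assumptions chosen for the example, not values from the lecture.

```python
import math
from statistics import NormalDist

def sample_size(confidence, s, e):
    """Smallest integer n with n >= (z_{alpha/2} * s / e)^2."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return math.ceil((z * s / e) ** 2)

# Hypothetical inputs: pilot standard deviation 0.6, desired margin of error 0.1.
print(sample_size(0.95, s=0.6, e=0.1))  # 139
```

Halving the margin of error E roughly quadruples the required n, since E enters the formula squared.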
Theorem 3
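we have
\[ P\left[ a \le \frac{(n-1)S^2}{\sigma^2} \le b \right] = 1 - \alpha. \]
Considering
\[ a \le \frac{(n-1)S^2}{\sigma^2} \le b, \]
and solving for $\sigma^2$, we get
\[ \frac{(n-1)S^2}{b} \le \sigma^2 \le \frac{(n-1)S^2}{a}. \]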
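The inversion above can be sketched in code. Everything here is illustrative: `variance_interval` is a hypothetical helper, the sample (n = 10, S² = 4.0) is made up, and a and b are taken to be the 0.025 and 0.975 quantiles of the chi-square distribution with 9 degrees of freedom as read from a table.

```python
def variance_interval(n, s2, a, b):
    """CI for sigma^2: ((n-1)S^2 / b, (n-1)S^2 / a), with a < b chi-square quantiles."""
    ss = (n - 1) * s2
    return ss / b, ss / a

# Hypothetical sample: n = 10, S^2 = 4.0.
# Chi-square(9) table values for a 95% interval: a = 2.700, b = 19.023.
lo, hi = variance_interval(10, 4.0, a=2.700, b=19.023)
print(round(lo, 3), round(hi, 3))
```

Note that the larger quantile b produces the lower endpoint, because dividing by a larger number gives a smaller bound. Taking square roots of both endpoints gives the corresponding interval for σ.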
Example 7
\[ 1.93 \le \sigma^2 \le 8.95 \qquad (1.39 \le \sigma \le 2.99). \]
(D). CI for Population Proportion
Theorem 4
For large random samples, a 100(1 − α)% CI for the population
proportion p is:
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
Now,
\[ P\left[ -z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \le z_{\alpha/2} \right] \approx 1 - \alpha. \]
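Now, consider the inequality inside the brackets:
\[
\begin{aligned}
-z_{\alpha/2} &\le \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \le z_{\alpha/2} \\
-z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} &\le \hat{p} - p \le +z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\
-\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} &\le -p \le -\hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\
\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} &\le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}
\end{aligned}
\]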
Replace the population proportion (p) where it appears in the endpoints of the
interval with the sample proportion (p̂) to get an (approximate)
100(1 − α)% CI for p:
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
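The resulting large-sample interval is straightforward to compute. The sketch below is only an illustration: `proportion_interval` is a hypothetical helper and the counts (60 successes out of 200) are invented for the example.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Approximate CI: p_hat ± z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in a sample of 200, so p_hat = 0.30.
lo, hi = proportion_interval(60, 200)
print(round(lo, 4), round(hi, 4))
```

Because p̂ replaces p inside the square root, the stated coverage is only approximate and depends on n being large.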
Example 8
\[ S_p^2 = \frac{(n-1)S_X^2 + (m-1)S_Y^2}{n+m-2} \]
Also,
\[ P\left[ -t_{\alpha/2,\,n+m-2} \le \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n} + \frac{1}{m}}} \le t_{\alpha/2,\,n+m-2} \right] = 1 - \alpha. \]
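On simplification, we get
\[ (\bar{X} - \bar{Y}) - t_{\alpha/2,\,n+m-2}\, S_p\sqrt{\frac{1}{n} + \frac{1}{m}} \le \mu_1 - \mu_2 \le (\bar{X} - \bar{Y}) + t_{\alpha/2,\,n+m-2}\, S_p\sqrt{\frac{1}{n} + \frac{1}{m}}. \]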
Suppose the weekly numbers of products sold by two sales teams, A and
B, are as follows:
Team A: 28, 35, 30, 32, 29, 34, 31, 33, 27, 36, 30, 32, 28, 35, 31, 33,
        29, 34, 30, 32, 31
Team B: 24, 29, 26, 31, 27, 30, 28, 32, 25, 33, 29, 31, 24, 29, 28, 32,
        26, 30, 27, 31, 28
Since the sample variances $S_X^2 = 6.05$ and $S_Y^2 = 6.63$ are close,
we can assume the population variances are similar.
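Under the equal-variance assumption, the pooled interval for µ1 − µ2 can be sketched as follows. Several things here are assumptions: the two-column table is read as 21 values per team, the sample variances are taken as quoted in the text (6.05 and 6.63) rather than recomputed, the critical value t_{0.025,40} = 2.021 comes from a t-table, and `pooled_interval` is a hypothetical helper name.

```python
import math
from statistics import mean

team_a = [28, 35, 30, 32, 29, 34, 31, 33, 27, 36, 30, 32, 28, 35, 31, 33,
          29, 34, 30, 32, 31]
team_b = [24, 29, 26, 31, 27, 30, 28, 32, 25, 33, 29, 31, 24, 29, 28, 32,
          26, 30, 27, 31, 28]

def pooled_interval(xbar, ybar, s2x, s2y, n, m, t_crit):
    """CI for mu1 - mu2 using the pooled variance S_p^2."""
    sp2 = ((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2)
    margin = t_crit * math.sqrt(sp2) * math.sqrt(1 / n + 1 / m)
    diff = xbar - ybar
    return diff - margin, diff + margin

# Variances as quoted in the text; t_{0.025,40} = 2.021 from a table.
lo, hi = pooled_interval(mean(team_a), mean(team_b), 6.05, 6.63,
                         len(team_a), len(team_b), t_crit=2.021)
print(round(lo, 3), round(hi, 3))
```

With equal sample sizes the pooled variance is simply the average of the two sample variances, here (6.05 + 6.63)/2 = 6.34.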
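Here,
\[ r = \frac{\left(\frac{S_X^2}{n} + \frac{S_Y^2}{m}\right)^2}{\frac{(S_X^2/n)^2}{n-1} + \frac{(S_Y^2/m)^2}{m-1}} = \frac{\left(\frac{6.05}{21} + \frac{6.63}{21}\right)^2}{\frac{(6.05/21)^2}{20} + \frac{(6.63/21)^2}{20}} \approx 39.92. \]
So, $\lceil r \rceil = 40$.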
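The degrees-of-freedom formula for r is easy to mistype by hand, so a small check may help. This is a sketch using the variances and sample sizes quoted in the text; the function name `welch_df` is an assumption.

```python
import math

def welch_df(s2x, n, s2y, m):
    """Approximate degrees of freedom r for the unequal-variance t-interval."""
    a, b = s2x / n, s2y / m
    return (a + b) ** 2 / (a ** 2 / (n - 1) + b ** 2 / (m - 1))

r = welch_df(6.05, 21, 6.63, 21)
print(math.ceil(r))  # 40
```

When n = m, r can never exceed 2(n − 1), which is the pooled degrees of freedom n + m − 2; it reaches that bound only when the two sample variances are equal.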
\[ \frac{S_X^2}{S_Y^2} > 4 \quad \text{or} \quad \frac{S_Y^2}{S_X^2} > 4 \]
(C.) Paired t-interval
Theorem 3
Day   X_i   Y_i   D_i = X_i − Y_i
1     45    38    7
2     50    42    8
3     48    40    8
4     55    48    7
5     42    35    7
6     47    41    6
7     53    45    8
8     52    40    12
9     49    39    10
10    46    37    9
Solution contd.
\[ \bar{D} \pm t_{0.025,9} \frac{S_D}{\sqrt{n}}. \]
From the given data, we get
\[ 7.4 \pm 2.262 \cdot \frac{1.62}{\sqrt{10}} = (6.241, 8.559). \]
Since the 95% confidence interval does not include 0, we can
conclude that installing the filtration system has a
significant effect in reducing the particulate matter.
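The interval and the "does it contain 0" check can be reproduced from the summary values quoted in the solution (mean difference 7.4, S_D = 1.62, n = 10, t_{0.025,9} = 2.262). This is a minimal sketch; `paired_interval` is a hypothetical helper, and the critical value is taken from the text rather than computed.

```python
import math

def paired_interval(d_bar, s_d, n, t_crit):
    """CI for the mean difference: d_bar ± t_crit * s_d / sqrt(n)."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

# Summary values as quoted in the solution.
lo, hi = paired_interval(7.4, 1.62, 10, t_crit=2.262)
excludes_zero = not (lo <= 0 <= hi)

print(round(lo, 3), round(hi, 3), excludes_zero)  # 6.241 8.559 True
```

Working with the differences reduces the paired problem to a one-sample interval, which is why only n − 1 = 9 degrees of freedom are used.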
CIs for ratio of two population variances
Theorem 4
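If $X_1, X_2, \ldots, X_n \sim N(\mu_X, \sigma_X^2)$ and
$Y_1, Y_2, \ldots, Y_m \sim N(\mu_Y, \sigma_Y^2)$ are independent samples, then a
$(1-\alpha)100\%$ CI for $\sigma_X^2/\sigma_Y^2$ is:
\[ \left( \frac{1}{F_{\alpha/2}(n-1,\,m-1)} \cdot \frac{S_X^2}{S_Y^2},\;\; F_{\alpha/2}(m-1,\,n-1) \cdot \frac{S_X^2}{S_Y^2} \right). \]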
Proof
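We know that $\frac{(n-1)S_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$ and $\frac{(m-1)S_Y^2}{\sigma_Y^2} \sim \chi^2_{m-1}$.
Also, by the independence of the two samples,
\[ F = \frac{\frac{(m-1)S_Y^2}{\sigma_Y^2} \big/ (m-1)}{\frac{(n-1)S_X^2}{\sigma_X^2} \big/ (n-1)} = \frac{\sigma_X^2}{\sigma_Y^2} \cdot \frac{S_Y^2}{S_X^2} \sim F(m-1,\,n-1). \]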
Therefore,
\[ P\left[ F_{1-\alpha/2}(m-1,\,n-1) \le \frac{\sigma_X^2}{\sigma_Y^2} \cdot \frac{S_Y^2}{S_X^2} \le F_{\alpha/2}(m-1,\,n-1) \right] = 1 - \alpha. \]
Simplifying the quantity within the bracket and using the fact
that
\[ F_{1-\alpha/2}(m-1,\,n-1) = \frac{1}{F_{\alpha/2}(n-1,\,m-1)}, \]
we obtain the interval stated in the theorem.
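The simplification can be written out step by step. The following is a sketch of the algebra using only the symbols already defined in the proof: the inequality inside the probability is multiplied through by $S_X^2/S_Y^2$, and the lower critical value is then rewritten with the reciprocal identity.

```latex
\begin{align*}
F_{1-\alpha/2}(m-1,n-1)
  &\le \frac{\sigma_X^2}{\sigma_Y^2}\cdot\frac{S_Y^2}{S_X^2}
   \le F_{\alpha/2}(m-1,n-1) \\
\Longleftrightarrow\quad
F_{1-\alpha/2}(m-1,n-1)\,\frac{S_X^2}{S_Y^2}
  &\le \frac{\sigma_X^2}{\sigma_Y^2}
   \le F_{\alpha/2}(m-1,n-1)\,\frac{S_X^2}{S_Y^2} \\
\Longleftrightarrow\quad
\frac{1}{F_{\alpha/2}(n-1,m-1)}\,\frac{S_X^2}{S_Y^2}
  &\le \frac{\sigma_X^2}{\sigma_Y^2}
   \le F_{\alpha/2}(m-1,n-1)\,\frac{S_X^2}{S_Y^2}
\end{align*}
```

The direction of the inequalities is preserved in the second line because $S_X^2/S_Y^2$ is positive.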
Then, the 95% CI for $\sigma_X^2/\sigma_Y^2$ is
\[ \frac{1}{2.47} \cdot \frac{6.05}{6.63} \le \frac{\sigma_X^2}{\sigma_Y^2} \le 2.47 \cdot \frac{6.05}{6.63}. \]
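Evaluating the two endpoints numerically shows whether the ratio 1 is plausible. The sketch below uses the values quoted in the text (variances 6.05 and 6.63, critical value 2.47 for 20 and 20 degrees of freedom); the helper name `variance_ratio_interval` and its parameter names are assumptions.

```python
def variance_ratio_interval(s2x, s2y, f_nm, f_mn):
    """CI for sigma_X^2 / sigma_Y^2.

    f_nm is F_{alpha/2}(n-1, m-1) and f_mn is F_{alpha/2}(m-1, n-1),
    both read from an F-table.
    """
    ratio = s2x / s2y
    return ratio / f_nm, ratio * f_mn

# n = m = 21, so both critical values are the quoted 2.47.
lo, hi = variance_ratio_interval(6.05, 6.63, f_nm=2.47, f_mn=2.47)
print(round(lo, 3), round(hi, 3))
contains_one = lo <= 1 <= hi
```

If the interval contains 1, the data are consistent with equal population variances, which is the condition the pooled two-sample interval relies on.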
Now,
\[ P\left[ -z_{\alpha/2} \le \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}} \le z_{\alpha/2} \right] \approx 1 - \alpha. \]