
Solutions to Final, 2/11/05, Statistics & Probability (150610)

[The fine print given in some of the problems indicates how the reasoning should go in answering that question.]

1. (a.) [The uniform distribution is continuous, and its square root is then also continuous. So the cdf has to be calculated
first, and then the derivative is taken to obtain the pdf.]

U ∼ Unif[0, 1] and X = √U. Let F(x) = P(X ≤ x).
Note that F(x) = 0 if x ≤ 0, and F(x) = 1 if x ≥ 1.
For 0 < x < 1,

F(x) = P(X ≤ x) = P(√U ≤ x) = P(U ≤ x²) = x².

Thus the density of X is given by

f(x) = F′(x) = 2x, 0 ≤ x ≤ 1.
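As a quick numerical check (not part of the original solution; a minimal sketch assuming NumPy is available), the empirical CDF of X = √U should match the derived F(x) = x²:

    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.uniform(0.0, 1.0, size=100_000)
    x = np.sqrt(u)                      # X = sqrt(U)
    for t in (0.25, 0.5, 0.75):
        # empirical P(X <= t) should be close to t^2
        print(t, np.mean(x <= t), t ** 2)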

(b.) [Y takes only integer values, hence it is a discrete r.v.; we can calculate the pmf directly.]

U takes values between 0 and 1. Hence Y = ⌊nU⌋ takes integer values between 0
(when U = 0) and n (when U = 1). Note that
\[
\lfloor nU \rfloor = k \iff k \le nU < k + 1 \iff \frac{k}{n} \le U < \frac{k+1}{n}.
\]
Hence P(Y = k) = P(⌊nU⌋ = k) = P(k/n ≤ U < (k + 1)/n). Then P(Y = n) =
P(1 ≤ U < 1 + 1/n) = 0. Hence Y actually takes the values 0, 1, . . ., n − 1 and
\[
P(Y = k) = P\!\left(\frac{k}{n} \le U < \frac{k+1}{n}\right) = \frac{k+1}{n} - \frac{k}{n} = \frac{1}{n}, \quad k = 0, 1, \ldots, n - 1.
\]
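A similar simulation sketch (an illustration, not part of the original solution; n = 10 is an arbitrary choice) confirms that Y = ⌊nU⌋ is uniform on {0, 1, . . ., n − 1}:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    y = np.floor(n * rng.uniform(size=200_000)).astype(int)
    print(np.bincount(y, minlength=n) / len(y))   # each entry should be close to 1/n = 0.1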

2. Suppose X and Y have the joint density function given by

f(x, y) = k·(y − x) for 0 ≤ x ≤ y ≤ 1, and f(x, y) = 0 elsewhere.

(a.) The density is positive on the triangle created by the points (x = 0, y = 0),
(x = 0, y = 1) and (x = 1, y = 1).
(b.) [Total probability (volume) has to be equal to 1. Also, while doing the integration, the limits of integration have to
be carefully determined according to the picture above. For example, for a given y, what are the values of x
where f(x, y) is positive?]

Since f (x, y) is a density, we must have


\[
1 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\,dx\,dy
  = k \int_0^1\!\int_0^y (y-x)\,dx\,dy
  = k \int_0^1 \left[ yx - \frac{x^2}{2} \right]_{x=0}^{x=y} dy
\]
\[
  = k \int_0^1 \left( y^2 - \frac{y^2}{2} \right) dy
  = k \int_0^1 \frac{y^2}{2}\,dy
  = \frac{k}{2}\left[ \frac{y^3}{3} \right]_{y=0}^{y=1}
  = \frac{k}{2 \cdot 3} = \frac{k}{6}.
\]

Hence k must be equal to 6.
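The normalization can also be confirmed numerically. The following sketch (illustrative only, assuming SciPy is available) integrates 6(y − x) over the triangle 0 ≤ x ≤ y ≤ 1 and should return approximately 1:

    from scipy import integrate

    # inner variable x runs from 0 to y; outer variable y runs from 0 to 1
    total, _ = integrate.dblquad(lambda x, y: 6.0 * (y - x), 0, 1, lambda y: 0.0, lambda y: y)
    print(total)   # ~ 1.0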

(c.) The marginal densities of X and Y are as follows. [After completing the integrations, the range
of values of X or Y must also be given.]

\[
f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_x^1 6(y-x)\,dy
        = 6\left[ \frac{y^2}{2} - xy \right]_{y=x}^{y=1}
        = 6\left[ \left(\frac{1}{2} - x\right) - \left(\frac{x^2}{2} - x^2\right) \right]
        = 6\left[ \frac{1}{2} - x + \frac{x^2}{2} \right]
        = 3(1 - 2x + x^2) = 3(1-x)^2, \quad 0 \le x \le 1.
\]
\[
f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_0^y 6(y-x)\,dx
        = 6\left[ yx - \frac{x^2}{2} \right]_{x=0}^{x=y}
        = 6\left[ y^2 - \frac{y^2}{2} - 0 \right]
        = 6\,\frac{y^2}{2} = 3y^2, \quad 0 \le y \le 1.
\]
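As an illustrative check (not part of the original solution; SciPy assumed, and the evaluation points x₀ = y₀ = 0.3 are arbitrary), one can integrate the joint density numerically and compare with the closed forms 3(1 − x)² and 3y²:

    from scipy.integrate import quad

    x0 = y0 = 0.3
    fX, _ = quad(lambda y: 6.0 * (y - x0), x0, 1)   # integrate over y in [x0, 1]
    fY, _ = quad(lambda x: 6.0 * (y0 - x), 0, y0)   # integrate over x in [0, y0]
    print(fX, 3 * (1 - x0) ** 2)    # both ~ 1.47
    print(fY, 3 * y0 ** 2)          # both ~ 0.27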

(d.) [Once again the range of values of X and Y must also be given.]

Note that given X = x, Y takes values in [x, 1]. We can safely assume that x < 1
(because if x = 1 then Y takes only one value 1 – not an interesting case!) Hence
the conditional density of Y given X = x is given by

\[
f_{Y \mid X=x}(y) = \frac{f(x,y)}{f_X(x)} = \frac{6(y-x)}{3(1-x)^2} = \frac{2(y-x)}{(1-x)^2}, \quad x \le y \le 1.
\]

Also, given Y = y, X takes values in [0, y]. Here also we can safely assume that
y > 0 (because if y = 0, then X takes only one value 0 – not interesting!) Hence
the conditional density of X given Y = y is given by

\[
f_{X \mid Y=y}(x) = \frac{f(x,y)}{f_Y(y)} = \frac{6(y-x)}{3y^2} = \frac{2(y-x)}{y^2}, \quad 0 \le x \le y.
\]
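A conditional density must integrate to 1 over its support; this short check (illustrative only, with arbitrary conditioning values x₀ = 0.3 and y₀ = 0.7) verifies that for both conditional densities derived above:

    from scipy.integrate import quad

    x0, y0 = 0.3, 0.7
    print(quad(lambda y: 2 * (y - x0) / (1 - x0) ** 2, x0, 1)[0])   # ~ 1.0
    print(quad(lambda x: 2 * (y0 - x) / y0 ** 2, 0, y0)[0])         # ~ 1.0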

3. There are 100 packages. Let Xi denote the (random) weight of the ith package, i =
1, 2, . . ., 100. It is given that the Xi's are independent with E(Xi) = 15 and Var(Xi) = 10².
Let Y denote the total weight of the packages. Then Y = X1 + X2 + · · · + X100, and we want
P(Y > 1700).
[No model/distribution is given for the random variables Xi. However, since 100 is a reasonably large number, from the
Central Limit Theorem we know that the sum of the Xi's is approximately Normally distributed.]

Using the CLT we have
\[
Y = \sum_{i=1}^{100} X_i \;\overset{\text{approx}}{\sim}\; N\big(\mu = 100 \times 15 = 1500,\ \sigma^2 = 100 \times 10^2 = 100^2\big).
\]
Hence
\[
P(Y > 1700) = P\!\left( \frac{Y - \mu}{\sigma} > \frac{1700 - 1500}{100} \right) = P(U > 2), \quad \text{where } U \sim N(0,1),
\]
\[
= 1 - \Phi(2) = 1 - 0.9772 = 0.0228.
\]
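To illustrate the CLT step (a sketch only; the true weight distribution is unspecified, so a skewed Gamma distribution with mean 15 and standard deviation 10 is assumed purely for illustration), one can compare the normal approximation 1 − Φ(2) with a simulated probability:

    import numpy as np
    from scipy.stats import norm

    print(1 - norm.cdf(2.0))                      # 0.02275, the CLT answer

    rng = np.random.default_rng(0)
    # hypothetical skewed weights with mean 15 and sd 10 (Gamma: shape*scale = 15, shape*scale^2 = 100)
    weights = rng.gamma(shape=2.25, scale=15 / 2.25, size=(50_000, 100))
    print(np.mean(weights.sum(axis=1) > 1700))    # close to 0.0228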

4. (a.) When X ∼ Binomial(n, p), the population proportion p is estimated by p̂ = X/n.
But from De Moivre's Law (or the CLT) we know that
\[
P\!\left( \left| \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \right| \le z_{\alpha/2} \right) = 1 - \alpha,
\quad \text{i.e.,} \quad
P\!\left( |\hat{p} - p| \le z_{0.025}\sqrt{\frac{p(1-p)}{n}} \right) = 0.95.
\]

Hence to estimate p within 0.05 of the true value with 95% probability we need
\[
z_{0.025}\sqrt{\frac{p(1-p)}{n}} \le 0.05
\;\Leftrightarrow\;
1.96\sqrt{\frac{p(1-p)}{n}} \le 0.05
\;\Leftrightarrow\;
n \ge \left(\frac{1.96}{0.05}\right)^2 p(1-p).
\]
But it is suspected that p > 0.85, and it is easily seen from the graph of p(1 − p) that
p(1 − p) ≤ 0.85(1 − 0.85) for all p ≥ 0.85. Hence any
\[
n \ge \left(\frac{1.96}{0.05}\right)^2 \times 0.85(1 - 0.85) = 195.92
\]
would satisfy the above condition. Hence the sample size must be bigger than or
equal to 196.
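The sample-size bound can be reproduced with a few lines (a sketch, not part of the original solution; SciPy assumed for the normal quantile):

    import math
    from scipy.stats import norm

    z = norm.ppf(0.975)                     # ~ 1.96
    n_min = (z / 0.05) ** 2 * 0.85 * (1 - 0.85)
    print(n_min, math.ceil(n_min))          # ~ 195.9, so n = 196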
(b.) Here n = 300, p̂ = 0.96. Then the 95% confidence interval for p is given by
\[
\hat{p} \pm z_{0.025}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
= 0.96 \pm 1.96 \times \sqrt{\frac{0.96 \times 0.04}{300}}
= 0.96 \pm 0.0222,
\]
i.e., (0.9378, 0.9822).
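A quick numerical check of the interval in (b.) (illustrative only):

    import math

    phat, n, z = 0.96, 300, 1.96
    half = z * math.sqrt(phat * (1 - phat) / n)
    print(phat - half, phat + half)     # (0.9378, 0.9822)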
(c.) To ensure a higher probability of having the true proportion inside the interval, we
must enlarge the interval. Thus a 99% confidence interval would be larger than
the one obtained in (b.).
(d.) When we study more and more observations, our conclusions would be expected
to be more accurate. In this case that translates to having a smaller interval.
Thus if the researcher took a sample of 400 executives, I would expect a smaller
95% confidence interval than the one obtained in (b.).

5. (a.) Model: Suppose Xi denotes the melting point of the ith sample of the oil. The Xi's
(i = 1, 2, . . ., 16) are i.i.d. with a Normal distribution with known variance σ² = 1.2²
and unknown mean µ.
[Since σ is known we can use it without having to use the sample standard deviation S. Hence the relevant
distribution would remain Normal and not Student’s t.]

To test: H0: µ = 95 (= µ0) vs. H1: µ < 95.

We use the test statistic
\[
Z = \frac{\bar{X} - 95}{1.2/\sqrt{16}} = \frac{\bar{X} - 95}{0.3}.
\]
Under H0, Z ∼ N(0, 1). The critical region for H0 is Z ≤ −zα = −z0.01 = −2.33.
The observed test statistic is
\[
z_{\mathrm{obs}} = \frac{94.32 - 95}{0.3} = -2.2667.
\]

Since z_obs = −2.2667 > −2.33, i.e., z_obs does not fall in the critical region, we
accept H0.
In other words, the given data does not show enough evidence to conclude at the 1%
level of significance that the expected melting point of the given oil is lower than
the desired level of 95.
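The test in (a.) can be reproduced numerically; this sketch (SciPy assumed, not part of the original solution) computes the observed statistic, the exact 1% critical value, and the p-value:

    from scipy.stats import norm

    xbar, mu0, sigma, n = 94.32, 95.0, 1.2, 16
    z_obs = (xbar - mu0) / (sigma / n ** 0.5)
    print(z_obs)                 # -2.2667
    print(norm.ppf(0.01))        # critical value ~ -2.326 (tabulated as -2.33)
    print(norm.cdf(z_obs))       # p-value ~ 0.0117 > 0.01, so H0 is not rejected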
(b.) [The power of a test is the probability of rejecting H0 when H1 is true, i.e., the probability of the critical region under the
alternative hypothesis.]

When the true value of µ is 94, then X̄ is normally distributed with mean 94 and
standard deviation 1.2/√16 = 0.3. Hence the power of this test when µ = 94 is
given by
\[
P_{\mu=94}(Z \le -2.33)
= P_{\mu=94}\!\left( \frac{\bar{X} - 95}{0.3} \le -2.33 \right)
= P_{\mu=94}\big( \bar{X} \le 95 - 2.33 \times 0.3 \big)
\]
\[
= P_{\mu=94}\!\left( \frac{\bar{X} - 94}{0.3} \le \frac{95 - 2.33 \times 0.3 - 94}{0.3} \right)
= P_{\mu=94}\!\left( \frac{\bar{X} - 94}{0.3} \le 1.0033 \right)
\approx \Phi(1) = 0.8413.
\]
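The power computation can be verified directly (a minimal sketch, SciPy assumed):

    from scipy.stats import norm

    crit = 95 - 2.33 * 0.3                  # reject H0 when the sample mean is <= 94.301
    print(norm.cdf((crit - 94) / 0.3))      # ~ 0.842, close to Phi(1) = 0.8413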

6. (a.) [To show that a r.v. is χ²(n), we can use the property that the moment generating function identifies the distribution
uniquely.]

Let us calculate the moment generating function of the random variable V ≡ 2θU, where U ∼ exponential(θ).
\[
m(t) = E\!\left( e^{tV} \right) = E\!\left( e^{2t\theta U} \right)
= \int_{-\infty}^{\infty} e^{2t\theta u} f_U(u)\,du
= \int_0^{\infty} e^{2t\theta u}\,\theta e^{-\theta u}\,du
\]
\[
= \theta \int_0^{\infty} e^{(2t-1)\theta u}\,du
= \frac{\theta}{(2t-1)\theta}\, e^{(2t-1)\theta u}\Big|_{u=0}^{u=\infty}
= \frac{1}{(2t-1)}\,[0 - 1], \quad \text{provided } 2t - 1 < 0 \ (\text{otherwise the m.g.f. does not exist})
\]
\[
= \frac{1}{(1-2t)} = (1-2t)^{-1} = (1-2t)^{-2/2},
\]
which is the mgf of χ²(n) with n = 2. Hence 2θU ∼ χ²(2).
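The distributional claim can also be checked by simulation; the sketch below (not part of the original solution; SciPy assumed, and the rate θ = 0.7 is arbitrary) compares 2θU against the χ²(2) distribution with a Kolmogorov–Smirnov test:

    import numpy as np
    from scipy import stats

    theta = 0.7                                   # arbitrary rate, for illustration only
    rng = np.random.default_rng(0)
    u = rng.exponential(scale=1 / theta, size=100_000)
    v = 2 * theta * u
    print(stats.kstest(v, "chi2", args=(2,)))     # large p-value: consistent with chi2(2)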
(b.) [We know that the F^{2n}_{2m} distribution comes from the ratio of χ²(2n)/(2n) and χ²(2m)/(2m), where the Chi-square r.v.'s are
independent. So we try to express the given random variable as the ratio of such random variables.]

Suppose X1, X2, . . ., Xn are i.i.d. exponential(θ1). Then
\[
\theta_1 \bar{X} = \theta_1\,\frac{1}{n}\sum_{i=1}^{n} X_i
= \frac{1}{n}\sum_{i=1}^{n} (\theta_1 X_i)
= \frac{1}{2n}\sum_{i=1}^{n} (2\,\theta_1 X_i).
\]
\[
\Big[\; = \tfrac{1}{2n}\,\{2\theta_1 X_1 + 2\theta_1 X_2 + \cdots + 2\theta_1 X_n\}
= \tfrac{1}{2n}\,\underbrace{\left\{ \chi^2_{(2)} + \chi^2_{(2)} + \cdots + \chi^2_{(2)} \right\}}_{\text{independent}}
= \frac{\chi^2_{(2n)}}{2n} \;\Big]
\]

But from (a.), each of the 2θ1Xi's is a χ²(2) r.v., and they are independent because the Xi's
are so. Further, we know that the sum of independent Chi-square r.v.'s is again a
Chi-square r.v. with degrees of freedom (df) equal to the sum of the individual df's.
Hence the sum of the 2θ1Xi (i = 1, . . ., n) is ∼ χ²(2n), and subsequently θ1X̄ = χ²(2n)/(2n) (equality in
distribution). With exactly the same argument, we can show that if Y1, Y2, . . ., Ym
are i.i.d. exponential(θ2), then distribution-wise θ2Ȳ = χ²(2m)/(2m).
Finally note that since the Xi's are independent of the Yj's, θ1X̄ is independent of θ2Ȳ.
Hence, as far as the probability distribution is concerned,
\[
\frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} = \frac{\chi^2_{(2n)}/(2n)}{\chi^2_{(2m)}/(2m)} \sim F^{2n}_{2m}
\]
(note that the two Chi-square r.v.'s are independent).
(c.) Since θ1X̄/(θ2Ȳ) ∼ F^{2n}_{2m}, we have
\[
E\!\left( \frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} \right) = \frac{2m}{2m-2} = \frac{m}{m-1}.
\]
Note that 2m > 2 for m ≥ 2.
[We want an unbiased estimator of θ2/θ1. In other words, we need an expression (r.v.) whose expectation equals θ2/θ1.]
But
\[
E\!\left( \frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} \right) = \frac{m}{m-1}
\;\Leftrightarrow\;
E\!\left( \frac{m-1}{m}\cdot\frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} \right) = 1
\;\Leftrightarrow\;
E\!\left( \frac{m-1}{m}\cdot\frac{\bar{X}}{\bar{Y}} \right) = \frac{\theta_2}{\theta_1}.
\]
Hence (m − 1)X̄/(mȲ) is an unbiased estimator of θ2/θ1.
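A simulation sketch (illustrative only; the values of θ1, θ2, n, m below are chosen arbitrarily) supports the unbiasedness claim: the average of (m − 1)X̄/(mȲ) over many replications should be close to θ2/θ1:

    import numpy as np

    rng = np.random.default_rng(0)
    theta1, theta2, n, m, reps = 2.0, 5.0, 8, 6, 200_000
    xbar = rng.exponential(1 / theta1, size=(reps, n)).mean(axis=1)
    ybar = rng.exponential(1 / theta2, size=(reps, m)).mean(axis=1)
    print(((m - 1) * xbar / (m * ybar)).mean(), theta2 / theta1)   # both ~ 2.5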
(d.) Once again using the result in (b.), i.e., θ1X̄/(θ2Ȳ) ∼ F^{2n}_{2m}, we have
\[
P\!\left( F^{2n}_{\alpha/2;\,2m} \le \frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} \le F^{2n}_{1-\alpha/2;\,2m} \right) = 1 - \alpha.
\]
But
\[
F^{2n}_{\alpha/2;\,2m} \le \frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}}
\;\Leftrightarrow\;
\frac{\theta_2}{\theta_1} \le \frac{1}{F^{2n}_{\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}}
\qquad\text{and}\qquad
\frac{\theta_1 \bar{X}}{\theta_2 \bar{Y}} \le F^{2n}_{1-\alpha/2;\,2m}
\;\Leftrightarrow\;
\frac{1}{F^{2n}_{1-\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}} \le \frac{\theta_2}{\theta_1}.
\]

Hence
\[
P\!\left( \frac{1}{F^{2n}_{1-\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}} \le \frac{\theta_2}{\theta_1} \le \frac{1}{F^{2n}_{\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}} \right) = 1 - \alpha.
\]
Thus a 100(1 − α)% confidence interval for θ2/θ1 is given by
\[
\left( \frac{1}{F^{2n}_{1-\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}},\; \frac{1}{F^{2n}_{\alpha/2;\,2m}}\,\frac{\bar{X}}{\bar{Y}} \right)
\;\equiv\;
\left( F^{2m}_{\alpha/2;\,2n}\,\frac{\bar{X}}{\bar{Y}},\; F^{2m}_{1-\alpha/2;\,2n}\,\frac{\bar{X}}{\bar{Y}} \right).
\]

[The last formula can be derived directly as well. For example, switching the roles of X and Y in (b.) we get
θ2Ȳ/(θ1X̄) ∼ F^{2m}_{2n}. Then
\[
P\!\left( F^{2m}_{\alpha/2;\,2n} \le \frac{\theta_2 \bar{Y}}{\theta_1 \bar{X}} \le F^{2m}_{1-\alpha/2;\,2n} \right) = 1 - \alpha
\;\Leftrightarrow\;
P\!\left( F^{2m}_{\alpha/2;\,2n}\,\frac{\bar{X}}{\bar{Y}} \le \frac{\theta_2}{\theta_1} \le F^{2m}_{1-\alpha/2;\,2n}\,\frac{\bar{X}}{\bar{Y}} \right) = 1 - \alpha. \;]
\]
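Finally, the coverage of the confidence interval in (d.) can be checked by simulation. The sketch below (not part of the original solution; SciPy assumed, and θ1, θ2, n, m are arbitrary) uses the equivalent form (F^{2m}_{α/2;2n} X̄/Ȳ, F^{2m}_{1−α/2;2n} X̄/Ȳ) and should report coverage near 0.95:

    import numpy as np
    from scipy.stats import f as f_dist

    rng = np.random.default_rng(0)
    theta1, theta2, n, m, alpha = 2.0, 5.0, 8, 6, 0.05
    lo_q = f_dist.ppf(alpha / 2, 2 * m, 2 * n)        # F_{alpha/2; 2m, 2n}
    hi_q = f_dist.ppf(1 - alpha / 2, 2 * m, 2 * n)    # F_{1-alpha/2; 2m, 2n}
    reps, cover = 20_000, 0
    for _ in range(reps):
        xbar = rng.exponential(1 / theta1, size=n).mean()
        ybar = rng.exponential(1 / theta2, size=m).mean()
        cover += lo_q * xbar / ybar <= theta2 / theta1 <= hi_q * xbar / ybar
    print(cover / reps)                                # ~ 0.95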
