Sheet8 Sol
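Answer:
(a) Uniform distribution: $P(X = k) = \frac{1}{N}$ for $k = 1, \dots, N$, i.e. $E(X) = \frac{N+1}{2}$, $\mathrm{Var}(X) = \frac{N^2-1}{12}$
(b) $E(T(X)) = E(2X - 1) = 2E(X) - 1 = N$
(c) $P(T(X) = N) = P(2X - 1 = N) = P\left(X = \frac{N+1}{2}\right) = \begin{cases} \frac{1}{N} & \text{if } \frac{N+1}{2} \in \mathbb{N} \\ 0 & \text{else} \end{cases}$
⇒ N = 4: $P(T(X) = N) = 0$ and N = 5: $P(T(X) = N) = 1/5$
(d) $\mathrm{Var}(T) = \mathrm{Var}(2X - 1) = 4\,\mathrm{Var}(X) = \frac{N^2-1}{3}$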
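The results in (b) to (d) can be checked by direct enumeration. The following is a minimal sketch, not part of the original solution; the helper name `check_uniform` is hypothetical.

```r
# Exact check of parts (b)-(d) for X uniform on 1..N and T(X) = 2X - 1.
# check_uniform is a hypothetical helper introduced for illustration only.
check_uniform <- function(N) {
  k  <- 1:N                     # support of X
  pk <- rep(1 / N, N)           # P(X = k) = 1/N
  tv <- 2 * k - 1               # values of T(X)
  e_t   <- sum(tv * pk)                 # E(T)
  var_t <- sum((tv - e_t)^2 * pk)       # Var(T)
  p_hit <- sum(pk[tv == N])             # P(T(X) = N)
  c(E = e_t, Var = var_t, P_T_eq_N = p_hit)
}

res4 <- check_uniform(4)   # compare with N, (N^2 - 1)/3 and 0
res5 <- check_uniform(5)   # compare with N, (N^2 - 1)/3 and 1/N
res4
res5
```

For even N the value N is never attained by T(X), which is why the probability in (c) drops to zero.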
2. Fish are caught from a lake until n (n ≥ 3) fish of a certain
species A have been caught. The random variable X describes the total
number of fish caught up to that time. The lake contains a large number
of fish, so it can be assumed that the ratio p of the number of fish of
species A to the total number of fish in the lake does not change
when some fish are removed from the lake.
(a) Show that $P_p(X = k) = \binom{k-1}{n-1} p^n (1-p)^{k-n}$, $k = n, n+1, \dots$
$P(X = k) = P(\{n-1 \text{ fish of species A are among the first } k-1 \text{ fish caught}\} \cap \{\text{the } k\text{-th fish caught is of species A}\})$
$= \binom{k-1}{n-1} p^{n-1} (1-p)^{k-1-n+1} \cdot p$
$= \binom{k-1}{n-1} p^n (1-p)^{k-n}$
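(b)
$E(T(X)) = E\left(\frac{n-1}{X-1}\right)$
$= \sum_{k=n}^{\infty} \frac{n-1}{k-1} \binom{k-1}{n-1} p^n (1-p)^{k-n}$
$= \sum_{k=n}^{\infty} \binom{k-2}{n-2} p^n (1-p)^{k-n}$
$= p \cdot \sum_{k=n}^{\infty} \binom{k-2}{n-2} p^{n-1} (1-p)^{k-n}$
$= p \cdot \sum_{k=n-1}^{\infty} \binom{k-1}{(n-1)-1} p^{n-1} (1-p)^{k+1-n}$
$= p \cdot \sum_{k=n-1}^{\infty} P_p(\tilde{X} = k) = p$
where $\tilde{X}$ denotes the number of fish caught until n−1 fish of species A
have been obtained.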
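Both parts can be checked numerically. The sketch below is an illustration only: the values `n = 4` and `p = 0.3` are hypothetical example values, and the infinite sums are truncated at k = 500, where the remaining tail is negligible for these values. It uses the fact that X − n counts the fish not of species A, so `dnbinom()` gives the same probabilities.

```r
# Numerical check of 2(a) and 2(b) for hypothetical example values.
n <- 4      # required number of fish of species A (assumed example value)
p <- 0.3    # proportion of species A in the lake (assumed example value)
k <- n:500  # truncated support of X

# pmf from part (a)
pmf <- choose(k - 1, n - 1) * p^n * (1 - p)^(k - n)

# same probabilities via the negative binomial distribution:
# k - n failures before the n-th success
pmf_nb <- dnbinom(k - n, size = n, prob = p)

max_diff <- max(abs(pmf - pmf_nb))       # difference between the two formulas
total    <- sum(pmf)                     # should be close to 1
e_t      <- sum((n - 1) / (k - 1) * pmf) # should be close to p
c(max_diff = max_diff, total = total, e_t = e_t)
```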
42 50 40 64 30 36 68 42 46 48
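It is easier to consider
$f(\vartheta) = \ln L(x_1, \dots, x_n; \vartheta) = n \ln \vartheta + \left(\sum_{i=1}^{n} x_i - n\right) \ln(1 - \vartheta)$
From $f'(\vartheta) = \frac{n}{\vartheta} - \frac{\sum_{i=1}^{n} x_i - n}{1 - \vartheta} = 0$ we get $\hat{\vartheta} = \frac{n}{\sum_{i=1}^{n} x_i}$. $f'$ has a sign
change from + to − at this point. Thus there is a local maximum.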
Here: ϑ̂ = 0.0215
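The estimate can be reproduced in R. This sketch assumes that the ten values listed above the derivation are the observations x_1, ..., x_10 (the problem statement itself is not part of this excerpt); the numerical maximisation with `optimize()` is added only as a cross-check of the closed form.

```r
# Assumption: these ten values are the observations x_1, ..., x_10.
x <- c(42, 50, 40, 64, 30, 36, 68, 42, 46, 48)
n <- length(x)

# log-likelihood f(theta) = n ln(theta) + (sum(x) - n) ln(1 - theta)
loglik <- function(theta) n * log(theta) + (sum(x) - n) * log(1 - theta)

# closed-form maximiser n / sum(x)
theta_closed <- n / sum(x)

# numerical cross-check on the open interval (0, 1)
theta_num <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum

round(theta_closed, 4)
theta_num
```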
(a) Calculate the distribution function and density for the lifetime S
of the device.
(b) Measuring the lifetimes of devices drawn at random from
production gave the following values in hours:
(a)
(b) Likelihood function
$L(s_1, \dots, s_5; \lambda) = \lambda^5 \prod_{i=1}^{5} \left(1 + \frac{2}{3\sqrt[3]{s_i^2}}\right) e^{-\lambda\left(s_i + 2\sqrt[3]{s_i}\right)}$
(a) If N denotes the unknown number of red deer and X denotes the
random variable which counts the number of marked red deer
caught in the second trapping action, we have
$P_N(X = 2) = \frac{\binom{7}{2}\binom{N-7}{1}}{\binom{N}{3}}$
Likelihood function
 N    p
 7   0.00
 8   0.38
 9   0.50
10   0.52
11   0.51
12   0.48
13   0.44
14   0.40
15   0.37
16   0.34
[Figure: probability of the observation plotted against N (N from 10 to 50); blue = first caught, red = first and second caught]
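The likelihood table can be reproduced with the hypergeometric density. The sketch below reads the setting off the formula above (7 marked animals, 3 animals caught in the second action, 2 of them marked); the range N = 7, ..., 50 mirrors the plotted axis and is otherwise an arbitrary choice.

```r
# Likelihood of observing X = 2 marked animals among 3 caught,
# when 7 of the N animals are marked.
N   <- 7:50
lik <- dhyper(2, m = 7, n = N - 7, k = 3)

# values for N = 7, ..., 16, rounded as in the table
round(lik[N <= 16], 2)

# maximum likelihood estimate of the population size
N_hat <- N[which.max(lik)]
N_hat
```

The maximiser agrees with the largest table entry.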
Confidence Intervals
1. A population is known to be normally distributed with a standard
deviation of 2.8.
(a) Compute the 95% confidence interval on the mean based on the
following sample of nine: 8, 9, 10, 13, 14, 16, 17, 20, 21.
(b) Now compute the 99% confidence interval using the same data.
# Solution applying z.test() from the TeachingDemos package
library(TeachingDemos)
# sample values from the exercise
sample <- c(8, 9, 10, 13, 14, 16, 17, 20, 21)
z.test(x = sample, sd = 2.8, alternative = "two.sided", conf.level = 0.95)$conf.int  # a)
z.test(x = sample, sd = 2.8, alternative = "two.sided", conf.level = 0.99)$conf.int  # b)
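The same intervals can be obtained in base R from the formula mean ± u_{1−α/2} · σ/√n, which avoids the dependency on TeachingDemos. This is a minimal sketch; the helper name `z_interval` is hypothetical.

```r
# Two-sided confidence interval for the mean with known sigma (base R only).
# z_interval is a hypothetical helper introduced for illustration.
z_interval <- function(x, sigma, conf.level) {
  alpha <- 1 - conf.level
  half  <- qnorm(1 - alpha / 2) * sigma / sqrt(length(x))  # half-width
  c(lower = mean(x) - half, upper = mean(x) + half)
}

x <- c(8, 9, 10, 13, 14, 16, 17, 20, 21)
ci95 <- z_interval(x, sigma = 2.8, conf.level = 0.95)  # a)
ci99 <- z_interval(x, sigma = 2.8, conf.level = 0.99)  # b)
ci95
ci99
```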
2. You take a sample of 22 from a population of test scores, and the mean
of your sample is 60.
(a) You know the standard deviation of the population is 10. What
is the 99% confidence interval on the population mean?
(b) Now assume that you do not know the population standard devi-
ation, but the standard deviation in your sample is 10. What is
the 99% confidence interval on the mean now?
# Solution applying z.test() from the TeachingDemos package
library(TeachingDemos)
# values from the exercise: sample size, sample mean, level 1 - alpha = 0.99
n <- 22; m <- 60; alpha <- 0.01
# a) known population standard deviation 10
z.test(x = m, sd = 10, alternative = "two.sided", n = 22, conf.level = 0.99)$conf.int
# b) Now assume that you do not know the population standard
# deviation, but the standard deviation in your sample is 10. What
# is the 99% confidence interval on the mean now?
ssample <- 10
ta <- qt(1 - alpha / 2, n - 1)
ta
u <- m - ta * ssample / sqrt(n)
o <- m + ta * ssample / sqrt(n)
u; o
3. For the sample below from a normally distributed population,
calculate the 95% confidence intervals.
xi : 247.4, 249.0, 248.5, 247.5, 250.6, 252.2, 253.4, 248.3, 251.4, 246.9,
249.8, 250.6, 252.7, 250.6, 250.6, 252.5, 249.4, 250.6, 247.0, 249.4
# create sample values
# s.values <- round(rnorm(n = 20, mean = 251, sd = 2), 1)
s.values <- c(247.4, 249.0, 248.5, 247.5, 250.6, 252.2, 253.4, 248.3, 251.4, 246.9,
              249.8, 250.6, 252.7, 250.6, 250.6, 252.5, 249.4, 250.6, 247.0, 249.4)
# characteristics of the sample
n <- length(s.values)
xbar <- mean(s.values)
s <- sd(s.values)
# level 1 - alpha
alpha <- 0.05
# confidence intervals for mu
# a) assumption: sigma = 2
sigma <- 2
l.a <- xbar - qnorm(1 - alpha / 2) * sigma / sqrt(n)
u.a <- xbar + qnorm(1 - alpha / 2) * sigma / sqrt(n)
l.a; u.a
# b) assumption: sigma unknown
l.b <- xbar - qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
u.b <- xbar + qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
l.b; u.b
# confidence intervals for sigma^2
# c) assumption: mu = 250
mu <- 250
Qn <- sum((s.values - mu)^2)
l.c <- Qn / qchisq(1 - alpha / 2, df = n)
u.c <- Qn / qchisq(alpha / 2, df = n)
l.c; u.c
# d) assumption: mu unknown
l.d <- (n - 1) * s^2 / qchisq(1 - alpha / 2, df = n - 1)
u.d <- (n - 1) * s^2 / qchisq(alpha / 2, df = n - 1)
l.d; u.d
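As a cross-check of case b) (σ unknown), the interval computed by hand from the t quantile should coincide with the one returned by `t.test()` from base R. The sketch repeats the sample so that it runs on its own; the variable names `manual` and `builtin` are introduced here for illustration.

```r
# Cross-check of case b): manual t interval versus t.test().
s.values <- c(247.4, 249.0, 248.5, 247.5, 250.6, 252.2, 253.4, 248.3, 251.4, 246.9,
              249.8, 250.6, 252.7, 250.6, 250.6, 252.5, 249.4, 250.6, 247.0, 249.4)
n <- length(s.values)

# xbar -/+ t_{n-1, 0.975} * s / sqrt(n)
manual <- mean(s.values) +
  c(-1, 1) * qt(0.975, df = n - 1) * sd(s.values) / sqrt(n)

# interval reported by the built-in one-sample t test
builtin <- as.numeric(t.test(s.values, conf.level = 0.95)$conf.int)

manual
builtin
```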
A one-sided confidence interval (lower boundary) for σ:
$\left[\sqrt{\frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha}}},\ \infty\right) = [51.57;\ \infty)$ with $\chi^2_{n-1,1-\alpha} = \chi^2_{49,0.95} = 66.339$
####################################################
# At a telemarketing firm, the length of a telephone
# solicitation (in seconds) is a normally distributed
# random variable with mean mu and standard deviation
# sigma, both unknown. A sample of 50 calls has mean
# length 300 and standard deviation 60.
#
# file: infstatconfintervaltelefirm.R
#####################################################
n <- 50; m <- 300; ssample <- 60; alpha <- 0.05
# a) Construct the 95% confidence upper bound for mu.
ta <- qt(1 - alpha, n - 1)
ta
o <- m + ta * ssample / sqrt(n)
o
# b) Construct the 95% confidence lower bound for sigma.
chi <- qchisq(1 - alpha, n - 1)
chi
u <- (n - 1) * ssample^2 / chi
sqrt(u)
6. You read about a survey in a newspaper and find that 70% of the 250
people sampled prefer candidate A.
# sample proportion and sample size from the exercise
p <- 0.7; n <- 250
alpha <- 0.05  # assumption: 95% level; the required level is not stated in this excerpt
# normal approximation
l.appr <- p - qnorm(1 - alpha / 2) * sqrt(p * (1 - p) / n)
u.appr <- p + qnorm(1 - alpha / 2) * sqrt(p * (1 - p) / n)
l.appr; u.appr
# exact: search a fine grid of candidate proportions
xp <- seq(0, 1, length.out = 1 + 10^4)
l.ex <- xp[min(which(qbinom(1 - alpha / 2, n, xp) == p * n))]
u.ex <- xp[max(which(qbinom(alpha / 2, n, xp) == p * n))]
l.ex; u.ex
# exact confidence interval with R-function
binom.test(x = 0.7 * 250, n = 250, conf.level = 1 - alpha)$conf.int
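The interval reported by `binom.test()` is the Clopper-Pearson interval, which can be written directly in terms of beta quantiles. The sketch below illustrates this for the survey data (175 of 250 respondents); the level `alpha = 0.05` is an assumption, since the required level is not stated in this excerpt.

```r
# Clopper-Pearson interval via beta quantiles, compared with binom.test().
x <- 175; n <- 250   # 70% of the 250 people sampled
alpha <- 0.05        # assumption: 95% confidence level

cp <- c(qbeta(alpha / 2, x, n - x + 1),       # lower bound
        qbeta(1 - alpha / 2, x + 1, n - x))   # upper bound

bt <- as.numeric(binom.test(x, n, conf.level = 1 - alpha)$conf.int)

cp
bt
```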
# p and n below are inferred from the binom.test() call at the end of this block
p <- 0.4; n <- 100
# normal approximation
u.appr <- p + qnorm(1 - alpha) * sqrt(p * (1 - p) / n)
u.appr
# exact
xp <- seq(0, 1, length.out = 1 + 10^4)
u.ex <- xp[max(which(qbinom(alpha, n, xp) == p * n))]
u.ex
# exact confidence interval with R-function
binom.test(x = 40, n = 100, alternative = "less",
           conf.level = 1 - alpha)$conf.int
9. The interval [45.6, 47.8] is a symmetric 99% confidence interval for the
unknown parameter µ based on a sample x1 , . . . , x10 from a normal
10. The waiting time at the pay desk of a certain supermarket is normally
distributed with mean waiting time µ and known standard deviation
σ = 1.8 minutes. A confidence interval for the mean waiting time
(in minutes) for this supermarket is [5.12; 8.32]. If the sample size is
n = 10, what is then the confidence level?
Answer: The length of the interval is 8.32 − 5.12 and
$8.32 - 5.12 = 2 \cdot u_{1-\frac{\alpha}{2}} \cdot \frac{\sigma}{\sqrt{n}} = 2 \cdot u_{1-\frac{\alpha}{2}} \cdot \frac{1.8}{\sqrt{10}}$,
i.e. $u_{1-\frac{\alpha}{2}} = 2.81$, and the
normal distribution table gives $1 - \frac{\alpha}{2} = 0.9975$, i.e. α ≈ 0.005. So the
confidence level is 1 − α = 99.5%.
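The table lookup can be replaced by `pnorm()`. A minimal sketch of the calculation above (the variable names are chosen here for illustration):

```r
# Recover the confidence level from the interval length.
width <- 8.32 - 5.12   # length of the interval
sigma <- 1.8           # known standard deviation (minutes)
n     <- 10            # sample size

u <- (width / 2) * sqrt(n) / sigma   # quantile u_{1 - alpha/2}
conf_level <- 2 * pnorm(u) - 1       # 1 - alpha

u
conf_level
```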
library(tidyverse)
# parameters taken from the plot title further below: H(N = 500, M, n = 50), 95% level
N <- 500; n <- 50; alpha <- 0.05
# symmetric intervals [lb, ub] for X with probability 1 - alpha
# for different values of M
sy.intervals <- tibble(
  M = 0:N,
  # quantiles of H(N, M, n)
  lb = qhyper(alpha / 2, M, N - M, n),
  ub = qhyper(1 - alpha / 2, M, N - M, n)
)
# plot of the intervals
plot(x = sy.intervals$M, y = sy.intervals$lb, col = "blue",
     type = "p",
     xlab = "M", ylab = "lower and upper bounds",
     main = "symmetric 95% intervals for X")
points(x = sy.intervals$M, y = sy.intervals$ub, col = "red")
# The binom.test(x, n) function returns in the variable
# conf.int the confidence interval for p = M/N if there are X
# white balls in a sample of n balls drawn from the urn
# with replacement
binom.appr.conf.intervall <- function(x) {
  return(
    c(
      binom.test(x, n, conf.level = 1 - alpha)$conf.int[1] * N,
      binom.test(x, n, conf.level = 1 - alpha)$conf.int[2] * N
    )
  )
}
# normal approximation of the confidence interval for an
# unknown proportion if x white balls are in a sample of
# n balls drawn with replacement
normal.appr.conf.intervall <- function(x) {
  return(
    c(
      N * (x / n - qnorm(1 - alpha / 2) * sqrt(x * (1 - x / n) / n^2)),
      N * (x / n + qnorm(1 - alpha / 2) * sqrt(x * (1 - x / n) / n^2))
    )
  )
}
# tibble of the bounds of the confidence intervals for M
# for all possible values of X
# (ex.conf.intervall() is not defined in this excerpt)
tab <- tibble(
  X = 0:n) %>%
  group_by(X) %>%
  mutate(ex.lb = ex.conf.intervall(X)[1],
         ex.ub = ex.conf.intervall(X)[2],
         binom.lb = binom.appr.conf.intervall(X)[1],
         binom.ub = binom.appr.conf.intervall(X)[2],
         norm.lb = normal.appr.conf.intervall(X)[1],
         norm.ub = normal.appr.conf.intervall(X)[2]
  )
# plot of all bounds
plot(x = tab$X, y = tab$ex.lb, col = "red",
     xlab = "x", ylab = "M",
     main = "95% confidence interval for M in H(N=500, M, n=50)",
     sub = "red = exact, blue = binomial approx., black = normal approx.")
points(x = tab$X, y = tab$ex.ub, col = "red")
points(x = tab$X, y = tab$binom.lb, col = "blue")
points(x = tab$X, y = tab$binom.ub, col = "blue")
points(x = tab$X, y = tab$norm.lb, col = "black")
points(x = tab$X, y = tab$norm.ub, col = "black")