The University of Nottingham: Do NOT Turn Examination Paper Over Until Instructed To Do So
FUNDAMENTALS OF STATISTICS
Candidates may complete the front cover of their answer book and sign their desk card but must
NOT write anything else until the start of the examination period is announced.
An indication is given of the approximate weighting of each section of a question by means of a figure
enclosed by square brackets, eg [12], immediately following that section.
Dictionaries are not allowed with one exception. Those whose first language is not
English may use a standard translation dictionary to translate between that language and
English provided that neither language is the subject of this examination. Subject specific
translation dictionaries are not permitted.
No electronic devices capable of storing and retrieving text, including electronic
dictionaries, may be used.
1 (a) (i) Explain what is meant by saying that ĝ(X) is an unbiased estimator of g(θ).
(ii) Let pX (x|θ) denote a probability mass function or probability density function. State
the Neyman factorization theorem.
(iii) Suppose that T(x) is a sufficient statistic. Use the Neyman factorization theorem to
prove the following: if T(x) = T(x′) then pX(x|θ)/pX(x′|θ) is independent of θ. [9]
(b) Let X1, . . . , Xn denote an independent and identically distributed sample from the
distribution with probability density function

pX(x|θ) = θ^{−1} if x ∈ [0, θ], and 0 otherwise.
(iii) Prove that the MLE, θ̂, is sufficient for θ in this example.
(iv) By using the identity

Prob(max_{1≤i≤n} Xi ≤ x | θ) = ∏_{i=1}^{n} Prob(Xi ≤ x | θ),

find the cumulative distribution function and hence the probability density function of θ̂. [12]
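As a quick numerical sanity check (an illustrative aside; n, θ and the evaluation points below are arbitrary choices): for the uniform model of part (b), the identity in (iv) gives Prob(θ̂ ≤ x|θ) = (x/θ)^n for x ∈ [0, θ], and a short Python simulation agrees:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 5, 2.0      # arbitrary sample size and true parameter
reps = 100_000         # number of simulated samples

# theta_hat = max of n iid Uniform[0, theta] draws, replicated many times
maxima = rng.uniform(0.0, theta, size=(reps, n)).max(axis=1)

for x in (0.5, 1.0, 1.5):
    empirical = np.mean(maxima <= x)   # Monte Carlo estimate of the CDF
    exact = (x / theta) ** n           # CDF implied by the product identity
    print(f"x={x}: empirical {empirical:.4f} vs (x/theta)^n {exact:.4f}")
```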
2 (a) Let pX (x|θ) denote a probability mass function or probability density function.
(i) Define the score statistic U(X).
(ii) Prove that U(X) has zero expectation.
(iii) State a necessary and sufficient condition for an unbiased estimator ĝ(X) of g(θ) to
attain the Cramér–Rao lower bound. [8]
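As an illustration of the property in (a)(ii) (an aside; the Poisson(θ) model here is chosen purely for convenience and is not part of the question), the score of a single observation is U(x) = x/θ − 1, and its simulated average sits near zero:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 3.0                           # arbitrary true parameter
x = rng.poisson(theta, size=200_000)  # iid draws from Poisson(theta)

# Score of one Poisson observation: d/dtheta log p(x|theta) = x/theta - 1
score = x / theta - 1.0
print(f"average score: {score.mean():.5f} (theory: exactly 0)")
```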
(b) Let X1, . . . , Xn be an independent and identically distributed sample from the distribution
with probability mass function

pX(x|θ) = ((1 − θ)^3 / 2) θ^x (x + 1)(x + 2), x = 0, 1, 2, . . . , 0 < θ < 1.
(i) Calculate the log-likelihood for θ and determine the score statistic.
(ii) Find an unbiased estimator of θ/(1 − θ).
(iii) Explain, giving brief justification, whether this unbiased estimator attains the Cramér–
Rao lower bound.
(iv) Calculate the Cramér–Rao lower bound. [11]
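As a numerical aside (with θ fixed at an arbitrary value): since (x + 1)(x + 2)/2 is the binomial coefficient C(x + 2, 2), the displayed expression coincides with a negative-binomial pmf with 3 successes and success probability 1 − θ, and it sums to 1; a short check in Python:

```python
import numpy as np
from scipy.stats import nbinom

theta = 0.4             # arbitrary value in (0, 1)
x = np.arange(0, 200)   # truncation point chosen so the tail is negligible

# Displayed pmf: (1 - theta)^3 / 2 * theta^x * (x + 1)(x + 2)
p = (1 - theta) ** 3 / 2 * theta ** x * (x + 1) * (x + 2)

print(f"sum of pmf: {p.sum():.6f}")                 # ~ 1
print(np.allclose(p, nbinom.pmf(x, 3, 1 - theta)))  # matches nbinom(3, 1 - theta)
```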
(iv) Given a sample x1, . . . , xn from (3.1), find the posterior distribution for θ using the
conjugate prior you obtained in part (a)(iii). [10]
(c) Suppose that X ∼ Binomial(n, θ) with prior for θ the uniform distribution on [0, 1]. Let
Y ∼ Binomial(m, θ) denote a new observation.
(i) Calculate the predictive distribution of Y given that X = x.
(ii) Identify this distribution in the case m = 1. [8]
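A Monte Carlo cross-check for part (c) (an aside; n, x and m below are arbitrary choices): under the uniform prior, standard beta–binomial conjugacy gives θ | X = x ∼ Beta(x + 1, n − x + 1), so the predictive distribution can be simulated by drawing θ from the posterior and then Y given θ; for m = 1 the simulated Prob(Y = 1 | X = x) should equal the posterior mean (x + 1)/(n + 2):

```python
import numpy as np

rng = np.random.default_rng(2)
n, x, m = 10, 7, 1   # arbitrary data and new-sample size
reps = 200_000

# Posterior under the uniform prior: theta | X = x ~ Beta(x + 1, n - x + 1)
theta = rng.beta(x + 1, n - x + 1, size=reps)
y = rng.binomial(m, theta)  # predictive draws of Y given X = x

print(f"simulated P(Y=1|X=x): {np.mean(y == 1):.4f}")
print(f"(x + 1)/(n + 2):      {(x + 1) / (n + 2):.4f}")
```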
4 (a) Let θ = (θ1, . . . , θk) denote a vector of probabilities which sum to 1, i.e. θi ≥ 0 for
i = 1, . . . , k and Σ_{i=1}^{k} θi = 1. We say that N = (N1, . . . , Nk) ∼ Multinomial(n+, θ)
when

P(N = n|θ) = ((n+)! / ∏_{i=1}^{k} ni!) ∏_{i=1}^{k} θi^{ni},

where n = (n1, . . . , nk), the ni are non-negative integers and n+ = Σ_{i=1}^{k} ni. We say that
a probability vector θ ∼ Dir(α) if θ has the probability density function

D(α)^{−1} ∏_{i=1}^{k} θi^{αi − 1}

on the set θi > 0 for all i = 1, . . . , k and Σ_{i=1}^{k} θi = 1, where α = (α1, . . . , αk) is a vector
of prior parameters with αi > 0 for i = 1, . . . , k and

D(α) = ∏_{i=1}^{k} Γ(αi) / Γ(α+),

where α+ = Σ_{i=1}^{k} αi and Γ is the gamma function. Recall that for all α > 0,
Γ(α + 1) = αΓ(α).
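A brief numerical check of the two displayed formulas (an aside; k, the counts and the parameters below are arbitrary choices): the factorial expression for P(N = n|θ) agrees with scipy's multinomial pmf, and D(α) evaluates stably via log-gamma:

```python
import numpy as np
from scipy.special import factorial, gammaln
from scipy.stats import multinomial

theta = np.array([0.2, 0.5, 0.3])  # arbitrary probability vector summing to 1
n = np.array([1, 3, 2])            # arbitrary non-negative counts
n_plus = n.sum()

# Displayed formula: (n+)! / prod(ni!) * prod(theta_i ** ni)
manual = factorial(n_plus) / np.prod(factorial(n)) * np.prod(theta ** n)
print(np.isclose(manual, multinomial.pmf(n, n_plus, theta)))  # True

# Dirichlet normaliser: D(alpha) = prod Gamma(alpha_i) / Gamma(alpha_+)
alpha = np.array([2.0, 1.0, 0.5])  # arbitrary parameters with alpha_i > 0
D = np.exp(gammaln(alpha).sum() - gammaln(alpha.sum()))
print(f"D(alpha) = {D:.6f}")
```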
5 (a) Let pX (x|θ) denote a probability model, let π(θ) denote a prior for θ and let L(d, θ) denote
a loss function.
(i) What is a decision rule δ?
(ii) Define the risk R(δ, θ) of a decision rule δ with respect to the loss function L(d, θ).
(b) Now suppose that an observation X has probability density function

pX(x|θ) = (θ^3 / 2) x^2 e^{−θx} (x > 0, θ > 0) (5.1)

and consider the improper prior π(θ) ∝ θ^{−1}.
(i) Find the posterior density π(θ|x).
(ii) Given that the loss function L(d, θ) = (d − θ)2 is used, find the corresponding Bayes
rule. [8]
(c) Continuing with model pX (x|θ) in (5.1), consider the decision rule δc (x) = c/x, where
c > 0 is a constant.
(i) Show that the risk R(δc, θ) of δc is given by

R(δc, θ) = (θ^2 / 2)(c^2 − 2c + 2).
[Hint: If X ∼ Gamma(α, β) and α > 2 then E[X^{−1}] = β/(α − 1) and
Var(X^{−1}) = β^2 / {(α − 1)^2 (α − 2)}.]
(ii) What choice of c minimises the risk?
(iii) What choice of c corresponds to the Bayes rule found in part (b)?
(iv) What is the main conclusion? How does your conclusion relate to what you know
about the admissibility or otherwise of Bayes rules? [11]
[Note: Recall that the Gamma(α, β) distribution has probability density function
(β^α / Γ(α)) x^{α−1} e^{−βx} (x > 0, α > 0, β > 0),
where Γ(α) is the gamma function that satisfies Γ(α + 1) = αΓ(α) for all α > 0.]
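A simulation cross-check of the risk expression in (c)(i) (an aside; θ and the values of c below are arbitrary choices): in the shape–rate parametrisation of the note above, (5.1) says X ∼ Gamma(3, θ), so the risk of δc can be estimated directly:

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 1.5     # arbitrary true parameter
reps = 500_000

# Under (5.1), X ~ Gamma(shape=3, rate=theta); numpy uses scale = 1/rate
x = rng.gamma(shape=3.0, scale=1.0 / theta, size=reps)

for c in (1.0, 2.0, 3.0):
    mc = np.mean((c / x - theta) ** 2)             # Monte Carlo risk of delta_c
    exact = theta ** 2 / 2 * (c ** 2 - 2 * c + 2)  # stated closed form
    print(f"c={c}: simulated {mc:.4f} vs formula {exact:.4f}")
```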
End