POINT ESTIMATION
2.1 Introduction
Statistical inference is the process by which information from sample data is
used to draw conclusions about the population from which the sample was selected.
The techniques of statistical inference can be divided into two major areas:
parameter estimation and hypothesis testing. There are two types of
parameter estimation, namely, point estimation and interval estimation. This chapter
treats point estimation.
Consider a random sample of size n from a population with pdf (or pmf) f(x; θ). The term random sample may refer either to the set of i.i.d. (independent, identically distributed) random variables X1, X2, ..., Xn, or to the observed data x1, x2, ..., xn, drawn from this distribution.
Example 2.1
Let X_1, X_2, \ldots, X_n be a random sample from a normal distribution, N(μ, σ²). Then the sample mean
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
is an example of a statistic. The sample variance
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2
provides another example of a statistic.
Example 2.2
Let X be uniformly distributed on the interval (α, 1). Given a random sample of size
n, use the method of moments to obtain a formula for estimating the parameter α.
Solution
To find an estimator of α by the method of moments, we note that the first population moment about zero is
\mu_1' = E(X) = \int_{\alpha}^{1} x \, f(x; \alpha) \, dx = \int_{\alpha}^{1} x \cdot \frac{1}{1-\alpha} \, dx = \frac{1}{1-\alpha} \cdot \frac{x^2}{2} \Big|_{\alpha}^{1} = \frac{1+\alpha}{2}
The first sample moment is just
m_1 = \bar{X}
Therefore, setting \hat{\mu}_1' = m_1,
\frac{1 + \hat{\alpha}}{2} = \bar{X} \qquad \Longrightarrow \qquad \hat{\alpha} = 2\bar{X} - 1
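The following is a small numerical sketch of this estimator, assuming the NumPy library is available; the true value α = 0.3 and the sample size are chosen only for illustration.

import numpy as np

rng = np.random.default_rng(0)
alpha_true = 0.3                      # illustrative value, not from the text
n = 1000

# draw a random sample from Uniform(alpha, 1)
x = rng.uniform(alpha_true, 1.0, size=n)

# method-of-moments estimate: alpha_hat = 2 * (sample mean) - 1
alpha_hat = 2.0 * x.mean() - 1.0
print(alpha_hat)                      # should be close to 0.3 for large n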
Example 2.3
Given a random sample of size n from a Poisson population, use the method of moments to obtain a formula for estimating the parameter λ.
Solution
The p.m.f. of the Poisson distribution with parameter λ is given by
f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots
The first population moment about zero is
\mu_1' = E(X) = \lambda
The first sample moment is
m_1 = \bar{X}
From (2.2), we obtain
m_1 = \hat{\mu}_1' \qquad \Longrightarrow \qquad \bar{X} = \hat{\lambda}
which has the solution
\hat{\lambda} = \bar{X}
Example 2.4
Given a random sample of size n from a N(μ, σ²) population, use the method of moments to obtain formulas for estimating the parameters μ and σ².
Solution
The normal p.d.f. with parameters μ and σ² is given by
f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty
The first two population moments about zero are
\mu_1' = E(X) = \mu, \qquad \mu_2' = E(X^2) = \sigma^2 + \mu^2
The first two sample moments are
m_1 = \bar{X} \qquad \text{and} \qquad m_2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2
Equating the population moments to the sample moments gives \hat{\mu} = \bar{X} and \hat{\sigma}^2 + \hat{\mu}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2, which have the solution
\hat{\mu} = \bar{X} \qquad \text{and} \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2
For a random sample x_1, x_2, \ldots, x_n from f(x; θ), the likelihood function is
L(\theta) = f(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)
Note that the likelihood function is now a function of the unknown parameter θ only. The maximum likelihood estimator (MLE) of θ is the value of θ that maximizes the likelihood function L(θ). Essentially, the maximum likelihood estimator θ̂ is the value of θ that maximizes the probability of occurrence of the sample results, i.e.
f(x_1, x_2, \ldots, x_n; \hat{\theta}) = \max_{\theta} f(x_1, x_2, \ldots, x_n; \theta)
If L(θ) is differentiable, then the MLE will be a solution of the equation (ML equation)
\frac{d}{d\theta} L(\theta) \Big|_{\theta = \hat{\theta}} = 0 \qquad (2.4)
If one or more solutions of (2.4) exist, it should be verified which ones, if any, maximize L(θ). Note also that any value of θ that maximizes L(θ) will also maximize the log-likelihood, ln L(θ), so for computational convenience the alternate form of (2.4),
\frac{d}{d\theta} \ln\{L(\theta)\} \Big|_{\theta = \hat{\theta}} = 0 \qquad (2.5)
may be used.
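To illustrate (2.4) and (2.5) numerically, the sketch below (assuming NumPy; the Poisson sample and the grid of candidate values are only illustrative) locates the maximizer of both L(λ) and ln L(λ) for a Poisson sample by a simple grid search. Both searches return the same value, which here is essentially the sample mean (for the Poisson case the MLE can be shown to equal X̄, the same as the method-of-moments estimate of Example 2.3).

import numpy as np
from math import lgamma

rng = np.random.default_rng(1)
x = rng.poisson(lam=2.5, size=200)       # illustrative Poisson sample

lam_grid = np.linspace(0.1, 6.0, 2000)   # candidate parameter values

# log-likelihood of the whole sample at each candidate lambda:
# ln L(lambda) = -n*lambda + (sum x_i) ln(lambda) - sum ln(x_i!)
const = sum(lgamma(k + 1) for k in x)
loglik = -len(x) * lam_grid + x.sum() * np.log(lam_grid) - const

# the likelihood itself (rescaled to avoid underflow) has the same maximizer
lik = np.exp(loglik - loglik.max())

print(lam_grid[np.argmax(loglik)])       # maximizer of ln L
print(lam_grid[np.argmax(lik)])          # same maximizer of L
print(x.mean())                          # the sample mean, for comparison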
Example 2.5
If x_1, x_2, ..., x_n are the values of a random sample from a Bernoulli population, find the MLE of its parameter θ.
Solution
The p.m.f. of the Bernoulli distribution (Binomial with n = 1) is
f(x; \theta) = \theta^{x} (1-\theta)^{1-x}, \qquad x = 0, 1
The likelihood function is
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i} = \theta^{\sum x_i} (1-\theta)^{n - \sum x_i}
and the log-likelihood function is
L^*(\theta) = \ln L(\theta) = \left( \sum x_i \right) \ln(\theta) + \left( n - \sum x_i \right) \ln(1-\theta)
The ML equation is
\frac{dL^*(\theta)}{d\theta} \Big|_{\theta = \hat{\theta}} = \frac{\sum x_i}{\hat{\theta}} - \frac{n - \sum x_i}{1 - \hat{\theta}} = 0
which has the solution \hat{\theta} = \bar{X}.
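A quick numerical check of this result, assuming NumPy and SciPy are available (the sample below is illustrative): maximizing the log-likelihood numerically returns essentially the sample proportion of ones.

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.binomial(1, 0.35, size=500)      # illustrative Bernoulli(0.35) sample

def neg_loglik(theta):
    # negative of L*(theta) = (sum x_i) ln(theta) + (n - sum x_i) ln(1 - theta)
    return -(x.sum() * np.log(theta) + (len(x) - x.sum()) * np.log(1 - theta))

res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method='bounded')
print(res.x)       # numerical maximizer of the log-likelihood
print(x.mean())    # closed-form MLE, theta_hat = x-bar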
Example 2.6
Let X_1, X_2, \ldots, X_n be a random sample from an exponential population. Find the MLE for the parameter θ.
Solution
The pdf of the exponential distribution with parameter θ is
f(x; \theta) = \frac{1}{\theta}\, e^{-x/\theta}, \qquad \theta > 0, \; x > 0
The likelihood function is
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \frac{1}{\theta}\, e^{-x_i/\theta} = \frac{1}{\theta^{n}}\, e^{-\sum x_i/\theta}
and the log-likelihood function is
L^*(\theta) = \ln L(\theta) = -n \ln(\theta) - \frac{\sum x_i}{\theta}
The ML equation is
\frac{dL^*(\theta)}{d\theta} \Big|_{\theta = \hat{\theta}} = -\frac{n}{\hat{\theta}} + \frac{\sum x_i}{\hat{\theta}^2} = 0
which has the solution
\hat{\theta} = \bar{X}
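As noted after (2.4), a solution of the ML equation should be checked to be a maximum. One way to do this for the exponential case is to examine the second derivative of L*; a sketch assuming the SymPy library (the symbol s stands for Σxᵢ):

import sympy as sp

theta, n, s = sp.symbols('theta n s', positive=True)   # s stands for sum of x_i

# log-likelihood of the exponential sample: L*(theta) = -n ln(theta) - s/theta
logL = -n * sp.log(theta) - s / theta

# solve the ML equation dL*/dtheta = 0
theta_hat = sp.solve(sp.diff(logL, theta), theta)[0]
print(theta_hat)                                        # s/n, i.e. the sample mean

# second derivative evaluated at the solution
print(sp.simplify(sp.diff(logL, theta, 2).subs(theta, theta_hat)))

The second derivative at θ̂ = Σxᵢ/n simplifies to −n³/(Σxᵢ)², which is negative, so the solution is indeed a maximum.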
Generalization
The method of maximum likelihood can be used in situations where there are several unknown parameters, say θ_1, θ_2, \ldots, θ_k, to estimate. In such cases, the likelihood function is a function of the k unknown parameters θ_1, θ_2, \ldots, θ_k, and the MLE's \{\hat{\theta}_i\} would be found by equating the k first partial derivatives of the likelihood function, or of its logarithm, to zero. That is, the MLE's \hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_k of the parameters θ_1, θ_2, \ldots, θ_k are the solutions of the equations
\frac{\partial}{\partial \theta_1} L(\theta_1, \theta_2, \ldots, \theta_k) = 0
\frac{\partial}{\partial \theta_2} L(\theta_1, \theta_2, \ldots, \theta_k) = 0
\qquad \vdots
\frac{\partial}{\partial \theta_k} L(\theta_1, \theta_2, \ldots, \theta_k) = 0
In this case it may also be easier to work with the logarithm of the likelihood, L*.
Example 2.7
If X_1, X_2, \ldots, X_n constitute a random sample from a normal population with mean μ and variance σ², find the joint maximum likelihood estimates of these two parameters.
Solution
The pdf of the normal distribution with parameters μ and σ² is
f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
Hence, the likelihood function is
L(\mu, \sigma^2) = \prod_{i=1}^{n} f(x_i; \mu, \sigma^2) = \left( \frac{1}{2\pi\sigma^2} \right)^{n/2} e^{-\frac{\sum (x_i - \mu)^2}{2\sigma^2}}
and the log-likelihood function is
L^*(\mu, \sigma^2) = \ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{\sum (x_i - \mu)^2}{2\sigma^2}
The ML equations are
\frac{\partial L^*(\mu, \sigma^2)}{\partial \mu} = 0 \;\Longrightarrow\; \frac{\sum (x_i - \hat{\mu})}{\hat{\sigma}^2} = 0 \;\Longrightarrow\; \hat{\mu} = \bar{X}
\frac{\partial L^*(\mu, \sigma^2)}{\partial \sigma^2} = 0 \;\Longrightarrow\; -\frac{n}{2\hat{\sigma}^2} + \frac{\sum (x_i - \hat{\mu})^2}{2\hat{\sigma}^4} = 0 \;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum (x_i - \hat{\mu})^2
which have the solution
\hat{\mu} = \bar{X} \qquad \text{and} \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = S_n^2
It should be observed that we did not show that \hat{\sigma} is a maximum likelihood estimate of σ, only that \hat{\sigma}^2 is an MLE of σ². However, it can be shown that maximum likelihood estimators have the invariance property; that is, if \hat{\theta} is the MLE of θ and the function given by g(θ) is continuous, then g(\hat{\theta}) is also the MLE of g(θ).
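A quick numerical check of these formulas, assuming NumPy (the parameter values below are arbitrary): the MLEs coincide with np.mean and np.var with ddof=0, i.e. the divisor n rather than n − 1.

import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=400)   # illustrative N(5, 4) sample

mu_hat = x.mean()                              # MLE of mu
sigma2_hat = ((x - mu_hat) ** 2).mean()        # MLE of sigma^2 (divisor n)

print(mu_hat, sigma2_hat)
print(x.mean(), x.var(ddof=0))                 # the same values via NumPy built-ins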
2.3 Properties of Estimators
There may be several different potential point estimators for a parameter. For
example, if we wish to estimate the mean of a random variable, we might consider
the sample mean, the sample median, or perhaps the average of the smallest and
largest observations in the sample as point estimators. In order to decide which point
estimator of a particular parameter is the best one to use, we need to examine their
statistical properties and develop some criteria for comparing estimators. There are
several properties of estimators that would appear to be desirable, such as
unbiasedness, minimum variance, and sufficiency.
A desirable property of an estimator is that it should be “close” in some sense to the true value of the unknown parameter. A useful measure of goodness, or closeness, of an estimator \hat{\theta} = t(X_1, X_2, \ldots, X_n) of θ is what is called the mean-squared error of the estimator,
MSE(\hat{\theta}) = E\left[ (\hat{\theta} - \theta)^2 \right] \qquad (2.6)
If \hat{\theta}_1 and \hat{\theta}_2 are two estimators of θ, with mean square errors MSE(\hat{\theta}_1) and MSE(\hat{\theta}_2), then we prefer the estimator with the smaller MSE; that is, we say that \hat{\theta}_1 is better than \hat{\theta}_2 if MSE(\hat{\theta}_1) < MSE(\hat{\theta}_2).
a- Unbiased Estimators
An estimator \hat{\theta} is said to be an unbiased estimator of the parameter θ if
E(\hat{\theta}) = \theta \qquad (2.7)
That is, \hat{\theta} is an unbiased estimator of θ if “on the average” its values are equal to θ. Note that this is equivalent to requiring that the mean of the sampling distribution of \hat{\theta} is equal to θ.
The quantity
b(\hat{\theta}) = E(\hat{\theta}) - \theta
is called the bias of the estimator \hat{\theta}.
Theorem 2.3
The mean squared error can be written as follows:
MSE(\hat{\theta}) = \mathrm{var}(\hat{\theta}) + (\text{bias})^2 \qquad (2.8)
Proof
MSE(\hat{\theta}) = E\left[ (\hat{\theta} - \theta)^2 \right] = E\left[ \left( \hat{\theta} - E(\hat{\theta}) \right) + \left( E(\hat{\theta}) - \theta \right) \right]^2
= E\left[ \left( \hat{\theta} - E(\hat{\theta}) \right)^2 \right] + \left( E(\hat{\theta}) - \theta \right)^2 + 2\left( E(\hat{\theta}) - \theta \right) E\left[ \hat{\theta} - E(\hat{\theta}) \right]
= \mathrm{var}(\hat{\theta}) + [b(\hat{\theta})]^2
since the third term is
2\left( E(\hat{\theta}) - \theta \right) E\left[ \hat{\theta} - E(\hat{\theta}) \right] = 0
That is, the mean square error of \hat{\theta} is equal to the variance of the estimator plus the squared bias. If \hat{\theta} is an unbiased estimator of θ, then the mean square error of \hat{\theta} is equal to the variance of \hat{\theta}.
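Theorem 2.3 can also be checked by simulation. The sketch below (assuming NumPy; the normal population and the biased estimator Sₙ² of σ² are used only as an illustration) estimates the MSE directly and via the variance-plus-squared-bias decomposition; the two numbers nearly coincide.

import numpy as np

rng = np.random.default_rng(4)
mu, sigma2, n, reps = 0.0, 4.0, 10, 20000             # illustrative setup

# repeatedly compute S_n^2 = (1/n) sum (X_i - X-bar)^2 over many samples
samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
s2n = samples.var(axis=1, ddof=0)

mse_direct = np.mean((s2n - sigma2) ** 2)             # E[(theta_hat - theta)^2]
decomposed = s2n.var() + (s2n.mean() - sigma2) ** 2   # var(theta_hat) + bias^2

print(mse_direct, decomposed)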
Example 2.8
If X has the binomial distribution with the parameters n and p, show that the sample proportion, \hat{p} = X/n, is an unbiased estimator of p.
Solution
Since E(X) = np, then
E(\hat{p}) = E\!\left( \frac{X}{n} \right) = \frac{np}{n} = p
Example 2.9
Suppose that X is a r.v. with mean μ and variance σ². Let X_1, X_2, \ldots, X_n be a random sample from X. Show that the sample mean \bar{X} and the sample variance S² are unbiased estimators of μ and σ², respectively.
Solution
Since
E(\bar{X}) = E\!\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n\mu = \mu
therefore \bar{X} is an unbiased estimator for μ.
Now, by the properties of the χ² distribution discussed in Chapter (1),
E(S^2) = E\!\left[ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right] = \sigma^2
so S² is an unbiased estimator for σ².
Note:
Since X_1, X_2, \ldots, X_n are independent (random sample), then
\mathrm{var}(\bar{X}) = \mathrm{var}\!\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(X_i) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n}
The MLE of σ², namely
S_n^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2,
is a biased estimator of σ², since E(S_n^2) = \frac{n-1}{n}\, \sigma^2 \neq \sigma^2.
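A small simulation, assuming NumPy (σ², n and the number of repetitions are illustrative), makes the contrast visible: the average of S² is close to σ², while the average of Sₙ² is close to (n − 1)σ²/n.

import numpy as np

rng = np.random.default_rng(5)
sigma2, n, reps = 9.0, 8, 50000               # illustrative values

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2  = samples.var(axis=1, ddof=1)             # unbiased S^2   (divisor n - 1)
s2n = samples.var(axis=1, ddof=0)             # MLE S_n^2      (divisor n)

print(s2.mean())    # close to sigma^2 = 9
print(s2n.mean())   # close to (n - 1)/n * sigma^2 = 7.875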
Definition 2.4
Let X_1, X_2, \ldots, X_n be a random sample of size n from f(x; θ). An estimator \hat{\theta} of θ is called a minimum variance unbiased estimator (MVUE) of θ if
1- \hat{\theta} is unbiased for θ, and
2- for any other unbiased estimator \tilde{\theta} of θ, \mathrm{var}(\tilde{\theta}) \ge \mathrm{var}(\hat{\theta}) for all possible values of θ.
In some cases, lower bounds can be derived for the variance of unbiased estimators. If
an unbiased estimator can be found that attains such a lower bound, then it follows that
the estimator is a MVUE.
C- Efficiency
The mean square error is an important criterion for comparing two estimators. Let \hat{\theta}_1 and \hat{\theta}_2 be two estimators of the parameter θ, and let MSE(\hat{\theta}_1) and MSE(\hat{\theta}_2) be the mean square errors of \hat{\theta}_1 and \hat{\theta}_2. Then, the relative efficiency of \hat{\theta}_2 to \hat{\theta}_1 is defined as
\mathrm{eff}(\hat{\theta}_2 / \hat{\theta}_1) = \frac{MSE(\hat{\theta}_1)}{MSE(\hat{\theta}_2)} \qquad (2.9)
If this relative efficiency is less than one, we would conclude that \hat{\theta}_1 is a more efficient estimator of θ than \hat{\theta}_2, in the sense that it has the smaller mean square error.
For example, suppose that we wish to estimate the mean μ of a population. We have a random sample of n observations X_1, X_2, \ldots, X_n, and we wish to compare two possible estimators of μ: the sample mean \bar{X} and a single observation from the sample, say X_1. Note that both \bar{X} and X_1 are unbiased estimators of μ; consequently, the mean square error of both estimators is simply the variance. For the sample mean, we have
MSE(\bar{X}) = \mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}
where σ² is the population variance; for an individual observation, we have
MSE(X_1) = \mathrm{var}(X_1) = \sigma^2
Therefore, the relative efficiency of X_1 to \bar{X} is
\mathrm{eff}(X_1 / \bar{X}) = \frac{MSE(\bar{X})}{MSE(X_1)} = \frac{1}{n}
Since (1/n) < 1 for sample sizes n ≥ 2, we would conclude that the sample mean \bar{X} is a more efficient estimator of μ than a single observation X_1.
Note: The MVUE is sometimes called the most efficient estimator.
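The relative efficiency (2.9) of X₁ to X̄ can likewise be approximated by simulation; a sketch assuming NumPy (the population and sample size are illustrative):

import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 10.0, 3.0, 5, 50000                # illustrative values

samples = rng.normal(mu, sigma, size=(reps, n))
mse_xbar = np.mean((samples.mean(axis=1) - mu) ** 2)    # MSE of the sample mean
mse_x1   = np.mean((samples[:, 0] - mu) ** 2)           # MSE of a single observation

print(mse_xbar / mse_x1)   # close to the theoretical relative efficiency 1/n = 0.2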
D- Consistency
The estimator \hat{\theta} is called a consistent estimator of the parameter θ if and only if
MSE(\hat{\theta}) = E\left[ (\hat{\theta} - \theta)^2 \right] \to 0 \quad \text{as } n \to \infty
That is, \hat{\theta} is unbiased (or E[\hat{\theta}] \to \theta as n \to \infty) and \mathrm{var}(\hat{\theta}) \to 0 as n \to \infty.
Note that consistency is an asymptotic property, that is, a limiting property of an estimator. Informally, consistency means that when n is sufficiently large, we can be practically certain that the error made with a consistent estimator will be as small as we please.
Based on the previous examples, since \hat{p} = X/n is an unbiased estimator for p and
\mathrm{var}(\hat{p}) = \mathrm{var}\!\left( \frac{X}{n} \right) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \to 0 \quad \text{as } n \to \infty
then \hat{p} = X/n is a consistent estimator for p.
Similarly, \bar{X} is a consistent estimator of μ, since \bar{X} is unbiased and
\mathrm{var}(\bar{X}) = \frac{\sigma^2}{n} \to 0 \quad \text{as } n \to \infty
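A short simulation sketch of the consistency of p̂, assuming NumPy (p = 0.4 is illustrative): as n grows, the simulated values of p̂ stay centred at p while their variance shrinks roughly like p(1 − p)/n.

import numpy as np

rng = np.random.default_rng(7)
p, reps = 0.4, 20000                            # illustrative values

for n in (10, 100, 1000, 10000):
    x = rng.binomial(n, p, size=reps)           # X ~ Binomial(n, p), repeated
    p_hat = x / n
    print(n, p_hat.mean(), p_hat.var())         # mean stays near p, variance shrinks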
<+><+><+><+><+><+><+><+><+>
EXERCISES
[1] Find method of moments estimators (MME's) of θ based on a random sample X_1, \ldots, X_n from each of the following pdf's:
a- f(x; \theta) = \theta x^{\theta - 1}; \; 0 < x < 1, zero otherwise; θ > 0.
b- f(x; \theta) = (\theta + 1) x^{-\theta - 2}; \; 1 < x, zero otherwise; θ > 0.
c- f(x; \theta) = \theta^2 x e^{-\theta x}; \; 0 < x, zero otherwise; θ > 0.
[2] Find maximum likelihood estimators (MLE's) for θ based on a random sample of size n for each of the pdf's in problem [1].
[3] Find the MLE for θ based on a random sample of size n from a distribution with pdf
f(x; \theta) = 2\theta^2 x^{-3}, \quad \theta \le x; \qquad 0, \; x < \theta; \qquad \theta > 0
[6] If X_1, X_2, \ldots, X_n constitute a random sample from a population given by the p.d.f.
f(x; \theta) = \frac{1}{\theta^2}\, x\, e^{-x/\theta}, \quad x > 0, \; \theta > 0; \qquad 0 \text{ otherwise}
a- Find the maximum likelihood estimator \hat{\theta} for the parameter θ.
b- Show that the method of moments gives the same estimator \hat{\theta} for θ.
c- Prove that \hat{\theta} is an unbiased and consistent estimator for θ.
(Hint: \int_0^{\infty} x^{m} e^{-x/\theta}\, dx = m!\, \theta^{m+1} for any +ve integer m)
f(x; \theta) = \frac{1}{2\theta^3}\, x^2\, e^{-x/\theta}, \quad x > 0; \qquad 0 \text{ o.w.}
a- Find the maximum likelihood estimator \hat{\theta} for the parameter θ.
b- Show that the method of moments gives the same estimator \hat{\theta} for θ.
c- Prove that \hat{\theta} is the minimum variance unbiased estimator for θ.
[9] Let X_1, \ldots, X_n be a random sample from EXP(θ), and define \hat{\theta}_1 = \bar{X} and \hat{\theta}_2 = n\bar{X}/(n+1).
(a) Find the variances of \hat{\theta}_1 and \hat{\theta}_2.
(b) Find the MSE's of \hat{\theta}_1 and \hat{\theta}_2.
(c) Compare the variances of \hat{\theta}_1 and \hat{\theta}_2 for n = 2.
(d) Compare the MSE's of \hat{\theta}_1 and \hat{\theta}_2 for n = 2.
[10] Let X_1, X_2 and X_3 be a random sample from a population having mean μ and variance σ². Consider the following estimators:
\hat{\mu}_1 = \frac{2X_1 - X_2 + X_3}{2} \qquad \& \qquad \hat{\mu}_2 = \frac{3X_1 + 2X_2 - X_3}{4}
Compare these two estimators. Which do you prefer? Why?
[11] Suppose that \hat{\theta}_1 and \hat{\theta}_2 are estimators of the parameter θ. We know that E(\hat{\theta}_1) = \theta, \mathrm{var}(\hat{\theta}_1) = 10 and E(\hat{\theta}_2) = \theta/2, \mathrm{var}(\hat{\theta}_2) = 4. Which estimator is "best"? In what sense is it best?
[12] Let \hat{\theta}_1 and \hat{\theta}_2 be two estimators of θ. The estimator \hat{\theta}_2 is said to be more efficient than \hat{\theta}_1 if
a. \mathrm{var}(\hat{\theta}_1) > \mathrm{var}(\hat{\theta}_2)   b. MSE(\hat{\theta}_1) > MSE(\hat{\theta}_2)
c. E(\hat{\theta}_1) > E(\hat{\theta}_2)   d. None of the above
[13] Let \hat{\theta}_1 and \hat{\theta}_2 be two unbiased estimators of θ. The estimator \hat{\theta}_1 is said to be more efficient than \hat{\theta}_2 if
a. E(\hat{\theta}_1^2) > E(\hat{\theta}_2^2)   b. E(\hat{\theta}_1^2) < E(\hat{\theta}_2^2)   c. E(\hat{\theta}_1) > E(\hat{\theta}_2)   d. E(\hat{\theta}_1) < E(\hat{\theta}_2)
[14] Suppose that \hat{\theta}_1, \hat{\theta}_2 and \hat{\theta}_3 are three estimators of the parameter θ. We know that
E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta, \quad E(\hat{\theta}_3) \neq \theta, \quad \mathrm{var}(\hat{\theta}_1) = 12, \quad \mathrm{var}(\hat{\theta}_2) = 10 \quad \text{and} \quad E(\hat{\theta}_3 - \theta)^2 = 6.
Then the most efficient estimator among them is:
a. \hat{\theta}_1   b. \hat{\theta}_2   c. \hat{\theta}_3   d. None of the above.
[15] Let X be a random variable with mean μ and variance σ². Given two independent random samples of sizes 30 and 50, with sample means \bar{X}_1 and \bar{X}_2, respectively, show that
\bar{X} = \alpha \bar{X}_1 + (1 - \alpha) \bar{X}_2
is an unbiased estimator of μ. Find the value of α that minimizes \mathrm{var}(\bar{X}). Let
\hat{\mu}_1 = \frac{\bar{X}_1 + \bar{X}_2}{2}
be another estimator for μ. Compare these two estimators. Which do you prefer? Why?
<+><+><+><+><+><+><+><+><+>