
Chapter 2

POINT ESTIMATION

2.1 Introduction
Statistical inference is the process by which information from sample data is
used to draw conclusions about the population from which the sample was selected.
The techniques of statistical inference can be divided into two major areas:
parameter estimation and hypothesis testing. There are two types of
parameter estimation, namely, point estimation and interval estimation. This chapter
treats point estimation.
Consider a random sample of size n from a population with pdf (or pmf) f(x ; θ).
The term random sample may refer either to the set of i.i.d. (independent identically
distributed) random variables, X1, X2,..., Xn , or to the observed data x1, x2,..., xn.

Definition 2.1: Statistic


A function of the random sample, T = t(X_1, X_2, . . . , X_n), that does not depend on any unknown parameters is called a statistic.


A statistic is also a random variable, the distribution of which depends on the distribution of the random sample and on the form of the function t(X_1, X_2, . . . , X_n). The distribution of a statistic is referred to as a sampling distribution.

Example 2.1
Let X_1, X_2, . . . , X_n be a random sample from a normal distribution, N(μ, σ²). Then the sample mean
X̄ = (1/n) Σ_{i=1}^{n} X_i
is an example of a statistic. The sample variance
S² = 1/(n−1) Σ_{i=1}^{n} (X_i − X̄)²
provides another example of a statistic.

A point estimate of a population parameter is a single numerical value of a statistic that corresponds to that parameter. That is, the point estimate is a unique selection for the value of an unknown parameter. More precisely, if X is a r.v. with probability distribution f(x; θ), characterized by the unknown parameter θ, and if X_1, X_2, . . . , X_n is a random sample of size n from X, then the statistic θ̂ = t(X_1, X_2, . . . , X_n) corresponding to θ is called the estimator of θ.


Definition 2.2
A statistic T = t(X_1, X_2, . . . , X_n) that is used to estimate the unknown parameter θ is called an estimator of θ, and an observed value of the statistic, t = t(x_1, x_2, . . . , x_n), is called an estimate of θ.
Estimation problems occur frequently in real life. We often need to estimate:
 the mean μ of a single population
 the variance σ² (or standard deviation σ) of a single population
 the proportion p of items in a population that belong to a class of interest
 the difference in means of two populations, μ_1 − μ_2
 the difference in two population proportions, p_1 − p_2
Reasonable point estimates of these parameters are as follows:
 for μ, the estimate is μ̂ = X̄, the sample mean
 for σ², the estimate is σ̂² = S², the sample variance
 for p, the estimate is p̂ = X/n, the sample proportion, where X is the number of items in a random sample of size n that belong to the class of interest
 for μ_1 − μ_2, the estimate is μ̂_1 − μ̂_2 = X̄_1 − X̄_2, the difference between the sample means of two independent random samples
 for p_1 − p_2, the estimate is p̂_1 − p̂_2, the difference between two sample proportions computed from two independent random samples
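These point estimates are straightforward to compute from data. Below is a minimal sketch, assuming Python with numpy is available; the sample values and the count of items of interest are made up purely for illustration.

import numpy as np

# hypothetical measurements, for illustration only
x = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0, 4.7])

mu_hat = x.mean()              # estimate of mu: the sample mean
sigma2_hat = x.var(ddof=1)     # estimate of sigma^2: the sample variance S^2

# for a proportion: suppose 13 of n = 40 inspected items belong to the class of interest
n, count = 40, 13
p_hat = count / n              # estimate of p: the sample proportion

print(mu_hat, sigma2_hat, p_hat)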

2.2 Some Methods of Estimation


In some cases reasonable point estimators can be found on the basis of intuition, but
various general methods have been developed for deriving estimators.

(I) The Method of Moments


Suppose that X is a r.v. with p.d.f. (or p.m.f.) f(x; θ_1, θ_2, ..., θ_r) characterized by r unknown parameters. Let X_1, X_2, . . . , X_n be a random sample of size n from X, and define the first r sample moments about the origin as
m_k = (1/n) Σ_{i=1}^{n} X_i^k,   k = 1, 2, ..., r        (2.1)
The population moments μ_k will, in general, be functions of the r unknown parameters {θ_s}. Equating sample moments and population moments will yield r simultaneous equations in r unknowns {the θ_i's}; that is,
μ_k = m_k,   k = 1, 2, ..., r        (2.2)
The solution to equation (2.2), denoted by θ̂_1, θ̂_2, . . . , θ̂_r, yields the moment estimators of θ_1, θ_2, ..., θ_r.

Example 2.2
Let X be uniformly distributed on the interval (α, 1). Given a random sample of size
n, use the method of moments to obtain a formula for estimating the parameter α.
Solution
To find an estimator of α by the method of moments, we note that the first population moment about zero is
μ_1 = E(X) = ∫_α^1 x f(x; α) dx = ∫_α^1 x · 1/(1−α) dx = [x²/(2(1−α))]_α^1 = (1+α)/2
The first sample moment is just
m_1 = X̄
Therefore,
m_1 = μ_1  ⇒  X̄ = (1+α̂)/2  ⇒  α̂ = 2X̄ − 1
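A quick numerical check of this formula: the following sketch, assuming numpy and a made-up true value α = 0.3, draws a sample from the Uniform(α, 1) population and applies α̂ = 2X̄ − 1.

import numpy as np

rng = np.random.default_rng(0)
alpha_true = 0.3                          # assumed value, for illustration only
x = rng.uniform(alpha_true, 1.0, size=500)

alpha_hat = 2 * x.mean() - 1              # method-of-moments estimate from Example 2.2
print(alpha_hat)                          # should be close to 0.3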

Example 2.3
Given a random sample of size n from a Poisson population, use the method of moments to obtain a formula for estimating the parameter λ.
Solution
The p.m.f. of the Poisson distribution with parameter λ is given by
f(x; λ) = e^{−λ} λ^x / x!,   x = 0, 1, 2, ....
The first population moment about zero is
μ_1 = E(X) = λ
The first sample moment is
m_1 = X̄
From (2.2), we obtain
m_1 = μ_1  ⇒  X̄ = λ̂
which has the solution
λ̂ = X̄

Example 2.4
Given a random sample of size n from a N(μ, σ²) population, use the method of moments to obtain formulas for estimating the parameters μ and σ².
Solution
The normal p.d.f. with parameters μ and σ² is given by
f(x; μ, σ²) = (1/(σ√(2π))) e^{−(1/2)((x−μ)/σ)²},   −∞ < x < ∞
The first two population moments about zero are
μ_1 = E(X) = μ,   μ_2 = E(X²) = μ² + σ²
The first two sample moments are
m_1 = X̄   and   m_2 = (1/n) Σ_{i=1}^{n} X_i²
From (2.2), we obtain
m_1 = μ_1  ⇒  X̄ = μ̂
m_2 = μ_2  ⇒  (1/n) Σ_{i=1}^{n} X_i² = μ̂² + σ̂²
which have the solution
μ̂ = X̄   and   σ̂² = (1/n) Σ_{i=1}^{n} X_i² − X̄² = (1/n) Σ_{i=1}^{n} (X_i − X̄)²
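The two-parameter case works the same way numerically. The sketch below, assuming numpy and made-up values μ = 10 and σ = 2, computes the first two sample moments and solves for μ̂ and σ̂² exactly as in Example 2.4.

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, size=1000)      # assumed normal sample, for illustration only

m1 = x.mean()                             # first sample moment
m2 = np.mean(x**2)                        # second sample moment about zero

mu_hat = m1                               # mu-hat = X-bar
sigma2_hat = m2 - m1**2                   # sigma^2-hat = m2 - X-bar^2
print(mu_hat, sigma2_hat)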

(II) The Method of Maximum Likelihood


One of the best methods of obtaining a point estimator is the method of maximum likelihood. Suppose that X is a r.v. with p.d.f. (or p.m.f.) f(x; θ), where θ is a single unknown parameter. Let X_1, X_2, . . . , X_n be a random sample of size n. Then the likelihood function represents the joint pdf (or pmf) of the random sample, i.e.
L(θ) = f(x_1, x_2, ..., x_n; θ) = ∏_{i=1}^{n} f(x_i; θ) = f(x_1; θ) · · · f(x_n; θ)        (2.3)

Note that the likelihood function is now a function of the unknown parameter θ only. The maximum likelihood estimator (MLE) of θ is the value of θ that maximizes the likelihood function L(θ). Essentially, the maximum likelihood estimator θ̂ is the value of θ that maximizes the probability of occurrence of the sample results, i.e.
f(x_1, x_2, ..., x_n; θ̂) = max_θ f(x_1, x_2, ..., x_n; θ)
If L(θ) is differentiable, then the MLE will be a solution of the equation (ML equation)
(d/dθ) L(θ) |_{θ = θ̂} = 0        (2.4)


If one or more solutions of (2.4) exist, it should be verified which ones, if any, maximize L(θ). Note also that any value of θ that maximizes L(θ) will also maximize the log-likelihood, ln L(θ), so for computational convenience the alternate form of (2.4),
(d/dθ) ln L(θ) |_{θ = θ̂} = 0        (2.5)
will often be used.
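When (2.5) has no convenient closed-form solution, the log-likelihood can be maximized numerically. The sketch below, assuming numpy and scipy and a made-up Poisson sample with λ = 3.5, minimizes the negative log-likelihood over λ; for the Poisson model the numerical maximizer agrees with the sample mean.

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

rng = np.random.default_rng(2)
x = rng.poisson(lam=3.5, size=200)        # assumed sample, for illustration only

def neg_log_likelihood(lam):
    # ln L(lambda) = sum_i [ -lambda + x_i ln(lambda) - ln(x_i!) ]
    return -np.sum(-lam + x * np.log(lam) - gammaln(x + 1))

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 50), method="bounded")
print(res.x, x.mean())                    # the two values should nearly coincide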

Example 2.5
Let x_1, x_2, …, x_n be the values of a random sample from a Bernoulli population. Find the MLE of its parameter θ.
Solution
The p.m.f. of the Bernoulli distribution (Binomial with n = 1) is
f(x; θ) = θ^x (1−θ)^{1−x}   for x = 0, 1
The likelihood function is
L(θ) = ∏_{i=1}^{n} f(x_i; θ) = ∏_{i=1}^{n} θ^{x_i} (1−θ)^{1−x_i} = θ^{Σ x_i} (1−θ)^{n − Σ x_i}
and the log-likelihood function is
L*(θ) = ln L(θ) = (Σ x_i) ln(θ) + (n − Σ x_i) ln(1−θ)
The ML equation is
dL*(θ)/dθ = 0  ⇒  (Σ x_i)/θ̂ − (n − Σ x_i)/(1−θ̂) = 0
which has the solution θ̂ = X̄.

Example 2.6
Let X_1, X_2, . . . , X_n be a random sample from an exponential population. Find the MLE for the parameter θ.
Solution
The pdf of the exponential distribution with parameter θ is
f(x; θ) = (1/θ) e^{−x/θ},   x > 0, θ > 0
The likelihood function is
L(θ) = ∏_{i=1}^{n} f(x_i; θ) = ∏_{i=1}^{n} (1/θ) e^{−x_i/θ} = (1/θ)^n e^{−Σ x_i / θ}
and the log-likelihood function is
L*(θ) = ln L(θ) = −n ln(θ) − (Σ x_i)/θ
The ML equation is
dL*(θ)/dθ = 0  ⇒  −n/θ̂ + (Σ x_i)/θ̂² = 0
which has the solution
θ̂ = X̄
Generalization
The method of maximum likelihood can be used in situations where there are several unknown parameters, say θ_1, θ_2, … , θ_k, to estimate. In such cases, the likelihood function is a function of the k unknown parameters θ_1, θ_2, … , θ_k, and the MLE's {θ̂_i} are found by equating the k first partial derivatives of the likelihood function (or of its logarithm) to zero; that is, the MLE's θ̂_1, θ̂_2, …, θ̂_k of the parameters θ_1, θ_2, … , θ_k are the solutions of the equations
∂L(θ_1, θ_2, ..., θ_k)/∂θ_1 = 0
∂L(θ_1, θ_2, ..., θ_k)/∂θ_2 = 0
⋮
∂L(θ_1, θ_2, ..., θ_k)/∂θ_k = 0
In this case it may also be easier to work with the logarithm of the likelihood L*.
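As a sketch of how such a system can be handled numerically (assuming numpy and scipy, with a made-up normal sample), the negative log-likelihood can be minimized jointly over both parameters; Example 2.7 below derives the same answer in closed form.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.normal(5.0, 3.0, size=400)        # assumed sample (mu = 5, sigma = 3), for illustration

def neg_log_likelihood(params):
    mu, sigma2 = params
    if sigma2 <= 0:
        return np.inf                     # keep the search inside the parameter space
    return 0.5 * len(x) * np.log(2 * np.pi * sigma2) + np.sum((x - mu) ** 2) / (2 * sigma2)

res = minimize(neg_log_likelihood, x0=[0.0, 1.0], method="Nelder-Mead")
mu_hat, sigma2_hat = res.x
print(mu_hat, sigma2_hat)                 # compare with x.mean() and np.mean((x - x.mean())**2)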
Example 2.7
If X_1, X_2, . . . , X_n constitute a random sample from a normal population with the mean μ and the variance σ², find the joint maximum likelihood estimates of these two parameters.
Solution
The pdf of the normal distribution with parameters μ and σ² is
f(x; μ, σ²) = (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)}
Hence, the likelihood function is
L(μ, σ²) = ∏_{i=1}^{n} f(x_i; μ, σ²) = (1/(2π))^{n/2} (1/σ²)^{n/2} e^{−Σ(x_i − μ)²/(2σ²)}
and the log-likelihood function is
L*(μ, σ²) = ln L(μ, σ²) = −(n/2) ln(2π) − (n/2) ln(σ²) − Σ(x_i − μ)²/(2σ²)
The ML equations are
∂L*(μ, σ²)/∂μ = 0  ⇒  Σ(x_i − μ̂)/σ̂² = 0  ⇒  μ̂ = X̄
∂L*(μ, σ²)/∂σ² = 0  ⇒  −n/(2σ̂²) + Σ(x_i − μ̂)²/(2σ̂⁴) = 0  ⇒  σ̂² = (1/n) Σ(x_i − μ̂)²
which have the solution
μ̂ = X̄   and   σ̂² = (1/n) Σ(x_i − x̄)² = S_n²
n
It should be observed that we did not show that σ̂ is a maximum likelihood estimate of σ, only that σ̂² is a MLE of σ². However, it can be shown that maximum likelihood estimators have an invariance property: if θ̂ is the MLE of θ and g(θ) is a continuous function, then g(θ̂) is also the MLE of g(θ).

Invariance Property of MLE's

Let θ̂ = t(X_1, X_2, . . . , X_n) be the MLE of θ, where θ is assumed one-dimensional, and let τ(θ) be a function with a single-valued inverse (one-to-one); then the MLE of τ(θ) is τ(θ̂).
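For instance (a small sketch assuming numpy and a made-up normal sample): once σ̂² has been computed, the invariance property gives the MLE of σ simply as its square root.

import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(5.0, 3.0, size=400)        # assumed sample, for illustration only

sigma2_mle = np.mean((x - x.mean()) ** 2) # MLE of sigma^2 (divides by n)
sigma_mle = np.sqrt(sigma2_mle)           # by invariance, the MLE of sigma
print(sigma2_mle, sigma_mle)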

2.3 Properties of Estimators
There may be several different potential point estimators for a parameter. For
example, if we wish to estimate the mean of a random variable, we might consider
the sample mean, the sample median, or perhaps the average of the smallest and
largest observations in the sample as point estimators. In order to decide which point
estimator of a particular parameter is the best one to use, we need to examine their
statistical properties and develop some criteria for comparing estimators. There are
several properties of estimators that would appear to be desirable, such as
unbiasedness, minimum variance, and sufficiency.
A desirable property of an estimator is that it should be "close" in some sense to the true value of the unknown parameter. A useful measure of the goodness or closeness of an estimator θ̂ = t(X_1, X_2, . . . , X_n) of θ is the mean-squared error of the estimator.

Definition 2.3 Mean-Squared Error


If θ̂ = t(X_1, X_2, . . . , X_n) is an estimator of θ, then the mean squared error (MSE) of θ̂ is given by
MSE(θ̂) = E(θ̂ − θ)²        (2.6)
Thus, if θ̂_1 and θ̂_2 are two different estimators of the same parameter θ, with mean square errors MSE(θ̂_1) and MSE(θ̂_2), then we prefer the estimator with the smaller MSE; i.e., we say that θ̂_1 is better than θ̂_2 if
MSE(θ̂_1) < MSE(θ̂_2)

a- Unbiased Estimators
An estimator θ̂ is said to be an unbiased estimator of the parameter θ if
E(θ̂) = θ        (2.7)
That is, θ̂ is an unbiased estimator of θ if "on the average" its values are equal to θ. Note that this is equivalent to requiring that the mean of the sampling distribution of θ̂ be equal to θ.
The quantity
b(θ̂) = E(θ̂) − θ
is called the bias of the estimator θ̂.

Theorem 2.3
The mean squared error can be written as follows:
MSE(θ̂) = var(θ̂) + (bias)²        (2.8)
Proof
MSE(θ̂) = E(θ̂ − θ)² = E[(θ̂ − E(θ̂)) + (E(θ̂) − θ)]²
       = E[θ̂ − E(θ̂)]² + [E(θ̂) − θ]² + 2[E(θ̂) − θ] E[θ̂ − E(θ̂)]
       = E[θ̂ − E(θ̂)]² + [E(θ̂) − θ]²
       = var(θ̂) + [b(θ̂)]²
since the third (cross) term is
2[E(θ̂) − θ] E[θ̂ − E(θ̂)] = 0
That is, the mean square error of θ̂ is equal to the variance of the estimator plus the squared bias. If θ̂ is an unbiased estimator of θ, then the mean square error of θ̂ is equal to the variance of θ̂.
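The decomposition (2.8) is easy to check by simulation. The sketch below, assuming numpy and made-up settings (σ² = 4, n = 10), repeatedly computes the biased estimator S_n² = (1/n) Σ(X_i − X̄)² of σ² and verifies that its simulated MSE is approximately its variance plus its squared bias.

import numpy as np

rng = np.random.default_rng(5)
sigma2, n, reps = 4.0, 10, 20000          # assumed settings, for illustration only

est = np.empty(reps)
for r in range(reps):
    x = rng.normal(0.0, np.sqrt(sigma2), size=n)
    est[r] = np.mean((x - x.mean()) ** 2) # the estimator S_n^2 of sigma^2

bias = est.mean() - sigma2
var = est.var()
mse = np.mean((est - sigma2) ** 2)
print(mse, var + bias**2)                 # the two numbers should nearly agree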

Example 2.8
If X has the binomial distribution with the parameters n and p, show that the sample proportion, p̂ = X/n, is an unbiased estimator of p.
Solution
Since E(X) = np, then
E(p̂) = E(X/n) = np/n = p
Example 2.9
Suppose that X is a r.v. with mean μ and variance σ². Let X_1, X_2, . . . , X_n be a random sample from X. Show that the sample mean X̄ and the sample variance S² are unbiased estimators of μ and σ², respectively.
Solution
Since
E(X̄) = E[(1/n) Σ_{i=1}^{n} X_i] = (1/n) Σ_{i=1}^{n} E(X_i) = (1/n)(nμ) = μ,
X̄ is an unbiased estimator for μ.
Now, by the properties of the χ² distribution discussed in Chapter 1,
E(S²) = E[1/(n−1) Σ_{i=1}^{n} (X_i − X̄)²] = σ²
Note:
 Since X_1, X_2, . . . , X_n are independent (random sample), then
var(X̄) = var[(1/n) Σ_{i=1}^{n} X_i] = (1/n²) Σ_{i=1}^{n} var(X_i) = (1/n²)(nσ²) = σ²/n
 The MLE of σ², namely,
S_n² = (1/n) Σ_{i=1}^{n} (X_i − X̄)²
is not an unbiased estimator for σ², since
E(S_n²) = ((n−1)/n) E(S²) = (1 − 1/n) σ²
However, we note that
E(S_n²) = (1 − 1/n) σ² → σ² as n → ∞
Therefore, S_n² is said to be an asymptotically unbiased estimator for σ².
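A small simulation makes the contrast visible. The sketch below, assuming numpy and made-up settings (σ² = 9, n = 5), averages S² and S_n² over many samples: the first average is close to σ², while the second is close to (1 − 1/n)σ².

import numpy as np

rng = np.random.default_rng(6)
sigma2, n, reps = 9.0, 5, 50000           # assumed settings, for illustration only

s2 = np.empty(reps)
sn2 = np.empty(reps)
for r in range(reps):
    x = rng.normal(0.0, np.sqrt(sigma2), size=n)
    s2[r] = x.var(ddof=1)                 # S^2, divides by n - 1
    sn2[r] = x.var(ddof=0)                # S_n^2, divides by n

print(s2.mean(), sn2.mean(), (1 - 1/n) * sigma2)
# s2.mean() is close to 9; sn2.mean() is close to (1 - 1/5)*9 = 7.2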

b- Minimum Variance Unbiased Estimator (MVUE)

We note that if the estimator θ̂ is unbiased for θ, the mean square error reduces to the variance of the estimator θ̂. Within the class of unbiased estimators, we would like to find the estimator that has the smallest variance. Such an estimator is called a minimum variance unbiased estimator.

Definition 2.4
Let X_1, X_2, …, X_n be a random sample of size n from f(x; θ). An estimator θ̂ of θ is called a minimum variance unbiased estimator (MVUE) of θ if
1- θ̂ is unbiased for θ, and
2- for any other unbiased estimator θ̃ of θ, Var(θ̃) ≥ Var(θ̂) for all possible values of θ.
In some cases, lower bounds can be derived for the variance of unbiased estimators. If
an unbiased estimator can be found that attains such a lower bound, then it follows that
the estimator is a MVUE.
c- Efficiency
The mean square error is an important criterion for comparing two estimators. Let θ̂_1 and θ̂_2 be two estimators of the parameter θ, and let MSE(θ̂_1) and MSE(θ̂_2) be the mean square errors of θ̂_1 and θ̂_2. Then, the relative efficiency of θ̂_2 to θ̂_1 is defined as
eff(θ̂_2 / θ̂_1) = MSE(θ̂_1) / MSE(θ̂_2)        (2.9)
If this relative efficiency is less than one, we would conclude that θ̂_1 is a more efficient estimator of θ than θ̂_2, in the sense that it has a smaller mean square error.
For example, suppose that we wish to estimate the mean μ of a population. We have a random sample of n observations X_1, X_2, . . . , X_n and we wish to compare two possible estimators for μ: the sample mean X̄ and a single observation from the sample, say X_1. Note that both X̄ and X_1 are unbiased estimators of μ; consequently, the mean square error of both estimators is simply the variance. For the sample mean, we have
MSE(X̄) = var(X̄) = σ²/n
where σ² is the population variance; for an individual observation, we have
MSE(X_1) = var(X_1) = σ²
Therefore, the relative efficiency of X_1 to X̄ is
eff(X_1 / X̄) = MSE(X̄) / MSE(X_1) = 1/n
Since 1/n < 1 for sample sizes n ≥ 2, we conclude that the sample mean X̄ is a more efficient estimator of μ than a single observation X_1.
Note: The MVUE is sometimes called the most efficient estimator.
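The factor 1/n is easy to reproduce by simulation. The sketch below, assuming numpy and made-up settings (μ = 0, σ = 1, n = 10), estimates the MSE of X̄ and of X_1 from repeated samples and prints their ratio, which should be close to 1/n.

import numpy as np

rng = np.random.default_rng(7)
mu, sigma, n, reps = 0.0, 1.0, 10, 50000  # assumed settings, for illustration only

samples = rng.normal(mu, sigma, size=(reps, n))
mse_xbar = np.mean((samples.mean(axis=1) - mu) ** 2)  # MSE of the sample mean
mse_x1 = np.mean((samples[:, 0] - mu) ** 2)           # MSE of a single observation

print(mse_xbar / mse_x1)                  # relative efficiency, close to 1/n = 0.1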

d- Consistency
The estimator θ̂ is called a consistent estimator of the parameter θ if and only if
MSE(θ̂) = E(θ̂ − θ)² → 0 as n → ∞
That is, θ̂ is unbiased (or at least E[θ̂] → θ as n → ∞) and var(θ̂) → 0 as n → ∞.
Note that consistency is an asymptotic property, that is, a limiting property of an estimator. Informally, consistency means that when n is sufficiently large, we can be practically certain that the error made with a consistent estimator will be as small as we please.
Based on the previous examples, since p̂ = X/n is an unbiased estimator for p and
var(p̂) = var(X/n) = np(1−p)/n² = p(1−p)/n → 0 as n → ∞,
then p̂ = X/n is a consistent estimator for p.
Similarly, X̄ is a consistent estimator of μ, since X̄ is unbiased and
var(X̄) = σ²/n → 0 as n → ∞
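Consistency of p̂ can also be seen numerically. The sketch below, assuming numpy and a made-up true proportion p = 0.3, estimates the MSE of p̂ = X/n for increasing n and compares it with p(1 − p)/n; both shrink toward zero.

import numpy as np

rng = np.random.default_rng(8)
p, reps = 0.3, 20000                      # assumed true proportion, for illustration only

for n in (10, 100, 1000, 10000):
    p_hat = rng.binomial(n, p, size=reps) / n
    print(n, np.mean((p_hat - p) ** 2), p * (1 - p) / n)
    # the simulated MSE and p(1-p)/n both decrease toward 0 as n grows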
n
<+><+><+><+><+><+><+><+><+>

EXERCISES
[1] Find method of moments estimators (MME's) of θ based on a random sample X_1, …, X_n from each of the following pdf's:
a- f(x; θ) = θ x^{θ−1}; 0 < x < 1, zero otherwise; θ > 0.
b- f(x; θ) = (θ + 1) x^{−θ−2}; 1 < x, zero otherwise; θ > 0.
c- f(x; θ) = θ² x e^{−θx}; 0 < x, zero otherwise; θ > 0.

[2] Find maximum likelihood estimators (MLE's) for θ based on a random sample of size n for each of the pdf's in problem [1].

[3] Find the MLE for θ based on a random sample of size n from a distribution with pdf
f(x; θ) = 2θ² x^{−3} for θ ≤ x, and 0 for x < θ;  θ > 0
[4] Let X_1, X_2, …, X_n be a random sample from a geometric distribution
f(x; θ) = θ (1−θ)^{x−1} for x = 1, 2, 3, ….
Find a formula for estimating θ by using
a- the method of moments,      b- the method of maximum likelihood.

[5] Let X_1, X_2, . . . , X_n be a random sample from a geometric distribution, X ~ GEO(p). Find the MLE's of the following quantities:
a- E(X) = 1/p.      b- Var(X) = (1−p)/p².
c- P[X > k] = (1−p)^k for arbitrary k = 1, 2, …
(Hint: Use the invariance property of MLE’s).

[6] If X_1, X_2, ..., X_n constitute a random sample from a population given by the p.d.f.
f(x; θ) = (1/θ²) x e^{−x/θ} for x > 0, θ > 0, and 0 otherwise
a- Find the maximum likelihood estimator θ̂ for the parameter θ.
b- Show that the method of moments gives the same estimator θ̂ for θ.
c- Prove that θ̂ is an unbiased and consistent estimator for θ.
(Hint: ∫_0^∞ x^m e^{−x/θ} dx = m! θ^{m+1} for any +ve integer m)

[7] If X_1, X_2, . . . , X_n are a random sample from a population given by
f(x; θ) = (1/(2θ³)) x² e^{−x/θ} for x > 0, and 0 otherwise
a- Find the maximum likelihood estimator θ̂ for the parameter θ.
b- Show that the method of moments gives the same estimator θ̂ for θ.
c- Prove that θ̂ is the minimum variance unbiased estimator for θ.

[8] If X_1, X_2, . . . , X_n are a random sample from the Poisson distribution
f(x; λ) = e^{−λ} λ^x / x!,   x = 0, 1, 2, ....
a- Find the maximum likelihood estimator λ̂ for the parameter λ.
b- Prove that λ̂ is the minimum variance unbiased estimator for λ.

[9] Let X_1, …, X_n be a random sample from EXP(θ), and define θ̂_1 = X̄ and θ̂_2 = nX̄/(n + 1).
(a) Find the variances of θ̂_1 and θ̂_2.
(b) Find the MSE's of θ̂_1 and θ̂_2.
(c) Compare the variances of θ̂_1 and θ̂_2 for n = 2.
(d) Compare the MSE's of θ̂_1 and θ̂_2 for n = 2.

[10] Let X_1, X_2 and X_3 be a random sample from a population having mean μ and variance σ². Consider the following estimators:
μ̂_1 = (2X_1 + X_2 − X_3)/2   and   μ̂_2 = (3X_1 + 2X_2 − X_3)/4
Compare these two estimators. Which do you prefer? Why?

[11] Suppose that θ̂_1 and θ̂_2 are estimators of the parameter θ. We know that E(θ̂_1) = θ, var(θ̂_1) = 10 and E(θ̂_2) = θ/2, var(θ̂_2) = 4. Which estimator is "best"? In what sense is it best?

[12] Let θ̂_1 and θ̂_2 be two estimators of θ. The estimator θ̂_2 is said to be more efficient than θ̂_1 if
a. var(θ̂_1) > var(θ̂_2)      b. MSE(θ̂_1) > MSE(θ̂_2)
c. E(θ̂_1) > E(θ̂_2)          d. None of the above

[13] Let θ̂_1 and θ̂_2 be two unbiased estimators of θ. The estimator θ̂_1 is said to be more efficient than θ̂_2 if
a. E(θ̂_1²) < E(θ̂_2²)   b. E(θ̂_1²) > E(θ̂_2²)   c. E(θ̂_1) < E(θ̂_2)   d. E(θ̂_1) > E(θ̂_2)

[14] Suppose that θ̂_1, θ̂_2 and θ̂_3 are three estimators of the parameter θ. We know that E(θ̂_1) = E(θ̂_2) = θ, E(θ̂_3) ≠ θ, var(θ̂_1) = 12, var(θ̂_2) = 10 and E(θ̂_3 − θ)² = 6. Then the most efficient estimator among them is:
a. θ̂_1      b. θ̂_2      c. θ̂_3      d. None of the above.

[15] Let X be a random variable with mean μ and variance σ². Given two independent random samples of sizes 30 and 50 with sample means X̄_1 and X̄_2, respectively, show that
X̄ = α X̄_1 + (1 − α) X̄_2
is an unbiased estimator of μ. Find the value of α that minimizes var(X̄). Let
μ̂ = (X̄_1 + X̄_2)/2
be another estimator for μ. Compare these two estimators. Which do you prefer? Why?
<+><+><+><+><+><+><+><+><+>

