
Addis Ababa University

College of Business and Economics


Department of Economics
Econ 2042: Statistics for Economists
6. Estimation of Parameters
Fantu Guta Chemrie (PhD)



6. Estimation of Parameters (7 hours)
6.1 Basic Concepts: Estimator and Estimate, Point and
Interval Estimation
6.2 Desirable Properties of Estimators
6.3 Methods of Estimation
6.3.1 Method of Moments
6.3.2 Least Squares Method
6.3.3 Method of Maximum Likelihood Estimation



6.4 Interval Estimation
6.4.1 Confidence Interval for the Mean
6.4.2 Confidence Interval for the Difference of Two Means
6.4.3 Confidence Interval for the Variance
6.4.4 Confidence Interval for the Variance Ratio



6. Estimation of Parameters
6.1 Basic Concepts: Estimator and Estimate,
Point and Interval Estimation

Two important problems in statistical inference are
estimation and tests of hypotheses about statistical
parameters.
Estimation of statistical parameters is the subject
of this chapter.
Given observations on rvs $X_1, X_2, \dots, X_n$ with joint
density function $f(x_1, x_2, \dots, x_n; \theta)$:

θ is a single parameter or a vector of parameters
θ is unknown
f is known

We want to make inferences about the unknown
parameters; this requires estimation of θ.
In statistical inference, we are usually concerned with:
Accuracy of estimates of θ.
Hypotheses about θ.
Definition (6.1)
A statistic is any function of observable rvs and is
a rv itself. A statistic does not involve any
unknown values, and we assume that we have a
random sample on which estimation of the
parameters is based.

Definition (6.2)
An estimator of a parameter θ is a statistic
$\hat{\theta} = t(X_1, X_2, \dots, X_n)$ used to estimate θ.
Example (6.1)
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$ is proposed as an estimator of the
parameter µ, the population mean.

Definition (6.3)
An estimate of a parameter θ is the specific value of
an estimator $\hat{\theta}$.

Example (6.2)
If we have $x_1 = 10$, $x_2 = 9.5$, $x_3 = 11.2$, we have an
estimate of µ as follows: $\bar{x} = \frac{10 + 9.5 + 11.2}{3} \approx 10.23$.
Such an estimate is called a point estimate.
Assume that the values $x_1, x_2, \dots, x_n$ of a random
sample $X_1, X_2, \dots, X_n$ from $f(\cdot\,; \theta)$ can be
observed.
On the basis of the observed sample values
$x_1, x_2, \dots, x_n$ it is desired to estimate the value of
the unknown parameter θ or the value of some
function, say τ(θ), of the unknown parameter.
This estimation can be made in two ways.
The first, called point estimation, is to let the value
of some statistic, say $t(x_1, x_2, \dots, x_n)$, represent,
or estimate, the unknown τ(θ); such a statistic is
called a point estimator.
The second, called interval estimation, is to define
two statistics, say $t_1(x_1, \dots, x_n)$ and $t_2(x_1, \dots, x_n)$,
where $t_1(x_1, \dots, x_n) < t_2(x_1, \dots, x_n)$, so that
$(t_1(x_1, x_2, \dots, x_n),\; t_2(x_1, x_2, \dots, x_n))$ constitutes an
interval for which the probability that it contains
the unknown τ(θ) can be determined.
6.2 Desirable Properties of Estimators

The main objective of estimation is to obtain the
nearest estimate of the given parameter or some
function of the unknown parameters.
The first desirable property of an estimator is that
it converges to the value of the parameter as the
sample size becomes large.
An estimator that possesses such a property is
called a consistent estimator.
A closely related property is the property of being
unbiased.

Definition (6.4)
The statistic t is called an unbiased estimator of
θ iff $E(t) = \theta \;\;\forall \theta \in \Theta$, the parameter space.

Definition (6.5)
The bias of an estimator $\hat{\theta}$, denoted by $B(\hat{\theta})$, is
defined as $B(\hat{\theta}) = E(\hat{\theta}) - \theta$, where $\hat{\theta}$ is a
statistic used to estimate the parameter θ.


Example (6.3)
Let $X_1, X_2, \dots, X_n$ be a random sample of size n
of X. The second moment about the mean, $\hat{\mu}_2$, of
this random sample can be taken as an estimator of
the population variance σ². The bias of this
estimator can be obtained as follows:
$$E(\hat{\mu}_2) = E\left[\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2\right] = E\left[\frac{1}{n}\sum_{i=1}^{n}\left((X_i - \mu) - (\bar{X} - \mu)\right)^2\right]$$
$$= E\left[\frac{1}{n}\sum_{i=1}^{n}\left[(X_i - \mu)^2 + (\bar{X} - \mu)^2 - 2(\bar{X} - \mu)(X_i - \mu)\right]\right]$$
$$= E\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 - (\bar{X} - \mu)^2\right] = \frac{1}{n}\sum_{i=1}^{n}E(X_i - \mu)^2 - E(\bar{X} - \mu)^2$$
$$= \frac{1}{n}\sum_{i=1}^{n}\sigma^2 - \sigma_{\bar{x}}^2 = \sigma^2 - \frac{\sigma^2}{n} = \left(1 - \frac{1}{n}\right)\sigma^2$$
$$\Rightarrow B(\hat{\mu}_2) = E(\hat{\mu}_2) - \sigma^2 = \left(1 - \frac{1}{n}\right)\sigma^2 - \sigma^2 = -\frac{1}{n}\sigma^2$$
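To see this bias numerically, here is a minimal simulation sketch in Python (NumPy assumed; the sample size, replication count, and σ² are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma2 = 10, 100_000, 4.0

# mu2_hat = (1/n) * sum (X_i - X_bar)^2, computed for each replication
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
mu2_hat = x.var(axis=1, ddof=0)               # ddof=0 divides by n

print("mean of mu2_hat:", mu2_hat.mean())     # approx (1 - 1/n)*sigma2 = 3.6
print("estimated bias: ", mu2_hat.mean() - sigma2)
print("theoretical bias:", -sigma2 / n)       # -0.4
```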

A useful, though crude, measure of goodness or
closeness of an estimator of a parameter θ is the
mean-squared error of the estimator.
Let $MSE(\hat{\theta})$ denote the MSE of the estimator.

Definition (6.6)
Let $\hat{\theta}$ be an estimator of θ. $E(\hat{\theta} - \theta)^2$ is
defined to be the MSE of the estimator $\hat{\theta}$, denoted
by $MSE(\hat{\theta})$.
Let θ be the parameter of interest, to be estimated,
and $h(x_1, x_2, \dots, x_n)$ be the estimator of θ, i.e.,


$\hat{\theta} = h(x_1, x_2, \dots, x_n)$.
Then $\hat{\theta} - \theta$ is the sampling error, and
$E(\hat{\theta} - \theta)^2$ is the MSE of $\hat{\theta}$.

In general, we try to minimize $MSE(\hat{\theta})$.

Example (6.4)
Given a random sample $x_1, x_2, \dots, x_n$, let θ be
the population mean, i.e., the population parameter
of interest. Then,
$$\hat{\theta}_1 = \bar{X}, \quad \hat{\theta}_2 = \frac{X_1 + X_2}{2},$$
$$\hat{\theta}_3 = \frac{X_1 + X_3 + X_5 + \cdots + X_{2m+1}}{m + 1}, \quad \hat{\theta}_4 = \frac{X_2 + X_4 + X_6 + \cdots + X_{2m}}{m}$$
are all possible candidates for estimators of θ;
the question is, which one has minimum MSE?

Though one can use the MAE, $E\left|\hat{\theta} - \theta\right|$, we often
use $MSE(\hat{\theta})$ due to its computational and analytical
simplicity.
Theorem (6.1)
$$MSE(\hat{\theta}) = var(\hat{\theta}) + \left[B(\hat{\theta})\right]^2$$

Proof.
$$E(\hat{\theta} - \theta)^2 = E\left[(\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)\right]^2$$
$$= E(\hat{\theta} - E(\hat{\theta}))^2 + 2E\left[(\hat{\theta} - E(\hat{\theta}))(E(\hat{\theta}) - \theta)\right] + (E(\hat{\theta}) - \theta)^2$$
$$= \underbrace{E(\hat{\theta} - E(\hat{\theta}))^2}_{var(\hat{\theta})} + \underbrace{(E(\hat{\theta}) - \theta)^2}_{[B(\hat{\theta})]^2}, \;\text{ as }\; E(\hat{\theta} - E(\hat{\theta})) = 0$$


Proof.
Therefore, $MSE(\hat{\theta}) = var(\hat{\theta}) + \left[B(\hat{\theta})\right]^2$.

Since we want to minimize $MSE(\hat{\theta})$, we
would look for minimum bias; thus we restrict our
choice to estimators with $B(\hat{\theta}) = 0$, so that
$MSE(\hat{\theta}) = var(\hat{\theta})$, i.e., we look for estimators $\hat{\theta}$
such that $E(\hat{\theta}) = \theta$.


For example, for the earlier proposed estimators of the
population mean, µ = θ, we have
$$\hat{\theta}_1 = \bar{X} \;\Rightarrow\; E(\hat{\theta}_1) = \mu \;\text{ and }\; var(\hat{\theta}_1) = \frac{\sigma^2}{n}$$
$$\hat{\theta}_2 = \frac{X_1 + X_2}{2} \;\Rightarrow\; E(\hat{\theta}_2) = \mu \;\text{ and }\; var(\hat{\theta}_2) = \frac{\sigma^2}{2}$$
$$\hat{\theta}_3 = \frac{X_1 + X_3 + \cdots + X_{2m+1}}{m + 1} \;\Rightarrow\; E(\hat{\theta}_3) = \mu \;\text{ and }\; var(\hat{\theta}_3) = \frac{\sigma^2}{m + 1}$$
$$\hat{\theta}_4 = \frac{X_2 + X_4 + \cdots + X_{2m}}{m} \;\Rightarrow\; E(\hat{\theta}_4) = \mu \;\text{ and }\; var(\hat{\theta}_4) = \frac{\sigma^2}{m}$$
But if we let $\hat{\theta}_5 = \frac{X_1 + X_2}{n}$, $n \neq 2$, then $E(\hat{\theta}_5) \neq \mu$.
Therefore, $\hat{\theta}_5$ is biased.
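A quick Monte Carlo comparison of these candidate estimators (a sketch; the normal population, n = 20, and the replication count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 20, 100_000
m = (n - 2) // 2                   # so that X_{2m+1} is within the sample

x = rng.normal(mu, sigma, size=(reps, n))
candidates = {
    "theta1 (sample mean)":       x.mean(axis=1),
    "theta2 ((X1+X2)/2)":         (x[:, 0] + x[:, 1]) / 2,
    "theta3 (odd-indexed mean)":  x[:, 0::2][:, : m + 1].mean(axis=1),
    "theta4 (even-indexed mean)": x[:, 1::2][:, :m].mean(axis=1),
}
for name, est in candidates.items():
    print(f"{name:28s} bias={est.mean() - mu:+.4f}  var={est.var():.4f}  "
          f"MSE={((est - mu) ** 2).mean():.4f}")
```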
From the set of unbiased estimators we still would
like to restrict our choice to the one with minimum
$MSE(\hat{\theta}) = var(\hat{\theta})$.

Minimum variance

We have to choose the estimator with minimum
variance among all unbiased estimators of the
parameter of interest.
Linearity: An estimator is said to be linear in
$x_1, x_2, \dots, x_n$ if it can be written as
$\hat{\theta} = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$, where the $c_i$'s are known
constants.

Definition (6.7)
An estimator will be called a best unbiased
estimator of the parameter θ if it is unbiased and if
it possesses minimum variance among all unbiased
estimators.

If an estimator is best unbiased and linear, then it
is said to be the best linear unbiased estimator (BLUE)
of the population parameter of interest.
Proposition (6.1): $\bar{x}$ is the BLUE of the population mean µ.

Proof.
a). $E(\bar{x}) = \mu$. Therefore, $\bar{x}$ is an unbiased estimator
of µ.
b). $\bar{x} = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \cdots + \frac{1}{n}x_n$; therefore $\bar{x}$ is linear in
the sampled observations.
c). Recall that $var(\bar{x}) = \frac{\sigma^2}{n}$, where $\sigma^2 = var(x)$, and
let $\hat{\mu} = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$ be any other
linear unbiased estimator of µ. Thus,
$$E(\hat{\mu}) = c_1 E(x_1) + c_2 E(x_2) + \cdots + c_n E(x_n) = c_1\mu + c_2\mu + \cdots + c_n\mu = \mu\sum_{i=1}^{n} c_i$$
Note that $\sum_{i=1}^{n} c_i = 1$ for $\hat{\mu}$ to be unbiased.
$$var(\hat{\mu}) = c_1^2\, var(x_1) + c_2^2\, var(x_2) + \cdots + c_n^2\, var(x_n) = c_1^2\sigma^2 + c_2^2\sigma^2 + \cdots + c_n^2\sigma^2 = \sigma^2\sum_{i=1}^{n} c_i^2$$
To show that $var(\bar{x}) \leq var(\hat{\mu})$, we have to show
that $\sum_{i=1}^{n} c_i^2 \geq \frac{1}{n}$ given that $\sum_{i=1}^{n} c_i = 1$.
Proof.
Let $d_i = c_i - \frac{1}{n} \;\Rightarrow\; \sum_{i=1}^{n} d_i = \sum_{i=1}^{n} c_i - n\cdot\frac{1}{n} = \sum_{i=1}^{n} c_i - 1 = 0$.
Given $d_i = c_i - \frac{1}{n} \;\Rightarrow\; c_i = d_i + \frac{1}{n}$,
$$c_i^2 = \left(d_i + \frac{1}{n}\right)^2 = d_i^2 + \frac{1}{n^2} + \frac{2d_i}{n}$$
$$\Rightarrow \sum_{i=1}^{n} c_i^2 = \sum_{i=1}^{n} d_i^2 + n\cdot\frac{1}{n^2} + \frac{2}{n}\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} d_i^2 + \frac{1}{n}, \;\text{ as }\; \sum_{i=1}^{n} d_i = 0$$
$$\Rightarrow \sum_{i=1}^{n} c_i^2 \geq \frac{1}{n}$$
Proof.
Therefore, $var(\bar{x}) \leq var(\hat{\mu})$, where $\hat{\mu}$ is any other
linear unbiased estimator of µ.

Note: Among all the possible linear unbiased estimators of µ, $\bar{x}$ is the
best in the sense that it has minimum variance. Stated
in other words, $var(\bar{X}) \leq var(\hat{\mu})$, where $\hat{\mu}$
is any other linear unbiased estimator of µ.

Large Sample Properties of Estimators:
Definition (6.8)
Let $\{X_1, X_2, \dots, X_n\}$ be a sequence of rvs. If
$\forall \varepsilon > 0$, $\lim_{n\to\infty} \Pr(|X_n - a| < \varepsilon) = 1$, then we
say a is the probability limit of $\{X_n\}$ (or $X_n$
tends in probability to a), i.e.,
$\lim_{n\to\infty} \Pr(a - \varepsilon < X_n < a + \varepsilon) = 1$.

Definition (6.9)
An estimator $\hat{\theta}$ of a parameter θ is a consistent
estimator if $\operatorname{plim} \hat{\theta} = \theta$.


Let $X_1, X_2, \dots, X_n$ be a random sample of size n
from a given population; then $E(\bar{x}) = \mu$ and
$var(\bar{x}) = \frac{\sigma^2}{n}$.

The fact that as n increases the variance of the
sample mean is reduced, so that the sample mean approaches the
population mean, is known as the law of large numbers.
The weak law of large numbers: If $x_1, x_2, \dots, x_n$
is a random sample from a population with mean µ and
variance $\sigma^2 < \infty$, then $\lim_{n\to\infty} \Pr\{|\bar{x} - \mu| < \varepsilon\} = 1$, $\varepsilon > 0$,
and this is written as $\operatorname{plim} \bar{x} = \mu$. To prove this we
need Chebyshev's inequality.
Theorem (6.2)
Let $\hat{\theta}_n$ be an estimator of θ based on a sample of
size n. If
i). $\lim_{n\to\infty} E(\hat{\theta}_n) = \theta$
ii). $\lim_{n\to\infty} var(\hat{\theta}_n) = 0$
then $\hat{\theta}_n$ is a consistent estimator of θ.


Proof.
$$MSE(\hat{\theta}_n) = var(\hat{\theta}_n) + \left[B(\hat{\theta}_n)\right]^2$$
i). $\lim_{n\to\infty} E(\hat{\theta}_n) = \theta \;\Rightarrow\; \lim_{n\to\infty} B(\hat{\theta}_n) = 0 \;\Rightarrow\; \lim_{n\to\infty} \left[B(\hat{\theta}_n)\right]^2 = 0$
ii). $\lim_{n\to\infty} var(\hat{\theta}_n) = 0$
From (i) and (ii) it can be seen that
$\lim_{n\to\infty} MSE(\hat{\theta}_n) = 0$.
Now, by Chebyshev's inequality,
$$\Pr\left(\left|\hat{\theta}_n - \theta\right| \geq \varepsilon\right) \leq \frac{MSE(\hat{\theta}_n)}{\varepsilon^2}$$
$$\Rightarrow \lim_{n\to\infty} \Pr\left(\left|\hat{\theta}_n - \theta\right| \geq \varepsilon\right) = 0 \;\Rightarrow\; \lim_{n\to\infty} \Pr\left(\left|\hat{\theta}_n - \theta\right| < \varepsilon\right) = 1.$$
This implies that $\hat{\theta}_n$ is a consistent estimator of θ.

Example (6.5)
Let $X_1, X_2, \dots, X_n$ be a random sample of size n
with $E(X_i) = \mu$ and $var(X_i) = \sigma^2$ for all i. Let
$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$; then $E(\bar{X}_n) = \mu$ and $var(\bar{X}_n) = \frac{\sigma^2}{n}$.
$$\Rightarrow \lim_{n\to\infty} E(\bar{X}_n) = \mu \;\text{ and }\; \lim_{n\to\infty} var(\bar{X}_n) = \lim_{n\to\infty} \frac{\sigma^2}{n} = 0$$
$\Rightarrow \bar{X}_n$ is a consistent estimator of µ.

Example (6.6)
Let $X_1, X_2, \dots, X_n$ be a random sample of size n
from a normal distribution with $E(X_i) = \mu$ and
$var(X_i) = \sigma^2 \;\forall i$. Let $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ be an
estimator of σ²; then $E(S^2) = \sigma^2$ and $var(S^2) = \frac{2\sigma^4}{n-1}$.
$$\Rightarrow \lim_{n\to\infty} E(S^2) = \sigma^2 \;\text{ and }\; \lim_{n\to\infty} var(S^2) = \lim_{n\to\infty} \frac{2\sigma^4}{n-1} = 0$$
Therefore, $S^2$ is a consistent estimator of σ².
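A small simulation can illustrate both consistency results (a sketch; the normal population and the grid of sample sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, eps, reps = 5.0, 2.0, 0.2, 2_000

for n in (10, 100, 1_000, 10_000):
    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)
    # Empirical Pr(|estimator - parameter| < eps); both should tend to 1.
    print(f"n={n:6d}  Pr(|xbar-mu|<eps)={(np.abs(xbar - mu) < eps).mean():.3f}"
          f"  Pr(|S2-sigma^2|<eps)={(np.abs(s2 - sigma**2) < eps).mean():.3f}")
```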

Asymptotic efficiency
There are two concepts of asymptotic efficiency
(relative and absolute).

Definition (6.10)
The efficiency of an estimator $\hat{\theta}$ relative to an
estimator $\tilde{\theta}$, denoted by $Eff(\hat{\theta}, \tilde{\theta})$, is defined as
$$Eff(\hat{\theta}, \tilde{\theta}) = \frac{MSE(\tilde{\theta})}{MSE(\hat{\theta})}$$
This is a measure of the relative efficiency of an
estimator.

Example (6.7)
Let $X_1, X_2, \dots, X_n$ be a random sample of size n,
and let $\tilde{\theta} = \bar{X}_n$ and $\hat{\theta} = (X_1 + X_2)/2$; then
$$E(\hat{\theta}) = \mu \;\text{ and }\; E(\tilde{\theta}) = \mu$$
$$\Rightarrow MSE(\hat{\theta}) = var(\hat{\theta}) = \sigma^2/2, \;\text{ and}$$
$$MSE(\tilde{\theta}) = var(\tilde{\theta}) = \frac{\sigma^2}{n}$$
$$\Rightarrow Eff(\hat{\theta}, \tilde{\theta}) = \frac{MSE(\tilde{\theta})}{MSE(\hat{\theta})} = \frac{\sigma^2/n}{\sigma^2/2} = \frac{2}{n}$$
This is to say that $\hat{\theta}$ is 2/n times as efficient as $\tilde{\theta}$,
or equivalently $\tilde{\theta}$ is n/2 times as efficient as $\hat{\theta}$.
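A quick numerical check of this relative efficiency (a sketch; the standard normal population and n = 10 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, reps = 0.0, 1.0, 10, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
theta_tilde = x.mean(axis=1)            # sample mean, MSE ~ sigma^2/n
theta_hat = (x[:, 0] + x[:, 1]) / 2     # two-observation mean, MSE ~ sigma^2/2

mse_tilde = ((theta_tilde - mu) ** 2).mean()
mse_hat = ((theta_hat - mu) ** 2).mean()
print("Eff(theta_hat, theta_tilde) =", mse_tilde / mse_hat)  # approx 2/n = 0.2
```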

Absolute efficiency:

Given that $\tilde{\theta}$ is an unbiased estimator of θ, the
measure of its absolute efficiency is obtained when we
relate it to the optimum estimator $\hat{\theta}_{opt}$, i.e., to
the minimum-variance unbiased estimator of θ.
Thus, the measure of absolute efficiency is given by:
$$Eff(\tilde{\theta}, \hat{\theta}_{opt}) = \frac{MSE(\hat{\theta}_{opt})}{MSE(\tilde{\theta})} = \frac{CRLB}{MSE(\tilde{\theta})} \leq 1$$

6.3 Methods of Estimation

6.3.1 Method of Moments

Let $f(\cdot\,; \theta_1, \dots, \theta_k)$ be a density of a rv X which
has k parameters $\theta_1, \theta_2, \dots, \theta_k$, and let $\mu_r'$ denote
the r-th moment about 0 (the origin), i.e., $\mu_r' = E(X^r)$.

In general $\mu_r'$ will be a known function of the k
parameters $\theta_1, \theta_2, \dots, \theta_k$; we denote this by writing
$\mu_r' = \mu_r'(\theta_1, \theta_2, \dots, \theta_k)$.
Now, let $X_1, X_2, \dots, X_n$ be a random sample of
size n from the density function $f(\cdot\,; \theta_1, \theta_2, \dots, \theta_k)$,
and let $M_j'$ be the j-th sample moment about 0, that
is, $M_j' = \frac{1}{n}\sum_{i=1}^{n} X_i^j$.
By equating the k sample moments $M_j'$ to the k
population moments $\mu_j'$ we obtain k equations in the k
unknown parameters $\theta_1, \theta_2, \dots, \theta_k$:
$$\mu_j'(\theta_1, \theta_2, \dots, \theta_k) = M_j'; \quad j = 1, 2, \dots, k$$
Assuming that the above system of equations has
a unique solution, let the solution of this
system of equations be $(\hat{\theta}_1, \dots, \hat{\theta}_k)$; then we say
that $(\hat{\theta}_1, \dots, \hat{\theta}_k)$ is the estimator of
the parameters $(\theta_1, \dots, \theta_k)$ obtained by the method of moments (MM).
Example (6.8)
Let $X_1, X_2, \dots, X_n$ be a random sample from a
normal distribution with mean µ and variance σ².
Let $(\theta_1, \theta_2) = (\mu, \sigma)$. Estimate the parameters µ
and σ by the method of moments.

Solution
Recall that $\mu = \mu_1'$ and $\sigma^2 = \mu_2' - (\mu_1')^2$.
Now, the method of moments equations become:
$$M_1' = \mu_1'(\mu, \sigma) = \mu$$
$$M_2' = \mu_2'(\mu, \sigma) = \sigma^2 + \mu^2$$
Solving this system of two equations yields the
estimators of (µ, σ).
Hence, the MM estimator of µ is $M_1' = \bar{X}$, and
the MM estimator of σ is
$$\sqrt{M_2' - \bar{X}^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}.$$
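A minimal sketch of these MM estimators in code (NumPy assumed; the true parameter values and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
mu_true, sigma_true, n = 3.0, 1.5, 5_000

x = rng.normal(mu_true, sigma_true, size=n)
m1 = x.mean()                 # first sample moment about 0
m2 = (x ** 2).mean()          # second sample moment about 0

mu_mm = m1
sigma_mm = np.sqrt(m2 - m1 ** 2)
print("MM estimate of mu:   ", mu_mm)
print("MM estimate of sigma:", sigma_mm)
```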

6.3.2 Least Squares Method


This is probably the most widely used method of
estimation.
The LS method minimizes the sum of squared errors in
estimating the parameter of interest.
Let $x_1, x_2, \dots, x_n$ be a random sample from a
population with mean equal to µ.
We know that $E(x_i) = \mu$ for all i.
However, the actual realization of $x_i$ will deviate
from the population mean, µ.
Let this deviation be given by $e_i$ for all i. Thus,
$$x_i = \mu + e_i, \quad e_i = x_i - \mu$$
$$E(e_i) = E(x_i) - \mu = 0$$
$$var(e_i) = E(x_i - \mu)^2 = \sigma^2 \;\text{ and }\; cov(e_i, e_j) = 0 \;\; (i \neq j)$$
Define the sum of squared errors as
$$S(\mu) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (x_i - \mu)^2$$
The first order condition for minimization of this
function requires taking the first derivative
with respect to µ and equating it to zero.
(What is the second order condition?) Thus,
$$\frac{\partial S(\mu)}{\partial \mu} = \frac{\partial \sum_{i=1}^{n} (x_i - \mu)^2}{\partial \mu} = \sum_{i=1}^{n} 2(x_i - \mu)(-1)$$
$$\Rightarrow \sum_{i=1}^{n} 2(x_i - \hat{\mu})(-1) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i - n\hat{\mu} = 0$$
$$\Rightarrow \sum_{i=1}^{n} x_i = n\hat{\mu} \;\Rightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{X}$$
Note that the least squares estimator of µ is BLUE .
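The same minimization can be checked numerically; a sketch comparing the closed form with a generic optimizer (SciPy's minimize_scalar is used here; the data are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
x = rng.normal(7.0, 2.0, size=200)

def S(mu):
    """Sum of squared errors S(mu) = sum (x_i - mu)^2."""
    return np.sum((x - mu) ** 2)

res = minimize_scalar(S)                          # numerical minimizer of S(mu)
print("numerical LS estimate:    ", res.x)
print("closed form (sample mean):", x.mean())     # identical up to tolerance
```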

6.3.3 Method of Maximum Likelihood Estimation


Definition (6.11)
Let $x_1, \dots, x_n$ have a joint pdf that depends on θ
(possibly a vector), $\theta \in \mathbb{R}^k$. Note that f maps
$\mathbb{R}^n \to [0, \infty)$. Then the likelihood function (LF) is a mapping from
$\mathbb{R}^k \to [0, \infty)$ given by
$$L(\theta; x_1, x_2, \dots, x_n) = f(x_1, x_2, \dots, x_n; \theta).$$

Note: If $X_1, X_2, \dots, X_n$ is a random sample of variables
with density function $f(x; \theta)$, then
$$f(x_1, x_2, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$$
Hence,
$$L(\theta; x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} f(x_i; \theta)$$
This function, viewed as a function of θ given
$x_1, x_2, \dots, x_n$, is known as the likelihood function and
is written as $L(\theta; x_1, x_2, \dots, x_n)$ or $L(\theta)$.
Note: Maximizing $L(\theta; x_1, x_2, \dots, x_n)$ is equivalent to
maximizing $\ln L(\theta; x_1, x_2, \dots, x_n)$.
$$\ln L(\theta; x_1, x_2, \dots, x_n) = \ln \prod_{i=1}^{n} f(x_i; \theta) = \sum_{i=1}^{n} \ln f(x_i; \theta)$$
is known as the log likelihood function (LLF).

Definition (6.12)
A maximum likelihood estimator (MLE) of θ is a
solution $\hat{\theta}$ to $\max_{\theta} L(\theta; x_1, x_2, \dots, x_n)$.


The argument here goes as follows: given an i.i.d.
sample $x_1, x_2, \dots, x_n$, an investigator wants to
approximate the population (DGP) by a family of
density functions $f(x; \theta)$ and then tries to infer the
value of θ based on the sample.
Consider two possible alternatives for θ, say $\theta_1$ and $\theta_2$.
Then the probability of observing the sample
$x_1, x_2, \dots, x_n$ is $L(\theta_1; x)$ if $\theta_1$ is true and $L(\theta_2; x)$ if
$\theta_2$ is true.
If $L(\theta_1; x) > L(\theta_2; x)$, then $\theta = \theta_1$ gives a higher
joint probability to the actual realization (i.e., that
$x_1, x_2, \dots, x_n$ was observed).
This is based on the notion that if an event occurs,
it must be because it is most likely to happen.
This is also a method that is widely used.
Here, given the LF described earlier, i.e.,
$$L(\theta; x_1, x_2, \dots, x_n) = f(x_1; \theta) f(x_2; \theta) \cdots f(x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$$
we try to select a $\hat{\theta}$ which maximizes the likelihood
function.
Such an estimator is called the MLE of θ.
It is often helpful computationally to take the
natural logarithm of the LF, because it comes out
as a sum rather than a product of the density
functions of each observation, and maximizing the
LF is equivalent to maximizing the LLF,
since the latter is a monotonic transformation of
the former. Thus,
$$\ln L(\theta; x_1, x_2, \dots, x_n) = \ln f(x_1; \theta) + \ln f(x_2; \theta) + \cdots + \ln f(x_n; \theta) = \sum_{i=1}^{n} \ln f(x_i; \theta)$$
and the first order condition for the maximization
of this is given as
$$\frac{\partial \ln L(\theta; x_1, x_2, \dots, x_n)}{\partial \theta} = \frac{\partial \ln f(x_1; \theta)}{\partial \theta} + \cdots + \frac{\partial \ln f(x_n; \theta)}{\partial \theta} = \sum_{i=1}^{n} \frac{\partial \ln f(x_i; \theta)}{\partial \theta} = 0$$

Example (6.9)
Let $X_i$, $i = 1, 2, \dots, n$ be a random sample from a
normal distribution with mean µ and variance σ².
Find the MLE of µ and σ².
$$L(\mu, \sigma^2; x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right)$$
which results in a LLF given as follows:
$$\ln L(\mu, \sigma^2; x_1, x_2, \dots, x_n) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
Thus, the first order conditions for a maximum are:
$$\left.\frac{\partial \ln L(\mu, \sigma^2; x_1, x_2, \dots, x_n)}{\partial \mu}\right|_{\hat{\mu}, \hat{\sigma}^2} = -\frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n}(-2)(x_i - \hat{\mu}) = 0$$
$$\Rightarrow \sum_{i=1}^{n}(x_i - \hat{\mu}) = 0 \;\Rightarrow\; \hat{\mu} = \bar{X}$$
Therefore the MLE of µ is $\hat{\mu} = \bar{X}$.
We can also find the MLE of σ² as follows:
$$\left.\frac{\partial \ln L(\mu, \sigma^2; x_1, \dots, x_n)}{\partial \sigma^2}\right|_{\hat{\mu}, \hat{\sigma}^2} = -\frac{n}{2\hat{\sigma}^2} + \frac{1}{2\hat{\sigma}^4}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 = 0$$
$$\Rightarrow \frac{n}{2\hat{\sigma}^2} = \frac{1}{2\hat{\sigma}^4}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 \;\Rightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$$
$$\Rightarrow \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2, \;\text{ as }\; \hat{\mu} = \bar{x}$$

The basic steps involved in ML estimation are
(a code sketch follows this list):

i). Write the LF based on the joint density function of
the sample.
ii). Find the log likelihood function.
iii). Find the first order derivatives of the LLF with
respect to the parameters to be estimated (these
are also known as the likelihood equations or the
first order conditions).
iv). Solve the first order conditions for the MLE.
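Here is a minimal sketch of these steps for the normal example, maximizing the LLF numerically and comparing with the closed-form MLEs (SciPy assumed; the data and the log-variance parameterization are illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
x = rng.normal(2.0, 1.5, size=500)

def neg_loglik(params):
    """Negative normal log likelihood; params = (mu, log sigma^2)."""
    mu, log_s2 = params
    s2 = np.exp(log_s2)               # keeps sigma^2 > 0
    n = x.size
    return (0.5 * n * np.log(2 * np.pi) + 0.5 * n * log_s2
            + np.sum((x - mu) ** 2) / (2 * s2))

res = minimize(neg_loglik, x0=np.array([0.0, 0.0]))
mu_hat, s2_hat = res.x[0], np.exp(res.x[1])
print("numerical MLE:", mu_hat, s2_hat)
print("closed form:  ", x.mean(), x.var(ddof=0))   # matches up to tolerance
```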
Problem (6.1)
Given a random sample $X_1, X_2, \dots, X_n$ drawn from
a Poisson distribution whose density function is
given by:
$$f(x; \theta) = \frac{\theta^x e^{-\theta}}{x!}$$
a). Construct the LF and the LLF.
b). Obtain the MLE of θ.

The Cramer-Rao Inequality:

Given a random sample $X_1, X_2, \dots, X_n$ drawn from
a population with density function $f(x; \theta_0)$, if $\hat{\theta}$
is an unbiased estimator of $\theta_0$, then (subject to
some regularity conditions) it can be shown that
$$var(\hat{\theta}) \geq I^{-1}(\theta_0)$$
where
$$I(\theta_0) = -E\left[\frac{\partial^2 \ln L(\theta_0)}{\partial \theta^2}\right] = -nE\left[\frac{\partial^2 \ln f(X; \theta_0)}{\partial \theta^2}\right]$$
$I(\theta_0)$ is known as the information contained in the
sample about $\theta_0$.
$I^{-1}(\theta_0)$ is known as the Cramer-Rao Lower Bound
(CRLB) on $var(\hat{\theta})$.

Example (6.10)
Let $X \sim Ber(\pi)$; then $f(x; \pi) = \pi^x (1 - \pi)^{1-x}$,
$$\ln f(x, \pi) = x \ln \pi + (1 - x) \ln(1 - \pi)$$
$$\Rightarrow \frac{\partial \ln f(x, \pi)}{\partial \pi} = \frac{x}{\pi} - \frac{1 - x}{1 - \pi}$$
$$\Rightarrow \frac{\partial^2 \ln f(x, \pi)}{\partial \pi^2} = -\frac{x}{\pi^2} - \frac{1 - x}{(1 - \pi)^2}$$
$$\Rightarrow -E\left[\frac{\partial^2 \ln f(x, \pi)}{\partial \pi^2}\right] = \frac{E(x)}{\pi^2} + \frac{1 - E(x)}{(1 - \pi)^2} = \frac{\pi}{\pi^2} + \frac{1 - \pi}{(1 - \pi)^2} = \frac{1}{\pi} + \frac{1}{1 - \pi} = \frac{1}{\pi(1 - \pi)}$$
Thus if t is any unbiased estimator of π, its
variance will be greater than or equal to this lower
bound, i.e., $var(t) \geq \frac{\pi(1 - \pi)}{n}$.
Therefore, the CRLB for any unbiased estimator of
π based on a random sample of size n is $\frac{\pi(1 - \pi)}{n}$.
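A quick check that the sample proportion attains this bound (a sketch; π and n are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
pi_true, n, reps = 0.3, 50, 200_000

x = rng.binomial(1, pi_true, size=(reps, n))
p_hat = x.mean(axis=1)                    # sample proportion, unbiased for pi

print("var(p_hat):     ", p_hat.var())
print("CRLB pi(1-pi)/n:", pi_true * (1 - pi_true) / n)   # essentially equal
```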

Example (6.11)
Let $X \sim Poi(\theta)$, i.e., $f(x; \theta) = \frac{\theta^x e^{-\theta}}{x!}$,
$x = 0, 1, 2, \dots$; then it follows that:
$$\ln f(x, \theta) = x \ln \theta - \theta - \ln(x!)$$
$$\Rightarrow \frac{\partial \ln f(x, \theta)}{\partial \theta} = \frac{x}{\theta} - 1 \;\Rightarrow\; \frac{\partial^2 \ln f(x, \theta)}{\partial \theta^2} = -\frac{x}{\theta^2}$$
$$\Rightarrow -E\left[\frac{\partial^2 \ln f(x, \theta)}{\partial \theta^2}\right] = \frac{E(x)}{\theta^2} = \frac{\theta}{\theta^2} = \frac{1}{\theta}$$
Thus if t is any unbiased estimator of θ, its
variance will be at least this bound, i.e.,
$$var(t) \geq \left[-nE\left(\frac{\partial^2 \ln f(x, \theta)}{\partial \theta^2}\right)\right]^{-1} = \frac{\theta}{n}.$$
For the Poisson distribution $E(\bar{x}) = \theta$ and
$var(\bar{x}) = \frac{\theta}{n}$, so $\bar{x}$ attains the CRLB.

Example (6.12)
Let $X \sim N(\mu, \sigma^2)$ with a known variance.
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)$$
$$\Rightarrow \ln f(x; \mu, \sigma^2) = -\frac{1}{2}\ln 2\pi - \frac{1}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}(x - \mu)^2$$
$$\Rightarrow \frac{\partial \ln f(x; \mu, \sigma^2)}{\partial \mu} = -\frac{1}{2\sigma^2}(2)(x - \mu)(-1) = \frac{1}{\sigma^2}(x - \mu)$$
$$\Rightarrow \frac{\partial^2 \ln f(x; \mu, \sigma^2)}{\partial \mu^2} = -\frac{1}{\sigma^2} \;\Rightarrow\; -E\left[\frac{\partial^2 \ln f(x; \mu, \sigma^2)}{\partial \mu^2}\right] = \frac{1}{\sigma^2}$$
Thus if t is any unbiased estimator of µ, then
its variance will be at least this bound, i.e.,
$$var(t) \geq \left[-nE\left(\frac{\partial^2 \ln f(x; \mu, \sigma^2)}{\partial \mu^2}\right)\right]^{-1} = \frac{\sigma^2}{n}.$$

Recall: $\bar{X}$ is a linear unbiased estimator of µ, and
$var(\bar{X}) = \sigma^2/n$; therefore, $\bar{X}$ is the best among all
unbiased estimators, since it attains the CRLB.
Properties of the MLE

i). Invariance: If $\hat{\theta}$ is the MLE of θ, and $g(\cdot)$ is any
continuous function of θ, then the MLE of $g(\theta)$ is
$g(\hat{\theta})$.
ii). Asymptotic properties of the MLE: under certain
regularity conditions:
a). The MLE is consistent, i.e., $\operatorname{plim} \hat{\theta}_{MLE} = \theta$.
b). The MLE is asymptotically normal, i.e., $\hat{\theta}_{MLE} \overset{A}{\sim} N(\theta, CRLB)$.
c). The MLE is asymptotically efficient: $\operatorname{asy\,var}(\hat{\theta}_{MLE}) = CRLB$;
therefore, $\operatorname{Asy.Eff}(\hat{\theta}_{MLE}) = 1$.


6.4 Interval Estimation
6.4.1 Confidence Interval for the Mean

Suppose $x \sim N(\mu, \sigma^2)$; then we know that
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
Case 1: σ² is known. For a standard normal
distribution, it is known that
$$\Pr\left\{-z_{\alpha/2} < z < z_{\alpha/2}\right\} = 1 - \alpha.$$
Therefore,
$$\Pr\left\{-z_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right\} = 1 - \alpha$$
where $z_{\alpha/2}$ is the upper $100\alpha/2\%$ critical value for
$N(0, 1)$.
The interval $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ is a
$100(1 - \alpha)\%$ confidence interval for µ.
That is to say, the probability that the random
interval $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ contains the
true parameter µ is $100(1 - \alpha)\%$.

Example (6.13)
Let $x \sim N(\mu, \sigma^2)$, σ = 10, n = 25 and $\bar{X} = 140$.
If we let α = 0.05 then $z_{\alpha/2} = 1.96$, and the
95% confidence interval for µ is given by
$$\left(140 - 1.96\frac{10}{\sqrt{25}},\; 140 + 1.96\frac{10}{\sqrt{25}}\right) = (140 - 1.96 \times 2,\; 140 + 1.96 \times 2) = (136.08, 143.92)$$
Therefore, we have the CI as (136.08, 143.92).
On the other hand, the 90% CI for µ is obtained as
follows: α = 0.1, then $z_{\alpha/2} = 1.645$; thus the 90% CI
is obtained as $140 \pm 1.645\frac{10}{\sqrt{25}} = 140 \pm 1.645 \times 2 = 140 \pm 3.29$.
Therefore, we have the 90% confidence interval for
the mean, µ, as (136.71, 143.29).

Note that the width of the interval
is reduced as the confidence level goes
down.
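A sketch of the known-σ interval in code, with an empirical coverage check (SciPy assumed; the numbers reproduce Example 6.13):

```python
import numpy as np
from scipy.stats import norm

sigma, n, xbar, alpha = 10.0, 25, 140.0, 0.05
z = norm.ppf(1 - alpha / 2)                     # 1.96 for alpha = 0.05
half = z * sigma / np.sqrt(n)
print("95% CI:", (xbar - half, xbar + half))    # (136.08, 143.92)

# Coverage check: the interval should contain mu in ~95% of repeated samples.
rng = np.random.default_rng(8)
mu_true, reps = 140.0, 100_000
xbars = rng.normal(mu_true, sigma / np.sqrt(n), size=reps)
print("empirical coverage:", (np.abs(xbars - mu_true) < half).mean())
```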

Case 2: σ² is unknown. We substitute $S^2$ in place
of σ² and obtain $\frac{\bar{x} - \mu}{S/\sqrt{n}}$.

But this has a t distribution with n − 1 degrees of
freedom, i.e., $\frac{\bar{x} - \mu}{S/\sqrt{n}} \sim t_{n-1}$.
Thus, we use the t distribution in calculating the
CIs. Taking the same example as before and
changing σ to S, we have
$x \sim N(\mu, \sigma^2)$, s = 10, n = 25 and $\bar{X} = 140$.

Then the 95% confidence interval for µ is given by
$\bar{X} \pm t_{n-1}(\alpha/2)\frac{S}{\sqrt{n}}$.

Note that $t_{25-1}(0.025) = 2.064$; therefore
$$140 \pm 2.064\frac{10}{\sqrt{25}} = 140 \pm 2.064 \times 2 = 140 \pm 4.128$$
Therefore we have the 95% confidence interval for
the mean, µ, as (135.872, 144.128).
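The t-based interval in code (a SciPy sketch with the same illustrative numbers):

```python
import numpy as np
from scipy.stats import t

s, n, xbar, alpha = 10.0, 25, 140.0, 0.05
t_crit = t.ppf(1 - alpha / 2, df=n - 1)        # 2.064 for df = 24
half = t_crit * s / np.sqrt(n)
print("95% t-based CI:", (xbar - half, xbar + half))   # (135.872, 144.128)
```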

6.4.2 CI for the Difference of Two Means

Suppose $X \sim N(\mu_x, \sigma^2)$ and $Y \sim N(\mu_y, \sigma^2)$.
We want to set the CI for $\mu_x - \mu_y$.
Suppose we have two random samples from these
populations, with $x_1, x_2, \dots, x_n$ (i.e., sample size n)
from the population of X and $y_1, y_2, \dots, y_m$ (i.e.,
sample size m) from the population of Y.
We know that
i). $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\bar{Y} = \frac{1}{m}\sum_{i=1}^{m} Y_i$
ii). $S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ and $S_y^2 = \frac{1}{m-1}\sum_{i=1}^{m}(y_i - \bar{y})^2$
iii). $S^2 = \frac{1}{n+m-2}\left[\sum_{i=1}^{n}(x_i - \bar{x})^2 + \sum_{i=1}^{m}(y_i - \bar{y})^2\right]$.

Thus, it follows that
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{S\sqrt{\frac{1}{n} + \frac{1}{m}}} \sim t_{n+m-2}$$
$$\Rightarrow \Pr\left\{-t_{n+m-2}(\alpha/2) < t < t_{n+m-2}(\alpha/2)\right\} = 1 - \alpha$$
Note that:
$$s.e.(\bar{x} - \bar{y}) = \sigma\sqrt{\frac{1}{n} + \frac{1}{m}} \;\text{ and }\; S\sqrt{\frac{1}{n} + \frac{1}{m}} = \widehat{s.e.}(\bar{x} - \bar{y})$$
Therefore a $100(1 - \alpha)\%$ confidence interval for
the difference of the two means is given by
$$(\bar{x} - \bar{y}) \pm t_{n+m-2}(\alpha/2)\,\widehat{s.e.}(\bar{x} - \bar{y})$$
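A sketch of the pooled two-sample interval in code (SciPy assumed; the two samples and their sizes are illustrative, with a common variance as the formula requires):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(9)
x = rng.normal(10.0, 3.0, size=30)     # sample from X
y = rng.normal(8.0, 3.0, size=40)      # sample from Y (common variance)
n, m, alpha = x.size, y.size, 0.05

s2_pooled = (np.sum((x - x.mean())**2) + np.sum((y - y.mean())**2)) / (n + m - 2)
se = np.sqrt(s2_pooled) * np.sqrt(1/n + 1/m)
t_crit = t.ppf(1 - alpha / 2, df=n + m - 2)

diff = x.mean() - y.mean()
print("95% CI for mu_x - mu_y:", (diff - t_crit * se, diff + t_crit * se))
```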

6.4.3 Confidence Interval for the Variance

We know that if $x \sim N(\mu, \sigma^2)$, then
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}.$$
Since the χ² distribution is not symmetric, we have
to obtain both the upper and lower tail areas.
Therefore,
$$\Pr\left\{\chi^2_{n-1}(1 - \alpha/2) < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{n-1}(\alpha/2)\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{\frac{\chi^2_{n-1}(1 - \alpha/2)}{(n-1)S^2} < \frac{1}{\sigma^2} < \frac{\chi^2_{n-1}(\alpha/2)}{(n-1)S^2}\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{\frac{(n-1)S^2}{\chi^2_{n-1}(\alpha/2)} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{n-1}(1 - \alpha/2)}\right\} = 1 - \alpha$$
Therefore, a $100(1 - \alpha)\%$ confidence interval for
the variance is given by
$$\left(\frac{(n-1)S^2}{\chi^2_{n-1}(\alpha/2)},\; \frac{(n-1)S^2}{\chi^2_{n-1}(1 - \alpha/2)}\right)$$
Example (6.14)
Let n = 25, s² = 100; find the 95% confidence
interval for σ². Then $\chi^2_{24}(0.025) = 39.36$ and
$\chi^2_{24}(0.975) = 12.4$. Hence the 95% confidence
interval for σ² is given by:
$$\left(\frac{(n-1)S^2}{\chi^2_{n-1}(\alpha/2)},\; \frac{(n-1)S^2}{\chi^2_{n-1}(1 - \alpha/2)}\right) = \left(\frac{24 \times 100}{39.36},\; \frac{24 \times 100}{12.4}\right) = \left(\frac{2400}{39.36},\; \frac{2400}{12.4}\right) = (60.98, 193.55)$$
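The same computation with chi-square quantiles from SciPy (a sketch reproducing Example 6.14; note that $\chi^2_{n-1}(\alpha/2)$ here denotes the upper-tail critical value, i.e., ppf(1 − α/2)):

```python
from scipy.stats import chi2

n, s2, alpha = 25, 100.0, 0.05
upper_crit = chi2.ppf(1 - alpha / 2, df=n - 1)   # 39.36
lower_crit = chi2.ppf(alpha / 2, df=n - 1)       # 12.40
ci = ((n - 1) * s2 / upper_crit, (n - 1) * s2 / lower_crit)
print("95% CI for sigma^2:", ci)                 # approx (60.98, 193.55)
```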
6.4.4 Confidence Interval for the Variance Ratio

Let $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$ with $\sigma_x^2 \neq \sigma_y^2$;
the two samples are assumed to be independent.
We estimate $\sigma_x^2$ and $\sigma_y^2$ by $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
and $s_y^2 = \frac{1}{m-1}\sum_{i=1}^{m}(y_i - \bar{y})^2$, respectively.
We then estimate the variance ratio $\frac{\sigma_x^2}{\sigma_y^2}$ by $\frac{s_x^2}{s_y^2}$.
Then $\frac{(n-1)s_x^2}{\sigma_x^2} \sim \chi^2_{n-1}$ and $\frac{(m-1)s_y^2}{\sigma_y^2} \sim \chi^2_{m-1}$, and both
are independently distributed as χ² distributions.

Therefore,
$$\frac{\frac{(n-1)s_x^2}{\sigma_x^2}\Big/(n-1)}{\frac{(m-1)s_y^2}{\sigma_y^2}\Big/(m-1)} = \left(\frac{s_x^2}{\sigma_x^2}\right)\Big/\left(\frac{s_y^2}{\sigma_y^2}\right) \sim F_{n-1,\,m-1}$$
It then follows that
$$F = \frac{s_x^2}{\sigma_x^2}\Big/\frac{s_y^2}{\sigma_y^2} = \frac{s_x^2}{s_y^2}\cdot\frac{\sigma_y^2}{\sigma_x^2} \sim F_{n-1,\,m-1}.$$
This distribution is also non-symmetric, like the χ²
distribution.
Note that:
$$F_{n-1,\,m-1}(1 - \alpha/2) = \frac{1}{F_{m-1,\,n-1}(\alpha/2)}$$
Therefore, using the earlier logic one can write
the following:
$$\Pr\left\{F_{n-1,\,m-1}(1 - \alpha/2) < F < F_{n-1,\,m-1}(\alpha/2)\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{F_{n-1,\,m-1}(1 - \alpha/2) < \frac{s_x^2}{s_y^2}\cdot\frac{\sigma_y^2}{\sigma_x^2} < F_{n-1,\,m-1}(\alpha/2)\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{\frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(1 - \alpha/2) < \frac{\sigma_y^2}{\sigma_x^2} < \frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(\alpha/2)\right\} = 1 - \alpha$$
Therefore,
$$\left(\frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(1 - \alpha/2),\; \frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(\alpha/2)\right)$$
is a $100(1 - \alpha)\%$ CI for $\frac{\sigma_y^2}{\sigma_x^2}$.
Example (6.15)
Let n = 13, m = 16, $s_x^2 = 1.2$, $s_y^2 = 1.5$,
$\alpha = 0.02 \Rightarrow \alpha/2 = 0.01$.
Find the 98% CI for the variance ratio $\sigma_y^2/\sigma_x^2$.

The 98% CI for the variance ratio $\sigma_y^2/\sigma_x^2$ is obtained from:
$$\Pr\left\{\frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(1 - \alpha/2) < \frac{\sigma_y^2}{\sigma_x^2} < \frac{s_y^2}{s_x^2}\,F_{n-1,\,m-1}(\alpha/2)\right\} = 1 - \alpha$$
$$\Rightarrow \Pr\left\{\frac{1.5}{1.2}\,F_{12,15}(0.99) < \frac{\sigma_y^2}{\sigma_x^2} < \frac{1.5}{1.2}\,F_{12,15}(0.01)\right\} = 0.98$$
$$\Rightarrow \Pr\left\{\frac{1.5}{1.2}\cdot\frac{1}{F_{15,12}(0.01)} < \frac{\sigma_y^2}{\sigma_x^2} < \frac{1.5}{1.2}\,F_{12,15}(0.01)\right\} = 0.98$$
$$\Rightarrow \Pr\left\{\frac{1.5}{1.2}\cdot\frac{1}{4.04} < \frac{\sigma_y^2}{\sigma_x^2} < \frac{1.5}{1.2} \times 3.67\right\} = 0.98$$
$$\Rightarrow \Pr\left\{0.309 < \frac{\sigma_y^2}{\sigma_x^2} < 4.588\right\} = 0.98$$
Therefore, a 98% CI for the variance ratio $\sigma_y^2/\sigma_x^2$ is
(0.309, 4.588).
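The same interval computed with F quantiles from SciPy (a sketch reproducing Example 6.15; ppf replaces the table lookups, so the endpoints differ slightly from the rounded table values):

```python
from scipy.stats import f

n, m, sx2, sy2, alpha = 13, 16, 1.2, 1.5, 0.02
f_upper = f.ppf(1 - alpha / 2, dfn=n - 1, dfd=m - 1)   # F_{12,15}(0.01) ~ 3.67
f_lower = f.ppf(alpha / 2, dfn=n - 1, dfd=m - 1)       # = 1 / F_{15,12}(0.01)
ci = ((sy2 / sx2) * f_lower, (sy2 / sx2) * f_upper)
print("98% CI for sigma_y^2/sigma_x^2:", ci)           # approx (0.309, 4.588)
```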

