
Mean Square Estimation (Papoulis)

Ajit K Chaturvedi



Mean Square Estimation
The Mean Square (MS) estimation of a random variable y by
a constant c can be formulated as:
Find c such that the second moment of the error y − c is
minimum:
e = E[(y - c)^2] = \int_{-\infty}^{\infty} (y - c)^2 f(y)\, dy

Clearly, e depends on c and is minimum if


\frac{de}{dc} = -2 \int_{-\infty}^{\infty} (y - c) f(y)\, dy = 0

This leads to
c \int_{-\infty}^{\infty} f(y)\, dy = c = \int_{-\infty}^{\infty} y f(y)\, dy = E\{y\}
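As a quick numerical check of this result (a Python sketch, not part of the original slides; the distribution chosen below is arbitrary), the constant that minimizes the empirical mean square error is the sample mean:

import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=2.0, size=100_000)   # any distribution with finite variance works

# Evaluate the empirical MS error e(c) = mean((y - c)^2) on a grid of constants c
cs = np.linspace(0.0, 5.0, 501)
mse = np.array([np.mean((y - c) ** 2) for c in cs])

print(cs[mse.argmin()], y.mean())   # the minimizing constant is (approximately) E{y}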

Nonlinear MS Estimation
Now we wish to estimate y not by a constant but by a
function c(x) of the random variable x.
We need to find the function c(x) that will minimize the MS
error e:
e = E\{[y - c(x)]^2\} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [y - c(x)]^2 f(x, y)\, dx\, dy

Since f(x, y) = f(y | x) f(x), we can write

e = \int_{-\infty}^{\infty} f(x) \left[ \int_{-\infty}^{\infty} [y - c(x)]^2 f(y | x)\, dy \right] dx

These integrands are positive. Hence e is minimum if the inner integral is minimum for every x.
Recall e = E[(y - c)^2] = \int_{-\infty}^{\infty} (y - c)^2 f(y)\, dy. The inner integral above has exactly this form, with c changed to c(x) and f(y) changed to f(y | x).
Nonlinear MS Estimation

Hence e is minimum if
c(x) = E\{y \mid x\} = \int_{-\infty}^{\infty} y f(y \mid x)\, dy

If y = g(x), then E{y | x} = g(x); hence c(x) = g(x) and the resulting MS error is 0.
This is not surprising because, if x is observed and y = g (x),
then y is determined uniquely.
If the random variables x and y are independent, then
E {y | x} = E {y} = constant.
In this case, knowledge of x has no effect on the estimate of y.
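As an illustration (a Python sketch with an assumed model y = x^2 + noise, not taken from the slides), the conditional mean E{y | x}, approximated here by averaging y over narrow bins of x, attains a much smaller MS error than the best constant estimate:

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200_000)
y = x**2 + 0.1 * rng.standard_normal(x.size)   # assumed model, so E{y | x} = x^2

# Approximate c(x) = E{y | x} by the average of y within narrow bins of x
edges = np.linspace(-1.0, 1.0, 51)
idx = np.digitize(x, edges)                    # bin index 1..50 for every sample
cond_mean = np.array([y[idx == k].mean() for k in range(1, edges.size)])
c_of_x = cond_mean[idx - 1]

print(np.mean((y - c_of_x) ** 2))    # close to the noise variance 0.01
print(np.mean((y - y.mean()) ** 2))  # MS error of the best constant estimate, much larger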

Linear MS Estimation

Compared to the nonlinear MS problem, an easier problem, using only second-order moments, is the linear MS estimation of y in terms of x.
The resulting estimate may not be as good as the nonlinear
estimate; however, it is useful because of the simplicity of the
solution.

Linear MS Estimation Contd.

The linear estimation problem is the estimation of the random variable y in terms of a linear function Ax + B of x.
The problem now is to find the constants A and B so as to
minimize the MS error

e = E\{[y - (Ax + B)]^2\}


For a given A, e is the MS error of the estimation of y − Ax by the constant B.
Hence, as in the earlier result c = \int_{-\infty}^{\infty} y f(y)\, dy = E\{y\}, e is minimum if

B = E\{y - Ax\} = \eta_y - A\eta_x

Linear MS Estimation
With B so determined, we get

e = E\{[(y - \eta_y) - A(x - \eta_x)]^2\} = \sigma_y^2 - 2 A r \sigma_x \sigma_y + A^2 \sigma_x^2




where

r = \frac{E\{(x - \eta_x)(y - \eta_y)\}}{\sigma_x \sigma_y}
e is minimum if

A = \frac{r \sigma_y}{\sigma_x} = \frac{\mu_{11}}{\mu_{20}}

where

\mu_{kr} = E\{(x - \eta_x)^k (y - \eta_y)^r\}
Inserting this A into the preceding quadratic, we obtain

e_m = \sigma_y^2 (1 - r^2) = \mu_{02} - \frac{\mu_{11}^2}{\mu_{20}}
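The coefficients and the minimum error can be checked numerically (a Python sketch; the bivariate model below is an assumption chosen only for illustration):

import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(300_000)
y = 2.0 * x + 1.0 + rng.standard_normal(x.size)   # assumed linear-plus-noise model

eta_x, eta_y = x.mean(), y.mean()
s_x, s_y = x.std(), y.std()
r = np.mean((x - eta_x) * (y - eta_y)) / (s_x * s_y)

A = r * s_y / s_x                 # equivalently mu_11 / mu_20
B = eta_y - A * eta_x
e_m = s_y**2 * (1 - r**2)

print(A, B)                                  # close to 2.0 and 1.0 for this model
print(e_m, np.mean((y - (A*x + B))**2))      # both close to the noise variance 1.0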

Terminology

In the above, the sum Ax + B is the non-homogeneous linear estimate of y in terms of x. If y is estimated by a straight line ax passing through the origin, the estimate is called homogeneous.
The random variable x is the data of the estimation, the random variable ϵ = y − (Ax + B) is the error of the estimation, and the number e = E[ϵ^2] is the MS error.

Fundamental Note
In general, the nonlinear estimate φ(x) = E{y | x} of y in terms of x is not a straight line, and the resulting MS error E{[y − φ(x)]^2} is smaller than the MS error e_m of the linear estimate Ax + B.
However, if the random variables x and y are jointly normal, φ(x) is a straight line with slope r σ_y/σ_x and passing through the point (η_x, η_y). It is expressed as:

\varphi(x) = \frac{r \sigma_y}{\sigma_x} x + \eta_y - \frac{r \sigma_y}{\sigma_x} \eta_x
To verify the above expression, write down the joint pdf f(x, y) for jointly normal random variables x and y. From this, obtain the conditional pdf f(y | x) and note that it is a normal density. Hence E{y | x} is just the value of y at which the quadratic in the exponent is zero, i.e., the mean of that conditional density.
To conclude, for jointly normal random variables, the nonlinear and linear MS estimates are identical.
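This can also be seen numerically (a Python sketch; the mean vector and covariance matrix below are arbitrary illustrative choices): for jointly normal samples, the empirical conditional mean of y near a given x falls on the line with slope r σ_y/σ_x through (η_x, η_y).

import numpy as np

rng = np.random.default_rng(3)
mean = [1.0, -2.0]                      # eta_x, eta_y (illustrative)
cov = [[4.0, 3.0], [3.0, 9.0]]          # sigma_x^2 = 4, sigma_y^2 = 9, r = 0.5
x, y = rng.multivariate_normal(mean, cov, size=500_000).T

s_x, s_y = 2.0, 3.0
r = cov[0][1] / (s_x * s_y)
line = lambda t: r * s_y / s_x * (t - mean[0]) + mean[1]   # linear MMSE estimate

# Compare the conditional mean E{y | x near x0}, estimated from samples, with the line
for x0 in (-1.0, 1.0, 3.0):
    near = np.abs(x - x0) < 0.05
    print(x0, y[near].mean(), line(x0))   # the two agree for jointly normal x, y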
The Orthogonality Principle
Recall
e = E\{[y - (Ax + B)]^2\}


The MS error e is a function of A and B and it is minimum if \partial e/\partial A = 0 and \partial e/\partial B = 0. The first equation yields

\frac{\partial e}{\partial A} = E\{2[y - (Ax + B)](-x)\} = 0
leading to
E {[y − (Ax + B)]x} = 0
The interchange of expectation and differentiation here is equivalent to an interchange of integration and differentiation.
The result states that the optimum linear MS estimate
Ax + B of y is such that the estimation error y − (Ax + B) is
orthogonal to the data x.
This is known as the Orthogonality Principle. It is
fundamental in MS estimation and is used extensively.
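A numerical check of the principle (a Python sketch; the model relating x and y is an arbitrary, deliberately nonlinear assumption):

import numpy as np

rng = np.random.default_rng(4)
x = rng.gamma(2.0, 1.0, 200_000)
y = np.sin(x) + 0.2 * rng.standard_normal(x.size)   # assumed model, nonlinear on purpose

A = np.cov(x, y, ddof=0)[0, 1] / np.var(x)   # mu_11 / mu_20
B = y.mean() - A * x.mean()
err = y - (A * x + B)

print(np.mean(err * x))   # ~0: the error is orthogonal to the data x
print(np.mean(err))       # ~0: the error has zero mean (from the condition on B)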
Proof

We will show the Orthogonality Principle for the homogeneous case also.
We wish to find a constant a such that, if y is estimated by
ax, the resulting MS error

e = E\{(y - ax)^2\}


is minimum.
We get the desired result by differentiating with respect to a and equating the derivative to 0. Thus,
E {(y − ax)x} = 0
Therefore,
a = \frac{E\{xy\}}{E\{x^2\}}
There is an interesting alternate proof.
Alternate Proof

Assuming the Orthogonality Principle holds, we shall show that the resulting e is minimum.
Let ā be an arbitrary constant; then

E\{(y - \bar{a}x)^2\} = E\{[(y - ax) + (a - \bar{a})x]^2\}
= E\{(y - ax)^2\} + (a - \bar{a})^2 E\{x^2\} + 2(a - \bar{a}) E\{(y - ax)x\}


Here, the last term is 0 by the orthogonality assumption, and the second term is non-negative.
From this it immediately follows that

E\{(y - \bar{a}x)^2\} \ge E\{(y - ax)^2\}

for any ā; hence e is minimum.

