Basic Stats Estimation
2. The estimation error Θ̃n := Θ̂n − θ. The bias of an estimator, b(Θ̂n), is the expectation of the estimation error:
b(Θ̂n) := E[Θ̂n] − θ.  (1)
3. The mean value, the variance and the bias of Θ̂n depend on θ, while the estimation error depends in
addition on X1 , . . . , Xn .
6. Θ̂n is consistent if the sequence Θ̂n converges to the true value of the parameter θ in probability for
all values of θ.
7. The mean squared error (MSE) E[Θ̃n²] = E[(Θ̂n − θ)²] is related to the bias and the variance of Θ̂n as follows:
E[Θ̃n²] = b(Θ̂n)² + var(Θ̂n).  (2)
This decomposition represents the trade-off between the bias and the variance when minimizing the mean squared error, as the numerical sketch below illustrates.
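As a quick numerical illustration, the following Python sketch (assuming NumPy; the Uniform(0, θ) model and the sample maximum as estimator are arbitrary choices, not taken from these notes) verifies the decomposition by Monte Carlo:

import numpy as np

# Monte Carlo check of MSE = bias^2 + variance for a deliberately biased estimator:
# estimate theta of Uniform(0, theta) by the sample maximum (arbitrary illustration).
rng = np.random.default_rng(0)
theta, n, trials = 2.0, 10, 200_000

estimates = rng.uniform(0.0, theta, size=(trials, n)).max(axis=1)

bias = estimates.mean() - theta
variance = estimates.var()
mse = ((estimates - theta) ** 2).mean()
print(mse, bias**2 + variance)   # MSE equals bias^2 + variance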
In general, θ can be a scalar or a vector of unknowns. The unknowns are treated as constants and not as random variables, unlike in Bayesian statistics.
Maximum likelihood estimation: Given observations x1, . . . , xn, the ML estimate is
θ̂n = argmaxθ pX(x1, . . . , xn; θ)  (3)
if X is discrete. If X is continuous, then
θ̂n = argmaxθ fX(x1, . . . , xn; θ).  (4)
We refer to pX(x; θ) or fX(x; θ) as the likelihood function. If the Xi are independent, then for discrete Xi
log pX(x1, . . . , xn; θ) = Σ_{i=1}^{n} log pXi(xi; θ),
and for continuous Xi it is
log fX(x1, . . . , xn; θ) = Σ_{i=1}^{n} log fXi(xi; θ).
Example 0.1 (Example 9.2, [1]). Consider the problem of estimating the probability of heads θ of a biased coin, based on n independent tosses X1, . . . , Xn, where Xi = 1 for heads and Xi = 0 for tails. Let k be the number of heads. Then the PMF is pX(x; θ) = C(n, k) θ^k (1 − θ)^{n−k}, where C(n, k) denotes the binomial coefficient. Setting
d/dθ [C(n, k) θ^k (1 − θ)^{n−k}] = C(n, k) (k θ^{k−1} (1 − θ)^{n−k} − (n − k) θ^k (1 − θ)^{n−k−1}) = 0,
we obtain
k(1 − θ) − (n − k)θ = 0 ⇒ θ̂ = k/n.
Therefore, the MLE is
Θ̂n = (X1 + · · · + Xn)/n.
This estimator is unbiased and consistent as Θ̂n → θ in probability by the weak law of large numbers.
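A small simulation (a Python sketch assuming NumPy; θ = 0.3 and the sample sizes are arbitrary illustration values) shows both properties: the average of Θ̂n over many experiments is close to θ, and Θ̂n concentrates around θ as n grows.

import numpy as np

rng = np.random.default_rng(0)
theta = 0.3   # true probability of heads (arbitrary choice for illustration)

for n in (10, 100, 1000):
    # Each row is one experiment of n tosses; the MLE is the fraction of heads.
    tosses = rng.binomial(1, theta, size=(10_000, n))
    mle = tosses.mean(axis=1)
    print(n, mle.mean(), mle.std())   # mean ≈ theta (unbiased), std shrinks (consistent)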
Example 0.2. Let X1, . . . , Xn be iid random variables with exponential(θ) distribution. Therefore, fX(x1, . . . , xn; θ) = Π_{i=1}^{n} θ e^{−θxi} = θ^n Π_{i=1}^{n} e^{−θxi}, and
argmaxθ log fX(x1, . . . , xn; θ) = argmaxθ (n log θ − θ Σ_{i=1}^{n} xi).
Setting the derivative with respect to θ to zero gives n/θ − Σ_{i=1}^{n} xi = 0. Therefore, Θ̂n = n/(X1 + · · · + Xn). We want to study the statistical properties of Θ̂n and want Θ̂n to be close to the true value of θ with high probability.
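A minimal sketch (assuming NumPy; the true rate θ = 1.5 and the sample size are arbitrary) of computing Θ̂n = n/(X1 + · · · + Xn) from simulated exponential data:

import numpy as np

rng = np.random.default_rng(0)
theta = 1.5                                         # true rate parameter (arbitrary)
x = rng.exponential(scale=1.0 / theta, size=1000)   # Exponential(theta) samples

theta_hat = len(x) / x.sum()    # MLE: n / (x_1 + ... + x_n)
print(theta_hat)                # close to theta for large n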
Consider N observations
x(n) = A + w(n), n = 0, 1, . . . , N − 1,
where A is the unknown DC level and w(n) is white Gaussian noise with known variance σ². Then the joint PDF is
p(x; A) = (1/(2πσ²)^{N/2}) e^{−(1/(2σ²)) Σ_{n=0}^{N−1} (x(n) − A)²}.
Taking the derivative of the log likelihood function and setting it to zero,
∂ log p(x; A)/∂A = (1/σ²) Σ_{n=0}^{N−1} (x(n) − A) = 0 ⇒ Â = (1/N) Σ_{n=0}^{N−1} x(n),
i.e., the ML estimate of the DC level is the sample mean of the observations.
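The following sketch (assuming NumPy; A = 1.0, σ = 0.5 and N = 500 are arbitrary illustration values) generates x(n) = A + w(n) and computes  as the sample mean:

import numpy as np

rng = np.random.default_rng(0)
A, sigma, N = 1.0, 0.5, 500              # arbitrary illustration values
x = A + sigma * rng.standard_normal(N)   # x(n) = A + w(n), w(n) ~ N(0, sigma^2)

A_hat = x.mean()   # ML estimate: sample mean
print(A_hat)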
Properties of MLE:
• The MLE of a one-to-one function h(θ) of θ is h(θ̂), where θ̂ is the MLE of θ. This is called the invariance principle.
• When the Xi are iid, under some mild extra assumptions, each component of the MLE is consistent and asymptotically normal.
Estimation of the mean and the variance of a random variable: We consider the sample mean and the sample variance, for which no knowledge of the distributions pX(x; θ) or fX(x; θ) is required. The sample mean of observations X1, . . . , Xn (iid, with unknown mean θ) is given as
Mn = (X1 + · · · + Xn)/n.
This is unbiased since E[Mn] = E[X] = θ. The sample mean is consistent since it converges to θ in probability by the weak law of large numbers.
Its mean squared error is
E[(Mn − θ)²] = var(Mn) = n var(X)/n² = var(X)/n,
where var(X) is the common variance of the Xi. The mean squared error does not depend on θ. The sample mean is not necessarily the estimator with the smallest variance; for example, the zero estimator θ̂n = 0 has zero variance. However, the bias of the zero estimator is bθ(θ̂n) = −θ, which implies that its MSE is θ².
Example 0.4 (Example 9.5, [1]). Suppose X1 , . . . , Xn are normal iid with unknown mean θ and unknown
variance σ². Consider the estimator θ̂n = (X1 + · · · + Xn)/(n + 1). This estimator is biased because E[θ̂n] = nθ/(n + 1) and b(θ̂n) = −θ/(n + 1). However, limn→∞ b(θ̂n) = 0, so θ̂n is asymptotically unbiased. The variance is var(θ̂n) = nσ²/(n + 1)², which is smaller than the variance σ²/n of the sample mean. Notice that var(θ̂n) is independent of θ. The mean squared error is
E[θ̃n²] = b(θ̂n)² + var(θ̂n) = θ²/(n + 1)² + nσ²/(n + 1)².
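A quick Monte Carlo comparison (a sketch assuming NumPy; θ = 1, σ = 2 and n = 5 are arbitrary) of the MSE of this biased estimator with that of the sample mean; for these values the formulas above give σ²/n = 0.8 and (θ² + nσ²)/(n + 1)² ≈ 0.58, so the bias can pay off:

import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n, trials = 1.0, 2.0, 5, 200_000   # arbitrary illustration values
X = rng.normal(theta, sigma, size=(trials, n))

mean_est = X.mean(axis=1)              # sample mean M_n
shrunk_est = X.sum(axis=1) / (n + 1)   # biased estimator (X_1 + ... + X_n)/(n + 1)

print(((mean_est - theta) ** 2).mean())     # ≈ sigma^2 / n = 0.8
print(((shrunk_est - theta) ** 2).mean())   # ≈ (theta^2 + n*sigma^2)/(n+1)^2 ≈ 0.58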
The sample variance of X1, . . . , Xn (iid with mean θ and variance σ²) is
S̄n² = (1/n) Σ_{i=1}^{n} (Xi − Mn)².
Note that, from the definitions, E[Mn] = θ, E[Xi²] = θ² + σ² and E[Mn²] = θ² + σ²/n. (Indeed,
E[Mn²] = E[((1/n) Σ_{i=1}^{n} Xi)²] = (1/n²) E[Σ_{i=1}^{n} Xi² + 2 Σ_{1≤i<j≤n} Xi Xj] = (1/n²)(n(θ² + σ²) + n(n − 1)θ²) = θ² + σ²/n.)
Furthermore,
E[S̄n²] = E[(1/n)(Σ_{i=1}^{n} Xi² − 2 Mn Σ_{i=1}^{n} Xi + n Mn²)]
= (1/n) Σ_{i=1}^{n} E[Xi²] − 2 E[Mn²] + E[Mn²]   (using Σ_{i=1}^{n} Xi = n Mn)
= (1/n) Σ_{i=1}^{n} E[Xi²] − E[Mn²]
= θ² + σ² − (θ² + σ²/n)
= ((n − 1)/n) σ².  (6)
Therefore, S̄n² is not an unbiased estimator of σ², but it is asymptotically unbiased. It coincides with the ML estimator if the Xi are normal.
Another variance estimator is
Ŝn² = (1/(n − 1)) Σ_{i=1}^{n} (Xi − Mn)² = (n/(n − 1)) S̄n².  (7)
Therefore,
E[Ŝn²] = (n/(n − 1)) · ((n − 1)/n) σ² = σ²,  (8)
i.e., Ŝn² is an unbiased estimator of σ². For large n, the two estimators coincide.
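A short empirical check (a sketch assuming NumPy; σ = 2 and n = 10 are arbitrary) that E[S̄n²] ≈ ((n − 1)/n) σ² while E[Ŝn²] ≈ σ²:

import numpy as np

rng = np.random.default_rng(0)
sigma, n, trials = 2.0, 10, 200_000
X = rng.normal(0.0, sigma, size=(trials, n))

S_bar = X.var(axis=1, ddof=0)   # divides by n     -> E[S_bar] = (n-1)/n * sigma^2
S_hat = X.var(axis=1, ddof=1)   # divides by n - 1 -> E[S_hat] = sigma^2

print(S_bar.mean(), (n - 1) / n * sigma**2)
print(S_hat.mean(), sigma**2)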
Confidence intervals: A confidence interval is an interval which contains θ with high probability. Let α be a small number. Then the confidence level is denoted by 1 − α. The point estimator Θ̂n is replaced by a lower estimator Θ̂n− and an upper estimator Θ̂n+ such that Θ̂n− ≤ Θ̂n+ and
Pθ(Θ̂n− ≤ θ ≤ Θ̂n+) ≥ 1 − α,  (9)
for every possible value of θ.
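As a concrete illustration (a sketch assuming NumPy; this uses the standard normal-based interval Mn ± 1.96 σ/√n for the mean with known variance, a construction not spelled out above), the interval covers θ in roughly a fraction 1 − α of repeated experiments:

import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n, alpha, trials = 1.0, 2.0, 50, 0.05, 100_000   # arbitrary values
z = 1.96   # approximately the 1 - alpha/2 quantile of N(0, 1) for alpha = 0.05

X = rng.normal(theta, sigma, size=(trials, n))
Mn = X.mean(axis=1)
lower = Mn - z * sigma / np.sqrt(n)
upper = Mn + z * sigma / np.sqrt(n)

print(np.mean((lower <= theta) & (theta <= upper)))   # empirical coverage ≈ 0.95 = 1 - alpha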
Consider next the problem of estimating a resistance R0 = v0/i0 from N noisy measurements of the voltage and the current, v(k) = v0 + nv(k) and i(k) = i0 + ni(k), k = 1, . . . , N, where the measurement noises nv(k), ni(k) are zero mean and independent of each other, so that
lim_{N→∞} (1/N) Σ_{k=1}^{N} ni(k) nv(k) = cov(ni, nv) = 0,   lim_{N→∞} (1/N) Σ_{k=1}^{N} ni(k)² = σi².
A possible estimator is
R̂EV := ((1/N) Σ_{k=1}^{N} v(k)) / ((1/N) Σ_{k=1}^{N} i(k)).
Clearly, this is asymptotically unbiased. Furthermore, since v(k), i(k) are independent,
E[((1/N) Σ_{k=1}^{N} v(k)) / ((1/N) Σ_{k=1}^{N} i(k))] = E[(1/N) Σ_{k=1}^{N} v(k)] / E[(1/N) Σ_{k=1}^{N} i(k)] = v0/i0 = R0.
Let vN , iN be the vectors of voltages and currents (ΦN = iN in this case). Notice that the regressor iN is
noisy in this case unlike the previous cases where the regressor matrix Φ was assumed to be noise free.
The LS solution is
R̂(N) = ((1/N) iN^T vN) / ((1/N) iN^T iN) = ((1/N) Σ_{k=1}^{N} (v0 + nv(k))(i0 + ni(k))) / ((1/N) Σ_{k=1}^{N} (i0 + ni(k))²).
As N → ∞, the numerator evaluates to
lim_{N→∞} (1/N) Σ_{k=1}^{N} (v0 + nv(k))(i0 + ni(k)) = v0 i0 + lim_{N→∞} (1/N) Σ_{k=1}^{N} nv(k) ni(k) = v0 i0
(the cross terms i0 nv(k) and v0 ni(k) average to zero since the noises are zero mean), and the denominator evaluates to
lim_{N→∞} (1/N) Σ_{k=1}^{N} (i0 + ni(k))² = i0² + lim_{N→∞} (1/N) Σ_{k=1}^{N} ni(k)² = i0² + σi²,
where we have used that the sample mean and the sample variance converge to the true mean and the true variance asymptotically. Thus,
lim_{N→∞} R̂(N) = v0 i0 / (i0² + σi²) = R0 / (1 + σi²/i0²),
i.e., the LS estimator is asymptotically biased when the regressor is noisy.
Another possible estimator is
R̂SA := (1/N) Σ_{k=1}^{N} v(k)/i(k).
Note that this estimator may not converge, as i(k) may take the value 0 for some k due to noise.
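The following Monte Carlo sketch (assuming NumPy; v0 = 10, i0 = 2 and the noise levels are arbitrary) compares the three estimators: R̂EV approaches R0, R̂LS is pulled below R0 by the factor 1 + σi²/i0², and R̂SA is sensitive to small values of i(k):

import numpy as np

rng = np.random.default_rng(0)
v0, i0 = 10.0, 2.0                            # true voltage and current (arbitrary), R0 = 5
sigma_v, sigma_i, N = 1.0, 0.5, 100_000

v = v0 + sigma_v * rng.standard_normal(N)     # v(k) = v0 + n_v(k)
i = i0 + sigma_i * rng.standard_normal(N)     # i(k) = i0 + n_i(k)

R_EV = v.mean() / i.mean()   # ratio of sample means: consistent for R0
R_LS = (i @ v) / (i @ i)     # least squares with a noisy regressor
R_SA = (v / i).mean()        # average of ratios; does not converge to R0 in general

print(R_EV)                                          # ≈ 5
print(R_LS, (v0 / i0) / (1 + sigma_i**2 / i0**2))    # both ≈ 5 / 1.0625 ≈ 4.71
print(R_SA)                                          # unreliable if some i(k) ≈ 0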
Cramér-Rao lower bound: Let p(y; θ) denote the likelihood of the data y. For any unbiased estimator θ̂ of θ, the Cramér-Rao lower bound states that var(θ̂) ≥ I(θ)⁻¹ (in the matrix sense for vector θ), where I(θ) := E[−∇θ² log p(y; θ)] is the Fisher information matrix. Note that
−∇θ log p(y; θ) = −(1/p(y; θ)) ∇θ p(y; θ).  (12)
Therefore,
−∇θ² log p(y; θ) = −(1/p(y; θ)) ∇θ² p(y; θ) + (1/p(y; θ))² ∇θᵀ p(y; θ) ∇θ p(y; θ)
= −(1/p(y; θ)) ∇θ² p(y; θ) + (∇θᵀ log p(y; θ))(∇θ log p(y; θ))
⇒ I(θ) = E[−∇θ² log p(y; θ)] = −Σ_y ∇θ² p(y; θ) + Σ_y (∇θᵀ log p(y; θ))(∇θ log p(y; θ)) p(y; θ)
= −∇θ² Σ_y p(y; θ) + E[(∇θᵀ log p(y; θ))(∇θ log p(y; θ))]
= E[(∇θᵀ log p(y; θ))(∇θ log p(y; θ))],  (13)
where the first term vanishes since Σ_y p(y; θ) = 1.
Example 0.6. Consider X ∼ Bin(n, θ) with pX(x; θ) = C(n, x) θ^x (1 − θ)^{n−x}, where θ is an unknown parameter. Then the scalar Fisher information for the random variable X is computed as follows. Note that
log p(x; θ) = log C(n, x) + x log θ + (n − x) log(1 − θ).
Therefore,
∂ log p(x; θ)/∂θ = x/θ − (n − x)/(1 − θ),   −∂² log p(x; θ)/∂θ² = x/θ² + (n − x)/(1 − θ)²,
and taking expectations with E[X] = nθ gives
I(θ) = nθ/θ² + n(1 − θ)/(1 − θ)² = n/(θ(1 − θ)),
so any unbiased estimator θ̂ of θ satisfies var(θ̂) ≥ θ(1 − θ)/n. Similarly, for the DC level example above, ∂² log p(x; A)/∂A² = −N/σ², so I(A) = N/σ². Therefore, if  is an unbiased estimator of A, then var(Â) ≥ σ²/N.
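A numerical check (a sketch assuming NumPy; θ = 0.3 and n = 50 are arbitrary) that the variance of the MLE X/n from Example 0.1 matches the Cramér-Rao bound θ(1 − θ)/n = 1/I(θ):

import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 0.3, 50, 200_000   # arbitrary illustration values

X = rng.binomial(n, theta, size=trials)
theta_hat = X / n                     # MLE for the binomial parameter

crlb = theta * (1 - theta) / n        # 1 / I(theta) with I(theta) = n / (theta (1 - theta))
print(theta_hat.var(), crlb)          # the MLE attains the bound here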
Unbiased estimators attaining the Cramér-Rao lower bound are called minimum variance unbiased estimators (MVUE). For ease of computation, we look for the best linear unbiased estimator (BLUE), as an MVUE cannot always be found in practice. Linear estimators are linear in the data.
Bayesian estimation
• Bayesian statistics treats unknown parameters as random variables with known prior distributions.
These prior beliefs are then updated to posterior beliefs (using Bayes’ rule) after observations/measured
data.
• The principal Bayesian inference methods are maximum a posteriori probability (MAP) estimation and minimum mean squared error (MMSE) estimation.
Using the prior and the conditional distribution, we calculate pΘ|X or fΘ|X, called the posterior, using Bayes' rule. The four versions of Bayes' rule are as follows.
1. Θ discrete, X discrete
pΘ|X(θ|x) = pΘ(θ) pX|Θ(x|θ) / Σ_{θ′} pΘ(θ′) pX|Θ(x|θ′).  (14)
This is useful in hypothesis testing and classification problems with discrete data.
2. Θ discrete, X continuous
pΘ|X(θ|x) = pΘ(θ) fX|Θ(x|θ) / Σ_{θ′} pΘ(θ′) fX|Θ(x|θ′).  (15)
This is useful in hypothesis testing problems with continuous data, for example, binary signal detection in the presence of Gaussian noise (see the sketch after this list).
3. Θ continuous, X discrete
fΘ|X(θ|x) = fΘ(θ) pX|Θ(x|θ) / ∫ fΘ(θ′) pX|Θ(x|θ′) dθ′.  (16)
This is useful in estimation problems with discrete data, for example, a coin with an unknown parameter θ where the observation is the number of heads in n tosses, or estimating the parameters of a model from discrete measurements. A real-world example is estimating the parameters of the laws of motion from discrete measurements of the time and position of an object (for example, curve fitting).
4. Θ continuous, X continuous
fΘ|X(θ|x) = fΘ(θ) fX|Θ(x|θ) / ∫ fΘ(θ′) fX|Θ(x|θ′) dθ′.  (17)
This is useful in estimation problems with continuous data. For example, estimating parameters of a
model from observed continuous signals (for example, system identification).
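A numerical sketch of case 2 (assuming NumPy; the uniform prior, unit noise variance and the observed value are arbitrary illustration values, and the Gaussian PDF is written out by hand):

import numpy as np

# Case 2: Theta is a binary symbol (0 or 1) with a uniform prior, and
# X = Theta + Gaussian noise is observed. Posterior computed via (15).
p0, p1, sigma, x = 0.5, 0.5, 1.0, 0.8

def gauss_pdf(x, mean, sigma):
    return np.exp(-(x - mean) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

num0 = p0 * gauss_pdf(x, 0.0, sigma)   # p_Theta(0) * f_{X|Theta}(x | 0)
num1 = p1 * gauss_pdf(x, 1.0, sigma)   # p_Theta(1) * f_{X|Theta}(x | 1)
print(num1 / (num0 + num1))            # P(Theta = 1 | X = x), here ≈ 0.57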
MAP estimation
The posterior distribution is either a PMF pΘ|X(·|X) or a PDF fΘ|X(·|X). To find the estimate of Θ given X, we use the MAP rule: the estimate is
θ̂ = argmaxθ pΘ|X(θ|x)
for discrete Θ, and
θ̂ = argmaxθ fΘ|X(θ|x)
for continuous Θ. For a continuous random variable Θ, the conditional expectation
E[Θ|X = x] = ∫ θ fΘ|X(θ|x) dθ  (22)
can be a better estimate than the MAP estimator. In general, a priori, there is no reason for choosing one estimator over the other unless the objectives are precisely stated.
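The following sketch (assuming NumPy; the uniform prior on θ and the data k = 7 heads in n = 10 tosses are arbitrary choices) computes both estimates for the coin example of case 3 and shows that they differ:

import numpy as np

# Coin bias Theta with a uniform prior on [0, 1]; observe k heads in n tosses.
# Compare the MAP estimate (posterior mode) with the conditional expectation.
n, k = 10, 7
theta = np.linspace(0.001, 0.999, 999)

posterior = theta**k * (1 - theta) ** (n - k)   # uniform prior: posterior ∝ likelihood
posterior /= np.trapz(posterior, theta)         # normalize as in (16)

theta_map = theta[np.argmax(posterior)]         # MAP estimate, here k/n = 0.7
theta_ce = np.trapz(theta * posterior, theta)   # E[Theta | X = x] ≈ (k+1)/(n+2) ≈ 0.667
print(theta_map, theta_ce)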
MMSE estimator
Bayesian MSE is defined as
Bmse(Θ̂) := E[(Θ − Θ̂)²].  (23)
Using the joint PDF p(x, Θ),
Bmse(Θ̂) = ∫∫ (Θ − Θ̂)² p(x, Θ) dx dΘ.  (24)
Note that
p(Θ|x) = p(x|Θ) p(Θ) / p(x) = p(x|Θ) p(Θ) / ∫ p(x|Θ) p(Θ) dΘ,
where p(Θ) is the prior PDF on Θ. It turns out that the MMSE estimator is given by
Θ̂ = E[Θ|x]. (25)
For linear models of the observation x as a function of Θ and Gaussian noise assumptions, MMSE estima-
tors can be computed analytically. However, in general cases, computation of MMSE is computationally
intensive. Therefore, one looks for linear estimators which minimize MSE rather than computing E[Θ|x].
LMMSE estimators (for scalar estimation) are of the form
Θ̂linear = Σ_{n=0}^{N−1} an x(n) + aN.  (26)
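As a small sketch of the linear form (26) (assuming NumPy; the scalar Gaussian model Θ ~ N(0, 1), x(n) = Θ + w(n) with unit noise variance and N = 3 observations are arbitrary choices), the coefficients a0, . . . , aN can be fitted by least squares over simulated (x, Θ) pairs rather than derived analytically:

import numpy as np

rng = np.random.default_rng(0)
N, trials = 3, 100_000
sigma_theta, sigma_w = 1.0, 1.0   # prior and noise standard deviations (arbitrary)

Theta = sigma_theta * rng.standard_normal(trials)
X = Theta[:, None] + sigma_w * rng.standard_normal((trials, N))   # x(n) = Theta + w(n)

# Fit Theta ≈ a_0 x(0) + ... + a_{N-1} x(N-1) + a_N over the simulated samples.
A = np.hstack([X, np.ones((trials, 1))])
a, *_ = np.linalg.lstsq(A, Theta, rcond=None)

print(a)   # ≈ 0.25 for the first N coefficients and ≈ 0 for a_N in this setting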
• One can use the classical statistics approach or the Bayesian approach for modeling. For parametrized models, in the classical approach one treats the parameters as deterministic unknowns, whereas in Bayesian modeling some prior knowledge of the parameters is assumed and is updated after measurements.
• We will use the classical approach to modeling and system identification, but the Bayesian modeling approach can also be used in appropriate cases.
• In estimation theory, both approaches are used. For example, least squares estimation belongs to classical estimation theory and is commonly used in many applications, e.g., estimating a true signal from its noisy measurements. The Kalman filter uses the Bayesian approach for estimating the state of a dynamical system.
References
[1] D. Bertsekas, J. Tsitsiklis, Introduction to Probability, 2nd edition, 2008.
[3] M. Diehl, Lecture notes on Modeling and System Identification, Lecture notes and video lectures, 2020.