
1

Introduction

In this chapter we briefly describe the types of problems with which we will
be concerned. Then we define some notation and review some basic concepts
from probability theory and statistical inference.

1.1 What Is Nonparametric Inference?


The basic idea of nonparametric inference is to use data to infer an unknown
quantity while making as few assumptions as possible. Usually, this means
using statistical models that are infinite-dimensional. Indeed, a better name
for nonparametric inference might be infinite-dimensional inference. But it is
difficult to give a precise definition of nonparametric inference, and if I did
venture to give one, no doubt I would be barraged with dissenting opinions.
For the purposes of this book, we will use the phrase nonparametric inference
to refer to a set of modern statistical methods that aim to keep the
number of underlying assumptions as weak as possible. Specifically, we will
consider the following problems:

1. (Estimating the distribution function). Given an iid sample X1, . . ., Xn ∼ F,
estimate the cdf F(x) = P(X ≤ x). (Chapter 2.)

2. (Estimating functionals). Given an iid sample X1, . . ., Xn ∼ F, estimate
a functional T(F) such as the mean T(F) = ∫ x dF(x). (Chapters 2
and 3.)

3. (Density estimation). Given an iid sample X1, . . ., Xn ∼ F, estimate the
density f(x) = F′(x). (Chapters 4, 6 and 8.)

4. (Nonparametric regression or curve estimation). Given (X1, Y1), . . ., (Xn, Yn),
estimate the regression function r(x) = E(Y |X = x). (Chapters 4, 5, 8
and 9.)

5. (Normal means). Given Yi ∼ N(θi, σ²), i = 1, . . ., n, estimate θ =
(θ1, . . ., θn). This apparently simple problem turns out to be very complex
and provides a unifying basis for much of nonparametric inference.
(Chapter 7.)

In addition, we will discuss some unifying theoretical principles in Chapter
7. We consider a few miscellaneous problems in Chapter 10, such as measurement
error, inverse problems and testing.
Typically, we will assume that the distribution F (or density f or regression
function r) lies in some large set F called a statistical model. For example,
when estimating a density f, we might assume that

    f ∈ F = { g : ∫ (g′′(x))² dx ≤ c² }

which is the set of densities that are not “too wiggly.”

1.2 Notation and Background


Here is a summary of some useful notation and background. See also
Table 1.1.
Let a(x) be a function of x and let F be a cumulative distribution function.
If F is absolutely continuous, let f denote its density. If F is discrete, let f
denote instead its probability mass function. The mean of a is

    E(a(X)) = ∫ a(x) dF(x) ≡ { ∫ a(x) f(x) dx       continuous case
                               Σ_j a(x_j) f(x_j)    discrete case.

Let V(X) = E(X − E(X))² denote the variance of a random variable. If
X1, . . ., Xn are n observations, then ∫ a(x) dFn(x) = n⁻¹ Σ_i a(Xi), where Fn
is the empirical distribution that puts mass 1/n at each observation Xi.
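
To make the plug-in idea concrete, here is a minimal sketch in Python (my own
illustration; NumPy and the particular simulated sample are assumptions, not
part of the text) that builds the empirical distribution Fn from data and
evaluates the plug-in mean ∫ x dFn(x):

    import numpy as np

    def ecdf(data):
        # Empirical cdf: Fn(t) = (1/n) * #{i : X_i <= t}.
        x = np.sort(np.asarray(data, dtype=float))
        n = x.size
        def Fn(t):
            return np.searchsorted(x, t, side="right") / n
        return Fn

    rng = np.random.default_rng(0)
    sample = rng.exponential(scale=2.0, size=500)   # stand-in for an unknown F

    Fn = ecdf(sample)
    print(Fn(2.0))          # plug-in estimate of F(2) = P(X <= 2)
    print(sample.mean())    # plug-in estimate of the mean: integral of x dFn(x)

Returning Fn as a function mirrors the definition above: all the mass sits at
the observations, so integrals against dFn reduce to averages over the sample.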

    Symbol            Definition
    xn = o(an)        limn→∞ xn/an = 0
    xn = O(an)        |xn/an| is bounded for all large n
    an ∼ bn           an/bn → 1 as n → ∞
    an ≍ bn           an/bn and bn/an are bounded for all large n
    Xn ⇝ X            convergence in distribution
    Xn →P X           convergence in probability
    Xn →a.s. X        almost sure convergence
    θ̂n                estimator of parameter θ
    bias              E(θ̂n) − θ
    se                √V(θ̂n) (standard error)
    ŝe                estimated standard error
    mse               E(θ̂n − θ)² (mean squared error)
    Φ                 cdf of a standard Normal random variable
    zα                Φ⁻¹(1 − α)

TABLE 1.1. Some useful notation.

Brief Review of Probability. The sample space Ω is the set of possible
outcomes of an experiment. Subsets of Ω are called events. A class of events
A is called a σ-field if (i) ∅ ∈ A, (ii) A ∈ A implies that Ac ∈ A and (iii)
A1, A2, . . . ∈ A implies that ∪_{i=1}^∞ Ai ∈ A. A probability measure is a
function P defined on a σ-field A such that P(A) ≥ 0 for all A ∈ A, P(Ω) = 1
and if A1, A2, . . . ∈ A are disjoint then

    P( ∪_{i=1}^∞ Ai ) = Σ_{i=1}^∞ P(Ai).

The triple (Ω, A, P) is called a probability space. A random variable is a
map X : Ω → R such that, for every real x, {ω ∈ Ω : X(ω) ≤ x} ∈ A.
A sequence of random variables Xn converges in distribution (or converges
weakly) to a random variable X, written Xn ⇝ X, if

    P(Xn ≤ x) → P(X ≤ x)                                         (1.1)

as n → ∞, at all points x at which the cdf

    F(x) = P(X ≤ x)                                              (1.2)

is continuous. A sequence of random variables Xn converges in probability
to a random variable X, written Xn →P X, if,

    for every ε > 0, P(|Xn − X| > ε) → 0 as n → ∞.               (1.3)

A sequence of random variables Xn converges almost surely to a random
variable X, written Xn →a.s. X, if

    P( lim_{n→∞} |Xn − X| = 0 ) = 1.                             (1.4)

The following implications hold:

    Xn →a.s. X   implies that   Xn →P X   implies that   Xn ⇝ X.     (1.5)

Let g be a continuous function. Then, according to the continuous mapping
theorem,

    Xn ⇝ X       implies that   g(Xn) ⇝ g(X)
    Xn →P X      implies that   g(Xn) →P g(X)
    Xn →a.s. X   implies that   g(Xn) →a.s. g(X).

According to Slutsky's theorem, if Xn ⇝ X and Yn ⇝ c for some constant
c, then Xn + Yn ⇝ X + c and Xn Yn ⇝ cX.
Let X1, . . ., Xn ∼ F be iid. The weak law of large numbers says that if
E|g(X1)| < ∞, then n⁻¹ Σ_{i=1}^n g(Xi) →P E(g(X1)). The strong law of large
numbers says that if E|g(X1)| < ∞, then n⁻¹ Σ_{i=1}^n g(Xi) →a.s. E(g(X1)).
The random variable Z has a standard Normal distribution if it has density
φ(z) = (2π)^{−1/2} e^{−z²/2} and we write Z ∼ N(0, 1). The cdf is denoted by
Φ(z). The α upper quantile is denoted by zα. Thus, if Z ∼ N(0, 1), then
P(Z > zα) = α.
If E(g²(X1)) < ∞, the central limit theorem says that

    √n (Ȳn − µ) ⇝ N(0, σ²)                                       (1.6)

where Yi = g(Xi), µ = E(Y1), Ȳn = n⁻¹ Σ_{i=1}^n Yi and σ² = V(Y1). In general,
if

    (Xn − µ)/σn ⇝ N(0, 1)

then we will write

    Xn ≈ N(µ, σn²).                                              (1.7)

According to the delta method, if g is differentiable at µ and g′(µ) ≠ 0
then

    √n (Xn − µ) ⇝ N(0, σ²)  =⇒  √n (g(Xn) − g(µ)) ⇝ N(0, (g′(µ))² σ²).   (1.8)

A similar result holds in the vector case. Suppose that Xn is a sequence of
random vectors such that √n (Xn − µ) ⇝ N(0, Σ), a multivariate, mean 0

normal with covariance matrix Σ. Let g be differentiable with gradient ∇g


such that ∇µ = 0 where ∇µ is ∇g evaluated at µ. Then

n(g(Xn ) − g(µ))  N 0, ∇Tµ Σ∇µ . (1.9)
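
As a quick sanity check on the scalar delta method (1.8), the following Monte
Carlo sketch (my own illustration in Python; the choice g(x) = x², the
Exponential(1) population and the sample sizes are assumptions, not from the
text) compares the simulated variance of √n(g(X̄n) − g(µ)) with the
delta-method value (g′(µ))²σ²:

    import numpy as np

    rng = np.random.default_rng(1)
    n, reps = 200, 5000
    mu, sigma2 = 1.0, 1.0                    # Exponential(1): mean 1, variance 1
    g = lambda x: x ** 2                     # smooth g with g'(mu) = 2*mu != 0
    gprime = lambda x: 2 * x

    samples = rng.exponential(scale=1.0, size=(reps, n))
    xbar = samples.mean(axis=1)
    stat = np.sqrt(n) * (g(xbar) - g(mu))    # sqrt(n) (g(Xbar_n) - g(mu))

    print("simulated variance:   ", stat.var())
    print("delta-method variance:", gprime(mu) ** 2 * sigma2)

For moderate n the two numbers should be close; the delta method is exactly the
statement that they agree in the limit.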

Statistical Concepts. Let F = {f(x; θ) : θ ∈ Θ} be a parametric model
satisfying appropriate regularity conditions. The likelihood function based
on iid observations X1, . . ., Xn is

    Ln(θ) = ∏_{i=1}^n f(Xi; θ)

and the log-likelihood function is ℓn(θ) = log Ln(θ). The maximum likelihood
estimator, or mle θ̂n, is the value of θ that maximizes the likelihood. The
score function is s(X; θ) = ∂ log f(X; θ)/∂θ. Under appropriate regularity
conditions, the score function satisfies Eθ(s(X; θ)) = ∫ s(x; θ) f(x; θ) dx = 0.
Also,

    √n (θ̂n − θ) ⇝ N(0, τ²(θ))

where τ²(θ) = 1/I(θ) and

    I(θ) = Vθ(s(X; θ)) = Eθ(s²(X; θ)) = −Eθ( ∂² log f(X; θ)/∂θ² )

is the Fisher information. Also,

    (θ̂n − θ)/ŝe ⇝ N(0, 1)

where ŝe² = 1/(n I(θ̂n)). The Fisher information In from n observations
satisfies In(θ) = nI(θ); hence we may also write ŝe² = 1/In(θ̂n).
The bias of an estimator θ̂n is E(θ̂n) − θ and the mean squared error mse
is mse = E(θ̂n − θ)². The bias–variance decomposition for the mse of an
estimator θ̂n is

    mse = bias²(θ̂n) + V(θ̂n).                                    (1.10)
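
To see these formulas in action, here is a small sketch (my own Python
illustration; the Bernoulli(θ) model, for which I(θ) = 1/(θ(1 − θ)), and the
simulated data are assumptions) that computes the mle and the standard error
ŝe = 1/√(n I(θ̂n)):

    import numpy as np

    rng = np.random.default_rng(2)
    theta_true = 0.3
    x = rng.binomial(1, theta_true, size=400)              # iid Bernoulli(theta) sample
    n = x.size

    theta_hat = x.mean()                                   # mle for the Bernoulli model
    fisher_info = 1.0 / (theta_hat * (1.0 - theta_hat))    # I(theta_hat)
    se_hat = 1.0 / np.sqrt(n * fisher_info)                # se_hat^2 = 1 / (n I(theta_hat))

    print("mle:", theta_hat, "  estimated standard error:", se_hat)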

1.3 Confidence Sets


Much of nonparametric inference is devoted to finding an estimator θ̂n of
some quantity of interest θ. Here, for example, θ could be a mean, a density
or a regression function. But we also want to provide confidence sets for these
quantities. There are different types of confidence sets, as we now explain.

Let F be a class of distribution functions F and let θ be some quantity of
interest. Thus, θ might be F itself, or F′ or the mean of F, and so on. Let
Cn be a set of possible values of θ which depends on the data X1, . . ., Xn. To
emphasize that probability statements depend on the underlying F we will
sometimes write PF.

1.11 Definition. Cn is a finite sample 1 − α confidence set if

    inf_{F∈F} PF(θ ∈ Cn) ≥ 1 − α    for all n.                   (1.12)

Cn is a uniform asymptotic 1 − α confidence set if

    lim inf_{n→∞} inf_{F∈F} PF(θ ∈ Cn) ≥ 1 − α.                  (1.13)

Cn is a pointwise asymptotic 1 − α confidence set if,

    for every F ∈ F,  lim inf_{n→∞} PF(θ ∈ Cn) ≥ 1 − α.          (1.14)

If || · || denotes some norm and f̂n is an estimate of f, then a confidence
ball for f is a confidence set of the form

    Cn = { f ∈ F : ||f − f̂n|| ≤ sn }                             (1.15)

where sn may depend on the data. Suppose that f is defined on a set X. A
pair of functions (ℓ, u) is a 1 − α confidence band or confidence envelope
if

    inf_{f∈F} P( ℓ(x) ≤ f(x) ≤ u(x) for all x ∈ X ) ≥ 1 − α.     (1.16)

Confidence balls and bands can be finite sample, pointwise asymptotic and
uniform asymptotic as above. When estimating a real-valued quantity instead
of a function, Cn is just an interval and we call Cn a confidence interval.
Ideally, we would like to find finite sample confidence sets. When this is
not possible, we try to construct uniform asymptotic confidence sets. The
last resort is a pointwise asymptotic confidence interval. If Cn is a uniform
asymptotic confidence set, then the following is true: for any δ > 0 there exists
an n(δ) such that the coverage of Cn is at least 1 − α − δ for all n > n(δ).
With a pointwise asymptotic confidence set, there may not exist a finite n(δ).
In this case, the sample size at which the confidence set has coverage close to
1 − α will depend on f (which we don’t know).

1.17 Example. Let X1, . . ., Xn ∼ Bernoulli(p). A pointwise asymptotic 1 − α
confidence interval for p is

    p̂n ± zα/2 √( p̂n(1 − p̂n)/n )                                  (1.18)

where p̂n = n⁻¹ Σ_{i=1}^n Xi. It follows from Hoeffding's inequality (1.24) that a
finite sample confidence interval is

    p̂n ± √( (1/(2n)) log(2/α) ).  ∎                              (1.19)
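
For concreteness, the sketch below (my own Python illustration; SciPy is
assumed only for the Normal quantile zα/2, and the values of n, p and α are
arbitrary) computes both the pointwise asymptotic interval (1.18) and the
finite sample Hoeffding interval (1.19) from the same simulated Bernoulli
sample:

    import numpy as np
    from scipy.stats import norm          # assumed available for the Normal quantile

    rng = np.random.default_rng(3)
    alpha, p, n = 0.05, 0.2, 100
    x = rng.binomial(1, p, size=n)
    p_hat = x.mean()
    z = norm.ppf(1 - alpha / 2)           # z_{alpha/2}

    half_wald = z * np.sqrt(p_hat * (1 - p_hat) / n)    # (1.18)
    half_hoeff = np.sqrt(np.log(2 / alpha) / (2 * n))   # (1.19)

    print("asymptotic (1.18):", (p_hat - half_wald, p_hat + half_wald))
    print("Hoeffding  (1.19):", (p_hat - half_hoeff, p_hat + half_hoeff))

The Hoeffding interval is noticeably wider; that is the price paid for a
guarantee that holds at every sample size.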

1.20 Example (Parametric models). Let

    F = {f(x; θ) : θ ∈ Θ}

be a parametric model with scalar parameter θ and let θ̂n be the maximum
likelihood estimator, the value of θ that maximizes the likelihood function

    Ln(θ) = ∏_{i=1}^n f(Xi; θ).

Recall that under suitable regularity assumptions,

    θ̂n ≈ N(θ, ŝe²)

where

    ŝe = (In(θ̂n))^{−1/2}

is the estimated standard error of θ̂n and In(θ) is the Fisher information.
Then

    θ̂n ± zα/2 ŝe

is a pointwise asymptotic confidence interval. If τ = g(θ) we can get an
asymptotic confidence interval for τ using the delta method. The mle for
τ is τ̂n = g(θ̂n). The estimated standard error for τ̂n is ŝe(τ̂n) = ŝe(θ̂n)|g′(θ̂n)|.
The confidence interval for τ is

    τ̂n ± zα/2 ŝe(τ̂n) = τ̂n ± zα/2 ŝe(θ̂n)|g′(θ̂n)|.

Again, this is typically a pointwise asymptotic confidence interval. ∎
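
As one worked instance of this construction (my own example, not from the
text): for an Exponential(θ) model with density f(x; θ) = θ e^{−θx}, the mle is
θ̂n = 1/X̄n and I(θ) = 1/θ², and for τ = g(θ) = 1/θ (the mean) the sketch below
assembles τ̂n ± zα/2 ŝe(θ̂n)|g′(θ̂n)|:

    import numpy as np
    from scipy.stats import norm          # assumed available for the Normal quantile

    rng = np.random.default_rng(4)
    alpha, n = 0.05, 200
    x = rng.exponential(scale=2.0, size=n)     # rate theta = 0.5, mean tau = 2

    theta_hat = 1.0 / x.mean()                 # mle of the rate
    se_theta = theta_hat / np.sqrt(n)          # se_hat = 1 / sqrt(n I(theta_hat)), I(theta) = 1/theta^2

    tau_hat = 1.0 / theta_hat                  # tau_hat = g(theta_hat)
    se_tau = se_theta / theta_hat ** 2         # se_hat(tau_hat) = se_hat(theta_hat) |g'(theta_hat)|

    z = norm.ppf(1 - alpha / 2)
    print("delta-method CI for tau:", (tau_hat - z * se_tau, tau_hat + z * se_tau))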



1.4 Useful Inequalities


At various times in this book we will need to use certain inequalities. For
reference purposes, a number of these inequalities are recorded here.

Markov’s Inequality. Let X be a non-negative random variable and suppose


that E(X) exists. For any t > 0,

    P(X > t) ≤ E(X)/t.                                           (1.21)

Chebyshev's Inequality. Let µ = E(X) and σ² = V(X). Then,

    P(|X − µ| ≥ t) ≤ σ²/t².                                      (1.22)

Hoeffding's Inequality. Let Y1, . . ., Yn be independent observations such that
E(Yi) = 0 and ai ≤ Yi ≤ bi. Let ε > 0. Then, for any t > 0,

    P( Σ_{i=1}^n Yi ≥ ε ) ≤ e^{−tε} ∏_{i=1}^n e^{t²(bi − ai)²/8}.    (1.23)

Hoeffding's Inequality for Bernoulli Random Variables. Let X1, . . ., Xn ∼ Bernoulli(p).
Then, for any ε > 0,

    P( |X̄n − p| > ε ) ≤ 2e^{−2nε²}                               (1.24)

where X̄n = n⁻¹ Σ_{i=1}^n Xi.
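
As a quick numerical illustration (my own; the values of p, n and ε are
arbitrary choices), the sketch below estimates P(|X̄n − p| > ε) by simulation
and compares it with the bound 2e^{−2nε²} from (1.24):

    import numpy as np

    rng = np.random.default_rng(5)
    p, n, eps, reps = 0.2, 100, 0.1, 20000

    xbar = rng.binomial(n, p, size=reps) / n        # reps independent values of Xbar_n
    empirical = np.mean(np.abs(xbar - p) > eps)     # Monte Carlo estimate of P(|Xbar_n - p| > eps)
    bound = 2 * np.exp(-2 * n * eps ** 2)           # Hoeffding bound (1.24)

    print("simulated probability:", empirical)
    print("Hoeffding bound:      ", bound)

The bound is loose, but unlike the simulated value it requires no knowledge of
p and holds for every n.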

Mill's Inequality. If Z ∼ N(0, 1) then, for any t > 0,

    P(|Z| > t) ≤ 2φ(t)/t                                         (1.25)

where φ is the standard Normal density. In fact, for any t > 0,

    (1/t − 1/t³) φ(t) < P(Z > t) < (1/t) φ(t)                    (1.26)

and

    P(Z > t) < (1/2) e^{−t²/2}.                                  (1.27)

Berry–Esséen Bound. Let X1, . . ., Xn be iid with finite mean µ = E(X1),
variance σ² = V(X1) and third moment, E|X1|³ < ∞. Let Zn = √n (X̄n − µ)/σ.
Then

    supz |P(Zn ≤ z) − Φ(z)| ≤ (33/4) E|X1 − µ|³ / (√n σ³).       (1.28)

Bernstein's Inequality. Let X1, . . ., Xn be independent, zero mean random
variables such that −M ≤ Xi ≤ M. Then

    P( |Σ_{i=1}^n Xi| > t ) ≤ 2 exp{ −(1/2) t²/(v + Mt/3) }      (1.29)

where v ≥ Σ_{i=1}^n V(Xi).

Bernstein's Inequality (Moment version). Let X1, . . ., Xn be independent, zero
mean random variables such that

    E|Xi|^m ≤ m! M^{m−2} vi / 2

for all m ≥ 2 and some constants M and vi. Then,

    P( |Σ_{i=1}^n Xi| > t ) ≤ 2 exp{ −(1/2) t²/(v + Mt) }        (1.30)

where v = Σ_{i=1}^n vi.

Cauchy–Schwarz Inequality. If X and Y have finite variances then

    E|XY| ≤ √( E(X²) E(Y²) ).                                    (1.31)

Recall that a function g is convex if for each x, y and each α ∈ [0, 1],

    g(αx + (1 − α)y) ≤ α g(x) + (1 − α) g(y).

If g is twice differentiable, then convexity reduces to checking that g′′(x) ≥ 0
for all x. It can be shown that if g is convex then it lies above any line that
touches g at some point, called a tangent line. A function g is concave if
−g is convex. Examples of convex functions are g(x) = x² and g(x) = e^x.
Examples of concave functions are g(x) = −x² and g(x) = log x.

Jensen’s inequality. If g is convex then

Eg(X) ≥ g(EX). (1.32)



If g is concave then
Eg(X) ≤ g(EX). (1.33)

1.5 Bibliographic Remarks


References on probability inequalities and their use in statistics and pattern
recognition include Devroye et al. (1996) and van der Vaart and Wellner
(1996). To review basic probability and mathematical statistics, I recommend
Casella and Berger (2002), van der Vaart (1998) and Wasserman (2004).

1.6 Exercises
1. Consider Example 1.17. Prove that (1.18) is a pointwise asymptotic
confidence interval. Prove that (1.19) is a uniform confidence interval.

2. (Computer experiment). Compare the coverage and length of (1.18) and
(1.19) by simulation. Take p = 0.2 and use α = 0.05. Try various sample
sizes n. How large must n be before the pointwise interval has accurate
coverage? How do the lengths of the two intervals compare when this
sample size is reached?

3. Let X1, . . ., Xn ∼ N(µ, 1). Let Cn = X̄n ± zα/2/√n. Is Cn a finite
sample, pointwise asymptotic, or uniform asymptotic confidence set
for µ?

4. Let X1, . . ., Xn ∼ N(µ, σ²). Let Cn = X̄n ± zα/2 Sn/√n where Sn² =
Σ_{i=1}^n (Xi − X̄n)²/(n − 1). Is Cn a finite sample, pointwise asymptotic,
or uniform asymptotic confidence set for µ?

5. Let X1, . . ., Xn ∼ F and let µ = ∫ x dF(x) be the mean. Let

    Cn = ( X̄n − zα/2 ŝe, X̄n + zα/2 ŝe )

where ŝe² = Sn²/n and

    Sn² = (1/n) Σ_{i=1}^n (Xi − X̄n)².

(a) Assuming that the mean exists, show that Cn is a 1 − α pointwise
asymptotic confidence interval.

(b) Show that Cn is not a uniform asymptotic confidence interval. Hint:
Let an → ∞ and εn → 0 and let Gn = (1 − εn)F + εn δn where δn is
a point mass at an. Argue that, with very high probability, for an large
and εn small, ∫ x dGn(x) is large but X̄n + zα/2 ŝe is not large.

(c) Suppose that P(|Xi| ≤ B) = 1 where B is a known constant. Use
Bernstein's inequality (1.29) to construct a finite sample confidence
interval for µ.
