
The Annals of Statistics

2005, Vol. 33, No. 5, 2022–2041


DOI: 10.1214/009053605000000390
c Institute of Mathematical Statistics, 2005

ESTIMATION OF SUMS OF RANDOM VARIABLES: EXAMPLES AND INFORMATION BOUNDS^1

arXiv:math/0602214v1 [math.ST] 10 Feb 2006

By Cun-Hui Zhang
Rutgers University
This paper concerns the estimation of sums of functions of observable and unobservable variables. Lower bounds for the asymptotic variance and a convolution theorem are derived in general finite- and infinite-dimensional models. An explicit relationship is established between efficient influence functions for the estimation of sums of variables and the estimation of their means. Certain “plug-in” estimators are proved to be asymptotically efficient in finite-dimensional models, while “u, v” estimators of Robbins are proved to be efficient in infinite-dimensional mixture models. Examples include certain species, network and data confidentiality problems.

1. Introduction. Given a pool of n motorists, how do we estimate the total intensity of those in the pool who have a prespecified number of traffic accidents in a given time period? This is an example of a broad class of
problems involving the estimation of sums of random variables

(1.1)  S_n ≡ Σ_{j=1}^n u(X_j, θ_j)

[24], where X_j are observable variables, θ_j are unobservable variables or constants, and u(·, ·) is a certain utility function. The estimation of (1.1) has numerous important applications. In the motorist example, X_j is the number of traffic accidents and θ_j the intensity of the jth individual in the pool, and u(x, ϑ) = ϑ I{x = a} for a prespecified integer a. In Sections 3, 4 and 5 we consider applications in certain species, network and data confidentiality problems.

Received June 2001; revised October 2004.


^1 Supported in part by the National Science Foundation.
AMS 2000 subject classifications. Primary 62F10, 62F12, 62G05, 62G20; secondary
62F15.
Key words and phrases. Empirical Bayes, sum of variables, utility, efficient estimation,
information bound, influence function, species problem, networks, node degree, data con-
fidentiality, disclosure risk.

This is an electronic reprint of the original article published by the


Institute of Mathematical Statistics in The Annals of Statistics,
2005, Vol. 33, No. 5, 2022–2041. This reprint differs from the original in
pagination and typographic detail.

The estimation of (1.1) is a nonstandard problem in statistics, since the sums, involving observables as well as unobservables, are not parameters. Without a theory of efficient estimation, the performance of different estimators can only be measured against each other in terms of relative efficiency. For the specific motorist example with u(x, ϑ) = ϑ I{x = a}, Robbins and Zhang [28] proved that, in a Poisson mixture model, the efficient estimation of (1.1) is equivalent to the efficient estimation of E(θ|X = a), so that the usual information bounds can be used. In this paper we provide a general theory for the efficient estimation of sums of variables.
Let (X, θ), (X_j, θ_j), j = 1, …, n, be i.i.d. vectors with an unknown common joint distribution F. Our general theory covers asymptotic efficiency for the estimation of

(1.2)  S_n ≡ S_n(F) ≡ Σ_{j=1}^n u(X_j, θ_j; F)

based on X_1, …, X_n, where the utility u(x, ϑ; F) is also allowed to depend on F. This provides a unified asymptotic theory for the estimation of (1.1) and conventional parameters u(F), since the utility is allowed to depend on F only. Our problem is closely related to the estimation of the mean

(1.3)  µ(F) ≡ E_F u(X, θ; F).

If E_F u²(X, θ; F) < ∞ and 1/2 ≤ α < 1, an estimator is n^α-consistent for the estimation of S_n(F) iff it is n^α-consistent for the estimation of its mean nµ(F) = E_F S_n(F). But an efficient estimator of nµ(F) is not necessarily an efficient estimator of S_n(F), since the two estimation problems may have different efficient influence functions, as we demonstrate below in (1.4)–(1.6) and in simple examples in Sections 2.3 and 2.4. The asymptotic theory for the estimation of µ(F) is well understood; see [3, 17, 31].
Suppose that F belongs to a known class F. Let F_0 ∈ F. An estimator µ̂_n of (1.3) is (locally) asymptotically efficient in contiguous neighborhoods of P_{F_0} iff

(1.4)  µ̂_n = µ(F_0) + (1/n) Σ_{j=1}^n ψ_*(X_j) + o_{P_{F_0}}(n^{−1/2}),

where ψ_*(x) ≡ ψ_*(x; F_0) is the efficient influence function at F_0 for the estimation of µ(F). In Section 6 we show that, under mild regularity conditions on the utility functions {u(x, ϑ; F), F ∈ F}, an estimator Ŝ_n of (1.2) is (locally) asymptotically efficient in contiguous neighborhoods of P_{F_0} iff

(1.5)  Ŝ_n/n = µ(F_0) + (1/n) Σ_{j=1}^n φ_*(X_j) + o_{P_{F_0}}(n^{−1/2}),

where φ_*(x) ≡ φ_*(x; F_0) is the efficient influence function at F_0 for the estimation of S_n(F). Furthermore, the following relationship holds between the two efficient influence functions in (1.4) and (1.5):

(1.6)  φ_*(x) = ψ_*(x) + u(x; F_0) − µ(F_0) − u_*(x),

where u(x; F) ≡ E_F[u(X, θ; F)|X = x] and u_*(x) ≡ u_*(x; F_0) is the projection of u(x; F_0) to the tangent space of the family of distributions {F^X, F ∈ F} at F_0^X. Here F^X is the marginal distribution of X under the joint distribution F of (X, θ). It follows clearly from (1.6) that asymptotically efficient estimations of S_n(F)/n and µ(F) are equivalent in contiguous neighborhoods of P_{F_0} iff u(·; F_0) − µ(F_0) is in the tangent space, that is, u(·; F_0) − µ(F_0) = u_*(·; F_0).
We will derive more explicit results in finite-dimensional models and infinite-dimensional mixture models. In finite-dimensional models F = {F_τ, τ ∈ T} with a Euclidean τ, it will be shown that “plug-in” estimators of the form Σ_{j=1}^n u(X_j; F_{τ̂_n}) are asymptotically efficient for the estimation of (1.2) if τ̂_n is an efficient estimator of τ. In infinite-dimensional mixture models, certain “u, v” estimators of Robbins [24] will be shown to be efficient for the estimation of (1.1). We shall consider estimation of (1.1) with known f(x|ϑ) in Section 2 and provide the general theory in Section 6. Section 7 contains proofs of all theorems.

2. Mixture models. Suppose (X, θ) ∼ F(dx, dϑ) = f(x|ϑ)ν(dx)G(dϑ), that is,

(2.1)  X|θ ∼ f(x|θ),  θ ∼ G.

In this section we state our results for the estimation of (1.1) with known f(·|·).

2.1. Finite-dimensional mixture models. Let {G_τ, τ ∈ T} be a parametric family of distributions with an open T in a Euclidean space. Suppose (2.1) holds with G = G_τ for an unknown vector τ ∈ T. Suppose that, for certain functions ρ̃_τ,

(2.2)  ∫ (√g_{τ,∆} − 1 − ∆^t ρ̃_τ/2)² dG_τ = o(‖∆‖²),  ∫ g_{τ,∆} dG_τ = 1 + o(‖∆‖²),  as ∆ → 0,

where g_{τ,∆} is the Radon–Nikodym derivative of the absolutely continuous part of G_{τ+∆} with respect to G_τ. Let E_τ denote the expectation under G_τ. The Fisher information matrix for the estimation of τ based on a single X is

(2.3)  I_τ ≡ Cov_τ(ρ_τ(X)),  ρ_τ(x) ≡ E_τ[ρ̃_τ(θ)|X = x].

Define u_τ(x) ≡ E_τ[u(X, θ)|X = x] and µ_τ ≡ E_τ u(X, θ).

Theorem 2.1. Suppose (2.2) holds, E_τ u²(X, θ) is locally bounded and I_τ is of full rank for all τ ∈ T. Then {Ŝ_n, n ≥ 1} is an asymptotically efficient estimator of (1.1) iff (1.5) holds with µ(F_0) = µ_τ, P = P_τ, and the efficient influence function

(2.4)  φ_* = φ_{*,τ} ≡ u_τ − µ_τ + ρ_τ^t I_τ^{−1} γ_τ,

where γ_τ ≡ E_τ Cov_τ(u(X, θ), ρ̃_τ(θ)|X) = E_τ{u(X, θ)ρ̃_τ(θ) − u_τ(X)ρ_τ(X)}.

Remark 2.1. Since κ_{*,τ} ≡ I_τ^{−1} ρ_τ is the efficient influence function for the estimation of τ and ∂µ_τ/∂τ = E_τ u(X, θ)ρ̃_τ(θ), ψ_{*,τ} ≡ ρ_τ^t I_τ^{−1} E_τ u(X, θ)ρ̃_τ(θ) is the efficient influence function for the estimation of µ_τ. Moreover, u_{*,τ} ≡ ρ_τ^t I_τ^{−1} E_τ u_τ(X)ρ_τ(X) is the projection of u_τ to the tangent space generated by the scores ρ_τ(X) under E_τ. Thus, Theorem 2.1 asserts that (1.5) and (1.6) hold under (2.2).

Our next theorem provides the asymptotic theory for plug-in estimators

(2.5)  Ŝ_n ≡ Σ_{j=1}^n u_{τ̂_n}(X_j)

of (1.1), where u_τ(x) ≡ E_τ[u(X, θ)|X = x] as in Theorem 2.1. An estimator τ̂_n of the vector τ is asymptotically linear with influence function κ_τ under E_τ if

(2.6)  τ̂_n = τ + (1/n) Σ_{j=1}^n κ_τ(X_j) + o_{P_τ}(n^{−1/2}),

with E_τ κ_τ(X)ρ_τ^t(X) being the identity matrix.

Theorem 2.2. Let Ŝ_n be as in (2.5) with an asymptotically linear estimator τ̂_n as in (2.6). Suppose the conditions of Theorem 2.1 hold, E_τ u²_{τ+∆}(X) = O(1) as ∆ → 0 for every τ ∈ T, and for all τ ∈ T and c > 0,

(2.7)  sup_{‖∆‖≤c/√n} | Σ_{j=1}^n [u_{τ+∆}(X_j) − u_τ(X_j) − {E_τ u_{τ+∆}(X) − µ_τ}] | = o_{P_τ}(n^{1/2}).

Let φ_{*,τ} and γ_τ be as in Theorem 2.1 and κ_{*,τ} = I_τ^{−1} ρ_τ. Then

(2.8)  (Ŝ_n − S_n)/n^{1/2} →_D N(0, σ_τ²),  σ_τ² = σ_{*,τ}² + Var_τ({κ_τ(X) − κ_{*,τ}(X)}^t γ_τ)

under E_τ, where σ_{*,τ}² ≡ Var_τ(φ_{*,τ}(X) − u(X, θ)). Consequently, Ŝ_n is an asymptotically efficient estimator of (1.1) at E_{τ_0} iff γ_{τ_0}^t τ̂_n is an asymptotically efficient estimator of γ_{τ_0}^t τ in contiguous neighborhoods of E_{τ_0}.
ESTIMATING SUMS OF RANDOM VARIABLES 5

Remark 2.2. It follows from (2.8) that the interval Ŝ_n ± 1.96 σ_{τ̂_n} n^{1/2} provides an approximate 95% confidence interval for (1.1), provided that σ_τ is continuous in τ.

Remark 2.3. Condition (2.7) holds if {u_{τ+∆} : τ + ∆ ∈ T, ‖∆‖ ≤ δ_τ} is a Donsker class under E_τ for some δ_τ > 0 and E_τ u²_{τ+∆}(X) is continuous at ∆ = 0.

2.2. General mixtures. Let G be a convex class of distributions. Suppose (2.1) holds with an unknown G ∈ G. Let E_G be the expectation under (2.1). Suppose E_G u²(X, θ) < ∞ for all G ∈ G. Define

(2.9)  G_{G_0} ≡ {G : E_{G_0}(f_G(X)/f_{G_0}(X))² < ∞, ∫ f_G I{f_{G_0} > 0} dν = 1},

where f_G(x) ≡ ∫ f(x|ϑ)G(dϑ), and define

(2.10)  V_{G_0} ≡ {v(x) : E_G v(X) = E_G u(X, θ) ∀ G ∈ G_{G_0}}.

Theorem 2.3. (i) If V_{G_0} is nonempty, then {Ŝ_n, n ≥ 1} is an asymptotically efficient estimator of (1.1) at E_{G_0} iff Ŝ_n = Σ_{j=1}^n v_{G_0}(X_j) + o_{P_{G_0}}(n^{1/2}) with

(2.11)  v_{G_0} ≡ arg min{E_{G_0}(v(X) − u(X, θ))² : v ∈ V_{G_0}}.

(ii) If V_{G_0} is empty, then there does not exist any regular n^{−1/2}-consistent estimator of E_G u(X, θ) or S_n/n in contiguous neighborhoods of E_{G_0}.

The definition of regular estimators of (1.1) is given in Section 6. Suppose that for certain G_* ⊆ G the collection

(2.12)  V_* ≡ {v(x) : E_G v(X) = E_G u(X, θ), E_G v²(X) < ∞ ∀ G ∈ G_*}

is nonempty, for example, certain V_{G_0} as in Theorem 2.3(i). Let ‖h‖_G ≡ {E_G h²(X)}^{1/2}.

Theorem 2.4. Let v_{G_0} be as in (2.11). Suppose v_{G_0} ∈ V_* and, as (ε, n) → (0, ∞),

sup{ Σ_{j=1}^n (v_G(X_j) − v_{G_0}(X_j))/n^{1/2} : ‖v_G − v_{G_0}‖_{G_0} ≤ ε, G ∈ G_* } → 0 in P_{G_0}

for all G_0 ∈ G_*. Let Ĝ be an estimator of G such that P_{G_0}(Ĝ ∈ G_*) → 1 and ‖v_{Ĝ} − v_{G_0}‖_{G_0} → 0 in P_{G_0} for all G_0 ∈ G_*. Then

(2.13)  V̂_n ≡ Σ_{j=1}^n v_{Ĝ}(X_j)

is an asymptotically efficient estimator of (1.1) at PG0 for all G0 ∈ G∗ .

If f(x|ϑ) belongs to certain exponential families, there exists a unique function v such that V_{G_0} ≠ ∅ implies V_{G_0} = {v}, so that v_{G_0} = v for all G_0 and V_* = {v}. The following theorem is a variation of Theorem 2.4 for such distributions.

Theorem 2.5. Suppose f(x|ϑ) ∝ exp(x^t λ(ϑ)), λ(ϑ) ∈ Λ, is an exponential family with an open Λ in a Euclidean space, and that the conditional distribution of θ given λ(θ) is known. Suppose G contains distributions G ≡ G_c with E_G|λ(θ) − c| = 0 for all c ∈ Λ. If V_{G_0} ≠ ∅ for certain G_0, then there exists a function v(x) such that

(2.14)  E_G[v(X)|λ(θ) = c] = E_G[u(X, θ)|λ(θ) = c]  ∀ c ∈ Λ, G ∈ G,

and such that the following V_n is an efficient estimator of S_n under {E_G : E_G v²(X) < ∞}:

(2.15)  V_n ≡ Σ_{j=1}^n v(X_j).

Remark 2.4. Robbins [24] called (2.15) “u, v” estimators, provided that (2.14) holds. The V̂_n in (2.13) can be viewed as a “u, v” estimator with an estimated optimal v. Theorems 2.4 and 2.5 provide conditions under which these two types of “u, v” estimators are asymptotically efficient.
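The “u, v” identity (2.14) can be made concrete in the Poisson family: for u(x, ϑ) = ϑu₀(x) the choice v(x) = x u₀(x − 1) satisfies E[v(X)|λ] = E[λu₀(X)|λ] under X|λ ∼ Poisson(λ). A minimal numerical check of this identity (the helper names are ours, not the paper's):

```python
import math

def poisson_pmf(x, lam):
    # computed iteratively to avoid huge factorials
    p = math.exp(-lam)
    for k in range(1, x + 1):
        p *= lam / k
    return p

def uv_gap(lam, u0, xmax=200):
    """|E[v(X)|lam] - E[lam*u0(X)|lam]| with v(x) = x*u0(x-1)."""
    lhs = sum(x * u0(x - 1) * poisson_pmf(x, lam) for x in range(1, xmax))
    rhs = lam * sum(u0(x) * poisson_pmf(x, lam) for x in range(xmax))
    return abs(lhs - rhs)

# u0(x) = I{x <= 2}, in the spirit of the motorist example of Section 1
print(uv_gap(3.7, lambda x: 1.0 if x <= 2 else 0.0))  # ~0 (identity holds)
```

The gap is zero up to floating-point error for any bounded u₀, which is exactly why V_n in (2.15) is unbiased for S_n in this model.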

2.3. The Poisson example. Let (X, Y, λ) ≡ (X, θ) with

(2.16)  E[Y|X, λ] = λ,  f(x|λ) ≡ P(X = x|λ) = e^{−λ}λ^x/x!,  x = 0, 1, … .

Robbins [22, 24] and Robbins and Zhang [25, 26, 27] considered the estimation of S_n′ ≡ Σ_{j=1}^n λ_j u(X_j) and S_n″ ≡ Σ_{j=1}^n Y_j u(X_j), and several related problems. Both S_n′ and S_n″ are special cases of (1.1). For u(x) = I{x ≤ a}, S_n″ could be the total number of accidents next year for those motorists with no more than a accidents this year in the motorist example.
Suppose the λ_j have a common exponential density τe^{−λτ} dλ with unknown τ. The marginal distribution of X is f_τ(x) = τ(1 + τ)^{−x−1}, and the marginal and conditional expectations of λu(X) and Y u(X) are

u_τ(x) = (x + 1)u(x)/(1 + τ),   µ_τ = Σ_{x=0}^∞ f_τ(x) x u(x − 1).
Let X̄ ≡ Σ_{j=1}^n X_j/n. Define τ̂_n ≡ (β + n)/(α + Σ_{j=1}^n X_j) and

(2.17)  Ŝ_n ≡ Σ_{j=1}^n u_{τ̂_n}(X_j) = Σ_{j=1}^n (α/n + X̄)(X_j + 1)u(X_j)/{(α + β)/n + 1 + X̄}.

It follows from Theorem 2.2 that the plug-in estimators in (2.17) are asymp-
totically efficient for both Sn′ and Sn′′ . For α = β = 0, (2.17) gives the plug-in
estimator corresponding to the maximum likelihood estimator (MLE) of τ .
For general positive α and β, (2.17) gives the Bayes estimator of Sn′ and Sn′′
with a beta prior on τ/(1 + τ). Clearly, µ̂_n ≡ Σ_{x=1}^∞ {τ̂_n x u(x − 1)}/(1 + τ̂_n)^{x+1} is efficient for the estimation of the mean µ_τ ≡ E_τ u(X, θ), but not for S_n′/n
or Sn′′ /n. Similar results can be obtained for λ with the gamma distribution;
see [23].
In the case of completely unknown G(dλ), the “u, v” estimator (2.15) with v(x) = xu(x − 1) is asymptotically efficient for the estimation of S_n′ and S_n″ for all G with finite E_G{v(X) − λu(X)}².
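The behavior of the plug-in estimator (2.17) and the “u, v” estimator with v(x) = xu(x − 1) can be seen in a small simulation. The sketch below is our illustration, not from the paper: it draws λ_j from the exponential density τe^{−λτ}, X_j|λ_j from Poisson(λ_j), and compares both estimators of S_n′ = Σ λ_j u(X_j) with u(x) = I{x ≤ 1}:

```python
import math, random

random.seed(0)

def rpoisson(lam):
    # Knuth's product method; adequate for the small lam used here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

n, tau = 20000, 1.0
u = lambda x: 1.0 if x <= 1 else 0.0
lam = [random.expovariate(tau) for _ in range(n)]   # G(dlam) = tau * exp(-lam*tau) dlam
X = [rpoisson(l) for l in lam]

S_true = sum(l * u(x) for l, x in zip(lam, X))      # S_n' (needs the latent lam_j)
V_n = sum(x * u(x - 1) for x in X)                  # "u, v" estimator (2.15)
Xbar = sum(X) / n
S_hat = sum(Xbar * (x + 1) * u(x) / (1 + Xbar) for x in X)  # plug-in (2.17), alpha=beta=0

print(S_true, V_n, S_hat)
```

With n this large all three quantities agree to within a few percent, consistent with the n^{1/2}-rate in (2.8).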

2.4. More examples.

Example 2.1. Let X ∼ N(τ, σ²). The number of “above average” individuals, Ŝ_n ≡ #{j ≤ n : X_j > X̄}, is an efficient estimator of the number of above-mean individuals S_n(τ) ≡ #{j ≤ n : X_j > τ}. The estimator S̃_n ≡ n/2 is efficient for the estimation of E_τ S_n(τ) = n/2, but not for S_n(τ).
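A quick simulation illustrating Example 2.1 (our sketch, with arbitrary τ and σ): the count #{j : X_j > X̄} tracks the unobservable target S_n(τ) = #{j : X_j > τ} to within O(√n), even though both are random around n/2.

```python
import random

random.seed(1)

n, tau, sigma = 10000, 2.0, 1.5
X = [random.gauss(tau, sigma) for _ in range(n)]
Xbar = sum(X) / n
S_hat = sum(1 for x in X if x > Xbar)   # efficient estimator of S_n(tau)
S_true = sum(1 for x in X if x > tau)   # unobservable target
print(S_hat, S_true)                     # both near n/2, difference O(sqrt(n))
```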

Example 2.2. Let f(x|ϑ) ∼ N(ϑ, σ²). An efficient estimator for the number of “above mean” individuals, S_n ≡ #{j ≤ n : X_j > θ_j}, is Ŝ_n ≡ n/2; compare with Example 2.1. This is true even under the condition n^{−1} Σ_{j=1}^n θ_j² = O(1), that is, in contiguous neighborhoods of P_0 with P_0{θ_j = 0} = 1.

Example 2.3. Ŝ_n ≡ 0 is efficient for the estimation of S_n(τ) ≡ Σ_{j=1}^n ρ_τ(X_j).

3. A species problem. An interesting example of our problem is estimating the total number of species in a population of plants or animals. Suppose a random sample of size N is drawn (with replacement) from a population of d species. Let n_k be the number of species represented k times in the sample. A species problem is to estimate d based on {n_k, k ≥ 1}. The problem dates back to [13] and [14] and has many important applications [4]. We consider a network application in Section 4.

3.1. Finite-dimensional models. Let X_j be the frequency of the jth species in the sample, so that, for certain p_j > 0,

(3.1)  n_k = Σ_{j=1}^d I{X_j = k},  (X_1, …, X_d) ∼ multinomial(N, p_1, …, p_d).

We will confine our discussion to the case of (N, N/d) → (∞, µ), 0 < µ < ∞, since E(d − Σ_{k=1}^∞ n_k) = Σ_{j=1}^d (1 − p_j)^N → 0 as N → ∞ for fixed d. Let {G_τ, τ ∈ T} be a parametric family of distributions on (0, ∞), where τ is an unknown parameter with a scale component, G_τ(y/c) = G_{τ_{c′}}(y). Let P_τ be probability measures under which (3.1) holds conditionally on N and certain i.i.d. variables θ_j > 0, and
(3.2)  p_j = θ_j / Σ_{i=1}^d θ_i,  N|{θ_j} ∼ Poisson(c Σ_{j=1}^d θ_j),  θ_j ∼ G,

with G = G_τ. Under P_τ, the X_j are i.i.d. with P_τ{X_j = k} = ∫ e^{−y}(y^k/k!) G_{τ_{c′}}(dy). Assume c = 1 due to scale invariance. Since n_0 is unobservable, the MLE of (d, τ) is

(3.3)  d̂ ≡ Σ_{k=1}^N n_k / ∫(1 − e^{−y}) G_{τ̂}(dy),   τ̂ ≡ arg max_{τ∈T} Π_{k=1}^∞ [ ∫ e^{−y} y^k G_τ(dy) / ∫(1 − e^{−y}) G_τ(dy) ]^{n_k}.
In the next two paragraphs we derive the influence function for the MLE (3.3) and prove its asymptotic efficiency. If (2.2) holds and the MLE τ̂ of τ is asymptotically efficient, then

(3.4)  τ̂ = τ + (1/d) Σ_{j=1}^d κ_{*,τ}(X_j) + o_P(d^{−1/2})

with κ_{*,τ} ≡ {Cov_τ(ρ̄_τ(X))}^{−1} ρ̄_τ and ρ̄_τ(x) ≡ I{x>0}(ρ_τ(x) − γ_τ), where ρ_τ is as in (2.3) and γ_τ ≡ E_τ[ρ_τ(X)|X > 0]. Thus, by the Taylor expansion of the d̂ in (3.3),

(3.5)  d̂ = d + Σ_{j=1}^d φ_{*,τ}(X_j) + o_P(d^{1/2}),

where φ_{*,τ}(x) ≡ I{x>0}/P_τ(X > 0) − 1 − κ_{*,τ}^t(x)γ_τ. In this case, as d → ∞,

(3.6)  (d̂ − d)/d^{1/2} →_D N(0, P_τ(X = 0)/P_τ(X > 0) + γ_τ^t {Cov_τ(ρ̄_τ(X))}^{−1} γ_τ).
For the gamma G(dy; τ) ∝ y^{α−1} exp(−y/β) dy, the MLE τ̂ ≡ (α̂, β̂) satisfies

(3.7)  Σ_{k=1}^∞ (Σ_{ℓ=k}^∞ n_ℓ)/(α̂ + k − 1) = d̃ log(1 + β̂)/{1 − (1 + β̂)^{−α̂}},   d̃ α̂β̂/{1 − (1 + β̂)^{−α̂}} = N,
with d̃ = Σ_{k=1}^∞ n_k, and (3.4) holds [29]. Rao [19] called (3.3) with (3.7) a pseudo MLE in a different (gamma) model, but the efficiency of the d̂ was not clear [11].
The species problem is a special case of estimating (1.1) when d is viewed as the number of species represented in the population out of a total of n species. Specifically, letting p_j = 0 if the jth species is not represented in the population, estimating

(3.8)  d = Σ_{j=1}^n I{p_j > 0} = Σ_{j=1}^n I{X_j = 0, p_j > 0} + Σ_{k=1}^N n_k

is equivalent to estimating (1.1) with u(x, p) = I{p > 0} or u(x, p) = I{x = 0, p > 0}, based on observations {X_j, j ≤ n}. Under (3.1) and (3.2) with d replaced by n,
(3.9)  P_{p_*,τ}{X_j = k} = (1 − p_*) I{k = 0} + p_* [ ∫ e^{−y}(y^k/k!) G_τ(dy) / ∫(1 − e^{−y}) G_τ(dy) ] I{k > 0}

with certain p_* < ∫(1 − e^{−y}) G_τ(dy). Under (3.9), the τ̂ in (3.3) is the conditional MLE of τ given {n_k, k ≥ 1}. Since (Σ_{k=1}^∞ n_k, d, n − d) is a trinomial vector, the τ̂ in (3.3) equals the MLE of τ based on a sample {X_j, j ≤ n} from (3.9), provided that the d̂ in (3.3) is no greater than n. Since P_{p_*,τ}{d̂ ≤ n} → 1 under (3.9), by Theorem 2.1, the (conditional) MLE (3.3) is asymptotically efficient in the empirical Bayes model (3.2) under conditions (2.2), (3.4) and (3.5).

3.2. General mixture. Now suppose the distribution G in (3.2) is completely unknown. The nonparametric MLE of (d, G) is given by

(3.10)  d̂ ≡ d̃ ∫_{y>0} Ĝ(dy) / ∫(1 − e^{−y}) Ĝ(dy),   Ĝ ≡ arg max_G Π_{k=1}^∞ [ ∫ e^{−y} y^k G(dy) / ∫(1 − e^{−y}) G(dy) ]^{n_k},

with d̃ ≡ Σ_{k=1}^N n_k, but its asymptotic distribution is unclear. Since there is no solution v to the equation Σ_{x=0}^∞ v(x) e^{−ϑ}ϑ^x/x! = I{ϑ > 0} for 0 ≤ ϑ < ∞, by Theorems 2.3 and 2.5, the estimation of d with completely unknown G is an ill-posed problem.
Among many choices, a compromise between (3.3) and (3.10) is to fit E_τ n_k ∝ P_τ(X = k) = ∫ e^{−y}(y^k/k!) G_τ(dy) for 1 ≤ k ≤ m. For gamma G with E n_{k+1}/E n_k = (k + α)β/{(k + 1)(1 + β)}, fitting the negative binomial distribution yields

(3.11)  d̂ ≡ d̃ + max(τ̂_1, 0) n_1,   d̃ ≡ Σ_{k=1}^N n_k,

where τb1 is the (weighted) least squares estimate of τ1 ≡ (β + 1)/(αβ) based


on
nk = τ1 nk+1 + τ2 (knk ) + error, k = 1, . . . , m − 1, τ2 ≡ −1/α,
with nk being a response variable and (nk+1 , knk ) being covariates for each
k. For small θj (large nk for small k), (3.11) has high efficiency for gamma G
and small bias for G(y) = c1 y α + (c2 + o(1))y α+1 at y ≈ 0. Chao [5] proposed
de + n21 /(2n2 ) as a low estimate of d. Another possibility is to estimate d by
correcting the bias of the estimator d/(1e − n1 /N ) of Darroch and Ratcliff
[9] as in [6].
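The simple frequency-count estimators just mentioned are easy to state in code (a sketch; the function name is ours):

```python
def species_estimates(freq_of_freq, N):
    """freq_of_freq: {k: n_k}; N: sample size. Returns (d_tilde, Chao, Darroch-Ratcliff)."""
    d_tilde = sum(freq_of_freq.values())          # observed number of species
    n1 = freq_of_freq.get(1, 0)
    n2 = freq_of_freq.get(2, 0)
    chao = d_tilde + n1 * n1 / (2.0 * n2) if n2 > 0 else float("inf")
    dr = d_tilde / (1.0 - n1 / N) if n1 < N else float("inf")   # Darroch-Ratcliff
    return d_tilde, chao, dr

print(species_estimates({1: 10, 2: 5, 3: 2}, N=26))  # (17, 27.0, 27.625)
```

Here N = Σ_k k n_k = 26; the Chao value d̃ + n_1²/(2n_2) is a low estimate of d, as noted above.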

4. Networks: estimation of node degrees based on source-destination data. Source-destination (SD) data in networks are generated by sending probes (e.g., traceroute queries in the Internet) through networks from certain source nodes to certain destination nodes; see [8, 32]. We shall treat SD data as a collection of random vectors W_j, j = 1, …, N, generated from a sample of SD pairs and make statistical inference based on U-processes of {W_j}, for example,
(4.1)  Σ_{j=1}^N h_1(W_j)/N,   Σ_{1≤j_1≠j_2≤N} h_2(W_{j_1}, W_{j_2})/{N(N − 1)},

indexed by Borel h1 and h2 , where Wj are the observations from the jth SD
pair in the sample. We focus here on the estimation of node degrees, although
the approach based on (4.1) could be useful in other network problems.
The topology of a deterministic network can be described with a routing
table: a list r1 , . . . , rJ of directed paths representing connections between
pairs of source and destination nodes, with each path being composed of a
set of directed links. For example, the path 4 → 2 → 3 → 8 has source node
4, destination node 8, and links 4 → 2, 2 → 3 and 3 → 8. Consider a network
with nodes {1, . . . , K}. The link degree D(k, ℓ) is defined as the number of
paths using the link k → ℓ,
(4.2) D(k, ℓ) ≡ #{j ≤ J : link k → ℓ is used in rj },
with D(k, ℓ) = 0 if k → ℓ is nonexistent or never used. The node degree,
defined as

(4.3)  d_k = Σ_{ℓ=1}^K I{D(k, ℓ) > 0},

is the number of outgoing links from k to other nodes. This is also called the out-degree. The in-degree, Σ_ℓ I{D(ℓ, k) > 0}, is the number of incoming links
to k. The node degrees dk and their (empirical) distributions are important
characteristics of networks; see [12, 15, 30].

For a given sample size N , let R1 , . . . , RN be a sample of SD pairs from the


routing table {r1 , . . . , rJ }. Suppose we observe the paths of Rj , so that the
vectors Wj ≡ (W1j , . . . , WKj )′ are given by Wkj ≡ ℓ if link k → ℓ is used in Rj
for some 1 ≤ ℓ ≤ K and Wkj = 0 otherwise. The observed link frequencies
are
(4.4)  X_{kℓ} ≡ #{j ≤ N : link k → ℓ is used in R_j} = Σ_{j=1}^N I{W_{kj} = ℓ}.

Since X_{kℓ} = 0 whenever D(k, ℓ) = 0, by (4.2) and (4.4), the node degree d_k is a sum

(4.5)  d_k = d̃_k + s_k,   d̃_k ≡ Σ_{ℓ=1}^K I{X_{kℓ} > 0},

where d̃_k is the observed degree and s_k is the unobserved degree given by

(4.6)  s_k ≡ Σ_{ℓ=1}^K I{X_{kℓ} = 0, D(k, ℓ) > 0}.
Lakhina, Byers, Crovella and Xie [16] and Clauset and Moore [7] pointed out that the observed degrees d̃_k may grossly underestimate the true node degrees d_k.
It follows from (4.5), (4.6) and (3.8) that the problem of estimating the node degree (4.3) is a species problem. From this point of view, we may directly use the estimators in Section 3 and references therein, for example, (3.11). However, in network problems, we are typically interested in the simultaneous estimation of many node degrees. Thus, information from {X_{kℓ}, ℓ ≤ K} can be pooled across different nodes k. Let K ⊆ {1, …, K} be a collection of “similar” and/or “independent” nodes. Let G be a family of distributions, for example, gamma with unit scale. Suppose the G in (3.2) for the different nodes are identical to a member of G up to scale parameters β_k. Then, as
in (3.10), the (pseudo) MLE for {dk , βk , k ∈ K, G} is given by
(4.7)  d̂_k ≡ Σ_{j=1}^N n_{kj} ∫_{y>0} Ĝ(dy) / ∫(1 − e^{−β̂_k y}) Ĝ(dy),
   (β̂, Ĝ) ≡ arg max_{β,G} Π_{k∈K} Π_{j=1}^N [ ∫ e^{−β_k y} y^j G(dy) / ∫(1 − e^{−β_k y}) G(dy) ]^{n_{kj}},

where β ≡ (β_1, …, β_K) and the maximum is taken over all β_k > 0 and G ∈ G. This type of estimator is expected to perform well for self-similar networks.

In the nonparametric case of completely unknown G, the MLE (β̂, Ĝ) in (4.7) can be computed via the following EM algorithm:

β_k^{(m+1)} ← [ Σ_{j=1}^N n_{kj} { p(j + 1; β_k^{(m)}, G^{(m)})/p(j; β_k^{(m)}, G^{(m)}) + p(1; β_k^{(m)}, G^{(m)})/(1 − p(0; β_k^{(m)}, G^{(m)})) } ]^{−1} Σ_{j=1}^N j n_{kj},

with p(j; β_k, G) ≡ ∫ e^{−β_k y} y^j G(dy), and

G^{(m+1)}(dϑ) ← G^{(m)}(dϑ) [ Σ_{k∈K} Σ_{j=1}^N n_{kj}/{1 − p(0; β_k^{(m+1)}, G^{(m)})} ]^{−1}
   × Σ_{k∈K} Σ_{j=1}^N n_{kj} { exp(−β_k^{(m+1)}ϑ)ϑ^j / p(j; β_k^{(m+1)}, G^{(m)}) + exp(−β_k^{(m+1)}ϑ)/(1 − p(0; β_k^{(m+1)}, G^{(m)})) }.
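The EM updates above can be sketched with G discretized on a finite grid; for readability the sketch handles a single node k, so the sums over k ∈ K collapse. The grid, weights, and input counts are our illustrative assumptions:

```python
import math

def p(j, beta, grid, w):
    """p(j; beta, G) = integral of exp(-beta*y) y^j G(dy), G a discrete measure on grid."""
    return sum(wi * math.exp(-beta * y) * y ** j for y, wi in zip(grid, w))

def em_step(nk, beta, grid, w):
    """One EM iteration for one node; nk maps j >= 1 to the count n_{kj}."""
    denom = sum(n * (p(j + 1, beta, grid, w) / p(j, beta, grid, w)
                     + p(1, beta, grid, w) / (1.0 - p(0, beta, grid, w)))
                for j, n in nk.items())
    beta_new = sum(j * n for j, n in nk.items()) / denom
    norm = sum(n / (1.0 - p(0, beta_new, grid, w)) for n in nk.values())
    w_new = [wi / norm * sum(n * (math.exp(-beta_new * y) * y ** j / p(j, beta_new, grid, w)
                                  + math.exp(-beta_new * y) / (1.0 - p(0, beta_new, grid, w)))
                             for j, n in nk.items())
             for y, wi in zip(grid, w)]
    return beta_new, w_new

grid, w = [0.5, 1.0, 2.0], [0.3, 0.4, 0.3]
beta_new, w_new = em_step({1: 5, 2: 3, 3: 1}, 1.0, grid, w)
print(beta_new, sum(w_new))  # updated beta_k; weights remain a probability measure
```

Note that the normalizing constant in the G-update guarantees Σ w_new = 1, mirroring the displayed formula.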

5. Data confidentiality: estimation of risk in statistical disclosure. A major concern in releasing microdata sets is protecting the privacy of individuals in the sample. Consider a data set in the form of a high-dimensional contingency table. If an individual belongs to a cell with small frequency, an intruder with certain knowledge about the individual may identify him and learn sensitive information about him in the data. Statistical models and methods concerning the risk of such a breach of confidentiality have been considered by many; see [10] and the proceedings of the joint ECE/EUROSTAT work sessions on statistical data confidentiality. For multi-way contingency tables, Polettini and Seri [18] and Rinott [21] studied the estimation of global disclosure risks of the form
(5.1)  S_J ≡ Σ_{j=1}^J u(X_j, Y_j)

based on {X_j, j ≤ J}, where X_j and Y_j are the sample and population frequencies in the jth cell, J is the total number of cells, and u(x, y) is a loss function of the form u(x, y) = u(x)/y, for example, u(x, y) = y^{−1} I{x = 1}. Let N = Σ_{j=1}^J Y_j be the population size. Suppose

(5.2)  N ∼ Poisson(λ),  {Y_j}|N ∼ multinomial(N, {π_j}),  X_j|({Y_j}, N) ∼ binomial(Y_j, p_j),

for certain π_j > 0 with Σ_{j=1}^J π_j = 1, 0 ≤ p_j ≤ 1 and λ > 0. For known {p_j, π_j, λ}, the Bayes estimator of S_J in (5.1) is

(5.3)  S_J^* ≡ E(S_J|{X_j}) = Σ_{j=1}^J u_j(X_j),   u_j(x) ≡ E u(x, Y_j − X_j + x),

with Y_j − X_j ∼ Poisson((1 − p_j)π_j λ) (independent of X_j). For u(x, y) = y^{−1} I{x = 1},

(5.4)  u_j(x) = {(1 − p_j)π_j λ}^{−1} [1 − exp{−(1 − p_j)π_j λ}] I{x = 1}.
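Formula (5.4) is simply the Poisson identity E{1/(Z + 1)} = (1 − e^{−m})/m for Z ∼ Poisson(m), with m = (1 − p_j)π_j λ, applied to cells with X_j = 1. A quick numerical check (ours):

```python
import math

def u_j(p, pi, lam):
    """(5.4): expected 1/Y_j given X_j = 1, with Y_j - X_j ~ Poisson((1-p)*pi*lam)."""
    m = (1.0 - p) * pi * lam
    return (1.0 - math.exp(-m)) / m

def u_j_brute(p, pi, lam, kmax=200):
    m = (1.0 - p) * pi * lam
    term, s = math.exp(-m), 0.0   # term = e^{-m} m^k / k!
    for k in range(kmax):
        s += term / (k + 1)       # contribution of Z = k to E[1/(Z+1)]
        term *= m / (k + 1)
    return s

print(abs(u_j(0.5, 0.1, 30.0) - u_j_brute(0.5, 0.1, 30.0)))  # ~0
```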
In general, the parameters (1 − pj )πj λ cannot be completely identified
from the data Xj ∼ Poisson(pj πj λ), so that it is necessary to further model
the parameters. This can be achieved by setting {pj , πj , λ} to known tractable

functions of an unknown vector τ and certain covariates z_j characterizing the cells j, and by incorporating all available knowledge about the parameters, for example, λ ≈ N and Σ_{j=1}^J p_j π_j ≈ n/N, where n = Σ_{j=1}^J X_j is the sample size. Consequently, the conditional expectation u_j(x) in (5.4) can be written as u_j(x) = u(x, z_j; τ). This suggests

(5.5)  Ŝ_J ≡ Σ_{j=1}^J u(X_j, z_j; τ̂_J)

as an estimator of the global risk (5.1) and its conditional expectation (5.3),
where τbJ is a suitable (e.g., the maximum likelihood or method of moments)
estimator of τ. For example, in a two-way table with cells labelled by j ∼ (i, k) and known π_{i,k} and λ, we may assume a regression model p_{i,k} = ψ_0(τ_1 + τ_2′ z_{i,k}) for a certain known (e.g., logit or probit) function ψ_0. In the case of unknown π_{i,k}, we may consider the independence model π_{i,k} = π_{i·}π_{·k} with unknown π_{i·} and known or unknown π_{·k}. If τ has fixed dimensionality and τ̂_J is asymptotically efficient, (5.5) is efficient by Theorem 2.2. Theorem 2.2 also suggests that (5.5) is highly efficient if dim(τ)/J → 0.
Alternatively, we may consider the negative binomial model N ∼ NB(α, 1/(1 + β)), that is, P(N = k) = Γ(k + α){Γ(α)k!}^{−1} β^k/(1 + β)^{k+α}. As in [21], we have in this case Y_j ∼ NB(α, 1/(1 + β_j)) with β_j = βπ_j, X_j ∼ NB(α, 1/(1 + p_j β_j)), and (Y_j − X_j)|{X_j = x} ∼ NB(x + α, (1 + p_j β_j)/(1 + β_j)). Consequently,

(5.6)  u_j(x) = [(1 + p_j β_j)/{(1 − p_j)β_j}] ∫_{(1+p_j β_j)/(1+β_j)}^1 t^{α_j −1} dt · I{x = 1}

in (5.3) for u(x, y) = y^{−1} I{x = 1}. Bethlehem, Keller and Pannekoek [2] studied this negative binomial model with constant π_j = 1/J and p_j = En/EN ≈ n/N. For (α_j, β_j) → (0, ∞), (Y_j − X_j)|{X_j = x} converges in distribution to NB(x, p_j), resulting in the µ-ARGUS estimator [1] with u_j(x) = p_j(1 − p_j)^{−1}(−log p_j) I{x = 1} in (5.6), as pointed out by Rinott [21].
Compared with the Poisson model in which λ ≈ N , estimates of both EN
and Var(N ) are required in the negative binomial model. The µ-ARGUS
model essentially assumes Var(N )/(EN )2 ≥ 1/α → ∞, which may not be
suitable in some applications.

6. General information bounds. We provide a lower bound for the asymptotic variance and a convolution theorem for (locally asymptotically) regular estimators of the sum in (1.2). To facilitate the statements of our results, we first briefly describe certain terminologies and concepts in general asymptotic theory.

6.1. Scores and tangent spaces. Suppose (X, θ) ∼ F with F ∈ F, where F is a family of joint distributions. Let C ≡ C(F_0) be a collection of mappings {F_t, 0 ≤ t ≤ 1} from [0, 1] to F satisfying

(6.1)  E_{F_0}(√f_t(X) − 1 − tρ(X)/2)² = o(t²),  E_{F_0} f_t(X) = 1 + o(t²),

for certain score functions ρ(x) ≡ ρ(x; {F_t}) depending on the mappings {F_t}, where f_t ≡ dF_t^X/dF_0^X is the Radon–Nikodym derivative of the absolutely continuous part of the marginal distribution F_t^X of X under F_t with respect to the marginal distribution F_0^X. Let C_* ≡ C_*(F_0) be the collection of score functions ρ(X) generated by C. The tangent space H_* ≡ H_*(F_0) is the closure of the linear span [C_*] of C_* in L_2(F_0); that is,

(6.2)  H_* ≡ \overline{[C_*]},  C_* ≡ {ρ(·; {F_t}) : {F_t} ∈ C}.

For further discussion of scores and tangent spaces, see [3], pages 48–57. The second part of (6.1) holds in regular parametric models; see [3], page 459.

6.2. Smoothness of random variables and their distributions. Let L(U; F) be the distribution of U under P_F. Suppose that, for all {F_t} ∈ C, the random variables u_{F_t} ≡ u(X, θ; F_t) and ū_{F_t} ≡ E_{F_t}[u_{F_t}|X] satisfy the continuity conditions

(6.3)  lim_{t→0+} Var_{F_0}(u_{F_t} − u_{F_0}) = 0,

(6.4)  L(w_{F_t}; F_t) →_D L(w_{F_0}; F_0),  E_{F_t} w_{F_t}² → E_{F_0} w_{F_0}²,

as t → 0+, with w_F ≡ u_F − ū_F, and also satisfy the differentiability condition

(6.5)  lim_{t→0+} E_{F_0}(ū_{F_t} − ū_{F_0})/t = E_{F_0} φ(X)ρ(X)

for certain φ(X) ≡ φ(X; F_0) ∈ L_2(F_0). The usual smoothness condition for µ(F), see [3], pages 57–58, is that, for a certain influence function ψ(X) ≡ ψ(X; F_0) ∈ L_2(F_0),

(6.6)  lim_{t→0+} {µ(F_t) − µ(F_0)}/t = E_{F_0} ψ(X)ρ(X).

6.3. Regular estimators. An estimator µ̃_n ≡ µ̃_n(X_1, …, X_n) of µ(F) is (locally asymptotically) regular at F_0 if there exists a random variable ζ_0 such that

(6.7)  lim_{n→∞} L(n^{1/2}{µ̃_n − µ(F_{c/√n})}; F_{c/√n}) = L(ζ_0; F_0)

for all c > 0 and {F_t} ∈ C ([3], page 21). Likewise, for the estimation of the sum S_n(F) in (1.2), we say that an estimator S̃_n ≡ S̃_n(X_1, …, X_n) is regular at F_0 if there exists a random variable ξ_0 such that, for all c > 0 and {F_t} ∈ C,

(6.8)  lim_{n→∞} L(n^{−1/2}{S̃_n − S_n(F_{c/√n})}; F_{c/√n}) = L(ξ_0; F_0).

6.4. Efficient influence functions and information bounds. Let ψ_* be the projection of the ψ in (6.6) to the tangent space H_* in (6.2). The standard convolution theorem ([3], page 63) asserts that, for a certain variable ζ_0′,

L(ζ_0; F_0) = N(0, Eψ_*²(X)) ⋆ L(ζ_0′; F_0)

for the ζ_0 in (6.7), and that efficient estimators are characterized by (1.4). For h ∈ L_2(F_0), let A_n(h) ≡ Σ_{j=1}^n h(X_j, θ_j)/n and Z_n(h) ≡ √n{A_n(h) − E_{F_0} h}.

Theorem 6.1. Suppose (6.3), (6.4) and (6.5) hold at F_0. Let φ_{*,0} be the projection of the φ in (6.5) into the tangent space H_* in (6.2), and let φ_* ≡ ū_{F_0} − µ(F_0) + φ_{*,0}.

(i) If (6.8) holds, then Var_{F_0}(ξ_0) ≥ Var_{F_0}(φ_* − u_{F_0}). Moreover, the lower bound is reached without bias, that is, E_{F_0}ξ_0² = Var_{F_0}(φ_* − u_{F_0}), iff (1.5) holds.

(ii) If (6.8) holds and the L_2(F_0) closure C̄_* of C_* in (6.2) is convex, then there exist a random variable ξ̃_0 and certain normal variables Z(h) ∼ N(0, Var_{F_0}(h)) such that

L( (√n{S̃_n/n − A_n(φ_*) − µ(F_0)}, Z_n(ū_{F_0} + h − u_{F_0})); F_0 ) →_D L( (ξ̃_0, Z(ū_{F_0} + h − u_{F_0})); F_0 )

and ξ̃_0 is independent of Z(ū_{F_0} + h − u_{F_0}) for all h ∈ H_*. In particular, for h = φ_{*,0},

L(ξ_0; F_0) = L(Z(φ_* − u_{F_0}); F_0) ⋆ L(ξ̃_0; F_0).

(iii) Suppose E_{F_t} ū²(X; F_t) is bounded for all {F_t} ∈ C. Then ψ_* = φ_{*,0} + ū_* is the efficient influence function for the estimation of µ(F), that is, (6.6) holds with ψ = ψ_*, where ū_* is the projection of ū_{F_0} to H_*. Consequently, (1.6) holds.

Remark 6.1. Based on Theorem 6.1(i) and (ii), Ŝ_n is said to be locally asymptotically efficient if (1.5) holds. Note that in Theorem 6.1(ii), ξ̃_0 = 0 iff (1.5) holds.

Remark 6.2. In the proof of Theorem 6.1(iii), we show that (6.5) and (6.6) are equivalent under the condition that E_{F_t} ū²(X; F_t) = O(1) for all {F_t} ∈ C.

Remark 6.3. For the estimation of µ(F), that is, u(x, ϑ, F) ≡ µ(F)
as a special case of Theorem 6.1(ii), a standard proof of the convolution
theorem uses analytic continuation along lines passing through the origin in
the tangent space, and as a result, C̄∗ is often assumed to be a linear space.
In the proof of Theorem 6.1(ii), analytic continuation is used along arbitrary
lines across C̄∗, so that only the convexity of C̄∗ is needed, as in [31], pages
366–367. Rieder [20] showed that, in the case of convex C̄∗, the projections
of scores to C̄∗ (not to H∗) are useful in the context of one-sided confidence.

6.5. Finite-dimensional models. Let F = {Fτ, τ ∈ T} with an open Euclidean
parameter space T. We shall extend the results in Section 2.1 to
general sums (1.2). Suppose dF_τ^X = f_τ^X dν exists and is differentiable in the
sense of (6.1), that is,

(6.9) ∫ (f_{τ+∆}^{1/2} − f_τ^{1/2} − ∆^t ρτ)² dν = o(‖∆‖²), τ ∈ T.

Let Eτ ≡ EFτ, Iτ ≡ Covτ(ρτ(X)), uτ ≡ u(X, θ; Fτ) and ūτ ≡ ū(X; Fτ).

Theorem 6.2. (i) Suppose (6.9) holds, Iτ is of full rank, L(uτ; Fτ) is
continuous in τ in the weak topology, Eτ ū²τ is continuous, Eτ{ūτ+∆ − ūτ}² →
0 as ∆ → 0, Eτ u²τ is locally bounded, and µ′(τ) exists. Then (2.4) gives the
efficient influence function for the estimation of (1.2) with γτ = µ′(τ) −
Eτ ūτ ρτ, and (1.5) and (1.6) hold.

(ii) Suppose (2.6), (2.7) and the conditions of (i) hold. Then (2.8) holds for
the plug-in estimator (2.5) with the γτ in (i). In particular, (2.5) is asymptotically
efficient under Pτ iff γτ κτ = γτ I_τ^{−1} ρτ.

Remark 6.4. Comparing Theorem 6.2 with Theorems 2.1 and 2.2, we
see that (6.9) is weaker than (2.2) and (1.2) is more general than (1.1), while
stronger conditions are imposed on uτ in Theorem 6.2.

7. Proofs. We prove Theorems 6.1, 2.1, 2.2, 6.2, and 2.3–2.5 in this
section.

Lemma 7.1. Suppose (2.2) holds. Let (X, θ) ∼ Ft under P_{τ+at} and ρ =
a^t ρτ for a vector a, where ρτ is as in (2.3). Then (6.1) holds with PF0 = Pτ.

Proof. Let gt ≡ g_{τ+at} and ∆ = at. The lemma follows from the expansion

(f_t^{1/2} − 1)/t − ρ/2 = (f_t^{1/2} + 1)^{−1} E0[{(g_t^{1/2} − 1)/t}(g_t^{1/2} + 1) | X = x] − E0[a^t ρ̃τ/2 | X = x].

The uniform integrability of the square of the right-hand side (i.e., the first
term) under f0(x) follows from the inequality E0[gt|X] ≤ ft(X)I{f0(X) >
0}. We omit the details. 
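The algebra behind this expansion is brief and may be worth writing out. The steps below assume, as the surrounding mixture setup suggests, that f_t(x) = E_0[g_t | X = x] and that ρ_τ(x) = E_0[ρ̃_τ | X = x]; both identities are recalled from the earlier sections, not proved here.

```latex
% Identity behind the expansion in the proof of Lemma 7.1,
% assuming f_t(x) = E_0[g_t | X = x]:
\frac{f_t^{1/2}-1}{t}
  \;=\; \frac{f_t-1}{t\,(f_t^{1/2}+1)}
  \;=\; \frac{1}{f_t^{1/2}+1}\,
        E_0\!\left[\left.\frac{g_t-1}{t}\,\right|\,X=x\right]
  \;=\; \frac{1}{f_t^{1/2}+1}\,
        E_0\!\left[\left.\frac{g_t^{1/2}-1}{t}\,\bigl(g_t^{1/2}+1\bigr)\,\right|\,X=x\right].
```

Subtracting ρ/2 = E0[a^t ρ̃τ/2 | X = x] gives the displayed expansion. As t → 0, (g_t^{1/2} − 1)/t → a^t ρ̃τ/2 by (2.2), while g_t^{1/2} + 1 and f_t^{1/2} + 1 both tend to 2, so the two terms cancel in the limit.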

Lemma 7.2. Suppose (6.1) holds and X ∼ F_t^X under Pt, 0 ≤ t ≤ 1. Let
µt ≡ Et ht(X) for a certain Borel function ht. If Et h²t(X) = O(1) and ht → h0 in
L2(P0), then

µt − µ0 = E0{ht(X) − h0(X)} + tE0 ρ(X)h0(X) + o(t) as t → 0.

Proof. Let Bt be the support sets of dPt(X) − ft(X) dP0(X). By (6.1)
and the boundedness of Et h²t, Et ht − E0 ft ht = Et ht I_{Bt} = O(1)(Et h²t)^{1/2} ×
P_t^{1/2}(Bt) = o(t). Thus,

(7.1) µt − µ0 = Et ht − E0 h0 = E0(ft − 1)ht + E0(ht − h0) + o(t)

as t → 0+. Since (f_t^{1/2} − 1)/t → ρ/2 in L2(P0) and E0{(f_t^{1/2} + 1)ht}² = O(1),

E0(ft − 1)ht/t = E0[t^{−1}(f_t^{1/2} − 1)(f_t^{1/2} + 1)ht] → E0 h0 ρ.

This and (7.1) complete the proof. 

Proof of Theorem 6.1. Let Fn ≡ F_{c/√n}, ξn ≡ √n{S̃n/n − Sn(Fn)/n},
ξn′ ≡ √n{S̃n/n − An(ūFn)}, ξn′′ ≡ √n An(wFn) and Z′′ = Z(wF0). Then ξn =
ξn′ + ξn′′, and ξn′ depends on {Xj} only. By (6.4), w²Fn under PFn are uniformly
integrable and L(wFn; Fn) →D L(wF0; F0) as n → ∞. Thus, by the Lindeberg
central limit theorem and the weak law of large numbers,

(7.2) EFn[exp(itξn′′)|{Xj}] → EF0 exp(itZ′′)

in probability for all t. Since ξn′ depends on {Xj} only, this and (6.8) imply

EFn exp(itξn′) E exp(itZ′′) = EFn exp(itξn′) exp(itξn′′) + o(1) → EF0 exp(itξ0).

Thus, since E exp(itZ′′) ≠ 0 for all t,

(7.3) L(n^{−1/2}{S̃n − Σ_{j=1}^n ū(Xj; F_{c/√n})}; F_{c/√n}) = L(ξn′; Fn) →D L(ξ0′; F0)

for a certain variable ξ0′ independent of c > 0 and the curve {Ft} ∈ C.
Define ξ′n,0 ≡ √n{S̃n/n − An(ūF0)}. By (6.3) and (6.5), ξ′n,0 − ξn′ = √n An ×
(ūFn − ūF0) = √n EF0(ūFn − ūF0) + oP(1) → cEφ(X)ρ(X) in probability under
PF0. Thus, as in [3], pages 24–26, by (7.3) and the LAN from (6.1) and (6.2),

(7.4) EF0 exp(itξ0′ + zZ(ρ)) = exp[itz EF0 φρ + z² EF0 ρ²/2] EF0 exp(itξ0′)

for all ρ ∈ C∗ and complex z. Here the Z(h) are constructed so that (ξ′n,0, Zn(h))
converges jointly in distribution to (ξ0′, Z(h)) for all h ∈ L2(F0). Differentiating
(7.4) in t at t = 0 and then in z at z = 0, we find

(7.5) EF0 ξ0′ Z(h) = EF0 φ(X)h(X) = EF0 Z(φ∗,0)Z(h)
for all scores h = ρ, ρ ∈ C∗, and then for all h ∈ H∗ by (6.2). Since φ∗,0 ∈ H∗,
ξ0′ − Z(φ∗,0) and Z(φ∗,0) are orthogonal in L2(F0). This proves (i), since
ξ0′ and Z(φ∗,0) are both independent of Z′′ by (7.2) and Z(φ∗,0) + Z′′ =
Z(φ∗ − uF0).
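The two differentiations that take (7.4) to (7.5) are routine but may be worth sketching; Λ(t, z) below is a temporary label, not notation from the paper.

```latex
% Sketch of the derivation of (7.5) from (7.4):
\Lambda(t,z) \;\equiv\; E_{F_0}\exp\{it\xi_0' + zZ(\rho)\}
  \;=\; \exp\{itz\,E_{F_0}\phi\rho + z^2 E_{F_0}\rho^2/2\}\,E_{F_0}\exp(it\xi_0').
% Differentiating in t at t = 0:
\partial_t\Lambda(0,z) \;=\; E_{F_0}\bigl[i\xi_0'\,e^{zZ(\rho)}\bigr]
  \;=\; \bigl(iz\,E_{F_0}\phi\rho\bigr)e^{z^2E_{F_0}\rho^2/2}
        \;+\; e^{z^2E_{F_0}\rho^2/2}\,E_{F_0}\bigl[i\xi_0'\bigr],
% then in z at z = 0, using E\,Z(\rho) = 0:
E_{F_0}\bigl[i\xi_0'\,Z(\rho)\bigr] \;=\; i\,E_{F_0}\phi\rho,
\qquad\text{i.e.}\qquad
E_{F_0}\,\xi_0' Z(\rho) \;=\; E_{F_0}\,\phi\rho .
```

The second equality in (7.5) then holds for h ∈ H∗ because φ∗,0 is the projection of φ onto H∗ and EF0 Z(g)Z(h) = CovF0(g, h).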
Now, suppose C̄∗ is convex in L2(F0). By continuity extension, (7.4) holds
for all ρ ∈ C̄∗ and complex z. Let ρj ∈ C̄∗. Since (7.4) holds for ρ = sρ1 + (1 −
s)ρ2, 0 ≤ s ≤ 1, with both sides being analytic in s, by analytic continuation
it holds for ρ = sρ1 + (1 − s)ρ2 for all real s. Thus, (7.4) holds for all complex z
and

(7.6) ρ ∈ H0 ≡ {sρ1 + (1 − s)ρ2 : ρj ∈ C̄∗, −∞ < s < ∞}.

Let H̃ be the linear span of a set of finitely many members of C̄∗. Let ρ1
be a fixed interior point of H̃ ∩ C̄∗ and ρ2 ∈ H̃ with ‖ρ2 − ρ1‖ = δ0. For
sufficiently small δ0 > 0, ρ2 ∈ C̄∗ for all such ρ2, so that H̃ ⊆ H0. Thus, H0 is
a linear space and H∗ is the closure of H0. It follows that (7.4) holds for all
ρ ∈ H∗ and complex z. As in [3], pages 25–26, this implies the independence
of ξ0′ − Z(φ∗,0) and {Z(h) : h ∈ H∗}. Since {ξ0′, Z(h), h ∈ H∗} is independent
of Z′′ = Z(ūF0 − uF0) by (7.2), the conclusions of part (ii) hold with ξ̃0 =
ξ0′ − Z(φ∗,0).
The proof of part (iii) follows easily from Lemma 7.2 with ht = ūFt, which
gives

{µ(Ft) − µ(F0)}/t − EF0{ūFt − ūF0}/t → EF0 ūF0 ρ = EF0 u∗ρ.

It follows that (6.5) and (6.6) are equivalent under EFt ū²(X; Ft) = O(1),
with ψ = ψ∗ = u∗ + φ∗,0, by (1.6) and the definition of φ∗. The proof is
complete. 

Proof of Theorem 2.1. The proof is similar to that of Theorem 6.1(i),
so we omit certain details. By (2.2), ξ0 is independent of Z(ρ̃τ) under Pτ.
Since Eτ u² < ∞, (7.2) holds for fixed Fn = Fτ, so that ξ0 = ξ0′ + Z(ūτ − u)
as a sum of independent variables. Let Z(hτ) be the projection of ξ0′ to
{Z(h), h ∈ L2(Fτ)} in L2(Pτ) and vτ = hτ + ūτ. Then Varτ(ξ0) ≥ Eτ(vτ − u)²
and Eτ(vτ − u)ρ̃τ = 0. Since ξ0′ is the limit of variables dependent on {Xj}
only, hτ and vτ depend on X only.

Since Eτ u²gτ,∆(θ) ≤ Eτ+∆ u² = O(1), by (2.2) and Lemma 7.2 with ht =
h0 = u(x, ϑ), µτ+∆ − µτ ≈ ∆^t Eτ uρ̃τ = ∆^t Eτ ψ∗,τ(X)ρτ(X), where ψ∗,τ ≡
ρ^t_τ I_τ^{−1} Eτ uρ̃τ. It follows that 0 = Eτ(vτ − u)ρ̃τ = Eτ(vτ ρ̃τ − ψ∗,τ ρτ) = Eτ(vτ −
ψ∗,τ)ρτ. Thus, Eτ(vτ − ūτ)ρτ = Eτ(ψ∗,τ − u∗,τ)ρτ with u∗,τ ≡ ρ^t_τ I_τ^{−1} Eτ ūτ ρτ.
Since ψ∗,τ − u∗,τ is linear in ρτ, Z(vτ − ūτ − (ψ∗,τ − u∗,τ)) is independent
of Z(ψ∗,τ − u∗,τ). Thus, Varτ(vτ − ūτ) ≥ Varτ(ψ∗,τ − u∗,τ) and Varτ(ξ0) ≥
Varτ(vτ − ūτ) + Varτ(ūτ − u) ≥ Varτ(φ∗,τ − u) by (2.4). The proof is complete.


Proofs of Theorems 2.2 and 6.2. Theorem 6.2(i) follows from Theorem
6.1 and Remark 6.2. Let µ(t; τ) = Eτ ūt(X). By Lemma 7.2, µ′ = Eτ uρ̃τ
in Theorem 2.2 and γτ = (∂/∂t)µ(τ; τ) in both theorems. Simple expansion
of (2.5) via (2.7) yields

Ŝn/n = An(ūτ) + {µ(τ̂n; τ) − µ(τ; τ)} + oPτ(n^{−1/2})
     = An(ūτ + γτ κτ) + oPτ(n^{−1/2}),

which implies (2.8). Note that γτ(κτ − κ∗,τ) is orthogonal to ūτ − uτ + γτ κ∗,τ.
The proof is complete. 
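The orthogonality noted at the end is what converts (2.8) into the efficiency criterion of Theorem 6.2(ii). Spelled out, with κ∗,τ the efficient component from Section 2 (its definition is not repeated in this section), the variance of the influence function splits as

```latex
% Orthogonal decomposition underlying the efficiency criterion:
\operatorname{Var}_\tau\bigl(\bar u_\tau + \gamma_\tau\kappa_\tau - u_\tau\bigr)
 \;=\; \operatorname{Var}_\tau\bigl(\bar u_\tau + \gamma_\tau\kappa_{*,\tau} - u_\tau\bigr)
 \;+\; \operatorname{Var}_\tau\bigl(\gamma_\tau(\kappa_\tau - \kappa_{*,\tau})\bigr),
```

so the plug-in estimator (2.5) attains the information bound iff γτ κτ = γτ κ∗,τ, which matches the condition γτ κτ = γτ I_τ^{−1} ρτ in Theorem 6.2(ii) if κ∗,τ = I_τ^{−1} ρτ, as that statement indicates.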

Proofs of Theorems 2.3, 2.4 and 2.5. Let Gt ≡ (1 − t)G0 + tG,
ft ≡ fGt and Et ≡ EGt, t > 0. By (2.9), (6.1) holds with ρ = fG/f0 − 1. Since
EG u² < ∞, u² are uniformly integrable under Pt, so that (6.4) holds. Since
f0/ft ≤ 1/(1 − t), {u²t, 0 ≤ t ≤ 1/2} are uniformly integrable under E0, so
that (6.3) holds. Moreover,

(7.7) t^{−1} E0{ut − u0} = E0[(uG − u0)(fG/ft)] → E0[(uG − u0)(fG/f0)].
Suppose there exists a regular estimator of (1.1). Let ξ0′ be as in (7.5) and
let Z(v − u0) be the projection of ξ0′ to {Z(h), h ∈ L2(f0)} as in the proof of
Theorem 2.1. It follows from (7.7) and the argument leading to (7.5) that

E0[(v − u0)(fG/f0 − 1)] = E0 Z(v − u0)Z(ρ) = E0[(uG − u0)(fG/f0)],

which implies EG v − E0 v + E0 u = EG u. Since ξ0′ does not depend on the
choice of G ∈ GG0, v ∈ VG0. By the Lindeberg central limit theorem, EG0 v² <
∞ and v ∈ VG0 imply L(Zn(v − u); P_{c/√n}) → L(Z(v − u); P0), so that Vn in
(2.15) is regular at G0 for all v ∈ VG0. If v is a limit point of VG0 in L2(f0),
Vn is also a regular estimator of Sn at P0, so that VG0 is closed in L2(f0).
This completes the proof of Theorem 2.3.
The proof of Theorem 2.4 is similar to those of Theorems 2.2 and 6.2
but simpler. We note that EG0 (vG − vG0 ) = 0. Finally, Theorem 2.5 follows
from the fact that VG contains a single function v due to the completeness
of exponential families. The proofs are complete. 
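The step from the display to the identity EG v − E0 v + E0 u = EG u is a change of measure; a sketch, using E0[h·fG/f0] = EG h (valid when fG vanishes wherever f0 does, as the mixture setup requires) and writing u0, uG for the conditional means of u under G0, G, so that E0 u0 = E0 u and EG uG = EG u:

```latex
% Change of measure applied to each side of the display:
E_0\bigl[(v-u_0)(f_G/f_0-1)\bigr] \;=\; E_G(v-u_0) - E_0(v-u_0),
\qquad
E_0\bigl[(u_G-u_0)\,f_G/f_0\bigr] \;=\; E_G(u_G-u_0).
% Equating the two and cancelling the common -E_G u_0 term:
E_G v - E_0 v + E_0 u_0 \;=\; E_G u_G,
\quad\text{i.e.}\quad
E_G v - E_0 v + E_0 u \;=\; E_G u .
```

This is exactly the unbiasedness relation defining the class VG0 of "u, v" estimating functions.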

REFERENCES
[1] Benedetti, R. and Franconi, L. (1998). Statistical and technological solutions
for controlled data dissemination. In Pre-proceedings of New Techniques and
Technologies for Statistics, Sorrento 1 225–232.
[2] Bethlehem, J., Keller, W. and Pannekoek, J. (1990). Disclosure control of mi-
crodata. J. Amer. Statist. Assoc. 85 38–45.
[3] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Effi-
cient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ.
Press, Baltimore. MR1245941
[4] Bunge, J. and Fitzpatrick, M. (1993). Estimating the number of species: A review.
J. Amer. Statist. Assoc. 88 364–373.
[5] Chao, A. (1984). Nonparametric estimation of the number of classes in a population.
Scand. J. Statist. 11 265–270. MR0793175
[6] Chao, A. and Bunge, J. (2002). Estimating the number of species in a stochastic
abundance model. Biometrics 58 531–539. MR1925550
[7] Clauset, A. and Moore, C. (2003). Traceroute sampling makes random graphs
appear to have power law degree. Preprint.
[8] Coates, A., Hero, A., Nowak, R. and Yu, B. (2002). Internet tomography. IEEE
Signal Processing Magazine 19(3) 47–65.
[9] Darroch, J. N. and Ratcliff, D. (1980). A note on capture–recapture estimation.
Biometrics 36 149–153. MR0672144
[10] Duncan, G. T. and Pearson, R. W. (1991). Enhancing access to microdata while
protecting confidentiality: Prospects for the future (with discussion). Statist.
Sci. 6 219–239.
[11] Engen, S. (1974). On species frequency models. Biometrika 61 263–270. MR0373217
[12] Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On power-law relation-
ships of the Internet topology. In Proc. ACM SIGCOMM 1999 251–262. ACM
Press, New York.
[13] Fisher, R. A., Corbet, A. S. and Williams, C. B. (1943). The relation between
the number of species and the number of individuals in a random sample of an
animal population. J. Animal Ecology 12 42–58.
[14] Good, I. J. (1953). The population frequencies of species and the estimation of
population parameters. Biometrika 40 237–264. MR0061330
[15] Govindan, R. and Tangmunarunkit, H. (2000). Heuristics for Internet map dis-
covery. In Proc. IEEE INFOCOM 2000 3 1371–1380. IEEE Press, New York.
[16] Lakhina, A., Byers, J., Crovella, M. and Xie, P. (2003). Sampling biases in
IP topology measurements. In Proc. IEEE INFOCOM 2003 1 332–341. IEEE
Press, New York.
[17] Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1982). Contributions to a
General Asymptotic Statistical Theory. Lecture Notes in Statist. 13. Springer,
New York. MR0675954
[18] Polettini, S. and Seri, G. (2003). Guidelines for the protection of so-
cial micro-data using individual risk methodology. Application within µ-
Argus version 3.2, CASC Project Deliverable No. 1.2-D3. Available at
neon.vb.cbs.nl/casc/deliv/12D3_guidelines.pdf.
[19] Rao, C. R. (1971). Some comments on the logarithmic series distribution in the
analysis of insect trap data. In Statistical Ecology (G. P. Patil, E. C. Pielou and
W. E. Waters, eds.) 1 131–142. Pennsylvania State Univ. Press, University Park.
MR0375600
[20] Rieder, H. (2000). One-sided confidence about
functionals over tangent cones. Available at
www.uni-bayreuth.de/departments/math/org/mathe7/RIEDER/pubs/cc.pdf.
[21] Rinott, Y. (2003). On models for statistical disclosure risk esti-
mation. Working paper no. 16, Joint ECE/Eurostat Work Ses-
sion on Data Confidentiality, Luxemburg, 2003. Available at
www.unece.org/stats/documents/2003/04/confidentiality/wp.16.e.pdf.
[22] Robbins, H. (1977). Prediction and estimation for the compound Poisson distribu-
tion. Proc. Natl. Acad. Sci. U.S.A. 74 2670–2671. MR0451479
[23] Robbins, H. (1980). An empirical Bayes estimation problem. Proc. Natl. Acad. Sci.
U.S.A. 77 6988–6989. MR0603064
[24] Robbins, H. (1988). The u, v method of estimation. In Statistical Decision Theory
and Related Topics IV (S. S. Gupta and J. O. Berger, eds.) 1 265–270. Springer,
New York. MR0927106
[25] Robbins, H. and Zhang, C.-H. (1988). Estimating a treatment effect under biased
sampling. Proc. Natl. Acad. Sci. U.S.A. 85 3670–3672. MR0946190
[26] Robbins, H. and Zhang, C.-H. (1989). Estimating the superiority of a drug to a
placebo when all and only those patients at risk are treated with the drug. Proc.
Natl. Acad. Sci. U.S.A. 86 3003–3005. MR0995401
[27] Robbins, H. and Zhang, C.-H. (1991). Estimating a multiplicative treatment effect
under biased allocation. Biometrika 78 349–354. MR1131168
[28] Robbins, H. and Zhang, C.-H. (2000). Efficiency of the u, v method of estimation.
Proc. Natl. Acad. Sci. U.S.A. 97 12,976–12,979. MR1795617
[29] Sampford, M. R. (1955). The truncated negative binomial distribution. Biometrika
42 58–69. MR0072401
[30] Spring, N., Mahajan, R. and Wetherall, D. (2002). Measuring ISP topologies
with rocketfuel. In Proc. ACM SIGCOMM 2002 133–145. ACM Press, New
York.
[31] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press.
MR1652247
[32] Vardi, Y. (1996). Network tomography: Estimating source-destination traffic inten-
sities from link data. J. Amer. Statist. Assoc. 91 365–377. MR1394093

Department of Statistics
Rutgers University
Hill Center
Busch Campus
Piscataway, New Jersey 08854-8019
USA
e-mail: czhang@stat.rutgers.edu
