Consistencyofsumvar
By Cun-Hui Zhang
Rutgers University
This paper concerns the estimation of sums of functions of ob-
servable and unobservable variables. Lower bounds for the asymptotic
variance and a convolution theorem are derived in general finite- and
infinite-dimensional models. An explicit relationship is established
between efficient influence functions for the estimation of sums of
variables and the estimation of their means. Certain “plug-in” esti-
mators are proved to be asymptotically efficient in finite-dimensional
models, while “u, v” estimators of Robbins are proved to be effi-
cient in infinite-dimensional mixture models. Examples include cer-
tain species, network and data confidentiality problems.
where ψ∗ (x) ≡ ψ∗ (x; F0 ) is the efficient influence function at F0 for the esti-
mation of µ(F ). In Section 6 we show that, under mild regularity conditions
on the utility functions {u(x, ϑ; F ), F ∈ F}, an estimator $\hat S_n$ of (1.2) is
(locally) asymptotically efficient in contiguous neighborhoods of $P_{F_0}$ iff
(1.5) $\frac{\hat S_n}{n} = \mu(F_0) + \frac{1}{n}\sum_{j=1}^{n} \phi_*(X_j) + o_{P_{F_0}}(n^{-1/2}),$
ESTIMATING SUMS OF RANDOM VARIABLES 3
where φ∗ (x) ≡ φ∗ (x; F0 ) is the efficient influence function at F0 for the esti-
mation of Sn (F ). Furthermore, the following relationship holds between the
two efficient influence functions in (1.4) and (1.5):
(1.6) φ∗ (x) = ψ∗ (x) + u(x; F0 ) − µ(F0 ) − u∗ (x),
where u(x; F ) ≡ EF [u(X, θ; F )|X = x] and u∗ (x) ≡ u∗ (x; F0 ) is the projec-
tion of u(x; F0 ) to the tangent space of the family of distributions {F X , F ∈
F} at F0X . Here F X is the marginal distribution of X under the joint distri-
bution F of (X, θ). It follows clearly from (1.6) that asymptotically efficient
estimations of Sn (F )/n and µ(F ) are equivalent in contiguous neighbor-
hoods of PF0 iff u(·; F0 ) − µ(F0 ) is in the tangent space, that is, u(·; F0 ) −
µ(F0 ) = u∗ (·; F0 ).
We will derive more explicit results in finite-dimensional models and
infinite-dimensional mixture models. In finite-dimensional models
$\mathcal F = \{F_\tau, \tau \in \mathcal T\}$ with a Euclidean τ, it will be shown that
“plug-in” estimators of the form $\sum_{j=1}^{n} u(X_j; F_{\hat\tau_n})$ are asymptotically
efficient for the estimation of (1.2) if $\hat\tau_n$ is an efficient estimator of τ.
In infinite-dimensional mixture models, certain “u, v” estimators of
Robbins [24] will be shown to be efficient for the estimation of (1.1).
We shall consider estimation of (1.1) with known f (x|ϑ) in Section 2 and
provide the general theory in Section 6. Section 7 contains proofs of all
theorems.
Remark 2.1. Since $\kappa_{*,\tau} \equiv I_\tau^{-1}\rho_\tau$ is the efficient influence function for
the estimation of τ and $\partial\mu_\tau/\partial\tau = E_\tau u(X,\theta)\tilde\rho_\tau(\theta)$,
$\psi_{*,\tau} \equiv \rho_\tau^t I_\tau^{-1} E_\tau u(X,\theta)\tilde\rho_\tau(\theta)$
is the efficient influence function for the estimation of $\mu_\tau$. Moreover,
$u_{*,\tau} \equiv \rho_\tau^t I_\tau^{-1} E_\tau u_\tau(X)\rho_\tau(X)$ is the projection of $u_\tau$ to the
tangent space generated by the scores $\rho_\tau(X)$ under $E_\tau$. Thus, Theorem 2.1
asserts that (1.5) and (1.6) hold under (2.2).
Our next theorem provides the asymptotic theory for plug-in estimators
(2.5) $\hat S_n \equiv \sum_{j=1}^{n} u_{\hat\tau_n}(X_j)$
Remark 2.2. It follows from (2.8) that $|\hat S_n - S_n| \le 1.96\,\sigma_{\hat\tau_n} n^{1/2}$
provides an approximate 95% confidence interval for (1.1), provided that
$\sigma_\tau$ is continuous in τ.
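The interval in Remark 2.2 can be computed directly once the plug-in estimate and the estimated standard deviation are available. The following is a minimal sketch; the function name and the numerical inputs are illustrative assumptions, not values from the paper.

```python
import math

def sum_confidence_interval(s_hat, sigma_hat, n, z=1.96):
    """Return the interval {s : |s_hat - s| <= z * sigma_hat * sqrt(n)}.

    s_hat     : plug-in estimate of the sum S_n
    sigma_hat : estimated sigma evaluated at the estimated parameter
    n         : number of observations
    z         : normal quantile (1.96 for an approximate 95% interval)
    """
    half_width = z * sigma_hat * math.sqrt(n)
    return s_hat - half_width, s_hat + half_width

# Hypothetical inputs, chosen only to illustrate the arithmetic.
lower, upper = sum_confidence_interval(s_hat=120.0, sigma_hat=0.8, n=400)
```

Note that the half-width grows like n^{1/2} because the target is the sum S_n rather than the average S_n/n.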
(ii) If VG0 is empty, then there does not exist any regular n−1/2 -consistent
estimator of EG u(X, θ) or Sn /n in contiguous neighborhoods of EG0 .
It follows from Theorem 2.2 that the plug-in estimators in (2.17) are
asymptotically efficient for both $S_n'$ and $S_n''$. For α = β = 0, (2.17) gives
the plug-in estimator corresponding to the maximum likelihood estimator
(MLE) of τ. For general positive α and β, (2.17) gives the Bayes estimator
of $S_n'$ and $S_n''$ with a beta prior on τ/(1 + τ). Clearly,
$\hat\mu_n \equiv \sum_{x=1}^{\infty} \{\hat\tau_n x u(x-1)\}/(1+\hat\tau_n)^{x+1}$
is efficient for the estimation of the mean $\mu_\tau \equiv E_\tau u(X,\theta)$, but not for
$S_n'/n$ or $S_n''/n$. Similar results can be obtained for λ with the gamma
distribution; see [23].
In the case of completely unknown G(dλ), the “u, v” estimator (2.15) with
v(x) = xu(x − 1) is asymptotically efficient for the estimation of $S_n'$ and
$S_n''$ for all G with finite $E_G\{v(X) - \lambda u(X)\}^2$.
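The choice v(x) = xu(x − 1) rests on an unbiasedness identity. Assuming that X given λ is Poisson(λ) (the compound Poisson setting, which is not restated in this excerpt), one has E[λu(X) | λ] = E[Xu(X − 1) | λ]. The sketch below checks this numerically with a truncated series; the function names, the test function u and the value of λ are illustrative assumptions.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), computed on the log scale."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def both_sides(u, lam, x_max=200):
    """Return (E[lam * u(X)], E[X * u(X - 1)]) using sums truncated at x_max."""
    lhs = sum(lam * u(x) * poisson_pmf(x, lam) for x in range(x_max))
    rhs = sum(x * u(x - 1) * poisson_pmf(x, lam) for x in range(1, x_max))
    return lhs, rhs

# Hypothetical choice u(x) = I{x = 0}; then v(x) = x * u(x - 1) = I{x = 1}.
u = lambda x: 1.0 if x == 0 else 0.0
lhs, rhs = both_sides(u, lam=2.5)
```

With this u, both sides reduce to λe^{−λ}: the unobservable quantity λI{X = 0} has the same conditional mean as the observable indicator I{X = 1}.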
Example 2.3. $\hat S_n \equiv 0$ is efficient for the estimation of $S_n(\tau) \equiv \sum_{j=1}^{n} \rho_\tau(X_j)$.
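where $\phi_{*,\tau}(x) \equiv I\{x>0\}/P_\tau(X>0) - 1 - \kappa_{*,\tau}^t(x)\gamma_\tau$. In this case, as d → ∞,
(3.6) $\frac{\hat d - d}{d^{1/2}} \xrightarrow{D} N\Bigl(0,\ \frac{P_\tau(X=0)}{P_\tau(X>0)} + \gamma_\tau^t\{\mathrm{Cov}_\tau(\rho_\tau(X))\}^{-1}\gamma_\tau\Bigr).$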
For the gamma $G(dy;\tau) \propto y^{\alpha-1}\exp(-y/\beta)\,dy$, the MLE $\hat\tau \equiv (\hat\alpha, \hat\beta)$ satisfies
(3.7) $\sum_{k=1}^{\infty} \frac{\sum_{\ell=k}^{\infty} n_\ell}{\hat\alpha + k - 1} = \frac{\tilde d\,\log(1+\hat\beta)}{1-(1+\hat\beta)^{-\hat\alpha}}, \qquad \frac{\tilde d\,\hat\alpha\hat\beta}{1-(1+\hat\beta)^{-\hat\alpha}} = N,$
with $\tilde d = \sum_{k=1}^{\infty} n_k$, and (3.4) holds [29]. Rao [19] called (3.3) with (3.7)
the pseudo MLE in a different (gamma) model, but the efficiency of $\hat d$ was
not clear [11].
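The two equations in (3.7) can be read as the score equations of a zero-truncated negative binomial likelihood. The sketch below evaluates their residuals at a trial (α, β). It assumes that n_k is the number of species observed exactly k times, that N = Σ k n_k, and that the negative binomial is parametrized as P(X = x) ∝ Γ(x + α)β^x/{x!(1 + β)^{x+α}}; the function names and the data are hypothetical.

```python
import math

def truncated_nb_loglik(alpha, beta, counts):
    """Zero-truncated negative binomial log-likelihood; counts[k] = n_k."""
    d_tilde = sum(counts.values())
    ll = -d_tilde * math.log(1.0 - (1.0 + beta) ** (-alpha))
    for k, n_k in counts.items():
        ll += n_k * (math.lgamma(k + alpha) - math.lgamma(alpha)
                     - math.lgamma(k + 1)
                     + k * math.log(beta)
                     - (k + alpha) * math.log(1.0 + beta))
    return ll

def score_residuals(alpha, beta, counts):
    """Left side minus right side of each of the two equations in (3.7)."""
    d_tilde = sum(counts.values())
    n_total = sum(k * n_k for k, n_k in counts.items())
    trunc = 1.0 - (1.0 + beta) ** (-alpha)

    def tail(k):
        # Number of species observed at least k times.
        return sum(n_l for l, n_l in counts.items() if l >= k)

    r1 = (sum(tail(k) / (alpha + k - 1) for k in range(1, max(counts) + 1))
          - d_tilde * math.log(1.0 + beta) / trunc)
    r2 = d_tilde * alpha * beta / trunc - n_total
    return r1, r2

# Hypothetical frequency counts: 10 singletons, 4 doubletons, and so on.
counts = {1: 10, 2: 4, 3: 2, 5: 1}
r1, r2 = score_residuals(0.7, 1.3, counts)
```

Under these assumptions r1 equals the derivative of the log-likelihood in α, and r2 equals minus β(1 + β) times its derivative in β, so a root of both residuals is a stationary point of the truncated likelihood. Any two-dimensional root finder could be applied to `score_residuals` to obtain the MLE.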
The species problem is a special case of estimating (1.1) when d is viewed
as the number of species represented in the population out of a total of n
species. Specifically, letting pj = 0 if the jth species is not represented in the
population, estimating
(3.8) $d = \sum_{j=1}^{n} I\{p_j > 0\} = \sum_{j=1}^{n} I\{X_j = 0,\ p_j > 0\} + \sum_{k=1}^{N} n_k$
indexed by Borel h1 and h2 , where Wj are the observations from the jth SD
pair in the sample. We focus here on the estimation of node degrees, although
the approach based on (4.1) could be useful in other network problems.
The topology of a deterministic network can be described with a routing
table: a list r1 , . . . , rJ of directed paths representing connections between
pairs of source and destination nodes, with each path being composed of a
set of directed links. For example, the path 4 → 2 → 3 → 8 has source node
4, destination node 8, and links 4 → 2, 2 → 3 and 3 → 8. Consider a network
with nodes {1, . . . , K}. The link degree D(k, ℓ) is defined as the number of
paths using the link k → ℓ,
(4.2) D(k, ℓ) ≡ #{j ≤ J : link k → ℓ is used in rj },
with D(k, ℓ) = 0 if k → ℓ is nonexistent or never used. The node degree,
defined as
(4.3) $d_k = \sum_{\ell=1}^{K} I\{D(k,\ell) > 0\},$
is the number of outgoing links from k to other nodes. This is also called
the out-degree. The in-degree, $\sum_{\ell} I\{D(\ell,k) > 0\}$, is the number of incoming
links to k. The node degrees $d_k$ and their (empirical) distributions are
important characteristics of networks; see [12, 15, 30].
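The definitions (4.2) and (4.3) translate directly into code. The sketch below computes link degrees, out-degrees and in-degrees from a routing table; the function names and the small routing table (which contains the path 4 → 2 → 3 → 8 from the text) are illustrative assumptions.

```python
from collections import Counter

def link_degrees(routes):
    """D(k, l): number of paths in the routing table that use link k -> l."""
    degrees = Counter()
    for path in routes:
        # A path counts at most once for each link it uses, as in (4.2).
        for link in set(zip(path, path[1:])):
            degrees[link] += 1
    return degrees

def out_degree(degrees, k):
    """d_k in (4.3): number of links k -> l with D(k, l) > 0."""
    return sum(1 for (src, _dst), count in degrees.items()
               if src == k and count > 0)

def in_degree(degrees, k):
    """Number of links l -> k with D(l, k) > 0."""
    return sum(1 for (_src, dst), count in degrees.items()
               if dst == k and count > 0)

# Hypothetical routing table with three directed paths.
routes = [[4, 2, 3, 8], [4, 2, 5], [1, 2, 3]]
D = link_degrees(routes)
```

Here the link 4 → 2 is used by two paths, and node 2 has out-degree 2 (links to 3 and 5) and in-degree 2 (links from 4 and 1).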
where $\tilde d_k$ is the observed degree and $s_k$ is the unobserved degree given by
(4.6) $s_k \equiv \sum_{\ell=1}^{K} I\{X_{k\ell} = 0,\ D(k,\ell) > 0\}.$
Lakhina, Byers, Crovella and Xie [16] and Clauset and Moore [7] pointed
out that the observed degrees $\tilde d_k$ may grossly underestimate the true node
degree $d_k$.
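The split of a node degree into observed and unobserved parts can be sketched as follows. The code assumes that X_{kℓ} counts the sampled paths using link k → ℓ (so X_{kℓ} > 0 implies D(k, ℓ) > 0) and that the observed degree is the number of links with X_{kℓ} > 0; these readings, the function name and the data are assumptions, since the definitions are not restated in this excerpt.

```python
def degree_split(d_row, x_row):
    """Return (observed degree, unobserved degree s_k) for one node k.

    d_row[l] = D(k, l), the true link degree of k -> l
    x_row[l] = X_{kl}, the number of sampled paths using k -> l
    """
    observed = sum(1 for x in x_row.values() if x > 0)
    unobserved = sum(1 for l, d in d_row.items()
                     if d > 0 and x_row.get(l, 0) == 0)
    return observed, unobserved

# Hypothetical node with three true outgoing links, one of them sampled.
d_row = {2: 5, 3: 1, 7: 2, 9: 0}
x_row = {2: 3, 3: 0, 7: 0}
observed, unobserved = degree_split(d_row, x_row)
true_degree = sum(1 for d in d_row.values() if d > 0)
```

In this example the observed degree is 1 while the true degree is 3, which illustrates the underestimation noted above.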
It follows from (4.5), (4.6) and (3.8) that the problem of estimating the
node degree (4.3) is a species problem. From this point of view, we may di-
rectly use estimators in Section 3 and references therein, for example, (3.11).
However, in network problems, we are typically interested in simultaneous
estimation of many node degrees. Thus, information from {Xkℓ , ℓ ≤ K} can
be pooled from different nodes k. Let K ⊆ {1, . . . , K} be a collection of
“similar” and/or “independent” nodes. Let G be a family of distributions,
for example, gamma with unit scale. Suppose the G in (3.2) for different
nodes are identical to a member of G up to scale parameters βk . Then, as
in (3.10), the (pseudo) MLE for {dk , βk , k ∈ K, G} is given by
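$\hat d_k \equiv \frac{\sum_{j=1}^{N} n_{kj} \int_{y>0} \hat G(dy)}{\int (1 - e^{-\hat\beta_k y})\,\hat G(dy)},$
(4.7)
$(\hat\beta, \hat G) \equiv \arg\max_{\beta, G} \prod_{k\in\mathcal K}\prod_{j=1}^{N} \Bigl\{\frac{\int e^{-\beta_k y} y^j G(dy)}{1 - \int e^{-\beta_k y} G(dy)}\Bigr\}^{n_{kj}},$
where $\beta \equiv (\beta_1, \ldots, \beta_K)$ and the maximum is taken over all $\beta_k > 0$ and $G \in \mathcal G$.
This type of estimator is expected to perform well for self-similar networks.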
In the nonparametric case of completely unknown G, the MLE $(\hat\beta, \hat G)$ in
(4.7) can be computed via the following EM algorithm:
$\beta_k^{(m+1)} \leftarrow \Bigl\{\sum_{j=1}^{N} n_{kj}\Bigl(\frac{p(j+1;\beta_k^{(m)},G^{(m)})}{p(j;\beta_k^{(m)},G^{(m)})} + \frac{p(1;\beta_k^{(m)},G^{(m)})}{1-p(0;\beta_k^{(m)},G^{(m)})}\Bigr)\Bigr\}^{-1} \sum_{j=1}^{N} j\,n_{kj},$
12 C.-H. ZHANG
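with $p(j;\beta_k,G) \equiv \int e^{-\beta_k y} y^j G(dy)$,
$G^{(m+1)}(d\vartheta) \leftarrow G^{(m)}(d\vartheta)\Bigl(\sum_{k\in\mathcal K}\sum_{j=1}^{N} \frac{n_{kj}}{1-p(0;\beta_k^{(m+1)},G^{(m)})}\Bigr)^{-1}$
$\qquad \times \sum_{k\in\mathcal K}\sum_{j=1}^{N} n_{kj}\Bigl\{\frac{\exp(-\beta_k^{(m+1)}\vartheta)\,\vartheta^j}{p(j;\beta_k^{(m+1)},G^{(m)})} + \frac{\exp(-\beta_k^{(m+1)}\vartheta)}{1-p(0;\beta_k^{(m+1)},G^{(m)})}\Bigr\}.$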
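The EM updates can be sketched for a discrete G carried on a fixed grid. This is a minimal illustration and not the paper's implementation: the grid, the starting values, the data layout and the function names are all assumptions, and the sum over j is read as applying to both terms inside each bracket.

```python
import math

def p(j, beta, support, weights):
    """p(j; beta, G) = sum over grid points y of exp(-beta*y) * y**j * G({y})."""
    return sum(w * math.exp(-beta * y) * y ** j
               for y, w in zip(support, weights))

def em_sweep(counts, betas, support, weights):
    """One sweep: update every beta_k, then the weights of the discrete G.

    counts[k][j-1] holds n_{kj} for j = 1..N (hypothetical data layout).
    """
    new_betas = []
    for n_k, beta in zip(counts, betas):
        # Contribution of the unobserved links, per observed link.
        missing = p(1, beta, support, weights) / (1.0 - p(0, beta, support, weights))
        denom = sum(n * (p(j + 1, beta, support, weights)
                         / p(j, beta, support, weights) + missing)
                    for j, n in enumerate(n_k, start=1))
        total = sum(j * n for j, n in enumerate(n_k, start=1))
        new_betas.append(total / denom)

    # Normalizing constant: expected total number of links, seen and unseen.
    norm = sum(sum(n_k) / (1.0 - p(0, beta, support, weights))
               for n_k, beta in zip(counts, new_betas))
    new_weights = []
    for y, w in zip(support, weights):
        acc = 0.0
        for n_k, beta in zip(counts, new_betas):
            p0 = p(0, beta, support, weights)
            for j, n in enumerate(n_k, start=1):
                acc += n * math.exp(-beta * y) * (
                    y ** j / p(j, beta, support, weights) + 1.0 / (1.0 - p0))
        new_weights.append(w * acc / norm)
    return new_betas, new_weights

# Hypothetical data: two nodes, frequency counts n_{k1}, n_{k2}, n_{k3}.
counts = [[5, 3, 1], [4, 2, 2]]
betas = [1.0, 1.0]
support = [0.5, 1.0, 2.0, 4.0]   # assumed grid carrying G
weights = [0.25, 0.25, 0.25, 0.25]
for _ in range(50):
    betas, weights = em_sweep(counts, betas, support, weights)
```

The normalizing factor is exactly the sum of the unnormalized weights, so the updated G remains a probability distribution after every sweep. A fixed grid is a simplification; it also removes the scale ambiguity between β_k and G only approximately.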
based on {Xj , j ≤ J}, where $X_j$ and $Y_j$ are the sample and population
frequencies in the jth cell, J is the total number of cells, and u(x, y) is a loss
function of the form u(x, y) = u(x)/y, for example, $u(x, y) = y^{-1} I\{x = 1\}$.
Let $N = \sum_{j=1}^{J} Y_j$ be the population size. Suppose
(5.2) $N \sim \mathrm{Poisson}(\lambda), \quad \{Y_j\}\,|\,N \sim \mathrm{multinomial}(N, \{\pi_j\}), \quad X_j\,|\,(\{Y_j\}, N) \sim \mathrm{binomial}(Y_j, p_j),$
for certain $\pi_j > 0$ with $\sum_{j=1}^{J} \pi_j = 1$, $0 \le p_j \le 1$ and λ > 0. For known
$\{p_j, \pi_j, \lambda\}$, the Bayes estimator of $S_J$ in (5.1) is
(5.3) $S_J^* \equiv E(S_J\,|\,\{X_j\}) = \sum_{j=1}^{J} u_j(X_j), \qquad u_j(x) \equiv E\,u(x, Y_j - X_j + x),$
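Under the Poisson model (5.2), Poisson thinning implies that Y_j − X_j is Poisson with mean m = λπ_j(1 − p_j) and independent of X_j. For u(x, y) = y^{-1}I{x = 1}, the function u_j in (5.3) then reduces to u_j(1) = E[1/(1 + Z)] with Z Poisson(m), which equals (1 − e^{−m})/m. The sketch below evaluates the expectation by direct summation; the function name and the parameter values are illustrative assumptions.

```python
import math

def u_j_poisson(x, lam, pi_j, p_j, terms=200):
    """u_j(x) = E u(x, Y_j - X_j + x) for u(x, y) = I{x = 1} / y under (5.2)."""
    if x != 1:
        return 0.0
    m = lam * pi_j * (1.0 - p_j)      # mean of the unsampled count Y_j - X_j
    total, pmf = 0.0, math.exp(-m)    # pmf starts at P(Z = 0)
    for z in range(terms):
        total += pmf / (1 + z)
        pmf *= m / (z + 1)            # Poisson recursion P(Z = z + 1)
    return total

# Hypothetical cell: lambda = 1000, pi_j = 0.002, p_j = 0.1, so m = 1.8.
value = u_j_poisson(1, lam=1000.0, pi_j=0.002, p_j=0.1)
closed_form = (1.0 - math.exp(-1.8)) / 1.8
```

The direct sum and the closed form agree, which gives a quick check on any implementation of (5.3) for this loss.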
as an estimator of the global risk (5.1) and its conditional expectation (5.3),
where $\hat\tau_J$ is a suitable (e.g., the maximum likelihood or method of moments)
estimator of τ. For example, in a two-way table with cells labelled by j ∼
(i, k) and known $\pi_{i,k}$ and λ, we may assume a regression model
$p_{i,k} = \psi_0(\tau_1 + \tau_2' z_{i,k})$ for a certain known (e.g., logit or probit) function $\psi_0$.
In the case of unknown $\pi_{i,k}$, we may consider the independence model
$\pi_{i,k} = \pi_{i\cdot}\pi_{\cdot k}$ with unknown $\pi_{i\cdot}$ and known or unknown $\pi_{\cdot k}$. If τ has fixed
dimensionality and $\hat\tau_J$ is asymptotically efficient, (5.5) is efficient by
Theorem 2.2. Theorem 2.2 also suggests that (5.5) is highly efficient if
dim(τ)/J → 0.
Alternatively, we may consider the negative binomial model
$N \sim \mathrm{NB}(\alpha, 1/(1+\beta))$, that is, $P(N = k) = \Gamma(k+\alpha)\{\Gamma(\alpha)k!\}^{-1}\beta^k/(1+\beta)^{k+\alpha}$.
As in [21], we have in this case $Y_j \sim \mathrm{NB}(\alpha, 1/(1+\beta_j))$ with $\beta_j = \beta\pi_j$,
$X_j \sim \mathrm{NB}(\alpha, 1/(1+p_j\beta_j))$, and
$(Y_j - X_j)\,|\,\{X_j = x\} \sim \mathrm{NB}(x+\alpha, (1+p_j\beta_j)/(1+\beta_j))$. Consequently,
(5.6) $u_j(x) = \frac{1+p_j\beta_j}{(1-p_j)\beta_j}\int_{(1+p_j\beta_j)/(1+\beta_j)}^{1} t^{\alpha_j-1}\,dt\ I\{x = 1\}$
in (5.3) for $u(x, y) = y^{-1} I\{x = 1\}$. Bethlehem, Keller and Pannekoek [2]
studied this negative binomial model with constant $\pi_j = 1/J$ and
$p_j = En/EN \approx n/N$. For $(\alpha_j, \beta_j) \to (0, \infty)$, $(Y_j - X_j)\,|\,\{X_j = x\}$ converges in
distribution to the $\mathrm{NB}(x, p_j)$, resulting in the µ-ARGUS estimator [1] with
$u_j(x) = p_j(1-p_j)^{-1}(-\log p_j)\,I\{x = 1\}$ in (5.6), as pointed out by Rinott [21].
Compared with the Poisson model in which λ ≈ N, estimates of both EN
and Var(N) are required in the negative binomial model. The µ-ARGUS
model essentially assumes $\mathrm{Var}(N)/(EN)^2 \ge 1/\alpha \to \infty$, which may not be
suitable in some applications.
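Formula (5.6) and its µ-ARGUS limit can be checked numerically. The sketch below compares the closed form with a direct summation of E[1/(1 + Z)] for Z ∼ NB(1 + α, q), q = (1 + p_jβ_j)/(1 + β_j), where NB(r, q) is taken to have mass function Γ(z + r){Γ(r)z!}^{-1}q^r(1 − q)^z as in the text. A single shape parameter α is used for the cell, and the function names and parameter values are illustrative assumptions.

```python
import math

def u_j_closed_form(alpha, beta_j, p_j):
    """Right side of (5.6) at x = 1, using the integral (1 - q**alpha) / alpha."""
    q = (1.0 + p_j * beta_j) / (1.0 + beta_j)
    integral = (1.0 - q ** alpha) / alpha          # integral of t**(alpha-1) over [q, 1]
    return (1.0 + p_j * beta_j) / ((1.0 - p_j) * beta_j) * integral

def u_j_direct(alpha, beta_j, p_j, terms=5000):
    """E[1 / (1 + Z)] for Z ~ NB(1 + alpha, q), by summing the mass function."""
    q = (1.0 + p_j * beta_j) / (1.0 + beta_j)
    r = 1.0 + alpha
    total, pmf = 0.0, q ** r                       # pmf starts at P(Z = 0)
    for z in range(terms):
        total += pmf / (1 + z)
        pmf *= (z + r) / (z + 1) * (1.0 - q)       # NB recursion P(Z = z + 1)
    return total

def mu_argus(p_j):
    """Limit of (5.6) as alpha -> 0 and beta_j -> infinity."""
    return p_j / (1.0 - p_j) * (-math.log(p_j))

# Hypothetical cell parameters.
closed = u_j_closed_form(alpha=0.5, beta_j=3.0, p_j=0.2)
direct = u_j_direct(alpha=0.5, beta_j=3.0, p_j=0.2)
near_limit = u_j_closed_form(alpha=1e-6, beta_j=1e9, p_j=0.2)
```

The first two values agree, and the third is close to `mu_argus(0.2)`, which illustrates how the µ-ARGUS weight arises as a limiting case of the negative binomial model.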
(6.4) $\mathcal L(w_{F_t}; F_t) \xrightarrow{D} \mathcal L(w_{F_0}; F_0), \qquad E_{F_t} w_{F_t}^2 \to E_{F_0} w_{F_0}^2,$
as t → 0+, with $w_F \equiv \bar u_F - u_F$, where $\bar u_F(x) \equiv E_F[u(X,\theta;F)\,|\,X = x]$,
and also satisfy the differentiability condition
(6.5) $\lim_{t\to 0+} E_{F_0}(\bar u_{F_t} - \bar u_{F_0})/t = E_{F_0}\phi(X)\rho(X)$
for certain φ(X) ≡ φ(X; F0 ) ∈ L2 (F0 ). The usual smoothness condition for
µ(F ), see [3], pages 57–58, is that, for a certain influence function ψ(X) ≡
ψ(X; F0 ) ∈ L2 (F0 ),
(6.6) $\lim_{t\to 0+} \{\mu(F_t) - \mu(F_0)\}/t = E_{F_0}\psi(X)\rho(X).$
Theorem 6.1. Suppose (6.3), (6.4) and (6.5) hold at F0 . Let φ∗,0 be
the projection of φ in (6.5) into the tangent space H∗ in (6.2), and let
$\phi_* \equiv \bar u_{F_0} - \mu(F_0) + \phi_{*,0}$.
(i) If (6.8) holds, then $\mathrm{Var}_{F_0}(\xi_0) \ge \mathrm{Var}_{F_0}(\phi_* - u_{F_0})$. Moreover, the lower
bound is reached without bias, that is, $E_{F_0}\xi_0^2 = \mathrm{Var}_{F_0}(\phi_* - u_{F_0})$, iff (1.5)
holds.
(ii) If (6.8) holds and the L2 (F0 ) closure $\bar{\mathcal C}_*$ of $\mathcal C_*$ in (6.2) is convex,
then there exist a random variable ξ̃0 and certain normal variables
$Z(h) \sim N(0, \mathrm{Var}_{F_0}(h))$ such that
$\mathcal L\left(\begin{pmatrix}\sqrt n\{\tilde S_n/n - A_n(\phi_*) - \mu(F_0)\}\\ Z_n(\bar u_{F_0} + h - u_{F_0})\end{pmatrix}; F_0\right) \xrightarrow{D} \mathcal L\left(\begin{pmatrix}\tilde\xi_0\\ Z(\bar u_{F_0} + h - u_{F_0})\end{pmatrix}; F_0\right)$
and ξ̃0 is independent of $Z(\bar u_{F_0} + h - u_{F_0})$ for all h ∈ H∗ . In particular, for
h = φ∗,0 ,
$\mathcal L(\xi_0; F_0) = \mathcal L(Z(\phi_* - u_{F_0}); F_0) \star \mathcal L(\tilde\xi_0; F_0).$
(iii) Suppose $E_{F_t} u^2(X; F_t)$ is bounded for all {Ft } ∈ C. Then, ψ∗ = φ∗,0 +
u∗ is the efficient influence function for the estimation of µ(F ), that is, (6.6)
holds with ψ = ψ∗ , where u∗ is the projection of $\bar u_{F_0}$ to H∗ . Consequently,
(1.6) holds.
Remark 6.1. Based on Theorem 6.1(i) and (ii), Sbn is said to be locally
asymptotically efficient if (1.5) holds. Note that in Theorem 6.1(ii), ξ̃0 = 0
iff (1.5) holds.
Remark 6.2. In the proof of Theorem 6.1(iii), we show that (6.5) and
(6.6) are equivalent under the condition that EFt u2 (X; Ft ) = O(1) for all
{Ft } ∈ C.
Remark 6.3. For the estimation of µ(F ), that is, u(x, ϑ, F ) ≡ µ(F )
as a special case of Theorem 6.1(ii), a standard proof of the convolution
theorem uses analytic continuation along lines passing through the origin in
the tangent space, and as a result, C ∗ is often assumed to be a linear space.
In the proof of Theorem 6.1(ii), analytic continuation is used along arbitrary
Remark 6.4. Comparing Theorem 6.2 with Theorems 2.1 and 2.2, we
see that (6.9) is weaker than (2.2) and (1.2) is more general than (1.1), while
stronger conditions are imposed on uτ in Theorem 6.2.
7. Proofs. We prove Theorems 6.1, 2.1, 2.2, 6.2, and 2.3–2.5 in this sec-
tion.
Lemma 7.1. Suppose (2.2) holds. Let (X, θ) ∼ Ft under $P_{\tau+at}$ and
$\rho = a^t\rho_\tau$ for a vector a, where ρτ is as in (2.3). Then (6.1) holds with $P_{F_0} = P_\tau$.
Proof. Let $g_t \equiv g_{\tau+at}$ and ∆ = at. The lemma follows from the expansion
$\frac{\sqrt{f_t} - 1}{t} - \frac{\rho}{2} = \frac{1}{f_t^{1/2} + 1}\,E_0\Bigl[\frac{g_t^{1/2} - 1}{t}\,(g_t^{1/2} + 1)\,\Big|\,X = x\Bigr] - E_0\Bigl[\frac{a^t\tilde\rho_\tau}{2}\,\Big|\,X = x\Bigr].$
The uniform integrability of the square of the right-hand side (i.e., the first
term) under f0 (x) follows from the inequality $E_0[g_t\,|\,X] \le f_t(X)\,I\{f_0(X) > 0\}$.
We omit the details.
Proof. Let Bt be the support sets of $dP_t(X) - f_t(X)\,dP_0(X)$. By (6.1)
and the boundedness of $E_t h_t^2$,
$E_t h_t - E_0 f_t h_t = E_t h_t I_{B_t} = O(1)(E_t h_t^2)^{1/2} P_t^{1/2}(B_t) = o(t)$. Thus,
(7.1) $\mu_t - \mu_0 = E_t h_t - E_0 h_0 = E_0(f_t - 1)h_t + E_0(h_t - h_0) + o(t)$
as t → 0+. Since $(\sqrt{f_t} - 1)/t \to \rho/2$ in L2 (P0 ) and $E_0\{(\sqrt{f_t} + 1)h_t\}^2 = O(1)$,
$E_0(f_t - 1)h_t/t = E_0[t^{-1}(\sqrt{f_t} - 1)(\sqrt{f_t} + 1)h_t] \to E_0 h_0\rho.$
This and (7.1) complete the proof.
Proof of Theorem 6.1. Let $F_n \equiv F_{c/\sqrt n}$, $\xi_n \equiv \sqrt n\{\tilde S_n/n - S_n(F_n)/n\}$,
$\xi_n' \equiv \sqrt n\{\tilde S_n/n - A_n(\bar u_{F_n})\}$, $\xi_n'' \equiv \sqrt n A_n(w_{F_n})$ and $Z'' = Z(w_{F_0})$. Then
$\xi_n = \xi_n' + \xi_n''$ and $\xi_n'$ depends on {Xj } only. By (6.4), $w_{F_n}^2$ under $P_{F_n}$ are
uniformly integrable and $\mathcal L(w_{F_n}; F_n) \xrightarrow{D} \mathcal L(w_{F_0}; F_0)$ as n → ∞. Thus, by the
Lindeberg central limit theorem and the weak law of large numbers,
(7.2) $E_{F_n}[\exp(it\xi_n'')\,|\,\{X_j\}] \to E_{F_0}\exp(itZ'')$
in probability for all t. Since $\xi_n'$ depends on {Xj } only, this and (6.8) imply
$E_{F_n}\exp(it\xi_n')\,E\exp(itZ'') = E_{F_n}\exp(it\xi_n')\exp(it\xi_n'') + o(1) \to E_{F_0}\exp(it\xi_0).$
Thus, since $E\exp(itZ'') \ne 0$ for all t,
(7.3) $\mathcal L\Bigl(n^{-1/2}\Bigl\{\tilde S_n - \sum_{j=1}^{n} u(X_j; F_{c/\sqrt n})\Bigr\}; F_{c/\sqrt n}\Bigr) = \mathcal L(\xi_n'; F_n) \xrightarrow{D} \mathcal L(\xi_0'; F_0)$
for a certain variable $\xi_0'$ independent of c > 0 and the curve {Ft } ∈ C.
Define $\xi_{n,0}' \equiv \sqrt n\{\tilde S_n/n - A_n(\bar u_{F_0})\}$. By (6.3) and (6.5),
$\xi_{n,0}' - \xi_n' = \sqrt n A_n(\bar u_{F_n} - \bar u_{F_0}) = \sqrt n E_{F_0}(\bar u_{F_n} - \bar u_{F_0}) + o_P(1) \to cE\phi(X)\rho(X)$
in probability under $P_{F_0}$. Thus, as in [3], pages 24–26, by (7.3) and the LAN
from (6.1) and (6.2),
(7.4) $E_{F_0}\exp(it\xi_0' + zZ(\rho)) = \exp[itzE_{F_0}\phi\rho + z^2 E_{F_0}\rho^2/2]\,E_{F_0}\exp(it\xi_0')$
for all ρ ∈ C∗ and complex z. Here Z(h) are constructed so that $(\xi_{n,0}', Z_n(h))$
converges jointly in distribution to $(\xi_0', Z(h))$ for all h ∈ L2 (F0 ).
Differentiating (7.4) in t at t = 0 and then in z at z = 0, we find
(7.5) $E_{F_0}\xi_0' Z(h) = E_{F_0}\phi(X)h(X) = E_{F_0}Z(\phi_{*,0})Z(h)$
for all scores h = ρ, ρ ∈ C∗ , and then for all h ∈ H∗ by (6.2). Since φ∗,0 ∈ H∗ ,
ξ0′ − Z(φ∗,0 ) and Z(φ∗,0 ) are orthogonal in L2 (F0 ). This proves (i), since
ξ0′ and Z(φ∗,0 ) are both independent of Z ′′ by (7.2) and Z(φ∗,0 ) + Z ′′ =
$Z(\phi_* - u_{F_0})$.
Now, suppose $\bar{\mathcal C}_*$ is convex in L2 (F0 ). By continuity extension, (7.4) holds
for all $\rho \in \bar{\mathcal C}_*$ and complex z. Let ρj ∈ C∗ . Since (7.4) holds for ρ = sρ1 + (1 −
s)ρ2 , 0 ≤ s ≤ 1, with both sides being analytic in s, by analytic continuation
it holds for ρ = sρ1 + (1 − s)ρ2 for all real s. Thus, (7.4) holds for all complex z
and
(7.6) ρ ∈ H0 ≡ {sρ1 + (1 − s)ρ2 : ρj ∈ C∗ , −∞ < s < ∞}.
Let $\tilde H$ be the linear span of a set of finitely many members of C∗ . Let ρ1
be a fixed interior point of $\tilde H$ ∩ C∗ and ρ2 ∈ $\tilde H$ with ‖ρ2 − ρ1‖ = δ0 . For
sufficiently small δ0 > 0, ρ2 ∈ C∗ for all such ρ2 , so that $\tilde H$ ⊆ H0 . Thus, H0 is
a linear space and H∗ is the closure of H0 . It follows that (7.4) holds for all
ρ ∈ H∗ and complex z. As in [3], pages 25–26, this implies the independence
of ξ0′ − Z(φ∗,0 ) and {Z(h) : h ∈ H∗ }. Since {ξ0′ , Z(h), h ∈ H∗ } is independent
of $Z'' = Z(\bar u_{F_0} - u_{F_0})$ by (7.2), the conclusions of part (ii) hold with ξ̃0 =
ξ0′ − Z(φ∗,0 ).
The proof of part (iii) follows easily from Lemma 7.2 with ht = uFt , which
gives
{µ(Ft ) − µ(F0 )}/t − EF0 {uFt − uF0 }/t → EF0 uF0 ρ = EF0 u∗ ρ.
It follows that (6.5) and (6.6) are equivalent under EFt u2 (X; Ft ) = O(1),
with ψ = ψ∗ = u∗ + φ∗,0 , by (1.6) and the definition of φ∗ . The proof is
complete.
Proofs of Theorems 2.2 and 6.2. Theorem 6.2(i) follows from Theorem 6.1
and Remark 6.2. Let $\mu(t; \tau) = E_\tau \bar u_t(X)$. By Lemma 7.2, $\mu' = E_\tau u\tilde\rho$
in Theorem 2.2 and $\gamma_\tau = (\partial/\partial t)\mu(\tau; \tau)$ in both theorems. Simple expansion
of (2.5) via (2.7) yields
$\frac{\hat S_n}{n} = A_n(\bar u_\tau) + \{\mu(\hat\tau_n; \tau) - \mu(\tau; \tau)\} + o_{P_\tau}(n^{-1/2}) = A_n(\bar u_\tau + \gamma_\tau\kappa_\tau) + o_{P_\tau}(n^{-1/2}),$
which implies (2.8). Note that $\gamma_\tau(\kappa_\tau - \kappa_{*,\tau})$ is orthogonal to
$\bar u_\tau - u_\tau + \gamma_\tau\kappa_{*,\tau}$. The proof is complete.
REFERENCES
[1] Benedetti, R. and Franconi, L. (1998). Statistical and technological solutions
for controlled data dissemination. In Pre-proceedings of New Techniques and
Technologies for Statistics, Sorrento 1 225–232.
[2] Bethlehem, J., Keller, W. and Pannekoek, J. (1990). Disclosure control of mi-
crodata. J. Amer. Statist. Assoc. 85 38–45.
[3] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Effi-
cient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ.
Press, Baltimore. MR1245941
[4] Bunge, J. and Fitzpatrick, M. (1993). Estimating the number of species: A review.
J. Amer. Statist. Assoc. 88 364–373.
[5] Chao, A. (1984). Nonparametric estimation of the number of classes in a population.
Scand. J. Statist. 11 265–270. MR0793175
[6] Chao, A. and Bunge, J. (2002). Estimating the number of species in a stochastic
abundance model. Biometrics 58 531–539. MR1925550
[7] Clauset, A. and Moore, C. (2003). Traceroute sampling makes random graphs
appear to have power law degree. Preprint.
[8] Coates, A., Hero, A., Nowak, R. and Yu, B. (2002). Internet tomography. IEEE
Signal Processing Magazine 19(3) 47–65.
[9] Darroch, J. N. and Ratcliff, D. (1980). A note on capture–recapture estimation.
Biometrics 36 149–153. MR0672144
[10] Duncan, G. T. and Pearson, R. W. (1991). Enhancing access to microdata while
protecting confidentiality: Prospects for the future (with discussion). Statist.
Sci. 6 219–239.
[11] Engen, S. (1974). On species frequency models. Biometrika 61 263–270. MR0373217
[12] Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On power-law relation-
ships of the Internet topology. In Proc. ACM SIGCOMM 1999 251–262. ACM
Press, New York.
[13] Fisher, R. A., Corbet, A. S. and Williams, C. B. (1943). The relation between
the number of species and the number of individuals in a random sample of an
animal population. J. Animal Ecology 12 42–58.
[14] Good, I. J. (1953). The population frequencies of species and the estimation of
population parameters. Biometrika 40 237–264. MR0061330
[15] Govindan, R. and Tangmunarunkit, H. (2000). Heuristics for Internet map dis-
covery. In Proc. IEEE INFOCOM 2000 3 1371–1380. IEEE Press, New York.
[16] Lakhina, A., Byers, J., Crovella, M. and Xie, P. (2003). Sampling biases in
IP topology measurements. In Proc. IEEE INFOCOM 2003 1 332–341. IEEE
Press, New York.
[17] Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1982). Contributions to a
General Asymptotic Statistical Theory. Lecture Notes in Statist. 13. Springer,
New York. MR0675954
[18] Polettini, S. and Seri, G. (2003). Guidelines for the protection of social
micro-data using individual risk methodology. Application within µ-Argus
version 3.2, CASC Project Deliverable No. 1.2-D3. Available at
neon.vb.cbs.nl/casc/deliv/12D3_guidelines.pdf.
[19] Rao, C. R. (1971). Some comments on the logarithmic series distribution in the
analysis of insect trap data. In Statistical Ecology (G. P. Patil, E. C. Pielou and
W. E. Waters, eds.) 1 131–142. Pennsylvania State Univ. Press, University Park.
MR0375600
[20] Rieder, H. (2000). One-sided confidence about functionals over tangent
cones. Available at
www.uni-bayreuth.de/departments/math/org/mathe7/RIEDER/pubs/cc.pdf.
[21] Rinott, Y. (2003). On models for statistical disclosure risk estimation.
Working paper no. 16, Joint ECE/Eurostat Work Session on Data
Confidentiality, Luxemburg, 2003. Available at
www.unece.org/stats/documents/2003/04/confidentiality/wp.16.e.pdf.
[22] Robbins, H. (1977). Prediction and estimation for the compound Poisson distribu-
tion. Proc. Natl. Acad. Sci. U.S.A. 74 2670–2671. MR0451479
[23] Robbins, H. (1980). An empirical Bayes estimation problem. Proc. Natl. Acad. Sci.
U.S.A. 77 6988–6989. MR0603064
[24] Robbins, H. (1988). The u, v method of estimation. In Statistical Decision Theory
and Related Topics IV (S. S. Gupta and J. O. Berger, eds.) 1 265–270. Springer,
New York. MR0927106
[25] Robbins, H. and Zhang, C.-H. (1988). Estimating a treatment effect under biased
sampling. Proc. Natl. Acad. Sci. U.S.A. 85 3670–3672. MR0946190
[26] Robbins, H. and Zhang, C.-H. (1989). Estimating the superiority of a drug to a
placebo when all and only those patients at risk are treated with the drug. Proc.
Natl. Acad. Sci. U.S.A. 86 3003–3005. MR0995401
[27] Robbins, H. and Zhang, C.-H. (1991). Estimating a multiplicative treatment effect
under biased allocation. Biometrika 78 349–354. MR1131168
[28] Robbins, H. and Zhang, C.-H. (2000). Efficiency of the u, v method of estimation.
Proc. Natl. Acad. Sci. U.S.A. 97 12,976–12,979. MR1795617
[29] Sampford, M. R. (1955). The truncated negative binomial distribution. Biometrika
42 58–69. MR0072401
[30] Spring, N., Mahajan, R. and Wetherall, D. (2002). Measuring ISP topologies
with rocketfuel. In Proc. ACM SIGCOMM 2002 133–145. ACM Press, New
York.
[31] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press.
MR1652247
[32] Vardi, Y. (1996). Network tomography: Estimating source-destination traffic inten-
sities from link data. J. Amer. Statist. Assoc. 91 365–377. MR1394093
Department of Statistics
Rutgers University
Hill Center
Busch Campus
Piscataway, New Jersey 08854-8019
USA
e-mail: czhang@stat.rutgers.edu