Optimal estimation for the Pareto distribution
Mei Ling Huang*
Journal of Statistical Computation and Simulation, iFirst, 2011, 1–18. DOI: 10.1080/00949655.2010.516751
Department of Mathematics, Brock University, St. Catharines, ON, Canada L2S 3A1 (*Email: mhuang@brocku.ca)
This paper proposes an optimal estimation method for the shape parameter, probability density function
and upper tail probability of the Pareto distribution. The new method is based on a weighted empirical
distribution function. The exact efficiency functions of the estimators relative to the existing estimators
are derived. The paper gives L1-optimal and L2-optimal weights for the new weighted estimator. Monte Carlo simulation results confirm the theoretical conclusions. Both theoretical and simulation results show that the new estimation method is more efficient relative to several existing methods in many situations.
Keywords: efficiency; order statistics; weighted empirical distribution function; density estimation; Pareto upper tail
1. Introduction
When the shape parameter α is unknown, we want to estimate α from a random sample
X1 , X2 , . . . , Xn , n > 2, from the p.d.f. in Equation (1). There are several parametric methods
as follows.
The sample mean estimator for the population mean μ in Equation (3) is
$$\hat{\mu}_{\bar X} = \bar X = \frac{1}{n}\sum_{i=1}^{n} X_i. \qquad (5)$$
Note that Equation (5) is based on the empirical distribution function (EDF) Sn(x),
$$S_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{(-\infty,x]}(X_i), \quad \text{where } I_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A, \end{cases} \qquad (6)$$
and Sn(x) uses equal weight 1/n for each sample point. Then we have the moment estimator for the shape parameter α,
$$\hat{\alpha}_{\bar X} = \frac{1}{\bar X} + 1. \qquad (7)$$
The maximum likelihood estimator (MLE) of α is
$$\hat{\alpha}_{\mathrm{MLE}} = \frac{n}{\sum_{i=1}^{n}\log(X_i + 1)}, \qquad (8)$$
which is a complete and sufficient estimator with the Vinci distribution (or inverse gamma distribution) having the p.d.f. h(x) in Equation (9).
Therefore, an MLE for the mean of the Pareto distribution in Equation (3) is given by
$$\hat{\mu}_{\mathrm{MLE}} = \frac{1}{\hat{\alpha}_{\mathrm{MLE}} - 1}, \qquad \hat{\alpha}_{\mathrm{MLE}} > 1. \qquad (11)$$
The MRE of the shape parameter α is
$$\hat{\alpha}_{\mathrm{MRE}} = \frac{n-2}{\sum_{i=1}^{n}\log(X_i + 1)}, \quad \text{with} \quad
E(\hat{\alpha}_{\mathrm{MRE}}) = \frac{\alpha(n-2)}{n-1}; \qquad \mathrm{Bias}(\hat{\alpha}_{\mathrm{MRE}}) = -\frac{\alpha}{n-1}; \qquad (12)$$
$$\mathrm{Var}(\hat{\alpha}_{\mathrm{MRE}}) = \frac{\alpha^2(n-2)}{(n-1)^2}; \qquad \mathrm{MSE}(\hat{\alpha}_{\mathrm{MRE}}) = \frac{\alpha^2}{n-1}. \qquad (13)$$
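For concreteness, the moment, MLE and MRE estimators of Equations (7), (8) and (12) can be computed directly from a sample. The following is a minimal Python sketch, assuming the Pareto form of Equation (1) so that samples can be drawn by inverse transform from F(x) = 1 − (1 + x)^{−α}; the function names are illustrative, not from the paper.

```python
import numpy as np

def alpha_moment(x):
    """Moment estimator of Equation (7): alpha_hat = 1/x_bar + 1."""
    return 1.0 / np.mean(x) + 1.0

def alpha_mle(x):
    """MLE of Equation (8): n / sum(log(X_i + 1))."""
    return len(x) / np.sum(np.log1p(x))

def alpha_mre(x):
    """MRE of Equation (12): (n - 2) / sum(log(X_i + 1))."""
    return (len(x) - 2) / np.sum(np.log1p(x))

# A Pareto(alpha) sample with F(x) = 1 - (1 + x)^(-alpha) can be drawn by
# inverse transform: X = U^(-1/alpha) - 1 for U ~ Uniform(0, 1).
rng = np.random.default_rng(0)
alpha, n = 3.0, 50
x = rng.uniform(size=n) ** (-1.0 / alpha) - 1.0
print(alpha_moment(x), alpha_mle(x), alpha_mre(x))
```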
Note that the above Moment, MLE and MRE methods use equal weight 1/n on each data point.
We have two questions:
(a) Why should we use equal weight 1/n on the data? Is the EDF in Equation (6), a minimum
variance unbiased estimator for F (x), good enough?
(b) How do we overcome the lack of efficiency of tail estimation for the Pareto distribution, especially in the heavy-tailed case (α ≤ 2, or near 2)?
In recent years, there have been studies that put unequal weights on the data points to obtain weighted EDFs or processes [4,5], but it is difficult to determine what weights should be used. Huang [6] studied a symmetric weighted empirical distribution
function (SWEDF) and its efficiency function relative to the EDF in Equation (6) for estimating
any population c.d.f. The SWEDF is defined by
$$\hat{F}^{*}_{n}(x) = \sum_{i=1}^{n} I_{(-\infty,x]}(X_{(i)})\, p_{n,i}, \qquad x \in \mathbb{R},\ n > 2, \qquad (14)$$
where X_{(1)} ≤ X_{(2)} ≤ · · · ≤ X_{(n)} are the order statistics of the random sample. Note that
$$0 < p_{n,i} < 1, \quad i = 1, \ldots, n; \qquad \sum_{i=1}^{n} p_{n,i} = 1.$$
The parameter w in Equation (14) is the weight for the middle n − 2 data points, and w_{1,n} is the weight for each of the two extreme data points; that is, p_{n,i} = w for i = 2, ..., n − 1 and p_{n,1} = p_{n,n} = w_{1,n} = (1 − (n − 2)w)/2. Huang [6] indicated that if w > 1/n, then the SWEDF in Equation (14) has good efficiency in the tails, relative to the EDF in Equation (6), for estimating any distribution function. Since the Pareto distribution is power-tailed, it is important to estimate the upper tail probability, and it is interesting to explore how different weights affect the estimation of the mean, density function and upper tail probability. In this paper, we first obtain a weighted
estimator μ̂w based on F̂*n(x) in Equation (14) to estimate the mean μ in Equation (3), namely
$$\hat{\mu}_w = \sum_{i=2}^{n-1} w\,X_{(i)} + \frac{1}{2}\,(1 - (n-2)w)\,[X_{(1)} + X_{(n)}]. \qquad (15)$$
The corresponding weighted estimator of the shape parameter is α̂w = 1/μ̂w + 1 (Equation (16)). This paper studies the efficiency of the new weighted estimators of the Pareto mean, density function and upper tail probability relative to existing estimators. Section 2
gives an exact efficiency function (EFF) of the weighted estimator μ̂w in Equation (15) relative to
the sample mean X̄ in Equation (5) and μ̂MLE in Equation (11) for estimating the population mean
μ in (3). The L1 -optimal and L2 -optimal weights for the new weighted estimators are given in
Section 3. A weighted semiparametric density estimator is introduced in Section 4. In Section 5,
the Monte Carlo simulation results show that the new weighted estimation method for estimating
the Pareto mean, density function and upper tail probability is more efficient relative to the existing
methods in many situations. The simulation results confirm the theoretical conclusions. Finally,
we give all the proofs in Appendix 1.
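Before turning to the efficiency results, here is a small illustration of Equations (14)–(16): a Python sketch that computes μ̂w and α̂w from a sample for a given middle weight w. The sampling scheme and the function names are illustrative assumptions, not part of the paper.

```python
import numpy as np

def mu_hat_w(x, w):
    """Weighted mean estimator of Equation (15): the middle n-2 order statistics
    get weight w each; X_(1) and X_(n) share the remaining 1 - (n-2)w equally."""
    xs = np.sort(x)
    n = len(xs)
    return w * xs[1:-1].sum() + 0.5 * (1.0 - (n - 2) * w) * (xs[0] + xs[-1])

def alpha_hat_w(x, w):
    """Weighted shape estimator (Equation (16)): alpha_hat_w = 1/mu_hat_w + 1."""
    return 1.0 / mu_hat_w(x, w) + 1.0

# Example with the L1-optimal weight of Equation (20), w = 1/sqrt(n(n-1)).
rng = np.random.default_rng(1)
alpha, n = 2.5, 30
x = rng.uniform(size=n) ** (-1.0 / alpha) - 1.0
w_l1 = 1.0 / np.sqrt(n * (n - 1))
print(mu_hat_w(x, w_l1), alpha_hat_w(x, w_l1))
```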
In this section, we derive the EFF of μ̂w in Equation (15) relative to the sample mean X̄ in
Equation (5) for estimating the population mean μ in Equation (3).
Theorem 2.1 When α > 2, the mean square error (MSE) function of μ̂w in Equation (15) for estimating the population mean of the Pareto distribution (3) is given by
$$\mathrm{MSE}(\hat{\mu}_w) = a\,w^2 + b\,w + c, \qquad (17)$$
where
$$a = A + \frac{1}{4}(n-2)^2 B - (n-2)\,C, \qquad
b = -\frac{1}{2}(n-2)\,B + C - 2\mu D + \mu(n-2)\,E, \qquad
c = \frac{1}{4}B - \mu E + \mu^2,$$
with
$$A = \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)} - 2\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right) + 1\right]$$
$$\qquad + 2\sum_{i=2}^{n-2}\sum_{j=i+1}^{n-1}\left[\frac{\Gamma(n+1)\,\Gamma(n-j+1-1/\alpha)}{\Gamma(n-j+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-i+1-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{\Gamma(n+1)\,\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)\,\Gamma(n-i+1)} + 1\right];$$
$$B = E[X_{(1)}^2] + E[X_{(n)}^2] + 2E[X_{(1)}X_{(n)}];$$
$$C = \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)\,\Gamma(n-i+1-1/\alpha)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{n\,\Gamma(n-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right.$$
$$\qquad\left. + n!\,\Gamma(1-1/\alpha)\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-i+1-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{\Gamma(n+1)\,\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)\,\Gamma(n-i+1)} + 2\right];$$
$$D = \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)}{\Gamma(n-i+1)}\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)} - 1\right]; \qquad
E = \frac{n\,\Gamma(n-1/\alpha) + n!\,\Gamma(1-1/\alpha)}{\Gamma(n+1-1/\alpha)} - 2,$$
where the moments in B are given by Lemma A.1 in Appendix 1, and $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$, α > 0, is the Gamma function.
Corollary 2.1 The EFFs of μ̂w in Equation (15) for estimating μ in Equation (3) relative to X̄ in Equation (5) and μ̂MLE in Equation (11) are given by
$$\mathrm{EFF}(\hat{\mu}_{w(\bar X)}) = \frac{\mathrm{MSE}(\bar X)}{\mathrm{MSE}(\hat{\mu}_w)}, \qquad (18)$$
$$\mathrm{EFF}(\hat{\mu}_{w(\mathrm{MLE})}) = \frac{\mathrm{MSE}(\hat{\mu}_{\mathrm{MLE}})}{\mathrm{MSE}(\hat{\mu}_w)}, \qquad (19)$$
where MSE(μ̂w) is given in Equation (17), μ̂MLE = 1/(α̂MLE − 1) is given in Equation (11) and h(x) is given in Equation (9).
Huang and Brill [7] proposed a level-crossing weighted empirical distribution function (LCEDF)
to estimate the c.d.f. F (x), x ∈ R. This method leads to an L1 -optimal choice of the weights for
w in Equation (14),
$$w_{L_1\text{-}\mathrm{opt}} = \frac{1}{\sqrt{n(n-1)}} > \frac{1}{n}. \qquad (20)$$
Huang [6] indicates that if more weight is given to the middle data points, then the efficiency of the LCEDF relative to the classical EDF exceeds 1 in the tails of x. Using this idea, we have the following definition.
Definition 3.1 L1 -optimal estimators for the mean μ and the shape parameter α of the Pareto
distribution are defined by
$$\hat{\mu}_{w_{L_1\text{-}\mathrm{opt}}} = \hat{\mu}_w\big|_{w = w_{L_1\text{-}\mathrm{opt}}}, \quad \text{where } \hat{\mu}_w \text{ is given in Equation (15);} \qquad (21)$$
$$\hat{\alpha}_{w_{L_1\text{-}\mathrm{opt}}} = \frac{1}{\hat{\mu}_{w_{L_1\text{-}\mathrm{opt}}}} + 1. \qquad (22)$$
In this section, we use the MSE criterion to find an L2-optimal weight w in Equation (14) for the weighted estimator μ̂w in Equation (15).
Corollary 3.1 The L2 -optimal weight w for minimizing MSE(μ̂w ) in Equation (17) when
α > 2, n > 4C/B + 2, is given by
$$w_{L_2\text{-}\mathrm{opt}} = -\frac{b}{2a}, \qquad \mathrm{MSE}_{\min}(\hat{\mu}_w) = -\frac{b^2}{4a} + c, \qquad (23)$$
$$\hat{\mu}_{w_{L_2\text{-}\mathrm{opt}}} = \hat{\mu}_w\big|_{w = w_{L_2\text{-}\mathrm{opt}}}, \quad \text{where } \hat{\mu}_w \text{ is given in Equation (15);} \qquad (24)$$
$$\hat{\alpha}_{w_{L_2\text{-}\mathrm{opt}}} = \frac{1}{\hat{\mu}_{w_{L_2\text{-}\mathrm{opt}}}} + 1, \qquad (25)$$
where a, b, c, B and C are given in Equation (17).
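To make the L2-optimal weight concrete, the following Python sketch assembles the coefficients a, b and c of Theorem 2.1 from the order-statistic moments of Lemma A.1 (Appendix 1) and then computes wL2-opt = −b/2a and MSEmin as in Equation (23). It uses log-gamma functions for numerical stability and assumes α > 2; the organization and names are ours, so treat it as an illustrative check rather than the paper's implementation.

```python
import math

lg = math.lgamma  # log Gamma, used throughout for numerical stability

def order_moments(n, alpha):
    """Closures for E[X_(i)], E[X_(i)^2] and E[X_(i) X_(j)] (i < j), per Lemma A.1."""
    a1, a2 = 1.0 / alpha, 2.0 / alpha

    def E1(i):
        return math.exp(lg(n + 1) - lg(n - i + 1)
                        + lg(n - i + 1 - a1) - lg(n + 1 - a1)) - 1.0

    def E2(i):
        r = math.exp(lg(n + 1) - lg(n - i + 1))
        return r * (math.exp(lg(n - i + 1 - a2) - lg(n + 1 - a2))
                    - 2.0 * math.exp(lg(n - i + 1 - a1) - lg(n + 1 - a1))) + 1.0

    def E12(i, j):
        t1 = math.exp(lg(n + 1) + lg(n - j + 1 - a1) - lg(n - j + 1))
        t2 = (math.exp(lg(n - i + 1 - a2) - lg(n + 1 - a2) - lg(n - i + 1 - a1))
              - math.exp(-lg(n + 1 - a1)))
        t3 = math.exp(lg(n + 1) - lg(n + 1 - a1)
                      + lg(n - i + 1 - a1) - lg(n - i + 1))
        return t1 * t2 - t3 + 1.0

    return E1, E2, E12

def mse_coefficients(n, alpha):
    """Coefficients a, b, c of MSE(mu_hat_w) = a*w**2 + b*w + c (Theorem 2.1)."""
    E1, E2, E12 = order_moments(n, alpha)
    mu = 1.0 / (alpha - 1.0)
    A = sum(E2(i) for i in range(2, n)) + 2.0 * sum(
        E12(i, j) for i in range(2, n - 1) for j in range(i + 1, n))
    B = E2(1) + E2(n) + 2.0 * E12(1, n)
    C = sum(E12(1, i) + E12(i, n) for i in range(2, n))
    D = sum(E1(i) for i in range(2, n))
    E = E1(1) + E1(n)
    a = A + 0.25 * (n - 2) ** 2 * B - (n - 2) * C
    b = -0.5 * (n - 2) * B + C - 2.0 * mu * D + mu * (n - 2) * E
    c = 0.25 * B - mu * E + mu ** 2
    return a, b, c

n, alpha = 20, 3.0
a, b, c = mse_coefficients(n, alpha)
w_l2 = -b / (2.0 * a)                 # Equation (23): L2-optimal weight
mse_min = c - b * b / (4.0 * a)       # minimum MSE attained at w_l2
print(w_l2, 1.0 / n, mse_min)         # the paper reports w_l2 > 1/n
```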
In Table A1, Appendix 2, we list the values of wL1-opt = 1/√(n(n − 1)), MSE(μ̂wL1-opt), MSE(X̄), MSE(μ̂MLE) and the exact EFF(μ̂w(X̄)) relative to X̄ and EFF(μ̂w(MLE)) relative to μ̂MLE, respectively, for n = 10, 20, 30 and 50. Here we choose α = 2.5, 3, 4 (since the variance of Equation (1) is infinite when α ≤ 2) and use Equations (18)–(20). The efficiencies are greater than 1 in 21 out of 24 (87.5%) cases.
Similarly, in Table A2, we list the values of wL2 -opt , MSEmin (μ̂wL2 -opt ), MSE(X̄), MSE(μ̂MLE )
and the exact EFFmax (μ̂w ) relative to X̄, and EFF(μ̂w(MLE) ) relative to μ̂MLE , respectively, for
n = 10, 20, 30 and 50; α = 2.5, 3, 4, by using Equation (23). The efficiencies are greater than 1
in 24 out of 24 (100%) cases.
Remark It is interesting to see that 100% of the efficiencies are greater than 1 in Tables A2 and A4 when using wL2-opt. But wL1-opt in Equation (20) is completely nonparametric, so it is more robust and easy to use. The wL2-opt in Equation (23) depends on α; in practice, we may estimate α first and then obtain wL2-opt while still keeping the optimality advantage. Of course, we use the given values of α in the simulations.
A semiparametric weighted density estimator is given in this section. We estimate α first and then
substitute it into Equation (1).
$$\hat{f}_w(x) = \frac{\hat{\alpha}_w}{(1+x)^{\hat{\alpha}_w + 1}}, \qquad x \ge 0,\ n \ge 2, \qquad (26)$$
where α̂w is defined in Equation (16).
We substitute α̂wL1-opt and α̂wL2-opt from Equations (22) and (25) for α̂w in Equation (26) to obtain the semiparametric L1- and L2-optimal weighted density estimators,
$$\hat{f}_{w_{L_1\text{-}\mathrm{opt}}}(x) = \frac{\hat{\alpha}_{w_{L_1\text{-}\mathrm{opt}}}}{(1+x)^{\hat{\alpha}_{w_{L_1\text{-}\mathrm{opt}}} + 1}}, \qquad x \ge 0, \qquad (27)$$
$$\hat{f}_{w_{L_2\text{-}\mathrm{opt}}}(x) = \frac{\hat{\alpha}_{w_{L_2\text{-}\mathrm{opt}}}}{(1+x)^{\hat{\alpha}_{w_{L_2\text{-}\mathrm{opt}}} + 1}}, \qquad x \ge 0. \qquad (28)$$
We also define two density estimators by using α̂X̄ and α̂MLE in Equations (7) and (8) as
$$\hat{f}_{\bar X}(x) = \frac{\hat{\alpha}_{\bar X}}{(1+x)^{\hat{\alpha}_{\bar X} + 1}}, \quad x \ge 0, \qquad \text{and} \qquad \hat{f}_{\mathrm{MLE}}(x) = \frac{\hat{\alpha}_{\mathrm{MLE}}}{(1+x)^{\hat{\alpha}_{\mathrm{MLE}} + 1}}, \quad x \ge 0. \qquad (29)$$
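A sketch of the plug-in density estimators of Equations (26)–(29): estimate α by the weighted, moment or MLE method and substitute it into the Pareto p.d.f. of Equation (1). The helper names and the example below are illustrative assumptions.

```python
import numpy as np

def pareto_pdf(x, alpha):
    """Pareto p.d.f. of Equation (1): f(x) = alpha / (1 + x)^(alpha + 1)."""
    return alpha / (1.0 + x) ** (alpha + 1.0)

def alpha_hat_w(sample, w):
    """Weighted shape estimate: 1/mu_hat_w + 1, with mu_hat_w from Equation (15)."""
    xs = np.sort(sample)
    n = len(xs)
    mu_hat = w * xs[1:-1].sum() + 0.5 * (1.0 - (n - 2) * w) * (xs[0] + xs[-1])
    return 1.0 / mu_hat + 1.0

# Semiparametric plug-in density estimators, Equations (26)-(29).
def f_hat_w(x, sample, w):
    return pareto_pdf(x, alpha_hat_w(sample, w))                      # Eq. (26)

def f_hat_xbar(x, sample):
    return pareto_pdf(x, 1.0 / np.mean(sample) + 1.0)                 # Eq. (29), moment

def f_hat_mle(x, sample):
    return pareto_pdf(x, len(sample) / np.sum(np.log1p(sample)))      # Eq. (29), MLE

rng = np.random.default_rng(2)
sample = rng.uniform(size=30) ** (-1.0 / 3.0) - 1.0   # Pareto sample, alpha = 3
w_l1 = 1.0 / np.sqrt(30 * 29)                          # L1-optimal weight, Eq. (20)
grid = np.array([1.0, 2.0, 4.0])
print(f_hat_w(grid, sample, w_l1), f_hat_mle(grid, sample))
```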
5. Simulations
Based on m generated random samples of size n, the simulation mean square errors (SMSEs) are
$$\mathrm{SMSE}(\hat{\mu}_w) = \frac{1}{m}\sum_{i=1}^{m}(\hat{\mu}_{w(i)} - \mu)^2, \qquad
\mathrm{SMSE}(\hat{\mu}_{\bar X}) = \frac{1}{m}\sum_{i=1}^{m}(\hat{\mu}_{\bar X(i)} - \mu)^2, \qquad
\mathrm{SMSE}(\hat{\mu}_{\mathrm{MLE}}) = \frac{1}{m}\sum_{i=1}^{m}(\hat{\mu}_{\mathrm{MLE}(i)} - \mu)^2,$$
where μ̂w(i) , μ̂X̄(i) and μ̂MLE(i) are estimates from the ith sample, i = 1, 2, . . . , m.
The simulation efficiencies of μ̂w relative to μ̂X̄ and μ̂MLE are given by
$$\mathrm{SEFF}(\hat{\mu}_{w(\bar X)}) = \frac{\mathrm{SMSE}(\hat{\mu}_{\bar X})}{\mathrm{SMSE}(\hat{\mu}_w)}, \qquad
\mathrm{SEFF}(\hat{\mu}_{w(\mathrm{MLE})}) = \frac{\mathrm{SMSE}(\hat{\mu}_{\mathrm{MLE}})}{\mathrm{SMSE}(\hat{\mu}_w)}. \qquad (30)$$
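The simulation design behind Equation (30) can be sketched as follows: generate m samples of size n, compute μ̂w, X̄ and μ̂MLE for each, and form the SMSE ratios. This is an illustrative implementation with assumed function names and a modest m, not the paper's original code.

```python
import numpy as np

def smse_and_seff(n, alpha, w, m=100_000, seed=0):
    """Monte Carlo SMSEs of mu_hat_w, X_bar and mu_hat_MLE, and the SEFF ratios
    of Equation (30), from m samples of size n."""
    rng = np.random.default_rng(seed)
    mu = 1.0 / (alpha - 1.0)
    u = rng.uniform(size=(m, n))
    x = u ** (-1.0 / alpha) - 1.0                 # Pareto samples via inversion
    xs = np.sort(x, axis=1)

    mu_w = w * xs[:, 1:-1].sum(axis=1) + 0.5 * (1 - (n - 2) * w) * (xs[:, 0] + xs[:, -1])
    mu_xbar = x.mean(axis=1)
    alpha_mle = n / np.log1p(x).sum(axis=1)
    mu_mle = 1.0 / (alpha_mle - 1.0)

    smse = lambda est: np.mean((est - mu) ** 2)
    return smse(mu_xbar) / smse(mu_w), smse(mu_mle) / smse(mu_w)

n, alpha = 20, 3.0
w_l1 = 1.0 / np.sqrt(n * (n - 1))
print(smse_and_seff(n, alpha, w_l1))   # SEFF vs X_bar, SEFF vs MLE
```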
The simulation efficiencies in Tables A3 and A4, based on one million replications, are almost perfectly consistent with the exact efficiencies in Tables A1 and A2. The simulation results confirm the theoretical results of Theorem 2.1 and Corollary 2.1. In each table, the simulation efficiencies show that the optimal
estimator μ̂w is more efficient relative to the classical estimators μ̂X̄ and μ̂MLE in 22 out of 24
(91.7%) cases, when using the L1 -optimal weight, and in 24 out of 24 (100%) cases when using
the L2 -optimal weight.
Similarly, the SMSEs of the density estimators at a point x are
$$\mathrm{SMSE}(\hat{f}_w(x)) = \frac{1}{m}\sum_{i=1}^{m}\big(\hat{f}_{w(i)}(x) - f(x)\big)^2, \qquad
\mathrm{SMSE}(\hat{f}_{\bar X}(x)) = \frac{1}{m}\sum_{i=1}^{m}\big(\hat{f}_{\bar X(i)}(x) - f(x)\big)^2,$$
$$\mathrm{SMSE}(\hat{f}_{\mathrm{MLE}}(x)) = \frac{1}{m}\sum_{i=1}^{m}\big(\hat{f}_{\mathrm{MLE}(i)}(x) - f(x)\big)^2,$$
where f̂w(i)(x), f̂X̄(i)(x) and f̂MLE(i)(x) are estimates from the ith sample; the corresponding simulation efficiencies are reported in Tables A5 and A6.
In many situations, we are interested in estimating the Pareto upper tail probability
$$1 - F(x) = P\{X > x\} = \frac{1}{(1+x)^{\alpha}}, \qquad x \ge 0,\ \alpha > 0. \qquad (31)$$
We consider the following estimation methods:
$$1 - \hat{F}_w(x) = \frac{1}{(1+x)^{\hat{\alpha}_w}}, \quad \text{where } \hat{\alpha}_w \text{ is given in Equation (16)},$$
$$1 - \hat{F}_{\bar X}(x) = \frac{1}{(1+x)^{\hat{\alpha}_{\bar X}}}, \quad \text{where } \hat{\alpha}_{\bar X} \text{ is given in Equation (7)},$$
$$1 - \hat{F}_{\mathrm{MLE}}(x) = \frac{1}{(1+x)^{\hat{\alpha}_{\mathrm{MLE}}}}, \quad \text{where } \hat{\alpha}_{\mathrm{MLE}} \text{ is given in Equation (8)},$$
$$1 - \hat{F}_{\mathrm{MRE}}(x) = \frac{1}{(1+x)^{\hat{\alpha}_{\mathrm{MRE}}}}, \quad \text{where } \hat{\alpha}_{\mathrm{MRE}} \text{ is given in Equation (12)}.$$
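A sketch of the four tail-probability estimators listed above, each obtained by plugging the corresponding shape estimate into Equation (31); the function names and the example values are illustrative assumptions.

```python
import numpy as np

def tail_prob(x, alpha):
    """Pareto upper tail of Equation (31): P(X > x) = (1 + x)^(-alpha)."""
    return (1.0 + x) ** (-alpha)

def tail_estimates(x, sample, w):
    """Plug-in tail estimates 1 - F_hat(x) using the weighted, moment, MLE and
    MRE shape estimates."""
    xs = np.sort(sample)
    n = len(xs)
    mu_hat_w = w * xs[1:-1].sum() + 0.5 * (1.0 - (n - 2) * w) * (xs[0] + xs[-1])
    alphas = {
        "weighted": 1.0 / mu_hat_w + 1.0,                  # Eq. (16)
        "moment":   1.0 / np.mean(sample) + 1.0,           # Eq. (7)
        "MLE":      n / np.sum(np.log1p(sample)),          # Eq. (8)
        "MRE":      (n - 2) / np.sum(np.log1p(sample)),    # Eq. (12)
    }
    return {name: tail_prob(x, a) for name, a in alphas.items()}

rng = np.random.default_rng(3)
sample = rng.uniform(size=30) ** (-1.0 / 2.5) - 1.0   # Pareto sample, alpha = 2.5
w_l1 = 1.0 / np.sqrt(30 * 29)
print(tail_estimates(5.0, sample, w_l1))
```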
We use wL1-opt in Equation (20) to compute 1 − F̂w(x) in Table A7, and wL2-opt in Equation (23) to compute 1 − F̂w(x) in Table A8. The efficiencies of 1 − F̂w(x) relative to 1 − F̂X̄(x), 1 − F̂MLE(x) and 1 − F̂MRE(x) are computed at selected values x = 1, 2, 3, 4, 5, 6, 8, 10, 12 for α = 2.5, 3, and m = 1000 generated random samples with sample sizes 10, 20, 30 and 50. There are 216 cases for each table. The simulation efficiencies in Tables A7 and A8 show that the weighted method is more efficient relative to the existing methods in most cases.
6. Conclusions
In this paper, the theoretical and simulation results consistently show that the proposed weighted method has better efficiency for estimating the Pareto mean, density function and upper tail relative to the existing methods. The crucial point is that wL1-opt and wL2-opt are both greater than 1/n (intuitively, this can be seen in Tables A1–A4), which means that the proposed weighted method puts less weight on the extreme data values. It would be interesting to study further how a variety of weights affects the estimation of the Pareto distribution. Such studies open an alternative avenue for inference on heavy-tailed distributions.
Acknowledgements
Research of the author is supported by an NSERC Canada grant.
References
[1] L. Brown, N. Gans, A. Mandelbaum, A. Sakov, H. Shen, S. Zeltyn, and L. Zhao, Statistical analysis of a telephone
call center, J. Amer. Statist. Assoc. 100(469) (2005), pp. 36–50.
[2] D. Cooley, D. Nychka, and P. Naveau, Bayesian spatial modeling of extreme precipitation return levels, J. Amer.
Statist. Assoc. 102(479) (2007), pp. 824–840.
[3] C. Kleiber and S. Kotz, Statistical Size Distributions in Economics and Actuarial Sciences, John Wiley & Sons, New York, 2003.
[4] P. Barbe and P. Bertail, The Weighted Bootstrap, Springer, New York, 1995.
[5] G.R. Shorack and J.A. Wellner, Empirical Processes with Applications to Statistics, John Wiley & Sons, New York,
1986.
[6] M.L. Huang, The efficiencies of a weighted distribution function estimator, Proceedings of the American Statistical Association, Nonparametric Statistics Section, 2003, pp. 1502–1506.
[7] M.L. Huang and P.H. Brill, A distribution estimation method based on level crossings, J. Statist. Plann. Inference
124(1) (2004), pp. 45–62.
Appendix 1
Lemma A.1 For a Pareto random variable X with density given in Equation (1), for i ∈ {1, 2, . . . , n}, 1 ≤ i ≤ n, α > 2,
$$E[X_{(i)}] = \frac{\Gamma(n+1)}{\Gamma(n-i+1)}\cdot\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)} - 1;$$
$$E[X_{(i)}^2] = \frac{\Gamma(n+1)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)} - 2\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right) + 1.$$
For i, j ∈ {1, 2, . . . , n}, 1 ≤ i < j ≤ n, when α > max(1/(n − j + 1), 2/(n − i + 1)),
$$E[X_{(i)}X_{(j)}] = \frac{\Gamma(n+1)\,\Gamma(n-j+1-1/\alpha)}{\Gamma(n-j+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-i+1-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{\Gamma(n+1)}{\Gamma(n+1-1/\alpha)}\cdot\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n-i+1)} + 1.$$
Proof Based on the theory of order statistics, the uth quantile of Equation (2) is
$$F^{-1}(u) = \frac{1}{(1-u)^{1/\alpha}} - 1, \qquad 0 < u < 1,$$
and
$$E[X_{(i)}] = n\binom{n-1}{i-1}\int_0^{\infty} x\,[F(x)]^{i-1} f(x)\,[1-F(x)]^{n-i}\,dx$$
$$= n\binom{n-1}{i-1}\int_0^1 \left(\frac{1}{(1-u)^{1/\alpha}} - 1\right) u^{i-1}(1-u)^{n-i}\,du$$
$$= n\binom{n-1}{i-1}\left[\int_0^1 u^{i-1}(1-u)^{n-i-1/\alpha}\,du - \int_0^1 u^{i-1}(1-u)^{n-i}\,du\right]$$
$$= \frac{\Gamma(n+1)}{\Gamma(i)\,\Gamma(n-i+1)}\left[B\!\left(i,\, n-i+1-\frac{1}{\alpha}\right) - B(i,\, n-i+1)\right]$$
$$= \frac{\Gamma(n+1)}{\Gamma(n-i+1)}\cdot\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)} - 1.$$
$$E[X_{(i)}^2] = n\binom{n-1}{i-1}\int_0^{\infty} x^2\,[F(x)]^{i-1} f(x)\,[1-F(x)]^{n-i}\,dx$$
$$= n\binom{n-1}{i-1}\int_0^1 \left(\frac{1}{(1-u)^{1/\alpha}} - 1\right)^{\!2} u^{i-1}(1-u)^{n-i}\,du$$
$$= n\binom{n-1}{i-1}\left[B\!\left(i,\, n-i+1-\frac{2}{\alpha}\right) - 2B\!\left(i,\, n-i+1-\frac{1}{\alpha}\right) + B(i,\, n-i+1)\right]$$
$$= \frac{\Gamma(n+1)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)} - 2\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right) + 1.$$
$$E[X_{(i)}X_{(j)}] = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\int_0^{\infty}\!\!\int_0^{y} x\,y\,[F(x)]^{i-1} f(x)\,[F(y)-F(x)]^{j-i-1} f(y)\,[1-F(y)]^{n-j}\,dx\,dy,$$
and with u = F(x), v = F(y), the double integral becomes
$$\int_0^1\!\!\int_0^{v} F^{-1}(u)\,F^{-1}(v)\,u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}\,du\,dv$$
$$= \int_0^1\!\!\int_0^{v}\left(\frac{1}{(1-u)^{1/\alpha}} - 1\right)\!\left(\frac{1}{(1-v)^{1/\alpha}} - 1\right) u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}\,du\,dv$$
$$= \int_0^1\!\!\int_0^{v}\frac{u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}}{(1-u)^{1/\alpha}(1-v)^{1/\alpha}}\,du\,dv
 - \int_0^1\!\!\int_0^{v}\frac{u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}}{(1-u)^{1/\alpha}}\,du\,dv$$
$$\quad - \int_0^1\!\!\int_0^{v}\frac{u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}}{(1-v)^{1/\alpha}}\,du\,dv
 + \int_0^1\!\!\int_0^{v} u^{i-1}(v-u)^{j-i-1}(1-v)^{n-j}\,du\,dv.$$
Let 1 − u = x and 1 − v = y. Since 0 ≤ u < v ≤ 1, we have 0 ≤ y < x ≤ 1, so that 0 < y/x < 1, and the four integrals become
$$\int_0^1\!\!\int_0^{x} x^{-1/\alpha}y^{-1/\alpha}(1-x)^{i-1}(x-y)^{j-i-1}y^{n-j}\,dy\,dx
- \int_0^1\!\!\int_0^{x} x^{-1/\alpha}(1-x)^{i-1}(x-y)^{j-i-1}y^{n-j}\,dy\,dx$$
$$\quad - \int_0^1\!\!\int_0^{x} y^{-1/\alpha}(1-x)^{i-1}(x-y)^{j-i-1}y^{n-j}\,dy\,dx
+ \int_0^1\!\!\int_0^{x} (1-x)^{i-1}(x-y)^{j-i-1}y^{n-j}\,dy\,dx.$$
$$\int_0^1\! x^{-1/\alpha}(1-x)^{i-1}x^{(j-i-1)+(n-j-1/\alpha)+1}\!\int_0^1\!\left(1-\frac{y}{x}\right)^{j-i-1}\!\left(\frac{y}{x}\right)^{n-j-1/\alpha}\! d\!\left(\frac{y}{x}\right) dx
= B\!\left(n-i+1-\frac{2}{\alpha},\, i\right) B\!\left(n-j+1-\frac{1}{\alpha},\, j-i\right),$$
$$\int_0^1\! x^{-1/\alpha}(1-x)^{i-1}x^{(j-i-1)+(n-j)+1}\!\int_0^1\!\left(1-\frac{y}{x}\right)^{j-i-1}\!\left(\frac{y}{x}\right)^{n-j}\! d\!\left(\frac{y}{x}\right) dx
= B\!\left(n-i+1-\frac{1}{\alpha},\, i\right) B(n-j+1,\, j-i),$$
$$\int_0^1\! (1-x)^{i-1}x^{(j-i-1)+(n-j-1/\alpha)+1}\!\int_0^1\!\left(1-\frac{y}{x}\right)^{j-i-1}\!\left(\frac{y}{x}\right)^{n-j-1/\alpha}\! d\!\left(\frac{y}{x}\right) dx
= B\!\left(n-i+1-\frac{1}{\alpha},\, i\right) B\!\left(n-j+1-\frac{1}{\alpha},\, j-i\right),$$
$$\int_0^1\! (1-x)^{i-1}x^{(j-i-1)+(n-j)+1}\!\int_0^1\!\left(1-\frac{y}{x}\right)^{j-i-1}\!\left(\frac{y}{x}\right)^{n-j}\! d\!\left(\frac{y}{x}\right) dx
= B(n-i+1,\, i)\,B(n-j+1,\, j-i).$$
Therefore,
$$E[X_{(i)}X_{(j)}] = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\left[B\!\left(n-i+1-\frac{2}{\alpha},\, i\right)B\!\left(n-j+1-\frac{1}{\alpha},\, j-i\right)
- B\!\left(n-i+1-\frac{1}{\alpha},\, i\right)B(n-j+1,\, j-i)\right.$$
$$\left.\quad - B\!\left(n-i+1-\frac{1}{\alpha},\, i\right)B\!\left(n-j+1-\frac{1}{\alpha},\, j-i\right)
+ B(n-i+1,\, i)\,B(n-j+1,\, j-i)\right],$$
which reduces to the expression stated in Lemma A.1.
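Before moving on to the proof of Theorem 2.1, the moment formulas of Lemma A.1 are easy to check numerically. The following sketch compares the exact E[X_(i)] with a Monte Carlo average of the ith order statistic; the sampling scheme and names are assumptions for illustration.

```python
import math
import numpy as np

def exact_E1(n, i, alpha):
    """E[X_(i)] from Lemma A.1, written with log-gamma for stability."""
    lg = math.lgamma
    return math.exp(lg(n + 1) - lg(n - i + 1)
                    + lg(n - i + 1 - 1.0 / alpha) - lg(n + 1 - 1.0 / alpha)) - 1.0

# Monte Carlo check (assumes alpha > 1 so that the mean exists).
n, i, alpha, m = 10, 3, 3.0, 200_000
rng = np.random.default_rng(4)
x = np.sort(rng.uniform(size=(m, n)) ** (-1.0 / alpha) - 1.0, axis=1)
print(exact_E1(n, i, alpha), x[:, i - 1].mean())
```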
Proof of Theorem 2.1 From Equation (15),
$$E[\hat{\mu}_w^2] = w^2\sum_{i=2}^{n-1}E[X_{(i)}^2] + 2w^2\sum_{i=2}^{n-2}\sum_{j=i+1}^{n-1}E[X_{(i)}X_{(j)}]
+ \frac{1}{4}(1-(n-2)w)^2\big(E[X_{(1)}^2] + E[X_{(n)}^2] + 2E[X_{(1)}X_{(n)}]\big)$$
$$\quad + w(1-(n-2)w)\sum_{i=2}^{n-1}\big(E[X_{(1)}X_{(i)}] + E[X_{(i)}X_{(n)}]\big).$$
Let
$$A = \sum_{i=2}^{n-1}E[X_{(i)}^2] + 2\sum_{i=2}^{n-2}\sum_{j=i+1}^{n-1}E[X_{(i)}X_{(j)}]; \qquad
B = E[X_{(1)}^2] + E[X_{(n)}^2] + 2E[X_{(1)}X_{(n)}];$$
$$C = \sum_{i=2}^{n-1}\big(E[X_{(1)}X_{(i)}] + E[X_{(i)}X_{(n)}]\big); \qquad
D = \sum_{i=2}^{n-1}E[X_{(i)}]; \qquad
E = E[X_{(1)}] + E[X_{(n)}];$$
then
$$\mathrm{MSE}(\hat{\mu}_w) = A w^2 + \frac{1}{4}(1-(n-2)w)^2 B + w(1-(n-2)w)C - 2\mu\left[D w + \frac{1}{2}(1-(n-2)w)E\right] + \mu^2$$
$$= \left[A + \frac{1}{4}(n-2)^2 B - (n-2)C\right]w^2 + \left[-\frac{1}{2}(n-2)B + C - 2\mu D + \mu(n-2)E\right]w + \frac{1}{4}B - \mu E + \mu^2.$$
By Lemma A.1, for α > 2,
$$A = \sum_{i=2}^{n-1}E[X_{(i)}^2] + 2\sum_{i=2}^{n-2}\sum_{j=i+1}^{n-1}E[X_{(i)}X_{(j)}]
= \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)} - 2\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right) + 1\right] + 2\sum_{i=2}^{n-2}\sum_{j=i+1}^{n-1}E[X_{(i)}X_{(j)}],$$
$$C = \sum_{i=2}^{n-1}\big(E[X_{(1)}X_{(i)}] + E[X_{(i)}X_{(n)}]\big)
= \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)\,\Gamma(n-i+1-1/\alpha)}{\Gamma(n-i+1)}\left(\frac{\Gamma(n-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{n\,\Gamma(n-1/\alpha)}{\Gamma(n+1-1/\alpha)}\right.$$
$$\qquad\left. + n!\,\Gamma(1-1/\alpha)\left(\frac{\Gamma(n-i+1-2/\alpha)}{\Gamma(n+1-2/\alpha)\,\Gamma(n-i+1-1/\alpha)} - \frac{1}{\Gamma(n+1-1/\alpha)}\right) - \frac{\Gamma(n+1)\,\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)\,\Gamma(n-i+1)} + 2\right],$$
and
$$D = \sum_{i=2}^{n-1}E[X_{(i)}] = \sum_{i=2}^{n-1}\left[\frac{\Gamma(n+1)}{\Gamma(n-i+1)}\,\frac{\Gamma(n-i+1-1/\alpha)}{\Gamma(n+1-1/\alpha)} - 1\right],$$
with B and E evaluated from Lemma A.1 in the same way. This gives Theorem 2.1.
Lemma A.2 MSE(μ̂w) in Equation (17) is a convex function with a > 0 when n > 4C/B + 2, where B and C are given in Equation (17).
$$a = A + \frac{1}{4}(n-2)^2 B - (n-2)C, \quad \text{where}$$
$$A = E\!\left[\Big(\sum_{i=2}^{n-1}X_{(i)}\Big)^{2}\right] \ge 0, \qquad
B = E\big[(X_{(1)} + X_{(n)})^2\big] \ge 0, \qquad
C = E\!\left[\sum_{i=2}^{n-1}X_{(i)}\big(X_{(1)} + X_{(n)}\big)\right] \ge 0,$$
so that
$$\frac{1}{4}(n-2)B - C > 0, \quad \text{and hence } a > 0, \text{ when } n > \frac{4C}{B} + 2.$$
Therefore
$$f(w) = \mathrm{MSE}(\hat{\mu}_w) = a w^2 + b w + c,$$
which is a quadratic function with first and second derivatives f′(w) = 2aw + b and f″(w) = 2a > 0 with respect to w; hence f(w) is convex and attains its minimum at w = −b/(2a).
Appendix 2
Table A1. Exact efficiencies of μ̂w relative to X̄ and μ̂MLE, with wL1-opt = 1/√(n(n − 1)) (sample size n = 10, 20, 30, 50; α = 2.5, 3, 4). Columns: n; α; wL1-opt; w = 1/n; MSE(μ̂w); MSE(X̄); MSE(μ̂MLE); EFF(μ̂w(X̄)); EFF(μ̂w(MLE)).
Note: The L1-optimal μ̂w has 21 out of 24 cases (87.5%, in bold) with exact efficiencies greater than one.
Table A2. Exact efficiencies of μ̂w relative to X̄ and μ̂MLE, with wL2-opt = −b/2a (sample size n = 10, 20, 30, 50; α = 2.5, 3, 4). Columns: n; α; wL2-opt; w = 1/n; MSE(μ̂w); MSE(X̄); MSE(μ̂MLE); EFF(μ̂w(X̄)); EFF(μ̂w(MLE)).
Note: The L2-optimal μ̂w has 24 out of 24 cases (100%, in bold) with exact efficiencies greater than one.
Table A3. Simulation efficiencies of μ̂w relative to X̄ and μ̂MLE, with wL1-opt = 1/√(n(n − 1)) (generated m = 1,000,000 times; sample size n = 10, 20, 50, 100; α = 2.5, 3, 4; SEFF(μ̂w(X̄)) = SMSE(X̄)/SMSE(μ̂w); SEFF(μ̂w(MLE)) = SMSE(μ̂MLE)/SMSE(μ̂w)). Columns: n; α; wL1-opt; w = 1/n; MSE(μ̂w); MSE(X̄); MSE(μ̂MLE); EFF(μ̂w(X̄)); EFF(μ̂w(MLE)).
Note: The L1-optimal μ̂w has 22 out of 24 cases (91.7%, in bold) with simulation efficiencies greater than one.
Table A4. Simulation efficiencies of μ̂w relative to X̄ and μ̂MLE, with wL2-opt = −b/2a (generated m = 1,000,000 times; sample size n = 10, 20, 50, 100; α = 2.5, 3, 4; SEFF(μ̂w(X̄)) = SMSE(X̄)/SMSE(μ̂w); SEFF(μ̂w(MLE)) = SMSE(μ̂MLE)/SMSE(μ̂w)). Columns: n; α; wL2-opt; w = 1/n; MSE(μ̂w); MSE(X̄); MSE(μ̂MLE); EFF(μ̂w(X̄)); EFF(μ̂w(MLE)).
Note: The L2-optimal μ̂w has 24 out of 24 cases (100%, in bold) with simulation efficiencies greater than one.
Table A5. Simulation efficiencies of the L1-optimal weighted density estimator f̂w relative to the moment and MLE methods (generated m = 1,000,000 times; sample size n = 10, 20, 30, 50; wL1-opt = 1/√(n(n − 1)); SEFF(f̂w(X̄))(x) = SMSE(f̂X̄(x))/SMSE(f̂w(x)); SEFF(f̂w(MLE))(x) = SMSE(f̂MLE(x))/SMSE(f̂w(x))). Columns: estimator; n; x = 1, 2, 3, 4, 5, 6, 8, 10, 12.
Note: There are 9 × 4 × 2 = 72 cases. Overall, the weighted method has 120 out of 144 (83.3%, in bold) cases with efficiencies > 1.
(a) The L1-optimal weighted method has 57 out of 72 (79.9%, in bold) cases with efficiencies > 1.
(b) The L1-optimal weighted method has 63 out of 72 (87.5%, in bold) cases with efficiencies > 1.
Table A6. Simulation efficiencies of the L2-optimal weighted density estimator f̂w relative to the moment and MLE methods (generated m = 1,000,000 times; sample size n = 10, 20, 30, 50; wL2-opt = −b/2a; SEFF(f̂w(X̄))(x) = SMSE(f̂X̄(x))/SMSE(f̂w(x)); SEFF(f̂w(MLE))(x) = SMSE(f̂MLE(x))/SMSE(f̂w(x))). Columns: estimator; n; x = 1, 2, 3, 4, 5, 6, 8, 10, 12.
Note: There are 9 × 4 × 2 = 72 cases. Overall, the weighted method has 113 out of 144 (78.5%, in bold) cases with efficiencies > 1.
(a) The L2-optimal weighted method has 51 out of 72 (70.8%, in bold) cases with efficiencies > 1.
(b) The L2-optimal weighted method has 62 out of 72 (86.1%, in bold) cases with efficiencies > 1.
Table A7. Simulation efficiencies of the L1-optimal weighted upper tail probability estimator 1 − F̂w(x) relative to the moment, MLE and MRE methods (generated m = 1000 times; sample size n = 10, 20, 30, 50; wL1-opt = 1/√(n(n − 1))). Columns: estimator; n; x = 1, 2, 3, 4, 5, 6, 8, 10, 12.
Note: There are 9 × 4 × 3 = 108 cases. Overall, the weighted method has 213 out of 216 (98.6%, in bold) cases with efficiencies > 1.
(a) The L1-optimal weighted method has 106 out of 108 (98.2%, in bold) cases with efficiencies > 1.
(b) The L1-optimal weighted method has 107 out of 108 (99.1%, in bold) cases with efficiencies > 1.
Table A8. Simulation efficiencies of the L2-optimal weighted upper tail probability estimator 1 − F̂w(x) relative to the moment, MLE and MRE methods (generated m = 1000 times; sample size n = 10, 20, 30, 50; wL2-opt = −b/2a).

Estimator       n      x=1     x=2     x=3     x=4     x=5     x=6     x=8     x=10    x=12

(A) Pareto distribution, α = 2.5 (a)
1 − F̂X̄(x)     10   0.8857  1.2987  1.7582  2.2373  2.7192  3.1941  4.1067  4.9587  5.7530
                20   0.9019  1.2329  1.5825  1.9413  2.3037  2.6665  3.3845  4.0868  4.7711
                30   0.9320  1.2279  1.5315  1.8397  2.1505  2.4619  3.0844  3.7027  4.3157
                50   0.9594  1.2027  1.4398  1.6717  1.8989  2.1218  2.5551  2.9739  3.3796
1 − F̂MLE(x)    10   1.0463  1.8241  2.7562  3.7812  4.8582  5.9608  8.1855  10.384  12.533
                20   0.9399  1.3886  1.8603  2.3395  2.8175  3.2892  4.2039  5.0746  5.9019
                30   0.8938  1.2123  1.5220  1.8197  2.1047  2.3765  2.8839  3.3468  3.7719
                50   0.8565  1.0778  1.2764  1.4571  1.6229  1.7761  2.0514  2.2925  2.5070
1 − F̂MRE(x)    10   2.0095  4.8864  9.7514  13.397  18.659  24.413  37.043  50.776  65.330
                20   1.3414  2.4500  3.6980  5.0325  6.4205  7.8414  10.730  13.634  16.520
                30   1.1527  1.8285  2.5200  3.2112  3.8928  4.5634  5.8570  7.0865  8.2552
                50   1.0021  1.3969  1.7638  2.1062  2.4270  2.7287  3.2827  3.7809  4.2339

(B) Pareto distribution, α = 3 (b)
1 − F̂X̄(x)     10   0.9186  1.3166  1.7466  2.1827  2.6102  3.0228  3.7954  4.4979  4.1410
                20   0.9392  1.2549  1.5785  1.9014  2.2201  2.5319  3.1323  3.7013  4.2413
                30   0.9728  1.2510  1.5281  1.8018  2.0708  2.3349  2.8476  3.3403  3.8158
                50   0.9977  1.2242  1.4393  1.6447  1.8419  2.0316  2.3917  2.7299  3.0496
1 − F̂MLE(x)    10   1.0851  1.8492  2.7381  3.6889  4.6636  5.6411  7.5649  9.4184  11.199
                20   0.9797  1.4135  1.8557  2.2916  2.7153  3.1233  3.8906  4.5960  5.2465
                30   0.9330  1.2352  1.5186  1.7823  2.0268  2.2540  2.6625  3.0195  3.8158
                50   0.8906  1.0971  1.2760  1.4336  1.5742  1.7006  1.9198  2.1043  2.2622
1 − F̂MRE(x)    10   2.0839  4.9538  8.6941  13.069  17.912  23.101  34.235  46.056  58.378
                20   1.3982  2.4939  3.6887  4.9294  6.1877  7.4460  9.9307  12.347  14.687
                30   1.2032  1.8630  2.5144  3.1450  3.7497  4.3282  5.4071  6.3931  7.2990
                50   1.0421  1.4219  1.7632  2.0722  2.3541  2.6126  3.072   3.4706  3.8206

Note: There are 9 × 4 × 3 = 108 cases. Overall, the weighted method has 202 out of 216 (93.5%, in bold) cases with efficiencies > 1.
(a) The L2-optimal weighted method has 101 out of 108 (93.5%, in bold) cases with efficiencies > 1.
(b) The L2-optimal weighted method has 101 out of 108 (93.5%, in bold) cases with efficiencies > 1.