Distribution Models-Theory PDF
DISTRIBUTION
MODELS THEORY
QA9.7.D58 2006
511.3'4-dc22 2006048221
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
Preface v
Chapter 1
Modeling Income Distributions Using Elevated Distributions
on a Bounded Domain 1
J.R. van Dorp and S. Kotz
Chapter 2
Making Copulas Under Uncertainty 27
C. Garcia Garcia, J.M. Herrerias Velasco and
J.E. Trinidad Segovia
Chapter 3
Valuation Method of the Two Survival Functions 55
M. Franco Nicolas, R. Herrerias Pleguezuelo, J. Callejon Cespedes
and J.M. Vivo Molina
Chapter 4
Weighting Tools and Alternative Techniques to Generate
Weighted Probability Models in Valuation Theory 67
M. Franco Nicolas and J.M. Vivo Molina
Chapter 5
On Generating and Characterizing Some Discrete and
Continuous Distributions 85
M.A. Fajardo Caldera and J. Perez Mayo
Chapter 6
Some Stochastic Properties in Sampling from the Normal
Distribution 101
J.M. Fernandez Ponce, T. Gomez Gomez, J.L. Pino Mejias
and R. Rodriguez Grinolo
Chapter 7
Generating Function and Polarization 111
R.M. Garcia Fernandez
Chapter 8
A New Measure of Dissimilarity Between Distributions:
Application to the Analysis of Income Distributions
Convergence in the European Union 125
F.J. Callealta Barroso
Chapter 9
Using the Gamma Distribution to Fit Fecundity Curves for
Application in Andalusia (Spain) 161
F. Abad Montes, M.D. Huete Morales and M. Vargas Jimenez
Chapter 10
Classes of Bivariate Distributions with Normal and Lognormal
Conditionals: A Brief Revision 173
J.M. Sarabia, E. Castillo, M. Pascual and M. Sarabia
Chapter 11
Inequality Measures, Lorenz Curves and Generating Functions 189
J.J. Nunez Velazquez
Chapter 12
Extended Waring Bivariate Distribution 221
J. Rodriguez Avi, A. Conde Sanchez, A.J. Saez Castillo
and M.J. Olmo Jimenez
Chapter 13
Applying a Bayesian Hierarchical Model in Actuarial Science:
Inference and Ratemaking 233
J.M. Perez Sanchez, J. M. Sarabia Alegria, E. Gomez Deniz
and F.J. Vazquez Polo
Chapter 14
Analysis of the Empirical Distribution of the Residuals Derived
from Fitting the Heligman and Pollard Curve to Mortality Data 243
F. Abad Montes, M.D. Huete Morales and M. Vargas Jimenez
Chapter 15
Measuring the Efficiency of the Spanish Banking Sector:
Super-Efficiency and Profitability 285
J. Gomez Garcia, J. Solana Ibanez and J.C. Gomez Gallego
Chapter 1
MODELING INCOME DISTRIBUTIONS USING ELEVATED DISTRIBUTIONS
ON A BOUNDED DOMAIN
SAMUEL KOTZ
Engineering Management and Systems Engineering Department
The George Washington University
1776 G street, Suite 110, NW, Washington DC, 20052
This paper presents a new two-parameter family of continuous distributions on a bounded
domain which has an elevated but finite density value at its lower bound. Such a
characteristic appears to be useful, for example, when representing income distributions
at lower income ranges. The family generalizes the one-parameter Topp and Leone
distribution, which originated in the 1950s and was recently rediscovered. The family of beta
distributions has been used for modeling bounded income distribution phenomena, but it
only allows for infinite or zero density values at its lower bound, or a constant
density of 1 in the case of its uniform member. The proposed family alleviates this apparent
jump discontinuity at the lower bound. The U.S. income distribution data for the year
2001 are used to fit distributions for Caucasian (Non-Hispanic), Hispanic and
African-American populations via a maximum likelihood procedure. The results reveal stochastic
ordering when comparing the Caucasian (Non-Hispanic) income distribution to that of
the Hispanic or African-American population. This indicates that although
substantial advances have reportedly been made in reducing the income distribution gap
amongst different ethnic groups in the U.S. during the last 20 years or so, these
differences still exist.
1. Introduction
In a 1955 issue of the Journal of the American Statistical Association, an
isolated paper on a bounded continuous distribution by Topp and Leone [1]
appeared, which received little attention. The paper was rediscovered by
Nadarajah and Kotz [2], motivated by the investigations of van Dorp and Kotz
[3,4] on the Two-Sided Power (TSP) distribution and other alternatives to the
popular and versatile beta distribution, which has been used in various
applications for over a century. Even in the late 1990s the
arsenal of bounded univariate distributions contained very few members.
Amongst them, the triangular and uniform distributions are the most widely used,
together with some "curious" distributions appearing as problems or exercises in
various mathematical and statistical journals. Other, somewhat artificial,
empirical bounded continuous distributions are based on mathematical
transformations of the normal distribution (which has an unbounded domain); the most
widespread amongst them is perhaps the Johnson [5] family of
transformations. On the other hand, the existence of multitudes of unbounded
continuous distributions developed in the 20th century is well known and amply
documented.
The construction of the Topp and Leone distribution is quite straightforward
and based on the principle that by raising an arbitrary cdf $F(x) \in [0,1]$ to an
arbitrary power $\beta > 0$, a new cdf $G(x) = F^{\beta}(x)$ emerges with one additional
parameter. This device was used in 1939 by W. Weibull [6] in proposing his
Weibull distribution, which achieved substantial popularity in the second part
of the 20th century, especially in reliability and biometrical applications. The
cdf $F(x)$ in the above construction method may be referred to as the generating
cdf. Figure 1 demonstrates the construction of the Topp and Leone distribution.
The generating density of the Topp and Leone family is the right triangular
density $(2 - 2x)$, $x \in [0,1]$. It is displayed in Figure 1A. Figure 1B depicts its
cdf $(2x - x^2)$ and Figures 1C and 1D plot the pdf and cdf of a one-parameter
Topp and Leone distribution for $\beta = 3$. Note the appearance of a mode in the
Figure 1. Construction of Topp and Leone distribution from a right triangular distribution
[Figure 2, panels A-D, horizontal axes $x$ from 0 to 1. A: slope pdf $\alpha - 2(\alpha-1)x$;
B: slope cdf $\alpha x - (\alpha-1)x^2$; C: GTL pdf $3\{\alpha x - (\alpha-1)x^2\}^2\{\alpha - 2(\alpha-1)x\}$;
D: GTL cdf $\{\alpha x - (\alpha-1)x^2\}^3$.]
Figure 2. Construction of generalized Topp and Leone distribution from a slope distribution
Now the Generalized Topp and Leone (GTL) distribution that follows from
Figure 2B (utilizing the above construction method with $\beta = 3$) is depicted in
Figure 2D. The density associated with this cdf is displayed in Figure 2C. Note
that, while a mode in (0,1) is present in Figure 2C, it has been shifted to the
right when compared to the situation in Figure 1C. More importantly, the
density at the upper bound is strictly positive in Figure 2C while being zero in
Figure 1C (representing the original Topp and Leone density).
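The construction described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: the function names are hypothetical, the generating cdf is taken to be the slope cdf $\alpha x - (\alpha-1)x^2$ shown in Figure 2B, and the value $\alpha = 1.5$ is an assumed example (the figure's own value is not recoverable here). It raises the generating cdf to the power $\beta$ and compares the density at the upper bound for the original Topp and Leone case ($\alpha = 2$) with a GTL case.

```python
def slope_cdf(x, alpha):
    """Generating cdf alpha*x - (alpha-1)*x^2 on [0, 1], assuming 0 <= alpha <= 2."""
    return alpha * x - (alpha - 1.0) * x * x


def slope_pdf(x, alpha):
    """Derivative of the generating cdf: alpha - 2*(alpha-1)*x."""
    return alpha - 2.0 * (alpha - 1.0) * x


def gtl_cdf(x, alpha, beta):
    """G(x) = F(x)**beta: the power construction applied to the slope cdf."""
    return slope_cdf(x, alpha) ** beta


def gtl_pdf(x, alpha, beta):
    """Chain rule: g(x) = beta * F(x)**(beta-1) * f(x)."""
    return beta * slope_cdf(x, alpha) ** (beta - 1.0) * slope_pdf(x, alpha)


# alpha = 2 gives the right triangular generating density 2 - 2x,
# i.e. the original Topp and Leone case, whose density vanishes at x = 1.
tl_density_at_upper_bound = gtl_pdf(1.0, 2.0, 3.0)

# An assumed alpha = 1.5 keeps the density at x = 1 strictly positive:
# 3 * 1**2 * (1.5 - 2 * 0.5) = 1.5
gtl_density_at_upper_bound = gtl_pdf(1.0, 1.5, 3.0)
```

With $\beta = 3$ the two boundary values differ exactly as the text describes: zero for the Topp and Leone density and a positive value for the generalized version.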
Our main interest in this paper is to represent income distributions. We shall
therefore consider the reflected version of the Generalized Topp and Leone
(GTL) distribution utilizing the cdf transformation $H(x) = 1 - G(1-x)$, where $G$
is a GTL cdf on $[0,1]$. This transformation typically assigns the mode
towards the left hand side of its support and allows for strictly positive density
values at the lower bound. This form seems to be appropriate when representing
income distributions at lower income ranges. (Compare, e.g., with Figure 2 of
Barsky et al. [8], p. 668). The U.S. income distribution data for the year 2001 are
used to fit Reflected GTL (RGTL) distributions for Caucasian (Non-Hispanic),
Hispanic and African-American populations via a maximum likelihood
procedure. The results reveal stochastic ordering when comparing the Caucasian
(Non-Hispanic) income distribution to that of the Hispanic or African-American
populations. In particular, when compared with Americans of Caucasian origin,
African-Americans appear to be approximately 1.9 times as likely and
Hispanics 1.5 times as likely to have inadequate or no income at all. This
indicates that although substantial advances have indeed occurred in reducing
the income distribution gap amongst different ethnic groups in the U.S. during
the last 20 years or so (see, e.g., Couch and Daly [9]), these differences still
exist.
Another reason to consider reflected GTL distributions rather than GTL
distributions is that a drift of the mode towards the left hand side mimics the
behavior of the classical unbounded continuous distributions such as the
Gamma, Weibull and Lognormal. (We note, in passing, that these three
distributions are in strong competition amongst themselves as to which is the
best one for fitting numerous phenomena in economics, engineering and medical
applications). One can therefore conjecture that the application of Reflected GTL
(RGTL) distributions may not be limited to the area of income distributions.
In Section 2, we shall present the cdf and pdf of a four-parameter RGTL
distribution and investigate its various forms. In Section 3, we will elaborate on
some properties of RGTL distributions. Moment expressions for RGTL
distributions, to the best of our knowledge, cannot be derived in closed form
(except for certain special cases). The cdf of the beta distribution, while not
available in closed form (whereas that of an RGTL distribution is), is nevertheless
useful for calculating moments of RGTL distributions for $1 < \alpha \le 2$. In Section 4,
we shall discuss a Maximum Likelihood Estimation (MLE) procedure utilizing
standard root finding algorithms that are readily available in various software
packages such as Microsoft Excel. In Section 5, we shall fit RGTL
distributions to the U.S. 2001 income distribution data with seemingly
satisfactory results. Some brief concluding remarks are presented in Section 6.
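$$F(x \mid a,b,\alpha,\beta) = 1 - \left[\frac{b-x}{b-a}\left\{\alpha - (\alpha-1)\frac{b-x}{b-a}\right\}\right]^{\beta} \tag{1}$$
where $a \le x \le b$, $0 \le \alpha \le 2$ and $\beta > 0$. Evidently, $F(a) = 0$ and $F(b) = 1$. The
probability density function (pdf) follows from (1) to be
$$f(x \mid a,b,\alpha,\beta) = \frac{\beta}{b-a}\left[\frac{b-x}{b-a}\left\{\alpha - (\alpha-1)\frac{b-x}{b-a}\right\}\right]^{\beta-1}\left\{\alpha - 2(\alpha-1)\frac{b-x}{b-a}\right\} \tag{2}$$
with the same constraints on $x$, $\alpha$ and $\beta$ as in (1). From (2) it follows that in
particular
$$f(a \mid a,b,\alpha,\beta) = \frac{\beta(2-\alpha)}{b-a} \tag{3}$$
and
$$f(b \mid a,b,\alpha,\beta) = \begin{cases} 0, & \beta > 1 \\ \dfrac{\alpha}{b-a}, & \beta = 1 \\ \to \infty \text{ as } x \uparrow b, & \beta < 1 \end{cases} \tag{4}$$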
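As a quick check of (1) to (3), the following sketch evaluates the RGTL cdf and pdf directly. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the parameter values $\hat\alpha = 1.679$, $\hat\beta = 6.767$, $a = 0$, $b = 250{,}000$ are the fitted Caucasian (Non-Hispanic) values reported in Table 2 of this chapter.

```python
def rgtl_cdf(x, a, b, alpha, beta):
    """RGTL cdf (1) for a <= x <= b, 0 <= alpha <= 2, beta > 0."""
    y = (b - x) / (b - a)
    return 1.0 - (y * (alpha - (alpha - 1.0) * y)) ** beta


def rgtl_pdf(x, a, b, alpha, beta):
    """RGTL pdf (2); for beta < 1 it is only finite for x < b."""
    y = (b - x) / (b - a)
    core = y * (alpha - (alpha - 1.0) * y)
    slope = alpha - 2.0 * (alpha - 1.0) * y
    return beta / (b - a) * core ** (beta - 1.0) * slope


# Density at the lower bound, which (3) gives as beta * (2 - alpha) / (b - a).
density_at_lower_bound = rgtl_pdf(0.0, 0.0, 250000.0, 1.679, 6.767)
closed_form_value = 6.767 * (2.0 - 1.679) / 250000.0
```

The two values agree, and both are close to the entry 8.68e-6 listed for this group in Table 5.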
Relation (3) shows that the RGTL family allows for arbitrary density values
at its lower bound a. Expressions (1) and (2) are reduced to the Topp and Leone
distribution (see Topp and Leone [1]):
[Figure 3, panels A-H: RGTL pdfs on $[0,1]$; recoverable panel labels: E: $\alpha = 0.5$, $\beta = 2$;
F: $\alpha = 0.5$, $\beta = 1$.]
Note that in the case of Figure 3B the pdf assumes a form similar to that of a
reliability function, whereas Figure 3C displays a mode at a value greater than 0.
Similarly, in Figures 3E to 3H the pdfs with the same value of $\alpha$ (= 0.5) and
progressively decreasing $\beta$ from 2 to 0.25 show the change in form of the pdf
from a monotonically decreasing concave form, through a linear function with
decreasing slope and a mild U-shaped function, to a monotonically increasing
convex curve.
The J-shaped form of the pdf in Figure 3E ($a = 0$, $b = 1$, $\alpha = 0.5$, $\beta = 2$)
resembles that of a Weibull distribution with the shape parameter less than one
(but on a bounded domain). Note that the structure of (1) is reminiscent of that
of the Weibull cdf. Figures 3G and 3H depict a U-shaped pdf form ($a = 0$, $b = 1$,
$\alpha = 0.5$, $\beta = 0.75$) and a J-shaped pdf form ($a = 0$, $b = 1$, $\alpha = 0.5$, $\beta = 0.25$),
respectively, and are similar to those appearing in the beta family, but with a
bounded density value at the lower bound (cf. (3)). Setting $\alpha = 1$, $\beta = 1$ in (2)
yields a uniform distribution on $[a, b]$. Hence, analogously to the
four-parameter beta distribution with the pdf
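$$\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)(b-a)}\left(\frac{x-a}{b-a}\right)^{\alpha-1}\left(\frac{b-x}{b-a}\right)^{\beta-1} \tag{6}$$
where $a < x < b$, $\alpha > 0$ and $\beta > 0$, and the Two-Sided Power family (see van
Dorp and Kotz [3,4]) with the pdf
$$f(x \mid a,m,b,n) = \begin{cases} \dfrac{n}{b-a}\left(\dfrac{x-a}{m-a}\right)^{n-1}, & a < x \le m \\[2ex] \dfrac{n}{b-a}\left(\dfrac{b-x}{b-m}\right)^{n-1}, & m \le x < b \end{cases} \tag{7}$$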
where $n > 0$, the RGTL family has the uniform distribution on $[a, b]$ as one of
its members. Another common member amongst these 3 families (Beta, TSP
and RGTL) is the reflected power (RP) distribution on $[a, b]$ with the pdf
$$f(x \mid a,b,\alpha = 1,\beta) = \frac{\beta}{b-a}\left(\frac{b-x}{b-a}\right)^{\beta-1} \tag{8}$$
obtained by substituting $\alpha = 1$ in (2). Substituting $\alpha = 0$ in (2) also yields the
reflected power distribution but with parameter $2\beta$. The reader is encouraged to
construct diagrams connecting the above cited distributions.
A distinguishing feature of RGTL distributions, compared with
distributions (6) and (7), is the existence of additional pdf forms with a positive
density value at the lower bound (see Figures 3B-3H), allowing representation of
uncertain phenomena with such a property. Another feature of RGTL
distributions (indicating a lesser flexibility within the same family) is that the
pdfs of a GTL distribution and its reflection possess different functional
forms, whereas the reflection of a TSP pdf as well as that of a beta pdf belong to the
same functional family.
$$\frac{d f(x \mid \alpha,\beta)}{dx} = C(x \mid \alpha,\beta)\, f(x \mid \alpha,\beta) \tag{17}$$
where the multiplier
$C(x \mid \alpha,\beta) =$
This seems to be the most interesting case. From (17), (18) and (19)
it follows that the SRGTL pdf (12) may possess a mode in $(0,1)$. Defining
$y = 1 - x$ and setting the derivative (17) to zero yields the following quadratic
equation in $y$:
$$2(\alpha-1)^2 y^2 - 2\alpha(\alpha-1)y + \frac{(\beta-1)\alpha^2}{2\beta-1} = 0. \tag{20}$$
(The left hand side of (20) is a parabolic function in $y$.) Noting that the
symmetry axis of the parabola associated with the l.h.s. of (20) has the value
$$\frac{\alpha}{2(\alpha-1)} \tag{21}$$
which is strictly greater than 1 for $1 < \alpha < 2$, and that $y = 1 - x \in [0,1] \Leftrightarrow
x \in [0,1]$, it follows that out of the two possible solutions of (20) only the
solution
$$y^* = \frac{\alpha}{2(\alpha-1)}\left\{1 - \sqrt{\frac{1}{2\beta-1}}\right\} \tag{22}$$
can yield a mode $x^* \in (0,1)$. Moreover, from $1 < \alpha < 2$, $\beta > 1$ it follows that
$y^* > 0$. Also, from (22) we have that $y^* \to \frac{\alpha}{2(\alpha-1)} > 1$ for $1 < \alpha < 2$ when
$\beta \to \infty$. Hence, from (22) we conclude that the mode $x^* = 1 - y^*$ is
$$x^* = \mathrm{Max}\left[0,\; 1 - \frac{\alpha}{2(\alpha-1)}\left\{1 - \sqrt{\frac{1}{2\beta-1}}\right\}\right]. \tag{23}$$
Setting $\alpha = 1.5$ and $\beta = 2$ (as in Figure 3C) yields
$x^* = \mathrm{Max}\left[0, -\tfrac{1}{2} + \tfrac{3}{2}\sqrt{\tfrac{1}{3}}\right] \approx 0.366$. Setting $\alpha = 1.5$ and $\beta = 6$ (as in Figure 3B)
yields $x^* = \mathrm{Max}\left[0, -\tfrac{1}{2} + \tfrac{3}{2}\sqrt{\tfrac{1}{11}}\right] = 0$ and hence a mode is located at the lower
bound 0 with value $\beta(2-\alpha) = 3$ (cf. (3) with $a = 0$, $b = 1$). Utilizing (23) it
follows that a Standard Reflected Topp and Leone distribution ($\alpha = 2$) has a
mode at
$$\sqrt{\frac{1}{2\beta-1}}$$
for $\beta > 1$. Setting $\beta = 3$ (as in Figure 3A) yields a mode at $\tfrac{1}{5}\sqrt{5} \approx 0.447$.
Similarly to Case 2 it follows that the pdf (10) has an infinite mode at 1 for
$0 < \alpha < 1$, $\beta < 1$. However, from (17), (18) and (19) it follows that the pdf (10)
may also have an anti-mode $x^* \in (0,1)$ (resulting in a U-shaped form) in this
case. The formula for the anti-mode is also given by (23) provided $\beta > \tfrac{1}{2}$. For
example, setting $\alpha = 0.5$, $\beta = 0.75$ (as in Figure 3G) yields
$x^* = \mathrm{Max}\left[0, \tfrac{3}{2} - \tfrac{1}{2}\sqrt{2}\right]$ and hence an anti-mode at approximately 0.793.
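The worked values above can be reproduced with a few lines of code. This is a minimal sketch of formula (23) only; the function name is hypothetical, and it assumes $\alpha \ne 1$ and $\beta > 1/2$ so that the square root and the division are defined.

```python
import math


def srgtl_stationary_point(alpha, beta):
    """Mode (or anti-mode) of the standard RGTL pdf from (23).

    Assumes alpha != 1 and beta > 1/2; whether the point is a mode or an
    anti-mode depends on the parameter case discussed in the text.
    """
    if alpha == 1.0:
        raise ValueError("formula (23) is undefined for alpha = 1")
    if beta <= 0.5:
        raise ValueError("formula (23) requires beta > 1/2")
    symmetry_axis = alpha / (2.0 * (alpha - 1.0))                          # (21)
    y_star = symmetry_axis * (1.0 - math.sqrt(1.0 / (2.0 * beta - 1.0)))   # (22)
    return max(0.0, 1.0 - y_star)                                          # (23)


figure_3c_mode = srgtl_stationary_point(1.5, 2.0)        # about 0.366
figure_3b_mode = srgtl_stationary_point(1.5, 6.0)        # clipped to the lower bound
figure_3a_mode = srgtl_stationary_point(2.0, 3.0)        # sqrt(1/5), about 0.447
figure_3g_antimode = srgtl_stationary_point(0.5, 0.75)   # about 0.793
```

Applying the same function to the fitted African-American parameters of Table 2 ($\hat\alpha = 1.613$, $\hat\beta = 10.629$) returns 0, in line with the mode of \$0 reported in Table 4.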
Failure Rate
The failure rate function $r(t) = f(t)/\{1 - F(t)\}$ for an SRGTL density
follows from (9) and (10) to be
$$r(x) = D(\alpha,x)\,\frac{\beta}{1-x} \tag{24}$$
where
$$D(\alpha,x) = \frac{\alpha - 2(\alpha-1)(1-x)}{\alpha - (\alpha-1)(1-x)} \tag{25}$$
and it is straightforward to check that $\beta/(1-x)$ is the failure rate of a standard
reflected power (SRP) distribution ((10) with $\alpha = 1$). From (24) it follows that
$D(\alpha,x)$ may be interpreted as the relative increase (or decrease) in the failure
rate of an SRGTL distribution as compared to an SRP distribution. Taking the
derivative of (25) with respect to $x$ yields
$$\frac{dD(\alpha,x)}{dx} = \frac{\alpha(\alpha-1)}{\{\alpha - (\alpha-1)(1-x)\}^2}. \tag{26}$$
Hence, $D(1,x) = 1$ for all $x \in [0,1]$ and, since $D(\alpha,1) = 1$, it follows from (26) that
$D(\alpha,x) < 1$ ($> 1$) for all $x \in [0,1)$ when $1 < \alpha < 2$ ($0 < \alpha < 1$). Thus, $\alpha$ may be
interpreted as a failure deceleration parameter (relative to the standard reflected
power distribution) when $1 < \alpha < 2$ and a failure acceleration parameter when
$0 < \alpha < 1$. On the other hand, (24) shows that $\beta$ is a failure acceleration
parameter for all $\beta > 0$.
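The decomposition in (24) and (25) is easy to verify numerically. The sketch below uses hypothetical function names and takes the standard RGTL survival function $\{(1-x)(\alpha - (\alpha-1)(1-x))\}^{\beta}$ as given; it computes the relative factor $D(\alpha, x)$ and the resulting failure rate.

```python
def relative_failure_factor(alpha, x):
    """D(alpha, x) from (25): SRGTL failure rate relative to the SRP rate."""
    y = 1.0 - x
    return (alpha - 2.0 * (alpha - 1.0) * y) / (alpha - (alpha - 1.0) * y)


def srgtl_failure_rate(x, alpha, beta):
    """Failure rate (24) for 0 <= x < 1."""
    return relative_failure_factor(alpha, x) * beta / (1.0 - x)


# alpha = 1 reproduces the SRP failure rate beta / (1 - x) exactly.
srp_factor = relative_failure_factor(1.0, 0.3)

# 1 < alpha < 2 decelerates failure, 0 < alpha < 1 accelerates it.
decelerating_factor = relative_failure_factor(1.5, 0.25)
accelerating_factor = relative_failure_factor(0.5, 0.25)
```

For these example inputs the factor equals 1 for $\alpha = 1$, falls below 1 for $\alpha = 1.5$ and exceeds 1 for $\alpha = 0.5$.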
Cumulative Moments
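Due to the functional form of the cdf (9), calculations of the cumulative
moments
$$M_k = \int_0^1 x^k\{1 - F(x)\}\,dx \tag{27}$$
for SRGTL distributions have a slight advantage over those of the central moments
about the mean. The mean $\mu_1'$ and the central moments about the mean $\mu_2$
(variance), $\mu_3$ (skewness) and $\mu_4$ (kurtosis) are connected with the cumulative
moments $M_k$, $k = 0,\ldots,3$, via
$$\begin{aligned}
\mu_1' &= M_0 \\
\mu_2 &= 2M_1 - M_0^2 \\
\mu_3 &= 3M_2 - 6M_1M_0 + 2M_0^3 \\
\mu_4 &= 4M_3 - 12M_2M_0 + 12M_1M_0^2 - 3M_0^4
\end{aligned} \tag{28}$$
(see, e.g., Stuart and Ord [10]). The cumulative moments $M_k$ for SRGTL
distributions follow from (9) and (27) to be
$$\begin{aligned}
M_k &= \int_0^1 x^k(1-x)^{\beta}\{\alpha - (\alpha-1)(1-x)\}^{\beta}\,dx \\
&= \sum_{i=0}^{k}\binom{k}{i}(-1)^i\alpha^{\beta}\int_0^1 (1-x)^{\beta+i}\left\{1 - \frac{\alpha-1}{\alpha}(1-x)\right\}^{\beta}dx.
\end{aligned} \tag{29}$$
For $\alpha = 1$, expression (29) simplifies to that of the cumulative moments of
an SRP distribution (cf. (10) with $\alpha = 1$). For $\alpha \in (1,2]$, the cumulative
moments can be expressed utilizing the incomplete Beta function
$$B(x \mid a,b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^x p^{a-1}(1-p)^{b-1}\,dp \tag{30}$$
as
$$M_k = \sum_{i=0}^{k}\binom{k}{i}(-1)^i\alpha^{\beta}\left(\frac{\alpha}{\alpha-1}\right)^{\beta+i+1}\frac{B\left(\frac{\alpha-1}{\alpha}\,\middle|\,\beta+i+1,\beta+1\right)}{B^{-1}(\beta+i+1,\beta+1)} \tag{31}$$
with $B^{-1}(a,b)$ denoting the normalizing constant $\Gamma(a+b)/\{\Gamma(a)\Gamma(b)\}$ of (30). In particular,
$$\begin{aligned}
M_0 &= \alpha^{\beta}\left(\frac{\alpha}{\alpha-1}\right)^{\beta+1}\frac{B\left(\frac{\alpha-1}{\alpha}\,\middle|\,\beta+1,\beta+1\right)}{B^{-1}(\beta+1,\beta+1)} \\
M_1 &= M_0 - \alpha^{\beta}\left(\frac{\alpha}{\alpha-1}\right)^{\beta+2}\frac{B\left(\frac{\alpha-1}{\alpha}\,\middle|\,\beta+2,\beta+1\right)}{B^{-1}(\beta+2,\beta+1)} \\
M_2 &= -M_0 + 2M_1 + \alpha^{\beta}\left(\frac{\alpha}{\alpha-1}\right)^{\beta+3}\frac{B\left(\frac{\alpha-1}{\alpha}\,\middle|\,\beta+3,\beta+1\right)}{B^{-1}(\beta+3,\beta+1)} \\
M_3 &= M_0 - 3M_1 + 3M_2 - \alpha^{\beta}\left(\frac{\alpha}{\alpha-1}\right)^{\beta+4}\frac{B\left(\frac{\alpha-1}{\alpha}\,\middle|\,\beta+4,\beta+1\right)}{B^{-1}(\beta+4,\beta+1)}
\end{aligned} \tag{32}$$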
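The relations between cumulative and central moments can be checked without the incomplete Beta function. The sketch below swaps in plain numerical integration (composite Simpson's rule) for the closed-form expressions; the function names and the number of panels are illustrative assumptions. It evaluates (27) for the standard RGTL survival function and then applies (28).

```python
def srgtl_survival(x, alpha, beta):
    """1 - F(x) for the standard RGTL distribution on [0, 1]."""
    y = 1.0 - x
    return (y * (alpha - (alpha - 1.0) * y)) ** beta


def cumulative_moment(k, alpha, beta, panels=2000):
    """Composite Simpson approximation of M_k in (27); panels must be even."""
    h = 1.0 / panels
    total = 0.0
    for j in range(panels + 1):
        x = j * h
        if j == 0 or j == panels:
            weight = 1.0
        elif j % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * x ** k * srgtl_survival(x, alpha, beta)
    return total * h / 3.0


def central_moments(alpha, beta):
    """Mean and central moments mu_2, mu_3, mu_4 via the relations (28)."""
    m0, m1, m2, m3 = (cumulative_moment(k, alpha, beta) for k in range(4))
    mean = m0
    mu2 = 2.0 * m1 - m0 ** 2
    mu3 = 3.0 * m2 - 6.0 * m1 * m0 + 2.0 * m0 ** 3
    mu4 = 4.0 * m3 - 12.0 * m2 * m0 + 12.0 * m1 * m0 ** 2 - 3.0 * m0 ** 4
    return mean, mu2, mu3, mu4


# alpha = beta = 1 is the uniform distribution on [0, 1]:
# mean 1/2, variance 1/12, third central moment 0, fourth central moment 1/80.
uniform_moments = central_moments(1.0, 1.0)
```

The uniform member serves as a sanity check because its moments are known exactly; for other parameter values the closed forms (31) and (32) remain the reference.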
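$$\frac{\alpha - \sqrt{\alpha^2 - 4(\alpha-1)\sqrt[\beta]{1-z}}}{2(\alpha-1)}$$
can yield $\{1 - F^{-1}(z \mid \alpha,\beta)\} \in [0,1]$. Analogously, it follows that for $0 < \alpha < 1$
only the solution
$$\frac{-\alpha + \sqrt{\alpha^2 + 4(1-\alpha)\sqrt[\beta]{1-z}}}{2(1-\alpha)}$$
can result in $\{1 - F^{-1}(z \mid \alpha,\beta)\} \in [0,1]$. Hence, we have
$$F^{-1}(z \mid \alpha,\beta) = \begin{cases}
1 - \dfrac{\alpha - \sqrt{\alpha^2 - 4(\alpha-1)\sqrt[\beta]{1-z}}}{2(\alpha-1)}, & 1 < \alpha \le 2 \\[2ex]
1 - \sqrt[\beta]{1-z}, & \alpha = 1 \\[1ex]
1 - \dfrac{-\alpha + \sqrt{\alpha^2 + 4(1-\alpha)\sqrt[\beta]{1-z}}}{2(1-\alpha)}, & 0 < \alpha < 1
\end{cases} \tag{34}$$
where the case $\alpha = 1$ follows from the cdf of a standard reflected power (SRP)
distribution ($\alpha = 1$ in (9)).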
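The quantile function can be exercised with a round trip through the cdf. The sketch below is an assumption-laden illustration, not the authors' code: the function names are hypothetical, and for $\alpha \ne 1$ it solves the quadratic $(\alpha-1)y^2 - \alpha y + (1-z)^{1/\beta} = 0$ in $y = 1 - x$, always taking the root that lies in $[0,1]$, which can be written in one expression for both $\alpha > 1$ and $\alpha < 1$.

```python
import math


def srgtl_cdf(x, alpha, beta):
    """Standard RGTL cdf on [0, 1]."""
    y = 1.0 - x
    return 1.0 - (y * (alpha - (alpha - 1.0) * y)) ** beta


def srgtl_quantile(z, alpha, beta):
    """Inverse cdf for 0 <= z <= 1, 0 <= alpha <= 2, beta > 0."""
    w = (1.0 - z) ** (1.0 / beta)
    if alpha == 1.0:
        return 1.0 - w                      # standard reflected power case
    # The discriminant is non-negative in exact arithmetic; max() guards rounding.
    disc = max(alpha * alpha - 4.0 * (alpha - 1.0) * w, 0.0)
    y = (alpha - math.sqrt(disc)) / (2.0 * (alpha - 1.0))
    return 1.0 - y


# Round trip: F(F^{-1}(z)) should return z for each parameter case.
round_trip_errors = [
    abs(srgtl_cdf(srgtl_quantile(z, alpha, 2.5), alpha, 2.5) - z)
    for alpha in (0.5, 1.0, 1.5, 2.0)
    for z in (0.1, 0.5, 0.9)
]
```

Because the inverse is available in closed form, inverse-transform sampling is direct: feeding uniform random numbers on $[0,1]$ into `srgtl_quantile` produces RGTL variates.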
$$L(\alpha,\beta \mid \mathbf{x},\mathbf{n}) = \beta^{N}\prod_{i}\left[\{\alpha y_i - (\alpha-1)y_i^2\}^{\beta-1}\{\alpha - 2(\alpha-1)y_i\}\right]^{n_i} \tag{35}$$
where $N = \sum_i n_i$ and
$$y_i = 1 - x_i. \tag{36}$$
Instead of maximizing $L(\alpha,\beta \mid \mathbf{x},\mathbf{n})$ we may equivalently maximize the
log-likelihood. Taking the logarithm of (35) and calculating the derivative with
respect to $\beta$ we obtain
$$\frac{N}{\beta} + \sum_{i} n_i\,\mathrm{Ln}\{\alpha y_i - (\alpha-1)y_i^2\}. \tag{37}$$
Setting (37) equal to zero, it follows that
$$\hat{\beta} = -\frac{N}{\sum_{i} n_i\,\mathrm{Ln}\{\alpha y_i - (\alpha-1)y_i^2\}} \tag{38}$$
is the unique MLE of $\beta$ given a particular value of $\alpha$. Taking the logarithm of
(35) and calculating the derivative with respect to $\alpha$, one obtains
$$(\beta-1)\sum_{i}\frac{n_i(1-y_i)}{\alpha - (\alpha-1)y_i} + \sum_{i}\frac{n_i(1-2y_i)}{\alpha - 2(\alpha-1)y_i}. \tag{39}$$
Substituting (38) into (39) (utilizing $\hat{\beta}$ instead of $\beta$ and expressing $\hat{\beta}$ in
terms of $\alpha$) the following function $\Psi(\alpha)$ is derived:
$$\Psi(\alpha) = \left(-\frac{N}{\sum_{i} n_i\,\mathrm{Ln}\{\alpha y_i - (\alpha-1)y_i^2\}} - 1\right)\sum_{i}\frac{n_i(1-y_i)}{\alpha - (\alpha-1)y_i} + \sum_{i}\frac{n_i(1-2y_i)}{\alpha - 2(\alpha-1)y_i} \tag{40}$$
where $y_i$ is given by (36) and the function is defined on the bounded range
$0 \le \alpha \le 2$. The MLE $\hat{\alpha}$ follows as one of the roots of the equation $\Psi(\alpha) = 0$ or
as one of the boundary values $\alpha = 0$ or $\alpha = 2$. The bounded domain of
$\Psi(\alpha)$ allows for straightforward plotting of the function in standard spreadsheet
software such as Microsoft Excel and subsequent determination of an
approximate solution of the MLE $\hat{\alpha}$. Using the root finding algorithm
Goalseek, available in Microsoft Excel, and the approximate solution of $\hat{\alpha}$
allows us to calculate $\hat{\alpha}$ up to a desired level of accuracy. Finally, substitution
of $\hat{\alpha}$ into (38) yields the MLE $\hat{\beta}$. The MLE procedure above will be
demonstrated in the next section using U.S. 2001 income data.
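The procedure above can also be sketched outside a spreadsheet. In the illustration below a grid scan followed by bisection stands in for plotting the function and running Goalseek; this is a substitute technique, not the authors' tooling. All function names are hypothetical, observations are assumed to be rescaled to the open interval $(0,1)$ with frequencies $n_i$, and the boundary candidates $\alpha = 0$ and $\alpha = 2$ would still have to be compared separately through the log-likelihood.

```python
import math


def _generating_cdf(alpha, y):
    """The term alpha*y - (alpha-1)*y^2 appearing in (35)."""
    return alpha * y - (alpha - 1.0) * y * y


def beta_hat(alpha, xs, ns):
    """Conditional MLE of beta for a fixed alpha, equation (38)."""
    log_sum = sum(n * math.log(_generating_cdf(alpha, 1.0 - x)) for x, n in zip(xs, ns))
    return -sum(ns) / log_sum


def psi(alpha, xs, ns):
    """Equation (40): the alpha-derivative (39) evaluated at beta_hat(alpha)."""
    b = beta_hat(alpha, xs, ns)
    first = sum(n * x / (alpha - (alpha - 1.0) * (1.0 - x)) for x, n in zip(xs, ns))
    second = sum(
        n * (1.0 - 2.0 * (1.0 - x)) / (alpha - 2.0 * (alpha - 1.0) * (1.0 - x))
        for x, n in zip(xs, ns)
    )
    return (b - 1.0) * first + second


def profile_loglik(alpha, xs, ns):
    """Logarithm of (35) with beta replaced by beta_hat(alpha)."""
    b = beta_hat(alpha, xs, ns)
    total = sum(ns) * math.log(b)
    for x, n in zip(xs, ns):
        y = 1.0 - x
        total += n * ((b - 1.0) * math.log(_generating_cdf(alpha, y))
                      + math.log(alpha - 2.0 * (alpha - 1.0) * y))
    return total


def find_alpha_root(xs, ns, lo=0.05, hi=1.95, steps=190, iterations=100):
    """Scan (lo, hi) for a sign change of psi, then bisect; None if no change."""
    grid = [lo + (hi - lo) * j / steps for j in range(steps + 1)]
    values = [psi(a, xs, ns) for a in grid]
    for left, right, f_left, f_right in zip(grid, grid[1:], values, values[1:]):
        if f_left == 0.0:
            return left
        if f_left * f_right < 0.0:
            for _ in range(iterations):
                mid = 0.5 * (left + right)
                f_mid = psi(mid, xs, ns)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            return 0.5 * (left + right)
    return None


# Hypothetical grouped data: rescaled values in (0, 1) with frequencies.
toy_xs = [0.05, 0.15, 0.3, 0.5, 0.8]
toy_ns = [30, 40, 20, 8, 2]
alpha_candidate = find_alpha_root(toy_xs, toy_ns)
```

Since (38) makes the $\beta$-derivative vanish, $\Psi(\alpha)$ coincides with the derivative of the profile log-likelihood, which gives a simple finite-difference check on any implementation.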
Figure 4 displays a graph of the function $g(\alpha)$ (cf. (40)) for the income
data of Caucasian (Non-Hispanic) Americans presented in Table 1. From
Figure 4 we observe an approximate root of the equation $g(\alpha) = 0$ to be the
value $\alpha^* \approx 1.70$. Since $g(\alpha) > 0$ ($< 0$) for $0 < \alpha < \alpha^*$ ($\alpha^* < \alpha < 2$) it follows
that $\hat{\alpha} = \alpha^*$ is the unique MLE of (35) for $\alpha$. Using Goalseek (a standard root
finding algorithm in Microsoft Excel) with an accuracy of $1 \cdot 10^{-6}$ and utilizing the
approximate solution 1.70, we obtain $\hat{\alpha} = \alpha^* = 1.679$. The unique MLE
$\hat{\beta} = 6.767$ follows from substituting $\hat{\alpha} = 1.679$ in (38). Figure 5 below plots
both the empirical cdf and pdf and their fitted RGTL counterparts (cf. (1) and (2)) with
$a = \$0$, $b = \$250{,}000$, $\hat{\alpha} = 1.679$ and $\hat{\beta} = 6.767$. Differences between the
empirical cdf and fitted cdf can be observed in Figure 5A. The
Kolmogorov-Smirnov statistic D, which is the maximum observed difference between the
empirical and fitted cdfs (see, e.g., DeGroot [13]), in Figure 5A equals 8.60%.
Figure 4. A graph of the function $g(\alpha)$ (cf. (40)) for the income data of Caucasian (Non-Hispanic)
Americans presented in Table 1 (horizontal axis: $\alpha$ from 0 to 2; vertical axis: -1.0E+05 to 1.0E+05)
Table 1. U.S. income distribution for households in year 2001 (Source: U.S. Census Bureau, Current
Population Survey, March 2002. Numbers in thousands, households as of March of the following
year)
Income of Household | Caucasian (Non-Hispanic): Number, Mean Income | African American: Number, Mean Income | Hispanic: Number, Mean Income
Figure 5. Empirical and an MLE fitted RGTL distribution ($\hat{\alpha} = 1.643$ and $\hat{\beta} = 6.179$) of the
Caucasian (Non-Hispanic) income data in Table 1; A: CDF; B: PDF
Table 2. Maximum Likelihood Estimators for the parameters $\alpha$ and $\beta$ of RGTL distributions for
the income data in Table 1 up to $250,000
                            $\hat{\alpha}$    $\hat{\beta}$
Caucasian (Non-Hispanic)    1.679    6.767
Hispanic                    1.685    10.306
African-American            1.613    10.629
Figure 6. Empirical and MLE fitted RGTL pdfs for the income data in Table 1; A: African-American
($\hat{\alpha} = 1.613$ and $\hat{\beta} = 10.629$); B: Hispanic ($\hat{\alpha} = 1.685$ and $\hat{\beta} = 10.306$)
Table 3. Cumulative Moments $M_k$ and Central Moments $\mu_{k+1}$ of the MLE fitted RGTL
distributions for the income data in Table 1 up to $250,000 calculated utilizing (32) and (28),
$k = 1,\ldots,3$
$M_0 = \mu_1'$    $M_1$    $M_2$    $M_3$    $\mu_2$    $\mu_3$    $\mu_4$
Table 4. Statistics associated with the MLE fitted RGTL distributions for the income data in Table 1
up to $250,000
                            Mean      Median    Mode      St. Dev   $\beta_1$   $\beta_2$
Caucasian (Non-Hispanic)    $58393    $52534    $28306    $39326    0.424   2.858
Hispanic                    $44316    $38606    $11851    $31710    0.660   3.248
African-American            $39786    $33599    $0        $30002    0.858   3.522
Table 5. Density values at the lower bound of the MLE fitted RGTL distributions for the income
data in Table 1 up to $250,000
                            $f(0 \mid 0, \$250{,}000, \hat{\alpha}, \hat{\beta})$
Caucasian (Non-Hispanic)    8.68e-6
Hispanic                    1.30e-5
African-American            1.65e-5
Figure 7. Stochastic Dominance Analysis by Ethnicity for the income data in Table 1 utilizing the
MLE fitted RGTL cdfs; A: plotted against the Caucasian (Non-Hispanic) percentile; B: plotted
against the Hispanic percentile
Summarizing, Table 2 and (16) alone imply that the chances of a Caucasian
(Non-Hispanic) or Hispanic American earning more than a specified amount
(anywhere within the range from $0 to $250,000) are higher than those for an
African-American. In addition, the analysis in Figure 7 allows us to conclude
that the chances of a Caucasian (Non-Hispanic) earning more than a specified
amount (anywhere within the range from $0 to $250,000) are higher than those
of a Hispanic. Moreover, Figure 7 and Table 4 demonstrate that although
6. Concluding remarks
We have attempted to construct and investigate a new four-parameter
continuous family of distributions on a bounded domain possessing arbitrary
strictly positive density values at its lower bound. As an illustration, the new
family is applied to fitting the distributions of income of Caucasians
(Non-Hispanic), Hispanics and African-Americans in the U.S.A. in the year 2001
based on U.S. Census Bureau data. The results seem to be quite satisfactory and
allow us to compare the incomes of the above 3 groups in a novel manner, which
seems to be revealing by shedding additional light on features that are not
obvious from a direct examination of the raw data.
Acknowledgments
We are indebted to T.A. Mazzuchi for his helpful comments in the course of
developing this paper and to Dr. David Findley (U.S. Bureau of Census) for
helping us to obtain recent data.
References
1. Topp, C.W. and Leone, F.C. (1955). A family of J-shaped frequency
functions. Journal of the American Statistical Association, 50(269), 209-
219.
2. Nadarajah, S. and Kotz, S. (2003). Moments of some J-shaped distributions.
Journal of Applied Statistics, 30(3), 311-317.
3. Van Dorp, J.R. and Kotz, S. (2002). The standard two sided power
distribution and its properties: With applications in financial engineering.
The American Statistician, 56(2), 90-99.
4. Van Dorp, J.R. and Kotz, S. (2002). A novel extension of the triangular
distribution and its parameter estimation. Journal of Royal Statistical
Society, Series D, The Statistician, 51(1), 63-79.
5. Johnson, N.L. (1949). Systems of frequency curves generated by the
methods of translation. Biometrika, 36, 149-176.
6. Weibull, W. (1939). A statistical distribution of wide applicability. Journal
of Applied Mechanics, 18, 293-297.
7. Van Dorp, J.R. and Kotz, S. (2003). Generalized trapezoidal distributions.
Metrika, 58(1), 85-97.
8. Barsky, R., Bound, J., Kerwin, K.C. and Lupton, J.P. (2002). Accounting
for the Black-White wealth gap: A nonparametric approach. Journal of the
American Statistical Association, 97(459), 663-673.
9. Couch, K. and Daly, M.C. (2000). Black-White inequality in the 1990's: A
decade of progress. Working Papers in Applied Economic Theory, No.
2000-07, Federal Reserve Bank of San Francisco.
10. Stuart, A. and Ord, J.K. (1994). Kendall's Advanced Theory of Statistics
(Vol. 1, Distribution Theory). New York, Wiley.
11. O'Neill, D., Sweetman, O. and Van de Gaer, D. (2002). Estimating
counterfactual densities: An application to Black-White wage differentials
in the U.S., Economics Department Working Paper Series, Department of
Economics, National University of Ireland - Maynooth.
12. Kleiber, C. and Kotz, S. (2003). Statistical Size Distributions in Economics
and Actuarial Sciences. New York, Wiley.
13. DeGroot, M.H. (1991). Probability and Statistics, 3rd ed. Reading, MA:
Addison-Wesley.
Chapter 2
MAKING COPULAS UNDER UNCERTAINTY
C. GARCIA-GARCIA
Department of Quantitative Methods in Economics
University of Granada
Campus de Cartuja s/n. Granada, 18071, Spain
J.M. HERRERIAS-VELASCO
Department of Quantitative Methods in Economics
University of Granada
Campus de Cartuja s/n. Granada, 18071, Spain
J.E. TRINIDAD-SEGOVIA
Department of Business Administration
University of Almeria, Ctra. Sacramento s/n
La Canada de San Urbano, 04120
Almeria, Spain
This paper is based on the MTDF methodology, which consists of obtaining the value of an
asset from the value of a specific index (Ballestero, 1973). The aims of this paper are to
apply this methodology in the case of two indexes under uncertainty, to construct an
FGM copula using TSP marginals given the classical values (a, m, b), and to apply it
to an empirical case. In an uncertainty environment a high correlation exists between
the indexes, which makes it impossible to apply the FGM copula, since it is restricted to
the case of weak correlations. This article overcomes this disadvantage by presenting an
alternative that is later applied in a practical case.
1. Introduction
The method of the two distribution functions has been developed as a
valuation method recommended in uncertainty environments, that is,
when there is no information about the asset that has to be valued and an
expert is consulted, in a way similar to the PERT method.
The present paper is based on the method of the two distribution functions,
also known as the method of the two betas. This method was presented by Ballestero
(1971) and is widely used in valuation. It represents an improvement on the
Synthetic method and was formalized later by its author, Ballestero (1973),
who describes it as follows: the market value of a good follows
statistically the distribution function F. On the other hand, the index, parameter
or explanatory variable follows statistically another distribution function G.
We suppose that the functions F and G are bell-shaped or similar, and
then the method of the two betas establishes a relationship between both variables.
So, it is necessary to adopt the following hypothesis: if the index $L_i$ of an
asset $F_i$ is higher than the index $L_j$ of another asset $F_j$, the market value $V_i$
corresponding to the first asset will also be greater than the market value $V_j$
corresponding to the second one.
From this, if the distribution F of the market value is known as well as the
distribution G of the index, the market value $V_k$ corresponding to an index $L_k$ is
established by means of the transformation:
$$V_k = \phi(L_k) \Leftrightarrow F(V_k) = G(L_k) \tag{1}$$
or,
$$G(i) = P[I \le i] = P[\phi(I) \le \phi(i)] = P[V \le \phi(i)] = F(\phi(i)) \tag{2}$$
$$V_0 = \phi(I_0) = F^{-1}(G(I_0)) \tag{3}$$
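The transformation in (3) can be illustrated with a small sketch. The choice of triangular distributions for both F and G, the function names, and the example values (a, m, b) are assumptions made only for this illustration; any pair of invertible distribution functions could be substituted.

```python
import math


def triangular_cdf(x, a, m, b):
    """cdf of a triangular distribution with a < m < b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - m))


def triangular_quantile(p, a, m, b):
    """Inverse cdf of the same triangular distribution for 0 <= p <= 1."""
    if p <= (m - a) / (b - a):
        return a + math.sqrt(p * (b - a) * (m - a))
    return b - math.sqrt((1.0 - p) * (b - a) * (b - m))


def mtdf_value(index_value, index_params, asset_params):
    """Equation (3): V0 = F^{-1}(G(I0)), with G for the index and F for the asset."""
    return triangular_quantile(triangular_cdf(index_value, *index_params), *asset_params)


# Hypothetical expert values (a, m, b) for the index and for the asset.
index_params = (0.0, 5.0, 10.0)
asset_params = (100.0, 150.0, 200.0)
value_at_index_mode = mtdf_value(5.0, index_params, asset_params)
```

In this symmetric example the most probable index value maps to the most probable asset value, and the mapping is increasing in the index, as the hypothesis of the method requires.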
Figure 1. Probability density function in the MTDF for the assets and the index respectively
Since the presentation of the method of the two betas in Ballestero (1973),
numerous contributions have been published. These contributions have extended
the application of this method and, summarizing, we can distinguish the
following lines:
a) Practical applications of the method of the two distribution functions:
Ballestero and Caballer (1982), Caballer (1994), Caballer (1998), Caballer
(1999) and Ballestero and Rodriguez (1999) extend its use to the valuation
of fruit-bearing trees and real estate. Alonso and Lozano (1985) do an
application to the valuation of properties in the region of Valladolid;
Guadalajara (1996) presents a series of practical cases. Garcia, Trinidad and
Sanchez (1997). Cafias, Domingo and Martinez (1994) realize a practical
application in the province of Cordoba.
b) Extension of the method to different distributions: Romero (1977) does an
extension of the method using uniform and triangular distributions; Garcia,
Cruz and Andujar (1998) present a review of the application in triangular
distributions. Garcia, Trinidad and Gomez (1999) extend the method to the
utilization of a special class of trapezoidal distributions; Herrerias, Garcia,
Cruz and Herrerias, (2001) extend the method to the use of trapezoidal
distributions of any type. Garcia, Trinidad and Garcia (2004) realize an
application using the generalized triangular functions of Van Dorp and Kotz
that can be fitted in an uncertainty environment.
c) Use of two or more indexes, with or without the independence hypothesis,
and implementation of econometric applications. In this line, Garcia,
Cruz and Rosado (2000, 2002) extend the method to the multi-index case
under the hypothesis of independence between the indexes. Herrerias
Velasco (2002), in his doctoral thesis, extends the method of the two
distribution functions exhaustively to the bivariate case and, in
In this work the MTDF is applied to the case of two indexes in an
uncertainty environment. The FGM family is used to construct a joint
distribution function from the given TSP marginals.
2. Initial approach
When one tries to value an asset that depends on one or more indexes and
there is no statistical information, we are said to be in an uncertainty
environment. The usual procedure in these cases is to turn to an expert, who
is asked for the optimistic value, the pessimistic value and the most
probable value of the asset and of at least one reference index (Garcia,
Trinidad and Garcia, 2004). Suppose that we have the PERT values for the
asset and for two reference indexes (see Table 1):
Table 1. PERT values for the assets and two references index
The multi-index case has also been considered in Garcia, Cruz and Rosado
(2000, 2002).
The distribution functions $F(I_1)$ and $F(I_2)$ are obtained from these
estimations using classical methods. Then, once the marginal distribution
functions are known, it is necessary to create a joint distribution function,
$F(I_1, I_2)$. This question has received little attention in the literature.
The first references are Levy (1950) and Frechet (1951). Levy, while looking
for a definition of the distance between two distributions, proved that,
given a distance d(X, Y) between random variables, the minimum of that
distance when the distributions of X and Y are given is itself a distance.
Frechet, building on Levy's paper, began to study the problem of creating a
joint distribution function when the marginal distributions are known. He
proved that, given two marginal cumulative distribution functions $F_1(x)$
and $F_2(y)$, the joint cumulative distribution function lies between
W(x,y) and M(x,y):
$W(x,y) \le F(x,y) \le M(x,y)$ (4)
The lower and upper limits of this inequality are usually called Frechet's
bounds, and they are themselves distribution functions, with the following
expressions:
$W(x,y) = \max[F_1(x) + F_2(y) - 1,\ 0]$
$M(x,y) = \min[F_1(x),\ F_2(y)]$ (5)
some functions, the connection with linear programming. The introduction of
copulas by Sklar in 1961, and his later paper with Schweizer, were very
important because they opened a new line of research.
where:
$F(X_1, X_2)$ is the joint cumulative distribution function of $X_1$ and $X_2$.
$F(X_1)$ and $F(X_2)$ are the marginal cumulative distribution functions.
The expression for the probability density function is given by:
$f(X_1, X_2) = f_1(X_1) f_2(X_2)[1 + \alpha(1 - 2F_1(X_1))(1 - 2F_2(X_2))]$ (8)
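The FGM construction can be sketched on uniform marginals, writing u = F1(X1) and v = F2(X2). The density below is expression (8) without the marginal densities; the distribution function uses the standard FGM form u v [1 + α(1 - u)(1 - v)], which is stated here as an assumption. The helper names and the grid check are illustrative.

```python
def fgm_cdf(u, v, alpha):
    """Standard FGM copula C(u, v) = u v [1 + alpha (1 - u)(1 - v)]."""
    return u * v * (1.0 + alpha * (1.0 - u) * (1.0 - v))


def fgm_pdf(u, v, alpha):
    """FGM copula density c(u, v) = 1 + alpha (1 - 2u)(1 - 2v)."""
    return 1.0 + alpha * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def within_frechet_bounds(alpha, steps=20, eps=1e-12):
    """Check W(u, v) <= C(u, v) <= M(u, v) on a regular grid of [0, 1]^2."""
    grid = [k / steps for k in range(steps + 1)]
    for u in grid:
        for v in grid:
            lower = max(u + v - 1.0, 0.0)   # Frechet lower bound W
            upper = min(u, v)               # Frechet upper bound M
            c = fgm_cdf(u, v, alpha)
            if c < lower - eps or c > upper + eps:
                return False
    return True


checks = {alpha: within_frechet_bounds(alpha) for alpha in (-1.0, 0.0, 0.9, 1.0)}
```

For |α| ≤ 1 the density stays non-negative on the unit square, which is why the parameter is restricted to that interval.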
Regarding the correlation between X and Y, it is easy to prove that:
$E(y \mid x) = E(y) + \alpha J_2 \{2F_x(x) - 1\}$ (9)
+00
Standardizing the random variable x, that is, making the change of variable:
$t = \frac{x - a}{b - a}$
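$F(t \mid M, n) = \begin{cases} M\left(\frac{t}{M}\right)^{n}, & 0 \le t \le M \\ 1 - (1 - M)\left(\frac{1 - t}{1 - M}\right)^{n}, & M \le t \le 1 \end{cases}$ (12)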
where:
$E(t) = \frac{(n - 1)M + 1}{n + 1}$ (13)
and
$\mathrm{var}(t) = \frac{n - 2(n - 1)M(1 - M)}{(n + 2)(n + 1)^2}$ (14)
Here a is the pessimistic value, m the most probable value and b the
optimistic value, all of them provided by the expert. The parameter n has a
more complex interpretation, because it is not known exactly what it means,
nor what question must be put to the expert to obtain it.
However, we can state that n satisfies the following properties:
1. n > 0
2. For n = 1, the STSP distribution degenerates into a uniform
distribution.
3. For n = 2, the TSP distribution becomes a triangular distribution
with parameters a, m and b.
4. Finally, for a = 0 and m = b = 1, f(x|a,m,b,n) is a power function, and
for a = m = 0 and b = 1 we obtain its reflection.
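The standardized distribution function (12), its moments (13) and (14), and the first properties in the list above can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the density is the derivative of (12), which is assumed rather than quoted from the chapter.

```python
def stsp_cdf(t, M, n):
    """Distribution function (12) of the standardized TSP, with 0 < M < 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    if t <= M:
        return M * (t / M) ** n
    return 1.0 - (1.0 - M) * ((1.0 - t) / (1.0 - M)) ** n


def stsp_pdf(t, M, n):
    """Density obtained by differentiating (12) on each branch."""
    if t < 0.0 or t > 1.0:
        return 0.0
    if t <= M:
        return n * (t / M) ** (n - 1)
    return n * ((1.0 - t) / (1.0 - M)) ** (n - 1)


def stsp_mean(M, n):
    """Expected value (13)."""
    return ((n - 1.0) * M + 1.0) / (n + 1.0)


def stsp_var(M, n):
    """Variance (14)."""
    return (n - 2.0 * (n - 1.0) * M * (1.0 - M)) / ((n + 2.0) * (n + 1.0) ** 2)


# Property 2: n = 1 gives the uniform distribution on [0, 1].
uniform_like = stsp_cdf(0.4, 0.3, 1)
# Property 3: n = 2 gives the triangular distribution with mode M,
# whose distribution function is t^2 / M for t <= M.
triangular_like = stsp_cdf(0.15, 0.3, 2)
```

Here `uniform_like` equals 0.4 and `triangular_like` equals 0.15² / 0.3 = 0.075, in agreement with properties 2 and 3.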
In spite of this, Van Dorp and Kotz point out the intuitive meaning of n,
since the expected value of x takes the following expression:
$E(x) = \frac{a + (n - 1)m + b}{n + 1}$ (15)
So n - 1 is the coefficient that weights the mode to obtain the expected
value of the random variable, assuming that the extremes a and b have
weight 1. In our opinion this property places the STSP distribution in the
field of PERT. From the three usual classical values a, m and b, whose
meaning is known, it is impossible to determine a unique STSP distribution,
since it is a four-parameter distribution with parameters a, m, b and n.
Therefore, it is necessary to restrict the choice of the STSP distribution
to one of its subfamilies (see Garcia, Cruz and Garcia, 2004.b, 2005).
We define, first of all, the constant variance family as the set formed
by the STSP distributions with the same variance as the normal distribution
of classic PERT. When working with standardized random variables, the
following equation holds:
$n^3 + 4n^2 + (-72M^2 + 72M - 31)n + (72M^2 - 72M + 2) = 0$ (16)
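Equation (16) follows from setting the variance (14) equal to 1/36, and it can be solved for n once M is fixed. The sketch below uses plain bisection on the interval [1, 10]; the interval and the function names are assumptions made for illustration. The left-hand side equals -24 at n = 1 for every M and is positive at n = 10, so a root with n > 1 is bracketed.

```python
def stsp_var(M, n):
    """Variance (14) of the standardized TSP distribution."""
    return (n - 2.0 * (n - 1.0) * M * (1.0 - M)) / ((n + 2.0) * (n + 1.0) ** 2)


def cubic_16(n, M):
    """Left-hand side of equation (16)."""
    return (n ** 3 + 4.0 * n ** 2
            + (-72.0 * M ** 2 + 72.0 * M - 31.0) * n
            + (72.0 * M ** 2 - 72.0 * M + 2.0))


def constant_variance_n(M, lo=1.0, hi=10.0, tol=1e-12):
    """Root n > 1 of (16) by bisection; cubic_16 is negative at lo, positive at hi."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cubic_16(mid, M) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


n_sym = constant_variance_n(0.5)          # symmetric case, M = 0.5
n_classic = constant_variance_n(0.747133)  # M of the classic family quoted in (19)
```

For M = 0.747133 the root agrees with the value n = 3.02344... reported in (19), and for any M the resulting variance is 1/36.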
$a(M) = 2M^4 - 4M^3 + 6M^2 - 4M + 1$
$b(M) = -14M^4 + 28M^3 - 22M^2 + 8M - 1$
$c(M) = -2M^4 + 4M^3 - 22M^2 + 20M - 6$ (18)
$d(M) = 62M^4 - 124M^3 + 94M^2 - 32M + 2$
$e(M) = -48M^4 + 96M^3 - 56M^2 + 8M$
It can be proved that for every $M \in (0,1)$, equation (18) has a unique
solution with n > 1, so a mesokurtic STSP distribution always exists. This
improves on the result obtained for mesokurtic beta distributions, for which
no solution exists when $M \in (0.2763933\ldots;\ 0.7236067\ldots)$. Solving
the system formed by equations (17) and (18), the unique solutions for
n > 1 are:
$M = 0.747133\ldots,\ n = 3.02344\ldots$
$M = 0.252867\ldots,\ n = 3.02344\ldots$ (19)
These solutions correspond to the STSP distributions that satisfy
simultaneously the conditions $\sigma^2 = 1/36$ and $\beta_2 = 3$; they are
called the classic family.
To conclude, in PERT the family of STSP distributions always improves on
the Beta families, because:
1. It is always possible to select a mesokurtic STSP distribution, while
for the Beta distribution it is not.
2. Just as there is a classic Beta distribution, there exists an STSP
distribution with n = 3.0234.
3. The STSP distribution is more moderate in mean and more
conservative in variance for every value of M. See Garcia, Cruz and
Garcia (2005).
This can be explained by the behaviour of the kurtosis coefficient. If we
compare the STSP family, depending on n and M, with the beta family,
depending on k = n - 1 and M, the first allows us to select a distribution
with higher kurtosis than the second. It can be proved that, for symmetric
distributions, the kurtosis coefficient of the beta distribution is lower
than that of the normal distribution (3), whereas in the STSP family we can
find values with higher or lower kurtosis than the normal one (see
Figure 2). In conclusion, this distribution could be an alternative to the
normal distribution and others when we want to fit distributions with higher
kurtosis (see Herrerias, Callejon, Perez and Herrerias, 2001).
[Figure 2. Kurtosis coefficient; legend: Beta, STSP.]
f o^
M•i X2 = M2 (20)
1 1
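$F(X_1) = \begin{cases} M_1\left(\frac{X_1}{M_1}\right)^{n_1}, & 0 \le X_1 \le M_1 \\ 1 - (1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}, & M_1 \le X_1 \le 1 \end{cases}$ (21)
$F(X_2) = \begin{cases} M_2\left(\frac{X_2}{M_2}\right)^{n_2}, & 0 \le X_2 \le M_2 \\ 1 - (1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}, & M_2 \le X_2 \le 1 \end{cases}$ (22)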
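$F(X_1, X_2) = \begin{cases}
M_1\left(\frac{X_1}{M_1}\right)^{n_1} M_2\left(\frac{X_2}{M_2}\right)^{n_2}\left[1 + \alpha\left(1 - M_1\left(\frac{X_1}{M_1}\right)^{n_1}\right)\left(1 - M_2\left(\frac{X_2}{M_2}\right)^{n_2}\right)\right], & 0 \le X_1 \le M_1;\ 0 \le X_2 \le M_2 \\
\left[1 - (1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}\right] M_2\left(\frac{X_2}{M_2}\right)^{n_2}\left[1 + \alpha(1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}\left(1 - M_2\left(\frac{X_2}{M_2}\right)^{n_2}\right)\right], & M_1 \le X_1 \le 1;\ 0 \le X_2 \le M_2 \\
M_1\left(\frac{X_1}{M_1}\right)^{n_1}\left[1 - (1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right]\left[1 + \alpha\left(1 - M_1\left(\frac{X_1}{M_1}\right)^{n_1}\right)(1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right], & 0 \le X_1 \le M_1;\ M_2 \le X_2 \le 1 \\
\left[1 - (1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}\right]\left[1 - (1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right]\left[1 + \alpha(1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}(1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right], & M_1 \le X_1 \le 1;\ M_2 \le X_2 \le 1
\end{cases}$ (23)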
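$f(X_1, X_2) = \begin{cases}
n_1 n_2\left(\frac{X_1}{M_1}\right)^{n_1 - 1}\left(\frac{X_2}{M_2}\right)^{n_2 - 1}\left[1 + \alpha\left(1 - 2M_1\left(\frac{X_1}{M_1}\right)^{n_1}\right)\left(1 - 2M_2\left(\frac{X_2}{M_2}\right)^{n_2}\right)\right], & 0 \le X_1 \le M_1;\ 0 \le X_2 \le M_2 \\
n_1 n_2\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1 - 1}\left(\frac{X_2}{M_2}\right)^{n_2 - 1}\left[1 + \alpha\left(-1 + 2(1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}\right)\left(1 - 2M_2\left(\frac{X_2}{M_2}\right)^{n_2}\right)\right], & M_1 \le X_1 \le 1;\ 0 \le X_2 \le M_2 \\
n_1 n_2\left(\frac{X_1}{M_1}\right)^{n_1 - 1}\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2 - 1}\left[1 + \alpha\left(1 - 2M_1\left(\frac{X_1}{M_1}\right)^{n_1}\right)\left(-1 + 2(1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right)\right], & 0 \le X_1 \le M_1;\ M_2 \le X_2 \le 1 \\
n_1 n_2\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1 - 1}\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2 - 1}\left[1 + \alpha\left(-1 + 2(1 - M_1)\left(\frac{1 - X_1}{1 - M_1}\right)^{n_1}\right)\left(-1 + 2(1 - M_2)\left(\frac{1 - X_2}{1 - M_2}\right)^{n_2}\right)\right], & M_1 \le X_1 \le 1;\ M_2 \le X_2 \le 1
\end{cases}$ (24)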
Figure 3 represents the joint distribution function given two marginal STSP, by
means of the FGM distribution function.
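A compact way to evaluate the joint distribution and density functions is to compute the two STSP marginals first and then apply the FGM formulas, instead of writing out the four regions. The sketch below does this for M1 = 0.8, M2 = 0.6 and α = 0.9, the values used in the figures; the exponents n1 = n2 = 3 and the function names are assumptions made for illustration.

```python
def stsp_cdf(t, M, n):
    """Standardized TSP distribution function, 0 < M < 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    if t <= M:
        return M * (t / M) ** n
    return 1.0 - (1.0 - M) * ((1.0 - t) / (1.0 - M)) ** n


def stsp_pdf(t, M, n):
    """Standardized TSP density (derivative of the distribution function)."""
    if t < 0.0 or t > 1.0:
        return 0.0
    if t <= M:
        return n * (t / M) ** (n - 1)
    return n * ((1.0 - t) / (1.0 - M)) ** (n - 1)


def joint_cdf(x1, x2, M1, n1, M2, n2, alpha):
    """FGM joint distribution function with STSP marginals."""
    F1, F2 = stsp_cdf(x1, M1, n1), stsp_cdf(x2, M2, n2)
    return F1 * F2 * (1.0 + alpha * (1.0 - F1) * (1.0 - F2))


def joint_pdf(x1, x2, M1, n1, M2, n2, alpha):
    """FGM joint density with STSP marginals, as in expression (8)."""
    F1, F2 = stsp_cdf(x1, M1, n1), stsp_cdf(x2, M2, n2)
    f1, f2 = stsp_pdf(x1, M1, n1), stsp_pdf(x2, M2, n2)
    return f1 * f2 * (1.0 + alpha * (1.0 - 2.0 * F1) * (1.0 - 2.0 * F2))


params = dict(M1=0.8, n1=3, M2=0.6, n2=3, alpha=0.9)  # n1, n2 are hypothetical
f_at_modes = joint_pdf(0.8, 0.6, **params)
F_at_modes = joint_cdf(0.8, 0.6, **params)
```

Setting one argument to 1 recovers the other marginal, which is a quick sanity check that the construction preserves the given marginals.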
Figure 3. FGM joint distribution function with TSP marginals, M1 = 0.8; M2 = 0.6; α = 0.9
Figure 4. FGM joint density function with TSP marginals, M1 = 0.8; M2 = 0.6; α = 0.9
Figure 4 presents the FGM joint density function given the Van Dorp
marginals. To produce these representations, the value of α must first be
found. Recall that the parameter α belongs to the interval [-1,1], and that
α = -1 and α = 1 represent the maximum degrees of negative and positive
dependence, respectively, allowed in the FGM family. The parameter α is
associated with the measures of dependence, which is why these are used to
calculate it.
The basic measure of linear dependence between two variables $X_1$ and
$X_2$ is the covariance:
$\mathrm{cov}(X_1, X_2) = E[(X_1 - E(X_1))(X_2 - E(X_2))]$ (25)
To make this measure independent of the units in which the variables are
expressed, the covariance is divided by the product of the standard
deviations. This is the widely known correlation coefficient:
This coefficient has been the basic measure of linear dependence for more
than 100 years. Many other measures were proposed during the 20th century
to quantify positive or negative dependence, for example Spearman's
coefficient, Kendall's tau, Blomqvist's q coefficient and Hoeffding's Δ.
Specifically, the correlation coefficient has been obtained for different
families of FGM distributions introduced in the literature: it takes the
values α/4, α/π and 0.281α for exponential, normal and Laplace marginals,
respectively. See Table 2.
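For STSP marginals the same kind of relation between the correlation coefficient and α can be sketched numerically. The snippet below does not reproduce the chapter's own derivation; it assumes Hoeffding's covariance identity, under which the FGM covariance is α times the product of the integrals of F(1 - F) over each marginal, and evaluates that integral in closed form for the standardized TSP. All function names are hypothetical.

```python
import math


def stsp_var(M, n):
    """Variance (14) of the standardized TSP distribution."""
    return (n - 2.0 * (n - 1.0) * M * (1.0 - M)) / ((n + 2.0) * (n + 1.0) ** 2)


def stsp_overlap(M, n):
    """Closed form of the integral of F(t)(1 - F(t)) over [0, 1] for the STSP."""
    return ((M ** 2 + (1.0 - M) ** 2) / (n + 1.0)
            - (M ** 3 + (1.0 - M) ** 3) / (2.0 * n + 1.0))


def fgm_corr(alpha, M1, n1, M2, n2):
    """Correlation of an FGM pair with STSP marginals (Hoeffding identity assumed)."""
    cov = alpha * stsp_overlap(M1, n1) * stsp_overlap(M2, n2)
    return cov / math.sqrt(stsp_var(M1, n1) * stsp_var(M2, n2))


def alpha_from_corr(rho, M1, n1, M2, n2):
    """Solve the linear relation rho = alpha * k for alpha."""
    return rho / fgm_corr(1.0, M1, n1, M2, n2)


rho_uniform = fgm_corr(1.0, 0.3, 1, 0.7, 1)    # n = 1: uniform marginals
rho_example = fgm_corr(0.9, 0.8, 3, 0.6, 3)    # n1 = n2 = 3 are hypothetical
```

With uniform marginals (n = 1) the sketch returns α/3, the largest correlation the FGM family can reach, so a target correlation outside (-1/3, 1/3) would require |α| > 1.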
l
corr(Xx, X2 ) = a\\ (28)
where:
i
Finally, the substitution of expression (13) into (29) and using (28) and (15),
we obtain the following expression for the correlation coefficient:
"^-(^-QM.-d-M,.)
r{xi,X2,Mi,M2,nl,n2)=aln] +
(30)
(l-M,) 2n, +1
A=f\«,-2(»,.-l)M,.
Thus, once the correlation coefficient is known, we can find the parameter α
by solving the previous expression for it. To find the correlation
coefficient, we pose the relation between index 1 and index 2 as a basic
regression problem. One of the indexes is the explanatory variable (X) and
the other one the explained variable (Y):
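$Y = \beta_0 + \beta_1 X + u$
$\hat{\beta} = (X'X)^{-1} X'y$
where:
$(X'X)^{-1} = \frac{1}{2(M_2^2 - M_2 + 1)} \begin{pmatrix} M_2^2 + 1 & -(M_2 + 1) \\ -(M_2 + 1) & 3 \end{pmatrix}$
$X'y = \begin{pmatrix} 1 & 1 & 1 \\ 0 & M_2 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ M_1 \\ 1 \end{pmatrix} = \begin{pmatrix} M_1 + 1 \\ M_2 M_1 + 1 \end{pmatrix}$
Solving we obtain:
$\hat{\beta}_0 = \frac{(M_2^2 + 1)(M_1 + 1) - (M_2 + 1)(M_2 M_1 + 1)}{2(M_2^2 - M_2 + 1)} = \frac{M_2^2 + M_1 - M_2 - M_2 M_1}{2(M_2^2 - M_2 + 1)}$
$\hat{\beta}_1 = \frac{-(M_2 + 1)(M_1 + 1) + 3(M_2 M_1 + 1)}{2(M_2^2 - M_2 + 1)} = \frac{2M_2 M_1 - M_2 - M_1 + 2}{2(M_2^2 - M_2 + 1)}$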
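The three-point regression can be checked numerically. The sketch below fits Y on X by ordinary least squares using the standardized observations (0, 0), (M2, M1) and (1, 1), and compares the result with the closed-form coefficients that follow from the normal equations. The standardized modes M1 = 0.3 and M2 = 0.375 are those of the two indexes in the practical case; the function names are illustrative.

```python
import math


def ols_three_points(M1, M2):
    """OLS fit of Y on X for the points (0, 0), (M2, M1), (1, 1)."""
    xs = [0.0, M2, 1.0]
    ys = [0.0, M1, 1.0]
    x_bar = sum(xs) / 3.0
    y_bar = sum(ys) / 3.0
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    r = sxy / math.sqrt(sxx * syy)   # sample correlation coefficient
    return b0, b1, r


def closed_form(M1, M2):
    """Coefficients from (X'X)^{-1} X'y written out explicitly."""
    det = 2.0 * (M2 ** 2 - M2 + 1.0)
    b0 = (M2 ** 2 + M1 - M2 - M2 * M1) / det
    b1 = (2.0 * M2 * M1 - M2 - M1 + 2.0) / det
    return b0, b1


b0, b1, r = ols_three_points(0.3, 0.375)
b0_cf, b1_cf = closed_form(0.3, 0.375)
```

With only three observations, two of which are the common extremes (0, 0) and (1, 1), the fitted correlation is very close to 1 for these modes.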
The expression for the variances and the covariance can be obtained from
Table 3:
Table 3. Calculations
Y     X     y - ȳ             x - x̄             (y - ȳ)(x - x̄)
0     0     -(M1+1)/3         -(M2+1)/3         (M1+1)(M2+1)/9
M1    M2    M1 - (M1+1)/3     M2 - (M2+1)/3     (2M1-1)(2M2-1)/9
1     1     1 - (M1+1)/3      1 - (M2+1)/3      (2-M1)(2-M2)/9
Figure 5. Representation of the correlation coefficient between the index 1 and the index 2 under
uncertainty
6. A solution
Under uncertainty we have only very limited information, and as a result
the measures of correlation indicate a high correlation between the
variables. Since the parameter α is related to the measures of correlation,
the strong correlation found under uncertainty implies values of α outside
the interval (-1,1). This makes it impossible to apply the FGM family in
these cases.
The basic problem is the absence of observations for each index. However,
if we interpret the parameter n as the number of times that the mode is
observed, so that n1 is the number of times that the mode of index 1 has
been observed and n2 the number of times that the mode of index 2 has been
observed, we would have a total number of observations of:
In this way we have gone from three observations for each index to
n1 + 2 for the first index and n2 + 2 for the second (see Figure 6). The
intention is to avoid high values of the correlation coefficient that arise
only from the absence of information. Nevertheless, the result is that the
correlation coefficient is null for every value of n1 and n2. It is
therefore proposed to omit some of the observations, and it seems logical to
eliminate those in which index 1 takes the optimistic value whereas
index 2 takes the pessimistic value, and vice versa, since these are extreme
cases that would not be possible if the indexes are correlated. The number
of observations is then:
$n_1 n_2 + 2n_1 + 2n_2 + 2$ (34)
Then,
$\mathrm{var}(I_1) = \frac{M_1^2\left[2n_1 n_2^2 + 6n_1 n_2 + 4n_1\right] - M_1\left[2(n_2 + 1)(n_1 n_2 + 2n_1)\right] + n_2^2 + n_1 n_2^2 + 3n_1 n_2 + 2n_1 + 2n_2 + 1}{(n_1 n_2 + 2n_1 + 2n_2 + 2)^2}$ (35)
We have increased the number of observations with respect to the initial
problem, and we obtained a null value of the correlation coefficient. Hence,
when the correlation coefficient takes a value in the range (-1/3, 1/3), we
will be able to apply the FGM distribution family.
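The effect of the proposed solution can be illustrated by building the observations explicitly. The sketch below repeats each mode n times, pairs every observation of index 1 with every observation of index 2, and then optionally drops the two extreme pairs (optimistic, pessimistic) and (pessimistic, optimistic). The values n1 = n2 = 3 and the function names are assumptions; the variance formula is expression (35) written as a function.

```python
import math
from itertools import product


def observations(M1, n1, M2, n2, drop_extremes):
    """Standardized pairs (index 1, index 2); each mode is repeated n times."""
    xs = [0.0] + [M1] * n1 + [1.0]
    ys = [0.0] + [M2] * n2 + [1.0]
    pairs = list(product(xs, ys))
    if drop_extremes:
        pairs.remove((1.0, 0.0))   # index 1 optimistic, index 2 pessimistic
        pairs.remove((0.0, 1.0))   # index 1 pessimistic, index 2 optimistic
    return pairs


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def variance_eq35(M1, n1, n2):
    """Variance of index 1 over the reduced set of observations, as in (35)."""
    num = (M1 ** 2 * (2 * n1 * n2 ** 2 + 6 * n1 * n2 + 4 * n1)
           - M1 * (2 * (n2 + 1) * (n1 * n2 + 2 * n1))
           + n2 ** 2 + n1 * n2 ** 2 + 3 * n1 * n2 + 2 * n1 + 2 * n2 + 1)
    return num / (n1 * n2 + 2 * n1 + 2 * n2 + 2) ** 2


full = observations(0.3, 3, 0.375, 3, drop_extremes=False)
reduced = observations(0.3, 3, 0.375, 3, drop_extremes=True)
rho_full = pearson(full)        # null up to rounding: every pair is present
rho_reduced = pearson(reduced)  # positive once the two extreme pairs are dropped
```

The full grid is a product design, so its correlation is zero by construction; removing the two off-diagonal extremes leaves n1 n2 + 2 n1 + 2 n2 + 2 pairs, as in (34), and a positive correlation.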
Figure 6. Graph of the proposed solution in which the parameter n is consider as the number of times
that the mode of the index is observed
$\rho = \frac{\mathrm{Cov}(I_1, I_2)}{\sqrt{\mathrm{Var}(I_1) \cdot \mathrm{Var}(I_2)}}$
Figure 8. Detail of the representation of the correlation coefficient using the proposed solution
7. A valuation method
Up to this point we have developed the procedure to obtain a joint
distribution function when the TSP marginals are known, both under
uncertainty and under risk, and an expression for it has been obtained. In
addition, a mathematical formulation of this procedure has been given.
The next step is to present the valuation procedure: given values
(x0, y0) of indexes 1 and 2, respectively, calculate the value of the asset
for those values.
First, the value F0 is calculated for the values (x0, y0), that is:
Then two cases arise, depending on whether or not F0 is greater than the
standardized mode of the asset (M); the final value of the asset depends on
this. These cases are defined in expressions (38) and (39):
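Branch 1: If $F_0 \le M$ then:
$M\left(\frac{V}{M}\right)^{n} = F_0 \Rightarrow V = M\left(\frac{F_0}{M}\right)^{1/n}$ (38)
Branch 2: If $F_0 > M$ then:
$1 - (1 - M)\left(\frac{1 - V}{1 - M}\right)^{n} = F_0 \Rightarrow V = 1 - (1 - M)\left(\frac{1 - F_0}{1 - M}\right)^{1/n}$ (39)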
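The two branches are the inverse of the standardized TSP distribution function, so the whole valuation step can be sketched as: standardize the index values, evaluate the FGM joint distribution function to obtain F0, invert the asset's distribution function, and return to the original scale. In the sketch below the PERT values are those of the practical case, while the exponents n, the value of α and the function names are hypothetical.

```python
def stsp_cdf(t, M, n):
    """Standardized TSP distribution function, 0 < M < 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    if t <= M:
        return M * (t / M) ** n
    return 1.0 - (1.0 - M) * ((1.0 - t) / (1.0 - M)) ** n


def stsp_ppf(p, M, n):
    """Inverse distribution function: branch (38) if p <= M, branch (39) otherwise."""
    if p <= M:
        return M * (p / M) ** (1.0 / n)
    return 1.0 - (1.0 - M) * ((1.0 - p) / (1.0 - M)) ** (1.0 / n)


def value_asset(x0, y0, index1, index2, asset, n1, n2, n_asset, alpha):
    """Asset value for index values (x0, y0); each triple is (a, m, b)."""
    a1, m1, b1 = index1
    a2, m2, b2 = index2
    a, m, b = asset
    F1 = stsp_cdf((x0 - a1) / (b1 - a1), (m1 - a1) / (b1 - a1), n1)
    F2 = stsp_cdf((y0 - a2) / (b2 - a2), (m2 - a2) / (b2 - a2), n2)
    F0 = F1 * F2 * (1.0 + alpha * (1.0 - F1) * (1.0 - F2))   # FGM joint value
    V = stsp_ppf(F0, (m - a) / (b - a), n_asset)             # standardized value
    return a + V * (b - a)


index1 = (200.0, 230.0, 300.0)
index2 = (90.0, 120.0, 170.0)
asset = (110.0, 140.0, 180.0)
# n1, n2, n_asset and alpha below are illustrative, not the chapter's estimates.
v = value_asset(250.0, 130.0, index1, index2, asset, 3, 3, 3, 0.5)
```

Because the joint distribution function never exceeds either marginal, the resulting value is never above the value that each index would give on its own.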
            a       m         b
Assets      110     140       180
            (0)     (0.428)   (1)
Index 1     200     230       300
            (0)     (0.3)     (1)
Index 2     90      120       170
            (0)     (0.375)   (1)
Table 5. Values of n
Once the values of the parameters n and M are known, they can be
substituted into expression (23) to obtain the joint distribution function.
Analogously, substituting into expression (24) gives the joint density
function.
Table 6 contains the values of the correlation coefficient and of the
parameter α for each of the subfamilies considered in the practical case:
9. Conclusions
I. Introducing two indexes in the two distributions functions methods
under uncertainty we will need the construction of a copula and define
References
1. Athanassoulis, G.A., Skarsoulis, E.K. and Belibassakis, K.A. (1994).
Bivariate distributions with given marginals with an application to wave
climate description. Applied Ocean Research, 16, 1-17.
2. Alonso, R. and Lozano, J. (1985). El metodo de las dos funciones de
distribution: Una aplicacion a la valoracion de fincas agricolas en las
comarcas Centra y Tierra de Campos (Valladolid). Anales del INIA,
Economia, 9, 295-325.
3. Ballestero, E. (1971). Sobre la valoracion sintetica de tierras y un nuevo
metodo aplicable a la concentration parcelaria. Revista de Economia
Politica, 225-238.
4. Ballestero, E. (1973). Nota sobre un nuevo metodo rapido de valoracion.
Revista de Estudios Agrosociales, 85, 75-78.
5. Ballestero, E. and Caballer, V. (1982). Il metodo delle due beta. Un
procedimento rapido nella stima dei beni fondiari. Genio Rurale, 6, 33-36.
6. Ballestero, E. and Rodriguez, J.A. (1999). El precio de los inmuebles
urbanos. CIE Inversiones Editoriales DOSSAT 2000.
7. Barnett, V. (1980). Some bivariate uniform distributions.
Communications in Statistics, A9, 453-461.
8. Caballer, V. (1994). Metodos de valoracion de empresas. Ediciones
Piramide, S.A. 101-104.
9. Caballer, V. (198). Valoracion agraria. Teoria y practica. Ediciones Mundi-
Prensa. 4a edition.
25. Garcia, J., Cruz, S. and Garcia, L.B. (2005). The two-sided power
distribution for the treatment of uncertainty. Statistical Methods &
Applications, 4(5), 209-222.
26. Garcia, J., Cruz, S. and Rosado, Y. (2000). Las funciones de distribucion
multivariantes en la teoria general de valoracion. Actas de la XIV Reunion
Asepelt-Espafia, Oviedo (publicacion en CD-Rom).
27. Garcia, J., Cruz, S. and Rosado, Y. (2002). Extension multi-indice del
metodo beta en valoracion agraria. Economia Agraria y Recursos Naturales,
2(2), 3-26.
28. Garcia, J. and Garcia, L.B. (2003). Teoria General de valoracion. Metodo de
las dos funciones de distribucion. ISBN 84 95979 09 8.
29. Garcia, J., Trinidad, J.E. and Garcia, L.B. (2004). Valoracion por el metodo
de las dos funciones de distribucion: Como seleccionar la mejor
distribucion. XVIII Reunion ASEPELT 84-60947165-.
30. Garcia, J., Trinidad, J.E. and Gomez, J. (1999). El metodo de las dos
funciones de distribucion: la version trapezoidal. Revista Espanola de
Estudios Agrosociales y Pesqueros, 185, 57-80.
31. Garcia, J., Trinidad, J.E. and Sanchez, M. (1997). Seleccion de una cartera
de cultivos: el principio primero la seguridad de Roy. Investigation Agraria.
Serie Economia, 12(1,2,3), 425-445.
32. Genest, C. and Mackay J. (1986). The joy of copulas: Bivariate distributions
with uniform marginals. The American Statistician, 40(4), 280-283.
33. Guadalajara, N. (1996). Valoracion Agraria. Casos Practicos. Ediciones
Mundi-Prensa.
34. Gumbel, E.J. (1960). Bivariate exponential distributions. Journal of the
American Statistical Association, 55, 698-707.
35. Gumbel, E.J. (1961). Bivariate logistic distributions. Journal of the
American Statistical Association, 55, 335-349.
36. Herrerias, R., Garcia, J., Cruz, S. and Herrerias Velasco, J.M. (2001). Il
modello probabilistico trapezoidale nel metodo delle due distribuzione della
teoria generale de valutazioni. Genio Rurale. Estimo e Territorio. Rivista de
Scienze Ambientali ANNO LXIV, 4, 3-9.
37. Herrerias, R., Palacios, F., Callejon, J. and Perez, E. (2001). Un metodo
para contrastar la bondad de un experto en la metodologia PERT.
Programacion, seleccion y control de proyectos en ambiente de incertidumbre.
38. Herrerias, R., Callejon, J., Perez, E. and Herrerias, J.M. (2001). Las familias
de distribuciones beta de varianza constante y mesocurticas en el metodo
PERT. Programacion, seleccion y control de proyectos en ambiente de
incertidumbre.
39. Herrerias Velasco, J.M. (2002). Avances en la teoria general de valoracion
en ambiente de incertidumbre. Tesis Doctoral.
40. Hoeffding, W. (1940). Maszstabinvariante Korrelationstheorie. Schriften
des Mathematischen Instituts und des Instituts fur Angewandte Mathematik
der Universitat Berlin, 5, 181-233.
M. FRANCO-NICOLAS
Dpto. Estadistica e Investigacion Operativa, Universidad de Murcia
Campus de Espinardo, Murcia, 30100, Spain
R. HERRERIAS-PLEGUEZUELO
Department of Quantitative Methods in Economics, University of Granada
Campus de Cartuja s/n. Granada, 18071, Spain
J. CALLEJON-CESPEDES
Department of Quantitative Methods in Economics, University of Granada
Campus de Cartuja s/n. Granada, 18071, Spain
J.M. VIVO-MOLINA
Dpto. Metodos Cuantitativos para la Economia, Universidad de Murcia
Campus de Espinardo, Murcia, 30100, Spain
In this paper, we discuss a new application of the survival functions in asset pricing from
quality indexes. Thus, we propose the valuation method based on the two survival
functions (VMTS) to find, under uncertainty, the market value from a quality index.
Within this framework, from a one-dimensional quality index, VMTS is equivalent to the
valuation method of the two distribution functions (VMTD), which produces loss with
respect to the assessments from each component of a multidimensional quality index;
nevertheless, VMTS provides profit with respect to these assessments from each
component. Finally, we motivate the use of VMTS, as tools for the valuation of an asset,
through a practical application on land pricing.
1. Introduction
In the literature, the survival or reliability measures have been widely used
in many areas of economics, in political science, in biology, and in industrial
engineering. In particular, many interesting results of reliability theory have
been applied in risk analysis, and their properties have interesting qualitative
implications in these fields (see, e.g. Bagnoli and Bergstrom (2005) and the
references therein).
et al. (2002), Herrerias (2002) and Garcia and Garcia (2003)). Unfortunately,
the VMTD produces a loss with respect to the assessments from each component
of a multidimensional quality index, so weights among the components are
often used to adjust the asset pricing. Therefore, we consider that the new
valuation method based on the two survival functions (VMTS) might help to
appraise an asset under uncertainty from a quality index, even more so when
the dimension is reduced by unobserved components of quality.
The purpose of this paper is to establish the theoretical framework of a new
valuation method and to exhibit its practical application. To that end, we
study this new technique based on the two survival functions corresponding
to the two probability models, providing an in-depth explanation of the
principles underlying the analysis of the economic value of an asset by
means of the VMTS and its comparison with the VMTD.
In Section 2, the VMTD is briefly introduced and the new VMTS is given
in order to value an asset, under uncertainty, from a one-dimensional quality
index. Section 3 analyzes the VMTS for a bidimensional or multidimensional
quality index, and shows the differences between both methods when more
information is made available through more than one quality index of an
asset. Likewise, in Section 4, the use of the VMTS as a tool for the
valuation of an asset is motivated through a practical application to land
pricing, and finally, we provide some concluding remarks in Section 5.
$v_D = \phi_D(i)$ (1)
where $\phi_D = F_V^{-1} \circ F_I$.
In this sense, it is possible to consider other valuation techniques for
which the basic principle holds. In particular, taking the survival function
$S_V$ of the market value V of the asset and the survival function $S_I$ of a
quality index of this asset, instead of their distribution functions, we can
consider an alternative method based on the equality of both survival
functions. Consequently, the market value of an asset with quality index
I = i by this new valuation method of the two survival functions is
$v_S = \phi_S(i)$ (2)
where $\phi_S = S_V^{-1} \circ S_I$, and these survival functions are defined by
$S_V(v) = 1 - F_V(v)$ and $S_I(i) = 1 - F_I(i)$. Besides, the VMTS satisfies the
basic valuation principle, since both survival functions are decreasing.
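Equations (1) and (2) can be compared numerically for a one-dimensional index. The sketch below assumes triangular models for both the index and the market value, with hypothetical parameter triples (a, m, b); the function names are illustrative only.

```python
import math


def tri_cdf(x, a, m, b):
    """Distribution function of a triangular model on [a, b] with mode m."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - m))


def tri_ppf(p, a, m, b):
    """Inverse distribution function of the same triangular model."""
    if p <= (m - a) / (b - a):
        return a + math.sqrt(p * (b - a) * (m - a))
    return b - math.sqrt((1.0 - p) * (b - a) * (b - m))


def vmtd(i, index, value):
    """Equation (1): v_D = F_V^{-1}(F_I(i))."""
    return tri_ppf(tri_cdf(i, *index), *value)


def vmts(i, index, value):
    """Equation (2): v_S = S_V^{-1}(S_I(i)), with S = 1 - F."""
    s = 1.0 - tri_cdf(i, *index)          # survival function of the index
    return tri_ppf(1.0 - s, *value)       # S_V^{-1}(s) = F_V^{-1}(1 - s)


index = (1.0, 2.0, 3.5)   # hypothetical (a, m, b) for the quality index
value = (0.5, 1.2, 2.0)   # hypothetical (a, m, b) for the market value
pairs = [(vmtd(i, index, value), vmts(i, index, value)) for i in (1.2, 2.0, 2.8, 3.4)]
```

Both assessments coincide up to rounding, since inverting the survival function at 1 - F is the same as inverting the distribution function at F.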
In order to compare the assessments obtained by Eqs. (1) and (2) through
both methods (VMTD and VMTS) from a one-dimensional quality index, we
give the following result.
Theorem 1. Let I = i be the value of the one-dimensional quality index, with
$v_D$ its market value by VMTD and $v_S$ its assessment by VMTS. Then, $v_D = v_S$.
Proof. From Eqs. (1) and (2), it is immediate, since
index, which does not lead to a loss in the appraisal of the asset. Moreover,
the VMTS produces an appraisal of the asset higher than the valuations
obtained for each component of its quality index, i.e., an appreciation when
more than one quality index is made available to value the asset, as we
prove in the next result.
Theorem 2. Let $I = (i_1, i_2)$ be the value of the bidimensional quality index,
with $v_1$ and $v_2$ its market values by VMTS from the components $I_1$ and $I_2$,
respectively. Let $v_S$ be the assessment by VMTS. Then, $v_S \ge \sup\{v_1, v_2\}$.
Proof. Every bivariate survival function satisfies the following
inequalities:
$S_I(i_1, i_2) \le S_1(i_1)$ and $S_I(i_1, i_2) \le S_2(i_2)$
where $S_j(i_j) = 1 - F_j(i_j)$, for j = 1, 2, are the marginal survival functions
of each component of the quality index, i.e.
$S_I(i_1, i_2) \le \inf\{S_1(i_1), S_2(i_2)\}$
Taking the market values $v_1$ and $v_2$ given by Eq. (2) for each component,
we have the following inequality:
$v_S \ge \sup\{v_1, v_2\}$
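The ordering between the joint and the component assessments can be illustrated with a small numerical sketch. It assumes independent components, so that the joint distribution function is F1 F2 and the joint survival function is S1 S2, and triangular models with hypothetical parameters; neither assumption comes from the text.

```python
import math


def tri_cdf(x, a, m, b):
    """Distribution function of a triangular model on [a, b] with mode m."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - m))


def tri_ppf(p, a, m, b):
    """Inverse distribution function of the same triangular model."""
    if p <= (m - a) / (b - a):
        return a + math.sqrt(p * (b - a) * (m - a))
    return b - math.sqrt((1.0 - p) * (b - a) * (b - m))


def assessments(i1, i2, comp1, comp2, value):
    """Return (v1, v2, v_D, v_S) under independence of the two components."""
    F1, F2 = tri_cdf(i1, *comp1), tri_cdf(i2, *comp2)
    S1, S2 = 1.0 - F1, 1.0 - F2
    v1 = tri_ppf(1.0 - S1, *value)          # VMTS from component 1 alone
    v2 = tri_ppf(1.0 - S2, *value)          # VMTS from component 2 alone
    v_d = tri_ppf(F1 * F2, *value)          # VMTD with joint F = F1 * F2
    v_s = tri_ppf(1.0 - S1 * S2, *value)    # VMTS with joint S = S1 * S2
    return v1, v2, v_d, v_s


comp1 = (1.0, 2.0, 3.5)      # hypothetical (a, m, b), first component
comp2 = (10.0, 30.0, 45.0)   # hypothetical (a, m, b), second component
value = (0.5, 1.2, 2.0)      # hypothetical (a, m, b), market value
v1, v2, v_d, v_s = assessments(2.4, 32.0, comp1, comp2, value)
```

In this setting v_D falls below the smaller component valuation and v_S lies above the larger one, which is the pattern stated in the theorem.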
Let j and k be two assets, with $(i_{1j}, \ldots, i_{nj})$ and $(i_{1k}, \ldots, i_{nk})$ their
values of the quality index and $v_j$ and $v_k$ their market values, respectively.
If $(i_{1j}, \ldots, i_{nj}) \le (i_{1k}, \ldots, i_{nk})$ then $v_j \le v_k$.
So, from a value of the quality index $I = (i_1, \ldots, i_n)$ of the asset, the
VMTD provides the following assessment:
$v_D = \phi_D(i_1, \ldots, i_n)$ (7)
where $\phi_D = F_V^{-1} \circ F_I$, which is lower than the valuations obtained for
each of its components, since
$v_D \le \inf\{v_1, \ldots, v_n\}$ (8)
where the $v_j$'s are the assessments of the asset through the marginal
distributions $F_j$ by Eq. (1), $j = 1, \ldots, n$.
Likewise, using the survival function $S_V$ of the market value V of the asset
and the multivariate survival function $S_I$ of its quality index, the
assessment obtained by the VMTS from $I = (i_1, \ldots, i_n)$ is
$v_S = \phi_S(i_1, \ldots, i_n)$ (9)
where $\phi_S = S_V^{-1} \circ S_I$.
As in the bidimensional case, the VMTS solves the depreciation suffered
by the VMTD when more than one quality index is made available to value the
asset, and provides appraisals of the asset greater than the best valuation
obtained for each component of its quality index.
Theorem 4. Let $I = (i_1, \ldots, i_n)$ be the value of the multidimensional quality
index, with $v_j$'s its market values by VMTS from the components $I_j$'s,
respectively. Let $v_S$ be the assessment by VMTS. Then, $v_S \ge \sup\{v_1, \ldots, v_n\}$.
Theorem 5. Let $I = (i_1, \ldots, i_n)$ be the value of the multidimensional quality
index, where $v_D$ is its market value by VMTD and $v_S$ is its assessment by
VMTS. Then, $v_D \le v_S$.
4. Practical application
In this section, we apply the new valuation method based on the two
survival functions to appraise agricultural plots, obtaining the assessments using
both valuation methods, wherein one can check the results established in the
former sections.
For that, we use the second practical case of Guadalajara (1996), which
studies the valuation of an agricultural plot used for growing grapes in
the Vinalopo Medio region (Alicante, Spain). The quality indexes considered
to describe the market value (€/m²) are the gross production of grapes
(kg/m²) and the percentage of sand in the soil of the plot.
Table 1 displays the minimum (pessimistic), maximum (optimistic)
and mode (most likely) values for each variable. The goal is to value a
plot of agricultural land with an area of 12010.3833 m², a gross production
of 20399 kg/m² and a sand content of 32%.
this particular property; where one can also check the comparisons established in
Section 3.
[Two figures. Legend: inf(V1,V2), sup(V1,V2), VD, VS; horizontal axis ticks from 5 to 35.]
5. Conclusions
Finally, we point out the main conclusions of this paper.
The VMTD and the VMTS are equivalent when appraising an asset from a
one-dimensional quality index.
The VMTS provides a greater assessment than the one obtained by the
VMTD from a bidimensional or multidimensional quality index. Moreover, the
VMTS produces a profit with respect to the assessments given by each
component of the quality index, solving the depreciation of the VMTD when
more than one quality index is made available.
References
1. Alonso, R. and Lozano, J. (1985). El metodo de las dos funciones de
distribution: Una aplicacion a la valoracion de fincas agricolas en las
comarcas Centra y Tierra de Campos (Valladolid). Anales del INIA:
Economia, 9, 295-325.
2. Ballestero, E. (1971). Sobre valoracion sintetica de tierras y un nuevo
metodo aplicable a la concentration parcelaria. Revista de Economia
Politica, 57, 225-238.
3. Bagnoli, M. and Bergstrom, T. (2005). Log-concave probability and its
applications. Economic Theory, 26,445-469.
4. Banerjee, A., Gelfand, A.E., Knight, J.R. and Sirmans, C.F. (2004). Spatial
modeling of house prices using normalized distance-weighted sums. Journal
of Business and Economic Statistics, 22, 206-213.
5. Benkard, C.L. and Bajari, P. (2005). Hedonic price indexes with unobserved
product characteristics, and application to personal computers. Journal of
Business and Economic Statistics, 23, 61-75.
6. Berny, J. (1989). A new distribution function for risk analysis. Journal of
the Operational Research Society, 40, 1121-1127.
7. Caballer, V. (1975). Concepto y metodos de valoracion agraria. Ed. Mundi-
Prensa, Madrid.
8. Callejon, J., Perez, E. and Ramos, A. (1996). La distribucion trapezoidal
como modelo probabilistico para la metodologia PERT. In Programacion,
selection y control de proyectos en ambiente de incertidumbre, R. Herrerias
(ed) (2001), 167-177.
9. Cruz, S., Garcia, C.B. and Garcia, J. (2002). Statistical test for the method
of the two distribution functions. An application in finance. In VI Congreso
de Matematica Financiera y Actuarial and 5th Italian-Spanish Conference in
Financial Mathematics, Valencia.
10. Deltas, G. and Zacharias, E. (2004). Sampling frequency and the
comparison between matched-model and hedonic regression price indexes.
Journal of Business and Economic Statistics, 22, 206-213.
11. Garcia, J., Cruz, S. and Andujar, A.S. (1999). Il metodo delle due funzioni
di distribuzione: Il modello triangolare. Una revisione. Genio Rurale, 11,
3-8.
12. Garcia, J., Cruz, S. and Rosado, Y. (2002). Extension multi-indice del
metodo beta en valoracion agraria. Economia Agraria y Recursos Naturales,
2, 3-26.
13. Garcia, J. and Garcia, L.B. (2003). Teoria General de Valoracion. Metodo
de las dos funciones de distribucion. Ed. Fundacion Unicaja, Malaga.
14. Guadalajara, N. (1996). Valoracion Agraria. Casos Practicos. Ed. Mundi-
Prensa, Madrid.
15. Herrerias, J.M. (2002). Avances en la Teoria General de Valoracion en
Ambiente de Incertidumbre. PhD Dissertation, Universidad de Granada.
16. Herrerias, R., Garcia, J. and Cruz, S. (2003). A note on the reasonableness
of PERT hypotheses. Operations Research Letters, 31, 60-62.
17. Herrerias, R., Garcia, J., Cruz, S. and Herrerias, J.M. (2001). Il modello
probabilistico trapezoidale nel metodo delle due distribuzione della teoria
generale de valutazioni. Genio Rurale. Rivista di Scienze Ambientali,
LXIV, 3-9.
18. Johnson, D. (1997). The triangular distribution as a proxy for the beta
distribution in risk analysis. Journal of the Royal Statistical Society, Ser. D,
46, 387-398.
19. Johnson, N.L. and Kotz, S. (1999). Non-smooth sailing or triangular
distributions revisited after some 50 years. Journal of the Royal Statistical
Society, Ser. D, 48, 179-187.
20. Law, A.M. and Kelton, W.D. (1982). Simulation modelling and analysis.
Ed. New York: McGraw-Hill.
21. Romero, C. (1977). Valoracion por el metodo de las dos distribuciones beta:
una extension. Revista de Economia Politica, 75, 47-62.
22. van Dorp, J.R. and Kotz, S. (2002a). The standard two sided power
distribution and its properties: with applications in financial engineering.
The American Statistician, 56, 90-99.
23. van Dorp, J.R. and Kotz, S. (2002b). A novel extension of the triangular
distribution and its parameter estimation. Journal of the Royal Statistical
Society, Ser. D, 51, 63-79.
24. van Dorp, J.R. and Kotz, S. (2003). Generalized trapezoidal distributions.
Metrika, 58, 85-97.
25. Williams, T.M. (1992). Practical use of distributions in network analysis.
Journal of the Operational Research Society, 43, 265-270.
26. Yaari, M. (1987). The dual theory of choice under risk. Econometrica, 55,
95-115.
Chapter 4
WEIGHTING TOOLS AND ALTERNATIVE TECHNIQUES TO
GENERATE WEIGHTED PROBABILITY MODELS IN
VALUATION THEORY
M. FRANCO-NICOLAS
Dpto. Estadistica e I.O., Universidad de Murcia
Campus de Espinardo, Murcia, 30100, Spain
J.M. VIVO-MOLINA
Dpto. Metodos Cuantitativos para la Economia, Universidad de Murcia
Campus de Espinardo, Murcia, 30100, Spain
In risk analysis, procedures based on weighted probability models are commonly used to reduce the loss in assessments under multivariate scenarios. In particular, in Valuation Theory, weighted distribution functions have been widely used to correct and fit the market value of an asset, through the valuation methods of the two functions, with respect to the appraisals from each component of the multidimensional quality index.
In this context, weighting procedures are of interest in order to find the weights and, consequently, to generate these weighted probability models.
The main objective of this paper is to analyze the different weighting techniques used in Valuation Theory, as well as to propose an alternative way to calculate the weights and a new tool to generate these weighted probability models.
First, the well-known techniques for generating the weights are introduced, under both independence and dependence of the components of the quality index.
Secondly, we extend these weighting techniques to survival functions, which allows us to generate other weighted probability models.
Likewise, we discuss a new tool to determine the weights of the components of the quality index, the modal mean technique, based on the modal values of its marginal distribution functions, which enlarges the set of possible weighted probability models available to approximate the market value of the asset.
Finally, we apply these weighting techniques to generate weighted probability models in an example of land pricing, and thus obtain the assessments of the land property according to each weighted probability model.
1. Introduction
In recent years, several authors have paid increasing attention to the study and generalization of the probability models required in PERT methodology and Valuation Theory (see, e.g., Williams (1992), Callejon, Perez and Ramos (1996), Johnson (1997), Johnson and Kotz (1999), Herrerias, Garcia, Cruz and Herrerias (2001), Herrerias (2002), van Dorp and Kotz (2002a), (2002b) and (2003), Garcia and Garcia (2003) and Herrerias, Garcia and Cruz (2003)). Likewise, the valuation method of the two distributions (VMTD) has been studied and applied, under uncertainty, to approximate the market value of an asset from a quality index (see, e.g., Garcia, Cruz and Andujar (1999), Garcia, Trinidad and Gomez (1999), Cruz, Garcia and Garcia (2002), Garcia, Cruz and Garcia (2002), Garcia, Cruz and Rosado (2002), Herrerias (2002), Garcia and Garcia (2003) and Garcia, Herrerias and Garcia (2003)); in the same way, the valuation method based on the two survival functions (VMTS) has also been used to find the market value from a quality index (see, e.g., Callejon, Franco, Herrerias and Vivo (2005), Franco, Callejon, Herrerias and Vivo (2005) and Franco, Herrerias, Vivo and Callejon (2005)).
Unfortunately, the valuation methods based on two probability models present some disadvantages when more information is available through more than one quality index of the asset, such as loss or profit with respect to the assessments from each component of the quality index. In order to reduce the loss in the assessments in risk analysis, it is useful to consider probability models that weight the distinct components of the quality index. These procedures therefore allow the market value of an asset to be corrected and fitted. In particular, to reduce the depreciation (appreciation) produced in the market by the VMTD (VMTS) when more than one quality index is available, it is common to use weighted probability models based on the marginal distribution (survival) functions of the multidimensional quality index in both cases, independence and dependence among its components.
Therefore, these weighting procedures, by marginal distribution or survival functions, in both the independence and dependence cases, require the weights to be determined, i.e., the coefficients a and p of the weighted probability models. Herrerias (2002) and Garcia and Garcia (2003) analyze three techniques to calculate these weights, and consequently, to generate weighted probability models:
1. subjective (supplied by an expert judgment).
2. modes (relationship between the modal values).
3. econometrics (fit by the linear models).
Within this framework, under independence of the components of the quality
index, Herrerias (2002) summarizes the use of the three procedures, with their
advantages and disadvantages, respectively. However, under dependence
between the components of the quality index, he comments that these techniques
First, the mode technique allows us to determine the weights through the following relationship between the distribution functions
$$F_V(m) = F_{WD}(m_1, m_2) \qquad (1)$$
where $F_{WD}$ is the distribution function of the weighted model built from the marginal distribution functions of the quality index.
On the other hand, when the two components of the quality index are dependent, from Eq. (1), the mode technique allows us to get the weights by
$$F_V(m) = p\,F_1(m_1) + (1-p)\,F_2(m_2)$$
or equivalently,
$$F_V(m) - F_2(m_2) = p\,\bigl(F_1(m_1) - F_2(m_2)\bigr) \qquad (4)$$
where $0 < p < 1$ represents the weight of the first component of the quality index.
If $F_1(m_1) = F_2(m_2)$, then Eq. (4) only makes sense for $F_V(m) = F_1(m_1) = F_2(m_2)$, which is a strong restriction on the modal market value of the asset, and thus $p$ could be chosen anywhere in $(0, 1)$.
If $F_1(m_1) \neq F_2(m_2)$, then the weight satisfies
$$p = \frac{F_V(m) - F_2(m_2)}{F_1(m_1) - F_2(m_2)}$$
wherein we point out the following contradictory cases:
1. When $F_V(m) < F_2(m_2) < F_1(m_1)$, then $p < 0$.
2. When $F_V(m) < F_1(m_1) < F_2(m_2)$, then $p > 1$.
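As a quick numerical illustration of the mode technique under dependence, the following Python sketch solves F_V(m) = p F_1(m_1) + (1 - p) F_2(m_2) for p and checks whether the result lies in (0, 1). The function names and the input values are hypothetical, chosen only to reproduce one admissible case and the two contradictory cases listed above; they are not taken from the land-pricing data.

```python
def mode_weight_dependent(F_v: float, F_1: float, F_2: float) -> float:
    """Weight p solving F_v = p*F_1 + (1 - p)*F_2 (dependence case)."""
    if F_1 == F_2:
        # The equation is then only consistent when F_v equals the common
        # value, and p is not identified by it.
        raise ValueError("F_1 and F_2 coincide: p is not determined")
    return (F_v - F_2) / (F_1 - F_2)


def is_admissible(p: float) -> bool:
    """A weight is admissible only if it lies strictly between 0 and 1."""
    return 0.0 < p < 1.0


# Hypothetical values of F_V(m), F_1(m_1), F_2(m_2)
p_ok = mode_weight_dependent(0.5, 0.6, 0.3)    # F_V lies between F_2 and F_1
p_neg = mode_weight_dependent(0.2, 0.6, 0.3)   # F_V < F_2 < F_1
p_big = mode_weight_dependent(0.2, 0.3, 0.6)   # F_V < F_1 < F_2

for label, p in [("between", p_ok), ("case 1", p_neg), ("case 2", p_big)]:
    print(f"{label}: p = {p:.4f}, admissible = {is_admissible(p)}")
```

Only the first call returns a weight in (0, 1); the other two illustrate why a restriction on the modal market value is needed.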
or equivalently,
$$\frac{S_V(m)}{S_2(m_2)} = \left(\frac{S_1(m_1)}{S_2(m_2)}\right)^{a} \qquad (6)$$
If $S_1(m_1) = S_2(m_2)$, then Eq. (6) only makes sense when $S_V(m) = S_1(m_1) = S_2(m_2)$, which is a strong restriction on the modal market value.
If $S_1(m_1) \neq S_2(m_2)$, then the weight satisfies
$$a = \frac{\log S_V(m) - \log S_2(m_2)}{\log S_1(m_1) - \log S_2(m_2)}$$
wherein we remark the following contradictory situations:
1. When $S_V(m) > S_2(m_2) > S_1(m_1)$, then $a < 0$.
2. When $S_V(m) > S_1(m_1) > S_2(m_2)$, then $a > 1$.
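The logarithmic weight obtained from the survival functions under independence can be checked in the same way. The sketch below, in Python, computes a = (log S_V(m) - log S_2(m_2)) / (log S_1(m_1) - log S_2(m_2)) and verifies that it reproduces S_V(m) = S_1(m_1)^a S_2(m_2)^(1-a). The function name and the numerical inputs are hypothetical and serve only to exhibit one admissible weight and the two contradictory situations.

```python
import math


def mode_weight_survival(S_v: float, S_1: float, S_2: float) -> float:
    """Weight a solving S_v = S_1**a * S_2**(1 - a) (independence case)."""
    if S_1 == S_2:
        raise ValueError("S_1 and S_2 coincide: a is not determined")
    return (math.log(S_v) - math.log(S_2)) / (math.log(S_1) - math.log(S_2))


# Hypothetical values of S_V(m), S_1(m_1), S_2(m_2)
a_ok = mode_weight_survival(0.5, 0.6, 0.4)    # S_V between S_2 and S_1
a_neg = mode_weight_survival(0.7, 0.4, 0.6)   # S_V > S_2 > S_1
a_big = mode_weight_survival(0.7, 0.6, 0.4)   # S_V > S_1 > S_2

# The admissible weight reproduces the weighted survival value.
rebuilt = 0.6 ** a_ok * 0.4 ** (1.0 - a_ok)
print(a_ok, a_neg, a_big, rebuilt)
```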
Therefore, for $p \in (0, 1)$ to hold, it is necessary to impose restriction (7) on the modal market value. So, the mode technique by marginal survival functions has the same disadvantage as the mode technique by marginal distribution functions in both cases, independence and dependence between the components.
Besides, note that $S_1(m_1) = S_2(m_2)$ if and only if $F_1(m_1) = F_2(m_2)$, and that restrictions (3) and (7) are equivalent. Consequently, we have found the same disadvantages of this technique for both weightings, by the marginal survival functions and by the marginal distribution functions.
Note that the modal mean technique generates a weighted model using the
available information by these quality indexes; so it is not influenced by the
market value, and consequently, this procedure does not require any restriction
on the modal market value.
Firstly, when the components of the quality index are independent, from Eq. (9) the weight $a$ provided by the modal mean method is given by
$$F_1^{a}(m_1)\,F_2^{1-a}(m_2) = \frac{F_1(m_1) + F_2(m_2)}{2}$$
or equivalently,
$$\left(\frac{F_1(m_1)}{F_2(m_2)}\right)^{a} = \frac{F_1(m_1) + F_2(m_2)}{2F_2(m_2)}$$
where $0 < a < 1$ represents the weight of the first component of the quality index.
If $F_1(m_1) = F_2(m_2)$, then $a$ can take any value in $(0, 1)$.
If $F_1(m_1) \neq F_2(m_2)$, then the weight satisfies
$$a = \frac{\log\dfrac{F_1(m_1) + F_2(m_2)}{2} - \log F_2(m_2)}{\log F_1(m_1) - \log F_2(m_2)} \in (0, 1)$$
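To see why the modal mean weight under independence needs no restriction, the Python sketch below evaluates a = (log((F_1 + F_2)/2) - log F_2) / (log F_1 - log F_2) for a few pairs of modal values. Because the arithmetic mean of two distinct numbers lies strictly between them, the weight always falls in (0, 1). The function name and the input pairs are hypothetical illustrations, not data from the chapter.

```python
import math


def modal_mean_weight(F_1: float, F_2: float) -> float:
    """Weight a solving F_1**a * F_2**(1 - a) = (F_1 + F_2) / 2."""
    if F_1 == F_2:
        raise ValueError("F_1 and F_2 coincide: any a in (0, 1) is valid")
    mean = (F_1 + F_2) / 2.0
    return (math.log(mean) - math.log(F_2)) / (math.log(F_1) - math.log(F_2))


# Hypothetical pairs (F_1(m_1), F_2(m_2)) of marginal values at the modes
pairs = [(0.6, 0.3), (0.3, 0.6), (0.9, 0.1), (0.51, 0.49)]
weights = [modal_mean_weight(f1, f2) for f1, f2 in pairs]

for (f1, f2), a in zip(pairs, weights):
    # Geometric weighting with a reproduces the arithmetic mean.
    rebuilt = f1 ** a * f2 ** (1.0 - a)
    print(f"F_1={f1}, F_2={f2}: a={a:.6f}, rebuilt mean={rebuilt:.6f}")
```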
On the other hand, when the components of the quality index are dependent, from Eq. (9) we have the next equation
$$p\,F_1(m_1) + (1-p)\,F_2(m_2) = \frac{F_1(m_1) + F_2(m_2)}{2}$$
or equivalently,
$$p\,\bigl(F_1(m_1) - F_2(m_2)\bigr) = \frac{F_1(m_1) - F_2(m_2)}{2}$$
where $0 < p < 1$ represents the weight of the first component of the quality index.
If $F_1(m_1) = F_2(m_2)$, then $p$ may be any point in $(0, 1)$.
If $F_1(m_1) \neq F_2(m_2)$, then $p = 1/2$.
Note that, under dependence between the components, the modal mean technique provides the same weight for each component of the quality index. This shows the coherence of the technique, since the dependence structure between the components already includes the predominance and importance of one over the other, and therefore assigning different coefficients in the weighted model would be contradictory.
$$S_{WS}(m_1, m_2) = \frac{S_1(m_1) + S_2(m_2)}{2} \qquad (10)$$
Note that this relationship generates a weighted model using only the quality index, just like the modal mean method by distribution functions.
In the first place, when the components of the quality index are independent, from Eq. (10) the weight $a$ of the first component is determined by
$$S_1^{a}(m_1)\,S_2^{1-a}(m_2) = \frac{S_1(m_1) + S_2(m_2)}{2}$$
or equivalently,
$$\left(\frac{S_1(m_1)}{S_2(m_2)}\right)^{a} = \frac{S_1(m_1) + S_2(m_2)}{2S_2(m_2)}$$
If $F_1(m_1) = F_2(m_2)$, then $a$ can be any value in $(0, 1)$.
If $F_1(m_1) \neq F_2(m_2)$, then the weight holds
or equivalently,
$$p\,\bigl(S_1(m_1) - S_2(m_2)\bigr) = \frac{S_1(m_1) - S_2(m_2)}{2}$$
4. Practical application
In this section, we present a practical application of the weighted probability models generated by these techniques, in an example of land pricing. In particular, we consider the transactions of agricultural properties in the Tierras de Campos and Centro regions (Valladolid, Spain) given in Alonso and Lozano (1985) and Garcia and Garcia (2003). The quality indexes used to explain the market values (€) are the income per hectare (€/Ha) and the inverse distance to Valladolid (1/km).
Table 1 displays the minimum (pessimistic), maximum (optimistic) and modal (most likely) values for each variable; the objective is to appraise an agricultural property whose income per hectare is 194.31 and location is km from Valladolid.
Thus, in Tables 2 and 3, the weights of the first component have been determined by the marginal distribution functions, and the same weights are used, for a better comparison, in both cases of triangular and trapezoidal market value when the weighting technique is econometric or mode, since the other procedures are not influenced by the distribution function of the market value.
In particular, Table 2 displays the assessments of the property through both
methods, VMTD and VMTS, when the weighted probability models are based
on the marginal distribution functions
and Table 3 includes the appraisals of the weighted probability models based on
the marginal survival functions
Weighting technique   a          F_V           VMTD appraisal   VMTS appraisal
Econometric           0.671763   Triangular    2060.45          2624.50
Econometric           0.671763   Trapezoidal   2120.40          2652.73
Mode                  0.612085   Triangular    2065.21          2609.22
Mode                  0.612085   Trapezoidal   2125.52          2638.58
Modal Mean            0.441782   Triangular    2079.29          2598.49
Modal Mean            0.441782   Trapezoidal   2140.51          2628.65
and Table 5 displays the assessments using the weighted probability models
through the marginal survival functions
Weighting technique   a          F_V           VMTD appraisal   VMTS appraisal
Econometric           0.671763   Triangular    1705.06          2064.05
Econometric           0.671763   Trapezoidal   1724.39          2124.28
Mode                  0.612085   Triangular    1712.13          2069.12
Mode                  0.612085   Trapezoidal   1732.14          2129.70
Modal Mean            0.441782   Triangular    1714.64          2083.42
Modal Mean            0.441782   Trapezoidal   1734.89          2144.86
Note that, in all cases, the VMTS proposes appraisals greater than those of the VMTD.
Besides, we show some graphics on the behaviour of both VMTS and VMTD from the weighted models, in which the market value of an agricultural land follows a triangular model and the quality index has independent and triangular components.
In the first place, from marginal distribution functions, a = 0.82074, 0.615456 and 0.702754 by the mode, econometric and modal mean techniques, respectively, and the valuations obtained from these procedures are marked by "m", "e" and "mm", respectively. So, Figures 1 and 2 describe the
References
1. Alonso, R. and Lozano, J. (1985). El metodo de las dos funciones de
distribucion: Una aplicacion a la valoracion de fincas agricolas en las
13. Garcia, J., Trinidad, J.E. and Gomez, J. (1999). El metodo de las dos
funciones de distribucion: la version trapezoidal. Revista Espanola de
Estudios Agrosociales y Pesqueros, 185, 57-80.
14. Herrerias, J.M. (2002). Avances en la Teoria General de Valoracion en
Ambiente de Incertidumbre. Tesis Doctoral. Universidad de Granada.
15. Herrerias, R., Garcia, J. and Cruz, S. (2003). A note on the reasonableness
of PERT hypotheses. Operations Research Letters, 31, 60-62.
16. Herrerias, R., Garcia, J., Cruz, S. and Herrerias, J.M. (2001). Il modello
probabilistico trapezoidale nel metodo delle due distribuzione della teoria
generale de valutazioni. Genio Rurale. Rivista di Scienze Ambientali,
LXIV, 3-9.
17. Johnson, D. (1997). The triangular distribution as a proxy for the beta
distribution in risk analysis. Journal of the Royal Statistical Society, Ser. D,
46, 387-398.
18. Johnson, N.L. and Kotz, S. (1999). Non-smooth sailing or triangular
distributions revisited after some 50 years. Journal of the Royal Statistical
Society, Ser. D, 48, 179-187.
19. van Dorp, J.R. and Kotz, S. (2002a). The standard two sided power
distribution and its properties: With applications in financial engineering.
The American Statistician, 56, 90-99.
20. van Dorp, J.R. and Kotz, S. (2002b). A novel extension of the triangular
distribution and its parameter estimation. Journal of the Royal Statistical
Society, Ser. D, 51, 63-79.
21. van Dorp, J.R. and Kotz, S. (2003). Generalized trapezoidal distributions.
Metrika, 58, 85-97.
22. Williams, T.M. (1992). Practical use of distributions in network analysis.
Journal of the Operational Research Society, 43, 265-270.
Chapter 5
ON GENERATING AND CHARACTERIZING SOME DISCRETE
AND CONTINUOUS DISTRIBUTIONS
M.A. FAJARDO-CALDERA
Dpto. de Economia Aplicada y Organizacion de Empresas
University of Extremadura
Camino de Elbas s/n, Badajoz, 06071, Spain
J. PEREZ-MAYO
Dpto. de Economia Aplicada y Organizacion de Empresas
University of Extremadura
Camino de Elbas s/n, Badajoz, 06071, Spain
The main aim of this paper is to generate compound distributions, discrete or continuous,
from Binomial conditional distributions by means of Bayesian techniques. Besides, the
authors extend Kowar's paper (1975) by characterizing some discrete and continuous
distributions, in the context of some well-known distributions, from the conditional
distribution of a random variable (r. v.) and by the linear regression of the latter given the
former.
1. Introduction
One of the main aims of Probability Calculus is to determine some
theoretical distributions useful for modeling the random phenomena that appear
in Experimental Sciences.
Many methods have been applied to generate or characterize discrete or
continuous distributions of probability: change of variables, functional equations
(differential or in differences), etc. These methods supply the theoretical issues
needed to describe a random phenomenon and to obtain the explicit probability
law.
In the direct method, some distributions are obtained from the expression of a mathematical model, which is the abstraction of a random experiment. An example of this method is the use of combinatorial theory to obtain directly the probabilities that correspond to each value of a random variable. From this theory some important and well-known distributions, such as the binomial, hypergeometric, geometric or negative binomial ones, are generated.
Sometimes, while trying to establish the model, one must solve an equation (functional, differential, in differences) to obtain the probability law explicitly. An example is the difference equation obtained when one establishes the probability of getting r successes in n independent trials, assuming that the probability of success varies from trial to trial. This equation appears in the generalization of repeated Bernoulli trials, having the Binomial distribution as a particular case.
It is also possible to start from a differential equation, the Poisson distribution being the best-known example. Systems of differential equations have also been proposed in the literature. The most important one is the well-known Pearson system of curves, which generalizes the differential equation generated from the Normal distribution and whose solutions include many continuous probability distributions, such as the Normal, Gamma, Beta and Exponential. Later, this system was studied by Elderton and Johnson (1969) and extended by Herrerias Pleguezuelo (1975) and Callejon (1995).
The discrete version of the Pearson system was developed by Ord (1972) and generalized by Fajardo (1985) and Rodriguez Avi (1993); its most important consequence is the extended analysis of the family of discrete probability distributions defined by the generalized hypergeometric series. This analysis was done by Dacey (1971) and later extended and generalized by Hermoso Gutierrez (1986) and Saez Castillo (2002).
An alternative method used in Statistics for generating probability distributions is the use of functions of random variables, i.e. transformations of variables. The most usual transformations are the sum, product or quotient of two variables.
Finally, one should not forget another method of obtaining probability distributions, namely by means of limits. Two well-known examples are the convergence of a Binomial distribution to a Poisson or a Normal one.
Following the steps above, in this paper we try to generate probability
distributions from compound distributions by means of the well-known
Bayesian techniques and, on the other hand, to characterize discrete distributions
from Binomial conditional distributions and linear regression.
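Then, from (8) and (9) one obtains the marginal distribution of $\xi$, a Poisson-Binomial compound distribution, given by:
$$P[\xi = k] = \sum_{n=k}^{\infty} \frac{\lambda^{n}}{n!}e^{-\lambda}\binom{n}{k}p^{k}(1-p)^{n-k} = \frac{(\lambda p)^{k}}{k!}e^{-\lambda p} \qquad (7)$$
by assuming that $\frac{\lambda^{n}}{n!}e^{-\lambda}$ is the probability of having $n$ children.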
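The Poisson-Binomial compounding can be checked numerically. The Python sketch below sums the Poisson probability of n children times the Binomial probability of k successes among them, and compares the result with a Poisson probability of parameter λp. The function names, the truncation point `n_max` and the values λ = 3 and p = 0.4 are assumptions made for illustration only.

```python
import math


def poisson_pmf(n: int, lam: float) -> float:
    """Poisson probability, computed in log-space for numerical stability."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def binomial_pmf(k: int, n: int, p: float) -> float:
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def compound_pmf(k: int, lam: float, p: float, n_max: int = 120) -> float:
    """P[xi = k] as a Poisson mixture of Binomial(n, p) laws (truncated sum)."""
    return sum(poisson_pmf(n, lam) * binomial_pmf(k, n, p)
               for n in range(k, n_max + 1))


lam, p = 3.0, 0.4  # hypothetical parameter values
for k in range(6):
    print(k, compound_pmf(k, lam, p), poisson_pmf(k, lam * p))
```

The two printed columns agree up to rounding, which is the thinning property of the Poisson law.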
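$$f(\theta \mid \alpha, \beta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha, \beta)}, \quad \alpha > 0,\ \beta > 0,\ 0 < \theta < 1 \qquad (9)$$
where $B(\alpha, \beta)$ is the complete beta function. Since $\theta$ is not observable, the probability distribution of $k$ in $r$ trials, given $\alpha$ and $\beta$, for a randomly chosen member is the following simple mixture model
$$P[\xi = k \mid r, \alpha, \beta] = \int_{0}^{1} P[\xi = k \mid r, \theta]\, f(\theta \mid \alpha, \beta)\, d\theta \qquad (10)$$
By using (8) and (9), the probability distribution of $\xi$ is defined as
$$P[\xi = k \mid r, \alpha, \beta] = \binom{r}{k}\frac{B(\alpha + k, \beta + r - k)}{B(\alpha, \beta)}, \quad k = 0, \ldots, r \qquad (11)$$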
This compound distribution is the beta-binomial model.
As an application of this beta-binomial model, suppose one wants to determine the probability that $k$ randomly selected elements of a population have influenza, knowing that the initial distribution of the proportion $\theta$ of elements with influenza is the distribution Be($\theta \mid$ 1, 12) and that a random sample of 20 elements contained five sick people. Then, considering the former, the probability that $k$ randomly selected elements of the population have influenza is given by
$$P[\xi = k \mid 20, 3, 12] = \binom{20}{k}\frac{B(3 + k, 12 + 20 - k)}{B(3, 12)}$$
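The beta-binomial probabilities are straightforward to evaluate with the log-gamma function. The following Python sketch is a minimal illustration that assumes the parameter values r = 20, α = 3 and β = 12 read from the displayed formula; the function names are hypothetical. It checks that the probabilities sum to one and that the mean equals rα/(α + β).

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the complete beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k: int, r: int, alpha: float, beta: float) -> float:
    """P[xi = k | r, alpha, beta] = C(r, k) B(alpha + k, beta + r - k) / B(alpha, beta)."""
    log_ratio = log_beta(alpha + k, beta + r - k) - log_beta(alpha, beta)
    return math.comb(r, k) * math.exp(log_ratio)


r, alpha, beta = 20, 3.0, 12.0  # values assumed from the example
pmf = [beta_binomial_pmf(k, r, alpha, beta) for k in range(r + 1)]
total = sum(pmf)
mean = sum(k * prob for k, prob in enumerate(pmf))
print(f"sum of probabilities = {total:.10f}, mean = {mean:.10f}")
```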
Let $\Theta$ be a r.v. with density function $f(\theta)$ and assume that $\Theta$ follows a Gamma distribution, given by:
$$f(\theta) = \begin{cases} \dfrac{a^{u}}{\Gamma(u)}\,\theta^{u-1}e^{-a\theta}, & \theta > 0 \\ 0, & \theta \leq 0 \end{cases} \qquad \text{with } u > 0,\ a > 0 \qquad (13)$$
$$P[Y = y] = \int_{0}^{\infty} \frac{\theta^{y}e^{-\theta}}{y!}\,\frac{a^{u}}{\Gamma(u)}\,\theta^{u-1}e^{-a\theta}\,d\theta = \frac{\Gamma(u+y)}{\Gamma(u)\,y!}\,p^{u}q^{y} = \binom{u+y-1}{y}p^{u}q^{y}, \quad p = \frac{a}{1+a},\ q = 1 - p \qquad (14)$$
This is known as the negative binomial distribution with parameters $(u, p)$. This distribution, introduced by Greenwood and Yule, has been used to model industrial accidents. The probability of $y$ accidents is given by a Poisson distribution with parameter $\theta$. However, this parameter changes for each worker and, assuming that it follows a Gamma distribution, the observations follow a compound Poisson distribution.
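The Gamma-Poisson compounding can be verified numerically. The Python sketch below integrates the Poisson probability of y accidents against a Gamma density with shape u and rate a using a simple midpoint rule, and compares the result with the negative binomial probability with p = a/(1 + a). The function names, the integration grid and the values u = 2.5 and a = 1.5 are assumptions made for illustration; the closed form used for comparison is the standard negative binomial law.

```python
import math


def negative_binomial_pmf(y: int, u: float, a: float) -> float:
    """Gamma(u + y) / (Gamma(u) y!) * p**u * q**y with p = a / (1 + a)."""
    p = a / (1.0 + a)
    log_pmf = (math.lgamma(u + y) - math.lgamma(u) - math.lgamma(y + 1)
               + u * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)


def gamma_poisson_mixture(y: int, u: float, a: float,
                          upper: float = 40.0, steps: int = 100_000) -> float:
    """Midpoint-rule integral of Poisson(y | theta) * Gamma(theta | u, a)."""
    h = upper / steps
    log_const = u * math.log(a) - math.lgamma(u) - math.lgamma(y + 1)
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.exp((y + u - 1.0) * math.log(theta)
                          - (1.0 + a) * theta + log_const)
    return total * h


u, a = 2.5, 1.5  # hypothetical shape and rate
for y in (0, 3, 7):
    print(y, gamma_poisson_mixture(y, u, a), negative_binomial_pmf(y, u, a))
```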
Proposition 6.1
In the previous conditions with a finite $E(X)$, we have the following:
(A) $b > 0$; (B) $X$ is bounded if, and only if, $0 < a < 1$. If $X$ is bounded, then $b = m(1 - a)$; (C) $0 < a < p^{-1}$.
Proof.
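(A) Specifying (16) for $y = 0$, we have $E[X \mid Y = 0] = b$. On the other hand,
$$P[Y = y] = \sum_{x} P[X = x, Y = y] = \sum_{x} P[X = x]\,P[Y = y \mid X = x] = \sum_{x \geq y} \binom{x}{y}p^{y}q^{x-y}P[X = x] \qquad (17)$$
For $y = 0$ and $y = m$, we have that:
$$P[Y = 0] = \sum_{x=0}^{m} q^{x}P[X = x] \qquad (18)$$
and
$$P[Y = m] = \binom{m}{m}p^{m}q^{m-m}P[X = m] = p^{m}P[X = m] \qquad (19)$$
$$E[X \mid Y = m] = \sum_{x} x\,P[X = x \mid Y = m] = \sum_{x} x\,\frac{\binom{x}{m}p^{m}q^{x-m}P[X = x]}{P[Y = m]} = \frac{m\,p^{m}P[X = m]}{P[Y = m]} \qquad (21)$$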
Then, from (20) and (21), we have that $m = E[X \mid Y = m] = am + b$. Then, $b = m(1 - a)$ implies that $0 < a < 1$.
Now, assume that $X$ is unbounded. Noting that $Y$ also takes the values 0, 1, 2, ..., and $Y \leq X$, we have from (16) that, for $y = 0, 1, 2, \ldots$,
$$y = \sum_{x \geq y} y\,P[X = x \mid Y = y] \leq \sum_{x \geq y} x\,P[X = x \mid Y = y] = E[X \mid Y = y] = ay + b \qquad (22)$$
or $(1 - a)y \leq b$, which cannot hold for all the nonnegative integers unless $(1 - a) \leq 0$. This proves (B).
(C) By considering that
$$E[X \mid Y = y] = \sum_{x \geq y} x\,P[X = x \mid Y = y] = \sum_{x \geq y} x\,\frac{\binom{x}{y}p^{y}q^{x-y}P[X = x]}{P[Y = y]} \qquad (23)$$
Theorem 6.1
Assume that $X$ is a discrete r.v. taking the values 0, ..., $m$, where $m$ may be a positive integer or $\infty$. Let $E[X]$ be finite. Let $Y$ be another r.v. whose conditional distribution given $X$ is given by (15). Then (16) holds for some constants $a$, $b$ if, and only if, $X$ has one of these distributions: binomial, negative binomial or Poisson. Furthermore, $X$ is binomial iff $0 < a < 1$, negative binomial iff $a > 1$, and Poisson iff $a = 1$.
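Proof ($\Rightarrow$):
Let (16) hold for some constants $a$ and $b$. Writing $P[X = x]$, $x = 0, \ldots, m$, for the probability function of $X$ and $G(t) = E[t^{X}]$ for the probability generating function (p.g.f.) of $X$, we have
$$E[X \mid Y = y] = \sum_{x=y}^{m} x\,P[X = x \mid Y = y] = \sum_{x=y}^{m} x\,\frac{P[X = x]\,P[Y = y \mid X = x]}{P[Y = y]} = \sum_{x=y}^{m} x\,\frac{\binom{x}{y}p^{y}q^{x-y}P[X = x]}{P[Y = y]} = ay + b \qquad (28)$$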
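If we use the following expression, $x\binom{x}{y} = (y+1)\binom{x}{y+1} + y\binom{x}{y}$, we have
$$\sum_{x=y}^{m} x\binom{x}{y}p^{y}q^{x-y}P[X = x] = (y+1)\frac{q}{p}\sum_{x=y+1}^{m}\binom{x}{y+1}p^{y+1}q^{x-y-1}P[X = x] + y\sum_{x=y}^{m}\binom{x}{y}p^{y}q^{x-y}P[X = x] \qquad (30)$$
Then by (30), we have.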
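for the p.g.f. $H(t)$ of $Y$. However, it is known that the p.g.f. of $Y$ is given by
$$H(t) = \sum_{y} t^{y}P[Y = y] = \sum_{x=0}^{m} P[X = x]\sum_{y=0}^{x}\binom{x}{y}(pt)^{y}q^{x-y} = \sum_{x=0}^{m} P[X = x]\,(pt + q)^{x} = G(pt + q)$$
$$G(t) = \left[\frac{(1-a)\,t}{1-ap} + \frac{aq}{1-ap}\right]^{b/(1-a)} \qquad (37)$$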
Now, let $0 < a < 1$. Then Proposition 6.1 (B) shows that $b = m(1 - a)$. Thus, considering equation (37), we have
$$G(t) = \left[\frac{(1-a)\,t}{1-ap} + \frac{aq}{1-ap}\right]^{m} \qquad (38)$$
From (38) it follows that $X$ has a binomial distribution with parameters $(m, \alpha)$, where $\alpha = (1 - a)/(1 - ap)$. [Remember that $G(t) = (pt + q)^{n}$ is the p.g.f. of the binomial distribution.]
Finally, assume $a > 1$. Considering equation (37), it follows that $X$ has a negative binomial distribution with parameters $(v, \alpha)$, where $\alpha = (1 - ap)/(aq)$. By Proposition 6.1 (C) we have $ap < 1$, so $\alpha$ is indeed positive. [Remember that $G(t) = \left(\frac{p}{1 - qt}\right)^{n}$ is the p.g.f. of the negative binomial distribution.]
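This proves the "only if" part of the theorem.
($\Leftarrow$) To prove the "if" part of the theorem, we consider the following:
(A) Suppose that $X$ is Poisson and $Y \mid X = x$ is given by (1). Then, $E[X \mid Y = y] = ay + b$, where $a$ and $b$ are constant.
Let
$$P[X = x, Y = y] = P[X = x]\,P[Y = y \mid X = x] = \frac{e^{-\lambda}\lambda^{x}}{x!}\binom{x}{y}p^{y}q^{x-y} = \frac{e^{-\lambda}\lambda^{x}p^{y}q^{x-y}}{y!\,(x-y)!}$$
be the joint distribution of the bivariate random variable (b.r.v.) $(X, Y)$.
Then,
$$P[X = x \mid Y = y] = \frac{P[X = x, Y = y]}{P[Y = y]} = \frac{e^{-\lambda}\lambda^{x}p^{y}q^{x-y}/\bigl(y!\,(x-y)!\bigr)}{e^{-\lambda p}(\lambda p)^{y}/y!} = \frac{e^{-\lambda q}(\lambda q)^{x-y}}{(x-y)!} \qquad (41)$$
Therefore, given $Y = y$, $X - y$ follows a Poisson distribution with parameter $\lambda q$, and $E[X \mid Y = y]$ is the following:
$$E[X \mid Y = y] = \sum_{x \geq y} x\,\frac{e^{-\lambda q}(\lambda q)^{x-y}}{(x-y)!} = \sum_{x \geq y} (x - y + y)\,\frac{e^{-\lambda q}(\lambda q)^{x-y}}{(x-y)!} = \lambda q + y \qquad (42)$$
which has the form of (2), with $a = 1$ and $b = \lambda q$.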
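The Poisson case of the "if" part can be confirmed numerically. The Python sketch below builds the joint law of (X, Y) for a Poisson X and a Binomial(x, p) conditional law of Y, and computes E[X | Y = y] by direct summation, which should equal y + λq. The function names, the truncation point `x_max` and the values λ = 4 and p = 0.3 are hypothetical.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def conditional_mean_poisson(y: int, lam: float, p: float,
                             x_max: int = 150) -> float:
    """E[X | Y = y] when X ~ Poisson(lam) and Y | X = x ~ Binomial(x, p)."""
    q = 1.0 - p
    numerator = 0.0    # sum of x * P[X = x, Y = y]
    denominator = 0.0  # P[Y = y]
    for x in range(y, x_max + 1):
        joint = poisson_pmf(x, lam) * math.comb(x, y) * p ** y * q ** (x - y)
        numerator += x * joint
        denominator += joint
    return numerator / denominator


lam, p = 4.0, 0.3  # hypothetical parameter values
for y in range(5):
    print(y, conditional_mean_poisson(y, lam, p), y + lam * (1.0 - p))
```

The regression is linear in y with slope 1 and intercept λq, as stated.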
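(B) Suppose now that $X$ follows a binomial distribution with parameters $(m, \alpha)$ and $Y \mid X = x$ is given by (15). Then $E[X \mid Y = y] = ay + b$, where $a$ and $b$ are constant.
Let
$$P[X = x, Y = y] = P[X = x]\,P[Y = y \mid X = x] = \binom{m}{x}\alpha^{x}(1-\alpha)^{m-x}\binom{x}{y}p^{y}q^{x-y} = \binom{m}{y}(\alpha p)^{y}\binom{m-y}{m-x}(1-\alpha)^{m-x}(\alpha q)^{x-y} \qquad (43)$$
$$P[X = x \mid Y = y] = \frac{P[X = x, Y = y]}{P[Y = y]} = \binom{m-y}{m-x}\left(\frac{1-\alpha}{1-\alpha p}\right)^{m-x}\left(\frac{\alpha q}{1-\alpha p}\right)^{x-y} \qquad (45)$$
Therefore, given $Y = y$, $m - X$ has a binomial distribution with parameters $\{(m - y), (1 - \alpha)/(1 - \alpha p)\}$. Then
$$\begin{aligned}
E[X \mid Y = y] &= \sum_{x=y}^{m} x\binom{m-y}{m-x}\left(\frac{1-\alpha}{1-\alpha p}\right)^{m-x}\left(\frac{\alpha q}{1-\alpha p}\right)^{x-y} \\
&= \sum_{x=y}^{m} (x - y + y)\binom{m-y}{m-x}\left(\frac{1-\alpha}{1-\alpha p}\right)^{m-x}\left(\frac{\alpha q}{1-\alpha p}\right)^{x-y} \\
&= (m-y)\,\frac{\alpha q}{1-\alpha p}\sum_{x>y}\binom{m-y-1}{m-x}\left(\frac{1-\alpha}{1-\alpha p}\right)^{m-x}\left(\frac{\alpha q}{1-\alpha p}\right)^{x-y-1} + y \\
&= (m-y)\,\frac{\alpha q}{1-\alpha p} + y = m\,\frac{\alpha q}{1-\alpha p} + y\left(1 - \frac{\alpha q}{1-\alpha p}\right) = ay + m(1-a) = ay + b
\end{aligned}$$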
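The binomial case can also be checked by exact summation. In the Python sketch below, X is Binomial(m, α) and Y given X = x is Binomial(x, p); the conditional mean E[X | Y = y] is computed from the joint law and compared with ay + b, taking a = (1 - α)/(1 - αp) and b = m(1 - a) as read off from the last chain of equalities. The function names and the values m = 10, α = 0.6 and p = 0.5 are hypothetical.

```python
import math


def conditional_mean_binomial(y: int, m: int, alpha: float, p: float) -> float:
    """E[X | Y = y] when X ~ Binomial(m, alpha) and Y | X = x ~ Binomial(x, p)."""
    q = 1.0 - p
    numerator = 0.0    # sum of x * P[X = x, Y = y]
    denominator = 0.0  # P[Y = y]
    for x in range(y, m + 1):
        joint = (math.comb(m, x) * alpha ** x * (1.0 - alpha) ** (m - x)
                 * math.comb(x, y) * p ** y * q ** (x - y))
        numerator += x * joint
        denominator += joint
    return numerator / denominator


def regression_coefficients(m: int, alpha: float, p: float) -> tuple[float, float]:
    """Slope a and intercept b of the linear regression of X on Y."""
    a = (1.0 - alpha) / (1.0 - alpha * p)
    b = m * (1.0 - a)
    return a, b


m, alpha, p = 10, 0.6, 0.5  # hypothetical parameter values
a, b = regression_coefficients(m, alpha, p)
for y in range(m + 1):
    print(y, conditional_mean_binomial(y, m, alpha, p), a * y + b)
```

Note that 0 < a < 1 here, in agreement with the binomial case of the theorem.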
(C) Suppose now that $X$ has a negative binomial distribution with parameters $(v, \alpha)$, where $\alpha = (1 - ap)/(aq)$, and $Y \mid X = x$ is given by (1). Then, $E[X \mid Y = y] = ay + b$, where $a$ and $b$ are constant.
Let
$$P[X = x, Y = y] = P[X = x]\,P[Y = y \mid X = x] = \binom{v+x-1}{x}\alpha^{v}(1-\alpha)^{x}\binom{x}{y}p^{y}q^{x-y} = \binom{v+y-1}{y}\binom{v+x-1}{x-y}\alpha^{v}\{p(1-\alpha)\}^{y}\{q(1-\alpha)\}^{x-y} \qquad (46)$$
Thus from (46) we have that
$$P[Y = y] = \sum_{x \geq y} P[X = x, Y = y] = \sum_{x=y}^{\infty}\binom{v+y-1}{y}\binom{v+x-1}{x-y}\alpha^{v}\{p(1-\alpha)\}^{y}\{q(1-\alpha)\}^{x-y} \qquad (47)$$
Therefore, $Y$ has a binomial distribution with parameters $(m, \alpha p)$.
Then, from (46) and (47) we have that
$$P[X = x \mid Y = y] = \frac{P[X = x, Y = y]}{P[Y = y]} = \binom{m-y}{m-x}\left(\frac{1-\alpha}{1-\alpha p}\right)^{m-x}\left(\frac{\alpha q}{1-\alpha p}\right)^{x-y} \qquad (48)$$
Therefore, $X \mid Y = y$ has a binomial distribution with parameters $\{(m - y), (1 - \alpha)/(1 - \alpha p)\}$. Then,
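$$b = E[\Theta \mid Y = 0] = \int_{0}^{\infty} \theta\, f(\theta \mid 0)\, d\theta > 0 \qquad (49)$$
Otherwise, from $E[Y \mid \theta] = \theta$, we have that
$$\theta = E[Y \mid \theta] = \sum_{y} y\,P[Y = y \mid \theta] = \sum_{y} y\,\frac{P[Y = y, \Theta = \theta]}{f(\theta)}$$
from which
$$\theta f(\theta) = \sum_{y} y\,P[Y = y, \Theta = \theta] \qquad (50)$$
$$ay + b = E[\Theta \mid y] = \int_{0}^{\infty} \theta\, f(\theta \mid y)\, d\theta = \int_{0}^{\infty} \theta\,\frac{P[Y = y, \Theta = \theta]}{P[Y = y]}\, d\theta \qquad (52)$$
$$\Rightarrow\ (ay + b)\,P[Y = y] = \int_{0}^{\infty} \theta\,P[Y = y, \Theta = \theta]\, d\theta$$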
References
1. Dacey, M.F. (1972). A family of discrete probability distributions defined
by the generalized hypergeometric series. Sankhya, Series B, 34, 243-250.
2. Elderton, William P. and Johnson, Norman L. (1969). System of Frequency
Curves. Cambridge University Press.
3. Fajardo Caldera, M.A. (1985). Generalizaciones de los sistemas
Pearsonianos discretos. Tesis doctoral. Universidad de Extremadura.
4. Hermoso Gutierrez, J.A. (1986). Estudio sobre distribuciones generadas por
funciones hipergeometricas de argumento matricial. Tesis doctoral.
Universidad de Granada
5. Herrerias Pleguezuelo, R. (1975). Sobre las estructuras estadisticas de
Pearson y exponenciales: problemas asociados. Tesis doctoral. Universidad
de Granada.
J.M. FERNANDEZ-PONCE
Departamento de Estadistica e I.O., Universidad de Sevilla
T. GOMEZ- GOMEZ
Departamento de Estadistica e I.O., Universidad de Sevilla
J.L. PINO-MEJIAS
Departamento de Estadistica e I.O., Universidad de Sevilla
R. RODRIGUEZ-GRINOLO
Departamento de Estadistica e I.O., Universidad de Sevilla
1. Introduction
Stochastic orderings arise in statistical decision theory in the comparison of experiments and estimation problems. Many useful characterizations of the usual stochastic and dispersion orders can be found in the literature. An excellent handbook is Shaked and Shantikumar [13]. One of the most interesting characterizations of the dispersion order is given in Shaked [12]. In particular, dispersion and spread have been used to characterize the variability of distributions and have been extensively studied (see Lewis and Thompson, [10]; Shaked, [12]; Hickey, [8]; Rojo and He, [11]; Fernandez-Ponce et al. [6];
Proof. Caperaa [3] showed that if $n < m$ then $t_m <_r t_n$. In addition, Doksum [5] showed that for univariate absolutely continuous distributions with $F(0) = G(0) = 0$ such that $f(0) \geq g(0) > 0$ and $G^{-1}(u)/F^{-1}(u)$ non-decreasing for all $u$ in (0, 1), $F <_{disp} G$ holds. Under the last discussion, we consider the random variable $|t_n|$ with density function given by $f_{|t_n|}(t) = 2f_{t_n}(t)$ if $t > 0$ and 0 otherwise. A straightforward computation shows that the distribution function of $|t_n|$ is $F_{|t_n|}(t) = 2F_{t_n}(t) - 1$ for $t > 0$. Hence, $F^{-1}_{|t_n|}(u) = F^{-1}_{t_n}[(u + 1)/2]$ for all $u$ in the interval (0, 1). Therefore, by using Caperaa [3], if $n < m$ then $F^{-1}_{|t_n|}(u)/F^{-1}_{|t_m|}(u)$ is non-decreasing for all $u$ in (0, 1). Since $F_{|t_m|}(0) = F_{|t_n|}(0)$ and $f_{|t_m|}(0) \geq f_{|t_n|}(0)$, by using the result in Doksum [5], $|t_m| <_{disp} |t_n|$ holds. It is easy to check, by using properties of symmetry, that this last result implies that $t_m <_{disp} t_n$.
4. An application
In this section, some results obtained in the last section are applied to the particular t-distribution family. For this purpose, the corresponding definition of the t-distribution from Bernardo and Smith ([2], pp. 139-140) is used. A continuous random vector $X$ has a multivariate t-distribution or a multivariate Student distribution of dimension $k$, with parameters $\mu = (\mu_1, \ldots, \mu_k)$, $\Sigma$ and $n$, where $\mu$ is in $\mathbb{R}^{k}$, $\Sigma$ is a symmetric positive definite $k \times k$ matrix, and $n > 0$, if its probability density function, denoted $St_k(x; \mu, \Sigma, n)$, is
$$St_k(x; \mu, \Sigma, n) = c\left[1 + \frac{1}{n}(x - \mu)^{T}\Sigma\,(x - \mu)\right]^{-\frac{n+k}{2}}$$
for all $x$ in $\mathbb{R}^{k}$, where
$$c = \frac{\Gamma\!\left(\frac{n+k}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)(n\pi)^{k/2}}\,|\Sigma|^{1/2}$$
Although not exactly equal to the inverse of the covariance matrix, the parameter $\Sigma$ is often referred to as the precision matrix of the distribution or, equivalently, the inverse matrix of the dispersion matrix. In the general case, $E[X] = \mu$ and $Var(X) = \Sigma^{-1}\,n/(n - 2)$. An extension of the univariate dispersion order to the multivariate case was given by Giovagnoli and Wynn [7]. A function $\Phi: \mathbb{R}^{n} \to \mathbb{R}^{n}$ is called an expansion if $\|\Phi(x) - \Phi(x')\|_2 \geq \|x - x'\|_2$ for all $x$ and $x'$ in $\mathbb{R}^{n}$. Let $X$ and $Y$ be two $n$-dimensional random vectors. Suppose that $Y =_{st} \Phi(X)$ for some expansion function $\Phi$. Then we say that $X$ is less than $Y$ in the strong multivariate dispersion order (denoted by $X <_{SD} Y$). Roughly speaking, the strong multivariate dispersive order is based on the existence of an
/(•HSD^OO
Properties in Samplingfrom the Normal Distribution 109
This fact may be interpreted as the added variability due to the deletion of data subset i. However, it does not hold that every subset of data with a fixed size k has the same influence. Consequently, a Dispersion Bayesian Influence in terms of Variability (DBIV) measure for the i-subset can be defined as
Q 2 =|M S %(I + H (i) ))-Ms 2 (I + H))|[
and the subsets from least to most influential according to the magnitude of Q \
are ordered. Note that under the assumptions in Corollary 4.1, if the inequality
3i(s2(i) (I + H (i) )) > l(s2 (I + H))
holds then
References
1. Arias-Nicolas, J.P., Fernandez-Ponce, J.M., Luque-Calvo, P. and
Suarez-Llorens, A. (2005). Multivariate dispersion order and the notion of copula
applied to the multivariate t-distribution. Probability in the Engineering and
Informational Sciences, 19, 361-375.
2. Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. John Wiley
and Sons.
3. Caperaa, P. (1998). Tail ordering and asymptotic efficiency of rank tests.
The Annals of Statistics, 16, 470-478.
4. Droste, W. and Wefelmeyer, W. (1985). A note of strong unimodality and
dispersivity. Journal of Applied Probability, 22(1), 235-239.
5. Doksum, K. (1969). Starshaped transformations and the power of rank tests.
Annals of Mathematical Statistics, 40, 1167-1176.
6. Fernandez-Ponce, J.M., Kochar, S.C. and Munoz-Perez, J. (1998). Partial
orderings of distributions based on right spread functions. Journal of
Applied Probability, 35, 221-228.
7. Giovagnoli, A. and Wynn, H.P. (1995). Multivariate dispersion orderings.
Statistics and Probability Letters, 22, 325-332.
8. Hickey, R.J. (1986). Concepts of dispersion in distributions: A comparative
note. Journal of Applied Probability, 23, 924-929.
9. Lawrence, M.J. (1975). Inequalities of s-ordered distributions. Ann. Statist.,
3, 413-428.
10. Lewis, T. and Thompson, J.W. (1981). Dispersive distributions and the
connection between dispersivity and strong unimodality. Journal of Applied
Probability, 18, 76-90.
11. Rojo, J. and Guo Zhong He. (1991). New properties and characterizations
of the dispersive ordering. Statistics and Probability Letters, 11, 365-372.
110 J.M. Fernandez-Ponce et al.
R.M. GARCIA-FERNANDEZ
Department of Quantitative Methods in Economics, University of Granada
Campus de Cartuja s/n. Granada, 18071, Spain
In this paper we apply the generating function to obtain the density of the overall sample.
This density is called the mix density and is proportional to the geometric mean of the
subgroup densities. This approach can be used to measure polarization when it is
understood as an economic distance between distributions. An empirical illustration is
provided using data from the Spanish Household Expenditure Survey for
the regions of Andalucia and Cataluna, compiled by the Instituto Nacional de
Estadistica (INE) for the year 1999.
1. Introduction
The main objective of this paper is to extend the economic applications of
the generating function concept. The generating function was defined by
Callejon [1] by considering that the right-hand side of the Pearson system, which
is given by

$$\frac{f'(y)}{f(y)} = \frac{y - a}{b_0 + b_1 y + b_2 y^2},$$

is a function of a real variable, $g(y)$; that is to say, $\frac{f'(y)}{f(y)} = g(y)$.
The generating function has been applied successfully to the estimation of
the income distribution as we can see for instance in the papers of Herrerias,
Palacios and Ramos [8] and Herrerias, Palacios and Callejon [9]. In addition, the
concept of generating function can be used to generate Lorenz curves and
therefore to measure the inequality of the income distribution [1].
Another economic problem related to the income distribution is the
measurement of income polarization, as shown by the growing number of
publications on this topic (see Esteban and Ray [5], Wolfson [15], Tsui
and Wang [13], among others). As we will discuss in Section 4, there are several
approaches to measure the polarization. Following Gertel, Giuliodori and
Rodriguez [7] we are going to focus on the analysis of the polarization when it is
2. Generating function
The starting point will be the definition of the generating function provided
by Callejon [1]. Let $Y$ be a real variable defined over the bounded
support $(a, b)$. Suppose that $g(y)$ is a function of a real variable such that
i) $G(y) = \int g(y)\,dy$ and ii) $\int_a^b e^{G(y)}\,dy < \infty$ are verified. Then it is possible to
obtain a continuous probability distribution with density
function $f(y) = K e^{G(y)}$ $(a < y < b)$, in which $K = \left[\int_a^b e^{G(y)}\,dy\right]^{-1}$. Observe that the following
is verified:

$$\frac{d}{dy} \ln f(y) = \frac{f'(y)}{f(y)} = g(y) \qquad (1)$$
Function g(y) receives the name of generating function of probability (for
more details about this function and its properties see Callejon [1]).
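The construction above can be illustrated numerically. The following is a minimal sketch, not the author's implementation: it assumes a grid-based trapezoidal rule for $G$, and the function name, the gamma-type generating function, the interval (0.01, 40) and the parameter values are all hypothetical choices made for illustration.

```python
import math

def density_from_generating_function(g, a, b, n=2000):
    """Build f(y) = K * exp(G(y)) on (a, b) from a generating function g."""
    h = (b - a) / n
    ys = [a + (i + 0.5) * h for i in range(n)]   # midpoints of the grid cells
    G = [0.0]                                    # G is fixed only up to an additive constant
    for i in range(1, n):
        # cumulative trapezoidal rule for G(y) = integral of g
        G.append(G[-1] + 0.5 * (g(ys[i - 1]) + g(ys[i])) * h)
    weights = [math.exp(v) for v in G]
    K = 1.0 / (sum(weights) * h)                 # K = 1 / integral of exp(G)
    return ys, [K * w for w in weights]

# Hypothetical example: a gamma-type generating function on a bounded interval.
alpha, theta = 3.0, 2.0
ys, f = density_from_generating_function(
    lambda y: (alpha - 1) / y - 1 / theta, 0.01, 40.0)
```

By construction the resulting values integrate to one on the grid, and a finite-difference derivative of $\ln f$ recovers $g$, which is the content of equation (1).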
Generating Function and Polarization 113
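$$g(y) = p_1\left(\frac{\alpha_1 - 1}{y} - \frac{1}{\theta_1}\right) + p_2\left(\frac{\alpha_2 - 1}{y} - \frac{1}{\theta_2}\right) = \frac{p_1(\alpha_1 - 1) + p_2(\alpha_2 - 1)}{y} - \left(\frac{p_1}{\theta_1} + \frac{p_2}{\theta_2}\right) = \frac{\alpha - 1}{y} - \frac{1}{\theta} = g(y; \alpha, \theta)$$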
Therefore the density of the overall sample is given by:

$$f(y) = K e^{\int g(y)\,dy} = \frac{1}{\Gamma(\alpha)\theta^{\alpha}}\, y^{\alpha - 1} e^{-y/\theta}$$

This density is called the mix density; it is the density of a gamma distribution,
where $\Gamma(\alpha)$ is the gamma function.
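Observe that the mix density function is proportional to the geometric mean
of the densities of the subgroups:

$$f(y) = K \left[\frac{1}{\Gamma(\alpha_1)\theta_1^{\alpha_1}}\, y^{\alpha_1 - 1} e^{-y/\theta_1}\right]^{p_1} \left[\frac{1}{\Gamma(\alpha_2)\theta_2^{\alpha_2}}\, y^{\alpha_2 - 1} e^{-y/\theta_2}\right]^{p_2} \qquad (3)$$

where $K$ is a normalization constant given by:

$$K = \frac{\left(\dfrac{p_1}{\theta_1} + \dfrac{p_2}{\theta_2}\right)^{p_1\alpha_1 + p_2\alpha_2}}{\Gamma(p_1\alpha_1 + p_2\alpha_2) \left[\dfrac{1}{\Gamma(\alpha_1)\theta_1^{\alpha_1}}\right]^{p_1} \left[\dfrac{1}{\Gamma(\alpha_2)\theta_2^{\alpha_2}}\right]^{p_2}} \qquad (4)$$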
Introducing (4) into (3), the mix density function can be rewritten as follows:

$$f(y) = \frac{1}{\Gamma(\alpha)\theta^{\alpha}}\, y^{\alpha - 1} e^{-y/\theta}$$
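The proportionality between the mix density and the weighted geometric mean of the subgroup densities can be checked numerically. The sketch below assumes $p_1 + p_2 = 1$, $\alpha = p_1\alpha_1 + p_2\alpha_2$ and $1/\theta = p_1/\theta_1 + p_2/\theta_2$; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def gamma_pdf(y, alpha, theta):
    """Gamma density with shape alpha and scale theta, evaluated via logarithms."""
    log_f = ((alpha - 1) * math.log(y) - y / theta
             - math.lgamma(alpha) - alpha * math.log(theta))
    return math.exp(log_f)

def mix_parameters(a1, t1, a2, t2, p1, p2):
    """Parameters of the mix density, assuming p1 + p2 = 1."""
    alpha = p1 * a1 + p2 * a2
    theta = 1.0 / (p1 / t1 + p2 / t2)
    return alpha, theta

def weighted_geometric_mean(y, a1, t1, a2, t2, p1, p2):
    """Unnormalised product f1(y)**p1 * f2(y)**p2."""
    return gamma_pdf(y, a1, t1) ** p1 * gamma_pdf(y, a2, t2) ** p2

# Hypothetical parameter values, for illustration only.
a1, t1, a2, t2, p1, p2 = 3.0, 2.0, 5.0, 1.5, 0.4, 0.6
alpha, theta = mix_parameters(a1, t1, a2, t2, p1, p2)
ratios = [weighted_geometric_mean(y, a1, t1, a2, t2, p1, p2) / gamma_pdf(y, alpha, theta)
          for y in (0.5, 2.0, 8.0)]
```

If the relation holds, every entry of `ratios` is the same constant, namely the reciprocal of the normalization constant $K$.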
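where $n_1$ and $n_2$ are the sizes of the two subgroups and $\bar{Y}_1$ and $\bar{Y}_2$ are the
respective sample means.
The values of the parameters that maximize the above log-likelihood
function are denoted by $\hat{\alpha}_1, \hat{\theta}_1, \hat{\alpha}_2, \hat{\theta}_2$. Secondly, we introduce $\hat{\alpha}_1, \hat{\theta}_1, \hat{\alpha}_2, \hat{\theta}_2$
into $f(y)$ and apply the maximum likelihood method again to
estimate $p_1, p_2$. Empirical work shows that $p_1, p_2$ approximate
the group sizes well. Observe that the parameters $\alpha$ and $\theta$ can be expressed as
functions $h_i(\cdot)$ of the parameters $\alpha_1, \theta_1, \alpha_2, \theta_2, p_1, p_2$, that is

$$\alpha = h_1(\alpha_1, \alpha_2, p_1, p_2)$$
$$\theta = h_2(\theta_1, \theta_2, p_1, p_2)$$

Hence, according to Zehna's theorem (Rohatgi [12]), we can
conclude that

$$\hat{\alpha} = h_1(\hat{\alpha}_1, \hat{\alpha}_2, \hat{p}_1, \hat{p}_2)$$
$$\hat{\theta} = h_2(\hat{\theta}_1, \hat{\theta}_2, \hat{p}_1, \hat{p}_2)$$

are the MLE of the parameters $\alpha$ and $\theta$.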
After describing the estimation process, we are going to apply these results
to the polarization measurement.
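$$W = 4\,\frac{\mu}{m}\left[\frac{1}{2} - L\left(\frac{1}{2}\right) - \frac{G}{2}\right]$$

where $\mu$ is the mean, $m$ is the median income, $L\left(\frac{1}{2}\right)$ is the Lorenz curve at the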
Tsui and Wang [13], following the measure of Wolfson, defined a new class
of indices expressed by:

$$P = \frac{\theta}{N} \sum_{i=1}^{k} n_i \left|\frac{m_i - m}{m}\right|^{r}$$

where $n_i$ is the number of individuals that belong to group $i$, $k$ is the number of
groups, $m_i$ is the median of group $i$, $\theta$ is a positive constant and $r$ takes
values in the interval $[0, 1]$.
Esteban and Ray [5] provided a measure of polarization based on the sum of
antagonisms between individuals that belong to different groups. The
antagonism felt by each individual of group $i$ is the joint result of the inter-group
alienation combined with the sense of identification with the group to which
individual $i$ belongs. The measure proposed by these authors is:

$$P = \sum_{i} \sum_{j} p_i^{1+\alpha} p_j \left|y_i - y_j\right|, \qquad 1 \le \alpha \le 1.6$$

$$P_\alpha(f) = \iint f(x)^{1+\alpha} f(y) \left|y - x\right| \,dy\,dx, \qquad \alpha \in [0.25, 1]$$

where $|y - x|$ represents the alienation (distance) felt between the individuals located at
$x$ and $y$. The sense of group identification that an individual with income $x$ feels
is given by $f(x)^{\alpha}$, where $\alpha$ is the sensitivity to polarization and falls into the
interval $[0.25, 1]$, in order to be consistent with the set of axioms proposed by
Duclos, Esteban and Ray.
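Both identification-alienation measures are straightforward to evaluate numerically. The following is a minimal sketch under stated assumptions: the discrete index is computed without any scaling constant, the continuous index is approximated by a midpoint rule on a bounded interval, and the function names and test inputs are hypothetical.

```python
def esteban_ray(p, y, alpha):
    """Discrete index: sum over i, j of p_i**(1 + alpha) * p_j * |y_i - y_j|."""
    return sum(pi ** (1 + alpha) * pj * abs(yi - yj)
               for pi, yi in zip(p, y) for pj, yj in zip(p, y))

def duclos_esteban_ray(f, a, b, alpha, n=400):
    """Midpoint-rule approximation of the double integral of
    f(x)**(1 + alpha) * f(y) * |y - x| over (a, b) x (a, b)."""
    h = (b - a) / n
    xs = [a + (i + 0.5) * h for i in range(n)]
    fx = [f(x) for x in xs]
    total = 0.0
    for xi, fi in zip(xs, fx):
        identification = fi ** (1 + alpha)
        total += identification * sum(fj * abs(xj - xi) for xj, fj in zip(xs, fx))
    return total * h * h

# Two equal-sized groups located at incomes 0 and 1.
er_value = esteban_ray([0.5, 0.5], [0.0, 1.0], alpha=1.0)
# Uniform density on (0, 1): the index reduces to the integral of |y - x|, i.e. 1/3.
der_value = duclos_esteban_ray(lambda x: 1.0, 0.0, 1.0, alpha=0.5)
```

For the uniform density the identification term is constant, so the index measures alienation only; more peaked or multimodal densities raise the identification component.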
D= *'-*' (7)
between distributions, $D$ can be used to measure the polarization. The higher the
value of $D$, the larger the polarization of the income distribution.
In our opinion the last approach is the most appropriate to analyze
polarization in the context that we are working in. That is, we know the
densities of the subgroups, $f_1(y)$ and $f_2(y)$, and we want to see how separate
or polarized they are.
In the next stage, we are going to obtain $D$ according to the results
provided in Section 3. Let us consider two regions: the first group collects the
income data of the individuals that belong to region 1, and the second one those of the
individuals that belong to region 2. The mean incomes of the two regions are
given by $\mu_1$ and $\mu_2$, and we assume that $\mu_2 > \mu_1$. The corresponding densities
are:
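$$f_1(y) = \frac{1}{\Gamma(\alpha_1)\theta_1^{\alpha_1}}\, y^{\alpha_1 - 1} e^{-y/\theta_1} \qquad (8)$$

$$f_2(y) = \frac{1}{\Gamma(\alpha_2)\theta_2^{\alpha_2}}\, y^{\alpha_2 - 1} e^{-y/\theta_2} \qquad (9)$$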
Given that $\mu_1 = \alpha_1\theta_1$ and $\mu_2 = \alpha_2\theta_2$, we can write expression (7) as
follows:

$$D = \frac{\alpha_2\theta_2 - \alpha_1\theta_1}{2d_1 + \alpha_1\theta_1 - \alpha_2\theta_2}$$

The gross economic affluence, $d_1$, is given by:
where $f_1(y)$ and $f_2(y)$ are the density functions (8) and (9) and $F_1(y)$ and
$F_2(y)$ are their respective cumulative distribution functions.
As we can see, the ratio $D$ is expressed in terms of the parameters of the
gamma distributions and $d_1$. In Section 3, we described an approach based on
the maximum likelihood method to estimate the parameters $\alpha_1, \theta_1, \alpha_2, \theta_2, p_1, p_2$, so the
next step will be to apply this theoretical result to an empirical distribution.
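The ratio $D$ can be approximated on a grid once the gamma parameters are known. The sketch below is an illustration under assumptions rather than the procedure used in the chapter: it takes $d_1 = E[(Y_2 - Y_1) \cdot 1\{Y_2 > Y_1\}]$ for independent $Y_1$ and $Y_2$, uses a midpoint rule truncated at a hypothetical upper bound, and all names and parameter values are invented for the example.

```python
import math

def gamma_pdf(y, alpha, theta):
    """Gamma density with shape alpha and scale theta, evaluated via logarithms."""
    return math.exp((alpha - 1) * math.log(y) - y / theta
                    - math.lgamma(alpha) - alpha * math.log(theta))

def economic_distance_ratio(a1, t1, a2, t2, upper, n=4000):
    """D = (mu2 - mu1) / (2*d1 + mu1 - mu2), with d1 approximated numerically."""
    h = upper / n
    ys = [(i + 0.5) * h for i in range(n)]
    f1 = [gamma_pdf(y, a1, t1) for y in ys]
    f2 = [gamma_pdf(y, a2, t2) for y in ys]
    F1 = 0.0   # running integral of f1(x) over x < y
    M1 = 0.0   # running integral of x * f1(x) over x < y
    d1 = 0.0
    for y, u, v in zip(ys, f1, f2):
        # the inner integral of (y - x) * f1(x) over x < y equals y*F1(y) - M1(y)
        d1 += v * (y * F1 - M1) * h
        F1 += u * h
        M1 += y * u * h
    mu1, mu2 = a1 * t1, a2 * t2
    return (mu2 - mu1) / (2 * d1 + mu1 - mu2)

# Identical distributions give D = 0; widely separated ones give D close to 1.
d_same = economic_distance_ratio(3.0, 2.0, 3.0, 2.0, upper=60.0)
d_far = economic_distance_ratio(2.0, 1.0, 200.0, 1.0, upper=400.0)
```

The two limiting cases correspond to total overlap and complete separation, the end points between which the empirical value of $D$ is interpreted.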
5. Empirical application
We want to point out that the main aim of this Section is to show how the
proposed method works. This is a preliminary version and we do not intend to
carry out an exhaustive analysis of income polarization. We are going to use the
data from the Spanish Household Expenditure Survey, Encuesta Continua de
Presupuestos Familiares, compiled by the Instituto Nacional de Estadistica
(INE) for the year 1999. We are going to focus on the income per capita of two
autonomous regions (Comunidades Autonomas), Andalucia and Cataluna. First,
we estimate the density function of Andalucia, $f_1(y)$, and of Cataluna, $f_2(y)$.
Secondly, the mix density associated with the overall sample is estimated; see
Figures 1 to 4.
[Figure: estimated density curves, with legend entries Cataluna and Andalucia. Surviving annotations: $\hat{\alpha}_1 = 3.60153259$ and 5.202E-06.]
The ratio $D$ can be estimated from the observed values or from a parametric
model of the income distribution. The estimation presented in this Section is based
on the estimated parametric model. To obtain $d_1$ we have to solve
the following integrals by numerical methods:

$$\int_0^{\infty} y F_1(y) f_2(y)\,dy + \int_0^{\infty} y F_2(y) f_1(y)\,dy$$
The Gini indices of Andalucia and Cataluna, as well as the Gini index for
both regions jointly, are obtained. Given that income is distributed according
to a gamma distribution, the Gini indices (Lafuente [10]) for Andalucia, $IG_1$,
and Cataluna, $IG_2$, are calculated using the following expression:

$$IG_i = \frac{\Gamma\left(\alpha_i + \frac{1}{2}\right)}{\sqrt{\pi}\,\Gamma(\alpha_i + 1)}, \qquad i = 1, 2$$

The Gini index for the overall sample, considering that $\alpha = p_1\alpha_1 + p_2\alpha_2$, is
given by:

$$IG = \frac{\Gamma\left(\alpha_1 p_1 + \alpha_2 p_2 + \frac{1}{2}\right)}{\sqrt{\pi}\,\Gamma(\alpha_1 p_1 + \alpha_2 p_2 + 1)}$$
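The Gini index of a gamma distribution depends only on the shape parameter, so it can be evaluated directly. The sketch below assumes the closed form $\Gamma(\alpha + 1/2) / (\sqrt{\pi}\,\Gamma(\alpha + 1))$; the function name is hypothetical, and reading the value 3.60153259 that survives next to the figures as the shape estimate for Andalucia is an assumption.

```python
import math

def gamma_gini(alpha):
    """Gini index of a gamma distribution with shape alpha (the scale cancels out)."""
    log_ratio = math.lgamma(alpha + 0.5) - math.lgamma(alpha + 1.0)
    return math.exp(log_ratio) / math.sqrt(math.pi)

gini_exponential = gamma_gini(1.0)       # exponential case, known to equal 1/2
gini_andalucia = gamma_gini(3.60153259)  # shape value assumed to refer to Andalucia
```

Under that assumption the second value is about 0.287, in line with the Gini index reported for Andalucia in Table 1. The index decreases as the shape parameter grows, so a larger estimated shape corresponds to a more equal distribution.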
Analysing the Gini index and the ratio $D$ jointly shows, on the one hand,
the distance between the income distributions of Andalucia and Cataluna and, on
the other hand, the inequality in each region. The value taken by $D$ (see Table 1)
indicates that the income distributions of these two regions lie at an
intermediate point between total overlap and complete separation.
Concerning the Gini index, we conclude that incomes are more equally
distributed in Cataluna than in Andalucia.
As we pointed out at the beginning of this Section, our purpose is to explain
how the method developed in this preliminary paper works. It would be
interesting to obtain the $D$ ratio for other years in order to make comparisons, and to
consider other characteristics for grouping the population, such as education level,
occupation, etc.
                        IG            D
Andalucia: $f_1(y)$     0.28718086
Cataluna: $f_2(y)$      0.25807088    0.554589619
Mix density: $f(y)$     0.27290117
References
1. J. Callejon. (1995). Un nuevo metodo para generar distribuciones de
probabilidad. Problemas asociados y aplicaciones. Tesis Doctoral.
Universidad de Granada.
2. C. Dagum. (1985). Analyses of income distribution and inequality by
education and sex. Advances in Econometrics, 4, 167-227.
3. C. Dagum. (2001). Desigualdad del redito y bienestar social,
descomposicion, distancia direccional y distancia metrica entre
distribuciones. Estudios de Economia Aplicada, 17, 5-52.
4. J.Y. Duclos, J.M. Esteban and D. Ray. (2004). Polarization: Concepts,
measurement, estimation. Econometrica, 72(6), 1737-1772.
5. J.M. Esteban and D. Ray. (1994). On the measurement of polarization.
Econometrica, 62(4), 819-851.
6. J.M. Esteban, C. Gradin and D. Ray. (1999). Extensions of a Measure of
Polarization OCDE Countries. Luxembourg Income Study Working Paper
218, New York.
7. R.H. Gertel, R.F. Giuliodori and A. Rodriguez. (2004). Cambios en la
diferenciacion de los ingresos de la poblacion del Gran Cordoba entre 1992
y 2000 segun el genero y el nivel de escolaridad. Revista de Economia y
Estadistica, XLII.
8. R. Herrerias, F. Palacios and A. Ramos. (1998). Una metodologia flexible
para la modelizacion de la distribucion de la renta. Decima reunion
ASEPELT-ESPANA, Actas en CD-ROM.
9. R. Herrerias, F. Palacios and J. Callejon. (2001). Las curvas de Lorenz y el
sistema de Pearson. Aplicaciones Estadisticas y Economicas de los
Sistemas de Funciones Generadoras, 135-151. Universidad de Granada.
F.J. CALLEALTA-BARROSO
Departamento de Estadistica, Estructura Economica y O.E.I., University of Alcala
Plaza de la Victoria no. 2, 28802 Alcala de Henares (Madrid), Spain
1. Introduction
Personal income distribution has been the subject of study from very
different perspectives during the last decades. These perspectives have been
characterized by terms such as inequality, poverty, deprivation, mobility or
convergence.
This study focuses on the measurement of differences between personal
income distributions in order to use such a measure as an index of convergence
between them.
Measuring these differences raises an important problem for which there is
no single solution. Several interesting aspects can be observed in the
personal income distribution of a population, which explains the multiplicity of
instruments needed to report on each of them. Thus, from the simplest
descriptive statistics of a distribution to the most sophisticated measures of
inequality and poverty, all of them allow us to compare populations in some of
their specific aspects.
However, although they successfully achieve the informative specialization
for which they were designed, using these measures produces biased results when
our aim is to measure the overall difference resulting from the comparison of the
individuals that constitute the compared populations. Thus, we can compare the
average wealth of two populations from their means, or the internal inequality
within them by comparing their Gini's concentration indices. But, for example,
in the first case, we are disregarding the information about the shapes of such
distributions (it must be remembered that the same mean can be obtained from
distributions with different shapes), while in the second case, we are
disregarding the localizations of such distributions (it must be remembered that
two very different populations can present similar concentration indices, even
when one of them can be much richer).
One attempt to avoid this problem is to combine localization statistics with
inequality indices. For example, we can consider for this purpose the index
$I = \mu \cdot G$, where $\mu$ and $G$ are the mean and the Gini index of the
considered distribution, respectively. This index, $I$, is closely related to Gini's
mean difference between the individuals of a populationᵃ.
Could we, therefore, use Gini's mean difference to measure the difference
between distributions? Unfortunately, this measure only informs about inter-population
inequalityᵇ, and not about proximityᶜ between populations. It must
be noted that Gini's mean difference between identically distributed populations
is not zero but equals twice the product of their common mean and their
common Gini index, as can be deduced from footnote b.
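The identity between Gini's mean difference and twice the product of mean and Gini index can be verified exactly on a small example. This is a minimal sketch with hypothetical function names; it assumes a discrete population with equally likely values and computes the Gini index independently, from the area under the Lorenz curve.

```python
from fractions import Fraction

def mean_difference(values):
    """Gini's mean difference E|X - Y| for X, Y independent and uniform on `values`."""
    n = len(values)
    return Fraction(sum(abs(x - y) for x in values for y in values), n * n)

def gini_from_lorenz(values):
    """Gini index as 1 - 2 * (area under the Lorenz curve), trapezoidal rule."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    area, cum = Fraction(0), Fraction(0)
    for x in xs:
        nxt = cum + Fraction(x, total)   # cumulative income share
        area += (cum + nxt) / (2 * n)    # trapezoid of width 1/n
        cum = nxt
    return 1 - 2 * area

values = [1, 2, 3]
mu = Fraction(sum(values), len(values))
delta = mean_difference(values)
gini = gini_from_lorenz(values)
```

Here `delta` equals `2 * mu * gini`, which is strictly positive although the two compared populations are identical. This is why the mean difference alone cannot serve as a dissimilarity measure.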
In this paper we propose a new dissimilarity measure related to Gini's mean
difference, intuitively interpretable and also clearly informative, which can be
used to measure the resulting overall difference between two compared random
variable distributions.
ᵃ Let $\Delta = E[|X - Y|]$ be Gini's mean difference between two random variables $X$ and $Y$. Then, for
$X$ and $Y$ identically distributed, the following equality holds: $I = \mu \cdot G = \Delta/2$.
ᵇ For any two random variables $X$ and $Y$, $\Delta$ is related to Gini's inter-population inequality index,
$G_{xy}$, and their localizations, $\mu_x$ and $\mu_y$, as follows:
ᶜ We use the term proximity as a generic reference to either dissimilarity or similarity
measures, following the terminology used in Cuadras (1996). In order to compare pairs of random
variables, $(X, Y)$, this study will concentrate specifically on dissimilarity measures defined as real
functions, $d$, which increase with the difference and comply with the following properties:
a) $X = Y \Rightarrow d(X, Y) = 0$
b) $d(X, Y) = d(Y, X)$
These measures are discussed in more detail in Everitt (1993).
A New Measure of Dissimilarity Between Distributions 127
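$$d^{+}_{xy} = E[(Y - X) \cdot I(Y - X)] = \int_0^{\infty} dF_Y(y) \int_0^{y} (y - x)\, dF_X(x) \qquad (1)$$

where $I(y - x) = 1$ if $y > x$, and $0$ otherwise.

$$\Delta = E[|Y - X|] = d^{+}_{xy} + d^{-}_{xy} = d^{+}_{xy} + d^{+}_{yx} \qquad (6)$$

• Relationship to the Difference of Means:

$$0 \le d^{-}_{yx} = d^{+}_{xy} \le \Delta \qquad (11)$$

$$d_1 = \max\{d^{+}_{xy}, d^{-}_{xy}\} = \frac{d^{+}_{xy} + d^{-}_{xy}}{2} + \frac{|d^{+}_{xy} - d^{-}_{xy}|}{2} = \frac{\Delta}{2} + \frac{|\mu_y - \mu_x|}{2} \qquad (12)$$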
or, alternatively:
dl=max\dxy,dxy) = 2 +- - 2+ i (13)
c
When the Gini index is calculated for a population formed by joining two others, the part
of inequality due to the relationship between the two joined populations, after eliminating the part of
inequality presented internally by each population separately, is:
Mx+Mr
Distinctive of Px Sub-populations
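$$f_C(t) = \frac{\min\{f_X(t), f_Y(t)\}}{1 - p} = \begin{cases} \dfrac{f_X(t)}{1 - p}, & f_X(t) \le f_Y(t) \\[2ex] \dfrac{f_Y(t)}{1 - p}, & f_Y(t) \le f_X(t) \end{cases} \qquad (18)$$

$$1 - p = \int_{\{f_X(t) \le f_Y(t)\}} f_X(t)\,dt + \int_{\{f_Y(t) < f_X(t)\}} f_Y(t)\,dt \qquad (19)$$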
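$$f_{Y^*}(y) = \frac{[f_Y(y) - f_X(y)] \cdot I\{f_Y(y) > f_X(y)\}}{p} = \begin{cases} \dfrac{f_Y(y) - f_X(y)}{p}, & f_Y(y) > f_X(y) \\[2ex] 0, & f_Y(y) \le f_X(y) \end{cases} \qquad (21)$$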
where $p$ now represents the proportion of population $P_Y$ that is "not
comparable" to any sub-population of $P_X$.
With these definitions, the original distributions can be expressed as
mixtures of the variables defined above, as follows:
ᶠ The indicator function for a proposition $A$ takes the value $I\{A\} = 1$ if $A$ is true and $I\{A\} = 0$ if $A$ is false.

$$f_X(x) = (1 - p) \cdot f_C(x) + p \cdot f_{X^*}(x) \qquad (22)$$

$$f_Y(y) = (1 - p) \cdot f_C(y) + p \cdot f_{Y^*}(y) \qquad (23)$$
where the variable $C$ describes the sub-populations selected as
"comparable" in both populations $P_X$ and $P_Y$, with a proportion of $1 - p$, while the
variables $X^*$ and $Y^*$ describe the specific "distinctive" or "non-comparable"
sub-populations, of proportion $p$, coming respectively from the compared
populations $P_X$ and $P_Y$.
Some properties of these distributions are the following:
a) "Distinctive sub-populations" represent a proportion $p$ of the populations
from which they come, and:

$$p = \frac{1}{2} \int \left|f_X(t) - f_Y(t)\right| dt \qquad (24)$$
b) The means of these auxiliary distributions (C, X*, Y*) decompose the
means of the original distributions, informing of contributions to the
latter of each "comparable" and "distinctive" sub-population, according
to their weights in the corresponding mixtures, as follows:
ᵍ Note that we introduce the weight factor because our objective is, firstly, to make the measure as
intuitive as possible (it leads to the direct evaluation of differences related to non-shaded areas in
Figure 1). Secondly, we want to introduce into the expression the effect of the relative sizes of the
"distinctive" sub-populations (the proportions of the populations that the "distinctive sub-populations"
represent).
$$d(X, Y) = p^2 E[|Y^* - X^*|] = \iint |y - x| \, \{f_X(x) - f_Y(x)\} \, I\{f_X(x) > f_Y(x)\} \, \{f_Y(y) - f_X(y)\} \, I\{f_Y(y) > f_X(y)\} \, dx\,dy \qquad (27)$$
2.3.1. Properties of the proposed measure of dissimilarity
• $d(X, Y) = 0 \Leftrightarrow X = Y$ (a.e.)
• The measure $d(X, Y)$ increases with the difference between $X$ and $Y$;
i.e., it increases not only with the proportion of $X$ and $Y$
represented by their "distinctive" sub-populations, but also with the
separation between them.
• The measure is symmetrical: $d(X, Y) = p^2 E[|X^* - Y^*|] = d(Y, X)$
• $0 \le d(X, Y) \le \Delta$
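The proposed measure can be approximated for two densities tabulated on a common, equally spaced grid. The following is a minimal sketch, not the SAS/IML code used in the study: it assumes the distinctive parts are the positive parts of the density differences, uses a plain double sum, and every name and the two uniform example densities are hypothetical.

```python
def decompose(fx, fy, h):
    """Split two gridded densities into the distinctive proportion and parts."""
    common = [min(u, v) for u, v in zip(fx, fy)]
    p = 1.0 - sum(common) * h                            # proportion of distinctive sub-populations
    x_star = [max(u - v, 0.0) for u, v in zip(fx, fy)]   # equals p * f_X*(x)
    y_star = [max(v - u, 0.0) for u, v in zip(fx, fy)]   # equals p * f_Y*(y)
    return p, x_star, y_star

def dissimilarity(grid, fx, fy, h):
    """Approximate d(X, Y) = p**2 * E|Y* - X*| by a double sum over the grid."""
    _, x_star, y_star = decompose(fx, fy, h)
    total = 0.0
    for x, wx in zip(grid, x_star):
        if wx > 0.0:
            total += wx * sum(wy * abs(y - x) for y, wy in zip(grid, y_star))
    return total * h * h

# Hypothetical example: X uniform on [0, 1] and Y uniform on [2, 3].
n = 300
h = 3.0 / n
grid = [(i + 0.5) * h for i in range(n)]
fx = [1.0 if y < 1.0 else 0.0 for y in grid]
fy = [1.0 if y > 2.0 else 0.0 for y in grid]
p, _, _ = decompose(fx, fy, h)
d = dissimilarity(grid, fx, fy, h)
```

With disjoint supports the whole of each population is distinctive, so `p` is 1 and `d` coincides with Gini's mean difference between the two populations (here 2), the upper bound in the last property. Comparing a density with itself gives exactly zero.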
ʰ There are counter-examples in the matrix of dissimilarities calculated in the application developed
in a later section. One of them is, for example, that occurring between the countries GER, BEL
and FRA in 1993, for which d(GER,FRA)=223 while d(GER,BEL)=59 and d(BEL,FRA)=142.
ⁱ In the conventional OECD equivalence scale the first adult counts as 1 unit, each additional adult as 0.7 and
each child under the age of 16 years as 0.5 units.
Equivalent Personal Income (net)" constructed this way, has been weighted by a
variable "weight" constructed as the product of the household cross-sectional
weight (variable HG004) and its size (variable HD001).
Table 1. Number of available cases for the variable "Comparable Equivalent Personal Income
(net)", by countries and waves
Table 2. Sums of household weights from available cases for the variable "Comparable
Equivalent Personal Income (net)", by countries and waves
Table 3. Weighted means for the variable "Comparable Equivalent Personal Income (net)",
by countries and waves (previous year incomes in purchasing parity units)
ʲ SAS/STAT® and SAS/GRAPH® are registered products of SAS Institute Inc., Cary, NC, USA.
income distributions in all countries and different waves of the panelᵏ. For their
analysis, charts for the density functions calculated in this way were obtained
using SAS/GRAPH procedure GPLOT.
From these charts we can extract some remarkably different behaviours:
Firstly, we see how Luxembourg has a distribution of "Comparable
Equivalent Personal Incomes (net)" clearly displaced to the right of those for the
rest of the countries, standing out for its higher personal incomes.
Towards the middle of these charts we can see two other groups of
countries behaving differently. The Nordic countries (Finland, Sweden and
Denmark), together with the Netherlands, present more leptokurtic distributions,
higher in their central sections (although Denmark and the Netherlands present
medium degrees of kurtosis). In contrast, the rest of the Central European countries
present a wider diversity in their central sections of income.
Lastly, on the left-hand side of these charts, we find those countries
conventionally considered poorer (Italy, Greece, Spain, Portugal and Ireland).
However, if we observe the dynamics of these distributions over time, we
can see that although these patterns are preserved, most distributions in the EU-15
countries tend, in general, to approach one another, leaning towards a common
average behaviour in the centre of the chart, with the clear exception of
Luxembourg and the particularities presented by each country in each
period of time.
Additionally, if we observe in these charts the evolution of the distributions
for each country through the 8 waves, we can see their systematic movement to
the right (a tendency to a higher level of income) with noticeable decreases in
modal probability densities (a tendency to a wider diversity of incomes and
possibly to a higher inequality) including, in some cases, the presence of central
flatness in their density functions and even a pair of relative modes.
Below we attempt to study these first impressions in depth, and for this
purpose we analyse the information obtained from the proposed measure of
dissimilarity, calculated between each pair of these distributions.
ᵏ Density functions estimated by stochastic kernels produce small deviations in the estimates of
population means. Assuming that the ECHP sample sizes have been calculated to obtain
parametric estimates rather than for any other reason, we have slightly corrected
the corresponding density function in each case, regrouping the upper 1% of probability from the
right tail in a single interval. Thus, we have conveniently determined its range and class mark so
that the mean of the corrected density function faithfully reproduces the corresponding mean
estimated by the ECHP.
3.3. Dissimilarities
From the estimates of the 120 density functions obtained in the way
described in the previous section (in fact there are 113, since 7 of them are not
available, for Austria, Luxembourg, Finland and Sweden, for some of the years),
which represent the behaviour of "Comparable Equivalent Personal
Incomes (net)" in the 15 countries studied through the 8 years of the panel, we
have evaluated the proposed measure of dissimilarity for each pair
compared. Consequently, we have constructed the matrix that contains the
full set of dissimilarity coefficients calculated between every pair of density
functions, each one corresponding to a "country-year", using the programme
SAS/IMLˡ.
To sum up, differences between distributions of "Comparable Equivalent
Personal Incomes (net)" within the 15 countries for the initial and final years of
the period studied are presented in Tables 4 and 5.
Obviously, we cannot calculate the corresponding dissimilarity measures
between countries for which data were not available, as is clearly shown in the
table of dissimilarities for the initial year (Table 4). This is the case of Austria,
Luxembourg, Finland and Sweden in 1993, Finland and Sweden in 1994 and
Sweden in 1995, as mentioned earlier.
Table 4. Dissimilarities between countries for the year 1993
1993    GER    DK    NL   BEL   LUX   FRA    UK   IRL   ITA   GRE   SPA   POR   AUS   FIN   SWE
GER       0   219   195    59     -   223   147  1482   953  2508  1655  2885     -     -     -
DK      219     0   215   202     -   449   484  1332  1064  2447  1712  2964     -     -     -
NL      195   215     0   244     -   117   109   829   526  1717   911  2133     -     -     -
BEL      59   202   244     0     -   142   217  1341  1024  2694  1722  3006     -     -     -
LUX       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -
FRA     223   449   117   142     -     0    82   947   533  1842  1072  2192     -     -     -
UK      147   484   109   217     -    82     0   732   301  1460   730  1746     -     -     -
IRL    1482  1332   829  1341     -   947   732     0   121   208    43   387     -     -     -
ITA     953  1064   526  1024     -   533   301   121     0   436   108   551     -     -     -
GRE    2508  2447  1717  2694     -  1842  1460   208   436     0   144    48     -     -     -
SPA    1655  1712   911  1722     -  1072   730    43   108   144     0   221     -     -     -
POR    2885  2964  2133  3006     -  2192  1746   387   551    48   221     0     -     -     -
AUS       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -
FIN       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -
SWE       -     -     -     -     -     -     -     -     -     -     -     -     -     -     -
2000    GER    DK    NL   BEL   LUX   FRA    UK   IRL   ITA   GRE   SPA   POR   AUS   FIN   SWE
GER       0    93   243    45  2665   158   153   765  1411  3002  1583  3533    90   688   592
DK       93     0   363   263  2910   290   374   918  1432  2961  1821  3652   139   863   780
NL      243   363     0    96  4016    35   111   212   439  1484   674  1903   130   165   110
BEL      45   263    96     0  3388   103   112   657  1078  2553  1453  3062    86   573   488
LUX    2665  2910  4016  3388     0  4024  3039  6004  7029  9403  7382  9843  3222  6042  5407
FRA     158   290    35   103  4024     0   114   250   514  1587   755  1993    93   240   203
UK      153   374   111   112  3039   114     0   424   884  2140  1159  2612   204   641   424
IRL     765   918   212   657  6004   250   424     0    61   605   197  1050   578   117   141
ITA    1411  1432   439  1078  7029   514   884    61     0   315    77   676   925   171   245
GRE    3002  2961  1484  2553  9403  1587  2140   605   315     0   171    97  2306   881  1016
SPA    1583  1821   674  1453  7382   755  1159   197    77   171     0   417  1264   441   535
POR    3533  3652  1903  3062  9843  1993  2612  1050   676    97   417     0  2750  1463  1639
AUS      90   139   130    86  3222    93   204   578   925  2306  1264  2750     0   505   433
FIN     688   863   165   573  6042   240   641   117   171   881   441  1463   505     0    25
SWE     592   780   110   488  5407   203   424   141   245  1016   535  1639   433    25     0
2000    LUX    DK   GER   AUS   BEL    UK   FRA    NL   SWE   FIN   IRL   ITA   SPA   GRE   POR
LUX       0  2910  2665  3222  3388  3039  4024  4016  5407  6042  6004  7029  7382  9403  9843
DK     2910     0    93   139   263   374   290   363   780   863   918  1432  1821  2961  3652
GER    2665    93     0    90    45   153   158   243   592   688   765  1411  1583  3002  3533
AUS    3222   139    90     0    86   204    93   130   433   505   578   925  1264  2306  2750
BEL    3388   263    45    86     0   112   103    96   488   573   657  1078  1453  2553  3062
UK     3039   374   153   204   112     0   114   111   424   641   424   884  1159  2140  2612
FRA    4024   290   158    93   103   114     0    35   203   240   250   514   755  1587  1993
NL     4016   363   243   130    96   111    35     0   110   165   212   439   674  1484  1903
SWE    5407   780   592   433   488   424   203   110     0    25   141   245   535  1016  1639
FIN    6042   863   688   505   573   641   240   165    25     0   117   171   441   881  1463
IRL    6004   918   765   578   657   424   250   212   141   117     0    61   197   605  1050
ITA    7029  1432  1411   925  1078   884   514   439   245   171    61     0    77   315   676
SPA    7382  1821  1583  1264  1453  1159   755   674   535   441   197    77     0   171   417
GRE    9403  2961  3002  2306  2553  2140  1587  1484  1016   881   605   315   171     0    97
POR    9843  3652  3533  2750  3062  2612  1993  1903  1639  1463  1050   676   417    97     0
Consequently, a value of 1 for this index would show that the distributions
compared remain with the same degree of proximity, values greater than 1
would show separation or divergence between the distributions of the countries
compared, and values smaller than 1 would show proximity or convergence.
For the cases in which we did not have a dissimilarity measure (in the years
1993, 1994 and 1995) we employed, for the same countries compared, those
obtained the following year in which data were available.
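The interpretation of the index can be illustrated with a few pairs of countries. The definition of the index does not survive in the text above, so the sketch below assumes it is the ratio of the final (2000) to the initial (1993) dissimilarity; the function name is hypothetical and the input values are read from Tables 4 and 5.

```python
def convergence_index(d_initial, d_final):
    """Assumed index: final dissimilarity divided by initial dissimilarity.
    Values below 1 indicate convergence, values above 1 divergence."""
    return d_final / d_initial

# (1993 value, 2000 value) for three pairs, as read from Tables 4 and 5.
pairs = {
    ("GER", "UK"): (147, 153),
    ("DK", "UK"): (484, 374),
    ("NL", "FRA"): (117, 35),
}
index = {pair: round(convergence_index(*values), 2) for pair, values in pairs.items()}
```

Under this assumption Germany and the United Kingdom stay at roughly the same distance (about 1.04), Denmark and the United Kingdom come closer (about 0.77) and the Netherlands and France converge strongly (about 0.30), in line with the legible entries of Table 7. Because the tabulated dissimilarities are rounded, ratios computed this way may differ from the published index in the last digit.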
The results obtained are shown in Table 7. From it, we can infer that
there are groups of countries whose income distributions have come closer together
during the period 1993-2000. However, there are other countries that present
greater differences between them at the end of this period. Consequently,
looking at Tables 6 and 7, we can highlight:
a) The country with the highest mean of "Comparable Equivalent
Personal Income (net)", Luxembourg, presents a final distribution of
incomes clearly distanced from those of the other EU-15 countries.
b) The four countries that follow Luxembourg, according to their mean
income (Germany, Denmark, Belgium and the United Kingdom), form
a group in which, generally, there is a final greater proximity between
income distributions; although with some internal polarizations. Thus,
Denmark with Germany and Belgium with the United Kingdom, have
respectively reduced their differences to approximately half those
presented initially. However, Germany and the United Kingdom
practically retain their differences, while Belgium and Denmark have
distanced themselves to some extent.
c) Austria has also distanced itself somewhat from the previous countries,
with the exception of Denmark, while the latter, has in turn distanced
itself from the other two northern countries, Sweden and Finland.
d) Income distributions in these two countries, Sweden and Finland,
together with France, Netherlands, Ireland and Italy, become much
closer to each other.
e) Out of these countries, Ireland is closer to those with mean incomes
higher than its own, with the only exception of Luxembourg which
distances itself more rapidly.
f) Spain and Greece (although more so in the case of Spain, which
therefore distances itself to a certain extent from Greece) also approach
this last group of 6 countries, with the exception of Ireland, which
seems to distance itself more rapidly.
g) Income distributions in both Spain and Greece distance themselves
from that in Portugal, which seems further from its initial position with
respect to the richest countries, while timidly approaching the group of 6
countries mentioned in item d, with the exceptions of Italy and the
already mentioned Ireland.
Income Country LUX GER DK BEL UK AUS FRA NL SWE FIN IRL ITA SPA GRE POR
23.101 LUX 1,03 1,19 1,56 1,09 1,43 1,48 1,18 1,14 1,30 1,18 1,37 1,07 1,21 1,17
15.166 GER 1,03 l l S l 0 ' 7 7 " '•° 4 '- 1 1 <Wl '-•' '••*" !•-- .0,52 1.48 0,9ft 1.2(1 1.22
14.982 DK_ 1,19*0,41 l.W 0,77 .0,67 0JSS I.fi9 1.10 1.46 0,69 \M I.Oh 1.21 1.2*
14.833 BEL 1,56 f 0,77 I.'O 0 ^ 1 2.X* 0 , » 0,39 LOS (),92 0,40 1.115 0.84 0,95 1.02
14.676 UK_ 1,09 1.04 0,77 0,51 1.40 1 IS 1.02 0,99 I.in 0,S8 2.')4 1.^9 1,4" 1,50
14.359 AUS 1,43 '.13 0,67 2.1s' 1.4ii 1.03 0,54 l,S2 0,98 0,55 1.01 0,66 I,»n 0,96
0 7 3 l ,K 0,3
13.549
13.287
FRA
NL_
'^Ifcilil ' - '•"' ° °''J5 °'67-<)'26 °' % °'TO 0M "•"
1,18 1,24 L69jO,.V> 1.02 0,54 0,30 0,81 0,74 0,26. 0,83 0.74 0,86 0,89
12.041 SWE
1,14 1,47 1,19 l.os o,W 1.S2 0,95 0,81 0,99 0,2* 0,67 0,55 0,85 0.93
11.799 FIN
1,30 1,22 1,46 0,92 !.'<() 0,98 0,67 0,*4 0,99 0,2$ 0,48, 0,51 0,84 0.99
11.616 IRL
1,IS^MJMM0A9' 0.SS 0,55 0,26 0J2fi 0,26 0^9 0,50" 4,(,4 2.92 2.-1
10.605 ITA
1,37 1,48'" 1,35 1.D5 2.94 l.til' 0,96 0,83 0,67 0,4* 0,50 0,71 0,72 1.21
10.409 SPA
1,07 0,96 1,0610,84 1.59 0,66 0,70 0.74 0,55 0,51 4.M 0,71 LIS l.sx
8.743 GRE
1,21 1,20 1,21 0,95 1.4" LOO o]»6 0.86 0.85 0,84 2.92 0,72 LIS 2,0?
8.619 POR
1,17 1,22 1,23 1.112 I.Si) (),% (1,91 0,89 0,93 0,99 :.-| 1.23 1 XS 2,0'
Source: Author's own, from ECHP data
( F f+i) j>/+r
P
2 I 2J
1
dissimilarity coefficients" between p populations for the different t periods of
time, which have to be interpreted in comparative terms.
ᵐ Actually, the number of non-trivial dissimilarity coefficients in the matrix resulting from the
comparison of the $p \cdot t$ distributions of $p$ countries through $t$ periods, excluding the zero coefficients
derived from comparing a country in a period to itself, is:

$$\binom{p \cdot t}{2} = \frac{p \cdot t\,(p \cdot t - 1)}{2}$$

ⁿ Torgerson (1958) proposed the fundamentals of multidimensional scaling. For an introduction to
these methods see Kruskal and Wish (1978).
SAS/STAT procedure MDSᵒ has been used in order to fit the appropriate
ALSCAL model. The model has been fitted trying a variety of
monotone transformations (identity, affine, linear, power and monotone
step), as well as several dimensions for the factor space of
representation (between 1 and 6). The goodness-of-fit criterion used is
Kruskal's Stress-1 measureᵖ, whose formulation is as follows:
Ife-TH))2 (29)
S,=
and, according to it, we finally find that in all the spaces considered, the best
approximation was always given by the power transformation model (or,
equivalently, the linear logarithmic transformation), as follows:

$$\delta_{ij} = T(d_{ij}) = s \cdot (d_{ij})^{r}$$

or equivalently,

$$\log(\delta_{ij}) = \log(s) + r \cdot \log(d_{ij}) \qquad (30)$$

For every space of different dimensions considered, this model has provided the
values of the goodness-of-fit criterion reflected in Figure 2, which have also
been represented in an elbow chart.
According to these results and following the parsimony principle, two
dimensions should be enough to represent quite well the diversity reflected in
the calculated dissimilarities; or three if we want the adjustment to be qualified
as "excellent", according to Kruskal's scale. Increasing the dimension of
representation space to more than three does not seem to improve substantially
the goodness of fit for the model, although it improves it to some extent.
Thus, the model has been solved in three dimensions for the potential
transformation model (or equivalently, linear logarithmic transformation),
obtaining the following optimal solution, whose associated Shepard's Diagram
is presented in Figure 3:
\[ \delta_{ij} \cong 234.9\,(d_{ij})^{1.963} \]
or equivalently,
\[ \log(\delta_{ij}) \cong \log(234.9) + 1.963\,\log(d_{ij}) \tag{31} \]
"SAS/STAT® and SAS/GRAPH® are registered products of SAS Institute Inc., Cary, NC, USA.
P
SAS/STAT MDS Procedure calculates Kruskal's Stress-1 when options Fit=l, Formula=l and
Coef=Identity are selected. According to Kruskal's criterion, Kruskal's Stress-1 characterizes the
goodness of fit of the model as follows: 0=perfect, 0.025=excellent, 0.05=good, 0.1=fair,
0.2=poor. Actually, this is the reason why, in terminology of MDS procedure, it is qualified as a
"Badness of Fit Criterion"
Dimensions Stress-1
1 0.058921
2 0.028915
3 0.022431
4 0.020017
5 0.018404
6 0.017446
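The reading of this table against Kruskal's scale can be sketched as follows. The benchmark values (0, 0.025, 0.05, 0.1, 0.2) come from the footnote on the MDS procedure; treating each benchmark as an upper bound for its label is an assumption made here for illustration, as are the names `STRESS`, `SCALE` and `kruskal_label`.

```python
# Stress-1 values by number of dimensions, copied from the table above.
STRESS = {1: 0.058921, 2: 0.028915, 3: 0.022431,
          4: 0.020017, 5: 0.018404, 6: 0.017446}

# Kruskal's benchmarks, read here as upper bounds (an assumption).
SCALE = [(0.0, "perfect"), (0.025, "excellent"), (0.05, "good"),
         (0.1, "fair"), (0.2, "poor")]


def kruskal_label(stress):
    """Return the first label whose benchmark is not exceeded."""
    for bound, label in SCALE:
        if stress <= bound:
            return label
    return "beyond poor"


labels = {k: kruskal_label(v) for k, v in STRESS.items()}
# Reduction in stress obtained by adding one more dimension (elbow check).
gains = {k: STRESS[k - 1] - STRESS[k] for k in range(2, 7)}
print(labels)
print(gains)
```

Under this reading, two dimensions are labelled "good" and three "excellent", and the gain from each further dimension shrinks, which matches the elbow argument in the text.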
''Graphic representation of the pairs $(T(d_{ij}), \delta_{ij})$, joined in order from lowest to highest $\delta_{ij}$.
reproduced by the coordinates obtained from the model; indeed, the linear
correlation coefficient, once calculated, is approximately 1.00 to two
decimal places.
As a consequence, we obtained the coordinates of each country's yearly
income distribution in the optimal factor space, which we analyse below.
[Figure: trajectories of each country's yearly income distribution in the factor space; horizontal axis: Dimension 1. Legend: one line style per country (AUS, DK, FIN, GER, GRE, IRL, POR, SPA, UK and others).]
There are only two exceptions to this rule: the United Kingdom, whose
coordinates seem to rise slightly in the second dimension, although they remain
at relatively low levels (+0.29); and Luxembourg, whose coordinates in the
second dimension increase substantially, placing it far away from the area where
the trajectories of the rest of the countries are situated.
Although the movement of concentration is generalized with the exceptions
mentioned above, we can distinguish five groups of countries with quite
different value levels at the end of the period studied: Luxembourg (+2.28),
Portugal (+0.68), the United Kingdom (+0.29), Sweden and Finland (-0.59 and
-0.65 respectively) and the rest of the countries (between -0.20 and +0.10).
concentration index. Therefore, a country will be located the more to the left of
the chart, the more its income distribution moves to the right over the income
size axis, providing higher average incomes and distributing higher wealth
(which usually happens with an increase of dispersion in the distribution), and
the lower inequality it presents (the more evenly the incomes are distributed).
Therefore, the first dimension can be interpreted as an index of welfare or
an index of "standards of living-income"r which takes into account jointly the
general level of wealth in the population and the degree of equality in the way it
is distributed.
Looking now at the second dimension, we observe that its correlations with
the descriptive measures are not too high, and therefore its interpretation could
be risky. In any case, the descriptive statistic more closely correlated with it is
Gini's concentration index (positively correlated), although ratios of low
percentiles to the mean are also positively correlated, and ratios of high
percentiles to the mean are negatively correlated as well.
r
The group of "standards of living-income" indices introduced by Pena et al. (1996) is defined as the
product of the income distribution mean and the complement to 1 of a normalized inequality
index. It belongs to a wider class of welfare indices introduced by Blackorby and Donaldson
(1978).
Figure 7. Dynamic of countries (without Luxembourg) using the proposed dissimilarity measure:
Dimensions 1 and 2
Source: Author's own, from ECHP 1994-2001
Figure 8. Dynamic of countries (without Luxembourg) using the proposed dissimilarity measure:
Dimensions 1 and 3
Source: Author's own, from ECHP 1994-2001
Figure 9. Dynamic of countries (without Luxembourg) using the proposed dissimilarity measure:
Dimensions 2 and 3
Source: Author's own, from ECHP 1994-2001
4. Conclusions
Taking as a starting point the problematic proposal made by Dagum (1980)
to measure the distance between income distributions, we have introduced in
this study a new measure of dissimilarity, based on Gini's mean difference.
To test its validity we have calculated the corresponding measures of
dissimilarity between all the yearly distributions of "Comparable Equivalent
Personal Incomes (net)" in the EU-15. Distributions and dissimilarities have
been constructed on the basis of the data from the EU-15 Household Panels
between 1994 and 2001.
Acknowledgments
This study has been partially supported by Project I+D+I ref.: SEC2002-00999,
from the Spanish Ministerio de Ciencia y Tecnologia. Data from the European
Community Household Panel have been used here by permission given in the
agreement ECHP/15/00, between EUROSTAT and the University of Alcala
(Spain).
References
1. C. Blackorby and D. Donaldson. (1978). Measures of relative equality and
their meaning in terms of social welfare. Journal of Economic Theory, 18,
59-80.
2. C.M. Cuadras. (1996). Metodos de Analisis Multivariante. EUB.
3. C. Dagum. (1980). Inequality measures between income distributions with
applications. Econometrica, 48(7), 1791-1803.
4. Eurostat. (2004). ECHP UDB Manual: European Community Household
Panel Longitudinal Users' Database. Eurostat.
5. B.S. Everitt. (1993). Cluster Analysis. New York: John Wiley and Sons.
6. C. Garcia, F.J. Callealta and J.J. Nunez. (2005). La Interpretacion
Economica de los Parametros de los Modelos Probabilisticos para la
Distribucion Personal de la Renta. Una Propuesta de Caracterizacion y su
Aplicacion a los Modelos de Dagum en el Caso Espanol. Estadistica
Espanola, I.N.E.
7. Hey and Lambert. (1980). Relative deprivation and the Gini coefficient:
comment. Quarterly Journal of Economics, 95, 567-573.
8. J.B. Kruskal and M. Wish. (1978). Multidimensional Scaling. Sage
University Paper series on Qualitative Applications in the Social Sciences,
7-11. Sage Publications.
9. B. Pena, F. J. Callealta, J. M. Casas, A. Merediz and J. J. Nunez. (1996).
Distribucion Personal de la Renta en Espana. Piramide.
10. B.W. Silverman. (1986). Density Estimation for Statistics and Data
Analysis. London: Chapman and Hall.
11. A.F. Shorrocks. (1982). On the distance between income distributions.
Econometrica, 50(5), 1337-1339.
12. W.S. Torgerson. (1958). Theory and Methods of Scaling. John Wiley and
Sons, Inc.
13. J. Villaverde Castro and A. Maza Fernandez. (2003). Desigualdades
Regionales y Dependencia Espacial en la Union Europea. CLM Economia,
2, 109-128.
14. F.W. Young, R. Lewyckyj and Y. Takane. (1986). The ALSCAL
Procedure. SUGI Supplemental Library User's Guide. Version 5 Edition.
SAS Institute Inc.
Chapter 9
USING THE GAMMA DISTRIBUTION TO FIT FECUNDITY
CURVES FOR APPLICATION IN ANDALUSIA (SPAIN)
F. ABAD-MONTES
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
M.D. HUETE-MORALES
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
M. VARGAS-JIMENEZ
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
Analysis of the evolution of specific fecundity rates, by the age of the mother, i.e.
fecundity curves, and their modelling, is of vital importance when we seek to obtain
projections or forecasts of the behaviour of this demographic phenomenon. Indeed, on
some occasions these estimates do not need to be reasonable from the populational
standpoint, but may have the goal of establishing hypothetical scenarios. The present
study includes an analysis of the observed data for total births (without taking into
account the order of birth) by age and by female population. These data, for the period
1975-2001, were provided by the Statistical Institute of Andalusia (IEA) and were used
to construct synthetic fecundity indicators, which are the most basic and the most
effective means of accounting for the global behaviour pattern of the phenomenon within
a given period. Subsequently, the observed fecundity curves were fitted using a Gamma-
type distribution. This distribution is one of the most commonly used, for two main
reasons: it provides very good quality fits, and the parameters of the distribution are
identified perfectly with the indicators of fecundity. Finally, various behaviour
hypotheses are proposed, on the basis of the information obtained during the period of
analysis.
\[ f_x^t = \frac{N_x^t}{\dfrac{P_x^{1/1/t} + P_x^{1/1/t+1}}{2}} \tag{1} \]
where $N_x^t$ is the number of births to mothers who have passed their 'x'
birthday during year 't', and $P_x^{1/1/t}$ is the female population having passed their
'x' birthday by 1 January in year 't'. These rates, for some of the years in
question, are represented as follows:
It is very apparent that in little more than a quarter of a century the pattern
of fecundity in Andalusia has varied spectacularly. In 1975, fecundity rates were
very high for almost all the ages, which suggests that the number of births was
also high. These high rates were mainly due to the fact that families began to
have children at a fairly young age and went on to have a lot of them; this
explains why fecundity rates were so high at the end of the fertile period. This
situation did not last, however, and the above figure shows that by 1985 the
fecundity rates had fallen significantly. Subsequently, they continued to fall,
though less dramatically. Nevertheless, it can be seen that the bell shape of the
fecundity curve was distorted, with the mode of the distribution shifting to the
right (as a result of the age of first pregnancy being delayed) and the appearance
of a "second mode", which reflects births to very young mothers, who are
normally unmarried and whose children were often unplanned.
Let us now define and construct the most commonly used indicators of
fecundity. First, we obtain the Synthetic Fecundity Index (SFI) which describes
the mean number of children per woman of fertile age:
\[ SFI^t = \sum_{x=15}^{49} f_x^t \tag{2} \]
Other relevant indicators include the Mean Age at Maternity (MAM), which
describes whether the age of maternity is rising or falling, and the Variance in
the Age at Maternity (VAM), which provides a measure of the variability of the
occurrence of births, i.e. whether these occur at widely-spaced ages or are
closely grouped around the mean age:
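\[ MAM^t = \frac{\sum_{x=15}^{49} (x + 0.5)\, f_x^t}{\sum_{x=15}^{49} f_x^t} \tag{3} \]
\[ \sigma^{2\,t} = VAM^t = \frac{\sum_{x=15}^{49} \left[(x + 0.5) - MAM^t\right]^2 f_x^t}{\sum_{x=15}^{49} f_x^t} \tag{4} \]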
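The three indicators defined in (2), (3) and (4) can be computed directly from a vector of age-specific rates. The sketch below is a minimal illustration; the function name and the flat test rates are made up for the example and are not data from the study.

```python
def fecundity_indicators(rates, min_age=15):
    """rates[k] is the rate f_x for age x = min_age + k (ages 15..49).

    Returns (SFI, MAM, VAM) as defined in expressions (2), (3) and (4).
    """
    sfi = sum(rates)
    mam = sum((min_age + k + 0.5) * f for k, f in enumerate(rates)) / sfi
    vam = sum(((min_age + k + 0.5) - mam) ** 2 * f
              for k, f in enumerate(rates)) / sfi
    return sfi, mam, vam


# Made-up rates for illustration only: 0.1 at ages 25..34, zero elsewhere.
rates = [0.0] * 10 + [0.1] * 10 + [0.0] * 15
sfi, mam, vam = fecundity_indicators(rates)
print(sfi, mam, vam)
```

With these artificial rates the class marks run from 25.5 to 34.5, so the mean age is 30 and the variance is that of ten equally weighted, equally spaced points.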
Table 1 shows the application of the above expressions to the available
information. The pattern of this series of indices might be more apparent in
graphical form:
The Synthetic Fecundity Index and that of the mean age of maternity reveal
a very different behaviour pattern; the former has fallen gradually over the
years, from 3.2 children per woman in 1975 to 1.3 in the year 2001. With
respect to the mean age of maternity, the graph might be considered to present a
distorted view of reality, since although the mean age seems to fall in the initial
years, then stabilise and then rise from the late 1980s onwards, we must take
into account the very high values recorded at the beginning of this period. This
latter fact was due to the very long period of fecundity commonly presented
then, with mothers having a large number of children; thus, the mean age of
maternity was higher than that of mothers today.
This situation is reflected in the Index of the Variance; in the initial
years of the study, the variance was very high, and so births were not
concentrated around the mean, but widely distributed throughout the fertile life
of the mothers:
[Figure: variance in the age at maternity (VAM) by year]
It should be noted that in very recent years there has been a moderate rise in
the SFI (which shows that women in Andalusia are starting to have more
children), a levelling off in the rise in the mean age of maternity and a rise in the
variance (partly due to the "second mode", referred to above, in the fecundity
curves).
where y is the class mark of the interval considered less the minimum fertile
age, i.e. y = (x + 0.5) - 15, and $\Gamma(c)$ is the gamma function. The parameters a, b
and c of F(y) are related to the fertility indicators as follows:
Year a b c
1975 3.212 0.39402 5.57054
1976 3.238 0.39384 5.46361
1977 3.132 0.38584 5.31246
1978 3.041 0.38263 5.23262
1979 2.861 0.37512 5.05241
1980 2.739 0.36942 4.94530
1981 2.535 0.37411 5.00861
1982 2.444 0.38081 5.12291
1983 2.275 0.38615 5.20674
1984 2.140 0.38687 5.21184
1985 1.990 0.39243 5.28599
1986 1.891 0.39774 5.38110
1987 1.819 0.40812 5.51987
1988 1.760 0.41678 5.61443
1989 1.689 0.43020 5.84028
1990 1.656 0.45324 6.18063
1991 1.612 0.46157 6.35014
1992 1.581 0.47910 6.67658
1993 1.527 0.49871 7.02920
1994 1.426 0.50733 7.25726
1995 1.375 0.51888 7.52001
1996 1.329 0.53433 7.85663
1997 1.336 0.53184 7.89414
1998 1.303 0.53263 7.96850
1999 1.335 0.52930 7.99202
2000 1.358 0.52244 7.91845
2001 1.354 0.51938 7.89913
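The relation between the tabulated parameters and the indicators can be sketched under an explicit assumption, since the functional form is not reproduced in this excerpt: that F(y) is a times a gamma density with rate b and shape c, i.e. a·b^c/Γ(c)·y^(c-1)·e^(-by). Under that assumption, a matches the SFI, c/b is the mean age measured from 15, and c/b² is the variance. The function name is hypothetical.

```python
def indicators_from_gamma(a, b, c, min_age=15):
    """Method-of-moments reading of (a, b, c), assuming rate b and shape c.

    Returns (SFI, MAM, VAM) implied by the fitted parameters.
    """
    return a, min_age + c / b, c / b ** 2


# Rows copied from the table above.
rows = {1975: (3.212, 0.39402, 5.57054), 2001: (1.354, 0.51938, 7.89913)}
for year, params in rows.items():
    sfi, mam, vam = indicators_from_gamma(*params)
    print(year, round(sfi, 3), round(mam, 2), round(vam, 2))
```

The implied values (a mean age of about 29.1 with variance of about 35.9 in 1975, and about 30.2 with variance of about 29.3 in 2001) are of the same order as the indicators discussed in the text, which lends some support to the assumed parameterization.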
[Figure: plot of the SFI series]
A clear pattern can be observed in all the series. The SFI, although it has
fallen, seems to have recovered in the last few years; the MAM is also
increasing, albeit slowly (which might be a consequence of the fact that the SFI
is improving); and what is most dramatic is the recovery of the variance (which
could indicate that women in Andalusia are having children at more widely
spaced intervals, and perhaps too that the number of children born in higher
orders of birth is greater). Taking all this into account, we assume the following
values:
1st Hypothesis: SFI = 1.4, MAM = 30.4, VAM = 29.8
2nd Hypothesis: SFI = 1.6, MAM = 31, VAM = 30
3rd Hypothesis: SFI = 1.3, MAM = 29, VAM = 28
These hypotheses would correspond, respectively, to: 1) a slight
improvement in fecundity rates in Andalusia; 2) a markedly higher number of
children being born to each woman, with women having children at wider age
intervals; 3) fewer children born to each woman and an advance in the mean age
of maternity. Let us examine a graphic representation of these three hypotheses,
compared to the observed data for 2001:
[Figure: fecundity curves under the three hypotheses and the observed 2001 curve, by age of the mother]
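The three hypothetical scenarios can be turned into fitted curves with a short sketch. It assumes, as an illustration only, that the fitted curve is SFI times a gamma density with rate b = (MAM - 15)/VAM and shape c = (MAM - 15)²/VAM, evaluated at the class mark y = (x + 0.5) - 15. The function names are hypothetical and the parameterization is an assumption, not a formula quoted from the chapter.

```python
import math


def gamma_params(sfi, mam, vam, min_age=15):
    """Map the indicators to (a, b, c) by matching mean and variance."""
    mean = mam - min_age
    return sfi, mean / vam, mean ** 2 / vam


def fitted_rate(x, a, b, c, min_age=15):
    """Fitted rate at age x, evaluated at the class mark y = (x + 0.5) - 15."""
    y = (x + 0.5) - min_age
    log_pdf = c * math.log(b) - math.lgamma(c) + (c - 1) * math.log(y) - b * y
    return a * math.exp(log_pdf)


# (SFI, MAM, VAM) for the three hypotheses stated above.
HYPOTHESES = {1: (1.4, 30.4, 29.8), 2: (1.6, 31.0, 30.0), 3: (1.3, 29.0, 28.0)}
curves = {h: [fitted_rate(x, *gamma_params(*v)) for x in range(15, 50)]
          for h, v in HYPOTHESES.items()}
for h, curve in curves.items():
    peak_age = 15 + curve.index(max(curve))
    print(h, round(sum(curve), 3), peak_age)
```

Summing each curve over ages 15 to 49 recovers the assumed SFI up to the small probability mass the gamma density places beyond age 50.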
5. Conclusions
The Gamma distribution is a powerful tool for the analysis and subsequent
projection of fecundity curves for a given zone, and this distribution is the most
widely used by demographers and other researchers in the field. The results
obtained reveal its clarity and suitability for modelling fecundity patterns and for
carrying out simulations or predictions of future patterns, largely because its
parameters depend on synthetic indicators of fecundity.
References
1. Arroyo, A. (coordinator), Hernandez, J., Romero, J., Viciana, F. and Zoido,
F. (2004). Tendencias demograficas durante el siglo XX en Espana. INE.
Madrid.
2. Brass, W. (1971). Seminario sobre modelos para medir variables
demograficas (Fecundidad y mortalidad). CELADE. S. Jose de Costa Rica.
3. Brass, W. (1974). Metodos para estimar la fecundidad y la mortalidad en
poblaciones con datos limitados. CELADE. Santiago de Chile.
4. I.E.A. (1999). Un siglo de demografia en Andalucia. La poblacion desde
1900. Sevilla.
5. Leridon, H. and Toulemon, L. (1997). Demographie. Economica. Paris.
6. Pressat, R. (1995). Elements de demographie mathematique. AIDELF.
Paris.
Chapter 10
CLASSES OF BIVARIATE DISTRIBUTIONS WITH NORMAL
AND LOGNORMAL CONDITIONALS: A BRIEF REVISION*
J.M. SARABIA
Department of Economics, University of Cantabria
Avda. de los Castros s/n. Santander, 39005, Spain
E. CASTILLO
Dept. of Applied Mathematics and Computational Sciences
University of Cantabria
Avda. de los Castros s/n. Santander, 39005, Spain
M. PASCUAL
Department of Economics, University of Cantabria
Avda. de los Castros s/n. Santander, 39005, Spain
M. SARABIA
Department of Business Administration, University of Cantabria
Avda. de los Castros s/n. Santander, 39005, Spain
The present paper is a brief survey of the classes of bivariate distributions with normal
and lognormal conditionals. Basic properties including conditional moments, marginal
distributions, characterizations, parameterizations, dependence and modality are reviewed.
Estimation and applications of these models are studied. Finally, some extensions of the
bivariate conditional normal model are reviewed.
*The authors thank the Ministerio de Educacion y Ciencia (project SEJ2004-02810) for partial
support of this work.
\[ f(x,y) = \frac{1}{\pi}\exp\left[-(x^2+y^2)/2\right] I(xy>0) \]
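\[ f_{X|Y}(x|y) = \frac{1}{\sigma_1(y)\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{x-\mu_1(y)}{\sigma_1(y)}\right)^2\right\} \tag{3} \]
\[ f_{Y|X}(y|x) = \frac{1}{\sigma_2(x)\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{y-\mu_2(x)}{\sigma_2(x)}\right)^2\right\} \tag{4} \]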
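\[ \frac{f_Y(y)}{\sigma_1(y)} \exp\left\{-\frac{1}{2}\left(\frac{x-\mu_1(y)}{\sigma_1(y)}\right)^2\right\} = \frac{f_X(x)}{\sigma_2(x)} \exp\left\{-\frac{1}{2}\left(\frac{y-\mu_2(x)}{\sigma_2(x)}\right)^2\right\} \]
\[ \log f(x,y) = \log\left[\frac{f_X(x)}{\sigma_2(x)\sqrt{2\pi}}\right] - \frac{(y-\mu_2(x))^2}{2\sigma_2^2(x)}. \]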
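We write:
\[ \begin{cases} \log f(x,y) = a_1(y) + b_1(y)\,x + c_1(y)\,x^2 \\ \log f(x,y) = a_2(x) + b_2(x)\,y + c_2(x)\,y^2 \end{cases} \]
\[ \frac{\partial^2 \log f(x,y)}{\partial x^2} = a_2''(x) + b_2''(x)\,y + c_2''(x)\,y^2, \quad \forall y. \]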
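\[ \frac{\mu_1(y)}{\sigma_1^2(y)} = m_{10} + m_{11}y + m_{12}y^2, \]
\[ -\frac{1}{2\sigma_1^2(y)} = m_{20} + m_{21}y + m_{22}y^2, \]
which leads to
\[ E(X|Y=y) = \mu_1(y) = -\frac{m_{12}y^2 + m_{11}y + m_{10}}{2(m_{22}y^2 + m_{21}y + m_{20})} \tag{6} \]
\[ Var(X|Y=y) = \sigma_1^2(y) = \frac{-1}{2(m_{22}y^2 + m_{21}y + m_{20})} \tag{7} \]
\[ E(Y|X=x) = \mu_2(x) = -\frac{m_{21}x^2 + m_{11}x + m_{01}}{2(m_{22}x^2 + m_{12}x + m_{02})} \tag{8} \]
\[ Var(Y|X=x) = \sigma_2^2(x) = \frac{-1}{2(m_{22}x^2 + m_{12}x + m_{02})} \tag{9} \]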
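The conditional moments (6) to (9) are simple rational functions of the coefficients m_ij of x^i y^j in log f(x, y). The following sketch evaluates them; the function names and the numerical coefficients are hypothetical, chosen only so that the quadratic coefficients are negative and the conditional variances positive.

```python
def conditional_x_given_y(m, y):
    """Mean and variance of X given Y = y; m[i][j] multiplies x**i * y**j."""
    lin = m[1][0] + m[1][1] * y + m[1][2] * y ** 2
    quad = m[2][0] + m[2][1] * y + m[2][2] * y ** 2  # must be negative
    return -lin / (2 * quad), -1 / (2 * quad)


def conditional_y_given_x(m, x):
    """Mean and variance of Y given X = x, by the symmetric formulas."""
    lin = m[0][1] + m[1][1] * x + m[2][1] * x ** 2
    quad = m[0][2] + m[1][2] * x + m[2][2] * x ** 2  # must be negative
    return -lin / (2 * quad), -1 / (2 * quad)


# Hypothetical coefficients; m[0][0] only affects the normalizing constant.
m = [[0.0, 0.0, -0.5],
     [0.2, 0.3, 0.0],
     [-0.5, 0.0, -0.25]]
print(conditional_x_given_y(m, 2.0))
print(conditional_y_given_x(m, 1.0))
```

Both conditional variances shrink as the conditioning value moves away from zero whenever m_22 is negative, which is the bounded, non-constant behaviour described for models of type (b).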
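The joint pdf can be written in the alternative form (with the notation used
for more general exponential families):
\[ f(x,y) = \exp\left\{ (1, x, x^2) \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 1 \\ y \\ y^2 \end{pmatrix} \right\} \tag{12} \]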
Models satisfying conditions (a), are the classical bivariate normal models with:
• Normal marginals,
• Normal conditionals,
• Linear regressions and constant conditional variances.
More interesting are the models satisfying conditions (b). These models have:
• Normal conditional distributions,
• Non-normal marginal densities (see (10) and (11)),
• The regression functions are either constant or non-linear given by (6)
and (8). Each regression function is bounded (in contrast with the
classical bivariate normal model).
• The conditional variance functions are also bounded and non constant.
They are given by (7) and (9).
What if we require normal conditionals and independent marginals?
Referring to (12) the requirement of independence translates to the following
functional equation
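\[ (1, x, x^2) \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ m_{20} & m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 1 \\ y \\ y^2 \end{pmatrix} = r(x) + s(y). \tag{13} \]
Its solution eventually leads us to
\[ m_{11} = m_{12} = m_{21} = m_{22} = 0, \]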
which is the independence model. This result shows that independence is only
possible within the classical bivariate normal model. As consequences of the
above discussion, Castillo and Galambos [3] derive the following interesting
conditional characterizations of the classical bivariate normal distribution.
2.3. Convenient parameterizations
Expression (5) depends on 8 parameters, and the normalizing constant is not
available in closed form. From a practical point of view it is convenient to
provide some simpler models or some convenient parameterization. To this end,
Gelman and Meng [14] proposed a simple parameterization. If in (12) we make
location and scale transformations in each variable we get:
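\[ f(x,y) \propto \exp\{-(\alpha x^2y^2 + x^2 + y^2 + \beta xy + \gamma x + \delta y)\}, \]
where $\alpha$, $\beta$, $\gamma$ and $\delta$ are the new parameters, which are functions of the old $m_{ij}$
parameters. In this parameterization, the conditional distributions are
\[ X|Y=y \sim N\left(-\frac{\beta y+\gamma}{2(\alpha y^2+1)},\ \frac{1}{2(\alpha y^2+1)}\right), \]
\[ Y|X=x \sim N\left(-\frac{\beta x+\delta}{2(\alpha x^2+1)},\ \frac{1}{2(\alpha x^2+1)}\right). \]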
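Because both conditionals are normal with known mean and variance, the model can be simulated without the normalizing constant by alternating draws from the two conditionals (a Gibbs sampler). The sketch below is an illustration and not a method described in the text. It takes the conditional of X given Y = y to be normal with mean -(βy + γ)/(2(αy² + 1)) and variance 1/(2(αy² + 1)), and symmetrically for Y given X = x; the function name, seed, burn-in length and parameter values are all hypothetical.

```python
import random


def gibbs_normal_conditionals(alpha, beta, gamma, delta, n, seed=0, burn_in=500):
    """Alternate draws from X|Y and Y|X; returns n pairs after the burn-in."""
    rng = random.Random(seed)
    x = y = 0.0
    draws = []
    for i in range(n + burn_in):
        var_x = 1.0 / (2.0 * (alpha * y * y + 1.0))
        x = rng.gauss(-(beta * y + gamma) * var_x, var_x ** 0.5)
        var_y = 1.0 / (2.0 * (alpha * x * x + 1.0))
        y = rng.gauss(-(beta * x + delta) * var_y, var_y ** 0.5)
        if i >= burn_in:
            draws.append((x, y))
    return draws


# Hypothetical symmetric case: density proportional to exp(-(x^2 y^2 + x^2 + y^2)).
draws = gibbs_normal_conditionals(1.0, 0.0, 0.0, 0.0, n=20000, seed=1)
mean_x = sum(d[0] for d in draws) / len(draws)
var_x = sum((d[0] - mean_x) ** 2 for d in draws) / len(draws)
print(round(mean_x, 3), round(var_x, 3))
```

In this symmetric case the marginal mean is zero and the marginal variance is below 1/2, since each conditional variance is at most 1/2.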
f(x, y\fi,a,c) =-
LltO.CJ^
exp- '1-^+^1+^^1)1 *> y>0 (16)
Y\X = x~N )
'l + c(x-fj,)2 la1, y
yflc
*(c) =
C/(l/2,l,l/2c)
U(a,
v b, z)y = — f e-'zf-x v(1 + / ) ' - " - ' d?.
rVia)
^ ^ Jo >
\[ x = -\frac{\beta y + \gamma}{2(\alpha y^2 + 1)}, \qquad y = -\frac{\beta x + \delta}{2(\alpha x^2 + 1)}, \tag{17} \]
which is a polynomial of degree five. When this polynomial has a unique real
root, the density is unimodal; if it has three distinct real roots (two modes and a
saddle point) the density is bimodal; and with five distinct real roots (three relative
maxima and two saddle points) it has three modes. For example, in the
symmetric case $\delta = \gamma$, we have three real roots, and consequently f(x, y) will be
bimodal, if and only if
\[ \alpha\delta^2 > 8(2-\beta). \]
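The bimodality condition can be checked numerically in the symmetric case. A short derivation, not given in the text: setting both partial derivatives of the exponent αx²y² + x² + y² + βxy + γx + δy to zero with δ = γ and subtracting the two equations gives (x - y)(2 - β - 2αxy) = 0, so any critical point off the diagonal satisfies xy = (2 - β)/(2α) and x + y = -γ/2. Real, distinct solutions exist exactly when αγ² > 8(2 - β) for α > 0. The function names and parameter values below are hypothetical.

```python
import math


def off_diagonal_critical_points(alpha, beta, gamma):
    """Symmetric case delta == gamma, alpha > 0.

    Returns [] when no off-diagonal critical point exists, otherwise the
    two mirror-image points (x, y) and (y, x).
    """
    s = -gamma / 2.0                      # x + y
    prod = (2.0 - beta) / (2.0 * alpha)   # x * y
    disc = s * s - 4.0 * prod             # positive iff alpha*gamma^2 > 8(2-beta)
    if disc <= 0.0:
        return []
    r = math.sqrt(disc)
    x, y = (s + r) / 2.0, (s - r) / 2.0
    return [(x, y), (y, x)]


def gradient(alpha, beta, gamma, delta, x, y):
    """Partial derivatives of the exponent; both vanish at a critical point."""
    gx = 2 * alpha * x * y * y + 2 * x + beta * y + gamma
    gy = 2 * alpha * x * x * y + 2 * y + beta * x + delta
    return gx, gy


points = off_diagonal_critical_points(1.0, 0.0, -6.0)   # 1 * 36 > 16
print(points)
print(off_diagonal_critical_points(1.0, 0.0, -3.0))     # 1 * 9 < 16
```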
2.5. Dependence
For this model the usual correlation coefficient is not limited. An
alternative, non-scalar dependence measure is the local dependence function
[17,18], defined by
\[ \gamma(x,y) = \frac{\partial^2 \log f(x,y)}{\partial x\, \partial y}, \tag{18} \]
which gives more detailed information about the dependence. In this case, the
local dependence function is
\[ \gamma(x,y) = \frac{\partial^2 \log f(x,y)}{\partial x\, \partial y} = m_{11} + 2m_{21}x + 2m_{12}y + 4m_{22}xy. \]
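The closed form of the local dependence function can be verified against a finite-difference approximation of the mixed derivative of log f. The sketch below assumes log f(x, y) is the sum of m_ij x^i y^j over i, j = 0, 1, 2, and uses hypothetical coefficient values; the function names are illustrative.

```python
def log_f(m, x, y):
    """log f(x, y) as the sum of m[i][j] * x**i * y**j for i, j in 0..2."""
    return sum(m[i][j] * x ** i * y ** j for i in range(3) for j in range(3))


def local_dependence(m, x, y):
    """Closed form: m11 + 2*m21*x + 2*m12*y + 4*m22*x*y."""
    return m[1][1] + 2 * m[2][1] * x + 2 * m[1][2] * y + 4 * m[2][2] * x * y


def mixed_difference(m, x, y, h=1e-3):
    """Central finite-difference estimate of the mixed partial derivative."""
    return (log_f(m, x + h, y + h) - log_f(m, x + h, y - h)
            - log_f(m, x - h, y + h) + log_f(m, x - h, y - h)) / (4 * h * h)


# Hypothetical coefficients, used only to exercise the formula.
m = [[0.0, 0.1, -1.0],
     [0.2, 0.5, 0.3],
     [-1.0, -0.4, -0.6]]
print(local_dependence(m, 0.7, -1.2), mixed_difference(m, 0.7, -1.2))
```

The sign of the result at a given point indicates whether X and Y are locally positively or negatively associated there.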
ox ay
An interpretation of this function is possible: random variables X and Y are
positively associated in the first and third quadrants and negatively associated in
the second and fourth which supposes non-linear dependences in the model.
If conditions (19) and (20) are satisfied, the joint probability density
function takes the form,
\[ f(x,y;\delta,m) = [(x-\delta_1)(y-\delta_2)]^{-1} \exp\{-[m_{00} + u(z_1,z_2) + v(z_1,z_2)]\} \tag{22} \]
where
\[ u(x,y) = m_{10}x + m_{20}x^2 + m_{01}y + m_{02}y^2 + m_{11}xy, \]
\[ v(x,y) = m_{12}xy^2 + m_{21}x^2y + m_{22}x^2y^2 \]
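\[ \mu_1(z_2) = -\frac{m_{12}z_2^2 + m_{11}z_2 + m_{10}}{2(m_{22}z_2^2 + m_{21}z_2 + m_{20})}, \tag{23} \]
\[ \mu_2(z_1) = -\frac{m_{21}z_1^2 + m_{11}z_1 + m_{01}}{2(m_{22}z_1^2 + m_{12}z_1 + m_{02})}, \tag{24} \]
and
\[ \sigma_1^2(z_2) = \frac{1}{2(m_{22}z_2^2 + m_{21}z_2 + m_{20})}, \tag{25} \]
\[ \sigma_2^2(z_1) = \frac{1}{2(m_{22}z_1^2 + m_{12}z_1 + m_{02})}. \tag{26} \]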
fx(x;Sl,m)=-
exp< - '20^1
' ¥ (29)
J{m12z*+mnzK+mm)l27T
2 \ (wi2222+W„Z2+W|0)2
exp-^ (30)
4(m22z2 + m2lz2 + m20)
fr(y;S2,m) =
^{m22z\+muz2+m2l))l27r
Note that (29) and (30) are not lognormal distributions if conditions (28)
hold. These marginals depend on all eight parameters and are therefore highly
flexible. The conditional moments of (22) are (r = 1,2,...):
where $\mu_i(u)$ and $\sigma_i^2(u)$, $i = 1,2$, are given in (23) to (25). Combining (31)-(32)
with (29)-(30), the moments of the marginal distributions as well as the
correlation coefficient can be obtained. The usual correlation coefficient is
obviously not limited, and the local dependence function (18) is
\[ \gamma(x,y) = -\frac{m_{11} + 2m_{21}\log(x) + 2m_{12}\log(y) + 4m_{22}\log(x)\log(y)}{xy}. \]
5. Estimation
For this kind of conditional model, several estimation strategies have been
proposed by Arnold, Castillo and Sarabia [5]. Here we focus on
techniques based on the likelihood. The family of densities (5) is a member of
the exponential family with natural sufficient statistics:
\[ \left( \sum x_i,\ \sum y_i,\ \sum x_i^2,\ \sum y_i^2,\ \sum x_i y_i,\ \sum x_i^2 y_i,\ \sum x_i y_i^2,\ \sum x_i^2 y_i^2 \right). \tag{33} \]
However, inference from conditionally specified models is not direct
because the normalizing constant is an unknown function of the parameters. The
shape of the likelihood is known but not the factor required to make it integrate
to 1. A method to avoid dealing with the normalizing constant consists of using
both conditional distributions. We define the pseudolikelihood estimate of $\theta$ to
be that value of $\theta$ which maximizes the pseudolikelihood function defined by:
\[ PL(\theta) = \prod_{i=1}^{n} f_{X|Y}(x_i|y_i;\theta)\, f_{Y|X}(y_i|x_i;\theta). \tag{34} \]
According to Arnold and Strauss [20] these estimators are consistent and
asymptotically normal. In this kind of conditional models, these estimators are
much easier to obtain than the maximum likelihood estimates.
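A minimal sketch of the log-pseudolikelihood follows: the sum, over observations, of the logs of the two conditional densities. For brevity it uses the four-parameter Gelman and Meng form rather than the full eight-parameter family, taking X given Y = y to be normal with mean -(βy + γ)/(2(αy² + 1)) and variance 1/(2(αy² + 1)), and symmetrically for Y given X = x. The function names and the single data point are hypothetical; in practice the function would be passed to a numerical optimizer.

```python
import math


def log_normal_pdf(v, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2.0 * math.pi * var) - (v - mean) ** 2 / (2.0 * var)


def log_pseudolikelihood(theta, data):
    """Sum of log f(x|y) + log f(y|x) over the sample; no normalizing constant."""
    alpha, beta, gamma, delta = theta
    total = 0.0
    for x, y in data:
        var_x = 1.0 / (2.0 * (alpha * y * y + 1.0))
        var_y = 1.0 / (2.0 * (alpha * x * x + 1.0))
        total += log_normal_pdf(x, -(beta * y + gamma) * var_x, var_x)
        total += log_normal_pdf(y, -(beta * x + delta) * var_y, var_y)
    return total


# Single hypothetical observation, independent case (all parameters zero).
value = log_pseudolikelihood((0.0, 0.0, 0.0, 0.0), [(1.0, 2.0)])
print(value)
```

With all parameters zero both conditionals are N(0, 1/2), so the value for the point (1, 2) is -log(π) - 5.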
6. Applications
The model with normal conditionals can present several modes and is
consequently a natural alternative to mixture models for modelling
heterogeneity; it can also be used for modelling a population composed of
several clusters. Arnold, Castillo and Sarabia [21] used this bivariate distribution
to fit the classical Fisher data, in which two different samples are pooled.
The model was fitted by pseudo-likelihood.
The model with lognormal conditionals has been used by Sarabia, Castillo,
Pascual and Sarabia [19] for modelling bivariate income distributions, using the
information contained in the European Community Household Panel. These
authors used the Spanish microdata (approximately 10,500 individuals),
focusing the analysis on waves 1, 3 and 6. It is important to point out that this
is a large set of bivariate data with high variability. They fitted to these two sets of
data the classical bivariate lognormal distribution and the bivariate lognormal
conditional distribution (22) with $\delta_i = 0$, maximizing the pseudo-likelihood
function given in (34). The resulting fitted model is very acceptable, and the
bivariate lognormal conditional distribution yields a very significant
improvement in the fit.
where $\phi(x)$ and $\Phi(x)$ denote the standard normal density and distribution
functions, respectively. The parameter $\lambda \in \mathbb{R}$ governs the
skewness of the distribution. We will write $X \sim SN(\lambda)$. The skewness of this
We are interested in the form of the density for a two dimensional random
variable (X,Y) such that:
for some functions $\lambda_1(y)$ and $\lambda_2(x)$. If (37)-(38) are to hold, there must exist
densities $f_X(x)$ and $f_Y(y)$ such that
In this functional equation, $f_X(x)$, $f_Y(y)$, $\lambda_1(y)$ and $\lambda_2(x)$ are unknown
functions to be determined. It is not hard to prove that $f_Y(y) = \phi(y)$ and
$f_X(x) = \phi(x)$. Then we have:
\[ \Phi(\lambda_1(y)\,x) = \Phi(\lambda_2(x)\,y), \quad \forall x, y \]
and then we get the solutions $\lambda_1(y) = \lambda y$ and $\lambda_2(x) = \lambda x$, where $\lambda$ is a
constant. In consequence, we have two types of solutions to the previous
functional equation. The first one corresponds to the independence case. In this
situation we have $\lambda_1(y) = \lambda_1$, $\lambda_2(x) = \lambda_2$, $X \sim SN(\lambda_2)$, $Y \sim SN(\lambda_1)$ and
\[ f_{X,Y}(x,y) = 4\,\phi(x)\,\phi(y)\,\Phi(\lambda_2 x)\,\Phi(\lambda_1 y). \]
The previous joint density has standard normal marginals together with
skewed normal conditionals. The corresponding regression functions are non-
linear and take the form:
\[ E(X|Y=y) = \sqrt{\frac{2}{\pi}}\, \frac{\lambda y}{\sqrt{1 + \lambda^2 y^2}}. \]
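The regression function can be checked numerically. The sketch assumes that the dependent solution has joint density 2φ(x)φ(y)Φ(λxy), which is what λ1(y) = λy implies but which is not printed in this excerpt, so that X given Y = y has density 2φ(x)Φ(λyx). It then compares the closed form with a midpoint-rule integral. The function names, integration limits and parameter values are illustrative choices.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def regression_closed_form(lam, y):
    """sqrt(2/pi) * lam*y / sqrt(1 + lam^2 y^2)."""
    return math.sqrt(2.0 / math.pi) * lam * y / math.sqrt(1.0 + (lam * y) ** 2)


def regression_numeric(lam, y, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule integral of x * 2*phi(x)*Phi(lam*y*x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += x * 2.0 * phi(x) * Phi(lam * y * x)
    return total * h


print(regression_closed_form(1.5, 0.8), regression_numeric(1.5, 0.8))
```

The regression is bounded by sqrt(2/π) in absolute value, which contrasts with the linear regressions of the classical bivariate normal model.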
References
1. B.C. Arnold and S.J. Press. (1989). Compatible conditional distributions.
Journal of the American Statistical Association, 84, 152-156.
2. E. Castillo and J. Galambos. (1987). Bivariate distributions with normal
conditionals. Proceedings of the International Association of Science and
Technology for Development, 59-62. Anaheim, CA: Acta Press.
3. E. Castillo and J. Galambos. (1989). Conditional distributions and the
Bivariate normal distribution. Metrika, 36, 209-214.
4. A. Bhattacharyya. (1943). On some sets of sufficient conditions leading to
the normal Bivariate distribution. Sankhya, 6, 399-406.
5. B.C. Arnold, E. Castillo and J.M. Sarabia. (1999). Conditional specification
of statistical models. Springer Series in Statistics. New York: Springer
Verlag.
6. M. Ahsanullah. (1985). Some characterizations of the Bivariate normal
distribution. Metrika, 32, 215-218.
7. B.C. Arnold, E. Castillo and J.M. Sarabia. (1994a). A conditional
characterization of the multivariate normal distribution. Statistics and
Probability Letters, 19, 313-315.
8. B.C. Arnold, E. Castillo and J.M. Sarabia. (1994b). Multivariate normality
via conditional specification. Statistics and Probability Letters, 20, 353-354.
9. W. Bischoff. (1993). On the greatest class of conjugate priors and
sensitivity of multivariate normal posterior distributions. Journal of
Multivariate Analysis, 44, 69-81.
10. W. Bischoff. (1996a). Characterizing Multivariate Normal Distributions by
Some of its Conditionals. Statistics and Probability Letters, 26, 105-111.
11. W. Bischoff. (1996b). On distributions whose conditional distributions are
normal. A vector space approach. Mathematical Methods of Statistics, 5,
443-463.
12. W. Bischoff and W. Fieger. (1991). Characterization of the multivariate
normal distribution by conditional normal distributions. Metrika, 38, 239-
248.
J.J. NUNEZ-VELAZQUEZ*
Departamento de Estadistica, Estructura Economica y O.E.I., University of Alcala
Plaza de la Victoria, 2, 28802 Alcala de Henares (Madrid), Spain
This paper studies the foundations of income inequality measures and their relations with
Lorenz curves, the Pigou-Dalton transfer principle and majorization relations among
income vectors. The historical development of these concepts is surveyed to see how the
current set of properties and axioms was generated, in order to define when an inequality
measure performs well. Finally, this work includes an analysis of the problem
associated with inequality orders and dominance relations among income vectors.
1. Introduction
It may be said that the interest shown by the research community over the last
thirty years in the study of economic inequality began with the seminal paper
by Atkinson (1970) and the book by Sen (1973). Both have had profound effects
on this research field. Since then, papers and books on the subject have
appeared frequently in the economic literature, and this original interest has
spread to several related social problems of importance, such as poverty,
mobility, polarization and deprivation studies, among others.
Over this period, different approaches to the problem have been
developed, including social welfare assumptions from economic theory to
support several economic inequality measures^a. However, the number and
variety of these assumptions have increased so considerably that
some of them have been the subject of heated controversy. Some outstanding examples
of these works are Cowell (1995), Foster (1985), Nygard and Sandström
(1981) and Dagum (2001), among others. In the Spanish case, we would cite the
works published by Zubiri (1985), Ruiz-Castillo (1987) and Pena et al. (1996).
*This work is dedicated to the memory of Camilo Dagum, who recently died. He was a direct disciple of
C. Gini and a teacher of several generations of researchers.
^a When we refer to inequality measures, they must be understood as functions or indicators defined
over an income distribution. These indicators are meant to measure how much inequality is
present in the sharing of resources. In other words, there is no connection with the concept of the
same name used in Measure Theory. Throughout the paper, we use the words indicator and
measure interchangeably.
Nevertheless, despite the huge amount of related literature, the Lorenz curve
paradigm remains the cornerstone of economic inequality analysis.
Indeed, the Lorenz curve should be considered the basic tool supporting
inequality analysis, even though it was proposed by Lorenz (1905) more than a
century ago. Throughout this time, the Lorenz curve
has withstood all the alternative proposals suggested to modify it.
For this reason, one of the main objectives of this paper
is to pay tribute to Lorenz, a century after he proposed his curve. To put
Lorenz curves in context, we quote a description of the nine-page original paper
from Arnold (2005), delivered at the Siena congress held to
commemorate the event. He wrote: ... In the
last 3 pages of the paper he describes what will become the Lorenz curve.
Actually there are only 35 lines of text and two diagrams devoted to the topic. It
has all grown from that! ...
First of all, in this paper we review the classical concepts related to income
majorization, in order to identify the theoretical background underlying Lorenz
curves and economic inequality measures as we understand them
nowadays. This aim is justified because we need to reconsider which basic
concepts actually underlie economic inequality measurement. Doing so should
lead to a better understanding of which elements play a significant role when
economic inequality is to be measured. Moreover, this understanding should
allow us to support an efficient selection of the most suitable inequality
measures. In this sense, a set of properties will be proposed in order to analyze
the suitability of a large number of economic inequality measures. Additionally, a
brief analysis of other related concepts and methods, recently proposed, will be
included. The variety of themes this paper deals with advises us to give it
a well-disaggregated structure, which is set out next.
So, the paper is structured as follows. In section 2, a brief chronology of
published concepts related to economic inequality is developed, emphasizing
those close to the Lorenz curve methodology. Section 3 is devoted to setting
the basic framework for the income distribution space and to presenting the
crucial majorization concepts. Section 4 studies, on the one hand, the meaning
of economic inequality and, on the other hand, the Lorenz curve
methodology for analyzing income inequality, as well as connected methods like
Inequality Measures, Lorenz Curves and Generating Functions 191
Hardy, Littlewood and Polya had proposed it in 1929. Its content can be
regarded as one of the cornerstones of economic inequality measurement.
In 1971, J. Gastwirth proposed the explicit expression of general
Lorenz curves, allowing them to be defined from random variables of any
type.
Obviously, this brief review must mention the aforementioned
paper by A.B. Atkinson, in 1970, where he sets out key arguments on the normative
content of inequality measures through a family of indicators named after him.
These arguments are based on the general mean function, or generalized mean,
but they are not free of controversy. Again, it is necessary to refer to
the appearance of the book On Economic Inequality, by A.K. Sen in 1973,
which was reissued in 1997 with a wide annexe covering several advances
in economic inequality and poverty registered during the 25 years elapsed
between the two editions. This new annexe was written by the same author with
J.E. Foster.
Precisely, J.E. Foster published his renowned theorem in 1985, where he
determined the conditions an indicator has to fulfil to be compatible with the
order generated by the Lorenz curve. These conditions impose suitable
properties on inequality measures so that they behave in accordance with
Lorenz curves, and they are conceptually different from the aforementioned
normative ones. This result constitutes the basic system of properties that
an inequality indicator is required to satisfy, and it may be considered a starting
point in the search for relevant properties, the so-called inequality axioms, to select
an adequate indicator. Nevertheless, this way of choosing an inequality indicator
had some precedents in the literature.
Finally, in 2001, C. Dagum published in the Spanish journal Estudios de
Economia Aplicada a summary of several papers published earlier in different
journals, since 1981. In this work, the author sets out his point of view on the
economic foundations of different inequality measures, in contrast to the
normative view derived through Atkinson's approach^b.
In this necessarily brief review, we have tried to point out the
evolution of the study of economic inequality, taking into account
the concepts that have become fundamental to the treatment of this subject.
Although nowadays these contents are usually presented as properties or axioms,
as explained before, we believe this paper will show the links between
these properties and the basic concepts supporting them. In the following sections,
the aforementioned concepts will be developed.
b
A more detailed description of this point of view can be seen in Dagum (1990).
$$D_N = \left\{ (x_1, x_2, \ldots, x_N) : x_i \ge 0,\ i = 1, \ldots, N;\ \sum_{i=1}^{N} x_i > 0 \right\}$$
so that we shall choose the ordered income vector, from smallest to largest, as
the canonical element of each equivalence class. Thus:
$$D = \bigcup_{N \ge 2} D_N$$
c
Here, we are referring only to the economic concept of income, although the analysis can be applied
directly to other concepts related to individual or household economic position, like earnings,
expenditures or wealth. However, there is controversy about which measure of economic position
should be used, on both theoretical grounds and the reliability of available data (Ruiz-Castillo, 1987;
Pena et al., 1996, among others).
d
This argument is usually known as the symmetry or anonymity axiom in inequality measurement
(Foster, 1985).
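$$x \prec y \Leftrightarrow \begin{cases} \sum_{i=1}^{k} x_i \ge \sum_{i=1}^{k} y_i, & k = 1, 2, \ldots, (N-1) \\ \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i \end{cases} \tag{4}$$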
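$$p_0 = 0; \quad p_i = \frac{i}{N}, \quad i = 1, 2, \ldots, N$$
$$q_0 = 0; \quad q_i = \frac{1}{\sum_{j=1}^{N} x_j} \sum_{j=1}^{i} x_j, \quad i = 1, 2, \ldots, N \tag{5}$$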
Thus, the Lorenz curve, L(p), is obtained by linking the points contained in
the set {(p_i, q_i); i = 0,1,...,N}, using linear interpolation to generate a polygonal
curve. Obviously, L(p) is inscribed within the unit square. So, if L(p) is near
the diagonal of the unit square, then the income sharing will be near the egalitarian
situation. Otherwise, the more bowed the curve is, the more inequality will be
present in the income distribution.
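As an illustration of definition (5), the following is a minimal Python sketch that builds the vertices of the polygonal Lorenz curve and evaluates L(p) by linear interpolation. The function names lorenz_points and lorenz_value are hypothetical, chosen only for this example, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def lorenz_points(incomes):
    """Vertices (p_i, q_i), i = 0..N, of the polygonal Lorenz curve."""
    x = sorted(Fraction(v) for v in incomes)   # canonical order: smallest to largest
    n, total = len(x), sum(x)
    if n == 0 or x[0] < 0 or total <= 0:
        raise ValueError("incomes must be non-negative with a positive total")
    points = [(Fraction(0), Fraction(0))]
    running = Fraction(0)
    for i, xi in enumerate(x, start=1):
        running += xi
        # p_i is the population share, q_i the cumulative income share
        points.append((Fraction(i, n), running / total))
    return points

def lorenz_value(points, p):
    """L(p), obtained by linear interpolation between consecutive vertices."""
    for (p0, q0), (p1, q1) in zip(points, points[1:]):
        if p0 <= p <= p1:
            return q0 + (q1 - q0) * (p - p0) / (p1 - p0)
    raise ValueError("p must lie in [0, 1]")

pts = lorenz_points([4, 1, 3, 2])
print(pts)
print(lorenz_value(pts, Fraction(3, 8)))
```

When all incomes are equal, every vertex lies on the diagonal of the unit square, which is the egalitarian situation described above.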
The previous definition is a descriptive one, but it can be easily generalized
to the case of a non-negative random variable, X, used to
model incomes. In such a case, let μ be its expectation E(X) and let F(x) be its
cumulative distribution function. Now, definition (5) can be expressed as
(Kendall and Stuart, 1977, for example):
$$p = F(x) = \int_0^x dF(t), \qquad q = L[F(x)] = \frac{1}{\mu} \int_0^x t \, dF(t) \tag{6}$$
$$L(p) = \frac{1}{\mu} \int_0^p F^{-1}(t) \, dt \tag{7}$$
$$L'(p) = \frac{F^{-1}(p)}{\mu}, \quad p \in (0,1) \tag{8}$$
Also, the difference function (relative to the diagonal) will be:
$$A(p) = p - L(p), \quad p \in [0,1] \tag{9}$$
and it reaches a maximum at the point p = F(μ). Moreover, the following
result^f is particularly interesting:
Theorem 1 (Iritani and Kuga, 1983): Let q = L(p) be a function defined over
the interval [0,1]. Then, L(p) is a Lorenz curve corresponding to some non-
negative random variable X, if and only if L(p) satisfies the following
properties:
L(0) = 0, L(1) = 1.
L(p) is convex and non-decreasing.
However, the main objective of the preceding discussion about the suitability
of Lorenz curves was to make inequality comparisons between income
distributions. To accomplish this aim, the following relationship, called the
Lorenz dominance criterion, is established.
Definition 1: Let x, y ∈ D. Then x is said to be less unequal than y in the
Lorenz sense (x ≤_L y) when the Lorenz curve associated with y completely
contains the one corresponding to x. Formally:
$$x \le_L y \Leftrightarrow L_x(p) \ge L_y(p), \quad \forall p \in [0,1] \tag{10}$$
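The Lorenz dominance criterion can be checked mechanically for two income vectors. The sketch below is only an illustration under stated assumptions: the helper names lorenz and lorenz_leq are hypothetical, and the check relies on the fact that both curves are polygonal, so comparing them at the union of their vertices is enough to compare them on the whole interval [0,1].

```python
from fractions import Fraction

def lorenz(incomes, p):
    """L(p) for the polygonal Lorenz curve of an income vector, p in [0, 1]."""
    x = sorted(Fraction(v) for v in incomes)
    n, total = len(x), sum(x)
    k = int(p * n)                        # number of whole incomes accumulated up to p
    partial = sum(x[:k])
    if k < n:
        partial += (p * n - k) * x[k]     # linear interpolation inside the segment
    return partial / total

def lorenz_leq(x, y):
    """True when x is less unequal than y in the Lorenz sense: L_x(p) >= L_y(p)."""
    grid = {Fraction(i, len(x)) for i in range(len(x) + 1)}
    grid |= {Fraction(i, len(y)) for i in range(len(y) + 1)}
    return all(lorenz(x, p) >= lorenz(y, p) for p in grid)

print(lorenz_leq([2, 3, 5], [1, 2, 7]))        # curves do not cross
print(lorenz_leq([1, 4, 4, 11], [2, 2, 6, 10]))  # curves cross
print(lorenz_leq([2, 2, 6, 10], [1, 4, 4, 11]))
```

The last two calls use a pair of vectors whose Lorenz curves intersect, so neither dominates the other. This is why the criterion yields only a partial order. The function also accepts vectors of different lengths, since the comparison is made on population shares.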
Compared with the majorization relation, the Lorenz criterion turns out to be a
more versatile relation because it is capable of making comparisons between
e
See, for example, Casas and Nunez (1987) or Nygard and Sandstrom (1981), for more details.
f
Analyses of sampling results about Lorenz curves are beyond the scope of this paper. However, there
are very interesting references in this field, beginning with Goldie (1977) on strong consistency of
empirical Lorenz curves, and Beach and Davidson (1983) or Beach and Richmond (1985) on
asymptotic normality of Lorenz curve estimates.
$$g_F(x) = \frac{f'(x)}{f(x)} \tag{12}$$
This generating function allows us to obtain ordered families of Lorenz
curves. Such ordered families depend on a single parameter and, therefore,
yield a total order under paired comparisons using the Lorenz dominance
g
Regarding estimation methods in this area, see e.g. Castillo, Hadi and Sarabia (1998).
criterion. The simplest case
corresponds to strongly unimodal distributions or, in other words, those whose
probability density function is log-concave. That is:
Some examples of this kind of random variable are the log-normal and Pareto
distributions. As these examples show, the main drawback of these
distributions is their rigidity as models of real incomes.
On the other hand, in the same way as before, the Lorenz curve generating
function can be defined mutatis mutandis, assuming now that L(.) is a Lorenz
curve from a continuous random variable:
c) $g_L(p) > 0, \quad \forall p \in (0,1]$
d) $(g_L(p))^2 + g'_L(p) > 0, \quad \forall p \in (0,1]$
So, if a function g_L(p) fulfils the above conditions, then it will give a Lorenz
curve through the associated generating function. Using several generating
functions, Garcia and Herrerias (2001) have obtained a number of well-known
h
Although more complex, another method to generate ordered families of Lorenz curves can be seen
in Sarabia, Castillo and Slottje (1999).
$$g_F(x) = E(X) \cdot f'(x) \cdot L''[F(x)]$$
(16)
gL[FM]= X
E(X) L[F(X)]
iii) $\sum_{i=1}^{N} p_{ij} = 1, \quad \forall j = 1, 2, \ldots, N$
Thus, doubly stochastic matrices are finite matrices with a probability
distribution defined over each row and each column. The set of all these
matrices is closely related to permutation matrices, in the way expressed by the
following result.
Theorem 4 (Birkhoff, 1976): The set of (N×N) bi-stochastic matrices is
the convex hull of the set of (N×N) permutation matrices.
Furthermore, it is easy to prove that applying a doubly
stochastic matrix to an income distribution produces an equalizing effect. It is
enough to let P be a bi-stochastic matrix and x, y ∈ D_N, such that x = P·y; then each
component of vector x will be a convex mixture of the components of vector y, and
thus we have a progressive transfer^i. In other words:
i
In that sense, Arnold (2005), quoting from Schur (1923), refers to this by describing x as an
averaging of y.
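The equalizing effect of a doubly stochastic matrix can be illustrated numerically. The following Python sketch is offered under explicit assumptions: the matrix is built as a convex mixture of permutation matrices (in the spirit of Theorem 4), the weights and the vector y are arbitrary illustrative values, and the names mix_of_permutations, transform and is_majorized_by are hypothetical. Majorization is tested with vectors sorted from smallest to largest, as in definition (4).

```python
from fractions import Fraction
from itertools import permutations

def mix_of_permutations(n, weights):
    """Doubly stochastic matrix built as a convex mixture of permutation matrices."""
    perms = list(permutations(range(n)))[:len(weights)]
    P = [[Fraction(0)] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for row, col in enumerate(perm):
            P[row][col] += w          # add w times the permutation matrix
    return P

def transform(P, y):
    """Matrix-vector product x = P.y."""
    return [sum(pij * yj for pij, yj in zip(row, y)) for row in P]

def is_majorized_by(x, y):
    """x is majorized by y, with both vectors sorted in ascending order."""
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys) or sum(xs) != sum(ys):
        return False
    return all(sum(xs[:k]) >= sum(ys[:k]) for k in range(1, len(xs)))

weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # illustrative, sum to 1
P = mix_of_permutations(3, weights)
y = [1, 2, 7]
x = transform(P, y)
print(x, is_majorized_by(x, y), is_majorized_by(y, x))
```

Each component of x is a weighted average of the components of y, total income is preserved, and x turns out to be majorized by y but not the other way round.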
( x
i- x
j)- >0, Vi*j, VxeDN n I N (20)
OXj OXj
equivalences stated before. For example, the Gini index (Gini, 1912) is a strictly
S-convex function^j. However, the usual construction of inequality measures is
based on the next statement, which connects all the implications related to
inequality and majorization set out before.
Theorem 7 (Karamata, 1932): Let g(.) be a convex, continuous and real
function, then:
$$(x \prec y) \Leftrightarrow \sum_{i=1}^{N} g(x_i) \le \sum_{i=1}^{N} g(y_i), \quad \forall x, y \in D_N$$
Further, if g(.) is a convex real function, then $h(x) = \sum_{i=1}^{N} g(x_i)$ is said to be a
convex separable function, provided that x ∈ D_N.
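A small numerical check may help to fix ideas about convex separable functions. In this sketch, the vectors x = (2, 3, 5) and y = (1, 2, 7) are illustrative values chosen so that x is majorized by y, and the convex functions listed are arbitrary examples. The helper name separable is hypothetical.

```python
import math

def separable(g, x):
    """h(x) = sum of g(x_i): a separable function built from g."""
    return sum(g(t) for t in x)

# Illustrative convex functions on the positive half-line
convex_functions = {
    "square": lambda t: t * t,
    "abs_dev": lambda t: abs(t - 3),
    "neg_log": lambda t: -math.log(t),
    "exp": math.exp,
}

x, y = [2, 3, 5], [1, 2, 7]   # x is majorized by y: same total, less spread

results = {name: separable(g, x) <= separable(g, y)
           for name, g in convex_functions.items()}
print(results)

# A concave function (square root) reverses the ordering
print(separable(math.sqrt, x) > separable(math.sqrt, y))
```

Every convex function in the list assigns a smaller or equal value to the less spread vector, while the concave square root does the opposite, which shows why convexity is the relevant requirement.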
It is easy to see that every convex separable function is S-convex too.
Nevertheless, the converse is not true, and this can be readily checked
from Theorem 7. But it is important to observe how Theorem 7 relates
majorization to the construction of economic inequality measures. Moreover, the
following property links this to Lorenz dominance.
Corollary 1 (Arnold, 1987): Let g(.) be a convex, continuous and real function,
then:
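$$x \le_L y \Leftrightarrow E\left[ g\left( \frac{x}{\mu(x)} \right) \right] \le E\left[ g\left( \frac{y}{\mu(y)} \right) \right]$$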
As a result, it is worth mentioning that each convex, continuous and real
function can generate a genuine inequality indicator, because it will be
compatible with the Lorenz dominance criterion, by Corollary 1. So, the partial
order deduced from the Lorenz dominance criterion is still present, connecting to
the intersection quasi-order (Sen, 1973, pg. 72), which constitutes another,
rather less restrictive, partial order^k.
Evidently, choosing a single inequality measure implies a total order as a
result, but Lorenz compatibility hides what the causes of different orderings may
be when several inequality indicators are used. This fact must be
explained by the distinct weighting schemes placed on the income distribution,
which are associated with each inequality measure. So, a research field has emerged,
considering batteries of inequality indicators instead of choosing only one of
j
Marshall and Olkin (1979) contains an extensive exposition about S-convex functions, including the
result covered in Theorem 6.
k
Obviously, this will be true only if all of the considered inequality indicators are compatible with
the Lorenz relation. Otherwise, there is no inclusion relationship linking both partial orders.
them, in order to extract the common information included in such a set using
Principal Component Analysis, or to eliminate the redundant inequality
information through the Ivanovic-Pena DP2 distance (Garcia et al., 2002). This new
approach can be modified to allow dynamic inequality evaluations too
(Dominguez and Nunez, 2005).
On the other hand, Corollary 1 allows comparisons between income
distributions from populations of different sizes. However, this is
possible only when homogeneous functions are used as inequality measures, so that
proportional income vectors give the same value. This formal fact is
equivalent to imposing the so-called Dalton Population Principle, proposed by the
aforementioned author under the name Individuals Proportional Addition
Principle (Dalton, 1920, pg. 357): inequality is invariant to
population replicas^l. In formal terms, this restriction requires inequality
measures to be functions defined over the empirical cumulative
distribution function.
Finally, we can summarize a great part of the preceding discussion by reproducing
the next statement, which includes the relative sensitivity to income transfers,
depending on the chosen inequality measure.
Theorem 8 (Atkinson, 1970; Kakwani, 1980): If V(.) is a strictly convex and
real function, then every inequality measure defined by I(x) = E[V(x)] will
satisfy the Pigou-Dalton Transfer Principle, whatever the income level.
Furthermore, if V(.) is also differentiable, then its relative sensitivity to income
transfers will be proportional to:
$$T(x) = V'(x) - V'(x - \delta), \quad \delta > 0.$$
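To see how the choice of V(.) changes the sensitivity to transfers, the expression T(x) can be evaluated for a few strictly convex functions. The sketch below is only illustrative: the three functions, the income levels and the value δ = 1 are assumptions made for this example, and transfer_sensitivity is a hypothetical helper name.

```python
import math

def transfer_sensitivity(v_prime, x, delta):
    """T(x) = V'(x) - V'(x - delta), for a differentiable convex V."""
    return v_prime(x) - v_prime(x - delta)

# Derivatives of three strictly convex functions on the positive half-line
v_primes = {
    "V(x) = x^2":     lambda t: 2 * t,
    "V(x) = x log x": lambda t: math.log(t) + 1,
    "V(x) = -log x":  lambda t: -1 / t,
}

levels, delta = [2, 5, 10], 1
table = {name: [transfer_sensitivity(vp, x, delta) for x in levels]
         for name, vp in v_primes.items()}
for name, values in table.items():
    print(name, values)
```

With the quadratic function, T(x) is the same at every income level, whereas with the two logarithmic functions T(x) is positive but decreases as income grows, so transfers among lower incomes receive more weight.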
l
An r-order population replica consists of considering an income vector which repeats r times each
component of the original income distribution, giving $(x_1, \overset{r)}{\ldots}, x_1, x_2, \overset{r)}{\ldots}, x_2, \ldots, x_N, \overset{r)}{\ldots}, x_N)'$ as a result.
m
A wide exposition of the axioms proposed in the economic literature can be found in Nygard and
Sandstrom (1981) or Ruiz-Castillo (1986), for example.
n
The axioms presented in this section are considered the basic ones related to Lorenz dominance.
Among the omitted ones, we must mention the additive decomposability axiom (Bourguignon, 1979),
which stands out because of its repercussion and controversy. Moreover, this axiom allows us to
characterize a family of inequality measures.
o
This axiom appears in Dalton (1920, pg. 357).
On the other hand, to connect this approach with the more analytical one
developed before, we introduce the next definition.
Definition 5: A real function I(.) defined over D is said to be a Lorenz-
compatible inequality measure when it is monotone with respect to the Lorenz
dominance criterion. More formally:
$$I(x) \ge I(y) \Leftarrow x \ge_L y \Leftrightarrow L_x(p) \le L_y(p), \quad \forall p \in [0,1]$$
p
The use of absolute measures, instead of relative ones, has been suggested. This approach implies
the suppression of the Scale Invariance Axiom (Moyes, 1987). Nevertheless, this kind of measure
is closer to the so-called generalized Lorenz dominance relation (Shorrocks, 1983).
limit case would be configured using only the poorest income (Rawls, 1972).
Therefore, this research field intends to restrict the Pigou-Dalton Transfer
Principle by placing more weight on transfers in which the smaller incomes are
involved. Some related results are Shorrocks and Foster (1987) or Fleurbaey and
Michel (2001), among others.
Along the same lines, another related research field aims to apply
weighting schemes directly to the Lorenz curve. It is a well-known fact
that the Gini index equals twice the Lorenz area^q (e.g. Wold, 1935 or Kakwani, 1980).
Following this idea, some authors have proposed inequality measures based on
geometrical elements of the Lorenz curve, like the maximum distance to the
egalitarian line (Pietra, 1914-15, 1948; Schutz, 1951), its length (Kakwani,
1980) and weighted Lorenz areas using specific functions (Mehran, 1976;
Casas and Nunez, 1991, among others).
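The geometrical reading of these indices can be verified on a small example. The following Python sketch computes the Gini index as twice the area between the diagonal and the polygonal Lorenz curve, compares it with the usual mean-difference expression, and also computes the maximum vertical distance between the diagonal and the curve. The function names and the income vector are assumptions made only for this illustration, and the mean-difference formula is the standard textbook one rather than an expression taken from this paper.

```python
from fractions import Fraction

def lorenz_vertices(incomes):
    """Abscissas p_i and ordinates q_i of the polygonal Lorenz curve."""
    x = sorted(Fraction(v) for v in incomes)
    n, total = len(x), sum(x)
    q, acc = [Fraction(0)], Fraction(0)
    for xi in x:
        acc += xi
        q.append(acc / total)
    p = [Fraction(i, n) for i in range(n + 1)]
    return p, q

def gini_from_area(incomes):
    """Twice the area between the diagonal and the Lorenz polygon."""
    p, q = lorenz_vertices(incomes)
    under = sum((q[i - 1] + q[i]) / 2 * (p[i] - p[i - 1]) for i in range(1, len(p)))
    return 2 * (Fraction(1, 2) - under)

def gini_mean_difference(incomes):
    """Sum of |x_i - x_j| over all pairs, divided by 2 N^2 times the mean."""
    x = [Fraction(v) for v in incomes]
    n = len(x)
    mu = sum(x) / n
    return sum(abs(a - b) for a in x for b in x) / (2 * n * n * mu)

def max_distance_to_diagonal(incomes):
    """Largest vertical gap p - L(p); for a polygon it is attained at a vertex."""
    p, q = lorenz_vertices(incomes)
    return max(pi - qi for pi, qi in zip(p, q))

sample = [1, 2, 3, 4]
print(gini_from_area(sample), gini_mean_difference(sample))
print(max_distance_to_diagonal(sample))
```

For this sample both Gini computations coincide, and the maximum vertical gap equals half the relative mean deviation, which is the quantity behind the Pietra and Schutz proposals.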
where L_x(p) stands for the Lorenz curve of the income vector x. Properties of
these curves are easy to establish as direct consequences of those of Lorenz curves.
Consequently, the dominance relationship can be established:
q
It refers to the area located between the Lorenz curve and the diagonal of the unit square.
Lately, a great deal of research effort has been devoted to the application of
well-known stochastic dominance criteria to provide alternative tools in the
study of economic inequality and other related concepts, such as poverty,
welfare and so on^r. Stochastic dominance consists of several relationships
defined on pairs of random variables through their cumulative distribution
functions. To define them, let X be a non-negative random variable representing
the income of a society, and let F(.) be its cumulative distribution function; then
the cumulative distribution functions of successive orders can be defined through
the following expressions:
$$F_1(z) = F(z) = P(X \le z), \quad \forall z \ge 0$$
$$F_j(z) = \int_0^z F_{j-1}(t) \, dt, \quad \forall z \ge 0, \ \forall j = 2, 3, \ldots \tag{25}$$
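For an empirical distribution, the successive-order functions in (25) can be written in closed form, because integrating the step function of each observation repeatedly gives a truncated power. The sketch below relies on that observation, which is stated here as an assumption of the example rather than a result taken from this paper: for j ≥ 2, F_j(z) is the sample mean of max(z − x_i, 0)^(j−1) divided by (j−1)!. The function name successive_cdf and the sample are hypothetical.

```python
from math import factorial

def successive_cdf(sample, j, z):
    """F_j(z) of expression (25) for the empirical distribution of `sample`."""
    n = len(sample)
    if j == 1:
        # ordinary empirical cumulative distribution function
        return sum(1 for x in sample if x <= z) / n
    # repeated integration of the indicator of (x <= t) from 0 to z
    return sum(max(z - x, 0) ** (j - 1) for x in sample) / (n * factorial(j - 1))

sample = [1, 3]
for j in (1, 2, 3):
    print(j, [successive_cdf(sample, j, z) for z in (0, 2, 4)])
```

Each function in the sequence is the integral of the previous one, so comparing two income distributions at order j only requires evaluating these expressions on a grid of income levels.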
r
Relations between Lorenz dominance and welfare have been studied in Bishop, Formby and Smith
(1991) and subsequent papers.
s
A sufficient condition for this dominance criterion is given in Ramos, Ollero and Sordo (2000).
t
More details may be seen in Muliere and Scarsini (1989) or Bishop, Formby and Sakano (1995),
among others.
individual inequality as the amount each person contributes to the global result.
In doing so, if the sharing of resources were egalitarian (all individuals
receive the mean income), then each contribution to inequality would be null. But
when some of them get more or less income than the mean, they contribute to
raising inequality.
Therefore, the aim of this interpretation is to find
the method we must use to measure such an individual contribution to
inequality. It should be noted that this individual contribution must be coherent
with inequality concepts, and so we might expect, as a result, at least a reduction
in the number of indicators to choose among.
Next, in the first subsection, an earlier family of inequality indicators
following this approach is presented, whereas a new proposal about what an
inequality indicator must fulfil will be presented in the second one.
where $g^{-1}(t) = \inf\{x : g(x) > t\}$, including the first formulation when g(.) is the
identity function.
This family values the individual inequality contribution through income
differences with respect to a reference point, usually the mean or median income.
Only the use of normalizing constants included in the specification of the weights
allows usual relative indicators like the Pietra and Gini ones to be obtained.
Therefore, this family can be considered as generalized mean deviations.
$$I(X) = E\left[ g\left( \frac{X}{\mu} \right) \right]$$
and the following conditions have to be fulfilled:
i) g(.) is a convex, continuous and real function.
ii) g(.) is non-negative.
iii) g(.) is non-increasing when x < μ.
iv) g(.) is non-decreasing when x > μ.
These conditions ensure that I(X) will be a genuine inequality indicator,
because they impose such behaviour on the valuation of individual
contributions. As a matter of fact, the first one implies that the indicator is a
convex separable function; the second is necessary because an individual
contribution to inequality must not be negative, keeping in mind that an income
cannot diminish inequality: it can only add to it or leave it unchanged. The last
two conditions impose the genuine behaviour of individual contributions, so that
they must increase as income moves further away from the mean, whatever the
direction
I(X) = ^-.EJX-u|]=E
So, this function is a convex, continuous and real one, but g(x) > 0 ⇔ x > 1,
and it fails ii) because g(.) is negative when x < 1. This fact implies that T_1(X)
admits negative contributions to inequality when incomes are lower than the mean.
Also, condition iii) is not fulfilled.
b) Theil order 0 indicator is defined by:
$$T_0(X) = E\left[ \log \frac{\mu}{X} \right] = E\left[ \log \frac{1}{X/\mu} \right]$$
Hence, g(x) = log(1/x), g'(x) = -(1/x), g''(x) = 1/x^2, and so it satisfies i).
But g(x) > 0 ⇔ x < 1, and it fails to satisfy ii). Then, T_0(X) allows
negative contributions when incomes are greater than the mean. In fact, condition
iv) is not fulfilled.
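The sign pattern of the individual contributions can be inspected numerically. In the sketch below, the contribution functions for the two Theil indicators are g(x) = x log x and g(x) = log(1/x), evaluated at income relative to the mean. The third function, g(x) = |x − 1|, is an illustrative choice added here because it meets conditions ii) to iv); the income vector and the helper names are likewise assumptions of this example.

```python
import math

contribution = {
    "theil_1":  lambda r: r * math.log(r),   # g(x) = x log x
    "theil_0":  lambda r: -math.log(r),      # g(x) = log(1/x)
    "mean_dev": lambda r: abs(r - 1),        # g(x) = |x - 1|, illustrative
}

def contributions(incomes, g):
    """Individual contributions g(x_i / mean) to the indicator."""
    mu = sum(incomes) / len(incomes)
    return [g(x / mu) for x in incomes]

def index(incomes, g):
    """Indicator value: the mean of the individual contributions."""
    c = contributions(incomes, g)
    return sum(c) / len(c)

incomes = [1, 2, 3, 10]        # mean 4: three incomes below it, one above
for name, g in contribution.items():
    rounded = [round(c, 3) for c in contributions(incomes, g)]
    print(name, rounded, round(index(incomes, g), 3))
```

Both Theil indicators give a positive overall value for this vector, yet the order 1 indicator assigns negative contributions to the incomes below the mean and the order 0 indicator assigns a negative contribution to the income above it, whereas the absolute deviation never does.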
Hence, Theil's indicators seem ill-conditioned to measure inequality, as
proven in Proposition 3, taking into account the reasons for their failure^u. So,
the proposed family could be used to check whether convex separable
inequality indicators are really measuring what they are supposed to, despite
their Lorenz-compatibility. Furthermore, this last result sheds light on some
well-known inequality measures whose performance would not be adequate.
Perhaps what this family shows is that Lorenz curves do analyze inequality,
but other things may be included too. However, this is a
task that might require more investigation in the future.
10. Conclusions
In order to provide an adequate understanding of economic inequality
measures, the underlying statistical theory has been set out in this paper. In
doing so, we must conclude that economic inequality measures are firmly
connected to majorization and Lorenz dominance relationships between pairs of
income distributions. This conclusion has important consequences for the
selection of inequality indicators, which should be compatible with those relationships.
To reach conclusions like the aforementioned, a historical review has
been developed to recover statistical terms related to economic inequality that
are seldom used nowadays, as well as their relations with current trends and
proposals. Even so, Lorenz curves have withstood new theoretical
approaches since their appearance more than a century ago and they can be
u
Dagum (1990, 2001) had warned about the ill-conditioned performance of Theil's indicators, but he
did so only in a social welfare framework.
Acknowledgments
The author gratefully acknowledges partial financial support from the
University of Alcala (grant UAH-PI2004/034) and from Junta de Comunidades de
Castilla-La Mancha together with Fondo Social Europeo (Project PBI-05-004).
References
1. B.C. Arnold. (1987). Majorization and the Lorenz Order: A Brief
Introduction. Lecture Notes in Statistics. New York: Springer Verlag.
2. B.C. Arnold. (2005). The Lorenz curve: Evergreen after 100 years. Int.
Conference in Memory of C. Gini and M.O. Lorenz. Siena.
[http://www.unisi.it/eventi/GiniLorenz05].
3. B.C. Arnold, C.A. Robertson, P.L. Brockett and B.Y. Shu. (1987).
Generating ordered families of Lorenz curves by strongly unimodal
distributions. Journal of Business and Economic Statistics, 5(2), 305-308.
4. A.B. Atkinson. (1970). On the measurement of inequality. Journal of
Economic Theory, 2, 244-263.
5. C.P.A. Bartels. (1977). Economic Aspects of Regional Welfare. Martinus
Nijhoff Sciences Division.
6. C.M. Beach and R. Davidson. (1983). Distribution-free statistical inference
with Lorenz curves and income shares. Review of Economic Studies, L,
723-735.
7. C.M. Beach and J. Richmond. (1985). Joint confidence intervals for income
shares and Lorenz curves. International Economic Review, 26(2), 439-450.
8. Z.M. Berrebi and J. Silber. (1987). Dispersion, asymmetry and the Gini
index of inequality. International Economic Review, 28(2), 331-338.
9. G. Birkhoff. (1976). Tres observaciones sobre el Algebra Lineal. Univ.
Nacional de Tucuman Rev., Serie A, 5, 147-151.
10. J.A. Bishop, J.P. Formby and R. Sakano. (1995). Lorenz and stochastic-
dominance comparisons of European income distributions. Research on
Economic Inequality, 6, 77-92.
11. J.A. Bishop, J.P. Formby and W.J. Smith. (1991). Lorenz dominance and
welfare: Changes in the U.S. distribution of income, 1967-1986. Review of
Economics and Statistics, 73, 134-139.
12. F. Bourguignon. (1979). Decomposable income inequality measures.
Econometrica, 47, 901-920.
13. J. Callejon. Un nuevo metodo para generar distribuciones de probabilidad.
Problemas asociados y aplicaciones. Ph. D. dissertation. University of
Granada.
14. J.M. Casas, R. Herrerias and J.J. Nunez. (1997). Familias de Formas
Funcionales para estimar la Curva de Lorenz. Actas de la IV Reunion Anual
de ASEPELT-Espana. Servicio de Estudios de Cajamurcia, 171-176.
Reprinted in Aplicaciones estadisticas y economicas de los sistemas de
funciones indicadoras (R. Herrerias, F. Palacios and J. Callejon, eds.). Univ.
Granada, 119-125(2001).
15. J.M. Casas and J.J. Nunez. (1987). Algunas Consideraciones sobre las
Medidas de Concentracion. Aplicaciones. Actas de las II Jornadas sobre
Modelizacion Economica, 49-62. Barcelona. Reprinted in Aplicaciones
estadisticas y economicas de los sistemas de funciones indicadoras (R.
Herrerias, F. Palacios and J. Callejon, eds.). Univ. Granada, 111-118
(2001).
16. J.M. Casas and J.J. Nunez. (1991). Sobre la Medicion de la Desigualdad y
Conceptos Afines. Actas de la V Reunion Anual de ASEPELT-Espana,
Caja de Canarias, 2, 77-84. Reprinted in Aplicaciones estadisticas y
economicas de los sistemas de funciones indicadoras (R. Herrerias, F.
Palacios and J. Callejon, eds.). Univ. Granada, 127-133 (2001).
17. E. Castagnoli and P. Muliere. (1990). A note on inequality measures and the
Pigou-Dalton Principle of Transfers. Income and Wealth Distribution,
Inequality and Poverty. (C. Dagum and M. Zenga, eds.) Springer Verlag,
171-127.
18. E. Castillo, A.S. Hadi and J.M. Sarabia. (1998). A method for estimating
Lorenz curves. Communications in Statistics, Theory and Methods, 27,
2037-2063.
19. F.A. Cowell. (1995). Measuring Inequality. 2nd ed. LSE Handbooks in
Economics. Prentice Hall/Harvester Wheatsheaf.
20. C. Dagum. (1990). Relationship between income inequality measures and
social welfare functions. Journal of Econometrics, 43(1-2), 91-102.
21. C. Dagum. (2001). Desigualdad del redito y bienestar social,
descomposicion, distancia direccional y distancia metrica entre
distribuciones. Estudios de Economia Aplicada, 17, 5-52.
22. H. Dalton. (1920). The measurement of the inequality of incomes.
Economic Journal, 30, 348-361.
23. J. Davies and M. Hoy. (1994). The normative significance of using third-
degree stochastic dominance in comparing income distributions. Journal of
Economic Theory, 64, 520-530.
24. J. Davies and M. Hoy. (1995). Making inequality comparisons when Lorenz
curves intersect. American Economic Review, 85(4), 980-986.
25. J. Dominguez and J.J. Nunez. (2005). The evolution of economic inequality
in the EU countries during the nineties. First Meeting of the Society for the
Study of Economic Inequality (ECINEQ). Palma de Mallorca. Available at
[http://www.ecineq.org]
26. M. Fleurbaey and P. Michel. (2001). Transfer Principles and inequality
aversion, with an application to optimal growth. Mathematical Social
Sciences, 42, 1-11.
27. J.E. Foster. (1985). Inequality measurement. Published in Fair Allocation
(H.P. Young, ed.), Proceedings of Symposia in Applied Mathematics, 33,
Providence, American Mathematical Society, 31-68.
28. C. Garcia, J.J. Nunez, L.F. Rivera and A.I. Zamora. (2002). Analisis
comparativo de la desigualdad a partir de una bateria de indicadores. El
caso de las Comunidades Autonomas espanolas en el periodo 1973-1991.
Estudios de Economia Aplicada, 20(1), 137-154.
29. R.M. Garcia and J.M. Herrerias. (2001). Inclusion de curvas de Lorenz en
las funciones generadoras. Aplicaciones estadisticas y economicas de los
sistemas de funciones indicadoras (R. Herrerias, F. Palacios and J. Callejon,
eds.). Univ. Granada, 185-191.
30. J.L. Gastwirth. (1971). A general definition of the Lorenz curve.
Econometrica, 39, 1037-1039.
31. C. Gini. (1912). Variabilita e Mutabilita: Contributo allo studio delle
distribuzioni e relazioni statistiche. Studi Economico-Giuridici
dell'Universita di Cagliari, 3, 1-158.
32. C. Gini. (1921). Measurement of inequality of incomes. The Economic
Journal, 31, 124-126.
33. C.M. Goldie. (1977). Convergence Theorems for empirical Lorenz curves
and their inverses. Advances in Applied Probability, 9, 765-791.
34. M.R. Gupta. (1984). Functional form for estimating the Lorenz curve.
Econometrica, 52(5), 1313-1314.
35. G.H. Hardy, J.E. Littlewood and G. Polya. (1929). Some simple inequalities
satisfied by convex functions. The Messenger of Mathematics, 26, 145-153.
36. G.H. Hardy, J.E. Littlewood and G. Polya. (1952). Inequalities. 2nd ed.
Cambridge University Press.
37. R. Herrerias, F. Palacios and J. Callejon. (2001). Las curvas de Lorenz y el
sistema de Pearson. Published in Aplicaciones estadisticas y economicas de
los sistemas de funciones indicadoras (R. Herrerias, F. Palacios and J.
Callejon, eds.). Univ. Granada, 135-151.
38. J.C. Houghton. (1978). Birth of a parent: The Wakeby distribution for
modelling flood flows. Water Resources Research, 14, 1105-1109.
39. J. Iritani and K. Kuga. (1983). Duality between the Lorenz curves and the
income distribution functions. Economic Studies Quarterly, 23, 9-21.
40. N.C. Kakwani. (1980). Income Inequality and Poverty. Methods of
Estimation and Policy Applications. Oxford University Press.
41. N.C. Kakwani and N. Podder. (1973). On the estimation of Lorenz curves
from grouped observations. International Economic Review, 14(2), 278-
291.
42. J. Karamata. (1932). Sur une inegalite relative aux fonctions convexes.
Publ. Math. Univ. Belgrade, 1, 145-148.
43. M. Kendall and A. Stuart. (1977). The Advanced Theory of Statistics, 1,
4th ed. C. Griffin. London.
44. S. Kuznets. (1953). Share of upper income groups in income and savings.
National Bureau of Economic Research. New York.
45. M.O. Lorenz. (1905). Methods of measuring the concentration of wealth.
Journal of the American Statistical Association, 9, 209-219.
46. A.W. Marshall and I. Olkin. (1979). Inequalities: Theory of Majorization
and its Applications. New York: Academic Press.
47. F. Mehran. (1976). Linear measures of income inequality. Econometrica,
44, 805-809.
48. P. Moyes. (1987). A new concept of Lorenz domination. Economics
Letters, 23, 203-207.
49. R.F. Muirhead. (1903). Some methods applicable to identities and
inequalities of symmetric algebraic functions of n letters. Proceedings of
Edinburgh Mathematical Society, 21, 144-157.
50. P. Muliere and M. Scarsini. (1989). A note on stochastic dominance and
inequality measures. Journal of Economic Theory, 49, 314-323.
51. F. Nygard and A. Sandstrom. (1981). Measuring Income Inequality.
Stockholm: Almqvist and Wiksell International.
52. A.M. Ostrowski. (1952). Sur quelques applications des fonctions convexes
et concaves au sens de I. Schur. Journal of Math. Pures Appl., 9, 253-292.
53. V. Pareto. (1897). Cours d'Economie Politique. Rouge. Lausanne.
54. J.B. Pena (Dir.), F.J. Callealta, J.M. Casas, A. Merediz and J.J. Nunez.
(1996). Distribucion Personal de la Renta en Espana. Piramide. Madrid.
55. G. Pietra. (1914-15). Delle relazioni tra gli indici di variabilita. Note I in
Atti del R. Istituto Veneto di Scienze, Lettere ed Arti, LXXIV (II), 775-
804.
56. G. Pietra. (1948). Studi di statistica metodologica. Giuffre. Milan.
57. A.C. Pigou. (1912). Wealth and Welfare. Macmillan. New York.
58. J.S. Ramberg, E.J. Dudewicz, P.R. Tadikamalla and E.F. Mykytra. (1979).
A probability distribution and its uses in fitting data. Technometrics, 21,
201-214.
59. H.M. Ramos, J. Ollero and M.A. Sordo. (2000). A sufficient condition for
generalized Lorenz order. Journal of Economic Theory, 90, 286-292.
Inequality Measures, Lorenz Curves and Generating Functions 219
60. H.M. Ramos and M.A. Sordo. (2001). El orden de Lorenz generalizado de
orden j, ¿un orden en desigualdad? Estudios de Economia Aplicada, 19,
139-149.
61. H.M. Ramos and M.A. Sordo. (2003). Dispersion measures and dispersive
orderings. Statistics and Probability Letters, 61, 123-131.
62. J. Rawls. (1972). A Theory of Justice. London: Oxford University Press.
63. J. Ruiz-Castillo. (1986). Problemas conceptuales en la medicion de la
desigualdad. Hacienda Publica Espanola, 101, 17-31.
64. J. Ruiz-Castillo. (1987). La medicion de la pobreza y de la desigualdad en
Espana, 1980-81. Estudios Economicos, 42. Servicio de Estudios del Banco
de Espana. Madrid.
65. J.M. Sarabia, E. Castillo and D. Slottje. (1999). An ordered family of
Lorenz curves. Journal of Econometrics, 91, 43-60.
66. J.M. Sarabia, E. Castillo and D. Slottje. (2002). Lorenz ordering between
McDonald's generalized functions of the income size distribution.
Economics Letters, 75, 265-270.
67. I. Schur. (1923). Über eine Klasse von Mittelbildungen mit Anwendungen auf
die Determinantentheorie. Sitzungsberichte der Berliner Mathematischen Gesellschaft, 22, 9-20.
68. R.R. Schutz. (1951). On the measurement of income inequality. American
Economic Review, 41, 107-122.
69. A.K. Sen. (1973). On Economic Inequality. Oxford: Clarendon Press.
70. A.K. Sen and J.E. Foster. (1997). On Economic Inequality. Expanded
edition. Clarendon Press Paperbacks and Oxford University Press.
71. A. Shorrocks. (1983). Ranking income distributions. Economica, 50, 3-18.
72. A. Shorrocks and J.E. Foster. (1987). Transfer sensitive inequality
measures. Review of Economic Studies, 54, 485-497.
73. H. Wold. (1935). A study of the mean difference, concentration curves and
concentration ratio. Metron, 12, 39-58.
74. I. Zubiri. (1985). Una introduccion al problema de la medicion de la
desigualdad. Hacienda Publica Espanola, 95, 291-317.
Chapter 12
EXTENDED WARING BIVARIATE DISTRIBUTION
J. RODRIGUEZ-AVI
Department of Statistics and Operations Research, University of Jaen
Campus Las Lagunillas, B3, Jaen, 23071, Spain
A. CONDE-SANCHEZ
Department of Statistics and Operations Research, University of Jaen
Campus Las Lagunillas, B3, Jaen, 23071, Spain
A.J. SAEZ-CASTILLO
Department of Statistics and Operations Research, University of Jaen
Campus Las Lagunillas, B3, Jaen, 23071, Spain
M.J. OLMO-JIMENEZ
Department of Statistics and Operations Research, University of Jaen
Campus Las Lagunillas, B3, Jaen, 23071, Spain
The aim of this paper is to obtain a bivariate distribution that extends the Bivariate
generalized Waring distribution (BGWD) and that preserves some of its properties, such
as the partition of the variance into three distinguishable components due to randomness,
proneness and liability. Finally, an example in the context of accident theory is included
in order to illustrate the versatility of this new distribution.
1. Introduction
Accident theory has been the object of numerous studies that have tried to
develop hypotheses to interpret the causes of an accident.
Among them, the idea of accident proneness has stimulated many interesting
statistical theories. One important contribution in this direction is the
"proneness-liability" model proposed by Irwin [1] and Xekalaki [5], giving rise
to a three-parameter discrete distribution, the univariate generalized Waring
distribution (UGWD), with probability generating function (p.g.f.) given by the
Gauss hypergeometric function:
However, there is a problem arising from the fact that the UGWD is
symmetrical in the parameters a and k and, hence, distinguishable estimates for
non-random components cannot be obtained.
Moreover, it is observed that the UGWD belongs to the family of Gaussian
hypergeometric distributions, GHD (Kemp and Kemp [2]). Thus, Rodriguez
et al. [4] have considered an extension of this distribution, introducing a
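parameter $\lambda$, $0 < \lambda \le 1$, in such a way that the p.g.f. is given by:
$$G(t) = \frac{{}_2F_1(\alpha, \beta; \gamma; \lambda t)}{{}_2F_1(\alpha, \beta; \gamma; \lambda)}, \qquad \alpha, \beta, \gamma > 0, \; 0 < \lambda \le 1. \qquad (4)$$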
This distribution, denoted by $GHD(\alpha, \beta, \gamma, \lambda)$, may also be obtained as a
mixture of a Poisson distribution with a Gamma and a generalized Beta
distribution, so that the partition of the variance still holds, and data
that cannot be adequately fitted by the UGWD are successfully modeled by the
proposed distribution. However, the two non-random variance components
cannot be separately estimated either.
Xekalaki [6] proposed a solution of this problem dividing the whole period
of observation into two non-overlapping sub-periods and then studying the
resulting bivariate accident distribution. Following a similar process to the
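univariate case, this distribution, which she called the bivariate generalized Waring
distribution (BGWD), has a p.g.f. generated by Appell's $F_1$ hypergeometric
function:
$$G(t_1, t_2) = \frac{(\rho)_{k+m}}{(a+\rho)_{k+m}} F_1(a; k, m; a+k+m+\rho; t_1, t_2), \qquad (5)$$
where
$$F_1(\alpha; \beta, \beta'; \gamma; t_1, t_2) = \sum_{x=0}^{\infty} \sum_{y=0}^{\infty} \frac{(\alpha)_{x+y} (\beta)_x (\beta')_y}{(\gamma)_{x+y}} \frac{t_1^x t_2^y}{x!\, y!}, \qquad (6)$$
with $a, k, m, \rho > 0$.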
Then, the accident distribution in the whole period is also a UGWD, like in
each one of the sub-periods considered. Moreover, in this situation it is possible
to distinguish the non-random components in the partition of the variance. In
Kocherlakota and Kocherlakota [3] some of the most interesting properties of
the UGWD are listed.
Our aim is to obtain a bivariate distribution that extends the BGWD by
introducing a parameter $\lambda$, but without losing its excellent properties, so that it can
be used in fields such as accident theory. Thus, distinguishable estimates for the
two non-random variance components are obtained and, moreover, the fits achieved
by the BGWD are improved.
This means that the number of accidents in each period has a Poisson
distribution, both independent.
• Liability parameters have two independent Gamma distributions:
$$\Lambda_1 \mid P = p \to \mathrm{Gamma}(\beta_1, v), \qquad \Lambda_2 \mid P = p \to \mathrm{Gamma}(\beta_2, v), \qquad (8)$$
-— (9)
Axj^i^yy^^^Q-Mi-pV^iMi-p))^. (ii)
p
x\ v!
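2. (X,Y) is an extended bivariate Waring distribution (from now on EBWD)
with p.m.f.:
$$f_{(X,Y)}(x, y) = f_0 \frac{(\alpha)_{x+y} (\beta_1)_x (\beta_2)_y}{(\gamma)_{x+y}} \frac{\lambda^{x+y}}{x!\, y!}. \qquad (12)$$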
/((X,Y)\P,p(x,y)
,x ,y i $ - l - / , / u . A - l - / , / t i
x\y\ \v) \v + \,
JC!J! vw + U \v + \
(l-A(l-p))A + A (,1(1-/,))*+>.
x!j>!
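2. Firstly, we note that since
$$F_1(\alpha; \beta_1, \beta_2; \gamma; \lambda, \lambda) = \frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\gamma-\alpha)} \int_0^1 \frac{p^{\gamma-\alpha-1} (1-p)^{\alpha-1}}{(1-\lambda(1-p))^{\beta_1+\beta_2}} \, dp, \qquad (15)$$
the function in Eq. (10) is a density one. Then,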
W * . y H „'° x\y\
. , * "0-/0 '•Fl{a;puP2;y-A,X)
X
r, y ya'X^-pf-'dp
.(fl)x(fl), A"" rOQ
x\y\ Fx(a;px,P2;y;A,A) Y{a)T{y-a)
xjy-»-\\-Py+y+a-ldp (16)
,(#),(&), Ax+y T{y)
x\y\ Fx{a;P,P2;y-A,X) T{a)T{y-a)
yT{y-a)T{x + y + a)
r(x + y + y)
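$$= \frac{1}{F_1(\alpha; \beta_1, \beta_2; \gamma; \lambda, \lambda)} \frac{(\alpha)_{x+y} (\beta_1)_x (\beta_2)_y \lambda^{x+y}}{(\gamma)_{x+y}\, x!\, y!}.$$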
It can be observed that if $\lambda = 1$ the expressions in Eqs. (10), (11), (12) and
(13) reduce to those deduced by Xekalaki [6].
3. Properties of the EBWD
In this section we show some of the properties of the EBWD. Firstly, the
p.g.f. is given by:
$$g(t_1, t_2) = f_0 F_1(\alpha; \beta_1, \beta_2; \gamma; \lambda t_1, \lambda t_2), \qquad (17)$$
which is convergent for $|t_1| \le 1$, $|t_2| \le 1$ if $\lambda \le 1$ (with $\gamma > \alpha + \beta_1 + \beta_2$ in the case
$\lambda = 1$).
The probabilities may be obtained recursively, since this distribution,
like the BGWD, belongs to Pearson's system. The p.m.f., $f_{rs}$, satisfies
the following system of difference equations:
$$(\gamma + r + s)(r + 1) f_{r+1,s} - \lambda(\alpha + r + s)(\beta_1 + r) f_{rs} = 0,$$
$$(\gamma + r + s)(s + 1) f_{r,s+1} - \lambda(\alpha + r + s)(\beta_2 + s) f_{rs} = 0.$$
So, if the normalizing constant, $f_{00} = f_0$ given in Eq. (13), is known, the
remaining probabilities are obtained. When $\lambda = 1$ this constant may be computed
exactly from the Gauss summation theorem:
$$f_0^{-1} = F_1(\alpha; \beta_1, \beta_2; \gamma; 1, 1) = {}_2F_1(\alpha, \beta_1+\beta_2; \gamma; 1) = \frac{\Gamma(\gamma)\Gamma(\gamma-\alpha-\beta_1-\beta_2)}{\Gamma(\gamma-\beta_1-\beta_2)\Gamma(\gamma-\alpha)}.$$
In the general case, the value of this constant is computed by
approximation.
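The recursive scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name ebwd_pmf, the truncation point max_count and the parameter values in the usage note are hypothetical and are not taken from the chapter, and the normalizing constant is approximated simply by truncating the double sum.

```python
def ebwd_pmf(alpha, beta1, beta2, gamma, lam, max_count=60):
    """Approximate EBWD probabilities f[r][s] for 0 <= r, s <= max_count.

    The difference equations are applied to an unnormalised table that
    starts from 1, and the table is then divided by its (truncated) sum,
    which stands in for the normalizing constant f_0.
    """
    n = max_count + 1
    f = [[0.0] * n for _ in range(n)]
    f[0][0] = 1.0
    for s in range(n):
        if s > 0:
            # second equation with r = 0: step from f[0][s-1] to f[0][s]
            f[0][s] = (f[0][s - 1] * lam * (alpha + s - 1) * (beta2 + s - 1)
                       / ((gamma + s - 1) * s))
        for r in range(n - 1):
            # first equation: step from f[r][s] to f[r+1][s]
            f[r + 1][s] = (f[r][s] * lam * (alpha + r + s) * (beta1 + r)
                           / ((gamma + r + s) * (r + 1)))
    total = sum(sum(row) for row in f)
    return [[value / total for value in row] for row in f]
```

For instance, ebwd_pmf(2.0, 1.5, 0.8, 6.0, 0.5) returns a 61 x 61 table of probabilities. The truncation is adequate only when the neglected tail is negligible, which should be checked for values of the parameter close to 1.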
,F,(A+A;r;0 '
where
2FX(CC,PX+P2;Y;X)
f (<*U,(0l)r<A), M
J2r±1
/,/,= — = — , (28)
where $V = \lambda(1-P)/[1-\lambda(1-P)]$ and P has a distribution with the density function
given in Eq. (10).
Concerning X and Y, since both variables are obtained as mixtures, their
variances may be split into three components
$$\sigma_X^2 = Var(X) = \beta_1 E_P(V) + \beta_1 E_P(V^2) + \beta_1^2 Var_P(V),$$
$$\sigma_Y^2 = Var(Y) = \beta_2 E_P(V) + \beta_2 E_P(V^2) + \beta_2^2 Var_P(V),$$
in the same way as the BGWD.
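The three terms can be traced back to the mixing structure by applying the law of total variance twice. The sketch below is a reading aid rather than the authors' derivation; it assumes that, given P, X is Poisson with mean $\Lambda_1$ and that the Gamma distribution of $\Lambda_1$ has shape $\beta_1$ and scale V, so that $E(\Lambda_1 \mid P) = \beta_1 V$ and $Var(\Lambda_1 \mid P) = \beta_1 V^2$.

$$\begin{aligned}
Var(X) &= E[Var(X \mid \Lambda_1)] + Var(E[X \mid \Lambda_1]) = E(\Lambda_1) + Var(\Lambda_1), \\
E(\Lambda_1) &= E_P[E(\Lambda_1 \mid P)] = \beta_1 E_P(V), \\
Var(\Lambda_1) &= E_P[Var(\Lambda_1 \mid P)] + Var_P[E(\Lambda_1 \mid P)] = \beta_1 E_P(V^2) + \beta_1^2 Var_P(V).
\end{aligned}$$

Under these assumptions the first term is the purely random (Poisson) component and the last two are the non-random components. The same argument with $\beta_2$ gives the expression for Var(Y).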
4. Applications
To conclude, we consider data about the number of driver accidents in
Connecticut (Xekalaki [6]).
The parameters are estimated by the maximum likelihood method because
the method of moments does not provide good estimates. Then, the log-
likelihood function, whose expression is
+ lnA£(x,+^-2tax,!-2lnj/1.!,
;=i i=i 1=1
than the internal factors or proneness in the explanation of the behavior of the
number of accidents. It should be pointed out that even though the BGWD and
the EBWD are different, the values obtained for the variance components are
very similar to those obtained by Xekalaki, so it seems that both models
coincide in the explanation of the factors that influence the number of accidents.
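Number of accidents in 1934-36 (rows) by number of accidents in 1931-33 (columns).
In each cell, the observed frequency is given above the fitted frequency.

1934-36 \ 1931-33            0            1           2          3         4
0   observed             23881         2117         242         17         2
    fitted          23887.9478    2146.1793    214.6711    23.6536    0.4292
1   observed              2386          419          57          9         3
    fitted           2378.6215     418.1887     61.6481     8.9106    0.2410
2   observed               275           64          12          5         1
    fitted            260.5670      67.5159     13.0563     2.3224    0.0874
3   observed                22            5           2          2         0
    fitted             31.1481      10.5873      2.5195     0.5297    0.0264
4   observed                 5            4           0          1         0
    fitted              4.0282       1.6850      0.4739     0.1145    0.0073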
References
1. J.O. Irwin. (1968). The generalized Waring distribution applied to accident
theory. Journal of the Royal Statistical Society, Series A, 131, 205.
2. A.W. Kemp and C.D. Kemp. (1975). Models for Gaussian hypergeometric
distributions. Statistical Distributions in Scientific Work, 1, 31.
3. S. Kocherlakota and K. Kocherlakota. (1992). Bivariate Discrete
Distributions. Marcel Dekker.
4. J. Rodriguez-Avi, A. Conde-Sanchez, M.J. Olmo-Jimenez and A.J. Saez-
Castillo. (2004). Properties and applications of the family of Gaussian
discrete distributions. Proceedings of the International Conference on
Distribution Theory, Order Statistics and Inference in Honour of Barry C.
Arnold, Santander, Spain.
J.M. PEREZ-SANCHEZ
Department of Quantitative Methods in Economics
University of Granada, 18071-Granada, Spain
J.M. SARABIA-ALEGRIA
Department of Economics, University of Cantabria, 39005-Santander, Spain
E. GOMEZ-DENIZ
Department of Quantitative Methods in Economics
University of Las Palmas de Gran Canaria, 35017-Las Palmas de G.C., Spain
F.J. VAZQUEZ-POLO
Department of Quantitative Methods in Economics
University of Las Palmas de Gran Canaria, 35017-Las Palmas de G. C. Spain
In a standard Bayesian model, a prior distribution is elicited for the structure parameter in
order to obtain an estimate of this unknown parameter. The hierarchical model is a two
way Bayesian one which incorporates a hyperprior distribution for some of the
hyperparameters of the prior. In this way and under the Poisson-Gamma-Gamma model,
a new distribution is obtained by computing the unconditional distribution of the random
variable of interest. This distribution seems to provide a better fit to the data, given a
policyholders' portfolio. Furthermore, Bayes premiums are thus obtained under a bonus-
malus system and solve some of the problems of surcharges which appear in these
systems when they are applied in a simple manner.
1. Introduction
From the Bayesian standard model point of view, a structure parameter
follows a prior distribution. A hierarchical model is a two way Bayesian model
which incorporates a hyperprior distribution for some of the hyperparameters of
the prior. A new distribution is obtained by computing the unconditional
distribution of the random variable of interest if the Poisson-Gamma-Gamma
model is used. This distribution provides a better fit to the data. The hierarchical
approach reflects a different statistical perspective on how to model the expert's
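$$\pi(\theta \mid x) = \frac{\iiint f(x \mid \theta, F)\, \pi_1(\theta \mid \lambda, G)\, \pi_2(\lambda, F, G)\, d\lambda\, dF\, dG}{\iiiint f(x \mid \theta, F)\, \pi_1(\theta \mid \lambda, G)\, \pi_2(\lambda, F, G)\, d\lambda\, dF\, dG\, d\theta}.$$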
It is of great interest to estimate the posterior mean $E(\theta_i \mid x)$ and the
variance $E(\theta_i^2)$. However, it is possible that the posterior distribution of $\lambda$, F
and G is in our range of interest. In this case, we need to compute:
This model was introduced by Lindley and Smith [1]. More recently,
Klugman [7] analyzed the normal-normal hierarchical structure from the
Bayesian point of view. Cano [8] applied this methodology to study the
Bayesian robustness of the model.
However, a continuous distribution is clearly inappropriate for frequency
counts. For severe or total losses, the distribution places probability in negative
numbers and so the Poisson and negative binomial are much more commonly
used.
The rest of this paper is structured as follows: Section 2 analyzes a
hierarchical Bayesian structure, the Poisson-Gamma-Gamma model. In Section
3 we use this model to compute premiums under a bonus-malus system. Section
4 applies the above results to an actuarial example. Finally, section 5 contains a
discussion of related work.
2. Inference procedure
In this section, the hierarchical Bayesian Poisson-Gamma-Gamma model is
studied. In this case, the hierarchical model is a two way Bayesian standard
model which is built in the following way:
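Firstly, we have the model depending on an unknown parameter $\theta$,
$f(x \mid \theta)$. We assume a Poisson distribution, i.e.,
$$f(x \mid \theta) \sim P(\theta). \qquad (3)$$
Secondly, parameter $\theta$ follows a prior distribution which is assumed to be a
Gamma distribution. Then:
$$\pi_1(\theta \mid a, b) \sim G(a, b), \qquad a, b > 0. \qquad (4)$$
$$\pi_1(\theta \mid a, \alpha, \beta) = \int_0^\infty \pi_1(\theta \mid b)\, \pi_2(b)\, db
= \int_0^\infty \frac{b^a}{\Gamma(a)} \theta^{a-1} e^{-b\theta}\, \frac{\beta^\alpha}{\Gamma(\alpha)} b^{\alpha-1} e^{-\beta b}\, db
= \frac{1}{B(a, \alpha)\, \beta} \frac{(\theta/\beta)^{a-1}}{(1 + \theta/\beta)^{a+\alpha}}. \qquad (6)$$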
BMS, where the variance of the observed data is generally greater than the mean
(Shengwang et al. [12]).
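The closed form in Eq. (6) can be checked numerically. The sketch below is illustrative only: it assumes that both Gamma distributions are parametrized by shape and rate, the function names are hypothetical, and the integral over b is approximated by a plain midpoint rule. The parameter values are the method-of-moments estimates quoted in the numerical example of this chapter.

```python
import math

def gamma_pdf(t, shape, rate):
    """Gamma density with the given shape and rate, evaluated at t > 0."""
    return math.exp(shape * math.log(rate) - math.lgamma(shape)
                    + (shape - 1.0) * math.log(t) - rate * t)

def beta2_prior(theta, a, alpha, beta):
    """Closed form of Eq. (6), a second-kind beta density in theta."""
    log_beta_fn = math.lgamma(a) + math.lgamma(alpha) - math.lgamma(a + alpha)
    u = theta / beta
    return math.exp((a - 1.0) * math.log(u) - (a + alpha) * math.log1p(u)
                    - log_beta_fn) / beta

def mixed_prior(theta, a, alpha, beta, steps=20000):
    """Midpoint-rule integral of G(theta; a, b) * G(b; alpha, beta) over b."""
    shape, rate = a + alpha, theta + beta
    # the integrand is proportional to a Gamma(shape, rate) density in b,
    # so this upper limit leaves a negligible tail
    upper = (shape + 12.0 * math.sqrt(shape) + 30.0) / rate
    width = upper / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * width
        total += gamma_pdf(theta, a, b) * gamma_pdf(b, alpha, beta)
    return total * width

a, alpha, beta = 3.25585, 6.13732, 0.159492  # estimates quoted in Section 4
prior_mean = a * beta / (alpha - 1.0)
prior_var = (a * (a + alpha - 1.0) * beta ** 2
             / ((alpha - 1.0) ** 2 * (alpha - 2.0)))
# mean and variance of the number of claims implied by the model
print(prior_mean, prior_mean + prior_var)
```

Under the shape-rate assumption, the two printed values come out close to the sample mean and variance reported in Section 4 (0.1011 and 0.1074), as expected for estimates obtained by the method of moments.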
The following proposition gives the posterior distribution of $\theta$ under the
hierarchical Bayesian model.
Proposition 1 The posterior distribution of $\theta$ given the data x in the
hierarchical Poisson-Gamma-Gamma model is given by
where
3. Experience rating
To illustrate our approach, we apply the results obtained for computing
premiums under a BMS. This is a merit rating method used in automobile
insurance where the number of claims modifies the premium. A model often
used for experience rating in a BMS assumes that each individual risk has its
own Poisson distribution for a number of claims, assuming that the mean
number of claims is distributed across individual policyholders (Coene and
Doray, [10]; Corlier et al, [2]; Lemaire, [3], [6], [11]).
A bonus-malus premium (BMP) can be computed under the variance
principle (Gomez and Vazquez, [13]) in the same way as Lemaire [3] built a
BMP under the net principle. In this sense, we have:
\ a + \)2K<<X\x)dX f (X + \)x(X)dX
J
PBH'HX,0= J -r 5 (io)
A+B+C
'{x,f) = K (11)
D + C ''
where
A = P2{a + x + \)(a + x)11{a + x + 2,x-a + \,pt),
B = 2j3(a + x)ll(a + x + l,x + a + 2,pt),
C = V.(a + x,x-a + l,0t),
aP a{a + a-\)p2 a2p2
K = +1
(a-iy(a-2) (cc-1)2
and
fa + \fn(X I x)dX = \A2JI(A I JC)<M + 2 f A^(A | x)dX +1
JA JA JA
Although we do not have a perfect closed form for this BMP, its
computation is simple by using, for example, MATHEMATICA software,
because the confluent hypergeometric function is tabulated.
4. Numerical example
In this section, the results obtained in the preceding sections are illustrated
with an example from Lemaire [3], which represents the claims made by
policyholders of a Belgian insurance company during four periods.
Figure 1 shows the distribution for the number of claims, which provides a
fairly good fit, accepted by the $\chi^2$ test of goodness of fit.
The mean and variance of this distribution are 0.1011 and 0.1074,
respectively. The parameters of the structure function were estimated by
applying the method of moments. The estimated parameters are $\hat{a} = 3.25585$,
$\hat{\alpha} = 6.13732$ and $\hat{\beta} = 0.159492$.
Applying a Bayesian Hierarchical Model in Actuarial Science 239
The results are illustrated in Table 1, which shows the BMP for the
hierarchical structure considered (in bold) and the BMP for the standard
Bayesian methodology.
[Figure 1. Observed and adjusted frequencies by number of claims.]
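Table 1. BMP by period t and number of claims x. In each cell, the upper value is the
standard Bayesian BMP and the lower value is the hierarchical BMP (in bold in the original).

                              x
t                     0        1        2        3
1   standard      0.994    1.050    1.105    1.161
    hierarchical  0.993    1.048    1.131    1.265
2   standard      0.998    1.041    1.094    1.146
    hierarchical  0.988    1.036    1.104    1.202
3   standard      0.984    1.033    1.083    1.133
    hierarchical  0.984    1.027    1.086    1.164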
It is clear from Table 1 that the relative premiums allow the transition rules
discussed above. For example, a policyholder has to pay 1.104 monetary units
in the second period because of his/her two previous claims. In the next period,
the policyholder will have to pay 1.164 monetary units if he/she makes a claim.
However, the premium will decrease to 1.086 monetary units if he/she does not
make a claim. This behaviour is observed for all the premiums, and so we obtain
BMP by using a hierarchical Bayesian model.
Table 2 shows how a hierarchical BMP gives a bonus to good drivers with
respect to standard Bayesian premiums by decreasing their percentage of
penalization for the transition x = 0 → x = 1 and t = 1 → t = 2. However, the
hierarchical structure increases the percentage of penalization for the other
transitions.

Transition in t                  Δx
1 → 2   standard         4.7%    10%      15.2%
        hierarchical     4.3%    11.1%    20.9%
2 → 3   standard         3.5%    8.9%     13.9%
        hierarchical     3.9%    9.3%     17.1%
5. Conclusions
In this article we review some aspects of the hierarchical Bayesian models
and emphasize the Poisson-Gamma-Gamma model because of its practical use
in actuarial science. In order to model the number of claims of a BMS, we use a
hierarchical structure in which the second-kind beta distribution arises as the
hyperprior distribution.
The model poses no additional complications, as many of its positive
properties can be deduced analytically. The model can be applied
straightforwardly to actuarial premium-setting problems, and we show that these
premiums follow the transition rules of BMS. These transition rules allow
policyholders who make claims to be surcharged and a bonus to be given to those who do not.
In order to check the prior distribution, we can carry out a Bayesian
robustness analysis of the premiums in the same way as Gomez and Vazquez
[13]. These authors studied the sensitivity of a BMS from a standard Bayesian
point of view. In the hierarchical setting, a Bayesian robustness analysis can be
carried out in the same way as in Cano [8], where the normal-normal
hierarchical model is analyzed.
References
1. D.V. Lindley and F.M. Smith. (1972). Bayes estimates for the linear model.
Journal of the Royal Statistical Society B, 34, 1-41.
2. F. Corlier, J. Lemaire and D. Muhokolo. (1979). Simulation of an
Automobile Portfolio. Essays in the Economic Theory of Risk and
Insurance, 11, 40-46.
F. ABAD-MONTES
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
M.D. HUETE-MORALES
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
M. VARGAS-JIMENEZ
Dpto. Estadistica e Investigacion Operativa, Universidad de Granada
C/Fuentenueva, s/n, Granada, Espana
1. Introduction
It is frequently necessary to determine the density function of certain data
sets, especially when such data present characteristics which, a priori, cannot be
assumed to behave like standard probability models. The experience of
2. Data
We took the rHP residuals derived from the results of fitting H-P curves to
the mortality rates, qx, observed for ages 0 to 84 years for the population of
Andalusia for the period 1976-2002.
[Figures: H-P residuals plotted against period (1975-2000) and against age, together with the mean of the residuals at each period and the variance of the residuals at each period.]
The top left figure shows that the assumption that the distribution of the
residuals presents an approximately zero mean for each age is unlikely to be
fulfilled.
It can be seen that the curves are not fitted in the same way for every age; at
some (60-80 years), the figures show the residuals to be systematically negative.
Another noteworthy aspect is the diversity in the variability.
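$$\sum_{i=1}^{n} \left[ y_i - \left( \beta_0 + \beta_1 (x_i - x) + \beta_2 (x_i - x)^2 + \dots + \beta_p (x_i - x)^p \right) \right]^2 k\!\left( \frac{x_i - x}{b} \right) \qquad (5)$$
Assuming that the weights matrix at a point x is
$$W(x) = \mathrm{Diag}\left\{ k\!\left( \frac{x_i - x}{b} \right) \right\} \qquad (6)$$
and that the matrix X evaluated at the point x is
$$X(x) = \begin{pmatrix} 1 & x_1 - x & \dots & (x_1 - x)^p \\ \vdots & \vdots & & \vdots \\ 1 & x_n - x & \dots & (x_n - x)^p \end{pmatrix}, \qquad (7)$$
the value fitted in x is the first term (corresponding to the intercept) of the vector
solution by weighted least squares:
$$\left( X(x)' W(x) X(x) \right)^{-1} X(x)' W(x)\, y. \qquad (8)$$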
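For the local linear case (p = 1) the intercept in Eq. (8) has a simple closed form, which the following sketch evaluates directly. It is a minimal illustration, assuming a Gaussian kernel for k; the function name local_linear_fit and the argument name bandwidth (the b of Eq. (6)) are hypothetical.

```python
import math

def local_linear_fit(xs, ys, x0, bandwidth):
    """Intercept of the kernel-weighted straight-line fit centred at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(xs, ys):
        d = xi - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # first component of (X'WX)^{-1} X'Wy in the 2 x 2 case p = 1
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)
```

Because the regressors are centred at the evaluation point, only the intercept is needed. When the data lie exactly on a straight line, the function returns the value of that line at x0 for any bandwidth.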
c) The results obtained from the definition of a curve as a linear combination of
baseline functions that constitute powers of x.
The splines method defines a curve in terms of linear combinations of
functions of powers of x that constitute a base. These are made up of polynomial
fragments that are defined in regions which are separated by knots or cutoff
points $a_1, \dots, a_K$.
This method may be considered an extension of standard linear regression.
Under linear regression, the estimated values derived from a polynomial
expression in x are obtained by
$$\hat{y} = X(X'X)^{-1}X'y = Hy \qquad (9)$$
where X is the $n \times (p+1)$ matrix used to fit a p-type polynomial, the columns
of which form the base $\{1, x, x^2, \dots, x^p\}$, which is evaluated at the n points of
the sample.
The structure of the linear model can be generalised for the treatment of
non-linear, more complex structures, by including new functions in the above
base to represent truncated polynomials. For example, the p-type spline with K
knots at $a_k$ has the following parametric expression:
$$M(x) = \beta_0 + \beta_1 x + \dots + \beta_p x^p + \sum_{k=1}^{K} \alpha_k (x - a_k)_+^p \qquad (10)$$
where the truncated polynomial term is
$$(x - a_k)_+^p = \begin{cases} (x - a_k)^p & \text{for } x > a_k \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$
The spline has the base functions $\{1, x, \dots, x^p, (x - a_1)_+^p, \dots, (x - a_K)_+^p\}$. In total there are
K+p+1 base functions, and this is described as a p-type truncated power base of
the spline model.
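The truncated power base is easy to evaluate directly. The short sketch below returns the K+p+1 base functions at a single point; the function name and the knot values in the usage note are hypothetical.

```python
def truncated_power_basis(x, knots, p):
    """Values of {1, x, ..., x^p, (x-a_1)_+^p, ..., (x-a_K)_+^p} at x."""
    powers = [x ** j for j in range(p + 1)]
    truncated = [(x - a) ** p if x > a else 0.0 for a in knots]
    return powers + truncated
```

For example, with p = 3 and knots at 1.0 and 3.0, evaluating at x = 2.0 gives six values: the four powers 1, 2, 4, 8, then 1 for the first knot (already passed) and 0 for the second (not yet reached).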
For any set of knots, the curve can be estimated by least squares using
multiple regression on the base functions evaluated at the n values observed
in X.
Fitting the Heligman and Pollard Curve to Mortality Data 251
One base that is widely used is that of cubic splines, specifically, a series of
cubic polynomials grouped around certain values of x (the knots), $\{a_j\}$, such
that the curve is continuous and has continuous first and second derivatives. Each
spline is a third-degree polynomial function over the interval $[a_j, a_{j+1}]$.
The dispersion or scatterplot diagram may sometimes suggest the
approximated location of the knots, being the points where the curve seems to
cross the trend line. The greater the number of knots, the greater the flexibility
of the curve. Nevertheless, an excessive number of knots may give an
impression of random fluctuations in the curve, thus obscuring the mean trend.
When there are many knots, and it is not straightforward to reduce this
number, their influence can be restricted by adopting a specific criterion, such as
the following:
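$$\sum_{k=1}^{K} \alpha_k^2 < \text{constant} \qquad (12)$$
In this case, rather than minimizing
$$\left\| Y - X \begin{pmatrix} \beta \\ \alpha \end{pmatrix} \right\|^2 \qquad (13)$$
we seek the solution to
$$\min_{\beta, \alpha} \; \left\| Y - X \begin{pmatrix} \beta \\ \alpha \end{pmatrix} \right\|^2 + \lambda\, (\beta', \alpha')\, D \begin{pmatrix} \beta \\ \alpha \end{pmatrix} \qquad (14)$$
where D is the diagonal matrix in which the first p+1 elements are null and the
rest are ones. The solution is given by
$$\hat{y} = X(X'X + \lambda D)^{-1} X'y = S_\lambda y \qquad (15)$$
$S_\lambda$ is termed a smoother matrix.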
If lambda is zero, the case is unrestricted. If the knots cover the range of
values of $x_i$ reasonably well, the fit approaches the interpolation of the data. A
very large value of lambda weakens the influence of the knots and the fit is
smoother. As the effect of the knots decreases, the results are closer to a
standard parametric regression, the shape of which depends on the degree of the
spline.
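The effect of lambda can be explored with a small penalised least squares solver. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses the truncated power base, adds lambda to the diagonal entries of X'X that correspond to the knot terms, and solves the resulting system by Gaussian elimination with partial pivoting. All function names are hypothetical.

```python
def spline_basis(x, knots, p):
    """Truncated power base of degree p evaluated at x."""
    return ([x ** j for j in range(p + 1)]
            + [(x - a) ** p if x > a else 0.0 for a in knots])

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def penalised_spline_coefficients(xs, ys, knots, p, lam):
    """Coefficients (beta, alpha) solving (X'X + lam * D) c = X'y."""
    design = [spline_basis(x, knots, p) for x in xs]
    q = p + 1 + len(knots)
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(q)]
           for i in range(q)]
    for j in range(p + 1, q):
        xtx[j][j] += lam  # D penalises only the knot coefficients
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(q)]
    return solve(xtx, xty)
```

With lam = 0 the fit is ordinary least squares on the spline base. As lam grows, the knot coefficients shrink towards zero and the fit approaches the polynomial of degree p, which is the behaviour described in the text.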
In practice, we seek a lambda that produces a curve that is reasonably close
to the data but which eliminates the superfluous variability. In general, logically,
a spline of the order of p=3, for example, is more flexibly adapted to the data
than is a linear spline, but if there are many knots and penalised splines are used,
the differences are imperceptible.
for this, which are based on the nature of the data, and one of the most
commonly used such procedures is that of cross validation (CV).
The cross validation technique consists of dividing the data set into two
parts: one that is used to estimate the model and another that enables us to make
a prediction. Thus, the values that are used for predicting do not play any part in
the fitting procedure. A particular case consists of reserving a single observation
for predicting, the rest (n-1) being used to estimate the model, in each of the n
partitions created.
Given n values in the response Y: $y_1, \dots, y_n$ and the corresponding predicted
values, $\hat{y}_{-1}, \dots, \hat{y}_{-i}, \dots, \hat{y}_{-n}$, CV is defined as the sum of the squared residuals:
$$CV = \sum_{i=1}^{n} (y_i - \hat{y}_{-i})^2 \qquad (17)$$
where $\hat{y}_{-i}$ is the predicted value of the i-th case, when this case has not
been used to estimate the model.
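The leave-one-out criterion can be written generically, for any fitting rule. In the sketch below the names leave_one_out_cv and mean_predictor are hypothetical, and the mean predictor is a deliberately crude stand-in for the smoother, used only to keep the example short.

```python
def leave_one_out_cv(xs, ys, predict):
    """Sum of squared leave-one-out prediction errors.

    predict(train_x, train_y, x0) must return the fitted value at x0
    using only the training pairs it is given.
    """
    cv = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        cv += (ys[i] - predict(train_x, train_y, xs[i])) ** 2
    return cv

def mean_predictor(train_x, train_y, x0):
    """Crude model: ignores x0 and predicts the mean of the training responses."""
    return sum(train_y) / len(train_y)
```

With responses 1, 2 and 3, the three leave-one-out predictions of the mean predictor are 2.5, 2 and 1.5, so the criterion equals 2.25 + 0 + 2.25 = 4.5. To choose a smoothing parameter, the same function would be evaluated for each candidate value and the value with the smallest CV retained.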
In particular, given a lambda value and the predicted value at $x_i$ in the
non-parametric regression curve, computed without the observation $(x_i, y_i)$, which we
shall denote as $\hat{M}_{\lambda,-i}(x_i)$, then the following definition may be made:
estimating f(x) for a random sample $x_1, \dots, x_n$, where k is a symmetric density
function, for example, the standardised normal function. The value h should be
small enough that excessive smoothing is not produced, thus avoiding the
elimination of significant modes, but not so small as to allow too many random
spikes. A large value would lead to an excessively biased estimate, while a low
one would produce an estimate with too much variability. The choice of h is not
straightforward. Some authors have proposed trying various solutions in
order to determine an optimum value. The method implemented in R is that proposed
by Sheather and Jones (1991).
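A kernel estimate of this kind can be sketched as follows. The code assumes the usual form of the estimator, the average of k((x - x_i)/h)/h over the sample, with the standard normal density as k. The function name is hypothetical, and the bandwidth h is passed in by hand: the Sheather and Jones selector is not implemented here.

```python
import math

def gaussian_kde(sample, h):
    """Return the function x -> (1 / (n h)) * sum_i k((x - x_i) / h)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2)
                          for xi in sample)

    return density
```

Evaluating the returned function on a grid for several values of h shows the trade-off described above: a small h produces a spiky curve, while a large h smooths the modes away.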
The following figures show an initial approximation of the density function.
[Figure 6. Distributions of residuals by periods. Horizontal axis: H-P residual, from -0.02 to 0.02.]
Although the sample size is small, we can see the high degree of similarity
in the pattern of the probability density function in each period, with similar
ranges of variability and similar function shapes.
A graphic examination of the distribution, according to the age of the
subject, reveals patterns that are much more varied.
[Figure: estimated density functions of the H-P residuals for selected ages (panels include Age 64 and Age 69). Horizontal axis: H-P residual, from -0.02 to 0.02.]
The above figure shows various shapes and differing ranges of variability in
the density functions that were estimated for different age values.
is the discrete distribution with a probability 1/n associated with each sample
value. This plays the role of a fitted model when no mathematical shape is
assumed for F.
To proceed with the statistical inference, here we assume a non-parametric
model with a sample of independent and identically distributed observations of
an unknown distribution F. In a parametric model, the estimator has a parametric
distribution, while in the non-parametric situation, we work with an empirical
distribution function. In the methods described below, we make use of
simulation to estimate the quantities of interest. The aim of this is to explore the
sample distribution of the mean and the variance as estimators of the mean value
and the variance of the residual associated with a particular age. The utility of
the bootstrap procedure is greater in cases for which there is no theoretical
knowledge of the distributions of the values.
6.1. Bootstrap
These methods are applied both when the probability models are well
defined and when they are not. One of the greatest proponents of the bootstrap
method of simulation is Efron. Based on the sample data, it is possible to make
an inference regarding certain aspects of the distribution.
Thus it is possible to explore, in a relatively straightforward way, the sample
distribution of the estimator of a parameter, for which we cannot a priori assume
any given model.
Let us assume that the parameter $\theta$ is estimated from the sample $x = (x_1, \dots, x_n)$,
from which we calculate the value of interest t(x). The bootstrap sample
$x^* = (x_1^*, \dots, x_n^*)$ is then obtained by selecting n values with replacement from the
observed sample. For each bootstrap sample, we obtain the corresponding
replica of the statistic $t(x^*)$.
The bootstrap procedure consists of selecting B samples of size n with
replacement from the original sample x, and computing the value $t(x^*)$ for each one
of these.
One of the most interesting values for measuring the accuracy of a statistical
measure in making an inference is the standard error associated with the
estimation. In this context, it is obtained as the standard deviation of the B
replicas of the bootstrap value corresponding to the B samples selected with
replacement.
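$$s.e.(t(x)) = \sqrt{\frac{\sum_{b=1}^{B} \left[ t(x^{*b}) - \bar{t}(x^*) \right]^2}{B-1}} \qquad (21)$$
where
$$\bar{t}(x^*) = \frac{\sum_{b=1}^{B} t(x^{*b})}{B}. \qquad (22)$$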
The bias is estimated as the difference between the mean of the bootstrap
distribution and the value observed in the original sample.
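The standard error and bias estimates described above can be sketched with the Python standard library alone. The function names, the default of 1000 replicates and the fixed seed are assumptions made for the illustration.

```python
import math
import random

def bootstrap_replicates(sample, statistic, n_boot=1000, seed=0):
    """Statistic evaluated on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic([sample[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]

def bootstrap_se_and_bias(sample, statistic, n_boot=1000, seed=0):
    """Standard deviation of the replicates and mean replicate minus t(x)."""
    reps = bootstrap_replicates(sample, statistic, n_boot, seed)
    mean_rep = sum(reps) / n_boot
    se = math.sqrt(sum((t - mean_rep) ** 2 for t in reps) / (n_boot - 1))
    bias = mean_rep - statistic(sample)
    return se, bias

def mean(values):
    """Arithmetic mean, used here as the statistic t."""
    return sum(values) / len(values)
```

Calling bootstrap_se_and_bias(residuals, mean), where residuals is a list holding the residuals observed for one age, would give the two quantities for the mean; any other statistic, such as the standard deviation, can be passed in the same way.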
Here, in particular, we are interested in the mean value of the residuals for
each age value, together with the variance or standard deviation as a measure
of dispersion.
One of our goals is to calculate the approximate distributions of the mean
and the standard deviation of the residuals for different ages. We wish to study
the differences there may be between the behaviour patterns of the residuals
derived from the fits, using a non-parametric analysis, that is, one based on the
pattern of the empirical distribution or the non-parametric estimation of F.
The graphic representation of the distributions of the estimators, in turn,
allows us to see whether the distribution is symmetric or biased. The graphic
representation of the estimate of the probability density function for each age
enables us to make visual comparisons.
The various methods of constructing confidence intervals also constitute a
powerful inferential tool.
where k is the standard normal density function. As observed above, the value h
determines the degree of smoothing of the estimated function, and the selection
of this parameter is more important than that of the k function; its choice is
a crucial element in the estimation process. A value that is too high
could mask possible modes, producing too much smoothing of the shape of the
function. On the other hand, a value that is too low could produce a pattern with
multiple spikes, possibly a chance occurrence. For this type of estimation, it is
recommended that the number of bootstrap samples should be quite large (1000
or more).
[Figure 8. Histogram and density of bootstrap distributions (means and standard deviation: age 54).]
Fitting the Heligman and Pollard Curve to Mortality Data 259
_2> cp -
m
O o
9-
o
o
o
o
CO
o
o
i 1 1 1 1 1 1
Figure 9. Bootstrap distributions (histogram, density, and quantiles of means: age 75)
260 F. Abad-Montes, M.D. Huete-Morales and M. Vargas-Jimenez
Figure 10. Bootstrap distributions of the means of the residuals for various ages (horizontal axis: bootstrap means)
Figure 11. Bootstrap distributions of the standard deviations of the residuals for various ages
The percentile interval: this is obtained from the α/2 and 1-α/2
order quantiles of the bootstrap distribution formed by the B bootstrap values
of the estimate of the parameter in question.
\[ 1-\alpha = \Pr\left( \mathrm{quantile}(t(x^*), \alpha/2) < \theta < \mathrm{quantile}(t(x^*), 1-\alpha/2) \right) \]
Another interval based on percentiles is the so-called basic interval, which
is obtained from
\[ 1-\alpha = \Pr\left[ 2t(x) - \mathrm{quantile}(t(x^*), 1-\alpha/2) < \theta < 2t(x) - \mathrm{quantile}(t(x^*), \alpha/2) \right] \]
For the basic interval, an appropriate transformation, for example the
logarithmic transformation in the estimation of the standard deviation, could
improve the limits to a certain extent. This is in contrast to the percentile
interval, which respects transformations. The two intervals may differ in the
case of asymmetric distributions.
Note: a greater number of bootstrap samples is required than is used
to determine the mean and the standard deviation, because of the need to
estimate the percentiles of the bootstrap distribution. The usual value taken is
B=1000 or more.
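The percentile and basic intervals described above can be sketched as follows. This is an illustrative sketch only: the helper names (quantile, bootstrap_replicates, percentile_interval, basic_interval), the interpolation rule for empirical quantiles and the simulated sample are assumptions, not the procedure used to produce the chapter's results.

```python
import random
import statistics

def quantile(sorted_vals, p):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def bootstrap_replicates(sample, stat, B, rng):
    """Sorted list of B values t(x*) from resampling with replacement."""
    n = len(sample)
    return sorted(stat([rng.choice(sample) for _ in range(n)]) for _ in range(B))

def percentile_interval(reps, alpha):
    """Limits taken directly from the alpha/2 and 1-alpha/2 quantiles."""
    return quantile(reps, alpha / 2), quantile(reps, 1 - alpha / 2)

def basic_interval(t_obs, reps, alpha):
    """Limits 2t(x) - upper quantile and 2t(x) - lower quantile."""
    return (2 * t_obs - quantile(reps, 1 - alpha / 2),
            2 * t_obs - quantile(reps, alpha / 2))

# Hypothetical data: 50 simulated residuals, B = 1000 bootstrap samples.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(50)]
t_obs = statistics.mean(residuals)
reps = bootstrap_replicates(residuals, statistics.mean, 1000, rng)
pct = percentile_interval(reps, 0.10)
bas = basic_interval(t_obs, reps, 0.10)
```

The basic interval is the percentile interval reflected about t(x), so the two coincide only when the bootstrap distribution is symmetric about the observed value.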
Other, improved, versions include:
t-intervals. These are useful for statistics such as the mean (in
general, for statistical measures of location). The idea is to imitate Student's
t statistic to overcome our ignorance of the standard deviation when an inference
is made concerning the mean. These intervals require us to estimate the variance
of the statistic for each bootstrap sample. The interval is based on the
Studentised statistic.
BCa intervals. Intended to correct bias. These, too, are calculated from
percentiles of the distribution of the B bootstrap replicates of the statistic, but
while the percentile intervals directly use the α/2 and 1-α/2 order quantiles to
define the limits of the confidence interval, those employed in BCa are
obtained by first deriving new orders α1 and α2 for the quantiles of the
distribution; the values of these depend on two constants termed acceleration, a,
and bias correction, z0, which are estimated from the bootstrap values (Efron and
Tibshirani, 1993).
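As an illustration of how the adjusted orders α1 and α2 can be obtained, the sketch below uses the usual textbook estimates: z0 from the proportion of bootstrap values below the observed statistic, and the acceleration a from leave-one-out (jackknife) values. The function name bca_levels and its arguments are assumptions made for this example; the chapter does not state which implementation was used.

```python
from statistics import NormalDist, mean

def bca_levels(t_obs, reps, jack_values, alpha):
    """Return the adjusted quantile orders (alpha1, alpha2) for a BCa interval.

    t_obs       : statistic computed on the original sample
    reps        : bootstrap values t(x*)
    jack_values : leave-one-out values of the statistic
    Assumes that some, but not all, bootstrap values lie below t_obs.
    """
    nd = NormalDist()
    # Bias correction z0: normal quantile of the share of replicates below t_obs.
    z0 = nd.inv_cdf(sum(1 for r in reps if r < t_obs) / len(reps))
    # Acceleration a from the skewness of the leave-one-out values.
    centre = mean(jack_values)
    num = sum((centre - j) ** 3 for j in jack_values)
    den = 6.0 * sum((centre - j) ** 2 for j in jack_values) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjust(z):
        return nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))

    return adjust(nd.inv_cdf(alpha / 2)), adjust(nd.inv_cdf(1 - alpha / 2))

# Symmetric toy inputs: z0 = 0 and a = 0, so the orders are left unchanged.
alpha1, alpha2 = bca_levels(0.0, [-2.0, -1.0, 1.0, 2.0], [1.0, 2.0, 3.0], 0.10)
```

When z0 and a are both zero the BCa interval reduces to the percentile interval; non-zero values shift the two orders to compensate for bias and skewness.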
The following results show the confidence intervals for the mean of the
residuals for ages 37, 54 and 75 years.
Table 1. 37 years
Level   Normal               Basic                Percentile           BCa
90%     (-0.0001, 0.0001)    (-0.0001, 0.0001)    (-0.0001, 0.0001)    (-0.0001, 0.0001)
95%     (-0.0001, 0.0001)    (-0.0001, 0.0001)    (-0.0001, 0.0001)    (-0.0001, 0.0001)

Table 2. 54 years
Level   Normal               Basic                Percentile           BCa
90%     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0006)
95%     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0007)

Table 3. 75 years
Level   Normal               Basic                Percentile           BCa
90%     (-0.0021, -0.0008)   (-0.0021, -0.0008)   (-0.0021, -0.0009)   (-0.0021, -0.0009)
95%     (-0.0022, -0.0007)   (-0.0022, -0.0007)   (-0.0022, -0.0008)   (-0.0022, -0.0007)

Table 4. 37 years
Level   Normal               Basic                Percentile           BCa
90%     (0.0001, 0.0002)     (0.0001, 0.0002)     (0.0001, 0.0002)     (0.0001, 0.0002)
95%     (0.0001, 0.0002)     (0.0001, 0.0002)     (0.0001, 0.0002)     (0.0001, 0.0002)

Table 5. 54 years
Level   Normal               Basic                Percentile           BCa
90%     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0007)
95%     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0006)     (0.0003, 0.0007)

Table 6. 75 years
Level   Normal               Basic                Percentile           BCa
90%     (0.0016, 0.0025)     (0.0016, 0.0025)     (0.0014, 0.0024)     (0.0016, 0.0026)
95%     (0.0015, 0.0026)     (0.0015, 0.0026)     (0.0013, 0.0025)     (0.0015, 0.0027)
\[ t^*(x) = \frac{1}{n} \sum_{i=1}^{n} t^*(x_{(i)}) \qquad (24) \]
The variance of the jackknife estimator is obtained in a similar way to that
used to derive the variance of a sample mean.
\[ \frac{\sum_{i=1}^{n} \left[ t^*(x_{(i)}) - t^*(x) \right]^2}{n^2} \]
Jackknife-influence values are established for the n values of the sample as
the differences t(x(-i)) - t(x) for i=1, ..., n.
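The leave-one-out differences t(x(-i)) - t(x) can be computed as in the sketch below. The function names are hypothetical. The variance function uses the standard jackknife normalisation (n-1)/n, which is an assumption of this example and may not match the normalisation of the expression printed above.

```python
import statistics

def leave_one_out(sample, stat):
    """Values t(x(-i)): the statistic recomputed without observation i."""
    return [stat(sample[:i] + sample[i + 1:]) for i in range(len(sample))]

def jackknife_differences(sample, stat):
    """Differences t(x(-i)) - t(x) for i = 1, ..., n."""
    t_full = stat(sample)
    return [t_i - t_full for t_i in leave_one_out(sample, stat)]

def jackknife_variance(sample, stat):
    """Standard jackknife variance: (n-1)/n * sum of squared deviations."""
    n = len(sample)
    loo = leave_one_out(sample, stat)
    centre = sum(loo) / n
    return (n - 1) / n * sum((t - centre) ** 2 for t in loo)

# Hypothetical sample; for the mean, the result equals s^2 / n.
data = [1.0, 2.0, 4.0, 7.0]
diffs = jackknife_differences(data, statistics.mean)
var_mean = jackknife_variance(data, statistics.mean)
```

For the sample mean this variance coincides with the familiar s²/n, which is the sense in which the derivation parallels that of the variance of a sample mean.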
The technique known as "jackknife-after-bootstrap" consists of applying
the jackknife to the results generated by the bootstrap method. One means of
checking or diagnosing the degree of influence of a given observation x_i of the
sample on the value of the statistic t used in the bootstrap is the
jackknife-after-bootstrap figure. This method enables us to detect the changes
produced in the empirical quantiles of t*-t if an observation x_i is eliminated
from the sample.
Specifically, we construct a figure with various quantiles (such as 0.05, 0.10,
0.16, 0.5, 0.84, 0.9, 0.95) that are determined using the bootstrap with all the
values of the original sample and represented by horizontal lines. Each of the n
points x_i of the sample is represented with an abscissa equal to the
corresponding value of empirical influence (for example, the jackknife value
obtained by regression) and with ordinates equal to the differences
between the quantiles obtained with the complete bootstrap simulation and the
quantiles obtained with the simulations from which x_i is absent (see the Note below).
Note: The influence function or influence component can be considered a
type of derivative that reflects the change in t(F) when the distribution F is
subjected to a small contamination at x. These values are useful for determining
the approximate variance of a statistic, taking into account that such a statistic
may be approximated by a first-order Taylor series expansion (for more
information, see Efron and Tibshirani, 1993, pp. 298-302).
where $\hat{\sigma}(x_0)$ is the standard deviation of the value fitted at $x_0$, $\hat{\mu}(x_0)$,
obtained from the square root of the estimated variance,
$V(\hat{\mu}(x_0)) = S_{x_0}' S_{x_0} \sigma^2$, where the row vector of S, termed $S_{x_0}$, defines the
linear combination of values of y such that $\hat{\mu}(x_0) = S_{x_0}' y$.
For a large sample size, the Student t value may be replaced by the normal value.
The degrees of freedom are rounded to the closest integer and
correspond to the residual part of the fitted model. If the errors are not normal
but n is large enough, the intervals given above may continue to be valid,
because of the central limit theorem.
The prediction intervals are also derived in a similar way to the parametric
regression, that is, by means of
[Figure: two panels with horizontal axis "Age"; label "Simulation bootstrap"]
\[ F = \frac{(SSR_1 - SSR_2)/(df_2 - df_1)}{SSR_2/(n - df_2)} \sim F_{df_2 - df_1,\; n - df_2} \qquad (33) \]
where SSR1 and SSR2 are the sums of the squares of the residuals in
Models 1 and 2, respectively, and df1 and df2 are the corresponding degrees of
freedom. Thus, we obtain a significance of the order of 2.2e-16.
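The statistic in (33) is simple to compute once the two fits are available. The sketch below is illustrative only: the function name and the numerical inputs are hypothetical, and the p-value is not computed because it would require the F distribution function from a statistical library.

```python
def nested_f_statistic(ssr1, ssr2, df1, df2, n):
    """F = [(SSR1 - SSR2) / (df2 - df1)] / [SSR2 / (n - df2)].

    Returns the statistic together with its numerator and denominator
    degrees of freedom (df2 - df1, n - df2).
    """
    numerator = (ssr1 - ssr2) / (df2 - df1)
    denominator = ssr2 / (n - df2)
    return numerator / denominator, (df2 - df1, n - df2)

# Hypothetical values: Model 2 uses two more degrees of freedom than Model 1.
f_value, f_df = nested_f_statistic(ssr1=10.0, ssr2=6.0, df1=2, df2=4, n=24)
```

A large value of F relative to the reference distribution indicates that the additional degrees of freedom of Model 2 reduce the residual sum of squares by more than chance alone would explain.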
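\[ \hat{f}(x_i) = \frac{n_i}{n \cdot a} \qquad (43) \]
where n_i is the frequency of the interval containing x_i and a is the
amplitude (width) of the interval.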
To achieve acceptable results, the sample size and the number of intervals
must be large. Although the procedure is of most interest for statistics where it is
more difficult to identify the shape of the density function, especially in cases
where the curve may present various modes and perhaps biased behaviour
patterns, we shall apply it here to show, for example, the distribution of the data
resulting from a bootstrap simulation of the sampling means of the H-P residuals
recorded at 64 years of age.
In total, there were 9999 values of the means of the H-P residuals, from
which we obtained a frequency table of 100 intervals of equal amplitude, of
approximately a=0.00001.
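A minimal sketch of this frequency-table estimate follows, assuming intervals of equal amplitude spanning the range of the data. The function name and the toy values are hypothetical; the chapter's own computation used 9999 bootstrap means and 100 intervals.

```python
def histogram_density(values, n_bins):
    """Density estimate n_i / (n * a) on n_bins intervals of equal amplitude a."""
    lo, hi = min(values), max(values)
    a = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is assigned to the last interval.
        k = min(int((v - lo) / a), n_bins - 1)
        counts[k] += 1
    n = len(values)
    return a, [c / (n * a) for c in counts]

# Hypothetical values split into two intervals of amplitude 0.5.
amplitude, density = histogram_density([0.0, 0.1, 0.2, 0.9, 1.0], 2)
```

Dividing each frequency by n·a makes the estimate integrate to one, since the bar areas are the relative frequencies of the intervals.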
Figure 15. Result of a bootstrap simulation of the sampling means of H-P residuals (64 years): mean and density function
Density functions
Although this technique was first used in the 1930s, it has recently
become popular again as a means of approximating the density function. It is, in
fact, a refinement of the Edgeworth expansion, which is frequently used to
approximate an unknown distribution for which the moments are known. The
Edgeworth expansion gives good results in the centre of the distribution but
sometimes leaves much to be desired in the tails, and can even give negative
values for the density in such zones.
The derivation of the density function and the distribution function is based
on the cumulant generating function K(t) and on its first two derivatives with
respect to t, K'(t) and K''(t). Therefore, it requires the cumulant generating
function to have a known, manageable form, which limits how widely it can be
used in practice. Moreover, it is necessary to solve the so-called saddlepoint
equation numerically for each value of the variable of interest.
The cumulant generating function K(t) of a variable X is given by the
logarithm of the moment generating function m(t).
\[ m(t) = E(e^{tX}) = \int e^{tx} f(x)\,dx \qquad (44) \]
v = t, (49)
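The saddlepoint density function is approximated using:
\[ f_{sad}(x) = \left[ 2\pi K''(\hat{t}) \right]^{-1/2} e^{K(\hat{t}) - \hat{t}x} \qquad (50) \]
where $\hat{t}$ denotes the solution of the saddlepoint equation at x.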
In particular, in the context of resampling with replacement from a sample
X_1, X_2, ..., X_n, where each X_i is selected with probability p_i = 1/n, we can
assume a multinomial distribution with a sum equal to n, given by the variables
(n*_1, n*_2, ..., n*_n) that describe the number of times that (X_1, X_2, ..., X_n)
appear, and the sample mean statistic given by the linear combination
\[ \bar{X}^* = \sum_{i=1}^{n} a_i n_i^* \qquad (51) \]
with a_i = X_i/n, which has a cumulant generating function given by
\[ K(t) = n \log\left( \sum_{i=1}^{n} p_i e^{t a_i} \right) \qquad (53) \]
If the statistic is approximated linearly as
\[ T^* = t + \frac{1}{n} \sum_{i=1}^{n} l_i n_i^* \qquad (54) \]
then T*-t can be expressed as the linear combination of the n*_i with a_i = l_i/n,
where the l_i are the influence values of the statistic.
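For the bootstrap mean, the saddlepoint approximation can be sketched with the multinomial cumulant generating function K(t) = n log(Σ p_i exp(t a_i)), with p_i = 1/n and a_i = X_i/n. The sketch below is an illustration under those assumptions: the function names are hypothetical, the saddlepoint equation K'(t) = x is solved by bisection, and x is assumed to lie strictly between the smallest and largest sample values.

```python
import math

def cgf(t, a):
    """Return K(t), K'(t), K''(t) for K(t) = n*log(sum_i p_i*exp(t*a_i)), p_i = 1/n."""
    n = len(a)
    shift = max(t * ai for ai in a)  # subtracted inside exp to avoid overflow
    w = [math.exp(t * ai - shift) for ai in a]
    s0 = sum(w)
    s1 = sum(wi * ai for wi, ai in zip(w, a))
    s2 = sum(wi * ai * ai for wi, ai in zip(w, a))
    k0 = n * (shift + math.log(s0 / n))
    k1 = n * s1 / s0
    k2 = n * (s2 / s0 - (s1 / s0) ** 2)
    return k0, k1, k2

def solve_saddlepoint(x, a):
    """Solve K'(t) = x by bisection; requires n*min(a) < x < n*max(a)."""
    lo, hi = -1.0, 1.0
    while cgf(lo, a)[1] > x:
        lo *= 2.0
    while cgf(hi, a)[1] < x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cgf(mid, a)[1] < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def saddlepoint_density(x, sample):
    """Saddlepoint density of the bootstrap mean at x."""
    n = len(sample)
    a = [xi / n for xi in sample]
    t_hat = solve_saddlepoint(x, a)
    k0, _, k2 = cgf(t_hat, a)
    return math.exp(k0 - t_hat * x) / math.sqrt(2.0 * math.pi * k2)

# Hypothetical sample; the density is evaluated at the sample mean and away from it.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
f_centre = saddlepoint_density(3.0, sample)
f_side = saddlepoint_density(3.5, sample)
```

At the sample mean the saddlepoint is t = 0 and the approximation reduces to the normal density with variance K''(0); away from the centre the exponential tilt adjusts the approximation, which is where it tends to improve on the normal fit.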
The following figure shows the density of the variance statistic for the H-P
residuals corresponding to the age of 75 years. It can be seen that the normal
density does not produce such a good fit as does the saddlepoint approximation,
especially in the tails.
[Figure: density of the variance statistic for the H-P residuals at age 75; horizontal axis: Variance]
11. Conclusions
The most important source of heterogeneity in the residuals is not in the sets
generated by the different curves that are fitted for each year, but within each
curve, in those generated for different ages.
[Figure: density and quantile plots for the Mean and for the Variance]
References
1. Booth, J.G, Hall, P. and Wood, A.T.A. (1993). Balanced importance
resampling for the bootstrap. Annals of Statistics, 21, 286-298.
J. GOMEZ-GARCIA
Department of Quantitative Methods for Economics, University of Murcia
Campus de Espinardo, s/n, Espinardo 30100 Murcia, Spain
J. SOLANA-IBANEZ
Department of Business and Management, Catholic University San Antonio of Murcia
Campus de los Jeronimos, s/n, Guadalupe 30107 Murcia, Spain
1. Introduction
The high degree of correlation between the behaviour of the economy and
the banking sector, together with the sector's role as financial intermediary,
Pastor [1], is ample reason for the continual interest in different aspects of the
banking system.
Traditionally, this kind of study has been approached through the use of
cost and profitability ratios, Pastor, Perez and Quesada [2], although more
recently these traditional techniques have tended to be replaced by
econometric techniques that look at an institution from a global viewpoint,
considering the inputs used and the outputs obtained, that is, techniques that
permit the efficiency of an organization to be measured. One such technique is
Data Envelopment Analysis (DEA), a non-parametric econometric technique
286 J. Gomez-Garcia, J. Solana-Ibanez and J.C. Gomez-Gallego
So called X-type inefficiencies are those due to errors in management and/or organization, and
include technical inefficiencies such as the allocative type, and differ from scale inefficiencies.
Measuring the Efficiency of the Spanish Banking Sector 287
Table 1. (Continued)
3. Methodology
The most influential work related to such aspects of macroeconomics was
that of Solow [11], published in the "Review of Economics and Statistics" and
entitled "Technical change and the aggregate production function". At the same
time, Farrell established the bases for studying efficiency and productivity at
the microeconomic scale, putting forward two novel aspects: how to define
efficiency and productivity, and how to measure efficiency.
Faced with the possibility of inefficiency, Farrell opted for the concept of
frontier production as opposed to the mean efficiency underlying most of the
econometric literature to date on the production function. Farrell's new focus
consisted of decomposing efficiency into technical and allocative
efficiency at the level of the individual production unit. The radial
contraction/expansion connecting inefficient units with efficient units with
respect to the production function constitutes the basis for measuring
efficiency and is the true contribution of Farrell.
Farrell proposed a measure of efficiency consisting of two components:
technical efficiency and allocative efficiency, which combine to
provide a measure of total economic efficiency. These measures assume that the
production function of efficient companies is known. Since this function is never
known in practice, as Farrell recognized, he proposed two possibilities:
obtaining a non-parametric function or a parametric function.
The first alternative gave rise to the models for estimating non-parametric
frontiers and was followed by Charnes, Cooper and Rhodes [9], resulting in
the DEA approach. A subsequent model, which gave rise to a great quantity of
research, was named FDH (free disposal hull); it was formulated in 1984 by
Deprins, Simar and Tulkens [12] and developed by Tulkens [13] in 1994. The
second pathway was followed by Afriat [14] and Aigner [15], resulting in two
approximations known as the deterministic and stochastic frontier models.
An intermediate pathway, comprising models that do not use production
frontiers, is provided by index numbers, whose use in measuring efficiency and
productivity is indirect. They are used, rather, to
generate variables or data that can be used in the application of the DEA models
or in the estimation of stochastic frontiers, Solana [16].
3.2. Models
Since its genesis in Charnes et al [9], a variety of DEA models have been
developed, both input and output oriented, depending on whether returns to
scale are constant or variable (and, in the latter case, on whether they are
increasing or decreasing) and on whether or not the inputs can be controlled,
among other aspects. The first model we applied was that initially proposed by
Charnes et al [9] and known as CCR, after its authors. This model assumes
constant returns to scale and is input oriented. In accordance with Cooper et al
[17], the starting point is the traditional definition of efficiency (the ratio
of outputs to inputs) and the aim is, by means of linear programming, to
obtain weights such that the ratio of outputs to inputs is maximized.
To calculate the efficiency of n units, n linear programming problems must
be solved to obtain both the values of the weights (v_i) associated with the
inputs (x_i) and the weights (u_r) associated with the outputs (y_r). Assuming m
inputs and s outputs, and transforming the fractional programming model into a
linear programming problem, the input oriented CCR model is formulated as follows:
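\[ \max\ \theta = u_1 y_{1o} + u_2 y_{2o} + \dots + u_s y_{so} \]
s.t.
\[ v_1 x_{1o} + v_2 x_{2o} + \dots + v_m x_{mo} = 1 \qquad (1) \]

s.t.
\[ u_1 y_{1o} + u_2 y_{2o} + \dots + u_s y_{so} = 1 \qquad (2) \]
\[ u_1 y_{1j} + u_2 y_{2j} + \dots + u_s y_{sj} \le v_1 x_{1j} + v_2 x_{2j} + \dots + v_m x_{mj}, \quad j = 1, 2, \dots, n \]
\[ v_i > 0 \quad (i = 1, 2, \dots, m) \]
\[ u_r > 0 \quad (r = 1, 2, \dots, s) \]

\[ \min\ Z = \sum_{i} v_i x_{io} + v_0 \]
s.t.
\[ \sum_{r} u_r y_{ro} = 1 \]
\[ v_i > 0 \quad (i = 1, 2, \dots, m) \]
\[ u_r > 0 \quad (r = 1, 2, \dots, s) \]
\[ v_0 \ \text{free} \]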
where v0 is the variable that permits us to identify the nature of the returns
to scale. To obtain a more complete ranking, efficient units are classified by
applying the MDEA models proposed by Andersen and Petersen [19].
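In the special case of a single input and a single output, the input oriented CCR problem can be solved by hand: the normalisation fixes v = 1/x_o, the constraints give u ≤ x_j/(x_o y_j) for every unit j, and the optimum is the unit's output/input ratio divided by the best ratio in the sample. The sketch below illustrates only this special case; the function name and the data are hypothetical, and the general model with several inputs and outputs requires a linear programming solver.

```python
def ccr_efficiency_single(inputs, outputs):
    """CCR efficiency scores for units with one input and one output.

    With one input and one output, the optimal value of the CCR linear
    program for unit o equals (y_o / x_o) / max_j (y_j / x_j).
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Hypothetical units: the first and third lie on the frontier.
scores = ccr_efficiency_single(inputs=[2.0, 4.0, 5.0], outputs=[2.0, 2.0, 5.0])
```

A score of 1 places a unit on the efficient frontier, while a score below 1 gives the proportion to which its input could be contracted radially while keeping its output.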
4. Results
When a large number of correlated variables are available for a given
population, factor analysis (FA) permits the information contained in these
variables to be synthesized into a smaller number of variables (factors).
After standardising the original variables, Bartlett's sphericity test and the
Kaiser-Meyer-Olkin (KMO) statistic were applied to check for the existence of
significant correlation. These gave a Chi-squared value of 1283.59,
with 120 degrees of freedom and an associated significance level of 0.000, for
the Bartlett test, and a KMO measure of sampling adequacy of 0.637. Next, the
factorial axes were extracted by principal components analysis. Lastly, the axes
chosen were rotated by Varimax to facilitate interpretation.
Of the original variables observed, those related to size, profitability,
management and risk were selected, Moya and Caballer [22]. In this way the
following fifteen variables were included: Current Accounts, Debits, Time
deposits, ATM, Intermediation Margin, Intermediation Margin over ATM, ROE,
Operating Margin, Operating Margin over ATM, Credit Investment per
employee, Credit Investment over ATM, Net profit over ATM, Debits per
employee, Deposits over debt capital.
Applying the FA procedure, four factorial axes were obtained which
explained 91.27% of the global variance. These were chosen on the basis of the
eigenvalues of the characteristic equation, in accordance with the
criterion of the arithmetic mean. From the matrix of rotated components, the
factorial axes were defined as follows:
Factor 1: saturated by C/C (.900), CRD (.989), DEB (.995), IMPL (.984), ATM (.986), ME (.965), MI (.967).
Factor 2: saturated by MEATM (.983), MIATM (.942), ROE (.623), BATM (.932).
Factor 3: saturated by ICEMP (.970), DBEMP (.986).
Factor 4: saturated by ICATM (.729), DPRA (.820).
Factor 1, with an associated eigenvalue of 7.09, explains 47.26% of the total
variance; Factor 2, with an associated eigenvalue of 3.21, explains 21.41% of the
total variance; Factor 3, with an associated eigenvalue of 1.37, explains 13.28%
of the total variance, and Factor 4, with an associated eigenvalue of 1.37,
explains 9.18% of the total variance.
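The criterion of the arithmetic mean retains the factors whose eigenvalue exceeds the mean of all eigenvalues, and each retained factor explains a share of the total variance equal to its eigenvalue divided by the sum of the eigenvalues. The sketch below illustrates this with hypothetical eigenvalues for five standardised variables; the function name and the numbers are assumptions, not the values of the study.

```python
def retain_by_mean_criterion(eigenvalues):
    """Keep eigenvalues above their arithmetic mean and report variance shares."""
    total = sum(eigenvalues)
    mean_value = total / len(eigenvalues)
    kept = [ev for ev in eigenvalues if ev > mean_value]
    shares = [ev / total for ev in kept]
    return kept, shares

# Hypothetical eigenvalues of a 5 x 5 correlation matrix (they sum to 5).
kept, shares = retain_by_mean_criterion([2.5, 1.2, 0.7, 0.4, 0.2])
```

For standardised variables the eigenvalues sum to the number of variables, so the mean is 1 and the rule amounts to keeping the eigenvalues greater than one.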
From the correlations between the factorial axes and the original variables,
we have interpreted and, consequently, named the factorial axes as
follows: Factor 1: Size; Factor 2: Profitability; Factor 3: Management; and
Factor 4: Risk.
Applying the BCC-O model, efficiency coefficients were obtained which
placed 14 financial institutions on the frontier, while the remaining 22 showed
varying percentages of technical inefficiency. The MDEA-O model was used
to establish a complete ranking, and the corresponding coefficients of
super-efficiency were obtained for each bank.
Clusters   Efficiency     Profitability   Banks (efficiency; * = super-efficiency)
N=9        S.D.: 58.70    S.D.: 1.10      Popular Esp (134)*, Banif (305.2)*, Santan. C.F. (290)*
5. Conclusions
Factor analysis provided four factors that encapsulated all the
characteristics of Spanish banks: size, management, profitability and risk. In
this way, each bank is represented in a four-dimensional space by the vector
whose components are the scores of the bank on each of the four factorial axes.
Applying DEA analysis, the BCC-O model and MDEA, we obtained an
efficiency ranking for 36 financial institutions. In 22 banks we observed a
percentage of technical inefficiency. Changes in the way these banks are
managed could bring them to the production frontier.
Applying cluster analysis to the super-efficiency scores enabled us to form
homogeneous groups of the banks analysed. In this way we obtained three
groups of minimal intra-group variance.
As regards profitability, we conclude that there are significant differences
between the groups of banks established from the measures of super-efficiency.
Significantly different (p=0.043) levels of profitability were found. Group 3,
with a high level of mean efficiency (205.05), presents the highest level of
profitability, while group 1 (469.96) shows quite low profitability. Despite the
significant differences, we need to take into account other characteristics, such
as specialization, in order to explain these findings. This could be the topic of
future research.
References
1. J.M. Pastor. (1998). Gestión del Riesgo y Eficiencia en los Bancos y Cajas
de Ahorros, Serie Documentos de Trabajo, No 142/1998. Fundación de
Cajas de Ahorro Confederadas para la Investigación Económica y Social,
España.
2. J.M. Pastor, F. Perez and J. Quesada. (1995). Are European Banks Equally
Efficient? Revue de la Banque, June, 324-33.
3. A.N. Berger. (1995). The Profit-Structure Relationship in Banking - Tests of
Market-Power and Efficient-Structure Hypotheses. Journal of Money, Credit and
Banking, 27(2), 405-431.
4. L.G. Goldberg and A. Rai. (1996). The structure-performance relationship
for European banking. Journal of Banking and Finance, 20, 745-771.
5. J. Maudos and J.M. Pastor. (1998). La eficiencia del sistema bancario
español en el contexto de la Unión Europea. Papeles de Economía
Española, 84/85, 155-168.
6. A.N. Berger and R. DeYoung. (1997). Problem Loans and Cost Efficiency in
Commercial Banks. Journal of Banking & Finance, 21(6), 849-870.
DISTRIBUTION
MODELS THEORY
Distribution Models Theory is a revised edition
of papers specially selected by the Scientific
Committee for the Fifth Workshop of Spanish
Scientific Association of Applied Economy on
Distribution Models Theory held in Granada
(Spain) in September 2005. The contributions
offer a must-have point of reference on models
theory.