Order-Statistics 2
Order statistics are the values of a random sample arranged in order of magnitude. Consider a random sample $x_1, x_2, \ldots, x_n$ from a continuous distribution with density $f_X(x)$. Arrange the sample values in increasing order of magnitude as $y_1 \le y_2 \le \cdots \le y_n$.
This new set of random variables is called the order statistics. Here $y_1$ is called the first order statistic (the smallest value) and $y_n$ is called the $n$th order statistic (the largest value) of the sample. These order statistics may also be denoted as $x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}$.
$x_{1:n}$ is useful wherever, for example, the strength of a chain depends on its weakest link.
Order statistics are used in detecting outliers.
They are used in demography, quality control, life insurance, weather forecasting, the share market, etc.
They are used in nonparametric tests.
They are used in data comparison.
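The random sample $x_1, x_2, \ldots, x_n$ consists of independent variables, but the order statistics $y_1, y_2, \ldots, y_n$ are not independent.
Let $x_1, x_2, \ldots, x_n$ be i.i.d. random variables from a continuous distribution that is symmetric about zero. Then, for $1 \le r < s \le n$, the order statistics satisfy
\[ (x_{r:n},\, x_{s:n}) \overset{d}{=} (-x_{n-s+1:n},\, -x_{n-r+1:n}). \]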
Order Statistics - 1 of 24
2. Detecting Outliers: If one is confronted with a set of measurements and wants to determine whether some have been incorrectly made or reported, attention naturally focuses on certain order statistics of the sample. Usually the largest one or two and the smallest one or two are deemed most likely to be outliers.
3. Censored Sampling: In censored sampling, the sampling process ceases after completing $r$ observations out of $n$. In life testing of electric light bulbs, one may start with a group of $n$ bulbs but stop taking observations after the $r$th bulb burns out. Then information is available only on $y_1, y_2, \ldots, y_r$, where $r \le n$.
4. Waiting for the Big One: Disastrous floods and destructive earthquakes recur throughout history. Dam construction has long focused on so-called 100-year floods. Whether one agrees or not with the 100-year disaster philosophy, it is obvious that designers of dams and skyscrapers, and even doghouses, should be concerned with the distribution of large order statistics from a possibly dependent, possibly not identically distributed sequence. Thus the maximum value in the sample is of interest in the study of floods and other extreme meteorological phenomena.
5. Strength of Materials: $x_{1:n}$, the minimum value, is useful for phenomena where, for example, the strength of a chain depends on the weakest link.
Sample Range:
The difference $x_{n:n} - x_{1:n}$ of the extreme order statistics is called the sample range of the given sample.
Sample Block:
The intervals $(-\infty, x_{1:n}],\ (x_{1:n}, x_{2:n}],\ \ldots,\ (x_{n:n}, \infty)$ are called sample blocks.
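Since $x_1, x_2, \ldots, x_n$ is a random sample from a population with density $f_X$, its joint density function is
\[ f_{x_1, x_2, \ldots, x_n}(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i). \]
The joint distribution of the $n$ order statistics for this random sample is not the same, since the order statistics are neither independent nor identically distributed. The set of $n$ order statistics is produced as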
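\[
\begin{aligned}
y_1 &= \text{1st smallest of } x_1, x_2, \ldots, x_n \\
y_2 &= \text{2nd smallest of } x_1, x_2, \ldots, x_n \\
&\ \ \vdots \\
y_r &= r\text{th smallest of } x_1, x_2, \ldots, x_n \\
&\ \ \vdots \\
y_n &= \text{largest of } x_1, x_2, \ldots, x_n
\end{aligned}
\]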
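This transformation is not one-to-one. Since there are in total $n!$ possible arrangements of the original random variables in increasing order of magnitude, there exist $n!$ inverses to the transformation. One of these $n!$ permutations might be $x_{10} < x_2 < \cdots < x_{n-1} < x_n < x_1$, and the corresponding inverse transformation is
\[
\begin{aligned}
x_{10} &= y_1 \\
x_2 &= y_2 \\
&\ \ \vdots \\
x_{n-1} &= y_r \\
&\ \ \vdots \\
x_n &= y_{n-1} \\
x_1 &= y_n
\end{aligned}
\]
The Jacobian of this transformation is the determinant of an $n \times n$ identity matrix with its rows permuted, so $|J| = 1$. Now, the contribution of this arrangement to the joint density of the order statistics is $\prod_{i=1}^{n} f_X(y_i)$.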
The same expression results for each of the $n!$ arrangements. So, summing over the inverse transformations for all $n!$ permutations, the joint p.d.f. of the order statistics is
\[ f_{y_1, y_2, \ldots, y_n}(y_1, y_2, \ldots, y_n) = n! \prod_{i=1}^{n} f_X(y_i); \quad y_1 < y_2 < \cdots < y_n. \]
This is the joint density function of the $n$ order statistics. Some examples are given below.
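For a random sample of size $n$ from the normal distribution, we have
\[ f_{y_1, y_2, \ldots, y_n}(y_1, y_2, \ldots, y_n) = \frac{n!}{(2\pi\sigma^2)^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i - \mu}{\sigma}\right)^{2}\right]; \quad -\infty < y_1 < y_2 < \cdots < y_n < \infty. \]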
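C.d.f. of the largest and smallest order statistics:
\[ F_{n:n}(x) = [F(x)]^{n}, \qquad F_{1:n}(x) = 1 - [1 - F(x)]^{n}. \]
Marginal p.d.f. of the $r$th order statistic:
\[ f_{r:n}(y_r) = \frac{n!}{(r-1)!\,(n-r)!}\,[F_X(y_r)]^{r-1}\,[1 - F_X(y_r)]^{n-r}\, f_X(y_r); \quad -\infty < y_r < \infty. \]
Marginal c.d.f. of the $r$th order statistic:
\[
\begin{aligned}
F_{r:n}(x) &= \sum_{i=r}^{n} \binom{n}{i} [F(x)]^{i}\,[1 - F(x)]^{n-i} \\
&= \frac{n!}{(r-1)!\,(n-r)!} \int_{0}^{F(x)} t^{r-1}(1-t)^{n-r}\,dt \\
&= I_{F(x)}(r,\, n-r+1); \quad -\infty < x < \infty.
\end{aligned}
\]
Joint p.d.f. of the $i$th and $j$th order statistics:
\[ f_{i,j:n}(x_i, x_j) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(x_i)]^{i-1}\,[F(x_j) - F(x_i)]^{j-i-1}\,[1 - F(x_j)]^{n-j}\, f(x_i)\, f(x_j); \quad x_i < x_j. \]
Joint c.d.f. of the $i$th and $j$th order statistics:
\[
\begin{aligned}
F_{i,j:n}(x_i, x_j) &= \sum_{s=j}^{n} \sum_{r=i}^{s} \frac{n!}{r!\,(s-r)!\,(n-s)!}\,[F(x_i)]^{r}\,[F(x_j) - F(x_i)]^{s-r}\,[1 - F(x_j)]^{n-s} \\
&= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \int_{0}^{F(x_i)} \int_{t_1}^{F(x_j)} t_1^{\,i-1}(t_2 - t_1)^{j-i-1}(1 - t_2)^{n-j}\,dt_2\,dt_1.
\end{aligned}
\]
Joint p.d.f. of the smallest and largest order statistics:
\[ f_{1,n:n}(x_1, x_n) = n(n-1)\,[F(x_n) - F(x_1)]^{n-2}\, f(x_1)\, f(x_n); \quad x_1 < x_n. \]
Joint p.d.f. of two consecutive order statistics:
\[ f_{i,i+1:n}(x_i, x_{i+1}) = \frac{n!}{(i-1)!\,(n-i-1)!}\,[F(x_i)]^{i-1}\,[1 - F(x_{i+1})]^{n-i-1}\, f(x_i)\, f(x_{i+1}); \quad x_i < x_{i+1}. \]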
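Marginal distribution of the $n$th order statistic:
We have the joint p.d.f. of the $n$ order statistics
\[ f_{x_{1:n}, x_{2:n}, \ldots, x_{n:n}}(y_1, y_2, \ldots, y_n) = n! \prod_{i=1}^{n} f_X(y_i); \quad y_1 < y_2 < \cdots < y_n. \]
The marginal p.d.f. of $x_{n:n}$ is obtained by integrating out $y_1, y_2, \ldots, y_{n-1}$:
\[
\begin{aligned}
f_{x_{n:n}}(y_n) &= n!\, f_X(y_n) \int_{-\infty}^{y_n} \int_{-\infty}^{y_{n-1}} \cdots \int_{-\infty}^{y_3} \int_{-\infty}^{y_2} f_X(y_1)\,dy_1 \prod_{i=2}^{n-1} f_X(y_i)\,dy_i \\
&= n!\, f_X(y_n) \int_{-\infty}^{y_n} \int_{-\infty}^{y_{n-1}} \cdots \int_{-\infty}^{y_3} F_X(y_2)\, f_X(y_2)\,dy_2 \prod_{i=3}^{n-1} f_X(y_i)\,dy_i.
\end{aligned}
\]
Let
\[ I_1 = \int F_X(y_2)\, f_X(y_2)\,dy_2. \]
Integrating by parts,
\[ I_1 = F_X(y_2) \int f_X(y_2)\,dy_2 - \int f_X(y_2)\, F_X(y_2)\,dy_2 = [F_X(y_2)]^{2} - I_1, \]
so that $2 I_1 = [F_X(y_2)]^{2}$ and $I_1 = \tfrac{1}{2}[F_X(y_2)]^{2}$. Hence
\[ \int_{-\infty}^{y_3} F_X(y_2)\, f_X(y_2)\,dy_2 = \frac{[F_X(y_3)]^{2}}{2!} \]
and
\[ f_{x_{n:n}}(y_n) = n!\, f_X(y_n) \int_{-\infty}^{y_n} \int_{-\infty}^{y_{n-1}} \cdots \int_{-\infty}^{y_4} \frac{[F_X(y_3)]^{2}}{2!}\, f_X(y_3)\,dy_3 \prod_{i=4}^{n-1} f_X(y_i)\,dy_i. \]
Let
\[ I_2 = \int [F_X(y_3)]^{2} f_X(y_3)\,dy_3. \]
Integrating by parts,
\[ I_2 = [F_X(y_3)]^{2} F_X(y_3) - \int 2 F_X(y_3)\, f_X(y_3)\, F_X(y_3)\,dy_3 = [F_X(y_3)]^{3} - 2 I_2, \]
so that $3 I_2 = [F_X(y_3)]^{3}$ and $I_2 = \tfrac{1}{3}[F_X(y_3)]^{3}$. Hence
\[ \int_{-\infty}^{y_4} \frac{1}{2}[F_X(y_3)]^{2} f_X(y_3)\,dy_3 = \frac{1}{2}\cdot\frac{1}{3}[F_X(y_4)]^{3} = \frac{[F_X(y_4)]^{3}}{3!}. \]
If the successive integrations over $y_4, \ldots, y_{n-1}$ are carried out, it is seen that
\[ f_{x_{n:n}}(y_n) = n!\, f_X(y_n)\, \frac{[F_X(y_n)]^{n-1}}{(n-1)!} = n\,[F_X(y_n)]^{n-1} f_X(y_n); \quad -\infty < y_n < \infty. \]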
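Marginal distribution of the 1st order statistic:
The joint p.d.f. of the $n$ order statistics is
\[ f_{x_{1:n}, x_{2:n}, \ldots, x_{n:n}}(y_1, y_2, \ldots, y_n) = n! \prod_{i=1}^{n} f_X(y_i); \quad y_1 < y_2 < \cdots < y_n. \]
The marginal p.d.f. of $x_{1:n}$ is obtained by integrating out $y_n, y_{n-1}, \ldots, y_2$:
\[
\begin{aligned}
f_{x_{1:n}}(y_1) &= n!\, f_X(y_1) \int_{y_1}^{\infty} \int_{y_2}^{\infty} \cdots \int_{y_{n-2}}^{\infty} \int_{y_{n-1}}^{\infty} f_X(y_n)\,dy_n \prod_{i=2}^{n-1} f_X(y_i)\,dy_i \\
&= n!\, f_X(y_1) \int_{y_1}^{\infty} \int_{y_2}^{\infty} \cdots \int_{y_{n-2}}^{\infty} [1 - F_X(y_{n-1})]\, f_X(y_{n-1})\,dy_{n-1} \prod_{i=2}^{n-2} f_X(y_i)\,dy_i.
\end{aligned}
\]
Let
\[ I_1 = \int [1 - F_X(y_{n-1})]\, f_X(y_{n-1})\,dy_{n-1}, \]
and put $t = 1 - F_X(y_{n-1})$, so that $dt = -f_X(y_{n-1})\,dy_{n-1}$. Then
\[ I_1 = -\int t\,dt = -\frac{t^{2}}{2} = -\frac{[1 - F_X(y_{n-1})]^{2}}{2}. \]
Hence
\[ \int_{y_{n-2}}^{\infty} [1 - F_X(y_{n-1})]\, f_X(y_{n-1})\,dy_{n-1} = -\frac{1}{2}[1 - F_X(\infty)]^{2} + \frac{1}{2}[1 - F_X(y_{n-2})]^{2} = \frac{1}{2}[1 - F_X(y_{n-2})]^{2}, \]
and
\[ f_{x_{1:n}}(y_1) = n!\, f_X(y_1) \int_{y_1}^{\infty} \int_{y_2}^{\infty} \cdots \int_{y_{n-3}}^{\infty} \frac{1}{2}[1 - F_X(y_{n-2})]^{2} f_X(y_{n-2})\,dy_{n-2} \prod_{i=2}^{n-3} f_X(y_i)\,dy_i. \]
Carrying out the remaining integrations over $y_{n-2}, \ldots, y_2$ in the same way gives
\[ f_{x_{1:n}}(y_1) = n!\, f_X(y_1)\, \frac{[1 - F_X(y_1)]^{n-1}}{(n-1)!} = n\,[1 - F_X(y_1)]^{n-1} f_X(y_1); \quad -\infty < y_1 < \infty. \]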
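Marginal Distribution of the $r$th Order Statistic:
Let us assume that $x_1, x_2, \ldots, x_n$ is a random sample from an absolutely continuous population with probability density function $f(x)$, and let $x_{1:n} \le x_{2:n} \le \cdots \le x_{r:n} \le \cdots \le x_{n:n}$ be the order statistics obtained by arranging the preceding random sample in increasing order of magnitude. Then
\[ f_{r:n}(x) = \lim_{\delta x \to 0} \frac{P(x < x_{r:n} \le x + \delta x)}{\delta x}. \tag{1} \]
ALLOCATION:
$x_i \le x$ for $r - 1$ of the $x_i$'s
$x < x_i \le x + \delta x$ for one $x_i$
$x_i > x + \delta x$ for the remaining $n - r$ of the $x_i$'s
\[ P(x < x_{r:n} \le x + \delta x) = \frac{n!}{(r-1)!\,1!\,(n-r)!}\, p_1^{\,r-1}\, p_2\, p_3^{\,n-r} + O\big((\delta x)^{2}\big) \tag{2} \]
where
\[
\begin{aligned}
p_1 &= P(x_i \le x) = F(x) \\
p_2 &= P(x < x_i \le x + \delta x) = F(x + \delta x) - F(x) \\
p_3 &= P(x_i > x + \delta x) = 1 - P(x_i \le x + \delta x) = 1 - F(x + \delta x).
\end{aligned}
\]
Substituting (2) into (1) and letting $\delta x \to 0$ gives
\[ f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\,[F(x)]^{r-1}\,[1 - F(x)]^{n-r} f(x); \quad -\infty < x < \infty. \]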
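Cumulative Distribution Function of the $n$th Order Statistic:
\[
\begin{aligned}
F_{n:n}(x) &= P(x_{n:n} \le x) \\
&= P(x_i \le x;\ i = 1, 2, \ldots, n) \\
&= P(x_1 \le x \cap x_2 \le x \cap \cdots \cap x_n \le x) \\
&= P(x_1 \le x)\, P(x_2 \le x) \cdots P(x_n \le x) \\
&= [F(x)]^{n}; \quad -\infty < x < \infty.
\end{aligned}
\]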
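Cumulative Distribution Function of the 1st Order Statistic:
The c.d.f. of the smallest order statistic is denoted by $F_{1:n}(x)$ and defined as
\[
\begin{aligned}
F_{1:n}(x) &= P(x_{1:n} \le x) \\
&= 1 - P(x_{1:n} > x) \\
&= 1 - P(x_i > x;\ i = 1, 2, \ldots, n) \\
&= 1 - \prod_{i=1}^{n} P(x_i > x) \\
&= 1 - \prod_{i=1}^{n} [1 - P(x_i \le x)] \\
&= 1 - [1 - F(x)]^{n}; \quad -\infty < x < \infty.
\end{aligned}
\]
This is the cumulative distribution function of the smallest order statistic.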
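Cumulative Distribution Function of the $r$th Order Statistic:
\[
\begin{aligned}
F_{r:n}(x) &= P(x_{r:n} \le x) \\
&= P(\text{at least } r \text{ of } x_1, x_2, \ldots, x_n \text{ are} \le x) \\
&= \sum_{i=r}^{n} P(\text{exactly } i \text{ of } x_1, x_2, \ldots, x_n \text{ are} \le x) \\
&= \sum_{i=r}^{n} \binom{n}{i} [F(x)]^{i}\,[1 - F(x)]^{n-i}; \quad -\infty < x < \infty, \qquad (*)
\end{aligned}
\]
where the number of $x_i$'s that are at most $x$ follows the binomial distribution $b(n, F(x))$.
Thus, we find that the c.d.f. of $x_{r:n}$ $(1 \le r \le n)$ is simply the tail probability of a binomial distribution with $F(x)$ as the probability of success and $n$ as the number of trials. Furthermore, by using the identity
\[ \sum_{i=r}^{n} \binom{n}{i} p^{i}(1-p)^{n-i} = \frac{n!}{(r-1)!\,(n-r)!} \int_{0}^{p} t^{r-1}(1-t)^{n-r}\,dt; \quad 0 < p < 1, \]
we can write
\[
\begin{aligned}
F_{r:n}(x) &= \frac{n!}{(r-1)!\,(n-r)!} \int_{0}^{F(x)} t^{r-1}(1-t)^{n-r}\,dt \\
&= I_{F(x)}(r,\, n-r+1); \quad -\infty < x < \infty,
\end{aligned}
\]
where
\[ I_p(a, b) = \frac{1}{B(a, b)} \int_{0}^{p} t^{a-1}(1-t)^{b-1}\,dt. \]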
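This is just Pearson's incomplete beta function. It is important to mention here that one can write the c.d.f. of $x_{r:n}$ in terms of negative binomial probabilities, as noted by Pinsker, Kipnis and Grechanovsky, instead of the binomial form:
\[
\begin{aligned}
F_{r:n}(x) &= P(x_{r:n} \le x) \\
&= P(\text{reaching } r \text{ successes in the course of at most } n \text{ trials with probability of success } F(x)) \\
&= \binom{r-1}{r-1}[F(x)]^{r}[1 - F(x)]^{0} + \binom{r}{r-1}[F(x)]^{r}[1 - F(x)]^{1} + \cdots + \binom{n-1}{r-1}[F(x)]^{r}[1 - F(x)]^{n-r} \\
&= \sum_{i=0}^{n-r} \binom{n-1-i}{r-1} [F(x)]^{r}\,[1 - F(x)]^{n-r-i}; \quad -\infty < x < \infty.
\end{aligned}
\]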
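The three expressions for $F_{r:n}(x)$ (binomial tail, incomplete beta integral, and negative binomial sum) can be compared numerically. The sketch below is an illustration only: the function names are made up for this example, the integral is approximated with composite Simpson's rule, and the negative binomial sum is indexed by the number of failures $k$ before the $r$th success, which is the same sum with the index reversed.

```python
from math import comb, gamma


def cdf_binomial(F, r, n):
    """Binomial tail: at least r of n observations fall at or below x."""
    return sum(comb(n, i) * F**i * (1 - F)**(n - i) for i in range(r, n + 1))


def cdf_incomplete_beta(F, r, n, steps=2000):
    """I_F(r, n-r+1) via composite Simpson's rule (steps must be even)."""
    a, b = r, n - r + 1
    g = lambda t: t**(a - 1) * (1 - t)**(b - 1)
    h = F / steps
    s = g(0.0) + g(F) + sum((4 if k % 2 else 2) * g(k * h) for k in range(1, steps))
    return (s * h / 3) * gamma(a + b) / (gamma(a) * gamma(b))


def cdf_negative_binomial(F, r, n):
    """r-th success occurs after k = 0, ..., n-r failures."""
    return sum(comb(r + k - 1, r - 1) * F**r * (1 - F)**k for k in range(n - r + 1))


F, r, n = 0.3, 2, 5
print(cdf_binomial(F, r, n))           # approximately 0.47178
print(cdf_incomplete_beta(F, r, n))    # agrees up to quadrature error
print(cdf_negative_binomial(F, r, n))  # agrees up to rounding
```

Because the integrand is a polynomial for integer $r$ and $n$, Simpson's rule is very accurate here; for real use a library implementation of the regularized incomplete beta function would be the more natural choice.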
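Remarks:
Taking $r = 1$ and $r = n$ in $(*)$, we get respectively
\[
\begin{aligned}
F_{1:n}(x) &= \sum_{i=1}^{n} \binom{n}{i} [F(x)]^{i}[1 - F(x)]^{n-i} = \sum_{i=0}^{n} \binom{n}{i} [F(x)]^{i}[1 - F(x)]^{n-i} - [1 - F(x)]^{n} \\
&= 1 - [1 - F(x)]^{n}
\end{aligned}
\]
and
\[ F_{n:n}(x) = [F(x)]^{n}. \]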
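We shall now assume that the $x_i$'s are i.i.d. continuous random variables with p.d.f. $f(x) = F'(x)$. If $f_{r:n}(x)$ denotes the p.d.f. of $X_{r:n}$, then we get
\[
\begin{aligned}
f_{r:n}(x) &= \frac{d}{dx} F_{r:n}(x) \\
&= \frac{d}{dx} I_{F(x)}(r,\, n-r+1) \\
&= \frac{d}{dx} \left[ \frac{1}{B(r,\, n-r+1)} \int_{0}^{F(x)} t^{r-1}(1-t)^{n-r}\,dt \right].
\end{aligned}
\]
Let us write
\[ g(t) = \int t^{r-1}(1-t)^{n-r}\,dt, \qquad g'(t) = t^{r-1}(1-t)^{n-r}. \qquad (**) \]
Now,
\[ \int_{0}^{F(x)} t^{r-1}(1-t)^{n-r}\,dt = \big[g(t)\big]_{0}^{F(x)} = g(F(x)) - g(0), \]
and, since $g(0)$ is constant,
\[ \frac{d}{dx} \int_{0}^{F(x)} t^{r-1}(1-t)^{n-r}\,dt = g'(F(x))\, f(x) = [F(x)]^{r-1}[1 - F(x)]^{n-r} f(x), \quad \text{using } (**). \]
So we get
\[ f_{r:n}(x) = \frac{1}{B(r,\, n-r+1)}\,[F(x)]^{r-1}\,[1 - F(x)]^{n-r} f(x); \quad -\infty < x < \infty. \]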
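The relation $f_{r:n}(x) = \frac{d}{dx} F_{r:n}(x)$ can be checked numerically by comparing the closed-form density with a central finite difference of the binomial-tail c.d.f. The sketch below assumes a standard exponential population purely for illustration; the helper names are not from the notes.

```python
from math import comb, exp, factorial


def cdf_order_stat(x, r, n, F):
    """F_{r:n}(x) as a binomial tail with success probability F(x)."""
    p = F(x)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r, n + 1))


def pdf_order_stat(x, r, n, F, f):
    """f_{r:n}(x) = n!/((r-1)!(n-r)!) F(x)^(r-1) (1-F(x))^(n-r) f(x)."""
    c = factorial(n) / (factorial(r - 1) * factorial(n - r))
    return c * F(x)**(r - 1) * (1 - F(x))**(n - r) * f(x)


# Illustrative population (an assumption): standard exponential.
F = lambda x: 1 - exp(-x)
f = lambda x: exp(-x)

x, r, n, h = 0.8, 2, 5, 1e-5
numeric = (cdf_order_stat(x + h, r, n, F) - cdf_order_stat(x - h, r, n, F)) / (2 * h)
exact = pdf_order_stat(x, r, n, F, f)
print(abs(numeric - exact))  # small: finite-difference error only
```

The constant $n!/((r-1)!(n-r)!)$ used in the code equals $1/B(r, n-r+1)$ for integer $r$ and $n$.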
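Joint p.d.f. of the $i$th and $j$th order statistics:
For $1 \le i < j \le n$ and $x_i < x_j$,
\[
\begin{aligned}
P(x_i < x_{i:n} \le x_i + \delta x_i,\ x_j < x_{j:n} \le x_j + \delta x_j)
&= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(x_i)]^{i-1}\,[F(x_j) - F(x_i + \delta x_i)]^{j-i-1}\,[1 - F(x_j + \delta x_j)]^{n-j} \\
&\quad \times [F(x_i + \delta x_i) - F(x_i)]\,[F(x_j + \delta x_j) - F(x_j)] + O\big((\delta x_i)^{2}\delta x_j\big) + O\big(\delta x_i(\delta x_j)^{2}\big).
\end{aligned}
\]
Here $O\big((\delta x_i)^{2}\delta x_j\big)$ and $O\big(\delta x_i(\delta x_j)^{2}\big)$ are higher-order terms which correspond to the probabilities of the events of having more than one observation in the interval $(x_i, x_i + \delta x_i]$ or more than one observation in the interval $(x_j, x_j + \delta x_j]$. Therefore
\[
\begin{aligned}
f_{i,j:n}(x_i, x_j) &= \lim_{\substack{\delta x_i \to 0 \\ \delta x_j \to 0}} \frac{P(x_i < x_{i:n} \le x_i + \delta x_i,\ x_j < x_{j:n} \le x_j + \delta x_j)}{\delta x_i\,\delta x_j} \\
&= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(x_i)]^{i-1}\,[F(x_j) - F(x_i)]^{j-i-1}\,[1 - F(x_j)]^{n-j}\, f(x_i)\, f(x_j); \quad -\infty < x_i < x_j < \infty.
\end{aligned}
\]
This is the joint density function of the $i$th and $j$th order statistics.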
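The joint c.d.f. of the $i$th and $j$th order statistics is denoted by $F_{i,j:n}(x_i, x_j)$ and, for $x_i < x_j$, defined as
\[
\begin{aligned}
F_{i,j:n}(x_i, x_j) &= P(x_{i:n} \le x_i,\ x_{j:n} \le x_j) \\
&= P(\text{at least } i \text{ of } x_1, \ldots, x_n \text{ are at most } x_i \text{ and at least } j \text{ of } x_1, \ldots, x_n \text{ are at most } x_j) \\
&= \sum_{s=j}^{n} \sum_{r=i}^{s} P(\text{exactly } r \text{ of } x_1, \ldots, x_n \text{ are at most } x_i \text{ and exactly } s \text{ of } x_1, \ldots, x_n \text{ are at most } x_j) \\
&= \sum_{s=j}^{n} \sum_{r=i}^{s} \frac{n!}{r!\,(s-r)!\,(n-s)!}\,[F(x_i)]^{r}\,[F(x_j) - F(x_i)]^{s-r}\,[1 - F(x_j)]^{n-s}. \qquad (6)
\end{aligned}
\]
Thus the joint c.d.f. of $x_{i:n}$ and $x_{j:n}$ $(1 \le i < j \le n)$ is a tail probability over the region $s \ge j$, $r \ge i$. Using the identity
\[ \sum_{s=j}^{n} \sum_{r=i}^{s} \frac{n!}{r!\,(s-r)!\,(n-s)!}\, p_1^{\,r}(p_2 - p_1)^{s-r}(1 - p_2)^{n-s} = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \int_{0}^{p_1} \int_{t_1}^{p_2} t_1^{\,i-1}(t_2 - t_1)^{j-i-1}(1 - t_2)^{n-j}\,dt_2\,dt_1; \quad 0 < p_1 < p_2 < 1, \]
we can write
\[ F_{i,j:n}(x_i, x_j) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \int_{0}^{F(x_i)} \int_{t_1}^{F(x_j)} t_1^{\,i-1}(t_2 - t_1)^{j-i-1}(1 - t_2)^{n-j}\,dt_2\,dt_1; \quad -\infty < x_i < x_j < \infty, \qquad (7) \]
which may be noted to be an incomplete bivariate beta function. The expression for $F_{i,j:n}(x_i, x_j)$ in (7) holds for any arbitrary population, whether continuous or discrete.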
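Distribution of the sample median:
When $n$ is odd, the sample median is $\tilde{x}_n = x_{\frac{n+1}{2}:n}$, and its p.d.f. follows from the p.d.f. of the $r$th order statistic with $r = \frac{n+1}{2}$:
\[
\begin{aligned}
f_{\tilde{x}_n}(x) &= \frac{n!}{\left(\frac{n+1}{2} - 1\right)!\,\left(n - \frac{n+1}{2}\right)!}\,[F(x)]^{\frac{n+1}{2} - 1}\,[1 - F(x)]^{n - \frac{n+1}{2}} f(x) \\
&= \frac{n!}{\left[\left(\frac{n-1}{2}\right)!\right]^{2}}\,[F(x)]^{\frac{n-1}{2}}\,[1 - F(x)]^{\frac{n-1}{2}} f(x); \quad -\infty < x < \infty.
\end{aligned}
\]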
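From the p.d.f. of the sample median we see at once that it is symmetric about zero if the population distribution is symmetric about zero.
When $n$ is even, the sample median is
\[ \tilde{x}_n = \frac{1}{2}\left( X_{\frac{n}{2}:n} + X_{\frac{n}{2}+1:n} \right). \]
In order to derive the distribution of $\tilde{x}_n$, we first have, from the joint p.d.f. of the $i$th and $j$th order statistics, the joint density function of $X_{\frac{n}{2}:n}$ and $X_{\frac{n}{2}+1:n}$ to be
\[ f_{\frac{n}{2},\frac{n}{2}+1:n}(x_1, x_2) = \frac{n!}{\left[\left(\frac{n}{2} - 1\right)!\right]^{2}}\,[F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(x_2)]^{\frac{n}{2} - 1} f(x_1)\, f(x_2); \quad -\infty < x_1 < x_2 < \infty. \]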
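Let us make the transformation
\[ x_1 = x_1, \qquad x = \frac{x_1 + x_2}{2} \quad\Longrightarrow\quad x_2 = 2x - x_1. \]
The Jacobian is
\[ J = \begin{vmatrix} \dfrac{\partial x_1}{\partial x_1} & \dfrac{\partial x_1}{\partial x} \\[2mm] \dfrac{\partial x_2}{\partial x_1} & \dfrac{\partial x_2}{\partial x} \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ -1 & 2 \end{vmatrix} = 2. \]
Hence the joint p.d.f. of $X_{\frac{n}{2}:n}$ and $\tilde{x}_n$ is
\[ f_{X_{\frac{n}{2}:n},\, \tilde{x}_n}(x_1, x) = \frac{2\,n!}{\left[\left(\frac{n}{2} - 1\right)!\right]^{2}}\,[F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(2x - x_1)]^{\frac{n}{2} - 1} f(x_1)\, f(2x - x_1); \quad -\infty < x_1 < x < \infty. \]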
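By integrating out $x_1$ we derive the p.d.f. of the sample median $\tilde{x}_n$ as
\[ f_{\tilde{x}_n}(x) = \frac{2\,n!}{\left[\left(\frac{n}{2} - 1\right)!\right]^{2}} \int_{-\infty}^{x} [F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(2x - x_1)]^{\frac{n}{2} - 1} f(x_1)\, f(2x - x_1)\,dx_1; \quad -\infty < x < \infty. \qquad (1) \]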
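The integration to be performed in equation (1) does not assume a manageable form in most cases. Yet the c.d.f. of the sample median can be written in a simpler form:
\[
\begin{aligned}
F_{\tilde{x}_n}(x_0) &= P(\tilde{x}_n \le x_0) \\
&= \frac{2\,n!}{\left[\left(\frac{n}{2} - 1\right)!\right]^{2}} \int_{-\infty}^{x_0} \int_{-\infty}^{x} [F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(2x - x_1)]^{\frac{n}{2} - 1} f(x_1)\, f(2x - x_1)\,dx_1\,dx; \quad -\infty < x_0 < \infty.
\end{aligned}
\]
By employing Fubini's theorem and changing the order of integration, we derive the c.d.f. of $\tilde{x}_n$ as
\[ F_{\tilde{x}_n}(x_0) = \frac{2\,n!}{\left[\left(\frac{n}{2} - 1\right)!\right]^{2}} \int_{-\infty}^{x_0} [F(x_1)]^{\frac{n}{2} - 1} f(x_1) \left\{ \int_{x_1}^{x_0} [1 - F(2x - x_1)]^{\frac{n}{2} - 1} f(2x - x_1)\,dx \right\} dx_1. \]
Evaluating the inner integral with the substitution $u = 1 - F(2x - x_1)$ gives
\[ F_{\tilde{x}_n}(x_0) = \frac{n!}{\left(\frac{n}{2} - 1\right)!\,\left(\frac{n}{2}\right)!} \left\{ \int_{-\infty}^{x_0} [F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(x_1)]^{\frac{n}{2}} f(x_1)\,dx_1 - \int_{-\infty}^{x_0} [F(x_1)]^{\frac{n}{2} - 1}\,[1 - F(2x_0 - x_1)]^{\frac{n}{2}} f(x_1)\,dx_1 \right\}. \]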
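Find the Distribution of the Range:
Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from a continuous population with p.d.f. $f(x)$ and c.d.f. $F(x)$. Let the sample values be arranged in order of magnitude as $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, where $x_{(1)}$ and $x_{(n)}$ are the smallest and largest values. Writing $x = x_{(1)}$ and $y = x_{(n)}$, the range is
\[ w = y - x. \]
The joint distribution of $x_{(1)}$ and $x_{(n)}$ is
\[ f_{1,n:n}(x, y) = n(n-1)\,[F(y) - F(x)]^{n-2} f(x)\, f(y); \quad x < y. \]
Putting $x = u$ and $y = u + w$ (Jacobian 1) and integrating out $u$, the p.d.f. of the range is
\[ f_W(w) = n(n-1) \int_{-\infty}^{\infty} [F(u + w) - F(u)]^{n-2} f(u)\, f(u + w)\,du; \quad 0 < w < \infty. \qquad (1) \]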
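Even though the integration to be carried out in (1) does not assume a manageable form in many cases, the c.d.f. of $w$ does take on a simpler form and may be derived as
\[
\begin{aligned}
F_W(w_0) &= P(w \le w_0) = \int_{0}^{w_0} f_W(w)\,dw \\
&= n(n-1) \int_{0}^{w_0} \int_{-\infty}^{\infty} [F(u + w) - F(u)]^{n-2} f(u)\, f(u + w)\,du\,dw \\
&= n(n-1) \int_{-\infty}^{\infty} f(u) \left\{ \int_{0}^{w_0} [F(u + w) - F(u)]^{n-2} f(u + w)\,dw \right\} du.
\end{aligned}
\]
Let
\[ I = \int_{0}^{w_0} [F(u + w) - F(u)]^{n-2} f(u + w)\,dw, \]
and put $t = F(u + w) - F(u)$, so that $dt = f(u + w)\,dw$. Then
\[ I = \int t^{n-2}\,dt = \left[ \frac{t^{n-1}}{n-1} \right] = \left[ \frac{[F(u + w) - F(u)]^{n-1}}{n-1} \right]_{w=0}^{w=w_0} = \frac{[F(u + w_0) - F(u)]^{n-1}}{n-1}. \]
Therefore
\[ F_W(w_0) = n \int_{-\infty}^{\infty} [F(u + w_0) - F(u)]^{n-1} f(u)\,du; \quad 0 < w_0 < \infty. \]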
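As a hedged numerical illustration of the single-integral form $F_W(w_0) = n \int [F(u + w_0) - F(u)]^{n-1} f(u)\,du$, the sketch below evaluates it by the midpoint rule for a Uniform(0, 1) population and compares it with the closed form $n w_0^{\,n-1} - (n-1) w_0^{\,n}$. The uniform population, the closed form, and all function names are assumptions introduced for this example; they are not taken from the notes.

```python
def range_cdf_numeric(w0, n, F, f, lo, hi, steps=20000):
    """n * integral over [lo, hi] of [F(u + w0) - F(u)]^(n-1) f(u) du (midpoint rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        u = lo + (k + 0.5) * h
        total += (F(u + w0) - F(u)) ** (n - 1) * f(u)
    return n * total * h


def range_cdf_uniform(w0, n):
    """Closed form for the range of a Uniform(0, 1) sample, 0 <= w0 <= 1."""
    return n * w0 ** (n - 1) - (n - 1) * w0 ** n


# Uniform(0, 1) c.d.f. and p.d.f. (illustrative population).
F = lambda x: min(max(x, 0.0), 1.0)
f = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0

print(range_cdf_numeric(0.3, 4, F, f, 0.0, 1.0))  # close to the closed form
print(range_cdf_uniform(0.3, 4))                  # approximately 0.0837
```

The integration limits only need to cover the support of $f$, which is why the sketch integrates over $[0, 1]$ rather than the whole real line.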
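Distribution of the midrange:
For the sample midrange $m = \frac{1}{2}(x_{1:n} + x_{n:n})$, start from the joint p.d.f. of $x_{1:n}$ and $x_{n:n}$,
\[ f_{1,n:n}(x_1, x_n) = n(n-1)\,[F(x_n) - F(x_1)]^{n-2} f(x_1)\, f(x_n); \quad x_1 < x_n. \]
Putting $x_1 = x$ and $x_n = 2m - x$ (Jacobian 2), the joint p.d.f. of $m$ and $x$ is
\[ f(m, x) = 2n(n-1)\,[F(2m - x) - F(x)]^{n-2} f(2m - x)\, f(x); \quad x < m. \]
The p.d.f. of $m$ is
\[ f(m) = 2n(n-1) \int_{-\infty}^{m} [F(2m - x) - F(x)]^{n-2} f(2m - x)\, f(x)\,dx; \quad -\infty < m < \infty. \]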
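Therefore the c.d.f. of $m$, evaluated at $m_0$, is
\[ F(m_0) = 2n(n-1) \int_{-\infty}^{m_0} \int_{-\infty}^{m} [F(2m - x) - F(x)]^{n-2} f(2m - x)\, f(x)\,dx\,dm. \]
Changing the order of integration, for fixed $x$ the variable $m$ runs from $x$ to $m_0$. Let
\[ I = \int_{x}^{m_0} [F(2m - x) - F(x)]^{n-2}\, 2 f(2m - x)\,dm, \]
and put $F(2m - x) - F(x) = t$, so that $2 f(2m - x)\,dm = dt$. Then
\[ I = \int t^{n-2}\,dt = \left[ \frac{t^{n-1}}{n-1} \right] = \left[ \frac{[F(2m - x) - F(x)]^{n-1}}{n-1} \right]_{m=x}^{m=m_0} = \frac{[F(2m_0 - x) - F(x)]^{n-1}}{n-1}. \]
Hence
\[ F(m_0) = n \int_{-\infty}^{m_0} [F(2m_0 - x) - F(x)]^{n-1} f(x)\,dx; \quad -\infty < m_0 < \infty. \]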
Problem:
If $R$ is the sample range and $m$ is the sample midrange from a continuous population, find their joint and marginal p.d.f.s.
Solution: Let $x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}$ be an ordered sample of size $n$ from a population with p.d.f. $f(x)$ and c.d.f. $F(x)$.
The sample range $R$ is
\[ R = x_{n:n} - x_{1:n}. \]
The sample midrange $m$ is
\[ m = \frac{1}{2}\left( x_{n:n} + x_{1:n} \right). \]
The joint p.d.f. of $x_{1:n}$ and $x_{n:n}$ is
\[ f_{1,n:n}(x_1, x_n) = n(n-1)\,[F(x_n) - F(x_1)]^{n-2} f(x_1)\, f(x_n); \quad x_1 < x_n. \]
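Inverting the transformation gives $x_{1:n} = m - \frac{r}{2}$ and $x_{n:n} = m + \frac{r}{2}$, with Jacobian 1, so the joint p.d.f. of $R$ and $m$ is
\[ f(r, m) = n(n-1) \left[ F\!\left(m + \frac{r}{2}\right) - F\!\left(m - \frac{r}{2}\right) \right]^{n-2} f\!\left(m - \frac{r}{2}\right) f\!\left(m + \frac{r}{2}\right); \quad 0 < r < \infty,\ -\infty < m < \infty. \]
The marginal p.d.f. of $R$ is
\[ f(r) = n(n-1) \int_{-\infty}^{\infty} \left[ F\!\left(m + \frac{r}{2}\right) - F\!\left(m - \frac{r}{2}\right) \right]^{n-2} f\!\left(m - \frac{r}{2}\right) f\!\left(m + \frac{r}{2}\right) dm; \quad 0 < r < \infty. \]
The marginal p.d.f. of $m$ is
\[ f(m) = n(n-1) \int_{0}^{\infty} \left[ F\!\left(m + \frac{r}{2}\right) - F\!\left(m - \frac{r}{2}\right) \right]^{n-2} f\!\left(m - \frac{r}{2}\right) f\!\left(m + \frac{r}{2}\right) dr; \quad -\infty < m < \infty. \]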
Theorem:
For the Uniform$(0, 1)$ distribution, the random variables $v_1 = \dfrac{x_{i:n}}{x_{j:n}}$ and $v_2 = x_{j:n}$, $1 \le i < j \le n$, are statistically independent, with $v_1$ and $v_2$ having $\mathrm{Beta}(i,\, j-i)$ and $\mathrm{Beta}(j,\, n-j+1)$ distributions respectively. Generalize this result.
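Proof:
For the Uniform$(0, 1)$ distribution,
\[ f(x) = 1; \quad 0 < x < 1, \qquad F(x) = \int_{0}^{x} 1\,dx = x. \]
The inverse transformation is $x_i = v_1 v_2$, $x_j = v_2$, with Jacobian
\[ J = \frac{\partial(x_i, x_j)}{\partial(v_1, v_2)} = \begin{vmatrix} v_2 & v_1 \\ 0 & 1 \end{vmatrix} = v_2. \]
The joint p.d.f. of $x_{i:n}$ and $x_{j:n}$ is
\[
\begin{aligned}
f_{i,j:n}(x_i, x_j) &= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,[F(x_i)]^{i-1}\,[F(x_j) - F(x_i)]^{j-i-1}\,[1 - F(x_j)]^{n-j} f(x_i)\, f(x_j) \\
&= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, x_i^{\,i-1}(x_j - x_i)^{j-i-1}(1 - x_j)^{n-j}; \quad 0 < x_i < x_j < 1.
\end{aligned}
\]
Hence the joint p.d.f. of $v_1$ and $v_2$ is
\[
\begin{aligned}
f_{v_1, v_2}(v_1, v_2) &= \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,(v_1 v_2)^{i-1}(v_2 - v_1 v_2)^{j-i-1}(1 - v_2)^{n-j}\, v_2 \\
&= \left[ \frac{(j-1)!}{(i-1)!\,(j-i-1)!}\, v_1^{\,i-1}(1 - v_1)^{j-i-1} \right] \left[ \frac{n!}{(j-1)!\,(n-j)!}\, v_2^{\,j-1}(1 - v_2)^{n-j} \right]; \quad 0 < v_1 < 1,\ 0 < v_2 < 1.
\end{aligned}
\]
From the above it is clear that the random variables $v_1$ and $v_2$ are statistically independent, and that they are distributed as $\mathrm{Beta}(i,\, j-i)$ and $\mathrm{Beta}(j,\, n-j+1)$ respectively.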
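The independence claim rests on the transformed joint density factorizing into a $\mathrm{Beta}(i, j-i)$ density in $v_1$ times a $\mathrm{Beta}(j, n-j+1)$ density in $v_2$. The sketch below checks that factorization pointwise for one choice of $(i, j, n)$; the function names and the chosen evaluation points are illustrative assumptions.

```python
from math import factorial, isclose


def joint_pdf_uniform(xi, xj, i, j, n):
    """Joint p.d.f. of (X_{i:n}, X_{j:n}) for Uniform(0, 1), valid for 0 < xi < xj < 1."""
    c = factorial(n) / (factorial(i - 1) * factorial(j - i - 1) * factorial(n - j))
    return c * xi ** (i - 1) * (xj - xi) ** (j - i - 1) * (1 - xj) ** (n - j)


def beta_pdf(v, a, b):
    """Beta(a, b) density for positive integer a, b."""
    c = factorial(a + b - 1) / (factorial(a - 1) * factorial(b - 1))
    return c * v ** (a - 1) * (1 - v) ** (b - 1)


def joint_pdf_v(v1, v2, i, j, n):
    """Density of (v1, v2) = (X_{i:n}/X_{j:n}, X_{j:n}); the Jacobian is v2."""
    return joint_pdf_uniform(v1 * v2, v2, i, j, n) * v2


i, j, n = 2, 5, 8
for v1, v2 in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]:
    lhs = joint_pdf_v(v1, v2, i, j, n)
    rhs = beta_pdf(v1, i, j - i) * beta_pdf(v2, j, n - j + 1)
    print(isclose(lhs, rhs, rel_tol=1e-9))  # True at every point
```

A pointwise check like this does not prove the theorem, but it catches slips in the exponents or in the normalizing constants.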
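Generalization: for $1 \le i_1 < i_2 < \cdots < i_k \le n$, the random variables
\[ v_1 = \frac{x_{i_1:n}}{x_{i_2:n}},\quad v_2 = \frac{x_{i_2:n}}{x_{i_3:n}},\quad \ldots,\quad v_{k-1} = \frac{x_{i_{k-1}:n}}{x_{i_k:n}},\quad v_k = x_{i_k:n} \]
are statistically independent, with
\[ v_{k-1} = \frac{x_{i_{k-1}:n}}{x_{i_k:n}} \sim \mathrm{Beta}(i_{k-1},\, i_k - i_{k-1}), \qquad v_k = x_{i_k:n} \sim \mathrm{Beta}(i_k,\, n - i_k + 1). \]
The inverse transformation is $x_{i_1} = v_1 v_2 \cdots v_k$, $x_{i_2} = v_2 v_3 \cdots v_k$, $\ldots$, $x_{i_k} = v_k$, with Jacobian
\[ J = \frac{\partial(x_{i_1}, x_{i_2}, \ldots, x_{i_k})}{\partial(v_1, v_2, \ldots, v_k)} = \begin{vmatrix} v_2 v_3 \cdots v_k & v_1 v_3 \cdots v_k & \cdots & v_1 v_2 \cdots v_{k-1} \\ 0 & v_3 v_4 \cdots v_k & \cdots & v_2 v_3 \cdots v_{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{vmatrix} = v_2\, v_3^{2}\, v_4^{3} \cdots v_k^{\,k-1}. \]
For the first $k$ order statistics, the joint p.d.f. is
\[ f_{1,2,\ldots,k:n}(x_{i_1}, x_{i_2}, \ldots, x_{i_k}) = \frac{n!}{(n-k)!}\,[1 - F(x_{i_k})]^{n-k} f(x_{i_1})\, f(x_{i_2}) \cdots f(x_{i_k}), \]
so that, for the Uniform$(0, 1)$ distribution,
\[ f_{v_1, v_2, \ldots, v_k}(v_1, v_2, \ldots, v_k) = \frac{n!}{(n-k)!}\,(1 - v_k)^{n-k}\, v_2\, v_3^{2}\, v_4^{3} \cdots v_k^{\,k-1}. \]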
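Problem:
For a sample from the logistic distribution, find the m.g.f. of the $i$th order statistic.
Solution:
The p.d.f. of the logistic distribution is
\[ f(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}; \quad -\infty < x < \infty. \]
The c.d.f. is
\[ F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-\infty}^{x} \frac{e^{-x}}{(1 + e^{-x})^{2}}\,dx. \]
Let $1 + e^{-x} = z$, so that $-e^{-x}\,dx = dz$. Then
\[ F(x) = -\int_{\infty}^{1 + e^{-x}} \frac{dz}{z^{2}} = \left[ \frac{1}{z} \right]_{\infty}^{1 + e^{-x}} = \frac{1}{1 + e^{-x}}. \]
Let
\[ u = \frac{1}{1 + e^{-x}}, \qquad du = \frac{e^{-x}}{(1 + e^{-x})^{2}}\,dx = f(x)\,dx. \]
Again, since $F(x) = u = \dfrac{1}{1 + e^{-x}}$,
\[ 1 + e^{-x} = \frac{1}{u}, \qquad e^{-x} = \frac{1 - u}{u}, \qquad -x = \ln\frac{1 - u}{u}, \qquad x = \ln\frac{u}{1 - u}, \qquad e^{x} = \frac{u}{1 - u}. \]
Now,
\[
\begin{aligned}
M_{i:n}(t) &= E\left[ e^{t X_{i:n}} \right] = \int_{-\infty}^{\infty} e^{tx} f_{i:n}(x)\,dx \\
&= \frac{n!}{(i-1)!\,(n-i)!} \int_{-\infty}^{\infty} \left( e^{x} \right)^{t} [F(x)]^{i-1}\,[1 - F(x)]^{n-i} f(x)\,dx \\
&= \frac{n!}{(i-1)!\,(n-i)!} \int_{0}^{1} \left( \frac{u}{1 - u} \right)^{t} u^{i-1}(1 - u)^{n-i}\,du \\
&= \frac{n!}{(i-1)!\,(n-i)!} \int_{0}^{1} u^{i+t-1}(1 - u)^{n-i-t+1-1}\,du \\
&= \frac{n!}{(i-1)!\,(n-i)!}\, B(i + t,\ n - i - t + 1) \\
&= \frac{\Gamma(n+1)}{\Gamma(i)\,\Gamma(n-i+1)} \cdot \frac{\Gamma(i+t)\,\Gamma(n-i-t+1)}{\Gamma(n+1)} \\
&= \frac{\Gamma(i+t)\,\Gamma(n-i-t+1)}{\Gamma(i)\,\Gamma(n-i+1)}.
\end{aligned}
\]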
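The closed form $M_{i:n}(t) = \Gamma(i+t)\,\Gamma(n-i-t+1) / [\Gamma(i)\,\Gamma(n-i+1)]$ can be compared with a direct numerical evaluation of the integral in $u$. The sketch below uses the midpoint rule and assumes $-i < t < n-i+1$ so that both gamma functions are finite; the function names and the test values are illustrative.

```python
from math import factorial, gamma


def mgf_closed(t, i, n):
    """Gamma-function form of the m.g.f. of the i-th logistic order statistic."""
    return gamma(i + t) * gamma(n - i + 1 - t) / (gamma(i) * gamma(n - i + 1))


def mgf_numeric(t, i, n, steps=100000):
    """Midpoint-rule value of c * integral_0^1 u^(i+t-1) (1-u)^(n-i-t) du."""
    c = factorial(n) / (factorial(i - 1) * factorial(n - i))
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += u ** (i + t - 1) * (1 - u) ** (n - i - t)
    return c * total * h


print(mgf_closed(0.5, 2, 5))   # closed form
print(mgf_numeric(0.5, 2, 5))  # numerical integral, agrees to several decimals
print(mgf_closed(0.0, 2, 5))   # an m.g.f. equals 1 at t = 0
```

The midpoint rule is used because it never evaluates the integrand at the endpoints, where the integrand can be singular for other values of $t$.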
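Problem:
Let $X_1, X_2, \ldots, X_n$ be independent geometric random variables with
\[ P(X_i = x_i) = q_i^{\,x_i - 1} p_i; \quad q_i = 1 - p_i,\ x_i = 1, 2, \ldots \]
Find the distribution of the smallest order statistic $X_{1:n}$.
Solution:
Let $F_{1:n}(x)$ denote the c.d.f. of the 1st order statistic $X_{1:n}$, and let $F_i(x)$ denote the c.d.f. of $X_i$. Then
\[
\begin{aligned}
F_{1:n}(x) &= P(X_{1:n} \le x) \\
&= 1 - P(X_{1:n} > x) \\
&= 1 - P(\text{all } X_i > x) \\
&= 1 - \prod_{i=1}^{n} P(X_i > x) \\
&= 1 - \prod_{i=1}^{n} [1 - P(X_i \le x)] \\
&= 1 - \prod_{i=1}^{n} [1 - F_i(x)]. \qquad (1)
\end{aligned}
\]
Now, using $1 + r + r^{2} + \cdots + r^{m-1} = \dfrac{1 - r^{m}}{1 - r}$,
\[ F_i(x) = \sum_{x_i = 1}^{x} p_i q_i^{\,x_i - 1} = p_i\, \frac{1 - q_i^{\,x}}{1 - q_i} = (1 - q_i)\, \frac{1 - q_i^{\,x}}{1 - q_i} = 1 - q_i^{\,x}. \]
From (1) we get
\[ F_{1:n}(x) = 1 - \prod_{i=1}^{n} \left[ 1 - (1 - q_i^{\,x}) \right] = 1 - \prod_{i=1}^{n} q_i^{\,x}. \]
Now,
\[
\begin{aligned}
f_{1:n}(x) &= F_{1:n}(x) - F_{1:n}(x - 1) \\
&= \left[ 1 - \prod_{i=1}^{n} q_i^{\,x} \right] - \left[ 1 - \prod_{i=1}^{n} q_i^{\,x-1} \right] \\
&= \prod_{i=1}^{n} q_i^{\,x-1} - \prod_{i=1}^{n} q_i^{\,x} \\
&= \left( \prod_{i=1}^{n} q_i \right)^{x-1} \left( 1 - \prod_{i=1}^{n} q_i \right).
\end{aligned}
\]
Let $p = 1 - \prod_{i=1}^{n} q_i = 1 - q_1 q_2 \cdots q_n$ and $q = 1 - p$. Then
\[ f_{1:n}(x) = (1 - p)^{x-1} p = p\, q^{\,x-1}; \quad x = 1, 2, \ldots \]
so the smallest order statistic is again geometric, with parameter $p$.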
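A short numerical check of this result: the p.m.f. obtained from the c.d.f. difference $F_{1:n}(x) - F_{1:n}(x-1)$ should match a geometric p.m.f. with $p = 1 - q_1 q_2 \cdots q_n$. The success probabilities below are arbitrary illustrative values, and the function names are not from the notes.

```python
from math import prod


def min_geometric_pmf(x, ps):
    """P(X_{1:n} = x) from F_{1:n}(x) - F_{1:n}(x - 1), with F_{1:n}(k) = 1 - prod(q_i^k)."""
    qs = [1 - p for p in ps]
    cdf = lambda k: 1 - prod(q ** k for q in qs)
    return cdf(x) - cdf(x - 1)


def geometric_pmf(x, p):
    """P(X = x) = (1 - p)^(x - 1) p for x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p


ps = [0.2, 0.5, 0.1]                  # illustrative p_i values
p = 1 - prod(1 - pi for pi in ps)     # 1 - 0.8 * 0.5 * 0.9 = 0.64
for x in range(1, 6):
    print(x, min_geometric_pmf(x, ps), geometric_pmf(x, p))
```

The two columns agree up to floating-point rounding for every $x$.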
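Problem:
Consider a random sample of size $n$ from a continuous population whose p.d.f. $f(x)$ is symmetric about $x = \mu$. Show that $f_{r:n}(x)$ and $f_{n-r+1:n}(x)$ are mirror images of each other with $x = \mu$ as the mirror, that is,
\[ f_{r:n}(\mu + x) = f_{n-r+1:n}(\mu - x). \]
Solution:
By symmetry about $\mu$,
\[ f(\mu + x) = f(\mu - x), \qquad F(\mu + x) = 1 - F(\mu - x). \]
The p.d.f. of the $r$th order statistic is
\[ f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\,[F(x)]^{r-1}\,[1 - F(x)]^{n-r} f(x). \]
Hence
\[
\begin{aligned}
f_{r:n}(\mu + x) &= \frac{n!}{(r-1)!\,(n-r)!}\,[F(\mu + x)]^{r-1}\,[1 - F(\mu + x)]^{n-r} f(\mu + x) \\
&= \frac{n!}{(r-1)!\,(n-r)!}\,[1 - F(\mu - x)]^{r-1}\,[1 - \{1 - F(\mu - x)\}]^{n-r} f(\mu - x) \\
&= \frac{n!}{(n-r)!\,(r-1)!}\,[F(\mu - x)]^{n-r}\,[1 - F(\mu - x)]^{r-1} f(\mu - x) \\
&= \frac{n!}{(n-r+1-1)!\,\{n - (n-r+1)\}!}\,[F(\mu - x)]^{n-r+1-1}\,[1 - F(\mu - x)]^{n-(n-r+1)} f(\mu - x) \\
&= f_{n-r+1:n}(\mu - x).
\end{aligned}
\]
Now, for the joint p.d.f. of the $r$th and $s$th order statistics $(r < s,\ x < y)$,
\[ f_{r,s:n}(x, y) = \frac{n!}{(r-1)!\,(s-r-1)!\,(n-s)!}\,[F(x)]^{r-1}\,[F(y) - F(x)]^{s-r-1}\,[1 - F(y)]^{n-s} f(x)\, f(y). \]
Therefore
\[
\begin{aligned}
f_{r,s:n}(\mu + x, \mu + y) &= \frac{n!}{(r-1)!\,(s-r-1)!\,(n-s)!}\,[F(\mu + x)]^{r-1}\,[F(\mu + y) - F(\mu + x)]^{s-r-1}\,[1 - F(\mu + y)]^{n-s} f(\mu + x)\, f(\mu + y) \\
&= \frac{n!}{(r-1)!\,(s-r-1)!\,(n-s)!}\,[1 - F(\mu - x)]^{r-1}\,[\{1 - F(\mu - y)\} - \{1 - F(\mu - x)\}]^{s-r-1}\,[1 - \{1 - F(\mu - y)\}]^{n-s} f(\mu - x)\, f(\mu - y) \\
&= \frac{n!}{(n-s)!\,(s-r-1)!\,(r-1)!}\,[F(\mu - y)]^{n-s}\,[F(\mu - x) - F(\mu - y)]^{s-r-1}\,[1 - F(\mu - x)]^{r-1} f(\mu - y)\, f(\mu - x) \\
&= \frac{n!}{(n-s+1-1)!\,\{(n-r+1) - (n-s+1) - 1\}!\,\{n - (n-r+1)\}!}\,[F(\mu - y)]^{n-s+1-1}\,[F(\mu - x) - F(\mu - y)]^{(n-r+1)-(n-s+1)-1} \\
&\qquad \times [1 - F(\mu - x)]^{n-(n-r+1)} f(\mu - y)\, f(\mu - x),
\end{aligned}
\]
that is,
\[ f_{r,s:n}(\mu + x, \mu + y) = f_{n-s+1,\,n-r+1:n}(\mu - y, \mu - x). \]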
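The mirror-image property $f_{r:n}(\mu + x) = f_{n-r+1:n}(\mu - x)$ can be spot-checked for any symmetric population. The sketch below assumes a logistic population centred at an arbitrary $\mu = 1.5$, chosen only because its c.d.f. has a closed form; the names `MU`, `F`, `f` and `pdf_order_stat` are illustrative.

```python
from math import exp, factorial

MU = 1.5  # centre of symmetry (illustrative value)


def F(x):
    """Logistic c.d.f. centred at MU, so F(MU + x) = 1 - F(MU - x)."""
    return 1 / (1 + exp(-(x - MU)))


def f(x):
    """Logistic p.d.f. centred at MU, written as F(x) * (1 - F(x))."""
    return F(x) * (1 - F(x))


def pdf_order_stat(x, r, n):
    """f_{r:n}(x) for the population above."""
    c = factorial(n) / (factorial(r - 1) * factorial(n - r))
    return c * F(x) ** (r - 1) * (1 - F(x)) ** (n - r) * f(x)


n = 6
for r in range(1, n + 1):
    for x in (0.1, 0.7, 2.0):
        left = pdf_order_stat(MU + x, r, n)
        right = pdf_order_stat(MU - x, n - r + 1, n)
        assert abs(left - right) < 1e-12
print("mirror-image property holds at all tested points")
```

For $r = (n+1)/2$ with odd $n$ the property says that the median's density is itself symmetric about $\mu$.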
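Problem: (David: Exercise 2.1.3)
Let
\[ f(x) = \begin{cases} e^{-x}; & x \ge 0 \\ 0; & x < 0. \end{cases} \]
Show that the c.d.f. of the largest order statistic in a random sample of size $n$ is
\[ F_{n:n}(x) = \left( 1 - e^{-x} \right)^{n}. \]
Hence prove that, as $n \to \infty$, the c.d.f. of $x_{n:n} - \ln n$ tends to the limiting form $e^{-e^{-x}}$, $-\infty < x < \infty$.
Solution:
\[ f(x) = e^{-x}, \qquad F(x) = \int_{0}^{x} e^{-y}\,dy = \left[ -e^{-y} \right]_{0}^{x} = 1 - e^{-x}. \]
The p.d.f. of the largest order statistic is
\[ f_{n:n}(x) = n\,[F(x)]^{n-1} f(x) = n \left( 1 - e^{-x} \right)^{n-1} e^{-x}. \qquad (1) \]
Now,
\[
\begin{aligned}
F_{n:n}(x) &= P(x_{n:n} \le x) = \int_{0}^{x} f_{n:n}(y)\,dy \\
&= \int_{0}^{x} n \left( 1 - e^{-y} \right)^{n-1} e^{-y}\,dy.
\end{aligned}
\]
Let $z = 1 - e^{-y}$, so that $dz = e^{-y}\,dy$ and $0 \le z \le 1 - e^{-x}$. Then
\[ F_{n:n}(x) = n \int_{0}^{1 - e^{-x}} z^{n-1}\,dz = n \left[ \frac{z^{n}}{n} \right]_{0}^{1 - e^{-x}} = \left( 1 - e^{-x} \right)^{n}. \]
Let $z = x_{n:n} - \ln n$, so that $dz = dx$ and $x_{n:n} = z + \ln n$. From (1), the p.d.f. of $z$ is
\[ f(z) = n \left( 1 - e^{-(z + \ln n)} \right)^{n-1} e^{-(z + \ln n)} = n \left( 1 - \frac{1}{n} e^{-z} \right)^{n-1} \frac{1}{n} e^{-z}; \quad z \ge -\ln n, \]
and its c.d.f. is
\[ F(z) = \int_{-\ln n}^{z} f(t)\,dt = \int_{-\ln n}^{z} n \left( 1 - \frac{1}{n} e^{-t} \right)^{n-1} \frac{1}{n} e^{-t}\,dt. \]
Let $u = 1 - \frac{1}{n} e^{-t}$, so that $du = \frac{1}{n} e^{-t}\,dt$ and $0 \le u \le 1 - \frac{1}{n} e^{-z}$. Then
\[ F(z) = \int_{0}^{1 - \frac{1}{n} e^{-z}} n u^{n-1}\,du = n \left[ \frac{u^{n}}{n} \right]_{0}^{1 - \frac{1}{n} e^{-z}} = \left( 1 - \frac{1}{n} e^{-z} \right)^{n}. \]
Now, since $\lim_{n \to \infty} \left( 1 - \frac{a}{n} \right)^{n} = e^{-a}$,
\[ \lim_{n \to \infty} F(z) = \lim_{n \to \infty} \left( 1 - \frac{1}{n} e^{-z} \right)^{n} = e^{-e^{-z}}; \quad -\infty < z < \infty. \]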
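The convergence of $\left(1 - \frac{1}{n} e^{-z}\right)^{n}$ to $e^{-e^{-z}}$ can be seen numerically. The sketch below evaluates both at a single point for increasing $n$; the evaluation point $z = 0.5$ and the function names are arbitrary illustrative choices.

```python
from math import exp, log


def cdf_shifted_max(z, n):
    """Exact c.d.f. of X_{n:n} - ln(n) for a standard exponential sample."""
    if z < -log(n):
        return 0.0
    return (1 - exp(-z) / n) ** n


def limit_cdf(z):
    """Limiting form exp(-exp(-z))."""
    return exp(-exp(-z))


z = 0.5
errors = [abs(cdf_shifted_max(z, n) - limit_cdf(z)) for n in (10, 1000, 100000)]
print(errors)  # the gap shrinks roughly like 1/n
```

The gap is of order $1/n$, which is why even moderate sample sizes already give a close approximation near the centre of the distribution.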
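Problem:
Let $x_1 < x_2 < \cdots < x_N$ be the elements of a finite population from which a sample $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$ $(n \le N)$ is drawn without replacement. Show that
\[ \Pr\left( x_{(i)} = x_t \right) = \frac{\dbinom{t-1}{i-1} \dbinom{N-t}{n-i}}{\dbinom{N}{n}}; \quad t = i, i+1, \ldots, N - n + i. \]
Solution:
Population: $x_1 < x_2 < x_3 < \cdots < x_t < x_{t+1} < \cdots < x_N$, with $t - 1$ elements below $x_t$ and $N - t$ elements above it.
If $x_{(i)} = x_t$, then $x_{(1)}, x_{(2)}, \ldots, x_{(i-1)}$, i.e. the first $i - 1$ observations of the sample, will come from the $x_1, x_2, \ldots, x_{t-1}$ observations of the population. This happens in $\dbinom{t-1}{i-1}$ ways. Similarly, the $x_{(i+1)}, x_{(i+2)}, \ldots, x_{(n)}$ observations of the sample will come from the $x_{t+1}, x_{t+2}, \ldots, x_N$ observations of the population. This happens in $\dbinom{N-t}{n-i}$ ways.
According to the multiplication rule, $\dbinom{t-1}{i-1} \dbinom{N-t}{n-i}$ is the number of cases favorable to the event $x_{(i)} = x_t$, and the total number of cases is $\dbinom{N}{n}$. Therefore
\[ \Pr\left( x_{(i)} = x_t \right) = \frac{\dbinom{t-1}{i-1} \dbinom{N-t}{n-i}}{\dbinom{N}{n}}; \quad t = i, i+1, \ldots, N - n + i. \]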
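The counting argument can be turned into a small exact computation. The sketch below evaluates $\Pr(x_{(i)} = x_t)$ with exact fractions and confirms that the probabilities over $t = i, \ldots, N - n + i$ sum to one; the function name and the chosen values of $N$, $n$, $i$ are illustrative.

```python
from fractions import Fraction
from math import comb


def pmf_order_stat_finite(t, i, n, N):
    """Pr(x_(i) = x_t) when n of N ranked population elements are drawn without replacement."""
    return Fraction(comb(t - 1, i - 1) * comb(N - t, n - i), comb(N, n))


N, n, i = 10, 4, 2
support = range(i, N - n + i + 1)
probs = [pmf_order_stat_finite(t, i, n, N) for t in support]
print(sum(probs))                         # 1
print(pmf_order_stat_finite(1, 1, 2, 5))  # 2/5
```

Using `Fraction` keeps every probability exact, so the total is exactly 1 rather than 1 up to rounding.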
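Problem:
Consider the standard uniform distribution. Find the cumulative distribution function and the probability density function of the $r$th order statistic, and hence find its mean and variance.
Solution:
\[ f_X(x) = 1; \quad 0 < x < 1, \qquad F_X(x) = \int_{0}^{x} f_X(x)\,dx = x. \]
The c.d.f. of the $r$th order statistic is therefore
\[ F_{r:n}(x) = \sum_{i=r}^{n} \binom{n}{i} x^{i}(1 - x)^{n-i}; \quad 0 < x < 1. \]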
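Substituting $F(x) = x$ and $f(x) = 1$ into the general formula for $f_{r:n}$ gives the density $\frac{n!}{(r-1)!\,(n-r)!}\, x^{r-1}(1-x)^{n-r}$ on $(0, 1)$, whose moments are ratios of factorials. The sketch below, which is an illustration and not the original solution, computes the mean and variance exactly and compares them with the standard Beta$(r, n-r+1)$ expressions $\frac{r}{n+1}$ and $\frac{r(n-r+1)}{(n+1)^{2}(n+2)}$; those two expressions are stated here as known results rather than derived from the text.

```python
from fractions import Fraction
from math import factorial


def uniform_order_stat_moment(k, r, n):
    """E[X_{r:n}^k] = n!/((r-1)!(n-r)!) * (r+k-1)!(n-r)!/(n+k)! for Uniform(0, 1)."""
    c = Fraction(factorial(n), factorial(r - 1) * factorial(n - r))
    beta_integral = Fraction(factorial(r + k - 1) * factorial(n - r), factorial(n + k))
    return c * beta_integral


def mean_and_variance(r, n):
    """Exact mean and variance of the r-th uniform order statistic."""
    m1 = uniform_order_stat_moment(1, r, n)
    m2 = uniform_order_stat_moment(2, r, n)
    return m1, m2 - m1 ** 2


mean, var = mean_and_variance(2, 5)
print(mean, var)  # 1/3 2/63
```

For $r = 2$, $n = 5$ the output matches $\frac{r}{n+1} = \frac{1}{3}$ and $\frac{r(n-r+1)}{(n+1)^{2}(n+2)} = \frac{2}{63}$.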