
MULTIVARIATE DISTRIBUTIONS

TEXTBOOK
Fundamentals of Probability with Stochastic Processes, Fourth Edition
by Saeed Ghahramani

INSTRUCTOR
Ying-ping Chen
Dept. of Computer Science, NYCU
OVERVIEW

• Joint Distributions of n > 2 Random Variables


• Order Statistics
• Multinomial Distributions
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Joint Probability Mass Functions
Definition 9.1
Let X1 , X2 , . . ., Xn be discrete random variables defined on the same sample
space, with sets of possible values A1 , A2 , . . ., An , respectively. The function

p(x1 , x2 , . . ., xn ) = P(X1 =x1 , X2 =x2 , . . ., Xn =xn )

is called the joint probability mass function of X1 , X2 , . . ., Xn .

Note that
(a). p(x1 , x2 , . . ., xn ) ≥ 0.
(b). If for some i, 1 ≤ i ≤ n, xi ∉ Ai , then p(x1 , x2 , . . ., xn ) = 0.
(c). ∑xi ∈Ai ,1≤i≤n p(x1 , x2 , . . ., xn ) = 1.
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
If the joint probability mass function of random variables X1 , X2 , . . ., Xn ,
p(x1 , x2 , . . ., xn ), is given, then for 1 ≤ i ≤ n, the marginal probability mass
function of Xi , pXi , can be found from p(x1 , x2 , . . ., xn ) by
pXi(xi) = P(Xi = xi) = P(Xi = xi , Xj ∈ Aj , 1 ≤ j ≤ n, j ≠ i) = ∑_{xj∈Aj, j≠i} p(x1, x2, . . ., xn) .    (9.1)

More generally, to find the joint probability mass function marginalized over a given set of k of these random variables, we sum up p(x1, x2, . . ., xn) over all possible values of the remaining n − k random variables. For
example, if p(x, y, z) denotes the joint probability mass function of random
variables X, Y, and Z, then
pX,Y(x, y) = ∑_z p(x, y, z)
is the joint probability mass function marginalized over X and Y.
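As a concrete illustration (not from the textbook), the following Python sketch stores a small, made-up joint probability mass function as a dictionary and marginalizes it over z in exactly this way.

# Sketch: marginalizing a joint pmf p(x, y, z) over z; the probabilities are made up.
from collections import defaultdict

p = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.15,
    (0, 1, 0): 0.20, (0, 1, 1): 0.05,
    (1, 0, 0): 0.25, (1, 0, 1): 0.10,
    (1, 1, 0): 0.05, (1, 1, 1): 0.10,
}

p_xy = defaultdict(float)
for (x, y, z), prob in p.items():
    p_xy[(x, y)] += prob              # p_{X,Y}(x, y) = sum over z of p(x, y, z)

print(p_xy[(0, 1)])                   # 0.25
print(sum(p_xy.values()))             # ~1.0, property (c) of Definition 9.1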
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
• Examples:
– Suppose that the joint probability mass function of X, Y, and Z is

p(x, y, z) = C(5, x) C(10, y) C(8, z) / C(23, 7)

for 0 ≤ x ≤ 5, 0 ≤ y ≤ 7, 0 ≤ z ≤ 7, x + y + z = 7; p(x, y, z) = 0, otherwise. Here C(n, k) = n!/[k!(n − k)!] denotes the binomial coefficient. Find the marginal probability mass function of X.
□ The marginal probability mass function of X, pX, is

pX(x) = ∑_{x+y+z=7, 0≤y≤7, 0≤z≤7} p(x, y, z) = ∑_{y=0}^{7−x} C(5, x) C(10, y) C(8, 7−x−y) / C(23, 7)
      = [C(5, x)/C(23, 7)] ∑_{y=0}^{7−x} C(10, y) C(8, 7−x−y) = C(5, x) C(18, 7−x) / C(23, 7) ,  0 ≤ x ≤ 5 .
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Remark 9.1
A joint probability mass function, such as p(x, y, z) of Example 9.1, is called
multivariate hypergeometric. In general, suppose that a box contains n1
marbles of type 1, n2 marbles of type 2, . . ., and nr marbles of type r. If n
marbles are drawn at random and Xi (i = 1, 2, . . . , r) is the number of the
marbles of type i drawn, the joint probability mass function of X1 , X2 , . . ., Xr is
called multivariate hypergeometric and is given by

p(x1, x2, . . ., xr) = C(n1, x1) C(n2, x2) ⋯ C(nr, xr) / C(n1 + n2 + ⋯ + nr, n) ,

where x1 + x2 + ⋯ + xr = n.
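As a numerical illustration (a sketch, not from the textbook), the pmf of Remark 9.1 can be evaluated with Python's math.comb; the check below uses the marble counts of the preceding example (5, 10, and 8 marbles, n = 7 drawn) and verifies that the probabilities sum to 1.

# Sketch: evaluate the multivariate hypergeometric pmf and check normalization.
from math import comb
from itertools import product

def mv_hypergeom_pmf(x, counts, n):
    # p(x1,...,xr) = prod_i C(n_i, x_i) / C(n_1 + ... + n_r, n), for x_1 + ... + x_r = n
    if sum(x) != n or any(xi > ni for xi, ni in zip(x, counts)):
        return 0.0
    num = 1
    for xi, ni in zip(x, counts):
        num *= comb(ni, xi)
    return num / comb(sum(counts), n)

counts, n = (5, 10, 8), 7
total = sum(mv_hypergeom_pmf((x, y, n - x - y), counts, n)
            for x, y in product(range(n + 1), repeat=2) if x + y <= n)
print(total)                          # ~1.0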



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
We now extend the definition of the joint distribution from 2 to n > 2
random variables. Let X1 , X2 , . . ., Xn be n random variables (discrete,
continuous, or mixed). The joint distribution function of X1 , X2 , . . ., Xn is
defined by

F(t1 , t2 , . . ., tn ) = P(X1 ≤t1 , X2 ≤t2 , . . ., Xn ≤tn ) (9.2)

for all −∞ < ti < +∞, i = 1, 2, . . . , n. The marginal distribution function of Xi, 1 ≤ i ≤ n, can be found from F as follows:

FXi(ti) = P(Xi ≤ ti) = P(X1 < ∞, . . . , Xi−1 < ∞, Xi ≤ ti , Xi+1 < ∞, . . . , Xn < ∞)
        = lim_{tj→∞, 1≤j≤n, j≠i} F(t1, t2, . . ., tn) .    (9.3)
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
To find the joint distribution function marginalized over a given set of k of
these random variables, we calculate the limit of F(t1 , t2 , . . ., tn ) as t j → ∞,
for every j that belongs to one of the remaining n − k variables. For
example, if F(x, y, z, t) denotes the joint distribution function of
random variables X, Y, Z, and T, then the joint distribution function
marginalized over Y and T is given by

FY,T(y, t) = lim_{x,z→∞} F(x, y, z, t) .
Just as in the case of two random variables, F, the joint distribution function of n random variables, satisfies the following:
(a). F is nondecreasing in each argument.
(b). F is right continuous in each argument.
(c). F(t1 , t2 , . . ., ti−1 , −∞, ti+1 , . . . , tn ) = 0 for i = 1, 2, . . . , n.
(d). F(∞, ∞, . . . , ∞) = 1.
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Suppose that X1 , X2 , . . ., Xn are random variables (discrete, continuous, or
mixed) on a sample space. We say that they are independent if, for
arbitrary subsets A1 , A2 , . . ., An of real numbers,

P(X1∈A1, X2∈A2, . . ., Xn∈An) = P(X1∈A1)P(X2∈A2)⋯P(Xn∈An) .

Similar to the case of two random variables, X1, X2, . . ., Xn are independent if and only if, for any xi ∈ R, i = 1, 2, . . . , n,

P(X1≤x1, X2≤x2, . . ., Xn≤xn) = P(X1≤x1)P(X2≤x2)⋯P(Xn≤xn) .

That is, X1 , X2 , . . ., Xn are independent if and only if

F(x1 , x2 , . . ., xn ) = FX1 (x1 )FX2 (x2 )⋯FXn (xn ) .


JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
If X1 , X2 , . . ., Xn are discrete, the definition of independence reduces to the
following condition:

P(X1=x1, X2=x2, . . ., Xn=xn) = P(X1=x1)P(X2=x2)⋯P(Xn=xn)    (9.4)

for any set of points, xi, i = 1, 2, . . . , n. Let X, Y, and Z be independent random variables. By definition, for arbitrary subsets A1, A2, and A3 of R,

P(X ∈ A1, Y ∈ A2, Z ∈ A3) = P(X ∈ A1)P(Y ∈ A2)P(Z ∈ A3) .    (9.5)

Now, if in (9.5) we let A2 = R, then since the event Y ∈ R is certain and has
probability 1, we get

P(X ∈ A1 , Z ∈ A3 ) = P(X ∈ A1 )P(Z ∈ A3 ) .

This shows that X and Z are independent random variables. In the same
way it can be shown that {X, Y} and {Y, Z} are also independent sets.
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Definition 9.2
A collection of random variables is called independent if all of its finite
subcollections are independent.

It is important to know that the result of Theorem 8.5 is true for any number
of random variables:
If {X1 , X2 , . . .} is a sequence of independent random variables and
for i = 1, 2, . . ., gi : R → R is a real-valued function, then the se-
quence {g1 (X1 ), g2 (X2 ), . . .} is also an independent sequence of
random variables.



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

The generalization of Theorem 8.4 follows from (9.4):


Theorem 9.1
Let X1 , X2 , . . ., Xn be jointly discrete random variables with the joint
probability mass function p(x1 , x2 , . . ., xn ). Then X1 , X2 , . . ., Xn are
independent if and only if p(x1 , x2 , . . ., xn ) is the product of their marginal
probability mass functions pX1 (x1 ), pX2 (x2 ), . . ., pXn (xn ).



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
A generalization of Theorem 8.1 from dimension 2 to n:
Theorem 9.2
Let p(x1 , x2 , . . ., xn ) be the joint probability mass function of discrete random
variables X1 , X2 , . . ., Xn . For 1 ≤ i ≤ n, let Ai be the set of possible values of Xi .
If h is a function of n variables from Rn to R, then Y = h(X1 , X2 , . . ., Xn ) is a
discrete random variable with expected value given by

E(Y) = ∑_{xn∈An} ⋯ ∑_{x1∈A1} h(x1, x2, . . ., xn) p(x1, x2, . . ., xn) ,

provided that the sum is finite.

Using Theorems 9.1 and 9.2, an almost identical proof to that of Theorem
8.6 implies that
The expected value of the product of several independent discrete
random variables is equal to the product of their expected values.
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

• Examples:
– Let

p(x, y, z) = k(x² + y² + yz) ,  x = 0, 1, 2; y = 2, 3; z = 3, 4 .

□ (a) For what value of k is p(x, y, z) a joint probability mass function? By solving

k ∑_{x=0}^{2} ∑_{y=2}^{3} ∑_{z=3}^{4} (x² + y² + yz) = 1 ,

we get k = 1/203.



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
□ (b) Find pY,Z(y, z) and pZ(z):

pY,Z(y, z) = (1/203) ∑_{x=0}^{2} (x² + y² + yz) = (1/203)(3y² + 3yz + 5) ,  y = 2, 3; z = 3, 4 .

pZ(z) = (1/203) ∑_{x=0}^{2} ∑_{y=2}^{3} (x² + y² + yz) = (15/203) z + 7/29 ,  z = 3, 4 .

□ (c) Find E(XZ): By Theorem 9.2,

E(XZ) = (1/203) ∑_{x=0}^{2} ∑_{y=2}^{3} ∑_{z=3}^{4} xz(x² + y² + yz) = 774/203 .
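All three parts can be confirmed by brute-force enumeration; the sketch below (exact arithmetic via Python's fractions module) reproduces k = 1/203, pZ, and E(XZ) = 774/203.

# Brute-force check of parts (a)-(c).
from fractions import Fraction

xs, ys, zs = (0, 1, 2), (2, 3), (3, 4)
g = lambda x, y, z: x**2 + y**2 + y*z

k = Fraction(1, sum(g(x, y, z) for x in xs for y in ys for z in zs))
print(k)                                              # 1/203

for z in zs:                                          # p_Z(z) = (15/203) z + 7/29
    print(z, k * sum(g(x, y, z) for x in xs for y in ys))

print(k * sum(x * z * g(x, y, z) for x in xs for y in ys for z in zs))   # 774/203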
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Joint Probability Density Functions
Definition 9.3
Let X1 , X2 , . . ., Xn be continuous random variables defined on the same
sample space. We say that X1 , X2 , . . ., Xn have a continuous joint
distribution if there exists a nonnegative function of n variables,
f (x1 , x2 , . . ., xn ), on R × R × ⋯ × R ≡ Rn such that for any region R in Rn that
can be formed from n-dimensional rectangles by a countable number of set
operations,

P((X1, X2, . . ., Xn) ∈ R) = ∫…∫_R f(x1, x2, . . ., xn) dx1 dx2 ⋯ dxn .    (9.6)

The function f (x1 , x2 , . . ., xn ) is called the joint probability density


function of X1 , X2 , . . ., Xn .
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Let R = {(x1 , x2 , . . ., xn ) : xi ∈ Ai , 1 ≤ i ≤ n}, where Ai , 1 ≤ i ≤ n, is any subset
of real numbers that can be constructed from intervals by a countable
number of set operations. Then (9.6) gives

P(X1∈A1, X2∈A2, . . ., Xn∈An) = ∫_{An} ∫_{An−1} ⋯ ∫_{A1} f(x1, x2, . . ., xn) dx1 dx2 ⋯ dxn .

Letting Ai = (−∞, +∞), 1 ≤ i ≤ n, this implies that

∫_{−∞}^{+∞} ⋯ ∫_{−∞}^{+∞} f(x1, x2, . . ., xn) dx1 dx2 ⋯ dxn = 1 .

Let fXi be the marginal probability density function of Xi, 1 ≤ i ≤ n. Then

fXi(xi) = ∫_{−∞}^{+∞} ⋯ ∫_{−∞}^{+∞} f(x1, x2, . . ., xn) dx1 ⋯ dxi−1 dxi+1 ⋯ dxn    (n − 1 integrals) .    (9.7)
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Generally, to find the joint probability density function marginalized over


a given set of k of these random variables, we integrate f (x1 , x2 , . . ., xn )
over all possible values of the remaining n − k random variables. For
example, if f (x, y, z, t) denotes the joint probability density function of
random variables X, Y, Z, and T, then
fY,T(y, t) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} f(x, y, z, t) dx dz

is the joint probability density function marginalized over Y and T, whereas


fX,Z,T(x, z, t) = ∫_{−∞}^{+∞} f(x, y, z, t) dy

is the joint probability density function marginalized over X, Z, and T.



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

The following theorem is the generalization of Theorem 8.7 and the


continuous analog of Theorem 9.1. Its proof is similar to the proof of
Theorem 8.7.
Theorem 9.3
Let X1 , X2 , . . ., Xn be jointly continuous random variables with the joint
probability density function f (x1 , x2 , . . ., xn ). Then X1 , X2 , . . ., Xn are
independent if and only if f (x1 , x2 , . . ., xn ) is the product of their marginal
densities fX1 (x1 ), fX2 (x2 ), . . ., fXn (xn ).



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Let F be the joint distribution function of jointly continuous random variables X1, X2, . . ., Xn with the joint probability density function f(x1, x2, . . ., xn). Then

F(t1, t2, . . ., tn) = ∫_{−∞}^{tn} ∫_{−∞}^{tn−1} ⋯ ∫_{−∞}^{t1} f(x1, x2, . . ., xn) dx1 dx2 ⋯ dxn ,    (9.8)

and

f(x1, x2, . . ., xn) = ∂ⁿF(x1, x2, . . ., xn) / (∂x1 ∂x2 ⋯ ∂xn) .    (9.9)



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
The following is the continuous analog of Theorem 9.2. It is also a
generalization of Theorem 8.2 from dimension 2 to n.
Theorem 9.4
Let f (x1 , x2 , . . ., xn ) be the joint probability density function of random
variables X1 , X2 , . . ., Xn . If h is a function of n variables from Rn to R, then
Y = h(X1 , X2 , . . ., Xn ) is a random variable with expected value given by
E(Y) = ∫_{−∞}^{+∞} ⋯ ∫_{−∞}^{+∞} h(x1, x2, . . ., xn) f(x1, x2, . . ., xn) dx1 dx2 ⋯ dxn ,
provided that the integral is absolutely convergent.

Using Theorems 9.3 and 9.4, an almost identical proof to that of Theorem
8.6 implies that
The expected value of the product of several independent random
variables is equal to the product of their expected values.
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
• Examples:
– A system has n components, whose lifetimes are exponential
random variables with parameters λ1 , λ2 , . . ., λn , respectively.
Suppose that the lifetimes of the components are independent
random variables, and the system fails as soon as one of its
components fails. Find the probability density function and the
expected value of the time until the system fails.
□ Let X1 , X2 , . . ., Xn be the lifetimes of the n components,
respectively. Then X1 , X2 , . . ., Xn are independent random
variables and for i = 1, 2, . . . , n,

P(Xi ≤ t) = 1 − e^{−λi t} ,  t ≥ 0 .



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
□ Letting X be the time until the system fails, we have
X = min(X1 , X2 , . . ., Xn ). Therefore,

P(X > t) = P(min(X1, X2, . . ., Xn) > t)
         = P(X1 > t, X2 > t, . . . , Xn > t)
         = P(X1 > t)P(X2 > t)⋯P(Xn > t)
         = e^{−λ1 t} e^{−λ2 t} ⋯ e^{−λn t} = e^{−(λ1+λ2+⋯+λn)t} ,  t ≥ 0 .

Let f be the probability density function of X; then

f(t) = (d/dt) P(X ≤ t) = (d/dt) (1 − e^{−(λ1+λ2+⋯+λn)t}) = (λ1+λ2+⋯+λn) e^{−(λ1+λ2+⋯+λn)t} ,  t ≥ 0 .

E(X) = 1 / (λ1+λ2+⋯+λn) .
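A short Monte Carlo sketch (numpy, with arbitrarily chosen illustrative rates) confirming that the system lifetime behaves like an exponential random variable with rate λ1+λ2+⋯+λn:

# Monte Carlo check: min of independent exponentials has rate sum(lam).
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.5, 1.0, 2.5])                 # illustrative rates (assumption)
samples = rng.exponential(scale=1.0 / lam, size=(200_000, lam.size))  # scale = 1/rate
system_lifetime = samples.min(axis=1)

print(system_lifetime.mean())                   # ~ 1/(0.5+1.0+2.5) = 0.25
t = 0.3
print((system_lifetime > t).mean())             # ~ exp(-4.0 * 0.3) ≈ 0.301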
JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Remark 9.2
If X1 , X2 , . . ., Xn are n independent exponential random variables with
parameters λ1 , λ2 , . . ., λn , respectively, then min(X1 , X2 , . . ., Xn ) is an
exponential random variable with parameter λ1 +λ2 +⋯+λn . Hence

E[min(X1, X2, . . ., Xn)] = 1 / (λ1+λ2+⋯+λn) .



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
• Examples:
– For the following function:


f(x, y, z, t) = 1/(xyz) if 0 < t ≤ z ≤ y ≤ x ≤ 1 , and f(x, y, z, t) = 0 elsewhere .

□ (a) Prove f to be a joint probability density function: (1) f(x, y, z, t) ≥ 0 and (2)

∫_0^1 ∫_0^x ∫_0^y ∫_0^z [1/(xyz)] dt dz dy dx = ∫_0^1 ∫_0^x ∫_0^y [1/(xy)] dz dy dx = ∫_0^1 ∫_0^x (1/x) dy dx = ∫_0^1 dx = 1 .



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

□ (b) Find fY,Z,T (y, z, t): For 0 < t ≤ z ≤ y ≤ 1,

fY,Z,T(y, z, t) = ∫_y^1 [1/(xyz)] dx = (1/(yz)) ln x ∣_y^1 = − (ln y)/(yz) .

Therefore,

fY,Z,T(y, z, t) = − (ln y)/(yz) if 0 < t ≤ z ≤ y ≤ 1 , and fY,Z,T(y, z, t) = 0 elsewhere .
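As a sanity check (an observation not stated on the slide), f factors as a chain of uniform conditionals, X ~ U(0, 1), Y | X ~ U(0, X), Z | Y ~ U(0, Y), T | Z ~ U(0, Z), so the example can be simulated; the sketch below compares an empirical probability with the value obtained by integrating fY,Z,T from part (b).

# Simulate f(x,y,z,t) = 1/(xyz) on 0 < t <= z <= y <= x <= 1 via sequential uniforms.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
x = rng.uniform(0.0, 1.0, n)
y = rng.uniform(0.0, x)          # uniform on (0, X)
z = rng.uniform(0.0, y)
t = rng.uniform(0.0, z)

# Integrating f_{Y,Z,T}(y,z,t) = -ln(y)/(yz) over 0 < t <= z <= y <= 1/2
# gives P(Y <= 1/2) = 1/2 + (ln 2)/2 ≈ 0.847.
print((y <= 0.5).mean(), 0.5 + np.log(2) / 2)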



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

Definition 9.4
Let S be a subset of the three-dimensional Euclidean space with volume
Vol(S) ≠ 0. A point is said to be randomly selected from S if for any subset
Ω of S with volume Vol(Ω), the probability that Ω contains the point is
Vol(Ω)/Vol(S).

Let X, Y, and Z be the coordinates of the point randomly selected from S.


Let f (x, y, z) be the joint probability density function of X, Y, and Z. We have
that

f(x, y, z) = 1/Vol(S) if (x, y, z) ∈ S , and f(x, y, z) = 0 otherwise .    (9.10)



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Conditional probability density functions and conditional probability
mass functions in higher dimensions are defined similarly to dimension
two. For example, if X, Y, and Z are three continuous random variables with
the joint probability density function f , then

fX,Y∣Z(x, y∣z) = f(x, y, z) / fZ(z)

at all points z for which fZ (z) > 0. As another example, let X, Y, Z, V, and W
be five continuous random variables with the joint probability density
function f . Then

fX,Z,W∣Y,V(x, z, w∣y, v) = f(x, y, z, v, w) / fY,V(y, v)

at all points (y, v) for which fY,V (y, v) > 0.


JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
• Examples:
– Let X be a random point from the interval (0, 1), Y be a random
point from (0, X), and Z be a random point from (X, 1). Find f , the
joint probability density function of X, Y, and Z.
□ Note that

f(x, y, z) = fX(x) ⋅ [fX,Y(x, y)/fX(x)] ⋅ [f(x, y, z)/fX,Y(x, y)]
           = fX(x) ⋅ fY∣X(y∣x) ⋅ fZ∣X,Y(z∣x, y) .

Clearly,


fX(x) = 1 if 0 < x < 1, and fX(x) = 0 otherwise .



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES




fY∣X(y∣x) = 1/x if 0 < y < x , and fY∣X(y∣x) = 0 otherwise .

fZ∣X,Y(z∣x, y) = 1/(1 − x) if x < z < 1 , and fZ∣X,Y(z∣x, y) = 0 otherwise .

Thus,

f(x, y, z) = 1/[x(1 − x)] if 0 < y < x < z < 1 , and f(x, y, z) = 0 otherwise .
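The construction is easy to simulate (a sketch, not from the text); since Y | X is uniform on (0, X), E(Y) = E(X)/2 = 1/4, which the simulation reproduces.

# Simulate X ~ U(0,1), Y ~ U(0,X), Z ~ U(X,1) and check E(Y) = 1/4.
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
x = rng.uniform(0.0, 1.0, n)
y = rng.uniform(0.0, x)                 # Y | X = x is uniform on (0, x)
z = rng.uniform(x, 1.0)                 # Z | X = x is uniform on (x, 1)

print(y.mean())                         # ~ 0.25
print(np.mean((y < x) & (x < z)))       # ~ 1.0: the ordering 0 < Y < X < Z < 1 holds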



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES

The relationship

f (x, y, z) = fX (x) ⋅ fY∣X (y∣x) ⋅ fZ∣X,Y (z∣x, y) , (9.11)

which was established in Example 9.6, can be generalized to n continuous


random variables. Let f be the joint probability density function of
X1 , X2 , . . ., Xn . We have

f (x1 , x2 , . . ., xn ) = fX1 (x1 )fX2 ∣X1 (x2 ∣x1 )fX3 ∣X1 ,X2 (x3 ∣x1 , x2 )
⋯fXn ∣X1 , X2 , ..., Xn−1 (xn ∣x1 , x2 , . . ., xn−1 ) . (9.12)



JOINT DISTRIBUTIONS OF n > 2 RANDOM VARIABLES
Random Sample
Definition 9.5
We say that n random variables X1 , X2 , . . ., Xn form a random sample of size
n, from a (continuous or discrete) distribution function F, if they are
independent and, for 1 ≤ i ≤ n, the distribution function of Xi is F. Therefore,
elements of a random sample are independent and identically distributed.

Suppose that the lifetime distribution of the light bulbs manufactured by a


company is exponential with parameter λ. To estimate 1/λ, we can choose
n light bulbs at random and independently from those manufactured by
the company to form a random sample of size n from the exponential
distribution with parameter λ. Clearly, an estimation of 1/λ is the mean of
the random sample X1 , X2 , . . ., Xn denoted by X̄:
X̄ = (X1 + X2 + ⋯ + Xn) / n .
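A minimal sketch (with an arbitrarily chosen true λ) of estimating 1/λ by the sample mean of a random sample:

# Estimate 1/lambda for an exponential distribution by the sample mean X-bar.
import numpy as np

rng = np.random.default_rng(3)
lam = 0.2                                                # true rate, so 1/lam = 5.0
sample = rng.exponential(scale=1.0 / lam, size=1_000)    # random sample of size n = 1000
print(sample.mean())                                     # ~ 5.0, an estimate of 1/lambda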
ORDER STATISTICS

Definition 9.6
Let {X1 , X2 , . . ., Xn } be an independent set of identically distributed
continuous random variables with the common probability density and
distribution functions f and F, respectively. Let X(1) be the smallest value in
{X1 , X2 , . . ., Xn }, X(2) be the second smallest value, X(3) be the third
smallest, and, in general, X(k) (1 ≤ k ≤ n) be the kth smallest value in
{X1 , X2 , . . ., Xn }. Then, X(k) is called the kth order statistic, and the set
{X(1) , X(2) , . . . , X(n) } is said to consist of the order statistics of
{X1 , X2 , . . ., Xn }.

ORDER STATISTICS

By this definition, for example, if at a sample point ω of the sample space,


X1 (ω) = 8, X2 (ω) = 2, X3 (ω) = 5, and X4 (ω) = 6, then the order statistics
of {X1 , X2 , X3 , X4 } is {X(1) , X(2) , X(3) , X(4) }, where X(1) (ω) = 2, X(2) (ω) = 5,
X(3) (ω) = 6, and X(4) (ω) = 8. Continuity of Xi ’s implies that
P(X(i) = X( j) ) = 0. Hence

P(X(1) < X(2) < X(3) < ⋯ < X(n) ) = 1 .

Unlike Xi ’s, the random variables X(i) ’s are neither independent nor
identically distributed.
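In computations, the order statistics of an observed sample are obtained simply by sorting; a minimal sketch using the numbers of the example above:

# Order statistics by sorting: X_(1) <= X_(2) <= ... <= X_(n).
import numpy as np

x = np.array([8.0, 2.0, 5.0, 6.0])      # X1(w)=8, X2(w)=2, X3(w)=5, X4(w)=6
x_order = np.sort(x)                    # X_(1), ..., X_(4) = 2, 5, 6, 8
print(x_order)
print(x_order[0], x_order[-1])          # X_(1) = 2 (min), X_(4) = 8 (max)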



ORDER STATISTICS

• Examples:
– Suppose that customers arrive at a warehouse from n different
locations. Let Xi , 1 ≤ i ≤ n, be the time until the arrival of the next
customer from location i; then X(1) is the arrival time of the next
customer to the warehouse.
– Suppose that a machine consists of n components with the
lifetimes X1 , X2 , . . ., Xn , respectively, where Xi ’s are independent
and identically distributed. Suppose that the machine remains
operative unless k or more of its components fail. Then X(k) , the
kth order statistic of {X1 , X2 , . . ., Xn }, is the time when the
machine fails. Also, X(1) is the failure time of the first component.



ORDER STATISTICS

• Examples:
– Let {X1 , X2 , . . ., Xn } be a random sample of size n from a
population with continuous distribution F. Then the following
important statistical concepts are expressed in terms of order
statistics:
1. The sample range is X(n) − X(1) .
2. The sample midrange is [X(n) + X(1) ]/2.
3. The sample median is m = X(i+1) if n = 2i + 1, and m = [X(i) + X(i+1)]/2 if n = 2i.



ORDER STATISTICS
Theorem 9.5
Let {X(1) , X(2) , . . . , X(n) } be the order statistics of the independent and
identically distributed continuous random variables {X1 , X2 , . . ., Xn } with the
common distribution and probability density functions F and f , respectively.
Then Fk and fk , the distribution and probability density functions of X(k) ,
respectively, are given by

Fk(x) = ∑_{i=k}^{n} C(n, i) [F(x)]^i [1 − F(x)]^{n−i} ,  −∞ < x < ∞ ,    (9.13)

and

fk(x) = [n! / ((k − 1)!(n − k)!)] f(x) [F(x)]^{k−1} [1 − F(x)]^{n−k} ,  −∞ < x < ∞ .    (9.14)
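A hedged numerical check of (9.13): the sketch below simulates samples from a standard exponential distribution (an arbitrary choice), estimates P(X(k) ≤ x) empirically, and compares it with Fk(x).

# Compare the empirical distribution of the k-th order statistic with (9.13).
import numpy as np
from math import comb, exp

rng = np.random.default_rng(4)
n, k, x = 5, 2, 0.7
F = 1.0 - exp(-x)                                   # F(x) for Exp(1)
Fk = sum(comb(n, i) * F**i * (1 - F)**(n - i) for i in range(k, n + 1))

samples = rng.exponential(1.0, size=(200_000, n))
kth = np.sort(samples, axis=1)[:, k - 1]            # k-th smallest in each sample
print(Fk, (kth <= x).mean())                        # the two values should agree closely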
ORDER STATISTICS
Proof.
Let −∞ < x < ∞. To calculate P(X(k) ≤ x), note that X(k) ≤ x if and only if at
least k of the random variables X1 , X2 , . . ., Xn are in (−∞, x]. Thus

Fk(x) = P(X(k) ≤ x)
      = ∑_{i=k}^{n} P(i of the random variables X1, X2, . . ., Xn are in (−∞, x])
      = ∑_{i=k}^{n} C(n, i) [F(x)]^i [1 − F(x)]^{n−i} ,

where the last equality follows because, of the random variables X1, X2, . . ., Xn, the number that lie in (−∞, x] has a binomial distribution with parameters (n, p), p = F(x).



ORDER STATISTICS
Proof (Cont’d).
We will now obtain fk by differentiating Fk :
fk(x) = ∑_{i=k}^{n} C(n, i) i f(x) [F(x)]^{i−1} [1 − F(x)]^{n−i}
        − ∑_{i=k}^{n} C(n, i) (n − i) f(x) [F(x)]^i [1 − F(x)]^{n−i−1}
      = ∑_{i=k}^{n} [n! / ((i − 1)!(n − i)!)] f(x) [F(x)]^{i−1} [1 − F(x)]^{n−i}
        − ∑_{i=k}^{n} [n! / (i!(n − i − 1)!)] f(x) [F(x)]^i [1 − F(x)]^{n−i−1}
      = ∑_{i=k}^{n} [n! / ((i − 1)!(n − i)!)] f(x) [F(x)]^{i−1} [1 − F(x)]^{n−i}
        − ∑_{i=k+1}^{n} [n! / ((i − 1)!(n − i)!)] f(x) [F(x)]^{i−1} [1 − F(x)]^{n−i} .
The two sums cancel term by term except for the i = k term of the first sum, which yields (9.14):
fk(x) = [n! / ((k − 1)!(n − k)!)] f(x) [F(x)]^{k−1} [1 − F(x)]^{n−k} .
ORDER STATISTICS
Remark 9.3
Note that by (9.13) and (9.14), respectively, F1 and f1 , the distribution and the
probability density functions of X(1) = min(X1 , X2 , . . ., Xn ), are found to be
F1(x) = ∑_{i=1}^{n} C(n, i) [F(x)]^i [1 − F(x)]^{n−i}
      = ∑_{i=0}^{n} C(n, i) [F(x)]^i [1 − F(x)]^{n−i} − [1 − F(x)]^n
      = [F(x) + (1 − F(x))]^n − [1 − F(x)]^n
      = 1 − [1 − F(x)]^n ,  −∞ < x < ∞ .
f1(x) = n f(x) [1 − F(x)]^{n−1} ,  −∞ < x < ∞ .
Also, Fn and fn for X(n) = max(X1, X2, . . ., Xn) are found to be
Fn(x) = [F(x)]^n ,  −∞ < x < ∞ .
fn(x) = n f(x) [F(x)]^{n−1} ,  −∞ < x < ∞ .
ORDER STATISTICS
• Examples:
– Let X1 , X2 , . . ., X2n+1 be 2n + 1 random numbers from (0, 1). Then
f and F, the respective probability density and distribution
functions of Xi ’s, are given by

f(x) = 1 if 0 < x < 1, and f(x) = 0 elsewhere;
F(x) = 0 if x < 0, F(x) = x if 0 ≤ x < 1, and F(x) = 1 if x ≥ 1.


The probability density function of X(n+1) , the median, is

fn+1(x) = [(2n + 1)!/(n! n!)] x^n (1 − x)^n = [1/B(n + 1, n + 1)] x^n (1 − x)^n ,  0 < x < 1;
0, elsewhere. Hence, X(n+1) is beta with parameters (n + 1, n + 1).
For 1 ≤ k ≤ 2n + 1, X(k) is beta with parameters k and 2n − k + 2.
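A quick simulation (a sketch, not from the text) comparing the sample median of 2n + 1 random numbers with the Beta(n + 1, n + 1) mean 1/2 and variance 1/(4(2n + 3)):

# Median of 2n+1 uniform(0,1) numbers vs. Beta(n+1, n+1) moments.
import numpy as np

rng = np.random.default_rng(5)
n = 3                                          # sample size 2n + 1 = 7
u = rng.uniform(size=(200_000, 2 * n + 1))
medians = np.median(u, axis=1)                 # the (n+1)-st order statistic

print(medians.mean(), 0.5)                     # Beta(n+1, n+1) mean
print(medians.var(), 1 / (4 * (2 * n + 3)))    # Beta(n+1, n+1) variance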
ORDER STATISTICS

Theorem 9.6
Let {X(1) , X(2) , . . . , X(n) } be the order statistics of the independent and
identically distributed continuous random variables {X1 , X2 , . . ., Xn } with the
common probability density and distribution functions f and F, respectively.
Then, for i < j and x < y, fi j (x, y), the joint probability density function of X(i)
and X( j) , is given by

fij(x, y) = [n! / ((i − 1)!(j − i − 1)!(n − j)!)] f(x) f(y) [F(x)]^{i−1} [F(y) − F(x)]^{j−i−1} [1 − F(y)]^{n−j} .

For x ≥ y, fi j (x, y) = 0.



ORDER STATISTICS

Theorem 9.7
Let {X(1) , X(2) , . . . , X(n) } be the order statistics of the independent and
identically distributed continuous random variables {X1 , X2 , . . ., Xn } with the
common probability density and distribution functions f and F, respectively.
Then, f12⋯n , the joint probability density function of X(1) , X(2) , . . . , X(n) , is
given by

f12⋯n(x1, x2, . . ., xn) = n! f(x1) f(x2) ⋯ f(xn) if −∞ < x1 < x2 < ⋯ < xn < ∞ , and f12⋯n(x1, x2, . . ., xn) = 0 otherwise .



ORDER STATISTICS

• Examples:
– The distance between two towns, A and B, is 30 miles. If three gas
stations are constructed independently at randomly selected
locations between A and B, what is the probability that the
distance between any two gas stations is at least 10 miles?
□ Let X1 , X2 , and X3 be the locations at which the gas stations
are constructed. The probability density function of X1 , X2 ,
and X3 is given by


f(x) = 1/30 if 0 < x < 30, and f(x) = 0 elsewhere.



ORDER STATISTICS
By Theorem 9.7, f123 , the joint probability density function
of the order statistics of X1 , X2 , and X3 is

f123(x1, x2, x3) = 3! (1/30)³ ,  0 < x1 < x2 < x3 < 30 .

Using this, we have that the desired probability is given by the following triple integral:

P(X(1) + 10 < X(2) and X(2) + 10 < X(3))
  = ∫_0^{10} ∫_{x1+10}^{20} ∫_{x2+10}^{30} f123(x1, x2, x3) dx3 dx2 dx1 = 1/27 .
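The value 1/27 is also easy to confirm by Monte Carlo (a sketch; the 30-mile interval and 10-mile spacings are as in the example):

# Monte Carlo check of the gas-station example: P(both gaps >= 10) = 1/27 ≈ 0.037.
import numpy as np

rng = np.random.default_rng(6)
loc = np.sort(rng.uniform(0.0, 30.0, size=(1_000_000, 3)), axis=1)
ok = (loc[:, 1] - loc[:, 0] >= 10.0) & (loc[:, 2] - loc[:, 1] >= 10.0)
print(ok.mean(), 1 / 27)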



MULTINOMIAL DISTRIBUTIONS
The multinomial distribution is a generalization of the binomial distribution. Suppose that,
whenever an experiment is performed, one of the disjoint outcomes
A1 , A2 , . . ., Ar will occur. Let P(Ai ) = pi , 1 ≤ i ≤ r. Then p1 + p2 + ⋯ + pr = 1.
If, in n independent performances of this experiment, Xi , i = 1, 2, 3, . . . , r,
denotes the number of times that Ai occurs, then p(x1 , x2 , . . ., xr ), the joint
probability mass function of X1 , X2 , . . ., Xr , is called a multinomial joint
probability mass function, and its distribution is said to be a multinomial
distribution. For any set of nonnegative integers {x1 , x2 , . . ., xr } with
x1 + x2 + ⋯ + xr = n,

p(x1, x2, . . ., xr) = P(X1=x1, X2=x2, . . ., Xr=xr)
                    = [n! / (x1! x2! ⋯ xr!)] p1^{x1} p2^{x2} ⋯ pr^{xr} .    (9.19)
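A direct evaluation of (9.19) (a sketch with arbitrary illustrative parameters), checking that the probabilities sum to 1 over all outcomes:

# Evaluate the multinomial pmf (9.19) and check normalization for small n and r.
from math import factorial
from itertools import product

def multinomial_pmf(xs, ps, n):
    if sum(xs) != n:
        return 0.0
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)                 # n! / (x1! x2! ... xr!)
    prob = float(coef)
    for x, p in zip(xs, ps):
        prob *= p**x                          # p1^x1 p2^x2 ... pr^xr
    return prob

n, ps = 6, (0.2, 0.5, 0.3)                    # arbitrary example parameters
total = sum(multinomial_pmf((x1, x2, n - x1 - x2), ps, n)
            for x1, x2 in product(range(n + 1), repeat=2) if x1 + x2 <= n)
print(total)                                  # ~1.0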

MULTINOMIAL DISTRIBUTIONS
• Examples:
– Marginals of Multinomials: Let X1 , X2 , . . ., Xr (r ≥ 4) have the joint
multinomial probability mass function p(x1 , x2 , . . ., xr ) with
parameters n and p1 , p2 , . . ., pr .
□ Find the marginal probability mass function pX1 :

pX1(x1) = ∑_{x2+x3+⋯+xr = n−x1} [n! / (x1! x2! ⋯ xr!)] p1^{x1} p2^{x2} ⋯ pr^{xr}
        = [n! / (x1!(n − x1)!)] p1^{x1} ∑_{x2+x3+⋯+xr = n−x1} [(n − x1)! / (x2! ⋯ xr!)] p2^{x2} p3^{x3} ⋯ pr^{xr}
        = [n! / (x1!(n − x1)!)] p1^{x1} (p2 + p3 + ⋯ + pr)^{n−x1}
        = [n! / (x1!(n − x1)!)] p1^{x1} (1 − p1)^{n−x1} .

pX1 (x1 ) is binomial with parameters n and p1 .
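The same fact can be seen empirically (a sketch using numpy's multinomial sampler with arbitrary parameters): the first coordinate of a multinomial vector follows the binomial(n, p1) pmf.

# Empirical check that X1 from a multinomial(n, p1,...,pr) is binomial(n, p1).
import numpy as np
from math import comb

rng = np.random.default_rng(7)
n, ps = 10, [0.1, 0.3, 0.4, 0.2]
counts = rng.multinomial(n, ps, size=200_000)        # each row is (X1, X2, X3, X4)

x1 = counts[:, 0]
for k in range(4):
    emp = (x1 == k).mean()
    theo = comb(n, k) * ps[0]**k * (1 - ps[0])**(n - k)
    print(k, round(emp, 4), round(theo, 4))          # empirical vs. binomial pmf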


MULTINOMIAL DISTRIBUTIONS

Remark 9.4
The method of Example 9.14 can be extended to prove the following theorem:
Let the joint distribution of the random variables X1 , X2 , . . ., Xr be
multinomial with parameters n and p1 , p2 , . . ., pr . The joint proba-
bility mass function marginalized over a subset Xi1 , Xi2 , . . ., Xik of k
(k > 1) of these r random variables is multinomial with parameters
n and pi1 , pi2 , . . ., pik , 1 − pi1 − pi2 − . . . − pik .

