Extension of Several Random Variables
PAPER
In fulfillment of an assignment for
Mathematical Statistics I
taught by Mr. Susiswo and Mrs. Jamaliatul Badriyah
Written by:
1. Namira (140311606344)
2. Nur Rofidah Diyanah (140311604344)
3. Sri Prihatin (140311600162)
4. Trio Habibatur Rahma Utami (140311602695)
DEPARTMENT OF MATHEMATICS
THE FACULTY OF MATHEMATICS AND NATURAL SCIENCES
STATE UNIVERSITY OF MALANG
October, 2016
EXTENSION TO SEVERAL RANDOM VARIABLES
The notions about two random variables can be extended immediately to 𝑛 random
variables. We make the following definition of the space of 𝑛 random variables.
DEFINITION 2.6.1.
Consider a random experiment with the sample space 𝒞. Let the random variable 𝑋𝑖 assign to each element 𝑐 ∈ 𝒞 one and only one real number 𝑋𝑖(𝑐) = 𝑥𝑖, 𝑖 = 1, 2, …, 𝑛. We say that (𝑋1, 𝑋2, …, 𝑋𝑛) is an 𝑛-dimensional random vector. The space of this random vector is the set of ordered 𝑛-tuples 𝒟 = {(𝑥1, 𝑥2, …, 𝑥𝑛) : 𝑥1 = 𝑋1(𝑐), …, 𝑥𝑛 = 𝑋𝑛(𝑐), 𝑐 ∈ 𝒞}. Furthermore, let 𝐴 be a subset of the space 𝒟. Then 𝑃[(𝑋1, …, 𝑋𝑛) ∈ 𝐴] = 𝑃(𝐶), where 𝐶 = {𝑐 : 𝑐 ∈ 𝒞 and (𝑋1(𝑐), 𝑋2(𝑐), …, 𝑋𝑛(𝑐)) ∈ 𝐴}.
In this section, we often use vector notation. For example, we denote the 𝑛-dimensional column vector (𝑋1, …, 𝑋𝑛)′ by 𝑿 and the observed values (𝑥1, …, 𝑥𝑛)′ of the random variables by 𝒙. The joint cdf is defined to be
𝐹𝑿(𝒙) = 𝑃[𝑋1 ≤ 𝑥1, …, 𝑋𝑛 ≤ 𝑥𝑛].
We say that the 𝑛 random variables 𝑋1, 𝑋2, …, 𝑋𝑛 are of the discrete type or of the continuous type, and have a distribution of that type, accordingly as the joint cdf can be expressed as
𝐹𝑿(𝒙) = ∑ ⋯ ∑_{𝑤1 ≤ 𝑥1, …, 𝑤𝑛 ≤ 𝑥𝑛} 𝑝(𝑤1, …, 𝑤𝑛),
or as
𝐹𝑿(𝒙) = ∫_{−∞}^{𝑥𝑛} ⋯ ∫_{−∞}^{𝑥1} 𝑓(𝑤1, …, 𝑤𝑛) 𝑑𝑤1 ⋯ 𝑑𝑤𝑛,
respectively.
In accordance with the convention of extending the definition of a joint pdf, it is seen that a point function 𝑓 essentially satisfies the conditions of being a pdf if
(a) 𝑓 is defined and is nonnegative for all real values of its argument(s), and
(b) its integral over all real values of its argument(s) is 1.
Likewise, a point function 𝑝 essentially satisfies the conditions of being a joint pmf if
(a) 𝑝 is defined and is nonnegative for all real values of its argument(s), and
(b) its sum over all real values of its argument(s) is 1.
As in previous sections, it is sometimes convenient to speak of the support of a random vector. For the discrete case, this would be all points in 𝒟 which have positive mass, while for the continuous case these would be all points in 𝒟 that can be embedded in an open set of positive probability. We use 𝒮 to denote support sets.
EXAMPLE 2.6.1.
Let
𝑓(𝑥, 𝑦, 𝑧) = 𝑒^{−(𝑥+𝑦+𝑧)}, 0 < 𝑥, 𝑦, 𝑧 < ∞, zero elsewhere,
be the pdf of the random variables 𝑋, 𝑌, and 𝑍. Then the distribution function of 𝑋, 𝑌, and 𝑍 is given by
𝐹(𝑥, 𝑦, 𝑧) = 𝑃(𝑋 ≤ 𝑥, 𝑌 ≤ 𝑦, 𝑍 ≤ 𝑧)
= ∫_0^𝑧 ∫_0^𝑦 ∫_0^𝑥 𝑒^{−𝑢−𝑣−𝑤} 𝑑𝑢 𝑑𝑣 𝑑𝑤
= (1 − 𝑒^{−𝑥})(1 − 𝑒^{−𝑦})(1 − 𝑒^{−𝑧}), 0 ≤ 𝑥, 𝑦, 𝑧 < ∞,
and is equal to zero elsewhere.
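As a quick numerical cross-check of this example (our own sketch, not part of the original text), the triple integral can be evaluated with SciPy and compared with the closed form (1 − 𝑒^{−𝑥})(1 − 𝑒^{−𝑦})(1 − 𝑒^{−𝑧}); the evaluation point (1, 2, 3) is an arbitrary choice.

```python
# Numerical cross-check of Example 2.6.1 (illustrative sketch).
import numpy as np
from scipy import integrate

x, y, z = 1.0, 2.0, 3.0                               # arbitrary evaluation point
F_num, _ = integrate.nquad(lambda u, v, w: np.exp(-(u + v + w)),
                           [(0, x), (0, y), (0, z)])  # integral of the joint pdf over the box
F_closed = (1 - np.exp(-x)) * (1 - np.exp(-y)) * (1 - np.exp(-z))
print(F_num, F_closed)                                # should agree to quadrature accuracy
```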
Let 𝑌 = 𝑢(𝑋1, 𝑋2, …, 𝑋𝑛) for some function 𝑢. We say that 𝐸(𝑌) exists if the 𝑛-fold integral
∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} |𝑢(𝑥1, 𝑥2, …, 𝑥𝑛)| 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) 𝑑𝑥1 𝑑𝑥2 ⋯ 𝑑𝑥𝑛
exists when the random variables are of the continuous type, or if the 𝑛-fold sum
∑_{𝑥𝑛} ⋯ ∑_{𝑥1} |𝑢(𝑥1, 𝑥2, …, 𝑥𝑛)| 𝑝(𝑥1, 𝑥2, …, 𝑥𝑛)
exists when the random variables are of the discrete type. If 𝐸(𝑌) exists, then its expectation is given by
𝐸(𝑌) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} 𝑢(𝑥1, 𝑥2, …, 𝑥𝑛) 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) 𝑑𝑥1 𝑑𝑥2 ⋯ 𝑑𝑥𝑛
for the continuous case, and by
𝐸(𝑌) = ∑_{𝑥𝑛} ⋯ ∑_{𝑥1} 𝑢(𝑥1, 𝑥2, …, 𝑥𝑛) 𝑝(𝑥1, 𝑥2, …, 𝑥𝑛)
for the discrete case. In particular, 𝐸 is a linear operator. That is, if 𝑌𝑗 = 𝑢𝑗(𝑋1, …, 𝑋𝑛) for 𝑗 = 1, 2, …, 𝑚 and each 𝐸(𝑌𝑗) exists, then
𝐸[∑_{𝑗=1}^{𝑚} 𝑘𝑗𝑌𝑗] = ∑_{𝑗=1}^{𝑚} 𝑘𝑗𝐸[𝑌𝑗].
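The linearity of 𝐸 can be illustrated by simulation (a sketch of our own; the uniform distribution, the functions 𝑢𝑗, and the constants 𝑘𝑗 are arbitrary choices):

```python
# Monte Carlo illustration of the linearity of E.
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
X = rng.uniform(size=(n, 3))          # X1, X2, X3 ~ iid Uniform(0, 1)

Y1 = X[:, 0] * X[:, 1]                # Y1 = u1(X) = X1 * X2
Y2 = X[:, 2] ** 2                     # Y2 = u2(X) = X3^2
k1, k2 = 5.0, -2.0

lhs = np.mean(k1 * Y1 + k2 * Y2)      # estimate of E[k1 Y1 + k2 Y2]
rhs = k1 * np.mean(Y1) + k2 * np.mean(Y2)
print(lhs, rhs)                       # both estimate 5*(1/4) - 2*(1/3) = 0.5833...
```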
We next consider marginal distributions. The marginal pdf of 𝑋1 is obtained by integrating out the remaining variables:
𝑓1(𝑥1) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) 𝑑𝑥2 ⋯ 𝑑𝑥𝑛.
Therefore, 𝑓1(𝑥1) is the pdf of the random variable 𝑋1, and 𝑓1(𝑥1) is called the marginal pdf of 𝑋1. The marginal probability density functions 𝑓2(𝑥2), …, 𝑓𝑛(𝑥𝑛) of 𝑋2, …, 𝑋𝑛, respectively, are similar (𝑛 − 1)-fold integrals.
Up to this point, each marginal pdf has been a pdf of one random variable. It is convenient to extend this terminology to joint probability density functions, which we shall do now. Let 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) be the joint pdf of the 𝑛 random variables 𝑋1, 𝑋2, …, 𝑋𝑛, just as before. Now, however, take any group of 𝑘 < 𝑛 of these random variables and find their joint pdf. This joint pdf is called the marginal pdf of this particular group of 𝑘 variables. To fix the ideas, take 𝑛 = 6, 𝑘 = 3, and select the group 𝑋2, 𝑋4, 𝑋5. Then the marginal pdf of 𝑋2, 𝑋4, 𝑋5 is the joint pdf of this particular group of three variables, namely,
∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} 𝑓(𝑥1, 𝑥2, 𝑥3, 𝑥4, 𝑥5, 𝑥6) 𝑑𝑥1 𝑑𝑥3 𝑑𝑥6.
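A marginal pdf of a group of variables can also be obtained numerically by integrating the joint pdf over the remaining variables. The following sketch is our own illustration; the six-variable pdf exp(−(𝑥1 + ⋯ + 𝑥6)) on the positive orthant and the evaluation point are hypothetical choices.

```python
# Sketch of a "group of k" marginal: integrate x1, x3, x6 out of a six-variable
# joint pdf at a fixed point (x2, x4, x5).
import numpy as np
from scipy import integrate

x2, x4, x5 = 0.5, 1.0, 1.5            # point at which the marginal is evaluated
marg, _ = integrate.nquad(
    lambda x1, x3, x6: np.exp(-(x1 + x2 + x3 + x4 + x5 + x6)),
    [(0, np.inf)] * 3,                # integrate x1, x3, x6 over (0, infinity)
)
print(marg, np.exp(-(x2 + x4 + x5)))  # both should be about 0.0498
```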
Conditional probability density functions are defined analogously to the two-variable case. For example, provided 𝑓1(𝑥1) > 0, the conditional pdf of 𝑋2, …, 𝑋𝑛 given 𝑋1 = 𝑥1 is
𝑓(𝑥2, …, 𝑥𝑛 | 𝑥1) = 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) / 𝑓1(𝑥1),
and, provided the integral converges (absolutely), a useful random variable is given by ℎ(𝑋1) = 𝐸[𝑢(𝑋2, …, 𝑋𝑛) | 𝑋1].
The above discussion of marginal and conditional distributions generalizes to random variables of the discrete type by using pmfs and summations instead of integrals. Let the random variables 𝑋1, 𝑋2, …, 𝑋𝑛 have the joint pdf 𝑓(𝑥1, 𝑥2, …, 𝑥𝑛) and the marginal probability density functions 𝑓1(𝑥1), 𝑓2(𝑥2), …, 𝑓𝑛(𝑥𝑛), respectively. The definition of the independence of 𝑋1 and 𝑋2 is generalized to the mutual independence of 𝑋1, 𝑋2, …, 𝑋𝑛 as follows: the random variables 𝑋1, 𝑋2, …, 𝑋𝑛 are said to be mutually independent if and only if
𝑓(𝑥1 , 𝑥2 , … , 𝑥𝑛 ) ≡ 𝑓1 (𝑥1 )𝑓2 (𝑥2 ) … 𝑓𝑛 (𝑥𝑛 ),
for the continuous case. In the discrete case, 𝑋1, 𝑋2, …, 𝑋𝑛 are said to be mutually independent if and only if
𝑝(𝑥1, 𝑥2, …, 𝑥𝑛) ≡ 𝑝1(𝑥1)𝑝2(𝑥2) ⋯ 𝑝𝑛(𝑥𝑛).
Suppose 𝑋1, 𝑋2, …, 𝑋𝑛 are mutually independent. Then
𝑃(𝑎1 < 𝑋1 < 𝑏1, 𝑎2 < 𝑋2 < 𝑏2, …, 𝑎𝑛 < 𝑋𝑛 < 𝑏𝑛) = 𝑃(𝑎1 < 𝑋1 < 𝑏1)𝑃(𝑎2 < 𝑋2 < 𝑏2) ⋯ 𝑃(𝑎𝑛 < 𝑋𝑛 < 𝑏𝑛) = ∏_{𝑖=1}^{𝑛} 𝑃(𝑎𝑖 < 𝑋𝑖 < 𝑏𝑖).
The joint mgf of 𝑋1, 𝑋2, …, 𝑋𝑛, when it exists, is defined as in the case of two variables, and the factorization
𝑀(𝑡1, 𝑡2, …, 𝑡𝑛) = 𝑀(𝑡1, 0, …, 0)𝑀(0, 𝑡2, 0, …, 0) ⋯ 𝑀(0, …, 0, 𝑡𝑛)
is a necessary and sufficient condition for the mutual independence of 𝑋1, 𝑋2, …, 𝑋𝑛. Note that we can write the joint mgf in vector notation as
𝑀(𝒕) = 𝐸[exp(𝒕′𝑿)], for 𝒕 ∈ 𝐵 ⊂ 𝑅^𝑛,
where 𝐵 = {𝒕 : −ℎ𝑖 < 𝑡𝑖 < ℎ𝑖, 𝑖 = 1, …, 𝑛} for some ℎ𝑖 > 0.
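The factorization of the joint mgf under mutual independence can be illustrated numerically (a sketch of our own; the exponential components and the vector 𝒕 are arbitrary choices, with each marginal mgf equal to 1/(1 − 𝑡𝑖) for 𝑡𝑖 < 1):

```python
# Sketch: for mutually independent components, M(t) = E[exp(t'X)] factors into
# the product of the marginal mgfs.
import numpy as np

rng = np.random.default_rng(1)
n = 10**6
X = rng.exponential(scale=1.0, size=(n, 3))     # X1, X2, X3 iid Exp(1)
t = np.array([0.2, -0.5, 0.1])                  # arbitrary point with each t_i < 1

M_mc = np.mean(np.exp(X @ t))                   # Monte Carlo estimate of E[exp(t'X)]
M_prod = np.prod(1.0 / (1.0 - t))               # product of marginal mgfs 1/(1 - t_i)
print(M_mc, M_prod)
```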
EXAMPLE 2.6.2
Let 𝑋1, 𝑋2, and 𝑋3 be three mutually independent random variables and let each have the pdf
𝑓(𝑥) = 2𝑥, 0 < 𝑥 < 1, zero elsewhere. (2.6.7)
The joint pdf of 𝑋1, 𝑋2, 𝑋3 is 𝑓(𝑥1)𝑓(𝑥2)𝑓(𝑥3) = 8𝑥1𝑥2𝑥3, 0 < 𝑥𝑖 < 1, 𝑖 = 1, 2, 3, zero elsewhere. Then, for illustration, the expected value of 5𝑋1𝑋2^3 + 3𝑋2𝑋3^4 is
∫_0^1 ∫_0^1 ∫_0^1 (5𝑥1𝑥2^3 + 3𝑥2𝑥3^4) 8𝑥1𝑥2𝑥3 𝑑𝑥1 𝑑𝑥2 𝑑𝑥3 = 2.
Let 𝑌 denote the maximum of 𝑋1, 𝑋2, 𝑋3. Then, for instance, we have
𝑃(𝑌 ≤ 1/2) = 𝑃(𝑋1 ≤ 1/2, 𝑋2 ≤ 1/2, 𝑋3 ≤ 1/2) = (1/2)^2 (1/2)^2 (1/2)^2 = (1/2)^6 = 1/64.
In a similar manner, we find that the cdf of 𝑌 is
𝐺(𝑦) = 𝑃(𝑌 ≤ 𝑦) = { 0, 𝑦 < 0;  𝑦^6, 0 ≤ 𝑦 < 1;  1, 1 ≤ 𝑦 }.
Accordingly, the pdf of 𝑌 is
𝑔(𝑦) = 6𝑦^5, 0 < 𝑦 < 1, zero elsewhere.
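A Monte Carlo sketch (our own illustration; the sample size, seed, and evaluation point are arbitrary) can be used to check the two computations in Example 2.6.2:

```python
# Monte Carlo check of Example 2.6.2.
import numpy as np

rng = np.random.default_rng(2)
n = 10**6
# If U ~ Uniform(0,1), then sqrt(U) has pdf 2x on (0,1) (inverse of the cdf x^2).
X = np.sqrt(rng.uniform(size=(n, 3)))

# E[5 X1 X2^3 + 3 X2 X3^4] should be close to 2
print(np.mean(5 * X[:, 0] * X[:, 1]**3 + 3 * X[:, 1] * X[:, 2]**4))

# cdf of Y = max(X1, X2, X3) at y = 0.8: empirical value vs theoretical y^6
Y = X.max(axis=1)
y = 0.8
print(np.mean(Y <= y), y**6)
```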
REMARK 2.6.1
If 𝑋1 , 𝑋2 , and 𝑋3 are mutually independent, they are pairwise independent (that is, 𝑋𝑖 and 𝑋𝑗 ,
𝑖 ≠ 𝑗, where 𝑖, 𝑗 = 1,2,3 are independent). However, the following example, attributed to S.
Bernstein, shows that pairwise independence doesn’t necessarily imply mutual independence.
Let 𝑋1, 𝑋2, and 𝑋3 have the joint pmf
𝑓(𝑥1, 𝑥2, 𝑥3) = 1/4 for (𝑥1, 𝑥2, 𝑥3) ∈ {(1,0,0), (0,1,0), (0,0,1), (1,1,1)}, zero elsewhere.
The joint pmf of 𝑋𝑖 and 𝑋𝑗, 𝑖 ≠ 𝑗, is
𝑓𝑖𝑗(𝑥𝑖, 𝑥𝑗) = 1/4 for (𝑥𝑖, 𝑥𝑗) ∈ {(0,0), (1,0), (0,1), (1,1)}, zero elsewhere,
whereas the marginal pmf of 𝑋𝑖 is
𝑓𝑖(𝑥𝑖) = 1/2 for 𝑥𝑖 = 0, 1, zero elsewhere.
Obviously, if 𝑖 ≠ 𝑗 we have
𝑓𝑖𝑗 (𝑥𝑖 , 𝑥𝑗 ) ≡ 𝑓𝑖 (𝑥𝑖 )𝑓𝑗 (𝑥𝑗 )
and thus 𝑋𝑖 and 𝑋𝑗 are independent. However,
𝑓(𝑥1 , 𝑥2 , 𝑥3 ) ≢ 𝑓1 (𝑥1 )𝑓2 (𝑥2 )𝑓3 (𝑥3 )
Thus 𝑋1 , 𝑋2, and 𝑋3 are not mutually independent.
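The claim can be checked by direct enumeration of the pmf; the following sketch is our own illustration.

```python
# Enumeration check of Bernstein's example: pairwise independence holds,
# mutual independence fails.
from itertools import product

support = {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)}
p = {xs: (0.25 if xs in support else 0.0) for xs in product((0, 1), repeat=3)}

def marginal(i, xi):
    # p_i(xi): sum the joint pmf over the other two coordinates
    return sum(pr for xs, pr in p.items() if xs[i] == xi)

def pair(i, j, xi, xj):
    # p_ij(xi, xj): sum the joint pmf over the remaining coordinate
    return sum(pr for xs, pr in p.items() if xs[i] == xi and xs[j] == xj)

pairwise = all(
    abs(pair(i, j, a, b) - marginal(i, a) * marginal(j, b)) < 1e-12
    for i in range(3) for j in range(3) if i != j
    for a in (0, 1) for b in (0, 1)
)
mutual = all(
    abs(p[xs] - marginal(0, xs[0]) * marginal(1, xs[1]) * marginal(2, xs[2])) < 1e-12
    for xs in p
)
print(pairwise, mutual)     # prints: True False
```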
Unless there is a possible misunderstanding between mutual and pairwise independence, we usually drop the modifier mutual. Accordingly, using this practice in Example 2.6.2, we say that 𝑋1, 𝑋2, and 𝑋3 are independent random variables, meaning that they are mutually independent. Occasionally, for emphasis, we use mutually independent so that the reader is reminded that this is different from pairwise independence.
In addition, if several random variables are mutually independent and have the same distribution, we say that they are independent and identically distributed, which we abbreviate as iid. So the random variables in Example 2.6.2 are iid with the common pdf given in expression (2.6.7).
2.6.1 VARIANCE-COVARIANCE MATRIX
Let 𝑿 = (𝑋1, …, 𝑋𝑛)′ be an 𝑛-dimensional random vector. Recall that we defined 𝐸(𝑿) = (𝐸(𝑋1), …, 𝐸(𝑋𝑛))′; that is, the expectation of a random vector is just the vector of the expectations of its components. Now suppose 𝑾 is an 𝑚 × 𝑛 matrix of random variables, say 𝑾 = [𝑊𝑖𝑗] for the random variables 𝑊𝑖𝑗, 1 ≤ 𝑖 ≤ 𝑚 and 1 ≤ 𝑗 ≤ 𝑛. Note that we can always string out the matrix into an 𝑚𝑛 × 1 random vector. Hence, we define the expectation of a random matrix as
𝐸[𝑾] = [𝐸(𝑊𝑖𝑗)]. (2.6.8)
THEOREM 2.6.1
Let 𝑾1 and 𝑾2 be 𝑚 × 𝑛 matrices of random variables, let 𝑨1 and 𝑨2 be 𝑘 × 𝑚 matrices of constants, and let 𝑩 be an 𝑛 × 𝑙 matrix of constants. Then
𝐸[𝑨1𝑾1 + 𝑨2𝑾2] = 𝑨1𝐸[𝑾1] + 𝑨2𝐸[𝑾2] (2.6.9)
𝐸[𝑨1𝑾1𝑩] = 𝑨1𝐸[𝑾1]𝑩. (2.6.10)
PROOF
Because of the linearity of the operator 𝐸 on random variables, we have for the (𝑖, 𝑗)th component of expression (2.6.9) that
𝐸[∑_{𝑠=1}^{𝑚} 𝑎1,𝑖𝑠 𝑊1,𝑠𝑗 + ∑_{𝑠=1}^{𝑚} 𝑎2,𝑖𝑠 𝑊2,𝑠𝑗] = ∑_{𝑠=1}^{𝑚} 𝑎1,𝑖𝑠 𝐸[𝑊1,𝑠𝑗] + ∑_{𝑠=1}^{𝑚} 𝑎2,𝑖𝑠 𝐸[𝑊2,𝑠𝑗],
which is the (𝑖, 𝑗)th component of the right side of expression (2.6.9). The proof of expression (2.6.10) is similar.
For an 𝑛-dimensional random vector 𝑿 with mean vector 𝝁 = 𝐸[𝑿], the variance-covariance matrix of 𝑿 is defined by Cov(𝑿) = 𝐸[(𝑿 − 𝝁)(𝑿 − 𝝁)′]; its (𝑖, 𝑗)th entry is cov(𝑋𝑖, 𝑋𝑗) = 𝜎𝑖𝑗 and its (𝑖, 𝑖)th entry is the variance 𝜎𝑖^2 = Var(𝑋𝑖).
THEOREM 2.6.2
Let 𝑿 = (𝑋1, …, 𝑋𝑛)′ be an 𝑛-dimensional random vector, such that 𝜎𝑖^2 = 𝜎𝑖𝑖 = Var(𝑋𝑖) < ∞. Let 𝑨 be an 𝑚 × 𝑛 matrix of constants. Then
Cov(𝑿) = 𝐸[𝑿𝑿′] − 𝝁𝝁′ (2.6.13)
Cov(𝑨𝑿) = 𝑨Cov(𝑿)𝑨′ (2.6.14)
PROOF
Use Theorem 2.6.1 to derive (2.6.13); i.e.,
𝐶𝑜𝑣 (𝑿) = 𝐸[(𝑿 − 𝝁)(𝑿 − 𝝁)′ ]
= 𝐸[𝑿𝑿′ − 𝝁𝑿′ − 𝑿𝝁′ + 𝝁𝝁′ ]
= 𝐸[𝑿𝑿′] − 𝝁𝐸[𝑿′] − 𝐸[𝑿]𝝁′ + 𝝁𝝁′
= 𝐸[𝑿𝑿′] − 𝝁𝝁′ − 𝝁𝝁′ + 𝝁𝝁′ = 𝐸[𝑿𝑿′] − 𝝁𝝁′,
which is the desired result. For (2.6.14), we have
𝐶𝑜𝑣(𝑨𝑿) = 𝐸[𝑨(𝑿 − 𝝁)(𝑨(𝑿 − 𝝁))′ ]
= 𝐸[𝑨(𝑿 − 𝝁)(𝑿 − 𝝁)′ 𝑨′ ]
= 𝐸[(𝑨𝑿 − 𝑨𝝁)(𝑿′ 𝑨′ − 𝝁′ 𝑨′ )]
= 𝐸[𝑨𝑿𝑿′ 𝑨′ − 𝑨𝑿𝝁′ 𝑨′ − 𝑨𝝁𝑿′ 𝑨′ + 𝑨𝝁𝝁′ 𝑨′ ]
= 𝑨𝐸(𝑿𝑿′ )𝑨′ − 𝑨𝐸(𝑿)𝝁′ 𝑨′ − 𝑨𝝁𝐸(𝑿′ )𝑨′ + 𝑨𝝁𝝁′ 𝑨′
= 𝑨[𝐸[𝑿𝑿′] − 𝝁𝐸[𝑿′] − 𝐸[𝑿]𝝁′ + 𝝁𝝁′]𝑨′ = 𝑨[𝐸[𝑿𝑿′] − 𝝁𝝁′]𝑨′ = 𝑨Cov(𝑿)𝑨′.
All variance-covariance matrices are positive semidefinite (psd) matrices; that is, 𝒂′Cov(𝑿)𝒂 ≥ 0 for all vectors 𝒂 ∈ 𝑅^𝑛. To see this, let 𝑿 be a random vector and let 𝒂 be an 𝑛 × 1 vector of constants. Then 𝑌 = 𝒂′𝑿 is a random variable and, hence, has nonnegative variance; i.e.,
0 ≤ Var(𝑌) = Var(𝒂′𝑿) = 𝒂′Cov(𝑿)𝒂.
Hence, Cov(𝑿) is psd.
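Equations (2.6.13) and (2.6.14) and the psd property can be illustrated on simulated data (our own sketch; the distribution of 𝑿 and the matrix 𝑨 are arbitrary choices):

```python
# Sketch verifying Cov(X) = E[XX'] - mu mu' and Cov(AX) = A Cov(X) A' numerically.
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
# X: 3-dimensional random vector with dependent components and nonzero mean
Z = rng.normal(size=(n, 3))
X = Z @ np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.3],
                  [0.0, 0.0, 1.0]]) + np.array([1.0, -2.0, 0.5])

mu = X.mean(axis=0)
cov1 = (X.T @ X) / n - np.outer(mu, mu)        # E[XX'] - mu mu' (sample version)
cov2 = np.cov(X, rowvar=False, bias=True)      # sample variance-covariance matrix
print(np.allclose(cov1, cov2, atol=1e-8))      # (2.6.13)

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, -1.0]])               # 2 x 3 matrix of constants
cov_AX = np.cov(X @ A.T, rowvar=False, bias=True)
print(np.allclose(cov_AX, A @ cov2 @ A.T, atol=1e-8))   # (2.6.14)

# psd check: all eigenvalues of Cov(X) are nonnegative
print(np.all(np.linalg.eigvalsh(cov2) >= -1e-12))
```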
EXERCISE
1. Let 𝑋, 𝑌, 𝑍 have the joint pdf 𝑓(𝑥, 𝑦, 𝑧) = 2(𝑥 + 𝑦 + 𝑧)/3, 0 < 𝑥 < 1, 0 < 𝑦 < 1, 0 < 𝑧 < 1, zero elsewhere.
a. The marginal pdf of 𝑌 is
𝑓𝑌(𝑦) = ∫_0^1 ∫_0^1 𝑓(𝑥, 𝑦, 𝑧) 𝑑𝑥 𝑑𝑧 = (2/3) ∫_0^1 ∫_0^1 (𝑥 + 𝑦 + 𝑧) 𝑑𝑥 𝑑𝑧
= (2/3) ∫_0^1 [ (1/2)𝑥^2 + 𝑥𝑦 + 𝑥𝑧 ]_0^1 𝑑𝑧 = (2/3) ∫_0^1 ( 1/2 + 𝑦 + 𝑧 ) 𝑑𝑧
= (2/3) [ (1/2)𝑧 + 𝑦𝑧 + (1/2)𝑧^2 ]_0^1 = (2/3)(𝑦 + 1) = (2/3)𝑦 + 2/3, 0 < 𝑦 < 1.
Similarly, the marginal pdf of 𝑍 is
𝑓𝑍(𝑧) = ∫_0^1 ∫_0^1 𝑓(𝑥, 𝑦, 𝑧) 𝑑𝑥 𝑑𝑦 = (2/3) ∫_0^1 ∫_0^1 (𝑥 + 𝑦 + 𝑧) 𝑑𝑥 𝑑𝑦
= (2/3) ∫_0^1 ( 1/2 + 𝑦 + 𝑧 ) 𝑑𝑦 = (2/3) [ (1/2)𝑦 + (1/2)𝑦^2 + 𝑦𝑧 ]_0^1 = (2/3)(𝑧 + 1) = (2/3)𝑧 + 2/3, 0 < 𝑧 < 1,
and, by the same computation, 𝑓𝑋(𝑥) = (2/3)(𝑥 + 1) = (2/3)𝑥 + 2/3, 0 < 𝑥 < 1. All three marginal pdfs are zero elsewhere.
b.
𝑃(0 < 𝑋 < 1/2, 0 < 𝑌 < 1/2, 0 < 𝑍 < 1/2) = ∫_0^{1/2} ∫_0^{1/2} ∫_0^{1/2} 𝑓(𝑥, 𝑦, 𝑧) 𝑑𝑥 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^{1/2} ∫_0^{1/2} ∫_0^{1/2} (𝑥 + 𝑦 + 𝑧) 𝑑𝑥 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^{1/2} ∫_0^{1/2} [ (1/2)𝑥^2 + 𝑥𝑦 + 𝑥𝑧 ]_0^{1/2} 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^{1/2} ∫_0^{1/2} ( 1/8 + (1/2)𝑦 + (1/2)𝑧 ) 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^{1/2} ( 1/16 + 1/16 + (1/4)𝑧 ) 𝑑𝑧
= (2/3) ( 1/32 + 1/32 + 1/32 ) = (2/3)(3/32) = 1/16.
c. Multiplying the marginal pdfs gives
𝑓𝑋(𝑥)𝑓𝑌(𝑦)𝑓𝑍(𝑧) = ( (2/3)𝑥 + 2/3 )( (2/3)𝑦 + 2/3 )( (2/3)𝑧 + 2/3 ).
Since 𝑓(𝑥, 𝑦, 𝑧) ≢ 𝑓𝑋(𝑥)𝑓𝑌(𝑦)𝑓𝑍(𝑧), the random variables 𝑋, 𝑌, 𝑍 are not independent.
d.
𝐸(𝑋^2𝑌𝑍 + 3𝑋𝑌^4𝑍^2) = (2/3) ∫_0^1 ∫_0^1 ∫_0^1 (𝑥^2𝑦𝑧 + 3𝑥𝑦^4𝑧^2)(𝑥 + 𝑦 + 𝑧) 𝑑𝑥 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^1 ∫_0^1 ∫_0^1 (𝑥^3𝑦𝑧 + 3𝑥^2𝑦^4𝑧^2 + 𝑥^2𝑦^2𝑧 + 3𝑥𝑦^5𝑧^2 + 𝑥^2𝑦𝑧^2 + 3𝑥𝑦^4𝑧^3) 𝑑𝑥 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^1 ∫_0^1 ( (1/4)𝑦𝑧 + 𝑦^4𝑧^2 + (1/3)𝑦^2𝑧 + (3/2)𝑦^5𝑧^2 + (1/3)𝑦𝑧^2 + (3/2)𝑦^4𝑧^3 ) 𝑑𝑦 𝑑𝑧
= (2/3) ∫_0^1 ( (1/8)𝑧 + (1/5)𝑧^2 + (1/9)𝑧 + (1/4)𝑧^2 + (1/6)𝑧^2 + (3/10)𝑧^3 ) 𝑑𝑧
= (2/3) ∫_0^1 ( (17/72)𝑧 + (37/60)𝑧^2 + (3/10)𝑧^3 ) 𝑑𝑧
= (2/3) ( 17/144 + 37/180 + 3/40 ) = (2/3)(287/720) = 287/1080.
e. The cdf of 𝑋, 𝑌, and 𝑍 is
𝐹(𝑥, 𝑦, 𝑧) = 𝑃(𝑋 ≤ 𝑥, 𝑌 ≤ 𝑦, 𝑍 ≤ 𝑧) = (2/3) ∫_0^𝑧 ∫_0^𝑦 ∫_0^𝑥 (𝑢 + 𝑣 + 𝑤) 𝑑𝑢 𝑑𝑣 𝑑𝑤
= (2/3) ∫_0^𝑧 ∫_0^𝑦 ( (1/2)𝑥^2 + 𝑥𝑣 + 𝑥𝑤 ) 𝑑𝑣 𝑑𝑤
= (2/3) ∫_0^𝑧 ( (1/2)𝑥^2𝑦 + (1/2)𝑥𝑦^2 + 𝑥𝑦𝑤 ) 𝑑𝑤
= (2/3) ( (1/2)𝑥^2𝑦𝑧 + (1/2)𝑥𝑦^2𝑧 + (1/2)𝑥𝑦𝑧^2 )
= (1/3)𝑥^2𝑦𝑧 + (1/3)𝑥𝑦^2𝑧 + (1/3)𝑥𝑦𝑧^2, for 0 < 𝑥, 𝑦, 𝑧 < 1.
f. The conditional pdf of 𝑋 and 𝑌, given 𝑍 = 𝑧, is
𝑓𝑋,𝑌|𝑍(𝑥, 𝑦|𝑧) = 𝑓(𝑥, 𝑦, 𝑧) / 𝑓𝑍(𝑧) = [ (2/3)(𝑥 + 𝑦 + 𝑧) ] / [ (2/3)(𝑧 + 1) ] = (𝑥 + 𝑦 + 𝑧)/(𝑧 + 1),
for 0 < 𝑥 < 1, 0 < 𝑦 < 1, zero elsewhere. Hence
𝐸(𝑋 + 𝑌 | 𝑧) = ∫_0^1 ∫_0^1 (𝑥 + 𝑦)(𝑥 + 𝑦 + 𝑧)/(𝑧 + 1) 𝑑𝑥 𝑑𝑦
= 1/(𝑧 + 1) ∫_0^1 ∫_0^1 ( 𝑥^2 + 2𝑥𝑦 + 𝑦^2 + 𝑥𝑧 + 𝑦𝑧 ) 𝑑𝑥 𝑑𝑦
= 1/(𝑧 + 1) ( 1/3 + 1/2 + 1/3 + (1/2)𝑧 + (1/2)𝑧 )
= (7/6 + 𝑧)/(𝑧 + 1) = (6𝑧 + 7)/(6𝑧 + 6).
g. The marginal pdf of 𝑌 and 𝑍 is
𝑓𝑌,𝑍(𝑦, 𝑧) = ∫_0^1 (2/3)(𝑥 + 𝑦 + 𝑧) 𝑑𝑥 = (2/3)( 1/2 + 𝑦 + 𝑧 ),
so the conditional pdf of 𝑋, given 𝑌 = 𝑦 and 𝑍 = 𝑧, is
𝑓𝑋|𝑌,𝑍(𝑥|𝑦, 𝑧) = 𝑓(𝑥, 𝑦, 𝑧) / 𝑓𝑌,𝑍(𝑦, 𝑧) = (𝑥 + 𝑦 + 𝑧)/(1/2 + 𝑦 + 𝑧), 0 < 𝑥 < 1, zero elsewhere.
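Several of the answers above can be spot-checked by numerical integration (our own sketch; the evaluation points are arbitrary):

```python
# Numerical spot checks for Exercise 1.
import numpy as np
from scipy import integrate

f = lambda x, y, z: 2.0 * (x + y + z) / 3.0          # joint pdf on the unit cube

# (a) marginal pdf of Y at y = 0.3: should equal (2/3)(0.3 + 1) = 0.8667
fy, _ = integrate.nquad(lambda x, z: f(x, 0.3, z), [(0, 1), (0, 1)])
print(fy)

# (b) P(0 < X, Y, Z < 1/2): should equal 1/16
p, _ = integrate.nquad(f, [(0, 0.5), (0, 0.5), (0, 0.5)])
print(p, 1 / 16)

# (d) E(X^2 Y Z + 3 X Y^4 Z^2): should equal 287/1080
g = lambda x, y, z: (x**2 * y * z + 3 * x * y**4 * z**2) * f(x, y, z)
e, _ = integrate.nquad(g, [(0, 1), (0, 1), (0, 1)])
print(e, 287 / 1080)

# (f) E(X + Y | z) at z = 0.25: should equal (7/6 + z)/(1 + z)
z0 = 0.25
num, _ = integrate.nquad(lambda x, y: (x + y) * f(x, y, z0), [(0, 1), (0, 1)])
den, _ = integrate.nquad(lambda x, y: f(x, y, z0), [(0, 1), (0, 1)])
print(num / den, (7 / 6 + z0) / (1 + z0))
```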
2. Let 𝑓(𝑥1 , 𝑥2 , 𝑥3 ) = exp[−(𝑥1 + 𝑥2 + 𝑥3 )], 0 < 𝑥1 < ∞, 0 < 𝑥2 < ∞, 0 < 𝑥3 < ∞,
zero elsewhere, be the joint pdf of 𝑋1 , 𝑋2 , 𝑋3.
a. Compute 𝑃(𝑋1 < 𝑋2 < 𝑋3 ) and 𝑃(𝑋1 = 𝑋2 < 𝑋3 )
Solution
𝑓(𝑥1, 𝑥2, 𝑥3) = 𝑒^{−(𝑥1+𝑥2+𝑥3)}, 0 < 𝑥1 < ∞, 0 < 𝑥2 < ∞, 0 < 𝑥3 < ∞, zero elsewhere.
a. Computing 𝑃(𝑋1 < 𝑋2 < 𝑋3) gives
𝑃(𝑋1 < 𝑋2 < 𝑋3) = ∫_0^∞ ∫_0^{𝑥3} ∫_0^{𝑥2} 𝑒^{−(𝑥1+𝑥2+𝑥3)} 𝑑𝑥1 𝑑𝑥2 𝑑𝑥3
= ∫_0^∞ ∫_0^{𝑥3} (1 − 𝑒^{−𝑥2}) 𝑒^{−(𝑥2+𝑥3)} 𝑑𝑥2 𝑑𝑥3
= ∫_0^∞ ( 1/2 − 𝑒^{−𝑥3} + (1/2)𝑒^{−2𝑥3} ) 𝑒^{−𝑥3} 𝑑𝑥3
= 1/2 − 1/2 + 1/6 = 1/6.
Since the set {𝑥1 = 𝑥2 < 𝑥3} has three-dimensional volume zero, 𝑃(𝑋1 = 𝑋2 < 𝑋3) = 0.
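A Monte Carlo sketch (our own illustration) agrees with this value; by the symmetry of iid continuous variables, each of the 3! orderings has probability 1/6.

```python
# Monte Carlo check for Exercise 2(a); seed and sample size are arbitrary.
import numpy as np

rng = np.random.default_rng(4)
X = rng.exponential(size=(10**6, 3))        # iid Exp(1) samples of (X1, X2, X3)
print(np.mean((X[:, 0] < X[:, 1]) & (X[:, 1] < X[:, 2])), 1 / 6)
```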
For the random variables 𝑋1, 𝑋2, 𝑋3 of Remark 2.6.1 (Bernstein's example), we now verify that the joint mgf factors for each pair of variables but not for all three. We have
𝑀(𝑡1, 𝑡2, 0) = ∑_{𝑥1} ∑_{𝑥2} 𝑒^{𝑡1𝑥1+𝑡2𝑥2} 𝑝1,2(𝑥1, 𝑥2) = ∑_{𝑥1} ∑_{𝑥2} 𝑒^{𝑡1𝑥1+𝑡2𝑥2} (1/4)
= ∑_{𝑥1} ∑_{𝑥2} 𝑒^{𝑡1𝑥1+𝑡2𝑥2} (1/2)(1/2) = [∑_{𝑥1} 𝑒^{𝑡1𝑥1} (1/2)] [∑_{𝑥2} 𝑒^{𝑡2𝑥2} (1/2)]
= [∑_{𝑥1} 𝑒^{𝑡1𝑥1} 𝑝1(𝑥1)] [∑_{𝑥2} 𝑒^{𝑡2𝑥2} 𝑝2(𝑥2)] = 𝑀(𝑡1, 0, 0)𝑀(0, 𝑡2, 0),
while
𝑀(𝑡1, 0, 𝑡3) = ∑_{𝑥1} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡3𝑥3} 𝑝1,3(𝑥1, 𝑥3) = ∑_{𝑥1} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡3𝑥3} (1/4)
= ∑_{𝑥1} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡3𝑥3} (1/2)(1/2) = [∑_{𝑥1} 𝑒^{𝑡1𝑥1} (1/2)] [∑_{𝑥3} 𝑒^{𝑡3𝑥3} (1/2)]
= [∑_{𝑥1} 𝑒^{𝑡1𝑥1} 𝑝1(𝑥1)] [∑_{𝑥3} 𝑒^{𝑡3𝑥3} 𝑝3(𝑥3)] = 𝑀(𝑡1, 0, 0)𝑀(0, 0, 𝑡3),
while
𝑀(0, 𝑡2, 𝑡3) = ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡2𝑥2+𝑡3𝑥3} 𝑝2,3(𝑥2, 𝑥3) = ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡2𝑥2+𝑡3𝑥3} (1/4)
= ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡2𝑥2+𝑡3𝑥3} (1/2)(1/2) = [∑_{𝑥2} 𝑒^{𝑡2𝑥2} (1/2)] [∑_{𝑥3} 𝑒^{𝑡3𝑥3} (1/2)]
= [∑_{𝑥2} 𝑒^{𝑡2𝑥2} 𝑝2(𝑥2)] [∑_{𝑥3} 𝑒^{𝑡3𝑥3} 𝑝3(𝑥3)] = 𝑀(0, 𝑡2, 0)𝑀(0, 0, 𝑡3).
On the other hand, the joint mgf of 𝑋1, 𝑋2, 𝑋3 is
𝑀(𝑡1, 𝑡2, 𝑡3) = ∑_{𝑥1} ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡2𝑥2+𝑡3𝑥3} 𝑝(𝑥1, 𝑥2, 𝑥3) = ∑_{𝑥1} ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡2𝑥2+𝑡3𝑥3} (1/4),
where the sum runs over the four support points, while
𝑀(𝑡1, 0, 0)𝑀(0, 𝑡2, 0)𝑀(0, 0, 𝑡3) = [∑_{𝑥1} 𝑒^{𝑡1𝑥1} (1/2)] [∑_{𝑥2} 𝑒^{𝑡2𝑥2} (1/2)] [∑_{𝑥3} 𝑒^{𝑡3𝑥3} (1/2)]
= ∑_{𝑥1} ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡2𝑥2+𝑡3𝑥3} 𝑝1(𝑥1)𝑝2(𝑥2)𝑝3(𝑥3) = ∑_{𝑥1} ∑_{𝑥2} ∑_{𝑥3} 𝑒^{𝑡1𝑥1+𝑡2𝑥2+𝑡3𝑥3} (1/8),
where the sum runs over all eight points of {0,1}^3. Since 𝑝(𝑥1, 𝑥2, 𝑥3) = 1/4 on its support while 𝑝1(𝑥1)𝑝2(𝑥2)𝑝3(𝑥3) = 1/8 everywhere, the two expressions are not identical, so
𝑀(𝑡1, 𝑡2, 𝑡3) ≠ 𝑀(𝑡1, 0, 0)𝑀(0, 𝑡2, 0)𝑀(0, 0, 𝑡3),
and 𝑋1, 𝑋2, 𝑋3 are not mutually independent, even though the mgf factors for each pair.
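The failure of the three-way factorization can also be seen numerically by evaluating both sides at an arbitrary point 𝒕 (our own sketch):

```python
# Numeric check that the joint mgf of Bernstein's pmf factors pairwise but not jointly.
import numpy as np

support = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]   # each with probability 1/4

def M(t1, t2, t3):
    # joint mgf evaluated directly from the pmf
    return sum(0.25 * np.exp(t1 * x1 + t2 * x2 + t3 * x3) for x1, x2, x3 in support)

t1, t2, t3 = 0.3, -0.7, 1.1          # arbitrary nonzero arguments
print(M(t1, t2, 0), M(t1, 0, 0) * M(0, t2, 0))                  # these two agree
print(M(t1, t2, t3), M(t1, 0, 0) * M(0, t2, 0) * M(0, 0, t3))   # these two differ
```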
Finally, we verify that each off-diagonal entry of the variance-covariance matrix Cov(𝑿) = 𝐸[(𝑿 − 𝝁)(𝑿 − 𝝁)′] is cov(𝑋𝑖, 𝑋𝑗). Each off-diagonal entry has the form
𝐸[(𝑋𝑖 − 𝜇𝑖)(𝑋𝑗 − 𝜇𝑗)] = 𝐸(𝑋𝑖𝑋𝑗 − 𝑋𝑖𝜇𝑗 − 𝑋𝑗𝜇𝑖 + 𝜇𝑖𝜇𝑗).
By the linearity of 𝐸 established earlier,
𝐸(𝑋𝑖𝑋𝑗 − 𝑋𝑖𝜇𝑗 − 𝑋𝑗𝜇𝑖 + 𝜇𝑖𝜇𝑗) = 𝐸(𝑋𝑖𝑋𝑗) − 𝜇𝑗𝐸(𝑋𝑖) − 𝜇𝑖𝐸(𝑋𝑗) + 𝜇𝑖𝜇𝑗 = 𝐸(𝑋𝑖𝑋𝑗) − 𝜇𝑖𝜇𝑗 = cov(𝑋𝑖, 𝑋𝑗).
Thus every off-diagonal entry of Cov(𝑿) is indeed cov(𝑋𝑖, 𝑋𝑗).