
UCL Department of Economics MSc Maths and Statistics Refresher Course

Mónica Costa Dias September 2007

2 (Statistics) Random variables

References: DeGroot and Schervish, chapters 3, 4 and 5; Stirzaker, chapters 4, 5 and 6

We will now study the main tools used in statistics for modelling experiments with unknown outcomes:
random variables and their distributions.

2.1 Single random variables and distributions

• Consider an experiment in a sample space S. A random variable is a function that transforms
the sample space into R, assigning a real value to each possible outcome of the experiment.
In general we denote it by X(s) (s ∈ S), or simply X when there is no scope for confusion.

• Attached to each random variable is a probability rule that measures the likelihood of a
particular outcome. Suppose A is a set in R and we wish to measure the probability that
X ∈ A. This is given by:

p(X ∈ A) = p(s ∈ S : X(s) ∈ A)

• A probability rule is generally described by a function, the cumulative distribution function,
abbreviated as cdf. For any real value x, the cdf FX(x) is defined as follows:

FX(x) = p(X ≤ x)

• For a function FX defined in all R to be a cdf it must satisfy the following conditions:

– be non-decreasing;
– FX (+∞) = 1
– FX (−∞) = 0
– p(a < X ≤ b) = FX(b) − FX(a)

• A random variable X is discrete or follows a discrete distribution if X can take only a
countable number of different values, either finite, {k1, k2, . . . , kn}, or countably infinite,
{k1, k2, . . . , kn, . . .}.

• A random variable X is continuous or follows a continuous distribution if it can take every
value in a real interval.


2.1.1 Discrete distributions

• A discrete random variable takes values only on a discrete set of values, {k1 , k2 , . . .}.

• For these variables we can define a function fX called the probability density function and
abbreviated as pdf :

fX (x) = p(X = x)

The pdf measures the likelihood of each particular outcome x.

• Notice:

– If x is not one of the possible outcomes of the experiment, {k1 , k2 , . . .}, then fX (x) = 0.
– At points in the possible space of outcomes, the pdf will be strictly positive. These are
called mass points of the distribution.
– Thus fX is always non-negative. Since it measures the probability of each possible event,
it cannot be bigger than 1.

• The pdf of X defines the cdf of X as follows:

FX(x) = p(X ≤ x)
      = Σ_{i: ki ≤ x} fX(ki)

• Since FX is a cdf, it must be that:

– FX is non-decreasing: if x2 > x1 then all values ki that are no larger than x1 must not
be larger than x2, and thus FX(x2) must be at least equal to FX(x1);
– FX(+∞) = Σ_{all i} fX(ki) = 1;
– FX(−∞) = 0;
– p(a < X ≤ b) = Σ_{i: a<ki≤b} fX(ki) = Σ_{i: ki≤b} fX(ki) − Σ_{i: ki≤a} fX(ki) = FX(b) − FX(a).

• As a consequence of its definition, the cdf FX is a step function.

• Examples of discrete distributions: discrete uniform, Bernoulli, binomial.
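These definitions translate directly into code. Below is a quick numerical sketch in Python (the parameters n = 10 and p = 0.3 are chosen arbitrarily for the illustration) that builds the binomial pdf and its step-function cdf exactly as defined above:

```python
from math import comb

# pdf of a Binomial(n, p) random variable: fX(k) = C(n, k) p^k (1-p)^(n-k)
def binom_pdf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# cdf as a sum of the pdf over the mass points ki <= x
def binom_cdf(x, n, p):
    return sum(binom_pdf(k, n, p) for k in range(n + 1) if k <= x)

n, p = 10, 0.3
total = sum(binom_pdf(k, n, p) for k in range(n + 1))      # sums to 1
prob_interval = binom_cdf(7, n, p) - binom_cdf(2, n, p)    # p(2 < X <= 7)
```

The cdf only increases at the mass points 0, 1, . . . , n, so it is a step function, as stated above.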

2.1.2 Continuous distributions

• A continuous random variable X can assume any value within an interval which may be
bounded or not.


• In this case the cdf of X, FX, may be continuous over the whole of R or, at least, continuous
over intervals of R; it may have some discontinuity points.

• To start with, suppose that the cdf of X is continuous and differentiable over R. Then we
can define the function fX as follows:

fX(x) = dFX(x)/dx = F′X(x)   (1)
• The function fX defined above is called the probability density function and abbreviated pdf.
It measures the marginal change (increase) in FX as x changes infinitesimally.

• A consequence of this definition is that fX (x) is always non-negative.

• The converse of (1) is that the cdf can be defined as the function FX that satisfies the
following condition:

FX(x) = ∫_{−∞}^{x} fX(t)dt

meaning that it measures the area below the curve of fX.

• This definition of cdf can be extended to random variables that follow a continuous distribu-
tion in all R except possibly for a finite number of points. In this case fX (x) = FX0 (x) for all
x where the derivative exists.

• Notice that from our definition the following properties follow:

– ∫_{−∞}^{+∞} fX(x)dx = 1;
– p(X > x) = ∫_{x}^{+∞} fX(t)dt = 1 − FX(x);
– p(a < X ≤ b) = FX(b) − FX(a) = ∫_{a}^{b} fX(x)dx;
– p(a < X ≤ b) = p(a ≤ X < b) = p(a < X < b) = p(a ≤ X ≤ b);
– the two properties above then imply that at a point a where the distribution of X is
continuous:

p(X = a) = p(a ≤ X ≤ a) = ∫_{a}^{a} fX(x)dx = 0

meaning that the likelihood of the realisation of a particular value in the (continuous)
distribution of X is zero.
– But then the function fX is non-unique: it can be changed at a countable number of
points and still yield the same cdf. We resolve the ambiguity by always using the
continuous version of fX unless there are reasons to do otherwise.

• Examples: uniform distribution, normal distribution.
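A small numerical sketch (Python; the evaluation points are arbitrary) checks that for the normal distribution the pdf is indeed the derivative of the cdf, and that FX(b) − FX(a) gives interval probabilities:

```python
from math import erf, exp, pi, sqrt

# standard normal pdf and cdf from their closed forms
def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

# fX(x) = F'X(x): a central difference of the cdf should match the pdf
x, h = 0.7, 1e-5
deriv = (norm_cdf(x + h) - norm_cdf(x - h)) / (2 * h)

# p(a < X <= b) = FX(b) - FX(a); for a standard normal this is about 0.68
prob = norm_cdf(1.0) - norm_cdf(-1.0)
```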


2.1.3 Functions of a random variable

• Consider a discrete random variable first:

– Suppose the random variable X is defined on the space {k1 , k2 , . . .}, following a discrete
distribution with pdf fX so that the probability of x is p(X = x) = fX (x).
– Now consider a transformation of X through a function h to form a new random variable
Y : Y = h(X).
– Y follows a new probability rule fY which is defined as follows:

fY(y) = p(Y = y)
      = p(h(X) = y)
      = Σ_{x: h(x)=y} fX(x)

• Now consider a continuous random variable:

– X is a continuous random variable with a pdf fX (x).


– Consider again a transformation of X through a function h. The resulting random
variable is Y = h(X).
– The cdf of Y can now be defined as:

FY(y) = p(Y ≤ y)
      = p(h(X) ≤ y)
      = ∫_{x: h(x)≤y} fX(x)dx

– Now suppose that h is strictly monotonic (either increasing or decreasing). Then h is
invertible and we can write X = h⁻¹(Y). In this case, the pdf of Y is:

fY(y) = fX(h⁻¹(y)) |dh⁻¹(y)/dy|
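The change-of-variables formula can be verified numerically. The Python sketch below uses Y = exp(X) with X standard normal as an illustrative choice (so h is strictly increasing and h⁻¹(y) = log y), and compares the formula against a numerical derivative of FY(y) = FX(log y):

```python
from math import erf, exp, log, pi, sqrt

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

# Y = h(X) = exp(X), so h^{-1}(y) = log(y) and
# fY(y) = fX(log y) * |d log(y)/dy| = fX(log y) / y
def f_y(y):
    return norm_pdf(log(y)) / y

# cross-check against a numerical derivative of FY(y) = FX(log y)
y, h = 2.0, 1e-6
numeric = (norm_cdf(log(y + h)) - norm_cdf(log(y - h))) / (2 * h)
```

The resulting fY is the density of the lognormal distribution, obtained here purely from the transformation rule.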

2.1.4 Moments

• The distribution of a random variable contains all the information about it. However, it is
often cumbersome and difficult to present. Instead, some functions of the random variable
summarise the distribution and are often presented. The most commonly used functions are
the moments of the random variable.

• Expected value: the first moment of the distribution, a measure of its centre.


– For a discrete random variable with possible realisations {k1, k2, . . . , kn}:

EX(X) = Σ_{i=1}^{n} ki fX(ki)

– For a continuous random variable:

EX(X) = ∫_{−∞}^{+∞} x fX(x)dx

– What is the expected value of Z = h(X) where X is a continuous random variable? It
can be computed directly from the distribution of X: EZ(Z) = ∫_{−∞}^{+∞} h(x) fX(x)dx.

– The expected value may or may not lie at the centre of the distribution of X.
– Some properties of the expected value:
∗ E(c) = c where c is a constant;
∗ E(a + bX) = a + bE(X) where a and b are scalars;
∗ If g(X) = g1 (X) + g2 (X) is a function then E(g(X)) = E(g1 (X)) + E(g2 (X)).
∗ If g is a non-linear function then E(g(X)) is generally different from g(E(X)).
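These properties are easy to illustrate by Monte Carlo. The Python sketch below (sample size, seed and the constants a, b are arbitrary choices) checks linearity and shows the gap between E(g(X)) and g(E(X)) for the non-linear g(x) = x²:

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible
xs = [random.gauss(0, 1) for _ in range(200_000)]

def mean(v):
    return sum(v) / len(v)

# linearity: E(a + bX) = a + b E(X), up to floating-point rounding
a, b = 2.0, 3.0
lhs = mean([a + b * x for x in xs])
rhs = a + b * mean(xs)

# non-linear g: with g(x) = x^2 and X ~ N(0, 1), E(X^2) = 1
# while (E X)^2 is near 0, so E(g(X)) != g(E(X))
e_g = mean([x * x for x in xs])
g_e = mean(xs) ** 2
```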

• Variance: measures the dispersion of the distribution.

– The variance of a distribution is given by:

V(X) = E[(X − E(X))²]
     = E(X²) − [E(X)]²

– We also define the standard deviation to be:

sd(X) = √V(X)

– Some properties of the variance:

∗ V(c) = 0 where c is a constant;
∗ V(aX) = a²V(X) where a is a scalar;
∗ V(aX + b) = a²V(X) where a and b are scalars.

• Higher order moments: These help characterise a distribution. They may be centred or not:

– non-centred moment of order k: E(X^k);
– centred moment of order k: E[(X − E(X))^k].

• Median: another measure of the centre of the distribution. It is the point m that divides the
distribution in two parts, each with a probability of 0.5.


– The median of the distribution of a continuous random variable X is defined as follows:

median(X) = m if p(X ≤ m) = p(X ≥ m) = 0.5

– The median of the distribution of a discrete random variable X is defined as the smallest
value m such that:

p(X ≤ m) ≥ 0.5

• Quantile: the median is an example of a quantile, the 0.5-quantile. In general, the p-quantile
of a distribution is the value x that divides the distribution in two parts, one with probability
p and the other with probability 1 − p.

– For a continuous random variable X, the p-quantile is defined as:

Qp(X) = x if p(X ≤ x) = p

– For a discrete random variable X, the p-quantile is defined as the smallest x such that

p(X ≤ x) ≥ p
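The discrete-case definition carries over directly to empirical quantiles of a sample. A Python sketch (Uniform(0, 1) draws; the sample size and seed are arbitrary):

```python
import math
import random

random.seed(1)
sample = sorted(random.uniform(0, 1) for _ in range(100_001))

def quantile(sorted_sample, p):
    # smallest x in the sample such that the fraction of observations
    # not exceeding x is at least p -- the discrete-case definition
    k = math.ceil(p * len(sorted_sample))
    return sorted_sample[k - 1]

med = quantile(sample, 0.5)   # should sit near 0.5
q90 = quantile(sample, 0.9)   # should sit near 0.9
```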

2.2 Bivariate distributions

We may imagine cases where more than one random variable is required to describe an experiment.
We will now study how to deal with several random variables simultaneously.

• Let (X, Y ) be a pair of random variables. We now want to characterise their joint distribution.

• Discrete case: if both X and Y are discrete random variables defined on the space S, the
joint pdf is

fXY (x, y) = p(X = x, Y = y)

Again fXY is always non-negative and satisfies:

Σ_{(x,y)∈S} fXY(x, y) = 1

The cdf is now:

FXY(x, y) = Σ_{xi ≤ x} Σ_{yj ≤ y} fXY(xi, yj)


• Continuous case: if X and Y are continuous random variables, the joint cdf is

FXY(x, y) = p(X ≤ x, Y ≤ y)

This is a non-decreasing function in both arguments such that:

FXY(−∞, −∞) = 0
FXY(+∞, +∞) = 1

We can now define the pdf to be:

fXY(x, y) = ∂²FXY(x, y) / ∂x∂y

and thus:

p(ax < X ≤ bx, ay < Y ≤ by) = ∫_{ax}^{bx} ∫_{ay}^{by} fXY(x, y)dydx

and

p(X ≤ a, Y ≤ b) = ∫_{−∞}^{a} ∫_{−∞}^{b} fXY(x, y)dydx
                = FXY(a, b)

2.2.1 Marginal distribution

• Consider again the case of two random variables (X, Y ). If the joint cdf is known, then the
cdf of each variable can be derived.

• In the discrete case, this amounts to summing over all the possible values of the other variable.
Let S be the support of (X, Y) (the set of possible values that (X, Y) may assume) and suppose
there are MX and MY different possible values that X and Y can assume, respectively. The
marginal distribution of X is defined by its marginal pdf, fX, as follows:

fX(x) = p(X = x)
      = Σ_{j=1}^{MY} fXY(x, yj)

and similarly for Y:

fY(y) = p(Y = y)
      = Σ_{i=1}^{MX} fXY(xi, y)


• In the continuous case we need to integrate over one of the variables to obtain the cdf of the
other:

FX(x) = ∫_{−∞}^{x} ∫_{−∞}^{+∞} fXY(t, y)dydt
FY(y) = ∫_{−∞}^{y} ∫_{−∞}^{+∞} fXY(x, t)dxdt

The marginal pdf's can now be obtained as the first derivatives of the marginal cdf's:

fX(x) = ∫_{−∞}^{+∞} fXY(x, y)dy
fY(y) = ∫_{−∞}^{+∞} fXY(x, y)dx
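A minimal discrete example makes the summation concrete. In the Python sketch below the joint pdf is laid out on a 2 × 3 grid (the probabilities are invented for the illustration):

```python
# joint pdf of a discrete pair (X, Y): keys are (x, y) pairs
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

# marginal pdf of X: sum the joint pdf over all values of Y, and vice versa
def f_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def f_y(y):
    return sum(p for (_, yj), p in joint.items() if yj == y)

mx0 = f_x(0)   # 0.10 + 0.20 + 0.10 = 0.40
my1 = f_y(1)   # 0.20 + 0.25 = 0.45
```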

2.2.2 Conditional distribution

• We have encountered the concept of conditional probability before. We can now apply it to
distribution functions.

• Suppose we have a pair of random variables (X, Y) and wish to determine the probability of
some realisation of Y given that we have some information about X. In particular, we can
derive:

p(Y ≤ y|X ≤ x) = p(Y ≤ y, X ≤ x) / p(X ≤ x)
               = FXY(x, y) / FX(x)

• This is true for both discrete and continuous random variables.

• For discrete random variables we can immediately write the pdf in a similar way:

p(Y = y|X = x) = p(Y = y, X = x) / p(X = x)
               = fXY(x, y) / fX(x)
               = fY|X(y|x)

and the cdf:

p(Y ≤ y|X = x) = p(Y ≤ y, X = x) / p(X = x)
               = FY|X(y|x)


• For continuous random variables we need to take a small interval in X and write a similar
relationship:

p(Y ≤ y|x < X ≤ x + ∆) = [p(Y ≤ y, X ≤ x + ∆) − p(Y ≤ y, X ≤ x)] / [p(X ≤ x + ∆) − p(X ≤ x)]
                        = [FXY(x + ∆, y) − FXY(x, y)] / [FX(x + ∆) − FX(x)]
                        = {[FXY(x + ∆, y) − FXY(x, y)]/∆} / {[FX(x + ∆) − FX(x)]/∆}

Taking the limit as ∆ approaches 0 yields

p(Y ≤ y|X = x) = [∂FXY(x, y)/∂x] / [dFX(x)/dx]
               = [∂FXY(x, y)/∂x] / fX(x)
               = FY|X(y|x)

and we can now take the derivative with respect to y to obtain:

fY|X(y|x) = ∂FY|X(y|x)/∂y
          = fXY(x, y) / fX(x)

• In both the continuous and discrete cases:

fXY(x, y) = fY|X(y|x) fX(x)
          = fX|Y(x|y) fY(y)

and thus:

fX|Y(x|y) = fY|X(y|x) fX(x) / fY(y)
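Bayes' rule for pdfs can be checked on a small discrete example (Python; the joint pdf below is made up for the illustration):

```python
# a toy joint pdf for a discrete pair (X, Y)
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

def f_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def f_y(y):
    return sum(p for (_, yj), p in joint.items() if yj == y)

def f_y_given_x(y, x):
    # fY|X(y|x) = fXY(x, y) / fX(x)
    return joint[(x, y)] / f_x(x)

def f_x_given_y(x, y):
    # Bayes' rule: fX|Y(x|y) = fY|X(y|x) fX(x) / fY(y)
    return f_y_given_x(y, x) * f_x(x) / f_y(y)

direct = joint[(1, 2)] / f_y(2)   # direct definition of fX|Y(1|2)
via_bayes = f_x_given_y(1, 2)     # the same number via Bayes' rule
```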

2.2.3 Moments

• Expected value:

– Let g(X, Y) be a function of the two random variables (X, Y). Then:

EXY(g(X, Y)) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) fXY(x, y)dydx   for continuous random variables
EXY(g(X, Y)) = Σ_{i=1}^{MX} Σ_{j=1}^{MY} g(xi, yj) fXY(xi, yj)   for discrete random variables

– But then:

EXY (g1 (X, Y ) + g2 (X, Y )) = EXY (g1 (X, Y )) + EXY (g2 (X, Y ))


– It is also true that:

EXY (g(X)) = EX (g(X))

• Covariance: cov(X, Y ) = E(XY ) − E(X)E(Y )


• Correlation: corr(X, Y) = cov(X, Y) / √(V(X)V(Y))

2.2.4 Independence

• Two random variables (X, Y ) are independent if

FXY (x, y) = FX (x)FY (y)

but this implies that

fXY (x, y) = fX (x)fY (y)

and

fX|Y (x|y) = fX (x)

which means that knowing one does not say anything about the other.

• In this case we have some results for the moments. If X and Y are independent then:

– E(XY) = E(X)E(Y)
– V(aX + bY) = a²V(X) + b²V(Y)
– E(X|Y) = E(X) and E(Y|X) = E(Y), where

E(Y|X = x) = ∫_{−∞}^{+∞} y fY|X(y|x)dy
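These results can be illustrated by Monte Carlo (a Python sketch; seed, sample size and the coefficients a, b are arbitrary choices):

```python
import random

random.seed(2)
n = 200_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [random.gauss(0, 1) for _ in range(n)]   # drawn independently of xs

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return mean([(t - m) ** 2 for t in v])

# E(XY) = E(X)E(Y): both sides are near 0 here
e_xy = mean([x * y for x, y in zip(xs, ys)])

# V(aX + bY) = a^2 V(X) + b^2 V(Y) (the cross term vanishes)
a, b = 2.0, -1.0
lhs = var([a * x + b * y for x, y in zip(xs, ys)])
rhs = a * a * var(xs) + b * b * var(ys)
```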

2.2.5 Iterated expectations

This is a very useful result. It states that:

EX(X) = EY[EX|Y(X|Y)]

Based on this result we can prove that, for example:

• if E(Y |X) = 0 then E(XY ) = 0;

• if E(Y |X) = 0 then E(Y ) = 0.
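A numerical check of this result (Python; the two-stage design and the conditional means are invented for the illustration): draw Y first, then X given Y, and compare the overall sample mean of X with E_Y[E(X|Y)]:

```python
import random

random.seed(3)
n = 200_000
# two-stage experiment: Y ~ Bernoulli(0.5), then X|Y ~ N(mu_Y, 1)
mu = {0: -1.0, 1: 2.0}
draws = []
for _ in range(n):
    y = random.randint(0, 1)
    draws.append(random.gauss(mu[y], 1.0))

mean_x = sum(draws) / n
# E_Y[E(X|Y)] from the known conditional means and p(Y=0) = p(Y=1) = 0.5
iterated = 0.5 * mu[0] + 0.5 * mu[1]    # = 0.5
```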


2.3 Many random variables

The above results extend simply to the case where there are many random variables. In such case
they are generally arranged in vectors.

• Let X = [X1 X2 . . . Xn]′ be an n × 1 vector of random variables.

• The joint distribution function is:

FX(x) = p(X ≤ x)
      = p(X1 ≤ x1, X2 ≤ x2, . . . , Xn ≤ xn)

• The joint pdf is:

fX(x) = ∂^n FX(x) / (∂x1 ∂x2 . . . ∂xn)

• Some of the most important moments are the following.

– Expected value:

EX(X) = [EX1(X1), EX2(X2), . . . , EXn(Xn)]′

– Each of the expectations inside the vector is computed using the corresponding marginal
distribution, so for example:

EX1(X1) = ∫_{−∞}^{+∞} . . . ∫_{−∞}^{+∞} x1 fX(x1, x2, . . . , xn)dx1 dx2 . . . dxn
        = ∫_{−∞}^{+∞} x1 fX1(x1)dx1

– The variance is given by:

VX(X) = E(XX′) − E(X)E(X)′

An important example is:

VX(a′X + b) = VX(a′X)
            = a′V(X)a

This is a quadratic form. Since a variance is always non-negative, it follows that
VX(a′X + b) ≥ 0. But then V(X) is positive semi-definite (psd).
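Positive semi-definiteness can be illustrated with a sample analogue of V(X) (a Python sketch; the data-generating process is a toy choice). The sample covariance matrix is itself an average of outer products, so every quadratic form a′V(X)a comes out non-negative:

```python
import random

random.seed(4)
n = 50_000
# a 2-dimensional random vector with correlated components (toy example)
xs = [random.gauss(0, 1) for _ in range(n)]
zs = [0.5 * x + random.gauss(0, 1) for x in xs]

def mean(v):
    return sum(v) / len(v)

mx, mz = mean(xs), mean(zs)
# sample analogue of V(X) = E(XX') - E(X)E(X)'
vxx = mean([x * x for x in xs]) - mx * mx
vzz = mean([z * z for z in zs]) - mz * mz
vxz = mean([x * z for x, z in zip(xs, zs)]) - mx * mz

# the quadratic form a'V(X)a = V(a'X) is non-negative for every a,
# which is exactly what positive semi-definiteness requires
quad_forms = [
    a1 * a1 * vxx + 2 * a1 * a2 * vxz + a2 * a2 * vzz
    for a1, a2 in [(1, 0), (0, 1), (1, -1), (3, 2), (-0.5, 4)]
]
```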


2.4 Exercises

1. An exam consists of 100 multiple-choice questions. For each question there are four possible
answers, only one of them correct. If a candidate guesses answers at random, what is the
probability of getting at least 30 questions correct?
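For this exercise, the number of correct guesses is Binomial(n = 100, p = 1/4), and the exact tail probability can be computed directly (a Python sketch, offered only as a numerical cross-check):

```python
from math import comb

# the number of correct guesses is Binomial(n = 100, p = 1/4);
# we want P(X >= 30), summed exactly from the binomial pdf
n, p = 100, 0.25
prob_at_least_30 = sum(
    comb(n, k) * p**k * (1 - p)**(n - k) for k in range(30, n + 1)
)
# roughly 0.15: guessing 30 or more answers correctly is unlikely
```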

2. Two fair dice are thrown. Let X be the number of points on the first die and Y the number
of points on the second. Define Z = X + Y and W = XY.

Find the expectations and variances of X, Y, Z, W . Also find E(X 2 ), E(Y 2 ), E(Z 2 ) and
E(W 2 ).

3. The pdf of a random variable X is:

f(x) = αx(2 − x) if 0 < x < 2, and f(x) = 0 otherwise.

Find α, E(X) and V(X).

4. Let (X, Y, Z) be independent random variables such that:

E(X) = −1 and V(X) = 2
E(Y) = 0 and V(Y) = 3
E(Z) = 1 and V(Z) = 4

Let

T = 2X + Y − 3Z + 4
U = (X + Z)(Y + Z)

Find E(T ), V (T ), E(T 2 ) and E(U ).

