
STATISTICS FOR ECONOMISTS

Prepared by: Mekoro Arega (M.Sc.)


Introduction
 There are many things in our lives that are uncertain.
 This uncertainty makes life challenging and interesting.
 The concept of probability provides a foundation for the scientific analysis of problems involving uncertainty;
 its applications enable us to obtain a degree of predictability from an uncertain state of nature.



 Sample Space
 Experiment:- is any action or process whose outcome is
subject to uncertainty.
 Sample Space: - collection of all possible outcomes (or
elements) of the experiment (set, S).
 Example: consider an experiment in which an ordinary six-sided die is rolled once. If we are interested in the face value of a single toss, then there are six possible outcomes, of which one must occur. The outcomes of this experiment form the set of all possible outcomes S:
S = {1, 2, 3, 4, 5, 6}.
 A set S consisting of elements representing all possible outcomes of an experiment is called a sample space.
 Sample Point: each element in the sample space is termed a sample point.
Example: consider an experiment of tossing a fair coin twice.
S = {(HT), (TH), (HH), (TT)}.
 The sample points belonging to the above sample space S are (HT), (TH), (HH), and (TT).
 Event: a subset of the sample space.
 Example: Considering the above experiment of tossing a die, let A be the event of odd numbers, B the event of even numbers, and C the event of getting the number 8.
A = {1,3,5}, B = {2,4,6}, C = {} (the empty set, an impossible event)
 NB: If S (the sample space) has n members, then there are exactly 2^n subsets or events.
A. Simple event (elementary event): a subset of the sample space that has a single element (sample point).
B. Sure (certain) event: an event which consists of all sample points in the sample space.
 If an event is defined so that it must occur on every trial, it is called the certain event.
C. Compound event: a subset of the sample space that has two or more sample points.
D. Impossible event: a subset of the sample space that contains none of the sample points.
For instance, in the above example of rolling a die, the event of obtaining the number 8 is an impossible event.
E. Independent events: two events are said to be independent when the occurrence of one event does not affect the occurrence of the other.
For instance, in tossing a die repeatedly, the event of getting 5 on the first throw is independent of the outcome of the second or subsequent throws.
F. Dependent events: two events are said to be dependent when the occurrence or non-occurrence of one event affects the occurrence of the other.
For instance, in drawing cards from a pack of cards, the result of the second draw will depend upon the card drawn in the first draw if the experiment is performed without replacement.
G. Mutually exclusive events: two events which cannot happen at the same time. In other words, if two or more events are mutually exclusive, they cannot occur together.
H. Equally likely events: events which have the same chance of occurring.
I. Complement of an event: the complement of an event 'A' means the non-occurrence of 'A'; it is denoted by A' and contains those points of the sample space which do not belong to A.
Class work:
1. What is the sample space for the following experiment?
a. Toss a coin three times.
Solution
a. S={(HHH),(HHT),(HTH),(HTT),(THH),(THT),(TTH),(TTT)}
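As a quick check of this classwork answer, the sample space can be enumerated directly in Python (an illustrative sketch, not part of the original notes):

from itertools import product

# Enumerate all outcomes of tossing a coin three times.
sample_space = list(product("HT", repeat=3))
print(len(sample_space))   # 8 outcomes, as listed above

# An event is just a subset of the sample space, e.g. "exactly two heads".
event = [s for s in sample_space if s.count("H") == 2]
print(event)   # [('H','H','T'), ('H','T','H'), ('T','H','H')]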
1.2. Definitions of Probability
 Probability is a value attached to a sample point or
event that it will be realized. These probability
assignments to events in the sample space must
follow certain rules.
 The probability of any basic outcome or event must
be between zero(0) and one(1).
 That is, for any outcome Oi or event Ei containing a
set of outcomes we have
0 ≤ P(Oi) ≤ 1 or 0 ≤ P(Ei) ≤ 1
 Probability can be expressed as an odds ratio. If the probability of an event Ej is a, then the odds of that event occurring are a to (1 − a). If the odds in favor of an event are a to b, then the probability of the event occurring is a/(a + b).
 Example: if the odds that your car will break down on the way home from work are 1 to 10, then the probability it will break down is 1/(1 + 10) = 1/11.
 There are four different conceptual approaches
to the study of probability theory.
These are:
a. The classical approach.
b. The frequentist approach.
c. The axiomatic approach.
d. The subjective approach.



The classical approach is used when:
 All outcomes are equally likely.
 The total number of outcomes is finite, say N.
Definition: If a random experiment with N equally likely outcomes is conducted, and NA of these outcomes are favorable to the event A, then the probability that event A will occur, denoted by P(A), is defined as:

P(A) = NA/N = (number of outcomes favourable to A)/(total number of outcomes) = n(A)/n(S)



 Examples:
 A fair die is tossed once. What is the probability
of getting
a. Number 4? c) An even number?
b. An odd number? d) Number 8?
Solutions:
 First identify the sample space: S = {1, 2, 3, 4, 5, 6}, so n(S) = 6.
a. Let A be the event of getting the number 4: A = {4}, so P(A) = n(A)/n(S) = 1/6.
b. Let A be the event of odd numbers: A = {1, 3, 5}, so P(A) = 3/6 = 0.5.
c. Let A be the event of even numbers: A = {2, 4, 6}
 NA = n(A) = 3
P(A) = n(A)/n(S) = 3/6 = 0.5
d. Let A be the event of getting the number 8: A = {}, so P(A) = 0/6 = 0 (an impossible event).
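Under the classical definition these probabilities are just ratios of counts, which the following Python sketch (an illustration, not from the slides) computes directly:

from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                      # sample space of one die roll

def classical_prob(event):
    """P(A) = n(A)/n(S) when all outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

print(classical_prob({4}))                  # 1/6
print(classical_prob({1, 3, 5}))            # 1/2  (odd numbers)
print(classical_prob({2, 4, 6}))            # 1/2  (even numbers)
print(classical_prob({8}))                  # 0    (impossible event)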
2. A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these candles are selected at random, what is the probability that
a. All will be defective?
b. 6 will be non-defective?
c. All will be non-defective?
Solutions:
Total number of selections: n(S) = N = C(80, 10), where C(n, r) denotes the number of combinations of n items taken r at a time.

a. Let A be the event that all 10 will be defective. Then n(A) = C(30, 10), so
P(A) = n(A)/n(S) = C(30, 10)/C(80, 10) ≈ 0.0000182

b. Let A be the event that 6 will be non-defective (and hence 4 defective). Then
n(A) = NA = C(30, 4) · C(50, 6)
P(A) = n(A)/n(S) = C(30, 4) · C(50, 6)/C(80, 10) ≈ 0.265

c. Let A be the event that all 10 will be non-defective. Then n(A) = C(50, 10), so
P(A) = C(50, 10)/C(80, 10) ≈ 0.0062

 Shortcomings of the classical approach:
This approach is not applicable when:
1. The total number of outcomes is infinite.
2. The outcomes are not equally likely.
The frequentist approach is based on the relative frequencies of outcomes belonging to an event.
Definition: The probability of an event A is the proportion of outcomes favorable to A in the long run when the experiment is repeated under the same conditions.

 Example: Assume that records show that 60 out of 100,000 bulbs produced are defective. What is the probability of a newly produced bulb being defective?
 Solution: Let A be the event that the newly produced bulb is defective.
P(A) = 60/100,000 = 0.0006
The axiomatic approach defines the properties of probabilities, i.e., the mathematical rules that probabilities must satisfy. These axioms are:
 Axiom 1: 0 ≤ P(Ei) ≤ 1, for every event Ei.
This axiom states that the numerical values assigned as probabilities are real, nonnegative numbers not exceeding unity.
 Axiom 2: P(S) = 1, where S is the sure event.
This axiom states that an event whose occurrence is inevitable on any single trial is assigned a probability value of one.
 Axiom 3: P(E1 ∪ E2 ∪ E3 ∪ …) = P(E1) + P(E2) + P(E3) + …, for a sequence of mutually exclusive events Ei.
This axiom states that the probability of the union of mutually exclusive events must equal the sum of their separate probabilities.
 Axiom 4: P(E1 ∩ E2 ∩ E3 ∩ … ∩ En) = P(E1) · P(E2) · P(E3) · … · P(En), for independent events E1, E2, …, En.
 Axiom 5: P(E′) = 1 − P(E), i.e., the probability that an event does not occur is always equal to 1 minus the probability that the event occurs.
If events are not mutually exclusive, consider the following Venn diagram:

n(E1 or E2) = n(E1E2c) + n(E1E2) + n(E1cE2)

Dividing through by n(S), the number of sample points in the entire sample space S, we obtain
n(E1 or E2)/n(S) = n(E1E2c)/n(S) + n(E1E2)/n(S) + n(E1cE2)/n(S)
Taking the resulting ratios as probabilities, we may write
P(E1 or E2) = P(E1E2c) + P(E1E2) + P(E1cE2)
 P(E1) = P(E1E2c) + P(E1E2), and
 P(E2) = P(E1cE2) + P(E1E2)
Adding these two equations together and rearranging the terms gives
P(E1E2c) + P(E1E2) + P(E1cE2) = P(E1) + P(E2) − P(E1E2)
P(E1 or E2) = P(E1) + P(E2) − P(E1E2)
 The last equation is the general addition rule for any two events which are not mutually exclusive.


In order to calculate probabilities, we have to know
a. The number of elements of an event
b. The number of elements of the sample space.
 That is in order to judge what is probable, we
have to know what is possible.
 In order to determine the number of outcomes,
one can use the following rules of counting.
I. The addition rule
II. The multiplication rule
III. The permutation rule
IV. The combination rule


 The multiplication rule: if a choice consists of k sequential steps, of which the first can be made in n1 ways, the second in n2 ways, …, and the kth in nk ways, then the whole choice can be made in n1 · n2 · … · nk ways.
 Example: The digits 0, 1, 2, 3, and 4 are to be used in a 4-digit identification card. How many different cards are possible if
a) Repetitions are permitted?
b) Repetitions are not permitted?
Solutions: a)
1st digit  2nd digit  3rd digit  4th digit
5          5          5          5

There are four steps:
1. Selecting the 1st digit; this can be made in 5 ways.
2. Selecting the 2nd digit; this can be made in 5 ways.
3. Selecting the 3rd digit; this can be made in 5 ways.
4. Selecting the 4th digit; this can be made in 5 ways.
 5 * 5 * 5 * 5 = 625 different cards are possible.
b)
1st digit  2nd digit  3rd digit  4th digit
5          4          3          2

There are four steps:
1. Selecting the 1st digit can be made in 5 ways.
2. Selecting the 2nd digit can be made in 4 ways.
3. Selecting the 3rd digit can be made in 3 ways.
4. Selecting the 4th digit can be made in 2 ways.
 5 * 4 * 3 * 2 = 120 different cards are possible.



Permutation: an arrangement of n objects in a specified order.
 To find the total number of permutations of n objects taken n at a time, we first count the number of ways in which an object can be assigned to the first position in the arrangement, and so on.
Rules:
1. The number of permutations of n distinct objects taken all together is n!,
where n! = n · (n − 1) · (n − 2) · … · 2 · 1 (and 0! = 1).

2. The arrangement of n objects in a specified order using r objects at a time is called the permutation of n objects taken r objects at a time. It is written as nPr, and the formula is
nPr = n! / (n − r)!
3. The number of permutations of n objects in which k1 are alike, k2 are alike, etc. is
n! / (k1! · k2! · … · km!)
Examples:
1. Suppose we have the letters A, B, C, and D.
a) How many permutations are there taking all four?
b) How many permutations are there if two letters are used at a time?
2. How many different permutations can be made from the letters in the word "CORRECTION"?
Here n  4, there are four disnict object
Solutions: 1. a)
 There are 4!  24 permutations.

Here n  4, r2
b) 4! 24
 There are P2    12 permutations.
( 4  2)!
4
2

Here n  10
Of which 2 are C , 2 are O, 2 are R ,1E ,1T ,1I ,1N
2.
 K 1  2, k 2  2, k 3  2, k 4  k5  k6  k7  1
u sin g the 3 rd rule of permutation , there are
10!
 453600 permutations.
2!*2!*2!*1!*1!*1!*1!
set by: Mekoro arega(M.Sc)
 A selection of objects without regard to order is called a combination.
The number of combinations of r objects selected from n objects is denoted by nCr or (n r) and is given by the formula:
nCr = n! / ((n − r)! · r!)

 Example: Given the letters A, B, C, and D, list the permutations and combinations for selecting two letters.
Solutions:
Permutation:  AB BA CA DA      Combination:  AB BC
              AC BC CB DB                    AC BD
              AD BD CD DC                    AD CD

 Note that in permutation AB is different from BA, but in combination AB is the same as BA. (Combination rule.)
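The same lists fall out of itertools (a sketch for illustration):

from itertools import permutations, combinations

letters = "ABCD"
print([''.join(p) for p in permutations(letters, 2)])
# ['AB','AC','AD','BA','BC','BD','CA','CB','CD','DA','DB','DC'] -> 12 permutations
print([''.join(c) for c in combinations(letters, 2)])
# ['AB','AC','AD','BC','BD','CD']                               -> 6 combinations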
Conditional Probability
Definition: P(A|B) = P(A ∩ B)/P(B), for P(B) > 0.

Note that:
 P(A|B) P(B) = P(B|A) P(A)
 If events A1, A2, …, Ak are disjoint and exhaustive, then
P(A1|B) + P(A2|B) + ⋯ + P(Ak|B) = 1
 Σᵢ₌₁ᵏ P(B|Ai) P(Ai) = P(B)   (law of total probability)


 Two events are said to be statistically independent if the occurrence of one of the events does not affect the occurrence of the other.
 Assume E2 is independent of E1, i.e., the probability that E2 occurs is unaffected by the occurrence or non-occurrence of E1. Then
 P(E2|E1) = P(E2) and P(E1|E2) = P(E1), so that
P(E1 ∩ E2) = P(E1) · P(E2)
In general, P(E1 ∩ E2) = P(E1|E2) · P(E2) = P(E2|E1) · P(E1).
 Independence: the occurrence of one event does not influence the occurrence (or non-occurrence) of the other event.
 Mutual exclusiveness refers to the events themselves, not the associated probabilities. Two events are said to be mutually exclusive when they cannot occur together.
 Bayes’ theorem is applicable when the events for
which we want to compute posterior
probabilities are mutually exclusive and their
union is the entire sample space.
 For the case of n mutually exclusive events A1,
A2,..., An, whose union is the entire sample
space,
 Bayes’ theorem can be used to compute any
posterior probability P(Ai/B) as shown here:
P(Ai|B) = P(Ai) P(B|Ai) / [P(A1) P(B|A1) + P(A2) P(B|A2) + ⋯ + P(An) P(B|An)]
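As a worked illustration (the numbers here are invented for this sketch, not from the slides), suppose a factory's two machines produce 60% and 40% of output, with defect rates 2% and 5%; Bayes' theorem gives the probability that a defective item came from each machine:

# Priors P(Ai): share of output from each machine (hypothetical numbers)
prior = {"machine1": 0.60, "machine2": 0.40}
# Likelihoods P(B|Ai): probability an item is defective given its machine
likelihood = {"machine1": 0.02, "machine2": 0.05}

# Law of total probability: P(B)
p_defective = sum(prior[m] * likelihood[m] for m in prior)

# Posterior P(Ai|B) for each machine
for m in prior:
    posterior = prior[m] * likelihood[m] / p_defective
    print(m, round(posterior, 3))
# machine1 0.375, machine2 0.625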

CHAPTER 2
2. RANDOM VARIABLES AND PROBABILITY
DISTRIBUTIONS
2.1. The Concept & Definition of a Random Variable
A random variable is a variable with unknown numerical
value that can take on, or represent, any possible element
from a sample space.
Random variable is a variable whose numerical value is
determined by the outcome of a random trial or experiment
where a unique numerical value is assigned to each sample
point.
Mathematical Definition
A random variable X(s), where s ∈ S, is a function from a sample
space S (domain) into the real numbers.
E.g.: In the experiment of tossing a coin 3 times, we could define X as a random variable “the
total number of heads.”
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
X(HHH) = 3
X(HHT) = X(HTH) = X(THH) = 2
X(HTT) = X(THT) = X(TTH) = 1
X(TTT) = 0
X ∈ {0, 1, 2, 3}
X assumes a specific number of values with some probabilities.
There are two kinds of random variables (RV):
I. Discrete RV- is a variable which can take only a finite (or countably infinite) number of values,
II. Continuous RV- is a variable which can take any value in the real line within a
bounded or unbounded interval.



 Probability Distribution: a probability distribution consists of the values that a random variable can assume and the corresponding probabilities of those values.
 The probability distribution for a discrete
random variable X associates with each of the
distinct outcomes xi, (i = 1, 2, 3, . . . , k) a
probability P(X = xi).
 It is also called the probability mass function or
the probability function.



Example 1: Consider the experiment of tossing a coin three times. Let X be the number of heads. Construct the probability distribution of X.
Solution:
 First identify the possible values that X can assume.
 Calculate the probability of each possible distinct
value of X and express X in the form of
frequency distribution.
X x 0 1 2 3

P X  x  18 38 38 18



Example 2: Consider throwing a die twice, taking the sum of the numbers shown in the first and second throws as the rule. Then:
a) Define the random variable X
b) Find the values of the random variable X
c) Construct probability distribution of random variable X in
the form of table



 The probability mass function (pmf) of a discrete RV X, denoted fX(x), is given by:
f(x) = P(X = x), for all x; it must satisfy the following conditions:
a) Any and all individual probabilities described by the probability function take on a value between 0 and 1 (including 0 and 1), i.e. f(x) ≥ 0 for all x.
b) The sum of all probabilities described by the probability function is equal to 1:
Σᵢ f(xᵢ) = Σᵢ P(X = xᵢ) = 1
Example: Construct the probability distribution for the experiment of tossing a balanced coin three times using the 'number of heads' as a rule.
Solution:
 Cumulative probability distribution function at a point:
From the above example, P(0) = 1/8, P(1) = 3/8, P(2) = 3/8, and P(3) = 1/8.
Then the cumulative probability distribution at each point is constructed as:
F(0) = P(X ≤ 0) = P(X = 0) = 1/8
F(1) = P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/8 + 3/8 = 4/8
F(2) = P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 1/8 + 3/8 + 3/8 = 7/8
F(3) = P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 1/8 + 3/8 + 3/8 + 1/8 = 1
Therefore, the cumulative distribution function at each point becomes
F(x) = 1/8 at x = 0;  1/2 at x = 1;  7/8 at x = 2;  1 at x = 3.
The cumulative probability distribution function over the set of real numbers is
         0,    x < 0
         1/8,  0 ≤ x < 1
F(x) =   1/2,  1 ≤ x < 2
         7/8,  2 ≤ x < 3
         1,    x ≥ 3



The probability density function is denoted by f(x), which gives the probability density at x.
In mathematical terms:
 The probability density function (pdf) of a continuous RV X, denoted f(x), is given by:
∫[x1, x2] f(x) dx = P(x1 ≤ X ≤ x2), for all x1 and x2
And the function must satisfy the following conditions:
a. f(x) ≥ 0, for all x
b. ∫[−∞, ∞] f(x) dx = 1
Example 1: Check whether the following function can serve as a probability density function of a continuous random variable X.
f(x) = (1/9)x², 0 ≤ x ≤ 3
Solution:
i. To verify that f(x) ≥ 0 for all x in the range 0 to 3:
the minimum value of the function is attained at x = 0 and equals zero, i.e. f(0) = (1/9)(0)² = 0;
the maximum value of the function is attained at x = 3 and equals 1, i.e. f(3) = (1/9)(3)² = 1.
Thus the value of f(x) varies between 0 and 1 as x ranges between 0 and 3, and all values are nonnegative.
ii. To verify ∫[−∞, ∞] f(x) dx = 1:
∫[0, 3] (1/9)x² dx = (1/27)x³ |₀³ = (1/27)(3³ − 0³) = 1.
Thus, the function satisfies the conditions of a probability density function.
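The normalization can also be checked numerically (an illustrative sketch, assuming SciPy is available):

from scipy.integrate import quad

f = lambda x: x**2 / 9          # candidate pdf on [0, 3]

total, _ = quad(f, 0, 3)        # integrate f over its support
print(round(total, 6))          # 1.0 -> the density integrates to one

p, _ = quad(f, 1, 2)            # e.g. P(1 <= X <= 2)
print(round(p, 4))              # (2**3 - 1**3)/27 = 7/27 ~ 0.2593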



Example 2: If X is a continuous random variable with density function
f(x) = k·e^(−3x) for x > 0, and 0 otherwise,
find the value of k and P(0.5 ≤ X ≤ 1).
Solution:
i. To determine the value of k we start from ∫[−∞, ∞] f(x) dx = 1:
∫[−∞, 0] 0 dx + ∫[0, ∞] k·e^(−3x) dx = 1
k·(−1/3)·e^(−3x) |₀^∞ = 1
k/3 = 1  →  k = 3
ii. Then the value of P(0.5 ≤ X ≤ 1) is
∫[0.5, 1] 3e^(−3x) dx = −e^(−3x) |₀.₅¹ = −e^(−3) + e^(−1.5) = 0.173



2.4. Expected Value of Random Variables
i. Expected Value of a Discrete RV
The mean value of a random variable over many trials is also known as its expected value. The expected value of a discrete random variable X is denoted by E{X} and defined as:
E{X} = Σᵢ₌₁ᵏ xᵢ P(xᵢ), where P(xᵢ) = P(X = xᵢ).
Since the process of obtaining the expected value involves the calculation denoted by E{ }, E{ } is called the expectation operator.
 The variance of a discrete random variable X is denoted by σ²{X} and defined as
σ²{X} = Σᵢ₌₁ᵏ (xᵢ − E{X})² P(X = xᵢ)
Example: Suppose the probability distribution for the experiment of tossing a fair coin three times, using X as a discrete random variable for the number of heads, is:
X          0     1     2     3
P(X = x)  1/8   3/8   3/8   1/8
Then the expected value of X is computed as:
E{X} = (0 · 1/8) + (1 · 3/8) + (2 · 3/8) + (3 · 1/8) = 1.5
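For this distribution, the mean and variance can be computed directly from the definitions above (sketch; the variance value is implied by the formula, not stated in the slides):

from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())
print(mean)   # 3/2 -> E{X} = 1.5
print(var)    # 3/4 -> variance 0.75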



 The expected value of a continuous random variable is defined as:
E(X) = ∫[−∞, ∞] x f(x) dx
 This is not so different from the expected value of a discrete random variable:
 the integral performs the same role for a continuous variable as the summation does for a discrete one.
 It sums, from minus infinity to plus infinity, the variable x over each little increment dx, weighted by the probability density f(x) that the outcome will fall within that increment.



 Similarly, the variance of a continuous random variable is defined as:
σ²{X} = E{(X − E{X})²} = ∫[−∞, ∞] (x − E{X})² f(x) dx
Example: Find the expected value and variance of the following probability density function:
f(x) = (1/9)x² for 0 ≤ x ≤ 3
i. First we need to find the expected values E(X) and E(X²):
E(X) = ∫[−∞, ∞] x f(x) dx = ∫[0, 3] x · (1/9)x² dx = (1/9)(x⁴/4) |₀³ = 9/4
E(X²) = ∫[−∞, ∞] x² f(x) dx = ∫[0, 3] x² · (1/9)x² dx = (1/9)(x⁵/5) |₀³ = 27/5
Thus, var(X) = E(X²) − [E(X)]² = 27/5 − (9/4)² = 0.34
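A symbolic check with SymPy (an illustrative sketch, assuming SymPy is available):

import sympy as sp

x = sp.symbols('x')
f = x**2 / 9                                  # pdf on [0, 3]

EX = sp.integrate(x * f, (x, 0, 3))           # 9/4
EX2 = sp.integrate(x**2 * f, (x, 0, 3))       # 27/5
var = EX2 - EX**2
print(EX, EX2, var, float(var))               # 9/4 27/5 27/80 0.3375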

 Some properties of the expected value of a random variable:
1. If b is a constant, then E(b) = b.
2. If a and b are constants, then E(aX + b) = aE(X) + b.
3. If X and Y are independent random variables, then E(XY) = E(X)E(Y).
4. If X is a random variable with p.d.f. f(x) and g(x) is any function of X, then
E[g(X)] = Σₓ g(x)f(x), if X is discrete
E[g(X)] = ∫[−∞, ∞] g(x)f(x) dx, if X is continuous
 Some properties of the variance of a random variable:
1. The variance of a constant is zero.
2. If a and b are constants, then var(aX + b) = a² var(X).
3. For any two random variables X and Y,
var(X + Y) = var(X) + var(Y) + 2cov(X, Y)
var(X − Y) = var(X) + var(Y) − 2cov(X, Y)
4. If a and b are constants, then
var(aX + bY) = a² var(X) + b² var(Y) + 2ab·cov(X, Y)


i. The rth Moment about the Origin
 The rth moment about the origin of a random variable X, denoted by μ′ᵣ, is the expected value of Xʳ, i.e.,
μ′ᵣ = E(Xʳ) = Σₓ xʳ f(x), for r = 0, 1, 2, …, when X is discrete
μ′ᵣ = E(Xʳ) = ∫[−∞, ∞] xʳ f(x) dx, for r = 0, 1, 2, …, when X is continuous
 Note that for r = 0, μ′₀ = E(X⁰) = E(1) = 1, and for r = 1, μ′₁ = E(X¹) = E(X), i.e., the first moment about the origin is just the expected value of the random variable X. It is called the mean of the distribution of X, or simply the mean of X, and is denoted by μ.
ii. The rth Moment about the Mean
The rth moment about the mean of a random variable X, denoted by μᵣ, is the expected value of (X − μ)ʳ, i.e.,
μᵣ = E[(X − μ)ʳ] = Σₓ (x − μ)ʳ f(x), for r = 0, 1, 2, …, when X is discrete
μᵣ = E[(X − μ)ʳ] = ∫[−∞, ∞] (x − μ)ʳ f(x) dx, for r = 0, 1, 2, …, when X is continuous
For r = 0, μ₀ = E[(X − μ)⁰] = E(1) = 1, and
for r = 1, μ₁ = E[(X − μ)¹] = Σₓ (x − μ) f(x) = Σₓ x f(x) − μ Σₓ f(x) = E(X) − μ = 0.


The general functions that we use to derive the various moments of a distribution from a single formula, rather than computing E(Xʳ) for each r, are called moment generating functions (MGFs).
The moment generating function of a random variable, where it exists, is given by:
Mₓ(t) = E(e^(tX)) = Σₓ e^(tx) f(x), for X a discrete random variable
Mₓ(t) = E(e^(tX)) = ∫[−∞, ∞] e^(tx) f(x) dx, for X a continuous random variable
To see why such a function is referred to as a moment generating function, consider the Maclaurin series of e^(tx), i.e.,
e^(tx) = 1 + tx + t²x²/2! + t³x³/3! + ⋯ + tʳxʳ/r! + ⋯
Mₓ(t) = E(e^(tX)) = Σₓ e^(tx) f(x) = Σₓ [1 + tx + t²x²/2! + t³x³/3! + ⋯ + tʳxʳ/r! + ⋯] f(x)
 = Σₓ f(x) + t Σₓ x f(x) + (t²/2!) Σₓ x² f(x) + (t³/3!) Σₓ x³ f(x) + ⋯ + (tʳ/r!) Σₓ xʳ f(x) + ⋯
 = 1 + μ′₁ t + μ′₂ t²/2! + μ′₃ t³/3! + ⋯ + μ′ᵣ tʳ/r! + ⋯
From this equation it can easily be seen that in the Maclaurin series of the MGF of the random variable X, the coefficient of tʳ/r! is the rth moment about the origin, μ′ᵣ.
CHAPTER 3
3.1 Bernoulli & Binomial Distributions
1. Bernoulli Distribution
 A Bernoulli trial is an experiment with two possible
outcomes.
 A random variable X has a Bernoulli(p) distribution if:
X = 1 with probability p, and X = 0 with probability 1 − p.
 X = 1 often termed as “success” and X = 0 often termed
as “failure”.
 Example 1: toss a coin, head = “success” and tail =
“failure”.
 Example 2: incidence of a disease,
not infected = “success” and infected = “failure”.
 A binomial experiment is a probability experiment that satisfies the following four requirements, called the assumptions of a binomial distribution:
1. The experiment consists of n identical trials.
2. Each trial has only two possible, mutually exclusive outcomes: success or failure.
3. The probability of each outcome does not change from trial to trial.
4. The trials are independent; the outcome of any trial has no effect on the probability of the others.
 Definition: The outcomes of a binomial experiment and the corresponding probabilities of these outcomes are called a binomial distribution.
 The probability of getting x successes in n trials is:
P(X = x) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ, x = 0, 1, 2, …, n
and this is sometimes written as X ~ Bin(n, p).


 When using the binomial formula to solve problems, we have to identify three things:
a. The number of trials (n)
b. The probability of a success on any one trial (p)
c. The number of successes desired (x)
Examples:
1. What is the probability of getting three heads by tossing a fair coin four times?
 Solution: Let X be the number of heads in tossing a fair coin four times; then X ~ Bin(4, 0.5) and
P(X = 3) = C(4, 3)(0.5)³(0.5)¹ = 4/16 = 0.25


Example 2: Suppose that an examination consists of six true-or-false questions, and assume that a student has no knowledge of the subject matter. The probability that the student will guess the correct answer to the first question is 30%. Likewise, the probability of guessing each of the remaining questions correctly is also 30%.
a) What is the probability of getting more than three correct
answers?
b) What is the probability of getting at least two correct
answers?
c) What is the probability of getting at most three correct
answers?
d) What is the probability of getting less than five correct
answers?
Solution: Let X = the number of correct answers that the student gets; X ~ Bin(6, 0.3).
A. P(X > 3) = P(X = 4) + P(X = 5) + P(X = 6)
 = C(6,4)(0.3)⁴(0.7)² + C(6,5)(0.3)⁵(0.7)¹ + C(6,6)(0.3)⁶(0.7)⁰
 = 0.0595 + 0.0102 + 0.0007 = 0.071
Thus, we may conclude that if the exam questions are answered by guessing with p = 0.30, the probability is 0.071 (or 7.1%) that more than three of the questions are answered correctly by the student.


b. P(X ≥ 2) = 1 − P(X = 0) − P(X = 1) = 1 − (0.7)⁶ − 6(0.3)(0.7)⁵ = 1 − 0.1176 − 0.3025 = 0.580
C. P(X ≤ 3) = 1 − P(X > 3) = 1 − 0.071 = 0.929
D. P(X < 5) = 1 − P(X = 5) − P(X = 6) = 1 − 0.0102 − 0.0007 = 0.989

Remark: If X is a binomial random variable with parameters n and p, then
E(X) = np,  Var(X) = npq, where q = 1 − p.
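These binomial probabilities can be checked with scipy.stats (an illustrative sketch, assuming SciPy is available):

from scipy.stats import binom

n, p = 6, 0.3
print(round(1 - binom.cdf(3, n, p), 4))     # (a) P(X > 3)  ~ 0.0705
print(round(1 - binom.cdf(1, n, p), 3))     # (b) P(X >= 2) ~ 0.580
print(round(binom.cdf(3, n, p), 3))         # (c) P(X <= 3) ~ 0.930
print(binom.mean(n, p), binom.var(n, p))    # E(X) = np = 1.8, Var = npq = 1.26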
The Poisson distribution depends only on the average number of occurrences per unit of time or space.
 The Poisson distribution is used as a distribution of rare events, such as: arrivals, accidents, number of misprints, hereditary cases, natural disasters like earthquakes, etc.
 The process that gives rise to such events is called a Poisson process.
 A random variable X is said to have a Poisson distribution if its probability distribution is given by:
P(X = x) = λˣ e^(−λ) / x!, x = 0, 1, 2, …
where λ = the average number of occurrences.


Example: If 1.6 accidents can be expected at an intersection on any given day, what is the probability that there will be 3 accidents on any given day?
Solution: Let X = the number of accidents; λ = 1.6.
P(X = 3) = (1.6)³ e^(−1.6) / 3! = 0.1378
If X is a Poisson random variable with parameter λ, then
E(X) = λ, Var(X) = λ
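A sketch of the same computation with scipy.stats (assuming SciPy is available):

from scipy.stats import poisson

lam = 1.6
print(round(poisson.pmf(3, lam), 4))         # P(X = 3) ~ 0.1378
print(poisson.mean(lam), poisson.var(lam))   # both equal lambda = 1.6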
 The Poisson probability distribution provides a close approximation to the binomial probability distribution when n is large and p is quite small (or quite large), as n → ∞.
fₖ(x) = λˣ e^(−λ) / x!, x = 0, 1, 2, …
where λ = the average number.
fᵧ(x) = C(n, x) pˣ qⁿ⁻ˣ, x = 0, 1, 2, …, n
fᵧ(x) → fₖ(x), for every x
P(X = x) = (np)ˣ e^(−np) / x!, x = 0, 1, 2, …
where λ = np = the average number.
Usually we use this approximation if np ≤ 5.
In other words, if n ≥ 20 and np ≤ 5 [or n(1 − p) ≤ 5], then we may use the Poisson distribution as an approximation to the binomial distribution.

Example: Find the binomial probability P(X = 3) by using the Poisson distribution, where p = 0.01 and n = 200.
Solution:
Using Poisson: λ = np = 0.01 · 200 = 2
P(X = 3) = 2³ e^(−2) / 3! = 0.1804
Using Binomial: n = 200, p = 0.01
P(X = 3) = C(200, 3)(0.01)³(0.99)¹⁹⁷ = 0.1814
4. The Hypergeometric and Binomial Distributions
 This distribution is closely related to the binomial probability distribution.
 But in the hypergeometric probability distribution, the trials are not independent.
 Thus, the probability of success changes from trial to trial:
 the objective is to choose a random sample of n items out of a population of N under the condition that once an item has been selected, it is not returned to the population (sampling without replacement).
 The binomial formula can be applied in two-outcome sampling situations where the sample size n is not more than 5 percent of the population size N.
 When n is greater than 5 percent of N, the hypergeometric formula should be used.


The hypergeometric probability distribution is given by the formula:
P(r) = C(R, r) · C(N − R, n − r) / C(N, n)
where: N = population size, R = number of successes in the population,
n = sample size, r = number of successes in the sample.

Properties of the hypergeometric probability distribution:
a. The result of each draw can be classified into one of two categories.
b. The probability of success changes on each draw.


Examples: 1. A population consists of 10 items, four of which are classified as defective. What is the probability that a random sample of size 3 will contain two defective items?
 Solution: N = 10, R = 4, n = 3, r = 2
P(2) = C(4, 2) · C(6, 1) / C(10, 3) = (6 · 6)/120 = 0.3

Example 2. Suppose that there are 15 identical tires in stock and 5 are slightly damaged. What is the probability that a customer who buys 4 tires will obtain 2 damaged tires?
 Solution: N = 15, R = 5, n = 4, r = 2
P(2) = C(5, 2) · C(10, 2) / C(15, 4) = (10 · 45)/1365 ≈ 0.33
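A sketch using scipy.stats.hypergeom (assuming SciPy is available; note its argument order is M = population size, n = successes in the population, N = sample size):

from scipy.stats import hypergeom

# Example 1: N=10 items, R=4 defective, sample n=3, want r=2 defective
print(round(hypergeom.pmf(2, 10, 4, 3), 3))   # 0.3

# Example 2: N=15 tires, R=5 damaged, sample n=4, want r=2 damaged
print(round(hypergeom.pmf(2, 15, 5, 4), 3))   # ~0.33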
 Note: The hypergeometric probability distribution is tedious to compute by hand.
 When n is not too large, the binomial formula can be used to approximate hypergeometric results.
 Better still, the Poisson formula can be used to approximate hypergeometric results given the following conditions:
a. n ≤ 0.05N
b. n ≥ 20 and p ≤ 0.05



 For a continuous random variable, the probability density function provides the value of the function at any particular value of x;
 it does not directly provide the probability of the random variable assuming some specific value.
 However, the area under the graph of f(x) corresponding to a given interval gives the probability that the random variable will assume a value in that interval.


1. Normal Distribution
 A random variable X is said to have a normal distribution if its probability density function is
f(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)), −∞ < x < ∞


Properties of the Normal Distribution:
1. It is bell-shaped and symmetrical about its mean, and it is mesokurtic. The maximum ordinate is at x = μ and is given by f(x) = 1/(σ√(2π)).
2. It is asymptotic to the axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a different normal distribution. Thus, the normal distribution is completely described by two parameters: mean and standard deviation.
5. The total area under the curve sums to 1, i.e., the area of the distribution on each side of the mean is 0.5: ∫[−∞, ∞] f(x) dx = 1
6. It is unimodal, i.e., values mound up only in the center of the curve.
7. Mean = Median = Mode = μ
8. The probability that a random variable will have a value between any two points is equal to the area under the curve between those points.
Note: To facilitate the use of the normal distribution, the following distribution, known as the standard normal distribution, was derived by using the transformation
Z = (X − μ)/σ  →  f(z) = (1/√(2π)) e^(−z²/2)


Properties of the Standard Normal Distribution:
- Same as a normal distribution, but the mean is zero, the variance is one, and the standard deviation is one.
- Areas under the standard normal distribution curve have been tabulated in various ways. The most common ones are the areas between Z = 0 and a positive value of Z.
- Given a normally distributed random variable X with mean μ and standard deviation σ,
P(a < X < b) = P((a − μ)/σ < Z < (b − μ)/σ)
Note: Since X (and Z) is continuous,
P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)
Examples:
1. Find the area under the standard normal distribution which lies
a) Between Z = 0 and Z = 0.96
Solution:
Area = P(0 < Z < 0.96) = 0.3315

b) Between Z = −1.45 and Z = 0
Solution:
Area = P(−1.45 < Z < 0) = P(0 < Z < 1.45) = 0.4265


2. A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is the probability that it will take a value
a) Less than 87.2?
b) Greater than 76.4?
c) Between 81.2 and 86.0?
Solution
X is normal with mean μ = 80 and standard deviation σ = 4.8.
a)
P(X < 87.2) = P((X − μ)/σ < (87.2 − μ)/σ)
 = P(Z < (87.2 − 80)/4.8)
 = P(Z < 1.5)
 = P(Z < 0) + P(0 < Z < 1.5)
 = 0.50 + 0.4332 = 0.9332

b)
P(X > 76.4) = P((X − μ)/σ > (76.4 − μ)/σ)
 = P(Z > (76.4 − 80)/4.8)
 = P(Z > −0.75)
 = P(Z > 0) + P(0 < Z < 0.75)
 = 0.50 + 0.2734 = 0.7734

c)
P(81.2 < X < 86.0) = P((81.2 − μ)/σ < Z < (86.0 − μ)/σ)
 = P((81.2 − 80)/4.8 < Z < (86.0 − 80)/4.8)
 = P(0.25 < Z < 1.25)
 = P(0 < Z < 1.25) − P(0 < Z < 0.25)
 = 0.3944 − 0.0987 = 0.2957
[Standard normal Z-distribution table appears here in the original slides; the tabulated values are omitted in this text version.]
CHAPTER 4
 4. Joint and Conditional Probability Distributions
 When joint probabilities are assigned to two or
more random variables, the resulting
relationship is known as a joint probability
distribution.
 The joint probability distribution of two random variables is known as a bivariate distribution. When more than two random variables are being considered, the distribution is said to be a multivariate distribution.



As a simple illustration of a bivariate experiment, let us consider three successive tosses of a fair coin.
Let X be the number of heads occurring in the three trials; thus x = 0, 1, 2, 3.
Let another random variable, Y, be defined as follows: if the first of the three tosses results in tails, Y = 0; if it results in heads, Y = 1.
Thus, the range of Y is y = 0, 1.



Sample space X Y Probability
TTT 0 0 1/8
HTT 1 1 1/8
THT 1 0 1/8
TTH 1 0 1/8
HHT 2 1 1/8
HTH 2 1 1/8
THH 2 0 1/8
HHH 3 1 1/8

 As can be seen from the above table, the possible joint events in the experiment are written as (X = 0, Y = 0), (X = 0, Y = 1), and so on.
 The corresponding probability that X and Y both assume the value 0 is P(X = 0, Y = 0) = P(0, 0) = 1/8.
 The joint event (X = 1, Y = 0) occurs if the outcome is either THT or TTH, and so P(X = 1, Y = 0) = P(1, 0) = 2/8.
In general, we may write this condition as:
Σₓ Σᵧ P(x, y) = 1,
 where the double summation sign indicates that the entries in the joint probability table are added over all possible pairs of values of X and Y.
 Thus, if X and Y are discrete random variables, the function f(x, y) = P(X = x, Y = y) for each pair (x, y) within the range of X and Y is called the joint probability distribution of X and Y.



 Any bivariate function can serve as a joint probability distribution of a pair of random variables X and Y if and only if its values f(x, y) satisfy the following two conditions:
i. f(x, y) ≥ 0, for all values of (x, y) within the domain
ii. Σₓ Σᵧ f(x, y) = 1, over all possible pairs (x, y) within the domain
Example 1.
 Determine the value of k for which the function f(x, y) = kxy, for x = 1, 2, 3 and y = 1, 2, 3, can serve as a joint probability distribution.
Solution:
a) For the first condition, f(x, y) ≥ 0 requires k ≥ 0.
b) Secondly, we need to check the second condition, Σₓ Σᵧ f(x, y) = 1:
Σₓ Σᵧ kxy = k Σₓ x Σᵧ y = k Σₓ x(1 + 2 + 3) = 6k(1 + 2 + 3) = 36k
36k = 1 → k = 1/36
 Thus, if k = 1/36 then both required conditions are satisfied.



 If X and Y are discrete random variables, the function
F(x, y) = P(X ≤ x, Y ≤ y) = Σ_{t ≤ x} Σ_{s ≤ y} f(t, s), for x and y in the range (−∞, ∞),
where f(t, s) is the value of the joint probability of X and Y at (t, s), is called the joint cumulative distribution function of the random variables X and Y.


 Example 4.2.
 Using the following joint probability table, compute the joint cumulative probability P(X ≤ 1, Y ≤ 1).

          y
x         0      1      2
0        1/6    1/3    1/12
1        2/9    1/6    0
2        1/36   0      0

Solution
The joint cumulative probability P(X ≤ 1, Y ≤ 1) can be computed by considering the values 0 and 1 for X and 0 and 1 for Y:
P(X ≤ 1, Y ≤ 1) = P(0,0) + P(0,1) + P(1,0) + P(1,1) = 1/6 + 1/3 + 2/9 + 1/6 = 8/9.
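A sketch computing the same quantity from the table:

from fractions import Fraction as F

# Joint pmf table: joint[x][y]
joint = [
    [F(1, 6), F(1, 3), F(1, 12)],
    [F(2, 9), F(1, 6), F(0)],
    [F(1, 36), F(0), F(0)],
]

# F(1,1) = P(X <= 1, Y <= 1): sum the top-left 2x2 block
cdf_11 = sum(joint[x][y] for x in range(2) for y in range(2))
print(cdf_11)   # 8/9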
 In the case of continuous random variables, the only difference is that the double summation sign is changed to a double integral sign.
 A bivariate function f(x, y) defined over the xy-plane is called a joint probability density function of two continuous random variables X and Y if and only if the two conditions are satisfied:
a) f(x, y) ≥ 0, for each possible pair (x, y) in the xy-plane
b) ∫[−∞, ∞] ∫[−∞, ∞] f(x, y) dx dy = 1



Example 1.
1. Check whether the following bivariate function can serve as a joint density function or not: f(x, y) = 2 − x − y, 0 ≤ x ≤ 1; 0 ≤ y ≤ 1.
Solution:
 It is obvious that f(x, y) ≥ 0 on this domain. Moreover,
∫[0,1] ∫[0,1] (2 − x − y) dx dy
 = ∫[0,1] [2x − x²/2 − xy]₀¹ dy
 = ∫[0,1] (3/2 − y) dy
 = [3y/2 − y²/2]₀¹ = 1
 Since both conditions are satisfied, the bivariate function f(x, y) = 2 − x − y, 0 ≤ x ≤ 1; 0 ≤ y ≤ 1, can serve as a joint density function of the random variables X and Y.
 The joint probability for an interval of values is obtained by using the integral:
∫[c,d] ∫[a,b] f(x, y) dx dy = P(a ≤ X ≤ b; c ≤ Y ≤ d).


Example 2.
 Given the following joint density function, find the joint probability P(0 < X ≤ 1/2; 1 < Y < 2):
f(x, y) = (3/5)x(x + y) for 0 < x < 1, 0 < y < 2; and 0 otherwise.
Solution:
P(0 < X ≤ 1/2; 1 < Y < 2) = ∫[1,2] ∫[0,1/2] (3/5)x(x + y) dx dy
 = (3/5) ∫[1,2] [∫[0,1/2] (x² + xy) dx] dy
 = (3/5) ∫[1,2] [(x³/3 + x²y/2)]₀^(1/2) dy
 = (3/5) ∫[1,2] (1/24 + y/8) dy
 = (3/5) [y/24 + y²/16]₁²
 = (3/5)(1/24 + 3/16) = 11/80


4.2 Marginal Probability Distributions
 Suppose we are given the joint probability distribution of two random variables X and Y and we are interested in a problem involving only one of the variables, for example the variable X.
 Since the univariate probability distribution P(x) appears on the margin of the joint probability distribution from which it is derived, it is called the marginal probability distribution of the random variable X.
 Algebraically, the marginal distributions of X and Y are defined as:
 P(x) = Σᵧ P(x, y)
 P(y) = Σₓ P(x, y)
 Thus, if X and Y are discrete random variables and f(x, y) is their joint probability distribution, the marginal distributions provide the distribution of X and of Y in isolation from one another.
 Accordingly, for X and Y discrete random variables:
g(x) = Σᵧ f(x, y), for each x in the range of X, is the marginal distribution of X
h(y) = Σₓ f(x, y), for each y in the range of Y, is the marginal distribution of Y
Similarly, for X and Y continuous random variables:
g(x) = ∫[−∞, ∞] f(x, y) dy, for each x in the range of X, is the marginal density of X
h(y) = ∫[−∞, ∞] f(x, y) dx, for each y in the range of Y, is the marginal density of Y


Example 1.
 Using the following joint probability table, derive the marginal probability distributions of X and Y.

          y
x         0      1      2
0        1/6    1/3    1/12
1        2/9    1/6    0
2        1/36   0      0

Solution:
g(x) = Σᵧ f(x, y), summing over y = 0, 1, 2, which is the row total.
Therefore, g(x) is 7/12, 7/18, and 1/36 for x = 0, 1, 2.
h(y) = Σₓ f(x, y), summing over x = 0, 1, 2, which is the column total.
h(y) is 5/12, 1/2, and 1/12 for y = 0, 1, 2.
X      0     1     2
g(x)  7/12  7/18  1/36
Y      0     1     2
h(y)  5/12  1/2   1/12
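A sketch deriving both marginals from the same table as row and column sums:

from fractions import Fraction as F

joint = [
    [F(1, 6), F(1, 3), F(1, 12)],
    [F(2, 9), F(1, 6), F(0)],
    [F(1, 36), F(0), F(0)],
]

g = [sum(row) for row in joint]            # marginal of X: row totals
h = [sum(col) for col in zip(*joint)]      # marginal of Y: column totals
print(g)   # [7/12, 7/18, 1/36]
print(h)   # [5/12, 1/2, 1/12]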
 Example 2.
 For the following joint density function, derive the marginal distributions of X and Y:
f(x, y) = (2/3)(x + 2y) for 0 < x < 1, 0 < y < 1; and 0 otherwise.
 Solution:
i. Marginal distribution of X:
g(x) = ∫[−∞, ∞] f(x, y) dy = ∫[0,1] (2/3)(x + 2y) dy = (2/3)(xy + y²)|₀¹ = (2/3)(x + 1)
So, g(x) = (2/3)(x + 1) for 0 < x < 1, and 0 otherwise.
ii. Marginal distribution of Y:
h(y) = ∫[−∞, ∞] f(x, y) dx = ∫[0,1] (2/3)(x + 2y) dx = (2/3)(x²/2 + 2xy)|₀¹ = (2/3)(1/2 + 2y)
So, h(y) = (2/3)(1/2 + 2y) for 0 < y < 1, and 0 otherwise.


Now suppose we are given the joint probability distribution of X and Y, and we also know that in a trial of the experiment X has assumed a particular value x.
 The probability of an event E2, given the occurrence of another event E1, is the conditional probability:
P(E2|E1) = P(E1E2)/P(E1), for P(E1) ≠ 0


 The conditional probability that Y takes on the value y, given that X has the value x, may be written as
P(Y = y | X = x) = P(X = x, Y = y)/P(X = x), for P(X = x) ≠ 0
In general, if x is fixed and y varies over all possible values within the range of Y, then
P(y|x) = P(x, y)/P(x), for P(x) ≠ 0, is the conditional probability distribution of the random variable Y, i.e., the univariate distribution of Y given the condition that X has the value x.
 Similarly,
P(x|y) = P(x, y)/P(y), for P(y) ≠ 0, is the conditional probability distribution of X, given Y = y.
 Two random variables X and Y are independent if P(x, y) = P(x)P(y) for all pairs (x, y). From the conditional probability formulae given above, it can be seen that this condition holds if P(x|y) = P(x) or if P(y|x) = P(y).



 To illustrate this, suppose a fair coin is tossed twice. The random variable V is assigned the value 0 or 1 according as tails or heads occurs on the first toss, and the random variable W is assigned the value 0 or 1 according as tails or heads occurs on the second toss.
 The joint probability distribution for this experiment, together with the marginal probability distributions P(v) and P(w) and the conditional probability distributions P(w/v) and P(v/w), are shown in Tables 4.5 and 4.6.



Table 4.5 Joint probability distribution of (v, w) with marginal probabilities
          w = 0   w = 1   P(v)
v = 0      1/4     1/4     1/2
v = 1      1/4     1/4     1/2
P(w)       1/2     1/2      1

Table 4.6 Conditional probabilities
P(v/W = 0): 1/2 for v = 0, 1/2 for v = 1      P(v/W = 1): 1/2 for v = 0, 1/2 for v = 1
P(w/V = 0): 1/2 for w = 0, 1/2 for w = 1      P(w/V = 1): 1/2 for w = 0, 1/2 for w = 1



 As can be seen from the above tables, the conditional probability distributions P(v/w) and P(w/v) are the same as the marginal distributions P(v) and P(w) respectively, and the random variables V and W are thus shown to be statistically independent.
 I.e., if the marginal pdf equals the conditional pdf (P(v/w) = P(v)), the random variables are statistically independent.
 This is expected, since the outcome of the first toss can have no effect on the outcome of the second toss.
 The random variables X1, X2, …, Xn are statistically independent if the condition
P(x1, x2, x3, …, xn) = P(x1) P(x2) P(x3) … P(xn)
holds for all values (x1, …, xn).



4. Covariance, Coefficient of Correlation, and Conditional Expectation and Variance
Covariance
 Given two jointly distributed random variables X and Y, we often wish to know whether there exists a relationship between the values that the variables X and Y can assume in a random experiment.
 A measure of the linear relationship between two random variables having the joint probability distribution P(x, y) is called the covariance of the distribution, and is usually denoted by cov(X, Y) or σXY.
The covariance of two random variables X and Y is defined as the expectation
cov(X, Y) = σXY = E[(X − µX)(Y − µY)] = E(XY) − µX µY
It can readily be seen that the variance of a variable is the covariance of that variable with itself.
If X and Y are discrete random variables,
cov(X, Y) = Σₓ Σᵧ (x − µX)(y − µY) P(x, y) = Σₓ Σᵧ xy P(x, y) − µX µY
If X and Y are continuous random variables, then
cov(X, Y) = ∫[−∞, ∞] ∫[−∞, ∞] (x − µX)(y − µY) P(x, y) dx dy = ∫[−∞, ∞] ∫[−∞, ∞] xy P(x, y) dx dy − µX µY
CHAPTER 5
Sampling and Sampling Distributions


5. Sampling and sampling distributions
Introduction
The need for reliable data is ever increasing for making
wise decisions in the various fields of human activities.
There are two ways of acquiring information/data:
 complete enumeration (census) methods, and
 sampling methods.
The measurements of population characteristics are called parameters, and
the measurements of sample characteristics are called sample statistics.
But, because of cost, time, and feasibility, population parameters are frequently estimated from sample statistics.
 A sample statistic used to estimate a population parameter
is called an estimator, and a specific observed value is
called an estimate.
 Given a finite population with N units (e.g., 10,000 households),
 suppose we want to study the mean income (μ) and the variance (σ²), etc., of the population.
 Since N is large and we want to reduce the work involved, we take a sample of observations and draw conclusions about the population parameters.
 Thus, on the basis of x̄ and S², we want to make predictions about the population μ and σ².


a. Sampling without replacement
 Suppose we want to sample n observations from a population of size N, without replacing each selected draw into the population.
 Draw    Probability of selecting any unit
 1st:    1/N
 2nd:    1/(N − 1)
 3rd:    1/(N − 2)
 …
 nth:    1/(N − n + 1)
 Observe that the probabilities of the outcomes are dependent on each other. The probability of getting a particular ordered sample of size n is then the product of these terms, 1/[N(N − 1)⋯(N − n + 1)], and the total number of possible (unordered) samples is C(N, n).
 If we assign the same probability to obtaining any sample (randomness), each possible sample has probability 1/C(N, n).
b. Sampling with replacement
 Suppose we want to sample n observations from a population of size N,
 but now we replace each selected draw into the population. Then we have:
 Draw    Probability of selecting any unit
 1st:    1/N
 2nd:    1/N
 3rd:    1/N
 …
 nth:    1/N
 The outcomes here are statistically independent.


 The distribution of all possible values that can be assumed by some statistic, computed from samples of the same size randomly drawn from the same population, is called the sampling distribution of that statistic.
 If we take repeated (or all possible) random samples, each of size n, from a population of values of the variable X and find the mean x̄ of each of these samples, we find that most of the sample means differ from each other.


 Sampling distribution: the probability distribution of a sample statistic,
 formed when samples of size n are repeatedly taken from a population.
 The probability distribution of these sample means is called the theoretical sampling distribution of the mean.

Sampling distribution for the sample mean x̄


5.2.1. Distribution of the Mean from a Normal Distribution
 The theoretical sampling distribution of the mean can be described by its mean and standard deviation.
 The mean of the sampling distribution of the mean is denoted by μx̄.
 The standard deviation of the sampling distribution of the mean (the standard error) is denoted by σx̄:
σx̄ = σ/√n
 If n > 0.05N, we use the following finite-population correction to determine the standard error of the sampling distribution:
σx̄ = (σ/√n) · √((N − n)/(N − 1))
 Let X be normally distributed with mean μ and variance σ².
 Consider a random sample of size n from this normal population.
 The mean of such a sample,
x̄ = (X1 + X2 + ⋯ + Xn)/n,
will be a random variable because X1, X2, …, Xn, corresponding to the n trials of the sample, are random variables.
 Theorem: If X is normally distributed with mean μ and variance σ², and a random sample of size n is taken, then the sample mean, x̄, will be normally distributed with mean μ and variance σ²/n.
Hence, x̄ ~ N(μ, σ²/n)
Example 5.1
 For a population composed of the following 5 numbers: 1, 3, 5, 7, and 9, determine
A) μ and σ
B) The theoretical sampling distribution of the mean for samples of size 2
C) The mean of the sampling distribution, μx̄, and the standard error of the sample mean, σx̄
 Solution:
A. μ = (1 + 3 + 5 + 7 + 9)/5 = 25/5 = 5
σ = √{[(1−5)² + (3−5)² + (5−5)² + (7−5)² + (9−5)²]/5} = √(40/5) = √8 = 2.83
B. The theoretical sampling distribution of the sample mean for samples of size 2 from the given finite population is given by the means of all the possible different samples that can be obtained from this population.
 The number of combinations of 5 numbers taken
2 at a time without concern for the order is
5!/2!3! = 10.
 These 10 samples are 1,3; 1,5; 1,7; 1,9; 3,5; 3,7;
3,9; 5,7; 5,9; and 7,9.
 The means, x̄, of the preceding 10 samples are 2, 3, 4, 5, 4, 5, 6, 6, 7, 8.
 The theoretical sampling distribution of the mean
is given in Table 5.1 below.
 Note that the variability or spread of the sample
means (from 2 to 8) is less than the variability or
spread of the values in the parent population
(from 1 to 9).



Table 5.1 Theoretical Sampling Distribution of the Mean
Value of the Mean   Number of Samples   Probability of Occurrence
2                   1                   0.1
3                   1                   0.1
4                   2                   0.2
5                   2                   0.2
6                   2                   0.2
7                   1                   0.1
8                   1                   0.1
Total                                   1.00

C. By the theorem above, μx̄ = μ = 5. Since the sample size of 2 is greater than 5% of the population size (that is, n > 0.05N), we use the finite-population correction to determine the standard error of the sampling distribution:
σx̄ = (σ/√n) · √((N − n)/(N − 1)) = (√8/√2) · √((5 − 2)/(5 − 1)) = 2·√(3/4) = √3 = 1.73
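The entire Table 5.1 can be reproduced by enumerating all C(5, 2) = 10 samples (an illustrative sketch using the standard library):

from itertools import combinations
from statistics import mean, pstdev

population = [1, 3, 5, 7, 9]
samples = list(combinations(population, 2))     # the 10 possible samples
means = [mean(s) for s in samples]
print(sorted(means))            # [2, 3, 4, 4, 5, 5, 6, 6, 7, 8]
print(mean(means))              # 5    -> equals the population mean
print(round(pstdev(means), 2))  # 1.73 -> matches the standard error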



Example: If X is normally distributed with mean μ = 20 and variance σ² = 16, calculate the probability that: a) X > 21, and b) x̄ > 21 if x̄ is based on a random sample of size 16.
 Solution:
a) P(X > 21) = P(Z > (21 − 20)/4) = P(Z > 0.25) = 0.5 − 0.0987 = 0.4013
b) P(x̄ > 21) = P(Z > (21 − 20)/(4/√16)) = P(Z > 1) = 0.5 − 0.3413 = 0.1587



The central limit theorem:
 Let X1, X2, …, Xn be a random sample from a population with mean μ and variance σ² < ∞.
 Then the distribution of Zn = (x̄ − μ)/(σ/√n) approaches the standard normal distribution as n approaches infinity.
 Hence, for large n, x̄ is approximately N(μ, σ²/n) and Zn is approximately N(0, 1).
 I.e., the central limit theorem states that the distribution of (x̄ − μ)/(σ/√n) approaches a normal distribution with mean zero and variance 1 as n → ∞.
 In practice, a sample size of 30 or larger is considered adequate for this purpose.


Three different scenarios are possible:
1. n ≥ 30 → x̄ will be (approximately) normally distributed → may use z = (x̄ − μ)/(σ/√n)
2. n < 30 and the original X is normally distributed → may use z = (x̄ − μ)/(σ/√n)
3. n < 30 and the original X is not normally distributed → cannot use z (need to use a t score)


5.4 Distribution of Proportions
 The theoretical justification for the normal approximation to the binomial distribution is the central limit theorem.
 If Xi = 1 or 0, according as a success or failure occurs on the ith trial, and if X denotes the total number of successes in n trials, then the sample proportion of successes is given by p̂ = X/n.
 Since the mean and variance of Xi are given by p and pq, respectively, it follows that the variable
Z = (p̂ − p)/√(pq/n),
by the central limit theorem, tends to a standard normal distribution as the number of trials approaches infinity. This can equivalently be stated as:
X ~ N(np, npq), approximately, for large n.
 We can summarize the characteristics of the sampling distribution of the sample mean under two conditions:
1. When sampling is from a normally distributed population with a known population variance:
A. μx̄ = μ
B. σx̄ = σ/√n
C. The sampling distribution of x̄ is normal.
2. When sampling is from a non-normally distributed population with a known population variance:
A. μx̄ = μ
B. σx̄ = σ/√n when n/N ≤ 0.05;
   σx̄ = (σ/√n)·√((N − n)/(N − 1)), otherwise
C. The sampling distribution of x̄ is approximately normal (for large n, by the central limit theorem).
5.5 Distribution of the Sample Variance (S²)
 S² measures variability and indicates spread or dispersion among observations.
 Since dispersion is as important a consideration as central tendency,
 the importance of S² for inferences about σ² is comparable to that of x̄ for inferences about µ.
5.5.1 Distribution of the Sample Variance when the Population Mean is Known
 We will develop the sampling distribution of S² when sampling is from a normal population.
 Initially it is important to assume that µ is known and σ² is not. In this context S² is defined by
S² = Σᵢ₌₁ⁿ (Xᵢ − µ)² / n
 where X1, X2, X3, …, Xn constitutes a random sample from a normal distribution with known mean µ and unknown variance σ².
 When the population mean μ is unknown, the sample variance is defined as
S² = Σᵢ₌₁ⁿ (xᵢ − x̄)² / (n − 1)
5.6. Large Sample Properties of Estimators
 An estimator θ̂ of θ is a consistent estimator if plim θ̂ = θ.
 Let X1, X2, …, Xn be a random sample of size n from a given population; then E(x̄) = μ and var(x̄) = σ²/n,
 i.e., the expected value of the sample mean is equal to the population mean, and the variance of the mean is the population variance divided by the sample size.
 The fact that as n increases the variance of the mean is reduced and the sample mean approaches the population mean is known as the law of large numbers.
Definition 2: Let {X1, X2, …, Xn} be a sequence of random variables.
 If lim(n→∞) P(|Xn − a| > ε) = 0 for every ε > 0,
then we say a is the probability limit of {Xn}, written plim Xn = a (i.e., Xn tends in probability to a).


5.7. Binomial Random Variables
 A binomial random variable counts sampled individuals falling into a particular category:
 The sample size n is fixed
 Each selection is independent of the others
 There are just 2 possible values for each individual
 Each has the same probability p of falling in the category of interest
Example: The random variable X is the count of tails in two flips of a coin.
 Question: Why is X binomial? What are n and p?
 Answer: the sample size is fixed (n = 2 flips), each flip is independent of the others, each flip has just 2 possible values (tails or not), and each has the same probability p = 0.5 of showing tails, so X is binomial with n = 2 and p = 0.5.
Mean and S.D. of Binomial Counts and Proportions
 A count X binomial with parameters n and p has:
 Mean = np
 Standard deviation = √(npq), where q = 1 − p
 A sample proportion p̂ = X/n has:
 Mean = p
 Standard deviation = √(pq/n)
 Estimation: a method that enables us to estimate, with reasonable accuracy, a population parameter.
 An estimator is a sample statistic used to estimate a population parameter. For example, the sample mean, x̄, can be an estimator of the population mean, μ.
 An estimate is a specific observed value of a statistic. We can get an estimate by taking a sample and computing the value taken by our estimator in that sample.
 In general terms, estimation uses a sample statistic as the basis for estimating the value of the corresponding population parameter.
 Although estimation and hypothesis testing are similar in many respects, they are complementary inferential processes.
 A hypothesis test is used to determine whether or not a treatment has an effect, while estimation is used to determine how much of an effect.
 This complementary nature is demonstrated when estimation is used after a hypothesis test that resulted in rejecting the null hypothesis.
 In this situation, the hypothesis test has established that a treatment effect exists, and the next logical step is to determine how much of an effect.
 The following are the standard estimators used by statisticians to estimate these parameters.
1. Sample Mean (X̄)
 The sample mean is the most common estimator of the population mean. The sample mean (X̄) is an unbiased and consistent estimator.
2. Sample Variance and Standard Deviation
 The sample variance is an unbiased and consistent estimator of the population variance. It is relatively efficient as compared to other estimators.
3. Sample Proportion (p̂)
 The sample proportion is an unbiased, consistent, and relatively efficient estimator of the population proportion.
I. Point Estimate
II. Interval Estimate
I. Point Estimate: - is a single number, which is used to estimate an unknown population parameter.
Computing Point Estimates
The sample mean (X̄) formula:

X̄ = ΣXᵢ / n

where X̄ is the sample mean, Σ is summation, Xᵢ are the values of the random variable, and n is the sample size.
The sample variance and standard deviation:

S² = Σ(X − X̄)² / (n − 1) --------------------- sample variance formula

S = √[ Σ(X − X̄)² / (n − 1) ] -------------- sample standard deviation formula
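Applied to a small made-up sample, these three formulas give the point estimates directly; a brief sketch (Python, hypothetical data) follows:

```python
from math import sqrt

x = [12, 15, 11, 14, 13, 16]   # hypothetical sample, n = 6
n = len(x)

xbar = sum(x) / n                                 # point estimate of mu      -> 13.5
s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # point estimate of sigma^2 -> 3.5
s = sqrt(s2)                                      # point estimate of sigma   -> ~1.87

print(xbar, s2, round(s, 2))
```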
 A point estimate by itself is often insufficient, because it is either right or wrong.
 A point estimate will usually differ from the population parameter because of sampling error.
 Since there is no way to know how close the point estimate is to the actual parameter, statisticians developed interval estimation of the parameter.
II. Interval Estimate
 An interval estimate is a range of values used to estimate a population parameter.
 It indicates the error in two ways: by the extent of its range and by the probability of the true population parameter lying within that range.
 An interval estimate consists of a range of values and has the advantage of providing greater confidence than a point estimate.
 For example, you might estimate that the mean age for students is somewhere between 20 and 23 years.
 Note that the interval estimate is less precise, but gives more confidence.
 For this reason, interval estimates are usually called confidence intervals.
 An interval estimate of the population mean consists of two bounds within which the population mean, µ, is estimated to lie: L ≤ µ ≤ U, where L = lower bound and U = upper bound.
 From the standard normal table and the notation introduced above, P(−Zα/2 < Z < Zα/2) = 1 − α, which gives the interval X̄ − Zα/2·σ/√n ≤ µ ≤ X̄ + Zα/2·σ/√n.
 The end points X̄ − Zα/2·σ/√n and X̄ + Zα/2·σ/√n are called confidence limits, and 1 − α is the degree of confidence. Zα/2·σ/√n is the maximum error permitted in estimating µ by X̄ with probability 1 − α.
 Example 1: A normal infinite population has a standard deviation of 10. A random sample of size 25 has a mean of 50. Construct a 95% confidence interval for the population mean.
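A worked sketch of Example 1 (Python; the z_interval helper is ours, not from the notes, and the critical value 1.96 is the usual z for 95% confidence):

```python
from math import sqrt

# Hypothetical helper: z-interval for mu when sigma is known;
# z is the critical value z_{alpha/2}.
def z_interval(xbar, sigma, n, z):
    e = z * sigma / sqrt(n)        # maximum error: z * sigma / sqrt(n)
    return xbar - e, xbar + e

# Example 1: x-bar = 50, sigma = 10, n = 25, z_{0.025} = 1.96
print(z_interval(50, 10, 25, 1.96))   # -> (46.08, 53.92)
```

That is, 50 ± 1.96·(10/√25) = 50 ± 3.92, so the 95% interval is (46.08, 53.92).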
 Example 2: The mean annual income of EAL workers is supposed to be 24,000 Birr. Assume that this estimate was based on a sample of 250 airline workers and that the population standard deviation was 5,000 Birr.
a) Compute the 95% confidence interval for the population mean.
b) Construct the 90% confidence interval for the population mean.
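Example 2 can be worked with the same hypothetical helper sketched under Example 1 (here σ/√n = 5000/√250 ≈ 316.23, and 1.645 is the usual z for 90% confidence):

```python
from math import sqrt

def z_interval(xbar, sigma, n, z):   # same helper as in Example 1
    e = z * sigma / sqrt(n)
    return xbar - e, xbar + e

print(z_interval(24_000, 5_000, 250, 1.96))   # a) ~ (23380.2, 24619.8)
print(z_interval(24_000, 5_000, 250, 1.645))  # b) ~ (23479.8, 24520.2)
```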
Note that:
 A narrower confidence interval is more precise.
 Larger samples give more precise estimates.
 A smaller variance leads to more precise estimates.
 Lower confidence coefficients allow us to construct more precise estimates.
 The main objective of estimation is to obtain the closest possible estimate of the given parameter or some function of the unknown parameters. Desirable properties of an estimator are:
a. consistency;
b. unbiasedness;
c. efficiency.
 Therefore, the best estimator should be highly reliable and have such desirable properties as unbiasedness, consistency, efficiency, and sufficiency.
 An estimator is said to be sufficient if it uses all the information about the population parameter contained in the sample.
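Unbiasedness, and the role of the (n − 1) divisor, can be seen by simulation. The sketch below (Python; the values σ² = 9 and n = 5 are illustrative) shows that the n-divisor variance estimator computed around X̄ is biased downward, while the (n − 1)-divisor version is unbiased:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2, n = 9.0, 5                 # illustrative true variance and sample size

samples = rng.normal(0.0, np.sqrt(sigma2), size=(200_000, n))

# Average each estimator over many samples to approximate its expected value
v_n   = samples.var(axis=1, ddof=0).mean()   # divisor n:   biased downward
v_nm1 = samples.var(axis=1, ddof=1).mean()   # divisor n-1: unbiased

print(round(v_n, 2), round(v_nm1, 2))  # ~7.2 vs ~9.0, since E[v_n] = (n-1)/n * sigma^2
```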
The end of the chapter!!
CHAPTER SEVEN
Hypothesis testing
Outline
• Four parts of statistical hypothesis testing
• One- and Two-tailed tests
• Type I and II Errors
• Power of the test
• Step by Step Procedure for a Large-Sample
Test of Hypothesis
• Large-Sample Test of Hypothesis about a
Population Mean
• A hypothesis, in statistics, is a claim or statement about a property of a population.
• A statistical test of hypothesis consists of four parts:
1. A null hypothesis (the questioned hypothesis)
2. An alternative hypothesis (the hypothesis the researcher
wishes to support)
3. A test statistic
4. A rejection region
Null Hypothesis, Ho
• The null hypothesis is a statement about the value of
a population parameter
• The null hypothesis contains a condition of equality:
=, ≤, or ≥
• Test the Null Hypothesis directly
• Results: Reject H0 or fail to reject H0
Alternative Hypothesis, Ha
• The hypothesis the researcher wishes to support
• Must be true if H0 is false
• Contains ≠, <, or >
• Opposite of the null
Test Statistic
• A value computed from the sample data that is used in making the decision about the rejection of the null hypothesis
• For large samples, testing claims about a population mean:

z = (x̄ − μ0) / (σ/√n)   (s may replace σ when σ is unknown and n is large)
Rejection Region
• The set of all values of the test statistic that would cause a rejection of the null hypothesis
Significance Level, α
• The probability that the test statistic will fall in the critical region when the null hypothesis is actually true
• Common choices are 0.05, 0.01, and 0.10
• The complement of the degree of confidence for a confidence interval (e.g., α = 0.05 corresponds to 95% confidence)
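Putting these pieces together, here is a minimal sketch (Python) of a large-sample, two-tailed z-test about a mean; all the numbers are made up for illustration:

```python
from math import sqrt

# Made-up numbers: test H0: mu = 24000 against Ha: mu != 24000 at alpha = 0.05
mu0, xbar, sigma, n = 24_000, 24_600, 5_000, 250

z = (xbar - mu0) / (sigma / sqrt(n))   # large-sample test statistic
z_crit = 1.96                          # two-tailed critical value for alpha = 0.05

# Rejection region: |z| > z_crit
if abs(z) > z_crit:
    print(f"z = {z:.2f}: reject H0")
else:
    print(f"z = {z:.2f}: fail to reject H0")   # here z ~ 1.90, so fail to reject
```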
Type I Error
• The error made by rejecting the null hypothesis when it is true
• α (alpha) is used to represent the probability of a Type I error
• Example: rejecting the claim that the mean body temperature is 98.6 degrees when the mean really does equal 98.6
Type II Error
• The error made by failing to reject the null hypothesis when it is false
• β (beta) is used to represent the probability of a Type II error
• Example: failing to reject the claim that the mean body temperature is 102.6 degrees when the mean is really different from 102.6
Type I and Type II Errors

                     H0 is true                  H0 is false
Reject H0            Type I error (prob. α)      Correct decision (power = 1 − β)
Fail to reject H0    Correct decision (1 − α)    Type II error (prob. β)

Power of a Statistical Test
 The power of a hypothesis test is P[reject the null hypothesis when the null hypothesis is false] = 1 − β.
 It measures the ability of the test to perform as required.
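The power of a two-tailed z-test can be computed directly. The sketch below (Python; the values μ0 = 98.6, true mean μ1 = 99.0, σ = 0.6, n = 36 are assumptions for illustration) evaluates β and 1 − β:

```python
from math import sqrt
from statistics import NormalDist

# Assumed values: H0: mu = 98.6, true mean 99.0, sigma = 0.6, n = 36
mu0, mu1, sigma, n, alpha = 98.6, 99.0, 0.6, 36, 0.05

se = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05

# beta = P(fail to reject H0 | true mean mu1): the test statistic stays
# inside (-z_crit, z_crit) when X-bar is centred at mu1 rather than mu0
shift = (mu1 - mu0) / se
beta = NormalDist().cdf(z_crit - shift) - NormalDist().cdf(-z_crit - shift)

print("beta =", round(beta, 4), " power =", round(1 - beta, 4))  # power ~ 0.979
```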