Stat For Economists CHP 1-7
Examples:
A fair die is tossed once. What is the probability of getting
a. Number 4?
b. An odd number?
c. An even number?
d. Number 8?
5 5 5 5
5 4 3 2
Example:
Here n = 4, r = 2.
b) There are $_4P_2 = \frac{4!}{(4-2)!} = \frac{24}{2} = 12$ permutations.
Here n = 10,
of which 2 are C, 2 are O, 2 are R, 1 E, 1 T, 1 I, 1 N.
$k_1 = 2,\ k_2 = 2,\ k_3 = 2,\ k_4 = k_5 = k_6 = k_7 = 1$
Using the 3rd rule of permutation, there are
$\frac{10!}{2!\,2!\,2!\,1!\,1!\,1!\,1!} = 453600$ permutations.
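The count above can be checked with a short sketch. The word "CORRECTION" is an assumption: it is not named in the slide, but it has exactly the letter counts listed (2 C, 2 O, 2 R, 1 E, 1 T, 1 I, 1 N). The helper name `multiset_permutations` is illustrative only.

```python
from collections import Counter
from math import factorial


def multiset_permutations(word: str) -> int:
    """Number of distinct arrangements: n! / (k1! * k2! * ... * km!)."""
    total = factorial(len(word))
    # Divide by k! for every letter that occurs k times.
    for k in Counter(word).values():
        total //= factorial(k)
    return total


# Assumed word with the letter counts given in the example.
print(multiset_permutations("CORRECTION"))
```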
set by: Mekoro arega(M.Sc)
A selection of objects without regard to order is called a combination.
The number of combinations of r objects selected from n objects is denoted by
$_nC_r$ or $\binom{n}{r}$ and is given by the formula:
$\binom{n}{r} = \frac{n!}{(n-r)!\,r!}$
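Permutations of A, B, C, D taken 2 at a time (12):
AB BA CA DA
AC BC CB DB
AD BD CD DC
Combinations of A, B, C, D taken 2 at a time (6):
AB BC
AC BD
AD DC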
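As a rough illustration of the difference between ordered and unordered selections, the sketch below enumerates the two-letter selections from A, B, C, D with the standard library. It assumes the listing above refers to selections of size 2 from these four letters.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCD"

# Order matters: AB and BA are different permutations.
perms = ["".join(p) for p in permutations(letters, 2)]
# Order does not matter: AB and BA are the same combination.
combs = ["".join(c) for c in combinations(letters, 2)]

print(len(perms), perms)
print(len(combs), combs)

# The enumerated counts agree with n!/(n-r)! and n!/((n-r)! r!).
print(perm(4, 2), comb(4, 2))
```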
Note that:
$P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
If events $A_1, A_2, \ldots, A_k$ are disjoint and exhaustive, then
$P(A_1 \mid B) + P(A_2 \mid B) + \cdots + P(A_k \mid B) = 1$
x          0     1     2     3
P(X = x)   1/8   3/8   3/8   1/8

       1/8,  x = 0
F(x) = 1/2,  x = 1
       7/8,  x = 2
       1,    x = 3
Cumulative probability distribution function over the set of real numbers
       0,    x < 0
       1/8,  0 ≤ x < 1
F(x) = 1/2,  1 ≤ x < 2
       7/8,  2 ≤ x < 3
       1,    x ≥ 3
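The step function F(x) can be built directly from the probabilities 1/8, 3/8, 3/8, 1/8. The sketch below assumes these belong to x = 0, 1, 2, 3 in that order; the names `pmf` and `cdf` are illustrative.

```python
from fractions import Fraction

# Assumed probability distribution: P(X = x) for x = 0, 1, 2, 3.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def cdf(x: float) -> Fraction:
    """F(x) = P(X <= x): add up the probabilities of all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


# F is flat between the jump points and is defined for every real x.
for x in (-1, 0, 0.5, 1, 2.9, 3, 10):
    print(x, cdf(x))
```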
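i. First we need to find the expected values E(X) and E(X²):
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx, \qquad E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$
$E(X) = \int_0^3 x \cdot \frac{1}{9}x^2\,dx = \frac{1}{9}\cdot\frac{x^4}{4}\Big|_0^3 = \frac{9}{4}$
$E(X^2) = \int_0^3 x^2 \cdot \frac{1}{9}x^2\,dx = \frac{1}{9}\cdot\frac{x^5}{5}\Big|_0^3 = \frac{27}{5}$
Thus, $\mathrm{var}(X) = E(X^2) - [E(X)]^2 = \frac{27}{5} - \left(\frac{9}{4}\right)^2 \approx 0.34$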
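The arithmetic can be verified exactly with fractions. This is a minimal sketch that assumes the density read off the integrals above, f(x) = x²/9 for 0 < x < 3; the helper `moment` is hypothetical and uses the closed form of the integral of a power of x.

```python
from fractions import Fraction


def moment(order: int) -> Fraction:
    """E(X**order) for the assumed density f(x) = x**2 / 9 on 0 < x < 3.

    The integrand is x**(order + 2) / 9, whose integral from 0 to 3 is
    3**(order + 3) / (order + 3) divided by 9.
    """
    power = order + 2
    return Fraction(1, 9) * Fraction(3 ** (power + 1), power + 1)


mean = moment(1)             # E(X)
second_moment = moment(2)    # E(X^2)
variance = second_moment - mean ** 2

print(mean, second_moment, variance, float(variance))
```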
A.
Thus, we may conclude that if 30% of the exam questions are answered by guessing,
the probability is 0.071 (or 7.1%) that more than four of the questions are answered
correctly by the student.
C.
D.
E(X) = np, Var(X) = npq
The Poisson distribution depends only on the
average number of occurrences per unit of time or
space.
The Poisson distribution is used as a distribution
of rare events, such as arrivals, accidents,
number of misprints, hereditary conditions, and natural
disasters like earthquakes.
The process that gives rise to such events is
called a Poisson process.
A random variable X is said to have a Poisson
distribution if its probability distribution is given
by:
$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$
$E(X) = \lambda, \quad Var(X) = \lambda$
The Poisson probability distribution provides a
close approximation to the binomial probability
distribution when n is large and p is quite small
or quite large, with λ = np held constant as n → ∞.
$f_k(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$
where λ = the average number.
$f_y(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
$f_y(x) \to f_k(x)$ for every x, so that
$P(X = x) = \frac{(np)^x e^{-np}}{x!}, \quad x = 0, 1, 2, \ldots$
where λ = np = the average number.
Usually we use this approximation if np ≤ 5.
In other words, if n > 20 and np ≤ 5 [or n(1 − p) ≤ 5],
then we may use the Poisson distribution as an approximation to the binomial distribution.
Example: Find the binomial probability P(X = 3) by using the Poisson distribution if p = 0.01 and n = 200.
Solution:
Using Poisson, λ = np = 0.01 × 200 = 2
$P(X = 3) = \frac{2^3 e^{-2}}{3!} = 0.1804$
Using binomial, n = 200, p = 0.01
$P(X = 3) = \binom{200}{3}(0.01)^3(0.99)^{197} = 0.1814$
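Both probabilities in this example can be reproduced with the standard library. The sketch below uses n = 200, p = 0.01 and x = 3 from the example; the variable names are illustrative. Note that the binomial exponent on (1 − p) is n − x = 197.

```python
from math import comb, exp, factorial

n, p, x = 200, 0.01, 3
lam = n * p  # average number of successes, lambda = np

# Poisson approximation: lambda^x * e^(-lambda) / x!
poisson = lam ** x * exp(-lam) / factorial(x)

# Exact binomial probability: C(n, x) * p^x * (1 - p)^(n - x)
binomial = comb(n, x) * p ** x * (1 - p) ** (n - x)

print(f"Poisson:  {poisson:.4f}")
print(f"Binomial: {binomial:.4f}")
```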
4. The Hypergeometric and Binomial Distributions
This distribution is closely related to the binomial probability
distribution.
But in the hypergeometric probability distribution, the trials are not
independent.
Thus, the probability of success changes from trial to trial:
$P(r) = \frac{\binom{R}{r}\binom{N-R}{n-r}}{\binom{N}{n}}$,
where: N = population size, R = number of successes in the population,
n = sample size, r = number of successes in the sample.
$P(2) = \frac{\binom{5}{2}\binom{15-5}{4-2}}{\binom{15}{4}} = \frac{\binom{5}{2}\binom{10}{2}}{\binom{15}{4}} = \frac{450}{1365} \approx 0.33$
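The hypergeometric formula is straightforward to evaluate with `math.comb`. The sketch below assumes the numbers in the example are N = 15, R = 5, n = 4 and r = 2; the function name `hypergeometric` is illustrative.

```python
from fractions import Fraction
from math import comb


def hypergeometric(N: int, R: int, n: int, r: int) -> Fraction:
    """P(r successes in a sample of n drawn without replacement from N items, R of them successes)."""
    return Fraction(comb(R, r) * comb(N - R, n - r), comb(N, n))


prob = hypergeometric(N=15, R=5, n=4, r=2)
print(prob, float(prob))

# The probabilities over all possible r add up to 1.
print(sum(hypergeometric(15, 5, 4, r) for r in range(0, 5)))
```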
Note: The hypergeometric probability distribution is more
tedious to compute by hand.
When n is not too large, use the binomial formula to
approximate hypergeometric results.
Still, it is better to use the Poisson formula to approximate
hypergeometric results given the following conditions:
a. n ≤ 0.05 N
b. n ≥ 20 and p ≤ 0.05
1. It is bell shaped, symmetrical about its mean, and mesokurtic. The maximum
ordinate is at x = μ and is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}}$.
2. It is asymptotic to the x-axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a
different normal distribution. Thus, the normal distribution is completely described by two
parameters: mean and standard deviation.
5. Total area under the curve sums to 1, i.e., the area of the distribution on each side of the
mean is 0.5: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
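- Given a normally distributed random variable X with mean μ and standard deviation σ:
$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right)$
$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right)$
Note:
$P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b)$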
Examples:
1. Find the area under the standard normal distribution which lies
a) Between Z = 0 and Z = 0.96
Solution:
Area = P(−1.45 < Z < 0)
= P(0 < Z < 1.45)
= 0.4265
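a)
$P(81.2 < X < 86.0) = P\left(\frac{81.2-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{86.0-\mu}{\sigma}\right)$
$= P\left(\frac{81.2-80}{4.8} < Z < \frac{86.0-80}{4.8}\right)$
$= P(0.25 < Z < 1.25)$
$= P(0 < Z < 1.25) - P(0 < Z < 0.25)$
$= 0.3944 - 0.0987 = 0.2957$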
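The table lookups can be cross-checked with `statistics.NormalDist` from the Python standard library. The sketch below assumes μ = 80 and σ = 4.8, as read from the standardization step; small differences in the fourth decimal place come from rounding in the printed Z-table.

```python
from statistics import NormalDist

# X ~ N(80, 4.8^2), as assumed from the standardization in the example.
X = NormalDist(mu=80, sigma=4.8)
prob = X.cdf(86.0) - X.cdf(81.2)
print(round(prob, 4))

# Standard normal: area between Z = 0 and Z = 1.45
# (equal by symmetry to the area between Z = -1.45 and Z = 0).
Z = NormalDist()
area = Z.cdf(1.45) - Z.cdf(0)
print(round(area, 4))
```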
Standard Normal Z- distribution table.
As can be seen from the above table, the possible joint events in
the experiment are written as (X = 0, Y = 0), (X = 0, Y = 1) and so
on.
The probability that X and Y both assume the
value 0 is P(X = 0, Y = 0) = P(0, 0) = 1/8.
The joint event (X = 1, Y = 0) occurs if the outcome is either THT
or TTH, and so P(X = 1, Y = 0) = P(1, 0) = 2/8.
In general, we may write this condition as:
$\sum_x \sum_y P(x, y) = 1$,
where the double summation sign indicates that the
entries in joint probability table are added over all
possible pairs of values of X and Y
Thus, if X and Y are discrete random variables, the
function f (x, y) =P (X=x, Y=y) for each pair of (x, y)
within the range of X and Y is called the joint probability
distribution of X and Y.
Solution
The joint cumulative probability P(X ≤ 1, Y ≤ 1) can be computed
by considering the values 0 and 1 for X and 0 and 1 for Y.
$P(X \le 1, Y \le 1) = P(0,0) + P(0,1) + P(1,0) + P(1,1) = \frac{1}{6} + \frac{1}{3} + \frac{2}{9} + \frac{1}{6} = \frac{8}{9}$.
In the case of continuous random variables, the only
difference is that the double summation is replaced by a double
integral.
A bivariate function with 𝑓 𝑥, 𝑦 defined over the xy-plane
Solution:
$g(x) = \sum_{y=0}^{2} f(x, y)$, summing over y = 0, 1, 2, which is the row total.
Therefore, g(x) would be 7/12, 7/18 and 1/36.
$h(y) = \sum_{x=0}^{2} f(x, y)$, summing over x = 0, 1, 2, which is the column total.
h(y) would be 5/12, 1/2 and 1/12.
x      0      1      2
g(x)   7/12   7/18   1/36
y      0      1      2
h(y)   5/12   1/2    1/12
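Row and column totals are easy to compute once the joint table is stored as a dictionary. The joint table itself does not appear in this section, so the cell values below are an assumption: they are reconstructed so that they match the four probabilities used in the cumulative-probability solution and the marginal totals listed above. Treat it as a sketch, not as the original table.

```python
from fractions import Fraction as F

# ASSUMED joint probabilities P(X = x, Y = y), reconstructed from the
# stated marginals and from P(0,0), P(0,1), P(1,0), P(1,1).
joint = {
    (0, 0): F(1, 6),  (0, 1): F(1, 3), (0, 2): F(1, 12),
    (1, 0): F(2, 9),  (1, 1): F(1, 6), (1, 2): F(0),
    (2, 0): F(1, 36), (2, 1): F(0),    (2, 2): F(0),
}
values = (0, 1, 2)

# Marginal of X: sum each row over y. Marginal of Y: sum each column over x.
g = {x: sum(joint[x, y] for y in values) for x in values}
h = {y: sum(joint[x, y] for x in values) for y in values}

# Joint cumulative probability P(X <= 1, Y <= 1).
cumulative = sum(joint[x, y] for x in (0, 1) for y in (0, 1))

print(g, h, cumulative, sum(joint.values()))
```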
Example 2.
For the following joint density function derive the marginal distributions of X and Y
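$f(x, y) = \begin{cases} \frac{2}{3}(x + 2y), & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$
Solution:
i. Marginal distribution of X
$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
$g(x) = \int_0^1 \frac{2}{3}(x + 2y)\,dy$
$g(x) = \frac{2}{3}\left(xy + y^2\right)\Big|_0^1$
$g(x) = \frac{2}{3}(x + 1)$
So, $g(x) = \begin{cases} \frac{2}{3}(x + 1), & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$
ii. Marginal distribution of Y
$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$
$h(y) = \int_0^1 \frac{2}{3}(x + 2y)\,dx$
$h(y) = \frac{2}{3}\left(2xy + \frac{x^2}{2}\right)\Big|_0^1$
$h(y) = \frac{2}{3}\left(2y + \frac{1}{2}\right)$
So, $h(y) = \begin{cases} \frac{2}{3}\left(\frac{1}{2} + 2y\right), & 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$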
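The two marginal densities can be checked numerically. The sketch below integrates the joint density from Example 2 with a simple midpoint rule; the helper `integrate` and the step count are illustrative choices, not part of the original solution. Because the integrand is linear in each variable, the midpoint rule is exact here up to floating-point rounding.

```python
def f(x: float, y: float) -> float:
    """Joint density from Example 2: (2/3)(x + 2y) on the open unit square, else 0."""
    if 0 < x < 1 and 0 < y < 1:
        return 2 / 3 * (x + 2 * y)
    return 0.0


def integrate(func, a: float, b: float, steps: int = 200) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))


def g(x: float) -> float:
    """Marginal density of X: integrate the joint density over y."""
    return integrate(lambda y: f(x, y), 0.0, 1.0)


def h(y: float) -> float:
    """Marginal density of Y: integrate the joint density over x."""
    return integrate(lambda x: f(x, y), 0.0, 1.0)


print(g(0.5), 2 / 3 * (0.5 + 1))        # numeric vs. (2/3)(x + 1)
print(h(0.25), 2 / 3 * (0.5 + 2 * 0.25))  # numeric vs. (2/3)(1/2 + 2y)
print(integrate(g, 0.0, 1.0))            # total probability
```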
         W = 0   W = 1   P(v)
V = 0    1/4     1/4     1/2
V = 1    1/4     1/4     1/2
P(w)     1/2     1/2     1
Table 4.6 Conditional probability
Conditional probability P(v/w)
         P(v/W=0)   P(v/W=1)
V = 0    1/2        1/2
V = 1    1/2        1/2
Conditional probability P(w/v)
P(w/V=0)   1/2
P(w/V=1)   1/2
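symbol, $\mu_{\bar{x}}$.
The standard deviation of the sampling distribution of the mean is
given by the symbol $\sigma_{\bar{x}}$:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
If n > 0.05N, we use the following formula to determine the standard error of the sampling
distribution, $\sigma_{\bar{x}}$:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$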
The sample mean $\bar{X}$ will be a random variable because X1, X2, …, Xn, corresponding to the n trials of the sample,
are random variables.
Theorem: If X is normally distributed with mean μ
and variance σ² and a random sample of size n is
taken, then the sample mean, $\bar{X}$, will be normally
distributed with mean μ and variance σ²/n.
Hence, $\bar{X} \sim N(\mu, \sigma^2/n)$
Example 5.1
For a population composed of the following 5 numbers: 1, 3, 5, 7,
and 9, determine
A) μ and σ
B) The theoretical sampling distribution of the mean for the sample
size of 2
C) The mean of the sampling distribution, $\mu_{\bar{x}}$, and the standard error of
the sample mean, $\sigma_{\bar{x}}$
Solution:
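A. $\mu = \frac{1+3+5+7+9}{5} = \frac{25}{5} = 5$
$\sigma = \sqrt{\frac{(1-5)^2+(3-5)^2+(5-5)^2+(7-5)^2+(9-5)^2}{5}} = \sqrt{\frac{40}{5}} = \sqrt{8} = 2.83$
C. By applying the central limit theorem, $\mu_{\bar{x}} = \mu = 5$. Since the sample size of 2 is
greater than 5% of the population size (that is, n > 0.05N), we use the following formula to
determine the standard error of the sampling distribution, $\sigma_{\bar{x}}$:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{\sqrt{8}}{\sqrt{2}}\sqrt{\frac{5-2}{5-1}} = \sqrt{4}\sqrt{\frac{3}{4}} = \sqrt{3} = 1.73$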
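The sampling distribution for this small population can be enumerated directly. The sketch below assumes sampling without replacement with order ignored, which gives the 10 possible samples of size 2; this assumption is consistent with the finite population correction used in part C.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [1, 3, 5, 7, 9]
n = 2
N = len(population)

# Every possible sample of size 2 (without replacement) and its mean.
sample_means = [mean(sample) for sample in combinations(population, n)]
print(sorted(sample_means))

# Mean and standard deviation of the sampling distribution itself.
mu_xbar = mean(sample_means)
se_xbar = pstdev(sample_means)

# Standard error from the formula with the finite population correction.
sigma = pstdev(population)
se_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(mu_xbar, round(se_xbar, 2), round(se_formula, 2))
```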
X ~ N(np, npq), approximately.
We can summarize the characteristics of the sampling
distribution of sample mean, under two conditions:
1. When sampling is from a normally distributed population
with a known population variance
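A. $\mu_{\bar{x}} = \mu$
B. $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ when $\frac{n}{N} \le 0.05$;
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$, otherwise
C. The sampling distribution of $\bar{x}$ is normal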
5.5 Distribution of Sample Variance (𝑺𝟐 )
S² measures the variability and indicates spread or dispersion
among observations.
Since dispersion is as important a consideration as central
tendency, the importance of S² for inferences about σ² is comparable to
that of $\bar{X}$ for inferences about µ.
5.5.1 Distribution of the Sample Variance when the Population mean is
known
We will develop the sampling distribution of S 2 when sampling
is from normal population.
Initially it is important to assume that µ is known and σ² is not.
In this context S² is defined by
$S^2 = \frac{\sum_{i=1}^{n}(X_i - \mu)^2}{n}$
where X1, X2, X3, …, Xn constitute a random sample from a normal population.
$S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$, when the population mean μ is unknown
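The two definitions of S² differ only in the centre used and the divisor. The sketch below uses a small hypothetical sample and a hypothetical known mean µ = 5 to show both; `variance_known_mean` is an illustrative helper, while `statistics.variance` is the standard-library function that divides by n − 1.

```python
from statistics import variance


def variance_known_mean(sample, mu):
    """S^2 when the population mean mu is known: divide by n."""
    return sum((x - mu) ** 2 for x in sample) / len(sample)


# Hypothetical data, chosen only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

s2_known = variance_known_mean(sample, mu=5)  # uses mu, divisor n
s2_unknown = variance(sample)                 # uses the sample mean, divisor n - 1

print(s2_known, s2_unknown)
```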
5.6. Large Sample Properties of Estimators
An estimator $\hat{\theta}$ of θ is a consistent estimator if plim $\hat{\theta}$ = θ.
Let X1, X2, …, Xn be a random sample of size n from a
given population, then $E(\bar{X}) = \mu$ and $Var(\bar{X}) = \sigma^2/n$,
i.e., the expected value of the sample mean is equal to the
population mean and the variance of the mean is the
population variance divided by the sample size.
The fact that as n increases the variance of the mean is
reduced and the sample mean approaches the population mean is
known as the law of large numbers.
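A small simulation can illustrate the law of large numbers. The sketch below is only an illustration under assumed conditions: it rolls a fair six-sided die (population mean 3.5), and the seed and sample sizes are arbitrary choices made so that the run is repeatable.

```python
import random

POPULATION_MEAN = 3.5  # mean of a fair six-sided die


def sample_mean(n: int) -> float:
    """Mean of n simulated rolls of a fair die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n


random.seed(0)  # fixed seed so the output does not change between runs
for n in (10, 1_000, 100_000):
    xbar = sample_mean(n)
    print(n, xbar, abs(xbar - POPULATION_MEAN))
```

As n grows, the printed distance between the sample mean and 3.5 tends to shrink, although it need not decrease at every step.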
Definition 2: Let {X1, X2, …, Xn} be a
sequence of random variables.
If ,
answer:
◦ Sample size n fixed?
◦ Each selection independent of others?
◦ Just 2 possible values for each?
◦ Each has same probability p?
binomial
Mean and S.D. of Binomial Counts, Proportions
Count X binomial with parameters n, p
has:
◦ Mean = np
◦ Standard deviation = $\sqrt{np(1-p)}$
Sample proportion $\hat{p} = X/n$ has:
◦ Mean = p
◦ Standard deviation = $\sqrt{p(1-p)/n}$
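The standard results for a binomial count (mean np, standard deviation √(np(1−p))) and for the sample proportion X/n (mean p, standard deviation √(p(1−p)/n)) can be checked by summing over the probability distribution. The values n = 20 and p = 0.3 below are hypothetical, chosen only for illustration.

```python
from math import comb, sqrt


def binomial_pmf(n: int, p: float) -> list:
    """P(X = k) for k = 0, 1, ..., n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


n, p = 20, 0.3  # hypothetical parameters
pmf = binomial_pmf(n, p)

# Mean and standard deviation of the count X, computed from the definition.
count_mean = sum(k * pk for k, pk in enumerate(pmf))
count_sd = sqrt(sum((k - count_mean) ** 2 * pk for k, pk in enumerate(pmf)))

# The proportion X/n simply rescales the count by 1/n.
prop_mean = count_mean / n
prop_sd = count_sd / n

print(count_mean, n * p)
print(count_sd, sqrt(n * p * (1 - p)))
print(prop_mean, p)
print(prop_sd, sqrt(p * (1 - p) / n))
```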
$\bar{X} = \frac{\sum X_i}{n}$
where $\bar{X}$ is the sample mean,
Σ is summation,
Xi are the values of the random variable,
n is the sample size
The sample variance and standard deviation
$S^2 = \frac{\sum (X - \bar{X})^2}{n-1}$
$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}}$
The end points of the interval are called confidence limits and 1−α is the degree of
confidence. The half-width of the interval is the maximum error to be permitted in estimating µ by $\bar{X}$ with
probability 1−α.
Type II Error
• The error made by failing to reject the null hypothesis when
it is false.
• β (beta) is used to represent the probability of a type II error
• Example: Failing to reject the claim that the mean body
temperature is 102.6 degrees when the mean is really
different from 102.6