Probability Distributions
Types
A. Discrete probability distributions
• Define probabilities associated with discrete variables. A
discrete random variable takes any of a specified finite or
countable list of values. Discrete probability distributions
include:
• Binomial probability distribution
• Poisson probability distribution
• Negative binomial probability distribution
• Multinomial probability distribution
• Hypergeometric probability distribution
B. Continuous Probability Distributions
Define probabilities associated with continuous variables.
• Normal distribution
• Student’s t-distribution
• Chi-square
• F-distribution
Normal distribution
The Normal (Gaussian) distribution describes a special class of
distributions that are symmetric and can be described
by two parameters: the mean $\mu$ and the standard deviation $\sigma$.
Probability density function
The probability density function gives the probability of obtaining a
value within some interval as the area under the curve over that interval.
Standard normal distribution (Z)
68% of the AUC lies within one standard deviation from the mean,
i.e. $(\mu \pm \sigma)$,
95% within two standard deviations $(\mu \pm 2\sigma)$ and 99.7% within
three standard deviations $(\mu \pm 3\sigma)$
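The percentages above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is illustrative and not part of the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
Z = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas should round to roughly 68%, 95% and 99.7%.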
Suppose Z ~ N(0,1). What is P(Z < 0)? Use the standard
Normal distribution table
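As a cross-check on the table lookup, the same probability can be computed directly. This is a small sketch with the standard-library `statistics.NormalDist`; the variable name `p_below_zero` is illustrative only.

```python
from statistics import NormalDist

# P(Z < 0) for Z ~ N(0, 1), read from the cumulative distribution function
p_below_zero = NormalDist(mu=0.0, sigma=1.0).cdf(0.0)
print(p_below_zero)
```

Because the standard normal curve is symmetric about 0, half of the area lies below 0.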
Hypothesis testing: The z-test for
the mean of a Normal population
(large samples)
Situation
• A sample of size n is selected from a normal
population with mean $\mu$ (unknown) and
standard deviation $\sigma$. We want to test
either
1. $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$
or
2. $H_0: \mu = \mu_0$ versus $H_A: \mu > \mu_0$
or
3. $H_0: \mu = \mu_0$ versus $H_A: \mu < \mu_0$
The Test Statistic
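$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
if n is large.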
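The test statistic can be sketched in a few lines of Python. The function name `z_test_mean` and the sample values (a sample mean of 52, a hypothesised mean of 50, a sample standard deviation of 10 and n = 100) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar: float, mu0: float, s: float, n: int):
    """Large-sample z statistic for a mean, with p-values for the three alternatives."""
    z = (xbar - mu0) / (s / sqrt(n))
    std_normal = NormalDist()  # N(0, 1)
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))  # H_A: mu != mu0
    p_greater = 1 - std_normal.cdf(z)               # H_A: mu > mu0
    p_less = std_normal.cdf(z)                      # H_A: mu < mu0
    return z, p_two_sided, p_greater, p_less

# Hypothetical data, for illustration only
z, p_two_sided, p_greater, p_less = z_test_mean(xbar=52.0, mu0=50.0, s=10.0, n=100)
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

Which p-value to report depends on which of the three alternative hypotheses was chosen before looking at the data.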
1) What is the probability that fewer than 6 out of 1000 children get
infected, given that the probability of transmission by 6 weeks
is 2 percent?
Note: np = 20 >> 5
mean = np = 1000 × 0.02 = 20
Variance = np(1 − p) = (1000 × 0.02) × (1 − 0.02) = 19.6
Standard deviation = $\sqrt{19.6} \approx 4.4$
Using the Binomial distribution, X ~ Bin(1000, 0.02),
P(X < 6) = 0.000064
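The figures in this example can be reproduced with the standard library. The sketch below computes the exact binomial sum and, as an assumption added here, a normal approximation with a continuity correction (evaluating the normal cdf at 5.5); the slides give the mean and standard deviation but do not state how an approximation would be carried out.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1000, 0.02
mean = n * p                 # np
sd = sqrt(n * p * (1 - p))   # sqrt(np(1 - p))

# Exact binomial probability P(X < 6) = P(X = 0) + ... + P(X = 5)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(6))

# Assumption: normal approximation with continuity correction, P(X < 6) ~ P(Y <= 5.5)
approx = NormalDist(mu=mean, sigma=sd).cdf(5.5)

print(f"mean = {mean:.1f}, sd = {sd:.2f}")
print(f"exact binomial P(X < 6) = {exact:.6f}")
print(f"normal approximation    = {approx:.6f}")
```

This far into the tail the two values can differ noticeably, so the exact binomial sum is the safer figure to quote.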
$$z = \frac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$
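A short sketch of the z statistic for a proportion follows. The function name `z_test_proportion` and the data (30 infected out of 1000, tested against $p_0 = 0.02$) are hypothetical values used only to exercise the formula.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(x: int, n: int, p0: float):
    """z statistic for an observed proportion x/n against a hypothesised proportion p0."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_two_sided

# Hypothetical data, for illustration only
p_hat, z, p_value = z_test_proportion(x=30, n=1000, p0=0.02)
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, two-sided p = {p_value:.4f}")
```

Note that the standard error uses the hypothesised $p_0$, not the observed $\hat{p}$, because the statistic is evaluated under the null hypothesis.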
x 0 1 2
P(X=x) 1/4 1/2 1/4
Assumptions:
Binomial Random Variable
Illustration of possible outcomes
Child number
1 2 3 Outcome (k)
+ + + 3 infected
+ + - 2 infected
+ - + 2 infected
- + + 2 infected
+ - - 1 infected
- + - 1 infected
- - + 1 infected
- - - 0 infected
Combinations
Outcome    Number infected (k)    Number of ways
+ + +      3                      1
+ + -      2                      3
+ - +      2
- + +      2
+ - -      1                      3
- + -      1
- - +      1
- - -      0                      1
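The counts of ways can be confirmed by listing every possible outcome for three children. This is a minimal sketch; the names `outcomes` and `ways` are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb

# Every sequence of infected (+) / not infected (-) for 3 children
outcomes = list(product("+-", repeat=3))

# Number of sequences giving each count k of infected children
ways = Counter(outcome.count("+") for outcome in outcomes)

for k in sorted(ways, reverse=True):
    print(f"{k} infected: {ways[k]} way(s), comb(3, {k}) = {comb(3, k)}")
```

The enumeration agrees with the binomial coefficient `comb(3, k)` for every k, which is why the coefficient appears in the binomial probability formula.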
What are the probabilities of the different outcomes?
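The number of ways of obtaining k infected out of n children (here n = 3) is
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}, \qquad \binom{3}{k} = \frac{3!}{k!\,(3-k)!}$$
so, with P the probability of infection for each child,
$$P(X = k) = \binom{n}{k} P^k (1 - P)^{n-k} = \frac{n!}{k!\,(n-k)!} P^k (1 - P)^{n-k}$$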
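The binomial probability formula translates directly into code. The following is a sketch with an illustrative function name `binom_pmf`, using `math.comb` for the number of combinations.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p): comb(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability that none of 3 children is infected when p = 0.02
print(round(binom_pmf(0, 3, 0.02), 2))
```

With k = 0 the formula reduces to $(1 - p)^n$, since there is only one way for no child to be infected.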
Example:
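1) What is the probability that none (0) out of three children get
infected, given that the probability of transmission by 6 weeks
is 2 percent?
$$\begin{aligned}
P(X = 0) &= \binom{3}{0} \times 0.02^{0} \times (1 - 0.02)^{3-0} \\
&= \frac{3!}{0! \times (3-0)!} \times 1 \times 0.98^{3} \\
&= \frac{3 \times 2 \times 1}{1 \times (3 \times 2 \times 1)} \times 0.98^{3} \\
&= 0.94
\end{aligned}$$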
What is the binomial probability distribution of the number of
infected children out of 3?
X 0 1 2 3
Note that the sum of the probabilities for all the possible
values of the binomial random variable (k = 0, 1, ..., n) is equal to 1,
i.e.
$$\sum_{k=0}^{n} P(X = k) = 1$$
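The full distribution for three children can be tabulated and its total checked. This sketch assumes n = 3 and p = 0.02, as in the preceding example; the dictionary name `dist` is illustrative.

```python
from math import comb

n, p = 3, 0.02  # three children, 2 percent transmission probability

# P(X = k) for every possible number of infected children k = 0..n
dist = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
total = sum(dist.values())

for k, prob in dist.items():
    print(f"P(X = {k}) = {prob:.6f}")
print(f"sum = {total:.6f}")
```

The sum runs over k = 0 as well as k = 1, ..., n; leaving out k = 0 would drop the largest term.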
Cumulative-distribution function (cdf)
For example, with n = 3 and p = 0.02:
$F(2) = P(X \le 2) = 1 - P(X = 3) = 1 - 0.02^{3} = 0.999992$
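A cumulative-distribution function is a running sum of the probabilities up to and including a value. The sketch below uses an illustrative function `binom_cdf` and assumes n = 3 and p = 0.02 from the earlier example; under that assumption the value at x = 2 matches the figure quoted here.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """F(x) = P(X <= x) for X ~ Bin(n, p), summing the pmf from 0 to x."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

# Assumed setting: 3 children, transmission probability 0.02
for x in range(4):
    print(f"F({x}) = {binom_cdf(x, 3, 0.02):.6f}")
```

The cdf is non-decreasing and reaches 1 at the largest possible value, x = n.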
Mean and Variance of a Bernoulli random variable
Given $P(X = 1) = p$ then $P(X = 0) = 1 - p$
Mean,
$$\mu = E[X] = \sum_{j=0}^{1} p_j x_j = (1 - p) \times 0 + p \times 1 = p$$
Variance,
$$\sigma^2 = V[X] = \sum_{j=0}^{1} p_j (x_j - \mu)^2 = (1 - p)(0 - p)^2 + p\,(1 - p)^2 = p(1 - p)$$
Mean and Variance of a binomial random variable
MEAN
$\mu = E[X] = np$
VARIANCE
$\sigma^2 = V[X] = np(1 - p)$
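The closed-form mean and variance can be checked against the definitions $E[X] = \sum_k k\,P(X = k)$ and $V[X] = \sum_k (k - \mu)^2 P(X = k)$. This is a sketch with an illustrative function name `binom_moments`; setting n = 1 gives the Bernoulli case.

```python
from math import comb

def binom_moments(n: int, p: float):
    """Mean and variance of Bin(n, p), computed by summing over the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, var

n, p = 3, 0.02
mean, var = binom_moments(n, p)
print(f"by summation: mean = {mean:.4f}, variance = {var:.4f}")
print(f"by formula:   mean = {n * p:.4f}, variance = {n * p * (1 - p):.4f}")
```

Both routes should agree up to floating-point rounding.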
Poisson distribution
Poisson Probability distribution
Discrete probability distribution for the counts of events that occur
randomly in a given interval of time (or space):
$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
Note that e is the base of the natural logarithm, e ≈ 2.71828.
If X has a Poisson distribution, then we write $X \sim Po(\lambda)$, where
$\lambda$ is the parameter of the distribution.
Note: A Poisson random variable can take on any non-negative integer
value, while a binomial random variable always has a finite upper limit.
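The Poisson probability formula can be sketched as a small function. The name `poisson_pmf` and the rate $\lambda = 2.1$ used in the demonstration are illustrative choices, not values stated on this slide.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam): exp(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.1  # illustrative rate
for x in range(5):
    print(f"P(X = {x}) = {poisson_pmf(x, lam):.3f}")

# There is no upper limit on x, but the terms shrink quickly;
# the first 50 already sum to 1 within floating-point accuracy.
print(sum(poisson_pmf(x, lam) for x in range(50)))
```

Unlike the binomial case, x = 0 is a valid value and the list of possible values never ends.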
Assumptions
1. The probability of two events occurring in the
same narrow interval is negligible. (rare events)
2. The probability of observing a single event over a
small interval is approximately proportional to the size
of that interval
3. The probability of an event within a certain interval
does not change over different intervals. (stationarity)
4. The probability of an event in one interval is
independent of the probability of an event in any other
non-overlapping interval. (independence)
Example:
$$\begin{aligned}
P(X > 2) &= P(X = 3) + P(X = 4) + \cdots \\
&= 1 - P(X \le 2) \\
&= 1 - [P(X = 0) + P(X = 1) + P(X = 2)] \\
&= 1 - (0.122 + 0.257 + 0.270) \\
&\approx 0.350
\end{aligned}$$
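The tail probability is easiest to compute through the complement, as in the steps above. The sketch below assumes $\lambda = 2.1$; that rate is an inference (it reproduces the three terms 0.122, 0.257 and 0.270, and is half of the 4.2 used for the two-month interval), not a value stated in this example.

```python
from math import exp, factorial

lam = 2.1  # assumed rate per interval, inferred from the quoted probabilities

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

# P(X > 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
p_le_2 = sum(poisson_pmf(x, lam) for x in range(3))
p_gt_2 = 1 - p_le_2
print(f"P(X <= 2) = {p_le_2:.3f}, P(X > 2) = {p_gt_2:.3f}")
```

Summing three terms and subtracting from 1 avoids adding up the infinite upper tail directly.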
Changing intervals
What would be the probability of observing 5 stillbirths in 2
months?
Note: the interval for the rate changed from 1 month to 2 months
What is $\lambda$?
So, with $\lambda = 4.2$,
$$P(X = 5) = \frac{e^{-4.2} \times 4.2^{5}}{5!} = 0.1633$$
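When the interval changes, the rate is rescaled before the Poisson formula is applied. In this sketch the monthly rate of 2.1 is an assumption inferred from the two-month value of 4.2, and the helper name `scale_rate` is hypothetical.

```python
from math import exp, factorial

def scale_rate(rate_per_month: float, months: float) -> float:
    """Expected number of events over an interval of the given length."""
    return rate_per_month * months

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam_two_months = scale_rate(rate_per_month=2.1, months=2)  # assumed monthly rate
p_five = poisson_pmf(5, lam_two_months)
print(f"lambda = {lam_two_months:.1f}, P(X = 5) = {p_five:.4f}")
```

The rescaling relies on the stationarity assumption: the rate per unit of time is the same in every interval.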
Shape of the Poisson distribution
(i) Unimodal
Source:http://en.wikipedia.org/wiki
Mean and variance of a Poisson distribution
If $X \sim Po(\lambda)$ then
$\mu = \lambda$ and $\sigma^2 = \lambda$
Poisson Approximation to the Binomial Distribution
The binomial distribution with large n and small p can be
accurately approximated by the Poisson distribution with
parameter $\lambda = np$.
$npq \approx np$ where $q = 1 - p \approx 1$
Thus for large n and small p, the mean and variance of a binomial
distribution are almost equal, so the binomial distribution is
well approximated by a Poisson distribution.
Example 2:
mean=np=1000*.02= 20
Using the Binomial distribution, X~bin (1000,0.02),
P(X=6) = 0.00017
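The exact binomial value can be set beside its Poisson approximation with $\lambda = np = 20$. This is a sketch using only the standard library; the variable names are illustrative, and the Poisson figure is computed here rather than quoted from the slides.

```python
from math import comb, exp, factorial

n, p, k = 1000, 0.02, 6
lam = n * p  # Poisson parameter, lambda = np = 20

# Exact binomial probability P(X = 6) for X ~ Bin(1000, 0.02)
binom_prob = comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson approximation P(X = 6) for X ~ Po(20)
poisson_prob = exp(-lam) * lam**k / factorial(k)

print(f"binomial: {binom_prob:.5f}")
print(f"Poisson:  {poisson_prob:.5f}")
```

The two probabilities are close but not identical; the approximation improves as n grows and p shrinks with np held fixed.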