Chap3 Stat 2

Chapter 3 discusses various important probability distributions, including discrete distributions such as Bernoulli, binomial, geometric, negative binomial, and Poisson distributions, as well as continuous distributions like normal and exponential distributions. It provides definitions, examples, and methods for calculating probabilities using Excel for each type of distribution. The chapter aims to equip readers with the foundational knowledge of probability distributions and their applications.
Chapter 3: Useful probability distributions

Nguyen Minh Tri

University of Information Technology

March 11, 2025


3.1 Some important discrete distributions
The probability distribution of a discrete random variable is called a discrete probability
distribution.
Definition 3.1 A trial with only two possible outcomes is called a Bernoulli trial.
• Suppose one outcome of a Bernoulli trial is a success and the other a failure. The
sample space Ω = {success, failure}.
• Random variable X: for ω ∈ Ω,
\[ X(\omega) = \begin{cases} 1, & \omega = \text{success} \\ 0, & \omega = \text{failure} \end{cases} \]
Example 3.2
1. Tossing a coin (success = head, failure = tail).
2. Website click (success: A user clicks on a specific ad; failure: The user does not
click on the ad.)
3. Email classification (success: An email is classified as spam correctly; failure: An
email is misclassified)
Definition 3.3 A random experiment consists of n Bernoulli trials such that
1. The trials are independent.
2. Each trial results in only two possible outcomes: “success” (S) and “failure” (F).
3. The probability of a success in each trial, denoted as p, remains constant.
Let the random variable X be the number of trials that result in a success. Then X is said
to have a binomial distribution, and its probability distribution (probability mass
function) is
\[ p(x) = P(X = x) = C_n^{x}\, p^{x} (1-p)^{n-x}, \quad x = 0, 1, \ldots, n. \]
• n = the number of trials
• p = probability of a success.
• X has binomial distribution: X ∼ B(n, p)
• E(X) = np and var(X) = np(1 − p)
Using Excel to compute Binomial probabilities
• P (X = x) : =BINOMDIST(x,n,p,FALSE)
• P (X ≤ x) : =BINOMDIST(x,n,p,TRUE)
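The two Excel calls above can be mirrored in a few lines of Python. This is a minimal sketch using only the standard library; the function names `binom_pmf` and `binom_cdf` are illustrative and do not come from the notes.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p); plays the role of BINOMDIST(x,n,p,FALSE)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p); plays the role of BINOMDIST(x,n,p,TRUE)."""
    return sum(binom_pmf(i, n, p) for i in range(0, min(x, n) + 1))


if __name__ == "__main__":
    n, p = 20, 0.7  # illustrative parameter values
    print("P(X = 15)  =", binom_pmf(15, n, p))
    # "At least 15" is the complement of "at most 14".
    print("P(X >= 15) =", 1 - binom_cdf(14, n, p))
    # The mean computed from the pmf should agree with E(X) = np.
    print("E(X)       =", sum(x * binom_pmf(x, n, p) for x in range(n + 1)))
```

Probabilities of the form "at least" or "between a and b" are obtained from `binom_cdf` by taking complements or differences, in the same way as with the cumulative Excel function.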
Example 3.4 A company deploys 20 AI models for different tasks in production. Each
model has a probability of 0.7 of functioning correctly (i.e., producing accurate predictions)
in real-world scenarios during the first month.
a. What is the probability that at least 15 of the models function correctly?
b. What is the average number of models that fail in the first month?
Solution.
• X : The number of models functioning correctly in the first month.
• n = 20 (Number of trials (AI models))
• p = 0.7 Probability of success (model functioning correctly).
• X ∼ B(n = 20; p = 0.7)
a. The probability that at least 15 of the models function correctly
....................................................................................
....................................................................................
b. The average number of models that fail in the first month
....................................................................................
Example 3.5 According to a survey of a country, 41% of all households are wireless-only
households (no landline). In a random sample of 20 households, what is the probability
that
a. exactly 5 are wireless-only?
b. fewer than 3 are wireless-only?
c. the number of households that are wireless-only is between 5 and 7, inclusive?
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Example 3.6 A computer game is released. Sixty percent of players complete all the levels.
Thirty percent of them will then buy an advanced version of the game.
a. Among 15 users, what is the probability that at least two people will buy it?
b. What is the expected number of people who will buy the advanced version?
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Definition 3.7 The number of Bernoulli trials needed to get the first success has a
geometric distribution.
• p = probability of success
• X ∼ Geo(p)
• Probability mass function:
\[ p(k) = \begin{cases} (1-p)^{k-1}\, p, & k = 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases} \]
• E(X) = 1/p and var(X) = (1 − p)/p²
Using Excel to compute geometric probabilities
• P (X = x) : =BINOM.DIST(1,x,p,FALSE)/x
• P (X ≤ x) : = 1 - BINOM.DIST(0,x,p,FALSE)
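The Excel formulas above express geometric probabilities through the binomial function. The sketch below computes the geometric probabilities directly and checks numerically that they agree with those binomial expressions. It is a minimal illustration; `geom_pmf`, `geom_cdf`, and `binom_pmf` are hypothetical helper names.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial pmf, standing in for BINOM.DIST(x,n,p,FALSE)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def geom_pmf(k: int, p: float) -> float:
    """P(X = k): the first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0


def geom_cdf(k: int, p: float) -> float:
    """P(X <= k): at least one success in the first k trials."""
    return 1 - (1 - p) ** k if k >= 1 else 0.0


if __name__ == "__main__":
    p = 0.1  # illustrative success probability
    for x in range(1, 6):
        # BINOM.DIST(1,x,p,FALSE)/x = x*p*(1-p)^(x-1)/x = p*(1-p)^(x-1)
        via_binom_pmf = binom_pmf(1, x, p) / x
        # 1 - BINOM.DIST(0,x,p,FALSE) = 1 - (1-p)^x
        via_binom_cdf = 1 - binom_pmf(0, x, p)
        print(x, geom_pmf(x, p), via_binom_pmf, geom_cdf(x, p), via_binom_cdf)
```

The identity works because exactly one success in x trials can occur in x equally likely positions, and only one of them (the last) corresponds to the first success arriving on trial x.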
Example 3.8 An AI developer is debugging a program by repeatedly testing blocks of code
until a critical bug is identified. The probability of identifying the bug in any single test is
0.1. Let Y be the total number of tests required to find the bug for the first time.
• The random variable Y is the number of trials (tests) until the first success (finding
the bug). This follows a geometric distribution with parameter p = 0.1.
• What is the probability that the developer identifies the bug on the 5th test?
....................................................................................
....................................................................................
....................................................................................
Definition 3.9 In a sequence of independent Bernoulli trials, the number of trials needed
to obtain k successes has a negative binomial distribution.
• k : number of successes
• p : probability of success
• X has a negative binomial distribution with parameters k and p : X ∼ NB(k, p)
• Probability mass function:
\[ p(x) = C_{x-1}^{k-1}\, (1-p)^{x-k}\, p^{k}, \quad x = k, k+1, \ldots \]
• E(X) = k/p and var(X) = k(1 − p)/p²
Using Excel to compute Negative binomial probabilities
• P (X = x) : = NEGBINOM.DIST(x-k,k,p,FALSE)
• P (X ≤ x) : = NEGBINOM.DIST(x-k,k,p,TRUE)
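Note that the Excel function counts failures (x − k), while the definition above counts trials (x). The following sketch works directly with the number of trials. It is a minimal illustration with hypothetical function names, using only the standard library.

```python
from math import comb


def negbinom_pmf(x: int, k: int, p: float) -> float:
    """P(X = x): the k-th success occurs exactly on trial x."""
    if x < k:
        return 0.0
    # Choose positions of the first k-1 successes among the first x-1 trials.
    return comb(x - 1, k - 1) * (1 - p) ** (x - k) * p**k


def negbinom_cdf(x: int, k: int, p: float) -> float:
    """P(X <= x): the k-th success occurs on or before trial x."""
    return sum(negbinom_pmf(i, k, p) for i in range(k, x + 1))


if __name__ == "__main__":
    k, p = 3, 0.8  # illustrative parameter values
    print("P(X = 7)  =", negbinom_pmf(7, k, p))
    print("P(X <= 7) =", negbinom_cdf(7, k, p))
```

With k = 1 the formula reduces to the geometric pmf, which gives a quick sanity check of the implementation.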
Example 3.10 In a recent production, 20% of certain electronic components are defective.
We need to find 3 non-defective components for our 3 new computers. Components are
tested until 3 non-defective ones are found. What is the probability that 7 components
will have to be tested?
Solution.
• X : the number of components which are tested until 3 non-defective ones are found.
• k=3
• p = 80%
• X ∼ NB(k = 3, p = 0.8).
• The probability that 7 components will have to be tested
....................................................................................
Example 3.11 Suppose that the probability that a bit transmitted through a digital trans-
mission channel is received in error is 0.1. Assume that the transmissions are independent
events, and let the random variable X denote the number of bits transmitted until the
fourth error. What is the probability that the fourth error appears in nine or fewer trans-
missions?
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
A Poisson random variable is a count of the number of occurrences of a certain event in
a given unit of time, space, volume, distance, etc.
Example 3.12
1. The number of cars passing through a traffic light in 1 hour.
2. The number of requests to web servers in an hour.
3. Number of phone calls per minute.
4. The number of mistakes in a book per page.
Definition 3.13 A discrete random variable X is said to follow the Poisson probability
distribution with parameter λ > 0, denoted by P(λ), if its probability mass function is
\[ p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]

• λ : average number of events


• Denote X ∼ P (λ)
• E(X) = λ and var(X) = λ
Using Excel to compute Poisson probabilities
• P (X = x) : = POISSON.DIST(x,lambda,FALSE)
• P (X ≤ x) : = POISSON.DIST(x,lambda,TRUE)
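As a cross-check on the Excel calls, the Poisson pmf and cdf can be computed directly from the definition. This is a minimal standard-library sketch; `poisson_pmf` and `poisson_cdf` are illustrative names.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ P(lam)."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for X ~ P(lam), summing the pmf from 0 to x."""
    return sum(poisson_pmf(i, lam) for i in range(0, x + 1))


if __name__ == "__main__":
    lam = 0.5  # illustrative rate
    print("P(X = 0)  =", poisson_pmf(0, lam))
    # Upper-tail probabilities come from the complement of the cdf.
    print("P(X >= 3) =", 1 - poisson_cdf(2, lam))
```

When the observation window changes (for example from one day to two days), the rate scales proportionally, so `lam` should be adjusted before calling these functions.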

• Siméon Denis Poisson (1781 - 1840), a French mathematician and physicist (source: https://en.wikipedia.org)

Example 3.14 Let X be a Poisson random variable with λ = 1/2. Find P(X = 0) and
P(X ≥ 3).
Solution.
....................................................................................
....................................................................................
....................................................................................
Example 3.15 Customers of an internet service provider initiate new accounts at the
average rate of 10 accounts per day.
a. What is the probability that more than 8 new accounts will be initiated today?
b. What is the probability that more than 16 accounts will be initiated within 2 days?
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Definition 3.16 Assume that a set consists of N elements. Each element can be
characterized as a success or a failure, and there are M successes in the set. A subset of
k elements is selected without replacement. The random variable X is the number of
successes in the subset. Then the probability distribution of X is called the
hypergeometric distribution.

• X ∼ H(k; M; N)
• Probability mass function:
\[ p(x) = \frac{C_M^{x}\, C_{N-M}^{k-x}}{C_N^{k}}, \quad \max\{0, k-N+M\} \le x \le \min\{k, M\}. \]
• Mean and variance:
\[ E(X) = k\,\frac{M}{N} \quad \text{and} \quad \mathrm{var}(X) = k\,\frac{N-k}{N-1}\,\frac{M}{N}\left(1 - \frac{M}{N}\right). \]
Using Excel to Compute Hypergeometric Probabilities
• P (X = x) : = HYPGEOM.DIST(x,k,M,N,FALSE)
• P (X ≤ x) : = HYPGEOM.DIST(x,k,M,N,TRUE)
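The hypergeometric pmf is a ratio of counts, so it translates almost literally into code. The sketch below keeps the same argument order as the Excel call (x, k, M, N). The function names are hypothetical, and the parameter values in the demo are only illustrative.

```python
from math import comb


def hypergeom_pmf(x: int, k: int, M: int, N: int) -> float:
    """P(X = x): x successes in a sample of k drawn without replacement
    from N elements of which M are successes."""
    lo, hi = max(0, k - N + M), min(k, M)
    if x < lo or x > hi:
        return 0.0
    return comb(M, x) * comb(N - M, k - x) / comb(N, k)


def hypergeom_cdf(x: int, k: int, M: int, N: int) -> float:
    """P(X <= x), summing the pmf over the support up to x."""
    return sum(hypergeom_pmf(i, k, M, N) for i in range(0, x + 1))


if __name__ == "__main__":
    N, M, k = 10, 7, 4  # illustrative: 10 items, 7 successes, sample of 4
    print("P(X = 2)  =", hypergeom_pmf(2, k, M, N))
    print("P(X >= 3) =", 1 - hypergeom_cdf(2, k, M, N))
    # The mean from the pmf should agree with E(X) = k*M/N.
    print("E(X)      =", sum(x * hypergeom_pmf(x, k, M, N) for x in range(k + 1)))
```

Which category counts as a "success" is a modelling choice; M must then be the number of elements in that category.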
Example 3.17 An IT company has 10 network routers in storage. Three of the routers are
faulty. Suppose the company randomly selects 4 routers for a new setup.
a. What is the probability that exactly two routers will be functional?
b. What is the probability that at least three routers will be functional?
....................................................................................
....................................................................................
....................................................................................
....................................................................................
3.2 Some important continuous distributions
Definition 3.18 A random variable X is said to have a normal probability distribution
with parameters µ and σ² if it has a probability density function given by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty. \]

• Denote X ∼ N(µ, σ²)
• E(X) = µ and var(X) = σ²

[Figure: graph of the density f(x) against x, with µ marked on the x-axis.]
• Johann Carl Friedrich Gauss (1777 - 1855)
• Source: https://en.wikipedia.org
• If µ = 0 and σ = 1, we call it a standard normal random variable.
• If X ∼ N(µ, σ²), then Z = (X − µ)/σ ∼ N(0, 1).
• Let Z ∼ N(0, 1); then Φ(z) = P(Z ≤ z) is given in Table A4 (Φ is the cumulative
distribution function of Z).
Using Excel to compute normal probabilities: X ∼ N(µ, σ²)
1. Find P (X ≤ a) : =NORM.DIST(a,µ, σ,TRUE)
2. Find a such that P (X ≤ a) = t : =NORM.INV(t, µ, σ)
Example 3.19
a. Let X ∼ N (0, 1). Find P (X ≥ 1.13)
b. Let X ∼ N (0, 1). Find a such that P (X ≤ a) = 0.004
Solution. a. Using the normal table A4,
P (X ≥ 1.13) = 1 − P (X < 1.13) = 1 − 0.8708 = 0.1292
b. Using the normal table A4, a = −2.65
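The table lookups in this solution can be cross-checked with Python's standard library, whose `statistics.NormalDist` class provides `cdf` and `inv_cdf`. This is a minimal sketch; the helper names `normal_cdf` and `normal_inv` are illustrative. As with the Excel functions, the third parameter is the standard deviation σ, not the variance σ².

```python
from statistics import NormalDist


def normal_cdf(a: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(X <= a) for X ~ N(mu, sigma^2); analogous to NORM.DIST(a,mu,sigma,TRUE)."""
    return NormalDist(mu, sigma).cdf(a)


def normal_inv(t: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Value a with P(X <= a) = t; analogous to NORM.INV(t,mu,sigma)."""
    return NormalDist(mu, sigma).inv_cdf(t)


if __name__ == "__main__":
    # Part a: upper tail of the standard normal at 1.13.
    print("P(X >= 1.13) =", round(1 - normal_cdf(1.13), 4))
    # Part b: the quantile with left-tail probability 0.004.
    print("a            =", round(normal_inv(0.004), 2))
    # Standardizing by hand gives the same probability as passing mu and sigma.
    mu, sigma, a = 3.0, 4.0, 5.0  # illustrative values
    print(normal_cdf(a, mu, sigma), normal_cdf((a - mu) / sigma))
```

The last line illustrates the standardization Z = (X − µ)/σ: both calls evaluate the same probability.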

Assume that X ∼ N(µ, σ²) and let Z = (X − µ)/σ. Then Z ∼ N(0, 1).
Example 3.20 Assume that X is a random variable and X ∼ N (3, 16).
a. Find P (X ≥ 5).
b. Find P (2 ≤ X ≤ 5).
c. Find a such that P (X ≤ a) = 0.8944.
d. Find a such that P (X ≥ a) = 0.1.
e. Find a such that P (|X − 3| ≤ a) = 0.9544.
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Example 3.21 The scores of an examination are assumed to be normally distributed with
µ = 75 and σ 2 = 64.
a. What is the probability that a student score chosen at random will be greater than
85?
b. A professor has decided that any student who scores below the 10th percentile must
retake the exam. What is the score that would require a student to retake the exam?
Solution. Let X be a score of the exam. Then, X ∼ N (75, 64).
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Definition 3.22 A random variable X is said to have an exponential distribution with
parameter λ > 0 if the probability density function of X is given by
\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases} \]

• Denote X ∼ Exp(λ)
• E(X) = 1/λ and var(X) = 1/λ².
Using Excel
• Find P(X ≤ x) : = EXPON.DIST(x,λ,TRUE)
Example 3.23 Let X be the time (years) of use of a type of laptop. Suppose that X has
an exponential distribution and E(X) = 15. Choose a laptop at random.
a. Calculate the probability that the laptop will be used for less than 6 years.
b. Calculate the probability that the laptop will be used for more than 18 years.
c. Calculate the variance and standard deviation of X.
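Solution. We have X ∼ Exp(1/15). Then the density function of X is given by
\[ f(x) = \begin{cases} \frac{1}{15}\, e^{-\frac{1}{15}x}, & x \ge 0 \\ 0, & x < 0 \end{cases} \]
a.
\[ P(X < 6) = \int_{-\infty}^{6} f(x)\,dx = \int_{0}^{6} \frac{1}{15}\, e^{-\frac{1}{15}x}\,dx = 1 - e^{-6/15} \approx 0.3297. \]
b.
\[ P(X > 18) = \int_{18}^{+\infty} f(x)\,dx = \int_{18}^{+\infty} \frac{1}{15}\, e^{-\frac{1}{15}x}\,dx = e^{-18/15} \approx 0.3012. \]
c. σ² = 1/λ² = 225 and σ = 15.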
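The integrals in this solution have the closed form P(X ≤ x) = 1 − e^{−λx}, which makes them easy to verify numerically. The sketch below is a minimal check with a hypothetical helper `expon_cdf`, which plays the role of the cumulative Excel function.

```python
from math import exp, sqrt


def expon_cdf(x: float, lam: float) -> float:
    """P(X <= x) for X ~ Exp(lam)."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0


if __name__ == "__main__":
    lam = 1 / 15  # E(X) = 1/lam = 15 years
    print("a. P(X < 6)  =", round(expon_cdf(6, lam), 4))
    print("b. P(X > 18) =", round(1 - expon_cdf(18, lam), 4))
    variance = 1 / lam**2
    print("c. variance  =", round(variance, 6), " sd =", round(sqrt(variance), 6))
```

Interval probabilities such as P(a ≤ X ≤ b) follow as `expon_cdf(b, lam) - expon_cdf(a, lam)`.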
Example 3.24 On the average, a certain computer part lasts ten years. The length of time
the computer part lasts is exponentially distributed.
a. What is the probability that a computer part lasts more than 7 years? (0.4966)
b. What is the probability that a computer part lasts between 9 and 11 years? (0.0737)

....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Definition 3.25 Let n be a positive integer. A random variable X is said to have a
chi-square (χ²) distribution with n degrees of freedom if it has pdf
\[ f(x) = \begin{cases} \dfrac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{(n/2)-1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0. \end{cases} \]

• Denote X ∼ χ²(n)
• E(X) = n and var(X) = 2n.
Theorem 3.26 If X ∼ N(µ, σ²), then
\[ \left(\frac{X-\mu}{\sigma}\right)^{2} \sim \chi^{2}(1). \]
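Example 3.27 Let X ∼ N(7, 4). Find P(15.364 ≤ (X − 7)² ≤ 20.095).
Solution. Since X ∼ N(7, 4) we have µ = 7 and σ = 2. Let Z = (X − 7)/2, so that
Z² ∼ χ²(1) by Theorem 3.26. Then
\[ \begin{aligned}
P(15.364 \le (X-7)^2 \le 20.095) &= P\left(\frac{15.364}{4} \le \left(\frac{X-7}{2}\right)^{2} \le \frac{20.095}{4}\right) \\
&= P(3.841 \le Z^2 \le 5.024) \\
&= P(Z^2 \le 5.024) - P(Z^2 \le 3.841) \\
&= 0.975 - 0.95 = 0.025
\end{aligned} \]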
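The result can be checked without a chi-square table: for Z ∼ N(0, 1) and c > 0, P(Z² ≤ c) = P(−√c ≤ Z ≤ √c) = 2Φ(√c) − 1. The sketch below uses this identity, which is a standard fact rather than something stated in the notes; the helper name `chi2_1_cdf` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def chi2_1_cdf(c: float) -> float:
    """P(Z^2 <= c) for Z ~ N(0, 1), i.e. the cdf of chi-square with 1 d.f."""
    if c <= 0:
        return 0.0
    return 2 * NormalDist().cdf(sqrt(c)) - 1


if __name__ == "__main__":
    sigma2 = 4.0  # variance of X ~ N(7, 4)
    lower, upper = 15.364 / sigma2, 20.095 / sigma2
    p = chi2_1_cdf(upper) - chi2_1_cdf(lower)
    print("P(15.364 <= (X-7)^2 <= 20.095) =", round(p, 3))
```

This shortcut only applies to one degree of freedom; for general n a chi-square table or a dedicated library function would be needed.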
Definition 3.28 If Y and Z are independent random variables, Y has a chi-square
distribution with n degrees of freedom, and Z ∼ N(0, 1), then
\[ T = \frac{Z}{\sqrt{Y/n}} \]
is said to have a (Student) t-distribution with n degrees of freedom. We denote this by
T ∼ T(n). The probability density function of the random variable T with n degrees of
freedom is given by
\[ f(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^{2}}{n}\right)^{-\frac{n+1}{2}}. \]
The t-distribution tends to a standard normal distribution as the degrees of freedom
(equivalently, the sample size n) tend to infinity.
[Figure: graphs of some t-distributions. Left panel: densities for d.f. = 1, 2, 5, 10. Right panel: Stu(10) and Stu(30) compared with N(0, 1). Horizontal axis from −6 to 6, vertical axis from 0.0 to 0.4.]
The standard normal distribution provides a good approximation to the t-distribution for
sample sizes of 30 or more.
In the t-table, we can find t_{α,n} such that

P(T > t_{α,n}) = α.

Example 3.29 Let T be a random variable. Assume that T has a t-distribution with 9 degrees
of freedom and α = 0.01.

From the t-table, we have P(T > 2.821) = 0.01.


Using Excel. Let X ∼ T(n)
• Find P(X ≤ x) : =T.DIST(x, n, TRUE)
• Find a such that P(X ≤ a) = t : =T.INV(t, n)
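Python's standard library has no t-distribution, so the sketch below approximates P(T ≤ x) by integrating the density of Definition 3.28 with composite Simpson's rule. It is only an illustrative numerical check under that assumption, not a replacement for a statistics library; the names `t_pdf` and `t_cdf` and the step count are hypothetical choices.

```python
from math import gamma, pi, sqrt


def t_pdf(t: float, n: int) -> float:
    """Density of the t-distribution with n degrees of freedom."""
    c = gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
    return c * (1 + t * t / n) ** (-(n + 1) / 2)


def t_cdf(x: float, n: int, steps: int = 2000) -> float:
    """Approximate P(T <= x) using Simpson's rule on [0, |x|] and symmetry.

    `steps` must be even.
    """
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, n) + t_pdf(a, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, n)
    area = total * h / 3  # approximates P(0 <= T <= |x|)
    return 0.5 + area if x > 0 else 0.5 - area


if __name__ == "__main__":
    # Upper-tail probability for 9 degrees of freedom at the tabulated value 2.821.
    print("P(T > 2.821), n = 9:", round(1 - t_cdf(2.821, 9), 4))
    # By symmetry, P(|T| >= c) = 2 * P(T > c).
    print("P(|T| >= 2.821), n = 9:", round(2 * (1 - t_cdf(2.821, 9)), 4))
```

The symmetry of the density about 0 is what allows the integral to be taken over [0, |x|] only, and it is also the reason two-sided probabilities are twice the one-sided tail.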
Example 3.30 Assume that T ∼ T (n).
a. Find t such that P (T > t) = 0.01 where n = 6.
b. Find t such that P (T > t) = 0.005 where n = 17.
c. Find P (T > 2.6025) where n = 16.
d. Find P (T ≤ −1.3212) where n = 23.
e. Find P (|T | ≥ 5.1106) where n = 14.
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
....................................................................................
Theorem 3.31 Let X1, X2, . . . , Xn be independent random variables, each having the normal
distribution N(µ, σ²). Let
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^{2} = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^{2}. \]
Then
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a t-distribution with (n − 1) degrees of freedom.
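To make the statistic in Theorem 3.31 concrete, the sketch below computes T for a small sample. The data and the hypothesized mean are made up purely for illustration, and `t_statistic` is a hypothetical helper name. Note that `statistics.stdev` uses the n − 1 divisor, matching the definition of S² above.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample: list[float], mu: float) -> float:
    """Compute T = (sample mean - mu) / (S / sqrt(n)) for the given sample."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation with the (n - 1) divisor
    return (x_bar - mu) / (s / sqrt(n))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up observations
    mu0 = 2.0                          # made-up value of the population mean
    print("T =", t_statistic(data, mu0), "with", len(data) - 1, "degrees of freedom")
```

Under the assumptions of the theorem, the computed value would be compared with the t-distribution having n − 1 degrees of freedom.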
