Session 6 Probability Distribution I - Discrete
Distribution in BA
SANGEETA SHAH BHARADWAJ
What role does Probability Distribution
play?
We need to predict the outcome of a future event, e.g. whether a customer will churn, whether an employee
will leave, or whether a transaction is fraudulent.
Understanding probability distributions helps us make these predictions and
take further action.
What is sample space?
Examples of outcomes
Predicting customer churn at the individual customer level: there are only two possibilities, either
a customer will leave or will not leave.
So the sample space is S = {churn, no churn}, which is binary (yes or no), or
S = {low, medium, high} or {AAA, AA, A, BBB, B} for a credit rating, which is discrete and takes a finite number of
values (also referred to as a discrete random variable), or
S = {X | X ∈ ℝ}, where X can take any real value in an interval
(also referred to as a continuous random variable)
The sample space S is the universal set that consists of all possible outcomes of an experiment.
What is your interest? What do you want
to analyze?
If you have to study attrition in an organization, will you study the entire employee data set
or only a subset of employees? What would this subset be?
Only a subset.
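For a discrete random variable X, the probability distribution is given by a probability mass function (PMF), a mathematical function $f(x)$.
– The symbol $x_i$ represents the $i$th value of the random variable
X and $f(x_i)$ is the probability.
• Properties: $0 \le f(x_i) \le 1$ for all $i$, and $\sum_i f(x_i) = 1$.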
The cumulative distribution function (CDF) is the probability that a random variable X takes a
value less than or equal to $a$ and is written as $F(a) = P(X \le a)$.
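To make the PMF and CDF definitions concrete, here is a minimal sketch. The fair six-sided die is an illustrative choice and does not come from the slides. The names `pmf` and `cdf` are hypothetical helpers.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative assumption, not from the slides)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(a):
    """F(a) = P(X <= a): sum the PMF over all values not exceeding a."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

# The two PMF properties: each probability lies in [0, 1] and they sum to 1
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

print(cdf(4))  # 2/3
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by floating-point rounding.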
Properties of Continuous Probability
Distributions
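• Properties: $f(x) \ge 0$ for all $x$; the total area under the density function equals 1; $P(X = x) = 0$, so probabilities are defined only over intervals.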
You know from past data that 2% of all credit card transactions in a certain region are
fraudulent. If there are 50 transactions per day in that region, we can use a
Binomial Distribution Calculator to find the probability that more than a certain number of
fraudulent transactions occur in a given day.
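The fraud example can also be worked without a calculator. The sketch below assumes the 50 daily transactions are independent, each fraudulent with probability 0.02. The thresholds 1, 2 and 3 are chosen only for illustration. `binom_pmf` and `binom_cdf` are hypothetical helper names.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x): sum the PMF from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 50, 0.02  # transactions per day, fraud rate (from the example)

for threshold in (1, 2, 3):  # illustrative thresholds, not from the slides
    tail = 1 - binom_cdf(threshold, n, p)
    print(f"P(X > {threshold}) = {tail:.5f}")
```

The "more than" probability is computed as one minus the CDF, which is how a binomial calculator would typically report it.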
So what
Using Excel’s Binomial Distribution
Function
• The probability that exactly 3 of 10 individuals will make a
PMF: $P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$
CDF: $P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} p^{k} (1-p)^{n-k}$
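where $n$ is the number of trials, $p$ is the probability of success on each trial, and $x$ is the number of successes.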
Binomial Distribution, N=20: p=0.1 (blue), p=0.5 (green) and p=0.8 (red)
N=20, p=0.1 (say 10% returns)
P(exactly 5 customers will return) = 0.03192
P(at most 5 customers will return) = 0.9887
P(exactly 20 customers will return) ≈ 0
Average number of customers likely to return = 2
Variance = 1.8
N=20, p=0.5 (say 50% returns)
P(exactly 5 customers will return) = 0.0147
P(at most 5 customers will return) = 0.0206
P(exactly 20 customers will return) ≈ 0
Average number of customers likely to return = 10
Variance = 5
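The figures for N=20 can be reproduced with a short sketch. It assumes the standard binomial results mean = np and variance = np(1 - p). The helper names `binom_pmf` and `summarize` are hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def summarize(n, p):
    """Return the quantities listed on the slide for one value of p."""
    return {
        "exactly_5": binom_pmf(5, n, p),
        "at_most_5": sum(binom_pmf(k, n, p) for k in range(6)),
        "exactly_n": binom_pmf(n, n, p),
        "mean": n * p,
        "variance": n * p * (1 - p),
    }

for p in (0.1, 0.5):
    print(p, summarize(20, p))
```

The probability that all 20 customers return is not exactly zero, but it is far too small to show at the precision used on the slide.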
Let us see an example
You run a call center and want to know whether you have enough staff to handle customer calls.
What will you do?
Let us say you hire 2 staff.
I will collect data on the number of calls received every hour.
After collecting data for a couple of days through observation, you find that you
receive on average 15 calls per hour, which the two staff handle adequately. However, if you
start getting more than 20 calls per hour, you worry that you will need to plan for more staff.
You want to find the probability of receiving more than 20 calls in an hour.
Poisson distribution
Poisson Distribution
Examples
Number of order cancellations by customers on an e-commerce website in a day
Number of customer complaints at call centers in a day
Characteristics of distribution
Events are independent of each other. The occurrence of one event does not affect the probability that another event will
occur
The average rate (events per time period) is constant
Two events cannot occur at the same time
The Poisson process is the model for describing randomly occurring events and,
by itself, isn't that useful. We need the Poisson distribution to do interesting
things like finding the probability of a number of events in a time period.
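For the call center example (an average of 15 calls per hour, concern above 20 calls), the probability can be sketched with the standard Poisson PMF, $P(X = k) = \lambda^{k} e^{-\lambda} / k!$. The sketch assumes hourly call counts follow a Poisson distribution with $\lambda = 15$. The helper names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k): sum the PMF from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 15  # average calls per hour, from the call center example

# Probability of more than 20 calls in an hour
p_more_than_20 = 1 - poisson_cdf(20, lam)
print(f"P(X > 20) = {p_more_than_20:.4f}")
```

If this probability is larger than the call center is willing to tolerate, that would be the signal to plan for additional staff.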
Example 5.32: Using Excel’s Poisson
Distribution Function
• With
the probability that X = 1 is
For example, suppose a given website receives an average of 20 visitors per hour. We can use the
Poisson distribution calculator to find the probability that the website receives more than a certain number
of visitors in a given hour:
P(X > 25 visitors) = 0.11218
P(X > 30 visitors) = 0.01347
P(X > 35 visitors) = 0.00080
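The three tail probabilities above can be checked with a short sketch. It assumes visitor counts follow a Poisson distribution with mean 20 and builds each term from the previous one, $p_k = p_{k-1} \cdot \lambda / k$, which avoids large factorials. `poisson_tail` is a hypothetical helper name.

```python
from math import exp

def poisson_tail(threshold, lam):
    """P(X > threshold) for X ~ Poisson(lam), using p_k = p_{k-1} * lam / k."""
    term = exp(-lam)   # P(X = 0)
    cumulative = term  # running value of P(X <= k)
    for k in range(1, threshold + 1):
        term *= lam / k
        cumulative += term
    return 1 - cumulative

lam = 20  # average visitors per hour, from the example
for threshold in (25, 30, 35):
    print(f"P(X > {threshold} visitors) = {poisson_tail(threshold, lam):.5f}")
```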
Binomial vs Poisson Distribution
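The Binomial and Poisson distributions
are both discrete probability
distributions. When the number of trials n is large and the success probability p is small,
a Binomial(n, p) distribution is very similar to a Poisson distribution with mean np.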
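The similarity can be seen numerically. The sketch below reuses the fraud figures (n = 50, p = 0.02) as an illustrative case and compares the binomial PMF with a Poisson PMF whose mean is np = 1. Pairing these particular numbers with the comparison is an assumption made for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 50, 0.02
lam = n * p  # matching Poisson mean

# Side-by-side probabilities for small counts
for k in range(5):
    b = binom_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    print(f"k={k}: binomial={b:.4f}  poisson={q:.4f}  diff={abs(b - q):.4f}")
```

With a larger p (for example 0.5) the two columns would diverge noticeably, which is why the approximation is usually reserved for rare events.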
Expected Value: Airline Revenue
Management
• Full and discount airfares are available for a flight.
• Full-fare ticket costs $560.
• Discount ticket costs $400.
• X = ticket price paid
• p = 0.75 (the probability of selling a full-fare ticket)
• The airline should not discount full-fare tickets because the expected
value of a full-fare ticket is greater than the cost of a discount ticket.
• Break-even point: $560p = 400$, so $p = 400/560 \approx 0.714$
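The airline comparison can be sketched as a small calculation. It assumes a full-fare seat earns $560 with probability 0.75 and $0 otherwise. The zero payoff for an unsold seat is an assumption not stated on the slide. The variable names are hypothetical.

```python
# Figures from the airline example
full_fare = 560.0
discount_fare = 400.0
p_full = 0.75  # probability of selling a full-fare ticket

# Assumption: an unsold full-fare seat earns $0
expected_full = p_full * full_fare + (1 - p_full) * 0.0

# Break-even probability solves p * full_fare = discount_fare
break_even_p = discount_fare / full_fare

print(f"Expected value of full fare = ${expected_full:.2f}")
print(f"Break-even probability = {break_even_p:.3f}")
```

Under these assumptions, holding the seat for a full-fare sale is worthwhile only while the probability of selling it stays above the break-even value.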
Variance of a Discrete Random
Variable
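• The variance, $\mathrm{Var}[X]$, of a discrete random
variable X is a weighted average of the squared
deviations from the expected value:
$\mathrm{Var}[X] = \sum_i (x_i - E[X])^2 f(x_i)$, where $f(x_i)$ is the probability of the value $x_i$.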
Computing the Variance of a Random
Variable
• Rolling two dice
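A minimal sketch of the two-dice computation follows. It assumes X is the sum of two fair dice and applies the weighted-average-of-squared-deviations definition of variance directly.

```python
from fractions import Fraction
from itertools import product

# Build the PMF of the sum of two fair dice: each of the 36 outcomes has probability 1/36
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

# Expected value: probability-weighted average of the outcomes
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average of squared deviations from the mean
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean, variance)  # 7 35/6
```

The variance of 35/6 is roughly 5.83.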
Questions?