
Session 6 Probability Distribution I - Discrete

This document discusses probability distributions and their role in business analytics. It defines key terms like sample space, random variables, probability mass functions, cumulative distribution functions, and different types of probability distributions including discrete, continuous, binomial, and Poisson distributions. Examples are provided of how these distributions can be used to model and predict outcomes like customer churn, credit ratings, machine failures, fraudulent transactions, and more.

Uploaded by

SRV TECHS

Role of Probability Distribution in BA
SANGEETA SHAH BHARADWAJ
What role does Probability Distribution play?

• We need to predict the outcome of a future event, e.g. whether a customer will churn, whether an employee will attrite, or whether a transaction is fraudulent.

• Such outcomes cannot be predicted with certainty.

• Understanding probability distributions helps us predict such outcomes and decide what further action to take.
What is sample space?

Examples of outcomes
Predicting customer churn at the individual customer level: there are only two possibilities, either the customer will leave or will not leave.
• So the sample space is S = {churn, no churn}, which is binary (yes or no), or
• S = {low, medium, high} or {AAA, AA, A, BBB, B} (credit rating), which is discrete and takes a finite number of values (also referred to as a discrete random variable), or
• S = {X | X is a real number in some interval} (also referred to as a continuous random variable).
A sample space S is the universal set that consists of all possible outcomes of an experiment.
What is your interest? What do you want to analyze?
If you have to study attrition in an organization, will you study the entire employee data set or only a subset of employees? What would this subset be?

Only a subset.

So an event (E) is a subset of the sample space of the entire population.

When will senior management be more concerned: when the share of employees whose probability of leaving the organization exceeds 60% is 20%, or when it is 50%?
Random Variables
• A random variable is a numerical description of
the outcome of an experiment.
• A discrete random variable is one for which the
number of possible outcomes can be counted.
• A continuous random variable has outcomes
over one or more continuous intervals of real
numbers.
Discrete and Continuous Random
Variables
Examples of discrete random variables:
• outcomes of dice rolls
• whether a customer likes or dislikes a product
• number of hits on a Web site link today
Examples of continuous random variables:
• weekly change in Sensex
• daily temperature
• time between machine failures
Probability Distributions
• A probability distribution is a characterization
of the possible values that a random variable
may assume along with the probability of
assuming these values.
Discrete Probability Distributions
• For a discrete random variable X, the probability distribution of the discrete outcomes is called a probability mass function and is denoted by a mathematical function, f(x_i).
– The symbol x_i represents the i-th value of the random variable X and f(x_i) is the probability of that value.
• Properties:
– the probability of each outcome must be between 0 and 1: 0 ≤ f(x_i) ≤ 1
– the sum of all probabilities must add to 1: Σ_i f(x_i) = 1


Random Variables
Discrete random variables
Probability mass function (PMF): the probability that a random variable X takes a specific value, e.g. the probability that the number of fraudulent transactions at an e-commerce platform is 10 is written as P(X = 10).
Cumulative distribution function (CDF): the probability that a random variable X takes a value less than or equal to a given value, e.g. at most 10 fraudulent transactions is written as P(X ≤ 10).

Continuous random variables
Probability density function (PDF): f(x) is the density of X at x, so that f(x)·dx approximates the probability that X takes a value in a small neighborhood of width dx around x.
Cumulative distribution function (CDF): the probability that a random variable X takes a value less than or equal to a, written as F(a) = P(X ≤ a).
Properties of Continuous Probability Distributions
• Properties
– f(x) ≥ 0 for all values of x
– Total area under the density function equals 1.
– Probabilities are only defined over intervals.
– P(a ≤ X ≤ b) is the area under the density function between a and b.
Cumulative Distribution Function
• A cumulative distribution function, F(x), specifies the probability that the random variable X assumes a value less than or equal to a specified value, x; that is, F(x) = P(X ≤ x).
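For a discrete random variable, F(x) is simply a running sum of the PMF. The sketch below illustrates this with a fair six-sided die; the die and the variable names are illustrative assumptions, chosen because dice rolls are listed among the discrete examples.

```python
from fractions import Fraction
from itertools import accumulate

# Assumed example: a fair six-sided die, each face with probability 1/6.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# PMF properties: every probability lies in [0, 1] and they sum to 1.
assert all(0 <= f <= 1 for f in pmf)
assert sum(pmf) == 1

# CDF: F(x) = P(X <= x), built as the cumulative sum of the PMF.
cdf = dict(zip(values, accumulate(pmf)))

print(cdf[4])  # P(X <= 4) -> 2/3
```

Exact fractions are used here so that the "sums to 1" check is not affected by floating-point rounding.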
Fraudulent transactions

You know from past data that 2% of all credit card transactions in a certain region are fraudulent. If there are 50 transactions per day in that region, we can use a Binomial Distribution Calculator to find the probability that more than a certain number of fraudulent transactions occur in a given day.
So what?
Using Excel’s Binomial Distribution Function
• The probability that exactly 3 of 10 individuals will make a purchase is P(x = 3): =BINOM.DIST(3, 10, 0.2, FALSE) = 0.20133.
• The probability that 3 or fewer of 10 individuals will make a purchase is P(x ≤ 3): =BINOM.DIST(3, 10, 0.2, TRUE) = 0.87913.
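The same two numbers can be reproduced outside Excel. The following is a minimal Python sketch using only the standard library; the function names `binom_pmf` and `binom_cdf` are hypothetical helpers for illustration, playing the roles of BINOM.DIST with FALSE and TRUE as the last argument.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p); like BINOM.DIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p); like BINOM.DIST(x, n, p, TRUE)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# 10 individuals, each purchasing with probability 0.2
print(round(binom_pmf(3, 10, 0.2), 5))  # 0.20133
print(round(binom_cdf(3, 10, 0.2), 5))  # 0.87913
```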


Binomial Distribution
For example, suppose it is known that 2% of all credit card transactions in a certain region are fraudulent. If there are 50
transactions per day in a certain region, we can use a Binomial Distribution Calculator to find the probability that more
than a certain number of fraudulent transactions occur in a given day
P(X > 1 fraudulent transaction) = 0.26423
P(X > 2 fraudulent transactions) = 0.07843
P(X > 3 fraudulent transactions) = 0.01776
Email companies use the binomial distribution to model the probability that a certain number of spam emails land in an
inbox per day.
For example, suppose it is known that 4% of all emails are spam. If an account receives 20 emails in a given day, we can
use a Binomial Distribution Calculator to find the probability that a certain number of those emails are spam:
P(X = 0 spam emails) = 0.44200
P(X = 1 spam email) = 0.36834
P(X = 2 spam emails) = 0.14580
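In place of an online calculator, the fraud and spam figures above can be checked with a few lines of Python. This is a sketch under the stated binomial assumptions (independent transactions or emails, constant probability); `binom_pmf` and `binom_tail` are hypothetical helper names.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_tail(x, n, p):
    """P(X > x), computed as 1 - P(X <= x)."""
    return 1 - sum(binom_pmf(k, n, p) for k in range(x + 1))

# Fraud: 50 transactions per day, 2% fraudulent
for x in (1, 2, 3):
    print(f"P(X > {x} fraudulent) = {binom_tail(x, 50, 0.02):.5f}")

# Spam: 20 emails per day, 4% spam
for x in (0, 1, 2):
    print(f"P(X = {x} spam) = {binom_pmf(x, 20, 0.04):.5f}")
```

"More than x" is a tail probability, so it is obtained from the complement of the cumulative probability rather than from a single PMF value.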
Binomial Distribution
Retail stores use the binomial distribution to model the probability that they receive a certain number of shopping returns each week.
For example, suppose it is known that 10% of all orders get returned at a certain store each week. If there are 50 orders that week, we can use a Binomial Distribution Calculator to find the probability that the store receives more than a certain number of returns that week:
•P(X > 5 returns) = 0.38388
•P(X > 10 returns) = 0.00935
•P(X > 15 returns) = 0.00002
Binomial Distribution
It is a discrete probability distribution.
A random variable X is said to have a binomial distribution when:
◦ Each trial has only two outcomes (success or failure)
◦ The objective is to find the probability of getting x successes out of n trials
◦ The probability of success is p and thus the probability of failure is (1 − p)
◦ The probability of success is constant and does not change between trials

PMF
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$$

CDF
$$P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} p^{k} (1-p)^{n-k}$$

Where
$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$
Binomial Distribution
[Figure: binomial distributions for N = 20 with p = 0.1 (blue), p = 0.5 (green) and p = 0.8 (red)]

N = 20, p = 0.1 (say 10% returns)
• P(exactly 5 customers will return) = 0.03192
• P(at most 5 customers will return) = 0.9887
• P(exactly 20 customers will return) ≈ 0
Mean and Variance of Binomial
distribution
Mean of binomial distribution B(n, p) = np
Variance is np(1-p)
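These two formulas can be checked numerically by computing the mean and variance directly from the PMF. The sketch below does this for n = 20 with p = 0.1 and p = 0.5; the helper names are hypothetical and only the standard library is assumed.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def mean_and_variance(n, p):
    """Mean and variance computed from the PMF, not from np and np(1-p)."""
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * f for x, f in enumerate(pmf))
    var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
    return mean, var

for p in (0.1, 0.5):
    mean, var = mean_and_variance(20, p)
    # Compare against the closed forms np and np(1-p)
    print(f"p={p}: mean={mean:.4f} (np={20 * p}), "
          f"variance={var:.4f} (np(1-p)={20 * p * (1 - p):.1f})")
```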

N = 20, p = 0.1 (say 10% returns)
• P(exactly 5 customers will return) = 0.03192
• P(at most 5 customers will return) = 0.9887
• P(exactly 20 customers will return) ≈ 0
• Average number of customers likely to return = 2
• Variance is 1.8

N = 20, p = 0.5 (say 50% returns)
• P(exactly 5 customers will return) = 0.0148
• P(at most 5 customers will return) = 0.0207
• P(exactly 20 customers will return) ≈ 0
• Average number of customers likely to return = 10
• Variance is 5
Let us see an example
You run a call center and want to know whether you have enough staff to handle customer calls. What will you do?
• Let us say you hire 2 staff.
• You collect data on the number of calls received every hour.
• After collecting data for a couple of days through observation, you find that you receive on average 15 calls per hour, which the two staff handle adequately. However, if you start getting more than 20 calls in an hour, you worry that you will need to plan for more staff.
• You want to find the probability of receiving more than 20 calls in an hour.

This calls for the Poisson distribution.
Poisson Distribution
Examples
Number of order cancellations by customers at an e-commerce website in a day
Number of customer complaints at a call center in a day
Characteristics of the distribution
Events are independent of each other: the occurrence of one event does not affect the probability that another event will occur.
The average rate (events per time period) is constant.
Two events cannot occur at the same time.

The Poisson process is the model for describing randomly occurring events; by itself it is not that useful. We need the Poisson distribution to do interesting things, such as finding the probability of a given number of events in a time period.
Example 5.32: Using Excel’s Poisson
Distribution Function
• With
the probability that X = 1 is

• The probability that


The Probability Mass Function (PMF) of Poisson Distribution
The PMF
$$P(X = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}, \quad k = 0, 1, 2, \ldots$$

where λ is the rate of occurrence of events per unit of time.

Mean of the Poisson distribution = λ

Standard deviation = √λ
More examples
Suppose a given call center receives an average of 10 calls per hour. We can use a Poisson distribution calculator to find the probability that the call center receives 0, 1, 2, 3, … calls in a given hour:
P(X = 0 calls) = 0.00005
P(X = 1 call) = 0.00045
P(X = 2 calls) = 0.00227
P(X = 3 calls) = 0.00757

For example, suppose a given website receives an average of 20 visitors per hour. We can use the 
Poisson distribution calculator to find the probability that the website receives more than a certain number
of visitors in a given hour:
P(X > 25 visitors) = 0.11218
P(X > 30 visitors) = 0.01347
P(X > 35 visitors) = 0.00080
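Binomial vs Poisson Distribution
The Binomial and Poisson distributions are both discrete probability distributions. In some circumstances the distributions are very similar: when the number of trials n is large and the success probability p is small, a Binomial(n, p) distribution is closely approximated by a Poisson distribution with λ = np.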
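One way to see how similar the two distributions can be is to compare their PMFs side by side. The sketch below reuses the fraud setting (n = 50 transactions, p = 0.02) and compares it with a Poisson distribution whose rate is set to λ = np = 1; choosing λ = np is the standard matching of means, and the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 50, 0.02
lam = n * p  # match the Poisson mean to the binomial mean

for k in range(5):
    b = binom_pmf(k, n, p)
    q = poisson_pmf(k, lam)
    print(f"k={k}: binomial={b:.5f}  poisson={q:.5f}  diff={abs(b - q):.5f}")

# Largest gap between the two PMFs over all possible counts
max_diff = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))
```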
Expected Value: Airline Revenue Management
• Full and discount airfares are available for a flight.
• Full-fare ticket costs $560.
• Discount ticket costs $400.
• X = ticket price paid
• p = 0.75 (the probability of selling a full-fare ticket)
• E[X] = 0.75($560) + 0.25($0) = $420, taking the revenue to be $0 if the full-fare ticket is not sold
• The airline should not discount full-fare tickets because the expected value of a full-fare ticket is greater than the cost of a discount ticket.
• Break-even point: $400 = p($560), so p ≈ 0.714
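The comparison can be sketched in a few lines of Python. The sketch assumes that a seat held for a full-fare buyer earns $560 with probability p and $0 otherwise, and that the break-even probability is the p at which the expected full-fare revenue equals the discount fare; both function names are hypothetical.

```python
def expected_full_fare_revenue(full_fare, p_sell):
    """Expected revenue of holding a seat for a full-fare buyer.

    Assumption: the seat earns full_fare with probability p_sell
    and nothing otherwise.
    """
    return p_sell * full_fare + (1 - p_sell) * 0

def break_even_probability(full_fare, discount_fare):
    """The p at which p * full_fare equals the discount fare."""
    return discount_fare / full_fare

ev = expected_full_fare_revenue(560, 0.75)
p_star = break_even_probability(560, 400)

print(f"Expected full-fare revenue: ${ev:.2f}")
print(f"Break-even probability: {p_star:.3f}")
print("Hold for full fare" if ev > 400 else "Sell at discount")
```

With p = 0.75 above the break-even probability, holding the seat for a full-fare sale has the higher expected revenue under these assumptions.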
Variance of a Discrete Random Variable
• The variance, Var[X], of a discrete random variable X is a weighted average of the squared deviations from the expected value:
Var[X] = Σ_i (x_i − E[X])² f(x_i), where f(x_i) is the probability of the value x_i
Computing the Variance of a Random
Variable
• Rolling two dice
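A minimal sketch of this computation, assuming two fair six-sided dice and taking X to be the sum of the two faces: the PMF is built by enumerating all 36 equally likely outcomes, and the variance is the probability-weighted average of squared deviations from the expected value. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in counts.items()}

# Expected value: probability-weighted average of the outcomes.
mean = sum(s * f for s, f in pmf.items())

# Variance: probability-weighted average of squared deviations.
var = sum((s - mean) ** 2 * f for s, f in pmf.items())

print(mean, var)  # 7 35/6
print(float(var))
```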
Questions?
