Statistics Prob 2020
Definition
• Experiment: An experiment is any activity from which
results are obtained. A random experiment is one in which
the outcomes, or results, cannot be predicted with certainty.
• Examples:
toss a coin twice
What Is Probability?
• Probabilities often deal with events
• The probability of an event is given by the
ratio of how often that event occurs and how
often all events occur
Probability
• Probability is a numerical measure of the
likelihood that an event will occur.
• Probability values are always assigned on a
scale from 0 to 1.
• A probability near 0 indicates an event is very
unlikely to occur.
• A probability near 1 indicates an event is
almost certain to occur.
Assigning Probabilities
• Classical Method
Assigning probabilities based on the assumption
of equally likely outcomes.
• Relative Frequency Method
Assigning probabilities based on experimentation
or historical data.
• Subjective Method
Assigning probabilities based on the assignor’s
judgment.
Probability of Events
• When you roll a fair, six-sided die, each of the six
faces has an equal chance of coming up
• Thus, the probability of any single face appearing is
given by 1 (how often that event occurs) divided by 6
(the total number of events)

Event   p(Event)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
Classical Method
• If an experiment has n possible outcomes, this
method would assign a probability of 1/n to each
outcome.
• Example
Experiment: Rolling a die
Sample Space: S = {1, 2, 3, 4, 5, 6}
Probabilities: Each sample point has a 1/6
chance of occurring.
Probability of Events
• What is the probability of not rolling a 1?
• There are five events that are not 1
• There is a total of six events
• The probability of not rolling a 1 is 5 / 6
• p(not 1) = 5 / 6

Event   p(Event)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
Relative Frequency Method
Number of Activities   Frequency (Number of Students)
0                      8
1                      20
2                      12
3                      6
4                      3
5                      1

Find the probability that a student participated in:
a. at least one activity
b. three or more activities
c. exactly two activities
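The relative frequency method applied to this table can be sketched in a few lines of Python. This is only an illustration: the names `freq` and `prob` are not from the slides, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Frequency table from the slide: number of activities -> number of students
freq = {0: 8, 1: 20, 2: 12, 3: 6, 4: 3, 5: 1}
total = sum(freq.values())  # total number of students surveyed


def prob(condition):
    """Relative frequency of students whose activity count satisfies `condition`."""
    favourable = sum(n for k, n in freq.items() if condition(k))
    return Fraction(favourable, total)


p_at_least_one = prob(lambda k: k >= 1)   # part a
p_three_or_more = prob(lambda k: k >= 3)  # part b
p_exactly_two = prob(lambda k: k == 2)    # part c

print(p_at_least_one, p_three_or_more, p_exactly_two)
```

Each answer is simply (number of students in the category) / (total number of students).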
Subjective Method
• We can use any data available as well as our
experience and intuition, but ultimately a
probability value should express our degree of
belief that the experimental outcome will
occur.
The Addition Rule
• The addition rule is used to determine the
probability that at least one of two or more events occurs
– E.g., what is the probability that an odd number will
appear on the die?
• For mutually exclusive events, the addition rule
is:
p(A or B) = p(A) + p(B)
• Two events are mutually exclusive when both
cannot occur at the same time
The Addition Rule
• An “odd” event occurs whenever the die
comes up 1, 3, or 5
• These events are mutually exclusive
– E.g., if it comes up 3, it cannot also be 1 or 5
• p(1 or 3 or 5) = p(1) + p(3) + p(5) = 1/6 + 1/6 +
1/6 = .5

Event   p(Event)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
The Addition Rule
• When events are not mutually exclusive, a
different addition rule must be used
– When events are not mutually exclusive, one or
more of the events can occur at the same time
• Are the events “liking cats” and “liking dogs”
mutually exclusive?

               Like Cats   Dislike Cats
Like Dogs      4           3
Dislike Dogs   2           5
The Addition Rule
• The two events, “liking dogs” and “liking cats”,
are not mutually exclusive
• It is possible for a person to like both dogs
and cats (4 people in the table do)

               Like Cats   Dislike Cats
Like Dogs      4           3
Dislike Dogs   2           5
The Addition Rule
• When events are not mutually exclusive, the
addition rule is given by:
• p(A or B) = p(A) + p(B) - p(A and B)
• p(A and B) is the probability that both event A
and event B occur simultaneously
• This formula can always be used as the addition
rule because p(A and B) equals zero when the
events are mutually exclusive
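As a rough check of the general addition rule, the cats and dogs table from the earlier slides can be plugged in. The sketch below is illustrative only; the dictionary layout and variable names are assumptions, not part of the slides.

```python
from fractions import Fraction

# Counts from the "Like Dogs / Like Cats" table on the earlier slides
counts = {
    ("like_dogs", "like_cats"): 4,
    ("like_dogs", "dislike_cats"): 3,
    ("dislike_dogs", "like_cats"): 2,
    ("dislike_dogs", "dislike_cats"): 5,
}
total = sum(counts.values())

p_dogs = Fraction(sum(n for (d, _), n in counts.items() if d == "like_dogs"), total)
p_cats = Fraction(sum(n for (_, c), n in counts.items() if c == "like_cats"), total)
p_both = Fraction(counts[("like_dogs", "like_cats")], total)

# General addition rule: p(A or B) = p(A) + p(B) - p(A and B)
p_either_rule = p_dogs + p_cats - p_both

# Direct count of everyone who likes dogs, cats, or both
liked_at_least_one = total - counts[("dislike_dogs", "dislike_cats")]
p_either_direct = Fraction(liked_at_least_one, total)

print(p_either_rule, p_either_direct)
```

Both routes give the same value, which is the point of subtracting p(A and B): the 4 people who like both animals would otherwise be counted twice.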
The Addition Rule
• Why do we subtract p(A and B)?
• When the events are not mutually exclusive, some
events that are A are also B
• Those events are counted twice in p(A) + p(B)
• Subtracting p(A and B) removes the second counting
of the events that are both A and B
[Venn diagram: two overlapping circles, p(A) and p(B); the overlap is p(A and B)]
The Multiplication Rule
• To determine the probability of two (or more)
independent events occurring simultaneously,
one uses the multiplication rule
• p(A and B) = p(A) × p(B)
• Note: This formula can supply the p(A and B) term
in the previous addition rule, but only if the events
are independent
Independent Events
• Two events are said to be independent if the
occurrence of one event in no way influences
the occurrence of the other event
• That is, knowing something about whether one
event has occurred tells you nothing about
whether the other event has occurred
– E.g., flipping a coin twice
– E.g., being struck by lightning and having green
eyes
The Multiplication Rule
• What is the probability of rolling two dice and
having both show 6?
• p(6 and 6) = p(6) × p(6) = 1/6 × 1/6 = 1/36

Event   p(Event)
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
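The multiplication rule for two dice can be cross-checked by listing all 36 equally likely outcomes. This is a minimal sketch; the variable names are illustrative and the dice are assumed fair and independent.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (first die, second die) for two fair dice
outcomes = list(product(range(1, 7), repeat=2))

# Classical method: count the outcomes in which both dice show 6
double_six = [o for o in outcomes if o == (6, 6)]
p_enumerated = Fraction(len(double_six), len(outcomes))

# Multiplication rule for independent events: p(A and B) = p(A) * p(B)
p_rule = Fraction(1, 6) * Fraction(1, 6)

print(p_enumerated, p_rule)
```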
• Of 1000 assembled components, 10 have a working
defect and 20 have a structural defect. No component
has both defects. What is the probability that a
component picked at random is free from defects?
• The probability that Ram will get a plumbing
contract is 2/3. The probability that he will not get
an electrical contract is 5/9. The probability of getting
at least one contract is 4/5. What is the probability
that he will get both contracts?
• The probability that A will solve a problem is
3/7. The probability that B will solve it is 7/15. What
is the probability that the problem gets solved?
• What is the probability that the problem is not
solved?
• A husband and wife appear for an interview
for 2 vacancies. The probability that the husband will be
selected is 1/7 and that the wife will be selected is 1/5.
What is the probability that
• Both will be selected
• Only one will be selected
• None of them will be selected
• The odds for A speaking the truth are 3:2. The
odds against B speaking the truth are 3:5. In
what percentage of cases will A and B
contradict each other?
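One way to approach the odds problem is to convert each set of odds into a probability and then combine the two "contradiction" cases with the addition and multiplication rules. The sketch below assumes A and B speak independently (the slide does not say so explicitly); the helper names are hypothetical.

```python
from fractions import Fraction


def prob_from_odds_for(a, b):
    """Odds of a:b in favour of an event correspond to probability a / (a + b)."""
    return Fraction(a, a + b)


def prob_from_odds_against(a, b):
    """Odds of a:b against an event correspond to probability b / (a + b)."""
    return Fraction(b, a + b)


p_a_truth = prob_from_odds_for(3, 2)      # A tells the truth
p_b_truth = prob_from_odds_against(3, 5)  # B tells the truth

# They contradict each other when exactly one of them tells the truth.
# The two cases are mutually exclusive, so their probabilities add.
p_contradict = p_a_truth * (1 - p_b_truth) + (1 - p_a_truth) * p_b_truth

print(p_contradict, float(p_contradict) * 100)
```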
Joint and Marginal Probabilities
• A joint probability is the probability of two (or
more) events happening together
– E.g. The probability that a person likes statistics
and likes cats
• A marginal probability is the probability of just
one of those events
– E.g. probability of liking statistics
Joint and Marginal Probabilities
Each cell shows the count, with the probability (count / 18) in parentheses.

                Male        Female (Not Male)   Marginal Probability
Engineers       3 (.167)    1 (.056)            4 (.222)
Non-Engineers   10 (.556)   4 (.222)            14 (.778)
Marginal        13 (.722)   5 (.278)            18
Probability
Conditional Probability
• If two events are not independent of each other,
then knowing whether one event occurred
changes the probability that the other event
might occur
– E.g., knowing that a person is an introvert decreases
the probability that you will find the person in a
social situation
• Conditional probabilities give the probability of
one event given that another event has
occurred
Conditional Probability
• The conditional probability of event B
occurring given that event A has occurred is
given by:
• p(B|A) = p(A and B) / p(A)
• The values of p(A and B) and p(A) can be read
directly from the table of joint and marginal
probabilities
Conditional Probability
• What is the probability that a person is a male
given that the person is an engineer?
• p(male | engineer) = p(male and engineer) / p(engineer)
• .167 / .222 = .75

                Male        Female (Not Male)   Marginal Probability
Engineers       3 (.167)    1 (.056)            4 (.222)
Non-Engineers   10 (.556)   4 (.222)            14 (.778)
Marginal        13 (.722)   5 (.278)            18
Probability
Conditional Probability
• What is the probability that a person is an engineer
given that the person is a male?
• p(engineer | male) = p(engineer and male) / p(male) =
.167 / .722 = .231

                Male        Female (Not Male)   Marginal Probability
Engineers       3 (.167)    1 (.056)            4 (.222)
Non-Engineers   10 (.556)   4 (.222)            14 (.778)
Marginal        13 (.722)   5 (.278)            18
Probability
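The two conditional probabilities above can be reproduced from the raw counts in the engineer table. The following is a sketch under that assumption; the function `p` and the category labels are illustrative names, not taken from the slides.

```python
from fractions import Fraction

# Counts from the joint and marginal probability table (18 people in total)
counts = {
    ("engineer", "male"): 3,
    ("engineer", "female"): 1,
    ("non_engineer", "male"): 10,
    ("non_engineer", "female"): 4,
}
total = sum(counts.values())


def p(job=None, sex=None):
    """Joint probability if both arguments are given, marginal if only one is."""
    matching = sum(
        n for (j, s), n in counts.items()
        if (job is None or j == job) and (sex is None or s == sex)
    )
    return Fraction(matching, total)


# p(B | A) = p(A and B) / p(A)
p_male_given_engineer = p(job="engineer", sex="male") / p(job="engineer")
p_engineer_given_male = p(job="engineer", sex="male") / p(sex="male")

print(float(p_male_given_engineer), round(float(p_engineer_given_male), 3))
```

Note that the two results differ: conditioning on "engineer" and conditioning on "male" divide the same joint probability by different marginals.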
The Multiplication Rule Revisited
• The multiplication rule given before applied
only to two (or more) events that were
independent of each other
• When the events are not independent, the
multiplication rule must be revised to:
• p(A and B) = p(A) × p(B | A)
Bayes’ Theorem
• Often we begin probability analysis with initial or prior probabilities.
• Then, from a sample, special report, or a product test we obtain some
additional information.
• Given this information, we calculate revised or posterior probabilities.
• Bayes’ theorem provides the means for revising the prior probabilities.
Prior Probabilities → New Information → Application of Bayes’ Theorem → Posterior Probabilities
• The probability that an employee will get a
promotion is 0.6. The probability that an employee
will get a salary hike is 0.5. The probability that an
employee will get a promotion or a salary hike is 0.7.
What is the probability that an employee gets a
promotion given that he got a salary hike?
• There are 2 types of coins. In type 1, heads
comes up with a probability of 0.4. In type 2,
heads comes up with a probability of 0.7. From a
pile with equal numbers of type 1 and type 2 coins,
one coin is picked at random. It is tossed and
heads comes up.
• XYZ is organizing a picnic on the 22nd. The only thing that
will cancel the picnic is a thunderstorm. The weather
on the 22nd is predicted to be dry with a probability of 0.2,
moist with a probability of 0.45 and wet with a probability of
0.35. The chance of a thunderstorm on a dry day is
0.3, on a moist day is 0.6 and on a wet day is 0.8.
1. What is the probability of a thunderstorm?
2. What is the probability that it was a moist day, knowing
that the picnic was cancelled?
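The picnic problem is a direct application of the prior → new information → posterior scheme. A minimal sketch follows; the dictionaries `prior` and `likelihood` are illustrative names holding the numbers given in the problem.

```python
# Prior probabilities of each type of weather on the 22nd
prior = {"dry": 0.20, "moist": 0.45, "wet": 0.35}

# Probability of a thunderstorm given each type of weather
likelihood = {"dry": 0.3, "moist": 0.6, "wet": 0.8}

# Total probability of a thunderstorm:
# sum over weather types of p(weather) * p(storm | weather)
p_storm = sum(prior[w] * likelihood[w] for w in prior)

# Bayes' theorem: p(weather | storm) = p(weather) * p(storm | weather) / p(storm)
posterior = {w: prior[w] * likelihood[w] / p_storm for w in prior}

print(round(p_storm, 2))             # question 1
print(round(posterior["moist"], 4))  # question 2
```

The posterior probabilities sum to 1, which is a quick sanity check on the arithmetic.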
DISTRIBUTION
Frequency Distribution: A listing of the observed (actual)
frequencies of all the outcomes of an experiment that
occurred when the experiment was done.
Probability Distribution: A listing of the probabilities
of all the possible outcomes that could occur if the
experiment were done.
It can be described as:
A diagram (Probability Tree)
A table
A mathematical formula
TYPES OF PROBABILITY DISTRIBUTION
Probability Distribution
– Discrete PD: Binomial Distribution, Poisson Distribution
– Continuous PD: Normal Distribution
PROBABILITY DISTRIBUTION
Discrete Distribution: The random variable can take only
a limited number of values. Ex: No. of heads in two
tosses.
TREE DIAGRAM –
A FAIR COIN IS TOSSED TWICE
Each branch has probability ½.

1st   2nd   Possible Outcome   Probability
H     H     HH                 P(H,H) = ½ × ½ = ¼
H     T     HT                 P(H,T) = ½ × ½ = ¼
T     H     TH                 P(T,H) = ½ × ½ = ¼
T     T     TT                 P(T,T) = ½ × ½ = ¼

Probability of at least one Head?
Ans: ¼ + ¼ + ¼ = ¾
DISCRETE Probability Distribution
Tossing a coin three times:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let X represent “No. of heads”
X Frequency P (X=x)
0 1 1/8
1 3 3/8
2 3 3/8
3 1 1/8
$E(X) = \sum_{\text{all } x_i} x_i \, p(x_i)$
$Var(X) = \sum_{\text{all } x_i} \left(x_i - E(X)\right)^2 p(x_i)$
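The expected value and variance formulas can be applied to the "number of heads in three tosses" table. The sketch below rebuilds that table by enumeration, assuming a fair coin; the names `pmf`, `mean` and `variance` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses of a fair coin (8 equally likely outcomes)
outcomes = list(product("HT", repeat=3))

# Probability distribution of X = number of heads
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, len(outcomes))

# E(X) = sum of x * p(x) over all x
mean = sum(x * p for x, p in pmf.items())

# Var(X) = sum of (x - E(X))^2 * p(x) over all x
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(pmf, mean, variance)
```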
q = 1 − p = probability of failure (p = probability of success)
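Eggs are packed in boxes of 12. The probability that each
egg is broken is 0.35.
Probability that a random box contains exactly 4 broken eggs:
$P(X = 4) = \binom{12}{4} \, 0.35^4 \, 0.65^{12-4} = 495 \times 0.35^4 \times 0.65^8$
= 0.237 to 3 significant figures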
Eggs are packed in boxes of 12. The probability that each
egg is broken is 0.35.
Find the probability that a random box of eggs contains fewer than 3 broken eggs:
$P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)$
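Both egg calculations follow the binomial formula, so they can be checked numerically. This is a minimal sketch using only the standard library; `binom_pmf` is an illustrative helper name, and broken eggs are assumed independent of one another.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 12, 0.35  # 12 eggs per box, each broken with probability 0.35

# Exactly 4 broken eggs
p_exactly_4 = binom_pmf(4, n, p)

# Fewer than 3 broken eggs: P(X=0) + P(X=1) + P(X=2)
p_fewer_than_3 = sum(binom_pmf(k, n, p) for k in range(3))

print(round(p_exactly_4, 3), round(p_fewer_than_3, 3))
```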
The probability of a bomb hitting a target is 1/5. Two bombs are enough to
destroy a bridge. If six bombs are fired at the bridge, find the probability
that the bridge is destroyed. (0.345)
If 8 out of 10 ships arrive safely, find the probability that at least
one of 5 ships selected at random arrives safely. (0.9997)
MEASURES OF CENTRAL TENDENCY AND
DISPERSION FOR THE BINOMIAL DISTRIBUTION
Mean of BD: µ = np
Standard deviation of BD: σ = √(npq)
POISSON DISTRIBUTION
When there is a large number of trials, but a small
probability of success, binomial calculation becomes
impractical
If λ = mean no. of occurrences of an event per unit
interval of time/space, then the probability that it will
occur exactly ‘x’ times is given by
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, where e is Napier’s constant, e ≈ 2.7183
PRACTICE PROBLEMS – POISSON
DISTRIBUTION
On a road crossing, records show that, on average, 5 accidents
occur per month. What is the probability that 0, 1, 2, 3, 4, 5 accidents
occur in a month? (0.0067, 0.0337, 0.0842, 0.1404, 0.1755, 0.1755)
If the probability of more than 3 accidents per month exceeds 0.7, then the
road must be widened. Should the road be widened? (Yes)
It is given that 2% of the screws are defective. Use the Poisson distribution to
find the probability that a packet of 100 screws contains:
No defective screws (0.135)
One defective screw (0.271)
Two or more defective screws (0.594)
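The answers in parentheses can be reproduced from the Poisson formula. The following is a sketch with an illustrative helper `poisson_pmf`; for the screws, λ = np = 100 × 0.02 = 2 is used as the Poisson approximation to the binomial.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


# Road crossing: on average 5 accidents per month
accidents = [poisson_pmf(x, 5) for x in range(6)]  # P(0), P(1), ..., P(5)
p_more_than_3 = 1 - sum(accidents[:4])             # 1 - [P(0) + P(1) + P(2) + P(3)]
widen_road = p_more_than_3 > 0.7

# Screws: n = 100, p = 0.02, so lambda = n * p = 2
lam_screws = 100 * 0.02
p_none = poisson_pmf(0, lam_screws)
p_one = poisson_pmf(1, lam_screws)
p_two_or_more = 1 - p_none - p_one

print([round(v, 4) for v in accidents], round(p_more_than_3, 3), widen_road)
print(round(p_none, 3), round(p_one, 3), round(p_two_or_more, 3))
```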
CHARACTERISTICS OF POISSON
DISTRIBUTION
It is a discrete distribution
Occurrences are statistically independent
Mean of PD is λ = np; Standard Deviation of PD is √λ = √(np)
It is always right skewed.
PD is a good approximation to BD when n ≥ 20 and
p ≤ 0.05
NORMAL DISTRIBUTION
It is a continuous PD, i.e. the random variable can take on any
value within a given range. Ex: Height, Weight, Marks etc.
Named after the mathematician and astronomer Carl Friedrich
Gauss, so it is also called the Gaussian Distribution.
It is symmetrical and unimodal (one peak).
Since it is symmetrical, its mean, median and mode all
coincide, i.e. all three are the same.
The tails are asymptotic to the horizontal axis, i.e. the curve extends
to infinity in both directions without touching the horizontal axis.
X axis represents the random variable, like height, weight etc.
Y axis represents its probability density function.
Area under the curve gives the probability.
The total area under the curve is 1 (or 100%)
Mean = µ, SD = σ
DEFINING A NORMAL DISTRIBUTION
Only two parameters are considered: Mean &
Standard Deviation
Same Mean, Different Standard Deviations
Same SD, Different Means
Different Mean & Different Standard Deviations
AREA UNDER THE NORMAL CURVE
[Figure: Standard Normal Distribution; x-axis: Standard Score (z) from -4 to 4; areas labeled .50, .34, .135 and .025]
68-95-99.7 RULE
About 68% of the data lie within 1 SD of the mean, about 95% within 2 SD
and about 99.7% within 3 SD
STANDARD NORMAL PD
In the standard Normal PD, Mean = 0, SD = 1
$Z = \frac{x - \mu}{\sigma}$
Z = No. of std. deviations from x to the mean. Also called the Z Score
x = value of the RV
PRACTICE PROBLEMS – NORMAL
DISTRIBUTION
Mean height of Gurkhas is 147 cm with SD of 3 cm. What is
the probability of:
Height being greater than 152 cm. 4.75%
Height between 140 and 150 cm. 83.14%
Mean demand of an oil is 1000 ltr per month with SD of 250 ltr.
Birinder Singh, Assistant Professor, PCTE Ludhiana
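The height answers above can be checked without a printed normal table by computing the normal cumulative probability from the error function. This is a sketch only; `normal_cdf` is an illustrative helper, and because the z-score is not rounded to two decimals the results differ slightly from the table-based answers.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area to the left of x under a normal curve with mean mu and SD sigma."""
    z = (x - mu) / sigma  # z-score: number of SDs between x and the mean
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 147, 3  # mean and SD of height in cm, from the problem

# Height greater than 152 cm: area to the right of x = 152
p_taller = 1 - normal_cdf(152, mu, sigma)

# Height between 140 and 150 cm: difference of two left-hand areas
p_between = normal_cdf(150, mu, sigma) - normal_cdf(140, mu, sigma)

print(round(p_taller * 100, 2), round(p_between * 100, 2))
```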
Unit Normal Curve
Determining Probabilities with
the Unit Normal Distribution
• The area above a z-score of 2.59 is .0048
• The total area under the unit normal curve is 1
• The probability is .0048 / 1 = .0048
• There are approximately 5 chances in 1000 of a
randomly selected female weighing over 190 pounds
Less Than 100 Pounds
• First, convert the raw score to a z-score
• N(133, 22), raw score = 100
• z = (100 - 133) / 22 = -1.5
• Draw a unit normal distribution and put
the z-score on it
[Figure: unit normal curve with z = -1.5 marked]
Less Than 100 Pounds
• We want the area to the left of the z-score
• Because the unit normal distribution is
symmetrical, the area to the left of z = -1.5 is
the same as the area to the right of z = 1.5
[Figure: unit normal curve with z = -1.5 and z = 1.5 marked]
Less Than 100 Pounds
• Find the area to the right of z = 1.5
• Consult Normal Table
• The area corresponds to .0668
• Thus, 6.68% of randomly selected females
should weigh less than 100 pounds
[Figure: unit normal curve with z = -1.5 and z = 1.5 marked]
Less Than 100 Pounds
• Because the total area under the curve is 1, the area to
the left of any z-score plus the area to the right of the
z-score must equal 1
• Because of symmetry, the area to the left of a negative
z-score equals 1 - area to the left of the same, positive
z-score
[Figure: unit normal curve with z = -1.5 marked]
Less Than 100 Pounds
• Area to the left of z equal to -1.5 equals 1
- area to the left of z equal to 1.5
• 1 - .9332 = .0668
[Figure: unit normal curve with z = 1.5 marked]
> 150 or < 110
• This problem calls for the addition rule of
mutually exclusive events
• Determine the probability of a randomly
selected female weighing more than 150
pounds
• Determine the probability of a randomly
selected female weighing less than 110
pounds
> 150 Pounds
• Convert 150 pounds to a z-score
• z = (150 - 133) / 22 = 0.77
• Draw the distribution and z-score
• Consult a table to find the area above a
z-score of .77
• p(> 150) = .2206
[Figure: unit normal curve with z = .77 marked]
< 110 Pounds
• Convert 110 pounds to a z-score
• z = (110 - 133) / 22 = -1.05
• Draw the distribution and z-score
• Consult a table to find the area below a z-score
of -1.05
• p(< 110) = .1469
[Figure: unit normal curve with z = -1.05 marked]
p(>150 or <110)
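Since "more than 150 pounds" and "less than 110 pounds" are mutually exclusive, the two tail areas simply add: .2206 + .1469 = .3675. The sketch below checks this numerically; `normal_cdf` is an illustrative helper, and because z is not rounded to two decimals the result differs from the table-based figure in the fourth decimal place.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Area to the left of x under a normal curve with mean mu and SD sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


mu, sigma = 133, 22  # female weights in pounds, N(133, 22) from the slides

p_over_150 = 1 - normal_cdf(150, mu, sigma)  # upper tail
p_under_110 = normal_cdf(110, mu, sigma)     # lower tail

# Addition rule for mutually exclusive events
p_either = p_over_150 + p_under_110

print(round(p_over_150, 4), round(p_under_110, 4), round(p_either, 4))
```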
Sampling and Sampling
Distributions
Why Sample?
• Selecting a sample is less time-consuming than selecting every item in the population (census).
• Selecting a sample is less costly than selecting every item in the population.
• An analysis of a sample is less cumbersome and more practical than
an analysis of the entire population.
Types of Samples
Samples: Judgment, Chunk, Quota, Convenience, Simple Random, Systematic, Stratified, Cluster
Types of Samples
Probability Samples: Simple Random, Systematic, Stratified, Cluster
Simple Random Sampling
• Every individual or item from the frame has an equal chance of being selected
• Selection may be with replacement (selected individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).
• Samples obtained from table of random numbers or computer random number generators.
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
(This assumes that sampling is with replacement or sampling
is without replacement from an infinite population)
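A small simulation can illustrate these two properties of the sample mean. Everything in the sketch is an assumption made for illustration: the population is taken to be normal with μ = 133 and σ = 22 (the weights example from earlier slides), the sample size is n = 25, and the random seed is fixed so the run is repeatable.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the illustration is repeatable

mu, sigma = 133.0, 22.0  # assumed population mean and SD
n = 25                   # assumed sample size
num_samples = 20000      # number of repeated samples

# Draw many samples of size n and record each sample mean
sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to mu
sd_of_means = statistics.pstdev(sample_means)   # should be close to sigma / sqrt(n)
theoretical_se = sigma / sqrt(n)

print(round(mean_of_means, 2), round(sd_of_means, 2), theoretical_se)
```

Increasing `n` in this sketch shrinks the spread of the sample means, in line with σ/√n.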
Sampling Distributions
Properties: Normal Pop.
$\mu_{\bar{x}} = \mu$ (i.e. $\bar{x}$ is unbiased)
[Figure: Normal Population Distribution (top) and Normal Sampling Distribution of $\bar{x}$ (bottom); the sampling distribution has the same mean]
Sampling Distributions
Properties: Normal Pop.
• For sampling with replacement:
– As n increases, $\sigma_{\bar{x}}$ decreases
[Figure: sampling distributions of $\bar{x}$ for a larger sample size and a smaller sample size, both centered at μ]