Probability and Stat
Chapter Two
Summarizing of Data
The most important aspect of the distribution of a set of sample measurements is the position
of its central value, that is, a representative value about which the measurements are distributed.
It is often convenient to have a single figure that represents each group; this figure is
known as the average of the group. If the numbers in the group are arranged in order of
magnitude, the average tends to fall near the central position in the group, so averages are
called measures of central tendency. In short, any measure intended to represent the center of
a data set is called a measure of location or central tendency.
Let a data set consist of a number of observations, represented by $x_1, x_2, \ldots, x_n$, where $n$ (the last
subscript) denotes the number of observations in the data and $x_i$ is the $i$th observation. Then the
sum is written as
$$x_1 + x_2 + \cdots + x_n = \sum_{i=1}^{n} x_i$$
Similarly,
$$x_1^2 + x_2^2 + \cdots + x_n^2 = \sum_{i=1}^{n} x_i^2$$
We say a measure of central tendency is best if it possesses most of the following. It should:
Several types of averages or measures of central tendency can be defined; the most common are
- the mean
- the mode
- the median
The choice of average (measure of central tendency) depends upon which best represents the
property under discussion.
The arithmetic mean is defined as the sum of the measurements of the items divided by the total number of
items.
For the sample mean,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
For the population mean,
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$
Example 1: You measure the body lengths (in inches) of 10 infants at birth and record the
following:
17.5 19.5 17.5 19 20 21 18 19.5 18 10.75
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{17.5 + 19.5 + \cdots + 10.75}{10} = \frac{180.75}{10} = 18.075$$
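The calculation in Example 1 can be checked with a short Python sketch. The variable names (`lengths`, `total`, `x_bar`) are illustrative only and do not come from the notes.

```python
from statistics import mean

# Body lengths (in inches) of the 10 infants from Example 1.
lengths = [17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75]

total = sum(lengths)          # sum of all observations
x_bar = total / len(lengths)  # sample mean from the definition

print(total)          # sum of the observations
print(x_bar)          # mean computed by hand
print(mean(lengths))  # the same mean from the standard library
```

Both routes should agree with the hand calculation, 180.75 / 10.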
The sum of the deviations of the items from their arithmetic mean is zero. This means the
algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is zero.
That is,
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
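A short derivation of this property, using only the definition of the sample mean (so that the sum of the observations equals $n\bar{x}$), may help to see why it holds:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```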
The sum of the squares of the deviations of a set of observations from any number, say A, is
of group k, then the combined mean, denoted by $\bar{x}_c$, of all observations taken together is
given by
$$\bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i\bar{x}_i}{\sum_{i=1}^{k} n_i}$$
Example: Last year three sections took the Stat 273 course at Alemaya University. At
the end of the semester, the three sections got average marks of 80, 83 and 76. There were 28, 32
and 35 students in the sections respectively. Find the mean mark for all the students.
Solution:
$$\bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3}{n_1 + n_2 + n_3} = \frac{28(80) + 32(83) + 35(76)}{28 + 32 + 35} = \frac{7556}{95} = 79.54$$
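The combined mean is a weighted mean of the group means, with the group sizes as weights. A minimal sketch follows; the function name `combined_mean` is a hypothetical helper, not something defined in the notes.

```python
def combined_mean(sizes, means):
    """Weighted mean of group means, using group sizes as weights."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    weighted_total = sum(n * m for n, m in zip(sizes, means))
    return weighted_total / sum(sizes)

# Three sections of Stat 273 from the example above.
section_sizes = [28, 32, 35]
section_means = [80, 83, 76]

result = combined_mean(section_sizes, section_means)
print(round(result, 2))
```

Note that the simple average of 80, 83 and 76 would ignore the different section sizes, which is why the weights are needed.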
If a wrong figure has been used in calculating the mean, we can correct the mean if we know the
correct figure that should have been used. Let
$x_{wr}$ denote the wrong figure used in calculating the mean,
$x_c$ be the correct figure that should have been used, and
$\bar{x}_{wr}$ be the wrong mean calculated using $x_{wr}$;
then the correct mean, $\bar{x}_{correct}$, is given by
$$\bar{x}_{correct} = \frac{n\bar{x}_{wr} + x_c - x_{wr}}{n}$$
If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then
a) the mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$;
b) the mean of $kx_1, kx_2, \ldots, kx_n$ will be $k\bar{x}$.
Example: The average weight of 10 students was calculated to be 65 kg, but later it was
discovered that one measurement had been misread as 40 kg instead of 80 kg. Calculate the corrected
average weight.
Solution:
$$\bar{x}_{correct} = \frac{n\bar{x}_{wr} + x_c - x_{wr}}{n} = \frac{10(65) + 80 - 40}{10} = 69 \text{ kg}$$
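The correction amounts to recovering the wrong total, swapping the misread value for the right one, and dividing by n again. A small sketch, with `corrected_mean` as an illustrative (hypothetical) function name:

```python
def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Correct a mean that was computed with one misread observation."""
    wrong_total = n * wrong_mean                      # total actually used
    right_total = wrong_total - wrong_value + correct_value
    return right_total / n

# 10 students, reported mean 65 kg, 40 kg recorded instead of 80 kg.
fixed = corrected_mean(n=10, wrong_mean=65, wrong_value=40, correct_value=80)
print(fixed)
```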
Merits of Arithmetic Mean
- Arithmetic mean is rigidly defined and its value is always definite.
- It is calculated based on all observations.
- Arithmetic mean is simple to calculate and easy to understand. It doesn’t need arraying
(arranging in increasing or decreasing order) of the data.
- Arithmetic mean is also capable of further algebraic treatment.
- It affords a good standard of comparison.
Demerits of Arithmetic Mean
- It is highly affected by extreme (abnormal) observations in the series. For instance, the
monthly incomes of three boys are 37 birr, 53 birr and 48 birr and that of their father is 1026
birr. The average income becomes 291 birr, which is not at all a representative
figure.
- It can be a number which does not exist in the series.
- It sometimes gives results which appear almost absurd. For example, we may
get an average of ‘3.6 children’ per family.
- It gives greater importance to bigger items of a series and lesser importance to smaller items.
That means it is an upward-biased measure.
- It can’t be calculated for open-ended classes.
Merits of mode
- Mode is not affected by extreme values.
- Mode can be calculated even in the case of open-ended intervals, and it is not necessary to know all
observations.
Demerits of mode
- Mode may not exist in the series and if it exists it may not be a unique value.
- It does not fulfill most of the requirements of a good measure of central tendency
- It may be unrepresentative in many cases.
2.2.3 The Median
The median is the midpoint of the data array. The median is denoted by $\tilde{x}$. For ungrouped data
the median is obtained by
Demerits of median
Measures of variation are statistical measures, which provide ways of measuring the extent to
which the data are dispersed or spread out.
To compare two or more sets of data with regard to their variability
To control variability itself like in quality control, body temperature, etc.
To make further statistical analysis or to facilitate the use of other statistical measures.
In case the two sets of data are expressed in different units, however, such as quintals of sugar
versus tons of sugarcane or if the average sizes are very different such as manager’s salary versus
worker’s salary, the absolute measures of dispersion are not comparable. In such cases measures
of relative dispersion should be used.
Some types of measures of variation are discussed below.
Range (R) is defined as the difference between the largest and the smallest observation in a given
data set.
In the case of grouped data, the range is found by taking the difference between the class mark of the last
class and that of the first class. That is, $R = CM_{last} - CM_{first}$, where $CM_{last}$ and $CM_{first}$ are the
class marks of the last class and of the first class respectively.
$$RR = \frac{x_{max} - x_{min}}{x_{max} + x_{min}} = \frac{R}{x_{max} + x_{min}} \quad \text{for ungrouped data}$$
$$RR = \frac{CM_{last} - CM_{first}}{CM_{last} + CM_{first}} = \frac{R}{CM_{last} + CM_{first}} \quad \text{for grouped data}$$
- Range and relative range are easy to calculate and simple to understand.
- Both cannot be computed for grouped data with open ended classes.
- They do not tell us anything about the distribution of values in the series.
Example: Find the range and relative range for the monthly salary of ten workers in a certain
paint factory given below.
462 480 534 624 498 552 606 588 516 570
Solution:
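$x_{max} = 624$ birr, $x_{min} = 462$ birr
$R = x_{max} - x_{min} = 624 \text{ birr} - 462 \text{ birr} = 162 \text{ birr}$
$$RR = \frac{x_{max} - x_{min}}{x_{max} + x_{min}} = \frac{624 \text{ birr} - 462 \text{ birr}}{624 \text{ birr} + 462 \text{ birr}} = \frac{162 \text{ birr}}{1086 \text{ birr}} = 0.149$$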
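The same two quantities can be computed for the salary data with a few lines of Python. This is only a sketch for ungrouped data; the function name `range_and_relative_range` is an assumption made for illustration.

```python
def range_and_relative_range(values):
    """Return (R, RR) for ungrouped data."""
    largest, smallest = max(values), min(values)
    r = largest - smallest
    rr = r / (largest + smallest)  # unit-free, so comparable across data sets
    return r, rr

# Monthly salaries (birr) of the ten workers in the example.
salaries = [462, 480, 534, 624, 498, 552, 606, 588, 516, 570]

r, rr = range_and_relative_range(salaries)
print(r, round(rr, 3))
```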
Example: Find the values of the range and relative range for the following frequency
distribution, which shows the distribution of the maximum loads supported by a certain number
of cables.
Solution:
Variance is the arithmetic mean of the square of the deviation of observations from their
arithmetic mean.
Population Variance ($\sigma^2$)
For raw data
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \cdots = \frac{1}{N}\left(\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right)$$
where $\mu$ is the population arithmetic mean and $N$ is the total number of observations in the
population.
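For ungrouped FD
$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{N} = \cdots = \frac{1}{N}\left(\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right)$$
For grouped data
$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{N} = \cdots = \frac{1}{N}\left(\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right)$$
where $f_i$ is the frequency of the $i$th class and $N = \sum f_i$.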
Sample Variance ($S^2$)
For raw data
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \cdots = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right)$$
where $\bar{x}$ is the sample arithmetic mean and $n$ is the total number of observations in the sample.
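For ungrouped FD
$$S^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1} = \cdots = \frac{1}{n-1}\left(\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right)$$
where $k$ is the number of classes, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.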
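For grouped data
$$S^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1} = \cdots = \frac{1}{n-1}\left(\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right)$$
where $k$ is the number of classes, $f_i$ is the frequency of the $i$th class and $n = \sum f_i$.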
iii. The Standard Deviation
$x_i$                   5    10   12   17   Total
$(x_i - \bar{x})^2$     36   1    1    36   74
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{74}{3} = 24.67$$
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{74}{3}} = 4.97$$
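The table and the two results above can be reproduced step by step in Python. The sketch below computes the sample variance from the definition and then cross-checks it against the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

data = [5, 10, 12, 17]
n = len(data)

x_bar = sum(data) / n                              # sample mean
squared_devs = [(x - x_bar) ** 2 for x in data]    # (x_i - x_bar)^2 row
s2 = sum(squared_devs) / (n - 1)                   # sample variance
s = sqrt(s2)                                       # sample standard deviation

print(squared_devs, sum(squared_devs))
print(round(s2, 2), round(s, 2))

# statistics.variance and statistics.stdev also divide by n - 1.
print(variance(data), stdev(data))
```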
Example: Calculate the variance and the standard deviation of the temperature data.
(Assume the data are a sample.)
$$S^2 = \frac{1}{n-1}\left(\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right) = \frac{1}{49}\left(653890 - \frac{(5710)^2}{50}\right) = \frac{1}{49}(653890 - 652082) = 36.9$$
$$S = \sqrt{S^2} = \sqrt{36.9} = 6.07$$
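Since only the totals of the temperature data appear here, the sketch below works directly from those sums (n = 50, Σ f x = 5710, Σ f x² = 653890) rather than from a frequency table. The function name `sample_variance_from_sums` is a hypothetical helper.

```python
from math import sqrt

def sample_variance_from_sums(n, sum_fx, sum_fx2):
    """Shortcut formula: S^2 = (sum f x^2 - (sum f x)^2 / n) / (n - 1)."""
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Totals taken from the worked example above.
s2 = sample_variance_from_sums(n=50, sum_fx=5710, sum_fx2=653890)
s = sqrt(s2)

print(round(s2, 1), round(s, 2))
```

The shortcut form avoids computing each deviation from the mean, which is convenient when the data are given as a frequency table.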
iv. Coefficient of Variation
The standard deviation is an absolute measure of dispersion. The corresponding relative measure
is known as the coefficient of variation (CV).
Coefficient of variation is used in such problems where we want to compare the variability of
two or more than two different series. Coefficient of variation is the ratio of the standard
deviation to the arithmetic mean, usually expressed in percent.
$$CV = \frac{S}{\bar{x}} \times 100$$
where $S$ is the standard deviation of the observations.
A distribution having a smaller coefficient of variation is said to be less variable or more consistent or
more uniform or more homogeneous.
Example: Last semester, students of Biology and Chemistry Departments took Stat 273 course.
At the end of the semester, the following information was recorded.
Department            Biology   Chemistry
Mean score            79        64
Standard deviation    23        11
Compare the relative dispersions of the two departments’ scores using the appropriate way.
Solution:
Biology Department: $CV = \frac{S}{\bar{x}} \times 100 = \frac{23}{79} \times 100 = 29.11\%$
Chemistry Department: $CV = \frac{S}{\bar{x}} \times 100 = \frac{11}{64} \times 100 = 17.19\%$
Interpretation: Since the CV of Biology Department students is greater than that of Chemistry
Department students, we can say that there is more dispersion relative to the mean in the
distribution of Biology students’ scores compared with that of Chemistry students.
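The comparison can be sketched in Python as follows; `coefficient_of_variation` is an illustrative function name, and the inputs are the summary figures from the table above.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion in percent: (S / x_bar) * 100."""
    return std_dev / mean * 100

cv_biology = coefficient_of_variation(std_dev=23, mean=79)
cv_chemistry = coefficient_of_variation(std_dev=11, mean=64)

print(cv_biology, cv_chemistry)

# The department with the larger CV has more dispersion relative to its mean.
more_variable = "Biology" if cv_biology > cv_chemistry else "Chemistry"
print(more_variable)
```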
– Its unit is the square of the unit of measurement of the values. For example, if the variable is
measured in kg, the unit of variance is kg².
– It is calculated based on all the observations/data in the series.
– It gives more weight to extreme values and less to those which are near to the mean.
Standard Deviation
A standard score is a measure that describes the relative position of a single score in the entire
distribution of scores in terms of the mean and standard deviation. It also gives us the number of
standard deviations a particular observation lies above or below the mean.
Population standard score: $Z = \frac{x - \mu}{\sigma}$, where $x$ is the value of the observation, and $\mu$ and $\sigma$ are the
mean and standard deviation of the population respectively.
Sample standard score: $Z = \frac{x - \bar{x}}{S}$, where $x$ is the value of the observation, and $\bar{x}$ and $S$ are the mean
and standard deviation of the sample respectively.
Example: Two sections were given an exam in a course. The average score was 72 with standard
deviation of 6 for section 1 and 85 with standard deviation of 5 for section 2. Student A from
section 1 scored 84 and student B from section 2 scored 90. Who performed better relative to
his/her group?
Z-score of student A: $Z = \frac{x_A - \bar{x}_1}{S_1} = \frac{84 - 72}{6} = 2.00$
Z-score of student B: $Z = \frac{x_B - \bar{x}_2}{S_2} = \frac{90 - 85}{5} = 1.00$
From these two standard scores, we can conclude that student A has performed better relative to
his/her section because his/her score is two standard deviations above the mean score of
section 1, while the score of student B is only one standard deviation above the mean score of
section 2.
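A minimal sketch of the same comparison in Python, assuming a helper named `z_score` (not part of the notes):

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

z_a = z_score(84, mean=72, std_dev=6)  # student A, section 1
z_b = z_score(90, mean=85, std_dev=5)  # student B, section 2

print(z_a, z_b)
better = "A" if z_a > z_b else "B"
print(better)
```

Although student B has the higher raw score, the standard scores put each result on a common scale relative to its own section.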
Chapter 3
Elementary Probability
Introduction
• Probability theory is the foundation upon which the logic of inference is built.
• It helps us to cope with uncertainty.
• In general, probability is the chance of an outcome of an experiment. It is the measure of how
likely an outcome is to occur.
ii) Objective probability: - the probability of an event in a certain experiment based on
experimental evidence.
3.1 Deterministic and Non deterministic models
3.2 Review of set theory: sets, union, intersection, complementation, De Morgan’s rules
(Reading assignment)
Probability: - the chance (likelihood) of occurrence of an event. It is expressed by a numerical
value between 0 and 1 inclusive.
Finite sample space: a sample space in which the number of outcomes of the experiment is finite.
Equally likely outcomes: - outcomes such that each outcome in the sample space has the same chance of
occurring.
Example: In throwing a fair die all possible outcomes are equally likely. That means the
elements of the sample space all have the same chance of occurring.
2. Multiplication rule: - in a sequence of k events in which the first event has $n_1$ possibilities, the second
event has $n_2$ possibilities, and so on up to the kth event with $n_k$ possibilities, the total number of
possibilities of the sequence will be $n_1 \times n_2 \times \cdots \times n_k$.
Example: The digits 0, 1, 2, 3, and 4 are to be used in a 4-digit identification card. How many
different cards are possible if
a. repetitions are permitted?
b. repetitions are not permitted?
Solutions
a.
1st digit   2nd digit   3rd digit   4th digit
5           5           5           5
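Both parts of the identification card example can be checked by brute-force enumeration. The sketch below assumes, as the table for part a suggests, that a card may start with the digit 0; the variable names are illustrative.

```python
from itertools import permutations, product

digits = "01234"

# a. Repetitions permitted: every position can take any of the 5 digits.
with_repetition = sum(1 for _ in product(digits, repeat=4))

# b. Repetitions not permitted: ordered selections of 4 distinct digits.
without_repetition = sum(1 for _ in permutations(digits, 4))

print(with_repetition, 5 * 5 * 5 * 5)
print(without_repetition, 5 * 4 * 3 * 2)
```

The enumerated counts agree with the multiplication rule: 5 × 5 × 5 × 5 with repetition and 5 × 4 × 3 × 2 without.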
c) The number of permutations of n objects in which $n_1$ are alike, $n_2$ are alike, ..., $n_k$ are alike is
$$\frac{n!}{n_1!\,n_2!\cdots n_k!}$$
Example: How many different permutations can be made from the letters in the word
“CORRECTION”?
$$\frac{n!}{n_1!\,n_2!\cdots n_k!} = \frac{10!}{2!\,2!\,2!\,1!\,1!\,1!\,1!} = 453600 \text{ permutations}$$
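The count for “CORRECTION” can be reproduced by tallying the repeated letters and dividing n! by the factorial of each tally. A small sketch, with `distinct_permutations` as a hypothetical helper name:

```python
from collections import Counter
from math import factorial

def distinct_permutations(word):
    """n! divided by the factorial of each letter's multiplicity."""
    counts = Counter(word)
    result = factorial(len(word))
    for multiplicity in counts.values():
        result //= factorial(multiplicity)  # exact at every step
    return result

print(Counter("CORRECTION"))  # C, O and R each appear twice
print(distinct_permutations("CORRECTION"))
```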
Note: 0! =1! =1
Example: A photographer wants to arrange 3 persons in a row for a photograph. How many
different types of photographs are possible?
Solution:
Assume the 3 persons are Aster (A), Lemma (L) and Yared (Y), so n = 3.
Since n! = 3! = 3*2! = 6, or 3P3 = 3!/(3−3)! = 3!/0! = 6, there are 6 possible arrangements: ALY,
AYL, LAY, LYA, YLA and YAL.
Example 2: Fifteen athletes including Haile were entered in a race.
a) In how many different ways could prizes for the first, the second and the third place be
awarded?
b) How many of the triplets counted above have Haile in the first position?
Solution:
a) 15 objects taken 3 at a time: 15P3 = 15!/(15−3)! = 2730
b) There are 14P2 = 14!/(14−2)! = 182
4. Combination: - a counting technique in which the order of the objects is immaterial, that is, the selection
of r objects from a collection of n objects, where r ≤ n, without regard to order. The number of combinations
of n objects taken r at a time is given by
nCr = n!/((n−r)! r!)
Example: In a club containing 7 members a committee of 3 people is to be formed. In how many
ways can the committee be formed?
Solution: 7C3 = 7!/((7−3)! 3!) = 35
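The permutation and combination counts in the examples above can be checked with Python's `math.perm` and `math.comb` (available from Python 3.8). The variable names below are illustrative.

```python
from math import comb, factorial, perm

photo_arrangements = perm(3, 3)   # 3 persons in a row
prize_awards = perm(15, 3)        # 1st, 2nd, 3rd among 15 athletes
haile_first = perm(14, 2)         # remaining two places among 14 athletes
committees = comb(7, 3)           # order does not matter

print(photo_arrangements, prize_awards, haile_first, committees)

# nPr counts each unordered selection r! times, so nCr = nPr / r!.
print(perm(7, 3) // factorial(3) == committees)
```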
3.6 Definitions of probability / approaches of probability
i. Classical approach: - Uses the sample space to determine the numerical probability that an event
will happen. If there are n equally likely outcomes of an experiment, and out of the n outcomes
event A occurs f times, then the probability of the event A, denoted by P(A), is defined as
P(A) = n(A)/n(S) = f/n
- If total number of outcomes is infinite or if it is not possible to enumerate all elements of the
sample space.
- If each outcome is not equally likely
Example: A fair die is tossed once. What is the probability of getting
a) the number 4?
b) an odd number?
c) an even number?
Solutions:
First identify the sample space, say S = {1, 2, 3, 4, 5, 6}
n(S) = 6
a) A = {4}, n(A) = 1, P(A) = n(A)/n(S) = 1/6
b) A = {1, 3, 5}, n(A) = 3, P(A) = n(A)/n(S) = 3/6 = 0.5
c) A = {2, 4, 6}, n(A) = 3, P(A) = n(A)/n(S) = 3/6 = 0.5
In other words, given a frequency distribution, the probability of an event A being in a given
class is
P(A) = (frequency of the class) / (total frequency in the distribution)
Example: The National Center for Health Statistics reported that of every 539 deaths in recent
years, 24 resulted from automobile accidents, 182 from cancer, and 333 from other diseases.
What is the probability that a particular death is due to an automobile accident?
1. P(A) ≥ 0
2. P(S) = 1, where S is the sure event.
3. If A and B are mutually exclusive events, then the probability that either A or B occurs equals
the sum of the two probabilities: P(A ∪ B) = P(A) + P(B)
4. P(A’) = 1 − P(A)
5. 0 ≤ P(A) ≤ 1
6. P(φ) = 0, where φ is the impossible event.
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Example: A fair die is thrown twice. Calculate the probability that the sum of spots on the faces
of the die that turn up is divisible by 2 or 3.
Solution:
S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),(3,1),
(3,2),(3,3),(3,4),(3,5),(3,6),(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}
This sample space has 6*6 = 36 elements. Let A be the event that the sum of the spots on the die is
divisible by 2 and B be the event that the sum of the spots on the die is divisible by 3. Then
P(A or B) = P(A ∪ B)
= P(A) + P(B) − P(A ∩ B)
= 18/36 + 12/36 − 6/36 = 24/36 = 2/3
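The counts 18, 12 and 6 used in this solution can be verified by enumerating the 36 outcomes. The following is a sketch using exact fractions; the names `space`, `A`, `B` and `prob` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing a fair die twice.
space = list(product(range(1, 7), repeat=2))

A = {o for o in space if sum(o) % 2 == 0}  # sum divisible by 2
B = {o for o in space if sum(o) % 3 == 0}  # sum divisible by 3

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(space))

p_union = prob(A | B)
print(len(A), len(B), len(A & B))
print(p_union, prob(A) + prob(B) - prob(A & B))
```

Counting the union directly and applying the addition rule give the same probability.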
A. i.e. P(A/B) = P(A ∩ B)/P(B) = P(A)*P(B)/P(B) = P(A), since P(A ∩ B) = P(A/B)*P(B)
= P(A)*P(B)
Example: Suppose that an office has 100 calculating machines. Some of them use electric power
(E) while others are manual (M), and some machines are new (N) while others are used (U). The
table below gives the number of machines in each category. A person enters the office, picks a
machine at random and discovers that it is new. What is the probability that it operates with
electric power?
        E     M     Total
N       40    30    70
U       20    10    30
Total   60    40    100
Solution: $P(E/N) = \frac{P(E \cap N)}{P(N)} = \frac{40/100}{70/100} = \frac{4}{7}$
If A and B are any two events of a sample space such that P(A) ≠ 0 and P(B) ≠ 0, then
P(A ∩ B) = P(A)P(B/A) = P(B)P(A/B)
$$P(E) = \sum_{i=1}^{n} P(E_i)\,P(E/E_i)$$
Theorem 2: Let {E1, E2, ..., En} be a partition of the sample space S, and suppose each of E1, E2, ..., En
has non-zero probability, that is, P(Ei) ≠ 0 for i = 1, 2, …, n. Let E be any event with P(E) > 0.
Then for each integer k, 1 ≤ k ≤ n, we have
$$P(E_k/E) = \frac{P(E_k \cap E)}{\sum_{i=1}^{n} P(E_i \cap E)} = \frac{P(E_k)\,P(E/E_k)}{\sum_{i=1}^{n} P(E_i)\,P(E/E_i)}$$
Example: Suppose that three machines A1, A2 and A3 produce 60%, 30%, and 10%
respectively of the total production, and that the percentages of defective items produced by these
machines are 2%, 4%, and 6% respectively.
i. If an item is selected at random, find the probability that the item is defective.
ii. Assuming that an item selected at random is found to be defective, find the probability that
the item was produced on machine A1.
Solution: Let B be the event of selecting a defective item at random and let E1, E2, E3 be the events
that an item is produced on machines A1, A2, A3 respectively. Then
P(B/E1) = 2% = 0.02, P(B/E2) = 4% = 0.04 and P(B/E3) = 6% = 0.06
P(B) = P(B ∩ [E1 ∪ E2 ∪ E3])
= P([B ∩ E1] ∪ [B ∩ E2] ∪ [B ∩ E3])
= P(B ∩ E1) + P(B ∩ E2) + P(B ∩ E3)
= P(E1)*P(B/E1) + P(E2)*P(B/E2) + P(E3)*P(B/E3)
= 0.6*0.02 + 0.3*0.04 + 0.1*0.06
= 0.03
We use Bayes’ formula:
$$P(E_1/B) = \frac{P(E_1 \cap B)}{P(B)} = \frac{P(E_1)\,P(B/E_1)}{\sum_{i=1}^{n} P(E_i)\,P(B/E_i)} = \frac{0.6 \times 0.02}{0.03} = 0.4$$
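Both parts of the machine example follow the same pattern: sum prior times conditional probability over the partition, then divide one term by that sum. A sketch in Python, where the dictionary names `priors` and `defect_rates` are assumptions made for illustration:

```python
# P(E_i): share of total production from each machine.
priors = {"A1": 0.60, "A2": 0.30, "A3": 0.10}
# P(B/E_i): probability that an item from each machine is defective.
defect_rates = {"A1": 0.02, "A2": 0.04, "A3": 0.06}

# Total probability: P(B) = sum of P(E_i) * P(B/E_i).
p_defective = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' formula: P(E_i/B) = P(E_i) * P(B/E_i) / P(B).
posterior = {m: priors[m] * defect_rates[m] / p_defective for m in priors}

print(round(p_defective, 4))
print({m: round(p, 4) for m, p in posterior.items()})
```

The posterior probabilities over the three machines should add up to 1, which is a quick sanity check on the calculation.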
iii. Total Probability Theorem:
Let {E1, E2, ..., En} be a partition of the sample space S. Then for any event A of the same
probability space,
$$P(A) = \sum_{i=1}^{n} P(A \cap E_i) = \sum_{i=1}^{n} P(A|E_i)\,P(E_i)$$
$P(A|B) = P(A)$, $P(B|A) = P(B)$.
Notation: Let the probabilities of success and failure be p and q respectively: P(success) = P(s) = p
and P(failure) = P(f) = q, where q = 1 − p.
Definition: Let X be the number of successes in n repeated binomial trials with probability of
success p on each trial. Then the probability distribution of the discrete random variable X is
called the binomial distribution.
Let p = the probability of success and
q = 1 − p = the probability of failure on any given trial.
A binomial random variable with parameters n and p represents the number of successes r in n
independent trials, when each trial has probability p of success.
a) $P(X=0) = \frac{3!}{0!\,(3-0)!}\left(\frac{1}{2}\right)^{0}\left(\frac{1}{2}\right)^{3-0} = \frac{1}{8}$
b) P (X=2) =
2. The probability that a student entering a college will graduate is 0.4. Determine the probability
that out of 5 students (a) none, (b) one, (c) at least one, (d) at most three will graduate.
Solution: Let X be the number of students who will graduate.
X = {0, 1, 2, 3, 4, 5}, p = 0.4, q = 1 − p = 0.6
a) P(none will graduate) = $P(X=0) = \frac{5!}{0!\,(5-0)!}(0.4)^0(0.6)^5 = 0.08$
b) P(one will graduate) = $P(X=1) = \frac{5!}{1!\,(5-1)!}(0.4)^1(0.6)^4 = 0.26$
c) P(at least one will graduate) = P(X ≥ 1)
= P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5)
= 1 − P(X<1) = 1 − P(X=0)
= $1 - \frac{5!}{0!\,(5-0)!}(0.4)^0(0.6)^5$
= 1 − 0.08 = 0.92
d) P(at most three will graduate) = P(X ≤ 3)
= 1 − P(X>3)
= 1 − [P(X=4) + P(X=5)]
= $1 - \left[\frac{5!}{4!\,(5-4)!}(0.4)^4(0.6)^1 + \frac{5!}{5!\,(5-5)!}(0.4)^5(0.6)^0\right]$
= 0.91296
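The four probabilities in this example can be recomputed with a small binomial probability function. The sketch below uses the same formula as the worked solution; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.4

p_none = binomial_pmf(0, n, p)
p_one = binomial_pmf(1, n, p)
p_at_least_one = 1 - p_none
p_at_most_three = sum(binomial_pmf(x, n, p) for x in range(4))  # x = 0, 1, 2, 3

print(round(p_none, 5), round(p_one, 4))
print(round(p_at_least_one, 5), round(p_at_most_three, 5))

# The mean of the distribution should equal n * p.
print(sum(x * binomial_pmf(x, n, p) for x in range(n + 1)))
```

The unrounded value for part (c) is 0.92224; the 0.92 in the solution comes from rounding P(X = 0) to 0.08 first.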
If X is a binomial random variable with parameters n and p, then
1. E(X) = np
2. Var(X) = npq
9.1.2 Poisson distribution
- It is a discrete probability distribution which is used in the area of rare events such as the
number of car accidents in a day, the arrival of telephone calls over an interval of time, the number
of misprints on a typed page, natural disasters like earthquakes, etc.
- The expected number of occurrences of events can be estimated from past trials (records).
- The number of successes or events occurring in a given region or time interval is
independent of the number occurring in another.
Definition: Let X be the number of occurrences in a Poisson process and λ be the
average number of occurrences of an event in a unit length of interval. The probability function for
the Poisson distribution is
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
Remarks
The Poisson distribution possesses only one parameter, λ.
If X has a Poisson distribution with parameter λ, then E(X) = λ and
Var(X) = λ, i.e. E(X) = Var(X) = λ.
$\sum_{x=0}^{\infty} P(X = x) = 1$
Example 1: A company manufacturing light bulbs discovers from past experience that 2 defective
bulbs are manufactured per 30 working hours. What is the probability that 4 defective bulbs will be
manufactured in 30 working hours?
Solution: Let X be the random variable denoting the number of defective bulbs, with λ = 2.
$$P(X = 4) = \frac{e^{-2}\,2^{4}}{4!} = 0.09$$
Example 2: In a small city, 10 accidents took place in a period of 50 days. Find the probability that
there will be
a) two accidents in a day
b) three or more accidents in a day
Solution: In 50 days we have 10 accidents, so the number of accidents per day is
10/50 = 0.2, that is, λ = 0.2.
Let X be the random variable denoting the number of accidents per day.
X ~ Poisson(λ = 0.2), X = 0, 1, 2, …
a) $P(X = 2) = \frac{e^{-0.2}(0.2)^{2}}{2!} = 0.0164$
b) P(X ≥ 3) = P(X=3) + P(X=4) + P(X=5) + ...
= 1 − [P(X=0) + P(X=1) + P(X=2)], because $\sum_{x=0}^{\infty} P(X=x) = 1$
= 1 − [0.8187 + 0.1637 + 0.0164]
= 0.0012
3. a) Referring to Example 1, what is the expected number of defective light bulbs in 30 working
hours? What about the variance?
b) Referring to Example 2, find the mean and the variance of the number of accidents in a day.
Solution: a) E(X) = Var(X) = λ = 2
b) E(X) = Var(X) = λ = 0.2
Example 3: Suppose the number of typographical errors on a single page of your book has a
Poisson distribution with parameter λ = 1/2. Calculate the probability that there is at least one
error on this page.
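The Poisson examples above can be checked numerically with the probability function used in the worked solutions, $e^{-\lambda}\lambda^{x}/x!$. The sketch below is one way to do this; `poisson_pmf` and the `ex…` variable names are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

# Example 1: four defective bulbs when lambda = 2.
ex1 = poisson_pmf(4, 2)

# Example 2: lambda = 0.2 accidents per day.
ex2_a = poisson_pmf(2, 0.2)
ex2_b = 1 - sum(poisson_pmf(x, 0.2) for x in range(3))  # P(X >= 3)

# Example 3: at least one typographical error when lambda = 1/2.
ex3 = 1 - poisson_pmf(0, 0.5)

print(round(ex1, 4), round(ex2_a, 4))
print(round(ex2_b, 5), round(ex3, 4))
```

For part (b) of Example 2 the unrounded result is slightly below the 0.0012 obtained by hand, because the hand calculation subtracts probabilities that were already rounded to four decimals.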
Work sheet
2. Define independent events.
4. Last year three sections took the Stat 273 course at Alemaya University. At the
end of the semester, the three sections got average marks of 80, 83 and 76. There were
28, 32 and 35 students in the sections respectively. Find the mean mark for all the
students.
5. For a student enrolling as a freshman at a certain university, the probability is 0.25 that he/she will
get a scholarship and 0.75 that he/she will graduate. If the probability is 0.2 that he/she will both get a
scholarship and graduate, what is the probability that a student who gets a scholarship will
graduate?