
Haramaya University

College of Computing and Informatics


Department of Statistics

Probability and Statistics

Teshome Kebede Dheressa

© August 2016
Contents

1 Introduction 1
1.1 History and Definition of Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Classification of Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Application of Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Uses of Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.6 Measurement Scales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7 Methods of Data Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.8 Data Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 Summarizing Data 10
2.1 Measures of Central Tendency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Types of MCT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 Median . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.3 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Measures of Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Types of Measures of Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.1 Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.2 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.3 Standard Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.4 Coefficient of Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3 Introduction to Probability 16
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Concept of Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.4 Counting Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.5 Approaches of Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.6 Probability Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.7 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.8 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.9 Partition and Bayes' Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4 One Dimensional Random Variables 29


4.1 Type of Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Probability and Statistics tashe.zgreat@gmail.com

4.2 Expectation of Random Variables and Its Properties . . . . . . . . . . . . . . . . . 30


4.2.1 Properties of Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.3 Variance of Random Variables and Its Properties . . . . . . . . . . . . . . . . . . . 31
4.3.1 Properties of Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

5 Two Dimensional Random Variables 32


5.1 Two Dimensional Discrete Random Variables . . . . . . . . . . . . . . . . . . . . . 32
5.2 Two Dimensional Continuous Random Variables . . . . . . . . . . . . . . . . . . . 33
5.3 Marginal and Conditional Distributions . . . . . . . . . . . . . . . . . . . . . . . . 34
5.4 Conditional Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.5 Independence of Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

6 Special Probability Distributions 37


6.1 Binomial Probability Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.2 Poisson Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.3 Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

1. Introduction

1.1. History and Definition of Statistics

All of us are familiar with statistics in everyday life. As a discipline of study and research it has a
short history, but as numerical information it has a long antiquity. Various documents of ancient
times contain numerical information about countries (states), their resources and the composition
of their people. This explains the origin of the word statistics as a factual description of a state.
The term ‘statistics’ is derived from the Latin word status, meaning state; historically, statistics
referred to the display of facts and figures relating to the demography of states or countries.
Generally, it can be defined in two senses: plural (as statistical data) and singular (as statistical
methods).

Plural sense: Statistics is a collection of facts (figures). This meaning of the word is widely used
when reference is made to facts and figures on sales, employment or unemployment, accidents,
weather, deaths, education, etc. In this sense the word statistics simply means data. But
not all numerical data are statistics.

Singular sense: Statistics is the science that deals with the methods of data collection, organization,
presentation, analysis and interpretation. It refers to the subject area concerned with extracting
relevant information from available data with the aim of making sound decisions.
According to this meaning, statistics is concerned with the development and application of
methods and techniques for collecting, organizing, presenting, analyzing and interpreting
statistical data.

According to the singular sense definition of statistics, a statistical study (statistical investigation)
involves five stages: collection of data, organization of data, presentation of data, analysis of data
and interpretation of data.

1. Collection of Data: This is the first stage in any statistical investigation and involves the
process of obtaining (gathering) a set of related measurements or counts to meet predetermined
objectives. The data collected may be primary data (data collected directly by the investigator)
or it may be secondary data (data obtained from intermediate sources such as newspapers,
journals, official records, etc).

2. Organization of Data: It is usually not possible to derive any conclusion about the
main features of the data from direct inspection of the observations. The second purpose of
statistics is describing the properties of the data in a summary form. This stage of statistical


investigation helps to have a clear understanding of the information gathered and includes
editing (correcting), classifying and tabulating the collected data in a systematic manner.
Thus, the first step in the organization of data is editing, which means correcting (adjusting)
omissions, inconsistencies, irrelevant answers and wrong computations in the collected data.
The second step is classification, that is, arranging the collected data according to some
common characteristic. The last step is tabulation: presenting the classified data in tabular
form, using rows and columns.

3. Presentation of Data: The purpose of data presentation is to have an overview of what the
data actually looks like, and to facilitate statistical analysis. Data presentation can be done
using graphs and diagrams which have great memorizing effect and facilitates comparison.

4. Analysis of Data: The analysis of data is the extraction of summarized and comprehensive
numerical description in order to reach conclusions or provide answers to a problem. The
problem may require simple or sophisticated mathematical expressions.

5. Interpretation of Data: This is the last stage of statistical investigation. Interpretation
involves drawing valid conclusions from the data collected and analyzed in order to make
rational decisions.

1.2. Classification of Statistics

Based on the scope of the decision making, statistics can be classified into two: Descriptive and
Inferential Statistics.

Descriptive Statistics: refers to the procedures used to organize and summarize masses of data.
It is concerned with describing or summarizing the most important features of the data. It
deals only with the characteristics of the collected data, without attempting to infer (conclude)
anything that goes beyond the data themselves.

The methodology of descriptive statistics includes the methods of organizing (classification,
tabulation, frequency distributions) and presenting (graphical and diagrammatic presentation)
data, and the calculation of certain indicators of data, like measures of central tendency and
measures of variation, which summarize some important features of the data.

Inferential Statistics: includes the methods used to find out something about a population,
based on the sample. It is concerned with drawing statistically valid conclusions about
the characteristics of the population based on information obtained from sample. In this
form of statistical analysis, descriptive statistics is linked with probability theory in order
to generalize the results of the sample to the population. Performing hypothesis testing,
determining relationships between variables and making predictions are also inferential
statistics.

Examples: Classify the following statements as descriptive and inferential statistics.

(a) The average age of the students in this class is 21 years.

(b) Of the students enrolled in Haramaya University in this year 74% are male and 26% are
female.


(c) The chance of winning the Ethiopian National Lottery in any day is 1 out of 167000.

(d) It has been continuously raining in Harar from Monday to Friday. It will continue to rain
in the weekend.

1.3. Application of Statistics

In this modern time, statistical information plays a very important role in a wide range of fields.
Today statistics is applied in almost all fields of human endeavor.

In Scientific Research: Statistics plays an important role in the collection of data through
efficiently designed experiments, in testing hypotheses and estimation of unknown parameters,
and in interpretation of results.

In Industry: Statistical techniques are used to improve and maintain the quality of manufactured
goods at a desired level. Statistical methods help to check whether a product satisfies a given
standard.

In Business: Statistical methods are employed to forecast future demand for goods, to plan for
production, and to evolve efficient management techniques to maximize profit.

In Medicine: Principles of design of experiments are used in screening of drugs and in clinical
trials. The information supplied by a large number of biochemical and other tests is
statistically assessed for diagnosis and prognosis of disease. The application of statistical
techniques has made medical diagnosis more objective by combining the collective wisdom
of the best possible experts with the knowledge on distinctions between diseases indicated
by tests. Besides, statistical methods are used for the computation and interpretation of birth
and death rates.

In Courts of Law: Statistical evidence in the form of probability of occurrence of certain events
is used to supplement the traditional oral and circumstantial evidence in judging cases.

There seems to be no human activity whose value cannot be enhanced by injecting statistical ideas
into planning, and by using statistical methods for efficient analysis of data and assessment of
results for feedback and control.

1.4. Uses of Statistics

• To reduce and summarize masses of data and to present facts in numerical and
definite form. Statistics condenses and summarizes a large mass of data and presents facts
into a few presentable, understandable and precise numerical figures. The raw data, as is
usually available, is voluminous and haphazard. It is generally not possible to draw any
conclusions from the raw data as collected. Hence it is necessary and desirable to express
these data in a few numerical values.

• To facilitate comparison. Statistical devices such as averages, percentages, ratios, etc are
used for this purpose.


• For determining functional relationships between two or more phenomena.
Statistical techniques such as correlation analysis assist in establishing the degree of association
between two or more variables.

• For formulating and testing hypotheses. For instance, hypothesis like whether a new
medicine is effective in curing a disease, whether there is an association between variables
can be tested using statistical tools.

• For forecasting. Statistical methods help in studying past data and predicting future
trends.

1.5. Variable

A variable is any phenomenon or attribute that can assume different values. The most important
distinguishing feature of a variable is that it varies; that is, it can take on different values.
Based on the values they assume, variables can be classified as

1. Qualitative variables: A qualitative variable has values that are intrinsically nonnumerical
(categorical).

Example: Gender, Religion, Color of automobile, etc.

2. Quantitative variables: A quantitative variable has values that are intrinsically numerical.
Example: Height, Family size, Weight, etc.

• Discrete variable: takes whole number values and consists of distinct recognizable
individual elements that can be counted. It is a variable that assumes a finite or
countable number of possible values. These values are obtained by counting (0, 1, 2, ...).
Example: Family size, Number of children in a family, number of cars at the traffic
light.

• Continuous variable: takes any value including decimals. Such a variable can
theoretically assume an infinite number of possible values. These values are obtained
by measuring.

Example: Height, Weight, Time, Temperature, etc.

Generally the values of a variable can be obtained either by counting for discrete
variables, by measuring for continuous variables or by making categories for qualitative
variables.

Example: Classify each of the following as qualitative and quantitative and if it is quantitative
classify as discrete and continuous.

1. Color of automobiles in a dealer’s show room.

2. Number of seats in a movie theater.

3. Classification of patients based on nursing care needed (complete, partial or self-care).

4. Number of tomatoes on each plant on a field.

5. Weight of newly born babies.


1.6. Measurement Scales

The level of measurement is one way in which variables can be classified. Broadly, this relates to
the level of information content implicit in the set of values and how each value may be interpreted
(mathematically) relative to other values on the variable - an issue which dictates how the variable
can be used and interpreted in statistical analysis. Consider the following illustrations.

• Mr A wears shirt number 5 when he plays football and Mr B wears shirt number 6.

Who plays better?

What is the average shirt number?

• Mr A scored 5 in Statistics quiz and Mr B scored 6 in Statistics quiz.

Who did better?


What is the average score?

Based on the numbers on the shirts it is not possible to judge whether Mr B plays better. But
using the test scores, it is possible to judge that Mr B did better in the exam. Also, it is not
possible to find a meaningful average shirt number, because the numbers on the shirts are simply
codes; but it is possible to obtain the average test score. Therefore, the scale of measurement

• shows the information contained in the value of a variable.

• shows what mathematical operations and what statistical analyses are permissible
on the values of the variable.

Different measurement scales allow for different levels of exactness, depending upon the characteristics
of the variables being measured. The four types of scales available in statistical analysis are

1. Nominal Scale variables are qualitative variables whose values are categories of individuals.
They reflect classification into categories (names of groups) where there is no particular order
or qualitative difference between the labels. Numbers may be assigned to the values simply for
coding purposes; it is not possible to compare individuals based on the numbers assigned to
them. The only mathematical operation permissible on these variables is counting. These
variables

• have mutually exclusive (non-overlapping) and exhaustive categories.

• have no ranking or order between (among) the values of the variable.

Example: Gender (Male, Female), Political Affiliation (Labour, Conservative, Liberal),
Ethnicity (White, Black, Asian, Other), etc.

2. Ordinal Scales of variables are also those qualitative variables whose values can be ordered
and ranked. Ranking and counting are the only mathematical operations to be done on the
values of the variables. But there is no precise difference between the values (categories) of
the variable.

Example: Academic Rank (BSc, MSc, PhD), Grade Scores (A, B, C, D, F), Strength (Very
Weak, Weak, Strong, Very Strong), Health Status (Very Sick, Sick, Cured), Economic Status
(Lower Class, Middle Class, Higher Class), etc.

3. Interval Scale variables are quantitative variables for which a value of zero does not indicate
absence of the characteristic, i.e. there is no true zero. For example, for temperature measured
in degrees Celsius, the difference between 5℃ and 10℃ is treated the same as the difference
between 10℃ and 15℃. However, we cannot say that 20℃ is twice as hot as 10℃, i.e. the
ratio between two values has no quantitative meaning. This is because there is no absolute
zero on the Celsius scale; 0℃ does not imply ‘no heat’.

4. Ratio Scale variables are quantitative variables for which a value of zero indicates absence
of the characteristic. All mathematical operations are permissible on the values of these
variables.

For instance, a zero unemployment rate implies zero unemployment. Thus, we can also
legitimately say an unemployment rate of 20 percent is twice a rate of 10 percent or one
person is twice as old as another. In the case of temperature, we can use the Kelvin scale
instead of the Celsius scale: the Kelvin scale is a ratio scale because 0 Kelvin is ‘absolute
zero’ (-273℃) and this does imply no heat.

1.7. Methods of Data Collection

The first and foremost task in statistical investigation is data collection. Before data collection,
four important points should be considered.

• Purpose of data collection (why we need to collect data?),

• The data to be collected (what kind of data to be collected?),

• The source of data (where we can get the data?),

• The methods of data collection (how can we collect this data?).

Primary data can be collected through:

• Experimental methods in laboratory in natural sciences and through survey method in social
sciences.

• Survey methods

– Observational method

– Interview (personal interview or telephone interview)

– Questionnaire (mail or email questionnaire)

1.8. Data Organization

In order to describe situations, draw conclusions or make inferences about the population, or even
to describe the sample, the collected data must be organized in some meaningful way. The most
convenient way of organizing data is to construct a frequency distribution. A Frequency Distribution
is the organization of raw data in table form, using classes and frequencies.


Definitions

• Class: is a description of a group of similar numbers in a data set.

• Frequency: is the number of times a variable value is repeated.

• Class Frequency: the number of observations belonging to a certain class.

There are three types of frequency distribution:

1. Categorical FD: the data is qualitative i.e. either nominal or ordinal. Each category of the
variable represents a single class and the number of times each category repeats represents
the frequency of that class (category).

Example: The blood type of 24 students is given below:

A B B AB O A O O B AB B A
B B O A O AB A O O O AB O

2. Ungrouped Frequency Distribution: A FD of numerical (quantitative) data in which
each value of the variable represents a single class (the values are not grouped) and the
number of times each value repeats represents the frequency of that class.

Example: Number of children for 21 families.

2 3 5 4 3 3 2
3 1 0 4 3 2 2
1 1 1 4 2 2 2

Construct ungrouped frequency distribution.
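As a minimal sketch (plain Python; the variable names are illustrative), an ungrouped frequency distribution is just a count of how often each distinct value occurs:

```python
from collections import Counter

# Number of children for the 21 families in the example above
data = [2, 3, 5, 4, 3, 3, 2,
        3, 1, 0, 4, 3, 2, 2,
        1, 1, 1, 4, 2, 2, 2]

# Each distinct value forms its own class; its count is the class frequency
freq = Counter(data)
for value in sorted(freq):
    print(f"{value}: {freq[value]}")
```

For instance, the value 2 occurs 7 times, so its class frequency is 7.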

3. Grouped Frequency Distribution: A FD of numerical data in which several values of


a variable are grouped into one class. The number of observations belonging to the class is
the frequency of the class.

Example: Consider age group and number of persons:


Class Limits Class Boundaries Frequency
1-25 0.5-25.5 20
26-50 25.5-50.5 15
51-75 50.5-75.5 25
76-100 75.5-100.5 10
Total 70
Class Limits: the lowest and highest values that can be included in a class are called class
limits.

Class Boundaries: are the values that remove the gap between the UCL of one class and the
LCL of the next class; they are obtained by subtracting half the unit of measurement from
each LCL and adding it to each UCL.

Class Width: the difference between the UCB and LCB of a class. It is also the difference
between the lower limits of two consecutive classes, or the difference between the upper limits
of two consecutive classes.

w = UCB − LCB = LCLi − LCLi−1 = UCLi − UCLi−1 = CMi − CMi−1


For the above example, w = 25.5 − 0.5 = 26 − 1 = 50 − 25 = 25.

Class Mark: is the halfway point between the class limits or the class boundaries.

CMi = (LCLi + UCLi)/2 = (LCBi + UCBi)/2
Relative Frequency: is the ratio of class frequency to the total frequency (total number
of observations).

Percentage Frequency: Relative frequency × 100.

Cumulative Frequency: is the sum of frequencies (total number of observations) below or
above a certain value.

Less than Cumulative Frequency: is the total number of values of a variable below a
certain UCB.

More than Cumulative Frequency: is the total number of values of a variable above a
certain LCB.
Class Limits Class Boundaries Frequency LCF MCF
1-25 0.5-25.5 20 20 20+15+25+10=70
26-50 25.5-50.5 15 20+15=35 15+25+10=50
51-75 50.5-75.5 25 20+15+25=60 25+10=35
76-100 75.5-100.5 10 20+15+25+10=70 10
Total 70
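The LCF and MCF columns above can be generated mechanically; a minimal Python sketch (variable names are illustrative):

```python
# Frequencies from the age-group example above
freqs = [20, 15, 25, 10]
total = sum(freqs)  # 70

# Less-than cumulative frequency: running total from the first class down
lcf = []
running = 0
for f in freqs:
    running += f
    lcf.append(running)

# More-than cumulative frequency: what is left from this class onward
mcf = []
remaining = total
for f in freqs:
    mcf.append(remaining)
    remaining -= f

print(lcf)  # [20, 35, 60, 70]
print(mcf)  # [70, 50, 35, 10]
```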

Construction of Grouped Frequency Distribution

(a) Arrange the data in an array form (increasing or decreasing order).

(b) Find the Unit of Measurement (U). U is the smallest difference between any two distinct
values of the data.

(c) Find the Range(R). R is the difference between the largest and the smallest values of
the variable.

(d) Determine the number of classes (k) using Sturge’s rule.

k = 1 + 3.322 log N

where N is the total number of observations.

(e) Specify the class width (w):

w = R/k = R/(1 + 3.322 log N)

(f) Put the smallest value of the data set as the LCL of the first class. To obtain the LCL
of the second class add the class width w to the LCL of the first class. Continue adding
until you get k classes.


Let X be the smallest observation.

LCL1 = X
LCLi = LCLi−1 + w for i = 2, 3, ..., k.

Obtain the UCLs of the frequency distribution by adding w − U to the corresponding
LCLs.

UCLi = LCLi + (w − U) for i = 1, 2, ..., k.

(g) Generate the class boundaries.

LCBi = LCLi − U/2 and UCBi = UCLi + U/2 for i = 1, 2, ..., k.

Example: Mark of 50 students out of 40.

16 21 26 24 11 17 25 26 13 27 24 26 3 27 23 24 15 22 22 12 22 29 18 22 28
25 7 17 22 28 19 23 23 22 3 19 13 31 23 28 24 9 20 33 30 23 20 8 21 24

Construct grouped frequency distribution for the given data set.

Solution:

The array form of the data (increasing order).

3 3 7 8 9 11 12 13 13 15 16 17 17 18 19 19 20 20 21 21 22 22 22 22 22 22
23 23 23 23 23 24 24 24 24 24 25 25 26 26 26 27 27 28 28 28 29 30 31 33
U = 9 − 8 = 1, R = L − S = 33 − 3 = 30
k = 1 + 3.322 log N = 1 + 3.322 log 50 = 6.64 ≈ 7
w = R/k = 30/6.64 = 4.5 ≈ 5
w−U =5−1=4
Class Limits Class Boundaries Class Mark Frequency
3-7 2.5-7.5 5 3
8-12 7.5-12.5 10 4
13-17 12.5-17.5 15 6
18-22 17.5-22.5 20 13
23-27 22.5-27.5 25 17
28-32 27.5-32.5 30 6
33-37 32.5-37.5 35 1
Total 50
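The steps above can be sketched in Python for the marks data (a rough illustration; `math.ceil` stands in for the "round up" convention used in the solution):

```python
import math

marks = [16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15,
         22, 22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19, 23, 23, 22,
         3, 19, 13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24]

U = 1                                        # unit of measurement (integer data)
R = max(marks) - min(marks)                  # range: 33 - 3 = 30
k_raw = 1 + 3.322 * math.log10(len(marks))   # Sturges' rule: about 6.64
k = math.ceil(k_raw)                         # 7 classes
w = math.ceil(R / k_raw)                     # class width: 30/6.64 rounded up to 5

lcl = min(marks)
freqs = []
for _ in range(k):
    ucl = lcl + (w - U)                      # UCL = LCL + (w - U)
    f = sum(lcl <= x <= ucl for x in marks)  # class frequency
    freqs.append(f)
    print(f"{lcl}-{ucl}  ({lcl - U/2}-{ucl + U/2})  freq {f}")
    lcl += w                                 # next LCL = previous LCL + w
```

This reproduces the table above: frequencies 3, 4, 6, 13, 17, 6, 1.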

2. Summarizing Data

The first step in looking at data is to describe the data at hand in some concise way. To get a
better sense of the data, we use numerical measures: certain numbers that give special insight
into the values. Two types of numerical measures are important in statistics: measures of central
tendency and measures of variation. Each of these individual measures can provide information
about the entire set of data.

2.1. Measures of Central Tendency

Objectives

• To condense a mass of data into one single value.

• To facilitate comparison.

Desirable Properties of Good MCT

• It should be calculated based on all observations.

• It should not be affected by extreme values.

• It should be unique.

• It should always exist.

• It should be easy to understand and calculate.

2.2. Types of MCT

A measure of central tendency is a number around which the values of a data set tend to cluster
(the “middle” of the data). Three such middle numbers are the mean, the median, and the mode.

2.2.1. Mean

The (arithmetic) mean of a set of values is the number obtained by adding the values and dividing
the total by the number of values. For a sample of n observations x1 , x2 , ..., xn the sample mean


is denoted by x̄ and calculated as follows:

x̄ = (x1 + x2 + ... + xn)/n = Σ xi / n
For a frequency array (ungrouped frequency distribution),

x̄ = Σ fi xi / Σ fi

where fi is the corresponding frequency of each class. For the case of a grouped frequency distribution,
it becomes

x̄ = Σ fi mi / Σ fi
where mi is the class mark of the corresponding class.
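A small Python sketch of both formulas (the raw data values here are illustrative; the grouped data reuse the marks example from Chapter 1):

```python
# Sample mean for raw data: sum of values over n
raw = [16, 21, 26, 24, 11]
mean_raw = sum(raw) / len(raw)              # (16+21+26+24+11)/5 = 19.6

# Mean of a grouped FD: class marks m_i weighted by frequencies f_i
class_marks = [5, 10, 15, 20, 25, 30, 35]   # from the marks example
freqs = [3, 4, 6, 13, 17, 6, 1]
mean_grouped = sum(f * m for f, m in zip(freqs, class_marks)) / sum(freqs)
print(mean_raw, mean_grouped)               # 19.6 and 20.9
```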

2.2.2. Median

The median of a data set is the middle value when the values are arranged in order of increasing
(or decreasing) magnitude. To find the median, first sort the values (arrange them in order), then
use one of these procedures.

1. If the number of values is odd, the median is the number located in the exact middle
of the list.

x̃ = ((n + 1)/2)th value

2. If the number of values is even, the median is found by computing the mean of the two
middle numbers.

x̃ = [(n/2)th value + (n/2 + 1)th value] / 2

3. For grouped frequency distributions the median is given by the formula

x̃ = Lx̃ + ((n/2 − Fx̃−1) / fx̃) w

where

Lx̃ is the lower class boundary of the median class
Fx̃−1 is the less than cumulative frequency just before the median class
w is the class width of the median class
fx̃ is the frequency of the median class and n = Σ fi.

Note:

• The median class is the class which includes the (n/2)th value.

• The median is not influenced by extreme values. It can be calculated for frequency distributions
with open-ended classes, and it can even be located if the data are incomplete.
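The grouped-median formula can be sketched as a small Python function (a hypothetical helper, applied here to the marks example from Chapter 1):

```python
def grouped_median(boundaries, freqs):
    """Median of a grouped FD: L + ((n/2 - F) / f) * w."""
    n = sum(freqs)
    cum = 0                                   # LCF just before current class
    for (lcb, ucb), f in zip(boundaries, freqs):
        if cum + f >= n / 2:                  # class containing the (n/2)th value
            return lcb + (n / 2 - cum) / f * (ucb - lcb)
        cum += f

# Class boundaries and frequencies from the marks example
b = [(2.5, 7.5), (7.5, 12.5), (12.5, 17.5), (17.5, 22.5),
     (22.5, 27.5), (27.5, 32.5), (32.5, 37.5)]
f = [3, 4, 6, 13, 17, 6, 1]
print(grouped_median(b, f))   # 17.5 + (25 - 13)/13 * 5, about 22.12
```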


2.2.3. Mode

The mode of data set is the value that occurs most frequently. When two values occur with the
same greatest frequency, each one is a mode and the data set is bimodal. When more than two
values occur with the greatest frequency, each is a mode and the data set is said to be multimodal.
When no value is repeated, we say that there is no mode.

Example 1: Find the modes of the following data sets.

• 5 5 5 3 1 5 1 4 3 5

• 1 2 2 2 3 4 5 6 6 6 7 9

• 1 2 3 6 7 8 9 10

In a frequency distribution, the mode is located in the class with the highest frequency, and that
class is the modal class. Then the formula for the mode is

x̂ = Lx̂ + ((fx̂ − fx̂−1) / ((fx̂ − fx̂−1) + (fx̂ − fx̂+1))) w

where

Lx̂ is the lower class boundary of the modal class


fx̂ is the frequency of modal class
fx̂−1 is the frequency of the class which precedes the modal class
fx̂+1 is the frequency of the class which is successor of the modal class
w is the class width of the modal class.

The mode is not affected by extreme values and can be calculated for open-ended classes. But it
often does not exist, and its value may not be unique.

Example 2: The following table shows a frequency distribution of grades on a final examination
in college algebra.

Grade No of students
30-39 1
40-49 3
50-59 11
60-69 21
70-79 43
80-89 32
90-99 9
Then, obtain the mean, median and mode of the given data set and interpret the results.
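As a hedged sketch of how the three grouped formulas combine on this table (Python; the names are illustrative, and the class boundaries 29.5–39.5, ..., 89.5–99.5 are assumed from the grade limits):

```python
lcb = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]   # lower class boundaries
f = [1, 3, 11, 21, 43, 32, 9]                       # frequencies, n = 120
w = 10
n = sum(f)

# Mean: class marks weighted by frequencies
mean = sum(fi * (b + w / 2) for fi, b in zip(f, lcb)) / n

# Median: locate the class containing the (n/2)th value
cum, i = 0, 0
while cum + f[i] < n / 2:
    cum += f[i]
    i += 1
median = lcb[i] + (n / 2 - cum) / f[i] * w

# Mode: the modal class is the one with the highest frequency
m = f.index(max(f))
mode = lcb[m] + (f[m] - f[m-1]) / ((f[m] - f[m-1]) + (f[m] - f[m+1])) * w

print(round(mean, 1), round(median, 1), round(mode, 1))   # 74.0 75.1 76.2
```

All three land in the 70–79 class, i.e. a typical student scored in the mid-seventies.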

2.3. Measures of Variation

The degree to which numerical data tend to spread about an average value is called the dispersion,
or variation, of the data. Dispersion or variation may be defined as the extent of scatter of values
around a measure of central tendency.


Objectives

• To have an idea about the reliability of the measure of central tendency.

• To compare two or more sets of data with regard to their variability.

• To provide information about the structure of the data.

• To pave way to the use of other statistical measures.

2.4. Types of Measures of Variation

• Absolute Measures of Variation: A measure of variation is said to be absolute
when it shows the actual amount of variation of an item from a measure of central tendency,
and it is expressed in the same concrete units as the data.

• Relative Measures of Variation: A relative measure of variation is the quotient obtained


by dividing the absolute measure by a quantity in respect to which absolute deviation has
been computed. It is a pure number and used for making comparisons between different
distributions.

Absolute Measures Relative Measures


Range Coefficient of Range
Quartile Deviation Coefficient of Quartile Deviation
Mean Deviation Coefficient of Mean Deviation
Variance Coefficient of Variation
Standard Deviation Standard Scores

Before giving the details of these measures of dispersion, it is worthwhile to point out that a
measure of dispersion (variation) is to be judged on the basis of all those properties of good
measures of central tendency. Hence, their repetition is superfluous.

2.4.1. Range

The simplest measure of variability is the range.

Range = M ax − M in

Although the range is the easiest of the measures of variability to compute, it is seldom used as
the only measure. The reason is that the range is based on only two of the observations and thus
is highly influenced by extreme values.

2.4.2. Variance

The variance is a measure of variability that utilizes all the data. The variance is based on the
difference between the value of each observation (xi ) and the mean. If the data are for a population,


the average of the squared deviations is called the population variance (σ²):

σ² = Σ(xᵢ − µ)² / N

In most statistical applications, the data being analyzed are for a sample. The sample variance
s² is the estimator of the population variance σ²:

s² = Σ(xᵢ − x̄)² / (n − 1)

2.4.3. Standard Deviation

The standard deviation is defined to be the positive square root of the variance. The sample
standard deviation s is the estimator of the population standard deviation σ. Following the
notation we adopted for the sample variance and the population variance,

s = √(s²)

σ = √(σ²)

The standard deviation is easier to interpret than the variance because the standard deviation is
measured in the same units as the data. For a sample of n elements, the sample variance (s²) for
grouped data is calculated by using the formula

s² = Σ fᵢ(mᵢ − x̄)² / (n − 1)

where mᵢ is the class mark of the corresponding class.
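As a quick numerical check of the ungrouped formulas above, here is a minimal Python sketch (the data values are illustrative, not taken from the text):

```python
import math

def sample_variance(data):
    # s^2 = sum((x_i - xbar)^2) / (n - 1)
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

def sample_sd(data):
    # the standard deviation is the positive square root of the variance
    return math.sqrt(sample_variance(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative sample
print(sample_variance(data))      # 32/7, about 4.571
print(sample_sd(data))            # about 2.138
```

The same two functions extend directly to the grouped-data formula by weighting each class mark with its frequency.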

2.4.4. Coefficient of Variation

The coefficient of variation is a relative measure of variability; it measures the standard deviation
relative to the mean:

CV = (s / x̄) × 100%

For example, suppose we found a sample mean of 44 and a sample standard deviation of 8. The coefficient
of variation is (8/44) × 100% = 18.2%. In words, the coefficient of variation tells us that the sample
standard deviation is 18.2% of the value of the sample mean. In general, the coefficient of variation
is a useful statistic for comparing the variability of variables that have different standard deviations
and different means.

Example 1: The following table shows the frequency distribution of heights (recorded to the
nearest inch) of 100 male students at XYZ University.


Height (inches)   No. of students
60-62             5
63-65             18
66-68             42
69-71             27
72-74             8
Find the standard deviation and coefficient of variation and interpret the results.
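A minimal Python sketch of the grouped-data calculation for Example 1, using the class midpoints 61, 64, 67, 70, 73 as the class marks mᵢ:

```python
import math

# class marks (midpoints) of the classes 60-62, 63-65, 66-68, 69-71, 72-74
marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]
n = sum(freqs)  # 100 students

xbar = sum(f * m for f, m in zip(freqs, marks)) / n  # grouped mean: 67.45
s2 = sum(f * (m - xbar) ** 2 for f, m in zip(freqs, marks)) / (n - 1)
s = math.sqrt(s2)     # standard deviation, in inches (about 2.93)
cv = s / xbar * 100   # coefficient of variation, in percent (about 4.35)

print(xbar, s, cv)
```

The small CV (roughly 4.35%) says the heights are tightly clustered around the mean.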

3

Introduction to Probability

The primary objective of this chapter is to develop a sound understanding of probability values,
which we will build upon in the subsequent chapters. A secondary objective is to develop the basic
skills necessary to solve simple probability problems.

3.1. Introduction

Probability is a numerical description of the chance of occurrence of a given phenomenon under certain
conditions. Probability theory plays a central role in statistics. After all, statistical analysis
is applied to a collection of data in order to discover something about the underlying events.
These events may be connected to one another, but the individual outcomes involved are
assumed to be random. Alternatively, we may sample a population at random and make inferences
about the population as a whole from the sample by using statistical analysis. Therefore, a solid
understanding of probability theory - the study of random events - is necessary to understand how
statistical analysis works and also to correctly interpret the results.

3.2. Concept of Set

In order to discuss the theory of probability, it is essential to be familiar with some ideas and
concepts of the mathematical theory of sets. A set is a collection of well-defined objects and is
denoted by capital letters like A, B, C, etc.

In describing which objects are contained in set A, two common methods are available. These
methods are:

1. Listing all objects of A. For example, A = {1, 2, 3, 4} describes the set consisting of the
positive integers 1, 2, 3 and 4.

2. Describing a set in words. For example, set A consists of all real numbers between 0 and 1,
inclusive. It can be written as A = {x : 0 ≤ x ≤ 1}; that is, A is the set of all x's where x is
a real number between 0 and 1, inclusive.

If A = {a₁, a₂, ..., aₙ}, then each object aᵢ, i = 1, 2, ..., n, belonging to set A is called a member
or an element of set A, i.e., aᵢ ∈ A. A set consisting of all possible elements under consideration is
called the universal set (denoted by U). On the other hand, a set containing no element is called an
empty set (denoted by ∅ or {}).


If every element of set A is also an element of set B, A is said to be a subset of B, written
A ⊂ B. Every set is a subset of itself, i.e., A ⊂ A. The empty set is a subset of every set. If A ⊂ B
and B ⊂ C, then A ⊂ C. If A ⊂ B and B ⊂ A, then A and B are said to be equal.

Set Operation

1. Union (Or): The set consisting of all elements in A or B or both is called the union of A and
B, written A ∪ B. That is, A ∪ B = {x : x ∈ A or x ∈ B}. The set A ∪ B is
also called the sum of A and B.

2. Intersection (And): The set consisting of all elements in both A and B is called the intersection
of A and B, written A ∩ B. That is, A ∩ B = {x : x ∈ A and x ∈ B}. The intersection
of A and B is also called the product of A and B.

3. Complement (Not): The complement of a set A, denoted by Aᶜ, is the set consisting of all
elements of U that are not in A; i.e., Aᶜ = {x : x ∈ U and x ∉ A}.

4. Disjoint Sets: Sets A and B are disjoint if A ∩ B = ∅.

5. Relative Complement: The relative complement of B in A, denoted by A\B, is the set of all
elements of A which are not in B. It is written as A\B = {x : x ∈ A and x ∉ B} = A ∩ Bᶜ.

Important Laws

• Commutative laws:

– A∪B =B∪A

– A∩B =B∩A

• Associative laws:

– A ∪ (B ∪ C) = (A ∪ B) ∪ C

– A ∩ (B ∩ C) = (A ∩ B) ∩ C

• Distributive laws:

– A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

– A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

• Identity laws:

– A ∪ A = A, A ∩ A = A

– A ∪ U = U, A ∩ U = A

– A ∪ ∅ = A, A ∩ ∅ = ∅

3.3. Basic Concepts

1. Experiment (ξ): is any statistical process that can be repeated several times and whose
outcome in any single trial is unpredictable. Some examples, with their sample spaces:


• Tossing a coin only once, S = {Head (H), Tail (T)}

• Tossing a coin two times, S = {HH, HT, T H, T T }

• Rolling a die, S = {1, 2, 3, 4, 5, 6}

• Selecting an item from a production lot, S = {Defective, Non-defective}

• Introducing a new product, S = {Success, Failure}

2. Sample Space (S): is the set consisting of all possible outcomes of a given experiment, ξ.

3. Event: is an outcome or a set of outcomes (having some common characteristic) of an
experiment.

Simple Event (Elementary Event): is an event consisting of a single outcome. The
elementary events are the building blocks (or atoms) of a probability model. They are
the events that cannot be decomposed further into smaller sets of events.

Compound Event: is an event consisting of two or more outcomes.

4. Independent Event: two or more events are independent if the occurrence of one event
has no effect on the probability of occurrence of the other.

5. Mutually Exclusive Events: two or more events are mutually exclusive, if they have no
outcome in common. They cannot occur together simultaneously.

6. Complementary Event: Two mutually exclusive events are complementary if there are
no common elements between themselves and both of them contain all possible outcomes.
To be complementary, first they should be mutually exclusive events.

3.4. Counting Rules

Counting techniques are mathematical methods used to determine the number of possible
ways of arranging or ordering objects. They make it possible to fix the size of a sample
space that is too large to enumerate directly. To count the possible outcomes of a sample space
and/or an event we use the following counting techniques.

Addition Rule: states that if a task can be accomplished by any one of k procedures,
where the iᵗʰ procedure has nᵢ alternatives, then the total number of ways of doing the task is

n₁ + n₂ + ... + nₖ = Σᵢ₌₁ᵏ nᵢ

Example: Suppose a lady wants to make a journey from Harar to Dire Dawa. She can
use a plane, a bus, a cycle or a horse, and there are 3 flights, 4 buses, 2 cycles and 3 horses
available. In how many different ways can she make her journey?

Solution:

From the given problem nf = 3, nb = 4, nc = 2 and nh = 3. So she has

nf + nb + nc + nh = 3 + 4 + 2 + 3 = 12


different ways to make her trip from Harar to Dire Dawa.

Multiplication Rule: states that if a choice consists of k steps, where the first step can be done in
n₁ ways, for each of which the second can be done in n₂ ways, ..., and for each of these the kᵗʰ step can
be done in nₖ ways, then the total number of distinct ways to accomplish the task/choice
is equal to

n₁ × n₂ × ... × nₖ = Πᵢ₌₁ᵏ nᵢ

Example 1: Suppose a cafeteria provides 5 kinds of cake which it serves with tea, coffee,
milk and coca cola. Then, in how many different ways can you order your breakfast of cake
with a drink?

Solution:

The work has two steps. First, we choose a type of cake (n₁ = 5) and then we choose a kind of
drink (n₂ = 4). Thus, one can have

n1 × n2 = 5 × 4 = 20

different ways to order his/her breakfast.

Example 2: There are 2 bus routes from city X to city Y and 3 train routes from city Y
to city Z. In how many ways can a person go from city X to city Z?

Solution:
n1 × n2 = 2 × 3 = 6

So the person can go from city X to city Z in 6 ways.

Permutation: is arrangement of objects with attention to order of appearance.

Rule 1: The number of permutations of n distinct objects taken all together is

n! = n × (n − 1) × (n − 2) × ... × 1

By definition, 1! = 0! = 1.

Example 1: In how many different ways can 3 persons sleep in a bed?

Solution:
n! = 3! = 3 × 2 × 1 = 6 ways.

Example 2: Suppose a photographer must arrange 4 persons in a row for a photograph. In


how many different ways can the arrangement be done?

Solution:
n! = 4! = 4 × 3 × 2 × 1 = 24 ways.

Rule 2: Given n distinct objects, the number of permutations of r objects taken from n


objects is denoted by nPr and given by

nPr = n! / (n − r)!,  r ≤ n

Example 1: In how many ways can 10 people be seated on a bench if only 4 seats are
available?

Solution:
nPr = 10P4 = 10! / (10 − 4)! = (10 × 9 × 8 × 7 × 6!) / 6! = 5040 ways.
Example 2: How many 5 letter permutations can be formed from the letters in the word
DISCOVER?

Solution:
nPr = 8P5 = 8! / (8 − 5)! = (8 × 7 × 6 × 5 × 4 × 3!) / 3! = 6720
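Both permutation rules can be sketched in Python; `nPr` below reproduces the two worked answers (Python 3.8+ also ships `math.perm` with the same meaning):

```python
import math

def nPr(n, r):
    # nPr = n! / (n - r)!: permutations of r objects taken from n distinct objects
    return math.factorial(n) // math.factorial(n - r)

print(nPr(10, 4))  # 5040 ways to seat 10 people on 4 seats
print(nPr(8, 5))   # 6720 five-letter permutations of DISCOVER
```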
Rule 3: The number of distinct permutations of n objects, in which n₁ are alike, n₂ are alike, ..., nᵣ are alike, is given by

n! / (n₁! × n₂! × ... × nᵣ!)

Example: How many different permutations can be made from the letters in the word:

I STATISTICS

Solution:

n₁ = n(S) = 3, n₂ = n(T) = 3, n₃ = n(A) = 1, n₄ = n(I) = 2 and n₅ = n(C) = 1. Thus,

n! / (n₁! × n₂! × n₃! × n₄! × n₅!) = 10! / (3! × 3! × 1! × 2! × 1!) = 50400

I MISSISSIPPI

Solution:

n₁ = n(M) = 1, n₂ = n(I) = 4, n₃ = n(S) = 4 and n₄ = n(P) = 2. Thus,

n! / (n₁! × n₂! × n₃! × n₄!) = 11! / (1! × 4! × 4! × 2!) = 34650
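Rule 3 can be sketched as a small Python function (the helper name is ours, not the text's):

```python
from math import factorial

def permutations_with_repeats(word):
    # n! / (n1! * n2! * ... * nr!), where each n_i counts one distinct letter
    counts = {}
    for ch in word:
        counts[ch] = counts.get(ch, 0) + 1
    result = factorial(len(word))
    for c in counts.values():
        result //= factorial(c)
    return result

print(permutations_with_repeats("STATISTICS"))   # 50400
print(permutations_with_repeats("MISSISSIPPI"))  # 34650
```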

Combination: A set of n distinct objects considered without regard to the order of appearance
is called a combination. For example, abc, acb, bac, bca, cab, cba are six different permutations
but they are the same combination.

Rule 1: The number of ways of selecting r objects from n distinct objects is called the combination
of r objects from n objects, denoted by nCr or C(n, r), and given by

nCr = C(n, r) = n! / [(n − r)! × r!],  r ≤ n

Example: In how many ways can student choose 3 books from a list of 12 different books?


Solution:

C(12, 3) = 12! / [(12 − 3)! × 3!] = 12! / (9! × 3!) = (12 × 11 × 10 × 9!) / (9! × 3!) = 220

Rule 2: If the selection has k steps, selecting r₁ of n₁ objects, r₂ of n₂ objects, ..., rₖ of nₖ
objects, then the total number of ways of doing this selection is equal to

C(n₁, r₁) × C(n₂, r₂) × ... × C(nₖ, rₖ)

Example: Out of 5 male workers and 7 female workers of some factory, a committee
consisting of 2 male and 3 female workers is to be formed. In how many ways can this be done
if

(a) all workers are eligible.

C(5, 2) × C(7, 3) = 10 × 35 = 350

(b) one particular female must be a member.

C(5, 2) × C(6, 2) = 10 × 15 = 150

(c) two particular male workers cannot be members for some reason.

C(3, 2) × C(7, 3) = 3 × 35 = 105
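The three committee counts follow directly from `math.comb` (Python 3.8+), a minimal sketch of Rule 2:

```python
import math

# committee of 2 male (out of 5) and 3 female (out of 7) workers
all_eligible = math.comb(5, 2) * math.comb(7, 3)        # (a) 350
one_female_fixed = math.comb(5, 2) * math.comb(6, 2)    # (b) 150: one seat already filled
two_males_excluded = math.comb(3, 2) * math.comb(7, 3)  # (c) 105: only 3 eligible males remain

print(all_eligible, one_female_fixed, two_males_excluded)
```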

Note: The difference between a permutation and a combination is that in a combination the order
of the objects being selected (arranged) is not important, but order matters in a permutation.

3.5. Approaches of Probability

1. The Classical Approach (also called Mathematical Approach): Suppose there are N
possible outcomes in the sample space S of an experiment. Out of these N outcomes, only
n are favorable to the event E, then the probability that the event E will occur is:

P(E) = (number of outcomes favourable to E) / (total number of outcomes) = n(E)/n(S) = n/N

Example 1: Consider an experiment of tossing a die. Then, what is the probability that

(a) odd numbers occur.


Solution:

The sample space of the given experiment is S = {1, 2, 3, 4, 5, 6}. Further let A be an
event of getting odd numbers in rolling a die only once.

n(A) 3
P (A) = = = 0.5
n(S) 6

(b) number 4 occurs.

Solution:

Let B be an event of getting number 4 in rolling a die only once.

n(B) 1
P (B) = = = 0.167
n(S) 6

(c) number 8 occurs.

Solution:

Let C be an event of getting number 8 in rolling a die only once.

n(C) 0
P (C) = = =0
n(S) 6

(d) numbers between 1 and 6 inclusive occur.

Solution:

Let D be the event of getting a number between 1 and 6 inclusive.

n(D) 6
P (D) = = =1
n(S) 6

• Events with zero probability of occurrence are known as null or impossible events.

• Events with probability equal to unity are known as sure events.

Example 2: What is the probability of getting one head in tossing two coins?

Solution:

S = {HH, HT, TH, TT}, and let E be the event of getting exactly one head in an experiment of
tossing two coins.
n(E) 2
P (E) = = = 0.5
n(S) 4

2. The Empirical Approach (also called Frequentist Approach): It is based on a relative


frequency. Given a frequency distribution, the probability of an event E being in a given class
is

P(E) = fᵢ / Σfᵢ


The difference between classical and empirical probability is that the former uses sample
space to determine the numerical probability while the latter is based on frequency distribution.

3. Subjective Approach: assigns probability based on an educated guess, experience,
or evaluation of a problem. For example, a physician might say that, on the basis of his/her
diagnosis, there is a 30% chance the patient will need an operation.

3.6. Probability Rules

Let S be a sample space associated with a random experiment. Then with any event E, in this
sample space, we associate a real number called probability of E satisfying the following properties
(axioms).

• 0 ≤ P (E) ≤ 1

• P (S) = 1

• If A and B are mutually exclusive events, then

P (A or B) = P (A ∪ B) = P (A) + P (B)

• If A₁, A₂, ..., Aₙ are pairwise mutually exclusive events, then

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) = Σᵢ₌₁ⁿ P(Aᵢ)

• P(A ∪ Aᶜ) = P(A) + P(Aᶜ) = 1

• P (φ) = 0

Using the above axioms, it can be shown that for any two events A and B,

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

Example 1: A box of 20 candles consists of 5 defective and 15 non-defective candles. If 4 of these


candles are selected at random, what is the probability that

(a) all will be defective.

Solution:

Let A be the event that all four candles are defective.

P(A) = n(A)/n(S) = [C(5, 4) × C(15, 0)] / C(20, 4) = 0.001032

(b) 3 will be non-defective.


Let B be the event that exactly 3 candles are non-defective.

P(B) = n(B)/n(S) = [C(5, 1) × C(15, 3)] / C(20, 4) = 0.4696

(c) all will be non-defective.

Let C be the event that all candles are non-defective.

P(C) = n(C)/n(S) = [C(5, 0) × C(15, 4)] / C(20, 4) = 0.2817
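These are hypergeometric-style counts; a minimal Python sketch that reproduces the three answers:

```python
from math import comb

# box: 5 defective + 15 non-defective candles; 4 drawn at random
total = comb(20, 4)  # 4845 equally likely samples

p_all_defective = comb(5, 4) * comb(15, 0) / total  # (a) ~0.001032
p_three_good = comb(5, 1) * comb(15, 3) / total     # (b) ~0.4696
p_all_good = comb(5, 0) * comb(15, 4) / total       # (c) ~0.2817

print(p_all_defective, p_three_good, p_all_good)
```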

Example 2: An urn contains 6 white, 4 red and 9 black balls. If 3 balls are drawn at random,
find the probability that

(a) two of the balls drawn are whites.

Let E₁ be the event that two of the balls drawn are white.

P(E₁) = n(E₁)/n(S) = [C(6, 2) × C(13, 1)] / C(19, 3) = 0.2012

(b) one is from each colour.

Let E₂ be the event that one ball of each colour is drawn.

P(E₂) = n(E₂)/n(S) = [C(6, 1) × C(4, 1) × C(9, 1)] / C(19, 3) = 0.2229

(c) none is red.

Let E₃ be the event that no red ball is drawn.

P(E₃) = n(E₃)/n(S) = [C(15, 3) × C(4, 0)] / C(19, 3) = 0.4696

(d) at least one is white.

Let E₄ be the event that at least one ball is white.

P(E₄) = n(E₄)/n(S) = [C(6, 1) × C(13, 2) + C(6, 2) × C(13, 1) + C(6, 3) × C(13, 0)] / C(19, 3) = 0.7048
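A sketch of the urn calculations with `math.comb`; for (d), the complement 1 − P(no white) gives the same 0.7048 as the three-term sum above:

```python
from math import comb

# urn: 6 white, 4 red, 9 black balls (19 in all); 3 drawn at random
total = comb(19, 3)  # 969

p_two_white = comb(6, 2) * comb(13, 1) / total             # (a) ~0.2012
p_one_each = comb(6, 1) * comb(4, 1) * comb(9, 1) / total  # (b) ~0.2229
p_no_red = comb(15, 3) / total                             # (c) ~0.4696
p_at_least_one_white = 1 - comb(13, 3) / total             # (d) ~0.7048, via the complement

print(p_two_white, p_one_each, p_no_red, p_at_least_one_white)
```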

3.7. Conditional Probability

When the outcome or occurrence of an event affects the outcome or occurrence of another event,
the two events are said to be dependent (conditional). If two events, A and B, are dependent to
each other, the probability of event A occurring knowing that event B has already occurred is said


to be the conditional probability of A given that event B has already occurred,

P(A/B) = P(A ∩ B)/P(B),  P(B) ≠ 0

The probability of event B occurring knowing that event A has already occurred is said to be the
conditional probability of B given that event A has already occurred,

P(B/A) = P(A ∩ B)/P(A),  P(A) ≠ 0

Remarks

(i) 0 ≤ P (A/B) ≤ 1

(ii) P (S/B) = 1

(iii) For mutually exclusive events A1 and A2 ,

P (A1 ∪ A2 /B) = P (A1 /B) + P (A2 /B)

(iv) For pairwise mutually exclusive events A₁, A₂, ..., Aₙ,

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ / B) = Σᵢ₌₁ⁿ P(Aᵢ/B)

Example: The probability that a research project will be well planned is 0.6, and the probability
that it will be well planned and well executed is 0.54. What is the probability that it will be

(a) well executed given that it is well planned.

Solution:

Let D and E be the events that the research project is well planned and well executed, respectively.
Then P(D) = 0.6 and P(D ∩ E) = 0.54.

P(E/D) = P(D ∩ E)/P(D) = 0.54/0.6 = 0.9

(b) will not be well executed given that it is well planned.

Solution:

P(Eᶜ/D) = P(D ∩ Eᶜ)/P(D) = [P(D) − P(D ∩ E)]/P(D) = 1 − P(D ∩ E)/P(D) = 1 − P(E/D) = 1 − 0.9 = 0.1


3.8. Independence

Recall that for mutually exclusive events A and B, A ∩ B = φ, which implies that P(A ∩ B) = 0, so

P(A/B) = P(A ∩ B)/P(B) = 0

If B occurs, A can never occur at the same time; that means they are dependent. Again recall
that if A ⊂ B, then A ∩ B = A and

P(B/A) = P(A ∩ B)/P(A) = P(A)/P(A) = 1

Definition: Two events A and B are said to be statistically independent if

P (A ∩ B) = P (A) × P (B)

Example: Consider an experiment of tossing two dice. Then, let

A - the first die show an even number.


B - the second die show an odd number.
C - both dice show even number.

Thus check whether A and B, A and C, B and C are independent events.

Solution:

Use the following sample space, S.

→ 1 2 3 4 5 6
1 (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
2 (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
3 (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
4 (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
5 (5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
6 (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

P(A) = n(A)/n(S) = 18/36,  P(A ∩ B) = n(A ∩ B)/n(S) = 9/36
P(B) = n(B)/n(S) = 18/36,  P(A ∩ C) = n(A ∩ C)/n(S) = 9/36
P(C) = n(C)/n(S) = 9/36,   P(B ∩ C) = n(B ∩ C)/n(S) = 0/36

Checking the products:

P(A ∩ B) = P(A) × P(B), since 9/36 = 18/36 × 18/36
P(A ∩ C) ≠ P(A) × P(C), since 9/36 ≠ 18/36 × 9/36
P(B ∩ C) ≠ P(B) × P(C), since 0/36 ≠ 18/36 × 9/36
Therefore, based on the above results A and B are statistically independent events. However,
events A and C and B and C are not statistically independent.
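The same conclusions can be reached by brute-force enumeration of the 36 outcomes:

```python
from itertools import product

S = list(product(range(1, 7), repeat=2))  # the 36 equally likely outcomes

A = {s for s in S if s[0] % 2 == 0}                    # first die even
B = {s for s in S if s[1] % 2 == 1}                    # second die odd
C = {s for s in S if s[0] % 2 == 0 and s[1] % 2 == 0}  # both dice even

def P(E):
    return len(E) / len(S)

print(P(A & B) == P(A) * P(B))  # True:  A and B are independent
print(P(A & C) == P(A) * P(C))  # False: A and C are dependent
print(P(B & C) == P(B) * P(C))  # False: B and C are dependent (they are disjoint)
```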

If A and B are independent, then the following holds true.

(i) P (A/B) = P (A)

(ii) P (B/A) = P (B)

(iii) Ac and B c are independent.

(iv) Ac and B, B c and A are independent.

3.9. Partition and Bayes' Theorem

Definition: We say that the events A1 , A2 , ..., Ak represent a partition of a sample space S if

(a) Aᵢ ∩ Aⱼ = φ, ∀i ≠ j; that is, the Aᵢ's are mutually exclusive.

(b) ⋃ᵢ₌₁ᵏ Aᵢ = S

(c) P(Aᵢ) > 0, ∀i

In other words, when the experiment ξ is performed, one and only one of the events Aᵢ occurs. For
example, in tossing a die, B₁ = {1, 2, 3}, B₂ = {4, 5} and B₃ = {6} would represent a partition
of the sample space, while C₁ = {1, 2, 3, 4} and C₂ = {4, 5, 6} would not.

Let B be any event with respect to S and A1 , A2 , ..., Ak be partition of S. Therefore,

B = (A1 ∩ B) ∪ (A2 ∩ B) ∪ ... ∪ (Ak ∩ B)

Of course, some of the sets Ai ∩ B may be empty, but this does not invalidate the above
decomposition of B. The important point is that all the events Ai ∩ B are pairwise mutually
exclusive. Since they are mutually exclusive,

P(B) = P(A₁ ∩ B) + P(A₂ ∩ B) + ... + P(Aₖ ∩ B)
     = P(A₁)P(B/A₁) + P(A₂)P(B/A₂) + ... + P(Aₖ)P(B/Aₖ)
     = Σᵢ₌₁ᵏ P(Aᵢ)P(B/Aᵢ)

Thus, the equation P(B) = Σᵢ₌₁ᵏ P(Aᵢ)P(B/Aᵢ) is called the theorem of total probability.

Example: A certain item is manufactured by three factories, F₁, F₂ and F₃. It is known that
F₁ produces twice as many items as F₂, and that F₂ and F₃ produce the same number of items. It is also
known that 2% of the items produced by F₁ and by F₂ are defective, while 4% of the items produced by F₃
are defective. All items produced are put in a stock pile and one item is chosen at random. What
is the probability that this item is defective?
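A hedged sketch of one way to answer this: since F₁ produces twice as many items as F₂ and F₂ = F₃, the production shares come out to 1/2, 1/4 and 1/4, and the theorem of total probability then gives 0.025:

```python
# factory shares: F1 makes twice as many as F2, and F2 = F3 -> 1/2, 1/4, 1/4
shares = [0.5, 0.25, 0.25]
defect_rates = [0.02, 0.02, 0.04]  # P(defective / F_i)

# theorem of total probability: P(B) = sum_i P(A_i) * P(B / A_i)
p_defective = sum(s * d for s, d in zip(shares, defect_rates))
print(p_defective)  # 0.025
```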


Bayes' Theorem: If A₁, A₂, ..., Aₖ is a partition of S and B is any event associated with S,
then

P(Aᵢ/B) = P(Aᵢ)P(B/Aᵢ) / Σⱼ₌₁ᵏ P(Aⱼ)P(B/Aⱼ)

Bayes' theorem can be thought of as a mechanism for updating a prior probability to a posterior
probability when additional information becomes available.

Example 1: A statistics teacher knows from past experience that a student who does homework
consistently has a probability of 0.95 of passing the examination, whereas a student who does not
do the homework has a probability of 0.30 of passing.

(a) If 25% of students in a large group of students do their homework consistently, what
percentage can expect to pass?

(b) If a student chosen at random from the group gets a pass, what is the probability that the
student has done the homework consistently?

Example 2: An insurance company insured 2000 scooter drivers, 4000 car drivers and 6000 truck
drivers. The probabilities that they meet with an accident are 0.01, 0.03 and 0.15, respectively. One
of the insured persons meets with an accident. What is the probability that he is a scooter driver?
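A sketch of Example 2 via Bayes' theorem; the answer works out to 1/52, about 0.019:

```python
# insured: 2000 scooter, 4000 car, 6000 truck drivers (12000 in all)
priors = [2000 / 12000, 4000 / 12000, 6000 / 12000]
p_accident = [0.01, 0.03, 0.15]  # P(accident / driver class)

# Bayes: P(A_1 / B) = P(A_1) P(B / A_1) / sum_j P(A_j) P(B / A_j)
total = sum(p * a for p, a in zip(priors, p_accident))
p_scooter = priors[0] * p_accident[0] / total
print(p_scooter)  # ~0.0192
```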

4

One Dimensional Random Variables

Consider the following illustrations.

Example: Consider the experiment, ξ, of tossing a coin twice.

S = {HH, HT, T H, T T }

Let X be the number of heads. Then another sample space with respect to X (also called the range
space of X) is
Rx = {0, 1, 2}

Definition: A function X which assigns a real number to each element of a sample space
is called a random variable. A random variable is a variable that has a single numerical value
(determined by chance) for each outcome of a procedure.

4.1. Type of Random Variables

A random variable can be classified as being either discrete or continuous depending on the
numerical values it assumes.

A discrete random variable has either a finite number of values or a countable number of values;
that is, its values result from a counting process. The possible values of X may be x₁, x₂, ..., xₙ. For any
discrete random variable X the following will be true.

i) 0 ≤ P(xᵢ) ≤ 1

ii) Σᵢ P(xᵢ) = 1, where the sum runs over all possible values (finitely or countably infinitely many).

P(xᵢ) is called a probability function, point probability function, or probability mass function. The collection of
pairs (xᵢ, P(xᵢ)) is called the probability distribution. A probability distribution gives the probability
for each value or range of values of the random variable.

Example 1: Construct a probability distribution for getting heads in an experiment of tossing a


coin two times.

Example 2: The probability distribution of a discrete random variable Y is given by

P (Y = y) = cy 2 , y = 0, 1, 2, 3, 4


Then, find the value of c.
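Example 2 can be checked mechanically: the probabilities cy² must sum to 1 over y = 0, ..., 4, which forces c = 1/30. A sketch:

```python
from fractions import Fraction

weights = [y ** 2 for y in range(5)]  # 0, 1, 4, 9, 16; these sum to 30
c = Fraction(1, sum(weights))         # c = 1/30 so that the probabilities sum to 1
probs = [c * w for w in weights]
print(c, sum(probs))  # 1/30 and 1
```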

A continuous random variable has infinitely many values, and those values can be associated with
measurements on a continuous scale in such a way that there are no gaps or interruptions. That
means it assumes all possible values in an interval (a, b), where a, b ∈ ℝ, and there exists a
function called the probability density function (pdf) satisfying the following conditions.

• f(x) ≥ 0, ∀x

• ∫_{−∞}^{∞} f(x) dx = 1

• For any two real numbers a and b such that −∞ < a < b < ∞,

P(a < X < b) = ∫_a^b f(x) dx

If X is a continuous random variable, then:

• P(X = a) = P(a ≤ X ≤ a) = ∫_a^a f(x) dx = 0

• P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b) = ∫_a^b f(x) dx

Example 1: Let X be a continuous random variable and its pdf is given by:

f(x) = { 2x, for 0 < x < 1;  0, otherwise }

a) Verify whether f(x) is a pdf or not.

b) Find P (0.5 < X < 0.75)
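For Example 1, f(x) = 2x integrates to F(x) = x² on (0, 1), so both parts reduce to evaluating that CDF. A sketch:

```python
def F(x):
    # CDF of f(x) = 2x on (0, 1): the integral of 2t dt from 0 to x
    return x ** 2

# a) f >= 0 on (0, 1) and the total probability is 1, so f is a valid pdf
print(F(1) - F(0))       # 1.0
# b) P(0.5 < X < 0.75) = F(0.75) - F(0.5)
print(F(0.75) - F(0.5))  # 0.3125
```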

4.2. Expectation of Random Variables and Its Properties

Definition: If X is a discrete random variable with possible values x₁, x₂, ... having the
probabilities P(x₁), P(x₂), ..., then the mean value of X, denoted by E(X) or µ, is defined
as:

E(X) = µ = Σᵢ xᵢ P(xᵢ)

if the series converges.

Definition: If X is a continuous random variable with pdf f(x), its mean is given by

E(X) = µ = ∫_{−∞}^{∞} x f(x) dx

Example 1: A coin is tossed two times. Let X be the number of heads. Find the mean value of
X.
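Example 1 can be sketched directly from the distribution of X, the number of heads in two tosses:

```python
# X = number of heads in two tosses; S = {HH, HT, TH, TT}, each with probability 1/4
dist = {0: 0.25, 1: 0.50, 2: 0.25}

mean = sum(x * p for x, p in dist.items())  # E(X) = sum of x_i * P(x_i)
print(mean)  # 1.0
```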

4.2.1. Properties of Expectation

Assume that the expected value of a random variable exists.


1. For any constant “a” we have

E(aX) = aE(X) = aµ

2. If X = a, then E(X) = a.

3. E[g(X) + h(X)] = E[g(X)] + E[h(X)]

4. Let (X, Y) be a two dimensional random variable where X and Y are independent. Then,

E(XY) = E(X) × E(Y)

4.3. Variance of Random Variables and Its Properties

Definition: Let X be a random variable. Then the variance of X, denoted by Var(X) or σ²ₓ, is defined
as

Var(X) = σ²ₓ = E[X − E(X)]² = E[X − µ]² = E(X²) − µ²

Thus, the standard deviation of X is given by σₓ = √(σ²ₓ).

4.3.1. Properties of Variance

1. If “a” is constant, then

• Var(X + a) = Var(X)

• Var(aX) = a² Var(X)

2. If (X, Y) is a two dimensional random variable and if X and Y are independent, then

Var(X + Y) = Var(X) + Var(Y)

Example: Suppose that X is a continuous random variable with pdf of



f(x) = { 1 + x, for −1 ≤ x < 0;  1 − x, for 0 ≤ x ≤ 1 }

then find the mean value and variance of X.
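The analytic answers are E(X) = 0 (by the symmetry of the density about 0) and Var(X) = 1/6; a numerical midpoint-rule check:

```python
def f(x):
    # triangular density: 1 + x on [-1, 0), 1 - x on [0, 1]
    return 1 + x if x < 0 else 1 - x

N = 100_000
h = 2 / N  # subinterval width over [-1, 1]
xs = [-1 + (i + 0.5) * h for i in range(N)]

mean = sum(x * f(x) for x in xs) * h               # ~0 (symmetry)
var = sum((x - mean) ** 2 * f(x) for x in xs) * h  # ~1/6
print(mean, var)
```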

5

Two Dimensional Random Variables

In our study of random variables we have, so far, considered only one dimensional case. That
is, the outcome of the experiment could be recorded as a single number X. In many situations,
however, we are interested in observing two or more numerical characteristics simultaneously. For
example, we might study the height H and weight W of some chosen person, giving rise to the
outcome (h, w) as a single experimental outcome.

Definition: Let S be a sample space of a random experiment. If X = X(s) and Y = Y(s) each
assign a real number to every element s ∈ S, then we call (X, Y) a two dimensional (or
bivariate) random variable, or a random vector. Generally, if X₁ = X₁(s), X₂ = X₂(s), ..., Xₙ = Xₙ(s) assign
a real number to each element s ∈ S, we call (X₁, X₂, ..., Xₙ) an n-dimensional (or multivariate)
random variable.

5.1. Two Dimensional Discrete Random Variables

Definition: (X, Y) is a two dimensional discrete random variable if the possible values of (X, Y)
are finite or countably infinite; the possible values of (X, Y) are denoted by (xᵢ, yⱼ), i =
1, 2, ..., n; j = 1, 2, ..., m.

With each possible value (xᵢ, yⱼ) of (X, Y) we associate a real number called its probability, P(xᵢ, yⱼ) =
P(X = xᵢ, Y = yⱼ), satisfying

(i) 0 ≤ P(xᵢ, yⱼ) ≤ 1, ∀i, j

(ii) Σᵢ Σⱼ P(xᵢ, yⱼ) = 1, summing over all possible pairs (finitely or countably infinitely many)

• The function P(xᵢ, yⱼ) is called a point probability function.

• The set of triples (xᵢ, yⱼ, P(xᵢ, yⱼ)) is called the probability distribution.

Example 1: Two production lines manufacture a certain type of item. Suppose the capacity (on
any given day) is 5 items for line I and 3 items for line II. Assume that the number of items actually
produced by either production line is a random variable. Let (X, Y) represent the two dimensional
random variable giving the number of items produced by line I and line II, respectively. The
following table gives the joint probability distribution of X and Y. Then, find

(a) P (X = 2, Y = 3) = P (2, 3) =??

(b) The probability that more items are produced by line I than by line II.


X \ Y   0     1     2     3
0       0.00  0.03  0.03  0.04
1       0.01  0.02  0.01  0.02
2       0.03  0.05  0.03  0.04
3       0.07  0.09  0.04  0.03
4       0.05  0.06  0.08  0.05
5       0.05  0.06  0.06  0.05
Example 2: Suppose a machine is used for a particular task in the morning and for a different
task in the afternoon. Let X and Y represent the number of times the machine breakdown in the
morning and in the afternoon respectively. The table below gives the joint probability distribution
of X and Y .
Y \ X      0     1     2     P(Y = y)
0          0.25  0.15  0.10  0.50
1          0.10  0.08  0.07  0.25
2          0.05  0.07  0.13  0.25
P(X = x)   0.40  0.30  0.30  1.00
(a) What is the probability that the machine breakdown equal number of times in the morning
and in the afternoon?

(b) What is the probability that the machine breakdown greater number of times in the morning
than in the afternoon?
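A sketch of Example 2, reading the table with rows indexed by y (its row totals are labelled P(Y = y)) and columns indexed by x; that reading of the layout is an assumption:

```python
# joint probabilities P(X = x, Y = y); keys are (x, y)
# x = breakdowns in the morning, y = breakdowns in the afternoon (assumed layout)
p = {
    (0, 0): 0.25, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.08, (2, 1): 0.07,
    (0, 2): 0.05, (1, 2): 0.07, (2, 2): 0.13,
}

p_equal = sum(v for (x, y), v in p.items() if x == y)        # (a) 0.46
p_more_morning = sum(v for (x, y), v in p.items() if x > y)  # (b) 0.32
print(p_equal, p_more_morning)
```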

5.2. Two Dimensional Continuous Random Variables

Definition: (X, Y ) is a two dimensional continuous random variable if (X, Y ) can assume all
values in some interval {(X, Y ) : a ≤ x ≤ b, c ≤ y ≤ d}.

Let (X, Y ) be a continuous random variable assuming all values in some region, < of the Euclidean
plane. The joint probability function f (x, y) is a function satisfying the following conditions.

(i) f(x, y) ≥ 0, ∀x, y

(ii) ∫∫_ℜ f(x, y) dx dy = 1 (the total volume under the surface given by the equation z = f(x, y) is 1).

Example 1: Two random variables X and Y have the following joint pdf.

f(x, y) = { (3/5)(xy + x²), for 0 < x < 1, 0 < y < 2;  0, elsewhere }

(a) Show that f (x, y) is a pdf.

(b) Find P (0.5 < X < 1, 0 < Y < 1)

Example 2: Suppose X and Y have a joint pdf of



f(x, y) = { cx, for 0 < y < x < 1 and x² < y < 1;  0, elsewhere }


(a) Determine the value of c.

(b) Find P (X < 0.5, Y < 0.5)

Example 3: If the joint pdf of X and Y is given by



f(x, y) = { 2, for x > 0, y > 0, x + y < 1;  0, elsewhere }

Then find:

(a) P(X + Y ≥ 2/3)

(b) P (X ≥ 2Y )

(c) Find P (X < 0.5, Y < 0.5)

5.3. Marginal and Conditional Distributions

From the joint distribution of a bivariate random variable it is possible to obtain one dimensional
distributions called marginal distributions.

Definition: Let (X, Y) be a discrete bivariate random variable having a probability function
P(xᵢ, yⱼ). Then the marginal distributions of X and Y are given by:

• P(xᵢ) = Σⱼ P(xᵢ, yⱼ) (the marginal of X is the row total)

• P(yⱼ) = Σᵢ P(xᵢ, yⱼ) (the marginal of Y is the column total)

Example: Suppose a discrete bivariate random variable (X, Y ) has the following probability
distribution. Find the marginal probability distributions of X and Y.

↓ 0 1 2
0 0.25 0.15 0.10
1 0.10 0.08 0.07
2 0.05 0.07 0.13

Definition: Let (X, Y ) be bivariate continuous random variable with a joint pdf f (x, y). Then
the marginal probability distributions of X and Y denoted by g(x) and h(y) respectively are given
by:
R∞
• g(x) = −∞ f (x, y)dy
R∞
• h(y) = −∞ f (x, y)dx

Example 1: Let (X, Y ) be a two-dimensional continuous random variable with joint pdf

f(x, y) = { 1/8, for 0 < x < 2, 0 < y < 4;  0, elsewhere }

then find the marginal pdfs of X and Y.


Example 2: Suppose the two-dimensional random variable (X, Y ) has a joint pdf given by

f(x, y) = { 6, for 0 < y < x < 1 and x² < y < 1;  0, elsewhere }

then find the marginal of X and Y.
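By hand (taking the region as x² < y < x), g(x) = ∫_{x²}^{x} 6 dy = 6(x − x²) for 0 < x < 1, and h(y) = ∫_{y}^{√y} 6 dx = 6(√y − y) for 0 < y < 1. A quick Python check (illustrative) verifies that each marginal integrates to 1:

```python
import math

# Marginals of f(x, y) = 6 on the region x^2 < y < x (an assumed reading of the example).
def g(x):  # marginal of X: integrate f = 6 over x^2 < y < x
    return 6 * (x - x * x) if 0 < x < 1 else 0.0

def h(y):  # marginal of Y: integrate f = 6 over y < x < sqrt(y)
    return 6 * (math.sqrt(y) - y) if 0 < y < 1 else 0.0

# each marginal should integrate to 1 (midpoint rule on (0, 1))
n = 10000
ix = sum(g((i + 0.5) / n) for i in range(n)) / n
iy = sum(h((i + 0.5) / n) for i in range(n)) / n
print(round(ix, 4), round(iy, 4))  # ≈ 1.0 1.0
```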

5.4. Conditional Distribution

Definition: Suppose (X, Y ) is a two-dimensional discrete random variable with joint probability
function P (xi , yj ), then the conditional distribution of X for a given value of Y = yj and the
conditional distribution of Y given X = xi are defined as:
• P(X = xi / Y = yj) = P(X = xi, Y = yj) / P(Y = yj) = P(xi, yj) / P(yj)

• P(Y = yj / X = xi) = P(Y = yj, X = xi) / P(X = xi) = P(xi, yj) / P(xi)

Note: The conditional distributions are probability distributions by themselves.

(i) P(X/yj) ≥ 0 and P(Y/xi) ≥ 0

(ii) Σi P(xi/yj) = 1 and Σj P(yj/xi) = 1

Example: Consider (X, Y ) has a joint probability distribution given by

X \ Y     0      1      2     Total
0        0.25   0.15   0.10   0.50
1        0.10   0.08   0.07   0.25
2        0.05   0.07   0.13   0.25
Total    0.40   0.30   0.30   1.00
Then find

(a) P (X = 1/Y = 0)

(b) P (Y ≥ 1/X = 1)
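By hand (taking rows as X and columns as Y), P(X = 1/Y = 0) = 0.10/0.40 = 0.25 and P(Y ≥ 1/X = 1) = (0.08 + 0.07)/0.25 = 0.6. The same computation in Python (illustrative):

```python
# Conditional probabilities from the table above (rows = X, columns = Y is an assumption).
P = {(0, 0): 0.25, (0, 1): 0.15, (0, 2): 0.10,
     (1, 0): 0.10, (1, 1): 0.08, (1, 2): 0.07,
     (2, 0): 0.05, (2, 1): 0.07, (2, 2): 0.13}

p_y0 = sum(p for (x, y), p in P.items() if y == 0)               # P(Y = 0) = 0.40
a = P[(1, 0)] / p_y0                                             # P(X = 1 / Y = 0)
p_x1 = sum(p for (x, y), p in P.items() if x == 1)               # P(X = 1) = 0.25
b = sum(p for (x, y), p in P.items() if x == 1 and y >= 1) / p_x1  # P(Y >= 1 / X = 1)
print(round(a, 2), round(b, 2))  # 0.25 0.6
```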

Definition: Let (X, Y ) be a two dimensional continuous random variable with joint pdf f (x, y)
and marginal pdfs g(x) and h(y). Then,

(i) the conditional distribution of X given Y = y is

g(x/y) = f(x, y) / h(y),   h(y) > 0

(ii) the conditional distribution of Y given X = x is

h(y/x) = f(x, y) / g(x),   g(x) > 0

Note: g(x/y) and h(y/x) satisfy the conditions of a pdf, i.e. g(x/y) ≥ 0 and ∫_{−∞}^{∞} g(x/y) dx = 1.


Example: Suppose

            2,  x > 0, y > 0, x + y < 1
f(x, y) =
            0,  elsewhere

Find,

(a) the conditional pdf of X

(b) P (X < 1/2 / Y = 1/4)

(c) P (Y > 1/3 / X = 1/2)
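For this density, h(y) = ∫ 2 dx over 0 < x < 1 − y gives h(y) = 2(1 − y), so g(x/y) = 2/(2(1 − y)) = 1/(1 − y): X given Y = y is uniform on (0, 1 − y). Hence P(X < 1/2 / Y = 1/4) = (1/2)/(3/4) = 2/3 and, by symmetry, P(Y > 1/3 / X = 1/2) = (1/6)/(1/2) = 1/3. A short Python check (illustrative) of the first value:

```python
# Conditional pdf of X given Y = y for f(x, y) = 2 on the triangle x + y < 1:
# g(x / y) = f(x, y) / h(y) = 1 / (1 - y) on 0 < x < 1 - y.
def g_cond(x, y):
    return 1 / (1 - y) if 0 < x < 1 - y else 0.0

# P(X < 1/2 / Y = 1/4): midpoint-rule integral of g(x / 1/4) over 0 < x < 1/2
n = 10000
p = sum(g_cond((i + 0.5) / (2 * n), 0.25) for i in range(n)) / (2 * n)
print(round(p, 4))  # ≈ 2/3 ≈ 0.6667
```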

5.5. Independence of Random Variables

Definition: Let (X, Y ) be a two dimensional discrete random variable. We say X and Y are
independent if
P (X = x, Y = y) = P (X = x) × P (Y = y)

Equivalently,
P (X = x/Y = y) = P (X = x)

P (Y = y/X = x) = P (Y = y)

Definition: Let (X, Y ) be a two dimensional continuous random variable. We say X and Y are
independent if
f (x, y) = g(x) × h(y)

Equivalently,
g(x/y) = g(x)

h(y/x) = h(y)

Example 1: Consider a two-dimensional discrete random variable having the following probability
distribution.

X \ Y     1      2
0        0.10   0.00
1        0.20   0.10
2        0.00   0.10
3        0.30   0.20

Are X and Y independent? Why or why not?

Example 2: Suppose P(X = x, Y = y) = 2^{−(x+y)} for x = 1, 2, 3, ... and y = 1, 2, 3, ...

Are X and Y independent? Why or why not?

Example 3: (X, Y) is a two-dimensional continuous random variable having the following joint
pdf

            4xy,  0 < x < 1, 0 < y < 1
f(x, y) =
            0,    elsewhere

Are X and Y independent? Why or why not?
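For Example 1 the check fails immediately: taking rows as X, P(X = 0, Y = 1) = 0.10 while P(X = 0)P(Y = 1) = 0.10 × 0.60 = 0.06, so X and Y are not independent. A Python sketch (illustrative) of the full check:

```python
# Independence check for Example 1's table (rows = X in 0..3, columns = Y in {1, 2}
# is an assumption about the table's orientation).
P = {(0, 1): 0.10, (0, 2): 0.00,
     (1, 1): 0.20, (1, 2): 0.10,
     (2, 1): 0.00, (2, 2): 0.10,
     (3, 1): 0.30, (3, 2): 0.20}

px = {x: sum(p for (xi, y), p in P.items() if xi == x) for x in range(4)}
py = {y: sum(p for (x, yj), p in P.items() if yj == y) for y in (1, 2)}

# X, Y independent iff P(x, y) = P(x) * P(y) for every cell
independent = all(abs(P[(x, y)] - px[x] * py[y]) < 1e-12
                  for x in range(4) for y in (1, 2))
print(independent)  # False
```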

6 Special Probability Distributions

6.1. Binomial Probability Distribution

The binomial probability distribution is a discrete probability distribution with many
applications. It is associated with a multiple-step experiment that we call the binomial experiment.
A binomial experiment exhibits the following four properties.

1. The procedure has a fixed number of trials.

2. The trials are independent. The outcome of any individual trial does not affect the probabilities
in the other trials.

3. The outcome of each trial must be classifiable into one of two possible categories (success or
failure).

4. The probability of a success, denoted by p, does not change from trial to trial.

If a procedure satisfies these four requirements, the distribution of the random variable (X) is
called a binomial probability distribution (or binomial distribution). To calculate probabilities we
use the following formula.
 
P(X = x) = C(n, x) p^x q^(n−x)   for x = 0, 1, 2, ..., n

where

C(n, x) = n!/(x!(n − x)!) = the number of ways to choose x successes out of n trials
x = the number of successes
p = the probability of a success on one trial
q = the probability of failure on one trial (q = 1 − p)
n = the number of trials
p(x) = the probability of x successes in n trials.

The expected value and variance of a binomially distributed random variable [X ∼ Bin(n, p)] can
be obtained as follows.

E(X) = µ = np

Var(X) = σ² = np(1 − p) = npq

SD(X) = σ = √(np(1 − p)) = √(npq)


Example: A university found that 10% of its students withdraw without completing the sophomore
course. Assume that 20 students registered for the course. Compute the probability that

(a) exactly four will withdraw.

Let X be the number of students who will withdraw without completing the course. From the
given problem p = 0.1 = 10%, n = 20, and X ∼ Bin(20, 0.1).

P(X = 4) = C(20, 4) 0.1^4 0.9^16 = [20!/(4!(20 − 4)!)] 0.1^4 0.9^16 = 0.0898

(b) at most two will withdraw.

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
         = C(20, 0) 0.1^0 0.9^20 + C(20, 1) 0.1^1 0.9^19 + C(20, 2) 0.1^2 0.9^18
         = 0.12158 + 0.27017 + 0.28518
         = 0.67693

(c) more than three will withdraw.

P(X > 3) = P(X = 4) + P(X = 5) + ... + P(X = 20)
         = 1 − P(X ≤ 3)
         = 1 − [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)]
         = 1 − [0.67693 + C(20, 3) 0.1^3 0.9^17]
         = 1 − [0.67693 + 0.19012]
         = 0.13295

(d) the expected value and standard deviation of withdrawals.

E(X) = np = 20 × 0.1 = 2

Var(X) = σ² = np(1 − p) = npq = 20 × 0.1 × 0.9 = 1.8

SD(X) = √(np(1 − p)) = √(npq) = √1.8 = 1.342
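The four answers can be reproduced with Python's `math.comb` (an illustrative sketch, not part of the original notes):

```python
from math import comb, sqrt

# Check of the withdrawal example: X ~ Bin(n = 20, p = 0.1).
n, p = 20, 0.1

def binom_pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(binom_pmf(4), 4))                             # (a) ≈ 0.0898
print(round(sum(binom_pmf(x) for x in range(3)), 4))      # (b) ≈ 0.6769
print(round(1 - sum(binom_pmf(x) for x in range(4)), 4))  # (c) ≈ 0.133
mean, sd = n * p, sqrt(n * p * (1 - p))
print(mean, round(sd, 3))                                 # (d) 2.0 and ≈ 1.342
```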

6.2. Poisson Distribution

In this section we consider a discrete random variable that is often useful in estimating the number
of occurrences over a specified interval of time or space. For example, the random variable of
interest might be the number of arrivals at a car wash in one hour, the number of repairs needed
in 10 miles of highway, or the number of leaks in 100 miles of pipeline. If the following two
properties are satisfied, the number of occurrences is a random variable described by the Poisson
probability distribution.


Properties of a Poisson Experiment

1. The probability of an occurrence is the same for any two intervals of equal length.

2. The occurrence or nonoccurrence in any interval is independent of the occurrence or
nonoccurrence in any other interval.

The Poisson probability function is defined by the following equation.

P(X = x) = e^(−λ) λ^x / x!
where

p(x) = the probability of x occurrences in an interval


λ = expected value or mean number of occurrences in an interval.

For the Poisson probability distribution, X is a discrete random variable indicating the number
of occurrences in the interval. Since there is no stated upper limit for the number of occurrences,
the probability function p(x) is applicable for values x = 0, 1, 2, ... without limit. In practical
applications, x will eventually become large enough so that p(x) is approximately zero and the
probability of any larger values of x becomes negligible.

A property of the Poisson distribution is that the mean and variance are equal. That is,

E(X) = V ar(X) = λ

Example: A student finds that the average number of amoeba in 10 ml of pond water is 4. Find
the probability that in 10 ml of water from that pond there are

(a) exactly 5 amoeba.

Let X be the number of amoeba found in 10 ml of pond water. From the given question λ = 4,
which implies that X ∼ Poisson(4).

P(X = 5) = e^(−4) 4^5 / 5! = 0.1563

(b) no amoeba.

P(X = 0) = e^(−4) 4^0 / 0! = e^(−4) = 0.0183

(c) fewer than 3 amoeba.

P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
         = e^(−4) 4^0/0! + e^(−4) 4^1/1! + e^(−4) 4^2/2!
         = e^(−4) + 4e^(−4) + 8e^(−4)
         = 13e^(−4)
         = 0.238
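A Python check of the three Poisson probabilities (illustrative, not part of the notes):

```python
from math import exp, factorial

# Check of the amoeba example: X ~ Poisson(lambda = 4).
lam = 4

def pois_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

print(round(pois_pmf(5), 4))                         # (a) ≈ 0.1563
print(round(pois_pmf(0), 4))                         # (b) ≈ 0.0183
print(round(sum(pois_pmf(x) for x in range(3)), 4))  # (c) ≈ 0.2381
```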


6.3. Normal Distribution

The most important probability distribution for describing a continuous random variable is the
normal probability distribution. The normal distribution has been used in a wide variety of
practical applications in which the random variables are heights and weights of people, test scores,
scientific measurements, amounts of rainfall, and other similar values. It is also widely used in
statistical inference. In such applications, the normal distribution provides a description of the
likely results obtained through sampling.

Normal Curve

The form or shape of the normal distribution is illustrated by the bell-shaped normal curve in the
following figure. The probability density function (pdf) that defines the bell-shaped curve of the
normal distribution follows.

If a random variable X ∼ N(µ, σ²), its probability density function (pdf) is given by:

f(x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)),   −∞ < x < ∞

where µ = mean and σ = standard deviation.

The normal curve has two parameters, µ and σ. They determine the location and shape of the
normal distribution.

Properties of Normal Distribution

1. The entire family of normal distributions is differentiated by two parameters: the mean µ
and the standard deviation σ.

2. The highest point on the normal curve is at the mean, which is also the median and mode
of the distribution.

3. The mean of the distribution can be any numerical value: negative, zero, or positive. Three
normal distributions with the same standard deviation but three different means (-10, 0, and
20) are shown here.


4. The normal distribution is symmetric, with the shape of the normal curve to the left of the
mean a mirror image of the shape of the normal curve to the right of the mean. The tails
of the normal curve extend to infinity in both directions and theoretically never touch the
horizontal axis. Because it is symmetric, the normal distribution is not skewed; its skewness
measure is zero.

5. The standard deviation determines how flat and wide the normal curve is. Larger values of
the standard deviation result in wider, flatter curves showing more variability in the data.
Two normal distributions with the same mean but with different standard deviations are
shown here.

6. Probabilities for the normal random variable are given by areas under the normal curve.
The total area under the curve for the normal distribution is 1. Because the distribution is
symmetric, the area under the curve to the left of the mean is 0.50 and the area under the
curve to the right of the mean is 0.50.

7. The percentage of values in some commonly used intervals are

(a) 68.3% of the values of a normal random variable are within plus or minus one standard
deviation of its mean.

(b) 95.4% of the values of a normal random variable are within plus or minus two standard
deviations of its mean.

(c) 99.7% of the values of a normal random variable are within plus or minus three standard
deviations of its mean.

Standard Normal Probability Distribution

A random variable that has a normal distribution with a mean of zero and a standard deviation
of one is said to have a standard normal probability distribution. The letter z is commonly
used to designate this particular normal random variable, that is z ∼ N (0, 1). The reason for
discussing the standard normal distribution so extensively is that probabilities for all normal
distributions are computed by using the standard normal distribution. That is, when we have
a normal distribution with any mean µ and any standard deviation σ, we answer probability
questions about the distribution by first converting to the standard normal distribution. Then we
can use the standard normal probability table and the appropriate z values to find the desired
probabilities. Thus, we can convert using the following formula.

z = (x − µ) / σ

Consequently, the standard normal density is given by:

f(z) = (1/√(2π)) e^(−z²/2),   −∞ < z < ∞

which is graphically shown below.


Example 1: Given that z is a standard normal random variable, compute the following probabilities.

(a) P (0 ≤ z ≤ 2.5) = 0.4938

(b) P (z ≥ 1) = P (z > 0) − P (0 < z < 1) = 0.5 − 0.3413 = 0.1587

(c) P (z ≤ 1) = P (z < 0) + P (0 < z < 1) = 0.5 + 0.3413 = 0.8413

(d) P (1 ≤ z ≤ 1.5) = P (0 < z ≤ 1.5) − P (0 < z ≤ 1) = 0.4332 − 0.3413 = 0.0919

(e) P (−1 < z < 1.5)

P(−1 < z < 1.5) = P(−1 < z < 0) + P(0 < z < 1.5)
                = P(0 < z < 1) + P(0 < z < 1.5)
                = 0.3413 + 0.4332
                = 0.7745
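Instead of the table, the standard normal CDF Φ(z) = (1/2)(1 + erf(z/√2)) reproduces these values. An illustrative Python sketch (tiny last-digit differences from the four-decimal table can occur):

```python
from math import erf, sqrt

# Standard normal CDF via the error function (no table needed).
def Phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(Phi(2.5) - Phi(0), 4))   # (a) ≈ 0.4938
print(round(1 - Phi(1), 4))          # (b) ≈ 0.1587
print(round(Phi(1), 4))              # (c) ≈ 0.8413
print(round(Phi(1.5) - Phi(1), 4))   # (d) ≈ 0.0918 (the table gives 0.0919)
print(round(Phi(1.5) - Phi(-1), 4))  # (e) ≈ 0.7745
```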

Example 2:

The college boards, which are administered each year to many thousands of high school students,
are scored so as to yield a mean of 500 and a standard deviation of 100. These scores are close
to being normally distributed. What percentage of the scores can be expected to satisfy each
condition?

(a) Greater than 600.

Let X be the score of a student, with mean µ = 500 and σ = 100; that is, X ∼ N(500, 100²).

P(X > 600) = P((X − µ)/σ > (600 − µ)/σ)
           = P(z > (600 − 500)/100)
           = P(z > 1)
           = P(z > 0) − P(0 < z < 1)
           = 0.5 − 0.3413
           = 0.1587


(b) Less than 450.

P(X < 450) = P((X − µ)/σ < (450 − µ)/σ)
           = P(z < (450 − 500)/100)
           = P(z < −0.5)
           = P(z < 0) − P(−0.5 < z < 0)
           = P(z < 0) − P(0 < z < 0.5)
           = 0.5 − 0.1915
           = 0.3085

(c) Between 450 and 600.

P(450 < X < 600) = P((450 − µ)/σ < (X − µ)/σ < (600 − µ)/σ)
                 = P((450 − 500)/100 < z < (600 − 500)/100)
                 = P(−0.5 < z < 1)
                 = P(−0.5 < z < 0) + P(0 < z < 1)
                 = P(0 < z < 0.5) + P(0 < z < 1)
                 = 0.1915 + 0.3413
                 = 0.5328
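Standardizing and evaluating the normal CDF via `math.erf` confirms all three answers (an illustrative sketch, not part of the notes):

```python
from math import erf, sqrt

# Check of the college-board example: X ~ N(mu = 500, sigma = 100).
def Phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 500, 100
z = lambda x: (x - mu) / sigma

print(round(1 - Phi(z(600)), 4))            # (a) ≈ 0.1587
print(round(Phi(z(450)), 4))                # (b) ≈ 0.3085
print(round(Phi(z(600)) - Phi(z(450)), 4))  # (c) ≈ 0.5328
```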
