
Introduction To Statistics-1

Uploaded by

Tafa Tulu

MEKDELA AMBA UNIVERSITY

CHAPTER ONE

INTRODUCTION
At the end of this chapter students will be able to:

• Explain what statistics is
• Distinguish between descriptive statistics and inferential statistics
• Identify applications and limitations of statistics
• Identify the levels of measurement

Introduction

Most people become familiar with statistics through radio, television, newspapers and magazines.
For instance, one may find the following statements in a newspaper or reports. “The HIV
prevalence rate in Ethiopia among adults 15-49 years is 1.4 in 2005”; “Among older men, the
mortality rate for smokers is twice the rate of those who never smoked”; “The agricultural
production increased by 5 percent this year”.

Beyond such everyday reports, statistics is used in almost all fields of human endeavor to make
scientific decisions based on data. For example, in public health an administrator would be
concerned with the number of residents who contract a new strain of flu virus during a certain
year. In pharmacy, it is used to study the efficacy and potency of drugs. To study plant life, a
botanist has to rely on statistics to know the effect of temperature, rainfall and so on. In general,
statistics can be applied in business, social sciences, natural sciences and engineering.

1.1 Definition and classification of Statistics

Statistics is defined in two ways depending on its use in the plural and singular sense.

In the plural sense (layman definition):- statistics is defined as the collection of numerical facts
or figures (raw data themselves).

According to this definition, statistics should have the following features:

BY: ABEBEW A. 1

1. Statistics are an aggregate of facts: A single number or a set of unrelated numbers cannot be a
statistic. The facts and figures should be related to each other.

Example 1.1: "The average mark in the statistics course for chemistry students is 70%" would be
considered a statistic, whereas "Betselot got 90% in the statistics course" is not.

Example 1.2: Vital statistics (numerical data on marriages, births, deaths), statistics of students,
statistics of imports and exports, etc.

2. Affected by a multiplicity of causes: The aggregate of facts and figures should be affected by a
set of causes. For example, the unemployment rate of a country increased by 5% last year
due to low economic growth, political instability and civil markets.
3. Numerically expressed: All statistics must be expressed in numerical form.
4. Enumerated or estimated according to a reasonable standard of accuracy: For an investigation,
statistical data can be collected either by enumeration or by estimation. If the data are collected
by enumeration, the result will be exact and accurate. If enumeration is not possible, the data
are estimated, and 100% accuracy cannot be expected.
5. Collected for a pre-determined purpose: Before collecting the data, the objectives of the
inquiry should be clearly specified. Data collected without any pre-determined purpose may
not be useful for the inquiry.
6. Collected in a systematic manner: A well-designed data collection plan should be followed,
because haphazard collection of data may introduce errors.

In the singular sense (formal definition): - Statistics is the subject that deals with the methods
of collecting, organizing, presenting, analyzing and interpreting statistical data as well as deriving
valid conclusions and making reasonable decisions on the basis of this analysis. Statistics is
concerned with the systematic collection of numerical data and its interpretation.

Classification of Statistics

Statistics is broadly divided into two categories based on how the collected data are used.

These are descriptive and inferential statistics


1. Descriptive Statistics: - refers to the procedures used to organize and summarize masses of
data. It is concerned with describing or summarizing the most important features of the data.

It deals with describing the collected data without drawing conclusions beyond them, and is
concerned with summary calculations, graphs, charts, tables and numerical measures (mean,
median, variance etc.).

The methodology of descriptive statistics includes the methods of organizing data (classification,
tabulation, frequency distributions), presenting data (graphical and diagrammatic presentation)
and calculating indicators such as measures of central tendency and measures of dispersion
(variation), which summarize some important features of the data.

In short, descriptive statistics covers:

• How to organize, summarize, and describe data.
• The collection, organization, summarization, and analysis of data.

Example 1.3:

 Suppose that the mark of 6 students in Statistics course for Information Technology is
given as 40, 45, 50, 60, 70 and 80. The average mark of the 6 students is 57.5 and it is
considered as descriptive statistics.
 The amount of medication in blood pressure pills.
 The starting salaries for Information Technology and Statistics students in different
organizations.
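
The average in the first item of Example 1.3 can be checked with a few lines of Python. This is only a minimal sketch; the variable names `marks` and `average_mark` are chosen for illustration and do not come from the module.

```python
from statistics import mean

# Marks of the 6 students from Example 1.3
marks = [40, 45, 50, 60, 70, 80]

# Descriptive statistic: the arithmetic mean of the observed marks
average_mark = mean(marks)
print(average_mark)  # 57.5
```

The result describes only these six students; no conclusion about other students is drawn, which is what makes it a descriptive statistic.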

2. Inferential Statistics: - It deals with making inferences and/or conclusions about a population
based on data obtained from a sample of observations. It consists of performing hypothesis testing,
determining relationships among variables and making predictions.

For example, the average income of all families (the population) in Ethiopia can be estimated from
figures obtained from a few hundred (the sample) families.

• It is important because statistical data usually arise from a sample.
• Statistical techniques based on probability theory are required.
• It shows how to reach decisions about a large body of data by examining only a small part of the data.


Example 1.4:

• In Example 1.3, if we treat the 6 students as a sample and say that the average mark in the
Statistics course for all Information Technology students is 57.5, then we are using
inferential statistics (drawing a conclusion about the population based on the sample
observations).
• There is a relationship between smoking tobacco and an increased risk of developing
cancer.
• Determining the most effective dose of a new medication (on the basis of tests performed
with volunteer patients from selected hospitals).

Activity 1.1:

1. Define the term statistics in two ways with the help of examples.
2. Explain the difference between descriptive statistics and inferential statistics with the help
of examples.

1.2 Stages of Statistical Investigation

A statistical investigation proceeds through five stages: collection, organization,
presentation, analysis and interpretation of data.

1. Collection of data: This is the process of obtaining measurements or counts, that is, obtaining
raw data. Data can be collected in a variety of ways; one of the most common is a sample or
census survey. A survey can itself be carried out in different ways, and three of the most
common are: -

• Telephone survey
• Mailed questionnaire
• Personal interview

2. Organization of data: - It is usually not possible to derive any conclusion about the main
features of the data from direct inspection of the observations. This stage of statistical investigation
helps to have a clear understanding of the information gathered and includes editing (correcting),
classifying and tabulating the collected data in a systematic manner. The first step in the
organization of data is editing, which means correcting (adjusting) omissions, inconsistencies,
ambiguities, and recording errors. The second step is classification, that is, arranging the
collected data according to some common characteristics. The last step is grouping the data into
classes and tabulating them.

3. Presentation of data: - After the data have been collected and organized they can be presented
in the form of tables, charts, diagrams and graphs. This presentation in an orderly manner facilitates
the understanding as well as analysis of data.
4. Analysis of data: - The basic purpose of data analysis is to dig out useful information for
decision making. This analysis may simply be a critical observation of the data to draw some
meaningful conclusions about it, or it may involve more sophisticated mathematical
techniques such as measures of central tendency, dispersion, correlation, regression etc.
5. Interpretation of data: - Interpretation means drawing conclusions from the data collected and
analyzed. Correct interpretation leads to a valid conclusion of the study and thus can aid in
decision making. A high degree of skill and experience is necessary for the interpretation.

Activity 1.2: Briefly explain the stages of statistical investigation.

1.3 Definitions of some terms

Population: Consists of all elements, individuals, items or objects whose characteristics are
being studied. The population that is being studied is called the target population.

Example 1.5:

• All students in MAU
• Population of trees under specified climatic conditions
• Population of households etc.

Sample: It is a subset of the population, selected using some pre-defined sampling technique in
such a way that it represents the population very well.

Example 1.6: If we want to study the income pattern of lecturers at Mekdela Amba University
and there are 350 lecturers, then we may take a random sample of only 100 lecturers out of this
entire population of 350 for the purpose of study. These 100 lecturers constitute a
sample.
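
The selection in Example 1.6 could be sketched in Python as simple random sampling without replacement. The lecturer labels, the fixed seed and the variable names below are assumptions made for illustration; the module does not prescribe a particular sampling procedure.

```python
import random

# Fixed seed so that the sketch gives the same sample on every run (an assumption)
rng = random.Random(2024)

# Hypothetical sampling frame: labels for the 350 lecturers
lecturers = [f"Lecturer {i}" for i in range(1, 351)]

# Simple random sample of 100 lecturers, drawn without replacement
sample = rng.sample(lecturers, k=100)

print(len(lecturers), len(sample))  # 350 100
```

Every lecturer has the same chance of being selected, and no lecturer can appear twice in the sample.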


Sampling: The process or method of sample selection from the population.

Sampling frame: - is the list of all units of the population from which the sample can be
drawn.

Example 1.7: List of all students of MAU, List of all residential houses in Tulu Awuliya town,
etc.

Survey: - is an investigation of a certain population to assess its characteristics. It may be a
census or a sample survey.

Census: Complete enumeration or observation of the elements of the population, that is, the
collection of data from every element in a population.

Sample survey: The process of collecting data covering a representative part or portion of a
population.

Parameter: - is a statistical measure of a population, or a summary value calculated from a
population.

Example 1.8: Average, range, proportion, variance, etc. of a population.

Statistic: - is a descriptive measure of a sample, or a summary value calculated from a sample.
In Example 1.6, a summary measure such as the average income of the sample of 100 lecturers
is a statistic.
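
The difference between a parameter and a statistic can be illustrated with a short Python sketch. The income figures are randomly generated and purely hypothetical, as are the seed and the variable names; only the population size of 350 and the sample size of 100 follow Example 1.6.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed for repeatability (an assumption)

# Hypothetical monthly incomes of a population of 350 lecturers
population = [rng.randint(8000, 20000) for _ in range(350)]

# Random sample of 100 incomes drawn from that population
sample = rng.sample(population, k=100)

parameter = mean(population)  # population mean: a parameter
statistic = mean(sample)      # sample mean: a statistic

print(round(parameter, 1), round(statistic, 1))
```

The two values are usually close but not equal. The parameter is fixed for a given population, while the statistic changes from one sample to another.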

Sample size: The number of elements or observations to be included in the sample.

An element: - is a member of a sample or population. It is a specific subject or object (for
example a person, firm, item, etc.) about which the information is collected.

Observation (measurement): - is the value of a variable for an element.

Variable: - It is an item of interest that can take numerical or non-numerical values for different
elements. It may be qualitative or quantitative.

Example 1.9: age, weight, sex, marital status, etc.

Qualitative variables: - are variables that assume non-numerical values and can't be measured.
They can be categorized and they are usually called attributes.

Example 1.10: Blood type, sex, marital status, ID number, religious affiliation, state of birth etc.

Quantitative variables: - are variables which assume numerical values and can be measured.
Examples are age, weight, number of car accidents etc. Note that quantitative variables are either
discrete (which can assume only certain values, and there are usually "gaps" between the values,
such as the number of bedrooms in your house, number of patients in a hospital, number of white
blood cells in a droplet of blood sample etc.) or continuous (which can assume any value within
a specific range, such as the air pressure in a tire, height etc.).

Example 1.11: Identify the qualitative/quantitative and discrete/continuous variables.

1. Sara took her time when taking her final exam in a statistics class and got 85 correct
answers in a multiple-choice test with 100 questions.

Solution: Both variables are quantitative.

Discrete: Number of correct answers she got on the exam.

Continuous: Length of time she took to finish the exam.

Unit of analysis: The type of thing being measured in the data, such as persons, families,
households, states, nations, etc.

Activity 1.3:

1. Explain the meaning of the following terms and give examples

a. Quantitative variable c. Qualitative variable

b. Discrete variable d. Continuous variable

2. Explain which of the following variables are quantitative and which are qualitative. If a
variable is quantitative, classify it as discrete or continuous.

a. Number of persons in a family b. marital status of people

c. Monthly phone bills d. Length of frog jump

3. Clearly identify the difference between population and sample by giving an example.

4. Differentiate the terms statistic and parameter.


1.4 Applications, uses and limitations of Statistics

Statistics can be applied in any field of study which seeks quantitative evidence, for instance
Information Technology, engineering and economics.

a) Engineering: Statistics has wide application in engineering.

• To compare the breaking strength of two types of materials.
• To determine the reliability of a product.
• To control the quality of products in a given production process.
• To compare the improvement of yield due to certain additives such as fertilizers, herbicides,
etc.

b) Economics: Statistics is widely used in economic study and research.

• To measure and forecast Gross National Product (GNP).
• Statistical analyses of population growth, inflation rate, poverty, unemployment figures,
rural or urban population shifts and so on influence much of the economic policy making.
• Financial statistics are necessary in the fields of money and banking, including consumer
savings and credit availability.

c) Statistics and research: There is hardly any advanced research going on without the use of
statistics in one form or another. Statistics is used extensively in medical, pharmaceutical and
agricultural research.

Function/Uses of Statistics

Today the field of statistics is recognized as a highly useful tool in the decision-making process
of managers in modern business, industry and frequently changing technology. It has many
functions in everyday activities. The following are some uses of statistics:

• It condenses and summarizes a mass of data: the original set of data (raw data) is normally
voluminous and disorganized until it is summarized and expressed in a few presentable,
understandable and precise figures.


• Statistics facilitates comparison of data: measures obtained from different sets of data can be
compared to draw conclusions about those sets. Statistical values such as averages, percentages,
ratios, rates, coefficients, etc. are the tools that can be used for the purpose of comparing sets of
data.

• Statistics helps to predict future trends: statistics is very useful for analyzing the past and
present data and forecasting future events.

• Statistics helps to formulate & review policies: Statistics provides the basic material for framing
suitable policies. The results of statistical studies on taxation, the unemployment rate,
inflation, the performance of every sort of military equipment, etc. may convince a government
to review its policies and plans with a view to meeting national needs and aspirations.

• Formulating and testing hypotheses: Statistical methods are extremely useful in formulating
and testing hypotheses and in developing new theories.

Limitations of Statistics

The field of statistics, though widely used in all areas of human knowledge and widely applied in
a variety of disciplines such as engineering, economics and research, has its own limitations. Some
of these limitations are:

A) It does not deal with individual values: as discussed earlier, statistics deals with aggregate of
facts. For example, wage earned by an individual worker at any one time, taken by itself is not a
statistic.

B) It does not deal with qualitative characteristics directly: statistics is not applicable to
qualitative characteristics such as beauty, honesty, poverty, standard of living and so on, since these
cannot be expressed in quantitative terms. These characteristics, however, can be dealt with
statistically if quantitative values can be assigned to them using a logical criterion. For example,
intelligence may be compared to some degree by comparing IQs or other scores in certain
intelligence tests.

C) Statistical conclusions are not universally true: since statistics, unlike the natural sciences,
is not an exact science, statistical conclusions are true only under certain assumptions.


D) It can be misused: statistics cannot be used to full advantage in the absence of proper
understanding of the subject matter.

Activity 1.4:

1. Briefly explain the application of statistics in different sectors.

2. Discuss the limitations of statistics, giving an example for each limitation.

1.5 Levels of Measurement

Proper knowledge about the nature and type of data to be dealt with is essential in order to specify
and apply the proper statistical method for their analysis and inferences. Measurement scale refers
to the property of value assigned to the data based on the properties of order, distance and fixed
zero.

Scale Types

Measurement is the assignment of values to objects or events in a systematic fashion. Four levels
of measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio, each
possessing different properties of measurement systems. The first two are qualitative while the
last two are quantitative.

Nominal scale: The values of a nominal attribute provide only enough information to distinguish
one object from another. It possesses none of the three properties of order, distance and fixed zero.
Data of this type consist of names, labels and categories.

This is a scale for grouping individuals into mutually exclusive categories. In this scale, one can
only say that one value is different from another. Arithmetic operations (+, -, *, ÷) are not
applicable and order comparisons (<, >) are impossible; only equality (=, ≠) can be checked.

Example 1.12:
• Eye color: brown, black, etc.
• Sex: male, female
• Political party preference: Republican, Democrat, or Other

Ordinal scale: - defined as nominal data that can be ordered or ranked.

• Can be arranged in some order, but the differences between the data values are
meaningless.
• Data consisting of an ordering or ranking of measurements are said to be on an ordinal
scale of measurement. That is, the values of an ordinal scale provide enough information
to order objects.
• One is different from and greater/better/less than the other.
• Arithmetic operations (+, -, *, ÷) are impossible; comparison (<, >, ≠) is possible.

Example 1.13: Letter grading (A, B, C, D, F), rating scales (excellent, very good, good, fair,
poor), military status (general, colonel, lieutenant, etc.).

Interval scale: - defined as ordinal data for which the differences between data values are
meaningful. However, there is no true zero, or starting point, and the ratios of data values are
meaningless. Note: Celsius and Fahrenheit temperature readings have no meaningful zero, so
ratios are meaningless.

In this measurement scale: -

• One is different from, and better/greater than, another by a certain amount of difference.
• It is possible to add and subtract. For example, 80°C - 50°C = 30°C and 70°C - 40°C = 30°C.
• Multiplication and division are not meaningful. For example, 60°C = 3(20°C), but this does not
imply that an object which is 60°C is three times as hot as an object which is 20°C.

Example 1.14: IQ, temperature.
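
One way to see why ratios of Celsius readings are not meaningful is to re-express the same two temperatures on the Kelvin scale, which has a true zero. The sketch below uses 60°C and 20°C; the helper `celsius_to_kelvin` and the variable names are introduced here for illustration only.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading to Kelvin (Kelvin has a true zero)."""
    return c + 273.15

hot_c, cool_c = 60.0, 20.0

# Differences are preserved when the scale is shifted
diff_c = hot_c - cool_c
diff_k = celsius_to_kelvin(hot_c) - celsius_to_kelvin(cool_c)

# Ratios are not preserved
ratio_c = hot_c / cool_c
ratio_k = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)

print(diff_c, round(diff_k, 2))     # the difference is 40 degrees on both scales
print(ratio_c, round(ratio_k, 3))   # 3.0 on the Celsius scale, about 1.136 in Kelvin
```

The difference of 40 degrees is the same on both scales, but the ratio changes from 3 to roughly 1.14, so "three times as hot" depends on an arbitrary choice of zero.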

Ratio scale: Similar to the interval scale, except that there is a true zero (absolute absence), or
starting point, and the ratios of data values are meaningful.

• Arithmetic operations (+, -, *, ÷) are applicable. For ratio variables, both differences and
ratios are meaningful.
• One is different/larger/taller/better/less than the other by a certain amount of difference and
by so many times.
• This measurement scale provides more information than the interval scale of measurement.


Example1.15: Weight, Height, Number of students, Age

Activity 1.5:
1. For each of the variables, indicate whether it is quantitative or qualitative and specify the
scale of measurement that is employed when taking measurement on each
a. Class standing of the members of this class relative to each other
b. Admitting diagnosis of patients admitted to a mental health clinic
c. Weight of babies born in a hospital during a year
d. Gender of babies born in a hospital during a year.
e. Under-arm temperature of day-old infants born in a hospital.
f. Patients may be characterized as unimproved, improved & much improved.

Exercise 1

1. Define the following terms.

a. Statistics in its plural and singular sense, b. inferential statistics, c. Variable,

d. Nominal scale, e. Survey

2. To estimate the amount of wheat produced per hectare, a farmer divides his total land holding
of 350 acres into 350 one-acre plots. He then selects ten plots at random and examines his harvest
in these ten plots. Identify the

a. population, b. sample, c. variable under study and d. the parameter of interest.

3. You are a part of an anti-smoking campaign in your school. You are concerned about the general
health of your fellow students and want to know what percentage of the students are regular
smokers.

a. Identify the individuals in the study, b. Identify the variable,

c. Is the variable qualitative or quantitative?

4. State the level of measurement for each of the following:

a. hair color, b. types of surgical procedures offered at a general hospital

c. grams of fat in a cheeseburger, d. time it takes to run 10 km

e. quality of a computer chip (‘good’ or ‘bad’)

f. number of customers served at a given hotel during lunch

g. salary, h. amount of rainfall, i. ranks of runners

CHAPTER TWO

METHODS OF DATA COLLECTION AND PRESENTATION

Introduction:

The second unit of this module introduces the methods of data collection and presentation. This
unit deals with how to collect data and present them so that they can be of use.
Collected data, also known as raw data, are always in an unorganized form and need to be
organized and presented in a meaningful and readily comprehensible form in order to facilitate
further statistical analysis.

Objectives:

At the end of this chapter students will be able to:

• Arrange raw data in an array and classify them to construct a frequency table and a
cumulative frequency table.
• Organize data using a frequency distribution.
• Present data using suitable graphs or diagrams.

2.1 Methods of Data Collection

Data: - is the raw material of statistics. It can be obtained either by measurement or counting.
When we determine that the appropriate approach to seeking an answer to a question will require
the use of statistics, we begin to search for suitable data to serve as the raw material for our
investigation.

Sources of data

Statistical data may be classified under two categories depending upon their source.

1. Primary data: - Data collected by the investigator himself for the purpose of a specific inquiry
or study. Such data are original in character & are mostly generated by surveys conducted by
individuals or research institutions.


It is more reliable & accurate since the investigator can extract the correct information by removing
doubts, if any, in the minds of the respondents regarding certain questions. These may involve
data collection using observation, personal interview, self-administered questionnaire, mailed
questionnaire etc.

Primary source: Is a source of data that supplies first-hand information for the use of the
immediate purpose.

Example 2.1: If a researcher is interested in the impact of a noon meal scheme for school
children, he has to undertake a survey and collect data on the opinions of parents and children by
asking relevant questions. Such data, collected for the first time, are called primary data.

1. Direct personal interviews: The persons from whom information is collected are known as
informants. The investigator personally meets them and asks questions to gather the necessary
information. This method is suitable for intensive rather than extensive field surveys, that is,
for intensive study of a limited field.

2. Indirect oral interviews: Under this method the investigator contacts witnesses, neighbors,
friends or other third parties who are capable of supplying the necessary information. This
method is preferred if the required information concerns addiction or the cause of a fire, theft,
murder, etc. If a fire has broken out at a certain place, the persons living in the neighborhood and
witnesses are likely to give information on the cause of the fire. In some cases, the police
interrogate third parties who are supposed to have knowledge of a theft or a murder to get some
clues. Enquiry committees appointed by governments generally adopt this method to get people's
views and all possible details of the facts relating to the enquiry. This method is suitable whenever
direct sources do not exist, cannot be relied upon or would be unwilling to part with the information.

3. Information from correspondents: The investigator appoints local agents or correspondents
in different places and compiles the information sent by them. Information reaches newspapers
and some government departments by this method. The advantage of this method is that it is
cheap and appropriate for extensive investigations. But it may not ensure accurate results because
the correspondents are likely to be negligent, prejudiced and biased. This method is adopted in
those cases where information is to be collected periodically from a wide area for a long time.

4. Mailed questionnaire method: Under this method a list of questions is prepared and is sent to
all the informants by post. The list of questions is technically called a questionnaire. A covering
letter accompanying the questionnaire explains the purpose of the investigation and the importance
of correct information, and requests the informants to fill in the blank spaces provided and to
return the form within a specified time. This method is appropriate in those cases where the
informants are literate and are spread over a wide area.

2. Secondary data: - When an investigator uses data, which have already been collected by
others, such data are called secondary data. Such data are primary data for the agency that
collected them, and become secondary for someone else who uses these data for his own purposes.

Secondary source: are individuals or agencies, which supply data originally collected for other
purposes by them or others.

There is a vast amount of published information from which statistical studies may be made and
fresh statistics are constantly in a state of production. The sources of secondary data can broadly
be classified under two heads: Published sources and Unpublished sources.

1. Published Sources: The various sources of published data are:

1. Reports and official publications of

2. Semi-official publication of various local bodies such as Municipal Corporations and District
Boards.

It should be noted that the publications mentioned above vary with regard to their periodicity of
publication. Some are published at regular intervals (yearly, monthly, weekly, etc.), whereas others
have no regular periodicity.

Note: A lot of secondary data is available on the internet. We can access it at any time for
further study.

2. Unpublished Sources: Not all statistical material is published. There are various sources
of unpublished data such as records maintained by various Government and private offices, studies
made by research institutions, scholars, etc. Such sources can also be used where necessary.

When our source is secondary data, check that:

• The type and objective of the original study suit our situation.

• The purpose for which the data were collected is compatible with the present problem.

• The nature and classification of the data are appropriate to our problem.

• There are no biases or misreporting in the published data.

Note: Data which are primary for one may be secondary for the other.

Activity 2.1

1. Distinguish between primary and secondary data.

2. What are the various methods of collecting primary data? Give example of each.

3. Define secondary data. What are their sources?

4. Describe primary and secondary method of data collection. In what special circumstance are
the two methods suitable?

2.2 Methods of Data Presentation

Having collected and edited the data, the next important step is to organize them, that is, to
present them in a readily comprehensible, condensed form that helps in drawing inferences. It is
also necessary that like items be separated from unlike ones.

The presentation of data is broadly classified into the following two categories:

• Tabular presentation
• Diagrammatic and graphic presentation

The process of arranging data into classes or categories according to similarities is technically
called classification. It eliminates inconsistency and also brings out the points of similarity and/or
dissimilarity of collected items/data.

Classification is necessary because it would not be possible to draw inferences and conclusions
directly from a large set of collected raw data.

Raw data: recorded information in its original collected form, whether counts or
measurements.

An array: is an arrangement of raw numerical data in ascending or descending order of
magnitude.

Any scientific investigation requires data related to the study.

2.2.1 Frequency distribution

Frequency distribution (FD): - is the organization of raw data in table form using classes and
frequencies.

A frequency distribution is constructed for three main reasons:

• To facilitate the analysis of data,
• To estimate frequencies of the unknown population distribution from the distribution of
sample data, and
• To facilitate the computation of various statistical measures.

Frequency Distribution Table

Class: is a description of a group of similar numbers in a data set.

Frequency: is the number of times a variable value is repeated.

Class frequency: the number of observations belonging to a certain class.

There are three types of frequency distribution, and there are specific procedures for
constructing each type. These are:

I. Categorical FD

II. Ungrouped FD (discrete FD, or frequency array) and

III. Grouped FD (continuous FD)


I. Categorical FD: a FD in which the data are qualitative, i.e. either nominal or ordinal. Each
category of the variable represents a single class, and the number of times each category repeats
represents the frequency of that class (category).

Example 2.2: Twenty-five patients were given a blood test to determine their blood type. The
data is as shown below: A B B AB O O O B AB B B B O A O O O AB AB A O O B A.

Solution: Since the data are categorical, we take the four blood types as classes and construct a
FD as shown below.

Step 1: Make a table as shown below.

CLASS   TALLY   FREQUENCY   PERCENT
A
B
AB
O

Step 2: Tally the data and place the result under the column Tally.

Step 3: Count the tallies and place the result under the column Frequency.

Step 4: Find the percentage of values in each class using the formula % = (f/n) × 100%, where f is
the frequency of the class and n is the total number of observations.

CLASS   TALLY       FREQUENCY   PERCENT
A       ////        5           5/25 × 100 = 20%
B       //// //     7           7/25 × 100 = 28%
AB      ////        4           4/25 × 100 = 16%
O       //// ////   9           9/25 × 100 = 36%
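
The four steps above can be sketched in Python. This is only an illustrative sketch: the function
name `categorical_fd` is hypothetical, and the sample list is rebuilt from the frequencies in the
table (5 A, 7 B, 4 AB, 9 O) rather than typed in patient by patient.

```python
from collections import Counter

def categorical_fd(observations):
    """Return (category, frequency, percent) rows for qualitative data."""
    counts = Counter(observations)      # Steps 2 and 3: tally and count
    n = len(observations)               # total number of observations
    # Step 4: percent = f / n * 100
    return [(category, f, 100 * f / n) for category, f in counts.items()]

# Assumption: a sample rebuilt from the table frequencies; order is irrelevant.
blood_types = ["A"] * 5 + ["B"] * 7 + ["AB"] * 4 + ["O"] * 9

table = categorical_fd(blood_types)
for category, f, percent in table:
    print(f"{category:<3} {f:>3} {percent:>5.0f}%")
```

Each category is one class, so no grouping decision is needed; only counting and the percentage
formula are involved.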

II. Ungrouped Frequency Distribution (UFD)

A FD of numerical (quantitative) data in which each value of the variable represents a single class
(i.e. the values of the variable are not grouped) and the number of times each value repeats
represents the frequency of that class. It is often constructed for a small set of data or for data
on a discrete variable.

Constructing ungrouped frequency distribution:

• First find the smallest and largest raw scores in the collected data.
• Arrange the data in order of magnitude and count the frequency.
• To facilitate counting one may include a column of tallies.

Example 2.3: The following data represent the marks of 20 students.

80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85

Construct an ungrouped frequency distribution.

Solution:

Step 1: Find the range, Range=Max-Min=90-60=30.

Step 2: Make a table as shown

Step 3: Tally the data.

Step 4: Compute the frequency.

Mark   Tally   Frequency
60     //      2
62     /       1
63     /       1
65     /       1
70     ////    4
74     /       1
75     /       1
76     //      2
80     ///     3
85     ///     3
90     /       1

Note: Each individual value is presented separately, that is why it is named ungrouped frequency
distribution.
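
As a quick check, the counting in Example 2.3 can be reproduced with a short Python sketch. The
variable names are illustrative only; `Counter` from the standard library does the tallying.

```python
from collections import Counter

# Marks of the 20 students in Example 2.3
marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 63, 74, 75,
         76, 70, 70, 80, 85]

frequency = Counter(marks)                 # each distinct mark is its own class
ungrouped_fd = sorted(frequency.items())   # arrange classes in order of magnitude

for mark, f in ungrouped_fd:
    print(f"{mark:>4}  {'/' * f:<5} {f}")  # mark, tally, frequency
```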

III. Grouped Frequency Distribution (GFD)

When the range of the data is large, the data must be grouped into classes that are more than one
unit in width.

Definition of some basic terms

Grouped frequency distribution: a FD in which several numbers are grouped into one class.

Class limits (CL): The values that separate one class from another. The limits could actually
appear in the data, and there are gaps between the upper limit of one class and the lower limit of
the next class.

Unit of measure (U): The smallest possible difference between successive values, e.g. 1, 0.1, 0.01,
0.001, ...

Class boundaries: Separate one class in a grouped frequency distribution from the other. The
boundary has one more decimal place than the raw data. There is no gap between the upper
boundaries of one class and the lower boundaries of the succeeding class. Lower class boundary
is found by subtracting half of the unit of measure from the lower class limit and upper class
boundary is found by adding half unit measure to the upper class limit.

Class width (W): The difference between the upper and lower boundaries of any class. The class
width is also the difference between the lower limits (or the upper limits) of two consecutive
classes.

Class mark (mid-point): It is found by adding the lower and upper class limits (or boundaries) and
dividing the sum by two.

Cumulative frequency (CF): It is the number of observations less than the upper class boundary, or
greater than the lower class boundary, of a class.

CF (less than type): the number of values less than the upper class boundary of a given class.

CF (greater than type): the number of values greater than the lower class boundary of a given
class.

Relative frequency (Rf): The frequency of a class divided by the total frequency. This gives the
proportion of values falling in that class.

$Rf_i = f_i / n = f_i / \sum f_i$

Percentage frequency: Relative frequency × 100

Relative cumulative frequency (RCf): The running total of the relative frequencies, or the
cumulative frequency divided by the total frequency. It gives the proportion of the values which
are less than the upper class boundary (less than type) or greater than the lower class boundary
(greater than type).

$RCf_i = Cf_i / n = Cf_i / \sum f_i$

STEPS IN CONSTRUCTING A GFD

1. Find the highest and the lowest values.

2. Compute the range: R = H – L.

3. Select the number of classes desired, usually between 5 and 20, or use Sturges' rule
$k = 1 + 3.322 \log_{10} n$, where k is the number of classes and n is the total number of
observations.

4. Find the class width (W) by dividing the range by the number of classes and rounding the result
up to a whole number: W = R/k.

5. Identify the unit of measure (U), usually 1, 0.1, 0.01, ...

6. Pick a suitable starting point less than or equal to the minimum value. The starting point is
the lower limit of the first class. Then keep adding the class width to get the remaining lower
class limits.

7. Find the upper class limit of the first class, $UCL_1 = LCL_2 - U$ (in general,
$UCL_i = LCL_{i+1} - U$). Then keep adding the class width to get the remaining upper class limits.

8. Find the class boundaries: $LCB_i = LCL_i - U/2$, $UCB_i = UCL_i + U/2$.

9. Find the class marks: $CM_i = (UCL_i + LCL_i)/2$ or $CM_i = (UCB_i + LCB_i)/2$.

10. Tally the data.

11. Find the frequencies.

12. Find the cumulative frequencies. Depending on what you are trying to accomplish, this step may
or may not be necessary.

13. If necessary, find RF and RCF.

When grouping data, the following rules are important:

• The number of classes should be between 5 and 20.
• The groups must not overlap; otherwise there is confusion about which group a measurement
belongs to.
• There must be continuity from one group to the next, which means that there must be no gaps
in a frequency distribution. Otherwise some measurements may not fit in a group.
• The groups must range from the lowest measurement to the highest measurement so that all of
the measurements have a group to which they can be assigned.
• The groups should normally be of equal width, so that the counts in different groups can
easily be compared.
• The classes must be all-inclusive (exhaustive). This means that all data values must be
included.

Example 2.4: Construct FD for the following data.

11 29 33 22 27 19 22 21 18 17 22 38 26 39 27 6 34 13 20 32

Solution: -

1) Highest value = 39, Lowest value = 6

2) Range = 39 – 6 = 33

3) K = 1 + 3.322 log 20 = 1 + 3.322(1.301) = 5.32, which is rounded up to 6

4) W = R / K = 33/6 = 5.5 ≈ 6

5) U = 1

6) LCL1= 6

6, 12, 18, 24, 30, 36 are the lower class limits.

7) Find the upper class limits. The first upper class limit = 12 – U = 12 – 1 = 11.

11, 17, 23, 29, 35, 41 are the upper class limits.

So combining step 6 and step 7, one can construct the following classes.

Class limits 6 – 11, 12 – 17, 18 – 23, 24 – 29, 30 – 35, 36 – 41

8) Find class boundaries

For class one, lower class boundary = 6 – U/2 = 5.5

and upper class boundary = 11 + U/2 = 11.5

Then continue adding W on both boundaries to obtain the rest boundaries. By doing so one can
obtain the following classes.

Class boundary 5.5 – 11.5, 11.5 – 17.5, 17.5 – 23.5, 23.5 – 29.5, 29.5 – 35.5, 35.5 – 41.5

9) Find class mark

10) Tally the data

11) Write the numeric values for the tallies in the frequency column.

12) Find cumulative frequency.

13) Find relative frequency or/and relative cumulative frequency.

Class limit   Class boundary   Class mark   Tally     Frequency   CF(<)   CF(>)   RF          RCF(>)
6 – 11        5.5 – 11.5       8.5          //        2           2       20      2/20=0.1    1
12 – 17       11.5 – 17.5      14.5         //        2           4       18      2/20=0.1    0.9
18 – 23       17.5 – 23.5      20.5         //// //   7           11      16      7/20=0.35   0.8
24 – 29       23.5 – 29.5      26.5         ////      4           15      9       4/20=0.2    0.45
30 – 35       29.5 – 35.5      32.5         ///       3           18      5       3/20=0.15   0.25
36 – 41       35.5 – 41.5      38.5         //        2           20      2       2/20=0.1    0.10
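
The construction in Example 2.4 can be sketched in Python as follows. This is a minimal sketch
under stated assumptions: the data are whole numbers (U = 1), the first lower class limit is the
minimum value, and both the number of classes and the class width are rounded up. The function
name `grouped_fd` and the dictionary keys are illustrative only.

```python
import math

UNIT = 1  # unit of measure U; assumption: whole-number data

def grouped_fd(data):
    """Build a grouped frequency distribution using Sturges' rule."""
    n = len(data)
    low, high = min(data), max(data)
    k = math.ceil(1 + 3.322 * math.log10(n))   # number of classes, rounded up
    width = math.ceil((high - low) / k)        # class width W = R / k, rounded up
    rows, cumulative = [], 0
    for i in range(k):
        lcl = low + i * width                  # lower class limit
        ucl = lcl + width - UNIT               # upper class limit
        f = sum(1 for x in data if lcl <= x <= ucl)
        cumulative += f
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - UNIT / 2, ucl + UNIT / 2),
            "mark": (lcl + ucl) / 2,
            "f": f,
            "cf_less": cumulative,             # less than cumulative frequency
            "rf": f / n,                       # relative frequency
        })
    return rows

data = [11, 29, 33, 22, 27, 19, 22, 21, 18, 17,
        22, 38, 26, 39, 27, 6, 34, 13, 20, 32]
rows = grouped_fd(data)
for row in rows:
    print(row["limits"], row["boundaries"], row["mark"], row["f"], row["cf_less"])
```

For data recorded to one decimal place, U would be 0.1 and the rounding of the class width would
have to be adapted accordingly; that case is not covered by this sketch.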
Activity 2.2:

1. In a biology experiment the lengths of 25 worms, measured to the nearest 0.1cm, were:

9.5 8.1 5.1 6.6 9.3 9.1 6.5 5.0 6.9 7.6 9.3 8.3 6.0

6.2 7.4 7.7 7.8 7.9 7.0 7.8 5.4 9.8 6.3 7.5 8.4

Construct a frequency distribution for the data by using Sturges' rule for the number of classes.
What do you think about the typical length of these worms?

2.2.2 Diagrammatic presentation of data: Bar charts, Pie-chart, pictograms

These are techniques for presenting data in visual displays using geometric figures and pictures.
It is easier to understand and interpret data when they are presented graphically than when they
are presented in words or in a frequency table. A graph can present data in a simple and clear way,
and it can illustrate the important aspects of the data. This leads to better analysis and
presentation of the data.

A diagram is a visual form of presentation of statistical data that highlights their basic facts
and relationships. Diagrams drawn on the basis of the collected data are easily understood and
appreciated by all. They are readily intelligible and save a considerable amount of time and energy.

In this section, we discuss the most commonly used diagrammatic and graphical methods, such as the
bar chart, pie chart, histogram, frequency polygon and cumulative frequency polygon.

Significance of Diagrammatic and Graphical Presentation:

Diagrams and graphs are extremely useful because of the following reasons.

1. They are attractive and impressive.

2. They make data simple and intelligible.

3. They make comparison possible

4. They save time and labor.

5. They have universal utility.

6. They give more information.

7. They have a great memorizing effect.

The three most commonly used diagrammatic presentations for discrete as well as qualitative data
are:

• Pie chart
• Bar chart
• Pictogram

A) Pie chart

A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of each sector is obtained using:

$\text{Angle of a sector} = \dfrac{\text{Value of the part}}{\text{The whole quantity}} \times 360^\circ$

Example 2.5: Draw a suitable diagram to represent the following population in a town.

Men    Women   Girls   Boys
2500   2000    4000    1500

Solutions:

Step 1: Find the percentage.

Step 2: Find the number of degrees for each class.

Step 3: Using a protractor and compass, graph each section and write its name with corresponding
percentage.

Class    Frequency   Percent   Degree
Men      2500        25        90
Women    2000        20        72
Girls    4000        40        144
Boys     1500        15        54
Total    10000       100       360
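
The percentages and angles in this table follow directly from the sector formula. A small Python
sketch (the function name `pie_sectors` is illustrative) reproduces them:

```python
def pie_sectors(frequencies):
    """Return {class: (percent, degrees)} for a pie chart."""
    total = sum(frequencies.values())
    return {name: (100 * f / total, 360 * f / total)
            for name, f in frequencies.items()}

population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
sectors = pie_sectors(population)

for name, (percent, degrees) in sectors.items():
    print(f"{name:<6} {percent:>5.0f}% {degrees:>6.0f} degrees")
```

The degrees always add up to 360 and the percentages to 100, which is a useful check before
drawing the chart with a protractor.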

Figure: Pie chart of the town population (Men 25%, Women 20%, Girls 40%, Boys 15%).

B) Bar Charts
• A bar chart is a set of bars (thick lines or narrow rectangles) representing some magnitude over
time or space.
• It is used to represent and compare the frequency distributions of discrete variables and
attributes or categorical series.
• Bars can be drawn either vertically or horizontally.

In presenting data using a bar diagram,

• All bars must have equal width and the distance between bars must be equal.
• The height or length of each bar indicates the size (frequency) of the figure represented.

There are different types of bar charts, the most common being:

• Simple bar chart
• Component or sub-divided bar chart
• Multiple bar chart

I. Simple bar chart

• It is used to display data on one variable.
• The bars are thick lines (narrow rectangles) having the same breadth. The magnitude of a
quantity is represented by the height or length of the bar.

Example 2.6: The number of students in the four departments of the Science College is given as
follows:

Department           Physics   Mathematics   Chemistry   Information Technology
Number of students   200       400           450         600
Male                 170       350           250         200
Female               30        50            200         400

Draw a simple bar chart of the number of students by department.

Solution:
Figure: Simple bar chart of the number of students by department (vertical axis: Frequency;
horizontal axis: Department; bar heights 200, 400, 450 and 600).

II. Component Bar chart


• When there is a desire to show how a total (or aggregate) is divided into its component
parts, we use a component bar chart.
• The bars represent the total value of a variable, with each total broken into its component
parts; different colors or designs are used for identification.

Example 2.7: Draw a component (sub-divided) bar chart of the number of students by department
given in Example 2.6.

Solution:

Figure: Sub-divided bar chart of the number of students by department (vertical axis: Frequency;
horizontal axis: Department; each bar is split into Male and Female components).

III. Multiple Bar charts


• These are used to display data on more than one variable.
• They are used for comparing different variables at the same time.

Example 2.8: The following data represent the sales by product, 1957–1959, of a given company for
three products A, B and C.

Product   Sales (in $)
          1957   1958   1959
A         12     14     18
B         24     21     18
C         24     35     54

Draw a multiple bar chart to represent the sales by product from 1957 to 1959.

Solution:

C) Pictogram

In this diagram, we represent data by means of picture symbols. It is customary to represent a
unit value of the data (a fixed number of units) by a standard symbol or picture, and the whole
quantity by an appropriate number of repetitions of that symbol. The symbol should be simple and
clear. We first decide on a suitable picture to represent a definite number of units in which the
variable is measured.

Example 2.9: The following table shows the orange production of a plantation for the production
years 1990 to 1993. Represent the data by a pictogram.

Orange production from 1990 to 1993:

Production year   1990   1991   1992   1993
Amount (in kg)    3000   3850   3500   5000


Figure: Pictogram of the data on Orange productions from 1990 to 1993.

Activity 2.2

The following table gives the number of deaths in a certain country in 1987 due to accidents for
individuals in various classifications.

Classification Number of deaths


Pedestrians 1699
Bicyclists 280
Motorcyclists 650
Automobile drivers 1327
Represent the data using both a bar chart and a pie chart.

2.2.4 Graphical Presentation of data

The histogram, frequency polygon and cumulative frequency graph (ogive) are the most commonly used
graphical representations for continuous data.

Procedures for constructing statistical graphs:

• Draw and label the X and Y axes.
• Choose a suitable scale for the frequencies or cumulative frequencies and label it on the Y
axis.
• Represent the class boundaries (for the histogram or ogive) or the mid-points (for the
frequency polygon) on the X axis.
• Plot the points.
• Draw the bars or lines to connect the points.


Histogram

It is a graph which displays the data by using contiguous vertical bars of various heights to
represent frequencies. Class boundaries are placed along the horizontal axis. A histogram can often
indicate how symmetric the data are; how spread out the data are; whether there are intervals
having high levels of data concentration; whether there are gaps in the data; and whether some data
values are far apart from others.

Example 2.10: Construct a histogram to represent the following data.


Class limits 15-24 25-34 35-44 45-54 55-64 65-74 75-84
Frequency 3 4 10 15 12 4 2

Solution:

Figure: Histogram of the data (vertical axis: Frequency; horizontal axis: Class boundaries; bar
heights 3, 4, 10, 15, 12, 4 and 2).

Frequency polygon

The frequency polygon is a line graph obtained by joining the mid-points of the tops of the
adjacent rectangles of the histogram with line segments. When the polygon is continued to the
x-axis just outside the range of the data, the total area under the polygon is equal to the total
area under the histogram.

Note: We should add two classes with zero frequencies at the two ends of the frequency distribution
to complete the polygon.

Example 2.11: Construct a frequency polygon to represent the data in Example 2.10.

Solution:

Class limits   Frequency   Class marks   Class boundaries   R.F.   % R.F.   Less than C.F.   More than C.F.
15 – 24        3           19.5          14.5 – 24.5        0.06   6%       3                50
25 – 34        4           29.5          24.5 – 34.5        0.08   8%       7                47
35 – 44        10          39.5          34.5 – 44.5        0.20   20%      17               43
45 – 54        15          49.5          44.5 – 54.5        0.30   30%      32               33
55 – 64        12          59.5          54.5 – 64.5        0.24   24%      44               18
65 – 74        4           69.5          64.5 – 74.5        0.08   8%       48               6
75 – 84        2           79.5          74.5 – 84.5        0.04   4%       50               2
Total          50                                           1.00   100%

Adding two class marks with $f_i = 0$, namely 9.5 at the beginning and 89.5 at the end, the
following frequency polygon is plotted:

Figure: Frequency polygon (vertical axis: Frequency; horizontal axis: Class mark, from 9.5 to 89.5).
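
The points that are joined to form the frequency polygon can be listed with a short Python sketch.
The variable names are illustrative; the two end points with zero frequency are obtained by
stepping one class width outside the first and last class marks.

```python
# Class limits and frequencies from Example 2.10
limits = [(15, 24), (25, 34), (35, 44), (45, 54), (55, 64), (65, 74), (75, 84)]
frequencies = [3, 4, 10, 15, 12, 4, 2]

marks = [(lower + upper) / 2 for lower, upper in limits]   # class marks
width = marks[1] - marks[0]                                # distance between marks

# Add one class with zero frequency at each end to close the polygon
points = ([(marks[0] - width, 0)]
          + list(zip(marks, frequencies))
          + [(marks[-1] + width, 0)])

for mark, f in points:
    print(mark, f)
```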

Ogive (cumulative frequency polygon)

An Ogive (pronounced as “oh-jive”) is a line that depicts cumulative frequencies, just as the
cumulative frequency distribution lists cumulative frequencies. Note that the Ogive uses class
boundaries along the horizontal scale and the corresponding cumulative frequencies are plotted
along the vertical axis. The points are joined by a free hand curve. The graph begins with the lower
boundary of the first class and ends with the upper boundary of the last class.

An ogive is useful for determining the number of values below or above some particular value. There
are two types of ogive, namely the less than ogive and the more than ogive. The difference is that
the less than ogive uses the less than cumulative frequency, and the more than ogive uses the more
than cumulative frequency, on the y axis.

Example 2.12: Draw both types of ogives for the F.D. of Example 2.10.

Solutions:

Figure: The less than ogive and the more than ogive (vertical axis: Cumulative frequency, from 0
to 60; horizontal axis: Class boundaries, from 14.5 to 84.5).

Note: For both ogives, one class with frequency zero is added, for the same reason as in the
frequency polygon.
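
The coordinates of both ogives for the frequency distribution of Example 2.10 can be computed as
in the following Python sketch (variable names are illustrative). Each cumulative frequency is
paired with a class boundary, so there is one more point than there are classes.

```python
from itertools import accumulate

boundaries = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5]
frequencies = [3, 4, 10, 15, 12, 4, 2]
total = sum(frequencies)

# Number of values below each boundary (starts at 0 at the first lower boundary)
less_than = [0] + list(accumulate(frequencies))
# Number of values above each boundary (ends at 0 at the last upper boundary)
more_than = [total - c for c in less_than]

less_than_ogive = list(zip(boundaries, less_than))
more_than_ogive = list(zip(boundaries, more_than))
print(less_than_ogive)
print(more_than_ogive)
```

The two curves cross where the less than and more than cumulative frequencies are equal, that is,
at half of the total frequency.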

Stem and leaf plots

The stem and leaf plot is a method of organizing data and is a combination of sorting and graphing.
It has the advantage over a grouped frequency distribution of retaining the actual data while
showing them in graphical form. A stem and leaf plot is a data plot that uses part of the data
value as the stem and part of the data value as the leaf to form groups or classes.

For example, a data value of 34 would have 3 as the stem and 4 as the leaf. A data value of 356
would have 35 as the stem and 6 as the leaf. The following example shows the procedure for
constructing a stem and leaf plot.

Example: At an outpatient testing center, the number of cardiograms performed each day for 20
days is shown. Construct a stem and leaf plot for the data.


Note: Arranging the data in order is not essential and can be cumbersome when the data set is
large; however, it is helpful in constructing a stem and leaf plot. The leaves in the final stem and
leaf plot should be arranged in order.


The above graph shows that the distribution peaks in the center and that there are no gaps in the
data. For 7 of the 20 days, the number of patients receiving cardiograms was between 31 and 36.
The plot also shows that the testing center treated from a minimum of 2 patients to a maximum of
57 patients in any one day. If there are no data values in a class, you should write the stem number
and leave the leaf row blank. Do not put a zero in the leaf row.
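
The procedure can be sketched in Python. Because the cardiogram data are not reproduced here, the
sketch reuses the 20 values of Example 2.4; it assumes non-negative whole numbers, with the last
digit as the leaf and the remaining digits as the stem. The function name `stem_and_leaf` is
illustrative.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return {stem: sorted leaves}; stems with no data keep an empty leaf row."""
    groups = defaultdict(list)
    for value in sorted(data):            # sorting puts the leaves in order
        stem, leaf = divmod(value, 10)    # e.g. 34 -> stem 3, leaf 4
        groups[stem].append(leaf)
    return {stem: groups.get(stem, [])
            for stem in range(min(groups), max(groups) + 1)}

data = [11, 29, 33, 22, 27, 19, 22, 21, 18, 17,
        22, 38, 26, 39, 27, 6, 34, 13, 20, 32]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```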


Activity 2.3:

1. The following data represent the time to tumor progression, measured in months, for 65 patients
having a particular type of brain tumor called glioblastoma:

6, 5, 37, 10, 22, 9, 2, 16 , 3, 3, 11 , 9, 5, 14 , 11, 3, 1, 4, 6, 2, 7, 3 , 7 , 5, 4 8 , 2, 7, 13,


16 , 15, 9, 4, 4, 2, 3, 9, 5, 11, 7, 5, 9, 3, 8, 9, 4, 10, 3, 2 , 7, 6, 9, 3, 5, 4, 6 , 4, 14 , 3,
6, 12 , 8, 12, 7

a) Represent the data by a histogram using both the relative and absolute frequencies and compare
the two histograms.

b) Make up a cumulative more than and cumulative less than frequency distribution for this data
set.

c) Which values of the distribution seem to be typical?

d) Plot the relative frequencies in a frequency polygon.

e) Draw the less than and more than ogive for this data on the same coordinate plane.

f) From the graph of the less than and more than ogives, roughly estimate the point below (or
above) which half of the observations lie.

g) Plot the data by using stem and leaf plots

Exercise 2

1. Classify the following as discrete or continuous variable.

a) Temperature; b) number of courses offered in MAU; c) rain fall; d) age.

2. Distinguish between primary and secondary data. What precautions should be taken before
using secondary data?
3. Construct a frequency distribution for a survey taken at a hotel, in which 40 tourists arrived
by the following means of transportation:
car car bus plane plane car plane plane bus car plane car car car plane
bus car bus car plane car car car bus car bus bus plane plane
plane car plane plane plane bus bus car car plane car
4. The following are weekly salaries (in birr) of employees of a firm:

91 139 126 119 100 87 61 77 99 95


88 112 118 89 116 97 105 95 80 86
108 106 127 93 86 135 148 116 76 69
The data are to be presented in a frequency distribution.

a) How many classes can be used? b) What class width should be used?

c) What LCL would be used for the first class? d) Prepare the complete frequency distribution.
5. Given the following frequency distribution:
Class limit 0-1 2-3 4-5 6-7 8-9
frequency 16 25 13 4 2
Find a) the class marks; b) the class boundaries; c) the relative frequencies


CHAPTER THREE
MEASURES OF CENTRAL TENDENCY
At the end of this chapter students will be able to:

• Identify measures of central tendency.
• Understand the properties of the arithmetic mean.
• Summarize an aggregate of statistical data by using a single measure.
• Define and calculate the mean, mode and median.
• Measure the position of data using quartiles, deciles and percentiles, with their
interpretation.

3.1 Introduction

Measures of central tendency give us information about the location of the center of the
distribution of data values. A single value that approximately describes the characteristics of the
entire mass of data is called a measure of central tendency. It helps us condense a mass of data
into a single value which is in some sense representative of the whole data set.

An average is a single value intended to represent a distribution as a whole.

Note that the individual values of the distribution must have a tendency to cluster around an
average. In view of this requirement, an average is also referred to as a measure of central
tendency.

3.2 Objectives of measuring central tendency

• To comprehend the data easily.
• To get one single value that describes the characteristics of the entire group.
• To facilitate comparison.
• To make further statistical analysis.

3.3 The Summation Notation (∑)

Statistical symbols: Let a data set consist of a number of observations, represented by
$x_1, x_2, \ldots, x_n$, where n (the last subscript) denotes the number of observations in the data
and $x_i$ is the i-th observation. Then the sum of all the numbers ($x_i$'s), where i goes from 1
up to n, is symbolically given by $\sum_{i=1}^{n} x_i$, or $\sum x_i$, or $\sum x$; that is,

$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$

$x$ - whole set of numbers

$x_i$ - specific score in a set of numbers

$n$ - total number of observations

For instance, a data set consisting of the six measurements 2, 3, 9, 10, 8 and -2 is represented by
$x_1, x_2, \ldots, x_6$, where $x_1 = 2$, $x_2 = 3$, $x_3 = 9$, $x_4 = 10$, $x_5 = 8$ and
$x_6 = -2$. Their sum becomes
$\sum_{i=1}^{6} x_i = x_1 + x_2 + \cdots + x_6 = 2 + 3 + 9 + 10 + 8 + (-2) = 30$.

Some Properties of the Summation Notation

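1. $\sum_{i=1}^{n} c = nc$, where c is a constant number.
2. $\sum_{i=1}^{n} b x_i = b \sum_{i=1}^{n} x_i$, where b is a constant number.
3. $\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$
4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$
5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$ (in general)

Example 3.1: Given $\sum_{i=1}^{7} x_i = 20$, $\sum_{i=1}^{7} y_i = 30$,
$\sum_{i=1}^{7} x_i^2 = 420$ and $\sum_{i=1}^{7} y_i^2 = 280$, find:

i/ $\sum_{i=1}^{7} (6x_i + 4y_i) = 6\sum_{i=1}^{7} x_i + 4\sum_{i=1}^{7} y_i = 6(20) + 4(30) = 240$

ii/ $3\sum_{i=1}^{7} x_i^2 - 2\sum_{i=1}^{7} y_i^2 = 3(420) - 2(280) = 700$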
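
The properties of the summation notation can be checked numerically. The Python sketch below uses
the six measurements 2, 3, 9, 10, 8 and -2 for x; the second list y and the constants a, b and c
are hypothetical values chosen only for illustration.

```python
x = [2, 3, 9, 10, 8, -2]    # the six measurements used in the text
y = [1, 0, 4, -3, 5, 2]     # hypothetical second variable, for illustration
n = len(x)
a, b, c = 5, 3, 7           # arbitrary constants

total = sum(x)                                                    # 30
prop1 = sum(c for _ in range(n)) == n * c                         # property 1
prop2 = sum(b * xi for xi in x) == b * sum(x)                     # property 2
prop3 = sum(a + b * xi for xi in x) == n * a + b * sum(x)         # property 3
prop4 = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)   # property 4

# Property 5: the sum of products is not the product of the sums (in general)
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))
product_of_sums = sum(x) * sum(y)

print(total, prop1, prop2, prop3, prop4, sum_of_products, product_of_sums)
```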

3.4. Important characteristics of measures of central tendency

1. Rigidly defined (unique).

2. Based on all observation under investigation.

3. Easily understood.

4. Simple to compute.

5. Suitable for further mathematical treatment.

6. Little affected by fluctuations of sampling.


7. Not highly affected by extreme values.

3.5. Types of Measures of Central Tendency

The following are the types of measures of central tendency. Each is suitable for a particular type
of data and has its own advantages and disadvantages.

These are:

• Mean (arithmetic, weighted arithmetic, combined, geometric, harmonic)
• Median
• Mode or modal value

3.5.1 Mean (Arithmetic, Weighted, Combined, geometric, harmonic)

a. Arithmetic Mean: - Arithmetic mean is defined as the sum of the measurements of the items
divided by the total number of items. It is usually denoted by x̅.

Arithmetic Mean for individual series

Suppose $x_1, x_2, \ldots, x_n$ are the observed values in a sample of size n from a population of
size N, n < N. Then the arithmetic mean of the sample, denoted by $\bar{x}$, is given by

$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$ ……… (3.1)

If we take the entire population, the mean is denoted by μ and is given by:

$\mu = \dfrac{X_1 + X_2 + \cdots + X_N}{N} = \dfrac{\sum_{i=1}^{N} X_i}{N}$

where N stands for the total number of observations in the population.

Example 3.2: Consider the samples given below:

i. 46 54 21 35

ii. 10.5 2.4 3.6 5.9 8.7

Find the arithmetic mean

Solution:

i. The sample values are: 46 54 21 35

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{46 + 54 + 21 + 35}{4} = \dfrac{156}{4} = 39$

The arithmetic mean of the sample values is 39.

ii. The sample values are: 10.5 2.4 3.6 5.9 8.7

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{10.5 + 2.4 + 3.6 + 5.9 + 8.7}{5} = \dfrac{31.1}{5} = 6.22$

The arithmetic mean of the sample values is 6.22.
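
Formula 3.1 translates directly into code. A minimal Python sketch (the function name
`arithmetic_mean` is illustrative) applied to the two samples of Example 3.2:

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

mean_i = arithmetic_mean([46, 54, 21, 35])
mean_ii = arithmetic_mean([10.5, 2.4, 3.6, 5.9, 8.7])

print(mean_i)             # sample i
print(round(mean_ii, 2))  # sample ii, rounded to two decimal places
```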

Arithmetic mean for discrete data arranged in frequency distribution

When the numbers $x_1, x_2, \ldots, x_k$ occur with frequencies $f_1, f_2, \ldots, f_k$,
respectively, then the mean can be expressed in a more compact form as:

$\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$ ……… (3.2)

Example 3.3: Calculate the arithmetic mean of the sample of numbers of students in 10 classes:

50 42 48 60 58 54 50 42 50 42

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{50 + 42 + 48 + 60 + 58 + 54 + 50 + 42 + 50 + 42}{10} = \dfrac{496}{10} = 49.6 \approx 50$

In this case there are three 42's, one 48, three 50's, one 54, one 58 and one 60. The number of
times each number occurs is called its frequency, and the frequency is usually denoted by f. This
information can be written in a table, as follows.

Value, x_i       42    48   50    54   58   60
Frequency, f_i   3     1    3     1    1    1
x_i f_i          126   48   150   54   58   60

The formula for the arithmetic mean for data of this type is

$\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$

In this case we have:

$\bar{x} = \dfrac{42(3) + 48(1) + 50(3) + 54(1) + 58(1) + 60(1)}{3 + 1 + 3 + 1 + 1 + 1} = \dfrac{126 + 48 + 150 + 54 + 58 + 60}{10} = \dfrac{496}{10} = 49.6 \approx 50$

The mean number of students in the ten classes is approximately 50.
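
The same calculation can be sketched in Python using the value and frequency rows of the table.
The function name `mean_from_frequencies` is illustrative.

```python
def mean_from_frequencies(values, frequencies):
    """Mean of discrete data given as values x_i with frequencies f_i."""
    weighted_total = sum(x * f for x, f in zip(values, frequencies))
    return weighted_total / sum(frequencies)

values = [42, 48, 50, 54, 58, 60]
frequencies = [3, 1, 3, 1, 1, 1]

mean_students = mean_from_frequencies(values, frequencies)
print(mean_students)
```

Working from the frequency table gives the same result as adding the ten raw values one by one,
because each product x_i f_i simply collects the repeated values.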

Arithmetic Mean for Grouped Continuous Frequency Distribution

If data are given in the form of a continuous frequency distribution, the sample mean can be
computed as

$\bar{x} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \dfrac{x_1 f_1 + x_2 f_2 + \cdots + x_k f_k}{f_1 + f_2 + \cdots + f_k}$ ……… (3.3)

Here $x_i$ is the class mark of the i-th class (i = 1, 2, ..., k), $f_i$ is the frequency of the
i-th class and k is the number of classes.

Note that $\sum_{i=1}^{k} f_i = n$ = the total number of observations.

Example 3.4: The following frequency table gives the height (in inches) of 100 students in a
college.

Class Interval (CI) 60-62 62-64 64-66 66-68 68-70 70-72 Total
Frequency (f) 5 18 42 20 8 7 100
Calculate the mean

Solution:

The formula to be used for the mean is as follows:

$\bar{x} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$

For convenience, let us calculate these values and arrange them in a table.

Class Interval (CI)   60-62   62-64   64-66   66-68   68-70   70-72   Total
Frequency (f_i)       5       18      42      20      8       7       100
Mid-point (x_i)       61      63      65      67      69      71
f_i x_i               305     1134    2730    1340    552     497     6558

Substituting these values with $\sum_{i=1}^{6} f_i = 100$, we get

$\bar{x} = \dfrac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \dfrac{6558}{100} = 65.58$

The mean height of the students is 65.58 inches.
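
For grouped data the only extra step is replacing each class by its class mark. A short Python
sketch for Example 3.4 (variable names are illustrative):

```python
# Class intervals (inches) and frequencies from Example 3.4
classes = [(60, 62), (62, 64), (64, 66), (66, 68), (68, 70), (70, 72)]
frequencies = [5, 18, 42, 20, 8, 7]

# Class mark = mid-point of each class interval
marks = [(lower + upper) / 2 for lower, upper in classes]

# Formula 3.3: sum of (class mark x frequency) divided by total frequency
grouped_mean = sum(m * f for m, f in zip(marks, frequencies)) / sum(frequencies)
print(round(grouped_mean, 2))
```

The result is an approximation to the mean of the raw heights, since every observation in a class
is treated as if it were equal to the class mark.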

Properties of the Arithmetic Mean

• The algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean
$\bar{x}$ is always zero, i.e. $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.

• The sum of the squares of the deviations from the mean is the least, compared with the
deviations from any other value. That is, $\sum_{i=1}^{n} (x_i - A)^2$ is minimum when
$A = \bar{x}$.

• If a constant k is added to (or subtracted from) every observation, then the new mean will be
the old mean plus (or minus) k.

• If every observation is multiplied by a constant k, then the new mean will be k times the old
mean.

Example 3.5:

1. The mean of n Tetracycline capsules $X_1, X_2, \ldots, X_n$ is known to be 12 gm. A new set of
capsules of another drug is obtained by the linear transformation $Y_i = 2X_i - 0.5$
(i = 1, 2, …, n). What will be the mean of the new set of capsules?

New mean = 2 × old mean – 0.5 = 2 × 12 – 0.5 = 23.5

2. The mean of a set of numbers is 500.

a. If 10 is added to each of the numbers in the set, what will be the mean of the new set?
b. If each of the numbers in the set is multiplied by -5, what will be the mean of the new set?

Solutions:

a. New mean = old mean + 10 = 500 + 10 = 510

b. New mean = -5 × old mean = -5 × 500 = -2500
• If a wrong figure has been used when calculating the mean, the correct mean can be
obtained without repeating the whole process using:

$\text{Correct mean} = \text{Wrong mean} + \dfrac{\text{Correct value} - \text{Wrong value}}{n}$

where n is the total number of observations.

Example 3.6: The average weight of 10 students was calculated to be 65 kg. Later it was discovered
that one weight was misread as 40 kg instead of 80 kg. Calculate the correct average weight.

Solution: Using the correct mean formula above,

$\text{Correct mean} = 65 + \dfrac{80 - 40}{10} = 65 + 4 = 69 \text{ kg}$
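
The properties of the arithmetic mean can be verified with a small Python sketch. The sample
46, 54, 21, 35 from Example 3.2 is reused for the first three checks, and the function names
`mean` and `corrected_mean` are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def corrected_mean(wrong_mean, n, wrong_value, correct_value):
    """Adjust a mean that was computed with one misread observation."""
    return wrong_mean + (correct_value - wrong_value) / n

sample = [46, 54, 21, 35]
m = mean(sample)

deviation_sum = sum(x - m for x in sample)      # deviations from the mean sum to zero
shifted_mean = mean([x + 10 for x in sample])   # adding 10 shifts the mean by 10
scaled_mean = mean([-5 * x for x in sample])    # multiplying by -5 scales the mean by -5

corrected = corrected_mean(65, 10, wrong_value=40, correct_value=80)  # Example 3.6
print(deviation_sum, shifted_mean, scaled_mean, corrected)
```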

Merits of Arithmetic Mean

• The arithmetic mean has a rigidly defined mathematical formula, so its value is always
definite or unique. It can be calculated for any set of numerical data.
• It is calculated based on all observations.
• The arithmetic mean is simple to calculate and easy to understand.
• It doesn't need arrangement of data in increasing or decreasing order to calculate the
results.
• The arithmetic means of many samples from the same population do not fluctuate
considerably.
• It affords a good standard of comparison.

Demerits of Arithmetic Mean

• It can't be calculated for data which are not quantifiable.
• It is highly affected by extreme (abnormal) values in the series.
• It can be a number which does not exist in the series.
• It can't be calculated for grouped continuous open-ended classes.

b. Weighted Arithmetic Mean

When calculating the simple arithmetic mean, all items are assumed to be of equal importance
(each value in the data set has equal weight). When the observations have different weights, we use
the weighted average. Weights are assigned to each item in proportion to its relative importance.

If $x_1, x_2, \ldots, x_n$ represent the values of the items and $w_1, w_2, \ldots, w_n$ are the
corresponding weights, then the weighted mean ($\bar{x}_w$) is given by

$\bar{x}_w = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \dfrac{\sum w_i x_i}{\sum w_i}$ ……… (3.4)

Example 3.7: A student's final grades in Mathematics, Physics, Chemistry and Information
Technology are A, B, D and C respectively. If the respective credits (weights) received for these
courses are 4, 4, 3 and 2, determine the average grade the student has got for the courses.

Solution

We use a weighted arithmetic mean, the weight associated with each course being the number of
credits received for that course. The grades are converted to grade points as A = 4, B = 3, C = 2
and D = 1.

x_i       4    3    1   2   Total
w_i       4    4    3   2   13
x_i w_i   16   12   3   4   35

$\bar{x}_w = \dfrac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \dfrac{\sum w_i x_i}{\sum w_i} = \dfrac{16 + 12 + 3 + 4}{13} = \dfrac{35}{13} = 2.69$

The average grade of the student is approximately 2.69.

Example 3.8:

In a vacancy for the position of database officer in an organization, the selection criteria were
work experience, entrance exam and interview result. The relative importance of these criteria
was regarded to be different. The weights of these criteria and the scores obtained by 3 candidates
(out of 100 in each criterion) are given in the following table. The selection of a
candidate is based on the average result on these criteria.

Criterion         Weight   Betselot   Tasew   Endawok
Work experience   4        70         89      85
Entrance exam     3        78         83      89
Interview exam    2        90         92      90

Who is the appropriate candidate for the position based on the criteria?

Solution: We use the weighted mean since the relative importance of these criteria is different.

Criterion         Weight (wi)   Betselot        Tasew           Endawok
                                xi    wi*xi     xi    wi*xi     xi    wi*xi
Work experience   4             70    280       89    356       85    340
Entrance exam     3             78    234       83    249       89    267
Interview exam    2             90    180       92    184       90    180
Total             9             238   694       264   789       264   787

The weighted mean and the simple arithmetic mean for the applicants are as follows:

Applicant                Betselot        Tasew           Endawok
Weighted mean            694/9 = 77.11   789/9 = 87.67   787/9 = 87.44
Simple arithmetic mean   238/3 = 79.33   264/3 = 88      264/3 = 88

If we use the simple arithmetic mean of the scores, both Tasew and Endawok have equal
chances of being recruited. However, the relative importance of the criteria is different, so we have
to use the weighted mean to discriminate among the candidates. The weighted mean of the
scores obtained by Tasew is larger than the others. So Tasew should be recruited for the job.
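
The weighted means in Examples 3.7 and 3.8 can be checked with a short Python sketch. The function name `weighted_mean` and the dictionary layout are illustrative choices, not part of the original material.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i), as in formula 3.4."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Weights for work experience, entrance exam and interview (Example 3.8).
weights = [4, 3, 2]
scores = {
    "Betselot": [70, 78, 90],
    "Tasew": [89, 83, 92],
    "Endawok": [85, 89, 90],
}

for name, marks in scores.items():
    print(name, round(weighted_mean(marks, weights), 2))
```

Running the sketch should reproduce the weighted means in the table above, with Tasew ranked first.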

c. Combined mean: When a set of observations is divided into k groups and x̅1 is the mean of n1
observations of group 1, x̅2 is the mean of n2 observations of group 2, …, x̅k is the mean of nk
observations of group k, then the combined mean, denoted by x̅c, of all observations taken together
is given by

$$\bar{x}_c = \frac{\bar{x}_1 n_1 + \bar{x}_2 n_2 + \cdots + \bar{x}_k n_k}{n_1 + n_2 + \cdots + n_k} \tag{3.5}$$

This is a special case of the weighted mean. In this case the sample sizes are the weights.

Example 3.9: In the previous year there were two sections taking the Statistics course. At the end of
the semester, the two sections got average marks of 70 and 78. There were 45 and 50 students in
the sections, respectively. Find the mean mark for all the students.

Solution:

$$\bar{x}_c = \frac{\bar{x}_1 n_1 + \bar{x}_2 n_2}{n_1 + n_2} = \frac{70 \times 45 + 78 \times 50}{45 + 50} = \frac{7050}{95} = 74.21$$

The combined mean for all the students is 74.21.
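
As a quick check of formula 3.5, the following Python sketch computes a combined mean from group means and group sizes. The function name `combined_mean` is an illustrative choice.

```python
def combined_mean(group_means, group_sizes):
    """Combine group means, weighting each by its group size (formula 3.5)."""
    if len(group_means) != len(group_sizes):
        raise ValueError("one size is needed per group mean")
    total = sum(group_sizes)
    return sum(m * n for m, n in zip(group_means, group_sizes)) / total


# Example 3.9: two sections with means 70 and 78 and sizes 45 and 50.
print(round(combined_mean([70, 78], [45, 50]), 2))
```

The same function can be applied to Activity 3.1 by passing the three plant means and sample sizes.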

Activity 3.1

An industry producing heavy metals has three plants in different cities. A sample survey in the
industry revealed that the average salaries paid to the employees working in plants A, B and C are
$859, $890 and $900, respectively. The sample sizes taken in plants A, B and C are 675, 700 and 750,
respectively. Find the mean salary paid in the whole sample of employees.

d. Geometric Mean

The geometric mean, like the arithmetic mean, is a calculated average. It is used when observed values
are measured as ratios, percentages, proportions, indices or growth rates.

Geometric mean for individual series: The geometric mean, G.M., of an individual series of
positive numbers (> 0) x1, x2, …, xn is defined as the nth root of their product.

$$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \operatorname{antilog}\left(\frac{1}{n}\sum \log x_i\right) \tag{3.6}$$

Example 3.10: Find the G.M. of (a) 3 and 12; (b) 2, 4 and 8.

Solution: a) $GM = \sqrt{3 \times 12} = \sqrt{36} = 6$; b) $GM = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$
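
The logarithmic form of formula 3.6 translates directly into code. The sketch below is a minimal illustration using natural logarithms; the function name is an assumption made here, and Python's standard `statistics.geometric_mean` (available from Python 3.8) could be used instead.

```python
import math


def geometric_mean(values):
    """Geometric mean via antilog of the mean of the logarithms (formula 3.6)."""
    if any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    mean_log = sum(math.log(x) for x in values) / len(values)
    return math.exp(mean_log)


# Example 3.10
print(round(geometric_mean([3, 12]), 6))
print(round(geometric_mean([2, 4, 8]), 6))
```

Working with logarithms avoids forming a very large product when the series is long.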

Properties of geometric mean

• It is less affected by extreme values. E.g. for x = 2, 5, 8, 72, find and compare the arithmetic and
geometric means.
• It takes each and every observation into consideration.
• If the value of any observation is zero, the geometric mean becomes zero.

Geometric mean for discrete data arranged in FD: When the numbers x1, x2, …, xm occur
with frequencies f1, f2, …, fm, respectively, then the geometric mean is obtained by

$$G.M. = \sqrt[n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_m^{f_m}} = \operatorname{antilog}\left(\frac{1}{n}\sum f_i \log x_i\right) \tag{3.7}$$

where n is the sum of fi for all i.

Example 3.11: Compute the geometric mean of the following values: 3, 3, 4, 4, 4, 5, 6 and 6.

Solution

Values      3   4   5   6
Frequency   2   3   1   2

$$G.M. = \sqrt[8]{3^2 \times 4^3 \times 5^1 \times 6^2} = 4.236$$

The geometric mean for the given data is 4.236.

Geometric mean for continuous grouped FD: The above formula can also be used when
the frequency distribution is grouped continuous; the class marks of the class intervals are taken
as xi.

e. Harmonic Mean

It is a suitable measure of central tendency when the data pertain to speed, rate and time. The
harmonic mean of n values is defined as n divided by the sum of their reciprocals.

Harmonic mean for individual series: If x1, x2, …, xn are n observations, then the harmonic mean
can be represented by the following formula:

$$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \tag{3.8}$$

Example 3.12: A car travels 25 miles at 25 mph, 25 miles at 50 mph, and 25 miles at 75 mph. Find
the average speed (the harmonic mean of the three speeds).

Solution:

$$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{3}{\frac{1}{25} + \frac{1}{50} + \frac{1}{75}} \approx 40.9 \text{ mph}$$
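
The following Python sketch illustrates why the harmonic mean is the right average here: for equal distances, it agrees with total distance divided by total time. The function and variable names are illustrative; the standard library also offers `statistics.harmonic_mean`.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals (formula 3.8)."""
    if any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    return len(values) / sum(1 / x for x in values)


# Example 3.12: three legs of 25 miles each at different speeds.
speeds = [25, 50, 75]
leg_distance = 25

hm = harmonic_mean(speeds)

# Cross-check: average speed = total distance / total time.
total_distance = leg_distance * len(speeds)
total_time = sum(leg_distance / v for v in speeds)

print(round(hm, 1), round(total_distance / total_time, 1))
```

Both printed values should agree, which is the reason the arithmetic mean of the speeds (50 mph) would overstate the average speed.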

Harmonic mean for discrete data arranged in FD: If the data are arranged in the form of a
frequency distribution,

$$H.M. = \frac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_m}{x_m}}, \quad \text{where } n = \sum_{k=1}^{m} f_k \tag{3.9}$$

Harmonic mean for continuous grouped FD: When the frequency distribution is grouped
continuous, the class marks of the class intervals are taken as xi and the above formula can be
written as

$$H.M. = \frac{n}{\sum_{i=1}^{m} \frac{f_i}{x_i}}, \quad \text{where } n = \sum_{k=1}^{m} f_k \tag{3.10}$$

and xi is the class mark of the ith class.

Properties of harmonic mean

• It is unique for a given set of data.
• It takes each and every observation into consideration.
• It is difficult to calculate and understand.
• It is an appropriate measure of central tendency in situations where the data are in time, speed or rate.

Relations among different means

i. If all the observations are positive, the three means satisfy x̅ ≥ GM ≥ HM.

ii. For two observations, $\sqrt{\bar{x} \times HM} = GM$.

iii. x̅ = GM = HM if all observations are positive and have equal value.
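
These relations can be illustrated numerically. The sketch below computes the three means for the series 2, 5, 8, 72 used earlier and checks relation ii on an arbitrary pair of values (4 and 16, chosen here only for illustration). A numerical check is not a proof, but it helps confirm the direction of the inequalities.

```python
import math


def three_means(values):
    """Return (arithmetic, geometric, harmonic) means of positive values."""
    n = len(values)
    am = sum(values) / n
    gm = math.exp(sum(math.log(x) for x in values) / n)
    hm = n / sum(1 / x for x in values)
    return am, gm, hm


am, gm, hm = three_means([2, 5, 8, 72])
print(round(am, 2), round(gm, 2), round(hm, 2))
print(am >= gm >= hm)

# Relation ii for two observations: sqrt(AM * HM) equals GM.
pair_am, pair_gm, pair_hm = three_means([4, 16])
print(math.isclose(math.sqrt(pair_am * pair_hm), pair_gm))
```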

Activity 3.2

A data base officer travelled by car for 3 days to visit fauna and flora of a certain region. He
covered 480 km each day. On the first day he drove for 10 hours at 48 km an hour, on the second
day he drove for 12 hours at 40 km an hour, and on the last day he drove for 15 hours at 32 km per
hour. What was his average speed?

3.5.2 Median

The median is, as its name indicates, the middlemost value in the arrangement, which divides the
data into two equal parts. It is obtained by arranging the data in increasing or decreasing order
of magnitude and is denoted by x̃.

Median for individual series: We arrange the sample in ascending order of the variable of
interest. Then the median is the middle value (if the sample size n is odd) or the average of the two
middle values (if the sample size n is even).

For individual series the median is obtained by

a/ $$\tilde{x} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value, if } n \text{ is odd, and}$$

b/ $$\tilde{x} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ value} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ value}}{2}, \text{ if } n \text{ is even} \tag{3.11}$$

Example 3.13: Find the median for the following data.

a/ -5 15 10 5 0 2 1 4 6 and 8

b/ 5 2 2 3 1 8 4

Solution:

a. The data in ascending order are:

-5 0 1 2 4 5 6 8 10 15

Here n = 10, which is even. The two middle values are the 5th and 6th observations. So the median is

$$\tilde{x} = \frac{\left(\frac{10}{2}\right)^{\text{th}} \text{ value} + \left(\frac{10}{2}+1\right)^{\text{th}} \text{ value}}{2} = \frac{5^{\text{th}} \text{ value} + 6^{\text{th}} \text{ value}}{2} = \frac{4+5}{2} = 4.5$$

b. The data in ascending order are:

1 2 2 3 4 5 8

The middle value is the 4th observation. So the median is 3.

Note: The median is easy to calculate for small samples and is not affected by an "outlier".
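
The odd and even cases of the median rule can be written as a small Python function. This is a minimal sketch with an illustrative function name; the standard `statistics.median` behaves the same way on such data.

```python
def median(values):
    """Median of raw data: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th value, which has 0-based index n // 2.
        return ordered[mid]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Example 3.13
print(median([-5, 15, 10, 5, 0, 2, 1, 4, 6, 8]))
print(median([5, 2, 2, 3, 1, 8, 4]))
```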

Median for discrete data arranged in a frequency distribution: In this case also, the median
is obtained by the above formula. After arranging the values in increasing order, find the smallest
CF greater than or equal to the rank/position of the median value (i.e., the position given by
formulas a and b above); the corresponding value is the median.

Example 3.14: A shopkeeper (salesperson) recorded the number of video cassette recorders
(VCRs) sold per month over a two-year period. Find the median.

Number of sets sold   Frequency (months)   Cumulative frequency
1                     3                    3
2                     8                    11
3                     5                    16
4                     4                    20
5                     2                    22
6                     1                    23
7                     1                    24

The number of observations is n = 24, which is even.

$$\tilde{x} = \frac{12^{\text{th}} \text{ observation} + 13^{\text{th}} \text{ observation}}{2} = \frac{3+3}{2} = 3$$

Median for grouped continuous data: For continuous data, the median is obtained by the
following formula.

$$\text{Median} = \tilde{x} = L + \frac{w}{f_{med}}\left(\frac{n}{2} - CF\right) \tag{3.12}$$

where: L = the lower class boundary of the median class;

w = the class width of the median class;

fmed = the frequency of the median class; and

CF = the cumulative frequency corresponding to the class preceding the median class, that is, the
sum of the frequencies of all classes lower than the median class.

The median class is the class which contains the (n/2)th observation, whether
n is odd or even, since the items have already lost their identity once they are grouped into
continuous classes.

Example 3.15: Calculate the median for the following frequency distribution.

C.I     1-5   6-10   11-15   16-20   21-25   26-30   31-35   Total
Freq.   4     8      12      6       3       4       3       40

Solution: Construct the less-than cumulative frequency distribution:

C.I           1-5   6-10   11-15   16-20   21-25   26-30   31-35   Total
Freq.         4     8      12      6       3       4       3       40
Cuml. Freq.   4     12     24      30      33      37      40

Since n = 40, n/2 = 20, and the smallest CF greater than or equal to 20 is 24; thus, the median
class is the third class. For this class, L = 10.5, w = 5, fmed = 12 and CF = 12. Applying the
formula, we get:

$$\tilde{x} = 10.5 + \frac{5}{12}(20 - 12) = 13.8$$
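
The grouped-median procedure (locate the median class from the cumulative frequencies, then interpolate) can be sketched in Python as follows. The function name and argument layout are assumptions made for this illustration, and equal class widths are assumed.

```python
def grouped_median(lower_boundaries, width, freqs):
    """Median of grouped continuous data using formula 3.12.

    lower_boundaries: lower class boundary of each class, in ascending order
    width: common class width (assumed equal for all classes)
    freqs: class frequencies
    """
    n = sum(freqs)
    half = n / 2
    cumulative = 0  # CF of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= half:
            # First class whose cumulative frequency reaches n/2.
            return lower + (width / f) * (half - cumulative)
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")


# Example 3.15: class limits 1-5, 6-10, ..., 31-35 give boundaries 0.5, 5.5, ...
boundaries = [0.5, 5.5, 10.5, 15.5, 20.5, 25.5, 30.5]
frequencies = [4, 8, 12, 6, 3, 4, 3]
print(round(grouped_median(boundaries, 5, frequencies), 2))
```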

Merits of median

• It is less affected by extreme values.
• Median can be calculated even in the case of open-ended intervals.
• It can be computed for ratio, interval, and ordinal levels of data.

Demerits of median

• Its value is not determined by each and every observation.
• It is not a good representative of the data if the number of items (data) is small.
• The arrangement of items in order of magnitude is sometimes a very tedious process if the
number of items is very large.

3.5.3 The Mode or modal value

The mode or the modal value is the value with the highest frequency and is denoted by x̂. A data set
may not have a mode or may have more than one mode. A distribution is called a bimodal
distribution if it has two data values that appear with the greatest frequency. If a distribution has
more than two modes, then the distribution is multimodal. If a distribution has no mode, then the
distribution is non-modal.

Mode of individual series: - The mode or the modal value of individual series (raw data) is simply
obtained by locating the observation with the maximum frequency.

Example 3.16: Consider the following data:

a. 30 45 69 70 32 18 32. The mode (x̂ ) = 32.

b. 10 20 30 10 40 30. The mode (x̂ ) = 10 and 30.

c. 10 40 30 20 50 60. No mode.

Note that in some samples there may be more than one mode or there may not be a mode. The
mode is not a suitable measure of central tendency in these cases. We use the mode as a measure
of central tendency if we require a measure that takes on one of the sample values. The mode can
be used for variables that are measured on a categorical (nominal) scale, e.g. the most popular
computer type.

Mode for discrete data arranged in a frequency distribution: In the case of discrete grouped
data, the mode is determined just by looking for the value(s) having the highest frequency.

Mode for Grouped Continuous Frequency Distribution

For grouped continuous data, one can only determine the modal class easily: the class with the
highest frequency. After locating this class, the mode is interpolated using:

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w \tag{3.13}$$

where L = the lower class boundary of the modal class;

Δ1 = fmod − f1, Δ2 = fmod − f2, w = the common class width;

f1 = frequency of the class immediately preceding the modal class;

f2 = frequency of the class immediately succeeding the modal class; and

fmod = frequency of the modal class.

Example 3.13: Calculate the mode for the frequency distribution of Example 3.15.

Solution:

By inspection, the mode lies in the third class, where L = 10.5, fmod = 12, f1 = 8, f2 = 6, w = 5.

Using the formula, the mode is:

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w = 10.5 + \frac{12 - 8}{(12 - 8) + (12 - 6)} \times 5 = 12.5$$

Merits of mode

• Mode is not affected by extreme values.
• We can change the size of the observations without changing the mode.
• It can be computed for all levels of data, i.e. ratio, interval, ordinal or nominal.

Demerits of mode

• It may not exist.
• It does not take every value into consideration.
• Even when it exists, it may not be unique.

The Relationship of the Mean, Median and Mode

Comparing the Mean, Median, and the Mode

• If the data are skewed, avoid the mean.
• If there is a large gap around the middle, avoid the median.
• A measure is a resistant measure if its value is not affected by an outlier or an extreme data
value.
• The mean is not a resistant measure of central tendency because it is influenced by
extreme data values or outliers.
• The median is resistant to the influence of extreme data values or outliers, and its value does
not respond strongly to changes in a few extreme data values, regardless of how large
the change may be.
• The mode has an advantage over both the mean and the median when the data are categorical
(nominal), since it is not possible to calculate the mean or median for this type of data. Also, the mode
usually indicates the location within a large distribution where the data values are
concentrated. However, the mode cannot always be determined: if all the data values in a
distribution are different, the distribution is non-modal.
• In the case of a symmetrical distribution, the mean, median and mode coincide. That is,
mean = median = mode. However, for a moderately asymmetrical (non-symmetrical)
distribution, the mean and mode lie on the two ends and the median lies between them, and they
have the following important empirical relationship:

$$\text{Mean} - \text{Mode} = 3(\text{Mean} - \text{Median}) \tag{3.14}$$

Example 3.17: In a moderately asymmetrical distribution, the mean and the mode are 30 and 42,
respectively. What is the median of the distribution?

Solution: Rearranging Mean – Mode = 3(Mean – Median) gives 3 Median = 2 Mean + Mode, so

Median = (2 Mean + Mode)/3 = (2*30 + 42)/3 = 34

Hence the median of the distribution is 34.
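
The empirical relationship can be rearranged for any one of the three measures. The short Python sketch below (function name chosen for illustration) solves it for the median; it only applies the rule of thumb for moderately asymmetrical distributions and is not an exact identity.

```python
def median_from_mean_and_mode(mean, mode):
    """Solve Mean - Mode = 3 * (Mean - Median) for the median."""
    return (2 * mean + mode) / 3


# Example 3.17
print(median_from_mean_and_mode(30, 42))
```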

Activity 3.3

1. A survey showed the following distribution for the number of students enrolled in each
field. Find the mode.

Subject           Business   Information Technology   Computer science   Education   General studies
No. of students   1020       825                      645                478         100

2. Discuss the merits of mean, median and mode.
3. The following data represent the blood cholesterol levels of 40 first-year students at a
particular college. Find the mean, the median and the mode.

Cholesterol level    170-180   180-200   200-210   210-220   220-230
Number of students   3         13        8         5         4

3.5.4 Measures of Non-Central Locations

The median is the value of the middle item which divides the data into two equal parts and is found by
arranging the data in increasing or decreasing order of magnitude, whereas quantiles are
measures which divide a given set of data into approximately equal subdivisions and are obtained
by the same procedure as that of the median. They are averages of position (non-central tendency).
Some of these are quartiles, deciles and percentiles.

Quartiles: are values which divide the data set into approximately four equal parts, denoted by
Q1, Q2 and Q3. The first quartile (Q1) is also called the lower quartile and the third quartile (Q3)
is the upper quartile. The second quartile (Q2) is the median.

Quartiles for Individual series:

Let x1, x2, …, xn be n ordered observations. The ith quartile (Qi) is the value of the item
corresponding to the [i(n+1)/4]th position, i = 1, 2, 3.

That is, after arranging the data in ascending order, Q1, Q2 and Q3 are obtained by:

$$Q_1 = \left(\frac{1(n+1)}{4}\right)^{\text{th}} \text{ value}, \quad Q_2 = \left(\frac{2(n+1)}{4}\right)^{\text{th}} \text{ value} \quad \text{and} \quad Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}} \text{ value}.$$

Quartiles for discrete data arranged in a frequency distribution: In this case also, we
follow the same procedure as for the median. That is, we construct
the less-than cumulative frequency distribution and apply the formula of quartiles for individual
series.

Quartiles in continuous data: For continuous data, use the following formula:

$$Q_i = L + \frac{w}{f_{Q_i}}\left(\frac{in}{4} - CF\right) \tag{3.15}$$

where i = 1, 2, 3, and L, w, fQi and CF are defined in the same way as for the median. That is,

$$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right), \quad Q_2 = L + \frac{w}{f_{Q_2}}\left(\frac{2n}{4} - CF\right) \quad \text{and} \quad Q_3 = L + \frac{w}{f_{Q_3}}\left(\frac{3n}{4} - CF\right)$$

The class in question is the one containing the (i×n/4)th value. That is, the class with the smallest
cumulative frequency greater than or equal to i×n/4 is the class of the ith quartile.

Deciles: are values dividing the data into approximately ten equal parts, denoted by D1, D2, …, D9.

Deciles for Individual Series:

Let x1, x2, …, xn be n ordered observations. The ith decile (Di) is the value of the item
corresponding to the [i(n+1)/10]th position, i = 1, 2, …, 9.

That is, after arranging the data in ascending order, D1, D2, …, D9 are obtained by:

$$D_1 = \left(\frac{1(n+1)}{10}\right)^{\text{th}} \text{ value}, \quad D_2 = \left(\frac{2(n+1)}{10}\right)^{\text{th}} \text{ value}, \; \ldots \; \text{and} \quad D_9 = \left(\frac{9(n+1)}{10}\right)^{\text{th}} \text{ value}.$$

Deciles for discrete data arranged in a frequency distribution: In this case also, we
follow the same procedure as for the median. That is, we construct
the less-than cumulative frequency distribution and apply the formula of deciles for individual
series.

Deciles for continuous data: Apply the following formula and follow the procedure for quartiles
of continuous data.

$$D_i = L + \frac{w}{f_{D_i}}\left(\frac{in}{10} - CF\right), \quad i = 1, 2, \ldots, 9 \tag{3.16}$$

The symbols are defined in the same way as in the case of quartiles for continuous data.

Percentiles: are values which divide the data into approximately one hundred equal parts, and are
denoted by P1, P2, …, P99.

Percentiles for Individual Series:

Let x1, x2, …, xn be n ordered observations. The ith percentile (Pi) is the value of the item
corresponding to the [i(n+1)/100]th position, i = 1, 2, …, 99.

That is, after arranging the data in ascending order, P1, P2, …, P99 are obtained by:

$$P_1 = \left(\frac{1(n+1)}{100}\right)^{\text{th}} \text{ value}, \quad P_2 = \left(\frac{2(n+1)}{100}\right)^{\text{th}} \text{ value}, \; \ldots \; \text{and} \quad P_{99} = \left(\frac{99(n+1)}{100}\right)^{\text{th}} \text{ value}.$$

Percentiles for discrete data arranged in a frequency distribution: In this case also, we
follow the same procedure as for the median. That is, we construct
the less-than cumulative frequency distribution and apply the formula of percentiles for individual
series.

Percentiles for continuous data: Apply the following formula:

$$P_i = L + \frac{w}{f_{P_i}}\left(\frac{in}{100} - CF\right), \quad i = 1, 2, \ldots, 99 \tag{3.17}$$

The symbols are defined in the same way as in the case of quartiles or deciles for continuous
data.

Interpretations

1. Qi is the value below which ( i × 25) percent of the observations in the series are found (where
i=1, 2,3). For instance Q3 means the value below which 75 percent of observations in the given
series are found.

2. Di is the value below which ( i ×10) percent of the observations in the series are found (where
i=1, 2,...,9 ). For instance D4 is the value below which 40 percent of the values are found in the
series.

3. Pi is the value below which i percent of the total observations are found (where i = 1, 2,3,...,99
). For example 60 percent of the observations in a given series are below P60.

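Example 3.18: Calculate Q1, Q2, Q3, D4, D9, P40 and P90 for the data given in the table
below.

X   10   11   12   13   14   15   16   17   18
F   2    8    25   48   65   40   20   9    2

Solution: The data are arranged in increasing order, so we only need to construct the cumulative
frequency table before calculating the required values.

X            10   11   12   13   14    15    16    17    18
F            2    8    25   48   65    40    20    9     2
Cum. Freq.   2    10   35   83   148   188   208   217   219

The total number of observations is n = 219, which is odd. The median is therefore

$$\tilde{x} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value} = \left(\frac{219+1}{2}\right)^{\text{th}} \text{ value} = 110^{\text{th}} \text{ value} = 14$$

$$Q_1 = \left(\frac{1(n+1)}{4}\right)^{\text{th}} \text{ value} = \left(\frac{1(219+1)}{4}\right)^{\text{th}} \text{ value} = 55^{\text{th}} \text{ value} = 13$$

$$Q_2 = \left(\frac{2(n+1)}{4}\right)^{\text{th}} \text{ value} = \left(\frac{2(219+1)}{4}\right)^{\text{th}} \text{ value} = 110^{\text{th}} \text{ value} = 14 = \tilde{x}$$

$$Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}} \text{ value} = \left(\frac{3(219+1)}{4}\right)^{\text{th}} \text{ value} = 165^{\text{th}} \text{ value} = 15$$

$$D_4 = \left(\frac{4(n+1)}{10}\right)^{\text{th}} \text{ value} = \left(\frac{4(219+1)}{10}\right)^{\text{th}} \text{ value} = 88^{\text{th}} \text{ value} = 14$$

$$D_9 = \left(\frac{9(n+1)}{10}\right)^{\text{th}} \text{ value} = \left(\frac{9(219+1)}{10}\right)^{\text{th}} \text{ value} = 198^{\text{th}} \text{ value} = 16$$

$$P_{40} = \left(\frac{40(n+1)}{100}\right)^{\text{th}} \text{ value} = \left(\frac{40(219+1)}{100}\right)^{\text{th}} \text{ value} = 88^{\text{th}} \text{ value} = 14$$

$$P_{90} = \left(\frac{90(n+1)}{100}\right)^{\text{th}} \text{ value} = \left(\frac{90(219+1)}{100}\right)^{\text{th}} \text{ value} = 198^{\text{th}} \text{ value} = 16$$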
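
The lookup used in this example (compute the position i(n+1)/parts, then take the value whose cumulative frequency first reaches that position) can be sketched in Python. The function names are illustrative. For fractional positions the sketch simply applies the "smallest CF greater than or equal to the position" rule; it does not interpolate.

```python
from bisect import bisect_left
from itertools import accumulate


def value_at_position(values, freqs, position):
    """Value whose cumulative frequency is the first to reach `position`."""
    cumulative = list(accumulate(freqs))
    return values[bisect_left(cumulative, position)]


def quantile(values, freqs, i, parts):
    """i-th quantile when the data are split into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    return value_at_position(values, freqs, i * (n + 1) / parts)


# Example 3.18
x = [10, 11, 12, 13, 14, 15, 16, 17, 18]
f = [2, 8, 25, 48, 65, 40, 20, 9, 2]

print([quantile(x, f, i, 4) for i in (1, 2, 3)])   # quartiles
print(quantile(x, f, 4, 10), quantile(x, f, 9, 10))    # D4, D9
print(quantile(x, f, 40, 100), quantile(x, f, 90, 100))  # P40, P90
```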

Example 3.19: The marks of 50 students (out of 85) are given below. Based on the data, find Q1,
D4 and P7.

Marks   46-50   51-55   56-60   61-65   66-70   71-75   76-80
fi      4       8       15      5       9       5       4

Solution: First find the class boundaries and the cumulative frequency distribution.

Marks            46-50       51-55       56-60       61-65       66-70       71-75       76-80
Class boundary   45.5-50.5   50.5-55.5   55.5-60.5   60.5-65.5   65.5-70.5   70.5-75.5   75.5-80.5
fi               4           8           15          5           9           5           4
Cum. Freq.       4           12          27          32          41          46          50

Q1 is the (n/4)th value = 12.5th value, which lies in the class 55.5 – 60.5.

$$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right) = 55.5 + \frac{5}{15}(12.5 - 12) = 55.7$$

D4 is the (4n/10)th value = 20th value, which lies in the class 55.5 – 60.5.

$$D_4 = L + \frac{w}{f_{D_4}}\left(\frac{4n}{10} - CF\right) = 55.5 + \frac{5}{15}(20 - 12) = 58.2$$

P7 is the (7n/100)th value = 3.5th value, which lies in the class 45.5 – 50.5.

$$P_7 = L + \frac{w}{f_{P_7}}\left(\frac{7n}{100} - CF\right) = 45.5 + \frac{5}{4}(3.5 - 0) = 49.875$$
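
Formulas 3.15 to 3.17 differ only in the divisor (4, 10 or 100), so one function can cover quartiles, deciles and percentiles of grouped continuous data. The sketch below is an illustration under the assumption of equal class widths; the function name and parameters are not from the original text.

```python
def grouped_quantile(lower_boundaries, width, freqs, i, parts):
    """i-th quantile of grouped data: L + (w / f) * (i * n / parts - CF)."""
    n = sum(freqs)
    target = i * n / parts
    cumulative = 0  # CF of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= target:
            # First class whose cumulative frequency reaches the target.
            return lower + (width / f) * (target - cumulative)
        cumulative += f
    raise ValueError("i must not exceed parts")


# Example 3.19
boundaries = [45.5, 50.5, 55.5, 60.5, 65.5, 70.5, 75.5]
frequencies = [4, 8, 15, 5, 9, 5, 4]

q1 = grouped_quantile(boundaries, 5, frequencies, 1, 4)
d4 = grouped_quantile(boundaries, 5, frequencies, 4, 10)
p7 = grouped_quantile(boundaries, 5, frequencies, 7, 100)
print(round(q1, 1), round(d4, 1), p7)
```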

Activity 3.4

The following table presents the male population of a certain region in Ethiopia.

Find a) all quartiles

b) the 9th and 5th deciles and

c) the 65th and 75th percentiles

Age groups (in years)   0-5    5-10   10-15   15-20   20-25   25-30   30-35   35-40
Male population         2580   3737   4620    5200    7250    620     297     355

Exercise 3

1. Calculate the median, quartiles, 8th decile, and 75th percentile for the following data. Show
that the value of the 75th percentile is the same as that of Q3.

Lifetime (C.M)     50   100   150   200   250   300   350   400
No. of batteries   6    8     13    20    9     6     3     2

2. The following data represent the number of offences for various robberies in a town per
given day.

No. of robberies   26   34   30   15   10   32   12   25   7
No. of days        13   19   12   30   14   8    19   20   3

Compute the mean, median and mode.

3. Calculate Q1, Q2, Q3, D5, D8, and P90 for the following table.

Temperature (°F)   50-59   60-69   70-79   80-89   90-99
Days               2       8       20      4       1

4. The following data represent the pulse rates (beats per minute) of nine students: 76, 60, 60,
81, 72, 80, 80, 68 and 73. Calculate the mean, mode and the third quartile.
5. The number of births in a hospital is given below.

Days            Monday   Tuesday   Wednesday   Thursday   Friday   Saturday   Sunday
No. of births   50       60        52          55         62       30         40

Find the average number of births per day and the mode.

6. From the table given below find the mode and 5th decile.

Size        11-15   16-20   21-25   26-30   31-35   36-40   41-45   46-50
Frequency   7       10      13      26      35      22      11      5

7. If the arithmetic mean of two items is 5 and the G.M. is 4, find their H.M.
8. The following frequency distribution represents the magnitude of earthquakes.

Magnitude   0-0.9   1-1.9   2-2.9   3-3.9   4-4.9   5-5.9   6-6.9   7-7.9
Frequency   20      50      45      30      10      8       6       1

Compute the median, verify that it is equal to the second quartile, and find the 72nd percentile.

CHAPTER FOUR
MEASURES OF DISPERSION (VARIATION)
After studying this chapter, you should be able to:

• Explain the meaning of measures of dispersion.
• Understand the importance of measuring the variability (dispersion) in a data set.
• Compare two or more sets of data using relative measures of dispersion.
• Apply the Z-score to find out the relative standing of values.
• Explain measures of skewness and kurtosis.
4.1 Introduction

Just as central tendency can be measured by a number in the form of an average, the amount of
variation (dispersion, spread, or scatter) among the values in the data set can also be measured.
The measures of central tendency describe how the major part of the values in the data set appears to
concentrate around a central value called the average, with the remaining values scattered (distributed)
on either side of that value. But these measures do not reveal how these values are dispersed
(spread or scattered) on each side of the central value. The dispersion of values is indicated by the
extent to which these values tend to spread over an interval rather than cluster closely around an
average.

The term dispersion is generally used in two senses. Firstly, dispersion refers to the variation of
the items among themselves. If the value of all the items of a series is the same, there will be no
variation among the different items of the series. Secondly, dispersion refers to the variation of the items
around an average. If the difference between the values of the items and the average is large, the
dispersion will be high; on the other hand, if the difference between the values of the items and the
average is small, the dispersion will be low. Thus, dispersion is defined as the scatteredness or
spread of the individual items in a given series.

4.2 Objectives of measuring Variation:

• To judge the reliability of measures of central tendency.
• To control variability itself.
• To compare two or more groups of numbers in terms of their variability.
• To make further statistical analysis.

4.3 Absolute and relative measures of dispersion

Absolute measures of dispersion: An absolute measure is expressed in the same
statistical unit in which the original data are given, such as kilograms, tonnes etc. These
measures are suitable for comparing the variability in two distributions having
variables expressed in the same units and of the same average size. These measures
are not suitable for comparing the variability in two distributions having variables
expressed in different units.

Relative measures of dispersion: A relative measure of dispersion is the ratio of a measure of
absolute dispersion to an appropriate average or the selected items of the data.

Relative measures of dispersion are classified as follows:

• Based on selected items: coefficient of range and coefficient of quartile deviation.
• Based on all items: coefficient of mean deviation, and coefficient of standard deviation or
coefficient of variation.

4.4 Types of Measures of Variation

4.4.1 The Range and Relative Range

Range is the simplest measure of dispersion. It is defined as the difference between the largest
and smallest values in a given set of data. Its formula is:

$$R = L - S \tag{4.1}$$

where R = range, L = largest value in a given set of data, S = smallest value in a given set of data.

For a continuous grouped distribution, the range may be obtained as:

• The difference between the upper class limit of the last class and the lower class limit of the
first class, or
• The difference between the largest class mark and the smallest class mark, or
• The difference between the upper class boundary of the last class and the lower class
boundary of the first class.

The range is the crudest absolute measure of variation. It is applicable in describing quantities such as the
maximum change in daily temperature, rainfall, etc. When the sample size is small, it can be an
adequate measure of variation. It is commonly used in quality control.

The relative measure of range, also called the coefficient of range, is defined as

$$\text{Relative Range (RR)} = \frac{L - S}{L + S} \tag{4.2}$$

Example 4.1: Five students obtained the following marks in statistics: 20, 35, 25, 30, 15. Find the
range and relative range.

Solution: Here, L = 35 and S = 15.

Range = L − S = 35 − 15 = 20

$$RR = \frac{L - S}{L + S} = \frac{35 - 15}{35 + 15} = 0.4$$

Example 4.2: Find the range and relative range of the following data.

Size        5-10   11-15   16-20   21-25   26-30
Frequency   4      9       15      30      40

Solution: Here,

L = upper class limit of the largest class = 30

S = lower class limit of the smallest class = 5

$$\text{Range} = 30 - 5 = 25 \quad \text{and} \quad RR = \frac{30 - 5}{30 + 5} = 0.7143$$
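
A minimal Python sketch of formulas 4.1 and 4.2 is given below. The function name is an illustrative choice; for grouped data the class limits are passed in place of raw observations, following Example 4.2.

```python
def range_and_relative_range(values):
    """Return (range, relative range) = (L - S, (L - S) / (L + S))."""
    largest, smallest = max(values), min(values)
    if largest + smallest == 0:
        raise ValueError("relative range is undefined when L + S = 0")
    spread = largest - smallest
    return spread, spread / (largest + smallest)


# Example 4.1: raw marks.
print(range_and_relative_range([20, 35, 25, 30, 15]))

# Example 4.2: lowest and highest class limits of the grouped data.
r, rr = range_and_relative_range([5, 30])
print(r, round(rr, 4))
```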

Merits of the Range

• It is well-defined, easy to compute and simple to understand.
• It helps in giving an idea about the variation, just by giving the lowest value and the
greatest value of the variable.

Demerits of the Range

• It is not based on all observations of the series.
• It can't be calculated in the case of an open-ended distribution.
• It is affected by sampling fluctuation.
• It is affected by extreme values in the series.

4.4.2 The Quartile Deviation and Coefficient of Quartile Deviation

Inter-quartile range and quartile deviation are other measures of dispersion. The difference
between the upper quartile (Q3) and lower quartile (Q1) is called the inter-quartile range.
Symbolically,

$$\text{Inter-Quartile Range (IQR)} = Q_3 - Q_1 \tag{4.3}$$

The inter-quartile range covers the dispersion of the middle 50% of the items of the series. Quartile
deviation, also called semi-inter-quartile range, is half of the difference between the upper and
lower quartiles, that is, half of the inter-quartile range. Its formula is

$$\text{Quartile Deviation (QD)} = \frac{Q_3 - Q_1}{2} \tag{4.4}$$

The relative measure of quartile deviation, also called the coefficient of quartile deviation (CQD),
is defined as:

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} \tag{4.5}$$

Example 4.3: Find the inter-quartile range, quartile deviation and coefficient of quartile deviation
from the following data.

28, 18, 20, 24, 27, 30, 15

Solution: First arrange the data in ascending order: 15, 18, 20, 24, 27, 28, 30

$$Q_1 = \text{size of } \left(\frac{n+1}{4}\right)^{\text{th}} \text{ item} = \text{size of } \left(\frac{7+1}{4}\right)^{\text{th}} \text{ item} = \text{size of 2nd item} = 18 \text{ marks}$$

$$Q_3 = \text{size of } 3\left(\frac{n+1}{4}\right)^{\text{th}} \text{ item} = \text{size of } 3\left(\frac{7+1}{4}\right)^{\text{th}} \text{ item} = \text{size of 6th item} = 28 \text{ marks}$$

$$IQR = Q_3 - Q_1 = 28 - 18 = 10$$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{28 - 18}{2} = 5$$

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{28 - 18}{28 + 18} = 0.217$$

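Example 4.4: Find the inter-quartile range, quartile deviation and coefficient of quartile deviation
from the following data.

Marks             2    3    4    5    6   7    8   9
No. of students   10   11   12   13   5   12   7   5

Solution:

Marks             2    3    4    5    6    7    8    9
No. of students   10   11   12   13   5    12   7    5
CF                10   21   33   46   51   63   70   75 = N

$$Q_1 = \left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = \left(\frac{75+1}{4}\right)^{\text{th}} \text{ item} = 19^{\text{th}} \text{ item} = 3$$

$$Q_3 = 3\left(\frac{N+1}{4}\right)^{\text{th}} \text{ item} = 3\left(\frac{75+1}{4}\right)^{\text{th}} \text{ item} = 57^{\text{th}} \text{ item} = 7$$

$$IQR = Q_3 - Q_1 = 7 - 3 = 4$$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{7 - 3}{2} = 2$$

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{7 - 3}{7 + 3} = 0.4$$

Remark: QD or CQD includes only the middle 50% of the observations.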
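
The three quartile-based measures can be computed together for raw data, as in the Python sketch below. The function names are illustrative. The quartile positions follow the (n+1)/4 rule used in this section; when a position is fractional, the sketch interpolates linearly between neighbouring values, which is an assumption made here rather than a rule stated in the text.

```python
def value_at(ordered, position):
    """Value at a 1-based position; linear interpolation if fractional."""
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return ordered[0]
    if whole >= len(ordered):
        return ordered[-1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


def quartile_measures(values):
    """Return (Q1, Q3, IQR, QD, CQD) for raw data."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at(ordered, (n + 1) / 4)
    q3 = value_at(ordered, 3 * (n + 1) / 4)
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2, iqr / (q3 + q1)


# Example 4.3
q1, q3, iqr, qd, cqd = quartile_measures([28, 18, 20, 24, 27, 30, 15])
print(q1, q3, iqr, qd, round(cqd, 3))
```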

Merits of QD

• It is well-defined, easy to compute and simple to understand.
• It helps in studying the middle 50% of the items in the series.
• It is not affected by the extreme items.
• It is useful in measuring variations in the case of open-ended distributions.

Demerits of QD

• It is not based on all the items (it ignores 50% of the items, i.e., the first 25% and the last 25%).
• It is greatly influenced by sampling fluctuations.
• It is not amenable to algebraic manipulations.

4.4.3 The Mean Deviation and Coefficient of Mean Deviation

The mean deviation (MD) measures the average deviation of a set of observations about their
central value, generally the mean or the median, ignoring the plus/minus sign of the deviations. In
other words, the mean deviation of a set of items is defined as the arithmetic mean of the values
of the absolute deviations from a given average. Depending on the type of average used, we
have different mean deviations.

• The mean deviation of a sample of n observations x1, x2, ..., xn (individual series) is given as

$$\mathrm{MD} = \frac{\sum |X_i - A|}{n} \tag{4.6}$$

where $|X_i - A|$ denotes the absolute value of the deviation and A stands for the average used in
calculating MD. Generally, the arithmetic mean or the median is used, that is, $A = \bar{X}$ (mean) or
$A = \tilde{X}$ (median).

• In the case of discrete data arranged in a frequency distribution (FD) and continuous grouped data,
the formula for MD becomes

$$\mathrm{MD} = \frac{\sum f_i |X_i - A|}{n} \tag{4.7}$$

where $X_i$ is the class mark of the ith class, $f_i$ is the frequency of the ith class and $n = \sum f_i$.
1. The mean deviation about the arithmetic mean is, therefore, given by

$$\mathrm{MD}(\bar{X}) = \frac{\sum |X_i - \bar{X}|}{n} \tag{4.8}$$

for ungrouped data (individual series), and by

$$\mathrm{MD}(\bar{X}) = \frac{\sum f_i |X_i - \bar{X}|}{n} \tag{4.9}$$

for discrete data arranged in FD and a grouped continuous frequency distribution, where $X_i$ is the
value for discrete data arranged in FD and the class mark of the ith class for continuous grouped data,
$f_i$ is the frequency of the ith class and $n = \sum f_i$.

Steps to calculate MD about the mean

• Find the arithmetic mean, $\bar{X}$.
• Find the deviation of each reading from $\bar{X}$.
• Find the arithmetic mean of the deviations, ignoring sign.
2. The mean deviation about the median is given by

$$\mathrm{MD}(\tilde{X}) = \frac{\sum |X_i - \tilde{X}|}{n} \tag{4.10}$$

for ungrouped data (individual series), and by

$$\mathrm{MD}(\tilde{X}) = \frac{\sum f_i |X_i - \tilde{X}|}{n} \tag{4.11}$$

for discrete data arranged in FD and a grouped continuous frequency distribution, where $X_i$ is the
value for discrete data arranged in FD and the class mark of the ith class for continuous grouped data,
$f_i$ is the frequency of the ith class and $n = \sum f_i$.

Steps to calculate MD about the median

• Find the median, $\tilde{X}$.
• Find the deviation of each reading from $\tilde{X}$.
• Find the arithmetic mean of the deviations, ignoring sign.
3. The mean deviation about the mode is given by

$$\mathrm{MD}(\hat{x}) = \frac{\sum |X_i - \hat{x}|}{n} \tag{4.12}$$

for ungrouped data (individual series), and by

$$\mathrm{MD}(\hat{x}) = \frac{\sum f_i |X_i - \hat{x}|}{n} \tag{4.13}$$

for discrete data arranged in FD and a grouped continuous frequency distribution, where $X_i$ is the
value for discrete data arranged in FD and the class mark of the ith class for continuous grouped data,
$f_i$ is the frequency of the ith class and $n = \sum f_i$.

Steps to calculate MD about the mode

• Find the mode, $\hat{x}$.
• Find the deviation of each reading from $\hat{x}$.
• Find the arithmetic mean of the deviations, ignoring sign.

Example 4.5: The following are the numbers of visits made by ten mothers to the local doctor’s
surgery: 8, 6, 5, 5, 7, 4, 5, 9, 7, 4. Find the mean deviation about the mean, median and mode.

Solution:

First calculate the three averages:

$$\bar{X} = 6, \quad \tilde{X} = 5.5, \quad \hat{x} = 5$$

Then take the deviations of each observation from these averages.

$x_i$                    4     4     5     5     5     6     7     7     8     9     Total
$|X_i - \bar{X}|$        2     2     1     1     1     0     1     1     2     3     14
$|X_i - \tilde{X}|$      1.5   1.5   0.5   0.5   0.5   0.5   1.5   1.5   2.5   3.5   14
$|X_i - \hat{x}|$        1     1     0     0     0     1     2     2     3     4     14

Since the data are ungrouped, the mean deviations about the mean, median and mode are:

$$\mathrm{MD}(\bar{X}) = \frac{\sum |X_i - \bar{X}|}{n} = \frac{14}{10} = 1.4$$

$$\mathrm{MD}(\tilde{X}) = \frac{\sum |X_i - \tilde{X}|}{n} = \frac{14}{10} = 1.4$$

$$\mathrm{MD}(\hat{x}) = \frac{\sum |X_i - \hat{x}|}{n} = \frac{14}{10} = 1.4$$
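
The three mean deviations of Example 4.5 can be reproduced with Python's standard `statistics` module. This is only a sketch; the function name `mean_deviation` is introduced here for illustration.

```python
from statistics import mean, median, mode


def mean_deviation(data, center):
    """Average absolute deviation of the data about the given central value."""
    return sum(abs(x - center) for x in data) / len(data)


visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]
md_mean = mean_deviation(visits, mean(visits))       # about the mean (6)
md_median = mean_deviation(visits, median(visits))   # about the median (5.5)
md_mode = mean_deviation(visits, mode(visits))       # about the mode (5)
print(md_mean, md_median, md_mode)  # 1.4 1.4 1.4
```

That the three values coincide is a feature of this particular data set, not a general rule.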

Merits of MD

• It is well-defined, easy to compute and simple to understand.
• It is based on all observations.
• It is not greatly affected by the extreme items.
• It can be calculated by using any average.

Demerit of MD

• It does not take into account the signs of the deviations of items from the average.

Remark: Of all the mean deviations taken about different averages or any arbitrary value, the
mean deviation about the median has the smallest value.

Coefficient of mean deviation (CMD):

The relative measure of mean deviation, also called the coefficient of mean deviation, is obtained
by dividing the mean deviation by the particular average used in computing it. Thus,

CMD about the arithmetic mean is given by:

$$\mathrm{CMD}(\bar{X}) = \frac{\mathrm{MD}(\bar{X})}{\bar{X}} \tag{4.14}$$

where $\mathrm{MD}(\bar{X})$ is the mean deviation calculated about the arithmetic mean.

CMD about the median is given by:

$$\mathrm{CMD}(\tilde{X}) = \frac{\mathrm{MD}(\tilde{X})}{\tilde{X}} \tag{4.15}$$

in which case MD is calculated about the median of the observations.

CMD about the mode is given by:

$$\mathrm{CMD}(\hat{x}) = \frac{\mathrm{MD}(\hat{x})}{\hat{x}} \tag{4.16}$$

in which case MD is calculated about the mode of the observations.

Example 4.6: Calculate the coefficient of mean deviation about the mean, median and mode for
the data in Example 4.5 above.

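Solution:

$$\mathrm{CMD}(\bar{X}) = \frac{\mathrm{MD}(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.23$$

$$\mathrm{CMD}(\tilde{X}) = \frac{\mathrm{MD}(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.25$$

$$\mathrm{CMD}(\hat{x}) = \frac{\mathrm{MD}(\hat{x})}{\hat{x}} = \frac{1.4}{5} = 0.28$$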
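
A minimal sketch of the coefficient of mean deviation, using the data of Example 4.5 and the averages found there (mean 6, median 5.5, mode 5). The function name is hypothetical.

```python
def coefficient_of_mean_deviation(data, center):
    """Mean deviation about `center`, divided by `center`."""
    md = sum(abs(x - center) for x in data) / len(data)
    return md / center


visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]
cmd_mean = coefficient_of_mean_deviation(visits, 6)      # about the mean
cmd_median = coefficient_of_mean_deviation(visits, 5.5)  # about the median
cmd_mode = coefficient_of_mean_deviation(visits, 5)      # about the mode
print(round(cmd_mean, 2), round(cmd_median, 2), round(cmd_mode, 2))  # 0.23 0.25 0.28
```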

4.4.4 The Variance, Standard Deviation and Coefficient of Variation

Variance and Standard Deviation

Like the mean deviation, the variance is based on all observations in a set of data. The variance,
however, is the average of the squared deviations from the mean. Recall that the sum of squared
deviations is a minimum when the deviations are taken from the mean. Squared deviations are also
easier to manipulate mathematically than absolute deviations. If we average the squared deviations
from the mean and then take the square root of the result (to compensate for the fact that the
deviations were squared), we obtain the standard deviation. This overcomes the limitation of the
mean deviation.

Population Variance (𝛔𝟐 )

If we divide the variation (the sum of squared deviations from the mean) by the number of values in
the population, we get the population variance. This variance is the "average squared deviation
from the mean".

For ungrouped data (individual series)

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{1}{N}\left[\sum_{i=1}^{N} X_i^2 - N\mu^2\right] \tag{4.17}$$

where $\mu$ is the population arithmetic mean and N is the total number of observations in the
population.

For discrete data arranged in FD and for continuous grouped data

$$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{N} = \frac{1}{N}\left[\sum f_i X_i^2 - N\mu^2\right] \tag{4.18}$$

where $\mu$ is the population arithmetic mean, $X_i$ is the class mark of the ith class, $f_i$ is the
frequency of the ith class and $N = \sum f_i$.

Sample Variance (𝐒 𝟐 )

One would expect the sample variance to be simply the population variance with the population
mean replaced by the sample mean. However, one of the major uses of a sample statistic is to
estimate the corresponding population parameter, and a formula that divides by n tends to
underestimate the population variance. To offset this, the sum of the squares of the deviations is
divided by one less than the sample size.

For ungrouped data

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right] \tag{4.19}$$

where $\bar{x}$ is the sample arithmetic mean and n is the total number of observations in the sample.

For discrete data arranged in FD

If the values $x_i$ have frequencies $f_i$ (i = 1, 2, …, m), then the sample variance is given by:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{m} f_i (x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] \tag{4.20}$$

For continuous grouped data

$$S^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] \tag{4.21}$$

where $\bar{x}$ is the sample arithmetic mean, $x_i$ is the class mark of the ith class, $f_i$ is the
frequency of the ith class and $n = \sum f_i$.

The Standard Deviation

There is a problem with variances. Recall that the deviations were squared. That means that the
units were also squared. To get the units back the same as the original data values, the square root
must be taken.

Population Standard Deviation (σ)

$\sigma = \sqrt{\sigma^2}$, where $\sigma^2$ is the population variance.

Sample Standard Deviation (S)

$S = \sqrt{S^2}$, where $S^2$ is the sample variance.

Example 4.7: Find the sample variance and standard deviation of:

xi   2   4   5   6   8
fi   2   2   3   1   2

Solution: Prepare the following table:

xi     fi     fixi    xi²    fixi²
2      2      4       4      8
4      2      8       16     32
5      3      15      25     75
6      1      6       36     36
8      2      16      64     128
Sum    10     49             279

Thus, $n = \sum f_i = 10$, $\sum f_i x_i = 49$, $\sum f_i x_i^2 = 279$ and $\bar{x} = 49/10 = 4.9$.

$$S^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{9}\left[279 - 10\left(\frac{49}{10}\right)^2\right] = \frac{1}{9}(38.9) = 4.32, \quad \text{and } S = \sqrt{4.32} = 2.08$$
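
The shortcut form of the sample variance used in Example 4.7 can be written as a small Python function. This is a sketch under the definitions above; the function name `sample_variance_fd` is an assumption made for illustration.

```python
from math import sqrt


def sample_variance_fd(values, freqs):
    """Sample variance of a frequency distribution: [sum(f x^2) - n xbar^2] / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    xbar = sum_fx / n
    return (sum_fx2 - n * xbar ** 2) / (n - 1)


x = [2, 4, 5, 6, 8]
f = [2, 2, 3, 1, 2]
s2 = sample_variance_fd(x, f)
s = sqrt(s2)
print(round(s2, 2), round(s, 2))  # 4.32 2.08
```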

Example 4.8: Find the sample variance and standard deviation for the distribution:

C.I     1-5   6-10   11-15   16-20
Freq.   4     1      2       3

Solution: In a continuous F.D., $x_i$ is the class mark representing the ith class.

C.I      xi    fi    fixi    fixi²
1-5      3     4     12      36
6-10     8     1     8       64
11-15    13    2     26      338
16-20    18    3     54      972
Total          10    100     1410

Here $n = \sum f_i = 10$, $\bar{x} = \frac{\sum f_i x_i}{n} = \frac{100}{10} = 10$ and $\sum f_i x_i^2 = 1410$, so that

$$S^2 = \frac{1}{n-1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{9}\left[1410 - 10(10)^2\right] = \frac{410}{9} = 45.56, \quad S = \sqrt{45.56} = 6.75$$
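
For grouped data the only extra step is replacing each class by its class mark. The sketch below repeats Example 4.8 using the definitional form of the variance (squared deviations from the mean), which gives the same result as the shortcut form. The variable names are illustrative assumptions.

```python
from math import sqrt

classes = [(1, 5), (6, 10), (11, 15), (16, 20)]   # class limits
freqs = [4, 1, 2, 3]

# Class mark = midpoint of the lower and upper class limits.
marks = [(lo + hi) / 2 for lo, hi in classes]

n = sum(freqs)
xbar = sum(f * m for m, f in zip(marks, freqs)) / n
s2 = sum(f * (m - xbar) ** 2 for m, f in zip(marks, freqs)) / (n - 1)
s = sqrt(s2)
print(marks, xbar, round(s2, 2), round(s, 2))
```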

Properties of Variance & Standard Deviation

1. If a constant is added to (or subtracted from) all the values, the variance remains the same;
i.e., for any constant k, $V(x_i \pm k) = V(x_i)$.

Example 4.9: Consider the 6 sample values xi: 54, 52, 53, 50, 51 and 52.

The sample variance is $V(x_i) = 2$. Now subtract 50 from each value to get
yi: 4, 2, 3, 0, 1, 2. The variance of this new series is also 2, i.e., $V(x) = V(y) = 2$.

2. If each and every value is multiplied by a non-zero constant k, the standard deviation is
multiplied by |k| and the variance is multiplied by k²; i.e., $V(kx_i) = k^2 V(x_i)$.

3. Both the variance and the standard deviation give more weight to extreme values and less
to those which are near to the mean.
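
Properties 1 and 2 can be verified numerically with the data of Example 4.9. The sketch below uses `variance` and `stdev` from Python's standard `statistics` module (both use the n − 1 divisor); the multiplier 3 is an arbitrary choice for illustration.

```python
from statistics import stdev, variance

x = [54, 52, 53, 50, 51, 52]
shifted = [xi - 50 for xi in x]   # subtract a constant from every value
scaled = [3 * xi for xi in x]     # multiply every value by k = 3

print(variance(x), variance(shifted))   # both equal 2: shifting leaves the variance unchanged
print(variance(scaled))                 # k^2 times the original variance: 9 * 2 = 18
print(stdev(scaled) / stdev(x))         # approximately |k| = 3
```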

Coefficient of Variation (CV)

The standard deviation is an absolute measure of dispersion. The corresponding relative measure
is known as the coefficient of variation (CV).

The standard deviation expresses variation in the same unit as the original data, but it cannot be
the sole basis for comparing two distributions. For instance, if we have a standard deviation of 10
and a mean of 5, the values vary by an amount twice as large as the mean itself. If, on the other
hand, we have a standard deviation of 10 and a mean of 5000, the variation relative to the mean is
insignificant. Therefore, we cannot judge the dispersion of a set of data until we know the standard
deviation, the mean, and how the standard deviation compares with the mean.

Coefficient of variation is used in such problems where we want to compare the variability of two
or more different series. Coefficient of variation is the ratio of the standard deviation to the
arithmetic mean, usually expressed in percent.
$$\mathrm{CV} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100 \tag{4.22}$$

For population data:

$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100 \tag{4.23}$$

where $\sigma$ is the population standard deviation and $\mu$ is the population mean.

For sample data:

$$\mathrm{CV} = \frac{S}{\bar{x}} \times 100$$

where S is the sample standard deviation and $\bar{x}$ is the sample mean.

Remark: A distribution having less coefficient of variation is said to be less variable or more
consistent or more uniform or more homogeneous.

Example 4.10: Last semester, the students of Information Technology and Chemistry Departments
took Introduction to Statistics course. At the end of the semester, the following information was
recorded.

Department            Information Technology   Chemistry
Mean score            85                       65
Standard deviation    25                       12

Compare the relative dispersions of the two departments’ scores using the appropriate measure.

Solution:

Information Technology Department:

$$\mathrm{CV} = \frac{S}{\bar{x}} \times 100 = \frac{25}{85} \times 100 = 29.41\%$$

Chemistry Department:

$$\mathrm{CV} = \frac{S}{\bar{x}} \times 100 = \frac{12}{65} \times 100 = 18.46\%$$

Interpretation: Since the CV of Information Technology department students is greater than that
of Chemistry department students, we can say that there is more dispersion relative to the mean in
the distribution of Information Technology students’ scores compared with that of Chemistry
students.
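
The comparison in Example 4.10 amounts to two divisions, sketched below in Python. The function name `coefficient_of_variation` is an assumption made for illustration.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100


cv_it = coefficient_of_variation(25, 85)     # Information Technology
cv_chem = coefficient_of_variation(12, 65)   # Chemistry
print(round(cv_it, 2), round(cv_chem, 2))    # 29.41 18.46

more_variable = "Information Technology" if cv_it > cv_chem else "Chemistry"
print(more_variable)
```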

Example 4.11: The yearly salaries of all employees working for a company have a mean of Birr
42350 and a standard deviation of Birr 3820. The years of schooling for the same employees have
a mean of 15 years and a standard deviation of 2 years. Is the relative variation in the salaries
higher or lower than that in years of schooling for these employees?

Solution

Because the two variables (salary and years of schooling) have different units of measurement
(Birr and years, respectively), we cannot compare the two standard deviations directly. Hence, we
calculate the coefficient of variation for each data set.

CV for salaries: $\sigma = 3820$, $\mu = 42350$

$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100 = \frac{3820}{42350} \times 100 = 9.02\%$$

CV for years of schooling: $\sigma = 2$ years, $\mu = 15$ years

$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100 = \frac{2}{15} \times 100 = 13.33\%$$

Interpretation: Since the coefficient of variation for salaries has a lower value than the coefficient
of variation for years of schooling, the salaries have a lower relative spread than the years of
schooling.

Remark: The coefficient of variation has no units of measurement, because it is the ratio of two
quantities measured in the same unit; it is usually expressed as a percent.

4.5 Standard Scores (Z-Scores)

A standard score for a value in a data set is obtained by subtracting the mean of the data set
from the value and dividing the result by the standard deviation of the data set. Basically, the
standard score (z-score) tells us how many standard deviations a specific value is above or below
the mean value of the data set. That is, the z-score is the number of standard deviations the data
value falls above (positive z-score) or below (negative z-score) the mean for the data set.

Z-score computed from the population

$$Z = \frac{X - \mu}{\sigma} \tag{4.24}$$

Z-score computed from the sample

$$Z = \frac{X - \bar{X}}{S} \tag{4.25}$$

Example 4.12: What is the Z-score for the value of 14 in the following sample data set?

3 8 6 14 4 12 7 10

Solution:

$$\bar{X} = 8, \quad S = 3.8173, \quad \text{thus } Z = \frac{14 - 8}{3.8173} \approx 1.57$$

• The data value of 14 is located 1.57 standard deviations above the mean of 8, because the
z-score is positive.
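
The z-score of Example 4.12 can be reproduced as follows. This is a sketch that uses the sample standard deviation (`statistics.stdev`, divisor n − 1), as in formula 4.25; the function name `z_score` is hypothetical.

```python
from statistics import mean, stdev


def z_score(value, sample):
    """Number of sample standard deviations that `value` lies from the sample mean."""
    return (value - mean(sample)) / stdev(sample)


sample = [3, 8, 6, 14, 4, 12, 7, 10]
z = z_score(14, sample)
print(mean(sample), round(stdev(sample), 4), round(z, 2))  # 8 3.8173 1.57
```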

Example 4.13: Suppose that a student scored 66 in Statistics and 80 in Information Technology.
The summary of the scores in the two courses is given below.

Course                    Average score   Standard deviation of the score
Statistics                51              12
Information Technology    72              16

In which course did the student score better as compared to his classmates?

Solution:

Z-score of the student in Statistics:

$$Z = \frac{X - \mu}{\sigma} = \frac{66 - 51}{12} = \frac{15}{12} = 1.25$$

Z-score of the student in Information Technology:

$$Z = \frac{X - \mu}{\sigma} = \frac{80 - 72}{16} = \frac{8}{16} = 0.5$$

From these two standard scores, we can conclude that the student scored better in the Statistics
course, relative to his classmates, than in the Information Technology course.

Activity 4.1

A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10; she
scored 30 on an Information Technology test with a mean of 25 and a standard deviation of 5.
Compare her relative positions on each test.

4.6 Moments, Skewness and Kurtosis

The measures of central tendency and variation discussed in the previous sections do not reveal the
entire story about a frequency distribution. Two distributions may have the same mean and standard
deviation but differ in shape. The further description that is needed is provided by measures of
skewness and kurtosis.

a. Moments

Moments are statistical tools used in statistical investigation. The moments of a distribution are
the arithmetic means of the various powers of the deviations of items from some number. In this
course, we shall use them in the study of the skewness and kurtosis of a statistical distribution.

Moments about the origin

$$M_r = \frac{\sum X_i^r}{n} \tag{4.26}$$

where r = 0, 1, 2, 3, …

For a grouped or ungrouped frequency distribution, the moments about the origin are

$$M_r = \frac{\sum f_i X_i^r}{n} \tag{4.27}$$

where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency
distribution or the class value in the case of an ungrouped frequency distribution.

Note that: $M_0 = 1$ and $M_1 = \bar{X}$.

Moments about the Mean (Central Moments)

$$M'_r = \frac{\sum (X_i - \bar{X})^r}{n} \tag{4.28}$$

For a grouped or ungrouped frequency distribution, the moments about the mean are

$$M'_r = \frac{\sum f_i (X_i - \bar{X})^r}{n} \tag{4.29}$$

where $f_i$ is the frequency of $X_i$, and $X_i$ is the midpoint in the case of a grouped frequency
distribution or the class value in the case of an ungrouped frequency distribution.

Note that: $M'_2 = S^2$ if the divisor n − 1 in the sample variance is replaced by n.

Moments about any arbitrary constant A

$$M'_r = \frac{\sum (X_i - A)^r}{n} \tag{4.30}$$

For a grouped or ungrouped frequency distribution, the moments about any arbitrary constant A are

$$M'_r = \frac{\sum f_i (X_i - A)^r}{n} \tag{4.31}$$

Example 4.14: Find the first four moments about the mean for the following individual series

Xi : 3 6 8 10 18

Solution: n = 5 and $\bar{X} = \frac{45}{5} = 9$.

S.No    Xi     (Xi − X̄)    (Xi − X̄)²    (Xi − X̄)³    (Xi − X̄)⁴
1       3      −6          36           −216         1296
2       6      −3          9            −27          81
3       8      −1          1            −1           1
4       10     1           1            1            1
5       18     9           81           729          6561
Total   45     0           128          486          7940

Thus,

$$M'_1 = \frac{\sum (X_i - 9)}{5} = 0, \quad M'_2 = \frac{\sum (X_i - 9)^2}{5} = \frac{128}{5} = 25.6$$

$$M'_3 = \frac{\sum (X_i - 9)^3}{5} = \frac{486}{5} = 97.2, \quad M'_4 = \frac{\sum (X_i - 9)^4}{5} = \frac{7940}{5} = 1588$$
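
The central moments of Example 4.14 can be computed directly from the definition (formula 4.28). The function name `central_moment` in this Python sketch is an assumption made for illustration.

```python
def central_moment(data, r):
    """r-th moment about the mean: sum((x - xbar)^r) / n."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** r for x in data) / n


x = [3, 6, 8, 10, 18]
moments = [central_moment(x, r) for r in (1, 2, 3, 4)]
print(moments)  # [0.0, 25.6, 97.2, 1588.0]
```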

b. Skewness

Skewness refers to lack of symmetry (or departure from symmetry) in a distribution.

• A skewed frequency distribution is one that is not symmetrical.
• Skewness is concerned with the shape of the curve, not its size.

A distribution is said to be symmetrical when the values are distributed evenly around the mean
(the distribution of the data below the mean mirrors that above the mean). In a symmetrical
distribution, the mean, median and mode coincide (i.e., mean = median = mode).

Positively skewed distribution: if the value of mean is greater than the mode, skewness is said to
be positive. In a positively skewed distribution mean is greater than the mode and the median lies
somewhere in between mean and mode. A positively skewed distribution contains some values
that are much larger than the majority of other observations.

Negatively Skewed distribution: if the value of mode is greater than the mean, skewness is said
to be negative. In a negatively skewed distribution mode is greater than the mean and the median
lies in between mean and mode. The mean is pulled towards the low-valued item (that is, to the
left). A negatively skewed distribution contains some values that are much smaller than the
majority of observations.

Note that: In moderately skewed distributions the averages approximately satisfy the following
empirical relationship.

(Mean − Mode) ≈ 3(Mean − Median)

How to check the presence of skewness in a distribution?

Skewness is present in the data if:

i. the graph is not symmetrical.
ii. the mean, median and mode do not coincide.
iii. the sum of positive and negative deviations from the median is not zero.
iv. the frequencies are not similarly distributed on either side of the mode.

Measures of skewness (𝛂𝟑 )

A measure of skewness gives a numerical expression for the degree and the direction of asymmetry
in a distribution. It gives information about the shape of the distribution and the degree of
variation on either side of the central value. The three most commonly used measures of skewness
are Pearson’s coefficient of skewness, Bowley’s coefficient of skewness and the coefficient of
skewness based on moments.

1. Pearson’s coefficient of skewness (Pearsonian coefficient of skewness)

The skewness of a distribution can be measured by Pearson’s coefficient of skewness ($\alpha_3$), for
which the formula is given below:

$$\alpha_3 = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}} \tag{4.33}$$

2. Bowley’s Coefficient of Skewness

Bowley’s coefficient of skewness is based on quartiles. The formula for calculating the coefficient
of skewness is:

$$\alpha_3 = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1} \tag{4.34}$$

3. Moment Coefficient of Skewness

The moment coefficient of skewness is based on moments. The formula for calculating the
coefficient of skewness is:

$$\alpha_3 = \frac{M'_3}{(M'_2)^{3/2}} = \frac{M'_3}{\sigma^3} \tag{4.35}$$

where $M'_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^r$.

The shape of the curve is determined by the value of $\alpha_3$:

α3 > 0, the distribution is positively skewed/skewed to the right, i.e mode < median <mean
smaller observations are more frequent than larger observations. i.e., the majority of the
observations have a value below an average.

α3 = 0, the distribution is symmetric, i.e. mean = mode = median

α3 < 0, the distribution is negatively skewed/skewed to the left. i.e.,mean < median < mode

smaller observations are less frequent than larger observations. i.e., the majority of the
observations have a value above an average.

c. Kurtosis

Kurtosis is a measure of the peakedness of a distribution. The degree of kurtosis of a distribution
is measured relative to the peakedness of a normal curve. If a curve is more peaked than the normal
curve, it is called ‘leptokurtic’; if it is flatter-topped than the normal curve, it is called
‘platykurtic’. The normal curve itself is known as ‘mesokurtic’.

Measures of Kurtosis (𝛂𝟒 )

The moment coefficient of kurtosis:

$$\alpha_4 = \frac{M'_4}{(M'_2)^2} = \frac{M'_4}{\sigma^4} \tag{4.36}$$

The peakedness depends on the value of $\alpha_4$:

• α4 > 3: the curve is leptokurtic,
• α4 = 3: the curve is mesokurtic,
• α4 < 3: the curve is platykurtic.

Example 4.15: Based on the following data:

$$M'_0 = 1, \quad M'_1 = -0.6, \quad M'_2 = 1.6, \quad M'_3 = -2.4, \quad M'_4 = 5.8$$

a/ Find the coefficient of skewness and discuss the distribution type.
b/ Find the coefficient of kurtosis and discuss the distribution type.

Solution:

a/ $\alpha_3 = \frac{M'_3}{(M'_2)^{3/2}} = \frac{-2.4}{1.6^{3/2}} = -1.19 < 0$; the distribution is negatively skewed.

b/ $\alpha_4 = \frac{M'_4}{(M'_2)^2} = \frac{5.8}{1.6^2} = 2.27 < 3$; the curve is platykurtic.

Example 4.16: Find the coefficient of skewness and the coefficient of kurtosis for the data in
Example 4.14 above.

Solution:

i. $\alpha_3 = \frac{M'_3}{(M'_2)^{3/2}} = \frac{97.2}{(25.6)^{3/2}} = \frac{97.2}{129.527} = 0.75 > 0$; the distribution is positively skewed.

ii. $\alpha_4 = \frac{M'_4}{(M'_2)^2} = \frac{1588}{25.6^2} = 2.42 < 3$; the curve is platykurtic.
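
The moment coefficients of skewness and kurtosis (formulas 4.35 and 4.36) follow directly from the central moments. The Python sketch below recomputes them for the series 3, 6, 8, 10, 18 of Example 4.14; the function names are assumptions made for illustration.

```python
def central_moment(data, r):
    """r-th moment about the mean, with divisor n."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** r for x in data) / n


def moment_skewness(data):
    """alpha_3 = M'_3 / (M'_2)^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def moment_kurtosis(data):
    """alpha_4 = M'_4 / (M'_2)^2."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


x = [3, 6, 8, 10, 18]
a3 = moment_skewness(x)
a4 = moment_kurtosis(x)
print(round(a3, 2), round(a4, 2))  # 0.75 2.42

shape = "leptokurtic" if a4 > 3 else "platykurtic" if a4 < 3 else "mesokurtic"
print("positively skewed" if a3 > 0 else "not positively skewed", shape)
```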

Activity 4.2

1. What information do we get from skewness and kurtosis?

2. Discuss the importance of Z score.

Exercise 4
1. Calculate the mean deviation about the mean, median and mode, and their coefficients, and
also the variance and standard deviation for the following data.

Size of shoes       3    6    11   2    4    10   5    7    8    9
No. of pairs sold   10   15   25   6    4    3    2    8    9    4

2. An analysis of the monthly wages paid (in Birr) to workers in two firms A and B belonging
to the same industry gives the following results.

Value        Firm A   Firm B
Mean wage    52.5     47.5
Variance     100      121

In which firm, A or B, is there greater variability in individual wages?

3. A meteorologist interested in the consistency of temperatures in three cities during a given
week collected the following data. The temperatures for the five days of the week in the
three cities were

City 1: 25, 24, 23, 26, 17

City 2: 22, 21, 24, 22, 20

City 3: 32, 27, 35, 24, 28

Which city has the most consistent temperature, based on these data?

4. Some characteristics of the annual family income distribution (in Birr) in two regions are as
follows:

Region   Mean   Median   Standard deviation
A        6250   5100     960
B        6980   5500     940

a) Calculate the coefficient of skewness for each region.

b) For which region is the income more consistent?

5. The median and the mode of a mesokurtic distribution are 32 and 34 respectively. The
4th moment about the mean is 243. Compute the Pearsonian coefficient of skewness and
identify the type of skewness. Assume n − 1 = n.
6. If the standard deviation of a symmetric distribution is 10, what should be the value of the
fourth moment so that the distribution is mesokurtic?

CHAPTER FIVE

ELEMENTARY PROBABILITY

Objectives

After studying this chapter, you should be able to:

• Understand the fundamental concepts of probability.
• Apply the principles of counting techniques to solve real problems.
• Define some basic terms of probability.
• Apply probability to biological phenomena.

5.1 Introduction

Why is it that science is not always certain? Nature is complex and full of unexplained biological
variability. In addition, almost all methods of observation and experiment are imperfect. Observers
are subject to human bias and error.
Science is a continuing story; subjects vary; measurements fluctuate. Biomedical science, in
particular, contains controversy and disagreement; with the best of intentions, biomedical data,
medical histories, physical examinations, interpretations of clinical tests, descriptions of symptoms
and diseases are somewhat inexact. But most important of all, we always have to deal with
incomplete information: It is either impossible, or too costly, or too time consuming, to study the
entire population; we often have to rely on information gained from a sample, that is, a subgroup
of the population under investigation. So some uncertainty almost always prevails. Science and
scientists cope with uncertainty by using the concept of probability. By calculating probabilities,
they are able to describe what has happened and predict what should happen in the future under
similar conditions.
In short, the quantities we are interested in are often not predictable in advance but, rather,
exhibit an inherent variation. Probability and statistics are concerned with the quantification of
such random phenomena.

5.2 Definition of some probability terms

Experiment: Any process of observation or measurement, or any process which generates
well-defined outcomes.
Random experiment: An experiment which can be repeated any number of times under the same
conditions but does not give a unique result. The result will be any one of several possible
outcomes, but for each trial the result is not known in advance. A random experiment is also
called a trial, and its outcomes are called events.
Some of the characteristics of a random experiment are:
• All the possible outcomes of the experiment can be specified in advance.
• The experiment can be repeated indefinitely.
• There is a sort of regularity in the outcomes observed in large repetitions of the experiment.
Sample space: The collection of all possible outcomes (sample points) of a random experiment.
Sample point: Each element of the sample space.
Event: A subset of the sample space, i.e., a collection of sample points.
Impossible event: An event which will never occur.

Example 5.1: In an experiment of rolling a fair die, S = {1, 2, 3, 4, 5, 6}, each sample point is an
equally likely outcome. It is possible to define many events on this sample space as follows:

A = {1, 4} - the event of getting a perfect square number.

B = {2, 4, 6} - the event of getting an even number.

C = {1, 3, 5} - the event of getting an odd number.

D = ∅ - the event of getting the number 8, which is an impossible event.

Example 5.2: If we toss a coin, the sample space of this experiment is S = {head, tail}, where head
and tail are the two faces of the coin. If we are interested in the outcome that a head turns up,
then the event is E = {head}.

Example 5.3: Find the sample space of tossing a coin three times.

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
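
A sample space like the one in Example 5.3 can be enumerated mechanically. The sketch below uses `itertools.product` from the Python standard library; the event "exactly two heads" is an extra illustration that does not appear in the example.

```python
from itertools import product

# Each outcome is a sequence of three tosses, each toss being H or T.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]
print(sample_space)
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']

# An event is a subset of the sample space, e.g. "exactly two heads".
two_heads = [s for s in sample_space if s.count("H") == 2]
print(two_heads)  # ['HHT', 'HTH', 'THH']
```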

Example 5.4: If an experiment consists of measuring the lifetime of a hookworm, then the sample
space consists of all nonnegative real numbers, that is, S = [0, ∞).

Mutually exclusive events: Two events A and B are said to be mutually exclusive if there is no
sample point which is common to A and B, i.e., A ∩ B = ∅.

Independent events: Two or more events are said to be independent if the occurrence or
non-occurrence of one event does not affect the occurrence or non-occurrence of the others.

Dependent events: Two events are dependent if the occurrence of the first event changes the
probability of occurrence of the second event.
Complement of an event: The complement of an event A means the non-occurrence of A. It is
denoted by A' or Ac and contains those points of the sample space which do not belong to A.
Equally likely outcomes: Outcomes are equally likely if each outcome in the sample space has the
same chance of occurring.

Example 5.4: In casting a fair die, all possible outcomes are equally likely.

Review of set theory

Concepts of set theory are important in understanding probability. Let A, B and C be events
associated with a sample space S, and let ω represent an elementary event (outcome) in S. Then the
following are some useful definitions and results in set theory.

Union: The union of A and B, written A ∪ B, is the event containing all sample points in either A
or B or both. Sometimes we write "A or B" for the union.

Intersection: The intersection of A and B, written A ∩ B, is the event containing all sample points
that are in both A and B. Sometimes we write AB or "A and B" for the intersection.

Subset: A is a subset of B, written A ⊆ B, if every ω ∈ A also satisfies ω ∈ B.

Empty set: If a set A contains no points, it is called the null set, or empty set, and is denoted
by ∅.

c. Counting rules: addition, multiplication, Permutation & Combination rule

In order to calculate probabilities, we have to know

• The number of elements of an event.
• The number of elements of the sample space.

That is, in order to judge what is probable, we have to know what is possible.

In order to determine the number of outcomes, one can use several rules of counting:

1. The addition rule

2. The multiplication rule

3. Permutation rule

4. Combination rule

1. The addition Rule

Suppose that a procedure, designated by 1, can be done in n1 ways, and that a second procedure,
designated by 2, can be done in n2 ways. Suppose furthermore that it is not possible for both 1
and 2 to be done together. Then the number of ways in which we can do 1 or 2 is n1 + n2.

Example 5.5: Suppose we are planning a trip to some place. If there are 3 bus routes and 2 train
routes that we can take, then there are 3 + 2 = 5 different routes that we can take.

Example 5.6: Suppose one wants to purchase a certain commodity and that this commodity is on
sale in 5 government owned shops, 6 public shops and 10 private shops. How many alternatives
are there for the person to purchase this commodity?

Solution: Total number of ways =5+6+10=21 ways

2. Multiplication rule: If an operation consists of k steps, and the 1st step can be done in n1 ways,
the 2nd step can be done in n2 ways (regardless of how the 1st step was performed), …, and the kth
step can be done in nk ways (regardless of how the preceding steps were performed), then the entire
operation can be performed in n1 · n2 · … · nk ways.

Example 5.7: Suppose that a person has 2 different pairs of trousers and 3 shirts. In how many
ways can he wear his trousers and shirts?

Solution: He can choose the trousers in n1 = 2 ways and the shirts in n2 = 3 ways. Therefore, he can
dress in n1 × n2 = 2 × 3 = 6 possible ways.

Example 5.8: If a test consists of 10 multiple choice questions, with each permitting 4 possible
answers, how many ways are there in which a student gives his/her answers?

Solution: There are 10 steps required to complete the test.

First step: To give answer to question number one. He/she has 4 alternatives.

Second step: To give answer to question number two, he/she has 4 alternatives……

Last step: To give answer to last question, he/she has 4 alternatives.

Therefore, he/she has 4 × 4 × 4 × … × 4 = 4^10 = 1,048,576 ways of completing the exam.

Note that there is only one way in which he/she can give correct answers to all questions and that
there are 3^10 = 59,049 ways in which all the answers will be incorrect.
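The multiplication rule in Example 5.8 can be checked with a short Python sketch. The 3-question version below is a hypothetical smaller test, introduced only so that all answer sheets can be listed explicitly; the variable names are illustrative.

```python
from itertools import product

# Hypothetical smaller test: 3 questions with 4 choices each,
# small enough to enumerate every possible answer sheet.
choices = "ABCD"
answer_sheets = list(product(choices, repeat=3))
small_count = len(answer_sheets)      # should equal 4 * 4 * 4

# The 10-question test of Example 5.8, by the multiplication rule.
total_ways = 4 ** 10                  # any answer to each question
all_wrong_ways = 3 ** 10              # 3 incorrect options per question

print(small_count, total_ways, all_wrong_ways)
```

Enumerating the small case and comparing it with 4 × 4 × 4 is a quick way to see why the rule multiplies the number of choices at each step.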

3. Permutation: An arrangement of objects with attention given to the order of arrangement is called a
permutation. The number of permutations of n different objects taken r at a time is obtained by:

$$ {}_nP_r = \frac{n!}{(n-r)!}, \quad r = 0, 1, 2, \ldots, n \qquad (5.1) $$

Permutation Rule:

a) The number of permutations of n objects taken all together is n!, i.e.

$$ {}_nP_n = \frac{n!}{(n-n)!} = \frac{n!}{0!} = n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1 $$

Note: By definition 0! = 1

b) The arrangement of n distinct objects in a specific order using r objects at a time is called the
permutation of n objects taken r objects at a time. It is written as nPr and the formula is

$$ {}_nP_r = \frac{n!}{(n-r)!} $$

c) The number of distinct permutations of n objects in which n1 are alike, n2 are alike, ..., nk are
alike is

$$ \frac{n!}{n_1!\, n_2! \cdots n_k!}, \quad \text{where } n = n_1 + n_2 + \cdots + n_k $$

Example 5.9: Find the number of permutations of the letters in the word "statistics".

Solution:

There are 3 s's, 3 t's, 1 a, 2 i's and 1 c, i.e. n1 = 3, n2 = 3, n3 = 1, n4 = 2 and n5 = 1, so n = 10.

Therefore, the number of permutations is

$$ \frac{10!}{3!\,3!\,1!\,2!\,1!} = 50{,}400 $$

Example 5.10: A photographer wants to arrange 3 persons in a row for photograph. How many
different types of photographs are possible?

Solution:

Let the 3 persons be Aster (A), Lemma (L) and Yared (Y), so n = 3.

Since n! = 3! = 3 × 2 × 1 = 6, there are 6 possible arrangements: ALY, AYL, LAY, LYA, YLA and YAL.

Example 5.11: Suppose we have the letters A, B, C, D and E.

a) How many permutations are there taking all five letters?

b) How many permutations are there taking two letters at a time?

Solution:

a) Here n = 5, since there are five distinct objects.

There are 5! = 120 permutations.

b) Here n = 5, r = 2

There are 5P2 = 5!/(5-2)! = 120/6 = 20 permutations.

Example 5.12: Fifteen Ethiopian athletes entered a race. In how many different ways
could prizes for the first, the second and the third place be awarded?

Solution

Here 15 objects are taken 3 at a time, and order matters, so there are 15P3 = 15!/(15 − 3)! = 15 × 14 × 13 = 2730 ways.
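The permutation formulas used in Examples 5.9, 5.11 and 5.12 can be evaluated with Python's standard library. This is a minimal sketch: `math.perm` computes nPr directly, while `multiset_permutations` is a helper name introduced here for the "n1 alike, n2 alike, …" rule.

```python
from collections import Counter
from math import factorial, perm

def multiset_permutations(word):
    """Distinct arrangements of the letters of `word`: n! / (n1! n2! ... nk!)."""
    result = factorial(len(word))
    for count in Counter(word).values():   # one factor per repeated letter
        result //= factorial(count)
    return result

statistics_count = multiset_permutations("statistics")  # Example 5.9
two_of_five = perm(5, 2)                                 # Example 5.11 b
podium_orders = perm(15, 3)                              # Example 5.12

print(statistics_count, two_of_five, podium_orders)
```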

4. Combination: A selection of objects considered without regard to the order in which they occur is
called a combination. The number of combinations of n different objects taking r of them at a time
is

$$ {}_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}, \quad r = 0, 1, 2, \ldots, n \qquad (5.2) $$

Example 5.13: Given the letters A, B, C, and D, list the permutations and combinations for selecting
two letters.

Solution:

Permutations            Combinations

AB  BA  CA  DA          AB  BC

AC  BC  CB  DB          AC  BD

AD  BD  CD  DC          AD  CD

Note that in permutation AB is different from BA but in combination AB is the same as BA.
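The two lists in Example 5.13 can be generated rather than written out by hand. The sketch below uses `itertools.permutations` and `itertools.combinations` from the Python standard library and compares the counts with the nPr and nCr formulas.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCD"

# Ordered selections of two letters (AB and BA are different).
perm_list = ["".join(p) for p in permutations(letters, 2)]

# Unordered selections of two letters (AB and BA are the same).
comb_list = ["".join(c) for c in combinations(letters, 2)]

print(perm_list)   # 12 arrangements
print(comb_list)   # 6 selections
print(len(perm_list) == perm(4, 2), len(comb_list) == comb(4, 2))
```

Each combination corresponds to 2! = 2 permutations, which is why 12 / 2 = 6.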

Example 5.14: In a club containing 7 members a committee of 3 people is to be formed. In how
many ways can the committee be formed?

Solution: Since the order of the members does not matter,

$$ {}_7C_3 = \binom{7}{3} = \frac{7!}{3!\,(7-3)!} = 35 $$

Example 5.15: How many four-digit numbers can be formed with the 10 digits 0,1,2, . . ,9 if

a/ repetitions are allowed

b/ repetitions are not allowed, and

c/ the last digit must be zero & repetitions are not allowed.

Solution:

a/ The first digit can be any one of 9 (since 0 is not allowed). The second, third and fourth digits
can each be any one of 10. Then 9 · 10 · 10 · 10 = 9000 numbers can be formed.

b/ The first digit can be any one of 9, and the remaining three can be chosen from the 9 unused digits in 9P3 ways. Thus 9 · 9P3
= 4536 numbers can be formed.

c/ The last digit is fixed as 0. The first digit can be chosen in 9 ways and the next two digits from the 8 remaining digits in 8P2 ways. Thus 9 · 8P2 = 504
numbers can be formed.
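Because there are only 9000 four-digit numbers, the three counts in Example 5.15 can also be confirmed by brute force. The following Python sketch simply loops over all of them; `has_distinct_digits` is a helper name chosen here for illustration.

```python
def has_distinct_digits(n):
    """True if no digit of n is repeated."""
    digits = str(n)
    return len(set(digits)) == len(digits)

four_digit_numbers = range(1000, 10000)   # the first digit is never 0

count_a = len(four_digit_numbers)                                    # repetitions allowed
count_b = sum(1 for n in four_digit_numbers if has_distinct_digits(n))
count_c = sum(1 for n in four_digit_numbers
              if n % 10 == 0 and has_distinct_digits(n))             # last digit 0

print(count_a, count_b, count_c)
```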

5.4 Approaches in probability definition

Definition: Probability is a numerical measure of the chance or likelihood that a particular event
will occur & it lies in the range from 0-1, inclusive. Probability is a building block of inferential
statistics.

Definition: Let E be an experiment. Let S be a sample space associated with E. With each event
A in S we associate a real number designated by P (A) and called the probability of A.

Generally, probability can be divided into two:

i. Subjective probability: probability determined based on an individual's own judgment,
experience, information, belief, etc.
ii. Objective probability: the probability of an event in a certain experiment based on
experimental evidence.

Basic approaches to probability

There are three different conceptual approaches to the study of probability theory.

These are:

1. The classical approach.

2. The frequentist approach.

3. The axiomatic approach.

1. Classical approach:

Definition: If there are n equally likely outcomes of an experiment, and k of these n outcomes
are favorable to event A, then the probability of the event A, denoted by P(A), is defined as

$$ P(A) = \frac{\text{Number of outcomes favorable to event } A}{\text{Total number of outcomes}} = \frac{n(A)}{n(S)} = \frac{k}{n} \qquad (5.3) $$

Note: The classical approach to measuring probability fails under the following conditions:

• If the total number of outcomes is infinite, or if it is not possible to enumerate all elements of
the sample space.
• If the outcomes are not equally likely.

Example 5.16: a/ Compute the probability of having two boys and one girl in a three-child family
using the classical method, assuming boys and girls are equally likely.

b/ Using the sample space in (a), compute the probability of having three boys in a three-child family.
c/ Using the sample space in (a), compute the probability of having three girls in a three-child family.
d/ Using the sample space in (a), compute the probability of having two girls and one boy in a three-child family.

Solution

The sample space S of the experiment is

S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}
So n(S) = 8.
a/ For the event A = "two boys and a girl" = {BBG, BGB, GBB}, we have n(A) = 3. Since the outcomes
are equally likely, the probability of A is P(A) = n(A)/n(S) = 3/8 = 0.375.
b/ For the event B = "three boys" = {BBB}, we have n(B) = 1. Since the outcomes are equally likely,
the probability of B is P(B) = n(B)/n(S) = 1/8 = 0.125.
c/ For the event C = "three girls" = {GGG}, we have n(C) = 1. Since the outcomes are equally likely,
the probability of C is P(C) = n(C)/n(S) = 1/8 = 0.125.
d/ For the event D = "two girls and one boy" = {BGG, GBG, GGB}, we have n(D) = 3. Since the
outcomes are equally likely, the probability of D is P(D) = n(D)/n(S) = 3/8 = 0.375.
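The classical definition in Example 5.16 amounts to counting favorable outcomes in an enumerated sample space. A minimal Python sketch, assuming equally likely outcomes as in the example (the function name `classical_probability` is illustrative):

```python
from fractions import Fraction
from itertools import product

# All birth orders for three children: BBB, BBG, ..., GGG.
sample_space = ["".join(s) for s in product("BG", repeat=3)]

def classical_probability(event):
    """n(A) / n(S), where `event` is a predicate on a single outcome."""
    favorable = [s for s in sample_space if event(s)]
    return Fraction(len(favorable), len(sample_space))

p_two_boys_one_girl = classical_probability(lambda s: s.count("B") == 2)
p_three_boys = classical_probability(lambda s: s == "BBB")
p_three_girls = classical_probability(lambda s: s == "GGG")
p_two_girls_one_boy = classical_probability(lambda s: s.count("G") == 2)

print(p_two_boys_one_girl, p_three_boys, p_three_girls, p_two_girls_one_boy)
```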

Example 5.17: A box of 80 candles consists of 30 defectives and 50 non defective candles. If 10
of these candles are selected at random without replacement, what is the probability

a) all will be defective?

b) 6 will be non-defective?

c) all will be non-defective?

Solution

The total number of ways of selecting 10 candles out of 80 is

$$ N = n(S) = \binom{80}{10} $$

a) Let A be the event that all will be defective. The number of ways in which A can occur is
$n(A) = \binom{30}{10}\binom{50}{0}$, so

$$ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{10}\binom{50}{0}}{\binom{80}{10}} \approx 0.00001825 $$

b) Let A be the event that 6 will be non-defective (and therefore 4 defective). The number of ways in which A can occur is
$n(A) = \binom{30}{4}\binom{50}{6}$, so

$$ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{4}\binom{50}{6}}{\binom{80}{10}} \approx 0.2645 $$

c) Let A be the event that all will be non-defective. The number of ways in which A can occur is
$n(A) = \binom{30}{0}\binom{50}{10}$, so

$$ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{0}\binom{50}{10}}{\binom{80}{10}} \approx 0.00624 $$

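The three answers in Example 5.17 all follow one pattern: choose some of the 30 defective candles, choose the rest from the 50 non-defective ones, and divide by the number of ways to choose 10 from 80. The sketch below wraps that pattern in a helper (the name `prob_defective_drawn` is an assumption made here) using `math.comb`.

```python
from math import comb

def prob_defective_drawn(k, defective=30, good=50, sample=10):
    """P(exactly k defective candles in a sample drawn without replacement)."""
    ways = comb(defective, k) * comb(good, sample - k)
    return ways / comb(defective + good, sample)

p_all_defective = prob_defective_drawn(10)   # part a
p_six_good = prob_defective_drawn(4)         # part b: 4 defective, 6 non-defective
p_all_good = prob_defective_drawn(0)         # part c

print(p_all_defective, p_six_good, p_all_good)
```

Summing `prob_defective_drawn(k)` over k = 0, 1, …, 10 gives 1, since exactly one of these counts must occur.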
2. The Frequentist Approach (Empirical Probability): This approach to probability is based on
relative frequencies.

Definition: Suppose we repeat a certain experiment n times, let A be an event of
the experiment, and let k be the number of times that event A occurs. Then the probability of
the event A happening in the long run is given by:

$$ P(A) = \frac{\text{Number of times event } A \text{ has occurred}}{\text{Total number of observations}} = \frac{k}{n} \qquad (5.4) $$

In other words, given a frequency distribution, the probability of an event A being
in a given class is

$$ P(A) = \frac{\text{Frequency of class } A}{\text{Total frequency in the distribution}} $$

Example 5.18: The National Center for Health Statistics reported that of every 539 deaths in recent
years, 24 resulted from automobile accidents, 182 from cancer, and 353 from other diseases.
What is the probability that a particular death is due to an automobile accident?

Solution

P(automobile) = deaths due to automobile accidents / total deaths = 24/539 ≈ 0.0445

The probability that a particular death is due to an automobile accident is about 0.0445.

3. The axiomatic approach.

Let E be a random experiment and S be a sample space associated with E. With each event A we
associate a real number P(A), called the probability of A, which satisfies the following properties, called the axioms of
probability or postulates of probability.

1. 0 ≤ P(A) ≤ 1

2. P(S) = 1, S is the sure/certain event.

3. If A1 and A2 are mutually exclusive events, the probability that one or the other occurs equals the
sum of the two probabilities, i.e. P(A1∪A2) = P(A1) + P(A2).

Similarly, for mutually exclusive events A1, A2, …, An, P(A1∪A2∪ … ∪An) = P(A1) + P(A2) + … + P(An) = $\sum_{i=1}^{n} P(A_i)$

4. P (A') =1-P (A)

5. P (ø) =0, ø is the impossible event.

5.5 Some probability rules

Rule 1: Let A be an event and A' be the complement of A with respect to a given sample space of
an experiment. Then P(A') = 1 − P(A).

Proof: Let S be the sample space. Then S = A∪A', and A and A' are mutually exclusive:

A∩A' = ø

P(S) = P(A∪A') = P(A') + P(A) and P(S) = 1

1 = P(A') + P(A) => P(A') = 1 − P(A)

Rule 2: Let A and B be events of a sample space S. Then

P(A' ∩ B) = P(B) − P(A ∩ B)

Proof: B = S ∩ B = (A∪A') ∩ B = (A ∩ B) ∪ (A' ∩ B)

Since A ∩ B and A' ∩ B are mutually exclusive, P(B) = P(A ∩ B) + P(A' ∩ B), and hence

P(A' ∩ B) = P(B) − P(A ∩ B).

Rule 3: Suppose A and B are two events of a sample space, then

P(AUB) = P(A) + P(B) – P(A ∩ B)

Proof:

A∪B = A∪(A' ∩ B), where A and A' ∩ B are disjoint sets

∴ P(A∪B) = P(A) + P(A' ∩ B) . . . .*

But we have already proved that P(A' ∩ B) = P(B) − P(A ∩ B)

Substituting this in equation * gives

P(A∪B) = P(A) + P(B) − P(A ∩ B)

Example 5.19: A fair die is thrown twice. Calculate the probability that the sum of spots on the
face of the die that turn up is divisible by 2 or 3.

Solution

S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),(3,1),(3,2),(3,3),(3,4),(3,5),
(3,6),(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),(6,1),(6,2),(6,3),(6,4),
(6,5),(6,6)}

This sample space has 6 × 6 = 36 elements. Let A be the event that the sum of the spots on the dice is
divisible by 2 and B be the event that the sum of the spots on the dice is divisible by 3. Then

A = {(1,1), (1,3), (1,5), (2,2), (2,4), (2,6), (3,1), (3,3), (3,5), (4,2), (4,4), (4,6), (5,1), (5,3), (5,5),
(6,2), (6,4), (6,6)}

B = {(1,2), (1,5), (2,1), (2,4), (3,3), (3,6), (4,2), (4,5), (5,1), (5,4), (6,3), (6,6)}

A∩B = {(1, 5), (2,4), (3,3), (4,2), (5,1), (6,6)}

P (A or B) = P (A U B)

= P (A) +P (B) – P (A∩B)

= 18/36 + 12/36 -6/36 = 24/36 = 2/3
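Example 5.19 can be reproduced by listing the 36 outcomes and building the events A and B as sets. The Python sketch below computes P(A ∪ B) both directly and through the addition rule of Rule 3, so the two can be compared.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))       # 36 equally likely outcomes

A = {r for r in rolls if sum(r) % 2 == 0}          # sum divisible by 2
B = {r for r in rolls if sum(r) % 3 == 0}          # sum divisible by 3

def prob(event):
    return Fraction(len(event), len(rolls))

p_union_direct = prob(A | B)                       # count the union directly
p_union_rule = prob(A) + prob(B) - prob(A & B)     # addition rule

print(len(A), len(B), len(A & B), p_union_direct, p_union_rule)
```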

Example 5.20: Two nurses perform ear examinations, focusing on the color of the eardrum
(tympanic membrane); each independently assigns each of 100 ears to one of two categories:
normal or gray, or not normal (white, pink, orange, or red). The data are shown in the table below.

                        Nurse 2
Nurse 1        Normal    Not normal    Total
Normal           35          10          45
Not normal       20          35          55
Total            55          45         100
If an individual is going to have his ears examined by both nurses, what is the probability that

a) the result from nurse 1 will be normal b) the result from nurse 2 will be normal

c) both results will be normal d) at least one of the results will be normal

e) both results will be abnormal

Solution: Let A be the event that the examination by nurse 1 gives a normal result and
B be the event that the examination by nurse 2 gives a normal result.

a) P(A) = 45/100 = 0.45

b) P(B) = 55/100 = 0.55

c) P(A ∩ B) = 35/100 = 0.35

d) P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.45 + 0.55 − 0.35 = 0.65

e) P(A' ∩ B') = P((A ∪ B)') = 1 − P(A ∪ B) = 1 − 0.65 = 0.35
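The answers to Example 5.20 can be read off the two-way table programmatically. In the sketch below the dictionary keys are (nurse 1 result, nurse 2 result) pairs taken from the table; the helper `prob` and the variable names are illustrative choices.

```python
from fractions import Fraction

# Cell counts from the table: (nurse 1 result, nurse 2 result) -> number of ears.
counts = {
    ("normal", "normal"): 35,
    ("normal", "not normal"): 10,
    ("not normal", "normal"): 20,
    ("not normal", "not normal"): 35,
}
total = sum(counts.values())

def prob(condition):
    """Relative frequency of the cells that satisfy `condition`."""
    return Fraction(sum(n for cell, n in counts.items() if condition(cell)), total)

p_a = prob(lambda cell: cell[0] == "normal")             # nurse 1 normal
p_b = prob(lambda cell: cell[1] == "normal")             # nurse 2 normal
p_both = prob(lambda cell: cell == ("normal", "normal"))
p_at_least_one = prob(lambda cell: "normal" in cell)
p_neither = prob(lambda cell: "normal" not in cell)

print(float(p_a), float(p_b), float(p_both), float(p_at_least_one), float(p_neither))
```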

5.6 Conditional Probability and Independence

5.6.1 Conditional Probability

If A and B are events, the conditional probability of A given B is the probability of occurrence
of A when the event B has already happened.

It is denoted by P(A/B) and is defined by

P(A/B) = P(A ∩ B)/P(B), if P(B) ≠ 0

Similarly, the conditional probability of B given A is the probability of occurrence of B when the event A
has already happened. It is denoted by P(B/A) and is defined by

P(B/A) = P(A ∩ B)/P(A), if P(A) ≠ 0

It follows that

P(A ∩ B) = P(A) P(B/A) = P(B) P(A/B). ………………………………………….…5.6

5.6.2 Multiplication Law of Probability

If A and B are events in a sample space S, then

P (A ∩ B) = P (A) P (B/A), P (A) ≠ 0

P (A ∩ B) = P (B) P (A/B), P (B) ≠ 0

Where P (B/A) represents the conditional probability of B given A and P (A/B) represents the
conditional probability of A given B.

Note: Extension of multiplication law of probability for ‘n’ events A1, A2, …, An we have

P (A1 ∩ A2 ∩ …∩An) = P (A1) P (A2/A1) p (A3/A1 ∩ A2)…P(An/A1∩ A2 ∩ …∩An-1)

Example 5.21: A coin is tossed twice. If it is already known that the first toss shows a head,
what is the probability of getting two heads?

Solution:

S = {HH, HT, TH, TT}, A = the first toss shows a head = {HH, HT}, B = two heads occur = {HH}

P(B/A) = P(A ∩ B)/P(A)

But A ∩ B = {HH}, so P(A ∩ B) = 1/4 and P(A) = 1/2. Therefore, P(B/A) = P(A ∩ B)/P(A) = (1/4)/(1/2) = 1/2.

Example 5.22: Let A and B be events such that P(A ∪ B) = ¾, P(A ∩ B) = ¼ and P(A') = 2/3.
Find P(A'/B).

Solution:

P(A') = 2/3 ⇒ P(A) = 1 − P(A') = 1 − 2/3 = 1/3

Now, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

3/4 = 1/3 + P(B) − ¼

P(B) = 3/4 − 1/3 + ¼ = 2/3

Therefore, P(A/B) = P(A ∩ B)/P(B) = (1/4)/(2/3) = 3/8 ⇒ P(A'/B) = 1 − P(A/B) = 1 − 3/8 = 5/8.

5.6.3 Probability of Independent Event

Two events A and B are said to be independent if the occurrence of A has no bearing on the occurrence
of B. That means knowing that A has occurred gives no information about the occurrence of B.
Formally, two events A and B are said to be independent if P(A∩B) = P(A)P(B).

Suppose A and B are independent events with 0 < P(A) < 1 and 0 < P(B) < 1. Then the following
statements are true:

i. A' and B' are independent, ii. A and B' are independent, iii. A' and B are independent

iv. P(B|A) = P(B), v. P(B|A') = P(B)

Example 5.23: A box contains four black and six white balls. What is the probability of getting
two black balls in drawing one after the other under the following conditions?

a. The first ball drawn is not replaced

b. The first ball drawn is replaced

Solution

Let A = the first ball drawn is black

B = the second ball drawn is black

Required: P(A ∩ B)

a. Without replacement, P(A ∩ B) = P(A) P(B/A) = (4/10)(3/9) = 12/90 = 2/15

b. With replacement, the draws are independent, so P(A ∩ B) = P(A) P(B) = (4/10)(4/10) = 16/100 = 4/25.
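The difference between the two parts of Example 5.23 can be made concrete by enumerating ordered pairs of draws. The sketch below labels the ten balls by index, an assumption made only so that balls of the same color can be told apart while counting.

```python
from fractions import Fraction
from itertools import permutations, product

balls = ["black"] * 4 + ["white"] * 6
indices = range(len(balls))

# Ordered pairs (first draw, second draw).
draws_without_replacement = list(permutations(indices, 2))   # 10 * 9 pairs
draws_with_replacement = list(product(indices, repeat=2))    # 10 * 10 pairs

def p_both_black(draws):
    hits = sum(1 for i, j in draws
               if balls[i] == "black" and balls[j] == "black")
    return Fraction(hits, len(draws))

p_without = p_both_black(draws_without_replacement)   # equals (4/10)(3/9)
p_with = p_both_black(draws_with_replacement)         # equals (4/10)(4/10)

print(p_without, p_with)
```

Only in the second case does the result factor as P(A)P(B), which is the defining property of independent events.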

5.7 Total probability and Bayes’ Theorem

Total probability: If events B1, B2, …, Bk constitute a partition of the sample space S and P(Bi)
≠ 0 for i = 1, 2, …, k, then for any event A in S,

$$ P(A) = \sum_{i=1}^{k} P(B_i)\,P(A/B_i) \qquad (5.7) $$

Example 5.24: In a factory, machines A1, A2 and A3 manufacture 25%, 35% and 40% of the total output,
respectively. Of their products, 5%, 4% and 2%, respectively, are defective. An item is drawn at
random from the total output. What is the probability that the item is defective?

Solution: P(A1) = 0.25, P(A2) = 0.35, P(A3) = 0.40, P(D/A1) = 0.05, P(D/A2) = 0.04, P(D/A3) = 0.02

P(D) = ∑ P(Ai)P(D/Ai) = P(A1)P(D/A1) + P(A2)P(D/A2) + P(A3)P(D/A3)

= (0.25)(0.05) + (0.35)(0.04) + (0.40)(0.02) = 0.0125 + 0.0140 + 0.0080 = 0.0345

Bayes’ Theorem: If B1, B2, …, Bk are events which form an exhaustive partition of the sample
space S, and A is any event in S with P(A) ≠ 0, then the conditional probability of Bi given that A has already
occurred is:

$$ P(B_i/A) = \frac{P(B_i)\,P(A/B_i)}{\sum_{j=1}^{k} P(B_j)\,P(A/B_j)} \qquad (5.9) $$

Note: the denominator is the total probability P(A).

Example 5.25: Based on the above example, suppose the item drawn is found to be defective. What is the
probability that it was manufactured by machine A1?

Solution:

$$ P(A_1/D) = \frac{P(A_1)\,P(D/A_1)}{\sum_{i=1}^{3} P(A_i)\,P(D/A_i)} = \frac{(0.25)(0.05)}{0.0345} \approx 0.3623 $$
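The total probability rule and Bayes' theorem for the three-machine factory can be computed together in a few lines. This Python sketch uses dictionaries keyed by machine name; the names `priors`, `defect_rates` and `posteriors` are illustrative.

```python
# Share of total output per machine, P(Ai), and defect rate per machine, P(D/Ai).
priors = {"A1": 0.25, "A2": 0.35, "A3": 0.40}
defect_rates = {"A1": 0.05, "A2": 0.04, "A3": 0.02}

# Total probability: P(D) = sum of P(Ai) * P(D/Ai).
p_defective = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' theorem: P(Ai/D) = P(Ai) * P(D/Ai) / P(D).
posteriors = {m: priors[m] * defect_rates[m] / p_defective for m in priors}

print(round(p_defective, 4))
for machine, p in posteriors.items():
    print(machine, round(p, 4))
```

The posterior probabilities add up to 1, because a defective item must have come from exactly one of the three machines.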

Exercise 5

1. A fair die is tossed once. What is the probability of getting?

a) Number 4? b) An odd number? c) Outcome of at least number 4? d) Number 8?

2. In how many ways can 10 people be seated on a bench if only 4 seats are available?

3. A committee of 5 is formed by drawing lots from 8 boys and 6 girls. Find the probability that
the committee will consist of 2 boys and 3 girls.

4. A box contains 6 red, 4 white and 5 black balls. A man draws 4 balls from the box at random.
Find the probability that among the balls drawn there is at least one ball of each color.

5. How many four-digit numbers can be formed with the 6 digits 1,2,. . .,6, if

a/ repetitions are allowed b/repetitions are not allowed.

6. The probabilities that A and B solve a given problem independently are 2/3 and 3/5 respectively.
If both of them attempt the problem, find the probability that the problem will be solved.

7. A bag contains 15 items of which 4 are defective. The items are selected at random one by one
and examined. The ones examined are not put back. What is the chance that 10th one examined is
the last defective?

8. A company has two machines M1 and M2. M1 produces 60% of its product and M2 produces
40% of its product. M1 produces 5% defective units and M2 produces 4% defective units. A unit
is selected at random from the whole product.

a/Find the probability that it is defective. b/ What is the probability that it was manufactured by
machine M2.

CHAPTER SIX

PROBABILITY DISTRIBUTION
The purpose of this unit is to introduce you to the concept of random variables and their
probability distributions. In a probability distribution, the values of the variable are distributed according to
some definite probability function. In the previous unit we discussed the concept of
probability. The different rules of probability and frequency distributions were also discussed. In
this unit we use this information to understand discrete and continuous probability
distributions. Moreover, the concept of mathematical expectation is discussed.

After completing this unit, you will be able to:

• define the term random variable
• understand discrete and continuous random variables
• define the term probability distribution
• differentiate between discrete and continuous probability distributions
• compute the expected value of a random variable
• apply the concepts of probability distributions to real-life problems.

6.1 Definition of Random Variables and probability distribution

Definition: A variable whose values are determined by chance, with associated probabilities, is
called a random variable. It is a quantity which can assume different values in different
observations.

In any experiment of chance, the outcomes occur randomly. For example, the total score when a
pair of dice is rolled, the number of heads when a coin is tossed several times, and annual household
income are random variables (or stochastic variables).

Random variables are usually denoted by capital letters X, Y, Z, etc., while the values taken by
them are denoted by lower case letters x, y, z, etc. Thus, P(x1 ≤ X ≤ x2) is the probability that the
random variable X takes values between x1 and x2, both inclusive. A random variable can be
discrete or continuous.

Discrete Random Variable

If the random variable X can assume only a finite or countably infinite set of values, it
is said to be a discrete random variable. Examples include the number of children in a family, the number
of car accidents within a given period of time in a certain locality, the number of bacteria in a cubic
mm of agar, and the number that turns up when a die is thrown.

Example 6.1: Consider an experiment of "flipping a fair coin 3 times". List the elements of the
sample space that are assumed to be equally likely (as this is what is meant by a fair or balanced
coin) and the corresponding values x of the r-v X, the number of heads observed.

Solution: If H stands for heads and T for tails, then the sample space corresponding to this
experiments is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Since X= the number of heads observed, the results are shown in the following table:

Element of sample space Probability X


HHH 1/8 3
HHT 1/8 2
HTH 1/8 2
HTT 1/8 1
THH 1/8 2
THT 1/8 1
TTH 1/8 1
TTT 1/8 0
Thus, we can write X(HHH) = 3, X(HHT) = 2, …, X(TTT) = 0, and P(X = 3) = 1/8 is the probability
that the r-v X is 3; similarly, P(X = 2) = 3/8 and P(X = 0) = 1/8.

Note that the possible values of X are xi = 0, 1, 2, 3.

Continuous Random Variable

A random variable X is said to be continuous if it can take all possible values (integral as well as
fractional) between certain limits. Continuous random variables occur when we deal with
quantities that are measured on a continuous scale. In such cases, probabilities are associated with
intervals or regions of a continuous random variable, and not with individual points.

Example 6.2:

• The height of an individual,
• The body weight of a newborn baby,
• The speed of a car,
• The distance between Dessie and Tulu Awuliya, etc.

Probability Distribution

A probability distribution shows the possible outcomes of an experiment and the probability of
each of these outcomes. That is, a probability distribution is a complete list of all possible values
of a random variable and their corresponding probabilities.

Probability mass function (pmf)


If X is a one-dimensional discrete random variable taking at most a countable number of values
x1, x2, …, then the probabilistic behavior of X at each xi is described by its probability mass
function.

Definition: If X is a discrete random variable taking at most a countably infinite number of values
x1, x2, …, then the function PX(x), or simply P(xi), defined by P(xi) = P(X = xi), i = 1, 2, …, is called the
probability mass function of the random variable X. The set of ordered pairs {xi, P(xi)}, i = 1, 2, …,
gives the probability distribution of the random variable X. A discrete probability distribution is a
distribution whose random variable is discrete. It describes a set of separate possible occurrences, as for
discrete "count data."

The probability mass function must satisfy the following conditions.

I. P(xi) ≥ 0 for every i

II. $\sum_{i=1}^{\infty} P(X = x_i) = 1$

Example 6.3: In a certain society, the probability of the birth of a female baby is 0.5. Suppose you pick a
family with four children at random from the society and let X be the number of females among
the four children.

i. Construct the probability mass function of the random variable X.

Solution: Let b and g represent a boy and a girl, respectively. We represent a family whose first three
children are boys and whose last child is a girl as bbbg. The sample space consists of 16 possible outcomes. We can easily
enumerate the possible outcomes with the aid of the multiplication principle and a tree diagram.
S = {bbbb, bbbg, bbgb, bbgg, bgbb, bgbg, bggb, bggg, gbbb, gbbg, gbgb, gbgg, ggbb, ggbg, gggb,
gggg}

It is evident that the possible values of X are 0, 1, 2, 3, 4. To assign a probability to the event that X
takes a given value, we need to know the equivalent event in terms of the original outcomes. For
example, the event X = 2 is equivalent to the event {bbgg, bgbg, bggb, gbbg, gbgb, ggbb}. Hence
P(X = 2) = 6/16 = 0.375. We can calculate the probability that X takes other values in a similar
way. The probability mass function of X is shown in the table below:

x        0      1      2      3      4
PX(x)    1/16   4/16   6/16   4/16   1/16

ii. Based on the distribution (p.m.f.) of X, calculate the probability that

A) the family will have no female child
B) the family will have three female children
C) the family will have more than 2 female children
D) the family will have at least 1 female child
E) the family will have at most 2 female children.

Solution:

A. P(X=0)=1/16
B. P(X=3)=4/16
C. P(X>2)=P(X=3 or X=4)= P(X=3)+ P(X=4)=4/16 +1/16= 5/16
D. P(X≥1)= P(X=1)+ P(X=2) +P(X=3)+ P(X=4)=4/16 + 6/16 + 4/16 +1/16=15/16

= 1-P(X=0)=1-1/16, using the complement of an event

E. P(X≤2)= P(X=0)+P(X=1)+P(X=2)

= 1-P(X>2) = 1-[P(X=3) +P(X=4)] =1-5/16=11/16
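The probability mass function of Example 6.3 and the probabilities in part ii can be built directly from the 16 equally likely outcomes. A minimal Python sketch (the names `families` and `pmf` are illustrative):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 16 equally likely birth orders: ('b','b','b','b'), ..., ('g','g','g','g').
families = list(product("bg", repeat=4))

# X = number of girls; count how many outcomes give each value of X.
girl_counts = Counter(family.count("g") for family in families)
pmf = {x: Fraction(girl_counts[x], len(families)) for x in sorted(girl_counts)}

p_none = pmf[0]                                       # A) no female child
p_three = pmf[3]                                      # B) three female children
p_more_than_two = pmf[3] + pmf[4]                     # C) X > 2
p_at_least_one = 1 - pmf[0]                           # D) complement rule
p_at_most_two = sum(pmf[x] for x in pmf if x <= 2)    # E) X <= 2

print(pmf)
```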

Definition: The probability distribution of a continuous random variable X is described by a function
f(x), called the probability density function (pdf) of X. It satisfies the following
conditions.

• f(x) ≥ 0 for all x, −∞ < x < ∞
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$

A continuous probability distribution is a probability distribution whose random variable is
continuous. It describes an "unbroken" continuum of possible occurrences. The probability of a single
value is zero, and the probability of an interval is the area bounded by the curve of the probability density
function and the interval on the x-axis. Let a and b be any two values with a < b. The probability that X assumes
a value that lies between a and b is equal to the area under the curve between a and b; that is,

$$ P(a \le X \le b) = \int_a^b f(x)\,dx $$

The integration from a to b in the case of a continuous variable is analogous to the
summation of probabilities in the discrete case.

Figure: P(a≤ X ≤ b) is the shaded region

Example 6.4: A continuous random variable X has a function given by

$$ f(x) = \frac{1}{4}x + \frac{1}{2}, \quad 0 \le x \le 1 $$

Find the probability that X lies in the interval between 0 and 1.

Solution:

$$ \int_0^1 \left(\frac{1}{4}x + \frac{1}{2}\right) dx = \left[\frac{1}{8}x^2 + \frac{1}{2}x\right]_0^1 = \frac{1}{8} + \frac{1}{2} = \frac{5}{8} $$

Example 6.5: Suppose we have a continuous random variable X whose probability density function
is given by

$$ f(x) = \begin{cases} cx^2, & 0 < x < 3 \\ 0, & \text{otherwise} \end{cases} $$

a. Determine the value of c.

b. Verify that f is a pdf.

c. Calculate P(1 < X < 2).

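Solution

a. By the properties of a pdf, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, so

$$ \int_0^3 cx^2\,dx = c\left[\frac{x^3}{3}\right]_0^3 = 9c = 1 \;\Rightarrow\; c = \frac{1}{9} $$

b. f(x) = x²/9 ≥ 0 for all x, and

$$ \int_0^3 \frac{1}{9}x^2\,dx = \left[\frac{x^3}{27}\right]_0^3 = 1 $$

c.

$$ P(1 < X < 2) = \int_1^2 \frac{1}{9}x^2\,dx = \left[\frac{x^3}{27}\right]_1^2 = \frac{8 - 1}{27} = \frac{7}{27} $$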
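The integrals in Example 6.5 can be checked numerically. The sketch below uses a simple midpoint rule written out by hand (not a library routine), with c = 1/9; the function names `f` and `integrate` and the step count are choices made here for illustration.

```python
def f(x, c=1 / 9):
    """The density of Example 6.5: c * x^2 on (0, 3), zero elsewhere."""
    return c * x ** 2 if 0 < x < 3 else 0.0

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

total_area = integrate(f, 0, 3)   # should be close to 1 for a valid pdf
p_1_to_2 = integrate(f, 1, 2)     # P(1 < X < 2); the exact value is 7/27

print(round(total_area, 6), round(p_1_to_2, 6), round(7 / 27, 6))
```

A numerical check like this is useful for catching slips in hand integration, since the area under a valid pdf must come out as 1.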

Cumulative distribution function of discrete random variable

Let X be a discrete random variable with probability mass function (pmf) P(xi). Then the cumulative
distribution function is denoted by F(x) and is defined by:

$$ F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i) $$

Example: A coin is tossed three times. Let X be the number of heads obtained. Find the CDF of X.
Solution: The sample space is {HHH, HHT, HTH, TTH, THT, THH, HTT, TTT}, so the variable X takes
the values 0, 1, 2, 3 with the following probability distribution:

x    P(x)   F(x)
0    1/8    1/8
1    3/8    4/8
2    3/8    7/8
3    1/8    1
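The F(x) column of the table is a running total of the P(x) column. In Python this can be sketched with `itertools.accumulate`, using exact fractions so that the last entry is exactly 1.

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# F(x) = P(X <= x): the cumulative sum of the pmf up to and including x.
cdf = list(accumulate(pmf))

for x, p, F in zip(values, pmf, cdf):
    print(x, p, F)
```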

Cumulative distribution function for continuous random variable

If X is a continuous random variable with probability density function (pdf) f(x), then the
cumulative distribution function of X is F(x), which is defined as:

$$ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt $$

F(x) gives the "accumulated" probability "up to x".

Properties of CDF

i. 0 ≤ F(x) ≤ 1

ii. $\lim_{x \to \infty} F(x) = \lim_{x \to \infty} \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{\infty} f(t)\,dt = 1$

iii. $\lim_{x \to -\infty} F(x) = \lim_{x \to -\infty} \int_{-\infty}^{x} f(t)\,dt = 0$

iv. F'(x) = f(x), i.e. F(x) is an antiderivative of f(x)

v. F(x) is a non-decreasing function

Remarks:

• The area bounded above by the graph of a probability density function and below by the
horizontal axis is 1.
• The probability that a continuous random variable X will assume a specific value is zero,
i.e. $P(X = c) = \int_c^c f(x)\,dx = 0$, where c is a constant.
• The probability that a continuous random variable X will assume a value in a closed
interval is the same as the probability that it will assume a value in the corresponding open or half-open
interval, i.e. P(a≤X≤b) = P(a<X<b) = P(a≤X<b) = P(a<X≤b), P(X≤c) = P(X<c), P(X≥c)
= P(X>c), where a, b, and c are constants.

6.2 Introduction to Expectation and Variance of Random variable

The objective of this section is to introduce you to the most common parameters of probability
distributions. There are some summary measures in terms of which we can summarize the behavior
of probability distributions. The most common of these are the average, called the expected value, and
the dispersion about the average, called the variance.

6.2.1 Expectation

The averaging process, when applied to a random variable, is called expectation. It is denoted by
E(X) or μ and is read as the expected value of X or the mean value of X.

Case 1: For discrete random variable

Suppose X is a discrete random variable which takes on values in a finite set x1, x2, …, xn with
probabilities P(xi) = P[X = xi], i = 1, 2, …, n. Then the expected value of X, E(X), is given by:

$$ E(X) = \mu = \sum_{i=1}^{n} x_i\,P(x_i) \qquad (6.1) $$

Case 2: For continuous random variable

If X is a continuous random variable, then

$$ E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx, \quad \text{provided } \int_{-\infty}^{\infty} |x|\,f_X(x)\,dx < \infty \qquad (6.2) $$

where $f_X(x)$ is the probability density function of the continuous random variable X.

Case 3: The mathematical expectation of a real function h(X) of a discrete random variable X is given
by:

$$ E(h(X)) = \sum_{i=1}^{n} h(x_i)\,P(x_i) \qquad (6.3) $$

Similarly, if X is a continuous random variable, then

$$ E(h(X)) = \int_{-\infty}^{\infty} h(x)\,f_X(x)\,dx \qquad (6.4) $$

Properties of Expectation

If X and Y are random variables and k is a constant, then:

1. E(k) = k, where k is any constant

2. E (kX) = k E(X), where k is any constant

3. E (X + k) =E(X) + k

4. E(X + Y) = E(X) +E(Y)

5. E(XY) = E(X) E(Y), if X, Y are independent random variables

6. E(X) ≥ 0, if X ≥ 0.

7. |E(X)| ≤ E(|X|)

8. [E(XY)]² ≤ E(X²) E(Y²).

6.2.2 Mean and variance of a random variables

Mean of X = E(X) = μ

Variance of X:

$$ \sigma_X^2 = E\big[(X - E(X))^2\big] = E(X^2) - [E(X)]^2 \qquad (6.5) $$

Case 1: If X is a discrete random variable with expected value μ, then the variance of X, denoted
by Var(X), is defined by:

$$ \sigma_X^2 = \mathrm{Var}(X) = E\big[(X - \mu)^2\big] = E(X^2) - \mu^2 = \sum_{i=1}^{n} x_i^2\,P(x_i) - \mu^2 $$

Alternatively, $\mathrm{Var}(X) = \sum_{i=1}^{n} (x_i - \mu)^2\,P(x_i)$

Case 2: If X is a continuous random variable with expected value μ, then

$$ \sigma_X^2 = \mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2\,f_X(x)\,dx $$

Properties of Variances

• For any random variable X and constant a, it can be shown that
Var(aX) = a²Var(X)
Var(X + a) = Var(X) + 0 = Var(X)
• If X and Y are independent random variables, then
Var(X + Y) = Var(X) + Var(Y)
More generally, if X1, X2, …, Xk are independent random variables, then
Var(X1 + X2 + … + Xk) = Var(X1) + Var(X2) + … + Var(Xk),
i.e. $\mathrm{Var}\left(\sum_{i=1}^{k} X_i\right) = \sum_{i=1}^{k} \mathrm{Var}(X_i)$
• If X and Y are not independent, then
Var(X + Y) = Var(X) + 2Cov(X,Y) + Var(Y)
Var(X − Y) = Var(X) − 2Cov(X,Y) + Var(Y)

Note: The standard deviation, the positive square root of the variance, is often easier to interpret because it has the same units as X. For
example, if X measures length in meters, the units of the variance are square meters, while the units
of the standard deviation are meters. It is always nonnegative.

Example 6.7: Two fair coins are tossed. Determine Var(X) where X is the number of heads that
appear.

a) Use the definition of the variance.

b) Use the fact that the variance of the sum of independent variables is equal to the sum of the
variance.

Solution:

a) Let X be the number of heads, with possible values 0, 1 and 2. The sample space is {HH,
TH, HT, TT}.

P(X = 0) = ¼, P(X = 1) = ½, P(X = 2) = ¼

E(X) = 0·P(X = 0) + 1·P(X = 1) + 2·P(X = 2)

= 0(1/4) + 1(1/2) + 2(1/4)

= 1.

E(X²) = 0²·P(X = 0) + 1²·P(X = 1) + 2²·P(X = 2)

= 0(1/4) + 1(1/2) + 4(1/4) = 3/2.

It follows that Var(X) = E(X²) − μ² = 3/2 − 1² = 1/2.

b) Now let X be the number of heads on the first coin and Y the number of heads on the second coin, each with possible values 0 and 1.

P(X = 0) = 1/2, P(X = 1) = 1/2 and P(Y = 0) = 1/2, P(Y = 1) = 1/2

$E(X) = 0 \cdot P(X=0) + 1 \cdot P(X=1) = 0(1/2) + 1(1/2) = 1/2$
$E(Y) = 0 \cdot P(Y=0) + 1 \cdot P(Y=1) = 0(1/2) + 1(1/2) = 1/2$

$E(X^2) = 0^2 \cdot P(X=0) + 1^2 \cdot P(X=1) = 0(1/2) + 1(1/2) = 1/2$
$E(Y^2) = 0^2 \cdot P(Y=0) + 1^2 \cdot P(Y=1) = 0(1/2) + 1(1/2) = 1/2$

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 1/2 - (1/2)^2 = 1/4$
$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = 1/2 - (1/2)^2 = 1/4$

X and Y are independent (i.e. the outcome of one coin does not influence the outcome of the
second)

Var (X+Y) = Var (X) +Var (Y) = 1/4 +1/4 = ½ .
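
The agreement between parts a) and b) can also be checked by direct enumeration. The following Python sketch is only an illustration: the helper name `variance` and the dictionary representation of a probability distribution are choices made here, not part of the module. It builds the distribution of the total number of heads from two independent fair coins and compares the variance from the definition with the sum of the two single-coin variances.

```python
from fractions import Fraction
from itertools import product

def variance(pmf):
    """Variance of a discrete random variable given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    mean_sq = sum(x * x * p for x, p in pmf.items())
    return mean_sq - mean ** 2

# One fair coin: 1 = head, 0 = tail.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Distribution of the total number of heads in two independent tosses.
total = {}
for (x, px), (y, py) in product(coin.items(), repeat=2):
    total[x + y] = total.get(x + y, 0) + px * py

var_definition = variance(total)            # part a): from the definition
var_sum = variance(coin) + variance(coin)   # part b): sum of the variances
print(var_definition, var_sum)
```

Exact fractions are used instead of floating-point numbers so that the two results can be compared without rounding error.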

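Example 6.8: Compute the variance of X if its pdf is $f(x) = \frac{x^2}{9}$ for 0 < x < 3.

Solution: $V(X) = E(X^2) - [E(X)]^2$

$E(X^2) = \int_0^3 x^2 \left(\frac{x^2}{9}\right) dx = \int_0^3 \frac{x^4}{9}\,dx = \frac{1}{9}\left(\frac{x^5}{5}\right)\Big|_0^3 = \frac{27}{5}$

$E(X) = \int_0^3 x \left(\frac{x^2}{9}\right) dx = \frac{1}{9}\left(\frac{x^4}{4}\right)\Big|_0^3 = \frac{9}{4}$

$V(X) = \frac{27}{5} - \left(\frac{9}{4}\right)^2 = \frac{27}{80} \approx 0.34$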
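
As a cross-check of Example 6.8, the moments of this pdf can be computed exactly. The sketch below is an illustration only; the function name `moment` is introduced here. It relies on the fact that the antiderivative of $x^{k+2}/9$ is $x^{k+3}/(9(k+3))$, so each moment over (0, 3) has a closed form.

```python
from fractions import Fraction

def moment(k):
    """k-th moment E(X^k) for the pdf f(x) = x^2/9 on (0, 3).

    Integrates x^k * x^2/9 from 0 to 3 using the antiderivative
    x^(k+3) / (9 (k+3)).
    """
    return Fraction(3 ** (k + 3), 9 * (k + 3))

total_probability = moment(0)      # area under the pdf, should be 1
mean = moment(1)                   # E(X)
variance = moment(2) - mean ** 2   # E(X^2) - [E(X)]^2
print(total_probability, mean, variance, float(variance))
```

The zeroth moment confirms that the function is a valid pdf before the mean and variance are used.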

6.3 Common Discrete Probability Distributions

6.3.1 Binomial Distribution

Many real problems (experiments) have two possible outcomes; for instance, a person may be HIV-positive or HIV-negative, a seed may germinate or not, the sex of a newborn baby may be a girl or a boy, etc. Technically, the two outcomes are called Success and Failure.


Experiments or trials whose outcomes can be classified as either a “success” or a “failure” are called Bernoulli trials.

The binomial distribution originates from the Bernoulli trial, an experiment with only two possible outcomes, “success” or “failure”. A success may be, for example, getting heads with a balanced coin or passing an examination. Whenever we face such an experiment, we use the binomial distribution under the assumptions stated below. Any experiment can also be turned into a Bernoulli trial by defining one or more of the possible results in which we are interested as “success” and all other possible results as “failure”. For instance, when rolling a fair die, “success” may be defined as getting an even number on top and “failure” as getting an odd number.

Generally, the sample space in a Bernoulli trial is S = {S, F}, S = Success, F = failure.

Notation: Let the probabilities of success and failure be p and q, respectively.

P(success) = P(S) = p and P(failure) = P(F) = q, where q = 1 − p.

Definition: Let X be the number of successes in n repeated Bernoulli trials with probability of success p on each trial. Then the probability distribution of the discrete random variable X is called the binomial distribution. Here p is the probability of success and q = 1 − p is the probability of failure on any given trial. A binomial random variable with parameters n and p represents the number of successes r in n independent trials, when each trial has probability of success p.

If X is a binomial random variable, then for r = 0, 1, 2, …, n

$P(X = r) = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$, where q = 1 − p …………………………… 6.6

A binomial experiment is a probability experiment that satisfies the following assumptions.

1. The experiment consists of n identical trials.

2. Each trial has only one of the two possible mutually exclusive outcomes, success or a failure.


3. The probability of each outcome does not change from trial to trial.

4. The trials are independent.

If X is a binomial random variable with two parameters n and p then

i) E (X) = np

ii) Var (X) = npq
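
The two results above can be checked numerically from the probability function in equation 6.6. The Python sketch below is an illustration under stated assumptions: `binomial_pmf` is a helper name introduced here, and n = 5, p = 0.3 are arbitrary values that do not come from the text. It tabulates the probabilities, then computes the mean and variance directly from their definitions.

```python
from math import comb, isclose

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Binomial(n, p), following equation 6.6."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 5, 0.3  # illustrative values only
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance from the definitions for a discrete random variable.
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))

print(isclose(sum(pmf), 1.0))              # probabilities sum to 1
print(isclose(mean, n * p))                # E(X) = np
print(isclose(variance, n * p * (1 - p)))  # Var(X) = npq
```

`math.comb(n, r)` evaluates n!/(r!(n−r)!) without computing the large factorials separately.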

Example 6.9: A fair coin is flipped 3 times. What is the probability of getting exactly two heads?

Solution:

Let X be the number of heads, with possible values 0, 1, 2, 3.

P(getting a head) = p = 1/2, q = 1 − p = 1/2, n = 3

$P(X = 2) = \frac{3!}{2!\,(3-2)!} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^{3-2} = \frac{3}{8}$

Example 6.10 Suppose it is known that the probability of recovery from a certain disease is 0.4. If a random sample of 10 people who are stricken with the disease is selected, what is the probability that:

a. exactly 5 of them will recover?

b. at most 9 of them will recover?

Solution: Let X be the number of persons who recover from the disease. Assuming the diseased population is large, the selection process does not affect the probability of success (0.4) on each trial. Hence, X has a binomial distribution with number of trials n = 10 and probability of success p = 0.4.

a. $P(X = 5) = \frac{10!}{5!\,(10-5)!}\,(0.4)^5 (0.6)^{10-5} = 0.200658$

b. $P(X \le 9) = 1 - P(X = 10) = 1 - \frac{10!}{10!\,(10-10)!}\,(0.4)^{10} (0.6)^{10-10} = 1 - 0.000105 = 0.9999$
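
The two answers of Example 6.10 can be reproduced with a few lines of Python. This is a minimal sketch; the function and variable names are introduced here for illustration. Part b is computed twice, once by summing the probabilities for 0 to 9 recoveries and once with the complement rule used in the solution, to show that both routes agree.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Binomial(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.4  # 10 patients, recovery probability 0.4

p_exactly_5 = binomial_pmf(5, n, p)

# "At most 9" by direct summation and by the complement rule.
p_at_most_9_sum = sum(binomial_pmf(r, n, p) for r in range(10))
p_at_most_9_complement = 1 - binomial_pmf(10, n, p)

print(round(p_exactly_5, 6))
print(round(p_at_most_9_sum, 4), round(p_at_most_9_complement, 4))
```

The complement rule needs one term instead of ten, which is why the worked solution uses it.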


6.3.2 Poisson Distribution

Another important discrete probability distribution is the Poisson distribution, which is used to model rare events. The Poisson distribution counts the number of successes (occurrences) in a fixed interval of time or within a specified region.

Examples of random variables that usually obey the Poisson distribution are:

• The number of car accidents in a day.
• The number of telephone calls arriving in a given interval of time.
• The number of misprints on a typed page (or a group of pages) of a book.
• The number of natural disasters, such as earthquakes, in a given period.
• The number of suicides reported in a particular city.
• The number of customers entering a post office on a given day.

To apply the Poisson distribution, two conditions must be met:

• The number of successes that occur in any interval is independent of the number that occur in other non-overlapping intervals.
• The probability of a success in an interval is proportional to the size of the interval.

In short, the two important traits of the Poisson distribution are independence and proportionality.

Let X be the number of occurrences in a Poisson process and let λ be the average number of occurrences of the event in a unit length of interval. The probability function of the Poisson distribution is

$p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, for x = 0, 1, 2, … …………………………………..………6.7

Remarks

• The Poisson distribution has only one parameter, λ.
• If X has a Poisson distribution with parameter λ, then E(X) = λ and Var(X) = λ, i.e., E(X) = Var(X) = λ.
• $\sum_{x=0}^{\infty} p(x) = 1$


Example 6.11 In a small city, 10 accidents took place in a period of 50 days. Find the probability that there will be a) exactly two accidents in a day and b) three or more accidents in a day.

Solution:

On average there are λ = 10/50 = 0.2 accidents per day.

Let X be the number of accidents per day. Then

X ~ Poisson(λ = 0.2), X = 0, 1, 2, …

a. $P(X = 2) = \frac{(0.2)^2 e^{-0.2}}{2!} = 0.0164$

b. P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) + …

= 1 − [P(X = 0) + P(X = 1) + P(X = 2)], since $\sum_{x=0}^{\infty} p(x) = 1$

= 1 − [0.8187 + 0.1637 + 0.0164] = 0.0012
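
The same computation can be sketched in Python using equation 6.7. The helper name `poisson_pmf` is introduced here for illustration. Because the code does not round the three intermediate probabilities, its answer to part b is about 0.0011, slightly below the 0.0012 obtained above from values rounded to four decimal places.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), following equation 6.7."""
    return lam ** x * exp(-lam) / factorial(x)

lam = 10 / 50  # average number of accidents per day

p_two = poisson_pmf(2, lam)
# Complement rule: P(X >= 3) = 1 - [P(0) + P(1) + P(2)].
p_three_or_more = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(p_two, 4), round(p_three_or_more, 4))
```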

6.4. Common Continuous Probability Distributions

6.4.1 Normal Distributions

The normal distribution is the most important distribution for describing a continuous random variable, and it is also used as an approximation to other distributions. Many variables in the practical world follow this distribution, and hence in many ways it is the cornerstone of modern statistical theory. Empirical distributions of various types of observations in the natural and social sciences are often very close to the normal distribution. In statistical analysis, the distribution of observations is frequently assumed to be approximately normal, and the normal distribution plays an important role in statistical estimation and testing of hypotheses.

A random variable X has a normal distribution with parameters μ and σ², and is known as a normal random variable, if and only if its pdf is given by:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}$ ……………6.8

for $-\infty < x < \infty$, $-\infty < \mu < \infty$ and $\sigma > 0$.

The graph of the normal distribution is known as the normal curve, which is bell-shaped:

[Figure: Normal probability curve]

SOME PROPERTIES OF THE NORMAL CURVE

The following are the important properties of the normal curve:

1. The normal curve is “bell-shaped” and symmetrical about the mean. The property of symmetry can be shown using the pdf: $f(\mu + c) = f(\mu - c)$, which is equivalent to saying that $P(X < \mu) = P(X > \mu) = 0.5$. Since this is the property of the median, it follows that, for the normal distribution, Mean = Median = Mode.
2. The height of the normal curve is at its maximum when X = μ (the mean), which means, again, Mean = Median = Mode. This property can also be verified using the first and second derivative tests: $f'(x) = 0 \Rightarrow x = \mu$. This shows that x = μ may be a maximum or a minimum point of f, but using the second derivative test, $f''(\mu) = -\frac{1}{\sigma^3\sqrt{2\pi}} < 0$, we see that the point is a maximum. Therefore, by properties 1 and 2, we can conclude that the mean, median and mode coincide for the normal curve.
3. The normal curve is asymptotic to the X-axis.
4. The first and the third quartiles are equidistant from the median, i.e., $Q_3 - Q_2 = Q_2 - Q_1$, or $Q_2 = \frac{Q_1 + Q_3}{2}$.


5. The Probability that a random variable will have a value between any two points is equal to the
area under the curve between those points.

Standard Normal Distribution

The symmetrical property of the normal distribution provides a means that is helpful in calculating
probabilities, which is also facilitated by transforming any normal distribution with any mean and
variance to the standard normal distribution.

By standardization we mean that the random variable X is transformed into another random variable whose mean is 0 and variance is 1. The normal distribution with mean zero and standard deviation one is known as the standard normal distribution. If X has a normal distribution with mean μ and standard deviation σ, then the standard normal random variable Z is given by

$Z = \frac{X - \mu}{\sigma}$, for a population

$Z = \frac{X - \bar{X}}{S}$, for a sample

Using the properties of expectations, it is now trivial to show that E(Z) = 0 and V(Z) = 1. The pdf of Z is, thus, given by $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$, $-\infty < z < \infty$.

The entries in Table A of the Appendix are the values of $P(0 < Z < z) = \int_0^z f(t)\,dt$.

That is, the table gives the probabilities that a random variable Z having the standard normal distribution will take on a value in the interval from 0 to z, for z = 0.00, 0.01, 0.02, …, 3.98, and 3.99; because the normal curve is symmetrical about its mean, it is unnecessary to extend the table to negative values of Z.

Note that P(Z < 0) = P(Z > 0) = 0.5.


[Figure: Tabulated areas under the standard normal curve from 0 to z]

That is, the region marked in the figure is $P(0 < Z < z) = \int_0^z \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$.
0 2

Basic Properties of the standard normal Curve:

1. Total area under the standard normal curve is equal to 1.

2. The standard normal curve is asymptotic to x-axis.

3. The standard normal curve is symmetric about 0.

4. Most of the area under the standard normal curve lies between z= -3 and z=3.

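Given a normally distributed random variable X with mean μ and standard deviation σ,

$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right)$

$P(X < a) = P\left(\frac{X-\mu}{\sigma} < \frac{a-\mu}{\sigma}\right)$

But $\frac{X-\mu}{\sigma} = Z$ is a standard normal random variable, so $P(X < a) = P\left(Z < \frac{a-\mu}{\sigma}\right)$.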

Note: i) P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)

ii) P(−∞ < Z < ∞) = 1

Example 6.12: Find the probabilities that a random variable having the standard normal
distribution will take on a value

a. Less than 1.72; b. Less than -0.88;


c. Between 1.30 and 1.75; d. Between -0.25 and 0.45.

Solution: By using the normal table,

a) $P(Z < 1.72) = P(Z < 0) + P(0 < Z < 1.72) = 0.5 + 0.4573 = 0.9573$.

b) $P(Z < -0.88) = P(Z > 0.88) = 0.5 - P(0 < Z < 0.88) = 0.5 - 0.3106 = 0.1894$.

c) $P(1.30 < Z < 1.75) = P(0 < Z < 1.75) - P(0 < Z < 1.30) = 0.4599 - 0.4032 = 0.0567$.

d) $P(-0.25 < Z < 0.45) = P(-0.25 < Z < 0) + P(0 < Z < 0.45) = P(0 < Z < 0.25) + P(0 < Z < 0.45) = 0.0987 + 0.1736 = 0.2723$.

Application of the Standard Normal Distribution

Let $X \sim N(\mu, \sigma^2)$. Suppose that we want to find the probability P(a < X < b).

Since a, b, μ and σ are known (given), we standardize a, b and X as:

$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P(z_1 < Z < z_2)$, say.

Now, we need only read the values corresponding to $z_1$ and $z_2$ from the Z-table to get the required probabilities, as we have done in the preceding example.

Also, we can find the following one-sided probabilities:

$P(X < b) = P\left(Z < \frac{b-\mu}{\sigma}\right) = P(Z < z_2)$, and $P(X > a) = P\left(Z > \frac{a-\mu}{\sigma}\right) = P(Z > z_1)$.

We have seen that a Z- value measures the distance between a particular value of X and the mean
in units of standard deviation.

Example 6.13: If $X \sim N(\mu, \sigma^2)$, find the probabilities

a) $P(\mu - \sigma < X < \mu + \sigma)$; b) $P(\mu - 2\sigma < X < \mu + 2\sigma)$; c) $P(\mu - 3\sigma < X < \mu + 3\sigma)$.

Solution: As in the case of P(a < X < b), we simply replace a and b.

a) $P(\mu - \sigma < X < \mu + \sigma) = P\left(\frac{(\mu-\sigma)-\mu}{\sigma} < Z < \frac{(\mu+\sigma)-\mu}{\sigma}\right)$

$= P(-1 < Z < 1) = 2P(0 < Z < 1) = 2(0.3413)$ (see Table A)

$= 0.6826$, or 68.26%.

b) Similarly, $P(\mu - 2\sigma < X < \mu + 2\sigma) = P(-2 < Z < 2) = 2P(0 < Z < 2) = 2(0.4772) = 0.9544$, or 95.44%.

c) $P(\mu - 3\sigma < X < \mu + 3\sigma) = P(-3 < Z < 3) = 2P(0 < Z < 3) = 2(0.4987) = 0.9974$, or 99.74%.

From this we can tell that, for a normal distribution,

a) about 68.3% of the area lies between μ − σ and μ + σ (1 standard deviation on either side of the mean);

b) about 95.4% lies between μ − 2σ and μ + 2σ (2 standard deviations on either side);

c) about 99.7% lies between μ − 3σ and μ + 3σ (3 standard deviations on either side).

Notation: $Z_\alpha$ denotes the value of Z for which the area to its right is equal to α.

This notation is useful in statistical inference. Note that finding $Z_\alpha$ works like reading anti-logarithms: we look up a probability in the body of the table and read off the corresponding value of Z.

Example 6.14: Find a) $Z_{0.01}$; b) $Z_{0.05}$

Solution: a) $Z_{0.01}$ corresponds to a table entry of 0.5 − 0.01 = 0.4900.

In Table A, look for the value closest to 0.4900, which is 0.4901, and the Z value for this is Z = 2.33. Thus, $Z_{0.01} = 2.33$.

b) Again, $Z_{0.05}$ corresponds to an entry of 0.5 − 0.05 = 0.4500, which lies exactly between 0.4495 and 0.4505, corresponding to Z = 1.64 and Z = 1.65. Hence, using interpolation, $Z_{0.05} = 1.645$.
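
The reverse table lookup can also be done with the inverse of the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the function name `z_alpha` is introduced here for illustration. Since the area to the right is α, the area to the left is 1 − α.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """Value of Z with area alpha to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.01, 0.05):
    print(alpha, round(z_alpha(alpha), 3))
```

The computed value for α = 0.01 is close to 2.326, which the table lookup rounds to 2.33 because Table A is tabulated in steps of 0.01.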


Example 6.15: Suppose that $X \sim N(165, 9)$, where X is the breaking strength of cotton fabric. A sample is defective if X < 162. Find the probability that a randomly chosen fabric will be defective.

Solution: Given that μ = 165 and σ² = 9 (so σ = 3),

$P(X < 162) = P\left(\frac{X-\mu}{\sigma} < \frac{162-\mu}{\sigma}\right) = P\left(Z < \frac{162-165}{3}\right)$

$= P(Z < -1) = 0.5 - P(-1 < Z < 0)$ (since P(Z < 0) = 0.5)

$= 0.5 - P(0 < Z < 1)$ (by symmetry)

$= 0.5 - 0.3413 = 0.1587$ (table value for Z = 1)

6.4.2 Chi-Square Distribution

The chi-square distribution may be derived from normal distributions. If $X_i$ (i = 1, 2, …, n) are n independent normal variates with means $\mu_i$ and variances $\sigma_i^2$ (i = 1, 2, …, n), then

$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$

is a chi-square variate with n degrees of freedom. The probability density function of the $\chi^2$-distribution is given by

$f(\chi^2) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, (\chi^2)^{\frac{n}{2}-1}\, e^{-\chi^2/2}$, $0 < \chi^2 < \infty$ ……………………………………..6.9

where n is the number of degrees of freedom.

Since the chi-square distribution arises in many important applications, its values have been extensively tabulated. Table C at the end of this module contains values of $\chi^2_{\alpha,n}$ for α = 0.05, 0.025, 0.01, 0.005 and n = 1, 2, 3, …, 30, where $\chi^2_{\alpha,n}$ is such that the area to its right under the chi-square curve with n degrees of freedom is equal to α. That is, if X is a random variable having a chi-square distribution with n degrees of freedom, then $P(X \ge \chi^2_{\alpha,n}) = \alpha$. Here α is known as the level of significance. When n is greater than 30, the table cannot be used, and probabilities related to chi-square distributions are usually approximated with normal distributions.

[Figure: chi-square curve showing $\chi^2_{\alpha,n}$ on the horizontal axis]

Properties of Chi-square Distribution

1. The exact shape of the distribution depends upon the number of degrees of freedom n. In general,
when n is small, the shape of the curve is skewed to the right and as n gets larger, the distribution
becomes more and more symmetrical.

2. The mean and variance of the $\chi^2$ distribution are n and 2n, respectively.

3. As n → ∞, the $\chi^2$ distribution approaches a normal distribution.

4. The sum of independent $\chi^2$ variates is also a $\chi^2$ variate.

Example 16

Suppose we are interested to read the following values of the chi-square distribution.

i) The chi-square value with 2 degrees of freedom where the area to the right of this value is 0.005

ii) The chi-square value with 100 degrees of freedom where the area to the right of this value is
0.975.

Solution:

i) Look up the degrees of freedom, 2, in the first column (df column) and then move horizontally until you reach the column headed by the value of α, 0.005, in the first row. The intersection of this row and column gives the desired chi-square value, 10.597. This value satisfies the following:

$P(\chi^2 \ge 10.597) = 0.005$

ii) Similarly, the desired value in this case is 74.222. Note that this value satisfies the following:

$P(\chi^2 \ge 74.222) = 0.975$
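
Both table readings can be sanity-checked with the Python standard library alone. The sketch below rests on two assumptions that are not part of the module. First, for n = 2 the density in equation 6.9 reduces to $\frac{1}{2}e^{-x/2}$, so the right-tail area is exactly $e^{-x/2}$. Second, for large n the code uses the Wilson-Hilferty normal approximation, which is a technique chosen here for illustration and not the method behind Table C. The function names are also introduced here.

```python
from math import exp, sqrt
from statistics import NormalDist

def chi2_right_tail_df2(x):
    """P(X >= x) for a chi-square variable with 2 degrees of freedom.

    With n = 2 the density is (1/2) e^(-x/2), so the tail area is e^(-x/2).
    """
    return exp(-x / 2)

def chi2_value_wilson_hilferty(alpha, n):
    """Approximate chi-square value with area alpha to its right (n df).

    Wilson-Hilferty approximation: an illustrative choice for this sketch.
    """
    z = NormalDist().inv_cdf(1 - alpha)  # normal value with area alpha to its right
    c = 2 / (9 * n)
    return n * (1 - c + z * sqrt(c)) ** 3

print(round(chi2_right_tail_df2(10.597), 4))
print(round(chi2_value_wilson_hilferty(0.975, 100), 2))
```

The first value should be close to 0.005 and the second close to the tabulated 74.222; the second agrees only approximately because it comes from an approximation.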

6.4.3 The t-distribution

Let $X_1, X_2, \dots, X_n$ be a random sample drawn from a normal distribution having mean μ and standard deviation σ (unknown but estimated by S, the sample standard deviation).

The statistic $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t-distribution with (n − 1) degrees of freedom, where $\bar{X}$ is the sample mean and S is the sample standard deviation.

In view of its importance, the t distribution has been tabulated extensively. Table B at the end of this module contains values of $t_{\alpha,\,n-1}$ for α = 0.10, 0.05, 0.025, 0.01, 0.005 and n − 1 = 1, 2, 3, …, 29 degrees of freedom, where $t_{\alpha,\,n-1}$ is such that the area to its right under the curve of the t distribution with (n − 1) degrees of freedom is equal to α.

Notation: $t_{\alpha,\,n-1}$ stands for the value of t with (n − 1) degrees of freedom to the right of which lies an area equal to α.

[Figure: Student’s t distribution, with an area α in each tail beyond −t and t]


Note: 1. The table does not contain values of $t_{\alpha,\,n-1}$ for α > 0.50; since the curve is symmetrical about t = 0 (like the normal distribution), we have

$t_{1-\alpha,\,n-1} = -t_{\alpha,\,n-1}$.

2. When (n − 1) = 30 or more, probabilities related to the t distribution are usually approximated with the use of normal distributions.

Example 6.15: For a t-distribution with n = 20, find the values leaving an area of

a) 0.05 to the right; b) 0.975 to the right;

c) 0.10 to the left; d) half of α = 0.01 on either side.

Solution: Referring to Table B with (n − 1) = 19 df, we have

a) $t_{0.05,19} = 1.729$; b) $t_{0.975,19} = -t_{0.025,19} = -2.093$;

c) $-t_{0.10,19} = -1.328$; d) $\pm t_{0.005,19} = \pm 2.861$.

Applications of t Distribution

The t distribution has wide applications in Statistics, only some are listed below:

a) Test of population Mean ( One-sample t-test)

When we are dealing with a random sample of size n < 30 from a normal population whose variance σ² is unknown, the t distribution with n − 1 degrees of freedom is used to test the hypothesis that the population mean μ equals a given value (say, $\mu_0$), against the alternatives $\mu \ne \mu_0$, $\mu > \mu_0$, or $\mu < \mu_0$.

Then, we calculate $t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$, which is to be compared with the table value $t_\alpha$ or $t_{\alpha/2}$ with n − 1 degrees of freedom.

Note: The assumptions underlying student’s t-distribution for such tests are:

a) The parent population from which the sample is drawn is normal.


b) The sample observations are independent; that is, the sample is random.

c) The population standard deviation (σ) is unknown.

d) n is small; that is n<30.

Example 6.16: In 16 one-hour test runs, the gasoline consumption of an engine averaged 16.4 gallons with a standard deviation of 2.1 gallons. In order to test the claim that the average gasoline consumption of this engine is 12.0 gallons per hour, calculate the t value and $t_{\alpha,\,n-1}$ for α = 0.05.

Solution: Substituting n = 16, μ = 12.0, $\bar{X}$ = 16.4, and S = 2.1 in the formula, we get

$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{16.4 - 12.0}{2.1/\sqrt{16}} = 8.38$; and the table value for n − 1 = 15 is $t_{0.05,15} = 1.753$.
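
The arithmetic of the gasoline example can be written as a small Python function. This is a sketch only: the function name `one_sample_t` and its parameter names are introduced here, and the table value 1.753 is typed in from the example, not computed by the code.

```python
from math import sqrt

def one_sample_t(sample_mean, mu_0, s, n):
    """t statistic for testing that the population mean equals mu_0."""
    standard_error = s / sqrt(n)
    return (sample_mean - mu_0) / standard_error

t_value = one_sample_t(sample_mean=16.4, mu_0=12.0, s=2.1, n=16)
t_table = 1.753  # t_{0.05, 15}, the table value quoted in the example

print(round(t_value, 2), t_value > t_table)
```

Here the standard error is 2.1/4 = 0.525, so the calculated t value is far larger than the table value it is compared with.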

b) t-Test of the equality of two means (two-sample t-test)

In many problems of applied research, we are interested in hypotheses concerning the difference between two population means, $\mu_1 - \mu_2$, and in particular the equality of the two means ($\mu_1 - \mu_2 = 0$). In such tests, the following are assumed:

i. The parent populations from which the samples have been drawn are normally distributed;
ii. The two population variances are equal, though unknown: $\sigma_1^2 = \sigma_2^2 = \sigma^2$;
iii. The two samples are random and independent of each other;
iv. The sample sizes are small: $n_1$ and/or $n_2$ are < 30.

c) t-Test of correlation and regression coefficients

In normal regression and correlation analyses, it is used to test:

a. if the population regression coefficient equals a certain constant ($\beta = \beta_0$); and
b. if the population correlation coefficient is significantly different from zero.


Exercise 6

1. From a lot containing 20 items, of which 5 are defective, 4 are chosen at random. Let X be the
number of defectives found.

a) Write down the pmf of X. b) Find the probability distribution of X.

c) Find E(X) and V(X).

2. If X has a pdf of $f(x) = 3x^2$ for 0 < x < 1, and 0 elsewhere, find

a) P(X < 0.5); b) E(X) and V(X);

c) a if $P(X \le a) = 0.05$; d) b if $P(X < b) = P(X > b)$.

3. The amount of bread X (in hundreds of kg) that a certain bakery is able to sell in a day is found to be a continuous random variable with a pdf given as below:

$f(x) = \begin{cases} kx, & 0 \le x < 5 \\ k(10 - x), & 5 \le x < 10 \\ 0, & \text{otherwise} \end{cases}$

a) Find k; b) Find the probability that the amount of bread that will be sold tomorrow is

i) More than 500kg, ii) between 250 and 750 kg;

c) Find the expected amount of bread to be sold in any day.

4. Find the value of Z if the area between -Z and Z is a) 0.4038; b) 0.8812; c) 0.3410.
5. The reduction of a person's oxygen consumption during periods of deep meditation may be looked upon as a random variable having the normal distribution with μ = 38.6 cc per minute and σ = 6.5 cc per minute. Find the probabilities that during such a period a person's oxygen consumption will be reduced by

a) at least 33.4 cc per minute; b) at most 34.7 cc per minute

6. For a chi-square distribution with n = 14 degrees of freedom, find the table value such that the area

a) to its right is 0.01;

b) to its left is 0.975.

7. A manufacturer of baby electronic watches claims that the average life of a certain watch is 10
years with a standard deviation of 1.75 years. If a sample of 4 watches is found having lives of
6, 8, 12, and 10 years, calculate the Chi-square value to test the claim.
8. Find the tabulated t values for

a) $t_{0.05}$ for n = 13;

b) $t_{0.01}$ for n = 9;

c) $t_{0.995}$ for n = 16;

d) $t_{\alpha/2}$ such that $P(-t_{\alpha/2} < t < t_{\alpha/2}) = 0.95$ for n = 11.


9. The heights of 10 males of a certain locality are found to be 70, 67, 62, 68, 61, 68, 70, 64, 64, and 66 inches. If it is desired to test whether the average height is 64 inches, at α = 0.05,

a) calculate the t value;

b) find the table value $t_{\alpha,\,n-1}$.


CHAPTER SEVEN

SAMPLING AND SAMPLING DISTRIBUTION OF THE STATISTIC


After completing this unit, the student should be able to

• Describe the basic concepts of sampling.
• List and explain random versus non-random sampling techniques.
• Identify the causes of non-sampling error.
• Develop the sampling distribution of the mean.
• Construct the probability distribution of the mean.
• Calculate the mean, standard deviation and variance of sample means and sample proportions.
• Differentiate the sampling distribution of the sample mean (sample proportion) when the population is normal or non-normal.
• Explain the importance of the central limit theorem for statistical inference.

Introduction

In our daily life we are often forced to make decisions based on small-scale studies. For instance, a laboratory technician takes small droplets of blood to examine the presence of a disease; we examine a few fruits before we purchase them; zoologists use the concept of sampling to estimate the population of rodents, etc. This process of inspection is very common and is used on various occasions. But such inspection is difficult to carry out on a large scale, so on the basis of a small study we make inferences about the entire population.

7.1 Basic Concepts

Population:- the complete collection of individuals, objects or measurements about which inferences are to be made. The population represents the target of an investigation, and the objective of the investigation is to draw conclusions about the population. It should be defined by the investigator on the basis of the objective of the study.

Example 7.1:

• All customers of an electric supply company.
• All students of MAU.
• Population of farms having a certain type of natural fertility.
• Population of households in a certain village.

Sample:- A sample from a population is the set of measurements that are actually collected in the course of an investigation. It should be selected using some predefined sampling technique in such a way that it represents the population very well.

Sampling (elementary) unit:- the ultimate unit to be sampled or elements of the population to be
sampled.

Example 7.2

• If somebody studies the economic status of households, the household is the sampling unit.
• If one studies the performance of freshman students in some college, the student is the sampling unit.

Sampling frame:- is the list of all elements (sampling units) in a population.

Example 7.3

• List of households of a certain city.
• List of students in the registrar's office of Mekdela Amba University.

Parameter and Statistic:- These are basic terms in sampling theory. A parameter is a value calculated from the population. For instance, the population mean, population variance and population proportion are parameters. A statistic is a value calculated from a sample. The sample mean, sample variance, sample proportion, etc. are statistics.

Sampling error: - A type of error that may arise due to inappropriate sampling techniques applied.
A sampling error is the difference between a sample statistic and its corresponding parameter. We
can make probabilistic statements about this sampling error only if we have a probability sample.

Non-sampling error:- In addition to sampling error, the sample estimate may be subject to other errors, called non-sampling errors. These include errors in observation, interview or measurement errors, errors due to non-response, and errors in data processing (editing, coding, etc.). The non-sampling error is likely to increase with an increase in sample size. For instance, a census survey may have a large amount of non-sampling error.

There are four common types of non-sampling errors:

i. Non-response errors:- The people from whom we get the information are called respondents, and the people in the sample from whom we do not get information are called non-respondents. The error that arises when we fail to get the information is called non-response error, and the phenomenon is called non-response. This error arises because we are not able to cover the whole sample. For example, suppose we want to interview 100 farmers and 5 of them do not allow us to interview them. Then we interview only 95, so the sample is not complete. Such errors are called non-response errors.

ii. Measurement errors:- The errors that we make in measuring the characteristics of interest are called measurement errors. For example, suppose we want to record the age of the respondents. Some respondents may report an age lower than their actual age. These types of errors are called measurement errors.

iii. Tabulating errors:- The errors that arise while making a table, either because some numbers are missing due to non-availability of data or because some numbers are recorded wrongly, are called tabulating errors.

iv. Computational Errors:- After the table is formed, we start our calculations. The errors
committed in calculations are known as computational errors.

Activity 7.1:

1. Briefly explain the terms: population, sample, sampling frame.


2. List types of error in statistics.


7.2 Reasons for sampling

Sample survey saves money:- It is possible to collect information from sample households and obtain estimates that reasonably approximate the actual characteristics of a large population. It is obviously cheaper to gather information from 100 households than from 10,000 households.

Sample survey saves time:- A sample survey requires a smaller scale of operations at all stages, and it reduces data collection and processing time.

Sample survey provides higher level of accuracy:- This accuracy can be achieved through more
selective recruiting of interviewers and supervisors, more extensive training programs, a closer
supervision of the personnel involved and a more efficient monitoring of the field work.

Sample survey could be the only option for the study in some specialized area. For example,
there are some cases where information of technical nature requires highly trained personnel and
specialized equipment like in medical areas.

Experimentation could be destructive in nature, as in testing industrial products: for example, testing the average burning time of bulbs, testing the quality of wine or beer, or studying the efficacy of new drugs. In such cases sampling is the only feasible means of study.

7.3 Types of Sampling Techniques

The technique of selecting a sample is important in sampling theory and usually it depends upon
the nature of the investigation. The commonly used sampling techniques may be broadly classified
as: Non Probability and Probability Sampling.

7.3.1 Random Sampling or probability sampling

Probability sampling is a method of sampling in which every element in the population has a pre-assigned probability of being included in the sample.

In this sub-section, four different techniques of taking a random sample are discussed:

a/ Simple random sampling

b/ Stratified random sampling

c/ Cluster sampling

d/ Systematic sampling

a) Simple Random Sampling

In statistics, a simple random sample from a population is a sample chosen randomly, so that each possible sample has the same probability of being chosen. One consequence is that each member of the population has the same probability of being chosen as any other. In small populations such sampling is typically done "without replacement", i.e., one deliberately avoids choosing any member of the population more than once. Although simple random sampling can be conducted with replacement, this is less common and would normally be described more fully as simple random sampling with replacement.

Simple random sampling is a method of selecting n units out of a finite population of size N by giving equal probability to all units, or a sampling procedure in which all possible combinations of n units that may be formed from the finite population of N units have the same probability of selection. There are $\binom{N}{n}$ distinct possible samples in the case of sampling without replacement; the chance of selecting each one of them is $1/\binom{N}{n}$. There are $N^n$ possible samples in the case of sampling with replacement; the chance of selecting each one of them is $1/N^n$.

Conceptually, simple random sampling is the simplest of the probability sampling techniques. It requires a complete sampling frame, which may not be available or feasible to construct for large populations. Even if a complete frame is available, more efficient approaches may be possible if other useful information is available about the units in the population.

Simple random sampling is free of classification error, and it requires minimum advance
knowledge of the population. It best suits situations where the population is fairly homogeneous
and not much information is available about the population. If these conditions are not true, some
other type of sampling technique may be a better choice. The lottery method and a table of random
numbers (or computer-generated random numbers) are used to select a simple random sample:

i) Lottery method: This is a very common method of taking a random sample. Under this method,
we label each member of the population with an identifiable ticket or piece of paper. Tickets must
be of identical size, color and shape. They are placed in a container and well mixed before each
draw, and draws continue until a sample of the required size is selected. In this way, the
selection of items depends entirely on chance.

Example 7.4: If we want to take a sample of 25 persons out of a population of 150, the procedure
is to write the names of all the 150 persons on separate slips of papers, fold these slips, mix them
thoroughly and then make a blindfold selection of 25 slips without replacement.
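
The blind draw in Example 7.4 can be imitated on a computer. The following is a minimal sketch, assuming the 150 persons are labelled person_001 to person_150 (hypothetical labels) and using Python's standard random module in place of paper slips:

```python
import random

def lottery_sample(names, n, seed=None):
    """Draw n names without replacement, like a blind draw of well-mixed slips."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(names, n)    # sample() never returns the same slip twice

# Hypothetical labels standing in for the 150 names written on slips.
population = [f"person_{i:03d}" for i in range(1, 151)]
sample = lottery_sample(population, 25, seed=1)
print(len(sample), "slips drawn, all distinct:", len(set(sample)) == 25)
```

Leaving the seed unset gives a different sample on each run, which is closer to an actual lottery.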

ii) Table of random numbers: This is an alternative method of selecting a simple random sample.
The table is constructed from the digits 0, 1, 2,…, 9. Several such tables are available in
standard books of Statistics.

Suppose we want to select a sample of size n, then

- Make a list of the population to be sampled;
- Give a distinct code number to each unit of the population;
- Choose the direction of selection randomly;
- Take the n units whose code numbers coincide with the random numbers as members of the sample;
- Omit random numbers that do not exist on the list and, if an element must not appear more than
once in the sample, omit repeated numbers.

Table of Random Numbers

                              Column
Row     1      2      3      4      5      6      7      8
 1    57172  42088  70098  17333  26902  29959  43909  49607
 2    33883  87680  24923  15659  09839  45817  89405  70743
 3    77950  15344  35609  87119  15859  74577  42791  75889
 4    11607  26596  16796  24498  17009  67119  60557  49521
 5    56149  55678  38169  47228  49931  94303  67448  31286
 6    80719  65101  77729  83949  83358  75230  56624  27549
 7    93809  19505  82000  79068  45552  86776  48980  56684
 8    40950  86216  48161  17646  24164  35513  94057  51834
 9    12182  59744  83710  41125  14291  74773  66391  50031
10    13382  48076  73151  48724  35670  38453  63154  58116
11    38629  94576  48859  75654  17152  66516  78796  73099
12    60728  52063  12431  23898  23683  10853  04038  75246
13    01881  99056  46747  08846  01331  88163  74462  14551
14    23094  08831  24387  23917  07421  97869  88092  72201

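Example 7.5: Suppose that N = 40 and we want to select n = 10 units without replacement, starting
with the 3rd row and 2nd column of the above random number table and reading vertically.

Solution: Code the units 01 to 40. Starting with the 3rd row and 2nd column, read the first two
digits of each entry down the column and continue from the top of the next column. Discarding
numbers greater than 40 and numbers already selected, we get:

15, 26, 19, 08, 24, 35, 16, 38, 12 and 17.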
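
The selection in Example 7.5 can be traced in code. The sketch below assumes the reading rule that reproduces the stated answer: take the first two digits of each entry, move down the column, and continue from the top of the next column. The function name and parameters are illustrative only.

```python
# The random number table from the text, one string per row (columns 1 to 8).
TABLE = [
    "57172 42088 70098 17333 26902 29959 43909 49607",
    "33883 87680 24923 15659 09839 45817 89405 70743",
    "77950 15344 35609 87119 15859 74577 42791 75889",
    "11607 26596 16796 24498 17009 67119 60557 49521",
    "56149 55678 38169 47228 49931 94303 67448 31286",
    "80719 65101 77729 83949 83358 75230 56624 27549",
    "93809 19505 82000 79068 45552 86776 48980 56684",
    "40950 86216 48161 17646 24164 35513 94057 51834",
    "12182 59744 83710 41125 14291 74773 66391 50031",
    "13382 48076 73151 48724 35670 38453 63154 58116",
    "38629 94576 48859 75654 17152 66516 78796 73099",
    "60728 52063 12431 23898 23683 10853 04038 75246",
    "01881 99056 46747 08846 01331 88163 74462 14551",
    "23094 08831 24387 23917 07421 97869 88092 72201",
]

def read_table(table, n, max_code, start_row, start_col, width=2):
    """Read codes vertically, skipping out-of-range and repeated numbers."""
    rows = [line.split() for line in table]
    n_rows, n_cols = len(rows), len(rows[0])
    chosen = []
    row, col = start_row - 1, start_col - 1      # convert to 0-based indices
    while len(chosen) < n and col < n_cols:
        code = int(rows[row][col][:width])       # first `width` digits of the entry
        if 1 <= code <= max_code and code not in chosen:
            chosen.append(code)
        row += 1
        if row == n_rows:                        # bottom reached: top of next column
            row, col = 0, col + 1
    return chosen

selected = read_table(TABLE, n=10, max_code=40, start_row=3, start_col=2)
print(selected)
```

With the table above this yields the same ten code numbers as the worked solution.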

b/ Stratified random sampling

In stratified sampling, the population of N units is sub-divided into k sub-populations, called strata,
so that the units in each stratum are as homogeneous as possible and the means of the different
strata are as different as possible. These sub-populations should be non-overlapping and together
comprise the whole population, such that $N_1 + N_2 + \dots + N_k = N$, where $N_i$ represents the
population size in the ith stratum. Then a sample is drawn from each stratum independently, the
sample size within the ith stratum being $n_i$, such that $n_1 + n_2 + \dots + n_k = n$. The
procedure of taking samples in this way is called stratified sampling. If the sample is taken
randomly from each stratum, the procedure is known as stratified random sampling.

Remarks:

In stratified random sampling, the following two points are equally important to ensure accuracy.

a) proper stratification of the population into various strata, and

b) a suitable sample size from each stratum.
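
As an illustration of the two points in the remark, the sketch below draws a simple random sample from each stratum. It assumes proportional allocation (each stratum's share of the sample equals its share of the population), which is only one possible way to choose the stratum sample sizes and is not prescribed by the text. The strata and unit labels are hypothetical.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum (proportional allocation)."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Assumption: n_i = n * N_i / N, rounded. Rounding may make the
        # stratum sizes add up to slightly more or less than n in general.
        n_i = round(n * len(units) / total)
        sample[name] = rng.sample(units, n_i)
    return sample

# Hypothetical population of 100 units stratified by sex.
strata = {
    "male": [f"M{i}" for i in range(60)],
    "female": [f"F{i}" for i in range(40)],
}
result = stratified_sample(strata, 10, seed=0)
print({name: len(units) for name, units in result.items()})
```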

For example, a population can be stratified based on the following variables:

- Sex (male, female);
- Age (under 18, 18 to 28, 29 to 39);
- Occupation (professional, other);
- Geographical classifications;
- Administrative regions, etc.

c/ Cluster Sampling:

The population is divided into non-overlapping groups called clusters. A simple random sample
of clusters is chosen, and in single-stage cluster sampling all the sampling units in the selected
clusters are surveyed. Clusters are formed in a way that elements within a cluster are
heterogeneous, i.e. observations in each cluster should be more or less dissimilar. Cluster
sampling is useful when it is difficult or costly to generate a simple random sample. For example,
to estimate the average annual household income in a large city we might use cluster sampling,
because simple random sampling would need a complete list of households in the city as a sampling
frame. Stratified random sampling would again need the list of households. A less expensive way is
to let each block within the city represent a cluster. A sample of clusters could then be randomly
selected, and every household within these clusters could be interviewed to find the average
annual household income.
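
The city-block illustration can be sketched as follows. The block names and income figures are hypothetical, invented only to make the example runnable, and the plain average of the surveyed households is used as a simple estimate.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Single-stage cluster sampling: pick clusters at random, survey every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # simple random sample of cluster names
    units = [u for name in chosen for u in clusters[name]]
    return chosen, units

# Hypothetical annual household incomes (in birr), grouped by city block.
blocks = {
    "block_1": [12000, 34000, 21000],
    "block_2": [18000, 25000],
    "block_3": [40000, 15000, 22000, 31000],
    "block_4": [27000, 19000, 23000],
    "block_5": [16000, 45000],
}
chosen, incomes = cluster_sample(blocks, 2, seed=3)
mean_income = sum(incomes) / len(incomes)
print(chosen, round(mean_income, 2))
```

Only a list of blocks is needed as the sampling frame here, not a list of all households.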

d/ Systematic Sampling:

Systematic sampling is the selection of every kth element from a sampling frame, where k is the
sampling interval, k = population size / sample size = N/n. Using this procedure each element
in the population has a known and equal probability of selection. This makes systematic sampling
functionally similar to simple random sampling. It is, however, much more efficient and much less
expensive to carry out. Like simple random sampling, it requires a complete list of all elements
within the population (a sampling frame).

Suppose that we have a complete and up-to-date list of the N units in the population, numbered
from 1 to N in some order. To select a sample of size n, suppose that N is an integral multiple of
n, i.e., N = kn for some integer k, so that k = N/n.

The procedure starts by determining the first element to be included in the sample: select a unit i
randomly from the first k units (1 ≤ i ≤ k). The second unit is the (i+k)th element of the frame,
and so on. In total, the ith, (i+k)th, (i+2k)th, …, (i+(n-1)k)th elements of the population are
taken as a sample of size n.

Example 7.6: Suppose that N = 20 and we want to select a sample of size 4, so that k = N/n = 20/4
= 5. The first element in the sample is selected randomly from the first 5 units, say the 3rd,
which is the random start. Then every 5th unit is selected, and the sample contains the 3rd, 8th,
13th and 18th units of the population.
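
The procedure can be written as a short function. This is a minimal sketch that assumes, as in the text, that N is an integer multiple of n; the function name and its parameters are illustrative.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the positions i, i+k, ..., i+(n-1)k for sampling interval k = N/n."""
    if N % n != 0:
        raise ValueError("this sketch assumes N is an integer multiple of n")
    k = N // n
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start among the first k units
    return [start + j * k for j in range(n)]

print(systematic_sample(20, 4, start=3))   # the random start of Example 7.6
```

Calling the function without `start` picks the random start automatically.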

7.3.2 Non-Random Sampling or non-probability sampling.

Non-probability sampling is a sampling technique in which the choice of individuals for a sample is
based on convenience, personal choice or interest.

Types of non-random sampling are:

1. Judgment sampling.

2. Convenience sampling

3. Quota Sampling.

1. Judgment Sampling

In this case, the person taking the sample has direct or indirect control over which items are
selected for the sample. This method is mainly used for opinion surveys but is not recommended
for general use, as it is subject to the bias of the sampler.

2. Convenience Sampling

In this method, the decision maker selects a sample from the population in a manner that is
relatively easy and convenient.

3. Quota Sampling

This is a type of judgment sampling and may be the most commonly used one in the non-probability
category. In a quota sample, quotas are set up according to some specified
characteristics such as income groups, age groups, political or religious groups, etc. Within each
quota, the selection of sampling units depends upon personal judgment.

Activity 7.2:

1. Briefly explain the difference between probabilistic and non-probabilistic techniques of
selecting a sample.

2. Discuss the major reasons for using sampling rather than taking all units of the population.

3. State and discuss the types of probability sampling techniques, giving an example of
each type.

7.4 Sampling Distribution of the sample mean

The value of the sample mean for any sample will depend on the elements included in that sample.
Consequently, the sample mean is a random variable. Therefore, like any other random variable, the
sample mean possesses a probability distribution, which is more commonly called the sampling
distribution of the sample mean. In general, the probability distribution of a sample statistic is
called its sampling distribution. Sampling distributions are important in statistical inference.

Consider all possible samples of size n that can be drawn from a given population (either with or
without replacement). For each sample, we can compute a statistic (such as the mean or the
standard deviation) that will vary from sample to sample. The important characteristics of the
sampling distribution of the sample mean are its mean, its variance and the form of the distribution.

Steps for the construction of Sampling Distribution of the mean

1. From a finite population of size N, draw all possible samples of size n. There are
$N^n$ possible samples if sampling is with replacement and $\binom{N}{n}$ possible samples if
sampling is without replacement.

2. Calculate the mean for each sample.

3. Summarize the mean obtained in step 2 in terms of frequency distribution

Example 7.7: Suppose we have a population of size 5, consisting of the age of five children 3, 5,
7, 9, and 11. Population mean is 7 and population variance is 8. (Consider sampling without
replacement).

Take samples of size 2 and construct sampling distribution of the sample mean.

Solution:

Step 1: With N = 5 and n = 2 we have $\binom{5}{2} = 10$ possible samples:

(3,5), (3,7), (3,9), (3,11), (5,7), (5,9), (5,11), (7,9), (7,11) and (9, 11)

Step 2: Calculate the sample mean for each sample:

Means = 4, 5,6,7,6,7,8,8,9,10 respectively.

Step 3: Summarize the means obtained in Step 2 in terms of a frequency distribution.


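Sample mean x̄_i      4    5    6    7    8    9    10   Total
f_i                  1    1    2    2    2    1    1    10
x̄_i f_i              4    5    12   14   16   9    10   70
f_i (x̄_i − 7)^2      9    4    2    0    2    4    9    30

a) Mean of sample means: $E(\bar{X}) = \frac{\sum \bar{x}_i f_i}{\sum f_i} = 70/10 = 7$

b) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum_{i=1}^{k} (\bar{x}_i - E(\bar{X}))^2}{k} = 30/10 = 3$, where k = 10 is the number of sample means.

The formula for sampling without replacement gives the same value:

$V(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{8}{2}\left(\frac{5-2}{5-1}\right) = 3$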
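
The three steps of Example 7.7 can be checked by listing every sample. The sketch below uses exact fractions so that no rounding is involved; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 5, 7, 9, 11]
N, n = len(population), 2

# Steps 1 and 2: every sample of size 2 without replacement, and its mean.
means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Step 3: mean and variance of the 10 sample means.
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# Compare with (sigma^2 / n) * (N - n) / (N - 1).
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N
formula = sigma2 / n * Fraction(N - n, N - 1)

print(len(means), mean_of_means, var_of_means, formula)
```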

Example 7.8

Three students have taken a class test which is marked out of 10. We want to estimate the mean
mark using the sample mean as the estimate of the population mean. We take a sample of size 2 in
two cases and suppose the marks of the three students are 1, 2 and 6.

The population mean is $\mu = (1+2+6)/3 = 3$.

The population variance is $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = 14/3$.

i) Sampling without replacement


In this type of sampling an observation is included in the sample only once and is selected
randomly without any preference or conscious effort.

If sampling is without replacement we can take $\binom{3}{2} = 3$ possible samples; the
possibilities are given below.

Possible sample    (1,2)    (1,6)    (2,6)
Sample mean        1.5      3.5      4

The sample mean is a random variable, and we see that it can take three possible values. We can
now write down its probability distribution as follows:

x̄_i              1.5     3.5     4      Total
P(X̄ = x̄_i)       1/3     1/3     1/3    1
(x̄_i − 3)^2      2.25    0.25    1      3.5

i) Mean of sample means: $E(\bar{X}) = \sum \bar{x}_i P(\bar{X} = \bar{x}_i) = 1.5(1/3) + 3.5(1/3) + 4(1/3) = 3$, which is the population mean. That is, the mean of sample means $E(\bar{X})$ equals the population mean.

ii) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum_{i=1}^{k} (\bar{x}_i - E(\bar{X}))^2}{k} = 3.5/3 = 1.17$, where k is the number of sample means.

For sampling without replacement, the formula gives the same result:

$V(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{14/3}{2}\left(\frac{3-2}{3-1}\right) = 14/12 = 1.17$

ii. Sampling with replacement

In this type of sampling an observation has a chance to be selected at each draw.

Suppose that we take the sample with replacement; there are $3^2 = 9$ possible samples.

Sample         (1,1)  (1,2)  (1,6)  (2,1)  (2,2)  (2,6)  (6,1)  (6,2)  (6,6)
Sample mean    1      1.5    3.5    1.5    2      4      3.5    4      6

The sample mean is a random variable and its probability distribution is:

x̄_i                   1      1.5    2      3.5    4      6      Total
P(X̄ = x̄_i)            1/9    2/9    1/9    2/9    2/9    1/9    1
x̄_i P(X̄ = x̄_i)        1/9    1/3    2/9    7/9    8/9    6/9    3
f_i (x̄_i − 3)^2       4      4.5    1      0.50   2      9      21

Here f_i is the number of samples that give the mean x̄_i.

i) Mean of sample means: $E(\bar{X}) = \sum \bar{x}_i P(\bar{X} = \bar{x}_i) = 1(1/9) + 1.5(2/9) + 2(1/9) + 3.5(2/9) + 4(2/9) + 6(1/9) = 3$.

Mean of sample means, $E(\bar{X})$ = population mean.

ii) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum_{i=1}^{k} (\bar{x}_i - E(\bar{X}))^2}{k} = 21/9 = 2.33$, where k is the number of sample means.

For sampling with replacement, the formula gives the same result:

$V(\bar{X}) = \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{14/3}{2} = 14/6 = 2.33$

In each case the expected value of the sample mean equals the population mean. This explains why
the sample mean is a good estimate of the population mean. If we use the sample mean as an
estimate of the population mean we will sometimes overestimate it, and sometimes under-estimate
it, but “on average” we will be accurate.
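
Both cases of Example 7.8 can be verified by enumeration. This sketch lists every sample with and without replacement for the marks 1, 2 and 6 and compares the results with the two variance formulas; the helper name is illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def mean_and_variance(values):
    """Mean and (population-style) variance of a list of equally likely values."""
    values = list(values)
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

marks = [Fraction(1), Fraction(2), Fraction(6)]
N, n = len(marks), 2
mu, sigma2 = mean_and_variance(marks)                          # 3 and 14/3

wor_means = [sum(s) / n for s in combinations(marks, n)]       # 3 samples
wr_means = [sum(s) / n for s in product(marks, repeat=n)]      # 9 samples

wor_mean, wor_var = mean_and_variance(wor_means)
wr_mean, wr_var = mean_and_variance(wr_means)

print(wor_mean, wor_var, sigma2 / n * Fraction(N - n, N - 1))  # without replacement
print(wr_mean, wr_var, sigma2 / n)                             # with replacement
```

In both cases the mean of the sample means equals the population mean, while the variance is smaller without replacement because of the finite population correction.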

The example above illustrates an important result:

Example 7.9: Suppose we have a hypothetical population of size 3, consisting of three children
namely: A is 3 years old, B is 6 years old and C is 9 years old. Construct sampling distribution of
the sample mean of size 2 using sampling without replacement and with replacement.

Solution: The mean and variance of the population are 6 and 6, respectively.

1. If sampling is without replacement we will have $\binom{3}{2} = 3$ possible samples: (A, B),
(A, C) and (B, C), and their corresponding sample means are (3+6)/2 = 4.5, 6 and 7.5,
respectively. Hence the probability distribution (sampling distribution) of the sample mean is:

x̄              4.5    6      7.5
P(X̄ = x̄)       1/3    1/3    1/3

$E(\bar{X}) = \sum \bar{x}\,P(\bar{x}) = 4.5(1/3) + 6(1/3) + 7.5(1/3) = 6$

$V(\bar{X}) = \sum \bar{x}^2 P(\bar{x}) - \mu_{\bar{x}}^2 = (6.75 + 12 + 18.75) - 36 = 1.5$

2. If sampling is with replacement we will have $N^n = 3^2 = 9$ possible samples: (A, A), (A, B),
(A, C), (B, A), (B, B), (B, C), (C, A), (C, B) and (C, C). Hence the probability distribution
(sampling distribution) of the sample mean is:

x̄              3      4.5    6      7.5    9
P(X̄ = x̄)       1/9    2/9    3/9    2/9    1/9

$E(\bar{X}) = \sum \bar{x}\,P(\bar{x}) = 3(1/9) + 4.5(2/9) + 6(3/9) + 7.5(2/9) + 9(1/9) = 6$

$V(\bar{X}) = \sum \bar{x}^2 P(\bar{x}) - \mu_{\bar{x}}^2 = (1 + 4.5 + 12 + 12.5 + 9) - 36 = 3$

Remark:

1. Mean of sample means: $E(\bar{X}) = \frac{\sum \bar{x}_i f_i}{\sum f_i} = \sum \bar{x}_i P(\bar{X} = \bar{x}_i)$ = population mean.

2. Variance of sample means: $V(\bar{X}) = \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (if sampling is with replacement).

3. Variance of sample means: $V(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$ (if sampling is without replacement).

The quantity $\frac{N-n}{N-1}$ is the finite population correction (fpc); if n/N < 0.05, the fpc is ignored.

Note: the square root of the variance of sample means is known as the standard error.

The distribution of sample means depends on the distribution of the population, the sample size and
whether the population variance is known or unknown. A sample may be from a normally distributed
population or from a non-normally distributed population, the population variance may be known or
unknown, and the sample size may be large or small.

Case-I: If sampling is from a normally distributed population with known variance:

When sampling is from a normally distributed population with known variance, the distribution of
sample means, $\bar{X}$, is normal whatever the sample size.

Example 7.10: The speed of all cars travelling on a street is normally distributed with mean 68
km/h and variance 9 (km/h)². Find the probability that the mean speed of a random sample of 16 cars
travelling on the street is more than 70 km/h.

Solution:

Let X be the speed of cars, with mean 68 and variance 9. A sample of size 16 is taken, and the
sample mean $\bar{X}$ is a random variable. Since the population is normally distributed,

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) = N\left(68, \frac{9}{16}\right) = N(68, 0.56)$

The standard error is $\sqrt{9/16} = 0.75$, so the probability that the sample mean is greater than 70 is

$P(\bar{X} > 70) = P\left(Z > \frac{70-68}{0.75}\right) = P(Z > 2.67) = 0.0038$
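
The table lookup in Example 7.10 can be cross-checked numerically. This sketch uses NormalDist from Python's standard statistics module in place of the printed standard normal table, so the result may differ from the table value in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma2, n = 68, 9, 16
se = sqrt(sigma2 / n)                       # standard error of the mean = 0.75

# P(sample mean > 70) for a normal sampling distribution N(mu, se^2)
p = 1 - NormalDist(mu, se).cdf(70)
print(round(se, 2), round(p, 4))
```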

Case-II: When sampling is from a non-normal population and the sample size is large

If sampling is from a non-normal population and the sample size is large, the distribution of
$\bar{X}$ is approximately normal by the Central Limit Theorem (Section 7.6).

7.5 Sampling Distribution of the sample Proportion

In situations where it is not possible to measure the characteristic under study, but it is possible
to classify the whole population into various categories with respect to the attributes they
possess, consideration is usually given to estimating the proportion of population elements that
belong to a defined category or class. Suppose that we have two complementary and mutually
exclusive classes, C and C', such that every unit in the population falls into exactly one of them.

In order to know how many of the units fall in class C, we define a counting variable as

$X_i = \begin{cases} 1, & \text{if unit } i \text{ falls in class } C \\ 0, & \text{otherwise} \end{cases}$

If the number of units falling in C is denoted by A for the population and by a for the sample, then
$\sum_{i=1}^{N} X_i = A$, and hence the population proportion, denoted by P, is given by P = A/N.

Given a simple random sample of n units, the sample proportion is denoted by
$p = \frac{a}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$. From the formula, we see that $\bar{X}$ and p are
essentially identical. In fact p is a special case of $\bar{X}$, the case where the possible values
of $X_i$ are only 0 and 1. Consequently p possesses all the properties of $\bar{X}$.

p is an estimate of P, with variance

$\mathrm{var}(p) = \mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$, where $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = PQ$ ………………………..………7.1

so that

$\mathrm{var}(p) = \frac{PQ}{n}\left(\frac{N-n}{N-1}\right)$

where Q = 1 − P is the proportion of units falling in class C'.

This variance is estimated by using sample values as

$\widehat{\mathrm{var}}(p) = \frac{pq}{n-1}\left(\frac{N-n}{N}\right) = \frac{pq}{n-1}(1-f)$

where q = 1 − p and f = n/N is the sampling fraction.

This expression is obtained by replacing $\sigma^2$ by its estimator $S^2 = \frac{npq}{n-1}$.

The sampling fraction can be ignored when N is large relative to the sample size n (n/N < 0.05). In that case

$\widehat{\mathrm{var}}(p) = \frac{pq}{n-1}$ and the standard error of p is $\sqrt{\frac{pq}{n-1}}$.

For large samples, the sample proportion p is approximately normally distributed with mean P and variance $\mathrm{var}(p) = \frac{PQ}{n}\left(\frac{N-n}{N-1}\right)$.

Example 7.11: In a simple random sample of size 100, from a population of size 500, there are 37
employed persons in the sample.

a) Estimate proportion of employed persons in the population.

b) Calculate the standard error of p.

Solution:

a) The population proportion P is estimated by p = a/n = 37/100 = 0.37. An estimated 37% of the
population is employed.

b) With f = n/N = 100/500 = 0.2, the standard error of p is

$\sqrt{\frac{pq(1-f)}{n-1}} = \sqrt{\frac{(0.37)(0.63)(1-0.2)}{99}} = 0.0434$
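
The calculation in Example 7.11 can be wrapped in a small helper. This is a sketch of the estimator pq(1 − f)/(n − 1) given above; the function name is illustrative, and passing no population size drops the sampling fraction.

```python
from math import sqrt

def proportion_se(a, n, N=None):
    """Sample proportion and its estimated standard error sqrt(pq(1 - f)/(n - 1))."""
    p = a / n
    q = 1 - p
    f = n / N if N else 0.0          # sampling fraction; treated as 0 when N is not given
    return p, sqrt(p * q * (1 - f) / (n - 1))

p, se = proportion_se(37, 100, 500)
print(p, round(se, 4))
```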

7.6 The Central Limit Theorem

If X1, X2, …, Xn is a random sample from a population with mean μ and variance $\sigma^2$, then as n
goes to infinity the distribution of the sample mean, $\bar{X}$, approaches a normal distribution with
mean μ and variance $\sigma^2/n$. In short, as n gets large, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately.

We can standardize this to get $Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ (approximately as n gets large). When the
population variance is unknown, $Z = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim N(0, 1)$ (approximately as n gets large).

Example 7.12: The mean weight of 500 male students at a certain university is 151 pounds (lb)
and the standard deviation is 15 lb. Assume that the weights are normally distributed. Suppose
that a sample of 64 students is taken. What is the probability that the mean weight in the sample
is more than 154.75 lb?

Solution

As we have taken a large (n = 64) sample we can use the Central Limit Theorem. This says that the
mean weight of the sample can be approximated by a normal random variable with a mean of 151
and a variance of 225/64. If we let $\bar{X}$ be the mean weight of the students in the sample, then
$\bar{X} \sim N(151, 225/64)$ and it is required to find

$P(\bar{X} > 154.75) = P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} > \frac{154.75-151}{15/8}\right) = P(Z > 2.00) = 0.5 - 0.4772 = 0.0228$
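
The same normal approximation applies to every example in this section, so it can be written once as a function. This sketch assumes the sampling distribution N(μ, σ²/n) given by the Central Limit Theorem and ignores any finite population correction, as the worked solution does; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_exceeds(mu, sigma, n, threshold):
    """P(sample mean > threshold) under the approximation N(mu, sigma^2 / n)."""
    se = sigma / sqrt(n)                          # standard error of the mean
    return 1 - NormalDist(mu, se).cdf(threshold)

p_weight = prob_mean_exceeds(151, 15, 64, 154.75)   # Example 7.12
print(round(p_weight, 4))
```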

Example 7.13: Suppose that 150 customers enter a supermarket on a given day. Each customer
spends a random amount. All that is known about the distribution of these expenditures is that its
mean is 7.50 birr and its standard deviation is 3.40 birr. What is the probability that the average
amount spent per customer during the day is more than 8.00 birr?

Solution:

We have n = 150, which is large enough to use the Central Limit Theorem. Mean = 7.50 and
standard deviation = 3.40.

Let $\bar{X}$ be the mean amount of an individual's expenditure during the day, so that
$\bar{X} \sim N(7.50, 0.077)$ approximately, since $3.40^2/150 = 0.077$. It is required to find $P(\bar{X} > 8.00)$.

$P(\bar{X} > 8.00) = P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} > \frac{8.00-\mu}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{8.00-7.50}{3.40/\sqrt{150}}\right) = P(Z > 1.80) = 0.5 - P(0 < Z < 1.80) = 0.5 - 0.4641 = 0.0359$

This means there is only a 0.0359 probability that the average expenditure per customer will exceed
8.00 birr.

Example 7.14: A factory making soft-drinks has an automatic process that fills its bottles. The
volume of the soft drink in each bottle is supposed to be 330ml, but the machine fills the bottles
with a random amount of soft drink that has a mean of 330ml and a standard deviation of 5ml.
Suppose we take a sample of 100 bottles of drink. What is the probability that the mean volume
of drink in the sample is more than 331ml?

Solution:

As we have taken a large (n = 100) sample we can use the Central Limit Theorem.

This says that the mean amount of soft drink in the bottles of the sample can be approximated by
a normal random variable with a mean of 330 and a variance of 25/100 = 1/4. If we let $\bar{X}$ be
the mean volume of soft drink per bottle in the sample, it is required to find $P(\bar{X} > 331)$,
where $\bar{X} \sim N(330, 1/4)$.

$P(\bar{X} > 331) = P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} > \frac{331-\mu}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{331-330}{1/2}\right) = P(Z > 2.00) = 0.5 - 0.4772 = 0.0228$
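
The Central Limit Theorem can also be seen empirically. In the sketch below the fill volumes are drawn from a uniform distribution with mean 330 ml and standard deviation 5 ml; the uniform shape is purely an assumption for illustration, since the text does not specify the distribution. The fraction of simulated samples of 100 bottles whose mean exceeds 331 ml should come out close to the normal-approximation value of 0.0228.

```python
import random
from math import sqrt

def simulate_tail(n=100, trials=10_000, seed=42):
    """Estimate P(sample mean > 331) by simulation with a non-normal population."""
    rng = random.Random(seed)
    half_width = 5 * sqrt(3)            # uniform on [330 - h, 330 + h] has sd h / sqrt(3) = 5
    lo, hi = 330 - half_width, 330 + half_width
    exceed = 0
    for _ in range(trials):
        mean = sum(rng.uniform(lo, hi) for _ in range(n)) / n
        if mean > 331:
            exceed += 1
    return exceed / trials

estimate = simulate_tail()
print(estimate)
```

The estimate varies with the seed and the number of trials, so it agrees with 0.0228 only approximately.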

Case-III: When sampling is from a normally distributed population with unknown population
variance,

a) If the sample size is large, $Z = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim N(0, 1)$, where S is an estimate of σ.

b) If the sample size is small (n < 30), $t = \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t(n-1)$; that is, t has a t-distribution with (n-1) degrees of
freedom, where S is an estimate of σ.

Activity 7.3

1. State and briefly describe the central limit theorem.

2. Suppose that 150 customers enter a supermarket on a given day. Each customer spends a
random amount. All that is known about the distribution of these expenditures is that its mean is
7.50 birr and its standard deviation is 3.40 birr. What is the probability that the average
amount spent per customer during the day is more than 8.00 birr?

Exercise 7

1. What is the difference between a statistic and a parameter?

2. What is a sampling frame?

3. How do you select a simple random sample?

4. A population consists of the four numbers, 3,7,11 and 15. Consider all possible samples of size
2 drawn from this population with replacement.

Find a) population mean

b) Population variance

c) the sampling distribution of sample means

d) the mean of sample means

e) the standard deviation of the sample means.

5. Solve problem 4 if the sampling is without replacement.

6. An electrical firm manufactures light bulbs whose length of life is normally
distributed with population mean 800 hrs and standard deviation 40 hrs. Find the probability that
a bulb burns:

a) Between 778 and 834 hrs

b) Greater than 834 hrs?

7. The amount of sulphur in a daily emission from a factory has a normal distribution with mean
of 134 pounds and a standard deviation of 22 pounds. For a day selected randomly, find the
probability that the mean amount of sulphur emission will be less than 130 pounds.

8. A population consists of the numbers 3, 7, 11, 13 and 15. Consider all possible samples of
size 2 drawn from this population without replacement.

Find: a) the sampling distribution of sample means

b) The mean of sample means

c) The standard deviation of the sample means.

CHAPTER EIGHT

ONE SAMPLE INFERENCES

Objectives:

After completing this unit, the student should be able to

- Explain the concepts of statistical estimation and the confidence interval.
- Distinguish interval estimation from point estimation.
- Calculate and interpret point estimates of the population mean and population proportion.
- Define the concept of hypothesis testing and differentiate types of tests.
- List the basic steps in hypothesis testing.
- Follow the steps to solve problems on hypothesis testing.
- Identify the appropriate test statistic for a given practical problem.

8.1 Introduction

The process of inferring information about a population from a sample is known as statistical
inference. This chapter has two major parts. The first part, statistical estimation, discusses the
method of estimating a population parameter by using a statistic (point estimation) and explains
the concept of a confidence interval. The second part, hypothesis testing, describes the different
techniques of testing a given tentative assumption by applying an appropriate test statistic.

Statistical estimation is the procedure of using a sample statistic to estimate a population
parameter. This is one way of making inference about the population parameter where the
investigator does not have any prior notion about the values or characteristics of the population
parameter. A statistic used to estimate a parameter is called an estimator and the value taken by
the estimator is called an estimate. Statistical estimation is divided into two main categories:
Point Estimation and Interval Estimation.

8.2 Point and Interval Estimation of mean

Point estimate: When we use a single value of a statistic to estimate the corresponding parameter
of a population, it is called point estimation. It is a common way of estimating a parameter, where
a random sample of n observations is selected from a population and the statistic is calculated.

Examples:

- A sample mean is an estimate of the population mean μ. That is, $\bar{X}$ is an estimator of the
population mean μ.
- A sample variance is an estimate of the population variance. That is, $S^2$ is an estimator of the
population variance $\sigma^2$.
- A sample proportion is an estimate of the population proportion.

Properties of best estimator

The following are some qualities of a good estimator.

- It should be unbiased.
- It should be consistent.
- It should be relatively efficient.

To explain these properties let 𝜃̂ be an estimator of θ.

1. Unbiased Estimator: An estimator whose expected value is the value of the parameter being
estimated. i.e., E(𝜃̂) = θ.

2. Consistent Estimator: An estimator which gets closer to the value of the parameter as the
sample size increases. i.e., 𝜃̂ gets closer to θ as the sample size increases.

3. Relatively Efficient Estimator: The estimator for a parameter with the smallest variance.

This actually compares two or more estimators for one parameter.

Interval estimation: It is unlikely that any particular estimate will be exactly equal to the
population parameter; an estimate can be greater than or less than the parameter. That is, it is not
always possible to estimate a population parameter without error, so an allowance is needed for
such error. We take an interval, a range of values about an estimate in which the parameter may lie.
This procedure is interval estimation. It is the procedure that results in an interval of values for
a parameter. Interval estimates indicate the precision or accuracy of an estimate and are, therefore,
preferable to point estimates. Interval estimation deals with identifying the upper and lower limits
of a parameter. The confidence interval for the parameter is:

Estimate ± critical value × Standard error of the estimator…………………………….8.1

Example 8.1: Confidence interval for the population mean is:

𝑋̅ ± Critical value × Standard error of ( 𝑋̅ )

Although 𝑋̅ possesses nearly all the qualities of a good estimator, because of sampling error, we
know that it's not likely that our sample statistic will be equal to the population parameter, but
instead will fall into an interval of values. We will have to be satisfied knowing that the statistic is
"close to" the parameter. That leads to the obvious question, what is "close"?

We can phrase the latter question differently: How confident can we be that the value of the
statistic falls within a certain "distance" of the parameter? Or, what is the probability that the
parameter's value is within a certain range of the statistic's value? This range is the confidence
interval.

The confidence level is the probability that the value of the parameter falls within the range
specified by the confidence interval surrounding the statistic. There are different cases to be
considered to construct confidence intervals.

Case-I: Population variance (σ2) is known and parent population is normal.

The sampling distribution of the sample mean is normal with mean μ and variance $\sigma^2/n$, that is,
$\bar{X} \sim N(\mu, \sigma^2/n)$. We can standardize this to get

$Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ …………………………………8.2

From the standard normal distribution, we have

𝑃(−𝑍𝛼⁄2 < 𝑍 < 𝑍𝛼⁄2 ) = 1 − 𝛼

where α is the risk probability and 1 − α is the confidence level. $\sigma/\sqrt{n}$ is the standard
error of the statistic; the standard error is the square root of the variance, where
$\mathrm{Var}(\bar{X}) = \sigma^2/n$.

Using the standardized form of the sampling distribution of the sample mean in the above
probability statement, we get the limits of the confidence interval as follows:

$P\left(-Z_{\alpha/2} < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$

𝑃(−𝑍𝛼⁄2 𝜎⁄√𝑛 < 𝑋̅ – 𝜇 < 𝑍𝛼⁄2 𝜎⁄√𝑛) = 1 − 𝛼

𝑃(−𝑍𝛼⁄2 𝜎⁄√𝑛 − 𝑋̅ < −𝜇 < −𝑋̅ + 𝑍𝛼⁄2 𝜎⁄√𝑛) = 1 − 𝛼

𝑃(𝑋̅ − 𝑍𝛼⁄2 𝜎⁄√𝑛 < 𝜇 < 𝑋̅ + 𝑍𝛼⁄2 𝜎⁄√𝑛) = 1 − 𝛼

The last statement shows that, with probability (1 − α), the interval

$(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n})$

contains the population mean μ. This interval is known as a (1 − α) 100% confidence interval for
the population mean (μ).

Here are the Z values corresponding to the most commonly used confidence levels.

(1 − α) 100%    α       α/2      Z_{α/2}
90              0.10    0.05     1.645
95              0.05    0.025    1.96
99              0.01    0.005    2.58

Example 8.2: The weights of full boxes of a certain kind of cereal are normally distributed with a
standard deviation of 0.27 ounce. If a sample of 15 randomly selected boxes produced a mean
weight of 9.87 ounce, find:

a) The 95% confidence interval for the true mean weight of boxes of this cereal,

b) The 99% confidence interval for the true mean weight of boxes of this cereal,

c) What effect does the increase in the level of confidence have on the width of the interval?

Solution:

a) Given 1 − α = 0.95, so that α/2 = 0.025, n = 15, σ = 0.27 ounce and x̄ = 9.87 ounces. The 95%
C.I. is based on P(−Z0.025 < Z < Z0.025) = 0.95 with Zα/2 = Z0.025 = 1.96, where

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}.$$

Substituting these values in

$$\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}},$$

the resulting confidence interval is (9.73, 10.01).

b) Similarly the 99% C.I. is (9.69, 10.05).

c) The increase in the confidence level widens the length of the confidence interval.
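
The arithmetic of Example 8.2 can be checked with a short Python sketch. The function name `z_interval` is an illustrative assumption, not part of these notes, and the critical value is taken from the standard library's normal distribution rather than from the printed table, so the 99% value is 2.576 instead of the rounded 2.58.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known.

    Hypothetical helper for illustration: mean is the sample mean,
    sigma the known population standard deviation, n the sample size.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    margin = z * sigma / sqrt(n)             # Z_{alpha/2} * sigma / sqrt(n)
    return mean - margin, mean + margin


# Data of Example 8.2: n = 15, sigma = 0.27, sample mean = 9.87
ci95 = z_interval(9.87, 0.27, 15, 0.95)
ci99 = z_interval(9.87, 0.27, 15, 0.99)
print([round(v, 2) for v in ci95])
print([round(v, 2) for v in ci99])
```

The 99% interval comes out wider than the 95% interval, which is the effect described in part (c).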

Case-II: When sampling from a non-normal population with a large sample size, the distribution of
X̄ follows from the Central Limit Theorem (with known or unknown population variance).

Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a
sample. Consider samples of size n drawn from a population whose mean is μ and standard
deviation is σ. The population can have any frequency distribution. The sampling distribution of
X̄ will have mean μ and standard deviation σ/√n, and it is approximately normal with mean μ and
variance σ²/n as n gets large. That is, X̄ ~ N(μ, σ²/n) (as n gets large). We can standardize
this to get

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \quad \text{or} \quad Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1) \text{ when } \sigma \text{ is unknown.}$$

A (1-α) 100% confidence interval for population mean (μ) is

(𝑋̅ − 𝑍𝛼⁄2 𝜎⁄√𝑛 , 𝑋̅ + 𝑍𝛼⁄2 𝜎⁄√𝑛) if 𝜎 2 is known and

(𝑋̅ − 𝑍𝛼⁄2 𝑆⁄√𝑛 , 𝑋̅ + 𝑍𝛼⁄2 𝑆⁄√𝑛) if 𝜎 2 is unknown. ………………...8.3

Example 8.3: An economist wants to estimate the average amount in checking accounts at banks
in a given region. A random sample of 100 accounts gives X̄ = $357.60 and S = $140.00. Give a
95% confidence interval for μ, the average amount in a checking account at a bank in the given
region.

Solution:

Given: n = 100, 𝑋̅ = $357.60, S= $140.00 & α = 0.05

A 95% confidence interval for population mean (μ) is

(𝑋̅ – 𝑍𝛼⁄2 𝑆⁄√𝑛 , 𝑋̅ + 𝑍𝛼⁄2 𝑆⁄√𝑛) … since n is large and 𝜎 2 is unknown

= (357.60 − 1.96(140.00⁄√100), 357.60 + 1.96(140.00⁄√100))

= (330.16, 385.04).

Case-III: When sampling is from a normally distributed population with unknown population
variance and the sample size is small (n < 30).

When the population variance σ² is unknown, we estimate it by the sample variance S². The
standardized form of the sample mean,

$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}},$$

has a t-distribution with (n − 1) degrees of freedom. From this distribution, a (1 − α)100%
confidence interval for the population mean is

$$\left(\bar{X} - t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}}\right) \tag{8.5}$$

Example 8.4: A sample of size 25 drawn from a normal population gave a mean of 32 and a sample
standard deviation of 4.2. Find

a) A 95% confidence interval for the population mean.

b) A 99% confidence interval for the population mean.

Solution:

a) Given: n = 25, X̄ = 32, S = 4.2, 1 − α = 0.95 ⟹ α = 0.05, α/2 = 0.025

⟹ tα/2,24 = 2.064 from table.

⟹ The required interval will be

$$\left(\bar{X} - t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}}\right) = 32 \pm 2.064 \times \frac{4.2}{\sqrt{25}} = 32 \pm 1.73 = (30.27, 33.73)$$

b) Given: n = 25, X̄ = 32, S = 4.2, 1 − α = 0.99 ⟹ α = 0.01, α/2 = 0.005

⟹ tα/2,24 = 2.797 from table.

⟹ The required interval will be

$$\left(\bar{X} - t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,(n-1)}\frac{S}{\sqrt{n}}\right) = 32 \pm 2.797 \times \frac{4.2}{\sqrt{25}} = 32 \pm 2.35 = (29.65, 34.35)$$
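
As a check on Example 8.4, the small-sample interval can be sketched in Python. The helper name `t_interval` is an assumption made for illustration. Because the Python standard library has no t-distribution quantile function, the critical values 2.064 and 2.797 are passed in exactly as read from the t-table.

```python
from math import sqrt


def t_interval(mean, s, n, t_crit):
    """Small-sample confidence interval for a mean (sigma unknown).

    Hypothetical helper: t_crit is t_{alpha/2, n-1} looked up in a t-table.
    """
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin


# Data of Example 8.4: n = 25, sample mean = 32, S = 4.2
ci95 = t_interval(32, 4.2, 25, 2.064)  # t_{0.025, 24} from the table
ci99 = t_interval(32, 4.2, 25, 2.797)  # t_{0.005, 24} from the table
print([round(v, 2) for v in ci95])
print([round(v, 2) for v in ci99])
```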

8.3 Sample size determination in estimation of population mean

When estimating the population mean μ by the sample mean with absolute margin of error (d) and
risk probability α, the sample size is given by:

$$n = \left(\frac{Z_{\alpha/2}\,\sigma}{d}\right)^2, \quad \text{where } |\bar{X} - \mu| = d \tag{8.6}$$

Example 8.5: To determine the average amount of time students take to get from one class to the
next, how large a sample is needed to assert with probability 0.95 that the error will be at most
0.25 minutes, if σ is known from past experience to be 1.50 minutes?

Solution: Using Z0.025 = 1.96 and substituting d = 0.25 and σ = 1.50 in the formula for n, we get

$$n = \left(\frac{1.96 \times 1.50}{0.25}\right)^2 = 138.30 \approx 139$$

(always rounded up to the next integer), which is the sample size required for the estimate.
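
The sample-size rule in formula 8.6 can be sketched as follows. The function name `sample_size_for_mean` is an illustrative assumption. The only design point is that the result is always rounded up, never to the nearest integer.

```python
from math import ceil


def sample_size_for_mean(z, sigma, d):
    """Minimum n so that the margin of error for the mean is at most d.

    Hypothetical helper: z is Z_{alpha/2}, sigma the population standard
    deviation, d the absolute margin of error.
    """
    return ceil((z * sigma / d) ** 2)  # always round up


# Data of Example 8.5: Z_{0.025} = 1.96, sigma = 1.50, d = 0.25
n_required = sample_size_for_mean(1.96, 1.50, 0.25)
print(n_required)
```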

8.4 Point and interval estimation for the proportion

A confidence interval for the population proportion is constructed in the same manner as for the
population mean. We have discussed that, for large samples, the sampling distribution of the
sample proportion is approximately normal. The sample estimate of the population proportion P is
the sample proportion p̂, and the sample estimate of the variance of the sample proportion is

$$Var(\hat{p}) = \frac{\hat{p}\hat{q}}{n-1}, \quad \text{which for large samples is taken as} \quad Var(\hat{p}) = \frac{\hat{p}\hat{q}}{n}, \quad \text{where } \hat{q} = 1 - \hat{p}.$$

A (1 − α)100% confidence interval for the proportion P is given by (for large n):

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \tag{8.7}$$

Example 8.6: The Human Resource director of a large organization wanted to know what
proportion of all persons who had ever been interviewed for a job with his organization had been
hired. He was willing to settle for 95% confidence interval. A random sample of 500 interview
records revealed that 76 or 0.152 of the persons in the sample had been hired.

Solution:

Given: 𝑝̂ = 0.152, 𝑞̂ = 1 − 𝑝̂ = 0.848, n = 500, α = 0.05, 𝑍0.025 = 1.96

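The 95% confidence interval for the population proportion is given by

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.152 \pm 1.96\sqrt{\frac{0.152 \times 0.848}{500}} = 0.152 \pm 0.031 = (0.121, 0.183)$$

Hence, with 95% confidence, the proportion of interviewed persons who were hired lies between
0.121 and 0.183.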
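
A brief Python sketch of formula 8.7, applied to the data of Example 8.6, is given below. The function name `proportion_interval` is a hypothetical choice, and the critical value is computed from the standard library's normal distribution instead of being read from the table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion.

    Hypothetical helper: successes is the count with the attribute,
    n is the sample size.
    """
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Data of Example 8.6: 76 hired out of 500 interview records
low, high = proportion_interval(76, 500, 0.95)
print(round(low, 3), round(high, 3))
```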

Sample size for estimating population proportion: The method for determining a sample size for
estimating the population proportion is similar to that used in the previous section. We require
that the population proportion P should fall within the range p̂ ± ε with a specified probability.

$$\varepsilon = Z_{\alpha/2}\sqrt{\frac{pq}{n}}, \quad q = 1 - p$$

$$\varepsilon^2 = (Z_{\alpha/2})^2\,\frac{pq}{n} \tag{8.8}$$

Thus, the minimum required sample size to estimate P with maximum tolerable error ε and with a
confidence level (1 − α) is as follows:

$$n = \left(\frac{Z_{\alpha/2}}{\varepsilon}\right)^2 pq \tag{8.9}$$

Example 8.7: A cigarette manufacturer wishes to use random sampling to estimate the average
nicotine content. The sampling error should not be more than plus or minus one milligram and the
population standard deviation is four milligrams. What sample size should the company use in order
to estimate the average nicotine content with 99% confidence?

Solution: Given: ε = 1 mg; σ = 4 mg;

(1 − α)100% = 99%

1 − α = 0.99

α = 0.01, α/2 = 0.005,

Zα/2 = Z0.005 = 2.58

Because the quantity being estimated is a mean, the sample size formula for the mean (8.6)
applies, with ε in the role of d. Therefore, the minimum sample size n is given by

$$n = \left(\frac{Z_{\alpha/2}\,\sigma}{\varepsilon}\right)^2 = \left(\frac{2.58 \times 4}{1}\right)^2 = 106.50 \approx 107$$

Thus, a sample of size 107 is needed to estimate the average nicotine content in the cigarette.

Activity 8.2:

1. Discuss the factors that should be considered while determining sample size.

2. Discuss the relationship between sample size and margin of error. What happens to the
sample size as the confidence level increases?

3. Suppose you want to construct a 95% confidence interval for the proportion of college students
who want to work in the health care profession after graduation. Previous studies show that 60%
of college graduates want to work in the health care profession. If you want a margin of error of
no more than 5%, how many college students must you survey?

8.5 Statistical Hypothesis testing

We have studied how to estimate the mean using point and interval estimation. The other aspect
of statistical inference is known as statistical test of hypothesis. The branch of statistics
which helps us in arriving at the criterion for deciding about a characteristic of the population,
a parameter, based on the information obtained from the sample data is known as testing of
hypothesis. We shall use the theoretical results presented for interval estimation, and hence a
test of hypothesis is closely connected with the theory of estimation we studied before.

In this section, we will deal with testing hypotheses about the population mean and the
population proportion. While doing so, we shall define some important terms that we will meet
and the errors we may commit in the process. We shall employ the standard normal distribution
(or Z-test) and the t-distribution (or t-test), depending upon the nature of the population
sampled and the sample.

8.5.1 Hypothesis testing about the mean

Some terms in tests of hypothesis

Statistical hypothesis is defined as a statement (or an assertion) about the parameter of a
population or its distribution that may be proved or disproved. Its plausibility is to be
evaluated on the basis of information obtained by taking a sample from the population.

Test statistic: is a statistic whose value serves to determine whether to reject or accept the
hypothesis to be tested. It is a random variable.

A given statement concerning a parameter could be true or false. Hence we have two
complementary hypotheses, namely, null hypothesis and alternative hypothesis.

a) Null hypothesis (H0)

It is the hypothesis to be tested for possible rejection under the assumption that it is true and it is
the hypothesis of equality or the hypothesis of no difference.

b) Alternative hypothesis (H1)

It is the hypothesis which is complementary to the null hypothesis. It may be accepted if Ho is
rejected, or be rejected if Ho is accepted. It is the hypothesis of difference.

Statistical Test: is a test or procedure used to evaluate a statistical hypothesis for deciding
whether to reject the hypothesis depending on sample data. The decisions we make are of two types:
either to reject Ho and conclude that H1 is accepted, or to retain Ho and conclude that we do not
have enough evidence to reject Ho.

Types of errors

Statistical test of hypothesis can lead to two kinds of errors. If the statistical test rejects Ho when
it is true, the error is type I error. If the test accepts Ho when it is false, the error is a type II error.

The following table gives a summary of possible results of any hypothesis testing procedure:

Decision       Ho is true          Ho is false
Reject Ho      Type I error        Correct decision
Accept Ho      Correct decision    Type II error

Type I error is the error committed in rejecting the null hypothesis when it is true. Probability of
committing type I error is sometimes called level of significance and denoted by α.

Type II error is the error committed in accepting the null hypothesis when it is false. Probability
of committing type II error is denoted by β.

In both types of errors, a wrong decision has occurred. An ideal test procedure is one which is so
planned as to safeguard against both these errors. However, in practical situations with a fixed
sample size, an attempt to reduce one of the errors increases the other. In view of this dilemma
and the fact that wrong rejection of Ho is regarded as the more serious error, we hold α at a
predetermined low level, such as 0.1, 0.05, or 0.01, when choosing a rejection region. A level of
significance of 5% implies that in 5 samples out of 100 we are likely to reject a correct H0. In
other words, when H0 is true, the test retains it in about 95 samples out of 100.
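
The meaning of the level of significance can be illustrated by a small simulation. The sketch below is only an illustration under assumed values (μ0 = 50, σ = 5, n = 25, 4000 repetitions, fixed random seed). None of these numbers come from the notes. It draws repeated samples from a population for which H0 is true and counts how often a two sided Z-test rejects H0.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_one_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often a two sided Z-test rejects a TRUE H0: mu = mu0."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    rejections = 0
    for _ in range(trials):
        # Sample from a normal population in which H0 really holds
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                        # a type I error
    return rejections / trials


rate = type_one_error_rate(mu0=50, sigma=5, n=25, alpha=0.05, trials=4000)
print(rate)
```

The printed proportion should be close to α = 0.05, that is, roughly 5 wrong rejections in every 100 samples.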

General steps in hypothesis testing on population mean, μ

Step-1: The first step in hypothesis testing is to specify the null hypothesis (H0) and the
alternative hypothesis (H1). Suppose the assumed or hypothesized value of μ is denoted by μo; then
one can formulate two sided and one sided hypotheses as follows:

1. Ho: μ = μo versus H1: μ ≠ μo (two sided test)

2. Ho: μ = μo versus H1: μ < μo (one sided test)

3. Ho: μ = μo versus H1: μ > μo (one sided test)

Step-2: Specify a significance level of α.

Step-3: Identify the sampling distribution of the estimator and the test statistic.

Case-I: Population variance (σ²) is known and parent population is normal.

The test statistic is

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Case-II: When sampling from a non-normal population with a large sample size, the distribution
of X̄ follows from the Central Limit Theorem (with known or unknown variance).

a) With known variance, the test statistic is

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1).$$

b) With unknown variance, the test statistic is

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1).$$

Case-III: When sampling is from a normally distributed population with unknown population
variance.

• When the sample size is large,

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1).$$

• When the sample size is small (n < 30),

$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t(n-1).$$

Step-4: The value of the test statistic can be calculated as follows:

a) With known variance:

$$Z_c = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

b) With unknown variance and large sample size:

$$Z_c = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

c) With unknown variance and small sample size:

$$t_c = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

where X̄ is the sample mean and μo is the parameter value specified by the null hypothesis.

Step-5: Identify the critical (rejection) region or state the decision rule.

a) For a two sided test Ho: μ = μo versus H1: μ ≠ μo, reject Ho if

Zc > Zα/2 or Zc < −Zα/2.

Note: Zc refers to Zcalculated.

Graphically, the rejection and acceptance regions are:

[Figure: standard normal curve with rejection regions of area α/2 in each tail, to the left of
−Zα/2 and to the right of Zα/2, and the acceptance region between them.]

b) For a one sided test (right sided test) Ho: μ = μo versus H1: μ > μo, reject Ho if
Zcalculated > Zα. Graphically, the rejection and acceptance regions are:

[Figure: standard normal curve with the acceptance region to the left of Zα and the rejection
region of area α to the right of Zα.]

c) For a one sided test (left sided test) Ho: μ = μo versus H1: μ < μo, reject Ho if
Zcalculated < −Zα. Graphically, the rejection and acceptance regions are:

[Figure: standard normal curve with the rejection region of area α to the left of −Zα and the
acceptance region to the right of −Zα.]

Step 6: Summarize the result and state the conclusion.

Decision Table

To test H0: μ = μ0 against the three alternatives, the rules are summarized as:

Alternative hypothesis   Accept H0 if              Reject H0 if                   Inconclusive if
μ ≠ μ0                   −Zα/2 < Zc < Zα/2         Zc > Zα/2 or Zc < −Zα/2        Zc = Zα/2 or Zc = −Zα/2
μ < μ0                   Zc > −Zα                  Zc < −Zα                       Zc = −Zα
μ > μ0                   Zc < Zα                   Zc > Zα                        Zc = Zα

Example 8.7: Test at α = 0.05 whether the mean of a random sample of size n = 16 is
"significantly less than 10" if the distribution from which the sample was taken is normal,
x̄ = 8.4 and σ = 3.2 (known).

Solution:

H0: μ = 10 versus HA: μ < 10, α = 0.05

Zα = Z0.05 = 1.645 (critical value)

$$Z_c = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{8.4 - 10}{3.2/4} = -2 \quad \text{(calculated value)}$$

Since Zc = −2 < −Zα = −1.645, the null hypothesis is rejected. That is, the population mean is
significantly less than 10 at the 5% level of significance.
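
The decision rules of Step-5 can be collected into one small Python function, sketched below and applied to the example just solved. The function name `z_test` and the labels "two-sided", "less" and "greater" are assumptions introduced for illustration. The critical value is computed with the standard library rather than read from the table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, alternative):
    """Z-test for a mean; alternative is 'two-sided', 'less' or 'greater'.

    Returns the calculated Z, the positive critical value, and whether
    H0 is rejected. Hypothetical helper for illustration only.
    """
    z_c = (sample_mean - mu0) / (sigma / sqrt(n))
    if alternative == "two-sided":
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
        reject = abs(z_c) > z_crit
    elif alternative == "less":
        z_crit = NormalDist().inv_cdf(1 - alpha)      # Z_alpha
        reject = z_c < -z_crit
    elif alternative == "greater":
        z_crit = NormalDist().inv_cdf(1 - alpha)      # Z_alpha
        reject = z_c > z_crit
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z_c, z_crit, reject


# Data of the example above: n = 16, sample mean 8.4, sigma = 3.2, H1: mu < 10
z_c, z_crit, reject = z_test(8.4, 10, 3.2, 16, 0.05, "less")
print(round(z_c, 2), round(z_crit, 3), reject)
```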

Example 8.8: Based upon a random sample of size 100 with an average of 3.4 minutes and a
standard deviation of 2.8 minutes, is the claim that the average telephone call is 4 minutes
supported at the 5% level of significance (95% confidence)?

Solution: Given: n = 100, x̄ = 3.4 min, s = 2.8 min, α = 0.05

To test: H0: μ = 4 versus HA: μ ≠ 4

Since σ is unknown the test statistic follows a t-distribution; however, since n = 100 is large
the z-statistic is used.

$$Z_c = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{3.4 - 4}{2.8/10} = -2.14$$

From the standard normal table we have −Zα/2 = −Z0.025 = −1.96.

Since the calculated value is less than the tabulated value (−2.14 < −1.96), the null hypothesis
is rejected. Therefore the average telephone call is significantly different from 4 minutes at
α = 0.05.

Example 8.9: A sample of 16 students gave an average mark of 53.8 with a standard deviation
of 5.2. Can we conclude that the population mean of marks is 50 at α = 0.05?

Solution: H0: μ = 50 versus HA: μ ≠ 50

α = 0.05 and hence α/2 = 0.025

tα/2,n−1 = t0.025,15 = 2.131.

$$t_c = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{53.8 - 50}{5.2/\sqrt{16}} = \frac{3.8}{1.3} = 2.92.$$

Since tc = 2.92 > 2.131, H0 is rejected. Therefore the population mean mark is significantly
different from 50 at α = 0.05.
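
The small-sample test of Example 8.9 can be verified with the sketch below. The name `t_statistic` is an illustrative assumption. The critical value 2.131 is entered by hand from the t-table, because the Python standard library does not provide t-distribution quantiles.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Calculated t value for testing H0: mu = mu0 with sigma unknown."""
    return (sample_mean - mu0) / (s / sqrt(n))


# Data of Example 8.9: n = 16, sample mean 53.8, s = 5.2, H0: mu = 50
t_c = t_statistic(53.8, 50, 5.2, 16)
t_table = 2.131                   # t_{0.025, 15}, read from the t-table
reject = abs(t_c) > t_table       # two sided decision rule
print(round(t_c, 2), reject)
```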

8.5.2 Hypothesis testing about the proportion

Hypothesis testing for population proportion P is carried out in the same way as hypothesis testing
for population mean when large samples and normality assumptions are fulfilled.

The test statistic is:

$$Z = \frac{\hat{P} - p_0}{\sqrt{p_0 q_0 / n}} \sim N(0, 1), \quad \text{where } q_0 = 1 - p_0 \tag{8.10}$$

The decision rule is:

a) For a two sided test Ho: P = Po versus H1: P ≠ Po, reject Ho if

Zc > Zα/2 or Zc < −Zα/2.

b) For one sided test Ho: P = Po versus H1: P > Po reject Ho if

Zcalculated > Zα

c) For one sided test Ho: P = Po versus H1: P < Po, reject Ho if

Zc < −Zα .

Example 8.10: A sales clerk in a department store claims that 60% of the shoppers entering
the store leave without making a purchase. A random sample of 50 shoppers showed that 35 of
them left without buying anything. Are these sample results consistent with the claim of the sales
clerk? Use a level of significance of 0.05.

Solution: Given: Po = 0.6, qo = 0.4, P̂ = 35/50 = 0.7, n = 50, α = 0.05, Z0.05 = 1.645

Ho: P = 0.6 Vs H1: P > 0.6

Using the z statistic, we have

$$Z_c = \frac{\hat{P} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.7 - 0.6}{\sqrt{\dfrac{0.6 \times 0.4}{50}}} = 1.44$$

Since the computed value Zc = 1.44 is less than the critical value Z0.05 = 1.645, the null
hypothesis cannot be rejected. Hence, based on this sample data we cannot reject the claim of
the sales clerk.
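
The calculation in Example 8.10 can be reproduced with the following sketch. The function name `proportion_z` is a hypothetical choice made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """Calculated Z for testing H0: P = p0 with a large sample."""
    p_hat = successes / n
    q0 = 1 - p0
    return (p_hat - p0) / sqrt(p0 * q0 / n)


# Data of Example 8.10: 35 of 50 shoppers left without buying, H1: P > 0.6
z_c = proportion_z(35, 50, 0.6)
z_crit = NormalDist().inv_cdf(1 - 0.05)  # Z_{0.05}, right sided test
reject = z_c > z_crit
print(round(z_c, 2), round(z_crit, 3), reject)
```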

8.6 Test of Association

We often encounter nominal scale data. The χ² test of association is useful for determining
whether any relationship or association exists between two nominal variables. For instance,
we might be interested in the relationship between HIV status and sex, lung cancer and smoking
habit, political affiliation and sex, etc.

When observations are classified according to two variables or attributes and arranged in a table
of rows and columns, the display is called a contingency table.

The test of association or independence uses the contingency table format. Here the variables A
and B have been classified into mutually exclusive categories. The value Oij in row i and column
j of the table shows the observed frequency falling in joint category i and j. The row and
column totals are the sums of their corresponding frequencies. The sum of the row totals (or of
the column totals) gives the grand total n, which represents the sample size. The procedure to
test the association between two variables is summarized as follows:

Step 1: State the null and alternative hypothesis

H0: There is no association or relationship between the two variables, that is, the two variables
are independent.

H1: There is association or relationship between two variables, that is, the two variables are
dependent.

Step 2: State the level of significance, α.

Step 3: Calculate the expected frequencies, Eij, corresponding to the observed frequency in row i
and column j. The expected frequencies in each cell are calculated as:

$$E_{ij} = \frac{\text{Row } i \text{ total} \times \text{Column } j \text{ total}}{\text{sample size}} = \frac{R_i \times C_j}{n}$$

Step 4: Compute the value of test-statistic:

$$\chi^2_{cal} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where Oij is the observed frequency of row i and column j and Eij is the expected frequency of row
i and column j.

Step 5: Find the critical (table) value of χ²α(df) (from Appendix..). The value of χ²α corresponds
to an area in the right tail of the distribution,

where df = (Number of rows – 1)(Number of columns – 1) = (r – 1)(c – 1)

Step 6: Compare the calculated and table values of χ². Decide whether the variables are
independent or not, using the following decision rule:

Reject H0 if χ²cal is greater than χ²α(df). Otherwise do not reject H0.

Example 8.12: The following data on the colours of eye and hair for 6800 individuals were
obtained from a source:

                     Hair colours
Eye colours    Fair     Brown    Black    Red     Total
Blue           1768     808      190      47      2813
Green          946      1387     746      43      3122
Brown          115      444      288      18      865
Total          2829     2639     1224     108     6800

Test the hypothesis that hair colour and eye colour are independently distributed
(there is no association between colour of eye and colour of hair) at the level of α = 0.01.

Solution:

1. H0: There is no association between hair colours and eye colours.

H1: There is association between hair colours and eye colours.

2. α = 0.01.

3. Calculate the expected frequencies, Eij:

$$E_{ij} = \frac{R_i \times C_j}{n}$$

$$E_{11} = \frac{2813 \times 2829}{6800} = 1170.29, \ \ldots, \ E_{14} = \frac{2813 \times 108}{6800} = 44.68$$

$$E_{31} = \frac{865 \times 2829}{6800} = 359.87, \ \ldots, \ E_{34} = \frac{865 \times 108}{6800} = 13.74$$

Therefore, the contingency table for expected frequencies is as follows:

                     Hair colours
Eye colours    Fair       Brown      Black     Red      Total
Blue           1170.29    1091.69    506.34    44.68    2813
Green          1298.84    1211.61    561.96    49.58    3122
Brown          359.87     335.70     155.70    13.74    865
Total          2829       2639       1224      108      6800

4. Calculate the test statistic:

$$\chi^2_{cal} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(1768 - 1170.29)^2}{1170.29} + \cdots + \frac{(47 - 44.68)^2}{44.68} + \cdots + \frac{(43 - 49.58)^2}{49.58} + \cdots + \frac{(18 - 13.74)^2}{13.74} = 1074.43$$

5. Critical value χ²α(df): df = (r – 1)(c – 1) = (3 – 1)(4 – 1) = (2)(3) = 6

χ²α(df) = χ²0.01(6) = 16.812

6. Since χ²cal = 1074.43 > χ²α(df) = 16.812 ⇒ Reject H0.

7. Conclusion: There is association between hair color and eye color. That is, hair color and eye
color are dependent.
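
Steps 3 to 6 of the test of association can be sketched in Python as follows, using the observed frequencies of Example 8.12. The function name `chi_square_independence` is an assumption for illustration. The critical value 16.812 is the table value quoted in the example, entered by hand.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table.

    observed is a list of rows of observed frequencies O_ij.
    Hypothetical helper for illustration only.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # E_ij = R_i * C_j / n
            chi2 += (o_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
    return chi2, df


# Observed frequencies of Example 8.12 (without the totals)
observed = [
    [1768, 808, 190, 47],
    [946, 1387, 746, 43],
    [115, 444, 288, 18],
]
chi2, df = chi_square_independence(observed)
critical = 16.812                # chi-square table value for alpha = 0.01, df = 6
print(round(chi2, 1), df, chi2 > critical)
```

Working with unrounded expected frequencies, the computed statistic may differ from the hand calculation in the last decimal place.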

Example 8.13: A study was conducted to investigate the effectiveness of bicycle safety helmets
in preventing head injury. The data consist of a random sample of 793 people involved in bicycle
accidents during a one-year period.

               Wearing Helmet
Head Injury    Yes      No
Yes            17       218
No             130      428

Based on the given information, is there association between wearing helmet and head injury?
Use 5% significance level.

Solution:

1. H0: There is no association between wearing helmet and head injury. That is, they are
independent.

H1: There is association between wearing helmet and head injury. That is, they are dependent.

2. Level of significance, α= 0.05.


3. Expected frequencies:

$$E_{ij} = \frac{R_i \times C_j}{n}$$

               Wearing Helmet
Head Injury    Yes       No
Yes            43.56     191.44
No             103.44    454.56

4. Calculate the test statistic:

$$\chi^2_{cal} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(17 - 43.56)^2}{43.56} + \frac{(218 - 191.44)^2}{191.44} + \frac{(130 - 103.44)^2}{103.44} + \frac{(428 - 454.56)^2}{454.56} = 28.26$$

Critical value χ²α(df): df = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1

χ²α(df) = χ²0.05(1) = 3.841

Since χ²cal = 28.26 > χ²α(df) = 3.841 ⇒ Reject H0

5. Conclusion: There is association between wearing helmet and head injury. That is, they are
dependent.

Activity 8.4:

1. Under what condition do we use chi-square test of association? Explain briefly.

2. Discuss the steps used to test independence.

Exercise 8

1. A sample of size 25 is taken from a normal population with standard deviation 4.2, and the
sample mean is 32. Find a 99% confidence interval for the population mean.
2. A sample from an assumed normal distribution produced the values 9, 14, 10, 12, 7, 13, 12.
a) What is the single best estimate of μ? b) Find an 80% C.I. for μ.
3. Out of a sample of 80 customers, 60 reply that they are satisfied with the service they
received. Calculate a 95% confidence interval for the proportion of satisfied customers.
4. The manager claims that the average content of juice per bottle is less than 50 cl. The machine
operator disagrees. A sample of 100 bottles yields an average content of 49 cl per bottle. Does
this sample allow the manager to claim he is right (5% significance level)? Assume that the
population standard deviation is σ = 5 cl.

5. According to the norms established for a reading comprehension test, students should average
84. If 45 randomly selected students averaged 87.8 with a s.d of 8.6, test the null hypothesis
  84 against the alternative   84 , at   0.01 .

6. Assuming normality, perform a test for each of the following hypotheses:

a) H 0 :   55 Vs H A :   55,   0.01, n  25, x  50, s  10 .

b) H 0 :   327 Vs H A :   327 ,   0.10, n  9, x  329 .3, s  3 .

7. In a study of aviophobia, a psychologist claims that 30% of all women are afraid of flying. If,
in a random sample, 41 of 150 women are afraid of flying, test the null hypothesis p = 0.30
against H A : p  0.30 , at   0.05 .

CHAPTER NINE

SIMPLE LINEAR REGRESSION AND CORRELATION


After completing the topic, the students will be able to:

• Determine the relationship between variables.
• Find the fitted regression line of the two variables.
• Draw and describe a scatter diagram.
• Interpret the slope and intercept of the fitted regression line.
• Calculate and interpret the correlation coefficient.
• Find and interpret the coefficient of determination.
• Calculate and interpret explained and unexplained variations.
• Calculate and interpret Spearman's correlation coefficient.

9.1 Introduction

The statistical methods discussed so far are used to analyze the data involving only one variable.
Often an analysis of data concerning two or more variables is needed to look for any statistical
relationship or association between them. Thus, regression and correlation analysis are helpful in
ascertaining the probable form of the relationship between variables and the strength of the
relationship.

9.2 Fitting Simple Linear Regression

Regression analysis is the statistical method that helps to formulate an algebraic relationship
between two or more variables in the form of an equation. It can be used for assessment of
association, for estimation and prediction.

The variable whose value is estimated using the algebraic equation is called dependent or response
variable (usually denoted by Y) and the variable whose value is used as the basis for the estimate
is called independent or predictor variable (usually denoted by X). The linear algebraic equation
used for expressing a dependent variable in terms of independent variable is called linear
regression equation.

A simple regression model deals with the relationship between a dependent variable Y and only one
independent variable X. If the relationship between the two variables is a straight line, it is
known as simple linear regression. If more than one independent variable is associated with a
dependent variable, the regression model is called a multiple regression model.

Examples of simple regression:

• Relationship between the height of fathers and their sons.
• Relationship between fertilizer application and yield.
• Relationship between blood pressure and age.
• Relationship between the concentration of injected drug and heart beat rate.

In all such cases, to analyze the relationship between two variables, we use a statistical technique
called simple regression analysis.

The first step in regression analysis involving two variables is to construct a scatter plot
(diagram) of the observed data. A scatter diagram is a plot of all ordered pairs (Xi, Yi) on the
coordinate plane. It is a quick, at-a-glance method of determining an apparent relationship
between two variables, if any.

Under simple linear regression of Y on X, we have one independent variable, usually denoted by X,
which influences one dependent variable, denoted by Y. Real world examples of variables that may
be linearly related are production/yield (Y) and amount of rainfall (X), or monthly income (Y) and
level of education (X), where an increase in one variable is associated with an increase in the
other variable. Similar examples can also be given for a negative relation between two variables,
where an increase in one is accompanied by a decrease in the other.

A simple linear regression model is given as

$$Y = \alpha + \beta X + \varepsilon \tag{9.1}$$

where α is the intercept of the regression line. It gives the value of Y when X is zero. If the
range of X does not include zero, α has no practical interpretation. β is the slope. It is a
measure of the rate of change: it shows by how much Y changes for every unit change in X. The sign
of β is also significant, because it shows the direction of the relation between the two
variables. A positive sign of β shows that the two variables are positively related and a negative
sign of β shows that the two variables are negatively related.

The constants α and β are parameters and are commonly referred to as regression coefficients.

ε is a random error term. It is neither observable nor measurable. In real life problems, even
though two variables are linearly related, their relationship is not fixed as

$$Y = \alpha + \beta X \tag{9.2}$$

This is because the dependent variable Y is the effect of many independent variables, of which X
is only one. The contribution of the other independent variables not considered in the model may
be minor. However, we cannot be certain that Y depends only on X. Thus the contribution of the
variables not included in the model, and of other factors such as measurement error, is
accommodated by ε.

The mean of the values of ε is zero. Some of its values are positive, namely when the actual
value of Y lies above the fitted line Ŷ = α̂ + β̂Xi, and some are negative, namely when the actual
value of Y lies below the fitted regression line.

Assumptions:

1. The relationship between the dependent variable Y and the independent variable X exists and is
linear.

2. For every value of the independent variable X, there is an expected value of the dependent
variable Y.

3. The dependent variable Y is a continuous random variable, whereas values of the independent
variable X are fixed values.

4. The random error ε associated with the expected value of the dependent variable Y is assumed
to be an independent random variable distributed normally with mean 0 and constant variance σ²
about the regression line.

To estimate this model we take a sample of n independent observations, which give rise to n pairs
(Xi, Yi), and find the best estimates of the parameters (the best fitted line) using the least
squares method of estimation. A best fitting line is one for which the sum of squares of the
errors, Σεi², is minimum.

Under the least squares principle, one selects α̂ and β̂ such that

$$\sum \varepsilon_i^2 = \sum (Y_i - \hat{Y}_i)^2 \ \text{ is minimum, where } \ \hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$$

To minimize this function, we first take the partial derivatives of Σεi² with respect to α̂ and β̂
respectively. The partial derivatives are then equated to zero separately, which results in the
following normal equations respectively:

$$\sum_{i=1}^{n} Y_i = n\hat{\alpha} + \hat{\beta}\sum_{i=1}^{n} X_i$$

$$\sum_{i=1}^{n} X_i Y_i = \hat{\alpha}\sum_{i=1}^{n} X_i + \hat{\beta}\sum_{i=1}^{n} X_i^2$$

Solving these normal equations simultaneously, we get the values of α̂ and β̂ as follows:

$$\hat{\beta} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{n\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \quad \text{and} \quad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} \tag{9.3}$$

These estimates are denoted by α̂ and β̂. The estimated (fitted) regression line is given by:

$$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$$

Before estimating the regression coefficients, it would be wise to plot the observed data on a graph
known as a scatter diagram. Scatter diagram is a plot of all ordered pairs (xi ,yi ) on the coordinate
plane which helps to observe relationship between two variables. This diagram gives a preliminary
idea on the type of relationship the two variables have.

Regression analysis is useful in predicting the value of one variable from the given value of
another variable, using Ŷi = α̂ + β̂Xi.

Example 9.1: For the following example [the number of hours (X) a student spent studying and
the marks (Y) each student received in an examination]:

Assuming simple linear relationship between X and Y,

a/ Draw the scatter diagram;

b/ Find the estimated regression equation of Y on X;

c/ Give the predicted value of Y for X= 12

Solution: a) The scatter diagram is as follows:

[Figure: Scatter diagram for number of hours studied (X) and marks obtained (Y) by 10 students. Horizontal axis: hours spent (0 to 20); vertical axis: marks obtained (0 to 100).]

b) The necessary statistics are computed below:

$$\hat{\beta} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{7034 - (10)(9.7)(64.8)}{1149 - (10)(9.7)^2} = \frac{748.4}{208.1} = 3.596$$

and

$$\hat{\alpha} = 64.8 - 3.596(9.7) = 29.92.$$

Hence, the equation is $\hat{y} = 29.92 + 3.596x$.


c) When X = 12, $\hat{y}_{12} = 29.92 + 3.596(12) = 73.1$.
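The arithmetic in parts (b) and (c) can be re-checked from the summary statistics quoted in the solution (n = 10, Σxy = 7034, Σx² = 1149, x̄ = 9.7, ȳ = 64.8). The sketch below uses only those quoted figures; the variable names are chosen here for illustration.

```python
# Summary statistics as quoted in the solution to Example 9.1.
n = 10
sum_xy = 7034
sum_x2 = 1149
x_bar = 9.7
y_bar = 64.8

# Slope and intercept from formula 9.3.
beta_hat = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
alpha_hat = y_bar - beta_hat * x_bar

# Predicted mark for a student who studies 12 hours.
y_hat_12 = alpha_hat + beta_hat * 12

print(round(beta_hat, 3), round(alpha_hat, 2), round(y_hat_12, 1))
```

Carrying full precision through the calculation and rounding only at the end gives the same rounded values as the worked solution.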

Activity 9.1

1. Explain the concept of regression and discuss its importance.

2. What is the use of scatter diagram? How do we plot it?

3. How do you interpret the regression coefficients?

9.3 The covariance and the correlation coefficient

Given the paired data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, we may want to describe the type and strength of the relationship between the independent variable X and the dependent variable Y. Both can be summarized by an index called the simple correlation coefficient. The population correlation coefficient is represented by ρ and its estimator by r. The correlation coefficient r is also called Pearson's correlation coefficient, since it was developed by Karl Pearson. The computational formula is:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \qquad (9.4)$$

Alternatively, the correlation coefficient is given by

$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)}} \qquad (9.5)$$

The correlation coefficient r always lies between -1 and +1, inclusive.

• r = -1 implies a perfect negative linear relationship between the two variables.

• r = +1 implies a perfect positive linear relationship between the two variables.

• r = 0 implies there is no linear relationship between the two variables. The two variables may, however, have a non-linear relationship.

• r close to +1 indicates a strong positive linear relationship between the two variables.


• r close to -1 indicates a strong negative linear relationship between the two variables.

• r close to 0 indicates a weak linear relationship between the two variables.
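These interpretation rules can be checked numerically. The sketch below implements formula 9.4 directly; the function name `pearson_r` and the three small data sets are made up for illustration and do not come from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from formula 9.4."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data sets chosen to show the limiting cases.
r_up = pearson_r([1, 2, 3], [2, 4, 6])                    # points on a rising line
r_down = pearson_r([1, 2, 3], [6, 4, 2])                  # points on a falling line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # y = x^2, symmetric about 0

print(r_up, r_down, r_curve)
```

The third data set follows y = x² exactly, yet its correlation coefficient is zero, which illustrates that r = 0 rules out only a linear relationship.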

Coefficient of Determination ($r^2$)

The square of the correlation coefficient, $r^2$, is called the coefficient of determination. It measures the proportion of the variation in the dependent variable Y that is explained by the simple linear regression of Y on X.

$1 - r^2$ measures the proportion of the variation in Y not explained by the simple linear regression of Y on X.

Example 9.2: If r = 0.9, then $r^2$ = 0.81 and $1 - r^2$ = 0.19. That is, 81% of the variation in the dependent variable Y is explained by the simple linear regression of Y on X. The remaining 19% ($1 - r^2$) of the variation in Y is unexplained by the simple linear regression of Y on X.

Example 9.3: The research director of the Saving and Loan Bank collected 25 observations of mortgage interest rates X and the number of house sales Y at each interest rate. The director computed that

$\sum x_i = 125$, $\sum y_i = 100$, $\sum x_i y_i = 520$, $\sum x_i^2 = 650$, $\sum y_i^2 = 436$.

Compute and interpret (i) Coefficient of correlation.

(ii) The coefficient of determination.

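Solution: i) Coefficient of correlation. Here $\bar{x} = 125/25 = 5$ and $\bar{y} = 100/25 = 4$, so

$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)}} = \frac{520 - (25)(5)(4)}{\sqrt{\left(650 - 25(5)(5)\right)\left(436 - 25(4)(4)\right)}} = \frac{20}{\sqrt{(25)(36)}} = 0.667$$

Since r is positive, the two variables have a positive linear relationship.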

ii) Coefficient of determination: $r^2 = (0.667)^2 = 0.44$. This shows that 44% of the variation in the number of house sales is explained by its linear relationship with the interest rate.
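The figures in Example 9.3 can be verified from the five sums supplied by the research director. This is a minimal sketch using formula 9.5; the variable names are chosen here for illustration.

```python
from math import sqrt

# Sums given in Example 9.3.
n = 25
sum_x, sum_y = 125, 100
sum_xy, sum_x2, sum_y2 = 520, 650, 436

x_bar = sum_x / n   # 5.0
y_bar = sum_y / n   # 4.0

# Formula 9.5: correlation coefficient from raw sums.
numerator = sum_xy - n * x_bar * y_bar
denominator = sqrt((sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2))
r = numerator / denominator

# Coefficient of determination.
r_squared = r ** 2

print(round(r, 3), round(r_squared, 2))
```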


9.4 Spearman’s Rank Correlation Coefficient

The simple correlation coefficient (r) cannot be used when we are dealing with qualitative data such as judgments about beauty, efficiency, honesty, etc. In such cases, the rank correlation coefficient is used to measure the correlation, that is, whether there is agreement in the rankings. It is denoted by $r_s$ and is defined as follows:

Definition: The coefficient of rank correlation, $r_s$, given by Spearman for n pairs is

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} \qquad (9.6)$$

where d is the difference between the rank of x and the rank of the corresponding y.

To calculate $r_s$, we first rank the x's among themselves from least to best or from best to least; then we rank the y's in the same way and find the sum of the squares of the differences, d, between the ranks of the x's and the y's. When there are ties in rank, we assign to each of the tied observations (those having equal value) the mean of the ranks they jointly occupy.
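The ranking step, including the mean-rank rule for ties, can be sketched as follows. The function names `average_ranks` and `spearman_rs` and the sample values are hypothetical, introduced only for illustration. Note also that formula 9.6 is exact when there are no ties; applying it to mean ranks, as done here, is a common approximation rather than something stated in the text.

```python
def average_ranks(values):
    """Rank values from least (rank 1) upward; ties share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend 'end' over the run of observations tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end (0-based) correspond to ranks start+1..end+1.
        mean_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rs(xs, ys):
    """Spearman's rank correlation from formula 9.6, applied to average ranks."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical scores with one tie: the two 20s share ranks 2 and 3.
tied_ranks = average_ranks([10, 20, 20, 30])
print(tied_ranks)
```

Ranking from best to least instead only reverses both sets of ranks, so the differences d change sign and $r_s$ is unchanged.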

Example 9.4: Assume that ten girls in a beauty contest for Miss Mekdela Amba were ranked by
two judges as follows:

Girl Number 1 2 3 4 5 6 7 8 9 10
Judge A 4 8 6 7 1 3 2 5 10 9
Judge B 3 9 6 5 1 2 4 7 8 10

Calculate rs and interpret it.

Solution: Since the ranks are given, we need to find only the difference in ranks for each girl
and the square of these differences.

Girl Number 1 2 3 4 5 6 7 8 9 10 Total
d 1 -1 0 2 0 1 -2 -2 2 -1 0
d² 1 1 0 4 0 1 4 4 4 1 20


For these n = 10 pairs, d 2


 20 , and rs = 1  6(20)
10(100  1)
 0.88 , which is positive and close to 1,

showing that there is a very good agreement (or concordance) between the two judges regarding
the beauty of the girls.

N.B.: $\sum d = 0$ provides a check on the calculations.
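The calculation in Example 9.4, together with the check that the differences sum to zero, can be reproduced from the two judges' rankings. This is a minimal sketch; the variable names are chosen here for illustration, and d is taken as Judge A's rank minus Judge B's rank.

```python
# Ranks given by the two judges to girls 1 through 10 (Example 9.4).
judge_a = [4, 8, 6, 7, 1, 3, 2, 5, 10, 9]
judge_b = [3, 9, 6, 5, 1, 2, 4, 7, 8, 10]

# Rank differences d for each girl.
d = [a - b for a, b in zip(judge_a, judge_b)]
n = len(d)

sum_d = sum(d)                     # should be 0 as a check
sum_d2 = sum(v * v for v in d)     # sum of squared differences

# Formula 9.6.
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d, sum_d2, round(r_s, 2))
```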


Like the values of r, the values of $r_s$ also lie between -1 and +1, inclusive, and the interpretations of its size and sign are analogous to those of r: $r_s = +1$ indicates perfect positive agreement, and $r_s = -1$ indicates complete disagreement, where the two rankings run in exactly opposite directions.

Exercise 9

1. What is a scatter diagram? What is the advantage of a scatter diagram?

2. What is the coefficient of determination?

3. Based on the following data, answer the questions below.

Sales 15 18 25 27 30 35
Advertising expenditure 50 65 82 95 110 120
a. Decide which variable should be the independent variable and which should be the dependent
variable.

b. Make a scatter plot of the data.

c. Does it appear from inspection that there is a relationship between the variables?

d. Calculate the least squares line. Put the equation in the form $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$.

e. Find and interpret the correlation coefficient.

f. What is the slope of the least squares (best-fit) line? Interpret the slope.

4. The table below gives each planet's distance from the sun and the time it takes the planet to complete its orbit around the sun.


Planet Distance from the sun (millions of miles), x Years to complete the orbit, y
Mercury 36 0.24
Venus 67 0.62
Earth 93 1.00
Mars 142 1.88
Jupiter 483 11.9
Saturn 887 29.5
Uranus 1785 84.0
Neptune 2797 165.0
Pluto 3675 248.0
a. Make a scatter plot of the data. Does it appear from inspection that there is a relationship between
the variables?

b. Calculate the least squares line. Put the equation in the form $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$.

c. Find and interpret the correlation coefficient.

d. What is the slope of the least squares (best-fit) line? Interpret the slope.

e. Find the estimated number of years to complete the orbit if the distance is 1000 million miles.

5. The following data give the number of cigarettes consumed (in billions), x, and the number of cigarettes exported from the same country, y.

X 525 510 500 485 486 487 486
Y 164 179 206 196 220 231 244

Compute and interpret:

a) The least squares line, in the form $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$.

b) The Coefficient of correlation, r.

c) The coefficient of determination, r2


Table A. Approximate values of the standard normal distribution function (i.e., the area under the standard normal curve between Z = 0 and Z = z, that is, P(0 ≤ Z ≤ z)):

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
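Entries of Table A can be reproduced numerically, since the area under the standard normal curve between 0 and z equals 0.5·erf(z/√2), where erf is the error function. The sketch below uses Python's standard `math.erf`; the function name `area_zero_to_z` is hypothetical and introduced only for illustration.

```python
from math import erf, sqrt


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z, i.e. P(0 <= Z <= z)."""
    return 0.5 * erf(z / sqrt(2))


# Look up the entries for z = 1.00 and z = 1.96, rounded to four decimals
# to match the table's precision.
area_100 = round(area_zero_to_z(1.00), 4)
area_196 = round(area_zero_to_z(1.96), 4)
print(area_100, area_196)
```

To read the same values from the table, take the row for the first decimal of z (for example 1.9) and the column for the second decimal (0.06).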


df\α 0.1 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
1 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578
2 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600
3 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869
6 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689
28 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660
30 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
Infinity 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290
Table B. t-table with right tail probabilities


Table C. Right tail areas for the Chi-square Distribution

df\area 0.995 0.99 0.975 0.95 0.9 0.25 0.1 0.05 0.025 0.01 0.005
1 0.000 0.000 0.001 0.004 0.016 1.323 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 2.773 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 4.108 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 5.385 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 6.626 9.236 11.071 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 7.841 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 9.037 12.017 14.067 16.013 18.475 20.278
8 1.344 1.647 2.180 2.733 3.490 10.219 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 11.389 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 12.549 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 13.701 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 14.845 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 15.984 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 17.117 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 18.245 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 19.369 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 20.489 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 21.605 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 22.718 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 23.828 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 24.935 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 26.039 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 27.141 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 28.241 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 29.339 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 30.435 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 31.528 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 32.620 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 33.711 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 34.800 40.256 43.773 46.979 50.892 53.672


8. Sullivan Michael, iii, Statistics: informed decision using data: 2004, New Jersey.
