QM Merged

Uploaded by

chandanlata8389
Copyright
© © All Rights Reserved

QUANTITATIVE METHODS

Business Statistics
Unit-1
Defining & Collecting Data

Learning Objectives

In this Unit you will learn:

• The types of variables used in statistics
• The measurement scales of variables
• How to collect data
• The different ways to collect a sample
• About the types of survey errors

Basic Business Statistics, 10/e © 2006 Prentice Hall, Inc.



Types of Variables

• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”
• Numerical (quantitative) variables have values that represent quantities.
  • Discrete variables arise from a counting process
  • Continuous variables arise from a measuring process

Types of Variables

Variables
• Categorical (defined categories)
  Examples: Marital Status, Political Party, Eye Color
• Numerical
  • Discrete (counted items)
    Examples: Number of Children, Defects per hour
  • Continuous (measured characteristics)
    Examples: Weight, Voltage


Levels of Measurement

A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical Variables          Categories
Personal Computer Ownership    Yes / No
Type of Stocks Owned           Growth / Value / Other
Internet Provider              Reliance Jio, Airtel, Excitel, ACT

Levels of Measurement (con’t.)

An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical Variable              Ordered Categories
Student class designation         Fresher, Graduate, Junior, Senior
Product satisfaction              Satisfied, Neutral, Unsatisfied
Faculty rank                      Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings    AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                    A, B, C, D, F


Levels of Measurement (con’t.)

• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.

Interval and Ratio Scales


Establishing A Business Objective Focuses Data Collection

Examples Of Business Objectives:
• A marketing research analyst needs to assess the effectiveness of a new television advertisement.
• A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use.
• An operations manager wants to monitor a manufacturing process to find out whether the quality of the product being manufactured is conforming to company standards.
• An auditor wants to review the financial transactions of a company in order to determine whether the company is in compliance with generally accepted accounting principles.

Sources of Data

• Primary Sources: The data collector is the one using the data for analysis
  • Data from a political survey
  • Data collected from an experiment
  • Observed data
• Secondary Sources: The person performing data analysis is not the data collector
  • Analyzing census data
  • Examining data from print journals or data published on the internet


Examples Of Data Distributed By Organizations or Individuals

• Financial data on a company provided by investment services.
• Industry or market data from market research firms and trade associations.
• Stock prices, weather conditions, and sports statistics in daily newspapers.

Examples of Data From A Designed Experiment

• Consumer testing of different versions of a product to help determine which product should be pursued further.
• Material testing to determine which supplier’s material should be used in a product.
• Market testing on alternative product promotions to determine which promotion to use more broadly.


Examples of Survey Data

• Political polls of registered voters during political campaigns.
• People being surveyed to determine their satisfaction with a recent product or service experience.

Examples of Data Collected From Observational Studies

• Market researchers utilizing focus groups to elicit unstructured responses to open-ended questions.
• Measuring the time it takes for customers to be served in a fast food establishment.
• Measuring the volume of traffic through an intersection to determine if some form of advertising at the intersection is justified.


Examples of Data Collected From Ongoing Business Activities

• A bank studies years of financial transactions to help it identify patterns of fraud.
• Economists utilize data on searches done via Google to help forecast future economic conditions.
• Marketing companies use tracking data to evaluate the effectiveness of a web site.

Data Is Collected From Either A Population or A Sample

POPULATION
A population consists of all the items or individuals about which you want to draw a conclusion. The population is the “large group.”

SAMPLE
A sample is the portion of a population selected for analysis. The sample is the “small group.”


Population vs. Sample

Population: All the items or individuals about which you want to draw conclusion(s)
Sample: A portion of the population of items or individuals

Data Cleaning Is Often A Necessary Activity When Collecting Data

• You will often find “irregularities” in the data
  • Typographical or data entry errors
  • Values that are impossible or undefined
  • Missing values
  • Outliers
• When found, these irregularities should be reviewed
• Many statistical software packages will handle irregularities in an automated fashion (Excel does not)


Sampling

Sampling means the selection of a part of the aggregate with a view to drawing some statistical information about the whole. The aggregate under investigation is called the population, and the selected part is called the sample. A population is finite or infinite according to its size, i.e. its number of members.
The main objective of sampling is to obtain the maximum information about the population.

Types of Samples

Samples
• Non-Probability Samples
  • Judgment
  • Convenience
• Probability Samples
  • Simple Random
  • Systematic
  • Stratified
  • Cluster


Types of Samples:
Nonprobability Sample

• In a nonprobability sample, items included are chosen without regard to their probability of occurrence.
  • In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.
  • In a judgment sample, you get the opinions of pre-selected experts in the subject matter.

Types of Samples:
Probability Sample

• In a probability sample, items in the sample are chosen on the basis of known probabilities.

Probability Samples: Simple Random, Systematic, Stratified, Cluster


Probability Sample:
Simple Random Sample

• In this type of sampling, every unit of the population has an equal chance of being selected in a sample. There are two ways of drawing a simple random sample: with replacement (WR) and without replacement (WOR).
In the WR type, the drawn unit is returned to the population, so the size of the population remains the same before each drawing. In the WOR type, the drawn unit is not returned to the population. For a finite population, the size diminishes as the sampling process continues.
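
The two ways of drawing described above can be sketched in Python. This is only an illustration: the function name `simple_random_sample`, the 40-unit population, and the seed are assumptions, not part of the slides.

```python
import random

def simple_random_sample(population, n, replacement=False, seed=None):
    """Draw n units so that every unit has an equal chance of selection."""
    rng = random.Random(seed)
    if replacement:
        # WR: each drawn unit is returned, so the population size never changes
        # and the same unit may be drawn more than once.
        return [rng.choice(population) for _ in range(n)]
    # WOR: a drawn unit is not returned, so it cannot be drawn again.
    return rng.sample(population, n)

population = list(range(1, 41))  # hypothetical population of N = 40 units
wor = simple_random_sample(population, 4, replacement=False, seed=1)
wr = simple_random_sample(population, 4, replacement=True, seed=1)
```

Without replacement, the four selected units are always distinct; with replacement, repeats are possible.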

Probability Sample:
Systematic Sampling

• In systematic sampling, one unit is chosen at random from the population and the remaining items are selected regularly at predetermined intervals. This method works well compared with simple random sampling, provided there is no deliberate attempt to change the sequence of the units in the population.


Probability Sample:
Systematic Sample
• Decide on sample size: n
• Divide population of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the 1st group
• Select every kth individual thereafter

Example: N = 40, n = 4, k = 10 (the first group is the first 10 individuals)
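
The steps above can be sketched in Python for the N = 40, n = 4 example. The function name `systematic_sample`, the labels 1 to 40, and the seed are illustrative assumptions; the sketch also assumes that n divides N evenly.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick one unit at random from the first group of k, then every kth unit after it."""
    N = len(population)
    k = N // n  # interval k = N / n (assumes n divides N)
    start = random.Random(seed).randrange(k)  # random position inside the first group
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 41))  # N = 40 individuals labelled 1..40
sample = systematic_sample(population, n=4, seed=0)
```

Whatever the random start, consecutive selected individuals are exactly k = 10 positions apart.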

Probability Sample:
Stratified Sampling

• Here the population is subdivided into several parts, called strata, each showing homogeneity of its items, and then a sub-sample is selected from each of the strata. All the sub-samples combined together give the stratified sample. It is useful when the population is heterogeneous.


Probability Sample:
Stratified Sample
• Divide population into two or more subgroups (called strata) according to some common characteristic
• A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes
• Samples from subgroups are combined into one
• This is a common technique when sampling population of voters, stratifying across racial or socio-economic lines.

[Figure: population divided into 4 strata]
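
A minimal sketch of proportional allocation follows. The four strata of sizes 40, 30, 20 and 10, the stratum names, and the function name `stratified_sample` are hypothetical and chosen only for illustration.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation; with awkward sizes, rounding can shift the total slightly.
        n_h = round(n * len(units) / N)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 100 units split into 4 strata
sizes = {"A": 40, "B": 30, "C": 20, "D": 10}
strata = {name: [f"{name}{i}" for i in range(size)] for name, size in sizes.items()}
sample = stratified_sample(strata, n=20, seed=0)
sample_sizes = {name: len(units) for name, units in sample.items()}
```

With a total sample of 20, the strata contribute 8, 6, 4 and 2 units, mirroring their 40/30/20/10 shares of the population.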

Probability Sample:
Cluster Sampling

• When the population consists of certain groups or clusters of units, it may be advantageous and economical to select a few clusters of units and then examine all the units in the selected clusters.
For example, when certain goods are packed in cartons and repacking is costly, it is advisable to select only a few cartons and inspect all the goods inside.


Probability Sample:
Cluster Sample
• Population is divided into several “clusters,” each representative of the population
• A simple random sample of clusters is selected
• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
• A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.

[Figure: population divided into 16 clusters; randomly selected clusters form the sample]
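
The idea can be sketched in Python with 16 clusters, as in the figure. The cluster contents (5 items per carton), the choice of 4 clusters, and the function name `cluster_sample` are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly select m whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # simple random sample of cluster labels
    units = [u for c in chosen for u in clusters[c]]
    return chosen, units

# Hypothetical population: 16 cartons (clusters) holding 5 items each
clusters = {c: [f"carton{c}-item{i}" for i in range(5)] for c in range(16)}
chosen, units = cluster_sample(clusters, m=4, seed=0)
```

Only the cluster labels are randomized here; every item in a chosen carton is inspected, which is what makes the method economical.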

Probability Sample:
Comparing Sampling Methods

• Simple random sample and Systematic sample
  • Simple to use
  • May not be a good representation of the population’s underlying characteristics
• Stratified sample
  • Ensures representation of individuals across the entire population
• Cluster sample
  • More cost effective
  • Less efficient (need larger sample to acquire the same level of precision)


Evaluating Survey Worthiness

• What is the purpose of the survey?
• Is the survey based on a probability sample?
• Coverage error – appropriate Population?
• Non response error – follow up
• Measurement error – good questions elicit good responses
• Sampling error – always exists

Sampling Error

The analysis of the sample is done to obtain an idea of the probability distribution of the variable in the population.

Even when a proper sampling process is applied, the sample may not represent the characteristics of the population exactly. This discrepancy is called sampling error.


Types of Survey Errors

• Coverage error or selection bias
  • Exists if some groups are excluded from the Population and have no chance of being selected
• Non response error or bias
  • People who do not respond may be different from those who do respond
• Sampling error
  • Variation from sample to sample will always exist
• Measurement error
  • Due to weaknesses in question design, respondent error, and interviewer’s effects on the respondent (“Hawthorne effect”)

Types of Survey Errors

• Coverage error: excluded from frame
• Nonresponse error: follow up on nonresponses
• Sampling error: random differences from sample to sample
• Measurement error: bad or leading question


Unit-1 Summary

In this Unit we have discussed:

• The types of variables used in statistics
• The measurement scales of variables
• How to collect data
• The different ways to collect a sample
• The types of survey errors

Business Statistics

Unit-2
Organizing and Visualizing Data


Learning Objectives of Unit-2

In this Unit you will learn:

• To construct tables and charts for categorical data
• To construct tables and charts for numerical data
• The principles of properly presenting graphs
• To organize and analyze many variables

Categorical Data Are Organized By Utilizing Tables

Categorical Data: Tallying Data
• One Categorical Variable: Summary Table
• Two Categorical Variables: Contingency Table


Organizing Categorical Data:
Summary Table
• A summary table tallies the frequencies or percentages of items in a set of categories so that you can see differences between categories.

Summary Table From A Survey of 1000 Banking Customers

Banking Preference?                  Percent
ATM                                    16%
Automated or live telephone             2%
Drive-through service at branch        17%
In person at branch                    41%
Internet                               24%

A Contingency Table Helps Organize Two or More Categorical Variables

• Used to study patterns that may exist between the responses of two or more categorical variables
• Cross tabulates or tallies jointly the responses of the categorical variables
• For two variables the tallies for one variable are located in the rows and the tallies for the second variable are located in the columns


Contingency Table - Example

• A random sample of 400 invoices is drawn.
• Each invoice is categorized as a small, medium, or large amount.
• Each invoice is also examined to identify if there are any errors.
• These data are then organized in the contingency table below.

Contingency Table Showing Frequency of Invoices Categorized By Size and The Presence Of Errors

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400

Contingency Table Based On Percentage Of Overall Total

                 No Errors   Errors    Total
Small Amount       42.50%     5.00%    47.50%
Medium Amount      25.00%    10.00%    35.00%
Large Amount       16.25%     1.25%    17.50%
Total              83.75%    16.25%   100.0%

Example calculations: 42.50% = 170 / 400; 25.00% = 100 / 400; 16.25% = 65 / 400

83.75% of sampled invoices have no errors and 47.50% of sampled invoices are for small amounts.


Contingency Table Based On Percentage of Row Totals

                 No Errors   Errors    Total
Small Amount       89.47%    10.53%   100.0%
Medium Amount      71.43%    28.57%   100.0%
Large Amount       92.86%     7.14%   100.0%
Total              83.75%    16.25%   100.0%

Example calculations: 89.47% = 170 / 190; 71.43% = 100 / 140; 92.86% = 65 / 70

Medium invoices have a larger chance (28.57%) of having errors than small (10.53%) or large (7.14%) invoices.

Contingency Table Based On Percentage Of Column Totals

                 No Errors   Errors    Total
Small Amount       50.75%    30.77%    47.50%
Medium Amount      29.85%    61.54%    35.00%
Large Amount       19.40%     7.69%    17.50%
Total             100.0%    100.0%    100.0%

Example calculations: 50.75% = 170 / 335; 30.77% = 20 / 65

There is a 61.54% chance that invoices with errors are of medium size.
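
The three percentage views of the invoice table (overall total, row totals, column totals) can be reproduced from the raw counts. The sketch below uses plain Python dictionaries; the variable and function names are illustrative only.

```python
# Invoice counts from the contingency table: size of invoice by presence of errors
counts = {
    "Small":  {"No Errors": 170, "Errors": 20},
    "Medium": {"No Errors": 100, "Errors": 40},
    "Large":  {"No Errors": 65,  "Errors": 5},
}
columns = ["No Errors", "Errors"]

row_totals = {r: sum(cells.values()) for r, cells in counts.items()}
col_totals = {c: sum(counts[r][c] for r in counts) for c in columns}
grand_total = sum(row_totals.values())

def pct(part, whole):
    """Percentage rounded to two decimals, as shown on the slides."""
    return round(100 * part / whole, 2)

overall = {r: {c: pct(counts[r][c], grand_total) for c in columns} for r in counts}
by_row = {r: {c: pct(counts[r][c], row_totals[r]) for c in columns} for r in counts}
by_col = {r: {c: pct(counts[r][c], col_totals[c]) for c in columns} for r in counts}
```

The only thing that changes between the three tables is the denominator: the grand total (400), the row total, or the column total.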


Tables Used For Organizing Numerical Data

Numerical Data
• Ordered Array
• Frequency Distributions
• Cumulative Distributions

Stacked Or Unstacked Format

• This is an issue when you have a categorical variable that may be used to group your numerical variable for analysis.
• Stacked format is when your numerical variable is in one column and a second column identifies the value of the categorical variable.
• Unstacked format is when the values of the numerical variable in each group (unique value of the categorical variable) are in different columns.


Example of Stacked & Unstacked Format

Different programs and different analyses may require a specific format.

Stacked Format

Age Of Students   Day or Night Student
      16                  D
      19                  D
      22                  D
      18                  N
      23                  N
      17                  D
      19                  D
      25                  D
      18                  N
      28                  N
      17                  D
      20                  D
      27                  D
      19                  N
      32                  N
      18                  D
      20                  D
      32                  D
      19                  N
      33                  N

Unstacked Format

Age Of Day Students   Age Of Night Students
        16                     18
        19                     23
        22                     18
        17                     28
        19                     19
        25                     32
        17                     19
        20                     33
        27
        18
        20
        32
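
Converting between the two layouts is mechanical. The sketch below rebuilds the unstacked columns from the stacked rows of the example; the function name `unstack` and the tuple representation are assumptions for illustration.

```python
from collections import defaultdict

# Stacked format: one (age, day-or-night) pair per student
stacked = [
    (16, "D"), (19, "D"), (22, "D"), (18, "N"), (23, "N"),
    (17, "D"), (19, "D"), (25, "D"), (18, "N"), (28, "N"),
    (17, "D"), (20, "D"), (27, "D"), (19, "N"), (32, "N"),
    (18, "D"), (20, "D"), (32, "D"), (19, "N"), (33, "N"),
]

def unstack(rows):
    """Group the numerical values into one column per category, keeping row order."""
    columns = defaultdict(list)
    for value, group in rows:
        columns[group].append(value)
    return dict(columns)

unstacked = unstack(stacked)
```

Here `unstacked["D"]` holds the 12 day-student ages and `unstacked["N"]` the 8 night-student ages, in the order they appear in the stacked data.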

Organizing Numerical Data:
Ordered Array
• An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.
• Shows range (minimum value to maximum value)
• May help identify outliers (unusual observations)

Age of Surveyed College Students

Day Students:    16 17 17 18 18 18
                 19 19 20 20 21 22
                 22 25 27 32 38 42
Night Students:  18 18 19 19 20 21
                 23 28 32 33 41 45


Organizing Numerical Data:
Frequency Distribution

• The frequency distribution is a summary table in which the data are arranged into numerically ordered classes.
• You must give attention to selecting the appropriate number of class groupings for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.
• The number of classes depends on the number of values in the data. With a larger number of values, typically there are more classes. In general, a frequency distribution should have at least 5 but no more than 15 classes.
• To determine the width of a class interval, you divide the range (Highest value – Lowest value) of the data by the number of class groupings desired.

Organizing Numerical Data:
Frequency Distribution Example

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27


Organizing Numerical Data:
Frequency Distribution Example
• Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute class interval (width): 10 (46/5 = 9.2, then round up)
• Determine class boundaries (limits):
  • Class 1: 10 to less than 20
  • Class 2: 20 to less than 30
  • Class 3: 30 to less than 40
  • Class 4: 40 to less than 50
  • Class 5: 50 to less than 60
• Compute class midpoints: 15, 25, 35, 45, 55
• Count observations & assign to classes

Organizing Numerical Data: Frequency Distribution Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Midpoints   Frequency
10 but less than 20       15           3
20 but less than 30       25           6
30 but less than 40       35           5
40 but less than 50       45           4
50 but less than 60       55           2
Total                                 20
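
The counting step can be checked with a short Python sketch. It uses the 20 daily high temperatures from the example; the function name `frequency_distribution` is an illustrative assumption, and the starting boundary of 10 and width of 10 are the choices made on the slide.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def frequency_distribution(data, start, width, n_classes):
    """Count values in classes [start, start + width), [start + width, start + 2*width), ..."""
    counts = [0] * n_classes
    for x in data:
        counts[(x - start) // width] += 1  # index of the class that contains x
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(n_classes)]

data_range = max(temps) - min(temps)  # 58 - 12
raw_width = data_range / 5            # 9.2, rounded up to a convenient width of 10
table = frequency_distribution(temps, start=10, width=10, n_classes=5)
midpoints = [(lo + hi) // 2 for lo, hi, _ in table]
```

Each tuple in `table` is (lower boundary, upper boundary, frequency), where the upper boundary is excluded from the class, matching "10 but less than 20".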


Organizing Numerical Data: Relative & Percent Frequency Distribution Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3             0.15              15%
20 but less than 30        6             0.30              30%
30 but less than 40        5             0.25              25%
40 but less than 50        4             0.20              20%
50 but less than 60        2             0.10              10%
Total                     20             1.00             100%

Organizing Numerical Data: Cumulative Frequency Distribution Example

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20        3          15%               3                     15%
20 but less than 30        6          30%               9                     45%
30 but less than 40        5          25%              14                     70%
40 but less than 50        4          20%              18                     90%
50 but less than 60        2          10%              20                    100%
Total                     20         100%              20                    100%
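
Relative, percentage and cumulative columns all follow from the class frequencies. A minimal sketch, assuming the frequencies 3, 6, 5, 4, 2 from the example (variable names are illustrative):

```python
from itertools import accumulate

frequencies = [3, 6, 5, 4, 2]  # class frequencies from the temperature example
total = sum(frequencies)

relative = [f / total for f in frequencies]          # share of all observations
percentage = [100 * f / total for f in frequencies]  # the same share in percent
cumulative_frequency = list(accumulate(frequencies)) # running total of frequencies
cumulative_percentage = [100 * c / total for c in cumulative_frequency]
```

The last cumulative frequency always equals the number of observations, and the last cumulative percentage is always 100.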


Why Use a Frequency Distribution?

• It condenses the raw data into a more useful form
• It allows for a quick visual interpretation of the data
• It enables the determination of the major characteristics of the data set including where the data are concentrated / clustered

Frequency Distributions:
Some Tips

• Different class boundaries may provide different pictures for the same data (especially for smaller data sets)
• Shifts in data concentration may show up when different class boundaries are chosen
• As the size of the data set increases, the impact of alterations in the selection of class boundaries is greatly reduced
• When comparing two or more groups with different sample sizes, you must use either a relative frequency or a percentage distribution


Practice Problem

• The heights (in cm) of 30 persons are given below:

133 125 137 129 130 130 131 125 137 147

128 127 147 141 148 149 145 148 139 125

145 134 129 145 127 147 132 128 130 131

Prepare a frequency distribution for the above data.

Practice Problem

Here, the largest value is 149 and the smallest value is 125. Therefore,

Range = Largest value – Smallest value = 149 - 125 = 24

Let us take the number of classes to be 9.

So, Class width = range / number of classes = 24/9 ≈ 2.67, rounded up to 3


Practice Problem

Class Intervals   Frequency
125 - 127             5
128 - 130             7
131 - 133             4
134 - 136             1
137 - 139             3
140 - 142             1
143 - 145             3
146 - 148             5
149 - 151             1
Total                30
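
The tally for the practice problem can be checked by counting the 30 heights directly. This is a sketch using inclusive integer classes of width 3 starting at the smallest value; the variable names are illustrative.

```python
heights = [133, 125, 137, 129, 130, 130, 131, 125, 137, 147,
           128, 127, 147, 141, 148, 149, 145, 148, 139, 125,
           145, 134, 129, 145, 127, 147, 132, 128, 130, 131]

low, high = min(heights), max(heights)
width = 3  # 24 / 9 = 2.67, rounded up

# Inclusive classes 125-127, 128-130, ..., 149-151
classes = [(lo, lo + width - 1) for lo in range(low, high + 1, width)]
frequency = {c: sum(c[0] <= h <= c[1] for h in heights) for c in classes}
```

Counting this way gives 5 values in the class 146 - 148 (147 three times, 148 twice) and a single value, 149, in the class 149 - 151.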


Visualizing Categorical Data Through Graphical Displays

Categorical Data: Visualizing Data
• Summary Table For One Variable: Bar Chart, Pie Chart, Pareto Chart
• Contingency Table For Two Variables: Side By Side Bar Chart

Visualizing Categorical Data:
The Bar Chart
• In a bar chart, a bar shows each category, the length of which represents the amount, frequency or percentage of values falling into a category which come from the summary table of the variable.

Banking Preference?                   %
ATM                                  16%
Automated or live telephone           2%
Drive-through service at branch      17%
In person at branch                  41%
Internet                             24%

[Figure: horizontal bar chart "Banking Preference", one bar per category, horizontal axis from 0% to 45%]


Visualizing Categorical Data:
The Pie Chart
• The pie chart is a circle broken up into slices that represent categories. The size of each slice of the pie varies according to the percentage in each category.

Banking Preference?                   %
ATM                                  16%
Automated or live telephone           2%
Drive-through service at branch      17%
In person at branch                  41%
Internet                             24%

[Figure: pie chart "Banking Preference" with slices of 16% (ATM), 2% (Automated or live telephone), 17% (Drive-through service at branch), 41% (In person at branch) and 24% (Internet)]

Visualizing Categorical Data:
The Pareto Chart

• Used to portray categorical data (nominal scale)
• A vertical bar chart, where categories are shown in descending order of frequency
• A cumulative polygon is shown in the same graph
• Used to separate the “vital few” from the “trivial many”


Visualizing Categorical Data:
The Pareto Chart (con’t)

[Figure: "Pareto Chart For Banking Preference". Bar graph (left axis, % in each category, 0% to 100%) with categories in descending order: In person at branch, Internet, Drive-through service at branch, ATM, Automated or live telephone. Line graph (right axis, cumulative %, 0% to 100%).]
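
The numbers behind a Pareto chart are the category percentages sorted in descending order together with their running total. A minimal sketch using the banking preference percentages from the summary table (variable names are illustrative):

```python
from itertools import accumulate

preferences = {
    "ATM": 16,
    "Automated or live telephone": 2,
    "Drive-through service at branch": 17,
    "In person at branch": 41,
    "Internet": 24,
}

# Bars: categories in descending order of percentage
ordered = sorted(preferences.items(), key=lambda item: item[1], reverse=True)
# Line: cumulative percentage across the ordered bars
cumulative = list(accumulate(pct for _, pct in ordered))
pareto = [(name, pct, cum) for (name, pct), cum in zip(ordered, cumulative)]
```

The first two categories (in person at branch and Internet) already account for 65% of responses, which is the "vital few" reading of the chart.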

Visualizing Categorical Data:
Side By Side Bar Charts
• The side by side bar chart represents the data from a contingency table.

                 No Errors   Errors    Total
Small Amount       50.75%    30.77%    47.50%
Medium Amount      29.85%    61.54%    35.00%
Large Amount       19.40%     7.69%    17.50%
Total             100.0%    100.0%    100.0%

[Figure: side by side bar chart "Invoice Size Split Out By Errors & No Errors", with Large, Medium and Small bars for the Errors and No Errors groups, horizontal axis from 0.0% to 70.0%]

Invoices with errors are much more likely to be of medium size (61.54% vs 30.77% and 7.69%)


Visualizing Numerical Data By Using Graphical Displays

Numerical Data
• Ordered Array: Stem-and-Leaf Display
• Frequency Distributions and Cumulative Distributions: Histogram, Polygon, Ogive

Stem-and-Leaf Display

• A simple way to see how the data are distributed and where concentrations of data exist

METHOD: Separate the sorted data series into leading digits (the stems) and the trailing digits (the leaves)


Organizing Numerical Data:
Stem and Leaf Display
• A stem-and-leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.

Age of Surveyed College Students

Day Students:    16 17 17 18 18 18
                 19 19 20 20 21 22
                 22 25 27 32 38 42
Night Students:  18 18 19 19 20 21
                 23 28 32 33 41 45

Day Students          Night Students
Stem  Leaf            Stem  Leaf
1     67788899        1     8899
2     0012257         2     0138
3     28              3     23
4     2               4     15
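
The method can be sketched in a few lines of Python for two-digit values such as these ages: the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is an illustrative assumption.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves (units digits)."""
    rows = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem] += str(leaf)
    return dict(rows)

day = [16, 17, 17, 18, 18, 18, 19, 19, 20, 20, 21, 22, 22, 25, 27, 32, 38, 42]
night = [18, 18, 19, 19, 20, 21, 23, 28, 32, 33, 41, 45]

day_display = stem_and_leaf(day)
night_display = stem_and_leaf(night)
```

Printing each stem followed by its leaf string reproduces the displays for day and night students.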

Visualizing Numerical Data:
The Histogram

• A vertical bar chart of the data in a frequency distribution is called a histogram.
• In a histogram there are no gaps between adjacent bars.
• The class boundaries (or class midpoints) are shown on the horizontal axis.
• The vertical axis is either frequency, relative frequency, or percentage.
• The heights of the bars represent the frequency, relative frequency, or percentage.


Visualizing Numerical Data:
The Histogram

Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3             .15                15
20 but less than 30        6             .30                30
30 but less than 40        5             .25                25
40 but less than 50        4             .20                20
50 but less than 60        2             .10                10
Total                     20            1.00               100

[Figure: "Histogram: Age Of Students". Vertical axis: Frequency (0 to 8); horizontal axis: class midpoints 5, 15, 25, 35, 45, 55, More.]

(In a percentage histogram the vertical axis would be defined to show the percentage of observations per class)

Visualizing Numerical Data:
The Polygon

• A percentage polygon is formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.
• The cumulative percentage polygon, or ogive, displays the variable of interest along the X axis, and the cumulative percentages along the Y axis.
• Useful when there are two or more groups to compare.


Visualizing Numerical Data:
The Frequency Polygon

Class                  Class Midpoint   Frequency
10 but less than 20         15              3
20 but less than 30         25              6
30 but less than 40         35              5
40 but less than 50         45              4
50 but less than 60         55              2

[Figure: "Frequency Polygon: Age Of Students". Vertical axis: Frequency (0 to 7); horizontal axis: Class Midpoints 5, 15, 25, 35, 45, 55, 65.]

(In a percentage polygon the vertical axis would be defined to show the percentage of observations per class)

Visualizing Numerical Data:
The Ogive (Cumulative % Polygon)

Class                  Lower class boundary   % less than lower boundary
10 but less than 20            10                        0
20 but less than 30            20                       15
30 but less than 40            30                       45
40 but less than 50            40                       70
50 but less than 60            50                       90
60 but less than 70            60                      100

[Figure: "Ogive: Age Of Students". Vertical axis: Cumulative Percentage (0 to 100); horizontal axis: Lower Class Boundary 10, 20, 30, 40, 50, 60.]

(In an ogive the percentage of the observations less than each lower class boundary is plotted versus the lower class boundaries.)


Visualizing Two Numerical Variables By Using Graphical Displays

Two Numerical Variables
• Scatter Plot
• Time-Series Plot

Visualizing Two Numerical Variables: The Scatter Plot
• Scatter plots are used for numerical data consisting of paired observations taken from two numerical variables
• One variable is measured on the vertical axis and the other variable is measured on the horizontal axis
• Scatter plots are used to examine possible relationships between two numerical variables


Scatter Plot Example

Volume per day   Cost per day
      23             125
      26             140
      29             146
      33             160
      38             167
      42             170
      50             188
      55             195
      60             200

[Figure: scatter plot "Cost per Day vs. Production Volume". Vertical axis: Cost per Day (0 to 250); horizontal axis: Volume per Day (20 to 70).]

Visualizing Two Numerical Variables: The Time Series Plot

• A Time-Series Plot is used to study patterns in the values of a numeric variable over time
• The Time-Series Plot:
  • Numeric variable is measured on the vertical axis and the time period is measured on the horizontal axis


Time Series Plot Example

Year   Number of Franchises
1996          43
1997          54
1998          60
1999          73
2000          82
2001          95
2002         107
2003          99
2004          95

[Figure: time-series plot "Number of Franchises, 1996-2004". Vertical axis: Number of Franchises (0 to 120); horizontal axis: Year (1994 to 2006).]

Guidelines For Developing Visualizations
• Avoid chartjunk
• Use the simplest possible visualization
• Include a title
• Label all axes
• Include a scale for each axis if the chart contains axes
• Begin the scale for a vertical axis at zero
• Use a constant scale


Graphical Errors: Chart Junk

Bad Presentation: [Figure: "Minimum Wage", labelled 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80]
Good Presentation: [Figure: "Minimum Wage" plotted by year (1960, 1970, 1980, 1990) on a $ axis from 0 to 4]

Graphical Errors:
No Relative Basis

Bad Presentation: [Figure: "A’s received by students", vertical axis Freq. (0 to 300), categories FR, G, JR, SR]
Good Presentation: [Figure: "A’s received by students", vertical axis % (0% to 30%), categories FR, G, JR, SR]

FR = Freshmen, G = Graduate, JR = Junior, SR = Senior


Graphical Errors:
Compressing the Vertical Axis

Bad Presentation: [Figure: "Quarterly Sales", $ axis from 0 to 200, quarters Q1 to Q4]
Good Presentation: [Figure: "Quarterly Sales", $ axis from 0 to 50, quarters Q1 to Q4]

Graphical Errors: No Zero Point on the Vertical Axis

Bad Presentation: [Figure: "Monthly Sales", $ axis marked 36, 39, 42, 45 with no zero point, months J F M A M J]
Good Presentations: [Figure: "Monthly Sales", $ axis marked 0, 36, 39, 42, 45, months J F M A M J]

Graphing the first six months of sales


Unit-2 Summary

In this Unit we have:

• Constructed tables and charts for categorical data
• Constructed tables and charts for numerical data
• Examined the principles of properly presenting graphs

Business Statistics

Unit-3
Numerical Descriptive Measures


Learning Objectives
In this Unit, you will learn:
 To describe the properties of central tendency, variation, and shape in numerical data
 To calculate descriptive summary measures for a population
 To calculate the coefficient of variation and Z-scores
 To construct and interpret a box-and-whisker plot
 To calculate the covariance and the coefficient of correlation

Summary Measures
Describing Data Numerically

Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean
Quartiles
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Shape: Skewness


MEASURES OF CENTRAL TENDENCY

An average is a single value used to represent the whole series. Since the average lies somewhere between the two extremes, i.e. the largest and the smallest items, it is sometimes known as a measure of central tendency.

Measures of Central Tendency: Overview

Arithmetic Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)
Median: midpoint of ranked values
Mode: most frequently observed value
Geometric Mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \)


Arithmetic Mean For an Individual Series / Ungrouped Data

If x1, x2, ..., xn are n values of a variable, then

\[ \text{A.M.} = \frac{\text{Sum of the values}}{\text{Number of values}}, \quad \text{i.e.} \quad \bar{X} = \frac{\sum X}{n} \]

where \( \bar{X} \) = arithmetic mean
\( \sum X \) = sum of all values of the variable X
n = number of individual observations

Arithmetic Mean
 The arithmetic mean (sample mean) is the
most common measure of central tendency

 For a sample of size n:


\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]

where n is the sample size and X1, ..., Xn are the observed values


Arithmetic Mean
(continued)

 The most common measure of central tendency


 Mean = sum of values divided by the number of values
 Affected by extreme values (outliers)

Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4

Arithmetic Mean For Discrete Series / Grouped Data

A series is said to be a discrete series if each value xi has a corresponding frequency fi. It is also known as a frequency distribution and is denoted by xi / fi, i = 1, 2, ..., n


Arithmetic Mean For Discrete Series

If f1, f2, ..., fn are the frequencies corresponding to the values x1, x2, ..., xn of the variate, then

\[ \bar{X} = \frac{\sum fX}{N} \]

where \( \bar{X} \) = arithmetic mean
\( \sum fX \) = sum of the products of each frequency and the corresponding value of the variable X
\( N = \sum f \) = sum of the frequencies

Example

Calculation of the arithmetic mean for a discrete series (the solution uses the direct method, ΣfX / N)

Marks (X)             5    15   25   35   45   55
No. of Students (f)   10   20   30   50   40   30


Solution

Marks (X)   No. of Students (f)   fX
5           10                    50
15          20                    300
25          30                    750
35          50                    1750
45          40                    1800
55          30                    1650
Total       180                   6300

Mean = ΣfX / N = 6300/180 = 35
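
The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name frequency_mean is an illustrative choice, and the data are the marks and frequencies from the example.

```python
# Marks (X) and number of students (f) copied from the example above.
marks = [5, 15, 25, 35, 45, 55]
students = [10, 20, 30, 50, 40, 30]


def frequency_mean(values, freqs):
    """Return sum(f * x) / sum(f) for a discrete frequency distribution."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / sum(freqs)


mean_marks = frequency_mean(marks, students)
print(mean_marks)  # 6300 / 180
```

The same function works for a continuous series if the class mid-points are passed as the values.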

Arithmetic Mean For Continuous Series

First find the mid-point (m) of each class:

\[ m = \frac{\text{Lower limit} + \text{Upper limit}}{2} \]

Then

\[ \bar{X} = \frac{\sum fm}{N} \]

where \( \bar{X} \) = arithmetic mean
\( \sum fm \) = sum of the products of each frequency and the corresponding mid-point
\( N = \sum f \) = sum of the frequencies


Example

From the following data, calculate the arithmetic mean.

Marks             0-10   10-20   20-30   30-40   40-50   50-60
No. of Students   10     20      30      50      40      30

Solution
Marks (X)   Mid Point (m)   No. of Students (f)   fm
0-10        5               10                    50
10-20       15              20                    300
20-30       25              30                    750
30-40       35              50                    1750
40-50       45              40                    1800
50-60       55              30                    1650
Total                       180                   6300

Mean = Σfm / N = 6300/180 = 35


Median

Median is the central value of the variable that divides the series into two equal parts in such a way that half of the items lie above this value and the remaining half lie below this value.

It can be computed for both ungrouped data (individual series) and grouped data (discrete/continuous series).

Median
 In an ordered array, the median is the “middle”
number (50% above, 50% below)

[Figure: two data sets plotted on a 0 to 10 scale. Median = 3 in both.]

 Not affected by extreme values


Median For Individual Series

First arrange the data in ascending or descending order.

If the number of terms n is odd, then Median = value of the ((n+1)/2)th term.

If the number of terms n is even, then Median = A.M. of the (n/2)th and (n/2 + 1)th terms.

Example

Calculate the median for the following data.


Roll nos. 1 2 3 4 5 6
Marks 25 55 5 45 15 35


Solution

First, arrange the data in ascending order:

Marks 5 15 25 35 45 55

Here n = 6, which is an even number. Therefore,

Median = A.M. of the (n/2)th and (n/2 + 1)th terms = A.M. of the 3rd and 4th terms = (25 + 35)/2 = 30

Example

Calculate the median for the following data.


Roll nos. 1 2 3 4 5 6 7
Marks 25 55 5 45 15 35 60


Solution

First, arrange the data in ascending order:

Marks 5 15 25 35 45 55 60

Here n = 7, which is an odd number. Therefore,

Median = value of the ((n+1)/2)th term = value of the (8/2)th term = value of the 4th term = 35
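
The odd and even rules for an individual series can be sketched in Python as follows. The function name median_individual is an illustrative choice; the two calls use the marks from the two examples above.

```python
def median_individual(values):
    """Median of an individual series using the odd/even position rules."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n+1)/2)th term, which is index n//2 when counting from 0
        return ordered[mid]
    # even n: A.M. of the (n/2)th and (n/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_individual([25, 55, 5, 45, 15, 35]))      # six marks (even n)
print(median_individual([25, 55, 5, 45, 15, 35, 60]))  # seven marks (odd n)
```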

Median For Discrete Series

1. Arrange the items in ascending or descending order of size.

2. Calculate the cumulative frequencies and (N+1)/2, where N is the sum of the frequencies.

The median is the size of the item whose cumulative frequency is just greater than or equal to (N+1)/2.


Example

Calculate the median for the following data.


Marks 45 55 25 35 5 15
No. of students 40 30 30 50 10 20

Solution

Arrange the data in ascending order

Marks   No. of Students   Cumulative Frequency
5       10                10
15      20                30
25      30                60
35      50                110
45      40                150
55      30                180
Total   180

Here, N = 180 and (N+1)/2 = 90.5

The c.f. just greater than 90.5 is 110, which corresponds to marks of 35.

Hence, Median = 35


Median For Continuous Series

Calculate the cumulative frequencies and N/2. Find the class whose c.f. is just greater than or equal to N/2; this class is known as the median class. The median is then calculated as follows:

\[ \text{Median} = l + \left( \frac{N}{2} - \text{c.f.} \right) \times \frac{i}{f} \]

where l = lower limit of the median class
c.f. = cumulative frequency of the class preceding the median class
f = frequency of the median class
i = width of the median class

Example

From the following data, calculate the median


Marks 0-10 10-20 20-30 30-40 40-50 50-60
No. of Students 10 20 30 50 40 30


Solution

Marks   No. of Students   C.f.
0-10    10                10
10-20   20                30
20-30   30                60
30-40   50                110    (Median Class)
40-50   40                150
50-60   30                180
Total   180

Here, N/2 = 180/2 = 90

The c.f. just greater than 90 is 110, so the median class is 30-40, with l = 30, c.f. = 60, f = 50 and i = 10.

Median = l + (N/2 - c.f.) × (i / f) = 30 + (90 - 60) × (10/50) = 36
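
As a rough check of the interpolation formula, the sketch below locates the median class and applies Median = l + (N/2 - c.f.) × (i / f). It assumes equal class widths; the function and argument names are illustrative.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of a continuous series with equal class widths."""
    half = sum(freqs) / 2             # N/2
    cum_before = 0                    # c.f. of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= half:    # first class whose c.f. reaches N/2
            return lower + (half - cum_before) * width / f
        cum_before += f
    raise ValueError("frequency list is empty")


lower_limits = [0, 10, 20, 30, 40, 50]
students = [10, 20, 30, 50, 40, 30]
print(grouped_median(lower_limits, 10, students))  # 30 + (90 - 60) * 10 / 50
```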

Which measure of location is the “best”?
 Mean is generally used, unless
extreme values (outliers) exist
 Then median is often used, since
the median is not sensitive to
extreme values.
 Example: Median home prices may be
reported for a region – less sensitive to
outliers


Mode
 A measure of central tendency
 Value that occurs most often
 Not affected by extreme values
 Used for either numerical or categorical
(nominal) data
 There may be no mode
 There may be several modes

[Figure: two dot plots. First data set (scale 0 to 14): Mode = 9. Second data set (scale 0 to 6): No Mode.]

Mode

Mode is the value in a series which occurs most frequently, i.e. which has the greatest frequency.

Note: A distribution is said to be unimodal, bimodal or multimodal if it has one, two or more than two modes respectively.


Mode for Individual Series

Count the number of times each value of the series occurs. The mode is the value which occurs the maximum number of times.

Example

Calculate the mode from the following data


Roll No. 1 2 3 4 5 6 7 8 9 10
Marks Obtained 20 30 31 32 25 25 30 31 30 32


Solution

Size of item   No. of times it occurs
20             1
25             2
30             3
31             2
32             2
Total          10

The item 30 occurs the maximum number of times (3). Hence the mode is 30.

Mode For Discrete Series

In the case of a discrete series, the mode can be determined just by inspection.

Find the maximum frequency; the mode is the value of the item corresponding to the maximum frequency.


Example

Calculate the mode from the following data


Marks 5 15 25 35 45 55
No. of students 10 20 30 50 40 30

Since the maximum frequency is 50, the mode is the corresponding value of marks, 35.

Mode For Continuous Series

For a continuous series, the mode is calculated as follows:

\[ \text{Mode} = l + \frac{f_m - f_{m-1}}{2 f_m - f_{m-1} - f_{m+1}} \times i \]

where l = lower limit of the modal class
\( f_m \) = frequency of the modal class
\( f_{m-1} \) = frequency of the class preceding the modal class
\( f_{m+1} \) = frequency of the class succeeding the modal class
i = class interval


Example

Calculate the mode from the following data


Wages (Rs.) below 100 100-200 200-300 300-400 400-500 above 500
No.of workers 8 12 25 15 10 6

Solution
Since the maximum frequency is 25, the modal class is 200-300.

Here, l = 200, \( f_m = 25 \), \( f_{m-1} = 12 \), \( f_{m+1} = 15 \), i = 100

\[ \text{Mode} = l + \frac{f_m - f_{m-1}}{2 f_m - f_{m-1} - f_{m+1}} \times i = 200 + \frac{25 - 12}{2 \times 25 - 12 - 15} \times 100 = 256.52 \]
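
The modal-class formula is easy to mistype, so a small sketch may help. The function name grouped_mode and its argument names are illustrative; the numbers are those of the wages example.

```python
def grouped_mode(lower, width, f_m, f_prev, f_next):
    """Mode = l + (f_m - f_{m-1}) / (2 f_m - f_{m-1} - f_{m+1}) * i."""
    return lower + (f_m - f_prev) / (2 * f_m - f_prev - f_next) * width


# Modal class 200-300: l = 200, i = 100, f_m = 25, f_{m-1} = 12, f_{m+1} = 15
mode_wages = grouped_mode(200, 100, 25, 12, 15)
print(round(mode_wages, 2))
```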


Review Example
 Five houses on a hill by the beach
House Prices:

$2,000,000
500,000
300,000
100,000
100,000

Review Example:
Summary Statistics

House Prices:

$2,000,000
500,000
300,000
100,000
100,000
Sum $3,000,000

 Mean: $3,000,000 / 5 = $600,000
 Median: middle value of ranked data = $300,000
 Mode: most frequent value = $100,000


Partition Values

Partition values are positional measures which divide the series into equal parts, for example 4, 10 or 100 equal parts. The most popular partition values are quartiles, deciles and percentiles.

Partition Values For Grouped Data

For a grouped distribution,

Quartiles are given by

\[ Q_i = l + \frac{\frac{iN}{4} - C}{f} \times h, \quad i = 1, 2, 3 \]

Deciles are given by

\[ D_j = l + \frac{\frac{jN}{10} - C}{f} \times h, \quad j = 1, 2, \ldots, 9 \]

Percentiles are given by

\[ P_k = l + \frac{\frac{kN}{100} - C}{f} \times h, \quad k = 1, 2, \ldots, 99 \]


Partition Values contd….

where l = lower limit of the class in which the partition value lies
h = width of that class
f = frequency of that class
C = cumulative frequency of the class preceding the class in which the partition value lies
N = total frequency

Quartiles
 Quartiles split the ranked data into 4 segments with
an equal number of values per segment

25% 25% 25% 25%

Q1 Q2 Q3

 The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
 Q2 is the same as the median (50% are smaller, 50% are
larger)
 Only 25% of the observations are greater than the third
quartile


Quartile Formulas

Find a quartile by determining the value in the appropriate position in the ranked data, where

First quartile position: Q1 = (n+1)/4

Second quartile position: Q2 = (n+1)/2 (the median position)

Third quartile position: Q3 = 3(n+1)/4

where n is the number of observed values

Quartiles

 Example: Find the first quartile


Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data
so use the value half way between the 2nd and 3rd values,

so Q1 = 12.5

Q1 and Q3 are measures of noncentral location


Q2 = median, a measure of central tendency


Quartiles
(continued)
 Example:
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so Q1 = (12 + 13)/2 = 12.5

Q2 is in the (9+1)/2 = 5th position of the ranked data, so Q2 = median = 16

Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data, so Q3 = (18 + 21)/2 = 19.5
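
The position rule can be sketched in Python as below. The slides only show whole and half positions; rounding other fractional positions to the nearest rank is an assumption made here for completeness, and the function name quartile is illustrative.

```python
def quartile(ordered, k):
    """k-th quartile (k = 1, 2, 3) of ranked data via position k(n+1)/4."""
    pos = k * (len(ordered) + 1) / 4   # 1-based position in the ranked data
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return ordered[lower - 1]      # whole position: take that value
    if frac == 0.5:
        # half position: value halfway between the two neighbouring ranks
        return (ordered[lower - 1] + ordered[lower]) / 2
    # Assumption (not shown on the slides): round other positions to nearest rank
    return ordered[round(pos) - 1]


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]   # already in ascending order
print(quartile(data, 1), quartile(data, 2), quartile(data, 3))
```

Statistical packages use several different quartile conventions, so their output may differ slightly from this rule.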

Example

Find the first and third quartiles for the following data. Also find the 7th decile.
Wages (Rs.) 0-10 10-20 20-30 30-40 40-50
No. of Workers 22 38 46 35 20


Solution
Class   Frequency   Cumulative Frequency
0-10    22          22
10-20   38          60     (first quartile class)
20-30   46          106
30-40   35          141    (third quartile class)
40-50   20          161
Total   161

For the first quartile:

N = 161, therefore N/4 = 161/4 = 40.25. The c.f. just greater than or equal to 40.25 is 60, so the first quartile class is 10-20.

l = 10, N = 161, C = 22, f = 38, h = 10

\[ Q_1 = 10 + \frac{40.25 - 22}{38} \times 10 = 14.80 \]

For the third quartile:

3N/4 = 120.75, and the c.f. just greater than or equal to 120.75 is 141. Therefore the third quartile class is 30-40.

l = 30, N = 161, C = 106, f = 35, h = 10

\[ Q_3 = 30 + \frac{120.75 - 106}{35} \times 10 = 34.21 \]

For the 7th decile:

7N/10 = 112.7, and the c.f. just greater than or equal to 112.7 is 141. Therefore the 7th decile class is also 30-40.

\[ D_7 = 30 + \frac{112.7 - 106}{35} \times 10 = 31.91 \]
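
Quartiles, deciles and percentiles of a grouped distribution all use the same interpolation, so one helper covers them. This is a sketch assuming equal class widths; partition_value and its arguments are illustrative names.

```python
def partition_value(lower_limits, width, freqs, i, parts):
    """i-th of `parts` partition values: l + (i*N/parts - C) / f * h."""
    target = i * sum(freqs) / parts    # iN/4, jN/10 or kN/100
    cum_before = 0                     # C: c.f. of the preceding classes
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= target:   # class containing the partition value
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("frequency list is empty")


lower_limits = [0, 10, 20, 30, 40]
workers = [22, 38, 46, 35, 20]
q1 = partition_value(lower_limits, 10, workers, 1, 4)
q3 = partition_value(lower_limits, 10, workers, 3, 4)
d7 = partition_value(lower_limits, 10, workers, 7, 10)
print(round(q1, 2), round(q3, 2), round(d7, 2))
```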

Geometric Mean

The geometric mean of n items is the nth root of their product. Symbolically,

\[ \text{G.M.} = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} \]

where x1, x2, ..., xn are the values of the various items of the series and n is the total number of items of the series.


Geometric Mean
 Geometric mean
 Used to measure the rate of change of a variable
over time

\[ \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \]
 Geometric mean rate of return
 Measures the status of an investment over time

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]
 Where Ri is the rate of return in time period i

Example

An investment of $100,000 declined to $50,000 at the end of year one and rebounded to $100,000 at the end of year two:

X1 = $100,000, X2 = $50,000, X3 = $100,000

Year one: 50% decrease. Year two: 100% increase.

The overall two-year return is zero, since the investment started and ended at the same level.


Example
(continued)

Use the 1-year returns to compute the arithmetic mean and the geometric mean:

Arithmetic mean rate of return (misleading result):

\[ \bar{X} = \frac{(-50\%) + (100\%)}{2} = 25\% \]

Geometric mean rate of return (more accurate result):

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]
\[ = [(1 + (-50\%)) \times (1 + (100\%))]^{1/2} - 1 = [(0.50) \times (2)]^{1/2} - 1 = 1^{1/2} - 1 = 0\% \]
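
The contrast between the two averages can be reproduced with a short sketch. Returns are written as decimals (-0.5 for a 50% decrease); the function names are illustrative, and math.prod requires Python 3.8 or later.

```python
import math


def arithmetic_mean_return(returns):
    """Simple average of the per-period rates of return."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """[(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


yearly_returns = [-0.5, 1.0]   # 50% decrease, then 100% increase
print(arithmetic_mean_return(yearly_returns))  # misleading: suggests a gain
print(geometric_mean_return(yearly_returns))   # matches the zero overall return
```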

Measures of Central Tendency: Summary

Arithmetic Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)
Median: middle value in the ordered array
Mode: most frequently observed value
Geometric Mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \), the rate of change of a variable over time


MEASURES OF DISPERSION
Dispersion measures the extent to which the items vary from
some central value. It may be noted that the measures of
dispersion measure only the degree but not the direction of the
variation.

The measures of dispersion are also called averages of the Second Order because they are based on the deviations of the different values from the mean or other measure of central tendency. Measures of Central Tendency, on the other hand, are called averages of the First Order.

Note: Measures of Skewness tell us the direction of the variation.

Example

          Group A   Group B   Group C
          100       100       1
          100       105       489
          100       102       2
          100       103       3
          100       90        5
Total     500       500       500
Average   100       100       100


Example Contd…
The A.M. is the same for all three groups, but the distributions differ widely from one another:

 In Group A, every item is perfectly represented by the A.M. (none of the items deviates from the A.M., and hence there is no dispersion).

 In Group B, only one item is perfectly represented by the A.M. (the other items vary, but the variation is very small, and hence there is some dispersion).

 In Group C, not a single item is represented by the A.M. (the items vary widely from one another, and there is greater dispersion than in Group B).

Measures of Variation

Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation

 Measures of variation give information on the spread or variability of the data values.

[Figure: distributions with the same center but different variation]


Range

 Simplest measure of variation


 Difference between the largest and the smallest
values in a set of data:

Range = Xlargest – Xsmallest

Example:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Range = 14 - 1 = 13

Disadvantages of the Range


 Ignores the way in which data are distributed

7 8 9 10 11 12 7 8 9 10 11 12
Range = 12 - 7 = 5 Range = 12 - 7 = 5

 Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119


Uses of Range
 It facilitates statistical quality control. If the range
increases beyond a certain point, the product may
be examined to find out the reasons for variations.
 It facilitates the study of variations in the prices of
shares, debentures, bonds, agricultural
commodities etc.
 It facilitates the weather forecasts. On the basis of
minimum and maximum temperature, one can
know the limits within which the temperature is
likely to vary on a particular day.
 If the averages of the two distributions are almost
same, the distribution with smaller range is said to
have less dispersion and the distribution with larger
range is said to have more dispersion.


Interquartile Range

 Can eliminate some outlier problems by using


the interquartile range

 Eliminate some high and low-valued


observations and calculate the range from the
remaining values

 Interquartile range = 3rd quartile – 1st quartile


= Q3 – Q1

Interquartile Range

Example:

Xminimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, Xmaximum = 70
(25% of the observations lie in each of the four segments)

Interquartile range = 57 – 30 = 27


Variance

 Average (approximately) of squared deviations


of values from the mean
 Sample variance:

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

where \( \bar{X} \) = mean
n = sample size
Xi = ith value of the variable X


Standard Deviation
 Most commonly used measure of variation
 Shows variation about the mean
 Is the square root of the variance
 Has the same units as the original data

 Sample standard deviation:

\[ S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]


Measures of Variation:
The Standard Deviation

Steps for Computing Standard Deviation

1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get
the sample standard deviation.

Calculation Example:
Sample Standard Deviation
Sample Data (Xi): 10 12 14 15 17 18 18 24

n = 8, Mean = \( \bar{X} \) = 128/8 = 16

\[ S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}} \]
\[ = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]

S is a measure of the “average” scatter around the mean.
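
The five steps for computing the sample standard deviation can be followed line by line in Python. This is a minimal sketch with an illustrative function name; the result is compared against the standard library's statistics.stdev, which also divides by n - 1.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]


def sample_std(values):
    """Sample standard deviation following the five steps above."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]        # step 1
    squared = [d ** 2 for d in deviations]         # step 2
    total = sum(squared)                           # step 3
    variance = total / (n - 1)                     # step 4: sample variance
    return math.sqrt(variance)                     # step 5


print(round(sample_std(data), 4))
print(round(statistics.stdev(data), 4))  # library value for comparison
```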


Comparing Standard Deviations

[Figure: three dot plots on a common scale from 11 to 21]

Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567

Measuring variation

Small standard deviation

Large standard deviation


Advantages of Variance and Standard Deviation

 Each value in the data set is used in the


calculation

 Values far from the mean are given extra


weight
(because deviations from the mean are squared)

Standard Deviation for Grouped Data

The standard deviation for sample data, based on a frequency distribution, is given by

\[ S = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}} \]

which is used to estimate the population standard deviation. Here

\[ \bar{X} = \frac{\sum fX}{n} \]

where n = Σf is the sample size and X is the mid-point of each class.


Standard Deviation for Grouped Data: Example

Frequency Distribution of Return on Investment of Mutual Funds

Return on Investment   Number of Mutual Funds
5-10                   10
10-15                  12
15-20                  16
20-25                  14
25-30                  8
Total                  60


Solution for the Example

From the Microsoft Excel spreadsheet in the previous slide, it is easy to see that

Mean = \( \bar{X} = \frac{\sum fX}{n} \) = 1040/60 = 17.333 (cell F10)

Standard Deviation = \( S = \sqrt{\frac{\sum f (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{2448.33}{59}} \) = 6.44 (cell H12)
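
The spreadsheet values can be reproduced without Excel. The sketch below uses the class mid-points as X, as the formula requires; the function name grouped_mean_and_std is illustrative.

```python
import math

class_limits = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
funds = [10, 12, 16, 14, 8]


def grouped_mean_and_std(limits, freqs):
    """Mean and sample standard deviation from a frequency distribution."""
    mids = [(lo + hi) / 2 for lo, hi in limits]              # X = class mid-point
    n = sum(freqs)                                           # n = sum of f
    mean = sum(f * m for f, m in zip(freqs, mids)) / n       # sum(fX) / n
    ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids))
    return mean, math.sqrt(ss / (n - 1))


mean_roi, std_roi = grouped_mean_and_std(class_limits, funds)
print(round(mean_roi, 3), round(std_roi, 2))
```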

Measures of Variation:
Summary Characteristics
 The more the data are spread out, the greater the
range, variance, and standard deviation.

 The more the data are concentrated, the smaller the


range, variance, and standard deviation.

 If the values are all the same (no variation), all these
measures will be zero.

 None of these measures are ever negative.


Coefficient of Variation

 Measures relative variation


 Always in percentage (%)
 Shows variation relative to mean
 Can be used to compare two or more sets of
data measured in different units

\[ CV = \left( \frac{S}{\bar{X}} \right) \times 100\% \]


Comparing Coefficients of Variation

 Stock A:
   Average price last year = $50
   Standard deviation = $5

\[ CV_A = \left( \frac{S}{\bar{X}} \right) \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\% \]

 Stock B:
   Average price last year = $100
   Standard deviation = $5

\[ CV_B = \left( \frac{S}{\bar{X}} \right) \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\% \]

Both stocks have the same standard deviation, but stock B is less variable relative to its price.

Measures of Variation:
Comparing Coefficients of Variation
(continued)

 Stock A:
   Average price last year = $50
   Standard deviation = $5

\[ CV_A = \left( \frac{S}{\bar{X}} \right) \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\% \]

 Stock C:
   Average price last year = $8
   Standard deviation = $2

\[ CV_C = \left( \frac{S}{\bar{X}} \right) \times 100\% = \frac{\$2}{\$8} \times 100\% = 25\% \]

Stock C has a much smaller standard deviation but a much higher coefficient of variation.
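
A short sketch makes the three-stock comparison concrete. The function name and the dictionary layout are illustrative; the prices and standard deviations are those given for stocks A, B and C.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100, expressed in percent."""
    return 100 * std_dev / mean


# (average price, standard deviation) in dollars, from the slides
stocks = {"A": (50, 5), "B": (100, 5), "C": (8, 2)}

for name, (avg_price, std_dev) in stocks.items():
    cv = coefficient_of_variation(std_dev, avg_price)
    print(f"Stock {name}: CV = {cv:.0f}%")
```

Because the CV is unit-free, the same function can compare series measured in different units.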


Locating Extreme Outliers: Z-Score

 To compute the Z-score of a data value, subtract the


mean and divide by the standard deviation.

 The Z-score is the number of standard deviations a


data value is from the mean.

 A data value is considered an extreme outlier if its Z-


score is less than -3.0 or greater than +3.0.

 The larger the absolute value of the Z-score, the


farther the data value is from the mean.

Z Scores

 A measure of distance from the mean (for example, a


Z-score of 2.0 means that a value is 2.0 standard
deviations from the mean)
 The difference between a value and the mean, divided
by the standard deviation
 A Z score above 3.0 or below -3.0 is considered an
outlier

\[ Z = \frac{X - \bar{X}}{S} \]


Locating Extreme Outliers:


Z-Score
(continued)

Example:
 If the mean is 14.0 and the standard deviation is 3.0,
what is the Z score for the value 18.5?

\[ Z = \frac{X - \bar{X}}{S} = \frac{18.5 - 14.0}{3.0} = 1.5 \]
 The value 18.5 is 1.5 standard deviations above the
mean
 (A negative Z-score would mean that a value is less
than the mean)

Locating Extreme Outliers: Z-Score

 Suppose the mean math score is 490, with a standard deviation of 100.
 Compute the Z-score for a test score of 620.

\[ Z = \frac{X - \bar{X}}{S} = \frac{620 - 490}{100} = \frac{130}{100} = 1.3 \]

A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
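
Both Z-score examples, together with the ±3.0 rule for extreme outliers, can be sketched as follows. The function names are illustrative, and the cutoff parameter defaults to the 3.0 threshold stated on the slides.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / std_dev


def is_extreme_outlier(z, cutoff=3.0):
    """True if the Z-score is below -cutoff or above +cutoff."""
    return abs(z) > cutoff


z_first = z_score(18.5, 14.0, 3.0)   # first example
z_math = z_score(620, 490, 100)      # math score example
print(z_first, is_extreme_outlier(z_first))
print(z_math, is_extreme_outlier(z_math))
```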


Shape of a Distribution

 Describes how data are distributed


 Two useful shape related statistics are:
 Skewness
 Measures the extent to which data values are not
symmetrical
 Kurtosis
 Kurtosis measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution

Shape of a Distribution
(Skewness)

 Measures the extent to which data is not


symmetrical
                     Left-Skewed     Symmetric       Right-Skewed
Mean vs. Median      Mean < Median   Mean = Median   Median < Mean
Skewness Statistic   < 0             0               > 0


Shape of a Distribution (Kurtosis)

 Measures how sharply the curve rises approaching the center of the distribution

Sharper peak than bell-shaped: Kurtosis > 0
Bell-shaped: Kurtosis = 0
Flatter than bell-shaped: Kurtosis < 0

Using Microsoft Excel

 Descriptive Statistics can be obtained


from Microsoft® Excel
 Use menu choice:
tools / data analysis / descriptive statistics

 Enter details in dialog box


Using Excel

Use menu choice:


tools / data analysis /
descriptive statistics

Using Excel
(continued)

 Enter dialog box


details

 Check box for


summary statistics

 Click OK


Excel output
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:

$2,000,000
500,000
300,000
100,000
100,000

The Five Number Summary

The five numbers that help describe the center, spread and shape of data are:
 Xsmallest
 First Quartile (Q1)
 Median (Q2)
 Third Quartile (Q3)
 Xlargest


Relationships among the five-number summary and distribution shape

Comparison                                      Left-Skewed   Symmetric   Right-Skewed
(Median – Xsmallest) vs. (Xlargest – Median)    >             ≈           <
(Q1 – Xsmallest) vs. (Xlargest – Q3)            >             ≈           <
(Median – Q1) vs. (Q3 – Median)                 >             ≈           <

Exploratory Data Analysis


 Box-and-Whisker Plot: A Graphical display of
data using 5-number summary:
Minimum -- Q1 -- Median -- Q3 -- Maximum

Example:

[Figure: box-and-whisker plot marking the Minimum, 1st Quartile, Median, 3rd Quartile and Maximum; 25% of the data lie in each of the four segments]


Shape of Box-and-Whisker Plots

 The Box and central line are centered between the


endpoints if data are symmetric around the median

Min Q1 Median Q3 Max

 A Box-and-Whisker plot can be shown in either vertical


or horizontal format

Distribution Shape and


Box-and-Whisker Plot

Left-Skewed Symmetric Right-Skewed

Q1 Q2 Q3 Q1 Q2 Q3 Q1 Q2 Q3


Box-and-Whisker Plot Example

 Below is a Box-and-Whisker plot for the following


data:
Data: 0 2 2 2 3 3 4 5 5 10 27

Min = 0, Q1 = 2, Q2 = 3, Q3 = 5, Max = 27

[Figure: box-and-whisker plot with these five values marked]
 The data are right skewed, as the plot depicts

PROBABILITY

Probability, also known as chance, is a measure of uncertainty. The probability of event A is a numerical measure of the likelihood of the event’s occurrence.

In other words, it is a mathematical measurement of chance. It is a number which ranges from 0 to 1.


Areas of Applications of Probability

 Investment companies use sophisticated analytical tools


to evaluate the prospects of the stocks of various
companies.
 Traders do their own analysis to assess the prices of
commodities and trade accordingly.
 A manufacturer or producer conducts market research
to assess the likely demand for products and services.
 A HRD manager designs systems to recruit employees
with high chance of being useful to his company.

In all the above cases, it is not possible to predict the outcomes with certainty, and the decisions are based on ‘chance factors’.

Definitions of Probability

 Classical
 Statistical, Empirical or Frequency
 Subjective


Classical Definition of Probability

Before going into the Classical approach, let’s look at some basic concepts.

Random Experiment (Trial): An experiment which, when repeated under essentially identical conditions, does not give a unique result but may result in any one of several possible outcomes. These outcomes are known as Events or Cases. Events are denoted by capital letters A, B, C, .... E.g. getting a head (H) or a tail (T) is an event when we toss a coin, and getting any of the six faces 1, 2, 3, 4, 5, 6 is an event when we throw a die.

Classical Approach contd…


Exhaustive Events: The total number of possible outcomes in a random experiment. E.g. there are two exhaustive events, i.e. head and tail, when we toss a coin.

Mutually Exclusive Events: Events where the occurrence of one rules out the occurrence of the other. In other words, no two or more of them can happen simultaneously in the same trial. E.g. in tossing a coin there are two mutually exclusive (M.E.) events, for if head comes in a trial, then tail cannot come in the same trial, and vice versa.

Equally Likely Events: Events are said to be equally likely if none of them is expected to occur in preference to the others. E.g. in tossing a coin, there are two equally likely events, i.e. H and T.


Classical Approach contd…

Mathematical or Classical or a priori definition of probability: If there are n exhaustive, mutually exclusive and equally likely events, out of which m are favourable to the happening of an event A, then the probability of the happening of A, denoted by P(A), is defined as

\[ P(A) = \frac{\text{Favourable number of cases}}{\text{Total (exhaustive) number of cases}} = \frac{m}{n} \]

Example and Limitations

From a pack of 52 cards, two cards are drawn at random. The chance that
one is a king and the other a queen is 8/663.
The total number of cases is n = 52C2 = 1326. There are 4 kings and 4
queens, so the number of favourable cases is m = 4C1 × 4C1 = 16.
Hence, the required probability
= m/n
= (4C1 × 4C1) / 52C2
= 16/1326 = 8/663.
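The card example can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
from fractions import Fraction
from math import comb

# Classical probability: favourable cases / exhaustive cases.
total_cases = comb(52, 2)                    # n: ways to draw any 2 of 52 cards
favourable_cases = comb(4, 1) * comb(4, 1)   # m: one of 4 kings and one of 4 queens

p_king_and_queen = Fraction(favourable_cases, total_cases)
print(total_cases, favourable_cases, p_king_and_queen)  # 1326 16 8/663
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand-computed 8/663.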


Limitations of Classical Approach

 Cannot be applied if the outcomes are not equally likely. (e.g., If


a person jumps from the top of Qutab Minar, the probability of
his survival will not be 50% since the two mutually exclusive and
exhaustive outcomes are not equally likely).

 It is difficult or impossible to apply this approach as soon as we


deviate from the fields of coins, dice, cards and other games of chance
(e.g., it fails to answer questions like “What is the probability that
Mr. X will top his class?”).

 It may not explain actual results in certain cases. For example: If


a coin is tossed 10 times we may get 3 heads and 7 tails. The observed
relative frequencies are then 0.3 for heads and 0.7 for tails, whereas
the classical definition gives 0.5 for each.

Questions
1. Find the probability of getting an even
number in a throw of a single die.
2. In a single throw of two dice, find the
probability of getting a total of 10.
3. Two cards are drawn at random from a
well shuffled pack of 52 cards. Find the
probability of getting 2 aces.
4. What is the chance that a leap year
selected at random will contain 53
Sundays?


Empirical or Relative Frequency Approach

The empirical probability of an event is defined as the relative
frequency of occurrence of the event when the number of observations is
very large. If the event A occurs a times in n trials, then

$$P(A) = \lim_{n \to \infty} \frac{a}{n}$$

In the coin example above, if the experiment is carried out a large
number of times we should expect approximately equal numbers of heads
and tails.
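The limiting behaviour of the relative frequency can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin, and the function name `relative_frequency_of_heads` and the fixed seed are illustrative choices, not part of the slides.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return heads / n_tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The relative frequency a/n should settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With only 10 tosses the ratio can easily be 0.3 or 0.7, which is the situation described under the limitations of the classical approach; with many tosses it stays close to 0.5.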

Example of empirical probability

Find the probability of selecting a male taking statistics from the
population described in the following table:

            Taking Stats   Not Taking Stats   Total
Male             84              145           229
Female           76              134           210
Total           160              279           439

$$P(\text{male taking stats}) = \frac{\text{number of males taking stats}}{\text{total number of people}} = \frac{84}{439} = 0.191$$


Subjective/Personalistic Approach
The subjective probability is defined as the probability assigned to an
event by an individual based on whatever evidence is available.

It differs from person to person, based on experience, perceptions,
prejudices and, above all, judgmental capabilities.

It is better to rely on well-defined quantitative probabilities. But
when it is either not possible to gather the requisite data or too
cumbersome and time consuming, subjective probabilities are preferred.
Many a time, they are the only option in decision-making situations. The
subjective school of thought is also known as the personalistic school
of probability.

For example:
1. “I am 90% certain that this budget will boost the capital market.”
2. “I am 100% sure that Mr. X will top his class.”
3. A media development team assigns a 60% probability of success to its
new ad campaign.
4. The chief media officer of the company is less optimistic and assigns
a 40% probability of success to the same campaign.

Basic Probability Concepts

 Probability – the chance that an uncertain event


will occur (always between 0 and 1)

 Impossible Event – an event that has no


chance of occurring (probability = 0)

 Certain Event – an event that is sure to occur


(probability = 1)


Events

Each possible outcome of a variable is an event.

• Simple event
  An event described by a single characteristic
  e.g., a day in January from all days in 2013
• Joint event
  An event described by two or more characteristics
  e.g., a day in January that is also a Wednesday from all days in 2013
• Complement of an event A (denoted A’)
  All events that are not part of event A
  e.g., all days from 2013 that are not in January

Sample Space
The Sample Space is the collection of all possible events,
e.g. all 6 faces of a die, or all 52 cards of a bridge deck.


Organizing & Visualizing Events

Contingency Table for All Days in 2013

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365

(365 is the total number of sample space outcomes.)

Definition: Simple Probability

Simple Probability refers to the probability of a simple event.
  ex. P(Jan.) = 31 / 365
  ex. P(Wed.) = 52 / 365

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365


Definition: Joint Probability

Joint Probability refers to the probability of an occurrence of two or
more events (joint event).
  ex. P(Jan. and Wed.) = 5 / 365
  ex. P(Not Jan. and Not Wed.) = 287 / 365

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365
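The cell counts of the contingency table can be reproduced from the calendar. The sketch below assumes the year 2013, the year named in the joint probability example, because the count of 5 Wednesdays in January matches that year; the helper name `day_counts` is illustrative.

```python
from datetime import date, timedelta
from fractions import Fraction

def day_counts(year):
    """Count the days of `year` by (is_january, is_wednesday)."""
    counts = {(jan, wed): 0 for jan in (True, False) for wed in (True, False)}
    day = date(year, 1, 1)
    while day.year == year:
        # date.weekday() numbers Monday as 0, so Wednesday is 2.
        counts[(day.month == 1, day.weekday() == 2)] += 1
        day += timedelta(days=1)
    return counts

counts = day_counts(2013)
total = sum(counts.values())

# Joint probability: one cell of the table divided by the grand total.
p_jan_and_wed = Fraction(counts[(True, True)], total)
# Simple (marginal) probabilities: a row or column total divided by the grand total.
p_jan = Fraction(counts[(True, True)] + counts[(True, False)], total)
p_wed = Fraction(counts[(True, True)] + counts[(False, True)], total)
print(counts, total)
```

Note that `Fraction` reduces 5/365 to 1/73 when printed; the value is the same.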

Mutually Exclusive Events

 Mutually exclusive events


 Events that cannot occur simultaneously

Example: Randomly choosing a day from 2023

A = day in January; B = day in February

 Events A and B are mutually exclusive


Collectively Exhaustive Events


 Collectively exhaustive events
 One of the events must occur
 The set of events covers the entire sample space
Example: Randomly choose a day from 2023

A = Weekday; B = Weekend;
C = January; D = Spring;

 Events A, B, C and D are collectively exhaustive


(but not mutually exclusive – a weekday can be in
January or in Spring)
 Events A and B are collectively exhaustive and
also mutually exclusive


Computing Joint and Marginal Probabilities

The probability of a joint event, A and B:

$$P(A \text{ and } B) = \frac{\text{number of outcomes satisfying A and B}}{\text{total number of elementary outcomes}}$$

Computing a marginal (or simple) probability:

$$P(A) = P(A \text{ and } B_1) + P(A \text{ and } B_2) + \dots + P(A \text{ and } B_k)$$

where B1, B2, …, Bk are k mutually exclusive and collectively
exhaustive events.

Joint Probability Example

$$P(\text{Jan. and Wed.}) = \frac{\text{number of days that are in Jan. and are Wed.}}{\text{total number of days in 2013}} = \frac{5}{365}$$

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365


Marginal Probability Example

$$P(\text{Wed.}) = P(\text{Jan. and Wed.}) + P(\text{Not Jan. and Wed.}) = \frac{5}{365} + \frac{47}{365} = \frac{52}{365}$$

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365

Marginal & Joint Probabilities in a Contingency Table

Event       B1               B2               Total
A1          P(A1 and B1)     P(A1 and B2)     P(A1)
A2          P(A2 and B1)     P(A2 and B2)     P(A2)
Total       P(B1)            P(B2)            1

The four inner cells are joint probabilities; the row and column totals
are marginal (simple) probabilities.


Probability Summary So Far

• Probability is the numerical measure of the likelihood that an event
  will occur.
• The probability of any event must be between 0 and 1, inclusive:
  0 ≤ P(A) ≤ 1 for any event A.
  A certain event has probability 1 and an impossible event has
  probability 0.
• The sum of the probabilities of all mutually exclusive and
  collectively exhaustive events is 1:
  P(A) + P(B) + P(C) = 1
  if A, B, and C are mutually exclusive and collectively exhaustive.

General Addition Rule

General Addition Rule:


P(A or B) = P(A) + P(B) - P(A and B)

If A and B are mutually exclusive, then


P(A and B) = 0, so the rule can be simplified:

P(A or B) = P(A) + P(B)


For mutually exclusive events A and B


General Addition Rule Example

P(Jan. or Wed.) = P(Jan.) + P(Wed.) - P(Jan. and Wed.)
= 31/365 + 52/365 - 5/365 = 78/365

Don’t count the five Wednesdays in January twice!

            Jan.   Not Jan.   Total
Wed.          5       47        52
Not Wed.     26      287       313
Total        31      334       365

Computing Conditional Probabilities

A conditional probability is the probability of one event, given that
another event has occurred.

The conditional probability of A given that B has occurred:

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

The conditional probability of B given that A has occurred:

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$$

where P(A and B) = joint probability of A and B
P(A) = marginal or simple probability of A
P(B) = marginal or simple probability of B


Example of
Conditional Probability
 Of the cars on a used car lot, 70% have air
conditioning (AC) and 40% have a GPS. 20%
of the cars have both.

 What is the probability that a car has a GPS,


given that it has AC ?

i.e., we want to find P(GPS | AC)

Conditional Probability Example (continued)

Of the cars on a used car lot, 70% have air conditioning (AC), 40% have
a GPS, and 20% of the cars have both.

            GPS    No GPS   Total
AC          0.2     0.5      0.7
No AC       0.2     0.1      0.3
Total       0.4     0.6      1.0

$$P(\text{GPS} \mid \text{AC}) = \frac{P(\text{GPS and AC})}{P(\text{AC})} = \frac{0.2}{0.7} = 0.2857$$
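The remaining cells of the table follow from the three given percentages. A minimal sketch, with illustrative variable names, that fills in the joint table and computes the conditional probability exactly:

```python
from fractions import Fraction

# Given in the example: P(AC), P(GPS) and P(GPS and AC).
p_ac = Fraction(70, 100)
p_gps = Fraction(40, 100)
p_gps_and_ac = Fraction(20, 100)

# Each remaining joint cell is a marginal total minus the known cell.
joint = {
    ("AC", "GPS"): p_gps_and_ac,
    ("AC", "No GPS"): p_ac - p_gps_and_ac,
    ("No AC", "GPS"): p_gps - p_gps_and_ac,
    ("No AC", "No GPS"): 1 - p_ac - p_gps + p_gps_and_ac,
}

# Conditional probability: joint probability divided by the marginal of the condition.
p_gps_given_ac = p_gps_and_ac / p_ac
print({cell: float(p) for cell, p in joint.items()})
print(p_gps_given_ac, round(float(p_gps_given_ac), 4))  # 2/7 0.2857
```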

Problems on Addition Theorem


Que-1 One card is drawn from a standard pack of 52 cards. What is
the probability that it is a king or a queen?

Que-2 A card is drawn from a well shuffled pack of playing cards.


Find the probability that it is either a diamond or a king.

Que-3 A bag contains 30 ball numbered 1 to 30. One ball is drawn at


random. Find the probability that the number of the drawn ball will be
a multiple of 5 or 3

Que-4 What is the chance of throwing a total of 5 or 11 with two dice?

Que-5 A bag contains 6 white, 5 black and 4 yellow balls. Two balls
are drawn from it. Find the probability of getting either 2 white balls or
2 yellow balls in a single draw.


Problems contd..
Que-1 One card is drawn from a standard pack of 52 cards. What is
the probability that it is a king or a queen?

Ans-1 Let A and B be the events of drawing a king and a queen
respectively. There are 4 kings and 4 queens in a standard pack.

The probability that the card drawn is a king = 4/52

The probability that the card drawn is a queen = 4/52

Since the two events are mutually exclusive, the probability that the
card drawn is either a king or a queen is
4/52 + 4/52 = 8/52 = 2/13

Problems contd..

Que-2 A card is drawn from a well shuffled pack of playing cards.


Find the probability that it is either a diamond or a king.

Ans-2 Let A and B be the events of drawing a diamond and a king
respectively. Therefore,
P(A) = P(drawing a diamond) = 13/52

P(B) = P(drawing a king) = 4/52

P(A∩B) = P(drawing the king of diamonds) = 1/52

P(A∪B) = P(drawing either a diamond or a king)
= P(A) + P(B) - P(A∩B)
= 13/52 + 4/52 - 1/52 = 16/52 = 4/13


Problems contd..

Que-3 A bag contains 30 balls numbered 1 to 30. One ball is drawn at
random. Find the probability that the number on the drawn ball is a
multiple of 5 or 3.

Ans-3 Let A be the event of getting a multiple of 5 and let B be the
event of getting a multiple of 3. Between 1 and 30 there are 6 multiples
of 5, 10 multiples of 3, and 2 multiples of both (15 and 30).

Then, P(A) = 6/30
P(B) = 10/30
P(A∩B) = 2/30
P(A∪B) = P(A) + P(B) - P(A∩B)
= 6/30 + 10/30 - 2/30
= 14/30 = 7/15

Problems Contd..
Que-4 What is the chance of throwing a total of 5
or 11 with two dice?


Problems Contd..
Que-5 A bag contains 6 white, 5 black and 4
yellow balls. Two balls are drawn from it. Find the
probability of getting either 2 white balls or 2
yellow balls in a single draw.
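Que-4 and Que-5 are stated without worked answers. One way to check an answer is brute-force counting; the sketch below is not taken from the slides, assumes fair dice and that the two balls are drawn together without replacement, and uses illustrative variable names.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Que-4: totals of 5 and 11 are mutually exclusive, so P(5 or 11) = P(5) + P(11).
rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
favourable = sum(a + b in (5, 11) for a, b in rolls)
p_total_5_or_11 = Fraction(favourable, len(rolls))

# Que-5: "2 white" and "2 yellow" are mutually exclusive ways to pick 2 of 15 balls.
p_two_white_or_two_yellow = Fraction(comb(6, 2) + comb(4, 2), comb(15, 2))

print(p_total_5_or_11, p_two_white_or_two_yellow)  # 1/6 1/5
```

Both results use the simplified addition rule, because in each question the two events cannot occur together.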


Independence

Two events are independent if and only if:

$$P(A \mid B) = P(A)$$

Events A and B are independent when the probability of one event is not
affected by the fact that the other event has occurred.

Multiplication Rules

Multiplication rule for two events A and B:

$$P(A \text{ and } B) = P(A \mid B)\,P(B)$$

Note: If A and B are independent, then P(A | B) = P(A) and the
multiplication rule simplifies to

$$P(A \text{ and } B) = P(A)\,P(B)$$


Marginal Probability

Marginal probability for event A:

$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \dots + P(A \mid B_k)\,P(B_k)$$

where B1, B2, …, Bk are k mutually exclusive and collectively
exhaustive events.

Problems on Multiplication Theorem


(Independent Events)

Que-1 Probability that a man will be alive 25


years hence is 0.30 and the probability that his
wife will be alive 25 years hence is 0.4. Find the
probability that 25 years hence
(i) both will be alive
(ii) only the man will be alive
(iii) At least one of them will be alive.


Answer
Let us define the following events, assumed to be independent:
A = The man will be alive 25 years hence. P(A) = 0.30
B = His wife will be alive 25 years hence. P(B) = 0.40

(i) P(A∩B) = P(A)·P(B) = 0.30 × 0.40 = 0.12

(ii) P(A∩Bᶜ) = P(A)·P(Bᶜ) = 0.30 × (1 - 0.40)
= 0.30 × 0.60 = 0.18

(iii) P(A∪B) = P(A) + P(B) - P(A∩B)
= 0.30 + 0.40 - (0.30 × 0.40)
= 0.58

Alternatively,
P(at least one will be alive) = 1 - P(none will be alive)
= 1 - P(Aᶜ)·P(Bᶜ)
= 1 - (0.70 × 0.60)
= 1 - 0.42
= 0.58

Problems on Multiplication Theorem


(Independent Events) Contd…

Que2 - The probability that India wins a cricket match against


England is given to be 1/3. If India and England play three
matches, what is the probability that India will lose all the three
matches?

Ans2 - Let A, B and C denote the events that India wins the first,
second and third matches against England respectively, and assume the
matches are independent.
P(A) = P(B) = P(C) = 1/3

P(Aᶜ) = P(Bᶜ) = P(Cᶜ) = 1 - 1/3 = 2/3

P(India loses all the three matches) = 2/3 × 2/3 × 2/3
= 8/27


Problems on Multiplication Theorem


(Independent Events) Contd…

Que-3 There are 5 white and 7 red balls in a bag.


A ball is drawn and then replaced. What is the
probability that a white and a red ball are drawn
in that order?

Ans-3 Let us define the events:


A= Event of drawing a white ball
B= Event of drawing a red ball.
P(A)= 5/12
P(B) = 7/12
P(A∩B) = P(A)·P(B) = 5/12 × 7/12 = 35/144

Problems on Multiplication Theorem


(Dependent Events)

Que-4 What will be the probability if in the


previous question, balls drawn were not put
back into the bag?
Ans-4 The balls are drawn without replacement. After a white ball is
removed, 11 balls remain, 7 of them red, so P(B/A) = 7/11.
P(A∩B) = P(A)·P(B/A)
= 5/12 × 7/11
= 35/132


Problems on Multiplication Theorem


(Dependent Events) Contd…

Que5 - There are 12 balls in a bag, 4 red and 8 green. Three balls are
drawn successively without replacement. What is the probability that
their colours alternate?

Answer 5
The colours alternate in two possible orders of draw:

(i) G, R, G
(ii) R, G, R

P(i) = P(A∩B∩C) = P(A)·P(B/A)·P(C/AB)
= 8/12 × 4/11 × 7/10 = 224/1320
[at the first draw there are 4R and 8G balls, at the second draw 4R and
7G, and at the third draw 3R and 7G]

P(ii) = P(A∩B∩C) = P(A)·P(B/A)·P(C/AB)
= 4/12 × 8/11 × 3/10 = 96/1320
[at the first draw there are 4R and 8G balls, at the second draw 3R and
8G, and at the third draw 3R and 7G]

Required Probability = P(i) + P(ii) = 224/1320 + 96/1320
= 320/1320 = 8/33
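The multiplication rule for dependent events can be cross-checked by listing every ordered draw of three distinct balls. This is a sketch under the assumption that each ordered draw is equally likely; the names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 4 + ["G"] * 8                # 4 red and 8 green balls
draws = list(permutations(range(12), 3))     # ordered draws without replacement

# Keep the draws whose colour sequence is G,R,G or R,G,R.
wanted = (("G", "R", "G"), ("R", "G", "R"))
alternating = [d for d in draws if tuple(balls[i] for i in d) in wanted]

p_alternating = Fraction(len(alternating), len(draws))
print(len(alternating), len(draws), p_alternating)  # 320 1320 8/33
```

The counts 320 and 1320 are the numerator and denominator obtained from the conditional probabilities above.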


Problem Contd…
 A problem in Statistics is given to the three students A, B and C
whose chances of solving it are ½, 1/3 and ¼ respectively. What
is the probability that the problem is solved?

Solution: It is given that
P(A) = 1/2, P(B) = 1/3 and P(C) = 1/4.
Now, P(Aᶜ) = 1/2, P(Bᶜ) = 2/3 and P(Cᶜ) = 3/4.
The problem will be solved if at least one of them solves it.
Assuming the students work independently, the probability that none of
them solves it
= P(Aᶜ) × P(Bᶜ) × P(Cᶜ) = 1/2 × 2/3 × 3/4 = 1/4
Hence the probability that the problem is solved
= 1 - 1/4 = 3/4

Problem Contd…
 A can hit a target 3 times in 5 shots, B 2 times in 5 shots and C
3 times in 4 shots. Find the probability of the target being hit
when all of them try.

Solution: Let E1 be the event that A hits the target. Therefore,
P(E1) = 3/5 and P(E1ᶜ) = 2/5.
Similarly, let E2 be the event that B hits the target, i.e. P(E2) = 2/5
and P(E2ᶜ) = 3/5.
And let E3 be the event that C hits the target, i.e. P(E3) = 3/4 and
P(E3ᶜ) = 1/4.

The required probability that the target is hit when all of them try,
assuming the shots are independent, is
= P[at least one of the three hits the target]
= 1 - P[none hits the target] = 1 - 2/5 × 3/5 × 1/4 = 1 - 6/100 = 47/50


Problem Contd…

 A speaks truth in 75% and B in 80% of the cases. In what


percentages of cases are they likely to contradict each other in
stating the same fact.
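Solution: The probabilities that A and B speak the truth are
P(A) = 75/100 = 3/4 and P(B) = 80/100 = 4/5,
and the probabilities that they do not speak the truth are
P(Aᶜ) = 1/4 and P(Bᶜ) = 1/5.
They contradict each other when exactly one of them speaks the truth.
Assuming they speak independently, the probability of this is
= P(A∩Bᶜ) + P(Aᶜ∩B)
= P(A)P(Bᶜ) + P(Aᶜ)P(B) = 3/20 + 4/20 = 7/20,
i.e. they are likely to contradict each other in 35% of the cases.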

Bayes’ Theorem

 Bayes’ Theorem is used to revise previously


calculated probabilities based on new
information.

 Developed by Thomas Bayes in the 18th


Century.

 It is an extension of conditional probability.


Bayes’ Theorem

Let B1, B2, …, Bn be n mutually exclusive events such that
B1 ∪ B2 ∪ … ∪ Bn = S. Let A be an event that can occur in conjunction
with only one of the events B1, B2, …, Bn, with P(A) ≠ 0. Then the
conditional probability of occurrence of Bj (j = 1, 2, …, n), given that
A has occurred, is given by

$$P(B_j / A) = \frac{P(B_j)\,P(A / B_j)}{\sum_{i=1}^{n} P(B_i)\,P(A / B_i)}, \qquad j = 1, 2, \dots, n$$

Bayes’ Theorem Example

 A drilling company has estimated a 40%


chance of striking oil for their new well.
 A detailed test has been scheduled for more
information. Historically, 60% of successful
wells have had detailed tests, and 20% of
unsuccessful wells have had detailed tests.
 Given that this well has been scheduled for a
detailed test, what is the probability
that the well will be successful?


Bayes’ Theorem Example


(continued)

 Let S = successful well


U = unsuccessful well
 P(S) = 0.4 , P(U) = 0.6 (prior probabilities)
 Define the detailed test event as D
 Conditional probabilities:
P(D|S) = 0.6 P(D|U) = 0.2
 Goal is to find P(S|D)

Bayes’ Theorem Example


(continued)

Apply Bayes’ Theorem:


$$P(S \mid D) = \frac{P(D \mid S)P(S)}{P(D \mid S)P(S) + P(D \mid U)P(U)} = \frac{(0.6)(0.4)}{(0.6)(0.4) + (0.2)(0.6)} = \frac{0.24}{0.24 + 0.12} = 0.667$$

So the revised probability of success, given that this well has been
scheduled for a detailed test, is 0.667.


Bayes’ Theorem Example


(continued)

 Given the detailed test, the revised probability


of a successful well has risen to 0.667 from
the original estimate of 0.4

Event              Prior Prob.   Conditional Prob.   Joint Prob.          Revised Prob.
S (successful)        0.4             0.6            (0.4)(0.6) = 0.24    0.24/0.36 = 0.667
U (unsuccessful)      0.6             0.2            (0.6)(0.2) = 0.12    0.12/0.36 = 0.333

Sum of joint probabilities = 0.36
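The prior, conditional, joint and revised columns of the table map directly onto a few lines of code. The helper `posterior` below is a hypothetical name for a minimal sketch of Bayes' theorem over mutually exclusive and collectively exhaustive events; it is not code from the textbook.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return the revised probabilities P(B_j | A).

    priors[j] is P(B_j) and likelihoods[j] is P(A | B_j); the B_j are
    assumed mutually exclusive and collectively exhaustive.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(B_j and A)
    total = sum(joint)                                     # P(A), the marginal
    return [j / total for j in joint]

# Oil well example: priors for S and U, and P(D | S), P(D | U).
well = posterior([Fraction(4, 10), Fraction(6, 10)],
                 [Fraction(6, 10), Fraction(2, 10)])
print([round(float(p), 3) for p in well])  # [0.667, 0.333]
```

The same function can be reused for any number of events, for example the two-plant and three-machine problems that follow.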

Problem on Bayes’ Theorem

Q.1 A company has two plants to manufacture scooters. Plant I


manufactures 80% of the scooters and Plant II manufactures
20%. At Plant I, 85 out of 100 scooters are rated standard
quality or better. At Plant II, only 65% scooters are rated
standard quality or better.

(i) What is the probability that scooter selected at random came


from Plant I if it is known that the scooter is of standard quality?

(ii) What is the probability that scooter selected at random came


from Plant II if it is known that the scooter is of standard quality?


Answer

Let us define the following events:


B1=Scooter is manufactured by Plant I
B2=Scooter is manufactured by Plant II
A= Scooter is rated as standard quality.
P(B1)=0.80
P(B2)=0.20
P(A/B1)=0.85
P(A/B2)=0.65

Answer Contd…

$$P(B_1/A) = \frac{P(B_1)\,P(A/B_1)}{P(B_1)\,P(A/B_1) + P(B_2)\,P(A/B_2)} = \frac{0.80 \times 0.85}{0.80 \times 0.85 + 0.20 \times 0.65} = \frac{0.68}{0.81} = 0.84$$


Answer Contd…
If we had to find the probability that the scooter came from Plant II,
if it is known that it is of standard quality,
Required Probability would be

$$P(B_2/A) = \frac{P(B_2)\,P(A/B_2)}{P(B_1)\,P(A/B_1) + P(B_2)\,P(A/B_2)} = \frac{0.20 \times 0.65}{0.80 \times 0.85 + 0.20 \times 0.65} = \frac{0.13}{0.81} = 0.16$$

or, equivalently, 1 - 0.84 = 0.16.

Problems on Bayes’ Theorem

Q1. In a bolt factory machines A, B and C


manufacture respectively 25, 35 and 40 percent
of the total. Out of their output 5, 4 and 2 percent
are defective bolts. A bolt is drawn from the
produce and is found defective. What are the
probabilities that it was manufactured by A, B and
C?


Solution to Bayes’ Theorem Problem

Ans-1 Let us define the following events:


B1=Bolt manufactured by machine A
B2=Bolt manufactured by machine B
B3=Bolt manufactured by machine C
D= Bolt rated as defective
P(B1)=0.25
P(B2)=0.35
P(B3)=0.40
P(D/B1)=0.05
P(D/B2)=0.04
P(D/B3)=0.02

Solution to Bayes’ Theorem Problem

$$P(B_1/D) = \frac{P(B_1)\,P(D/B_1)}{P(B_1)\,P(D/B_1) + P(B_2)\,P(D/B_2) + P(B_3)\,P(D/B_3)}$$

$$= \frac{0.25 \times 0.05}{0.25 \times 0.05 + 0.35 \times 0.04 + 0.40 \times 0.02} = \frac{0.0125}{0.0125 + 0.014 + 0.008} = \frac{0.0125}{0.0345} = 0.362$$


Solution to Bayes’ Theorem Problem

$$P(B_2/D) = \frac{P(B_2)\,P(D/B_2)}{P(B_1)\,P(D/B_1) + P(B_2)\,P(D/B_2) + P(B_3)\,P(D/B_3)}$$

$$= \frac{0.35 \times 0.04}{0.25 \times 0.05 + 0.35 \times 0.04 + 0.40 \times 0.02} = \frac{0.014}{0.0125 + 0.014 + 0.008} = \frac{0.014}{0.0345} = 0.406$$

Solution to Bayes’ Theorem Problem

P(B3/D) = 1 – [P(B1/D)+ P(B2/D)]


= 1 – [0.362 + 0.406]
= 1 – 0.768 = 0.232


Problems on Bayes’ Theorem contd..

Q.2 The contents of urns I, II and III are as


follows:
1 white, 2 black and 3 red balls;
2 white, 1 black and 1 red balls; and
4 white, 5 black and 3 red balls.
One urn is chosen at random and two balls are
drawn. They happen to be white and red. What
is the probability that they came from urn I, II or
III?
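Q.2 is left without a worked answer. The sketch below shows one way to set it up, assuming each urn is chosen with probability 1/3 and the two balls are drawn together without replacement; these assumptions and the variable names are not stated in the slides.

```python
from fractions import Fraction
from math import comb

# Urn contents as (white, black, red) counts.
urns = {"I": (1, 2, 3), "II": (2, 1, 1), "III": (4, 5, 3)}
prior = Fraction(1, 3)   # assumed: each urn equally likely to be chosen

# Likelihood of drawing one white and one red ball from each urn.
likelihood = {
    name: Fraction(w * r, comb(w + b + r, 2))
    for name, (w, b, r) in urns.items()
}

# Bayes' theorem: posterior = prior * likelihood / total probability of the evidence.
evidence = sum(prior * l for l in likelihood.values())
posterior_by_urn = {name: prior * l / evidence for name, l in likelihood.items()}
print(posterior_by_urn)
```

Under these assumptions the three posterior probabilities come out as 33/118, 55/118 and 15/59, which sum to 1.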

Problems on Bayes’ Theorem contd..

Q3. Three newspapers A, B, C are published in a


city and a survey of readers indicates the following:
20% read A, 16% read B, 14% read C, 8% read
both A and B, 5% read both A and C, 4% read both
B and C, 2% read all the three.
For a person chosen at random, find the probability
that he reads none of the papers.
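Q3 is also stated without a solution. One possible route is the addition rule extended to three events (inclusion-exclusion) followed by the complement rule; the sketch below uses illustrative names and works in whole percentages.

```python
from fractions import Fraction

# Survey percentages from Q3.
singles = [20, 16, 14]   # read A, read B, read C
pairs = [8, 5, 4]        # read A and B, A and C, B and C
triple = 2               # read all three

# Inclusion-exclusion: add singles, subtract pairs, add back the triple overlap.
pct_at_least_one = sum(singles) - sum(pairs) + triple

# Complement rule: P(none) = 1 - P(at least one).
p_none = Fraction(100 - pct_at_least_one, 100)
print(pct_at_least_one, p_none)  # 35 13/20
```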


Counting Rules Are Often Useful In


Computing Probabilities Objective

 In many cases, there are a large number of


possible outcomes.

 We have various counting rules for such


situations.

Counting Rules

Rules for counting the number of possible outcomes

• Counting Rule 1:
  If any one of k different mutually exclusive and collectively
  exhaustive events can occur on each of n trials, the number of
  possible outcomes is equal to

  $$k^n$$

  Example: If you roll a fair die 3 times, then there are 6^3 = 216
  possible outcomes.


Counting Rules
(continued)

 Counting Rule 2:
 If there are k1 events on the first trial, k2 events on
the second trial, … and kn events on the nth trial, the
number of possible outcomes is

(k1)(k2)…(kn)
 Example:
 You want to go to a park, eat at a restaurant, and see a
movie. There are 3 parks, 4 restaurants, and 6 movie
choices. How many different possible combinations are
there?
 Answer: (3)(4)(6) = 72 different possibilities

Counting Rules
(continued)

 Counting Rule 3:
 The number of ways that n items can be arranged in
order is
n! = (n)(n – 1)…(1)

 Example:
 You have five books to put on a bookshelf. How many
different ways can these books be placed on the shelf?

 Answer: 5! = (5)(4)(3)(2)(1) = 120 different possibilities


Counting Rules
(continued)

• Counting Rule 4:
  Permutations: The number of ways of arranging X objects selected from
  n objects in order is

  $${}_{n}P_{X} = \frac{n!}{(n - X)!}$$

  Example: You have five books and are going to put three on a
  bookshelf. How many different ways can the books be ordered on the
  bookshelf?

  Answer:

  $${}_{n}P_{X} = \frac{n!}{(n - X)!} = \frac{5!}{(5 - 3)!} = \frac{120}{2} = 60 \text{ different possibilities}$$

Counting Rules
(continued)

• Counting Rule 5:
  Combinations: The number of ways of selecting X objects from n
  objects, irrespective of order, is

  $${}_{n}C_{X} = \frac{n!}{X!\,(n - X)!}$$

  Example: You have five books and are going to select three to read.
  How many different combinations are there, ignoring the order in
  which they are selected?

  Answer:

  $${}_{n}C_{X} = \frac{n!}{X!\,(n - X)!} = \frac{5!}{3!\,(5 - 3)!} = \frac{120}{(6)(2)} = 10 \text{ different possibilities}$$
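The five counting rules can be verified both by formula and by brute-force enumeration. The following sketch uses only the Python standard library (`math.perm` and `math.comb` require Python 3.8 or later); the book labels are placeholders.

```python
from itertools import combinations, permutations, product
from math import comb, factorial, perm

books = ["A", "B", "C", "D", "E"]   # placeholder labels for the five books

# Rule 1: k^n outcomes, here a die (k = 6) rolled n = 3 times.
rule1 = len(list(product(range(1, 7), repeat=3)))
# Rule 2: (k1)(k2)...(kn), here 3 parks, 4 restaurants, 6 movies.
rule2 = 3 * 4 * 6
# Rule 3: n! orderings of all five books.
rule3 = len(list(permutations(books)))
# Rule 4: permutations of 3 books chosen from 5 (order matters).
rule4 = len(list(permutations(books, 3)))
# Rule 5: combinations of 3 books chosen from 5 (order ignored).
rule5 = len(list(combinations(books, 3)))

print(rule1, rule2, rule3, rule4, rule5)   # 216 72 120 60 10
print(6 ** 3, factorial(5), perm(5, 3), comb(5, 3))   # 216 120 60 10
```

The enumerated counts agree with the closed-form values, which is a useful sanity check when deciding whether order matters in a given problem.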


Discrete Probability Distributions

Introduction to Probability Distributions

• Random Variable: represents a possible numerical value from an
  uncertain event.
• Random variables are of two kinds: discrete random variables and
  continuous random variables.


Discrete Random Variables


 Can only assume a countable number of values
Examples:

 Roll a die twice


Let X be the number of times 4 comes up
(then X could be 0, 1, or 2 times)

 Toss a coin 5 times.


Let X be the number of heads
(then X = 0, 1, 2, 3, 4, or 5)

Discrete Probability Distribution

Experiment: Toss 2 coins. Let X = number of heads.

4 possible outcomes: TT, TH, HT, HH

Probability Distribution

X Value    Probability
0          1/4 = 0.25
1          2/4 = 0.50
2          1/4 = 0.25

[Figure: bar chart of Probability against X = 0, 1, 2, with bar heights 0.25, 0.50 and 0.25]


Random Variables

Random Variable: A random variable X is a function whose domain is the
sample space S and whose range is a subset of the set of real numbers R,
i.e. X: S → R. Thus, a random variable assigns a real number to each
possible outcome of an experiment, and the function has to be a one-one
or many-one correspondence.

Ex-1 Suppose two coins are tossed simultaneously and the random variable
X is defined as the number of heads. The sample space of the experiment
is S = {TT, HT, TH, HH}, and since X is the number of heads it takes
three distinct values {0, 1, 2}.

Random Variables Contd…

Ex-2 Let X be a random variable defined as the absolute difference of
the numbers that appear when a pair of dice is rolled. There are 36
possible outcomes, i.e.,

S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3),
(2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
(4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3),
(5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

Since the random variable X is the absolute difference of the numbers,
its values are {0, 1, 2, 3, 4, 5}.


Probability Distribution Contd…

Ex-3 Suppose a random variable X is the number of tails when a coin is
flipped thrice. The sample space is
S = {HHH, THH, HTH, HHT, TTH, THT, HTT, TTT}
Therefore the required probability distribution is

Value of R.V. X = xi     0      1      2      3
Prob. P(X = xi)         1/8    3/8    3/8    1/8

Probability Distribution Contd…


Ex-4 A random variable is sum of the numbers that appear when a
pair of dice is rolled. Here the sample space is,
S={(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4),
(2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3),
(4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2),
(6,3), (6,4), (6,5), (6,6)}
Therefore the required probability distribution is

X = xi       2     3     4     5     6     7     8     9     10    11    12
P(X = xi)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
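A probability distribution such as the one above can be built mechanically by applying the random variable to every outcome of the sample space and counting. The helper `pmf` below is a hypothetical name for a minimal sketch that assumes all outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def pmf(random_variable, sample_space):
    """Map each value of the random variable to its probability,
    assuming every outcome in sample_space is equally likely."""
    counts = Counter(random_variable(outcome) for outcome in sample_space)
    n = len(sample_space)
    return {x: Fraction(c, n) for x, c in sorted(counts.items())}

two_dice = list(product(range(1, 7), repeat=2))   # the 36 outcomes of Ex-2 and Ex-4

pmf_sum = pmf(sum, two_dice)                                    # Ex-4: sum of the faces
pmf_abs_diff = pmf(lambda o: abs(o[0] - o[1]), two_dice)        # Ex-2: absolute difference

for x, p in pmf_sum.items():
    print(x, p)
```

`Fraction` prints reduced values (for example 6/36 appears as 1/6), and the probabilities in each distribution sum to 1.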


Discrete Random Variable

A random variable X is said to be discrete if it takes a finite or
countably infinite number of possible values. Thus, a discrete random
variable takes only isolated values. The random variables mentioned in
the previous examples are discrete random variables.
Some practical examples of discrete random variables are:
• Number of accidents on an expressway
• Number of cars arriving at a petrol pump
• Number of students attending class
• Number of customers arriving at a shop, etc.

Probability Mass Function (p.m.f.)

Let X be a discrete random variable defined on a sample space S. Suppose
{x1, x2, …, xn} is the range set of X. With each xi we associate a
number P(xi) = P(X = xi), called the probability mass function (p.m.f.),
such that
• P(xi) ≥ 0 for i = 1, 2, …, n, and
• ∑ P(xi) = 1.

The table containing the values of X along with the probabilities given
by the probability mass function is called the probability distribution
of the random variable X.

Example
Let X represent the difference between the number of heads and the
number of tails obtained when a fair coin is tossed 3 times. What are
the possible values of X and its p.m.f.?
S = {HHH, THH, HTH, HHT, TTH, THT, HTT, TTT}
The probability of a head or a tail in each toss is 1/2, so each of the
8 outcomes has probability 1/8, and X can take the values 3, 1, -1, -3.
Thus the probability distribution of X is

X = xi       -3     -1      1      3     Total = ∑P(xi)
P(X = xi)    1/8    3/8    3/8    1/8    1


Practice Question

Suppose a fair die is rolled twice. Find the possible values of the
random variable X and its associated p.m.f., if:
(i) X is the maximum of the values appearing in the two rolls
(ii) X is the minimum of the values appearing in the two rolls
(iii) X is the sum of the values appearing in the two rolls
(iv) X is the value appearing in the first roll minus the value
appearing in the second roll

Discrete Variables
Expected Value (Measuring Center)

Expected value (or mean) of a discrete distribution (weighted average):

$$\mu = E(X) = \sum_{i=1}^{N} X_i\,P(X_i)$$

Example: Toss 2 coins, X = # of heads, compute the expected value of X:

X    P(X)
0    0.25
1    0.50
2    0.25

E(X) = (0 × 0.25) + (1 × 0.50) + (2 × 0.25)
= 1.0


Discrete Random Variables
Measuring Dispersion
(continued)
 Variance of a discrete random variable

$$\sigma^2 = \sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X_i)$$

 Standard Deviation of a discrete random variable

$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X_i)}$$

where:
E(X) = Expected value of the discrete random variable X
Xi = the ith outcome of X
P(Xi) = Probability of the ith occurrence of X

Discrete Random Variables
Measuring Dispersion

 Example: Toss 2 coins, X = # heads,
compute standard deviation (recall E(X) = 1)

X    P(X)
0    0.25
1    0.50
2    0.25

$$\sigma = \sqrt{\sum_{i=1}^{N} [X_i - E(X)]^2 \, P(X_i)}$$

$$\sigma = \sqrt{(0-1)^2(0.25) + (1-1)^2(0.50) + (2-1)^2(0.25)} = \sqrt{0.50} = 0.707$$

Possible number of heads
= 0, 1, or 2
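
The expected value and standard deviation of the two-coin example can be reproduced directly from the definitions. This is a small Python sketch under that reading of the formulas; the names `values` and `probs` are illustrative.

```python
import math

# Distribution of X = number of heads in two fair coin tosses.
values = [0, 1, 2]
probs = [0.25, 0.50, 0.25]

# E(X) is the probability-weighted average of the outcomes.
mean = sum(x * p for x, p in zip(values, probs))

# Variance is the probability-weighted average squared deviation from E(X).
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 3))
```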


The Binomial Distribution


[Diagram: Probability Distributions → Discrete Probability Distributions → Binomial, Poisson]

Binomial Distribution
A Bernoulli trial is a trial that results in either
success or failure. The binomial random variable, which is
very useful in practice, counts the number of
successes when n Bernoulli trials are performed.

Each trial results in a success with
probability p or a failure with probability
q = 1 - p.


Binomial Distribution contd..


If X represents the number of successes that occur in n
trials, then X is said to be a binomial random variable and
its probability distribution is known as the binomial
distribution. The probability mass function of the binomial
distribution is defined as follows:

$$P(X = x) = \binom{n}{x}\, p^{x} q^{\,n-x}, \qquad x = 0, 1, \ldots, n$$

Here, n is the number of independent trials (a
positive integer), 0 ≤ p ≤ 1 and q = 1 - p.
The binomial distribution contains two independent
constants n and p, which are called the parameters of
the binomial distribution. It is denoted by B(n, p).

Mean, Variance and Mode of the


Binomial Distribution

If X has a binomial distribution,


 Mean of X= E(X) = np
 Variance of X= Var(X) = npq
Mode is that value of X for which P(X) is maximum.
So for binomial distribution,
 when (n+1)p is not an integer, then
mode = integral part of (n+1)p
 when (n+1)p is an integer, then we obtain two modes i.e.
mode = (n+1)p and (n+1)p – 1
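
The formulas for the mean, variance and mode can be checked numerically against the p.m.f. itself. The sketch below does this in Python for one illustrative choice of parameters (n = 16, p = 0.5, picked here as an assumption for the demonstration); the helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 16, 0.5  # illustrative parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definitions, for comparison with np and npq.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

# Mode found by direct search, for comparison with the integral part of (n+1)p.
mode = max(range(n + 1), key=lambda x: pmf[x])

print(mean, n * p)
print(variance, n * p * (1 - p))
print(mode, int((n + 1) * p))
```

Here (n+1)p = 8.5 is not an integer, so a single mode is expected.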


Application of Binomial Distribution


Deciding when to use the binomial distribution is
important. The binomial distribution can be
used when the following conditions are
satisfied:
 The number of trials n is finite (not very large) and
the trials are performed repeatedly
 Each trial is a Bernoulli trial, i.e. one
that results in either success or failure
 The probability of success in any trial is p and is
constant for each trial
 All trials are independent

Real Life Example of Applications of


Binomial Distribution

 Number of defective items in a lot of n items


produced by a machine
 Number of male births out of n births in a
hospital
 Number of correct answers in a multiple
choice test
 Number of missiles hitting the targets out of n
fired
 Number of seeds germinated in a row of n
planted seeds, etc.


Calculating a Binomial Probability


What is the probability of one success in five
observations if the probability of success is 0.1?
X = 1, n = 5, and p = 0.1

$$
\begin{aligned}
P(X = 1) &= \frac{n!}{X!\,(n-X)!}\, p^{X} (1-p)^{n-X} \\
         &= \frac{5!}{1!\,(5-1)!}\, (0.1)^{1} (1-0.1)^{5-1} \\
         &= (5)(0.1)(0.9)^{4} \\
         &= 0.32805
\end{aligned}
$$

Binomial Distribution
 The shape of the binomial distribution depends on the
values of p and n
[Figure: two bar charts of P(X) against X = 0, 1, …, 5, one for n = 5 and p = 0.1, one for n = 5 and p = 0.5]


Example-1

Determine the binomial distribution for which the
mean is 8 and variance is 4 and also find its mode.
It is given that mean = 8 and variance = 4, i.e.
np = 8 and npq = 4
Dividing, q = npq/np = 4/8 = 1/2, so p = 1/2 and n = 8/p = 16.
Thus the binomial distribution is B(16, 1/2).
Also (n+1)p = 17/2 = 8 + 1/2 is not an integer, which implies that the
mode = 8 (integral part of (n+1)p)

Example-2

It is given that n = 10, p = q = 1/2. Find P(X ≤ 1).

P(X ≤ 1) = P(X = 0) + P(X = 1)
= 10C0 p^0 q^10 + 10C1 p^1 q^9
= (1/2)^10 + 10 (1/2)(1/2)^9
= 11/1024 ≈ 0.01


Example-3

In a shooting competition, the probability of a man
hitting a target is 2/5. If he fires 5 times, what is the
probability of hitting the target (i) at least twice, (ii) at
most twice?
Let p = P(hitting the target) = 2/5, q = 3/5, n = 5.
(i) P(at least two hits) = 1 – [P(no hit) + P(one
hit)] = 0.66
(ii) P(at most two hits) = P(no hit) + P(one
hit) + P(two hits) = 0.68
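
The two answers in Example-3 can be reproduced from the binomial p.m.f. A minimal Python sketch follows; the helper `binomial_pmf` is a hypothetical name introduced only for this check.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 2 / 5  # five shots, each hitting with probability 2/5

# (i) at least two hits: complement of zero or one hit
p_at_least_two = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

# (ii) at most two hits: zero, one or two hits
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(3))

print(round(p_at_least_two, 2), round(p_at_most_two, 2))
```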

Example-4

If on average 1 vessel in every 10 is wrecked,
what is the probability that out of 5 vessels
expected to arrive, at least 4 will arrive safely?
Let p be the prob. of a vessel arriving safely. Then
p = 1 - prob. of a vessel being wrecked
= 1 – 1/10 = 9/10
q = 1 - p = 1/10, n = 5
Required Prob. = p(4) + p(5)
= 5 (0.9)^4 (0.1) + (0.9)^5 = 0.32805 + 0.59049 = 0.91854


Practice Problems

Que-1 For a binomial variate x, find p if n=4 and


P(X=4) = 6 P(X=2)
Que-2 Let X be a r.v. having a binomial
distribution with parameters n=100 and p=0.1.
Evaluate P[X≤ µx-3σx].
Que-3 If the sum and product of the mean and
variance of a binomial distribution are 24 and
128, find the distribution?

Practice Question

Que: A biased coin has probability of heads


as 1/3. This coin is tossed 6 times. Find the
probability of getting:
(i) 4 heads
(ii) At least 2 heads
(iii) At the most 1 head
(iv) 4 tails
(v) At least 1 tail
(vi) All tails


Using Excel For The


Binomial Distribution


The Poisson Distribution


[Diagram: Probability Distributions → Discrete Probability Distributions → Binomial, Poisson]

The Poisson Distribution


Definitions

 Use the Poisson distribution when you are


interested in the number of times an event
occurs in a given area of opportunity.
 An area of opportunity is a continuous unit or
interval of time, volume, or such area in which
more than one occurrence of an event can
occur.
 The number of scratches in a car’s paint
 The number of mosquito bites on a person
 The number of computer crashes in a day


The Poisson Distribution

 Apply the Poisson Distribution when:


 You wish to count the number of times an event
occurs in a given area of opportunity
 The probability that an event occurs in one area of
opportunity is the same for all areas of opportunity
 The number of events that occur in one area of
opportunity is independent of the number of events
that occur in the other areas of opportunity
 The probability that two or more events occur in an
area of opportunity approaches zero as the area of
opportunity becomes smaller
 The average number of events per unit is λ (lambda)

Poisson Distribution
The Poisson distribution is the limiting case of the binomial
distribution in which
 the number of trials is indefinitely large, i.e. n → ∞,
 the constant probability of success for each trial is very small,
i.e. p → 0, and
 np = λ is a finite constant.

Its probability mass function (p.m.f.) is given by

$$P(X = x) = \frac{e^{-\lambda}\, \lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$

where λ is called the parameter of this
distribution. It is denoted by P(λ).
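
The limiting relationship with the binomial distribution can be seen numerically: holding np = λ fixed while n grows, the binomial probabilities approach the Poisson ones. The Python sketch below illustrates this for λ = 3 and x = 2, values chosen here only as an assumption for the demonstration; the helper names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 3.0, 2  # illustrative values
target = poisson_pmf(x, lam)

# Keep n * p = lam fixed while n grows and p shrinks.
errors = {}
for n in (10, 100, 10_000):
    errors[n] = abs(binomial_pmf(x, n, lam / n) - target)
    print(n, errors[n])
```

The absolute difference shrinks as n increases, which is the sense in which the Poisson p.m.f. approximates the binomial for large n and small p.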


Application of Poisson Distribution

The Poisson random variable has a tremendous
range of applications in diverse areas. Some
common applications of the Poisson distribution
are:

 Number of accidents on an expressway in one day
 Number of misprints on a page
 Number of vehicles arriving at a petrol pump in one hour
 Number of earthquakes occurring in one year in a
particular seismic zone
 Number of deaths of policy holders in one year

Mean, Variance and Mode of the


Poisson Distribution

If X has a Poisson distribution, then
 Mean of X = E(X) = λ
 Variance of X = Var(X) = λ

Mode is that value of X for which P(X) is maximum.
So for the Poisson distribution,
 when λ is not an integer, then mode = integral part of λ
 when λ is an integer, then we obtain two modes, i.e. mode
= λ and λ – 1


Example
Find P(X = 2 | λ = 0.50)

$$P(X = 2 \mid \lambda = 0.50) = \frac{e^{-\lambda} \lambda^{X}}{X!} = \frac{e^{-0.50} (0.50)^{2}}{2!} = 0.0758$$

Example-1

Suppose the probability that an individual suffers a bad
reaction from an injection is 0.001. What is the
probability that out of 3000 individuals
(i) exactly 3,
(ii) more than 2 individuals will suffer a bad reaction?


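Example-1 (continued)

Here, λ = np = 3000 × 0.001 = 3

(i) Required prob. = P(X = 3)

$$P(X = 3) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-3}\, 3^{3}}{3!} = 0.22$$

(ii) Required prob. = P(X > 2) = 1 - P(X ≤ 2)
= 1 - e^{-3} (1 + 3 + 9/2) = 0.58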
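
Both parts of this example follow from the Poisson p.m.f. with λ = 3. A short Python sketch of the calculation is given below; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 3000 * 0.001  # lambda = n * p

# (i) exactly three bad reactions
p_exactly_3 = poisson_pmf(3, lam)

# (ii) more than two: complement of 0, 1 or 2 bad reactions
p_more_than_2 = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(p_exactly_3, 2), round(p_more_than_2, 2))
```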

Example-2

Find the probability that at most 5
defective fuses will be found in a box
of 200 fuses if experience shows that
2% of such fuses are defective. (e^-4 = 0.0183)


Example-2

Let the presence of a defective fuse in the


box be called a success. Then
p = 0.02 and n = 200
Hence λ= 0.02*200 = 4
Required probability
= P(0)+P(1)+P(2)+P(3)+P(4)+P(5) = 0.78

Example-3

A manufacturer of compact disc knows that 5%


of his product is defective. If he sells CD in
boxes of 100 and guarantees that not more
than 10 CD will be defective, what is the
approximate probability that a box will fail to
meet the guaranteed quality?


Example-3

Here, n = 100, p = 5% = 0.05
λ = 100 × 0.05 = 5

Prob. that a box will fail to meet the
guaranteed quality is P(X > 10) = 1 - P(X ≤ 10)
= 1 - 0.9863 = 0.0137 (using the Poisson distribution with λ = 5)
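
The word "approximate" in this example refers to using the Poisson distribution in place of the exact binomial B(100, 0.05). The Python sketch below computes both tail probabilities so the quality of the approximation can be seen; the helper names are hypothetical and the comparison with the exact binomial is an addition, not part of the slide's solution.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 100, 0.05
lam = n * p

# P(more than 10 defectives) = 1 - P(X <= 10)
p_fail_poisson = 1 - sum(poisson_pmf(x, lam) for x in range(11))
p_fail_binomial = 1 - sum(binomial_pmf(x, n, p) for x in range(11))

print(round(p_fail_poisson, 4), round(p_fail_binomial, 4))
```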

Practice Problems

Que-1 Number of errors on a single page has Poisson


distribution with average number of errors of one per
page. Calculate the probability that there is at least one
error on a page.


Practice Problems

Que-2 The number of accidents on an expressway each day
is a Poisson random variable with an average of three
accidents per day. What is the probability that no
accident will occur today?

Practice Problems

Que-3 A car hire firm has two cars which it hires out day
by day. The number of demands for a car on each day
is distributed as a Poisson variate with mean 1.5.
Calculate the proportion of days on which (i) neither
car is used and (ii) some of the demand is refused.


Practice Problems

Que-4 If X has a Poisson distribution and
P(X=0) = 1/2, what is E(X)?

Practice Problems

Que-5 If X is a Poisson variate such that


P(X=2)=9 P(X=4)+90 P(X=6). Find E(X).


Using Excel For The


Poisson Distribution

Graph of Poisson Probabilities

Poisson probabilities for λ = 0.50:

X    P(X)
0    0.6065
1    0.3033
2    0.0758
3    0.0126
4    0.0016
5    0.0002
6    0.0000
7    0.0000

P(X = 2 | λ = 0.50) = 0.0758
[Figure: bar chart of P(X) against X for λ = 0.50]


Poisson Distribution Shape

 The shape of the Poisson Distribution
depends on the parameter λ:
[Figure: bar charts of Poisson probabilities for λ = 0.50 and λ = 3.00]

Continuous Random Variable

A random variable may also be such that
its set of possible values is uncountable.
Examples of such random variables are the time
between the arrivals of two vehicles at a petrol
pump, the time taken for an operation at a
hospital, or the lifetime of a component.


Probability Density Function (p.d.f.)

Just as we have a p.m.f. for a discrete random variable, we define a
p.d.f. for a continuous random variable. Let X be a
continuous random variable. A function f(x) defined for all
real x ∈ (-∞, ∞) is called the probability density function of X if, for any
set B of real numbers,

$$P(X \in B) = \int_{B} f(x)\,dx$$

and, in particular,

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$

Properties of Random Variables and
their Probability Distributions

 Axiom-I Any probability must be between zero and one.
For a discrete random variable, this can be stated as
0 ≤ P(xi) ≤ 1
For a continuous r.v.,

$$0 \le P(a \le X \le b) = \int_{a}^{b} f(x)\,dx \le 1$$

 Axiom-II The total probability of the sample space must be one.
For a discrete random variable, this can be stated
as ∑P(xi) = 1
For a continuous r.v.,

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$


Properties of Random Variables and their
Probability Distributions Contd…

 Axiom-III For any sequence of mutually exclusive events, the
probability of the union of the events is the sum of their individual
probabilities, i.e. if E1, E2, … are mutually exclusive events
then
for a discrete random variable,

$$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$$

For a continuous r.v., with a ≤ c ≤ b,

$$\int_{a}^{b} f(x)\,dx = \int_{a}^{c} f(x)\,dx + \int_{c}^{b} f(x)\,dx$$


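Example

Find the probability between X = 1 and X = 2, i.e.
P(1 ≤ X ≤ 2), for a continuous r.v. whose p.d.f. is given
below:
f(x) = x/6 + k for 0 ≤ x ≤ 3
Since the p.d.f. must satisfy Axiom-II,

$$\int_{0}^{3} \left(\frac{x}{6} + k\right) dx = \frac{3}{4} + 3k = 1 \;\Rightarrow\; k = \frac{1}{12}$$

Now,

$$P(1 \le X \le 2) = \int_{1}^{2} \left(\frac{x}{6} + \frac{1}{12}\right) dx = \frac{1}{3}$$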
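
The value of k and the required probability can be verified with exact arithmetic. The Python sketch below integrates f(x) = x/6 + k by hand (antiderivative x²/12 + kx) and evaluates it with fractions; the function name `area` is illustrative.

```python
from fractions import Fraction

def area(a, b, k):
    """Integral of f(x) = x/6 + k over [a, b], using F(x) = x**2/12 + k*x."""
    F = lambda x: Fraction(x * x, 12) + k * x
    return F(b) - F(a)

# Axiom-II: the total area over [0, 3] is 9/12 + 3k, which must equal 1.
k = (1 - Fraction(3**2, 12)) / 3

total = area(0, 3, k)
prob = area(1, 2, k)

print(k, total, prob)
```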


Probability Distributions
 Discrete Probability Distributions: Binomial, Poisson
 Continuous Probability Distributions: Normal

Continuous Probability Distributions


 A continuous random variable is a variable that
can assume any value on a continuum (can
assume an uncountable number of values)
 thickness of an item
 time required to complete a task
 temperature of a country
 height, in inches

 These can potentially take on any value


depending only on the ability to precisely and
accurately measure


The Normal Distribution

 ‘Bell Shaped’
 Symmetrical
 Mean, Median and Mode are Equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: -∞ to +∞
[Figure: bell-shaped curve f(X) centred at μ, where Mean = Median = Mode, with spread σ]

The Normal Distribution
Density Function

 The formula for the normal probability density function is

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\; e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable


Many Normal Distributions

By varying the parameters μ and σ, we obtain


different normal distributions

The Normal Distribution
Shape

Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread.
[Figure: normal curve f(X) with its centre μ and spread σ marked]


The Standardized Normal

 Any normal distribution (with any mean and


standard deviation combination) can be
transformed into the standardized normal
distribution (Z)

 Need to transform X units into Z units

 The standardized normal distribution (Z) has a


mean of 0 and a standard deviation of 1

Translation to the Standardized
Normal Distribution

 Translate from X to the standardized normal
(the “Z” distribution) by subtracting the mean
of X and dividing by its standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

The Z distribution always has mean = 0 and
standard deviation = 1


The Standardized Normal
Probability Density Function
 The formula for the standardized normal
probability density function is

$$f(Z) = \frac{1}{\sqrt{2\pi}}\; e^{-\frac{1}{2} Z^{2}}$$

Where e = the mathematical constant approximated by 2.71828


π = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal distribution

The Standardized
Normal Distribution

 Also known as the “Z” distribution


 Mean is 0
 Standard Deviation is 1
f(Z)

0 Z

Values above the mean have positive Z-values,


values below the mean have negative Z-values


Example

 If X is distributed normally with mean of $100
and standard deviation of $50, the Z value
for X = $200 is

$$Z = \frac{X - \mu}{\sigma} = \frac{\$200 - \$100}{\$50} = 2.0$$

 This says that X = $200 is two standard
deviations (2 increments of $50 units) above
the mean of $100.

Comparing X and Z units

[Figure: one normal curve with two horizontal scales: X in dollars (μ = $100, σ = $50) marking $100 and $200, and Z (μ = 0, σ = 1) marking 0 and 2.0]
Note that the shape of the distribution is the same,
only the scale has changed. We can express the
problem in the original units (X in dollars) or in
standardized units (Z)


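Finding Normal Probabilities

Probability is measured by the area
under the curve:
P(a ≤ X ≤ b) = P(a < X < b)
(the two are equal because the probability of any single value is zero)
[Figure: normal curve f(X) with the area between a and b shaded]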

Probability as
Area Under the Curve
The total area under the curve is 1.0, and the curve is
symmetric, so half is above the mean, half is below:

$$P(-\infty < X < \mu) = 0.5, \qquad P(\mu < X < \infty) = 0.5, \qquad P(-\infty < X < \infty) = 1.0$$


The Standardized Normal Table

 The Cumulative Standardized Normal table
gives the probability less than a desired value
of Z (i.e., from negative infinity to Z)

Example:
P(Z < 2.00) = 0.5000 + 0.4772 = 0.9772
[Figure: standard normal curve with the area to the left of Z = 2.00 shaded]

General Procedure for


Finding Normal Probabilities

To find P(a < X < b) when X is


distributed normally:

 Draw the normal curve for the problem in


terms of X

 Translate X-values to Z-values

 Use the Standardized Normal Table


Finding Normal Probabilities


 Let X represent the time it takes (in seconds)
to download an image file from the internet.
 Suppose X is normal with a mean of 18.0
seconds and a standard deviation of 5.0
seconds. Find P(X < 18.6)


Finding Normal Probabilities
(continued)
 X is normal with a mean of 18.0 seconds and a standard
deviation of 5.0 seconds. To find P(X < 18.6), standardize X = 18.6:

$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12$$

so P(X < 18.6) on the X scale (μ = 18, σ = 5) equals
P(Z < 0.12) on the Z scale (μ = 0, σ = 1).


Solution: Finding P(Z < 0.12)

P(X < 18.6)
= P(Z < 0.12)
= 0.5000 + 0.0478
= 0.5478

Finding Normal
Upper Tail Probabilities

 Suppose X is normal with mean 18.0 and


standard deviation 5.0.
 Now Find P(X >18.6)



Finding Normal
Upper Tail Probabilities
(continued)

 Now Find P(X > 18.6)…

P(X > 18.6) = P(Z > 0.12) = 0.5 - P(0 ≤ Z ≤ 0.12)
= 0.5 - 0.0478 = 0.4522

Finding a Normal Probability
Between Two Values

 Suppose X is normal with mean 18.0 and
standard deviation 5.0. Find P(18 < X < 18.6)

Calculate Z-values:

$$Z = \frac{X - \mu}{\sigma} = \frac{18 - 18}{5} = 0, \qquad Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18}{5} = 0.12$$

P(18 < X < 18.6)
= P(0 < Z < 0.12)


Probabilities in the Lower Tail

 Suppose X is normal with mean 18.0 and


standard deviation 5.0.
 Now Find P(17.4 < X < 18)


Probabilities in the Lower Tail
(continued)

Now Find P(17.4 < X < 18)…

The Normal distribution is symmetric, so P(-0.12 < Z < 0)
is the same as P(0 < Z < 0.12):

P(17.4 < X < 18)
= P(-0.12 < Z < 0)
= P(0 < Z < 0.12)
= 0.0478
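
The table look-ups used in these download-time examples can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) as a stand-in for the printed Z table; the variable names are illustrative.

```python
from statistics import NormalDist

# Download time X ~ Normal(mean = 18.0 s, standard deviation = 5.0 s)
download_time = NormalDist(mu=18.0, sigma=5.0)

z = (18.6 - 18.0) / 5.0                      # standardized value, about 0.12
p_below = download_time.cdf(18.6)            # P(X < 18.6)
p_above = 1 - p_below                        # P(X > 18.6)
p_between = download_time.cdf(18.6) - download_time.cdf(18.0)  # P(18 < X < 18.6)
p_lower = download_time.cdf(18.0) - download_time.cdf(17.4)    # P(17.4 < X < 18)

print(round(z, 2), round(p_below, 4), round(p_above, 4),
      round(p_between, 4), round(p_lower, 4))
```

By symmetry the last two probabilities agree, as argued in the slides.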


Empirical Rules

What can we say about the distribution of values
around the mean? For any normal distribution:

 μ ± 1σ encloses about 68.26% of X’s
[Figure: normal curve f(X) with the central area between μ-1σ and μ+1σ shaded]

The Empirical Rule
(continued)

 μ ± 2σ covers about 95% of X’s (95.44%)
 μ ± 3σ covers about 99.7% of X’s (99.73%)
[Figure: two normal curves with the central areas within 2σ and within 3σ of μ shaded]
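
These coverage percentages can be recomputed from the standard normal distribution. The following Python sketch uses `statistics.NormalDist` (standard library, Python 3.8+); results computed this way may differ from table-based figures in the last digit because printed tables are rounded to four decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
coverage = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in coverage.items():
    print(f"within {k} sigma: {area:.4%}")
```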


Given a Normal Probability
Find the X Value

 Steps to find the X value for a known
probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

$$X = \mu + Z\sigma$$

Finding the X value for a
Known Probability
(continued)

Example:
 Let X represent the time it takes (in seconds) to
download an image file from the internet.
 Suppose X is normal with mean 18.0 and standard
deviation 5.0
 Find X such that 20% of download times are less than
X.
[Figure: normal curve with a lower-tail area of 0.2000 to the left of the unknown X value; the mean 18.0 corresponds to Z = 0]


Find the Z value for
20% in the Lower Tail

1. Find the Z value for the known probability
 20% area in the lower
tail is consistent with a
Z value of -0.84
[Figure: normal curve with area 0.2000 below Z = -0.84 and area 0.3000 between Z = -0.84 and Z = 0 (X = 18.0)]

Finding the X value

2. Convert to X units using the formula:

$$X = \mu + Z\sigma = 18.0 + (-0.84)(5.0) = 13.8$$

So 20% of the values from a distribution with
mean 18.0 and standard deviation 5.0 are less
than 13.80
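
The reverse look-up (probability to Z, then Z to X) can also be done in software. This Python sketch uses `NormalDist.inv_cdf` from the standard library (Python 3.8+) in place of the printed table; since the table value -0.84 is rounded, the computed X differs slightly from 13.80.

```python
from statistics import NormalDist

mu, sigma = 18.0, 5.0

# Step 1: Z value with 20% of the area below it
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma
x = mu + z * sigma

# The same result in one step, directly on the X scale
x_direct = NormalDist(mu=mu, sigma=sigma).inv_cdf(0.20)

print(round(z, 2), round(x, 2), round(x_direct, 2))
```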


Practice Problems

1. X is a normal variate with mean 30 and s.d. 5.


Find the probabilities that 26≤X≤40 and X≥45.

Practice Problems

2. X is normal variate with mean 1 and s.d.


3, find P[3.43≤X≤6.19] and P[-1.43≤X≤6.19]


Practice Problems

3. If X is normally distributed with mean 11 and


s.d.1.5, find the number X0 such that P[X>X0]=0.3 and
P[X>X0]=0.09

Practice Problems Contd…

4. If X is a normal variate with mean 50 and
s.d. 10, find P[Y ≤ 3137], where Y = X^2 + 1


Practice Problems Contd…


5. The marks obtained by candidates in statistics in a
certain examination are found to be normally distributed. If
12.5% of the candidates obtain 60 or more marks and
39% obtain less than 30 marks. Find the mean number of
marks obtained by the candidates.

Practice Problems Contd…

6. Small stones are collected and weights are assumed to


be normal. It is found that 5% of the stones are under 30
gm and 80% are under 50 gm. What are the mean and
standard deviation of the distribution?


Two Measures Of The Relationship Between Two


Numerical Variables

 Scatter plots allow you to visually examine the


relationship between two numerical variables
and now we will discuss two quantitative
measures of such relationships.

 The Covariance
 The Coefficient of Correlation

The Covariance

 The covariance measures the strength of the linear
relationship between two numerical variables (X & Y)

 The sample covariance:

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

 Only concerned with the strength of the relationship
 No causal effect is implied


Interpreting Covariance

 Covariance between two variables:
cov(X,Y) > 0 X and Y tend to move in the same direction
cov(X,Y) < 0 X and Y tend to move in opposite directions
cov(X,Y) = 0 X and Y have no linear relationship (independent
variables have zero covariance, but zero covariance alone
does not imply independence)

 The covariance has a major flaw:
 It is not possible to determine the relative strength of the
relationship from the size of the covariance

Coefficient of Correlation
 Measures the relative strength of the linear
relationship between two numerical variables
 Sample coefficient of correlation:

$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$

where

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$


Correlation

Correlation is the relationship that exists between
two or more variables. If two variables are related
to each other in such a way that a change in one
is accompanied by a corresponding change in the other, then
the variables are said to be correlated.
Examples:
 Relationship b/w heights & weights.
 Relationship b/w the price & demand of a
commodity.
 Relationship b/w advertising expenditure &
sales.

Correlation Contd…

Depending upon the direction of change of the
variables, correlation may be positive or
negative. If both variables vary in the same
direction (both increase or both decrease), the
correlation is said to be positive, and if the
variables vary in opposite directions, the
correlation is said to be negative.


Features of the
Coefficient of Correlation
 The population coefficient of correlation is referred to as ρ.
 The sample coefficient of correlation is referred to as r.
 Both ρ and r have the following features:
 Unit free
 Ranges between –1 and 1
 The closer to –1, the stronger the negative linear relationship
 The closer to 1, the stronger the positive linear relationship
 The closer to 0, the weaker the linear relationship

Methods of Studying Correlation

 Graphic: Scatter Diagram, Graphic Method
 Algebraic: Karl Pearson’s Coefficient of Correlation, Spearman’s Rank Correlation


Scatter Plots of Sample Data with
Various Coefficients of Correlation
[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = +1, r = +.3 and r = 0]

Karl Pearson’s Coefficient of
Correlation
Given a set of n pairs of observations (x1,y1),
(x2,y2),…, (xn,yn) relating to two variables X and Y,
the coefficient of correlation b/w X and Y is denoted
by the symbol ‘r’ and is defined as

$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$

where

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$


Properties of Coefficient of
Correlation

 It is independent of change of origin as well as


change of scale i.e. r(x,y) = r(u,v) where
u=(x-a)/h, v=(y-b)/k
 It is independent of units of measurement i.e. it
is a pure number.
 It lies between -1 and +1.
 It is the geometric mean of the two regression
coefficients.
 For two independent variables, coefficient of
correlation is zero.


Example-1

Calculate the coefficient of correlation for the
following ages of husbands and wives:

Husband’s age: 23 27 28 28 29 30 31 33 35 36
Wife’s age:    18 20 22 27 21 29 27 29 28 29

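Solution-1
Let X = husband’s age and Y = wife’s age.

X      Y      XY      X^2     Y^2
23     18     414     529     324
27     20     540     729     400
28     22     616     784     484
28     27     756     784     729
29     21     609     841     441
30     29     870     900     841
31     27     837     961     729
33     29     957     1089    841
35     28     980     1225    784
36     29     1044    1296    841
Total:
300    250    7623    9138    6414

With n = 10, the equivalent computational form of r gives

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} = \frac{1230}{\sqrt{1380 \times 1640}}$$

r = 0.82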
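
The column totals and the value of r can be reproduced with a few lines of code. The Python sketch below uses the sums-based computational form of Pearson's r, which is algebraically equivalent to cov(X,Y)/(SX SY); the variable names are illustrative.

```python
from math import sqrt

husband = [23, 27, 28, 28, 29, 30, 31, 33, 35, 36]  # X
wife = [18, 20, 22, 27, 21, 29, 27, 29, 28, 29]     # Y

n = len(husband)
sum_x, sum_y = sum(husband), sum(wife)
sum_xy = sum(x * y for x, y in zip(husband, wife))
sum_x2 = sum(x * x for x in husband)
sum_y2 = sum(y * y for y in wife)

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, round(r, 2))
```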


Example-2

While calculating r(x,y) from 25
pairs of observations, a computer obtained the following
constants:
n = 25, ∑x = 125, ∑x^2 = 650, ∑y = 100,
∑y^2 = 460, ∑xy = 508
A recheck showed that two pairs had been copied down as
(6,14), (8,6) while the correct values were
(8,12), (6,8). Obtain the correct value of the
correlation coefficient.

Solution-2

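Correct value of ∑x = 125-6-8+8+6 = 125
Correct value of ∑x^2 = 650-36-64+64+36 = 650
Correct value of ∑y = 100-14-6+12+8 = 100
Correct value of ∑y^2 = 460-196-36+144+64 = 436
Correct value of ∑xy = 508-84-48+96+48 = 520

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \frac{25(520) - (125)(100)}{\sqrt{(625)(900)}} = \frac{500}{750}$$

Therefore the corrected value of r = 2/3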
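
The correction procedure (subtract the contribution of each wrong pair, add that of each correct pair, then recompute r) can be sketched in Python as follows. The dictionary keys are illustrative names, not notation from the slides.

```python
from math import sqrt

n = 25
sums = {"x": 125, "x2": 650, "y": 100, "y2": 460, "xy": 508}

wrong_pairs = [(6, 14), (8, 6)]
right_pairs = [(8, 12), (6, 8)]

# Remove the wrongly copied pairs (sign -1), then add the correct ones (sign +1).
for sign, pairs in ((-1, wrong_pairs), (+1, right_pairs)):
    for x, y in pairs:
        sums["x"] += sign * x
        sums["x2"] += sign * x * x
        sums["y"] += sign * y
        sums["y2"] += sign * y * y
        sums["xy"] += sign * x * y

r = (n * sums["xy"] - sums["x"] * sums["y"]) / sqrt(
    (n * sums["x2"] - sums["x"] ** 2) * (n * sums["y2"] - sums["y"] ** 2)
)

print(sums, r)
```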


The Coefficient of Correlation Using
Microsoft Excel Function

Test #1 Score    Test #2 Score
78               82
92               88
86               91
83               90
95               92
85               85
91               89
76               81
88               96
79               77

Correlation Coefficient: 0.7332, from =CORREL(A2:A11,B2:B11)

Interpreting the Coefficient of Correlation
Using Microsoft Excel

 r = .733
 There is a relatively strong positive linear
relationship between test score #1 and test score #2.
 Students who scored high
on the first test tended to
score high on second test.
[Figure: Scatter Plot of Test Scores, Test #2 Score against Test #1 Score, both axes from 70 to 100]


Regression
Regression is the measure of average relationship between
two or more variables in terms of the original units of the data.
For example, after having established that two variables (say
sales and advertising expenditure) are correlated, one may find
out the average relationship b/w the two to estimate the
unknown values of dependent variable (say sales) from the
known value of the independent variable (say advertising
expenditure).

Regression Analysis is a statistical tool to study the nature and


extent of functional relationship b/w two or more variables and
to estimate/predict the unknown value of the dependent variable
from the known value of the independent variable.

Difference between Correlation and
Regression

 Correlation measures the degree and direction of the
relationship b/w the variables, while regression
measures the nature and extent of the average
relationship b/w two or more variables in terms of the
original units of the data.
 Correlation is a relative measure showing association
b/w variables, while regression is an absolute
measure of relationship. (Relative measures are most
relevant when you want to transfer your findings to other
populations. Absolute measures are very dependent on
the baseline incidence in the population under study.)


Difference between Correlation and


Regression contd..

 Correlation is not a forecasting device while Regression


is a forecasting device which can be used to predict
the value of dependent variable from the given value of
independent variable.

Regression Equations

In simple linear regression analysis, there are


two lines of regression since there are two
variables X and Y.

One is Y on X and the other is X on Y.


Regression Line of Y on X

Regression line of Y on X is given by


Y=a + b X
where, X=Independent Variable
Y=Dependent Variable
a=Y intercept (value of dependent variable when
value of independent variable is zero)
b=Slope of the line (i.e. the amount of change in
the value of the dependent variable per unit change
in the independent variable)

Regression Line of Y on X Contd…

The values of the two constants a and b can be calculated for the
given data on the X and Y variables by solving the two
normal equations:

∑Y = n a + b ∑X
∑XY = a ∑X + b ∑X^2

where, n = no. of pairs of X and Y values
∑X = sum of values of variable X
∑Y = sum of values of variable Y
∑X^2 = sum of squares of values of variable X
∑XY=sum of products of values of the X and Y variables
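
Solving the two normal equations is a 2×2 linear system, so a and b can be written in closed form. The Python sketch below does this by elimination; it reuses the husband/wife ages from Example-1 purely as illustrative data, which is an assumption made here and not part of this slide.

```python
# Illustrative data: ages from Example-1 (X = husband's age, Y = wife's age).
x = [23, 27, 28, 28, 29, 30, 31, 33, 35, 36]
y = [18, 20, 22, 27, 21, 29, 27, 29, 28, 29]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# Normal equations:
#   sum_y  = n*a     + b*sum_x
#   sum_xy = a*sum_x + b*sum_x2
# Solved by Cramer's rule for the unknowns a (intercept) and b (slope).
det = n * sum_x2 - sum_x**2
b = (n * sum_xy - sum_x * sum_y) / det
a = (sum_y * sum_x2 - sum_x * sum_xy) / det

print(f"Y = {a:.4f} + {b:.4f} X")
```

The regression line of X on Y is obtained the same way with the roles of X and Y interchanged.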


Regression Line of X on Y

Regression line of X on Y is given by


X=A + B Y
where, Y=Independent Variable
X=Dependent Variable
A=X intercept (value of dependent variable when
value of independent variable is zero)
B=Slope of the line (i.e. the amount of change in
the value of the dependent variable per unit change
in the independent variable)

Regression Line of X on Y Contd…

The values of the two constants A and B can be calculated from the
given data on X and Y by solving the two
normal equations:

∑X = nA + B∑Y
∑XY = A∑Y + B∑Y²

where n = number of pairs of X and Y values
∑X = sum of the values of X
∑Y = sum of the values of Y
∑Y² = sum of the squares of the values of Y
∑XY = sum of the products of the paired X and Y values


Regression Line of Y on X – Another Way

This line can also be expressed as follows:

$$ (Y - \bar{Y}) = b_{yx}\,(X - \bar{X}) $$
$$ (Y - \bar{Y}) = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}) $$

where $\bar{X}$ = A.M. of the X series
$\bar{Y}$ = A.M. of the Y series
$\sigma_x$ = S.D. of the X series
$\sigma_y$ = S.D. of the Y series
$r$ = coefficient of correlation between the two
variables X and Y

Properties of Linear Regression

• The product of the two regression coefficients is equal
to the square of the correlation coefficient, i.e. bxy * byx = r²
• The correlation coefficient and the regression coefficients have the same
sign.
• If r = 0 then bxy and byx are also zero.
• The regression lines always intersect at the point of means (X̄, Ȳ).
• The angle between the two regression lines depends on the
correlation coefficient: if r = 0 the regression lines are
perpendicular to each other; if r = +1 or -1 the
regression lines coincide.

Properties of Regression Coefficients

• The regression coefficients are independent of a
change of origin but not of a change of scale.
• The A.M. of the regression coefficients is at least as large as the
correlation coefficient in absolute value.
• The correlation coefficient is the geometric mean of the
regression coefficients, taken with their common sign.
• If one of the regression coefficients is greater than
unity in absolute value, the other must be less than unity.
• If X and Y are independent, both regression
coefficients are zero.

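Formula for Regression Coefficients

$$ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (X-\bar{X})^2}, \qquad b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (Y-\bar{Y})^2} $$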

Example-3

From the following data obtain the two
regression lines and the coefficient of
correlation:

Sales (x):       100   98   78   85   110   95   80
Purchases (y):    85   90   70   72    95   81   70

Find the value of y when x = 82.

Solution-3

Mean of x ≈ 92, mean of y ≈ 81, byx = 0.84, bxy = 1.12
(the means are rounded; exactly, 646/7 ≈ 92.29 and 563/7 ≈ 80.43)

Regression equation of y on x:
y - mean of y = byx (x - mean of x)
y - 81 = 0.84 (x - 92), so y = 0.84x + 3.72
Regression equation of x on y:
x - mean of x = bxy (y - mean of y)
x - 92 = 1.12 (y - 81), so x = 1.12y + 1.28
The correlation coefficient = sqrt(bxy * byx) = 0.97
For x = 82, the value of y is 0.84(82) + 3.72 = 72.6
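The figures in Solution-3 can be checked with a short Python sketch. It computes the regression coefficients from deviations about the exact means, so the prediction at x = 82 comes out slightly lower (about 71.8) than the 72.6 obtained above with rounded means. The function name `regression_coefficients` is chosen only for this illustration.

```python
from math import sqrt

sales = [100, 98, 78, 85, 110, 95, 80]      # x
purchases = [85, 90, 70, 72, 95, 81, 70]    # y


def regression_coefficients(x, y):
    """Return (mean_x, mean_y, byx, bxy, r) from sums of squared deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    byx = sxy / sxx                 # slope of the line of y on x
    bxy = sxy / syy                 # slope of the line of x on y
    r = sqrt(byx * bxy)             # geometric mean of the two coefficients
    if sxy < 0:                     # r carries the sign of the coefficients
        r = -r
    return mx, my, byx, bxy, r


mx, my, byx, bxy, r = regression_coefficients(sales, purchases)
y_at_82 = my + byx * (82 - mx)      # prediction from the line of y on x
print(round(byx, 2), round(bxy, 2), round(r, 2), round(y_at_82, 2))
```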


Example-4

Consider the two regression lines:

3X + 2Y = 26 and 6X + Y = 31

(a) Find the mean values and the correlation coefficient
between X and Y.

(b) If the variance of Y is 4, find the S.D. of X.

Solution-4
(a) The two regression lines intersect at the point of
means. Solving the two equations simultaneously
gives mean of x = 4 and mean of y = 7.
Let 3X+2Y=26 be the regression line of x on y
and the other line that of y on x. Then
x = -2/3y + 26/3 (x on y), so bxy = -2/3, and
y = -6x + 31 (y on x), so byx = -6.
But then r² = bxy*byx = 4, which cannot be true because r² ≤ 1.
So we change our assumption: the line
3X+2Y=26 represents y on x and the other x on y. Then
y = -3/2x + 13 (y on x), so byx = -3/2, and
x = -1/6y + 31/6 (x on y), so bxy = -1/6.
Here r² = bxy*byx = 1/4, so r = -sqrt(1/4) = -1/2.
Since both regression coefficients are negative, r
has to be negative.
(b) The variance of y is 4, so s.d. of y = 2.
We have bxy = r * s.d.(x)/s.d.(y), so
s.d.(x) = s.d.(y)*bxy/r = 2 * (-1/6)/(-1/2) = 2/3.
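The trial-and-error reasoning in Solution-4 can be sketched in Python as follows. Each line is stored as a tuple (p, q, c) meaning pX + qY = c; this representation and the helper names are assumptions made for the illustration. The sketch accepts the first reading of the two lines for which r² ≤ 1, which is enough here because only one reading is consistent.

```python
from fractions import Fraction
from math import sqrt

# Each line is stored as (p, q, c), meaning pX + qY = c.
LINE_1 = (3, 2, 26)
LINE_2 = (6, 1, 31)


def means(l1, l2):
    """Intersection of the two lines (Cramer's rule) = point of means."""
    (p1, q1, c1), (p2, q2, c2) = l1, l2
    det = p1 * q2 - p2 * q1
    return Fraction(c1 * q2 - c2 * q1, det), Fraction(p1 * c2 - p2 * c1, det)


def coefficients(y_on_x, x_on_y):
    """Slopes when the first line is read as Y on X and the second as X on Y."""
    byx = Fraction(-y_on_x[0], y_on_x[1])   # Y = -(p/q)X + c/q
    bxy = Fraction(-x_on_y[1], x_on_y[0])   # X = -(q/p)Y + c/p
    return byx, bxy


def identify(l1, l2):
    """Return (byx, bxy) for the reading of the lines that keeps r^2 <= 1."""
    for y_on_x, x_on_y in ((l1, l2), (l2, l1)):
        byx, bxy = coefficients(y_on_x, x_on_y)
        if 0 <= byx * bxy <= 1:
            return byx, bxy
    raise ValueError("no consistent reading of the two lines")


mean_x, mean_y = means(LINE_1, LINE_2)
byx, bxy = identify(LINE_1, LINE_2)
r = -sqrt(byx * bxy) if byx < 0 else sqrt(byx * bxy)   # sign follows the coefficients
sd_y = sqrt(4)                       # variance of Y is 4
sd_x = float(bxy) * sd_y / r         # from bxy = r * sd_x / sd_y
print(mean_x, mean_y, byx, bxy, r, sd_x)
```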


Standard Error of Estimates


The standard error of estimate measures the dispersion
of the observed values about the regression line. It indicates
how precise the prediction of Y from X (or of X from Y) is.

Standard error of the X values from $X_c$:
$$ S_{xy} = \sqrt{\frac{\sum (X - X_c)^2}{n}} $$

Standard error of the Y values from $Y_c$:
$$ S_{yx} = \sqrt{\frac{\sum (Y - Y_c)^2}{n}} $$

where $X_c$ and $Y_c$ are calculated by putting the actual values
of the independent variable into the regression equation of the dependent
variable, and n is the number of observations.

Interpretation of Standard Error

• If the value of the S.E. of estimate is small, the dots
(actual values of the dependent variable) will be closer
to the regression line and the estimates based on this
line will be better.
• If the value of the S.E. of estimate is large, the dots
will be farther from the regression line and the estimates
based on this line will be less accurate.
• If the value of the S.E. of estimate is zero, there is
no variation about the line, both regression lines
coincide, and the correlation is perfect.

Coefficient of Determination (r²)

Coefficient of Determination (r²) = Explained Variation / Total Variation

Explained Variation = Total Variation – Unexplained Variation

Unexplained Variation = ∑(Y - Yc)²

Total Variation = ∑(Y - Ȳ)²

Example-5

The following data relate to advertising
expenditure and sales.

Advertising Expenditure (Rs. Lakhs):    1    2    3    4    5
Sales (Rs. Lakhs):                     10   20   30   50   40

Calculate the Total Variation in Y, Unexplained
Variation in Y, Explained Variation in Y, and the
Standard Error of Estimate.

Solution-5

  X     Y    Y-Ȳ   X-X̄   (Y-Ȳ)²   (X-X̄)²   Y-Yc   Yc = 3+9X   (Y-Yc)²
  1    10    -20    -2     400       4       -2       12          4
  2    20    -10    -1     100       1       -1       21          1
  3    30      0     0       0       0        0       30          0
  4    50     20     1     400       1       11       39        121
  5    40     10     2     100       4       -8       48         64
 15   150      0     0    1000      10        0      150        190   (totals)

Here n = 5, X̄ = ∑X/n = 15/5 = 3 and Ȳ = ∑Y/n = 150/5 = 30.

Total Variation in Y = ∑(Y - Ȳ)² = 1000
Unexplained Variation in Y = ∑(Y - Yc)² = 190
Explained Variation in Y = Total Variation - Unexplained Variation = 1000 - 190 = 810

Standard error of estimate:

$$ S_{yx} = \sqrt{\frac{\sum (Y - Y_c)^2}{n}} = \sqrt{\frac{190}{5}} = 6.164 $$
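The quantities in Solution-5 can be reproduced with a few lines of Python. This is only a sketch: the fitted line Yc = 3 + 9X is taken from the solution table rather than re-estimated, and the ratio of explained to total variation (the coefficient of determination defined earlier) is added as a by-product.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]            # advertising expenditure (Rs. lakhs)
y = [10, 20, 30, 50, 40]       # sales (Rs. lakhs)
a, b = 3, 9                    # fitted line Yc = 3 + 9X from the solution table

n = len(y)
mean_y = sum(y) / n
yc = [a + b * xi for xi in x]                                   # estimated values

total_variation = sum((yi - mean_y) ** 2 for yi in y)
unexplained_variation = sum((yi - yci) ** 2 for yi, yci in zip(y, yc))
explained_variation = total_variation - unexplained_variation
r_squared = explained_variation / total_variation               # coefficient of determination
standard_error = sqrt(unexplained_variation / n)                # S_yx

print(total_variation, unexplained_variation, explained_variation)
print(r_squared, round(standard_error, 3))
```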


Uses of Regression Analysis

• Regression analysis, through the regression lines, facilitates
prediction of the value of a dependent variable from a
given value of an independent variable.
• Regression analysis, through the standard error, provides
a measure of the error involved in using the
regression line as a basis for estimation.
• Regression analysis, through the regression coefficients (bxy
and byx), facilitates calculation of the coefficient of
determination (r²) and the coefficient of correlation (r).
• Regression analysis is a highly valuable tool in
economics and business research, since most
problems of economic analysis are based on cause
and effect relationships.

The Importance of Management Science

• Management Science, also known as Operations
Research, Decision Sciences, etc., involves a
philosophy of solving problems in a logical manner.
• Management science uses a scientific approach to
solving management problems.
• It is used in a variety of organizations to solve different
types of problems, for example: Manufacturing
(production scheduling, inventory control, product mix,
replacement policies), Marketing (advertising budget
allocation, supply chain management), Finance
(investment analysis, portfolio analysis), Construction
(allocation of resources to projects, project scheduling),
etc.

Successful Applications of Management Science

The Management Science Approach

• Observation - Identification of a problem that exists (or may
occur soon) in a system or organization.
• Definition of the Problem - The problem must be clearly and
consistently defined, showing its boundaries and interactions
with the objectives of the organization.
• Model Construction - Development of the functional
mathematical relationships that describe the decision variables,
objective function and constraints of the problem.
• Model Solution - Models are solved using management science
techniques.
• Model Implementation - Actual use of the model or its solution.

Linear Programming

Linear Programming is a mathematical technique
for choosing the best alternative from a set of
feasible alternatives, in situations where the
objective function as well as the restrictions or
constraints can be expressed as linear
mathematical functions.

Linear Programming Contd…

• Linear programming has nothing to do with computer
programming.
• The use of the word “programming” here means “choosing a
course of action.”
• Linear programming involves choosing a course of action when
the mathematical model of the problem contains only linear
functions.
• Linear functions are functions in which each variable appears in
a separate term raised to the first power and is multiplied by a
constant (which could be 0).
• Linear constraints are linear functions that are restricted to be
"less than or equal to", "equal to", or "greater than or equal to" a
constant.

Problem Formulation

• Problem formulation or modeling is the process of
translating a verbal statement of a problem into a
mathematical statement.
• Formulating models is an art that can only be
mastered with practice and experience.
• General guidelines for LP model formulation are
illustrated on the slides that follow.

Guidelines for LP Model Formulation


The formulation of an L.P.P. as a mathematical model
involves the following basic steps:
• Study the given situation to find the key decision to be
made (understand the problem).
• Identify the decision variables involved and designate
them by symbols xj, j = 1, 2, ...
• Identify the objective function and express it as a linear
function of the decision variables.
• Identify the constraints or restrictions and express them as linear
equalities or inequalities in terms of the decision
variables of the problem.

Summary of Model Formulation Steps

• Step 1: Clearly define the decision variables
• Step 2: Construct the objective function
• Step 3: Formulate the constraints

L.P.P. Formulation
Product Mix Problem-1

A firm can produce three types of woolen cloth, say
A, B and C, using three kinds of wool, say red, green
and blue wool. One unit length of type A cloth needs 2
yards of red wool and 3 yards of blue wool; one unit
length of type B cloth needs 3 yards of red wool, 2
yards of green wool and 2 yards of blue wool; and one
unit length of type C cloth needs 5 yards of green wool
and 4 yards of blue wool. The firm has a stock of only 8
yards of red, 10 yards of green and 15 yards of blue
wool. It is assumed that the income obtained from one unit
length of type A cloth is Rs. 3, of type B cloth is Rs. 5
and of type C cloth is Rs. 4. Formulate the above
problem as an L.P.P.

Solution

First, summarize the data in a single
table like the one below:

Wool            Garment Type       Stock Available
                A     B     C
Red             2     3     -             8
Green           -     2     5            10
Blue            3     2     4            15
Income (Rs.)    3     5     4

Solution Contd…

Suppose the firm produces x1, x2 and x3 unit lengths
of type A, B and C cloth respectively. Then
the L.P.P. is
Maximize income = 3x1 + 5x2 + 4x3
S.t. 2x1 + 3x2 ≤ 8 (Red wool)
2x2 + 5x3 ≤ 10 (Green wool)
3x1 + 2x2 + 4x3 ≤ 15 (Blue wool)
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0

Problem - 2
The Server Problem
A firm that assembles computers and computer equipment is about to start
production of two new Web server models. Each type of model will require
assembly time, inspection time and storage space. The amount of each of
these resources that can be devoted to the production of the servers is limited.
The manager of the firm would like to determine the quantity of each model to
produce in order to maximize the profit generated by sales of these servers.
In order to develop a suitable model of the problem, the manager has met with
design and manufacturing personnel. As a result of those meetings, the
manager has obtained the following information:

                            Type 1          Type 2
Profit per unit             Rs. 60          Rs. 50
Assembly time per unit      4 hours         10 hours
Inspection time per unit    2 hours         1 hour
Storage space per unit      3 cubic feet    3 cubic feet

Problem – 2 Contd…
The manager has also acquired information on the availability of
company resources. These (daily) amounts are:

Resource           Amount Available
Assembly time      100 hours
Inspection time    22 hours
Storage space      39 cubic feet

The manager also met with the firm’s marketing manager and
learned that demand for the servers is such that whatever
combination of these two models of servers is produced, all of
the output can be sold.
Formulate this problem as a Linear Programming Model.

Solution

X1 = quantity of server model 1 to produce
X2 = quantity of server model 2 to produce
Max Z = 60X1 + 50X2
Subject to
4X1 + 10X2 ≤ 100 (Assembly)
2X1 + 1X2 ≤ 22 (Inspection)
3X1 + 3X2 ≤ 39 (Storage)
X1, X2 ≥ 0


Marketing Application
Media Selection Problem - 3
The Long Last Appliance Sales Company is in the business of selling appliances such
as microwave ovens, traditional ovens, refrigerators, dishwashers, washers, dryers,
and the like. The company has stores in the greater Chicagoland area and has a
monthly advertising budget of $90,000.
Among its options are radio advertising, advertising on cable TV channels,
newspaper advertising, and direct-mail advertising. A 30-second advertising spot on
the local cable channel costs $1,800, a 30-second radio ad costs $350, a half-page
ad in the local newspaper costs $700, and a single direct-mail insertion for
the entire region costs $1,200 per mailing. The number of potential buying
customers reached per use of each advertising medium is as follows:

Radio           7,000
TV             50,000
Newspaper      18,000
Direct mail    34,000

Due to company restrictions and availability of media, the maximum number of
usages of each medium is limited to the following:

Radio          35
TV             25
Newspaper      30
Direct mail    18

Media Selection Problem - 3 Contd…
The management of the company has met and
decided that, in order to ensure a balanced
utilization of the different types of media and to
portray a positive image of the company, at
least 10 percent of the advertisements must be
on TV. No more than 40 percent of the
advertisements may be on radio. The cost of
advertising allocated to TV and direct mail
cannot exceed 60 percent of the total
advertising budget.
What is the optimal allocation of the budget
among the four media? What is the total
maximum audience contact?
Solution
Let X1 = no. of radio ads, X2 = no. of television ads,
X3 = no. of newspaper ads, X4 = no. of direct-mail ads
Max Z = 7,000X1 + 50,000X2 + 18,000X3 + 34,000X4
Subject to
350X1 + 1,800X2 + 700X3 + 1,200X4 ≤ 90,000 (Budget constraint)
X1 ≤ 35, X2 ≤ 25, X3 ≤ 30, X4 ≤ 18 (Maximum exposure constraints)
X2 ≥ 0.10 (X1 + X2 + X3 + X4)
or -0.1X1 + 0.9X2 - 0.1X3 - 0.1X4 ≥ 0 (10% minimum TV ads)
X1 ≤ 0.40 (X1 + X2 + X3 + X4)
or 0.6X1 - 0.4X2 - 0.4X3 - 0.4X4 ≤ 0 (40% maximum radio ads)
1,800X2 + 1,200X4 ≤ 0.6 (90,000) = 54,000 (TV and direct-mail cost at most 60% of budget)
X1, X2, X3, X4 ≥ 0 (Non-negativity constraints)

Financial Planning Problem - 4

First American Bank issues five types of loans. In addition,
to diversify its portfolio and to minimize risk, the bank
invests in risk-free securities. The loans and the risk-free
securities, with their annual rates of return, are given in the
following table.

Rates of Return for Financial Planning Problem

Type of Loan or Security      Annual Rate of Return (%)
Home mortgage (first)          6
Home mortgage (second)         8
Commercial loan               11
Automobile loan                9
Home improvement loan         10
Risk-free securities           4
Financial Planning Problem Contd…
The bank’s objective is to maximize the annual rate of
return on investments subject to the following policies,
restrictions, and regulations:
1. The bank has $90 million in available funds.
2. Risk-free securities must account for at least 10 percent of the
total funds available for investment.
3. Home improvement loans cannot exceed $8,000,000.
4. The investment in mortgage loans must be at least 60
percent of all the funds invested in loans.
5. The investment in first mortgage loans must be at least
twice as much as the investment in second mortgage
loans.
6. Home improvement loans cannot exceed 40 percent of the
funds invested in first mortgage loans.
7. Automobile loans and home improvement loans together
may not exceed the commercial loans.
8. Commercial loans cannot exceed 50 percent of the total
funds invested in mortgage loans.

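Solution
Let X1 = amount invested in first home mortgages,
X2 = amount invested in second home mortgages,
X3 = amount invested in commercial loans,
X4 = amount invested in automobile loans,
X5 = amount invested in home improvement loans,
X6 = amount invested in risk-free securities
Max Z = 0.06X1 + 0.08X2 + 0.11X3 + 0.09X4 + 0.10X5 + 0.04X6
Subject to
X1 + X2 + X3 + X4 + X5 + X6 = 90,000,000 (Budget constraint)
(Note that here we force the entire budget to be spent)
X6 ≥ 0.10 (90,000,000), i.e. X6 ≥ 9,000,000
X5 ≤ 8,000,000
X1 + X2 ≥ 0.6 (X1 + X2 + X3 + X4 + X5)
or 0.40X1 + 0.40X2 - 0.60X3 - 0.60X4 - 0.60X5 ≥ 0
X1 ≥ 2X2 or X1 - 2X2 ≥ 0
X5 ≤ 0.4X1 or -0.4X1 + X5 ≤ 0
X4 + X5 ≤ X3 or -X3 + X4 + X5 ≤ 0
X3 ≤ 0.5 (X1 + X2) or -0.5X1 - 0.5X2 + X3 ≤ 0
X1, X2, X3, X4, X5, X6 ≥ 0 (Non-negativity constraints)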

Portfolio Selection Problem - 5

A conservative investor has $100,000 to invest. The
investor has decided to use three vehicles for generating
income: municipal bonds, a certificate of deposit (CD), and
a money market account. After reading a financial
newsletter, the investor has also identified several
additional restrictions on the investments:
1. No more than 40 percent of the investment should be in
bonds.
2. The amount allocated to the money market account
should be at least double the amount in the CD.

The annual return will be 8 percent for bonds, 9 percent for
the CD, and 7 percent for the money market account.
Assume the entire amount will be invested.
Formulate the LP model for this problem, ignoring any
transaction costs and the potential for different investment
lives. Assume that the investor wants to maximize the total
annual return.
Solution
Let X1 = amount invested in bonds,
X2 = amount invested in the CD, X3 = amount
invested in the money market account
Max Z = 0.08X1 + 0.09X2 + 0.07X3
Subject to
X1 + X2 + X3 = 100,000 (Budget constraint)
X1 ≤ 0.4 (100,000) or X1 ≤ 40,000
X3 ≥ 2X2 or X3 - 2X2 ≥ 0
X1, X2, X3 ≥ 0 (Non-negativity constraints)

Workforce Scheduling Problem

A department store has decided to stay open
on a 24-hour basis. The store manager has
divided the 24-hour day into six 4-hour periods
and determined the following minimum
personnel requirement for each period:

Time                    Personnel Needed
Midnight – 4:00 A.M.     90
4:00 – 8:00 A.M.        215
8:00 A.M. – Noon        250
Noon – 4:00 P.M.         65
4:00 – 8:00 P.M.        300
8:00 P.M. – Midnight    125

Personnel must report for work at the
beginning of one of these periods and
work 8 consecutive hours. The store
manager wants to know how many
employees to assign to start in
each 4-hour period so as to minimize the
total number of employees.

Workforce Scheduling Problem Contd…

Let xi be the no. of employees who start work in time period i,
where i = 1, 2, ..., 6 (period 1 = 12:00 midnight –
4:00 A.M.; period 2 = 4:00 – 8:00 A.M.; etc.)
Minimize Z = x1 + x2 + x3 + x4 + x5 + x6
Subject to x6 + x1 ≥ 90
x1 + x2 ≥ 215
x2 + x3 ≥ 250
x3 + x4 ≥ 65
x4 + x5 ≥ 300
x5 + x6 ≥ 125
xi ≥ 0
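The covering structure of these constraints can be checked with a small Python sketch. It is not a solver: it only tests whether a proposed schedule is feasible and compares its size with a simple lower bound. The bound comes from adding the constraints for periods 1, 3 and 5, whose left-hand sides together contain every xi exactly once, so Z ≥ 90 + 250 + 300 = 640. The candidate schedule and the function names are assumptions made for this illustration, not part of the original slides.

```python
NEED = [90, 215, 250, 65, 300, 125]   # minimum staff for periods 1..6


def on_duty(starts):
    """Staff present in each period: those starting in that period plus
    those who started one period earlier (starts[-1] wraps around to period 6)."""
    return [starts[i] + starts[i - 1] for i in range(len(starts))]


def is_feasible(starts, need=NEED):
    """True if no start count is negative and every period is covered."""
    return all(s >= 0 for s in starts) and all(
        d >= n for d, n in zip(on_duty(starts), need)
    )


# Adding the constraints for periods 1, 3 and 5 bounds the total from below.
lower_bound = NEED[0] + NEED[2] + NEED[4]

# A hand-built candidate schedule (x1..x6), assumed for illustration.
candidate = [90, 125, 125, 0, 300, 0]

print(is_feasible(candidate), sum(candidate), lower_bound)
```

Because the candidate is feasible and its total equals the lower bound, it is an optimal schedule for this model, although other schedules with the same total may exist.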

Graphical Solution of LP Models
• Graphical solution is limited to linear
programming models containing only
two decision variables.
• Graphical methods provide a
visualization of how a solution to a
linear programming problem is
obtained.

Steps Involved in Graphical Method

• Plot each of the constraints/restrictions.
• Determine the feasible region, i.e. the area
that contains all of the points that
satisfy the entire set of constraints.
• Determine the optimal solution.
Example - The Server Problem
Max Z = 60X1 + 50X2
Subject to
4X1 + 10X2 ≤ 100 (Assembly)
2X1 + 1X2 ≤ 22 (Inspection)
3X1 + 3X2 ≤ 39 (Storage)
X1, X2 ≥ 0

Feasible Region Based on a Plot of the
First Constraint (assembly time) and the
Non-negativity Constraints

A Completed Graph of the Server Problem
Showing the Assembly and Inspection
Constraints and the Feasible Solution Space

Completed Graph of the Server Problem
Showing All of the Constraints and the
Feasible Solution Space
Finding the Optimal Solution
The extreme point approach
– Involves finding the coordinates of each
corner point that borders the feasible
solution space and then determining
which corner point provides the best
value of the objective function.
The extreme point theorem
– If an LPP has an optimal solution, one can be
found at an extreme point of the feasible
region.

The Extreme Point Approach

1. Graph the problem and identify the feasible solution
space.
2. Determine the values of the decision variables at
each corner point of the feasible solution space.
3. Substitute the values of the decision variables at
each corner point into the objective function to obtain
its value at each corner point.
4. After all corner points have been evaluated in a
similar fashion, select the one with the highest value
of the objective function (for a maximization
problem) or lowest value (for a minimization
problem) as the optimal solution.
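The four steps above can be sketched in Python for the server problem. This is a minimal illustration, not a general LP solver: it assumes two decision variables, writes every constraint (including non-negativity) in the form a1·X1 + a2·X2 ≤ rhs, intersects the boundary lines pairwise, keeps the feasible intersections as corner points, and evaluates the objective at each. The function name `corner_points` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def corner_points(constraints):
    """Corner points of {(x1, x2): a1*x1 + a2*x2 <= rhs for every constraint}.

    Each constraint is a tuple (a1, a2, rhs). Boundary lines are intersected
    pairwise with Cramer's rule; infeasible intersections are discarded."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:                      # parallel boundaries never meet
            continue
        x1 = Fraction(c1 * b2 - c2 * b1, det)
        x2 = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x1 + b * x2 <= c for a, b, c in constraints):
            points.add((x1, x2))
    return points


# Server problem: assembly, inspection, storage, then X1 >= 0 and X2 >= 0
# rewritten as -X1 <= 0 and -X2 <= 0.
server = [(4, 10, 100), (2, 1, 22), (3, 3, 39), (-1, 0, 0), (0, -1, 0)]


def profit(point):
    return 60 * point[0] + 50 * point[1]


corners = corner_points(server)
best = max(corners, key=profit)
for point in sorted(corners):
    print(point[0], point[1], profit(point))
print("best corner:", best[0], best[1], "Z =", profit(best))
```

Under these assumptions the sketch evaluates five corner points and selects X1 = 9, X2 = 4 with Z = 740. Exact fractions are used so that corner points with non-integer coordinates are compared without rounding error.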

Graph of Server Problem with Extreme
Points of the Feasible Solution Space
Indicated

Extreme Point Solutions for the Server Problem

Computing the Amount of Slack for the
Optimal Solution to the Server
Problem

Binding constraints are those constraints for which the value of the
slack is zero at the optimal solution; all other constraints are non-binding.

Graphical Solution of Maximization Model

Maximize Z = 40x1 + 50x2
subject to: 1x1 + 2x2 ≤ 40
4x1 + 3x2 ≤ 120
x1, x2 ≥ 0
Some Special Cases in Graphical
Method
• No Feasible Solutions
- Occur in problems where satisfying one of the
constraints forces another constraint to be violated, so
that there is no feasible region.
• Unbounded Problems
- Exist when the value of the objective function can be
increased without limit.
• Redundant Constraints
- A redundant constraint is one that does not form a unique boundary of
the feasible solution space; its removal would not
alter the feasible solution space.
• Multiple Optimal Solutions
- Problems in which different combinations of values of
the decision variables yield the same optimal value.

Infeasible Solution
Maximize Z = X1 + X2
S.t. X1 + X2 ≤ 1
-3X1 + X2 ≥ 3
X1, X2 ≥ 0

Infeasible Solution Contd…
In this case the constraints have no point
in common in the first
quadrant: the second constraint requires
X2 ≥ 3, while the first allows at most X2 = 1.
Therefore, the given LPP has no solution, i.e.
the problem is said to be infeasible.

Unbounded Solution
Maximize Z = 10X2 - 2X1
S.t. X1 - X2 ≥ 0
-X1 + 5X2 ≥ 5
X1, X2 ≥ 0

Unbounded Solution Contd…
Here the feasible region has only one vertex,
A(5/4, 5/4), and the value of the
objective function at this point is Z = 10.
But there exist points in the feasible region
for which the value of the objective function
is more than 10. For instance, the point
(2,2) lies in the feasible region and the
objective value at this point is 16, which is
more than 10. Along the line X1 = X2 = t the
objective equals 8t, which increases without limit.
Hence the maximum value of Z occurs at a
point at infinity and the problem has
an unbounded solution.

Example of Redundant Constraints

Multiple Optimal Solution
Maximize Z = X1 + (3/5) X2
S.t. 5X1 + 3X2 ≤ 15
3X1 + 4X2 ≤ 12
X1, X2 ≥ 0

Multiple Optimal Solution Contd…

In the above example there are
four extreme points: A(0,0),
B(3,0), C(24/11,15/11) and D(0,3).
The feasible region is
bounded and the maximum
occurs at two corner points, B
and C; such solutions are
called multiple optima.
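A quick Python check of this example evaluates the objective at the four extreme points, using exact fractions so that the tie between B and C is not blurred by rounding. The midpoint check is an added illustration: since the objective is parallel to the constraint 5X1 + 3X2 = 15, every point on the segment BC gives the same value.

```python
from fractions import Fraction as F

# Extreme points of the feasible region, as listed above.
corners = {
    "A": (F(0), F(0)),
    "B": (F(3), F(0)),
    "C": (F(24, 11), F(15, 11)),
    "D": (F(0), F(3)),
}


def objective(x1, x2):
    return x1 + F(3, 5) * x2          # Z = X1 + (3/5) X2


z = {name: objective(*point) for name, point in corners.items()}
best_value = max(z.values())
optimal_corners = sorted(name for name, value in z.items() if value == best_value)

# Midpoint of segment BC: also optimal, because Z is constant along BC.
mid_bc = tuple((b + c) / 2 for b, c in zip(corners["B"], corners["C"]))

print(z)
print(optimal_corners, best_value, objective(*mid_bc))
```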

Problem
Use the graphical method to solve the following
problem:
Maximize Z = 2X1 + X2
S.t. X2 ≤ 10
2X1 + 5X2 ≤ 60
X1 + X2 ≤ 18
3X1 + X2 ≤ 44
X1, X2 ≥ 0

Solution
X1 = 13, X2 = 5 and
Max Z = 31
Problem
Use the graphical method to solve the following
problem:
Maximize Z = 10X1 + 20X2
S.t. -X1 + 2X2 ≤ 15
X1 + X2 ≤ 12
5X1 + 3X2 ≤ 45
X1, X2 ≥ 0

Solution
X1 = 3, X2 = 9 and
Max Z = 210
Assignment Problems
Consider n machines M1, M2, …, Mn and n
different jobs J1, J2, …, Jn. These jobs are to be
processed by the machines on a one-to-one
basis, i.e. each machine will process exactly
one job and each job will be assigned to
only one machine. For each job the
processing cost depends on the machine to
which it is assigned. We have to
determine the one-to-one assignment of the jobs to
the machines such that
the total processing cost is minimum. This
is called the ASSIGNMENT PROBLEM.

Assignment Problem Contd…

If the number of machines is equal
to the number of jobs, the above
problem is called a balanced or
standard assignment problem.
Otherwise the problem is called an
unbalanced or non-standard
assignment problem.
Introduction And Mathematical
Formulation of Assignment Problem
Let us consider a balanced
assignment problem. For the L.P.P.
formulation, define the decision
variables as
Xij = 1 if job j is assigned to machine i, and 0 otherwise,
and let Cij be the cost of processing job j
on machine i. Then the assignment problem
can be formulated as follows:

$$ \text{Minimize } Z = \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} X_{ij} $$

subject to

$$ \sum_{j=1}^{n} X_{ij} = 1, \quad i = 1, 2, \ldots, n $$
(each machine is assigned exactly one job)

$$ \sum_{i=1}^{n} X_{ij} = 1, \quad j = 1, 2, \ldots, n $$
(each job is assigned to exactly one machine)

$$ X_{ij} = 0 \text{ or } 1 \quad \text{for all } i \text{ and } j $$
Example
A company is facing the problem of assigning four
operators to four machines. The assignment cost in
rupees is given below:

                  Machine
                  M1    M2    M3    M4
Operator   I       5     7     -     4
           II      7     5     3     2
           III     9     4     6     -
           IV      7     2     7     6

In the above, operator I cannot be assigned to machine M3 and
operator III cannot be assigned to machine M4. Formulate the
above problem as an LP model.

Example Contd…
Let us define the decision variables as:
Xij = 1 if the ith operator is assigned to the jth machine, and
0 otherwise, for i, j = 1, 2, 3, 4

By the problem, X13 = 0 and X34 = 0

The LP model for the given problem is as follows:

Minimize Z = 5X11 + 7X12 + 4X14 + 7X21 + 5X22 + 3X23
+ 2X24 + 9X31 + 4X32 + 6X33 + 7X41 + 2X42 + 7X43 + 6X44

Subject to
X11 + X12 + X14 = 1
X21 + X22 + X23 + X24 = 1
X31 + X32 + X33 = 1
X41 + X42 + X43 + X44 = 1
(Operator assignment constraints)
X11 + X21 + X31 + X41 = 1
X12 + X22 + X32 + X42 = 1
X23 + X33 + X43 = 1
X14 + X24 + X44 = 1
(Machine assignment constraints)
Xij = 0 or 1, for all i and j

Example-2

              Job
              J1    J2    J3    J4
Person   A    18    26    17    11
         B    13    28    14    26
         C    38    19    18    15
         D    19    26    24    10

Ans: 59
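For a cost matrix this small, the answer can be confirmed by simply trying every one-to-one assignment. The sketch below does that with `itertools.permutations`; it is a brute-force check under the assumption that the goal is minimum total cost, not the Hungarian method, and it becomes impractical for large n because there are n! assignments. The function name `best_assignment` is made up for this illustration.

```python
from itertools import permutations


def best_assignment(cost):
    """Try every one-to-one assignment of rows to columns of a square cost
    matrix and return (minimum total cost, column chosen for each row)."""
    n = len(cost)

    def total(columns):
        return sum(cost[row][columns[row]] for row in range(n))

    best = min(permutations(range(n)), key=total)
    return total(best), best


# Example-2: rows are persons A-D, columns are jobs J1-J4.
cost = [
    [18, 26, 17, 11],
    [13, 28, 14, 26],
    [38, 19, 18, 15],
    [19, 26, 24, 10],
]

minimum_cost, columns = best_assignment(cost)
for person, job in zip("ABCD", columns):
    print(f"{person} -> J{job + 1}")
print("total cost:", minimum_cost)
```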

Example-3

              Job
              J1    J2    J3    J4    J5
Person   A     2     9     2     7     1
         B     6     8     7     6     1
         C     4     6     5     3     1
         D     4     2     7     3     1
         E     5     3     9     5     1

Ans: 13

Question
A computer centre has three
programmers. The centre needs three
application programmes to be
developed. The head of the computer
centre, after carefully studying the
programmes to be developed,
estimates the computer time in
minutes required by each programmer to
develop each application programme as
follows:

Question Contd…
                   Programme
Programmer     1      2      3
A            120    100     80
B             80     90    110
C            110    140    120
Assign the programmers to the programmes in
such a way that the total computer time is
least.

Unbalanced Assignment Problems


In an unbalanced or non-standard
assignment problem, the number of rows is
not equal to the number of columns in the
assignment cost matrix. To find an
assignment for this type of problem, we
first convert the unbalanced
problem into a balanced problem by adding
dummy rows or columns with zero costs.
The Hungarian method may then be used
to solve the problem.

Example
              Machine
Job     M1    M2    M3    M4
A       18    24    28    32
B        8    13    17    19
C       10    15    19    22

Example Contd…
This is an unbalanced assignment problem (3 jobs, 4 machines).
Here we add a dummy fourth row, i.e. job D, to
the cost matrix so as to get a balanced
assignment problem.

MS Excel (Discussed in class)

Example Contd…
                  Machine
Job           M1    M2    M3    M4
A             18    24    28    32
B              8    13    17    19
C             10    15    19    22
D (dummy)      0     0     0     0

Ans-50
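
The balancing step can be sketched in code: pad the cost matrix with zero-cost dummy rows (or columns) until it is square, then solve the balanced problem. The sketch below uses brute-force enumeration in place of the Hungarian method, and the function names `balance` and `min_assignment_cost` are assumptions made for illustration.

```python
from itertools import permutations

def balance(cost):
    """Pad a rectangular cost matrix with zero-cost dummy rows or columns."""
    rows, cols = len(cost), len(cost[0])
    size = max(rows, cols)
    padded = [row + [0] * (size - cols) for row in cost]   # dummy columns
    padded += [[0] * size for _ in range(size - rows)]     # dummy rows
    return padded

def min_assignment_cost(cost):
    """Minimum total cost over all one-to-one assignments (brute force)."""
    n = len(cost)
    return min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )

# Jobs A, B, C (rows) and machines M1-M4 (columns) from the example.
cost = [
    [18, 24, 28, 32],
    [8, 13, 17, 19],
    [10, 15, 19, 22],
]
balanced = balance(cost)          # adds the dummy job D as a row of zeros
total = min_assignment_cost(balanced)
print(balanced[-1], total)
```

The machine matched with the dummy job simply stays idle, which is why the dummy row carries zero cost.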

Introduction And Mathematical
Formulation of Transportation Problems
The transportation problem is generally concerned
with the distribution of a certain
commodity/product from several
origins/sources to several destinations at
minimum total cost through a single mode of
transportation.
Suppose there are m factories where a certain
product is produced and n markets where it is
needed. Let the supplies from the factories be
a1, a2, ..., am units and the demands at the markets be
b1, b2, ..., bn units.

Formulation of Transportation
Problems Contd…
Also consider,
Cij=Unit cost of shipping from
factory i to market j.
Xij=Quantity shipped from factory
i to market j.
Then the Linear Programming
Formulation can be stated as
follows:

Formulation of Transportation
Problems Contd…
Minimize Z = Total cost of
transportation
i.e. Minimize $Z = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} X_{ij}$

s.t. $\sum_{j=1}^{n} X_{ij} \le a_i, \quad i = 1, 2, \ldots, m$
(Total amount shipped from any factory does not exceed its capacity)

$\sum_{i=1}^{m} X_{ij} \ge b_j, \quad j = 1, 2, \ldots, n$
(Total amount shipped to a market meets the demand of the market)

Formulation of Transportation
Problems Contd…
Here the market demand can be met only if
∑ai ≥ ∑bj.

If ∑ai = ∑bj, i.e. total supply = total
demand, the problem is said to be a
“Balanced Transportation Problem” and
all the constraints become equalities.

Formulation of Transportation
Problems Contd…
i.e. Minimize $Z = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} X_{ij}$

s.t. $\sum_{j=1}^{n} X_{ij} = a_i, \quad i = 1, 2, \ldots, m$

$\sum_{i=1}^{m} X_{ij} = b_j, \quad j = 1, 2, \ldots, n$

$X_{ij} \ge 0$ for all i and j.

(There are in total m + n constraints and mn variables.)

The Unbalanced Case


If the supply and demand (availability
and requirements) are unequal, we make
them equal by introducing either a dummy destination,
if the supply is larger, or a dummy source, if
the demand is larger. The difference is
allocated to this dummy. The cost of
moving units from a dummy source to any
destination, or from any source to a dummy
destination, is zero, as no movement actually
takes place.

Example
A company wants to supply materials
from three plants to three new
projects. Project I requires 50 truck
loads, Project II requires 40 truck
loads and Project III requires 60 truck
loads. The supply capacities of plants
P1, P2 and P3 are 30, 55 and 45 truck
loads respectively. The table of transportation
costs is given below:

Example Contd…
            Project
Plant    I     II    III
P1       7     10    12
P2       8     12     7
P3       4      9    10
Determine the optimal distribution.

Example Contd…
Here the total supply = 130 and the total requirement
= 150. Therefore, the given problem is an unbalanced TP.
To balance it, add a dummy plant P4 with a
supply capacity of 20 truck loads and zero
transportation costs to the three projects.
               I     II    III   Supply
P1             7     10    12      30
P2             8     12     7      55
P3             4      9    10      45
P4 (dummy)     0      0     0      20
Demand        50     40    60     150
Now the above TP is a balanced TP and can be solved
as above.
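
The balancing rule can be written as a small helper. This is a sketch under the convention described above (a zero-cost dummy source when demand exceeds supply, a zero-cost dummy destination when supply exceeds demand); `balance_transportation` is a hypothetical name, and the function only balances the problem without solving it.

```python
def balance_transportation(costs, supply, demand):
    """Return a balanced copy of (costs, supply, demand) using a zero-cost dummy."""
    costs = [row[:] for row in costs]
    supply, demand = supply[:], demand[:]
    total_supply, total_demand = sum(supply), sum(demand)
    if total_supply < total_demand:
        # Demand is larger: add a dummy source (row) that supplies the shortfall.
        costs.append([0] * len(demand))
        supply.append(total_demand - total_supply)
    elif total_supply > total_demand:
        # Supply is larger: add a dummy destination (column) that absorbs the surplus.
        for row in costs:
            row.append(0)
        demand.append(total_supply - total_demand)
    return costs, supply, demand

# Data from the example: plants P1-P3 (rows), projects I-III (columns).
costs = [[7, 10, 12], [8, 12, 7], [4, 9, 10]]
supply = [30, 55, 45]
demand = [50, 40, 60]

costs_b, supply_b, demand_b = balance_transportation(costs, supply, demand)
print(costs_b[-1], supply_b, demand_b)
```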

MS Excel (Discussed in class)


Parameter And Statistic

Any statistical measure relating to the population
which is based on all units of the population is called a
parameter, e.g. population mean (µ), population
standard deviation (σ).
Any statistical measure relating to the sample
which is based on all units of the sample is called a
statistic, e.g. sample mean (x̄), sample variance
(s²).

Example

For a population of five units, the values of a
characteristic x are given below:
8, 2, 6, 4 and 10.
Consider all possible samples of size 2 from the
above population and show that the mean of
the sample means is exactly equal to the population
mean.


Example Contd…
The population mean = µ = 30/5 = 6
All possible samples of size two (drawn without replacement):

S.No.    Sample Values    Sample Mean
1        8, 2             5
2        8, 6             7
3        8, 4             6
4        8, 10            9
5        2, 6             4
6        2, 4             3
7        2, 10            6
8        6, 4             5
9        6, 10            8
10       4, 10            7
Total                     60


Example Contd…

Therefore, the mean of the sample means = 60/10 = 6,
which is the same as the population mean.
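
The enumeration in this example can be reproduced in a few lines. This is a small sketch that assumes sampling without replacement (10 unordered pairs), as in the table of samples; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean

population = [8, 2, 6, 4, 10]

# All unordered samples of size 2, drawn without replacement.
samples = list(combinations(population, 2))
sample_means = [mean(sample) for sample in samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)
print(len(samples), population_mean, mean_of_sample_means)
```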

Testing of Hypothesis

There are many problems in which we have to make decisions
about a statistical population on the basis of sample
observations. To reach a decision, we make an assumption or
statement about the population, which is known as a Statistical
Hypothesis.

For example, (i) the average mark of students in a university is
77%, (ii) the average lifetime of a certain type of tire is at least 25,000
miles, (iii) the difference in resistance between two types of electric
wire is 0.025 ohm, etc.


Null And Alternative Hypothesis

• The hypothesis which we test for possible rejection,
under the assumption that it is true, is called the
‘Null Hypothesis’ and is usually denoted by H0.
For example,
$H_0: \mu = \mu_0$, or $H_0: \mu_1 - \mu_2 = K$, or $H_0: \sigma^2 = \sigma_0^2$
• Any hypothesis which is taken as complementary to
the null hypothesis is called an ‘Alternative
Hypothesis’ and is usually denoted by H1. For
example,
$H_1: \mu > \mu_0$, or $H_1: \mu < \mu_0$, or $H_1: \mu \ne \mu_0$

Type I & II Error

While accepting or rejecting a hypothesis we
may commit one of two types of error:
Type I Error : Reject H0 when it is true.
Type II Error : Accept H0 when it is false.
If we write
P [ type I error ] = α
P [ type II error ] = β
then α and β are referred to as the producer’s risk
and the consumer’s risk respectively.


Producer’s risk and Consumer’s risk

 The error of rejecting a good-quality lot creates


a problem for the producer; the probability of
this error is called the producer's risk.
 On the other hand, the error of accepting a
poor-quality lot creates a problem for the
purchaser or consumer; the probability of this
error is called the consumer's risk.

Critical Region & Significance Level

• Critical Region: The set of values of the test statistic
which leads to rejection of H0 is termed the critical
region or region of rejection.

• Level of Significance (α): This is the probability that a
random value of the statistic belongs to the critical
region when H0 is true. In other words, it is the size of the critical region.
Usually, the level of significance is taken as 5% or 1%.
So α = P (Type I error).

• Critical Value: The value which separates the critical
region from the acceptance region is called the critical value.
It depends on the level of significance and on the form of the alternative hypothesis.


Degrees of Freedom

• Degrees of Freedom: The degrees of freedom
are the number of independent values that are
free to vary in a statistical analysis; they tell
you how many items can be freely chosen
before constraints must be put in place.
In general, the variables x1, x2, …, xn have n d.f.,
but these are reduced by the number of
conditions imposed on them. For example, if we
fix their mean, their d.f. are reduced by
one.

Left, Right & Two Tail Tests

• The type of test is determined by the alternative
hypothesis. For example,
If H1: μ > μ0, then it is called a right-tailed test.
If H1: μ < μ0, then it is called a left-tailed test.
If H1: μ ≠ μ0, then it is called a two-tailed (both-tailed) test.

• Simple and composite hypothesis. If all the
parameters are completely specified, the
hypothesis is called simple, e.g., μ = μ0, σ = σ0.
Otherwise it is called a composite hypothesis, e.g.,
μ ≤ μ0, μ > μ0, etc.


Right Tailed Test

Left Tailed Test


Two Tailed Test

Steps in Testing Hypothesis

• Set up H0.
• Set up H1.
• Set up the test statistic.
• Set the level of significance and find the critical value using the statistical
table.
• Compute the value of the statistic using the sample drawn from the
population.
• Take the decision.
(a) If the calculated value of the test statistic lies in the critical
region, reject H0, i.e., the assumption under the null
hypothesis cannot be accepted.
(b) If the calculated value of the test statistic lies in the acceptance
region, i.e., outside the critical region, do not reject H0, i.e., the
sample gives no evidence against the value of the parameter
assumed under H0.
Power function of the test. Let β = P [ Type II error ]. Then 1 − β is
called the power function of the test of H0 against H1.


Testing of a Single Mean


(Inferences about a Single Mean) for Large Sample n>30

1. Set up H0: μ = μ0
2. Set up H1: μ > μ0 or μ < μ0 or μ ≠ μ0
3. Set up the test statistic
$Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$, which follows the standard normal distribution
4. Set the level of significance α and find the critical
value Ztab from the normal table.
Compute the statistic, say Zcal

Testing of a Single mean


contd…

5. Decision

Note: If the population S.D. is not known, for a large sample
we can use the sample S.D. in the test statistic.


Example-1

The mean lifetime of 100 picture tubes
produced by a manufacturing company is
estimated to be 5795 hours with a standard
deviation of 150 hours. If μ is the mean
lifetime of all the picture tubes produced by the
company, test the hypothesis μ = 6000 hours
against μ ≠ 6000 hours at 5% level of
significance.

Solution-1
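
The large-sample steps can be traced on this example. The sketch below assumes a two-tailed alternative (μ ≠ 6000) and, following the note on large samples, uses the sample S.D. of 150 hours in place of σ. The critical value comes from Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100          # sample size
x_bar = 5795     # sample mean lifetime (hours)
s = 150          # sample standard deviation (hours), used in place of sigma
mu_0 = 6000      # hypothesised population mean
alpha = 0.05     # level of significance

# Test statistic Z = (x_bar - mu_0) / (s / sqrt(n))
z_cal = (x_bar - mu_0) / (s / sqrt(n))

# Two-tailed critical value Z_(alpha/2), about 1.96 at the 5% level
z_tab = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_cal) > z_tab
print(round(z_cal, 2), round(z_tab, 2), reject_h0)
```

Here |Zcal| is far larger than Ztab, so under these assumptions H0: μ = 6000 hours is rejected at the 5% level.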


Example-2

A tyre company claims that the lives of its
tyres have a mean of 42000 kilometers with a
standard deviation of 4000 kilometers. A
change in the production process is believed to
result in a better product. A test sample of 81
new tyres has a mean life of 42500 kilometers.
Test at 5% level of significance whether the new
product is significantly better than the current
one.

Solution-2


Practice Problems on Testing of Means

1. The manufacturer of television tubes knows from past
experience that the average life of a tube is 2000 hrs. with a s.d.
of 200 hrs. A sample of 100 tubes has an average life of 1950
hrs. Test at 1% level of significance whether this sample came
from a normal population of mean 2000 hrs.
2. A sample of 400 students is found to have a mean height of
171.38 cms. Can it be reasonably regarded as a sample from a
large population with mean height 171.17 cms and s.d. 3.30
cms?
3. The mean weight of a random sample of size 100 from a
student population is 65.8 kgs and the s.d. is 4 kgs. Test at
5% level of significance that the mean weight of the student population is
below 72 kgs.

Testing of Difference of Two Means
(for Large Samples n>30)


Example-3

A random sample of 100 villages was taken


from a district A and the average height of the
population per village was found to be 170 cm
with a SD of 10 cm. Another random sample of
120 villages was taken from another district B
and the average height of the population per
village was found to be 176 cm with a SD of 12
cm. Is the difference between averages of the
two populations statistically significant?


Solution-3

Example-4 (Practice Problem)

The sales manager of a large company conducted a
sample survey in states A and B, taking a sample of 400
salesmen in each case. The following were the results:
                      State A      State B
Average Sales         Rs. 2500     Rs. 2200
Standard Deviation    Rs. 400      Rs. 550

Test whether the average sales are the
same in the two states at 1% level of significance.

Answer: Two tailed test, Zcal = 8.82, H0: μ1 = μ2 rejected
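
The stated answer can be reproduced with the usual large-sample statistic for two independent samples, Z = (x̄1 − x̄2) / sqrt(s1²/n1 + s2²/n2). That formula is an assumption here, since the slide carrying it did not survive extraction; the sample S.D.s stand in for the population values.

```python
from math import sqrt
from statistics import NormalDist

n1, mean1, sd1 = 400, 2500, 400   # State A
n2, mean2, sd2 = 400, 2200, 550   # State B
alpha = 0.01

# Standard error of the difference of two independent sample means
std_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
z_cal = (mean1 - mean2) / std_error

# Two-tailed critical value at the 1% level (about 2.58)
z_tab = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_cal) > z_tab
print(round(z_cal, 2), round(z_tab, 2), reject_h0)
```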


TEST FOR SMALL SAMPLES

Testing of a Single Mean


(Inferences about a Single Mean) for Small Sample n<30

Here the sample is small (n<30) and σ is unknown.

1. Set up H0: μ = μ0
2. Set up H1: μ > μ0 or μ < μ0 or μ ≠ μ0
3. Set up the test statistic

4. Set the level of significance α and find the critical
value ttab from the table of the t-distribution.
5. Compute the statistic, say tcal


Testing of a Single mean


contd…

6. Decisions


Example-5

The mean breaking strength of a certain kind of


metallic rope is 160 pounds. If six pieces of
ropes (randomly selected from different rolls)
have a mean breaking strength of 154.3
pounds with a SD of 6.4 pounds, test the null
hypothesis µ = 160 pounds against the
alternative hypothesis µ<160 pounds at 1%
level of significance. Assume that the
population follows normal distribution.

Solution-5
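
A sketch of the computation for this example follows. It assumes the statistic t = (x̄ − μ0) / (s / √n) with n − 1 degrees of freedom and treats the quoted S.D. of 6.4 as the sample S.D. with divisor n − 1 (the slide's own formula is not reproduced in the text, so both points are assumptions). The critical value 3.365 is the standard t-table entry for 5 d.f. at the one-tailed 1% level, hard-coded because the Python standard library has no t-distribution.

```python
from math import sqrt

n = 6            # number of rope pieces
x_bar = 154.3    # sample mean breaking strength (pounds)
s = 6.4          # sample S.D. (assumed to use divisor n - 1)
mu_0 = 160       # hypothesised mean
T_TAB = 3.365    # t-table value: 5 d.f., one-tailed, alpha = 0.01

t_cal = (x_bar - mu_0) / (s / sqrt(n))
dof = n - 1

# Left-tailed test (H1: mu < 160): reject H0 only if t_cal < -T_TAB
reject_h0 = t_cal < -T_TAB
print(round(t_cal, 2), dof, reject_h0)
```

Under these assumptions tcal ≈ −2.18 does not fall below −3.365, so H0: µ = 160 pounds is not rejected at the 1% level.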


Practice Problem-6

The heights of 10 residents of a given locality
are found to be 70, 68, 62, 68, 61, 68, 69, 65,
64 and 66 inches. Is it reasonable to believe
that the average height is greater than 64
inches? Use 5% level of significance.

Solution Practice Problem-6

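Right tailed test: x̄ = 66.1, s ≈ 3.04 (divisor n − 1), so
tcal = (66.1 − 64)/(3.04/√10) ≈ 2.19, which exceeds t0.05 = 1.833 for 9 d.f.
H0 is rejected: it is reasonable to believe that the average
height is greater than 64 inches.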


Testing of Difference of Two Means
(for Small Samples n<30)


Example-7

The following are the number of sales with a sample of


6 sales people of gas lighter in a city A and a sample of
8 sales people of gas lighter in another city B made
over a certain fixed period of time:

City A: 63, 48, 54, 44, 59, 52


City B: 41, 52, 38, 50, 66, 54, 44, 61

Assuming that the populations sampled can be
approximated closely by normal distributions having
the same variance, test H0: µ1 = µ2 against H1: µ1 ≠ µ2
at the 5% level of significance.

Solution-7


Example-8
On measuring specimens of nylon yarn taken
from two machines, it was found that 8
specimens from the 1st machine had a mean
denier of 9.67 with a standard deviation
of 1.81, while 10 specimens from the 2nd
machine had a mean denier of 7.43 with
a standard deviation of 1.48. Assuming the
populations are normal, test the
hypothesis H0: µ1 - µ2 = 1.5 against
H1: µ1 - µ2 > 1.5 at the 5% level of
significance.

Solution-8


Paired t-test

Paired t-test contd..


Example-9

An I.Q. test was administered to 5 persons before and


after they were trained. The results are given below:
I.Q. before training: 110 120 123 132 125
I.Q. after training: 120 118 125 136 121

Test whether there is any change in I.Q. after the


training programme (use 1% level of significance).

Solution-9
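
A sketch of the paired t-test for this example. It assumes the usual statistic t = d̄ / (s_d / √n) on the differences d = after − before, with s_d computed using divisor n − 1 and n − 1 degrees of freedom (the paired t-test slides did not survive extraction, so this form is an assumption). The critical value 4.604 is the standard t-table entry for 4 d.f. at the two-tailed 1% level.

```python
from math import sqrt
from statistics import mean, stdev

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

# Paired differences (after - before) for the same five persons
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = mean(diffs)
s_d = stdev(diffs)               # sample S.D. with divisor n - 1
t_cal = d_bar / (s_d / sqrt(n))

T_TAB = 4.604                    # t-table value: 4 d.f., two-tailed, alpha = 0.01
reject_h0 = abs(t_cal) > T_TAB
print(diffs, round(t_cal, 3), reject_h0)
```

Under these assumptions tcal ≈ 0.82 is well inside the acceptance region, so the data give no evidence of a change in I.Q. at the 1% level.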


Testing of Proportion
Single Proportion
 Set up H0:P=p0
 Set up H1:P>p0 or P<p0 or P≠p0
 Set up test statistics

Which approximately follows the standard normal


distribution
 Set up the level of significance α and critical value (Z
tab) using statistical table.
 Compute the statistic

Testing of Proportion
Single Proportion contd..
 Decisions:

H1          Reject H0 if
P < p0      Zcal < -Ztab
P > p0      Zcal > Ztab
P ≠ p0      Zcal < -Ztab (i.e., -Zα/2) or Zcal > Ztab (i.e., Zα/2)


Problem-1

A die was thrown 500 times and a six
resulted 100 times. Do the data justify the
hypothesis of an unbiased die?

Solution-1
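
A sketch of the single-proportion test for this problem. It assumes the usual statistic Z = (p − p0) / sqrt(p0(1 − p0)/n) and a two-tailed test at the 5% level; the problem does not state a level, so that choice is an assumption.

```python
from math import sqrt
from statistics import NormalDist

n = 500            # number of throws
sixes = 100        # number of sixes observed
p0 = 1 / 6         # proportion of sixes for an unbiased die
alpha = 0.05       # assumed level of significance (not given in the problem)

p_hat = sixes / n
std_error = sqrt(p0 * (1 - p0) / n)
z_cal = (p_hat - p0) / std_error

z_tab = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
reject_h0 = abs(z_cal) > z_tab
print(round(z_cal, 2), round(z_tab, 2), reject_h0)
```

With these assumptions Zcal = 2.0 is just beyond 1.96, so the hypothesis of an unbiased die is rejected at the 5% level; at the 1% level (critical value about 2.58) it would not be.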


Problem-2

A manufacturer claimed that at least 95% of the


components of an electronic circuit board which
he supplied, conformed to specifications. A
random sample of 220 components showed
that only 185 were up to the standard. Test his
claim at 1% level of significance.

Solution-2


Testing of Proportion
Difference of Two Proportions
Let p1 and p2 be the proportions in two large samples of sizes n1
and n2 drawn respectively from two populations. We test whether
the difference p1 - p2 observed in the samples has arisen only
due to fluctuations of sampling.
• Set up H0: P1 = P2
• Set up H1: P1 ≠ P2
• Set up the test statistic
$Z = \frac{p_1 - p_2}{\sqrt{p q \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$, where $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$,
which approximately follows the standard normal distribution
• Set the level of significance α and find the critical value, say Ztab, using the
normal table.
• Compute the statistic as Zcal

Testing of Proportion
Difference of Two Proportions contd..
• Decisions:
H1           Reject H0 if
P1 < P2      Zcal < -Ztab
P1 > P2      Zcal > Ztab
P1 ≠ P2      Zcal < -Ztab (i.e., -Zα/2) or Zcal > Ztab (i.e., Zα/2)


Problem-3

A machine produced 20 defective articles in a


batch of 400. After overhauling it produced 10
defectives in a batch of 300. Has the machine
improved?

Solution-3
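
The pooled two-proportion statistic defined earlier in this unit can be applied to this problem as follows. The sketch assumes a one-tailed alternative (the defective proportion before overhaul is larger than after) and a 5% level, neither of which is stated in the problem.

```python
from math import sqrt
from statistics import NormalDist

n1, defect1 = 400, 20    # before overhauling
n2, defect2 = 300, 10    # after overhauling
alpha = 0.05             # assumed level of significance

p1, p2 = defect1 / n1, defect2 / n2
p = (n1 * p1 + n2 * p2) / (n1 + n2)   # pooled proportion
q = 1 - p

z_cal = (p1 - p2) / sqrt(p * q * (1 / n1 + 1 / n2))

# Right-tailed test of H1: P1 > P2 (machine improved)
z_tab = NormalDist().inv_cdf(1 - alpha)   # about 1.645
reject_h0 = z_cal > z_tab
print(round(z_cal, 3), round(z_tab, 3), reject_h0)
```

Under these assumptions Zcal ≈ 1.08 is below 1.645, so the observed drop in defectives is not significant evidence that the machine has improved.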


Testing of Equality of Two Variances


(F-Test)

F- Test contd..

Here consider the null hypothesis H0: The two samples
have been drawn from normal populations with the same
variance.
Then we compare the calculated value of F with its
tabulated value. If the calculated value of F exceeds F0.05
for (n1-1, n2-1) degrees of freedom, we say that the ratio is
significant at 5% level and the hypothesis is
rejected. If the calculated value of F is less than F0.05, the
hypothesis is not rejected and we conclude that the samples
could have come from two normal populations with the
same variance.


Problem-4

In one sample of 8 observations the sum of the


squares of deviations of the sample values
from the sample mean was 84.4 and in the
other sample of 10 observations it was 102.6.
Test whether this difference is significant at 5%
level.

Solution-4
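
A sketch of the F computation for this problem. It assumes each variance estimate is the sum of squared deviations divided by n − 1 and that F is the larger estimate over the smaller (the F formula slide did not survive extraction). The critical value 3.29 is the standard F-table entry for (7, 9) degrees of freedom at the 5% level, hard-coded because the Python standard library has no F-distribution.

```python
n1, ss1 = 8, 84.4      # sample 1: size, sum of squared deviations from its mean
n2, ss2 = 10, 102.6    # sample 2: size, sum of squared deviations from its mean

# Unbiased variance estimates (divisor n - 1)
var1 = ss1 / (n1 - 1)
var2 = ss2 / (n2 - 1)

# Larger estimate in the numerator, with matching degrees of freedom
if var1 >= var2:
    f_cal, dof = var1 / var2, (n1 - 1, n2 - 1)
else:
    f_cal, dof = var2 / var1, (n2 - 1, n1 - 1)

F_TAB = 3.29           # F-table value: (7, 9) d.f., alpha = 0.05
reject_h0 = f_cal > F_TAB
print(round(f_cal, 3), dof, reject_h0)
```

Under these assumptions Fcal ≈ 1.06 is well below 3.29, so the difference between the two variance estimates is not significant at the 5% level.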


Problem-5

Two independent samples of 8 and 7 items


respectively had the following values of the
variables:
Sample I: 9 11 13 11 15 9 12 14
Sample II: 10 12 10 14 9 8 10
Do the two estimates of population variance
differ significantly?

Solution-5


Problem-6

Solution-6


Chi- Square Goodness of Fit test

This test was given by Karl Pearson in 1900 for testing
the significance of the discrepancy between theory and
experiment. It enables us to find whether the deviation of the
experiment from theory is just by chance or due to the
inadequacy of the theory to fit the observed data.
If Oi, i = 1, 2, …, n is a set of observed (experimental)
frequencies and Ei, i = 1, 2, …, n is the corresponding
set of expected (theoretical or hypothetical) frequencies,
then Chi Square is defined by
$\chi^2 = \sum_{i=1}^{n} \left[ \frac{(O_i - E_i)^2}{E_i} \right]$

χ² as defined above is said to have (n-1) degrees of freedom.

Chi- Square Goodness of Fit test


contd..

If the calculated value of χ² is less than the tabulated value of χ² at 5% level of
significance, then the fit is considered to be good, i.e. the deviation between actual
and expected frequencies is attributed to fluctuations of simple sampling.

If the calculated value of χ² is greater than the tabulated value of χ² at 5% level of
significance, then the fit is considered to be poor.


Chi- Square Goodness of Fit test


Problem-1

A die is thrown 264 times with the following
results:
No. appeared on the die:    1    2    3    4    5    6
Frequency:                 40   32   28   50   54   60

Show that the die is biased.

Solution-1
Set up the hypothesis that the die is unbiased.
The expected frequency of each of the numbers 1, 2, 3, 4, 5 and 6 is
264/6 = 44

No. appeared on the die :    1    2    3    4    5    6
Observed Frequency Oi   :   40   32   28   50   54   60
Expected Frequency Ei   :   44   44   44   44   44   44

Chi Square (cal) = (16 + 144 + 256 + 36 + 100 + 256)/44 = 808/44 ≈ 18.36
The tabulated value of chi square for 6-1=5 d.f. at 5% level = 11.07
Chi Square (cal) > Chi Square (tab).
Therefore, the result of the experiment does not support the hypothesis.
Hence the die is biased.


Chi- Square Goodness of Fit test


Problem-2

The following table gives the number of aircraft
accidents that occurred during the various days
of the week.
Find whether the accidents are uniformly
distributed over the week.

Days:               Sun   Mon   Tue   Wed   Thu   Fri   Sat   Total
No. of Accidents:    14    16     8    12    11     9    14      84

Solution-2
Consider the hypothesis that the accidents are uniformly
distributed over the week.

The expected frequency of accidents on any day = 84/7
= 12
Chi Square (cal) = 4.17
The tabulated value of chi square for 7-1= 6 d.f. at 5%
level = 12.59
Chi Square (cal) < Chi Square (tab).
Therefore, the result of the experiment supports the
hypothesis. Hence the hypothesis is not rejected.
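
The goodness-of-fit computation for the aircraft accident data can be traced step by step. This is a minimal sketch using the definition of chi square given earlier; the tabulated value 12.59 is the one quoted in the solution, and the variable names are illustrative.

```python
observed = [14, 16, 8, 12, 11, 9, 14]   # accidents on Sun, Mon, ..., Sat

# Under a uniform distribution every day has the same expected frequency.
expected_per_day = sum(observed) / len(observed)

# Chi square = sum over days of (O - E)^2 / E
chi_sq = sum((o - expected_per_day) ** 2 / expected_per_day for o in observed)

dof = len(observed) - 1
CHI_SQ_TAB = 12.59                      # tabulated value for 6 d.f. at the 5% level

fit_is_good = chi_sq < CHI_SQ_TAB
print(expected_per_day, round(chi_sq, 2), dof, fit_is_good)
```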


Chi- Square Test of Independence

[Contingency table of observed frequencies Oij, with row totals, column totals and the grand total]


Chi- Square Test of Independence


Problem-3

A random sample of 220 students in a college was
asked to give an opinion, in terms of yes or no, about the
winning of their college cricket team in a tournament.
The following data were collected:
               Class in College
Opinion    Ist Year    IInd Year    IIIrd Year
Yes           43           20           37
No            23           57           40

Test whether there is any association between opinion
and class in college. Use 5% level of significance.


Solution-3

H0: There is no association between opinion
and class in the college
H1: There is an association between opinion
and class in the college
Also it is given that the grand total is 220,
O11=43, O12=20, O13=37,
O21=23, O22=57, O23= 40
Degrees of freedom = (3-1)*(2-1) = 2*1 = 2

Solution-3 contd..


Solution-3 contd..

Problem -4

 The meal plan selected by 200 students is shown below:

                     Number of meals per week
Class Standing    20/week    10/week    none    Total
Fresh.               24         32       14       70
Soph.                22         26       12       60
Junior               10         14        6       30
Senior               14         16       10       40
Total                70         88       42      200


Solution-4

 Test the hypothesis:

H0: Meal plan and class standing are independent


(i.e., there is no relationship between them)
H1: Meal plan and class standing are dependent
(i.e., there is a relationship between them)

Solution-4:
Expected Cell Frequencies
(continued)
Observed:
                     Number of meals per week
Class Standing    20/wk    10/wk    none    Total
Fresh.              24       32      14       70
Soph.               22       26      12       60
Junior              10       14       6       30
Senior              14       16      10       40
Total               70       88      42      200

Expected cell frequencies if H0 is true:
                     Number of meals per week
Class Standing    20/wk    10/wk    none    Total
Fresh.             24.5     30.8    14.7      70
Soph.              21.0     26.4    12.6      60
Junior             10.5     13.2     6.3      30
Senior             14.0     17.6     8.4      40
Total               70       88      42      200

Example for one cell (Junior, 20/wk):
$f_e = \frac{\text{row total} \times \text{column total}}{n} = \frac{30 \times 70}{200} = 10.5$


Solution-4: The Test Statistic
(continued)

• The test statistic value is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709$

$\chi^2_{0.05} = 12.592$ from the chi-squared distribution
with (4 – 1)(3 – 1) = 6 degrees of freedom

Solution-4:
Decision and Interpretation
(continued)

The test statistic is $\chi^2_{STAT} = 0.709$; $\chi^2_{0.05}$ with 6 d.f. = 12.592

Decision Rule:
If $\chi^2_{STAT} > 12.592$, reject H0,
otherwise, do not reject H0

[Figure: chi-squared distribution with a rejection region of area 0.05 to the right of χ²0.05 = 12.592]

Here, $\chi^2_{STAT} = 0.709 < \chi^2_{0.05} = 12.592$,
so do not reject H0.
Conclusion: there is not
sufficient evidence that meal
plan and class standing are
related at α = 0.05
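
The whole of Solution-4 can be reproduced from the observed table alone. The sketch below computes the expected frequencies as (row total × column total) / n and then the chi-square statistic; the critical value 12.592 is the one quoted in the solution, and the variable names are illustrative.

```python
# Observed frequencies: rows are Fresh., Soph., Junior, Senior;
# columns are 20/week, 10/week, none.
observed = [
    [24, 32, 14],
    [22, 26, 12],
    [10, 14, 6],
    [14, 16, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell if H0 (independence) is true
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

CHI_SQ_TAB = 12.592    # chi-squared critical value for 6 d.f. at alpha = 0.05
reject_h0 = chi_sq > CHI_SQ_TAB
print(round(chi_sq, 3), dof, reject_h0)
```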


General ANOVA Setting


 Investigator controls one or more independent
variables
 Called factors (or treatment variables)
 Each factor contains two or more levels (or groups or
categories/classifications)
 Observe effects on the dependent variable
 Response to levels of independent variable
 Experimental design: the plan used to collect
the data

Hypotheses of One-Way ANOVA

• $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
  - All population means are equal
  - i.e., no treatment effect (no variation in means among
    groups)

• H1: Not all of the population means are the same
  - At least one population mean is different
  - i.e., there is a treatment effect
  - Does not mean that all population means are different
    (some pairs may be the same)


One-Way ANOVA
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
H1: Not all μj are the same

All Means are the same:
The Null Hypothesis is True
(No Treatment Effect)

[Figure: three coinciding distributions, μ1 = μ2 = μ3]

One-Way ANOVA
(continued)
At least one mean is different:
The Null Hypothesis is NOT true
(Treatment Effect is present)

[Figure: two cases, μ1 = μ2 ≠ μ3 or μ1 ≠ μ2 ≠ μ3]


Partitioning the Variation

 Total variation can be split into two parts:

SST = SSA + SSW

SST = Total Sum of Squares


(Total variation)
SSA = Sum of Squares Among Groups
(Among-group variation)
SSW = Sum of Squares Within Groups
(Within-group variation)

Partitioning the Variation


(continued)

SST = SSA + SSW

Total Variation = the aggregate dispersion of the individual


data values across the various factor levels (SST)

Among-Group Variation = dispersion between the factor


sample means (SSA)

Within-Group Variation = dispersion that exists among


the data values within a particular factor level (SSW)


Partition of Total Variation

Total Variation (SST), d.f. = n – 1
= Variation Due to Factor (SSA), d.f. = c – 1
+ Variation Due to Random Sampling (SSW), d.f. = n – c

SSA is commonly referred to as:
• Sum of Squares Between
• Sum of Squares Among
• Sum of Squares Explained
• Among Groups Variation

SSW is commonly referred to as:
• Sum of Squares Within
• Sum of Squares Error
• Sum of Squares Unexplained
• Within-Group Variation

Total Sum of Squares

SST = SSA + SSW

$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$

Where:
SST = Total sum of squares
c = number of groups (levels or treatments)
nj = number of observations in group j
Xij = ith observation from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)


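Total Variation
(continued)

$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{c n_c} - \bar{\bar{X}})^2$

[Figure: response X plotted for Group 1, Group 2 and Group 3, with deviations measured from the grand mean]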

Among-Group Variation
SST = SSA + SSW

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Where:
SSA = Sum of squares among groups
c = number of groups
nj = sample size from group j
$\bar{X}_j$ = sample mean from group j
$\bar{\bar{X}}$ = grand mean (mean of all data values)


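Among-Group Variation
(continued)

$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$

Variation due to differences among groups.

$MSA = \frac{SSA}{c - 1}$
Mean Square Among = SSA/degrees of freedom

Among-Group Variation
(continued)

$SSA = n_1 (\bar{X}_1 - \bar{\bar{X}})^2 + n_2 (\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c (\bar{X}_c - \bar{\bar{X}})^2$

[Figure: response X for Group 1, Group 2 and Group 3, showing the group means X̄1, X̄2, X̄3 relative to the grand mean]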


Within-Group Variation
SST = SSA + SSW

$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$

Where:
SSW = Sum of squares within groups
c = number of groups
nj = sample size from group j
$\bar{X}_j$ = sample mean from group j
Xij = ith observation in group j

Within-Group Variation
(continued)

Summing the variation within each group and then adding over all groups:

$MSW = \frac{SSW}{n - c}$
Mean Square Within = SSW/degrees of freedom


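Within-Group Variation
(continued)

$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{c n_c} - \bar{X}_c)^2$

[Figure: response X for Group 1, Group 2 and Group 3, with deviations measured from each group mean X̄1, X̄2, X̄3]

Obtaining the Mean Squares

$MSA = \frac{SSA}{c - 1}$

$MSW = \frac{SSW}{n - c}$

$MST = \frac{SST}{n - 1}$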


One-Way ANOVA Table

Source of Variation    SS                 df       MS (Variance)          F ratio
Among Groups           SSA                c - 1    MSA = SSA / (c - 1)    F = MSA / MSW
Within Groups          SSW                n - c    MSW = SSW / (n - c)
Total                  SST = SSA + SSW    n - 1

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

One-Way ANOVA
F Test Statistic
H0: μ1 = μ2 = … = μc
H1: At least two population means are different

• Test statistic
$F = \frac{MSA}{MSW}$
MSA is mean squares among groups
MSW is mean squares within groups
• Degrees of freedom
  - df1 = c – 1 (c = number of groups)
  - df2 = n – c (n = sum of sample sizes from all populations)


Interpreting One-Way ANOVA
F Statistic
• The F statistic is the ratio of the among
estimate of variance and the within estimate
of variance
  - The ratio must always be positive
  - df1 = c - 1 will typically be small
  - df2 = n - c will typically be large

Decision Rule:
• Reject H0 if F > FU,
otherwise do not
reject H0

[Figure: F distribution with a rejection region of area α = .05 to the right of FU]


One-Way ANOVA
F Test Example

You want to see if three
different golf clubs yield
different distances. You
randomly select five
measurements from trials on
an automated driving
machine for each club. At the
0.05 significance level, is
there a difference in mean
distance?

Club 1    Club 2    Club 3
 254       234       200
 263       218       222
 241       235       197
 237       227       206
 251       216       204

One-Way ANOVA Example: Scatter Diagram

[Figure: scatter diagram of Distance (190 to 270) against Club (1, 2, 3), with the three group means and the overall mean marked]

$\bar{X}_1 = 249.2$, $\bar{X}_2 = 226.0$, $\bar{X}_3 = 205.8$, $\bar{\bar{X}} = 227.0$


One-Way ANOVA Example: Computations

$\bar{X}_1 = 249.2$, $n_1 = 5$
$\bar{X}_2 = 226.0$, $n_2 = 5$
$\bar{X}_3 = 205.8$, $n_3 = 5$
$\bar{\bar{X}} = 227.0$, $n = 15$, $c = 3$

$$SSA = 5(249.2 - 227)^2 + 5(226 - 227)^2 + 5(205.8 - 227)^2 = 4716.4$$

$$SSW = (254 - 249.2)^2 + (263 - 249.2)^2 + \dots + (204 - 205.8)^2 = 1119.6$$

$$MSA = \frac{4716.4}{3 - 1} = 2358.2$$

$$MSW = \frac{1119.6}{15 - 3} = 93.3$$

$$F = \frac{2358.2}{93.3} = 25.275$$
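
The computations above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `one_way_anova` is an illustrative choice and is not part of the slides.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SSA, SSW, MSA, MSW, F) for a list of samples (one list per group)."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)                 # sum of the sample sizes from all groups
    c = len(groups)                     # number of groups
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Among-group variation: squared distance of each group mean from the grand mean
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: squared distance of each value from its own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return ssa, ssw, msa, msw, msa / msw

clubs = [
    [254, 263, 241, 237, 251],  # Club 1
    [234, 218, 235, 227, 216],  # Club 2
    [200, 222, 197, 206, 204],  # Club 3
]
ssa, ssw, msa, msw, f_stat = one_way_anova(clubs)
print(f"SSA={ssa:.1f} SSW={ssw:.1f} MSA={msa:.1f} MSW={msw:.1f} F={f_stat:.3f}")
```

The same function works for unequal group sizes, because each group contributes its own sample size to SSA.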

One-Way ANOVA Example: Solution

H0: μ1 = μ2 = μ3
H1: μj not all equal
α = 0.05
df1 = 2, df2 = 12

Critical Value: FU = 3.89

Test Statistic:

$$F = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$$

Decision: Reject H0 at α = 0.05, since F = 25.275 > FU = 3.89.

Conclusion: There is evidence that at least one μj differs from the rest.

[Figure: F distribution with the rejection region of area α = .05 to the right of FU = 3.89; the test statistic F = 25.275 falls in the rejection region]


One-Way ANOVA
Excel Output
EXCEL: tools | data analysis | ANOVA: single factor
SUMMARY
Groups | Count | Sum  | Average | Variance
Club 1 | 5     | 1246 | 249.2   | 108.2
Club 2 | 5     | 1130 | 226     | 77.5
Club 3 | 5     | 1029 | 205.8   | 94.2

ANOVA
Source of Variation | SS     | df | MS     | F      | P-value  | F crit
Between Groups      | 4716.4 | 2  | 2358.2 | 25.275 | 4.99E-05 | 3.89
Within Groups       | 1119.6 | 12 | 93.3   |        |          |
Total               | 5836.0 | 14 |        |        |          |

Assignment Problems

Consider n machines M1, M2,…,Mn and n different jobs J1, J2,…,Jn. These jobs are to be processed by the machines on a one-to-one basis, i.e. each machine will process exactly one job and each job will be assigned to only one machine. For each job, the processing cost depends on the machine to which it is assigned. We have to determine the one-to-one assignment of jobs to machines for which the total processing cost is minimum. This is called the ASSIGNMENT PROBLEM.


Assignment Problem Contd…

If the number of machines is equal to the number of jobs, the above problem is called a balanced or standard assignment problem. Otherwise the problem is called an unbalanced or non-standard assignment problem.

Introduction And Mathematical Formulation of Assignment Problem

Let us consider a balanced assignment problem. For the linear programming (L.P.P.) formulation, let us define the decision variables as:
Xij = 1, if job j is assigned to machine i
Xij = 0, otherwise
and let Cij be the cost of processing job j on machine i. Then we can formulate the assignment problem as follows:


Introduction And Mathematical Formulation of Assignment Problem

$$\text{Minimize } Z = \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} X_{ij}$$

Subject to

$$\sum_{j=1}^{n} X_{ij} = 1, \quad i = 1, 2, \dots, n$$

(Each machine is assigned exactly one job)

$$\sum_{i=1}^{n} X_{ij} = 1, \quad j = 1, 2, \dots, n$$

(Each job is assigned to exactly one machine)

$$X_{ij} = 0 \text{ or } 1, \text{ for all } i \text{ and } j$$

Example

A company is facing the problem of assigning four operators to four machines. The assignment cost in rupees is given below:

Operator | M1 | M2 | M3 | M4
I        | 5  | 7  | -  | 4
II       | 7  | 5  | 3  | 2
III      | 9  | 4  | 6  | -
IV       | 7  | 2  | 7  | 6

In the table above, operator I cannot be assigned to machine M3 and operator III cannot be assigned to machine M4. Formulate the above problem as an LP model.


Example Contd…
Let us define the decision variables as:
Xij = 1, if the ith operator is assigned to the jth machine
Xij = 0, otherwise
for i, j = 1, 2, 3, 4.

Because of the restrictions in the problem, X13 = 0 and X34 = 0.

The LP model for the given problem is as follows:

Minimize Z = 5X11 + 7X12 + 4X14 + 7X21 + 5X22 + 3X23 + 2X24 + 9X31 + 4X32 + 6X33 + 7X41 + 2X42 + 7X43 + 6X44
Example Contd…
Subject to X11 + X12 + X14 = 1
X21+ X22 + X23 + X24 = 1
X31+ X32+ X33 = 1
X41+ X42+ X43+ X44 = 1
(Operator assignment constraints)
X11 + X21 + X31 + X41 = 1
X12+ X22 + X32 + X42 = 1
X23+ X33+ X43 = 1
X14+ X24+ X44 = 1
(Machine assignment constraints)
Xij=0 or 1, for all i and j
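
The slides only ask for the formulation, but a problem this small can also be checked by enumerating every one-to-one assignment. The following is a minimal sketch, not a general solution method: it tries all 4! = 24 permutations and treats the two forbidden pairs as having infinite cost. The names `cost` and `solve_assignment` are illustrative assumptions.

```python
from itertools import permutations

INF = float("inf")  # marks a forbidden operator-machine pair (X13 and X34)

# Rows are operators I..IV, columns are machines M1..M4 (costs in rupees)
cost = [
    [5, 7, INF, 4],
    [7, 5, 3, 2],
    [9, 4, 6, INF],
    [7, 2, 7, 6],
]

def solve_assignment(cost):
    """Brute-force search: perm[i] is the machine index given to operator i."""
    n = len(cost)
    best_cost, best_perm = INF, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm

best_cost, best_perm = solve_assignment(cost)
for operator, machine in zip(["I", "II", "III", "IV"], best_perm):
    print(f"Operator {operator} -> M{machine + 1}")
print("Total cost:", best_cost)
```

Each permutation satisfies both sets of constraints automatically, since every operator gets one machine and every machine is used once. Enumeration grows as n!, so it is only practical for very small n.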


Introduction And Mathematical Formulation of Transportation Problems

A transportation problem is generally concerned with distributing a certain commodity/product from several origins/sources to several destinations at minimum total cost, using a single mode of transportation. Suppose there are m factories where a certain product is produced and n markets where it is needed. Let the supplies from the factories be a1, a2, …, am units and the demands at the markets be b1, b2, …, bn units.

Formulation of Transportation
Problems Contd…

Also consider,
Cij=Unit cost of shipping from factory i to
market j.
Xij=Quantity shipped from factory i to
market j.
Then the Linear Programming
Formulation can be stated as follows:


Formulation of Transportation
Problems Contd…

Minimum Z = Total cost of transportation, i.e.

$$\text{Minimize } Z = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} X_{ij}$$

subject to

$$\sum_{j=1}^{n} X_{ij} \le a_i, \quad i = 1, 2, \dots, m$$

(Total amount shipped from any factory does not exceed its capacity)

$$\sum_{i=1}^{m} X_{ij} \ge b_j, \quad j = 1, 2, \dots, n$$

(Total amount shipped to a market meets the demand of the market)

Formulation of Transportation
Problems Contd…

Here the market demand can be met if ∑ai ≥ ∑bj.

If ∑ai = ∑bj, i.e. total supply = total demand, the problem is said to be a “Balanced Transportation Problem” and all the inequality constraints are replaced by equalities.


Formulation of Transportation
Problems Contd…

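i.e.

$$\text{Minimize } Z = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} X_{ij}$$

subject to

$$\sum_{j=1}^{n} X_{ij} = a_i, \quad i = 1, 2, \dots, m$$

$$\sum_{i=1}^{m} X_{ij} = b_j, \quad j = 1, 2, \dots, n$$

$$X_{ij} \ge 0 \text{ for all } i \text{ and } j$$

(There are in total m + n constraints and mn variables.)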

The Unbalanced Case

If the supply and demand (or availability and requirements) are unequal, we make them equal by introducing either a dummy destination, if the supply is larger, or a dummy source, if the demand is larger. The difference is allocated to this dummy. The cost of moving units from a dummy source to any destination, or from any source to a dummy destination, is zero because no movement actually takes place.


Example

A company wants to supply materials from three plants to three new projects. Project I requires 50 truck loads, Project II requires 40 truck loads and Project III requires 60 truck loads. The supply capacities of plants P1, P2 and P3 are 30, 55 and 45 truck loads respectively. The table of transportation costs is given below:

Example Contd…

Plant | I | II | III
P1    | 7 | 10 | 12
P2    | 8 | 12 | 7
P3    | 4 | 9  | 10

Determine the optimal distribution.


Example Contd…

Here the total supply = 130 and the total requirement = 150. Therefore, the given problem is an unbalanced TP. To balance it, introduce a dummy plant P4 with a supply capacity of 20 truck loads and zero transportation costs to the three projects.

Plant       | I  | II | III | Supply
P1          | 7  | 10 | 12  | 30
P2          | 8  | 12 | 7   | 55
P3          | 4  | 9  | 10  | 45
P4 (dummy)  | 0  | 0  | 0   | 20
Requirement | 50 | 40 | 60  | 150

Now the above TP is balanced and can be solved as a balanced transportation problem.
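
The balancing step can be sketched in code. This is an illustrative helper only (the name `balance` is an assumption, not from the slides): it appends a zero-cost dummy source when demand exceeds supply and a zero-cost dummy destination when supply exceeds demand.

```python
def balance(costs, supply, demand):
    """Return a balanced copy of (costs, supply, demand) for a transportation problem."""
    costs = [row[:] for row in costs]   # copy so the inputs are not modified
    supply, demand = supply[:], demand[:]
    gap = sum(demand) - sum(supply)
    if gap > 0:
        # Demand is larger: add a dummy source (extra row) with zero costs
        costs.append([0] * len(demand))
        supply.append(gap)
    elif gap < 0:
        # Supply is larger: add a dummy destination (extra column) with zero costs
        for row in costs:
            row.append(0)
        demand.append(-gap)
    return costs, supply, demand

costs = [[7, 10, 12], [8, 12, 7], [4, 9, 10]]   # plants P1..P3 to projects I..III
supply = [30, 55, 45]
demand = [50, 40, 60]

b_costs, b_supply, b_demand = balance(costs, supply, demand)
print(b_costs[-1], b_supply, b_demand)
```

For the example above the helper adds one dummy row with a supply of 20 truck loads, matching the dummy plant P4.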

A company has four warehouses and five stores. The availability at each warehouse, the requirement at each store and the cost of transportation of one unit from a warehouse to a store are given in the table below:

Warehouse    | S1 | S2   | S3 | S4   | S5 | Capacity
W1           | 9  | 12   | 10 | 10   | 6  | 150
W2           | 5  | 18   | 12 | 11   | 2  | 30
W3           | 10 | 9999 | 7  | 3    | 20 | 120
W4           | 5  | 6    | 2  | 9999 | 8  | 130
Requirements | 80 | 60   | 20 | 210  | 80 |


Que-1

Fill in the following ANOVA Table:

The Marketing Head of a company wanted to determine whether the sales achieved in three different geographies are the same or not. Five months of data were obtained for each geography and a statistical analysis was carried out, whose results are partially reproduced below.


Que-1

Source             | Sum of squared variations | Degrees of freedom | Mean squared variations | F-statistic
Between the groups | 330                       |                    |                         |
Within the groups  |                           |                    |                         |
Total              | 550                       |                    |                         |

Please fill in the six shaded cells in the table.

What inference can you draw from this table, given the F distribution at the 5% significance level?
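
One way to work through this question is sketched below. It assumes c = 3 groups and n = 15 observations (three geographies with five months each, as stated in the question) and uses the 5% critical value 3.89 for 2 and 12 degrees of freedom, the same value that appears in the golf club example. The variable names are illustrative.

```python
ssa = 330.0      # between-groups sum of squares (given)
sst = 550.0      # total sum of squares (given)
c = 3            # number of groups (three geographies)
n = 15           # total observations (five months per geography)

ssw = sst - ssa              # within-groups sum of squares, since SST = SSA + SSW
df_among = c - 1
df_within = n - c
df_total = n - 1

msa = ssa / df_among         # mean square among groups
msw = ssw / df_within        # mean square within groups
f_stat = msa / msw

f_crit = 3.89                # F table value for df (2, 12) at the 5% level
reject_h0 = f_stat > f_crit

print(f"SSW={ssw:.0f} df=({df_among}, {df_within}, {df_total})")
print(f"MSA={msa:.2f} MSW={msw:.2f} F={f_stat:.2f} reject H0: {reject_h0}")
```

Under these assumptions the computed F exceeds the critical value, so the null hypothesis of equal mean sales across the three geographies would be rejected at the 5% level.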

Que on ANOVA

 A researcher wants to compare the effectiveness of three different teaching methods (Method A, Method B, and Method C) in improving students' performance in a mathematics test. She randomly selects 12 students and divides them equally into three groups. Each group is taught using one of the three methods. After completing the course, the researcher administers a mathematics test to all students and records their scores. Determine if there is a significant difference in mean test scores among the three teaching methods at a 5% significance level.
Method A scores: 65, 70, 75, 80
Method B scores: 70, 75, 80, 85
Method C scores: 55, 70, 75, 80


Answer

Null hypothesis (H0): There is no significant difference in mean test scores among the three teaching methods.
Alternative hypothesis (H1): There is a significant difference in mean test scores among the three teaching methods.
Calculate the group means and the overall mean:
Method A mean: (65 + 70 + 75 + 80)/4 = 72.5
Method B mean: (70 + 75 + 80 + 85)/4 = 77.5
Method C mean: (55 + 70 + 75 + 80)/4 = 70.0

Overall mean: (65 + 70 + 75 + 80 + 70 + 75 + 80 + 85 + 55 + 70 + 75 + 80)/12 = 73.33

Answer
Calculate the sum of squares between groups:
SSA = 4(72.5 − 73.33)^2 + 4(77.5 − 73.33)^2 + 4(70.0 − 73.33)^2 = 116.67
Calculate the sum of squares within groups:
SSW = 125 + 125 + 350 = 600
Calculate the mean squares:
MSA = 116.67 / (3 − 1) = 58.33
MSW = 600 / (12 − 3) = 66.67
F = MSA/MSW = 0.875
Also, F(tab) = 4.26 for df1 = 2 and df2 = 9 at the 5% significance level.
Since F(cal) < F(tab), we do not reject the null hypothesis.
There is no significant difference in mean test scores among the three teaching methods at the 5% significance level.


Probability as Area Under the Curve

The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below:

$$P(-\infty < X < \mu) = 0.5, \qquad P(\mu < X < \infty) = 0.5$$

$$P(-\infty < X < \infty) = 1.0$$

[Figure: normal curve f(X) with an area of 0.5 on each side of μ]


The Standardized Normal Table

 The Cumulative Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value of Z (i.e., from negative infinity to Z)

Example: P(Z < 2.00) = 0.9772

[Figure: standard normal curve with the area 0.9772 to the left of Z = 2.00]

The Standardized Normal Table (continued)

The column gives the value of Z to the second decimal point. The row shows the value of Z to the first decimal point. The value within the table gives the probability from Z = −∞ up to the desired Z value.

Z   | 0.00  | 0.01 | 0.02 | …
0.0 |       |      |      |
0.1 |       |      |      |
…   |       |      |      |
2.0 | .9772 |      |      |

P(Z < 2.00) = 0.9772


Finding Normal Probabilities

 Let X represent the time it takes (in seconds) to download an image file from the internet.
 Suppose X is normal with a mean of 18.0 seconds and a standard deviation of 5.0 seconds. Find P(X < 18.6)

[Figure: normal curve of X with the area to the left of 18.6 shaded, just above the mean 18.0]

Finding Normal Probabilities (continued)

Standardize X to Z:

$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12$$

so P(X < 18.6) = P(Z < 0.12).

[Figure: the X distribution (μ = 18, σ = 5) with the area to the left of 18.6 shaded, next to the Z distribution (μ = 0, σ = 1) with the area to the left of 0.12 shaded]


Solution: Finding P(Z < 0.12)

Standardized Normal Probability Table (Portion)

Z   | .00   | .01   | .02
0.0 | .5000 | .5040 | .5080
0.1 | .5398 | .5438 | .5478
0.2 | .5793 | .5832 | .5871
0.3 | .6179 | .6217 | .6255

P(X < 18.6) = P(Z < 0.12) = 0.5478

[Figure: standard normal curve with the area 0.5478 to the left of Z = 0.12]
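
The table lookup can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Download time X ~ Normal(mean 18.0 s, standard deviation 5.0 s)
download_time = NormalDist(mu=18.0, sigma=5.0)

# Standardize 18.6 seconds to a Z value
z = (18.6 - download_time.mean) / download_time.stdev

# Cumulative probabilities computed two equivalent ways
p_from_x = download_time.cdf(18.6)   # P(X < 18.6)
p_from_z = NormalDist().cdf(z)       # P(Z < 0.12) on the standard normal

print(f"Z = {z:.2f}, P(X < 18.6) = {p_from_x:.4f}")
```

Both routes give the same probability, which is why a single standardized table serves every normal distribution.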

Finding Normal Upper Tail Probabilities

 Suppose X is normal with mean 18.0 and standard deviation 5.0.
 Now Find P(X > 18.6)

[Figure: normal curve of X with the area to the right of 18.6 shaded, above the mean 18.0]


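Finding Normal Upper Tail Probabilities (continued)

 Now Find P(X > 18.6)…

P(X > 18.6) = P(Z > 0.12) = 1.0 − P(Z ≤ 0.12) = 1.0 − 0.5478 = 0.4522

[Figure: two standard normal curves; the first shows the area 0.5478 to the left of Z = 0.12, the second shows the remaining area 1.0 − 0.5478 = 0.4522 to the right of Z = 0.12]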

Finding a Normal Probability Between Two Values

 Suppose X is normal with mean 18.0 and standard deviation 5.0. Find P(18 < X < 18.6)

Calculate Z-values:

$$Z = \frac{X - \mu}{\sigma} = \frac{18 - 18}{5} = 0$$

$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18}{5} = 0.12$$

P(18 < X < 18.6) = P(0 < Z < 0.12)

[Figure: normal curve with the area between X = 18 and X = 18.6 (Z = 0 and Z = 0.12) shaded]


Solution: Finding P(0 < Z < 0.12)

P(18 < X < 18.6) = P(0 < Z < 0.12) = P(Z < 0.12) − P(Z ≤ 0) = 0.5478 − 0.5000 = 0.0478

Standardized Normal Probability Table (Portion)

Z   | .00   | .01   | .02
0.0 | .5000 | .5040 | .5080
0.1 | .5398 | .5438 | .5478
0.2 | .5793 | .5832 | .5871
0.3 | .6179 | .6217 | .6255

[Figure: standard normal curve with the area 0.5000 to the left of Z = 0 and the area 0.0478 between Z = 0 and Z = 0.12]

Probabilities in the Lower Tail

 Suppose X is normal with mean 18.0 and standard deviation 5.0.
 Now Find P(17.4 < X < 18)

[Figure: normal curve of X with the area between 17.4 and the mean 18.0 shaded]


Probabilities in the Lower Tail (continued)

Now Find P(17.4 < X < 18)…

P(17.4 < X < 18) = P(−0.12 < Z < 0) = P(Z < 0) − P(Z ≤ −0.12) = 0.5000 − 0.4522 = 0.0478

The Normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12).

[Figure: normal curve with the area 0.4522 to the left of X = 17.4 (Z = −0.12) and the area 0.0478 between X = 17.4 and X = 18.0 (Z = 0)]
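
The upper-tail, between-two-values and lower-tail results can all be obtained from the cumulative distribution function. The following is a small sketch with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=18.0, sigma=5.0)

# Upper tail: P(X > 18.6) = 1 - P(X <= 18.6)
p_upper = 1 - X.cdf(18.6)

# Between two values: P(18 < X < 18.6) = P(X < 18.6) - P(X <= 18)
p_between = X.cdf(18.6) - X.cdf(18.0)

# Lower side: P(17.4 < X < 18) = P(X < 18) - P(X <= 17.4)
p_lower = X.cdf(18.0) - X.cdf(17.4)

print(f"P(X > 18.6)      = {p_upper:.4f}")
print(f"P(18 < X < 18.6) = {p_between:.4f}")
print(f"P(17.4 < X < 18) = {p_lower:.4f}")
```

The last two probabilities agree because 17.4 and 18.6 lie the same distance from the mean and the distribution is symmetric.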

Empirical Rules

What can we say about the distribution of values around the mean? For any normal distribution:

μ ± 1σ encloses about 68.26% of X’s

[Figure: normal curve f(X) with the region from μ − 1σ to μ + 1σ shaded (68.26%)]


The Empirical Rule (continued)

 μ ± 2σ covers about 95% of X’s (95.44%)
 μ ± 3σ covers about 99.7% of X’s (99.73%)

[Figure: two normal curves, one with the region μ ± 2σ shaded (95.44%) and one with the region μ ± 3σ shaded (99.73%)]
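
These coverage figures can be checked directly from the standard normal distribution. The sketch below is illustrative; note that the exact values for 1σ and 2σ (about 68.27% and 95.45%) differ slightly in the last digit from the slide values, which come from subtracting four-decimal table entries.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability that a normal value lies within k standard deviations of its mean
coverage = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} sigma: {p:.2%}")
```

Because standardizing removes μ and σ, the same three percentages hold for every normal distribution.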

Given a Normal Probability, Find the X Value

 Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

$$X = \mu + Z\sigma$$


Finding the X value for a Known Probability (continued)

Example:
 Let X represent the time it takes (in seconds) to download an image file from the internet.
 Suppose X is normal with mean 18.0 and standard deviation 5.0
 Find X such that 20% of download times are less than X.

[Figure: normal curve with a lower-tail area of 0.2000 to the left of the unknown X value (unknown Z value), below the mean 18.0 (Z = 0)]

Find the Z value for 20% in the Lower Tail

1. Find the Z value for the known probability

Standardized Normal Probability Table (Portion)

Z    | … | .03   | .04   | .05
-0.9 | … | .1762 | .1736 | .1711
-0.8 | … | .2033 | .2005 | .1977
-0.7 | … | .2327 | .2296 | .2266

 20% area in the lower tail is consistent with a Z value of -0.84

[Figure: normal curve with a lower-tail area of 0.2000 to the left of Z = -0.84]


Finding the X value

2. Convert to X units using the formula:

$$X = \mu + Z\sigma = 18.0 + (-0.84)(5.0) = 13.8$$

So 20% of the values from a distribution with mean 18.0 and standard deviation 5.0 are less than 13.80
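
The two steps can also be carried out with the inverse cumulative distribution function. This is a minimal sketch using `statistics.NormalDist.inv_cdf`; it uses the unrounded Z value, so the result (about 13.79) differs slightly from the 13.80 obtained with Z rounded to -0.84.

```python
from statistics import NormalDist

mu, sigma = 18.0, 5.0

# Step 1: Z value that leaves 20% of the area in the lower tail
z = NormalDist().inv_cdf(0.20)

# Step 2: convert back to X units with X = mu + Z * sigma
x = mu + z * sigma

# The same answer in one call on the X distribution itself
x_direct = NormalDist(mu, sigma).inv_cdf(0.20)

print(f"Z = {z:.4f}, X = {x:.2f}")
```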

Que-1

A marketing team wants to estimate the proportion of customers who are satisfied with their new product launch. They want to be 95% confident that the sample proportion will be within 3% of the true population proportion. Based on previous similar studies, they estimate that the population proportion of satisfied customers is around 0.60. What is the minimum sample size required for this study?


Ans-1
To determine the minimum sample size required for estimating the proportion of satisfied customers with a specified confidence level and margin of error, we can use the following formula for sample size estimation:
n = (Z^2 * p * (1−p)) / e^2
where:
Z is the Z-value corresponding to the desired confidence level.
p is the estimated population proportion.
e is the margin of error.
(If no estimate of p is available, p = 0.5 is used, since it gives the largest required sample size. The formula n = [(Z*σ)/e]^2 applies when estimating a population mean rather than a proportion.)
Given:
Confidence level = 95%, so Z = 1.96 (from the standard normal distribution table).
Estimated population proportion p = 0.60
Margin of error e = 0.03

Ans-1

Now, put the values into the formula:
n = (1.96^2 * 0.60 * 0.40) / 0.03^2 ≈ 1024.43
Since we can't have a fraction of a respondent, we round up to the next whole number.
Therefore, the minimum sample size required for this study is 1025 respondents.
This ensures that the marketing team can be 95% confident that the sample proportion will be within 3% of the true population proportion.
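
The calculation can be wrapped in a small helper to see how the inputs affect the result. This is an illustrative sketch; the function name `sample_size_for_proportion` is an assumption, not something defined in the slides.

```python
import math

def sample_size_for_proportion(z, p, e):
    """Minimum n so that a proportion estimate is within e of the truth.

    z: Z-value for the confidence level (1.96 for 95%)
    p: prior estimate of the population proportion
    e: margin of error, as a decimal
    """
    n = (z ** 2) * p * (1 - p) / (e ** 2)
    return math.ceil(n)  # round up: a fraction of a respondent is not possible

n_3pct = sample_size_for_proportion(1.96, 0.60, 0.03)
n_4pct = sample_size_for_proportion(1.96, 0.60, 0.04)

print("3% margin of error:", n_3pct)
print("4% margin of error:", n_4pct)
```

Because n is inversely proportional to e squared, widening the margin of error from 3% to 4% reduces the required sample size.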


Que-2
A marketing team wants to conduct a survey to estimate the proportion of customers who are satisfied with their new product. They want to achieve a 95% confidence level with a margin of error of no more than 3%. According to their preliminary research, they estimate that the proportion of satisfied customers is around 60%. What is the minimum sample size required for this survey? How will this answer change if the permissible margin of error is increased to 4%? Justify your answer.

Ans-2
Initial Scenario (3% Margin of Error):
 Confidence Level (1 − α): 95% (which corresponds to a z-score of approximately 1.96 for a two-sided interval).
 Margin of Error (e): 3% (expressed as a decimal, i.e., 0.03).
 Estimated Proportion of Satisfied Customers (p): 60% (expressed as a decimal, i.e., 0.60).
The formula for calculating the minimum sample size is:
n = (z^2⋅p⋅(1−p)) / e^2
Plugging in the values:
n = (1.96^2⋅0.60⋅(1−0.60)) / 0.03^2
Calculating:
n ≈ 1024.43
Rounding up, the minimum sample size required for a 95% confidence level with a 3% margin of error is 1025 respondents.


Ans-2
Increased Margin of Error (4%): If we increase the permissible margin of error to 4% (expressed as a decimal, i.e., 0.04), we need to recalculate the sample size using the same formula:
n = (z^2⋅p⋅(1−p)) / e^2
Plugging in the values:
n = (1.96^2⋅0.60⋅(1−0.60)) / 0.04^2
Calculating:
n ≈ 576.24
Rounding up, the minimum sample size required with a 4% margin of error is 577 respondents. A larger permissible margin of error always lowers the required sample size, because n is inversely proportional to e^2.
