Chapter 2
❑ Introduction
❑ Frequency Distribution
❑ Measures of Central Tendency
❑ Measures of Dispersion
❑ Skewness and kurtosis
❑ Regression and correlation analysis
❑ Time series Analysis
❑ Index Number
❑ Normal distribution
❑ Sampling distribution
❑ Estimation
❑ Testing hypothesis
• W.I. King defined Statistics in a wider context: the science of Statistics is the method of
judging collective, natural or social phenomena from the results obtained by the analysis or
enumeration or collection of estimates.
• Seligman stated that statistics is a science that deals with the methods of collecting,
classifying, presenting, comparing and interpreting numerical / quantitative data collected to
throw some light on any sphere of enquiry.
• Spiegel defines statistics as being concerned with scientific methods for collecting, organizing,
summarizing, presenting and analyzing data as well as drawing valid conclusions and making
reasonable decisions on the basis of such analysis.
• According to Prof. Horace Secrist, Statistics is the aggregate of facts, affected to a marked
extent by multiplicity of causes, numerically expressed, enumerated or estimated according to
reasonable standards of accuracy, collected in a systematic manner for a pre-determined
purpose, and placed in relation to each other.
(i) Statistics are the aggregates of facts. It means a single figure is not statistics.
For example, national income of a country for a single year is not statistics but the same for
two or more years is statistics.
(ii) Statistics are affected by a number of factors. For example, sale of a product
depends on a number of factors such as its price, quality, competition, the income of the consumers,
and so on.
(iii) Statistics must be reasonably accurate. Wrong figures, if analyzed, will lead to erroneous
conclusions. Hence, it is necessary that conclusions must be based on accurate figures.
(iv) Statistics must be collected in a systematic manner. If data are collected in a haphazard manner,
they will not be reliable and will lead to misleading conclusions.
(vi) Lastly, Statistics should be placed in relation to each other. If one collects data unrelated to each
other, then such data will be confusing and will not lead to
any logical conclusions. Data should be comparable over time and over space.
EXAMPLE 1: A Gallup poll found that 49% of the people in a survey knew the name of the first
book of the Bible. The statistic 49 describes the number out of every 100 persons who knew the
answer.
Inferential Statistics, also known as inductive statistics, goes beyond describing a given
problem situation by means of collecting, summarizing, and meaningfully presenting the related
data. Instead, it consists of methods that are used for drawing inferences, or making broad
generalizations, about a totality of observations on the basis of knowledge about a part of that
totality. The totality of observations about which an inference may be drawn, or a generalization
made, is called a population or a universe. The part of totality, which is observed for data
collection and analysis to gain knowledge about the population, is called a sample.
• These are methods for using sample data to make general conclusions (inferences) about
populations. Because a sample is typically only a part of the whole population, sample data
provide only limited information about the population. As a result, sample statistics are generally
imperfect representatives of the corresponding population parameters.
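The gap between a sample statistic and a population parameter can be illustrated with a small simulation. The sketch below is only an illustration: the population (10,000 simulated incomes), the sample size of 100 and the seed are all assumptions made for the example, not data from these notes.

```python
import random
from statistics import mean

# Hypothetical population: 10,000 simulated monthly incomes (assumed values).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50_000, 8_000) for _ in range(10_000)]

# Parameter: computed from the whole population (usually unknown in practice).
population_mean = mean(population)

# Statistic: computed from a simple random sample of 100 population members.
sample = rng.sample(population, 100)
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.1f}")
print(f"sample mean (statistic):     {sample_mean:.1f}")
print(f"sampling error:              {sample_mean - population_mean:.1f}")
```

The sample mean will typically be close to, but not equal to, the population mean, which is the sense in which a statistic is an imperfect representative of its parameter.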
Use of statistics in business
■ Accounting
Public accounting firms use statistical sampling procedures when
conducting audits for their clients.
■ Finance
Financial analysts use a variety of statistical information, including
price-earnings ratios and dividend yields, to guide their investment
recommendations.
■ Marketing
Electronic point-of-sale scanners at retail checkout counters are being
used to collect data for a variety of marketing research applications.
■ Production
A variety of statistical quality control charts are used to monitor the output
of a production process.
■ Economics
Economists use statistical information in making forecasts about the future
of the economy or some aspect of it.
• In general, there are more alternatives for statistical analysis when the
data are quantitative.
■ Data sources are of two types: secondary and primary. The two
can be defined as under:
(i) Secondary data: These already exist in some form, published or unpublished,
in an identifiable secondary source. They are generally available from published
source(s), though not necessarily in the form actually required.
(ii) Primary data: Those data which do not already exist in any form, and thus
have to be collected for the first time from the primary source(s). By their very
nature, these data require fresh and first-time collection covering the whole
population or a sample drawn from it.
■ Qualitative data are labels or names used to identify an attribute of each element.
Population
Sample
1. Nominal
2. Ordinal
3. Interval
4. Ratio
2. An ordinal scale is an ordered set of categories. Ordinal measurements tell you the
direction of difference between two individuals. Example: economic status.
4. A ratio scale is an interval scale where a value of zero indicates none of the variable. Ratio
measurements identify the direction and magnitude of differences and allow ratio
comparisons of measurements. The zero point is the absence of the characteristic. Example: height.
Nominal level - data that is classified into categories and cannot be arranged in any particular order.
EXAMPLES: eye color, gender, religious affiliation.
Ordinal level – involves data arranged in some order, but the differences between data values cannot
be determined or are meaningless.
EXAMPLE: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1,
Sprite number 2, Seven-up number 3, and Orange Crush number 4.
Interval level - similar to the ordinal level, with the additional property that meaningful amounts of
differences between data values can be determined. There is no natural zero point.
EXAMPLE: Temperature on the Fahrenheit scale.
Ratio level - the interval level with an inherent zero starting point. Differences and ratios are
meaningful for this level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturer’s
representatives per month.
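The difference between the interval and ratio levels can be checked numerically. The sketch below is an illustration with assumed values (80, 60, 50 and 40 degrees Fahrenheit; 300 and 150 miles). It shows that a ratio of two temperatures changes when the unit changes, because the Fahrenheit zero is arbitrary, while a ratio of two distances does not.

```python
import math

def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9

def miles_to_km(miles):
    """Convert a distance in miles to kilometres."""
    return miles * 1.609344

# Interval level: the ratio of two temperatures depends on the unit chosen.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Differences are still meaningful: a ratio of differences is unit-free.
diff_ratio_f = (80 - 60) / (60 - 50)
diff_ratio_c = ((fahrenheit_to_celsius(80) - fahrenheit_to_celsius(60))
                / (fahrenheit_to_celsius(60) - fahrenheit_to_celsius(50)))

# Ratio level: distance has a true zero, so ratios survive a change of unit.
ratio_miles = 300 / 150
ratio_km = miles_to_km(300) / miles_to_km(150)

print(f"80F/40F = {ratio_f:.1f}, same temperatures in Celsius = {ratio_c:.1f}")
print(f"difference ratio: {diff_ratio_f:.1f} (F) vs {diff_ratio_c:.1f} (C)")
print(f"300mi/150mi = {ratio_miles:.1f}, same distances in km = {ratio_km:.1f}")
```

So "80 degrees is twice as hot as 40 degrees" is not a meaningful statement, whereas "300 miles is twice as far as 150 miles" is.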
A. Discrete variables: can only assume certain values and there are usually “gaps” between
values.
EXAMPLE: the number of bedrooms in a house, or the number of hammers sold at the
local Home Depot (1,2,3,…,etc).
B. Continuous variables: can assume any value within a specified range.
EXAMPLE: The pressure in a tire, the weight of a pork chop, or the height of students in
a class.
■ Data in raw form are usually not easy to use for decision making
■ Some type of organization is needed
■ Table
■ Graph
■ Techniques reviewed here:
■ Bar charts and pie charts
■ Pareto diagram
■ Ordered array
■ Stem-and-leaf display
■ Frequency distributions, histograms and polygons
■ Cumulative distributions and ogives
■ Contingency tables
■ Scatter diagrams
Categorical Data
In the table we see that the relative frequency for Coke is 19/50 = 0.38, the relative frequency for
Sprite is 13/50 = 0.26, and so on. From the percent frequency distribution, we see that 38% of the
purchases were Coke, 26% of the purchases were Sprite, and so on. We can also note that
38% + 26% + 16% = 80% of the purchases were of the top three soft drinks.
■ Bar charts and Pie charts are often used for qualitative data (categories or nominal scale)
■ Height of bar or size of pie slice shows the frequency or percentage for each category
■ A bar graph is a graphical device for depicting data that have been summarized in a frequency,
relative frequency or percent frequency distribution.
■ On one axis of the graphs (usually the horizontal axis) we specify the labels that are used for
the classes (categories of data).
■ A frequency, relative frequency or percent frequency scale can be used for the other axis of the
graph (usually the vertical axis).
■ Then, using a bar of fixed width drawn above each class label, we extend the length of the bar
until we reach the frequency, relative frequency or percent frequency of the class. For
qualitative data the bars should be separated to emphasize the fact that each class (category) is
separate.
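The construction steps above can be sketched as a plain-text bar chart. This is a minimal illustration, not a plotting recipe: it draws the bars horizontally (class labels on the vertical axis) because that is easier in text, and the labels "third brand" and "all others" are placeholders for categories whose names are not given in these notes.

```python
# Frequencies out of 50 purchases; the last two labels are placeholders.
frequencies = {"Coke": 19, "Sprite": 13, "third brand": 8, "all others": 10}

def text_bar_chart(freqs, symbol="#"):
    """Return one line per class: label, a bar of fixed thickness, the frequency."""
    width = max(len(label) for label in freqs)
    lines = []
    for label, freq in freqs.items():
        # Bar length is proportional to the class frequency.
        lines.append(f"{label:<{width}} | {symbol * freq} {freq}")
    return lines

chart = text_bar_chart(frequencies)
for line in chart:
    print(line)
```

Each class gets its own line, so the bars are visibly separate, as recommended for qualitative data.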
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 2-26
Bar Chart Example
■ The pie chart is another graphical device for presenting relative frequency and percent
frequency distributions.
■ To construct a pie chart, we first draw a circle to represent all of the data.
■ Then we use the relative frequencies to subdivide the circle into sectors or parts that
correspond to the relative frequency for each class.
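Since a full circle is 360 degrees, each sector's central angle is the class's relative frequency times 360. The sketch below assumes the soft drink counts used earlier in this chapter (Coke 19 and Sprite 13 out of 50). The "third brand" and "all others" rows are placeholder labels for categories not named in these notes.

```python
# Frequencies out of 50 purchases; the last two labels are placeholders.
frequencies = {"Coke": 19, "Sprite": 13, "third brand": 8, "all others": 10}

def pie_sector_angles(freqs):
    """Return the central angle in degrees of each class's sector."""
    total = sum(freqs.values())
    return {label: 360 * freq / total for label, freq in freqs.items()}

angles = pie_sector_angles(frequencies)
for label, angle in angles.items():
    print(f"{label:<12}{angle:>7.1f} degrees")
```

For example, Coke's relative frequency of 0.38 corresponds to a sector of 0.38 x 360 = 136.8 degrees.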
[Figure: pie chart with a slice labelled Bonds, 29%. Percentages are rounded to the nearest percent.]
[Figure: chart showing cumulative % invested as a line graph.]
Numerical Data
■ Stem-and-Leaf Display
■ Histogram
■ Polygon
■ Ogive
■ 41 is shown as stem 4, leaf 1
Stem | Leaves
   2 | 1 4 4 6 7 7
   3 | 0 2 8
   4 | 1
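With three- and four-digit values, round to the nearest ten, then use the hundreds as the stem and the tens digit as the leaf:
■ 613 would become stem 6, leaf 1
■ 776 would become stem 7, leaf 8
■ ...
■ 1224 becomes stem 12, leaf 2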
Data:
613, 632, 658, 717, 722, 750, 776, 827,
841, 859, 863, 891, 894, 906, 928, 933,
955, 982, 1034, 1047, 1056, 1140, 1169, 1224

Stem | Leaves
   6 | 1 3 6
   7 | 2 2 5 8
   8 | 3 4 6 6 9 9
   9 | 1 3 3 6 8
  10 | 3 5 6
  11 | 4 7
  12 | 2
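The display above can be reproduced with a short sketch. It assumes that halves round up (955 becomes 96, hence stem 9, leaf 6), which is consistent with the leaves shown. The function name and its `leaf_unit` parameter are illustrative choices, not part of the notes.

```python
from collections import defaultdict

data = [613, 632, 658, 717, 722, 750, 776, 827,
        841, 859, 863, 891, 894, 906, 928, 933,
        955, 982, 1034, 1047, 1056, 1140, 1169, 1224]

def stem_and_leaf(values, leaf_unit=10):
    """Round each value to the nearest leaf_unit, then split stem and leaf."""
    rows = defaultdict(list)
    for value in sorted(values):
        # Integer arithmetic that rounds half up, e.g. 955 -> 96.
        rounded = (value + leaf_unit // 2) // leaf_unit
        stem, leaf = divmod(rounded, 10)  # 96 -> stem 9, leaf 6
        rows[stem].append(leaf)
    return dict(rows)

display = stem_and_leaf(data)
print("Stem | Leaves")
for stem in sorted(display):
    print(f"{stem:>4} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Sorting the values first keeps the leaves on each stem in ascending order.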
Tabulating Numerical Data: Frequency Distributions
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
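Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3             0.15               15
20 but less than 30        6             0.30               30
30 but less than 40        5             0.25               25
40 but less than 50        4             0.20               20
50 but less than 60        2             0.10               10
Total                     20             1.00              100

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20        3           15                3                     15
20 but less than 30        6           30                9                     45
30 but less than 40        5           25               14                     70
40 but less than 50        4           20               18                     90
50 but less than 60        2           10               20                    100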
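The frequency, relative frequency, percentage and cumulative columns can all be computed from the 20 raw values listed above. The sketch below assumes the classes used in this chapter's histogram table (10 but less than 20, up to 50 but less than 60). The function and field names are illustrative.

```python
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def frequency_distribution(values, start=10, width=10, n_classes=5):
    """Tabulate values into classes [lower, lower + width), i.e. 'lower but less than upper'."""
    counts = [0] * n_classes
    for value in values:
        index = (value - start) // width
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} falls outside the classes")
        counts[index] += 1

    n = len(values)
    rows, cumulative = [], 0
    for i, freq in enumerate(counts):
        lower = start + i * width
        cumulative += freq
        rows.append({
            "class": f"{lower} but less than {lower + width}",
            "frequency": freq,
            "relative": freq / n,
            "percent": 100 * freq / n,
            "cum_frequency": cumulative,
            "cum_percent": 100 * cumulative / n,
        })
    return rows

rows = frequency_distribution(data)
for r in rows:
    print(f"{r['class']:<22}{r['frequency']:>3}{r['relative']:>7.2f}"
          f"{r['percent']:>6.0f}{r['cum_frequency']:>5}{r['cum_percent']:>6.0f}")
```

The last cumulative frequency must equal the number of observations (20) and the last cumulative percentage must equal 100.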
▪ Each class is shown on the graph by drawing a rectangle whose base spans the class
boundaries and whose height is the corresponding class frequency.
Class                  Class Midpoint   Frequency
10 but less than 20         15              3
20 but less than 30         25              6
30 but less than 40         35              5
40 but less than 50         45              4
50 but less than 60         55              2
[Figure: histogram of the distribution above; horizontal axis: class midpoints; no gaps between bars]
Graphing Numerical Data: The Frequency Polygon
Class                  Class Midpoint   Frequency
10 but less than 20         15              3
20 but less than 30         25              6
30 but less than 40         35              5
40 but less than 50         45              4
50 but less than 60         55              2
[Figure: frequency polygon; horizontal axis: class midpoints]
(In a percentage polygon the vertical axis would be defined to show the percentage of
observations per class.)
[Figure: horizontal axis shows class boundaries (not midpoints): 10, 20, 30, 40, 50, 60]
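Plotting against class boundaries rather than midpoints is what a cumulative percentage polygon (ogive) does, so the sketch below assumes that is the graph intended here. It computes the points such a graph would connect, using the class frequencies 3, 6, 5, 4, 2 from the table earlier in this chapter. The function name is illustrative.

```python
boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]  # one frequency per class between boundaries

def ogive_points(boundaries, frequencies):
    """Return (boundary, cumulative percentage) pairs, starting at 0%."""
    total = sum(frequencies)
    points = [(boundaries[0], 0.0)]  # nothing lies below the first boundary
    running = 0
    for upper, freq in zip(boundaries[1:], frequencies):
        running += freq
        # Cumulative percentage of observations below this upper boundary.
        points.append((upper, 100 * running / total))
    return points

points = ogive_points(boundaries, frequencies)
for boundary, cum_percent in points:
    print(f"less than {boundary}: {cum_percent:.0f}%")
```

Each point sits at an upper class boundary because the cumulative percentage counts every observation below that boundary.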
Side-by-Side Chart Example