Prepared by Kenish
1. Introduction
Statistical thinking has nowadays become essential in many fields of study. Its
usefulness has spread to such diverse fields as agriculture, business, accounting, marketing,
economics, management, medicine, political science, psychology, sociology, engineering,
journalism, metrology, tourism, etc. For this reason, statistics is now included in the curriculum of
many professional and academic study programs. In biomedical research, meaningful
conclusions can only be drawn from data collected under a valid scientific design and analyzed with
appropriate statistical methods. Therefore, the selection of an appropriate study design is
important for an unbiased and scientific evaluation of the research questions. Each design
is based on a certain rationale and is applicable in certain experimental situations. Before a study
design is chosen, basic design considerations such as the goals of the study, subject or
sample selection, randomization and blinding, the selection of controls, and certain statistical
issues must be weighed to justify its use.
1.1. Definition and Classification of Statistics
Definition: The word statistics is derived from the Latin word “status”, which means state, and was
used to refer to a collection of facts of interest to the state. Statistics is also the art of learning
from data.
Statistics as a subject (field of study): In this sense (the singular sense), statistics is defined as the
science of collecting, organizing, presenting, analyzing and interpreting numerical data in order to make
effective decisions on the basis of such analysis.
Statistics as numerical data: In this sense (the plural sense), statistics is defined as aggregates of
numerically expressed facts (figures) collected in a systematic manner for a predetermined purpose.
Classification of Statistics
Depending on how the data are used, statistics is commonly divided into two main areas or
branches.
1. Descriptive Statistics: a method of collecting, organizing, summarizing and
presenting data in an informative way. Most of the statistical information in newspapers,
magazines, reports and other publications comes from data that have been summarized and
presented in a form that is easy for the reader to understand. Descriptive statistics, therefore,
deals with the classification of data, which may be tabular, graphical (such as histogram,
2. Quantitative Variables: numerical variables that can be measured. Examples include
the balance in a checking account and the number of children in a family. Note that quantitative variables
are either discrete (they can assume only certain values, and there are usually "gaps"
between the values, such as the number of bedrooms in your house) or continuous (they can
assume any value within a specific range, such as the air pressure in a tire).
Try to classify the different measurement systems into one of the four types of
scales. (Exercise)
iii. When the sampling plan is so complicated that it may require more time, labor
and money than a complete count. This is so if the size of the sample is a large
proportion of the total population and if complicated weighting procedures are used.
With each additional complication in the survey, the chances of error multiply and
greater care has to be taken, which in turn needs more time and labor.
iv. If the information is required for each and every unit in the domain of study, a complete
enumeration survey is necessary.
Having collected and edited the data, the next important step is to organize it, that is, to present
it in a readily comprehensible, condensed form that helps in drawing inferences from it. It is
also necessary that like items be separated from unlike ones.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41
Step 7: Find the class boundaries.
E.g. for class 1 (here U = 1): Lower class boundary = 6 - U/2 = 5.5
Upper class boundary = 11 + U/2 = 11.5
Then keep adding the class width w to both boundaries to obtain the remaining boundaries. Doing so
gives the following classes.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
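The boundary calculation in Step 7 can be sketched in Python. This is only an illustration, not part of the original notes: the function name `class_boundaries` and the parameter `unit` (the U of Step 7, taken here to be 1) are assumptions.

```python
def class_boundaries(limits, unit=1.0):
    """Return (lower, upper) class boundaries for a list of (lower, upper) class limits.

    `unit` plays the role of U in Step 7: the gap between the upper limit of
    one class and the lower limit of the next (assumed to be 1 here).
    """
    half = unit / 2
    return [(lower - half, upper + half) for lower, upper in limits]


# Class limits from the worked example above.
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
boundaries = class_boundaries(limits)

for (lo, hi), (lb, ub) in zip(limits, boundaries):
    print(f"{lo} - {hi}  ->  {lb} - {ub}")
```

Subtracting and adding U/2 class by class gives the same result as adding w repeatedly to the first pair of boundaries, because consecutive class limits differ by w.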
Step 8: Write the numeric values for the tallies in the frequency column.
Step 9: Find cumulative frequency.
Step 10: Find relative frequency or/and relative cumulative frequency.
The complete frequency distribution follows:
Class limit   Class boundary   Class mark   Freq.   Cf (less than type)   Cf (more than type)   rf     rcf (less than type)
6 – 11        5.5 – 11.5       8.5          2       2                     20                    0.10   0.10
12 – 17       11.5 – 17.5      14.5         2       4                     18                    0.10   0.20
18 – 23       17.5 – 23.5      20.5         7       11                    16                    0.35   0.55
24 – 29       23.5 – 29.5      26.5         4       15                    9                     0.20   0.75
30 – 35       29.5 – 35.5      32.5         3       18                    5                     0.15   0.90
36 – 41       35.5 – 41.5      38.5         2       20                    2                     0.10   1.00
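As a rough check of Steps 9 and 10, the columns of the table can be recomputed from the class limits and frequencies alone. The sketch below is illustrative only; the variable names (`cf_less`, `cf_more`, `rf`, `rcf_less`) are chosen here and do not come from the notes.

```python
from itertools import accumulate

# Class limits and frequencies taken from the table above.
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
freq = [2, 2, 7, 4, 3, 2]
n = sum(freq)  # total number of observations

# Class mark: midpoint of the class limits.
class_marks = [(lo + hi) / 2 for lo, hi in limits]

# Less-than cumulative frequency: running total from the first class.
cf_less = list(accumulate(freq))

# More-than cumulative frequency: running total from the last class.
cf_more = list(accumulate(reversed(freq)))[::-1]

# Relative frequency and less-than relative cumulative frequency.
rf = [f / n for f in freq]
rcf_less = [c / n for c in cf_less]

for row in zip(limits, class_marks, freq, cf_less, cf_more, rf, rcf_less):
    (lo, hi), mark, f, cl, cm, r, rc = row
    print(f"{lo:>2} - {hi:<2}  {mark:5.1f}  {f:2d}  {cl:3d}  {cm:3d}  {r:.2f}  {rc:.2f}")
```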
There are different types of bar charts. The most common are:
Simple bar chart
Component or subdivided bar chart
Multiple bar chart
[Figure: bar chart of sales in $ (axis 0 to 30) by product (A, B, C)]
[Figure: bar chart of sales in $ (axis 0 to 100) by year of production (1957, 1958, 1959), with Product A, Product B and Product C]
[Figure: bar chart of sales in $ (axis 0 to 60) by year of production (1957, 1958, 1959), with Product A, Product B and Product C]
ii) Pie Chart: a circle divided into sectors according to the percentage frequency of
each category of the distribution, where the angle of each sector, out of 360°, is proportional to the amount
associated with that category.
E.g. Construct a pie chart for the scholarship data.
Class   Frequency   Rf     Pf    360 x Rf (in degrees)
1st     5           5/25   20%   72°
2nd     7           7/25   28%   100.8°
3rd     9           9/25   36%   129.6°
4th     4           4/25   16%   57.6°
Total   25
[Figure: pie chart for the scholarship data]
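The sector angles in the scholarship table can be verified with a short Python sketch. It is an illustration only: the dictionary `freq` restates the frequencies from the table, and the variable names are chosen here rather than taken from the notes.

```python
from fractions import Fraction

# Frequencies of the four classes in the scholarship data.
freq = {"1st": 5, "2nd": 7, "3rd": 9, "4th": 4}
total = sum(freq.values())

angles = {}
for label, f in freq.items():
    rf = Fraction(f, total)           # relative frequency, kept exact
    angles[label] = float(360 * rf)   # sector angle in degrees
    pf = float(100 * rf)              # percentage frequency
    print(f"{label}: Rf = {f}/{total}, Pf = {pf:.0f}%, angle = {angles[label]:.1f} degrees")

# The sector angles of a complete pie chart add up to 360 degrees.
print("Sum of angles:", sum(angles.values()))
```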
[Fig 3 Histogram: frequency (axis 0 to 6) against class boundary]
[Figure: frequency polygon superimposed on a histogram; x-axis: class boundaries 5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
Ex1. The following table is a grouped frequency distribution of the money spent per visit by a
random sample of 100 customers at a department store.
Amount spent   No. of customers
5              100
10             90
15             60
20             25
25             5
I) Compute:
a) the class limits
b) the class boundaries
c) the class width
d) the class marks
Ex2. The salaries (in millions of dollars) for 31 NFL teams for a specific season are given in
this frequency distribution.
32, 39, 46, 53, 60, 67, 74 and 81, find (a) size of the class interval, and (b) the
class boundaries.
Ex4. Change the following into a continuous frequency distribution.
Marks (mid-values)   5    15   25   35   45   55
No. of students      8    12   15   9    4    2
Also find the less than and more than cumulative frequencies, and
construct a histogram, a frequency polygon, and an ogive for the data.
Ex5. The following data represent the lifetimes (in hours) of a sample of 30 transistors:
42 39 26 18 22 52 24 12 24 32
48 16 33 28 29 30 56 16 36 62
24 38 16 14 32 19 21 30 78 54
Prepare a grouped frequency distribution, using 11 classes.
Answer the following questions.