1) Biostatistics Introduction Note

The document provides an overview of statistics and biostatistics, emphasizing their importance in various fields, particularly biology. It covers data classification, tabulation, and methods of data collection, including primary and secondary data, as well as graphical and diagrammatic representations. Additionally, it discusses statistical measures, tests of significance, and the role of computers in data analysis and management.

Topic: Introduction and Importance of Statistics and Biostatistics, Classification and Tabulation of Data. Parameter, Statistic and Observation. Graphical and Diagrammatic representation of Data.
Drvet.in 2020

dispersion (variance, standard deviation, standard error and coefficient of variation): for simple and grouped data. Graphical representation of data. Tests of significance: t, Z, χ² and F tests. Estimation of correlation. Estimation of regression. Analysis of variance: C.R.D., R.B.D. Computer basics and components of computer. Simple operations: Entering and saving biological data, database management systems. MS-Office. Spreadsheet. Internet, e-mail and geographic information system (GIS).

DEMONSTRATION

Use of word processor and spreadsheet. Graphics and their uses. Data retrieving and analysis through computer (database). Use of local area network (LAN) and other network systems. Retrieving library information through network. G.I.S. and its use.

MODULE-1: INTRODUCTION AND IMPORTANCE - MEANING OF STATISTICS AND THEIR FUNCTION

Learning objective

One can understand the importance, use and meaning of Statistics after going through
this module.

STATISTICS

Meaning of Statistics

The word 'Statistics' has come from the Latin word 'status', the Italian word 'statista', the German word 'statistik' or the French word 'statistique', each of which means a political state.

 In early days, facts and figures about financial resources, births and deaths, army strength and income were collected for the purpose of efficient administration. These were called statistics, i.e., anything pertaining to the state.
 Nowadays statistics is not only the science of the state; it plays an important role in all walks of life and in all branches of scientific enquiry. In fact, statistics has become one of the essential tools of modern biology.
 The word 'statistics' carries different meanings depending on the context in which it is used.
o For example, it may mean statistical data (quantitative information), statistical methods (the methods dealing with quantitative information) or statistical measures of a sample (e.g. the arithmetic mean or standard deviation of a sample).
 By statistical data, we mean an aggregate of facts that are affected by a multiplicity of causes, numerically expressed, estimated to a reasonable standard of accuracy and collected in a systematic manner for a pre-determined purpose.
 Statistical method includes collection, classification, tabulation, presentation,
analysis and interpretation of data.
 Biostatistics is the application of statistical methods to the problems of biology
including human biology, medicine and public health.
 Biostatistics is also called Biometry meaning "biological measurement".

Functions of Statistics

 Presents facts in a definite form
 Simplifies a mass of figures
 Facilitates comparison
 Helps in formulating hypotheses
 Helps in testing hypotheses
 Helps in prediction
 Helps in the formulation of suitable policies

Limitations of Statistics

 Statistics does not deal with individuals.
 It deals only with quantitative characters. However, qualitative characters can be expressed numerically and then analysed, e.g. the intelligence of students by marks obtained, or poverty by income received.
 Statistical results are true only on an average.
 Statistics is only one of the methods of studying a problem.
 Statistics may sometimes be misused if not properly interpreted.

DEFINITIONS

Population

 A set or collection of objects pertaining to a phenomenon of statistical enquiry is referred to as the universe or population. (e.g.) animals in a farm. A complete enumeration of every unit in the population is called a census.

Sample

 When a few units are selected from a population, they are called a sample. (e.g.) animals of a particular breed in a farm.

Variable

 A quantitative or numerical characteristic of the data is called a variable. (e.g.) weight of an animal.

Constant

 It is a numerical value which is the same for all the units in the population. (e.g.) no. of credit hours for B.V.Sc students.

Attribute

 It refers to the qualitative character of the items chosen. (e.g.) breed of an animal.

Parameter

 A statistical measure pertaining to a population is called a parameter. (e.g.) mean, standard deviation of the population.

Statistic

 A statistical measure pertaining to a sample is called a statistic. (e.g.) mean, standard deviation of the sample.

Continuous variable

 If a variable can take any intermediate value within a specified interval, it is called a continuous variable. (e.g.) the weight of an animal.

Discrete or discontinuous variable

 If a variable takes only integral values, it is called a discrete (or) discontinuous variable. (e.g.) no. of animals in a farm.
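
The distinction between a parameter (population) and a statistic (sample) can be illustrated with a short sketch. The weights below are invented for illustration and the names `population` and `sample` are hypothetical; only the Python standard library is assumed.

```python
import random
import statistics

# Hypothetical population: body weights (kg) of all 10 animals in a small farm.
population = [312, 298, 305, 321, 290, 315, 308, 300, 295, 318]

# Parameters are measures computed on the whole population.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)   # population formula (divides by N)

# A sample is a few units selected from the population.
random.seed(1)                                  # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)

# Statistics are measures computed on the sample.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)            # sample formula (divides by n - 1)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The sample mean generally differs from the population mean; how far it differs depends on which units happen to be selected.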

MODULE-2: COLLECTION AND CLASSIFICATION OF DATA

 Learning objective
 The learner will get an idea of the ways of collecting and simplifying the data
after going through this module.

COLLECTION OF DATA

A statistical investigation always begins with the collection of data. The investigator can collect the data personally or take them from available records.

 The data collected by the investigator himself or by his agent from the sample or population are called primary data.
 The source from which one gathers primary data is called the primary source.
 The data collected from already available sources are known as secondary data.
 The source from which secondary data are obtained is known as the secondary source.

PRIMARY DATA

Primary data are data collected originally, i.e. first-hand information on the facts.

Methods of collecting Primary Data

 Direct personal observation: The investigator himself goes to the field of enquiry and collects the data.
 Indirect personal observation: The investigator collects data from a third person (called a witness), who knows about the data being gathered.
 Data collection through agents, local reporters etc.: Here the investigator appoints persons, called agents or local reporters, to collect information on his behalf.
 Data collection through questionnaires: The investigator frames the information needed for the particular study in the form of a set of questions, called a questionnaire, and sends it to the respondents to collect the data from them.

Merits and Demerits of Primary Data

Direct personal observation
 Merits
o It is very accurate.
o Intensive details can be collected.
 Demerits
o Expensive in terms of time and money.
o Not suitable when the field of enquiry is large.

Indirect personal observation
 Merits
o It saves time.
 Demerits
o Witness should possess thorough knowledge of the facts regarding the problem of investigation.
o Witness must be willing to give information.

Data collection through agents and local reporters etc.
 Merits
o It saves time.
o Large area can be covered.
 Demerits
o The agents will collect information in their own fashion.
o Only approximate results can be obtained.
o It is expensive.

Data collection through questionnaires
 Merits
o It saves time.
o It is less expensive.
o Geographically dispersed area can be covered.
 Demerits
o It cannot be used if the informants are illiterate.
o Response may be poor.
o Possibility of vague/inaccurate answers.

Note

 Of all the methods of collecting primary data, data collection through questionnaires is the best, provided that the questionnaires are framed skilfully so that they can be answered easily and that there is a compelling force to ensure that they are returned.

SECONDARY DATA

The data collected from available sources such as published reports, documents, journals etc. are called secondary data.

 The source from which the secondary data are collected is called the secondary source of data.
 While primary data are collected for a specific purpose, secondary data are gathered from sources that were compiled for some other purpose.

Sources of obtaining secondary data

 Published reports/documents of institutions, NGOs etc.
 Scientific journals
 Government reports
 Books and newspapers

Merits and Demerits of Secondary Data

Merits
 It saves time, labour and money.

Demerits

 They may not be very accurate.
 All the data needed may not be available.
 They might have been collected by improper methods or under abnormal conditions.

CLASSIFICATION OF DATA

Classification is the process of arranging data into sequences and groups according to
their common characteristics or separating them into different but related parts.

Objectives of Classifications

 To remove unnecessary details
 To bring out explicitly the significant features in the data
 To make comparisons and draw inferences

Methods of Classification

 Numerical Classification
o Classification of data according to quantitative characters (e.g. classification of animals in a farm according to their weight).
 Descriptive Classification
o Classification according to attributes, i.e., qualitative characters (e.g. classification of animals according to breed).
 Spatial or Geographical Classification
o Classification according to geographical area (e.g. district-wise livestock population in Tamil Nadu).
 Temporal or Chronological Classification
o Classification according to time (e.g. livestock population in different years).
 Classification according to class interval or frequency distribution
o When the data are grouped into classes of appropriate interval, showing the number of observations in each class, we get a frequency distribution. Such data are called grouped data; the original data are called raw data.
 The following frequency table shows the distribution of chicks in different weight classes.

Weight (in g)    No. of chicks
36-40            12
40-44            25
44-48            17
48-52             5
52-56             6
56-60            10
Total            75

Terms used in Frequency distribution

Class Interval and Class Limits

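 Data are grouped into regular intervals covering the range of values of the data. Each such interval is a Class Interval, and its lower and upper limits are known as Class Limits.
 True Class Interval
o When the class intervals are continuous, so that the upper boundary of one class is the lower boundary of the next (e.g. 36-40, 40-44), they are called True Class Intervals. This is the exclusive method of classification, because an observation equal to the upper limit is excluded from the class and counted in the next one.
 Apparent Class Interval
o When there is a small gap between the upper limit of any class and the lower limit of the successive class (e.g. 36-39, 40-43), the class intervals are called Apparent Class Intervals. This is the inclusive method of classification, because both limits are included in the class.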

Class Frequency or Frequency

 Class frequency or frequency is the number of observations in that class.

Width or Length of the Class Interval

 Width or length of the Class Interval is the difference between the upper boundary and lower boundary of the same class.

Class Mark

 It is the midpoint of the class.
 It is given by half of the sum of the lower limit and upper limit of any class.
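
The terms above can be checked against the chick weight table given earlier. The following is a minimal sketch; the variable names are illustrative and not part of the notes.

```python
# Chick weight classes (g) and their frequencies, taken from the frequency table.
classes = [(36, 40), (40, 44), (44, 48), (48, 52), (52, 56), (56, 60)]
frequencies = [12, 25, 17, 5, 6, 10]

for (lower, upper), frequency in zip(classes, frequencies):
    width = upper - lower                 # width (length) of the class interval
    class_mark = (lower + upper) / 2      # midpoint of the class
    print(f"{lower}-{upper}: frequency={frequency}, width={width}, class mark={class_mark}")

class_marks = [(lower + upper) / 2 for lower, upper in classes]
total = sum(frequencies)                  # total number of observations
```

Every class here has a width of 4 g, and the class marks run from 38 g to 58 g.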

TABULATION OF DATA

It is a systematic arrangement of statistical data in columns and rows. It is the next step in the condensation of data after classification, and is a mechanical part of classification. The advantages of tabulation are:

 Tables are more comprehensive and intelligible and leave a lasting impression on the mind of the reader.
 Tables facilitate quick comparisons.
 Tables facilitate economy of space (while presenting) and time (while reading).
 Relationships and other relevant characteristics of the items can be easily made out in tabulated data.

The following are the points to be considered carefully in preparing a table.

 The title should be short but clear and it should give a full idea of the contents.
 The column and row headings should be self-explanatory.
 Footnotes may be given if absolutely necessary.
 Prominence may be given to important facts by different methods of ruling and spacing.
 To have better clarity, space should be left after every five to ten rows.
 If the table is taken from secondary data, it is advisable to give a source note for the table mentioning the source from which the data were collected.

Types of table

 Reference table or General table
o These tables contain a great deal of summarized information. They usually appear at the end of the report in the form of appendices.
 Text or summary tables
o They are used to analyse or assist in the analysis of classified data. They are included in the discussion in the body of the report.
 Statistical tables
o These are special tables used by statisticians in interpreting the results of statistical analysis. The commonly used tables are the t table, Z table, F table etc.

RULES IN FORMING FREQUENCY DISTRIBUTION

 The class intervals should be of equal width and of such a size that the characteristic features of the distribution are displayed.
 Classes should be neither too large nor too small. If they are too large, considerable error is involved in assuming that the midpoint of each class interval is the average of that class. If they are too small, there will be many classes with zero or small frequency.
 Certain types of data, however, may require unequal or varying class intervals. When the data are irregular, with wide and fluctuating gaps among the values, varying class intervals should be used; otherwise some classes may contain no observations at all.
 The range of the classes should cover the entire range of the data and the classes must be continuous.
 It is convenient to have the midpoint of each class interval as an integer.
 As a general rule, the number of classes should be in the range of 6-16 and never more than 30.

FORMATION OF FREQUENCY DISTRIBUTION

 Method of Tally Marks
 Array Method

Method of Tally Marks

 First we have to form the class intervals. The difference between the maximum and minimum values in the collected data is noted and divided by the number of required classes. The resulting width should be rounded off to a convenient value.
 The number of required classes can be calculated using the formula suggested either by Sturges' rule or by Yule's rule.
 Sturges' rule

$K = 1 + 3.322 \log_{10} n$ (approx.)

where K is the number of required classes and n is the number of observations.

 Yule's rule

$K = 2.5 \times n^{1/4}$ (approx.)

where K is the number of required classes and n is the number of observations.

 After the class intervals are formed, they are written one below the other, and for each item in the collected data a stroke is marked against the class interval in which it falls.
 Usually, after every four such strokes in a class interval, the fifth item is indicated by a stroke drawn across the previous four, thus making the strokes easy to count.
 These strokes are counted for each class, and this is called formation of a frequency distribution by the method of tally marks.
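
The tally procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 20 chick weights are invented, the function names are hypothetical, the logarithm in Sturges' rule is taken to base 10, and each class is treated as including its lower boundary and excluding its upper boundary.

```python
import math

def sturges_classes(n):
    """Number of classes by Sturges' rule: K = 1 + 3.322 log10(n)."""
    return 1 + 3.322 * math.log10(n)

def yule_classes(n):
    """Number of classes by Yule's rule: K = 2.5 * n^(1/4)."""
    return 2.5 * n ** 0.25

def tally(data, lowest, width, k):
    """Count the observations falling in each of k classes of equal width."""
    counts = [0] * k
    for x in data:
        index = int((x - lowest) // width)
        index = min(index, k - 1)      # keep the maximum value in the last class
        counts[index] += 1
    return counts

# Hypothetical weights (g) of 20 chicks, invented for illustration.
weights = [37, 41, 43, 39, 45, 47, 42, 40, 52, 58,
           44, 46, 38, 41, 49, 55, 43, 42, 57, 36]

n = len(weights)
k = round(sturges_classes(n))                          # number of classes
width = math.ceil((max(weights) - min(weights)) / k)   # rounded up for convenience
counts = tally(weights, min(weights), width, k)

for i, count in enumerate(counts):
    lower = min(weights) + i * width
    print(f"{lower}-{lower + width}: {'|' * count} ({count})")
```

For these 20 values both rules suggest about five classes. The printed strokes play the role of the tally marks, and the counts are the class frequencies.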

Array Method

 An array is an orderly arrangement of the data by magnitude in ascending or descending order.
 Form the class intervals as in the previous method.
 Then arrange the given data in ascending order of magnitude.
 From the array, count the number of observations belonging to each class and write it against that class.
 This method is not easy when the number of observations is large.
 We can adopt this method in cases where the number of observations is less than 50.
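
The array method can be sketched in the same way. The ten observations and the class boundaries below are invented for illustration, and each class is again assumed to include its lower boundary only.

```python
# Hypothetical observations (fewer than 50), invented for illustration.
data = [44, 37, 52, 41, 39, 47, 58, 43, 40, 45]

# Step 1: form the array, i.e. arrange the data in ascending order of magnitude.
array = sorted(data)

# Step 2: class boundaries formed as in the method of tally marks.
boundaries = [36, 42, 48, 54, 60]

# Step 3: count the observations of the array belonging to each class.
frequencies = []
for lower, upper in zip(boundaries, boundaries[1:]):
    in_class = [x for x in array if lower <= x < upper]
    frequencies.append(len(in_class))
    print(f"{lower}-{upper}: {in_class} -> {len(in_class)}")
```

Because the array is ordered, the members of each class sit next to one another, which is what makes counting by hand practical for small data sets.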

MODULE-3: PRESENTATION OF DATA

 Learning objective
 This helps the reader to know about the various ways of representing the data
by means of diagrams and graphs so that the voluminous numerical data can
be exhibited by attractive pictures.

PRESENTATION OF DATA

Introduction

 Classification and tabulation reduce the complexity of vast and complicated statistical data, but it is still not easy to interpret tabulated data. Diagrams and graphs catch the eye more easily than tables, which present only an array of figures. A glance at a graph or diagram enables even a layman (without statistical knowledge) to get an idea of the essential characteristics of the tabulated data without much strain or effort.

FUNCTIONS AND LIMITATIONS OF DIAGRAMS AND GRAPHS

Functions

 They attract the attention of a large number of persons.
 They leave a "bird's-eye view" impression on the human mind.
 They save a lot of valuable time when data are presented in the form of suitable charts and graphs instead of pages of numerical figures.
 They facilitate comparison between two or more sets of data.
 Prediction equations can be represented by graphs, and these are very helpful in forecasting.

Limitations

 They are approximate indicators.
 Exact and accurate information can be obtained only from the original tabular information.
 They cannot substitute for the tabular information.
 They fail to disclose small differences when large figures are involved.

GRAPHICAL REPRESENTATION OF DATA

 Graphical representation is done when the data are classified in the form of a frequency distribution. The different graphs are
o Histogram
o Frequency Polygon
o Frequency Curve
o Ogive
o Lorenz Curve

Histogram

 It is a vertical bar diagram without gaps between the bars.
 It consists of bars erected over the true class intervals, their areas being proportional to the frequencies of the respective classes.
 When the class intervals are of equal width, the height of each bar serves as a measure of the corresponding frequency.
 To locate the mode, take the rectangle of the modal class (the tallest rectangle). Join its top right corner to the top right corner of the preceding rectangle, and its top left corner to the top left corner of the succeeding rectangle. The x co-ordinate of the point of intersection of these two lines is the mode.
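
The graphical construction for the mode has a simple algebraic equivalent: intersecting the two lines gives mode = L + d1 / (d1 + d2) × h, where L is the lower boundary of the modal class, h its width, d1 the excess of the modal frequency over the preceding class and d2 its excess over the succeeding class. The sketch below applies this to the chick weight table; the function name is hypothetical, and equal class widths are assumed.

```python
def mode_from_histogram(lower, width, f_prev, f_modal, f_next):
    """x co-ordinate where the two diagonals drawn in the modal rectangle intersect.

    Assumes equal class widths and a modal frequency strictly greater than
    at least one neighbouring frequency (otherwise d1 + d2 would be zero).
    """
    d1 = f_modal - f_prev    # excess over the preceding class
    d2 = f_modal - f_next    # excess over the succeeding class
    return lower + d1 / (d1 + d2) * width

# Modal class of the chick table: 40-44 g with frequency 25,
# preceded by 36-40 (12 chicks) and followed by 44-48 (17 chicks).
mode = mode_from_histogram(lower=40, width=4, f_prev=12, f_modal=25, f_next=17)
print(round(mode, 2))   # 42.48
```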

Frequency Polygon

 If points are plotted with the x co-ordinate equal to the mid value of each class interval and the corresponding frequency as the y co-ordinate, and these points are joined by straight lines, we obtain the frequency polygon.
 These points are the midpoints of the tops of the bars in the histogram.

Frequency Curve

 If points are plotted with the x co-ordinate equal to the mid value of the class
intervals and the corresponding frequencies as the y co-ordinate and these
points are joined by means of a smooth curve then we get frequency curve.

Ogive

 This is the cumulative frequency curve.
 This curve is obtained by making use of the cumulative frequency instead of the simple frequency.

Cumulative Frequency Distribution

 A frequency distribution gives the number of observations that lie in any class interval, whereas a cumulative frequency distribution gives the number of observations that lie below or above any given mark.
 When derived from a frequency distribution, the cumulative frequency distribution of the first kind gives the number of observations less than the lower boundary of each successive class, and that of the second kind gives the number of observations that lie at or above the lower boundary of each class. These are known respectively as the less than and greater than cumulative frequency distributions.
 If we draw frequency polygons for the above two distributions, we get the cumulative frequency polygons (less than and greater than).
 If we draw frequency curves for the above two distributions on the same graph, we get the cumulative frequency curves or Ogives.
 The x co-ordinate of the point of intersection of the less than and greater than cumulative frequency curves is the median.
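
As a sketch of these ideas, the code below builds both cumulative distributions for the chick weight table and locates the point where they cross. It assumes the points are joined by straight lines (a cumulative frequency polygon), so the value is an approximation to what a smooth Ogive drawn by hand would give; the function name is hypothetical.

```python
from itertools import accumulate

# Class boundaries (g) and frequencies from the chick weight table.
boundaries = [36, 40, 44, 48, 52, 56, 60]
frequencies = [12, 25, 17, 5, 6, 10]
total = sum(frequencies)

# Number of observations below each boundary ("less than" distribution).
less_than = [0] + list(accumulate(frequencies))
# Number of observations at or above each boundary ("greater than" distribution).
greater_than = [total - below for below in less_than]

def median_from_ogives(boundaries, less_than, total):
    """x value where the less than and greater than polygons intersect.

    The two curves cross where the cumulative frequency equals total / 2;
    linear interpolation inside the class is an assumption of this sketch.
    """
    half = total / 2
    for i in range(1, len(boundaries)):
        if less_than[i] >= half:
            lower, upper = boundaries[i - 1], boundaries[i]
            below = less_than[i - 1]
            in_class = less_than[i] - below
            return lower + (half - below) / in_class * (upper - lower)

median = median_from_ogives(boundaries, less_than, total)
print(less_than)
print(greater_than)
print(round(median, 2))
```

At every boundary the two cumulative frequencies add up to the total of 75, which is why the curves cross at a cumulative frequency of 37.5.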

Lorenz Curve

 This is a modification of the Ogive in which the variable and the cumulative frequencies are expressed as percentages.
 It serves to measure the evenness of a distribution and is useful in picturing the distribution and dispersion of wealth, sales, profits etc.
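
The points of a Lorenz curve can be computed as follows. The five incomes are invented for illustration; the sketch simply expresses the cumulative number of units and the cumulative value of the variable as percentages of their totals.

```python
from itertools import accumulate

# Hypothetical incomes of five farmers (arbitrary units), invented for illustration.
incomes = [40, 10, 100, 20, 30]

ordered = sorted(incomes)          # arrange from the smallest to the largest
n = len(ordered)
total_income = sum(ordered)

# Cumulative percentage of farmers (x axis) and of income (y axis).
cum_pct_farmers = [100 * (i + 1) / n for i in range(n)]
cum_pct_income = [100 * c / total_income for c in accumulate(ordered)]

for x, y in zip(cum_pct_farmers, cum_pct_income):
    print(f"poorest {x:.0f}% of farmers hold {y:.0f}% of the income")
```

If the distribution were perfectly even, every point would lie on the diagonal line y = x; the further the curve sags below the diagonal, the more uneven the distribution.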

DIAGRAMMATIC REPRESENTATION OF DATA

Points to be followed in drawing a diagram

 For each diagram, a suitable short heading should be given.
 It should be drawn so as to exhibit the statistical matter clearly. A suitable scale, depending on the space available, should be adopted so that the significant features stand out clearly.
 Diagrams should be drawn accurately with the help of drawing instruments.
 Colouring and different markings should be done with pencil or with colours.
 Different colours, marks or dottings are used to show different items. In such cases a legend should be given showing the item to which each refers, taking care that the visual impression conveyed by the diagram is not affected in any way.
 The original data on which the diagram is based should be given, if necessary facing the diagram, as this will help the observer to see the details with clarity.
 Reference to the source of the table should be provided.

Types of a diagram

 One dimensional diagram
o Line diagram
o Bar diagram
 Two dimensional (or) Area diagram
o Pie diagram
o Square diagram and rectangle diagrams
 Three dimensional (or) Volume diagrams
o Cubes
o Spheres, cylinders etc.
 Pictogram
o Actual pictures

ONE DIMENSIONAL DIAGRAM

One dimensional diagrams consist of lines and curves as well as bars.

 Line diagram
o This requires vertical lines to be drawn at equal intervals, each of length proportional to the magnitude of the variable for the different items.
o A line has no width and hence has a very poor visual effect.
o It makes comparison easy, although it is less attractive.
 Bar Diagram
o It is the simplest of all statistical diagrams.
o It consists of bars of equal width (all horizontal or vertical) standing on
a common base line at equal intervals, the length of the bars being
proportional to the magnitude of the variable for different items.
 Sub-divided bar diagram or component bar diagram
o Sometimes the variable is capable of being sub-divided into two or more
component parts each representing a sub variable.
o In this case, all the bars are subdivided by lines in the same order so
that each subdivision represents the parts in magnitude in the same
scale.
o They are properly coloured or marked differently for visual guidance.
o Small squares should be given below the diagram containing the same
colour or mark to show their significance.
 Superimposed or Multiple bar diagram
o Bars may sometimes be superimposed for comparative purpose.
 Percentage bar diagram
o When the component parts are expressed in percentages of the whole,
the resulting bar diagram is called a percentage bar diagram.
o In this case all the bars are of equal length.
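
The arithmetic behind a percentage bar diagram can be sketched as below. The livestock numbers for the two farms are invented for illustration and the function name is hypothetical.

```python
# Hypothetical livestock numbers on two farms, invented for illustration.
farms = {
    "Farm A": {"cattle": 30, "sheep": 50, "goats": 20},
    "Farm B": {"cattle": 10, "sheep": 25, "goats": 15},
}

def percentages(components):
    """Express each component part as a percentage of the whole."""
    total = sum(components.values())
    return {name: 100 * value / total for name, value in components.items()}

percentage_bars = {farm: percentages(parts) for farm, parts in farms.items()}

for farm, parts in percentage_bars.items():
    print(farm, parts)
```

Both bars now add up to 100, so they are drawn with the same length even though Farm A has twice as many animals as Farm B; only the relative sizes of the components are compared.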

TWO DIMENSIONAL (OR) AREA DIAGRAM

Pie diagram

 Circles with areas proportional to the magnitudes of the data are drawn (i.e. with radii proportional to the square roots of the magnitudes), and the components (sub-variables) are drawn as sectors whose areas are proportional to their magnitudes.
 A circle subtends an angle of 360° at the centre and this represents the total. The angle of the sector representing each component is calculated; the sectors are distinguished by different colours or markings, and a key for these should be given.
 It is usual to start from a horizontal radius to the right and proceed in the anti-clockwise direction, giving the quantities in descending order of magnitude, except the miscellaneous category, which is shown at the end.
 Lengths have more visual effect than areas, and hence the pie diagram is of less use for comparative purposes. It is commonly used to represent a single observation with different components.
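
The sector angles of a pie diagram follow from angle = (component / total) × 360°. The sketch below uses invented cost figures and hypothetical category names, and orders the sectors as described above, with the miscellaneous category placed last.

```python
# Hypothetical cost components of a farm (arbitrary units), invented for illustration.
components = {"miscellaneous": 5, "labour": 30, "feed": 50, "medicine": 15}
total = sum(components.values())

# Descending order of magnitude, with the miscellaneous category at the end.
ordered = sorted(
    (name for name in components if name != "miscellaneous"),
    key=lambda name: components[name],
    reverse=True,
)
if "miscellaneous" in components:
    ordered.append("miscellaneous")

# Angle of each sector: the whole circle of 360 degrees represents the total.
angles = {name: 360 * components[name] / total for name in ordered}

for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```

The angles always add up to 360°, which is a quick check on the calculation before the diagram is drawn.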

Square diagram and Rectangle diagrams

 Their areas should be proportional to the magnitudes of the data.
 For square diagrams, we take the square roots of the given figures, which give the measurements of the sides of the squares. By adopting a suitable scale we can draw the squares.
 In the case of rectangle diagrams, if we take equal breadth (width) for the rectangles, the areas will be proportional to the lengths, and hence the lengths will be proportional to the magnitudes of the given variables.
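
A short sketch of the square diagram calculation, using invented district totals and an assumed scale of 10 square-root units per centimetre:

```python
import math

# Hypothetical livestock totals for three districts, invented for illustration.
values = {"District A": 100, "District B": 400, "District C": 900}

units_per_cm = 10   # assumed scale: 10 units of square root are drawn as 1 cm

# Side of each square is proportional to the square root of the value,
# so that the area of the square is proportional to the value itself.
sides_cm = {name: math.sqrt(value) / units_per_cm for name, value in values.items()}

for name, side in sides_cm.items():
    print(f"{name}: side = {side} cm, area = {side * side:.2f} sq. cm")
```

A value nine times as large gives a side only three times as long, which is why squares can accommodate a wider range of data than bars on the same page.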

THREE DIMENSIONAL (OR) VOLUME DIAGRAM

 These comprise cubes, spheres, prisms, cylinders and blocks.
 Of these, cubes are mainly used, and their sides are drawn in proportion to the cube roots of the magnitudes of the data.
 They are particularly used when the data have a very wide range. In such a case, it would be difficult to represent the quantities even by squares.
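
Why volume diagrams suit data with a very wide range can be seen by comparing the linear sizes needed for bars, squares and cubes. The three values below are invented for illustration.

```python
# Hypothetical values spanning a very wide range, invented for illustration.
values = [1, 1_000, 1_000_000]

bar_lengths = [float(v) for v in values]        # length proportional to the value
square_sides = [v ** (1 / 2) for v in values]   # side proportional to the square root
cube_sides = [v ** (1 / 3) for v in values]     # side proportional to the cube root

for v, bar, square, cube in zip(values, bar_lengths, square_sides, cube_sides):
    print(f"value {v}: bar {bar:.0f}, square side {square:.1f}, cube side {cube:.1f}")
```

The largest bar would have to be a million times as long as the smallest, the largest square a thousand times as wide, but the largest cube only about a hundred times as wide.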

PICTOGRAM

 Tabular data can also be represented by pictograms, cartograms, maps and pictures, as these devices help in attracting attention to statistical matter which, when presented in the ordinary diagrammatic form, is very often ignored.
 Pictograms are diagrams of a pictorial or semi-pictorial nature, drawn in different sizes according to scale. Though they are useful in attracting the attention of people, readers very often still rely on the tables and ignore the pictorial diagrams.
 They cannot be used with certain complicated data.

MODULE-4: MEASURES OF AVERAGES

 Learning objective
 Readers of this module will come to know the methods of condensing the data
by means of a single figure and comparing two or more distributions.

MEASURES OF AVERAGE

Need of an average
