RMBS BPT402
and Bio-Statistics
BPT-402
Syllabus
Measures of central tendency or measures of location – Mean,
Median, Mode in ungrouped & grouped series. Partition values –
Quartiles, Deciles, Percentiles in ungrouped & grouped series.
Graphical determination of Median, Mode & partition values.
Measures of Skewness – Pearson’s and Bowley’s coefficients of
skewness. Measures of Dispersion or Variation – Range, Mean
Deviation, Standard Deviation.
Probability – Random experiment, sample space, events,
probability of an event, addition & multiplication laws of
probability, use of permutations & combinations in calculation
of probabilities, random variable, probability distribution of a
random variable, Binomial Distribution.
Correlation – Bivariate distribution, scatter diagram, coefficient of correlation, calculation &
interpretation of correlation coefficient.
Regression – Lines of regression, calculation of Regression coefficient.
Sampling Variability & significance – Sampling Distribution, Standard error, null hypothesis,
alternative hypothesis, Type I & Type II errors, tests of significance, acceptance & rejection of null
hypothesis, level of significance, Z test, t test (paired & unpaired), chi-square test.
Estimation of confidence limits & intervals.
Vital Statistics
1) Rates & ratios of vital events.
2) Measures of Mortality: Crude Death Rate, Specific Death Rate, Age Specific Death Rate,
Standardized Death Rates, Infant Mortality Rate.
3) Measures of Fertility: Crude Birth Rate, General Fertility Rate, Specific Fertility Rate, Age Specific
Fertility Rate, and Total Fertility Rate.
4) Measurement of Population Growth: Crude Rate of Natural
Increase & Pearl’s Vital Index, Gross Reproduction Rate, Net Reproduction Rate.
5) Measures of Morbidity: - Morbidity Incidence Rate, Morbidity Prevalence Rate.
6) Life Tables or Mortality Table.
Summary Measures
Standard Deviation
Geometric Mean
Collection and Presentation of Data
Collection
Data: Foundation of Statistical analysis and interpretation
Data Sources: Primary and Secondary, Internal and External records
Presentation:
Classification
Chronological, Geographical, Qualitative, Quantitative
Quantitative- Frequency distribution
Class intervals- class limits, class mid point, inclusive and exclusive methods
Tabulation
Charting
Some important concepts
Variable
Continuous- Measurement (height, weight, etc.)
Discrete/Discontinuous- Counting (number of rooms, number of persons)
Frequency
Measures of Central Tendency
[Figure: overview of measures of central tendency, with the sample mean and population mean formulas]
Mean/ Arithmetic Mean
Average
Adding together all the observations and dividing this total by the number
of observations
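Mean (Arithmetic Mean)
Sample mean (sample size n)
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Population mean (population size N)
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$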
Mean (Arithmetic Mean) (continued)
[Figure: two number lines, one running 0 to 10 with Mean = 5 and one running 0 to 14 with Mean = 6]
Mean (Arithmetic Mean) (continued)
From a Frequency Distribution
Approximating the Arithmetic Mean
Used when raw data are not available
$$\bar{X} = \frac{\sum_{j=1}^{c} m_j f_j}{n}$$
n = sample size
c = number of classes in the frequency distribution
m_j = midpoint of the jth class
f_j = frequency of the jth class
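As a minimal sketch of the grouped-data approximation above, the function below sums midpoint times frequency and divides by the total frequency. The function name and the four-class frequency distribution are hypothetical, chosen only for illustration.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate the arithmetic mean from class midpoints m_j and frequencies f_j."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    n = sum(frequencies)  # sample size n = total frequency
    if n == 0:
        raise ValueError("total frequency must be positive")
    return sum(m * f for m, f in zip(midpoints, frequencies)) / n


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (illustrative data only)
midpoints = [5, 15, 25, 35]
frequencies = [2, 5, 8, 5]
mean_estimate = grouped_mean(midpoints, frequencies)
print(mean_estimate)  # (10 + 75 + 200 + 175) / 20 = 23.0
```

The result is only an approximation, because every observation in a class is treated as if it sat at the class midpoint.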
Median
Robust Measure of Central Tendency
Not Affected by Extreme Values
[Figure: two number lines, one running 0 to 10 and one running 0 to 14; Median = 5 in both]
In an ordered array, the median is the ‘middle’ number
Grouped Data
Use N/2 to locate Median Class
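$$\text{Median} = L + \frac{N/2 - \text{P.C.F.}}{f} \times i$$
where L = lower limit of the median class, P.C.F. = cumulative frequency of the class preceding the median class, f = frequency of the median class, i = width of the class interval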
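The grouped-data median formula can be sketched as follows. This is an illustration under stated assumptions: classes are exclusive and of equal width, and the function name and the frequency distribution are hypothetical.

```python
def grouped_median(lower_limits, width, frequencies):
    """Median = L + ((N/2 - P.C.F.) / f) * i for equal-width exclusive classes."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # P.C.F.: cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= half:  # first class whose cumulative frequency reaches N/2
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("could not locate the median class")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (illustrative data only)
lower_limits = [0, 10, 20, 30]
frequencies = [2, 5, 8, 5]
median_estimate = grouped_median(lower_limits, 10, frequencies)
print(median_estimate)  # N/2 = 10, median class 20-30: 20 + (10 - 7) / 8 * 10 = 23.75
```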
Related Positional Measures
[Figure: two number lines; the data on the 0 to 6 axis have no mode, the data on the 0 to 14 axis have Mode = 9]
Calculation of Mode
Ungrouped Data
Tally Marks
Grouped Data
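$$Mo = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$
where L = lower limit of the modal class, $\Delta_1$ = frequency of the modal class minus frequency of the preceding class, $\Delta_2$ = frequency of the modal class minus frequency of the succeeding class, i = width of the class interval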
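A minimal sketch of the grouped-data mode formula is given below. It assumes equal-width classes and a single modal class, and it treats a missing neighbouring class as having frequency 0. The function name and data are hypothetical.

```python
def grouped_mode(lower_limits, width, frequencies):
    """Mo = L + (d1 / (d1 + d2)) * i, where d1 and d2 are the frequency differences."""
    if not frequencies:
        raise ValueError("frequencies must not be empty")
    # Index of the modal class (the first class with the highest frequency)
    k = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f1 = frequencies[k]
    f0 = frequencies[k - 1] if k > 0 else 0                     # preceding class
    f2 = frequencies[k + 1] if k + 1 < len(frequencies) else 0  # succeeding class
    d1, d2 = f1 - f0, f1 - f2
    if d1 + d2 == 0:
        raise ValueError("formula is undefined when the three frequencies are equal")
    return lower_limits[k] + d1 / (d1 + d2) * width


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 (illustrative data only)
mode_estimate = grouped_mode([0, 10, 20, 30], 10, [2, 5, 8, 5])
print(mode_estimate)  # modal class 20-30: 20 + 3 / (3 + 3) * 10 = 25.0
```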
To:
Determine reliability of average
Basis to control variability
Comparing two or more series on the basis of variability
Use more statistical measures
Methods of Measuring Variation
Methods
1. Range
Most unreliable measure of variation
Not based on each and every observation
Subject to fluctuations of considerable magnitude from sample to sample
Cannot tell us about the character of the distribution between the two extreme
observations
2. Interquartile Range and Quartile Deviation
Interquartile Range: the range that includes the middle 50% of the observations, i.e. it runs from the
first quartile at the lower end to the third quartile at the upper end
Quartile Deviation: average amount by which the two quartiles differ from
the median (an absolute measure)
Square of the Standard Deviation ($\sigma^2$) = Variance
Calculation of Standard Deviation
Ungrouped Data
1. Deviations from Actual mean
2. Assumed Mean
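The two methods listed above can be compared with a short sketch. It assumes the divisor is N (the whole set of observations is treated as the population); the function names and the data are hypothetical. With deviations d = X - A from an assumed mean A, the formula used is $\sigma = \sqrt{\sum d^2 / N - (\sum d / N)^2}$.

```python
import math


def sd_actual_mean(values):
    """S.D. from deviations about the actual mean (divisor N)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_assumed_mean(values, assumed_mean):
    """S.D. from deviations d = X - A about an assumed mean A (divisor N)."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data; actual mean is 5
print(sd_actual_mean(data))      # 2.0
print(sd_assumed_mean(data, 4))  # 2.0, same answer for any assumed mean
```

The assumed-mean method gives the same result whatever value of A is chosen; it only makes the hand arithmetic easier when the actual mean is awkward.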
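Mathematical Properties of Standard Deviation
1. Combined S.D.
2. S.D. of Natural Numbers
For a normal distribution:
Mean ± 1 S.D. covers 68.27% of the observations
Mean ± 2 S.D. covers 95.45% of the observations
Mean ± 3 S.D. covers 99.73% of the observations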
Merits & Limitations
Example:
N= 100
Mean= 40
SD= 5
The computer by mistake took the value 50 in place of 40 for one of the
observations.
Find the Correct Mean & Variance.
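One way to work this example, assuming the variance uses the divisor N: recover the recorded totals from the summary figures, swap the wrong value for the right one, and recompute. Here $\sum X = N \bar{X} = 4000$ and $\sum X^2 = N(\sigma^2 + \bar{X}^2) = 100(25 + 1600) = 162500$. The corrected totals are 4000 - 50 + 40 = 3990 and 162500 - 2500 + 1600 = 161600, so the correct mean is 39.9 and the correct variance is 1616 - 1592.01 = 23.99. The helper name below is hypothetical.

```python
def corrected_mean_variance(n, mean, sd, wrong, right):
    """Correct a mean and variance (divisor N) after one wrongly recorded value."""
    total = n * mean                      # sum of observations as recorded
    total_sq = n * (sd ** 2 + mean ** 2)  # sum of squares as recorded
    total_corrected = total - wrong + right
    total_sq_corrected = total_sq - wrong ** 2 + right ** 2
    mean_corrected = total_corrected / n
    variance_corrected = total_sq_corrected / n - mean_corrected ** 2
    return mean_corrected, variance_corrected


correct_mean, correct_variance = corrected_mean_variance(100, 40, 5, wrong=50, right=40)
print(round(correct_mean, 2), round(correct_variance, 2))  # 39.9 23.99
```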
MEASURES OF SKEWNESS & KURTOSIS
Measures of central tendency and variation discussed do not reveal the
entire story about a frequency distribution
Two distributions may have the same mean and SD but may differ in their
shape of the distribution
SKEWNESS
Measures of Skewness
Lack of symmetry or departure from symmetry
Measures- Absolute & Relative
The farther apart the mean and the mode, the higher the skewness
The distance between the mean and the mode is Karl Pearson’s basis for measuring skewness
Relative Skewness= Absolute Skewness/SD
Karl Pearson’s Coeff. Of Skewness
Bowley’s Coeff.
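The slide formulas are not preserved here, so the sketch below uses the standard textbook definitions: Karl Pearson's coefficient $Sk_P = (\text{Mean} - \text{Mode}) / \sigma$ and Bowley's coefficient $Sk_B = (Q_3 + Q_1 - 2\,\text{Median}) / (Q_3 - Q_1)$. The function names and the summary figures are hypothetical.

```python
def pearson_skewness(mean, mode, sd):
    """Karl Pearson's coefficient: (Mean - Mode) / S.D."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean - mode) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 * Median) / (Q3 - Q1)."""
    if q3 == q1:
        raise ValueError("Q3 and Q1 must differ")
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Illustrative summary figures only
print(pearson_skewness(mean=50, mode=44, sd=12))           # 0.5, positively skewed
print(round(bowley_skewness(q1=20, median=30, q3=50), 4))  # 0.3333, positively skewed
```

Pearson's measure relies on the mean, mode and S.D., while Bowley's relies only on the quartiles, so the two need not agree in magnitude for the same data.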
Moments
Grouped Data
Kurtosis
Topics to be discussed:
Bivariate distribution
Scatter diagram
Coefficient of Correlation
Calculation & interpretation of correlation coefficient
Correlation
“the tool with the help of which the relationships between two or more than
two variables is studied is called correlation”
So far we have discussed problems relating to one variable only
Correlation is for 2 or more than 2 variables
measure= Coefficient of Correlation
Denoted by ‘r’
Why use correlation?
Scatter Diagram
Karl Pearson’s Coefficient of Correlation
Spearman’s rank correlation coefficient
Method of least squares
1. Scatter Diagram
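2. Karl Pearson’s Coefficient of Correlation
$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$
or
$$r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$
When original observations are used, instead of deviations:
$$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{N \sum X^2 - (\sum X)^2}\,\sqrt{N \sum Y^2 - (\sum Y)^2}}$$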
r = +1: Perfect Positive Correlation
r = -1: Perfect Negative Correlation
r = 0: No Correlation
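A minimal sketch of Karl Pearson's coefficient, computed from deviations about the means as $r = \sum xy / \sqrt{\sum x^2 \sum y^2}$ with $x = X - \bar{X}$ and $y = Y - \bar{Y}$. The function name and the data are hypothetical; the two example series are chosen to hit the limiting values +1 and -1.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from deviations about the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a series has no variation")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


xs = [1, 2, 3, 4, 5]
print(round(pearson_r(xs, [2, 4, 6, 8, 10]), 6))  # 1.0, perfect positive correlation
print(round(pearson_r(xs, [10, 8, 6, 4, 2]), 6))  # -1.0, perfect negative correlation
```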
Taking deviations from Assumed Mean
Correlation between Bivariate Grouped Data
Properties of Correlation Coefficient
Lies between -1 and +1
Independent of Origin and Scale
Geometric Mean of two regression coefficients
r = 0 in case of independent variables
Limitations
Assumptions of linear relationship
Affected by extreme values
Time consuming method
Regression Analysis
Topics to be studied
Lines of Regression
Calculation of Regression Coefficient
Regression
“The statistical tool with the help of which we are in a position to estimate
(or predict) the unknown values of one variable from known values of
another variable”
Dictionary Meaning- Act of Returning or Going Back
Francis Galton, 1877, study of heights of fathers and sons.
The line describing this tendency to regress or going back is called
‘regression line’
Types of Regression
Simple Regression
Multiple Regression
Linear Bivariate Regression Model
Assumptions
The value of the dependent variable Y depends to
some degree upon the independent variable X
Linear relationship between X and Y
Regression Lines
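The two regression lines can be sketched with the standard least-squares coefficients, which are assumed here because the slide formulas are not preserved: $b_{yx} = \sum xy / \sum x^2$ for the line of Y on X, $Y - \bar{Y} = b_{yx}(X - \bar{X})$, and $b_{xy} = \sum xy / \sum y^2$ for the line of X on Y, with x and y the deviations from the means. The function names and data are hypothetical.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for the two lines of regression."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("coefficients are undefined when a series has no variation")
    return mean_x, mean_y, sum_xy / sum_xx, sum_xy / sum_yy


def predict_y(x, mean_x, mean_y, b_yx):
    """Estimate Y from X using the line of Y on X."""
    return mean_y + b_yx * (x - mean_x)


xs = [1, 2, 3, 4, 5]  # illustrative data only
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(xs, ys)
print(round(b_yx, 4), round(b_xy, 4))                  # 0.6 1.0
print(round(predict_y(6, mean_x, mean_y, b_yx), 4))    # 4 + 0.6 * (6 - 3) = 5.8
print(round((b_yx * b_xy) ** 0.5, 4))                  # 0.7746, magnitude of r
```

The last line illustrates the property that r is the geometric mean of the two regression coefficients.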
SAMPLING DISTRIBUTION
Census Vs. Sampling
Sample Statistic Vs. Population Parameter
Statistical Regularity and Inertia of Large Numbers
Statistical Inference
SAMPLING DESIGN
Selecting a subset of units from a target population for the purpose of collecting
information
Economical and accurate research process
Types: Probability and Non Probability Sampling
Probability Sampling Techniques
Probability Sampling
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster Sampling
Non-Probability Sampling Techniques
Sampling Error
A statistical error that occurs when the researcher does not select a sample that
represents the entire population of data.
Biased errors and unbiased errors
HYPOTHESIS
Two-tailed test
The hypothesis about the population mean is rejected for values of the test statistic falling
into either tail of the sampling distribution
H0: μ = 100
and H1: μ ≠ 100
One-tailed test
The hypothesis about the population mean is rejected only for values of the test statistic falling into
one of the tails of the sampling distribution
Right-tailed test or left-tailed test
H0: μ = 100
and H1: μ < 100 (left-tailed) or H1: μ > 100 (right-tailed)
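To make the difference between the tests concrete, here is a minimal sketch of a Z test for H0: μ = 100. It assumes the population standard deviation is known and the sample is large; the sample figures (mean 103, σ = 15, n = 100) and the function name are hypothetical. `NormalDist` is from Python's standard `statistics` module (Python 3.8+).

```python
import math
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, tail="two"):
    """Z statistic and p-value for H0: mu = mu0, population S.D. assumed known."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    cdf = NormalDist().cdf(z)  # area to the left of z under the standard normal curve
    if tail == "two":      # H1: mu != mu0, both tails count
        p_value = 2 * min(cdf, 1 - cdf)
    elif tail == "left":   # H1: mu < mu0
        p_value = cdf
    elif tail == "right":  # H1: mu > mu0
        p_value = 1 - cdf
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, p_value


alpha = 0.05  # level of significance
z, p_two = z_test_mean(103, 100, 15, 100, tail="two")
_, p_right = z_test_mean(103, 100, 15, 100, tail="right")
print(round(z, 2), round(p_two, 4), round(p_right, 4))
print("reject H0" if p_two < alpha else "do not reject H0")
```

For the same sample, the one-tailed p-value in the direction of the observed difference is half the two-tailed p-value, which is why the choice of H1 must be made before looking at the data.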
Type I and II errors
When a statistical hypothesis is tested, there
are 4 possible results
Hypothesis true: accepted (correct decision)
Hypothesis false: rejected (correct decision)
Hypothesis true: rejected (Type I error)
Hypothesis false: accepted (Type II error)
The probability of committing a Type I error is the level
of significance (α)
The probability of committing a Type II error is denoted
beta (β)
T-test
Uses of Vital Statistics
For individuals and operating agencies
Research demographics and medical research
Population estimation and projection
Public administration
International Uses
Methods of Obtaining Vital Statistics
Registration Method
Census Enumeration
Analytical Method
Estimation of Vital rates using census data