Unit 2 Regression
By
Prof. R. B. Darade
• UNIT-II
• Regression:
• Curve fitting by the method of least squares, fitting the lines y = a + bx and x = a + by, Multiple regression, Standard error of regression – Pharmaceutical examples
• Probability:
• Definition of probability, Binomial distribution, Normal distribution, Poisson's distribution, properties – problems, Sample, Population, Large sample, Small sample, Null hypothesis, Alternative hypothesis, Sampling, Essence of sampling, Types of sampling, Type-I error, Type-II error, Standard error of mean (SEM) – Pharmaceutical examples
• Parametric tests:
• t-test (Sample, Pooled or Unpaired, and Paired), ANOVA (One-way and Two-way), Least Significant Difference
WHAT IS REGRESSION ANALYSIS?
Ref: https://www.geeksforgeeks.org/least-square-method/
https://www.cuemath.com/data/least-squares/
Regression analysis is a statistical technique for modelling the relationship between a dependent (outcome) variable and one or more independent (predictor) variables, so that the outcome can be estimated or predicted from the predictors.
The sum of squares measures how widely a set of data points is spread out from the mean; it is also known as variation. To calculate it, subtract the mean from each data point, square each difference, and add the squared differences together. Equivalently, for a fitted line, square the vertical distance between each data point and the line of best fit, then add these together; the line of best fit is the line that minimizes this value. A higher sum of squares indicates higher variability, while a lower result indicates low variability about the mean.
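The calculation described above can be sketched in a few lines of Python; the data values below are made up purely for illustration:

```python
# Sum of squares: total squared deviation of each data point from the mean.
# The data set here is illustrative only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(data) / len(data)                 # arithmetic mean of the data
ss = sum((x - mean) ** 2 for x in data)      # sum of squared deviations
```

For this data set the mean is 5.0 and the sum of squares is 32.0; a more tightly clustered data set would give a smaller value.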
Formula for Least Square Method
The Least Square Method formula is used to find the best-fitting line through a set of data points. For simple linear regression, the line has the form y = mx + c, where y is the dependent variable, x is the independent variable, m is the slope of the line, and c is the y-intercept. The slope (m) and intercept (c) are calculated from the following equations:
1. Slope (m): m = [n(∑xy) − (∑x)(∑y)] / [n(∑x²) − (∑x)²]
2. Intercept (c): c = [(∑y) − m(∑x)] / n
Where:
• n is the number of data points.
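These slope and intercept formulas translate directly into code; the x and y values below are illustrative only:

```python
# Least-squares slope (m) and intercept (c) for y = mx + c,
# computed from the summation formulas above. Data are illustrative.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
c = (sum_y - m * sum_x) / n                                   # intercept
```

For this data set the fitted line is y = 0.6x + 2.2.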
Types of Regression Analysis
• Simple Linear Regression – studies the relationship between two variables (predictor and outcome).
• Multiple Linear Regression – captures the impact of all predictor variables at once.
• Polynomial Regression – finds and represents complex patterns and non-linear relationships.
• Ridge Regression – used in cases with high correlation between variables; can also be used as a regularization method for accuracy.
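A minimal sketch of multiple linear regression, fitting y = b0 + b1·x1 + b2·x2 by least squares with NumPy; the data are synthetic, generated to follow y = 1 + 2·x1 + 3·x2 exactly so the recovered coefficients can be checked:

```python
import numpy as np

# Multiple linear regression via least squares.
# Synthetic data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = 1 + 2 * x1 + 3 * x2

# Design matrix: a column of ones for the intercept, then each predictor.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # [b0, b1, b2]
```

Because the synthetic data contain no noise, the fitted coefficients recover [1, 2, 3]; with real pharmaceutical data the coefficients would be estimates with associated standard errors.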
The standard error of the regression (S) and R-squared are two key goodness-of-fit measures for regression analysis. While R-squared is the most well-known of the goodness-of-fit statistics, I think it is a bit over-hyped. The standard error of the regression is also known as the residual standard error: it represents the average distance, in the units of the dependent variable, that the observed values fall from the regression line.
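For a simple linear regression, S can be computed as the square root of the residual sum of squares divided by the degrees of freedom (n − 2, since two parameters are estimated). A sketch, with illustrative data only:

```python
import math

# Standard error of the regression: S = sqrt(SSE / (n - 2)).
# x, y data are illustrative only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
m = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)          # least-squares slope
c = my - m * mx                              # intercept

# SSE: sum of squared residuals about the fitted line.
sse = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))                 # standard error of the regression
```

A smaller S means the observations sit closer to the fitted line, in the same units as y, which often makes it easier to interpret than R-squared.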
References:
https://www.datamation.com/big-data/what-is-regression-analysis/