ML Module 3: Regression
Contents
• What is Regression?
• Why Regression?
• Linear Regression
– Linear Regression algorithm using the least squares method
– Evaluation of the method
• Multiple Linear Regression
• Logistic Regression
What is Regression?
Use cases of Regression
• Predictive analytics
• Operational efficiency
• Supporting decisions
• Correcting errors
• New insights
Simple Linear Regression: Understanding
Figure: a positive (+ve) relationship between Speed of Vehicle (independent variable) and Distance travelled in a fixed duration of time.
Slope of the Simple Linear Regression Model
Example:
(X1, Y1) = (−3, −2) and (X2, Y2) = (2, 2)
Rise = (Y2 − Y1) = (2 − (−2)) = 2 + 2 = 4
Run = (X2 − X1) = (2 − (−3)) = 2 + 3 = 5
Slope = Rise/Run = 4/5 = 0.8
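The rise-over-run calculation above can be sketched directly in Python:

```python
# Slope of the line through two points, exactly as in the worked example:
# (X1, Y1) = (-3, -2) and (X2, Y2) = (2, 2).
def slope(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1  # 2 - (-2) = 4
    run = x2 - x1   # 2 - (-3) = 5
    return rise / run

print(slope((-3, -2), (2, 2)))  # 0.8
```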
Simple Linear Regression: Least Squares Method (Example)
Simple Linear Regression
• Measure of goodness of fit: the R² method
OLS algorithm
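A minimal sketch of the OLS closed-form fit together with the R² goodness-of-fit measure; the data points here are made up for illustration, not the slide's data set:

```python
# OLS for simple linear regression:
# b = sum((x - mean_x)*(y - mean_y)) / sum((x - mean_x)**2),  a = mean_y - b*mean_x
def ols_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def r_squared(xs, ys, a, b):
    my = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # residual sum of squares
    sst = sum((y - my) ** 2 for y in ys)                       # total sum of squares
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]            # illustrative data
ys = [2.1, 4.1, 5.9, 8.2, 9.9]
a, b = ols_fit(xs, ys)
print(a, b, r_squared(xs, ys, a, b))
```

Because the points lie almost exactly on a line, R² comes out close to 1.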
Example of Simple Linear Regression
Calculation summary (n = 15 data points):
Sum of X = 299
Sum of Y = 852
Mean of X, M_X = 299/15 ≈ 19.93
Mean of Y, M_Y = 852/15 = 56.8
Error in Simple Regression
Y = (a + bX) + ε
SSE = Σ (Yᵢ − Ŷᵢ)² = Σ (Yᵢ − (a + bXᵢ))²
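A small sketch of the SSE computation for a fitted line; the data points and the coefficients a = 1.0, b = 2.0 are hypothetical:

```python
# SSE = sum of squared residuals e_i = y_i - (a + b*x_i) for a fitted line.
# The coefficients a = 1.0, b = 2.0 are hypothetical, for illustration only.
def sse(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

print(sse([0, 1, 2], [1.0, 3.5, 4.8], a=1.0, b=2.0))  # 0**2 + 0.5**2 + (-0.2)**2
```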
• The simple linear regression model and the multiple regression model assume that the dependent variable is continuous.
• Parameter ‘a’ is the intercept of this plane. Parameters ‘b1’ and ‘b2’ are referred to as partial regression coefficients.
The multiple regression estimating equation with ‘n’ predictor variables is as follows:
Y = a + b1X1 + b2X2 + … + bnXn + ε
While finding the best-fit line, we can also fit a polynomial or a curve through the data; these are known as polynomial and curvilinear regression, respectively.
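Assuming the standard estimating equation Y = a + b1X1 + b2X2 + ε, here is a sketch of a two-predictor fit with numpy's least-squares solver; the data are synthetic and chosen to lie exactly on a plane:

```python
# Multiple regression Y = a + b1*X1 + b2*X2 + e, solved by least squares.
import numpy as np

X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y = 1.0 + 2.0 * X1 + 0.5 * X2  # exact plane, so the fit recovers a, b1, b2

A = np.column_stack([np.ones_like(X1), X1, X2])  # design matrix [1, X1, X2]
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(coef)  # approximately [1.0, 2.0, 0.5]
```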
Assumptions in Regression Analysis
5. If the business conditions change and the business assumptions
underlying the regression model are no longer valid, then the past
data set will no longer be able to predict future trends.
6. Variance is the same for all values of X (homoskedasticity).
7. The error term (ε) is normally distributed. This also means that the
mean of the error (ε) has an expected value of 0.
8. The values of the error (ε) are independent and are not related to any values of X. This means that the error associated with one particular (X, Y) observation is not related to the error associated with any other (X, Y) observation.
Given the above assumptions, the OLS estimator is the Best Linear Unbiased Estimator (BLUE); this result is known as the Gauss-Markov Theorem.
Main Problems in Regression Analysis
• First, none of the independent variables, other than the variable
associated with the intercept term, can be a constant.
• Second, variation in the X’s is necessary.
• In general, the more variation in the independent variables, the
better will be the OLS estimates in terms of identifying the impacts
of the different independent variables on the dependent variable.
Heteroskedasticity
• Refers to the changing variance of the error term.
• If the variance of the error term is not constant across observations, predictions will be erroneous.
• In general, for a regression equation to make accurate predictions, the error terms should be independent and identically (normally) distributed (i.i.d.).
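A sketch of how heteroskedasticity shows up in residuals, using simulated data (numpy assumed available) whose error spread grows with X:

```python
# Residual spread that grows with X signals heteroskedasticity.
# Simulated data: the error standard deviation is proportional to x.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)
y = 3.0 + 2.0 * x + rng.normal(0, 0.1 * x)  # error std grows with x

# OLS fit and residuals
b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)

# Compare residual variance on the lower and upper halves of x
low, high = resid[:100], resid[100:]
print(low.var(), high.var())  # upper-half variance is clearly larger
```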
Improving Accuracy of the Linear Regression Model
1. Shrinkage Approach
2. Subset Selection
3. Dimensionality (Variable) Reduction
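As an illustration of the shrinkage approach, here is a sketch of ridge regression in closed form; the data and the penalty value `lam` are made up:

```python
# Ridge regression (a shrinkage method): coefficients solve
# (X^T X + lam*I) w = X^T y, shrinking w toward zero as lam grows.
import numpy as np

def ridge_fit(X, y, lam):
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([5.0, 4.0, 11.0, 10.0])     # exactly y = 1*x1 + 2*x2
w_ols = ridge_fit(X, y, lam=0.0)         # plain least squares
w_ridge = ridge_fit(X, y, lam=10.0)      # shrunk coefficients
print(w_ols, w_ridge)
```

With `lam = 0` this reduces to ordinary least squares; a positive `lam` trades a little bias for lower variance.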
Polynomial Regression Model
• Let us use a data set of (X, Y) points to fit a degree-3 polynomial.
• As you can observe, the regression line is slightly curved for a degree-3 polynomial fitted to the above 15 data points.
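Since the slide's data set is not reproduced here, this sketch fits a degree-3 polynomial to 15 synthetic points instead (numpy assumed available):

```python
# Degree-3 polynomial regression on 15 synthetic (X, Y) points.
import numpy as np

x = np.linspace(-2, 2, 15)
y = x**3 - 2 * x + 1                  # noiseless cubic, so the fit recovers it

coeffs = np.polyfit(x, y, deg=3)      # coefficients, highest power first
y_hat = np.polyval(coeffs, x)         # fitted (curved) regression line
print(coeffs)  # approximately [1, 0, -2, 1]
```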
What is Logistic Regression?
Use cases of Logistic Regression
Linear and Logistic Regression
Figure: the logistic curve, a sigmoid (S-shaped) curve.
Logistic Regression Curve
Some fundamental terms of Logistic Regression
• The probability that an event will occur is the fraction of times you expect to see that event in many trials. If the probability of an event occurring is Y, then the probability of the event not occurring is 1 − Y. Probabilities always range between 0 and 1.
• The odds are defined as the probability that the event will occur divided by the probability that the event will not occur. Unlike probability, the odds are not constrained to lie between 0 and 1 but can take any value from zero to infinity.
• If the probability of success is P, then the odds of that event are:
Odds = P / (1 − P)
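The probability-to-odds conversion above as a one-liner:

```python
# Odds = P / (1 - P): the probability of the event divided by the
# probability of the non-event.
def odds(p):
    return p / (1 - p)

print(odds(0.8))  # an event with 80% probability has odds of about 4 to 1
print(odds(0.5))  # even odds: 1.0
```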
Math behind Logistic Regression
• Let us say we have a model that can predict whether a person is male or female on the basis of their height.
• Given a height of 150 cm, we need to predict whether the person is male or female.
• We know that the coefficients are a = −100 and b = 0.6.
• Using the logistic equation, we can calculate the probability of male given a height of 150 cm, or more formally P(male | height = 150).
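The worked example above, computed with the logistic (sigmoid) function:

```python
# P(male | height) = 1 / (1 + e^-(a + b*height)), with a = -100, b = 0.6.
import math

def predict_prob(a, b, x):
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

p_male = predict_prob(a=-100, b=0.6, x=150)  # a + b*x = -100 + 90 = -10
print(p_male)  # ~4.54e-05, so at 150 cm the model predicts female
```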
● The coefficients in a logistic regression are estimated using a process called Maximum Likelihood Estimation (MLE).
● Likelihood function, for observations (Xᵢ, Yᵢ) with Yᵢ ∈ {0, 1} and pᵢ = 1/(1 + e^−(a + bXᵢ)):
L(a, b) = Π pᵢ^Yᵢ (1 − pᵢ)^(1 − Yᵢ)
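A sketch of MLE by plain gradient ascent on the log of the likelihood above; the six data points and the learning rate are made up for illustration:

```python
# Maximum likelihood estimation for logistic regression via gradient ascent.
# The gradient of the log-likelihood is sum((y_i - p_i) * [1, x_i]).
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]  # labels overlapping around x = 3.5 (synthetic)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

a, b = 0.0, 0.0
lr = 0.05  # illustrative learning rate
for _ in range(5000):
    ga = sum(y - sigmoid(a + b * x) for x, y in zip(xs, ys))
    gb = sum((y - sigmoid(a + b * x)) * x for x, y in zip(xs, ys))
    a, b = a + lr * ga, b + lr * gb

print(a, b)  # decision boundary a + b*x = 0 sits near x = 3.5
```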
Parameter Estimation by Maximum Likelihood Method