
Regression Analysis

Contents

• What is Regression?
• Why Regression?
• Linear Regression
  – Linear regression algorithm using the least squares method
  – Evaluation of the method
• Multiple Linear Regression
• Logistic Regression
What is Regression?

• Regression is a supervised learning algorithm in machine learning terminology.
• It is an important tool in predictive analytics.
• Regression analysis is a predictive modeling technique that investigates the relationship between a dependent variable and one or more independent variables.
• It amounts to graphing a line over a set of data points that most closely fits the overall shape of the data.
• The regression shows how changes in the dependent variable on the Y axis relate to changes in the explanatory variable on the X axis.
What is Regression?

• Regression is a tool for finding the existence of an association relationship between a dependent variable (Y) and one or more independent variables (X1, X2, ..., Xn) in a study.
• The relationship can be linear or non-linear.
• A dependent variable (response variable) "measures an outcome of a study (also called outcome variable)".
• An independent variable (explanatory variable) "explains changes in a response variable".
Types of Regression

• Simple regression: one independent variable.
• Multiple regression: more than one independent variable.
Most Common Regression Algorithms

● Simple linear regression
● Multiple linear regression
● Polynomial regression
● Multivariate adaptive regression splines
● Logistic regression
● Maximum likelihood estimation (least squares)
Use Cases of Regression

• Predictive analytics
• Operational efficiency
• Supporting decisions
• Correcting errors
• New insights
• House price prediction
• Trend forecasting
  – e.g., what will be the price of gold in the next six months?
• Finding associations among attributes
  – e.g., mediclaim agencies: the effect of age on claims
Linear Regression

• Linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables (also known as the dependent and independent variables).
• The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression.
• In linear regression, the relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data.
• Linear regression models are often fitted using the least squares approach.
Simple Linear Regression

• One of the easiest algorithms in machine learning.
• Simple linear regression is a statistical model that attempts to show the relationship between two variables through a linear equation.
• Data is modeled using a straight line (Y = mX + c).
• It captures the correlation between the X and Y variables.
Simple Linear Regression: Understanding a Positive Relationship

Figure: a straight line Y = mX + c fitted over data points relating the speed of a vehicle to the distance travelled in a fixed duration of time; m is the slope of the line and c is its y-intercept.
Slopes of the Simple Linear Regression Model

The slope of the relationship can be linear positive, linear negative, curvilinear positive, or curvilinear negative.

Slope = (Y2 − Y1) / (X2 − X1) = ΔY / ΔX = change in Y / change in X

Example:
(X1, Y1) = (−3, −2) and (X2, Y2) = (2, 2)
Rise = (Y2 − Y1) = (2 − (−2)) = 2 + 2 = 4
Run = (X2 − X1) = (2 − (−3)) = 2 + 3 = 5
Slope = Rise/Run = 4/5 = 0.8
Relations in Regression

Four typical relationships: linear positive slope, linear negative slope, curvilinear positive slope, and curvilinear negative slope.

Simple Linear Regression:
Least Squares Method

• How do we find the best regression line y = mx + c?
• Our challenge is to determine the values of m and c that give the minimum error for the given dataset. We do this using the least squares method.
• Loss function:

L(m, c) = Σ (yi − (m·xi + c))²

• For minimum loss, we take the partial derivatives of L with respect to m and c, equate them to 0, and solve for m and c, as derived below.
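Setting both partial derivatives to zero gives two normal equations whose solution is the familiar closed form. A short reconstruction of that derivation in standard notation (x̄ and ȳ are the sample means):

$$
\frac{\partial L}{\partial m} = -2\sum_{i=1}^{n} x_i \left(y_i - m x_i - c\right) = 0, \qquad
\frac{\partial L}{\partial c} = -2\sum_{i=1}^{n} \left(y_i - m x_i - c\right) = 0
$$

Solving these simultaneously:

$$
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
c = \bar{y} - m\,\bar{x}
$$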
Simple Linear Regression:
Least Squares Method (Example)

• A method to predict the best-fit line.
Simple Linear Regression

• Measure of goodness of fit: the R² method. R² = 1 − SSE/SST measures the proportion of the variation in Y that is explained by the regression model.
OLS Algorithm

● Step 1: Calculate the mean of X and Y.
● Step 2: Calculate the deviations of X and Y from their means.
● Step 3: Get the product of the paired deviations.
● Step 4: Get the summation of the products.
● Step 5: Square the deviations of X.
● Step 6: Get the sum of the squared deviations.
● Step 7: Divide the output of step 4 by the output of step 6 to calculate the slope 'b'.
● Step 8: Calculate the intercept 'a' using the value of 'b' (a = Mean(Y) − b·Mean(X)).

A sketch implementing these steps follows.
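Below is a minimal sketch of the eight steps in plain Python. The dataset is hypothetical, chosen only for illustration:

```python
# A minimal sketch of the OLS steps above; the data is hypothetical.
X = [16, 18, 19, 20, 21, 22, 23]
Y = [40, 50, 54, 58, 60, 64, 68]

n = len(X)

# Step 1: means of X and Y
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Step 2: deviations of X and Y from their means
dev_x = [x - mean_x for x in X]
dev_y = [y - mean_y for y in Y]

# Steps 3-4: product of the paired deviations, then their sum
sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

# Steps 5-6: squared deviations of X, then their sum
sum_sq_x = sum(dx ** 2 for dx in dev_x)

# Step 7: slope b
b = sum_products / sum_sq_x

# Step 8: intercept a from b and the means
a = mean_y - b * mean_x

print(f"Best-fit line: Y = {a:.2f} + {b:.2f} X")
```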
Example of Simple Linear Regression

Calculation summary:
Sum of X = 299
Sum of Y = 852
Mean of X, Mx = 19.93
Mean of Y, My = 56.8
Error in Simple Regression

Y = (a + bX) + ε

A residual is the distance between the predicted point (on the regression line) and the actual point, as depicted in the scatter plot of the example data with the fitted regression line.

Sum of squares of residuals (SSE):

SSE = Σ (yi − ŷi)²
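A minimal sketch computing residuals, SSE, and R² for a fitted line; the data and the coefficients a and b are hypothetical:

```python
# Residuals, SSE, and R^2 for a fitted line y_hat = a + b*x.
# The data and coefficients are hypothetical, for illustration only.
X = [16, 18, 19, 20, 21, 22, 23]
Y = [40, 50, 54, 58, 60, 64, 68]
a, b = -22.0, 4.0  # hypothetical intercept and slope

y_hat = [a + b * x for x in X]                   # predictions on the line
residuals = [y - yh for y, yh in zip(Y, y_hat)]  # actual minus predicted

sse = sum(r ** 2 for r in residuals)             # sum of squared residuals
mean_y = sum(Y) / len(Y)
sst = sum((y - mean_y) ** 2 for y in Y)          # total sum of squares
r_squared = 1 - sse / sst                        # goodness of fit

print(f"SSE = {sse:.2f}, R^2 = {r_squared:.3f}")
```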
Multiple Linear Regression

• Two or more independent variables (predictors) are involved in the model.

• In the example of simple linear regression, we considered the Price of a Property as the dependent variable and the Area of the Property (in sq. m.) as the predictor variable.

• If we instead consider the Price of a Property (in $) as the dependent variable and the Area of the Property (in sq. m.), location, floor, number of years since purchase, and amenities available as the independent variables, we can form a multiple regression equation as shown below:

Price = a + b1·Area + b2·Location + b3·Floor + b4·Years + b5·Amenities
• The simple linear regression model and the multiple regression model assume that the dependent variable is continuous.

• The following expression describes the equation involving the relationship with two predictor variables, namely X1 and X2:

Ŷ = a + b1·X1 + b2·X2

• The model describes a plane in the three-dimensional space of Ŷ, X1, and X2.

• Parameter 'a' is the intercept of this plane. Parameters 'b1' and 'b2' are referred to as partial regression coefficients.

• Parameter b1 represents the change in the mean response corresponding to a unit change in X1 when X2 is held constant.

• Parameter b2 represents the change in the mean response corresponding to a unit change in X2 when X1 is held constant.
• Consider the following example of a multiple linear regression model with two predictor variables, namely X1 and X2, sketched below.
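A minimal sketch of fitting such a two-predictor model with NumPy's least squares solver; the data here is hypothetical:

```python
import numpy as np

# Hypothetical data: two predictors X1, X2 and a response Y.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = np.array([6.1, 6.9, 11.2, 11.8, 16.3, 16.9])

# Design matrix with a column of ones for the intercept 'a'.
A = np.column_stack([np.ones_like(X1), X1, X2])

# Least squares solution for [a, b1, b2].
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b1, b2 = coef

print(f"Y_hat = {a:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")
```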
The multiple regression estimating equation when there are 'n' predictor variables is as follows:

Ŷ = a + b1·X1 + b2·X2 + ... + bn·Xn

While finding the best fit, we can also fit a polynomial or a curvilinear function instead of a straight line; these are known as polynomial and curvilinear regression, respectively.
Assumptions in Regression Analysis

1. The dependent variable (Y) can be calculated/predicted as a linear function of a specific set of independent variables (X's) plus an error term (ε).
2. The number of observations (n) is greater than the number of parameters (k) to be estimated, i.e. n > k.
3. Relationships determined by regression are only relationships of association based on the data set, and not necessarily of cause and effect of the defined class.
4. The regression line is valid only over a limited range of data. If the line is extrapolated outside that range, it may only lead to wrong predictions.
5. If the business conditions change and the business assumptions underlying the regression model are no longer valid, then the past data set will no longer be able to predict future trends.
6. The variance of the error term is the same for all values of X (homoskedasticity).
7. The error term (ε) is normally distributed. This also means that the mean of the error (ε) has an expected value of 0.
8. The values of the error (ε) are independent and are not related to any values of X. This means that there are no relationships between a particular X, Y that are related to another specific value of X, Y.

Given the above assumptions, the OLS estimator is the Best Linear Unbiased Estimator (BLUE); this result is known as the Gauss–Markov theorem.
Main Problems in Regression Analysis

• Two primary problems: multicollinearity and heteroskedasticity.

Multicollinearity
• Two variables are perfectly collinear if there is an exact linear relationship between them.
• Multicollinearity is the situation in which there is not only correlation between the dependent variable and the independent variables, but also strong correlation within (among) the independent variables themselves.
• A multiple regression equation can still make good predictions when there is multicollinearity, but it is difficult for us to determine how the dependent variable will change if each independent variable is changed one at a time.
• When multicollinearity is present, it increases the standard errors of the coefficients.
• One way to gauge multicollinearity is to calculate the Variance Inflation Factor (VIF), which assesses how much the variance of an estimated regression coefficient increases if the predictors are correlated (see the sketch at the end of this section).
• If no factors are correlated, the VIFs will all be equal to 1.
• The assumption of no perfect collinearity states that there is no exact linear relationship among the independent variables.
• This assumption implies two aspects of the data on the independent variables.
• First, none of the independent variables, other than the variable associated with the intercept term, can be a constant.
• Second, variation in the X's is necessary.
• In general, the more variation in the independent variables, the better the OLS estimates will be in terms of identifying the impacts of the different independent variables on the dependent variable.
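A minimal sketch of computing VIFs with NumPy, using the standard definition VIF_j = 1 / (1 − R_j²), where R_j² is the R² from regressing predictor j on the remaining predictors; the data is hypothetical:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS regression of y on X (X includes an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def vif(predictors):
    """VIF_j = 1 / (1 - R_j^2), regressing each predictor on all the others."""
    n, k = predictors.shape
    out = []
    for j in range(k):
        y = predictors[:, j]
        others = np.delete(predictors, j, axis=1)
        X = np.column_stack([np.ones(n), others])
        out.append(1.0 / (1.0 - r_squared(X, y)))
    return out

# Hypothetical predictors; x2 is strongly correlated with x1, so both
# should show inflated VIFs, while x3 stays near 1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=100)
x3 = rng.normal(size=100)
print([round(v, 2) for v in vif(np.column_stack([x1, x2, x3]))])
```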

Heteroskedasticity
• Refers to changing variance of the error term.
• If the variance of the error term is not constant across observations, the model will make erroneous predictions.
• In general, for a regression equation to make accurate predictions, the error terms should be independent and identically (normally) distributed (i.i.d.).
Improving Accuracy of the Linear Regression Model

• Accuracy refers to how close the estimate is to the actual value, whereas prediction refers to the continuous estimation of the value.

High bias = low accuracy (not close to the real value)
High variance = low prediction (values are scattered)
Low bias = high accuracy (close to the real value)
Low variance = high prediction (values are close to each other)

• If we have a regression model that is both highly accurate and highly predictive, the overall error of the model will be low, implying low bias (high accuracy) and low variance (high prediction). This is highly preferable.
Improving Accuracy of the Linear Regression Model

The accuracy of linear regression can be improved using the following three methods:

1. Shrinkage approach (see the sketch below)
2. Subset selection
3. Dimensionality (variable) reduction
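A minimal sketch of the shrinkage approach using ridge regression (one common shrinkage method) via scikit-learn's Ridge estimator; the data is hypothetical:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical data: 100 samples, 5 predictors, noisy linear response.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X @ np.array([3.0, -2.0, 0.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=100)

# Ridge adds an L2 penalty (alpha * sum of squared coefficients) to the
# least squares loss, shrinking coefficients toward zero to reduce variance.
model = Ridge(alpha=1.0)
model.fit(X, y)

print("shrunken coefficients:", np.round(model.coef_, 2))
```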
Polynomial Regression Model

• An extension of the simple linear model, obtained by adding extra predictors formed by raising each of the original predictors to a power (e.g., squaring or cubing).
• This approach provides a simple way to yield a non-linear fit to data. For example, for degree 3:

Y = a + b1·X + b2·X² + b3·X³

• Let us fit a degree 3 polynomial to a sample data set of (X, Y) points.
• As you can observe, the regression line is slightly curved for a polynomial of degree 3 with the 15 data points.
• The regression line will curve further if we increase the polynomial degree.
• At extreme degrees, the regression line will overfit all of the original values of X, as illustrated below.
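A minimal sketch contrasting a degree 3 fit with an extreme-degree fit using numpy.polyfit; the 15 points are hypothetical:

```python
import numpy as np

# 15 hypothetical (X, Y) points following a noisy cubic trend.
rng = np.random.default_rng(7)
x = np.linspace(-3, 3, 15)
y = 0.5 * x**3 - x + rng.normal(scale=1.0, size=15)

# Degree 3: a gently curved fit.
p3 = np.polyfit(x, y, deg=3)

# Degree 14: with 15 points the curve can pass through every point,
# i.e. overfit. (NumPy may warn that this fit is poorly conditioned.)
p14 = np.polyfit(x, y, deg=14)

for name, p in [("degree 3", p3), ("degree 14", p14)]:
    sse = np.sum((np.polyval(p, x) - y) ** 2)
    print(f"{name}: SSE on training data = {sse:.4f}")
```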
What is Logistic Regression?

• Logistic regression is a classification algorithm.
• Logistic regression is all about predicting binary variables, not predicting continuous variables.
• Logistic regression models estimate how the probability of an event may be affected by one or more explanatory variables.
• Logistic regression is a technique used for predicting "class probability", that is, the probability that a case belongs to a particular class.
Use Cases of Logistic Regression

• Mail [Spam / Not Spam]
• Transaction [Fraudulent / Normal]
• Tumor [Malignant / Benign]
• Sentiment analysis [Positive / Negative]
• Weather prediction [Rain / No Rain]
• Medical diagnosis [Fit / Ill]
Linear and Logistic Regression

Whereas linear regression fits a straight line, logistic regression fits a logistic curve: a sigmoid (S-shaped) curve that maps any real-valued input to a probability between 0 and 1.
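A minimal sketch of the sigmoid, σ(z) = 1 / (1 + e^(−z)), which maps any real input into (0, 1):

```python
import math

def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps any real z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# The S-shape: large negative z -> near 0, zero -> 0.5, large positive z -> near 1.
for z in [-6, -2, 0, 2, 6]:
    print(f"sigmoid({z:+d}) = {sigmoid(z):.4f}")
```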
Some Fundamental Terms of Logistic Regression

• The probability that an event will occur is the fraction of times you expect to see that event in many trials. If the probability of an event occurring is Y, then the probability of the event not occurring is 1 − Y. Probabilities always range between 0 and 1.
• The odds are defined as the probability that the event will occur divided by the probability that the event will not occur. Unlike probability, the odds are not constrained to lie between 0 and 1, but can take any value from zero to infinity.
• If the probability of success is P, then the odds of that event are:

odds = P / (1 − P)

• The logit function is the logarithmic transformation of the logistic function. It is defined as the natural logarithm of the odds:

logit(P) = ln(P / (1 − P))
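The logit and the logistic (sigmoid) function are inverses of each other, so the model can be written in either form; a short reconstruction in standard notation:

$$
\text{logit}(p) = \ln\frac{p}{1-p} = z
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-z}} = \frac{e^{z}}{1 + e^{z}}
$$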
Math behind Logistic Regression

In standard form, logistic regression models the log-odds of the positive class as a linear function of the input:

ln(P / (1 − P)) = a + bX

Solving for P gives the class probability:

P = e^(a + bX) / (1 + e^(a + bX)) = 1 / (1 + e^(−(a + bX)))
• Let us say we have a model that can predict whether a person is male or female on the basis of their height.
• Given a height of 150 cm, we need to predict whether the person is male or female.
• We know that the coefficients are a = −100 and b = 0.6.
• Using the above equation, we can calculate the probability of male given a height of 150 cm, or more formally P(male | height = 150):

a + b·height = −100 + 0.6 × 150 = −10
P(male | height = 150) = 1 / (1 + e^10) ≈ 0.0000454

or a probability of near zero that the person is a male.
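A minimal sketch checking this arithmetic in Python:

```python
import math

# Coefficients from the example: a = -100, b = 0.6.
a, b = -100.0, 0.6
height = 150.0

z = a + b * height                    # linear part: -100 + 0.6*150 = -10
p_male = 1.0 / (1.0 + math.exp(-z))   # sigmoid of z

print(f"P(male | height=150) = {p_male:.7f}")  # ~0.0000454, near zero
```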


Linear vs Logistic Regression

Basis | Linear Regression | Logistic Regression
Core concept (modeling of data) | Data is modeled using a straight line. | Data is modeled using a logistic (sigmoid) function.
Used with | Continuous variable | Categorical variable
Output/prediction | Value of the variable | Probability of occurrence of an event
Problem solved | Regression | Classification
Accuracy (goodness of fit) | Loss, R², adjusted R², etc. | Accuracy, precision, recall, F1 score, ROC curve, confusion matrix, etc.

• The basic difference is the type of function used for mapping:
  – Linear: continuous X → continuous Y
  – Logistic: continuous X → binary Y, used for deciding category or true/false decisions about the data
Parameter Estimation by
Maximum Likelihood Method

● The coefficients in a logistic regression are estimated using a process called Maximum Likelihood Estimation (MLE).

● Likelihood function (the probability of the observed labels y1, ..., yn given the predicted probabilities p1, ..., pn):

L(β0, β1) = ∏ pi^yi · (1 − pi)^(1 − yi)
Parameter Estimation by
Maximum Likelihood Method

• The probability density function for binary logistic regression is given by:

P(Y = yi) = pi^yi · (1 − pi)^(1 − yi),  where pi = 1 / (1 + e^(−(β0 + β1·xi)))
Parameter Estimation by
Maximum Likelihood Method

Setting the derivatives of the log-likelihood with respect to β0 and β1 to zero gives the score equations:

Σ (yi − pi) = 0  and  Σ xi·(yi − pi) = 0

The above system of equations is solved iteratively to estimate β0 and β1, as sketched below.
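A minimal sketch of one such iterative scheme, plain gradient ascent on the log-likelihood (textbook treatments often use Newton–Raphson instead); the data is hypothetical:

```python
import math

# Hypothetical 1-D data: label 1 becomes likelier as x grows.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0,   0,   0,   1,   0,   1,   1,   1]

b0, b1 = 0.0, 0.0   # initial coefficients
lr = 0.01           # learning rate (step size)

for _ in range(20000):
    # Gradient of the log-likelihood: sum(y_i - p_i) and sum(x_i * (y_i - p_i)),
    # i.e. the left-hand sides of the score equations above.
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += y - p
        g1 += x * (y - p)
    b0 += lr * g0   # ascend the log-likelihood
    b1 += lr * g1

print(f"beta0 = {b0:.3f}, beta1 = {b1:.3f}")
```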
References

• Coursera tutorial: Linear Regression, Logistic Regression
• SimpliLearn tutorial: Logistic Regression
• Wikipedia: Linear regression
• Business Analytics, by U. Dinesh Kumar

Thank You
