SMDS-unit-3


UNIT-3

Linear & Multiple Regression


Syllabus:
Linear Regression:
Simple linear regression
Multiple linear regression
Curvilinear regression
Exponential model
Polynomial model
Power model
Practical examples: the nature of the relationship
Multiple Linear Regression:
Important measurements of the regression estimate
Multiple regression with categorical explanatory variables
Inference in multiple regression
Variable selection
Important Measurements of the Regression Estimate:
Categorical Regression with Dummy Variables:
Explanation:
Categorical regression is used when one or more
independent variables in the regression model are categorical
(like Gender, Color, Education Level, etc.).
Since regression models work only with numerical data, we
convert categorical variables into dummy variables.

Dummy Variables:
Dummy variables are numerical values (0 or 1) assigned to
represent categories.
Example:
Gender →
Male = 1, Female = 0

Steps to Perform Categorical Regression:


1. Identify the categorical variable.
2. Assign dummy variables:
 If the category has 2 options (Binary Variable):
Use 1 dummy variable.
 If the category has N options (More than 2 categories):
Use N-1 dummy variables.
3. Build the regression model using dummy variables.
Step 1: Assign Dummy Variables
Let:
Male = 1
Female = 0
Step 2: Regression Model Equation
Model:
Salary = b0 + b1(Experience) + b2(Gender)
Step 3: Apply Regression
After solving, the final equation becomes:
Salary = 15 + 5(Experience) + 3(Gender)

How to Interpret:

If Male: Gender = 1
Equation:
Salary = 15 + 5(Experience) + 3(1) = 18 + 5(Experience)

If Female: Gender = 0
Equation:
Salary = 15 + 5(Experience) + 3(0) = 15 + 5(Experience)

So the coefficient 3 means that, for the same experience, a male employee's predicted salary is 3 units higher than a female employee's.

Conclusion:
Dummy variables help include categorical data in regression
models.
Number of dummy variables = Number of categories - 1.
Always exclude one category to avoid Dummy Variable Trap
(Multicollinearity).
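The steps above can be sketched in Python. This is a hedged illustration, not part of the original notes: the data values are invented so that they follow the Salary = 15 + 5(Experience) + 3(Gender) equation exactly, and `pandas.get_dummies` with `drop_first=True` performs the N−1 dummy coding that avoids the dummy variable trap.

```python
# Sketch: fitting Salary ~ Experience + Gender with a dummy-coded Gender.
# The data values are invented to match Salary = 15 + 5*Exp + 3*Male exactly.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "Experience": [1, 2, 3, 4, 5, 6],
    "Gender": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "Salary": [23, 25, 33, 35, 43, 45],
})

# drop_first=True keeps N-1 dummies (drops Gender_Female), avoiding the trap
X = pd.get_dummies(df[["Experience", "Gender"]], drop_first=True)
model = LinearRegression().fit(X, df["Salary"])

print(X.columns.tolist())                 # ['Experience', 'Gender_Male']
print(model.intercept_, model.coef_)      # recovers 15, [5, 3]
```

Because the toy data lie exactly on the plane, the fit recovers the intercept 15 and the coefficients 5 (Experience) and 3 (Gender) from the worked example.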
Variable Selection Methods:
Variable selection methods are used to find the best predictors for a regression model, i.e., to choose the most important predictors in a multiple regression model.

1.Forward Selection:
 Start with no variables in the model.
 Add the most significant variable (lowest p-value).
 Keep adding variables one by one until no significant
improvement occurs.
✅ Best for: When you want a simple and efficient model.
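A hedged sketch of forward selection using scikit-learn's `SequentialFeatureSelector`. Note it scores candidates by cross-validated R² rather than the p-values described above, and the synthetic data are invented for illustration.

```python
# Forward selection sketch: start empty, greedily add the feature that
# most improves cross-validated R^2, until 2 features are selected.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 4 * X[:, 0] + 2 * X[:, 2] + rng.normal(scale=0.5, size=200)  # only x0, x2 matter

sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2, direction="forward"
)
sfs.fit(X, y)
print(np.flatnonzero(sfs.get_support()))  # indices of the selected columns
```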

2. Backward Elimination:
 Start with all variables in the model.
 Remove the least significant variable (highest p-value).
 Keep removing variables one by one until all remaining
variables are significant.
✅ Best for: When you start with a full model and want to
simplify it.
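Backward elimination can be sketched the same way with `direction="backward"`: the selector starts from the full model and repeatedly drops the weakest feature (again scored by cross-validated R², not p-values; the data are invented).

```python
# Backward elimination sketch: start with all 4 features, drop the least
# useful one at each step until 2 remain.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 1] + 5 * X[:, 3] + rng.normal(scale=0.5, size=200)  # x0, x2 are noise

sfs = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=2, direction="backward"
)
sfs.fit(X, y)
print(np.flatnonzero(sfs.get_support()))  # the two informative columns survive
```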
3. Stepwise Regression:
 A mix of Forward Selection and Backward Elimination.
 Adds a variable if it is significant but removes it later if it
becomes insignificant.
✅ Best for: When you need a balance between forward and
backward methods.

4. Best Subset Selection:


 Tries all possible combinations of variables.
 Selects the best combination based on metrics like
Adjusted R², AIC, or BIC.
✅ Best for: When computational power is not a concern.
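Best-subset selection can be sketched by brute force with `itertools.combinations`, scoring each subset by adjusted R² (one of the metrics named above; the data and sizes are invented):

```python
# Best-subset sketch: try every combination of the 4 features and keep
# the one with the highest adjusted R^2.
from itertools import combinations
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 150, 4
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - 3 * X[:, 2] + rng.normal(scale=0.5, size=n)

best_subset, best_adj_r2 = None, -np.inf
for k in range(1, p + 1):
    for cols in combinations(range(p), k):
        r2 = LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)
        adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalises extra terms
        if adj_r2 > best_adj_r2:
            best_subset, best_adj_r2 = cols, adj_r2

print(best_subset)  # includes the informative columns 0 and 2
```

The exhaustive loop is why this method needs computational power: with p features there are 2^p − 1 subsets to fit.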

5. LASSO (Least Absolute Shrinkage and Selection Operator):


 Uses penalization to shrink coefficients of less important
variables to zero.
 Helps in automatic variable selection and reducing
overfitting.
✅ Best for: When dealing with high-dimensional data (many
variables).
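A hedged sketch of LASSO's automatic selection with scikit-learn (the `alpha` penalty strength and the toy data are arbitrary choices, not values from the notes):

```python
# LASSO sketch: the L1 penalty drives the coefficients of the two
# pure-noise predictors exactly to zero, keeping only the real one.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 3))
y = 5 * X[:, 0] + rng.normal(scale=0.5, size=300)  # x1, x2 are pure noise

lasso = Lasso(alpha=0.5).fit(X, y)
print(lasso.coef_)  # noise coefficients are exactly 0.0
```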
6. Ridge Regression:
 Similar to LASSO but does not set coefficients to zero.
 Reduces the impact of less important variables instead
of removing them.
✅ Best for: When all variables contribute but need
regularization.
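The contrast with LASSO can be sketched as follows: Ridge shrinks every coefficient toward zero but leaves none exactly at zero (invented toy data, arbitrary `alpha`):

```python
# Ridge sketch: compare OLS and Ridge coefficients on the same data.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 3))
y = 5 * X[:, 0] + rng.normal(scale=0.5, size=300)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=50.0).fit(X, y)
print(ols.coef_)
print(ridge.coef_)  # smaller in magnitude, but none set exactly to zero
```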

7. Elastic Net:
 A combination of LASSO and Ridge Regression.
 Balances between shrinking coefficients (like Ridge) and
eliminating them (like LASSO).
✅ Best for: When correlated predictors exist.
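A hedged sketch of Elastic Net on correlated predictors, its typical use case. `l1_ratio` mixes the LASSO (L1) and Ridge (L2) penalties; the data and penalty values here are invented:

```python
# Elastic Net sketch: two nearly identical predictors share the signal's
# weight instead of one being arbitrarily dropped, while the unrelated
# noise column is shrunk to (near) zero.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(3)
x = rng.normal(size=400)
X = np.column_stack([x, x + 0.05 * rng.normal(size=400), rng.normal(size=400)])
y = 3 * x + rng.normal(scale=0.5, size=400)

enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(enet.coef_)  # first two coefficients split the weight of 3
```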
Example Problems:
Curvilinear Regression:
Exponential Regression:
Polynomial Regression:
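The worked examples for these curvilinear models do not survive in this copy of the notes, so here is a hedged sketch of the two named fits: a polynomial model via `numpy.polyfit`, and an exponential model y = a·e^(bx) linearised by regressing ln(y) on x. All data points are invented and lie exactly on the stated curves.

```python
# Curvilinear regression sketch with invented, noise-free data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Polynomial regression: fit y = b2*x^2 + b1*x + b0; data follow 2x^2 + 3x + 1
y_poly = 2 * x**2 + 3 * x + 1
b2, b1, b0 = np.polyfit(x, y_poly, deg=2)

# Exponential regression: y = a*e^(b*x); taking logs gives the straight
# line ln(y) = ln(a) + b*x, fit with ordinary linear regression
y_exp = 4.0 * np.exp(0.5 * x)
b, log_a = np.polyfit(x, np.log(y_exp), deg=1)
a = np.exp(log_a)

print(b2, b1, b0)  # recovers 2, 3, 1
print(a, b)        # recovers 4.0, 0.5
```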
