SMDS-unit-3
Dummy Variables:
Dummy variables are numerical values (0 or 1) assigned to
represent categories.
Example:
Gender →
Male = 1, Female = 0
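How to Interpret:
General equation:
Salary = 15 + 5(Experience) + 3(Gender)
If Male: Gender = 1
Equation:
Salary = 15 + 5(Experience) + 3(1) = 18 + 5(Experience)
If Female: Gender = 0
Equation:
Salary = 15 + 5(Experience) + 3(0) = 15 + 5(Experience)
The dummy coefficient (3) is the predicted salary difference
between Male and Female at the same experience.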
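The two equations can be checked with a small calculation. This is only a sketch: the coefficients 15, 5 and 3 come from the example equation, while the function name and the experience value of 4 are assumptions chosen for illustration.

```python
def predicted_salary(experience, gender):
    # Coefficients from the example equation: intercept 15, slope 5, dummy effect 3.
    # gender is the dummy variable: 1 = Male, 0 = Female.
    return 15 + 5 * experience + 3 * gender

# Hypothetical input: 4 units of experience for both cases
male = predicted_salary(experience=4, gender=1)
female = predicted_salary(experience=4, gender=0)

print(male, female, male - female)  # prints: 38 35 3
```

The gap between the two predictions equals the dummy coefficient, whatever experience value is used.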
Conclusion:
Dummy variables help include categorical data in regression
models.
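Number of dummy variables = Number of categories - 1.
Always exclude one category (the baseline) to avoid the Dummy
Variable Trap (perfect multicollinearity): if every category
gets a dummy, the dummies always sum to 1, which duplicates
the intercept term.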
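The "categories - 1" rule can be sketched in plain Python. The function name, the city data and the choice of baseline below are hypothetical and used only to illustrate the idea.

```python
def dummy_encode(values, baseline):
    """Encode a categorical column as 0/1 dummies, dropping the baseline category."""
    categories = sorted(set(values))
    if baseline not in categories:
        raise ValueError("baseline must be one of the observed categories")
    kept = [c for c in categories if c != baseline]  # k - 1 dummy columns
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows

# Hypothetical column with 3 categories, so 2 dummy variables are created
cities = ["Delhi", "Mumbai", "Pune", "Mumbai"]
columns, rows = dummy_encode(cities, baseline="Delhi")

print(columns)  # ['Mumbai', 'Pune']
print(rows)     # [[0, 0], [1, 0], [0, 1], [1, 0]]
```

A row of all zeros represents the excluded (baseline) category, so no information is lost by dropping it.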
Variable Selection Methods:
Variable selection methods are used to choose the most
important predictors for a multiple regression model.
1. Forward Selection:
Start with no variables in the model.
Add the most significant variable (lowest p-value).
Keep adding variables one by one until no significant
improvement occurs.
✅ Best for: When you want a simple and efficient model.
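The forward selection steps can be sketched with NumPy as follows. Assumptions: the candidate with the largest |t| is taken as the one with the lowest p-value (equivalent here, since all candidate models at a step have the same degrees of freedom), and the cut-off |t| >= 2 is a rough stand-in for p < 0.05. The function names and the simulated data are illustrative only.

```python
import numpy as np

def t_stats(X, y):
    """t-statistics of OLS coefficients; index 0 is the intercept."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta / se

def forward_selection(X, y, t_enter=2.0):
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        # |t| of each candidate when added to the current model
        scores = {j: abs(t_stats(X[:, selected + [j]], y)[-1]) for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < t_enter:   # assumed threshold, roughly p < 0.05
            break                    # no significant improvement: stop
        selected.append(best)
        remaining.remove(best)
    return selected

# Simulated data (assumption): only columns 0 and 2 truly affect y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.5, size=200)

selected = forward_selection(X, y)
print(selected)  # column indices in the order they were added
```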
2. Backward Elimination:
Start with all variables in the model.
Remove the least significant variable (highest p-value).
Keep removing variables one by one until all remaining
variables are significant.
✅ Best for: When you start with a full model and want to
simplify it.
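A minimal NumPy sketch of backward elimination is given below. Assumptions: within one fitted model the smallest |t| corresponds to the highest p-value, and |t| >= 2 is used as an approximate significance cut-off (about p < 0.05). Function names and data are illustrative only.

```python
import numpy as np

def t_stats(X, y):
    """t-statistics of OLS coefficients; index 0 is the intercept."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])     # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta / se

def backward_elimination(X, y, t_stay=2.0):
    kept = list(range(X.shape[1]))                # start with all variables
    while kept:
        t = np.abs(t_stats(X[:, kept], y)[1:])    # skip the intercept
        worst = int(np.argmin(t))                 # least significant variable
        if t[worst] >= t_stay:                    # all remaining are significant
            break
        kept.pop(worst)
    return kept

# Simulated data (assumption): only columns 0 and 2 truly affect y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(scale=0.5, size=200)

kept = backward_elimination(X, y)
print(kept)  # column indices that remain in the model
```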
3. Stepwise Regression:
A mix of Forward Selection and Backward Elimination.
Adds a variable if it is significant but removes it later if it
becomes insignificant.
✅ Best for: When you need a balance between forward and
backward methods.
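Stepwise regression can be sketched by alternating one forward step and one backward check. Assumptions: |t| thresholds stand in for p-value thresholds, and the entry threshold is kept at or above the removal threshold so that a variable is not added and dropped in a loop. The data include a hypothetical "proxy" column (column 3) that is correlated with the true predictors; whether it enters and is later removed depends on the random draw.

```python
import numpy as np

def t_stats(X, y):
    """t-statistics of OLS coefficients; index 0 is the intercept."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta / se

def stepwise(X, y, t_enter=2.0, t_remove=2.0, max_steps=100):
    selected = []
    for _ in range(max_steps):                    # guard against endless cycling
        changed = False
        # Forward step: add the most significant remaining variable
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        scores = {j: abs(t_stats(X[:, selected + [j]], y)[-1]) for j in remaining}
        if scores:
            best = max(scores, key=scores.get)
            if scores[best] >= t_enter:
                selected.append(best)
                changed = True
        # Backward step: drop a variable that has become insignificant
        if selected:
            t = np.abs(t_stats(X[:, selected], y)[1:])
            worst = int(np.argmin(t))
            if t[worst] < t_remove:
                selected.pop(worst)
                changed = True
        if not changed:
            break
    return selected

# Simulated data (assumption): y depends on columns 0 and 2 only
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 4))
X[:, 3] = X[:, 0] + X[:, 2] + rng.normal(size=n)  # proxy column
y = X[:, 0] + X[:, 2] + rng.normal(scale=0.5, size=n)

chosen = stepwise(X, y)
print(sorted(chosen))
```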
7. Elastic Net:
A combination of LASSO and Ridge Regression.
Balances between shrinking coefficients (like Ridge) and
eliminating them (like LASSO).
✅ Best for: When correlated predictors exist.
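The idea can be sketched with a small hand-written coordinate descent in NumPy. This is not a library implementation. Assumptions: the objective is (1/2n)||y - Xw||^2 + alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||^2 (one common parameterization), there is no intercept (data are roughly centered), and the function names, alpha value and simulated data are illustrative.

```python
import numpy as np

def soft_threshold(z, g):
    # LASSO part: shrink z toward 0 and set it to exactly 0 when |z| <= g
    return np.sign(z) * max(abs(z) - g, 0.0)

def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ w + X[:, j] * w[j]      # residual ignoring feature j
            rho = X[:, j] @ r_j / n
            # Ridge part: the extra term in the denominator shrinks the coefficient
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1 - l1_ratio))
    return w

# Simulated data (assumption): columns 0 and 1 are almost identical, column 2 is noise
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 2 * x0 + 2 * x1 + rng.normal(scale=0.5, size=n)

w = elastic_net(X, y, alpha=0.1, l1_ratio=0.5)
print(np.round(w, 2))
```

With correlated columns like these, the Ridge part tends to spread the weight across both columns, while the LASSO part can push an unhelpful coefficient to exactly zero.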
Example problem:
Curvilinear regression:
Exponential Regression:
Polynomial Regression: