Overfitting and Mitigation
Overfitting
§ Errors in machine learning
§ Bias and Variance
§ Example
§ Methods to avoid Overfitting in DL
§ Simpler Architecture
§ Regularization
§ Drop-out layer
Errors in machine learning model
[Diagram: errors in a machine learning model split into two components, bias and variance]
Bias and Variance
Bias:
• Bias is the inability of the model to learn the patterns in the data effectively.
• High-bias models (typically overly simple models) fail to learn the relationship between the input features and the target.
• Consequently, the model's predictions will be inaccurate, resulting in errors.
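As a minimal sketch of the bias points above, the snippet below fits a straight line to data whose true relationship is quadratic. The data set, the helper names `fit_line` and `rmse`, and the choice of a straight line as the "simple model" are illustrative assumptions, not part of the original slides.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Illustrative, noise-free data: the true relationship is y = x^2.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [x ** 2 for x in xs]

# A straight line is too simple to represent the curve (high bias).
intercept, slope = fit_line(xs, ys)
line_pred = [intercept + slope * x for x in xs]
line_rmse = rmse(ys, line_pred)

print(f"intercept={intercept:.3f}, slope={slope:.3f}, train RMSE={line_rmse:.3f}")
```

Even on the data it was fitted to, and with no noise at all, the line leaves a sizeable error because it cannot express the curvature. That is the signature of a high-bias model.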
[Figure: example of a complex (low-bias) model, labeled "High Accuracy"]
Bias and Variance
But finding the ideal model is non-trivial. Why?
Bias and Variance
[Figure: bias curve and variance curve plotted across three regions: underfitting zone, ideal zone, overfitting zone]
Example
Model    Train Data RMSE    Test Data RMSE     Bias  Variance  Inference
Model 1  0.3 (reasonable)   0.3 (reasonable)   Low   Low       Satisfactory
Model 2  0.7 (poor)         0.7 (poor)         High  Low       Underfitting
Model 3  0.2 (reasonable)   0.8 (poor)         Low   High      Overfitting
Model 4  0.9 (poor)         0.9 (poor)         High  High      Inconsistent and inaccurate performance
• Ideal Model = Similar and reasonable performance in Train and Test Data
• Underfitting Model = Poor performance in Train and Test Data
• Overfitting Model = Reasonable performance in Train but poor performance in Test Data
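The train-versus-test comparison in this example can be reproduced with a small experiment. The sketch below is an illustration under stated assumptions: the synthetic data (a linear signal plus Gaussian noise), the three toy models, and all function names are hypothetical and do not come from the slides. A constant (mean) predictor stands in for a high-bias model, and a 1-nearest-neighbour predictor stands in for a model that memorises its training data.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def make_data(n, rng):
    """Hypothetical ground truth: y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

def fit_mean(xs, ys):
    """Very simple model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_1nn(xs, ys):
    """Memorising model: return the target of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(0)  # fixed seed so the run is repeatable
x_train, y_train = make_data(40, rng)
x_test, y_test = make_data(40, rng)

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("1-NN", fit_1nn)]:
    model = fit(x_train, y_train)
    train_rmse = rmse(y_train, [model(x) for x in x_train])
    test_rmse = rmse(y_test, [model(x) for x in x_test])
    results[name] = (train_rmse, test_rmse)
    print(f"{name:5s} train RMSE={train_rmse:.3f}  test RMSE={test_rmse:.3f}")
```

The memorising model reproduces its training targets exactly, so its training RMSE is zero, yet it still makes errors on the unseen test set. This gap between training and test error is the pattern labelled as overfitting in the example. The mean predictor, by contrast, is already worse than the fitted line on the training data, which is the underfitting pattern.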
Remedies to Avoid Overfitting
[Diagram: avoiding overfitting through regularization]
Lasso Regression
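Minimize $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$
Addition of the penalty factor $\lambda \sum_{j} |\beta_j|$ will result in shrinkage of the betas $\beta_j$ and may even force some to zero.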
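To see how an L1 penalty can force a coefficient to exactly zero, consider the simplest possible case: one predictor and no intercept (a simplifying assumption made here for illustration, not something the slides specify). Minimizing the sum of squared residuals plus lambda times the absolute value of the coefficient then has a closed-form solution based on soft-thresholding. The function names and data below are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, and clamp to zero if |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_1d(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam * |b| for a single coefficient b."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam / 2.0) / sxx

# Illustrative data lying exactly on y = 0.5 * x.
xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]

for lam in [0.0, 4.0, 14.0, 20.0]:
    print(f"lambda={lam:5.1f} -> beta={lasso_1d(xs, ys, lam):.4f}")
```

With no penalty the coefficient is the ordinary least squares value. As lambda grows the coefficient shrinks, and once lambda is large enough it is set to exactly zero rather than merely becoming small.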
Ridge Regression
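Minimize $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$
Addition of the penalty factor $\lambda \sum_{j} \beta_j^2$ will result in shrinkage of the betas $\beta_j$.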
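A minimal sketch of the shrinkage effect of a squared (L2) penalty, again under the simplifying assumption of one predictor and no intercept (an assumption made here, not stated in the slides). In that case, minimizing the sum of squared residuals plus lambda times the squared coefficient has a simple closed form. The function name and data are hypothetical.

```python
def ridge_1d(xs, ys, lam):
    """Minimizer of sum((y - b*x)^2) + lam * b^2 for a single coefficient b."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Illustrative data lying exactly on y = 0.5 * x.
xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]

betas = [ridge_1d(xs, ys, lam) for lam in [0.0, 14.0, 1000.0, 1e6]]
for lam, beta in zip([0.0, 14.0, 1000.0, 1e6], betas):
    print(f"lambda={lam:>9.1f} -> beta={beta:.6f}")
```

The coefficient gets steadily smaller as lambda grows but stays nonzero for any finite lambda, which is why ridge shrinks parameters without removing predictors.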
ELNET Regression
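Minimize $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2$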
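The elastic net objective combines an absolute-value (L1) penalty and a squared (L2) penalty on the coefficients, each with its own weight. The sketch below only evaluates that objective for given predictions and coefficients; it does not fit a model. The function name, the parameter names `lam1` and `lam2`, and the numbers are illustrative assumptions.

```python
def elastic_net_objective(y_true, y_pred, betas, lam1, lam2):
    """Residual sum of squares plus L1 and L2 penalties on the coefficients."""
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    l1_penalty = sum(abs(b) for b in betas)
    l2_penalty = sum(b * b for b in betas)
    return rss + lam1 * l1_penalty + lam2 * l2_penalty

# Hypothetical targets, predictions and coefficients.
y_true = [1.0, 2.0]
y_pred = [1.5, 1.5]
betas = [1.0, -2.0]

both = elastic_net_objective(y_true, y_pred, betas, lam1=0.1, lam2=0.2)
lasso_only = elastic_net_objective(y_true, y_pred, betas, lam1=0.1, lam2=0.0)
ridge_only = elastic_net_objective(y_true, y_pred, betas, lam1=0.0, lam2=0.2)
print(both, lasso_only, ridge_only)
```

Setting `lam2` to zero leaves only the L1 term (the lasso objective), and setting `lam1` to zero leaves only the L2 term (the ridge objective), so elastic net contains both methods as special cases.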
Regularization
LASSO Regression
• L1 produces results with sparse betas and smaller coefficients
• Betas may be set to zero
• Useful when you are interested in keeping fewer attributes in the model
So what?
• Model is efficient to store
• Model is efficient to compute
Ridge Regression
• Ridge achieves parameter shrinkage only
• Difficult to interpret as it retains all predictors
So what?
• Model is useful when overfitting/variance is the main concern
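The storage and compute point made for LASSO follows from sparsity: when many betas are exactly zero, only the nonzero ones need to be kept and used. The sketch below illustrates this with a made-up coefficient vector. The values, the dictionary-based sparse representation, and the function names are assumptions for illustration only.

```python
def predict_dense(betas, x):
    """Linear prediction using every coefficient, including zeros."""
    return sum(b * xi for b, xi in zip(betas, x))

def to_sparse(betas):
    """Keep only the nonzero coefficients, keyed by feature index."""
    return {j: b for j, b in enumerate(betas) if b != 0.0}

def predict_sparse(sparse_betas, x):
    """Linear prediction that touches only the retained features."""
    return sum(b * x[j] for j, b in sparse_betas.items())

# Hypothetical coefficients, as a lasso fit might produce: mostly zeros.
betas = [0.0, 1.5, 0.0, 0.0, -0.7, 0.0]
x = [3.0, 2.0, 9.0, 4.0, 1.0, 5.0]

sparse_betas = to_sparse(betas)
print(f"stored coefficients: {len(sparse_betas)} of {len(betas)}")
print(predict_dense(betas, x), predict_sparse(sparse_betas, x))
```

Both predictions agree, but the sparse form stores and multiplies only two values instead of six. A ridge fit would typically leave all six coefficients nonzero, so this shortcut would not apply.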
Thank You