ML UNIT - 2 Part 2
Reduce Overfitting
• The Problem of Overfitting
U19IT602
Regression example
(Figure: three plots of price vs. size)

f(x) = w₁x + b
• Does not fit the training set well

f(x) = w₁x + w₂x² + b
• Fits the training set pretty well

f(x) = w₁x + w₂x² + w₃x³ + w₄x⁴ + b
• Fits the training set extremely well
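The three fits above can be reproduced numerically. A minimal sketch, assuming made-up price/size data: higher-degree polynomials always achieve at least as low a training error, which is exactly why the degree-4 model "fits extremely well" without necessarily generalizing.

```python
import numpy as np

# Hypothetical housing data: size (x) vs. price (y). Values are made up.
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
y = np.array([300.0, 380.0, 430.0, 460.0, 480.0, 490.0])

def train_mse(degree):
    """Fit a polynomial of the given degree by least squares and
    return its mean squared error on the training set."""
    coeffs = np.polyfit(x, y, degree)
    preds = np.polyval(coeffs, x)
    return np.mean((preds - y) ** 2)

# Training error is non-increasing in model complexity.
errs = {d: train_mse(d) for d in (1, 2, 4)}
print(errs)
```

Lower training error for degree 4 does not mean it is the better model; that is the point of the next slides.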
Classification
(Figure: three plots of x₂ vs. x₁ with decision boundaries)

z = w₁x₁ + w₂x₂ + b

z = w₁x₁ + w₂x₂ + w₃x₁x₂ + w₄x₁² + w₅x₂² + b

z = w₁x₁ + w₂x₂ + w₃x₁²x₂ + w₄x₁²x₂² + w₅x₁²x₂³ + w₆x₁³x₂ + ⋯ + b

f_{w,b}(x) = g(z), where g is the sigmoid function
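The sigmoid g that all three classifiers share can be sketched directly. A minimal example, with made-up weights and input:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# f_{w,b}(x) = g(w · x + b); w, b, and x here are arbitrary illustrations.
w = np.array([1.0, -2.0])
b = 0.5
x = np.array([2.0, 1.0])
print(sigmoid(w @ x + b))  # model's estimate of P(y = 1 | x)
```

g maps any z to (0, 1), with g(0) = 0.5, so the decision boundary is the set where z = 0.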
Regularization to Reduce Overfitting
• Addressing Overfitting
Collect more training examples
(Figure: price vs. size, before and after adding more training examples)
Select features to include/exclude
Features: size, bedrooms, floors, age, avg income, …, distance to coffee shop → price
Regularization
Reduce the size of parameters wⱼ
(Figure: price vs. features, before and after shrinking the parameters)
Addressing overfitting
Options
1. Collect more data
2. Select features
   – Feature selection
3. Reduce the size of parameters
   – Regularization
Regularization to Reduce Overfitting
(Figure: two plots of price vs. size)

f(x) = w₁x + w₂x² + b        f(x) = w₁x + w₂x² + w₃x³ + w₄x⁴ + b

make w₃, w₄ really small (≈ 0)

min_{w,b} (1/2m) Σᵢ₌₁ᵐ ( f_{w,b}(x⁽ⁱ⁾) − y⁽ⁱ⁾ )²
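In practice one does not hand-pick w₃ and w₄; regularization adds a penalty (λ/2m) Σ wⱼ² to the squared-error cost (the same penalty that reappears in the regularized logistic cost later). A minimal numpy sketch with toy data, all values made up:

```python
import numpy as np

def regularized_cost(w, b, X, y, lam):
    """Squared-error cost plus an L2 penalty on the weights:
    J(w,b) = (1/2m) * sum((f(x_i) - y_i)^2) + (lam/2m) * sum(w_j^2).
    By convention b is not penalized."""
    m = X.shape[0]
    preds = X @ w + b
    mse_term = np.sum((preds - y) ** 2) / (2 * m)
    reg_term = lam * np.sum(w ** 2) / (2 * m)
    return mse_term + reg_term

# Toy data where w = [2], b = 0 fits perfectly.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w = np.array([2.0])
print(regularized_cost(w, 0.0, X, y, lam=0.0))  # fit term alone: 0
print(regularized_cost(w, 0.0, X, y, lam=1.0))  # penalty term added
```

With λ = 0 only the fit matters; as λ grows, large weights cost more, which trades some training fit for a simpler model.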
Regularization
small values w₁, w₂, ⋯, wₙ, b → simpler model → less likely to overfit

Features: size, bedrooms, floors, age, avg income, …, distance to coffee shop → price
Parameters: w₁, w₂, ⋯, w₁₀₀, b
Regularization
choose λ = 10¹⁰
f_{w,b}(x) = w₁x + w₂x² + w₃x³ + w₄x⁴ + b
(Figure: price vs. size)
With λ this large, minimizing the cost drives every wⱼ ≈ 0, so f(x) ≈ b and the model underfits.
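The effect of an enormous λ can be checked with the closed-form L2-regularized (ridge) solution w = (XᵀX + λI)⁻¹Xᵀy, used here as a stand-in for running gradient descent to convergence; the data are randomly generated for illustration:

```python
import numpy as np

def ridge_weights(X, y, lam):
    """Closed-form L2-regularized least-squares weights (no bias term)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Synthetic data with known true weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([3.0, -2.0, 1.0, 0.5])

small = ridge_weights(X, y, lam=0.01)   # near the true weights
huge = ridge_weights(X, y, lam=1e10)    # all weights crushed toward 0
print(small)
print(np.abs(huge).max())
```

With λ = 10¹⁰ every weight is effectively zero, leaving a flat prediction: too much regularization causes underfitting, just as λ = 0 permits overfitting.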
Regularization to Reduce Overfitting
• Regularized Linear Regression
Regularized linear regression
Gradient descent
repeat {
  wⱼ = wⱼ − α (∂/∂wⱼ) J(w, b)
  b = b − α (∂/∂b) J(w, b)
} simultaneous update
Implementing gradient descent
repeat {
  wⱼ = wⱼ (1 − α λ/m) − α (1/m) Σᵢ₌₁ᵐ ( f_{w,b}(x⁽ⁱ⁾) − y⁽ⁱ⁾ ) xⱼ⁽ⁱ⁾
  b = b − α (1/m) Σᵢ₌₁ᵐ ( f_{w,b}(x⁽ⁱ⁾) − y⁽ⁱ⁾ )
} simultaneous update
The factor (1 − α λ/m) shrinks wⱼ slightly on every iteration, on top of the usual gradient step.
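The shrink-then-step update can be sketched directly in numpy. A minimal version, with a made-up toy dataset and arbitrary α and λ:

```python
import numpy as np

def gradient_step(w, b, X, y, alpha, lam):
    """One regularized gradient-descent step (simultaneous update),
    written in the rearranged slide form:
    w_j <- w_j*(1 - alpha*lam/m) - alpha*(1/m)*sum((f - y) * x_j)."""
    m = X.shape[0]
    err = X @ w + b - y                              # f(x_i) - y_i
    w_new = w * (1 - alpha * lam / m) - alpha * (X.T @ err) / m
    b_new = b - alpha * np.sum(err) / m              # b is not regularized
    return w_new, b_new

# Toy data: y = 2x exactly, so w should approach 2 (a bit less, due to lam).
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w, b = np.zeros(1), 0.0
for _ in range(200):
    w, b = gradient_step(w, b, X, y, alpha=0.1, lam=0.1)
print(w, b)
```

Note that both parameters are computed from the old values before either is overwritten, which is what "simultaneous update" requires.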
How we get the derivative term (optional)
(∂/∂wⱼ) J(w, b) = (1/m) Σᵢ₌₁ᵐ ( f_{w,b}(x⁽ⁱ⁾) − y⁽ⁱ⁾ ) xⱼ⁽ⁱ⁾ + (λ/m) wⱼ
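This derivative can be verified numerically against a finite-difference approximation of the regularized cost; the data below are random and purely illustrative:

```python
import numpy as np

def cost(w, b, X, y, lam):
    """J(w,b) = (1/2m)*sum((f - y)^2) + (lam/2m)*sum(w_j^2)."""
    m = X.shape[0]
    err = X @ w + b - y
    return np.sum(err ** 2) / (2 * m) + lam * np.sum(w ** 2) / (2 * m)

def analytic_grad_w(w, b, X, y, lam):
    """dJ/dw_j = (1/m)*sum((f(x_i) - y_i) * x_ij) + (lam/m)*w_j."""
    m = X.shape[0]
    err = X @ w + b - y
    return (X.T @ err) / m + lam * w / m

# Central finite differences at a random point.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3)); y = rng.normal(size=5)
w = rng.normal(size=3); b = 0.3; lam = 0.7
eps = 1e-6
numeric = np.array([
    (cost(w + eps * np.eye(3)[j], b, X, y, lam) -
     cost(w - eps * np.eye(3)[j], b, X, y, lam)) / (2 * eps)
    for j in range(3)
])
print(np.max(np.abs(numeric - analytic_grad_w(w, b, X, y, lam))))
```

The two gradients agree to within floating-point error, confirming the derivative term used in the update rule.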
Regularization to Reduce Overfitting
• Regularized Logistic Regression
Regularized logistic regression
(Figure: x₂ vs. x₁ with a decision boundary)

z = w₁x₁ + w₂x₂ + w₃x₁²x₂ + w₄x₁²x₂² + w₅x₁²x₂³ + ⋯ + b
f_{w,b}(x) = 1 / (1 + e⁻ᶻ)

Cost function
J(w, b) = −(1/m) Σᵢ₌₁ᵐ [ y⁽ⁱ⁾ log f_{w,b}(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − f_{w,b}(x⁽ⁱ⁾)) ]
Regularized logistic regression
J(w, b) = −(1/m) Σᵢ₌₁ᵐ [ y⁽ⁱ⁾ log f_{w,b}(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − f_{w,b}(x⁽ⁱ⁾)) ] + (λ/2m) Σⱼ₌₁ⁿ wⱼ²

Gradient descent
repeat {
  wⱼ = wⱼ − α (∂/∂wⱼ) J(w, b)
  b = b − α (∂/∂b) J(w, b)
} simultaneous update
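The regularized logistic cost and its gradient step have the same form as in linear regression; only f changes to the sigmoid. A minimal sketch with made-up 1-D data and arbitrary α and λ:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(w, b, X, y, lam):
    """J(w,b) = -(1/m)*sum(y*log f + (1-y)*log(1-f)) + (lam/2m)*sum(w_j^2)."""
    m = X.shape[0]
    f = sigmoid(X @ w + b)
    ce = -np.mean(y * np.log(f) + (1 - y) * np.log(1 - f))
    return ce + lam * np.sum(w ** 2) / (2 * m)

def gradient_step(w, b, X, y, alpha, lam):
    """Simultaneous update; identical in form to regularized linear
    regression, with f(x) = sigmoid(w·x + b)."""
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y
    w_new = w - alpha * ((X.T @ err) / m + lam * w / m)
    b_new = b - alpha * np.mean(err)
    return w_new, b_new

# Toy binary data: labels flip around x = 2.
X = np.array([[0.5], [1.5], [2.5], [3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = np.zeros(1), 0.0
before = logistic_cost(w, b, X, y, lam=0.1)
for _ in range(500):
    w, b = gradient_step(w, b, X, y, alpha=0.5, lam=0.1)
after = logistic_cost(w, b, X, y, lam=0.1)
print(before, after)
```

Each iteration lowers the regularized cost, while the (λ/m)·wⱼ term keeps the weights from growing without bound on this easily separable toy set.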