CM20315 06 Fitting
Training
• Loss function: returns a scalar that is smaller when the model maps inputs to outputs better
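To make the idea of a loss function concrete, here is a minimal sketch. It assumes a 1D linear model and a least squares loss; the names `model`, `least_squares_loss`, and the toy data are hypothetical choices for illustration and do not come from the slides.

```python
def model(x, phi):
    """Hypothetical 1D linear model: intercept phi[0], slope phi[1]."""
    return phi[0] + phi[1] * x


def least_squares_loss(phi, xs, ys):
    """Sum of squared differences between model outputs and targets."""
    return sum((model(x, phi) - y) ** 2 for x, y in zip(xs, ys))


# Toy data generated from y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Parameters that map inputs to outputs exactly give the smaller loss.
loss_good = least_squares_loss([1.0, 2.0], xs, ys)
# Parameters that map inputs to outputs poorly give a larger loss.
loss_bad = least_squares_loss([0.0, 0.0], xs, ys)
```

The loss is a single scalar for a given set of parameters, so two candidate parameter settings can be compared directly.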
Gradient descent
Step 1: Compute the derivatives (slopes of the function) with respect to the parameters
Step 2: Move the parameters a small distance downhill, against the derivatives; the step size sets how far to move
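The gradient descent loop can be sketched as follows. This is an illustration under assumptions, not the course's reference code: it assumes a 1D linear model with a least squares loss, a fixed step size called `alpha`, and made-up toy data. The update rule (parameters minus step size times derivatives) is the standard one.

```python
def gradient(phi, xs, ys):
    """Derivatives of the least squares loss with respect to each parameter."""
    residuals = [phi[0] + phi[1] * x - y for x, y in zip(xs, ys)]
    d_intercept = sum(2.0 * r for r in residuals)
    d_slope = sum(2.0 * r * x for r, x in zip(residuals, xs))
    return [d_intercept, d_slope]


def gradient_descent(phi, xs, ys, alpha=0.02, n_steps=2000):
    """Repeat: compute derivatives, then step downhill by alpha times them."""
    for _ in range(n_steps):
        grad = gradient(phi, xs, ys)
        phi = [p - alpha * g for p, g in zip(phi, grad)]
    return phi


# Toy data generated from y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

phi_fit = gradient_descent([0.0, 0.0], xs, ys)
```

The step size matters: on this toy problem a value much larger than the assumed 0.02 makes the updates overshoot and diverge, while a much smaller one needs many more steps.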
Line Search
Line Search (bracketing)
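A bracketing line search picks the step size by repeatedly shrinking an interval that contains the minimum of the loss along the search direction. The sketch below is one common variant, assuming the loss is unimodal along that direction and that two interior points are placed at one third and two thirds of the interval. The quadratic `loss`, the search range [0, 1], and all function names are hypothetical.

```python
def loss(phi):
    """Hypothetical quadratic loss with minimum at phi = [1, -2]."""
    return (phi[0] - 1.0) ** 2 + 10.0 * (phi[1] + 2.0) ** 2


def gradient(phi):
    return [2.0 * (phi[0] - 1.0), 20.0 * (phi[1] + 2.0)]


def bracketing_line_search(f, a, d, tol=1e-6, max_iter=200):
    """Shrink the bracket [a, d] around the minimum of a unimodal function f."""
    for _ in range(max_iter):
        if d - a < tol:
            break
        b = a + (d - a) / 3.0        # first interior point
        c = a + 2.0 * (d - a) / 3.0  # second interior point
        if f(b) < f(c):
            d = c  # the minimum cannot lie beyond c
        else:
            a = b  # the minimum cannot lie before b
    return 0.5 * (a + d)


phi = [0.0, 0.0]
grad = gradient(phi)


def loss_along_direction(alpha):
    """Loss after moving a distance alpha along the negative gradient."""
    return loss([p - alpha * g for p, g in zip(phi, grad)])


alpha_best = bracketing_line_search(loss_along_direction, 0.0, 1.0)
phi_new = [p - alpha_best * g for p, g in zip(phi, grad)]
```

Each pass discards one third of the bracket, so the cost is two loss evaluations per pass and no extra derivative computations.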
Convex problems
Normalized gradients
• Measure mean and pointwise squared gradient
• Normalize: divide the mean by the square root of the pointwise squared gradient, then step
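A single normalized gradient step might look like the sketch below. It assumes the "mean" is simply the current gradient (no averaging yet) and adds a small constant `eps` to avoid division by zero; `alpha`, `eps`, and the example gradient are assumed values, not taken from the slides.

```python
import math


def normalized_gradient_step(phi, grad, alpha=0.1, eps=1e-8):
    """Step a roughly fixed distance alpha per parameter, whatever the gradient size."""
    m = list(grad)             # mean gradient: here just the current gradient
    v = [g * g for g in grad]  # pointwise squared gradient
    return [p - alpha * mi / (math.sqrt(vi) + eps)
            for p, mi, vi in zip(phi, m, v)]


# The two gradient entries differ by a factor of 20 in magnitude...
grad = [-2.0, 40.0]
# ...but after normalization both parameters move by about alpha.
phi_new = normalized_gradient_step([0.0, 0.0], grad)
```

Because only the sign of each derivative survives, every parameter moves the same distance per step. With a fixed step size this means the parameters can bounce around the minimum rather than settle exactly on it.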
Adaptive moment estimation (Adam)
• Compute mean and pointwise squared gradients with momentum
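The sketch below shows how the momentum versions of the mean and squared gradient fit into an update loop. It follows the commonly published form of Adam, which also includes a correction for the first few steps (when the running averages are still close to zero); that correction is not in the slide text above and is included here as an assumption. The coefficient names `beta` and `gamma`, their values, and the quadratic test gradient are hypothetical.

```python
import math


def adam(grad_fn, phi, alpha=0.01, beta=0.9, gamma=0.99, eps=1e-8, n_steps=2000):
    """Minimal Adam loop for a list of parameters (illustrative, not optimized)."""
    m = [0.0] * len(phi)  # running mean of the gradient
    v = [0.0] * len(phi)  # running mean of the pointwise squared gradient
    for t in range(n_steps):
        grad = grad_fn(phi)
        # Mean and pointwise squared gradients with momentum.
        m = [beta * mi + (1.0 - beta) * g for mi, g in zip(m, grad)]
        v = [gamma * vi + (1.0 - gamma) * g * g for vi, g in zip(v, grad)]
        # Assumed correction for early steps, when m and v are biased toward zero.
        m_hat = [mi / (1.0 - beta ** (t + 1)) for mi in m]
        v_hat = [vi / (1.0 - gamma ** (t + 1)) for vi in v]
        # Normalized update.
        phi = [p - alpha * mh / (math.sqrt(vh) + eps)
               for p, mh, vh in zip(phi, m_hat, v_hat)]
    return phi


def gradient(phi):
    """Gradient of a hypothetical quadratic loss with minimum at [1, -2]."""
    return [2.0 * (phi[0] - 1.0), 20.0 * (phi[1] + 2.0)]


phi_fit = adam(gradient, [0.0, 0.0])
```

With a constant step size the result is only expected to land near the minimum, not exactly on it, so any check on `phi_fit` should use a tolerance tied to `alpha`.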