Lecture 10
Math Foundations
Introduction
► A closer look at the contour plot for the elliptical bowl case shows oscillatory movement in the y-direction: each step overshoots, and the next step corrects the overshoot made in the previous step. This happens because the gradient component along the y-direction is much larger than the component along the x-direction.
► Along the x-direction, we make only small movements toward the optimal x-value. Overall, after many training steps we find that we have made little progress toward the optimum (see the sketch after this list).
► Keep in mind that, for most objective functions, the path of steepest descent gives only an instantaneous direction of best improvement; it is not the best direction of descent over the longer term.
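As an illustration, here is a minimal gradient-descent sketch on an assumed elliptical bowl f(x, y) = 0.5*(x^2 + 25*y^2); the surface, starting point, and step size are illustrative choices, not taken from the lecture. It reproduces the behaviour described above: the y-coordinate overshoots and flips sign at every step, while x shrinks only slowly.

import numpy as np

# Elliptical bowl f(x, y) = 0.5*(x**2 + 25*y**2); its gradient is (x, 25*y).
# The curvature along y is 25 times larger than along x.
def grad(p):
    x, y = p
    return np.array([x, 25.0 * y])

p = np.array([5.0, 1.0])   # starting point (illustrative)
eta = 0.07                 # step size large enough that the y-update overshoots

for k in range(15):
    p = p - eta * grad(p)
    print(f"step {k + 1:2d}: x = {p[0]: .4f}, y = {p[1]: .4f}")

# The y-coordinate alternates in sign (overshoot, then correction),
# while x shrinks by only 7% per step: little overall progress.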
Revisiting feature normalization
► Loss function:
J(w) = (0.1 w_1 + 25 w_2 - 7)^2 + (0.8 w_1 + 10 w_2 - 1)^2 + (0.4 w_1 + 10 w_2 - 4)^2
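The three squared terms define a least-squares problem with a data matrix X and targets y. The short numpy sketch below (illustrative, not from the lecture) shows why normalization matters here: the curvature matrix 2 X^T X of J is badly conditioned because the second feature's values (10 to 25) dwarf the first's (0.1 to 0.8), and standardizing the feature columns shrinks the condition number dramatically.

import numpy as np

# Data read off the three squared terms of J(w): each row x_i with target y_i
# contributes one term (x_i . w - y_i)^2.
X = np.array([[0.1, 25.0],
              [0.8, 10.0],
              [0.4, 10.0]])
y = np.array([7.0, 1.0, 4.0])

# The curvature (Hessian) of J is 2 X^T X; its condition number controls
# how strongly gradient descent zig-zags.
H = 2 * X.T @ X
print("condition number, raw features:       ", np.linalg.cond(H))

# Standardize each feature column (zero mean, unit variance), then recompute.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
H_norm = 2 * X_norm.T @ X_norm
print("condition number, normalized features:", np.linalg.cond(H_norm))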
Constrained Optimization: Lagrange Multiplier Method

maximize f(x, y) = x^2 y
subject to g(x, y): x^2 + y^2 = 1
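A worked solution via the Lagrangian (a sketch of the standard calculation; the lecture's own solution steps are not reproduced here):

\[
\mathcal{L}(x, y, \lambda) = x^2 y - \lambda\,(x^2 + y^2 - 1)
\]
\[
\frac{\partial \mathcal{L}}{\partial x} = 2xy - 2\lambda x = 0, \qquad
\frac{\partial \mathcal{L}}{\partial y} = x^2 - 2\lambda y = 0, \qquad
x^2 + y^2 = 1.
\]

For x ≠ 0, the first equation gives y = λ, and the second then gives x^2 = 2y^2, so the constraint becomes 3y^2 = 1. The maximizer is y = 1/√3 with x^2 = 2/3, giving the maximum value f = 2/(3√3) ≈ 0.385 (the case x = 0 only gives f = 0).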
maximize xy
subject to x + y = 6
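The same method applied to this example (again a sketch):

\[
\mathcal{L}(x, y, \lambda) = xy - \lambda\,(x + y - 6), \qquad
\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \qquad
\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0.
\]

Hence x = y = λ, and the constraint x + y = 6 gives x = y = 3, so the maximum value is xy = 9.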