
Gradient Descent

Objective
The gradient descent algorithm is an iterative process that takes us to the
minimum of a function.
What is Gradient Descent?

The million-dollar question!

Let’s say you are playing a game where the players are at the top of a mountain, and they are asked to
reach the lowest point of the mountain, a lake at its base. Additionally, they are blindfolded. So, what approach do you
think would make you reach the lake? The only real option is to feel the slope of the ground under your feet and take a small step in the steepest downhill direction, over and over, which is exactly what gradient descent does.
Gradient descent was originally proposed by Cauchy in 1847 and is also known as steepest descent.
It is an iterative optimization algorithm for finding a local minimum of a function.

The goal of the gradient descent algorithm is to minimize the given function (say, a cost
function). To achieve this goal, it performs two steps iteratively:

1. Compute the gradient (slope), the first-order derivative of the function at the current point

2. Make a step (move) in the direction opposite to the gradient, i.e. against the direction in which
the function increases, with a length of alpha (the learning rate) times the gradient at that point (see the update rule written out below)
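In symbols (a plain-notation sketch; the names p_n for the current point, alpha for the step size and ∇f for the gradient are chosen here for illustration, they do not come from the original slides):

p_{n+1} = p_n − alpha * ∇f(p_n)

Repeating this update moves the point downhill, provided alpha is small enough.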
Limitations

Gradient descent is a general-purpose algorithm that numerically finds minima of multivariable functions.
1. One of its limitations is that it only finds local minima (rather than the global minimum). As
soon as the algorithm reaches a point that is a local minimum, it will never escape, as long
as the step size is not large enough to jump out of the basin around that minimum.
2. Another limitation of gradient descent concerns the step size. A good step size
moves toward the minimum rapidly, each step making substantial progress.
If the step size is too large, however, we may never converge to a local
minimum, because we overshoot it every time; if it is too small, progress becomes
very slow (see the sketch after this list).
3. A final limitation is that gradient descent only works when our function is differentiable
everywhere. Otherwise we might come to a point where the gradient isn't defined, and then
we can't use our update formula.
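The following is a minimal sketch of the step-size limitation (an assumed example, not from the original slides: the function f(x) = x**2, the starting point and the step sizes are chosen purely for illustration):

def gradient_descent_1d(start, alpha, steps):
    """Plain gradient descent on f(x) = x**2 (gradient 2*x); returns the visited points."""
    x = start
    path = [x]
    for _ in range(steps):
        x = x - alpha * (2 * x)   # step against the gradient of x**2
        path.append(x)
    return path

print(gradient_descent_1d(start=5.0, alpha=0.1, steps=5))  # 5.0, 4.0, 3.2, ... shrinks toward the minimum at 0
print(gradient_descent_1d(start=5.0, alpha=1.0, steps=5))  # 5.0, -5.0, 5.0, ... overshoots and oscillates forever
print(gradient_descent_1d(start=5.0, alpha=1.1, steps=5))  # 5.0, -6.0, 7.2, ... overshoots and diverges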
Function requirements
The gradient descent algorithm does not work for all functions. There are two specific
requirements. A function has to be:

● differentiable

● convex
Differentiable: if a function is differentiable, it has a derivative at each point
in its domain.

Examples of differentiable functions


Typical non-differentiable functions have a step, a cusp, or a discontinuity; at such points the gradient is undefined, so the update formula cannot be applied (a small sketch follows below).
Convex: a function is convex if every line segment joining two points on its graph lies on or above the graph. For a convex function, any local minimum is also the global minimum, so gradient descent cannot get stuck in a spurious local minimum.
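For instance, in this minimal sketch (the function |x|, the starting point and the step size are illustrative assumptions, not taken from the slides), f(x) = |x| has a cusp at x = 0, exactly where its minimum lies:

def abs_grad(x):
    """Derivative of f(x) = |x|: -1 for x < 0, +1 for x > 0, undefined at the cusp x = 0."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    raise ValueError("the gradient of |x| is undefined at x = 0")

# With a fixed step size the iterates straddle the cusp and never settle on the minimum:
x, alpha = 0.3, 0.4
for _ in range(6):
    x = x - alpha * abs_grad(x)   # 0.3 -> -0.1 -> 0.3 -> -0.1 -> ...
print(x)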
Gradient Descent method

1. Choose a starting point (initialisation)
2. Calculate the gradient at this point
3. Make a scaled step in the opposite direction to the gradient (objective: minimise)
4. Repeat points 2 and 3 until one of the stopping criteria is met:
   ● Maximum number of iterations reached
   ● Step size is smaller than the tolerance
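Below is a minimal Python sketch of these steps (the example function, the parameter names alpha, max_iterations and tolerance, and the example values are illustrative assumptions, not taken from the original slides):

import numpy as np

def gradient_descent(grad, start, alpha=0.1, max_iterations=1000, tolerance=1e-6):
    """Initialise, compute the gradient, step against it, and stop on either
    the iteration limit or a step smaller than the tolerance."""
    point = np.asarray(start, dtype=float)        # 1. choose a starting point
    for _ in range(max_iterations):               # 4. repeat, capped by the iteration limit
        step = alpha * grad(point)                # 2. gradient at this point, scaled by alpha
        if np.linalg.norm(step) < tolerance:      #    stop if the step is smaller than the tolerance
            break
        point = point - step                      # 3. scaled step opposite to the gradient
    return point

# Example: minimise f(x, y) = x**2 + y**2, whose gradient is (2x, 2y); the minimum is at (0, 0).
print(gradient_descent(grad=lambda p: 2 * p, start=[3.0, -4.0]))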
For more detailed clarity about the GD algorithm, see:
https://towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e

https://towardsdatascience.com/gradient-descent-algorithm-a-deep-dive-cf04e8115f21
