
Gradient Descent in Machine Learning
Gradient Descent

• Gradient Descent is one of the most commonly used optimization algorithms for training machine learning models.
• It is also used to train neural networks.
• It works by iteratively minimizing the error between the actual and the predicted results.
Linear Regression
Let X be the independent variable and Y be the dependent variable, related by the line y = mx + c.

Our goal is to determine the values of m and c such that the corresponding line is the best-fitting line, i.e. the line that gives the minimum error. A minimal sketch of this line model follows below.
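As a concrete reference for the model above, here is a minimal Python sketch of the line y = mx + c; the function name and sample values are illustrative, not from the slides.

# A minimal sketch of the line model y = m*x + c.
# The function name and the sample values are illustrative.
def predict(x, m, c):
    # Predicted y for input x, given slope m and intercept c.
    return m * x + c

print(predict(3.0, m=2.0, c=1.0))  # -> 7.0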
Loss Function
• The loss is the error in our predicted values of m and c.
• Our goal is to minimize this error to obtain the most accurate values of m and c.
• We use the Mean Squared Error (MSE) function to calculate the loss:

1. Find the difference between the actual y value and the predicted y value (ȳ = mx + c) for a given x.
2. Square this difference.
3. Find the mean of the squares over every value in X.

E = (1/n) Σᵢ (yᵢ − ȳᵢ)²

Here yᵢ is the actual value and ȳᵢ is the predicted value.
Substituting ȳᵢ = mxᵢ + c:

E = (1/n) Σᵢ (yᵢ − (mxᵢ + c))²
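A minimal Python sketch of this MSE computation, following the three steps above; it assumes the data is held in plain lists xs and ys (illustrative names).

# A sketch of the MSE loss: the mean of squared (actual - predicted) differences.
def mse(xs, ys, m, c):
    n = len(xs)
    # Steps 1 and 2: squared difference between actual y and predicted m*x + c.
    # Step 3: mean over all n points.
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n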
Understanding Gradient Descent
1. Initially, let m = 0 and c = 0, and let L be the learning rate, which controls how much the value of m changes with each step. A smaller L makes each step more precise but requires more iterations to converge.
2. Calculate the partial derivative of the loss function with respect to m, plugging in the current values of x, y, m, and c, to get the derivative Dₘ:

Dₘ = ∂E/∂m = (−2/n) Σᵢ xᵢ (yᵢ − ȳᵢ)

Dₘ is the value of the partial derivative with respect to m. Similarly, the partial derivative with respect to c, Dc, is:

Dc = ∂E/∂c = (−2/n) Σᵢ (yᵢ − ȳᵢ)

A sketch of both derivatives in code follows below.
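Here is a minimal Python sketch of these two derivatives, using the same plain-list data layout as the MSE sketch above (names are illustrative).

# Partial derivatives D_m and D_c of the MSE loss, as derived above.
def gradients(xs, ys, m, c):
    n = len(xs)
    d_m = (-2 / n) * sum(x * (y - (m * x + c)) for x, y in zip(xs, ys))
    d_c = (-2 / n) * sum(y - (m * x + c) for x, y in zip(xs, ys))
    return d_m, d_c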
Gradient Descent
3. Now update the current values of m and c using the following equations:

m = m − L·Dₘ
c = c − L·Dc

4. Repeat this process until the loss function reaches a very small value or ideally 0 (which would mean 0 error, i.e. 100% accuracy). The values of m and c that we are left with are the optimum values.
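Putting steps 1-4 together, here is a minimal end-to-end sketch; the data, learning rate, and iteration count are illustrative assumptions, not values from the slides.

# A minimal end-to-end sketch of gradient descent for y = m*x + c.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]   # generated from y = 2x + 1
n = len(xs)

m, c = 0.0, 0.0   # step 1: initialize m and c
L = 0.01          # learning rate (illustrative choice)

for _ in range(10_000):
    # step 2: partial derivatives of the MSE loss at the current m, c
    d_m = (-2 / n) * sum(x * (y - (m * x + c)) for x, y in zip(xs, ys))
    d_c = (-2 / n) * sum(y - (m * x + c) for x, y in zip(xs, ys))
    # step 3: move m and c against the gradient
    m -= L * d_m
    c -= L * d_c

print(round(m, 3), round(c, 3))   # step 4 outcome: approaches 2.0 and 1.0

On this toy data the loop recovers the coefficients used to generate ys, which is exactly the minimum-error line from the Linear Regression slide.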
Learning Rate Differences

Figure: (a) large learning rate, (b) small learning rate, (c) optimum learning rate
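The three regimes in the figure can also be seen numerically. The sketch below trains on the same illustrative data with three different learning rates; the specific values are assumptions chosen to trigger each behavior.

# Comparing learning rates: final MSE loss after a fixed number of steps.
def train(xs, ys, L, steps=100):
    m = c = 0.0
    n = len(xs)
    for _ in range(steps):
        d_m = (-2 / n) * sum(x * (y - (m * x + c)) for x, y in zip(xs, ys))
        d_c = (-2 / n) * sum(y - (m * x + c) for x, y in zip(xs, ys))
        m -= L * d_m
        c -= L * d_c
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
print(train(xs, ys, L=0.2))    # (a) too large: overshoots, loss explodes
print(train(xs, ys, L=0.001))  # (b) too small: converges, but very slowly
print(train(xs, ys, L=0.05))   # (c) near optimum: loss drops close to 0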
Global Minimum
In the case of the linear regression model there is only one minimum, and it is the global minimum.

For more complex models, however, the loss surface can have several minima, and the local minimum reached depends on the initial coefficients taken into consideration. In the figure, points A and B are local minima and point C is the global minimum.
