
Practical Aspects of Deep Learning Part III

Arles Rodríguez
aerodriguezp@unal.edu.co

Facultad de Ciencias
Departamento de Matemáticas
Universidad Nacional de Colombia
Motivation
• Backpropagation is the most difficult part of a neural network to implement.
• There is a numerical technique for debugging the gradient computation, called gradient checking.
• It consists of comparing the gradients from backpropagation against a numerical approximation.
Numerical approximation of gradients
Let's say f(θ) = θ³, ε = 0.01, and θ = 1.

Two-sided difference formula:

    f'(θ) ≈ (f(θ + ε) − f(θ − ε)) / (2ε)

Plugging in the values:

    f'(θ) ≈ (f(1.01) − f(0.99)) / (2 · 0.01)
          = ((1.01)³ − (0.99)³) / (2 · 0.01)
          = 3.0001

The exact derivative is f'(θ) = 3θ² = 3, so the approximation error is 0.0001. A small code sketch of this calculation follows.

[Figure: the secant line through f(θ − ε) and f(θ + ε) approximates the slope of f at θ = 1.]
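A minimal sketch of this calculation, assuming f(θ) = θ³ as in the example above:

```python
# Two-sided finite-difference approximation of f'(theta) for f(theta) = theta**3
# at theta = 1 with eps = 0.01, as in the example above.

def f(theta):
    return theta ** 3

theta, eps = 1.0, 0.01

approx = (f(theta + eps) - f(theta - eps)) / (2 * eps)  # ~ 3.0001
exact = 3 * theta ** 2                                  # = 3.0

print(approx, exact, abs(approx - exact))               # error ~ 0.0001
```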
Some math details
The two-sided formula comes from the limit definition of the derivative:

    f'(θ) = lim (ε → 0) (f(θ + ε) − f(θ − ε)) / (2ε)

That is the second-order Taylor expansion of f (Bengio, 2012): expanding f(θ ± ε) = f(θ) ± ε f'(θ) + (ε²/2) f''(θ) ± O(ε³) and subtracting cancels the even-order terms.

The error of this approximation is therefore on the order of ε². If ε = 0.01, then ε² = 0.0001.

This two-sided difference formula is more accurate than the one-sided formula f'(θ) ≈ (f(θ + ε) − f(θ)) / ε, whose error is on the order of ε. A short sketch comparing the two follows.
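As a quick illustration of the O(ε) vs. O(ε²) behaviour, a small sketch using the same f(θ) = θ³ example (the loop over ε values is just for illustration):

```python
def f(theta):
    return theta ** 3

theta, exact = 1.0, 3.0  # f'(1) = 3 * 1**2 = 3

for eps in (1e-1, 1e-2, 1e-3):
    one_sided = (f(theta + eps) - f(theta)) / eps              # error shrinks ~ eps
    two_sided = (f(theta + eps) - f(theta - eps)) / (2 * eps)  # error shrinks ~ eps**2
    print(eps, abs(one_sided - exact), abs(two_sided - exact))
```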
Gradient checking
• Take the parameters W[1], b[1], ..., W[L], b[L] and reshape/concatenate them into one big vector θ.
• Take the gradients dW[1], db[1], ..., dW[L], db[L] and concatenate them into one big vector dθ.
• How do we validate that dθ is the gradient of the cost J(θ)? (A sketch of the reshaping step follows.)
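A minimal sketch of the reshaping step, assuming the parameters live in a dictionary keyed "W1", "b1", ..., "WL", "bL" (that layout is an assumption, not something specified in the slides):

```python
import numpy as np

def dictionary_to_vector(params, num_layers):
    # Flatten W1, b1, ..., WL, bL into one big vector theta.
    pieces = []
    for l in range(1, num_layers + 1):
        pieces.append(params["W" + str(l)].reshape(-1))
        pieces.append(params["b" + str(l)].reshape(-1))
    return np.concatenate(pieces)

# The gradients dW1, db1, ..., dWL, dbL are flattened the same way into dtheta.
```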
Gradient checking implementation

For each i:

    dθ_approx[i] = (J(θ₁, ..., θᵢ + ε, ...) − J(θ₁, ..., θᵢ − ε, ...)) / (2ε)

• To check whether dθ_approx ≈ dθ, the relative distance between the two vectors is computed:

    ‖dθ_approx − dθ‖₂ / (‖dθ_approx‖₂ + ‖dθ‖₂)

(A sketch of this check follows.)
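A sketch of the check itself, assuming a callable J(theta) that evaluates the cost from the flattened vector (in practice you would rebuild the parameter dictionary from theta before running forward propagation; that helper is not shown here):

```python
import numpy as np

def gradient_check(J, theta, dtheta, eps=1e-7):
    # Compare dtheta (from backprop) with a two-sided numerical estimate.
    dtheta_approx = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus = theta.copy()
        theta_minus = theta.copy()
        theta_plus[i] += eps
        theta_minus[i] -= eps
        dtheta_approx[i] = (J(theta_plus) - J(theta_minus)) / (2 * eps)

    # Relative distance between the two vectors.
    numerator = np.linalg.norm(dtheta_approx - dtheta)
    denominator = np.linalg.norm(dtheta_approx) + np.linalg.norm(dtheta)
    return numerator / denominator
```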
Interpretation of the check value (ε = 10⁻⁷)

• If the ratio is around 10⁻⁷ or smaller, the derivative approximation is OK.
• If it is around 10⁻⁵, let's take a careful look.
• If it is around 10⁻³ or larger, there might be a bug. Review the components of dθ_approx against dθ and make sure that no individual component difference is too large.
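A small usage sketch of those cut-offs, applied to the ratio returned by the gradient_check sketch above (the wording of the messages is illustrative):

```python
def interpret_difference(difference):
    # Interpret the relative distance returned by gradient_check.
    if difference < 1e-7:
        return "great: backprop gradients look correct"
    if difference < 1e-5:
        return "borderline: take a careful look at the largest components"
    return "possible bug: compare dtheta_approx[i] against dtheta[i] component-wise"

print(interpret_difference(3e-8))  # -> "great: backprop gradients look correct"
```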
Tips to implement gradient check
• Don't run gradient checking during training; it must be used only for debugging.
• When doing a gradient check, remember the regularization term: if the cost J includes regularization, then dθ must include the gradient of the regularization term as well.
• Gradient checking does not work with dropout, because dropout randomly eliminates subsets of hidden units.
• Turn off dropout to check gradients (keep_prob = 1) and then turn dropout back on. (A small sketch of the regularization point follows this list.)
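A minimal sketch of the regularization point, assuming an L2-regularized cost; cross_entropy_cost, parameters, lambd, and m are illustrative names, not from the slides:

```python
import numpy as np

def cost_with_regularization(cross_entropy_cost, parameters, lambd, m, num_layers):
    # The cost J used for gradient checking must include the L2 term,
    # so dtheta must include its gradient (lambd / m) * W[l] as well.
    l2 = 0.0
    for l in range(1, num_layers + 1):
        l2 += np.sum(np.square(parameters["W" + str(l)]))
    return cross_entropy_cost + (lambd / (2 * m)) * l2

# Dropout tip: run forward and backward propagation with keep_prob = 1 while
# gradient checking, then restore the original keep_prob for training.
```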
Tips to implement gradient check

• Run gradient checking at initialization (when W and b are still close to zero) and again after some training iterations.
• This is useful because the derivatives can look correct for small values of W and b, yet as the values grow during training, a bug can produce large differences in the gradients.
References
• Ng, A. (2022). Deep Learning Specialization. https://www.deeplearning.ai/courses/deep-learning-specialization/
• Bengio, Y. (2012). Practical Recommendations for Gradient-Based Training of Deep Architectures. In Lecture Notes in Computer Science (Vol. 7700, pp. 437–478). https://doi.org/10.1007/978-3-642-35289-8_26
Thank you!
