
Note on Backpropagation

John Hull

Backpropagation is a way of using the chain rule to calculate derivatives of the mean squared error (or other objective function) with respect to the parameter values. For convenience we assume a single target. The mean squared error is given by:

$$E = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$$

where there are $n$ observations, $\hat{y}_i$ is the value of the target for the $i$th observation, and $y_i$ is the estimate of the target's value given by the neural network. If $\theta$ is the value of a parameter, then

$$\frac{\partial E}{\partial \theta} = -\frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)\frac{\partial y_i}{\partial \theta}$$

We can therefore consider each observation separately, calculating $\partial y_i/\partial \theta$, and at the end use this equation to get the partial derivative we are interested in.
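
To make this concrete, here is a minimal sketch (Python with NumPy; the targets, estimates, and per-observation derivatives below are hypothetical placeholder values, not from the note) of how $\partial E/\partial \theta$ is assembled once each $\partial y_i/\partial \theta$ is known:

```python
import numpy as np

# Hypothetical values: targets y-hat_i, network estimates y_i, and the
# per-observation derivatives dy_i/dtheta produced by the backward pass.
y_hat = np.array([1.0, 2.0, 3.0])
y = np.array([0.9, 2.2, 2.7])
dy_dtheta = np.array([0.5, -0.1, 0.3])

n = len(y)
# dE/dtheta = -(2/n) * sum_i (y-hat_i - y_i) * (dy_i/dtheta)
dE_dtheta = -(2.0 / n) * np.sum((y_hat - y) * dy_dtheta)
print(dE_dtheta)
```
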
We start with the values of $\theta$ used to calculate the estimate $y_i$ at the end of the network and work back through the network considering other values. As in Chapter 6, we define $L$ as the number of layers and $K$ as the number of neurons per layer. The value at the $k$th neuron for layer $l$ will be denoted by $V_{lk}$ ($1 \le k \le K$; $1 \le l \le L$).
First we note that if $\theta$ is a parameter relating the output to the final layer, $\partial y_i/\partial \theta$ can be calculated without difficulty. If $\theta$ is a parameter relating the value at neuron $k$ of the final layer to a neuron in the penultimate layer, we have from the chain rule


$$\frac{\partial y_i}{\partial \theta} = \frac{\partial y_i}{\partial V_{Lk}}\,\frac{\partial V_{Lk}}{\partial \theta}$$

Both $\partial y_i/\partial V_{Lk}$ and $\partial V_{Lk}/\partial \theta$ can be calculated without difficulty.
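
As an illustration, the sketch below computes both factors for a single parameter. The sigmoid activation, the linear output, and the weight names `u` and `w` are assumptions of this example, not part of the note:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical final stage of a network with K = 3 neurons per layer.
V_prev = np.array([0.2, 0.7, 0.5])      # V_{L-1,j}, penultimate layer
u = np.array([[ 0.1, -0.4,  0.6],       # u[j, k] links neuron j of the
              [ 0.3,  0.2, -0.5],       # penultimate layer to neuron k
              [-0.2,  0.8,  0.1]])      # of the final layer
w = np.array([0.4, -0.3, 0.8])          # weights from final layer to output

V_L = sigmoid(V_prev @ u)               # V_{Lk} (biases omitted)
y_i = w @ V_L                           # network estimate (linear output)

j, k = 1, 2
dy_dV_Lk = w[k]                         # dy_i/dV_{Lk}: the output weight
# dV_{Lk}/dtheta for theta = u[j, k]; sigmoid'(z) = V(1 - V)
dV_Lk_dtheta = V_L[k] * (1.0 - V_L[k]) * V_prev[j]
dy_dtheta = dy_dV_Lk * dV_Lk_dtheta     # chain rule
print(dy_dtheta)
```
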


Now let us consider the situation where the parameter $\theta$ relates the value at neuron $k$ of layer $l$ to a neuron in layer $l-1$ ($l < L$). Then

$$\frac{\partial y_i}{\partial \theta} = \frac{\partial y_i}{\partial V_{lk}}\,\frac{\partial V_{lk}}{\partial \theta}$$

The partial derivative $\partial V_{lk}/\partial \theta$ can be calculated without difficulty. We have to do a little more work to calculate $\partial y_i/\partial V_{lk}$. An application of the chain rule gives
$$\frac{\partial y_i}{\partial V_{lk}} = \sum_{k^*=1}^{K}\frac{\partial y_i}{\partial V_{l+1,k^*}}\,\frac{\partial V_{l+1,k^*}}{\partial V_{lk}}$$

The partial derivative $\partial V_{l+1,k^*}/\partial V_{lk}$ can be calculated without difficulty for all $k$ and $k^*$. Because calculations start at the end of the network and work back, we have already calculated the values of $\partial y_i/\partial V_{l+1,k^*}$ for all $k^*$ by the time that we consider a $\theta$ that relates layer $l-1$ to layer $l$.
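
A minimal sketch of one step of this recursion, assuming sigmoid activations (so that the derivative of a neuron's value with respect to its pre-activation is $V(1-V)$); the function and variable names are hypothetical:

```python
import numpy as np

def backprop_layer(dy_dV_next, V_next, W):
    """Given dy_i/dV_{l+1,k*} for all k* (dy_dV_next), the sigmoid
    values V_{l+1,k*} (V_next), and the weights W[k, k*] linking
    V_{lk} to neuron k* of layer l+1, return dy_i/dV_{lk} for all k."""
    # dV_{l+1,k*}/dV_{lk} = sigmoid'(z_{k*}) * W[k, k*] = V(1-V) * W[k, k*]
    return W @ (V_next * (1.0 - V_next) * dy_dV_next)

# Hypothetical values for one step of the backward recursion
dy_dV_next = np.array([0.4, -0.2])
V_next = np.array([0.6, 0.8])
W = np.array([[0.1,  0.5],
              [0.3, -0.7],
              [0.2,  0.9]])
print(backprop_layer(dy_dV_next, V_next, W))  # dy_i/dV_{lk} for each k
```
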
Taken together, the equations we have presented provide a fast way to calculate all the partial derivatives necessary for the gradient descent algorithm.
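
Putting the pieces together, the sketch below runs the full backward pass on a toy two-layer network and checks one derivative against a finite-difference estimate. The architecture, activations, data, and all names are hypothetical illustrations of the scheme, not a prescribed implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy network: 2 inputs -> layer 1 (3 neurons) -> layer 2 (3 neurons)
# -> scalar output.  Here L = 2 and K = 3; all values are hypothetical.
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 3)), rng.normal(size=(3, 3))
w_out = rng.normal(size=3)

def forward(x):
    V1 = sigmoid(x @ W1)          # V_{1k}
    V2 = sigmoid(V1 @ W2)         # V_{2k}
    return V1, V2, V2 @ w_out     # estimate y_i (linear output)

x, y_hat = np.array([0.5, -1.0]), 0.7   # one observation
V1, V2, y = forward(x)

# Backward pass, as in the note: start at the output, work back.
dy_dV2 = w_out                           # dy_i/dV_{2k}
dy_dV1 = W2 @ (V2 * (1 - V2) * dy_dV2)   # recursion: layer 2 -> layer 1

# dy_i/dtheta for theta = W1[j, k], a parameter feeding layer 1
j, k = 0, 1
dy_dW1jk = dy_dV1[k] * V1[k] * (1 - V1[k]) * x[j]

# With a single observation (n = 1), dE/dtheta = -2 (y_hat - y) dy_i/dtheta
dE_dW1jk = -2.0 * (y_hat - y) * dy_dW1jk
print("dE/dW1[0,1] =", dE_dW1jk)

# Finite-difference check of dy_i/dtheta
eps = 1e-6
W1[j, k] += eps
_, _, y_plus = forward(x)
W1[j, k] -= 2 * eps
_, _, y_minus = forward(x)
W1[j, k] += eps
print(dy_dW1jk, (y_plus - y_minus) / (2 * eps))  # should agree closely
```
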
