12-Back propagation Algorithm-07-08-2024


BEEE410L MACHINE LEARNING

Dr.S.ALBERT ALEXANDER
SCHOOL OF ELECTRICAL ENGINEERING
albert.alexander@vit.ac.in

Module 2
Artificial Neural Networks
❖ Perceptron Learning Algorithm

❖ Multi-layer Perceptron, Feed-forward Network, Feedback Network
❖ Back propagation Algorithm

❖ Recurrent Neural Network (RNN)

❖ Convolutional Neural Network (CNN)

2.3 Back propagation Algorithm
❖ The demonstration of the limitations of single-layer neural
networks was a significant factor in the decline of interest
in neural networks in the 1970s
❖ The discovery (by several researchers independently) and
widespread dissemination of an effective general method
of training a multilayer neural network played a major role
in the reemergence of neural networks as a tool for solving
a wide variety of problems
❖ One such training method is known as backpropagation (of
errors) or the generalized delta rule
❖ It is simply a gradient descent method to minimize the total
squared error of the output computed by the net
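
For a single training pattern with targets tk and computed outputs yk, the total squared error that gradient descent minimizes is commonly written as:

```latex
E = \tfrac{1}{2} \sum_{k=1}^{m} \left( t_k - y_k \right)^2
```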

Back propagation Algorithm
❖ The training of a network by backpropagation involves
three stages: the feedforward of the input training pattern,
the calculation and backpropagation of the associated
error, and the adjustment of the weights
❖ After training, application of the net involves only the
computations of the feedforward phase
❖ Even if training is slow, a trained net can produce its output
very rapidly
❖ Numerous variations of backpropagation have been
developed to improve the speed of the training process
❖ More than one hidden layer may be beneficial for some
applications, but one hidden layer is sufficient

Architecture
❖ A multilayer neural network with one layer of hidden units
(the Z units) is shown in next slide
❖ The output units (the Y units) and the hidden units also
may have biases (as shown)
❖ The bias on a typical output unit Yk is denoted by w0k
❖ The bias on a typical hidden unit Zj is denoted by v0j
❖ These bias terms act like weights on connections from
units whose output is always 1
❖ Only the direction of information flow for the feedforward
phase of operation is shown
❖ During the backpropagation phase of learning, signals are
sent in the reverse direction
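
As a concrete illustration of this architecture (and of the weight initialization in Step 0 of the training algorithm below), the parameters can be held in two arrays whose row 0 stores the bias terms. This is only a sketch; the array names, shapes, and initialization range are illustrative, not taken from the slides.

```python
import numpy as np

# Illustrative parameter layout for an n-p-m network:
# V[0]  holds the hidden biases v0j,  V[1:] the input-to-hidden weights vij
# W[0]  holds the output biases w0k,  W[1:] the hidden-to-output weights wjk
n, p, m = 2, 2, 1                              # input, hidden, output unit counts
rng = np.random.default_rng(0)
V = rng.uniform(-0.5, 0.5, size=(n + 1, p))    # Step 0: small random values
W = rng.uniform(-0.5, 0.5, size=(p + 1, m))
```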

Architecture

[Figure: multilayer network with input units Xi, hidden units Zj, and output units Yk; weights vij connect the input layer to the hidden layer, weights wjk connect the hidden layer to the output layer, and bias weights v0j and w0k come from units whose output is always 1]
Algorithm
❖ During feedforward, each input unit (Xi) receives an input signal and broadcasts this signal to each of the hidden units Z1, …, Zp
❖ Each hidden unit then computes its activation and sends its
signal (zj) to each output unit
❖ Each output unit (Yk) computes its activation (yk) to form
the response of the net for the given input pattern

❖ During training, each output unit compares its computed activation yk with its target value tk to determine the associated error for that pattern with that unit
❖ Based on this error, the factor δk (k = 1,...,m) is computed

Algorithm
❖ δk is used to distribute the error at output unit Yk back to all units in the previous layer (the hidden units that are connected to Yk)
❖ It is also used (later) to update the weights between the output and the hidden layer
❖ In a similar manner, the factor δj (j=1, ...,p) is computed for each hidden unit Zj
❖ It is not necessary to propagate the error back to the input layer, but δj is used to update the weights between the hidden layer and the input layer
❖ After all of the δ factors have been determined, the weights for all layers are adjusted simultaneously

Algorithm
❖ The adjustment to the weight wjk (from hidden unit Zj to output unit Yk) is based on the factor δk and the activation zj of the hidden unit Zj
❖ The adjustment to the weight vij (from input unit Xi to hidden unit Zj) is based on the factor δj and the activation xi of the input unit

Training Algorithm
Step 0:
❖ Initialize weights. (Set to small random values)

Step 1:
❖ While stopping condition is false, do Steps 2-9

Step 2:
❖ For each training pair, do Steps 3-8

Feedforward
Step 3:
❖ Each input unit (Xi, i=1,...,n) receives input signal xi and broadcasts this signal to all units in the layer above (the hidden units)
Training Algorithm
Step 4:
❖ Each hidden unit (Zj, j=1,...,p) sums its weighted input signals
z_inj = v0j + Σ_{i=1}^{n} xi vij
❖ Applies its activation function to compute its output signal
zj = f(z_inj)
❖ Sends this signal to all units in the layer above (output
units)

Training Algorithm
Step 5:
❖ Each output unit (Yk, k=1,...,m) sums its weighted input signals
y_ink = w0k + Σ_{j=1}^{p} zj wjk
❖ Applies its activation function to compute its output signal,
yk = f(y_ink)
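
A minimal NumPy sketch of the feedforward phase (Steps 3-5), assuming the binary sigmoid as the activation function and the bias-in-row-0 array layout sketched earlier; the function and variable names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # binary sigmoid, one common choice of f

def feedforward(x, V, W):
    """Steps 3-5 for one input pattern x; V and W store the biases in row 0."""
    z_in = V[0] + x @ V[1:]           # z_inj = v0j + sum_i xi vij
    z = sigmoid(z_in)                 # zj = f(z_inj)
    y_in = W[0] + z @ W[1:]           # y_ink = w0k + sum_j zj wjk
    y = sigmoid(y_in)                 # yk = f(y_ink)
    return z, y
```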

Training Algorithm
Backpropagation of error

Step 6:
❖ Each output unit (Yk, k=1,...,m) receives a target pattern corresponding to the input training pattern and computes its error information term
δk = (tk - yk) f'(y_ink)
❖ Calculates its weight correction term (used to update wjk later),
Δwjk = α δk zj
❖ Calculates its bias correction term (used to update w0k later),
Δw0k = α δk
❖ Sends δk to units in the layer below

Training Algorithm
Backpropagation of error
Step 7:
❖ Each hidden unit (Zj, j=1,...,p) sums its delta inputs (from units in the layer above)
δ_inj = Σ_{k=1}^{m} δk wjk
❖ Multiplies by the derivative of its activation function to calculate its error information term
δj = δ_inj f'(z_inj)
❖ Calculates its weight correction term (used to update vij later),
Δvij = α δj xi
❖ Calculates its bias correction term (used to update v0j later),
Δv0j = α δj
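
Continuing the feedforward sketch above, Steps 6-7 compute the error terms and the correction terms for one pattern. The binary sigmoid derivative f'(net) = f(net)[1 - f(net)] is assumed, and the names are again illustrative:

```python
import numpy as np

def error_terms_and_corrections(x, z, y, t, W, alpha):
    """Steps 6-7: delta terms and weight/bias corrections for one pattern."""
    delta_k = (t - y) * y * (1.0 - y)              # δk = (tk - yk) f'(y_ink)
    delta_in = delta_k @ W[1:].T                   # δ_inj = sum_k δk wjk
    delta_j = delta_in * z * (1.0 - z)             # δj = δ_inj f'(z_inj)
    dW = np.vstack([alpha * delta_k,               # row 0: Δw0k = α δk
                    alpha * np.outer(z, delta_k)]) # Δwjk = α δk zj
    dV = np.vstack([alpha * delta_j,               # row 0: Δv0j = α δj
                    alpha * np.outer(x, delta_j)]) # Δvij = α δj xi
    return dW, dV
```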

Training Algorithm
Update weights and biases
Step 8:
❖ Each output unit (Yk, k=1,…,m) updates its bias and weights (j = 0,…,p):
wjk(new) = wjk(old) + Δwjk
❖ Each hidden unit (Zj, j=1,…,p) updates its bias and weights (i=0,…,n):
vij(new) = vij(old) + Δvij

Step 9:
❖ Test the stopping condition
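
Putting Steps 0-9 together, a compact, self-contained training loop might look as follows. It is only a sketch: it assumes the binary sigmoid activation and a fixed number of epochs in place of an explicit stopping test, and all names and default values are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_backprop(X, T, p, alpha=0.25, epochs=5000, seed=0):
    """Per-pattern backpropagation for one hidden layer of p units.
    X: (patterns, n) inputs, T: (patterns, m) targets; biases live in row 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape[1], T.shape[1]
    V = rng.uniform(-0.5, 0.5, (n + 1, p))            # Step 0: small random weights
    W = rng.uniform(-0.5, 0.5, (p + 1, m))
    for _ in range(epochs):                           # Step 1: stopping condition
        for x, t in zip(X, T):                        # Step 2: each training pair
            z = sigmoid(V[0] + x @ V[1:])             # Steps 3-4: hidden signals
            y = sigmoid(W[0] + z @ W[1:])             # Step 5: output signals
            delta_k = (t - y) * y * (1 - y)           # Step 6: output error terms
            delta_j = (delta_k @ W[1:].T) * z * (1 - z)   # Step 7: hidden error terms
            W[1:] += alpha * np.outer(z, delta_k)     # Step 8: update weights
            W[0] += alpha * delta_k                   # and biases
            V[1:] += alpha * np.outer(x, delta_j)
            V[0] += alpha * delta_j
    return V, W

# Example usage: train on the XOR patterns with 4 hidden units
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])
V, W = train_backprop(X, T, p=4, alpha=0.5, epochs=10000)
```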

Analysis
❖ In implementing the backpropagation algorithm, separate arrays should be used for the deltas of the output units (Step 6, δk) and the deltas of the hidden units (Step 7, δj)
❖ An epoch is one cycle through the entire set of training
vectors
❖ Typically, many epochs are required for training a
backpropagation neural net
❖ The foregoing algorithm updates the weights after each
training pattern is presented
❖ A common variation is batch updating, in which weight
updates are accumulated over an entire epoch (or some
other number of presentations of patterns) before being
applied
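
A sketch of the batch-updating variation under the same assumptions as the sketches above: the corrections are accumulated over one epoch and applied only at its end (names are illustrative).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_epoch_batch(X, T, V, W, alpha):
    """One epoch of batch updating: accumulate Δw and Δv, then apply once."""
    dV, dW = np.zeros_like(V), np.zeros_like(W)
    for x, t in zip(X, T):
        z = sigmoid(V[0] + x @ V[1:])
        y = sigmoid(W[0] + z @ W[1:])
        delta_k = (t - y) * y * (1 - y)
        delta_j = (delta_k @ W[1:].T) * z * (1 - z)
        dW[1:] += alpha * np.outer(z, delta_k)
        dW[0] += alpha * delta_k
        dV[1:] += alpha * np.outer(x, delta_j)
        dV[0] += alpha * delta_j
    return V + dV, W + dW          # weights change only after the whole epoch
```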
Analysis
❖ Note that f'(y_ink) and f'(z_inj) can be expressed in terms of yk and zj, respectively, using the appropriate formulas, depending on the choice of activation function (see the identities below)
❖ The mathematical basis for the backpropagation algorithm
is the optimization technique known as gradient descent
❖ The gradient of a function (in this case, the function is the error and the variables are the weights of the net) gives the direction in which the function increases most rapidly
❖ The negative of the gradient gives the direction in which
the function decreases most rapidly
❖ The derivation clarifies the reason why the weight updates should be done after all of the δk and δj expressions have been calculated, rather than during backpropagation
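
For the two activation functions most often used with backpropagation, the derivatives referred to above can be expressed directly in terms of the function value:

```latex
f_1(x) = \frac{1}{1 + e^{-x}}, \qquad f_1'(x) = f_1(x)\,[1 - f_1(x)] \quad \text{(binary sigmoid)}
```
```latex
f_2(x) = \frac{2}{1 + e^{-x}} - 1, \qquad f_2'(x) = \tfrac{1}{2}\,[1 + f_2(x)]\,[1 - f_2(x)] \quad \text{(bipolar sigmoid)}
```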
Example-1
Using the backpropagation network, calculate the new weights for the network shown in the figure. The network is presented with the input pattern [0, 1] and the target output is 1. Use a learning rate of 0.25 and the identity activation function.

Solution
The initial weights are:
❖ [v11 v21 v01] = [0.6 -0.1 0.3]

❖ [v12 v22 v02]= [-0.3 0.4 0.5]

❖ [w1 w2 w0]= [0.4 0.1 -0.2]

Learning rate is 0.25


❖ Here the activation function used is identity activation
function
❖ Thus, f(x)= x

Given sample:
❖ [x1, x2]=[0,1] and target t=1

Solution
For Z1 layer
❖ Zin1= v01+x1v11+x2v21

❖ Zin1= 0.3+0 (0.6)+1(-0.1) = 0.2

For Z2 layer
❖ Zin2= v02+x1v12+x2v22

❖ Zin2 = 0.5+0 (-0.3)+1(0.4) = 0.9

Applying the activation function to calculate the output, we get


❖ z1= f(Zin1)= 0.2

❖ z2= f(Zin2)= 0.9

Solution
Calculate the net input entering the output layer (the Y layer)
❖ Yin = w0 + z1w1 + z2w2 = -0.03

❖ Applying the activation function to calculate the output, we get

❖ Y = f(Yin) = -0.03

Compute the error portion δk

δk = (tk - yk) f'(y_ink)

Now,
f'(y_ink) = f(yin)[1 - f(yin)]

Solution
❖ f(yin)=yk= 0.2(0.4)+0.9(0.1)+1(-0.2) = -0.03
❖ f'(y_ink) = -0.03[1 - (-0.03)] = -0.0309

This implies,
❖ δ1 = (1 - (-0.03)) × (-0.0309) = -0.031827

Find the changes in weights between the hidden and output layer
❖ Δw1 = αδ1z1 = 0.25 × (-0.031827) × 0.2 = -0.00159135

❖ Δw2 = αδ1z2 = 0.25 × (-0.031827) × 0.9 = -0.007161075

❖ Δw0 = αδ1 = 0.25 × (-0.031827) = -0.00795675

Solution
Compute the error portion δj between the input and hidden layer (j = 1 to 2)
❖ δj = δ_inj f'(z_inj)

❖ δ_inj = Σ_{k=1}^{m} δk wjk

❖ δ_inj = δ1wj (since there is only one output unit)

❖ δ_in1 = δ1w1 = -0.031827 × 0.4 = -0.0127308

❖ δ_in2 = δ1w2 = -0.031827 × 0.1 = -0.0031827

❖ Error, δ1 = δ_in1 f'(z_in1)

❖ f'(z_in1) = f(z_in1)[1 - f(z_in1)] = 0.2(1 - 0.2) = 0.16

❖ δ1 = δ_in1 f'(z_in1) = -0.0127308 × 0.16 = -0.002036928

❖ Error, δ2 = δ_in2 f'(z_in2)

❖ f'(z_in2) = f(z_in2)[1 - f(z_in2)] = 0.9(1 - 0.9) = 0.09

❖ δ2 = δ_in2 f'(z_in2) = -0.0031827 × 0.09 = -0.000286443

Solution
Now find the changes in weights between the input and hidden layer
❖ Δv11 = αδ1x1 = 0.25 × (-0.002036928) × 0 = 0

❖ Δv21 = αδ1x2 = 0.25 × (-0.002036928) × 1 = -0.000509232

❖ Δv01 = αδ1 = 0.25 × (-0.002036928) = -0.000509232

❖ Δv12 = αδ2x1 = 0.25 × (-0.000286443) × 0 = 0

❖ Δv22 = αδ2x2 = 0.25 × (-0.000286443) × 1 = -0.00007161075
❖ Δv02 = αδ2 = 0.25 × (-0.000286443) = -0.00007161075

Solution
Compute the final weights of the network
❖ v11(new) = v11(old) + Δv11 = 0.6 + 0 = 0.6

❖ v12(new) = v12(old) + Δv12 = -0.3 + 0 = -0.3

❖ v21(new) = v21(old) + Δv21 = -0.1 + (-0.000509232) = -0.100509

❖ v22(new) = v22(old) + Δv22 = 0.4 + (-0.00007161075) = 0.399928

❖ w1(new) = w1(old) + Δw1 = 0.4 + (-0.00159135) = 0.398409

❖ w2(new) = w2(old) + Δw2 = 0.1 + (-0.007161075) = 0.092839
❖ v01(new) = v01(old) + Δv01 = 0.3 + (-0.000509232) = 0.299491
❖ v02(new) = v02(old) + Δv02 = 0.5 + (-0.00007161075) = 0.499928
❖ w0(new) = w0(old) + Δw0 = -0.2 + (-0.00795675) = -0.207957
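
A short NumPy check of the arithmetic above. It follows the slides' steps exactly: the forward pass uses the identity function f(x) = x, while the derivative is taken as f(net)[1 - f(net)], the logistic-derivative form written in the solution (a strict identity activation would have derivative 1). The variable names are illustrative.

```python
import numpy as np

# Reproduces Example 1 step by step, following the slides' calculation.
x = np.array([0.0, 1.0]); t = 1.0; alpha = 0.25
V = np.array([[0.6, -0.3],            # [v11, v12]
              [-0.1, 0.4]])           # [v21, v22]
v0 = np.array([0.3, 0.5])             # [v01, v02]
w = np.array([0.4, 0.1])              # [w1, w2]
w0 = -0.2

z = v0 + x @ V                        # [0.2, 0.9]   (identity: z = z_in)
y = w0 + z @ w                        # -0.03        (identity: y = y_in)

delta = (t - y) * y * (1 - y)         # -0.031827
dw, dw0 = alpha * delta * z, alpha * delta        # Δw1, Δw2 and Δw0
delta_in = delta * w                               # [-0.0127308, -0.0031827]
delta_h = delta_in * z * (1 - z)                   # [-0.002036928, -0.000286443]
dV, dv0 = alpha * np.outer(x, delta_h), alpha * delta_h

print(V + dV)    # ≈ [[ 0.6      -0.3     ]
                 #    [-0.100509  0.399928]]
print(v0 + dv0)  # ≈ [0.299491  0.499928]
print(w + dw)    # ≈ [0.398409  0.092839]
print(w0 + dw0)  # ≈ -0.207957
```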

