Neural Network
Sudipta Roy
Experiment
• Pigeon in a Skinner box
• Present paintings of two different artists (e.g. Chagall / Van Gogh)
• Reward for pecking when presented with a particular artist (e.g. Van Gogh)
Results
• Pigeons were able to discriminate between Vincent Van Gogh and Marc Chagall with 95% accuracy (when presented with pictures they had been trained on).
• Discrimination was still 85% successful for previously unseen paintings by the artists.
• Pigeons do not simply memorise the pictures.
• They can extract and recognise patterns (the 'style').
• They generalise from what they have already seen to make predictions.
• This is what neural networks (biological and artificial) are good at (unlike conventional computers).
Neural Networks
A Neural Network (NN) is a processing device, either an algorithm or actual hardware, whose design was inspired by the design and functioning of animal brains and components thereof.
Biological inspiration
[Figure: a biological neuron; input spikes arrive at the dendrites, the output spike travels along the axon, and the signal is passed on at the synapses via excitatory post-synaptic potentials.]
[Figure: an artificial neuron; inputs x1, x2, …, xn arrive with synaptic weights w1, w2, …, wn, and the neuron produces the output y.]

$$y_{in} = \sum_{i=1}^{n} w_i x_i \,; \qquad y = f(y_{in})$$
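A minimal Python/NumPy sketch of this neuron model (the step activation and the example numbers are illustrative choices, not from the slides):

```python
import numpy as np

def neuron(x, w, f):
    """Artificial neuron: weighted sum of inputs passed through activation f."""
    y_in = np.dot(w, x)   # y_in = sum_i w_i * x_i
    return f(y_in)        # y = f(y_in)

# Example: three inputs with a simple step activation
x = np.array([1.0, 0.5, -0.2])
w = np.array([0.4, 0.6, 1.0])
step = lambda s: 1.0 if s >= 0 else 0.0
print(neuron(x, w, step))  # -> 1.0
```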
Properties of ANN
• High-level abstraction of the neural input-output transformation
– inputs → weighted sum → nonlinear function → output
• Typically no spikes
• Typically use implausible constraints or learning rules
• Often used where data or functions are uncertain
– Goal is to learn from a set of training data
– And to generalize from learned instances to new unseen data
• Key attributes
– Parallel computation
– Distributed representation and storage of data
– Learning (networks adapt themselves to solve a problem)
– Fault tolerance (insensitive to component failures)
Basic entities of ANN
• Three basic entities:
– Interconnections
– Learning rules: adopted for updating and adjusting the connection weights
– Activation functions
Connections
(lead to the network architecture)
[Figure: single-layer and multilayer feedforward networks (inputs X1, …, Xn connected through weights W11, …, Wnm, possibly via hidden units z1, …, zk, to outputs y1, …, ym), and recurrent networks with feedback connections from the outputs back to the inputs.]
Learning
The main property of an ANN is its capability to learn.
Three categories:
Supervised learning
Unsupervised learning
Reinforcement learning
Supervised Learning
[Figure: the input X is applied to the ANN (weights W), which produces the actual output Y; an error-signal generator compares Y with the desired output D and feeds the error signals (D − Y) back to adjust W.]
In reinforcement learning there is no desired output; the error-signal generator instead receives a scalar reinforcement signal R from the environment.
[Figure: the identity (linear) activation function, $f(x) = x$.]
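As an illustrative sketch of this error-driven loop, assuming the identity activation above and the classic delta rule (neither the rule nor the data below come from the slides):

```python
import numpy as np

def supervised_step(w, x, d, c=0.1):
    """One supervised-learning step: compare actual output Y with desired D,
    feed the error signal (D - Y) back to adjust the weights."""
    y = np.dot(w, x)          # actual output Y (identity activation, f(x) = x)
    error = d - y             # error signal D - Y
    return w + c * error * x  # weight update driven by the error signal

w = np.zeros(2)
for _ in range(50):           # repeated presentations of one training pair
    w = supervised_step(w, np.array([1.0, 2.0]), d=3.0)
print(w)                      # approaches a weight vector with w.x = 3
```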
Important terminologies
Weights
Bias
Threshold
……
Assignment 1
Search for at least four papers related to your project in which the Neural Network concept is used, and study them. Prepare a review report based on them and submit it within a week.
If you cannot find any Neural Network paper linked to your project, then take any artificial intelligence technique as the base concept and prepare the survey report.
The report should include: Title, Abstract, Introduction (mentioning the papers), Comparison (with results and discussion), Conclusion, References.
Linear Separability
An ANN does not give an exact solution for a nonlinear problem; however, it can provide approximate solutions to nonlinear problems.
Linear separability is the concept wherein the separation of the input space into regions is based on whether the network response is positive or negative. A decision line is drawn to separate the positive and negative responses. The decision line may also be called the decision-making line, decision-support line, or linearly separable line.
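An illustrative sketch (not from the slides): a single-layer perceptron can find a decision line for the linearly separable AND function, but no such line exists for XOR:

```python
import numpy as np

def train_perceptron(X, targets, epochs=20, c=0.1):
    """Single-layer perceptron: learns a decision line w.x + b = 0
    separating positive from negative responses (if one exists)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, targets):
            y = 1.0 if np.dot(w, x) + b >= 0 else 0.0
            w, b = w + c * (t - y) * x, b + c * (t - y)
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
w, b = train_perceptron(X, np.array([0, 0, 0, 1.0]))  # AND: linearly separable
print([1.0 if np.dot(w, x) + b >= 0 else 0.0 for x in X])  # [0, 0, 0, 1]
w, b = train_perceptron(X, np.array([0, 1, 1, 0.0]))  # XOR: not separable
print([1.0 if np.dot(w, x) + b >= 0 else 0.0 for x in X])  # never all correct
```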
Topologies of Neural Networks
[Figure: three topologies; feedforward (directed, acyclic), recurrent (with feedback connections), and completely connected.]
Artificial neurons
The McCulloch-Pitts model:
• spikes are interpreted as spike rates;
• synaptic strengths are translated into synaptic weights;
• excitation means a positive product between the incoming spike rate and the corresponding synaptic weight;
• inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.
Artificial neurons
Nonlinear generalization of the McCulloch-Pitts neuron:

$$y = f(x, w)$$

where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights.

Examples:

Sigmoidal neuron:
$$y = \frac{1}{1 + e^{-w^T x - a}}$$

Gaussian neuron:
$$y = e^{-\frac{\|x - w\|^2}{2a^2}}$$
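Both neuron types are easy to express directly; a short Python/NumPy sketch (the parameter values are illustrative):

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    """y = 1 / (1 + exp(-w.x - a))"""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x) - a))

def gaussian_neuron(x, w, a):
    """y = exp(-||x - w||^2 / (2 a^2))"""
    return np.exp(-np.linalg.norm(x - w) ** 2 / (2 * a ** 2))

x = np.array([0.5, -1.0])
print(sigmoidal_neuron(x, w=np.array([1.0, 2.0]), a=0.5))  # output in (0, 1)
print(gaussian_neuron(x, w=np.array([0.0, 0.0]), a=1.0))   # peaks when x == w
```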
Artificial neural networks
[Figure: a network of interconnected artificial neurons mapping inputs to an output; each neuron computes its output as y = f(x, w) from the outputs of the neurons feeding into it.]
The young animal learns that the green fruits are sour, while the yellowish/reddish ones are sweet. The learning happens by adapting the fruit-picking behavior.
Neural network mathematics
A neural network implements an input/output transformation:

$$y_{out} = F(x, W)$$

For a single linear neuron: $y_{out} = w^T x$.

MLP neural network (two sigmoidal layers and a linear output):

First layer:
$$y_k^1 = \frac{1}{1 + e^{-w^{1,k\,T} x - a_k^1}}, \quad k = 1, 2, 3; \qquad y^1 = (y_1^1, y_2^1, y_3^1)^T$$

Second layer:
$$y_k^2 = \frac{1}{1 + e^{-w^{2,k\,T} y^1 - a_k^2}}, \quad k = 1, 2; \qquad y^2 = (y_1^2, y_2^2)^T$$

Output:
$$y_{out} = \sum_{k=1}^{2} w_k^3 y_k^2 = w^{3\,T} y^2$$
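A direct transcription of these three equations into Python/NumPy (the random parameter values are illustrative only):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mlp_forward(x, W1, a1, W2, a2, w3):
    """Forward pass of the MLP above: two sigmoidal layers, linear output."""
    y1 = sigmoid(W1 @ x + a1)   # y_k^1, k = 1, 2, 3
    y2 = sigmoid(W2 @ y1 + a2)  # y_k^2, k = 1, 2
    return w3 @ y2              # y_out = w^3 . y^2

rng = np.random.default_rng(0)
x = np.array([0.2, -0.4])
y_out = mlp_forward(x,
                    W1=rng.normal(size=(3, 2)), a1=rng.normal(size=3),
                    W2=rng.normal(size=(2, 3)), a2=rng.normal(size=2),
                    w3=rng.normal(size=2))
print(y_out)
```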
RBF neural networks
RBF = radial basis function:

$$r(x) = r(\|x - c\|)$$

Example: the Gaussian RBF

$$f(x) = e^{-\frac{\|x - w\|^2}{2a^2}}$$

An RBF network with four Gaussian units:

$$y_{out} = \sum_{k=1}^{4} w_k^2 \, e^{-\frac{\|x - w^{1,k}\|^2}{2(a_k)^2}}$$

[Figure: an RBF network mapping x through four Gaussian units to y_out.]
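A short sketch of this four-unit Gaussian RBF network in Python/NumPy (the centers, widths, and weights are illustrative):

```python
import numpy as np

def rbf_forward(x, centers, widths, w2):
    """y_out = sum_k w2_k * exp(-||x - c_k||^2 / (2 a_k^2))"""
    h = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2 * widths ** 2))
    return w2 @ h

x = np.array([0.5, 0.5])
centers = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # w^{1,k}
print(rbf_forward(x, centers, widths=np.ones(4),
                  w2=np.array([1.0, -1.0, -1.0, 1.0])))
```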
Neural network tasks
• control
• classification
• prediction
• approximation
These can all be reformulated in general as FUNCTION APPROXIMATION tasks.
Learning means changing the weights to reduce the error, i.e. gradient descent on E:

$$w_i^{j,\,new} = w_i^j - c\, \frac{\partial E(W)}{\partial w_i^j}$$
Data: $(x^1, y_1), (x^2, y_2), \ldots, (x^N, y_N)$

Error:
$$E(t) = (y_{out}(t) - y_t)^2 = \left( \sum_{k=1}^{M} w_k^2(t)\, e^{-\frac{\|x^t - w^{1,k}\|^2}{2(a_k)^2}} - y_t \right)^2$$

Learning (gradient descent on the output weights):

$$w_i^2(t+1) = w_i^2(t) - c\, \frac{\partial E(t)}{\partial w_i^2}$$

$$\frac{\partial E(t)}{\partial w_i^2} = 2\,(F(x^t, W(t)) - y_t)\; e^{-\frac{\|x^t - w^{1,i}\|^2}{2(a_i)^2}}$$
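The output-weight rule can be transcribed directly; the following Python/NumPy sketch fits a single illustrative training pair (all values are assumptions for the example):

```python
import numpy as np

def rbf_hidden(x, centers, widths):
    """Gaussian basis activations exp(-||x - w^{1,k}||^2 / (2 a_k^2))."""
    return np.exp(-np.sum((centers - x) ** 2, axis=1) / (2 * widths ** 2))

def rbf_output_step(w2, x_t, y_t, centers, widths, c=0.1):
    """One gradient-descent step on the output weights w^2:
    dE/dw2_i = 2 (F(x,W) - y) * exp(-||x - w^{1,i}||^2 / (2 a_i^2))."""
    h = rbf_hidden(x_t, centers, widths)
    grad = 2.0 * (w2 @ h - y_t) * h
    return w2 - c * grad

centers = np.array([[0.0], [1.0]])
w2 = np.zeros(2)
for _ in range(100):  # fit one training pair (x = 0.2, y = 1)
    w2 = rbf_output_step(w2, np.array([0.2]), 1.0, centers, np.ones(2))
print(w2 @ rbf_hidden(np.array([0.2]), centers, np.ones(2)))  # approaches 1.0
```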
[Figure: an MLP with p layers; the input x flows through layers 1, 2, …, p−1, p to the output, each layer producing an output vector such as $y^2 = (y_1^2, \ldots, y_{M_2}^2)^T$.]

$$y_{out} = F(x; W) = w^{p\,T} y^{p-1}$$

Data: $(x^1, y_1), (x^2, y_2), \ldots, (x^N, y_N)$

Error: $E(t) = (y_{out}(t) - y_t)^2 = (F(x^t; W) - y_t)^2$

For an MLP with a single hidden layer of M sigmoidal neurons and a linear output:

$$y_{out} = F(x; W) = \sum_{k=1}^{M} w_k^2\, \frac{1}{1 + e^{-w^{1,k\,T} x - a_k}}$$
Learning with general optimisation

Synaptic weight change rule for the output neuron:

$$w_i^2(t+1) = w_i^2(t) - c\, \frac{\partial E(t)}{\partial w_i^2}$$

$$\frac{\partial E(t)}{\partial w_i^2} = 2\,(F(x^t, W(t)) - y_t)\, \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}}$$

Synaptic weight change rules for the neurons of the hidden layer:

$$w_j^{1,i}(t+1) = w_j^{1,i}(t) - c\, \frac{\partial E(t)}{\partial w_j^{1,i}}$$

$$\frac{\partial E(t)}{\partial w_j^{1,i}} = 2\,(F(x^t, W(t)) - y_t)\, w_i^2\, \frac{\partial}{\partial w_j^{1,i}} \left( \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}} \right)$$

$$\frac{\partial}{\partial w_j^{1,i}} \left( \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}} \right) = \frac{e^{-w^{1,i\,T} x^t - a_i}}{\left( 1 + e^{-w^{1,i\,T} x^t - a_i} \right)^2}\, x_j^t$$

$$w_j^{1,i}(t+1) = w_j^{1,i}(t) - c\, 2\,(F(x^t, W(t)) - y_t)\, w_i^2\, \frac{e^{-w^{1,i\,T} x^t - a_i}}{\left( 1 + e^{-w^{1,i\,T} x^t - a_i} \right)^2}\, x_j^t$$
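These update rules translate line by line into Python/NumPy; in the sketch below the offset updates for the a_i are an analogous addition not written out on the slides, and the toy data are illustrative:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_step(W1, a, w2, x_t, y_t, c=0.05):
    """One application of the weight-change rules for
    F(x;W) = sum_k w2_k * sigmoid(w^{1,k}.x + a_k)."""
    z = W1 @ x_t + a                     # w^{1,i T} x^t + a_i
    h = sigmoid(z)
    err = 2.0 * (w2 @ h - y_t)           # 2 (F(x^t, W(t)) - y_t)
    w2_new = w2 - c * err * h            # output-neuron rule
    dsig = h * (1.0 - h)                 # = e^{-z} / (1 + e^{-z})^2
    W1_new = W1 - c * err * np.outer(w2 * dsig, x_t)  # hidden-layer rule
    a_new = a - c * err * (w2 * dsig)    # analogous offset rule (an addition)
    return W1_new, a_new, w2_new

rng = np.random.default_rng(1)
W1, a, w2 = rng.normal(size=(3, 1)), np.zeros(3), rng.normal(size=3)
data = [(np.array([-1.0]), 1.0), (np.array([0.0]), 0.0), (np.array([1.0]), 1.0)]
for _ in range(2000):
    for x_t, y_t in data:
        W1, a, w2 = train_step(W1, a, w2, x_t, y_t)
print(sum((w2 @ sigmoid(W1 @ x + a) - y) ** 2 for x, y in data))  # error shrinks
```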
New methods for learning
with neural networks
Bayesian learning:
the distribution of the neural network parameters
is learnt
Here:
N = number of highway sites (observations)
Y(n) = estimated number of accidents at site n, for n = 1, 2, …, N
T(n) = observed number of accidents at site n, for n = 1, 2, …, N
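The error E referred to below is not written out in the source; a natural assumption, consistent with the gradient-descent discussion that follows, is the squared-error criterion:

$$E = \sum_{n=1}^{N} \big( Y(n) - T(n) \big)^2$$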
Neural Network Architecture
• After one pass through all observations (the training set), a
gradient descent method may be used to calculate improved
values of the weights W(1) and W(2), values that make E
smaller.
• After reevaluation of the weights with the gradient descent
method, successive passes can be made and the weights
further adjusted until the error is reduced to a satisfactory
level.
• The computation thus has two modes, the mapping mode, in
which outputs are computed, and the learning mode, in
which weights are adjusted to minimize E.
• Although the method may not necessarily converge to a
global minimum, it generally gets quite close to one if an
adequate number of hidden units are employed.
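A minimal sketch of the two modes in Python/NumPy (the network shape, the vectorised gradient formulas, and the data are illustrative assumptions, not the source's highway model):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def mapping_mode(X, W1, W2):
    """Compute outputs Y(n) for all observations (one pass through the set)."""
    return sigmoid(X @ W1) @ W2

def learning_mode(X, T, W1, W2, c=0.1):
    """Adjust W(1) and W(2) by gradient descent so that E = sum (Y - T)^2 shrinks."""
    H = sigmoid(X @ W1)                     # hidden-unit activations
    Y = H @ W2                              # mapping mode, reused here
    dY = 2.0 * (Y - T)                      # dE/dY
    W2_new = W2 - c * (H.T @ dY) / len(X)
    dH = np.outer(dY, W2) * H * (1.0 - H)   # error backpropagated to the hidden layer
    W1_new = W1 - c * (X.T @ dH) / len(X)
    return W1_new, W2_new

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))                # 20 observations with 3 site features
T = X[:, 0] ** 2                            # illustrative "observed accident" targets
W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=5)
for _ in range(500):                        # successive passes over the training set
    W1, W2 = learning_mode(X, T, W1, W2)
print(np.sum((mapping_mode(X, W1, W2) - T) ** 2))  # E after training
```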