
Neural network

Sudipta Roy

Department of Computer Science and Engineering


Assam University, Silchar
Overview
1. Biological Inspiration
2. History of Artificial Neural Network
3. Artificial neurons and Neural Networks
4. Learning Processes
5. Learning with Artificial Neural Networks
Biological inspiration
 Animals are able to react adaptively to changes
in their external and internal environment, and
they use their nervous system to perform these
behaviours.
 An appropriate model/simulation of the nervous
system should be able to produce similar
responses and behaviours in artificial systems.
 The nervous system is built from relatively simple
units, the neurons, so copying their behaviour
and functionality should be the solution.
Biological Neural Nets

• Pigeons as art experts (Watanabe et al. 1995)

– Experiment:
• Pigeon in Skinner box
• Present paintings of two different artists (e.g.
Chagall / Van Gogh)
• Reward for pecking when presented with a particular
artist (e.g. Van Gogh)
Experiment
Results
• Pigeons were able to discriminate between Vincent
Van Gogh and Marc Chagall with 95% accuracy
(presented with pictures they had been trained on).
• Discrimination still 85% successful for previously
unseen paintings of the artists.
• Pigeons do not simply memorise the pictures.
• They can extract and recognise patterns (the ‘style’).
• They generalise from the already seen to make
predictions.
• This is what neural networks (biological and artificial)
are good at (unlike conventional computers).
Neural Networks
 A Neural Network (NN) is a processing
device, either an algorithm or an actual
hardware, whose design was inspired by the
design and functioning of animal brains and
components thereof.

 An Artificial Neural Network (ANN) may be defined as an
information-processing system/model that is inspired by the
biological nervous systems, such as the brain.
Biological inspiration
Basic input-output transformation: input spikes arriving at the neuron
produce excitatory post-synaptic potentials, which are integrated and
may trigger an output spike.
Biological inspiration
The main structural parts of a biological neuron: dendrites, soma (cell body), and axon.
Biological inspiration
Dendrites collect incoming signals; the axon carries the neuron's output to
synapses on other neurons.

The information transmission happens at the synapses.

Biological inspiration
 The spikes travelling along the axon of the pre-synaptic
neuron trigger the release of neurotransmitter substances
at the synapse.
 The neurotransmitters cause excitation or inhibition in the
dendrite of the post-synaptic neuron.
 The integration of the excitatory and inhibitory signals
may produce spikes in the post-synaptic neuron.
 The contribution of the signals depends on the strength of
the synaptic connection.
History
Spiking neural networks
Vapnik (1990): Support vector machine
Broomhead & Lowe (1988): Radial basis functions (RBF)
Linsker (1988): Infomax principle
Rumelhart, Hinton & Williams (1986): Back-propagation
Kohonen (1982): Self-organizing maps
Hopfield (1982): Hopfield networks
Minsky & Papert (1969): Perceptrons
Rosenblatt (1960): Perceptron
Minsky (1954): Neural networks (PhD thesis)
Hebb (1949): The organization of behaviour
McCulloch & Pitts (1943): Neural networks and artificial intelligence were born
Advantages of Neural Networks

 Ability to derive meaning from complicated or imprecise data.
 Can be used to extract patterns and detect trends
that are too complex to be noticed by either humans
or other computer techniques.
 Ability to learn how to do tasks based on the data
given for training or initial experience.
 Computations can be carried out in parallel. Special
hardware devices are being designed and manufactured
to take advantage of this capability.
Biological neuron vs Artificial Neuron

Biological Neuron        Artificial Neuron
Cell                     Neuron/Node/Unit
Dendrites                Weights or interconnections
Soma (Cell body)         Net input
Axon                     Output
Artificial Neural Networks

 An ANN consists of a large number of interconnected processing
elements called nodes, units or neurons, which usually
operate in parallel and are configured in a regular
architecture.
 Each neuron is connected with others by a connection link.
 Each connection link is associated with weights, which
contain information about the input signal.
 Each neuron has an internal state of its own. The state is
called the activation or activity level of neuron.
 A neuron can send only one signal at a time which can be
transmitted to several other neurons.
Artificial neurons
Neurons work by processing information. They receive
and provide information in the form of spikes.

A model neuron receives inputs x1, x2, ..., xn through connections with
weights w1, w2, ..., wn and produces an output y:

$y_{in} = \sum_{i=1}^{n} w_i x_i, \qquad y = f(y_{in})$
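As a concrete illustration of this weighted-sum-and-activation rule, here is a minimal Python/NumPy sketch; the input values, weights, and the step activation are made-up example choices, not values from the slides.

```python
import numpy as np

def neuron_output(x, w, f):
    """Compute y_in = sum_i w_i * x_i and apply the activation f."""
    y_in = np.dot(w, x)          # weighted sum of the inputs (net input)
    return f(y_in)               # activation applied to the net input

# Hypothetical example: three inputs, three weights, binary step activation.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, 0.6])
step = lambda s: 1.0 if s >= 0 else 0.0

print(neuron_output(x, w, step))   # prints 1.0, since y_in = 0.2 - 0.3 + 1.2 = 1.1 >= 0
```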
Properties of ANN
• High level abstraction of neural input-output
transformation
– Inputs  weighted sum of inputs  nonlinear function  output
• Typically no spikes
• Typically use implausible constraints or learning rules
• Often used where data or functions are uncertain
– Goal is to learn from a set of training data
– And to generalize from learned instances to new unseen data
• Key attributes
– Parallel computation
– Distributed representation and storage of data
– Learning (networks adapt themselves to solve a problem)
– Fault tolerance (insensitive to component failures)
Basic entities of ANN
• Three basic entities
• Interconnections
• Learning rules: adopted for updating and
adjusting the connection weights
• Activation functions
Connections
(Leads to Net Architecture)

• The arrangement of neurons to form layers and the
connection pattern formed within and between layers
is called the network architecture.

• Five basic types of neuron connection architecture:
• Feed forward network
– Single-layer
– Multi-layer
• Single-node with its own feedback
• Recurrent network
– Single-layer
– Multi-layer
Feed forward networks

Single layer: the inputs X1, ..., Xn connect directly to the output units
y1, ..., ym through weights W11, ..., Wnm (an input layer and an output
layer only).

Multi layer: the inputs X1, ..., Xn feed one or more hidden layers
(e.g. R1, ..., Rq and z1, ..., zk) before the output layer y1, ..., ym.
Feed back networks

Single node with its own feedback: the node's output is fed back to its own input.
Recurrent networks

Single layer: the outputs y1, ..., ym are fed back as inputs to the
processing units.

Multi layer: the outputs of the hidden units z1, ..., zk and of the output
units y1, ..., ym are fed back to earlier layers.
Learning
 Main property of an ANN is its capability to learn.

 Learning is a process by which a NN adapts itself to
a stimulus by making proper parameter adjustments,
resulting in the production of the desired response.

 Three categories:
 Supervised learning
 Unsupervised learning
 Reinforcement learning
Supervised Learning
Block diagram: the input X is applied to the ANN (weights W), which
produces the actual output Y; an error-signal generator compares Y with
the desired output D and produces error signals D - Y that are used to
adjust the weights.

 Learning is performed with the help of a teacher (guide), with the aim
of minimizing the error during training.
 It is assumed that the output is known for each input.
 The actual output vector is compared with the desired
output and an error signal is generated if there is any
difference.
 The error signal is used for adjustment of weights
until the actual output matches the desired output.
Unsupervised Learning
Block diagram: the input X is applied to the ANN (weights W), which
produces the actual output Y; no desired output or error signal is provided.

 Learning is performed without the help of a guide.
E.g. a tadpole learns to swim without its mother.
 Input vectors of similar type are grouped together
to form clusters, without the use of training data
to specify what a member of each group looks like or to
which group a member belongs.
 When a new input is applied, ANN gives an output
response indicating the class to which the input pattern
belongs. For any input, if pattern class is not found then
a new pattern class is generated.
Unsupervised Learning
 No feedback from the environment to inform what
the output should be or whether the output is correct
or not.
 The network must itself discover the patterns,
regularities, features or categories from the input data
and relations of input data over the output.
 While discovering all these features, the network
undergoes change in its parameters.
 This process is called self-organization, in which exact
clusters are formed by discovering similarities and
dissimilarities among the objects.
Reinforcement Learning
Block diagram: the input X is applied to the ANN (weights W), which
produces the actual output Y; an error-signal generator receives a
reinforcement signal R (critic information) and produces error signals
used to adjust the weights.

 Learning is similar to supervised learning.


 Exact output values are not known but the critic information is
available. E.g. The actual output is only 50% correct or so.
 The learning based on the critic info is called reinforcement
learning and the feedback is called reinforcement signal.
 Self-organizing NN (Kohonen NN) is an example.
 E.g. a primary-school assembly where the teacher keeps order: an
occasional tap of the teacher's cane acts as a reinforcement signal that
keeps the line straight.
Activation Functions
 The activation function is applied over the net input
to calculate the output of an ANN.
 The information processing of a processing element
can be viewed as consisting of two major parts
 Input: An integration function is associated with the input.
The function serves to combine activation, information or
evidence from an external source or other processing
elements into a net input.
 Output: The non-linear activation function is used to ensure
that a neuron’s response is bounded i.e. the actual response
of a neuron is conditioned or dampened as a result of large
or small activity stimuli and is thus controllable.
 Several activation functions are available.
Activation Functions
 Identity function
 Binary step function
 Bipolar step function
 Sigmoidal function
 Binary Sigmoidal
 Bipolar Sigmoidal
 Ramp function
 Radial Basis function
……..

Identity function: $f(x) = x$ (a linear function whose graph is a straight
line through the origin).
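The sketch below implements several of the listed activation functions in Python/NumPy; the threshold and slope parameters are illustrative assumptions.

```python
import numpy as np

def identity(x):                     # f(x) = x
    return x

def binary_step(x, theta=0.0):       # 1 if x >= theta, else 0
    return np.where(x >= theta, 1.0, 0.0)

def bipolar_step(x, theta=0.0):      # +1 if x >= theta, else -1
    return np.where(x >= theta, 1.0, -1.0)

def binary_sigmoid(x, lam=1.0):      # 1 / (1 + e^(-lam*x)), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0):     # output in (-1, 1)
    return 2.0 / (1.0 + np.exp(-lam * x)) - 1.0

def ramp(x):                         # 0 below 0, x between 0 and 1, 1 above 1
    return np.clip(x, 0.0, 1.0)

xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (identity, binary_step, bipolar_step, binary_sigmoid, bipolar_sigmoid, ramp):
    print(f.__name__, f(xs))
```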
Important terminologies
 Weights
 Bias
 Threshold
……
Assignment 1
 Search for four papers (at least) related to your
project where Neural Network concept is used and
study those. Prepare a review report based on those
and submit within a week.
 If you cannot find any neural-network paper
linked with your project, then take any artificial-intelligence
technique as the base concept and prepare the
survey report.
 The report should include: Title, Abstract,
Introduction (with mention of the papers), comparison
(with result and discussion), conclusion, references.
Linear Separability
 ANN does not give an exact solution for a nonlinear
problem. However, it provides possible approximate
solutions to nonlinear problems.
 Linear separability is the concept wherein the
separation of the input space into regions is based on
whether the network response is positive or negative.
A decision line is drawn to separate positive and
negative responses. The decision line may also be
called decision-making line or decision-support line or
linear-separable line.
…..
Topologies of Neural Networks

Completely connected, feedforward (directed, acyclic) and recurrent
(with feedback connections) topologies.
Artificial neurons
The McCulloch-Pitts model:
• spikes are interpreted as spike rates;
• synaptic strength are translated as synaptic weights;
• excitation means positive product between the incoming spike
rate and the corresponding synaptic weight;
• inhibition means negative product between the incoming spike
rate and the corresponding synaptic weight;
Artificial neurons
Nonlinear generalization of the McCulloch-Pitts neuron:

$y = f(x, w)$

y is the neuron's output, x is the vector of inputs, and w is the vector
of synaptic weights.

Examples:

sigmoidal neuron: $y = \frac{1}{1 + e^{-w^T x - a}}$

Gaussian neuron: $y = e^{-\frac{\|x - w\|^2}{2 a^2}}$
Artificial neural networks

An artificial neural network is composed of many artificial neurons that
are linked together according to a specific network architecture. The
objective of the neural network is to transform the inputs into
meaningful outputs.
Artificial neural networks
Tasks to be solved by artificial neural networks:
• controlling the movements of a robot based on self-perception and
other information (e.g., visual information);
• deciding the category of potential food items (e.g., edible or non-
edible) in an artificial world;
• recognizing a visual object (e.g., a familiar face);
• predicting where a moving object goes, when a robot wants to
catch it.
Learning in biological
systems
Learning = learning by adaptation

The young animal learns that the green fruits are sour, while the
yellowish/reddish ones are sweet. The learning happens by adapting
the fruit picking behavior.

At the neural level, learning happens by changing the synaptic strengths,
eliminating some synapses, and building new ones.
Learning as optimisation
The objective of adapting the responses on the basis of the information
received from the environment is to achieve a better state. E.g., the
animal likes to eat many energy rich, juicy fruits that make its stomach
full, and makes it feel happy.

In other words, the objective of learning in biological organisms is to
optimise the amount of available resources and happiness, or in general
to achieve a state closer to optimal.
Learning in biological neural
networks
The learning rules of Hebb:
• synchronous activation increases the synaptic strength;
• asynchronous activation decreases the synaptic strength.

These rules fit with energy minimization principles.


Maintaining synaptic strength needs energy; it should be maintained at
those places where it is needed, and it shouldn't be maintained at places
where it is not needed.
Learning principle for
ANN
ENERGY MINIMIZATION

We need an appropriate definition of energy for artificial neural
networks, and having that we can use mathematical optimisation
techniques to find how to change the weights of the synaptic
connections between neurons.

ENERGY = measure of task performance error


Neural network mathematics

A small example network with four inputs, a first layer of four neurons,
a second layer of three neurons, and a single output neuron, computed
layer by layer:

$y_i^1 = f(x_i, w_i^1), \; i = 1, \dots, 4; \qquad y^1 = (y_1^1, y_2^1, y_3^1, y_4^1)^T$

$y_j^2 = f(y^1, w_j^2), \; j = 1, 2, 3; \qquad y^2 = (y_1^2, y_2^2, y_3^2)^T$

$y_{out} = f(y^2, w_1^3)$
Neural network mathematics
Neural network: input / output transformation

$y_{out} = F(x, W)$

W is the matrix of all weight vectors.


MLP neural networks
MLP = multi-layer perceptron

Perceptron: $y_{out} = w^T x$

MLP neural network (two sigmoidal layers followed by a linear output):

$y_k^1 = \frac{1}{1 + e^{-w^{1k\,T} x - a_k^1}}, \; k = 1, 2, 3; \qquad y^1 = (y_1^1, y_2^1, y_3^1)^T$

$y_k^2 = \frac{1}{1 + e^{-w^{2k\,T} y^1 - a_k^2}}, \; k = 1, 2; \qquad y^2 = (y_1^2, y_2^2)^T$

$y_{out} = \sum_{k=1}^{2} w_k^3 y_k^2 = w^{3T} y^2$
RBF neural networks
RBF = radial basis function: $r(x) = r(\|x - c\|)$

Example: $f(x) = e^{-\frac{\|x - w\|^2}{2 a^2}}$ (Gaussian RBF)

RBF network with four hidden units:

$y_{out} = \sum_{k=1}^{4} w_k^2 \, e^{-\frac{\|x - w^{1,k}\|^2}{2 (a_k)^2}}$
Neural network tasks
• control
• classification
• prediction
• approximation

These can be reformulated in general as FUNCTION APPROXIMATION tasks.

Approximation: given a set of values of a function g(x), build a neural
network that approximates the g(x) values for any input x.
Neural network
approximation
Task specification:

Data: a set of value pairs $(x_t, y_t)$, with $y_t = g(x_t) + z_t$, where
$z_t$ is random measurement noise.

Objective: find a neural network that represents the input / output
transformation (a function) $F(x, W)$ such that
$F(x, W)$ approximates $g(x)$ for every $x$.
Learning to approximate
Error measure:

$E = \frac{1}{N} \sum_{t=1}^{N} \left( F(x_t; W) - y_t \right)^2$

Rule for changing the synaptic weights:

$\Delta w_i^j = -c \cdot \frac{\partial E}{\partial w_i^j}(W)$

$w_i^{j,\,new} = w_i^j + \Delta w_i^j$

c is the learning parameter (usually a constant)
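A minimal sketch of this gradient-descent rule in Python/NumPy; the model (a line with two parameters), the data, the learning parameter c, and the finite-difference gradient estimate are all illustrative assumptions.

```python
import numpy as np

def mse(W, F, xs, ys):
    """E = (1/N) * sum_t (F(x_t; W) - y_t)^2"""
    return np.mean([(F(x, W) - y) ** 2 for x, y in zip(xs, ys)])

def gradient_step(W, F, xs, ys, c=0.1, eps=1e-6):
    """Delta w_i = -c * dE/dw_i, with the gradient estimated by finite differences."""
    grad = np.zeros_like(W)
    for i in range(W.size):
        dW = np.zeros_like(W); dW[i] = eps
        grad[i] = (mse(W + dW, F, xs, ys) - mse(W - dW, F, xs, ys)) / (2 * eps)
    return W - c * grad                      # w_new = w + Delta w

# Hypothetical example: fit y = w0 + w1*x to noisy data.
F = lambda x, W: W[0] + W[1] * x
xs = np.array([0.0, 1.0, 2.0, 3.0]); ys = np.array([0.1, 1.2, 1.9, 3.1])
W = np.zeros(2)
for _ in range(200):
    W = gradient_step(W, F, xs, ys)
print(W, mse(W, F, xs, ys))
```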


Learning with a perceptron
Perceptron: $y_{out} = w^T x$

Data: $(x^1, y_1), (x^2, y_2), \dots, (x^N, y_N)$

Error: $E(t) = (y(t)_{out} - y_t)^2 = (w(t)^T x^t - y_t)^2$

Learning:

$w_i(t+1) = w_i(t) - c \cdot \frac{\partial E(t)}{\partial w_i} = w_i(t) - c \cdot \frac{\partial (w(t)^T x^t - y_t)^2}{\partial w_i}$

$w_i(t+1) = w_i(t) - c \cdot 2 \cdot (w(t)^T x^t - y_t) \cdot x_i^t$

where $w(t)^T x^t = \sum_{j=1}^{m} w_j(t) \cdot x_j^t$

A perceptron is able to learn a linear function.
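A minimal sketch of this learning rule in Python/NumPy, trained on synthetic data generated by a linear function so that the perceptron can recover it; the learning parameter and data are made up.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: y_t = g(x_t) for a linear g, so a perceptron can learn it exactly.
true_w = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(100, 3))
y = X @ true_w

w = np.zeros(3)
c = 0.05                                   # learning parameter
for epoch in range(50):
    for x_t, y_t in zip(X, y):
        err = w @ x_t - y_t                # w(t)^T x^t - y_t
        w = w - c * 2 * err * x_t          # w_i(t+1) = w_i(t) - c * 2 * err * x_i^t

print(w)          # approaches [2.0, -1.0, 0.5]
```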


Learning with RBF neural networks

RBF neural network: $y_{out} = F(x, W) = \sum_{k=1}^{M} w_k^2 \, e^{-\frac{\|x - w^{1,k}\|^2}{2 (a_k)^2}}$

Data: $(x^1, y_1), (x^2, y_2), \dots, (x^N, y_N)$

Error: $E(t) = (y(t)_{out} - y_t)^2 = \left( \sum_{k=1}^{M} w_k^2(t) \, e^{-\frac{\|x^t - w^{1,k}\|^2}{2 (a_k)^2}} - y_t \right)^2$

Learning:

$w_i^2(t+1) = w_i^2(t) - c \cdot \frac{\partial E(t)}{\partial w_i^2}$

$\frac{\partial E(t)}{\partial w_i^2} = 2 \cdot \left( F(x^t, W(t)) - y_t \right) \cdot e^{-\frac{\|x^t - w^{1,i}\|^2}{2 (a_i)^2}}$

Only the synaptic weights of the output neuron are modified.

An RBF neural network learns a nonlinear function.
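A minimal sketch of this rule in Python/NumPy: the centres w^{1,k} and widths a_k are fixed placeholders and only the output weights w^2_k are adapted; the target function (a sine curve) and learning parameter are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

centres = np.linspace(-3, 3, 7).reshape(-1, 1)     # fixed centres w^{1,k} (placeholders)
widths  = np.full(7, 1.0)                          # fixed widths a_k
w2      = np.zeros(7)                              # output weights to be learned

def phi(x):
    """Gaussian activations exp(-||x - w^{1,k}||^2 / (2 a_k^2)) for all hidden units."""
    return np.exp(-np.sum((centres - x) ** 2, axis=1) / (2 * widths ** 2))

# Data: noisy samples of a nonlinear function g(x) = sin(x).
X = rng.uniform(-3, 3, size=200)
Y = np.sin(X) + 0.05 * rng.normal(size=200)

c = 0.05
for epoch in range(100):
    for x_t, y_t in zip(X, Y):
        p = phi(np.array([x_t]))
        err = w2 @ p - y_t                  # F(x^t, W(t)) - y_t
        w2 = w2 - c * 2 * err * p           # only the output weights are modified

print(w2)
```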
Learning with MLP neural networks

MLP neural network with p layers:

$y_k^1 = \frac{1}{1 + e^{-w^{1k\,T} x - a_k^1}}, \; k = 1, \dots, M_1; \qquad y^1 = (y_1^1, \dots, y_{M_1}^1)^T$

$y_k^2 = \frac{1}{1 + e^{-w^{2k\,T} y^1 - a_k^2}}, \; k = 1, \dots, M_2; \qquad y^2 = (y_1^2, \dots, y_{M_2}^2)^T$

...

$y_{out} = F(x; W) = w^{p\,T} y^{p-1}$

Data: $(x^1, y_1), (x^2, y_2), \dots, (x^N, y_N)$

Error: $E(t) = (y(t)_{out} - y_t)^2 = (F(x^t; W) - y_t)^2$

It is very complicated to calculate the weight changes directly.


Learning with
backpropagation
Solution of the complicated learning:
• calculate first the changes for the synaptic weights of the
output neuron;
• calculate the changes backward starting from layer p-1, and
propagate backward the local error terms.

The method is still relatively complicated, but it is much
simpler than the original optimisation problem.
Learning with general optimisation

In general it is enough to have a single layer of nonlinear neurons in a
neural network in order to learn to approximate a nonlinear function.
In such a case general optimisation may be applied without too much
difficulty.

Example: an MLP neural network with a single hidden layer:

$y_{out} = F(x; W) = \sum_{k=1}^{M} w_k^2 \cdot \frac{1}{1 + e^{-w^{1,k\,T} x - a_k}}$
Learning with general optimisation

Synaptic weight change rule for the output neuron:

$w_i^2(t+1) = w_i^2(t) - c \cdot \frac{\partial E(t)}{\partial w_i^2}$

$\frac{\partial E(t)}{\partial w_i^2} = 2 \cdot \left( F(x^t, W(t)) - y_t \right) \cdot \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}}$

Synaptic weight change rules for the neurons of the hidden layer:

$w_j^{1,i}(t+1) = w_j^{1,i}(t) - c \cdot \frac{\partial E(t)}{\partial w_j^{1,i}}$

$\frac{\partial E(t)}{\partial w_j^{1,i}} = 2 \cdot \left( F(x^t, W(t)) - y_t \right) \cdot w_i^2 \cdot \frac{\partial}{\partial w_j^{1,i}} \left( \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}} \right)$

$\frac{\partial}{\partial w_j^{1,i}} \left( \frac{1}{1 + e^{-w^{1,i\,T} x^t - a_i}} \right) = \frac{e^{-w^{1,i\,T} x^t - a_i}}{\left( 1 + e^{-w^{1,i\,T} x^t - a_i} \right)^2} \cdot \frac{\partial}{\partial w_j^{1,i}} \left( w^{1,i\,T} x^t + a_i \right) = \frac{e^{-w^{1,i\,T} x^t - a_i}}{\left( 1 + e^{-w^{1,i\,T} x^t - a_i} \right)^2} \cdot x_j^t$

Putting these together:

$w_j^{1,i}(t+1) = w_j^{1,i}(t) - c \cdot 2 \cdot \left( F(x^t, W(t)) - y_t \right) \cdot w_i^2 \cdot \frac{e^{-w^{1,i\,T} x^t - a_i}}{\left( 1 + e^{-w^{1,i\,T} x^t - a_i} \right)^2} \cdot x_j^t$
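A compact Python/NumPy sketch of these update rules for a single-hidden-layer network; the dataset (noisy samples of x^2), the number of hidden units, and the learning parameter are illustrative assumptions, and the offsets a_k are updated by the analogous chain rule.

```python
import numpy as np

rng = np.random.default_rng(3)

M, d = 10, 1                              # hidden units, input dimension (assumed)
W1 = rng.normal(size=(M, d))              # hidden weights w^{1,k}
a  = rng.normal(size=M)                   # hidden offsets a_k
w2 = rng.normal(scale=0.1, size=M)        # output weights w^2_k

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(W1 @ x + a)               # hidden activations
    return w2 @ h, h

# Data: noisy samples of g(x) = x^2 on [-1, 1].
X = rng.uniform(-1, 1, size=(200, 1))
Y = (X[:, 0] ** 2) + 0.05 * rng.normal(size=200)

c = 0.05
for epoch in range(300):
    for x_t, y_t in zip(X, Y):
        F, h = forward(x_t)
        err = F - y_t                                                    # F(x^t, W(t)) - y_t
        grad_w2 = 2 * err * h                                            # dE/dw^2_i
        grad_W1 = 2 * err * (w2 * h * (1 - h))[:, None] * x_t[None, :]   # dE/dw^{1,i}_j
        grad_a  = 2 * err * (w2 * h * (1 - h))                           # same chain, x_j -> 1
        w2 -= c * grad_w2
        W1 -= c * grad_W1
        a  -= c * grad_a

print(np.mean([(forward(x)[0] - y) ** 2 for x, y in zip(X, Y)]))   # final mean squared error
```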
New methods for learning
with neural networks
Bayesian learning:
the distribution of the neural network parameters
is learnt

Support vector learning:


the minimal representative subset of the available
data is used to calculate the synaptic weights of the
neurons
Summary
• Artificial neural networks are inspired by the learning processes
that take place in biological systems.
• Artificial neurons and neural networks try to imitate the working
mechanisms of their biological counterparts.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the
synaptic strength. Artificial neural networks learn in the same way.
• The synapse strength modification rules for artificial neural
networks can be derived by applying mathematical optimisation
methods.
Neural Network Architecture
A typical neural network is composed of
• input units X1, X2,…corresponding to independent
variables
• a hidden layer known as the first layer, and
• an output layer (second layer) whose output units Y1,...
correspond to dependent variables.
Neural Network Architecture
• Hidden units H1, H2, … corresponding to
intermediate variables.
• These interact by means of weight matrices W(1)
and W(2) with adjustable weights.
• The values of the hidden units are obtained from the formulas
$H_j = f\big(\sum_i W^{(1)}_{ji} X_i\big)$.
Neural Network Architecture
• The first weight matrix is multiplied by the input vector X =
(X1, X2,...), and an activation function f is then applied to each
component of the result.
• Likewise, the values of the output units are obtained by
applying the second weight matrix to the vector H = (H1,
H2,...) of hidden-unit values, and then applying the activation
function f to each component of the result.
• In this way an output vector Y = (Y1, Y2,...) is obtained.
• The activation function f is typically of sigmoid form and may
be a logistic function, $f(x) = 1/(1 + e^{-x})$, a hyperbolic tangent, etc.
Neural Network Architecture
• Usually the activation function is taken to be the same for all
components but it need not be.
• Values of W(1) and W(2) are assumed at the initial iteration.
The accuracy of the estimated output is improved by an
iterative learning process in which the outputs for various
input vectors are compared with targets (observed frequency
of accidents) and an average error term E is computed, e.g. as the
mean squared error

$E = \frac{1}{N} \sum_{n=1}^{N} \big( Y(n) - T(n) \big)^2$

Here
N = number of highway sites or observations
Y(n) = estimated number of accidents at site n, for n = 1, 2,..., N
T(n) = observed number of accidents at site n, for n = 1, 2,..., N.
Neural Network Architecture
• After one pass through all observations (the training set), a
gradient descent method may be used to calculate improved
values of the weights W(1) and W(2), values that make E
smaller.
• After reevaluation of the weights with the gradient descent
method, successive passes can be made and the weights
further adjusted until the error is reduced to a satisfactory
level.
• The computation thus has two modes, the mapping mode, in
which outputs are computed, and the learning mode, in
which weights are adjusted to minimize E.
• Although the method may not necessarily converge to a
global minimum, it generally gets quite close to one if an
adequate number of hidden units are employed.
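A brief sketch of the two modes described above (mapping and learning), assuming a logistic activation, a mean-squared error E, a numerical gradient, and randomly generated stand-in data in place of real accident counts.

```python
import numpy as np

rng = np.random.default_rng(4)

def f(z):                      # logistic (sigmoid) activation
    return 1.0 / (1.0 + np.exp(-z))

def mapping(X, W1, W2):
    """Mapping mode: H = f(W1 X), Y = f(W2 H) for each observation (column of X)."""
    H = f(W1 @ X)
    return f(W2 @ H)

# Stand-in data: 5 input variables, 30 observations, targets T scaled into (0, 1).
N = 30
X = rng.normal(size=(5, N))
T = rng.uniform(0.1, 0.9, size=(1, N))

W1 = rng.normal(scale=0.1, size=(4, 5))   # input -> hidden weights, assumed 4 hidden units
W2 = rng.normal(scale=0.1, size=(1, 4))   # hidden -> output weights

def error(W1, W2):
    Y = mapping(X, W1, W2)
    return np.mean((Y - T) ** 2)          # E = (1/N) * sum_n (Y(n) - T(n))^2

# Learning mode: nudge each weight down the numerical gradient of E.
c, eps = 0.5, 1e-6
for sweep in range(200):
    for W in (W1, W2):
        G = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            old = W[idx]
            W[idx] = old + eps; Ep = error(W1, W2)
            W[idx] = old - eps; Em = error(W1, W2)
            W[idx] = old
            G[idx] = (Ep - Em) / (2 * eps)
        W -= c * G

print(error(W1, W2))   # E decreases over the sweeps
```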
