2 - Neural Network

This document provides an overview of artificial neural networks and their biological inspiration. It discusses the basic structure and functioning of biological neurons, comparing them to artificial neurons. The key components of artificial neurons, such as weights, inputs, outputs and activation functions, are explained. Different types of neural network topologies, such as feedforward and recurrent networks, are described. The document aims to introduce some of the fundamental concepts behind artificial neural networks and their relationship to biological neural systems.

PARALA MAHARAJA ENGINEERING COLLEGE

BERHAMPUR

A Course on
SOFT COMPUTING

Prepared By
Mr. Suryalok Dash
Asst. Professor, Dept. of Electrical Engineering.
2 PMEC- Berhampur 11/13/2018
Artificial Neural Network
(ANN)

Human brain
 The biological nervous system is one of the most important parts of the human body.
 The brain lies at the center of the human nervous system.
 Unlike computers, which are programmed to solve problems using sequential
algorithms, the brain makes use of a massive network of parallel and distributed
computational elements called neurons.
 Neurons are responsible for thought, emotion, cognition, pattern recognition, etc.
 Each neuron is approximately 10 microns in length, and neurons operate in parallel.
Typically, the human brain consists of approximately 10^11 neurons.
 A neuron receives electrochemical signals from its various sources and in turn responds
by transmitting electrical impulses to other neurons.
 Characteristics of the Human Brain
 Ability to learn from experience
 Ability to generalize the knowledge it possesses
 Ability to perform abstraction
 Ability to make errors.

Working of Neuron:
 A neuron is composed of a nucleus within a cell body (also known as the soma).
 Long, irregularly shaped filaments called dendrites are connected to the soma. These
behave as the inputs to the neuron.
 The output channel of the neuron is called the axon. Axons are electrically active;
they are nonlinear threshold devices which produce a voltage pulse that lasts for about a
millisecond.
 If the cumulative input received by the soma raises the internal potential of the cell
(known as the membrane potential) beyond a threshold, the neuron fires to excite or inhibit other neurons.
 The axon terminates in a contact called a synapse (or synaptic junction) that connects
the axon with the dendrites of other neurons.
 The synaptic junction contains a neurotransmitter fluid, which is responsible for accelerating
or retarding the electric charges reaching the soma.
 A single neuron may have many synaptic inputs and synaptic outputs.

Computer and Human Brain:
Similarities
 Both operate on electrical signals.
 Both are compositions of a large number of simple elements.
 Both perform functions that are computational.
Differences
 Compared to the µs or ns time scales of digital computation, nerve impulses are
slow (on the order of milliseconds).
 The brain’s huge computation rate is achieved by a tremendous number of
parallel computational units, far beyond any proposed for a computer system.
 A digital computer is inherently error-free, but the brain often produces best
guesses and approximations from partially incomplete and incorrect inputs,
and these may be wrong.

Aspect                 Modern Computer                Biological Neural System
Processor              Complex                        Simple
                       High speed                     Low speed
                       One or few                     A large number
Memory                 Separate from processor        Integrated into processor
                       Localized                      Distributed
Computing              Centralized                    Distributed
                       Sequential                     Parallel
                       Stored programs                Self-learning
Reliability            Very vulnerable                Robust
Expertise              Numerical and symbolic         Perceptual problems
                       manipulations
Operating environment  Well-defined                   Poorly defined
                       Well-constrained               Unconstrained


Artificial neural network (ANN)
ANNs (or NNs) are simplified models of the biological nervous system.
Definition (Haykin, 1994): A neural network is a massively parallel
distributed processor that has a natural propensity for storing experiential knowledge and
making it available for use. It resembles the brain in two respects:
 Knowledge is acquired by the network through a learning process.
 Interneuron connection strengths, known as synaptic weights, are used to store the
knowledge.
Objective: To develop a computational device that models the brain and performs various
computational tasks at a faster rate than traditional systems.
Motivation:
 Scientists are challenged to use machines more effectively for tasks currently solved
by humans.
 Symbolic rules don't reflect the processes actually used by humans.
 Traditional computing excels in many areas, but not in others.

Applications: Pattern recognition, classification, optimization, function approximation,
vector quantization, data clustering, etc.
 An artificial neuron is also referred to as a “perceptron”.
Analogy Between BNN and ANN:

 A biological neuron receives all inputs through the dendrites, sums them, and produces
an output if the sum is greater than a threshold value.
 The input signals are passed on to the cell body through the synapses, which may
accelerate or retard an arriving signal.
 This acceleration or retardation of the input signals is modelled by the weights.
 A strong synapse, which transmits a stronger signal, has a correspondingly larger
weight, while a weak synapse has a smaller weight.

Biological Neuron    Artificial Neuron
Cell                 Neuron
Dendrites            Weights
Soma                 Net input
Axon                 Output

Features of ANN
 An artificial neural network (ANN) is typically composed of a set of parallel and
distributed processing units, called nodes or neurons.
 These are usually ordered into layers, appropriately interconnected by means of
unidirectional (or bi-directional in some cases) weighted signal channels, called
connections or synaptic weights. (Fig. next slide)
 The internal architecture of ANN provides powerful computational capabilities,
allowing for the simultaneous exploration of different competing hypotheses.
 Massive parallelism and computationally intensive learning through examples
make them suitable for application in nonlinear functional mapping, speech and
pattern recognition, categorization, data compression, and many other
applications characterized by complex dynamics and possibly uncertain
behavior.
 Three important features generally characterize an artificial neural network:
the network topology, the network transfer functions, and the
network learning algorithm.
Neural network topologies:
 The way the nodes and the interconnections are arranged within the layers of a
given ANN determines its topology.
 The choice for using a given topology is mainly dictated by the type of problem
being considered.
 The following are the topologies of the Neural Network.
The feedforward (FF) topology
 It has nodes hierarchically arranged in layers starting with the input layer and
ending with the output layer.
 In between, a number of internal layers, also called hidden layers, provide most
of the network computational power.
 The nodes in each layer are connected to the next layer through unidirectional
paths starting from one layer (source) and ending at the subsequent layer (sink).
This means that the outputs of a given layer feed the nodes of the following layer
in a forward path.
 Feedforward networks are static in nature, i.e., the output for a given pattern of
inputs is independent of the previous state of the network.
The recurrent topology
 Recurrent networks (RN) allow for feedback connections among their nodes.
 They are structured in such a way as to permit storage of information in their
output nodes through dynamic states, hence providing the network with some
sort of “memory.”
 Recurrent networks map states into states and as such are very useful for
modeling and identifying dynamic systems.

Neural network activation functions:
 The neurons take the weighted sum of their inputs from other nodes and apply
to it a mapping (usually nonlinear) called an activation function before
delivering the output to the next neuron.

 The output Ok of a typical neuron (k) having (l) inputs is given as

Ok = f( Σi=1..l wik·xi − θk )

 where f is the node’s activation function, x1, x2, . . . , xl are the node’s inputs,
w1k, w2k, . . . , wlk are the connection weights, and θk is the node’s threshold.
 The bias effect (threshold value) is intended to occasionally inhibit the activity of
some nodes.
 The activation functions can take different forms: sigmoid mapping, signum
function, step function, or linear correspondence.
 The mathematical representations for some of these mappings are:
  sigmoid: f(x) = 1 / (1 + e^(−x))
  step:    f(x) = 1 if x ≥ 0, else 0
  signum:  f(x) = +1 if x > 0, −1 if x < 0
  linear:  f(x) = x
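As a concrete illustration, the node output and these standard mappings can be written in a few lines of Python (a minimal sketch; the function names are ours, not the author's):

```python
import math

def sigmoid(x):
    # smooth squashing of the net input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def step(x):
    # hard limiter: 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def signum(x):
    # returns -1, 0 or +1 depending on the sign of x
    return (x > 0) - (x < 0)

def neuron_output(inputs, weights, theta, f):
    """O_k = f( sum_i w_ik * x_i - theta_k )"""
    net = sum(w * x for w, x in zip(weights, inputs)) - theta
    return f(net)

# two inputs, unit weights, threshold 1.5 -> behaves like a logical AND
print(neuron_output([1, 1], [1, 1], 1.5, step))  # net = 0.5 >= 0, so prints 1
```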

Neural network learning algorithms:
 Learning algorithms are used to update the weight parameters at the
interconnection level of the neurons during the training process of the network.
 The three well-known and most often used learning mechanisms are the
supervised, the unsupervised (or self-organized) and the reinforced.
Supervised learning:
 The main feature of the supervised (or active) learning mechanism is the
training by examples.
 This means that an external teacher provides the network with a set of input
stimuli for which the output is a priori known.
 During the training process, the output results are continuously compared with
the desired data.
 An appropriate learning rule uses the error between the actual output and the
target data to adjust the connection weights so as to obtain, after a number of
iterations, the closest match between the target output and the actual output.
 Supervised learning is particularly useful for feedforward networks.

 The gradient descent optimization technique and the least-mean-square (LMS)
algorithm are among the most commonly used supervised learning rules.

Unsupervised learning:
 Unsupervised or self-organized learning does not involve an external teacher
and relies instead upon local information and internal control.
 The training data and input patterns are presented to the system, and through
predefined guidelines, the system discovers emergent collective properties and
organizes the data into clusters or categories.

 An unsupervised learning scheme operates as follows. A set of training data is
presented to the system at the input layer level. The network connection
weights are then adjusted through some sort of competition among the nodes of
the output layer, where the successful candidate will be the node with the
highest value. In the process, the algorithm strengthens the connection between
the incoming pattern at the input layer and the output node corresponding to
the winning candidate.

Reinforcement learning:
 Reinforcement learning, also known as graded learning, mimics the way humans adjust
their behavior when interacting with a given physical environment.
 This is another type of learning mechanism by means of which the network
connections are modified according to feedback information provided to the
network by its environment.
 This information simply instructs the system on whether or not a correct
response has been obtained. In the case of a correct response, the corresponding
connections leading to that output are strengthened; otherwise they are weakened.

Other important terminologies of ANN
Weights:
 In an ANN, each neuron is connected to other neurons by direct links with
weights.
 The weights contain information about the input signal.
 The weights can be represented in the form of a matrix called the connection
matrix.
Bias:
 The bias included in the NN has its impact in
calculating the net input.
 The bias is considered as another input:
 Yin = Σ wi·xi + θ, where θ is the bias
Threshold:
Threshold is a set value based upon which the final output of the network may
be calculated. The threshold value is used in the activation function. A
comparison is made between the calculated net input and the threshold to
obtain the network output.
Learning Rate:
 The learning rate is denoted by “η”.
 It is used to control the amount of weight adjustment at each step of
training.
 The learning rate, ranging from 0 to 1, determines the rate of learning at
each time step.
Momentum Factor
 Convergence is made faster if a momentum factor is added to the weight-update
process.
 This is generally done in back-propagation networks.

Evolution of ANN
Year       Neural Network                   Designer
1943       McCulloch–Pitts (M-P) neuron     McCulloch and Pitts
1949       Hebb network                     Hebb
1958       Perceptron                       Frank Rosenblatt and others
1960       Adaline                          Widrow and Hoff
1972       Kohonen self-organizing map      Kohonen
1982       Hopfield network                 John Hopfield and Tank
1986       Back-propagation network         Rumelhart and others
1988       Counter-propagation network      Grossberg
1987-1990  Adaptive Resonance Theory        Carpenter and Grossberg
1988       Radial Basis Function network    Broomhead and Lowe

McCulloch–Pitts model
 This model is considered by many as the first serious attempt to model the computing
process of the biological neuron.
 The model is limited in terms of computing capability and does not actually provide
learning.
 The neuron model is quite simple in design, as shown in the figure.

 It collects the incoming signals x1, x2, . . . , xl, multiplies them by corresponding
weights w1, w2, . . . , wl, and compares the result with a predetermined bias θ before
applying the activation function, resulting in the output o.
The output is expressed by

o = 1 if Σi wi·xi > θ, otherwise o = 0

 If the weighted sum of the different signal inputs is larger than the bias θ, then
an output value of 1 is generated; otherwise the result is zero. This is done
through the step activation function.
Note:
 The McCulloch-Pitts neuron allows binary 0 or 1 states only, i.e., it is binary-activated.
 The weights can be excitatory (positive) or inhibitory (negative).
 The model topology is based on a fixed set of weights and thresholds.
 This model is most widely used for implementing logic functions.
 Notice the absence of any type of learning: there is no updating
mechanism for the synaptic weights once the system has been presented with a
set of training input–output data.
Example: Implement the AND function using an M-P neuron.

X1  X2 | O
0   0  | 0
0   1  | 0
1   0  | 0
1   1  | 1

Solution:
Assume weights W1 = W2 = 1 and θ = 0. With these assumed weights, the net inputs
for the four patterns are:
Yin = 0 (for input 0,0), Yin = 1 (for input 0,1)
Yin = 1 (for input 1,0), Yin = 2 (for input 1,1)
Hence, if we set the threshold of the activation function anywhere > 1 and < 2, we
will get the desired output.
Alternate solution: w1 = 1, w2 = 1, θ = −1, threshold = 0.1
Note: There are many such solutions!
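Both solutions can be checked with a short script (an illustrative sketch; `mp_neuron` is a name we introduce here, not part of the original model):

```python
def mp_neuron(x1, x2, w1, w2, theta, thresh):
    # McCulloch-Pitts unit: fire (output 1) when the net input reaches the threshold
    net = x1 * w1 + x2 * w2 + theta
    return 1 if net >= thresh else 0

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Solution 1: w1 = w2 = 1, theta = 0, firing threshold 1.5 (any value between 1 and 2)
assert all(mp_neuron(x1, x2, 1, 1, 0, 1.5) == t for (x1, x2), t in AND.items())

# Alternate solution: w1 = w2 = 1, theta = -1, firing threshold 0.1
assert all(mp_neuron(x1, x2, 1, 1, -1, 0.1) == t for (x1, x2), t in AND.items())
print("Both M-P solutions implement AND")
```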

Hebb Network
 Hebb proposed a very simple learning method for neural networks.
 According to the Hebb rule, the weight vector increases proportionately
to the product of the input (xi) and the learning signal. Here, the learning signal
is equal to the neuron’s target output (t).
 In Hebb learning, if two interconnected neurons are ON simultaneously, then
the weights associated with these neurons are increased. The weight update rule is
wi(new) = wi(old) + t·xi (for all i, including i = 0)
Note: The Hebb rule is better suited for bipolar data (1, −1) than binary data (0, 1).

Example: Training of an AND gate with bipolar data.

X1  X2 | Target (t)
−1  −1 | −1
−1   1 | −1
 1  −1 | −1
 1   1 |  1

Solution:
 Initialize the weights to w1 = 0 and w2 = 0 with bias θ = 0.
 Take the first pattern, (−1,−1), target = −1:
Δw1 = (−1)(−1) = 1, Δw2 = (−1)(−1) = 1, Δθ = −1
So the updated weights are w1 = 1 and w2 = 1 with bias = −1.
 Take the second pattern, (−1,1), target = −1:
Δw1 = (−1)(−1) = 1, Δw2 = (−1)(1) = −1, Δθ = −1
So the updated weights are w1 = 2 and w2 = 0 with bias = −2.
 Take the third pattern, (1,−1), target = −1:
Δw1 = (−1)(1) = −1, Δw2 = (−1)(−1) = 1, Δθ = −1
So the updated weights are w1 = 1 and w2 = 1 with bias = −3.
 Take the fourth pattern, (1,1), target = 1:
Δw1 = (1)(1) = 1, Δw2 = (1)(1) = 1, Δθ = 1
So the updated weights are w1 = 2 and w2 = 2 with bias = −2.
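The hand computation above can be replayed in a few lines (an illustrative sketch; the bias is treated as a weight on a constant input x0 = 1):

```python
def hebb_train(patterns, w1=0.0, w2=0.0, bias=0.0):
    # Hebb rule: w_i(new) = w_i(old) + t * x_i  (the bias uses x0 = 1)
    for x1, x2, t in patterns:
        w1 += t * x1
        w2 += t * x2
        bias += t
    return w1, w2, bias

bipolar_and = [(-1, -1, -1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
print(hebb_train(bipolar_and))  # matches the hand computation: (2.0, 2.0, -2.0)
```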
Example: Let’s try to solve the same problem with binary data (0, 1).
Solution:

I/P    W1    W2    W0    Target  Δw1  Δw2  Δw0  W1     W2     W0
x1,x2  (old) (old) (old) (t)                    (new)  (new)  (new)
Epoch 1 (take W0 = 0, x0 = 1)
0,0    0     0     0     0       0    0    0    0      0      0
0,1    0     0     0     0       0    0    0    0      0      0
1,0    0     0     0     0       0    0    0    0      0      0
1,1    0     0     0     1       1    1    1    1      1      1
Epoch 2
0,0    1     1     1     0       0    0    0    1      1      1
0,1    1     1     1     0       0    0    0    1      1      1
1,0    1     1     1     0       0    0    0    1      1      1
1,1    1     1     1     1       1    1    1    2      2      2

It can be seen that the weights keep increasing with every epoch; hence the weights will never settle.

Perceptron
 To overcome the limitations of the M-P model, other models involving learning
capabilities have been suggested, on the basis that they would adjust the
connection weights in accordance with some sort of optimization mechanism.
 In the early 1960s, Rosenblatt of Cornell University developed a trainable
model of an artificial neuron using a supervised learning procedure in which the
system adjusts its weights in response to a comparative signal computed
between the actual output and the target output. This is called the perceptron.
 The motivation behind this development was pattern
classification of linearly separable sets. By definition, sets are linearly separable if
there exists a hyperplanar multidimensional decision boundary that classifies the
data input into two classes.
 The perceptron architecture is a hierarchical three-level structure, shown in the figure on the
next slide.
 The input level corresponds to the sensory unit or retina.
 The second-level unit (hidden unit) is called the feature detector or associator
unit.
 The input and hidden units are connected with fixed connection weights and
thresholds.
 The third-level unit (response unit) comprises the output layer, composed of a
single node with adjustable connection weights.
 The activation function used in the perceptron is the step or the hard-limiting
(signum) activation function.
 The learning algorithm used to adjust the weights is the perceptron learning
rule. It is used to update the weights between the associator unit and the response
unit.
 The error calculation is based on comparing the target values with
those of the calculated output. The target value is either 1 or −1.
 The weights are adjusted according to the learning rule if an error occurs. If
there is no error, no weights are updated.

Training Algorithm for Perceptron:
 Step 1: Initialize the weights and thresholds to small random values.
 Step 2: Choose an input–output pattern (x(k), t(k)) from the training
input–output dataset.
 Step 3: Compute the net input and the actual output O(k).
 Step 4: If the actual output differs from the target, update each weight as

wi(new) = wi(old) + Δwi, where Δwi = η (t(k) − O(k)) xi(k)

with η being a positive update rate ranging from 0 to 1 representing the learning
rate. The same rule applies whether the activation is the step function
(Case 1, targets 0 or 1) or the signum function (Case 2, targets 1 or −1).
Step 5: If the weights have not reached steady-state values (Δwi = 0), repeat
from Step 2 and choose another training pattern.
 This learning procedure is very similar to the Hebbian learning rule, with the
difference that the connection weights here are not updated if the network
responds correctly.
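The procedure can be sketched as follows (a minimal illustration, assuming the common rule Δwi = η(t − O)xi, with the bias folded in as a weight on a constant input x0 = 1 and the signum output taken as 0 when the net input is exactly 0):

```python
def sign(net):
    # signum activation: -1 / 0 / +1 (0 exactly when net == 0)
    return (net > 0) - (net < 0)

def train_perceptron(data, eta=0.3, epochs=10):
    w = [0.0, 0.0, 0.0]                      # [w1, w2, bias w0]
    for _ in range(epochs):
        changed = False
        for x1, x2, t in data:
            x = (x1, x2, 1)                  # x0 = 1 carries the bias
            net = sum(wi * xi for wi, xi in zip(w, x))
            o = sign(net)
            if o != t:                       # update only when an error occurs
                for i in range(3):
                    w[i] += eta * (t - o) * x[i]
                changed = True
        if not changed:                      # one full error-free epoch: converged
            break
    return w

bipolar_and = [(-1, -1, -1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
w = train_perceptron(bipolar_and)
assert all(sign(w[0]*x1 + w[1]*x2 + w[2]) == t for x1, x2, t in bipolar_and)
```

On this data the loop stops after a couple of epochs, since a single update on the first pattern already yields a separating line.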

Example: Training of an AND gate (use bipolar data and the signum activation
function), i.e.
y = −1 if net i/p < 0
y = 0  if net i/p = 0
y = 1  if net i/p > 0

X1  X2 | Target (t)
−1  −1 | −1
−1   1 | −1
 1  −1 | −1
 1   1 |  1

Solution:
 Let’s initialize the weights to random values w1 = 0 and w2 = 0 with bias = 0.
 Also assume η = 0.3.

x1  x2  x0 | t  | Net i/p | Output O | Δw1   Δw2   Δw0 (bias) | W1(new) W2(new) W0(new)
Epoch 1
−1  −1  1  | −1 | 0       | 0        | 0.6   0.6   −0.6       | 0.6     0.6     −0.6
−1   1  1  | −1 | 0.6     | 1        | 0.6   −0.6  −0.6       | 1.2     0       0
 1  −1  1  | −1 | 1.2     | 1        | −0.6  0.6   −0.6       | 0.6     0.6     −0.6
 1   1  1  |  1 | 0.6     | 1        | 0     0     0          | 0.6     0.6     −0.6
Epoch 2
−1  −1  1  | −1 | −0.6    | −1       | 0     0     0          | 0.6     0.6     −0.6
−1   1  1  | −1 | −0.6    | −1       | 0     0     0          | 0.6     0.6     −0.6
 1  −1  1  | −1 | 0.6     | −1       | 0     0     0          | 0.6     0.6     −0.6
 1   1  1  |  1 | 0.6     | 1        | 0     0     0          | 0.6     0.6     −0.6

Example: The AND gate can also be trained using binary data and the step
activation function, as below.

X1  X2 | Target (t)
0   0  | 0
0   1  | 0
1   0  | 0
1   1  | 1

Solution:
 Initialize the weights to random values w1 = 0.3 and w2 = 0.2 with bias = 0.
 Take the first pattern, (0,0): Net input = x1w1 + x2w2 = 0.
Compute the output: O = f(net input) = 0 and t = 0. Hence, no weight change.
Note: The function f is a step function.
 Take the second pattern, (0,1): Net input = x1w1 + x2w2 = 0.2.
Output O = f(net input) = 1, but target t = 0, so the weights need to be
updated. Assume the learning rate η = 0.6.
Δw1 = 0.6[0 − 1]·0 = 0 and Δw2 = 0.6[0 − 1]·1 = −0.6
So the updated weights become W1 = 0.3 + 0 = 0.3 and W2 = 0.2 − 0.6 = −0.4.

 Take the third pattern, (1,0): Net input = x1w1 + x2w2 = 0.3.
Output O = f(net input) = 1, but target t = 0, so the weights need to be
updated (η = 0.6).
Δw1 = 0.6[0 − 1]·1 = −0.6 and Δw2 = 0.6[0 − 1]·0 = 0
So the updated weights become W1 = 0.3 − 0.6 = −0.3 and W2 = −0.4 + 0 = −0.4.
 Take the fourth pattern, (1,1): Net input = x1w1 + x2w2 = −0.7.
Output O = f(net input) = 0, but target t = 1, so the weights need to be
updated (η = 0.6).
Δw1 = 0.6[1 − 0]·1 = 0.6 and Δw2 = 0.6[1 − 0]·1 = 0.6
So the updated weights become W1 = −0.3 + 0.6 = 0.3 and W2 = −0.4 + 0.6 = 0.2.
Remember, we started from the initial weights 0.3 and 0.2, and after training on all 4 patterns we are
back at the same weights.
So, NO CONVERGENCE!
Probable solution: add a bias.

 Initialize the weights to random values w1 = 0.3 and w2 = 0.2 with bias θ = −1.
 Take the first pattern, (0,0): Net input = x1w1 + x2w2 − 1 = −1.
O = f(net input) = 0 and t = 0. Hence, no weight change.
 Take the second pattern, (0,1): Net input = x1w1 + x2w2 − 1 = −0.8.
Output O = f(net input) = 0 and target t = 0. Hence, no weight change.
 Take the third pattern, (1,0): Net input = x1w1 + x2w2 − 1 = −0.7.
Output O = f(net input) = 0 and target t = 0. Hence, no weight change.
 Take the fourth pattern, (1,1): Net input = x1w1 + x2w2 − 1 = −0.5.
Output O = f(net input) = 0, but target t = 1. Hence, the weights need to be updated:
Δw1 = 0.6[1 − 0]·1 = 0.6 and Δw2 = 0.6[1 − 0]·1 = 0.6
So the updated weights become W1 = 0.3 + 0.6 = 0.9 and W2 = 0.2 + 0.6 = 0.8.
 One iteration is completed. Now rerun the next iteration over all 4 patterns:
 Net input = −1, O = f(net input) = 0 and t = 0. Hence, no weight change.
 Net input = −0.2, O = f(net input) = 0 and t = 0. Hence, no weight change.
 Net input = −0.1, O = f(net input) = 0 and t = 0. Hence, no weight change.
 Net input = 0.7, O = f(net input) = 1 and t = 1. Hence, no weight change.
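This hand computation can be replayed in code (a sketch; the bias is kept fixed at −1 and only w1, w2 are trained, as in the worked example, with the step output taken as 1 only for a strictly positive net input):

```python
def step(x):
    # step activation as used in this example: 1 only for strictly positive net input
    return 1 if x > 0 else 0

def train_biased(data, w1=0.3, w2=0.2, eta=0.6, epochs=5):
    # the bias is fixed at -1 and is not updated, matching the worked example
    for _ in range(epochs):
        for x1, x2, t in data:
            o = step(x1 * w1 + x2 * w2 - 1)
            w1 += eta * (t - o) * x1
            w2 += eta * (t - o) * x2
    return w1, w2

binary_and = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
w1, w2 = train_biased(binary_and)
print(round(w1, 1), round(w2, 1))  # settles at 0.9 0.8, as in the hand computation
assert all(step(x1 * w1 + x2 * w2 - 1) == t for x1, x2, t in binary_and)
```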
Linear Separability:
 According to the perceptron convergence theorem, as long as the patterns
used to train the perceptron are linearly separable, the learning algorithm
converges in a finite number of steps. This means that a decision boundary is
obtained that classifies the patterns, as shown in the figure below.

 To illustrate the idea, the hyperplane takes the form

w1x1 + w2x2 + · · · + wlxl − θ = 0

 In the two-dimensional case, this translates into finding the line given by w1x1 + w2x2
− θ = 0, which after learning should adequately classify the patterns.
 On training, if the weights of training input vectors of correct response +1 lie on one
side of the boundary and that of -1 lie on the other side of the boundary, then the
problem is linearly separable.

(Figure: decision regions over the points (0,0), (0,1), (1,0), (1,1). For AND, a single
straight line separates (1,1) from the other three points; for XOR, no single line can
separate the two classes.)

 Exercise 1:
Can we train a perceptron to solve the exclusive-OR (XOR) logic problem? Justify the answer.
 Exercise 2:
Train the network using the following set of input and desired-output training vectors:
X1 X2 X3 X4 t
1 -2 0 -1 -1
0 1.5 -0.5 -1 -1
-1 1 0.5 -1 1
with initial weight vector w(1) = [1, −1, 0, 0.5]T, and the learning rate η = 0.1.
Ans: w1 = −0.2, w2 = 0.3, w3 = 0.5, and w4 = 0.3.

 Single-layer perceptrons were subjected to two major criticisms:
 The perceptron lacked the ability to solve problems dealing with nonlinearly
separable patterns.
 The perceptron, once trained with a set of training data, cannot adapt its connection
weights to a new set of data (lack of generalization).
 These shortcomings put connectionist research on hold for a number of years
(1960–1970).

Adaline
 The Adaptive Linear Neuron (Adaline) was developed by Widrow in 1962 and is composed
of a linear combiner, an activation function (hard limiter), and a set of trainable signed
weights, as shown in the figure.
 In an Adaline, the input-output relationship
is linear.
 It has only one output.
 This model is more versatile in terms of
generalization and more powerful in terms
of weight adaptation.
 The weights in this model are adjusted
according to the least mean square (LMS)
algorithm which is also known as the
Widrow–Hoff learning rule or delta rule.
 The learning rule for the adaline is
formally derived using the gradient
descent algorithm.

 The perceptron learning rule stops after a finite number of learning steps, but the gradient
descent approach continues forever, converging only asymptotically to the solution.
 The LMS rule adjusts the weights at every iteration step by an
amount proportional to the gradient of the cumulative error of the network E(w):

Δwi = −η ∂E(w)/∂wi

where

E(w) = ½ Σk=1..n (tk − yk)²

 This is the cumulative error over all patterns k (k = 1 . . . n) between the desired
response tk and the actual output yk of the linear combiner.
 The weights are updated individually according to the formula

wi(new) = wi(old) + η (tk − yk) xi
 Once Adaline has been trained for a set of patterns, it can be applied successfully to
classify a noisy version of the same set.
 The steps involved in training the Adaline are quite similar to those used for the
perceptron, with the difference that the weight update law uses the desired
output and the actual output of the linear combiner (not the output of the
activation function).
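A minimal sketch of the Adaline (delta) rule on the bipolar AND data; note that the error term uses the linear combiner output y, not the activation output (the function name and parameter values are illustrative):

```python
def train_adaline(data, eta=0.1, epochs=50):
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for x1, x2, t in data:
            y = w1 * x1 + w2 * x2 + b   # linear combiner output
            err = t - y                 # delta rule uses t - y, not t - f(y)
            w1 += eta * err * x1
            w2 += eta * err * x2
            b  += eta * err
    return w1, w2, b

bipolar_and = [(-1, -1, -1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
w1, w2, b = train_adaline(bipolar_and)
# after training, the signum of the combiner output classifies every pattern
assert all((1 if w1*x1 + w2*x2 + b > 0 else -1) == t for x1, x2, t in bipolar_and)
```

On this data the weights settle near the least-squares solution (0.5, 0.5, −0.5), which classifies all four patterns correctly when followed by a hard limiter.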

 Training Algorithm for Adaline:
The steps mirror those of the perceptron: initialize the weights, present a training
pattern, compute the linear combiner output y, update each weight as
wi(new) = wi(old) + η (t − y) xi, and repeat until the cumulative error stops
decreasing. The only difference from the perceptron is that the update uses the
linear combiner output y rather than the output of the activation function.
Madaline
 Like the perceptron, the Adaline is also unable to train on patterns belonging to nonlinearly
separable spaces.
 It was found that combining a number of Adaline units in parallel can solve this
problem. This network is called a Madaline (Many Adalines).
 For instance, it was found that the XOR logic function could be effectively trained by
combining two Adaline units in parallel through an AND logic gate. (Fig. next slide)
 The weights connecting the Adaline layer to the Madaline layer are fixed and
positive, with fixed values. The weights between the input layer and the Adaline layer are
adjustable during the training process.
 The training process of the Madaline is similar to that of the Adaline.
 However, the Madaline was still restricted in its capability for dealing with complex
functional mappings and multi-class pattern recognition problems.

Kohonen’s self-organizing network (or Map)
(KSON or KSOM)
 The Kohonen self-organizing network (KSON), also known as the Kohonen self-
organizing map (KSOM) or self-organizing map (SOM), belongs to the class of
unsupervised learning networks, i.e., it updates its weighting parameters without the need
for performance feedback from a teacher or a network trainer.
 One major feature of this network is that the nodes distribute themselves across the
input space to recognize groups of similar input vectors, while the output nodes
compete among themselves to be fired one at a time in response to a particular input
vector. This process is known as competitive learning.
 When suitably trained, the network produces a low dimension representation of the
input space that preserves the ordering of the original structure of the network. This
implies that two input vectors with similar pattern characteristics excite two physically
close layer nodes.
 KSONs have been used extensively for clustering applications such as speech
recognition, vector coding, and texture segmentation and robotics applications.
 A schematic representation of a typical KSON with a 2-D output configuration is shown
in Figure next slide
 The learning here permits the clustering of input data into a smaller set of elements
having similar characteristics (features).
 It is based on the competitive learning technique also known as the ‘winner take all’
strategy.
 Let us presume that the input pattern is given by the vector x and let us denote by wij
the weight vector connecting the pattern input elements to an output node with
coordinates provided by indices i and j.
 Let us also denote Nc as being the neighborhood around the winning output candidate,
which has its size decreasing at every iteration of the algorithm until convergence
occurs.
 The steps of the learning algorithm are summarized as follows:
Step 1: Initialize all weights to small random values. Set a value for the initial learning rate
α and a value for the neighborhood Nc.
Step 2: Choose an input pattern x from the input data set.
Step 3: Select the winning unit c (the index of the best-matching output unit) such that the
performance index I, given by the Euclidean distance from x to wij, is minimized:

I = min(i,j) ‖x − wij‖
Step 4: Update the weights according to the global network updating phase from iteration
k to iteration k + 1 as:

wij(k + 1) = wij(k) + α(k) [x − wij(k)]   for units (i, j) within Nc(k)
wij(k + 1) = wij(k)                        otherwise

where α(k) is the adaptive learning rate (a strictly positive value smaller than
unity) and Nc(k) is the neighborhood of the unit c at iteration k.
Step 5: The learning rate and the neighborhood are decreased at every iteration according
to an appropriate scheme. For instance, Kohonen suggested a shrinking function in the
form of α(k) = α(0)(1 − k/T), with T being the total number of training cycles and α(0)
the starting learning rate bounded by one.
Step 6: The learning scheme continues until a sufficient number of iterations has been
reached or until each output reaches a threshold of sensitivity with respect to a portion of
the input space.
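The steps above can be sketched for a 1-D map with scalar inputs (an illustrative toy; the data, node count, and shrinking schedules are our assumptions, not the author's):

```python
import random

def train_ksom(data, n_nodes=4, T=200, alpha0=0.5, seed=0):
    rng = random.Random(seed)
    # Step 1: small random weights for a 1-D map over scalar inputs
    w = [rng.uniform(0, 1) for _ in range(n_nodes)]
    for k in range(T):
        alpha = alpha0 * (1 - k / T)        # Kohonen's shrinking learning rate
        nc = 1 if k < T // 2 else 0         # neighborhood radius shrinks over time
        x = rng.choice(data)                # Step 2: pick an input pattern
        # Step 3: winner = node whose weight is closest to x (min Euclidean distance)
        c = min(range(n_nodes), key=lambda i: abs(x - w[i]))
        # Step 4: move the winner and its neighbors toward x
        for i in range(n_nodes):
            if abs(i - c) <= nc:
                w[i] += alpha * (x - w[i])
    return sorted(w)

# two clusters of scalar inputs, around 0.1 and 0.9
data = [0.05, 0.1, 0.15, 0.85, 0.9, 0.95]
w = train_ksom(data)
# some map nodes should settle near each cluster centre
assert min(abs(v - 0.1) for v in w) < 0.15
assert min(abs(v - 0.9) for v in w) < 0.15
```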

Recurrent network
 Feedforward networks are networks in which the output of a given node is always
injected into the following layer. Recurrent networks, on the other hand, have some
outputs directed back as inputs to node(s) in the same or preceding layer.
 The presence of cycles in a recurrent network provides the abilities of dynamically
encoding, storing, and retrieving context information and hence their name as dynamic
neural networks.
 The figure shows a typical structure of a recurrent neural network (RNN). The “feedback”
connections allow the network to tackle problems involving dynamic processes.

 The architecture is similar to the feedforward network, except that there are
connections going from nodes in the output layer to those in the input layer. There
are also some self-loops in the hidden layer.
 When inputs are presented at the input layer over time, a unit computes its activation
output just as a feedforward node does. However, its net input carries additional
information reflecting the state of the network. For subsequent inputs, the state is
essentially a function of the previous and current inputs. As a result, the behavior of
the network depends not only on the current excitation, but also on past history.
 A large amount of information can be stored in a recurrent network. At the same time, however, the
feedback connections pose a challenging problem in terms of training.
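This dependence on past history can be illustrated with a single recurrent unit (a sketch with arbitrary weights, not the author's network):

```python
import math

def rnn_step(x, h_prev, w_in=1.0, w_rec=0.5):
    # net input combines the current input with the fed-back previous state
    return math.tanh(w_in * x + w_rec * h_prev)

def run(sequence):
    h = 0.0                 # initial network state
    for x in sequence:
        h = rnn_step(x, h)  # the state carries past information forward
    return h

# identical final input (0.5), different histories -> different final outputs
assert run([1.0, 0.0, 0.5]) != run([0.0, 0.0, 0.5])
```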

Mr. Suryalok Dash


59 PMEC- Berhampur 11/13/2018

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy