Neural Network: Sudipta Roy
[Figure: a single neuron. Inputs x1, x2, …, xn reach the output unit through weights w1, w2, …, wn, producing output y, where
    y_in = ∑_{i=1}^{n} w_i x_i ;  y = f(y_in)]
McCulloch-Pitts Model
The McCulloch-Pitts model:
• Spikes are interpreted as spike rates.
• Synaptic strengths are translated into synaptic weights.
• Excitation means a positive product between the incoming
spike rate and the corresponding synaptic weight.
• Inhibition means a negative product between the incoming
spike rate and the corresponding synaptic weight.
• Does not have any particular training algorithm.
• Used as building blocks with which we can model any function
or phenomenon that can be represented as a logic
function.
• Each neuron has a fixed threshold: if the net input
to the neuron is greater than the threshold, the neuron fires.
The threshold is set so that any non-zero inhibitory
input will prevent the neuron from firing.
McCulloch-Pitts Model
The McCulloch-Pitts model:
• The activation function is
    f(y_in) =  1   if y_in ≥ θ
               0   if y_in < θ
• MP neuron has no particular training algorithm.
• An analysis has to be performed to determine the
values of the weights and the threshold.
• MP neurons are used as building blocks with which we
can model any function or phenomenon that can
be represented as a logic function, as the sketch below illustrates.
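A minimal Python sketch of an MP neuron (the AND and AND-NOT weight and threshold choices are standard illustrations, not taken from these slides):

def mp_neuron(inputs, weights, theta):
    # Fires (returns 1) iff the net input sum(w_i * x_i) reaches theta.
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

# AND: two excitatory weights of 1 with threshold 2.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mp_neuron(x, (1, 1), theta=2))      # fires only for (1, 1)

# AND-NOT (x1 and not x2): the inhibitory weight -1 keeps the
# neuron from firing whenever x2 = 1.
print(mp_neuron((1, 0), (1, -1), theta=1))       # -> 1
print(mp_neuron((1, 1), (1, -1), theta=1))       # -> 0

Because the weights and threshold are fixed rather than learned, choosing them is exactly the analysis step mentioned above.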
Hebb Network
Donald Hebb proposed the Hebbian learning rule in 1949.
Hebb stated that in the brain, learning is performed
by changes in the synaptic gap.
Hebb explained it as: “When an axon of cell A is near
enough to excite cell B and repeatedly or persistently
takes part in firing it, some growth process or metabolic
change takes place in one or both cells such that A’s
efficiency, as one of the cells firing B, is increased.”
The Hebb rule can be used for pattern association, pattern
categorization, pattern classification, and a range of
other tasks.
Flowchart: Hebb Network
1. Start.
2. Initialize weights.
3. For each training pair s:t:
   Weight update: wi(new) = wi(old) + xi y
   Bias update: b(new) = b(old) + y
4. Stop.
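As a concrete illustration of the flowchart, here is one pass of Hebb training in Python on the bipolar AND function (the training data are my own illustrative choice; y is set to the target t of each pair):

# One pass of Hebb training on the bipolar AND function.
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = [0, 0], 0                         # initialize weights and bias
for x, y in samples:                     # for each training pair s:t
    for i in range(len(w)):
        w[i] += x[i] * y                 # weight update: wi += xi*y
    b += y                               # bias update: b += y
print(w, b)                              # -> [2, 2] and -2

The resulting net, w = [2, 2] and b = -2, gives a positive net input only for the input (1, 1), so it classifies bipolar AND correctly.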
Perceptron Network (Perceptron learning rule)
Key points:
1. The perceptron network consists of three units:
Sensory unit (input unit)
Associator unit (hidden unit)
Response unit (output unit)
2. Sensory units are connected to associator units with fixed
weights having values 1, 0, or -1, assigned at
random.
Perceptron Network (Perceptron learning rule)
3. Binary activation functions are used in the sensory unit
and the associator unit.
4. The response unit has an activation of 1, 0, or -1.
A binary step function with fixed threshold θ is used as the
activation for the associator.
The output signals sent from the associator unit to the
response unit are binary only.
5. The output of the perceptron network is given by y = f(y_in), where
    f(y_in) =  1   if y_in > θ
               0   if −θ ≤ y_in ≤ θ
              −1   if y_in < −θ
Perceptron Network (Perceptron learning rule)
6. The perceptron learning rule is used for the weight
updates between the associator and response units.
For each training input, the net will calculate the response
and it will determine whether or not an error has occurred.
7. Error calculation is based on the comparison of the values
of targets with those of the calculated outputs.
8. The weights on the connections from the units that send
the nonzero signal will get adjusted suitably.
9. The weights will be adjusted on the basis of the learning
rule if an error has occurred for a particular training pattern.
wi(new) = wi(old) + α t xi
b(new) = b(old) + α t
where α is the learning rate, t is the target (-1 or +1), and i = 1, 2, …, n.
Perceptron Network
• Training Algorithm for Single/Multiple output classes
1. Initialize weights and bias to zero
Set learning rate 0 < 𝛼 ≤ 1; for simplicity 𝛼 = 1
2. Perform steps 3 to 7 until final stopping condition is false
3. Perform steps 4 to 6 for each training pair s:t
4. Set activation of input units 𝑥𝑖 = 𝑠𝑖 ; 𝑖 = 1 𝑡𝑜 𝑛
Step 5 (single output class): compute the response of the output unit
    y_in = b + ∑_{i=1}^{n} x_i w_i ;  y = f(y_in)
Step 5 (multiple output classes): compute the response of each output unit, j = 1 to m
    y_in,j = b_j + ∑_{i=1}^{n} x_i w_ij ;  y_j = f(y_in,j)
Step 6: if y ≠ t, update the weights and bias, wi(new) = wi(old) + α t xi and b(new) = b(old) + α t; otherwise leave them unchanged. (For multiple output classes, apply the same update at each output unit j where y_j ≠ t_j.)
7. Train the network until there is no weight change. This is the stopping condition of the
network. If this condition is not met then start again from step 3.
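A compact Python sketch of steps 1-7 for a single output class; the bipolar AND data, α = 1, and θ = 0.2 are illustrative choices rather than values prescribed by the slides:

def train_perceptron(samples, alpha=1.0, theta=0.2, max_epochs=100):
    # Steps 1-7 of the training algorithm for a single output class.
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0                     # step 1: zero initialization

    def f(y_in):                              # step 5 activation
        return 1 if y_in > theta else (-1 if y_in < -theta else 0)

    for _ in range(max_epochs):               # step 2: loop until stable
        changed = False
        for x, t in samples:                  # step 3: each pair s:t
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))
            if f(y_in) != t:                  # an error occurred
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:                       # step 7: stopping condition
            break
    return w, b

# Bipolar AND: the target is 1 only for the input (1, 1).
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_perceptron(samples)
print(w, b)   # converges to a separating line, here [1.0, 1.0] and -1.0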
Perceptron Network
• Testing Algorithm
It is best to test the network performance once the training
process is complete.
The network performs more reliably when it has been trained
with more data.
The testing algorithm is as follows:
1. Initialize the weights with the final weights obtained during training.
2. For each input vector x to be classified, perform steps 3 and 4
3. Set activations of the input unit.
4. Obtain the response of output unit
    y_in = b + ∑_{i=1}^{n} x_i w_i
    y = f(y_in) =  1   if y_in > θ
                   0   if −θ ≤ y_in ≤ θ
                  −1   if y_in < −θ
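A matching test routine in Python; the final weights w = [1, 1], b = -1 below are what the training sketch above converges to on bipolar AND:

def perceptron_test(x, w, b, theta=0.2):
    # Steps 3-4: set activations, then take the response of the output unit.
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

# Final weights from the training sketch above (bipolar AND).
w, b = [1.0, 1.0], -1.0
for x in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    print(x, perceptron_test(x, w, b))   # -> 1, -1, -1, -1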
Delta learning rule
o The delta rule for adjusting the weight from the i-th input to the j-th output unit is
    Δw_ij = α (t_j − y_in,j) x_i
o The squared error for a particular training pattern is
    E = ∑_{j=1}^{m} (t_j − y_in,j)²
Therefore,
    ∂E/∂w_ij = ∂/∂w_ij ∑_{j=1}^{m} (t_j − y_in,j)²
             = ∂/∂w_ij (t_j − y_in,j)²          (only the j-th term depends on w_ij)
             = −2 (t_j − y_in,j) ∂y_in,j/∂w_ij
             = −2 (t_j − y_in,j) x_i
Thus, stepping against the gradient (the constant factor 2 is absorbed into the learning rate α),
    Δw_ij = −α ∂E/∂w_ij = α (t_j − y_in,j) x_i
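A quick finite-difference check of this gradient in Python (the input, target, and weights are arbitrary illustrative values):

# Finite-difference check of dE/dw_ij = -2 (t - y_in) x_i for one output unit.
x, t = [0.5, -1.0, 2.0], 1.0             # arbitrary illustrative values
w = [0.1, 0.4, -0.2]

def E(w):
    y_in = sum(wi * xi for wi, xi in zip(w, x))
    return (t - y_in) ** 2

i, eps = 1, 1e-6
w_shift = list(w); w_shift[i] += eps
numeric = (E(w_shift) - E(w)) / eps      # ~ dE/dw_i
y_in = sum(wi * xi for wi, xi in zip(w, x))
analytic = -2 * (t - y_in) * x[i]
print(numeric, analytic)                 # both are approximately 3.5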
Adaline (Adaptive Linear Neuron)
The units with linear activation function are called linear
units.
A network with a single linear unit is called an Adaline,
i.e. the input-output relationship in an Adaline is linear.
Adaline uses bipolar activation for its input signals and
its target output, i.e. inputs and targets take the values +1 or -1.
The weights between input and output are adjustable.
The bias weight is connected to a unit whose activation is
always 1.
Adaline is a net which has only one output unit.
Adaline network may be trained using delta rule.
Adaline Training algorithm
1. Set weights and bias to small non-zero random values. Set the learning rate α
2. Perform 3-7 when stopping condition is false.
3. Perform 4-6 for each bipolar training set s:t
4. Set activations for input units 𝑥𝑖 = 𝑠𝑖 ; 𝑖 = 1 𝑡𝑜 𝑛
5. Calculate the net input to the output unit
    y_in = b + ∑_{i=1}^{n} x_i w_i
6. Update the weights and bias for i = 1 to n
    wi(new) = wi(old) + α (t − y_in) xi
    b(new) = b(old) + α (t − y_in)
7. If the largest weight change that occurred in this pass is smaller than a specified tolerance, stop; otherwise continue. This is the stopping condition.
Testing: once training is complete, set the weights to their final values and, for each input, calculate the net input
    y_in = b + ∑_{i=1}^{n} x_i w_i
then apply the activation function over the net input calculated:
    y =  1   if y_in ≥ 0
        −1   if y_in < 0
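A minimal Python sketch of the Adaline training and testing steps above; the bipolar AND data, α = 0.1, and the 10^-3 tolerance are illustrative choices:

import random

def train_adaline(samples, alpha=0.1, tol=1e-3, max_epochs=1000):
    # Delta-rule training: the update uses the raw net input y_in,
    # not the thresholded output.
    random.seed(0)
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]    # step 1
    b = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):                          # step 2
        biggest = 0.0
        for x, t in samples:                             # steps 3-4
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))  # step 5
            for i in range(n):                           # step 6
                dw = alpha * (t - y_in) * x[i]
                w[i] += dw
                biggest = max(biggest, abs(dw))
            db = alpha * (t - y_in)
            b += db
            biggest = max(biggest, abs(db))
        if biggest < tol:                                # step 7: stop
            break
    return w, b

samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_adaline(samples)   # converges near w = [0.5, 0.5], b = -0.5
y_in = b + sum(wi * xi for wi, xi in zip(w, (1, 1)))
print(1 if y_in >= 0 else -1)   # testing: activation on the net input -> 1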
Madaline (Multiple Adaptive Linear Neuron)
Consists of many adalines in parallel with a single output
unit whose value is based on certain selection rules.
The output is a single true-or-false answer.
If the AND rule is used, the output is true if and only if
both inputs are true.
The weights connecting the adaline layer to the madaline
layer are fixed, positive, and equal in value.
The weights between the input layer and the adaline
layer are adjusted during the training process.
Each adaline and madaline layer neuron has a bias of
excitation 1 connected to it.
Madaline (Multiple Adaptive Linear Neuron)
Training process of madaline is similar to that of adaline.
Madaline structure consists of “n” units of input layer,
“m” units of adaline layer and “1” unit of the madaline
layer.
Each neuron in the adaline and madaline layers has a
bias of excitation “1”.
The adaline layer lies between the input and madaline
(output) layers; hence, the adaline layer can be considered
a hidden layer.
The adaline and madaline models can be applied
effectively in communication systems, for example in adaptive
equalizers, adaptive noise cancellation, and other
cancellation circuits.
Madaline Training algorithm
In the training algorithm, the weights between the hidden layer and the input layer are adjusted,
while the weights for the output unit are fixed.
The weights v1, v2, …, vm from the adalines to the output unit Y, and its bias b0, are fixed so that Y responds with 1 whenever at least one adaline gives the response 1. Thus the weights may be taken as
    v1 = v2 = ⋯ = vm = 1/2 and b0 = 1/2
The activation for the madaline (output) and adaline (hidden) units is
    f(x) =  1   if x ≥ 0
           −1   if x < 0
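A small Python sketch of what these fixed weights do: with v = 1/2 and b0 = 1/2, a two-adaline madaline output responds -1 only when both adalines respond -1 (an OR rule over their bipolar outputs):

# Madaline output unit with the fixed weights v = 1/2 and bias b0 = 1/2.
def madaline_output(z, v=0.5, b0=0.5):
    # z holds the bipolar responses of the adaline (hidden) units.
    y_in = b0 + sum(v * zj for zj in z)
    return 1 if y_in >= 0 else -1

for z in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(z, madaline_output(z))   # -1 only when both adalines respond -1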
Training algorithm
1. Initialize the weights. Set initial small random values for adaline weights and initial learning
rate α
2. Perform 3-4 when stopping condition is false.
3. Perform 4-8 for each bipolar training set s:t
4. Set activations for input units 𝑥𝑖 = 𝑠𝑖 ; 𝑖 = 1 𝑡𝑜 𝑛
5. Calculate the net input to each hidden adaline unit
    z_in,j = b_j + ∑_{i=1}^{n} x_i w_ij ,  j = 1 to m