
Neural network

Sudipta Roy

Department of Computer Science and Engineering


Assam University, Silchar
McCulloch-Pitts Neuron
• Earliest neural network model, proposed in 1943.

[Figure: a single McCulloch-Pitts neuron. Inputs x1, x2, ..., xn reach the output unit y through weights w1, w2, ..., wn.]

   y_in = Σ_{i=1}^{n} w_i x_i ;   y = f(y_in)
McCulloch-Pitts Model
The McCulloch-Pitts model:
• Spikes are interpreted as spike rates.
• Synaptic strengths are translated into synaptic weights.
• Excitation means positive product between the incoming
spike rate and the corresponding synaptic weight.
• Inhibition means negative product between the incoming
spike rate and the corresponding synaptic weight.
• Does not have any particular training algorithm.
• Used as building blocks with which we can model any function
or phenomenon that can be represented as a logic function.
• Each neuron has a fixed threshold such that if the net input
to the neuron is greater than the threshold the neuron fires.
The threshold should be such that any non-zero inhibitory
input will prevent the neuron from firing.
McCulloch-Pitts Model
The McCulloch-Pitts model:
• The activation function is
     f(y_in) = 1   if y_in ≥ θ
             = 0   if y_in < θ
• MP neuron has no particular training algorithm.
• An analysis has to be performed to determine the
values of the weights and the threshold.
• MP neurons are used as building blocks on which we
can model any function or phenomenon, which can
be represented as a logic function.
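To make this concrete, here is a small Python sketch added as an illustration (not from the slides) of an MP neuron realizing the two-input AND function; the weights (both 1) and the threshold θ = 2 are values obtained by analysis, as noted above.

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts neuron: output 1 if the net input reaches the threshold, else 0."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= theta else 0

# Two-input AND: excitatory weights of 1 and threshold theta = 2,
# found by analysis rather than training (the MP neuron has no training algorithm).
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_neuron([x1, x2], [1, 1], theta=2))
```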
Hebb Network
Donald Hebb stated it in 1949 (Hebbian Learning Rule)
Hebb stated that in the brain, the learning is performed
by the change in the synaptic gap.
Hebb explained it as: “When an axon of cell A is near
enough to excite cell B and repeatedly or persistently
takes part in firing it, some growth process or metabolic
change takes place in one or both cells such that A’s
efficiency, as one of the cells firing B, is increased.”
Hebb rule can be used for pattern association, pattern
categorization, pattern classification and over a range of
other areas.
Flowchart: Hebb Network

Start
↓
Initialize weights
↓
For each training pair s:t
↓
Activate input units: xi = si
↓
Activate output units: yj = tj
↓
Weight update: wi(new) = wi(old) + xi y
↓
Bias update: b(new) = b(old) + y
↓
Stop
Hebb Network
Training Algorithm:

1. Initialize all weights wi = 0 (i=1, 2, ...., n)


2. For each input training vector and target output pair s:t
do steps 3 to 5
3. Set input units’ activations xi = si (i=1, 2, ...., n)
4. Set output units’ activations yj = tj (j=1, 2, ...., m)
5. Weight and bias adjustments are performed
wi(new) = wi(old) + xiy (i=1, 2, ...., n)
b(new) = b(old) + y
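As an illustration added here (not part of the original slides), a minimal Python sketch of steps 1-5 above, applied to the bipolar AND function; the data and the use of NumPy are assumptions made for the example.

```python
import numpy as np

# Assumed example data: bipolar AND (inputs s and targets t)
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])

w = np.zeros(2)   # step 1: initialize all weights (and the bias) to 0
b = 0.0

for s, t in zip(X, T):   # step 2: for each training pair s:t
    x, y = s, t          # steps 3-4: x_i = s_i and y = t
    w = w + x * y        # step 5: w_i(new) = w_i(old) + x_i y
    b = b + y            #         b(new)  = b(old)  + y

print(w, b)   # final weights and bias after one pass over the data
```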
Hebb Network
Hebb rule is more suited for bipolar data than binary
data.
If binary data is used, the weight updation formula in
step 5 cannot distinguish two conditions:
A training pair in which an input unit is “on” and
target value is “off”
A training pair in which both the input unit and the target
value are “off”
Thus, there are limitations for binary data.
Perceptron Network (Perceptron learning rule)
• The perceptron network comes under single-layer feed-forward
networks and is also called the simple perceptron.

Key points:
1. The perceptron network consists of three units:
Sensory unit (input unit)
Associator unit (hidden unit)
Response unit (output unit)
2. Sensory units are connected to associator units with fixed
weights having values 1, 0 or -1, which are assigned at
random.
Perceptron Network (Perceptron learning rule)
3. Binary activation functions are used in sensory unit and
associator unit
4. Response unit has an activation of 1, 0 or -1.
Binary step with fixed threshold θ is used as activation
for associator.
Output signals that are sent from the associator unit to the
response unit are only binary.
5. Output of the perceptron network is given by y = f(y_in), where
   f(y_in) =  1   if y_in > θ
              0   if -θ ≤ y_in ≤ θ
             -1   if y_in < -θ
Perceptron Network (Perceptron learning rule)
6. The perceptron learning rule is used in the weight
updation between associator and response unit.
For each training input, the net will calculate the response
and it will determine whether or not an error has occurred.
7. Error calculation is based on the comparison of the values
of targets with those of the calculated outputs.
8. The weights on the connections from the units that send
the nonzero signal will get adjusted suitably.
9. The weights will be adjusted on the basis of the learning
rule if an error has occurred for a particular training pattern.
wi(new) = wi(old) + αtxi
where α=learning rate; t is -1 or +1 and (i=1, 2, ...., n)
b(new) = b(old) + αt
Perceptron Network
• Training Algorithm for Single/Multiple output classes
1. Initialize weights and bias to zero
Set learning rate 0 < 𝛼 ≤ 1; for simplicity 𝛼 = 1
2. Perform steps 3 to 7 until final stopping condition is false
3. Perform steps 4 to 6 for each training pair s:t
4. Set activation of input units 𝑥𝑖 = 𝑠𝑖 ; 𝑖 = 1 𝑡𝑜 𝑛
Step 5 (single output class): Compute the response of the output unit
   y_in = b + Σ_{i=1}^{n} x_i w_i
   y = f(y_in) =  1   if y_in > θ
                  0   if -θ ≤ y_in ≤ θ
                 -1   if y_in < -θ

Step 5 (multiple output classes): Compute the response of each output unit (j = 1 to m)
   y_in_j = b_j + Σ_{i=1}^{n} x_i w_ij
   y_j = f(y_in_j) =  1   if y_in_j > θ
                      0   if -θ ≤ y_in_j ≤ θ
                     -1   if y_in_j < -θ

Step 6 (single output class): Update weights and bias if an error occurred, i.e. if y ≠ t then
   w_i(new) = w_i(old) + α t x_i
   b(new) = b(old) + α t
else the weights and bias are left unchanged.

Step 6 (multiple output classes): Update weights and bias if an error occurred, i.e. if y_j ≠ t_j then
   w_ij(new) = w_ij(old) + α t_j x_i
   b_j(new) = b_j(old) + α t_j
else the weights and bias are left unchanged.

7. Train the network until there is no weight change. This is the stopping condition of the
network. If this condition is not met then start again from step 3.
Perceptron Network
• Testing Algorithm
• It is best to test the network performance once the training
process is complete.
• For efficient performance of the network, it should be trained
with more data.
• The testing algorithm is as follows:
1. Initialize the weights with the final weights obtained during training.
2. For each input vector x to be classified, perform steps 3 and 4
3. Set activations of the input unit.
4. Obtain the response of the output unit
   y_in = Σ_{i=1}^{n} x_i w_i
   y = f(y_in) =  1   if y_in > θ
                  0   if -θ ≤ y_in ≤ θ
                 -1   if y_in < -θ
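The following Python sketch is an illustration added here (not from the slides): it implements the perceptron training algorithm above for a single output class, together with the testing step, under the assumptions of bipolar AND training data, α = 1 and θ = 0.

```python
import numpy as np

def activation(y_in, theta):
    """Three-level step function used by the perceptron response unit."""
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

def train_perceptron(X, T, alpha=1.0, theta=0.0, max_epochs=100):
    """Perceptron training (steps 1-7) for a single output class."""
    w, b = np.zeros(X.shape[1]), 0.0        # step 1: weights and bias start at zero
    for _ in range(max_epochs):             # step 2: repeat until the stopping condition
        changed = False
        for x, t in zip(X, T):              # steps 3-4: each training pair s:t
            y = activation(b + np.dot(x, w), theta)   # step 5: response of the output unit
            if y != t:                      # step 6: update only when an error occurred
                w = w + alpha * t * x
                b = b + alpha * t
                changed = True
        if not changed:                     # step 7: stop when no weight changed
            break
    return w, b

def test_perceptron(x, w, b, theta=0.0):
    """Testing algorithm: classify a new input with the final trained weights."""
    return activation(b + np.dot(x, w), theta)   # bias kept for consistency with training

# Assumed example data: bipolar AND
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])
w, b = train_perceptron(X, T)
print([test_perceptron(x, w, b) for x in X])   # expected: [1, -1, -1, -1]
```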
Delta learning rule

• The Delta rule is very similar to the perceptron learning rule.
• Their origins, however, are different:
  o The perceptron learning rule originates from the Hebbian assumption, while the Delta
    rule is derived from the gradient-descent method.
• The perceptron learning rule stops after a finite number of learning steps, but the gradient-descent
  rule continues forever, converging only asymptotically to the solution.
• The Delta rule changes the weights of the neural connections so as to minimize the difference
  between the net input to the output unit and the target value, i.e.

   Δw_i = α (t − y_in) x_i ,   where   y_in = Σ_{i=1}^{n} x_i w_i
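A quick worked illustration of a single update (the numbers are chosen here, not taken from the slides): let x = (1, −1), w = (0.2, 0.4), t = 1 and α = 0.1. Then

   y_in = (1)(0.2) + (−1)(0.4) = −0.2
   Δw_1 = 0.1 · (1 − (−0.2)) · (1) = 0.12
   Δw_2 = 0.1 · (1 − (−0.2)) · (−1) = −0.12

so the updated weights are w = (0.32, 0.28) and the new net input is y_in = 0.04, which has moved toward the target t = 1.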
Delta learning rule
Proof (for single unit)
o The squared error for a particular training pattern is E = (t − y_in)²
o The error can be reduced by adjusting the weights in the direction of −∂E/∂w_i :
     ∂E/∂w_i = −2 (t − y_in) ∂y_in/∂w_i = −2 (t − y_in) x_i
o According to the delta learning rule,
     Δw_i = −α ∂E/∂w_i = −α [−2 (t − y_in) x_i] = α (t − y_in) x_i
  (the constant factor 2 is absorbed into the learning rate α)
Delta learning rule
For multiple units:
o The delta rule for adjusting the weight from the i-th input to the j-th output unit is
     Δw_ij = α (t_j − y_in_j) x_i
o The squared error for a particular training pattern is
     E = Σ_{j=1}^{m} (t_j − y_in_j)²
  Therefore, since only the j-th term of the sum depends on w_ij,
     ∂E/∂w_ij = ∂/∂w_ij Σ_{j=1}^{m} (t_j − y_in_j)² = ∂/∂w_ij (t_j − y_in_j)²
              = −2 (t_j − y_in_j) ∂y_in_j/∂w_ij = −2 (t_j − y_in_j) x_i
  Thus,
     Δw_ij = −α ∂E/∂w_ij = α (t_j − y_in_j) x_i
Adaline (Adaptive Linear Neuron)
• Units with a linear activation function are called linear units.
• A network with a single linear unit is called an Adaline,
i.e. the input-output relationship in an Adaline is linear.
• Adaline uses bipolar activation for its input signals and
its target output, i.e. the inputs and the target take the values +1 or −1.
• The weights between the input and the output are adjustable.
• The bias is connected to a unit whose activation is always 1.
• An Adaline is a net which has only one output unit.
• An Adaline network may be trained using the delta rule.
Adaline Training algorithm
1. Set the weights and bias to some non-zero random values. Set the learning rate α.
2. Perform steps 3-7 while the stopping condition is false.
3. Perform steps 4-6 for each bipolar training pair s:t.
4. Set activations of the input units: x_i = s_i, i = 1 to n.
5. Calculate the net input to the output unit:
   y_in = b + Σ_{i=1}^{n} x_i w_i
6. Update the weights and bias for i = 1 to n:
   w_i(new) = w_i(old) + α (t − y_in) x_i
   b(new) = b(old) + α (t − y_in)
7. If the highest weight change < a specified tolerance, then
      stop the training process
   else
      continue.
   OR
   Calculate the error E_i = Σ (t − y_in)² ;
   if E_i ≤ E_s (a specified error threshold), then stop, else continue.
Adaline Testing algorithm
1. Initialize the weights (weights are obtained from training algorithm)
2. Perform steps 3-5 for each bipolar input vector x
3. Set the activations of the input units to x
4. Calculate the net input to the output unit
   y_in = b + Σ_{i=1}^{n} x_i w_i
5. Apply the activation function over the calculated net input
   y =  1   if y_in ≥ 0
       −1   if y_in < 0
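A minimal Python sketch, added here as an illustration, of the Adaline training and testing algorithms above; the bipolar AND data, learning rate, tolerance and random seed are assumptions made for the example.

```python
import numpy as np

def train_adaline(X, T, alpha=0.1, tolerance=0.01, max_epochs=1000):
    """Adaline training with the delta rule; stops early if the largest weight
    change in an epoch falls below the tolerance, otherwise after max_epochs."""
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.5, 0.5, X.shape[1])   # step 1: non-zero random weights
    b = rng.uniform(-0.5, 0.5)               #         and bias
    for _ in range(max_epochs):              # step 2: repeat while stopping condition is false
        max_change = 0.0
        for x, t in zip(X, T):               # steps 3-4: each bipolar training pair s:t
            y_in = b + np.dot(x, w)          # step 5: net input to the output unit
            delta = alpha * (t - y_in)       # step 6: delta-rule weight and bias update
            w = w + delta * x
            b = b + delta
            max_change = max(max_change, np.max(np.abs(delta * x)))
        if max_change < tolerance:           # step 7: highest weight change < tolerance
            break
    return w, b

def test_adaline(x, w, b):
    """Adaline testing: bipolar step activation over the net input."""
    return 1 if b + np.dot(x, w) >= 0 else -1

# Assumed example data: bipolar AND
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([1, -1, -1, -1])
w, b = train_adaline(X, T)
print([test_adaline(x, w, b) for x in X])   # expected: [1, -1, -1, -1]
```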
Madaline (Multiple Adaptive Linear Neuron)
• Consists of many Adalines in parallel with a single output
unit whose value is based on certain selection rules.
• The output is an answer, true or false.
• If the AND rule is used, then the output is true if and only if
both of its inputs (the Adaline outputs) are true.
• The weights that connect the Adaline layer to the Madaline
layer are fixed, positive and of equal value.
• The weights between the input layer and the Adaline
layer are adjusted during the training process.
• The Adaline and Madaline layer neurons have a bias of
excitation 1 connected to them.
Madaline (Multiple Adaptive Linear Neuron)
Training process of madaline is similar to that of adaline.
Madaline structure consists of “n” units of input layer,
“m” units of adaline layer and “1” unit of the madaline
layer.
Each neuron in the adaline and madaline layers has a
bias of excitation “1”.
Adaline layer is present between the input and madaline
(output) layer. Hence, adaline layer can be considered as
hidden layer.
The adaline and madaline models can be applied
effectively in communication systems, e.g. adaptive
equalizers, adaptive noise cancellation and other
cancellation circuits.
Madaline Training algorithm
• In the training algorithm, only the weights between the input layer and the hidden (Adaline) layer
  are adjusted; the weights for the output unit are fixed.
• The output weights and bias are chosen so that the response of y is 1 if the signal it receives
  from at least one Adaline unit is 1. Thus the weights may be taken as
     v1 = v2 = ... = vm = 1/2   and   b0 = 1/2
• Activation for the Adaline (hidden) and Madaline (output) units is given as
     f(x) =  1   if x ≥ 0
            −1   if x < 0
Training algorithm
1. Initialize the weights. Set initial small random values for the Adaline weights and set the initial
   learning rate α.
2. Perform steps 3-9 while the stopping condition is false.
3. Perform steps 4-8 for each bipolar training pair s:t.
4. Set activations of the input units: x_i = s_i, i = 1 to n.
5. Calculate the net input to each hidden Adaline unit:
   z_in_j = b_j + Σ_{i=1}^{n} x_i w_ij ,   for j = 1 to m
6. Calculate the output of each hidden unit: z_j = f(z_in_j)
7. Calculate the output of the net:
   y_in = b0 + Σ_{j=1}^{m} z_j v_j   and   y = f(y_in)
Training algorithm cont..
8. Calculate the error and update the weights:
   a. If t = y, then no weight updation is required.
   b. If t ≠ y and t = +1, then update the weights and bias (for i = 1 to n) on the unit z_j whose
      net input is closest to 0:
      w_ij(new) = w_ij(old) + α (1 − z_in_j) x_i
      b_j(new) = b_j(old) + α (1 − z_in_j)
   c. If t ≠ y and t = −1, then update the weights and bias (for i = 1 to n) on all units z_k whose
      net input is positive:
      w_ik(new) = w_ik(old) + α (−1 − z_in_k) x_i
      b_k(new) = b_k(old) + α (−1 − z_in_k)
9. If there is no weight change, or the weights reach a satisfactory level, or a specified maximum
   number of iterations of weight updation has been performed, then
      stop
   else
      continue.
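To tie the steps together, here is a Python sketch constructed for illustration (not taken from the slides) of the MRI-style Madaline training loop above. It assumes two hidden Adalines, the fixed output weights v1 = v2 = b0 = 1/2, α = 0.5, and the bipolar XOR patterns as training data.

```python
import numpy as np

def f(x):
    """Bipolar step activation used by the Adaline and Madaline units."""
    return 1 if x >= 0 else -1

# Assumed example: 2 inputs, 2 hidden Adalines, bipolar XOR targets
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
T = np.array([-1, 1, 1, -1])

rng = np.random.default_rng(1)
W = rng.uniform(-0.5, 0.5, (2, 2))   # W[i, j]: weight from input i to hidden Adaline j (trainable)
b = rng.uniform(-0.5, 0.5, 2)        # hidden-layer biases (trainable)
v, b0 = np.array([0.5, 0.5]), 0.5    # fixed output weights and bias
alpha = 0.5

for epoch in range(100):
    changed = False
    for x, t in zip(X, T):
        z_in = b + x @ W                          # step 5: net input to each hidden Adaline
        z = np.array([f(zi) for zi in z_in])      # step 6: hidden outputs
        y = f(b0 + z @ v)                         # step 7: Madaline output
        if y == t:                                # step 8a: correct, no update
            continue
        changed = True
        if t == 1:                                # step 8b: push the Adaline closest to 0 towards +1
            j = np.argmin(np.abs(z_in))
            W[:, j] += alpha * (1 - z_in[j]) * x
            b[j] += alpha * (1 - z_in[j])
        else:                                     # step 8c: push every positive Adaline towards -1
            for k in np.where(z_in > 0)[0]:
                W[:, k] += alpha * (-1 - z_in[k]) * x
                b[k] += alpha * (-1 - z_in[k])
    if not changed:                               # step 9: stop when an epoch causes no change
        break

# Predictions on the training patterns (MRI is heuristic; convergence is not
# guaranteed for every initialization).
print([f(b0 + np.array([f(zi) for zi in (b + x @ W)]) @ v) for x in X])
```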
Madaline
• A Madaline can be formed with the weights on the output
unit set to perform some logic function.
• Whether there are only two hidden units or more than two,
the “majority vote rule” function may be used.
