
BTech IV Year VII Semester Section:

BCSE0103: SOFT COMPUTING


(Artificial Neural Network)

Dr. Dilip Kumar Sharma


SM(IEEE, ACM, CSI), Fellow IETE, Fellow IE(I)
Professor- Dept. of Computer Engineering & Applications
Dean (International Relations & Academic Collaborations)
GLA University, Mathura, INDIA
Email: dilip.sharma@gla.ac.in
https://www.gla.ac.in/academics/faculty-detail/7/prof-dilip-kumar-sharma
1
BCSE0103: SOFT COMPUTING
Objective: Students will gain insight into intelligent computational approaches and acquire the mathematical background to carry out optimization.
• Neural Networks: Introduction to Soft Computing & Neural Computing, Fundamentals of Artificial Neural Networks (ANN), Models of ANN, ANN models: Rosenblatt's Perceptron, McCulloch & Pitts Model, Single Layer Perceptron, Learning Methods in Perceptron, Linearly Separable Tasks and the XOR Problem, Multi-Layer Perceptron, Back Propagation Learning Algorithm, Associative Memory: Hopfield Network, Auto-Associative Memory, Bidirectional Hetero-Associative Memory, ADALINE, MADALINE Network, Applications of Neural Networks.
• Fuzzy Logic: Introduction to Fuzzy Sets & Crisp Sets, Fuzzy Membership
and Fuzzy Operations, Properties of Fuzzy Sets- Linguistic Hedges, Fuzzy
Logic – T-norms and other aggregation operators, Crisp Relations.

<BCSE0103> <Soft Computing> 2


BCSE0103: SOFT COMPUTING
• Fuzzy Logic: Fuzzy Relations, Fuzzy Systems, Crisp Logic, Propositional Logic and its Laws, Inference in Propositional Logic, Fuzzy Logic, Inference in Fuzzy Logic (GMP and GMT), Fuzzy Rule Based Systems, Fuzzification & Defuzzification, Applications of Fuzzy Logic.
• Genetic Algorithm (GA): Introduction to GA, Search Optimization Methods, Evolutionary Algorithm Working Principle, Biological Background of GA, Working Principles of GA, Encoding (Binary, Value, Permutation, Tree), Operators of GA (Random Population, Reproduction or Selection), Crossover and Mutation, Basics of Genetic Algorithms with Examples, Introduction to Genetic Programming.
Text Books:
• S. Rajasekaran & G.A. Vijayalakshmi Pai, "Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis and Applications", 4th Edition, Prentice Hall of India, 2003.
<BCSE0103> <Soft Computing> 3
BCSE0103: SOFT COMPUTING
Reference Books:
• Timothy J. Ross, "Fuzzy Logic with Engineering Applications", 3rd Edition, John Wiley and Sons, 2016.
• David E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning", Addison-Wesley, 2009.
• Karray, "Soft Computing and Intelligent Systems Design: Theory, Tools and Applications", 1st Edition, Pearson Education, 2009.
Outcome: After completion of course, student will be able to:
• CO1: Understand basics of Soft Computing including Artificial Neural Networks, Fuzzy Logic
and Genetic Algorithms.
• CO2: Demonstrate the ability to develop some familiarity with current research problems
and research methods in Soft Computing by working on a research or design project.
• CO3: Understand the fundamental theory and concepts of neural networks, neuro modeling, several neural network paradigms, and their applications.
• CO4: Design and implement the concepts of knowledge using fuzzy inference systems and
other machine intelligence applications.
• CO5: Identify an evolutionary computing paradigm known as genetic algorithms and its
applications to engineering optimization problems.
<BCSE0103> <Soft Computing> 4
Agenda

• Introduction to Soft Computing


• Concept of computing
• Important characteristics of "computing"
• Soft computing vs. "hard" computing
• A few examples of soft computing applications
• Characteristics of soft computing
• Hybrid computing

<BCSE0103> <Soft Computing> 5


Concept of Computing

<BCSE0103> <Soft Computing> 6


Important Characteristics of
Computing

1. Should provide a precise solution.

2. Control action should be unambiguous and accurate.
3. Suitable for problems that are easy to model mathematically.

<BCSE0103> <Soft Computing> 7


Hard Computing

In 1996, Lotfi A. Zadeh (LAZ) introduced the term hard computing.

According to LAZ, we term a computing approach "hard" computing if:
1. A precise result is guaranteed
2. The control action is unambiguous
3. The control action is formally defined (i.e. with a mathematical model or algorithm)
Examples of Hard Computing:
Solving numerical problems (e.g. roots of polynomials, integration, etc.)
Searching and sorting techniques
Solving "computational geometry" problems (e.g. shortest tour in graph theory, finding the closest pair of points, etc.)
<BCSE0103> <Soft Computing> 8
Problem Areas of Hard Computing

Problems in some other areas of applications


• Medical diagnosis
• Person identification
• Computer vision
• Handwritten character recognition
• Pattern recognition and Machine Intelligence (MI)
• Weather forecasting
<BCSE0103> <Soft Computing> 9
A classic example of a task that requires machine learning: it is very hard to say what makes a "2"

<BCSE0103> <Soft Computing> 10


Some more examples of tasks
that are best solved by using a
learning algorithm
1. Recognizing patterns:
• Facial identities or facial expressions
• Handwritten or spoken words
• Medical images
2. Recognizing anomalies:
• Unusual sequences of credit card transactions
• Unusual patterns of sensor readings in a nuclear power
plant or unusual sound in your car engine.
3. Prediction:
• Future stock prices or currency exchange rates

<BCSE0103> <Soft Computing> 11


Some more examples….

Some web-based examples of machine learning


are….
1. The web contains a lot of data. Tasks with very big datasets often
use machine learning
• especially if the data is noisy or non-stationary.
2. Spam filtering, fraud detection:
• The enemy adapts so we must adapt too.
3. Recommendation systems:
• Lots of noisy data. Million dollar prize!
4. Information retrieval:
• Find documents or images with similar content.
<BCSE0103> <Soft Computing> 12
Soft Computing

 The term soft computing was proposed by the inventor of fuzzy logic, Lotfi A. Zadeh. He describes it as follows.

 Definition: Soft computing is a collection of methodologies that aim to exploit the tolerance for imprecision and uncertainty to achieve tractability, robustness, and low solution cost. Its principal constituents are fuzzy logic, neuro-computing, and probabilistic reasoning.

 The role model for soft computing is the human mind.

<BCSE0103> <Soft Computing> 13


Characteristics of Soft
Computing

1. It does not require any mathematical modeling of the problem to be solved.
2. It may not yield a precise solution.
3. The algorithms are adaptive (i.e. they can adjust to changes in a dynamic environment).
4. It uses biologically inspired methodologies such as genetics, evolution, ant behavior, particle swarming, the human nervous system, etc.

<BCSE0103> <Soft Computing> 14


<BCSE0103> <Soft Computing> 15
<BCSE0103> <Soft Computing> 16
<BCSE0103> <Soft Computing> 17
How Soft Computing?

How does a student learn from his teacher?

 The teacher asks questions and then tells the answers.
• The teacher poses questions, hints at the answers, and asks whether the answers are correct or not.
• The student thus learns a topic and stores it in his memory.
• Based on this knowledge, he solves new problems.
 This is the way the human brain works.
 Based on this concept, Artificial Neural Networks are used to solve problems.

<BCSE0103> <Soft Computing> 18


How Soft Computing?

 How does the world select the best?

• It starts with a (random) population.
• It reproduces another population (the next generation).
• It ranks the population and selects the superior individuals.
 The Genetic Algorithm is based on this natural phenomenon.
• The population is synonymous with solutions.
• Selection of the superior is synonymous with exploring the optimal solution.
<BCSE0103> <Soft Computing> 19
How Soft Computing?

 How does a doctor treat his patient?

• The doctor asks the patient about his suffering.
• The doctor finds the symptoms of the disease.
• The doctor prescribes tests and medicines.
 This is exactly the way fuzzy logic works.
• Symptoms are correlated with diseases with uncertainty.
• The doctor prescribes tests/medicines fuzzily.

<BCSE0103> <Soft Computing> 20


Hard Computing vs Soft
Computing

Hard Computing | Soft Computing
It requires a precisely stated analytical model and often a lot of computation time. | It is tolerant of imprecision, uncertainty, partial truth, and approximation.
It is based on binary logic, crisp systems, numerical analysis, and crisp software. | It is based on fuzzy logic, neural nets, and probabilistic reasoning.
It has the characteristics of precision and categoricity. | It has the characteristics of approximation and dispositionality.

<BCSE0103> <Soft Computing> 21


Hard Computing vs Soft
Computing

Hard Computing | Soft Computing
It is deterministic. | It incorporates stochasticity.
It requires exact input data. | It can deal with ambiguous and noisy data.
It is strictly sequential. | It allows parallel computations.
It produces precise answers. | It can yield approximate answers.

<BCSE0103> <Soft Computing> 22


Hybrid Computing

<BCSE0103> <Soft Computing> 23


 In this course….
How to build artificial neural networks and train them with input data to solve a number of problems that are not possible to solve with hard computing.
Basic concepts of fuzzy algebra, and then how to solve problems using fuzzy logic.
The framework of genetic algorithms and solving varieties of optimization problems.

<BCSE0103> <Soft Computing> 24


MCQ

Q.1. Which of these has been associated with fuzzy logic?
a. Many-valued logic b. Crisp set logic
c. Binary set logic d. Two-valued logic

Answer: (a) Many-valued logic

<BCSE0103> <Soft Computing> 25


MCQ

Q.2. Hard computing is…
(a) Deterministic (b) Incorporates stochasticity
(c) Both a & b (d) None of these

Ans: (a) Deterministic

<BCSE0103> <Soft Computing> 26


MCQ

Q.3. Soft computing is…
(a) Incorporates stochasticity
(b) Deterministic (c) Both a & b
(d) None of these

Ans: (a) Incorporates stochasticity, i.e. it has the property of being well described by a random probability distribution.
<BCSE0103> <Soft Computing> 27
MCQ

Q.4. The term soft computing was proposed by the inventor of fuzzy logic:
(a) Charles Babbage
(b) Lotfi A. Zadeh (c) Alan Turing
(d) None of these

Ans: (b) Lotfi A. Zadeh


<BCSE0103> <Soft Computing> 28
Questions

Q1. Differentiate between Hard Computing and Soft Computing.

Q2. What are the problems that can be solved by soft computing?

<BCSE0103> <Soft Computing> 29


Biological nervous system

<BCSE0103> <Soft Computing> 30


Neuron and its working

• There is a chemical in each neuron called a neurotransmitter.
• A signal (also called a sense) is transmitted across neurons by this chemical.
• That is, all inputs from other neurons arrive at a neuron through its dendrites.
• These signals are accumulated at the synapses of the neuron and then serve as the output to be transmitted through the neuron. An action may produce an electrical impulse, which usually lasts for about a millisecond.
• Note that this pulse is generated by an incoming signal, and a signal may not produce a pulse in the axon unless it crosses a threshold value.
• Also, note that the action signal in the axon of a neuron is cumulative: signals arriving at the dendrites are summed up at the soma.

<BCSE0103> <Soft Computing> 31


Brain: Center of the nervous
system

Brain | Computer
1. Analogue | 1. Digital
2. Uses content-addressable memory | 2. Uses byte-addressable memory
3. Massively parallel mechanism | 3. Serial and modular approach
4. Processing speed is not fixed; no system clock | 4. Processing speed is fixed
5. Synapses are far more complex than logic gates | 5. Architecture is simpler than that of the brain
6. Much bigger than any computer | 6. Not so big

<BCSE0103> <Soft Computing> 32


Artificial Neural Networks (ANNs)

• Neural networks are also known as artificial neural networks (ANNs) or simulated neural networks (SNNs).
• They are a subset of machine learning and are at the heart of deep learning algorithms.
• Their name and structure are inspired by the human brain, mimicking the way that biological neurons signal to one another.

<BCSE0103> <Soft Computing> 33


Artificial Neural Networks (ANNs)

• Artificial neural networks (ANNs) are composed of node layers, containing an input layer, one or more hidden layers, and an output layer.
• Each node, or artificial neuron, connects to others and has an associated weight and threshold.
• If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network.
• Otherwise, no data is passed along to the next layer of the network.
<BCSE0103> <Soft Computing> 34
Artificial Neural Networks (ANNs)

• Neural networks rely on training data to learn and improve their accuracy over time.
• However, once these learning algorithms are fine-tuned for accuracy, they are powerful tools in computer science and artificial intelligence, allowing us to classify and cluster data at high velocity. Tasks in speech recognition or image recognition can take less than a minute, versus hours of manual identification by human experts.
• One of the most well-known neural networks is Google's search algorithm.
**Fine-tuned: adjusted in minor ways in order to achieve the best or desired performance.
<BCSE0103> <Soft Computing> 35
Elements of a Neural Network

• Input Layer: This layer accepts input features. It provides


information from the outside world to the network, no computation
is performed at this layer, nodes here just pass on the
information(features) to the hidden layer.
• Hidden Layer: Nodes of this layer are not exposed to the outer
world, they are part of the abstraction provided by any neural
network. The hidden layer performs all sorts of computation on the
features entered through the input layer and transfers the result to
the output layer.
• Output Layer: This layer brings the information learned by the network to the outer world.

<BCSE0103> <Soft Computing> 36


Elements Artificial Neural Networks

<BCSE0103> <Soft Computing> 37


Types of neural networks
(i) The perceptron is the oldest neural network, created by Frank Rosenblatt in 1958. The Rosenblatt perceptron is a binary single-neuron model.
1. Input integration is implemented through the addition of the weighted inputs, whose fixed weights are obtained during the training stage.
2. If the result of this addition is larger than a given threshold θ, the neuron fires. When the neuron fires, its output is set to 1; otherwise, it is set to 0.

<BCSE0103> <Soft Computing> 38
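A minimal sketch of this firing rule in Python; the weights and threshold below are illustrative stand-ins for values that training would fix, not values from the slides:

```python
# Rosenblatt-style single neuron: it "fires" (outputs 1) when the
# weighted sum of its inputs exceeds the threshold theta.
def perceptron_output(inputs, weights, theta):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > theta else 0

# Illustrative fixed weights, as if obtained from a training stage
weights = [0.6, 0.4]
theta = 0.5
print(perceptron_output([1, 1], weights, theta))  # 1 -> neuron fires
print(perceptron_output([1, 0], weights, theta))  # 1 (0.6 > 0.5)
print(perceptron_output([0, 0], weights, theta))  # 0 -> no firing
```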


Types of neural networks

(ii) Feedforward neural networks, or multi-layer perceptron (MLPs):-


They comprise an input layer, a hidden layer or layers, and an output layer.
While these neural networks are also commonly referred to as MLPs, it’s
important to note that they are actually comprised of sigmoid neurons. Data
usually is fed into these models to train them, and they are the foundation for
computer vision, natural language processing, and other neural networks.

<BCSE0103> <Soft Computing> 39


Types of neural networks
(iii)Convolutional neural networks (CNNs) are similar to feedforward
networks, but they’re usually utilized for image recognition, pattern
recognition, and/or computer vision. These networks harness principles from
linear algebra, particularly matrix multiplication, to identify patterns within an
image.

(iv)Recurrent neural networks (RNNs) are identified by their feedback


loops. These learning algorithms are primarily leveraged when using time-
series data to predict future outcomes, such as stock market predictions or
sales forecasting.

<BCSE0103> <Soft Computing> 40


History of neural networks
1943: Warren S. McCulloch and Walter Pitts published "A Logical Calculus of the Ideas Immanent in Nervous Activity." This research sought to understand how the human brain could produce complex patterns through connected brain cells, or neurons. One of the main ideas that came out of this work was the comparison of neurons with a binary threshold to Boolean logic (i.e., 0/1 or true/false statements).

1958: Frank Rosenblatt is credited with the development of the perceptron, documented in his research, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain." He took McCulloch and Pitts' work a step further by introducing weights to the equation. Leveraging an IBM 704, Rosenblatt was able to get a computer to learn how to distinguish cards marked on the left vs. cards marked on the right.

<BCSE0103> <Soft Computing> 41


History of neural networks
1974: While numerous researchers contributed to the idea of backpropagation,
Paul Werbos was the first person in the US to note its application within neural
networks within his PhD thesis.

1989: Yann LeCun published a paper illustrating how the use of constraints in
backpropagation and its integration into the neural network architecture can be
used to train algorithms. This research successfully leveraged a neural network to
recognize hand-written zip code digits provided by the U.S. Postal Service.

<BCSE0103> <Soft Computing> 42


Neuron Components
(i) A set of n synapses having weights wi. A signal xi forms the input to the i-th synapse having weight wi. The value of any weight may be positive or negative. A positive weight has an excitatory effect, while a negative weight has an inhibitory effect on the output of the summation junction.

(ii) A summation junction, where each input signal is weighted by the respective synaptic weight. Because it is a linear combiner or adder of the weighted input signals, the output of the summation junction can be expressed as ysum = ∑ni=1 wixi.

<BCSE0103> <Soft Computing> 43


Neuron Components
(iii) A threshold activation function (or simply the activation function, also
known as a squashing function) results in an output signal only when an input
signal exceeding a specific threshold value comes as an input. It is similar in
behaviour to the biological neuron, which transmits the signal only when the total
input signal meets the firing threshold.

** Threshold Value:- Reflects the minimum performance required to achieve the


required operational effect while being achievable through
the current state of technology at an affordable life-cycle cost.
<BCSE0103> <Soft Computing> 44
Types of Activation Function
The most commonly used activation function are listed below:
1. Identity Function
2. Threshold/step Function
3. ReLU (Rectified Linear Unit) Function
4. Sigmoid Function
5. Hyperbolic Tangent Function

<BCSE0103> <Soft Computing> 45


Types of Activation Function
An activation function is simply the function used to get the output of a node. It is also known as a Transfer Function.

(i) Identity Function: The identity function is used as an activation function for the input layer. It is a special case of an activation function where the output signal is equal to the input signal. In other words, the identity function simply passes the input signal through unchanged. It is a linear function having the form:
f(x) = x for all x

(ii) Threshold/Step Function: It is a commonly used activation function. It gives 1 as output if the input is 0 or positive. If the input is negative, it gives 0 as output. Expressing it mathematically:
f(x) = 1 if x ≥ 0, and f(x) = 0 if x < 0
Types of Activation Function

(iii) ReLU (Rectified Linear Unit) Function: It is the most popularly used activation function in the areas of convolutional neural networks and deep learning. It is of the form:
f(x) = max(0, x)

This means that f(x) is zero when x is less than zero, and f(x) is equal to x when x is above or equal to zero. This function is differentiable except at a single point, x = 0. In that sense, the derivative of ReLU is actually a sub-derivative.
(iv) Sigmoid Function: It is by far the most commonly used activation function in neural networks. The need for the sigmoid function stems from the fact that many learning algorithms require the activation function to be differentiable and hence continuous. There are two types of sigmoid function:

(a) Binary Sigmoid Function: A binary sigmoid function is of the form
f(x) = 1 / (1 + e^(−kx))
where k is the steepness or slope parameter. By varying the value of k, sigmoid functions with different slopes can be obtained. It has a range of (0, 1). The slope at the origin is k/4. As the value of k becomes very large, the sigmoid function becomes a threshold function.

(b) Bipolar Sigmoid Function: A bipolar sigmoid function is of the form
f(x) = (1 − e^(−kx)) / (1 + e^(−kx))
The range of values of sigmoid functions can be varied depending on the application. However, the range of (−1, +1) is most commonly adopted.
(v) Hyperbolic Tangent Function: It is bipolar in nature. It is a widely adopted activation function for a special type of neural network known as the Backpropagation Network. The hyperbolic tangent function is of the form
f(x) = (e^x − e^(−x)) / (e^x + e^(−x))
This function is similar to the bipolar sigmoid function.
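The five activation functions above can be sketched in a few lines of Python; the forms follow the definitions on the preceding slides, with k the steepness parameter of the sigmoids:

```python
import math

def identity(x):
    return x                                 # f(x) = x

def step(x):
    return 1 if x >= 0 else 0                # 1 for x >= 0, else 0

def relu(x):
    return max(0.0, x)                       # f(x) = max(0, x)

def binary_sigmoid(x, k=1.0):
    return 1.0 / (1.0 + math.exp(-k * x))    # range (0, 1), slope k/4 at origin

def bipolar_sigmoid(x, k=1.0):
    return (1.0 - math.exp(-k * x)) / (1.0 + math.exp(-k * x))  # range (-1, 1)

def tanh_fn(x):
    return math.tanh(x)                      # (e^x - e^-x) / (e^x + e^-x)

for f in (identity, step, relu, binary_sigmoid, bipolar_sigmoid, tanh_fn):
    print(f.__name__, [round(f(x), 3) for x in (-2.0, 0.0, 2.0)])
```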


Artificial neural network

• In fact, the human brain is a highly complex structure, viewed as a massive, highly interconnected network of simple processing elements called neurons.

• Artificial neural networks (ANNs), or simply neural networks (NNs), are simplified models (i.e. imitations) of the biological nervous system and have, therefore, been motivated by the kind of computing performed by the human brain.

• The behavior of a biological neural network can be captured by a simple model called the artificial neural network.

<BCSE0103> <Soft Computing> 51


Task of a Neuron

We may note that a neuron is a part of an interconnected network of the nervous system and serves the following purposes:
• Computation of input signals
• Transportation of signals (at a very high speed)
• Storage of information
• Perception, automatic training and learning
We can also see the analogy between the biological neuron and the artificial neuron. Truly, every component of the model (i.e. the artificial neuron) bears a direct analogy to that of a biological neuron. It is this model which forms the basis of the neural network (i.e. artificial neural network).

<BCSE0103> <Soft Computing> 52


Artificial neural network

<BCSE0103> <Soft Computing> 53


Artificial neural network

• Here, x1, x2, · · · , xn are the n inputs to the artificial neuron, and w1, w2, · · · , wn are the weights attached to the input links.
• Note that a biological neuron receives all inputs through the dendrites, sums them, and produces an output if the sum is greater than a threshold value.
• The input signals are passed on to the cell body through the synapses, which may accelerate or retard an arriving signal.
• It is this acceleration or retardation of the input signals that is modeled by the weights.
• An effective synapse, which transmits a stronger signal, will have a correspondingly larger weight, while a weak synapse will have a smaller weight.
• Thus, weights here are multiplicative factors of the inputs that account for the strength of the synapse.

<BCSE0103> <Soft Computing> 54


Artificial Neural Network

Hence, the total input, say I, received by the soma of the artificial neuron is
I = w1x1 + w2x2 + · · · + wnxn = ∑ni=1 wixi
To generate the final output y, the sum is passed to a filter φ called the transfer function, which releases the output. That is, y = φ(I).

<BCSE0103> <Soft Computing> 55


Mathematical
Understanding
• A very commonly known transfer function is the thresholding function.
• In this thresholding function, the sum (i.e. I) is compared with a threshold value θ.
• If the value of I is greater than θ, then the output is 1, else it is 0 (this is just like a simple linear filter).
• In other words, y = φ( ∑ni=1 wixi ), where
φ(I) = 1, if I > θ
φ(I) = 0, if I ≤ θ
Such a φ is called a step function (also known as the Heaviside function).

<BCSE0103> <Soft Computing> 56
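As a small numerical sketch of this model (the inputs, weights, and threshold below are illustrative values):

```python
# y = phi(sum_i(w_i * x_i)), with phi the step (Heaviside) transfer function
def neuron(x, w, theta):
    I = sum(wi * xi for wi, xi in zip(w, x))   # total input received by the soma
    return 1 if I > theta else 0               # step / Heaviside transfer function

x = [0.5, 1.0, 0.2]              # input signals
w = [0.4, 0.3, 0.9]              # synaptic weights
print(neuron(x, w, theta=0.6))   # I = 0.2 + 0.3 + 0.18 = 0.68 > 0.6 -> y = 1
```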


MCQ
1. Why do we need biological neural networks?
a) to solve tasks like machine vision & natural language processing
b) to apply heuristic search methods to find solutions of problem
c) to make smart human interactive & user friendly system
d) all of the mentioned
Answer: d - Explanation: These are the basic aims that a neural network achieves.
2. What is the trend in software nowadays?
a) to bring computer more & more closer to user
b) to solve complex problems
c) to be task specific
d) to be versatile
Answer: a- Explanation: Software should be more interactive to the user, so that
it can understand its problem in a better fashion.
<BCSE0103> <Soft Computing> 57
MCQ

3. What’s the main point of difference between human & machine


intelligence?
a) human perceive everything as a pattern while machine perceive it merely
as data
b) human have emotions
c) human have more IQ & intellect
d) human have sense organs
Answer: a - Explanation: Humans have emotions & thus form different patterns on that basis, while a machine (say, a computer) is dumb & everything is just data for it.

<BCSE0103> <Soft Computing> 58


MCQ
4. What is auto-association task in neural networks?
a) find relation between 2 consecutive inputs
b) related to storage & recall task
c) predicting the future inputs
d) none of the mentioned
Answer: b - Explanation: This is the basic definition of auto-association in neural
networks.
5. Does pattern classification belongs to category of non-supervised learning?
a) yes
b) no
Answer: b - Explanation: Pattern classification belongs to category of supervised
learning.

<BCSE0103> <Soft Computing> 59


MCQ
6. In pattern mapping problem in neural nets, is there any kind of generalization
involved between input & output?
a) yes
b) no
Answer: a - Explanation: The desired output is mapped closest to the ideal
output & hence there is generalisation involved.
7. What is unsupervised learning?
a) features of group explicitly stated
b) number of groups may be known
c) neither feature & nor number of groups is known
d) none of the mentioned
Answer: c - Explanation: Basic definition of unsupervised learning.

<BCSE0103> <Soft Computing> 60


MCQ
8. Does pattern classification & grouping involve same kind of learning?
a) yes
b) no
Answer: b - Explanation: Pattern classification involves supervised learning while
grouping is an unsupervised one.
9. Does for feature mapping there’s need of supervised learning?
a) yes
b) no
Answer: b - Explanation: Feature mapping can be unsupervised, so it’s not a
sufficient condition.

<BCSE0103> <Soft Computing> 61


MCQ
10. The fundamental unit of network is
a) brain
b) nucleus
c) neuron
d) axon
Answer: c - Explanation: The neuron is the most basic & fundamental unit of a network.
11. What are dendrites?
a) fibers of nerves
b) nuclear projections
c) other name for nucleus
d) none of the mentioned
Answer: a - Explanation: Dendrites are tree-shaped fibers of nerves.
<BCSE0103> <Soft Computing> 62
Advantages of ANN

<BCSE0103> <Soft Computing> 63


Advantages of ANN

<BCSE0103> <Soft Computing> 64


McCulloch & Pitts Model

• The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943.
• It may be divided into two parts. The first part, g, takes an input and performs an aggregation; based on the aggregated value, the second part, f, makes a decision.
• The McCulloch-Pitts neural model has only two types of inputs: excitatory and inhibitory.

<BCSE0103> <Soft Computing> 65


McCulloch & Pitts Model
• The excitatory inputs have weights of positive magnitude and the inhibitory inputs have weights of negative magnitude.
• The inputs of the McCulloch-Pitts neuron can be either 0 or 1.
• It has a threshold function as its activation function. So, the output signal yout is 1 if the input ysum is greater than or equal to a given threshold value, else 0.
• The diagrammatic representation of the model is as follows:

<BCSE0103> <Soft Computing> 66


McCulloch & Pitts Model
McCulloch-Pitts neurons can be used to design logical operations. For that purpose, the connection weights need to be correctly decided along with the threshold value.
Example:
• John carries an umbrella if it is sunny or if it is raining. There are four given situations. I
need to decide when John will carry the umbrella. The situations are as follows:
• First scenario: It is not raining, nor it is sunny
• Second scenario: It is not raining, but it is sunny
• Third scenario: It is raining, and it is not sunny
• Fourth scenario: It is raining as well as it is sunny

To analyse the situations using the McCulloch-Pitts neural


model, I can consider the input signals as follows:
• X1: Is it raining?
• X2 : Is it sunny? <BCSE0103> <Soft Computing> 67
McCulloch & Pitts Model

• So, the value of each input can be either 0 or 1. We can set the weights of both X1 and X2 to 1 and the threshold to 1. So, the neural network model will look like:

<BCSE0103> <Soft Computing> 68


McCulloch & Pitts Model

• Truth Table for this case will be:

Situation x1 x2 ysum yout

1 0 0 0 0

2 0 1 1 1

3 1 0 1 1

4 1 1 2 1

<BCSE0103> <Soft Computing> 69


Example 2
A bank wants to decide if it can sanction a loan or not. There are 2 parameters to decide- Salary and
Credit Score. So there can be 4 scenarios to assess-
(i) High Salary and Good Credit Score (ii) High Salary and Bad Credit Score
(iii) Low Salary and Good Credit Score (iv) Low Salary and Bad Credit Score
Let X1 = 1 denote high salary and X1 = 0 denote Low salary and X2 = 1 denote good credit score and
X2 = 0 denote bad credit score.
Let the threshold value be 2.

<BCSE0103> <Soft Computing> 70


Example 2
A bank wants to decide if it can sanction a loan or not. There are 2 parameters to decide- Salary and
Credit Score. So there can be 4 scenarios to assess-
(i) High Salary and Good Credit Score (ii) High Salary and Bad Credit Score
(iii) Low Salary and Good Credit Score (iv) Low Salary and Bad Credit Score
Let X1 = 1 denote high salary and X1 = 0 denote Low salary and X2 = 1 denote good credit score and
X2 = 0 denote bad credit score.Let the threshold value be 2. The truth table is as follows

The truth table shows when the loan should be approved, considering all the varying scenarios. In this case, the loan is approved only if the salary is high and the credit score is good.

X1 X2 X1+X2 Loan approved
1  1  2     1
1  0  1     0
0  1  1     0
0  0  0     0

The McCulloch-Pitts model of the neuron was mankind's first attempt at mimicking the human brain, and it was a fairly simple one too. It's no surprise it had many limitations:
• The model failed to capture and compute cases of non-binary inputs. It was limited to computing every case with 0 and 1 only.
• The threshold had to be decided beforehand and needed manual computation, instead of the model deciding it itself.

<BCSE0103> <Soft Computing> 71


McCulloch & Pitts Model

So, I can say that,

From the truth table, I can conclude that in the situations where the value of yout is 1, John needs to carry an umbrella. Hence, he will need to carry an umbrella in scenarios 2, 3 and 4.

<BCSE0103> <Soft Computing> 72
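The umbrella decision above can be checked with a few lines of Python; the weights of 1 and the threshold of 1 are the values chosen on the preceding slides:

```python
def mcp_neuron(x1, x2, w1=1, w2=1, theta=1):
    ysum = w1 * x1 + w2 * x2
    return 1 if ysum >= theta else 0   # fires when ysum >= threshold

# X1: is it raining?  X2: is it sunny?
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcp_neuron(x1, x2))
# Output is 1 (carry the umbrella) in every case except x1 = x2 = 0,
# reproducing the truth table above.
```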


Learning methods in
PERCEPTRON

Supervised Learning: The perceptron is a machine learning algorithm for supervised learning of binary classifiers. In the perceptron, the weight coefficients are learned automatically. Initially, weights are multiplied with input features, and a decision is made as to whether the neuron fires or not.
Every input pattern that is used to train the network is associated with an output pattern, which is the target pattern.
Supervised learning is a process of providing input data as well as correct output data to the machine learning model.

The aim of a supervised learning algorithm is to find a mapping function to map the input variable (x) to the output variable (y).

<BCSE0103> <Soft Computing> 73
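A minimal sketch of this supervised learning loop in Python, using the classical perceptron update w ← w + η(t − y)x and trained here on the AND function; the learning rate, initial weights, and epoch count are illustrative choices, not values from the slides:

```python
def predict(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Training data: the AND function (input pattern -> target pattern)
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b, eta = [0.0, 0.0], 0.0, 0.1

for epoch in range(20):                      # AND is linearly separable, so
    for x, t in data:                        # the perceptron rule converges
        y = predict(x, w, b)
        err = t - y                          # error between target and output
        w = [wi + eta * err * xi for xi, wi in zip(x, w)]
        b += eta * err

print(w, b)                                  # learned weights and bias
print([predict(x, w, b) for x, _ in data])   # [0, 0, 0, 1]
```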


Types of ANN Architecture

<BCSE0103> <Soft Computing> 74


AND Problem and its Neural
Network

<BCSE0103> <Soft Computing> 75


AND Problem and its Neural
Network

<BCSE0103> <Soft Computing> 76


AND Problem and its Neural
Network

<BCSE0103> <Soft Computing> 77


Single layer feed forward
Neural Network

<BCSE0103> <Soft Computing> 78


Single layer feed forward Neural
Network

• We see that a layer of n neurons constitutes a single layer feed forward neural network.
• It is so called because it contains a single layer of artificial neurons.
• Note that the input layer and output layer, which receive input signals and transmit output signals, although called layers, are actually boundaries of the architecture and hence truly not layers.
• The only layer in the architecture is the set of synaptic links carrying the weights, which connect every input to the output neurons.
<BCSE0103> <Soft Computing> 79
Modelling SLFFNN

In a single layer neural network, the inputs x1, x2, · · · , xm are connected to the layer of neurons through the weight matrix W. The weight matrix Wm×n can be represented as follows.

W = [ w11 w12 · · · w1n
      w21 w22 · · · w2n
      · · ·
      wm1 wm2 · · · wmn ]        (1)

The output of any k-th neuron can be determined as follows.

yk = φ ( ∑mi=1 wik xi − θk )

where k = 1, 2, 3, · · · , n and θk denotes the threshold value of the k-th neuron. Such a network is feed forward in type, or acyclic in nature, and hence the name.
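A sketch of this matrix form with NumPy, for m = 3 inputs and n = 2 neurons; the weights and thresholds below are illustrative values, not from the slides:

```python
import numpy as np

phi = lambda v: (v > 0).astype(int)      # step transfer function

x = np.array([0.5, 1.0, 0.2])            # m = 3 inputs
W = np.array([[0.4, -0.3],               # W is m x n: column k feeds neuron k
              [0.3,  0.8],
              [0.9,  0.1]])
theta = np.array([0.6, 0.7])             # threshold of each of the n = 2 neurons

y = phi(x @ W - theta)                   # y_k = phi(sum_i w_ik x_i - theta_k)
print(y)                                 # [1 0]: net inputs are 0.68 and 0.67
```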
Multi layer feed forward Neural
Network

• This network, as its name indicates, is made up of multiple layers.
• Thus, architectures of this class, besides possessing an input and an output layer, also have one or more intermediary layers called hidden layers.
• The hidden layer(s) aid in performing useful intermediary computation before directing the input to the output layer.
• A multilayer feed forward network with l input neurons (the number of neurons at the first layer), m1, m2, · · · , mp neurons at the i-th hidden layer (i = 1, 2, · · · , p), and n neurons at the last layer (the output neurons) is written as an l − m1 − m2 − · · · − mp − n MLFFNN.

<BCSE0103> <Soft Computing> 81


Multi layer feed forward Neural
Network

<BCSE0103> <Soft Computing> 82


Multi layer feed forward Neural
Network

<BCSE0103> <Soft Computing> 83


Recurrent Neural network
Architecture

• These networks differ from feed-forward network architectures in the sense that there is at least one "feedback loop".
• Thus, in these networks, there can exist one layer with feedback connections.
• There can also be neurons with self-feedback links, that is, the output of a neuron is fed back into itself as input.

<BCSE0103> <Soft Computing> 84


Recurrent neural Network

<BCSE0103> <Soft Computing> 85


Why Different types of neural
network

<BCSE0103> <Soft Computing> 86


Revisit of a single neural network

<BCSE0103> <Soft Computing> 87


AND and XOR problems

<BCSE0103> <Soft Computing> 88


AND problem is linearly separable

<BCSE0103> <Soft Computing> 89


AND problem is linearly separable

<BCSE0103> <Soft Computing> 90


MCQ

For problems with error calculation, we solve using
(a) Recurrent neural networks
(b) Single layer feed forward neural network
(c) Multilayer feed forward neural network
(d) All of the above
Answer: (a) Recurrent neural networks

Application of Neural Network includes


(a) Pattern Recognition
(b) Classification
(c) Clustering
(d) All of the above
Answer: (d) All of the above
<BCSE0103> <Soft Computing> 91
MCQ

Introduction to ANN
1) An ANN learns quickly if ŋ, the learning rate, assumes the following value(s).
(a) ŋ = 1
(b) ŋ < 1
(c) ŋ > 1
(d) ŋ = 0
Answer: (c) ŋ > 1
2) Which of the following is true for neural networks?
i. The training time depends on the size of the network.
ii. Neural networks can be simulated on a conventional computer.
iii. Artificial neurons are identical in operation to biological ones.
(a) i and ii are true
(b) i and iii are true
(c) ii is true.
(d) all of them are true
Answer: (a) i and ii are true (artificial neurons are much simpler than, not identical in operation to, biological ones)
<BCSE0103> <Soft Computing> 92
MCQ

3) Which of the following is true for neural networks?


i. The error calculation which is followed in “Back-propagation algorithm” is the
steepest descent method.
ii. Simulated annealing approach is followed in unsupervised learning.
iii. A problem whose output is linearly separable can also be solved with MLFFNN.
iv. The output of the perceptron with hard limit transfer function is more accurate than
it is defined with any sigmoid transfer function.
(a) i and iii are true
(b) i and ii are true
(c) ii and iv are true
(d) all are true

Answer: (a) i and iii are true

<BCSE0103> <Soft Computing> 93


MCQ

4) A possible neuron specification to solve the AND problem requires a minimum of


(a) Single neuron
(b) Two neurons
(c) Three neurons
(d) Four neurons
Answer: (a) Single neuron

5) What is back propagation?


(a) It is another name given to the curvy function in the perceptron
(b) It is the transmission of error back through the network to adjust the inputs
(c) It is the transmission of error back through the network to allow weights to be
adjusted so that the network can learn
(d) None of the above
Answer: (c) It is the transmission of error back through the network to allow weights to be
adjusted so that the network can learn
<BCSE0103> <Soft Computing> 94
MCQ

What is perceptron in Neural network


(a) It is an auto-associative neural network
(b) It is a double layer auto-associative neural network
(c) It is a single layer feed-forward neural network with pre-processing
(d) It is a neural network that contains feedback

Answer: (c) It is a single layer feed-forward neural network with pre-processing

Neural Networks are complex ______________ with many parameters


(a) Linear Functions
(b) Nonlinear Functions
(c) Discrete Functions
(d) Exponential Functions

Answer: (b) Nonlinear Functions

<BCSE0103> <Soft Computing> 95


XOR problem is linearly non-
separable

<BCSE0103> <Soft Computing> 96


XOR problem is linearly non-
separable

<BCSE0103> <Soft Computing> 97


Our observations

• From the example discussed, we understand that in the AND problem a straight line can separate the two classes, namely output 0 or 1, for any input.
• However, in the case of the XOR problem, such a line is not possible. Note: a horizontal or a vertical line in the case of the XOR problem is not admissible, because in that case it completely ignores one input.

<BCSE0103> <Soft Computing> 98


Our observations

• In fact, any linearly separable problem can be solved with a single layer feed forward neural network. For example, the AND problem.
• On the other hand, if the problem is non-linearly separable, then a single layer neural network cannot solve such a problem. To solve such a problem, a multilayer feed forward neural network is required.

<BCSE0103> <Soft Computing> 99


Solution of XOR Problem

<BCSE0103> <Soft Computing> 100


Dynamic neural Network

<BCSE0103> <Soft Computing> 101


Dynamic neural Network

<BCSE0103> <Soft Computing> 102


Concept of Learning in neural
Network

<BCSE0103> <Soft Computing> 103


Types of Learning

<BCSE0103> <Soft Computing> 104


Supervised Learning

<BCSE0103> <Soft Computing> 105


Unsupervised Learning

<BCSE0103> <Soft Computing> 106


Reinforcement Learning

<BCSE0103> <Soft Computing> 107


Different learning techniques :
Gradient descent learning

<BCSE0103> <Soft Computing> 108


Different learning techniques:
Stochastic learning

**A stochastic policy requires learning a probability distribution over actions, which
can be challenging. On the other hand, a deterministic policy only requires
selecting the best action for each state, which is relatively easy.

<BCSE0103> <Soft Computing> 109


Different learning techniques:
Hebbian learning

<BCSE0103> <Soft Computing> 110


Different learning techniques:
Competitive learning

<BCSE0103> <Soft Computing> 111


Linearly Separable Task & XOR
Problem

Classification algorithms like Logistic Regression and Naive Bayes only work on linearly separable problems.
Def.: Linear separability refers to data points in binary classification problems that can be separated using a linear decision boundary.
If the data points can be separated using a line, linear function, or flat hyperplane, they are considered linearly separable.
Linear separability is an important concept in neural networks.
<BCSE0103> <Soft Computing> 112
Linearly Separable Task & XOR
Problem
• Consider a data set with two attributes x1 and x2 and two classes
0 and 1. Let class 0 = o and class 1 = x.

• A straight line (or plane) can be used to separate the two


classes (i.e. the x’s from the o’s). In other words, a single
decision surface can be used as the boundary between
both classes.
<BCSE0103> <Soft Computing> 113
XOR Problem

The XOR function is not linearly separable, which means we cannot draw a single straight line to separate the inputs that yield different outputs.
The XOR problem can be solved with neural networks by using a Multi-Layer Perceptron, i.e. a neural network architecture with an input layer, a hidden layer, and an output layer. With appropriate weights assigned to the corresponding layers, the XOR logic gets executed during forward propagation through the network.
The neural network architecture to solve the XOR problem will be as shown below:

<BCSE0103> <Soft Computing> 114


XOR Problem
• So, with this overall architecture and certain weight parameters between each layer, the XOR logic output can be yielded through forward propagation.
• The overall neural network architecture uses the ReLU activation function, so that for a positive net input the output at a particular neuron is 1, and for a negative net input the output at that neuron is 0.
• So let us understand the output for the first input state.
<BCSE0103> <Soft Computing> 115
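One possible set of hand-chosen weights realizing XOR with two ReLU hidden units, as a sketch; these particular weights are an illustration, not the only solution:

```python
def relu(v):
    return max(0.0, v)

def xor_net(x1, x2):
    # Hidden layer (ReLU): h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
    h1 = relu(1.0 * x1 + 1.0 * x2)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer (linear): y = h1 - 2*h2
    return h1 - 2.0 * h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, int(xor_net(a, b)))   # prints 0, 1, 1, 0 -> the XOR truth table
```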
Multilayer Perceptron

 Single-layer networks suffer from the disadvantage that they are only able to solve linearly separable classification problems.

 The MLP is a hierarchical structure of several perceptrons, and it overcomes the disadvantages of these single-layer networks.

 However, the computational effort needed for finding the correct combination of weights increases substantially when more parameters and more complicated topologies are considered.

<BCSE0103> <Soft Computing> 116


Architecture of Multilayer
Perceptron
 No connections within a layer
 No direct connections between input and output layers
 Fully connected between layers
 The number of output units need not equal the number of input units
 The number of hidden units per layer can be more or fewer than the input or output units
 Each unit is a perceptron
 The training algorithm for the MLP requires differentiable, continuous nonlinear activation functions.
**Non-linear functions are the most used activation functions. They make it easy for a neural network model to adapt to a variety of data and to differentiate between outcomes. These functions are mainly divided on the basis of their range or curves, such as the sigmoid activation functions:
• Sigmoid takes a real value as input and outputs another value between 0 and 1. The sigmoid activation function translates input in the range (−∞, ∞) to the range (0, 1).
<BCSE0103> <Soft Computing> 117
<BCSE0103> <Soft Computing> 118
<BCSE0103> <Soft Computing> 119
<BCSE0103> <Soft Computing> 120
<BCSE0103> <Soft Computing> 121
<BCSE0103> <Soft Computing> 122
<BCSE0103> <Soft Computing> 123
<BCSE0103> <Soft Computing> 124
<BCSE0103> <Soft Computing> 125
<BCSE0103> <Soft Computing> 126
<BCSE0103> <Soft Computing> 127
<BCSE0103> <Soft Computing> 128
<BCSE0103> <Soft Computing> 129
<BCSE0103> <Soft Computing> 130
<BCSE0103> <Soft Computing> 131
<BCSE0103> <Soft Computing> 132
Back Propagation Algorithm

 The BP algorithm is used to find a local minimum of the error function.

 The network is initialized with randomly chosen weights.

 The gradient of the error function is computed and used to correct the
initial weights.

 The back propagation algorithm includes two passes through the network:
(i) forward pass (ii) backward pass.

<BCSE0103> <Soft Computing> 133


Back Propagation Algorithm

<BCSE0103> <Soft Computing> 134


The BP Algorithm – what it
actually does!

 BP attempts to reduce the errors between the output of the


network and the desired result.
 However, assigning blame for errors to hidden nodes is not so straightforward. The error of the output nodes must be propagated back through the hidden nodes.
 The contribution that a hidden node makes to an output node is
related to the strength of the weight on the link between the two
nodes and the level of activation of the hidden node when the
output node was given the wrong level of activation.
 This can be used to estimate the error value for a hidden node in
the penultimate layer, and that can, in turn, be used in making
error estimates for earlier layers.
<BCSE0103> <Soft Computing> 135
Back Propagation Learning

• Back-propagation is the essence of neural


net training.
• It is the practice of fine-tuning the weights of a
neural net based on the error rate (i.e. loss) obtained
in the previous epoch (i.e. iteration). Proper tuning of
the weights ensures lower error rates, making the
model reliable by increasing its generalization.

<BCSE0103> <Soft Computing> 136


How Back Propagation
Algorithm Works

The backpropagation algorithm works by computing


the gradient of the loss function with respect to each
weight by the chain rule, computing the gradient one
layer at a time, iterating backward from the last layer to
avoid redundant calculations of intermediate terms in
the chain rule.

<BCSE0103> <Soft Computing> 137
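A compact sketch of the two passes on a 2-2-1 sigmoid network, trained here on XOR; the seed, learning rate, and epoch count are illustrative choices (XOR training with only two hidden units can occasionally stall in a local minimum, depending on initialization):

```python
import math, random

random.seed(1)

def sig(v):
    return 1.0 / (1.0 + math.exp(-v))

# 2-2-1 network; each weight list ends with the bias term
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
eta = 0.5  # learning rate

for epoch in range(10000):
    for x, t in data:
        # Forward pass: compute hidden activations and the output
        h = [sig(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_h]
        y = sig(w_o[0]*h[0] + w_o[1]*h[1] + w_o[2])
        # Backward pass: error deltas via the chain rule (sigmoid' = s(1-s))
        d_o = (t - y) * y * (1 - y)
        d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
        # Correct the weights: output layer first, then hidden layer
        for j in range(2):
            w_o[j] += eta * d_o * h[j]
        w_o[2] += eta * d_o
        for j in range(2):
            w_h[j][0] += eta * d_h[j] * x[0]
            w_h[j][1] += eta * d_h[j] * x[1]
            w_h[j][2] += eta * d_h[j]

for x, t in data:
    h = [sig(w[0]*x[0] + w[1]*x[1] + w[2]) for w in w_h]
    y = sig(w_o[0]*h[0] + w_o[1]*h[1] + w_o[2])
    print(x, t, round(y, 2))  # outputs should approach the XOR targets
```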


Adaline network
• A network with a single linear unit is called an Adaline (Adaptive Linear Neuron).
• A unit with a linear activation function is called a linear unit. In Adaline, there is only one output unit, and output values are bipolar (+1, −1).
• The weights between the input units and the output unit are adjustable. It uses the delta rule, i.e. ∆wi = α(t − yin)xi, where wi, yin and t are the weight, the predicted (net) output, and the true value respectively, and α is the learning rate.
• The learning rule minimizes the mean squared error between the activation and the target value.
• Adaline consists of trainable weights; it compares the actual output with the calculated output and, based on the error, the training algorithm is applied.

<BCSE0103> <Soft Computing> 138


Adaline network

<BCSE0103> <Soft Computing> 139


Adaline network

In Adaline, all the input neurons are directly connected to the output neuron through weighted connection paths. A bias b, whose activation is always 1, is also present.

<BCSE0103> <Soft Computing> 140


<BCSE0103> <Soft Computing> 141
Problem: Design an OR gate using the Adaline network.
Solution:
Initially, all weights are assumed to be small random values, say 0.1, and the learning rate is set to 0.1. Also, set the least squared error to 2. The weights will be updated until the total error becomes less than or equal to the least squared error.

x1 x2 t
1 1 1
1 -1 1
-1 1 1
-1 -1 -1

<BCSE0103> <Soft Computing> 142


Problem: Design an OR gate using the Adaline network.
Solution:
Initially, all weights are assumed to be small random values, say 0.1, and the learning rate is set to 0.1. Also, set the least squared error to 2. The weights will be updated until the total error becomes less than or equal to the least squared error.

x1 x2 t
1 1 1
1 -1 1
-1 1 1
-1 -1 -1

<BCSE0103> <Soft Computing> 143


Similarly, repeat the same steps for the other input vectors and you will get (initial values w1 = w2 = b = 0.1):

x1 | x2 | t  | yin    | (t-yin) | ∆w1     | ∆w2    | ∆b      | w1     | w2     | b      | (t-yin)^2
1  | 1  | 1  | 0.3    | 0.7     | 0.07    | 0.07   | 0.07    | 0.17   | 0.17   | 0.17   | 0.49
1  | -1 | 1  | 0.17   | 0.83    | 0.083   | -0.083 | 0.083   | 0.253  | 0.087  | 0.253  | 0.69
-1 | 1  | 1  | 0.087  | 0.913   | -0.0913 | 0.0913 | 0.0913  | 0.1617 | 0.1783 | 0.3443 | 0.83
-1 | -1 | -1 | 0.0043 | -1.0043 | 0.1004  | 0.1004 | -0.1004 | 0.2621 | 0.2787 | 0.2439 | 1.01

This is epoch 1, where the total error is 0.49 + 0.69 + 0.83 + 1.01 = 3.02, so more epochs will run until the total error becomes less than or equal to the least squared error, i.e. 2.

<BCSE0103> <Soft Computing> 144
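A short sketch that reproduces epoch 1 of this table using the delta rule ∆w = α(t − yin)x, with α = 0.1 and initial weights and bias of 0.1, as stated above:

```python
w1 = w2 = b = 0.1          # initial weights and bias
alpha = 0.1                # learning rate
# Bipolar OR training set: (x1, x2, target)
data = [(1, 1, 1), (1, -1, 1), (-1, 1, 1), (-1, -1, -1)]

total_error = 0.0
for x1, x2, t in data:
    yin = w1 * x1 + w2 * x2 + b          # net input of the Adaline unit
    err = t - yin
    w1 += alpha * err * x1               # delta rule updates
    w2 += alpha * err * x2
    b  += alpha * err
    total_error += err ** 2
    print(round(yin, 4), round(w1, 4), round(w2, 4), round(b, 4))

print("epoch-1 total error:", round(total_error, 2))   # 3.02, as in the table
```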


<BCSE0103> <Soft Computing> 145
Madaline (Multiple Adaptive
Linear Neuron) Network
• The Madaline (supervised learning) model consists of many Adalines in parallel with a single output unit.
• The Adaline layer is present between the input layer and the Madaline (output) layer; hence the Adaline layer is a hidden layer.
• The weights between the input layer and the hidden layer are adjusted, while the weights between the hidden layer and the output layer are fixed.
• It may use the majority vote rule; the output would then be either true or false. Adaline and Madaline layer neurons have a bias of '1' connected to them. The use of multiple Adalines helps counter the problem of non-linear separability. <BCSE0103> <Soft Computing> 146
Madaline (Multiple Adaptive
Linear Neuron) Network
• There are three types of layers present in Madaline. The first, the input layer, contains all the input neurons. The second, a hidden layer, consists of the Adaline layer, and the weights between the input and hidden layers are adjustable. The third is the output layer; the weights between the hidden and output layers are fixed and not adjustable. <BCSE0103> <Soft Computing> 147
Madaline (Multiple Adaptive
Linear Neuron) Network

<BCSE0103> <Soft Computing> 148


Madaline (Multiple Adaptive
Linear Neuron) Network

<BCSE0103> <Soft Computing> 149


Problem: Using the Madaline network, implement XOR function
with bipolar inputs and targets. Assume the required parameter
for the training of the network.
• Training pattern for XOR function :

<BCSE0103> <Soft Computing> 150


Problem: Using the Madaline network, implement XOR function
with bipolar inputs and targets. Assume the required parameter
for the training of the network.

<BCSE0103> <Soft Computing> 151


Problem: Using the Madaline network, implement XOR function with
bipolar inputs and targets. Assume the required parameter for the
training of the network.

<BCSE0103> <Soft Computing> 152


<BCSE0103> <Soft Computing> 153
