
ARTIFICIAL NEURON MODEL
SESSION - 5
CONTENT

• Rosenblatt’s Perceptron Model


• Case study
• Minsky and Papert Model
• Summary
MCCULLOCH-PITTS MODEL

• The first computational model of a neuron was proposed by Warren McCulloch and Walter Pitts in 1943.
• The simplest binary classification can be achieved as follows: the neuron fires (y = 1) if the sum of its binary inputs reaches the threshold θ, i.e. y = 1 if Σ xi ≥ θ, else y = 0.
LIMITATIONS

• It cannot process non-Boolean inputs.


• It gives equal weights to each input.
• The threshold θ must be chosen manually.
PERCEPTRON MODEL

• Introduced in 1957 by Frank Rosenblatt.
• It forms the core of many deep learning concepts.
• A perceptron is a single-layer neural network.
• A perceptron can be seen as a set of inputs that are weighted, summed, and passed through an activation function.
• This weighted sum of inputs, after the activation function, produces the output.
• It is typically used for classification problems, but can also be used for regression.
PERCEPTRON MODEL

• We attach a weight (wi) to each input, and we also add an input of constant value 1 with weight −θ; this extra term is called the bias.
• The inputs can be seen as neurons and together form the input layer. These neurons and the activation function together form a perceptron.
• The binary classification function of the perceptron network is represented as
  y = 1 if Σ wi·xi − θ ≥ 0, else y = 0
PERCEPTRON MODEL

• Schematic Representation
PERCEPTRON MODEL

• The perceptron belongs to the class of single-layer feedforward networks.
• It is a supervised learning approach, mostly used for classification tasks but also applicable to regression.
• There are two types: the single-layer perceptron and the multi-layer perceptron.
PERCEPTRON MODEL

• The output of the perceptron network is computed as follows (see the sketch below):
• yin = b + x1w1 + x2w2 + x3w3 + … + xnwn
• yout = f(yin)
• f(yin) = 1 if yin > θ; 0 if −θ ≤ yin ≤ θ; −1 if yin < −θ
• f(yin) is the step activation function.
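As an illustration, a minimal Python sketch of this computation (the threshold value θ = 0.2 and the bipolar step function are assumptions for the example; names are illustrative):

    import numpy as np

    def step(y_in, theta=0.2):
        # Bipolar step activation: +1 above the threshold, -1 below, 0 in between.
        if y_in > theta:
            return 1
        elif y_in < -theta:
            return -1
        return 0

    def perceptron_output(x, w, b, theta=0.2):
        # yin = b + sum_i xi*wi, then pass through the step activation.
        y_in = b + np.dot(x, w)
        return step(y_in, theta)

    # Example: two bipolar inputs with chosen weights and bias.
    print(perceptron_output(np.array([1, -1]), np.array([0.5, 0.5]), b=-0.5))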
PERCEPTRON MODEL

• Compare yout with the target output t.
• Weight update:
• If y ≠ t, then
• w(new) = w(old) + α·t·x
• where α is the learning rate, t the target, and x the input.
• Similarly, the bias update is:
• b(new) = b(old) + α·t
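A direct transcription of this update rule as a small Python function (a sketch, not a library API):

    import numpy as np

    def perceptron_update(w, b, x, t, y, alpha=1.0):
        # Apply the learning rule only when the computed output y differs from the target t.
        if y != t:
            w = w + alpha * t * x    # w(new) = w(old) + alpha * t * x
            b = b + alpha * t        # b(new) = b(old) + alpha * t
        return w, b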
TRAINING ALGORITHM

• Step 0: Initialize the weights and bias; set the learning rate α (here α = 1).
• Step 1: Perform Steps 2 to 5 until the stopping condition is reached.
• Step 2: Perform Steps 3 and 4 for each training pair.
• Step 3: Calculate the output:

  yin = b + Σ xiwi
  y = f(yin) = 1 if yin > θ; 0 if −θ ≤ yin ≤ θ; −1 if yin < −θ
TRAINING ALGORITHM

• Step 4: Adjust the weights and bias.

  If y ≠ t, then
    w(new) = w(old) + α·t·x
    b(new) = b(old) + α·t
  Else:
    w(new) = w(old)
    b(new) = b(old)
• Step 5: Train the network until the stopping condition is reached (no change in weights for any training pair). The sketch below puts these steps together.
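The steps above, written out as a small Python training loop (a sketch assuming a bipolar step activation with threshold θ; the function name train_perceptron is illustrative):

    import numpy as np

    def step(y_in, theta=0.2):
        return 1 if y_in > theta else (-1 if y_in < -theta else 0)

    def train_perceptron(X, T, alpha=1.0, theta=0.2, max_epochs=100):
        w = np.zeros(X.shape[1])              # Step 0: initialize weights and bias
        b = 0.0
        for _ in range(max_epochs):           # Step 1: repeat until the stopping condition
            changed = False
            for x, t in zip(X, T):            # Step 2: for each training pair
                y = step(b + np.dot(x, w), theta)   # Step 3: calculate the output
                if y != t:                    # Step 4: adjust weights and bias on error
                    w = w + alpha * t * x
                    b = b + alpha * t
                    changed = True
            if not changed:                   # Step 5: stop when no weight changed
                break
        return w, b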
CASE STUDY

• Implementation of a two-input AND gate with bipolar inputs using Rosenblatt's perceptron model (a worked sketch follows).
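A self-contained sketch of this case study in Python (bipolar inputs and targets, α = 1; the threshold θ = 0.2 is an assumed value):

    import numpy as np

    def step(y_in, theta=0.2):
        return 1 if y_in > theta else (-1 if y_in < -theta else 0)

    # Bipolar two-input AND gate: output +1 only when both inputs are +1.
    X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])
    T = np.array([1, -1, -1, -1])

    w, b, alpha, theta = np.zeros(2), 0.0, 1.0, 0.2
    for _ in range(10):                       # a few epochs are enough here
        changed = False
        for x, t in zip(X, T):
            y = step(b + np.dot(x, w), theta)
            if y != t:                        # perceptron learning rule
                w, b, changed = w + alpha * t * x, b + alpha * t, True
        if not changed:
            break

    print("weights:", w, "bias:", b)
    print("outputs:", [step(b + np.dot(x, w), theta) for x in X])   # matches T

In this run the rule converges to weights [1, 1] and bias −1, which separates the single positive example from the three negative ones.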
MINSKY AND
PAPERT MODEL
INTRODUCTION

• Marvin Minsky and Seymour Papert are two influential figures in the
field of artificial intelligence and neural networks.
• Their book, "Perceptrons: An Introduction to Computational Geometry,"
published in 1969, is a seminal work that critically analyzed the
capabilities and limitations of the Perceptron model, a simple type of
artificial neural network.
• Their analysis highlighted significant challenges in the field and spurred
the development of more complex neural network architectures.
KEY CONCEPTS

THE PERCEPTRON MODEL

• A single-layer neural network used for binary classification tasks.


• Composed of input units, weights, a bias term, and an activation
function (typically a step function).
• Functions by computing a weighted sum of inputs and passing the
result through the activation function to produce a binary output.
LINEAR SEPARABILITY

• A fundamental concept for understanding the limitations of the Perceptron.
• Refers to the ability of a model to classify data points using a linear decision boundary.
• Problems whose classes can be separated by a straight line (or a hyperplane in higher dimensions) are linearly separable.
MINSKY AND PAPERT'S ANALYSIS

LIMITATIONS OF THE PERCEPTRON

• Inability to Solve Non-Linearly Separable Problems: Minsky and Papert demonstrated that the Perceptron cannot solve problems where data points are not linearly separable. The XOR problem is a classic example.
• XOR Problem: The XOR (exclusive OR) problem involves classifying pairs of binary inputs into two classes. The classes cannot be separated by a single straight line, making it unsolvable by a single-layer Perceptron.
• Limited Computational Power: They argued that the Perceptron's computational power is limited and insufficient for complex tasks.
MATHEMATICAL PROOFS

• Minsky and Papert provided mathematical proofs showing the Perceptron's limitations. They rigorously analyzed the types of problems that could and could not be solved by the Perceptron.
• Their proofs highlighted that any problem requiring a non-linear decision boundary cannot be solved by a single-layer Perceptron (the XOR check below illustrates this).
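To make this concrete, here is a small brute-force check (my own illustration, not from the book): over a grid of candidate weights, no single-layer perceptron with a hard threshold reproduces the XOR targets.

    import itertools
    import numpy as np

    # XOR truth table: output 1 only when the two inputs differ.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    T = np.array([0, 1, 1, 0])

    def classifies_xor(w1, w2, b):
        # A single-layer perceptron with a hard threshold at zero.
        return all(int(w1 * x1 + w2 * x2 + b > 0) == t for (x1, x2), t in zip(X, T))

    grid = np.linspace(-2, 2, 41)
    found = any(classifies_xor(w1, w2, b)
                for w1, w2, b in itertools.product(grid, repeat=3))
    print("linear separation of XOR found on this grid:", found)   # prints False

A finite grid search is of course not a proof, but it reflects the geometric fact that no straight line separates the two XOR classes.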
IMPLICATIONS FOR AI RESEARCH

• Criticism and Impact: Their work initially led to a decline in interest and funding for neural network research during the 1970s, a period often referred to as the "AI Winter."
• Reevaluation and Revival: Despite the initial setback, their criticisms were crucial for the eventual resurgence of neural networks. Researchers recognized the need for multi-layer networks, leading to the development of more advanced models like multi-layer perceptrons (MLPs) and deep neural networks.
ADVANCES FOLLOWING MINSKY AND
PAPERT'S WORK

MULTI-LAYER PERCEPTRONS (MLPS):

• Introduction of additional layers (hidden layers) between the input and output layers.
• Use of non-linear activation functions (e.g., sigmoid, tanh, ReLU) in hidden layers.
• Ability to solve non-linearly separable problems and learn complex patterns (as shown in the sketch below).
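A minimal sketch of such a network: two inputs, one hidden layer of two units with a non-linear (threshold) activation, and one output unit. The weights are hand-picked for illustration (not learned) so that the network computes XOR:

    import numpy as np

    def step(z):
        # Hard-threshold activation, used here for simplicity.
        return (z > 0).astype(int)

    def mlp_xor(x):
        # Hidden layer: one unit computes (x1 OR x2), the other computes NOT(x1 AND x2).
        W1 = np.array([[1.0, -1.0],
                       [1.0, -1.0]])
        b1 = np.array([-0.5, 1.5])
        h = step(x @ W1 + b1)
        # Output layer: AND of the two hidden units gives XOR.
        w2 = np.array([1.0, 1.0])
        return int(w2 @ h - 1.5 > 0)

    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, "->", mlp_xor(np.array(x, dtype=float)))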

BACKPROPAGATION ALGORITHM:

• Developed in the 1980s, backpropagation is a supervised learning algorithm used to train multi-layer neural networks.
• It involves adjusting weights through gradient descent to minimize the error between predicted and actual outputs.
• It enabled the training of deep neural networks and addressed the limitations highlighted by Minsky and Papert.
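A compact sketch of backpropagation for a tiny one-hidden-layer network with sigmoid activations, trained on XOR by gradient descent (an illustrative implementation; the layer size, learning rate, and iteration count are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One hidden layer with 4 units, one output unit.
    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
    lr = 0.5

    for _ in range(10000):
        # Forward pass
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        # Backward pass: cross-entropy loss with a sigmoid output gives dY = Y - T
        dY = Y - T
        dW2, db2 = H.T @ dY, dY.sum(axis=0)
        dH = (dY @ W2.T) * H * (1 - H)      # propagate the error through the hidden layer
        dW1, db1 = X.T @ dH, dH.sum(axis=0)
        # Gradient descent step
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print(np.round(Y, 2))   # should be close to [[0], [1], [1], [0]]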

NEURAL NETWORK ARCHITECTURES:

• Emergence of various neural network architectures, such as Convolutional Neural Networks (CNNs) for image processing and Recurrent Neural Networks (RNNs) for sequential data.
• Significant improvements in computational power and the availability of large datasets facilitated advancements in neural networks.

ACTIVATION FUNCTIONS
Session - 6

ACTIVATION FUNCTIONS

• Activation functions are applied to the weighted sum of inputs and biases in a neural network and are used to decide whether a neuron should be activated.
• Activation functions play an integral role in neural networks by introducing nonlinearity.
• This nonlinearity allows neural networks to develop complex representations and functions of the inputs that would not be possible with a simple linear regression model.

TYPES OF ACTIVATION FUNCTIONS

• Linear Activation Functions


• Sigmoid
• ReLU
• Leaky ReLU
• Tanh
• Step Function

LINEAR ACTIVATION FUNCTIONS

• The linear activation function, also known as "no activation" or the "identity function" (the input is simply multiplied by 1.0), is one where the activation is proportional to the input.
• The function does nothing to the weighted sum of the inputs; it simply passes on the value it was given.
• Mathematically, it can be represented as

  f(x) = x

LIMITATIONS OF LINEAR ACTIVATION
FUNCTION
• Backpropagation cannot be used effectively, because the derivative of the function is a constant and carries no information about the input x.
• All layers of the neural network collapse into one if a linear activation function is used. No matter how many layers the neural network has, the last layer is still a linear function of the first layer; essentially, a linear activation function turns the neural network into a single layer (the sketch below illustrates this).
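A quick numerical illustration of this collapse (my own example): two stacked purely linear layers are exactly equivalent to one linear layer whose weight matrix is the product of the two.

    import numpy as np

    rng = np.random.default_rng(1)
    W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)   # first linear layer
    W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)   # second linear layer
    x = rng.normal(size=3)

    # Two stacked linear layers...
    two_layers = (x @ W1 + b1) @ W2 + b2
    # ...equal a single linear layer with W = W1 @ W2 and b = b1 @ W2 + b2.
    one_layer = x @ (W1 @ W2) + (b1 @ W2 + b2)

    print(np.allclose(two_layers, one_layer))   # True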

BINARY STEP FUNCTION

• The binary step function depends on a threshold value that decides whether a neuron should be activated or not.
• The input fed to the activation function is compared to a certain threshold; if the input is greater than it, the neuron is activated, otherwise it is deactivated, meaning that its output is not passed on to the next hidden layer.
• Mathematically, it can be represented as:

  f(x) = 1 if x ≥ 0, else f(x) = 0
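In Python (a minimal sketch, with the threshold fixed at 0 as is conventional):

    import numpy as np

    def binary_step(x, threshold=0.0):
        # 1 when the input reaches the threshold, otherwise 0.
        return np.where(x >= threshold, 1, 0)

    print(binary_step(np.array([-2.0, -0.1, 0.0, 3.5])))   # [0 0 1 1]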

LIMITATIONS OF BINARY STEP FUNCTION

• It cannot provide multi-valued outputs; for example, it cannot be used for multi-class classification problems.
• The gradient of the step function is zero, which causes a hindrance in the backpropagation process.

SIGMOID / LOGISTIC ACTIVATION FUNCTION

• This function takes any real value as input and outputs values in the range of 0 to 1.
• The larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to 0.0.
• Mathematically, it can be represented as

  f(x) = 1 / (1 + e^(−x))
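In Python (a minimal sketch):

    import numpy as np

    def sigmoid(x):
        # Maps any real input into the open interval (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    print(sigmoid(np.array([-5.0, 0.0, 5.0])))   # roughly [0.007, 0.5, 0.993]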

LIMITATIONS OF SIGMOID ACTIVATION
FUNCTION
• The derivative of the function is f'(x) = sigmoid(x)·(1 − sigmoid(x)).
• The gradient values are only significant in the range −3 to 3; the curve becomes much flatter outside this region.
• This implies that for inputs greater than 3 or less than −3, the function has very small gradients. As the gradient approaches zero, the network effectively stops learning and suffers from the vanishing gradient problem (a numeric check follows).
• The output of the logistic function is not symmetric around zero, so the outputs of all the neurons will be of the same sign. This makes the training of the neural network more difficult and unstable.
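A small numeric check of the vanishing-gradient claim (illustrative): the derivative sigmoid(x)·(1 − sigmoid(x)) peaks at 0.25 at x = 0 and is already tiny beyond |x| = 3.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        s = sigmoid(x)
        return s * (1 - s)           # f'(x) = sigmoid(x) * (1 - sigmoid(x))

    for x in [0.0, 3.0, 6.0, 10.0]:
        print(x, float(sigmoid_grad(x)))
    # 0.0 -> 0.25, 3.0 -> ~0.045, 6.0 -> ~0.0025, 10.0 -> ~0.000045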

TANH FUNCTION (HYPERBOLIC TANGENT)

• The tanh function is very similar to the sigmoid/logistic activation function, and even has the same S-shape, with the difference being an output range of −1 to 1. In tanh, the larger the input (more positive), the closer the output value will be to 1.0, whereas the smaller the input (more negative), the closer the output will be to −1.0.
• Mathematically, it can be represented as

  f(x) = (e^x − e^(−x)) / (e^x + e^(−x))
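In Python (np.tanh computes this directly):

    import numpy as np

    def tanh(x):
        # Equivalent to (e^x - e^-x) / (e^x + e^-x); zero-centered output in (-1, 1).
        return np.tanh(x)

    print(tanh(np.array([-3.0, 0.0, 3.0])))   # roughly [-0.995, 0.0, 0.995]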

ADVANTAGE OF TANH FUNCTION

• The output of the tanh activation function is zero-centered; hence we can easily map the output values as strongly negative, neutral, or strongly positive.
• It is usually used in hidden layers of a neural network, as its values lie between −1 and 1; therefore, the mean of the hidden layer's outputs comes out to be 0 or very close to it. This helps in centering the data and makes learning for the next layer much easier.

LIMITATIONS OF TANH FUNCTION

• Tanh also faces the problem of vanishing gradients, similar to the sigmoid activation function, although the gradient of the tanh function is much steeper than that of the sigmoid.
• Note: Although both sigmoid and tanh face the vanishing gradient issue, tanh is zero-centered and its gradients are not restricted to move in a certain direction. Therefore, in practice, the tanh nonlinearity is always preferred to the sigmoid nonlinearity.

RELU FUNCTION

• ReLU stands for Rectified Linear Unit.
• Although it gives the impression of a linear function, ReLU has a derivative and allows for backpropagation, while simultaneously being computationally efficient.
• The ReLU function is simply max(0, x), which can also be thought of as a piecewise function where all inputs less than 0 map to 0 and all inputs greater than or equal to 0 map back to themselves (i.e., the identity function).
• Mathematically, it can be represented as

  f(x) = max(0, x)
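In Python (a one-line sketch):

    import numpy as np

    def relu(x):
        # max(0, x): negative inputs map to 0, non-negative inputs pass through unchanged.
        return np.maximum(0, x)

    print(relu(np.array([-2.0, 0.0, 3.5])))   # [0.  0.  3.5]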

ADVANTAGES OF RELU FUNCTION

• Since only a certain number of neurons are activated, the ReLU function is far more computationally efficient when compared to the sigmoid and tanh functions.
• ReLU accelerates the convergence of gradient descent towards the global minimum of the loss function due to its linear, non-saturating property.

LIMITATIONS OF RELU FUNCTION

• The dying ReLU problem: the negative side of the graph makes the gradient value zero. Because of this, during the backpropagation process, the weights and biases of some neurons are never updated, which can create dead neurons that never get activated.
• All negative input values become zero immediately, which decreases the model's ability to fit or train from the data properly.

LEAKY RELU FUNCTION

• Leaky ReLU is an improved version of the ReLU function, designed to solve the dying ReLU problem by having a small positive slope in the negative region.
• Mathematically, it can be represented as:

  f(x) = x for x ≥ 0, and f(x) = α·x for x < 0, where α is a small constant (e.g., 0.01)
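In Python, with the negative-side slope as a parameter (0.01 is a common choice, though the exact value varies):

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Positive inputs pass through; negative inputs are scaled by the small slope alpha.
        return np.where(x >= 0, x, alpha * x)

    print(leaky_relu(np.array([-2.0, 0.0, 3.5])))   # [-0.02  0.    3.5 ]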

ADVANTAGES OF LEAKY RELU FUNCTION

• The advantages of Leaky ReLU are the same as those of ReLU, with the addition that it enables backpropagation even for negative input values.
• By making this minor modification for negative input values, the gradient on the left side of the graph becomes a non-zero value. Therefore, we no longer encounter dead neurons in that region.

LIMITATIONS OF LEAKY RELU FUNCTION

• The predictions may not be consistent for negative input values.
• The gradient for negative values is small, which makes learning the model parameters time-consuming.

THANKS
ANN TEAM
