SESSION - 5
PERCEPTRON MODEL
CONTENT
• We attach a weight (wᵢ) to each input, and we add an extra input of constant value 1
with a weight of −θ. This extra term is called the bias.
• The inputs can be seen as neurons and are collectively called the input layer.
Together, these neurons and the activation function form a perceptron.
• The binary classification function of the perceptron network is represented as shown
below.
PERCEPTRON MODEL
• Schematic Representation
• With net input y_in = Σᵢ wᵢxᵢ + b (where b = −θ), the output is
  Y = f(y_in) = 1 if y_in ≥ 0, and 0 otherwise (see the sketch below).
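A minimal sketch of this forward pass in Python with NumPy (the language and the AND
example are assumptions for illustration; the slides show no code):

import numpy as np

def perceptron_output(x, w, theta):
    """Forward pass of a single perceptron with threshold theta."""
    y_in = np.dot(w, x) - theta          # net input: sum of w_i * x_i plus the bias -theta
    return 1 if y_in >= 0 else 0         # binary step activation

# Illustrative example: a perceptron computing logical AND with hand-picked values.
w = np.array([1.0, 1.0])
theta = 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output(np.array(x, dtype=float), w, theta))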
TRAINING ALGORITHM
• If the output y does not equal the target t, update
  w (new) = w (old) + α · t · x
  b (new) = b (old) + α · t
• Otherwise, keep
  w (new) = w (old)
  b (new) = b (old)
• Step 5: Train the network until the stopping condition is reached (no change in
weights for any training case). A short code sketch of this loop follows.
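Below is a minimal Python/NumPy sketch of this training loop (an illustration, not the
slides' own code); it assumes bipolar targets t ∈ {−1, +1} and a learning rate α, which
are common textbook conventions.

import numpy as np

def train_perceptron(X, T, alpha=1.0, max_epochs=100):
    """Perceptron training loop following the update rule above.
    Assumes bipolar targets t in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        changed = False
        for x, t in zip(X, T):
            y = 1 if np.dot(w, x) + b >= 0 else -1   # step-activation output
            if y != t:                               # update only on a mismatch
                w = w + alpha * t * x                # w(new) = w(old) + alpha * t * x
                b = b + alpha * t                    # b(new) = b(old) + alpha * t
                changed = True
        if not changed:                              # Step 5: stop when no weight changed
            break
    return w, b

# Illustrative run: learning logical AND with bipolar targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([-1, -1, -1, 1])
print(train_perceptron(X, T))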
CASE STUDY
• Marvin Minsky and Seymour Papert are two influential figures in the
field of artificial intelligence and neural networks.
• Their book, "Perceptrons: An Introduction to Computational Geometry,"
published in 1969, is a seminal work that critically analyzed the
capabilities and limitations of the Perceptron model, a simple type of
artificial neural network.
• Their analysis highlighted significant challenges in the field and spurred
the development of more complex neural network architectures.
KEY CONCEPTS
THE PERCEPTRON MODEL
LIMITATIONS OF THE PERCEPTRON
MULTI-LAYER PERCEPTRONS (MLPs)
BACKPROPAGATION ALGORITHM
NEURAL NETWORK ARCHITECTURES
ACTIVATION FUNCTIONS
SESSION - 6
ACTIVATION FUNCTIONS
TYPES OF ACTIVATION FUNCTIONS
LINEAR ACTIVATION FUNCTIONS
LIMITATIONS OF LINEAR ACTIVATION FUNCTION
• Backpropagation cannot be used, because the derivative of the function is a constant
and has no relation to the input x.
• All layers of the neural network collapse into one if a linear activation function is
used: no matter how many layers the network has, the last layer is still a linear
function of the first layer. A linear activation function therefore turns the neural
network into just one layer, as the sketch below illustrates.
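A small NumPy check of the collapse argument (an illustrative sketch; the layer sizes and
random values are arbitrary assumptions): two stacked layers with linear activations are
exactly equivalent to a single linear layer.

import numpy as np

# Two stacked "linear activation" layers: out = W2 @ (W1 @ x + b1) + b2.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)
two_layer = W2 @ (W1 @ x + b1) + b2

# The same mapping as a single linear layer: W = W2 @ W1, b = W2 @ b1 + b2.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layer, one_layer))  # True: the two layers collapse into one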
BINARY STEP FUNCTION
LIMITATIONS OF BINARY STEP FUNCTION
SIGMOID / LOGISTIC ACTIVATION FUNCTION
LIMITATIONS OF SIGMOID ACTIVATION FUNCTION
• The derivative of the function is f'(x) = sigmoid(x) · (1 − sigmoid(x)).
• As the figure shows, the gradient values are only significant in the range −3 to 3;
the graph becomes much flatter outside this region.
• This implies that for inputs greater than 3 or less than −3, the function has very
small gradients. As the gradient value approaches zero, the network ceases to learn and
suffers from the vanishing gradient problem.
• The output of the logistic function is not symmetric around zero, so the outputs of
all the neurons have the same sign. This makes training the neural network more
difficult and unstable. A small numerical illustration follows below.
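A quick numerical illustration of the vanishing-gradient point (a Python/NumPy sketch;
the sample inputs are arbitrary):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # f'(x) = sigmoid(x) * (1 - sigmoid(x)), as stated above.
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [-6.0, -3.0, 0.0, 3.0, 6.0]:
    print(f"x = {x:+.1f}  sigmoid = {sigmoid(x):.4f}  gradient = {sigmoid_grad(x):.4f}")
# The gradient peaks at 0.25 near x = 0 and is already tiny beyond |x| ≈ 3,
# which is the vanishing-gradient behaviour described above.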
TANH FUNCTION (HYPERBOLIC TANGENT)
ADVANTAGE OF TANH FUNCTION
LIMITATIONS OF TANH FUNCTION
RELU FUNCTION
ADVANTAGES OF RELU FUNCTION
LIMITATIONS OF RELU FUNCTION
LEAKY RELU FUNCTION
ADVANTAGES OF LEAKY RELU FUNCTION
LIMITATIONS OF LEAKY RELU FUNCTION
THANKS
ANN TEAM