ANN Presentation
Department of Information and Communication Engineering (ICE), BAUET
9/23/2022
Overfitting
•Overfitting occurs when a model becomes very good at classifying or
predicting on data that was included in the training set, but not so good at
classifying test data.
•An overfitted model is unable to generalize well: it has learned the features
of the training set extremely well, but if we give the model any data that
deviates even slightly from the exact data used during training, it is unable
to accurately predict the output.
Overfitting
• During training we get metrics for validation accuracy and loss as well as
training accuracy and loss. If the validation metrics are considerably worse
than the training metrics, that is an indication that our model is overfitting.
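The check described above can be sketched in a few lines. This is a minimal illustration with made-up per-epoch numbers; the `history` dictionary and the 0.1 threshold are assumptions, not values from the slides.

```python
# Hypothetical training history: per-epoch accuracy for training and validation.
# An overfitting model shows training accuracy climbing while validation stalls.
history = {
    "train_acc": [0.70, 0.85, 0.93, 0.97, 0.99],
    "val_acc":   [0.68, 0.78, 0.80, 0.79, 0.78],
}

def overfitting_gap(history, threshold=0.1):
    """Flag overfitting when final training accuracy exceeds final
    validation accuracy by more than `threshold`."""
    gap = history["train_acc"][-1] - history["val_acc"][-1]
    return gap, gap > threshold

gap, is_overfit = overfitting_gap(history)
print(f"train/val gap = {gap:.2f}, overfitting: {is_overfit}")
```

In practice one would plot both curves over all epochs rather than compare only the final values.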
Underfitting
•Underfitting is a scenario in data science where a model is unable to
capture the relationship between the input and output variables accurately,
generating a high error rate on both the training set and unseen data.
Underfitting
•An underfitted model does not have enough parameters to capture the trends
in the underlying system. Imagine, for example, that we have data that is
parabolic in nature, but we try to fit it with a linear function with just
one parameter.
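The parabolic example above can be demonstrated directly: a degree-1 fit to quadratic data leaves a large error, while a degree-2 fit captures it. A small numpy sketch (the data and sample size are arbitrary choices for illustration):

```python
import numpy as np

# Parabolic data: y = x^2
x = np.linspace(-3, 3, 50)
y = x ** 2

# Underfit: a degree-1 (linear) model cannot capture the curvature.
lin_coeffs = np.polyfit(x, y, deg=1)
lin_err = np.mean((np.polyval(lin_coeffs, x) - y) ** 2)

# A degree-2 model has enough parameters to fit the trend.
quad_coeffs = np.polyfit(x, y, deg=2)
quad_err = np.mean((np.polyval(quad_coeffs, x) - y) ** 2)

print(f"linear MSE: {lin_err:.3f}, quadratic MSE: {quad_err:.2e}")
```

The linear model's error stays high no matter how long we "train", which is the signature of underfitting.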
Simple Techniques to Prevent Overfitting
• Add more data: the model will have more to learn from the training set, and
with more data we hope to add more diversity to the training set.
• Data augmentation: modify existing samples by cropping, rotating, flipping,
or zooming; we'll cover more on the concept of data augmentation later.
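A minimal sketch of the augmentation idea, using simple numpy flips and rotations on a toy 2-D "image" (real pipelines would use a library with crops, zooms, and arbitrary rotations; the function name `augment` and the specific transforms chosen here are illustrative assumptions):

```python
import numpy as np

def augment(image, rng):
    """Return a randomly transformed copy of a 2-D image array:
    horizontal flip, vertical flip, or 90-degree rotation."""
    choice = rng.integers(3)
    if choice == 0:
        return np.fliplr(image)   # horizontal flip
    if choice == 1:
        return np.flipud(image)   # vertical flip
    return np.rot90(image)        # 90-degree rotation

rng = np.random.default_rng(0)
image = np.arange(16).reshape(4, 4)
# Each call yields a different view of the same underlying sample,
# adding diversity to the training set without collecting new data.
augmented = [augment(image, rng) for _ in range(4)]
```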
Simple Techniques to Prevent Overfitting
•Another technique for reducing overfitting is called dropout. The general
idea behind dropout is that, when added to a model, it will randomly ignore
some subset of nodes in a given layer during training.
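The "randomly ignore a subset of nodes" idea can be sketched as inverted dropout in numpy (the scaling by 1/(1-rate) is the standard trick so that the expected activation is unchanged; the rate of 0.5 is an arbitrary example value):

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and scale survivors by 1/(1 - rate) so the
    expected activation stays the same. At inference, do nothing."""
    if not training or rate == 0.0:
        return activations
    keep = rng.random(activations.shape) >= rate   # random subset of nodes ignored
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(42)
layer_out = np.ones(10)
dropped = dropout(layer_out, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

Because a different subset is dropped on every forward pass, no single node can become overly specialized to the training data.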
Backpropagation
•Feedforward pass
•Loss calculation
•Backpropagation
•Weight update
Backpropagation
Each node in the model receives its input from the previous layer, and this
input is a weighted sum: the weight at each connection multiplied by the
corresponding output from the previous layer.[1]
Backpropagation
The weighted sum is passed to an activation function, and the result from
this activation function is the output for that particular node; it is then
passed as part of the input for the nodes in the next layer. This happens
for each layer in the network until we reach the output layer, and this
process is called forward propagation.
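The weighted-sum-then-activation loop described above can be written compactly. This is a minimal sketch with an assumed 3-2-1 architecture, ReLU activations, and randomly chosen weights purely for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Forward propagation: at each layer the input is the weighted sum
    of the previous layer's output (plus a bias), passed through an
    activation function whose result feeds the next layer."""
    a = x
    for W, b in layers:
        z = W @ a + b        # weighted sum of the previous layer's output
        a = relu(z)          # activation output for this layer's nodes
    return a

rng = np.random.default_rng(0)
# Hypothetical 3-2-1 network: weights and biases are random placeholders.
layers = [(rng.standard_normal((2, 3)), np.zeros(2)),
          (rng.standard_normal((1, 2)), np.zeros(1))]
output = forward(np.array([1.0, 2.0, 3.0]), layers)
```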
Backpropagation
Backpropagation is the tool that gradient
descent uses to calculate the gradient of
the loss function.
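To make "calculate the gradient of the loss function" concrete, here is the smallest possible case: one weight, one data point, squared-error loss. The chain rule gives the gradient analytically, and a finite-difference check confirms it (the specific numbers are arbitrary examples):

```python
# A tiny one-weight model: prediction = w * x, loss = (w*x - y)^2.
# Backpropagation applies the chain rule: dL/dw = 2 * (w*x - y) * x.
def loss(w, x, y):
    return (w * x - y) ** 2

def grad_backprop(w, x, y):
    return 2 * (w * x - y) * x

w, x, y = 0.5, 2.0, 3.0
analytic = grad_backprop(w, x, y)   # 2 * (1.0 - 3.0) * 2.0 = -8.0

# Check against a numerical gradient (central finite differences).
eps = 1e-6
numeric = (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2 * eps)
print(analytic, numeric)  # the two should agree closely
```

In a multi-layer network, backpropagation is exactly this chain rule applied layer by layer, from the output back toward the input.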
Backpropagation
In addition to updating weights to move in the desired direction (positive or
negative), backpropagation also works to update the weights in a manner that
reduces the loss function most efficiently.[2]
Gradient descent
• To find the derivative of the loss with respect to a weight:
• If all n data points are considered, it is (batch) gradient descent (computationally expensive, higher memory).
• One data point -> stochastic gradient descent.
• k < n data points -> mini-batch gradient descent.
(Figure: loss curves for gradient descent, mini-batch gradient descent, and stochastic gradient descent.)
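The three variants differ only in how many points feed each gradient step, which a single parameter can express. A sketch on a noiseless linear-regression toy problem (the learning rate, step count, and data are all illustrative assumptions):

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of mean squared error for a linear model y_hat = X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

def train(X, y, batch_size, steps=300, lr=0.05, seed=0):
    """batch_size = n      -> (batch) gradient descent
       batch_size = 1      -> stochastic gradient descent
       1 < batch_size < n  -> mini-batch gradient descent"""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch_size, replace=False)
        w -= lr * gradient(w, X[idx], y[idx])
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w

w_gd = train(X, y, batch_size=100)   # all n points per step
w_sgd = train(X, y, batch_size=1)    # one point per step
w_mini = train(X, y, batch_size=16)  # k < n points per step
```

All three recover the true weights here; they trade off compute and memory per step against the noisiness of each update, as the figure on this slide illustrates.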
Vanishing & Exploding Gradient
•Vanishing and exploding gradients: a problem resulting from backpropagation,
also called the unstable gradient problem.
•"Gradient" here means the gradient of the loss function with respect to the
weights.
•The problem mainly involves the earlier layers of the neural network.
•SGD updates each weight using the gradient of the loss with respect to that
weight.
•The gradient is calculated using backpropagation.
Vanishing & Exploding Gradient
•The gradient in the earlier layers of the network becomes very small,
vanishingly small: hence "vanishing gradient".
•The model uses the gradient value to update the weight; the weight gets
updated in a way that is proportional to the gradient. If the gradient is
vanishingly small, then this update is, in turn, going to be vanishingly
small as well.
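Why the early layers in particular? By the chain rule, the gradient for an early-layer weight is a product of per-layer terms, and sigmoid's derivative never exceeds 0.25, so the product shrinks exponentially with depth. A quick numerical illustration (10 layers is an arbitrary example depth):

```python
import numpy as np

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

n_layers = 10
per_layer = sigmoid_derivative(0.0)             # 0.25, the maximum possible value
early_layer_gradient_factor = per_layer ** n_layers
print(early_layer_gradient_factor)              # on the order of 1e-6: the update barely moves the weight
```

Even in this best case (derivative at its maximum everywhere), ten layers shrink the gradient by a factor of about a million.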
Vanishing Gradient
• Therefore, if the newly updated value of the weight has barely moved from
its original value, the update is not really doing much for the network.
• As a result, this weight becomes stuck, never updating enough to get close
to its optimal value, which has implications for the remainder of the
network downstream of this weight and impairs the ability of the network
to learn well.
Exploding Gradient
Here the gradient does not vanish; rather, it explodes. Consider calculating
the gradient with respect to the same weight, but instead of really small
terms in the chain-rule product, suppose they were large: by large, we mean
greater than one.[2]
Exploding Gradient
Instead of barely moving the weight with this update, we greatly move it. So
much so that the optimal value for this weight is never achieved, because the
proportion by which the weight is updated at each epoch is just too large,
and the weight moves further and further away from its optimal value.
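The overshooting behaviour described above can be reproduced on the simplest possible loss, L(w) = w^2, whose optimum is w = 0. With an oversized effective step (the 1.1 learning rate is a deliberately bad choice for illustration), each update flips the sign of the weight and increases its magnitude:

```python
# With an exploding gradient, each update overshoots the optimum and the
# weight moves further away instead of converging.
w = 1.0
lr = 1.1              # effective step too large relative to the curvature
for _ in range(5):
    grad = 2 * w      # dL/dw for L(w) = w^2
    w -= lr * grad    # w <- w - 1.1 * 2w = -1.2 * w
print(w)              # magnitude grows every epoch: the weight diverges
```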
Ways to solve
• Change the activation function:
Certain activation functions, like the sigmoid function, squish a large input
space into a small output space between 0 and 1. Therefore, a large change in
the input of the sigmoid function causes only a small change in the output,
so the derivative becomes small.
Use ReLU instead: its derivative is 1 for positive inputs, so it does not
shrink the gradient.
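The contrast between the two derivatives is easy to see numerically (the sample points in `z` are arbitrary):

```python
import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    return (z > 0).astype(float)

z = np.array([-4.0, -1.0, 1.0, 4.0])
print(sigmoid_grad(z))  # always <= 0.25, and tiny for large |z|
print(relu_grad(z))     # exactly 1 for positive inputs: no shrinkage
```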
Ways to solve
• Weight initialization: careful initialization is another way to reduce the
vanishing (and exploding) gradient problem.
Bias
• The values assigned to biases are learnable, just like the weights. Just as
stochastic gradient descent learns and updates the weights via
backpropagation during training, it also learns and updates the biases.
•Think of the bias at each neuron as having a role similar to that of a
threshold: the bias value helps determine whether or not the activation
output from a neuron is going to be propagated forward through the network.
•In other words, the bias determines whether, or by how much, a neuron will
fire.[3]
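The threshold role of the bias is visible in a single ReLU neuron: shifting the bias decides whether the weighted sum clears zero and propagates forward (the inputs, weights, and bias values here are arbitrary examples):

```python
import numpy as np

def neuron(x, w, bias):
    """A single ReLU neuron: the bias shifts the threshold at which the
    weighted sum produces a nonzero activation."""
    z = np.dot(w, x) + bias
    return max(0.0, z)

x = np.array([1.0, 2.0])
w = np.array([0.5, 0.5])        # weighted sum of inputs = 1.5

print(neuron(x, w, bias=0.0))   # 1.5 -> the neuron fires
print(neuron(x, w, bias=-2.0))  # -0.5 -> 0.0, the neuron does not fire
```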
Link
•To get a clear concept, this video playlist will be helpful:
•https://www.youtube.com/playlist?list=PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU
References
1. https://elitedatascience.com/overfitting-in-machine-learning
2. https://deeplizard.com/learn/video/qO_NLVjD6zE
3. https://deeplizard.com
Thank you