Recurrent Neural Networks: Ref: https://youtu.be/Asntp8Kwu80?si=Ibb-Di1-7Lzqxjff
[Figure: feedback-loop example on stock prices. Inputs: yesterday's price and today's price; the network's prediction of today's price is ignored.]
Unrolling the feedback loops
Unrolling the RNN
● For handling more days of data, keep unrolling.
● Regardless of how many times we unroll the RNN, the weights and biases are shared across every input.
● This means the number of weights and biases we need to train stays constant (see the sketch below).
● The more we unroll the RNN, the more difficult it becomes to train.
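To make the weight sharing concrete, here is a minimal sketch in plain Python, assuming a single-unit RNN with one input weight, one feedback weight, and one bias (the names w_input, w_feedback, and bias are illustrative, not from the slides). However many days we unroll over, the same three parameters are reused.

```python
# Minimal sketch of unrolling a single-unit RNN (assumed parameter names).
def unrolled_rnn(inputs, w_input, w_feedback, bias):
    """Unroll the feedback loop once per input value.

    The SAME w_input, w_feedback, and bias are reused at every step,
    so the number of trainable parameters stays constant no matter
    how many days of data we unroll over.
    """
    output = 0.0
    for x in inputs:                      # one unrolled copy per day
        output = x * w_input + output * w_feedback + bias
    return output

# Three days of made-up prices; unrolling three times still uses
# only the same three parameters.
print(unrolled_rnn([1.0, 1.2, 1.1], w_input=0.5, w_feedback=0.9, bias=0.1))
```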
Gradient Descent
Limitations of RNN – Exploding/vanishing gradient problem
● Caused by the weight on the feedback path, which is copied each time the network is unrolled.
● When we set w2 > 1, input1 gets multiplied by w2 n times when we unroll the loop n times.
● The derivatives of this term will be quite large, which causes the steps in the gradient descent algorithm to be too big. A numerical illustration follows below.
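A rough numerical illustration of the exploding case (the values w2 = 1.5 and n = 50 are assumptions for illustration, not from the slides):

```python
# If the feedback weight w2 > 1, input1's contribution after n unrolled
# steps contains a factor of w2**n, and so does its gradient.
w2, n = 1.5, 50
print(w2 ** n)   # ~6.4e8 -- the gradient term explodes as n grows
```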
Limitations of RNN – Exploding/vanishing gradient problem
● The previous example is what is called an exploding gradient.
● When we set w2 < 1, input1 gets multiplied by w2 n times when we unroll the loop n times, producing a very small term.
● The derivatives of this term will be very small, which causes the steps in the gradient descent algorithm to be too small.
● This is called a vanishing gradient. See the sketch below.
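The same illustration with w2 < 1 (again, the values are assumed) shows the term collapsing toward zero:

```python
# If the feedback weight w2 < 1, the factor w2**n shrinks toward zero,
# so the gradient steps become vanishingly small.
w2, n = 0.5, 50
print(w2 ** n)   # ~8.9e-16 -- the gradient term vanishes as n grows
```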
RNNs
● RNNs are recurrent: the same operation happens over and over again.
● Each input is a feature, which is given one after the other in each time step.
● Each time step takes the current feature and the output of the previous step as its input.
[Figure: unrolled time steps T1, T2, T3, ...]
RNNs
● In a neural network, we give all features at once, whereas in an RNN, we give one feature as input at each time step.
● In all time steps except the first one, there are two inputs: the current feature and the previous step's output. A minimal sketch of this loop follows below.
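A minimal NumPy sketch of this time-step loop, assuming a small RNN cell (the names W_x, W_h, b and the sizes are illustrative, not from the slides):

```python
import numpy as np

n_features, n_hidden = 1, 4
rng = np.random.default_rng(0)
W_x = rng.normal(size=(n_features, n_hidden))   # weights for the current feature
W_h = rng.normal(size=(n_hidden, n_hidden))     # weights for the previous output
b = np.zeros(n_hidden)

def rnn_forward(sequence):
    """Feed one feature per time step; every step after the first
    receives two inputs: the new feature and the previous output."""
    h = np.zeros(n_hidden)                      # no previous output at T1
    for x_t in sequence:                        # T1, T2, T3, ...
        h = np.tanh(x_t @ W_x + h @ W_h + b)    # same weights at every step
    return h

sequence = rng.normal(size=(5, n_features))     # 5 time steps, 1 feature each
print(rnn_forward(sequence))
```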
Unrolled representation
RNNs – Key Components
● Recurrent Neuron
● Unrolled representation
RNN – Limitations
● Vanishing Gradient
● Exploding Gradient
LSTM – Long Short Term Memory
● Retains memory about the previous state.
● Used to extract features from sequential data.
LSTM – Architecture
● Cell state – long-term memory
● Hidden state – short-term memory
● Gates (see the sketch below)
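To make the roles of the cell state (long-term memory), hidden state (short-term memory), and gates concrete, here is a from-scratch sketch of one LSTM step in NumPy. The weight layout, names, and sizes are assumptions for illustration, not a definitive implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One time step of an LSTM cell.

    x_t    : current input feature vector
    h_prev : previous hidden state (short-term memory)
    c_prev : previous cell state (long-term memory)
    W, b   : stacked weights/biases for the four gate computations
    """
    z = np.concatenate([x_t, h_prev]) @ W + b
    f, i, o, g = np.split(z, 4)
    f = sigmoid(f)                 # forget gate: what to drop from long-term memory
    i = sigmoid(i)                 # input gate: what new information to store
    o = sigmoid(o)                 # output gate: what to expose as short-term memory
    g = np.tanh(g)                 # candidate values for the cell state
    c_t = f * c_prev + i * g       # updated long-term memory (cell state)
    h_t = o * np.tanh(c_t)         # updated short-term memory (hidden state)
    return h_t, c_t

# Tiny usage example with made-up sizes: 1 input feature, 3 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 3
W = rng.normal(size=(n_in + n_hidden, 4 * n_hidden))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):          # feed a 5-step sequence
    h, c = lstm_step(x_t, h, c, W, b)
print(h, c)
```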