
RNN

Recurrent Neural Networks


Ref: https://youtu.be/AsNTP8Kwu80?si=IBB-DI1_7LzqXJFf
A recap on ANN

In a standard neural network, information passes sequentially through the
layers: each layer's output becomes the next layer's input.
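To make this layer-by-layer flow concrete before contrasting it with RNNs, here is a minimal NumPy sketch (weights and inputs are made up for illustration) of a forward pass through a tiny two-layer network that consumes all of its input features at once.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# A tiny two-layer feed-forward network with hypothetical, hand-picked weights.
W1 = np.array([[0.5, -0.3], [0.8, 0.2]])   # input -> hidden
b1 = np.array([0.1, 0.0])
W2 = np.array([0.7, -0.4])                 # hidden -> output
b2 = 0.05

x = np.array([1.0, 2.0])                   # all input features given at once
hidden = relu(x @ W1 + b1)                 # information passes layer by layer
output = hidden @ W2 + b2
print(output)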
Basic RNN or Vanilla RNN

A stepping stone to understanding fancier architectures such as
– LSTM
– Transformers

Works well with sequential data / time-series data
– The sequences can be of varying length
– An NN that is flexible in terms of how much sequential data is used to
make a prediction

A classic NN/CNN always works with fixed-size inputs.

Like any neural network, an RNN has weights, biases, layers and activation
functions.
Difference between RNN and NN

RNN has feedback loops.

Prediction using feedback loops

Suppose we want to predict tomorrow's stock price using yesterday's and
today's stock prices.

The input at the current step is today's stock value.

Y1 is the output of the activation function computed from yesterday's value;
the feedback loop passes it back in alongside today's value. A minimal sketch
of this two-step prediction is given below.
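A minimal NumPy sketch of this two-day prediction, assuming a single recurrent unit with a tanh activation and made-up weights (w1 on the input, w2 on the feedback path); the scaled prices are purely illustrative.

import numpy as np

# Hypothetical weights for a single recurrent unit (illustrative values).
w1, w2, b = 0.6, 0.9, 0.1          # input weight, feedback weight, bias
w_out, b_out = 1.5, 0.0            # output layer weight and bias

yesterday, today = 1.02, 1.05      # scaled stock prices (illustrative)

# Step 1: Y1 is the activation computed from yesterday's value.
y1 = np.tanh(w1 * yesterday + b)

# Step 2: today's value plus the fed-back Y1 give the prediction for tomorrow.
y2 = np.tanh(w1 * today + w2 * y1 + b)
tomorrow_pred = w_out * y2 + b_out
print(tomorrow_pred)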
Unrolling the feedback loops

(Figure: the feedback loop unrolled over two time steps; yesterday's price
feeds the first copy, today's price feeds the second, and the intermediate
prediction of today's price is ignored.)
Unrolling the RNN

To handle more days of data, keep unrolling.

Regardless of how many times we unroll the RNN, the same weights and biases
are shared across every unrolled copy, as the sketch below shows.

This means the number of weights and biases we need to train stays constant.

However, the more we unroll the RNN, the more difficult it becomes to train.
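A NumPy sketch of weight sharing under unrolling; the parameter names and values are assumptions for illustration, and the point is that the same three parameters are reused no matter how many time steps are fed in.

import numpy as np

def unrolled_rnn(inputs, w_input=0.5, w_feedback=0.8, bias=0.1):
    """Apply the SAME weights and bias at every unrolled time step."""
    h = 0.0
    for x in inputs:                    # one unrolled copy per input value
        h = np.tanh(w_input * x + w_feedback * h + bias)
    return h

# Works for any number of days; the parameter count stays at three.
print(unrolled_rnn([1.02, 1.05]))               # two days of data
print(unrolled_rnn([0.98, 1.00, 1.02, 1.05]))   # four days, same parameters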
Gradient Descent
Limitations of RNN – Exploding/vanishing gradient problem

Caused by the weight on the feedback path, which is copied each time the
network is unrolled.

When we set w2 > 1, input1 effectively gets multiplied by w2 raised to the
power n when we unroll the loop n times.

The derivative of this term will be quite large, which causes the steps taken
by the gradient descent algorithm to be too big.
Limitations of RNN – Exploding/vanishing gradient problem

The previous example is what is called the exploding gradient problem.

When we set w2 < 1, input1 effectively gets multiplied by w2 raised to the
power n when we unroll the loop n times, which yields a very small term.

The derivative of this term will be very small, which causes the steps taken
by the gradient descent algorithm to be too small.

This is called the vanishing gradient problem. A numeric illustration of both
cases is given below.
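A quick numeric illustration of both cases (the values of w2 and n are arbitrary choices for this sketch): because the feedback weight is copied at every unrolled step, the first input's contribution is scaled by w2 to the power n, which explodes when w2 > 1 and vanishes when w2 < 1.

n = 50                        # number of unrolled time steps

for w2 in (1.5, 0.5):         # feedback weight > 1 vs. < 1
    factor = w2 ** n          # how much input1's contribution is scaled
    print(f"w2 = {w2}: w2**{n} = {factor:.3e}")

# w2 = 1.5: w2**50 is about 6.4e+08  -> huge gradient steps (exploding gradient)
# w2 = 0.5: w2**50 is about 8.9e-16  -> tiny gradient steps (vanishing gradient)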
RNNs

RNNs are recurrent.

The same operation happens over
and over again.

The inputs are features that are fed in one after another, one per time step.

Each time step takes the current feature and the output of the previous time
step as its inputs.
(Figure: inputs arriving at successive time steps T1, T2, T3, ...)
RNNs


In a standard neural network, we give all features at once, whereas in an
RNN, we give one feature as input at each time step.

In every time step except the first one, we have two inputs: the current
feature and the previous step's output. A minimal sketch of this per-step
processing follows the figure below.
Unrolled representation
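A minimal sketch using PyTorch's built-in nn.RNN (assuming PyTorch is the framework in use) to show the same idea: one feature per time step, each step combining the current input with the previous step's output, and sequences of different lengths reusing the same layer.

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=4, batch_first=True)

# One feature per time step: tensors of shape (batch, time_steps, features).
short_seq = torch.randn(1, 3, 1)    # 3 time steps
long_seq = torch.randn(1, 8, 1)     # 8 time steps, same weights

for seq in (short_seq, long_seq):
    outputs, h_last = rnn(seq)      # each step uses its input + previous output
    print(outputs.shape, h_last.shape)
# torch.Size([1, 3, 4]) torch.Size([1, 1, 4])
# torch.Size([1, 8, 4]) torch.Size([1, 1, 4])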
RNNs – Key Components

Recurrent Neuron

Unrolled representation
RNN - Limitations

Vanishing Gradient

Exploding Gradient
LSTM – Long Short Term Memory

Retains memory about previous states.

Used to extract features from sequential data.
LSTM – Architecture

Cell state – long-term memory

Hidden state – short-term memory

Gates – the forget, input and output gates control what is written to and
read from these two memories. A minimal usage sketch is given below.
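A minimal sketch with PyTorch's nn.LSTM (again assuming PyTorch, with an arbitrary toy input) showing the two memories the slide refers to: the cell state c (long-term memory) and the hidden state h (short-term memory), both updated internally by the gates.

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=4, batch_first=True)

x = torch.randn(1, 10, 1)          # one feature per time step, 10 steps
outputs, (h, c) = lstm(x)

print(outputs.shape)  # torch.Size([1, 10, 4]) - per-step hidden outputs
print(h.shape)        # torch.Size([1, 1, 4])  - final hidden state (short-term memory)
print(c.shape)        # torch.Size([1, 1, 4])  - final cell state (long-term memory)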
