RNN, LSTM, and GRU
RNN:
• A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step.
• In traditional neural networks, all inputs and outputs are independent of each other.
• However, for tasks such as predicting the next word of a sentence, the previous words are needed, so the network must remember them.
• RNNs solve this with a hidden layer. The key feature of an RNN is its hidden state, which remembers information about the sequence.
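A minimal sketch of a single recurrent step, assuming NumPy; the weight names and sizes below are illustrative assumptions, not taken from the slides:

import numpy as np

input_size, hidden_size = 8, 16
W_xh = np.random.randn(hidden_size, input_size) * 0.01   # input-to-hidden weights
W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state depends on the current input and on the
    # previous hidden state, which carries memory of earlier steps.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)
for x_t in np.random.randn(5, input_size):   # a sequence of 5 input vectors
    h = rnn_step(x_t, h)                       # the hidden state persists across steps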
Types Of RNN:
There are four types of RNNs based on the number of inputs and
outputs in the network.
1. One to One
2. One to Many
3. Many to One
4. Many to Many
Examples: sentiment analysis, movie rating.
Recurrent Neural Network Architecture:
• RNNs have the same input and output architecture as any other deep neural architecture; the difference lies in how information flows from input to output.
• Unlike deep neural networks, which use a different weight matrix for each dense layer, an RNN uses the same weights across the whole network: they are shared across all time steps.
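One way to see this weight sharing, sketched with Keras (assuming TensorFlow is installed; the layer sizes are illustrative): the parameter count of a SimpleRNN layer is the same no matter how long the sequence is, because the same matrices are reused at every time step.

import tensorflow as tf

# The same SimpleRNN layer applied to sequences of length 10 or 100 has an
# identical number of parameters: the weights are shared across time steps.
for seq_len in (10, 100):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(seq_len, 8)),  # (time steps, features per step)
        tf.keras.layers.SimpleRNN(16),
    ])
    print(seq_len, model.count_params())            # prints the same count both times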
How does an RNN work?
Simple RNN:
Forward and Backward Propagation in RNN
Forward Propagation
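For a standard RNN, forward propagation at time step $t$ can be written as (the notation here is assumed, not taken from the slides):
$h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$
$y_t = W_{hy} h_t + b_y$
where $x_t$ is the input, $h_t$ the hidden state, and $y_t$ the output; the same weight matrices are applied at every time step.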
Backward Propagation
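A minimal NumPy sketch of backpropagation through time (BPTT) for the vanilla RNN equations above, assuming a squared-error loss at every step; all names and sizes are illustrative:

import numpy as np

T, d_in, d_h = 5, 4, 8                      # sequence length, input size, hidden size
xs = np.random.randn(T, d_in)               # inputs
ts = np.random.randn(T, d_h)                # targets (same size as outputs, for simplicity)
W_xh = np.random.randn(d_h, d_in) * 0.1
W_hh = np.random.randn(d_h, d_h) * 0.1
W_hy = np.random.randn(d_h, d_h) * 0.1
b = np.zeros(d_h)

# Forward pass: store hidden states so they can be reused in the backward pass.
hs = [np.zeros(d_h)]
ys = []
for x in xs:
    hs.append(np.tanh(W_xh @ x + W_hh @ hs[-1] + b))
    ys.append(W_hy @ hs[-1])

# Backward pass: gradients flow back through every time step (BPTT); because the
# weights are shared, their gradients accumulate across all steps.
dW_xh, dW_hh, dW_hy, db = map(np.zeros_like, (W_xh, W_hh, W_hy, b))
dh_next = np.zeros(d_h)
for t in reversed(range(T)):
    dy = ys[t] - ts[t]                      # d(loss)/d(output) for squared error
    dW_hy += np.outer(dy, hs[t + 1])
    dh = W_hy.T @ dy + dh_next              # gradient from the output and from the future
    dz = (1 - hs[t + 1] ** 2) * dh          # backprop through tanh
    dW_xh += np.outer(dz, xs[t])
    dW_hh += np.outer(dz, hs[t])
    db += dz
    dh_next = W_hh.T @ dz                   # pass the gradient to the previous time step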
Issues of Standard RNNs:
Training through RNN:
Advantages and Disadvantages of Recurrent Neural Network
• Advantages
• Disadvantages
Difference between RNN and Simple Neural Network
An RNN is generally the better choice over a plain deep neural network when the data is sequential. The main differences between the two are:
• Recurrent Neural Network: the same weights are shared across all time steps of the network. Simple Deep Neural Network: each layer has its own weight matrix.
• Recurrent Neural Network: used when the data is sequential and the number of inputs is not predefined. Simple Deep Neural Network: has no special mechanism for sequential data, and the number of inputs is fixed.
• Recurrent Neural Network: exploding and vanishing gradients are its major drawback. Simple Deep Neural Network: these problems can also occur, but they are not its major issue.
Bidirectional Recurrent Neural Network:
Working of Bidirectional Recurrent Neural Network
1. Inputting a sequence: A sequence of data points, each represented as a vector of the same dimensionality, is fed into the BRNN. Sequences may have different lengths.
2. Dual processing: The data is processed in both the forward and backward directions. In the forward direction, the hidden state at time step t is computed from the input at step t and the hidden state at step t-1. In the backward direction, the hidden state at step t is computed from the input at step t and the hidden state at step t+1 (a sketch follows after this list).
3. Computing the hidden state: A non-linear activation function on the weighted sum of the input
and previous hidden state is used to calculate the hidden state at each step. This creates a memory
mechanism that enables the network to remember data from earlier steps in the process.
4. Determining the output: The output at each step is computed by applying a non-linear activation function to the weighted sum of the hidden state and the output weights. This output can either be the final output or serve as input to another layer of the network.
5. Training: The network is trained through a supervised learning approach where the goal is to
minimize the discrepancy between the predicted output and the actual output. The network adjusts
its weights in the input-to-hidden and hidden-to-output connections during training through
backpropagation.
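A minimal NumPy sketch of the dual (forward and backward) pass described above; all names and sizes are illustrative assumptions:

import numpy as np

T, d_in, d_h = 6, 4, 8
xs = np.random.randn(T, d_in)

def run_direction(xs, W_x, W_h, b):
    # Simple recurrent pass over a sequence of inputs (one direction).
    h = np.zeros(d_h)
    states = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return states

# Separate parameters for the forward and backward directions.
params_f = (np.random.randn(d_h, d_in) * 0.1, np.random.randn(d_h, d_h) * 0.1, np.zeros(d_h))
params_b = (np.random.randn(d_h, d_in) * 0.1, np.random.randn(d_h, d_h) * 0.1, np.zeros(d_h))

h_forward = run_direction(xs, *params_f)                 # left-to-right pass
h_backward = run_direction(xs[::-1], *params_b)[::-1]    # right-to-left pass, re-aligned in time

# At each step, the combined representation sees context from past and future.
combined = [np.concatenate([hf, hb]) for hf, hb in zip(h_forward, h_backward)]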
To calculate the output from an RNN unit, we use the following
formula:
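A common formulation (the symbols below are assumed, not taken from the slide): separate hidden states are computed for the forward and backward passes and then combined for the output:
$\overrightarrow{h}_t = \phi(W_x^{(f)} x_t + W_h^{(f)} \overrightarrow{h}_{t-1} + b^{(f)})$
$\overleftarrow{h}_t = \phi(W_x^{(b)} x_t + W_h^{(b)} \overleftarrow{h}_{t+1} + b^{(b)})$
$y_t = W_y [\overrightarrow{h}_t ; \overleftarrow{h}_t] + b_y$
where $\phi$ is the activation function and $[\cdot\,;\cdot]$ denotes concatenation of the two hidden states.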
The training of a BRNN is similar to the backpropagation through time (BPTT) algorithm. In outline, BPTT works as follows:
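• Unroll (roll out) the network across all time steps of the input sequence.
• Run the forward pass and compute the error at every time step from the predicted and actual outputs.
• Propagate the errors backward through the unrolled network, accumulating gradients for the shared weights (for a BRNN, through both the forward and the backward pass).
• Update the weights with the accumulated gradients and roll the network back up.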
• Advantages of Bidirectional RNN:
• Context from both past and future: With the ability to process sequential input both forward and
backward, BRNNs provide a thorough grasp of the full context of a sequence. Because of this, BRNNs are
effective at tasks like sentiment analysis and speech recognition.
• Enhanced accuracy: BRNNs frequently yield more precise answers since they take both historical and
upcoming data into account.
• Efficient handling of variable-length sequences: When compared to conventional RNNs, which require
padding to have a constant length, BRNNs are better equipped to handle variable-length sequences.
• Resilience to noise and irrelevant information: BRNNs may be resistant to noise and irrelevant data
that are present in the data. This is so because both the forward and backward paths offer useful
information that supports the predictions made by the network.
• Ability to handle sequential dependencies: BRNNs can capture long-term links between sequence
pieces, making them extremely adept at handling complicated sequential dependencies.
• Disadvantages of Bidirectional RNN:
• Computational complexity: Because they analyze data both forward and backward, BRNNs are computationally more expensive due to the increased number of calculations needed.
• Long training time: BRNNs can also take a while to train because there are many parameters to optimize,
especially when using huge datasets.
• Difficulty in parallelization: Due to the requirement for sequential processing in both the forward and
backward directions, BRNNs can be challenging to parallelize.
• Overfitting: BRNNs are prone to overfitting since their many parameters can produce overly complex models, especially when trained on small datasets.
• Interpretability: Due to the processing of data in both forward and backward directions, BRNNs can be
tricky to interpret since it can be difficult to comprehend what the model is doing and how it is producing
predictions.
LSTM
Long Short-Term Memory: LSTM
A traditional RNN has a single hidden state that is passed through time, which can make it difficult for the network to learn long-term dependencies. LSTMs address this problem by introducing a memory cell, a container that can hold information for an extended period.
LSTM Architecture:
• The input gate controls what information is added to the memory cell.
• The forget gate controls what information is removed from the memory cell.
• The output gate controls what information is output from the memory cell.
• The LSTM maintains a hidden state, which acts as the short-term memory of the network. The hidden state is updated based on the input, the previous hidden state, and the memory cell’s current state (a minimal sketch follows below).
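A minimal NumPy sketch of one LSTM step with the three gates described above; the names and sizes are assumptions, not the exact notation from the slides:

import numpy as np

d_in, d_h = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate plus the candidate; each sees [h_prev, x_t].
W_f, W_i, W_o, W_c = (np.random.randn(d_h, d_h + d_in) * 0.1 for _ in range(4))
b_f, b_i, b_o, b_c = (np.zeros(d_h) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z + b_f)         # forget gate: what to remove from the memory cell
    i = sigmoid(W_i @ z + b_i)         # input gate: what to add to the memory cell
    o = sigmoid(W_o @ z + b_o)         # output gate: what to expose as the hidden state
    c_tilde = np.tanh(W_c @ z + b_c)   # candidate cell content
    c = f * c_prev + i * c_tilde       # memory cell update
    h = o * np.tanh(c)                 # short-term (hidden) state
    return h, c

h, c = np.zeros(d_h), np.zeros(d_h)
for x_t in np.random.randn(5, d_in):
    h, c = lstm_step(x_t, h, c)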
Applications of LSTM
• Language Modeling: LSTMs have been used for natural language processing tasks
such as language modeling, machine translation, and text summarization. They can be
trained to generate coherent and grammatically correct sentences by learning the
dependencies between words in a sentence.
• Speech Recognition: LSTMs have been used for speech recognition tasks such as
transcribing speech to text and recognizing spoken commands. They can be trained to
recognize patterns in speech and match them to the corresponding text.
• Time Series Forecasting: LSTMs have been used for time series forecasting tasks
such as predicting stock prices, weather, and energy consumption. They can learn
patterns in time series data and use them to make predictions about future events.
• Anomaly Detection: LSTMs have been used for anomaly detection tasks such as
detecting fraud and network intrusion. They can be trained to identify patterns in data
that deviate from the norm and flag them as potential anomalies.
• Recommender Systems: LSTMs have been used for recommendation tasks such as
recommending movies, music, and books. They can learn patterns in user behavior and
use them to make personalized recommendations.
• Video Analysis: LSTMs have been used for video analysis tasks such as object detection, activity recognition, and action classification. They can be used in combination with other neural network architectures, such as Convolutional Neural Networks (CNNs).
LSTM vs RNN:
• Directionality: an LSTM (Long Short-Term Memory) can be trained to process sequential data in both forward and backward directions, while an RNN (Recurrent Neural Network) can only be trained to process sequential data in one direction.
Long Short-Term Memory (LSTM) is a powerful type of
recurrent neural network (RNN) that is well-suited for
handling sequential data with long-term dependencies. It
addresses the vanishing gradient problem, a common
limitation of RNNs, by introducing a gating mechanism that
controls the flow of information through the network. This
allows LSTMs to learn and retain information from the past,
making them effective for tasks like machine translation,
speech recognition, and natural language processing.
GRU
Gated Recurrent Unit Networks: GRU
Gated Recurrent Unit (GRU) is a type of recurrent neural network (RNN) that
was introduced by Cho et al. in 2014 as a simpler alternative to Long Short-
Term Memory (LSTM) networks. Like LSTM, GRU can process sequential data
such as text, speech, and time-series data.
The basic idea behind GRU is to use gating mechanisms to selectively update
the hidden state of the network at each time step. The gating mechanisms are
used to control the flow of information in and out of the network. The GRU has
two gating mechanisms, called the reset gate and the update gate.
The reset gate determines how much of the previous hidden state should be
forgotten, while the update gate determines how much of the new input
should be used to update the hidden state. The output of the GRU is calculated
based on the updated hidden state.
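A minimal NumPy sketch of one GRU step with the reset and update gates described above; the notation and sizes are assumptions:

import numpy as np

d_in, d_h = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W_r, W_z, W_h = (np.random.randn(d_h, d_h + d_in) * 0.1 for _ in range(3))
b_r, b_z, b_h = (np.zeros(d_h) for _ in range(3))

def gru_step(x_t, h_prev):
    z_in = np.concatenate([h_prev, x_t])
    r = sigmoid(W_r @ z_in + b_r)      # reset gate: how much of the past to forget
    z = sigmoid(W_z @ z_in + b_z)      # update gate: how much new information to use
    h_tilde = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]) + b_h)  # candidate state
    return (1 - z) * h_prev + z * h_tilde   # blend the old state with the candidate

h = np.zeros(d_h)
for x_t in np.random.randn(5, d_in):
    h = gru_step(x_t, h)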
To address the vanishing and exploding gradients problem often encountered when training a basic Recurrent Neural Network, many variations were developed. The most famous is the Long Short-Term Memory network (LSTM). A lesser-known but equally effective variation is the Gated Recurrent Unit network (GRU).