DL-UNIT_5
Recurrent Neural Networks introduce a mechanism where the output from one step is fed back
as input to the next, allowing them to retain information from previous inputs. This design
makes RNNs well-suited for tasks where context from earlier steps is essential, such as
predicting the next word in a sentence.
The defining feature of RNNs is their hidden state—also called the memory state—which
preserves essential information from previous inputs in the sequence. By using the same
parameters across all steps, RNNs perform consistently across inputs, reducing parameter
complexity compared to traditional neural networks. This capability makes RNNs highly
effective for sequential tasks.
Feedforward Neural Networks (FNNs) process data in one direction, from input to output, without retaining information from previous inputs. This makes them suitable for tasks with independent inputs, like image classification. However, FNNs struggle with sequential data since they lack memory.
Recurrent Neural Networks (RNNs) solve this by incorporating loops that allow information
from previous steps to be fed back into the network. This feedback enables RNNs to remember
prior inputs, making them ideal for tasks where context is important.
1. Recurrent Neurons
The fundamental processing unit in a Recurrent Neural Network (RNN) is the recurrent unit (it is not formally called a "recurrent neuron"). Recurrent units hold a hidden state that maintains information about previous inputs in a sequence. By feeding their hidden state back into themselves, recurrent units can "remember" information from prior steps, allowing them to capture dependencies across time.
2. RNN Unfolding
RNN unfolding, or “unrolling,” is the process of expanding the recurrent structure over time
steps. During unfolding, each step of the sequence is represented as a separate layer in a series,
illustrating how information flows across each time step. This unrolling enables
backpropagation through time (BPTT), a learning process where errors are propagated across
time steps to adjust the network’s weights, enhancing the RNN’s ability to learn dependencies
within sequential data.
Types of Recurrent Neural Networks
There are four types of RNNs based on the number of inputs and outputs in the network:
1. One-to-One RNN
A One-to-One RNN behaves like a vanilla (feed-forward) neural network and is the simplest type of architecture. In this setup, there is a single input and a single output. It is commonly used for straightforward classification tasks where input data points do not depend on previous elements.
2. One-to-Many RNN
In a One-to-Many RNN, the network processes a single input to produce multiple outputs over
time. This setup is beneficial when a single input element should generate a sequence of
predictions.
For example, in an image captioning task, a single image is given as input and the model predicts a sequence of words as the caption.
3. Many-to-One RNN
The Many-to-One RNN receives a sequence of inputs and generates a single output. This type is
useful when the overall context of the input sequence is needed to make one prediction.
In sentiment analysis, the model receives a sequence of words (like a sentence) and produces a
single output, which is the sentiment of the sentence (positive, negative, or neutral).
4. Many-to-Many RNN
The Many-to-Many RNN type processes a sequence of inputs and generates a sequence of
outputs. This configuration is ideal for tasks where the input and output sequences need to
align over time, often in a one-to-one or many-to-many mapping.
In language translation task, a sequence of words in one language is given as input, and a
corresponding sequence in another language is generated as output.
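As a rough illustration of the many-to-one and many-to-many patterns above, the sketch below (assuming PyTorch and arbitrary layer sizes) contrasts keeping only the final hidden state with keeping the output at every time step:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)  # illustrative sizes
x = torch.randn(2, 5, 8)          # batch of 2 sequences, 5 time steps, 8 features each

outputs, h_n = rnn(x)             # outputs: (2, 5, 16), h_n: (1, 2, 16)

# Many-to-one (e.g. sentiment): use only the last hidden state for one prediction
sentiment_logits = nn.Linear(16, 3)(h_n.squeeze(0))      # shape (2, 3)

# Many-to-many (e.g. tagging or translation-style alignment): predict at every step
per_step_logits = nn.Linear(16, 10)(outputs)             # shape (2, 5, 10)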
Recurrent Neural Networks are networks that deal with sequential data. They can predict outputs based not only on the current input but also on the inputs that came before it. The output at the present time step depends on the current input and on the memory element (which encodes the previous inputs).
To train these networks, we make use of traditional backpropagation with an added twist: we don't train the system only at the exact time "t", but at time "t" together with everything that has occurred prior to it, i.e. t-1, t-2, t-3, and so on.
S1, S2, and S3 are the hidden states (memory units) at times t1, t2, and t3, respectively, and Ws is the weight matrix associated with them.
X1, X2, and X3 are the inputs at times t1, t2, and t3, respectively, and Wx is the weight matrix associated with them.
Y1, Y2, and Y3 are the outputs at times t1, t2, and t3, respectively, and Wy is the weight matrix associated with them.
St = g1(Wx xt + Ws St-1)
Yt = g2(Wy St)
Et = (dt - Yt)^2
Here, we employ the squared error, in which dt is the desired output at time t (for example, d3 is the desired output at t = 3).
In order to do backpropagation, it is necessary to change the weights that are associated with
inputs, memory units, and outputs.
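The update equations above can be written out directly as a short forward pass. The sketch below is a NumPy illustration under stated assumptions (tanh for g1, the identity for g2, and random weights and targets); backpropagation through time would then differentiate the accumulated error with respect to Wx, Ws, and Wy:

import numpy as np

rng = np.random.default_rng(1)
W_x = rng.normal(size=(3, 2))   # input weights (hidden_size x input_size)
W_s = rng.normal(size=(3, 3))   # recurrent weights
W_y = rng.normal(size=(1, 3))   # output weights

X = rng.normal(size=(3, 2))     # inputs x1, x2, x3
d = rng.normal(size=(3, 1))     # desired outputs d1, d2, d3

S = np.zeros(3)                 # initial memory state S0
total_error = 0.0
for t in range(3):
    S = np.tanh(W_x @ X[t] + W_s @ S)       # St = g1(Wx xt + Ws St-1)
    Y = W_y @ S                             # Yt = g2(Wy St), with g2 = identity here
    total_error += float((d[t] - Y) ** 2)   # Et = (dt - Yt)^2

# BPTT would differentiate total_error with respect to W_x, W_s, and W_y,
# propagating the error backwards through all three time steps.
print(total_error)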
Encoder-Decoder Model:
In deep learning, the encoder-decoder architecture is a type of neural network most widely
associated with the transformer architecture and used in sequence-to-sequence learning.
Literature thus refers to encoder-decoders at times as a form of sequence-to-sequence model
(seq2seq model). Much machine learning research focuses on encoder-decoder models for
natural language processing (NLP) tasks involving large language models (LLMs).
Encoder-decoder models are used to handle sequential data, specifically mapping input
sequences to output sequences of different lengths, such as neural machine translation, text
summarization, image captioning and speech recognition. In such tasks, mapping a token in the
input to one in the output is often indirect. For example, take machine translation: in some
languages, the verb appears near the beginning of the sentence (as in English), in others at the
end (such as German) and in some, the location of the verb may be more variable (for example,
Latin). An encoder-decoder network generates variable length yet contextually appropriate
output sequences to correspond to a given input sequence.
As may be inferred from their respective names, the encoder encodes a given input into a
vector representation, and the decoder decodes this vector into the same data type as the
original input dataset.
Both the encoder and decoder are separate neural networks. They may be recurrent neural networks (RNNs) and their variants, long short-term memory (LSTM) networks and gated recurrent units (GRUs), or convolutional neural networks (CNNs), as well as transformer models. An encoder-decoder model typically contains several encoders and several decoders.
Each encoder consists of two layers: the self-attention layer (or self-attention mechanism) and
the feed-forward neural network. The first layer guides the encoder in surveying and focusing
on other related words in a given input as it encodes one specific word therein. The feed-
forward neural network further processes encodings so they are acceptable for subsequent
encoder or decoder layers.
The decoder part also consists of a self-attention layer and feed-forward neural network, as
well as an additional third layer: the encoder-decoder attention layer. This layer focuses
network attention on specific parts of the output of the encoder. The multi-head attention
layer thereby maps tokens from two different sequences.
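A compact sketch of the encoder-decoder idea with recurrent networks follows (the GRU sizes, vocabulary sizes, start-token id, and greedy decoding loop are all illustrative assumptions; transformer-based encoder-decoders replace the GRUs with the attention layers described above):

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab=100, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)

    def forward(self, src):
        _, h = self.gru(self.embed(src))   # h: final hidden state summarising the input
        return h

class Decoder(nn.Module):
    def __init__(self, vocab=120, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, prev_token, h):
        o, h = self.gru(self.embed(prev_token), h)
        return self.out(o), h              # logits over the target vocabulary

encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, 100, (1, 7))          # one source sequence of length 7
h = encoder(src)                             # encode the input into a vector representation

token = torch.zeros(1, 1, dtype=torch.long)  # assumed start-of-sequence token id 0
generated = []
for _ in range(10):                          # generate the output sequence step by step
    logits, h = decoder(token, h)
    token = logits.argmax(-1)                # greedy choice of the next token
    generated.append(int(token))
print(generated)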
Attention Mechanism:
Let’s take a look at hearing and a case study of selective attention in the context of a crowded
cocktail party. Assume you’re at a social gathering with a large number of people speaking at
the same time. You’re also talking with a friend, and the background noise barely registers.
You are only paying attention to your friend’s voice and grasping their words while filtering out
background noise. In this scenario, our auditory system employs selective attention to focus on
the relevant auditory information. The neurological system of our brain improves the
representation of speech by prioritizing relevant sounds and ignoring background noises.
In deep learning, the attention mechanism is a computational method for prioritizing specific information in a given context. In natural language processing, attention is used during translation or question-answering tasks to align pertinent portions of the source sentence with the output. Without necessarily relying on reinforcement learning, attention mechanisms allow
neural networks to give various weights to various input items, boosting their ability to capture
crucial information and improve performance in a variety of tasks. Google Streetview’s house
number identification is an example of an attention mechanism in Computer vision that enables
models to systematically identify particular portions of an image for processing.
Attention Mechanism
Recurrent models of visual attention use reinforcement learning to focus attention on key areas
of the image. A recurrent neural network governs the glimpse network, which dynamically selects particular locations for exploration over time. In some classification tasks, this method can outperform convolutional neural networks. Additionally, this framework goes beyond image identification
and may be used for a variety of visual reinforcement learning applications, such as helping
robots choose behaviours to accomplish particular goals. Although the most basic use of this
strategy is supervised learning, the use of reinforcement learning permits more adaptable and
flexible decision-making based on feedback from past glances and rewards earned throughout
the learning process.
The application of attention mechanisms to image captioning has substantially enhanced the
quality and accuracy of generated captions. By incorporating attention, the model learns to
focus on pertinent image regions while creating each caption word. The model can synchronize
the visual and textual modalities by paying attention to various areas of the image at each time
step thanks to the attention mechanism. By focusing on important objects or areas in the
image, the model is able to produce captions that are more detailed and contextually
appropriate. Attention-based image captioning models have proven to perform better at capturing minute details, managing complicated scenes, and delivering cohesive and informative captions that closely match the visual material.
The attention mechanism is a technique used in machine learning and natural language
processing to increase model accuracy by focusing on relevant data. It enables the model to
focus on certain areas of the input data, giving more weight to crucial features and disregarding
unimportant ones. To accomplish this, each input feature is given a weight based on how important it is to the output. The performance of tasks that make use of the
attention mechanism has significantly improved in areas including speech recognition, image
captioning, and machine translation.
An attention mechanism in a neural network model typically consists of the following steps:
Input Encoding: The input sequence of data is represented or embedded using a collection of
representations. This step transforms the input into a format that can be processed by the
attention mechanism.
Query Generation: A query vector is generated based on the current state or context of the
model. This query vector represents the information the model wants to focus on or retrieve
from the input.
Key-Value Pair Creation: The input representations are split into key-value pairs. The keys
capture the information that will be used to determine the importance or relevance, while the
values contain the actual data or information.
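After the key-value pairs are formed, the remaining steps are standard: each query is compared against the keys to produce attention scores, the scores are normalised with a softmax into weights, and a weighted sum of the values gives the output. A minimal NumPy sketch of this scaled dot-product form (the dimensions below are arbitrary assumptions):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_queries, d), K: (n_keys, d), V: (n_keys, d_v)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                    # compare queries with keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V, weights                                # weighted sum of the values

rng = np.random.default_rng(2)
Q = rng.normal(size=(1, 4))    # one query vector
K = rng.normal(size=(6, 4))    # keys for 6 input positions
V = rng.normal(size=(6, 8))    # values for the same positions
context, attn = scaled_dot_product_attention(Q, K, V)
print(attn.round(2))           # attention weights sum to 1 across the 6 input positions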
Here are the main attention mechanisms used for image processing:
1. Spatial Attention
Commonly used in Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
Example:
In object detection, spatial attention helps the model focus on the main object while ignoring
the background.
2. Channel Attention
Weights each feature channel according to its importance, as in squeeze-and-excitation networks (see the sketch after this list).
Example:
In image classification, channel attention helps the model emphasize informative features like object edges rather than less useful ones.
3. Self-Attention (Transformers)
The Vision Transformer (ViT) divides an image into patches and applies self-attention to learn
global dependencies.
4. Cross-Attention
Used when multiple images or modalities (e.g., text and image) are involved.
Example:
In image captioning or visual question answering, the text attends to image features so that each generated word or answer is aligned with the relevant image regions.
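As a concrete illustration of channel attention (item 2 above), a squeeze-and-excitation style block can be sketched as follows; the channel count, reduction ratio, and feature-map sizes are illustrative assumptions:

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # Squeeze-and-excitation style channel attention: weight each feature
    # channel by a learned importance score between 0 and 1.
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                     # x: (batch, channels, H, W)
        squeezed = x.mean(dim=(2, 3))         # squeeze: global average pool per channel
        weights = self.fc(squeezed)           # excitation: per-channel importance scores
        return x * weights[:, :, None, None]  # reweight the channels of the feature map

features = torch.randn(2, 64, 8, 8)           # an illustrative CNN feature map
out = ChannelAttention()(features)            # same shape, channels rescaled by importance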