CH4_AA1.1-Sequence Models (1)
Learning Outcomes
• At the end of this chapter, you will:
• Understand how to build and train Recurrent Neural Networks (RNNs), and commonly used variants such as GRUs and LSTMs.
• Be able to apply sequence models to natural language problems.
• Be able to apply sequence models to audio applications, including speech recognition.
What is sequential data?
• Sequences of numbers in time:
• Stock prices
• Earthquake sensor data
• Sequences of words: text
• Sound: speech and other audio
• Video: a sequence of image frames
What tasks can be done with sequential data?
• Time series: prediction problems such as stock market forecasting.
• Image captioning: caption an image by analyzing the action taking place in it.
• Machine translation: translate a sentence from one language into another.
• Speech recognition: transcribe spoken audio into text.
Different types of Sequence Modeling Tasks
• Examples: image classification, music generation, sentiment classification, machine translation, speech recognition, image captioning, video activity recognition.
CNN limitations
• Fixed input size: CNNs require a fixed input size. This can be a problem
when dealing with sequences of varying length, such as in natural
language processing or speech recognition.
• Lack of memory: CNNs have no memory of previous inputs, which
means that they can't easily capture temporal dependencies in
sequential data.
• Order invariance: CNNs are order-invariant, which means that they treat
all inputs equally regardless of their position in the sequence. This makes
it difficult for the network to capture the order-dependent patterns that
are often present in sequential data.
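These limitations can be seen in a short sketch. The example below is illustrative and assumes PyTorch; the layer sizes and tensor shapes are arbitrary choices, not from the slides. A fixed-size dense layer needs exactly the input size it was built for, while a recurrent layer accepts sequences of any length.

```python
import torch
import torch.nn as nn

# Illustrative contrast (assumed example, not from the slides):
# a fixed-size dense layer vs. a recurrent layer.
fixed = nn.Linear(10, 2)                                     # expects exactly 10 input features
rnn = nn.RNN(input_size=3, hidden_size=4, batch_first=True)

print(fixed(torch.randn(1, 10)).shape)                       # works: exactly 10 features
# fixed(torch.randn(1, 5))                                   # would raise a shape-mismatch error

short_seq = torch.randn(1, 5, 3)                             # 5 time steps
long_seq = torch.randn(1, 50, 3)                             # 50 time steps
out_short, _ = rnn(short_seq)                                # the same RNN handles both lengths
out_long, _ = rnn(long_seq)
print(out_short.shape, out_long.shape)                       # (1, 5, 4) and (1, 50, 4)
```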
Recurrent Neural Network (RNN)
• RNNs are designed to handle sequences of varying length
• They have memory that allows them to capture temporal
dependencies in the data.
• They also have the ability to process inputs in a sequential order,
which allows them to capture order-dependent patterns.
• RNNs have become the standard approach for processing sequential
data such as speech, text, and time series data.
Graphical Representation of RNN
• $x_i$: input at time step $t_i$
• $h_i$: hidden state at $t_i$
• $y_i$: output at $t_i$
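The diagram above corresponds to a simple recurrence, sketched below in NumPy; the weight names ($W_{xh}$, $W_{hh}$, $W_{hy}$) and the dimensions are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Vanilla RNN forward pass over a short toy sequence (illustrative assumptions:
# the weight names and sizes below are not taken from the slides).
rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, T = 3, 5, 2, 4

W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))    # input  -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))   # hidden -> hidden (the recurrence)
W_hy = rng.normal(scale=0.1, size=(output_dim, hidden_dim))   # hidden -> output

xs = rng.normal(size=(T, input_dim))    # toy inputs x_1 ... x_T
h = np.zeros(hidden_dim)                # initial hidden state h_0

for t in range(T):
    h = np.tanh(W_xh @ xs[t] + W_hh @ h)    # h_t depends on x_t and on h_{t-1}
    y = W_hy @ h                            # y_t is read out from the hidden state
    print(f"t={t + 1}: y_t = {y}")
```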
Back Propagation Through Time (BPTT)
• Forward pass of a vanilla RNN: $h_t = f(W_{hh} h_{t-1} + W_{xh} x_t)$ and $\hat{y}_t = g(W_{hy} h_t)$, where $f$ is typically a sigmoid or tanh activation.
• The total loss is the sum of the per-step losses: $L = \sum_{t=1}^{T} L_t(\hat{y}_t, y_t)$.
• BPTT unrolls the network over time and applies the chain rule at every time step. For the recurrent weights:
$$\frac{\partial L}{\partial W_{hh}} = \sum_{t=1}^{T} \sum_{k=1}^{t} \frac{\partial L_t}{\partial \hat{y}_t}\,\frac{\partial \hat{y}_t}{\partial h_t}\,\frac{\partial h_t}{\partial h_k}\,\frac{\partial h_k}{\partial W_{hh}}$$
• Assume that the factor linking distant hidden states is a product of one-step terms:
$$\frac{\partial h_t}{\partial h_k} = \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}}$$
• Each one-step term is the Jacobian
$$\frac{\partial h_i}{\partial h_{i-1}} = \mathrm{diag}\!\big(f'(W_{hh} h_{i-1} + W_{xh} x_i)\big)\, W_{hh},$$
so backpropagating over many steps multiplies repeatedly by $W_{hh}$ and by the activation derivative $f'$.
RNN limitations
• Two common problems occur during backpropagation through time: vanishing and exploding gradients.
• Each factor $\frac{\partial h_i}{\partial h_{i-1}} = \mathrm{diag}(f'(\cdot))\, W_{hh}$ is scaled by the derivative of the activation function:
• If $f$ is a sigmoid function, $0 < f'(\cdot) \le \tfrac{1}{4}$.
• If $f$ is a tanh function, $0 < f'(\cdot) \le 1$.
• Assume that $\bigl\lVert \frac{\partial h_i}{\partial h_{i-1}} \bigr\rVert \le \gamma$ for every step; then $\bigl\lVert \frac{\partial h_t}{\partial h_k} \bigr\rVert \le \gamma^{\,t-k}$.
• Over long sequences the gradient therefore shrinks exponentially (vanishes) when $\gamma < 1$ and grows exponentially (explodes) when $\gamma > 1$ (see the numerical sketch below).
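A quick numerical sketch of this effect (illustrative assumptions only: random matrices stand in for the step Jacobians, and the sizes and scales are arbitrary): when the per-step factor has norm below 1 the product collapses toward 0, and when it is above 1 the product blows up.

```python
import numpy as np

# Illustrative demo (not from the slides): products of many step "Jacobians"
# either vanish or explode depending on their typical norm.
rng = np.random.default_rng(0)

def product_norm(scale, steps=50, dim=8):
    """Norm of a product of `steps` random matrices with a given per-step scale."""
    product = np.eye(dim)
    for _ in range(steps):
        product = product @ (scale * rng.normal(size=(dim, dim)) / np.sqrt(dim))
    return np.linalg.norm(product)

print("per-step norm < 1:", product_norm(scale=0.5))   # collapses toward 0 (vanishing)
print("per-step norm > 1:", product_norm(scale=1.5))   # blows up (exploding)
```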
Exploding Gradients with vanilla RNNs
• To avoid exploding gradients:
• Gradient clipping: at each time step, check whether the gradient norm exceeds a threshold and, if it does, normalize (rescale) it:
$$g \leftarrow \frac{\text{threshold}}{\lVert g \rVert}\, g \qquad \text{if } \lVert g \rVert > \text{threshold}$$
(as sketched in code below).
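The clipping rule can be written in a few lines; the sketch below uses NumPy with placeholder names (clip_gradient, threshold). In practice, deep-learning libraries ship an equivalent utility (e.g. torch.nn.utils.clip_grad_norm_ in PyTorch).

```python
import numpy as np

def clip_gradient(grad, threshold):
    """Rescale the gradient so its L2 norm never exceeds `threshold` (illustrative sketch)."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)   # keep the direction, shrink the magnitude
    return grad

g = np.array([3.0, 4.0])                   # norm 5
print(clip_gradient(g, threshold=1.0))     # -> [0.6 0.8], norm clipped to 1
```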
Vanishing Gradients with vanilla RNNs
• To tackle the vanishing gradient problem, these are the possible solutions:
• Use ReLU instead of the tanh or sigmoid activation function.
• Proper initialization of the weights can reduce the effect of
vanishing gradients.
• Truncated BPTT
• Using gated cells such as LSTM or GRUs
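As a quick illustration of the first point in the list above (illustrative code, not from the slides): the tanh derivative saturates toward 0 for large inputs, while the ReLU derivative stays at 1 for every positive input, so repeated multiplication shrinks the gradient far less.

```python
import numpy as np

x = np.linspace(-5, 5, 11)
tanh_grad = 1 - np.tanh(x) ** 2            # always <= 1, close to 0 when |x| is large
relu_grad = (x > 0).astype(float)          # exactly 1 for every positive input
print(np.round(tanh_grad, 3))
print(relu_grad)
```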
Truncated BPTT
• A moving window over the training sequence: forward propagation and backward propagation are run over fixed-length chunks rather than over the whole sequence (see the sketch after this list).
• Advantages:
• Helps to avoid exploding/vanishing gradients.
• Much faster and less complex than full BPTT.
• Disadvantage:
• Dependencies longer than the chunk length are not learned during training.
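A minimal sketch of truncated BPTT, assuming PyTorch and a toy sine-wave next-step prediction task; chunk_len, the layer sizes, and the data are illustrative choices, not from the slides. The key step is detaching the hidden state at each chunk boundary so gradients never flow further back than one chunk.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
opt = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=0.01)

# Toy data: predict the next value of a sine wave.
seq = torch.sin(torch.linspace(0, 20, 400)).reshape(1, -1, 1)   # (batch, time, features)

chunk_len = 20
h = None                                                 # hidden state carried across chunks
for start in range(0, seq.size(1) - chunk_len, chunk_len):
    x = seq[:, start:start + chunk_len]                  # inputs for this chunk
    y = seq[:, start + 1:start + chunk_len + 1]          # next-step targets
    out, h = rnn(x, h)                                   # forward pass over the chunk only
    loss = ((head(out) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()                                      # backprop stops at the chunk boundary
    opt.step()
    h = h.detach()                                       # cut the graph: no dependencies beyond the chunk
print(f"final chunk loss: {loss.item():.4f}")
```

Dependencies longer than chunk_len steps cannot be learned, which is exactly the trade-off noted above.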
Long Short-Term Memory (LSTM)
LSTM : Input Gate
• The input gate decides what new information will be stored in the long-term
memory.
• It only works with the information from the current input and the short-term
memory from the previous time step.
• $i_1$: a filter that selects which information can pass through and which information is discarded.
• The sigmoid function transforms the values to be between 0 and 1:
• 0 indicates that the corresponding information is unimportant,
• 1 indicates that the information will be used.
Forget Gate
• The forget gate decides which information from the long-term memory should
be kept or discarded.
• It multiplies the incoming long-term memory by a forget vector generated from the current input and the incoming short-term memory.
• The forget vector is produced using a sigmoid function.
• The outputs from the Input gate and the Forget gate undergo a pointwise addition to give a new version of the long-term memory.
Output Gate
• The output gate will take the current input, the previous short-term memory,
and the newly computed long-term memory to produce the new short-term
memory.
• The previous short-term memory and the current input are passed into a sigmoid function to create the third and final filter.
• The new long-term memory is passed through a tanh activation function.
• Multiplying these two results element-wise yields the new short-term memory.
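Putting the three gates together, a standard formulation of the LSTM cell update is sketched below; the weight names ($W_i, W_f, W_o, W_c$) are generic placeholders rather than notation taken from the slides, $\sigma$ is the sigmoid, and $\odot$ denotes element-wise multiplication.

$$
\begin{aligned}
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) && \text{(candidate long-term memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(pointwise update of the long-term memory)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(new short-term memory)}
\end{aligned}
$$

The pointwise addition described on the Forget Gate slide corresponds to the $c_t$ update above.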
Gated Recurrent Units (GRU)
• A variant of the RNN architecture, introduced in 2014 by Cho et al.
• Uses gating mechanisms to control and manage the flow of
information between cells in the neural network.
Reset Gate
• The current input $x_t$ and the previous hidden state $h_{t-1}$ are multiplied by their respective weights and summed before the result is passed through a sigmoid function, producing the reset vector.
• $h_{t-1}$ is then multiplied by a trainable weight and undergoes an element-wise multiplication (Hadamard product) with the reset vector.
• This operation decides which information from the previous time steps is kept and combined with the new inputs.
Update Gate
• Both the Update and Reset gate vectors are created using the same formula, but the weights multiplied with the input and hidden state are unique to each gate.
• The purpose of the Update gate here is to help the model determine how much
of the past information stored in the previous hidden state needs to be retained
for the future.
Combining the outputs
• Finally, the previous hidden state and the candidate hidden state (built with the help of the Reset gate) are blended element-wise, with the mixing weights given by the Update gate vector.
• The purpose of this operation is for the Update gate to determine what portion of the new information should be stored in the hidden state.
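For reference, one common formulation of the full GRU cell consistent with the description above is sketched below; the weight names ($W_r, U_r, W_z, U_z, W_h, U_h$) are generic placeholders, and some presentations swap the roles of $z_t$ and $1 - z_t$ in the final blend.

$$
\begin{aligned}
r_t &= \sigma(W_r x_t + U_r h_{t-1}) && \text{(reset gate)} \\
z_t &= \sigma(W_z x_t + U_z h_{t-1}) && \text{(update gate)} \\
\tilde{h}_t &= \tanh\!\big(W_h x_t + r_t \odot (U_h h_{t-1})\big) && \text{(candidate hidden state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(combined output)}
\end{aligned}
$$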