Lecture 11
[Handwritten slide annotation: each unit u of a recurrent layer has a state h and a memory cell; in a seq2seq RNN, specifically an LSTM, gating values between 0 (off) and 1 (on), e.g. 0.5, control what is remembered through time.]
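To make the gating idea concrete, here is a minimal NumPy sketch of one LSTM cell update; the weight names (W_f, W_i, W_o, W_c), dimensions, and parameter dictionary are illustrative assumptions, not the lecture's notation. Sigmoid gates produce values between 0 and 1 that scale how much old memory is kept and how much new content is written.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, p):
    """One LSTM time step: sigmoid gates in [0, 1] decide what the memory cell keeps."""
    z = np.concatenate([h_prev, x_t])             # combined input [h_{t-1}; x_t]
    f = sigmoid(p["W_f"] @ z + p["b_f"])          # forget gate: how much old memory to keep
    i = sigmoid(p["W_i"] @ z + p["b_i"])          # input gate: how much new content to write
    o = sigmoid(p["W_o"] @ z + p["b_o"])          # output gate: how much memory to expose
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])    # candidate memory content
    c_t = f * c_prev + i * c_tilde                # e.g. f = 0.5 keeps half of the old memory
    h_t = o * np.tanh(c_t)                        # gated hidden state
    return h_t, c_t

# Tiny usage example with random (illustrative) parameters
d_h, d_x = 4, 3
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(d_h, d_h + d_x)) for k in ("W_f", "W_i", "W_o", "W_c")}
p.update({k: np.zeros(d_h) for k in ("b_f", "b_i", "b_o", "b_c")})
h, c = lstm_cell_step(rng.normal(size=d_x), np.zeros(d_h), np.zeros(d_h), p)
```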
Motivational Example
• Introduction to RNN:
• https://www.youtube.com/watch?v=LHXXI4-IEns (10 min)
• The vanishing gradient problem causes us to move away from plain RNNs toward LSTMs.
• Set the dimensionality of the parameter matrix V_j such that V_j h_j^t results in a vector of dimension c (# classes).
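As a sketch of that dimensionality point, assume hidden states of size d and c classes; the matrix name V and the example sizes below are illustrative, not taken from the lecture.

```python
import numpy as np

d, c = 64, 10                       # hidden-state size d, number of classes c (example values)
rng = np.random.default_rng(0)
V = rng.normal(size=(c, d)) * 0.01  # output parameter matrix of shape (c, d)

h_t = rng.normal(size=d)            # hidden state of the recurrent layer at time t
logits = V @ h_t                    # V h_t is a vector of dimension c: one score per class
probs = np.exp(logits) / np.exp(logits).sum()   # softmax over the c classes
print(probs.shape)                  # (10,)
```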
Unfolding
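Unfolding (unrolling) replicates the same recurrent cell once per time step, turning the whole sequence into a feed-forward graph that backpropagation through time can traverse. A minimal NumPy sketch of a vanilla RNN unrolled over T steps, with assumed weight names W, U, b:

```python
import numpy as np

def unroll_rnn(X, W, U, b, h0):
    """Unfold a vanilla RNN: apply the same cell (same W, U, b) at every time step."""
    h, states = h0, []
    for x_t in X:                              # X has shape (T, input_dim)
        h = np.tanh(W @ h + U @ x_t + b)       # h_t = tanh(W h_{t-1} + U x_t + b)
        states.append(h)
    return np.stack(states)                    # shape (T, hidden_dim)

rng = np.random.default_rng(0)
T, d_x, d_h = 5, 3, 4
H = unroll_rnn(rng.normal(size=(T, d_x)),
               rng.normal(size=(d_h, d_h)), rng.normal(size=(d_h, d_x)),
               np.zeros(d_h), np.zeros(d_h))
print(H.shape)                                 # (5, 4)
```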
LSTM and GRU
• Watch: https://www.youtube.com/watch?v=8HyCNIVRbSU (11min)
[Diagram: LSTM and GRU cells, showing input x_t, hidden states h_{t-1} and h_t, and (for the LSTM) cell states c_{t-1} and c_t.]
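For comparison with the LSTM step above, a minimal NumPy sketch of one GRU step (no separate cell state; an update gate z and a reset gate r). The weight names W_z, W_r, W_h and the parameter dictionary are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU time step: the update gate z blends the old state with a candidate state."""
    zx = np.concatenate([h_prev, x_t])
    z = sigmoid(p["W_z"] @ zx + p["b_z"])      # update gate
    r = sigmoid(p["W_r"] @ zx + p["b_r"])      # reset gate
    h_tilde = np.tanh(p["W_h"] @ np.concatenate([r * h_prev, x_t]) + p["b_h"])
    return (1 - z) * h_prev + z * h_tilde      # h_t: no separate cell state, unlike the LSTM
```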
Minimal Gated Recurrent Unit
Each “cell” is made up of multiple units (size ℓ). One unit shown here:
“Forget” or “Update” gate (true for both: the single gate controls both what is forgotten from the old state and what is updated with new content)
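A minimal NumPy sketch of one unit under a common MGU formulation (a single forget/update gate f); the weight names are illustrative assumptions, and the exact equations should be checked against the Heck & Salem reading listed below.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mgu_step(x_t, h_prev, p):
    """One MGU time step: a single gate f acts as both the forget and the update gate."""
    f = sigmoid(p["W_f"] @ np.concatenate([h_prev, x_t]) + p["b_f"])
    h_tilde = np.tanh(p["W_h"] @ np.concatenate([f * h_prev, x_t]) + p["b_h"])
    return (1 - f) * h_prev + f * h_tilde      # forget part of h_{t-1}, update with h_tilde
```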
Advanced RNN Architectures
• Other important extensions to RNNs include:
• A generalization of an RNN is a recursive neural network
• bi-directional RNNs
• RNNs with attention (see extended Ch6 material on course wiki)
• Attention-only networks = transformers…
• sequence-to-sequence RNN models.
• Frequently used to build neural machine translation models and other models for text to
text transformations.
• Will see this later in the textbook (section 7.7)…
• Combinations of CNN+LSTM
• Image Captioning
• Video:
• CNN on individual frames to extract feature vectors, LSTM for the time series (see the sketch after this list)
• Or look at 3D convolutions with a fixed time window
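A minimal PyTorch-style sketch of the CNN+LSTM idea for video: a small per-frame CNN encoder feeds an LSTM over time. The module names, layer sizes, and class structure are illustrative assumptions, not the lecture's code.

```python
import torch
import torch.nn as nn

class FrameCNN(nn.Module):
    """Tiny per-frame CNN mapping each RGB frame to a feature vector (illustrative)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                    # global pooling -> (B, 16, 1, 1)
        )
        self.fc = nn.Linear(16, feat_dim)

    def forward(self, x):                               # x: (B, 3, H, W)
        return self.fc(self.conv(x).flatten(1))

class CNNLSTM(nn.Module):
    """Run the CNN on each frame, then an LSTM over the resulting feature sequence."""
    def __init__(self, feat_dim=128, hidden=64, n_classes=10):
        super().__init__()
        self.cnn = FrameCNN(feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, video):                           # video: (B, T, 3, H, W)
        B, T = video.shape[:2]
        feats = self.cnn(video.flatten(0, 1)).view(B, T, -1)   # per-frame feature vectors
        _, (h_n, _) = self.lstm(feats)                  # LSTM over the time series
        return self.head(h_n[-1])                       # classify from the final hidden state
```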
Textbook Recommended Readings for RNN:
• An extended version of Chapter 6 with RNN unfolding, bidirectional RNN, and attention
• The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy (2015)
• Recurrent Neural Networks and LSTM by Niklas Donges (2018)
• Understanding LSTM Networks by Christopher Olah (2015)
• Introduction to RNNs by Denny Britz (2015)
• Implementing a RNN with Python, Numpy and Theano by Denny Britz (2015)
• Backpropagation Through Time and Vanishing Gradients by Denny Britz (2015)
• Implementing a GRU/LSTM RNN with Python and Theano by Denny Britz (2015)
• Simplified Minimal Gated Unit Variations for Recurrent Neural Networks by Joel Heck and
Fathi Salem (2017)
Transformers: “Attention is all you need”
Great 1-hour Transformers Tutorial
• Transformers (“Attention is all you need” 2017)
• Attention: watch ~11 mins from 10min mark: Transformers with Lucas Beyer
• “LSTM is Dead. Long Live Transformers” (2019)
• https://www.youtube.com/watch?v=S27pHKBEp30 (~45min)
• Warning: we won’t cover NLP for a few weeks…
• Code and pre-trained models: github.com/huggingface/transformers
• “The Illustrated Transformer”: http://jalammar.github.io/illustrated-transformer/
• Transformers replacing CNN for image analysis…
• “An Image Is Worth 16x16 Words: Transformers For Image Recognition At Scale”
https://arxiv.org/pdf/2010.11929.pdf (2021)
• ViT = Vision Transformer
• Break image into patches; flatten each patch into a vector; add positional information (where
did patch come from within image?); get ‘sequence’ of encoded patches; compute key, value,
query using linear layer; compute attention; MLP/FFNN; …
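A minimal PyTorch-style sketch of the ViT front end just described (break the image into patches, flatten each patch, linearly embed it, add positional information); the patch size, dimensions, and the class name PatchEmbed are illustrative assumptions, and the attention and MLP/FFNN blocks that follow are omitted.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into patches, flatten each patch, and embed it as a token."""
    def __init__(self, img_size=224, patch=16, in_ch=3, dim=192):
        super().__init__()
        self.n_patches = (img_size // patch) ** 2
        self.proj = nn.Linear(in_ch * patch * patch, dim)             # flatten -> linear embed
        self.pos = nn.Parameter(torch.zeros(1, self.n_patches, dim))  # positional information
        self.patch = patch

    def forward(self, x):                               # x: (B, 3, H, W)
        B, C, H, W = x.shape
        p = self.patch
        # Carve out non-overlapping p x p patches, then flatten each into one vector
        x = x.unfold(2, p, p).unfold(3, p, p)           # (B, C, H/p, W/p, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * p * p)
        return self.proj(x) + self.pos                  # 'sequence' of encoded patches

tokens = PatchEmbed()(torch.randn(2, 3, 224, 224))
print(tokens.shape)                                     # torch.Size([2, 196, 192])
```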