LSTMs

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) designed to address long-term dependency issues, utilizing hidden and cell states to manage information. It employs three gates—forget, input, and output—to control the flow of information in and out of the cell state, allowing for effective learning and retention of relevant data. Variants of LSTM include those with peephole connections and coupled gates for improved performance.


Long Short-Term Memory (LSTM)
• A type of RNN proposed by Hochreiter and Schmidhuber in 1997 as a solution to the long-term dependency problem

• On step t, there is a hidden state h(t) and a cell state c(t)

 The cell stores long-term information

 The LSTM can erase, write, and read information from the cell

• The selection of which information is erased/written/read is controlled by three corresponding gates (a minimal sketch of one step follows this list)

 On each time step, each element of a gate can be open (1), closed (0), or somewhere in between
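
To make the gating concrete, here is a minimal NumPy sketch of a single LSTM step. The function and parameter names (lstm_step, W_f, W_i, W_o, W_c and the biases) are illustrative assumptions, not from the slides:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_o, W_c, b_f, b_i, b_o, b_c):
        """One LSTM time step: the gates decide what to erase, write, and read."""
        z = np.concatenate([h_prev, x_t])    # stacked [h(t-1); x(t)]
        f_t = sigmoid(W_f @ z + b_f)         # forget gate: what to erase
        i_t = sigmoid(W_i @ z + b_i)         # input gate: what to write
        o_t = sigmoid(W_o @ z + b_o)         # output gate: what to read
        c_tilde = np.tanh(W_c @ z + b_c)     # candidate cell content
        c_t = f_t * c_prev + i_t * c_tilde   # erase old content, write new
        h_t = o_t * np.tanh(c_t)             # read: expose part of the cell
        return h_t, c_t

Each gate value lies in (0, 1), matching the idea of open (1), closed (0), or somewhere in between.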
LSTMs
 All RNNs have the form of a chain of repeating modules of a neural network

 The repeating module in a vanilla RNN is a single layer with a tanh activation (a minimal sketch follows)
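
For contrast, a minimal sketch of the vanilla RNN repeating module; W_xh, W_hh, and b_h are illustrative parameter names:

    import numpy as np

    def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
        """Vanilla RNN repeating module: a single tanh layer,
        with no cell state and no gates."""
        return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)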


LSTMs
Explanation of Each Cell
• Cell state (Ct):

 Information can flow along the cell state unchanged. Why is this important?

 The LSTM can remove or add information to the cell state, regulated by gates

• Gates:

 Composed of a sigmoid neural net layer and a pointwise multiplication operation (a sketch follows this list)

 The sigmoid layer outputs numbers between zero and one, describing how much of each component should be let through
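
In isolation, a gate is just this sigmoid-plus-multiply pattern; a minimal sketch, with illustrative names:

    import numpy as np

    def gate(W, b, z, signal):
        """Sigmoid layer followed by pointwise multiplication: each output
        in (0, 1) scales how much of that component of `signal` passes."""
        g = 1.0 / (1.0 + np.exp(-(W @ z + b)))
        return g * signal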
Explanation of Each Cell
• Forget gate: controls what is kept vs. what is forgotten from the previous cell state
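
In the standard formulation, the forget gate reads the previous hidden state and the current input: f(t) = σ(W_f · [h(t−1), x(t)] + b_f)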
Explanation of Each Cell
• Input gate: decides what new information will be written to the cell state

• Cell content: the new candidate content to be written to the cell
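
In the standard formulation: i(t) = σ(W_i · [h(t−1), x(t)] + b_i), and the candidate cell content is C̃(t) = tanh(W_C · [h(t−1), x(t)] + b_C)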


Explanation of Each Cell
• Cell state: erase ("forget") some content from the last cell state, and write ("input") some new cell content
Explanation of Each Cell
• Output gate: controls what parts of the cell are output to the hidden state
• Hidden state: read ("output") some content from the cell
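
In the standard formulation: o(t) = σ(W_o · [h(t−1), x(t)] + b_o), followed by h(t) = o(t) ⊙ tanh(C(t))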
LSTMs Explained
• What can you tell about the cell state C(t) if the forget gate is set to 1 and the input gate is set to 0?

 The information in that cell is preserved indefinitely.

• What happens if you fix the input gate to all 1s, the forget gate to all 0s, and the output gate to all 1s?

 You get almost a standard RNN. Why almost? Because a tanh is added where the hidden state is read from the cell. (A numeric check of the first question follows.)
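
A quick numeric check of the first question, using the standard cell-state update (the array values are arbitrary examples):

    import numpy as np

    c_prev = np.array([0.5, -1.2, 3.0])   # arbitrary previous cell state
    c_tilde = np.array([9.0, 9.0, 9.0])   # whatever new content is proposed
    f_t = np.ones_like(c_prev)            # forget gate set to 1: keep everything
    i_t = np.zeros_like(c_prev)           # input gate set to 0: write nothing
    c_t = f_t * c_prev + i_t * c_tilde
    assert np.allclose(c_t, c_prev)       # the cell state is preserved exactly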


LSTMs: How Do They Solve the Vanishing Gradient Problem?
• Gradient "highway": the gradient at C(t) is passed on to C(t−1) unaffected by any other operations, except for the forget gate. Why does this not matter?

• The forget gate is part of the design: it reduces the gradient where it should, and leaves it intact otherwise.
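
Concretely, from C(t) = f(t) ⊙ C(t−1) + i(t) ⊙ C̃(t), the direct path contributes ∂C(t)/∂C(t−1) = f(t); whenever the forget gate is near 1, the gradient flows backward through time essentially unchanged, instead of being repeatedly multiplied by weight matrices and tanh derivatives as in a vanilla RNN.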
Variants of LSTM
• LSTM with peephole connections
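
In this variant, the gate layers are also allowed to look at the cell state, e.g. f(t) = σ(W_f · [C(t−1), h(t−1), x(t)] + b_f)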
Variants of LSTM
Coupled forget and input gates

• Instead of separately deciding what to forget and what to add, make decisions together
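
A minimal sketch of the coupled cell-state update, reusing names from the earlier sketches: the separate input gate disappears, and new content is written exactly where old content is forgotten:

    import numpy as np

    def coupled_cell_update(f_t, c_prev, c_tilde):
        """Coupled forget/input gates: the input gate is tied to 1 - f_t,
        so new values are written only where old ones are erased."""
        return f_t * c_prev + (1.0 - f_t) * c_tilde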
