LSTM: A Search Space Odyssey: Klaus Greff, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber
Abstract—Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three

synthesis [10], protein secondary structure prediction [11], analysis of audio [12], and video data [13], among others.

The central idea behind the LSTM architecture is a memory cell which can maintain its state over time, and non-linear gating units which regulate the information flow into and out of the cell. Most modern studies incorporate many improvements that have been made to the LSTM architecture since its
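The memory cell and gating mechanism described above can be sketched as a single forward step of a vanilla LSTM cell. This is an illustrative NumPy sketch, not the exact formulation or any specific variant evaluated in the paper; the function name `lstm_step` and the joint weight matrix layout are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a vanilla LSTM cell (illustrative sketch).

    The cell state c carries information over time; the input (i),
    forget (f), and output (o) gates regulate the flow of information
    into and out of the cell.
    """
    n = h_prev.shape[0]
    # Joint pre-activations for all gates, computed from the current
    # input and the previous hidden state.
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * n:1 * n])   # input gate
    f = sigmoid(z[1 * n:2 * n])   # forget gate
    o = sigmoid(z[2 * n:3 * n])   # output gate
    g = np.tanh(z[3 * n:4 * n])   # candidate cell update
    c = f * c_prev + i * g        # cell state maintained over time
    h = o * np.tanh(c)            # gated output (hidden state)
    return h, c

# Tiny usage example with random weights (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
nx, nh = 3, 4
W = rng.standard_normal((4 * nh, nx + nh)) * 0.1
b = np.zeros(4 * nh)
h, c = lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, b)
```

Because the forget gate multiplies the previous cell state rather than overwriting it, gradients can flow through `c` across many time steps, which is the property the architecture was designed around.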
arXiv:1503.04069v2 [cs.NE] 4 Oct 2017