Deep Learning
1969: Marvin Minsky and Seymour Papert publish "Perceptrons," highlighting the limitations of single-layer
perceptrons, particularly their inability to solve the XOR problem (XOR is not linearly separable, so no
single-layer threshold unit can represent it). This leads to a decline in interest and funding for neural
network research.
The Backpropagation Breakthrough (1980s)
● Backpropagation is discovered and rediscovered several times
throughout the 1960s and 1970s.
● 1974: Paul Werbos introduces the backpropagation
algorithm in his Ph.D. thesis, providing an efficient
method for training multi-layer neural networks.
● 1986: David Rumelhart, Geoffrey Hinton, and Ronald
Williams popularize backpropagation with their paper,
"Learning representations by back-propagating errors."
This reignites interest in neural networks (a minimal
sketch of the algorithm follows).
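To make the idea concrete, here is a minimal illustrative sketch of backpropagation: a tiny two-layer network trained by gradient descent on XOR, the very problem a single-layer perceptron cannot solve. The layer sizes, learning rate, and iteration count are arbitrary choices for the example, not taken from the 1986 paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Tiny 2-4-1 network trained on XOR with plain backpropagation.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
    W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)
    lr = 0.5

    for _ in range(10000):
        # Forward pass
        h = sigmoid(X @ W1 + b1)               # hidden activations
        p = sigmoid(h @ W2 + b2)               # network outputs
        # Backward pass: propagate the squared-error gradient layer by layer
        d_out = (p - y) * p * (1 - p)          # error signal at the output layer
        d_hid = (d_out @ W2.T) * h * (1 - h)   # error signal at the hidden layer
        # Gradient-descent weight updates
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_hid; b1 -= lr * d_hid.sum(axis=0)

    print(p.round(2))   # should approach [[0], [1], [1], [0]]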
Gradient Descent
Cauchy introduces gradient descent in 1847, motivated by the need
to compute the orbits of heavenly bodies.
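As a purely illustrative reminder of what the method does, the sketch below repeatedly steps against the gradient of a simple quadratic; the objective, learning rate, and iteration count are arbitrary choices for the example.

    import numpy as np

    # Gradient descent on f(x, y) = (x - 3)^2 + (y + 1)^2, minimized at (3, -1).
    def grad_f(p):
        return np.array([2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)])

    p = np.zeros(2)              # starting point
    lr = 0.1                     # learning rate (step size)
    for _ in range(100):
        p -= lr * grad_f(p)      # move a small step against the gradient

    print(p)                     # approaches [3, -1]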
Universal Approximation Theorem
A feedforward network of neurons with a single
hidden layer and a suitable nonlinear activation can
approximate any continuous function on a compact
domain to any desired precision.
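Stated a little more formally (in the sigmoidal form due to Cybenko, 1989; Hornik, 1991, extends it to broader classes of activations): for any continuous f on a compact set K ⊂ R^n and any ε > 0, there exist a width N, vectors w_i, and scalars v_i, b_i such that |f(x) − Σ_{i=1..N} v_i σ(w_i · x + b_i)| < ε for every x in K. The theorem guarantees existence only: it says nothing about how large N must be or whether gradient-based training will actually find such weights.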
Unsupervised Pre-Training
● The idea of unsupervised pre-training actually dates back to
1991-1993 (J. Schmidhuber), when it was used to train a "Very Deep
Learner".
● In particular, Schmidhuber and his collaborators developed methods
for training deep learning models without requiring labeled data,
which paved the way for modern unsupervised learning techniques.
More Insights (2007-2009)
● Further investigations into the effectiveness of Unsupervised Pre-training.
● Greedy Layer-Wise Training of Deep Networks (the layer-wise strategy is sketched after this list).
● Why does Unsupervised Pre-training Help Deep Learning?
● Exploring Strategies for Training Deep Neural Networks
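The greedy layer-wise idea referenced above can be sketched in a few lines. The toy example below pre-trains a stack of one-hidden-layer autoencoders (autoencoders as in Bengio et al.; Hinton's Deep Belief Networks use RBMs instead, sketched later in this section). The data, layer widths, learning rate, tied weights, and epoch count are all arbitrary choices for illustration, not taken from the papers.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_autoencoder(X, n_hidden, lr=0.1, epochs=200):
        """Fit a one-hidden-layer autoencoder to X (tied weights, squared error)."""
        n_in = X.shape[1]
        W = rng.normal(0.0, 0.1, (n_in, n_hidden))
        b = np.zeros(n_hidden)     # encoder bias
        c = np.zeros(n_in)         # decoder bias
        for _ in range(epochs):
            H = sigmoid(X @ W + b)                 # encode
            R = H @ W.T + c                        # decode (linear, tied weights)
            err = R - X                            # reconstruction error
            dW = (X.T @ (err @ W * H * (1 - H)) + err.T @ H) / len(X)
            db = (err @ W * H * (1 - H)).mean(axis=0)
            dc = err.mean(axis=0)
            W -= lr * dW; b -= lr * db; c -= lr * dc
        return W, b

    # Greedy layer-wise pre-training: train each layer as an autoencoder on the
    # representation produced by the layers below it, then stack the encoders.
    X = rng.random((256, 20))                      # toy unlabeled data
    weights, inputs = [], X
    for n_hidden in (16, 8):                       # illustrative hidden widths
        W, b = train_autoencoder(inputs, n_hidden)
        weights.append((W, b))
        inputs = sigmoid(inputs @ W + b)           # features for the next layer

    # `weights` would now initialize a deep network before supervised fine-tuning.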
Neural Networks in Practice: 1990s
● 1990s: Neural networks begin to show practical success in
various applications, such as speech recognition,
handwriting recognition, and simple pattern recognition
tasks.
● 1997: Sepp Hochreiter and Jürgen Schmidhuber propose
the Long Short-Term Memory (LSTM) network,
addressing the issue of vanishing gradients in training
recurrent neural networks (RNNs). LSTMs become a key
component for sequential data tasks, such as language
modeling and time-series prediction (a minimal sketch of
one LSTM step follows).
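To make the mechanism concrete, here is a minimal sketch of a single LSTM step using the standard gate equations. The dimensions and random weights are placeholders; practical implementations (in libraries such as PyTorch or TensorFlow) add batching, full sequences, and training.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, b):
        """One LSTM step: gates decide what to forget, what to write, and what
        to expose. W maps [h_prev, x] to the four gate pre-activations."""
        z = W @ np.concatenate([h_prev, x]) + b
        n = len(h_prev)
        f = sigmoid(z[0:n])          # forget gate
        i = sigmoid(z[n:2*n])        # input gate
        o = sigmoid(z[2*n:3*n])      # output gate
        g = np.tanh(z[3*n:4*n])      # candidate cell update
        c = f * c_prev + i * g       # additive cell path eases gradient flow
        h = o * np.tanh(c)           # hidden state passed onward
        return h, c

    # Toy usage with placeholder sizes and random weights (illustrative only).
    rng = np.random.default_rng(0)
    n_in, n_hidden = 4, 3
    W = rng.normal(0.0, 0.1, (4 * n_hidden, n_hidden + n_in))
    b = np.zeros(4 * n_hidden)
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    for x in rng.random((5, n_in)):  # a short toy input sequence
        h, c = lstm_step(x, h, c, W, b)
    print(h)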
2000s: Foundations for Modern Deep Learning
2006: Geoffrey Hinton, Simon Osindero, and Yee-Whye
Teh introduce the concept of Deep Belief Networks
(DBNs), demonstrating that deep neural networks can
be pre-trained layer by layer using unsupervised
learning. This helps to initialize deep networks
effectively.
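DBNs are built from Restricted Boltzmann Machines (RBMs) trained one layer at a time; each RBM's hidden activations become the training data for the next, much as in the layer-wise autoencoder sketch earlier. The sketch below shows the core contrastive-divergence (CD-1) update for a single binary RBM; the data, sizes, learning rate, and epoch count are illustrative placeholders, not values from the 2006 paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_rbm(V, n_hidden, lr=0.05, epochs=50):
        """Train one binary RBM with CD-1 on data V (rows are binary vectors)."""
        n_visible = V.shape[1]
        W = rng.normal(0.0, 0.1, (n_visible, n_hidden))
        a = np.zeros(n_visible)    # visible biases
        b = np.zeros(n_hidden)     # hidden biases
        for _ in range(epochs):
            # Positive phase: hidden probabilities given the data
            ph = sigmoid(V @ W + b)
            h = (rng.random(ph.shape) < ph).astype(float)   # sample hidden units
            # Negative phase: one Gibbs step back down and up again
            pv = sigmoid(h @ W.T + a)                       # reconstruction
            ph2 = sigmoid(pv @ W + b)
            # CD-1 update: data statistics minus reconstruction statistics
            W += lr * (V.T @ ph - pv.T @ ph2) / len(V)
            a += lr * (V - pv).mean(axis=0)
            b += lr * (ph - ph2).mean(axis=0)
        return W, b

    # Toy usage: one RBM layer on random binary data (illustrative only).
    data = (rng.random((200, 12)) < 0.5).astype(float)
    W, b = train_rbm(data, n_hidden=8)
    features = sigmoid(data @ W + b)   # these would feed the next RBM in a DBN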