01 Intro
• In its early days, the field of AI rapidly tackled and solved
problems that are intellectually difficult for human beings but
relatively straightforward for computers
q Problems that can be described by a list of formal,
mathematical rules
(Goodfellow 2016)
Deep learning
• Learn from experience
• understand the world in terms of a hierarchy of
concepts
q each concept defined in terms of its relation to
simpler concepts.
• If we draw a graph showing how these concepts are
built on top of each other, the graph is deep, with
many layers.
q For this reason, we call this approach to AI deep
learning.
Computer vs. Human
• IBM’s Deep Blue chess-playing system
defeated world champion Garry Kasparov in
1997
q Chess is of course a very simple world!
Previous AI approaches
• Knowledge base: A computer can reason about
statements in a formal language automatically using
logical inference rules.
• Cyc failed to understand a story about a person
named Fred shaving in the morning
q people do not have electrical parts,
q but Fred was holding an electric razor
q “FredWhileShaving” contained electrical parts.
q Cyc asked whether Fred was still a person while
he was shaving!
Machine learning
• Extracting patterns from raw data.
q A simple machine learning algorithm called
logistic regression can determine whether to
recommend cesarean delivery
q A simple machine learning algorithm called naive
Bayes can separate legitimate e-mail from spam
e-mail.
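As a rough illustration of extracting patterns from raw data, the following is a minimal sketch of a naive Bayes spam filter. The training messages, the whitespace tokenizer, the add-one smoothing, and the function names are all assumptions made for this example; a real filter would need far more data and care.

```python
import math
from collections import Counter

# Hypothetical toy training set: (message, label) pairs.
TRAINING = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("win a free prize now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch at noon", "ham"),
    ("project meeting notes", "ham"),
]

def train_naive_bayes(examples):
    """Count how often each label and each (label, word) pair occurs."""
    label_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in label_counts}
    for text, label in examples:
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return label_counts, word_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability score."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Log prior for the label.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothed log likelihood of each known word.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAINING)
print(classify("free money", model))
print(classify("noon meeting", model))
```

The "naive" part is the assumption that words are independent given the label, which is why the per-word log likelihoods are simply summed.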
Representations Matter
Autoencoders
• An autoencoder is the combination of an encoder
function that converts the input data into a different
representation, and a decoder function that converts
the new representation back into the original format.
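The encoder/decoder pairing can be sketched with a hand-built toy example. This is only an illustration under stated assumptions: the 2-D data is assumed to lie near the line y = 2x, and the encoder and decoder are fixed by hand rather than learned, whereas a real autoencoder learns both from data. The names `DIRECTION`, `encode`, `decode`, and `reconstruction_error` are hypothetical.

```python
import math

# Hypothetical unit vector along which the toy 2-D data varies (the line y = 2x).
_NORM = math.sqrt(1.0 ** 2 + 2.0 ** 2)
DIRECTION = (1.0 / _NORM, 2.0 / _NORM)

def encode(point):
    """Encoder: map a 2-D input to a 1-D code (its coordinate along DIRECTION)."""
    return point[0] * DIRECTION[0] + point[1] * DIRECTION[1]

def decode(code):
    """Decoder: map the 1-D code back into the original 2-D format."""
    return (code * DIRECTION[0], code * DIRECTION[1])

def reconstruction_error(point):
    """Distance between an input and its decoded reconstruction."""
    rebuilt = decode(encode(point))
    return math.hypot(point[0] - rebuilt[0], point[1] - rebuilt[1])

print(reconstruction_error((1.0, 2.0)))   # a point on the assumed line
print(reconstruction_error((2.0, -1.0)))  # a point far from the line
```

Points that match the structure the code was built for are reconstructed almost exactly, while points that do not match it lose information. That is the sense in which the code acts as a representation of the data.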
Factors of variation
• When analyzing an image of a car, the factors of
variation include the position of the car, its color,
and the angle and brightness of the sun.
Depth: Repeated Composition
Multi-step computer program
• Another perspective is that depth allows the
computer to learn a multi-step computer program.
• Each layer of the representation can be thought of
as the state of the computer’s memory after
executing another set of instructions in parallel
• Networks with greater depth can execute more
instructions in sequence.
• Sequential instructions offer great power because
later instructions can refer back to the results of
earlier instructions.
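The "multi-step program" view can be sketched as a list of layers applied in sequence, where each intermediate result plays the role of the memory state after one step. The layer weights and the helper names `affine_relu` and `run_layers` are hypothetical choices for this illustration, not values from the source.

```python
def affine_relu(weights, biases):
    """Build one layer: a weighted sum per output unit, followed by max(0, .)."""
    def layer(inputs):
        return [
            max(0.0, sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)
        ]
    return layer

def run_layers(layers, x):
    """Apply layers one after another, recording the state after every step."""
    states = [x]
    for layer in layers:
        # Each layer reads the result produced by the previous layer.
        states.append(layer(states[-1]))
    return states

# Two hypothetical layers: the second consumes what the first computed.
layers = [
    affine_relu([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
    affine_relu([[1.0, 1.0]], [0.0]),
]
states = run_layers(layers, [2.0, 1.0])
print(states)
```

Within one layer, every unit is computed independently (the "parallel" part). Across layers the computation is strictly sequential, so adding layers lengthens the program that can be expressed.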
Computational Graphs
Logistic regression: $p(y = 1 \mid x; \theta) = \sigma(\theta^\top x)$.
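Read as a computational graph, this logistic regression model has two nodes: a dot product of the parameters with the input, followed by the logistic sigmoid. A minimal sketch, where the function names and the example parameter values are assumptions made for illustration:

```python
import math

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, theta):
    """Return sigma(theta^T x), the modeled probability that y = 1."""
    z = sum(t * v for t, v in zip(theta, x))  # node 1: dot product
    return sigmoid(z)                         # node 2: sigmoid

theta = [1.5, -2.0]  # hypothetical parameters, not fitted to any data
print(predict_proba([1.0, 0.0], theta))
print(predict_proba([0.0, 1.0], theta))
```

A positive weight pushes the probability above 0.5 when its feature is present, and a negative weight pushes it below.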
Deep learning vs. machine
learning
• Deep learning is:
q An approach to AI
q A type of machine learning
q a technique that allows computer systems to
improve with experience and data
q can safely be regarded as the study of models
that either involve a greater amount of
composition of learned functions or learned
concepts than traditional machine learning does.
Machine Learning and AI
• Machine learning can operate in complicated,
real-world environments
• Deep learning is a particular kind of machine
learning that achieves great power and flexibility
Organization of the Book
Figure 1.6
Who should take this
course?
• University students (undergraduate or graduate)
q If you want to begin a career in deep learning and
artificial intelligence research
q If you want to work as software engineer and want to
rapidly acquire machine learning background and
begin using deep learning in your product or platform.
• Applications:
q computer vision, speech and audio processing,
natural language processing, robotics, bioinformatics
and chemistry, video games, search engines, online
advertising and finance.
Prerequisites
• We do assume that all readers come from a
computer science background.
Deep learning history
• DL has had a long and rich history, but has gone by
many names reflecting different philosophical
viewpoints, and has waxed and waned in popularity.
• DL has become more useful as the amount of
available training data has increased.
• DL models have grown in size over time as
computer infrastructure (both hardware and
software) for deep learning has improved.
• DL has solved increasingly complicated applications
with increasing accuracy over time.
History
• Three waves of development of deep learning:
q Cybernetics in the 1940s–1960s
q Connectionism in the 1980s–1990s
q Deep learning starting 2006
Historical Waves
$f(x, w) = x_1 w_1 + \cdots + x_n w_n$
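The weights of a linear model of this form can be adapted from examples. The sketch below uses a per-example least-mean-squares update, which is one simple form of stochastic gradient descent on squared error. The synthetic data, the target weights, the learning rate, and the function names are assumptions chosen for illustration.

```python
import random

def f(x, w):
    """Linear model: f(x, w) = x_1 * w_1 + ... + x_n * w_n."""
    return sum(xi * wi for xi, wi in zip(x, w))

def sgd_fit(data, n_features, lr=0.1, epochs=50, seed=0):
    """Fit w by updating it after every single (x, y) example."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    order = list(data)
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order
        for x, y in order:
            error = y - f(x, w)
            # Step each weight in the direction that reduces the squared error.
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical noise-free data generated from known weights.
TRUE_W = [2.0, -3.0]
data_rng = random.Random(1)
inputs = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(20)]
data = [(x, f(x, TRUE_W)) for x in inputs]

learned_w = sgd_fit(data, n_features=2)
print(learned_w)
```

Because the toy targets are produced by a linear function with no noise, the learned weights should approach `TRUE_W`. On real data the fit would only be approximate.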
Linear models
(e.g. Perceptron, Adaline)
• The training algorithm used to adapt the weights of
the ADALINE was a special case of an algorithm
called stochastic gradient descent.
• Linear models have many limitations. Most
famously, they cannot learn the XOR function,
where 𝑓 ([0, 1], 𝑤) = 1 and 𝑓([1, 0], 𝑤) =
1 but 𝑓 ([1, 1], 𝑤) = 0 and 𝑓 ([0, 0], 𝑤) = 0.
• Critics who observed these flaws in linear models
caused a backlash against biologically inspired
learning in general (Minsky and Papert, 1969).
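The XOR limitation can be checked numerically. The sketch below searches a coarse grid of weights for a thresholded linear unit that reproduces a given truth table. The bias term `b`, the 0.5 threshold, and the grid itself are assumptions added for this illustration; the search finds settings for AND but none for XOR.

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def f(x, w):
    """Linear model: weighted sum of the inputs."""
    return sum(xi * wi for xi, wi in zip(x, w))

def reproduces(table, w, b):
    """True if thresholding f(x, w) + b at 0.5 matches every row of the table."""
    return all((f(x, w) + b > 0.5) == bool(y) for x, y in table.items())

def find_linear_fits(table):
    """Try every (w1, w2, b) on a grid from -2 to 2 in steps of 0.25."""
    grid = [i / 4 for i in range(-8, 9)]
    return [
        (w1, w2, b)
        for w1, w2, b in product(grid, repeat=3)
        if reproduces(table, (w1, w2), b)
    ]

xor_fits = find_linear_fits(XOR)
and_fits = find_linear_fits(AND)
print(len(xor_fits), len(and_fits))
```

The empty result for XOR is not an artifact of the grid. The two inputs that must be classified as 1 force w1 + b and w2 + b above the threshold, which pushes w1 + w2 + b above it as well whenever b is at or below the threshold, contradicting the required output for [1, 1].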
Neuroscience
• Neuroscience has given us a reason to hope that a
single deep learning algorithm can solve many
different tasks.
Connectionism
• Distributed representation (Hinton et al., 1986)
q Each input to a system should be represented by
many features,
q and each feature should be involved in the
representation of many possible inputs.
q Example: shape vs. color
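The shape-versus-color example can be made concrete with a small sketch. The specific colors and shapes and the function names are illustrative assumptions. A distributed code uses one feature per color plus one per shape, while a non-distributed (local) code needs one feature per combination.

```python
COLORS = ["red", "green", "blue"]  # hypothetical color values
SHAPES = ["car", "truck", "bird"]  # hypothetical shape values

def one_hot(value, vocabulary):
    """Vector with a single 1 at the position of value in vocabulary."""
    return [1 if value == item else 0 for item in vocabulary]

def distributed_code(color, shape):
    """Describe an input by separate color and shape features."""
    return one_hot(color, COLORS) + one_hot(shape, SHAPES)

def local_code(color, shape):
    """Describe an input by one dedicated feature per (color, shape) pair."""
    combos = [(c, s) for c in COLORS for s in SHAPES]
    return one_hot((color, shape), combos)

print(distributed_code("red", "truck"))
print(local_code("red", "truck"))
```

In the distributed code the "red" feature is shared by every red object, so anything learned about redness from one shape carries over to the others. The feature count also grows as a sum (3 + 3) rather than a product (3 x 3).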
Second winter
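• In the mid-1990s, ventures based on neural networks and other
AI technologies made unrealistically ambitious claims while
seeking investments.
q When AI research did not fulfill these expectations,
investors were disappointed.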
Third wave
• Researchers showed that they were able to train deeper
neural networks than had been possible before, and
focused attention on the theoretical importance of depth
Historical Trends: Growing
Datasets
Connections per Neuron
Solving Object Recognition
Other applications
• Recurrent neural networks, such as the LSTM sequence
model, are now used to model relationships between
sequences and other sequences rather than just fixed
inputs.
• In the context of reinforcement learning, an autonomous
agent must learn to perform a task by trial and error,
without any guidance from the human operator.
q DeepMind demonstrated that a deep reinforcement
learning system is capable of learning to play Atari
video games, reaching human-level performance
q Deep learning has also significantly improved the
performance of reinforcement learning for robotics
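Learning by trial and error can be illustrated with something far simpler than Atari games or robotics. The sketch below is a hypothetical two-action problem in which an epsilon-greedy agent estimates the average reward of each action purely from its own attempts. The reward probabilities, the exploration rate, and the function name are assumptions made for this toy example, and no deep network is involved.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best estimate, sometimes explore."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # estimated average reward per action
    counts = [0] * len(reward_probs)    # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore at random
        else:
            action = max(range(len(values)), key=values.__getitem__)  # exploit
        # The environment returns a reward; the agent is never told the answer.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the running average reward for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

values, counts = run_bandit([0.2, 0.8])
print(values, counts)
```

The agent receives only rewards, never the correct action, which is the defining feature of the trial-and-error setting described above.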
Companies and tools
• Google, Microsoft, Facebook, IBM, Baidu,
Apple, Adobe, Netflix, NVIDIA and NEC.
• Geoffrey Hinton
• Yoshua Bengio
• Reinforcement Learning
q David Silver: Introduction to Reinforcement Learning
q OpenAI: Spinning Up in Deep RL
Summary
• Deep learning is an approach to machine learning that has
drawn heavily on our knowledge of the human brain,
statistics and applied math as it developed over the past
several decades.
• In recent years, it has seen tremendous growth in its
popularity and usefulness, due in large part to more
q powerful computers,
q larger datasets and
q techniques to train deeper networks.
Watch
• https://www.youtube.com/watch?v=vi7lACKOUao
• https://www.youtube.com/watch?v=0VH1Lim8gL8