
Lecture 19: Deep Learning

CS221 / Autumn 2015 / Liang


Google Trends
Query: deep learning

CS221 / Autumn 2015 / Liang 1


[figure from Li Deng]

Speech recognition (2009-2011)

• Steep drop in word error rate (WER) due to deep learning


• IBM, Google, Microsoft all switched over from GMM-HMM
CS221 / Autumn 2015 / Liang 2
[Krizhevsky et al., 2012, a.k.a. AlexNet]

Object recognition (2012)

• Landslide win in ImageNet competition


• Computer vision community switched to CNNs

CS221 / Autumn 2015 / Liang 3


[figure from Honglak Lee]

What is deep learning?


Philosophy: learn high-level abstractions automatically

CS221 / Autumn 2015 / Liang 4


A brief history
• 1950-60s: modeling brain using neural networks (Rosenblatt,
Hebb, etc.)
• 1969: research stagnated after Minsky and Papert’s paper
• 1986: popularization of backpropagation by Rumelhart, Hinton,
Williams
• 1990s: convolutional neural networks (LeCun)
• 1990s: recurrent neural networks (Schmidhuber)
• 2006: revival of deep networks (Hinton et al.)
• 2013-: massive industrial interest
Key problem: it was difficult to get the training of multi-layer neural networks to work!

CS221 / Autumn 2015 / Liang 5


What’s different today
Computation (time/memory) Information (data)

Deep learning is fundamentally empirical

CS221 / Autumn 2015 / Liang 6


Roadmap

Supervised learning

Unsupervised learning

Convolutional neural networks

Recurrent neural networks

Final remarks

CS221 / Autumn 2015 / Liang 7


Framework
Dtrain → Learner → f, where f maps an input x to a prediction

The Learner consists of an optimization problem and an optimization algorithm

CS221 / Autumn 2015 / Liang 8


Review: optimization
Regression:
Loss(x, y, θ) = (fθ(x) − y)²
Key idea: minimize training loss
TrainLoss(θ) = (1/|Dtrain|) Σ_{(x,y)∈Dtrain} Loss(x, y, θ)

min_{θ∈ℝᵈ} TrainLoss(θ)

Algorithm: stochastic gradient descent

For t = 1, . . . , T :
For (x, y) ∈ Dtrain :
θ ← θ − ηt ∇θ Loss(x, y, θ)

CS221 / Autumn 2015 / Liang 9
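A minimal NumPy sketch of this SGD loop for the squared loss with a linear predictor fθ(x) = w · x (not from the lecture; the synthetic data, step size, and number of passes are illustrative assumptions):

import numpy as np

# Synthetic regression data (illustrative assumption): y = x . w_true + noise
rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Stochastic gradient descent on the squared loss for f_theta(x) = w . x
w = np.zeros(d)
eta = 0.1                        # step size (assumed constant here)
for t in range(20):              # T passes over the training set
    for x_i, y_i in zip(X, y):
        residual = w @ x_i - y_i          # f_theta(x) - y
        grad = 2 * residual * x_i         # gradient of (f_theta(x) - y)^2
        w = w - eta * grad

print("learned w:", w)           # should be close to w_true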


Review: linear predictors
[diagram: inputs x1, x2, x3 connected by weights w to the output fθ(x)]

Output:

fθ (x) = w · x

Parameters: θ = w

CS221 / Autumn 2015 / Liang 10


Review: neural networks
[diagram: inputs x1, x2, x3 feed through weights V and σ into hidden units h1, h2, which connect by weights w to the output fθ(x)]

Intermediate hidden units:
hj(x) = σ(vj · x),  where σ(z) = 1/(1 + e^(−z))
Output:
fθ (x) = w · h(x)
Parameters: θ = (V, w)

CS221 / Autumn 2015 / Liang 11


Summary so far
Neural network predictor: fθ (x) = w · σ(Vx)

Squared loss: Loss(x, y, θ) = (fθ (x) − y)2

Next step: compute the gradient ∇θ Loss(x, y, θ)

CS221 / Autumn 2015 / Liang 12


Basic building blocks
a + b:  ∂/∂a = 1,  ∂/∂b = 1
a − b:  ∂/∂a = 1,  ∂/∂b = −1
a · b:  ∂/∂a = b,  ∂/∂b = a
max(a, b):  ∂/∂a = 1[a > b],  ∂/∂b = 1[a < b]
σ(a):  ∂/∂a = σ(a)(1 − σ(a))

CS221 / Autumn 2015 / Liang 13


Composing functions
in → function1 → mid → function2 → out

Chain rule:
∂out/∂in = (∂out/∂mid) · (∂mid/∂in)

CS221 / Autumn 2015 / Liang 14


Computing the gradient
 2
k
X
Loss(x, y, w) =  wj σ(vj · φ(x)) − y 
j=1

Assume labels {1, 2, 3} and correct label is y = 1


[computation graph: the squared loss expands into a tree — the residual Σ_j wj hj − y at the top, products wj · hj below, and hj = σ(vj · φ(x)) at the leaves; each edge is labeled with its local gradient, e.g. 2 · residual, hj(1 − hj), φ(x)]

CS221 / Autumn 2015 / Liang 15
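The following NumPy sketch walks the chain rule through a two-layer network w · σ(Vφ(x)) with the squared loss, mirroring the computation graph above; the toy dimensions and random data are assumptions for illustration, and a finite-difference check is included:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy sizes and data (assumed): 3 features, k = 3 hidden units
rng = np.random.default_rng(0)
phi_x = rng.normal(size=3)            # phi(x)
y = 1.0
V = rng.normal(size=(3, 3))           # rows are v_j
w = rng.normal(size=3)

# Forward pass
h = sigma(V @ phi_x)                  # h_j = sigma(v_j . phi(x))
pred = w @ h                          # f_theta(x)
residual = pred - y
loss = residual ** 2

# Backward pass (chain rule, node by node as in the graph)
d_pred = 2 * residual                 # d loss / d pred
grad_w = d_pred * h                   # d pred / d w_j = h_j
d_h = d_pred * w                      # d pred / d h_j = w_j
d_z = d_h * h * (1 - h)               # sigma'(z) = sigma(z)(1 - sigma(z))
grad_V = np.outer(d_z, phi_x)         # d z_j / d v_j = phi(x)

# Numerical check of one entry (finite differences)
eps = 1e-6
V2 = V.copy(); V2[0, 0] += eps
loss2 = (w @ sigma(V2 @ phi_x) - y) ** 2
print(grad_V[0, 0], (loss2 - loss) / eps)   # should roughly agree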


[Andrej Karpathy’s demo]

CS221 / Autumn 2015 / Liang 16


Deep neural networks
1-layer neural network:
score = w · x

2-layer neural network:
score = w · σ(Vx)

3-layer neural network:
score = w · σ(Uσ(Vx))

...

CS221 / Autumn 2015 / Liang 17
Depth
x → h → h′ → h″ → h‴ → fθ(x)

Intuitions:
• Hierarchical feature representations
• Can simulate a bounded-computation logic circuit (original motivation from McCulloch/Pitts, 1943)
• Learn this computation (and potentially more, because networks are real-valued)
• Depth k + 1 logic circuits can represent more than depth k (counting argument)
• Formal theory/understanding is still incomplete

CS221 / Autumn 2015 / Liang 18


Supervised learning
• Construct deep neural networks by composing non-linearities (σ) and linear transformations (matrix multiplication)

• Train via SGD, using backpropagation to compute gradients

• Non-convex optimization, but works empirically given enough compute and data

CS221 / Autumn 2015 / Liang 19


Roadmap

Supervised learning

Unsupervised learning

Convolutional neural networks

Recurrent neural networks

Final remarks

CS221 / Autumn 2015 / Liang 20


Motivation
• Deep neural networks require lots of data

• Sometimes there is not very much labeled data, but plenty of unlabeled data (text, images, videos)

• Humans rarely get direct supervision; can we learn from raw sensory information?

CS221 / Autumn 2015 / Liang 21


Autoencoders
Analogy:
AAAABBBBB → "4 A's, 5 B's" → AAAABBBBB

Key idea: autoencoders

If we can compress a data point and still reconstruct it, then we have learned something generally useful.

General framework:
x → Encode → h → Decode → x̂

minimize ‖x − x̂‖²

CS221 / Autumn 2015 / Liang 22


Principal component analysis

Input: points x1, . . . , xn

Encode(x) = U⊤x
Decode(h) = Uh

(assume the xi's are mean zero and U is orthogonal)

PCA objective:
minimize Σ_{i=1}^n ‖xi − Decode(Encode(xi))‖²
CS221 / Autumn 2015 / Liang 23
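A small sketch of PCA as an autoencoder, using NumPy's SVD to obtain an orthogonal U; the random data and the choice d = 2 are assumptions:

import numpy as np

# Toy data (assumed): n mean-zero points in R^5, keep d = 2 components
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)                # assume the x_i are mean zero

d = 2
# Columns of U are the top principal directions (orthogonal)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
U = Vt[:d].T                          # shape (5, d)

def encode(x):
    return U.T @ x                    # h = U^T x

def decode(h):
    return U @ h                      # x_hat = U h

recon_error = sum(np.sum((x - decode(encode(x))) ** 2) for x in X)
print("PCA reconstruction error:", recon_error)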
Autoencoders
What if we increase the dimensionality of the hidden layer?
x → Encode → h → Decode → x̂

• Problem: we learn nothing, since Encode and Decode can just be set to the identity function!

• Need to control the complexity of Encode and Decode somehow...

CS221 / Autumn 2015 / Liang 24


Non-linear autoencoders
Non-linear transformation (e.g., logistic function):

Encode(x) = σ(Wx + b)
Decode(h) = σ(W′h + b′)

Loss function:
minimize ‖x − Decode(Encode(x))‖²
Key: non-linearity makes life harder, prevents degeneracy

CS221 / Autumn 2015 / Liang 25


Denoising autoencoders
Corrupt(x) → Encode → h → Decode → x̂

Types of noise:
• Blankout: Corrupt([1, 2, 3, 4]) = [0, 2, 3, 0]
• Gaussian: Corrupt([1, 2, 3, 4]) = [1.1, 1.9, 3.3, 4.2]
Objective:
minimize ‖x − Decode(Encode(Corrupt(x)))‖²
Algorithm: pick example, add fresh noise, SGD update
Key: noise makes life harder, prevents degeneracy

CS221 / Autumn 2015 / Liang 26
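A sketch of this training loop for a one-hidden-layer denoising autoencoder with blankout noise, written in plain NumPy; the sizes, step size, and corruption probability are assumptions:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrupt(x, rng, p=0.25):
    # Blankout noise: zero each coordinate independently with probability p
    return x * (rng.random(x.shape) > p)

# Assumed sizes: 8-dimensional inputs in [0, 1], 4 hidden units
rng = np.random.default_rng(0)
X = rng.random((200, 8))
W, b = 0.1 * rng.normal(size=(4, 8)), np.zeros(4)     # encoder parameters
Wp, bp = 0.1 * rng.normal(size=(8, 4)), np.zeros(8)   # decoder parameters
eta = 0.5

for t in range(10):
    for x in X:
        x_tilde = corrupt(x, rng)                 # add fresh noise each time
        h = sigma(W @ x_tilde + b)                # Encode(Corrupt(x))
        x_hat = sigma(Wp @ h + bp)                # Decode(h)
        # SGD step on ||x - x_hat||^2 via the chain rule
        d_xhat = -2 * (x - x_hat) * x_hat * (1 - x_hat)
        d_h = (Wp.T @ d_xhat) * h * (1 - h)
        Wp -= eta * np.outer(d_xhat, h); bp -= eta * d_xhat
        W  -= eta * np.outer(d_h, x_tilde); b  -= eta * d_h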


[Figure 7 of Vincent et al. (2010)]

Denoising autoencoders
MNIST: 60,000 images of digits (784 dimensions)

200 learned filters (rows of W ):

CS221 / Autumn 2015 / Liang 27


Stacked denoising autoencoders
Goal: learn hierarchical features
Train first layer:
Corrupt(x) → Encode → h → Decode → x̃

Train second layer:
Corrupt(h) → Encode′ → h′ → Decode′ → h̃

...
Test time: Encode′(Encode(x))
CS221 / Autumn 2015 / Liang 28
Probabilistic models
So far:

Decode(Encode(x))

Probabilistic model: distribution over inputs and hidden states

p(x, h)

Two types:
• Restricted Boltzmann machines (Markov network)
• Deep belief network (Bayesian network)

For simplicity, assume x and h are binary vectors

CS221 / Autumn 2015 / Liang 29


Restricted Boltzmann machines
Markov network (factor graph) over hidden units h and visible units x

Sampling: h | x is easy, x is hard

p(x, h) ∝ exp(h⊤Wx + b⊤h + c⊤x)

Learning: SGD; the gradient requires summing over all (x, h)

Contrastive divergence: initialize x, take 1 step of Gibbs sampling

CS221 / Autumn 2015 / Liang 30
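A sketch of contrastive divergence (CD-1) for a small binary RBM, following the Gibbs-sampling recipe above; the sizes, learning rate, and random data are assumptions:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed sizes: binary x in {0,1}^6, binary h in {0,1}^3
rng = np.random.default_rng(0)
X = (rng.random((100, 6)) > 0.5).astype(float)
W = 0.01 * rng.normal(size=(3, 6))     # coupling in h^T W x
b = np.zeros(3)                        # hidden bias
c = np.zeros(6)                        # visible bias
eta = 0.05

for t in range(10):
    for x in X:
        # Positive phase: p(h | x) is easy (it factorizes)
        ph = sigma(W @ x + b)
        h = (rng.random(3) < ph).astype(float)
        # One step of Gibbs sampling: reconstruct x, then h again
        px1 = sigma(W.T @ h + c)
        x1 = (rng.random(6) < px1).astype(float)
        ph1 = sigma(W @ x1 + b)
        # CD-1 update: data statistics minus reconstruction statistics
        W += eta * (np.outer(ph, x) - np.outer(ph1, x1))
        b += eta * (ph - ph1)
        c += eta * (x - x1)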


Deep belief networks
Bayesian network over hidden units h and visible units x

Sampling: h | x is hard, x is easy

p(x | h) ∝ exp(h⊤Wx + b⊤h + c⊤x)

Learning: maximum likelihood is intractable, so use the same algorithm as for the RBM; repeat layer by layer to get deep (as with stacked denoising autoencoders)

CS221 / Autumn 2015 / Liang 31


Distributional semantics: warmup
The new design has ___ lines.

Let's try to keep the kitchen ___.

I forgot to ___ out the cabinet.

What does ___ mean?

CS221 / Autumn 2015 / Liang 32


Distributional semantics
The new design has ___ lines.

Observation: context can tell us a lot about word meaning

Autoencoding: predict x from x

Distributional methods: predict x from context

CS221 / Autumn 2015 / Liang 33


General recipe
1. Form a word-context matrix N of counts (data), with one row per word w and one column per context c

2. Perform dimensionality reduction (generalize): factor N into a matrix Θ whose rows are the word vectors θw ∈ ℝᵈ

CS221 / Autumn 2015 / Liang 34


[Deerwester/Dumais/Furnas/Landauer/Harshman, 1990]

Latent semantic analysis


Data:
Doc1: Cats have tails.
Doc2: Dogs have tails.
Matrix: context = documents that the word appears in

Doc1 Doc2
cats 1 0
dogs 0 1
have 1 1
tails 1 1

CS221 / Autumn 2015 / Liang 35


[Deerwester/Dumais/Furnas/Landauer/Harshman, 1990]

Latent semantic analysis


Dimensionality reduction: SVD

N ≈ Θ S V⊤ (rows indexed by words w, columns by documents c; the rows of Θ give the word vectors)

• Used for information retrieval

• Match query to documents in latent space rather than on keywords

CS221 / Autumn 2015 / Liang 36
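A small sketch of this recipe on the two-document corpus above, using a plain word-document count matrix and a truncated SVD; the choice d = 2 is arbitrary:

import numpy as np

# The toy corpus from the slide: context = documents a word appears in
docs = [["cats", "have", "tails"], ["dogs", "have", "tails"]]
vocab = sorted({w for doc in docs for w in doc})
N = np.zeros((len(vocab), len(docs)))          # word-document count matrix
for j, doc in enumerate(docs):
    for w in doc:
        N[vocab.index(w), j] += 1

# Dimensionality reduction via truncated SVD: N ~= Theta S V^T
d = 2
Theta, S, Vt = np.linalg.svd(N, full_matrices=False)
word_vectors = Theta[:, :d] * S[:d]            # rows are theta_w in R^d

for w in vocab:
    print(w, np.round(word_vectors[vocab.index(w)], 3))
# "have" and "tails" appear in exactly the same documents,
# so they end up with identical word vectors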


[Mikolov/Sutskever/Chen/Corrado/Dean, 2013 (word2vec)]

Skip-gram model with negative sampling


Data:
Cats and dogs have tails.
Matrix: context = words in a window

cats and dogs have tails


cats 0 1 1 0 0
and 1 0 1 1 0
dogs 1 1 0 1 1
have 0 1 1 0 1
tails 0 0 1 1 0

CS221 / Autumn 2015 / Liang 37


[Mikolov/Sutskever/Chen/Corrado/Dean, 2013 (word2vec)]

Skip-gram model with negative sampling


Cats are smarter than the best AI.

Dimensionality reduction: logistic regression with SGD

Model: predict good (w, c) using logistic regression

pθ(g = 1 | w, c) = (1 + exp(−θw · βc))^(−1)

Positives: (w, c) from data

Negatives: (w, c′) for irrelevant c′ (k times more)

+(cats, AI) −(cats, linguistics) −(cats, statistics)

CS221 / Autumn 2015 / Liang 38


[Mikolov/Sutskever/Chen/Corrado/Dean, 2013 (word2vec)]

Skip-gram model with negative sampling


Data distribution:

p̂(w, c) ∝ N (w, c)

Objective:
max_{θ,β} Σ_{w,c} p̂(w, c) log p(g = 1 | w, c) + k Σ_{w,c′} p̂(w) p̂(c′) log p(g = 0 | w, c′)

CS221 / Autumn 2015 / Liang 39
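A sketch of skip-gram with negative sampling as logistic regression trained by SGD; the tiny corpus, window size, dimensions, and learning rate are assumptions:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed setup: small vocabulary, d-dimensional word and context vectors
vocab = ["cats", "and", "dogs", "have", "tails"]
d, k, eta = 10, 2, 0.05
rng = np.random.default_rng(0)
theta = 0.1 * rng.normal(size=(len(vocab), d))   # word vectors theta_w
beta = 0.1 * rng.normal(size=(len(vocab), d))    # context vectors beta_c

def sgd_step(w, c, g):
    # One logistic-regression SGD step on pair (w, c) with label g in {0, 1}
    p = sigma(theta[w] @ beta[c])                # p_theta(g = 1 | w, c)
    grad = p - g                                 # d(-log-likelihood)/d(score)
    theta[w], beta[c] = (theta[w] - eta * grad * beta[c],
                         beta[c] - eta * grad * theta[w])

corpus = [0, 1, 2, 3, 4]                         # "cats and dogs have tails"
for t in range(50):
    for i, w in enumerate(corpus):
        for j in range(max(0, i - 1), min(len(corpus), i + 2)):
            if j == i:
                continue
            sgd_step(w, corpus[j], 1)            # positive: observed (w, c)
            for _ in range(k):                   # k negatives with random c'
                sgd_step(w, rng.integers(len(vocab)), 0)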


2D visualization of word vectors

CS221 / Autumn 2015 / Liang 40


[Mikolov/Yih/Zweig, 2013; Levy/Goldberg, 2014]

Analogies
Differences in context vectors capture relations:

θking − θman ≈ θqueen − θwoman (gender)

θfrance − θfrench ≈ θmexico − θspanish (language)

θcar − θcars ≈ θapple − θapples (plural)

Intuition: the context features behave roughly like

θking ≈ [crown, he], θman ≈ [he], θqueen ≈ [crown, she], θwoman ≈ [she]

so both differences θking − θman and θqueen − θwoman pick out roughly [crown]

CS221 / Autumn 2015 / Liang 41
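A sketch of answering analogy queries by nearest neighbor (cosine similarity) to θb − θa + θc; the vectors and vocab arguments are assumed to come from any of the embedding methods above:

import numpy as np

def analogy(a, b, c, vectors, vocab):
    # Return the word whose vector is closest to theta_b - theta_a + theta_c
    target = (vectors[vocab.index(b)] - vectors[vocab.index(a)]
              + vectors[vocab.index(c)])
    best, best_sim = None, -np.inf
    for w in vocab:
        if w in (a, b, c):
            continue                              # exclude the query words
        v = vectors[vocab.index(w)]
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

# Usage (hypothetical vectors/vocab from a trained model):
# analogy("man", "king", "woman", vectors, vocab)   # ideally returns "queen"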


Unsupervised learning
• Principle: make up prediction tasks (e.g., predict x from x itself or from its context)

• Hard task → pressure to learn something

• Loss minimization using SGD

• Discriminative fine-tuning: initialize a feedforward neural network and backpropagate to optimize task accuracy

• Helps less given large amounts of labeled data, but that doesn't mean unsupervised learning is solved; quite the opposite!

CS221 / Autumn 2015 / Liang 42


Roadmap

Supervised learning

Unsupervised learning

Convolutional neural networks

Recurrent neural networks

Final remarks

CS221 / Autumn 2015 / Liang 43


Motivation
[diagram: an image x densely connected to hidden units through a full weight matrix W]

• Observation: images are not arbitrary vectors

• Goal: leverage spatial structure of images (translation invariance)

CS221 / Autumn 2015 / Liang 44


[figure from Andrej Karpathy]

Prior knowledge

• Local connectivity: each hidden unit operates on a local image patch (3 instead of 7 connections per hidden unit)

• Parameter sharing: the processing of each image patch is the same (3 parameters instead of 3 · 5)

• Intuition: try to match a pattern in the image

CS221 / Autumn 2015 / Liang 45


Fully-connected: every hidden unit is connected to the entire input.

Convolutional: each depth column is produced from a localized region (in height/width).

[Andrej Karpathy’s demo]


CS221 / Autumn 2015 / Liang 46
[figure from Andrej Karpathy]

Max-pooling

• Intuition: test whether a pattern exists somewhere in a neighborhood

• Reduce computation, prevent overfitting

CS221 / Autumn 2015 / Liang 47
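A NumPy sketch of a single convolution (shared kernel, local connectivity) followed by 2x2 max-pooling; the toy image and kernel are assumptions:

import numpy as np

def conv2d_valid(image, kernel):
    # Slide one shared kernel over the image (parameter sharing, local connectivity)
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    # Keep the max in each size-by-size block: "does the pattern occur nearby?"
    H, W = x.shape
    H, W = H - H % size, W - W % size
    return x[:H, :W].reshape(H // size, size, W // size, size).max(axis=(1, 3))

# Toy example (assumed): a 6x6 image and a 3x3 vertical-edge detector
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1., 0., -1.]] * 3)
feature_map = conv2d_valid(image, kernel)   # 4x4
pooled = max_pool(feature_map)              # 2x2
print(feature_map.shape, pooled.shape)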


Example of function evaluation

[Andrej Karpathy’s demo]

CS221 / Autumn 2015 / Liang 48


[Krizhevsky et al., 2012, a.k.a. AlexNet]

AlexNet

• Non-linearity: use ReLU (max(z, 0)) instead of the logistic function

• Data augmentation: translation, horizontal reflection, varying intensity; dropout (guards against overfitting)
• Computation: parallelize across two GPUs (6 days)
• Impressive results: 15% error; next best was 25%!

CS221 / Autumn 2015 / Liang 49


Summary
• Intuition: spatial regularity across the input

• Key idea: locality and parameter sharing

• Dominant in computer vision

• Applications to text classification and speech recognition

CS221 / Autumn 2015 / Liang 50


Roadmap

Supervised learning

Unsupervised learning

Convolutional neural networks

Recurrent neural networks

Final remarks

CS221 / Autumn 2015 / Liang 51


Motivation
Model sequences (sentences):

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12

Paris Talks Set Stage for Action as Risks to the Climate Rise

Goal: rich probabilistic model

p(x1 )p(x2 | x1 )p(x3 | x1 , x2 )p(x4 | x1 , x2 , x3 ) · · ·

No conditional independence assumptions!

CS221 / Autumn 2015 / Liang 52


Recurrent neural networks
[diagram: h1 → h2 → h3 → h4 with inputs x1, x2, x3, x4]

Unrolled:
h1 = Encode(x1)
x2 ∼ Decode(h1)
h2 = Encode(h1, x2)
x3 ∼ Decode(h2)
h3 = Encode(h2, x3)
x4 ∼ Decode(h3)
h4 = Encode(h3, x4)

Update context vector: ht = Encode(ht−1, xt)
Predict next character: xt+1 ∼ Decode(ht)
The context ht compresses x1, . . . , xt

CS221 / Autumn 2015 / Liang 53


[Elman, 1990]

Simple recurrent network


[diagram: h1 → h2 → h3 → h4 with inputs x1, x2, x3, x4]

Encode(ht−1, xt) = σ(V xt + W ht−1) = ht

Decode(ht): sample xt+1 ∼ softmax(W′ ht) = p(xt+1)

CS221 / Autumn 2015 / Liang 54
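A sketch of this simple recurrent network in NumPy, with one-hot inputs; the vocabulary size, hidden dimension, and random parameters are assumptions:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Assumed sizes: vocabulary of 5 symbols (one-hot x_t), hidden dimension 8
rng = np.random.default_rng(0)
vocab_size, hidden = 5, 8
V = 0.1 * rng.normal(size=(hidden, vocab_size))   # input-to-hidden
W = 0.1 * rng.normal(size=(hidden, hidden))       # hidden-to-hidden
Wp = 0.1 * rng.normal(size=(vocab_size, hidden))  # hidden-to-output

def encode(h_prev, x_t):
    return sigma(V @ x_t + W @ h_prev)            # h_t

def decode(h_t):
    return softmax(Wp @ h_t)                      # distribution over x_{t+1}

# Run the recurrence over a short sequence of one-hot inputs
h = np.zeros(hidden)
for idx in [0, 3, 1, 4]:
    x = np.eye(vocab_size)[idx]
    h = encode(h, x)
    p_next = decode(h)
print(np.round(p_next, 3))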


Vanishing gradient problem
[unrolled chain: h1 → h2 → h3 → h4 → h5 with inputs x1, . . . , x5]

(set x1 = 1, x2 = x3 = · · · = 0, σ = identity function)

h5 = V · V · V · V · W x1 = V⁴ W

If V = 0.1, then
• Value: ht = 0.1^(t−1) W
• Gradient: ∂ht/∂W = 0.1^(t−1) (vanishes as the length increases)
CS221 / Autumn 2015 / Liang 55
Additive combinations
[unrolled chain: h1 → h2 → h3 → h4 → h5 with inputs x1, . . . , x5]

What if:
ht = ht−1 + W xt
Then:
(set x1 = 1, x2 = x3 = · · · = 0, σ = identity function)

• Value: ht = W
• Gradient: ∂ht/∂W = 1 for any t

CS221 / Autumn 2015 / Liang 56


[Hochreiter & Schmidhuber, 1997]

Long Short Term Memory (LSTM)


API:
(ht , ct ) = LSTM(ht−1 , ct−1 , xt )
Input gate:
it = σ(Wi xt + Ui ht−1 + Vi ct−1 + bi )
Forget gate (initialize with bf large, so close to 1):
ft = σ(Wf xt + Uf ht−1 + Vf ct−1 + bf )
Cell: additive combination of RNN update with previous cell
ct = it tanh(Wc xt + Uc ht−1 + bc ) + ft ct−1
Output gate:
ot = σ(Wo xt + Uo ht−1 + Vo ct + bo )
Hidden state:
ht = ot tanh(ct )

CS221 / Autumn 2015 / Liang 57
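A sketch of this LSTM cell in NumPy, following the gate equations above (including the large forget-gate bias); the dimensions and random parameters are assumptions:

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_lstm(xdim, hdim, rng):
    # Parameters for the gates above; b_f initialized large so f_t starts near 1
    def mats():
        return (0.1 * rng.normal(size=(hdim, xdim)),
                0.1 * rng.normal(size=(hdim, hdim)),
                0.1 * rng.normal(size=(hdim, hdim)))
    return {"i": (*mats(), np.zeros(hdim)),
            "f": (*mats(), 2.0 * np.ones(hdim)),     # large forget bias
            "o": (*mats(), np.zeros(hdim)),
            "c": (0.1 * rng.normal(size=(hdim, xdim)),
                  0.1 * rng.normal(size=(hdim, hdim)), None, np.zeros(hdim))}

def lstm_step(p, h_prev, c_prev, x_t):
    Wi, Ui, Vi, bi = p["i"]; Wf, Uf, Vf, bf = p["f"]
    Wo, Uo, Vo, bo = p["o"]; Wc, Uc, _, bc = p["c"]
    i = sigma(Wi @ x_t + Ui @ h_prev + Vi @ c_prev + bi)       # input gate
    f = sigma(Wf @ x_t + Uf @ h_prev + Vf @ c_prev + bf)       # forget gate
    c = i * np.tanh(Wc @ x_t + Uc @ h_prev + bc) + f * c_prev  # additive cell update
    o = sigma(Wo @ x_t + Uo @ h_prev + Vo @ c + bo)            # output gate
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
p = make_lstm(xdim=4, hdim=6, rng=rng)
h, c = np.zeros(6), np.zeros(6)
for x in rng.normal(size=(10, 4)):                             # a length-10 sequence
    h, c = lstm_step(p, h, c, x)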


[from Andrej Karpathy’s blog]

Character-level language modeling


Sampled output:

Naturalism and decision for the majority of Arab countries’ capitalide was
grounded by the Irish language by [[John Clair]], [[An Imperial Japanese
Revolt]], associated with Guangzham’s sovereignty. His generals were
the powerful ruler of the Portugal in the [[Protestant Immineners]], which
could be said to be directly in Cantonese Communication, which followed
a ceremony and set inspired prison, training. The emperor travelled back
to [[Antioch, Perth, October 25—21]] to note, the Kingdom of Costa
Rica, unsuccessful fashioned the [[Thrales]], [[Cynth’s Dajoard]], known
in western [[Scotland]], near Italy to the conquest of India with the
conflict.

CS221 / Autumn 2015 / Liang 58


[from Andrej Karpathy’s blog]

CS221 / Autumn 2015 / Liang 59


[Sutskever et al., 2014]

Sequence-to-sequence model
Motivation: machine translation
x: Je crains l'homme d'un seul livre.
y: Fear the man of one book.
[diagram: encoder states h1, h2, h3 read inputs x1, x2, x3; decoder states h4, h5, h6 produce outputs y4, y5, y6]

Read in a sentence first, output according to RNN:


ht = Encode(ht−1 , xt or yt−1 ), yt = Decode(ht )
CS221 / Autumn 2015 / Liang 60
Attention-based models

Motivation: long sentences — compress to finite dimensional vector?

Eine Folge von Ereignissen bewirkte, dass aus Beethovens Studienreise


nach Wien ein dauerhafter und endgültiger Aufenthalt wurde. Kurz nach
Beethovens Ankunft, am 18. Dezember 1792, starb sein Vater. 1794
besetzten französische Truppen das Rheinland, und der kurfürstliche Hof
musste fliehen.

Key idea: attention

Learn to look back at your notes.

CS221 / Autumn 2015 / Liang 61


[Bahdanau et al., 2015]

Attention-based models
[diagram: decoder states h4, h5, h6 attend back to encoder states h1, h2, h3 (inputs x1, x2, x3) while producing y4, y5, y6]

Distribution over input positions:
αt = softmax([Attend(h1, ht−1), . . . , Attend(hL, ht−1)])

Generate with the attended input:
ht = Encode(ht−1, yt−1, Σ_{j=1}^L αt,j hj)

CS221 / Autumn 2015 / Liang 62
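A sketch of one attention step in NumPy; the Attend score function used here (a small additive MLP) is an assumption, since the slide leaves Attend abstract:

import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Assumed: L encoder states h_1..h_L (rows of H), decoder state h_prev, both dim d
rng = np.random.default_rng(0)
L, d = 6, 8
H = rng.normal(size=(L, d))              # encoder hidden states
h_prev = rng.normal(size=d)              # previous decoder state
Wa = 0.1 * rng.normal(size=(d, 2 * d))   # parameters of the assumed Attend function
va = 0.1 * rng.normal(size=d)

def attend(h_j, h_prev):
    # One common choice (an assumption here): a small MLP score
    return va @ np.tanh(Wa @ np.concatenate([h_j, h_prev]))

alpha = softmax(np.array([attend(H[j], h_prev) for j in range(L)]))
context = alpha @ H                      # sum_j alpha_j h_j, fed into Encode(...)
print(np.round(alpha, 3), context.shape)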


[Bahdanau et al., 2015]

Machine translation

CS221 / Autumn 2015 / Liang 63


[Google, 2015]

Email responder

CS221 / Autumn 2015 / Liang 64


[Xu et al., 2015]

Image captioning

CS221 / Autumn 2015 / Liang 65


Summary
• Recurrent neural networks: model sequences (a non-linear version of a Kalman filter or HMM)

• Logic intuition: learning a program with a for loop (reduce)

• LSTMs mitigate the vanishing gradient problem

• Attention-based models: useful when only part of the input is relevant at a time

• Newer models with "external memory": memory networks, neural Turing machines

CS221 / Autumn 2015 / Liang 66


Roadmap

Supervised learning

Unsupervised learning

Convolutional neural networks

Recurrent neural networks

Final remarks

CS221 / Autumn 2015 / Liang 67


Computation
...wait for a long time...
Better optimization algorithms: SGD, SGD+momentum, AdaGrad, AdaDelta, Nesterov, Adam
Buy GPUs:

...wait for a long time...

CS221 / Autumn 2015 / Liang 68


Theory: why does it work?
Two questions:
• Approximation: why are neural networks good hypothesis classes?
• Optimization: why can SGD optimize a high-dimensional non-convex problem?
Partial answers:
• 1-layer neural networks can approximate any continuous function on a compact set [Cybenko, 1989; Barron, 1993]
• Generating random features works too [Rahimi/Recht, 2009; Andoni et al., 2014]
• Use statistical physics to analyze loss surfaces [Choromanska et al., 2014]

CS221 / Autumn 2015 / Liang 69


Summary
Phenomenon → Ideas

Fixed vectors → feedforward NNs
Spatial structure → convolutional NNs
Sequences → recurrent NNs, LSTMs
Sequence-to-sequence → encoder-decoder, attention-based models
Unsupervised → belief networks, RBMs, autoencoders

CS221 / Autumn 2015 / Liang 70
References
Tutorials:
• http://deeplearning.net/tutorial/
• http://deeplearning.stanford.edu/tutorial/
• http://cs.stanford.edu/people/karpathy/convnetjs/
Software:
• Caffe (Berkeley): centered around computer vision
• Theano (Montreal); also see Keras: Python
• Torch (Facebook): fast, but write in Lua
• TensorFlow (Google): new!

CS221 / Autumn 2015 / Liang 71


Outlook
Extensibility: able to compose modules

LSTM Attend Encode

Learning programs: think about the analogy with a computer

x → fθ → y

Data:

reinforcement learning? unsupervised learning?

CS221 / Autumn 2015 / Liang 72
