08 Neural Networks
Simple Neural Networks
This is in your brain
[Figure: a single neural unit — input layer x1, x2, x3, +1; weights w1, w2, w3 and bias b; weighted sum ∑ producing z; non-linear transform σ producing a; output value y]
Neural unit

At its heart, a neural unit takes a weighted sum of its inputs, with one additional term in the sum called a bias term. Given a set of inputs x1 … xn, a unit has a set of corresponding weights w1 … wn and a bias b, so the weighted sum z can be expressed as:

z = b + ∑i wi xi

It's often more convenient to express this weighted sum using vector notation; recall from linear algebra that a vector is, at heart, just a list or array of numbers. We'll therefore talk about z in terms of a weight vector w, a scalar bias b, and an input vector x, and we'll replace the sum with the convenient dot product:

z = w · x + b      (7.2)

As defined in Eq. 7.2, z is just a real valued number.

Instead of using z, a linear function of x, as the output, neural units apply a non-linear activation function f to z. We will refer to the output of this function as the activation value for the unit, a. Since we are just modeling a single unit here, the activation for the node is in fact the final output of the network, which we'll generally call y. So the value y is defined as:

y = a = f(z)
Non-Linear Activation Functions
y = a = f (z)
The sigmoid for logistic regression:
We'll discuss non-linear functions f() below (the sigmoid, the tanh, and the rectified linear unit or ReLU), but it's pedagogically convenient to start with the sigmoid, since we saw it in Chapter 5:
Sigmoid
y = σ(z) = 1 / (1 + e^(−z))      (7.3)
The sigmoid (shown in Fig. 7.1) has a number of advantages; it maps the output into the range (0, 1), which is useful in squashing outliers toward 0 or 1. And it's differentiable, which as we saw in Section ?? will be handy for learning.
Final function the unit is computing
Substituting Eq. 7.2 into Eq. 7.3 gives us the output of a neural unit:

y = σ(w·x + b) = 1 / (1 + exp(−(w·x + b)))
Figure 7.2 shows a final schematic of a basic neural unit. In this example the unit takes 3 input values x1, x2, and x3, and computes a weighted sum, multiplying each value by a weight (w1, w2, and w3, respectively), adds them to a bias term b, and then passes the resulting sum through a sigmoid function to result in a number between 0 and 1.
Final unit again
[Figure: the final unit — input layer x1, x2, x3, +1; weights w1, w2, w3 and bias b; output value y = a]
An example

Let's walk through an example just to get an intuition. Suppose a unit has the following weight vector and bias:

w = [0.2, 0.3, 0.9]
b = 0.5

What would this unit do with the following input vector?

x = [0.5, 0.6, 0.1]

The resulting output y would be:

y = σ(w·x + b) = 1 / (1 + e^(−(w·x + b))) = 1 / (1 + e^(−(.5×.2 + .6×.3 + .1×.9 + .5))) = 1 / (1 + e^(−0.87)) = .70
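To double-check the arithmetic, here is a minimal NumPy sketch of this unit; the weights, bias, and input are the ones from the example above, and the helper name `sigmoid` is just my own label.

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^(-z))
    return 1 / (1 + np.exp(-z))

w = np.array([0.2, 0.3, 0.9])   # weights from the example
b = 0.5                          # bias from the example
x = np.array([0.5, 0.6, 0.1])   # input vector from the example

z = np.dot(w, x) + b             # weighted sum: 0.87
y = sigmoid(z)                   # activation: about 0.70
print(z, y)
```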
Non-Linear Activation Functions besides sigmoid

In practice, the sigmoid is not commonly used as an activation function. A function that is almost always better is the tanh function shown in Fig. 7.3a, a variant of the sigmoid that ranges from -1 to +1:

y = (e^z − e^(−z)) / (e^z + e^(−z))      (7.5)

The simplest activation function, and perhaps the most commonly used, is the rectified linear unit, also called the ReLU, shown in Fig. 7.3b. It's just the same as z when z is positive, and 0 otherwise:

y = max(z, 0)      (7.6)

[Figure: plots of a) tanh and b) ReLU (Rectified Linear Unit)]
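For concreteness, a small NumPy sketch of these three activation functions; the test values in `z` are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))            # squashes z into (0, 1)

def tanh(z):
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))   # ranges over (-1, 1)

def relu(z):
    return np.maximum(z, 0)                # z when positive, 0 otherwise

z = np.array([-2.0, 0.0, 0.87, 3.0])       # illustrative values
print(sigmoid(z))   # approx [0.12, 0.50, 0.70, 0.95]
print(tanh(z))      # approx [-0.96, 0.00, 0.70, 1.00]
print(relu(z))      # [0.   0.   0.87 3.  ]
```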
Simple Neural Networks

The XOR problem
Minsky and Papert (1969): Can neural units compute simple functions of input?

One of the most clever demonstrations of the need for multi-layer networks was the proof by Minsky and Papert (1969) that a single neural unit cannot compute some very simple functions of its input. Consider the task of computing elementary logical functions of two inputs, like AND, OR, and XOR. As a reminder, here are the truth tables for those functions:
AND OR XOR
x1 x2 y x1 x2 y x1 x2 y
0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 1 1
1 0 0 1 0 1 1 0 1
1 1 1 1 1 1 1 1 0
This example was first shown for the perceptron, which is a very simple neural unit that has a binary output and does not have a non-linear activation function.
Perceptrons
A very simple neural unit
• Binary output (0 or 1)
• No non-linear activation function
Easy to build AND or OR with perceptrons

Early in the history of neural networks it was realized that the power of networks, as with the real neurons that inspired them, comes from combining these units into larger networks.
Why? Perceptrons are linear classifiers
The perceptron equation, given inputs x1 and x2, is the equation of a line:
w1·x1 + w2·x2 + b = 0
[Figure: decision boundaries in the (x1, x2) plane — a) x1 AND x2 and b) x1 OR x2 can each be separated by a single line; c) x1 XOR x2 has no such line]
[Figure: a two-layer network computing XOR — inputs x1, x2, +1 feed hidden units h1, h2 with weights 1, 1, 1, 1 and biases 0, -1; in the hidden-unit space (h1, h2), the XOR cases become linearly separable]
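A sketch of the point in code, assuming NumPy: a single linear-threshold unit can compute AND (the weights below are one illustrative choice, not from the slides), but no choice of w1, w2, b draws a line separating the XOR cases; a two-layer ReLU network does compute XOR. The hidden-layer weights 1, 1, 1, 1 and biases 0, -1 match the figure above; the output weights 1, -2 are the usual choice in this standard solution.

```python
import numpy as np

def perceptron(x, w, b):
    # binary output, no non-linear activation beyond the threshold
    return 1 if np.dot(w, x) + b > 0 else 0

# AND with a single unit (illustrative weights; many choices work)
w_and, b_and = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w_and, b_and))   # 0, 0, 0, 1

# XOR with a two-layer ReLU network
W = np.array([[1.0, 1.0],      # weights into hidden unit h1
              [1.0, 1.0]])     # weights into hidden unit h2
b = np.array([0.0, -1.0])      # hidden-layer biases
u = np.array([1.0, -2.0])      # output weights (usual choice)

def xor_net(x):
    h = np.maximum(W @ x + b, 0)   # ReLU hidden layer
    return u @ h                    # weighted sum at the output

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xor_net(np.array(x)))  # 0.0, 1.0, 1.0, 0.0
```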
[Figure: a single unit as a 1-layer network — input layer x1 … xn, +1 (vector x); weights w1 … wn (vector w) and bias b (scalar)]
Multinomial Logistic Regression as a 1-layer Network

Fully connected single layer network:

y = softmax(Wx + b)

[Figure: input layer x1 … xn, +1 (scalars); W is a matrix, b is a vector; output layer y1 … yn (softmax nodes); y is a vector]
Reminder: softmax: a generalization of sigmoid
Example:
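A minimal sketch of softmax in NumPy; the scores in `z` are made-up illustrative values.

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability, exponentiate, normalize
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([0.6, 1.1, -1.5, 1.2, 3.2, -1.1])   # made-up scores
p = softmax(z)
print(p)          # probabilities, largest for the largest score
print(p.sum())    # 1.0 (up to floating point)
```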
Two-Layer Network with scalar output

Output layer (σ node): y = σ(z), where z = Uh; y is a scalar
U
Hidden units (σ nodes) — could be ReLU or tanh
W, b   (W_ji is the weight connecting input unit i to hidden unit j; b is a vector)
Input layer: x1 … xn, +1 (vector)
Two-Layer Network with softmax output

Output layer (softmax nodes): y = softmax(z), where z = Uh; y is a vector
U
Hidden units (σ nodes) — could be ReLU or tanh
W, b
Input layer: x1 … xn, +1 (vector)
Multi-layer Notation

y = a[2]
a[2] = g[2](z[2])        g[2]: sigmoid or softmax
z[2] = W[2] a[1] + b[2]
a[1] = g[1](z[1])        g[1]: ReLU
z[1] = W[1] a[0] + b[1]
Input layer: x1 … xn, +1 is a[0]
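Putting the notation together, a minimal NumPy sketch of the 2-layer forward pass; the layer sizes and random parameter values are arbitrary placeholders, not from the slides.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
n0, n1, n2 = 4, 3, 2                      # placeholder layer sizes

a0 = rng.normal(size=n0)                  # a[0] = x, the input vector
W1, b1 = rng.normal(size=(n1, n0)), rng.normal(size=n1)
W2, b2 = rng.normal(size=(n2, n1)), rng.normal(size=n2)

z1 = W1 @ a0 + b1                         # z[1] = W[1] a[0] + b[1]
a1 = relu(z1)                             # a[1] = g[1](z[1]),  g[1] = ReLU
z2 = W2 @ a1 + b2                         # z[2] = W[2] a[1] + b[2]
y = softmax(z2)                           # y = a[2] = g[2](z[2]),  g[2] = softmax
print(y, y.sum())                         # a distribution over the 2 outputs
```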
Replacing the bias unit
Let's switch to a notation without the bias unit
Just a notational change
1. Add a dummy node a0=1 to each layer
2. Its weight w0 will be the bias
3. So in the input layer a[0]_0 = 1,
◦ and likewise a[1]_0 = 1, a[2]_0 = 1, …
Replacing the bias unit

Instead of:        x = x1, x2, …, x_n0
We'll do this:     x = x0, x1, x2, …, x_n0
Replacing the bias unit

[Figure: the same 2-layer network drawn two ways — "Instead of" (left): input layer x1 … x_n0 plus an explicit +1 bias node with bias b, hidden layer h1 … h_n1 (weights W), output layer y1 … y_n2 (weights U); "We'll do this" (right): a dummy input x0 = 1 replaces the +1 node and the bias is folded into W]
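A small sketch of why this is just a notational change, assuming NumPy: prepend a dummy 1 to the input and fold b in as an extra column of W, and the result of Wx + b is unchanged (all values below are random placeholders).

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 4))     # placeholder weights: 3 hidden units, 4 inputs
b = rng.normal(size=3)          # placeholder bias vector
x = rng.normal(size=4)          # placeholder input vector

h_with_bias = W @ x + b                     # with an explicit bias term

x_aug = np.concatenate(([1.0], x))          # x = x0, x1, ..., xn with x0 = 1
W_aug = np.hstack((b[:, None], W))          # bias folded in as the weights on x0
h_folded = W_aug @ x_aug

print(np.allclose(h_with_bias, h_folded))   # True: just a notational change
```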
Simple Neural Networks and Neural Language Models

Applying feedforward networks to real tasks
Use cases for feedforward networks
Let's consider 2 (simplified) sample tasks:
1. Text classification
2. Language modeling
Classification: Sentiment Analysis
Feedforward nets for simple classification

[Figure: logistic regression (input features f1, f2, … fn as x1 … xn, weights W, σ output) compared with a 2-layer feedforward network (the same features, weights W into a hidden layer, weights U, σ output)]
Neural Net Classification with embeddings as input features!
Issue: texts come in different sizes
This assumes a fixed input length (3)!
Kind of unrealistic.
Some simple solutions (more sophisticated solutions later)
1. Make the input the length of the longest review
• If shorter then pad with zero embeddings
• Truncate if you get longer reviews at test time
2. Create a single "sentence embedding" (the same dimensionality as a word) to represent all the words (see the sketch after this list)
• Take the mean of all the word embeddings
• Take the element-wise max of all the word embeddings
  ◦ For each dimension, pick the max value from all words
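A sketch of the two pooling options for building a single "sentence embedding"; the word embeddings here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
# one d-dimensional embedding per word in the text (here 5 words, d = 4)
word_embeddings = rng.normal(size=(5, 4))

mean_pooled = word_embeddings.mean(axis=0)   # mean of all the word embeddings
max_pooled = word_embeddings.max(axis=0)     # element-wise max: for each dimension,
                                             # pick the max value over all the words

print(mean_pooled.shape, max_pooled.shape)   # (4,) (4,): same size as a single word
```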
Reminder: Multiclass Outputs
What if you have more than two output classes?
◦ Add more output units (one for each class)
◦ And use a “softmax layer”
Neural Language Models (LMs)
Language Modeling: Calculating the probability of the
next word in a sequence given some history.
• We've seen N-gram based LMs
• But neural network LMs far outperform n-gram
language models
State-of-the-art neural LMs are based on more
powerful neural network technology like Transformers
But simple feedforward LMs can do almost as well!
Simple feedforward Neural Language Models
Neural Language Model
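As a sketch of what a simple feedforward LM computes, assuming the common formulation (concatenate the embeddings of the previous N context words, pass them through a hidden layer, then a softmax over the vocabulary); every size, index, and parameter value below is a placeholder.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(3)
V, d, dh, N = 10000, 50, 100, 3          # vocab size, embedding dim, hidden dim, context size

E = rng.normal(size=(V, d))              # embedding matrix, one row per vocabulary word
W = rng.normal(size=(dh, N * d))         # hidden-layer weights
b = rng.normal(size=dh)                  # hidden-layer bias
U = rng.normal(size=(V, dh))             # output weights, one row per vocabulary word

context = [42, 7, 1337]                  # placeholder indices of the previous N words
e = np.concatenate([E[i] for i in context])   # concatenated context embeddings
h = relu(W @ e + b)                      # hidden layer
p = softmax(U @ h)                       # probability of each possible next word
print(p.shape, p.argmax())               # (10000,) and the most probable next-word index
```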
Why Neural LMs work better than N-gram LMs
Training data:
We've seen: I have to make sure that the cat gets fed.
Never seen: dog gets fed
Test data:
I forgot to make sure that the dog gets ___
N-gram LM can't predict "fed"!
Neural LM can use similarity of "cat" and "dog"
embeddings to generalize and predict “fed” after dog
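A toy illustration of that similarity intuition: the embedding vectors below are made up (real embeddings are learned), but similar vectors for "cat" and "dog" mean contexts containing one look, to the network, much like contexts containing the other.

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# made-up 4-dimensional embeddings, just to illustrate the idea
emb = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1]),
    "the": np.array([0.0, 0.1, 0.9, 0.8]),
}

print(cosine(emb["cat"], emb["dog"]))   # high (about 0.99): similar words
print(cosine(emb["cat"], emb["the"]))   # low (about 0.12): dissimilar words
```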