Chapter 7
Introduction
Apply a non-linear function f to z. The output of this function is the activation value, a:
y = a = f(z)
Different activation functions:
• Sigmoid
• Tanh
• Rectified linear unit (ReLU)
Sigmoid
The sigmoid maps its input into the range (0, 1).
The output of a neural unit:
y = σ(w · x + b) = 1 / (1 + e^−(w · x + b))
Example:
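A minimal Python sketch, with illustrative (made-up) weights, bias, and inputs, showing how a sigmoid unit computes its output:

import math

def sigmoid(z):
    # Map any real z into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values (not from the slides): 2 inputs, 2 weights, 1 bias.
w = [0.5, -1.0]
b = 0.1
x = [1.0, 2.0]

z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum: w . x + b
a = sigmoid(z)                                 # activation value
print(z, a)                                    # -1.4, ~0.198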
Advantages: the sigmoid is differentiable, and it squashes outlier input values toward 0 or 1.
However, the sigmoid saturates for inputs of large magnitude: gradients that are almost 0 cause the error signal to become too small to be used for training, a problem called the vanishing gradient problem.
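A quick Python check (illustrative, not from the slides): the sigmoid's derivative is σ(z)(1 − σ(z)), which peaks at 0.25 when z = 0 and approaches 0 as |z| grows:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: sigma(z) * (1 - sigma(z)).
    s = sigmoid(z)
    return s * (1.0 - s)

for z in [0.0, 2.0, 5.0, 10.0]:
    print(z, sigmoid_grad(z))
# 0.0 -> 0.25, 2.0 -> ~0.105, 5.0 -> ~0.0066, 10.0 -> ~0.000045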
The XOR problem
Can neural units compute simple functions of input?
AND            OR             XOR
x1 x2 y        x1 x2 y        x1 x2 y
0  0  0        0  0  0        0  0  0
0  1  0        0  1  1        0  1  1
1  0  0        1  0  1        1  0  1
1  1  1        1  1  1        1  1  0
Perceptrons
A very simple neural unit:
• Binary output (0 or 1)
• No non-linear activation function
• Output is 1 if w · x + b > 0, and 0 otherwise
Easy to build AND or OR with perceptrons.
AND: weights w1 = w2 = 1, bias b = −1
• (0, 0): 0 + 0 − 1 = −1 → output 0
• (0, 1): 0 + 1 − 1 = 0 → output 0
• (1, 0): 1 + 0 − 1 = 0 → output 0
• (1, 1): 1 + 1 − 1 = 1 → output 1
OR: weights w1 = w2 = 1, bias b = 0
• (0, 0): 0 + 0 + 0 = 0 → output 0
• (0, 1): 0 + 1 + 0 = 1 → output 1
• (1, 0): 1 + 0 + 0 = 1 → output 1
• (1, 1): 1 + 1 + 0 = 2 → output 1
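A minimal Python sketch (the perceptron helper is my own naming; the weights match the construction above) verifying that these settings compute AND and OR:

def perceptron(w, b, x):
    # Binary threshold unit: output 1 if w . x + b > 0, else 0.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x,
          perceptron([1, 1], -1, x),   # AND: w1 = w2 = 1, b = -1
          perceptron([1, 1], 0, x))    # OR:  w1 = w2 = 1, b = 0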
It is not possible to compute XOR with a single perceptron!
The perceptron equation, given x1 and x2, is the equation of a line:
w1x1 + w2x2 + b = 0
in standard linear format: x2 = (−w1/w2)x1 + (−b/w2)
This line acts as a decision boundary
• 0 if input is on one side of the line
• 1 if on the other side of the line
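For example, with the AND weights above (w1 = w2 = 1, b = −1), the boundary is
x2 = (−1/1) x1 + (−(−1)/1) = −x1 + 1
Only the input (1, 1) lies strictly above this line, so it alone gets output 1, which is exactly AND.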
Decision boundaries
[Figure: decision boundaries in the (x1, x2) plane for a) AND, b) OR, c) XOR; the XOR panel is marked with a ?]
Filled circles represent perceptron outputs of 1, white circles perceptron outputs of 0.
There is no way to draw a line that correctly separates the two categories for XOR.
The solution: neural networks
XOR can be computed by a layered network of perceptron units.
Hidden layer: h = σ(Wx + b)
where W is the hidden-layer weight matrix, x is the input vector, and b is the bias vector.
An activation function g is applied elementwise:
g([z1, z2, z3]) = [g(z1), g(z2), g(z3)]
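A minimal numpy sketch (dimensions and values are illustrative, not from the slides) of the hidden-layer computation:

import numpy as np

def sigmoid(z):
    # numpy applies this elementwise to a vector z.
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 2 inputs, 3 hidden units.
W = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [0.5, 0.5]])           # W in R^(3x2)
b = np.array([0.0, -1.0, 0.5])       # bias vector
x = np.array([1.0, 0.0])             # input vector

h = sigmoid(W @ x + b)               # h = sigma(Wx + b)
print(h)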
Output layer computation
weight matrix U, U ∈ R^(n2×n1)
input vector h (the hidden-layer output), h ∈ R^(n1)
intermediate output z = Uh, z ∈ R^(n2)
Putting the whole network together:
h = σ(Wx + b)
z = Uh
y = softmax(z)
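Finally, a minimal numpy sketch of the full forward pass (all sizes and values are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Illustrative sizes: 2 inputs, 3 hidden units (n1 = 3), 2 outputs (n2 = 2).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))          # hidden-layer weights
b = rng.normal(size=3)               # hidden-layer bias
U = rng.normal(size=(2, 3))          # output weights, U in R^(n2 x n1)

x = np.array([1.0, 0.0])
h = sigmoid(W @ x + b)               # h = sigma(Wx + b)
z = U @ h                            # z = Uh
y = softmax(z)                       # y = softmax(z)
print(y, y.sum())                    # y is a probability distribution; sums to 1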