Exercise 4-1
Consider the neural network shown below with linear activation functions. Can any function that is
implemented by the shown network be presented as a neural network with a single neuron? If yes, what
are the corresponding weights?
Solution:
The given neural network consists of three layers with linear activation functions. The hidden neurons compute:

V1 = w10 + w11 X1 + w12 X2
V2 = w20 + w21 X1 + w22 X2

and the output neuron computes:

Y = w0 + w1 V1 + w2 V2

Step 1: Expanding the Network Function
Substituting V1 and V2 and expanding:

Y = w0 + w1 (w10 + w11 X1 + w12 X2 ) + w2 (w20 + w21 X1 + w22 X2 )

Grouping terms:

Y = (w0 + w1 w10 + w2 w20 ) + (w1 w11 + w2 w21 )X1 + (w1 w12 + w2 w22 )X2 (7)
Step 2: Representing as a Single Neuron
This equation can be rewritten in the form:
Y = W0′ + W1′ X1 + W2′ X2 (8)
where:
W0′ = w0 + w1 w10 + w2 w20 (9)
W1′ = w1 w11 + w2 w21 (10)
W2′ = w1 w12 + w2 w22 (11)
Thus, any function implemented by the given network can be represented by a single neuron with these equivalent weights: since the composition of linear functions is again linear, the whole network collapses to Y = W0′ + W1′ X1 + W2′ X2.
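As a quick check, here is a minimal NumPy sketch; the numeric weight values below are arbitrary placeholders (the actual values come from the exercise figure). It verifies that the collapsed weights from equations (9)-(11) reproduce the two-layer output.

```python
import numpy as np

# Placeholder weights; the exercise figure defines the actual values.
w10, w11, w12 = 0.5, 1.0, -2.0   # hidden neuron V1
w20, w21, w22 = -1.0, 0.3, 0.7   # hidden neuron V2
w0, w1, w2 = 0.2, 0.4, -0.6      # output neuron

def two_layer(x1, x2):
    v1 = w10 + w11 * x1 + w12 * x2
    v2 = w20 + w21 * x1 + w22 * x2
    return w0 + w1 * v1 + w2 * v2

# Collapsed single-neuron weights from equations (9)-(11)
W0p = w0 + w1 * w10 + w2 * w20
W1p = w1 * w11 + w2 * w21
W2p = w1 * w12 + w2 * w22

def single_neuron(x1, x2):
    return W0p + W1p * x1 + W2p * x2

for x1, x2 in [(0.0, 0.0), (1.0, -1.0), (2.5, 3.0)]:
    assert np.isclose(two_layer(x1, x2), single_neuron(x1, x2))
```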
Exercise 4-2
Compute activations and the output for the following samples for only one neuron
Initialise all weights and the bias to 0.1, and use the sigmoid function given below as the activation function.
S    X1   X2   X3
S1   1    0    1
S2   0    1    1
S3   1    1    0
Solution:
A single neuron takes the three inputs (X1, X2, X3) and applies a weighted sum plus a bias. The result is then passed through the sigmoid activation function to produce the final output y.
The neuron’s equation is given by:

y = f (w^T X + b) (12)

where f is the sigmoid function:

S(x) = 1 / (1 + e^(-x))
The weighted sum for each sample is:

z = w1 X1 + w2 X2 + w3 X3 + b

Since all weights are 0.1 and the bias is also 0.1, and every sample has exactly two inputs equal to 1, all three samples yield the same value. For S1 (X1 = 1, X2 = 0, X3 = 1):

z = 0.1 + 0 + 0.1 + 0.1 = 0.3

and likewise z = 0.3 for S2 and S3. Applying the sigmoid gives the output for every sample:

y = S(0.3) = 1 / (1 + e^(-0.3)) ≈ 0.574
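The same computation as a short Python sketch (NumPy assumed available):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w = np.array([0.1, 0.1, 0.1])  # all weights initialised to 0.1
b = 0.1                        # bias initialised to 0.1

samples = {"S1": [1, 0, 1], "S2": [0, 1, 1], "S3": [1, 1, 0]}
for name, x in samples.items():
    z = w @ np.array(x) + b    # weighted sum plus bias
    print(name, round(float(z), 3), round(float(sigmoid(z)), 3))  # z = 0.3, y ≈ 0.574 for all
```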
Exercise 4-3
Compute activations and the output for the following samples
S    X1   X2
S1   1    0
S2   0    1
S3   1    1
Assume ReLU activation (ReLU(x) = max(0, x)) for both the hidden layer and the output layer, and initialise all weights and biases to 0.1.
Solution:
Step 1: Compute Activations for the Hidden Neurons N1 and N2
Each hidden neuron computes z = w1 X1 + w2 X2 + b. Since all weights are initialised to 0.1 and the bias is also 0.1, both hidden neurons compute the same value:

z1 = z2 = (0.1 · X1 ) + (0.1 · X2 ) + 0.1
a1 = max(0, z1 )
a2 = max(0, z2 )
Step 2: Compute Activation for the Output Neuron N3
The output neuron computes:
z3 = (0.1 · a1 ) + (0.1 · a2 ) + 0.1
Y = max(0, z3 )
For S1 : (X1 = 1, X2 = 0)
z1 = z2 = 0.1 + 0 + 0.1 = 0.2, so a1 = a2 = 0.2
z3 = (0.1 · 0.2) + (0.1 · 0.2) + 0.1 = 0.14, so Y = 0.14

For S2 : (X1 = 0, X2 = 1)
z1 = z2 = 0 + 0.1 + 0.1 = 0.2, so a1 = a2 = 0.2
z3 = (0.1 · 0.2) + (0.1 · 0.2) + 0.1 = 0.14, so Y = 0.14

For S3 : (X1 = 1, X2 = 1)
z1 = z2 = 0.1 + 0.1 + 0.1 = 0.3, so a1 = a2 = 0.3
z3 = (0.1 · 0.3) + (0.1 · 0.3) + 0.1 = 0.16, so Y = 0.16
Final Results
Sample   X1   X2   Y
S1       1    0    0.14
S2       0    1    0.14
S3       1    1    0.16
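A short Python sketch of this forward pass, assuming as above that every weight and bias starts at 0.1, reproduces the table:

```python
def relu(x):
    return max(0.0, x)

w, b = 0.1, 0.1  # every weight and bias initialised to 0.1

def forward(x1, x2):
    z1 = w * x1 + w * x2 + b   # hidden neuron N1
    z2 = w * x1 + w * x2 + b   # hidden neuron N2 (identical weights)
    a1, a2 = relu(z1), relu(z2)
    z3 = w * a1 + w * a2 + b   # output neuron N3
    return relu(z3)

for name, (x1, x2) in {"S1": (1, 0), "S2": (0, 1), "S3": (1, 1)}.items():
    print(name, round(forward(x1, x2), 2))  # 0.14, 0.14, 0.16
```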
Exercise 4-4
Given the neural network below with the specified weights and biases, compute the feedforward step and
the backpropagation step. Assume the sigmoid activation function:
Use the given weights and biases such that the network implements the XOR function. Assume the sigmoid activation function for all neurons:

f (x) = 1 / (1 + e^(-x))
Solution:
The XOR function is given by:
Y = X1 · X2′ + X1′ · X2

This can be implemented using two AND gates and an output OR gate. We can use the given network structure such that the two hidden neurons implement the two AND terms, and the output neuron implements the OR. The desired truth table is:

X1   X2   Y
0    0    0
0    1    1
1    0    1
1    1    0

Step 1: Forward Pass for the input (X1 = 0, X2 = 0)
Calculate the net input for the first hidden neuron:
Substituting values:
net1 = (0.21)(0) + (0.15)(0) + (−0.3)(1) = −0.3 (16)
Applying the activation function:
f1 = 1 / (1 + e^(0.3) ) = 0.43 (17)
Calculate net input for the second hidden neuron:
Substituting values:
net2 = (−0.4)(0) + (0.1)(0) + (0.25)(1) = 0.25 (19)
Applying the activation function:
f2 = 1 / (1 + e^(-0.25) ) = 0.56 (20)
Calculate net input for the output neuron:
Substituting values:
net3 = (−0.2)(0.43) + (0.3)(0.56) + (−0.4)(1) = −0.318 (22)
Applying the activation function:
f3 = 1 / (1 + e^(0.318) ) = 0.42 (23)
Expected output = 0, but the network predicted 0.42.
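A short NumPy sketch of this forward pass, with the weights taken from the calculations above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Per-neuron weights: [weight from first input, weight from second input, bias]
h1 = np.array([0.21, 0.15, -0.3])   # first hidden neuron
h2 = np.array([-0.4, 0.1, 0.25])    # second hidden neuron
out = np.array([-0.2, 0.3, -0.4])   # output neuron

x = np.array([0.0, 0.0, 1.0])       # input (0, 0), with a constant 1 for the bias
f1 = sigmoid(h1 @ x)                # ≈ 0.43
f2 = sigmoid(h2 @ x)                # ≈ 0.56
y = sigmoid(out @ np.array([f1, f2, 1.0]))  # ≈ 0.42
print(round(float(f1), 2), round(float(f2), 2), round(float(y), 2))
```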
Step 3: Backpropagation
Goal: We aim to compute the weight update using backpropagation, specifically the gradient

∂Loss/∂w1 (24)

Using the chain rule:

∂Loss/∂w1 = (∂Loss/∂Y ) × (∂Y /∂Net3 ) × (∂Net3 /∂w1 ) (25)
Assuming the squared-error loss Loss = (1/2)(Y − T )^2, the first factor is:

∂Loss/∂Y = Y − T

The output is the sigmoid of the net input:

Y = σ(Net3 ) = 1 / (1 + e^(−Net3) ) (29)

Differentiating the sigmoid:

∂Y /∂Net3 = Y (1 − Y ) (31)

Finally, since

Net3 = w1 V1 + w2 V2 + b (32)

the last factor is ∂Net3 /∂w1 = V1 .
Final Formula

∂Loss/∂w1 = (Y − T ) · Y (1 − Y ) · V1 (34)
Substituting values, with the output error term defined as δ3 = (T − Y )Y (1 − Y ) (the negative of the gradient factor above):

δ3 = (0 − 0.42)(0.42)(1 − 0.42) = (−0.42)(0.42)(0.58) = −0.102
Backpropagation for the Hidden Layers
Error at the output neuron:

δ3 = (T − Y )Y (1 − Y ) = −0.102 (35)

Error at hidden neuron i:

δi = δ3 w3i f ′ (Neti ) (36)

For the two hidden neurons:

δ1 = δ3 w31 Y1 (1 − Y1 ) = (−0.102)(−0.2)(0.43)(0.57) = 0.005 (41)
δ2 = δ3 w32 Y2 (1 − Y2 ) = (−0.102)(0.3)(0.56)(0.44) = −0.0075 (45)
Step 4: Weight Updates
Weights are updated using the formula:
∆w = ηδx
where η (learning rate) is assumed to be 0.1.
Updating Weights for the Output Layer
Since δ3 is defined with (T − Y ), the gradient-descent update adds η · δ3 times the corresponding input to each weight:

w31^new = w31^old + η · δ3 · Y1 = −0.2 + (0.1)(−0.102)(0.43) = −0.2 − 0.0044 = −0.2044 (49)

w32^new = w32^old + η · δ3 · Y2 = 0.3 + (0.1)(−0.102)(0.56) = 0.3 − 0.0057 = 0.2943 (53)

For the bias weight, the corresponding input is 1:

w30^new = w30^old + η · δ3 · 1 = −0.4 + (0.1)(−0.102)(1) = −0.4 − 0.0102 = −0.4102
Summary
After one backpropagation step the output-layer weights become w31 = −0.2044, w32 = 0.2943, and w30 = −0.4102; the hidden-layer weights are updated analogously using δ1 and δ2 .
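A standalone NumPy sketch of this backward pass (using unrounded intermediate values, so the results match the hand calculation up to rounding):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward pass for input (0, 0), as computed above
f1 = sigmoid(0.21 * 0 + 0.15 * 0 - 0.3)   # ≈ 0.43
f2 = sigmoid(-0.4 * 0 + 0.1 * 0 + 0.25)   # ≈ 0.56
y = sigmoid(-0.2 * f1 + 0.3 * f2 - 0.4)   # ≈ 0.42

T, eta = 0.0, 0.1                 # target and learning rate
delta3 = (T - y) * y * (1 - y)    # ≈ -0.102 (defined with T - Y)

# Output-layer updates: w_new = w_old + eta * delta3 * input
w31 = -0.2 + eta * delta3 * f1    # ≈ -0.2044
w32 = 0.3 + eta * delta3 * f2     # ≈ 0.2943
w30 = -0.4 + eta * delta3 * 1.0   # ≈ -0.4102

# Hidden-layer error terms via the chain rule
delta1 = delta3 * (-0.2) * f1 * (1 - f1)  # ≈ 0.005
delta2 = delta3 * 0.3 * f2 * (1 - f2)     # ≈ -0.0075
print(round(float(w31), 4), round(float(w32), 4), round(float(w30), 4))
```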
Exercise 4-5
Suppose we have a perceptron whose weights for its three inputs have the following values: w1 = 2, w2 = −4, and w3 = 1, and whose activation is given by the step function:

ϕ(v) = 1 if v ≥ 0, otherwise 0

Calculate the output value y of the given perceptron for each of the following input patterns:
Solution:
To calculate the output value y for each of the given patterns, we follow two steps:

a) Calculate the weighted sum: v = Σi wi xi = w1 x1 + w2 x2 + w3 x3 , where wi are the weights and xi the inputs.
b) Apply the step activation function ϕ to v.

Given weights:

w1 = 2, w2 = −4, w3 = 1
Input Patterns:
      P1   P2   P3   P4
X1    1    0    1    1
X2    0    1    0    1
X3    0    1    1    1
Step-by-Step Calculations:
For P1 : (X1 = 1, X2 = 0, X3 = 0)
v = (2 × 1) + (−4 × 0) + (1 × 0) = 2
Since v ≥ 0, the output is 1.
For P2 : (X1 = 0, X2 = 1, X3 = 1)
v = (2 × 0) + (−4 × 1) + (1 × 1) = −3
Since v < 0, the output is 0.
For P3 : (X1 = 1, X2 = 0, X3 = 1)
v = (2 × 1) + (−4 × 0) + (1 × 1) = 3
Since v ≥ 0, the output is 1.
For P4 : (X1 = 1, X2 = 1, X3 = 1)
v = (2 × 1) + (−4 × 1) + (1 × 1) = −1
Since v < 0, the output is 0.
Final Output Table:
             P1   P2   P3   P4
Output (y)   1    0    1    0
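A minimal Python sketch of these two steps:

```python
import numpy as np

w = np.array([2.0, -4.0, 1.0])   # w1, w2, w3

def step(v):
    return 1 if v >= 0 else 0    # phi(v)

patterns = {"P1": [1, 0, 0], "P2": [0, 1, 1], "P3": [1, 0, 1], "P4": [1, 1, 1]}
for name, x in patterns.items():
    v = float(w @ np.array(x))   # weighted sum
    print(name, v, step(v))      # v = 2, -3, 3, -1  ->  y = 1, 0, 1, 0
```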
Exercise 4-6
Consider a feed-forward neural network having 2 inputs (labeled 1 and 2) with fully connected layers and two hidden layers:
a) Hidden layer-1: Nodes labeled as 3 and 4
b) Hidden layer-2: Nodes labeled as 5 and 6
A weight on the connection between nodes i and j is represented by wij; for example, w24 is the weight on the connection between nodes 2 and 4. The following list contains all the weight values used in the given network:

w13 = −2, w23 = 3, w14 = 4, w24 = −1
w35 = 1, w45 = −1, w36 = −1, w46 = 1
Each of the nodes 3, 4, 5, and 6 use the following activation function: ϕ(v) = 1 if v ≥ 0 otherwise 0
where v denotes the weighted sum of a node. Each of the input nodes (1 and 2) can only receive binary
values (either 0 or 1). Calculate the output of the network (y5 and y6) for the input pattern given by
(node-1 and node-2 as 0, 0 respectively).
Solution:
To find the output of the network, first calculate the weighted sums of hidden nodes 3 and 4:

v3 = w13 x1 + w23 x2
v4 = w14 x1 + w24 x2

Then find the outputs of the hidden nodes using the activation function ϕ:

y3 = ϕ(v3)
y4 = ϕ(v4)

Use the outputs of the hidden nodes y3 and y4 as the input values to the output layer (nodes 5 and 6), and find the weighted sums of output nodes 5 and 6:

v5 = w35 y3 + w45 y4
v6 = w36 y3 + w46 y4

Finally apply the activation function:

y5 = ϕ(v5)
y6 = ϕ(v6)

The output pattern is (y5, y6).
Performing this calculation for the given input pattern (0, 0):

v3 = (−2)(0) + (3)(0) = 0,  y3 = ϕ(0) = 1
v4 = (4)(0) + (−1)(0) = 0,  y4 = ϕ(0) = 1
v5 = (1)(1) + (−1)(1) = 0,  y5 = ϕ(0) = 1
v6 = (−1)(1) + (1)(1) = 0,  y6 = ϕ(0) = 1
Therefore, the output of the network for a given input pattern is (1, 1).
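A compact NumPy sketch, with the weight matrices assembled from the values listed above:

```python
import numpy as np

def step(v):
    return np.where(v >= 0, 1, 0)   # phi applied elementwise

# Columns of W_h are hidden nodes 3 and 4; rows are inputs 1 and 2
W_h = np.array([[-2.0, 4.0],    # w13, w14
                [ 3.0, -1.0]])  # w23, w24
# Columns of W_o are output nodes 5 and 6; rows are hidden nodes 3 and 4
W_o = np.array([[ 1.0, -1.0],   # w35, w36
                [-1.0,  1.0]])  # w45, w46

x = np.array([0, 0])            # input pattern (0, 0)
y_hidden = step(x @ W_h)        # (y3, y4) = (1, 1)
y_out = step(y_hidden @ W_o)    # (y5, y6) = (1, 1)
print(y_hidden, y_out)
```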
Exercise 4-7
Consider the neural network shown below with a linear activation function. Apply two iterations of the
backpropagation algorithm showing the output and the error after each iteration for an input x1 = 1, x2
= 2 and x3 = -1. Assume a target value of +1, initialize all weights to the value 0.2, and use a learning
rate of 0.1.
Solution:
Given Parameters
• Inputs: x1 = 1, x2 = 2, x3 = −1
• Target Output: y ∗ = 1
• Initial Weights: w1 = w2 = w3 = 0.2
• Learning Rate: η = 0.1
• Activation Function: Linear, y = Σi xi wi
Iteration 1
Forward Pass
y = x1 w1 + x2 w2 + x3 w3
= (1 × 0.2) + (2 × 0.2) + (−1 × 0.2)
= 0.2 + 0.4 − 0.2 = 0.4
Backward Pass
e = y∗ − y
= 1 − 0.4 = 0.6
Updating weights with ∆wi = η e xi :

w1 = 0.2 + (0.1)(0.6)(1) = 0.26
w2 = 0.2 + (0.1)(0.6)(2) = 0.32
w3 = 0.2 + (0.1)(0.6)(−1) = 0.14
Iteration 2
Forward Pass

y = (1 × 0.26) + (2 × 0.32) + (−1 × 0.14) = 0.26 + 0.64 − 0.14 = 0.76

Backward Pass

e = 1 − 0.76 = 0.24
Updating weights:

w1 = 0.26 + (0.1)(0.24)(1) = 0.284
w2 = 0.32 + (0.1)(0.24)(2) = 0.368
w3 = 0.14 + (0.1)(0.24)(−1) = 0.116

Final Results
After iteration 1: output y = 0.4, error e = 0.6. After iteration 2: output y = 0.76, error e = 0.24.
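Both iterations as a short NumPy sketch of the delta rule:

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])  # inputs x1, x2, x3
t, eta = 1.0, 0.1               # target and learning rate
w = np.full(3, 0.2)             # all weights initialised to 0.2

for it in (1, 2):
    y = float(w @ x)            # linear activation: y = sum(xi * wi)
    e = t - y                   # error
    w = w + eta * e * x         # delta-rule update
    print(f"iteration {it}: y = {y:.2f}, e = {e:.2f}, w = {np.round(w, 3)}")
# iteration 1: y = 0.40, e = 0.60, w = [0.26  0.32  0.14 ]
# iteration 2: y = 0.76, e = 0.24, w = [0.284 0.368 0.116]
```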
Exercise 4-8
Consider the neural network shown below with a linear activation function for the output neuron and a
Rectified Linear Unit (ReLU) activation function for the hidden neuron.
Solution:
b) First Iteration
Since az > 0 in the first iteration, the result follows from the derivative obtained in part (a).