
German International University of Applied Sciences

Informatics and Computer Science


Dr. Caroline Sabty
TA. Merna Said
TA. Ayman Iskandar

Advanced Machine Learning, Spring 2025


Practice Assignment 4

Exercise 4-1
Consider the neural network shown below with linear activation functions. Can any function that is
implemented by the shown network be represented as a neural network with a single neuron? If yes, what
are the corresponding weights?

Solution:
The given neural network consists of three layers with linear activation functions:

• Input Layer: Two inputs, X1 and X2 .

• Hidden Layer: Two neurons, N1 and N2 .


• Output Layer: One neuron, N3 , producing the final output Y .

Each neuron computes a weighted sum of its inputs.


Step 1: Equations for Each Neuron The equation for each neuron follows the linear summation
formula:
V = w0 + w1 X1 + w2 X2 (1)

For the hidden layer neurons:

V1 = w10 + w11 X1 + w12 X2 (2)


V2 = w20 + w21 X1 + w22 X2 (3)

For the output neuron:


Y = w0 + w1 V1 + w2 V2 (4)

Substituting V1 and V2 :

Y = w0 + w1 (w10 + w11 X1 + w12 X2 ) + w2 (w20 + w21 X1 + w22 X2 ) (5)

Expanding:

Y = w0 + w1 w10 + w1 w11 X1 + w1 w12 X2 + w2 w20 + w2 w21 X1 + w2 w22 X2 (6)

Grouping terms:
Y = (w0 + w1 w10 + w2 w20 ) + (w1 w11 + w2 w21 )X1 + (w1 w12 + w2 w22 )X2 (7)

Step 2: Representing as a Single Neuron This equation can be rewritten in the form:
Y = W0′ + W1′ X1 + W2′ X2 (8)

where:
W0′ = w0 + w1 w10 + w2 w20 (9)
W1′ = w1 w11 + w2 w21 (10)
W2′ = w1 w12 + w2 w22 (11)

Thus, the function implemented by the given network can be represented by a single-layer neuron with
these equivalent weights.

Yes. Because all activation functions are linear, the network can be replaced by a single neuron: the function Y can be written as Y = W1′ X1 + W2′ X2 + W0′ with
W1′ = w1 w11 + w2 w21 , W2′ = w1 w12 + w2 w22 , and W0′ = w0 + w1 w10 + w2 w20 .
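As a quick sanity check, the following short Python sketch (with arbitrary example weights, since the figure's weights are not reproduced in the text) builds the two-layer linear network and the equivalent single neuron and confirms they agree.

```python
import numpy as np

# Arbitrary example weights (not taken from the assignment figure), chosen only
# to illustrate that a linear 2-2-1 network collapses to a single neuron.
w10, w11, w12 = 0.5, -1.0, 2.0   # hidden neuron N1
w20, w21, w22 = -0.3, 0.7, 1.5   # hidden neuron N2
w0, w1, w2 = 0.1, 0.4, -0.6      # output neuron N3

def two_layer(x1, x2):
    v1 = w10 + w11 * x1 + w12 * x2
    v2 = w20 + w21 * x1 + w22 * x2
    return w0 + w1 * v1 + w2 * v2

# Equivalent single-neuron weights from Equations (9)-(11).
W0 = w0 + w1 * w10 + w2 * w20
W1 = w1 * w11 + w2 * w21
W2 = w1 * w12 + w2 * w22

def single_neuron(x1, x2):
    return W0 + W1 * x1 + W2 * x2

for x1, x2 in [(0, 0), (1, 0), (0, 1), (2.5, -3.0)]:
    assert np.isclose(two_layer(x1, x2), single_neuron(x1, x2))
print("Two-layer linear network and single neuron agree on all test inputs.")
```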

Exercise 4-2
Compute the activations and the output of a single neuron for the following samples.

Initialise all weights and the bias to 0.1, and use the sigmoid function given below as the activation function.

S X1 X2 X3
S1 1 0 1
S2 0 1 1
S3 1 1 0

Solution:
You have a single neuron that takes three inputs (X1 , X2 , X3 ) and applies a weighted sum
plus a bias. Then, the result is passed through an activation function (sigmoid function) to
get the final output y.
The neuron’s equation is given by:

y = f (wT X + b) (12)

where:

• w = [w1 , w2 , w3 ] are the weights (each initialized to 0.1).


• X = [X1 , X2 , X3 ] are the inputs.
• b is the bias (also initialized to 0.1).
• f (x) is the sigmoid activation function, defined as:

S(x) = 1/(1 + e^{-x})
The given dataset consists of three samples:

S X1 X2 X3
S1 1 0 1
S2 0 1 1
S3 1 1 0

Step 1: Compute the Weighted Sum z


The weighted sum is computed as:

z = w1 X1 + w2 X2 + w3 X3 + b

Since all weights are 0.1 and the bias is also 0.1, we substitute:

z = (0.1 × X1 ) + (0.1 × X2 ) + (0.1 × X3 ) + 0.1

For Sample S1 (X1 = 1, X2 = 0, X3 = 1):

z = (0.1 × 1) + (0.1 × 0) + (0.1 × 1) + 0.1

z = 0.1 + 0 + 0.1 + 0.1 = 0.3

For Sample S2 (X1 = 0, X2 = 1, X3 = 1):

z = (0.1 × 0) + (0.1 × 1) + (0.1 × 1) + 0.1

z = 0 + 0.1 + 0.1 + 0.1 = 0.3

For Sample S3 (X1 = 1, X2 = 1, X3 = 0):

z = (0.1 × 1) + (0.1 × 1) + (0.1 × 0) + 0.1

z = 0.1 + 0.1 + 0 + 0.1 = 0.3

Step 2: Apply the Sigmoid Function


Now, we apply the sigmoid function:
S(x) = 1/(1 + e^{-x})

For all three samples, we found z = 0.3, so:

S(0.3) = 1/(1 + e^{-0.3})

Approximating e^{-0.3} ≈ 0.7408:

S(0.3) = 1/(1 + 0.7408) = 1/1.7408 ≈ 0.5744
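The same computation can be reproduced with a few lines of Python (a minimal sketch of the single sigmoid neuron described above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w = np.array([0.1, 0.1, 0.1])  # weights w1, w2, w3
b = 0.1                        # bias

samples = {
    "S1": np.array([1, 0, 1]),
    "S2": np.array([0, 1, 1]),
    "S3": np.array([1, 1, 0]),
}

for name, x in samples.items():
    z = w @ x + b              # weighted sum
    y = sigmoid(z)             # activation
    print(f"{name}: z = {z:.1f}, y = {y:.4f}")
# Every sample gives z = 0.3 and y ≈ 0.5744.
```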

Exercise 4-3
Compute activations and the output for the following samples

S X1 X2
S1 1 0
S2 0 1
S3 1 1

Assume ReLU activation (ReLU(x) = max(0, x)) for both the hidden layer and the output layer. Initialise all weights and biases to 0.1.

Solution:
Assume ReLU activation (ReLU(x) = max(0, x)) for both hidden layer and output layer. Initialize all
weights and bias to 0.1.

Step 1: Compute Activations for the Hidden Layer


Each neuron in the hidden layer (N1 and N2 ) computes the weighted sum of inputs:

z = w1 X1 + w2 X2 + b

Since all weights are initialized to 0.1 and the bias is also 0.1:

z1 = (0.1 · X1 ) + (0.1 · X2 ) + 0.1

a1 = max(0, z1 )
z2 = (0.1 · X1 ) + (0.1 · X2 ) + 0.1
a2 = max(0, z2 )

Step 2: Compute Activation for the Output Neuron N3
The output neuron computes:
z3 = (0.1 · a1 ) + (0.1 · a2 ) + 0.1
Y = max(0, z3 )

Step 3: Compute Outputs for Given Inputs


For S1 : (X1 = 1, X2 = 0)

z1 = (0.1 × 1) + (0.1 × 0) + 0.1 = 0.2


a1 = max(0, 0.2) = 0.2
z2 = (0.1 × 1) + (0.1 × 0) + 0.1 = 0.2
a2 = max(0, 0.2) = 0.2
z3 = (0.1 × 0.2) + (0.1 × 0.2) + 0.1 = 0.14
Y = max(0, 0.14) = 0.14

For S2 : (X1 = 0, X2 = 1)

z1 = (0.1 × 0) + (0.1 × 1) + 0.1 = 0.2


a1 = max(0, 0.2) = 0.2
z2 = (0.1 × 0) + (0.1 × 1) + 0.1 = 0.2
a2 = max(0, 0.2) = 0.2
z3 = (0.1 × 0.2) + (0.1 × 0.2) + 0.1 = 0.14
Y = max(0, 0.14) = 0.14

For S3 : (X1 = 1, X2 = 1)

z1 = (0.1 × 1) + (0.1 × 1) + 0.1 = 0.3


a1 = max(0, 0.3) = 0.3
z2 = (0.1 × 1) + (0.1 × 1) + 0.1 = 0.3
a2 = max(0, 0.3) = 0.3
z3 = (0.1 × 0.3) + (0.1 × 0.3) + 0.1 = 0.16
Y = max(0, 0.16) = 0.16

Final Results

Sample X1 X2 Y
S1 1 0 0.14
S2 0 1 0.14
S3 1 1 0.16
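A short Python sketch of the 2-2-1 ReLU network described above reproduces this table:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

W_hidden = np.full((2, 2), 0.1)  # weights into hidden neurons N1, N2
b_hidden = np.full(2, 0.1)       # hidden biases
W_out = np.full(2, 0.1)          # weights into output neuron N3
b_out = 0.1                      # output bias

samples = {"S1": np.array([1, 0]),
           "S2": np.array([0, 1]),
           "S3": np.array([1, 1])}

for name, x in samples.items():
    a = relu(W_hidden @ x + b_hidden)   # hidden activations a1, a2
    y = relu(W_out @ a + b_out)         # network output Y
    print(f"{name}: a = {a}, Y = {y:.2f}")
# Expected: S1 -> 0.14, S2 -> 0.14, S3 -> 0.16
```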

Exercise 4-4
Given the neural network below with the specified weights and biases, compute the feedforward step and
the backpropagation step. Assume the sigmoid activation function:

The given weights and biases are such that the network implements the XOR function. Assume the sigmoid activation function for all neurons.

Use the given weights and biases to:


1. Compute the network's output during feedforward.
2. Perform backpropagation to update the weights.

Solution:

The XOR function is given by:
Y = X1 ∗ X2′ + X2 ∗ X1′
which can be implemented using 2 AND gates and an output OR gate. We can use the given neural network structure such that the two neurons in the first hidden layer implement the 2 AND gates, and the output neuron implements the OR gate. The desired truth table is:

X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0

Step 1: Define the Activation Function


The activation function used is the sigmoid function:
f(net_j) = 1/(1 + e^{-net_j}) (14)

Step 2: Feedforward - First Pass


Input pattern: (x1 , x2 ) = (0, 0), target = 0
Calculate net input for the first hidden neuron:

net1 = w11 x1 + w12 x2 + b1 (15)

Substituting values:
net1 = (0.21)(0) + (0.15)(0) + (−0.3)(1) = −0.3 (16)
Applying the activation function:
f1 = 1/(1 + e^{0.3}) ≈ 0.43 (17)
Calculate net input for the second hidden neuron:

net2 = w21 x1 + w22 x2 + b2 (18)

Substituting values:
net2 = (−0.4)(0) + (0.1)(0) + (0.25)(1) = 0.25 (19)
Applying the activation function:
f2 = 1/(1 + e^{-0.25}) ≈ 0.56 (20)
Calculate net input for the output neuron:

net3 = w31 f1 + w32 f2 + b3 (21)

Substituting values:
net3 = (−0.2)(0.43) + (0.3)(0.56) + (−0.4)(1) = −0.318 (22)
Applying the activation function:
f3 = 1/(1 + e^{0.318}) ≈ 0.42 (23)
Expected output = 0, but the network predicted 0.42.
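The forward pass above can be reproduced with the following Python sketch, using the weights and biases substituted in Equations (16), (19), and (22):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Weights and biases as used in Equations (16), (19) and (22).
w11, w12, b1 = 0.21, 0.15, -0.3   # hidden neuron 1
w21, w22, b2 = -0.40, 0.10, 0.25  # hidden neuron 2
w31, w32, b3 = -0.20, 0.30, -0.4  # output neuron

x1, x2 = 0, 0                     # input pattern (0, 0)

net1 = w11 * x1 + w12 * x2 + b1
f1 = sigmoid(net1)                # ≈ 0.43
net2 = w21 * x1 + w22 * x2 + b2
f2 = sigmoid(net2)                # ≈ 0.56
net3 = w31 * f1 + w32 * f2 + b3
f3 = sigmoid(net3)                # ≈ 0.42 (target is 0)

print(f"f1={f1:.2f}, f2={f2:.2f}, output={f3:.2f}")
```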

Step 3: Backpropagation - Second Pass
Goal We aim to compute the weight update using backpropagation, specifically:
∂Loss/∂w1 (24)

Using the chain rule:

∂Loss/∂w1 = (∂Loss/∂Y) × (∂Y/∂Net3) × (∂Net3/∂w1) (25)

Step 1: Compute ∂Loss/∂Y

Using Mean Squared Error (MSE) loss:

Loss = (1/2)(Y − T)^2 (26)

Differentiating:

d/dY [(1/2)(Y − T)^2] = (Y − T) (27)

Result:

∂Loss/∂Y = (Y − T) (28)
Step 2: Compute ∂Y/∂Net3

The activation function is the sigmoid:

Y = σ(Net3) = 1/(1 + e^{-Net3}) (29)

Differentiating:

dY/dNet3 = Y(1 − Y) (30)

Result:

∂Y/∂Net3 = Y(1 − Y) (31)

Step 3: Compute ∂Net3/∂w1

The weighted sum is:

Net3 = w1 V1 + w2 V2 + b (32)

Differentiating with respect to w1:

∂Net3/∂w1 = V1 (33)

Final Formula
∂Loss/∂w1 = (Y − T) · Y(1 − Y) · V1 (34)
Substituting values, and writing δ3 = (T − Y)·Y(1 − Y) for the error term at the output neuron (the negative of ∂Loss/∂Net3, defined formally below):

δ3 = (0 − 0.42)(0.42)(1 − 0.42)

δ3 = (−0.42)(0.42)(0.58) = −0.102
Backpropagation for Hidden Layers
Error at the output neuron:
δ3 = (T − Y )Y (1 − Y ) (35)
Error at hidden neuron i:
δi = δ3 w3i f ′ (Neti ) (36)

The error at the output neuron is computed as:

δ3 = (T − Y )Y (1 − Y ) (37)

Substituting the given values:

δ3 = (0 − 0.42)(0.42)(1 − 0.42) (38)

δ3 = (−0.42)(0.42)(0.58) (39)

δ3 = −0.102 (40)

The error at hidden neuron 1 is computed as:

δ1 = δ3 w31 Y1 (1 − Y1 ) (41)

Substituting the given values:

δ1 = (−0.102)(−0.2)(0.43)(1 − 0.43) (42)

δ1 = (−0.102)(−0.2)(0.43)(0.57) (43)

δ1 = 0.005 (44)

The error at hidden neuron 2 is computed as:

δ2 = δ3 w32 Y2 (1 − Y2 ) (45)

Substituting the given values:

δ2 = (−0.102)(0.3)(0.56)(1 − 0.56) (46)

δ2 = (−0.102)(0.3)(0.56)(0.44) (47)

δ2 = −0.007 (48)

Step 4: Weight Updates
Weights are updated using the formula:

w^new = w^old − η · δ · x

where η (the learning rate) is assumed to be 0.1.
Updating Weights for Output Layer

w31^new = w31^old − η · δ3 · Y1 (49)

w31^new = −0.2 − (0.1)(−0.102)(0.43) (50)

w31^new = −0.2 + 0.004386 (51)

w31^new = −0.1956 (52)

Updating w32:

w32^new = w32^old − η · δ3 · Y2 (53)

w32^new = 0.3 − (0.1)(−0.102)(0.56) (54)

w32^new = 0.3 − (−0.0057) (55)

w32^new = 0.3057 (56)

Updating the output bias w30:

w30^new = w30^old − η · δ3 · input

w30^new = −0.4 − (0.1 × −0.102 × 1)

= −0.4 + 0.0102

= −0.3898

Updating Weights for Hidden Layer

w^new = w^old − η · δ · input

Hidden Layer Weight Updates:

Bias weight for the first hidden neuron w10:

w10 = −0.3 − (0.1 × 0.005 × 1) = −0.3 − 0.0005 = −0.3005

Weight w11 connected to x1:

w11 = 0.21 − (0.1 × 0.005 × 0) = 0.21

Weight w12 connected to x2:

w12 = 0.15 − (0.1 × 0.005 × 0) = 0.15

Bias weight for the second hidden neuron w20:

w20 = 0.25 − (0.1 × −0.0075 × 1) = 0.25 + 0.00075 = 0.25075

Weight w21 connected to x1:

w21 = −0.4 − (0.1 × −0.0075 × 0) = −0.4

Weight w22 connected to x2:

w22 = 0.1 − (0.1 × −0.0075 × 0) = 0.1

Summary

• Forward pass: Compute weighted sums, apply activation functions.


• Backpropagation: Compute errors and adjust weights.
• Repeat until convergence (small error).
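For reference, the following Python sketch carries out the whole step (feedforward, error terms, and weight updates) using the same conventions as the worked example above, i.e. δ = (T − Y)·Y·(1 − Y) and the update rule w^new = w^old − η·δ·input:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

eta = 0.1                          # learning rate
x1, x2, target = 0, 0, 0           # input pattern and target

# Weights and biases from the figure, as used in the worked example above.
w11, w12, b1 = 0.21, 0.15, -0.3
w21, w22, b2 = -0.40, 0.10, 0.25
w31, w32, b3 = -0.20, 0.30, -0.4

# Feedforward
f1 = sigmoid(w11 * x1 + w12 * x2 + b1)
f2 = sigmoid(w21 * x1 + w22 * x2 + b2)
y = sigmoid(w31 * f1 + w32 * f2 + b3)

# Error terms, following delta = (T - Y) Y (1 - Y) as in Eqs. (35)-(36)
d3 = (target - y) * y * (1 - y)
d1 = d3 * w31 * f1 * (1 - f1)
d2 = d3 * w32 * f2 * (1 - f2)

# Weight updates with the rule w_new = w_old - eta * delta * input
w31 -= eta * d3 * f1
w32 -= eta * d3 * f2
b3  -= eta * d3 * 1
w11 -= eta * d1 * x1
w12 -= eta * d1 * x2
b1  -= eta * d1 * 1
w21 -= eta * d2 * x1
w22 -= eta * d2 * x2
b2  -= eta * d2 * 1

print(f"output={y:.2f}, d3={d3:.3f}, d1={d1:.3f}, d2={d2:.4f}")
print(f"w31={w31:.4f}, w32={w32:.4f}, b3={b3:.4f}")
```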

Exercise 4-5
Suppose we have a perceptron whose weights for the three inputs have the following values: w1 = 2, w2 = −4, and w3 = 1, and whose activation is given by the step function:
ϕ(v) = 1 if v ≥ 0, otherwise 0
Calculate the output value y of the given perceptron for each of the following input patterns:

Solution:
To calculate the output value y for each of the given patterns, we follow two steps:

a) Calculate the weighted sum: v = Σi wi xi = w1 x1 + w2 x2 + w3 x3
b) Apply the activation function to v.

For each pattern, we calculate the weighted sum and then apply the activation function to it:

A perceptron computes the weighted sum of inputs and applies an activation function to determine
the output. The general formula is:
v = w1 x1 + w2 x2 + w3 x3
where:

• x1 , x2 , x3 are the inputs,


• w1 , w2 , w3 are the corresponding weights,
• v is the weighted sum.

The step function is defined as:

ϕ(v) = 1 if v ≥ 0, 0 otherwise

Given weights:
w1 = 2, w2 = −4, w3 = 1

Input Patterns:
P1 P2 P3 P4
X1 1 0 1 1
X2 0 1 0 1
X3 0 1 1 1
Step-by-Step Calculations:
For P1 : (X1 = 1, X2 = 0, X3 = 0)
v = (2 × 1) + (−4 × 0) + (1 × 0) = 2
Since v ≥ 0, the output is 1.
For P2 : (X1 = 0, X2 = 1, X3 = 1)
v = (2 × 0) + (−4 × 1) + (1 × 1) = −3
Since v < 0, the output is 0.
For P3 : (X1 = 1, X2 = 0, X3 = 1)
v = (2 × 1) + (−4 × 0) + (1 × 1) = 3
Since v ≥ 0, the output is 1.
For P4 : (X1 = 1, X2 = 1, X3 = 1)
v = (2 × 1) + (−4 × 1) + (1 × 1) = −1
Since v < 0, the output is 0.
Final Output Table:
P1 P2 P3 P4
Output(y) 1 0 1 0
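These four evaluations can be checked with a few lines of Python:

```python
import numpy as np

def step(v):
    return 1 if v >= 0 else 0

w = np.array([2, -4, 1])           # weights w1, w2, w3

patterns = {
    "P1": np.array([1, 0, 0]),
    "P2": np.array([0, 1, 1]),
    "P3": np.array([1, 0, 1]),
    "P4": np.array([1, 1, 1]),
}

for name, x in patterns.items():
    v = w @ x                      # weighted sum
    print(f"{name}: v = {v}, y = {step(v)}")
# Expected outputs: P1 -> 1, P2 -> 0, P3 -> 1, P4 -> 0
```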

Exercise 4-6
Consider a feed-forward neural network having 2 inputs (labeled 1 and 2) with fully connected layers and 2 hidden layers:

a) Hidden layer-1: Nodes labeled as 3 and 4
b) Hidden layer-2: Nodes labeled as 5 and 6

A weight on the connection between nodes i and j is represented by wij; for example, w24 is the weight on the connection between nodes 2 and 4. The following list contains all the weight values used in the given network:

w13 = -2, w35 = 1, w23 = 3, w45 = -1, w14 = 4, w36 = -1, w24 = -1, w46 = 1

Each of the nodes 3, 4, 5, and 6 uses the following activation function: ϕ(v) = 1 if v ≥ 0, otherwise 0, where v denotes the weighted sum of a node. Each of the input nodes (1 and 2) can only receive binary values (either 0 or 1). Calculate the output of the network (y5 and y6) for the input pattern (0, 0) applied to nodes 1 and 2, respectively.

Solution:
To find the output of the network it is necessary to calculate weighted sums of hidden nodes 3 and 4:

v3 = w13 x1 + w23 x2
v4 = w14 x1 + w24 x2

Then find the outputs from hidden nodes using activation function ϕ :

y3 = ϕ(v3)
y4 = ϕ(v4).

Use the outputs of the hidden nodes y3 and y4 as the input values to the output layer (nodes 5 and 6),
and find weighted sums of output nodes 5 and 6:

v5 = w35 y3 + w45 y4
v6 = w36 y3 + w46 y4

Finally, find the outputs from nodes 5 and 6 (also using ϕ) :

y5 = ϕ(v5)
y6 = ϕ(v6).
The output pattern will be (y5, y6).
Perform this calculation for the given input pattern (0, 0):

v3 = −2 ∗ 0 + 3 ∗ 0 = 0,
y3 = ϕ(0) = 1

v4 = 4 ∗ 0 − 1 ∗ 0 = 0
y4 = ϕ(0) = 1

v5 = 1 ∗ 1 − 1 ∗ 1 = 0
y5 = ϕ(0) = 1

v6 = −1 ∗ 1 + 1 ∗ 1 = 0
y6 = ϕ(0) = 1

Therefore, the output of the network for a given input pattern is (1, 1).
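The same computation in Python, for the (0, 0) input pattern:

```python
def step(v):
    return 1 if v >= 0 else 0

# Weights from the exercise: w_ij is the weight from node i to node j.
w13, w23 = -2, 3
w14, w24 = 4, -1
w35, w45 = 1, -1
w36, w46 = -1, 1

x1, x2 = 0, 0                      # input pattern (0, 0)

# First hidden layer (nodes 3 and 4)
y3 = step(w13 * x1 + w23 * x2)
y4 = step(w14 * x1 + w24 * x2)

# Second layer (nodes 5 and 6), whose activations form the network output
y5 = step(w35 * y3 + w45 * y4)
y6 = step(w36 * y3 + w46 * y4)

print((y5, y6))                    # -> (1, 1)
```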

Exercise 4-7
Consider the neural network shown below with a linear activation function. Apply two iterations of the
backpropagation algorithm showing the output and the error after each iteration for an input x1 = 1, x2
= 2 and x3 = -1. Assume a target value of +1, initialize all weights to the value 0.2, and use a learning
rate of 0.1.

Solution:
Given Parameters

• Inputs: x1 = 1, x2 = 2, x3 = −1
• Target Output: y ∗ = 1
• Initial Weights: w1 = w2 = w3 = 0.2
• Learning Rate: η = 0.1
• Activation Function: Linear, y = Σi xi wi

Iteration 1
Forward Pass
y = x1 w1 + x2 w2 + x3 w3
= (1 × 0.2) + (2 × 0.2) + (−1 × 0.2)
= 0.2 + 0.4 − 0.2 = 0.4

Backward Pass
e = y∗ − y
= 1 − 0.4 = 0.6

Update Weights Using gradient descent:


∆w1 = η × e × x1 = 0.1 × 0.6 × 1 = 0.06
∆w2 = 0.1 × 0.6 × 2 = 0.12
∆w3 = 0.1 × 0.6 × (−1) = −0.06

Updating weights:

w1new = 0.2 + 0.06 = 0.26


w2new = 0.2 + 0.12 = 0.32
w3new = 0.2 − 0.06 = 0.14

Iteration 2
Forward Pass

y = (1 × 0.26) + (2 × 0.32) + (−1 × 0.14)


= 0.26 + 0.64 − 0.14 = 0.76

Backward Pass

e = 1 − 0.76 = 0.24

Update Weights

∆w1 = 0.1 × 0.24 × 1 = 0.024


∆w2 = 0.1 × 0.24 × 2 = 0.048
∆w3 = 0.1 × 0.24 × (−1) = −0.024

Updating weights:

w1new = 0.26 + 0.024 = 0.284


w2new = 0.32 + 0.048 = 0.368
w3new = 0.14 − 0.024 = 0.116

Final Results

Iteration Output y Error e


1 0.4 0.6
2 0.76 0.24
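The two iterations can be reproduced with a short loop, applying the delta rule Δwi = η · e · xi as above:

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])     # inputs x1, x2, x3
t = 1.0                            # target output
w = np.full(3, 0.2)                # initial weights
eta = 0.1                          # learning rate

for it in (1, 2):
    y = w @ x                      # linear output
    e = t - y                      # error
    w = w + eta * e * x            # delta-rule weight update
    print(f"Iteration {it}: y = {y:.2f}, e = {e:.2f}, w = {np.round(w, 3)}")
# Iteration 1: y = 0.40, e = 0.60; Iteration 2: y = 0.76, e = 0.24
```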

Exercise 4-8
Consider the neural network shown below with a linear activation function for the output neuron and a
Rectified Linear Unit (ReLU) activation function for the hidden neuron.

a) Given an activation a, the ReLU activation function h is defined by h(a) = max(0, a).

Derive the expression of the derivative of the ReLU activation function.


b) Apply one iteration of the backpropagation algorithm showing the output and the error after the
iteration for an input x1 = 1 and x2 = 2. Assume a target value of +1, initialize all weights to the
value 0.2, and use a learning rate of 0.1.

Solution:

a) We compute the derivative of each part of the function, which gives h′(a) = 1 for a > 0 and h′(a) = 0 for a < 0 (the derivative is undefined at a = 0; in practice it is taken to be 0).

b) First Iteration

Since the hidden activation az > 0 in the first iteration, its ReLU derivative equals 1, based on the derivative obtained in part (a).

For the output neuron:

For the hidden neuron:

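The detailed numbers for this iteration were given in the original figures and are not reproduced here. The sketch below works out one iteration under the assumptions, inferred from the problem statement rather than confirmed by the figure, that the network has a single ReLU hidden neuron feeding one linear output neuron with no bias terms, using the same delta-rule update as Exercise 4-7:

```python
import numpy as np

# Assumed topology (inferred, not confirmed by the missing figure):
# two inputs -> one ReLU hidden neuron -> one linear output neuron, no biases.
x = np.array([1.0, 2.0])   # inputs x1, x2
t = 1.0                    # target
eta = 0.1                  # learning rate
w_h = np.full(2, 0.2)      # input -> hidden weights
w_o = 0.2                  # hidden -> output weight

# Forward pass
z = w_h @ x                # hidden pre-activation a_z = 0.6
a = max(0.0, z)            # ReLU activation
y = w_o * a                # linear output = 0.12
e = t - y                  # error = 0.88

# Backward pass (ReLU derivative is 1 because z > 0)
w_o_new = w_o + eta * e * a
w_h_new = w_h + eta * e * w_o * 1.0 * x

print(f"output y = {y:.2f}, error e = {e:.2f}")
print(f"updated w_o = {w_o_new:.3f}, updated w_h = {np.round(w_h_new, 4)}")
```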