4-Neural Networks and Activation Function
[Figure: a two-layer neural network with inputs x1, x2, x3 and output ŷ]
Neural Network Representation
Consider the following representation of a neural network.
It has two layers, i.e., one hidden layer and one output layer.
The activations of the input layer are referred to as a[0], those of the hidden layer as a[1], and those of the output layer as a[2]. Here 'a' stands for activations.
The corresponding parameters are W[1], b[1] for the hidden layer and W[2], b[2] for the output layer.
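As a concrete illustration of these parameter shapes, here is a minimal NumPy sketch; the layer sizes (3 inputs, 4 hidden units, 1 output) are an assumption matching the diagram, not something fixed by the notation:

```python
import numpy as np

# Assumed layer sizes: 3 input features, 4 hidden units, 1 output unit.
n_x, n_h, n_y = 3, 4, 1

# Hidden-layer parameters W[1], b[1] and output-layer parameters W[2], b[2].
W1 = np.random.randn(n_h, n_x) * 0.01   # shape (4, 3)
b1 = np.zeros((n_h, 1))                 # shape (4, 1)
W2 = np.random.randn(n_y, n_h) * 0.01   # shape (1, 4)
b2 = np.zeros((n_y, 1))                 # shape (1, 1)
```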
Neural Network Representation
[Figure: the network with inputs x1, x2, x3 and output ŷ, with one neuron magnified to show its two-step computation]

z = w^T x + b
a = σ(z)
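A minimal NumPy sketch of what one such neuron computes; the input and weight values below are made up purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[0.5], [-1.2], [3.0]])   # one example with 3 features, shape (3, 1)
w = np.array([[0.1], [0.4], [-0.2]])   # weights of a single neuron, shape (3, 1)
b = 0.3                                # bias of that neuron

z = w.T @ x + b    # z = w^T x + b, a 1x1 array
a = sigmoid(z)     # a = sigma(z)
print(z, a)
```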
Computing a Neural Network’s Output
This step is performed by each neuron.
[Figure: each hidden neuron receives x1, x2, x3 and performs the same two-step computation, and the output neuron does the same on the hidden activations]

z = w^T x + b
a = σ(z)
Computing a Neural Network’s Output
This step is performed by each neuron. The equations for the first hidden layer with four
neurons will be:
z[1]_1 = w[1]_1^T x + b[1]_1,   a[1]_1 = σ(z[1]_1)
z[1]_2 = w[1]_2^T x + b[1]_2,   a[1]_2 = σ(z[1]_2)
z[1]_3 = w[1]_3^T x + b[1]_3,   a[1]_3 = σ(z[1]_3)
z[1]_4 = w[1]_4^T x + b[1]_4,   a[1]_4 = σ(z[1]_4)

Stacking the four neurons' weights into a matrix W[1] and their biases and activations into vectors b[1] and a[1] gives:

z[1] = W[1] x + b[1]
a[1] = σ(z[1])
z[2] = W[2] a[1] + b[2]
a[2] = σ(z[2])
To compute these outputs one at a time, we would need a for loop that calculates the value for each neuron individually. But recall that a for loop makes the computation very slow, so we should vectorize the code to get rid of the for loop and make it run faster.
Vectorizing across multiple examples
The non-vectorized form of computing the output from a neural network is:
for i = 1 to m:
    z[1](i) = W[1] x(i) + b[1]
    a[1](i) = σ(z[1](i))
    z[2](i) = W[2] a[1](i) + b[2]
    a[2](i) = σ(z[2](i))
Using this for loop, we calculate the z and a values for each training example separately.
Now we will look at how this can be vectorized. All the training examples are stacked as columns of a single matrix X of shape (n_x, m):

X = [ x(1)  x(2)  ⋯  x(m) ]
Vectorizing across multiple examples
for i = 1 to m:
    z[1](i) = W[1] x(i) + b[1]
    a[1](i) = σ(z[1](i))
    z[2](i) = W[2] a[1](i) + b[2]
    a[2](i) = σ(z[2](i))

Vectorized across all m examples at once, with X, Z[1], A[1], Z[2], A[2] holding one column per example, this becomes:

Z[1] = W[1] X + b[1]
A[1] = σ(Z[1])
Z[2] = W[2] A[1] + b[2]
A[2] = σ(Z[2])
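A minimal NumPy sketch contrasting the explicit loop with the vectorized version; the layer sizes, the number of examples m, and the random parameters are placeholders, and sigmoid is used in both layers as in the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 3, 4, 1, 5
W1, b1 = rng.standard_normal((n_h, n_x)), np.zeros((n_h, 1))
W2, b2 = rng.standard_normal((n_y, n_h)), np.zeros((n_y, 1))
X = rng.standard_normal((n_x, m))   # m examples stacked as columns

# Non-vectorized: loop over the m examples one at a time.
A2_loop = np.zeros((n_y, m))
for i in range(m):
    x_i = X[:, i:i+1]
    a1_i = sigmoid(W1 @ x_i + b1)
    A2_loop[:, i:i+1] = sigmoid(W2 @ a1_i + b2)

# Vectorized: one matrix product handles all m examples at once.
A1 = sigmoid(W1 @ X + b1)    # Z[1] = W[1] X + b[1]
A2 = sigmoid(W2 @ A1 + b2)   # Z[2] = W[2] A[1] + b[2]

assert np.allclose(A2, A2_loop)   # both versions give the same outputs
```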
Activation functions
What are activation functions?
An activation function decides whether a neuron should be activated or not. Its purpose is to introduce non-linearity into the output of a neuron.
Why do we need non-linear activation functions?
A neural network without an activation function is essentially just a linear regression model. The activation function applies a non-linear transformation to the input, making the network capable of learning and performing more complex tasks.
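To see why, here is a small NumPy sketch (layer sizes chosen arbitrarily): composing two layers with no activation in between collapses into a single linear map Wx + b, no matter how many layers are stacked.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 1))
W2, b2 = rng.standard_normal((1, 4)), rng.standard_normal((1, 1))
x = rng.standard_normal((3, 1))

# Two "layers" with no activation in between...
out = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to one linear layer W x + b.
W, b = W2 @ W1, W2 @ b1 + b2
assert np.allclose(out, W @ x + b)
```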
Sigmoid activation function
It is a function which is plotted as an 'S'-shaped graph.

sigmoid: a = 1 / (1 + e^(−z))

Nature: non-linear.
Value range: 0 to 1.
Uses: usually used in the output layer of a binary classifier, where the result is either 0 or 1. Since the sigmoid's value lies between 0 and 1, the result can easily be predicted to be 1 if the value is greater than 0.5 and 0 otherwise.
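A minimal NumPy sketch of the sigmoid and the 0.5 thresholding rule described above; the input values are arbitrary examples:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-3.0, 0.0, 2.0])
a = sigmoid(z)                  # values in (0, 1): roughly [0.047, 0.5, 0.881]
y_hat = (a > 0.5).astype(int)   # predict 1 if a > 0.5, else 0 -> [0, 0, 1]
```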
Tanh activation function
The activation that works almost always better than the sigmoid function is the tanh function, also known as the hyperbolic tangent function. It is mathematically a shifted version of the sigmoid function.

tanh: a = (e^z − e^(−z)) / (e^z + e^(−z))

Nature: non-linear.
Value range: −1 to 1.
ReLU activation function
Equation: A(z) = max(0, z). It gives an output of z if z is positive and 0 otherwise.
Value range: [0, ∞)
Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons activated by the ReLU function.
Uses: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. In simple words, ReLU learns much faster than the sigmoid and tanh functions.
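A minimal NumPy sketch of both functions; the input values are arbitrary:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)   # A(z) = max(0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(np.tanh(z))   # tanh: (e^z - e^-z) / (e^z + e^-z), values in (-1, 1), zero-centered
print(relu(z))      # 0 for the negative inputs, 1.5 passed through unchanged
```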
Leaky ReLU activation function
It is an attempt to solve the dying ReLU problem.
Equation: A(z) = max(0.01z, z). It gives an output of z if z is positive and 0.01z otherwise.
The leak helps to increase the range of the ReLU function. Usually, the value of the slope a is 0.01 or so. When a is not fixed at 0.01 but chosen randomly, it is called Randomized ReLU. Therefore, the range of the Leaky ReLU is (−∞, ∞).
Both the Leaky and Randomized ReLU functions are monotonic in nature, and so are their derivatives.
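A minimal NumPy sketch of Leaky ReLU with the slope a = 0.01 mentioned above; the input values are arbitrary:

```python
import numpy as np

def leaky_relu(z, a=0.01):
    # A(z) = max(a*z, z): returns z when z > 0, the small leak a*z otherwise
    return np.maximum(a * z, z)

z = np.array([-3.0, -0.5, 0.0, 2.0])
print(leaky_relu(z))   # [-0.03, -0.005, 0., 2.]
```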
Softmax activation function
The softmax function is a generalization of the sigmoid function that is handy when we are trying to handle multi-class classification problems.
Nature: non-linear.
Uses: usually used when handling multiple classes. The softmax function squeezes the output for each class to between 0 and 1 and also divides each output by the sum of all the outputs.
Output: the softmax function is ideally used in the output layer of the classifier, where we are actually trying to obtain the probabilities that define the class of each input.
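A minimal NumPy sketch of softmax applied to the raw scores of a 3-class example; the scores are made up, and subtracting the maximum is a standard numerical-stability trick rather than part of the definition:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()          # each output in (0, 1), all outputs sum to 1

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)              # roughly [0.659, 0.242, 0.099]
pred_class = int(np.argmax(probs))   # index of the most probable class -> 0
```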
Activation Functions
Activation Function | Pros | Cons
Sigmoid | Useful for binary classification | Output is restricted between 0 and 1
tanh | Better than sigmoid | Parameters are updated slowly when points are at extreme ends
ReLU | Parameters are updated faster as the slope is 1 when x > 0 | Zero slope when x < 0
• The basic rule of thumb is that if you really don't know which activation function to use, simply use ReLU, as it is a general activation function and is used in most cases these days.
• If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer.
Choosing a good W
f(x,W) = Wx + b