
Deep Learning Approach for Image Processing

UNIT-5
Artificial Neural Networks
 Computational models inspired by the human brain:
 Algorithms that try to mimic the brain.
 Massively parallel, distributed systems made up of simple processing units (neurons)
 Synaptic connection strengths among neurons are used to store the acquired knowledge.
 Knowledge is acquired by the network from its environment through a learning process.
History
 Late 1800s – neural networks appear as an analogy to biological systems
 1960s and 70s – simple neural networks appear
  Fall out of favor because the perceptron is not effective by itself, and there were no good algorithms for multilayer nets
 1986 – backpropagation algorithm appears
  Neural networks have a resurgence in popularity
  More computationally expensive
Biological Neuron

 A variety of different neurons exist (motor neurons, on-center off-surround visual cells, …), with different branching structures.

 The connections of the network and the strengths of the individual synapses establish the function of the network.
Biological Neuron
Properties of ANNs
 Inputs are flexible
  any real values
  highly correlated or independent
 Target function may be discrete-valued, real-valued, or vectors of discrete or real values
 Outputs are real numbers between 0 and 1
 Resistant to errors in the training data
 Long training time
 Fast evaluation
 The function produced can be difficult for humans to interpret
When to consider neural networks
 Input is high-dimensional discrete or real-valued
 Output is discrete or real-valued
 Output is a vector of values
 Possibly noisy data
 Form of target function is unknown
 Human readability of the result is not important
Examples:
 Speech phoneme recognition
 Image classification
 Financial prediction
Perceptron
 Basic unit in a neural network
 Linear separator
 Parts
 N inputs, x1 ... xn
 Weights for each input, w1 ... wn
 A bias input x0 (constant) and associated weight w0
 Weighted sum of inputs, y = w0x0 + w1x1 + ... + wnxn
 A threshold function or activation function,
  i.e., output 1 if y > t, and -1 if y <= t
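As an illustrative sketch (not from the slides), the perceptron described above can be written directly in Python; the threshold, weights, and inputs below are hypothetical placeholders.

```python
import numpy as np

def perceptron_output(x, w, t):
    """Threshold unit: weighted sum of inputs compared against threshold t.
    x and w are equal-length arrays; a bias can be included as x0 = 1 with weight w0."""
    y = np.dot(w, x)           # weighted sum w0*x0 + w1*x1 + ... + wn*xn
    return 1 if y > t else -1  # activation: +1 above threshold, -1 otherwise

# Example with made-up numbers: two inputs plus a constant bias input x0 = 1
x = np.array([1.0, 0.5, -0.3])   # x0 (bias), x1, x2
w = np.array([0.2, 0.4, 0.7])    # w0, w1, w2
print(perceptron_output(x, w, t=0.0))
```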
A Neuron (= a perceptron)

[Figure: inputs x0, x1, ..., xn with weights w0, w1, ..., wn feed a weighted sum, followed by an activation function f with threshold t that produces the output y.]

For example:

 y = sign( Σ_{i=0}^{n} wi xi − t )

 The n-dimensional input vector x is mapped into the variable y by means of the scalar product and a nonlinear function mapping.
Artificial Neuron Model
Bias
Activation functions
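The slides show the standard activation functions as figures; a minimal Python sketch of the common choices (hard threshold, sigmoid, tanh, ReLU) is given below for reference.

```python
import numpy as np

def step(x, t=0.0):       # hard threshold (as in the basic perceptron)
    return np.where(x > t, 1.0, -1.0)

def sigmoid(x):           # smooth, differentiable squashing to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):              # squashing to (-1, 1)
    return np.tanh(x)

def relu(x):              # rectified linear unit, common in deep networks
    return np.maximum(0.0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(step(z), sigmoid(z), tanh(z), relu(z), sep="\n")
```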
Perceptron

 Input values → Linear weighted sum → Threshold

Decision surface of a perceptron

 Representational power of perceptrons
  - Linearly separable case like (a): possible to classify by a hyperplane
  - Linearly inseparable case like (b): impossible to classify
Perceptron training rule (delta rule)

 wi ← wi + Δwi
 where Δwi = η (t – o) xi

Where:
 t = c(x) is the target value
 o is the perceptron output
 η is a small constant (e.g., 0.1) called the learning rate

Can prove it will converge
 if the training data is linearly separable
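A minimal sketch of this training rule in Python; the learning rate and training data are illustrative assumptions, not from the slides.

```python
import numpy as np

def train_perceptron(X, targets, eta=0.1, epochs=50):
    """Perceptron training rule: w_i <- w_i + eta * (t - o) * x_i.
    X has a leading column of 1s acting as the bias input x0."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, targets):
            o = 1 if np.dot(w, x) > 0 else -1   # current perceptron output
            w += eta * (t - o) * x              # delta-rule weight update
    return w

# Example: learn a linearly separable AND-like function (made-up data)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1])
print(train_perceptron(X, t))
```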
Gradient descent
Derivation of gradient descent

 Gradient descent
 - Error (over all training examples d):
   E(w) = (1/2) Σ_d (t_d − o_d)²
 - The gradient of E (partial differentiation):
   ∇E(w) = [ ∂E/∂w0, ∂E/∂w1, ..., ∂E/∂wn ]
 - Direction: steepest increase in E.
 - Thus, the training rule is as follows:
   Δw = −η ∇E(w)
   (the negative sign gives the direction that decreases E)
Derivation of gradient descent

 ∂E/∂wi = Σ_d (t_d − o_d)(−x_id)

 where x_id denotes the single input component x_i for training example d

 - The weight update rule for gradient descent:
   Δwi = η Σ_d (t_d − o_d) x_id
Gradient descent and delta rule

 Because the error surface contains only a single global minimum, this algorithm will converge to a weight vector with minimum error, given that a sufficiently small learning rate η is used.
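A hedged Python sketch of batch gradient descent for a linear unit, following the update rule above; the synthetic data and learning rate are assumptions for illustration.

```python
import numpy as np

def gradient_descent_linear_unit(X, t, eta=0.01, epochs=200):
    """Batch gradient descent for a linear unit o = w . x.
    Each epoch applies Delta w_i = eta * sum_d (t_d - o_d) * x_id over all examples."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        o = X @ w                      # outputs for every training example
        w += eta * X.T @ (t - o)       # batch update over the whole training set
    return w

# Illustrative data: targets generated by a known linear rule plus a little noise
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])  # bias column + 2 inputs
t = X @ np.array([0.5, 2.0, -1.0]) + rng.normal(scale=0.1, size=50)
print(gradient_descent_linear_unit(X, t))   # should approach [0.5, 2.0, -1.0]
```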
Hypothesis space of gradient descent

 - Error of different hypotheses
 - For a linear unit with two weights, the hypothesis space H is the (w0, w1) plane.
 - This error surface must be parabolic with a single global minimum (we desire a hypothesis with minimum error).
Stochastic approximation to gradient descent

 - Stochastic gradient descent (i.e., incremental mode) can sometimes avoid falling into local minima because it follows the gradient of the per-example error E_d rather than the overall gradient of E.
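For comparison with the batch sketch above, a stochastic (incremental) version updates the weights after every single example; again an illustrative sketch with the same assumed data layout (bias column first).

```python
import numpy as np

def sgd_linear_unit(X, t, eta=0.01, epochs=200):
    """Incremental gradient descent: follow the gradient of the per-example
    error E_d = 1/2 (t_d - o_d)^2 instead of the summed error E."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_d, t_d in zip(X, t):
            o_d = x_d @ w
            w += eta * (t_d - o_d) * x_d   # update after each example
    return w
```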
Summary

 Perceptron training rule guaranteed to succeed if
  training examples are linearly separable
  sufficiently small learning rate η

 Linear unit training rule using gradient descent
  Converges asymptotically to the minimum-error hypothesis
  (guaranteed to converge to the hypothesis with minimum squared error)
THE PERCEPTRON

[The following slides are presented as figures:]
Computing with McCulloch-Pitts Neurons
Limitation of MP-neurons
Perceptron Analysis
Perceptron Learning Rule
Perceptron Learning Algorithm
Pocket Algorithm
Adaline
Adaline Analysis
Adaline Learning Principle
Adaline Learning Algorithm
Multilayer networks and the Backpropagation Algorithm
 Speech recognition example of multilayer networks learned by the backpropagation algorithm
 Highly nonlinear decision surfaces
Multilayer networks and the backpropagation algorithm
Sigmoid Threshold Unit
The Backpropagation algorithm
Adding Momentum

 Often include a weight momentum term α:

   Δw_ji(n) = η δ_j x_ji + α Δw_ji(n−1)

 - The update at the nth iteration depends on the (n−1)th iteration
 - α: constant between 0 and 1 (the momentum)

 Roles of the momentum term
  The effect of keeping the ball rolling through small local minima in the error surface
  The effect of gradually increasing the step size of the search in flat regions of the error surface (greatly improves the speed of learning)
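A minimal sketch of a gradient-descent step with momentum, following the idea above; the learning rate, momentum value, and toy error function are assumptions for illustration.

```python
def momentum_step(w, grad, velocity, eta=0.1, alpha=0.9):
    """One update with momentum: Delta w(n) = -eta * grad + alpha * Delta w(n-1)."""
    velocity = -eta * grad + alpha * velocity   # carry part of the previous update
    return w + velocity, velocity

# Toy 1-D quadratic error E(w) = (w - 3)^2 with gradient dE/dw = 2 * (w - 3)
w, v = 0.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, 2 * (w - 3), v)
print(round(w, 4))   # converges toward the minimum at w = 3
```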
The Backpropagation algorithm
Convergence and Local Minima

 Gradient descent converges to some local minimum
  Perhaps not the global minimum...
 Add momentum
 Use stochastic gradient descent
Artificial Neuron Model
Applications of ANNs

 ANNs have been widely used in various domains for:
  Pattern recognition
  Function approximation
  Associative memory
Types of connectivity

 Feedforward networks
  These compute a series of transformations
  Typically, the first layer is the input and the last layer is the output.
  [Figure: input units → hidden units → output units]
 Recurrent networks
  These have directed cycles in their connection graph. They can have complicated dynamics.
  More biologically realistic.
Artificial Neural Network
Different Network Topologies
 Single-layer feed-forward networks
  Input layer projecting into the output layer
  [Figure: single-layer network, input layer → output layer]
Different Network Topologies
 Multi-layer feed-forward networks
  One or more hidden layers. Input projects only from previous layers onto a layer.
  [Figure: 2-layer (1-hidden-layer) fully connected network, input layer → hidden layer → output layer]
Different Network Topologies
 Multi-layer feed-forward networks
  [Figure: input layer → hidden layers → output layer]
Different Network Topologies
 Recurrent networks
  A network with feedback, where some of its inputs are connected to some of its outputs (discrete time).
  [Figure: recurrent network, input layer → output layer with feedback connections]
How to Decide on a Network Topology?
Algorithm for learning ANN
 Initialize the weights (w0, w1, …, wk)

 Adjust the weights in such a way that the output of the ANN is consistent with the class labels of the training examples

 Error function:
   E = Σ_i [ Yi − f(wi, Xi) ]²

 Find the weights wi that minimize the above error function
  e.g., gradient descent, backpropagation algorithm
Optimizing concave/convex functions

 Maximum of a concave function = minimum of a convex function
 Gradient ascent (concave) / gradient descent (convex)

 Gradient ascent rule: w ← w + η ∂f/∂w
Decision surface of a perceptron

 Decision surface is a hyperplane
  Can capture linearly separable classes
 Non-linearly separable
  Use a network of them
Multi-layer Networks
 Linear units are inappropriate
  No more expressive than a single layer
 Introduce non-linearity
  Threshold is not differentiable
 Use the sigmoid function
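A hedged sketch of a small multi-layer network with sigmoid units, illustrating why a differentiable non-linearity is needed; the layer sizes and random weights are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a 1-hidden-layer network with sigmoid activations.
    The sigmoid is differentiable, so its gradient can be backpropagated,
    unlike the hard threshold of the basic perceptron."""
    h = sigmoid(W1 @ x + b1)      # hidden layer
    return sigmoid(W2 @ h + b2)   # output layer

rng = np.random.default_rng(1)
x = rng.normal(size=3)                            # 3 inputs
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)     # 4 hidden units
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)     # 2 outputs
print(mlp_forward(x, W1, b1, W2, b2))
```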
Backpropagation
 Iteratively process a set of training tuples & compare the network's
prediction with the actual known target value
 For each training tuple, the weights are modified to minimize the mean
squared error between the network's prediction and the actual target
value
 Modifications are made in the “backwards” direction: from the output
layer, through each hidden layer down to the first hidden layer, hence
“backpropagation”
 Steps
  Initialize weights (to small random #s) and biases in the network
  Propagate the inputs forward (by applying the activation function)
  Backpropagate the error (by updating weights and biases)
  Terminating condition (when the error is very small, etc.)
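A compact sketch of these steps for a 1-hidden-layer sigmoid network with squared error; the interface and learning rate are illustrative assumptions, not the slides' notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, target, W1, b1, W2, b2, eta=0.5):
    """One forward pass plus one backward pass for E = 1/2 * sum (t - o)^2."""
    # Forward: propagate the inputs through the network
    h = sigmoid(W1 @ x + b1)
    o = sigmoid(W2 @ h + b2)
    # Backward: output-layer and hidden-layer error terms
    delta_o = (o - target) * o * (1 - o)
    delta_h = (W2.T @ delta_o) * h * (1 - h)
    # Update weights and biases in the "backwards" direction
    W2 -= eta * np.outer(delta_o, h); b2 -= eta * delta_o
    W1 -= eta * np.outer(delta_h, x); b1 -= eta * delta_h
    return 0.5 * np.sum((target - o) ** 2)   # current error on this example
```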
How Does a Multi-Layer Neural Network Work?
 The inputs to the network correspond to the attributes measured for
each training tuple
 Inputs are fed simultaneously into the units making up the input layer
 They are then weighted and fed simultaneously to a hidden layer
 The number of hidden layers is arbitrary, although usually only one
 The weighted outputs of the last hidden layer are input to units making
up the output layer, which emits the network's prediction
 The network is feed-forward in that none of the weights cycles back to
an input unit or to an output unit of a previous layer
 From a statistical point of view, networks perform nonlinear regression:
Given enough hidden units and enough training samples, they can
closely approximate any function

Defining a Network Topology

 First decide the network topology: # of units in the input layer, # of hidden layers (if > 1), # of units in each hidden layer, and # of units in the output layer
 Normalize the input values for each attribute measured in the training tuples to [0.0, 1.0]
 One input unit per domain value, each initialized to 0
 Output: for classification with more than two classes, one output unit per class is used
 If a trained network's accuracy is unacceptable, repeat the training process with a different network topology or a different set of initial weights
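A small sketch of the [0.0, 1.0] min-max normalization mentioned above; the example values are made up.

```python
import numpy as np

def minmax_normalize(X):
    """Rescale each attribute (column) to the range [0.0, 1.0]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

X = np.array([[2.0, 100.0], [4.0, 300.0], [6.0, 500.0]])
print(minmax_normalize(X))   # each column now spans 0.0 to 1.0
```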
Backpropagation and Interpretability
 Efficiency of backpropagation: each epoch (one iteration through the training set) takes O(|D| * w) time, with |D| tuples and w weights, but the number of epochs can be exponential in n, the number of inputs, in the worst case
 Rule extraction from networks: network pruning
 Simplify the network structure by removing weighted links that have the
least effect on the trained network
 Then perform link, unit, or activation value clustering
 The set of input and activation values are studied to derive rules
describing the relationship between the input and hidden unit layers
 Sensitivity analysis: assess the impact that a given input variable has on a
network output. The knowledge gained from this analysis can be
represented in rules

Neural Network as a Classifier
 Weakness
 Long training time
 Require a number of parameters typically best determined empirically,
e.g., the network topology or “structure.”
 Poor interpretability: Difficult to interpret the symbolic meaning behind
the learned weights and of “hidden units” in the network
 Strength
 High tolerance to noisy data
 Ability to classify untrained patterns
 Well-suited for continuous-valued inputs and outputs
 Successful on a wide array of real-world data
 Algorithms are inherently parallel
 Techniques have recently been developed for the extraction of rules
from trained neural networks

Artificial Neural Networks (ANN)

[Figure: a black-box model with input nodes X1, X2, X3, each connected with weight 0.3 to an output node Y with threshold t = 0.4.]

 X1 X2 X3 | Y
  1  0  0 | 0
  1  0  1 | 1
  1  1  0 | 1
  1  1  1 | 1
  0  0  1 | 0
  0  1  0 | 0
  0  1  1 | 1
  0  0  0 | 0

 Y = I(0.3 X1 + 0.3 X2 + 0.3 X3 − 0.4 > 0)

 where I(z) = 1 if z is true, 0 otherwise
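The threshold unit above can be checked directly in code; this short sketch reproduces the truth table from the slide.

```python
def ann_output(x1, x2, x3):
    """Y = I(0.3*X1 + 0.3*X2 + 0.3*X3 - 0.4 > 0): fires when at least two inputs are 1."""
    z = 0.3 * x1 + 0.3 * x2 + 0.3 * x3 - 0.4
    return 1 if z > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        for x3 in (0, 1):
            print(x1, x2, x3, "->", ann_output(x1, x2, x3))
```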
Characteristics of Artificial Neural Networks
The characteristics of Artificial Neural Networks are:

 It is a neurally implemented mathematical model.

 It contains a large number of interconnected processing elements called neurons that perform all the operations.

 Information stored in the neurons is basically the weighted linkage of neurons.

 The input signals arrive at the processing elements through connections and connecting weights.

 It has the ability to learn, recall, and generalize from the given data by suitable assignment and adjustment of weights.

 The collective behaviour of the neurons describes its computational power, and no single neuron carries specific information.
Deep learning vs Machine Learning
Back Propagation Algorithm

The main feature of backpropagation is the iterative, recursive and efficient method by which it calculates the updated weights to improve the network until it is able to perform the task for which it is being trained.

 Backpropagation requires the derivatives of the activation functions to be known at network design time.

Now, how is the error function used in backpropagation, and how does backpropagation work?

 Let us start with an example and work through it mathematically to understand exactly how backpropagation updates the weights.
Back Propagation Algorithm
Back Propagation Algorithm

Now, we first calculate the values of H1 and H2 by a forward pass.

Forward Pass

To find the value of H1, we first multiply the input values by the weights and add the bias:
Back Propagation Algorithm
H1 = x1×w1 + x2×w2 + b1

H1 = 0.05×0.15 + 0.10×0.20 + 0.35

H1 = 0.3775

To calculate the final output of H1, we apply the sigmoid function:

H1final = 1 / (1 + e^(−0.3775)) = 0.593269992

Back Propagation Algorithm
We will calculate the value of H2 in the same way as H1:

H2 = x1×w3 + x2×w4 + b1

H2 = 0.05×0.25 + 0.10×0.30 + 0.35

H2 = 0.3925

To calculate the final output of H2, we apply the sigmoid function:

H2final = 1 / (1 + e^(−0.3925)) = 0.596884378

Back Propagation Algorithm
Now, we calculate the values of y1 and y2 in the same way as we calculated H1 and H2.

To find the value of y1, we multiply the input values, i.e., the outputs of H1 and H2, by the weights:

y1 = H1final×w5 + H2final×w6 + b2

y1 = 0.593269992×0.40 + 0.596884378×0.45 + 0.60

y1 = 1.10590597

To calculate the final output of y1, we apply the sigmoid function:

y1final = 1 / (1 + e^(−1.10590597)) = 0.75136507

Back Propagation Algorithm

We will calculate the value of y2 in the same way as y1


y2 = H1final×w7 + H2final×w8 + b2

y2 = 0.593269992×0.50 + 0.596884378×0.55 + 0.60

y2=1.2249214
Back Propagation Algorithm
To calculate the final output of y2, we apply the sigmoid function:

y2final = 1 / (1 + e^(−1.2249214)) = 0.772928465

Our target values are T1 = 0.01 and T2 = 0.99. Our y1 and y2 values do not match the target values T1 and T2.

Now, we will find the total error, which is simply the sum of the squared differences between the outputs and the target outputs. The total error is calculated as

Etotal = Σ ½ (target − output)²
Back Propagation Algorithm

So, the total error is

Etotal = ½(0.01 − 0.75136507)² + ½(0.99 − 0.772928465)² = 0.298371109

Now, we will backpropagate this error to update the weights using a backward pass.
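The forward pass and total error above can be reproduced with a few lines of Python; the weights w1..w8, biases b1, b2, inputs, and targets are taken from the worked example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Inputs, weights, biases, and targets from the worked example
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
T1, T2 = 0.01, 0.99

# Forward pass: hidden layer, then output layer
H1 = sigmoid(x1 * w1 + x2 * w2 + b1)   # 0.593269992
H2 = sigmoid(x1 * w3 + x2 * w4 + b1)   # 0.596884378
y1 = sigmoid(H1 * w5 + H2 * w6 + b2)   # 0.75136507
y2 = sigmoid(H1 * w7 + H2 * w8 + b2)   # 0.772928465

# Total error: sum of half squared differences
E_total = 0.5 * (T1 - y1) ** 2 + 0.5 * (T2 - y2) ** 2
print(E_total)                          # 0.298371109
```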
Back Propagation Algorithm
Backward pass at the output layer
To update a weight, we calculate the error corresponding to that weight with the help of the total error. The error on weight w is calculated by differentiating the total error with respect to w.

We perform the backward process, so first consider the last weight, w5.

From equation (2), it is clear that we cannot partially differentiate it with respect to w5 because there is no w5 in it. We split equation (1) into multiple terms (the chain rule) so that we can easily differentiate it with respect to w5:

∂Etotal/∂w5 = ∂Etotal/∂y1final × ∂y1final/∂y1 × ∂y1/∂w5
Back Propagation Algorithm

Now, we calculate each term one by one to differentiate Etotal with respect to w5.
Back Propagation Algorithm

Putting the value of e^(−y1) into equation (5):


Back Propagation Algorithm
Back Propagation Algorithm

Backward pass at the hidden layer

Now, we will backpropagate to our hidden layer and update the weights w1, w2, w3, and w4 as we did with the weights w5, w6, w7, and w8.

We will calculate the error at w1 as


Back Propagation Algorithm
From equation (2), it is clear that we cannot partially differentiate it with respect to w1 because there is no w1 in it. We split equation (1) into multiple terms so that we can easily differentiate it with respect to w1.

Now, we calculate each term one by one to differentiate Etotal with respect to w1.

We again split this expression, because there is no H1final term in Etotal:


Back Propagation Algorithm
Back Propagation Algorithm
Back Propagation Algorithm
Back Propagation Algorithm
Back Propagation Algorithm
Back Propagation Algorithm
Back Propagation Algorithm

Now, we will calculate the updated weight w1new with the help of the following formula:

w1new = w1 − η × ∂Etotal/∂w1
Back Propagation Algorithm

We have updated all the weights. We found an error of 0.298371109 on the network when we fed the inputs 0.05 and 0.1 forward. After the first round of backpropagation, the total error is down to 0.291027924. After repeating this process 10,000 times, the total error is down to 0.0000351085.

At this point, the output neurons generate 0.015912196 and 0.984065734, i.e., close to our target values, when we feed forward the inputs 0.05 and 0.1.
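As a hedged check of these numbers, the sketch below runs the full backward pass and repeats training; the learning rate η = 0.5 is an assumption (it is not stated in the text) and the biases are left unchanged, as in the worked example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b1, b2, x1, x2):
    """Forward pass; returns hidden and output activations."""
    H1 = sigmoid(x1 * w[0] + x2 * w[1] + b1)
    H2 = sigmoid(x1 * w[2] + x2 * w[3] + b1)
    y1 = sigmoid(H1 * w[4] + H2 * w[5] + b2)
    y2 = sigmoid(H1 * w[6] + H2 * w[7] + b2)
    return H1, H2, y1, y2

# Weights, biases, inputs, and targets from the worked example
w = [0.15, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55]   # w1..w8
b1, b2 = 0.35, 0.60
x1, x2, T1, T2 = 0.05, 0.10, 0.01, 0.99
eta = 0.5   # assumed learning rate (not stated in the text)

for _ in range(10000):
    H1, H2, y1, y2 = forward(w, b1, b2, x1, x2)
    d1 = (y1 - T1) * y1 * (1 - y1)                   # output-layer error terms
    d2 = (y2 - T2) * y2 * (1 - y2)
    dH1 = (d1 * w[4] + d2 * w[6]) * H1 * (1 - H1)    # hidden-layer error terms
    dH2 = (d1 * w[5] + d2 * w[7]) * H2 * (1 - H2)
    # Gradient-descent updates: w_new = w - eta * dE_total/dw
    w = [w[0] - eta * dH1 * x1, w[1] - eta * dH1 * x2,
         w[2] - eta * dH2 * x1, w[3] - eta * dH2 * x2,
         w[4] - eta * d1 * H1,  w[5] - eta * d1 * H2,
         w[6] - eta * d2 * H1,  w[7] - eta * d2 * H2]

_, _, y1, y2 = forward(w, b1, b2, x1, x2)
print(y1, y2, 0.5 * (T1 - y1) ** 2 + 0.5 * (T2 - y2) ** 2)
```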
Convolutional Neural Network
Convolutional Neural Network
Convolution
Convolutional Neural Network
Convolution Properties
Convolutional Neural Network
ConvNet
ConvNet architectures for images:

 A fully-connected structure does not scale to large images.
 The explicit assumption that the inputs are images allows us to encode certain properties into the architecture.
  These make the forward function more efficient to implement.
  They vastly reduce the number of parameters in the network.

 3D volumes: neurons are arranged in 3 dimensions: width, height, depth.
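A quick sketch of why a fully connected layer does not scale, comparing parameter counts for a fully connected layer versus a small convolutional layer on a 200x200x3 image; the layer sizes are illustrative assumptions.

```python
# Fully connected: every hidden unit connects to every input value
image_inputs = 200 * 200 * 3          # 120,000 input values
hidden_units = 1000
fc_params = image_inputs * hidden_units
print(f"fully connected: {fc_params:,} weights")      # 120,000,000

# Convolutional: a small filter is shared across all spatial positions
filter_h, filter_w, in_depth, num_filters = 5, 5, 3, 64
conv_params = filter_h * filter_w * in_depth * num_filters + num_filters  # + biases
print(f"convolutional:   {conv_params:,} weights")     # 4,864
```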
Convolutional Neural Network
Convolutional Neural Network
Convolutional Neural Network
Back propagation with weight constraints
Convolutional Neural Network
What does replicating the feature detectors achieve?
Convolutional Neural Network
Pooling the outputs of replicated feature detectors
Convolutional Neural Network
Example Architecture for CIFAR-10
Convolutional Neural Network
Convolution Layer
Convolutional Neural Network
Convolution
Convolutional Neural Network
Convolutions: More detail
[A sequence of figure slides steps through the convolution operation in more detail.]
Convolutional Neural Network
Spatial arrangement
Convolutional Neural Network
Spatial arrangement
Convolutional Neural Network
Spatial arrangement
Convolutional Neural Network
Parameter Sharing
Convolutional Neural Network
Parameter Sharing
Convolutional Neural Network
Summary of Conv Layer
Convolutional Neural Network
Spatial Pooling
Convolutional Neural Network
3. Spatial Pooling
Convolutional Neural Network
Pooling Layer
Convolutional Neural Network
General pooling layer
Convolutional Neural Network
General pooling
Convolutional Neural Network
Getting rid of pooling
Convolutional Neural Network
Getting rid of pooling (2)
Softmax function
(Normalized exponential function)

 If we take an input of [1, 2, 3, 4, 1, 2, 3], its softmax is
 [0.024, 0.064, 0.175, 0.475, 0.024, 0.064, 0.175].

 The softmax function highlights the largest values and suppresses the other values. Compared to the “max” function, softmax is differentiable.
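The softmax of the example vector above can be verified with a short sketch.

```python
import numpy as np

def softmax(z):
    """Normalized exponential: exp(z_i) / sum_j exp(z_j).
    Subtracting max(z) first keeps the exponentials numerically stable."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(np.round(softmax([1, 2, 3, 4, 1, 2, 3]), 3))
# [0.024 0.064 0.175 0.475 0.024 0.064 0.175]
```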
Convolutional Neural Network
Fully-connected layer
Convolutional Neural Network
Converting FC layers to CONV layers
Convolutional Neural Network
ConvNet Architectures
Convolutional Neural Network
Convolutional Neural Network
Recent Departures
Convolutional Neural Network
Case Studies
Convolutional Neural Network
Convolutional Neural Network
LeNet
Convolutional Neural Network
LeNet
Convolutional Neural Network
From hand-written digits to 3-D objects
Convolutional Neural Network
The ILSVRC-2012 competition on ImageNet
Convolutional Neural Network
Convolutional Neural Network
A neural network for ImageNet
Convolutional Neural Network
A Common Architecture: AlexNet
Convolutional Neural Network
Convolutional Neural Network
Convolutional Neural Network
Tricks that significantly improve generalization
Convolutional Neural Network
The hardware required for Alex’s net
Convolutional Neural Network
Case Study: ZFNet
Convolutional Neural Network
Case Studies
Convolutional Neural Network
Case Study: VGGNet
Convolutional Neural Network
Case Study: GoogLeNet
Convolutional Neural Network
GoogLeNet vs State of the art
Convolutional Neural Network
Residual Network
Convolutional Neural Network
Plain Network
Convolutional Neural Network
Residual Network
Convolutional Neural Network
Results

Practical matters
