ML3 Unit 4-3
1. Perceptron
Inputs are summed and passed through a nonlinear function to produce the output.
Single layer: a single-layer perceptron can learn only linearly separable patterns.
The Perceptron algorithm learns the weights for the input signals in order to draw a linear
decision boundary.
This enables you to distinguish between the two linearly separable classes +1 and -1.
Note: Supervised Learning is a type of Machine Learning used to learn models from labeled
training data. It enables output prediction for future or unseen data. Let us focus on the
Perceptron Learning Rule in the next section.
1.1. Perceptron Learning Rule
The Perceptron Learning Rule states that the algorithm automatically learns the optimal weight coefficients. The input features are then multiplied by these weights to determine whether the neuron fires or not.
The perceptron receives multiple input signals; if the weighted sum of the input signals exceeds a certain threshold, it outputs a signal, and otherwise it does not. In the context of supervised learning and classification, this can be used to predict the class of a sample.
A perceptron is a neural network unit (an artificial neuron) that performs certain computations to detect features in the input data.
A perceptron is a function that maps its input vector x, multiplied by the learned weight coefficients w, to an output value f(x):

f(x) = 1 if w · x + b > 0, else 0

where b is the bias, an element that shifts the decision boundary away from the origin without any dependence on the input values. The output can be represented as 1 or 0; it can also be represented as +1 or -1, depending on which activation function is used.
The perceptron is an algorithm for supervised learning of single-layer binary linear classifiers:
- Weights are multiplied by the input features, and a decision is made as to whether the neuron fires or not.
- The activation function applies a step rule to check whether the output of the weighting function is greater than zero.
- A linear decision boundary is drawn, enabling the distinction between the two linearly separable classes +1 and -1.
- If the sum of the input signals exceeds a certain threshold, it outputs a signal; otherwise, there is no output.
A perceptron is a neural network unit that performs a precise computation to detect features in the input data. The perceptron is mainly used to classify data into two parts; therefore, it is also known as a Linear Binary Classifier.
The perceptron uses a step function that returns +1 if the weighted sum of its inputs is greater than or equal to 0, and -1 otherwise. The activation function is used to map the input onto the required values, such as (0, 1) or (-1, 1).
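As an illustration of the learning rule above, here is a minimal perceptron training sketch in NumPy (the function name, data, and hyperparameters are illustrative, not from the notes):

import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    # Learn weights w and bias b for labels y in {+1, -1}.
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Step activation: +1 if the weighted sum exceeds 0, else -1
            prediction = 1 if np.dot(w, xi) + b > 0 else -1
            # Update the weights only when the prediction is wrong
            if prediction != target:
                w += lr * target * xi
                b += lr * target
    return w, b

# Example: learn an AND-like linearly separable pattern
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
print(w, b)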
2. GRADIENT DESCENT
2.1 WHAT IS GRADIENT DESCENT?
Imagine the image below illustrates our hill from a top-down view, with the red arrows showing the steps of our climber. Think of a gradient in this context as a vector that contains both the direction of the steepest step a blindfolded climber can take and how long that step should be.
Note that the gradient from X0 to X1 is much longer than the one from X3 to X4. This is because the steepness (slope) of the hill, which determines the length of the vector, is smaller there.
This matches the hill example, since the hill gets less steep the higher it is climbed. Therefore, a reduced gradient goes along with a reduced slope and a reduced step size for the hill climber.
Instead of climbing up a hill, think of gradient descent as hiking down to the bottom of a
valley. This is a better analogy because it is a minimization algorithm that minimizes a given
function.
The equation below describes what gradient descent does: b is the next position of our climber, while a represents his current position:

b = a - γ ∇f(a)

The minus sign refers to the minimization part of gradient descent, gamma (γ) is a weighting factor (the learning rate), and the gradient term ∇f(a) gives the direction of steepest ascent, so subtracting it moves us in the direction of steepest descent.
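A minimal sketch of this update rule in Python, applied to the simple function f(x) = x^2, whose gradient is 2x (the example and names here are illustrative, not from the notes):

# Minimal gradient descent sketch: repeatedly apply b = a - gamma * grad_f(a).
def gradient_descent(grad_f, start, gamma=0.1, steps=50):
    a = start
    for _ in range(steps):
        a = a - gamma * grad_f(a)  # step against the gradient
    return a

# Minimizing f(x) = x**2, whose gradient is 2x; the result approaches 0.
minimum = gradient_descent(lambda x: 2 * x, start=5.0)
print(minimum)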
For gradient descent to reach the local minimum, we must set the learning rate to an appropriate value, neither too low nor too high. This is important because if the steps it takes are too big, it may never reach the local minimum, bouncing back and forth across the valley of the convex function (see the left image below). If we set the learning rate to a very small value, gradient descent will eventually reach the local minimum, but that may take a long time (see the right image).
So, the learning rate should never be too high or too low for this reason. You can check whether your learning rate is doing well by plotting the cost against the number of iterations.
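To make this trade-off concrete, here is an illustrative comparison of learning rates on f(x) = x^2, reusing the gradient_descent sketch from above (assumed to be in scope):

# Compare learning rates on f(x) = x**2 (gradient 2x).
for gamma in (1.1, 0.5, 0.001):
    result = gradient_descent(lambda x: 2 * x, start=5.0, gamma=gamma, steps=50)
    print(gamma, result)
# gamma = 1.1   -> diverges: each step overshoots and the iterates grow
# gamma = 0.5   -> converges in one step for this particular quadratic
# gamma = 0.001 -> converges, but very slowly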
There are three popular types of gradient descent, which mainly differ in the amount of data they use: batch gradient descent, stochastic gradient descent, and mini-batch gradient descent.
2.4.1 BATCH GRADIENT DESCENT
Batch gradient descent computes the gradient over the entire training dataset and performs a single update per pass. Some advantages of batch gradient descent are that it is computationally efficient and produces a stable error gradient and stable convergence. Some disadvantages are that the stable error gradient can sometimes result in a state of convergence that isn't the best the model can achieve, and that it requires the entire training dataset to be in memory and available to the algorithm.
2.4.2 STOCHASTIC GRADIENT DESCENT
Stochastic gradient descent (SGD) updates the parameters for each training example, one at a time. These frequent updates, however, are more computationally expensive than the batch gradient descent approach. Additionally, the frequency of those updates can result in noisy gradients, which may cause the error rate to jump around instead of decreasing smoothly.
2.4.3 MINI-BATCH GRADIENT DESCENT
Mini-batch gradient descent is the go-to method since it’s a combination of the concepts of
SGD and batch gradient descent. It simply splits the training dataset into small batches and
performs an update for each of those batches. This creates a balance between the robustness
of stochastic gradient descent and the efficiency of batch gradient descent.
Common mini-batch sizes range between 50 and 256, but like any other machine learning
technique, there is no clear rule because it varies for different applications. This is the go-to
algorithm when training a neural network and it is the most common type of gradient descent
within deep learning.
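A rough sketch of mini-batch gradient descent for linear regression with mean squared error (the dataset, names, and hyperparameters are assumptions for illustration). Setting batch_size equal to the dataset size recovers batch gradient descent, and batch_size=1 recovers SGD:

import numpy as np

def minibatch_gd(X, y, batch_size=32, lr=0.01, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = np.random.permutation(n)            # reshuffle every epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)  # MSE gradient
            w -= lr * grad                        # one update per mini-batch
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)
print(minibatch_gd(X, y))  # approximately recovers w_true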
3. Delta Rule
3.1 What Does Delta Rule Mean?
The Delta Rule, in machine learning and neural network environments, is a specific type of back-propagation that helps refine connectionist ML/AI networks, making connections between inputs and outputs with layers of artificial neurons.
In general, back-propagation has to do with recalculating input weights for artificial neurons using a gradient method. Delta learning does this using the difference between a target activation and the actually obtained activation. Using a linear activation function, the network connections are adjusted.
Another way to explain the Delta rule is that it uses an error function to perform gradient
descent learning.
The Delta Rule is an interesting mechanism for searching the hypothesis space. In fact, the Delta Rule uses one of the most popular, if not the most popular, search techniques in the hypothesis space: Gradient Descent.
Using Gradient Descent, the Delta Rule strives to find the best-fitting model.
The Delta Rule uses gradient descent as an optimization technique: it tries different values for the weights in a neural network and, depending on how accurate the output of the network is (i.e., how close it is to the ground truth), makes adjustments to certain weights (i.e., increases some and decreases others). During training, it adjusts the weights in a way that makes the error of the output go down.
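As a sketch, the delta rule for a single linear unit can be written as Δw = η (t − o) x, where t is the target activation, o the obtained activation, and η the learning rate. A minimal illustrative implementation (names assumed, not from the notes):

import numpy as np

def delta_rule_epoch(X, t, w, eta=0.05):
    # One pass of delta-rule updates for a single linear unit.
    for xi, ti in zip(X, t):
        o = np.dot(w, xi)            # linear activation (actual output)
        w = w + eta * (ti - o) * xi  # move weights to reduce squared error
    return w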
Gradient descent is the very foundation of the back-propagation algorithm that helps us train neural networks.
4. Back-propagation algorithm
The algorithm is used to train a neural network effectively through the chain rule of calculus.
In simple terms, after each forward pass through a network, back-propagation performs a backward pass while adjusting the model's parameters (weights and biases).
The error is a measure of the difference between the actual output vector of the net and the desired output vector.
The ability to create useful new features distinguishes back-propagation from earlier, simpler methods.
In other words, back-propagation aims to minimize the cost function by adjusting the network's weights and biases. The level of adjustment is determined by the gradients of the cost function with respect to those parameters.
During training, the algorithm travels back from the output layer to the hidden layers, adjusting the weights such that the error is decreased.
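To make the backward pass concrete, here is an illustrative sketch for a tiny one-hidden-layer network with sigmoid units and squared error (all names, shapes, and the learning rate are assumptions, not from the notes):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, target, W1, W2, lr=0.5):
    # Forward pass
    h = sigmoid(W1 @ x)                           # hidden activations
    y = sigmoid(W2 @ h)                           # output activations
    # Backward pass: apply the chain rule layer by layer
    delta_out = (y - target) * y * (1 - y)        # output-layer error term
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated to hidden layer
    # Gradient-descent updates to the weights
    W2 -= lr * np.outer(delta_out, h)
    W1 -= lr * np.outer(delta_hid, x)
    return W1, W2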
There are two types of back-propagation networks:
- Static back-propagation
- Recurrent back-propagation
Static back-propagation:
It is a kind of back-propagation network that produces a mapping from a static input to a static output. It is useful for solving static classification problems such as optical character recognition.
Recurrent back-propagation:
In recurrent back-propagation, the activations are fed forward until a fixed value is achieved. After that, the error is computed and propagated backward.
5. Self-Organizing Map
A SOM has two layers: an input layer and an output layer. The architecture of a Self-Organizing Map with two clusters and n input features per sample is shown below:
Algorithm
Consider input data of size (m, n), where m is the number of training examples and n is the number of features in each example.
First, the algorithm initializes the weights of size (n, C), where C is the number of clusters.
Then, iterating over the input data, for each training example it updates the winning vector (the weight vector with the shortest distance, e.g. Euclidean distance, from the training example). The weight update rule is given by:

wij(new) = wij(old) + alpha(t) * (xik - wij(old))

where alpha(t) is the learning rate at iteration t; the update moves the winning vector toward the training example.
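A minimal sketch of this algorithm in Python (the learning-rate schedule, data, and random initialization are illustrative assumptions):

import numpy as np

def train_som(X, C=2, epochs=10, alpha0=0.5):
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, C))                 # weights of size (n, C)
    for t in range(epochs):
        alpha = alpha0 * (1 - t / epochs)  # decaying learning rate alpha(t)
        for x in X:
            # Winning vector: the column of W closest to x (Euclidean distance)
            j = np.argmin(np.linalg.norm(W - x[:, None], axis=0))
            # Move the winner toward the training example
            W[:, j] += alpha * (x - W[:, j])
    return W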