Unit – 6 Deep Learning: Basics of Neural Network

▪ The roots of all work of NN are in neurobiological studies.


▪ How do nerves behave when stimulated by different
magnitudes of electric current?
▪ Is there a minimal threshold (quantity of current) needed
for nerves to be activated?
▪ How do different nerve cells communicate electrical
currents among one another?
▪ How do various nerve cells differ in behavior?

▪ These questions remained unanswered up to the mid-twentieth century.



▪ Further motivation came from psychologists striving to understand exactly how
learning, forgetting, recognition, and other such tasks are
accomplished by animals.

▪ McCulloch and Pitts (1943)
- credited with the first mathematical model of a single neuron

▪ Dreyfus (1962), Bryson and Ho (1969), and Werbos (1974)
- combinations of many neurons

▪ Gallant (1986)
- hybrid systems combining neural networks and non-connectionist components

…and many more.


▪ A biological neural network is a series of interconnected neurons whose activation defines a recognizable linear pathway.

▪ The interface through which neurons interact with their neighbours usually consists of several axon terminals connected via synapses to dendrites on other neurons.

▪ If the sum of the input signals into one neuron surpasses a certain threshold, the neuron sends an action potential at the axon hillock and transmits this electrical signal along the axon.



➢ Nervous System
➢ Neurons
• What? → A neuron is a nerve cell that is the fundamental
building block of the biological nervous system.
Neurons are similar to other cells in the human body
in a number of ways, but there is one key difference:
neurons are specialized to transmit information throughout the
body.

• 10 – 100 billion neurons in the human brain
• each neuron connects to 100 – 10,000 other neurons
• around 100 different types of neurons
• neurons communicate by electrical signals
▪ The receptors collect information from the environment
– e.g. photons on the retina

▪ The effectors generate interactions with the environment
– e.g. activate muscles

▪ The flow of information/activation is represented by arrows: feedforward and feedback.

▪ Naturally, this module will be primarily concerned with how the neural network in the middle works.
➢ How does it work?

▪ The majority of neurons encode their activations or outputs as a series of brief electrical pulses (i.e., action potentials).

▪ The neuron’s cell body (soma) processes the incoming activations and converts them into output activations.

▪ The neuron’s nucleus contains the genetic material in the form of DNA. This exists in most types of cells, not just neurons.
▪ Dendrites are fibres which start from the cell body and
provide the receptive zones that receive activation from
other neurons.

▪ Axons are fibres acting as transmission lines that send activation to other neurons.

▪ The junctions that allow signal transmission between the axons and dendrites are called synapses. The process of transmission is by the diffusion of chemicals called neurotransmitters across the synaptic cleft.
▪ At the other end of the axon, there exists an inhibiting unit called the synapse. This unit controls the flow of neuronal current from the originating neuron to the receiving dendrites of neighbouring neurons.
▪ Synapses have a processing value or weight.
➢ Communication Between Synapses

▪ Once an electrical impulse has reached the end of an axon, the information must be transmitted across the synaptic gap to the dendrites of the adjoining neuron. In some cases, the electrical signal can almost instantaneously bridge the gap between the neurons and continue along its path.

▪ In other cases, neurotransmitters are needed to send the information from one neuron to the next. Neurotransmitters are chemical messengers that are released from the axon terminals to cross the synaptic gap and reach the receptor sites of other neurons. The neurotransmitters attach to the receptor sites; in a process known as reuptake, they are later reabsorbed by the neuron to be reused.



▪ Neurotransmitters:
Neurotransmitters are an essential part of our
everyday functioning. While it is not known exactly
how many neurotransmitters exist, scientists have
identified more than 100 of these chemical
messengers.

➢ Massive connectivity and parallelism
- each neuron can be connected to about 5,000 or more other neurons

➢ Robust and fault tolerant
- tolerance against damage to individual neurons
- tolerance to the loss of neurons is a high priority, since graceful degradation of performance is very important to the survival of the organism

➢ Capability to adapt to surroundings
- with changes in our surroundings/work conditions, our brain helps us gradually adapt to the new environment



➢ Ability to learn and generalize from known examples
- e.g. when meeting a friend after a long time with a new look, a human can still recognize them

➢ Collective behavior is more important than individual behavior
- a BNN is so complex that it is meaningless to study a single neuron and its interaction with one other neuron in isolation



▪ Signals are received by the processing elements; each element sums its weighted inputs.
▪ The weight at the receiving end has the
capability to modify the incoming signal.
▪ The neuron fires (transmits output) when sufficient input is obtained.
▪ The output produced by one neuron may be transmitted to other neurons.
▪ Processing of information is found to be local.



▪ The weights can be modified by experience.
▪ Neurotransmitters at the synapses may be excitatory or inhibitory.
▪ Both Artificial and Biological neurons have an inbuilt
fault tolerance.

→ After multiplication, all the values are summed together; this value is known as the net value.
→ Passing the net value through a threshold function gives the output. These operations are performed inside the cell body.
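A minimal sketch of this computation in Python (the inputs, weights, and threshold below are illustrative values, not taken from the slides):

```python
# Weighted-sum-and-threshold computation of a single neuron.
inputs  = [0.5, 0.9, 0.2]    # incoming signals
weights = [0.8, -0.4, 0.6]   # synaptic weights (excitatory > 0, inhibitory < 0)
threshold = 0.3

# Multiply each input by its weight and sum: the "net value".
net = sum(x * w for x, w in zip(inputs, weights))

# Pass the net value through a threshold function to get the output.
output = 1 if net > threshold else 0
print(net, output)   # net ~= 0.16, below threshold -> the neuron does not fire (0)
```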
▪ Tasks involving intelligence / pattern recognition
- difficult to automate
- easily done by animals (by sense)
- computation/simulation of the process (to extend physical limitations)
- hence the necessity of studying and simulating neural networks

▪ NN
- part of the nervous system (interconnected neurons)



- Borrowed from the analogy of biological neural networks.

- Also known as neural nets, artificial neural systems, parallel distributed processing systems, connectionist systems.

- A computing system
- represented as a directed graph (nodes/vertices & edges/links/arcs)



- In ANN
- each node performs some simple
computations.
- each arc conveys a signal from one node to
another.
- labelled by weight (indicating the extent to
which a signal is amplified or diminished by a
connection)

- Not every graph is a NN



▪ Example:
"AND" of two binary inputs, implemented in hardware using an AND gate.
Inputs:
X1 ∈ {0,1} and X2 ∈ {0,1}

Output:
O = 1 ; if X1 = X2 = 1
  = 0 ; otherwise
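As a quick illustration, this AND function can be sketched as a fixed threshold unit in Python (the weights of 1 and the threshold of 1.5 are hand-picked for illustration; a neural network would learn such values rather than having them fixed):

```python
# AND of two binary inputs as a hard-wired threshold unit.
def and_unit(x1, x2, threshold=1.5):
    net = 1 * x1 + 1 * x2          # both connections carry weight 1
    return 1 if net > threshold else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", and_unit(x1, x2))
# Prints 1 only for x1 = x2 = 1, matching the AND truth table.
```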



▪ Not a neural network



▪ Neural network

- weights are initially random

- based on a learning algorithm, the values of the weights will change.



▪ In computer science, synaptic weight refers to the strength or amplitude of a connection between two nodes, corresponding in biology to the amount of influence the firing of one neuron has on another.



▪ What is a Perceptron?
- A Perceptron is a type of artificial neuron which takes in several binary inputs x1, x2, …, xn and produces a single binary output.

▪ To compute the output (O), one could:
a) Sum up the inputs: O = x1 + x2 + x3
b) Apply a linear mathematical operation to the inputs: O = (x1 * x2) + x3



▪ In case b):
- If x1 and x2 are both 0, the output is O = x3.
- If either x1 or x2 (any one) is 0, the output is still O = x3, so information about the other input is lost.
▪ In case a):
- The inputs do not interact with one another,
- so no information is lost.



▪ Example:
Que: Make a perceptron for predicting whether it will rain today or not.
- The binary o/p O takes values 0 or 1; the inputs are x1 and x2.

Let x1 be 1 if the weather is humid today, 0 if the opposite.

Let x2 be 1 if you are wearing a white cloth today, 0 if not.

Note that wearing a white cloth has almost no correlation with the possibility of rainfall.
So, a possible output function is: O = x1 + 0.1 * x2



- If x2 = 0, then 0.1 * x2 is still 0.
- If x2 = 1, then 0.1 * x2 is 0.1, not 1.
- The factor basically brings down the importance of the input x2 from 1 to 0.1; hence it is called the 'weight' of the input x2.

- Consider x2 to be 1 for now.
- O will then be (0 + 0.1) = 0.1 or (1 + 0.1) = 1.1, i.e. not a binary value. To make it one, we can use a 'threshold' for the total value of O.
- So, O is 1 if (x1 + 0.1 * x2) > 1, and
  O is 0 if (x1 + 0.1 * x2) < 1. We have thus solved our problem.
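The example can be run directly as code; this is a sketch of the slide's own function O = x1 + 0.1 * x2 with a threshold of 1:

```python
# Rain-prediction perceptron: O = x1 + 0.1 * x2, thresholded at 1.
# x1 = 1 if the weather is humid; x2 = 1 if wearing white cloth (weight 0.1).
def will_it_rain(x1, x2):
    total = x1 + 0.1 * x2           # weighted sum of the inputs
    return 1 if total > 1 else 0    # threshold makes the output binary

print(will_it_rain(1, 1))   # total = 1.1 > 1 -> 1
print(will_it_rain(0, 1))   # total = 0.1     -> 0 (white cloth alone says nothing)
print(will_it_rain(1, 0))   # total = 1.0, exactly on the boundary -> 0 here
```

Note that x1 = 1, x2 = 0 gives a total of exactly 1, the boundary case the slide leaves unspecified; the sketch above arbitrarily maps it to 0.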



Simplification:

w = [w1 w2 w3 …. wn],
x = [x1 x2 x3 …. xn]
b (bias) = -(threshold)

Where,
O = 1 ; if (x1 * w1 + x2 * w2 + … + xn * wn + b) > 0, i.e. w · x + b > 0
O = 0 ; otherwise
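In vector form this is a single dot product. A minimal NumPy sketch, reusing the weights from the rain example with b = -(threshold) = -1:

```python
import numpy as np

w = np.array([1.0, 0.1])   # weights of x1 and x2
b = -1.0                   # bias = -(threshold of 1)

def perceptron(x):
    # O = 1 if w . x + b > 0, else 0
    return 1 if np.dot(w, x) + b > 0 else 0

print(perceptron(np.array([1, 1])))   # net = 1.1 - 1.0 = 0.1 > 0 -> 1
print(perceptron(np.array([0, 1])))   # net = 0.1 - 1.0 < 0       -> 0
```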
BNN vs ANN:
▪ Speed: a BNN is slower; an ANN is faster.
▪ Adaptability: a BNN can adapt, because new information is added without destroying old information; an ANN cannot, because new information destroys the old information.
▪ Fault tolerance: a BNN is fault tolerant; even with faults in network connections, information is still preserved. An ANN is not fault tolerant; information corrupted in memory cannot be restored.
▪ Memory: in a BNN, memory and processing elements are collocated; in an ANN, memory and processing are separate.
▪ Organization: a BNN self-organizes during learning; an ANN is software dependent.
▪ Operation: a BNN is parallel and asynchronous; an ANN is sequential and synchronous.
- A bias unit is an extra neuron added to each pre-output layer that stores the value 1.
- Bias units are not connected to any previous layer.
- A bias unit is simply appended to the start/end of the input and each hidden layer,
- is not influenced by the values in the previous layer,
- and has no incoming connections.



- Bias units still have outgoing connections, and they can contribute to the output of the ANN.
- They improve the performance of the neural network.
- Bias nodes are added to increase the flexibility of the model to fit the data.
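One way to see why a bias unit is just "an extra neuron that stores the value 1": appending a constant 1 to the layer lets the bias be treated as an ordinary weight. A small sketch with made-up values:

```python
import numpy as np

x = np.array([0.5, 0.2])            # ordinary inputs (illustrative values)
w = np.array([0.7, -0.3])           # their weights
b = 0.4                             # bias

x_with_bias = np.append(x, 1.0)     # bias unit appended, always outputting 1
w_with_bias = np.append(w, b)       # the bias becomes the weight of that unit

print(np.dot(w, x) + b)                   # ~0.69
print(np.dot(w_with_bias, x_with_bias))   # ~0.69: identical result
```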



▪ In an ANN, the activation function defines the output of a neuron given a set of inputs.
▪ It is biologically inspired by activity in our brains, where different neurons are fired or activated by different stimuli.
▪ A standard computer chip circuit can be seen as a digital network of activation functions that can be on or off (1 or 0) depending on the input. This is the same behavior as the linear perceptron in a NN.



▪ The activation function is used to calculate the output of a neuron: net is the linear combiner output, and the activation function maps it to the final output Y.
▪ It basically decides whether a neuron should be activated or not:
▪ whether the information the neuron is receiving is relevant for the task at hand or should be ignored.



▪ Types:
- linear (straight-line shape)
- non-linear (curved shape)

▪ Linear function:
Y= net



▪ Binary step function:
Y = f(net) = +1, if net > 0
           = 0, if net ≤ 0



▪ Sigmoid function:
Y = 1 / (1 + e^(-net))



Non-negative: a number greater than or equal to zero.

Negative: a number less than zero.

The sigmoid function is used for binary classification, as in the logistic regression model.
▪ Tanh function:
Y = tanh(net) = (e^net - e^(-net)) / (e^net + e^(-net))



▪ ReLU function:
Y = max(0, net)



▪ ReLU is best for hidden layers.
▪ For the O/P layer, use softmax for classification or linear for regression.
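The activation functions above can be written out in a few lines of NumPy (a sketch; softmax is included because the slide recommends it for classification output layers):

```python
import numpy as np

def linear(net):           # Y = net (identity)
    return net

def binary_step(net):      # Y = +1 if net > 0, else 0
    return np.where(net > 0, 1, 0)

def sigmoid(net):          # Y = 1 / (1 + e^-net), output in (0, 1)
    return 1 / (1 + np.exp(-net))

def tanh(net):             # output in (-1, 1)
    return np.tanh(net)

def relu(net):             # Y = max(0, net), a common hidden-layer default
    return np.maximum(0, net)

def softmax(net):          # turns a score vector into class probabilities
    e = np.exp(net - np.max(net))   # subtract the max for numerical stability
    return e / e.sum()

net = np.array([-2.0, 0.0, 3.0])
print(sigmoid(net), relu(net), softmax(net))
```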



net = ∑ xiwi + b; i=1 to n
y = f (net)

▪ Threshold Value: The net value is also known as the activation value. The threshold value is sometimes used to qualify the output of a neuron in the output layer.
→ The net (activation) value is compared with the threshold, and the neuron fires if the threshold value is exceeded.



▪ It can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques.

▪ A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest and answer "what if" questions.



▪ Process modeling and control
▪ Machine Diagnostics
▪ Portfolio Management
▪ Target Recognition
▪ Medical Diagnosis
▪ Credit Rating
▪ Targeted Marketing
▪ Voice recognition
▪ Financial Forecasting
▪ Intelligent searching
▪ Fraud detection
▪ Essay scoring



[Figure: structure of a biological neuron, labelled with dendrons (dendrites), cell body, axon, and synapse]

▪ Feedforward Network

- Single layer feed forward network


- Multilayer feed forward network



➢ Fully Connected Network
➢ Layered Network
➢ Acyclic Network
➢ Feedforward Network



▪ (Asymmetric) Fully Connected Networks
▪ Every node is connected to every other node
▪ Connections may be excitatory (positive), inhibitory (negative), or irrelevant (0).
▪ Most general architecture.
▪ Symmetric fully connected nets: weights are symmetric (wij = wji).

Input nodes: receive input from the environment.
Output nodes: send signals to the environment.
Hidden nodes: have no direct interaction with the environment.
▪ Fully Connected Network



▪ Layered Networks
▪ Nodes are partitioned into subsets, called layers.
▪ No connections that lead from nodes in layer j to those
in layer k if j > k.
• Inputs from the environment are applied to nodes in layer 0 (the input layer).
• Nodes in the input layer are placeholders with no computation occurring (i.e., their node functions are the identity function).
▪ Acyclic Network
▪ Connections may exist between nodes in different layers, but connections between two nodes within the same layer (layers i and j with i = j) are not allowed.



▪ Feedforward Networks
▪ A connection is allowed from a node in layer i only to
nodes in layer i + 1.
▪ Most widely used architecture.

Conceptually, nodes at higher levels successively abstract features from preceding layers.


➢ Depending on the nature of the problem, the artificial neural net is organized in different structural arrangements.

a) Single layered recurrent net with lateral feedback


b) Two layered feed-forward structure
c) Two layered structure with feedback
d) Three layered feed-forward structure
e) A recurrent structure with self-loops
▪ Aerospace
▪ Automotive
▪ Banking
▪ Defense
▪ Electronics
▪ Financial
▪ Manufacturing
▪ Medical
▪ Robotics
▪ Securities
▪ Telecommunications
Three types of learning have been used:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning



▪ Every input pattern is associated with an output pattern.
▪ The output pattern is known as the target or desired pattern.
▪ A teacher is present during the learning process.
▪ A comparison is made between the network’s computed output and the correct output to determine the error.
▪ The error can then be used to change network parameters, which results in improved performance.
▪ If X(i) is an input vector and d(i) is the corresponding target value, the learning rule calculates the updated values of the neuron weights and bias. E.g. perceptron, back propagation, etc.
▪ In this method the target output is not presented to the network.
▪ There is no teacher to present the desired patterns, so the system learns on its own by discovering and adapting to structural features in the input patterns.
▪ The weights and biases are adjusted based on the inputs only.
▪ The network learns to cluster the input patterns into a finite number of classes.
▪ The value of the output is unknown, but the
network provides the feedback whether the
output is right or wrong.

➢ Supervised learning:
ex. Is an image of a cat or of a dog? Classify it based on features.
➢ Unsupervised learning:
ex. People of different age groups fall into different clusters.
➢ Reinforcement learning:
ex. Give a reward to a cat if it follows some rules.



▪ A neural network can perform tasks that a linear program cannot.
▪ When an element of the neural network fails, the network can continue without any problem, owing to its parallel nature.
▪ A neural network learns and does not need to be reprogrammed.
▪ It can be implemented in any application.
▪ It can be implemented without any problem.



▪ A neural network needs a large amount of data for training in order to operate.
▪ It is computationally expensive.
▪ Large neural networks require high processing time.



▪ It is very well known that the most fundamental unit of deep neural networks is the artificial neuron/perceptron.
▪ McCulloch-Pitts Neuron
▪ The very first step towards the perceptron we use today was taken in 1943 by McCulloch and Pitts, by mimicking the functionality of a biological neuron.



▪ The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943.



▪ In machine learning, the perceptron is an
algorithm for supervised learning of binary
classifiers.



❖Types:
▪ Single layer perceptron network
▪ Multilayer perceptron network



▪ To start the training process, the weights and the bias are initially set to zero.
▪ Also set the learning parameter, which ranges between 0 and 1.
▪ Then find the output by multiplying the inputs with the weights and adding the bias entity; then apply the activation function to it.
▪ This output is compared with the target; if any difference occurs, the weights are updated based on the perceptron learning rule, else the network training is stopped.
▪ The algorithm can be used for both binary and bipolar input vectors. It uses a bipolar target with a fixed threshold and adjustable bias.



1. Initialize the weights, bias, and threshold θ (weights and bias can initially be zero). Set the learning rate α (0 to 1).
2. While the stopping condition is false, do steps 3-7.
3. For each training pair s:t, do steps 4-6.
4. Set activations of the input units:
xi = si for i = 1 to n
5. Compute the output unit response:
Yin = ∑ xiwi + b
The activation function used is
Y = f(Yin) = 1, if Yin > θ
           = 0, if -θ ≤ Yin ≤ θ
           = -1, if Yin < -θ



6. The weights and bias are updated if the target is not equal to the output response:
If t ≠ y and xi ≠ 0:
wi(new) = wi(old) + α*t*xi
b(new) = b(old) + α*t
else:
wi(new) = wi(old)
b(new) = b(old)
7. Test for the stopping condition.
The stopping condition may be that no weight changes occurred in a whole epoch.

NOTE: Only weights connecting active input units (xi ≠ 0) are updated, and weights are updated only for patterns that do not produce the correct value of y.
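A direct translation of steps 1-7 into Python (a sketch; α = 1 and θ = 0.2 are assumed defaults, and the AND training set with bipolar targets at the bottom is an assumed example rather than one from the slides):

```python
import numpy as np

def perceptron_train(X, T, alpha=1.0, theta=0.2, max_epochs=100):
    """Steps 1-7: zero initial weights/bias, bipolar targets,
    fixed threshold theta, updates only on misclassified patterns."""
    w = np.zeros(X.shape[1])                 # step 1: weights start at zero
    b = 0.0                                  # bias starts at zero
    for _ in range(max_epochs):              # step 2: repeat until stable
        changed = False
        for x, t in zip(X, T):               # steps 3-4: each training pair
            y_in = np.dot(x, w) + b          # step 5: net input
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                       # step 6: update on error only
                w = w + alpha * t * x        # inactive inputs (x_i = 0) unchanged
                b = b + alpha * t
                changed = True
        if not changed:                      # step 7: stop when no weight changes
            break
    return w, b

# Assumed example: AND function, binary inputs with bipolar targets.
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
T = np.array([1, -1, -1, -1])
print(perceptron_train(X, T))
```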
▪ If two classes of patterns can be separated by a decision boundary, represented by the linear equation
∑ xiwi + b = 0 (i = 1 to n),
then they are said to be linearly separable. The simple network can correctly classify any such patterns.
▪ If such a decision boundary does not exist, then the two classes are said to be linearly inseparable.
▪ Linearly inseparable problems cannot be solved by the simple network; a more sophisticated architecture is needed.
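The XOR function is the classic linearly inseparable case. A brute-force sketch that searches a grid of candidate lines w1*x1 + w2*x2 + b = 0 (the grid range and resolution are arbitrary choices) finds a separating line for AND but none for XOR:

```python
import itertools
import numpy as np

def find_separating_line(points, labels):
    """Search a coarse grid for a line that puts class 1 strictly on the
    positive side and class 0 strictly on the negative side."""
    grid = np.linspace(-2, 2, 21)
    for w1, w2, b in itertools.product(grid, repeat=3):
        vals = [w1 * x1 + w2 * x2 + b for x1, x2 in points]
        if all(v > 0 for v, l in zip(vals, labels) if l == 1) and \
           all(v < 0 for v, l in zip(vals, labels) if l == 0):
            return (w1, w2, b)
    return None   # no line in the grid separates the classes

pts = [(0, 0), (0, 1), (1, 0), (1, 1)]
print("AND:", find_separating_line(pts, [0, 0, 0, 1]))   # finds some (w1, w2, b)
print("XOR:", find_separating_line(pts, [0, 1, 1, 0]))   # None: inseparable
```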



▪ It is a systematic method for training multi-layer ANNs.
▪ It is a multi-layer feed-forward network using the extended gradient-descent based delta-learning rule, commonly known as the back propagation (of errors) rule.

▪ Back propagation is a computationally efficient method for changing the weights in a feed-forward network.

▪ The network is trained by a supervised learning method.

▪ The aim of this network is to train the net to achieve a balance between the ability to respond correctly to the input patterns that are used for training and the ability to give good responses to inputs that are similar.
▪ There are four stages:
1. Initialization of weights
2. Feed forward
3. Back Propagation of errors
4. Updation of the weights and biases.
▪ Initialization of weights: some small random values.
▪ Each input unit (Xi) receives an input signal and transmits this signal to each of the hidden units z1…zp.
▪ Each hidden unit calculates its activation function and sends its signal zj to each output unit.
▪ The output unit calculates the activation function to form the response of the net for the given input pattern.
▪ During Back propagation of errors, each output unit compares its
computed activation yk with its target value tk to determine the
associated error for the pattern with that unit.
▪ Based on error, the factor δk(k=1…m) is computed and is used to
distribute the error at output yk back to all units in the previous
layer.
▪ Similarly, the factor δj(j=1…p) is computed for each hidden unit zj.
▪ During final stage, the weight and biases are updated using the δ
factor and the activation.
▪ Parameters
NOTE: The stopping condition may be minimization of error, a fixed number of epochs, etc.
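A compact sketch of the four stages for a one-hidden-layer network with (binary) sigmoid units; the layer sizes, random initialization range, and learning rate below are assumed for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def backprop_step(x, t, V, W, alpha):
    """One training step. V: input->hidden weights (row 0 = biases),
    W: hidden->output weights (row 0 = biases)."""
    # Stage 2: feed forward
    z = sigmoid(V[0] + x @ V[1:])            # hidden activations z_j
    y = sigmoid(W[0] + z @ W[1:])            # output activations y_k

    # Stage 3: back propagation of errors
    delta_k = (t - y) * y * (1 - y)              # output error factors
    delta_j = (delta_k @ W[1:].T) * z * (1 - z)  # error distributed to hidden units

    # Stage 4: update weights and biases using the delta factors
    W[1:] += alpha * np.outer(z, delta_k); W[0] += alpha * delta_k
    V[1:] += alpha * np.outer(x, delta_j); V[0] += alpha * delta_j
    return y

# Stage 1: initialize weights with small random values (assumed 3-2-1 network).
rng = np.random.default_rng(0)
V = rng.uniform(-0.5, 0.5, size=(4, 2))   # bias row + 3 inputs -> 2 hidden units
W = rng.uniform(-0.5, 0.5, size=(3, 1))   # bias row + 2 hidden -> 1 output unit
```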
▪ Find the new weights when the network illustrated in the given figure is presented with the input pattern [0.6 0.8 0] and the target is 0.9. Use learning rate α = 0.3 and the binary sigmoid activation function.
The process can be continued up to any specified stopping
condition.
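Since the figure with the initial weights is not reproduced here, the example can only be sketched with assumed starting weights, reusing backprop_step and the randomly initialized V, W from the sketch above:

```python
# Worked example with ASSUMED initial weights (the figure's values are
# unavailable): input [0.6, 0.8, 0], target 0.9, alpha = 0.3, binary sigmoid.
x = np.array([0.6, 0.8, 0.0])
t = np.array([0.9])
for step in range(5):                 # a few training steps
    y = backprop_step(x, t, V, W, alpha=0.3)
    print(step, y)                    # the output moves toward 0.9
```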
