Unit - 1
An Artificial Neural Network (ANN) tries to approximate the structure of the human brain and is
fed with massive amounts of data to learn, all at once. A neural network architecture is arranged into
layers, where each layer consists of many simple processing units called nodes, each connected
to several nodes in the layers above and below it. The data is fed into the lowest layer and is
then relayed to the next layer. However, ANNs can also learn based on a pre-existing
representation.
This process is called fine-tuning: it consists of adjusting the weights of a pre-trained network
topology at a relatively slow learning rate so that the network performs well on newly supplied training data.
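As a rough illustration of this idea, the following sketch (Python with NumPy; the weight values, data, and learning rate are made up for illustration and are not taken from any particular model) starts from hypothetical pre-trained weights and takes small gradient steps on new data:

```python
import numpy as np

# Hypothetical pre-trained parameters for a single linear layer (illustrative values only).
W_pretrained = np.array([[0.8, -0.2],
                         [0.1, 0.5]])
b_pretrained = np.array([0.05, -0.1])

# Fine-tuning: start from the pre-trained parameters and take small gradient steps.
W, b = W_pretrained.copy(), b_pretrained.copy()
learning_rate = 1e-4            # deliberately small, so the pre-trained weights change only slightly

# One newly supplied (input, target) training pair, also made up for illustration.
x_new = np.array([1.0, 2.0])
y_target = np.array([0.9, 0.3])

y_pred = x_new @ W + b          # forward pass through the pre-trained layer
error = y_pred - y_target       # gradient of 0.5 * squared error with respect to y_pred
W -= learning_rate * np.outer(x_new, error)   # small adjustment of the weights
b -= learning_rate * error                    # small adjustment of the bias
```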
While artificial neural nets were initially designed to function like biological neural networks, the
neural activity and capability in our brains is far more complex than suggested by artificial
neurons. Unlike Human Neural Networks, ANN cannot yet be trained to work well for many
heterogeneous tasks simultaneously.
The functional differences between biological and artificial neurons are as follows.
1. Size: our brain contains about 10 billion neurons and more than 60 trillion synapses
(connections). The number of “neurons” in artificial networks is much less than that but
comparing their numbers this way is misleading. Deep Neural Networks usually consist of
input neurons, output neurons, and neurons in the hidden layers in between. All the layers are
usually fully connected to the next layer, meaning that artificial neurons usually have as many
connections as there are artificial neurons in the preceding and following layers combined.
The limitation in size isn’t just computational: simply increasing the number of layers and
artificial neurons does not always yield better results in machine learning tasks.
2. Topology: all artificial layers compute one by one, instead of being part of a network that has
nodes computing asynchronously. Feedforward networks compute the state of one layer of
artificial neurons and their weights, then use the results to compute the following layer the
same way. During backpropagation, the algorithm computes some change in the weights in the
opposing way to reduce the difference of the feedforward computational results in the output
layer from the expected values of the output layer. Layers aren’t connected to non-
neighboring layers, but it’s possible to somewhat mimic loops with recurrent and LSTM
networks. In biological networks, neurons can fire asynchronously and in parallel, and they have a
small-world nature, with a small portion of highly connected neurons (hubs) and a large number of
less connected ones. Since artificial neuron layers are usually fully connected, this small-
world nature of biological neurons can only be simulated by introducing weights of 0 to
mimic the lack of connections between two neurons.
3. Speed: certain biological neurons can fire around 200 times a second on average. Signals
travel at different speeds depending on the type of the nerve impulse, ranging from 0.61 m/s
up to 119 m/s. Signal travel speeds also vary from person to person depending on their gender,
age, height, temperature, medical condition, lack of sleep, etc. Information in artificial neurons
is instead carried over by the continuous, floating-point number values of synaptic weights.
How quickly feedforward or backpropagation algorithms are calculated carries no
information, other than making the execution and training of the model faster.
4. Fault-tolerance: biological neural networks, due to their topology, are also fault-tolerant.
Information is stored redundantly so minor failures will not result in memory loss. They don’t
have one “central” part. The brain can also recover and heal to an extent. Artificial neural
networks are not modeled for fault tolerance or self-regeneration, though recovery is possible
by saving the current state (weight values) of the model and continuing the training from that
saved state. Training artificial neural networks for longer periods of time will not affect the
efficiency of the artificial neurons.
5. Power consumption: the brain consumes about 20% of the human body's energy, yet despite
this large share an adult brain operates on only about 20 watts (barely enough to dimly light a
bulb), making it extremely efficient. Our machines are far less efficient than biological systems.
Computers also generate a lot of heat when used, with consumer GPUs operating safely
between 50–80 degrees Celsius instead of 36.5–37.5 °C.
6. Signals: an action potential is either triggered or not; biological synapses either carry a
signal or they don't. Artificial neurons accept continuous values as inputs and apply a simple
non-linear, easily differentiable function (an activation function) to the sum of their weighted
inputs to restrict the range of output values.
7. Learning: we still do not understand how brains learn, or how redundant connections store
and recall information. Brain fibers grow and reach out to connect to other neurons,
neuroplasticity allows new connections to be created or areas to move and change function,
and synapses may strengthen or weaken based on their importance. Artificial neural networks
on the other hand have a predefined model, where no further neurons or connections can be
added or removed. Only the weights of the connections can change during training. Learning
can be understood as the process of finding optimal weights that minimize the differences
between the network's expected and generated output, as sketched in the example after this list.
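The following sketch makes the layer-by-layer feedforward computation and the learning-as-weight-adjustment view concrete. It is a minimal toy example in Python/NumPy with made-up data and an arbitrary 3-5-1 topology; it is only one small instance of gradient-descent training, not a definitive implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative only): 4 samples, 3 input features, 1 target value each.
X = rng.normal(size=(4, 3))
T = rng.normal(size=(4, 1))

# A fixed topology: 3 inputs -> 5 hidden sigmoid neurons -> 1 linear output neuron.
W1, b1 = rng.normal(scale=0.5, size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(scale=0.5, size=(5, 1)), np.zeros(1)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

learning_rate = 0.1
for step in range(1000):
    # Feedforward: compute each layer from the outputs of the previous one.
    H = sigmoid(X @ W1 + b1)          # hidden layer activations
    Y = H @ W2 + b2                   # generated output of the network

    # Difference between generated and expected output.
    E = Y - T

    # Backpropagation: gradients of the squared error with respect to each weight.
    dW2 = H.T @ E / len(X)
    db2 = E.mean(axis=0)
    dH = (E @ W2.T) * H * (1 - H)     # chain rule through the sigmoid
    dW1 = X.T @ dH / len(X)
    db1 = dH.mean(axis=0)

    # Only the weights and biases change during training; the topology stays fixed.
    W1 -= learning_rate * dW1; b1 -= learning_rate * db1
    W2 -= learning_rate * dW2; b2 -= learning_rate * db2
```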
Model of an ANN
A neuron is an information-processing unit that is fundamental to the operation of a neural
network. The following diagram shows the model of a neuron, which forms the basis for
designing (artificial) neural networks.
In mathematical terms, neuron k may be described by the pair of equations

Uk = Wk1 x1 + Wk2 x2 + … + Wkm xm    (1)

and

Yk = φ(Uk + bk)    (2)

where x1, x2, …, xm are the input signals; Wk1, Wk2, …, Wkm are the synaptic weights of
neuron k; Uk is the linear combiner output due to the input signals; bk is the bias;
φ(·) is the activation function; and Yk is the output signal of the neuron.
The use of the bias bk has the effect of applying an affine transformation to the output Uk of
the linear combiner in the model of the figure, as shown by

Vk = Uk + bk    (3)
The bias bk is an external parameter of artificial neuron k. Equivalently, the combination of
Eqs. (1) to (3) may be formulated as

Vk = Wk0 x0 + Wk1 x1 + … + Wkm xm    (4)

and

Yk = φ(Vk)    (5)

In this figure the effect of the bias is accounted for by doing two things: (1) adding a new
input signal fixed at x0 = +1, and (2) adding a new synaptic weight Wk0 equal to the bias bk.
Although these two models are different in appearance, they are
mathematically equivalent.
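A short numerical check of this equivalence, written as a Python/NumPy sketch with arbitrary example values (the logistic function is used here only as a stand-in for φ):

```python
import numpy as np

def phi(v):                       # activation function (logistic sigmoid used as an example)
    return 1.0 / (1.0 + np.exp(-v))

x = np.array([0.5, -1.2, 2.0])    # input signals x1, x2, x3 (arbitrary values)
w = np.array([0.4, 0.3, -0.7])    # synaptic weights Wk1, Wk2, Wk3
b = 0.25                          # bias bk

# Model with an explicit bias term: Uk, then Vk = Uk + bk, then Yk = phi(Vk).
u = np.dot(w, x)                  # linear combiner output Uk
y1 = phi(u + b)

# Equivalent model: bias treated as the weight Wk0 of a fixed input x0 = +1.
x_aug = np.concatenate(([1.0], x))
w_aug = np.concatenate(([b], w))
y2 = phi(np.dot(w_aug, x_aug))

assert np.isclose(y1, y2)         # the two formulations give the same output signal
```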
Activation functions used in ANNs
The activation function, denoted by φ(v), defines the output of a neuron in
terms of the induced local field v. Here we identify three basic types of activation
functions:
1. Threshold Function
2. Piecewise-Linear Function
3. Sigmoid Function
Threshold Function For this type of activation function, described in the following figure, we
have

φ(v) = 1 if v ≥ 0
φ(v) = 0 if v < 0

The output of neuron k employing such a threshold function is expressed as

Yk = 1 if Vk ≥ 0
Yk = 0 if Vk < 0

where Vk is the induced local field of the neuron, that is, Vk = Uk + bk.
Piecewise-Linear Function For this type of activation function, the output is 1 when the
induced local field is large and positive, 0 when it is large and negative, and varies linearly
with the induced local field in a limited region in between (the amplification factor of the
linear region is assumed to be unity). The following two situations may be viewed as special
forms of the piecewise-linear function:
• A linear combiner arises if the linear region of operation is maintained without
running into saturation.
• The piecewise-linear function reduces to a threshold function if the amplification
factor of the linear region is made infinitely large.
Sigmoid Function The sigmoid function, whose graph is s-shaped, is the most common
form of activation function used in the construction of artificial neural networks. It is
defined as a strictly increasing function that exhibits a graceful balance between linear
and nonlinear behavior.
An example of the sigmoid function is the logistic function, defined by

φ(v) = 1 / (1 + exp(−a v))

where a is the slope parameter of the sigmoid function. By varying the parameter
a, we obtain sigmoid functions of different slopes, as illustrated in the following
figure.
In fact, the slope at the origin equals a/4. In the limit, as the slope parameter approaches
infinity, the sigmoid function becomes simply a threshold function. Whereas a threshold
function assumes the value of 0 or 1, a sigmoid function assumes a continuous range of
values from 0 to 1.
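The three activation functions above can be sketched in a few lines of Python/NumPy. The piecewise-linear version below assumes a linear region between -1/2 and +1/2 with unit slope, and the printed values simply illustrate that a sigmoid with a very large slope parameter behaves almost like the threshold function:

```python
import numpy as np

def threshold(v):
    # phi(v) = 1 if v >= 0, else 0
    return np.where(v >= 0, 1.0, 0.0)

def piecewise_linear(v):
    # unit-slope linear region assumed between -1/2 and +1/2, saturating at 0 and 1 outside it
    return np.clip(v + 0.5, 0.0, 1.0)

def logistic(v, a=1.0):
    # logistic sigmoid with slope parameter a; its slope at the origin is a/4
    return 1.0 / (1.0 + np.exp(-a * v))

v = np.linspace(-2.0, 2.0, 9)
print(threshold(v))
print(piecewise_linear(v))
print(logistic(v, a=1.0))
print(logistic(v, a=100.0))   # nearly 0 or 1 everywhere except very close to v = 0
```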
The neural network in the figure below is said to be fully connected in the sense that
every node in each layer of the network is connected to every node in the adjacent
forward layer. If some of the communication links (synaptic connections) are missing
from the network, we say that the network is partially connected.
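In code, the difference between fully and partially connected layers can be mimicked in the way described earlier for missing connections: zero out the weights of the absent links. A small Python/NumPy sketch with arbitrary sizes and weight values:

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=4)               # outputs of the previous layer (4 source nodes)
W = rng.normal(size=(4, 3))          # fully connected: every source node reaches every target node

# Partially connected: a 0/1 mask removes some of the synaptic connections.
mask = np.array([[1, 1, 0],
                 [1, 0, 1],
                 [0, 1, 1],
                 [1, 1, 1]])

y_full = x @ W                       # fully connected layer output
y_partial = x @ (W * mask)           # missing links contribute nothing to the output
```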
Recurrent Networks
A recurrent neural network distinguishes itself from a feedforward neural network in
that it has at least one feedback loop. For example, a recurrent network may consist of a
single layer of neurons with each neuron feeding its output signal back to the inputs of
all the other neurons, as illustrated in the architectural graph in the following figure. In the
structure depicted in this figure there are no self-feedback loops in the network; self-
feedback refers to a situation where the output of a neuron is fed back into its own input.
The recurrent network illustrated in the figure also has no hidden neurons.
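A minimal Python/NumPy sketch of such a single-layer recurrent network (all weights and inputs are arbitrary illustration values): each neuron's output is fed back, through the feedback weight matrix, to the inputs of all the other neurons, and zeroing the diagonal removes self-feedback:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4                                       # number of neurons in the single layer

W_fb = rng.normal(scale=0.5, size=(n, n))   # feedback weights between neurons
np.fill_diagonal(W_fb, 0.0)                 # no self-feedback: a neuron never feeds its own input

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

x = rng.normal(size=n)                      # external input signals
y = np.zeros(n)                             # previous outputs, fed back at every step
for t in range(10):                         # unit-delay feedback iterated over time
    y = sigmoid(x + W_fb @ y)               # each output depends on the other neurons' previous outputs
```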