Deep Learning algorithms
1) Convolutional Neural Networks (CNNs):
CNNs, also known as ConvNets, are multilayer neural networks that are
primarily used for image processing and object detection.
CNNs are commonly used to identify satellite images, interpret medical imaging,
forecast time series, and detect anomalies.
CNNs have multiple layers that process and extract features from data (see the sketch after this list):
o Convolution layer – performs the convolution operation using several filters.
o Activation unit – helps to fire the neurons in the layers.
o Pooling layer – helps to down-sample the features.
o Flatten layer – converts the high-dimensional data into a single column vector.
o Fully connected layer – formed when the flattened vector from the pooling layer is
fed as input; this layer classifies and identifies the image.
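A minimal sketch of this layer stack, assuming the tensorflow.keras API; the 28 × 28 grayscale input and 10 output classes are illustrative choices:

```python
# Minimal CNN sketch: convolution + activation -> pooling -> flatten -> fully connected.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # convolution layer + activation
    layers.MaxPooling2D((2, 2)),                                            # pooling layer (down-sampling)
    layers.Flatten(),                                                       # flatten to a single column vector
    layers.Dense(64, activation="relu"),                                    # fully connected layer
    layers.Dense(10, activation="softmax"),                                 # class scores
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```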
2) Long Short-Term Memory Networks (LSTMs):
LSTMs are a form of Recurrent Neural Network (RNN) that can learn and remember
long-term dependencies.
The default behavior is to recall past information over long periods of time.
Because they retain previous inputs, they are useful for time-series
prediction.
LSTMs have a chain-like structure in which four interacting layers communicate in a
unique way.
LSTMs are commonly employed for voice recognition, music creation, and
pharmaceutical research, in addition to time-series predictions.
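A minimal sketch of an LSTM for time-series prediction, assuming the tensorflow.keras API; the window length of 10 steps and the single feature are illustrative choices:

```python
# Minimal LSTM sketch: read a short window of past values and predict the next one.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.LSTM(32, input_shape=(10, 1)),  # 10 past time steps, 1 feature each
    layers.Dense(1),                       # predict the next value in the series
])
model.compile(optimizer="adam", loss="mse")
```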
3) Recurrent Neural Networks (RNNs):
How do RNNs work?
RNNs contain connections that form directed cycles, which allow the outputs from
the LSTM to be fed as inputs to the current phase.
The LSTM's output becomes an input to the current phase, and the network's internal
memory allows it to remember prior inputs. Image captioning, time-series
analysis, natural-language processing, handwriting recognition, and machine
translation are all common uses for RNNs.
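A minimal sketch of a simple recurrent network, assuming the tensorflow.keras API; the sequence length, feature count, and state size are illustrative:

```python
# Minimal RNN sketch: the hidden state is fed back into the next step through the recurrent connection.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.SimpleRNN(32, input_shape=(20, 8)),  # 20 time steps, 8 features, recurrent state of size 32
    layers.Dense(1),                            # one output per sequence
])
model.compile(optimizer="adam", loss="mse")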
4) Generative Adversarial Networks (GANs):
GANs are deep learning generative algorithms that generate new data instances
that are similar to the training data. A GAN is made up of two parts: a generator,
which learns to generate fake data, and a discriminator, which learns from that false data.
GANs aid in the creation of realistic images and cartoon characters, as well as
the creation of photographs of human faces and the rendering of 3D objects.
The discriminator learns to tell the difference between the bogus data
generated by the generator and the genuine sample data.
The generator generates fraudulent data during early training, and the
discriminator quickly learns to recognize that it’s false.
The GAN feeds these results back to the generator and the discriminator to update
both models.
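A rough sketch of this alternating update, assuming the tensorflow.keras train_on_batch API; generator, discriminator, gan (a combined model with the discriminator frozen), latent_dim, and real_batches are hypothetical placeholders you would define yourself:

```python
# Sketch of one epoch of alternating GAN training.
import numpy as np

latent_dim = 100  # illustrative noise dimension

for real_images in real_batches:
    # 1) Train the discriminator on real vs. generated ("fake") samples.
    noise = np.random.normal(size=(len(real_images), latent_dim))
    fake_images = generator.predict(noise)
    discriminator.train_on_batch(real_images, np.ones(len(real_images)))   # real -> label 1
    discriminator.train_on_batch(fake_images, np.zeros(len(fake_images)))  # fake -> label 0

    # 2) Train the generator (through the combined model) to fool the discriminator.
    noise = np.random.normal(size=(len(real_images), latent_dim))
    gan.train_on_batch(noise, np.ones(len(real_images)))  # generator wants "real" labels
```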
5) Radial Basis Function Networks (RBFNs):
The neurons in the hidden layer contain Gaussian transfer functions whose outputs
are inversely proportional to the distance from the neuron's center.
The output of the network is a linear combination of the input's radial-basis
functions and the parameters of the neuron.
6) Multilayer Perceptrons (MLPs):
MLPs are a type of feedforward neural network that consists of multiple layers of
perceptrons with activation functions.
MLPs consist of an input layer and an output layer that are fully connected.
They have the same number of input and output layers but may have several hidden
layers, and they can be used to build speech-recognition, image-recognition, and
machine-translation software.
The data is fed into the input layer of the network. The signal flows in one direction
because the layers of neurons are connected in a directed graph from input to output.
The MLP computes the weights between the input and hidden layers.
It then applies activation functions to determine which nodes fire; common activation
functions include sigmoid, tanh, and ReLU.
From a training data set, MLPs learn the correlations and dependencies between the
independent and target variables.
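A minimal sketch of such a network, assuming the tensorflow.keras API; the layer sizes, input dimension, and class count are illustrative:

```python
# Minimal MLP sketch: fully connected input -> hidden -> output with activation functions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Dense(64, activation="relu", input_shape=(20,)),  # input -> hidden (weights + activation)
    layers.Dense(3, activation="softmax"),                   # hidden -> output (class probabilities)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```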
7) Self-Organizing Maps (SOMs):
SOMs enable data visualization by reducing the dimensionality of data through
self-organizing artificial neural networks.
Data visualization addresses the problem that humans cannot easily interpret
high-dimensional data.
SOMs choose a vector at random from the training data to initialize weights for
each node.
SOMs examine every node to find which weights are closest to the input vector;
the winning node is called the Best Matching Unit (BMU).
SOMs then determine the BMU's neighborhood, and the number of neighbors
decreases over time.
SOMs award a winning weight to the sample vector. The closer a node is to the BMU,
the more its weight changes; the further a neighbor is from the BMU, the less it learns.
SOMs repeat the second step for N iterations.
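A minimal NumPy sketch of one training iteration following these steps; the grid size, learning rate, and neighborhood radius are illustrative choices:

```python
# Sketch of one SOM training step: find the BMU, then update it and its neighbours.
import numpy as np

grid_h, grid_w, dim = 10, 10, 3
weights = np.random.rand(grid_h, grid_w, dim)  # one weight vector per map node
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij"), axis=-1)

def train_step(x, lr=0.1, radius=2.0):
    # 1) Best Matching Unit: the node whose weights are closest to the input vector.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # 2) Update the BMU and its neighbourhood; nodes further from the BMU learn less.
    grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
    influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
    weights[:] = weights + lr * influence[..., None] * (x - weights)

for x in np.random.rand(100, dim):  # repeat for N iterations with random training samples
    train_step(x)
```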
8) Deep Belief Networks (DBNs):
DBNs are generative models with several layers of stochastic, latent variables.
Latent variables, often known as hidden units, have binary values.
DBNs are a stack of Restricted Boltzmann Machines (RBMs) with connections between
the layers; each RBM layer communicates with both the previous and subsequent layers.
DBNs are used for image recognition, video recognition, and motion-capture data.
A greedy learning algorithm is used to train the DBN; it uses a layer-by-layer approach
to learn the top-down generative weights.
DBNs run Gibbs sampling on the top two hidden layers, drawing a sample from the
RBM defined by those two layers.
DBNs draw a sample from the visible units using a single pass of ancestral
sampling through the rest of the model.
The values of the latent variables in every layer can then be inferred by a single,
bottom-up pass.
9) Restricted Boltzmann Machines (RBMs):
RBMs are stochastic neural networks that can learn from a probability distribution
over a set of inputs.
RBMs consist of visible units and hidden units. Each visible unit is connected to
all hidden units. RBMs have a bias unit that is connected to all the visible and
hidden units, and they have no output nodes.
They are used in dimensionality reduction, classification, regression, collaborative
filtering, feature learning, and topic modeling.
In the forward pass, RBMs accept the inputs and translate them into a set of numbers
that encodes the inputs.
They combine every input with an individual weight and an overall bias, and the
output is passed to the hidden layer.
In the backward pass, RBMs take that set of numbers and translate it back to form
the reconstructed inputs.
They combine each activation with an individual weight and an overall bias and pass
the output to the visible layer for reconstruction.
At the visible layer, the RBM compares the reconstruction with the original
input to analyze the quality of the result.
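A minimal NumPy sketch of these two passes under illustrative sizes; the weight-update rule (e.g., contrastive divergence) is omitted:

```python
# Sketch of one RBM forward (encode) and backward (reconstruct) pass.
import numpy as np

n_visible, n_hidden = 6, 4
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # visible-hidden weights
b_visible = np.zeros(n_visible)                        # visible bias
b_hidden = np.zeros(n_hidden)                          # hidden bias

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

v = rng.integers(0, 2, size=n_visible).astype(float)   # a binary input vector

# Forward pass: combine inputs with weights and the hidden bias, pass to the hidden layer.
h_prob = sigmoid(v @ W + b_hidden)
h = (rng.random(n_hidden) < h_prob).astype(float)      # sample binary hidden units

# Backward pass: combine hidden activations with weights and the visible bias
# to reconstruct the input at the visible layer.
v_reconstructed = sigmoid(h @ W.T + b_visible)
reconstruction_error = np.mean((v - v_reconstructed) ** 2)  # compare with the original input
```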
10) Autoencoders:
Autoencoders are a specific type of feedforward neural network in which the input
and output are identical.
The autoencoders are trained to replicate the data from the input layer to the
output layer.
They are widely used in pharmaceutical discovery, popularity prediction, and image
processing.
An autoencoder receives the input and transforms it into a different representation.
The encoder encodes the input and reduces it to a smaller representation.
Finally, the decoder decodes this representation to generate the reconstructed image.
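A minimal sketch of this encoder-decoder structure, assuming the tensorflow.keras API; the 784-dimensional input and 32-dimensional code are illustrative choices:

```python
# Minimal autoencoder sketch: the encoder compresses the input to a smaller
# representation and the decoder reconstructs it.
from tensorflow.keras import layers, models

autoencoder = models.Sequential([
    layers.Dense(32, activation="relu", input_shape=(784,)),  # encoder: compress to 32 values
    layers.Dense(784, activation="sigmoid"),                  # decoder: reconstruct the input
])
autoencoder.compile(optimizer="adam", loss="mse")
# Trained to replicate its input: autoencoder.fit(x, x, epochs=10)
```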
Important questions:
GANs are difficult to train. The reason they are difficult to train is that both the
generator model and the discriminator model are trained simultaneously in a game.
This means that improvements to one model come at the expense of the other
model.
modelling complex non-linear functions.
PCA features are linearly uncorrelated with each other since they are projections
onto an orthogonal basis. Autoencoder features, however, might be correlated since
they are trained only for accurate reconstruction.
PCA is faster and computationally cheaper than autoencoders.
A single-layer autoencoder with a linear activation function is very similar to PCA.
Autoencoders are prone to overfitting due to their high number of parameters.
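A small sketch of this comparison, assuming the scikit-learn and tensorflow.keras APIs; the data and layer sizes are illustrative:

```python
# PCA vs. a single-layer linear autoencoder: both project 10-D data onto 2 components.
import numpy as np
from sklearn.decomposition import PCA
from tensorflow.keras import layers, models

X = np.random.rand(500, 10)

pca_codes = PCA(n_components=2).fit_transform(X)  # uncorrelated projections onto an orthogonal basis

ae = models.Sequential([
    layers.Dense(2, activation="linear", input_shape=(10,)),  # linear "encoder"
    layers.Dense(10, activation="linear"),                    # linear "decoder"
])
ae.compile(optimizer="adam", loss="mse")
ae.fit(X, X, epochs=20, verbose=0)  # trained only for reconstruction; the 2-D codes may be correlated
```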
The self-organizing map makes topologically ordered mappings between the input data
and the processing elements of the map. Topologically ordered means that if two inputs
have similar characteristics, the most active processing elements responding to them
are located close to each other on the map.
RBMs are generative. That is, unlike autoencoders, which only discriminate some data
vectors in favour of others, RBMs can also generate new data from the learned joint
distribution. They are also considered more feature-rich and flexible.
Some of the top Deep Learning frameworks out there today are:
TensorFlow
Keras
PyTorch
Caffe2
CNTK
MXNet
Theano
Valid padding: It is used when there is no requirement for padding. The output
matrix will have the dimensions (n − f + 1) × (n − f + 1) after convolution.
Same padding: Here, padding elements are added all around the input matrix so that
the output has the same dimensions as the input matrix.
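A small sketch of both modes, assuming the tensorflow.keras Conv2D API, with n = 6 and f = 3 so that valid padding yields a 4 × 4 output while same padding keeps 6 × 6:

```python
# Compare output shapes under "valid" and "same" padding for a 6x6 input and a 3x3 filter.
import numpy as np
from tensorflow.keras import layers

x = np.zeros((1, 6, 6, 1), dtype="float32")          # one 6x6 single-channel image
valid_out = layers.Conv2D(1, 3, padding="valid")(x)  # (n - f + 1) = 4 -> shape (1, 4, 4, 1)
same_out = layers.Conv2D(1, 3, padding="same")(x)    # same as input   -> shape (1, 6, 6, 1)
print(valid_out.shape, same_out.shape)
```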
If all the weights in the network are initialized to zero, then all the neurons in
each layer will produce the same output and receive the same gradients during
backpropagation.
As a result, the network cannot learn at all because there is no source of
asymmetry between neurons. That is why we need to add randomness to the
weight-initialization process.
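A tiny NumPy illustration of this symmetry problem; the input values and layer sizes are arbitrary:

```python
# With all-zero weights every hidden neuron produces the same output,
# so every neuron would also receive the same gradient.
import numpy as np

x = np.array([0.5, -1.0, 2.0])          # one input example
W_zero = np.zeros((3, 4))               # 3 inputs -> 4 hidden neurons, all-zero weights
print(np.tanh(x @ W_zero))              # [0. 0. 0. 0.] -- identical activations

W_rand = np.random.randn(3, 4) * 0.01   # small random init breaks the symmetry
print(np.tanh(x @ W_rand))              # distinct activations, distinct gradients
```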
12. What is Data Normalization, and why do we need it?
Data normalization rescales values to fit into a specific range so that training
(backpropagation) converges better; it also helps reduce or eliminate data redundancy.
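A minimal NumPy sketch of min-max rescaling, one common way to normalize features into the [0, 1] range; the sample values are illustrative:

```python
# Min-max normalization: map each feature into [0, 1] so all inputs share a comparable range.
import numpy as np

X = np.array([[10.0, 0.001],
              [20.0, 0.004],
              [15.0, 0.002]])
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```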
Type 1 error (false positive) is when your algorithm makes a positive prediction, but
in fact it's negative. For example, your algorithm predicts a patient has diabetes,
but in fact they don't.
Type 2 error (false negative) is when your algorithm makes a negative prediction, but
in fact it's positive. For example, your algorithm predicts a patient doesn't have
diabetes, but in fact they do.
The learning rate is a tuning parameter that determines the step size of each
iteration (epoch) during model training.
If the learning rate is high, the model weights are updated in large steps, so the
model converges quickly but may overshoot the true error minimum. This results in a
model that trains faster but is less accurate.
If the learning rate is low, the weights are updated in small steps, so the model
takes a long time to converge but will not overshoot the true error minimum. This
results in a model that trains more slowly but is more accurate.
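A tiny gradient-descent sketch on f(w) = w² illustrating this trade-off; the learning rates are illustrative:

```python
# Gradient descent on f(w) = w**2: a large learning rate overshoots (and here diverges),
# a small one converges slowly but steadily toward the minimum at w = 0.
def gradient_descent(lr, steps=20, w=5.0):
    for _ in range(steps):
        w = w - lr * 2 * w      # gradient of w**2 is 2w
    return w

print(gradient_descent(lr=1.1))   # overshoots: |w| grows each step
print(gradient_descent(lr=0.01))  # converges slowly toward 0
```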