INTRODUCTION TO DEEP LEARNING

Neural Networks
Even with advances in technology and algorithms, there is still no match for the
capabilities of the human brain, which is not yet fully understood. The core
components of our nervous system and brain are the brain cells or nerve cells, also
referred to as “neurons”. These cells are connected to one another to form a
complex network structure known as a “neural network”.

The figure above represents the structure of a neuron, which has three major
components, namely:
1. Cell body
2. Dendrite
3. Axon
Dendrites receive signals from other neurons and bring them to the cell body, where the
signals are processed. The axon transmits the processed signal from the cell body to the
dendrites of the neighboring cells to which it is connected.
The study of artificial neural networks (ANN) is inspired by attempts to simulate the
biological neural system. An ANN consists of interconnected artificial neurons
or nodes, analogous to the biological neural network. Just as a biological neuron
receives an input signal, processes it and transmits the output to other neurons, a node
in an ANN receives an input, processes it using a function known as an activation
function and transmits the output to other nodes.
The figure below shows the analogy between a neuron of the human brain and a node
of an ANN.
The figure below shows part of a biological neural network and the corresponding
artificial neural network (ANN).

Components of ANN
Artificial Neurons/Nodes: These are the elementary units in a neural network. An
artificial neuron is a computational unit.
Layers: These are groups of neurons at different levels. An ANN has one input layer, one
or more hidden layers and one output layer. Nodes in the input layer are the nodes to
which we feed the input feature values. Output layer nodes give us the output or target
values. The layers between the input and output layers are called hidden layers.
Weights and Biases: Weights are numerical parameters which determine how strongly
each neuron affects the others. A bias is a special neuron with the value 1. It is added
to each pre-output layer.
Activation function: An activation function is a non-linear mathematical function which
converts the input values to an output. Without activation functions, a neural network
would behave like a linear model.
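As a quick illustration of how these components combine, here is a minimal sketch
(using NumPy, with made-up input, weight and bias values) of a single node applying an
activation function to a weighted sum plus a bias:

```python
import numpy as np

# One artificial node: weighted sum of inputs plus a bias, passed through
# a sigmoid activation. All numbers here are illustrative.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.7, 0.2, 0.5])    # input feature values
w = np.array([0.4, -0.3, 0.8])   # weights (learned during training)
b = 0.1                          # bias term

output = sigmoid(np.dot(w, x) + b)
print(output)                    # a value between 0 and 1
```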
Let's now consider an example to demonstrate how an ANN is used.
Assume we have a lot of past medical data about diabetic patients. We want to build a
model that predicts whether a person is “Diabetic” or “Non-Diabetic” based on the inputs
“Sugar Level” and “Age”.
We will see how an ANN is trained with the past data for this purpose.
Black Box Example - Diabetic prediction:
The example we considered can be compared to a black box that predicts whether a
person is “Diabetic” or “Non-Diabetic” based on the following input.
INPUT:
Sugar level
Age
OUTPUT:
Diabetic
Non- Diabetic

Layers in ANN
The problem can be modelled as a neural network as shown below. Let us learn about
each of the layers in detail to understand how the model works. Note that the inputs to
the model are the age and the sugar level, and the output is a binary decision on whether
a patient is diabetic.
Input Layer
• It is the first layer in any neural network.
• It brings the initial data into the system for further processing by subsequent
layers of artificial neurons. (It is the only layer where no computation happens
i.e. no activation function is applied.)
• Here we will pass “Sugar Level” and “Age” as input parameters to the neurons in
the input layer.
Hidden Layer
• Layers between the input layer and the output layer in an ANN are called hidden layers.
• A hidden layer takes a set of weighted inputs and applies a non-linear activation
function to them.
• Each hidden layer can contain the same or a different number of neurons.

Output Layer
• It is the last layer in any neural network, from which we get the final target values.
• Usually, output layer nodes also have activation functions.
• The neurons of this layer will give us the output: whether the person is “Diabetic”
or “Non-Diabetic”.
Learning Weights and Biases
Weights and biases are parameters that are learned (or adjusted) to produce the
desired outputs from the given inputs. Mathematical techniques such as stochastic
gradient descent (SGD), Adam, etc. are used to search for the set of weights that
makes the model accurate in its predictions.

Activation Function
An activation function (also known as a transfer function) is a non-linear mathematical
function which converts input values to an output. It helps neural networks find
non-linear relations in data.

Commonly used functions:


• Sigmoid
• Tanh
• ReLU
• Softmax
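For intuition, here is a rough NumPy sketch of these four functions (reference
definitions only; in practice the deep learning framework supplies its own optimized
versions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # zero for negatives, identity otherwise

def softmax(z):
    e = np.exp(z - np.max(z))         # subtract max for numerical stability
    return e / e.sum()                # outputs sum to 1, usable as probabilities

print(relu(np.array([-1.0, 0.5])), softmax(np.array([1.0, 2.0, 3.0])))
```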
The process of Learning
The model learns its parameters (weights and biases) through a process of forward
propagation of data and backward propagation of errors.
To start with, random weights and biases are assumed. The input is then fed forward
through the network. Each hidden layer transforms the input passed to it until it reaches
the output layer. The output obtained is compared against the expected output, and the
error is recorded as their difference. This error is then propagated backward through the
layers to adjust the parameters, i.e. the weights and the biases. The forward pass and the
backward propagation continue over a number of iterations until a desirably low level of
error is achieved. The neural network is then said to have learnt its parameters.
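A minimal sketch of this loop for a single sigmoid node, assuming a toy dataset and a
squared-error loss (the data, learning rate and number of iterations are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.8, 0.2], [0.1, 0.9], [0.7, 0.6]])  # toy inputs
y = np.array([1.0, 0.0, 1.0])                        # expected outputs
w, b, lr = np.random.randn(2), 0.0, 0.1              # random start, learning rate

for epoch in range(1000):
    out = sigmoid(X @ w + b)         # forward pass through the node
    error = out - y                  # compare against the expected output
    grad = error * out * (1 - out)   # backward pass: gradient at the node
    w -= lr * (X.T @ grad)           # adjust weights
    b -= lr * grad.sum()             # adjust bias
```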
Representation of an image
An image is unstructured data represented by pixels.
To store an image on a computer, the image is broken down into tiny elements called
pixels (short for picture element). A pixel is a number and it represents one colour.
Usually, an 8-bit number is used to represent a pixel. Hence, each pixel of a black and
white image can have 2^8 = 256 different shades of gray. It can also be represented as a
decimal number from 0 to 255, where 0 stands for black and 255 for white.
An image with a resolution of 1024 by 798 pixels has 1024 x 798 = 817,152 pixels in total.
A pixel in a coloured image is an array of 3 numbers. They represent the red, green and
blue (RGB) values. For an 8-bit representation, each colour can take 256 shades. The
final colour of the pixel is given by the combination of the 3 values. Hence each pixel
can take 256 x 256 x 256 values, i.e. nearly 17 million different colours.
So, the computer sees an image as an array of numbers. An image is described by its
colour depth, the number of bits used to represent a pixel (8 in the examples above), and
its resolution, the height and width of the image in pixels. As a colour image has 3
matrices, one for each colour, it is represented by its height, width and depth. It is
sometimes called a 3D volume.
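A short NumPy sketch of this representation (the image dimensions and pixel values
here are made up):

```python
import numpy as np

# A colour image is a 3D volume: height x width x depth (one matrix per colour).
image = np.zeros((798, 1024, 3), dtype=np.uint8)  # 8-bit pixels, values 0-255
image[0, 0] = [255, 0, 0]                         # top-left pixel set to pure red
print(image.shape)                                # (798, 1024, 3)
```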
Feature
Can we teach a machine to identify objects the way humans do? Assume that we have a
large number of images of cats. Is there a way to train a machine to identify cats with
these given images? Can the machine identify a new cat that was not in the training
dataset?
To answer these questions, we have to understand how we identify objects. We identify
objects by their features. Let's understand what a feature is.

Consider two images, one of a caracal and one of a cheetah:


• A caracal has pointy ears; a cheetah has round ears.
• A cheetah is spotted; a caracal is not.
• A cheetah has dark black lines on its face.
Such differentiating data is called a feature.
The abstraction of such features (a straight line, a curve, etc.) is shown in the diagram. A
feature is a small image, usually of 2 x 2 or 3 x 3 pixel dimensions. These features are
also called filters, kernels or weights in deep learning.
But how does a machine find out what features an image has? It learns features using
the mathematical operation called convolution. Convolution measures the similarity
between a feature and a given image. We select a random feature and convolve (slide) it
over the surface of the image. The learning process progressively modifies the random
feature so that it matches the given image well. Multiple features are deployed, each
starting from a different random state, and each specializes in learning a different
feature of the image. The process of learning features from an image is called feature
extraction.
Once we have extracted the features from an object, we can use them to identify the
object.
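A rough sketch of the convolution operation in plain NumPy (a toy image and a
hand-picked 2 x 2 filter, for illustration only):

```python
import numpy as np

# Slide a small filter over the image; each output value is the sum of
# element-wise products, i.e. how well the patch matches the filter.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(6, 6)
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])           # a simple vertical-edge filter
print(convolve2d(image, kernel).shape)     # (5, 5): the feature map shrinks
```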
Learnable filters
A convolutional neural network has two parts. The first part identifies the features in the
image. The features learnt in the first part are fed to the second part, a fully connected
neural network, which does the classification.
The question is: how do we select a filter such that its overlap with the image is
maximal? Convolutional neural networks use learnable filters. We randomly initialize
such filters and then learn the required filters by training, just as in an ANN, where we
begin with random weights and then learn the best weights through training. This is
why filters are also called weights.
Sample filter -

Padding
We observed that every time we use the filter and scan the image for similarity, the size
of the resulting image becomes smaller and smaller. We do not want that, because we
want to preserve the original size of the image to extract low-level features. The second
issue is that the pixels on the periphery are covered only once, while the ones in the
middle are covered multiple times.
Thus, there are two issues with convolution:
• a shrinking feature map, and
• losing information on the periphery of the image.
Both issues are solved by padding. In padding, extra pixels are added at the boundary of
the image before the convolution. Zero padding means the value of these added pixels
is zero. Sometimes the value of the edge pixel is copied into the padded cells.

If zero padding = 1 , one pixel will be added around the original image with value = 0.
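A small NumPy sketch of zero padding with padding = 1 (the 4 x 4 image is
illustrative):

```python
import numpy as np

image = np.arange(16, dtype=float).reshape(4, 4)
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(image.shape, padded.shape)   # (4, 4) (6, 6)
# np.pad(image, 1, mode="edge") would copy the edge pixel values instead of zeros.
```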

Feature map sets


As discussed, a feature map is a measure of the similarity of a filter with the image. An
image can have multiple distinguishing features. To correctly identify the image we need
to learn all of them. Hence we deploy a number of learnable filters. Each specializes in
recognizing one feature, and each filter results in a feature map. Hence we get a set of
feature maps.
Part drawing showing feature map sets
Pooling
Pooling layers, also known as downsampling layers, are a component of convolutional
neural networks (CNNs) that reduce the spatial dimensions of input data while retaining
important information. They do this by dividing the input data into small regions, called
pooling windows or receptive fields, and then performing an aggregation operation
within each window. The aggregation operation can be taking the maximum or average
value of the features in each rectangle. Pooling layers help control the model's
complexity, reduce overfitting, and improve computational efficiency. They also give the
features in the feature maps a degree of local translational invariance, which makes the
CNN more robust to variations in their positions. This means that if the input is
translated by a small amount, the values of most of the pooled outputs will not change.
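A minimal sketch of 2 x 2 max pooling with stride 2 in NumPy (the feature map values
are made up):

```python
import numpy as np

# Keep only the largest value in each non-overlapping 2x2 window.
def max_pool(feature_map, size=2):
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]
    blocks = trimmed.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 8, 2],
                 [3, 2, 1, 1]], dtype=float)
print(max_pool(fmap))   # [[4. 5.] [3. 8.]]
```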
Flattening
How are the pooled layer sets fed to a fully connected layer? Each element of the
pooled set is still a matrix, but a fully connected network does not accept matrices,
only scalars.
So we flatten each matrix into a vector; see the image below.
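For example, flattening the 2 x 2 pooled map from the pooling sketch above gives a
length-4 vector of scalars:

```python
import numpy as np

pooled = np.array([[4.0, 5.0],
                   [3.0, 8.0]])
print(pooled.flatten())   # [4. 5. 3. 8.]
```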

Dropout
Neural networks can overfit very easily: they learn the training set very well but perform
poorly on the validation set.
To get around this problem, the network is trained by randomly dropping a few neurons
(or filters) during each pass. This ensures that the filters do not “over-specialize” in
recognizing features and can still work when a few features are masked off.
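A rough sketch of (inverted) dropout applied to a layer's activations, assuming a drop
rate of 0.5:

```python
import numpy as np

# During training, randomly zero activations with probability p and rescale
# the survivors so the expected value stays the same; at inference, do nothing.
def dropout(activations, p=0.5, training=True):
    if not training:
        return activations
    mask = (np.random.rand(*activations.shape) > p).astype(float)
    return activations * mask / (1.0 - p)

print(dropout(np.ones(8), p=0.5))
```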
Putting them together: Convolutional Network
The following image shows all the stages in a Convolutional neural network.
Training
The fully connected network at the end is trained using back-propagation and gradient
descent, as discussed earlier. It is during this process that the distinguishing features of
the images are learnt by the filters.
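A minimal Keras sketch of such a pipeline, assuming 28 x 28 grayscale inputs and 10
output classes (the layer sizes are illustrative, not taken from the text):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(8, (3, 3), padding="same", activation="relu"),  # learnable filters
    layers.MaxPooling2D((2, 2)),                                  # pooling
    layers.Flatten(),                                             # flattening
    layers.Dropout(0.25),                                         # dropout
    layers.Dense(32, activation="relu"),                          # fully connected layer
    layers.Dense(10, activation="softmax"),                       # classification output
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```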

Applications of Convolutional Neural Network


1. Medical imaging analysis
• Deep learning automatically selects the features needed for a confident prediction,
whereas traditional machine learning systems require the features to be
hand-crafted after an extensive analysis of the data. This helps in applications
like segmentation, abnormality detection and so on.
2. Security
• The London police force applied deep learning to find crime hotspots and to
predict who is likely to reoffend. They applied facial recognition to find criminals
in large groups of people.
3. Retail
• Labelling of products based on images
• Matching a material at a retail kiosk: the user can bring an image of a product,
and the kiosk can match it and tell which shop in the mall will have the product.

Static and Dynamic models


Consider the problem of predicting a stock's price on the 6th day, given its prices over
the previous days. This problem cannot be solved using the neural network models we
have discussed so far, because those models do not consider the history of events or
sequences. For example, assume that we build an image recognition model to identify a
'healthy' or 'diseased' cell. We show it two images, one after the other. Assume the first
image is of a healthy cell. When we show the next image to the model, it classifies that
image alone; it does not relate it to the previous image that was shown to it. Such
models are called static models. Time is not considered in these models.
Compare this to the daily price data of the stock we discussed. Here the price of the
share is related to its previous prices. The data sequence is associated with time, and the
next output is produced after a time step. In the example above we can say that we are
considering the share price at a time step of 1 day. Such models are called dynamic
models. Time is an essential element in these models. They are also called temporal
models.

Training an RNN
Recurrent Neural Networks differ from the feed-forward networks discussed earlier in
that they allow feedback loops, including self-feedback loops.
Training the network: A recurrent neural network can be trained using the gradient
descent error optimization method. The error is back-propagated to all the previous time
steps to optimize the parameters (weights). Hence this process is also called Back
Propagation Through Time (BPTT). Once the network is trained, it can predict the next
item in a sequence given the previous sequence data as input.
Recurrent Neural Networks are a state-of-the-art algorithm for sequential data. They are
used by, among others, Apple's Siri and Google's Voice Search.
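A minimal Keras sketch of a sequence model in this spirit, assuming toy data with 5
time steps per sequence (shapes, layer sizes and data are illustrative):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(5, 1)),   # 5 time steps, 1 feature per step
    layers.SimpleRNN(16),         # recurrent layer with feedback over time
    layers.Dense(1),              # next-step prediction
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(100, 5, 1)          # 100 toy sequences
y = np.random.rand(100, 1)             # toy next-value targets
model.fit(x, y, epochs=2, verbose=0)   # BPTT happens inside fit()
```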
Applications of RNN
• Speech recognition means taking speech wave data, converting it into a
machine-learnable format and identifying the words and phrases used in it. A
simple way to imagine speech recognition is to input an audio clip and
output the transcript. Speech is broken down into chunks and passed through
the RNN. Systems like Alexa and Siri use RNNs for speech recognition.
• Machine translation means translating text/speech from one language to
another. It happens in two steps:
1. Speech recognition: converting the input to words and phrases and
translating each word to the other language. However, this sequence may
not be correct in the other language. Hence:
2. Sequencing: ordering the words as per the target language. An RNN is
used for sequencing the words in the other language.
• Robot control: robots implemented using RNNs are being used in surgeries
for precision and control over the equipment, where humans have a higher rate
of error compared to the steady robots.

Autoencoders

Autoencoders are an unsupervised way to learn about inputs; the input and the output
are the same. There are two phases, encoding and decoding, which mirror each other:
the layer sizes decrease through the encoding phase and increase through the decoding
phase.
In the image above, the input data is encoded into a smaller representation and then
decoded back to the original form.
Salient features get captured due to this data compression. This is why autoencoders
are used to remove noise from images (e.g. removing patches from an image). They can
also be used as a dimensionality reduction technique for the same reason.
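A minimal Keras sketch of an autoencoder, assuming flattened 784-pixel inputs; the
layer sizes (784 -> 128 -> 32 -> 128 -> 784) are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

autoencoder = keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),    # encoder
    layers.Dense(32, activation="relu"),     # bottleneck code
    layers.Dense(128, activation="relu"),    # decoder
    layers.Dense(784, activation="sigmoid"), # reconstruction
])
# The input and the output are the same data, so training is x -> x.
autoencoder.compile(optimizer="adam", loss="mse")
```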
Applications of Autoencoders
• Data De-noising
As we compress data in subsequent hidden layers, only the salient features are captured
in the process of auto-encoding. This helps us eliminate noise in the data.
• Dimensionality Reduction
Google uses autoencoders for image search: a large number of images are stored, and
when an unseen image is presented, the system searches through all these images and
suggests similar images from the storage. Autoencoders are used to compress the image
data and store it as an index. Any new image can be compressed and quickly searched
against this index.
Another application could be a face recognition system to identify criminals at public
locations (railway stations, airports, metro stations, etc.).

Applications
Finance
Deep learning is used in the finance domain for various purposes, such as detecting
fraudulent transactions, determining the risk of offering a loan, checking whether an
insurance claim is genuine, and many more.
• Fraud Detection
• Loan and Insurance underwriting
• Portfolio Management
• Algorithmic Trading
Fraud Detection
Fraud detection is applicable to many industries, including banking and financial
services, insurance, government agencies and law enforcement. Fraud attempts have
increased drastically in recent years, making fraud detection more important than ever.
Despite efforts on the part of the affected institutions, hundreds of millions of dollars are
lost to fraud every year. Since relatively few cases in a large population are fraudulent,
finding them can be tricky.
Benefits:
• Reduces fraud involving credit cards, forged checks, misleading accounting
practices, etc.
• Helps in detecting fraudulent claims in insurance and e-commerce companies
• Helps in better loan amount prediction
• Can detect malware and phishing websites more easily
• Enhanced automation that leads to fewer manual reviews
• Greater speed in making a risk assessment by quickly identifying patterns in data
• Increased accuracy in identifying good orders versus fraudulent orders
• Allows data to be better classified and trends to be identified quickly
• Efficient utilization of resources, as models are constantly updated with new data
and feature extraction
Companies:
• Danske Bank
• Lloyds Bank

Loan And Insurance Underwriting


A lender determines whether the risk of offering a loan to a particular borrower under
certain parameters is acceptable. Most of the risks and terms that underwriters consider
fall under the three C's of underwriting: credit (credit score, …), capacity (to repay the
loan), and collateral (quality of assets that can back non-payment).
Benefits:
• Helps in better loan amount prediction.
• Helps underwriters to focus on the most valuable business
• Helps in prediction of types of insurance and coverage plans new customers will
purchase
• Better premiums and policy updates
• Predicts coverage changes and forms of insurance (health, life, property, flood)
that will most likely be dominant
• Helps detect fraudulent insurance claims
Companies:
• Liberty Mutual Insurance
• StateFarm

Portfolio Management
Portfolio management is all about:
• Art and science of making decisions about investment mix and policy
• Matching investments to objectives,
• Asset allocation for individuals and institutions, and
• Balancing risk against performance.
• Determining strengths, weaknesses, opportunities and threats in the choice of
debt vs. equity, domestic vs. international, growth vs. safety, and
• Many other trade-offs encountered in the attempt to maximize return at a given
appetite for risk.
Benefits:
• End investors will benefit.
• Expenses will decrease as firms achieve economies of scale by
using technology.
• Returns will increase as new investment management processes are put in
place.
• Better decision making system
• Better risk management
Companies:
• Wealthfront
• Betterment

Algorithmic Trading
Algorithmic trading (also called automated trading, black-box trading, or simply
algo-trading) is the process of using computers programmed to follow a defined set of
instructions for placing a trade, in order to generate profits at a speed and frequency
that is impossible for a human trader.
The defined sets of rules are based on:
• Timing,
• Price,
• Quantity or
• Any mathematical model.
Apart from profit opportunities for the trader, algo-trading makes markets more liquid
and makes trading more systematic by ruling out the impact of human emotions on
trading activities. Some of the organizations that leverage these techniques are listed
under Companies below.
Benefits:
• Better decision making system
• Better risk management
• Predict better stock to invest
• Predict better fund for long term investment
Companies:
• Societe Generale
• Goldman Sachs
• Morgan Stanley
• Credit Suisse

Energy
Saving energy consumption
Different companies are now using deep learning to conserve energy.
Saving energy is a complex scenario, as it is not just about switching things off. It means
intelligently adjusting energy consumption based on the anticipated needs of the system.
Benefits:
• Better efforts to improve the efficiency of building appliances and materials
• Better efforts towards increasing the use of renewable energy sources
• New policies, incentives, and regulations to reduce energy consumption
• Automate building control in a way that improves building operation
Companies:
• Colas
• DeepMind
• Energisme
Better production
Saving energy is not enough; we also need to increase the efficiency of energy production
and distribution in order to achieve better results. Deep learning can help with this too.
Benefits:
• More power produced from the same amount of resources as before (say, an
improved mileage for oil consumption)
• Saving energy to reduce energy wastage
• Using better architectures to increase efficiency
• Better energy conservation plans
Companies:
• nVidia
• Maruti Suzuki

Healthcare
Deep Learning in Health Care
Deep learning models are now being used successfully in the medical domain to detect
rare diseases and to find ways to treat and prevent them.
Benefits:
• Automated and faster image diagnosis
• Detecting rare diseases
• Help in monitoring risk factors
• Personalised health monitoring
• Reduction in waiting time for reports
• Better administrative workflow
Companies:
• Carrefour
• Manipal Hospital
• Sensely

Deep Learning Framework


TensorFlow
• Developed by Google Brain
• Released under the Apache 2.0 open source license
• Numerical computation using data flow graphs
• Nodes represent operations; edges represent multidimensional data arrays called tensors
A high-level architecture of TensorFlow is given below.
TensorFlow is widely used across industry verticals such as telecommunications,
e-commerce, finance and banking.
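A tiny TensorFlow 2 sketch of the data-flow-graph idea: tf.function traces a Python
function into a graph whose nodes are operations and whose edges carry tensors (the
values here are illustrative):

```python
import tensorflow as tf

@tf.function                      # traced into a data flow graph
def affine(x, w, b):
    return tf.matmul(x, w) + b    # matmul and add become graph nodes

x = tf.constant([[1.0, 2.0]])     # 1x2 tensor
w = tf.constant([[3.0], [4.0]])   # 2x1 tensor
b = tf.constant([[0.5]])
print(affine(x, w, b))            # tf.Tensor([[11.5]], shape=(1, 1), ...)
```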

Torch and PyTorch


Torch is one of the market leaders in deep learning software. It is open source and uses
the Lua language, which is close to C.

PyTorch is the Python version of Torch. It was primarily developed by the Facebook AI
Research group.

PyTorch features:

• Hybrid Front-End

• Scalable Distributed Training

• Deep integration with Python

• Rich ecosystem of tools and libraries
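As a small illustration of PyTorch's deep integration with Python, a model can be
composed directly from Python objects (the layer sizes and input values here are made
up):

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(2, 8),    # two input features
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid(),
)
x = torch.tensor([[0.5, -1.2]])   # one sample with two features
print(model(x))                   # untrained output in (0, 1)
```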

DL4J
• Open Source Deep Learning library
• Java and Scala based
• Used across industries: Finance, E-Commerce, Banking, Supply Chain,
Manufacturing
DL4J Features:
• Excels at identifying patterns in unstructured data
• Can be used for image processing
• Works well with text analysis
• Compatible with distributed computing software
Keras
• User-friendly interface to TensorFlow, Microsoft Cognitive Toolkit and Theano
• Python based
• Simplifies the development of deep learning networks
• Gaining acceptance in the user community
• Developed by Google
Keras Features:
• Can be used for prediction
• Widely used for feature extraction
• Pre-trained weights can be downloaded
• Large number of pre-trained models available
• Fine-tuning can be done easily
Caffe
• Convolutional Architecture for Fast Feature Embedding
• C++ based with a Python interface
• Usable with the Apache Spark distributed computing environment
• Primarily used for image classification

Deep Learning Products


Google Cloud Platform

Google Cloud Platform, offered by Google, is a suite of cloud computing services that
runs on the same infrastructure that Google uses internally for its end-user products,
such as Google Search and YouTube.
It provides a series of modular cloud services including computing, data storage, data
analytics and machine learning.
GCP Features:
• Serverless, fully managed computing
• Secured platform
• Better data centers
• Powerful data analytics
• Infrastructure developed with the future in mind
GCP Products:
• Cloud computing – provides compute, storage, databases
• Analytics and machine learning – provides BigQuery, artificial intelligence APIs
• Google Maps platform – provides Maps, Routes, Places
• Browser, hardware and OS – provides Chrome, Android, Jamboard
• Professional services – provides consulting, training, certification
and many more.

H2O:

H2O is open-source software for big-data analysis, produced by the company H2O.ai.
H2O allows users to fit thousands of potential models as part of discovering patterns
in data.
It is used for exploring and analyzing datasets held in cloud computing systems and in
the Apache Hadoop Distributed File System, as well as on conventional operating
systems.
H2O Features:
• Algorithms are developed from the ground up for distributed computing and for
both supervised and unsupervised approaches
• You can access it from Python, R, Flow and many more
• AutoML can be used for automating the machine learning workflow
H2O Products:
• H2O
• Sparkling Water
• H2O4GPU
• Driverless AI

IBM Watson:
Watson is a question-answering computer system capable of answering questions
posed in natural language. IBM built it to apply advanced natural language processing,
information retrieval, knowledge representation, automated reasoning, and machine
learning technologies to the field of open-domain question answering.
Watson Features:
• Accelerate research and discovery
• Detect liabilities and mitigate risk
• Scale expertise and learning
• Learn more with less data
• Reimagine your workflow
Watson Products:
• AI Assistance
• Data
• Knowledge
• Vision
• Speech

Amazon Web Services:


Amazon Web Services is a subsidiary of Amazon.com that provides on-demand cloud
computing platforms to individuals, companies and governments, on a paid
subscription basis. The technology allows subscribers to have at their disposal a virtual
cluster of computers, available all the time, through the Internet.
AWS Features:
• Provides better support
• Provides best practice recommendations to improve performance and efficiency
• Includes AWS Trusted Advisor, which draws upon best practices learned from
operational history
• The AWS Support API allows you to programmatically interact with your support
cases
• Provides third-party software support
AWS Products:
• AWS Deeplens
• Amazon Polly
• Amazon SageMaker
• Alexa for Business
• AWS IoT Analytics
• AWS Cloud9
Baidu EZDL:
EZDL is a simple drag-and-drop platform that allows users to design and build custom
machine learning models, even without any programming background (and with no
time to learn it).
EZDL Features:
• It is easy to use
• It is fast
• Has high accuracy
• Highly secured
Apache Spark MLlib:
Spark MLlib is Apache Spark's Machine Learning component. One of the major
attractions of Spark is the ability to scale computation massively, and that is exactly
what you need for machine learning algorithms.
Spark MLlib Features:
• It is easy to use, as it supports Java, Scala, Python and R
• It has better performance and is faster than MapReduce
• It can run anywhere: on Hadoop, Apache Mesos, Kubernetes, standalone, or
in the cloud, against diverse data sources
