PP&DS 5

Deep Learning

• Deep learning is a subset of machine learning, which in turn is a subset of
artificial intelligence.

• Artificial Intelligence (AI) is the branch of computer science that emphasizes the
development of intelligent machines that think and work like humans, for
example in speech recognition, problem-solving, learning and planning.
• Machine learning is a branch of artificial intelligence (AI) and computer science which
focuses on the use of data and algorithms to imitate the way that humans learn,
gradually improving their accuracy.
• Deep learning is a type of machine learning inspired by the structure of the human
brain.
• In deep learning, nothing is programmed explicitly.
• Deep learning algorithms are used especially when we have a huge number of inputs and
outputs.

• Deep learning is implemented with the help of neural networks, and the motivation
behind neural networks comes from biological neurons, which are nothing but brain
cells.

• So basically, deep learning is implemented with the help of deep networks, which are
nothing but neural networks with multiple hidden layers.

Example of Deep Learning

In the example given above, we provide the raw image data to the first layer, the input
layer.

The network then determines patterns of local contrast, that is, it differentiates
regions on the basis of colours, luminosity, etc.

The 1st hidden layer then determines face features, i.e., it fixates on the eyes, nose,
lips, etc.

It then maps those face features onto the correct face template.

So, in the 2nd hidden layer, the network actually determines the correct face, as can be seen in
the image above, after which the result is sent to the output layer.

Deep learning applications


o Self-Driving Cars
A self-driving car captures images of its surroundings and processes a huge
amount of data, then decides which action to take: turn left, turn right or stop.
Acting on this information helps reduce the accidents that happen every year.
o Voice-Controlled Assistants
When we talk about voice-controlled assistants, Siri is the first thing that comes to
mind. You can tell Siri whatever you want it to do for you, and it will search for it
and display the result for you.
o Automatic Image Caption Generation
Whatever image you upload, the algorithm works in such a way that it generates a
caption accordingly. For example, for a blue-coloured eye, it will display the blue-
coloured eye with a caption at the bottom of the image.
o Automatic Machine Translation
With the help of deep learning, automatic machine translation converts text from one
language into another.

Limitations

o It only learns through observations.

Advantages

o It achieves best-in-class performance on many problems.

Disadvantages

o It is quite expensive to train.


o It does not have strong theoretical groundwork.

Multi-layer Perceptron /ANN


• The multi-layer perceptron is one of the fundamental architectures of artificial neural
networks.
• It is essentially formed from multiple layers of perceptrons.
• A series or set of algorithms that tries to recognize the underlying relationships in a data
set through a definite process that mimics the operation of the human brain is known
as a neural network.
The pictorial representation of multi-layer perceptron learning is as shown below-

MLP networks are used in a supervised learning setting.

A typical learning algorithm for MLP networks is the backpropagation algorithm.

A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a set
of outputs from a set of inputs.

An MLP uses back propagation for training the network. MLP is a deep learning method.
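
The following is a minimal sketch of an MLP in Keras, assuming flattened 28 x 28 inputs and 10 output classes; the layer sizes and the X_train/y_train arrays are illustrative placeholders, not part of the original material.

```python
# Minimal MLP sketch in Keras (illustrative only).
# Assumes X_train has shape (num_samples, 784) and y_train holds integer
# class labels in {0, ..., 9}; both names are hypothetical placeholders.
from tensorflow import keras
from tensorflow.keras import layers

mlp = keras.Sequential([
    layers.Input(shape=(784,)),             # e.g. flattened 28 x 28 images
    layers.Dense(128, activation="relu"),   # first hidden layer
    layers.Dense(64, activation="relu"),    # second hidden layer
    layers.Dense(10, activation="softmax"), # output layer: 10 classes
])

# Backpropagation and weight updates are handled by compile() and fit().
mlp.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
# mlp.fit(X_train, y_train, epochs=10, batch_size=32)
```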

Back propagation
• Back propagation is one of the important concepts of a neural network.
• Back propagation is a technique used to train certain classes of neural networks; it is
essentially a principle that allows the machine learning program to adjust itself
accordingly.
• Back propagation is sometimes called the "back propagation of errors."

Consider the following Back propagation Example for neural network

How Backpropagation Algorithm Works


1. Inputs X arrive through the preconnected path.
2. The input is modelled using real weights W. The weights are usually selected randomly.
3. Calculate the output for every neuron from the input layer, to the hidden layers, to the
output layer.
4. Calculate the error in the outputs

Error= Actual Output – Desired Output

5. Travel back from the output layer to the hidden layer to adjust the weights such that
the error is decreased.

Keep repeating the process until the desired output is achieved
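
As a rough numerical illustration of this loop, the sketch below trains a single linear neuron with plain NumPy; the data, learning rate and iteration count are made-up values, not from the original material.

```python
# Toy illustration of the backpropagation loop for a single linear neuron.
import numpy as np

X = np.array([[0.5], [1.0], [1.5]])   # step 1: inputs
y = np.array([[1.0], [2.0], [3.0]])   # desired outputs (here y = 2 * x)

w = np.random.randn()                 # step 2: weight initialised randomly
lr = 0.1                              # learning rate

for epoch in range(100):
    output = X * w                    # step 3: forward pass
    error = output - y                # step 4: error = actual - desired
    grad = (2 * X * error).mean()     # gradient of the mean squared error w.r.t. w
    w -= lr * grad                    # step 5: adjust the weight to reduce the error

print(round(w, 3))                    # w converges towards 2.0
```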

loss function
• The loss function is the function that computes the distance between the current output
of the algorithm and the expected output.
• It’s a method to evaluate how your algorithm models the data.
• Loss functions can be categorized into two groups: one for classification and the other
for regression.
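
As a small illustration, the sketch below computes one common loss from each group in NumPy; the numbers are made up for the example.

```python
# Illustrative loss computations (the example values are made up).
import numpy as np

# Regression: mean squared error between predictions and targets.
y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.5, 5.0, 3.0])
mse = np.mean((y_true - y_pred) ** 2)              # -> about 0.167

# Classification: cross-entropy between the true class and predicted probabilities.
p_true = np.array([0.0, 1.0, 0.0])                 # one-hot label (class 1)
p_pred = np.array([0.1, 0.8, 0.1])                 # model's predicted probabilities
cross_entropy = -np.sum(p_true * np.log(p_pred))   # -> about 0.223

print(mse, cross_entropy)
```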

Hyper parameter tuning
• Hyper parameter tuning works by running multiple trials in a single training job.
• A hyper parameter is a model argument whose value is set before the learning
process begins.
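
A minimal sketch of such tuning is shown below: each trial trains the same small Keras model with a different learning rate and keeps the value with the lowest validation loss. The tiny random dataset and the candidate values are stand-ins, not part of the original material.

```python
# Minimal hyper parameter tuning sketch: one trial per candidate learning rate.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(200, 20)                  # made-up training data
y = np.random.randint(0, 2, size=200)        # made-up binary labels

best_lr, best_loss = None, float("inf")
for lr in [0.1, 0.01, 0.001]:                # the hyper parameter values to try
    model = keras.Sequential([
        layers.Input(shape=(20,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy")
    history = model.fit(X, y, validation_split=0.2, epochs=5, verbose=0)
    val_loss = history.history["val_loss"][-1]
    if val_loss < best_loss:                 # keep the best trial
        best_lr, best_loss = lr, val_loss

print("best learning rate:", best_lr)
```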

Recurrent Neural Network (RNN)


• A recurrent neural network (RNN) is a kind of artificial neural network mainly used
in speech recognition and natural language processing (NLP).
• RNN is used in deep learning and in the development of models that imitate the activity
of neurons in the human brain.
• Recurrent networks are designed to recognize patterns in sequences of data, such
as text, handwriting, the spoken word, etc.
• A recurrent neural network looks similar to a traditional neural network except that a
memory-state is added to the neurons.
• The recurrent neural network is a type of deep learning-oriented algorithm, which
follows a sequential approach.
• Unlike a traditional neural network, where inputs and outputs are treated as independent
of each other, an RNN's output at each step depends on the computations of the
previous steps.
• These types of neural networks are called recurrent because they perform their
mathematical computations sequentially, one step of the sequence at a time.

Consider the following steps to train a recurrent neural network −


Step 1 − Input a specific example from the dataset.
Step 2 − The network takes the example and computes some calculations using randomly
initialized variables (weights).
Step 3 − A predicted result is then computed.
Step 4 − Comparing the actual result generated with the expected value produces
an error.
Step 5 − To trace the error, it is propagated back through the same path, and the variables
are adjusted along the way.
Step 6 − Steps 1 to 5 are repeated until we are confident that the variables
declared to produce the output are defined properly.
Step 7 − A systematic prediction is made by applying these variables to new, unseen
input.
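
As a rough illustration of these steps, the sketch below trains a small SimpleRNN in Keras on made-up sequence data; the shapes, sizes and task (predicting the sum of a sequence) are illustrative choices, not from the original material.

```python
# Minimal RNN training sketch in Keras (illustrative only).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# 100 sequences, each with 10 time steps and 1 feature per step (step 1).
X = np.random.rand(100, 10, 1)
y = X.sum(axis=1)                 # target: the sum of each sequence

model = keras.Sequential([
    layers.Input(shape=(10, 1)),
    layers.SimpleRNN(16),         # memory state carried across the 10 time steps
    layers.Dense(1),              # predicted result (step 3)
])

# Steps 4-6: the error between prediction and expected value is propagated
# back and the randomly initialised variables are adjusted, repeatedly.
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)

# Step 7: apply the trained variables to new, unseen input.
print(model.predict(np.random.rand(1, 10, 1), verbose=0))
```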
The schematic approach of representing recurrent neural networks is described below

Application of RNN

1. Machine Translation

We make use of recurrent neural networks in translation engines to translate text
from one language to another. They do this in combination with other models
such as LSTMs (Long Short-Term Memory networks).

2. Speech Recognition

Recurrent neural networks have replaced the traditional speech recognition models that made
use of Hidden Markov Models. These recurrent neural networks, along with LSTMs, are better
at classifying speech and converting it into text without loss of context.
3. Automatic Image Tagger

RNNs, in conjunction with convolutional neural networks, can detect objects in images and provide
their descriptions in the form of tags.

For example, a picture of a fox jumping over a fence can be described appropriately using
RNNs.

Convolutional Neural Network


• A Convolutional Neural Network (CNN) is one of the main techniques for image classification and
image recognition with neural networks.
• It is widely used for image recognition.
• A CNN represents the input data in the form of multidimensional arrays.
• It works well with a large amount of labelled data.
• A CNN extracts each portion of the input image; each portion is known as a receptive field.
• It assigns weights to each neuron based on the significance of its receptive field,
so that it can discriminate the importance of neurons from one another.
• The architecture of a CNN consists of three types of layer:
(1) convolution, (2) pooling, and (3) fully connected.

• A CNN has an input and an output layer, as well as multiple hidden layers.
• The hidden layers of a CNN typically consist of a series of convolutional layers.
• ReLU is the typical activation function, which is normally followed by additional
operations such as pooling layers, fully connected layers and normalization layers.
• Backpropagation is used for error distribution and weight adjustment.

For example,

Consider the image of nature below; at first glance, we will see a lot of buildings and
colours.

How Does a Computer Read an Image?

The image is broken into three colour channels: Red, Green, and Blue.
Each of these colour channels is mapped to the image's pixels.

Edge Detection

• Every image has vertical and horizontal edges which combine to form the
image.
• The convolution operation is used with some filters to detect edges.

Example-

• For this example, we are using a 3*3 Prewitt filter as shown in the image above.
• As shown below, we apply the filter to perform edge detection on the given
6*6 image.
• The purple square of the output image will contain
((a11*1) + (a12*0) + (a13*(-1)) + (a21*1) + (a22*0) + (a23*(-1)) + (a31*1) + (a32*0)
+ (a33*(-1))).
• We repeat the convolution horizontally and then vertically to obtain the output
image.

• We would continue the above procedure to get the processed image after
edge-detection.
• But, in the real world, we deal with very high-resolution images for Artificial
Intelligence applications.
• Hence we opt for an algorithm to perform the convolutions, and even use
Deep Learning to decide on the best values of the filter.
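
As a small illustration of this filtering step, the sketch below applies the same vertical Prewitt filter to a made-up 6*6 grayscale image with SciPy; correlate2d computes exactly the sliding weighted sum written out above.

```python
# Minimal edge-detection sketch (the 6x6 image values are made up).
import numpy as np
from scipy.signal import correlate2d

# Vertical-edge Prewitt filter, as in the worked example above.
prewitt = np.array([[1, 0, -1],
                    [1, 0, -1],
                    [1, 0, -1]])

# Toy 6x6 "image": a bright region on the left, a dark region on the right.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6)

# Slide the 3x3 filter over the image with stride 1, producing a 4x4 output.
edges = correlate2d(image, prewitt, mode="valid")
print(edges)   # large values mark the vertical boundary between bright and dark
```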

Layers in CNN
A CNN is made up of the following layers
• Input layer
• Convo layer (Convo + ReLU)
• Pooling layer
• Fully connected(FC) layer
• Softmax/logistic layer
• Output layer

Different layers of CNN

Input Layer

• The input layer in a CNN contains the image data.

• Image data is represented by a three-dimensional matrix, as we saw earlier.

• You need to reshape it into a single column.

• Suppose you have an image of dimension 28 x 28 = 784; you need to convert it into 784 x 1
before feeding it into the input.

• If you have "m" training examples, then the dimension of the input will be (784, m), as
sketched below.
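
A minimal NumPy sketch of this reshaping, using made-up data:

```python
# Flatten m images of 28 x 28 pixels into an input matrix of shape (784, m).
import numpy as np

m = 5                                   # number of training examples (toy value)
images = np.random.rand(m, 28, 28)      # m grayscale images, 28 x 28 each

X = images.reshape(m, 28 * 28).T        # flatten each image, then transpose
print(X.shape)                          # -> (784, 5), i.e. (784, m)
```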
Convo Layer

• The convo layer is sometimes called the feature extractor layer because the features of the image
are extracted within this layer.

• First of all, a part of the image is connected to the convo layer to perform the convolution
operation.

• The result of the operation is a single value of the output volume.

• Then we slide the filter over the next receptive field of the same input image by a stride
and do the same operation again.
• We repeat the same process again and again until we go through the whole image.
The output will be the input for the next layer.

• The convo layer also contains a ReLU activation to set all negative values to zero.

Pooling Layer

• The pooling layer is used to reduce the spatial volume of the input image after convolution.

• It is used between two convolution layers.

• If we apply FC after the convo layer without applying pooling or max pooling, then it will be
computationally expensive, and we don't want that.

• So, max pooling is applied to reduce the spatial volume of the input image.

• In the above example, we have applied max pooling on a single depth slice with a stride of
2.

• You can observe that the 4 x 4 dimension input is reduced to a 2 x 2 dimension.

• There are no learnable parameters in the pooling layer, but it has two hyper
parameters: Filter (F) and Stride (S). A small sketch follows below.
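
A minimal NumPy sketch of 2 x 2 max pooling with a stride of 2 on a single 4 x 4 depth slice; the input values are made up.

```python
# Max pooling over non-overlapping 2x2 blocks of a 4x4 depth slice.
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 6, 8]])

F, S = 2, 2     # hyper parameters: filter size and stride (non-overlapping, so F == S)
pooled = x.reshape(4 // S, F, 4 // S, F).max(axis=(1, 3))
print(pooled)   # [[6 4]
                #  [7 9]]
```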

Fully Connected Layer(FC)

• Fully connected layer involves weights, biases, and neurons.

• It connects neurons in one layer to neurons in another layer.

• It is used to classify images into different categories through training.


Softmax / Logistic Layer

• Softmax or Logistic layer is the last layer of CNN.

• It resides at the end of FC layer.

• Logistic is used for binary classification and softmax for multi-class classification.

Output Layer

• The output layer contains the label in one-hot encoded form.

• Now you have a good understanding of CNN.

• Let’s implement a CNN in Keras.

Keras Implementation

We will use CIFAR-10 dataset to build a CNN image classifier. CIFAR-10 dataset has 10
different labels

• Airplane

• Automobile

• Bird

• Cat

• Deer

• Dog

• Frog

• Horse

• Ship

• Truck

It has 50,000 training images and 10,000 test images. The image size in CIFAR-10 is 32 x 32
x 3. It comes with the Keras library.
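
Below is a minimal sketch of such a CIFAR-10 classifier, following the layer order described above (convo + ReLU, pooling, fully connected, softmax); the exact number of filters and units are illustrative choices, not a prescribed architecture.

```python
# Minimal CIFAR-10 CNN classifier sketch in Keras.
from tensorflow import keras
from tensorflow.keras import layers

# CIFAR-10 ships with Keras: 50,000 training and 10,000 test images of 32 x 32 x 3.
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0    # scale pixel values to [0, 1]

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),                 # input layer: 32 x 32 RGB images
    layers.Conv2D(32, (3, 3), activation="relu"),    # convo layer (Convo + ReLU)
    layers.MaxPooling2D((2, 2)),                     # pooling layer
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),             # fully connected (FC) layer
    layers.Dense(10, activation="softmax"),          # softmax layer: 10 classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
```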
Long short-term memory (LSTM)
• Long short-term memory (LSTM) is an artificial recurrent neural network (RNN)
architecture used in the field of deep learning.
• RNNs process inputs in a sequential manner, where the context from the previous
input is considered when computing the output of the current step.
• This allows the neural network to carry information over different time steps rather
than keeping all the inputs independent of each other.
• Plain RNNs are unable to work with longer sequences and hold on to long-term
dependencies, making them suffer from "short-term memory".

• As we can see from the image, the difference lies mainly in the LSTM’s ability to
preserve long-term memory.
• This is especially important in the majority of Natural Language Processing (NLP) or
time-series and sequential tasks.

For example,

• Let’s say we have a network generating text based on some input given to us.
• At the start of the text, it is mentioned that the author has a “dog named Cliff”.
• After a few other sentences where there is no mention of a pet or dog, the author
brings up his pet again, and the model has to generate the next word in "However,
Cliff, my pet ____".
• As the word pet appeared right before the blank, an RNN can deduce that the next
word will likely be an animal that can be kept as a pet.

RNNs are unable to remember information from much earlier

• However, due to the short-term memory, the typical RNN will only be able to use the
contextual information from the text that appeared in the last few sentences - which is
not useful at all.
• The RNN has no clue as to what animal the pet might be as the relevant information
from the start of the text has already been lost.
• On the other hand, the LSTM can retain the earlier information that the author has a
pet dog, and this will aid the model in choosing "the dog" when it comes to generating
the text at that point due to the contextual information from a much earlier time step.
• Long Short- Term Memory (LSTM) networks are a modified version of recurrent neural
networks, which makes it easier to remember past data in memory.
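
As a small sketch, the model below simply swaps an LSTM layer in where a SimpleRNN would otherwise go, for a next-word prediction task; the vocabulary size, sequence length and layer sizes are made-up values.

```python
# Minimal LSTM sketch in Keras for next-word prediction (illustrative only).
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, seq_len = 5000, 40                       # made-up values

model = keras.Sequential([
    layers.Input(shape=(seq_len,)),                  # a sequence of word indices
    layers.Embedding(vocab_size, 64),                # word indices -> dense vectors
    layers.LSTM(128),                                # gates let the cell keep long-term context
    layers.Dense(vocab_size, activation="softmax"),  # probability of the next word
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```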

Data Science Applications


Here are some of the applications for data science

▪ Fraud and Risk Detection


▪ Healthcare
▪ Internet Search
▪ Targeted Advertising
▪ Website Recommendations
▪ Advanced Image Recognition
▪ Speech Recognition
▪ Airline Route Planning
▪ Gaming

Fraud and Risk Detection

• The earliest applications of data science were in Finance.


• Companies were fed up with bad debts and losses every year.
• However, they had a lot of data which used to be collected during the initial paperwork
while sanctioning loans.
• They decided to bring in data scientists in order to rescue them from these losses.
• Over the years, banking companies learned to divide and conquer data via customer
profiling, past expenditures, and other essential variables to analyze the probabilities of
risk and default.
• Moreover, it also helped them to push their banking products based on customer’s
purchasing power.

Healthcare

• The healthcare sector, especially, receives great benefits from data science
applications.

Medical Image Analysis

• Procedures such as detecting tumors, artery stenosis and organ delineation employ various
methods and frameworks, like MapReduce, to find optimal parameters for tasks
like lung texture classification.
• It applies machine learning methods, support vector machines (SVM), content-based
medical image indexing, and wavelet analysis for solid texture classification.

Internet Search

• Now, this is probably the first thing that strikes your mind when you think of data science
applications.
• When we speak of search, we think ‘Google’.
• But there are many other search engines like Yahoo, Bing, Ask, AOL, and so on.
• All these search engines (including Google) make use of data science algorithms to
deliver the best result for our searched query in a fraction of seconds.
• Consider the fact that Google processes more than 20 petabytes of data every day.

Targeted Advertising
• The entire digital marketing spectrum nowadays depends on data science
algorithms.
• From the display banners on various websites to the digital billboards at
airports, almost all of them are decided by using data science algorithms.
• This is the reason why digital ads have been able to get a much higher CTR (Click-Through
Rate) than traditional advertisements.
• They can be targeted based on a user's past behavior.
• This is the reason why you might see ads for Data Science Training Programs while I see
an ad for apparel in the same place at the same time.

Website Recommendations

• The suggestions about similar products on Amazon?


• They not only help you find relevant products from billions of products available with
them but also add a lot to the user experience.
• A lot of companies have used this to promote their products in accordance with users'
interests and the relevance of information.
• Internet giants like Amazon, Twitter, Google Play, Netflix, LinkedIn, IMDb and many more
use this system to improve the user experience.
• The recommendations are made based on previous search results for a user.

Advanced Image Recognition

• You upload a photo with friends on Facebook and you start getting suggestions to
tag your friends.
• This automatic tag suggestion feature uses a face recognition algorithm.
• In their latest update, Facebook has outlined the additional progress they’ve made in
this area, making specific note of their advances in image recognition accuracy and
capacity.
• In addition, Google provides you with the option to search for images by uploading
them.
• It uses image recognition and provides related search results.
Speech Recognition

• Some of the best examples of speech recognition products are Google Voice, Siri,
Cortana etc.
• Using the speech recognition feature, even if you aren't in a position to type a message,
your life doesn't stop.
• Simply speak out the message and it will be converted to text.
• However, at times, you would realize, speech recognition doesn’t perform accurately.

Airline Route Planning

• The airline industry across the world is known to bear heavy losses.


• Except for a few airline service providers, companies are struggling to maintain their
occupancy ratios and operating profits.
• The steep rise in air-fuel prices and the need to offer heavy discounts to customers have
made the situation worse.
• It wasn't long before airline companies started using data science to identify the
strategic areas of improvement.

Now using data science, the airline companies can:

1. Predict flight delay


2. Decide which class of airplanes to buy
3. Whether to directly land at the destination or take a halt in between (For example, A
flight can have a direct route from New Delhi to New York. Alternatively, it can also
choose to halt in any country.)
4. Effectively drive customer loyalty programs

• Southwest Airlines and Alaska Airlines are among the top companies that have adopted data
science in this way.

Gaming

• Games are now designed using machine learning algorithms which improve/upgrade
themselves as the player moves up to a higher level.
• In motion gaming too, your opponent (the computer) analyzes your previous moves and
shapes its game accordingly.
• EA Sports, PUBG, Sony, Nintendo and Activision Blizzard have taken the gaming experience to the
next level using data science.

Image Classification
Image classification is the process of taking an input (like a picture) and outputting a class (like
“cow”) or a probability that the input is a particular class (“there’s a 90% probability that this input
is a cow”).

We can look at a picture and know that we’re looking at a cow, but how can a computer learn
to do that?

The answer is with a convolutional neural network!

• Computers don’t see images as humans do.


• They see a matrix of pixels, each of which has three components: red, green and blue

(ever heard of RGB?)


▪ Therefore, a 1,000-pixel image for us will have 3,000 values for a computer.
▪ Each of those 3,000 values is an intensity assigned to one colour component of a pixel.
▪ The result is a matrix of 3,000 precise pixel intensities, which the computer must
somehow interpret as one or more objects.
▪ For a black and white image, the pixels are interpreted as a 2D array (for example, 2x2
pixels).
▪ Every pixel has a value between 0 and 255 (zero is completely black, 255 is
completely white, and the greyscale exists between those numbers).
▪ Based on that information, the computer can begin to work on the data.
▪ For a colour image (a combination of red, green and blue), this is a 3D array.
▪ Each colour has a value between 0 and 255. The overall colour of a pixel is found by combining the
values in each of the three layers.
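
A minimal sketch of how a computer "sees" an image as arrays of numbers, using a made-up 2 x 2 picture:

```python
# Grayscale vs. colour image representations (toy 2x2 examples).
import numpy as np

# Grayscale: a 2D array of intensities between 0 (black) and 255 (white).
gray = np.array([[0, 255],
                 [128, 64]], dtype=np.uint8)

# Colour: a 3D array with a red, green and blue value for every pixel.
colour = np.zeros((2, 2, 3), dtype=np.uint8)
colour[0, 0] = [255, 0, 0]       # the top-left pixel is pure red
colour[1, 1] = [255, 255, 255]   # the bottom-right pixel is white

print(gray.shape, colour.shape)  # (2, 2) and (2, 2, 3)
```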

Setting up the Structure of our Image Data

▪ In data science, the image classifiers we build have to be trained to recognize
objects/patterns; this shows our classifier what exactly (or approximately) to
recognize.
▪ Imagine I show you the following image:
There is a boathouse, a little boat, hills, a river and other objects.

You’ll notice them, but you won’t know exactly what object I’m referring to.

Now, if I show you another photo:

And another:
▪ You’ll recognize that I want you to recognize a “Mountain”.
▪ By showing you the first two images, I essentially trained your brain to recognize a
mountain in the third image.
▪ The process is (almost) the same for computers — our data needs to be in a specific
format.

▪ Our model will be trained on the images present in a "training" set, and the label
predictions will happen on the testing set images.

Recommender systems
▪ Recommender systems are the systems that are designed to recommend things to the
user based on many different factors.
▪ These systems predict the products that users are most likely to purchase and be
interested in.
▪ Companies like Netflix, Amazon, etc. use recommender systems to help their users
identify the correct products or movies for them.
▪ A recommender system deals with a large volume of information by filtering out
the most important pieces based on the data provided by a user and other
factors that capture the user's preferences and interests.
▪ It finds the match between user and item and computes the similarities between
users and items for the recommendation.
▪ Both the users and the service providers have benefited from these kinds of systems.
The quality of decision-making has also improved through these kinds of
systems.

What can be Recommended?

▪ There are many different things that can be recommended by the system like movies,
books, news, articles, jobs, advertisements, etc.
▪ Netflix uses a recommender system to recommend movies & web-series to its users.
Similarly, YouTube recommends different videos.
There are many examples of recommender systems that are widely used today.

How is User and Item matching done?


In order to understand how an item is recommended and how the matching is done, let us
look at the images below:

Figure: user-item matching for social websites

Figure: a perfect match may not be recommended

Figure: real-life user interaction with a recommendation system


▪ The above pictures show that there won't always be a perfect recommendation
made to a user.
▪ In the above image, a user has searched for a laptop with a 1TB HDD, 8GB RAM, and an i5
processor for 40,000.
▪ The system has recommended the 3 most similar laptops to the user.

Types of Recommendation System

1. Popularity-Based Recommendation System


It is a type of recommendation system which works on the principle of popularity, or
anything which is currently trending.
▪ These systems check which products or movies are trending or are most
popular among the users and directly recommend those.
For example, if a product is often purchased by most people, then the system gets to
know that the product is popular, so for every new user who has just signed up, the system
will recommend that product to that user as well, and the chances become high that the new user
will also purchase it.
Example
▪ Google News: News filtered by trending and most popular news.
▪ YouTube: Trending videos.
2. Classification Model
• This model uses features of both products and users to predict whether a
user will like a product or not.

Classification model

The output can be either 0 or 1: 1 if the user likes the product, and 0 otherwise.
3. Content-Based Recommendation System
▪ It is another type of recommendation system, which works on the principle of similar
content.
▪ If a user is watching a movie, then the system will look for other movies of similar
content or of the same genre as the movie the user is watching.
▪ There are various fundamental attributes that are used to compute the similarity
between items when checking for similar content.
▪ To explain more about how exactly the system works, an example is stated below:

Figure 1: Different models of OnePlus phones.

The figure shows the different models of OnePlus phones.


If a person is looking at a OnePlus 7 mobile, then the OnePlus 7T and OnePlus 7 Pro are
recommended to the user.
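
As a rough sketch of how such content-based matching can be computed, the snippet below scores a few made-up items by cosine similarity against the item the user is viewing; the feature vectors and phone names are invented for illustration.

```python
# Content-based matching sketch: recommend the item most similar to the one being viewed.
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Feature vectors: [RAM in GB, storage in GB, price in thousands] (made-up specs).
items = {
    "Phone A": np.array([8, 128, 40]),
    "Phone B": np.array([8, 256, 45]),
    "Phone C": np.array([4, 64, 20]),
}
viewing = np.array([8, 128, 38])     # the item the user is currently looking at

scores = {name: cosine_similarity(viewing, vec) for name, vec in items.items()}
print(max(scores, key=scores.get))   # the most similar item is recommended first
```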

Social Network Graphs

A network's basic components are nodes and edges.


▪ Nodes (A, B, C, D, E in the example) usually represent entities in the network, and
can hold self-properties (such as weight, size, position or any other attribute) and
network-based properties (such as Degree, the number of neighbours, or Cluster, the
connected component the node belongs to, etc.).

▪ Edges represent the connections between the nodes, and might hold properties as well
(such as a weight representing the strength of the connection, a direction in the case of an
asymmetric relation, or a time if applicable).

▪ These two basic elements can describe multiple phenomena, such as social
connections, virtual routing networks, physical electricity networks, road networks,
biological relationship networks and many other relationships.

Real-world networks

Real-world networks, and in particular social networks, have a unique structure which often
distinguishes them from random mathematical networks.
A social network graph is a graph where the nodes represent people and the lines between
nodes, called edges, represent social connections between them, such as friendship or working
together on a project.

These graphs can be either undirected or directed.

For instance, Facebook can be described with an undirected graph since the friendship is
bidirectional, Alice and Bob being friends is the same as Bob and Alice being friends. On the
other hand, Twitter can be described with a directed graph: Alice can follow Bob without Bob
following Alice.

Social networks are important to social scientists interested in how people interact as well as
companies trying to target consumers for advertising.
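
A minimal sketch of these two kinds of social graphs using the networkx library; the people and edges are invented for illustration.

```python
# Undirected vs. directed social graphs with networkx.
import networkx as nx

# Undirected graph: friendship is mutual (Facebook-style).
friends = nx.Graph()
friends.add_edge("Alice", "Bob")
friends.add_edge("Bob", "Carol")
print(friends.degree("Bob"))               # node property: Bob has 2 neighbours

# Directed graph: following is one-way (Twitter-style).
follows = nx.DiGraph()
follows.add_edge("Alice", "Bob")           # Alice follows Bob, not vice versa
print(list(follows.successors("Alice")))   # ['Bob']
```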
