The document discusses the applications of deep learning in computer vision, highlighting key areas such as image classification, object detection, and image segmentation. It details various techniques for image segmentation, including thresholding, region growing, and deep learning-based methods, and outlines their real-world applications in fields like autonomous vehicles and medical imaging. Additionally, it covers advanced topics like automatic image captioning, generative adversarial networks, and attention models, emphasizing their significance in enhancing computer vision tasks.
DL U-III Computer Vision
UNIT-III
APPLICATIONS OF DEEP LEARNING
TO COMPUTER VISION

Computer Vision
• Computer vision is a branch of AI that enables computers to interpret and analyze the visual world, simulating the way humans see and understand their environment.
• Deep learning has been applied to the following computer vision problems:
1. Image Classification
2. Image Classification with Localization
3. Object Detection
4. Object Segmentation
5. Image Style Transfer
6. Image Colorization
7. Image Reconstruction
8. Image Super-Resolution
9. Image Synthesis
10. Other Problems

Image Segmentation
• Segmentation is one of the most important operations in computer vision.
• Image segmentation is the process of dividing an image into multiple parts or regions whose pixels belong to the same class. This grouping is based on specific criteria, for example color or texture.
• The process is also called pixel-level classification: it partitions images (or video frames) into multiple segments or objects.

The Deep Learning Approach to Image Segmentation
• In the last 40 years, various segmentation methods have been proposed, ranging from MATLAB image segmentation and traditional computer vision methods to state-of-the-art deep learning methods. Especially with the emergence of Deep Neural Networks (DNNs), image segmentation applications have made tremendous progress.
• Deep learning is well suited to image segmentation: deep learning algorithms automatically extract features from the data, which can then be used to segment it, and deep models can learn complex characteristics that are difficult to specify manually.
• Convolutional neural networks (CNNs), fully convolutional networks (FCNs), and recurrent neural networks (RNNs) are among the deep learning architectures that can be used for image segmentation. Each architecture has its own set of benefits and drawbacks.

[Figure: semantic image segmentation for driving scenes – Source: sample from the Mapillary Vistas Dataset]

Image Segmentation Techniques
• There are various image segmentation techniques available, and each technique has its own advantages and disadvantages.
• Thresholding: one of the simplest image segmentation techniques. A threshold value is set, and all pixels with intensity values above or below the threshold are assigned to separate regions.
• Region growing: the image is divided into several regions based on similarity criteria. This technique starts from a seed point and grows the region by adding neighboring pixels with similar characteristics.
• Edge-based segmentation: based on detecting edges in the image. These edges represent boundaries between different regions and are detected using edge detection algorithms.
• Clustering: clustering techniques group pixels into clusters based on similarity criteria. These criteria can be color, intensity, texture, or any other feature.
• Watershed segmentation: based on the idea of flooding an image from its minima. The image is treated as a topographic relief, where the intensity values represent the height of the terrain.
• Active contours: also known as snakes, these are curves that deform to find the boundary of an object in an image. The curves are controlled by an energy function that minimizes the distance between the curve and the object boundary.
• Deep learning-based segmentation: deep learning techniques, such as Convolutional Neural Networks (CNNs), have revolutionized image segmentation by providing highly accurate and efficient solutions.
• Graph-based segmentation: this technique represents an image as a graph and partitions the image based on graph theory principles.
• Superpixel-based segmentation: this technique groups sets of similar image pixels together to form larger, more meaningful regions, called superpixels.

Applications of Image Segmentation
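Before turning to application domains, the first two techniques above (thresholding and region growing) are simple enough to sketch directly. A toy Python sketch; the image values, seed point, and tolerance are made up for illustration:

```python
from collections import deque

def threshold_segment(image, t):
    """Thresholding: pixels at or above t form one region (label 1),
    the rest form another (label 0)."""
    return [[1 if px >= t else 0 for px in row] for row in image]

def region_grow(image, seed, tol):
    """Region growing: start at a seed pixel and absorb 4-connected
    neighbours whose intensity is within `tol` of the seed's intensity."""
    h, w = len(image), len(image[0])
    sr, sc = seed
    base = image[sr][sc]
    region = {(sr, sc)}
    queue = deque([(sr, sc)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in region
                    and abs(image[nr][nc] - base) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# A tiny 3x4 grayscale "image" with a dark region (left) and bright region.
image = [
    [10, 12, 90, 95],
    [11, 13, 92, 94],
    [10, 11, 12, 93],
]
print(threshold_segment(image, 50))
print(sorted(region_grow(image, (0, 0), tol=5)))
```

Both functions return a pixel-level result, which is exactly what "pixel-level classification" means: every pixel gets a region assignment.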
• Image segmentation problems play a central role in a broad range of real-world computer vision applications, including road sign detection, biology, the evaluation of construction materials, and video security and surveillance.
• Autonomous vehicles and Advanced Driver Assistance Systems (ADAS) need to detect navigable surfaces and perform pedestrian detection.
• Image segmentation is also widely applied in medical imaging, for tasks such as tumor boundary extraction or measurement of tissue volumes. An opportunity here is to design standardized image databases that can be used to evaluate fast-spreading new diseases and pandemics (for example, AI vision applications for coronavirus control).
• Deep learning-based image segmentation has been successfully applied to satellite images in remote sensing, including techniques for urban planning and precision agriculture. Images collected by drones (UAVs) have also been segmented using deep learning-based techniques, offering an opportunity to address important environmental problems related to climate change.

Object Detection
• Object detection in computer vision refers to the process of locating and classifying objects within images or video frames. It involves identifying and delineating the boundaries of objects in a given scene and associating them with specific object classes or labels. Object detection goes beyond simple image classification by providing information about the spatial location of each detected object.
• Key components of object detection:
1. Localization: determining the precise location (bounding box) of each object in the image or frame.
2. Classification: assigning a label or category to each detected object, indicating the type or class of the object.
• Object detection is widely used in various applications, such as autonomous vehicles, surveillance, medical imaging, robotics, and more.
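Localization quality is commonly measured with Intersection over Union (IoU), the overlap between a predicted bounding box and a ground-truth box. A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format; the example boxes are made up:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # partial overlap
print(iou((0, 0, 10, 10), (0, 0, 10, 10)))   # identical boxes
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold (0.5 is a common choice).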
Object Detection Applications
1. Autonomous Vehicles
2. Surveillance and Security
3. Medical Imaging
4. Retail (Inventory Management, Checkout)
5. Industrial Automation (Quality Control)
6. Augmented Reality
7. Robotics
8. Sports Analytics
9. Environmental Monitoring
10. Retail Analytics
11. Augmented Traffic Management
12. Human-Computer Interaction

Automatic Image Captioning
• Image caption generation (photo description) is one of the applications of deep learning: an image is passed to a model, which processes it and generates a caption or description according to its training. The predictions are sometimes inaccurate and can produce meaningless sentences; good results require very high computational power and a very large dataset.
• Automatic image captioning is a critical research problem with numerous complexities, attracting a significant amount of work with extensive applications across various domains such as human-computer interaction, medical image captioning and prescription, traffic data analysis, quality control in industry, and especially assistive technologies for visually impaired individuals.
• Given an input image I, the goal is to generate a caption C describing the visual contents of the image, with C being a set of sentences C = {c1, c2, ..., cn}, where each ci is a sentence of the generated caption C.

Image Generation with Generative Adversarial Networks
• A generative adversarial network (GAN) is a class of machine learning frameworks in which, given a training set, the technique learns to generate new data with the same statistics as the training set. Its architecture uses two neural networks that compete to produce new, synthetic instances of data that closely resemble real data.
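The "two competing networks" idea can be made concrete with the standard GAN losses. A minimal numeric sketch, assuming a hypothetical 1-D setup — a generator that merely shifts its noise input by an offset, and a logistic discriminator with made-up weights; this is an illustration of the objective, not a trained model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical 1-D stand-ins (made up for illustration):
def generator(z, offset):
    """Maps a noise value z to a 'fake' sample."""
    return z + offset

def discriminator(x, w, b):
    """Logistic score: probability that sample x is real."""
    return sigmoid(w * x + b)

def d_loss(real, fake, w, b):
    # The discriminator wants D(real) -> 1 and D(fake) -> 0.
    return -(math.log(discriminator(real, w, b))
             + math.log(1.0 - discriminator(fake, w, b)))

def g_loss(fake, w, b):
    # The generator wants the discriminator to call its sample real.
    return -math.log(discriminator(fake, w, b))

real_sample = 2.0                          # a "real" data point
fake_sample = generator(0.5, offset=0.2)   # a generated sample
w, b = 1.0, -1.0                           # made-up discriminator weights
print(d_loss(real_sample, fake_sample, w, b))
print(g_loss(fake_sample, w, b))
```

In training, these two losses are minimized alternately by gradient descent; as the game progresses, the generator's samples are pushed toward the statistics of the real data.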
• GANs are usually trained to generate images from random noise. A GAN has two parts: the Generator, which generates new image samples, and the Discriminator, which classifies images as real or fake.
• Generator: a model used to generate new plausible data examples for the problem domain.
• Discriminator: a model that classifies the given examples as real (from the domain) or fake (generated).

Applications
1. Image-to-Image Translation: generating images that transform from one domain to another, such as turning satellite images into maps or black-and-white photos into color.
2. Style Transfer: creating images in the style of a particular artist, or applying the visual style of one image to another.
3. Face Aging and De-aging: simulating the aging or de-aging of faces in photographs.
4. Super-Resolution: enhancing the resolution and quality of images, making them sharper and more detailed.
5. Data Augmentation: generating additional training data.
6. Virtual Try-On: allowing users to virtually try on clothes, accessories, or other items before making a purchase.
7. Deepfake Generation: creating realistic-looking fake videos or images by replacing faces in existing content.
8. Image Inpainting: filling in missing or damaged parts of an image with realistic content.
9. Drug Discovery and Molecular Design: generating molecular structures for new drug candidates or designing novel molecules.
10. Image Synthesis for Anomaly Detection: generating normal images to train models for detecting anomalies or outliers in datasets.

Video to Text with LSTM Models
• LSTM stands for Long Short-Term Memory. An LSTM is a type of recurrent neural network, but with better memory than a traditional recurrent neural network.
• Because they are good at memorizing certain patterns, LSTMs perform considerably better on sequence tasks.
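The memory behaviour comes from the LSTM's gated cell, which at each step decides what to keep and what to discard. A minimal NumPy sketch of a single LSTM step, used here to encode a short sequence of stand-in frame features; all dimensions, weights, and the random "CNN features" are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. The four gates are slices of one stacked pre-activation.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = 1.0 / (1.0 + np.exp(-z[0:H]))       # input gate: what to write
    f = 1.0 / (1.0 + np.exp(-z[H:2*H]))     # forget gate: what to discard
    o = 1.0 / (1.0 + np.exp(-z[2*H:3*H]))   # output gate: what to expose
    g = np.tanh(z[3*H:4*H])                 # candidate cell content
    c_new = f * c + i * g                   # keep relevant, drop irrelevant
    h_new = o * np.tanh(c_new)
    return h_new, c_new

D, H, T = 8, 4, 5                           # feature dim, hidden dim, #frames
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)

# Encode T frame features one by one (stand-ins for per-frame CNN outputs).
h = np.zeros(H)
c = np.zeros(H)
for t in range(T):
    frame_feat = rng.normal(size=D)         # hypothetical CNN feature vector
    h, c = lstm_step(frame_feat, h, c, W, U, b)

print(h)   # final hidden state: a fixed-size summary of the clip
```

In a video-to-text model, a decoder LSTM would then emit the caption word by word conditioned on this encoding.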
• As with every other neural network, an LSTM can have multiple hidden layers; as the data passes through every layer, each cell keeps the relevant information and discards the irrelevant information.
• The LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames with a sequence of words in order to generate a description of the event in the video clip.
• A stacked LSTM first encodes the frames one by one, taking as input the output of a Convolutional Neural Network (CNN) applied to each input frame's intensity values.
• Once all frames are read, the model generates a sentence word by word.
• The encoding and decoding of the frame and word representations are learned jointly from a parallel corpus.
• To model the temporal aspects of activities typically shown in videos, the optical flow between pairs of consecutive frames is also computed. The flow images are likewise passed through a CNN and provided as input to the LSTM.

Applications
1. Automatic Video Captioning
2. Video Summarization
3. Content Indexing and Retrieval
4. Surveillance and Security
5. Educational Videos
6. Media Production
7. Human-Computer Interaction
8. Video Search Engines
9. Assistive Technologies
10. Event Recognition

Attention Models for Computer Vision Tasks
• Attention mechanisms enhance deep learning models by selectively focusing on important input elements, improving prediction accuracy and computational efficiency. They prioritize and emphasize relevant information, acting as a spotlight that enhances overall model performance.
• In psychology, attention is the cognitive process of selectively concentrating on one or a few things while ignoring others.
• The attention mechanism emerged as an improvement over the encoder-decoder-based neural machine translation system in natural language processing (NLP). Later, this mechanism, or its variants, was used in other applications, including computer vision and speech processing.

What Is an Attention Model?
• An attention model, also known as an attention mechanism, is an input processing technique for neural networks. This mechanism helps neural networks solve complicated tasks by dividing them into smaller areas of attention and processing them sequentially.
• Just as the human brain solves a complex task by dividing it into simpler tasks and focusing on them one by one, the attention mechanism makes it possible for neural networks to handle intuitive and challenging tasks like translation and subtitle generation.
• The neural network focuses on specific aspects of a complex input until it has categorized the entire dataset.

Types of Attention Model
There are several types of attention mechanisms, each with its own characteristics and applications:
• Global (Soft) Attention: the model considers all parts of the input data when computing the attention weights, leading to a fully differentiable mechanism.
• Local (Hard) Attention: the model focuses on a subset of the input data, often determined by a learned alignment model. This approach is less computationally expensive but introduces non-differentiable operations.
• Self-Attention: also known as intra-attention, this mechanism allows different positions of a single sequence to attend to each other. It is a key component of transformer models.
• Multi-Head Attention: extends self-attention by allowing the model to focus on different parts of the input data from different representation subspaces, providing a richer understanding of the data.

Example of the Self-Attention Mechanism
• In the accompanying figure, the red words are the words being read or processed at the current instant, and the blue words are the memories; the different shades represent the degree of memory activation.
• As the sentence is read or processed word by word, previously seen words are also emphasized, as the shades indicate — and this is exactly what self-attention in a machine reader does.
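The self-attention just described — every position attending to every other position of the same sequence — can be sketched with scaled dot-product attention, the form used in transformers. A minimal NumPy sketch; the sequence length, model dimension, and random projection weights are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along an axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sequence X of shape (T, D).
    This is global (soft) attention: the weights are a differentiable
    softmax over all T positions."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (T, T) alignment scores
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights               # weighted mix of values

rng = np.random.default_rng(1)
T, D = 4, 8                                   # sequence length, model dim
X = rng.normal(size=(T, D))                   # stand-in token embeddings
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))

out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # one updated vector per position
print(weights.sum(axis=-1))                   # attention rows sum to 1
```

Row t of `weights` plays the role of the "shades" in the example above: it says how strongly position t attends to every other position (its memories). Multi-head attention repeats this with several independent Wq/Wk/Wv triples and concatenates the results.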