CV Module 1
Course Objectives
01 Recognize and describe both the theoretical and practical aspects of computing with images. Connect issues from Computer Vision to Human Vision.
02 Describe the foundation of image formation and image analysis. Understand the basics of 3D Computer Vision.
03 Become familiar with the major technical approaches involved in computer vision. Describe various methods used for registration, alignment, and matching in images.
04 Get an exposure to advanced concepts leading to object and scene categorization from images.
05 Build computer vision applications.
To implement fundamental image processing
techniques required for computer vision.
Module 2
Perspective
Binocular Stereopsis
Homography
Rectification
DLT
RANSAC
Auto-calibration apparel
Module 4
Feature Extraction And Image Segmentation:
Pattern Analysis:
Clustering: K-Means, K-Medoids, Mixture of Gaussians,
Classification:
Classifiers:
Module 5
Light at Surfaces;
Phong Model;
Reflectance Map;
Albedo estimation;
Photometric Stereo;
• …
Using Computer Vision: Facial Expression
Face detection lets a device identify the presence of faces, which is a separate task from recognizing whose faces they are.
http://www.youtube.com/watch?v=7tD1KlTkunM&feature=player_embedded
Using Computer Vision: Facial Expressions
• Here are pictures of people and their expressions. As you can see, below the faces, the camera can sense where the main features change in the face.
Camera Mouse
o The Camera Mouse detects your head’s motions and moves the pointer on the computer screen accordingly.
o “Instead of using a mouse, a webcam or built-in camera
looks at you and tracks a spot on your face. If you move
your head to the left, the mouse moves to the left. If
you hold the pointer over the spot, a click is issued.
Anything you can do with a mouse, you can do
with Camera Mouse.” – Professor Gips
o In June 2007, Camera Mouse was made available free of charge through Internet download.
o According to Gips, 100,000 copies were downloaded in the first 31 months and another 100,000 in the year following that. More recently, 100,000 were downloaded in just one month.
Computer Vision to the rescue!
• Computer Vision can also be used to help people in need, such as those who can’t use certain body parts to communicate.
• Jordan, the girl above, can’t use her hands to move the mouse on a computer. With the Camera Mouse, which tracks her head to work out where she wants to click, she can move the pointer wherever she wants.
Eagle Eyes
o Eagle Eyes allows people who can only move their eyes to use the computer. Five electrodes are attached to the head in spots where they can sense head and eye movement.
Computer Vision and Nearby Fields
Comp. Photography: Images to Images
Is that a queen or a bishop?
Why computer vision matters
• Examples of state-of-the-art
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
LaneHawk by EvolutionRobotics
“A smart camera is flush-mounted in the checkout lane, continuously
watching for items. When an item is detected and recognized, the
cashier verifies the quantity of items that were found under the basket,
and continues to close the transaction. The item can remain under the
basket, and with LaneHawk, you are assured to get paid for it…”
Vision-based biometrics
“How the Afghan Girl was Identified by Her Iris Patterns” Read the story
Login without a password…
• http://www.sportvision.com/video.html
Smart cars
Slide content courtesy of Amnon Shashua
• Mobileye
– Vision systems currently in high-end BMW, GM,
Volvo models
– By 2010: 70% of car manufacturers.
Google cars
http://www.nytimes.com/2010/10/10/science/10google.html?ref=artificialintelligence
Interactive Games: Kinect
• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
Vision in space
NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
27-Mar-22 59
Human Perception
Applications include:
• Noise filtering
• Content enhancement
• Remote sensing
• Area of medicine
• Terrain mapping
• Atmospheric studies
• Astronomical studies
Noise filtering
Content enhancement
Content enhancement - deblurring
Remote sensing
Area of medicine
Terrain mapping Weather forecast
Atmospheric studies - Ozone hole
Machine Applications
Image
Advantages:
1) Fast processing
2) Cost effective
3) Effective storage
4) Effective transmission
5) Versatile image manipulation
Disadvantages:
High memory for good quality images and hence requires fast processing
Digital Image Processing
Low Level processes – involve primitive operations such as image pre-processing. Both input and output are images.
Mid Level processes – involve tasks such as segmentation, description and classification. Inputs are generally images, but the outputs are attributes extracted from those images.
High Level processes – involve making sense of an ensemble of recognized objects.
Purpose of Image processing
FUN FACTS
01 Our eyes recognize each wavelength by a different colour. Red has the longest wavelength and violet has the shortest wavelength.
02 Cones in human eyes work as receivers for these small visible light waves.
03 Some other creatures can see parts of the spectrum that are not visible to us. For example, some insects can see UV light.
04 Red, yellow and blue are known as primary colours and are used to create all the colours that we see. Orange, purple and green are called secondary colours.
ACTIVITY
Applications of Computer Vision
transformation
If |A| = 0, then A is called Singular
If |A| ≠ 0, then A is called Non-Singular or Regular
Inner Product
$(f, g) = \int_a^b f(x)\,g(x)\,dx$
Two functions are orthogonal in nature if their inner product is zero
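The orthogonality condition can be checked numerically. The following is a minimal sketch, assuming the inner product is the integral of f(x) g(x) over [a, b] and approximating it with the midpoint rule; the helper name `inner_product`, the step count `n`, and the choice of sin and cos on [-π, π] are illustrative assumptions, not part of the slides.

```python
import numpy as np

def inner_product(f, g, a, b, n=100_000):
    """Approximate (f, g) = integral of f(x) g(x) over [a, b] (midpoint rule)."""
    dx = (b - a) / n
    x = a + (np.arange(n) + 0.5) * dx   # midpoints of n equal sub-intervals
    return float(np.sum(f(x) * g(x)) * dx)

# sin and cos are orthogonal on [-pi, pi]; sin is not orthogonal to itself.
ip_sin_cos = inner_product(np.sin, np.cos, -np.pi, np.pi)
ip_sin_sin = inner_product(np.sin, np.sin, -np.pi, np.pi)
print(ip_sin_cos, ip_sin_sin)
```

The first value should be zero up to rounding error, while the second is close to π.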
Orthogonal Transformation
It is a linear transformation T:V→V which preserves a symmetric inner product
It preserves the lengths of vectors and angles between vectors
Euclidean Transformation
Rotation about z axis
x-axis rotates to the y-axis and the y-axis rotates to the negative direction of the original x-axis.
Rotation about x axis
y-axis rotates to the z-axis and the z-axis rotates to the negative direction of the original y-axis.
Rotation about y axis
x-axis rotates to the negative direction of the z-axis and the z-axis rotates to the original x-axis.
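The axis statements above describe 90-degree rotations and can be verified with the standard right-handed rotation matrices. This is a sketch under that assumption (the slide's own matrices are not in the text); `rot_x`, `rot_y` and `rot_z` are illustrative names.

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

ex, ey, ez = np.eye(3)          # unit vectors along the x, y and z axes
q = np.pi / 2                   # quarter turn

# About z: x -> y and y -> -x
z_ok = np.allclose(rot_z(q) @ ex, ey) and np.allclose(rot_z(q) @ ey, -ex)
# About x: y -> z and z -> -y
x_ok = np.allclose(rot_x(q) @ ey, ez) and np.allclose(rot_x(q) @ ez, -ey)
# About y: x -> -z and z -> x
y_ok = np.allclose(rot_y(q) @ ex, -ez) and np.allclose(rot_y(q) @ ez, ex)

# Each rotation is an orthogonal transformation: R^T R = I
orthogonal = all(np.allclose(R.T @ R, np.eye(3)) for R in (rot_x(q), rot_y(q), rot_z(q)))
print(z_ok, x_ok, y_ok, orthogonal)
```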
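A rotation matrix and a translation matrix can be combined into a single matrix as follows,
$$\begin{bmatrix} r_{11} & r_{12} & r_{13} & p \\ r_{21} & r_{22} & r_{23} & q \\ r_{31} & r_{32} & r_{33} & r \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where the r's in the upper-left 3-by-3 matrix form a rotation and p, q and r form a translation vector.
This matrix represents rotations followed by a translation.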
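A short sketch of how such a combined matrix might be built and applied to a point in homogeneous coordinates. The helper name `rigid_transform` and the example rotation and translation values are assumptions chosen for illustration.

```python
import numpy as np

def rigid_transform(R, t):
    """Build a 4x4 matrix with rotation R (3x3) upper-left and translation t in the last column."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Example values (hypothetical): 90-degree rotation about z, then translate by (1, 2, 3)
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T = rigid_transform(R, [1.0, 2.0, 3.0])

point = np.array([1.0, 0.0, 0.0, 1.0])   # the point (1, 0, 0) with w = 1
moved = T @ point                        # rotate first, then translate
print(moved)
```

The rotation sends (1, 0, 0) to (0, 1, 0), and the translation then moves it to (1, 3, 3).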
Euclidean Transformations
That is, lines transform to lines, planes transform to planes, circles transform
to circles, and ellipsoids transform to ellipsoids.
Projective transformations are the most general "linear" transformations and require the use of homogeneous coordinates.
Given a point in space in homogeneous coordinates (x,y,z,w) and its image under a projective transform (x',y',z',w'), a projective transform has the following form:
$$\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$$
The 4-by-4 matrix must be non-singular (i.e., invertible). Projective transformations are more general than affine transformations because the fourth row does not have to contain 0, 0, 0 and 1.
A projective transformation can bring finite points to infinity and points at infinity to finite range.
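The last point can be illustrated with a small sketch. The matrix `H` below is a hypothetical example whose fourth row is (1, 0, 0, 1), so that w' = x + w; the function name `apply_projective` is likewise an assumption.

```python
import numpy as np

def apply_projective(H, p):
    """Apply a non-singular 4x4 matrix H to a homogeneous point p = (x, y, z, w)."""
    H = np.asarray(H, dtype=float)
    if abs(np.linalg.det(H)) < 1e-12:
        raise ValueError("H must be non-singular")
    return H @ np.asarray(p, dtype=float)

# Hypothetical projective matrix: the fourth row is not (0, 0, 0, 1)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1]])

# A finite point (w = 1) whose image has w' = 0, i.e. a point at infinity
to_infinity = apply_projective(H, [-1, 2, 3, 1])
# A point at infinity (w = 0) whose image has w' = 1, i.e. a finite point
to_finite = apply_projective(H, [1, 0, 0, 0])
print(to_infinity, to_finite)
```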
Scaling
Scaling can be applied to all axes, each with a different scaling factor.
For example, if the x-, y- and z-axis are scaled with scaling
factors p, q and r, respectively, the transformation matrix is:
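A minimal sketch of the scaling matrix in homogeneous coordinates, assuming the usual diagonal form with p, q and r on the diagonal and 1 in the last position; the function name and the example factors are illustrative.

```python
import numpy as np

def scaling_matrix(p, q, r):
    """4x4 homogeneous scaling matrix with factors p, q, r along x, y, z."""
    return np.diag([p, q, r, 1.0])

S = scaling_matrix(2.0, 3.0, 0.5)
scaled = S @ np.array([1.0, 1.0, 4.0, 1.0])   # the point (1, 1, 4)
print(scaled)
```

Each coordinate is multiplied by its own factor, so (1, 1, 4) becomes (2, 3, 2).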
Shearing
Neighbourhood of a pixel
- 4 neighbourhood
- 8 neighbourhood
- diagonal neighbourhood
• A pixel has 4 diagonal neighbors
• Denoted as ND(P)
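The three neighbourhoods can be written down directly from the pixel coordinates. This is a sketch that ignores image borders; the function names `n4`, `nd` and `n8` are assumptions mirroring the N4(P), ND(P) and N8(P) notation.

```python
def n4(p):
    """4-neighbours: horizontal and vertical neighbours of p = (x, y)."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    """Diagonal neighbours of p = (x, y)."""
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    """8-neighbours: the union of the 4-neighbours and the diagonal neighbours."""
    return n4(p) | nd(p)

print(sorted(n8((2, 2))))
```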
• Some more processing is required to say whether these pixels belong to the same object or not.
• For grouping, we have to identify which pixels are connected and which are not.
• Connectivity between pixels is an important property: it is used to establish object boundaries, find the area of an object, and compute descriptors for recognizing the object.
• Binary image: two points p and q are connected if q belongs to N(p) (equivalently, p belongs to N(q)) and B(p) == B(q).
• Gray-level image: if the intensities f(p) and f(q) both belong to a set V of allowed values, then three types of connectivity are defined:
• 4 connectivity
• 8 connectivity
• m connectivity
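The three connectivity types can be sketched as an adjacency test between two pixels. The code assumes the common textbook definition of m-connectivity (q is a 4-neighbour of p, or q is a diagonal neighbour of p and the two share no 4-neighbour with a value in V); the function names, the example image and the set `V` are illustrative.

```python
import numpy as np

def n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def in_v(img, p, V):
    """True if p lies inside the image and its value belongs to V."""
    x, y = p
    inside = 0 <= x < img.shape[0] and 0 <= y < img.shape[1]
    return inside and int(img[x, y]) in V

def adjacent(img, p, q, V, kind="4"):
    if not (in_v(img, p, V) and in_v(img, q, V)):
        return False
    if kind == "4":
        return q in n4(p)
    if kind == "8":
        return q in n4(p) | nd(p)
    if kind == "m":
        if q in n4(p):
            return True
        shared = n4(p) & n4(q)   # 4-neighbours common to p and q
        return q in nd(p) and not any(in_v(img, s, V) for s in shared)
    raise ValueError(f"unknown connectivity: {kind}")

img = np.array([[0, 1, 1],
                [0, 1, 0],
                [0, 0, 1]])
V = {1}

# (0, 2) and (1, 1) touch diagonally, but share the 4-neighbour (0, 1) whose value is in V
print(adjacent(img, (0, 2), (1, 1), V, "8"), adjacent(img, (0, 2), (1, 1), V, "m"))
# (1, 1) and (2, 2) touch diagonally with no shared 4-neighbour in V
print(adjacent(img, (1, 1), (2, 2), V, "m"))
```

m-connectivity removes the ambiguous double path that 8-connectivity would allow between (0, 2) and (1, 1).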
IMAGE ENHANCEMENT
Improves the quality of the image
Remove noise
More appealing
IMAGE Methods :
SPATIAL DOMAIN
Refers to the image plane itself
Categories
1) Intensity information
2) Spatial filtering
[Figure: image plane with its origin, the x and y axes, and a pixel at (x,y)]
In spatial domain processing, an operator T is applied to the pixels in a neighbourhood of (x,y) to yield the output at that location: g(x,y) = T[f(x,y)]
Simplest form - when the neighbourhood is of size 1x1
Then g depends only on the value of f at (x,y), and T becomes a gray level transformation function
Denoted as S=T(r)
Contrast Stretching Point Processing
• Figure a) – effect of a transformation that produces an image of higher contrast than the original image
Values of r below m are mapped to darker levels
The mask coefficients determine the nature of the process applied to the image
Basic Intensity transformation function
CLASSIFICATION OF TRANSFORMATION FUNCTIONS
• Linear transformation function (Image negative & image identity)
• Logarithmic transformation function (log function & inverse log function)
• Power law transformation function (nth power and nth root)
IMAGE IDENTITY
Of very little use in digital image processing
Output image is the same as the input image
It is also called the identity transformation
The transformation is a straight line, S=r
Image negative
Intensity level – 0 to L-1
Represented as S=L-1-r
For r=0: S=L-1-0=L-1
For r=L-1: S=L-1-(L-1)=0
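The negative transformation is a one-line array operation. A minimal sketch, assuming the image is stored as an integer array with levels 0 to L-1; the function name and the tiny 3-bit example (L = 8) are illustrative.

```python
import numpy as np

def negative(img, L=256):
    """Image negative: S = L - 1 - r for every pixel r."""
    return (L - 1) - np.asarray(img)

r = np.array([[0, 3],
              [5, 7]])
s = negative(r, L=8)   # with L = 8, level 0 maps to 7 and level 7 maps to 0
print(s)
```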
LOG TRANSFORMATION
Represented as S = c log(1+r)
Using the logarithmic transformation, we can compress or expand the gray levels
Output – a higher contrast or lower contrast image, depending on the function (log or inverse log) that we apply
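A sketch of the log transformation on an array of gray levels. The scaling constant c = (L-1)/log(L) is an assumption made here so that the brightest input level maps to the brightest output level; the slides do not fix a value for c.

```python
import numpy as np

def log_transform(img, L=256):
    """S = c * log(1 + r), with c chosen (assumption) so that r = L-1 maps to S = L-1."""
    r = np.asarray(img, dtype=float)
    c = (L - 1) / np.log(L)
    return c * np.log1p(r)       # log1p(r) computes log(1 + r)

levels = np.array([0, 15, 128, 255])
out = log_transform(levels)
print(np.round(out, 1))
```

Dark input levels are spread over a wider output range, while bright levels are compressed.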
INVERSE LOG TRANSFORMATION
POWER LAW TRANSFORMATION
Represented as $S = c\,r^{\gamma}$
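The formula S = c r^γ is the power law (gamma) transformation. In the sketch below the input is first normalised to [0, 1] and the result rescaled to [0, L-1]; that normalisation, the default c = 1, and the function name are assumptions for illustration.

```python
import numpy as np

def power_law(img, gamma, c=1.0, L=256):
    """S = c * r**gamma, applied to r normalised to [0, 1] and rescaled to [0, L-1]."""
    r = np.asarray(img, dtype=float) / (L - 1)
    return (L - 1) * c * r ** gamma

levels = np.array([0, 64, 255])
brighter = power_law(levels, gamma=0.5)   # gamma < 1 (nth root) brightens mid-tones
darker = power_law(levels, gamma=2.0)     # gamma > 1 (nth power) darkens mid-tones
print(np.round(brighter, 1), np.round(darker, 1))
```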
PIECEWISE LINEAR TRANSFORMATION FUNCTION
Three types
•Contrast stretching
•Gray level slicing
•Bit plane slicing
CONTRAST STRETCHING
[Figure: contrast stretching transformation S=T(r) in the limiting case r1=r2, S1=0, S2=L-1, which maps every input level to black or white]
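Contrast stretching can be sketched as a piecewise linear map through two control points (r1, S1) and (r2, S2). The function name and the example values are assumptions; the general branch assumes 0 < r1 < r2 < L-1, and the case r1 = r2 is handled separately as a threshold.

```python
import numpy as np

def contrast_stretch(img, r1, s1, r2, s2, L=256):
    """Piecewise linear map through (0, 0), (r1, s1), (r2, s2), (L-1, L-1)."""
    r = np.asarray(img, dtype=float)
    if r1 == r2:
        # Limiting case: levels below r1 go to s1, all others go to s2
        return np.where(r < r1, float(s1), float(s2))
    # Assumes 0 < r1 < r2 < L-1 so the breakpoints are strictly increasing
    return np.interp(r, [0, r1, r2, L - 1], [0, s1, s2, L - 1])

stretched = contrast_stretch([0, 2, 5, 7], r1=2, s1=1, r2=5, s2=6, L=8)
thresholded = contrast_stretch([1, 3, 5], r1=3, s1=0, r2=3, s2=7, L=8)
print(stretched, thresholded)
```

With r1 = r2, S1 = 0 and S2 = L-1 the result is a binary (black and white) image.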
GRAY LEVEL SLICING
Used to highlight a specific range of gray levels
I approach: Display a high value for gray levels in the range of interest and a low value for all other gray levels
II approach: Brighten the desired gray levels but leave the gray levels of all other pixels unchanged
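Both approaches can be sketched with a mask over the range of interest [A, B]. The function names and the `high` and `low` output values are assumptions for illustration.

```python
import numpy as np

def slice_binary(img, a, b, high=255, low=0):
    """Approach I: high value inside [a, b], low value everywhere else."""
    r = np.asarray(img)
    return np.where((r >= a) & (r <= b), high, low)

def slice_preserve(img, a, b, high=255):
    """Approach II: brighten levels inside [a, b], leave other levels unchanged."""
    r = np.asarray(img)
    return np.where((r >= a) & (r <= b), high, r)

pixels = np.array([10, 100, 150, 200])
first = slice_binary(pixels, 90, 160)
second = slice_preserve(pixels, 90, 160)
print(first, second)
```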
[Figure: the two gray level slicing transformations T(r), each highlighting the range A to B]
Bit plane slicing highlights the contribution made to the total image appearance by specific bits
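A minimal sketch of bit plane slicing, assuming 8-bit pixels so that there are eight planes (bit 0 is the least significant); the function name and the example pixel values are illustrative.

```python
import numpy as np

def bit_plane(img, k):
    """Return plane k of an 8-bit image as an array of 0s and 1s."""
    return ((np.asarray(img).astype(np.uint8) >> k) & 1).astype(int)

pixels = np.array([[200, 5]])          # 200 = 11001000b, 5 = 00000101b
msb = bit_plane(pixels, 7)             # most significant bit plane
lsb = bit_plane(pixels, 0)             # least significant bit plane

# Summing all planes weighted by 2**k reconstructs the original image
rebuilt = sum(bit_plane(pixels, k) << k for k in range(8))
print(msb, lsb, rebuilt)
```

The higher-order planes carry most of the visually significant information, which is why they are often examined first.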
Histogram, Image restoration, Convolution, Filtering, Fourier transform
Histogram Equalization
• Graphical representation of any data
[Figure: example image with gray levels 0 to 7 and the corresponding histogram]
Histogram Equalization
• Dark image – histogram concentrated near gray level 0
• The quality of the image can be controlled by normalizing the histogram to a flat profile
Input image:
4 4 4 4 4
3 4 5 4 3
3 5 5 5 3
3 4 5 4 3
4 4 4 4 4
After Histogram Equalization:
6 6 6 6 6
2 6 7 6 2
2 7 7 7 2
2 6 7 6 2
6 6 6 6 6
Gray levels:   0 1 2 3 4 5 6 7
No. of pixels: 0 0 0 6 14 5 0 0
[Figure: histogram of the input image, with bars at gray levels 3, 4 and 5]
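The worked example can be reproduced in a few lines. The sketch assumes the usual mapping S_k = round((L-1) × CDF(r_k)) with L = 8 gray levels; the function name `equalize` is an assumption.

```python
import numpy as np

def equalize(img, L=8):
    """Histogram equalization: map each level r_k to round((L-1) * CDF(r_k))."""
    img = np.asarray(img)
    hist = np.bincount(img.ravel(), minlength=L)     # number of pixels per gray level
    cdf = np.cumsum(hist) / img.size                 # cumulative distribution function
    mapping = np.round((L - 1) * cdf).astype(int)    # new level for each old level
    return hist, mapping[img]

img = np.array([[4, 4, 4, 4, 4],
                [3, 4, 5, 4, 3],
                [3, 5, 5, 5, 3],
                [3, 4, 5, 4, 3],
                [4, 4, 4, 4, 4]])
hist, out = equalize(img)
print(hist)
print(out)
```

Here level 3 has CDF 6/25 and maps to 2, level 4 has CDF 20/25 and maps to 6, and level 5 has CDF 25/25 and maps to 7, which matches the equalized image shown above.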
Image Restoration
Objective of image restoration
Convolution is a simple mathematical operation
The matrix of weights is called the convolution kernel, also known as the filter.
A convolution kernel is a correlation kernel that has been rotated 180 degrees.
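The relationship between correlation and convolution can be sketched with plain NumPy. The code computes only the "valid" region (no padding), which is a simplifying assumption; the function names and the example kernel are illustrative.

```python
import numpy as np

def correlate2d(img, kernel):
    """'Valid' correlation: slide the kernel over the image without flipping it."""
    img = np.asarray(img, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the window under the kernel
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(img, kernel):
    """Convolution = correlation with the kernel rotated by 180 degrees."""
    return correlate2d(img, np.rot90(kernel, 2))

img = np.arange(16).reshape(4, 4)
kernel = np.array([[1, 0],
                   [0, -1]])
corr = correlate2d(img, kernel)
conv = convolve2d(img, kernel)
print(corr)
print(conv)
```

For this asymmetric kernel the two results differ in sign; for a kernel that is unchanged by a 180-degree rotation, correlation and convolution give the same output.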
Fourier Transformations