Module-2 - Computer Vision Complete
Morphological Operations
● Morphological operations are image processing
techniques that change the shape and structure of objects
in an image.
● They are based on mathematical morphology, which studies
the properties of shapes and patterns.
● To perform such an operation, we first convolve the binary
image with a binary structuring element.
● We then select a binary output value by thresholding the
resulting count.
● The structuring element can be any shape, from a simple
3 × 3 box filter to more complicated disc structures.
Different Structuring Elements
● The convolution of a binary image f with a 3×3 structuring
element s is written c = f ⊗ s.
● Here c is an integer-valued count of the number of 1s
inside the structuring element as it is scanned over the
image.
● Let S be the size of the structuring element (number of
pixels)
● The standard operations used in binary morphology
include:
1) Dilation: dilate(f, s) = θ(c, 1);
2) Erosion: erode(f, s) = θ(c, S);
3) Majority: maj(f, s) = θ(c, S/2);
4) Opening: open(f, s) = dilate(erode(f, s), s);
5) Closing: close(f, s) = erode(dilate(f, s), s);
where θ(c, t) is the thresholding function that outputs 1 if
c ≥ t and 0 otherwise.
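The erosion and dilation definitions above can be sketched directly as a count-then-threshold computation. This is a minimal NumPy illustration (assuming a zero-padded border outside the image), not an optimized implementation:

```python
import numpy as np

def binary_count(f, s):
    """c = f (*) s: count of 1s of f under structuring element s at each pixel."""
    kh, kw = s.shape
    ph, pw = kh // 2, kw // 2
    fp = np.pad(f, ((ph, ph), (pw, pw)))  # assume zeros outside the image
    c = np.zeros_like(f, dtype=int)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            c[i, j] = np.sum(fp[i:i + kh, j:j + kw] * s)
    return c

def theta(c, t):
    """Threshold: 1 where the count reaches t, else 0."""
    return (c >= t).astype(int)

def dilate(f, s):
    return theta(binary_count(f, s), 1)        # any 1 under the kernel

def erode(f, s):
    return theta(binary_count(f, s), s.sum())  # all S pixels under the kernel are 1

# A 3x3 white square inside a 5x5 image, with a 3x3 box structuring element.
f = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]])
s = np.ones((3, 3), dtype=int)

d = dilate(f, s)  # the square grows to fill the whole 5x5 image
e = erode(f, s)   # only the centre pixel survives
```

Dilation expands the square by one pixel in every direction, while erosion keeps only pixels whose entire 3×3 neighbourhood is foreground.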
1) Erosion: erode(f, s) = θ(c, S);
● The basic idea of erosion is just like soil erosion: it erodes
away the boundaries of the foreground object.
● A pixel in the output is '1' only if all the pixels under the
kernel are '1'; otherwise it is eroded to '0'.
● It is useful in removing small white noise, but it also shrinks
the foreground object.
2) Dilation: dilate(f, s) = θ(c, 1);
● Here, a pixel element is '1' if at least one pixel under the
kernel is '1'.
● So it increases the white region in the image, i.e., the size of
the foreground object grows.
3) Opening: open(f, s) = dilate(erode(f, s), s);
● Opening is erosion followed by dilation.
● Erosion removes white noise but also shrinks our object, so
we dilate the result; since the noise is gone, it won't come
back, while the object regains its size.
● It is useful in removing noise from images.
(Before / After figure)
4) Closing: close(f, s) = erode(dilate(f, s), s);
● Closing is the reverse of Opening: dilation followed by
erosion.
● It is useful in closing small holes inside the foreground
objects, or small black points on the object.
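The complementary behaviour of opening and closing can be demonstrated on a small synthetic image. This self-contained sketch (assuming NumPy and zero padding at the border) builds a foreground square with one black hole plus an isolated white noise pixel, and checks which operation removes which artifact:

```python
import numpy as np

def conv_count(f, s):
    """Count of 1s under the structuring element at each pixel (zero-padded)."""
    kh, kw = s.shape
    fp = np.pad(f, ((kh // 2,) * 2, (kw // 2,) * 2))
    return np.array([[np.sum(fp[i:i + kh, j:j + kw] * s)
                      for j in range(f.shape[1])] for i in range(f.shape[0])])

def dilate(f, s):  return (conv_count(f, s) >= 1).astype(int)
def erode(f, s):   return (conv_count(f, s) >= s.sum()).astype(int)
def opening(f, s): return dilate(erode(f, s), s)
def closing(f, s): return erode(dilate(f, s), s)

s = np.ones((3, 3), dtype=int)

f = np.zeros((13, 13), dtype=int)
f[2:9, 2:9] = 1     # 7x7 foreground square
f[5, 5] = 0         # small black hole inside the object
f[10, 11] = 1       # isolated white noise pixel

opened = opening(f, s)  # removes the noise pixel, keeps the hole
closed = closing(f, s)  # fills the hole, keeps the noise pixel
```

Opening erases the isolated noise pixel (erosion kills it, and dilation cannot bring it back) while leaving the hole open; closing fills the hole but does not remove the noise.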
Distance transforms
● The distance transform provides a metric or measure of the
separation of points in the image.
● The distance transform is useful in quickly computing the
distance between a point and a set of points or a curve using
a two-pass raster algorithm.
● It has many applications, including level sets, binary image
alignment, feathering in image stitching and blending, and
nearest point alignment.
● The distance transform D(i, j) of a binary image b(i, j) is
defined as
D(i, j) = min over (k, l) with b(k, l) = 0 of d(i − k, j − l),
● where d(k, l) is some distance metric between pixel offsets,
e.g. the Manhattan (city block) distance d(k, l) = |k| + |l|.
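The two-pass raster algorithm mentioned above can be sketched as follows for the Manhattan metric: a forward pass propagates distances from the top-left, and a backward pass from the bottom-right. A minimal NumPy version (assuming distance is measured to the nearest 0-pixel):

```python
import numpy as np

def manhattan_distance_transform(b):
    """D(i, j) = city-block distance from (i, j) to the nearest 0-pixel of b,
    computed with the classic two-pass raster algorithm."""
    H, W = b.shape
    INF = H + W  # larger than any possible city-block distance
    D = np.where(b == 0, 0, INF).astype(int)
    # Forward pass: propagate from the top-left neighbours.
    for i in range(H):
        for j in range(W):
            if i > 0:
                D[i, j] = min(D[i, j], D[i - 1, j] + 1)
            if j > 0:
                D[i, j] = min(D[i, j], D[i, j - 1] + 1)
    # Backward pass: propagate from the bottom-right neighbours.
    for i in range(H - 1, -1, -1):
        for j in range(W - 1, -1, -1):
            if i < H - 1:
                D[i, j] = min(D[i, j], D[i + 1, j] + 1)
            if j < W - 1:
                D[i, j] = min(D[i, j], D[i, j + 1] + 1)
    return D

b = np.ones((5, 5), dtype=int)
b[0, 0] = 0  # a single background pixel in the corner
D = manhattan_distance_transform(b)  # D[i, j] equals i + j here
```

With one background pixel at the corner, the transform grows linearly away from it, as expected for the |k| + |l| metric.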
(Figure: some 1D filters h(x) and their Fourier transforms)
Two-dimensional Fourier transforms
● The formulas and insights we have developed for one-
dimensional signals and their transforms translate directly
into two-dimensional images.
● Here, instead of just specifying a horizontal or vertical
frequency ωx or ωy, we can create an oriented sinusoid of
frequency (ωx , ωy):
s(x, y) = sin(ωx x + ωy y).
Fast Fourier transforms (Example)
Two-dimensional Inverse Fourier transforms
Inverse Fast Fourier transforms (Example)
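As a small worked example of the forward and inverse 2D FFT (a sketch using NumPy's `fft2`/`ifft2`; the grid size and frequencies are arbitrary choices): an oriented sinusoid s(x, y) = sin(ωx x + ωy y) concentrates its spectral energy at a single frequency pair, and the inverse transform recovers the signal.

```python
import numpy as np

# Build a 2D oriented sinusoid on a 64x64 grid: 3 cycles in x, 5 cycles in y.
N = 64
y, x = np.mgrid[0:N, 0:N]
wx, wy = 2 * np.pi * 3 / N, 2 * np.pi * 5 / N
s = np.sin(wx * x + wy * y)

# Forward 2D FFT: the magnitude peaks at (row, col) frequency (5, 3)
# and at its conjugate-symmetric partner (N-5, N-3).
S = np.fft.fft2(s)
mag = np.abs(S)
peak = np.unravel_index(np.argmax(mag), mag.shape)

# Inverse 2D FFT reconstructs the original signal (up to round-off).
s_rec = np.real(np.fft.ifft2(S))
err = np.max(np.abs(s - s_rec))
```

A real sinusoid always produces a conjugate-symmetric pair of peaks in the spectrum, which is why either of the two locations may be reported as the maximum.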
Discrete cosine transform
● The discrete cosine transform (DCT) is a variant of the
Fourier transform particularly well-suited to compressing
images in a block-wise fashion.
● The 1D DCT is computed by taking the dot product of each
N-wide block of pixels with a set of cosines of different
frequencies.
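The dot-product view of the 1D DCT can be sketched directly: each DCT coefficient is the dot product of the N-wide pixel block with one cosine basis vector. This is an unnormalized DCT-II written in plain NumPy (a sketch of the definition, not a production transform):

```python
import numpy as np

def dct_1d(x):
    """1D DCT-II of an N-wide block: dot products of the block with
    cosines of increasing frequency (unnormalized)."""
    N = len(x)
    n = np.arange(N)
    # Row k of the basis is cos(pi/N * (n + 1/2) * k).
    basis = np.cos(np.pi / N * (n[None, :] + 0.5) * n[:, None])
    return basis @ x

block = np.full(8, 10.0)   # a flat 8-pixel block
coeffs = dct_1d(block)     # only the DC coefficient is nonzero
```

For a flat block all the energy lands in the DC coefficient, which is exactly the property that makes the DCT well suited to compressing smooth image blocks.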
Image Pyramids
● We often work with an image of constant size, but on
some occasions we need to work with the same image
at different resolutions.
● For example, we may need to enlarge a small image to
increase its resolution for better quality.
● Alternatively, we may want to reduce the size of an image to
speed up the execution of an algorithm or to save on storage
space or transmission time.
● A set of images of the same scene at different resolutions
is called an Image Pyramid (because when the images are
stacked with the highest resolution at the bottom and the
lowest resolution at the top, the stack looks like a pyramid).
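A pyramid is built by repeatedly smoothing and halving the image. This minimal sketch uses a simple 2×2 block average as the low-pass step (real pyramids usually use a Gaussian filter; the helper names are my own):

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging each 2x2 block
    (a crude low-pass filter followed by subsampling)."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    """Return [full-res, half-res, quarter-res, ...]."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

img = np.arange(64.0).reshape(8, 8)
pyr = build_pyramid(img, 3)
shapes = [p.shape for p in pyr]  # (8, 8), (4, 4), (2, 2)
```

Each level holds progressively lower frequencies, matching the "blurred, low-resolution at the top" picture described above.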
Image Pyramid = hierarchical representation of an image: the
coarse levels are low resolution (blurred, low frequencies only,
no fine details), while the finest level is high resolution
(low + high frequencies, full detail).
Nearest neighbour interpolation
● Suppose a 2x2 image is enlarged to 4x4. Relative to the known
pixel centres, the unknown pixels fall at coordinates such as
(-0.5, -0.5), (-0.5, 0.5), and so on.
● Each unknown pixel is then assigned the value of the known
pixel nearest to it.
● The result is as follows -
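Nearest-neighbour enlargement can be sketched in a few lines of NumPy: each output pixel is mapped back to input coordinates and snapped to the nearest known pixel (with an integer scale factor, flooring the mapped coordinate gives that nearest pixel):

```python
import numpy as np

def nearest_neighbour_resize(img, new_h, new_w):
    """Enlarge img by giving each output pixel the value of its
    nearest input pixel."""
    h, w = img.shape
    # Map each output index back to input coordinates; with an integer
    # upscaling factor, flooring selects the nearest known pixel.
    rows = (np.arange(new_h) * h / new_h).astype(int)
    cols = (np.arange(new_w) * w / new_w).astype(int)
    return img[np.ix_(rows, cols)]

img = np.array([[16, 11],
                [28, 34]])
out = nearest_neighbour_resize(img, 4, 4)  # each pixel becomes a 2x2 block
```

Each known pixel simply expands into a 2×2 block, which is why nearest-neighbour enlargement looks blocky.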
Linear interpolation
● Consider the 2x2 image below, with pixel values:
16 11
28 34
● Suppose it is enlarged to 5x5; each unknown pixel is obtained
by interpolating linearly between its known neighbours.
Bilinear interpolation
● In bilinear interpolation we take the values of the four nearest
known neighbours (a 2x2 neighbourhood) of an unknown pixel
and take a distance-weighted average of these values to assign
to the unknown pixel.
● Let’s first understand how this would work on a simple example:
the value of the required pixel, say at (0.75, 0.25), is found by
linearly interpolating first along one axis and then along the
other.
● Upon bicubic interpolation, which instead fits cubic curves
through a 4x4 neighbourhood, we get the following result:
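The bilinear rule above can be sketched for a single fractional coordinate. This assumes a (row, column) = (y, x) coordinate convention with known pixels at integer positions, and reuses the 2×2 example values from the linear interpolation slide:

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Value at fractional (y, x): a distance-weighted average of the
    2x2 neighbourhood of known pixels."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, img.shape[0] - 1)
    x1 = min(x0 + 1, img.shape[1] - 1)
    dy, dx = y - y0, x - x0
    # Weights are products of the distances to the opposite corners.
    return ((1 - dy) * (1 - dx) * img[y0, x0] + (1 - dy) * dx * img[y0, x1]
            + dy * (1 - dx) * img[y1, x0] + dy * dx * img[y1, x1])

img = np.array([[16.0, 11.0],
                [28.0, 34.0]])
v = bilinear_sample(img, 0.75, 0.25)    # pixel at (0.75, 0.25)
mid = bilinear_sample(img, 0.5, 0.5)    # exact centre = plain average
```

At the exact centre of the 2×2 block the weights are all equal, so bilinear interpolation reduces to the plain average of the four neighbours.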
Decimation (Downsampling)
● While interpolation can be used to increase the resolution of an
image, decimation (downsampling) is required to reduce the
resolution.
● Decimation first convolves the image with a low-pass filter h
(to avoid aliasing) and then keeps only every r-th sample:
g(i, j) = Σ f(k, l) h(ri − k, rj − l), summed over k, l,
where r is the downsampling rate.
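The decimation equation can be sketched with the simplest possible low-pass filter, an r×r averaging kernel, in which case filtering plus subsampling reduces to a block mean (a minimal illustration, not a high-quality anti-aliasing filter):

```python
import numpy as np

def decimate(f, r=2):
    """Downsample by rate r: low-pass filter with an r x r averaging
    kernel h, then keep every r-th sample, i.e.
    g(i, j) = sum_{k,l} f(k, l) h(ri - k, rj - l)."""
    H, W = f.shape
    g = np.zeros((H // r, W // r))
    for i in range(H // r):
        for j in range(W // r):
            g[i, j] = f[r * i:r * i + r, r * j:r * j + r].mean()
    return g

f = np.arange(16.0).reshape(4, 4)
g = decimate(f, 2)  # a 2x2 image of block averages
```

In practice a better filter (e.g. a Gaussian) replaces the box kernel, but the sampling structure of the equation is the same.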
Wavelet Transform
Inverse Wavelet Transform
Inverse DWT: Reconstruction
Geometric transformations
● In this section, we look at how to perform more general
transformations, such as image rotations or general
warping.
● In point processing, we saw functions that transform the
range of the image: g(x) = h(f(x)).
● Here we look at functions that transform the domain:
g(x) = f(h(x)).
Parametric transformations
● Parametric transformations apply a global deformation
to an image, where the behavior of the transformation is
controlled by a small number of parameters.
Hierarchy of 2D coordinate transformations.
● In general, given a transformation specified by a formula
x′ = h(x) and a source image f(x), how do we compute
the values of the pixels in the new image g(x)?
● This process is called forward warping or forward
mapping and is shown in Figure 3.45a.
Forward warping
Limitations of forward warping
● One option is to round the value of x′ to the nearest integer
coordinate and copy the pixel there, but the resulting image has
severe aliasing, and pixels jump around a lot when animating the
transformation.
● You can also “distribute” the value among its four nearest
neighbors in a weighted (bilinear) fashion, keeping track of the
per-pixel weights and normalizing at the end.
● This technique is called splatting and is sometimes used for
volume rendering in the graphics community.
● The second major problem with forward warping is the
appearance of cracks and holes, especially when magnifying an
image.
● Filling such holes with their nearby neighbors can lead to further
aliasing and blurring.
Limitations of forward warping (Example)
Inverse warping
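Inverse warping avoids cracks and holes by iterating over the destination: every output pixel samples the source at the inverse-mapped location. A minimal sketch (nearest-neighbour sampling, with a translation as the example transform; bilinear sampling would be the usual refinement):

```python
import numpy as np

def inverse_warp(f, h_inv, out_shape):
    """Inverse warping: for every pixel x in the output g, sample the
    source image at h_inv(x).  Every output pixel gets a value, so no
    cracks or holes appear."""
    H, W = out_shape
    g = np.zeros(out_shape)
    for i in range(H):
        for j in range(W):
            ys, xs = h_inv(i, j)
            y0, x0 = int(round(ys)), int(round(xs))  # nearest-neighbour sample
            if 0 <= y0 < f.shape[0] and 0 <= x0 < f.shape[1]:
                g[i, j] = f[y0, x0]
    return g

f = np.arange(16.0).reshape(4, 4)

# Shift the image down-right by one pixel: the inverse map subtracts the shift.
g = inverse_warp(f, lambda i, j: (i - 1, j - 1), f.shape)
```

Pixels whose inverse-mapped position falls outside the source are simply left at zero; contrast this with forward warping, where unfilled destination pixels appear as unpredictable holes.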
Mesh-based warping
● Many projection environments require images that are not the
simple perspective projections that are the norm for flat-screen
displays.
● Examples include geometry correction for cylindrical displays
and some new methods of projecting into planetarium domes or
upright domes intended for VR.
● The standard approach is to create the image in a format that
contains all the required visual information and distort it to
compensate for the non-planar nature of the projection device or
surface.
● Mesh-based warping, a technique used in image processing and
computer graphics, involves deforming or warping an image by
manipulating a mesh (a network of points and lines) that
represents the image's geometry.
Figure 1. Image applied as a texture to a mesh, each node is defined by a position (x,y) and texture coordinate (u,v).
Feature-based morphing
● Feature-based morphing in image processing transforms
one image into another by identifying and warping
corresponding features.
Step 1: Select corresponding lines in the source image Is and the
destination image Id.
Step 2: Generate an intermediate frame I by interpolating each
line segment from its position in Is to its position in Id.
Step 3: Map the pixels of the intermediate frame I back to pixels
of Is (and of Id) using the multiple-line algorithm.
Step 4: Multiple-line algorithm: for each pixel X in the
destination, find the corresponding position in the source as a
weighted combination over all line pairs.
Step 5: The warped image Is and the warped image Id are then
cross-dissolved with a given dissolution factor in [0, 1].
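The core of the multiple-line (Beier–Neely style) algorithm is the mapping for a single line pair: the destination pixel X is expressed in coordinates (u, v) relative to the destination line PQ, and the same (u, v) are used to reconstruct the source position X′ from the source line P′Q′. A minimal sketch for one line pair (the full algorithm blends such mappings over all pairs with distance-based weights):

```python
import numpy as np

def map_point(X, P, Q, Pp, Qp):
    """Single-line-pair mapping: express X in (u, v) coordinates of the
    destination line PQ, then rebuild X' from the source line P'Q'."""
    def perp(v):  # 90-degree rotation of a 2D vector
        return np.array([-v[1], v[0]])
    PQ, PpQp = Q - P, Qp - Pp
    u = np.dot(X - P, PQ) / np.dot(PQ, PQ)            # fraction along the line
    v = np.dot(X - P, perp(PQ)) / np.linalg.norm(PQ)  # signed distance from it
    return Pp + u * PpQp + v * perp(PpQp) / np.linalg.norm(PpQp)

# Destination line from (0,0) to (1,0); a test pixel X.
P, Q = np.array([0.0, 0.0]), np.array([1.0, 0.0])
X = np.array([0.3, 0.4])

# Identical source line -> X maps to itself.
Xp_same = map_point(X, P, Q, P, Q)

# Source line translated by (1, 1) -> X is translated the same way.
Xp_shift = map_point(X, P, Q, np.array([1.0, 1.0]), np.array([2.0, 1.0]))
```

When several line pairs are used, each pair's mapped position is averaged with a weight that grows with line length and falls off with the pixel's distance to the line, which is what Step 4 above refers to.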
Multiple-line algorithm