M4 CST304 Ktunotes - in

The document provides an overview of digital image processing, including definitions, types of images (binary, grayscale, color), and fundamental steps in image processing such as acquisition, enhancement, and restoration. It discusses the components of an image processing system, including image sensors, processing hardware, and software, as well as the concepts of sampling and quantization. Additionally, it covers relationships between pixels and the importance of adjacency and connectivity in image analysis.

Uploaded by yoshua Immanuel


CST 304
Computer Graphics and Image Processing
Module - 4 (Fundamentals of Digital Image Processing)
Introduction to Image processing and applications. Image
as 2D data. Image representation in Gray scale, Binary
and Colour images. Fundamental steps in image
processing. Components of image processing system.
Coordinate conventions. Sampling and quantization.
Spatial and Gray Level Resolution. Basic relationship
between pixels– neighbourhood, adjacency, connectivity.
Fundamentals of spatial domain-convolution operation.
Text Books:
1. Rafael C. Gonzalez, Richard E. Woods, Digital Image
Processing (English)
Introduction
• What is Digital Image Processing?
Digital Image
— a two-dimensional function f(x, y), where x and y are spatial coordinates.
The amplitude of f is called the intensity or gray level at the point (x, y).

Digital Image Processing
— processing digital images by means of a computer; it covers low-, mid-, and high-level processes
low-level: inputs and outputs are images
mid-level: outputs are attributes extracted from input images
high-level: making sense of an ensemble of recognized objects

Pixel
— an element of a digital image

Origins of Digital Image Processing

Images were first sent by submarine cable between London and New
York; the transmission time was reduced from more than a week to
less than three hours.
Sources for Images
• Electromagnetic (EM) energy spectrum –
Gamma rays to radio waves
• Acoustic / Ultrasonic – Ultrasound scan
• Electron microscopy – Electron microscope images
• Synthetic images produced by computer –
Fractals etc.
Electromagnetic (EM) energy spectrum

Major uses
Gamma-ray imaging: nuclear medicine and astronomical observations
X-rays: medical diagnostics, industry, and astronomy, etc.
Ultraviolet: lithography, industrial inspection, microscopy, lasers, biological imaging,
and astronomical observations
Visible and infrared bands: light microscopy, astronomy, remote sensing, industry,
and law enforcement
Microwave band: radar
Radio band: medicine (such as MRI) and astronomy

Examples: Gamma-Ray Imaging

Examples: X-Ray Imaging

Examples: Ultraviolet Imaging

Examples: Light Microscopy Imaging

Examples: Visual and Infrared Imaging

Examples: Infrared Satellite Imaging (USA, 1993 and 2003)
Weeks 1 & 2

Examples: Automated Visual Inspection

Results of automated reading of the plate content by the system.

The area in which the imaging system detected the plate.
Example of Radar Image

Examples: MRI (Radio Band)

Examples: Ultrasound Imaging

Types of Images
• Binary, Grayscale, Color
• Binary
• The simplest type of image. It takes only two values, i.e., black and white,
or 0 and 1. A binary image is a 1-bit image: only one binary digit is needed
to represent each pixel. Binary images are mostly used for general shape or
outline information.
• Binary images are generated using a threshold operation: pixels above the
threshold value are turned white ('1'), and pixels below the threshold value
are turned black ('0').
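The threshold operation described above can be sketched in a few lines of NumPy; the function name and the threshold value 128 are illustrative choices, not part of the notes.

```python
import numpy as np

def to_binary(gray, threshold=128):
    """Pixels at or above the threshold become white (1); the rest black (0)."""
    return (np.asarray(gray) >= threshold).astype(np.uint8)

gray = np.array([[10, 200],
                 [130, 90]])
binary = to_binary(gray)   # [[0, 1], [1, 0]]
```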

Types of Images
Grayscale Images
• Grayscale images are referred to as monochrome
(one color) images. They contain gray level
information – no color information.
• The number of bits used for each pixel determines
the number of different gray levels available.

The typical grayscale image contains 8 bits/pixel of data, which
allows us to have 256 different gray levels in the scale 0 to 255.

Types of Images
Color Images
• Color images are three-channel monochrome images in which each
channel represents a different color. The color images contain gray
level information in each of the Red, Green and Blue channels.
• The images are represented as Red, Green and Blue (RGB images).
Each color image has 24 bits/pixel: 8 bits for each of the three color
channels (RGB).
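The three image types can be pictured as NumPy arrays; the 4 x 4 size is an arbitrary illustration, but the dtypes and shapes match the bit counts given above.

```python
import numpy as np

binary = np.zeros((4, 4), dtype=np.uint8)     # values in {0, 1}: 1 bit of information per pixel
gray   = np.zeros((4, 4), dtype=np.uint8)     # values in 0..255: 8 bits per pixel
rgb    = np.zeros((4, 4, 3), dtype=np.uint8)  # three 8-bit channels: 24 bits per pixel

levels_per_channel = 2 ** 8                   # 8 bits -> 256 gray levels per channel
```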

Fundamental Steps in DIP


There are two categories of steps involved in image processing:
1. Methods whose inputs and outputs are images.
2. Methods whose outputs are attributes extracted from those images.

Fundamental Steps in DIP


• Image Acquisition: It could be as simple as being given
an image that is already in digital form.
• Image Enhancement: It is among the simplest and most
appealing areas of digital image processing. The idea is
to bring out details that are obscured, or simply to
highlight certain features of interest in an image. Image
enhancement is a very subjective area of image
processing.

Fundamental Steps in DIP


• Image Restoration: It deals with improving the
appearance of an image. It is an objective approach, in
the sense that restoration techniques tend to be based
on mathematical or probabilistic models of image
degradation. Enhancement, on the other hand, is based
on human subjective preferences regarding what
constitutes a "good" enhancement result.

• Color Image Processing: It is an area that has been
gaining importance because of the use of digital
images over the internet. Color image processing deals
basically with color models and their implementation in
image processing applications.

Fundamental Steps in DIP


• Wavelets and Multi-resolution Processing: These
are the foundation for representing images in various
degrees of resolution.
• Compression: It deals with techniques for reducing the
storage required to save an image, or the bandwidth
required to transmit it over a network. It has two major
approaches:
1. Lossless compression
2. Lossy compression
• Morphological Processing: It deals with tools for
extracting image components that are useful in the
representation and description of the shape and
boundary of objects. It is mainly used in automated
inspection applications.

Fundamental Steps in DIP


• Representation and Description: It always follows
the output of the segmentation step, that is, raw pixel
data constituting either the boundary of a region or all
the points in the region itself. In either case, converting
the data to a form suitable for computer processing is
necessary.
• Recognition: It is the process that assigns a label to
an object based on its descriptors. It is the last step of
image processing, and it uses the artificial intelligence
of software.

Fundamental Steps in DIP


• Knowledge Base:
Knowledge about a problem domain is coded into an
image processing system in the form of a knowledge
base.
This knowledge may be as simple as detailing regions of
an image where the information of interest is known to
be located, thus limiting the search that has to be
conducted in seeking that information.
The knowledge base can also be quite complex, such as
an interrelated list of all major possible defects in a
materials inspection problem, or an image database
containing high-resolution satellite images of a region in
connection with a change-detection application.

Fundamental Steps in DIP


• Low-level Processes: These involve primitive operations
such as image preprocessing to reduce noise, contrast
enhancement and image sharpening. These processes are
characterized by the fact that both inputs and outputs are
images.
• Mid-level Processes: These involve tasks like
segmentation, description of objects to reduce them to a
form suitable for computer processing, and classification of
individual objects. The inputs to these processes are
generally images, but the outputs are attributes extracted
from images.
• High-level Processes: These involve "making sense" of an
ensemble of recognized objects, as in image analysis, and
performing the cognitive functions normally associated with
vision.

Components of an Image Processing System



Components of an Image Processing System

• Image Sensors:
With reference to sensing, two elements are required
to acquire a digital image. The first is a physical device
that is sensitive to the energy radiated by the object
we wish to image; the second is specialized image
processing hardware.
• Specialized Image Processing Hardware:
It consists of the digitizer just mentioned, plus
hardware that performs other primitive operations,
such as an arithmetic logic unit (ALU), which performs
arithmetic operations such as addition and subtraction
and logical operations in parallel on images.
Components of an Image Processing System

• Computer:
It is a general-purpose computer and can range
from a PC to a supercomputer, depending on the
application. In dedicated applications, a specially
designed computer is sometimes used to achieve a
required level of performance.
• Software:
It consists of specialized modules that perform
specific tasks. A well-designed package also
includes the capability for the user to write code
that, as a minimum, utilizes the specialized modules.
More sophisticated software packages allow the
integration of those modules.

Components of an Image Processing System

• Mass Storage:
This capability is a must in image processing
applications. An image of size 1024 x 1024 pixels, in
which the intensity of each pixel is an 8-bit quantity,
requires one megabyte of storage space if the
image is not compressed. Image processing
applications fall into three principal categories of
storage:
• Short-term storage for use during processing
• Online storage for relatively fast retrieval
• Archival storage, such as magnetic tapes and
disks

Components of an Image Processing System


• Image Display:
Image displays in use today are mainly color TV monitors.
These monitors are driven by the outputs of image and
graphics display cards that are an integral part of the
computer system.
• Hardcopy Devices:
The devices for recording images include laser printers, film
cameras, heat-sensitive devices, inkjet units and digital units
such as optical and CD-ROM disks.
• Networking:
It is almost a default function in any computer system in use
today. Because of the large amount of data inherent in
image processing applications, the key consideration in
image transmission is bandwidth.

Simple Image Model


• Simple Image Model:
An image is denoted by a two-dimensional function of the
form f(x, y). The value or amplitude of f at spatial
coordinates (x, y) is a positive scalar quantity whose physical
meaning is determined by the source of the image.
When an image is generated by a physical process, its
values are proportional to energy radiated by a physical
source. As a consequence, f(x, y) must be nonzero and
finite; that is, 0 < f(x, y) < ∞.
The function f(x, y) may be characterized by two
components: the amount of source illumination incident
on the scene being viewed (illumination), and the amount
of the source illumination reflected back by the objects in
the scene (reflectance).

Simple Image Model


• These are called the illumination and reflectance
components and are denoted by i(x, y) and r(x, y)
respectively. The functions combine as a product to
form f(x, y) = i(x, y) r(x, y).
• We call the intensity of a monochrome image at
any coordinates (x, y) the gray level l of the
image at that point: l = f(x, y), with
• Lmin ≤ l ≤ Lmax
• Lmin must be positive and Lmax must be finite:
• Lmin = imin rmin
• Lmax = imax rmax

Simple Image Model

• The interval [Lmin, Lmax] is called the gray scale.

• Common practice is to shift this interval
numerically to the interval [0, L-1], where l = 0 is
considered black and l = L-1 is considered
white on the gray scale.
• All intermediate values are shades of gray
varying from black to white.

Image sampling and Quantization

• Image Sampling and Quantization:

• To create a digital image, we need to convert the
continuous sensed data into digital form. This
involves two processes:
1. Sampling, and
2. Quantization.
• A continuous image, f(x, y), is to be converted to
digital form. An image may be continuous with
respect to the x- and y-coordinates, and also in
amplitude. To convert it to digital form, we have to
sample the function in both coordinates and in
amplitude.

Image sampling and Quantization


• Digitizing the coordinate values is called
Sampling. Digitizing the amplitude values is
called Quantization.
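The two steps can be sketched as follows; the continuous image f, the grid sizes, and the 3-bit quantizer are illustrative assumptions, not values from the notes.

```python
import numpy as np

def f(x, y):
    # Example "continuous" image: a smooth intensity ramp with values in [0, 1).
    return (x + y) / 2.0

# Sampling: evaluate f on a discrete grid of M x N spatial coordinates.
M, N = 4, 4
xs = np.linspace(0, 1, M, endpoint=False)
ys = np.linspace(0, 1, N, endpoint=False)
sampled = np.array([[f(x, y) for y in ys] for x in xs])

# Quantization: map each continuous amplitude to one of L = 2**k discrete levels.
k = 3
L = 2 ** k
quantized = np.floor(sampled * L).astype(np.uint8)   # integer gray levels 0..L-1
```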

Image acquisition using sensor arrays



Digital Image Representation


• Digital Image Definition:
A digital image f(m, n) described in a 2D discrete
space is derived from an analog image f(x, y) in a
2D continuous space through a sampling and
quantization process that is frequently referred to
as digitization.
The 2D continuous image f(x, y) is divided into N
rows and M columns. The intersection of a row
and a column is termed a pixel. The value
assigned to the integer coordinates (m, n), with
m = 0, 1, 2, …, M-1 and n = 0, 1, 2, …, N-1, is f(m, n).

Digital Image Representation


• Representing Digital Images:
The result of sampling and quantization is a matrix of real
numbers. Assume that an image f(x, y) is sampled so that the
resulting digital image has M rows and N columns.
The values of the coordinates (x, y) now become discrete
quantities; thus the value of the coordinates at the origin is
(x, y) = (0, 0).

Digital Image Representation


• Hence f(x, y) is a digital image if a gray level (that is, a real number
from the set of real numbers R) is assigned to each distinct pair of
coordinates (x, y). This functional assignment is the quantization
process.
• Due to processing, storage and hardware considerations, the
number of gray levels is typically an integer power of 2:
• L = 2^k
• Then, the number of bits b required to store a digital image is
• b = M * N * k
• When M = N, the equation becomes b = N^2 * k
• When an image can have 2^k gray levels, it is referred to as a
"k-bit image". An image with 256 possible gray levels is called an
"8-bit image" (256 = 2^8).
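A quick check of the storage formula b = M * N * k; it reproduces the one-megabyte figure quoted for a 1024 x 1024, 8-bit image in the Mass Storage discussion earlier.

```python
def storage_bits(M, N, k):
    """Bits needed for an M x N image with 2**k gray levels (b = M * N * k)."""
    return M * N * k

bits = storage_bits(1024, 1024, 8)
megabytes = bits / 8 / (1024 * 1024)   # bits -> bytes -> megabytes
```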


Relationship between Pixels


• Let us consider several important relationships between
pixels in a digital image.
• Neighbours of a Pixel:
A pixel p at coordinates (x, y) has four horizontal and
vertical neighbours whose coordinates are given by:
• (x+1, y), (x-1, y), (x, y+1), (x, y-1)
Relationship between Pixels

Basic Relationships between pixels

• Neighborhood

• Adjacency

• Connectivity

• Paths

• Regions and boundaries


Relationship between Pixels

• Neighbors of a pixel p at coordinates (x,y)

 4-neighbors of p, denoted by N4(p):


(x-1, y), (x+1, y), (x,y-1), and (x, y+1).

 4 diagonal neighbors of p, denoted by ND(p):


(x-1, y-1), (x+1, y+1), (x+1,y-1), and (x-1, y+1).

 8 neighbors of p, denoted N8(p)


N8(p) = N4(p) U ND(p)
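The three neighbourhoods above can be written as small set-valued functions; bounds checking against the image borders is omitted for brevity.

```python
def N4(x, y):
    """Four horizontal and vertical neighbours of p = (x, y)."""
    return {(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)}

def ND(x, y):
    """Four diagonal neighbours of p = (x, y)."""
    return {(x - 1, y - 1), (x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1)}

def N8(x, y):
    return N4(x, y) | ND(x, y)   # N8(p) = N4(p) U ND(p)
```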

Relationship between Pixels

• Adjacency
Let V be the set of intensity values used to define adjacency.

4-adjacency: Two pixels p and q with values from V
are 4-adjacent if q is in the set N4(p).

8-adjacency: Two pixels p and q with values from V
are 8-adjacent if q is in the set N8(p).

Relationship between Pixels

• Adjacency
Let V be the set of intensity values used to define adjacency.

m-adjacency: Two pixels p and q with values from V
are m-adjacent if

(i) q is in the set N4(p), or

(ii) q is in the set ND(p) and the set N4(p) ∩ N4(q) has no pixels
whose values are from V.
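The m-adjacency test can be sketched directly from conditions (i) and (ii); the image is assumed to be a dict mapping (x, y) to a value, and the helper names are illustrative.

```python
def N4(x, y):
    return {(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)}

def ND(x, y):
    return {(x - 1, y - 1), (x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1)}

def adjacent_m(img, p, q, V):
    """m-adjacency: removes the ambiguity of plain 8-adjacency."""
    if img.get(p) not in V or img.get(q) not in V:
        return False
    if q in N4(*p):                       # condition (i)
        return True
    if q in ND(*p):                       # condition (ii)
        common = N4(*p) & N4(*q)
        return not any(img.get(r) in V for r in common)
    return False

# Two diagonal 1s sharing a 4-neighbour that is also 1:
# they are 8-adjacent but NOT m-adjacent.
img = {(0, 0): 1, (0, 1): 1, (1, 1): 1}
V = {1}
```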

Relationship between Pixels

• Path
 A (digital) path (or curve) from pixel p with coordinates (x0, y0) to pixel q
with coordinates (xn, yn) is a sequence of distinct pixels with
coordinates

(x0, y0), (x1, y1), …, (xn, yn)

where (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.

 Here n is the length of the path.

 If (x0, y0) = (xn, yn), the path is a closed path.

 We can define 4-, 8-, and m-paths based on the type of adjacency
used.
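The shortest-path questions later in this section can be answered mechanically with a breadth-first search over pixels whose values are in V; this sketch handles 4-paths (the grid encoding and function name are illustrative).

```python
from collections import deque

def shortest_4path(img, p, q, V):
    """Length of the shortest 4-path from p to q over pixels in V, or None."""
    if img.get(p) not in V or img.get(q) not in V:
        return None
    frontier, seen = deque([(p, 0)]), {p}
    while frontier:
        (x, y), d = frontier.popleft()
        if (x, y) == q:
            return d
        for r in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if r not in seen and img.get(r) in V:
                seen.add(r)
                frontier.append((r, d + 1))
    return None   # no 4-path exists for this V

# 2 x 2 block of 1s: opposite corners are joined by a 4-path of length 2.
img = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}
```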
Examples: Adjacency and Path
V = {1, 2}

0 1 1          (1,1) (1,2) (1,3)
0 2 0          (2,1) (2,2) (2,3)
0 0 1          (3,1) (3,2) (3,3)

Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
The 8-paths from (1,3) to (3,3):
(i) (1,3), (1,2), (2,2), (3,3)
(ii) (1,3), (2,2), (3,3)

Two pixels p and q with values from V are m-adjacent if
(i) q is in the set N4(p), or
(ii) q is in the set ND(p) and the set N4(p) ∩ N4(q) has no pixels
whose values are from V.
The m-path from (1,3) to (3,3):
(1,3), (1,2), (2,2), (3,3)
62

Examples: Adjacency and Path

• Find the shortest 4-, 8- and m-paths between p and q
in the given image for the sets
i) V = {0, 1}  ii) V = {1, 2}

3 1 2 1(q)
2 2 0 2
1 2 1 1
1 0 1 1
(p marks the bottom-left pixel; q the top-right.)
Basic Relationships Between Pixels

• Connected in S
Let S represent a subset of pixels in an image. Two pixels
p with coordinates (x0, y0) and q with coordinates (xn, yn) are
said to be connected in S if there exists a path

(x0, y0), (x1, y1), …, (xn, yn)

consisting entirely of pixels in S.
Basic Relationships Between Pixels
Let S represent a subset of pixels in an image.

• For every pixel p in S, the set of pixels in S that are connected to p is
called a connected component of S.

• If S has only one connected component, then S is called a connected
set.

• We call R a region of the image if R is a connected set.

• Two regions, Ri and Rj, are said to be adjacent if their union forms a
connected set.

• Regions that are not adjacent are said to be disjoint.
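A connected component, as defined above, can be extracted with a simple flood fill; here S is assumed to be a set of pixel coordinates and 4-connectivity is used (both are illustrative choices).

```python
def connected_component(S, p):
    """All pixels of S connected to p under 4-connectivity."""
    if p not in S:
        return set()
    component, stack = {p}, [p]
    while stack:
        x, y = stack.pop()
        for r in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if r in S and r not in component:
                component.add(r)
                stack.append(r)
    return component

# Two separate blobs: S has two connected components, so S is not a connected set.
S = {(0, 0), (0, 1), (5, 5)}
```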
Basic Relationships Between Pixels
• Boundary (or border)

 The boundary of a region R is the set of pixels in the region that have
one or more neighbors that are not in R.
 If R happens to be an entire image, then its boundary is defined as the
set of pixels in the first and last rows and columns of the image.

• Foreground and background

 Suppose an image contains K disjoint regions, Rk, k = 1, 2, …, K. Let Ru
denote the union of all the K regions, and let (Ru)c denote its complement.
All the points in Ru are called the foreground;
all the points in (Ru)c are called the background.
Question 1

• In the following arrangement of pixels, are the two
regions (of 1s) adjacent? (if 8-adjacency is used)

1 1 1   Region 1
1 0 1
0 1 0
0 0 1   Region 2
1 1 1
1 1 1
Question 2

• In the following arrangement of pixels, are the two parts
(of 1s) adjacent? (if 4-adjacency is used)

1 1 1   Part 1
1 0 1
0 1 0
0 0 1   Part 2
1 1 1
1 1 1
• In the following arrangement of pixels, the two regions
(of 1s) are disjoint (if 4-adjacency is used).

1 1 1   Region 1
1 0 1
0 1 0        Regions that are not adjacent are said to be disjoint.
0 0 1   Region 2
1 1 1
1 1 1
• In the following arrangement of pixels, the two regions
(of 1s) are disjoint (if 4-adjacency is used).

1 1 1   foreground
1 0 1
0 1 0   background
0 0 1
1 1 1
1 1 1

An image contains K disjoint regions, Rk, k = 1, 2, …, K. Let Ru denote
the union of all the K regions, and let (Ru)c denote its complement.
All the points in Ru are called the foreground;
all the points in (Ru)c are called the background.
Distance Measures
• Given pixels p, q and z with coordinates (x, y), (s, t) and
(u, v) respectively, the distance function D has the
following properties:

a. D(p, q) ≥ 0 [D(p, q) = 0 iff p = q]

b. D(p, q) = D(q, p)

c. D(p, z) ≤ D(p, q) + D(q, z)
Distance Measures
The following are the different distance measures:

a. Euclidean Distance:
De(p, q) = [(x-s)^2 + (y-t)^2]^(1/2)

b. City Block Distance:
D4(p, q) = |x-s| + |y-t|

c. Chess Board Distance:
D8(p, q) = max(|x-s|, |y-t|)
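The three measures translate directly into code; the coordinate pair (1, 4) and (4, 2) used here is the one that appears in the worked questions of this section.

```python
import math

def D_e(p, q):
    """Euclidean distance."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def D_4(p, q):
    """City block distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def D_8(p, q):
    """Chessboard distance."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (1, 4), (4, 2)
# D4(p, q) = |1-4| + |4-2| = 5;  D8(p, q) = max(|1-4|, |4-2|) = 3
```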
Question 3

• In the following arrangement of pixels, what is the value
of the chessboard distance between the circled two
points?

0 0 0 0 0
0 0 1 1 0
0 1 1 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0

Chess Board Distance: D8(p, q) = max(|x-s|, |y-t|)
D8(p, q) = max(|1-4|, |4-2|) = 3
Question 4

• In the following arrangement of pixels, what is the value
of the city-block distance between the circled two
points?

0 0 0 0 0
0 0 1 1 0
0 1 1 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0

City Block Distance: D4(p, q) = |x-s| + |y-t|
D4(p, q) = |1-4| + |4-1| = 3 + 3 = 6
Question 5

• In the following arrangement of pixels, what is the value
of the length of the m-path between the circled two
points? V = {1}

0 0 0 0 0
0 0 1 1 0
0 1 1 0 0
0 1 1 0 0
0 0 0 1 0
0 0 0 0 0

City Block Distance: D4(p, q) = |1-4| + |4-2| = 5
Chess Board Distance: D8(p, q) = max(|1-4|, |4-2|) = 3
Path lengths: 4-path = 7, 8-path = 3, m-path = 4.
Path lengths depend on the set V, but the distances D4(p, q) and
D8(p, q) do not depend on V.
The length of the m-path is taken as the Dm distance between p and q.
Question 6

• In the following arrangement of pixels, what is the value
of the length of the 4-path, 8-path and m-path
between the circled two points? V = {1}. Also
calculate the De, D4, D8 and Dm distances.

0 0 0 0 0
0 0 1 1 1
0 0 1 0 1
1 1 0 1 1
1 1 1 1 0
0 0 0 0 0

De(p, q) = [(x-s)^2 + (y-t)^2]^(1/2)
D4(p, q) = |x-s| + |y-t|
D8(p, q) = max(|x-s|, |y-t|)
m-path length = Dm distance

Spatial Domain Processing


• An image can be represented in the form of a 2D matrix
where each element of the matrix represents a pixel
intensity. This domain of 2D matrices that depict the
intensity distribution of an image is called the Spatial
Domain.

• In spatial domain processing, we work directly with the
image (matrix) and modify the pixel intensities as required.

Spatial Domain Processing

• The modified image can be expressed as

g(x, y) = T[f(x, y)]

• Here f(x, y) is the original image and T is the
transformation applied to it to get the new, modified
image g(x, y).
• For all spatial domain techniques, it is simply T that
changes: s = T(r), where s is the output gray level, T is
the transformation function, and r is the input gray level.
• To summarize, spatial domain techniques are those
which directly work with the pixel values to get a new
image based on the equation above.
• Spatial domain enhancement can be carried out in two
different ways: (1) point processing, (2) neighborhood
processing.
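A minimal sketch of point processing with s = T(r): the choice of T here is the image negative, T(r) = L - 1 - r, a standard transformation picked for illustration (the notes do not specify a particular T).

```python
import numpy as np

L = 256                       # number of gray levels for an 8-bit image

def T(r):
    return L - 1 - r          # image negative: dark pixels become bright

f = np.array([[0, 64],
              [128, 255]], dtype=np.uint8)
g = T(f.astype(int)).astype(np.uint8)   # g(x, y) = T[f(x, y)], applied per pixel
```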

Spatial Domain Processing


• For an RGB image, the spatial domain representation is
a 3D array of 2D matrices: each 2D matrix contains
the intensities for a single color channel.

Spatial Domain Processing

• Another domain, called the Frequency Domain, also
exists. It is obtained by applying a Fourier transform
to an image that is currently in the spatial domain.
Spatial Domain Processing –

Convolution operation
• In image processing, convolution is the process of
transforming an image by applying a kernel over each
pixel and its local neighbors across the entire image. The
kernel is a matrix of values whose size and values
determine the transformation effect of the convolution
process.
Spatial Domain Processing –

Convolution operation
The convolution process involves these steps:

1. Place the kernel matrix over each pixel of the
image (ensuring that the full kernel is within the
image), and multiply each value of the kernel with
the corresponding pixel it is over.
2. Sum the resulting multiplied values and return
the sum as the new value of the center pixel.
3. Repeat this process across the entire image.

The output of the convolution process changes with the
changing kernel values.
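The three steps above can be sketched as a "valid" convolution with no padding; as in these notes, the kernel is applied without flipping (i.e. cross-correlation). The 6 x 6 image and Filter 1 are the ones used in the worked example that follows.

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """'Valid' convolution: the kernel always stays fully inside the image."""
    image, kernel = np.asarray(image), np.asarray(kernel)
    m = image.shape[0]
    f = kernel.shape[0]
    n = (m - f) // stride + 1
    out = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            r, c = i * stride, j * stride
            patch = image[r:r + f, c:c + f]          # step 1: overlay the kernel
            out[i, j] = int(np.sum(patch * kernel))  # step 2: multiply and sum
    return out                                       # step 3: repeated everywhere

image = np.array([[1, 0, 0, 0, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 0, 1, 0]])
filter1 = np.array([[1, -1, -1],
                    [-1, 1, -1],
                    [-1, -1, 1]])
feature_map = convolve2d(image, filter1)   # 4 x 4 output; top-left value is 3
```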
Convolution

These are the network parameters to be learned (the filters):

Filter 1:          Filter 2:
 1 -1 -1           -1  1 -1
-1  1 -1           -1  1 -1
-1 -1  1           -1  1 -1

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Each filter detects a small pattern (3 x 3).

Convolution (stride = 1), Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Sliding the filter over the 6 x 6 image with stride 1, the first dot
product (top-left 3 x 3 patch) is 3; moving one pixel to the right,
the next dot product is -1.

Convolution (if stride = 2), Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

With stride 2, the filter moves two pixels at a time, so the first row
of outputs over the 6 x 6 image is 3, -3.

Convolution (stride = 1), Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Applying Filter 1 to the 6 x 6 image with stride 1 gives the 4 x 4 output:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1

Convolution (stride = 1), Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

Repeat this for each filter. Applying Filter 2 to the 6 x 6 image with
stride 1 gives:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

The two 4 x 4 outputs together form a 2 x 4 x 4 feature map.

Convolution

The stacked filter outputs are also called the feature map.

Stride and Padding

Stride specifies how much we move the convolution
filter at each step. By default the value is 1.
Stride reduces the output dimension.

Padding is used to preserve the boundary information.
Zero padding pads the input volume with zeros around
the border.
Convolution – output size

• Stride is the number of pixel shifts over the input
matrix.
• For padding p, filter size f x f, input image size
m x m and stride s, the output image dimension will
be [ ((m + 2p − f) / s) + 1 ] x [ ((m + 2p − f) / s) + 1 ].
• While dividing by the stride, if there is a decimal point,
floor it (round it down).
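A quick check of the output-size formula, using floor division for the rounding rule above; the test cases reuse the 6 x 6 image and 3 x 3 filter from the earlier slides.

```python
def conv_output_size(m, f, p=0, s=1):
    """Output dimension for an m x m input, f x f filter, padding p, stride s."""
    return (m + 2 * p - f) // s + 1   # // applies the floor rule

size_stride1 = conv_output_size(6, 3, p=0, s=1)   # the 6 x 6 example -> 4 x 4
size_stride2 = conv_output_size(6, 3, p=0, s=2)   # stride 2 -> 2 x 2
size_padded  = conv_output_size(6, 3, p=1, s=1)   # zero padding preserves 6 x 6
```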

Convolution - Color Image (RGB)

For a color image, the input has three channels (R, G, B), so each
filter also has three 3 x 3 kernels, one per channel. Each kernel is
applied to its own channel and the three results are summed into a
single output value. Filter 1 and Filter 2 therefore each become
3 x 3 x 3 filters, applied to the 3 x 6 x 6 color image.
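The per-channel convolution with summed results can be sketched as follows; the channel-first array layout and the function name are assumptions made for the example.

```python
import numpy as np

def convolve_rgb(image, kernels):
    """image: (3, m, m) channel-first RGB; kernels: (3, f, f), one per channel.
    Returns a single 2D output: per-channel products are summed together."""
    image, kernels = np.asarray(image), np.asarray(kernels)
    m = image.shape[1]
    f = kernels.shape[1]
    n = m - f + 1
    out = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            patch = image[:, i:i + f, j:j + f]
            out[i, j] = int(np.sum(patch * kernels))  # sum over all 3 channels
    return out
```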
Convolution – examples

With image convolutions, you can easily detect lines. There are four
standard kernels to detect horizontal lines, vertical lines, and lines
at +45 and -45 degrees.

An averaging convolution kernel produces a slight blur: each output
pixel becomes the average of the pixels under the kernel.
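The kernel images from the original slides are not reproduced here, so the following uses the standard 3 x 3 line-detection masks and the 3 x 3 averaging (box blur) kernel as assumed stand-ins.

```python
import numpy as np

# Line-detection masks: each responds strongly to a 1-pixel-wide line
# in the named orientation.
horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]])
vertical   = horizontal.T
plus_45    = np.array([[-1, -1,  2],
                       [-1,  2, -1],
                       [ 2, -1, -1]])
minus_45   = np.array([[ 2, -1, -1],
                       [-1,  2, -1],
                       [-1, -1,  2]])

# Averaging kernel: coefficients sum to 1, producing a slight blur.
average = np.full((3, 3), 1 / 9)
```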

Convolution - examples

Earlier, we had to use separate kernels for horizontal and vertical
lines. A way to "combine" the results is to merge the convolution
kernels into a single new kernel.
