Convolution operation

Convolution is a mathematical operation used to extract features from an image. The convolution is defined by an image kernel, which is nothing more than a small matrix; a 3x3 kernel is the most common choice.

In the figure below, the green matrix is the original image and the yellow moving matrix is the kernel, which is used to learn the different features of the original image. The kernel first moves horizontally, then shifts down and moves horizontally again. At each position, the sum of the element-wise products of the image pixel values and the kernel pixel values gives one entry of the output matrix. The kernel values are initialized randomly and are learned parameters.

To understand the concept of edge detection, consider the example of a simplified image.

A 6×6 image convolved with a 3×3 kernel


So if a 6×6 matrix is convolved with a 3×3 matrix, the output is a 4×4 matrix. To generalize, if an m × m image is convolved with an n × n kernel, the output image is of size (m − n + 1) × (m − n + 1).
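For illustration, a minimal NumPy sketch of this sliding-window operation (assuming a square image and a square kernel, and using the cross-correlation form that CNN layers actually compute) could look like the following; the 6×6 input and 3×3 kernel here are toy values, not taken from the figure above:

import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over the image and take the sum of element-wise
    # products at every position ("valid" convolution, no padding).
    m = image.shape[0]              # assume a square m x m image
    n = kernel.shape[0]             # assume a square n x n kernel
    out_size = m - n + 1            # output is (m - n + 1) x (m - n + 1)
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = image[i:i + n, j:j + n]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6 x 6 "image"
kernel = np.ones((3, 3)) / 9.0                     # 3 x 3 averaging kernel
print(conv2d_valid(image, kernel).shape)           # -> (4, 4)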

Padding

What Is Padding
Padding is a technique used to preserve the spatial dimensions of the input after convolution operations on a feature map. It involves adding extra pixels around the border of the input feature map before the convolution is applied.

This can be done in two ways:

• Valid Padding: With valid padding, no padding is added to the input feature map, and the output feature map is smaller than the input feature map. This is useful when we want to reduce the spatial dimensions of the feature maps.

• Same Padding: With same padding, padding is added to the input feature map so that the output feature map is the same size as the input feature map. This is useful when we want to preserve the spatial dimensions of the feature maps.

The number of pixels to add for padding can be calculated from the size of the kernel and the desired size of the output feature map. The most common form is zero-padding, which involves adding zeros to the borders of the input feature map.

Padding can help reduce the loss of information at the borders of the input feature map and can improve the performance of the model. However, it also increases the computational cost of the convolution operation. Overall, padding is an important technique in CNNs: it helps preserve the spatial dimensions of the feature maps and can improve the performance of the model.

Two problems arise with convolution:

1. Every time a convolution operation is applied, the image shrinks, as we saw in the example above (6×6 down to 4×4). In an image classification task there are multiple convolution layers, so after several convolutions the original image becomes very small, but we do not want the image to shrink every time.
2. When the kernel slides over the image, it covers the pixels at the edges far fewer times than the pixels in the middle, which it overlaps repeatedly. As a result, the features at the corners and edges of the image are hardly used in the output.

So, to solve these two issues, a new concept called padding was introduced. Padding preserves the
size of the original image.

So if an n × n matrix is convolved with an f × f kernel with padding p, the size of the output image will be (n + 2p − f + 1) × (n + 2p − f + 1). For the 3×3 kernel in the example above, p = 1 keeps the output the same size as the input.
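As a quick check of this formula, here is a short sketch with assumed toy values, using NumPy's np.pad for zero-padding:

import numpy as np

n, f, p = 6, 3, 1                     # input size, kernel size, padding
image = np.ones((n, n))               # toy 6 x 6 input
padded = np.pad(image, pad_width=p)   # zero-padding: 6 x 6 -> 8 x 8
out_size = n + 2 * p - f + 1          # 6 + 2 - 3 + 1 = 6, same as the input
print(padded.shape, out_size)         # -> (8, 8) 6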

Stride

Figure: convolution with different stride values (left image: stride = 0, middle image: stride = 1, right image: stride = 2)

Stride is the number of pixels by which the kernel shifts over the input matrix. For padding p, filter size f × f, input image size n × n, and stride s, the output image dimension will be [⌊(n + 2p − f)/s⌋ + 1] × [⌊(n + 2p − f)/s⌋ + 1], which reduces to (n + 2p − f + 1) × (n + 2p − f + 1) when s = 1.
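Putting padding and stride together, a sketch of a general convolution routine (an assumed helper, not from the text) that also verifies the output-size formula:

import numpy as np

def conv2d(image, kernel, p=0, s=1):
    # Zero-pad the input, then slide the kernel with step size s.
    x = np.pad(image, pad_width=p)
    n_p, f = x.shape[0], kernel.shape[0]       # padded size, kernel size (square)
    out_size = (n_p - f) // s + 1              # floor((n + 2p - f) / s) + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(6, 6)
kernel = np.random.rand(3, 3)
print(conv2d(image, kernel, p=1, s=2).shape)   # floor((6 + 2 - 3) / 2) + 1 = 3 -> (3, 3)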
