Summary Notes of CNN
Answer: Padding adds extra border pixels (usually zeros) around the input. It offsets two
effects of convolution: the output shrinking at every layer, which loses information, and
border pixels being covered by fewer kernel positions than interior pixels.
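The effect of padding on output size can be sketched with the standard formula floor((n + 2p - k) / s) + 1. This is a minimal illustration, not the numerical example from the class; the function name conv_output_size and the 5x5 input are assumptions chosen for the sketch.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1


# Hypothetical 5x5 input with a 3x3 kernel and stride 1.
# Without padding the output shrinks to 3x3.
print(conv_output_size(5, 3, padding=0))  # 3

# With one ring of zeros around the border the output stays 5x5.
print(conv_output_size(5, 3, padding=1))  # 5
```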
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
Basic Operations of CNN
Answer:
A convolutional layer in a Convolutional Neural Network (CNN) operates by applying a set
of filters (or kernels) to the input data, typically an image. Here’s how it works:
1. Input Data: The input to a convolutional layer is usually a multi-dimensional array,
like an image, which has width, height, and depth (for color images, depth
corresponds to the RGB channels).
2. Filters: Each filter is a smaller matrix (e.g., 3x3, 5x5) that slides over the input data.
The number of filters determines the number of output channels produced by the
layer.
3. Convolution Operation: The filter is applied to the input through a process called
convolution:
   a. The filter is placed on a portion of the input, and an element-wise
      multiplication is performed, followed by summing the results to produce a
      single output value.
   b. The filter then slides (or convolves) across the input in a specified stride (the
      number of pixels the filter moves after each operation) to create the output
      feature map.
4. Activation Function: After convolution, an activation function (commonly ReLU) is
applied to introduce non-linearity, allowing the network to learn complex patterns.
5. Pooling (optional): Often, a pooling layer follows the convolutional layer to
downsample the feature maps, reducing their dimensionality and retaining the most
important information.
6. Output: The result is a set of feature maps that capture various features (like edges,
textures, and patterns) from the input image.
Through training, the filters learn to detect specific features relevant for the task, making
CNNs particularly effective for image recognition and classification tasks.
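The convolution and activation steps above can be sketched in plain Python. This is a simplified illustration that assumes a single-channel input, one filter, and no padding; the function names and the 4x4 values are made up for the sketch and are not the class example.

```python
def conv2d(image, kernel, stride=1):
    """Slide one 2D kernel over a 2D image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise multiply the kernel with the patch under it, then sum.
            total = sum(
                image[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Apply ReLU to every value: negative values become 0."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 single-channel image and 2x2 filter.
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 0, 1, 0],
    [1, 3, 0, 2],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d(image, kernel, stride=1)
print(feature_map)        # [[0, -1, -2], [0, 0, 3], [-1, 0, -1]]
print(relu(feature_map))  # [[0, 0, 0], [0, 0, 3], [0, 0, 0]]
```

For a color image, each filter would have one such 2D slice per input channel, and the sums over all channels would be added together with a bias term.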
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
Pool size is typically 2x2 or 3x3, and stride is usually the same as the pool size.
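A minimal sketch of pooling with these typical settings, assuming max pooling (the notes do not say which pooling type is meant) and a made-up 4x4 feature map. The stride defaults to the pool size, so the windows do not overlap.

```python
def max_pool2d(feature_map, pool_size=2, stride=None):
    """Max pooling over a 2D feature map; stride defaults to the pool size."""
    stride = pool_size if stride is None else stride
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    return [
        [
            # Keep only the largest value inside each pooling window.
            max(
                feature_map[i * stride + a][j * stride + b]
                for a in range(pool_size)
                for b in range(pool_size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 feature map; 2x2 pooling with stride 2 halves each dimension.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
print(max_pool2d(feature_map))  # [[6, 4], [7, 9]]
```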
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
4. Fully Connected Layers (FC)
After several convolutional and pooling layers, the CNN typically includes one or more fully
connected layers (dense layers). In these layers, each neuron is connected to every neuron in
the previous layer, similar to a traditional feed-forward neural network.
The purpose of the fully connected layers is to:
Integrate high-level features from earlier layers.
Classify the input based on the learned features.
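Mathematically, a fully connected layer computes y = f(Wx + b), where x is the input vector
(the flattened feature maps), W is the weight matrix, b is the bias vector, and f is the
activation function.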
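As a small sketch of this computation, each output neuron takes a weighted sum of all inputs, adds its bias, and applies an activation. The function name dense, the choice of ReLU, and all numeric values are assumptions for illustration only.

```python
def relu(z):
    """ReLU activation for a single value."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: output_j = activation(sum_i weights[j][i] * inputs[i] + biases[j])."""
    outputs = []
    for w_row, b in zip(weights, biases):
        # Every output neuron is connected to every input value.
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


# Hypothetical flattened feature vector with 3 values feeding 2 neurons.
x = [1.0, 2.0, 3.0]
W = [
    [0.5, -1.0, 0.0],  # weights of neuron 1
    [1.0, 1.0, 1.0],   # weights of neuron 2
]
b = [0.5, -1.0]

print(dense(x, W, b))        # [-1.0, 5.0]
print(dense(x, W, b, relu))  # [0.0, 5.0]
```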
7. Output Layer
The output layer provides the final predictions of the CNN. In classification tasks, this layer
typically has as many neurons as there are classes, and each neuron represents the probability
of the input belonging to a specific class. A softmax activation function is commonly used in
this layer to produce normalized probabilities.
For binary classification, a single neuron with a sigmoid activation function can be used to
output a probability between 0 and 1.
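The two output activations can be sketched as follows. The raw scores (logits) are made-up values; subtracting the maximum before exponentiating is a common numerical-stability step and does not change the result.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Squash a single raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical scores for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]

# Binary classification: a raw score of 0 maps to a probability of 0.5.
print(sigmoid(0.0))  # 0.5
```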
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
Concept of Stride
Concept of Zero-padding
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
Concept of Pooling
The concept of 2D CNN with diagrammatic overview
PROVIDE THE NUMERICAL EXAMPLE PRESENTED IN THE CLASS (CHECK
GOOGLE CLASSROOM PPT OF CNN)
Weight calculation of a CNN network
Consider a CNN composed of three convolutional layers, each with 3x3 kernels, a stride
of 2, and "same" padding. The lowest layer outputs 100 feature maps, the middle one
outputs 200, and the top one outputs 400. The input images are RGB images of 200x300
pixels. What is the total number of parameters in the CNN?
Answer:
To calculate the total number of parameters in a CNN composed of three convolutional
layers, we need to consider each layer's parameters based on the number of filters (feature
maps) and the dimensions of the kernels.
Layer 1: Convolutional Layer 1
Input: RGB images (3 channels) of size 200x300 pixels.
Number of filters: 100
Kernel size: 3x3
Parameters for each filter:
Parameters per filter = (3 × 3 × 3) + 1 = 27 + 1 = 28
(The "+1" accounts for the bias term.)
Total parameters for Layer 1:
Total parameters = 100 × 28 = 2,800
Layer 2: Convolutional Layer 2
Input: Output from Layer 1 with 100 feature maps (each 100x150 after a stride of 2 with
"same" padding; the spatial size does not affect the parameter count).
Number of filters: 200
Parameters for each filter:
Parameters per filter = (3 × 3 × 100) + 1 = 900 + 1 = 901
Total parameters for Layer 2:
Total parameters = 200 × 901 = 180,200
Layer 3: Convolutional Layer 3
Input: Output from Layer 2 with 200 feature maps.
Number of filters: 400
Parameters for each filter:
Parameters per filter = (3 × 3 × 200) + 1 = 1800 + 1 = 1801
Total parameters for Layer 3:
Total parameters = 400 × 1801 = 720,400
Total Parameters in the CNN
Now, we sum the parameters from all three layers:
Total parameters = 2,800 + 180,200 + 720,400 = 903,400
Thus, the total number of parameters in the CNN is 903,400.
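The per-layer arithmetic can be checked with a short script. This is a sketch under the problem's stated assumptions (3x3 kernels, one bias per filter, RGB input); the helper names are hypothetical. The feature-map sizes use the ceil(n / stride) rule for "same" padding, and they are printed only to show that the spatial size plays no part in the parameter count.

```python
def conv_params(kernel_h, kernel_w, in_channels, num_filters):
    """Weights plus one bias per filter, times the number of filters."""
    return (kernel_h * kernel_w * in_channels + 1) * num_filters


def same_padding_size(n, stride):
    """Output length with "same" padding: ceil(n / stride)."""
    return (n + stride - 1) // stride


in_channels = 3          # RGB input
height, width = 200, 300
total = 0

for num_filters in (100, 200, 400):
    layer_params = conv_params(3, 3, in_channels, num_filters)
    height = same_padding_size(height, 2)
    width = same_padding_size(width, 2)
    print(num_filters, layer_params, (height, width))
    total += layer_params
    in_channels = num_filters  # this layer's outputs feed the next layer

# Printed rows:
# 100 2800 (100, 150)
# 200 180200 (50, 75)
# 400 720400 (25, 38)
print(total)  # 903400
```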