https://medium.com/@luuisotorres/convolutional-neural-network-from-scratch-6b1c856e1c07 1/45
10/16/23, 10:27 AM Convolutional Neural Network From Scratch | by Luís Fernando Torres | Oct, 2023 | Medium
Introduction
Convolutional Neural Networks (CNNs or ConvNets) are specialized neural
architectures that are predominantly used for computer vision tasks,
such as image classification and object recognition. These neural networks
harness the power of Linear Algebra, specifically through convolution
operations, to identify patterns within images.
Convolutional neural networks have three main kinds of layers, which are:
• Convolutional layer
• Pooling layer
• Fully-connected layer
The convolutional layer is the first layer of the network, while the
fully-connected layer is the final layer, responsible for the output. The first
convolutional layer may be followed by several additional convolutional
layers or pooling layers, and the CNN becomes more complex with each new
layer.
As the CNN gets more complex, it becomes better at identifying larger
portions of the image. Earlier layers focus on simple features, such as
colors and edges; as the image progresses through the network, the CNN
starts to recognize larger elements and shapes, until it finally reaches its
main goal.
The image below displays the structure of a CNN. We have an input image,
followed by Convolutional and Pooling layers, where the feature learning
process happens. Later on, we have the layers responsible for the task of
classifying whether the vehicle in the input data is a car, truck, van, bicycle,
etc.
Convolutional Layer
The convolutional layer is the most important layer of a CNN, responsible
for the bulk of the computation. The convolutional layer includes input
data, a filter, and a feature map.
To illustrate how it works, let’s assume we have a color image as input. This
image is made up of a matrix of pixels in 3D, representing the three
dimensions of the image: height, width, and depth.
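The filter, also called a kernel, is applied to one patch of the image at a
time, producing a single value from the dot product between its weights and
the pixels beneath it. This process is repeated until the kernel slides
through the entire image, resulting in an output array.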
The resulting output array is also known as a feature map, activation map, or
convolved feature.
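The sliding-window computation can be sketched in plain Python. This is only a minimal illustration, assuming a single-channel image, a stride of 1, and no padding; the function name convolve2d and the example values are hypothetical, and deep learning libraries compute the same thing far more efficiently.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map.

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before the element-wise multiplication.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the patch beneath it
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 single-channel input and 3x3 filter
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5×5 input with a 3×3 filter and a stride of 1 yields a 3×3 feature map, because the filter fits in three positions along each axis.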
GIF displaying the convolutional process. First, we have a 5×5 matrix (the pixels in the input image) with a
3×3 filter. The result of the operation is the output array.
Source: Convolutional Neural Networks
It is important to note that the weights in the filter remain fixed as it moves
across the image. The weight values are adjusted only during the training
process, through backpropagation and gradient descent.
Besides the weights in the filter, we have three other important parameters
that need to be set before the training begins:
• Stride: This is the distance, or number of pixels, that the filter moves over
the input matrix.
• Zero-padding: This parameter is usually used when the filters do not fit the
input image. This sets all elements outside the input matrix to zero,
producing a larger or equally sized output. There are three different kinds of
padding:
■ Valid padding: Also known as no padding. In this specific case, the last
convolution is dropped if the dimensions do not align.
■ Same padding: This padding ensures that the output layer has the exact
same size as the input layer.
■ Full padding: This kind of padding increases the size of the output by
adding zeros to the borders of the input matrix.
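To make the effect of stride and padding concrete, the sketch below computes the output size along one axis using the usual formula floor((n + 2p − k) / s) + 1, where n is the input size, k the kernel size, p the padding added on each side, and s the stride. The function name and the mapping from padding names to p are illustrative assumptions; in particular, the "same" branch is only exact for a stride of 1 and an odd kernel size.

```python
def conv_output_size(n, k, stride=1, padding="valid"):
    """Output length along one axis for input size n and kernel size k."""
    if padding == "valid":
        pad = 0                # no padding
    elif padding == "same":
        pad = (k - 1) // 2     # exact only for stride 1 and odd k
    elif padding == "full":
        pad = k - 1            # every partial overlap is included
    else:
        raise ValueError(f"unknown padding: {padding!r}")
    return (n + 2 * pad - k) // stride + 1


for mode in ("valid", "same", "full"):
    print(mode, conv_output_size(5, 3, padding=mode))
```

For a 5×5 input and a 3×3 filter with a stride of 1, this gives 3, 5, and 7 for valid, same, and full padding respectively.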
Source: ResearchGate
The subsequent convolutional layers can see the pixels within the receptive
fields of the prior layers, which helps to extract and interpret additional
patterns.
Pooling Layer
• Max Pooling: As the filter slides through the input, it selects the pixel with
the highest value for the output array.
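As a rough illustration of max pooling, the sketch below slides a 2×2 window with a stride of 2 over a small feature map and keeps the largest value in each window. The function name max_pool2d and the input values are made up for this example.

```python
def max_pool2d(feature_map, pool_size=2, stride=2):
    """Return the max-pooled version of a 2D feature map (no padding)."""
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the current window
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 4], [8, 9]]
```

The 4×4 input shrinks to 2×2, which is how pooling reduces the amount of data passed on to later layers.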
Source: ResearchGate
Fully-connected Layer
This is the layer responsible for performing the classification task based on
the features extracted by the previous layers. While convolutional layers
tend to use ReLU activations, fully-connected output layers typically use the
Softmax activation function for classification, producing a probability from 0
to 1 for each class.
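To show how Softmax turns raw scores into probabilities, here is a minimal sketch in plain Python. The input scores are invented for illustration; subtracting the maximum score before exponentiating is a common numerical-stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
```

Each output lies between 0 and 1, the outputs sum to 1, and the largest score receives the largest probability.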
Source: ResearchGate
• Social Media: Google, Meta, and Apple use these systems to identify people
in a photograph, making it easier to organize photo albums and tag friends.
• Agriculture: Drones equipped with cameras can monitor the health of vast
farmlands to identify areas that need more water or fertilizers.
This Article
Nowadays, there are several pre-trained CNNs available for many tasks.
Models like ResNet, VGG16, InceptionV3, as well as many others, are highly
efficient in most computer vision tasks we currently perform across
industries.
• Rust: These are plant diseases caused by Pucciniales fungi, which cause
severe deformities to the plant.
• Healthy: Naturally, these are the plants that are free from diseases.
We may also count the files inside each subfolder to compute the total
amount of data we have for training and testing, as well as to measure the
degree of class imbalance.
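One possible way to do this count is sketched below with pathlib. It assumes a root/split/class folder layout (for example Train/Healthy); the helper names and the temporary demo folders are hypothetical and are not the author's original code.

```python
import tempfile
from pathlib import Path


def count_images(root):
    """Return {split: {class_name: n_files}} for a root/split/class layout."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts


def print_counts(counts):
    """Print per-class counts and a total for each split."""
    for split, per_class in counts.items():
        for class_name, n in per_class.items():
            print(f"{split}/{class_name}: {n}")
        print(f"Total: {sum(per_class.values())}")
        print("-" * 80)


# Demo on a throwaway directory tree with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for split, n_files in [("Train", 3), ("Test", 2)]:
        for class_name in ["Healthy", "Powdery", "Rust"]:
            class_dir = Path(tmp) / split / class_name
            class_dir.mkdir(parents=True)
            for i in range(n_files):
                (class_dir / f"img_{i}.jpg").touch()
    counts = count_images(tmp)
    print_counts(counts)
```

Dividing the largest class count by the smallest gives a quick imbalance check: a ratio close to 1 means the classes are nearly balanced.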
Train/Healthy: 458
Train/Powdery: 430
Train/Rust: 434
Total: 1322
--------------------------------------------------------------------------------
Test/Healthy: 50
Test/Powdery: 50
Test/Rust: 50
Total: 150
--------------------------------------------------------------------------------
Validation/Healthy: 20
Validation/Powdery: 20
Validation/Rust: 20
Total: 60
--------------------------------------------------------------------------------
We have a total of 1,322 files inside the Train directory and there are no
large imbalances between classes. A small variation between them is fine,
and a simple metric such as Accuracy may be enough to measure
performance.
For the testing set, we have a total of 150 images, whereas the validation set
consists of 60 images in total. Both sets have a perfect class balance.
Convolutional Neural Networks require a fixed size for all the images we feed
into them. This means that every single image in our dataset must be equally
sized, whether 128×128, 224×224, and so on.
First, we are going to resize the images, so they all have the same shape.
Then, we will transform the input from rectangular shape to square shape.
Even though not all images are of the same data type, uint8, it is fairly easy
to guarantee that they will have the same data type once we load images into
datasets. We confirmed, though, that all the images have pixel values
ranging from 0 to 255, which is great news.
Before moving on to the Preprocessing step, let’s plot some images from each
class to see what they look like.
Preprocessing
For those familiar with tabular data, preprocessing is probably one of the
most daunting steps of dealing with neural networks and unstructured data.
We have successfully captured all files within each set for each of the three
classes. We can also print these datasets for a further understanding of their
structure.
Let’s explore a bit deeper what all the information above means.
In the shape (None, 256, 256, 3), the first value represents the batch size,
which is None here because it can vary depending on how many samples we have
in the last batch; 256, 256 represents the height and width of the images; 3
is the number of channels in the images, indicating they are RGB images.
Last, dtype=tf.float32 tells us that the data type of the image pixels is a
32-bit floating point.
By setting the image size to (256, 256), we have ensured that all images have
the same dimensions, 256×256.
Another important step for preprocessing is ensuring that the pixel values of
our images are within a 0 to 1 range. The image_dataset_from_directory
method performed some transformations already, but the pixel values are
still in the 0 to 255 range.
To bring the pixel values to the 0 to 1 range, we can easily use one of Keras’
preprocessing layers, tf.keras.layers.Rescaling.
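Numerically, a Rescaling layer created with a scale of 1/255 multiplies every pixel value by 1/255. The plain-Python sketch below mimics that arithmetic on a tiny made-up image, so the effect on the minimum and maximum values is easy to see; it is an illustration only, not a replacement for the Keras layer.

```python
def rescale(image, scale=1 / 255):
    """Multiply every channel value of a height x width x channels image by `scale`."""
    return [[[value * scale for value in pixel] for pixel in row] for row in image]


# Hypothetical 1x2 RGB image with values in the 0 to 255 range
image = [[[0, 128, 255], [64, 32, 16]]]
scaled = rescale(image)

# Flatten to inspect the new value range
flat = [value for row in scaled for pixel in row for value in pixel]
print(min(flat), max(flat))
```

After rescaling, the smallest value stays at 0 and the largest lands at approximately 1.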
Now we can once more visualize the minimum and maximum pixel values in
the validation set.
Data Augmentation
Keras has about seven different layers for image data augmentation. These
are:
attribute.
attribute.
        tf.keras.layers.RandomBrightness(
            factor = (-.45, .45),
            value_range = (0.0, 1.0),
            seed = seed),
        tf.keras.layers.RandomContrast(
            factor = (.5),
            seed = seed)
    ]
)
We can also use an input_shape as an example to build the pipeline above
and plot it below to illustrate how it looks.
I plan to use different kernel sizes, both 3×3 and 5×5. This may allow
the network to capture features at multiple scales.
With that being said, let’s go ahead and build our ConvNet.
# Flattening tensors
model.add(Flatten())
# Fully-Connected Layers
model.add(Dense(2048))
model.add(Activation('relu'))
model.add(Dropout(0.5))
# Output Layer
model.add(Dense(3, activation = 'softmax')) # Classification layer
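By using Keras’ compile method, we can prepare our neural network for
training. This method has several parameters; the ones we will be focusing
on here are:
• optimizer: This is the algorithm that updates the weights during training.
In this case, we are using RMSprop, which is the best optimizer I've found
during the tests I ran.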
• loss: This is the loss function we’re trying to minimize during training. In
this case, we are using categorical_crossentropy, which is a good choice for
classification tasks with more than two classes.
• metrics: This parameter defines the metric that will be used to evaluate
performance during training and validation. Since our data is not heavily
unbalanced, we may use accuracy for this, which is a very straightforward
metric given by the following formula:
Accuracy = Number of correct predictions / Total number of predictions
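As a rough illustration of both quantities, the sketch below computes categorical cross-entropy and accuracy by hand for a small batch. The labels and predicted probabilities are invented, and the helper functions are simplified stand-ins rather than the Keras implementations.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy over a batch of one-hot labels and probabilities."""
    losses = []
    for true_row, pred_row in zip(y_true, y_pred):
        # Only the probability assigned to the true class contributes
        losses.append(
            -sum(t * math.log(max(p, eps)) for t, p in zip(true_row, pred_row))
        )
    return sum(losses) / len(losses)


def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    correct = sum(
        true_row.index(max(true_row)) == pred_row.index(max(pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return correct / len(y_true)


# Hypothetical batch: four samples, three classes
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]]

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f"loss = {loss:.4f}, accuracy = {acc:.2f}")
```

Three of the four samples are classified correctly, so the accuracy is 0.75, while the loss also reflects how confident each prediction was.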
# Compiling model
model.compile(optimizer = tf.keras.optimizers.RMSprop(0.0001), # 1e-4
              loss = 'categorical_crossentropy', # Ideal for multiclass tasks
              metrics = ['accuracy']) # Evaluation metric
Early Stopping serves the purpose of interrupting the training process when
a certain metric stops improving over a period of time. In this case, I am
going to configure the EarlyStopping method to monitor the accuracy in the
test set, and stop the training process if we don't have any improvement on it
after 5 epochs.
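The patience rule can be sketched in a few lines of plain Python. This is a simplified model of the idea, assuming that only a strict increase of the monitored metric counts as an improvement; the function name and the accuracy values are made up for illustration.

```python
def early_stopping_epoch(metric_per_epoch, patience=5):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, value in enumerate(metric_per_epoch, start=1):
        if value > best:
            # New best value: reset the counter
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None  # the metric kept improving often enough


# Hypothetical accuracies that peak at epoch 3
val_accuracy = [0.60, 0.72, 0.80, 0.79, 0.78, 0.80, 0.77, 0.79, 0.81]
stop_epoch = early_stopping_epoch(val_accuracy, patience=5)
print(stop_epoch)  # 8
```

With a patience of 5, training ends five epochs after the last improvement, so a peak at epoch 22 would end training after epoch 27.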
Model Checkpoint will ensure that only the best weights get saved, and we’re
also going to define the best weights according to the accuracy of the model
in the test set.
checkpoint = ModelCheckpoint('best_model.h5',
                             monitor = 'val_accuracy',
                             save_best_only = True)
We may now use model.fit() to start the training and testing process.
Epoch 1/50
83/83 [==============================] - 82s 760ms/step - loss: 6.5686 - accurac
Epoch 2/50
83/83 [==============================] - 69s 768ms/step - loss: 3.0173 - accurac
Epoch 3/50
83/83 [==============================] - 70s 774ms/step - loss: 2.1228 - accurac
Epoch 4/50
83/83 [==============================] - 67s 727ms/step - loss: 1.3750 - accurac
Epoch 5/50
83/83 [==============================] - 67s 744ms/step - loss: 1.1113 - accurac
Epoch 6/50
83/83 [==============================] - 68s 746ms/step - loss: 0.8958 - accurac
Epoch 7/50
83/83 [==============================] - 70s 765ms/step - loss: 0.7605 - accurac
Epoch 8/50
83/83 [==============================] - 72s 792ms/step - loss: 0.6549 - accurac
Epoch 9/50
83/83 [==============================] - 72s 794ms/step - loss: 0.6207 - accurac
Epoch 10/50
83/83 [==============================] - 73s 803ms/step - loss: 0.5761 - accurac
Epoch 11/50
83/83 [==============================] - 73s 800ms/step - loss: 0.5478 - accurac
Epoch 12/50
83/83 [==============================] - 68s 749ms/step - loss: 0.4660 - accurac
Epoch 13/50
83/83 [==============================] - 68s 744ms/step - loss: 0.4503 - accurac
Epoch 14/50
83/83 [==============================] - 69s 766ms/step - loss: 0.4796 - accurac
Epoch 15/50
83/83 [==============================] - 69s 757ms/step - loss: 0.4338 - accurac
Epoch 16/50
83/83 [==============================] - 69s 763ms/step - loss: 0.3859 - accurac
Epoch 17/50
83/83 [==============================] - 71s 781ms/step - loss: 0.3487 - accurac
Epoch 18/50
83/83 [==============================] - 68s 747ms/step - loss: 0.2876 - accurac
Epoch 19/50
83/83 [==============================] - 68s 754ms/step - loss: 0.3202 - accurac
Epoch 20/50
83/83 [==============================] - 70s 772ms/step - loss: 0.3956 - accurac
Epoch 21/50
83/83 [==============================] - 65s 708ms/step - loss: 0.2890 - accurac
Epoch 22/50
83/83 [==============================] - 65s 716ms/step - loss: 0.3251 - accurac
Epoch 23/50
83/83 [==============================] - 70s 762ms/step - loss: 0.2763 - accurac
Epoch 24/50
83/83 [==============================] - 69s 760ms/step - loss: 0.3304 - accurac
Epoch 25/50
83/83 [==============================] - 69s 763ms/step - loss: 0.2737 - accurac
Epoch 26/50
83/83 [==============================] - 69s 769ms/step - loss: 0.2629 - accurac
Epoch 27/50
83/83 [==============================] - 68s 754ms/step - loss: 0.2416 - accurac
The highest accuracy for the testing set has been reached at the 22nd epoch
at 0.9600, or 96%, and didn’t improve after that.
With the history object, we can plot two lineplots showing both the loss
function and accuracy for both sets over epochs.
It is possible to see that the loss of the training set decreases continuously
over epochs, whereas its accuracy increases. This happens because, at each
epoch, the model starts to become more and more aware of the training set’s
patterns and particularities.
For the test set, however, this process is a bit slower. Overall, the
lowest loss for the test set happened at epoch 14, at 0.5319, while the
accuracy was at its peak at epoch 22, at 0.9600.
Now that our model is built, trained, and tested, we can also plot its
architecture, as well as its summary, to better understand it.
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
sequential (Sequential) (None, 256, 256, 3) 0
=================================================================
Total params: 69,246,659
Trainable params: 69,244,675
Non-trainable params: 1,984
_________________________________________________________________
The summary displays the output shapes for each layer, as well as the
number of parameters. We can clearly see, for instance, that the output
shape for the first layer is (None, 256, 256, 3), where 256 represents both
height and width, while 3 represents the three RGB color channels. In the
last dense layer, however, the output shape is (None, 3), where 3 represents
the three classes for classification.
We can also see that the model has over 69 million parameters, where
99.99% of them are trainable. The non-trainable parameters are the ones
from the BatchNormalization layers.
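Parameter counts like these can be checked by hand. The sketch below uses the standard formulas for Dense, Conv2D, and BatchNormalization layers, assuming biases are enabled and BatchNormalization keeps its default learnable scale and offset. The helper names and the Conv2D example are hypothetical; the totals are the ones reported in the summary.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """One kernel per filter spanning all input channels, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


def batchnorm_params(n_channels):
    """(trainable, non_trainable): gamma and beta vs. moving mean and variance."""
    return 2 * n_channels, 2 * n_channels


# Output layer: Dense(3) fed by the 2048 units of the previous Dense layer
output_layer = dense_params(2048, 3)

# Hypothetical first convolution: 32 filters of size 3x3 on an RGB input
example_conv = conv2d_params(3, 3, 3, 32)

# Totals reported in the model summary
total, trainable, non_trainable = 69_246_659, 69_244_675, 1_984
trainable_share = trainable / total
bn_channels = non_trainable // 2  # channels normalized across all BatchNormalization layers

print(output_layer, example_conv, bn_channels, f"{trainable_share:.4%}")
```

Under these assumptions, the 1,984 non-trainable parameters correspond to the moving means and variances of 992 normalized channels in total, and the output layer alone accounts for 6,147 parameters.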
Validating Performance
After finishing the training and testing phase, we may go ahead and validate
our model on the validation set. To load the best weights achieved during
training, we simply use the load_weights method. These weights were
saved under the file name we gave to the ModelCheckpoint callback:
model.load_weights('best_model.h5')
The output for model.predict() consists of probabilities for each class, while
model.evaluate() returns loss and accuracy values.
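To turn a row of predicted probabilities into a label, we pick the class with the highest probability. The sketch below assumes the classes are ordered alphabetically (Healthy, Powdery, Rust), which is an assumption about how the labels were inferred from the folder names; the probabilities are made up for illustration.

```python
# Assumed class order: alphabetical, matching the folder names
CLASS_NAMES = ["Healthy", "Powdery", "Rust"]


def decode_prediction(probabilities, class_names=CLASS_NAMES):
    """Return (label, confidence) for one row of class probabilities."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best_index], probabilities[best_index]


# Hypothetical output row, shaped like one sample from model.predict()
label, confidence = decode_prediction([0.0007, 0.9990, 0.0003])
print(f"{label} ({confidence:.1%})")  # Powdery (99.9%)
```

The confidence reported for a single image is simply the largest of the three probabilities.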
It is clear that the model correctly predicts 97% of the labels of the images in
the validation set.
I am going to load some images from the validation set and run predictions
on them individually, so we can see how the model performs on each
picture.
The model is about 99.9% confident that the plant in the picture belongs to
the Powdery class, which is correct.
The model is 100% certain that the plant in the picture belongs to the Rust
class, which is also correct.
The model is 100% certain that the plant in the picture belongs to the
Healthy class, which is also correct.
After running several tests with other pictures, I found that the
current model performs fairly well in classifying all three classes.
To save the current weights, so you can deploy this model or continue
working with it later on, you can simply use Keras’ .save() method. This is
Conclusion
In this article, we explored the basics of Convolutional Neural Networks. We
delved deeper into the main layers (Convolutional, Pooling, etc.),
activation functions, as well as many other techniques to work with image
data and CNNs for image classification.
Even though many tasks nowadays can be efficiently done with pre-trained
models, which are easily accessible via platforms such as TensorFlow Hub
and HuggingFace, it is still essential to understand the role of each
layer inside a Convolutional Neural Network and how they interact with each
other. This is why this notebook is intended to guide you through
the process of building a CNN from scratch, and I plan to bring more
notebooks such as this one for other Deep Learning tasks and architectures.
Our model scored 97.0% in accuracy while predicting labels for the
validation dataset, which is a great performance, and it was able to
identify relevant patterns across all the classes in the dataset.
I hope that this notebook serves as an introduction to those who are just
starting to explore ConvNets, or even helps veterans refine their knowledge
of some of the basics. Please feel free to copy this notebook and edit it as
you wish, especially to try your own improvements for higher performance
and further testing.
Thank you so much for reading. Your feedback, upvotes, and suggestions are
always very welcome!
Let’s connect! 🔗
LinkedIn • Kaggle • HuggingFace