
Bachelor Thesis in Computer Science

October 2022

Comparison of pre-trained
Convolutional Neural Network (CNN)
architectures for classification of organic
and recyclable materials from solid
waste.

Nandakishore Reddy Mandle


Naga Veera Sai Ram Chikkala

Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden


This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in
partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science.
The thesis is equivalent to 10 weeks of full-time studies.

The authors declare that they are the sole authors of this thesis and that they have not used
any sources other than those listed in the bibliography and identified as references. They further
declare that they have not submitted this thesis to any other institution to obtain a degree.

Contact Information:
Author(s):
Nandakishore Reddy Mandle
E-mail: nama21@student.bth.se

Naga Veera Sai Ram Chikkala


E-mail: nach21@student.bth.se

University advisor:
Dr. Sai Prashanth Josyula, Associate Senior Lecturer
Department of Computer Science

Faculty of Computing Internet : www.bth.se


Blekinge Institute of Technology Phone : +46 455 38 50 00
SE–371 79 Karlskrona, Sweden Fax : +46 455 38 50 57
Abstract

Background: Classification of organic and recyclable materials from solid waste is one of the primary challenges in waste management. Due to the world's expanding population and urbanization, waste is growing at an alarming rate. As a result, waste classification requires a significant amount of human effort and time, and because of the toxic materials present in the waste, it also comes at a cost to human health. The goal of this thesis is therefore to contribute towards automating the solid waste classification process using computer vision techniques. In this study, Convolutional Neural Network (CNN) architectures are used to build and automate an image classifier that recognizes objects and determines the type of waste material, and to find the most efficient CNN architecture among the selected architectures.

Objectives: This thesis aims to select the best transfer learning algorithm for the classification of organic and recyclable materials from solid waste. The selected models are compared to find the most efficient transfer learning architecture for classifying images of solid waste into organic and recyclable.

Methods: An experiment is conducted to determine the most effective transfer learning algorithm for the considered problem. A single data set is used for all architectures and is divided into two parts: training and testing. After the architectures have been built and trained, the four selected transfer learning architectures, DenseNet121, ResNet50, VGG19, and MobileNet, are evaluated individually using the performance metric accuracy to classify organic and recyclable materials from the images of solid waste.

Results: In this thesis, we selected four transfer learning architectures: DenseNet121, ResNet50, VGG19, and MobileNet. After training and testing the architectures on the Kaggle data set, VGG19 achieved the highest accuracy of 97.5%.

Conclusion: CNN architectures were trained to classify organic and recyclable materials from images of solid waste. Upon experimentation with the Kaggle dataset, VGG19 performed better than the other architectures.

Keywords: Convolutional Neural Network (CNN), computer vision, supervised deep learning algorithms, transfer learning algorithms.
Acknowledgments

We are extremely grateful to our supervisor Dr. Sai Prashanth Josyula for help-
ing and meeting us throughout our thesis work. We sincerely thank our supervisor,
friends, and family who have been supportive throughout our research work.

Additionally, we would like to thank our examiner Prashant Goswami for his ongoing encouragement and suggestions throughout the writing of this thesis. His lectures on research methodology were very helpful to us in structuring and completing the thesis.

Authors:
Nandakishore Reddy Mandle
Naga Veera Sai Ram Chikkala

Contents

Abstract

Acknowledgments

1 Introduction
  1.1 Aim and Objectives
    1.1.1 Aim
    1.1.2 Objectives
  1.2 Research Questions
    1.2.1 Outline of Thesis

2 Background Work
  2.1 Waste Classification
  2.2 Machine Learning
    2.2.1 Deep learning
  2.3 Convolutional Neural Network
    2.3.1 Layers of Convolutional Neural Network
    2.3.2 Types of Convolutional Neural Network
  2.4 Performance Metrics

3 Related Work

4 Method
  4.1 Experimentation
    4.1.1 Experimental Environment
    4.1.2 Data Set
    4.1.3 Exploratory Data Analysis (EDA)
    4.1.4 Fitting the models
    4.1.5 Performance analysis of the architectures

5 Results and Analysis
  5.1 Prediction
    5.1.1 MobileNet Results
    5.1.2 DenseNet121 Results
    5.1.3 ResNet Results
    5.1.4 VGG19 Results
    5.1.5 Comparison Results
  5.2 Observation on Architectures

6 Discussion

7 Conclusions and Future Work
  7.1 Conclusions
  7.2 Future Work

References

List of Figures

2.1 Convolutional Neural Network architecture [21]

4.1 Flowchart representing the steps followed in methodology
4.2 VGG19 parameter summary
4.3 ResNet50 parameter summary
4.4 MobileNet parameter summary
4.5 DenseNet121 parameter summary

5.1 MobileNet accuracy results
5.2 DenseNet121 accuracy results
5.3 ResNet50 accuracy results
5.4 VGG19 accuracy results
5.5 Line graph comparing the accuracy results of the transfer learning algorithms

Acronyms

AI Artificial Intelligence.

CNN Convolutional Neural Network.

DL Deep Learning.

EDA Exploratory Data Analysis.

FN False Negative.

FP False Positive.

ML Machine Learning.

RQ Research Question.

TN True Negative.

TP True Positive.

Chapter 1
Introduction

Because of the world's growing population and urbanization, waste is increasing at an alarming rate. Waste is defined as the intentional disposal of unwanted materials [1]. Manufacturing processes, industries, packaging, food scraps, and other sources generate a lot of waste [1]. Every year, nearly 2.01 billion tons of waste are generated. The average amount of waste generated per person per day is 0.74 kilograms, but it varies greatly, ranging from 0.11 to 4.54 kilograms [2]. High-income countries, though they account for only about 16% of the global population, generate 683 million tons of the world's waste. It is predicted that by 2050, global waste generation will have increased to 3.40 billion tons [2].

The separation of waste and recycling is essential for a sustainable society [3]. Sep-
aration of waste is required before it becomes contaminated by other materials [3].
Waste classification can be done based on the type of garbage, biodegradable mate-
rials, and other factors [4]. The main goal of this thesis is to use computer vision
to classify organic and recyclable materials from waste using image classifiers. Com-
puter vision is a visual perception component of an ambitious agenda to mimic human
intelligence and to endow robots with intelligent behavior [5].

Organic waste refers to waste materials that are biodegradable and originate from plants or animals [6]. Leftover food, eggshells, apple cores, fallen leaves, cut flowers, untreated or unpainted wood, and so on are examples of organic waste [6]. Turning organic waste into resources has numerous environmental benefits. For example, as they decompose, these organic wastes can be converted into nutrient- and protein-rich organic fertilizers, which help crops get the nutrients and proteins they need [7], and biogas can be produced by anaerobic digestion of organic waste to provide a versatile energy source [8].

Recycling is a viable option for reducing waste, particularly in large areas where waste is growing [9]. Recycling is the process of collecting and processing waste materials to re-purpose them into new products [9]. Recycling materials has many benefits, including reducing waste sent to landfills, conserving natural resources, saving energy, reducing pollution, and creating jobs [9]. Because natural resource usage is increasing, future generations may run short of resources; recycling helps to reduce the need to collect new ones [10].

Classifying organic materials and recyclable materials from the waste manually requires a lot of human effort and time, and this process can also be dangerous for human health [11]. To resolve this problem, we can automate the process using computer vision [5] to analyze and classify the organic and recyclable materials from the waste.

In computer vision, Machine Learning (ML) plays a significant role in extracting crucial information from images [12]. Machine Learning is a part of Artificial Intelligence (AI); it allows computers to learn and think by themselves, and it performs tasks in a similar way as humans do [13]. ML algorithms are divided into three sub-categories:
Supervised Learning, Unsupervised Learning, and Reinforcement Learning. Super-
vised ML uses labeled data sets, which allow the models to learn and improve over
time [13]. Unsupervised machine learning algorithms recognize previously unidenti-
fied patterns in data sets in order to derive rules from them [13]. Reinforcement ML
algorithms train models through trial and error to determine the best action based
on previous results [14].

ML techniques, particularly supervised deep learning algorithms, are used in a variety of fields and perform well at object detection [15]. Supervised deep learning algorithms have been used in a variety of applications recently, including autonomous driving, pattern recognition, and so on [15]. This study therefore compares supervised deep learning Convolutional Neural Network architectures, namely DenseNet121, ResNet50, VGG19, and MobileNet, to classify organic and recyclable materials from images of solid waste.

1.1 Aim and Objectives


1.1.1 Aim
The main aim of this thesis is to compare CNN architectures that classify organic and recyclable materials from images of solid waste. The CNN architectures compared are DenseNet121, ResNet50, VGG19, and MobileNet. The performance metric used for comparison is accuracy, and the most efficient of the selected CNN architectures will be proposed as the better architecture for the classification of solid waste.

1.1.2 Objectives
The main objectives of this thesis are:

1. To classify the images of solid waste into either of the two categories: "organic"
or "recyclable".

2. To compare the selected architectures based on chosen performance metrics.

3. To find the efficient CNN architecture, among the selected architectures, that is
most suitable for classifying images of solid waste into organic and recyclable.

1.2 Research Questions


RQ: How effective are different CNN architectures such as DenseNet121, ResNet50,
VGG19, and MobileNet for classifying images of solid waste into two categories (or-
ganic and recyclable)?

Motivation: An experiment is conducted to identify the most effective CNN architecture. A single data set will be used for all the architectures, split into training and testing sets. This experiment necessitates the use of image analysis techniques to detect and classify objects. The process involves image augmentation, which generates new data from existing data to increase the size of the data set [16]; the ImageDataGenerator augmentation technique will be used. Once the augmentation is done, the selected CNN architectures, DenseNet121, ResNet50, VGG19, and MobileNet, will be trained, tested, and evaluated individually using the performance metric accuracy to find the most efficient and accurate CNN architecture.

1.2.1 Outline of Thesis


This section describes the thesis work’s structure. The first chapter contains the
introduction, goals, and objectives of this thesis. Chapter 2 discusses the thesis’s
underlying work, selected algorithms, and several related issues. Chapter 3 discusses
related work in waste classification and cites all articles that assisted us in selecting
research work. Chapter 4 covers the methodology of building selected architectures.
The results of this study’s experiments are described in Chapter 5. Chapters 6 and
7 examine the results, conclusion, and future work.
Chapter 2
Background Work

2.1 Waste Classification


Waste classification is the process of separating recyclable and non-recyclable (organic) materials from waste. Due to urbanization and a growing population, waste is increasing at an alarming rate, which poses a significant problem for waste classification. Waste must be separated before it becomes contaminated by other materials. Therefore, the goal of this thesis is to build and automate a waste classification system that recognizes objects and classifies waste materials using supervised deep learning CNN architectures.

2.2 Machine Learning


Machine Learning (ML) is a subset of Artificial Intelligence (AI) that can make com-
puters learn on their own without being explicitly programmed. ML observes hidden
patterns in the data and uses them to train itself to make better predictions [12]. ML
is a process that automates problem-solving so that machines can act based on prior
observations and solve problems with little to no human input. ML has three cate-
gories based on how learning is absorbed or how feedback on learning is provided to
the system under development. There are numerous applications for ML, including
text recognition, natural language processing, self-driving cars, pattern recognition,
etc. The three categories of ML are Supervised Learning, Unsupervised learning,
and Reinforcement learning.

Supervised learning: Supervised learning is a subcategory of ML. In supervised learning, a labeled data set will be provided with the goal of training algorithms to
accurately classify data or predict outcomes [13]. A training set is used in supervised
learning to teach models to produce the desired output. For example, consider a
data set with images of sharks labeled "shark" and images of rivers labeled "water."
As a result, supervised learning will later identify images of sharks as sharks and
images of rivers as water. The use of historical data to predict statistically likely
future events is a common application of supervised learning.

Unsupervised learning: Unsupervised learning, unlike supervised learning, makes use of unlabeled data. The main goal of this learning is to discover hidden patterns
in data that will aid in the solution of clustering or association problems [13]. It is
especially useful if the person is unsure of common properties in a given data set.


Consider customer purchase data, but humans cannot understand what types of pur-
chases and similar properties can be drawn from customer profiles. When this data is
fed into unsupervised learning, it may be determined that a woman of a certain age
who purchases unscented soaps is indeed likely to be pregnant. As a result, products
related to babies and pregnant women will be suggested to that target group.

Reinforcement learning: Reinforcement learning uses trial and error to train models to determine the best action based on previous results [13]. To solve the
problem, it employs a trial and error approach. It’s like a game where you either win
or lose, but reinforcement learning tries to maximize the reward that the programmer
desires. One of the best applications of reinforcement learning is self-driving cars. It
tries to achieve the customer’s goals even when the customer does not provide any
hints to the machine.

2.2.1 Deep learning


Deep Learning (DL) is a subset of ML that is essentially a neural network architecture with three or more layers [20]. It attempts to simulate human behavior by allowing networks to learn from large data sets. DL is the process of automatically learning multiple levels of representations of the underlying distribution of the data to be modeled [20]. In other words, a DL algorithm automatically extracts the low- and high-level features required for classification. Translation of languages, speech recognition, and image classification have all benefited from deep learning, and it can solve many pattern recognition problems without the need for human intervention.

Neural networks are layers of nodes that function in a way loosely inspired by the neurons of the human brain. The number of layers in the network indicates how deep it is. Signals travel between nodes and are assigned weights; a node with a higher weight has a greater influence on the nodes in the next layer. The last layer combines the weighted inputs to generate an output [20]. There are various types of DL algorithms; this study focuses on transfer learning.

2.3 Convolutional Neural Network


A CNN is a deep learning algorithm that is effective in image processing. It is a type of artificial neural network that uses the mathematical operation of convolution instead of general matrix multiplication in at least one of its layers [22]. CNNs are specifically designed to process pixel data and are used in image recognition and processing. A CNN is a multilayer regularized network. Multilayer networks are typically fully connected, which means that every neuron in one layer is linked to every neuron in the next layer; such networks are prone to overfitting the data due to their "full connectivity". A CNN recognizes patterns in input images such as lines, gradients, circles, and even eyes and faces. Applications of CNNs include self-driving cars, as used by Tesla, and text classification, which includes detecting text and classifying it based on its content, among many others.

The CNN's job is to compress the images into a more manageable format while preserving the elements that are required for prediction. This is crucial when creating an architecture that can learn features while also being scalable to large datasets. A CNN is composed of three main types of layers: the convolutional layer, the pooling layer, and the fully connected layer. Each layer contains a collection of filters that detect patterns or features in images.

Figure 2.1 illustrates how, after training, the CNN processes the input to produce and evaluate output.

Figure 2.1: Convolutional Neural Network architecture [21]

2.3.1 Layers of Convolutional Neural Network


The following are a CNN's three main layers, referring to Figure 2.1 above.

2.3.1.1 Convolutional Layer


The convolutional layer is a major part of a CNN that detects the presence of features in a given image [22]. The convolutional layer's parameters consist of a set of K learnable filters (i.e., "kernels"), each of which has a width and a height and is almost always square. This layer performs the mathematical operation of convolution between the input image and a filter of size M x M. The dot product between the filter and the parts of the input image covered by the filter is calculated by sliding the M x M filter over the input image [22].

The result is known as the Feature map, and it contains information about the image
such as its corners and edges. This feature map is later transferred to other layers
in CNN to learn additional features from the input image.
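As a hedged illustration of this operation (not taken from the thesis code), the following minimal Keras sketch applies a single convolutional layer to a 224 x 224 RGB input; the filter count and kernel size are arbitrary example values.

import tensorflow as tf

# One convolutional layer: 32 learnable 3x3 filters slide over the 224x224x3
# input and produce a stack of 32 feature maps (example values, not the thesis setup).
inputs = tf.keras.Input(shape=(224, 224, 3))
feature_maps = tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3),
                                      padding="same", activation="relu")(inputs)
model = tf.keras.Model(inputs, feature_maps)
model.summary()  # output shape: (None, 224, 224, 32)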

2.3.1.2 Pooling Layer


The pooling layer is linked to the convolutional layer. The pooling layer's main goal is to reduce the size of the feature maps produced by the convolutional layer in order to reduce the computational cost [22]. It essentially summarizes the characteristics produced by a convolution layer. This can be accomplished by reducing the number of connections between layers. There are various pooling functions, including global, average, and maximum pooling.

Pooling that selects the maximum element from the region of the feature map covered
by the filter is known as max pooling. Average pooling computes the average of the
elements present in the filter’s feature map region. Global pooling reduces each
channel in the feature map to a single value.
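A small illustrative sketch (not thesis code) of the three pooling variants mentioned above, applied to a hypothetical stack of 32 feature maps:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.keras.Input(shape=(224, 224, 32))                  # feature maps from a conv layer
max_pool = layers.MaxPooling2D(pool_size=(2, 2))(x)       # (112, 112, 32): largest value per 2x2 region
avg_pool = layers.AveragePooling2D(pool_size=(2, 2))(x)   # (112, 112, 32): average of each 2x2 region
global_pool = layers.GlobalAveragePooling2D()(x)          # (32,): one value per channel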

2.3.1.3 Fully Connected Layer


One of the various types of CNN layers is the Fully Connected layer. Fully Connected
means that every neuron in the previous layer is fully connected to every neuron in the
current layer. Fully connected layers are typically used near the end of a CNN when
the goal is to make predictions using the features learned by previous layers. For
instance, if we were using a CNN to classify waste images, the final Fully connected
layer could use the features learned by the previous layers to classify an image as
recyclable, organic, and so on.
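Putting the three layer types together, a minimal sketch of a CNN with a fully connected classification head for the two waste classes could look as follows; the layer sizes are illustrative and are not the architectures compared in this thesis.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    # Two output units, one per class: "organic" and "recyclable".
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])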

2.3.2 Types of Convolutional Neural Network


This thesis will make use of four specific CNN architectures: DenseNet121, ResNet50,
VGG19, and MobileNet.

Each of the following CNN architectures excels at extracting features from images,
detecting objects, detecting hidden patterns, and classifying them. Each architecture
has some pros and cons. So, we want to study how accurately each CNN architec-
ture performs and compare the architectures for the classification of organic and
recyclable materials from the images of solid waste.

DenseNet121: DenseNet121 is a type of convolutional neural network. Each layer in the DenseNet121 architecture is connected to every other layer in a feed-forward manner, so each layer receives the feature maps of all previous layers as input. As a result, even after passing through many layers, the features are preserved. Due to this feature reuse, the number of parameters in the DenseNet121 architecture is reduced [16].
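The dense-connectivity idea can be sketched as follows: each new layer receives the concatenation of all previously produced feature maps. This is a simplified toy block for illustration, not the actual DenseNet121 implementation.

import tensorflow as tf
from tensorflow.keras import layers

def toy_dense_block(x, num_layers=3, growth_rate=12):
    """Each new layer sees the concatenation of all previously produced feature maps."""
    for _ in range(num_layers):
        new_features = layers.Conv2D(growth_rate, (3, 3), padding="same",
                                     activation="relu")(x)
        x = layers.Concatenate()([x, new_features])   # feature reuse
    return x

inputs = tf.keras.Input(shape=(56, 56, 24))
outputs = toy_dense_block(inputs)
print(tf.keras.Model(inputs, outputs).output_shape)   # channels grow: 24 + 3 * 12 = 60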

ResNet50: The Residual Network (ResNet50) architecture is considered a good starting point for transfer learning. The ResNet50 architecture is divided into five stages; a convolutional block and an identity block are present in each stage, and there are three convolutional layers in each convolutional block. It was created to make training deeper networks easier [16]. ResNet50 is comparable to VGG19, although the deepest ResNet variants are up to eight times deeper than VGG, which uses only 3 x 3 filters. ResNet50 is made up of convolutional and pooling layers.
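The identity block mentioned above can be sketched as a residual connection that adds the block input back onto the output of a small stack of convolutions. This is a simplified illustration, not the exact ResNet50 block.

import tensorflow as tf
from tensorflow.keras import layers

def toy_identity_block(x, filters=64):
    """Simplified residual block: output = F(x) + x, where x is the skip connection."""
    shortcut = x
    y = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    y = layers.Add()([y, shortcut])    # the skip connection eases training of deep networks
    return layers.Activation("relu")(y)

inputs = tf.keras.Input(shape=(56, 56, 64))
outputs = toy_identity_block(inputs)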

VGG19: VGG19 is an advanced deep learning CNN for image classification with 19 layers (16 convolutional layers and 3 fully connected layers) [17]. VGG19 is a very deep network that has been trained on millions of images with complex classification tasks. As a result, the network has learned rich feature representations for a wide range of images [17].

MobileNet: MobileNet is a type of convolutional neural network for image classification developed by Google researchers [18]. It is based on depthwise separable convolutions, which build a lightweight deep CNN that reduces computation time and makes the model very small. A depthwise separable convolution is a type of convolution in which a single convolutional filter is applied to each input channel, followed by a pointwise convolution that combines the channels [18].
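The effect of depthwise separable convolutions can be illustrated by comparing parameter counts against a standard convolution; the shapes below are arbitrary example values, not MobileNet's actual layers.

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(112, 112, 32))

# Standard convolution: a full 3x3x32 filter for each of the 64 output channels.
standard = tf.keras.Model(inputs, layers.Conv2D(64, (3, 3), padding="same")(inputs))

# Depthwise separable convolution: one 3x3 filter per input channel,
# followed by a 1x1 "pointwise" convolution that mixes the channels.
x = layers.DepthwiseConv2D((3, 3), padding="same")(inputs)
x = layers.Conv2D(64, (1, 1))(x)
separable = tf.keras.Model(inputs, x)

print(standard.count_params())    # 18,496 parameters
print(separable.count_params())   # 2,432 parameters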

2.4 Performance Metrics


To validate the chosen CNN architectures, performance metrics are essential. There are many performance metrics available for analyzing performance, and they are chosen based on the classifier. Performance metrics for classification problems include accuracy, F1-measure, precision, and recall [28]. These metrics are defined in terms of the following basic outcomes.

True Positive (TP): It is the case when the model correctly predicted a positive outcome [28].

True Negative (TN): It is the case when the model correctly predicted a negative outcome [28].

False Positive (FP): It is the case when the model incorrectly predicted a positive outcome [28].

False Negative (FN): It is the case when the model incorrectly predicted a negative outcome [28].

Precision (P): Precision is defined as the ratio of correctly predicted positive values to the total number of predicted positive values [28]. Equation 2.1 defines the precision.

P = TP / (TP + FP)        (2.1)

Recall (R): The recall is determined by dividing the number of positive samples that were correctly identified as positive by the total number of positive samples [28]. Equation 2.2 defines the recall.

R = TP / (TP + FN)        (2.2)

Accuracy: Accuracy is defined as the ratio of the total number of true predicted values to the total number of predictions. Accuracy is used for evaluating classification models. The accuracy of the ML model indicates how many times it was correct overall [28]. Equation 2.3 defines the accuracy.

Accuracy = (TP + TN) / (TP + FP + FN + TN)        (2.3)

F1 Score: The F1-Score is calculated by taking the harmonic mean of precision and recall and assigning equal weight to each [28]. A higher F1 score indicates better performance from the classifier. Equation 2.4 defines the F1 Score.

F1 Score = (2 x P x R) / (P + R)        (2.4)
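These metrics follow directly from the four confusion-matrix counts; the short sketch below (with made-up counts, not results from this thesis) shows the calculations.

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Made-up counts for a binary organic/recyclable classifier.
tp, tn, fp, fn = 90, 85, 10, 15
print(accuracy(tp, tn, fp, fn))   # 0.875
print(f1_score(tp, fp, fn))       # approximately 0.878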
Chapter 3
Related Work

George E. Sakr et al [3] have carried out research on comparing deep learning al-
gorithms like Convolutional Neural Network (CNN) and Support Vector Machines
(SVM) to automate the classification of waste by using images. Using a 256 x 256
colored png image of the waste, each algorithm separates waste into three main cat-
egories: plastic, paper, and metal in this study. The accuracy of each algorithm
was compared to see which one was the most accurate, and SVM achieved higher
accuracy than CNN.

Sai Susanth G et al [16] have carried out research on garbage waste segregation by comparing deep learning algorithms such as ResNet50, DenseNet169, VGG16, and AlexNet. The data used in their study was collected from Gary Thung and Mindy Yang's trash dataset [4]. DenseNet169 achieved the highest accuracy, and ResNet50 performed close to DenseNet169.

Victoria Ruiz et al [10] have carried out research on the automatic image-based
classification of garbage types. This study compared Convolutional Neural Network
(CNN) architectures: VGG, Inception, and ResNet. The trashNet dataset was used
to train and compare the VGG, Inception, and ResNet. This dataset contains RGB
images of six classes of waste: glass, paper, cardboard, plastic, metal, and general
trash. Each image has only one type of garbage. The Combined Inception-ResNet
model got high accuracy.

Dipesh Gyawali et al [23] have carried out research on the comparative analysis of multiple deep CNN models, such as ResNet50, VGG16, and ResNet18, for waste classification. Their work made use of a data set collected from self-endeavors and TrashNet. After visualizing the accuracy of both training and validation, the ResNet18 model achieved the highest accuracy.

Arpit Patil et al [24] have carried out research on garbage classification using deep learning. The process is automated by building an image classifier using a CNN, ResNet50, and VGG16. The trained models are then fed into a mobile application that uses a camera to capture images in real time. The final result of their study is that VGG16 outperformed the other models.


Rahmi Arda Aral et al [26] carried out research on the classification of waste using
deep learning models. Densenet121, DenseNet169, InceptionResnetV2, MobileNet,
Xception architectures were applied for Trashnet dataset. Every image that was uti-
lized in this investigation had a single object against a white backdrop. Densenet121
received a high accuracy score of 95%. InceptionResnetV2 had a comparable score
of 94%.

In the research papers mentioned above, architectures such as VGG16, ResNet18, and DenseNet variants achieved high accuracy. Building on these results, in this study we compare four CNN architectures, DenseNet121, ResNet50, VGG19, and MobileNet, to classify organic and recyclable materials from images of solid waste.
Chapter 4
Method

In this study, experimentation is chosen as the methodology. The supervised deep learning CNN architectures are trained using a Kaggle data set. The performance metric, accuracy, is used to compare the chosen architectures after the algorithms have been trained and validated. The following steps are taken in this method.

1. The "waste classification" dataset, gathered from Kaggle, is used for this study.

2. Four transfer learning architectures are selected and used to conduct this study.

3. As part of the study, different data pre-processing steps are applied to the selected dataset.

4. After data pre-processing, the dataset is fit to the selected transfer learning algorithms.

5. The four algorithms are compared against each other using the chosen performance metric, accuracy.


Figure 4.1: Flowchart representing the steps followed in methodology

4.1 Experimentation
4.1.1 Experimental Environment
The research takes place in the environment listed below; each of the four transfer learning algorithms is run individually in this environment.

Python: Python is an object-oriented, interpreted programming language that also supports functional programming. It is currently one of the most widely used languages and includes a large number of built-in standard libraries.

PyTorch: PyTorch is an open-source Machine Learning (ML) framework based on the Python language and the Torch library, used for computer vision and deep learning research. It is open-source software distributed under the modified Berkeley Software Distribution (BSD) license [27]. Hugging Face's Transformers, Tesla Autopilot, and Uber's Pyro are all built on PyTorch. PyTorch's advantages include easy debugging with Python tools, cloud platform support, and a user-friendly interface.

TensorFlow: TensorFlow is a Google open-source library designed primarily for deep learning applications [25]. It was created to support large numerical computations, which also makes it well suited to deep learning. TensorFlow accepts data as tensors, which are multi-dimensional arrays; multi-dimensional arrays come in handy when dealing with large amounts of data.

Google Colab: Google Colab is a Jupyter notebook environment created by Google that runs Python code entirely in the cloud. It is well suited for machine learning and data analysis. In more technical terms, Colab is a hosted Jupyter notebook service that requires no setup and provides free access to computing resources such as GPUs.

Torchvision: The torchvision library contains popular computer vision datasets, model architectures, and image transformations. It includes object detection, image classification, video classification, and semantic segmentation training recipes, and offers over 60 pre-trained models for fine-tuning.

NumPy: NumPy is an open-source library for the Python programming language that supports multidimensional arrays, matrices, and data manipulation, and enables high-level mathematical operations on these arrays. NumPy is an abbreviation for numerical Python. The statement "import numpy" is used to include the NumPy library.

Matplotlib: Matplotlib is a Python library for low-level graph plotting and visu-
alization. For platform compatibility, a small portion of Matplotlib is written in C,
Objective-C, and JavaScript. The majority of the library is written in Python. The
matplotlib library is included using the statement "import matplotlib".

4.1.2 Data Set


The data set used in this study is an open-source dataset available on Kaggle [19]. Kaggle is an online community for data scientists to find and publish data sets and compete with other data scientists [19]. This data set contains 22564 jpg images of organic waste (fruits, vegetables, peels, and so on) and recyclable waste (plastic, paper, iron, steel, and so on), organized into two folders: o (organic) and r (recyclable). The dataset contains 12410 organic images and 10154 recyclable images. For data validation, 5% of the data is used, i.e., 1128 images belonging to the 2 classes. After data validation, data preprocessing is applied and irrelevant data is removed. Because of limited resources, only 12630 images are used for the implementation of the models: 80% of the data is used for training and the remaining 20%, i.e., 2526 images, is used for testing, of which 1409 are organic images and 1117 are recyclable images.
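As a quick sanity check, the class balance described above can be verified after downloading the Kaggle data set. The sketch below is not part of the thesis code; the directory layout and folder names are assumptions based on the description in this section.

from pathlib import Path

# Assumed layout after unzipping the Kaggle "waste classification" data set:
# DATASET/TRAIN/O, DATASET/TRAIN/R, DATASET/TEST/O, DATASET/TEST/R
base = Path("DATASET")
for split in ("TRAIN", "TEST"):
    for cls in ("O", "R"):
        n_images = len(list((base / split / cls).glob("*.jpg")))
        print(f"{split}/{cls}: {n_images} images")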

4.1.3 Exploratory Data Analysis (EDA)


EDA is an approach used for data pre-processing. Before pre-processing, the dataset contains raw data with noise, irrelevant data, and missing values that should not be fed to the transfer learning algorithms.

1. Data validation: A sample of data withheld from the model's training is referred to as a validation dataset. Data validation must be performed before any other actions are taken on the data set.

2. Data cleaning: Noise and inconsistencies can be found in the downloaded Kaggle data set. The data set is filtered by eliminating the noise in the data and dropping null values.

3. Data augmentation: Data augmentation increases the training data, which helps improve the results. The image dataset undergoes different transformations, such as rescaling, shear range, zoom range, and horizontal flipping, to increase the size of the dataset.

4. Splitting of data: The selected dataset is split into two parts, training and testing, where 80% of the data is used for training and the remainder for testing. In both training and testing, batches of 32 images are used with categorical class mode and the same target size (224, 224). A code sketch of the augmentation and splitting steps is given after this list.
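The augmentation and splitting steps above can be expressed with Keras' ImageDataGenerator. The sketch below follows the settings stated in this chapter (rescaling, shear, zoom, horizontal flip, batch size 32, target size 224 x 224, categorical class mode); the directory paths and the exact shear and zoom ranges are assumptions.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation for the training images; shear/zoom values of 0.2 are assumed.
train_datagen = ImageDataGenerator(rescale=1. / 255, shear_range=0.2,
                                   zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)   # no augmentation for testing

# Assumed directory layout: DATASET/TRAIN and DATASET/TEST, each with o/ and r/ subfolders.
train_gen = train_datagen.flow_from_directory("DATASET/TRAIN", target_size=(224, 224),
                                              batch_size=32, class_mode="categorical")
test_gen = test_datagen.flow_from_directory("DATASET/TEST", target_size=(224, 224),
                                            batch_size=32, class_mode="categorical")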

4.1.4 Fitting the models


To conduct the experiment, VGG19, DenseNet121, ResNet50, and MobileNet are used. Although all the transfer learning algorithms have similar parameters, they differ in the number of convolution layers, learning rates, and error reduction.

VGG19: The VGG19 transfer learning algorithm is imported from "tf.keras.applications.vgg19". The weights for the neural network are taken from ImageNet and the model is executed for 20 epochs. Categorical cross-entropy is used as the loss function and the Adam optimizer is applied to reduce the error during the compilation step. Figure 4.2 below summarizes the layers and parameters; the learning rate is 1e-3.
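A hedged sketch of the transfer learning setup described above (VGG19 base with ImageNet weights, categorical cross-entropy, Adam with learning rate 1e-3, 20 epochs). The classification head and the frozen base are assumptions about details not spelled out in the text; the same pattern applies to the other three architectures with their respective base models and learning rates.

import tensorflow as tf
from tensorflow.keras.applications.vgg19 import VGG19

# Pre-trained VGG19 base with ImageNet weights; the original classifier is replaced
# by a small head for the two waste classes (the head layout is an assumption).
base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False   # assumed: only the new head is trained

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# train_gen and test_gen as defined in the sketch of Section 4.1.3.
history = model.fit(train_gen, validation_data=test_gen, epochs=20)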

Figure 4.2: VGG19 parameter summary



ResNet50: The ResNet50 transfer learning algorithm is imported from "tensorflow.keras.applications.resnet50". The weights for the neural network are taken from ImageNet and the model is executed for 20 epochs. Categorical cross-entropy is used as the loss function and the Adam optimizer is applied to reduce the error during the compilation step. Figure 4.3 below summarizes the layers and parameters; the learning rate is 3.2.

Figure 4.3: ResNet50 parameter summary



MobileNet: The MobileNet transfer learning algorithm is imported from "tf.keras.applications.mobilenet". Categorical cross-entropy is used as the loss function and the Adam optimizer is applied to reduce the error during the compilation step. Figure 4.4 below summarizes the layers and parameters; the learning rate is 1.66 x 10^-3.

Figure 4.4: MobileNet parameter summary



DenseNet121: The DenseNet121 transfer learning algorithm is imported from "tf.keras.applications.densenet". The weights for the neural network are taken from ImageNet and the model is executed for 20 epochs. The Adam optimizer is applied to reduce the error during the compilation step, and the learning rate is 0.003.

Figure 4.5: DenseNet121 parameter summary

4.1.5 Performance analysis of the architectures


The data set is split into training data and testing data, and the performance metric is obtained from the results of each transfer learning algorithm. Each algorithm's accuracy value is noted, and the algorithms are compared by accuracy on the test data to find the best transfer learning algorithm.
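As a sketch of this comparison step (not the thesis code), the trained models could be evaluated on the shared test set and ranked by accuracy as follows; the models dictionary and test generator are assumed to come from the previous sections.

def compare_architectures(models, test_gen):
    """Evaluate each trained Keras model on the shared test set and rank by accuracy.

    `models` is assumed to map architecture names (e.g. "VGG19") to the trained
    models from Section 4.1.4; `test_gen` is the test generator from Section 4.1.3.
    """
    results = {name: model.evaluate(test_gen, verbose=0)[1]
               for name, model in models.items()}
    best = max(results, key=results.get)
    return results, best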
Chapter 5
Results and Analysis

5.1 Prediction
This section contains the predictions of the CNN architectures after fitting the data
set into each architecture. For comparison, four architectures are used: DenseNet121,
ResNet50, VGG19, and MobileNet. Each architecture’s performance metrics were
recorded and compared in order to determine the most efficient architecture for
classifying organic and recyclable materials from solid waste images.

5.1.1 MobileNet Results


Figure 5.1 below shows the accuracy achieved by the MobileNet transfer learning algorithm.

Figure 5.1: MobileNet accuracy results


5.1.2 DenseNet121 Results


Figure 5.2 below shows the accuracy achieved by the DenseNet121 transfer learning algorithm.

Figure 5.2: DenseNet121 accuracy results

5.1.3 ResNet Results


Figure 5.3 below displays the accuracy that was attained using the ResNet50 transfer learning algorithm.

Figure 5.3: ResNet50 accuracy results



5.1.4 VGG19 Results


Figure 5.4 below displays the test loss and accuracy for the VGG19 transfer learning algorithm.

Figure 5.4: VGG19 accuracy results

5.1.5 Comparison Results


Each transfer learning algorithm independently produces a result following the execution of the algorithms. Among the available performance metrics, only accuracy was used for the comparison, since it gave the most informative results across the transfer learning algorithms. Each transfer learning algorithm underwent two trials, because executing the four algorithms takes considerable time due to the differences in approach and working environment between them. The average accuracy over the two trials is reported.

Figure 5.5: Line graph comparing the accuracy results of the transfer learning algorithms

5.2 Observation on Architectures


The results are more reliable when feature selection and data pre-processing are done well. Because we only used twenty epochs, the accuracy values generated are not constant from execution to execution; a higher number of epochs yields more stable performance metrics. After comparing the results produced, we concluded that among these four transfer learning algorithms, VGG19 produces the highest accuracy.
Chapter 6
Discussion

RQ: How effective are different CNN architectures such as DenseNet121, ResNet50,
VGG19, and MobileNet for classifying images of solid waste into two categories (or-
ganic and recyclable)?

Answer: An experiment was carried out, as described above, to answer the RQ. The primary goal of the research is to select the most efficient CNN architecture from the available architectures for classifying images of solid waste into two categories (organic and recyclable). We created models for several CNN architectures, including DenseNet121, ResNet50, VGG19, and MobileNet. A Kaggle data set is used; 5% of the data is utilised for validation, then data preprocessing is applied and the newly cleaned dataset is used for training and testing the models. To train and test the chosen architectures, 80% of the data, or 10104 images, is used for training, while 20%, or 2526 images (1409 organic images and 1117 recyclable images), is used for testing. We obtained the accuracy of each architecture after execution. Accuracy was chosen because it provides a useful performance measurement for a categorical target variable. DenseNet121 achieved 94.51% accuracy, ResNet50 93.6%, VGG19 97.5%, and MobileNet 92.73%. Compared to all other CNN architectures tested, VGG19 achieved the highest accuracy of 97.5%.

Chapter 7
Conclusions and Future Work

7.1 Conclusions
The goal of this study is to use transfer learning algorithms to classify waste into organic and recyclable categories from images of solid waste. Our data was gathered from Kaggle, an open-source data source, and EDA is used to analyze the data. After preprocessing, nearly 80% of the data is used for training and the remaining 20% for testing. To determine the best-performing algorithm, the thesis includes a comparative study of transfer learning algorithms: DenseNet121, ResNet50, VGG19, and MobileNet. The experimental results show that VGG19 is the best-performing algorithm, with an accuracy of 97.5%.

7.2 Future Work


Future work will focus on identifying algorithms with even greater accuracy, determining the best outcomes by training with more epochs on more powerful hardware, and using a larger data set to produce more precise results when classifying images into recyclable and organic categories.

References

[1] A. Demirbas, “Waste management, waste resource facilities and waste conversion
processes,” Energy Conversion and Management, vol. 52, no. 2, pp. 1280–1287,
2011.
[2] “Trends in Solid Waste Management.” https://datatopics.worldbank.org/what-
a-waste/trends_in_solid_waste_management.html (accessed May 28, 2022).
[3] G. E. Sakr, M. Mokbel, A. Darwich, M. N. Khneisser, and A. Hadi, “Comparing
deep learning and support vector machines for autonomous waste sorting,” in
2016 IEEE international multidisciplinary conference on engineering technology
(IMCET), 2016, pp. 207–212.
[4] M. Yang and G. Thung, “Classification of trash for recyclability status,” CS229
project report, vol. 2016, p. 3, 2016.
[5] R. Szeliski, Computer vision: algorithms and applications. Springer Science
Business Media, 2010.
[6] N. Nnamoko, J. Barrowclough, and J. Procter, “Solid Waste Image Classification
Using Deep Convolutional Neural Network,” Infrastructures, vol. 7, no. 4, p. 47,
2022.
[7] A. A. Kadir, N. W. Azhari, and S. N. Jamaludin, “An overview of organic waste
in composting,” in MATEC Web of Conferences, 2016, vol. 47, p. 05025.
[8] V. K. Vijay, R. Kapoor, A. Trivedi, and V. Vijay, “Biogas as clean fuel for cook-
ing and transportation needs in India,” in Advances in Bioprocess Technology,
Springer, 2015, pp. 257–275.
[9] O. US EPA, “Recycling Basics,” Apr. 16, 2013.
https://www.epa.gov/recycle/recycling-basics (accessed May 28, 2022).
[10] V. Ruiz, Á. Sánchez, J. F. Vélez, and B. Raducanu, “Automatic image-based
waste classification,” in International Work-Conference on the Interplay Between
Natural and Artificial Computation, 2019, pp. 422–431.
[11] W. Lu and J. Chen, “Computer vision for solid waste sorting: A critical review
of academic research,” Waste Management, vol. 142, pp. 29–43, 2022.
[12] A. A. Khan, A. A. Laghari, and S. A. Awan, “Machine learning in computer
vision: A review,” EAI Transactions on Scalable Information Systems, p. e4,
2021.
[13] J. Alzubi, A. Nayyar, and A. Kumar, “Machine learning from theory to algo-
rithms: an overview,” in Journal of physics: conference series, 2018, vol. 1142,
no. 1, p. 012012.


[14] P. Y. Glorennec, “Reinforcement learning: An overview,” in Proceedings European Symposium on Intelligent Techniques (ESIT-00), Aachen, Germany, 2000, pp. 14–15.
[15] Y. Yang, “Waste Classification Based On Yolov4,” JOURNAL OF SIMULA-
TION, vol. 9, no. 6, p. 79, 2021.
[16] G. S. Susanth, L. J. Livingston, and L. A. Livingston, “Garbage Waste Segre-
gation Using Deep Learning Techniques,” in IOP Conference Series: Materials
Science and Engineering, 2021, vol. 1012, no. 1, p. 012040.
[17] L. Wen, X. Li, X. Li, and L. Gao, “A new transfer learning based on VGG-
19 network for fault diagnosis,” in 2019 IEEE 23rd international conference on
computer supported cooperative work in design (CSCWD), 2019, pp. 205–209.
[18] S. Phiphiphatphaisit and O. Surinta, “Food image classification with improved
MobileNet architecture and data augmentation,” in Proceedings of the 2020
The 3rd International Conference on Information Science and System, 2020, pp.
51–56.
[19] “Waste Classification data.” https://www.kaggle.com/techsash/waste-
classification-data (accessed May 31, 2022).
[20] F. Q. Lauzon, “An introduction to deep learning,” in 2012 11th International
Conference on Information Science, Signal Processing and their Applications
(ISSPA), 2012, pp. 1438–1439.
[21] V. H. Phung and E. J. Rhee, “A high-accuracy model average ensemble of convolutional neural networks for classification of cloud image patches on small datasets,” Applied Sciences, vol. 9, no. 21, p. 4500, 2019.
[22] “Convolutional Neural Network,” DeepAI. May 2019. Accessed: Aug. 29,
2022. [Online]. Available: https://deepai.org/machine-learning-glossary-and-
terms/convolutional-neural-network
[23] D. Gyawali, A. Regmi, A. Shakya, A. Gautam, and S. Shrestha, “Comparative
analysis of multiple deep CNN models for waste classification,” arXiv preprint
arXiv:2004.02168, 2020.
[24] A. Patil, A. Tatke, N. Vachhani, M. Patil, and P. Gulhane, “Garbage Classifying
Application Using Deep Learning Techniques,” in 2021 International Conference
on Recent Trends on Electronics, Information, Communication & Technology
(RTEICT), 2021, pp. 122–130.
[25] J. V. Dillon et al., “Tensorflow distributions,” arXiv preprint arXiv:1711.10604,
2017.
[26] R. A. Aral, Ş. R. Keskin, M. Kaya, and M. Hacıömeroğlu, “Classification of
trashnet dataset based on deep learning models,” in 2018 IEEE International
Conference on Big Data (Big Data), 2018, pp. 2058–2062.
[27] E. Stevens, L. Antiga, and T. Viehmann, Deep learning with PyTorch. Manning
Publications, 2020.

[28] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.