
Real Waste Image Classification

Muskan Karodiya
Department of Artificial Intelligence and Data Science

karodiyamuskan@gmail.com

Abstract

With the global focus on sustainability growing stronger every day, the need for efficient waste classification has become increasingly urgent in both urban and rural settings. Traditional methods of manual waste segregation are not only time-consuming and labor-intensive but also prone to human error, making the process inefficient and inconsistent. This has paved the way for automated solutions, where artificial intelligence—particularly deep learning—can play a transformative role. In this research, we explore the capabilities of deep learning using Convolutional Neural Networks (CNNs) to automate and improve the accuracy of waste classification.

Our study conducts a comprehensive comparison of six different CNN architectures: MobileNetV2, ResNet50, DenseNet121, InceptionV3, Xception, and a Custom CNN model built from scratch. The goal is to evaluate how suitable each architecture is for multi-class waste image classification tasks. All models are trained and validated on a curated dataset that includes images from nine distinct waste categories: Cardboard, Food Organics, Glass, Metal, Miscellaneous Trash, Paper, Plastic, Textile Trash, and Vegetation.

To ensure a robust evaluation, we assess each model based on a range of performance metrics, including accuracy, precision, recall, F1-score, training time, and generalization ability. Our findings reveal that modern CNN architectures like MobileNetV2 and DenseNet121 stand out for their ability to strike a balance between high classification accuracy and efficient computation. These models achieve around 64% accuracy while maintaining strong generalization and acceptable training durations. In contrast, Xception and InceptionV3 also demonstrate competitive performance but require more time to train effectively. The Custom CNN model, although efficient in terms of execution speed and resource usage, trails behind in terms of predictive accuracy at approximately 54%. This gap highlights the limitations of manually designed models that do not benefit from transfer learning or pretrained knowledge. Despite its simplicity, the Custom CNN serves as a valuable baseline and offers insight into how handcrafted models compare with advanced, pretrained counterparts. Ultimately, our research emphasizes how crucial the choice of model architecture is when it comes to practical deployment in real-world waste management scenarios. By highlighting the strengths and trade-offs of each approach, this study lays the groundwork for future improvements in intelligent, AI-driven waste classification systems that can contribute meaningfully to sustainable environmental practices.

1. INTRODUCTION

The growing challenges of waste generation and mismanagement are putting serious strain on urban sustainability and public health systems. As cities become more crowded and consumption habits continue to rise, there's an urgent need for faster and more effective waste segregation. Unfortunately, traditional methods of sorting waste are slow, physically demanding, and often inaccurate—making them an unsustainable long-term solution. That's where artificial intelligence and computer vision come into play, offering powerful alternatives. In particular, Convolutional Neural Networks (CNNs) have shown remarkable success in image classification tasks and hold great promise for automating waste segregation.

CNNs mark a major leap forward from earlier computer vision techniques. Instead of relying on manually engineered features, CNNs learn directly from image data, automatically discovering useful patterns through layers of convolution, pooling, and non-linear activation. This capability has already revolutionized areas like facial recognition, medical diagnostics, and object detection—and it's now being applied to waste classification as well. By training on diverse and representative image datasets, CNNs can effectively distinguish between different types of waste, enabling smart, real-time segregation at the source.
Over time, various CNN architectures have been developed to improve accuracy, reduce training time, and optimize for different hardware constraints. Early models like AlexNet and VGG16 paved the way by showing how deeper networks and activation functions could significantly boost performance. Later models like ResNet introduced the concept of residual connections, which made training deeper networks much easier. Architectures such as MobileNetV2 and DenseNet121 further refined these advances, offering lightweight and efficient models suitable for mobile and embedded applications. Meanwhile, InceptionV3 and Xception used innovative techniques like parallel and depthwise separable convolutions to strike a balance between power and efficiency.

While pretrained models have become the go-to choice for many applications due to their robustness and transfer learning benefits, building a custom CNN from scratch can provide a tailored solution, fine-tuned to the specific needs of a task. In this study, we examine both approaches. We compare five advanced pretrained CNN models—MobileNetV2, ResNet50, DenseNet121, InceptionV3, and Xception—against a custom-designed CNN built specifically for waste classification.

All models are trained under consistent conditions using a curated dataset consisting of nine distinct waste categories. Beyond standard performance metrics like accuracy, precision, recall, and F1-score, we also evaluate each model's training time, generalization ability, and suitability for deployment in resource-limited environments such as smart bins or embedded waste sorting systems.

Our goal is to identify the most effective model for real-time waste classification—one that can help power smarter, cleaner, and more efficient waste management systems using deep learning.

2. RELATED WORK

The field of image classification has seen remarkable progress over the last decade, especially with the advent of Convolutional Neural Networks (CNNs). These deep learning models have revolutionized computer vision by automatically learning hierarchical features directly from pixel data—eliminating the need for hand-engineered features like edge detectors or color histograms. In recent years, CNNs have found applications across various domains, from autonomous vehicles to medical imaging. One domain where CNNs have shown particular promise is automated waste classification, a critical step toward sustainable waste management.

Early efforts in image-based waste recognition focused on traditional machine learning algorithms. These approaches relied on handcrafted features such as texture, shape, or color, which were fed into classifiers like Support Vector Machines (SVMs) or Decision Trees. While these models performed reasonably well on small datasets, they lacked the flexibility and robustness required for large-scale, real-world waste classification scenarios, where the variability in image quality, object positioning, and lighting conditions is high.

With the emergence of CNNs, researchers began applying deep learning models to waste classification tasks. AlexNet, one of the earliest deep networks to gain attention, set the foundation for modern CNNs by demonstrating how deeper models could significantly outperform traditional techniques on image datasets. Although computationally heavy, AlexNet remains a popular baseline in many comparative studies.

VGG16 followed, increasing depth and using small 3×3 filters to keep the architecture simple yet effective. Despite its high parameter count, VGG16 gained popularity for its transfer learning capabilities. In the waste classification context, VGG16 has been applied with pre-trained weights and fine-tuning on datasets such as TrashNet and TACO (Trash Annotations in Context), showing promising results in separating recyclable and organic waste.
ResNet50 introduced the concept of residual connections, which allow gradients to flow more easily in deeper networks, addressing the vanishing gradient problem. It quickly became a go-to architecture for complex image classification tasks. ResNet's ability to generalize well and its ease of training have made it a strong candidate in environmental and recycling applications, including real-time waste sorting.

MobileNetV2 is another widely adopted model designed for mobile and embedded applications. It achieves a balance between speed and accuracy using depthwise separable convolutions and linear bottlenecks. Its lightweight design makes it suitable for integration into low-power devices like smart bins or handheld scanners used for classifying household waste.

DenseNet121 pushes the boundary further by connecting each layer to every other layer in a feed-forward fashion. This encourages feature reuse and leads to improved efficiency and generalization with fewer parameters. In waste image classification, DenseNet has shown robust performance in dealing with intra-class variability, such as distinguishing between clean and contaminated paper or different types of plastic.

InceptionV3 and Xception also introduced architectural innovations that use parallel and separable convolutions to extract rich features while maintaining computational efficiency. These models have been explored in waste-related applications where speed and scalability are important, such as city-wide smart waste management systems.

Lastly, custom CNNs—though typically simpler—are valuable for experimenting with architecture design and tailoring a model to specific dataset characteristics. They allow researchers to find a sweet spot between complexity, accuracy, and execution time. However, custom networks often fall short of state-of-the-art accuracy without extensive tuning and large-scale data.

Our work builds upon these developments by conducting a comparative study of five well-established pretrained CNN architectures and one custom-designed CNN, all trained and evaluated on a nine-class real waste image dataset. By benchmarking these models under consistent experimental conditions, this study contributes fresh insights into which architectures are best suited for practical deployment in automated waste classification systems.

3. METHODOLOGY

This chapter explains how the models were developed, trained, and evaluated for the task of waste image classification. The approach was designed to ensure consistency, fairness, and reliable comparison across all six CNN architectures: MobileNetV2, ResNet50, DenseNet121, Xception, InceptionV3, and a Custom CNN built from scratch.

3.1 Dataset Description

The dataset used in this project consists of high-quality RGB images of real-world waste items. These images are distributed across nine classes: Cardboard, Food Organics, Glass, Metal, Miscellaneous Trash, Paper, Plastic, Textile Trash, and Vegetation. Each category includes a mix of indoor and outdoor photos with different lighting conditions and backgrounds, making the classification task more realistic and challenging.

To prepare the images for model training, each image was resized to 160×160 pixels—a resolution that balances the trade-off between retaining meaningful visual details and reducing computational load. All images were stored in a structured directory format, where each subfolder represented a class.
3.2 Data Preprocessing and Augmentation

Before feeding images into the CNN models, preprocessing steps were applied to standardize and enhance the dataset:

- Normalization: Each image's pixel values were scaled to the [0, 1] range by dividing by 255. This normalization helps models converge faster during training.
- Data Augmentation: To prevent overfitting and improve generalization, image augmentation was applied using random transformations:
  - Horizontal flipping
  - Small zoom variations
  - Random rotations
  - Minor brightness and contrast changes

These augmentations increased the diversity of the training data without requiring new image collection.
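The paper does not include code for this step; the following is a minimal sketch of such a pipeline using tf.keras preprocessing layers. The specific transformation ranges (zoom, rotation, brightness, contrast) are illustrative assumptions, not values reported here.

import tensorflow as tf
from tensorflow.keras import layers

# Rescale pixel values from [0, 255] to [0, 1], as described above.
normalization = layers.Rescaling(1.0 / 255)

# Random transformations applied only during training. The ranges
# below are illustrative assumptions, not values from the paper.
augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),                   # horizontal flipping
    layers.RandomZoom(0.1),                            # small zoom variations
    layers.RandomRotation(0.05),                       # random rotations
    layers.RandomBrightness(0.1, value_range=(0, 1)),  # minor brightness changes
    layers.RandomContrast(0.1),                        # minor contrast changes
])

def preprocess(image, label, training=False):
    # Normalize every image; augment only the training stream.
    image = normalization(image)
    if training:
        image = augmentation(image, training=True)
    return image, label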
3.3 Train-Validation Split

The dataset was divided into two subsets:

- Training Set (80%) – Used to train the models and learn feature patterns.
- Validation Set (20%) – Used to tune hyperparameters and monitor model performance on unseen data during training.

This split ensured that each class was proportionally represented in both sets.
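As a rough illustration, an 80/20 split of a class-per-subfolder image directory can be produced with the Keras dataset utility. The directory name and seed below are assumptions, and note that this built-in split is random rather than strictly stratified, so an exactly class-proportional split would need extra handling.

import tensorflow as tf

IMG_SIZE = (160, 160)      # resolution used in this study
BATCH_SIZE = 32            # batch size from Section 3.5
DATA_DIR = "real_waste/"   # hypothetical path: one subfolder per class

# 80/20 split; using the same seed keeps the two subsets disjoint.
train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR,
    validation_split=0.2,
    subset="training",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    label_mode="categorical",   # one-hot labels for categorical crossentropy
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR,
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    label_mode="categorical",
)

class_names = train_ds.class_names   # the nine waste categories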

3.4 Model Architectures

Six CNN models were evaluated in this project. Five were based on well-established pretrained architectures, while one was a custom-built network.

- MobileNetV2: Designed for speed and efficiency, it uses depthwise separable convolutions and inverted residuals. Its lightweight nature makes it ideal for deployment on edge devices like smart bins.
- ResNet50: A 50-layer deep model using residual connections to solve the vanishing gradient problem. It's known for stability in training and strong generalization.
- DenseNet121: Connects each layer to every other layer in a feed-forward manner, which encourages feature reuse and reduces the number of parameters.
- Xception: Extends the Inception idea with depthwise separable convolutions. It captures complex spatial relationships while remaining computationally efficient.
- InceptionV3: Employs multiple types of convolution filters in parallel to extract diverse features at different scales. It's particularly effective in complex visual environments.
- Custom CNN: This model was built from scratch using basic Conv2D and MaxPooling2D layers. It consists of multiple convolutional blocks followed by dense layers and a softmax output. It serves as a baseline to compare the performance of popular pretrained networks with a simpler, custom-designed model.
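The exact layer configurations are not listed in the paper; the sketch below shows one plausible way to build both model families in Keras. The ImageNet weights, frozen base, classification head (pooling, dropout, softmax), and the custom model's filter counts are all assumptions.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 9
INPUT_SHAPE = (160, 160, 3)

def build_pretrained(base_fn):
    # base_fn is any of tf.keras.applications.MobileNetV2, ResNet50,
    # DenseNet121, Xception, or InceptionV3.
    base = base_fn(input_shape=INPUT_SHAPE, include_top=False,
                   weights="imagenet")
    base.trainable = False   # freeze the pretrained feature extractor
    return models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.2),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def build_custom_cnn():
    # From-scratch baseline: stacked Conv2D/MaxPooling2D blocks,
    # then dense layers and a softmax output (filter counts assumed).
    return models.Sequential([
        layers.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

model = build_pretrained(tf.keras.applications.MobileNetV2)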
3.5 Model Compilation and Training Strategy

Each model was compiled with the following configuration:

- Optimizer: Adam (adaptive learning rate)
- Loss Function: Categorical Crossentropy
- Evaluation Metric: Accuracy

Callbacks Used:

- EarlyStopping: Monitors validation loss and halts training if there's no improvement after a set number of epochs.
- ReduceLROnPlateau: Lowers the learning rate if the model's performance plateaus, helping it escape local minima.

Hyperparameters:

- Epochs: 10–30 (varied per model based on convergence behavior)
- Batch Size: 32
- Initial Learning Rate: 1e-4

All models were trained on a GPU-enabled Colab environment for faster computation.
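Continuing the sketches above (reusing model, preprocess, train_ds, and val_ds), the compile-and-fit step might look like the following. The patience values, learning-rate factor, and the fixed epoch count are assumptions; the paper only names the callbacks and the 10–30 epoch range.

from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=1e-4),   # initial learning rate, per above
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

callbacks = [
    # Stop when validation loss stops improving; patience values assumed.
    EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
    # Lower the learning rate when validation loss plateaus.
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

history = model.fit(
    train_ds.map(lambda x, y: preprocess(x, y, training=True)),
    validation_data=val_ds.map(preprocess),
    epochs=30,            # the study varied 10-30 epochs per model
    callbacks=callbacks,
)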
3.6 Evaluation Metrics

Model performance was evaluated using a range of metrics:

- Accuracy: Percentage of correct predictions across all classes.
- Precision, Recall, and F1-Score: Provide a balanced view of performance, especially in multi-class classification.
- Confusion Matrix: Visualizes how predictions are distributed among classes.
- Validation Loss: Used to check overfitting.
- Training Time: Compared to assess efficiency.
- AUC (for selected models): Evaluated the model's ability to separate classes.
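The per-class metrics and confusion matrix can be computed from validation-set predictions with scikit-learn; a sketch (reusing model, val_ds, and class_names from the earlier snippets) follows.

import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Gather predicted and true class indices over the validation set.
y_true, y_pred = [], []
for images, labels in val_ds:
    probs = model.predict(images, verbose=0)
    y_pred.extend(np.argmax(probs, axis=1))
    y_true.extend(np.argmax(labels.numpy(), axis=1))   # labels are one-hot

# Per-class precision, recall, and F1-score.
print(classification_report(y_true, y_pred, target_names=class_names))

# Rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred))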
 Custom CNN trained quickly and
3.7 Summary

This methodology was carefully planned to ensure that each model was evaluated under fair and consistent conditions. By applying the same preprocessing, training environment, and performance metrics, the comparative results obtained are both valid and meaningful. This approach helps uncover which architectures are most suitable for real-world waste classification systems and highlights the trade-offs between model complexity and performance.

4. RESULTS & DISCUSSION

This chapter presents and analyzes the performance of all six CNN models—MobileNetV2, ResNet50, DenseNet121, Xception, InceptionV3, and the Custom CNN—trained and evaluated for the real-world task of waste image classification. Each model was assessed using consistent metrics to ensure a fair comparison across different architectures.

4.1 Training and Validation Performance

The training and validation accuracy trends across all models offer insights into learning behavior and generalization:

- MobileNetV2 demonstrated high training accuracy (~84%) and achieved a respectable validation accuracy of ~63%. It showed fast convergence with stable validation loss, reflecting good generalization and efficient learning, especially considering its lightweight design.
- ResNet50 had a slow start and improved only gradually, reaching training accuracy of ~20% and validation accuracy of ~22%. Due to the limited number of epochs (10), it underperformed compared to the others; it needs longer training or fine-tuning for deeper convergence.
- DenseNet121 steadily progressed to a final training accuracy of ~80% and validation accuracy of ~64%. It performed consistently across epochs and showed low variance between training and validation, indicating minimal overfitting.
- Xception achieved strong learning with training accuracy of ~74% and validation accuracy of ~60%. The model handled complex features well and benefited from depthwise separable convolutions.
- InceptionV3 showed balanced training (69%) and validation (57%) accuracies. It generalized reasonably well and maintained a low training-to-validation loss difference.
- Custom CNN trained quickly and achieved training accuracy of ~73%, but validation accuracy remained around ~55%. Though it generalized less effectively, it still performed adequately for a model built from scratch.
4.2 Classification Report Analysis

The classification reports (precision, recall, F1-score) highlighted class-wise model effectiveness:

- Vegetation, Glass, and Food Organics were consistently the easiest to classify across all models, achieving high F1-scores (e.g., up to 0.94 for Vegetation in DenseNet121 and Xception).
- Plastic, Miscellaneous Trash, and Textile Trash were comparatively harder to classify. F1-scores for these categories were lower due to their visual similarity to other classes and intra-class variability.
- Models like DenseNet121 and MobileNetV2 maintained a good balance of precision and recall, while the Custom CNN showed high variability across classes, with relatively lower precision for minority classes.

4.3 Confusion Matrix Insights

Confusion matrices revealed specific confusion trends:

- Plastic vs. Paper and Metal vs. Miscellaneous Trash were commonly confused pairs. These results highlight the need for finer texture- or shape-based feature extraction, which modern CNNs like DenseNet and Xception handled better.
- Vegetation and Glass maintained distinct features, leading to high classification accuracy and fewer misclassifications.

4.4 Comparison of Evaluation Metrics

Model        | Accuracy | Precision | Recall | F1-Score | Notable Strengths
MobileNetV2  | 63%      | ~0.63     | ~0.62  | ~0.63    | Good balance, fast inference
ResNet50     | 22%      | Low       | Low    | Low      | Needs more epochs
DenseNet121  | 64%      | ~0.67     | ~0.64  | ~0.65    | Robust, balanced performance
Xception     | 60%      | ~0.63     | ~0.60  | ~0.60    | High on complex textures
InceptionV3  | 60%      | ~0.62     | ~0.60  | ~0.60    | Good depth and multiscale filters
Custom CNN   | 54%      | ~0.54     | ~0.54  | ~0.56    | Fast training, decent baseline

These results affirm that deeper and more recent CNNs like DenseNet and Xception offer reliable accuracy while preserving model efficiency.

4.5 Training Time and Efficiency

- Custom CNN had the fastest training time (~10 mins) and the lowest complexity, making it ideal for scenarios where quick model deployment is needed.
- MobileNetV2 and InceptionV3 were also relatively efficient, completing training in under 25 minutes with strong results.
- ResNet50 and DenseNet121 took longer due to deeper networks and slower convergence.
- Xception had one of the longest training times (~40–50 mins), but this was justified by its relatively high performance.

4.6 Summary of Observations

- Best overall performer: DenseNet121—highest accuracy with balanced performance across all metrics.
- Best lightweight model: MobileNetV2—offers competitive performance with low memory and computation needs.
- Most promising custom model: Custom CNN—decent results and fast training; could improve further with architectural tuning.
- Most stable classifier: Xception—consistent across all classes.
- Underperformer: ResNet50 (in the current setup with only 10 epochs); likely to improve significantly with more training.
5. CONCLUSION
The rapid growth of environmental waste and the urgent need for sustainable waste management systems have emphasized the importance of automated waste classification. This project explored the capabilities of Convolutional Neural Networks (CNNs) in categorizing real-world waste images into distinct classes. By comparing six different CNN architectures—five pretrained models (MobileNetV2, ResNet50, DenseNet121, Xception, InceptionV3) and one custom-designed CNN—we evaluated their effectiveness in identifying various types of waste, ranging from biodegradable materials to recyclable items.

Through careful experimentation, it became evident that pretrained models offer clear advantages in terms of accuracy, generalization, and class-wise consistency. Among them, DenseNet121 emerged as the top performer, maintaining a strong balance between accuracy (~64%), precision, recall, and robustness. MobileNetV2, while slightly behind in raw accuracy, provided the best trade-off between computational efficiency and prediction reliability, making it suitable for mobile or embedded applications. Similarly, Xception and InceptionV3 demonstrated robust classification capabilities, particularly for visually distinct classes like glass and vegetation.

The Custom CNN, built entirely from scratch without leveraging pretrained weights, demonstrated encouraging results with a validation accuracy of ~54%. Despite being simpler and faster to train, it struggled to generalize across all categories, especially those with visual similarities. However, it still highlighted the potential of tailored architectures when computational resources are limited.

Throughout the study, we observed that classes like Vegetation, Glass, and Food Organics were consistently easier to classify, while Textile Trash, Plastic, and Miscellaneous Trash posed challenges due to intra-class variations and overlapping features. Confusion matrices and classification reports provided deeper insights into these class-specific behaviors.

From a broader perspective, this research reaffirms the value of deep learning—especially CNNs—in addressing environmental problems like waste segregation. It also emphasizes the significance of using balanced datasets, adequate training epochs, and appropriate architectural choices to optimize model performance.

Key Takeaways:

- Pretrained models outperform custom models in classification accuracy, especially with limited data.
- DenseNet121 and MobileNetV2 offer the best performance-to-efficiency ratios.
- The architecture of CNNs significantly impacts their ability to distinguish between visually similar waste categories.
- Lightweight models like MobileNetV2 are promising for real-time deployment in smart bins or mobile apps.

Future Scope:

This work lays the groundwork for more advanced developments in waste image classification. Future improvements could include:

- Fine-tuning pretrained models with transfer learning techniques.
- Incorporating attention mechanisms to focus on localized image features.
- Expanding the dataset to include more diverse waste classes and image conditions.
- Exploring ensemble models to combine the strengths of multiple CNN architectures.

Ultimately, integrating such AI-powered waste classifiers into real-world systems can significantly improve waste sorting, boost recycling efforts, and contribute to smarter urban infrastructure.
6. REFERENCES

1. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NeurIPS), 25, 1097–1105. https://papers.nips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
2. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (ICLR). https://arxiv.org/abs/1409.1556
3. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. CVPR, 770–778. https://doi.org/10.1109/CVPR.2016.90
4. Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. CVPR, 4700–4708. https://doi.org/10.1109/CVPR.2017.243
5. Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. CVPR, 1251–1258. https://doi.org/10.1109/CVPR.2017.195
6. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. CVPR, 2818–2826. https://doi.org/10.1109/CVPR.2016.308
7. Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. International Conference on Machine Learning (ICML), 6105–6114. https://arxiv.org/abs/1905.11946
8. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. CVPR, 4510–4520. https://doi.org/10.1109/CVPR.2018.00474
9. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), 1929–1958. https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
10. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. CVPR, 248–255. https://doi.org/10.1109/CVPR.2009.5206848