ANN ConferencePaper
Muskan Karodiya
Department of Artificial Intelligence and Data Science
karodiyamuskan@gmail.com
Six CNN models were evaluated in this project. Five were based on well-established pretrained architectures, while one was a custom-built network.

- MobileNetV2: Designed for speed and efficiency, it uses depthwise separable convolutions and inverted residuals. Its lightweight nature makes it ideal for deployment on edge devices like smart bins.
- ResNet50: A 50-layer deep model using residual connections to solve the vanishing gradient problem. It is known for stability in training and strong generalization.
- DenseNet121: Connects each layer to every other layer in a feed-forward fashion, encouraging feature reuse and efficient gradient flow.

Training callbacks:

- EarlyStopping: Monitors validation loss and halts training if there is no improvement after a set number of epochs.
- ReduceLROnPlateau: Lowers the learning rate if the model's performance plateaus, helping it escape local minima.

Hyperparameters:

- Epochs: 10-30 (varied per model based on convergence behavior)
- Batch Size: 32
- Initial Learning Rate: 1e-4

All models were trained on a GPU-enabled Colab environment for faster computation.
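To make this setup concrete, the following is a minimal training sketch in Keras under the configuration described above (batch size 32, initial learning rate 1e-4, EarlyStopping and ReduceLROnPlateau callbacks). The directory paths, classifier head, patience values, and dataset names are illustrative assumptions, not details taken from the original experiments:

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Hypothetical directory layout: one subfolder per waste class
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=(224, 224), batch_size=32,
    label_mode="categorical")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/val", image_size=(224, 224), batch_size=32,
    label_mode="categorical")
num_classes = len(train_ds.class_names)

# Pretrained backbone (frozen) with a small classification head
base = MobileNetV2(include_top=False, weights="imagenet",
                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False
model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects [-1, 1]
    base,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(num_classes, activation="softmax"),
])

# Initial learning rate of 1e-4, as listed above
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

callbacks = [
    # Halt training when validation loss stops improving
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    # Lower the learning rate when validation loss plateaus
    ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=3),
]

history = model.fit(train_ds, validation_data=val_ds,
                    epochs=30, callbacks=callbacks)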
3.6 Evaluation Metrics

Model performance was evaluated using a range of metrics (a scikit-learn sketch follows this list):

- Accuracy: Percentage of correct predictions across all classes.
- Precision, Recall, and F1-Score: Provide a balanced view of performance, especially in multi-class classification.
- Confusion Matrix: Visualizes how predictions are distributed among classes.
- Validation Loss: Used to check for overfitting.
- Training Time: Compared to assess efficiency.
- AUC (for selected models): Evaluates the model's ability to separate classes.
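As a sketch of how these metrics can be computed, assuming scikit-learn and arrays of true labels and predicted class probabilities gathered from the validation set (the function and argument names are illustrative):

import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)

def evaluate(y_true: np.ndarray, y_prob: np.ndarray) -> None:
    """Report the metrics of Section 3.6.

    y_true: integer class labels, shape (n_samples,)
    y_prob: predicted probabilities, shape (n_samples, n_classes),
            e.g. collected from model.predict on the validation set
    """
    y_pred = y_prob.argmax(axis=1)
    print("Accuracy:", accuracy_score(y_true, y_pred))
    # Per-class precision, recall, and F1-score
    print(classification_report(y_true, y_pred))
    # Rows are true classes, columns are predicted classes
    print(confusion_matrix(y_true, y_pred))
    # One-vs-rest AUC for the multi-class setting (selected models)
    print("AUC:", roc_auc_score(y_true, y_prob, multi_class="ovr"))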
3.7 Summary

This methodology was carefully planned to ensure that each model was evaluated under fair and consistent conditions. By applying the same preprocessing, training environment, and performance metrics, the comparative results obtained are both valid and meaningful. This approach helps uncover which architectures are most suitable for real-world waste classification systems and highlights the trade-offs between model complexity and performance.

4. RESULTS & DISCUSSION

This section presents and analyzes the performance of all six CNN models—MobileNetV2, ResNet50, DenseNet121, Xception, InceptionV3, and the Custom CNN—trained and evaluated for the real-world task of waste image classification. Each model was assessed using consistent metrics to ensure a fair comparison across different architectures.
4.1 Training and Validation Performance

The training and validation accuracy trends across all models offer insights into learning behavior and generalization (a plotting sketch follows this list):

- MobileNetV2 demonstrated high training accuracy (~84%) and achieved a respectable validation accuracy of ~63%. It showed fast convergence with stable validation loss, reflecting good generalization and efficient learning, especially considering its lightweight design.
- ResNet50 had a slow start but gradually improved, reaching a training accuracy of ~20% and a validation accuracy of ~22%. However, due to the limited number of epochs (10), it underperformed compared to the others. It needs longer training or fine-tuning for deeper convergence.
- DenseNet121 progressed steadily to a final training accuracy of ~80% and validation accuracy of ~64%. It performed consistently across epochs and showed low variance between training and validation, indicating minimal overfitting.
- Xception achieved strong learning with a training accuracy of ~74% and validation accuracy of ~60%. The model handled complex features well and benefited from depthwise separable convolutions.
- InceptionV3 showed balanced training (69%) and validation (57%) accuracies. It generalized reasonably well and maintained a low training-to-validation loss difference.
- Custom CNN trained quickly and achieved a training accuracy of ~73%, but validation accuracy remained around ~55%. Though it generalized less effectively, it still performed adequately for a model built from scratch.
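These trends can be read directly off the Keras History object returned by model.fit; a minimal plotting sketch, where the history variable follows the earlier training snippet:

import matplotlib.pyplot as plt

def plot_accuracy(history) -> None:
    # history.history stores the per-epoch metrics recorded by model.fit
    plt.plot(history.history["accuracy"], label="training")
    plt.plot(history.history["val_accuracy"], label="validation")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.legend()
    plt.show()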
4.2 Classification Report Analysis

The classification reports (precision, recall, F1-score) highlighted class-wise model effectiveness:

- Vegetation, Glass, and Food Organics were consistently the easiest to classify across all models, achieving high F1-scores (e.g., up to 0.94 for Vegetation in DenseNet121 and Xception).
- Plastic, Miscellaneous Trash, and Textile Trash were comparatively harder to classify. F1-scores for these categories were lower due to their visual similarity to other classes and intra-class variability.
- Models like DenseNet121 and MobileNetV2 maintained a good balance of precision and recall, while the Custom CNN showed high variability across classes, with relatively lower precision for minority classes.

4.3 Confusion Matrix Insights

Confusion matrices revealed specific confusion trends (a plotting sketch follows this list):

- Plastic vs. Paper and Metal vs. Miscellaneous Trash were commonly confused pairs. These results highlight the need for finer texture- or shape-based feature extraction, which modern CNNs like DenseNet and Xception handled better.
- Vegetation and Glass maintained distinct features, leading to high classification accuracy and fewer misclassifications.
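Such confusion pairs are easiest to spot when the matrix is rendered as a normalized heatmap; a small sketch assuming scikit-learn, with the class names below taken from the text as a hypothetical label set:

import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Hypothetical label set; replace with the dataset's actual categories
CLASSES = ["Plastic", "Paper", "Metal", "Glass", "Vegetation",
           "Food Organics", "Textile Trash", "Miscellaneous Trash"]

def show_confusion(y_true, y_pred) -> None:
    # Normalizing by row (true class) makes per-class error rates comparable
    ConfusionMatrixDisplay.from_predictions(
        y_true, y_pred, display_labels=CLASSES,
        normalize="true", xticks_rotation=45, cmap="Blues")
    plt.tight_layout()
    plt.show()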
MobileNetV2 and InceptionV3 were also
4.4 Comparison of Evaluation Metrics relatively efficient, completing training in
F1- Notable under 25 minutes with strong results.
Accura Precisi Reca ResNet50 and DenseNet121 took longer
Model Sco Strength
cy on ll due to deeper networks and slower
re s
convergence.
Good Xception had one of the longest training
MobileNet ~0.6 ~0.6 balance, times (~40–50 mins), but this was justified
63% ~0.63 by its relatively high performance.
V2 3 2 fast
inference
4.6 Summary of Observations
Needs
ResNet50 22% Low Low Low more Best overall performer: DenseNet121—
epochs highest accuracy with balanced
performance across all metrics.
Robust, Best lightweight model: MobileNetV2—
DenseNet1 ~0.6 ~0.6 balanced offers competitive performance with low
64% ~0.65 memory and computation needs.
21 7 4 performa
Most promising custom model: Custom
nce
CNN—decent results and fast training;
High on could improve further with architectural
~0.6 ~0.6 tuning.
Xception 60% ~0.60 complex
3 0 Most stable classifier: Xception—
textures
consistent across all classes.
Underperformer: ResNet50 (in current challenges due to intra-class variations and
setup with only 10 epochs); likely to overlapping features. Confusion matrices and
improve significantly with more training. classification reports provided deeper insights into
these class-specific behaviors.
5. CONCLUSION

The rapid growth of environmental waste and the urgent need for sustainable waste management systems have emphasized the importance of automated waste classification. This project explored the capabilities of Convolutional Neural Networks (CNNs) in categorizing real-world waste images into distinct classes. By comparing six different CNN architectures—five pretrained models (MobileNetV2, ResNet50, DenseNet121, Xception, InceptionV3) and one custom-designed CNN—we evaluated their effectiveness in identifying various types of waste, ranging from biodegradable materials to recyclable items.

Through careful experimentation, it became evident that pretrained models offer clear advantages in terms of accuracy, generalization, and class-wise consistency. Among them, DenseNet121 emerged as the top performer, maintaining a strong balance between accuracy (~64%), precision, recall, and robustness. MobileNetV2, while slightly behind in raw accuracy, provided the best trade-off between computational efficiency and prediction reliability, making it suitable for mobile or embedded applications. Similarly, Xception and InceptionV3 demonstrated robust classification capabilities, particularly for visually distinct classes like glass and vegetation.

The Custom CNN, built entirely from scratch without leveraging pretrained weights, demonstrated encouraging results with a validation accuracy of ~54%. Despite being simpler and faster to train, it struggled to generalize across all categories, especially those with visual similarities. However, it still highlighted the potential of tailored architectures when computational resources are limited.

Throughout the study, we observed that classes like Vegetation, Glass, and Food Organics were consistently easier to classify, while Textile Trash, Plastic, and Miscellaneous Trash posed challenges due to intra-class variations and overlapping features. Confusion matrices and classification reports provided deeper insights into these class-specific behaviors.

From a broader perspective, this research reaffirms the value of deep learning—especially CNNs—in addressing environmental problems like waste segregation. It also emphasizes the significance of using balanced datasets, adequate training epochs, and appropriate architectural choices to optimize model performance.

Key Takeaways:

- Pretrained models outperform custom models in classification accuracy, especially with limited data.
- DenseNet121 and MobileNetV2 offer the best performance-to-efficiency ratios.
- The architecture of a CNN significantly impacts its ability to distinguish between visually similar waste categories.
- Lightweight models like MobileNetV2 are promising for real-time deployment in smart bins or mobile apps.

Future Scope:

This work lays the groundwork for more advanced developments in waste image classification. Future improvements could include:

- Fine-tuning pretrained models with transfer learning techniques (see the sketch after this list).
- Incorporating attention mechanisms to focus on localized image features.
- Expanding the dataset to include more diverse waste classes and image conditions.
- Exploring ensemble models to combine the strengths of multiple CNN architectures.
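As one example of the first item, a minimal fine-tuning sketch, reusing the base, model, and callbacks names from the earlier training snippet (the number of unfrozen layers and the lower learning rate are illustrative assumptions):

import tensorflow as tf

# Unfreeze only the top of the pretrained backbone
base.trainable = True
for layer in base.layers[:-30]:  # assumed cutoff: last 30 layers trainable
    layer.trainable = False

# Recompile with a lower learning rate so pretrained weights shift gently
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])

history_ft = model.fit(train_ds, validation_data=val_ds,
                       epochs=10, callbacks=callbacks)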
Ultimately, integrating such AI-powered waste classifiers into real-world systems can significantly improve waste sorting, boost recycling efforts, and contribute to smarter urban infrastructure.
6. REFERENCES

1. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems (NeurIPS), 25, 1097-1105. https://papers.nips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
2. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (ICLR). https://arxiv.org/abs/1409.1556
3. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. https://doi.org/10.1109/CVPR.2016.90
4. Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. CVPR, 4700-4708. https://doi.org/10.1109/CVPR.2017.243
5. Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. CVPR, 1251-1258. https://doi.org/10.1109/CVPR.2017.195
6. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. CVPR, 2818-2826. https://doi.org/10.1109/CVPR.2016.308
7. Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. International Conference on Machine Learning (ICML), 6105-6114. https://arxiv.org/abs/1905.11946
8. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. CVPR, 4510-4520. https://doi.org/10.1109/CVPR.2018.00474
9. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), 1929-1958. https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
10. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. CVPR, 248-255. https://doi.org/10.1109/CVPR.2009.5206848