CV Expl 21070126001
NET Pytorch
Aditya Pande
21070126001
AIML A1
Computer Vision Experiential Learning
Introduction to Image Segmentation
• Image segmentation is a fundamental task in computer vision, playing a pivotal role in extracting
meaningful information from images by dividing them into semantically coherent regions. Unlike
object detection, which identifies and localizes objects within an image, segmentation goes a step
further by precisely outlining the boundaries of individual objects or regions. This process is critical
for various applications, ranging from medical imaging and autonomous vehicles to augmented
reality and content-based image retrieval.
Significance in Computer Vision
Applications:
1. Object Recognition and Tracking:
- Image segmentation separates each object from its surroundings, which supports precise recognition and tracking of
objects in real-time video streams.
2. Medical Imaging:
- In medical fields, segmentation aids in the accurate delineation of structures and organs, assisting in diagnosis,
treatment planning, and monitoring of diseases.
3. Autonomous Vehicles:
- For autonomous vehicles, accurate segmentation is crucial for understanding the surrounding environment, identifying
road lanes, pedestrians, and other vehicles.
4. Augmented Reality:
- In augmented reality applications, segmentation helps distinguish between the foreground and background, allowing
virtual elements to seamlessly interact with the real world.
Explanation of the Project
• In our project, we focus on image segmentation using the Cityscapes dataset, which contains labeled urban scenes
captured from vehicles in Germany. The dataset provides a challenging yet realistic environment for testing and
evaluating segmentation techniques. Our project involves implementing several image segmentation methods:
traditional techniques such as thresholding and clustering algorithms, as well as deep learning models such as
U-Net and Mask R-CNN.
• One aspect of our project involves the application of clustering algorithms such as K-means and DBSCAN to
segment images. These algorithms group pixels based on similarities in color, allowing us to explore their
effectiveness in extracting meaningful regions from the dataset. We will compare the results of the clustering
algorithms with those of thresholding and deep learning methods to understand their respective advantages and
limitations.
• Our evaluation will not only focus on visual comparisons but will also include quantitative assessments using metrics
such as Intersection over Union (IoU) and Dice Coefficient. Both metrics measure the overlap between a predicted
segmentation and the ground-truth labels, supporting a more complete analysis of each method's performance.
• Additionally, our project aims to explore the trade-offs between traditional and deep learning approaches, taking into
consideration factors such as computational efficiency, robustness to variations, and interpretability. By conducting
this analysis, we seek to contribute insights into the effectiveness of different segmentation techniques, offering a
holistic understanding of the challenges associated with image segmentation in complex urban environments.
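The two evaluation metrics named above can be illustrated with a minimal sketch. For a single class, IoU is the size of the intersection of the predicted and ground-truth regions divided by the size of their union, and the Dice Coefficient is twice the intersection divided by the sum of the two region sizes. The function name `iou_and_dice`, the toy label maps, and the convention of scoring 1.0 when a class is absent from both maps are assumptions made for illustration, not details taken from the project code.

```python
import numpy as np

def iou_and_dice(pred, target, class_id):
    """Return (IoU, Dice) for one class, given two integer label maps of equal shape."""
    pred_mask = np.asarray(pred) == class_id
    target_mask = np.asarray(target) == class_id
    intersection = np.logical_and(pred_mask, target_mask).sum()
    union = np.logical_or(pred_mask, target_mask).sum()
    total = pred_mask.sum() + target_mask.sum()
    # Assumed convention: a class absent from both maps scores 1.0.
    iou = intersection / union if union > 0 else 1.0
    dice = 2 * intersection / total if total > 0 else 1.0
    return float(iou), float(dice)

# Toy 2x3 label maps (hypothetical data, not from Cityscapes).
pred = np.array([[0, 1, 1], [0, 1, 0]])
target = np.array([[0, 1, 0], [0, 1, 1]])
iou, dice = iou_and_dice(pred, target, class_id=1)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

For a multi-class dataset, one common choice is to compute these per class and average them, although the averaging scheme used in the project is not stated here.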
Literature Review on Image Segmentation:
1. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- Authors: Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla
- Summary: The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature
map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to
perform non-linear upsampling.
2. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation
- Authors: Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi
- Summary: Proposes an approach to 3D image segmentation based on a volumetric, fully convolutional neural network.
3. A Study on Image Segmentation Method for Image Processing
- Authors: S. Prabu, J.M. Gnanasekar
- Summary: Reviews and analyzes different segmentation algorithms and lists a comparison of all of them. The comparison
is intended to help increase the accuracy and performance of segmentation methods in various image processing domains.
4. Fuzzy Rule-Based Image Segmentation technique for rock thin section images
- Authors: Refik Samet, Şahin Emrah Amrahov, Ali Hikmet Ziroğlu
- Summary: Proposes a Fuzzy Rule-Based Image Segmentation technique to segment rock thin section images.
5. Thresholding and morphological based segmentation techniques for medical images
- Authors: Ashwani Kumar Yadav, Ratnadeep Roy, Rajkumar, Vaishali, Devendra Somwanshi
- Summary: The main objective of this work is to segment medical images under various conditions and different
backgrounds.
6. An accurate thresholding-based segmentation technique for natural images
- Authors: Sharifah Lailee Syed Abdullah, Hamirul'Aini Hambali, Nursuriati Jamil
- Summary: The traditional thresholding and clustering segmentation techniques that were widely used are Otsu and
K-means.
7. Transfer Learning Improves Supervised Image Segmentation Across Imaging Protocols
- Authors: Annegreet van Opbroek, M. Arfan Ikram, Meike W. Vernooij, Marleen de Bruijne
- Summary: The variation between images obtained with different scanners or different imaging protocols presents a
major challenge in automatic segmentation of biomedical images.
Acknowledgements
• This dataset is the same as what is available here from the Berkeley AI Research group.
The Cityscapes data available from cityscapes-dataset.com has the following license:
• This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as
academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the
data given that you agree:
• That the dataset comes "AS IS", without express or implied warranty. Although every effort has been made to ensure
accuracy, we (Daimler AG, MPI Informatics, TU Darmstadt) do not accept any responsibility for errors or omissions.
• That you include a reference to the Cityscapes Dataset in any work that makes use of the dataset. For research papers,
cite our preferred publication as listed on our website; for other media cite our preferred publication as listed on our
website or link to the Cityscapes website.
• That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works in as far
as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not
directly include any of our data) and do not allow to recover the dataset or something similar in character.
• That you may not use the dataset or any derivative work for commercial purposes as, for example, licensing or selling
the data, or using the data with a purpose to procure a commercial gain.
• That all rights not expressly granted to you are reserved by (Daimler AG, MPI Informatics, TU Darmstadt).
Dataset
Context:
Cityscapes data (dataset home page) contains labelled videos taken from vehicles driven in Germany. This version is a
processed subsample created as part of the Pix2Pix paper. The dataset has still images from the original videos, and the
semantic segmentation labels are shown in images alongside the original image. This is one of the best datasets around
for semantic segmentation tasks.
Content:
This dataset has 2975 training image files and 500 validation image files. Each image file is 256x512 pixels, and each
file is a composite with the original photo on the left half of the image, alongside the labeled image (output of
semantic segmentation) on the right half.
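Because each file in this dataset is a composite with the photo on the left half and the labeled image on the right half, a loading step has to separate the two. The sketch below shows one way this could be done with NumPy slicing. It assumes "256x512" means 256 rows by 512 columns, and it uses a synthetic stand-in array instead of a real file; the function name `split_composite` is hypothetical and not taken from the project repository.

```python
import numpy as np

def split_composite(composite):
    """Split an (H, W, C) composite into (photo, label) halves along the width."""
    composite = np.asarray(composite)
    half = composite.shape[1] // 2
    photo = composite[:, :half]   # left half: original photo
    label = composite[:, half:]   # right half: labeled image
    return photo, label

# Synthetic stand-in for one file: black left half, white right half (assumed layout).
composite = np.zeros((256, 512, 3), dtype=np.uint8)
composite[:, 256:] = 255
photo, label = split_composite(composite)
print(photo.shape, label.shape)
```

If the label half is stored as a color image and not as class indices, mapping its colors to class ids would be a separate step that this sketch does not cover.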
Overall Findings and Observations
• In our exploration of image segmentation techniques using the Cityscapes dataset, we observed distinct
strengths and limitations across traditional and deep learning approaches. Traditional methods, such as
thresholding and clustering algorithms like K-means and DBSCAN, showcased computational efficiency but
struggled with precision, particularly in handling complex scenes.
• Deep learning models, including U-Net and Mask R-CNN, exhibited superior precision, but at the expense of
increased computational demands. Evaluation metrics such as Intersection over Union (IoU) and Dice
Coefficient provided a quantitative perspective, revealing nuanced performance differences among the methods.
• We identified a trade-off between computational efficiency and segmentation precision, emphasizing the need
for a balanced approach tailored to specific application requirements. The robustness of traditional methods to
variations and their interpretability were highlighted, while deep learning models demonstrated superior
generalization.
• Key takeaways include the potential for a hybrid approach integrating the strengths of both methods and the
importance of further research to optimize deep learning models for efficiency without compromising precision.
Additionally, domain-specific adaptations may enhance segmentation performance in diverse urban
environments.
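As an illustration of the color-based clustering discussed in these findings, the following is a minimal K-means sketch written with NumPy only. It is not the project's implementation: the function name `kmeans_segment`, the fixed iteration count, and the brightness-based deterministic initialization are all assumptions chosen to keep the example short and reproducible.

```python
import numpy as np

def kmeans_segment(image, k=3, iters=10):
    """Cluster pixels by color; return an (H, W) map of cluster indices and the centres."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    # Assumed simple initialization (not k-means++): pick k pixels evenly
    # spaced along the brightness ordering, from darkest to brightest.
    order = np.argsort(pixels.sum(axis=1), kind="stable")
    picks = np.linspace(0, len(pixels) - 1, k).astype(int)
    centres = pixels[order[picks]]
    for _ in range(iters):
        # Distance from every pixel to every centre, shape (num_pixels, k).
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # leave a centre unchanged if its cluster is empty
                centres[j] = members.mean(axis=0)
    return labels.reshape(h, w), centres

# Toy image (hypothetical data): dark left half, bright right half.
image = np.zeros((4, 8, 3), dtype=np.uint8)
image[:, :4] = 10
image[:, 4:] = 200
seg, centres = kmeans_segment(image, k=2)
print(seg)
```

Clustering on color alone ignores pixel position and object identity, which is consistent with the difficulty in complex scenes noted above.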
https://github.com/adityapande403/CV_segmentation_UNET_EXPL/tree/main