Object Detection Assignment
One-stage and two-stage object detectors
One-stage object detectors are neural networks that predict all bounding boxes in a single pass
through the network. Their speed makes them well suited to mobile devices. Commonly encountered
one-stage object detectors include YOLO, SSD, SqueezeDet, and DetectNet. In contrast, two-stage
object detectors take a two-step approach. First, they use a region proposal step to generate
candidate object regions. Then a specialized per-region head classifies and refines these
proposals. One-stage detectors typically run faster, while two-stage detectors generally provide
higher accuracy.
The most common architectures for one-stage object detection include YOLO, SSD, and EfficientDet.
YOLO (You Only Look Once) is a real-time object detection system. It uses a single neural network
to predict bounding boxes and class probabilities directly from whole images in a single
evaluation. The Single Shot MultiBox Detector (SSD) is likewise a one-stage method that uses a
single deep neural network to predict bounding boxes and class probabilities from the full image
in one evaluation. EfficientDet is a family of one-stage object detectors that use EfficientNet as
the backbone network and perform strongly on a range of object detection benchmarks.
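To make the "single evaluation" idea concrete for YOLO, the sketch below decodes one grid cell of a YOLOv1-style output. It assumes the original YOLOv1 layout, in which each cell holds B boxes of (x, y, w, h, confidence) followed by C class probabilities. The function name `decode_cell` and the threshold value are illustrative assumptions, not part of any library.

```python
def decode_cell(cell, row, col, S, B, C, threshold=0.5):
    """Decode one grid cell of an assumed YOLOv1-style output.

    cell: flat list of B * 5 + C numbers for the cell at (row, col)
    S:    number of grid cells per image side
    Returns a list of (x_center, y_center, w, h, score, class_id),
    with all coordinates relative to the whole image (0..1).
    """
    class_probs = cell[B * 5:B * 5 + C]
    best_class = max(range(C), key=lambda k: class_probs[k])
    detections = []
    for b in range(B):
        x, y, w, h, conf = cell[b * 5:b * 5 + 5]
        # Class-specific score: box confidence times class probability.
        score = conf * class_probs[best_class]
        if score >= threshold:
            # x, y are offsets inside the cell; w, h are relative to the image.
            detections.append(
                ((col + x) / S, (row + y) / S, w, h, score, best_class)
            )
    return detections


# Hypothetical 2x2 grid, one box per cell, two classes.
example_cell = [0.5, 0.5, 0.2, 0.4, 0.9, 0.1, 0.8]
print(decode_cell(example_cell, row=1, col=0, S=2, B=1, C=2))
```

Every cell is decoded from the same forward pass, which is why no separate proposal stage is needed.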
One-stage object detectors offer several advantages, including higher speed and good
compatibility with mobile devices. They also require less memory and fewer computational
resources. Their main drawback is lower accuracy relative to two-stage detectors. They can also
struggle to detect small objects or objects that are close to each other.
Two-stage object detectors tend to achieve higher accuracy than one-stage detectors. They are
also better at detecting objects with irregular shapes and clusters of small objects. Their
drawbacks are slower processing and greater memory and computational requirements compared to
one-stage detectors.
YOLO:
• YOLO treats object detection as a single regression problem, which is why it is called a
single-stage detector: bounding-box coordinates and class probabilities are all computed in a
single run of the algorithm.
• YOLO sees the entire image at training and test time, so it makes fewer than half as many
background errors as Fast R-CNN.
• Intersection over Union (IoU) is a metric used to judge the accuracy of a predicted bounding
box: the area of overlap between the predicted and ground-truth boxes divided by the area of
their union.
• Non-max suppression reduces the number of predicted bounding boxes by keeping the box with the
largest probability for each detection and discarding boxes that overlap it heavily.
• Drawbacks:
• The major problem with YOLOv1 is its difficulty detecting very small objects.
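The IoU metric and non-max suppression described in the list above can be sketched in a few lines. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corner coordinates and using a greedy suppression rule with an assumed overlap threshold of 0.5. The function names are illustrative only.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box is kept.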
Fast R-CNN:
• Similar to R-CNN, except that it expects region proposals to be supplied together with the
image rather than generating them itself.
• The CNN runs only once per image, and its computation is shared among roughly 2,000 region
proposals.
• The CNN processes the image and outputs a feature map.
• The input region proposals are used to extract regions of interest (ROIs) from the feature
map, creating a region-proposal feature map for each proposed region, called the ROI projection.
• An ROI pooling layer then down-samples each of these feature maps to a fixed-size feature map.
• Drawbacks:
• It is much faster than R-CNN, but it is still not fast enough on large real-world datasets.
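The ROI projection and ROI pooling steps listed above can be illustrated with a small single-channel sketch. It assumes a feature-map stride of 16 (an illustrative value, not stated in this document), integer bin boundaries, and a 2 x 2 output. The helper names `project_roi` and `roi_max_pool` are hypothetical.

```python
def project_roi(box, stride=16):
    """Map an image-space box (x1, y1, x2, y2) onto feature-map cells.

    Returns (r1, c1, r2, c2) with end-exclusive row and column bounds.
    The stride of 16 is an assumption made for this example.
    """
    x1, y1, x2, y2 = box
    # Floor the start and ceil the end so the ROI covers the whole box.
    return (y1 // stride, x1 // stride, -(-y2 // stride), -(-x2 // stride))


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one ROI of a 2D feature map to a fixed out_h x out_w grid."""
    r1, c1, r2, c2 = roi
    h, w = r2 - r1, c2 - c1
    pooled = []
    for i in range(out_h):
        # Row range of bin i; every bin covers at least one cell.
        rs = r1 + (i * h) // out_h
        re = max(rs + 1, r1 + ((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            cs = c1 + (j * w) // out_w
            ce = max(cs + 1, c1 + ((j + 1) * w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


feature_map = [[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11],
               [12, 13, 14, 15]]
roi = project_roi((0, 0, 48, 32))       # a 48 x 32 pixel proposal
print(roi_max_pool(feature_map, roi))   # fixed 2 x 2 output
print(roi_max_pool(feature_map, (0, 0, 4, 4)))
```

Both calls return a 2 x 2 grid even though the two ROIs differ in size, which is what lets proposals of any shape feed the same per-region head.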
Result:
Ultimately, the best model for a particular application will depend on the specific requirements of that
application.
In some cases, Faster R-CNN may be the better choice, even if speed is a concern. For example, if the
application is only processing a small number of images, the slower speed of Faster R-CNN may not be a
significant drawback.
In other cases, YOLOv5 may be the better choice, even if accuracy is a concern. For example, if the
application is processing a large number of images in real time, the faster speed of YOLOv5 may be
essential.
The best way to choose between Faster R-CNN and YOLOv5 is to experiment with both models and see
which one works best for the particular application.
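One simple way to start such an experiment is to time each candidate on the same images. The sketch below is only a harness: it assumes each detector is wrapped as a plain Python callable that takes one image, and the names `mean_latency` and `compare_latency` are hypothetical. Accuracy would need to be measured separately, for example with IoU against labelled boxes.

```python
import time


def mean_latency(detector, images, warmup=1):
    """Average seconds per image for one detector callable."""
    # Warm-up calls are excluded so one-off setup cost does not skew timing.
    for image in images[:warmup]:
        detector(image)
    start = time.perf_counter()
    for image in images:
        detector(image)
    return (time.perf_counter() - start) / len(images)


def compare_latency(detectors, images, warmup=1):
    """Return {name: mean latency in seconds} for each named detector."""
    return {name: mean_latency(fn, images, warmup)
            for name, fn in detectors.items()}


# Stand-in callables; real model wrappers would replace these.
def fake_fast_detector(image):
    return []


def fake_slow_detector(image):
    time.sleep(0.001)
    return []


results = compare_latency(
    {"fast": fake_fast_detector, "slow": fake_slow_detector},
    images=[None] * 5,
)
for name, seconds in results.items():
    print(f"{name}: {seconds * 1000:.2f} ms per image")
```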
Reference:
1. https://www.analyticsvidhya.com/blog/2022/06/yolo-algorithm-for-custom-object-detection/
2. https://viso.ai/deep-learning/object-detection/#:~:text=on%20Viso%20Suite-,Most%20Popular%20Object%20Detection%20Algorithms,the%20single%2Dshot%20detector%20family
3. https://pjreddie.com/darknet/yolo/
4. https://www.v7labs.com/blog/yolo-object-detection