SS-YOLOv8: small-size object detection algorithm based on improved YOLOv8 for UAV imagery

  • Regular paper
Multimedia Systems

Abstract

Unmanned aerial vehicle (UAV) image object detection has extensive applications in both civilian and military domains. However, the baseline YOLOv8 detection algorithm struggles with small objects in UAV imagery: its missed-detection rate is high and its parameter count is excessive. To address these issues, this paper introduces an enhanced small object detection approach, the Small-Size Object Detection Algorithm Based on Improved YOLOv8 for UAV Imagery (SS-YOLOv8). First, given the difficulty of detecting small objects in UAV aerial images and the stringent real-time requirements, the model is streamlined by removing the large object detection head and its associated redundant network layers, which significantly reduces the parameter count. In addition, a specialized tiny object detection head, designed specifically for small objects, is proposed. To strengthen the model’s capacity for extracting fine features of small objects and to mitigate information loss during feature extraction, the convolution module in the backbone is replaced with space-to-depth convolution (SPD-Conv). In the neck, the self-designed GCU module is incorporated to fuse deep semantic features with shallow positional features. Finally, at the same parameter cost, multi-scale features are reused and combined to achieve more comprehensive feature fusion. Experimental results on the VisDrone2019 dataset show that, compared with YOLOv8, SS-YOLOv8 improves mAP by 6.9% on the validation set and 5.8% on the test set, while reducing the number of parameters and the model size by 65.93% and 64.85%, respectively. SS-YOLOv8 also achieves higher detection accuracy than the best-performing comparison algorithm, indicating the superiority of the method.
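
The space-to-depth step behind SPD-Conv can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names `space_to_depth` and `pointwise_conv` are hypothetical, the feature map is a plain nested list indexed as [channel][row][column], and a 1x1 channel-mixing step stands in for the non-strided convolution that follows the rearrangement. The channel ordering chosen here is one of several equivalent conventions.

```python
def space_to_depth(x, scale=2):
    """Rearrange a [C][H][W] feature map into [C*scale*scale][H//scale][W//scale].

    Each output channel is a strided sub-sampling of one input channel, so the
    spatial size shrinks by `scale` while every input value is kept.
    """
    height, width = len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for dy in range(scale):          # row offset inside each scale x scale cell
        for dx in range(scale):      # column offset inside each cell
            for channel in x:
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


def pointwise_conv(x, weights):
    """Non-strided 1x1 convolution: mix channels independently at every pixel.

    `weights` is a [C_out][C_in] matrix (an illustrative stand-in for the
    learned convolution applied after the space-to-depth rearrangement).
    """
    height, width = len(x[0]), len(x[0][0])
    return [
        [
            [sum(w * x[c][i][j] for c, w in enumerate(w_row)) for j in range(width)]
            for i in range(height)
        ]
        for w_row in weights
    ]


# One 4x4 channel becomes four 2x2 channels; no value is discarded.
feature_map = [[[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16]]]
rearranged = space_to_depth(feature_map, scale=2)
mixed = pointwise_conv(rearranged, [[0.25, 0.25, 0.25, 0.25]])
```

Unlike a stride-2 convolution or pooling layer, which skips or merges positions before any learned weights see them, this rearrangement hands all 16 input values to the following convolution. That is the sense in which the downsampling step itself loses no information.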

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

This work was funded by the National Key Research and Development Program of China, Grant Number 2021YFB1407005, and the Key Technology Research and Development Program of Shandong Province, Grant Number 2021SFGC0401.

Author information

Contributions

Jinlong Qu proposed the research idea, designed and implemented the model, conducted experiments, and served as the primary writer. Qi Li and Jie Pan contributed to the model design, result analysis, and manuscript revisions. Mingzheng Sun, Xingzheng Lu, Ying Zhou and Hongliang Zhu supervised and participated in revising the paper.

Corresponding author

Correspondence to Qi Li.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethical standard

All datasets used in this paper are publicly available, and no consent was required for their use.

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Additional information

Communicated by Junyu Gao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qu, J., Li, Q., Pan, J. et al. SS-YOLOv8: small-size object detection algorithm based on improved YOLOv8 for UAV imagery. Multimedia Systems 31, 42 (2025). https://doi.org/10.1007/s00530-024-01622-3

  • DOI: https://doi.org/10.1007/s00530-024-01622-3
