Cross-domain object detection by local to global object-aware feature alignment

  • Original Article
  • Neural Computing and Applications

Abstract

Cross-domain object detection has attracted increasing attention recently. It aims to reduce the gap between two domains, where the source domain is labeled and the target domain is label-agnostic. Feature alignment is a fundamental step in cross-domain object detection. However, complex background information may lead to unreliable feature alignment and even negative transfer. It has been found that paying more attention to object instances is more conducive to feature alignment. Therefore, this paper proposes a local to global object-aware feature alignment (OFA) method for cross-domain object detection. The proposed network consists of three components: (1) a local object-aware feature alignment (LOFA) module, which uses the classification map to enhance the local object representation and align local object features; (2) a cross-scale attention feature alignment (CAFA) module, which employs local cross-channel interactive attention to highlight foreground features and aligns the features obtained by global cross-scale fusion at the multi-scale level; and (3) a memory-category guided alignment (MCGA) module, which records all category features from the source domain and then uses these memory-category features to guide category learning in the target domain. With the proposed OFA, the enhanced object information leads to better feature alignment. The proposed method is evaluated on cross-domain detection in different scenarios and shows state-of-the-art performance.
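The abstract does not give formulas, so the following is only a rough sketch of two of the ideas it names: re-weighting local features with a classification map, and keeping a memory of per-category source features to guide target-domain category assignment. Everything concrete here is an assumption made for illustration and not the authors' implementation: the function and class names (`object_aware_weight`, `CategoryMemory`), the use of the maximum class score as the weight, the exponential moving average update, and the cosine-similarity lookup.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def object_aware_weight(features: Sequence[Vector],
                        class_scores: Sequence[Sequence[float]]) -> List[Vector]:
    """Scale each location's feature vector by its highest class score.

    ASSUMPTION: the max over classes serves as the object weight; the
    abstract only states that the classification map enhances object features.
    """
    weighted = []
    for feat, scores in zip(features, class_scores):
        w = max(scores)
        weighted.append([w * x for x in feat])
    return weighted


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class CategoryMemory:
    """Hypothetical memory of one prototype feature per source category."""

    def __init__(self, momentum: float = 0.9) -> None:
        self.momentum = momentum
        self.prototypes: Dict[str, Vector] = {}

    def update(self, label: str, feature: Vector) -> None:
        """Record a labeled source feature (ASSUMPTION: moving-average update)."""
        old = self.prototypes.get(label)
        if old is None:
            self.prototypes[label] = list(feature)
        else:
            m = self.momentum
            self.prototypes[label] = [m * o + (1 - m) * f
                                      for o, f in zip(old, feature)]

    def nearest(self, feature: Vector) -> str:
        """Return the stored category most similar to a target feature."""
        return max(self.prototypes,
                   key=lambda c: cosine(self.prototypes[c], feature))


# Usage: build the memory from source features, then query a target feature.
memory = CategoryMemory(momentum=0.5)
memory.update("car", [1.0, 0.0])
memory.update("car", [0.0, 1.0])
memory.update("person", [0.0, 1.0])
print(memory.nearest([0.1, 0.9]))
```

In an actual detector these steps would operate on feature-map tensors inside the training loop, and the alignment itself would need a domain discrepancy or adversarial loss, which this sketch leaves out.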

Data Availability Statement

The datasets used in this study are publicly available online.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants U22A600, 52075480, and 51935009, in part by the Key Research and Development Program of Zhejiang Province under Grant 2022C01064, and in part by the High-level Talent Special Support Plan of Zhejiang Province under Grant 2022C01064.

Author information

Corresponding author

Correspondence to Guifang Duan.

Ethics declarations

Conflicts of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Song, Y., Liu, Z., Tang, R. et al. Cross-domain object detection by local to global object-aware feature alignment. Neural Comput & Applic 36, 3631–3644 (2024). https://doi.org/10.1007/s00521-023-09248-8

  • DOI: https://doi.org/10.1007/s00521-023-09248-8
