Cross-domain object detection by local to global object-aware feature alignment

Song, Yiguo; Liu, Zhenyu; Tang, Ruining; Duan, Guifang; Tan, Jianrong

doi:10.1007/s00521-023-09248-8

Original Article
Published: 06 December 2023

Volume 36, pages 3631–3644 (2024)

Neural Computing and Applications

Abstract

Cross-domain object detection has attracted growing attention. It aims to reduce the gap between two domains, where the source domain is labeled and the target domain is unlabeled. Feature alignment is a fundamental step in cross-domain object detection. However, complex background information can make feature alignment unreliable and can even cause negative transfer. Paying more attention to object instances has been found to be more conducive to feature alignment. This paper therefore proposes a local-to-global object-aware feature alignment (OFA) method for cross-domain object detection. The proposed network consists of three components: (1) a local object-aware feature alignment (LOFA) module, which uses the classification map to enhance the local object representation and align local object features; (2) a cross-scale attention feature alignment (CAFA) module, which employs local cross-channel interactive attention to highlight foreground features and aligns the globally fused cross-scale features at the multi-scale level; and (3) a memory-category guided alignment (MCGA) module, which records all category features from the source domain and then uses these memory-category features to guide category learning in the target domain. With OFA, the enhanced object information leads to better feature alignment. The method is evaluated on cross-domain detection in different scenarios and shows state-of-the-art performance.
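To make the memory-category idea more concrete, the following is a minimal sketch of one way a per-category feature memory could work. It is not the authors' MCGA implementation, whose details are not given in the abstract. The class name `CategoryMemory`, the exponential-moving-average update, the `momentum` value, and the use of cosine similarity are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class CategoryMemory:
    """Hypothetical per-category feature memory (illustrative only, not the paper's MCGA)."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.momentum = momentum  # assumed EMA factor; not taken from the paper
        self.protos = [[0.0] * dim for _ in range(num_classes)]
        self.seen = [False] * num_classes

    def update(self, label, feature):
        """Record a labeled source-domain feature for its category."""
        if not self.seen[label]:
            # The first feature of a category initializes its prototype.
            self.protos[label] = list(feature)
            self.seen[label] = True
            return
        m = self.momentum
        self.protos[label] = [
            m * p + (1.0 - m) * f for p, f in zip(self.protos[label], feature)
        ]

    def guide(self, feature):
        """Compare an unlabeled target-domain feature with every stored category.

        Returns (best_label, similarities); categories never seen in the
        source domain get a similarity of -inf.
        """
        sims = [
            cosine(feature, proto) if seen else float("-inf")
            for proto, seen in zip(self.protos, self.seen)
        ]
        best = max(range(len(sims)), key=sims.__getitem__)
        return best, sims


# Usage: store two source categories, then query with a target feature.
memory = CategoryMemory(num_classes=2, dim=2, momentum=0.5)
memory.update(0, [1.0, 0.0])
memory.update(1, [0.0, 1.0])
label, sims = memory.guide([0.9, 0.1])
```

In a detector, the returned similarities could serve as a soft category signal for target-domain features, for example as pseudo-label weights. Whether the paper uses them this way is not stated in the abstract.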

Data Availability Statement

The datasets used in this study are publicly available online.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants U22A600, 52075480, and 51935009, in part by the Key Research and Development Program of Zhejiang Province under Grant 2022C01064, and in part by the High-level Talent Special Support Plan of Zhejiang Province under Grant 2022C01064.

Author information

Authors and Affiliations

The State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou, 310027, Zhejiang, People’s Republic of China
Yiguo Song, Zhenyu Liu, Ruining Tang, Guifang Duan & Jianrong Tan

Corresponding author

Correspondence to Guifang Duan.

Ethics declarations

Conflicts of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Song, Y., Liu, Z., Tang, R. et al. Cross-domain object detection by local to global object-aware feature alignment. Neural Comput & Applic 36, 3631–3644 (2024). https://doi.org/10.1007/s00521-023-09248-8

Received: 13 April 2023
Accepted: 06 November 2023
Published: 06 December 2023
Issue Date: March 2024
DOI: https://doi.org/10.1007/s00521-023-09248-8
