CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection

Cong, Runmin; Lin, Qinwei; Zhang, Chen; Li, Chongyi; Cao, Xiaochun; Huang, Qingming; Zhao, Yao

doi:10.1109/TIP.2022.3216198


arXiv:2210.02843 (cs)

[Submitted on 6 Oct 2022]


Abstract: Focusing on how to effectively capture and utilize cross-modality information in the RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on novel cross-modality interaction and refinement. For the cross-modality interaction, 1) a progressive attention guided integration unit is proposed to sufficiently integrate RGB-D feature representations in the encoder stage, and 2) a convergence aggregation structure is proposed, which feeds the RGB and depth decoding features into the corresponding RGB-D decoding streams via an importance gated fusion unit in the decoder stage. For the cross-modality refinement, we insert a refinement middleware structure between the encoder and the decoder, in which the RGB, depth, and RGB-D encoder features are further refined by successively applying a self-modality attention refinement unit and a cross-modality weighting refinement unit. Finally, with the gradually refined features, we predict the saliency map in the decoder stage. Extensive experiments on six popular RGB-D SOD benchmarks demonstrate that our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively.
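The abstract names an importance gated fusion unit but does not describe its internals. As a rough illustration of the general idea of gating two modality streams, the sketch below blends an RGB feature vector and a depth feature vector with a per-element sigmoid gate. Everything here is an assumption made for illustration: the function name `gated_fusion`, the scalar weights `w_rgb`, `w_depth`, and `bias`, and the blending rule itself. It is a generic gated fusion, not the unit used in CIR-Net.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb, depth, w_rgb=1.0, w_depth=-1.0, bias=0.0):
    """Blend two equal-length feature vectors with a per-element gate.

    Hypothetical formulation (not taken from the paper):
        gate  = sigmoid(w_rgb * r + w_depth * d + bias)
        fused = gate * r + (1 - gate) * d
    In a trained network the weights would be learned; here they are
    fixed scalars so the example stays self-contained.
    """
    if len(rgb) != len(depth):
        raise ValueError("rgb and depth features must have the same length")
    fused = []
    for r, d in zip(rgb, depth):
        gate = sigmoid(w_rgb * r + w_depth * d + bias)
        # Convex combination: the output always lies between r and d.
        fused.append(gate * r + (1.0 - gate) * d)
    return fused


if __name__ == "__main__":
    rgb_features = [2.0, 0.0, 1.0]
    depth_features = [0.0, 2.0, 1.0]
    print(gated_fusion(rgb_features, depth_features))
```

Because the gate is a value in (0, 1), each fused element is a convex combination of the two inputs, so whichever stream the gate scores as more important at that position contributes more to the result.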

Comments:	Accepted by IEEE Transactions on Image Processing 2022, 16 pages, 11 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2210.02843 [cs.CV]
	(or arXiv:2210.02843v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2210.02843
Related DOI:	https://doi.org/10.1109/TIP.2022.3216198

Submission history

From: Runmin Cong
[v1] Thu, 6 Oct 2022 11:59:19 UTC (3,907 KB)
