CIR-Net: Cross-Modality Interaction and Refinement for RGB-D Salient Object Detection

被引:146
作者
Cong, Runmin [1 ,2 ]
Lin, Qinwei [1 ,2 ]
Zhang, Chen [1 ,2 ]
Li, Chongyi [3 ]
Cao, Xiaochun [4 ]
Huang, Qingming [5 ,6 ,7 ]
Zhao, Yao [1 ,2 ]
机构
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
[2] Beijing Key Lab Adv Informat Sci & Network Techno, Beijing 100044, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[4] Sun Yat Sen Univ, Sch Cyber Sci & Technol, Shenzhen Campus, Shenzhen 518107, Peoples R China
[5] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
[6] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[7] Peng Cheng Lab, Shenzhen 518055, Peoples R China
基金
北京市自然科学基金; 中国国家自然科学基金;
关键词
Decoding; Task analysis; Periodic structures; Middleware; Logic gates; Electronic mail; Object detection; Salient object detection; RGB-D images; cross-modality attention; cross-modality interaction; FUSION NETWORK; SEGMENTATION;
D O I
10.1109/TIP.2022.3216198
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Focusing on the issue of how to effectively capture and utilize cross-modality information in RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement. For the cross-modality interaction, 1) a progressive attention guided integration unit is proposed to sufficiently integrate RGB-D feature representations in the encoder stage, and 2) a convergence aggregation structure is proposed, which flows the RGB and depth decoding features into the corresponding RGB-D decoding streams via an importance gated fusion unit in the decoder stage. For the cross-modality refinement, we insert a refinement middleware structure between the encoder and the decoder, in which the RGB, depth, and RGB-D encoder features are further refined by successively using a self-modality attention refinement unit and a cross-modality weighting refinement unit. At last, with the gradually refined features, we predict the saliency map in the decoder stage. Extensive experiments on six popular RGB-D SOD benchmarks demonstrate that our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively. The code and results can be found from the link of https://rmcong.github.io/proj_CIRNet.html.
引用
收藏
页码:6800 / 6815
页数:16
相关论文
共 60 条
[1]   Circular Complement Network for RGB-D Salient Object Detection [J].
Bai, Zhen ;
Liu, Zhi ;
Li, Gongyang ;
Ye, Linwei ;
Wang, Yang .
NEUROCOMPUTING, 2021, 451 :95-106
[2]   Salient object detection: A survey [J].
Borji, Ali ;
Cheng, Ming-Ming ;
Hou, Qibin ;
Jiang, Huaizu ;
Li, Jia .
COMPUTATIONAL VISUAL MEDIA, 2019, 5 (02) :117-150
[3]   Depth-Quality-Aware Salient Object Detection [J].
Chen, Chenglizhao ;
Wei, Jipeng ;
Peng, Chong ;
Qin, Hong .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 :2350-2363
[4]   Improved Saliency Detection in RGB-D Images Using Two-Phase Depth Estimation and Selective Deep Fusion [J].
Chen, Chenglizhao ;
Wei, Jipeng ;
Peng, Chong ;
Zhang, Weizhong ;
Qin, Hong .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :4296-4307
[5]   Three-Stream Attention-Aware Network for RGB-D Salient Object Detection [J].
Chen, Hao ;
Li, Youfu .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (06) :2825-2835
[6]   Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection [J].
Chen, Hao ;
Li, Youfu .
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :3051-3060
[7]   Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection [J].
Chen, Hao ;
Li, Youfu ;
Su, Dan .
PATTERN RECOGNITION, 2019, 86 :376-385
[8]   DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection [J].
Chen, Zuyao ;
Cong, Runmin ;
Xu, Qianqian ;
Huang, Qingming .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 :7012-7024
[9]  
Chen ZY, 2020, AAAI CONF ARTIF INTE, V34, P10599
[10]   Saliency Detection for Stereoscopic Images Based on Depth Confidence Analysis and Multiple Cues Fusion [J].
Cong, Runmin ;
Lei, Jianjun ;
Zhang, Changqing ;
Huang, Qingming ;
Cao, Xiaochun ;
Hou, Chunping .
IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (06) :819-823