Cross-modal and multi-level feature refinement network for RGB-D salient object detection

Cited by: 7
Authors
Gao, Yue [1]
Dai, Meng [1]
Zhang, Qing [1]
Affiliations
[1] Shanghai Inst Technol, Sch Comp Sci & Informat Engn, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai
Keywords
RGB-D salient object detection; Cross-modal feature interaction; Multi-level feature fusion; Skip connection; FUSION;
DOI
10.1007/s00371-022-02543-w
Chinese Library Classification number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
RGB-D salient object detection (SOD) methods adopt depth maps as important supplementary information in order to identify salient objects more accurately. However, existing RGB-D SOD methods still face two main challenges: how to obtain effective cross-modal features, and how to optimize the integration of multi-level features. To tackle these two issues, we propose a novel cross-modal and multi-level feature refinement network equipped with a cross-modal feature interaction module and a multi-level feature fusion module. Specifically, the cross-modal feature interaction module is designed to enhance depth features from both channel and spatial perspectives and then effectively integrate cross-modal features. Moreover, considering the characteristics of different levels of features, we propose a multi-level feature fusion module that combines contextual information from multi-level features by means of skip connections. Extensive experiments on five benchmark datasets demonstrate that our proposed model outperforms 17 other state-of-the-art RGB-D SOD methods.
Pages: 3979-3994
Page count: 16