Cross-modal and multi-level feature refinement network for RGB-D salient object detection

Cited by: 7

Authors:

Gao, Yue [1]

Dai, Meng [1]

Zhang, Qing [1]

Affiliation:

[1] Shanghai Inst Technol, Sch Comp Sci & Informat Engn, Shanghai, Peoples R China

Source:

VISUAL COMPUTER | 2023, Vol. 39, No. 9

Funding:

Natural Science Foundation of Shanghai; National Natural Science Foundation of China

Keywords:

RGB-D salient object detection; Cross-modal feature interaction; Multi-level feature fusion; Skip connection; FUSION;

DOI:

10.1007/s00371-022-02543-w

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

RGB-D salient object detection (SOD) methods use depth maps as important supplementary information to identify salient objects more accurately. However, existing RGB-D SOD methods still face two main challenges: how to obtain effective cross-modal features, and how to optimize the integration of multi-level features. To tackle these two issues, we propose a novel cross-modal and multi-level feature refinement network equipped with a cross-modal feature interaction module and a multi-level feature fusion module. Specifically, the cross-modal feature interaction module is designed to enhance depth features from both channel and spatial perspectives and then effectively integrate cross-modal features. Moreover, considering the characteristics of features at different levels, we propose a multi-level feature fusion module that combines contextual information from multi-level features by means of skip connections. Extensive experiments on five benchmark datasets demonstrate that our proposed model outperforms 17 other state-of-the-art RGB-D SOD methods.


Pages: 3979-3994

Page count: 16
