CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection

Cited by: 41
Authors
Chen, Tianyou [1]
Hu, Xiaoguang [1]
Xiao, Jin [1]
Zhang, Guofeng [1]
Wang, Shaojie [1]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Salient object detection; RGB-D images; Cross-modality feature fusion; Cascaded refinement; NETWORK; FUSION
DOI
10.1007/s00521-021-06845-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compared with RGB salient object detection (SOD) methods, RGB-D SOD models perform better in many challenging scenarios by leveraging the spatial information embedded in depth maps. However, existing RGB-D SOD models tend to ignore modality-specific characteristics and fuse multi-modality features by simple element-wise addition or multiplication. As a result, they may produce noise-degraded saliency maps when they encounter inaccurate or blurred depth images. In addition, many models adopt a U-shape architecture to integrate multi-level features layer by layer. Although low-level features are gradually polished in this way, little attention has been paid to enhancing high-level features, which may lead to suboptimal results. In this paper, we propose a novel network named CFIDNet to tackle these problems. Specifically, we design the feature-enhanced module to excavate informative depth cues from depth images and to enhance the RGB features by exploiting complementary information between the RGB and depth modalities. We also propose the feature refinement module, which exploits multi-scale complementary information between multi-level features and polishes these features through residual connections. The cascaded feature interaction decoder (CFID) is then proposed to refine multi-level features iteratively. Equipped with these modules, CFIDNet is capable of segmenting salient objects accurately. Experimental results on 7 widely used benchmark datasets show that CFIDNet achieves highly competitive performance against 15 state-of-the-art models in terms of 8 evaluation metrics. Our source code will be publicly available at https://github.com/clelouch/CFIDNet.
Pages: 7547-7563
Page count: 17