DMRA: Depth-Induced Multi-Scale Recurrent Attention Network for RGB-D Saliency Detection

Cited by: 48
Authors
Ji, Wei [1 ,2 ]
Yan, Ge [2 ]
Li, Jingjing [1 ,2 ]
Piao, Yongri [3 ]
Yao, Shunyu [2 ]
Zhang, Miao [4 ]
Cheng, Li [1 ]
Lu, Huchuan [3 ]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T5V 1A4, Canada
[2] Dalian Univ Technol, Sch Software Technol, Dalian 116024, Peoples R China
[3] Dalian Univ Technol, Sch Informat & Commun Engn, Fac Elect Informat & Elect Engn, Dalian 116024, Peoples R China
[4] Dalian Univ Technol, DUT RU Int Sch Informat & Software Engn, Key Lab Ubiquitous Network & Serv Software Liaoni, Dalian 116024, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China;
Keywords
Feature extraction; Saliency detection; Semantics; Random access memory; Cameras; Analytical models; Visualization; RGB-D saliency detection; salient object detection; convolutional neural networks; cross-modal fusion; OBJECT DETECTION; FUSION; SEGMENTATION;
DOI
10.1109/TIP.2022.3154931
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we propose a novel depth-induced multi-scale recurrent attention network for RGB-D saliency detection, named DMRA. It achieves strong performance, especially in complex scenarios. Our network makes four main contributions that are experimentally demonstrated to have significant practical merit. First, we design an effective depth refinement block using residual connections to fully extract and fuse cross-modal complementary cues from the RGB and depth streams. Second, depth cues with abundant spatial information are innovatively combined with multi-scale contextual features to accurately locate salient objects. Third, a novel recurrent attention module, inspired by the Internal Generative Mechanism of the human brain, is designed to generate more accurate saliency results by comprehensively learning the internal semantic relations of the fused features and progressively optimizing local details with memory-oriented scene understanding. Finally, a cascaded hierarchical feature fusion strategy is designed to promote efficient information interaction among multi-level contextual features and further improve the contextual representation ability of the model. In addition, we introduce a new real-life RGB-D saliency dataset containing a variety of complex scenarios, which has been widely used as a benchmark in recent RGB-D saliency detection research. Extensive experiments demonstrate that our method accurately identifies salient objects and achieves appealing performance against 18 state-of-the-art RGB-D saliency models on nine benchmark datasets.
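The abstract describes the depth refinement block only at a high level. As an illustration of the general idea, the sketch below shows a residual cross-modal fusion block of the kind described: RGB and depth features are refined separately, fused, and injected back into the RGB stream through a residual connection. The class name, channel sizes, and layer choices are assumptions made for illustration, not the authors' released implementation.

# Illustrative sketch (not the authors' code): residual cross-modal fusion
# in the spirit of DMRA's depth refinement block. Assumes the RGB and depth
# feature maps have the same shape.
import torch
import torch.nn as nn

class DepthRefinementBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Separate 3x3 convolutions refine each modality before fusion.
        self.rgb_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.depth_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # 1x1 convolution mixes the concatenated cross-modal features.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        r = self.rgb_conv(rgb_feat)
        d = self.depth_conv(depth_feat)
        fused = self.fuse(torch.cat([r, d], dim=1))
        # Residual connection keeps the original RGB stream intact while
        # adding depth-refined complementary cues.
        return rgb_feat + fused

# Example: fuse 64-channel feature maps from the two streams.
block = DepthRefinementBlock(64)
rgb = torch.randn(1, 64, 56, 56)
depth = torch.randn(1, 64, 56, 56)
out = block(rgb, depth)  # shape: (1, 64, 56, 56)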
Pages: 2321-2336
Number of pages: 16
Related papers (50 total)
  • [41] Multi-Prior Driven Network for RGB-D Salient Object Detection
    Zhang, Xiaoqin
    Xu, Yuewang
    Wang, Tao
    Liao, Tangfei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9209 - 9222
  • [42] EF-Net: A novel enhancement and fusion network for RGB-D saliency detection
    Chen, Qian
    Fu, Keren
    Liu, Ze
    Chen, Geng
    Du, Hongwei
    Qiu, Bensheng
    Shao, Ling
    PATTERN RECOGNITION, 2021, 112
  • [43] SLMSF-Net: A Semantic Localization and Multi-Scale Fusion Network for RGB-D Salient Object Detection
    Peng, Yanbin
    Zhai, Zhinian
    Feng, Mingkun
    SENSORS, 2024, 24 (04)
  • [44] Multi-modal interactive attention and dual progressive decoding network for RGB-D/T salient object detection
    Liang, Yanhua
    Qin, Guihe
    Sun, Minghui
    Qin, Jun
    Yan, Jie
    Zhang, Zhonghan
    NEUROCOMPUTING, 2022, 490 : 132 - 145
  • [45] RGB-D Salient Object Detection via Feature Fusion and Multi-scale Enhancement
    Wu, Peiliang
    Duan, Liangliang
    Kong, Lingfu
    COMPUTER VISION, CCCV 2015, PT II, 2015, 547 : 359 - 368
  • [46] DGFNet: Depth-Guided Cross-Modality Fusion Network for RGB-D Salient Object Detection
    Xiao, Fen
    Pu, Zhengdong
    Chen, Jiaqi
    Gao, Xieping
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2648 - 2658
  • [47] Bilateral Attention Network for RGB-D Salient Object Detection
    Zhang, Zhao
    Lin, Zheng
    Xu, Jun
    Jin, Wen-Da
    Lu, Shao-Ping
    Fan, Deng-Ping
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1949 - 1961
  • [48] Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection
    Chen, Hao
    Li, Youfu
    Su, Dan
    PATTERN RECOGNITION, 2019, 86 : 376 - 385
  • [49] DMGNet: Depth mask guiding network for RGB-D salient object detection
    Tang, Yinggan
    Li, Mengyao
    NEURAL NETWORKS, 2024, 180
  • [50] Depth quality-aware selective saliency fusion for RGB-D image salient object detection
    Wang, Xuehao
    Li, Shuai
    Chen, Chenglizhao
    Hao, Aimin
    Qin, Hong
    NEUROCOMPUTING, 2021, 432 : 44 - 56