DMRA: Depth-Induced Multi-Scale Recurrent Attention Network for RGB-D Saliency Detection

Cited by: 48
Authors
Ji, Wei [1 ,2 ]
Yan, Ge [2 ]
Li, Jingjing [1 ,2 ]
Piao, Yongri [3 ]
Yao, Shunyu [2 ]
Zhang, Miao [4 ]
Cheng, Li [1 ]
Lu, Huchuan [3 ]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T5V 1A4, Canada
[2] Dalian Univ Technol, Sch Software Technol, Dalian 116024, Peoples R China
[3] Dalian Univ Technol, Sch Informat & Commun Engn, Fac Elect Informat & Elect Engn, Dalian 116024, Peoples R China
[4] Dalian Univ Technol, DUT RU Int Sch Informat & Software Engn, Key Lab Ubiquitous Network & Serv Software Liaoni, Dalian 116024, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada; National Natural Science Foundation of China;
Keywords
Feature extraction; Saliency detection; Semantics; Random access memory; Cameras; Analytical models; Visualization; RGB-D saliency detection; salient object detection; convolutional neural networks; cross-modal fusion; OBJECT DETECTION; FUSION; SEGMENTATION;
DOI
10.1109/TIP.2022.3154931
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
In this work, we propose a novel depth-induced multi-scale recurrent attention network for RGB-D saliency detection, named DMRA. It achieves strong performance, especially in complex scenarios. Our network makes four main contributions, each experimentally demonstrated to have significant practical merit. First, we design an effective depth refinement block that uses residual connections to fully extract and fuse cross-modal complementary cues from the RGB and depth streams. Second, depth cues rich in spatial information are combined with multi-scale contextual features to accurately locate salient objects. Third, a novel recurrent attention module, inspired by the Internal Generative Mechanism of the human brain, generates more accurate saliency results by comprehensively learning the internal semantic relations of the fused features and progressively refining local details with memory-oriented scene understanding. Finally, a cascaded hierarchical feature fusion strategy promotes efficient interaction among multi-level contextual features and further improves the model's contextual representation ability. In addition, we introduce a new real-life RGB-D saliency dataset containing a variety of complex scenarios, which has been widely adopted as a benchmark in recent RGB-D saliency detection research. Extensive experiments demonstrate that our method accurately identifies salient objects and compares favorably against 18 state-of-the-art RGB-D saliency models on nine benchmark datasets.
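The abstract's first contribution, a depth refinement block that fuses cross-modal cues through residual connections, can be illustrated with a minimal toy sketch. This is not the paper's actual implementation (DMRA's blocks are convolutional and operate on feature maps); the function names and 1-D "features" below are hypothetical, and only the residual-fusion pattern (fused = rgb + refine(rgb, depth)) reflects the description:

```python
# Toy sketch of residual cross-modal fusion, as described in the abstract:
# the RGB stream is preserved, and depth-derived complementary cues are
# added on top via a residual connection. Names and data are illustrative.

def refine(rgb, depth):
    """Hypothetical refinement: modulate RGB activations by depth cues."""
    return [r * d for r, d in zip(rgb, depth)]

def depth_refinement_block(rgb, depth):
    """Residual fusion: identity path for RGB plus refined depth cues."""
    return [r + x for r, x in zip(rgb, refine(rgb, depth))]

rgb_feat = [0.5, 1.0, 2.0]
depth_feat = [1.0, 0.5, 0.0]
print(depth_refinement_block(rgb_feat, depth_feat))  # [1.0, 1.5, 2.0]
```

The residual form means uninformative depth (all zeros here, as in the last channel) leaves the RGB features unchanged, which is one common motivation for residual cross-modal designs.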
Pages: 2321-2336
Page count: 16
Related Papers
50 records in total
  • [21] Multi-scale Cross-Modal Transformer Network for RGB-D Object Detection
    Xiao, Zhibin
    Xie, Pengwei
    Wang, Guijin
    MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 352 - 363
  • [22] AGRFNet: Two-stage cross-modal and multi-level attention gated recurrent fusion network for RGB-D saliency detection
    Liu, Zhengyi
    Wang, Yuan
    Tan, Yacheng
    Li, Wei
    Xiao, Yun
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2022, 104
  • [23] Cross-Modal Adaptive Interaction Network for RGB-D Saliency Detection
    Du, Qinsheng
    Bian, Yingxu
    Wu, Jianyu
    Zhang, Shiyan
    Zhao, Jian
    APPLIED SCIENCES-BASEL, 2024, 14 (17):
  • [24] Efficient Depth-Included Residual Refinement Network for RGB-D Saliency Detection
    Yu, Jinhao
    Yan, Guoliang
    Xu, Xiuqi
    Wang, Jian
    Chen, Shuhan
    Hu, Xuelong
    IMAGE AND GRAPHICS (ICIG 2021), PT III, 2021, 12890 : 3 - 14
  • [25] CDNet: Complementary Depth Network for RGB-D Salient Object Detection
    Jin, Wen-Da
    Xu, Jun
    Han, Qi
    Zhang, Yi
    Cheng, Ming-Ming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3376 - 3390
  • [26] ASIF-Net: Attention Steered Interweave Fusion Network for RGB-D Salient Object Detection
    Li, Chongyi
    Cong, Runmin
    Kwong, Sam
    Hou, Junhui
    Fu, Huazhu
    Zhu, Guopu
    Zhang, Dingwen
    Huang, Qingming
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (01) : 88 - 100
  • [27] MAGNet: Multi-scale Awareness and Global fusion Network for RGB-D salient object detection
    Zhong, Mingyu
    Sun, Jing
    Ren, Peng
    Wang, Fasheng
    Sun, Fuming
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [28] Encoder deep interleaved network with multi-scale aggregation for RGB-D salient object detection
    Feng, Guang
    Meng, Jinyu
    Zhang, Lihe
    Lu, Huchuan
    PATTERN RECOGNITION, 2022, 128
  • [29] HFNet: Hierarchical feedback network with multilevel atrous spatial pyramid pooling for RGB-D saliency detection
    Zhou, Wujie
    Liu, Chang
    Lei, Jingsheng
    Yu, Lu
    Luo, Ting
    NEUROCOMPUTING, 2022, 490 : 347 - 357
  • [30] Multi-scale Residual Interaction for RGB-D Salient Object Detection
    Hu, Mingjun
    Zhang, Xiaoqin
    Zhao, Li
    COMPUTER VISION - ACCV 2022, PT III, 2023, 13843 : 575 - 590