Cross-modal hierarchical interaction network for RGB-D salient object detection

Cited by: 42
Authors
Bi, Hongbo [1]
Wu, Ranwan [1]
Liu, Ziqi [1]
Zhu, Huihui [1]
Zhang, Cong [1]
Xiang, Tian-Zhu [2]
Affiliations
[1] Northeast Petr Univ, Sch Elect Informat Engn, Daqing 163000, Peoples R China
[2] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Keywords
Saliency detection; Salient object detection; RGB-D; Feature fusion; Cross-modal interaction;
DOI
10.1016/j.patcog.2022.109194
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
How to effectively exchange and aggregate information across multiple modalities (e.g., RGB image and depth map) is a major challenge in the RGB-D salient object detection community. To address this problem, we propose a cross-modal Hierarchical Interaction Network (HINet), which boosts salient object detection by exploiting cross-modal feature interaction and progressive multi-level feature fusion. To this end, we design two modules: a cross-modal information exchange (CIE) module and a multi-level information progressively guided fusion (PGF) module. Specifically, the CIE module exchanges cross-modal features to learn shared representations and provides beneficial feedback that facilitates discriminative feature learning in each modality. The PGF module aggregates hierarchical features progressively through a reverse guidance mechanism, which uses high-level feature fusion to guide low-level feature fusion and thus improves saliency detection performance. Extensive experiments show that the proposed model significantly outperforms nine existing state-of-the-art models on five challenging benchmark datasets. Code and results are available at: https://github.com/RanwanWu/HINet. © 2022 Elsevier Ltd. All rights reserved.
Pages: 11