Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

Cited by: 23
Authors
Cong, Runmin [1 ,2 ,5 ]
Liu, Hongyu [1 ,6 ,7 ]
Zhang, Chen [1 ,6 ,7 ]
Zhang, Wei [2 ,5 ]
Zheng, Feng [3 ]
Song, Ran [2 ,5 ]
Kwong, Sam [4 ]
Affiliations
[1] Beijing Jiaotong Univ, Beijing, Peoples R China
[2] Shandong Univ, Jinan, Shandong, Peoples R China
[3] Southern Univ Sci & Technol, Shenzhen, Guangdong, Peoples R China
[4] City Univ Hong Kong, Hong Kong, Peoples R China
[5] Minist Educ, Key Lab Machine Intelligence & Syst Control, Jinan, Shandong, Peoples R China
[6] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[7] Beijing Key Lab Adv Informat Sci & Network Techno, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
salient object detection; RGB-D images; CNNs-assisted Transformer architecture; point-aware interaction; FUSION;
DOI
10.1145/3581783.3611982
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Integrating complementary information from RGB images and depth maps can improve salient object detection (SOD) in complex and challenging scenes. In recent years, the important role of Convolutional Neural Networks (CNNs) in feature extraction and cross-modality interaction has been fully explored, but CNNs are still insufficient for modeling global long-range dependencies of self-modality and cross-modality. To this end, we introduce a CNNs-assisted Transformer architecture and propose a novel RGB-D SOD network with Point-aware Interaction and CNN-induced Refinement (PICR-Net). On the one hand, considering the prior correlation between the RGB modality and the depth modality, an attention-triggered cross-modality point-aware interaction (CmPI) module is designed to explore the feature interaction of different modalities with positional constraints. On the other hand, to alleviate the block effect and detail destruction problems that the Transformer naturally introduces, we design a CNN-induced refinement (CNNR) unit for content refinement and supplementation. Extensive experiments on five RGB-D SOD datasets show that the proposed network achieves competitive results in both quantitative and qualitative comparisons. Our code is publicly available at: https://github.com/rmcong/PICR-Net_ACMMM23.
Pages: 406 - 416
Page count: 11
Related papers
57 in total
  • [1] Chen Qian, 2022, IEEE T NEURAL NETWOR
  • [2] DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection
    Chen, Zuyao
    Cong, Runmin
    Xu, Qianqian
    Huang, Qingming
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 7012 - 7024
  • [3] Chen ZY, 2020, AAAI CONF ARTIF INTE, V34, P10599
  • [4] Chongyi Li, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12353), P225, DOI 10.1007/978-3-030-58598-3_14
  • [5] An In Depth View of Saliency
    Ciptadi, Arridhana
    Hermans, Tucker
    Rehg, James M.
    [J]. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2013, 2013,
  • [6] Does Thermal Really Always Matter for RGB-T Salient Object Detection?
    Cong, Runmin
    Zhang, Kepu
    Zhang, Chen
    Zheng, Feng
    Zhao, Yao
    Huang, Qingming
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 6971 - 6982
  • [7] Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image
    Cong, Runmin
    Huang, Ke
    Lei, Jianjun
    Zhao, Yao
    Huang, Qingming
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (07) : 9495 - 9507
  • [8] PSNet: Parallel Symmetric Network for Video Salient Object Detection
    Cong, Runmin
    Song, Weiyu
    Lei, Jianjun
    Yue, Guanghui
    Zhao, Yao
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (02) : 402 - 414
  • [9] A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels
    Cong, Runmin
    Qin, Qi
    Zhang, Chen
    Jiang, Qiuping
    Wang, Shiqi
    Zhao, Yao
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (02) : 534 - 548
  • [10] CIR-Net: Cross-Modality Interaction and Refinement for RGB-D Salient Object Detection
    Cong, Runmin
    Lin, Qinwei
    Zhang, Chen
    Li, Chongyi
    Cao, Xiaochun
    Huang, Qingming
    Zhao, Yao
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6800 - 6815