Hierarchical two-stage modal fusion for triple-modality salient object detection

Cited by: 2
Authors
Wen, Hongwei [1 ,2 ,3 ]
Song, Kechen [1 ,2 ,3 ]
Huang, Liming [1 ,2 ,3 ]
Wang, Han [1 ,2 ,3 ]
Wang, Junyi [4 ]
Yan, Yunhui [1 ,2 ,3 ]
Affiliations
[1] Northeastern Univ, Sch Mech Engn & Automat, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Natl Frontiers Sci Ctr Ind Intelligence & Syst Opt, Shenyang 110819, Peoples R China
[3] Northeastern Univ, Key Lab Data Analyt & Optimizat Smart Ind, Minist Educ, Shenyang, Peoples R China
[4] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Triple-modality salient object detection; Two-stage fusion; Accurate location; Feature-level correlation; NETWORK;
DOI
10.1016/j.measurement.2023.113180
Chinese Library Classification
T [Industrial Technology];
Discipline Code
08;
Abstract
Salient object detection (SOD) is essential for home-service robot grasping. Recent studies have shown that triple-modality (RGB-Depth-Thermal infrared, RGB-D-T) information can significantly improve detection performance. Because the accuracy and completeness with which a salient object is located directly determine the robot's judgment of the object's position and of the grasp point, the critical problem for triple-modality SOD in robot grasping is to locate the salient object accurately and detect it completely. We therefore propose a triple-modality salient object detection method based on hierarchical two-stage modal fusion. In the first fusion stage, the triple-modal information is exploited to locate the salient object accurately: considering the properties of each modality, an accurate selection fusion module (ASFM) uses the depth information to supplement and refine the visible-light and thermal-infrared information. In the second stage, a feature correlation enhancement module (FCEM) correlates the salient features of the different modalities, combining the modal information so that the detected salient object is more complete. Comparative and challenging-scene experiments demonstrate that the proposed method outperforms 14 state-of-the-art methods on the VDT2048 dataset. The code is available at: https://github.com/VDT-2048/HTMF.
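The abstract describes a two-stage pipeline: a depth-guided selection stage (ASFM) for localization, then a cross-modal correlation stage (FCEM) for completeness. The following is a minimal conceptual sketch of that flow in plain Python, using lists as stand-ins for feature maps; the internals of both modules here (a sigmoid depth gate, an element-wise correlation term) are illustrative assumptions, not the paper's actual architecture.

```python
import math


def asfm(rgb, thermal, depth):
    """Stage 1 (hypothetical): depth responses act as a per-element
    gate that selects and sharpens the RGB and thermal features,
    standing in for the accurate selection fusion module (ASFM)."""
    gate = [1.0 / (1.0 + math.exp(-d)) for d in depth]
    rgb_ref = [r * g for r, g in zip(rgb, gate)]
    th_ref = [t * g for t, g in zip(thermal, gate)]
    return rgb_ref, th_ref


def fcem(rgb_ref, th_ref):
    """Stage 2 (hypothetical): an element-wise cross-modal correlation
    term is added back to both streams before merging, standing in
    for the feature correlation enhancement module (FCEM)."""
    corr = [r * t for r, t in zip(rgb_ref, th_ref)]
    return [r + t + c for r, t, c in zip(rgb_ref, th_ref, corr)]


# Toy 3-element "feature maps": the middle element is salient in all
# three modalities, so the two stages should amplify it jointly.
rgb = [0.2, 0.9, 0.1]
thermal = [0.3, 0.8, 0.0]
depth = [-2.0, 3.0, -1.0]
fused = fcem(*asfm(rgb, thermal, depth))
```

In a real implementation these would be learned convolutional modules over multi-level backbone features; the sketch only shows the data flow of depth-gated selection followed by correlation-based fusion.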
Pages: 15