Intra-Modality Self-Enhancement Mirror Network for RGB-T Salient Object Detection

Cited by: 3
Authors
Wang, Jie [1]
Li, Guoqiang [2]
Yu, Hongjie [3]
Xi, Jinwen [4]
Shi, Jie [5]
Wu, Xueying [6]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[2] Chinese Acad Sci, Natl Sci Lib, Beijing 100080, Peoples R China
[3] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
[4] Zhongguancun Lab, Beijing 100194, Peoples R China
[5] North Automat Control Technol Inst, Taiyuan 030006, Peoples R China
[6] Beijing Forestry Univ, Sch Informat, Beijing 100107, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
RGB-T images; cross-scale fusion; salient object detection; intra-modality self-enhancement
DOI
10.1109/TCSVT.2024.3489440
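Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract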
The inherent imaging properties of sensors give rise to two distinct differences between the two modalities in RGB-T Salient Object Detection (SOD) tasks: differences in imaging effectiveness, caused by varying sensitivities to specific scenes, and fundamental domain differences, caused by the modalities reflecting different scene characteristics. Existing methods primarily pursue unique cross-modal fusion designs to enhance model performance. However, direct cross-modal fusion fails to improve the effectiveness of the original features, and intricate cross-modal fusion designs increase the domain differences between modalities, resulting in suboptimal performance. Therefore, in this paper, we no longer pursue unique cross-modal fusion designs; instead, we consider how to enhance the effectiveness of the original features within each modality (mitigating differences in imaging effectiveness) and use a concise cross-modal fusion mechanism (alleviating the impact of domain differences) to achieve satisfactory performance. In this spirit, we propose the Intra-modality Self-enhancement Mirror Network (ISMNet) for RGB-T salient object detection. The core of ISMNet is the proposed Intra-modality Cross-scale Self-enhancement Module (ICSM). The main insight of ICSM is to exploit saliency clues by modeling the correlation between intra-modality cross-scale features (which exhibit strong correlations and small domain differences), thereby enhancing the effectiveness of the original multi-scale features within each modality. We use the proposed paradigm to mirror-expand existing typical paradigms and obtain a more robust model architecture. Extensive experiments demonstrate that the proposed architecture and the universal ICSM improve the effectiveness of the original features and help achieve state-of-the-art performance.
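The abstract describes ICSM only at a high level, as modeling the correlation between cross-scale features of a single modality. As a rough illustration of that general idea, the sketch below lets each fine-scale position attend to coarse-scale positions of the same modality and adds the result back residually. This is an assumption-laden toy, not the ICSM from the paper: the function name `cross_scale_enhance`, the dot-product correlation, the softmax weighting, and the residual sum are all hypothetical choices made here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_scale_enhance(fine, coarse):
    """Enhance fine-scale features using correlated coarse-scale features.

    fine:   list of N feature vectors (one per fine-scale position)
    coarse: list of M feature vectors (one per coarse-scale position)
    Both are assumed to come from the SAME modality and to share the
    channel dimension. Hypothetical illustration, not the paper's ICSM.
    """
    channels = len(fine[0])
    scale = math.sqrt(channels)
    enhanced = []
    for query in fine:
        # Correlation of this fine position with every coarse position.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in coarse]
        weights = softmax(scores)
        # Correlation-weighted summary of the coarse-scale features.
        context = [sum(w * key[c] for w, key in zip(weights, coarse))
                   for c in range(channels)]
        # Residual sum keeps the original feature and adds the context.
        enhanced.append([q + ctx for q, ctx in zip(query, context)])
    return enhanced


# Toy input: 4 fine-scale positions, 2 coarse-scale positions, 2 channels.
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
coarse = [[1.0, 0.0], [0.0, 1.0]]
out = cross_scale_enhance(fine, coarse)
```

In this toy, a fine-scale feature that correlates more strongly with one coarse-scale position receives more of that position's content, while the residual sum leaves the original feature in place. A real module would operate on learned convolutional or transformer feature maps rather than hand-written vectors.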
Pages: 2513-2525
Number of pages: 13