CFRNet: Cross-Attention-Based Fusion and Refinement Network for Enhanced RGB-T Salient Object Detection

Cited by: 2

Authors:

Deng, Biao [1,2]

Liu, Di [2]

Cao, Yang [2]

Liu, Hong [2]

Yan, Zhiguo [1]

Chen, Hu [2]

Affiliations:

[1] Dongfang Electric Autocontrol Engineering Co., Ltd., Deyang 618000, People's Republic of China

[2] Sichuan University, College of Computer Science, Chengdu 610000, People's Republic of China

Source:

Sensors | 2024, Vol. 24, No. 22

Funding:

National Natural Science Foundation of China

Keywords:

RGB-T salient object detection; RGB-thermal fusion; cross-attention; fusion and refinement

DOI:

10.3390/s24227146

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry]

Subject classification codes:

070302; 081704

Abstract:

Existing deep learning-based RGB-T salient object detection methods often struggle with effectively fusing RGB and thermal features. Therefore, obtaining high-quality features and fully integrating these two modalities are central research focuses. We developed an illumination prior-based coefficient predictor (MICP) to determine optimal interaction weights. We then designed a saliency-guided encoder (SG Encoder) to extract multi-scale thermal features incorporating saliency information. The SG Encoder guides the extraction of thermal features by leveraging their correlation with RGB features, particularly those with strong semantic relationships to salient object detection tasks. Finally, we employed a Cross-attention-based Fusion and Refinement Module (CrossFRM) to refine the fused features. The robust thermal features help refine the spatial focus of the fused features, aligning them more closely with salient objects. Experimental results demonstrate that our proposed approach can more accurately locate salient objects, significantly improving performance compared to 11 state-of-the-art methods.
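The abstract describes refining fused features with cross-attention but gives no equations. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in plain Python, with RGB-side tokens as queries and thermal tokens as keys and values. The function names, the choice of which modality supplies the queries, and the scalar `alpha` (a stand-in for an interaction weight such as the one the MICP is said to predict) are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention on lists of vectors.

    Each query attends over all keys; the output for a query is the
    attention-weighted average of the value vectors.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        out.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return out


def fuse(rgb_tokens, thermal_tokens, alpha):
    """Hypothetical fusion step (an assumption, not the paper's CrossFRM).

    Mixes each RGB token with its thermal-attended counterpart, where
    alpha in [0, 1] plays the role of an interaction weight.
    """
    attended = cross_attention(rgb_tokens, thermal_tokens, thermal_tokens)
    return [
        [(1 - alpha) * r + alpha * a for r, a in zip(rt, at)]
        for rt, at in zip(rgb_tokens, attended)
    ]


# Toy inputs: two RGB tokens and three thermal tokens, each of dimension 2.
rgb = [[1.0, 0.0], [0.0, 1.0]]
thermal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = fuse(rgb, thermal, alpha=0.5)
```

With `alpha=0` the sketch returns the RGB tokens unchanged, and with `alpha=1` it returns only the thermal-attended features; how the actual model sets this weight is not stated in the abstract.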

Pages: 14

References cited: 38