HEFANet: hierarchical efficient fusion and aggregation segmentation network for enhanced RGB-thermal urban scene parsing

Cited by: 1
Authors
Shen, Zhengwen [1]
Pan, Zaiyu [1]
Weng, Yuchen [1]
Li, Yulian [1]
Wang, Jiangyu [1]
Wang, Jun [1]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
Keywords
RGB-thermal; Semantic segmentation; Feature descriptor; Spatial interaction; Sparse selection; Attention
DOI
10.1007/s10489-024-05743-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-thermal semantic segmentation is important for applications that must operate under adverse illumination conditions, such as autonomous driving and robotic sensing. However, most existing methods ignore the feature differences between the two modalities and do not effectively exploit features at different levels. In this paper, we present a novel multimodal feature fusion network named HEFANet, which enhances the interaction and fusion of features. Concretely, we propose a Cross-layer and Cross-modal Feature Descriptor (CCFD) module to mitigate the differences between the modalities and to mine valuable, correlated features across layers. To effectively fuse multimodal features at different levels, we propose a Multi-modal Interleaved Sparse Self-Attention (MISSA) module to aggregate rich spatial semantic information in the earlier layers. In the last layer, we propose a Spatial Interaction and Channel Selection (SICS) module that enhances the representation of rich contextual features and highlights important information through channel communication interactions, enabling selective sparse feature aggregation. Extensive experiments on three publicly available datasets (MFNet, PST900, and FMB) show that HEFANet achieves new state-of-the-art results. The code and results are available at https://github.com/shenzw21/HEFANet.
Pages: 11248-11266
Number of pages: 19