MS-IRTNet: Multistage information interaction network for RGB-T semantic segmentation

Citations: 14
Authors
Zhang, Zhiwei [1]
Liu, Yisha [1]
Xue, Weimin [1]
Affiliations
[1] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantic segmentation; Gate-weighted interaction; Feature information interaction; RGB image; Thermal image; FUSION NETWORK;
DOI
10.1016/j.ins.2023.119442
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The complementary information in RGB and thermal images can remarkably boost semantic segmentation performance. Existing RGB-T segmentation methods usually rely on simple interaction strategies to extract this complementary information, which overlooks the discriminative features produced by the two different imaging mechanisms. To address this problem, we propose a multistage information interaction network for RGB-T semantic segmentation called MS-IRTNet. MS-IRTNet has a dual-stream encoder structure that extracts multistage feature information. To improve the interaction between the two modalities, we design a gate-weighted interaction module (GWIM) and a feature information interaction module (FIIM). GWIM learns the weights of the multimodal information in different channels, while FIIM integrates and fuses the weighted RGB and thermal information into a single feature map. Finally, the multistage interactive information is fed into the decoder for semantic prediction. Our method achieves 60.5 mIoU on the MFNet dataset, outperforming state-of-the-art methods. Notably, MS-IRTNet also achieves state-of-the-art results on daytime images (51.7 mIoU) and nighttime images (62.5 mIoU). The code and pre-trained models are available at https://github.com/poisonzzw/MS-IRTNet.
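The results above are reported as mIoU (mean intersection over union). For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix. The function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative assumptions. This record does not state the paper's exact evaluation protocol (for example, how the unlabeled class is treated), so this is not the authors' code.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average the per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are skipped.
    """
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom == 0:
            continue  # class never appears, so its IoU is undefined
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with flattened per-pixel labels for three hypothetical classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(labels, preds, num_classes=3))
print(f"mIoU = {100 * score:.1f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2. Scores such as 60.5 are conventionally this average expressed as a percentage.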
Pages: 10