Dual-branch deep cross-modal interaction network for semantic segmentation with thermal images

Cited by: 1
Authors
Dai, Kang [1 ]
Chen, Suting [1 ]
Affiliations
[1] Nanjing Univ Informat Sci Technol, Sch Elect & Informat Engn, Nanjing 210044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Thermal images; Semantic segmentation; Cross-modal feature; Deep interaction; FUSION NETWORK; RGB;
DOI
10.1016/j.engappai.2024.108820
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Semantic segmentation using RGB (Red-Green-Blue) images and thermal data is an indispensable component of autonomous driving. The key to RGB-Thermal (RGB and Thermal) semantic segmentation is achieving the interaction and fusion of features between RGB and thermal images. Therefore, we propose a dual-branch deep cross-modal interaction network (DCIT) based on an Encoder-Decoder structure. This framework consists of two parallel networks for feature extraction from RGB and Thermal data. Specifically, in each feature extraction stage of the Encoder, we design a Cross Feature Regulation Module (CFRM) to align and correct modality-specific features by reducing inter-modality feature differences and eliminating intra-modality noise. Then, the modality features are aggregated through a Cross-Modal Feature Fusion Module (CMFFM) based on cross linear attention to capture global information from the modality features. Finally, an Adaptive Multi-Scale Cross-positional Fusion Module (AMCFM) utilizes the fused features to integrate consistent semantic information in the Decoder stage. Our framework improves the interaction of cross-modal features. Extensive experiments on urban scene datasets demonstrate that our proposed framework outperforms other RGB-Thermal semantic segmentation methods in terms of objective metrics and subjective visual assessments.
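The abstract describes a dual-branch encoder in which RGB and thermal features interact at every stage before being fused for the decoder. As a rough illustration of that idea only, the sketch below shows a single fusion stage where the two feature maps exchange information through bidirectional cross-attention; it is not the authors' CFRM/CMFFM/AMCFM code (those details are not given here), it uses standard multi-head attention rather than the paper's cross linear attention, and the class name and hyperparameters are illustrative assumptions.

```python
# Minimal sketch, assuming PyTorch; illustrative only, not the authors' implementation.
import torch
import torch.nn as nn


class CrossModalFusionStage(nn.Module):
    """One fusion stage: each modality queries the other, then both are merged."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.rgb_from_thermal = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.thermal_from_rgb = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb.shape
        # Flatten spatial positions into token sequences of shape (B, H*W, C).
        rgb_seq = rgb.flatten(2).transpose(1, 2)
        th_seq = thermal.flatten(2).transpose(1, 2)
        # Each branch attends to the other modality for complementary context.
        rgb_ctx, _ = self.rgb_from_thermal(rgb_seq, th_seq, th_seq)
        th_ctx, _ = self.thermal_from_rgb(th_seq, rgb_seq, rgb_seq)
        # Residual connections keep the original modality-specific features.
        rgb_out = (rgb_seq + rgb_ctx).transpose(1, 2).reshape(b, c, h, w)
        th_out = (th_seq + th_ctx).transpose(1, 2).reshape(b, c, h, w)
        # Concatenate both branches and project back to a single fused map.
        return self.merge(torch.cat([rgb_out, th_out], dim=1))


if __name__ == "__main__":
    stage = CrossModalFusionStage(channels=64)
    rgb_feat = torch.randn(2, 64, 30, 40)      # features from the RGB branch
    thermal_feat = torch.randn(2, 64, 30, 40)  # features from the thermal branch
    fused = stage(rgb_feat, thermal_feat)
    print(fused.shape)  # torch.Size([2, 64, 30, 40])
```

In the full DCIT network, a fusion step of this kind would be repeated at every encoder stage and its outputs passed to the decoder; the paper's actual modules additionally perform feature regulation, linear attention, and multi-scale cross-positional fusion, which this sketch does not reproduce.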
Pages: 15
Related Papers
50 records in total
  • [31] DMANet: Dual-branch multiscale attention network for real-time semantic segmentation
    Dong, Yongsheng
    Mao, Chongchong
    Zheng, Lintao
    Wu, Qingtao
    NEUROCOMPUTING, 2025, 617
  • [32] A lightweight dual-branch semantic segmentation network for enhanced obstacle detection in ship navigation
    Feng, Hui
    Liu, Wensheng
    Xu, Haixiang
    He, Jianhua
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 136
  • [33] TerSeg: A dual-branch semantic segmentation network for Mars terrain and autonomous path planning
    Fan, Lili
    Yuan, Jiabin
    Zha, Keke
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 270
  • [34] Cross-modal semantic transfer for point cloud semantic segmentation
    Cao, Zhen
    Mi, Xiaoxin
    Qiu, Bo
    Cao, Zhipeng
    Long, Chen
    Yan, Xinrui
    Zheng, Chao
    Dong, Zhen
    Yang, Bisheng
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2025, 221 : 265 - 279
  • [35] Cross-modal attention fusion network for RGB-D semantic segmentation
    Zhao, Qiankun
    Wan, Yingcai
    Xu, Jiqian
    Fang, Lijin
    NEUROCOMPUTING, 2023, 548
  • [36] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Duan, Zaipeng
    Huang, Xiao
    Ma, Jie
    NEURAL PROCESSING LETTERS, 2023, 55 (05) : 6361 - 6375
  • [37] Cross-Modal Hash Retrieval Model for Semantic Segmentation Network for Digital Libraries
    Tang, Siyu
    Yin, Jun
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (10) : 58 - 66
  • [38] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Zaipeng Duan
    Xiao Huang
    Jie Ma
    Neural Processing Letters, 2023, 55 : 6361 - 6375
  • [39] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [40] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66