A Feature Divide-and-Conquer Network for RGB-T Semantic Segmentation

Cited by: 23
Authors
Zhao, Shenlu [1 ,2 ]
Zhang, Qiang [1 ,2 ]
Affiliations
[1] Xidian Univ, Key Lab Elect Equipment Struct Design, Minist Educ, Xian 710071, Shaanxi, Peoples R China
[2] Xidian Univ, Ctr Complex Syst, Sch Mechanoelect Engn, Xian 710071, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Semantic segmentation; Data mining; Semantics; Lighting; Decoding; Thermal sensors; RGB-T semantic segmentation; feature divide-and-conquer strategy; multi-scale contextual information;
DOI
10.1109/TCSVT.2022.3229359
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Code
0808 ; 0809 ;
Abstract
Similar to other multi-modal pixel-level prediction tasks, existing RGB-T semantic segmentation methods usually employ a two-stream structure to extract RGB and thermal infrared (TIR) features, respectively, and adopt the same fusion strategies to integrate different levels of unimodal features. This results in inadequate extraction of unimodal features and insufficient exploitation of cross-modal information from the paired RGB and TIR images. Alternatively, in this paper, we present a novel RGB-T semantic segmentation model, i.e., FDCNet, where a feature divide-and-conquer strategy performs unimodal feature extraction and cross-modal feature fusion in one go. Concretely, we first employ a two-stream structure to extract unimodal low-level features, followed by a Siamese structure to extract unimodal high-level features from the paired RGB and TIR images. This concise but efficient structure makes it possible to take into account both the modality discrepancies of low-level features and the underlying semantic consistency of high-level features across the paired RGB and TIR images. Furthermore, considering the characteristics of different layers of features, a Cross-modal Spatial Activation (CSA) module and a Cross-modal Channel Activation (CCA) module are presented for the fusion of low-level RGB and TIR features and for the fusion of high-level RGB and TIR features, respectively, thus facilitating the capture of cross-modal information. On top of that, with an embedded Cross-scale Interaction Context (CIC) module for mining multi-scale contextual information, our proposed model (i.e., FDCNet) achieves new state-of-the-art results for RGB-T semantic segmentation on the MFNet and PST900 datasets.
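The record contains no code, so the following is only a minimal PyTorch sketch of the divide-and-conquer encoder the abstract describes: separate (two-stream) low-level encoders per modality, a shared (Siamese) high-level encoder, a spatial-attention fusion standing in for the CSA module, and a channel-attention fusion standing in for the CCA module. All layer widths, module internals, the decoder, and the class count are assumptions for illustration; the CIC context module is omitted.

```python
# Illustrative sketch only: the actual FDCNet backbone, CSA/CCA internals,
# decoder, and hyper-parameters are not given in this record.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=2):
    """3x3 conv + BN + ReLU; stands in for one backbone stage."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SpatialFusion(nn.Module):
    """Placeholder for the Cross-modal Spatial Activation (CSA) idea:
    low-level RGB/TIR features blended by a learned spatial attention map."""
    def __init__(self, ch):
        super().__init__()
        self.att = nn.Sequential(nn.Conv2d(2 * ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, rgb, tir):
        a = self.att(torch.cat([rgb, tir], dim=1))   # B x 1 x H x W
        return rgb * a + tir * (1 - a)


class ChannelFusion(nn.Module):
    """Placeholder for the Cross-modal Channel Activation (CCA) idea:
    high-level RGB/TIR features blended by learned channel weights."""
    def __init__(self, ch):
        super().__init__()
        self.att = nn.Sequential(nn.Linear(2 * ch, ch), nn.Sigmoid())

    def forward(self, rgb, tir):
        g = torch.cat([rgb.mean(dim=(2, 3)), tir.mean(dim=(2, 3))], dim=1)
        w = self.att(g).unsqueeze(-1).unsqueeze(-1)  # B x C x 1 x 1
        return rgb * w + tir * (1 - w)


class DivideAndConquerEncoder(nn.Module):
    """Two-stream low-level stages (modality-specific weights) followed by a
    Siamese high-level stage (one set of weights applied to both modalities)."""
    def __init__(self, num_classes=9):
        super().__init__()
        # Divide: separate low-level encoders for RGB and TIR.
        self.low_rgb = conv_block(3, 64)
        self.low_tir = conv_block(1, 64)
        # Conquer: a single shared (Siamese) high-level encoder.
        self.high_shared = conv_block(64, 128)
        self.fuse_low = SpatialFusion(64)
        self.fuse_high = ChannelFusion(128)
        self.head = nn.Conv2d(128, num_classes, 1)

    def forward(self, rgb, tir):
        lr, lt = self.low_rgb(rgb), self.low_tir(tir)
        hr, ht = self.high_shared(lr), self.high_shared(lt)  # shared weights
        fused_low = self.fuse_low(lr, lt)    # a real decoder would also use this
        fused_high = self.fuse_high(hr, ht)
        return self.head(fused_high), fused_low


if __name__ == "__main__":
    model = DivideAndConquerEncoder()
    logits, _ = model(torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640))
    print(logits.shape)  # torch.Size([1, 9, 120, 160]), i.e. 1/4-resolution logits
```

In the sketch, "divide" is the pair of unshared low-level stages that respect modality discrepancies, while "conquer" is the shared high-level stage that exploits the semantic consistency of the two modalities; a full model would add an upsampling decoder and the multi-scale CIC module.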
Pages: 2892 - 2905
Number of pages: 14