Cross-level interaction fusion network-based RGB-T semantic segmentation for distant targets

Citations: 0
Authors
Chen, Yu [1]
Li, Xiang [1]
Luan, Chao [2]
Hou, Weimin [2]
Liu, Haochen [2]
Zhu, Zihui [3]
Xue, Lian [3]
Zhang, Jianqi [1]
Liu, Delian [1]
Wu, Xin [1]
Wei, Linfang [1]
Jian, Chaochao [1]
Li, Jinze [1]
Affiliations
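[1] Xidian University, School of Optoelectronic Engineering, Xi'an 710071, People's Republic of China
[2] Beijing Institute of Control and Electronic Technology, Beijing 100038, People's Republic of China
[3] National Key Laboratory of Science and Technology on Test Physics and Numerical Mathematics, Beijing 100076, People's Republic of China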
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Semantic segmentation; Feature fusion; Cross modality; Multi-scale information; Distant object
DOI
10.1016/j.patcog.2024.111218
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-T segmentation is an innovative approach driven by advances in multispectral detection and is poised to replace traditional RGB-only segmentation methods. An effective cross-modality feature fusion module is essential for this technology, and the precise segmentation of distant objects is a second significant challenge. To address both, we propose an end-to-end distant object feature fusion network (DOFFNet) for RGB-T segmentation. First, we introduce a cross-level interaction fusion strategy (CLIF) and an inter-correlation fusion method (IFFM) in the encoder to enhance multi-scale feature expression and improve fusion accuracy. Second, we propose a residual dense pixel convolution (R-DPC) in the decoder with a trainable upsampling unit that dynamically reconstructs information lost during encoding, particularly for distant objects whose features may vanish after pooling. Experimental results show that DOFFNet achieves a top mean pixel accuracy of 75.8% and substantially improves accuracy for four classes, including objects that occupy as little as 0.2%-2% of total pixels. This improvement makes performance more reliable in practical applications, particularly where small object detection is critical. The method also shows potential applicability in other fields such as medical imaging and remote sensing.
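The abstract reports mean pixel accuracy and stresses classes that cover only a small share of pixels. The sketch below illustrates why a class-averaged metric is sensitive to such classes. It assumes the common definition (per-class pixel accuracy averaged over the classes present in the ground truth); the paper's exact evaluation protocol is not given in this record, and the function name, the `ignore_label` parameter, and the toy data are hypothetical.

```python
from collections import defaultdict


def mean_pixel_accuracy(pred, target, ignore_label=None):
    """Return (mean of per-class pixel accuracy, per-class accuracy dict).

    Assumed definition, not taken from the paper: for each class present in
    `target`, accuracy = correctly labelled pixels / ground-truth pixels of
    that class; the mean is taken over those classes.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip pixels marked as unlabeled
        total[t] += 1
        if p == t:
            correct[t] += 1
    per_class = {c: correct[c] / total[c] for c in total}
    return sum(per_class.values()) / len(per_class), per_class


# Toy flattened label maps (hypothetical data): class 0 is background with
# 98 pixels, class 1 is a small object covering 2% of the pixels.
target = [0] * 98 + [1] * 2
pred = [0] * 100  # a prediction that misses the small object entirely

macc, per_class = mean_pixel_accuracy(pred, target)
overall = sum(p == t for p, t in zip(pred, target)) / len(target)

print(f"overall pixel accuracy: {overall:.2f}")
print(f"mean pixel accuracy:    {macc:.2f}")
print(f"per-class accuracy:     {per_class}")
```

In this toy case the overall pixel accuracy stays high because the background dominates, while the class-averaged score drops sharply because the small class is missed completely. Under the assumed definition, gains on small or distant objects therefore show up much more clearly in mean pixel accuracy than in overall accuracy.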
Pages: 13