BMDENet: Bi-Directional Modality Difference Elimination Network for Few-Shot RGB-T Semantic Segmentation

Cited by: 3
Authors
Zhao, Ying [1 ,2 ]
Song, Kechen [1 ,2 ]
Zhang, Yiming [1 ,2 ]
Yan, Yunhui [1 ,2 ]
Affiliations
[1] Northeastern Univ, Natl Frontiers Sci Ctr Ind Intelligence & Syst Opt, Sch Mech Engn & Automat, Shenyang 110819, Liaoning, Peoples R China
[2] Northeastern Univ, Key Lab Data Analyt & Optimizat Smart Ind, Minist Educ, Shenyang 110819, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot semantic segmentation; RGB-T FSS; difference elimination; cross-modal; NEURAL-NETWORKS;
DOI
10.1109/TCSII.2023.3278941
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology and Communication Technology];
Subject classification codes
0808; 0809;
Abstract
Few-shot semantic segmentation (FSS) aims to segment the target foregrounds of query images using only a few labeled support samples. Compared with fully supervised methods, FSS generalizes better to unseen classes and reduces the burden of annotating large pixel-level datasets. To cope with complex outdoor lighting conditions, we introduce thermal infrared (T) images into the FSS task. However, existing RGB-T FSS methods fuse the modalities directly and ignore the differences between them, which may hinder cross-modal information interaction. Also considering the effect of successive downsampling on the results, we propose a bidirectional modality difference elimination network (BMDENet) to boost segmentation performance. Concretely, the bidirectional modality difference elimination module (BMDEM) reduces the heterogeneity between RGB and thermal images in the prototype space. The residual attention fusion module (RAFM) mines the bimodal features to fully fuse the cross-modal information. In addition, the mainstay and subsidiary enhancement module (MSEM) enhances the fused features to counter the aforementioned downsampling problem in the deeper stages of the model. Extensive experiments on the Tokyo Multi-Spectral-4i dataset show that BMDENet achieves state-of-the-art performance under both 1-shot and 5-shot settings.
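The abstract describes the architecture only at a high level and gives no implementation details. As a rough, hypothetical illustration of the kind of cross-modal residual attention fusion that the RAFM appears to perform, a minimal PyTorch-style sketch is given below; every layer choice, tensor shape, and name is an assumption made for illustration, not the authors' code.

# Hypothetical sketch of a residual-attention-style fusion of RGB and thermal
# feature maps, loosely in the spirit of the RAFM described in the abstract.
# Shapes, layers, and names are assumptions, not the published implementation.
import torch
import torch.nn as nn


class ResidualAttentionFusion(nn.Module):
    """Fuse RGB and thermal feature maps with channel attention and a residual path."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Channel attention computed over the concatenated bimodal features.
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Project the re-weighted bimodal features back to the unimodal width.
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, thermal_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb_feat, thermal_feat], dim=1)   # (B, 2C, H, W)
        weights = self.attn(self.pool(x))                # (B, 2C, 1, 1)
        fused = self.proj(x * weights)                   # attention-weighted fusion
        # Residual path keeps the original unimodal information intact.
        return fused + rgb_feat + thermal_feat


if __name__ == "__main__":
    rgb = torch.randn(2, 256, 32, 32)
    thermal = torch.randn(2, 256, 32, 32)
    out = ResidualAttentionFusion(256)(rgb, thermal)
    print(out.shape)  # torch.Size([2, 256, 32, 32])

In this sketch the residual connection preserves the unimodal RGB and thermal features, while the channel attention re-weights the concatenated bimodal features before projection, matching the intuition of "mining the bimodal features to fully fuse cross-modal information".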
Pages: 4266-4270
Number of pages: 5