BMDENet: Bi-Directional Modality Difference Elimination Network for Few-Shot RGB-T Semantic Segmentation

Cited by: 3
Authors
Zhao, Ying [1 ,2 ]
Song, Kechen [1 ,2 ]
Zhang, Yiming [1 ,2 ]
Yan, Yunhui [1 ,2 ]
Affiliations
[1] Northeastern Univ, Natl Frontiers Sci Ctr Ind Intelligence & Syst Opt, Sch Mech Engn & Automat, Shenyang 110819, Liaoning, Peoples R China
[2] Northeastern Univ, Key Lab Data Analyt & Optimizat Smart Ind, Minist Educ, Shenyang 110819, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Few-shot semantic segmentation; RGB-T FSS; difference elimination; cross-modal; NEURAL-NETWORKS;
DOI
10.1109/TCSII.2023.3278941
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology and Communication Technology];
Subject classification codes
0808; 0809;
Abstract
Few-shot semantic segmentation (FSS) aims to segment the target foregrounds of query images using only a few labeled support samples. Compared with fully supervised methods, FSS generalizes better to unseen classes and reduces the burden of annotating large pixel-level datasets. To cope with complex outdoor lighting conditions, we introduce thermal infrared (T) images into the FSS task. However, existing RGB-T FSS methods fuse the modalities directly and ignore the differences between them, which may hinder cross-modal information interaction. Also considering the effect of successive downsampling on the results, we propose a bidirectional modality difference elimination network (BMDENet) to boost segmentation performance. Concretely, the bidirectional modality difference elimination module (BMDEM) reduces the heterogeneity between RGB and thermal images in the prototype space. The residual attention fusion module (RAFM) mines the bimodal features to fully fuse the cross-modal information. In addition, the mainstay and subsidiary enhancement module (MSEM) enhances the fused features to counter the aforementioned downsampling problem in the deeper stages of the model. Extensive experiments on the Tokyo Multi-Spectral-4i dataset show that BMDENet achieves state-of-the-art performance under both 1-shot and 5-shot settings.
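The abstract describes the architecture only at a high level and gives no implementation details. As a rough, hypothetical illustration of the kind of cross-modal residual attention fusion that the RAFM appears to perform, a minimal PyTorch-style sketch is given below; every layer choice, tensor shape, and name is an assumption made for illustration, not the authors' code.

# Hypothetical sketch of a residual-attention-style fusion of RGB and thermal
# feature maps, loosely in the spirit of the RAFM described in the abstract.
# Shapes, layers, and names are assumptions, not the published implementation.
import torch
import torch.nn as nn


class ResidualAttentionFusion(nn.Module):
    """Fuse RGB and thermal feature maps with channel attention and a residual path."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Channel attention computed over the concatenated bimodal features.
        self.attn = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Project the re-weighted bimodal features back to the unimodal width.
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, thermal_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb_feat, thermal_feat], dim=1)   # (B, 2C, H, W)
        weights = self.attn(self.pool(x))                # (B, 2C, 1, 1)
        fused = self.proj(x * weights)                   # attention-weighted fusion
        # Residual path keeps the original unimodal information intact.
        return fused + rgb_feat + thermal_feat


if __name__ == "__main__":
    rgb = torch.randn(2, 256, 32, 32)
    thermal = torch.randn(2, 256, 32, 32)
    out = ResidualAttentionFusion(256)(rgb, thermal)
    print(out.shape)  # torch.Size([2, 256, 32, 32])

In this sketch the residual connection preserves the unimodal RGB and thermal features, while the channel attention re-weights the concatenated bimodal features before projection, matching the intuition of "mining the bimodal features to fully fuse cross-modal information".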
Pages: 4266-4270
Number of pages: 5