Full-Scale Selective Transformer for Semantic Segmentation

Cited: 0
Authors
Lin, Fangjian [1 ,2 ,3 ]
Wu, Sitong [2 ]
Ma, Yizhe [1 ]
Tian, Shengwei [1 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Peoples R China
[2] Baidu VIS, Beijing, Peoples R China
[3] Baidu Res, Inst Deep Learning, Beijing, Peoples R China
Source
COMPUTER VISION - ACCV 2022, PT VII | 2023 / Vol. 13847
Keywords
Semantic segmentation; Transformer; Full-scale feature fusion;
DOI
10.1007/978-3-031-26293-7_19
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we rethink multi-scale feature fusion from two perspectives (scale-level and spatial-level) and propose a full-scale selective fusion strategy for semantic segmentation. Based on this strategy, we design a novel segmentation network, named Full-scale Selective Transformer (FSFormer). Specifically, our FSFormer adaptively selects partial tokens from all tokens at all scales to construct a token subset of interest for each scale, so that each token interacts only with the tokens within its corresponding subset of interest. The proposed full-scale selective fusion strategy not only filters out noisy information propagation but also reduces computational costs to some extent. We evaluate FSFormer on four challenging semantic segmentation benchmarks, including PASCAL Context, ADE20K, COCO-Stuff 10K, and Cityscapes, outperforming state-of-the-art methods.
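The following is a minimal, hypothetical sketch of the selective fusion idea described in the abstract, assuming a standard PyTorch setting: for each scale, a learned score ranks the tokens of all scales, only the top-k are kept as that scale's "subset of interest", and the scale's tokens cross-attend to this subset. Class and parameter names (FullScaleSelectiveFusion, top_k, num_scales) are illustrative and not taken from the authors' implementation.

import torch
import torch.nn as nn

class FullScaleSelectiveFusion(nn.Module):
    """Sketch: each scale selects its own subset of interest from the tokens
    of all scales and attends only to that subset (assumed mechanism)."""

    def __init__(self, dim: int, num_scales: int = 4, num_heads: int = 8, top_k: int = 256):
        super().__init__()
        self.top_k = top_k
        # one scoring head per scale, so each scale picks its own token subset
        self.scorers = nn.ModuleList(nn.Linear(dim, 1) for _ in range(num_scales))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens_per_scale):
        # tokens_per_scale: list of (B, N_s, C) token maps, one per scale
        all_tokens = torch.cat(tokens_per_scale, dim=1)            # (B, N_total, C)
        outputs = []
        for x, scorer in zip(tokens_per_scale, self.scorers):
            scores = scorer(all_tokens).squeeze(-1)                # (B, N_total)
            k = min(self.top_k, all_tokens.shape[1])
            idx = scores.topk(k, dim=1).indices                    # (B, k)
            subset = torch.gather(
                all_tokens, 1,
                idx.unsqueeze(-1).expand(-1, -1, all_tokens.shape[-1]))  # (B, k, C)
            # tokens of this scale interact only with their subset of interest
            attn_out, _ = self.attn(x, subset, subset, need_weights=False)
            outputs.append(x + attn_out)
        return outputs

# toy usage: four scales with decreasing token counts
fusion = FullScaleSelectiveFusion(dim=256, num_scales=4)
feats = [torch.randn(2, n, 256) for n in (4096, 1024, 256, 64)]
fused = fusion(feats)  # list of tensors with the same shapes as the inputs

Restricting attention to the top-k subset is what bounds the interaction cost: attention is computed against k selected tokens rather than all N_total tokens, which matches the abstract's claim of reduced computation.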
Pages: 310 - 326
Page count: 17
Related Papers
50 records in total
  • [21] Enhancing Semantically Masked Transformer With Local Attention for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2023, 11 : 122345 - 122356
  • [22] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [23] Learning graph structures with transformer for weakly supervised semantic segmentation
    Wanchun Sun
    Xin Feng
    Hui Ma
    Jingyao Liu
    Complex & Intelligent Systems, 2023, 9 : 7511 - 7521
  • [24] Learning graph structures with transformer for weakly supervised semantic segmentation
    Sun, Wanchun
    Feng, Xin
    Ma, Hui
    Liu, Jingyao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (06) : 7511 - 7521
  • [25] Transformer framework for depth-assisted UDA semantic segmentation
    Song, Yunna
    Shi, Jinlong
    Zou, Danping
    Liu, Caisheng
    Bai, Suqin
    Shu, Xin
    Qian, Qian
    Xu, Dan
    Yuan, Yu
    Sun, Yunhan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
  • [26] TBFormer: three-branch efficient transformer for semantic segmentation
    Wei, Can
    Wei, Yan
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3661 - 3672
  • [27] Remote Sensing Image Semantic Segmentation Based on Cascaded Transformer
    Wang F.
    Ji J.
    Wang Y.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 8 : 4136 - 4148
  • [28] Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    SENSORS, 2023, 23 (02)
  • [29] EMSFomer: Efficient Multi-Scale Transformer for Real-Time Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2025, 13 : 18239 - 18252
  • [30] Global and edge enhanced transformer for semantic segmentation of remote sensing
    Wang, Hengyou
    Li, Xiao
    Huo, Lianzhi
    Hu, Changmiao
    APPLIED INTELLIGENCE, 2024, 54 (07) : 5658 - 5673