Full-Scale Selective Transformer for Semantic Segmentation

Cited: 0
Authors
Lin, Fangjian [1 ,2 ,3 ]
Wu, Sitong [2 ]
Ma, Yizhe [1 ]
Tian, Shengwei [1 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi, Peoples R China
[2] Baidu VIS, Beijing, Peoples R China
[3] Baidu Res, Inst Deep Learning, Beijing, Peoples R China
Source
COMPUTER VISION - ACCV 2022, PT VII | 2023 / Vol. 13847
Keywords
Semantic segmentation; Transformer; Full-scale feature fusion;
DOI
10.1007/978-3-031-26293-7_19
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we rethink multi-scale feature fusion from two perspectives (scale-level and spatial-level) and propose a full-scale selective fusion strategy for semantic segmentation. Based on this strategy, we design a novel segmentation network, named Full-scale Selective Transformer (FSFormer). Specifically, our FSFormer adaptively selects partial tokens from all tokens at all scales to construct a token subset of interest for each scale, so that each token interacts only with the tokens within its corresponding subset of interest. The proposed full-scale selective fusion strategy not only filters out noisy information propagation but also reduces computational costs to some extent. We evaluate FSFormer on four challenging semantic segmentation benchmarks, including PASCAL Context, ADE20K, COCO-Stuff 10K, and Cityscapes, outperforming state-of-the-art methods.
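The following is a minimal, hypothetical sketch of the selective fusion idea described in the abstract, assuming a standard PyTorch setting: for each scale, a learned score ranks the tokens of all scales, only the top-k are kept as that scale's "subset of interest", and the scale's tokens cross-attend to this subset. Class and parameter names (FullScaleSelectiveFusion, top_k, num_scales) are illustrative and not taken from the authors' implementation.

import torch
import torch.nn as nn

class FullScaleSelectiveFusion(nn.Module):
    """Sketch: each scale selects its own subset of interest from the tokens
    of all scales and attends only to that subset (assumed mechanism)."""

    def __init__(self, dim: int, num_scales: int = 4, num_heads: int = 8, top_k: int = 256):
        super().__init__()
        self.top_k = top_k
        # one scoring head per scale, so each scale picks its own token subset
        self.scorers = nn.ModuleList(nn.Linear(dim, 1) for _ in range(num_scales))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens_per_scale):
        # tokens_per_scale: list of (B, N_s, C) token maps, one per scale
        all_tokens = torch.cat(tokens_per_scale, dim=1)            # (B, N_total, C)
        outputs = []
        for x, scorer in zip(tokens_per_scale, self.scorers):
            scores = scorer(all_tokens).squeeze(-1)                # (B, N_total)
            k = min(self.top_k, all_tokens.shape[1])
            idx = scores.topk(k, dim=1).indices                    # (B, k)
            subset = torch.gather(
                all_tokens, 1,
                idx.unsqueeze(-1).expand(-1, -1, all_tokens.shape[-1]))  # (B, k, C)
            # tokens of this scale interact only with their subset of interest
            attn_out, _ = self.attn(x, subset, subset, need_weights=False)
            outputs.append(x + attn_out)
        return outputs

# toy usage: four scales with decreasing token counts
fusion = FullScaleSelectiveFusion(dim=256, num_scales=4)
feats = [torch.randn(2, n, 256) for n in (4096, 1024, 256, 64)]
fused = fusion(feats)  # list of tensors with the same shapes as the inputs

Restricting attention to the top-k subset is what bounds the interaction cost: attention is computed against k selected tokens rather than all N_total tokens, which matches the abstract's claim of reduced computation.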
Pages: 310 - 326
Page count: 17
Related Papers
50 records in total
  • [21] Enhancing Semantically Masked Transformer With Local Attention for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2023, 11 : 122345 - 122356
  • [22] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [23] Learning graph structures with transformer for weakly supervised semantic segmentation
    Wanchun Sun
    Xin Feng
    Hui Ma
    Jingyao Liu
    Complex & Intelligent Systems, 2023, 9 : 7511 - 7521
  • [24] Learning graph structures with transformer for weakly supervised semantic segmentation
    Sun, Wanchun
    Feng, Xin
    Ma, Hui
    Liu, Jingyao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (06) : 7511 - 7521
  • [25] Transformer framework for depth-assisted UDA semantic segmentation
    Song, Yunna
    Shi, Jinlong
    Zou, Danping
    Liu, Caisheng
    Bai, Suqin
    Shu, Xin
    Qian, Qian
    Xu, Dan
    Yuan, Yu
    Sun, Yunhan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
  • [26] TBFormer: three-branch efficient transformer for semantic segmentation
    Wei, Can
    Wei, Yan
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3661 - 3672
  • [27] Remote Sensing Image Semantic Segmentation Based on Cascaded Transformer
    Wang F.
    Ji J.
    Wang Y.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 8 : 4136 - 4148
  • [28] Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    SENSORS, 2023, 23 (02)
  • [29] EMSFomer: Efficient Multi-Scale Transformer for Real-Time Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2025, 13 : 18239 - 18252
  • [30] Global and edge enhanced transformer for semantic segmentation of remote sensing
    Wang, Hengyou
    Li, Xiao
    Huo, Lianzhi
    Hu, Changmiao
    APPLIED INTELLIGENCE, 2024, 54 (07) : 5658 - 5673