SSformer: A Lightweight Transformer for Semantic Segmentation

Cited by: 23
Authors
Shi, Wentao [1]
Xu, Jing [1]
Gao, Pan [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
Source
2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2022
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model;
DOI
10.1109/MMSP55362.2022.9949177
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
It is widely believed that Transformers outperform convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has a high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows, while also being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense-prediction segmentation tasks. Further, simply combining Swin Transformer with existing methods would increase the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, exploiting the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus obtaining both local and global attention. Experimental results show that the proposed SSformer yields mIoU comparable to state-of-the-art models while maintaining a smaller model size and lower compute. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
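The decoder idea in the abstract (aggregating features from the different stages of a hierarchical backbone) can be illustrated with a toy sketch. This is not the authors' implementation, which is in the repository linked above. The function names, the nearest-neighbour upsampling, and the single per-pixel linear layer are all assumptions chosen for illustration; feature maps are plain nested lists shaped [channels][height][width].

```python
# Hypothetical sketch of multi-stage feature aggregation (not the SSformer code).

def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling of a [C][H][W] feature map (assumed choice)."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [[ch[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
         for y in range(out_h)]
        for ch in fmap
    ]

def linear_per_pixel(fmap, weight, bias):
    """Apply one linear layer to the channel vector at every pixel."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(wt * fmap[c][y][x] for c, wt in enumerate(row)) + b
          for x in range(w)]
         for y in range(h)]
        for row, b in zip(weight, bias)
    ]

def aggregate(stages, weight, bias):
    """Upsample each stage to the finest resolution, concatenate channels, project."""
    out_h, out_w = len(stages[0][0]), len(stages[0][0][0])
    fused = []
    for fmap in stages:
        fused.extend(upsample_nearest(fmap, out_h, out_w))
    return linear_per_pixel(fused, weight, bias)

# Toy input: a fine 4x4 stage and a coarse 2x2 stage, one channel each.
fine = [[[1.0] * 4 for _ in range(4)]]
coarse = [[[10.0, 20.0], [30.0, 40.0]]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # 2 output classes x 2 fused channels
bias = [0.0, 0.5]
logits = aggregate([fine, coarse], weight, bias)  # shape [2][4][4]
```

In this toy setup the first output channel simply copies the fine stage, while the second carries the coarse stage upsampled to full resolution, so each pixel sees both local and wider-context information.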
Pages: 5
Related papers
50 in total
  • [11] Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    SENSORS, 2023, 23 (02)
  • [12] Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
    Hu, Xuegang
    Gong, Yu
    IEEE ACCESS, 2021, 9 : 55630 - 55643
  • [13] LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer
    Zhang, Xiangyue
    Li, Hexiao
    Ru, Jingyu
    Ji, Peng
    Wu, Chengdong
    ELECTRONICS, 2024, 13 (12)
  • [15] Semantic segmentation in medical images through transfused convolution and transformer networks
    Dhamija, Tashvik
    Gupta, Anunay
    Gupta, Shreyansh
    Anjum
    Katarya, Rahul
    Singh, Ghanshyam
    APPLIED INTELLIGENCE, 2023, 53 (01) : 1132 - 1148
  • [16] AST-Net: Lightweight Hybrid Transformer for Multimodal Brain Tumor Segmentation
    Wang, Peixu
    Liu, Shikun
    Peng, Jialin
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022 : 4623 - 4629
  • [17] Video Semantic Segmentation via Sparse Temporal Transformer
    Li, Jiangtong
    Wang, Wentao
    Chen, Junjie
    Niu, Li
    Si, Jianlou
    Qian, Chen
    Zhang, Liqing
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 59 - 68
  • [18] Semantic segmentation feature fusion network based on transformer
    Li, Tianping
    Cui, Zhaotong
    Zhang, Hua
    SCIENTIFIC REPORTS, 2025, 15 (01)
  • [19] Efficient and adaptive semantic segmentation network based on Transformer
    Zhang, H.-B.
    Cai, L.
    Ren, J.-P.
    Wang, R.-Y.
    Liu, F.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (06) : 1205 - 1214
  • [20] Full-Scale Selective Transformer for Semantic Segmentation
    Lin, Fangjian
    Wu, Sitong
    Ma, Yizhe
    Tian, Shengwei
    COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 310 - 326