SSformer: A Lightweight Transformer for Semantic Segmentation

Cited by: 23
Authors
Shi, Wentao [1]
Xu, Jing [1]
Gao, Pan [1]
Affiliations
[1] Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Nanjing, People's Republic of China
Source
2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2022
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model;
DOI
10.1109/MMSP55362.2022.9949177
CLC Number (Chinese Library Classification)
TP31 [Computer Software];
Discipline Codes
081202 ; 0835 ;
Abstract
It is widely believed that Transformers outperform convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and suffers from high time complexity. Recently, Swin Transformer [3] set new records on various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, because Swin Transformer is designed specifically for image classification, it may achieve suboptimal performance on dense prediction tasks such as segmentation. Further, simply combining Swin Transformer with existing methods would substantially increase the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. Exploiting the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thereby capturing both local and global attention. Experimental results show that the proposed SSformer yields mIoU comparable to that of state-of-the-art models while maintaining a smaller model size and lower computational cost. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
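The abstract describes a decoder that aggregates features from the different stages of the hierarchical Swin backbone to capture both local and global context. The sketch below is a hypothetical illustration of that general idea, not the released SSformer code: it assumes Swin-T stage widths of 96/192/384/768 and shows each stage being projected to a shared embedding width with a per-token linear layer, upsampled to the 1/4-scale map, and fused with a small MLP before a linear classification head. The class name, embedding width, and class count are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' implementation) of a lightweight decoder
# that aggregates the four hierarchical feature maps of a Swin-style backbone.
# Stage widths below assume a Swin-T backbone (96/192/384/768); adjust as needed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightweightAggregationDecoder(nn.Module):
    def __init__(self, in_channels=(96, 192, 384, 768), embed_dim=256, num_classes=19):
        super().__init__()
        # One linear projection per backbone stage, mapping to a shared embedding width.
        self.proj = nn.ModuleList(nn.Linear(c, embed_dim) for c in in_channels)
        # Fuse the concatenated multi-scale features, then predict per-pixel classes.
        self.fuse = nn.Sequential(
            nn.Linear(embed_dim * len(in_channels), embed_dim),
            nn.GELU(),
        )
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, feats):
        # feats: list of 4 tensors [B, C_i, H_i, W_i], from shallow to deep stages.
        target_hw = feats[0].shape[2:]  # upsample everything to the 1/4-scale map
        upsampled = []
        for f, proj in zip(feats, self.proj):
            b, c, h, w = f.shape
            f = proj(f.flatten(2).transpose(1, 2))       # [B, H*W, embed_dim]
            f = f.transpose(1, 2).reshape(b, -1, h, w)   # back to [B, embed_dim, H, W]
            f = F.interpolate(f, size=target_hw, mode="bilinear", align_corners=False)
            upsampled.append(f)
        x = torch.cat(upsampled, dim=1)                  # [B, 4*embed_dim, H/4, W/4]
        x = self.fuse(x.flatten(2).transpose(1, 2))      # token-wise MLP fusion
        logits = self.classifier(x)                      # [B, H*W, num_classes]
        b, n, k = logits.shape
        return logits.transpose(1, 2).reshape(b, k, *target_hw)


# Usage with dummy Swin-T-shaped features for a 512x512 input:
if __name__ == "__main__":
    feats = [torch.randn(1, c, s, s) for c, s in zip((96, 192, 384, 768), (128, 64, 32, 16))]
    print(LightweightAggregationDecoder()(feats).shape)  # torch.Size([1, 19, 128, 128])
```

Keeping every decoder operation as a per-token linear layer is what keeps the parameter count and compute low relative to convolutional decoder heads, which matches the lightweight goal stated in the abstract.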
Pages: 5
Related Papers
50 records in total
  • [21] Indoor semantic segmentation based on Swin-Transformer
    Zheng, Yunping
    Xu, Yuan
    Shu, Shiqiang
    Sarem, Mudar
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 98
  • [22] A Patch Diversity Transformer for Domain Generalized Semantic Segmentation
    He, Pei
    Jiao, Licheng
    Shang, Ronghua
    Liu, Xu
    Liu, Fang
    Yang, Shuyuan
    Zhang, Xiangrong
    Wang, Shuang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (10) : 14138 - 14150
  • [23] MUSTER: A Multi-Scale Transformer-Based Decoder for Semantic Segmentation
    Xu, Jing
    Shi, Wentao
    Gao, Pan
    Li, Qizhu
    Wang, Zhengwei
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2025, 9 (01) : 202 - 212
  • [24] Evaluating Transformer-based Semantic Segmentation Networks for Pathological Image Segmentation
    Cam Nguyen
    Asad, Zuhayr
    Deng, Ruining
    Huo, Yuankai
    MEDICAL IMAGING 2022: IMAGE PROCESSING, 2022, 12032
  • [25] Heavy and Lightweight Deep Learning Models for Semantic Segmentation: A Survey
    Carunta, Cristina
    Carunta, Alina
    Popa, Calin-Adrian
    IEEE ACCESS, 2025, 13 : 17745 - 17765
  • [26] Semantic Segmentation of the Eye With a Lightweight Deep Network and Shape Correction
    Huynh, Van Thong
    Yang, Hyung-Jeong
    Lee, Guee-Sang
    Kim, Soo-Hyung
    IEEE ACCESS, 2020, 8 : 131967 - 131974
  • [27] ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation
    Jeong, Seonggyun
    Heo, Yong Seok
    SENSORS, 2024, 24 (21)
  • [28] Multispectral Fusion Transformer Network for RGB-Thermal Urban Scene Semantic Segmentation
    Zhou, Heng
    Tian, Chunna
    Zhang, Zhenxi
    Huo, Qizheng
    Xie, Yongqiang
    Li, Zhongbo
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [29] Enhancing Semantically Masked Transformer With Local Attention for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2023, 11 : 122345 - 122356
  • [30] Cross-scale sampling transformer for semantic image segmentation
    Ma, Yizhe
    Yu, Long
    Lin, Fangjian
    Tian, Shengwei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 2895 - 2907