SSformer: A Lightweight Transformer for Semantic Segmentation

Times cited: 23

Authors:

Shi, Wentao [1]

Xu, Jing [1]

Gao, Pan [1]

Affiliation:

[1] Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Nanjing, People's Republic of China

Source:

2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP) | 2022

Keywords:

Image Segmentation; Transformer; Multilayer perceptron; Lightweight model

DOI:

10.1109/MMSP55362.2022.9949177

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

It is widely believed that Transformers perform better than convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has a high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows, while also being more efficient. However, because Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense-prediction segmentation tasks. Further, simply combining Swin Transformer with existing methods would increase the model size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus obtaining both local and global attention. Experimental results show that the proposed SSformer yields mIoU performance comparable to state-of-the-art models, while maintaining a smaller model size and lower compute. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
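
The idea of a decoder that aggregates features from the different stages of a hierarchical backbone can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked in the abstract for that); it is a generic MLP-style aggregation written under stated assumptions. The function names (`linear`, `upsample_nearest`, `aggregate`), the embedding width of 64, the 19 classes, the nearest-neighbour upsampling, and the random weights are all hypothetical. The stage strides (4, 8, 16, 32) and channel widths (96, 192, 384, 768) follow the standard Swin-T configuration, which is assumed here rather than taken from the abstract.

```python
import numpy as np

def linear(x, w, b):
    """Per-pixel linear layer: x is (H, W, C_in), w is (C_in, C_out)."""
    return x @ w + b

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of an (H, W, C) map by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def aggregate(features, proj_params, fuse_params, cls_params):
    """Fuse multi-scale features into per-pixel class logits.

    features    : list of (H_i, W_i, C_i) maps, highest resolution first
    proj_params : one (weight, bias) pair per stage, mapping C_i -> D
    fuse_params : (weight, bias) mapping the concatenated 4*D channels -> D
    cls_params  : (weight, bias) mapping D -> number of classes
    """
    target_h = features[0].shape[0]
    projected = []
    for feat, (w, b) in zip(features, proj_params):
        p = linear(feat, w, b)                  # unify channel width
        factor = target_h // feat.shape[0]      # 1, 2, 4, 8 for four stages
        projected.append(upsample_nearest(p, factor))
    fused = np.concatenate(projected, axis=-1)  # (H/4, W/4, 4*D)
    fused = np.maximum(linear(fused, *fuse_params), 0.0)  # ReLU
    return linear(fused, *cls_params)           # (H/4, W/4, num_classes)

# Hypothetical setup: a 64x64 input and Swin-T-like stage outputs.
rng = np.random.default_rng(0)
image_size, embed_dim, num_classes = 64, 64, 19
strides = [4, 8, 16, 32]
channels = [96, 192, 384, 768]

features = [rng.standard_normal((image_size // s, image_size // s, c))
            for s, c in zip(strides, channels)]
proj_params = [(rng.standard_normal((c, embed_dim)) * 0.02, np.zeros(embed_dim))
               for c in channels]
fuse_params = (rng.standard_normal((4 * embed_dim, embed_dim)) * 0.02,
               np.zeros(embed_dim))
cls_params = (rng.standard_normal((embed_dim, num_classes)) * 0.02,
              np.zeros(num_classes))

logits = aggregate(features, proj_params, fuse_params, cls_params)
print(logits.shape)
```

In this sketch the shallow, high-resolution stages contribute local detail and the deep, low-resolution stages contribute wider context; concatenating them at one quarter of the input resolution is one common way to combine the two. A real model would upsample the logits to the full input size, typically with bilinear interpolation.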


Pages: 5
