DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation

Cited by: 602

Authors:

Lin, Ailiang [1]

Chen, Bingzhi [1]

Xu, Jiayu [1]

Zhang, Zheng [1]

Lu, Guangming [1]

Zhang, David [2]

Affiliations:

[1] Harbin Inst Technol, Shenzhen Med Biometr Percept & Anal Engn Lab, Shenzhen 518055, Peoples R China

[2] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518055, Peoples R China

Source:

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT | 2022 / Vol. 71

Keywords:

Transformers; Image segmentation; Semantics; Decoding; Computer architecture; Task analysis; Medical diagnostic imaging; Hierarchical swin transformer; long-range contextual information; medical image segmentation; transformer interactive fusion (TIF) module

DOI:

10.1109/TIM.2022.3178991

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Automatic medical image segmentation has made great progress owing to powerful deep representation learning. Inspired by the success of self-attention mechanism in transformer, considerable efforts are devoted to designing the robust variants of the encoder-decoder architecture with transformer. However, the patch division used in the existing transformer-based models usually ignores the pixel-level intrinsic structural features inside each patch. In this article, we propose a novel deep medical image segmentation framework called dual swin transformer U-Net (DS-TransUNet), which aims to incorporate the hierarchical swin transformer into both the encoder and the decoder of the standard U-shaped architecture. Our DS-TransUNet benefits from the self-attention computation in swin transformer and the designed dual-scale encoding, which can effectively model the non-local dependencies and multiscale contexts for enhancing the semantic segmentation quality of varying medical images. Unlike many prior transformer-based solutions, the proposed DS-TransUNet adopts a well-established dual-scale encoding mechanism that uses dual-scale encoders based on swin transformer to extract the coarse and fine-grained feature representations of different semantic scales. Meanwhile, a well-designed transformer interactive fusion (TIF) module is proposed to effectively perform multiscale information fusion through the self-attention mechanism. Furthermore, we introduce the swin transformer block into the decoder to further explore the long-range contextual information during the up-sampling process. Extensive experiments across four typical tasks for medical image segmentation demonstrate the effectiveness of DS-TransUNet, and our approach significantly outperforms the state-of-the-art methods.

Pages: 15
