DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation

Cited by: 588
Authors
Lin, Ailiang [1]
Chen, Bingzhi [1]
Xu, Jiayu [1]
Zhang, Zheng [1]
Lu, Guangming [1]
Zhang, David [2]
Affiliations
[1] Harbin Inst Technol, Shenzhen Med Biometr Percept & Anal Engn Lab, Shenzhen 518055, Peoples R China
[2] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518055, Peoples R China
Keywords
Transformers; Image segmentation; Semantics; Decoding; Computer architecture; Task analysis; Medical diagnostic imaging; Hierarchical Swin Transformer; long-range contextual information; medical image segmentation; transformer interactive fusion (TIF) module
DOI
10.1109/TIM.2022.3178991
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
Automatic medical image segmentation has made great progress owing to powerful deep representation learning. Inspired by the success of the self-attention mechanism in transformers, considerable effort has been devoted to designing robust transformer-based variants of the encoder-decoder architecture. However, the patch division used in existing transformer-based models usually ignores the pixel-level intrinsic structural features inside each patch. In this article, we propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet), which incorporates the hierarchical Swin Transformer into both the encoder and the decoder of the standard U-shaped architecture. DS-TransUNet benefits from the self-attention computation in the Swin Transformer and the proposed dual-scale encoding, which together model non-local dependencies and multiscale contexts to improve the semantic segmentation quality of diverse medical images. Unlike many prior transformer-based solutions, DS-TransUNet adopts a dual-scale encoding mechanism in which two Swin Transformer encoders extract coarse- and fine-grained feature representations at different semantic scales. Meanwhile, a transformer interactive fusion (TIF) module is proposed to perform multiscale information fusion effectively through the self-attention mechanism. Furthermore, we introduce the Swin Transformer block into the decoder to further exploit long-range contextual information during up-sampling. Extensive experiments across four typical medical image segmentation tasks demonstrate the effectiveness of DS-TransUNet, which significantly outperforms state-of-the-art methods.
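The core mechanism named in the abstract, multiscale fusion of the two encoders' token sequences through self-attention in the TIF module, can be illustrated with a short sketch. The PyTorch code below is a minimal illustration reconstructed from the abstract's description alone, not the authors' released implementation; the pooled cross-scale summary token, the pre-norm transformer layer, and the shared embedding dimension dim are assumptions made here for clarity.

import torch
import torch.nn as nn

class TIFBranch(nn.Module):
    # One direction of a TIF-style fusion (sketch): self-attention over one
    # scale's tokens augmented with a pooled summary token of the other scale.
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, tokens: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C) sequence of the current scale
        # other:  (B, M, C) sequence of the other scale (assumed same C)
        summary = other.mean(dim=1, keepdim=True)        # (B, 1, C) global token
        seq = torch.cat([summary, tokens], dim=1)        # prepend cross-scale token
        x = self.norm(seq)
        seq = seq + self.attn(x, x, x, need_weights=False)[0]  # self-attention
        seq = seq + self.ffn(seq)                        # feed-forward refinement
        return seq[:, 1:]                                # drop the summary token

class TIF(nn.Module):
    # Fuses coarse- and fine-grained token sequences in both directions.
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.coarse_branch = TIFBranch(dim, num_heads)
        self.fine_branch = TIFBranch(dim, num_heads)

    def forward(self, coarse: torch.Tensor, fine: torch.Tensor):
        return self.coarse_branch(coarse, fine), self.fine_branch(fine, coarse)

# Hypothetical usage: fuse 7x7 coarse tokens with 14x14 fine tokens.
# tif = TIF(dim=96)
# c, f = tif(torch.randn(2, 49, 96), torch.randn(2, 196, 96))

In the actual model the two encoder branches operate on different patch sizes, so a projection to a common channel dimension would be needed before a fusion step like this one.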
Pages: 15