MESTrans: Multi-scale embedding spatial transformer for medical image segmentation

Cited: 15
Authors
Liu, Yatong [1 ]
Zhu, Yu [1 ,2 ]
Xin, Ying [3 ]
Zhang, Yanan [3 ]
Yang, Dawei [2 ,4 ]
Xu, Tao [3 ]
Affiliations
[1] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
[2] Shanghai Engn Res Ctr Internet Things Resp Med, Shanghai 200237, Peoples R China
[3] Qingdao Univ, Dept Pulm & Crit Care Med, Affiliated Hosp, Qingdao 266000, Shandong, Peoples R China
[4] Fudan Univ, Zhongshan Hosp, Dept Pulm & Crit Care Med, Shanghai 200032, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Computer-aided diagnosis; COVID-19; Medical image segmentation; Transformer; ATTENTION; REGIONS;
DOI
10.1016/j.cmpb.2023.107493
Chinese Library Classification (CLC)
TP39 [Applications of computers];
Discipline codes
081203; 0835;
Abstract
Background and objective: Transformers profiting from global information modeling derived from the self-attention mechanism have recently achieved remarkable performance in computer vision. In this study, a novel transformer-based medical image segmentation network called the multi-scale embedding spatial transformer (MESTrans) was proposed for medical image segmentation. Methods: First, a dataset called COVID-DS36 was created from 4369 computed tomography (CT) images of 36 patients from a partner hospital, of which 18 had COVID-19 and 18 did not. Subsequently, a novel medical image segmentation network was proposed, which introduced a self-attention mechanism to improve the inherent limitation of convolutional neural networks (CNNs) and was capable of adaptively extracting discriminative information in both global and local content. Specifically, based on U-Net, a multi-scale embedding block (MEB) and multi-layer spatial attention transformer (SATrans) structure were designed, which can dynamically adjust the receptive field in accordance with the input content. The spatial relationship between multi-level and multi-scale image patches was modeled, and the global context information was captured effectively. To make the network concentrate on the salient feature region, a feature fusion module (FFM) was established, which performed global learning and soft selection between shallow and deep features, adaptively combining the encoder and decoder features. Four datasets comprising CT images, magnetic resonance (MR) images, and H&E-stained slide images were used to assess the performance of the proposed network. Results: Experiments were performed using four different types of medical image datasets. For the COVID-DS36 dataset, our method achieved a Dice similarity coefficient (DSC) of 81.23%. For the GlaS dataset, 89.95% DSC and 82.39% intersection over union (IoU) were obtained.
On the Synapse dataset, the average DSC was 77.48% and the average Hausdorff distance (HD) was 31.69 mm. For the I2CVB dataset, 92.3% DSC and 85.8% IoU were obtained. Conclusions: The experimental results demonstrate that the proposed model has an excellent generalization ability and outperforms other state-of-the-art methods. It is expected to be a potent tool to assist clinicians in auxiliary diagnosis and to promote the development of medical intelligence technology. (c) 2023 Elsevier B.V. All rights reserved.
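The abstract describes the feature fusion module (FFM) as performing a "soft selection" that adaptively blends shallow encoder features with deep decoder features. The sketch below is a hypothetical NumPy illustration of that general idea, not the paper's actual implementation: a channel-wise sigmoid gate, computed from globally pooled context of both branches, weights each branch before summing. All names and the gating form (`w_gate`, `b_gate`, global average pooling) are assumptions made for illustration.

```python
import numpy as np

def feature_fusion(shallow, deep, w_gate, b_gate):
    """Illustrative soft-selection fusion (assumed form, not the paper's FFM).

    shallow, deep : arrays of shape (C, H, W), encoder and decoder features.
    w_gate        : (C, 2C) projection from pooled context to per-channel gates.
    b_gate        : (C,) gate bias.
    """
    # Global context: average each channel over the spatial dimensions -> (2C,)
    ctx = np.concatenate([shallow, deep], axis=0).mean(axis=(1, 2))
    # Channel-wise soft-selection weights in [0, 1] via a sigmoid gate
    gate = 1.0 / (1.0 + np.exp(-(w_gate @ ctx + b_gate)))
    g = gate[:, None, None]
    # Adaptive convex combination of the two feature branches
    return g * shallow + (1.0 - g) * deep

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
shallow = rng.standard_normal((C, H, W))
deep = rng.standard_normal((C, H, W))
w_gate = 0.1 * rng.standard_normal((C, 2 * C))
b_gate = np.zeros(C)
fused = feature_fusion(shallow, deep, w_gate, b_gate)
print(fused.shape)
```

Because the gate lies in [0, 1] per channel, the output is everywhere a convex combination of the two inputs, so the fusion can smoothly emphasize either branch without discarding the other.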
Pages: 13