Convolution-Free Medical Image Segmentation Using Transformers

Cited by: 74
Authors
Karimi, Davood [1]
Vasylechko, Serge Didenko
Gholipour, Ali
Affiliations
[1] Boston Childrens Hosp, Dept Radiol, Computat Radiol Lab CRL, Boston, MA 02115 USA
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT I | 2021 / Vol. 12901
Funding
US National Institutes of Health;
Keywords
Segmentation; Deep learning; Transformers; Attention;
DOI
10.1007/978-3-030-87193-2_8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Like other applications in computer vision, medical image segmentation has been most successfully addressed using deep learning models that rely on the convolution operation as their main building block. Convolutions enjoy important properties such as sparse interactions, weight sharing, and translation equivariance. These properties give convolutional neural networks (CNNs) a strong and useful inductive bias for vision tasks. However, the convolution operation also has important shortcomings: it performs a fixed operation on every test image regardless of the content, and it cannot efficiently model long-range interactions. In this work we show that a network based on self-attention between neighboring patches and without any convolution operations can achieve better results. Given a 3D image block, our network divides it into n^3 3D patches, where n = 3 or 5, and computes a 1D embedding for each patch. The network predicts the segmentation map for the center patch of the block based on the self-attention between these patch embeddings. We show that the proposed model can achieve higher segmentation accuracies than a state-of-the-art CNN. For scenarios with very few labeled images, we propose methods for pre-training the network on large corpora of unlabeled images. Our experiments show that with pre-training the advantage of our proposed network over CNNs can be significant when the amount of labeled training data is small.
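The patch-and-attend step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the patch side `p`, the embedding width `d`, the single attention head, the random weights, and the helper names `split_into_patches` and `self_attention` are all assumptions made for illustration, and the final layer that maps the attended features to a segmentation map for the center patch is omitted because the abstract does not specify it.

```python
import numpy as np

def split_into_patches(block, n):
    """Split a cubic block of side n*p into n**3 flattened patches of side p."""
    side = block.shape[0]
    assert block.shape == (side, side, side) and side % n == 0
    p = side // n
    # (n, p, n, p, n, p) -> (n, n, n, p, p, p) -> (n**3, p**3)
    patches = block.reshape(n, p, n, p, n, p).transpose(0, 2, 4, 1, 3, 5)
    return patches.reshape(n ** 3, p ** 3)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(emb, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch embeddings."""
    q, k, v = emb @ w_q, emb @ w_k, emb @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Assumed sizes: n = 3 patches per axis (from the abstract), patch side p = 4
# and embedding width d = 16 (hypothetical values chosen for the example).
rng = np.random.default_rng(0)
n, p, d = 3, 4, 16
block = rng.standard_normal((n * p,) * 3)      # stand-in for a 3D image block

patches = split_into_patches(block, n)          # shape (27, 64)
w_embed = rng.standard_normal((p ** 3, d)) * 0.1
emb = patches @ w_embed                         # one 1D embedding per patch

w_q, w_k, w_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
attended, weights = self_attention(emb, w_q, w_k, w_v)

# With row-major patch ordering, the center patch sits at index n**3 // 2.
center = (n ** 3) // 2
center_features = attended[center]              # features informed by all patches
```

Every patch attends to every other patch in the block, so the features for the center patch depend on the whole block's content rather than on a fixed local kernel, which is the contrast with convolution that the abstract draws.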
Pages: 78-88
Number of pages: 11