Medical Transformer: Gated Axial-Attention for Medical Image Segmentation

Cited by: 829
Authors
Valanarasu, Jeya Maria Jose [1]
Oza, Poojan [1]
Hacihaliloglu, Ilker [2]
Patel, Vishal M. [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Rutgers State Univ, New Brunswick, NJ USA
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT I | 2021 / Vol. 12901
Funding
National Science Foundation (USA);
Keywords
Transformers; Medical image segmentation; Self-attention;
DOI
10.1007/978-3-030-87193-2_4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Over the past decade, deep convolutional neural networks have been widely adopted for medical image segmentation and shown to achieve adequate performance. However, due to inherent inductive biases present in convolutional architectures, they lack understanding of long-range dependencies in the image. Recently proposed transformer-based architectures that leverage the self-attention mechanism encode long-range dependencies and learn representations that are highly expressive. This motivates us to explore transformer-based solutions and study the feasibility of using transformer-based network architectures for medical image segmentation tasks. The majority of existing transformer-based network architectures proposed for vision applications require large-scale datasets to train properly. However, compared to the datasets for vision applications, in medical imaging the number of data samples is relatively low, making it difficult to efficiently train transformers for medical imaging applications. To this end, we propose a gated axial-attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module. Furthermore, to train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance. Specifically, we operate on the whole image and patches to learn global and local features, respectively. The proposed Medical Transformer (MedT) is evaluated on three different medical image segmentation datasets and it is shown that it achieves better performance than the convolutional and other related transformer-based architectures. Code: https://github.com/jeya-maria-jose/Medical-Transformer
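The Local-Global idea in the abstract (one branch sees the whole image, another sees patches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `grid` parameter, the equal-sized non-overlapping patches, and the choice to sum the two branch outputs are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def split_into_patches(image, grid):
    """Split an (H, W) array into grid x grid equal patches, in row-major order.

    Assumes H and W are divisible by `grid` (hypothetical simplification).
    """
    h, w = image.shape
    ph, pw = h // grid, w // grid
    return [image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
            for r in range(grid) for c in range(grid)]

def merge_patches(patches, grid):
    """Inverse of split_into_patches: stitch patches back into one array."""
    rows = [np.concatenate(patches[r * grid:(r + 1) * grid], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)

def local_global_forward(image, global_branch, local_branch, grid=4):
    """Run a global branch on the full image and a local branch per patch.

    Both branches are any callables mapping an array to a same-shaped array.
    Summing the two outputs is an assumption of this sketch.
    """
    global_out = global_branch(image)
    local_out = merge_patches(
        [local_branch(p) for p in split_into_patches(image, grid)], grid)
    return global_out + local_out

# Usage with identity functions standing in for the two networks.
image = np.arange(64, dtype=float).reshape(8, 8)
combined = local_global_forward(image, lambda x: x, lambda p: p, grid=2)
```

With identity branches the combined output is simply twice the input, which is a quick check that splitting and merging preserve the spatial layout.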
Pages: 36-46
Number of pages: 11