ATFormer: Advanced transformer for medical image segmentation

Cited by: 12
Authors
Chen, Yong [1]
Lu, Xuesong [1]
Xie, Qinlan [1]
Affiliations
[1] South-Central Minzu University, School of Biomedical Engineering, Wuhan 430074, Hubei, People's Republic of China
Keywords
Self-attention mechanism; Advanced transformer; Multi-scale feature fusion; Medical image segmentation
DOI
10.1016/j.bspc.2023.105079
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Combining transformers and convolutional neural networks is considered one of the most important directions for tackling medical image segmentation problems. To learn both long-range dependencies and local context, previous approaches embedded a convolutional layer into the feedforward network inside the transformer block. However, a common issue is training instability caused by the large differences in activation amplitude across layers that arise with pre-layer normalization. Furthermore, multi-scale features were fused directly by the transformer from the encoder to the decoder, which could disrupt information valuable for segmentation. To address these concerns, we propose Advanced TransFormer (ATFormer), a novel hybrid architecture that combines convolutional neural networks and transformers for medical image segmentation. First, the traditional transformer block is refined into an Advanced Transformer Block, which adopts post-layer normalization to obtain milder activation values and employs scaled cosine attention with shifted windows to capture accurate spatial information. Second, a Progressive Guided Fusion module is introduced to make multi-scale features more discriminative while reducing computational complexity. Experimental results on the ACDC, COVID-19 CT-Seg, and Tumor datasets demonstrate a significant advantage of ATFormer over existing methods that rely solely on convolutional neural networks, transformers, or their combination.
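The abstract names two ingredients of the Advanced Transformer Block, post-layer normalization and scaled cosine attention, without giving formulas. The sketch below is only an illustration of those two ideas in plain Python, not the authors' implementation. It assumes the formulation popularized by Swin Transformer V2: attention logits are cosine similarities divided by a temperature, and layer normalization is applied to the attention output before the residual addition. The function names, the fixed temperature `tau`, and the omission of learned projections, relative position bias, and window shifting are all simplifications introduced here.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two vectors; independent of their magnitudes."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (max(norm_u, eps) * max(norm_v, eps))


def softmax(row):
    """Numerically stable softmax over a list of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_cosine_attention(queries, keys, values, tau=0.1):
    """Attention whose logits are cos(q, k) / tau (tau is a hypothetical fixed temperature)."""
    tau = max(tau, 0.01)  # assumed lower bound so the logits stay bounded
    dim = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([cosine_similarity(q, k) / tau for k in keys])
        outputs.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
        all_weights.append(weights)
    return outputs, all_weights


def layer_norm(x, eps=1e-5):
    """Layer normalization without learned scale and shift."""
    mean = sum(x) / len(x)
    var = sum((a - mean) ** 2 for a in x) / len(x)
    return [(a - mean) / math.sqrt(var + eps) for a in x]


def post_norm_attention_block(tokens, tau=0.1):
    """Post-normalization residual: x + LayerNorm(Attention(x))."""
    attended, _ = scaled_cosine_attention(tokens, tokens, tokens, tau)
    return [[t + n for t, n in zip(tok, layer_norm(att))]
            for tok, att in zip(tokens, attended)]
```

Because the logits depend only on the angle between query and key, rescaling the queries leaves the attention weights unchanged, and normalizing the branch output before the residual addition keeps its contribution on a fixed scale at every layer. Both properties are consistent with the "mild activation values" the abstract describes, although the paper itself should be consulted for the exact design.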
Pages: 10