CTRANSNET: CONVOLUTIONAL NEURAL NETWORK COMBINED WITH TRANSFORMER FOR MEDICAL IMAGE SEGMENTATION

Citations: 4
Authors
Zhang, Zhixin [1]
Jiang, Shuhao [1]
Pan, Xuhua [1]
Affiliations
[1] Tianjin Univ Commerce, Informat Engn Dept, Tianjin 300134, Peoples R China
Keywords
Medical image segmentation; deep learning; attention mechanism; ATTENTION; CNN;
DOI
10.31577/cai20232392
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The Transformer has been widely used for many NLP tasks, but there is still much room to explore its application to the image domain. In this paper, we propose a simple and efficient hybrid Transformer framework, CTransNet, which combines self-attention and CNN to improve medical image segmentation performance by capturing long-range dependencies at different scales. To this end, this paper proposes an effective self-attention mechanism incorporating relative position information encoding, which can reduce the time complexity of self-attention from O(n^2) to O(n), and a new self-attention decoder that can recover fine-grained features in the encoder from skip connections. This paper aims to address the current dilemma of Transformer applications, namely the need to learn inductive bias from large amounts of training data. The hybrid layer in CTransNet allows the Transformer to be initialized as a CNN without pre-training. We have evaluated the performance of CTransNet on several medical segmentation datasets. CTransNet shows superior segmentation performance, robustness, and great promise for generalization to other medical image segmentation tasks.
Pages: 392-410
Page count: 19