CTRANSNET: CONVOLUTIONAL NEURAL NETWORK COMBINED WITH TRANSFORMER FOR MEDICAL IMAGE SEGMENTATION

Cited by: 4
Authors
Zhang, Zhixin [1]
Jiang, Shuhao [1]
Pan, Xuhua [1]
Affiliations
[1] Tianjin University of Commerce, Information Engineering Department, Tianjin 300134, People's Republic of China
Keywords
Medical image segmentation; deep learning; attention mechanism; ATTENTION; CNN
DOI
10.31577/cai20232392
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The Transformer has been widely used for many NLP tasks, but there is still much room to explore its application to the image domain. In this paper, we propose a simple and efficient hybrid Transformer framework, CTransNet, which combines self-attention and CNN to improve medical image segmentation performance by capturing long-range dependencies at different scales. To this end, this paper proposes an effective self-attention mechanism incorporating relative position information encoding, which can reduce the time complexity of self-attention from O(n^2) to O(n), and a new self-attention decoder that can recover fine-grained features in the encoder from skip connections. This paper aims to address the current dilemma of Transformer applications, i.e., the need to learn inductive bias from large amounts of training data. The hybrid layer in CTransNet allows the Transformer to be initialized as a CNN without pre-training. We have evaluated the performance of CTransNet on several medical segmentation datasets. CTransNet shows superior segmentation performance, robustness, and great promise for generalization to other medical image segmentation tasks.
Pages: 392-410
Page count: 19