TC-Net: A joint learning framework based on CNN and vision transformer for multi-lesion medical images segmentation

Cited by: 27
Authors
Zhang, Zhongxiang [1]
Sun, Guangmin [1]
Zheng, Kun [1]
Yang, Jin-Kui [2, 3]
Zhu, Xiao-rong [2, 3]
Li, Yu [1, 4]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
[2] Beijing Tongren Hosp, Dept Endocrinol, Beijing, Peoples R China
[3] Capital Med Univ, Beijing, Peoples R China
[4] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Medical image segmentation; Convolutional neural network; Vision transformer; Class-imbalance; DIABETIC-RETINOPATHY; ARCHITECTURE;
DOI
10.1016/j.compbiomed.2023.106967
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: With the rapid advancement of medical imaging technology, the demand for accurate segmentation of medical images is increasing. However, most existing methods cannot capture locality and long-range dependency information in an integrated way.
Method: In this paper, we propose a segmentation framework for medical images, named TC-Net, that exploits both local features and long-range dependencies in the images. For locality, we employ a CNN-based encoder and decoder structure; the CNN branch uses the locality of convolution operations to extract local information. For long-range dependencies, we construct a Transformer branch that focuses on the global context. Additionally, we propose a locality-aware and long-range dependency concatenation strategy (LLCS) to aggregate the feature maps obtained from the two branches. Finally, we present a dynamic cyclical focal loss (DCFL) to address the class-imbalance problem in multi-lesion segmentation.
Results: Comprehensive experiments were conducted on lesion segmentation tasks using two fundus image databases and a skin image database. TC-Net achieves mean pixel accuracy scores of 0.6985 and 0.5171 on the IDRiD and DDR databases, respectively. On the skin image database, TC-Net reaches a mean pixel accuracy of 0.8886. The experimental results demonstrate that the proposed method achieves better performance than other deep learning segmentation schemes. Furthermore, the proposed DCFL achieves higher performance than other loss functions in multi-lesion segmentation.
Significance: The proposed TC-Net is a promising new framework for multi-lesion medical image segmentation and many other challenging image segmentation tasks.
(c) 2001 Elsevier Science. All rights reserved.
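The abstract does not give the formulation of DCFL, so the following is only a minimal sketch of the standard focal loss, the general technique that focal-loss variants build on. It is not the authors' DCFL: the dynamic and cyclical components are not described here and are not reproduced. The function names, the `gamma` and `alpha` parameters, and the example probabilities are illustrative assumptions.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Standard focal loss for a single pixel (illustrative, not DCFL).

    probs  : list of predicted class probabilities for the pixel
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 reduces to cross-entropy
    alpha  : optional list of per-class weights (assumed, for imbalance)
    """
    # Clamp to avoid log(0) when the true class gets zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified pixels.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(probs, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(probs, target, gamma=0.0)


# Hypothetical pixels: class 0 is background, class 1 is a rare lesion class.
easy_background = [0.95, 0.03, 0.02]  # true class 0, confidently correct
hard_lesion = [0.70, 0.20, 0.10]      # true class 1, poorly predicted

for name, probs, target in [
    ("easy background", easy_background, 0),
    ("hard lesion", hard_lesion, 1),
]:
    ce = cross_entropy(probs, target)
    fl = focal_loss(probs, target)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  ratio={fl / ce:.4f}")
```

With `gamma = 2`, the confidently classified background pixel keeps only a small fraction of its cross-entropy loss, while the poorly predicted lesion pixel keeps most of it. This is why focal-style losses are commonly used when lesion pixels are heavily outnumbered by background.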
Pages: 14