RAMIS: Increasing robustness and accuracy in medical image segmentation with hybrid CNN-transformer synergy

Cited by: 0
Authors
Gu, Jia [1]
Tian, Fangzheng [1]
Oh, Il-Seok [1, 2]
Affiliations
[1] Jeonbuk Natl Univ, Dept Comp Sci & Artificial Intelligence, Jeonju Si 54896, South Korea
[2] Jeonbuk Natl Univ, Ctr Adv Image Informat Technol, Jeonju 54896, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Medical image segmentation; Hybrid models; Implicit representation; Self-distillation; Multi-resolution network; UNET PLUS PLUS; LESION SEGMENTATION; FUSION NETWORK; U-NET; CLASSIFICATION; ARCHITECTURE;
DOI
10.1016/j.neucom.2024.129009
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Hybrid architectures based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have become an important research direction in medical image segmentation in recent years. However, currently popular hybrid architectures weaken the decision-making process within the Transformer model, and post-processing the Transformer output by upsampling through a convolution stack makes it difficult to restore the blurred boundaries of the target area. To improve feature learning capability by addressing these issues, we propose RAMIS, a novel hybrid architecture for general medical image segmentation. RAMIS uses implicit neural representation and self-distillation to simultaneously obtain the super-resolution details and core features of the image as input to the Transformer encoder. Meanwhile, RAMIS uses an unsupervised-learning CNN to obtain the initial input to the Transformer decoder, which not only explicitly considers the correlation among different samples and reduces the constraints imposed by small datasets, but also fully leverages the potential of the Transformer's cross-attention for optimizing segmentation results. RAMIS also designs a multi-resolution interaction network that post-processes the Transformer output and solves the problem of blurred segmentation boundaries by incorporating the super-resolution image. We extensively evaluate RAMIS on five datasets from three typical publicly available medical image segmentation datasets. Experimental results demonstrate the general applicability and superior performance of the proposed method. The code and pre-trained models are available on our website https://ramis.netlify.app.
Pages: 14