Aggregated Mutual Learning between CNN and Transformer for semi-supervised medical image segmentation

Cited by: 2

Authors:

Xu, Zhenghua ^{[1,2]}

Wang, Hening ^{[1]}

Yang, Runhe ^{[1,2]}

Yang, Yuchen ^{[4]}

Liu, Weipeng ^{[1,3]}

Lukasiewicz, Thomas ^{[5,6]}

Affiliations:

[1] Hebei Univ Technol, State Key Lab Reliabil & Intelligence Elect Equip, Tianjin, Peoples R China

[2] Hebei Univ Technol, Sch Hlth Sci & Biomed Engn, Tianjin, Peoples R China

[3] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin, Peoples R China

[4] Johns Hopkins Univ, Dept Appl Math & Stat, Baltimore, MD USA

[5] Vienna Univ Technol, Inst Log & Computat, Vienna, Austria

[6] Univ Oxford, Dept Comp Sci, Oxford, England

Source:

KNOWLEDGE-BASED SYSTEMS | 2025 / Volume 311

Keywords:

Medical image segmentation; Semi-supervision learning; Mutual learning; Vision Transformer;

DOI:

10.1016/j.knosys.2025.113005

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent advances show that both convolutional layers and Transformer blocks have their own advantages in the feature learning tasks of medical image analysis. However, existing models that combine CNNs and Transformers cannot effectively integrate the features extracted by the two networks. In this work, we propose a new semi-supervised medical image segmentation method that effectively aggregates mutual learning between CNN and Transformer, denoted AML-CT, which consists of an auxiliary module and a main network. Specifically, the auxiliary module consists of two segmentation subnetworks, one based on CNN and one on Transformer, which aim to extract features from different perspectives. To enhance the integration of image features from the distinct segmentation networks, a Cross-Branch Feature Fusion module is proposed to fuse local and global information via internal cross-fusion of feature maps between the networks. Then, to aggregate the image features extracted by the auxiliary module, a three-branch network (TB-net) structure is further proposed to learn the extracted joint features and facilitate the aggregation of multi-source information. Experimental results on two public datasets demonstrate that (i) AML-CT successfully accomplishes medical image segmentation tasks with limited labeled data, outperforming recent mainstream semi-supervised segmentation methods, and (ii) ablation studies confirm the effectiveness of each module in the AML-CT model for performance improvement.
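
The abstract does not spell out the training objective, so the following is only a generic sketch of what mutual learning between two segmentation branches on unlabeled pixels can look like. It assumes a cross pseudo-label scheme in which each branch is supervised by the other branch's hard prediction. This is an illustrative assumption, not the AML-CT loss, and the function names (`softmax`, `cross_entropy`, `mutual_pseudo_label_loss`) are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target, eps=1e-12):
    """Cross-entropy of one pixel's class distribution against a hard label."""
    return -math.log(probs[target] + eps)


def mutual_pseudo_label_loss(logits_a, logits_b):
    """Average cross pseudo-label loss over unlabeled pixels.

    logits_a, logits_b: lists of per-pixel class logits from two hypothetical
    branches (for example one CNN-based and one Transformer-based).
    Assumption: each branch is trained against the other's argmax prediction.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        pseudo_a = max(range(len(pa)), key=pa.__getitem__)  # hard label from branch A
        pseudo_b = max(range(len(pb)), key=pb.__getitem__)  # hard label from branch B
        # Branch A learns from B's pseudo-label, and B learns from A's.
        total += cross_entropy(pa, pseudo_b) + cross_entropy(pb, pseudo_a)
    return total / len(logits_a)


# Toy check on a single two-class pixel: the loss is small when the branches
# agree and large when they disagree.
agree = mutual_pseudo_label_loss([[4.0, 0.0]], [[3.0, 0.0]])
disagree = mutual_pseudo_label_loss([[4.0, 0.0]], [[0.0, 3.0]])
print(agree < disagree)
```

In a real semi-supervised setup a term of this kind would typically be added to a supervised loss on the labeled images; how AML-CT weights or combines its terms is not stated in the abstract.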


Pages: 15

References: 49 in total
