MedFCT: A Frequency Domain Joint CNN-Transformer Network for Semi-supervised Medical Image Segmentation

Cited by: 2

Authors:

Xie, Shiao [1]

Huang, Huimin [1]

Niu, Ziwei [1]

Lin, Lanfen [1]

Chen, Yen-Wei [2]

Affiliations:

[1] Zhejiang University, College of Computer Science and Technology, Hangzhou, People's Republic of China

[2] Ritsumeikan University, College of Information Science and Engineering, Kyoto, Japan

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023

Keywords:

Medical Image Segmentation; Transformer; Semi-supervised Learning

DOI:

10.1109/ICME55011.2023.00328

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Semi-supervised learning (SSL) is a data-efficient way to leverage large-scale unannotated data and reduce the dependence on labeled data. The Mean-Teacher (MT) scheme, with its teacher-student architecture, has proven effective in semi-supervised medical image segmentation: the student network learns from the teacher by minimizing a pixel-wise consistency loss. However, existing MT-based SSL methods still raise two main concerns: (1) the learning ability of the student network is limited because it does not combine local feature extraction with global cue extraction, which may impair representation learning for variable objects; (2) the knowledge-transfer ability of the teacher network is limited because it relies only on pixel-level consistency regularization, which may result in inadequate and unstable guidance. To address these limitations, we propose a novel semi-supervised learning scheme, MedFCT, with two key designs: (1) a dual student architecture with parallel CNN and Transformer branches for local-global feature extraction, in which a frequency domain cross-fusion (FDCF) module explores the full-frequency interaction between the CNN and the Transformer to learn the complementarity of the two feature paradigms; (2) a comprehensive multi-level consistency regularization that considers pixel-wise, feature-wise, and class-wise information to provide more effective guidance and knowledge transfer from the teacher network. Experiments show that MedFCT outperforms previous state-of-the-art methods on two public medical image segmentation benchmarks.
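The abstract gives no implementation details, so the following is only a minimal sketch of the generic Mean-Teacher idea it builds on, not MedFCT itself. It assumes mean squared error between softmax outputs as the pixel-wise consistency measure and an exponential moving average (EMA) for the teacher weights. The function names, the toy logits, and the `alpha` value are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_consistency_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher class probabilities.

    Each argument is a list of per-pixel logit lists; the result is
    averaged over all pixels and classes. (Assumed form of the loss.)
    """
    total, count = 0.0, 0
    for s_px, t_px in zip(student_logits, teacher_logits):
        for ps, pt in zip(softmax(s_px), softmax(t_px)):
            total += (ps - pt) ** 2
            count += 1
    return total / count


def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Move each teacher weight toward the matching student weight (EMA)."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


# Toy example: two pixels, two classes (values are made up).
student = [[2.0, 0.5], [0.1, 1.2]]
teacher = [[1.8, 0.7], [0.0, 1.5]]
loss = pixel_consistency_loss(student, teacher)
new_teacher = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.9)
```

In this kind of setup the student is trained by gradient descent on the supervised loss plus the consistency term, while the teacher is never trained directly and only tracks the student through the EMA step. The feature-wise and class-wise consistency terms and the FDCF module described in the abstract are not covered by this sketch.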


Pages: 1913-1918

Page count: 6
