GO-MAE: Self-supervised pre-training via masked autoencoder for OCT image classification of gynecology

Cited by: 1

Authors:

Wang, Haoran [1]

Guo, Xinyu [1]

Song, Kaiwen [1]

Sun, Mingyang [1]

Shao, Yanbin [1]

Xue, Songfeng [1]

Zhang, Hongwei [1]

Zhang, Tianyu [1]

Affiliation:

[1] Jilin Univ, Coll Instrumentat & Elect Engn, Key Lab Geophys Explorat Equipment, Minist Educ, Changchun 130012, Peoples R China

Source:

NEURAL NETWORKS | 2025 / Volume 181

Funding:

National Natural Science Foundation of China;

Keywords:

Deep learning; Genitourinary syndrome of menopause; Image classification; Optical coherence tomography; Self-supervised learning;

DOI:

10.1016/j.neunet.2024.106817

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Genitourinary syndrome of menopause (GSM) is a physiological disorder caused by reduced oestrogen levels in menopausal women. Its symptoms gradually worsen with age and prolonged menopausal status, gravely affecting patients' quality of life and their physical and mental health. Optical coherence tomography (OCT) reduces the patient's burden in clinical diagnosis through its noncontact, noninvasive tomographic imaging process, and supervised computer vision models applied to OCT images have yielded excellent results for disease diagnosis. However, manually labeling a large number of medical images is expensive and time-consuming. To this end, this paper proposes GO-MAE, a pretraining framework for self-supervised learning on GSM OCT images based on the Masked Autoencoder (MAE). To the best of our knowledge, this is the first study that applies self-supervised learning methods to the field of GSM disease screening. Focusing on the semantic complexity and feature sparsity of GSM OCT images, the study makes two contributions. First, a dynamic masking strategy is introduced for OCT characteristics in downstream tasks; it reduces the interference of invalid features on the model and shortens the training time. Second, in the encoder design of MAE, we propose a parallel convolutional neural network and transformer architecture (C&T), which fuses the local and global representations of the relevant lesions in an interactive manner so that the model can still learn richer differences between feature information without labels. Experimental results on the acquired GSM OCT dataset show that GO-MAE yields significant improvements over existing state-of-the-art techniques. Furthermore, comparative experiments and visualizations verified the model's robustness and interpretability, demonstrating its great potential for screening GSM symptoms.
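For orientation only, the masking step that MAE-style pretraining builds on can be sketched as follows. This is a minimal illustration of plain uniform random patch masking, not the dynamic masking strategy or the C&T encoder of GO-MAE, whose details the abstract does not give. The function names, the 224 x 224 single-channel input, the 16-pixel patch size, and the 0.75 masking ratio are all assumptions chosen for the example.

```python
import numpy as np


def patchify(image, patch):
    """Split a 2-D image (H, W) into flattened non-overlapping patches.

    Returns an array of shape (num_patches, patch * patch), row-major over
    the patch grid. Hypothetical helper, not taken from the paper.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    grid = image.reshape(gh, patch, gw, patch).transpose(0, 2, 1, 3)
    return grid.reshape(gh * gw, patch * patch)


def random_patch_mask(num_patches, mask_ratio, rng):
    """Uniform random masking: choose which patches the encoder may see.

    Returns (keep_idx, mask), where mask[i] is True if patch i is hidden.
    """
    num_keep = int(round(num_patches * (1.0 - mask_ratio)))
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[keep_idx] = False
    return keep_idx, mask


rng = np.random.default_rng(0)
image = rng.random((224, 224))        # stand-in for one grayscale OCT frame (assumed size)
patches = patchify(image, 16)         # 14 x 14 grid of 16 x 16 patches
keep_idx, mask = random_patch_mask(len(patches), 0.75, rng)
visible = patches[keep_idx]           # only these patches would reach the encoder
```

In an MAE-style setup the encoder processes only `visible`, and a lightweight decoder is trained to reconstruct the patches where `mask` is True; a dynamic strategy would replace the uniform permutation with a data-dependent choice.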


Pages: 17

References: 57
