Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer

Authors
Wang, Qingbin [1]
Xiong, Yuxuan [1]
Zhu, Hanfeng [2,3]
Mu, Xuefeng [4]
Zhang, Yan [4]
Ma, Yutao [2,3]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[3] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China
[4] Wuhan Univ, Renmin Hosp, Dept Obstet & Gynecol, Wuhan 430060, Peoples R China
Keywords
Cervical cancer; Optical coherence tomography; Image classification; Self-supervised learning; Swin Transformer; Interpretability
DOI
10.1016/j.compmedimag.2024.102469
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive diagnosis of cervical lesions. However, obtaining high-quality labeled cervical OCT images is difficult and time-consuming because each image must correspond precisely to a pathological result. This scarcity of labeled data hinders the use of supervised deep-learning models in clinical practice. This study addresses the problem by proposing CMSwin, a novel self-supervised learning (SSL) framework that combines masked image modeling (MIM) with contrastive learning on the Swin Transformer architecture to exploit abundant unlabeled cervical OCT images.
Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor. This resolves the inconsistency between pre-training and fine-tuning and separates the encoder's feature extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. In addition, contrastive losses at the patch and image levels are designed to leverage massive unlabeled data.
Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions.
Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, providing good interpretability for gynecologists to diagnose cervical diseases efficiently.
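As a rough illustration of the kind of objective the Methods paragraph describes (a masked-reconstruction term plus a contrastive term), here is a minimal Python sketch. It is not CMSwin's actual loss: the function names `info_nce`, `masked_mse`, and `combined_loss`, the temperature, and the weighting are all assumptions made for illustration, and a generic InfoNCE loss stands in for the paper's patch- and image-level contrastive losses, whose exact form the abstract does not give.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(queries, keys, temperature=0.1):
    """Generic InfoNCE loss (assumed stand-in, not the paper's formulation).

    queries[i] and keys[i] are embeddings of two views of the same item;
    every other key in the batch acts as a negative for queries[i].
    """
    q = [_normalize(v) for v in queries]
    k = [_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(q)


def masked_mse(pred, target, mask):
    """Mean squared error over masked patches only (mask[i] truthy = patch was hidden)."""
    errors = [
        sum((p - t) ** 2 for p, t in zip(pred_patch, target_patch)) / len(pred_patch)
        for pred_patch, target_patch, hidden in zip(pred, target, mask)
        if hidden
    ]
    return sum(errors) / len(errors) if errors else 0.0


def combined_loss(pred, target, mask, view_a, view_b, weight=1.0, temperature=0.1):
    """Reconstruction term plus a weighted contrastive term (weight is a hypothetical knob)."""
    return masked_mse(pred, target, mask) + weight * info_nce(view_a, view_b, temperature)
```

In this sketch the reconstruction error is computed only on hidden patches, which is the usual convention in masked image modeling, while the contrastive term pulls together embeddings of two views of the same patch or image and pushes apart the rest of the batch.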
Pages: 14