Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer

Cited: 0
Authors
Wang, Qingbin [1 ]
Xiong, Yuxuan [1 ]
Zhu, Hanfeng [2 ,3 ]
Mu, Xuefeng [4 ]
Zhang, Yan [4 ]
Ma, Yutao [2 ,3 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Cent China Normal Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[3] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China
[4] Wuhan Univ, Renmin Hosp, Dept Obstet & Gynecol, Wuhan 430060, Peoples R China
Keywords
Cervical cancer; Optical coherence tomography; Image classification; Self-supervised learning; Swin Transformer; Interpretability
DOI
10.1016/j.compmedimag.2024.102469
Chinese Library Classification (CLC)
R318 [Biomedical Engineering]
Discipline code
0831
Abstract
Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive diagnosis of cervical lesions. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming because each image must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses this challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework that combines masked image modeling (MIM) with contrastive learning on the Swin Transformer architecture to exploit abundant unlabeled cervical OCT images.

Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to resolve the inconsistency between pre-training and fine-tuning and to separate the encoder's feature-extraction task from the decoder's reconstruction task, allowing the encoder to learn better image representations. In addition, contrastive losses at the patch and image levels are carefully designed to leverage massive amounts of unlabeled data.

Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset covering 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average performance of four skilled medical experts, especially in identifying high-risk cervical lesions.

Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, providing gynecologists with good interpretability for diagnosing cervical diseases efficiently.
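To make the Methods description more concrete, below is a minimal, hypothetical PyTorch sketch of how a combined pre-training objective of this kind (masked-patch reconstruction plus patch-level and image-level contrastive terms) could be assembled. The names `CMSwinPretrainLoss` and `info_nce`, the loss weights, and all tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a CMSwin-style combined pre-training objective:
# masked-patch reconstruction plus patch- and image-level InfoNCE terms.
# Names, weights, and shapes are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def info_nce(queries: torch.Tensor, keys: torch.Tensor, temperature: float = 0.2) -> torch.Tensor:
    """InfoNCE where the query at index i is positive with the key at index i."""
    queries = F.normalize(queries, dim=-1)
    keys = F.normalize(keys, dim=-1)
    logits = queries @ keys.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(queries.size(0), device=queries.device)
    return F.cross_entropy(logits, targets)


class CMSwinPretrainLoss(nn.Module):
    """Weighted sum of reconstruction, patch-level, and image-level contrastive losses."""

    def __init__(self, w_rec: float = 1.0, w_patch: float = 0.5, w_image: float = 0.5):
        super().__init__()
        self.w_rec, self.w_patch, self.w_image = w_rec, w_patch, w_image

    def forward(
        self,
        rec_pred: torch.Tensor,      # (B, M, P) reconstructed masked patches (pixel values)
        rec_target: torch.Tensor,    # (B, M, P) ground-truth masked patches
        patch_pred: torch.Tensor,    # (B, M, D) latent regressor outputs at masked positions
        patch_target: torch.Tensor,  # (B, M, D) encoder features of the unmasked view
        img_a: torch.Tensor,         # (B, D) global embedding of augmented view A
        img_b: torch.Tensor,         # (B, D) global embedding of augmented view B
    ) -> torch.Tensor:
        loss_rec = F.l1_loss(rec_pred, rec_target)
        # Patch-level contrast: flatten (B, M, D) -> (B*M, D) so matching positions
        # across the two views form positive pairs; all other patches act as negatives.
        loss_patch = info_nce(patch_pred.flatten(0, 1), patch_target.flatten(0, 1))
        # Image-level contrast between the two augmented views of each volume slice.
        loss_img = info_nce(img_a, img_b)
        return self.w_rec * loss_rec + self.w_patch * loss_patch + self.w_image * loss_img
```

In such a setup, the patch-level term pulls the regressor's predictions for masked positions toward the corresponding encoder features of an unmasked view, while the image-level term contrasts global embeddings of two augmented views, mirroring the patch- and image-level contrastive losses described in the abstract.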
Pages: 14