A Simple Framework for Depth-Augmented Contrastive Learning for Endoscopic Image Classification

Authors
Weng, Weihao [1]
Zhu, Xin [2]
Cheikh, Faouzi Alaya [3]
Ullah, Mohib [3]
Imaizumi, Mitsuyoshi [4]
Murono, Shigeyuki [4]
Kubota, Satoshi [4]
Affiliations
[1] Univ Aizu, Grad Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima 9658580, Japan
[2] Inst Tokyo, M&D Data Sci Ctr, Dept AI Technol Dev, Tokyo 1010062, Japan
[3] Norwegian Univ Sci & Technol, Dept Comp Sci, N-2815 Gjovik, Norway
[4] Fukushima Med Univ, Dept Otolaryngol, Fukushima 9601295, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Estimation; Training; Accuracy; Endoscopes; Image classification; Contrastive learning; Testing; Three-dimensional displays; Pneumonia; Pharynx; deep learning; depth estimation; endoscopic image classification; self-supervised; semi-supervised;
DOI
10.1109/TIM.2024.3470015
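Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification codes
0808; 0809
Abstract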
This article introduces a simple framework for depth-augmented contrastive learning (SimDCL), a novel approach that enhances endoscopic image classification by incorporating depth information. Unlike traditional methods, which struggle with the absence of depth in 2-D endoscopic images, SimDCL leverages a depth estimation technique trained exclusively on da Vinci Xi endoscope data. This method not only addresses the challenge of obtaining accurate depth data for regions such as the pharynx or larynx but also presents the information in a manner that aligns with medical professionals' expertise. Specifically, we designed a loss function for self-supervised depth estimation (SSDE) that performs well when trained on public datasets and then applied to data without depth information. In addition, we developed an augmentation method and a corresponding loss function that use this depth information to improve the accuracy of endoscopic image classification. The evaluation used a private dataset of 199 flexible endoscopic evaluation of swallowing (FEES) video images for training and 40 independent FEES video images for testing, along with two public datasets (Nerthus and Kvasir). SimDCL achieved an accuracy of 73.0% on FEES, 72.7% on Nerthus, and 81.6% on Kvasir, surpassing the existing methods CCSSL, CoMatch, and FixMatch by margins of 9.2%, 12.1%, and 17.8% on FEES; 9.82%, 11.33%, and 11.67% on Nerthus; and 4.21%, 5.42%, and 9.97% on Kvasir, respectively.
Pages: 12
Related papers
50 in total
  • [21] SAR Image Classification Using Contrastive Learning and Pseudo-Labels With Limited Data
    Wang, Chenchen
    Gu, Hong
    Su, Weimin
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [22] Language-Enhanced Dual-Level Contrastive Learning Network for Open-Set Hyperspectral Image Classification
    Qin, Boao
    Feng, Shou
    Zhao, Chunhui
    Li, Wei
    Tao, Ran
    Zhou, Jun
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [23] Two-Stream Networks for Contrastive Learning in Hyperspectral Image Classification
    Xia, Shuxiang
    Zhang, Xiaohua
    Meng, Hongyun
    Fan, Jiaxin
    Jiao, Licheng
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17: 1903-1920
  • [24] Semi-supervised hybrid contrastive learning for PolSAR image classification
    Hua, Wenqiang
    Sun, Nan
    Liu, Lin
    Ding, Chen
    Dong, Yizhuo
    Sun, Wei
    KNOWLEDGE-BASED SYSTEMS, 2025, 311
  • [25] CLIB: Contrastive learning of ignoring background for underwater fish image classification
    Yan, Qiankun
    Du, Xiujuan
    Li, Chong
    Tian, Xiaojing
    FRONTIERS IN NEUROROBOTICS, 2024, 18
  • [26] Graph Contrastive Learning based Adversarial Training for SAR Image Classification
    Wang, Xu
    Ye, Tian
    Kannan, Rajgopal
    Prasanna, Viktor
    ALGORITHMS FOR SYNTHETIC APERTURE RADAR IMAGERY XXXI, 2024, 13032
  • [27] GraphCLIP: Image-graph contrastive learning for multimodal artwork classification
    Scaringi, Raffaele
    Fiameni, Giuseppe
    Vessio, Gennaro
    Castellano, Giovanna
    KNOWLEDGE-BASED SYSTEMS, 2025, 310
  • [28] Image Classification with Caffe Deep Learning Framework
    Cengil, Emine
    Cinar, Ahmet
    Ozbay, Erdal
    2017 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2017: 440-444
  • [29] Dynamic Graph Network augmented by Contrastive Learning for Radar Target Classification
    Meng, Han
    Peng, Yuexing
    Wang, Wenbo
    2024 IEEE RADAR CONFERENCE, RADARCONF 2024, 2024
  • [30] sMoBYAL: Supervised Contrastive Active Learning for Image Classification
    Thanh Hong Dang
    Thanh Tung Nguyen
    Huy Quang Trinh
    Linh Bao Doan
    Toan Van Pham
    SIXTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION, ICMV 2023, 2024, 13072