Consistency Loss for Improved Colonoscopy Landmark Detection with Vision Transformers

Cited: 0
Authors
Tamhane, Aniruddha [1 ]
Dobkin, Daniel [1 ]
Shtalrid, Ore [1 ]
Bouhnik, Moshe [1 ]
Posner, Erez [1 ]
Mida, Tse'ela [1 ]
Affiliations
[1] Intuit Surg Inc, 1020 Kifer Rd, Sunnyvale, CA 94086 USA
Source
MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT II | 2024 / Vol. 14349
Keywords
Colonoscopy; Vision Transformer; Landmark Detection; Self-supervised learning; Consistency loss; Data sampling; COLON;
DOI
10.1007/978-3-031-45676-3_13
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Colonoscopy is a procedure used to examine the colon and rectum for colorectal cancer and other abnormalities, including polyps and diverticula. Beyond the diagnosis itself, manually processing the snapshots taken during the procedure (for medical record keeping) consumes a large amount of the clinician's time. This can be automated with post-procedural machine-learning-based algorithms that classify anatomical landmarks in the colon. In this work, we develop a pipeline for training vision transformers to identify anatomical landmarks, including the appendiceal orifice, the ileocecal valve/cecum, and rectal retroflexion. To increase the model's accuracy, we use a hybrid approach that combines algorithm-level and data-level techniques. We introduce a consistency loss to improve the model's robustness to label inconsistencies, as well as a semantic non-landmark sampling technique aimed at increasing focus on colonic findings. To train and test our pipeline, we annotated 307 colonoscopy videos and 2363 snapshots with the assistance of several medical experts for enhanced reliability. The algorithm identifies landmarks with an accuracy of 92% on the test dataset.
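The abstract does not give the exact formulation of the consistency loss, so the snippet below is only a minimal sketch of one common form of consistency regularization: a symmetric KL divergence that penalizes disagreement between the class distributions a classifier predicts for two augmented views (or near-duplicate snapshots) of the same frame. The function names and the NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(logits_a, logits_b, eps=1e-8):
    """Symmetric KL divergence between the class distributions predicted
    for two views of the same frame; small epsilon guards the logs.

    logits_a, logits_b: arrays of shape (batch, num_classes).
    Returns the mean loss over the batch as a float.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    kl_pq = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    kl_qp = np.sum(q * (np.log(q + eps) - np.log(p + eps)), axis=-1)
    return float(np.mean(0.5 * (kl_pq + kl_qp)))
```

In training, a term like this would typically be added to the supervised cross-entropy with a weighting coefficient; identical predictions give zero loss, and the penalty grows as the two views' predicted distributions diverge.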
Pages: 124-133
Page count: 10
Related Papers (50 total)
  • [1] Colonoscopy Landmark Detection Using Vision Transformers
    Tamhane, Aniruddha
    Mida, Tse'ela
    Posner, Erez
    Bouhnik, Moshe
    IMAGING SYSTEMS FOR GI ENDOSCOPY, AND GRAPHS IN BIOMEDICAL IMAGE ANALYSIS, ISGIE 2022, 2022, 13754 : 24 - 34
  • [3] Adaptive robust loss for landmark detection
    Tian, Yingjie
    Su, Duo
    Li, Shilin
    INFORMATION FUSION, 2024, 101
  • [4] Improved Heatmap-Based Landmark Detection
    Yao, Huifeng
    Guo, Ziyu
    Zhang, Yatao
    Li, Xiaomeng
    DEEP GENERATIVE MODELS, AND DATA AUGMENTATION, LABELLING, AND IMPERFECTIONS, 2021, 13003 : 125 - 133
  • [5] Vision transformers are active learners for image copy detection
    Tan, Zhentao
    Wang, Wenhao
    Shan, Caifeng
    NEUROCOMPUTING, 2024, 587
  • [6] BUViTNet: Breast Ultrasound Detection via Vision Transformers
    Ayana, Gelan
    Choe, Se-Woon
    DIAGNOSTICS, 2022, 12 (11)
  • [7] Optimized Vision Transformers for Superior Plant Disease Detection
    Ouamane, Abdelmalik
    Chouchane, Ammar
    Himeur, Yassine
    Miniaoui, Sami
    Atalla, Shadi
    Mansoor, Wathiq
    Al-Ahmad, Hussain
    IEEE ACCESS, 2025, 13 : 48552 - 48570
  • [8] Self-Supervised Vision Transformers for Malware Detection
    Seneviratne, Sachith
    Shariffdeen, Ridwan
    Rasnayaka, Sanka
    Kasthuriarachchi, Nuran
    IEEE ACCESS, 2022, 10 : 103121 - 103135
  • [9] Regularizing self-attention on vision transformers with 2D spatial distance loss
    Mormille, Luiz H.
    Broni-Bediako, Clifford
    Atsumi, Masayasu
    ARTIFICIAL LIFE AND ROBOTICS, 2022, 27 (03) : 586 - 593