Consistency Loss for Improved Colonoscopy Landmark Detection with Vision Transformers

Cited by: 0
Authors
Tamhane, Aniruddha [1 ]
Dobkin, Daniel [1 ]
Shtalrid, Ore [1 ]
Bouhnik, Moshe [1 ]
Posner, Erez [1 ]
Mida, Tse'ela [1 ]
Affiliations
[1] Intuit Surg Inc, 1020 Kifer Rd, Sunnyvale, CA 94086 USA
Source
MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT II | 2024 / Vol. 14349
Keywords
Colonoscopy; Vision Transformer; Landmark Detection; Self-supervised learning; Consistency loss; Data sampling; COLON;
DOI
10.1007/978-3-031-45676-3_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Colonoscopy is a procedure used to examine the colon and rectum for colorectal cancer or other abnormalities including polyps or diverticula. Apart from the actual diagnosis, manually processing the snapshots taken during the colonoscopy procedure (for medical record keeping) consumes a large amount of the clinician's time. This can be automated through post-procedural machine-learning-based algorithms which classify anatomical landmarks in the colon. In this work, we have developed a pipeline for training vision transformers to identify anatomical landmarks, including appendiceal orifice, ileocecal valve/cecum landmark and rectum retroflection. To increase the accuracy of the model, we utilize a hybrid approach that combines algorithm-level and data-level techniques. We introduce a consistency loss to enhance model immunity to label inconsistencies, as well as a semantic non-landmark sampling technique aimed at increasing focus on colonic findings. For training and testing our pipeline, we have annotated 307 colonoscopy videos and 2363 snapshots with the assistance of several medical experts for enhanced reliability. The algorithm identifies landmarks with an accuracy of 92% on the test dataset.
Pages: 124-133
Page count: 10