Deep Learning Applied to White Light and Narrow Band Imaging Videolaryngoscopy: Toward Real-Time Laryngeal Cancer Detection

Cited by: 69
Authors
Azam, Muhammad Adeel [1, 2]
Sampieri, Claudio [3, 4]
Ioppi, Alessandro [3, 4]
Africano, Stefano [3, 4]
Vallin, Alberto [4]
Mocellin, Davide [4]
Fragale, Marco [3, 4]
Guastini, Luca [3, 4]
Moccia, Sara [5]
Piazza, Cesare [6, 7]
Mattos, Leonardo S. [1, 2]
Peretti, Giorgio [3, 4]
Affiliations
[1] Ist Italiano Tecnol, Dept Adv Robot, Genoa, Italy
[2] Univ Genoa, Dept Informat Bioengn Robot & Syst Engn, Genoa, Italy
[3] IRCCS Osped Policlinico San Martino, Unit Otorhinolaryngol Head & Neck Surg, Genoa, Italy
[4] Univ Genoa, Dept Surg Sci & Integrated Diagnost DISC, Largo Rosanna Benzi 10, I-16132 Genoa, Italy
[5] Biorobot Inst, Dept Excellence Robot & AI, Scuola Superiore SantAnna, Brescia, Italy
[6] ASST Spedali Civili Brescia, Unit Otorhinolaryngol Head & Neck Surg, Brescia, Italy
[7] Univ Brescia, Dept Med & Surg Specialties Radiol Sci & Publ Hlt, Brescia, Italy
Keywords
Larynx cancer; deep learning; narrow band imaging; computer-assisted image interpretation; videolaryngoscopy
DOI
10.1002/lary.29960
Chinese Library Classification
R-3 [Medical research methods]; R3 [Basic medicine]
Subject classification code
1001
Abstract
Objectives: To assess a new application of artificial intelligence for real-time detection of laryngeal squamous cell carcinoma (LSCC) in both white light (WL) and narrow-band imaging (NBI) videolaryngoscopies, based on the You-Only-Look-Once (YOLO) deep learning convolutional neural network (CNN).
Study Design: Experimental study with retrospective data.
Methods: Recorded videos of LSCC were retrospectively collected from in-office transnasal videoendoscopies and intraoperative rigid endoscopies. LSCC videoframes were extracted for training, validation, and testing of various YOLO models. Several techniques were used to enhance the image analysis: contrast limited adaptive histogram equalization, data augmentation, and test time augmentation (TTA). The best-performing model was then used to assess automatic detection of LSCC in six videolaryngoscopies.
Results: Two hundred and nineteen patients were retrospectively enrolled, and a total of 624 LSCC videoframes were extracted. The YOLO models were trained after random distribution of the images into a training set (82.6%), a validation set (8.2%), and a testing set (9.2%). Among the various models, the ensemble algorithm (YOLOv5s with YOLOv5m-TTA) achieved the best LSCC detection results, with performance metrics on par with those reported for other state-of-the-art detection models: 0.66 precision (positive predictive value), 0.62 recall (sensitivity), and 0.63 mean average precision at an intersection over union threshold of 0.5. Tests on the six videolaryngoscopies showed an average computation time of 0.026 seconds per videoframe. Three demonstration videos are provided.
Conclusion: This study identified a suitable CNN model for LSCC detection in WL and NBI videolaryngoscopies. Detection performance is highly promising. The limited complexity and short computation time for LSCC detection make this model ideal for real-time processing.
Level of Evidence: 3
Laryngoscope, 2021
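The detection metrics quoted in the abstract (precision, recall, and a 0.5 intersection over union threshold) and the per-frame timing can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the (x1, y1, x2, y2) box format, and the greedy one-to-one matching rule are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truths, iou_threshold=0.5):
    """Precision and recall for one frame (hypothetical greedy matching).

    A prediction counts as a true positive when its best unmatched
    ground-truth box overlaps it with IoU >= iou_threshold.
    """
    matched = set()
    true_pos = 0
    for pred in predictions:
        best_idx, best_iou = None, 0.0
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score > best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(ground_truths) if ground_truths else 0.0
    return precision, recall


def frames_per_second(seconds_per_frame):
    """Convert an average per-frame computation time to a frame rate."""
    return 1.0 / seconds_per_frame
```

Under these assumptions, the reported 0.026 seconds per videoframe corresponds to `frames_per_second(0.026)`, roughly 38 frames per second, which is consistent with the abstract's claim of suitability for real-time processing.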
Pages: 1798-1806
Number of pages: 9