Diagnosis of Early Glottic Cancer Using Laryngeal Image and Voice Based on Ensemble Learning of Convolutional Neural Network Classifiers

Cited by: 10
Authors
Kwon, Ickhwan [1 ]
Wang, Soo-Geun [2 ,3 ]
Shin, Sung-Chan [2 ,3 ]
Cheon, Yong-Il [2 ,3 ]
Lee, Byung-Joo [2 ,3 ]
Lee, Jin-Choon [4 ]
Lim, Dong-Won [5 ]
Jo, Cheolwoo [6 ]
Cho, Youngseuk [7 ]
Shin, Bum-Joo [1 ]
Affiliations
[1] Pusan Natl Univ, Dept Appl IT & Engn, Miryang, Gyeongsangnam-do, South Korea
[2] Pusan Natl Univ, Coll Med, Dept Otorhinolaryngol Head & Neck Surg, Busan, South Korea
[3] Pusan Natl Univ Hosp, Med Res Inst, Busan, South Korea
[4] Pusan Natl Univ, Yangsan Hosp, Dept Otorhinolaryngol Head & Neck Surg, Yangsan, Gyeongsangnam-do, South Korea
[5] Pusan Natl Univ Hosp, Dept Otorhinolaryngol Head & Neck Surg, Busan, South Korea
[6] Changwon Natl Univ, Sch Elect Elect & Control Engn, Changwon, South Korea
[7] Pusan Natl Univ, Coll Nat Sci, Dept Stat, Busan, South Korea
Keywords
Diagnosis; Glottic cancer; Laryngeal image and voice; Ensemble learning; Convolutional neural network classifiers; PATHOLOGY;
DOI
10.1016/j.jvoice.2022.07.007
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification codes
100104; 100213;
Abstract
Objectives. The purpose of this study is to improve classification accuracy in the diagnosis of glottic cancer by comparing the results of decision tree ensemble learning, one of the methods for increasing classification accuracy on a relatively small dataset, with the results of convolutional neural network (CNN) classifiers.

Methods. The Pusan National University Hospital (PNUH) dataset was used to build the classifiers, and the Pusan National University Yangsan Hospital (PNUYH) dataset was used to verify the performance of the resulting models. Deep learning-based CNN models were built to classify laryngeal images and voice data for the diagnosis of glottic cancer. Classification accuracy was then obtained by performing decision tree ensemble learning on the class probabilities produced by the CNN classifiers, using the classification and regression tree (CART) method. Finally, we compared the classification accuracy of decision tree ensemble learning with that of the individual CNN classifiers by fusing the laryngeal image and voice decision tree classifiers.

Results. The laryngeal image and voice classification models achieved accuracies of 81.03% and 99.18%, respectively, on the PNUH training dataset. However, the accuracy of the CNN classifiers decreased to 73.88% for voice and 68.92% for laryngeal images when using the external PNUYH dataset. To address this, decision tree ensemble learning was applied to the laryngeal image and voice outputs, and classification accuracy was improved by integrating the laryngeal image and voice data of the same person. The individual laryngeal image and voice decision tree models achieved accuracies of 87.88% and 89.06%, respectively, and fusing the laryngeal image and voice decision tree results yielded a classification accuracy of 95.31%.

Conclusion. Our results suggest that decision tree ensemble learning, which trains multiple classifiers, is useful for increasing classification accuracy despite a small dataset. Although a large amount of data is essential for AI analysis, high diagnostic classification accuracy can be expected when an integrated approach combines various kinds of input data.
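The two-stage pipeline in the abstract — per-modality CNN probabilities fed into CART decision trees, then fused across modalities for the same subject — can be sketched as follows. This is a minimal illustration with synthetic placeholder data: the paper's actual CNN features, tree depths, and fusion rule are not specified here, so the probability stand-ins, `max_depth=3`, and the probability-averaging fusion are all assumptions.

```python
# Sketch: CNN class probabilities for laryngeal image and voice feed two
# CART decision trees; the trees' outputs are fused by averaging the
# predicted cancer probability for the same subject.
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # scikit-learn uses CART

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)  # 0 = benign, 1 = glottic cancer (synthetic)

# Stand-ins for each CNN's softmax "probability of cancer" output.
p_image = np.clip(y + rng.normal(0, 0.35, n), 0, 1).reshape(-1, 1)
p_voice = np.clip(y + rng.normal(0, 0.30, n), 0, 1).reshape(-1, 1)

# One CART tree per modality, trained on the CNN probabilities.
tree_image = DecisionTreeClassifier(max_depth=3, random_state=0).fit(p_image, y)
tree_voice = DecisionTreeClassifier(max_depth=3, random_state=0).fit(p_voice, y)

# Fusion: average the per-modality tree probabilities subject by subject.
fused = (tree_image.predict_proba(p_image)[:, 1]
         + tree_voice.predict_proba(p_voice)[:, 1]) / 2
y_pred = (fused >= 0.5).astype(int)
accuracy = (y_pred == y).mean()
print(f"fused training accuracy: {accuracy:.2f}")
```

With real data, each tree would be trained on held-out CNN probabilities rather than the training outputs, and the fusion could equally be a third tree over the two probabilities; averaging is just the simplest choice for the sketch.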
Pages: 245-257 (13 pages)