Voice Disorder Identification by using Hilbert-Huang Transform (HHT) and K Nearest Neighbor (KNN)

Cited by: 41
Authors
Chen, Lili [1]
Wang, Chaoyu [1]
Chen, Junjiang [1]
Xiang, Zejun [2]
Hu, Xue [3]
Affiliations
[1] Chongqing Jiaotong Univ, Sch Mechatron & Vehicle Engn, Chongqing, Peoples R China
[2] Chongqing Survey Inst, Chongqing, Peoples R China
[3] Chongqing Med Univ, Affiliated Hosp 1, Dept Blood Transfus, Chongqing 400016, Peoples R China
Keywords
Voice disorders; Hilbert-Huang transform; Linear Prediction Coefficient; K nearest Neighbor; EMPIRICAL MODE DECOMPOSITION; PATHOLOGY DETECTION; CLASSIFICATION; SPEECH; RECOGNITION; SYSTEM; HEALTHY; AUDIO
DOI
10.1016/j.jvoice.2020.03.009
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Objectives. Clinical evaluation of dysphonic voices involves a multidimensional approach, including a variety of instrumental and noninstrumental measures. Acoustic analyses provide objective, noninvasive, and intelligent measures of voice quality. Based on sound recordings, this paper proposes a new method for classifying voice disorders with HHT and KNN. Methods. In this research, 12 features of each sample are calculated by HHT. Based on the Linear Prediction Coefficient (LPCC) algorithm, each sample is characterized by 9 further features. After each sample is expressed by 21 features, the classifier is constructed based on KNN. In addition, the KNN-based classifier was compared with random forest and extra trees classifiers in terms of their classification performance on voice disorders. Results. The experimental results reveal that the KNN-based classifier performed better than the other two classifiers, with an accuracy of 93.3%, precision of 93%, recall of 95%, F1-score of 94%, and an area under the receiver operating characteristic curve of 0.976. Conclusions. The method put forward in this paper can be effectively used to classify voice disorders.
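The classification step described in the abstract (21 features per sample, a KNN vote, then accuracy, precision, recall and F1) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature values are synthetic stand-ins rather than real HHT or LPCC output, and the function names `make_sample`, `knn_predict` and `binary_metrics`, the value k=5, the Euclidean distance and the 90/30 split are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_sample(rng, center):
    """Hypothetical stand-in for one recording: 12 'HHT' + 9 'LPCC' values."""
    hht = [rng.gauss(center, 1.0) for _ in range(12)]   # not real HHT features
    lpcc = [rng.gauss(center, 1.0) for _ in range(9)]   # not real LPCC features
    return hht + lpcc                                   # 21-dimensional vector


def knn_predict(train_X, train_y, x, k=5):
    """Label x by majority vote among its k nearest training vectors."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic data: label 0 = healthy, label 1 = disordered (assumed encoding).
rng = random.Random(0)
data = ([(make_sample(rng, 0.0), 0) for _ in range(60)]
        + [(make_sample(rng, 3.0), 1) for _ in range(60)])
rng.shuffle(data)
train, test = data[:90], data[90:]

train_X = [x for x, _ in train]
train_y = [y for _, y in train]
test_X = [x for x, _ in test]
test_y = [y for _, y in test]

pred = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
metrics = binary_metrics(test_y, pred)
print(metrics)
```

The two synthetic classes are deliberately well separated, so the scores printed here say nothing about the figures reported in the abstract. An odd k is chosen so that a two-class vote cannot tie.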
Pages: 932.e1-932.e11
Number of pages: 11