Characterization Methods for the Detection of Multiple Voice Disorders: Neurological, Functional, and Laryngeal Diseases

Cited by: 96
Authors
Orozco-Arroyave, Juan Rafael [1, 2]
Belalcazar-Bolanos, Elkyn Alexander [1]
Arias-Londono, Julian David [1]
Vargas-Bonilla, Jesus Francisco [1]
Skodda, Sabine [3 ]
Rusz, Jan [4 ]
Daqrouq, Khaled [5 ]
Hoenig, Florian [2 ]
Noeth, Elmar [2, 5]
Affiliations
[1] UdeA, Fac Engn, Medellin 1226, Colombia
[2] Univ Erlangen Nurnberg, Pattern Recognit Lab, D-91054 Erlangen, Germany
[3] Ruhr Univ Bochum, Dept Neurol, Knappschaftskrankenhaus, D-44801 Bochum, Germany
[4] Czech Tech Univ, Dept Circuit Theory, Fac Elect Engn, Prague 16636, Czech Republic
[5] King Abdulaziz Univ, Dept Elect & Comp Engn, Jeddah 22254, Saudi Arabia
Keywords
Hypernasality; laryngeal pathologies (LP); noise measures; nonlinear behavior; Parkinson's disease (PD); periodicity; spectral-cepstral modeling; stability; PARKINSONS-DISEASE; PATHOLOGICAL VOICES; AUTOMATIC DETECTION; DIMENSIONALITY REDUCTION; ACOUSTIC ANALYSIS; LARGE-SAMPLE; CLEFT-LIP; SPEECH; ARTICULATION; IMPAIRMENT
DOI
10.1109/JBHI.2015.2467375
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper evaluates the accuracy of different characterization methods for the automatic detection of multiple speech disorders. The speech impairments considered include dysphonia in people with Parkinson's disease (PD), dysphonia diagnosed in patients with different laryngeal pathologies (LP), and hypernasality in children with cleft lip and palate (CLP). Four different methods are applied to analyze the voice signals: noise content measures, spectral-cepstral modeling, nonlinear features, and measurements that quantify the stability of the fundamental frequency. These measures are tested on six databases: three with recordings of PD patients, two with patients with LP, and one with children with CLP. The abnormal vibration of the vocal folds observed in PD patients and in people with LP is modeled using the stability measures, with accuracies ranging from 81% to 99% depending on the pathology. The spectral-cepstral features are used in this paper to model the voice spectrum, with special emphasis around the first two formants. These measures exhibit accuracies ranging from 95% to 99% in the automatic detection of hypernasal voices, which confirms the presence of changes in the speech spectrum due to hypernasality. Noise measures suitably discriminate between dysphonic and healthy voices in both databases with speakers suffering from LP. The results obtained in this study suggest that not every kind of feature is suitable for modeling all voice pathologies; instead, it is necessary to study the physiology of each impairment to choose the most appropriate set of features.
Pages: 1820-1828
Number of pages: 9