Automatic voice pathology detection and classification using vocal tract area irregularity

Times cited: 38
Authors
Muhammad, Ghulam [1]
Altuwaijri, Ghadir [1]
Alsulaiman, Mansour [1]
Ali, Zulfiqar [1, 2]
Mesallam, Tamer A. [3, 4, 5]
Farahat, Mohamed [3, 4]
Malki, Khalid H. [3, 4]
Al-nasheri, Ahmed [1]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Engn, Speech Proc Grp, POB 51178, Riyadh 11543, Saudi Arabia
[2] Univ Tekhnol PETRONAS, Dept Elect & Elect Engn, Ctr Intelligent Signal & Imaging Res, Tronoh, Perak, Malaysia
[3] King Saud Univ, Coll Med, ENT Dept, Riyadh 11461, Saudi Arabia
[4] King Saud Univ, Res Chair Voice Swallowing & Commun Disorders, Riyadh, Saudi Arabia
[5] Al Menoufiya Univ, Coll Med, ENT Dept, Shebin Alkoum, Egypt
Keywords
Voice pathology detection; Vocal tract area; Voice disorders; Support vector machine;
DOI
10.1016/j.bbe.2016.01.004
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
In this paper, an automatic voice pathology detection (VPD) system based on voice production theory is developed. More specifically, features are extracted from the vocal tract area, which is connected to the glottis. Voice pathology is related to a vocal fold problem; hence, for a sustained vowel from a pathological voice, the vocal tract area connected to the vocal folds (glottis) should exhibit irregular patterns across frames. This irregular pattern is quantified in the form of different moments across the frames to distinguish between normal and pathological voices. The proposed VPD system is evaluated on the Massachusetts Eye and Ear Infirmary (MEEI) database and the Saarbrucken Voice Database (SVD) with sustained vowel samples. Vocal tract irregularity features and a support vector machine classifier are used in the proposed system. The proposed system achieves 99.22% ± 0.01 accuracy on the MEEI database and 94.7% ± 0.21 accuracy on the SVD. The results indicate that vocal tract irregularity measures can be used effectively in automatic voice pathology detection. © 2016 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier Sp. z o.o. All rights reserved.
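The idea of quantifying irregularity "in the form of different moments across the frames" can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which moments are used, how the vocal tract area is estimated, or how many tube sections are modeled. The choice of variance, skewness, and kurtosis, the function name `irregularity_features`, the 8-section tube model, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def irregularity_features(areas):
    """Summarize frame-to-frame variation of vocal tract areas.

    areas: array of shape (n_frames, n_tubes); each row holds the
    (hypothetical) tube-section areas estimated for one frame.
    Returns variance, skewness, and kurtosis of every tube section
    across frames, concatenated into one feature vector.
    The choice of these three moments is an assumption.
    """
    areas = np.asarray(areas, dtype=float)
    centered = areas - areas.mean(axis=0)
    var = (centered ** 2).mean(axis=0)
    std = np.sqrt(var)
    # Avoid division by zero for perfectly constant sections.
    safe_std = np.where(std > 0, std, 1.0)
    skew = (centered ** 3).mean(axis=0) / safe_std ** 3
    kurt = (centered ** 4).mean(axis=0) / safe_std ** 4
    return np.concatenate([var, skew, kurt])


# Synthetic stand-ins for real measurements (illustrative only).
rng = np.random.default_rng(0)
n_frames, n_tubes = 50, 8
base = np.linspace(1.0, 3.0, n_tubes)  # nominal area per section

# A steady sustained vowel: areas barely change between frames.
steady = base + 0.01 * rng.standard_normal((n_frames, n_tubes))
# An irregular voice: areas fluctuate strongly between frames.
irregular = base + 0.30 * rng.standard_normal((n_frames, n_tubes))

f_steady = irregularity_features(steady)
f_irregular = irregularity_features(irregular)

# The variance block (first n_tubes entries) is larger for the
# irregular signal than for the steady one.
print(f_steady[:n_tubes].mean(), f_irregular[:n_tubes].mean())
```

In a complete system, feature vectors of this kind would be passed to a support vector machine classifier, as the abstract describes. That step is omitted here because the abstract gives no kernel or parameter settings.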
Pages: 309-317
Page count: 9
Related papers
28 in total
[1]  
Ali Z., 2015, J. Voice
[2]  
[Anonymous], 1994, DIS VOIC DAT VERS 1
[3]   Automatic Detection of Pathological Voices Using Complexity Measures, Noise Parameters, and Mel-Cepstral Coefficients [J].
Arias-Londono, Julian D. ;
Godino-Llorente, Juan I. ;
Saenz-Lechon, Nicolas ;
Osma-Ruiz, Victor ;
Castellanos-Dominguez, German .
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2011, 58 (02) :370-379
[4]   Corpora for the evaluation of speaker recognition systems [J].
Campbell, JP ;
Reynolds, DA .
ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, :829-832
[5]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[6]   Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and short-term cepstral parameters [J].
Godino-Llorente, Juan Ignacio ;
Gomez-Vilda, Pedro ;
Blanco-Velasco, Manuel .
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2006, 53 (10) :1943-1953
[7]   Cepstral peak prominence: A more reliable measure of dysphonia [J].
Heman-Ackah, YD ;
Heuer, RJ ;
Michael, DD ;
Ostrowski, R ;
Horman, M ;
Baroody, MM ;
Hillenbrand, J ;
Sataloff, RT .
ANNALS OF OTOLOGY RHINOLOGY AND LARYNGOLOGY, 2003, 112 (04) :324-333
[8]   Cloud-Assisted Speech and Face Recognition Framework for Health Monitoring [J].
Hossain, M. Shamim ;
Muhammad, Ghulam .
MOBILE NETWORKS & APPLICATIONS, 2015, 20 (03) :391-399
[9]  
Kent RD, 2008, HDB CLIN LINGUISTICS, P364, DOI 10.1002/9781444301007
[10]   LISTENER EXPERIENCE AND PERCEPTION OF VOICE QUALITY [J].
KREIMAN, J ;
GERRATT, BR ;
PRECODA, K .
JOURNAL OF SPEECH AND HEARING RESEARCH, 1990, 33 (01) :103-115