Voice Pathology Detection and Classification Using MPEG-7 Audio Low-Level Features

Cited by: 0

Authors:

Muhammad, Ghulam ^{[1]}

Melhem, Moutasem ^{[1]}

Affiliations:

[1] King Saud Univ, Dept Comp Engn, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

MPEG-7 audio features; dysphonia recognition; support vector machines; pathology binary classification; Fisher discrimination ratio; RECOGNITION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a new method for pathological voice detection and pathology classification based on MPEG-7 audio low-level features is proposed. MPEG-7 features were originally used for multimedia indexing, which covers both video and audio. Indexing is related to event detection, and because a pathological voice is a different event from a normal voice, we show that MPEG-7 audio low-level features perform very well in detecting pathological voices as well as in classifying the pathologies. The experiments are done on a subset of sustained vowel (namely, "AH") recordings from healthy subjects and subjects with voice pathologies, taken from the MEEI database. For classification, a support vector machine (SVM) with 10-fold cross-validation is applied. The proposed method, combining MPEG-7 audio features with SVM classification, is evaluated on both voice pathology detection and pathology classification. The experimental results show that the proposed method outperforms some recent methods in the literature in both detection and classification. The proposed method achieves an accuracy of 99.994 ± 0.0105% for detecting pathological voices and an accuracy of 100% for binary pathology classification.
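The record names two standard building blocks, the Fisher discrimination ratio (in the keywords) and 10-fold cross-validation (in the abstract), without giving details. The following is a minimal Python sketch of both, not the authors' implementation. It assumes the common two-class definition FDR = (mu_a - mu_b)^2 / (var_a + var_b); the feature names, the sample values, and the helper functions `fisher_discrimination_ratio` and `k_fold_indices` are hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def fisher_discrimination_ratio(class_a, class_b):
    """Two-class FDR of one feature: (mu_a - mu_b)^2 / (var_a + var_b)."""
    denom = pvariance(class_a) + pvariance(class_b)
    if denom == 0:
        raise ValueError("both classes have zero variance for this feature")
    return (mean(class_a) - mean(class_b)) ** 2 / denom


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous test folds.

    Fold sizes differ by at most one. Samples are assumed to be
    shuffled beforehand; this helper does not shuffle.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical feature values (not taken from the paper or any database).
normal = {
    "feature_a": [1.0, 1.2, 0.8, 1.1],
    "feature_b": [0.5, 0.7, 0.6, 0.4],
}
pathological = {
    "feature_a": [3.0, 3.3, 2.9, 3.1],
    "feature_b": [0.6, 0.5, 0.7, 0.5],
}

# Rank features from most to least discriminative according to the FDR.
ranking = sorted(
    normal,
    key=lambda name: fisher_discrimination_ratio(normal[name], pathological[name]),
    reverse=True,
)

# Each fold serves once as the test set; the other nine form the training set.
folds = k_fold_indices(25, k=10)
```

In a full pipeline, each fold would be held out in turn while a classifier such as an SVM is trained on the remaining folds, and the per-fold accuracies would be summarized as a mean and a spread, which is the likely form of the figure reported in the abstract.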

Pages: 3594-3598

Number of pages: 5

References: 21 in total

[1] Automatic Detection of Pathological Voices Using Complexity Measures, Noise Parameters, and Mel-Cepstral Coefficients
Arias-Londono, Julian D.
Godino-Llorente, Juan I.
Saenz-Lechon, Nicolas
Osma-Ruiz, Victor
Castellanos-Dominguez, German
[J]. IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2011, 58 (02) : 370 - 379
[2] Baken R. J., 2000, Clinical Measurement of Speech and Voice
[3] Corpora for the evaluation of speaker recognition systems
Campbell, JP
Reynolds, DA
[J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 829 - 832
[4] LIBSVM: A Library for Support Vector Machines
Chang, Chih-Chung
Lin, Chih-Jen
[J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[5] Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and short-term cepstral parameters
Godino-Llorente, Juan Ignacio
Gomez-Vilda, Pedro
Blanco-Velasco, Manuel
[J]. IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2006, 53 (10) : 1943 - 1953
[6] DESCRIPTIONS OF SPEECH OF PATIENTS WITH CANCER OF VOCAL FOLDS .1. MEASURES OF FUNDAMENTAL FREQUENCY
HECKER, MHL
KREUL, EJ
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1971, 49 (04) : 1275 - &
[7] ISO/IEC 15938-4, 2001, 159384 ISOIEC
[8] PERTURBATIONS IN VOCAL PITCH
LIEBERMAN, P
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1961, 33 (05) : 597 - &
[9] Voice Pathology Detection and Discrimination Based on Modulation Spectral Features
Markaki, Maria
Stylianou, Yannis
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07) : 1938 - 1948
[10] Massachusetts Eye and Ear Infirmary, 1994, EL DIS VOIC DAT VERS