Voice Pathology Detection and Classification Using MPEG-7 Audio Low-Level Features

Cited by: 0

Authors:

Muhammad, Ghulam ^{[1]}

Melhem, Moutasem ^{[1]}

Affiliations:

[1] King Saud Univ, Dept Comp Engn, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

MPEG-7 audio features; dysphonia recognition; support vector machines; pathology binary classification; Fisher discrimination ratio; RECOGNITION;

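DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: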

This paper proposes a new method for detecting pathological voices and classifying voice pathologies based on MPEG-7 audio low-level features. MPEG-7 features were originally designed for multimedia indexing, which covers both video and audio. Indexing is related to event detection, and because a pathological voice is a different event from a normal voice, we show that MPEG-7 audio low-level features perform very well in detecting pathological voices and in classifying the pathologies. The experiments use a subset of sustained vowel (namely, "AH") recordings from healthy subjects and subjects with voice pathologies, taken from the MEEI database. For classification, a support vector machine (SVM) with 10-fold cross-validation is applied. The proposed method, combining MPEG-7 audio features with SVM classification, is evaluated on both voice pathology detection and pathology classification. The experimental results show that it outperforms some recent methods in the literature in both tasks, achieving an accuracy of 99.994 ± 0.0105% for detecting pathological voices and 100% for binary pathology classification.
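The evaluation protocol named in the abstract and keywords (Fisher discrimination ratio, 10-fold cross-validation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic one-dimensional values rather than MPEG-7 descriptors, the ratio uses the common two-class form (squared mean difference over summed variances), which may differ from the paper's exact definition, and the classifier is a toy threshold rule standing in for the SVM. All function names are hypothetical.

```python
import random
import statistics


def fisher_ratio(class_a, class_b):
    """Two-class Fisher ratio of one feature: squared mean gap over summed variances."""
    gap = statistics.mean(class_a) - statistics.mean(class_b)
    return gap ** 2 / (statistics.pvariance(class_a) + statistics.pvariance(class_b))


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def midpoint_threshold(train_x, train_y, test_x, test_y):
    """Toy stand-in classifier (NOT an SVM): cut at the midpoint of the class means."""
    mean0 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 1)
    cut = (mean0 + mean1) / 2
    predictions = [int((x > cut) == (mean1 > mean0)) for x in test_x]
    return sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)


def cross_validate(features, labels, train_and_score, k=10, seed=0):
    """Return the mean and standard deviation of accuracy over k held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        scores.append(train_and_score(
            [features[i] for i in train], [labels[i] for i in train],
            [features[i] for i in held_out], [labels[i] for i in held_out]))
    return statistics.mean(scores), statistics.pstdev(scores)


# Synthetic one-dimensional "feature": label 0 = normal, label 1 = pathological.
rng = random.Random(1)
normal = [rng.gauss(0.0, 1.0) for _ in range(50)]
pathological = [rng.gauss(4.0, 1.0) for _ in range(50)]
features = normal + pathological
labels = [0] * 50 + [1] * 50

ratio = fisher_ratio(normal, pathological)
mean_acc, std_acc = cross_validate(features, labels, midpoint_threshold)
print(f"Fisher ratio: {ratio:.2f}")
print(f"10-fold accuracy: {100 * mean_acc:.2f} +/- {100 * std_acc:.2f} %")
```

A feature with a larger Fisher ratio separates the two classes more cleanly, so ranking features by it is one plausible way to select a subset before training. Any classifier with the same four-argument signature can replace `midpoint_threshold`.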


Pages: 3594-3598

Page count: 5

References: 21 in total

