Voice Pathology Detection and Classification Using MPEG-7 Audio Low-Level Features

Cited by: 0
Authors
Muhammad, Ghulam [1 ]
Melhem, Moutasem [1 ]
Affiliations
[1] King Saud Univ, Dept Comp Engn, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
MPEG-7 audio features; dysphonia recognition; support vector machines; pathology binary classification; Fisher discrimination ratio; RECOGNITION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a new method for pathological voice detection and pathology classification based on MPEG-7 audio low-level features. MPEG-7 features were originally designed for multimedia indexing, which covers both video and audio. Indexing is related to event detection, and because a pathological voice is a different event from a normal voice, we show that MPEG-7 audio low-level features perform very well in detecting pathological voices as well as in classifying the pathologies. The experiments are conducted on a subset of sustained vowel (namely, "AH") recordings from healthy subjects and subjects with voice pathologies, taken from the MEEI database. For classification, a support vector machine (SVM) with 10-fold cross-validation is applied. The proposed method, combining MPEG-7 audio features with SVM classification, is evaluated on both voice pathology detection and pathology classification. The experimental results show that it outperforms some recent methods in the literature in both detection and classification. The proposed method achieves an accuracy of 99.994 ± 0.0105% for detecting pathological voices and an accuracy of 100% for binary pathology classification.
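The evaluation protocol named in the abstract and keywords (Fisher discrimination ratio, 10-fold cross-validation, accuracy reported as mean ± standard deviation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the data are synthetic two-dimensional vectors rather than MPEG-7 descriptors, the helper names are hypothetical, the Fisher ratio uses the common two-class form (μ₁ − μ₂)² / (σ₁² + σ₂²), and a nearest-centroid rule stands in for the SVM so that the example needs only the Python standard library. It is not the authors' implementation.

```python
import random
import statistics


def fisher_ratio(xs_a, xs_b):
    """Two-class Fisher discrimination ratio of a single feature:
    squared difference of class means over the sum of class variances."""
    mean_a, mean_b = statistics.fmean(xs_a), statistics.fmean(xs_b)
    var_a, var_b = statistics.pvariance(xs_a), statistics.pvariance(xs_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_accuracy(X, y, train, test):
    """Stand-in classifier (NOT an SVM): label each test vector with the
    class whose training centroid is closest in Euclidean distance."""
    centroids = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    correct = 0
    for i in test:
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(X[i], centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(test)


def cross_validate(X, y, k=10, seed=0):
    """Return (mean, standard deviation) of accuracy over k folds."""
    folds = k_fold_indices(len(X), k, random.Random(seed))
    accs = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        accs.append(nearest_centroid_accuracy(X, y, train, test))
    return statistics.fmean(accs), statistics.pstdev(accs)


# Synthetic data (assumption): feature 0 separates the classes, feature 1 is noise.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
X += [[rng.gauss(4, 1), rng.gauss(0, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

# Rank features by Fisher ratio, then run 10-fold cross-validation.
ratios = [
    fisher_ratio([x[j] for x, c in zip(X, y) if c == 0],
                 [x[j] for x, c in zip(X, y) if c == 1])
    for j in range(2)
]
mean_acc, std_acc = cross_validate(X, y, k=10)
print("Fisher ratios per feature:", ratios)
print(f"10-fold accuracy: {100 * mean_acc:.2f} ± {100 * std_acc:.2f}%")
```

In a setting closer to the paper's, the synthetic vectors would be replaced by per-recording MPEG-7 low-level descriptors and the stand-in classifier by an SVM; the fold handling and the mean ± standard deviation reporting would stay the same.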
Pages: 3594-3598
Page count: 5