Directed Acyclic Graphs for Content Based Sound, Musical Genre, and Speech Emotion Classification

Cited by: 10
Authors
Ntalampiras, Stavros [1 ]
Institutions
[1] Politecn Milan, I-20133 Milan, Italy
Keywords
audio signal processing; content-based generalized sound recognition; decision directed acyclic graph; hidden Markov model; recognition; model
DOI
10.1080/09298215.2013.859709
CLC Classification Number
TP39 [Computer Applications]
Subject Classification Number
081203 ; 0835 ;
Abstract
This work introduces the methodology of Decision Directed Acyclic Graphs (DDAG) to the scientific domain of content-based audio signal processing. We apply this methodology to three multiclass classification problems involving the categories of generalized sound events, musical genres, and speech expressing emotional states. A decision graph is constructed which breaks the overall problem into a series of two-class ones. The order of the graph nodes is determined using a clustering criterion based on the Kullback-Leibler divergence. Every graph node is composed of two hidden Markov models, each representing one of the classes participating in the specific two-class problem. We extract three heterogeneous feature sets (Mel-Filterbank, MPEG-7 Audio Spectrum Projection, and Perceptual Wavelet Packets) from each recording and fuse them for training the HMMs. Extensive comparative experiments are conducted using the following three datasets: (a) a combination of professional sound-effects collections, (b) the GTZAN musical genre database, and (c) the BERLIN emotional speech corpus. The results demonstrate the superiority of the DDAG classification approach over the standard HMM approach regardless of the application task.
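The DDAG evaluation described in the abstract reduces an N-class problem to N-1 pairwise decisions: each graph node compares two candidate classes and eliminates one, and the class that survives all comparisons is the prediction. The following is a minimal sketch of that traversal, not the paper's implementation: `node_score` is a hypothetical stand-in for a DDAG node, which in the paper compares the log-likelihoods of the node's two class-specific HMMs, and the class labels and likelihood values below are illustrative only.

```python
def ddag_classify(candidates, node_score):
    """Classify via a Decision Directed Acyclic Graph traversal.

    candidates: list of class labels, ordered as in the graph's top level
                (in the paper this order comes from KL-divergence clustering).
    node_score: callable (a, b) -> winning label; stands in for a graph node
                (in the paper, a comparison of two HMM log-likelihoods).
    """
    remaining = list(candidates)
    # Each node eliminates exactly one class: compare the first and last
    # remaining candidates and drop whichever loses the two-class decision.
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if node_score(a, b) == a:
            remaining.pop()      # b is eliminated
        else:
            remaining.pop(0)     # a is eliminated
    return remaining[0]


# Hypothetical usage: per-class log-likelihoods decide each pairwise node.
loglik = {"sound": -10.0, "music": -3.0, "speech": -7.0}
pred = ddag_classify(["sound", "music", "speech"],
                     lambda a, b: a if loglik[a] > loglik[b] else b)
# -> "music" (it wins every pairwise comparison it participates in)
```

For N classes, exactly N - 1 node evaluations are made along any path through the graph, which is what makes the scheme cheaper than evaluating all N(N-1)/2 pairs while still using only two-class decisions.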
Pages: 173-182
Page count: 10