SPEECH-BASED EMOTION CLASSIFICATION USING MULTICLASS SVM WITH HYBRID KERNEL AND THRESHOLDING FUSION

Authors:

Yang, N. [1]

Muraleedharan, R.

Kohl, J. [1]

Demirkol, I.

Heinzelman, W. [1]

Sturge-Apple, M.

Affiliations:

[1] Univ Rochester, Dept Elect & Comp Engn, Rochester, NY 14627 USA

Source:

2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012) | 2012

Keywords:

Emotion classification; support vector machine; speaker independent; hybrid kernel; thresholding fusion; features

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Emotion classification is essential for understanding human interactions and hence is a vital component of behavioral studies. Although numerous algorithms have been developed, the emotion classification accuracy is still short of what is desired for the algorithms to be used in real systems. In this paper, we evaluate an approach where basic acoustic features are extracted from speech samples, and the One-Against-All (OAA) Support Vector Machine (SVM) learning algorithm is used. We use a novel hybrid kernel, where we choose the optimal kernel functions for the individual OAA classifiers. Outputs from the OAA classifiers are normalized and combined using a thresholding fusion mechanism to finally classify the emotion. Samples with low 'relative confidence' are left as 'unclassified' to further improve the classification accuracy. Results show that the decision-level recall of our approach for six-class emotion classification is 80.5%, outperforming a state-of-the-art approach that uses the same dataset.
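The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the normalization scheme or the exact definition of 'relative confidence', so the code below assumes per-sample min-max normalization and takes relative confidence to be the gap between the two highest normalized One-Against-All scores. The function names, the emotion labels, and the threshold value of 0.2 are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Tuple


def normalize_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale one sample's OAA decision values to [0, 1].

    Assumption: the paper's actual normalization is not stated in the abstract.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All classifiers agree exactly; no class stands out.
        return {label: 0.0 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}


def fuse_with_threshold(
    scores: Dict[str, float], threshold: float = 0.2
) -> Tuple[Optional[str], float]:
    """Pick the top-scoring class, or None ('unclassified') if confidence is low.

    Assumption: relative confidence = best normalized score minus runner-up.
    Requires scores for at least two classes.
    """
    norm = normalize_scores(scores)
    ranked = sorted(norm.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, best), (_, second) = ranked[0], ranked[1]
    confidence = best - second
    if confidence < threshold:
        return None, confidence
    return best_label, confidence


# Hypothetical raw outputs from four OAA classifiers for two samples.
clear_sample = {"anger": 1.4, "happiness": -0.6, "sadness": -1.1, "neutral": -0.9}
ambiguous_sample = {"anger": 0.5, "happiness": 0.4, "sadness": -1.0, "neutral": -0.5}

clear_label, clear_conf = fuse_with_threshold(clear_sample)
ambiguous_label, ambiguous_conf = fuse_with_threshold(ambiguous_sample)
```

In this sketch the first sample has one clearly dominant score and receives a label, while the second has two nearly tied scores and is left unclassified. Raising the threshold trades coverage for accuracy on the samples that remain classified.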

Pages: 455-460

Page count: 6
