Real Life Emotion Classification using Spectral Features and Gaussian Mixture Models

Cited by: 5

Authors:

Koolagudi, Shashidhar G. ^{[1]}

Barthwal, Anurag ^{[1]}

Devliyal, Swati ^{[1]}

Rao, K. Sreenivasa ^{[2]}

Affiliations:

[1] Graph Era Univ, Sch Comp, Dehra Dun 248002, Uttarakhand, India

[2] Indian Inst Technol, Kharagpur 721302, W Bengal, India

Source:

INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING | 2012 / Vol. 38

Keywords:

emotion classification; spectral features; GMM; MFCC; LPCC; text dependent emotion recognition; text independent emotion recognition; RECOGNITION; SPEECH;

DOI:

10.1016/j.proeng.2012.06.447

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In this work, spectral features are extracted for speech emotion classification. Mel frequency cepstral coefficients (MFCCs) are used as features, and Gaussian mixture models (GMMs) are explored as classifiers. The emotions considered are anger, happy, neutral, sad and surprise. A semi-natural emotional database (Graphic Era University Semi Natural Emotion Speech Corpus) is collected from the dialogues of popular Hindi movies. The average emotion recognition performance for the multiple-speaker database is observed to be around 55.60%. Results for male, female, multiple-male and multiple-female speakers are compared to study the effect of speaker and gender on the expression of emotions.
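
The abstract names GMMs as the classifier but does not spell out the decision rule. A common arrangement, assumed here and not confirmed by this record, is one GMM per emotion, with an utterance assigned to the emotion whose model gives its MFCC frames the highest total log-likelihood. The sketch below shows only that scoring step in plain Python. The function names, the diagonal-covariance form, and the two-dimensional toy parameters are all hypothetical; real models would be trained on MFCC vectors from the corpus.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of all frames under one GMM.

    gmm is a list of (weight, mean, var) components (assumed layout).
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components for numerical stability.
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total

def classify(frames, models):
    """Return the emotion label whose GMM scores the frames highest."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))

# Toy 2-dimensional stand-ins for MFCC vectors; values are made up.
models = {
    "anger":   [(0.5, [2.0, 2.0], [1.0, 1.0]), (0.5, [3.0, 1.0], [1.0, 1.0])],
    "neutral": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [-1.0, 0.5], [1.0, 1.0])],
}
frames = [[2.1, 1.8], [2.9, 1.2], [2.4, 1.5]]
predicted = classify(frames, models)
```

Summing per-frame log-likelihoods treats frames as independent, which is the usual simplification for GMM-based scoring; training the component weights, means, and variances (for example with expectation-maximization) is left out of this sketch.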


Pages: 3892-3899

Page count: 8

