Speaker normalisation for speech-based emotion detection

Cited by: 32
Authors
Sethu, Vidhyasaharan [1, 2]
Ambikairajah, Eliathainby [1, 2]
Epps, Julien [1, 3]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
[2] NICTA, Sydney, NSW, Australia
[3] UNSW Asia, Singapore 248922, Singapore
Source
PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING | 2007
Keywords
feature warping; cumulative distribution mapping; emotion detection; hidden Markov model
DOI
10.1109/ICDSP.2007.4288656
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The focus of this paper is on speech-based emotion detection utilising only acoustic data, i.e. without using any linguistic or semantic information. However, this approach in general suffers from the fact that acoustic data is speaker-dependent, which can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs). We propose the use of speaker-specific feature warping as a means of normalising acoustic features to overcome the problem of speaker dependency. In this paper we compare the performance of a system that uses feature warping to one that does not. The back-end employs an HMM-based classifier that captures the temporal variations of the feature vectors by modelling them as transitions between different states. Evaluations conducted on the LDC Emotional Prosody speech corpus reveal a relative increase in classification accuracy of up to 20%.
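The abstract names speaker-specific feature warping (cumulative distribution mapping) but gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' method: each feature dimension of one speaker's data is ranked, the rank is converted to an empirical CDF value, and that value is mapped through the inverse CDF of a target distribution. The standard normal target, the use of all of a speaker's frames at once, and the function names `warp_feature` and `warp_speaker` are assumptions made for illustration.

```python
from statistics import NormalDist


def warp_feature(values):
    """Warp one speaker's values of a single feature to a standard normal target.

    Assumption: the target distribution is N(0, 1) and the empirical CDF is
    estimated from all of the values passed in (no sliding window).
    """
    n = len(values)
    # Indices sorted by feature value; position in this list is the rank.
    order = sorted(range(n), key=lambda i: values[i])
    target = NormalDist(mu=0.0, sigma=1.0)
    warped = [0.0] * n
    for rank, idx in enumerate(order):
        # Empirical CDF at the middle of the rank bin, strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        # Map the CDF value through the inverse CDF of the target distribution.
        warped[idx] = target.inv_cdf(cdf)
    return warped


def warp_speaker(frames):
    """Warp every feature dimension independently for one speaker.

    frames: list of equal-length feature vectors, one per frame (hypothetical layout).
    """
    columns = list(zip(*frames))
    warped_columns = [warp_feature(list(col)) for col in columns]
    return [list(row) for row in zip(*warped_columns)]


# Two hypothetical speakers whose raw features differ only by an offset
# end up with identical warped features.
speaker_a = warp_feature([1.0, 2.0, 3.0])
speaker_b = warp_feature([101.0, 102.0, 103.0])
```

Because the mapping depends only on the rank of each value within the speaker's own data, speaker-specific offsets and scalings of a feature are removed while the ordering of frames is preserved.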
Pages: 611 / +
Page count: 2
Related papers
50 in total
  • [31] A Quick Sequential Forward Floating Feature Selection Algorithm for Emotion Detection from Speech
    Brendel, Matyas
    Zaccarelli, Riccardo
    Devillers, Laurence
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1157 - 1160
  • [32] Emotion-detecting based model selection for emotional speech recognition
    Pan, Y. C.
    Xu, M. X.
    Liu, L. Q.
    Jia, P. F.
    2006 IMACS: MULTICONFERENCE ON COMPUTATIONAL ENGINEERING IN SYSTEMS APPLICATIONS, VOLS 1 AND 2, 2006, : 2169 - +
  • [33] Automatic speech emotion detection using hybrid of gray wolf optimizer and naive Bayes
    Ramesh, S.
    Gomathi, S.
    Sasikala, S.
    Saravanan, T. R.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 26 (3) : 571 - 578
  • [34] Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterances
    Nomoto, Narichika
    Masataki, Hirokazu
    Yoshioka, Osamu
    Takahashi, Satoshi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 494 - 497
  • [35] Emotion modeling from speech signal based on wavelet packet transform
    Degaonkar, Varsha
    Apte, Shaila
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2013, 16 (01) : 1 - 5
  • [36] A speaker adaptation technique for MRHSMM-based style control of synthetic speech
    Nose, Takashi
    Kato, Yoichi
    Kobayashi, Takao
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 833 - +
  • [37] Speaker Adaptation for Slovak Statistical Parametric Speech Synthesis Based on Hidden Markov Models
    Sulir, Martin
    Juhar, Jozef
    2015 25TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA (RADIOELEKTRONIKA), 2015, : 137 - 140
  • [38] Speech Emotion Recognition based on Gaussian Mixture Models and Deep Neural Networks
    Tashev, Ivan J.
    Wang, Zhong-Qiu
    Godin, Keith
    2017 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA), 2017,
  • [39] Recognition of Human Emotion from a Speech Signal Based on Plutchik's Model
    Kaminska, Dorota
    Pelikant, Adam
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2012, 58 (02) : 165 - 170
  • [40] CNN-Based Models for Emotion and Sentiment Analysis Using Speech Data
    Madan, Anjum
    Kumar, Devender
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (10)