Speaker normalisation for speech-based emotion detection

Cited by: 32
Authors
Sethu, Vidhyasaharan [1 ,2 ]
Ambikairajah, Eliathainby [1 ,2 ]
Epps, Julien [1 ,3 ]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
[2] NICTA, Sydney, NSW, Australia
[3] UNSW Asia, Singapore 248922, Singapore
Source
PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING | 2007
Keywords
feature warping; cumulative distribution mapping; emotion detection; hidden Markov model;
DOI
10.1109/ICDSP.2007.4288656
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The focus of this paper is on speech-based emotion detection utilising only acoustic data, i.e. without using any linguistic or semantic information. However, this approach in general suffers from the fact that acoustic data is speaker-dependent, which can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs). We propose the use of speaker-specific feature warping as a means of normalising acoustic features to overcome the problem of speaker dependency. In this paper we compare the performance of a system that uses feature warping to one that does not. The back-end employs an HMM-based classifier that captures the temporal variations of the feature vectors by modelling them as transitions between different states. Evaluations conducted on the LDC Emotional Prosody speech corpus reveal a relative increase in classification accuracy of up to 20%.
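The abstract does not spell out how the warping is computed, so the following is only a minimal sketch of the general idea behind feature warping (cumulative distribution mapping), not the authors' implementation. It assumes that each feature dimension is warped independently, that the target distribution is a standard normal, and that the empirical distribution is estimated over all of a speaker's frames rather than over a sliding window. The names `warp_feature` and `warp_by_speaker` are hypothetical.

```python
from statistics import NormalDist


def warp_feature(values):
    """Map one speaker's values of a single feature onto a standard normal.

    Assumption: the whole list is used to estimate the empirical CDF.
    """
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Indices sorted by value; ties keep their original order.
    order = sorted(range(n), key=lambda i: values[i])
    warped = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank empirical CDF value, always strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        # Replace the value with the normal quantile at the same CDF level.
        warped[idx] = std_normal.inv_cdf(cdf)
    return warped


def warp_by_speaker(features_by_speaker):
    """Warp each speaker's feature values separately (speaker-specific)."""
    return {spk: warp_feature(vals) for spk, vals in features_by_speaker.items()}
```

Because the mapping depends only on the rank of each value within a speaker's own data, two speakers whose features differ by an offset or scale end up with the same warped values, which is the sense in which such a mapping reduces speaker dependency.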
Pages: 611 / +
Page count: 2