Speaker normalisation for speech-based emotion detection

Cited by: 32

Authors:

Sethu, Vidhyasaharan [1,2]

Ambikairajah, Eliathainby [1,2]

Epps, Julien [1,3]

Affiliations:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

[2] NICTA, Sydney, NSW, Australia

[3] UNSW Asia, Singapore 248922, Singapore

Source:

PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING | 2007

Keywords:

feature warping; cumulative distribution mapping; emotion detection; hidden Markov model

DOI:

10.1109/ICDSP.2007.4288656

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

The focus of this paper is on speech-based emotion detection utilising only acoustic data, i.e. without using any linguistic or semantic information. However, this approach in general suffers from the fact that acoustic data is speaker-dependent, and can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs). We propose the use of speaker-specific feature warping as a means of normalising acoustic features to overcome the problem of speaker dependency. In this paper we compare the performance of a system that uses feature warping to one that does not. The back-end employs an HMM-based classifier that captures the temporal variations of the feature vectors by modelling them as transitions between different states. Evaluations conducted on the LDC Emotional Prosody speech corpus reveal a relative increase in classification accuracy of up to 20%.
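
The abstract names speaker-specific feature warping (listed in the keywords as cumulative distribution mapping) but this record gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' method: it assumes each feature dimension is warped independently, that the empirical distribution is estimated over all of one speaker's frames, and that the target distribution is a standard normal. The function names `warp_feature` and `warp_by_speaker` are hypothetical.

```python
from statistics import NormalDist


def warp_feature(values):
    """Warp one feature stream so its empirical CDF matches a standard normal.

    Assumption: the target distribution is N(0, 1); the record does not say
    which target the paper uses.
    """
    n = len(values)
    target = NormalDist(0.0, 1.0)
    # Indices of the values in ascending order (ties keep their input order).
    order = sorted(range(n), key=lambda i: values[i])
    warped = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank estimate of the empirical CDF, strictly inside (0, 1).
        p = (rank + 0.5) / n
        warped[idx] = target.inv_cdf(p)
    return warped


def warp_by_speaker(frames, speaker_ids):
    """Apply warp_feature per speaker and per feature dimension.

    frames: list of equal-length feature vectors (lists of floats).
    speaker_ids: list of speaker labels, parallel to frames.
    """
    n_dims = len(frames[0])
    out = [[0.0] * n_dims for _ in frames]
    # Group frame indices by speaker so each speaker gets its own mapping.
    by_speaker = {}
    for i, spk in enumerate(speaker_ids):
        by_speaker.setdefault(spk, []).append(i)
    for idxs in by_speaker.values():
        for d in range(n_dims):
            column = warp_feature([frames[i][d] for i in idxs])
            for i, w in zip(idxs, column):
                out[i][d] = w
    return out


# Hypothetical one-dimensional feature (for example a pitch value) for two
# speakers whose raw ranges differ.
frames = [[100.0], [110.0], [120.0], [200.0], [220.0], [240.0]]
speaker_ids = ["a", "a", "a", "b", "b", "b"]
warped = warp_by_speaker(frames, speaker_ids)
```

Because the mapping depends only on the rank of each value within a speaker's own data, the two speakers in this toy input end up with identical warped values despite their different raw ranges, which is the sense in which warping reduces speaker dependency.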

Pages: 611 / +

Page count: 2
