Speaker normalisation for speech-based emotion detection

Cited by: 32

Authors:

Sethu, Vidhyasaharan [1, 2]

Ambikairajah, Eliathainby [1, 2]

Epps, Julien [1, 3]

Affiliations:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

[2] NICTA, Sydney, NSW, Australia

[3] UNSW Asia, Singapore 248922, Singapore

Source:

PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING | 2007

Keywords:

feature warping; cumulative distribution mapping; emotion detection; hidden Markov model;

DOI:

10.1109/ICDSP.2007.4288656

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification code:

081203 ; 0835 ;

Abstract:

The focus of this paper is on speech-based emotion detection utilising only acoustic data, i.e. without using any linguistic or semantic information. However, this approach in general suffers from the fact that acoustic data is speaker-dependent, which can result in inefficient estimation of the statistics modelled by classifiers such as hidden Markov models (HMMs) and Gaussian mixture models (GMMs). We propose the use of speaker-specific feature warping as a means of normalising acoustic features to overcome the problem of speaker dependency. In this paper we compare the performance of a system that uses feature warping to one that does not. The back-end employs an HMM-based classifier that captures the temporal variations of the feature vectors by modelling them as transitions between different states. Evaluations conducted on the LDC Emotional Prosody speech corpus reveal a relative increase in classification accuracy of up to 20%.
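The abstract names feature warping (cumulative distribution mapping) but this record gives no implementation details. As a rough illustration only, the sketch below assumes a rank-based mapping of each feature dimension onto a standard normal target, computed over all frames of one speaker; the function names `warp_feature` and `warp_speaker`, the mid-rank CDF estimate, and the choice of target distribution are assumptions made here, not details taken from the paper.

```python
from statistics import NormalDist


def warp_feature(values):
    """Warp one feature stream from a single speaker onto a standard
    normal distribution via its empirical CDF (rank-based mapping).
    Assumption: all of the speaker's frames are warped together."""
    n = len(values)
    if n == 0:
        return []
    # Frame indices sorted by feature value; ties keep original order.
    order = sorted(range(n), key=lambda i: values[i])
    target = NormalDist(mu=0.0, sigma=1.0)
    warped = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank estimate keeps the CDF strictly inside (0, 1),
        # where the inverse normal CDF is defined.
        cdf = (rank + 0.5) / n
        warped[idx] = target.inv_cdf(cdf)
    return warped


def warp_speaker(frames):
    """Warp every feature dimension independently.
    frames: list of equal-length feature vectors from one speaker."""
    if not frames:
        return []
    dims = list(zip(*frames))
    warped_dims = [warp_feature(list(d)) for d in dims]
    return [list(v) for v in zip(*warped_dims)]


# Two hypothetical speakers whose pitch values differ in offset and
# range end up with identical warped values, because only the rank
# order within each speaker matters.
speaker_a = warp_feature([100.0, 120.0, 110.0])
speaker_b = warp_feature([200.0, 260.0, 230.0])
```

Because the mapping depends only on rank order within a speaker, speaker-specific offsets and scale differences are removed before the features reach the classifier, which is the kind of normalisation the abstract describes.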


Pages: 611 / +

Page count: 2

50 items in total

[41] Automatic speech emotion detection using hybrid of gray wolf optimizer and naïve Bayes
S. Ramesh
S. Gomathi
S. Sasikala
T. R. Saravanan
International Journal of Speech Technology, 2023, 26 : 571 - 578
[42] Group Emotion Detection Based on Social Robot Perception
Quiroz, Marco
Patino, Raquel
Diaz-Amado, Jose
Cardinale, Yudith
SENSORS, 2022, 22 (10)
[43] Speaker Independent Sinhala Speech Recognition for Voice Dialling
Amarasingh, W. G. T. N.
Gamini, D. D. A.
INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER2012), 2012, : 3 - 6
[44] Speaker Independent Urdu Speech Recognition Using HMM
Ashraf, Javed
Iqbal, Naveed
Khattak, Naveed Sarfraz
Zaidi, Ather Mohsin
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, 2010, 6177 : 140 - 148
[45] ON THE USE OF SPECTRAL TRANSFORMATION FOR SPEAKER ADAPTATION IN HMM BASED ISOLATED-WORD SPEECH RECOGNITION
CHOI, HC
KING, RW
SPEECH COMMUNICATION, 1995, 17 (1-2) : 131 - 143
[46] Feature extraction model for speech emotion detection with prodigious precedence assortment model using fuzzy-based convolution neural networks
Deepika, Chandupatla
Kuchibhotla, Swarna
SOFT COMPUTING, 2023,
[47] EmoSRE: Emotion prediction based speech synthesis and refined speech recognition using large language model and prosody encoding
Akhouri, Shivam
Balasundaram, Ananthakrishnan
CURRENT PSYCHOLOGY, 2025, : 7250 - 7262
[48] A Survey on Emotion Detection: A lexicon based backtracking approach for detecting emotion from Bengali text
Rabey, Tapasy
Ferdous, Sanjida
Ali, Himel Suhita
Chakraborty, Narayan Ranjan
2017 20TH INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2017,
[49] Directed Acyclic Graphs for Content Based Sound, Musical Genre, and Speech Emotion Classification
Ntalampiras, Stavros
JOURNAL OF NEW MUSIC RESEARCH, 2014, 43 (02) : 173 - 182
[50] ArmanEmo: a Persian dataset for text-based emotion detection
Mirzaee, Hossein
Peymanfard, Javad
Moshtaghin, Hamid Habibzadeh
Zeinali, Hossein
LANGUAGE RESOURCES AND EVALUATION, 2025,
