Emotion Recognition from Human Speech Using Temporal Information and Deep Learning

Cited by: 31
Authors
Kim, John W. [1]
Saurous, Rif A. [2]
Affiliations
[1] Menlo Sch, Atherton, CA USA
[2] Google Inc, Mountain View, CA USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
emotion recognition; temporal information; deep learning; CNN; LSTM;
DOI
10.21437/Interspeech.2018-1132
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition by machine is a challenging task, but it has great potential to make empathic human-machine communications possible. In conventional approaches that consist of feature extraction and classifier stages, extensive studies have devoted their effort to developing good feature representations, but relatively little effort was made to make proper use of the important temporal information in these features. In this paper, we propose a model combining features known to be useful for emotion recognition and deep neural networks to exploit temporal information when recognizing emotion status. A benchmark evaluation on EMO-DB demonstrates that the proposed model achieves a state-of-the-art performance of 88.9% recognition rate.
Pages: 937-940
Page count: 4