Speech Emotion Recognition Using Speech Feature and Word Embedding

被引:0
|
作者
Atmaja, Bagus Tris [1 ,2 ]
Shirai, Kiyoaki [2 ]
Akagi, Masato [2 ]
机构
[1] Inst Teknol Sepuluh Nopember, Surabaya, Indonesia
[2] Japan Adv Inst Sci & Technol, Nomi, Japan
关键词
D O I
暂无
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Emotion recognition can be performed automatically from many modalities. This paper presents a categorical speech emotion recognition using speech feature and word embedding. Text features can be combined with speech features to improve emotion recognition accuracy, and both features can be obtained from speech. Here, we use speech segments, by removing silences in an utterance, where the acoustic feature is extracted for speech-based emotion recognition. Word embedding is used as an input feature for text emotion recognition and a combination of both features is proposed for performance improvement purpose. Two unidirectional LSTM layers are used for text and fully connected layers are applied for acoustic emotion recognition. Both networks then are merged by fully connected networks in early fusion way to produce one of four predicted emotion categories. The result shows the combination of speech and text achieve higher accuracy i.e. 75.49% compared to speech only with 58.29% or text only emotion recognition with 68.01%. This result also outperforms the previously proposed methods by others using the same dataset on the same modalities.
引用
收藏
页码:519 / 523
页数:5
相关论文
共 50 条
  • [1] Autoencoder With Emotion Embedding for Speech Emotion Recognition
    Zhang, Chenghao
    Xue, Lei
    IEEE ACCESS, 2021, 9 : 51231 - 51241
  • [2] Autoencoder with emotion embedding for speech emotion recognition
    Zhang, Chenghao
    Xue, Lei
    IEEE Access, 2021, 9 : 51231 - 51241
  • [3] Speech Emotion Recognition Using Spectrogram & Phoneme Embedding
    Yenigalla, Promod
    Kumar, Abhay
    Tripathi, Suraj
    Singh, Chirag
    Kar, Sibsambhu
    Vepa, Jithendra
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3688 - 3692
  • [4] Speech emotion recognition using emotion perception spectral feature
    Jiang, Lin
    Tan, Ping
    Yang, Junfeng
    Liu, Xingbao
    Wang, Chao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (11):
  • [5] Speech Emotion Recognition using Convolutional Neural Network with Audio Word-based Embedding
    Huang, Kun-Yi
    Wu, Chung-Hsien
    Hong, Qian-Bei
    Su, Ming-Hsiang
    Zeng, Yuan-Rong
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 265 - 269
  • [6] Speech emotion recognition using a novel feature set
    Yang, J. (jsjyj0801@163.com), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [7] Emotion Label Encoding using Word Embeddings for Speech Emotion Recognition
    Stanley, Eimear
    DeMattos, Eric
    Klementiev, Anita
    Ozimek, Piotr
    Clarke, Georgia
    Berger, Michael
    Palaz, Dimitri
    INTERSPEECH 2023, 2023, : 2418 - 2422
  • [8] Feature representation for speech emotion Recognition
    Abdollahpour, Mehdi
    Zamani, Lafar
    Rad, Hamidreza Saligheh
    2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 1465 - 1468
  • [9] On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition
    Kacur, Juraj
    Puterka, Boris
    Pavlovicova, Jarmila
    Oravec, Milos
    SENSORS, 2021, 21 (05) : 1 - 27
  • [10] Prosodic feature normalization for emotion recognition by using synthesized speech
    Suzuki, Motoyuki
    Nakagawa, Shohei
    Kita, Kenji
    ADVANCES IN KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, 2012, 243 : 306 - 313