DEEP MULTIMODAL LEARNING FOR EMOTION RECOGNITION IN SPOKEN LANGUAGE

Cited by: 0
Authors
Gu, Yue [1 ]
Chen, Shuhong [1 ]
Marsic, Ivan [1 ]
Affiliations
[1] Rutgers State Univ, Dept Elect & Comp Engn, Piscataway, NJ 08854 USA
Keywords
Emotion recognition; spoken language; deep multimodal learning; SENTIMENT ANALYSIS;
DOI
None
CLC Number
O42 [Acoustics];
Subject Classification Codes
070206 ; 082403 ;
Abstract
In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language. Our architecture has two distinctive characteristics. First, it extracts high-level features from both text and audio via a hybrid deep multimodal structure, which considers the spatial information from text, the temporal information from audio, and high-level associations from low-level handcrafted features. Second, we fuse all features using a three-layer deep neural network to learn the correlations across modalities, and we train the feature extraction and fusion modules together, allowing optimal global fine-tuning of the entire structure. We evaluated the proposed framework on the IEMOCAP dataset. Our results show promising performance, achieving a weighted accuracy of 60.4% for five emotion categories.
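The fusion stage the abstract describes, concatenating high-level text and audio features and passing them through a three-layer deep neural network that outputs five emotion classes, can be sketched as follows. This is a minimal NumPy forward-pass illustration; the feature dimensions, hidden sizes, and random initialization are assumptions for the sketch, not the paper's actual hyperparameters, and the real system trains this module jointly with the feature extractors.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class FusionMLP:
    """Three-layer DNN that fuses concatenated text and audio feature
    vectors into five emotion-class probabilities (dims are illustrative)."""

    def __init__(self, text_dim=128, audio_dim=128, hidden=64, n_classes=5):
        d = text_dim + audio_dim
        self.W1 = rng.standard_normal((d, hidden)) * 0.01
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal((hidden, hidden)) * 0.01
        self.b2 = np.zeros(hidden)
        self.W3 = rng.standard_normal((hidden, n_classes)) * 0.01
        self.b3 = np.zeros(n_classes)

    def forward(self, text_feat, audio_feat):
        # Feature-level fusion: concatenate the two modality embeddings
        x = np.concatenate([text_feat, audio_feat], axis=-1)
        h1 = relu(x @ self.W1 + self.b1)
        h2 = relu(h1 @ self.W2 + self.b2)
        return softmax(h2 @ self.W3 + self.b3)

model = FusionMLP()
text_feat = rng.standard_normal((4, 128))   # batch of 4 utterance embeddings
audio_feat = rng.standard_normal((4, 128))
probs = model.forward(text_feat, audio_feat)  # shape (4, 5), rows sum to 1
```

In a real implementation the same graph would be built in a framework with autodiff (e.g. PyTorch or TensorFlow) so the fusion layers and the upstream text/audio extractors can be fine-tuned end to end, as the abstract emphasizes.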
Pages: 5079-5083 (5 pages)
Related Papers (50 total)
  • [1] Spoken Emotion Recognition Using Deep Learning
    Albornoz, E. M.
    Sanchez-Gutierrez, M.
    Martinez-Licona, F.
    Rufiner, H. L.
    Goddard, J.
    PROGRESS IN PATTERN RECOGNITION IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2014, 2014, 8827 : 104 - 111
  • [2] Emotion Recognition Using Multimodal Deep Learning
    Liu, Wei
    Zheng, Wei-Long
    Lu, Bao-Liang
    NEURAL INFORMATION PROCESSING, ICONIP 2016, PT II, 2016, 9948 : 521 - 529
  • [3] Emotion Recognition on Multimodal with Deep Learning and Ensemble
    Dharma, David Adi
    Zahra, Amalia
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (12): 656 - 663
  • [5] Deep Imbalanced Learning for Multimodal Emotion Recognition in Conversations
    Meng, Tao
    Shou, Yuntao
    Ai, Wei
    Yin, Nan
    Li, Keqin
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024, 5 (12): 6472 - 6487
  • [6] Multimodal Arabic emotion recognition using deep learning
    Al Roken, Noora
    Barlas, Gerassimos
    SPEECH COMMUNICATION, 2023, 155
  • [7] Multimodal Emotion Recognition using Deep Learning Architectures
    Ranganathan, Hiranmayi
    Chakraborty, Shayok
    Panchanathan, Sethuraman
    2016 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2016), 2016,
  • [8] Annotation Efficiency in Multimodal Emotion Recognition with Deep Learning
    Zhu, Lili
    Spachos, Petros
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 560 - 565
  • [9] Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition
    Liu, Wei
    Qiu, Jie-Lin
    Zheng, Wei-Long
    Lu, Bao-Liang
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (02) : 715 - 729
  • [10] EmoNets: Multimodal deep learning approaches for emotion recognition in video
    Kahou, Samira Ebrahimi
    Bouthillier, Xavier
    Lamblin, Pascal
    Gulcehre, Caglar
    Michalski, Vincent
    Konda, Kishore
    Jean, Sebastien
    Froumenty, Pierre
    Dauphin, Yann
    Boulanger-Lewandowski, Nicolas
    Ferrari, Raul Chandias
    Mirza, Mehdi
    Warde-Farley, David
    Courville, Aaron
    Vincent, Pascal
    Memisevic, Roland
    Pal, Christopher
    Bengio, Yoshua
    JOURNAL ON MULTIMODAL USER INTERFACES, 2016, 10 (02) : 99 - 111