Multi-modal Emotion Recognition with Temporal-Band Attention Based on LSTM-RNN

Cited by: 18
Authors
Liu, Jiamin [1 ]
Su, Yuanqi [2 ]
Liu, Yuehu [1 ,3 ]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[3] Shaanxi Key Lab Digital Technol & Intelligent Sys, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; LSTM-RNN; Temporal attention; Band attention; Multi-modal fusion; FUSION; EEG;
DOI
10.1007/978-3-319-77380-3_19
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition is a key problem in Human-Computer Interaction (HCI). This paper addresses multi-modal emotion recognition based on untrimmed visual signals and EEG signals. We propose a model for emotion recognition with two attention mechanisms, temporal attention and band attention, built on a multi-layer long short-term memory recurrent neural network (LSTM-RNN). At each time step, the LSTM-RNN takes a video slice and an EEG slice as inputs and generates representations of the two signals, which are fed into a multi-modal fusion unit. Based on the fused representation, the network predicts the emotion label and the next time slice to analyze. Within this process, the band attention applies different levels of attention to different frequency bands of the EEG signal, while the temporal attention determines where to analyze the signal next, suppressing information that is redundant for recognition. Experiments on the Mahnob-HCI database demonstrate encouraging results: the proposed method achieves higher accuracy and improves computational efficiency.
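The per-step pipeline described in the abstract (band attention over EEG frequency bands, fusion with the video representation, and a temporal-attention choice of the next slice) can be sketched roughly as follows. All names, dimensions, and the fixed toy attention scores are illustrative assumptions; the learned LSTM and attention weights of the authors' model are not reproduced here.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax for attention weights
    e = np.exp(x - x.max())
    return e / e.sum()

def band_attention(eeg_bands, scores):
    # eeg_bands: (n_bands, feat_dim) features, one row per EEG frequency band
    # scores: (n_bands,) unnormalized attention logits (learned in the paper)
    w = softmax(scores)          # attention distribution over frequency bands
    return w @ eeg_bands         # attention-weighted EEG representation

def fuse(video_feat, eeg_feat):
    # minimal multi-modal fusion: concatenate the two modality representations
    return np.concatenate([video_feat, eeg_feat])

def next_slice(candidate_scores):
    # temporal attention reduced to a hard choice of the next slice to analyze
    return int(np.argmax(candidate_scores))

rng = np.random.default_rng(0)
eeg_bands = rng.standard_normal((5, 8))        # e.g. delta..gamma band features
scores = np.array([0.1, 2.0, 0.3, -1.0, 0.0])  # toy band-attention logits
eeg_feat = band_attention(eeg_bands, scores)
video_feat = rng.standard_normal(16)           # toy video representation
fused = fuse(video_feat, eeg_feat)
print(fused.shape)                             # → (24,)
print(next_slice(np.array([0.1, 0.9, 0.2])))   # → 1
```

In the actual model the fused vector would feed both the emotion classifier and the recurrent state; this sketch only shows the data flow of one step.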
Pages: 194-204 (11 pages)
Related Papers (50 records)
  • [1] EMOTION RECOGNITION IN PUBLIC SPEAKING SCENARIOS UTILISING AN LSTM-RNN APPROACH WITH ATTENTION
    Baird, Alice
    Amiriparian, Shahin
    Milling, Manuel
    Schuller, Bjoern W.
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 397 - 402
  • [2] Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition
    Li, Chao
    Bao, Zhongtian
    Li, Linhao
    Zhao, Ziping
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (03)
  • [3] Multi-modal Attention for Speech Emotion Recognition
    Pan, Zexu
    Luo, Zhaojie
    Yang, Jichen
    Li, Haizhou
    INTERSPEECH 2020, 2020, : 364 - 368
  • [4] ATTENTION DRIVEN FUSION FOR MULTI-MODAL EMOTION RECOGNITION
    Priyasad, Darshana
    Fernando, Tharindu
    Denman, Simon
    Sridharan, Sridha
    Fookes, Clinton
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3227 - 3231
  • [5] EEG-based emotion recognition using LSTM-RNN machine learning algorithm
    Koya, Jeevan Reddy
    Rao, Venu Madhava S. P.
    Pothunoori, Shiva Kumar
    Malyala, Srivikas
    PROCEEDINGS OF 2019 1ST INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION AND COMMUNICATION TECHNOLOGY (ICIICT 2019), 2019,
  • [6] Attention-based Multi-modal Sentiment Analysis and Emotion Detection in Conversation using RNN
    Huddar, Mahesh G.
    Sannakki, Sanjeev S.
    Rajpurohit, Vijay S.
    INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2021, 6 (06): : 112 - 121
  • [7] Multi-modal feature fusion based on multi-layers LSTM for video emotion recognition
    Nie, Weizhi
    Yan, Yan
    Song, Dan
    Wang, Kun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16205 - 16214
  • [8] Multi-modal Emotion Recognition Based on Hypergraph
    Zong L.-L.
    Zhou J.-H.
    Xie Q.-J.
    Zhang X.-C.
    Xu B.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (12): 2520 - 2534
  • [9] Dense Attention Memory Network for Multi-modal emotion recognition
    Ma, Gailing
    Guo, Xiao
    2022 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING, MLNLP 2022, 2022, : 48 - 53