Multi-modal Emotion Recognition with Temporal-Band Attention Based on LSTM-RNN

Cited by: 18
Authors
Liu, Jiamin [1]
Su, Yuanqi [2]
Liu, Yuehu [1,3]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[3] Shaanxi Key Lab Digital Technol & Intelligent Sys, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Emotion recognition; LSTM-RNN; Temporal attention; Band attention; Multi-modal fusion; FUSION; EEG;
DOI
10.1007/978-3-319-77380-3_19
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition is a key problem in Human-Computer Interaction (HCI). This paper addresses multi-modal emotion recognition from untrimmed visual signals and EEG signals. We propose a model based on a multi-layer long short-term memory recurrent neural network (LSTM-RNN) that combines two attention mechanisms, temporal attention and band attention. At each time step, the LSTM-RNN takes a video slice and an EEG slice as inputs and generates representations of the two signals, which are fed into a multi-modal fusion unit. Based on the fused representation, the network predicts the emotion label and the next time slice to analyze. Through band attention, the model applies different levels of attention to different frequency bands of the EEG signals. Through temporal attention, it determines which part of the signal to analyze next, suppressing information that is redundant for recognition. Experiments on the Mahnob-HCI database show encouraging results: the proposed method achieves higher accuracy and improves computational efficiency.
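The abstract does not give the formulation of the band attention, so the following is only a minimal sketch of generic soft attention over EEG frequency bands, not the authors' method. The function names, the band names, the feature vectors, and the scores are all hypothetical; in a trained model the scores would presumably come from the LSTM-RNN state.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def band_attention(band_features, band_scores):
    """Weight each band's feature vector by its attention weight and sum.

    band_features: one feature vector per frequency band (equal lengths).
    band_scores:   one raw relevance score per band (assumed given here).
    """
    weights = softmax(band_scores)
    dim = len(band_features[0])
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, band_features))
        for i in range(dim)
    ]
    return weights, fused


# Hypothetical toy input: four bands with two-dimensional features.
bands = ["theta", "alpha", "beta", "gamma"]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
scores = [0.0, 0.0, 0.0, 0.0]  # equal scores give uniform attention

weights, fused = band_attention(features, scores)
```

With equal scores every band receives the same weight, so the fused vector is the plain average of the band features; raising one band's score shifts the fused representation toward that band.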
Pages: 194-204
Number of pages: 11
Related papers
50 in total
  • [11] A novel signal channel attention network for multi-modal emotion recognition
    Du, Ziang
    Ye, Xia
    Zhao, Pujie
    FRONTIERS IN NEUROROBOTICS, 2024, 18
  • [12] Multi-modal Emotion Recognition Based on Speech and Image
    Li, Yongqiang
    He, Qi
    Zhao, Yongping
    Yao, Hongxun
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I, 2018, 10735 : 844 - 853
  • [13] IS CROSS-ATTENTION PREFERABLE TO SELF-ATTENTION FOR MULTI-MODAL EMOTION RECOGNITION?
    Rajan, Vandana
    Brutti, Alessio
    Cavallaro, Andrea
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4693 - 4697
  • [14] Lightweight multi-modal emotion recognition model based on modal generation
    Liu, Peisong
    Che, Manqiang
    Luo, Jiangchuan
    2022 9TH INTERNATIONAL FORUM ON ELECTRICAL ENGINEERING AND AUTOMATION, IFEEA, 2022, : 430 - 435
  • [15] Multi-head attention fusion networks for multi-modal speech emotion recognition
    Zhang, Junfeng
    Xing, Lining
    Tan, Zhen
    Wang, Hongsen
    Wang, Kesheng
    COMPUTERS & INDUSTRIAL ENGINEERING, 2022, 168
  • [16] Multi-Modal Fusion Emotion Recognition Based on HMM and ANN
    Xu, Chao
    Cao, Tianyi
    Feng, Zhiyong
    Dong, Caichao
    CONTEMPORARY RESEARCH ON E-BUSINESS TECHNOLOGY AND STRATEGY, 2012, 332 : 541 - 550
  • [17] Towards Efficient Multi-Modal Emotion Recognition
    Dobrisek, Simon
    Gajsek, Rok
    Mihelic, France
    Pavesic, Nikola
    Struc, Vitomir
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2013, 10
  • [18] Evaluation and Discussion of Multi-modal Emotion Recognition
    Rabie, Ahmad
    Wrede, Britta
    Vogt, Thurid
    Hanheide, Marc
    SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 598 - +
  • [19] Emotion Recognition from Multi-Modal Information
    Wu, Chung-Hsien
    Lin, Jen-Chun
    Wei, Wen-Li
    Cheng, Kuan-Chun
    2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [20] AFLEMP: Attention-based Federated Learning for Emotion recognition using Multi-modal Physiological data
    Gahlan, Neha
    Sethia, Divyashikha
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 94