Multi-modal Emotion Recognition with Temporal-Band Attention Based on LSTM-RNN

Cited by: 18

Authors:

Liu, Jiamin [1]

Su, Yuanqi [2]

Liu, Yuehu [1,3]

Affiliations:

[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Shaanxi, Peoples R China

[2] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Shaanxi, Peoples R China

[3] Shaanxi Key Lab Digital Technol & Intelligent Sys, Xian, Shaanxi, Peoples R China

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I | 2018 / Vol. 10735

Funding:

National Natural Science Foundation of China;

Keywords:

Emotion recognition; LSTM-RNN; Temporal attention; Band attention; Multi-modal fusion; FUSION; EEG;

DOI:

10.1007/978-3-319-77380-3_19

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Emotion recognition is a key problem in Human-Computer Interaction (HCI). This paper addresses multi-modal emotion recognition from untrimmed visual signals and EEG signals. We propose a model based on a multi-layer long short-term memory recurrent neural network (LSTM-RNN) that combines two attention mechanisms, temporal attention and band attention. At each time step, the LSTM-RNN takes a video slice and an EEG slice as inputs and generates representations of the two signals, which are fed into a multi-modal fusion unit. Based on the fused representation, the network predicts the emotion label and the next time slice to analyze. Through band attention, the model applies different levels of attention to different frequency bands of the EEG signals. Through temporal attention, it determines which part of the signal to analyze next, suppressing information that is redundant for recognition. Experiments on the Mahnob-HCI database show encouraging results: the proposed method achieves higher accuracy and improves computational efficiency.
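The band attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not give the scoring function, so dot-product scoring against a query vector (standing in for an LSTM hidden state) is an assumption, and the names `band_attention`, `band_features`, and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def band_attention(band_features, query):
    """Weight per-band EEG feature vectors by their relevance to `query`.

    band_features: one feature vector per frequency band (hypothetical layout).
    query: a vector of the same length, standing in for the LSTM hidden state.
    Returns (weights, attended), where attended is the weighted sum of the bands.
    """
    # Assumption: dot-product scoring; the abstract does not specify this step.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in band_features]
    weights = softmax(scores)
    dim = len(query)
    attended = [
        sum(w * feat[i] for w, feat in zip(weights, band_features))
        for i in range(dim)
    ]
    return weights, attended


# Toy input: three bands with two-dimensional features (made-up values).
bands = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [1.0, 1.0]
weights, attended = band_attention(bands, hidden)
```

In this toy input the third band aligns best with the query, so it receives the largest weight, while the first two bands score equally and share the remainder.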

Pages: 194-204

Number of pages: 11
