Multi-modal Emotion Recognition with Temporal-Band Attention Based on LSTM-RNN

Cited by: 18
Authors
Liu, Jiamin [1 ]
Su, Yuanqi [2 ]
Liu, Yuehu [1 ,3 ]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[3] Shaanxi Key Lab Digital Technol & Intelligent Sys, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; LSTM-RNN; Temporal attention; Band attention; Multi-modal fusion; FUSION; EEG;
DOI
10.1007/978-3-319-77380-3_19
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition is a key problem in Human-Computer Interaction (HCI). This paper addresses multi-modal emotion recognition based on untrimmed visual signals and EEG signals. We propose a model for emotion recognition with two attention mechanisms, temporal attention and band attention, built on a multi-layer long short-term memory recurrent neural network (LSTM-RNN). At each time step, the LSTM-RNN takes a video slice and an EEG slice as inputs and generates representations of the two signals, which are fed into a multi-modal fusion unit. Based on the fused representation, the network predicts the emotion label and the next time slice to analyze. Within this process, the band attention applies different levels of attention to different frequency bands of the EEG signal, while the temporal attention determines where to analyze the signal next, suppressing information that is redundant for recognition. Experiments on the Mahnob-HCI database demonstrate encouraging results: the proposed method achieves higher accuracy and improves computational efficiency.
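The per-step pipeline described in the abstract (band attention over EEG frequency bands, fusion with the video representation, and a temporal-attention choice of the next slice) can be sketched roughly as follows. All names, dimensions, and the fixed toy attention scores are illustrative assumptions; the learned LSTM and attention weights of the authors' model are not reproduced here.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax for attention weights
    e = np.exp(x - x.max())
    return e / e.sum()

def band_attention(eeg_bands, scores):
    # eeg_bands: (n_bands, feat_dim) features, one row per EEG frequency band
    # scores: (n_bands,) unnormalized attention logits (learned in the paper)
    w = softmax(scores)          # attention distribution over frequency bands
    return w @ eeg_bands         # attention-weighted EEG representation

def fuse(video_feat, eeg_feat):
    # minimal multi-modal fusion: concatenate the two modality representations
    return np.concatenate([video_feat, eeg_feat])

def next_slice(candidate_scores):
    # temporal attention reduced to a hard choice of the next slice to analyze
    return int(np.argmax(candidate_scores))

rng = np.random.default_rng(0)
eeg_bands = rng.standard_normal((5, 8))        # e.g. delta..gamma band features
scores = np.array([0.1, 2.0, 0.3, -1.0, 0.0])  # toy band-attention logits
eeg_feat = band_attention(eeg_bands, scores)
video_feat = rng.standard_normal(16)           # toy video representation
fused = fuse(video_feat, eeg_feat)
print(fused.shape)                             # → (24,)
print(next_slice(np.array([0.1, 0.9, 0.2])))   # → 1
```

In the actual model the fused vector would feed both the emotion classifier and the recurrent state; this sketch only shows the data flow of one step.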
Pages: 194-204 (11 pages)
Related Papers (50 records)
  • [1] EMOTION RECOGNITION IN PUBLIC SPEAKING SCENARIOS UTILISING AN LSTM-RNN APPROACH WITH ATTENTION
    Baird, Alice
    Amiriparian, Shahin
    Milling, Manuel
    Schuller, Bjoern W.
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 397 - 402
  • [2] Exploring temporal representations by leveraging attention-based bidirectional LSTM-RNNs for multi-modal emotion recognition
    Li, Chao
    Bao, Zhongtian
    Li, Linhao
    Zhao, Ziping
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (03)
  • [3] Multi-modal Attention for Speech Emotion Recognition
    Pan, Zexu
    Luo, Zhaojie
    Yang, Jichen
    Li, Haizhou
    INTERSPEECH 2020, 2020, : 364 - 368
  • [4] ATTENTION DRIVEN FUSION FOR MULTI-MODAL EMOTION RECOGNITION
    Priyasad, Darshana
    Fernando, Tharindu
    Denman, Simon
    Sridharan, Sridha
    Fookes, Clinton
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3227 - 3231
  • [5] EEG-based emotion recognition using LSTM-RNN machine learning algorithm
    Koya, Jeevan Reddy
    Rao, Venu Madhava S. P.
    Pothunoori, Shiva Kumar
    Malyala, Srivikas
    PROCEEDINGS OF 2019 1ST INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION AND COMMUNICATION TECHNOLOGY (ICIICT 2019), 2019,
  • [6] Attention-based Multi-modal Sentiment Analysis and Emotion Detection in Conversation using RNN
    Huddar, Mahesh G.
    Sannakki, Sanjeev S.
    Rajpurohit, Vijay S.
    INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2021, 6 (06): : 112 - 121
  • [7] Multi-modal feature fusion based on multi-layers LSTM for video emotion recognition
    Nie, Weizhi
    Yan, Yan
    Song, Dan
    Wang, Kun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16205 - 16214
  • [8] Multi-modal Emotion Recognition Based on Hypergraph
    Zong L.-L.
    Zhou J.-H.
    Xie Q.-J.
    Zhang X.-C.
    Xu B.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (12): 2520 - 2534
  • [9] Dense Attention Memory Network for Multi-modal emotion recognition
    Ma, Gailing
    Guo, Xiao
    2022 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND NATURAL LANGUAGE PROCESSING, MLNLP 2022, 2022, : 48 - 53