MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

Cited by: 0
Authors
Zhu, Lei [1 ]
Ding, Yu [1 ]
Huang, Aiai [1 ]
Tan, Xufei [2 ]
Zhang, Jianhai [3 ,4 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310000, Peoples R China
[2] Hangzhou City Univ, Sch Med, Hangzhou 310015, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
[4] Hangzhou City Univ, Key Lab Brain Machine Collaborat Intelligence Zhej, Hangzhou 310015, Peoples R China
Keywords
Deep learning; Physiological signal; Multimodal fusion; Emotion recognition; EEG
DOI
10.1007/s11760-024-03632-0
CLC Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
Research on emotion recognition has shown that multi-modal data fusion improves the accuracy and robustness of human emotion recognition, outperforming single-modal methods. Despite the promising results of existing methods, significant challenges remain in effectively fusing data from multiple modalities. First, existing works tend to focus on generating a joint representation by fusing multi-modal data, and few methods consider the specific characteristics of each modality. Second, most methods fail to fully capture the intricate correlations among modalities, often resorting to simplistic combinations of latent features. To address these challenges, we propose a novel fusion network for multi-modal emotion recognition that enhances the efficacy of multi-modal fusion while preserving the distinct characteristics of each modality. Specifically, a dual-stream multi-scale feature encoder (MFE) is designed to extract emotional information from temporal slices of both electroencephalogram (EEG) and peripheral physiological signals (PPS). A cross-modal global-local feature fusion module (CGFFM) then integrates global and local information from the multi-modal data and assigns a different importance to each modality, biasing the fused representation toward the more informative modalities. Meanwhile, a transformer module is employed to further learn modality-specific information. Moreover, we introduce an adaptive collaboration block (ACB) that optimally leverages both modality-specific and cross-modal relations for enhanced integration and feature representation. In extensive experiments on the DEAP and DREAMER multimodal datasets, our model achieves state-of-the-art performance.
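The abstract describes the architecture only at a high level, so the following PyTorch sketch is an illustration rather than the authors' implementation: the internal structure of the MFE and CGFFM modules, the multi-scale kernel sizes, the gating-based modality weighting, the channel counts (32 EEG / 8 peripheral, matching DEAP), and the additive stand-in for the ACB are all assumptions; the paper itself (DOI 10.1007/s11760-024-03632-0) defines the actual layers.

```python
# Hypothetical sketch of the MF-Net pipeline described in the abstract.
# All layer sizes and fusion details are assumptions, not the published design.
import torch
import torch.nn as nn

class MFE(nn.Module):
    """Multi-scale feature encoder (assumed form): parallel 1-D convolutions
    with different kernel sizes over temporal slices, then a 1x1 projection."""
    def __init__(self, in_ch, out_ch, scales=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in scales
        )
        self.proj = nn.Conv1d(out_ch * len(scales), out_ch, 1)

    def forward(self, x):                        # x: (batch, channels, time)
        multi = torch.cat([b(x) for b in self.branches], dim=1)
        return self.proj(multi)                  # (batch, out_ch, time)

class CGFFM(nn.Module):
    """Cross-modal global-local fusion (assumed form): mean pooling as a
    global descriptor, max pooling as a crude local-salience descriptor,
    and a learned softmax gate that weights the two modalities."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(4 * dim, 2), nn.Softmax(dim=-1))
        self.fuse = nn.Linear(4 * dim, dim)

    def forward(self, eeg, pps):                 # each: (batch, dim, time)
        d_eeg = torch.cat([eeg.mean(-1), eeg.amax(-1)], dim=-1)  # global + local
        d_pps = torch.cat([pps.mean(-1), pps.amax(-1)], dim=-1)
        w = self.gate(torch.cat([d_eeg, d_pps], dim=-1))   # modality importance
        weighted = torch.cat([w[:, :1] * d_eeg, w[:, 1:] * d_pps], dim=-1)
        return self.fuse(weighted)               # (batch, dim)

class MFNet(nn.Module):
    """End-to-end sketch: dual-stream encoding, cross-modal fusion, and a
    modality-specific transformer path. The ACB is reduced here to a simple
    additive merge of the two paths."""
    def __init__(self, eeg_ch=32, pps_ch=8, dim=64, n_classes=2):
        super().__init__()
        self.eeg_enc, self.pps_enc = MFE(eeg_ch, dim), MFE(pps_ch, dim)
        self.cgffm = CGFFM(dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, eeg, pps):                 # (batch, ch, time) each
        f_eeg, f_pps = self.eeg_enc(eeg), self.pps_enc(pps)
        fused = self.cgffm(f_eeg, f_pps)         # cross-modal path
        # Modality-specific path: transformer over EEG temporal tokens.
        spec = self.transformer(f_eeg.transpose(1, 2)).mean(1)
        return self.head(fused + spec)           # ACB stand-in: additive merge

model = MFNet()
logits = model(torch.randn(4, 32, 128), torch.randn(4, 8, 128))
print(logits.shape)                              # torch.Size([4, 2])
```

The point of the sketch is the two parallel paths the abstract names: a cross-modal fusion path (CGFFM) and a modality-specific transformer path, combined at the end where the paper's ACB would perform an adaptive, learned merge.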
Pages: 12