MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

Times cited: 0
Authors
Zhu, Lei [1 ]
Ding, Yu [1 ]
Huang, Aiai [1 ]
Tan, Xufei [2 ]
Zhang, Jianhai [3 ,4 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310000, Peoples R China
[2] Hangzhou City Univ, Sch Med, Hangzhou 310015, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
[4] Hangzhou City Univ, Key Lab Brain Machine Collaborat Intelligence Zhej, Hangzhou 310015, Peoples R China
Keywords
Deep learning; Physiological signal; Multimodal fusion; Emotion recognition; EEG
DOI
10.1007/s11760-024-03632-0
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline codes
0808; 0809
Abstract
Research on emotion recognition has shown that multimodal data fusion improves the accuracy and robustness of human emotion recognition and outperforms single-modal methods. Despite the promising results of existing methods, significant challenges remain in effectively fusing data from multiple modalities. First, existing works tend to focus on generating a joint representation by fusing multimodal data, and few methods consider the specific characteristics of each modality. Second, most methods fail to fully capture the intricate correlations among modalities, often resorting to simplistic combinations of latent features. To address these challenges, we propose a novel fusion network for multimodal emotion recognition that enhances the efficacy of multimodal fusion while preserving the distinct characteristics of each modality. Specifically, a dual-stream multi-scale feature encoder (MFE) is designed to extract emotional information from temporal slices of both electroencephalogram (EEG) and peripheral physiological signals (PPS). Subsequently, a cross-modal global-local feature fusion module (CGFFM) integrates global and local information from the multimodal data and assigns a different importance to each modality, biasing the fused representation toward the more informative modalities. Meanwhile, a transformer module further learns modality-specific information. Moreover, we introduce an adaptive collaboration block (ACB) that optimally leverages both modality-specific and cross-modal relations for enhanced integration and feature representation. In extensive experiments on the DEAP and DREAMER multimodal datasets, our model achieves state-of-the-art performance.
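The abstract outlines a full pipeline: dual-stream multi-scale encoders (MFE) for EEG and PPS, a cross-modal global-local fusion module (CGFFM) that weights the modalities, transformers for modality-specific refinement, and an adaptive collaboration block (ACB). The PyTorch sketch below illustrates that flow only; the module names follow the abstract, but every layer size, the multi-scale kernel set, the softmax gating, and the learnable convex mix standing in for the ACB are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the pipeline described in the abstract.
# All architectural details here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MFE(nn.Module):
    """Multi-scale feature encoding: parallel 1-D convolutions over a
    temporal slice, concatenated and projected (assumed design)."""
    def __init__(self, in_ch, out_ch, scales=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in scales
        )
        self.proj = nn.Conv1d(out_ch * len(scales), out_ch, 1)

    def forward(self, x):            # x: (B, in_ch, T)
        feats = torch.cat([F.relu(b(x)) for b in self.branches], dim=1)
        return self.proj(feats)      # (B, out_ch, T)


class CGFFM(nn.Module):
    """Cross-modal global-local fusion: global pooling yields per-modality
    descriptors; a softmax gate weights each modality's local (temporal)
    features before they are summed (assumed mechanism)."""
    def __init__(self, ch, n_modalities=2):
        super().__init__()
        self.gate = nn.Linear(ch * n_modalities, n_modalities)

    def forward(self, feats):        # feats: list of (B, ch, T)
        globals_ = torch.cat([f.mean(dim=-1) for f in feats], dim=1)
        w = torch.softmax(self.gate(globals_), dim=1)        # (B, M)
        return sum(w[:, i, None, None] * f for i, f in enumerate(feats))


class MFNetSketch(nn.Module):
    def __init__(self, eeg_ch=32, pps_ch=8, dim=64, n_classes=2):
        super().__init__()
        self.eeg_enc = MFE(eeg_ch, dim)   # EEG stream
        self.pps_enc = MFE(pps_ch, dim)   # peripheral-signal stream
        # separate transformers keep modality-specific information
        self.spec_eeg = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                   batch_first=True)
        self.spec_pps = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                   batch_first=True)
        self.fuse = CGFFM(dim, n_modalities=2)
        # ACB stand-in: learnable mix of fused and modality-specific features
        self.alpha = nn.Parameter(torch.zeros(2))
        self.head = nn.Linear(dim, n_classes)

    def forward(self, eeg, pps):     # (B, eeg_ch, T), (B, pps_ch, T)
        fe, fp = self.eeg_enc(eeg), self.pps_enc(pps)
        fused = self.fuse([fe, fp]).mean(dim=-1)             # (B, dim)
        se = self.spec_eeg(fe.transpose(1, 2)).mean(dim=1)   # (B, dim)
        sp = self.spec_pps(fp.transpose(1, 2)).mean(dim=1)
        a = torch.softmax(self.alpha, dim=0)
        return self.head(fused + a[0] * se + a[1] * sp)


if __name__ == "__main__":
    model = MFNetSketch()
    eeg = torch.randn(4, 32, 128)    # DEAP-like: 32 EEG channels
    pps = torch.randn(4, 8, 128)     # 8 peripheral channels
    print(model(eeg, pps).shape)     # torch.Size([4, 2])
```

The gating in CGFFM is what realizes "assigning different importance to each modality": the softmax weights let the fused representation lean toward whichever modality's global descriptor is more informative for the current input.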
Pages: 12