MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

Cited by: 0
Authors
Zhu, Lei [1]
Ding, Yu [1]
Huang, Aiai [1]
Tan, Xufei [2]
Zhang, Jianhai [3,4]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310000, Peoples R China
[2] Hangzhou City Univ, Sch Med, Hangzhou 310015, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310000, Peoples R China
[4] Hangzhou City Univ, Key Lab Brain Machine Collaborat Intelligence Zhej, Hangzhou 310015, Peoples R China
Keywords
Deep learning; Physiological signal; Multimodal fusion; Emotion recognition; EEG
DOI
10.1007/s11760-024-03632-0
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Research on emotion recognition has shown that multi-modal data fusion improves the accuracy and robustness of human emotion recognition, outperforming single-modal methods. Despite the promising results of existing methods, significant challenges remain in effectively fusing data from multiple modalities to achieve superior performance. First, existing works tend to focus on generating a joint representation from the fused multi-modal data, and few methods consider the specific characteristics of each modality. Second, most methods fail to fully capture the intricate correlations among multiple modalities, often resorting to simplistic combinations of latent features. To address these challenges, we propose a novel fusion network for multi-modal emotion recognition. This network enhances the efficacy of multi-modal fusion while preserving the distinct characteristics of each modality. Specifically, a dual-stream multi-scale feature encoding (MFE) module is designed to extract emotional information from temporal slices of both electroencephalogram (EEG) and peripheral physiological signals (PPS). Subsequently, a cross-modal global-local feature fusion module (CGFFM) is proposed to integrate global and local information from the multi-modal data and then assign a different importance to each modality, biasing the fused representation toward the more informative modalities. Meanwhile, a transformer module is employed to further learn modality-specific information. Moreover, we introduce the adaptive collaboration block (ACB), which leverages both modality-specific and cross-modal relations for enhanced integration and feature representation. In extensive experiments on the DEAP and DREAMER multimodal datasets, our model achieves state-of-the-art performance.
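The abstract describes the architecture only at a high level. Below is a minimal PyTorch-style sketch of the data flow as one might read it: two parallel multi-scale encoders (one per modality), a gating step that weights each modality before fusion, and a transformer layer that refines the fused sequence. Every module name, layer size, and wiring choice here (MultiScaleEncoder, GatedFusion, MFNetSketch, the pooled-gate design) is an assumption inferred from the abstract alone, not the authors' actual MFE, CGFFM, or ACB implementation.

```python
# Hypothetical sketch of an MF-Net-like forward pass; inferred from the
# abstract only, NOT the authors' implementation.
import torch
import torch.nn as nn

class MultiScaleEncoder(nn.Module):
    """Stand-in for the MFE: parallel 1-D convolutions with different
    kernel sizes over a (batch, channels, time) slice."""
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 7, 15)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes]
        )
        self.proj = nn.Conv1d(out_ch * len(kernel_sizes), out_ch, 1)

    def forward(self, x):                       # x: (B, C, T)
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return self.proj(feats)                 # (B, out_ch, T)

class GatedFusion(nn.Module):
    """Stand-in for the CGFFM's modality weighting: a learned softmax gate
    over the two modalities, computed from globally pooled features."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))

    def forward(self, f_eeg, f_pps):            # each (B, D, T)
        g_eeg, g_pps = f_eeg.mean(-1), f_pps.mean(-1)      # global pooling
        w = self.gate(torch.cat([g_eeg, g_pps], dim=-1))   # (B, 2)
        # Weight each modality's features, then sum into one sequence.
        return w[:, :1, None] * f_eeg + w[:, 1:, None] * f_pps

class MFNetSketch(nn.Module):
    def __init__(self, eeg_ch=32, pps_ch=8, dim=64, n_classes=2):
        super().__init__()
        self.enc_eeg = MultiScaleEncoder(eeg_ch, dim)
        self.enc_pps = MultiScaleEncoder(pps_ch, dim)
        self.fusion = GatedFusion(dim)
        # One transformer layer refining the fused sequence; the paper's
        # modality-specific transformer and ACB are collapsed into this step.
        self.refine = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                 batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, eeg, pps):                # each (B, C, T)
        fused = self.fusion(self.enc_eeg(eeg), self.enc_pps(pps))
        z = self.refine(fused.transpose(1, 2))  # (B, T, D)
        return self.head(z.mean(dim=1))         # (B, n_classes)

# Example shapes: 32 EEG channels, 8 peripheral channels, 128-sample slices.
model = MFNetSketch()
logits = model(torch.randn(4, 32, 128), torch.randn(4, 8, 128))
print(logits.shape)                             # torch.Size([4, 2])
```

The softmax gate is one simple way to "assign different importance to each modality" as the abstract puts it; the paper's CGFFM additionally mixes global and local features across modalities, which this sketch does not attempt to reproduce.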
Pages: 12
Related Papers
50 items in total
  • [41] Liu, Ningjie; Fang, Yuchun; Li, Ling; Hou, Limin; Yang, Fenglei; Guo, Yike. Multiple Feature Fusion for Automatic Emotion Recognition Using EEG Signals. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018: 896-900.
  • [42] Wang, Shuai; Qu, Jingzi; Zhang, Yong; Zhang, Yidie. Multimodal Emotion Recognition From EEG Signals and Facial Expressions. IEEE Access, 2023, 11: 33061-33068.
  • [43] Huang, C.; Jin, Y.; Wang, Q.; Zhao, L.; Zou, C. Multimodal Emotion Recognition Based on Speech and ECG Signals. Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2010, 40(05): 895-900.
  • [44] Liu, Ying; Yuan, Li; Zu, Shuodi; Fan, Youteng; Xie, Ning; Yang, Yang. Emotion Recognition Based on Multimodal Physiological Data: A Survey. Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2024, 53(05): 720-731.
  • [45] Setiawan, Feri; Prabono, Aria Ghora; Khowaja, Sunder Ali; Kim, Wangsoo; Park, Kyoungsoo; Yahya, Bernardo Nugroho; Lee, Seok-Lyong; Hong, Jin Pyo. Fine-Grained Emotion Recognition: Fusion of Physiological Signals and Facial Expressions on Spontaneous Emotion Corpus. International Journal of Ad Hoc and Ubiquitous Computing, 2020, 35(03): 162-178.
  • [46] Yuan, Peicong; Cai, Guoyong; Chen, Ming; Tang, Xiaolv. Topics Guided Multimodal Fusion Network for Conversational Emotion Recognition. Advanced Intelligent Computing Technology and Applications, Pt III, ICIC 2024, 2024, 14877: 250-262.
  • [47] Hu, Dou; Hou, Xiaolong; Wei, Lingwei; Jiang, Lianxin; Mo, Yang. MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in Conversations. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7037-7041.
  • [48] Li, Guangqiang; Chen, Ning; Zhu, Hongqing; Li, Jing; Xu, Zhangyong; Zhu, Zhiying. Length Uncertainty-Aware Graph Contrastive Fusion Network for Multimodal Physiological Signal Emotion Recognition. Neural Networks, 2025, 187.
  • [49] Mou, Luntian; Zhao, Yiyuan; Zhou, Chao; Nakisa, Bahareh; Rastgoo, Mohammad Naim; Ma, Lei; Huang, Tiejun; Yin, Baocai; Jain, Ramesh; Gao, Wen. Driver Emotion Recognition With a Hybrid Attentional Multimodal Fusion Framework. IEEE Transactions on Affective Computing, 2023, 14(04): 2970-2981.
  • [50] Huang, Haiping; Hu, Zhenchao; Wang, Wenming; Wu, Min. Multimodal Emotion Recognition Based on Ensemble Convolutional Neural Network. IEEE Access, 2020, 8: 3265-3271.