Hierarchical multimodal-fusion of physiological signals for emotion recognition with scenario adaption and contrastive alignment

Cited by: 27
Authors
Tang, Jiehao [1 ,2 ]
Ma, Zhuang [1 ,2 ]
Gan, Kaiyu [1 ,2 ]
Zhang, Jianhua [3 ]
Yin, Zhong [1 ,2 ,4 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Engn Res Ctr Opt Instrument & Syst, Shanghai Key Lab Modern Opt Syst, Minist Educ, Shanghai 200093, Peoples R China
[2] Univ Shanghai Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 200093, Peoples R China
[3] Oslo Metropolitan Univ, Dept Comp Sci, OsloMet Artificial Intelligence Lab, N-0130 Oslo, Norway
[4] Jungong Rd 516, Shanghai 200093, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Emotion recognition; Affective computing; Multimodal fusion; Physiological signal; Deep learning; EEG; TRANSFORMER;
DOI
10.1016/j.inffus.2023.102129
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotion recognition from a single physiological modality can be limited by the lack of complementary affective responses from the central and peripheral nervous systems. When multiple modalities are integrated, however, direct fusion may ignore the heterogeneous nature of the feature domains across modalities. Moreover, the distribution of multimodal physiological responses may vary across the affective scenarios used to elicit the same emotional category. Inter-individual variation may also increase because biometric information is superimposed across the multimodal features. To tackle these issues, we present a hierarchical multimodal network for robust heterogeneous physiological representations (RHPRNet). First, a spatial-frequency pattern extractor identifies electroencephalogram (EEG) representations in both the spatial and frequency domains. Next, inter-domain and inter-modality affective encoders are applied to the statistic-complexity EEG features and the multimodal peripheral features, respectively. All learned representations are integrated via a hierarchical fusion module. To model the multi-peak patterns elicited by different affective scenarios, we design a scenario-adapting pretraining stage; a random contrastive training loss is also applied to mitigate inter-individual variance. Finally, we conduct extensive experiments evaluating RHPRNet on three publicly available multimodal databases under two validation approaches.
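The abstract names a contrastive training loss as the mechanism for pulling together same-emotion representations from different individuals. The paper defines its own "random contrastive training loss", which is not reproduced here; the sketch below is only a generic supervised contrastive loss in the style of Khosla et al., to illustrate the idea of aligning embeddings that share an emotion label while repelling the rest. All names (`supervised_contrastive_loss`, `temperature`) are illustrative assumptions, not the authors' API.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.5):
    """Generic supervised contrastive loss (illustrative sketch only).

    Pulls together L2-normalised embeddings that share an emotion label
    (e.g. the same class recorded from different subjects) and pushes
    apart embeddings with different labels. RHPRNet's actual "random
    contrastive training loss" is defined in the paper, not here.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Normalise so the dot product becomes cosine similarity.
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = features @ features.T / temperature  # pairwise scaled similarities
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude self-similarity, then take a log-softmax over each row.
    sim = np.where(self_mask, -np.inf, sim)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    loss, count = 0.0, 0
    for i in range(n):
        positives = (labels == labels[i]) & ~self_mask[i]
        if positives.any():
            # Average negative log-probability over this sample's positives.
            loss += -log_prob[i, positives].mean()
            count += 1
    return loss / max(count, 1)
```

Under this formulation, a batch whose same-label samples already lie close in embedding space incurs a lower loss than one whose labels are scattered, which is the alignment pressure the abstract attributes to the contrastive term.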
Pages: 18
Cited References
80 in total
[31] Lan, Yu-Ting; Liu, Wei; Lu, Bao-Liang. Multimodal Emotion Recognition Using Deep Generalized Canonical Correlation Analysis with an Attention Mechanism. 2020 International Joint Conference on Neural Networks (IJCNN), 2020.
[32] Lew, W. C. L., et al. IEEE Engineering in Medicine and Biology Society Conference (EMBC), 2020, p. 116. DOI: 10.1109/EMBC44109.2020.9176682.
[33] Li, Chang; Zhang, Zhongzhen; Zhang, Xiaodong; Huang, Guoning; Liu, Yu; Chen, Xun. EEG-based Emotion Recognition via Transformer Neural Architecture Search. IEEE Transactions on Industrial Informatics, 2023, 19(4): 6016-6025.
[34] Li, He; Jin, Yi-Ming; Zheng, Wei-Long; Lu, Bao-Liang. Cross-Subject Emotion Recognition Using Deep Adaptation Networks. Neural Information Processing (ICONIP 2018), Pt. V, 2018, 11305: 403-413.
[35] Li, Jiao; Xu, Xing; Yu, Wei; Shen, Fumin; Cao, Zuo; Zuo, Kai; Shen, Heng Tao. Hybrid Fusion with Intra- and Cross-Modality Attention for Image-Recipe Retrieval. SIGIR '21 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021: 244-254.
[36] Li, Jinpeng; Qiu, Shuang; Du, Changde; Wang, Yixin; He, Huiguang. Domain Adaptation for EEG Emotion Recognition Based on Latent Representation Similarity. IEEE Transactions on Cognitive and Developmental Systems, 2020, 12(2): 344-353.
[37] Li, J. P., et al. IEEE Transactions on Cybernetics, 2020, 50: 3281. DOI: 10.1109/TCYB.2019.2904052 (also listed: 10.1109/TPAMI.2019.2929036).
[38] Li, T. H., et al. IEEE EMBS Conference on Neural Engineering (NER), 2019, p. 607. DOI: 10.1109/NER.2019.8716943.
[39] Li, X. EEG Based Emotion Identification Using Unsupervised Deep Feature Learning. 2015.
[40] Li, Yang; Fu, Boxun; Li, Fu; Shi, Guangming; Zheng, Wenming. A Novel Transferability Attention Neural Network Model for EEG Emotion Recognition. Neurocomputing, 2021, 447: 92-101.