Hierarchical multimodal-fusion of physiological signals for emotion recognition with scenario adaption and contrastive alignment

Cited by: 18

Authors:

Tang, Jiehao ^{[1,2]}

Ma, Zhuang ^{[1,2]}

Gan, Kaiyu ^{[1,2]}

Zhang, Jianhua ^{[3]}

Yin, Zhong ^{[1,2,4]}

Affiliations:

[1] Univ Shanghai Sci & Technol, Engn Res Ctr Opt Instrument & Syst, Shanghai Key Lab Modern Opt Syst, Minist Educ, Shanghai 200093, Peoples R China

[2] Univ Shanghai Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 200093, Peoples R China

[3] Oslo Metropolitan Univ, Dept Comp Sci, OsloMet Artificial Intelligence Lab, N-0130 Oslo, Norway

[4] Jungong Rd 516, Shanghai 200093, Peoples R China

Source:

INFORMATION FUSION | 2024 / Vol. 103

Funding:

National Natural Science Foundation of China;

Keywords:

Emotion recognition; Affective computing; Multimodal fusion; Physiological signal; Deep learning; EEG; Transformer;

DOI:

10.1016/j.inffus.2023.102129

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Emotion recognition from a single-modal physiological signal can be limited because it lacks the complementary affective responses of the central and peripheral nervous systems. However, when multiple modalities are integrated, a direct fusion may ignore the heterogeneous nature of the feature domains from one modality to another. In addition, the distribution of the multimodal physiological responses may vary across the different affective scenarios used to stimulate an identical emotional category. The inter-individual variation may also increase due to the superposition of the biometric information from the multimodal features. To tackle these issues, we present a hierarchical multimodal network for robust heterogeneous physiological representations (RHPRNet). First, we apply a spatial-frequency pattern extractor to identify the electroencephalogram (EEG) representations in both the spatial and frequency domains. Next, inter-domain and inter-modality affective encoders are applied to the statistic-complexity EEG features and the multimodal peripheral features, respectively. All the learned representations are integrated via a hierarchical fusion module. To model the multi-peak patterns stimulated by different affective scenarios, we design a scenario-adapting pretraining stage. A random contrastive training loss is also applied to mitigate the inter-individual variance. Finally, we perform extensive experiments to examine the performance of the RHPRNet on three publicly available multimodal databases combined with two validation approaches.
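The abstract does not define its "random contrastive training loss", so the block below is only a generic illustration of contrastive alignment, not the RHPRNet loss. It is a minimal sketch of a standard supervised contrastive objective, assuming that embeddings sharing an emotion label (possibly from different subjects) are treated as positives. The function names, the cosine similarity, and the `temperature` value are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, labels, temperature=0.5):
    """Generic supervised contrastive loss (illustrative, not the paper's loss).

    For each anchor, samples with the same label are positives and all
    other samples form the softmax denominator. Anchors that have no
    positive in the batch are skipped.
    """
    n = len(embeddings)
    per_anchor = []
    for i in range(n):
        # Temperature-scaled similarities from anchor i to every other sample.
        sims = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        # Average negative log-probability of picking a positive.
        loss_i = -sum(sims[p] - log_denom for p in positives) / len(positives)
        per_anchor.append(loss_i)
    if not per_anchor:
        return 0.0
    return sum(per_anchor) / len(per_anchor)


# Hypothetical 2-D embeddings: two samples near each axis.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(emb, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(emb, [0, 1, 0, 1])
print(aligned < misaligned)
```

The loss is lower when samples with the same label already sit close together, which is the sense in which such an objective pulls same-emotion representations from different individuals toward each other.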

Pages: 18
