Emotion Recognition From Multimodal Physiological Signals Using a Regularized Deep Fusion of Kernel Machine

Cited by: 123
Authors
Zhang, Xiaowei [1]
Liu, Jinyong [1]
Shen, Jian [1]
Li, Shaojie [1]
Hou, Kechen [1]
Hu, Bin [1,3,4]
Gao, Jin [1]
Zhang, Tong [2]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Gansu Prov Key Lab Wearable Comp, Lanzhou 730000, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat, Guangzhou 510640, Peoples R China
[3] Chinese Acad Sci, CAS Ctr Excellence Brain Sci, Shanghai 200031, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Biol Sci, Shanghai 200031, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Physiology; Emotion recognition; Feature extraction; Kernel; Task analysis; Brain modeling; Fuses; Deep neural network; emotion recognition; kernel machine; multimodal fusion;
DOI
10.1109/TCYB.2020.2987575
CLC number
TP [Automation Technology and Computer Technology];
Discipline code
0812 (Computer Science and Technology);
Abstract
Physiological signals are increasingly studied for emotion recognition, with the goal of realizing emotional intelligence in human-computer interaction. However, owing to the complexity of emotions and individual differences in physiological responses, designing reliable and effective models remains a challenging problem. In this article, we propose a regularized deep fusion framework for emotion recognition based on multimodal physiological signals. After extracting effective features from the different types of physiological signals, we construct ensemble dense embeddings of the multimodal features using kernel matrices, and then employ a deep network architecture to learn a task-specific representation of each kind of physiological signal from these embeddings. Finally, a global fusion layer with a regularization term, which efficiently explores the correlation and diversity among all of the representations in a synchronous optimization process, fuses the generated representations. Experiments on two benchmark datasets show that this framework improves the performance of subject-independent emotion recognition compared with single-modal classifiers and other fusion methods. Data visualization also demonstrates that the final fused representation exhibits greater class-separability power for emotion recognition.
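To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch in PyTorch. The RBF kernel choice, the branch widths, and the pairwise-similarity penalty standing in for the paper's correlation/diversity regularizer are all assumptions for illustration; the names rbf_gram, ModalityNet, RegularizedDeepFusion, and diversity_penalty are ours, not the authors'.

```python
# Hypothetical sketch of a regularized deep fusion of kernel embeddings.
# Kernel type, layer sizes, and the regularizer form are illustrative
# assumptions, not the configuration published in the article.
import torch
import torch.nn as nn

def rbf_gram(x, gamma=1.0):
    """Dense embedding via an RBF kernel (Gram) matrix: each sample is
    re-represented by its similarity to every sample in the set."""
    sq = torch.cdist(x, x).pow(2)
    return torch.exp(-gamma * sq)

class ModalityNet(nn.Module):
    """Learns a task-specific representation from one modality's kernel embedding."""
    def __init__(self, n_samples, hidden=64, rep=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_samples, hidden), nn.ReLU(),
            nn.Linear(hidden, rep), nn.ReLU(),
        )
    def forward(self, gram):
        return self.net(gram)

class RegularizedDeepFusion(nn.Module):
    """One branch per modality, plus a global fusion layer over all branches."""
    def __init__(self, n_samples, n_modalities, rep=32, n_classes=2):
        super().__init__()
        self.branches = nn.ModuleList(
            ModalityNet(n_samples, rep=rep) for _ in range(n_modalities))
        self.classifier = nn.Linear(rep * n_modalities, n_classes)
    def forward(self, grams):
        reps = [b(g) for b, g in zip(self.branches, grams)]
        logits = self.classifier(torch.cat(reps, dim=1))
        return logits, reps

def diversity_penalty(reps):
    """One plausible regularizer: penalize pairwise similarity between
    modality representations so the branches stay complementary."""
    loss = 0.0
    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            loss = loss + (reps[i] * reps[j]).mean().pow(2)
    return loss

# Toy usage: 3 physiological modalities, 16 samples, batch = whole set.
torch.manual_seed(0)
feats = [torch.randn(16, d) for d in (10, 8, 12)]   # per-modality features
grams = [rbf_gram(f) for f in feats]                # ensemble dense embeddings
model = RegularizedDeepFusion(n_samples=16, n_modalities=3)
labels = torch.randint(0, 2, (16,))
logits, reps = model(grams)
loss = nn.functional.cross_entropy(logits, labels) + 0.1 * diversity_penalty(reps)
loss.backward()  # fusion layer and all branches optimize synchronously
```

Note that a kernel-matrix embedding ties each branch's input dimension to the number of reference samples, so in practice the Gram matrix would be computed against a fixed training set rather than per batch.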
Pages: 4386-4399
Page count: 14