Emotion Recognition From Multimodal Physiological Signals Using a Regularized Deep Fusion of Kernel Machine

Cited by: 123
Authors
Zhang, Xiaowei [1]
Liu, Jinyong [1]
Shen, Jian [1]
Li, Shaojie [1]
Hou, Kechen [1]
Hu, Bin [1,3,4]
Gao, Jin [1]
Zhang, Tong [2]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, Gansu Prov Key Lab Wearable Comp, Lanzhou 730000, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat, Guangzhou 510640, Peoples R China
[3] Chinese Acad Sci, CAS Ctr Excellence Brain Sci, Shanghai 200031, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Biol Sci, Shanghai 200031, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Physiology; Emotion recognition; Feature extraction; Kernel; Task analysis; Brain modeling; Fuses; Deep neural network; emotion recognition; kernel machine; multimodal fusion;
DOI
10.1109/TCYB.2020.2987575
CLC number
TP [Automation Technology and Computer Technology];
Discipline code
0812 (Computer Science and Technology);
Abstract
Physiological signals are increasingly studied for emotion recognition, with the goal of realizing emotional intelligence in human-computer interaction. However, owing to the complexity of emotions and individual differences in physiological responses, designing reliable and effective models remains a challenging problem. In this article, we propose a regularized deep fusion framework for emotion recognition based on multimodal physiological signals. After extracting effective features from the different types of physiological signals, we construct ensemble dense embeddings of the multimodal features using kernel matrices, and then employ a deep network architecture to learn a task-specific representation of each kind of physiological signal from these embeddings. Finally, a global fusion layer with a regularization term, which efficiently explores the correlation and diversity among all of the representations in a synchronous optimization process, fuses the generated representations. Experiments on two benchmark datasets show that this framework improves the performance of subject-independent emotion recognition compared with single-modal classifiers and other fusion methods. Data visualization also demonstrates that the final fused representation exhibits greater class-separability power for emotion recognition.
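To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch in PyTorch. The RBF kernel choice, the branch widths, and the pairwise-similarity penalty standing in for the paper's correlation/diversity regularizer are all assumptions for illustration; the names rbf_gram, ModalityNet, RegularizedDeepFusion, and diversity_penalty are ours, not the authors'.

```python
# Hypothetical sketch of a regularized deep fusion of kernel embeddings.
# Kernel type, layer sizes, and the regularizer form are illustrative
# assumptions, not the configuration published in the article.
import torch
import torch.nn as nn

def rbf_gram(x, gamma=1.0):
    """Dense embedding via an RBF kernel (Gram) matrix: each sample is
    re-represented by its similarity to every sample in the set."""
    sq = torch.cdist(x, x).pow(2)
    return torch.exp(-gamma * sq)

class ModalityNet(nn.Module):
    """Learns a task-specific representation from one modality's kernel embedding."""
    def __init__(self, n_samples, hidden=64, rep=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_samples, hidden), nn.ReLU(),
            nn.Linear(hidden, rep), nn.ReLU(),
        )
    def forward(self, gram):
        return self.net(gram)

class RegularizedDeepFusion(nn.Module):
    """One branch per modality, plus a global fusion layer over all branches."""
    def __init__(self, n_samples, n_modalities, rep=32, n_classes=2):
        super().__init__()
        self.branches = nn.ModuleList(
            ModalityNet(n_samples, rep=rep) for _ in range(n_modalities))
        self.classifier = nn.Linear(rep * n_modalities, n_classes)
    def forward(self, grams):
        reps = [b(g) for b, g in zip(self.branches, grams)]
        logits = self.classifier(torch.cat(reps, dim=1))
        return logits, reps

def diversity_penalty(reps):
    """One plausible regularizer: penalize pairwise similarity between
    modality representations so the branches stay complementary."""
    loss = 0.0
    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            loss = loss + (reps[i] * reps[j]).mean().pow(2)
    return loss

# Toy usage: 3 physiological modalities, 16 samples, batch = whole set.
torch.manual_seed(0)
feats = [torch.randn(16, d) for d in (10, 8, 12)]   # per-modality features
grams = [rbf_gram(f) for f in feats]                # ensemble dense embeddings
model = RegularizedDeepFusion(n_samples=16, n_modalities=3)
labels = torch.randint(0, 2, (16,))
logits, reps = model(grams)
loss = nn.functional.cross_entropy(logits, labels) + 0.1 * diversity_penalty(reps)
loss.backward()  # fusion layer and all branches optimize synchronously
```

Note that a kernel-matrix embedding ties each branch's input dimension to the number of reference samples, so in practice the Gram matrix would be computed against a fixed training set rather than per batch.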
Pages: 4386-4399
Page count: 14