Semi-Supervised Cross-Subject Emotion Recognition Based on Stacked Denoising Autoencoder Architecture Using a Fusion of Multi-Modal Physiological Signals

Cited by: 12
Authors
Luo, Junhai [1]
Tian, Yuxin [1]
Yu, Hang [1]
Chen, Yu [1]
Wu, Man [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 610056, Peoples R China
Keywords
DEAP dataset; electroencephalogram (EEG); emotion recognition; multi-source fusion; stacked denoising autoencoder; unsupervised representation learning; FEATURE-EXTRACTION; TIME-SERIES; EEG; REPRESENTATIONS
DOI
10.3390/e24050577
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
Emotion recognition has received considerable attention in recent decades. As interest has shifted toward physiological signals, a wide range of elaborate hand-designed features has been proposed and combined with various classifiers to detect a person's emotional state. To avoid the labor of designing features by hand, we propose to learn affective and robust representations automatically with a Stacked Denoising Autoencoder (SDA) architecture, using unsupervised pre-training followed by supervised fine-tuning. We compare the performance of different features and models on three binary classification tasks based on the Valence-Arousal-Dominance (VAD) affect model. Decision fusion and feature fusion of electroencephalogram (EEG) and peripheral signals are applied to hand-engineered features; data-level fusion is applied to the deep-learning methods. The fused data perform better than either modality alone. To take advantage of deep-learning algorithms, we augment the original data and feed them directly into the training model. To test the method's practical effect, we use two deep architectures and a generative stacked semi-supervised architecture as references for comparison. The results show that our scheme slightly outperforms the other three deep feature extractors and surpasses state-of-the-art hand-engineered features.
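The unsupervised pre-training step named in the abstract can be illustrated with a minimal NumPy sketch of greedy layer-wise training for a stacked denoising autoencoder. It is not the authors' implementation: the function names (`train_denoising_layer`, `encode`, `pretrain_stack`), the masking noise, the tied weights, the sigmoid units, the squared-error loss, and all hyperparameters are assumptions chosen for illustration, and the supervised fine-tuning stage is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_denoising_layer(X, n_hidden, noise=0.3, lr=0.1, epochs=200, seed=0):
    """Train one denoising autoencoder layer with tied weights (assumed setup).

    Returns (W, b_hidden, b_visible, losses); losses holds the mean squared
    reconstruction error per epoch, measured against the clean input.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_visible = X.shape
    W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_visible)
    losses = []
    for _ in range(epochs):
        # Masking noise: zero each input entry with probability `noise`.
        mask = rng.random(X.shape) >= noise
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W + b_h)   # encode the corrupted input
        R = sigmoid(H @ W.T + b_v)       # decode with the transposed weights
        err = R - X                      # target is the clean input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error through decoder and encoder.
        d_R = err * R * (1.0 - R)
        d_H = (d_R @ W) * H * (1.0 - H)
        grad_W = (X_noisy.T @ d_H + d_R.T @ H) / n_samples
        W -= lr * grad_W
        b_h -= lr * d_H.mean(axis=0)
        b_v -= lr * d_R.mean(axis=0)
    return W, b_h, b_v, losses


def encode(X, W, b_h):
    """Map clean inputs to the hidden representation of one layer."""
    return sigmoid(X @ W + b_h)


def pretrain_stack(X, layer_sizes, **kwargs):
    """Greedy layer-wise pre-training: each layer trains on the previous codes."""
    params, inp = [], X
    for n_hidden in layer_sizes:
        W, b_h, _, _ = train_denoising_layer(inp, n_hidden, **kwargs)
        params.append((W, b_h))
        inp = encode(inp, W, b_h)
    return params, inp
```

Calling `pretrain_stack(X, [8, 4])` on a samples-by-features array scaled to [0, 1] returns the per-layer encoder parameters and the top-level codes. In a semi-supervised setup of the kind the abstract describes, a classifier would then be attached to those codes and the whole stack fine-tuned on the labeled subset.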
Pages: 29
Related papers
31 entries in total
[1]   Emotion Recognition Based on High-Resolution EEG Recordings and Reconstructed Brain Sources [J].
Becker, Hanna ;
Fleureau, Julien ;
Guillotel, Philippe ;
Wendling, Fabrice ;
Merlet, Isabelle ;
Albera, Laurent .
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2020, 11 (02) :244-257
[2]   Bengio Y., 2007, Advances in Neural Information Processing Systems, V19, P153
[3]   Representation Learning: A Review and New Perspectives [J].
Bengio, Yoshua ;
Courville, Aaron ;
Vincent, Pascal .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (08) :1798-1828
[4]   UNIVERSALS AND CULTURAL-DIFFERENCES IN THE JUDGMENTS OF FACIAL EXPRESSIONS OF EMOTION [J].
EKMAN, P ;
FRIESEN, WV ;
OSULLIVAN, M ;
CHAN, A ;
DIACOYANNITARLATZIS, I ;
HEIDER, K ;
KRAUSE, R ;
LECOMPTE, WA ;
PITCAIRN, T ;
RICCIBITTI, PE ;
SCHERER, K ;
TOMITA, M ;
TZAVARAS, A .
JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 1987, 53 (04) :712-717
[5]   García HF, 2016, IEEE ENG MED BIO, P850, DOI 10.1109/EMBC.2016.7590834
[6]   APPROACH TO AN IRREGULAR TIME-SERIES ON THE BASIS OF THE FRACTAL THEORY [J].
HIGUCHI, T .
PHYSICA D-NONLINEAR PHENOMENA, 1988, 31 (02) :277-283
[7]   Reducing the dimensionality of data with neural networks [J].
Hinton, G. E. ;
Salakhutdinov, R. R. .
SCIENCE, 2006, 313 (5786) :504-507
[8]   A fast learning algorithm for deep belief nets [J].
Hinton, Geoffrey E. ;
Osindero, Simon ;
Teh, Yee-Whye .
NEURAL COMPUTATION, 2006, 18 (07) :1527-1554
[9]   EEG ANALYSIS BASED ON TIME DOMAIN PROPERTIES [J].
HJORTH, B .
ELECTROENCEPHALOGRAPHY AND CLINICAL NEUROPHYSIOLOGY, 1970, 29 (03) :306-&
[10]   The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis [J].
Huang, NE ;
Shen, Z ;
Long, SR ;
Wu, MLC ;
Shih, HH ;
Zheng, QN ;
Yen, NC ;
Tung, CC ;
Liu, HH .
PROCEEDINGS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 1998, 454 (1971) :903-995