Emotion Recognition With Audio, Video, EEG, and EMG: A Dataset and Baseline Approaches

Cited by: 35
Authors
Chen, Jin [1]
Ro, Tony [2,3,4]
Zhu, Zhigang [1,5]
Affiliations
[1] CUNY, Computer Science Department, New York, NY 10031 USA
[2] CUNY, Graduate Center, Program in Psychology, New York, NY 10016 USA
[3] CUNY, Graduate Center, Program in Biology, New York, NY 10016 USA
[4] CUNY, Graduate Center, Program in Cognitive Neuroscience, New York, NY 10016 USA
[5] CUNY, Graduate Center, Doctoral Program in Computer Science, New York, NY 10016 USA
Funding
U.S. National Science Foundation
Keywords
Electroencephalography; Electromyography; Feature extraction; Videos; Support vector machines; Physiology; Emotion recognition; Data collection; Signal; LSTM
DOI
10.1109/ACCESS.2022.3146729
Chinese Library Classification (CLC)
TP [automation technology, computer technology]
Subject Classification Code
0812
Abstract
This paper describes a new posed multimodal emotional dataset and compares human emotion classification based on four different modalities: audio, video, electromyography (EMG), and electroencephalography (EEG). Results are reported for several baseline approaches using various feature extraction techniques and machine-learning algorithms. First, we collected a dataset from 11 human subjects expressing six basic emotions and one neutral emotion. We then extracted features from each modality using principal component analysis, autoencoders, convolutional networks, and mel-frequency cepstral coefficients (MFCCs), some of which are unique to individual modalities. A number of baseline models were applied to compare classification performance, including k-nearest neighbors (KNN), support vector machines (SVM), random forest, multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN) classifiers. Our results show that bootstrapping the biosensor signals (i.e., EMG and EEG) can greatly improve emotion classification performance by reducing noise. For these biosensor signals, the best classification results were obtained with a traditional KNN, whereas audio and image sequences of human emotions were better classified using an LSTM.
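As a rough illustration of the bootstrapping result summarized above, the sketch below resamples noisy synthetic EEG/EMG-style trials with replacement, averages them to reduce per-trial noise, and classifies the averaged trials with a KNN baseline. This is a minimal sketch on synthetic data, not the authors' pipeline: the trial and channel counts, the bootstrap_average helper, the seven-class label layout, and the reading of "bootstrapping" as resample-and-average of trials are all assumptions made for illustration.

# Minimal sketch (assumptions noted above); not the authors' code or dataset.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
N_CLASSES, N_TRIALS, N_CHANNELS, N_SAMPLES = 7, 40, 8, 128  # 6 basic emotions + neutral

def bootstrap_average(trials, n_new, k, rng):
    """Draw k trials with replacement and average them; repeat n_new times.
    Averaging attenuates per-trial noise while keeping the class-specific signal."""
    idx = rng.integers(0, len(trials), size=(n_new, k))
    return trials[idx].mean(axis=1)            # shape: (n_new, n_channels, n_samples)

X_parts, y_parts = [], []
for label in range(N_CLASSES):
    # Class template buried in heavy single-trial noise (stand-in for EEG/EMG trials).
    template = 0.1 * label + 0.1 * rng.standard_normal((N_CHANNELS, N_SAMPLES))
    trials = template + rng.standard_normal((N_TRIALS, N_CHANNELS, N_SAMPLES))
    boot = bootstrap_average(trials, n_new=100, k=10, rng=rng)
    X_parts.append(boot.reshape(len(boot), -1))  # flatten channels x time into a feature vector
    y_parts.append(np.full(len(boot), label))

X, y = np.concatenate(X_parts), np.concatenate(y_parts)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("KNN accuracy on bootstrapped trials:", accuracy_score(y_te, clf.predict(X_te)))

With real recordings, the resample-and-average step would presumably be applied per subject and per emotion label before feature extraction (e.g., PCA or an autoencoder), and the same classifiers listed in the abstract could then be compared on the averaged trials.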
Pages: 13229-13242
Page count: 14