Hierarchical fusion of visual and physiological signals for emotion recognition

Cited by: 10
Authors
Fang, Yuchun [1]
Rong, Ruru [1,2]
Huang, Jun [2]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
[2] Chinese Acad Sci, Shanghai Adv Res Inst, Shanghai, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Emotion recognition; Facial expression; Electroencephalogram; Facial expressions; EEG; Judgments; Tracking; Entropy
DOI
10.1007/s11045-021-00774-z
CLC number
TP301 [Theory and Methods]
Discipline code
081202
Abstract
Emotion recognition is an attractive and essential topic in image and signal processing. In this paper, we propose a multi-level fusion method that combines visual information and physiological signals for emotion recognition. For visual information, we propose a serial fusion of two-stage features to enhance the representation of facial expressions in a video sequence, integrating a Neural Aggregation Network with Convolutional Neural Network feature maps to reinforce the emotionally salient frames. For physiological signals, we propose a parallel fusion scheme to broaden the representation of the electroencephalogram (EEG) signals: we extract frequency features with Linear-Frequency Cepstral Coefficients (LFCC) and enhance them with the signal complexity measured by Sample Entropy (SampEn). In the classification stage, we fuse the visual and physiological information at both the feature level and the decision level. Experimental results validate the effectiveness of the proposed multi-level, multi-modal feature representation.
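Of the components named in the abstract, Sample Entropy is the most self-contained: SampEn(m, r) is the negative natural logarithm of the ratio of (m+1)-point to m-point template matches under a Chebyshev tolerance r. Below is a minimal NumPy sketch of that standard Richman-Moorman definition, not the authors' implementation; the defaults m = 2 and r = 0.2 times the signal's standard deviation are conventional choices assumed here, not taken from the paper.

import numpy as np

def sample_entropy(signal, m=2, r=None):
    """SampEn(m, r) = -ln(A / B), where B counts pairs of length-m
    templates within Chebyshev distance r (self-matches excluded) and
    A counts the same for templates of length m + 1."""
    x = np.asarray(signal, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)  # conventional tolerance; an assumption, not from the paper
    n = len(x)

    def matches(length):
        # Use n - m templates for both lengths so the A and B counts
        # range over the same start indices (Richman-Moorman convention).
        t = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(t) - 1):
            # Chebyshev distance from template i to all later templates.
            d = np.max(np.abs(t[i + 1:] - t[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    b = matches(m)       # B: length-m template matches
    a = matches(m + 1)   # A: length-(m+1) template matches
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

In a pipeline of the kind the abstract describes, one would presumably compute this per EEG channel (or per band-filtered channel) and concatenate the resulting scalars with the LFCC frequency features ahead of the feature-level fusion stage.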
Pages: 1103-1121
Page count: 19