User Emotion Recognition Method Based on Facial Expression and Speech Signal Fusion

Cited: 2
Authors
Lu, Fei [1 ]
Zhang, Long [1 ]
Tian, Guohui [1 ]
Affiliations
[1] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
Source
PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021) | 2021
Keywords
emotion recognition; gabor transform; transfer learning; multimodal fusion; arousal-valence;
DOI
10.1109/ICIEA51954.2021.9516216
Chinese Library Classification
T [Industrial Technology];
Subject Classification Code
08;
Abstract
In human-computer interaction, using facial expressions and speech information to identify a user's continuous emotions is an urgent problem. The key factors limiting recognition accuracy are the data deficiencies that arise when fusing speech and facial information, and the abnormal frames in the video. To address these problems, a user emotion recognition system based on the multimodal fusion of facial expressions and speech is designed. For facial expressions, a Gabor-transform continuous emotion recognition method based on data increments is proposed. For speech, Mel-scale Frequency Cepstral Coefficients (MFCC) are used to extract speech features, and user emotions are recognized through transfer learning. Finally, in the late-fusion stage, multiple linear regression is used to combine the two modalities, as sketched below. The proposed method is evaluated on the AVEC2013 dataset with arousal-valence labels, and the experimental results show that it improves the accuracy of user emotion recognition.
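The following is a minimal sketch of the pipeline the abstract describes, not the authors' implementation: MFCC features are extracted from speech, and per-modality arousal-valence predictions are combined by multiple linear regression in a late-fusion step. It assumes librosa for MFCC extraction and scikit-learn's LinearRegression; the function names and placeholder arrays are hypothetical.

```python
# Sketch only: MFCC extraction plus multiple-linear-regression late fusion,
# assuming librosa and scikit-learn (not the paper's code).
import numpy as np
import librosa
from sklearn.linear_model import LinearRegression


def extract_mfcc(wav_path, n_mfcc=13):
    """Load an audio file and return its mean MFCC vector over time."""
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape (n_mfcc, frames)
    return mfcc.mean(axis=1)


def fit_fusion(face_preds, speech_preds, labels):
    """Late fusion: regress ground-truth arousal-valence labels on the
    concatenated single-modality predictions."""
    X = np.hstack([face_preds, speech_preds])     # (n_samples, 4)
    return LinearRegression().fit(X, labels)      # labels: (n_samples, 2)


# Hypothetical usage with placeholder arrays of arousal-valence values:
# face_preds, speech_preds, labels = ...  # each (n_samples, 2)
# fuser = fit_fusion(face_preds, speech_preds, labels)
# fused = fuser.predict(np.hstack([face_preds, speech_preds]))
```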
Pages: 1121-1126
Number of pages: 6
Related Papers
50 records in total
  • [1] Multi-Modal Fusion Emotion Recognition Method of Speech Expression Based on Deep Learning
    Liu, Dong
    Wang, Zhiyong
    Wang, Lifeng
    Chen, Longxi
    FRONTIERS IN NEUROROBOTICS, 2021, 15
  • [2] Emotion recognition from the facial image and speech signal
    Go, HJ
    Kwak, KC
    Lee, DJ
    Chun, MG
    SICE 2003 ANNUAL CONFERENCE, VOLS 1-3, 2003, : 2890 - 2895
  • [3] Audio-Visual Emotion Recognition Based on Facial Expression and Affective Speech
    Zhang, Shiqing
    Li, Lemin
    Zhao, Zhijin
    MULTIMEDIA AND SIGNAL PROCESSING, 2012, 346 : 46 - +
  • [4] The Fusion of Electroencephalography and Facial Expression for Continuous Emotion Recognition
    Li, Dahua
    Wang, Zhe
    Wang, Chuhan
    Liu, Shuang
    Chi, Wenhao
    Dong, Enzeng
    Song, Xiaolin
    Gao, Qiang
    Song, Yu
    IEEE ACCESS, 2019, 7 : 155724 - 155736
  • [5] English speech emotion recognition method based on speech recognition
    Liu, Man
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (2) : 391 - 398
  • [6] English speech emotion recognition method based on speech recognition
    Man Liu
    International Journal of Speech Technology, 2022, 25 : 391 - 398
  • [7] Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis
    Loic Kessous
    Ginevra Castellano
    George Caridakis
    Journal on Multimodal User Interfaces, 2010, 3 : 33 - 48
  • [8] Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis
    Kessous, Loic
    Castellano, Ginevra
    Caridakis, George
    JOURNAL ON MULTIMODAL USER INTERFACES, 2010, 3 (1-2) : 33 - 48
  • [9] Multimodal Emotion Recognition Based on Facial Expressions, Speech, and EEG
    Pan, Jiahui
    Fang, Weijie
    Zhang, Zhihang
    Chen, Bingzhi
    Zhang, Zheng
    Wang, Shuihua
    IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY, 2024, 5 : 396 - 403
  • [10] FaceFetch: A User Emotion Driven Multimedia Content Recommendation System Based on Facial Expression Recognition
    Mariappan, Mahesh Babu
    Suk, Myunghoon
    Prabhakaran, Balakrishnan
    2012 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2012, : 84 - 87