Multi-Modal Emotion Recognition Using Speech Features and Text-Embedding

Times cited: 7

Authors:

Byun, Sung-Woo [1]

Kim, Ju-Hee [1]

Lee, Seok-Pil [2]

Affiliations:

[1] SangMyung Univ, Grad Sch, Dept Comp Sci, Seoul 03016, South Korea

[2] SangMyung Univ, Dept Elect Engn, Seoul 03016, South Korea

Source:

APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 17

Keywords:

speech emotion recognition; emotion recognition; multi-modal emotion recognition

DOI:

10.3390/app11177967

Chinese Library Classification (CLC) number:

O6 [Chemistry]

Discipline classification code:

0703

Abstract:

Intelligent personal assistants, chatbots and AI speakers are increasingly used as communication interfaces, and the demand for more natural interaction has grown accordingly. Humans express emotions in various ways, such as through tone of voice or facial expressions; multimodal approaches to recognizing human emotions have therefore been studied. In this paper, we propose an emotion recognition method that improves accuracy by using both speech and text data and by exploiting the strengths of each. We extracted 43 feature vectors, such as spectral features, harmonic features and MFCCs, from speech datasets. In addition, 256 embedding vectors were extracted from the transcripts using a pre-trained Tacotron encoder. The acoustic feature vectors and the embedding vectors were fed into their respective deep learning models, each of which produced a probability for the predicted output classes. The results show that the proposed model is more accurate than models in previous research.
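
The abstract says that each modality has its own model producing class probabilities, but it does not say how the two outputs are combined. The following is a minimal sketch of one common option, a weighted average of the two probability vectors (score-level fusion). It is an illustration under stated assumptions, not the authors' implementation: the input sizes (43 and 256 per utterance), the four emotion classes, the random linear layers standing in for the trained networks, and the names fuse_probabilities and speech_weight are all hypothetical.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_probabilities(p_speech, p_text, speech_weight=0.5):
    """Weighted average of two class-probability arrays (assumed fusion rule)."""
    fused = speech_weight * p_speech + (1.0 - speech_weight) * p_text
    return fused / fused.sum(axis=-1, keepdims=True)  # renormalize defensively


rng = np.random.default_rng(0)
n_classes = 4  # hypothetical number of emotion classes

# Stand-ins for real inputs: 2 utterances, assumed 43 acoustic features
# and an assumed 256-dimensional text embedding per utterance.
acoustic = rng.normal(size=(2, 43))
text_emb = rng.normal(size=(2, 256))

# Random linear layers stand in for the two trained deep learning models.
w_speech = rng.normal(size=(43, n_classes)) * 0.1
w_text = rng.normal(size=(256, n_classes)) * 0.1

p_speech = softmax(acoustic @ w_speech)  # per-class probabilities from speech
p_text = softmax(text_emb @ w_text)      # per-class probabilities from text

fused = fuse_probabilities(p_speech, p_text, speech_weight=0.5)
predicted = fused.argmax(axis=-1)        # index of the most probable class
```

In such a scheme, speech_weight would normally be tuned on validation data; a value of 0.5 simply treats both modalities equally.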

Pages: 9

[21] Semantic Alignment Network for Multi-Modal Emotion Recognition
Hou, Mixiao
Zhang, Zheng
Liu, Chang
Lu, Guangming
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5318 - 5329
[22] Emotion recognition with multi-modal peripheral physiological signals
Gohumpu, Jennifer
Xue, Mengru
Bao, Yanchi
FRONTIERS IN COMPUTER SCIENCE, 2023, 5
[23] A Multi-Modal Deep Learning Approach for Emotion Recognition
Shahzad, H. M.
Bhatti, Sohail Masood
Jaffar, Arfan
Rashid, Muhammad
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (02) : 1561 - 1570
[24] ATTENTION DRIVEN FUSION FOR MULTI-MODAL EMOTION RECOGNITION
Priyasad, Darshana
Fernando, Tharindu
Denman, Simon
Sridharan, Sridha
Fookes, Clinton
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3227 - 3231
[25] Multi-modal Emotion Recognition for Determining Employee Satisfaction
Zaman, Farhan Uz
Zaman, Maisha Tasnia
Alam, Md Ashraful
Alam, Md Golam Rabiul
2021 IEEE ASIA-PACIFIC CONFERENCE ON COMPUTER SCIENCE AND DATA ENGINEERING (CSDE), 2021,
[26] Lightweight multi-modal emotion recognition model based on modal generation
Liu, Peisong
Che, Manqiang
Luo, Jiangchuan
2022 9TH INTERNATIONAL FORUM ON ELECTRICAL ENGINEERING AND AUTOMATION, IFEEA, 2022, : 430 - 435
[27] Evaluating Ensemble Learning Methods for Multi-Modal Emotion Recognition Using Sensor Data Fusion
Younis, Eman M. G.
Zaki, Someya Mohsen
Kanjo, Eiman
Houssein, Essam H.
SENSORS, 2022, 22 (15)
[28] Text-independent speech emotion recognition using frequency adaptive features
Chenjian Wu
Chengwei Huang
Hong Chen
Multimedia Tools and Applications, 2018, 77 : 24353 - 24363
[29] Text-independent speech emotion recognition using frequency adaptive features
Wu, Chenjian
Huang, Chengwei
Chen, Hong
MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (18) : 24353 - 24363
[30] Multi-modal speech emotion detection using optimised deep neural network classifier
Padman, Sweta Nishant
Magare, Dhiraj
COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, 2023, 11 (05) : 2020 - 2038