Speech Emotion Recognition among Couples using the Peak-End Rule and Transfer Learning

Cited by: 4

Authors:

Boateng, George [1]

Sels, Laura [2]

Kuppens, Peter [3]

Hilpert, Peter [4]

Kowatsch, Tobias [1]

Affiliations:

[1] Swiss Fed Inst Technol, Zurich, Switzerland

[2] Univ Ghent, Ghent, Belgium

[3] Katholieke Univ Leuven, Leuven, Belgium

[4] Univ Surrey, Surrey, England

Source:

COMPANION PUBLICATION OF THE 2020 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (ICMI '20 COMPANION) | 2020

Keywords:

Speech emotion recognition; Speech processing; Affective computing; Couples; Transfer Learning; Peak-end rule; Convolutional neural network; Support vector machine; MODEL;

DOI:

10.1145/3395035.3425253

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Extensive couples' literature shows that how couples feel after a conflict is predicted by certain emotional aspects of that conversation. Understanding the emotions of couples leads to a better understanding of partners' mental well-being and consequently their relationships. Hence, automatic emotion recognition among couples could potentially guide interventions to help couples improve their emotional well-being and their relationships. It has been shown that people's global emotional judgment after an experience is strongly influenced by the emotional extremes and ending of that experience, known as the peak-end rule. In this work, we leveraged this theory and used machine learning to investigate which audio segments can be used to best predict the end-of-conversation emotions of couples. We used speech data collected from 101 Dutch-speaking couples in Belgium who engaged in 10-minute-long conversations in the lab. We extracted acoustic features from (1) the audio segments with the most extreme positive and negative ratings, and (2) the ending of the audio. We used transfer learning in which we extracted these acoustic features with a pre-trained convolutional neural network (YAMNet). We then used these features to train machine learning models - support vector machines - to predict the end-of-conversation valence ratings (positive vs negative) of each partner. The results of this work could inform how to best recognize the emotions of couples after conversation sessions and eventually lead to a better understanding of couples' relationships, either in therapy or in everyday life.
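The segment-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a hypothetical per-time-step valence rating series for one partner and a hypothetical fixed window length, neither of which the abstract specifies. The function name `peak_end_segments` is invented for illustration. It returns index ranges for the most positive window, the most negative window, and the final window; in a full pipeline those audio ranges would then be passed to a pre-trained network such as YAMNet for feature extraction and on to a support vector machine, which is not shown here.

```python
from typing import Dict, List, Tuple


def peak_end_segments(ratings: List[float], window: int) -> Dict[str, Tuple[int, int]]:
    """Pick peak and end windows from a rating series (illustrative only).

    ratings: hypothetical valence ratings, one value per time step.
    window:  hypothetical segment length in time steps.
    Returns half-open (start, stop) index ranges into `ratings`.
    """
    if window <= 0 or window > len(ratings):
        raise ValueError("window must be between 1 and len(ratings)")

    # Mean rating of every sliding window, indexed by window start.
    starts = range(len(ratings) - window + 1)
    means = [sum(ratings[s:s + window]) / window for s in starts]

    # Emotional extremes: the windows with the highest and lowest mean rating.
    # Ties resolve to the earliest window.
    most_positive = max(starts, key=lambda s: means[s])
    most_negative = min(starts, key=lambda s: means[s])

    # Ending: the last `window` time steps of the conversation.
    end_start = len(ratings) - window

    return {
        "peak_positive": (most_positive, most_positive + window),
        "peak_negative": (most_negative, most_negative + window),
        "end": (end_start, len(ratings)),
    }
```

Averaging over a window rather than taking a single extreme rating is a design choice of this sketch, made so that each selected range is long enough to cut a usable audio segment.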

Citation

Pages: 17-21

Number of pages: 5
