Curriculum Learning for Speech Emotion Recognition From Crowdsourced Labels

Cited by: 56
Authors
Lotfian, Reza [1 ]
Busso, Carlos [1 ]
Affiliations
[1] Univ Texas Dallas, Erik Jonsson Sch Elect & Comp Engn, Richardson, TX 75080 USA
Funding
U.S. National Science Foundation (NSF);
Keywords
Curriculum learning; speech emotion recognition; inter-evaluator agreement; CORPUS; CLASSIFICATION; INTELLIGENCE;
DOI
10.1109/TASLP.2019.2898816
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract
This study introduces a method to design a curriculum for machine learning that maximizes the efficiency of the training process of deep neural networks (DNNs) for speech emotion recognition. Previous studies in other machine-learning problems have shown the benefits of training a classifier following a curriculum in which samples are gradually presented in increasing order of difficulty. For speech emotion recognition, the challenge is to establish a natural order of difficulty in the training set to create the curriculum. We address this problem by assuming that samples that are ambiguous for humans are also ambiguous for computers. Speech samples are often annotated by multiple evaluators to account for differences in emotion perception across individuals. While some sentences with clear emotional content are consistently annotated, sentences with more ambiguous emotional content show substantial disagreement between individual evaluations. We propose to use the disagreement between evaluators as a measure of difficulty for the classification task. We propose metrics that quantify the inter-evaluator agreement to define the curriculum for regression problems as well as binary and multi-class classification problems. The experimental results consistently show that relying on a curriculum based on agreement between human judgments leads to statistically significant improvements over baselines trained without a curriculum.
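To make the idea in the abstract concrete, the sketch below (an illustration, not the authors' implementation) scores each training utterance by the spread of its per-evaluator ratings and releases the data to the classifier in stages of increasing difficulty. The field names ("id", "ratings"), the number of stages, and the cumulative schedule are assumptions made for the example; for categorical labels, an entropy-style measure over the evaluators' votes could replace the standard deviation used here.

```python
# Minimal sketch, assuming each sample stores the raw per-evaluator ratings for a
# continuous attribute (e.g., arousal on a 1-7 scale). Not the authors' code.
from statistics import pstdev

def difficulty(ratings):
    """Inter-evaluator disagreement (population std. of ratings): low = easy."""
    return pstdev(ratings)

def curriculum_stages(samples, n_stages=4):
    """Sort samples from highest to lowest agreement and split them into stages."""
    ordered = sorted(samples, key=lambda s: difficulty(s["ratings"]))
    size = -(-len(ordered) // n_stages)          # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

if __name__ == "__main__":
    # Toy data: three evaluators rated the arousal of each utterance.
    samples = [
        {"id": "utt_1", "ratings": [5, 5, 5]},   # full agreement   -> easy
        {"id": "utt_2", "ratings": [2, 6, 4]},   # high disagreement -> hard
        {"id": "utt_3", "ratings": [3, 4, 3]},
        {"id": "utt_4", "ratings": [1, 7, 4]},
    ]
    pool = []
    for k, stage in enumerate(curriculum_stages(samples, n_stages=2)):
        pool.extend(stage)                       # stage k trains on bins 0..k
        print(f"stage {k}: train on {[s['id'] for s in pool]}")
        # A speech-emotion DNN would be (re)trained on `pool` at this point.
```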
Pages: 815-826
Page count: 12
Related Papers
50 records in total
  • [1] Meta-Learning for Speech Emotion Recognition Considering Ambiguity of Emotion Labels
    Fujioka, Takuya
    Homma, Takeshi
    Nagamatsu, Kenji
    INTERSPEECH 2020, 2020, : 2332 - 2336
  • [2] Learning Attributes from the Crowdsourced Relative Labels
    Tian, Tian
    Chen, Ning
    Zhu, Jun
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1562 - 1568
  • [3] Emotion Recognition with Refined Labels for Deep Learning
    Zhang, Su
    Guan, Cuntai
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 108 - 111
  • [4] Learning Alignment for Multimodal Emotion Recognition from Speech
    Xu, Haiyang
    Zhang, Hui
    Han, Kun
    Wang, Yun
    Peng, Yiping
    Li, Xiangang
    INTERSPEECH 2019, 2019, : 3569 - 3573
  • [5] Emotion Recognition from Speech: An Unsupervised Learning Approach
    Rovetta, Stefano
    Mnasri, Zied
    Masulli, Francesco
    Cabri, Alberto
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01) : 23 - 35
  • [6] Representation Learning for Speech Emotion Recognition
    Ghosh, Sayan
    Laksana, Eugene
    Morency, Louis-Philippe
    Scherer, Stefan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3603 - 3607
  • [7] Speech Emotion Recognition with Deep Learning
    Harar, Pavol
    Burget, Radim
    Dutta, Malay Kishore
    2017 4TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2017, : 137 - 140
  • [8] Transfer Learning for Speech Emotion Recognition
    Han Zhijie
    Zhao, Huijuan
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 96 - 99
  • [9] AESR: Speech Recognition With Speech Emotion Recogniting Learning
    Han, RongQi
    Liu, Xin
    Zhang, Hui
    MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 91 - 101
  • [10] Speech Emotion Recognition Based on Joint Self-Assessment Manikins and Emotion Labels
    Chen, Jing-Ming
    Chang, Pao-Chi
    Liang, Kai-Wen
    2019 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2019), 2019, : 327 - 330