Semi-supervised Ladder Networks for Speech Emotion Recognition

Cited by: 0
Authors
Jian-Hua Tao
Jian Huang
Ya Li
Zheng Lian
Ming-Yue Niu
Affiliations
[1] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
[2] School of Artificial Intelligence, University of Chinese Academy of Sciences (CAS)
[3] CAS Center for Excellence in Brain Science and Intelligence Technology
Source
International Journal of Automation and Computing | 2019, Vol. 16
Keywords
Speech emotion recognition; the ladder network; semi-supervised learning; autoencoder; regularization
DOI
Not available
Abstract
As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefiting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing the supervised loss together with an auxiliary unsupervised cost function. The unsupervised auxiliary task yields more discriminative representations of the input features and also acts as a regularizer for the supervised emotion recognition task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed method achieves superior performance with a small amount of labelled data and outperforms the other methods.
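The objective the abstract describes, a supervised loss on labelled utterances plus a layer-wise unsupervised denoising cost computed on all data, can be written down compactly. Below is a minimal PyTorch sketch of that combined objective; it is a sketch under stated assumptions, not the architecture from the paper: the names (LadderSketch, noise_std, lam), the layer sizes, and the plain stacked decoder are illustrative, and it omits the batch normalization and learned combinator function of a full ladder network.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LadderSketch(nn.Module):
    """Encoder-decoder with a clean and a noise-corrupted encoder pass.

    All layer sizes are illustrative placeholders, not the paper's setup.
    """

    def __init__(self, dims=(384, 256, 128, 4), noise_std=0.3):
        super().__init__()
        self.noise_std = noise_std
        # Encoder: a stack of fully connected layers.
        self.enc = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)
        )
        # Decoder mirrors the encoder for layer-wise denoising.
        self.dec = nn.ModuleList(
            nn.Linear(dims[i + 1], dims[i])
            for i in reversed(range(len(dims) - 1))
        )

    def encode(self, x, noisy):
        """Return the activation at every layer; inject Gaussian noise if noisy."""
        h = x + self.noise_std * torch.randn_like(x) if noisy else x
        acts = [h]
        for i, layer in enumerate(self.enc):
            h = layer(h)
            if noisy:
                h = h + self.noise_std * torch.randn_like(h)
            if i < len(self.enc) - 1:  # no ReLU on the output (logit) layer
                h = F.relu(h)
            acts.append(h)
        return acts

    def forward(self, x):
        clean = self.encode(x, noisy=False)   # denoising targets
        noisy = self.encode(x, noisy=True)    # corrupted forward pass
        h, recon = noisy[-1], [noisy[-1]]
        for layer in self.dec:                # top-down reconstruction
            h = layer(h)
            recon.append(h)
        recon.reverse()                       # align with clean[0..L]
        return noisy[-1], clean, recon

def ladder_loss(logits, labels, clean, recon, lam=1.0):
    """Supervised cross-entropy (labelled data only) + layer-wise denoising cost."""
    sup = F.cross_entropy(logits, labels) if labels is not None else 0.0
    # Detaching the clean targets trains only the noisy/decoder path; whether
    # to detach is an implementation choice, not something the paper fixes.
    unsup = sum(F.mse_loss(r, c.detach()) for r, c in zip(recon, clean))
    return sup + lam * unsup

# Usage: one labelled and one unlabelled batch per training step.
model = LadderSketch()
x_lab, y = torch.randn(8, 384), torch.randint(0, 4, (8,))
x_unlab = torch.randn(32, 384)
logits, clean_l, recon_l = model(x_lab)
_, clean_u, recon_u = model(x_unlab)
loss = ladder_loss(logits, y, clean_l, recon_l) \
     + ladder_loss(None, None, clean_u, recon_u)
loss.backward()

The weight lam trades the unsupervised denoising cost off against the supervised loss, which is exactly the regularization role the abstract attributes to the auxiliary task.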
Pages: 437 - 448
Number of pages: 11
Related papers
50 records in total
  • [31] Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks
    Lee, Wonkyum
Han, Kyu J.
    Lane, Ian
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3843 - 3847
  • [32] Semi-Supervised Group Emotion Recognition Based on Contrastive Learning
    Zhang, Jiayi
    Wang, Xingzhi
    Zhang, Dong
    Lee, Dah-Jye
    ELECTRONICS, 2022, 11 (23)
  • [33] Semi-supervised Emotion Recognition using Inconsistently Annotated Data
    Happy, S. L.
    Dantcheva, Antitza
    Bremond, Francois
    2020 15TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2020), 2020, : 286 - 293
  • [34] Semi-Supervised Dictionary Learning of Sparse Representations for Emotion Recognition
    Kaechele, Markus
    Schwenker, Friedhelm
    PARTIALLY SUPERVISED LEARNING, PSL 2013, 2013, 8193 : 21 - 35
  • [35] USING COLLECTIVE INFORMATION IN SEMI-SUPERVISED LEARNING FOR SPEECH RECOGNITION
    Varadarajan, Balakrishnan
    Yu, Dong
    Deng, Li
    Acero, Alex
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4633 - +
  • [36] DEEP CONTEXTUALIZED ACOUSTIC REPRESENTATIONS FOR SEMI-SUPERVISED SPEECH RECOGNITION
    Ling, Shaoshi
    Liu, Yuzong
    Salazar, Julian
    Kirchhoff, Katrin
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6429 - 6433
  • [37] Unsupervised and semi-supervised adaptation of a hybrid speech recognition system
    Trmal, Jan
    Zelinka, Jan
    Mueller, Ludek
    PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 527 - 530
  • [38] Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition
    Higuchi, Yosuke
    Moritz, Niko
    Le Roux, Jonathan
    Hori, Takaaki
    INTERSPEECH 2021, 2021, : 726 - 730
  • [39] Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning
    Humayun, Mohammad Ali
    Hameed, Ibrahim A.
    Shah, Syed Muslim
    Khan, Sohaib Hassan
    Zafar, Irfan
    Bin Ahmed, Saad
    Shuja, Junaid
    APPLIED SCIENCES-BASEL, 2019, 9 (09):
  • [40] Emotion recognition using semi-supervised feature selection with speaker normalization
Sun, Y.
Wen, G.
    International Journal of Speech Technology, 2015, 18 (3) : 317 - 331