Cascaded Adversarial Learning for Speaker Independent Emotion Recognition

Cited by: 0

Authors:

Lekamalage, Chamara Kasun Liyanaarachchi [1]

Lin, Zhiping [1]

Huang, Guang-Bin [2]

Rajapakse, Jagath Chandana [3]

Affiliations:

[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore

[2] Mindpointeye, 18-06-07-08 Vis Exchange,2 Venture Dr, Singapore, Singapore

[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore

Source:

2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2022

Funding:

National Research Foundation, Singapore;

Keywords:

cascade networks; autoencoder; speaker independent emotion recognition (SIER); adversarial learning (AL); deep learning;

DOI:

10.1109/IJCNN55064.2022.9892223

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In contrast to traditional adversarial learning (AL), which learns speaker-invariant representations, this paper proposes cascaded adversarial learning (CAL), which learns speaker-invariant emotion data for speaker independent emotion recognition (SIER) tasks. CAL is a dual cascaded network architecture in which the output of the transformation network is fed as input to the classification network. The transformation network transforms the original speech emotion data into speaker-invariant emotion data by implementing an AL strategy with an encoder-decoder architecture. The classification network predicts the emotion from the speaker-invariant emotion data (the output of the transformation network). We argue that the speaker-invariant emotion data produced by the transformation network has less variation than the original speech emotion data and is therefore conducive to SIER, because it improves generalization capability. To our knowledge, this is the first time a dual cascaded network has been used for SIER, and it demonstrates state-of-the-art performance for SIER on the Emo-DB and RAVDESS datasets.
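The cascade described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the random initialisation, the placement of a speaker discriminator on the transformed data, and the loss form (emotion loss minus a weighted speaker loss, a common generic AL formulation) are all assumptions made for illustration, and every name below is hypothetical.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weight matrix for a dense layer (hypothetical initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x, activation=None):
    """Compute activation(W x) for a single feature vector x."""
    y = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [activation(v) for v in y] if activation else y


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class CascadeSketch:
    """Transformation network (encoder-decoder) cascaded into a classifier."""

    def __init__(self, n_features, n_hidden, n_emotions, n_speakers, seed=0):
        rng = random.Random(seed)
        self.enc = make_layer(n_features, n_hidden, rng)    # encoder
        self.dec = make_layer(n_hidden, n_features, rng)    # decoder
        self.emo = make_layer(n_features, n_emotions, rng)  # classification network
        # Assumption: the adversary (speaker discriminator) reads the decoder output.
        self.spk = make_layer(n_features, n_speakers, rng)

    def transform(self, x):
        """Map input features to data of the same dimensionality."""
        return dense(self.dec, dense(self.enc, x, math.tanh))

    def forward(self, x):
        x_inv = self.transform(x)  # output of the transformation network
        return {
            "transformed": x_inv,
            "emotion_probs": softmax(dense(self.emo, x_inv)),  # fed by the cascade
            "speaker_probs": softmax(dense(self.spk, x_inv)),
        }


def adversarial_objective(out, emotion, speaker, lam=1.0):
    """Generic AL objective (assumed): keep emotion predictable, speaker not."""
    emo_loss = -math.log(out["emotion_probs"][emotion])
    spk_loss = -math.log(out["speaker_probs"][speaker])
    return emo_loss - lam * spk_loss


# Usage with illustrative sizes and a toy feature vector.
model = CascadeSketch(n_features=8, n_hidden=4, n_emotions=7, n_speakers=10, seed=1)
out = model.forward([0.1 * i for i in range(8)])
loss = adversarial_objective(out, emotion=2, speaker=3)
```

The point of the sketch is the data flow: the classifier never sees the raw input, only the decoder output, which has the same dimensionality as the input. Training (alternating updates of the discriminator and the transformation network) is omitted.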


Pages: 8
