ISNet: Individual Standardization Network for Speech Emotion Recognition
Cited by: 23
Authors:
Fan, Weiquan [1]
Xu, Xiangmin [1]
Cai, Bolun [1]
Xing, Xiaofen [1]
Affiliation:
[1] South China Univ Technol, Sch Elect & Informat, Guangzhou 510640, Peoples R China
Speech emotion recognition plays an essential role in human-computer interaction. However, cross-individual representation learning and individual-agnostic systems are challenging due to the distribution deviation caused by individual differences. The existing related approaches mostly use the auxiliary task of speaker recognition to eliminate individual differences. Unfortunately, although these methods can reduce interindividual voiceprint differences, it is difficult to dissociate interindividual expression differences since each individual has its unique expression habits. In this paper, we propose an individual standardization network (ISNet) for speech emotion recognition to alleviate the problem of interindividual emotion confusion caused by individual differences. Specifically, we model individual benchmarks as representations of nonemotional neutral speech, and ISNet realizes individual standardization using the automatically generated benchmark, which improves the robustness of individual-agnostic emotion representations. In response to individual differences, we also propose more comprehensive and meaningful individual-level evaluation metrics. In addition, we continue our previous work to construct a challenging large-scale speech emotion dataset (LSSED). We propose a more reasonable division method of the training set and testing set to prevent individual information leakage. Experimental results on datasets of both large and small scales have proven the effectiveness of ISNet, and the new state-of-the-art performance is achieved under the same experimental conditions on IEMOCAP and LSSED.
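Two ideas in the abstract can be illustrated with a small Python sketch: standardizing each individual's features against a neutral-speech benchmark, and dividing data so that no individual appears in both the training and testing sets. This is not the authors' implementation. ISNet generates the benchmark automatically inside the network, whereas the sketch below simply assumes labeled neutral samples are available and uses their per-speaker mean. The function names, the `neutral_label` parameter, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def speaker_disjoint_split(speaker_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no speaker occurs in both sets."""
    speakers = sorted(set(speaker_ids))
    rng = random.Random(seed)
    rng.shuffle(speakers)
    # Hold out whole speakers, never individual utterances.
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = set(speakers[:n_test])
    train_idx = [i for i, s in enumerate(speaker_ids) if s not in test_speakers]
    test_idx = [i for i, s in enumerate(speaker_ids) if s in test_speakers]
    return train_idx, test_idx


def standardize_by_neutral(features, speaker_ids, labels, neutral_label="neutral"):
    """Subtract each speaker's mean neutral feature vector (the assumed benchmark).

    Simplification: the benchmark is the mean of labeled neutral samples.
    Speakers without any neutral sample are left unchanged.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, spk, lab in zip(features, speaker_ids, labels):
        if lab == neutral_label:
            if spk not in sums:
                sums[spk] = [0.0] * len(vec)
            sums[spk] = [a + b for a, b in zip(sums[spk], vec)]
            counts[spk] += 1
    benchmarks = {spk: [v / counts[spk] for v in total] for spk, total in sums.items()}

    standardized = []
    for vec, spk in zip(features, speaker_ids):
        bench = benchmarks.get(spk)
        if bench is None:
            standardized.append(list(vec))
        else:
            standardized.append([a - b for a, b in zip(vec, bench)])
    return standardized


# Toy data (hypothetical): two utterances per speaker, 2-dimensional features.
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]
labels = ["neutral", "happy", "neutral", "sad", "neutral", "angry"]
features = [[1.0, 1.0], [3.0, 2.0], [10.0, 10.0], [9.0, 8.0], [0.0, 0.0], [0.5, -0.5]]

standardized = standardize_by_neutral(features, speakers, labels)
train_idx, test_idx = speaker_disjoint_split(speakers, test_fraction=0.34)
```

After standardization, the large offset between speakers `s1` and `s2` disappears and only each utterance's deviation from that speaker's neutral baseline remains. Splitting by speaker rather than by utterance is one common way to keep individual information from leaking into the test set; the division method actually used for LSSED may differ.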

Wani, Taiba Majid
Affiliation: Int Islamic Univ Malaysia, Dept Elect & Comp Engn, Kuala Lumpur 53100, Malaysia
Gunawan, Teddy Surya
Affiliations: Int Islamic Univ Malaysia, Dept Elect & Comp Engn, Kuala Lumpur 53100, Malaysia; Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Qadri, Syed Asif Ahmad
Affiliation: Int Islamic Univ Malaysia, Dept Elect & Comp Engn, Kuala Lumpur 53100, Malaysia

Zhong, Shunming
Affiliations: South China Normal Univ SCNU, Sch Phys & Telecommun Engn, Guangzhou 510006, Peoples R China; SCNU, Guangdong Prov Engn Technol Res Ctr Cardiovasc In, Guangzhou 510006, Peoples R China
Yu, Baoxian
Affiliations: South China Normal Univ SCNU, Sch Phys & Telecommun Engn, Guangzhou 510006, Peoples R China; SCNU, Guangdong Prov Engn Technol Res Ctr Cardiovasc In, Guangzhou 510006, Peoples R China; SCNU Qingyuan Inst Sci & Technol Innovat Co Ltd, Qingyuan 511517, Peoples R China
Zhang, Han
Affiliations: South China Normal Univ SCNU, Sch Phys & Telecommun Engn, Guangzhou 510006, Peoples R China; SCNU, Guangdong Prov Engn Technol Res Ctr Cardiovasc In, Guangzhou 510006, Peoples R China; SCNU Qingyuan Inst Sci & Technol Innovat Co Ltd, Qingyuan 511517, Peoples R China

Yi, Lu
Affiliation: Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
Mak, Man-Wai
Affiliation: Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China