Robust Training for Speaker Verification against Noisy Labels

Cited by: 2

Authors:

Fang, Zhihua ^{[1,2]}

He, Liang ^{[1,2,3]}

Ma, Hanhan ^{[1,2]}

Guo, Xiaochen ^{[1,2]}

Li, Lin ^{[4]}

Affiliations:

[1] Xinjiang Univ, Sch Informat Sci & Engn, Urumqi 830017, Peoples R China

[2] Xinjiang Key Lab Signal Detect & Proc, Urumqi 830017, Peoples R China

[3] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China

[4] Xiamen Univ, Sch Elect Sci & Engn, Xiamen 361005, Peoples R China

Source:

INTERSPEECH 2023 | 2023

Funding:

National Key Research and Development Program of China;

Keywords:

speaker verification; speaker embedding; noisy label; early learning; curriculum learning;

DOI:

10.21437/Interspeech.2023-452

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

The deep learning models used for speaker verification rely heavily on large amounts of data and correct labeling. However, noisy (incorrect) labels often occur, which degrades system performance. In this paper, we propose a novel two-stage learning method to filter out noisy labels from speaker datasets. Since a DNN fits data with clean labels first, we first train the model on all data for several epochs. Then, based on this model, the model predictions are compared with the labels using our proposed OR-Gate with top-k mechanism to select the data with clean labels, and the selected data are used to train the model. This process is iterated until training is complete. We demonstrate the effectiveness of this method in filtering noisy labels through extensive experiments and achieve excellent performance on VoxCeleb (1 and 2) with different added noise rates.
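The selection step described in the abstract can be illustrated with a small sketch. The abstract does not define the OR-Gate precisely, so the code below assumes one plausible reading: a sample is treated as clean if its label equals any of the model's top-k predicted speakers, that is, a logical OR over k equality checks. The function names (`top_k_indices`, `or_gate_top_k`, `select_clean`) and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest scores, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def or_gate_top_k(scores: Sequence[float], label: int, k: int) -> bool:
    """Assumed gate: OR over the k comparisons 'top-j prediction == label'."""
    return any(pred == label for pred in top_k_indices(scores, k))


def select_clean(batch_scores: Sequence[Sequence[float]],
                 labels: Sequence[int], k: int) -> List[int]:
    """Return indices of samples whose given label passes the gate."""
    return [
        i for i, (scores, label) in enumerate(zip(batch_scores, labels))
        if or_gate_top_k(scores, label, k)
    ]


# Toy class scores for three utterances over three speakers (made-up values).
batch_scores = [
    [0.7, 0.2, 0.1],  # label 0: matches the top-1 prediction
    [0.5, 0.4, 0.1],  # label 1: matches only within the top-2
    [0.6, 0.3, 0.1],  # label 2: outside the top-2, treated as noisy
]
labels = [0, 1, 2]

kept_top1 = select_clean(batch_scores, labels, k=1)
kept_top2 = select_clean(batch_scores, labels, k=2)
print(kept_top1, kept_top2)
```

Under this reading, a larger k makes the filter more permissive. In the iterated scheme the abstract describes, the kept indices would define the training subset for the next round, after which the model is used to re-select.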

Citation

Pages: 3192-3196

Page count: 5