I-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification

Cited by: 2
Authors
Tan, Zhili [1]
Mak, Man-Wai [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
Deep learning; speaker verification; score calibration; multi-task learning; noise robustness; PLDA
DOI
10.21437/Interspeech.2017-656
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes applying multi-task learning to train deep neural networks (DNNs) for calibrating the PLDA scores of speaker verification systems under noisy environments. To help the DNNs learn the main task (calibration), several auxiliary tasks were introduced, including predicting the SNR and duration from i-vectors and classifying whether an i-vector pair belongs to the same speaker. The possibility of replacing the PLDA model with a DNN during the scoring stage is also explored. Evaluations on noise-contaminated speech suggest that the auxiliary tasks are important for the DNNs to learn the main calibration task and that the uncalibrated PLDA scores are an essential input to the DNNs. Without this input, the DNNs can only predict the score shifts accurately, suggesting that the PLDA model is indispensable.
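The multi-task setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the single shared hidden layer, the layer sizes, the loss functions, the `aux_weight` parameter, and all function and key names are assumptions made for illustration. The network takes an i-vector pair plus the uncalibrated PLDA score as input, with one head for the main calibration task and auxiliary heads for SNR, duration, and same-speaker classification. The calibration target `target["cal"]` is a placeholder for whatever target the training procedure defines.

```python
import numpy as np

def init_params(ivec_dim, hidden_dim, rng):
    """Randomly initialise a shared layer and one head per task (sizes are assumptions)."""
    in_dim = 2 * ivec_dim + 1  # i-vector pair plus the uncalibrated PLDA score

    def w(rows, cols):
        return rng.standard_normal((rows, cols)) * 0.1

    return {
        "W_h": w(in_dim, hidden_dim), "b_h": np.zeros(hidden_dim),
        "W_cal": w(hidden_dim, 1), "b_cal": np.zeros(1),  # main task: calibrated score
        "W_snr": w(hidden_dim, 2), "b_snr": np.zeros(2),  # auxiliary: SNR of each utterance
        "W_dur": w(hidden_dim, 2), "b_dur": np.zeros(2),  # auxiliary: duration of each utterance
        "W_spk": w(hidden_dim, 1), "b_spk": np.zeros(1),  # auxiliary: same-speaker logit
    }

def forward(params, ivec_a, ivec_b, plda_score):
    """Shared hidden layer followed by task-specific linear heads."""
    x = np.concatenate([ivec_a, ivec_b, plda_score[:, None]], axis=1)
    h = np.tanh(x @ params["W_h"] + params["b_h"])
    return {
        "cal": (h @ params["W_cal"] + params["b_cal"])[:, 0],
        "snr": h @ params["W_snr"] + params["b_snr"],
        "dur": h @ params["W_dur"] + params["b_dur"],
        "spk_logit": (h @ params["W_spk"] + params["b_spk"])[:, 0],
    }

def multitask_loss(out, target, aux_weight=0.5):
    """Main calibration loss plus a weighted sum of the auxiliary losses."""
    def mse(pred, true):
        return float(np.mean((pred - true) ** 2))

    # Numerically stable binary cross-entropy computed from logits.
    z, y = out["spk_logit"], target["same_spk"]
    bce = float(np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))))

    main = mse(out["cal"], target["cal"])
    aux = mse(out["snr"], target["snr"]) + mse(out["dur"], target["dur"]) + bce
    return main + aux_weight * aux
```

Setting `aux_weight=0.0` reduces the objective to the calibration loss alone, which is one simple way to compare training with and without the auxiliary tasks in this sketch.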
Pages: 1562-1566
Number of pages: 5