Noise robust speaker verification via the fusion of SNR-independent and SNR-dependent PLDA

Cited by: 1
Authors
Pang, Xiaomin [1]
Mak, Man-Wai [1]
Affiliation
[1] Hong Kong Polytechn Univ, Ctr Signal Proc, Dept Elect & Informat Engn, Kowloon, Hong Kong, Peoples R China
Keywords
Speaker verification; i-Vectors; Probabilistic LDA; NIST 2012 SRE; Noise robustness; Fusion
DOI
10.1007/s10772-015-9310-8
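Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract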
While i-vectors with probabilistic linear discriminant analysis (PLDA) can achieve state-of-the-art performance in speaker verification, the mismatch caused by acoustic noise remains a key factor affecting system performance. In this paper, a fusion system that combines a multi-condition, signal-to-noise ratio (SNR)-independent PLDA model with a mixture of SNR-dependent PLDA models is proposed to make speaker verification systems more robust to noise. First, the whole SNR range in which a verification system is expected to operate is divided into several narrow ranges. Then, a set of SNR-dependent PLDA models, one for each narrow SNR range, is trained. During verification, the SNR of the test utterance determines which of the SNR-dependent PLDA models is used for scoring. To further enhance performance, the SNR-dependent and SNR-independent models are fused using linear and logistic regression fusion. The performance of the fusion system and of the SNR-dependent system is evaluated on the NIST 2012 Speaker Recognition Evaluation (SRE) under both noisy and clean conditions. Results show that a mixture of SNR-dependent PLDA models performs better in both clean and noisy conditions. It was also found that the fusion system is more robust than conventional i-vector/PLDA systems under noisy conditions.
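The selection-and-fusion procedure described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the SNR bin edges (6 dB and 15 dB), the function names `select_model_index`, `fuse_scores` and `verify`, the fusion weights, and the stand-in scorers. In a real system the scorers would be trained PLDA models and the fusion weights would be estimated on development data, for example by logistic regression.

```python
from bisect import bisect_right

# Hypothetical bin edges in dB (an assumption, not the paper's values).
# They split the SNR axis into three ranges: below 6, 6 to 15, and 15 or above.
SNR_EDGES_DB = [6.0, 15.0]


def select_model_index(snr_db, edges=SNR_EDGES_DB):
    """Return the index of the SNR range that contains snr_db."""
    return bisect_right(edges, snr_db)


def fuse_scores(score_dep, score_indep, w_dep=0.5, w_indep=0.5, bias=0.0):
    """Linear fusion of the SNR-dependent and SNR-independent scores.

    The weights and bias are placeholders; they would normally be trained.
    """
    return w_dep * score_dep + w_indep * score_indep + bias


def verify(snr_db, dep_scorers, indep_scorer, enroll, test, threshold=0.0):
    """Score a trial with the matching SNR-dependent model, fuse, and decide."""
    dep_scorer = dep_scorers[select_model_index(snr_db)]
    fused = fuse_scores(dep_scorer(enroll, test), indep_scorer(enroll, test))
    return fused, fused >= threshold


def dot(a, b):
    """Stand-in similarity measure; a real system would use a PLDA score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Three stand-in SNR-dependent scorers, one per SNR range.
    dep_scorers = [
        lambda e, t: 0.8 * dot(e, t),  # low-SNR range
        lambda e, t: 0.9 * dot(e, t),  # medium-SNR range
        lambda e, t: 1.0 * dot(e, t),  # high-SNR range
    ]
    fused, accepted = verify(10.0, dep_scorers, dot, [1.0, 0.0], [1.0, 0.0])
    print(fused, accepted)
```

The sketch keeps model selection as a hard decision on the measured SNR of the test utterance, matching the description above; how the paper estimates SNR and trains each model is not covered here.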
Pages: 633-648
Number of pages: 16