Noise robust speaker verification via the fusion of SNR-independent and SNR-dependent PLDA

Cited by: 1

Authors:

Pang, Xiaomin [1]

Mak, Man-Wai [1]

Affiliation:

[1] Hong Kong Polytechn Univ, Ctr Signal Proc, Dept Elect & Informat Engn, Kowloon, Hong Kong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2015 / Vol. 18 / No. 4

Keywords:

Speaker verification; i-Vectors; Probabilistic LDA; NIST 2012 SRE; Noise robustness; Fusion

DOI:

10.1007/s10772-015-9310-8

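Chinese Library Classification number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract: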

While i-vectors with probabilistic linear discriminant analysis (PLDA) can achieve state-of-the-art performance in speaker verification, the mismatch caused by acoustic noise remains a key factor affecting system performance. In this paper, a fusion system that combines a multi-condition signal-to-noise ratio (SNR)-independent PLDA model and a mixture of SNR-dependent PLDA models is proposed to make speaker verification systems more noise robust. First, the whole range of SNR in which a verification system is expected to operate is divided into several narrow ranges. Then, a set of SNR-dependent PLDA models, one for each narrow SNR range, is trained. During verification, the SNR of the test utterance is used to determine which of the SNR-dependent PLDA models is used for scoring. To further enhance performance, the SNR-dependent and SNR-independent models are fused using linear and logistic regression fusion. The performance of the fusion system and the SNR-dependent system is evaluated on the NIST 2012 speaker recognition evaluation for both noisy and clean conditions. Results show that a mixture of SNR-dependent PLDA models performs better in both clean and noisy conditions. It was also found that the fusion system is more robust than the conventional i-vector/PLDA systems under noisy conditions.
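The selection-and-fusion procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the band edges in `SNR_EDGES_DB`, the function names, and the fusion weights are all hypothetical, and the PLDA scorers are stand-in callables rather than trained models. In practice the weights and bias of the affine fusion would be learned by logistic regression on development scores.

```python
from bisect import bisect_right

# Hypothetical band edges (dB); the abstract does not give the actual SNR ranges.
SNR_EDGES_DB = [3.0, 10.0, 20.0]  # four bands: <3, [3,10), [10,20), >=20


def select_band(snr_db, edges=SNR_EDGES_DB):
    """Index of the narrow SNR range (and SNR-dependent model) for an utterance."""
    return bisect_right(edges, snr_db)


def fuse_linear(score_dep, score_indep, weight=0.5):
    """Linear fusion: weighted sum of SNR-dependent and SNR-independent scores."""
    return weight * score_dep + (1.0 - weight) * score_indep


def fuse_affine(score_dep, score_indep, w_dep, w_indep, bias):
    """Affine fusion; w_dep, w_indep and bias are assumed to come from
    logistic regression trained on development scores (not shown here)."""
    return w_dep * score_dep + w_indep * score_indep + bias


def verify(snr_db, dep_scorers, indep_scorer, enrol, test, weight=0.5):
    """Score a trial with the SNR-matched model, then fuse with the
    multi-condition (SNR-independent) model's score."""
    band = select_band(snr_db)
    s_dep = dep_scorers[band](enrol, test)   # model chosen by test-utterance SNR
    s_ind = indep_scorer(enrol, test)        # single multi-condition model
    return fuse_linear(s_dep, s_ind, weight)
```

Each entry of `dep_scorers` is assumed to be a callable taking enrollment and test i-vectors and returning a PLDA score, with one entry per SNR band.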

References

Pages: 633-648

Number of pages: 16

41 entries in total

[31]

Prince SJD, 2007, IEEE I CONF COMP VIS, P1751

[32] From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification [J].

Rajan, Padmanabhan ;

Afanasyev, Anton ;

Hautamäki, Ville ;

Kinnunen, Tomi .

DIGITAL SIGNAL PROCESSING, 2014, 31 :93-101

[33] Boosting the Performance of I-Vector Based Speaker Verification via Utterance Partitioning [J].

Rao, Wei ;

Mak, Man-Wai .

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (05) :1012-1022

[34] Speaker verification using adapted Gaussian mixture models [J].

Reynolds, DA ;

Quatieri, TF ;

Dunn, RB .

DIGITAL SIGNAL PROCESSING, 2000, 10 (1-3) :19-41

[35]

Sadjadi S. O., 2014, P INTERSPEECH, P1860

[36]

Sadjadi SO, 2012, 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, P1694

[37]

Saeidi R., 2012, P NIST SPEAK REC EV

[38]

Shao Y, 2008, INT CONF ACOUST SPEE, P1589

[39]

van Leeuwen DA, 2013, INT CONF ACOUST SPEE, P6778, DOI 10.1109/ICASSP.2013.6638974

[40]

Yu C, 2014, I SYMP CONSUM ELECTR, P448
