PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0

Authors:

Kenny, Patrick

Stafylakis, Themos

Ouellet, Pierre

Alam, Md Jahangir

Dumouchel, Pierre

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013

Keywords:

speaker recognition; i-vectors; PLDA

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female, telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
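To make the notion of i-vector uncertainty concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard total variability model, in which the i-vector posterior given an utterance's Baum-Welch statistics is Gaussian with precision I + sum_c N_c T_c^T Sigma_c^{-1} T_c. All names, dimensions, and the random toy data are hypothetical and chosen only for illustration. The sketch shows that an utterance with fewer frames yields a broader posterior, which is the quantity the abstract proposes to propagate into the PLDA classifier.

```python
import numpy as np

def ivector_posterior(T, Sigma_inv, N, F):
    """Posterior mean and covariance of an i-vector (standard total variability model).

    T:         (C, D, R) per-component blocks of the total variability matrix
    Sigma_inv: (C, D)    diagonal UBM precisions per component
    N:         (C,)      zero-order statistics (soft frame counts)
    F:         (C, D)    centered first-order statistics
    """
    R = T.shape[2]
    # T_c^T Sigma_c^{-1} for each component, shape (C, R, D)
    TtS = np.einsum('cdr,cd->crd', T, Sigma_inv)
    # Posterior precision: I + sum_c N_c T_c^T Sigma_c^{-1} T_c
    precision = np.eye(R) + np.einsum('c,crd,cds->rs', N, TtS, T)
    cov = np.linalg.inv(precision)
    # Posterior mean: cov * sum_c T_c^T Sigma_c^{-1} F_c
    mean = cov @ np.einsum('crd,cd->r', TtS, F)
    return mean, cov

# Toy model (hypothetical sizes): 4 components, 3-dim features, 2-dim i-vectors.
rng = np.random.default_rng(0)
C, D, R = 4, 3, 2
T = rng.standard_normal((C, D, R))
Sigma_inv = np.ones((C, D))
offset = rng.standard_normal((C, D))  # per-frame mean shift from the UBM means

# Same speaker offset, different durations: 10 vs. 200 frames per component.
N_short = np.full(C, 10.0)
N_long = np.full(C, 200.0)
mean_short, cov_short = ivector_posterior(T, Sigma_inv, N_short, N_short[:, None] * offset)
mean_long, cov_long = ivector_posterior(T, Sigma_inv, N_long, N_long[:, None] * offset)

print("trace of posterior covariance (short):", np.trace(cov_short))
print("trace of posterior covariance (long): ", np.trace(cov_long))
```

In this toy setup the short utterance's posterior covariance has the larger trace, reflecting the lower reliability of i-vectors extracted from short utterances. A conventional i-vector/PLDA system keeps only the posterior mean and discards this covariance.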

Pages: 7649-7653

Number of pages: 5

