PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0
Authors
Kenny, Patrick
Stafylakis, Themos
Ouellet, Pierre
Alam, Md Jahangir
Dumouchel, Pierre
Affiliations
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
speaker recognition; i-vectors; PLDA
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so that researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female, telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
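The idea of propagating extraction uncertainty into the classifier can be illustrated with a scalar toy model. The sketch below is not the paper's formulation: it assumes a one-dimensional "i-vector" x = y + noise, where the speaker variable y has variance `between_var`, the noise has variance `within_var + u`, and u is a hypothetical per-utterance uncertainty that would be larger for shorter utterances. All names and default values are assumptions made for illustration. Under these assumptions, the verification score is the difference of two Gaussian log densities, whatever the number of enrollment utterances.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def verification_llr(enroll, test, between_var=1.0, within_var=0.1):
    """Log likelihood ratio for a scalar toy model (illustrative assumption).

    enroll: list of (x, u) pairs, one per enrollment utterance
    test:   a single (x, u) pair
    u is the assumed extraction uncertainty (variance) of that utterance.
    """
    # Posterior of the speaker variable y given all enrollment utterances.
    # Each utterance is weighted by the inverse of its total noise variance,
    # so uncertain (short) utterances contribute less.
    precision = 1.0 / between_var
    weighted_sum = 0.0
    for x, u in enroll:
        noise_var = within_var + u
        precision += 1.0 / noise_var
        weighted_sum += x / noise_var
    post_mean = weighted_sum / precision
    post_var = 1.0 / precision

    x_t, u_t = test
    # Same-speaker hypothesis: predictive density of the test utterance.
    same = log_gauss(x_t, post_mean, post_var + within_var + u_t)
    # Different-speaker hypothesis: marginal density of the test utterance.
    diff = log_gauss(x_t, 0.0, between_var + within_var + u_t)
    return same - diff


if __name__ == "__main__":
    confident = verification_llr([(1.0, 0.0)], (1.0, 0.0))
    uncertain = verification_llr([(1.0, 10.0)], (1.0, 10.0))
    print("low-uncertainty score: ", confident)
    print("high-uncertainty score:", uncertain)
```

In this toy setting, raising u for the same observations pulls the score toward zero, which is the intended effect of accounting for unreliable short utterances: the classifier becomes less willing to commit to either hypothesis.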
Pages: 7649-7653
Number of pages: 5
Related papers
50 in total
  • [31] NORMALIZATION OF TOTAL VARIABILITY MATRIX FOR I-VECTOR/PLDA SPEAKER VERIFICATION
    Rao, Wei
    Mak, Man-Wai
    Lee, Kong-Aik
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4180 - 4184
  • [32] On the use of Total Variability and Probabilistic Linear Discriminant Analysis for Speaker Verification on Short Utterances
    Gonzalez Dominguez, Javier
    Zazo, Ruben
    Gonzalez-Rodriguez, Joaquin
    ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, 2012, 328 : 11 - 19
  • [33] On Behaviour of PLDA Models in the Task of Speaker Recognition
    Machlica, Lukas
    Radova, Vlasta
    TEXT, SPEECH, AND DIALOGUE, TSD 2013, 2013, 8082 : 352 - 359
  • [34] Speaker Verification Using Gaussian Posteriorgrams on Fixed Phrase Short Utterances
    Jelil, Sarfaraz
    Das, Rohan Kumar
    Sinha, R.
    Prasanna, S. R. Mahadeva
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1042 - 1046
  • [35] Relevance Vector Machines with Empirical Likelihood-Ratio Kernels for PLDA Speaker Verification
    Rao, Wei
    Mak, Man-Wai
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 64 - 68
  • [36] I-VECTOR KULLBACK-LEIBLER DIVISIVE NORMALIZATION FOR PLDA SPEAKER VERIFICATION
    Pan, Yilin
    Zheng, Tieran
    Chen, Chen
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 56 - 60
  • [37] Robust discriminative training against data insufficiency in PLDA-based speaker verification
    Rohdin, Johan
    Biswas, Sangeeta
    Shinoda, Koichi
    COMPUTER SPEECH AND LANGUAGE, 2016, 35 : 32 - 57
  • [38] Dataset-Invariant Covariance Normalization for Out-domain PLDA Speaker Verification
    Rahman, Md Hafizur
    Kanagasundaram, Ahilan
    Dean, David
    Sridharan, Sridha
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1017 - 1021
  • [39] VARIATIONAL BAYESIAN PLDA FOR SPEAKER DIARIZATION IN THE MGB CHALLENGE
    Villalba, Jesus
    Ortega, Alfonso
    Miguel, Antonio
    Lleida, Eduardo
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 667 - 674
  • [40] A PLDA approach for language and text independent speaker recognition
    Khosravani, Abbas
    Homayounpour, Mohammad M.
    COMPUTER SPEECH AND LANGUAGE, 2017, 45 : 457 - 474