PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0

Authors:

Kenny, Patrick

Stafylakis, Themos

Ouellet, Pierre

Alam, Md Jahangir

Dumouchel, Pierre

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

speaker recognition; i-vectors; PLDA;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
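
The idea of scoring a trial with two full covariance Gaussians can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes a simplified two-covariance PLDA model with zero-mean i-vectors, a single enrollment utterance, and per-utterance uncertainty covariances simply added to the within-speaker covariance. The names `B` (between-speaker covariance), `W` (within-speaker covariance), `U1` and `U2` (i-vector posterior covariances), and both function names are hypothetical.

```python
import numpy as np


def log_gauss_zero_mean(x, cov):
    """Log density of a zero-mean Gaussian with full covariance `cov` at `x`."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    maha = x @ np.linalg.solve(cov, x)  # x^T cov^{-1} x
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def llr_with_uncertainty(w1, U1, w2, U2, B, W):
    """Log likelihood ratio (same speaker vs. different speakers).

    Assumed model (illustrative only): w = y + e, with y ~ N(0, B) shared
    within a speaker and e ~ N(0, W + U), where U is the uncertainty of
    the i-vector extracted from that utterance.
    """
    T1 = B + W + U1  # total covariance of the enrollment i-vector
    T2 = B + W + U2  # total covariance of the test i-vector
    x = np.concatenate([w1, w2])
    zeros = np.zeros_like(B)
    # Same speaker: the two i-vectors are correlated through B.
    cov_same = np.block([[T1, B], [B, T2]])
    # Different speakers: the two i-vectors are independent.
    cov_diff = np.block([[T1, zeros], [zeros, T2]])
    return log_gauss_zero_mean(x, cov_same) - log_gauss_zero_mean(x, cov_diff)
```

In this sketch a short utterance would be given a larger `U`, which inflates its total covariance and pulls the score toward zero, so unreliable i-vectors contribute less evidence in either direction.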

Pages: 7649-7653

Page count: 5
