PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0

Authors:

Kenny, Patrick

Stafylakis, Themos

Ouellet, Pierre

Alam, Md Jahangir

Dumouchel, Pierre

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

speaker recognition; i-vectors; PLDA

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so that researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female, telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
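
Illustrative note: the idea of propagating i-vector uncertainty into the scoring step can be sketched with a simplified two-covariance Gaussian model. This is not the formulation from the paper. It assumes zero-mean i-vectors, a between-speaker covariance, a within-speaker covariance, and a per-utterance uncertainty covariance that is simply added to the within-speaker term. All function and parameter names (`gaussian_logpdf`, `verification_llr`, `between`, `within`, `enroll_unc`, `test_unc`) are hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, cov):
    """Log density at x of a zero-mean Gaussian with full covariance cov."""
    dim = x.shape[0]
    chol = np.linalg.cholesky(cov)            # cov = chol @ chol.T
    z = np.linalg.solve(chol, x)              # whitened vector
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (dim * np.log(2.0 * np.pi) + log_det + z @ z)


def verification_llr(enroll, test, between, within, enroll_unc, test_unc):
    """Log likelihood ratio: same speaker versus different speakers.

    Assumption (not taken from the paper): each i-vector is
    speaker factor + channel noise + extraction noise, so the
    utterance-specific uncertainty covariance adds to `within`.
    """
    total_enroll = between + within + enroll_unc
    total_test = between + within + test_unc
    # Same-speaker hypothesis: the two i-vectors share the speaker factor,
    # so they are jointly Gaussian with cross-covariance `between`.
    joint_cov = np.block([[total_enroll, between],
                          [between, total_test]])
    same = gaussian_logpdf(np.concatenate([enroll, test]), joint_cov)
    # Different-speaker hypothesis: the two i-vectors are independent.
    diff = (gaussian_logpdf(enroll, total_enroll)
            + gaussian_logpdf(test, total_test))
    return same - diff
```

In this sketch, a short utterance would be given a larger uncertainty covariance, which pulls the score toward zero, so that unreliable i-vectors carry less evidence in either direction.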


Pages: 7649-7653

Page count: 5
