PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0
Authors
Kenny, Patrick
Stafylakis, Themos
Ouellet, Pierre
Alam, Md Jahangir
Dumouchel, Pierre
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013
Keywords
speaker recognition; i-vectors; PLDA
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, the cost is still quite modest, as it reduces to computing the probability density functions of two full-covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
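The idea of propagating i-vector uncertainty into the likelihood ratio can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a simplified zero-mean two-covariance model, a single enrollment utterance, and hypothetical names (`B` for between-speaker covariance, `W` for within-speaker covariance, `U_enroll` and `U_test` for per-utterance extraction uncertainty, which would be larger for shorter utterances). Under those assumptions the score is the difference of two full-covariance Gaussian log densities.

```python
import numpy as np


def gaussian_logpdf(x, cov):
    """Log density of a zero-mean Gaussian with full covariance `cov` at `x`."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def llr_with_uncertainty(w_enroll, w_test, B, W, U_enroll, U_test):
    """Log-likelihood ratio (same speaker vs. different speakers).

    Assumed model (illustrative only): w = y + e + n, with
    y ~ N(0, B) speaker factor, e ~ N(0, W) session noise, and
    n ~ N(0, U) extraction uncertainty specific to each utterance.
    """
    x = np.concatenate([w_enroll, w_test])
    total_enroll = B + W + U_enroll
    total_test = B + W + U_test
    zeros = np.zeros_like(B)
    # Same speaker: the two i-vectors share y, so they are correlated through B.
    cov_same = np.block([[total_enroll, B], [B, total_test]])
    # Different speakers: the two i-vectors are independent.
    cov_diff = np.block([[total_enroll, zeros], [zeros, total_test]])
    return gaussian_logpdf(x, cov_same) - gaussian_logpdf(x, cov_diff)


if __name__ == "__main__":
    dim = 2
    B = 2.0 * np.eye(dim)
    W = 0.5 * np.eye(dim)
    w = np.array([1.5, -1.0])
    reliable = 0.01 * np.eye(dim)     # stands in for a long utterance
    unreliable = 100.0 * np.eye(dim)  # stands in for a very short utterance
    print(llr_with_uncertainty(w, w, B, W, reliable, reliable))
    print(llr_with_uncertainty(w, w, B, W, unreliable, unreliable))
```

In this sketch, a large uncertainty covariance pulls the score toward zero, so a short, unreliable utterance yields a less decisive verification decision than a long one with the same i-vector.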
Pages: 7649-7653
Page count: 5
Related papers
50 in total
  • [21] INTRA-CLASS COVARIANCE ADAPTATION IN PLDA BACK-ENDS FOR SPEAKER VERIFICATION
    Madikeri, Srikanth
    Ferras, Marc
    Motlicek, Petr
    Dey, Subhadeep
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5365 - 5369
  • [22] PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker Verification
    Stafylakis, Themos
    Kenny, Patrick
    Senoussaoui, Mohammed
    Dumouchel, Pierre
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1690 - 1693
  • [23] Analysis of the Influence of Speech Corpora in the PLDA Verification in the Task of Speaker Recognition
    Machlica, Lukas
    Zajic, Zbynek
    TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 464 - 471
  • [24] Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?
    Wang, Qiongqiong
    Lee, Kong Aik
    Liu, Tianchi
    INTERSPEECH 2022, 2022, : 600 - 604
  • [25] Sparse kernel machines with empirical kernel maps for PLDA speaker verification
    Rao, Wei
    Mak, Man-Wai
    COMPUTER SPEECH AND LANGUAGE, 2016, 38 : 104 - 121
  • [26] Non-linear PLDA for i-Vector Speaker Verification
    Novoselov, Sergey
    Pekhovsky, Timur
    Kudashev, Oleg
    Mendelev, Valentin
    Prudnikov, Alexey
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 214 - 218
  • [27] Duration compensation of i-vectors for short duration speaker verification
    Ma, Jianbo
    Sethu, Vidhyasaharan
    Ambikairajah, Eliathamby
    Lee, Kong Aik
    ELECTRONICS LETTERS, 2017, 53 (06) : 405 - 407
  • [28] Improving X-vector and PLDA for Text-dependent Speaker Verification
    Chen, Zhuxin
    Lin, Yue
    INTERSPEECH 2020, 2020, : 726 - 730
  • [29] Fusion of SNR-Dependent PLDA Models for Noise Robust Speaker Verification
    Pang, Xiaomin
    Mak, Man-Wai
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 619 - 623
  • [30] UNSUPERVISED DOMAIN ADAPTATION OF NEURAL PLDA USING SEGMENT PAIRS FOR SPEAKER VERIFICATION
    Ulgen, I. Rasim
    Arslan, Levent M.
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 571 - 576