PLDA FOR SPEAKER VERIFICATION WITH UTTERANCES OF ARBITRARY DURATION

Cited by: 0
Authors
Kenny, Patrick
Stafylakis, Themos
Ouellet, Pierre
Alam, Md Jahangir
Dumouchel, Pierre
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
speaker recognition; i-vectors; PLDA;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The duration of speech segments has traditionally been controlled in the NIST speaker recognition evaluations, so researchers working in this framework have been relieved of the responsibility of dealing with the duration variability that arises in practical applications. The fixed-dimensional i-vector representation of speech utterances is ideal for working under such controlled conditions, and ignoring the fact that i-vectors extracted from short utterances are less reliable than those extracted from long utterances leads to a very simple formulation of the speaker recognition problem. However, a more realistic approach seems to be needed to handle duration variability properly. In this paper, we show how to quantify the uncertainty associated with the i-vector extraction process and propagate it into a PLDA classifier. We evaluated this approach using test sets derived from the NIST 2010 core and extended core conditions by randomly truncating the utterances in the female, telephone speech trials so that the durations of all enrollment and test utterances lay in the range 3-60 seconds, and we found that it led to substantial improvements in accuracy. Although the likelihood ratio computation for speaker verification is more computationally expensive than in the standard i-vector/PLDA classifier, it is still quite modest, as it reduces to computing the probability density functions of two full-covariance Gaussians (irrespective of the number of utterances used to enroll a speaker).
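The idea of propagating i-vector uncertainty into a Gaussian scorer can be illustrated with a minimal sketch. This is not the formulation from the paper: it assumes a simplified two-covariance model in which each i-vector is the global mean plus a speaker term with covariance `B`, a residual with covariance `W`, and an utterance-specific uncertainty covariance (`U1`, `U2`) that would be larger for shorter utterances. All names (`log_gauss`, `llr_with_uncertainty`, `B`, `W`, `U1`, `U2`) are hypothetical and chosen for illustration. Under these assumptions the verification score for one enrollment and one test i-vector is the difference of two full-covariance Gaussian log densities.

```python
import numpy as np


def log_gauss(x, mean, cov):
    """Log density of a full-covariance Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + maha)


def llr_with_uncertainty(w1, w2, m, B, W, U1, U2):
    """Log-likelihood ratio (same speaker vs. different speakers).

    Assumed model (illustrative, not taken from the paper):
      w_i = m + y + e_i,  y ~ N(0, B),  e_i ~ N(0, W + U_i)
    where U_i is the uncertainty covariance of i-vector i.
    """
    T1 = B + W + U1  # total covariance of the enrollment i-vector
    T2 = B + W + U2  # total covariance of the test i-vector
    x = np.concatenate([w1, w2])
    mean = np.concatenate([m, m])
    zeros = np.zeros_like(B)
    # Same speaker: the shared speaker term correlates the two i-vectors.
    cov_same = np.block([[T1, B], [B, T2]])
    # Different speakers: the two i-vectors are independent.
    cov_diff = np.block([[T1, zeros], [zeros, T2]])
    return log_gauss(x, mean, cov_same) - log_gauss(x, mean, cov_diff)


# Toy usage with made-up 2-dimensional parameters.
m = np.zeros(2)
B = np.eye(2)
W = 0.1 * np.eye(2)
w = np.array([1.0, 1.0])
score_reliable = llr_with_uncertainty(w, w, m, B, W, 0.0 * B, 0.0 * B)
score_uncertain = llr_with_uncertainty(w, w, m, B, W, 100.0 * B, 100.0 * B)
```

In this toy setup, inflating the uncertainty covariances pulls the score toward zero, which reflects the intuition that unreliable i-vectors from short utterances should yield less confident decisions.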
Pages: 7649-7653
Page count: 5
Related papers
50 in total
  • [41] Joint Estimation of PLDA and Nonlinear Transformations of Speaker Vectors
    Cumani, Sandro
    Laface, Pietro
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (10) : 1890 - 1900
  • [42] Improved i-vector extraction technique for speaker verification with short utterances
    Poddar A.
    Sahidullah M.
    Saha G.
    International Journal of Speech Technology, 2018, 21 (03) : 473 - 488
  • [43] Denoising autoencoder-based speaker feature restoration for utterances of short duration
    Yamamoto, Hitoshi
    Koshinaka, Takafumi
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1052 - 1056
  • [44] DEEP NEURAL NETWORK BASED DISCRIMINATIVE TRAINING FOR I-VECTOR/PLDA SPEAKER VERIFICATION
    Zheng Tieran
    Han Jiqing
    Zheng Guibin
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5354 - 5358
  • [45] DOMAIN ADAPTATION USING MAXIMUM LIKELIHOOD LINEAR TRANSFORMATION FOR PLDA-BASED SPEAKER VERIFICATION
    Wang, Qiongqiong
    Yamamoto, Hitoshi
    Koshinaka, Takafumi
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5110 - 5114
  • [46] Turkish Text-Dependent Speaker Verification using i-vector/PLDA Approach
    Hanilci, Cemal
    Celiktas, Havva
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [47] Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System
    Hong, QingYang
    Li, Lin
    Li, Ming
    Huang, Ling
    Wan, Lihong
    Zhang, Jun
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1037 - 1041
  • [48] Iterative PLDA Adaptation for Speaker Diarization
    Le Lan, Gael
    Charlet, Delphine
    Larcher, Anthony
    Meignier, Sylvain
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2175 - 2179
  • [49] Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification
    Tur, Gokhan
    Shriberg, Elizabeth
    Stolcke, Andreas
    Kajarekar, Sachin
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2664 - 2667
  • [50] Blind score normalization method for PLDA based speaker recognition
    Doroshin, Danila
    Lubimov, Nikolay
    Nastasenko, Marina
    Kotov, Mikhail
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 210 - 213