Duration Dependent Covariance Regularization in PLDA Modeling for Speaker Verification

被引:0
|
作者
Cai, Weicheng [2 ,3 ]
Li, Ming [1 ,2 ]
Li, Lin [4 ]
Hong, Qingyang [4 ]
机构
[1] Sun Yat Sen Univ, SYSU CMU Joint Inst Engn, Guangzhou, Guangdong, Peoples R China
[2] SYSU CMU Shunde Int Joint Res Inst, Shunde, Guangdong, Peoples R China
[3] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou, Guangdong, Peoples R China
[4] Xiamen Univ, Sch Informat Sci & Technol, Xiamen, Peoples R China
来源
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015年
关键词
PLDA; covariance regularization; i-vector; speaker verification; duration; ROBUST;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In this paper, we present a covariance regularized probabilistic linear discriminant analysis (CR-PLDA) model for text independent speaker verification. In the conventional simplified PLDA modeling, the covariance matrix used to capture the residual energies is globally shared for all i-vectors. However, we believe that the point estimated i-vectors from longer speech utterances may be more accurate and their corresponding co-variances in the PLDA modeling should be smaller. Similar to the inverse 0th order statistics weighted covariance in the i-vector model training, we propose a duration dependent normalized exponential term containing the duration normalizing factor mu and duration extent factor v to regularize the covariance in the PLDA modeling. Experimental results are reported on the NIST SRE 2010 common condition 5 female part task and the NIST 2014 i-vector machine learning challenge, respectively. For both tasks, the proposed covariance regularized PLDA system outperforms the baseline PLDA system by more than 13% relatively in terms of equal error rate (EER) and norm minDCF values.
引用
收藏
页码:1027 / 1031
页数:5
相关论文
共 50 条
  • [41] UNSUPERVISED DOMAIN ADAPTATION OF NEURAL PLDA USING SEGMENT PAIRS FOR SPEAKER VERIFICATION
    Ulgen, I. Rasim
    Arslan, Levent M.
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 571 - 576
  • [42] PLDA in the i-supervector space for text-independent speaker verification
    Jiang, Ye
    Lee, Kong Aik
    Wang, Longbiao
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014, : 1 - 13
  • [43] Improving PLDA speaker verification performance using domain mismatch compensation techniques
    Rahman, Md Hafizur
    Kanagasundaram, Ahilan
    Himawan, Ivan
    Dean, David
    Sridharan, Sridha
    COMPUTER SPEECH AND LANGUAGE, 2018, 47 : 240 - 258
  • [44] CHANNEL ADAPTATION OF PLDA FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Chen, Liping
    Lee, Kong Aik
    Ma, Bin
    Guo, Wu
    Li, Haizhou
    Dai, Li Rong
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5251 - 5255
  • [45] DNN-Driven Mixture of PLDA for Robust Speaker Verification
    Li, Na
    Mak, Man-Wai
    Chien, Jen-Tzung
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1371 - 1383
  • [46] Noise robust speaker verification via the fusion of SNR-independent and SNR-dependent PLDA
    Pang, Xiaomin
    Mak, Man-Wai
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, 18 (04) : 633 - 648
  • [47] I-VECTOR KULLBACK-LEIBLER DIVISIVE NORMALIZATION FOR PLDA SPEAKER VERIFICATION
    Pan, Yilin
    Zheng, Tieran
    Chen, Chen
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 56 - 60
  • [48] Robust discriminative training against data insufficiency in PLDA-based speaker verification
    Rohdin, Johan
    Biswas, Sangeeta
    Shinoda, Koichi
    COMPUTER SPEECH AND LANGUAGE, 2016, 35 : 32 - 57
  • [49] Autonomous selection of i-vectors for PLDA modelling in speaker verification
    Biswas, Sangeeta
    Rohdin, Johan
    Shinoda, Koichi
    SPEECH COMMUNICATION, 2015, 72 : 32 - 46
  • [50] BAYESIAN ESTIMATION OF PLDA WITH NOISY TRAINING LABELS, WITH APPLICATIONS TO SPEAKER VERIFICATION
    Borgstrom, Bengt J.
    Torres-Carrasquillo, Pedro
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7594 - 7598