A study on the effects of using short utterance length development data in the design of GPLDA speaker verification systems

Cited by: 7
Authors
Kanagasundaram A. [1, 2]
Dean D. [2 ]
Sridharan S. [2 ]
Ghaemmaghami H. [2 ]
Fookes C. [2 ]
Affiliations
[1] Department of Electrical & Electronic Engineering, Faculty of Engineering, University of Jaffna, Kilinochchi
[2] Speech Research Lab, SAIVT, Queensland University of Technology, Brisbane, QLD
Funding
Australian Research Council
Keywords
Linear Discriminant Analysis; Equal Error Rate; Speaker Verification; Session Data; Universal Background Model;
DOI
10.1007/s10772-017-9402-8
Abstract
This paper studies the performance degradation of a Gaussian probabilistic linear discriminant analysis (GPLDA) speaker verification system when only short-utterance data is used for system development. A number of techniques, including utterance partitioning and source-normalised weighted linear discriminant analysis (SN-WLDA) projections, are then introduced to improve speaker verification performance in such conditions. Experimental studies have found that when short-utterance data is available for speaker verification development, the GPLDA system achieves its best overall performance with a lower number of universal background model (UBM) components. This is a useful observation, as a lower number of UBM components significantly reduces the computational complexity of the speaker verification system. In limited session data conditions, we propose a simple utterance-partitioning technique which, when applied to the LDA-projected GPLDA system, shows over 8% relative improvement in EER over the baseline system on NIST 2008 truncated 10–10 s conditions. We conjecture that this improvement arises from the apparent increase in the number of sessions produced by our partitioning technique, which helps to better model the GPLDA parameters. Further, applying partitioning to the SN-WLDA-projected GPLDA system shows over 16% and 6% relative improvement in EER over LDA-projected GPLDA systems on NIST 2008 truncated 10–10 s interview-interview conditions and on NIST 2010 truncated 10–10 s interview-interview and telephone-telephone conditions, respectively. © 2017, Springer Science+Business Media New York.
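The abstract does not spell out how utterances are partitioned or how the relative EER figures are computed, so the following is only a minimal sketch under stated assumptions: it assumes an utterance is a sequence of feature frames that is cut into contiguous, near-equal chunks (each chunk then being treated as a separate session), and that "relative improvement" means the percentage reduction in EER with respect to the baseline. The function names `partition_utterance` and `relative_improvement` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def partition_utterance(frames: Sequence, n_parts: int) -> List[Sequence]:
    """Split one utterance's frames into n_parts contiguous, near-equal chunks.

    Assumption: contiguous equal-length splitting; the paper's actual
    partitioning scheme may differ.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    if n_parts > len(frames):
        raise ValueError("cannot create more parts than frames")
    base, extra = divmod(len(frames), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks receive one additional frame each.
        size = base + (1 if i < extra else 0)
        parts.append(frames[start:start + size])
        start += size
    return parts


def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Relative EER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer
```

Under these assumptions, partitioning each development utterance into `n_parts` chunks multiplies the apparent number of sessions per speaker by `n_parts`, which is the effect the abstract conjectures helps GPLDA parameter estimation. An EER falling from a hypothetical 20.0% to 18.0% would correspond to a 10% relative improvement.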
Pages: 247-259
Page count: 12