A study on the effects of using short utterance length development data in the design of GPLDA speaker verification systems

Cited by: 7

Authors:

Kanagasundaram A. [1,2]

Dean D. [2]

Sridharan S. [2]

Ghaemmaghami H. [2]

Fookes C. [2]

Affiliations:

[1] Department of Electrical & Electronic Engineering, Faculty of Engineering, University of Jaffna, Kilinochchi

[2] Speech Research Lab, SAIVT, Queensland University of Technology, Brisbane, QLD

Source:

International Journal of Speech Technology | 2017 / Vol. 20 / Issue 2

Funding:

Australian Research Council

Keywords:

Linear Discriminant Analysis; Equal Error Rate; Speaker Verification; Session Data; Universal Background Model;

DOI:

10.1007/s10772-017-9402-8

Abstract:

This paper studies the performance degradation of a Gaussian probabilistic linear discriminant analysis (GPLDA) speaker verification system when only short-utterance data is used for system development. A number of techniques, including utterance partitioning and source-normalised weighted linear discriminant analysis (SN-WLDA) projections, are then introduced to improve speaker verification performance in such conditions. Experimental studies found that when short-utterance data is available for system development, the GPLDA system overall achieves its best performance with a lower number of universal background model (UBM) components. This is a useful observation, because a lower number of UBM components significantly reduces the computational complexity of the speaker verification system. In limited session data conditions, we propose a simple utterance-partitioning technique which, when applied to the LDA-projected GPLDA system, shows over 8% relative improvement in EER over the baseline system on NIST 2008 truncated 10–10 s conditions. We conjecture that this improvement arises from the apparent increase in the number of sessions produced by partitioning, which helps to better model the GPLDA parameters. Further, partitioning with the SN-WLDA-projected GPLDA system shows over 16% and 6% relative improvement in EER over LDA-projected GPLDA systems on NIST 2008 truncated 10–10 s interview-interview conditions and on NIST 2010 truncated 10–10 s interview-interview and telephone-telephone conditions, respectively. © 2017, Springer Science+Business Media New York.
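The abstract does not say how utterances are partitioned, so the following is only a minimal sketch under stated assumptions: each utterance is taken to be a matrix of frame-level features, and partitioning is taken to mean splitting it into contiguous, roughly equal segments that are then treated as separate pseudo-sessions. The function names and the EER figures in the usage note are hypothetical and are not taken from the paper.

```python
import numpy as np


def partition_utterance(frames: np.ndarray, n_parts: int) -> list:
    """Split a (num_frames, feat_dim) feature matrix into n_parts contiguous segments.

    Assumption: contiguous, near-equal splits. The paper's actual
    partitioning scheme is not described in the abstract.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    if frames.shape[0] < n_parts:
        raise ValueError("utterance has fewer frames than requested partitions")
    # np.array_split tolerates frame counts that are not divisible by n_parts.
    return np.array_split(frames, n_parts, axis=0)


def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Relative EER improvement over a baseline, in percent."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer
```

For example, with purely illustrative numbers, `relative_improvement(20.0, 18.0)` gives a 10% relative improvement: an absolute drop of 2 EER points measured against a 20% baseline. Each segment returned by `partition_utterance` would then go through feature-to-i-vector extraction independently, which is one way the "apparent increase in the number of sessions" could arise.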

Citation

Pages: 247-259

Page count: 12

49 items in total

[1] Brief Review of Short Utterance Speaker Verification Systems
Nirmal, Asmita
Jayaswal, Deepak
BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (14) : 419 - 426
[2] Using Voice Quality Features to Improve Short-Utterance, Text-Independent Speaker Verification Systems
Park, Soo Jin
Yeung, Gary
Kreiman, Jody
Keating, Patricia A.
Alwan, Abeer
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017 : 1522 - 1526
[3] Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques
Kanagasundaram, A.
Dean, D.
Sridharan, S.
Gonzalez-Dominguez, J.
Gonzalez-Rodriguez, J.
Ramos, D.
SPEECH COMMUNICATION, 2014, 59 : 69 - 82
[4] Few-shot short utterance speaker verification using meta-learning
Wang W.
Zhao H.
Yang Y.
Chang Y.
You H.
PeerJ Computer Science, 2023, 9
[5] Few-shot short utterance speaker verification using meta-learning
Wang, Weijie
Zhao, Hong
Yang, Yikun
Chang, YouKang
You, Haojie
PEERJ COMPUTER SCIENCE, 2023, 9
[6] An efficient text-independent speaker verification for short utterance data from Mobile devices
Arora, Sanghamitra V.
Vig, Rekha
MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (3-4) : 3049 - 3074
[7] An efficient text-independent speaker verification for short utterance data from Mobile devices
Sanghamitra V. Arora
Rekha Vig
Multimedia Tools and Applications, 2020, 79 : 3049 - 3074
[8] Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification
Sarkar, A. K.
Matrouf, D.
Bousquet, P. M.
Bonastre, J. F.
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 2661 - 2664
[9] I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification
Zhang, Jiacen
Inoue, Nakamasa
Shinoda, Koichi
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018 : 3613 - 3617
[10] CNN-based joint mapping of short and long utterance i-vectors for speaker verification using short utterances
Guo, Jinxi
Nookala, Usha Amrutha
Alwan, Abeer
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017 : 3712 - 3716
