Consideration of Varying Training Lengths for Short-Duration Speaker Verification

Cited by: 0
Authors
Ko, WooSeok [1]
Um, Seyun [1]
Piao, Zhenyu [1]
Kang, Hong-goo [1]
Affiliations
[1] Yonsei Univ, Seoul, South Korea
Source
2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023
Keywords
Speaker verification; short utterance; training scheme
DOI
10.1109/APSIPAASC58517.2023.10317214
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present an efficient training scheme for speaker verification (SV) networks in short-duration speech input scenarios. We analyze the effects of varying training lengths on SV performance, with a particular focus on short utterances. Despite the high demand for short-duration SV in real-world applications, state-of-the-art SV systems have primarily been evaluated on long utterances, and little research has been conducted on short-duration SV. By considering the innate characteristics of SV architectures and the performance discrepancies associated with varying training data lengths, we propose a training scheme that accounts for varying length conditions. We categorize speaker characteristics as coarse-grained and fine-grained features and demonstrate that training models to learn both features can result in length-robust speaker embeddings. Our proposed training scheme improves model performance by 28.7% and 37.9% in terms of equal error rate on short-duration speech scenarios compared to baseline models.
Pages: 139-144
Page count: 6
Related papers
  • [1] PHONE ADAPTIVE TRAINING FOR SHORT-DURATION SPEAKER VERIFICATION
    Soldi, Giovanni
    Bozonnet, Simon
    Beaugeant, Christophe
    Evans, Nicholas
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015: 2107-2111
  • [2] Deep Speaker Embeddings for Short-Duration Speaker Verification
    Bhattacharya, Gautam
    Alam, Jahangir
    Kenny, Patrick
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 1517-1521
  • [3] Transfer Learning for Speaker Verification with Short-Duration Audio
    Fathima, Noor
    Simha, J. B.
    Abhi, Shinu
    SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 5, SMARTCOM 2024, 2024, 949: 195-205
  • [4] The Sogou System for Short-duration Speaker Verification Challenge 2021
    Yan, Jie
    Yao, Shengyu
    Pan, Yiqian
    Chen, Wei
    INTERSPEECH 2021, 2021: 2327-2331
  • [5] The SJTU System for Short-duration Speaker Verification Challenge 2021
    Han, Bing
    Chen, Zhengyang
    Zhou, Zhikai
    Qian, Yanmin
    INTERSPEECH 2021, 2021: 2332-2336
  • [6] The TalTech Systems for the Short-duration Speaker Verification Challenge 2020
    Alumae, Tanel
    Valk, Jorgen
    INTERSPEECH 2020, 2020: 746-750
  • [7] The XMUSPEECH System for Short-Duration Speaker Verification Challenge 2020
    Jiang, Tao
    Zhao, Miao
    Li, Lin
    Hong, Qingyang
    INTERSPEECH 2020, 2020: 736-740
  • [8] UIAI SYSTEM FOR SHORT-DURATION SPEAKER VERIFICATION CHALLENGE 2020
    Sahidullah, Md
    Sarkar, Achintya Kumar
    Vestman, Ville
    Liu, Xuechen
    Serizel, Romain
    Kinnunen, Tomi
    Tan, Zheng-Hua
    Vincent, Emmanuel
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021: 323-329
  • [9] Investigation of NICT submission for short-duration speaker verification challenge 2020
    Shen, Peng
    Lu, Xugang
    Kawai, Hisashi
    INTERSPEECH 2020, 2020: 751-755
  • [10] Angular Softmax for Short-Duration Text-independent Speaker Verification
    Huang, Zili
    Wang, Shuai
    Yu, Kai
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 3623-3627