Consideration of Varying Training Lengths for Short-Duration Speaker Verification

Times cited: 0

Authors:

Ko, WooSeok [1]

Um, Seyun [1]

Piao, Zhenyu [1]

Kang, Hong-goo [1]

Affiliations:

[1] Yonsei Univ, Seoul, South Korea

Source:

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023

Keywords:

Speaker verification; short utterance; training scheme

DOI:

10.1109/APSIPAASC58517.2023.10317214

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We present an efficient training scheme for speaker verification (SV) networks in short-duration speech input scenarios. We analyze the effects of varying training lengths on SV performance, with a particular focus on short utterances. Despite the high demand for short-duration SV in real-world applications, state-of-the-art SV systems have primarily been evaluated on long utterances, and little research has been conducted on short-duration SV. By considering the innate characteristics of SV architectures and the performance discrepancies associated with varying training data lengths, we propose a training scheme that accounts for varying length conditions. We categorize speaker characteristics as coarse-grained and fine-grained features and demonstrate that training models to learn both features can result in length-robust speaker embeddings. Our proposed training scheme improves model performance by 28.7% and 37.9% in terms of equal error rate on short-duration speech scenarios compared to baseline models.
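The abstract reports its gains in terms of equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. As a minimal sketch of how that metric is commonly approximated (not the authors' evaluation code), the snippet below sweeps every observed score as a threshold. The function names `equal_error_rate` and `relative_reduction` are hypothetical, the scores are made-up illustrative values, and reading the reported percentages as relative EER reductions is an assumption.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold."""
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")
    best_gap, eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction in percent (assumed meaning of 'improves by X%')."""
    return (baseline_eer - new_eer) / baseline_eer * 100


# Illustrative scores only; these are not results from the paper.
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(same_speaker, different_speaker))
print(relative_reduction(4.0, 3.0))
```

Sweeping only the observed scores keeps the sketch dependency-free; evaluation toolkits typically interpolate between thresholds for a smoother estimate.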


Pages: 139-144

Page count: 6

50 items in total

[1] PHONE ADAPTIVE TRAINING FOR SHORT-DURATION SPEAKER VERIFICATION
Soldi, Giovanni
Bozonnet, Simon
Beaugeant, Christophe
Evans, Nicholas
2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2107 - 2111
[2] Deep Speaker Embeddings for Short-Duration Speaker Verification
Bhattacharya, Gautam
Alam, Jahangir
Kenny, Patrick
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1517 - 1521
[3] Transfer Learning for Speaker Verification with Short-Duration Audio
Fathima, Noor
Simha, J. B.
Abhi, Shinu
SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 5, SMARTCOM 2024, 2024, 949 : 195 - 205
[4] The Sogou System for Short-duration Speaker Verification Challenge 2021
Yan, Jie
Yao, Shengyu
Pan, Yiqian
Chen, Wei
INTERSPEECH 2021, 2021, : 2327 - 2331
[5] The SJTU System for Short-duration Speaker Verification Challenge 2021
Han, Bing
Chen, Zhengyang
Zhou, Zhikai
Qian, Yanmin
INTERSPEECH 2021, 2021, : 2332 - 2336
[6] The TalTech Systems for the Short-duration Speaker Verification Challenge 2020
Alumae, Tanel
Valk, Jorgen
INTERSPEECH 2020, 2020, : 746 - 750
[7] The XMUSPEECH System for Short-Duration Speaker Verification Challenge 2020
Jiang, Tao
Zhao, Miao
Li, Lin
Hong, Qingyang
INTERSPEECH 2020, 2020, : 736 - 740
[8] UIAI SYSTEM FOR SHORT-DURATION SPEAKER VERIFICATION CHALLENGE 2020
Sahidullah, Md
Sarkar, Achintya Kumar
Vestman, Ville
Liu, Xuechen
Serizel, Romain
Kinnunen, Tomi
Tan, Zheng-Hua
Vincent, Emmanuel
2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 323 - 329
[9] Investigation of NICT submission for short-duration speaker verification challenge 2020
Shen, Peng
Lu, Xugang
Kawai, Hisashi
INTERSPEECH 2020, 2020, : 751 - 755
[10] Angular Softmax for Short-Duration Text-independent Speaker Verification
Huang, Zili
Wang, Shuai
Yu, Kai
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3623 - 3627
