On robustness of unsupervised domain adaptation for speaker recognition

Cited by: 18
Authors
Bousquet, Pierre-Michel [1]
Rouvier, Mickael [1]
Affiliations
[1] Univ Avignon LIA, Avignon, France
Source
INTERSPEECH 2019 | 2019
Keywords
Speaker recognition; speaker embeddings; x-vectors; unsupervised; domain adaptation
DOI
10.21437/Interspeech.2019-1524
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Subject classification code
100104; 100213
Abstract
Current speaker recognition systems, that are learned by using wide training datasets and include sophisticated modelings, turn out to be very specific, providing sometimes disappointing results in real-life applications. Any shift between training and test data, in terms of device, language, duration, noise or other tends to degrade accuracy of speaker detection. This study investigates unsupervised domain adaptation, when only a scarce and unlabeled "in-domain" development dataset is available. Details and relevance of different approaches are described and commented, leading to a new robust method that we call feature-Distribution Adaptor. Efficiency of the proposed technique is experimentally validated on the recent NIST 2016 and 2018 Speaker Recognition Evaluation datasets.
Pages: 2958-2962
Number of pages: 5