Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning

Cited by: 1
Authors
Chen, Zhiyong [1]
Xu, Shugong [1]
Affiliation
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China
Keywords
Speaker recognition; Federated learning; Domain adaptation; Continual learning; ADAPTATION; ASR;
DOI
10.1186/s13636-023-00299-2
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. Federated learning, a recent development in machine learning methods, has gained traction in privacy-sensitive tasks, such as personal voice assistants in home environments. However, its application in heterogeneous multi-domain scenarios for enhancing system customization remains underexplored. In this paper, we propose the utilization of federated learning in heterogeneous situations to enable adaptation across multiple domains. We also introduce a personalized federated learning algorithm designed to effectively leverage limited domain data, resulting in improved learning outcomes. Furthermore, we present a strategy for implementing the federated learning algorithm in practical, real-world continual learning scenarios, demonstrating promising results. The proposed federated learning method exhibits superior performance across a range of synthesized complex conditions and continual learning settings, compared to conventional training methods.
Pages: 17