Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning

Cited by: 1

Authors:

Chen, Zhiyong [1]

Xu, Shugong [1]

Affiliations:

[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | Year 2023 / Volume 2023 / Issue 01

Keywords:

Speaker recognition; Federated learning; Domain adaptation; Continual learning; ADAPTATION; ASR;

DOI:

10.1186/s13636-023-00299-2

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. Federated learning, a recent development in machine learning methods, has gained traction in privacy-sensitive tasks, such as personal voice assistants in home environments. However, its application in heterogeneous multi-domain scenarios for enhancing system customization remains underexplored. In this paper, we propose the utilization of federated learning in heterogeneous situations to enable adaptation across multiple domains. We also introduce a personalized federated learning algorithm designed to effectively leverage limited domain data, resulting in improved learning outcomes. Furthermore, we present a strategy for implementing the federated learning algorithm in practical, real-world continual learning scenarios, demonstrating promising results. The proposed federated learning method exhibits superior performance across a range of synthesized complex conditions and continual learning settings, compared to conventional training methods.


Pages: 17

