A GDPR-compliant Ecosystem for Speech Recognition with Transfer, Federated, and Evolutionary Learning

Cited by: 14
Authors
Jiang, Di [1 ]
Tan, Conghui [1 ]
Peng, Jinhua [1 ]
Chen, Chaotao [1 ]
Wu, Xueyang [2 ]
Zhao, Weiwei [1 ]
Song, Yuanfeng [1 ]
Tong, Yongxin [3 ,4 ]
Liu, Chang [1 ]
Xu, Qian [1 ]
Yang, Qiang [2 ,5 ]
Deng, Li [6 ]
Affiliations
[1] WeBank Co Ltd, AI Grp, Shenzhen, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
[3] Beihang Univ, SKLSDE Lab, BDBC, Beijing, Peoples R China
[4] Beihang Univ, IRI, Beijing, Peoples R China
[5] WeBank Co Ltd, Shenzhen, Peoples R China
[6] Citadel LLC, 131 South Dearborn St, Chicago, IL 60603 USA
Keywords
Speech recognition; federated learning; transfer learning; evolutionary learning; adaptation
DOI
10.1145/3447687
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Automatic Speech Recognition (ASR) plays a vital role in a wide range of real-world applications. However, commercial ASR solutions are typically "one-size-fits-all" products, and clients inevitably face the risk of severe performance degradation in field tests. Meanwhile, with new data regulations such as the European Union's General Data Protection Regulation (GDPR) coming into force, ASR vendors, which have traditionally utilized speech training data in a centralized fashion, are increasingly unable to address this problem, since accessing clients' speech data is prohibited. Here, we show that by seamlessly integrating three machine learning paradigms, namely transfer learning, federated learning, and evolutionary learning (TFE), we can build a win-win ecosystem for ASR clients and vendors that solves all of the aforementioned problems. Through large-scale quantitative experiments, we show that with TFE, clients enjoy far better ASR solutions than the "one-size-fits-all" counterpart, and vendors can exploit the abundance of clients' data to effectively refine their own ASR products.
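The federated-learning component described in the abstract lets vendors improve a shared model without ever accessing clients' raw speech data. This record gives no implementation details, so the following is a minimal, self-contained sketch of federated averaging (FedAvg-style aggregation) on a toy linear model; the names `local_sgd` and `fed_avg` and the toy data are illustrative assumptions, not the paper's actual ASR pipeline.

```python
# Minimal federated-averaging sketch: each client trains a local
# linear model on its private data, and only model weights (never
# the raw data) are sent to the server, which averages them.

def local_sgd(w, data, lr=0.01, epochs=20):
    """One client's local training: least-squares fit of y = w*x
    via gradient descent on the client's private (x, y) pairs."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fed_avg(global_w, client_datasets, rounds=10):
    """Server loop: broadcast the global weight, collect each
    client's locally trained weight, and average the updates."""
    for _ in range(rounds):
        updates = [local_sgd(global_w, d) for d in client_datasets]
        global_w = sum(updates) / len(updates)
    return global_w

# Two clients whose private data both follow y = 2x (kept on-device).
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0)],
]
w = fed_avg(0.0, clients)
print(round(w, 2))  # → 2.0
```

The design point this illustrates is the GDPR-relevant one: the server only ever sees model parameters, so the clients' training examples never leave their own machines.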
Pages: 19