SAFA: A Semi-Asynchronous Protocol for Fast Federated Learning With Low Overhead

Cited by: 187
Authors
Wu, Wentai [1]
He, Ligang [1]
Lin, Weiwei [2]
Mao, Rui [3]
Maple, Carsten [4]
Jarvis, Stephen [1]
Affiliations
[1] Univ Warwick, Dept Comp Sci, Coventry CV4 7AL, W Midlands, England
[2] South China Univ Technol, Sch Comp Sci & Technol, Guangzhou 510641, Guangdong, Peoples R China
[3] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[4] Univ Warwick, Warwick Manufacturing Grp, Coventry CV4 7AL, W Midlands, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); National Natural Science Foundation of China
Keywords
Protocols; Training; Machine learning; Data models; Optimization; Convergence; Distributed databases; Distributed computing; edge intelligence; federated learning
DOI
10.1109/TC.2020.2994391
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Federated learning (FL) has attracted increasing attention as a promising approach to driving a vast number of end devices with artificial intelligence. However, it is very challenging to guarantee the efficiency of FL considering the unreliable nature of end devices while the cost of device-server communication cannot be neglected. In this article, we propose SAFA, a semi-asynchronous FL protocol, to address the problems in federated learning such as low round efficiency and poor convergence rate in extreme conditions (e.g., clients dropping offline frequently). We introduce novel designs in the steps of model distribution, client selection and global aggregation to mitigate the impacts of stragglers, crashes and model staleness in order to boost efficiency and improve the quality of the global model. We have conducted extensive experiments with typical machine learning tasks. The results demonstrate that the proposed protocol is effective in terms of shortening federated round duration, reducing local resource wastage, and improving the accuracy of the global model at an acceptable communication cost.
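The abstract does not spell out SAFA's aggregation rule, so the following is only a minimal, generic sketch of the semi-asynchronous idea it mentions (tolerating some model staleness instead of waiting for every client). It is not the authors' algorithm: the `Update` type, the `aggregate` function, the `max_staleness` threshold and the sample-weighted averaging are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Update:
    """A client's local model update (hypothetical structure)."""
    client_id: int
    base_round: int        # global round the local model was trained from
    weights: List[float]   # flattened model parameters
    num_samples: int       # size of the client's local dataset


def aggregate(updates: List[Update], current_round: int,
              max_staleness: int) -> Optional[List[float]]:
    """Sample-weighted average of updates within a staleness bound.

    Updates based on a global model older than `max_staleness` rounds
    are skipped. Assumption: a simple hard cut-off, chosen only to
    illustrate bounded staleness.
    """
    fresh = [u for u in updates
             if current_round - u.base_round <= max_staleness]
    if not fresh:
        return None  # nothing usable arrived this round
    total = sum(u.num_samples for u in fresh)
    dim = len(fresh[0].weights)
    return [sum(u.weights[i] * u.num_samples for u in fresh) / total
            for i in range(dim)]
```

In this sketch a straggler whose update is only slightly out of date still contributes to the global model, while a badly outdated update is ignored rather than blocking the round.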
Pages: 655-668
Page count: 14