Towards Efficient and Stable K-Asynchronous Federated Learning With Unbounded Stale Gradients on Non-IID Data

Cited by: 34
Authors
Zhou, Zihao [1 ]
Li, Yanan [1 ]
Ren, Xuebin [2 ]
Yang, Shusen [3 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Math & Stat, Natl Engn Lab Big Data Analyt, Xian 710049, Shaanxi, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Natl Engn Lab Big Data Analyt, Xian 710049, Shaanxi, Peoples R China
[3] Xi An Jiao Tong Univ, Natl Engn Lab Big Data Analyt, Minist Educ, Key Lab Intelligent Networks & Network Secur, Xian 710049, Shaanxi, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Federated learning; asynchronous learning; data heterogeneity; prediction accuracy; training stability;
DOI
10.1109/TPDS.2022.3150579
CLC Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
Federated learning (FL) is an emerging privacy-preserving paradigm that enables multiple participants to collaboratively train a global model without uploading raw data. Given the heterogeneous computing and communication capabilities of different participants, asynchronous FL avoids the straggler effect of synchronous FL and scales to scenarios with vast numbers of participants. However, both staleness and non-IID data in asynchronous FL reduce model utility, and there is an inherent tension between the solutions to these two problems: mitigating staleness calls for selecting fewer but more consistent gradients, whereas coping with non-IID data demands more comprehensive gradients. To resolve this dilemma, this paper proposes a two-stage weighted $K$-asynchronous FL with adaptive learning rate (WKAFL). By selecting consistent gradients and adjusting the learning rate adaptively, WKAFL exploits stale gradients while mitigating the impact of non-IID data, achieving multifaceted improvements in training speed, prediction accuracy, and training stability. We also present a convergence analysis of WKAFL under the assumption of unbounded staleness to characterize the impact of staleness and non-IID data. Experiments on both benchmark and synthetic FL datasets show that WKAFL outperforms existing algorithms in overall performance.
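To make the abstract's two-stage idea concrete, the following is a minimal NumPy sketch of a staleness-weighted K-asynchronous aggregation step on the server. It is an illustration only: the exponential staleness decay, the cosine-similarity consistency filter, and the norm-clipped adaptive step size are assumptions chosen for exposition, not the exact WKAFL rules from the paper.

    import numpy as np

    def k_async_aggregate(global_w, grads, staleness,
                          lr=0.1, decay=0.5, keep_ratio=0.5, clip=1.0):
        """One illustrative K-async server step (NOT the paper's WKAFL)."""
        # Stage 1: staleness-aware weights -- fresher gradients count more.
        # Exponential decay is an assumption for exposition.
        w = np.asarray([decay ** s for s in staleness], dtype=float)
        w /= w.sum()
        # The weighted mean serves as an estimate of the global descent direction.
        mean_g = np.sum([wi * g for wi, g in zip(w, grads)], axis=0)

        # Stage 2: keep only the gradients most consistent with that direction
        # (cosine similarity), then re-normalize the surviving weights.
        sims = np.array([
            g @ mean_g / (np.linalg.norm(g) * np.linalg.norm(mean_g) + 1e-12)
            for g in grads
        ])
        kept = np.argsort(sims)[-max(1, int(keep_ratio * len(grads))):]
        w_kept = w[kept] / w[kept].sum()
        agg = np.sum([wi * grads[i] for wi, i in zip(w_kept, kept)], axis=0)

        # Adaptive step size: clip large aggregated updates so that very
        # stale gradients cannot destabilize the global model.
        scale = min(1.0, clip / (np.linalg.norm(agg) + 1e-12))
        return global_w - lr * scale * agg

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        d, num_clients = 10, 4
        w_global = rng.normal(size=d)
        direction = rng.normal(size=d)  # shared descent direction
        grads = [direction + 0.3 * rng.normal(size=d) for _ in range(num_clients)]
        # Unbounded staleness is permitted: client lags may grow arbitrarily.
        w_global = k_async_aggregate(w_global, grads, staleness=[0, 1, 3, 7])
        print(w_global)

In this sketch, stage one handles staleness, stage two performs the consistency-based selection, and the clipped step size stands in for the adaptive learning rate described in the abstract.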
Pages: 3291 - 3305
Page count: 15