RSA: Byzantine-Robust Stochastic Aggregation Methods for Distributed Learning from Heterogeneous Datasets

Cited by: 0
Authors
Li, Liping [1 ]
Xu, Wei [1 ]
Chen, Tianyi [2 ]
Giannakis, Georgios B. [2 ]
Ling, Qing [3 ]
Affiliations
[1] University of Science and Technology of China, Department of Automation, Hefei, Anhui, China
[2] University of Minnesota, Digital Technology Center, Minneapolis, MN, USA
[3] Sun Yat-sen University, School of Data and Computer Science, Guangzhou, Guangdong, China
Source
Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence | 2019
DOI: Not available
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
In this paper, we propose a class of robust stochastic subgradient methods for distributed learning from heterogeneous datasets in the presence of an unknown number of Byzantine workers. During the learning process, Byzantine workers may send arbitrary incorrect messages to the master due to data corruption, communication failures, or malicious attacks, and consequently bias the learned model. The key to the proposed methods is a regularization term incorporated into the objective function so as to robustify the learning task and mitigate the negative effects of Byzantine attacks. The resultant subgradient-based algorithms are termed Byzantine-Robust Stochastic Aggregation methods, justifying our acronym RSA used henceforth. In contrast to most existing algorithms, RSA does not rely on the assumption that the data are independent and identically distributed (i.i.d.) across the workers, and hence fits a wider class of applications. Theoretically, we show that: i) RSA converges to a near-optimal solution, with the learning error dependent on the number of Byzantine workers; ii) the convergence rate of RSA under Byzantine attacks is the same as that of stochastic gradient descent in the attack-free setting. Numerically, experiments on a real dataset corroborate the competitive performance of RSA and its reduced complexity compared to state-of-the-art alternatives.
Pages: 1544-1551 (8 pages)
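
The mechanism described in the abstract, a penalty on the distance between each worker's local model and the master's model, can be made concrete with a short simulation. Below is a minimal sketch of the l1-norm variant of the RSA update on a toy least-squares problem. The toy data, the Byzantine behavior (huge random messages), the choice of no extra regularizer f0 at the master, and all parameter values are illustrative assumptions, not the authors' reference implementation or experimental setup.

```python
import numpy as np

# Minimal sketch of the l1-norm RSA update on a toy least-squares problem.
# Assumption: the master carries no extra regularizer (f0 = 0), so its step
# reduces to the lambda-weighted sum of signs of (x0 - x_i).
rng = np.random.default_rng(0)
d, n_workers, n_byz = 5, 10, 2        # dimension, total workers, Byzantine workers
x_true = rng.normal(size=d)

# Heterogeneous (non-i.i.d.) local datasets: each worker draws features
# from a differently shifted distribution.
data = []
for i in range(n_workers):
    A = rng.normal(loc=0.1 * i, scale=1.0, size=(50, d))
    b = A @ x_true + 0.1 * rng.normal(size=50)
    data.append((A, b))

def stoch_grad(A, b, x, batch=5):
    """Stochastic gradient of the local least-squares loss on a mini-batch."""
    idx = rng.integers(0, A.shape[0], size=batch)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch

lam = 0.5                                      # penalty weight lambda
x0 = np.zeros(d)                               # master's model
xi = [np.zeros(d) for _ in range(n_workers)]   # workers' local models

for t in range(1, 2001):
    eta = 0.1 / t                              # diminishing step size
    msgs = []
    for i in range(n_workers):
        if i < n_byz:
            # Byzantine worker: sends an arbitrary (here, huge random) message.
            msgs.append(1e6 * rng.normal(size=d))
        else:
            # Honest worker: subgradient step on its regularized local objective.
            A, b = data[i]
            g = stoch_grad(A, b, xi[i]) + lam * np.sign(xi[i] - x0)
            xi[i] = xi[i] - eta * g
            msgs.append(xi[i])
    # Master step: sign() clips each message's per-coordinate influence to
    # at most lambda, regardless of how extreme a Byzantine message is.
    x0 = x0 - eta * lam * sum(np.sign(x0 - m) for m in msgs)

print("estimation error:", np.linalg.norm(x0 - x_true))
```

Because the master aggregates only the signs of the differences x0 - xi, every message, honest or Byzantine, shifts each coordinate of x0 by at most eta * lambda per round; this bounded influence is what underlies the learning error stated in the abstract, which grows with the number of Byzantine workers rather than with the magnitude of their messages.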