AggEnhance: Aggregation Enhancement by Class Interior Points in Federated Learning with Non-IID Data

Cited by: 3
Authors
Ou, Jinxiang [1]
Shen, Yunheng [1]
Wang, Feng [1]
Liu, Qiao [2]
Zhang, Xuegong [1]
Lv, Hairong [1]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Dept Automat, Beijing, Peoples R China
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Natural Science Foundation of China
Keywords
Federated learning; aggregation enhancement; class interior points; non-IID; communication
DOI
10.1145/3544495
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Federated learning (FL) is a privacy-preserving paradigm for multi-institutional collaboration in which aggregation is an essential step after training on the local datasets. Conventional aggregation algorithms typically update the global model with a weighted average of the updates produced by the distributed machines. However, when the data distributions are non-IID, large discrepancies between the local updates can lead to a poor averaged result and slower convergence, i.e., more iterations are required to reach a given performance. To address this problem, this article proposes AggEnhance, a novel method for enhancing the aggregation, in which we synthesize a group of reliable samples from the local models and tune the aggregated result on them. These samples, named class interior points (CIPs) in this work, bound the relevant decision boundaries that ensure the performance of the aggregated result. To the best of our knowledge, this is the first work to explicitly design an enhancing method for the aggregation in prevailing FL pipelines. A series of experiments on real data demonstrates that our method noticeably improves convergence in non-IID scenarios. In particular, our approach reduces the number of iterations by 31.87% on average for the CIFAR10 dataset and by 43.90% for the PASCAL VOC dataset. Because our method does not modify other procedures of the FL pipeline, it is easy to apply to most existing FL frameworks. Furthermore, it does not require additional data to be transmitted from the local clients to the global server, and thus maintains the same security level as the original FL algorithms.
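The conventional aggregation step that the abstract contrasts with AggEnhance, a weighted average of the local updates, can be illustrated with a minimal sketch. This is not the AggEnhance method itself, and nothing here is taken from the paper's implementation: the function name `weighted_average`, the dictionary-of-lists parameter layout, and the choice of local sample counts as weights are all assumptions made for illustration.

```python
from typing import Dict, List


def weighted_average(
    updates: List[Dict[str, List[float]]],
    num_samples: List[int],
) -> Dict[str, List[float]]:
    """Average client parameter vectors, weighting each client by its sample count.

    Hypothetical helper: `updates` holds one {parameter name: flat list of values}
    dict per client, and `num_samples` holds the matching local dataset sizes.
    """
    if not updates or len(updates) != len(num_samples):
        raise ValueError("need one sample count per client update")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    # Normalised weights: clients with more local data contribute more.
    weights = [n / total for n in num_samples]

    averaged: Dict[str, List[float]] = {}
    for name in updates[0]:
        length = len(updates[0][name])
        averaged[name] = [
            sum(w * client[name][i] for w, client in zip(weights, updates))
            for i in range(length)
        ]
    return averaged


# Two clients with very different parameters, as can happen with non-IID data.
client_updates = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
global_params = weighted_average(client_updates, num_samples=[1, 3])
```

When local updates diverge strongly, as in the non-IID setting the abstract describes, this plain average can land far from every client's solution, which is the situation the proposed tuning on class interior points is meant to address.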
Pages: 25