Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data

Cited by: 2
Authors
Zhou, Tailin [1 ]
Lin, Zehong [2 ,4 ]
Zhang, Jun [2 ,4 ]
Tsang, Danny H. K. [3 ,4 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Acad Interdisciplinary Studies, IPO, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[3] Hong Kong Univ Sci & Technol, Internet Things Thrust, Guangzhou 999077, Peoples R China
[4] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
Keywords
Federated learning; heterogeneous data; loss decomposition; loss landscape visualization; model averaging;
DOI
10.1109/TMC.2024.3406554
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Model averaging is a widely adopted technique in federated learning (FL) that aggregates multiple client models to obtain a global model. Remarkably, model averaging in FL yields a superior global model, even when client models are trained with non-convex objective functions and on heterogeneous local datasets. However, the rationale behind its success remains poorly understood. To shed light on this issue, we first visualize the loss landscape of FL over client and global models to illustrate their geometric properties. The visualization shows that the client models encompass the global model within a common basin, and interestingly, the global model may deviate from the basin's center while still outperforming the client models. To gain further insights into model averaging in FL, we decompose the expected loss of the global model into five factors related to the client models. Specifically, our analysis reveals that the global model loss after early training mainly arises from i) the client models' loss on data that does not overlap between the client datasets and the global dataset and ii) the maximum distance between the global and client models. Based on the findings from our loss landscape visualization and loss decomposition, we propose utilizing iterative moving averaging (IMA) on the global model in the late training phase to reduce its deviation from the expected minimum, while constraining client exploration to limit the maximum distance between the global and client models. Our experiments demonstrate that incorporating IMA into existing FL methods significantly improves their accuracy and training speed across various heterogeneous data setups on benchmark datasets.
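
To make the IMA idea in the abstract concrete, the following is a minimal sketch in Python/NumPy, not the authors' implementation: model states are plain dicts of arrays, local_update is a hypothetical client-training helper assumed to return a new state dict without mutating its input, and the hyperparameters ima_start and window, as well as the exact IMA schedule, are illustrative assumptions; the paper's additional constraint on client exploration in the late phase is only noted in a comment.

import numpy as np


def federated_round(global_model, client_datasets, local_update):
    """One FedAvg-style round: every client starts from the current global
    model, runs local training via the caller-supplied local_update
    (hypothetical helper), and the server averages the returned models."""
    client_models = [local_update(global_model, data) for data in client_datasets]
    return {
        name: np.mean([m[name] for m in client_models], axis=0)
        for name in global_model
    }


def train_with_ima(init_model, client_datasets, local_update,
                   num_rounds=200, ima_start=150, window=5):
    """FedAvg with iterative moving averaging (IMA) in late training.

    After round ima_start, the global model handed to clients is replaced by
    the average of the aggregated global models from the most recent window
    rounds, pulling it toward the basin's expected minimum. ima_start and
    window are illustrative values, not the paper's reported settings.
    """
    global_model = {name: param.copy() for name, param in init_model.items()}
    recent = []  # sliding window of recent aggregated global models
    for t in range(num_rounds):
        global_model = federated_round(global_model, client_datasets, local_update)
        recent.append(global_model)
        if len(recent) > window:
            recent.pop(0)
        if t >= ima_start:
            # IMA: serve the moving average of the last `window` global models.
            # (The paper additionally constrains client exploration in this
            # phase, e.g., via reduced local training; omitted here.)
            global_model = {
                name: np.mean([m[name] for m in recent], axis=0)
                for name in init_model
            }
    return global_model

A sliding-window average is used here rather than an exponential moving average; either is consistent with a "moving average of recent global models" reading of the abstract.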
Pages: 12131 - 12145
Page count: 15