FedGKD: Toward Heterogeneous Federated Learning via Global Knowledge Distillation

Cited by: 26
Authors
Yao, Dezhong [1 ]
Pan, Wanning [1 ]
Dai, Yutong [2 ]
Wan, Yao [1 ]
Ding, Xiaofeng [1 ]
Yu, Chen [1 ]
Jin, Hai [1 ]
Xu, Zheng [3 ]
Sun, Lichao [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Natl Engn Res Ctr Big Data Technol & Syst, Serv Comp Technol & Syst Lab, Sch Comp Sci & Technol, Cluster & Grid Comp Lab, Wuhan 430074, Peoples R China
[2] Lehigh Univ, Bethlehem, PA 18015 USA
[3] Google Res, San Francisco, CA 94043 USA
Keywords
Heterogeneous federated learning; non-IID; knowledge distillation; edge intelligence
DOI
10.1109/TC.2023.3315066
Chinese Library Classification (CLC)
TP3 [computing technology, computer technology]
Discipline Classification Code
0812
Abstract
Federated learning, an enabling technology of edge intelligence, has gained substantial attention for its efficacy in training deep learning models without centralizing raw data, thereby easing data-privacy and network-bandwidth concerns. However, due to the heterogeneity of edge computing systems and data, many methods suffer from the "client-drift" issue, which can considerably impede the convergence of global model training: local models on clients drift apart, and the aggregated model can be far from the global optimum. To tackle this issue, one intuitive idea is to guide local model training with global teachers, i.e., past global models, where each client learns global knowledge from past global models via adaptive knowledge distillation. Inspired by this insight, we propose FedGKD, a novel approach for heterogeneous federated learning that fuses the knowledge of historical global models and guides local training to alleviate the "client-drift" issue. We evaluate FedGKD through extensive experiments on various CV and NLP datasets (i.e., CIFAR-10/100, Tiny-ImageNet, AG News, SST-5) under different heterogeneous settings. The proposed method is guaranteed to converge under common assumptions and outperforms state-of-the-art baselines in the non-IID federated setting.
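
The mechanism sketched in the abstract, regularizing each client's local update with knowledge distilled from fused historical global models, can be made concrete with a short sketch. The Python/PyTorch code below is a minimal illustration under stated assumptions, not the paper's verbatim algorithm: it assumes the teacher is obtained by parameter-averaging the last few global models the server has broadcast, and that the local objective adds a temperature-scaled KL distillation term to the usual task loss; the names fuse_global_models, local_step, gamma, and temperature are illustrative.

import copy
import torch
import torch.nn.functional as F

def fuse_global_models(global_models):
    # Form a single teacher by averaging the parameters of the most
    # recent global models (an assumed fusion scheme for this sketch).
    teacher = copy.deepcopy(global_models[-1])
    state = teacher.state_dict()
    for key in state:
        state[key] = torch.stack(
            [m.state_dict()[key].float() for m in global_models], dim=0
        ).mean(dim=0)
    teacher.load_state_dict(state)
    teacher.eval()
    return teacher

def local_step(student, teacher, batch, optimizer, gamma=0.2, temperature=2.0):
    # One local training step: task loss plus distillation toward the
    # fused global teacher, which pulls the local model back toward
    # globally shared knowledge and counteracts client drift.
    inputs, labels = batch
    optimizer.zero_grad()
    logits = student(inputs)
    task_loss = F.cross_entropy(logits, labels)
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    # KL divergence on temperature-softened distributions; the T^2
    # factor keeps gradient magnitudes comparable across temperatures.
    distill_loss = F.kl_div(
        F.log_softmax(logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = task_loss + gamma * distill_loss
    loss.backward()
    optimizer.step()
    return loss.item()

On the server side, aggregation can proceed as in standard FedAvg; the only extra state this sketch assumes is a small buffer of recent global models from which the fused teacher is built each round.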
Pages: 3-17
Page count: 15