Accelerating Federated Learning With Cluster Construction and Hierarchical Aggregation

Cited by: 51
Authors
Wang, Zhiyuan [1]
Xu, Hongli [1]
Liu, Jianchun [1]
Xu, Yang [1]
Huang, He [2]
Zhao, Yangming [1]
Affiliations
[1] Univ Sci & Technol China, Hefei 230027, Anhui, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
Funding
US National Science Foundation
Keywords
Hierarchical federated learning; mobile edge computing; cluster construction; optimization;
DOI
10.1109/TMC.2022.3147792
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Federated learning (FL) has emerged in edge computing to address the limited bandwidth and privacy concerns of traditional cloud-based training. However, existing FL mechanisms can incur long training times and consume substantial communication resources. In this paper, we propose an efficient FL mechanism, namely FedCH, to accelerate FL in heterogeneous edge computing. Unlike existing works, which adopt a pre-defined system architecture and train models in either a synchronous or an asynchronous manner, FedCH constructs a special cluster topology and performs hierarchical aggregation for training. Specifically, FedCH arranges all clients into multiple clusters based on their heterogeneous training capacities. The clients in one cluster synchronously forward their local updates to the cluster header for aggregation, while the cluster headers perform global aggregation asynchronously. Our analysis shows that the convergence bound depends on the number of clusters and the number of training epochs. We propose efficient algorithms to determine the optimal number of clusters under resource budgets and then construct the cluster topology to address client heterogeneity. Extensive experiments on both a physical platform and a simulated environment show that FedCH reduces the completion time by 49.5-79.5% and the network traffic by 57.4-80.8%, compared with existing FL mechanisms.
Pages: 3805-3822
Number of pages: 18
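
The two-level scheme described in the abstract (synchronous aggregation inside each cluster, asynchronous aggregation across cluster headers) can be illustrated with a minimal Python sketch. This is not the FedCH implementation: the abstract does not specify the clustering algorithm or the global update rule, so the function names, the grouping by sorted training time, the plain averaging, and the mixing weight `alpha` are all assumptions made for illustration.

```python
from typing import Dict, List

Vector = List[float]


def build_clusters(train_times: Dict[str, float], k: int) -> List[List[str]]:
    """Group clients with similar training times into k clusters.

    Hypothetical heuristic: sort clients by training time and cut the
    sorted list into k contiguous, near-equal groups. The paper's actual
    cluster construction algorithm is not given in the abstract.
    """
    ordered = sorted(train_times, key=lambda client: train_times[client])
    size, extra = divmod(len(ordered), k)
    clusters, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        clusters.append(ordered[start:end])
        start = end
    return clusters


def sync_aggregate(updates: List[Vector]) -> Vector:
    """Cluster header step: wait for every client update, then average them."""
    n = len(updates)
    return [sum(values) / n for values in zip(*updates)]


def async_merge(global_model: Vector, cluster_model: Vector, alpha: float) -> Vector:
    """Global step: fold one cluster's model into the global model on arrival.

    Assumed rule: a convex combination with mixing weight alpha, applied
    without waiting for the other clusters.
    """
    return [(1 - alpha) * g + alpha * c for g, c in zip(global_model, cluster_model)]
```

For example, `build_clusters({"a": 1.0, "b": 1.2, "c": 5.0, "d": 5.5}, 2)` places the two fast clients in one cluster and the two slow clients in the other, so the fast cluster can report to the global model several times while the slow cluster finishes a single round.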