Accelerating Federated Learning With Data and Model Parallelism in Edge Computing

Cited by: 15
Authors
Liao, Yunming [1 ,2 ]
Xu, Yang [1 ,2 ]
Xu, Hongli [1 ,2 ]
Yao, Zhiwei [1 ,2 ]
Wang, Lun [1 ,2 ]
Qiao, Chunming [3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Univ Sci & Technol China, Suzhou Inst Adv Res, Suzhou 215123, Jiangsu, Peoples R China
[3] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
Funding
US National Science Foundation
Keywords
Edge computing; Convergence; Adaptation models; Parallel processing; federated learning; split learning; system heterogeneity; COMMUNICATION
DOI
10.1109/TNET.2023.3299851
Chinese Library Classification (CLC)
TP3 [Computing technology; computer technology]
Subject classification code
0812
Abstract
Recently, edge AI has been launched to mine and discover valuable knowledge at the network edge. Federated Learning, as an emerging technique for edge AI, has been widely deployed to collaboratively train models on many end devices in a data-parallel fashion. To alleviate the computation/communication burden on resource-constrained workers (e.g., end devices) and protect user privacy, Split Federated Learning (SFL), which integrates both data parallelism and model parallelism in Edge Computing (EC), is becoming a practical and popular approach for model training over distributed data. However, apart from the resource limitation, SFL still faces two other critical challenges in EC, i.e., system heterogeneity and context dynamics. To overcome these challenges, we present an efficient SFL method that controls both the local updating frequency and the batch size to better accelerate model training. We theoretically analyze the model convergence rate and obtain a convergence upper bound with respect to the local updating frequency given a fixed batch size. Building on this bound, we develop a control algorithm that determines adaptive local updating frequencies and diverse batch sizes for heterogeneous workers to enhance training efficiency. The experimental results show that the proposed method reduces the completion time by about 43% and the network traffic consumption by about 31% while achieving similar test accuracy, compared to the baselines.
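To make the data- and model-parallel structure described in the abstract concrete, below is a minimal sketch (not the authors' implementation) of split federated learning with heterogeneous workers: each worker trains a client-side sub-model with its own batch size and local updating frequency, the server completes the forward/backward pass on the cut-layer activations, and the client-side models are averaged after each round. The `ClientNet`/`ServerNet` architecture, the synthetic data, and the per-worker `(batch_size, local_steps)` values are illustrative assumptions, not the adaptive control algorithm proposed in the paper.

```python
# Sketch of split federated learning (SFL) with per-worker batch sizes and
# local updating frequencies; synthetic data, assumed model shapes.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# The model is split: workers hold the first layers, the server holds the rest.
class ClientNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(20, 32), nn.ReLU())
    def forward(self, x):
        return self.body(x)

class ServerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(32, 2)
    def forward(self, h):
        return self.head(h)

# Heterogeneous workers: stronger devices get larger batches / more local steps.
workers = [
    {"batch_size": 64, "local_steps": 4},   # powerful device
    {"batch_size": 32, "local_steps": 2},   # mid-range device
    {"batch_size": 16, "local_steps": 1},   # weak device
]

global_client = ClientNet()
server = ServerNet()
server_opt = torch.optim.SGD(server.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

for rnd in range(5):                                     # aggregation rounds
    client_states, weights = [], []
    for w in workers:
        # Each worker starts from the current global client-side model.
        client = copy.deepcopy(global_client)
        client_opt = torch.optim.SGD(client.parameters(), lr=0.05)
        for _ in range(w["local_steps"]):
            x = torch.randn(w["batch_size"], 20)         # local mini-batch
            y = torch.randint(0, 2, (w["batch_size"],))
            h = client(x)                                # client-side forward
            # "Smashed data" sent to the server, which finishes forward/backward.
            h_server = h.detach().requires_grad_(True)
            loss = loss_fn(server(h_server), y)
            server_opt.zero_grad()
            loss.backward()
            server_opt.step()                            # server-side update
            # The cut-layer gradient is returned to the worker.
            client_opt.zero_grad()
            h.backward(h_server.grad)
            client_opt.step()                            # client-side update
        client_states.append(client.state_dict())
        weights.append(w["batch_size"] * w["local_steps"])  # samples processed
    # Weighted FedAvg of the client-side models (data parallelism).
    total = sum(weights)
    avg = {k: sum(wt / total * s[k] for s, wt in zip(client_states, weights))
           for k in client_states[0]}
    global_client.load_state_dict(avg)
    print(f"round {rnd}: last worker loss = {loss.item():.4f}")
```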
Pages: 904-918
Page count: 15