Design and Optimization of Hierarchical Gradient Coding for Distributed Learning at Edge Devices

Authors
Tang, Weiheng [1]
Li, Jingyi [1]
Chen, Lin [2]
Chen, Xu [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510275, Peoples R China
[2] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangdong Prov Key Lab Informat Secur Technol, Guangzhou 510275, Peoples R China
Funding
National Science Foundation (United States);
Keywords
Encoding; Distance learning; Computer aided instruction; Computational modeling; Task analysis; Optimization; Computer architecture; Distributed learning; hierarchical architecture; stragglers tolerance; gradient coding; ALLOCATION;
DOI
10.1109/TCOMM.2024.3418901
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Edge computing has recently emerged as a promising paradigm to boost the performance of distributed learning by leveraging the distributed resources at edge nodes. Architecturally, introducing edge nodes adds an intermediate layer between the master and the workers of the original distributed learning system, potentially leading to a more severe straggler effect. Coding-theoretic approaches have recently been proposed for straggler mitigation in distributed learning, but the majority focus on the conventional worker-master architecture. In this paper, we instead investigate the problem of mitigating the straggler effect in hierarchical distributed learning systems with an additional layer composed of edge nodes. Technically, we first derive the fundamental trade-off between the computational loads of workers and the straggler tolerance. Then, we propose a hierarchical gradient coding framework that achieves the derived computational trade-off and provides better straggler mitigation. To further improve the performance of our framework in heterogeneous scenarios, we formulate an optimization problem with the objective of minimizing the expected execution time of each iteration of the learning process. We develop an efficient algorithm that solves this problem and outputs the optimal strategy. Extensive simulation results demonstrate the superiority of our schemes over conventional solutions.
Pages: 7727-7741
Page count: 15