Multi-Stage Hybrid Federated Learning Over Large-Scale D2D-Enabled Fog Networks

Cited by: 55
Authors
Hosseinalipour, Seyyedali [1 ]
Azam, Sheikh Shams [1 ]
Brinton, Christopher G. [1 ]
Michelusi, Nicolo [2 ]
Aggarwal, Vaneet [3 ]
Love, David J. [1 ]
Dai, Huaiyu [4 ]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47906 USA
[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[3] Purdue Univ, Sch Ind Engn, W Lafayette, IN 47906 USA
[4] NC State Univ, Dept Elect & Comp Engn, Raleigh, NC 27606 USA
Keywords
Collaborative work; Device-to-device communication; Training; Servers; Topology; Computational modeling; Convergence; Fog learning; device-to-device communications; peer-to-peer learning; cooperative learning; distributed machine learning; semi-decentralized federated learning; CONSENSUS ALGORITHMS; SUBGRADIENT METHODS; COMMUNICATION; OPTIMIZATION; ALLOCATION; CHALLENGES; SYSTEMS; DEVICE; DESIGN; POWER;
DOI
10.1109/TNET.2022.3143495
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Federated learning has generated significant interest, with nearly all works focused on a "star" topology where nodes/devices are each connected to a central server. We migrate away from this architecture and extend it through the network dimension to the case where there are multiple layers of nodes between the end devices and the server. Specifically, we develop multi-stage hybrid federated learning (MH-FL), a hybrid of intra- and inter-layer model learning that considers the network as a multi-layer cluster-based structure. MH-FL considers the topology structures among the nodes in the clusters, including local networks formed via device-to-device (D2D) communications, and presumes a semi-decentralized architecture for federated learning. It orchestrates the devices at different network layers in a collaborative/cooperative manner (i.e., using D2D interactions) to form local consensus on the model parameters and combines it with multi-stage parameter relaying between layers of the tree-shaped hierarchy. We derive the upper bound of convergence for MH-FL with respect to parameters of the network topology (e.g., the spectral radius) and the learning algorithm (e.g., the number of D2D rounds in different clusters). We obtain a set of policies for the D2D rounds at different clusters to guarantee either a finite optimality gap or convergence to the global optimum. We then develop a distributed control algorithm for MH-FL to tune the D2D rounds in each cluster over time to meet specific convergence criteria. Our experiments on real-world datasets verify our analytical results and demonstrate the advantages of MH-FL in terms of resource utilization metrics.
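The abstract describes two coupled mechanisms: consensus averaging of model parameters inside each D2D-connected cluster, followed by relaying each cluster's estimate up one layer toward the server. The Python sketch below is illustrative only and is not the paper's implementation; the Metropolis-Hastings mixing weights, the fully connected toy clusters, the fixed D2D round counts, and the size-weighted aggregation at the upper layer are assumptions chosen for demonstration.

```python
# Minimal sketch of intra-cluster D2D consensus plus one relay step upward.
# All network parameters here (cluster sizes, connectivity, round counts)
# are illustrative assumptions, not values from the paper.
import numpy as np

def metropolis_weights(adj):
    """Symmetric, doubly stochastic mixing matrix for a D2D graph
    (Metropolis-Hastings weights), so repeated mixing converges to the mean."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

def d2d_consensus(params, adj, rounds):
    """Run `rounds` iterations of x <- W x inside one cluster; each row of
    `params` is one device's model-parameter vector."""
    W = metropolis_weights(adj)
    for _ in range(rounds):
        params = W @ params
    return params

# Toy two-layer example: two device clusters beneath one upper-layer node.
rng = np.random.default_rng(0)
d = 4                                           # model dimension
clusters = [rng.normal(size=(5, d)),            # cluster 1: 5 devices
            rng.normal(size=(3, d))]            # cluster 2: 3 devices
adjs = [np.ones((5, 5)) - np.eye(5),            # fully connected D2D graphs
        np.ones((3, 3)) - np.eye(3)]
rounds = [4, 4]                                 # D2D rounds per cluster

# Each cluster reaches approximate local consensus; one sampled device per
# cluster then relays its vector to the layer above.
relayed = np.stack([d2d_consensus(p, a, r)[0]
                    for p, a, r in zip(clusters, adjs, rounds)])

# The upper layer combines the relayed vectors, weighting by cluster size so
# the result approximates the average over all devices.
sizes = np.array([len(p) for p in clusters], dtype=float)
global_model = (sizes[:, None] * relayed).sum(axis=0) / sizes.sum()
print(global_model)
```

In a deeper hierarchy, the same consensus-then-relay step would repeat at each layer of the tree until the server is reached; the paper's analysis ties the number of D2D rounds per cluster to the spectral radius of the cluster's mixing matrix.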
Pages: 1569-1584
Page count: 16