FedPD: A Federated Learning Framework With Adaptivity to Non-IID Data

Cited by: 101
Authors
Zhang, Xinwei [1 ]
Hong, Mingyi [1 ]
Dhople, Sairaj [1 ]
Yin, Wotao [2 ]
Liu, Yang [3 ]
Affiliations
[1] Univ Minnesota, Minneapolis, MN 55455 USA
[2] Univ Calif Los Angeles, Los Angeles, CA 90095 USA
[3] Tsinghua Univ, Inst AI Ind Res, Beijing 100084, Peoples R China
Keywords
Signal processing algorithms; Protocols; Complexity theory; Servers; Computational modeling; Data models; Distributed databases; Distributed algorithms; machine learning algorithms; federated learning; data heterogeneity; convergence analysis
DOI
10.1109/TSP.2021.3115952
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Federated Learning (FL) is popular for communication-efficient learning from distributed data. To utilize data at different clients without moving them to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a computation-then-aggregation model, in which multiple local updates are performed using local data before aggregation. These algorithms fail to work when faced with practical challenges, e.g., when the local data are not independent and identically distributed (non-IID). In this paper, we first characterize the behavior of the FedAvg algorithm and show that, without strong and unrealistic assumptions on the problem structure, it can behave erratically. Aiming to design FL algorithms that are provably fast and require as few assumptions as possible, we propose a new algorithm design strategy from the primal-dual optimization perspective. Our strategy yields algorithms that can deal with non-convex objective functions, achieve the best possible optimization and communication complexity (in a well-defined sense), and accommodate full-batch and mini-batch local computation models. Importantly, the proposed algorithms are communication efficient, in that the communication effort decreases as the level of heterogeneity among the local data decreases. In the extreme case where the local data become homogeneous, only O(1) rounds of communication are required among the agents. To the best of our knowledge, this is the first algorithmic framework for FL that achieves all of the above properties.
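To make the computation-then-aggregation pattern and the primal-dual alternative concrete, below is a minimal numpy sketch. It is an illustration only, not the authors' implementation: the quadratic client losses, the helper names (local_grad, fedavg_round, fedpd_round), and all step sizes are hypothetical, the local augmented-Lagrangian subproblem is solved inexactly by plain gradient steps, and clients communicate every round, whereas the paper's framework also allows communication rounds to be skipped (which is where the O(1) communication in the homogeneous case comes from).

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each client i holds a quadratic loss
#   f_i(x) = 0.5 * ||A_i x - b_i||^2  (stands in for a real local objective).
NUM_CLIENTS, DIM = 4, 5
A = [rng.standard_normal((20, DIM)) for _ in range(NUM_CLIENTS)]
b = [rng.standard_normal(20) for _ in range(NUM_CLIENTS)]

def local_grad(i, x):
    """Gradient of client i's local loss at x."""
    return A[i].T @ (A[i] @ x - b[i])

def fedavg_round(x0, local_steps=10, lr=0.01):
    """One FedAvg round: each client runs `local_steps` gradient
    steps from the shared model x0, then the server averages."""
    updated = []
    for i in range(NUM_CLIENTS):
        x = x0.copy()
        for _ in range(local_steps):
            x -= lr * local_grad(i, x)
        updated.append(x)
    return np.mean(updated, axis=0)  # aggregation step

def fedpd_round(x0, lam, eta=0.5, local_steps=20, lr=0.01):
    """One simplified primal-dual round: each client approximately
    minimizes its augmented Lagrangian
        f_i(x) + <lam_i, x - x0> + (1/(2*eta)) * ||x - x0||^2,
    then takes a dual ascent step on lam_i."""
    proposals = []
    for i in range(NUM_CLIENTS):
        x = x0.copy()
        for _ in range(local_steps):  # inexact local oracle
            g = local_grad(i, x) + lam[i] + (x - x0) / eta
            x -= lr * g
        lam[i] = lam[i] + (x - x0) / eta    # dual update
        proposals.append(x + eta * lam[i])  # client's proposed global model
    return np.mean(proposals, axis=0), lam  # aggregation (communication step)

x_avg = np.zeros(DIM)
x_pd = np.zeros(DIM)
lam = [np.zeros(DIM) for _ in range(NUM_CLIENTS)]
for _ in range(50):
    x_avg = fedavg_round(x_avg)
    x_pd, lam = fedpd_round(x_pd, lam)

def total_grad(x):
    return sum(local_grad(i, x) for i in range(NUM_CLIENTS))

print("FedAvg grad norm:", np.linalg.norm(total_grad(x_avg)))
print("FedPD  grad norm:", np.linalg.norm(total_grad(x_pd)))

On a toy non-IID instance like this, the printed gradient norms illustrate the abstract's point: FedAvg with many local steps can stall away from a stationary point of the sum because the clients' local minimizers disagree, while the dual variables in the primal-dual round compensate for that disagreement.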
Pages: 6055-6070
Page count: 16