Adaptive Federated Dropout: Improving Communication Efficiency and Generalization for Federated Learning

Cited by: 24
Authors
Bouacida, Nader [1 ]
Hou, Jiahui [1 ]
Zang, Hui [2 ]
Liu, Xin [1 ]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Google, Mountain View, CA 94043 USA
Source
IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (IEEE INFOCOM WKSHPS 2021) | 2021
Keywords
federated learning; compression; communication efficiency; generalization; convergence time
DOI
10.1109/INFOCOMWKSHPS51825.2021.9484526
CLC number
TP301 [Theory and Methods]
Discipline code
081202
Abstract
To exploit the wealth of data generated and stored at distributed entities such as mobile phones, a decentralized machine learning setting known as federated learning enables multiple clients to collaboratively learn a model while keeping all of their data on-device. However, the scale and decentralization of federated learning present new challenges. Communication between the clients and the server is considered a main bottleneck in the convergence time of federated learning, because of the very large number of model weights that must be exchanged in each training round. In this paper, we propose and study Adaptive Federated Dropout (AFD), a novel technique for reducing the communication costs associated with federated learning. It optimizes both server-client communication and computation costs by allowing clients to train locally on a selected subset of the global model. We empirically show that this strategy, combined with existing compression methods, provides up to a 57x reduction in convergence time, and that it outperforms state-of-the-art solutions for communication efficiency. Furthermore, it improves model generalization by up to 1.7%.
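The core mechanism the abstract describes (the server extracts a sub-model, the client trains only that subset, and the server merges the trained weights back into the global model) can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: select_submodel and merge_update are hypothetical helpers, and random unit selection here stands in for AFD's adaptive policy for choosing which activations to drop.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_submodel(W, keep_frac, rng):
    """Keep a fraction of a layer's hidden units (columns of W).
    NOTE: random choice is a stand-in; AFD selects units adaptively.
    Returns the smaller sub-matrix plus the indices needed to merge back."""
    n_units = W.shape[1]
    n_keep = max(1, int(keep_frac * n_units))
    kept = np.sort(rng.choice(n_units, size=n_keep, replace=False))
    return W[:, kept].copy(), kept

def merge_update(W, W_sub_trained, kept):
    """Write the client's trained sub-model weights back into the global model."""
    W_new = W.copy()
    W_new[:, kept] = W_sub_trained
    return W_new

# Toy round: one dense layer, one client.
W_global = rng.normal(size=(128, 64))            # full global layer
W_sub, kept = select_submodel(W_global, 0.75, rng)  # server -> client: smaller payload
W_sub -= 0.01 * rng.normal(size=W_sub.shape)     # stand-in for local SGD on the sub-model
W_global = merge_update(W_global, W_sub, kept)   # client -> server: same smaller payload
print(W_sub.shape)                                # (128, 48): 25% fewer weights on the wire
```

Because only the kept columns travel in each direction, both the downlink and the uplink payloads shrink by roughly the drop fraction, which is the source of the communication savings the abstract reports; the client also trains a smaller model, reducing local computation.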
Pages: 6