Adaptive Federated Dropout: Improving Communication Efficiency and Generalization for Federated Learning

Cited by: 24
Authors
Bouacida, Nader [1 ]
Hou, Jiahui [1 ]
Zang, Hui [2 ]
Liu, Xin [1 ]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Google, Mountain View, CA 94043 USA
Source
IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (IEEE INFOCOM WKSHPS 2021) | 2021
Keywords
federated learning; compression; communication efficiency; generalization; convergence time
DOI
10.1109/INFOCOMWKSHPS51825.2021.9484526
CLC number
TP301 [Theory and Methods]
Discipline code
081202
Abstract
To exploit the wealth of data generated and stored at distributed entities such as mobile phones, a decentralized machine learning setting known as federated learning enables multiple clients to collaboratively learn a model while keeping all of their data on-device. However, the scale and decentralization of federated learning present new challenges. Communication between the clients and the server is considered a main bottleneck in the convergence time of federated learning, because of the very large number of model weights that must be exchanged in each training round. In this paper, we propose and study Adaptive Federated Dropout (AFD), a novel technique for reducing the communication costs associated with federated learning. It optimizes both server-client communication and computation costs by allowing clients to train locally on a selected subset of the global model. We empirically show that this strategy, combined with existing compression methods, provides up to a 57x reduction in convergence time, and that it outperforms state-of-the-art solutions for communication efficiency. Furthermore, it improves model generalization by up to 1.7%.
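The core mechanism the abstract describes (the server extracts a sub-model, the client trains only that subset, and the server merges the trained weights back into the global model) can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: select_submodel and merge_update are hypothetical helpers, and random unit selection here stands in for AFD's adaptive policy for choosing which activations to drop.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_submodel(W, keep_frac, rng):
    """Keep a fraction of a layer's hidden units (columns of W).
    NOTE: random choice is a stand-in; AFD selects units adaptively.
    Returns the smaller sub-matrix plus the indices needed to merge back."""
    n_units = W.shape[1]
    n_keep = max(1, int(keep_frac * n_units))
    kept = np.sort(rng.choice(n_units, size=n_keep, replace=False))
    return W[:, kept].copy(), kept

def merge_update(W, W_sub_trained, kept):
    """Write the client's trained sub-model weights back into the global model."""
    W_new = W.copy()
    W_new[:, kept] = W_sub_trained
    return W_new

# Toy round: one dense layer, one client.
W_global = rng.normal(size=(128, 64))            # full global layer
W_sub, kept = select_submodel(W_global, 0.75, rng)  # server -> client: smaller payload
W_sub -= 0.01 * rng.normal(size=W_sub.shape)     # stand-in for local SGD on the sub-model
W_global = merge_update(W_global, W_sub, kept)   # client -> server: same smaller payload
print(W_sub.shape)                                # (128, 48): 25% fewer weights on the wire
```

Because only the kept columns travel in each direction, both the downlink and the uplink payloads shrink by roughly the drop fraction, which is the source of the communication savings the abstract reports; the client also trains a smaller model, reducing local computation.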
Pages: 6