FedZKT: Zero-Shot Knowledge Transfer towards Resource-Constrained Federated Learning with Heterogeneous On-Device Models

Cited by: 31
Authors
Zhang, Lan [1 ]
Wu, Dapeng [2 ]
Yuan, Xiaoyong [3 ]
Affiliations
[1] Michigan Technol Univ, Dept Elect & Comp Engn, Houghton, MI 49931 USA
[2] Univ Florida, Dept Elect & Comp Engn, Gainesville, FL USA
[3] Michigan Technol Univ, Coll Comp, Houghton, MI 49931 USA
Source
2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022) | 2022
Funding
US National Science Foundation;
Keywords
Federated Learning; Model Heterogeneity; Resource Constraint; Data-Free; Knowledge Transfer;
DOI
10.1109/ICDCS54860.2022.00094
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Federated learning enables multiple distributed devices to collaboratively learn a shared prediction model without centralizing their on-device data. Most current algorithms, however, require comparable local training efforts with on-device models of the same structure and size, which impedes participation by resource-constrained devices. Given the widespread yet heterogeneous devices in use today, this paper proposes FedZKT, a federated learning framework that supports heterogeneous on-device models through zero-shot knowledge transfer. Specifically, FedZKT allows each device to independently choose its on-device model according to its local resources. To transfer knowledge across these heterogeneous on-device models, FedZKT uses a zero-shot distillation approach that, unlike prior work relying on a public dataset or a pre-trained data generator, requires no access to private on-device data. Moreover, this compute-intensive distillation task is offloaded to the server, where a generator is adversarially trained against the ensemble of collected on-device models, so that resource-constrained devices can still participate. The distilled central knowledge is then sent back to each device in the form of its on-device model parameters, which are easy to absorb on the device side. Extensive experimental studies demonstrate the effectiveness and robustness of FedZKT with respect to agnostic on-device knowledge, heterogeneous on-device models, and other challenging federated learning scenarios, such as heterogeneous on-device data and straggler effects.
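To make the server-side procedure concrete, the following is a minimal, hypothetical PyTorch-style sketch of such a data-free adversarial distillation loop. The generator objective (maximizing disagreement among the collected on-device models), the distillation loss, and all function names and hyperparameters here are illustrative assumptions for exposition, not the paper's actual implementation.

import torch
import torch.nn.functional as F

# Hypothetical sketch of one server-side distillation round: a generator
# synthesizes inputs from noise and is trained to maximize disagreement
# among the heterogeneous on-device models; the models are then distilled
# toward the ensemble consensus on those synthetic inputs. Assumes at
# least two collected models.
def server_distillation_round(generator, client_models,
                              steps=100, z_dim=100, batch_size=64, lr=1e-3):
    g_opt = torch.optim.Adam(generator.parameters(), lr=lr)
    c_opts = [torch.optim.Adam(m.parameters(), lr=lr) for m in client_models]

    for _ in range(steps):
        z = torch.randn(batch_size, z_dim)

        # Generator step: synthesize samples on which the on-device
        # models disagree the most (adversarial objective).
        x = generator(z)
        probs = [m(x).softmax(-1) for m in client_models]
        disagreement = sum(
            F.l1_loss(probs[i], probs[j])
            for i in range(len(probs)) for j in range(i + 1, len(probs)))
        g_opt.zero_grad()
        (-disagreement).backward()  # gradient ascent on disagreement
        g_opt.step()

        # Distillation step: each on-device model matches the ensemble
        # consensus on the synthetic samples (knowledge transfer).
        x = generator(z).detach()
        with torch.no_grad():
            target = torch.stack(
                [m(x).softmax(-1) for m in client_models]).mean(0)
        for m, opt in zip(client_models, c_opts):
            loss = F.kl_div(m(x).log_softmax(-1), target,
                            reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    # The updated parameters of each on-device model would then be sent
    # back to the corresponding device, as described in the abstract.

In this reading, only the server runs the adversarial loop; each device's cost stays limited to ordinary local training, which is what allows resource-constrained devices to participate.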
Pages: 928-938
Number of pages: 11