Overcoming Client Data Deficiency in Federated Learning by Exploiting Unlabeled Data on the Server

Cited by: 0
Authors
Park, Jae-Min [1 ]
Jang, Won-Jun [1 ]
Oh, Tae-Hyun [2 ,3 ]
Lee, Si-Hyeon [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Pohang Univ Sci & Technol POSTECH, Dept Elect Engn, Pohang 37673, South Korea
[3] Pohang Univ Sci & Technol POSTECH, Grad Sch AI, Pohang 37673, South Korea
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Federated Learning; knowledge distillation; ensemble distillation; self-supervised learning; uncertainty;
DOI
10.1109/ACCESS.2024.3458911
CLC number
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Federated Learning (FL) is a distributed machine learning paradigm in which multiple clients collaboratively train a server model. In practice, clients often possess limited data and are not always available to participate simultaneously in FL, which leads to data deficiency. This data deficiency degrades the entire learning process. To address it, we propose Federated learning with entropy-weighted ensemble Distillation and Self-supervised learning (FedDS). FedDS effectively handles situations with limited data per client and few participating clients. The key idea is to exploit the unlabeled data available on the server when aggregating the client models into a server model. We distill the multiple client models into a server model in an ensemble manner. To robustly weight the quality of the pseudo-labels produced by the client models, we propose an entropy weighting method and show that it exhibits a favorable tendency to assign higher weights to more accurate predictions. Furthermore, we jointly leverage a separate self-supervised loss to improve the generalization of the server model. We demonstrate the effectiveness of FedDS both empirically and theoretically. On CIFAR-10, our method improves over FedAVG by 12.54% in the data-deficient regime, and by 17.16% and 23.56% in the more challenging noisy-label and Byzantine-client scenarios, respectively. On CIFAR-100 and ImageNet-100, our method improves over FedAVG by 18.68% and 15.06% in the data-deficient regime, respectively.
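The abstract outlines the core mechanism: each client model predicts on the same batch of unlabeled server data, the predictions are combined with per-sample weights derived from prediction entropy, and the server model is distilled toward the resulting ensemble pseudo-labels. The following is a minimal PyTorch-style sketch of that weighting idea as described in the abstract; the softmax(-entropy) weighting function, the plain KL distillation objective, and the tensor shapes are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def entropy_weighted_ensemble(client_logits: torch.Tensor) -> torch.Tensor:
    # client_logits: (num_clients, batch, num_classes) predictions of each
    # client model on the same batch of unlabeled server data.
    probs = F.softmax(client_logits, dim=-1)
    # Per-client, per-sample prediction entropy; confident clients score low.
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Map low entropy to high weight, normalized across clients per sample
    # (assumed form of the weighting; the paper may use a different mapping).
    weights = F.softmax(-entropy, dim=0).unsqueeze(-1)  # (clients, batch, 1)
    # Weighted ensemble of client predictions -> soft pseudo-labels.
    return (weights * probs).sum(dim=0)  # (batch, num_classes)

def distillation_loss(server_logits: torch.Tensor,
                      client_logits: torch.Tensor) -> torch.Tensor:
    # KL divergence between the server prediction and the ensemble target.
    targets = entropy_weighted_ensemble(client_logits)
    return F.kl_div(F.log_softmax(server_logits, dim=-1),
                    targets, reduction="batchmean")
```

In a full training round, the server would minimize this distillation term jointly with the paper's separate self-supervised loss on the same unlabeled data; the abstract does not specify which self-supervised objective is used.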
Pages: 130007 - 130021
Page count: 15
Related Papers
50 records in total
  • [21] Empirical Measurement of Client Contribution for Federated Learning With Data Size Diversification
    Shyn, Sung Kuk
    Kim, Donghee
    Kim, Kwangsu
    [J]. IEEE ACCESS, 2022, 10 : 118563 - 118574
  • [22] Exploring Server-Side Data in Federated Learning: An Empirical Study
    Liu, Tao
    Pan, Shengli
    Li, Peng
    [J]. 2024 IEEE/CIC INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN CHINA, ICCC, 2024,
  • [23] Client Discovery and Data Exchange in Edge-based Federated Learning via Named Data Networking
    Amadeo, Marica
    Campolo, Claudia
Iera, Antonio
    Molinaro, Antonella
    Ruggeri, Giuseppe
    [J]. IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2022), 2022, : 2990 - 2995
  • [24] Towards Predicting Client Benefit and Contribution in Federated Learning From Data Imbalance
    Duesing, Christoph
    Cimiano, Philipp
    [J]. PROCEEDINGS OF THE 3RD INTERNATIONAL WORKSHOP ON DISTRIBUTED MACHINE LEARNING, DISTRIBUTEDML 2022, 2022, : 23 - 29
  • [25] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin
    Feng, Yanhao
    Chang, Xinyu
    Liu, Chi Harold
    [J]. PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 (02) : 1139 - 1151
  • [26] Flexible Clustered Federated Learning for Client-Level Data Distribution Shift
    Duan, Moming
    Liu, Duo
    Ji, Xinyuan
    Wu, Yu
    Liang, Liang
    Chen, Xianzhang
    Tan, Yujuan
    Ren, Ao
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2661 - 2674
  • [27] Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection
    Zhang, Shulai
    Li, Zirui
    Chen, Quan
    Zheng, Wenli
    Leng, Jingwen
    Guo, Minyi
    [J]. 50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021,
  • [28] FedDSS: A data-similarity approach for client selection in horizontal federated learning
    Nguyen, Tuong Minh
    Poh, Kim Leng
    Chong, Shu-Ling
    Lee, Jan Hau
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2024, 192
  • [29] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin
    Feng, Yanhao
    Chang, Xinyu
    Liu, Chi Harold
    [J]. PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 (02) : 1139 - 1151
  • [30] Can hierarchical client clustering mitigate the data heterogeneity effect in federated learning?
    Lee, Seungjun
    Yu, Miri
    Yoon, Daegun
    Oh, Sangyoon
    [J]. 2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW, 2023, : 799 - 808