Overcoming Client Data Deficiency in Federated Learning by Exploiting Unlabeled Data on the Server

Citations: 0
Authors
Park, Jae-Min [1]
Jang, Won-Jun [1]
Oh, Tae-Hyun [2, 3]
Lee, Si-Hyeon [1]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Pohang Univ Sci & Technol POSTECH, Dept Elect Engn, Pohang 37673, South Korea
[3] Pohang Univ Sci & Technol POSTECH, Grad Sch AI, Pohang 37673, South Korea
Keywords
Federated learning; knowledge distillation; ensemble distillation; self-supervised learning; uncertainty
DOI
10.1109/ACCESS.2024.3458911
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Federated Learning (FL) is a distributed machine learning paradigm in which multiple clients collaboratively train a server model. In practice, clients often hold limited data and are not always available to participate simultaneously, which leads to data deficiency and degrades the entire learning process. To address this, we propose Federated learning with entropy-weighted ensemble Distillation and Self-supervised learning (FedDS), which handles settings with limited data per client and few clients. The key idea is to exploit the unlabeled data available on the server when aggregating client models into a server model: we distill the multiple client models into the server model as an ensemble. To robustly weigh the quality of the pseudo-labels produced by the client models, we propose an entropy weighting method and show that it tends to assign higher weights to more accurate predictions. In addition, we jointly use a separate self-supervised loss to improve the generalization of the server model. We demonstrate the effectiveness of FedDS both empirically and theoretically. On CIFAR-10, our method improves over FedAVG by 12.54% in the data-deficient regime, and by 17.16% and 23.56% in the more challenging noisy-label and Byzantine-client scenarios, respectively. On CIFAR-100 and ImageNet-100, it improves over FedAVG by 18.68% and 15.06%, respectively, in the data-deficient regime.
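The entropy-weighted ensemble distillation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so this sketch assumes that each client's weight is proportional to exp(-H), where H is the Shannon entropy of that client's predicted class distribution on one unlabeled server sample. The function names (`entropy_weighted_ensemble`, `distillation_loss`) and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weighted_ensemble(client_probs):
    """Combine client predictions for ONE unlabeled server sample.

    Assumption (not from the paper): weight_k is proportional to
    exp(-H(p_k)), so lower-entropy (more confident) clients count more.
    Returns the per-client weights and the weighted soft pseudo-label.
    """
    weights = softmax([-entropy(p) for p in client_probs])
    num_classes = len(client_probs[0])
    target = [
        sum(w * p[c] for w, p in zip(weights, client_probs))
        for c in range(num_classes)
    ]
    return weights, target


def distillation_loss(target, server_probs, eps=1e-12):
    """Cross-entropy between the soft pseudo-label and the server prediction."""
    return -sum(t * math.log(s + eps) for t, s in zip(target, server_probs))


# Hypothetical example: one confident client and one uninformative client.
confident = [0.90, 0.05, 0.05]
uncertain = [1 / 3, 1 / 3, 1 / 3]
weights, target = entropy_weighted_ensemble([confident, uncertain])
loss = distillation_loss(target, server_probs=[0.6, 0.2, 0.2])
print(weights, target, loss)
```

In this sketch the confident client receives the larger weight, so the pseudo-label leans toward its prediction. The self-supervised term mentioned in the abstract would be added to this distillation loss as a separate term; how the two are balanced is not stated in the abstract.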
Pages: 130007-130021
Page count: 15