Overcoming Client Data Deficiency in Federated Learning by Exploiting Unlabeled Data on the Server

Authors
Park, Jae-Min [1]
Jang, Won-Jun [1]
Oh, Tae-Hyun [2,3]
Lee, Si-Hyeon [1]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Pohang Univ Sci & Technol POSTECH, Dept Elect Engn, Pohang 37673, South Korea
[3] Pohang Univ Sci & Technol POSTECH, Grad Sch AI, Pohang 37673, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Federated Learning; knowledge distillation; ensemble distillation; self-supervised learning; uncertainty;
DOI
10.1109/ACCESS.2024.3458911
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Federated Learning (FL) is a distributed machine learning paradigm in which multiple clients collaboratively train a server model. In practice, clients often possess limited data and are not always available to participate simultaneously, which leads to data deficiency and degrades the entire learning process. To address this, we propose Federated learning with entropy-weighted ensemble Distillation and Self-supervised learning (FedDS). FedDS effectively handles situations with limited data per client and few clients. The key idea is to exploit the unlabeled data available on the server when aggregating the client models into a server model. We distill the multiple client models into the server model as an ensemble. To robustly weigh the quality of the pseudo-labels produced by the client models, we propose an entropy weighting method and show that it tends to assign higher weights to more accurate predictions. Furthermore, we jointly use a separate self-supervised loss to improve the generalization of the server model. We demonstrate the effectiveness of FedDS both empirically and theoretically. For CIFAR-10, our method improves over FedAVG by 12.54% in the data-deficient regime, and by 17.16% and 23.56% in the more challenging noisy-label and Byzantine-client scenarios, respectively. For CIFAR-100 and ImageNet-100, our method improves over FedAVG by 18.68% and 15.06% in the data-deficient regime, respectively.
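The entropy weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the rule below (a softmax over negative prediction entropies across clients) is an assumption chosen for illustration, and the function names `entropy` and `entropy_weighted_ensemble` are hypothetical. It combines the class probabilities that several client models output for one unlabeled server sample into a single soft pseudo-label.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def entropy_weighted_ensemble(client_probs):
    """Combine per-client class probabilities for ONE unlabeled sample.

    Assumption (not taken from the paper): weights are a softmax over
    negative entropies, so confident (low-entropy) clients count more.
    """
    neg_h = [-entropy(p) for p in client_probs]
    m = max(neg_h)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in neg_h]
    total = sum(exps)
    weights = [e / total for e in exps]

    num_classes = len(client_probs[0])
    pseudo_label = [
        sum(w * p[c] for w, p in zip(weights, client_probs))
        for c in range(num_classes)
    ]
    return weights, pseudo_label


# Two hypothetical clients: one confident, one nearly uniform.
weights, pseudo_label = entropy_weighted_ensemble([
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],
])
```

In this sketch the confident client receives the larger weight, and the resulting soft pseudo-label could serve as the distillation target for the server model on that sample. The self-supervised loss mentioned in the abstract is not shown here.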
Pages: 130007-130021
Number of pages: 15