Overcoming Client Data Deficiency in Federated Learning by Exploiting Unlabeled Data on the Server

Cited by: 0
Authors
Park, Jae-Min [1 ]
Jang, Won-Jun [1 ]
Oh, Tae-Hyun [2 ,3 ]
Lee, Si-Hyeon [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol KAIST, Sch Elect Engn, Daejeon 34141, South Korea
[2] Pohang Univ Sci & Technol POSTECH, Dept Elect Engn, Pohang 37673, South Korea
[3] Pohang Univ Sci & Technol POSTECH, Grad Sch AI, Pohang 37673, South Korea
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Federated Learning; knowledge distillation; ensemble distillation; self-supervised learning; uncertainty;
DOI
10.1109/ACCESS.2024.3458911
CLC number
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Federated Learning (FL) is a distributed machine learning paradigm in which multiple clients collaboratively train a server model. In practice, clients often possess limited data and are not always available to participate simultaneously in FL, which leads to data deficiency. This data deficiency degrades the entire learning process. To address it, we propose Federated learning with entropy-weighted ensemble Distillation and Self-supervised learning (FedDS). FedDS effectively handles situations with limited data per client and few participating clients. The key idea is to exploit the unlabeled data available on the server when aggregating the client models into a server model. We distill the multiple client models into a server model in an ensemble manner. To robustly weight the quality of the pseudo-labels produced by the client models, we propose an entropy weighting method and show that it exhibits a favorable tendency to assign higher weights to more accurate predictions. Furthermore, we jointly leverage a separate self-supervised loss to improve the generalization of the server model. We demonstrate the effectiveness of FedDS both empirically and theoretically. On CIFAR-10, our method improves over FedAVG by 12.54% in the data-deficient regime, and by 17.16% and 23.56% in the more challenging noisy-label and Byzantine-client scenarios, respectively. On CIFAR-100 and ImageNet-100, our method improves over FedAVG by 18.68% and 15.06% in the data-deficient regime, respectively.
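The abstract outlines the core mechanism: each client model predicts on the same batch of unlabeled server data, the predictions are combined with per-sample weights derived from prediction entropy, and the server model is distilled toward the resulting ensemble pseudo-labels. The following is a minimal PyTorch-style sketch of that weighting idea as described in the abstract; the softmax(-entropy) weighting function, the plain KL distillation objective, and the tensor shapes are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def entropy_weighted_ensemble(client_logits: torch.Tensor) -> torch.Tensor:
    # client_logits: (num_clients, batch, num_classes) predictions of each
    # client model on the same batch of unlabeled server data.
    probs = F.softmax(client_logits, dim=-1)
    # Per-client, per-sample prediction entropy; confident clients score low.
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Map low entropy to high weight, normalized across clients per sample
    # (assumed form of the weighting; the paper may use a different mapping).
    weights = F.softmax(-entropy, dim=0).unsqueeze(-1)  # (clients, batch, 1)
    # Weighted ensemble of client predictions -> soft pseudo-labels.
    return (weights * probs).sum(dim=0)  # (batch, num_classes)

def distillation_loss(server_logits: torch.Tensor,
                      client_logits: torch.Tensor) -> torch.Tensor:
    # KL divergence between the server prediction and the ensemble target.
    targets = entropy_weighted_ensemble(client_logits)
    return F.kl_div(F.log_softmax(server_logits, dim=-1),
                    targets, reduction="batchmean")
```

In a full training round, the server would minimize this distillation term jointly with the paper's separate self-supervised loss on the same unlabeled data; the abstract does not specify which self-supervised objective is used.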
Pages: 130007 - 130021
Page count: 15
Related Papers
50 records in total
  • [21] Empirical Measurement of Client Contribution for Federated Learning With Data Size Diversification
    Shyn, Sung Kuk
    Kim, Donghee
    Kim, Kwangsu
    [J]. IEEE ACCESS, 2022, 10 : 118563 - 118574
  • [22] Exploring Server-Side Data in Federated Learning: An Empirical Study
    Liu, Tao
    Pan, Shengli
    Li, Peng
    [J]. 2024 IEEE/CIC INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN CHINA, ICCC, 2024,
  • [23] Client Discovery and Data Exchange in Edge-based Federated Learning via Named Data Networking
    Amadeo, Marica
    Campolo, Claudia
Iera, Antonio
    Molinaro, Antonella
    Ruggeri, Giuseppe
    [J]. IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2022), 2022, : 2990 - 2995
  • [24] Towards Predicting Client Benefit and Contribution in Federated Learning From Data Imbalance
    Duesing, Christoph
    Cimiano, Philipp
    [J]. PROCEEDINGS OF THE 3RD INTERNATIONAL WORKSHOP ON DISTRIBUTED MACHINE LEARNING, DISTRIBUTEDML 2022, 2022, : 23 - 29
  • [25] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin
    Feng, Yanhao
    Chang, Xinyu
    Liu, Chi Harold
    [J]. PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 (02) : 1139 - 1151
  • [26] Flexible Clustered Federated Learning for Client-Level Data Distribution Shift
    Duan, Moming
    Liu, Duo
    Ji, Xinyuan
    Wu, Yu
    Liang, Liang
    Chen, Xianzhang
    Tan, Yujuan
    Ren, Ao
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2661 - 2674
  • [27] Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection
    Zhang, Shulai
    Li, Zirui
    Chen, Quan
    Zheng, Wenli
    Leng, Jingwen
    Guo, Minyi
    [J]. 50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021,
  • [28] FedDSS: A data-similarity approach for client selection in horizontal federated learning
    Nguyen, Tuong Minh
    Poh, Kim Leng
    Chong, Shu-Ling
    Lee, Jan Hau
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2024, 192
  • [29] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin
    Feng, Yanhao
    Chang, Xinyu
    Liu, Chi Harold
    [J]. PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 (02) : 1139 - 1151
  • [30] Can hierarchical client clustering mitigate the data heterogeneity effect in federated learning?
    Lee, Seungjun
    Yu, Miri
    Yoon, Daegun
    Oh, Sangyoon
    [J]. 2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW, 2023, : 799 - 808