Two-Stage Clustering for Federated Learning with Pseudo Mini-batch SGD Training on Non-IID Data

Cited by: 1
Authors
Weng, Jianqing [1]
Su, Songzhi [1]
Fan, Xiaoliang [1]
Affiliations
[1] Xiamen Univ, Xiamen 361005, Peoples R China
Source
COMPUTER SUPPORTED COOPERATIVE WORK AND SOCIAL COMPUTING, CHINESECSCW 2021, PT I | 2022 / Vol. 1491
Keywords
Federated learning; Clustering; Non-IID data
DOI
10.1007/978-981-19-4546-5_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The statistical heterogeneity problem in federated learning is mainly caused by skew in the data distribution across clients. In this paper, we first identify a connection between the discrepancy of clients' data distributions and the divergence of their models. Based on this insight, we introduce a K-center clustering method that groups clients by the similarity of their local update parameters, which effectively reduces data distribution skew. Second, we provide a theoretical proof that a more uniform data distribution among the clients in training reduces the growth of model divergence, thereby improving training performance in Non-IID environments. Therefore, we randomly divide the clients of each first-stage cluster into multiple fine-grained clusters to flatten the original data distribution. Finally, to fully leverage the data in each fine-grained cluster for training, we propose an intra-cluster training method named pseudo mini-batch SGD training, which performs general mini-batch SGD training on each fine-grained cluster while the data stays local. With the two-stage clustering mechanism, the negative effect of Non-IID data can be steadily eliminated. Experiments on two federated learning benchmarks, FEMNIST and CelebA, as well as a manually constructed Non-IID dataset based on CIFAR10, show that our proposed method significantly improves training efficiency on Non-IID data and outperforms several widely used federated baselines.
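The two clustering stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each client's local update is a flat list of floats, uses the standard greedy farthest-first heuristic for K-center with Euclidean distance, and splits each first-stage cluster into groups of a fixed size. The names `k_center`, `split_into_groups`, `two_stage_clusters`, and the `group_size` parameter are hypothetical, and pseudo mini-batch SGD training itself is not sketched because the abstract does not specify it.

```python
import math
import random


def k_center(points, k, first=0):
    """Greedy farthest-first K-center (an assumed heuristic, not from the paper).

    Returns (indices of the chosen centers, cluster index for every point).
    """
    centers = [first]
    # dist[i] holds the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[nxt]))
    # Assign every point to its nearest center.
    assign = [
        min(range(k), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, assign


def split_into_groups(members, group_size, rng):
    """Randomly partition one first-stage cluster into fine-grained groups."""
    shuffled = members[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]


def two_stage_clusters(updates, k, group_size, seed=0):
    """Stage 1: K-center on update vectors. Stage 2: random fine-grained split."""
    rng = random.Random(seed)
    _, assign = k_center(updates, k)
    clusters = [[i for i, a in enumerate(assign) if a == c] for c in range(k)]
    return [split_into_groups(members, group_size, rng) for members in clusters]


# Toy client updates: two well-separated groups of three clients each.
updates = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers, assign = k_center(updates, k=2)
groups = two_stage_clusters(updates, k=2, group_size=2)
```

On the toy data, clients 0 to 2 and clients 3 to 5 fall into separate first-stage clusters, and each cluster is then split into fine-grained groups of at most two clients. In practice the update vectors would be flattened model parameters or parameter deltas, and the distance measure is a design choice this sketch fixes arbitrarily.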
Pages: 29-43
Number of pages: 15