Federated Learning Based on Diffusion Model to Cope with Non-IID Data

Cited by: 4
Authors
Zhao, Zhuang [1 ]
Yang, Feng [1 ,2 ,3 ]
Liang, Guirong [1 ]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning 530004, Peoples R China
[2] Guangxi Univ, Guangxi Key Lab Multimedia Commun Network Technol, Nanning 530004, Peoples R China
[3] Guangxi Univ, Key Lab Parallel & Distributed Comp Guangxi Coll, Nanning, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IX | 2024 / Vol. 14433
Keywords
Federated learning; Non-IID data; Diffusion model; Data augmentation;
DOI
10.1007/978-981-99-8546-3_18
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Federated learning is a distributed machine learning paradigm that allows model training without centralizing sensitive data in a single place. However, non-independent and identically distributed (non-IID) data can degrade learning performance in federated learning. Data augmentation schemes have been proposed to address this issue, but they often require sharing clients' original data, which poses privacy risks. To address these challenges, we propose FedDDA, a data-augmentation-based federated learning architecture that uses diffusion models to generate data conforming to the global class distribution, thereby alleviating the non-IID data problem. In FedDDA, a diffusion model is first trained through federated learning and then used for data augmentation, mitigating the degree of non-IID skew without disclosing clients' original data. Our experiments on non-IID settings with various configurations show that FedDDA significantly outperforms FedAvg, with up to 43.04% improvement on the CIFAR-10 dataset and up to 20.05% improvement on the Fashion-MNIST dataset. Additionally, we find that even relatively low-quality generated samples, provided they conform to the global class distribution, still improve federated learning performance considerably.
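This record does not show FedDDA's exact augmentation rule, but the core idea of the abstract — each client synthesizes samples so its local data conforms to the global class distribution — can be sketched. Assuming clients know the aggregated global class proportions, the following hypothetical helper (not from the paper) computes how many samples per class a client would ask the shared diffusion model to generate:

```python
import numpy as np

def augmentation_counts(local_counts, global_dist):
    """Hypothetical sketch: number of synthetic samples per class a client
    would request from the shared diffusion model so that its augmented
    local dataset matches the global class distribution."""
    local_counts = np.asarray(local_counts, dtype=float)
    global_dist = np.asarray(global_dist, dtype=float)
    global_dist = global_dist / global_dist.sum()  # normalize to probabilities
    # Smallest augmented dataset size n with n * global_dist >= local_counts
    # for every class (samples can only be added, never removed).
    n = np.max(local_counts / global_dist)
    # Small tolerance guards against float round-up inside ceil.
    target = np.ceil(n * global_dist - 1e-9)
    return (target - local_counts).astype(int)

# A client holding mostly class 0 under a uniform global distribution
# needs 85 extra samples each of classes 1 and 2.
print(augmentation_counts([90, 5, 5], [1, 1, 1]))
```

The "add-only" constraint here mirrors the paper's setting: clients never discard or share their original data, they only top it up with generated samples until the class proportions match the global target.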
Pages: 220-231
Page count: 12