Privacy-preserved federated clustering with Non-IID data via GANs

Cited by: 0
Authors
Zhao, Jianzhe [1]
Wang, Wenji [1]
Wang, Jiabao [2]
Zhang, Songyang [1]
Fan, Zhelin [1]
Matwin, Stan [3]
Affiliations
[1] Northeastern Univ, Software Coll, Shenyang, Liaoning, Peoples R China
[2] Zhejiang Univ, Software Coll, Ningbo, Zhejiang, Peoples R China
[3] Dalhousie Univ, Dept Comp Sci, Halifax, NS, Canada
Funding
National Natural Science Foundation of China;
Keywords
Federated clustering; Non-IID; Differential privacy; GANs;
DOI
10.1007/s11227-025-07006-2
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Federated clustering (FedC) clusters participants using global similarity measures and then trains on the independent clusters to improve global accuracy. As an unsupervised federated learning approach, FedC operates on distributed, unlabeled data while preserving privacy. However, it faces two challenges: non-independent and identically distributed (Non-IID) data on clients makes the global clustering structure fragile, and shared gradients can leak private information. In response, this study introduces GFC-DP, a privacy-preserving federated clustering algorithm for Non-IID data based on generative adversarial networks (GANs), which addresses both data heterogeneity and privacy protection. The algorithm uses GANs to generate synthetic data, leveraging global information to construct robust clustering structures. Notably, as the first work to introduce a client selection strategy into GAN training, it improves the performance of the global GAN model by defining a client evaluation equation and then selecting the better-performing clients to participate in training. Additionally, Gaussian noise is added during GAN training to strengthen privacy and counter model inversion and membership inference attacks. One-shot FedC is then performed on the client side based on global centroids to obtain a stable global clustering structure. We conducted comprehensive experiments on the MNIST, CIFAR-10, Rotated MNIST, and Rotated CIFAR-10 datasets. The results demonstrate that, in Non-IID scenarios, GFC-DP outperforms similar algorithms in both GAN performance and clustering effectiveness on image classification tasks.
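The abstract does not give the noise calibration, the client evaluation equation, or the exact clustering procedure, so the following is only a minimal sketch of two of the ideas it names: adding Gaussian noise to a clipped update, and a one-shot assignment of client samples to global centroids. The function names, the clipping step, and the parameters `clip_norm` and `noise_multiplier` are assumptions made for illustration (in the style of the Gaussian mechanism used in differentially private training), not the GFC-DP implementation.

```python
import numpy as np


def clip_and_add_gaussian_noise(grad, clip_norm, noise_multiplier, rng):
    """Clip `grad` to L2 norm <= clip_norm, then add Gaussian noise.

    Assumption: noise standard deviation is noise_multiplier * clip_norm;
    the paper's actual calibration is not stated in the abstract.
    """
    norm = np.linalg.norm(grad)
    # Scale down only when the norm exceeds the clipping threshold.
    scale = min(1.0, clip_norm / (norm + 1e-12))
    clipped = grad * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise


def assign_to_global_centroids(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean)."""
    # Pairwise distances with shape (n_points, n_centroids).
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = clip_and_add_gaussian_noise(np.array([3.0, 4.0]), 1.0, 1.1, rng)
    print("noisy clipped update:", noisy)

    # Hypothetical global centroids and local client samples.
    centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
    points = np.array([[0.0, 0.0], [10.0, 10.0], [0.5, 0.2]])
    print("cluster labels:", assign_to_global_centroids(points, centroids))
```

In this sketch a client would call `assign_to_global_centroids` once on its local data after receiving the global centroids, which is one plausible reading of "one-shot" clustering on the client side.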
Pages: 37