Privacy-preserved federated clustering with Non-IID data via GANs

Cited by: 0

Authors:

Jianzhe Zhao ^{[1]}

Wenji Wang ^{[1]}

Jiabao Wang ^{[2]}

Songyang Zhang ^{[1]}

Zhelin Fan ^{[1]}

Stan Matwin ^{[3]}

Affiliations:

[1] Northeastern University, Software College

[2] Zhejiang University, Software College

[3] Dalhousie University, Department of Computer Science

Source:

The Journal of Supercomputing, Vol. 81, Issue 4

Keywords:

Federated clustering; Non-IID; Differential privacy; GANs

DOI:

10.1007/s11227-025-07006-2

Abstract:

Federated clustering (FedC) clusters participants using global similarity measures and then trains on independent clusters to improve global accuracy. As an unsupervised federated learning approach, FedC operates on distributed, unlabeled data while preserving privacy. However, it faces two challenges: non-independent and identically distributed (Non-IID) client data, which makes the global clustering structure fragile, and potential privacy leakage through shared gradients. In response, this study introduces GFC-DP, a privacy-preserving federated clustering algorithm for Non-IID data that uses generative adversarial networks (GANs) to address both data heterogeneity and privacy protection. The algorithm uses GANs to generate synthetic data, leveraging global information to construct robust clustering structures. Notably, as the first work to introduce a client selection strategy into GAN model training, it improves the performance of the global GAN model by defining a client evaluation equation and then selecting better-performing clients to participate in training. Additionally, Gaussian noise is added during GAN model training to strengthen privacy and counter model inversion and membership inference attacks. One-shot FedC is then performed on the client side based on global centroids to obtain a stable global clustering structure. Comprehensive experiments were conducted on the MNIST, CIFAR-10, Rotated MNIST, and Rotated CIFAR-10 datasets. The results show that, in Non-IID scenarios, GFC-DP outperforms similar algorithms in both GAN performance and clustering effectiveness on image classification tasks.
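The abstract names two mechanisms without giving their details: Gaussian noise added during GAN training, and one-shot clustering on the client side against global centroids. The sketch below is only an illustration of how such steps are commonly written, assuming a DP-SGD-style clip-then-noise update and plain nearest-centroid assignment with squared Euclidean distance. The function names, the `clip_norm` and `noise_multiplier` parameters, and the distance measure are assumptions made for this example and are not taken from GFC-DP.

```python
import math
import random


def clip_and_add_gaussian_noise(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient to L2 norm <= clip_norm, then add Gaussian noise.

    Assumption: noise standard deviation is noise_multiplier * clip_norm,
    as in the usual Gaussian mechanism; the paper's exact scheme may differ.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [g * scale + rng.gauss(0.0, sigma) for g in grad]


def assign_to_global_centroids(points, centroids):
    """Label each local point with the index of its nearest global centroid.

    Assumption: squared Euclidean distance; this is a single pass with no
    further communication rounds, which is what "one-shot" is taken to mean.
    """
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical client gradient, clipped to norm 1.0 and then perturbed.
    noisy = clip_and_add_gaussian_noise([3.0, 4.0], 1.0, 0.5, rng)
    print(len(noisy))

    # Hypothetical global centroids received from the server.
    centroids = [(0.0, 0.0), (5.0, 5.0)]
    local_points = [(0.0, 0.0), (5.0, 5.0), (0.2, -0.1)]
    print(assign_to_global_centroids(local_points, centroids))  # [0, 1, 0]
```

Clipping before adding noise bounds each client's contribution, which is what lets the noise scale be tied to a privacy budget. The assignment step needs only the centroids, so no raw client data leaves the device.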
