CGKDFL: A Federated Learning Approach Based on Client Clustering and Generator-Based Knowledge Distillation for Heterogeneous Data

Cited: 0
Authors
Zhang, Sanfeng [1 ]
Xu, Hongzhen [2 ]
Yu, Xiaojun [2 ]
Affiliations
[1] East China Univ Technol, Sch Informat Engn, Nanchang, Peoples R China
[2] East China Univ Technol, Sch Software, Nanchang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
clustering; federated learning; generator; heterogeneous data; knowledge distillation;
DOI
10.1002/cpe.70048
Chinese Library Classification
TP31 [Computer Software];
Discipline Codes
081202; 0835;
Abstract
In real-world complex networks, data is often decentralized and Non-Independent and Identically Distributed (Non-IID). Such heterogeneous data poses significant challenges for federated learning, including biased global models, insufficient personalization of local models, and difficulty in absorbing global knowledge. We propose a Federated Learning Approach Based on Client Clustering and Generator-Based Knowledge Distillation (CGKDFL) for heterogeneous data. First, to reduce global-model bias, we propose a clustered federated learning scheme in which each client transmits only a subset of the parameters of selected layers, reducing the communication load. Second, to compensate for the loss of global knowledge caused by clustering, a generator designed to enhance privacy and increase diversity is trained on the server. Using label information provided by the clients, this generator produces feature representations aligned with each client's task, without requiring any external dataset, and transfers its global knowledge to the local models, which the clients exploit through knowledge distillation. Finally, extensive experiments were conducted on three heterogeneous datasets. The results show that CGKDFL outperforms the baseline methods by at least 7.24%, 6.73%, and 3.13% in accuracy on the three datasets, respectively, and converges faster than all compared methods.
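The abstract's two core mechanisms, a server-side generator that produces label-conditioned feature representations and client-side knowledge distillation from those features, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the single linear generator, the noise and feature dimensions, the random classifier heads, and the temperature value are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class FeatureGenerator:
    """Server-side generator: maps (noise, one-hot label) -> feature vector.
    A single linear layer stands in for a real trained generator network."""
    def __init__(self, noise_dim, n_classes, feat_dim):
        self.noise_dim = noise_dim
        self.n_classes = n_classes
        self.W = rng.normal(0.0, 0.1, (noise_dim + n_classes, feat_dim))

    def sample(self, labels):
        """Draw one synthetic feature vector per requested label."""
        z = rng.normal(size=(len(labels), self.noise_dim))
        y = np.eye(self.n_classes)[labels]
        return np.concatenate([z, y], axis=1) @ self.W

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits / T)
    q = softmax(student_logits / T)
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)))

# Client side: distill from features the server's generator produced
# for the labels this client reported -- no external dataset is needed.
n_classes, feat_dim = 3, 8
gen = FeatureGenerator(noise_dim=4, n_classes=n_classes, feat_dim=feat_dim)
client_labels = np.array([0, 0, 1, 2])                 # labels present on this client
feats = gen.sample(client_labels)                      # synthetic feature batch
W_local = rng.normal(0.0, 0.1, (feat_dim, n_classes))  # local (student) head
W_global = rng.normal(0.0, 0.1, (feat_dim, n_classes)) # global (teacher) head
loss = kd_loss(feats @ W_local, feats @ W_global)      # distillation term
```

In training, this distillation term would be added to the client's local task loss so that the local model absorbs global knowledge while keeping its personalized parameters.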
Pages: 14