FedGA: A greedy approach to enhance federated learning with Non-IID data

Cited by: 11
Authors
Cong, Yue [1 ]
Zeng, Yuxiang [2 ]
Qiu, Jing [1 ,3 ]
Fang, Zhongyang [1 ]
Zhang, Lejun [1 ]
Cheng, Du [4 ]
Liu, Jia [5 ]
Tian, Zhihong [1 ]
Affiliations
[1] Guangzhou Univ, Cyberspace Acad, Guangzhou 510006, Peoples R China
[2] Beijing Univ Aeronaut & Astronaut, Beijing 100000, Peoples R China
[3] Pengcheng Lab, Shenzhen 518000, Peoples R China
[4] Beijing Shengxin Network Technol Co LTD Qingteng C, Beijing 100000, Peoples R China
[5] China Elect Technol Network Commun Res Inst, Shijiazhuang 050000, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Federated learning; Non-IID; Data privacy; Partial optimize; Complementary federated meta-learning; SMOTE;
DOI
10.1016/j.knosys.2024.112201
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the context of federated learning, smart terminal devices exhibit personalized user behaviors, regional differences, and heterogeneous hardware configurations arising from their distinct data-capturing contexts. Consequently, they inevitably produce non-independent and identically distributed (Non-IID) data during training, making it difficult for traditional federated learning methods to reach the desired model performance and convergence when faced with such complex data distributions. To address this challenge, researchers have explored data augmentation techniques, adaptive optimization algorithms, and improved model aggregation rules; however, these methods often fail to deliver satisfactory performance on Non-IID problems. In this paper, we propose a federated learning framework based on a greedy algorithm (FedGA). Unlike traditional approaches that rely on global parameter averaging, FedGA progressively searches for partially optimal models and aggregates them to obtain the global model. We first gather client data information and finely classify all clients, employing two strategies to optimize client data distribution: (1) for clients with imbalanced quantities, we use data sampling to balance client data; (2) for clients with imbalanced class distributions, we introduce a complementary federated meta-learning method, called FedComMeta, to improve the data distribution of client groups. Experimental results demonstrate that, in Non-IID scenarios, FedGA achieves significant improvements in model performance over existing methods such as FedAvg, FedProx, and Astraea, validating the effectiveness and superiority of our approach in handling Non-IID data distributions.
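The greedy aggregation idea summarized above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the `evaluate` validation scorer, the flat weight-vector model representation, and the accept-only-if-score-improves rule are assumptions standing in for FedGA's actual partial-model search described in the paper.

```python
def average(models):
    """Element-wise mean of flat weight vectors (a toy stand-in for
    parameter averaging over full model state)."""
    n = len(models)
    return [sum(vals) / n for vals in zip(*models)]

def greedy_aggregate(client_models, evaluate):
    """Greedily grow a set of client models, admitting each candidate
    only if the averaged aggregate improves a validation score.

    client_models: list of flat weight vectors, one per client.
    evaluate: hypothetical scorer mapping a weight vector to a
              validation score (higher is better).
    """
    # Rank clients by their individual validation score, best first.
    order = sorted(range(len(client_models)),
                   key=lambda i: evaluate(client_models[i]),
                   reverse=True)
    # Seed the aggregate with the single best client model.
    selected = [client_models[order[0]]]
    best = evaluate(average(selected))
    for i in order[1:]:
        score = evaluate(average(selected + [client_models[i]]))
        if score > best:  # keep this client only if it helps the aggregate
            selected.append(client_models[i])
            best = score
    return average(selected)
```

Unlike FedAvg's unconditional weighted mean over all clients, this accept-if-improved loop discards contributions that would drag the global model toward a skewed local distribution, which is the intuition behind searching for partially optimal models under Non-IID data.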
Pages: 10
Related papers
32 records
[1]   Deep Learning with Differential Privacy [J].
Abadi, Martin ;
Chu, Andy ;
Goodfellow, Ian ;
McMahan, H. Brendan ;
Mironov, Ilya ;
Talwar, Kunal ;
Zhang, Li .
CCS'16: PROCEEDINGS OF THE 2016 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2016, :308-318
[2]  
Arjovsky M, 2017, PR MACH LEARN RES, V70
[3]  
Bunkhumpornpat C, 2009, LECT NOTES ARTIF INT, V5476, P475, DOI 10.1007/978-3-642-01307-2_43
[4]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 :321-357
[5]   FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation [J].
Chen, Haokun ;
Frikha, Ahmed ;
Krompass, Denis ;
Gu, Jindong ;
Tresp, Volker .
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, :4826-4836
[6]   Dynamic Vision Enabled Contactless Cross-Domain Machine Fault Diagnosis with Neuromorphic Computing [J].
Chen, Xinrui ;
Li, Xiang ;
Yu, Shupeng ;
Lei, Yaguo ;
Li, Naipeng ;
Yang, Bin .
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2024, 11 (03) :788-790
[7]   Astraea: Self-balancing Federated Learning for Improving Classification Accuracy of Mobile Deep Learning Applications [J].
Duan, Moming ;
Liu, Duo ;
Chen, Xianzhang ;
Tan, Yujuan ;
Ren, Jinting ;
Qiao, Lei ;
Liang, Liang .
2019 IEEE 37TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2019), 2019, :246-254
[8]  
Goodfellow I, 2016, ADAPT COMPUT MACH LE, P1
[9]  
Goodfellow IJ, 2014, ADV NEUR IN, V27, P2672
[10]   Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning [J].
Han, H ;
Wang, WY ;
Mao, BH .
ADVANCES IN INTELLIGENT COMPUTING, PT 1, PROCEEDINGS, 2005, 3644 :878-887