FedDSS: A data-similarity approach for client selection in horizontal federated learning

Cited by: 0
Authors
Nguyen, Tuong Minh [1]
Poh, Kim Leng [1]
Chong, Shu-Ling [2]
Lee, Jan Hau [3,4]
Affiliations
[1] Natl Univ Singapore, Dept Ind Syst Engn & Management, Singapore 117576, Singapore
[2] KK Womens & Childrens Hosp, Childrens Emergency, Singapore 229899, Singapore
[3] Duke NUS Med Sch, SingHlth Duke NUS Paediat Acad Clin Programme, Singapore 169857, Singapore
[4] KK Womens & Childrens Hosp, Childrens Intens Care Unit, Singapore 229899, Singapore
Keywords
Federated learning; Non-i.i.d; Client selection; Data similarity; Pediatric sepsis;
DOI
10.1016/j.ijmedinf.2024.105650
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Background and objective: Federated learning (FL) is an emerging distributed learning framework that allows multiple clients (hospitals, institutions, smart devices, etc.) to collaboratively train a centralized machine learning model without disclosing personal data. It has the potential to address several healthcare challenges, including a lack of training data, data privacy, and security concerns. However, model learning under FL is affected by non-i.i.d. data: clients' differing data distributions cause severe model divergence and reduced performance. To address this problem, we propose FedDSS (Federated Data Similarity Selection), a framework that uses a data-similarity approach to select clients without compromising client data privacy.

Methods: FedDSS comprises a statistics-based data-similarity metric, an N-similar-neighbor network, and a network-based selection strategy. We assessed FedDSS's performance against FedAvg's in i.i.d. and non-i.i.d. settings with two public pediatric sepsis datasets (PICD and MIMICIII). Selection fairness was measured using entropy. Simulations were repeated five times to evaluate average loss, true positive rate (TPR), and entropy.

Results: In the i.i.d. setting on PICD, FedDSS achieved a higher TPR from the 9th round onward and surpassed a TPR of 0.6 three rounds earlier than FedAvg. On MIMICIII, FedDSS's loss decreased markedly from the 13th round, and its TPR exceeded 0.8 by the 2nd round, two rounds ahead of FedAvg (4th round). In the non-i.i.d. setting, FedDSS achieved TPR > 0.7 by the 4th round and > 0.8 by the 7th round, earlier than FedAvg (5th and 11th rounds, respectively). In both settings, FedDSS showed reasonable selection fairness (entropy of 2.2 and 2.1).

Conclusion: We demonstrated that FedDSS contributes to improved learning in FL by converging faster, reaching the desired TPR with fewer communication rounds, and potentially enhancing sepsis prediction (TPR) compared with FedAvg.
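The abstract describes FedDSS as combining a statistics-based data-similarity metric, an N-similar-neighbor network, and a network-based selection strategy, with entropy used to gauge selection fairness. As a rough illustration only, the sketch below assumes cosine similarity over per-client summary statistics and degree-proportional sampling; the paper's actual metric, neighbor rule, and selection strategy are not given in this record, and all function names here are hypothetical.

```python
# Minimal sketch of a FedDSS-style client-selection round (assumptions noted above).
import numpy as np

def similarity_matrix(client_summaries):
    """Pairwise cosine similarity between per-client statistical summaries
    (summaries stand in for raw data, which never leaves the clients)."""
    X = np.asarray(client_summaries, dtype=float)
    X = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    return X @ X.T

def n_similar_neighbor_network(S, n_neighbors):
    """Boolean adjacency: each client is linked to its N most similar peers."""
    K = len(S)
    A = np.zeros((K, K), dtype=bool)
    for i in range(K):
        order = np.argsort(-S[i])            # most similar first
        order = order[order != i][:n_neighbors]
        A[i, order] = True
    return A

def select_clients(A, num_selected, rng):
    """One possible network-based strategy: sample clients with probability
    proportional to their degree in the N-similar-neighbor network."""
    degree = A.sum(axis=1).astype(float)
    p = degree / degree.sum() if degree.sum() > 0 else None
    return rng.choice(len(A), size=num_selected, replace=False, p=p)

def selection_entropy(selection_counts):
    """Entropy of cumulative selection frequencies, a simple fairness proxy."""
    freq = np.asarray(selection_counts, dtype=float)
    freq = freq / freq.sum()
    freq = freq[freq > 0]
    return float(-(freq * np.log2(freq)).sum())

# Example with 10 clients and random summaries, for illustration only.
rng = np.random.default_rng(0)
summaries = rng.normal(size=(10, 5))
A = n_similar_neighbor_network(similarity_matrix(summaries), n_neighbors=3)
counts = np.zeros(10, dtype=int)
for _ in range(20):                          # 20 mock communication rounds
    chosen = select_clients(A, num_selected=4, rng=rng)
    counts[chosen] += 1
print("selection fairness (entropy):", selection_entropy(counts))
```

Degree-proportional sampling is only one plausible reading of "network-based selection"; the entropy printed at the end mirrors how fairness is reported in the Results, with higher entropy indicating that selections are spread more evenly across clients.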
Pages: 9
Related Papers
50 records in total
  • [41] FedBoost: Bayesian Estimation Based Client Selection for Federated Learning
    Sheng, Yuhang; Zeng, Lingguo; Cao, Shuqin; Dai, Qing; Yang, Shasha; Lu, Jianfeng
    IEEE ACCESS, 2024, 12: 52255-52266
  • [42] Client Selection Based on Label Quantity Information for Federated Learning
    Ma, Jiahua; Sun, Xinghua; Xia, Wenchao; Wang, Xijun; Chen, Xiang; Zhu, Hongbo
    2021 IEEE 32ND ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2021
  • [43] Bidirectional Selection for Federated Learning Incorporating Client Autonomy: An Accuracy-Aware Incentive Approach
    Shi, Huaguang; Tian, Yuxiang; Li, Hengji; Shi, Lei; Zhou, Yi
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (20): 33861-33872
  • [44] Data Distribution-Aware Online Client Selection Algorithm for Federated Learning in Heterogeneous Networks
    Lee, Jaewook; Ko, Haneul; Seo, Sangwon; Pack, Sangheon
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (01): 1127-1136
  • [45] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin; Feng, Yanhao; Chang, Xinyu; Liu, Chi Harold
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15 (02): 1139-1151
  • [46] Energy-efficient client selection in federated learning with heterogeneous data on edge
    Zhao, Jianxin; Feng, Yanhao; Chang, Xinyu; Liu, Chi Harold
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2022, 15: 1139-1151
  • [47] Privacy-Preserving Data Selection for Horizontal and Vertical Federated Learning
    Zhang, Lan; Li, Anran; Peng, Hongyi; Han, Feng; Huang, Fan; Li, Xiang-Yang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (11): 2054-2068
  • [48] Federated Feature Selection for Horizontal Federated Learning in IoT Networks
    Zhang, Xunzheng; Mavromatis, Alex; Vafeas, Antonis; Nejabati, Reza; Simeonidou, Dimitra
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (11): 10095-10112
  • [49] Enhancing Federated Learning With Server-Side Unlabeled Data by Adaptive Client and Data Selection
    Xu, Yang; Wang, Lun; Xu, Hongli; Liu, Jianchun; Wang, Zhiyuan; Huang, Liusheng
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (04): 2813-2831
  • [50] A Client Selection Method Based on Loss Function Optimization for Federated Learning
    Zeng, Yan; Teng, Siyuan; Xiang, Tian; Zhang, Jilin; Mu, Yuankai; Ren, Yongjian; Wan, Jian
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2023, 137 (01): 1047-1064