Enhancing Federated Learning Convergence With Dynamic Data Queue and Data-Entropy-Driven Participant Selection

Cited by: 0

Authors:

Herath, Charuka ^{[1]}

Liu, Xiaolan ^{[1]}

Lambotharan, Sangarapillai ^{[1]}

Rahulamathavan, Yogachandran ^{[1]}

Affiliations:

[1] Loughborough Univ London, Inst Digital Technol, London E20 3BS, England

Source:

IEEE INTERNET OF THINGS JOURNAL | 2025 / Vol. 12 / Issue 06

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

Data models; Convergence; Internet of Things; Distributed databases; Accuracy; Training; Mathematical models; Servers; Adaptation models; Data entropy; fairness FL; federated learning (FL); not identically and independently distributed (non-IID);

DOI:

10.1109/JIOT.2024.3491034

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Federated learning (FL) is a decentralized approach for collaborative model training on edge devices. This distributed method of model training offers advantages in privacy, security, regulatory compliance, and cost efficiency. This research addresses statistical complexity in FL, especially when the data stored locally across devices is not identically and independently distributed (non-IID). We have observed an accuracy reduction of approximately 10%-30%, particularly in skewed scenarios where each edge device trains with only one class of data. This reduction is attributed to weight divergence, quantified using the Euclidean distance between device-level class distributions and the population distribution, resulting in a bias term (delta(k)). As a solution, we present a method to improve convergence in FL by creating a global subset of data on the server and dynamically distributing it across devices using a dynamic data queue-driven FL (DDFL). Next, we leverage data entropy metrics to observe the process during each training round and enable reasonable device selection for aggregation. Furthermore, we provide a convergence analysis of our proposed DDFL to justify its viability in practical FL scenarios, aiming for better device selection, a non-suboptimal global model, and faster convergence. We observe that our approach results in a substantial accuracy boost of approximately 5% for the MNIST dataset, around 18% for CIFAR-10, and 20% for CIFAR-100 with a 10% global subset of data, outperforming the state-of-the-art (SOTA) aggregation algorithms.
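
The two quantities the abstract names, the Euclidean distance between a device's class distribution and the population distribution, and a data entropy score per device, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact selection rule, so ranking devices by Shannon entropy of their label distribution and keeping the top `top_k` is an assumption made here for illustration, and all function names (`class_distribution`, `shannon_entropy`, `euclidean_distance`, `select_devices`) are hypothetical.

```python
import math
from collections import Counter


def class_distribution(labels, num_classes):
    """Return the empirical class distribution of a list of integer labels."""
    total = len(labels)
    if total == 0:
        return [0.0] * num_classes
    counts = Counter(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def shannon_entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def euclidean_distance(p, q):
    """Euclidean distance between two distributions of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def select_devices(device_labels, num_classes, top_k):
    """Assumed rule (not from the paper): keep the top_k devices whose
    local label distributions have the highest entropy."""
    scores = {
        dev: shannon_entropy(class_distribution(lbls, num_classes))
        for dev, lbls in device_labels.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical devices: "a" holds one class, "b" two, "c" all four.
device_labels = {
    "a": [0] * 10,
    "b": [0, 1] * 5,
    "c": [0, 1, 2, 3] * 3,
}
population = [0.25] * 4  # assumed uniform population distribution

skew_a = euclidean_distance(class_distribution(device_labels["a"], 4), population)
selected = select_devices(device_labels, num_classes=4, top_k=2)
```

In this toy setup the single-class device "a" lies farthest from the population distribution and has zero entropy, which matches the skewed one-class scenario the abstract associates with the largest accuracy reduction.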


Pages: 6646-6658

Number of pages: 13
