CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance

Cited by: 1
Authors
Xia, Zeke [1 ]
Hu, Ming [2 ]
Yan, Dengke [1 ]
Xie, Xiaofei [2 ]
Li, Tianlin [3 ]
Li, Anran [4 ]
Zhou, Junlong [5 ]
Chen, Mingsong [1 ]
Affiliations
[1] East China Normal Univ, MoE Engn Res Ctr HW SW Codesign Technol & Applicat, Shanghai, Peoples R China
[2] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
[3] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore, Singapore
[4] Yale Univ, Sch Med, Dept Biomed Informat & Data Sci, New Haven, CT 06520 USA
[5] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Training; Performance evaluation; Integrated circuits; Accuracy; Design automation; Federated learning; Data structures; Data models; Internet of Things; Servers; Artificial Intelligence of Things (AIoT); asynchronous federated learning (FL); data/device heterogeneity; feature balance;
DOI
10.1109/TCAD.2024.3446881
CLC Classification
TP3 [Computing technology; computer technology]
Discipline Code
0812
Abstract
Federated learning (FL), a promising distributed machine learning paradigm, has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL are severely limited by stragglers and by imbalanced data across massive AIoT devices, respectively. To address these challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical cache-based aggregation mechanism and a feature balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices, aligning the training time and mitigating the straggler issue. Specifically, each intermediate model is stored in a low-level cache for local training; once it has been trained by sufficiently many local devices, it is moved to a high-level cache for aggregation. To address imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, enabling each intermediate model to be trained across devices whose combined data distribution is balanced before aggregation. Experimental results show that, compared with state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvement.
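The two-level cache described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction from the abstract alone, not the authors' implementation: the class names (`IntermediateModel`, `HierarchicalCache`), the promotion threshold `promote_after`, and the FedAvg-style averaging in `aggregate` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class IntermediateModel:
    # Placeholder parameters: a flat list of floats stands in for real weights.
    weights: list
    # IDs of the distinct devices that have trained this model so far.
    trained_by: set = field(default_factory=set)

class HierarchicalCache:
    """Sketch of a two-level cache: intermediate models train in the
    low-level cache and are promoted to the high-level cache once
    enough distinct devices have trained them; only high-level models
    are aggregated."""

    def __init__(self, promote_after):
        self.promote_after = promote_after  # assumed promotion threshold
        self.low = []   # models still collecting local updates
        self.high = []  # models ready for aggregation

    def record_training(self, model, device_id):
        """Mark one local-training pass; promote when the threshold is met."""
        model.trained_by.add(device_id)
        if len(model.trained_by) >= self.promote_after and model in self.low:
            self.low.remove(model)
            self.high.append(model)

    def aggregate(self):
        """Average all high-level models (FedAvg-style) and clear the cache."""
        if not self.high:
            return None
        n = len(self.high)
        dim = len(self.high[0].weights)
        avg = [sum(m.weights[i] for m in self.high) / n for i in range(dim)]
        self.high.clear()
        return avg
```

In this sketch, a straggler only delays the single intermediate model it holds; other models keep circulating through the low-level cache, which is the asynchrony the abstract describes.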
Pages: 4057-4068
Page count: 12