Byzantine fault tolerance in distributed machine learning: a survey

Citations: 0
Authors
Bouhata, Djamila [1, 2]
Moumen, Hamouma [1, 2]
Mazari, Jocelyn Ahmed [3, 4]
Bounceur, Ahcene [5 ]
Affiliations
[1] Univ Batna, Comp Sci Dept, 2 53 Constantine Rd, Batna 05078, Algeria
[2] Lab Applicat Math Comp & Elect, Comp Sci Dept, Batna, Algeria
[3] Sorbonne Univ, CNRS, ISIR, Paris, France
[4] Extrality, Paris, France
[5] Univ Sharjah, Informat Syst Dept, Sharjah, U Arab Emirates
Keywords
Byzantine fault tolerance; distributed machine learning; stochastic gradient descent; communication; optimisation; SUBGRADIENT METHODS; COORDINATE DESCENT; GRADIENT DESCENT; AGREEMENT; GENERALS;
DOI
10.1080/0952813X.2024.2391778
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Byzantine Fault Tolerance (BFT) is crucial for ensuring the resilience of Distributed Machine Learning (DML) systems during training under adversarial conditions. Despite the growing body of research on BFT in DML, there is no comprehensive classification of techniques or broad analysis of the different approaches. This paper provides an in-depth survey of recent advances in BFT for DML, focusing on first-order optimisation methods during the training phase, in particular the widely used Stochastic Gradient Descent (SGD). We offer a novel classification of BFT approaches based on characteristics such as the communication process, optimisation method, and topology setting. This classification aims to enhance the understanding of various BFT methods and guide future research in addressing open challenges in the field. This work provides the foundations for developing robust BFT systems that use a variety of optimisation methods to strengthen resilience.
Pages: 59
Related papers
50 in total
  • [31] A Fast Machine Learning Framework with Distributed Packet Loss Tolerance
    Wu, Shuang
    2024 2ND ASIA CONFERENCE ON COMPUTER VISION, IMAGE PROCESSING AND PATTERN RECOGNITION, CVIPPR 2024, 2024,
  • [32] Byzantine Fault Tolerance with Window Mechanism for Replicated Services
    Chen, Liu
    Zhou, Wei
    2015 FIFTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC), 2015, : 1255 - 1258
  • [33] An Optimized Byzantine Fault Tolerance Algorithm for Consortium Blockchain
    Li, Yuxi
    Qiao, Liang
    Lv, Zhihan
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2021, 14 (05) : 2826 - 2839
  • [34] Byzantine Fault Tolerance With Non-Determinism, Revisited
    Huang, Yue
    Li, Huizhong
    Sun, Yi
    Duan, Sisi
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 309 - 322
  • [35] Byzantine Fault-Tolerance Consensus Algorithm Based on
    Li, Shuzhi
    Xiong, Weizhi
    Deng, Xiaohong
    Wang, Zhiqiang
    Liu, Hunwen
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (07) : 2484 - 2493
  • [36] An Optimized Byzantine Fault Tolerance Algorithm for Consortium Blockchain
    Yuxi Li
    Liang Qiao
    Zhihan Lv
    Peer-to-Peer Networking and Applications, 2021, 14 : 2826 - 2839
  • [37] Byzantine Fault Tolerance for Collaborative Editing with Commutative Operations
    Zhao, Wenbing
    Babi, Mamdouh
    Yang, William
    Luo, Xiong
    Zhu, Yueqin
    Yang, Jack
    Luo, Chaomin
    Yang, Mary
    2016 IEEE INTERNATIONAL CONFERENCE ON ELECTRO INFORMATION TECHNOLOGY (EIT), 2016, : 246 - 251
  • [38] Achieving Provable Byzantine Fault-tolerance in a Semi-honest Federated Learning Setting
    Tang, Xingxing
    Gu, Hanlin
    Fan, Lixin
    Yang, Qiang
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2023, PT II, 2023, 13936 : 415 - 427
  • [39] BYZANTINE FAULT TOLERANT DISTRIBUTED QUICKEST CHANGE DETECTION
    Bayraktar, Erhan
    Lai, Lifeng
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2015, 53 (02) : 575 - 591
  • [40] A survey: Distributed Machine Learning for 5G and beyond
    Nassef, Omar
    Sun, Wenting
    Purmehdi, Hakimeh
    Tatipamula, Mallik
    Mahmoodi, Toktam
    COMPUTER NETWORKS, 2022, 207