Byzantine-Resilient Distributed Bandit Online Optimization in Dynamic Environments

被引：2

作者：

Wei, Mengli ^{[1
]}

Yu, Wenwu ^{[2
,3
]}

Liu, Hongzhe ^{[4
]}

Chen, Duxin ^{[4
]}

机构：

[1] Southeast University, School of Cyber Science and Engineering, Nanjing,210096, China

[2] Southeast University, School of Mathematics, Frontiers Science Center for Mobile Information Communication and Security, Nanjing,210096, China

[3] Purple Mountain Laboratories, Nanjing,211111, China

[4] Southeast University, School of Mathematics, Nanjing,210096, China

来源：

IEEE Transactions on Industrial Cyber-Physical Systems | 2024年 / 2卷

关键词：

Constrained optimization - Convex optimization - Cyber Physical System - Cybersecurity - Dynamics - Embedded systems - Linear programming - Multi agent systems - Online systems - Quadratic programming;

D O I：

10.1109/TICPS.2024.3410846

中图分类号：

学科分类号：

摘要：

We consider the constrained multi-agent online optimization problem in dynamic environments that are vulnerable to Byzantine attacks, where some infiltrated agents may deviate from the prescribed update rule and send arbitrary messages. The objective functions are exposed in a bandit form, i.e., only the function value is revealed to each agent at the sampling instance, and held privately by each agent. The agents only exchange information with their neighbors to update decisions, and the collective goal is to minimize the sum of the unattacked agents' objective functions in dynamic environments, where the same function can only be sampled once. To handle this problem, a Byzantine-Resilient Distributed Bandit Online Convex Optimization (BR-DBOCO) algorithm that can tolerate up to mathcalB Byzantine agents is developed. Specifically, the BR-DBOCO employs the one-point bandit feedback (OPBF) mechanism and state filter to cope with the objective function, which cannot be explicitly expressed in dynamic environments and the arbitrary deviation states caused by Byzantine attacks, respectively. We show that sublinear expected regret is achieved if the accumulative deviation of the comparator sequence also grows sublinearly with a proper exploration parameter. Finally, experimental results are presented to illustrate the effectiveness of the proposed algorithm. © 2023 IEEE.

引用

页码：154 / 165