Cooperative Learning for Adversarial Multi-Armed Bandit on Open Multi-Agent Systems

Cited by: 1
|
Authors
Nakamura, Tomoki [1 ]
Hayashi, Naoki [1 ]
Inuiguchi, Masahiro [1 ]
Affiliations
[1] Osaka Univ, Grad Sch Engn Sci, Toyonaka, Osaka 5608531, Japan
Source
IEEE CONTROL SYSTEMS LETTERS | 2023 / Vol. 7
Keywords
Estimation; Multi-agent systems; Decision making; Upper bound; Peer-to-peer computing; Optimization; Heuristic algorithms; Cooperative learning; adversarial multiarmed bandit; open multi-agent system;
DOI
10.1109/LCSYS.2023.3279788
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
This letter considers a cooperative decision-making method for an adversarial bandit problem on open multi-agent systems. In an open multi-agent system, the network configuration changes dynamically as agents freely enter and leave the network. We propose a distributed Exp3 policy in which a group of agents exchanges estimates of the expected reward of each arm with active neighboring agents. Each agent then updates its probability distribution over the arms by combining the estimated rewards of its neighboring agents. We derive a sufficient condition for a sublinear bound on the pseudo-regret. A numerical example shows that active agents can cooperatively find the optimal arm using the proposed Exp3 policy.
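The abstract describes agents exchanging per-arm reward estimates with currently active neighbors and combining them before updating their arm-selection distribution. The sketch below is a minimal, illustrative Exp3-style simulation of that idea, not the authors' algorithm: the exploration rate GAMMA, the uniform neighbor-averaging step, and all class and variable names are assumptions made for the example.

# Minimal sketch: Exp3-style agents that average reward estimates
# with active neighbors before choosing an arm (illustrative only).
import math
import random

K = 5          # number of arms
GAMMA = 0.1    # exploration parameter (assumed value)


class Agent:
    def __init__(self):
        # cumulative importance-weighted reward estimate per arm
        self.estimates = [0.0] * K

    def probabilities(self):
        # standard Exp3 mixing of exponential weights with uniform exploration
        m = max(self.estimates)  # subtract max for numerical stability
        weights = [math.exp(GAMMA * (s - m) / K) for s in self.estimates]
        total = sum(weights)
        return [(1 - GAMMA) * w / total + GAMMA / K for w in weights]

    def step(self, rewards, neighbours):
        # 1) cooperation step: average estimates with active neighbours
        group = [self] + neighbours
        self.estimates = [
            sum(a.estimates[k] for a in group) / len(group) for k in range(K)
        ]
        # 2) Exp3 step: sample an arm and add an unbiased reward estimate
        p = self.probabilities()
        arm = random.choices(range(K), weights=p)[0]
        self.estimates[arm] += rewards[arm] / p[arm]
        return arm

An open network can be emulated by changing each agent's neighbour list (i.e., the set of active agents) from round to round before calling step().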
Pages: 1712 - 1717
Number of pages: 6
Related papers
50 records in total
  • [1] Multi-agent Multi-armed Bandit Learning for Content Caching in Edge Networks
    Su, Lina
    Zhou, Ruiting
    Wang, Ne
    Chen, Junmei
    Li, Zongpeng
    2022 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES (IEEE ICWS 2022), 2022, : 11 - 16
  • [2] Decentralized Multi-Agent Multi-Armed Bandit Learning With Calibration for Multi-Cell Caching
    Xu, Xianzhe
    Tao, Meixia
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69 (04) : 2457 - 2472
  • [3] Collaborative Multi-Agent Multi-Armed Bandit Learning for Small-Cell Caching
    Xu, Xianzhe
    Tao, Meixia
    Shen, Cong
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (04) : 2570 - 2585
  • [4] A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem
    Madhushani, Udari
    Leonard, Naomi Ehrich
    2020 EUROPEAN CONTROL CONFERENCE (ECC 2020), 2020, : 1677 - 1682
  • [5] Multi-Agent Multi-Armed Bandit Learning for Online Management of Edge-Assisted Computing
    Wu, Bochun
    Chen, Tianyi
    Ni, Wei
    Wang, Xin
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69 (12) : 8188 - 8199
  • [6] Achieving Privacy in the Adversarial Multi-Armed Bandit
    Tossou, Aristide C. Y.
    Dimitrakakis, Christos
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2653 - 2659
  • [7] Bridging Adversarial and Nonstationary Multi-Armed Bandit
    Chen, Ningyuan
    Yang, Shuoguang
    Zhang, Hailun
    PRODUCTION AND OPERATIONS MANAGEMENT, 2025,
  • [8] An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit with Low Regret
    Jones, Matthew
    Huy Nguyen
    Thy Nguyen
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 7, 2023, : 8159 - 8167
  • [9] Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with Heterogeneous Rewards
    Xu, Mengfan
    Klabjan, Diego
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Sustainable Cooperative Coevolution with a Multi-Armed Bandit
    De Rainville, Francois-Michel
    Sebag, Michele
    Gagne, Christian
    Schoenauer, Marc
    Laurendeau, Denis
    GECCO'13: PROCEEDINGS OF THE 2013 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2013, : 1517 - 1524