Decentralized Learning for Multi-player Multi-armed Bandits

Cited by: 0
Authors
Kalathil, Dileep [1]
Nayyar, Naumaan [1]
Jain, Rahul [1]
Affiliation
[1] University of Southern California, Department of Electrical Engineering, Los Angeles, CA 90089 USA
Keywords
Distributed adaptive control; multi-armed bandits; online learning; multi-agent systems; allocation rules
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
We consider the problem of distributed online learning with multiple players in multi-armed bandit models. Each player can pick among multiple arms. When a player picks an arm, it receives a reward drawn from an unknown distribution with an unknown mean. The arms give different rewards to different players. If two players pick the same arm, there is a "collision", and neither of them gets any reward. There is no dedicated control channel for coordination or communication among the players. Any other communication between the players is costly and adds to the regret. We propose an online index-based learning policy, the dUCB4 algorithm, that trades off exploration versus exploitation appropriately and achieves expected regret that grows at most as near-O(log^2 T). The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks, wherein they must pick among various wireless channels that look different to different users.
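The collision model described in the abstract can be illustrated with a minimal sketch. This is not the dUCB4 policy itself; it only simulates one round of play. Bernoulli rewards, the function names `play_round` and `best_assignment`, and the use of the best collision-free assignment as a benchmark are assumptions made for illustration and are not taken from the record.

```python
import random
from itertools import permutations


def play_round(choices, means, rng):
    """Simulate one round of the multi-player bandit with collisions.

    choices[i] is the arm picked by player i.
    means[i][k] is the mean reward of arm k for player i
    (Bernoulli rewards are an assumption of this sketch).
    Players that pick the same arm collide and receive 0.
    """
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1

    rewards = []
    for player, arm in enumerate(choices):
        if counts[arm] > 1:
            rewards.append(0.0)  # collision: no reward for anyone on this arm
        else:
            hit = rng.random() < means[player][arm]
            rewards.append(1.0 if hit else 0.0)
    return rewards


def best_assignment(means):
    """Brute-force the collision-free assignment with the highest total mean.

    Hypothetical benchmark for small instances only; the search is
    exponential in the number of players.
    """
    n_players, n_arms = len(means), len(means[0])
    return max(
        permutations(range(n_arms), n_players),
        key=lambda a: sum(means[i][a[i]] for i in range(n_players)),
    )
```

For example, with `means = [[0.9, 0.8], [0.9, 0.1]]` both players prefer arm 0, yet `best_assignment(means)` sends player 0 to arm 1 and player 1 to arm 0, since the total mean 0.8 + 0.9 exceeds 0.9 + 0.1. Because the arms look different to different players, reaching such an assignment without a control channel is what makes the problem hard.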
Pages: 3960-3965
Number of pages: 6
Related papers
50 in total
  • [21] TRANSFER LEARNING FOR CONTEXTUAL MULTI-ARMED BANDITS
    Cai, Changxiao
    Cai, T. Tony
    Li, Hongzhe
    ANNALS OF STATISTICS, 2024, 52 (01): 207-232
  • [22] Quantum Reinforcement Learning for Multi-Armed Bandits
    Liu, Yi-Pei
    Li, Kuo
    Cao, Xi
    Jia, Qing-Shan
    Wang, Xu
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022: 5675-5680
  • [23] A Survey on Multi-player Bandits
    Boursier, Etienne
    Perchet, Vianney
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25: 1-45
  • [24] Multi-armed bandits for decentralized AP selection in enterprise WLANs
    Carrascosa, Marc
    Bellalta, Boris
    COMPUTER COMMUNICATIONS, 2020, 159: 108-123
  • [25] Multi-armed bandits for decentralized AP selection in enterprise WLANs
    Carrascosa, Marc
    Bellalta, Boris
    Computer Communications, 2020, 159: 108-123
  • [26] Coordinated Versus Decentralized Exploration In Multi-Agent Multi-Armed Bandits
    Chakraborty, Mithun
    Chua, Kai Yee Phoebe
    Das, Sanmay
    Juba, Brendan
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017: 164-170
  • [27] An Instance-Dependent Analysis for the Cooperative Multi-Player Multi-Armed Bandit
    Pacchiano, Aldo
    Bartlett, Peter
    Jordan, Michael
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201: 1166-1215
  • [28] On Kernelized Multi-armed Bandits
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [29] Multi-armed Bandits with Compensation
    Wang, Siwei
    Huang, Longbo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [30] Regional Multi-Armed Bandits
    Wang, Zhiyang
    Zhou, Ruida
    Shen, Cong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84