Decentralized Learning for Multi-player Multi-armed Bandits

Cited by: 0
Authors
Kalathil, Dileep [1 ]
Nayyar, Naumaan [1 ]
Jain, Rahul [1 ]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
Keywords
Distributed adaptive control; multi-armed bandits; online learning; multi-agent systems; ALLOCATION RULES;
DOI
Not available
CLC Classification Number
TP [Automation technology, computer technology];
Discipline Classification Code
0812;
Abstract
We consider the problem of distributed online learning with multiple players in multi-armed bandit models. Each player can pick among multiple arms. When a player picks an arm, it receives a reward drawn from an unknown distribution with an unknown mean. The arms give different rewards to different players. If two players pick the same arm, there is a "collision" and neither of them gets any reward. There is no dedicated control channel for coordination or communication among the players, and any other communication between the players is costly and adds to the regret. We propose an online index-based learning policy called the dUCB4 algorithm that trades off exploration versus exploitation appropriately and achieves expected regret that grows at most as near-O(log^2 T). The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks, where the users must pick among various wireless channels that look different to different users.
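As a rough illustration of the setting described in the abstract (and not of the authors' dUCB4 policy itself, whose distributed coordination step is not reproduced here), the Python sketch below runs an independent UCB1-style index for each player and zeroes out rewards whenever two players collide on the same arm. The player count, arm means, and the simulate helper are hypothetical choices made only for this example.

import math
import random

# Illustrative sketch only: each player runs a plain UCB1-style index and
# rewards are zeroed when two players collide on the same arm. This is NOT
# the paper's dUCB4 policy, which additionally coordinates players (e.g. via
# a distributed matching step) to keep collisions rare.

class UCBPlayer:
    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # how many times each arm was played
        self.means = [0.0] * n_arms  # empirical mean reward of each arm

    def select_arm(self, t):
        # Play every arm once before trusting the index.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # UCB1 index: empirical mean plus an exploration bonus.
        return max(range(len(self.counts)),
                   key=lambda a: self.means[a]
                   + math.sqrt(2.0 * math.log(t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

def simulate(n_players=2, n_arms=4, horizon=5000, seed=0):
    rng = random.Random(seed)
    # Hypothetical per-player arm means: arms look different to different players.
    mu = [[rng.uniform(0.1, 0.9) for _ in range(n_arms)] for _ in range(n_players)]
    players = [UCBPlayer(n_arms) for _ in range(n_players)]
    total_reward = 0.0
    for t in range(1, horizon + 1):
        choices = [p.select_arm(t) for p in players]
        for i, (player, arm) in enumerate(zip(players, choices)):
            collided = choices.count(arm) > 1
            # Bernoulli reward with mean mu[i][arm], zeroed on a collision.
            reward = 0.0 if collided else float(rng.random() < mu[i][arm])
            player.update(arm, reward)
            total_reward += reward
    return total_reward

if __name__ == "__main__":
    print("total reward over the horizon:", simulate())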
Pages: 3960-3965
Number of pages: 6
Related Papers
50 papers in total
  • [31] Federated Multi-Armed Bandits
    Shi, Chengshuai
    Shen, Cong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9603 - 9611
  • [32] Multi-armed Bandits with Probing
    Elumar, Eray Can
    Tekin, Cem
    Yagan, Osman
    2024 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, ISIT 2024, 2024, : 2080 - 2085
  • [33] Ballooning multi-armed bandits
    Ghalme, Ganesh
    Dhamal, Swapnil
    Jain, Shweta
    Gujar, Sujit
    Narahari, Y.
    ARTIFICIAL INTELLIGENCE, 2021, 296
  • [34] Transfer Learning in Multi-Armed Bandits: A Causal Approach
    Zhang, Junzhe
    Bareinboim, Elias
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1340 - 1346
  • [35] Cooperative Multi-player Multi-Armed Bandit: Computation Offloading in a Vehicular Cloud Network
    Xu, Shilin
    Guo, Caili
    Hu, Rose Qingyang
    Qian, Yi
    IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2021), 2021,
  • [36] Millimeter-Wave Concurrent Beamforming: A Multi-Player Multi-Armed Bandit Approach
    Mohamed, Ehab Mahmoud
    Hashima, Sherief
    Hatano, Kohei
    Kasban, Hani
    Rihan, Mohamed
    CMC-COMPUTERS MATERIALS & CONTINUA, 2020, 65 (03): 1987 - 2007
  • [37] Potential and pitfalls of Multi-Armed Bandits for decentralized Spatial Reuse in WLANs
    Wilhelmi, Francesc
    Barrachina-Munoz, Sergio
    Bellalta, Boris
    Cano, Cristina
    Jonsson, Anders
    Neu, Gergely
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2019, 127 : 26 - 42
  • [38] Multi-Player Bandits: The Adversarial Case
    Alatur, Pragnya
    Levy, Kfir Y.
    Krause, Andreas
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [40] Multi-armed bandits for performance marketing
    Gigli, Marco
    Stella, Fabio
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2024,