Decentralized Learning for Multi-player Multi-armed Bandits

Cited by: 0
Authors
Kalathil, Dileep [1 ]
Nayyar, Naumaan [1 ]
Jain, Rahul [1 ]
Affiliation
[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
Keywords
Distributed adaptive control; multi-armed bandits; online learning; multi-agent systems; allocation rules
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
We consider the problem of distributed online learning with multiple players in multi-armed bandit models. Each player can pick among multiple arms. When a player picks an arm, it receives a reward drawn from an unknown distribution with an unknown mean. The arms give different rewards to different players. If two players pick the same arm, there is a "collision" and neither of them receives any reward. There is no dedicated control channel for coordination or communication among the players, and any communication between players is costly and adds to the regret. We propose an online index-based learning policy, the dUCB4 algorithm, that correctly balances exploration and exploitation and achieves expected regret that grows at most near-O(log^2 T). The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks, where the users must pick among wireless channels that look different to different users.
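The abstract describes the setting only at a high level; as an illustration, the following is a minimal Python simulation sketch of that setting (player-specific arm means, zero reward on collisions), with each player running a standard UCB1-style index policy independently. This is not the dUCB4 policy from the paper; the index formula, the Bernoulli rewards, and all parameter values (N_PLAYERS, N_ARMS, HORIZON) are placeholder assumptions used only to make the problem structure concrete.

    # Minimal simulation of the decentralized multi-player bandit setting from the
    # abstract. Each player runs an independent UCB1-style index policy; players
    # that collide on an arm receive zero reward. This is NOT the dUCB4 policy.
    import math
    import random

    N_PLAYERS, N_ARMS, HORIZON = 2, 5, 10_000            # placeholder sizes
    # Player-specific mean rewards: arms look different to different players.
    MEANS = [[random.random() for _ in range(N_ARMS)] for _ in range(N_PLAYERS)]

    counts = [[0] * N_ARMS for _ in range(N_PLAYERS)]     # pulls per (player, arm)
    sums = [[0.0] * N_ARMS for _ in range(N_PLAYERS)]     # reward sums per (player, arm)

    def ucb_index(p, a, t):
        """UCB1 index: empirical mean plus an exploration bonus."""
        if counts[p][a] == 0:
            return float("inf")                           # pull every arm at least once
        return sums[p][a] / counts[p][a] + math.sqrt(2.0 * math.log(t) / counts[p][a])

    for t in range(1, HORIZON + 1):
        # Each player picks the arm with the highest index, with no coordination.
        picks = [max(range(N_ARMS), key=lambda a, p=p: ucb_index(p, a, t))
                 for p in range(N_PLAYERS)]
        for p, arm in enumerate(picks):
            collided = picks.count(arm) > 1               # collision: no reward on that arm
            reward = 0.0 if collided else float(random.random() < MEANS[p][arm])
            counts[p][arm] += 1
            sums[p][arm] += reward

Note that this naive independent-UCB baseline tends to drive every player toward its individually best arm and can collide persistently, which is exactly the failure mode a decentralized policy such as the paper's must avoid; the sketch is meant only to make the problem setup concrete.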
Pages: 3960-3965
Number of pages: 6
Related papers
50 items in total (items [41]-[50] shown below)
  • [41] Lenient Regret for Multi-Armed Bandits
    Merlis, Nadav
    Mannor, Shie
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8950 - 8957
  • [42] Finding structure in multi-armed bandits
    Schulz, Eric
    Franklin, Nicholas T.
    Gershman, Samuel J.
    COGNITIVE PSYCHOLOGY, 2020, 119
  • [43] ON MULTI-ARMED BANDITS AND DEBT COLLECTION
    Czekaj, Lukasz
    Biegus, Tomasz
    Kitlowski, Robert
    Tomasik, Pawel
    36TH ANNUAL EUROPEAN SIMULATION AND MODELLING CONFERENCE, ESM 2022, 2022, : 137 - 141
  • [44] Visualizations for interrogations of multi-armed bandits
    Keaton, Timothy J.
    Sabbaghi, Arman
    STAT, 2019, 8 (01)
  • [45] Multi-armed bandits with dependent arms
    Singh, Rahul
    Liu, Fang
    Sun, Yin
    Shroff, Ness
    MACHINE LEARNING, 2024, 113 (01) : 45 - 71
  • [46] On Kernelized Multi-Armed Bandits with Constraints
    Zhou, Xingyu
    Ji, Bo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [47] Multi-Armed Bandits in Metric Spaces
    Kleinberg, Robert
    Slivkins, Aleksandrs
    Upfal, Eli
    STOC'08: PROCEEDINGS OF THE 2008 ACM INTERNATIONAL SYMPOSIUM ON THEORY OF COMPUTING, 2008: 681 - +
  • [48] Multi-Armed Bandits With Costly Probes
    Elumar, Eray Can
    Tekin, Cem
    Yagan, Osman
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2025, 71 (01) : 618 - 643
  • [49] Multi-armed bandits with episode context
    Christopher D. Rosin
    Annals of Mathematics and Artificial Intelligence, 2011, 61 : 203 - 230
  • [50] MULTI-ARMED BANDITS AND THE GITTINS INDEX
    WHITTLE, P
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1980, 42 (02): 143 - 149