Decentralized Learning for Multi-player Multi-armed Bandits

Cited by: 0
Authors
Kalathil, Dileep [1]
Nayyar, Naumaan [1]
Jain, Rahul [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
Keywords
Distributed adaptive control; multi-armed bandits; online learning; multi-agent systems; allocation rules
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
We consider the problem of distributed online learning with multiple players in multi-armed bandit models. Each player can pick among multiple arms. When a player picks an arm, it receives a reward drawn from an unknown distribution with an unknown mean. The arms give different rewards to different players. If two players pick the same arm, there is a "collision" and neither of them gets any reward. There is no dedicated control channel for coordination or communication among the players, and any other communication between the players is costly and adds to the regret. We propose an online index-based learning policy, the dUCB4 algorithm, that trades off exploration against exploitation appropriately and achieves an expected regret that grows at most as near-O(log² T). The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks, where they must pick among various wireless channels that look different to different users.
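To make the collision model in the abstract concrete, below is a minimal Python simulation sketch. It is not the paper's dUCB4 policy: here each player simply runs an independent UCB1-style index with no coordination step, and all constants, reward means, and the exploration bonus are illustrative assumptions.

# Illustrative sketch, NOT the paper's dUCB4 policy: each player runs an
# independent UCB1-style index, and a round in which two or more players
# pick the same arm (a "collision") yields zero reward to everyone involved.
# All constants, reward means, and the exploration bonus are assumptions.
import numpy as np

rng = np.random.default_rng(0)

M, K, T = 2, 5, 20000                          # players, arms, horizon (illustrative)
means = rng.uniform(0.1, 0.9, size=(M, K))     # hypothetical per-player mean rewards

counts = np.ones((M, K))                       # pull counts (pretend one init pull per arm)
estimates = rng.uniform(size=(M, K))           # running mean-reward estimates
total_reward = 0.0

for t in range(1, T + 1):
    # Each player picks the arm maximizing its index = estimate + exploration bonus.
    indices = estimates + np.sqrt(2.0 * np.log(t) / counts)
    choices = indices.argmax(axis=1)

    for m in range(M):
        k = choices[m]
        collided = np.count_nonzero(choices == k) > 1      # someone else chose arm k too
        reward = 0.0 if collided else float(rng.binomial(1, means[m, k]))
        counts[m, k] += 1
        estimates[m, k] += (reward - estimates[m, k]) / counts[m, k]
        total_reward += reward

print(f"average per-round system reward over {T} rounds: {total_reward / T:.3f}")

Because the players never coordinate, they tend to chase the same high-index arms and collide repeatedly; the point of the sketch is to show why a decentralized policy such as dUCB4 must do more than run single-player UCB independently.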
Pages: 3960 - 3965
Number of pages: 6
Related Papers
50 records in total
  • [1] Multi-player Multi-armed Bandits: Decentralized Learning with IID Rewards
    Kalathil, Dileep
    Nayyar, Naumaan
    Jain, Rahul
    2012 50TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2012, : 853 - 860
  • [2] Decentralized Multi-player Multi-armed Bandits with No Collision Information
    Shi, Chengshuai
    Xiong, Wei
    Shen, Cong
    Yang, Jing
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [3] Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits
    Xiong, Guojun
    Li, Jian
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 10528 - 10536
  • [4] Online Learning for Cooperative Multi-Player Multi-Armed Bandits
    Chang, William
    Jafarnia-Jahromi, Mehdi
    Jain, Rahul
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 7248 - 7253
  • [5] Decentralized Heterogeneous Multi-Player Multi-Armed Bandits With Non-Zero Rewards on Collisions
    Magesh, Akshayaa
    Veeravalli, Venugopal V.
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, 68 (04) : 2622 - 2634
  • [6] Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization
    Shi, Chengshuai
    Xiong, Wei
    Shen, Cong
    Yang, Jing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] On No-Sensing Adversarial Multi-Player Multi-Armed Bandits with Collision Communications
    Shi, Chengshuai
    Shen, Cong
    IEEE JOURNAL ON SELECTED AREAS IN INFORMATION THEORY, 2021, 2 (02) : 515 - 533
  • [8] An Attackability Perspective on No-Sensing Adversarial Multi-player Multi-armed Bandits
    Shi, Chengshuai
    Shen, Cong
    2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021, : 533 - 538
  • [9] Multi-Player Multi-Armed Bandits With Collision-Dependent Reward Distributions
    Shi, Chengshuai
    Shen, Cong
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 4385 - 4402
  • [10] Massive multi-player multi-armed bandits for IoT networks: An application on LoRa networks
    Dakdouk, Hiba
    Feraud, Raphael
    Varsier, Nadege
    Maille, Patrick
    Laroche, Romain
    AD HOC NETWORKS, 2023, 151