Stochastic Multi-armed Bandits in Constant Space

被引:0
|
作者
Liau, David [1 ]
Price, Eric [1 ]
Song, Zhao [1 ]
Yang, Ger [1 ]
机构
[1] Univ Texas Austin, Austin, TX 78712 USA
来源
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84 | 2018年 / 84卷
关键词
EXPLORATION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all K arms. We give an algorithm using O(1) words of space with regret Sigma(K)(i=1) 1/Delta(i) log Delta(i)/Delta log T where Delta(i) is the gap between the best arm and arm i and Delta is the gap between the best and the second-best arms. If the rewards are bounded away from 0 and 1, this is within an O(log 1/Delta) factor of the optimum regret possible without space constraints.
引用
收藏
页数:9
相关论文
共 50 条
  • [1] Stochastic Multi-Armed Bandits with Control Variates
    Verma, Arun
    Hanawal, Manjesh K.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [2] Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
    Lancewicki, Tal
    Segal, Shahar
    Koren, Tomer
    Mansour, Yishay
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [3] Robust Stochastic Multi-Armed Bandits with Historical Data
    Yacobi, Sarah Boufelja
    Bounefouf, Djallel
    COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023, : 959 - 965
  • [4] Networked Stochastic Multi-Armed Bandits with Combinatorial Strategies
    Tang, Shaojie
    Zhou, Yaqin
    Han, Kai
    Zhang, Zhao
    Yuan, Jing
    Wu, Weili
    2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2017), 2017, : 786 - 793
  • [5] Anytime optimal algorithms in stochastic multi-armed bandits
    Degenne, Remy
    Perchet, Vianney
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [6] Parametrized Stochastic Multi-armed Bandits with Binary Rewards
    Jiang, Chong
    Srikant, R.
    2011 AMERICAN CONTROL CONFERENCE, 2011, : 119 - 124
  • [7] Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
    Wang, Zhiwei
    Wang, Huazheng
    Wang, Hongning
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15770 - 15777
  • [8] Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets
    Wan, Zongqi
    Zhang, Zhijie
    Li, Tongyang
    Zhang, Jialin
    Sun, Xiaoming
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 10087 - 10094
  • [9] On Kernelized Multi-armed Bandits
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [10] Regional Multi-Armed Bandits
    Wang, Zhiyang
    Zhou, Ruida
    Shen, Cong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84