Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling

Cited by: 1
Authors
Lin, Baihan [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
Source
2022 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2022
Keywords
Multi-armed bandits; genetic algorithm; MIXTURE
DOI
10.1109/CEC55065.2022.9870279
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
As two popular schools of machine learning, online learning and evolutionary computation have become important driving forces behind real-world decision-making engines in biomedicine, economics, and engineering. Although prior work has used bandits to improve the optimization process of evolutionary algorithms, how evolutionary approaches can improve the sequential decision making of online learning agents such as multi-armed bandits remains largely unexplored. In this work, we propose Genetic Thompson Sampling, a bandit algorithm that maintains a population of agents and updates them with genetic principles such as elite selection, crossover, and mutation. Empirical results in multi-armed bandit simulation environments and on a practical epidemic control problem suggest that by incorporating the genetic algorithm into the bandit algorithm, our method significantly outperforms the baselines in nonstationary settings. Lastly, we introduce EvoBandit, a web-based interactive visualization that guides readers through the entire learning process and performs lightweight evaluations on the fly. We hope this investigation engages researchers in this growing field of research.(1)
Pages: 8
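
The abstract outlines the core loop of Genetic Thompson Sampling: each agent in the population is an ordinary Thompson sampler, and the population is periodically refreshed by elite selection, crossover, and mutation. The Python sketch below is a minimal illustration of that idea under stated assumptions, not the paper's reference implementation; the population size, the discounted-fitness rule, the drifting Bernoulli environment, and the particular crossover and mutation operators are all illustrative choices.

    # Minimal sketch of a genetic Thompson sampling loop for Bernoulli bandits.
    # NOT the authors' reference implementation; hyperparameters, the drift
    # model, and the genetic operators below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    N_ARMS, POP_SIZE, N_ELITE = 5, 8, 2
    MUTATION_SCALE, FITNESS_DECAY, GENERATION_LEN = 0.3, 0.99, 100

    def new_agent():
        # Each agent is a plain Thompson sampler: one Beta(alpha, beta)
        # posterior per arm, plus a discounted reward tally used as fitness.
        return {"alpha": np.ones(N_ARMS), "beta": np.ones(N_ARMS), "fitness": 0.0}

    def true_means(t):
        # Hypothetical nonstationary environment: arm means drift over time.
        return 0.5 + 0.4 * np.sin(0.005 * t + np.arange(N_ARMS))

    def crossover(a, b):
        # Child inherits each arm's posterior from a randomly chosen parent.
        mask = rng.random(N_ARMS) < 0.5
        child = new_agent()
        child["alpha"] = np.where(mask, a["alpha"], b["alpha"])
        child["beta"] = np.where(mask, a["beta"], b["beta"])
        return child

    def mutate(agent):
        # Multiplicative noise on the posterior counts, clipped to stay valid.
        for key in ("alpha", "beta"):
            noise = 1.0 + MUTATION_SCALE * rng.standard_normal(N_ARMS)
            agent[key] = np.clip(agent[key] * noise, 1.0, None)
        return agent

    population = [new_agent() for _ in range(POP_SIZE)]
    for t in range(5000):
        means = true_means(t)
        for agent in population:
            # Standard Thompson sampling step for this agent.
            draws = rng.beta(agent["alpha"], agent["beta"])
            arm = int(np.argmax(draws))
            reward = float(rng.random() < means[arm])
            agent["alpha"][arm] += reward
            agent["beta"][arm] += 1.0 - reward
            # Discounted fitness favors recent performance (nonstationarity).
            agent["fitness"] = FITNESS_DECAY * agent["fitness"] + reward

        if (t + 1) % GENERATION_LEN == 0:
            # Elite selection: keep the best agents, refill the rest of the
            # population via crossover plus mutation of elite posteriors.
            population.sort(key=lambda a: a["fitness"], reverse=True)
            elites = population[:N_ELITE]
            children = []
            for _ in range(POP_SIZE - N_ELITE):
                i, j = rng.choice(N_ELITE, size=2, replace=False)
                children.append(mutate(crossover(elites[i], elites[j])))
            population = elites + children

    best = max(population, key=lambda a: a["fitness"])
    print("posterior means:", best["alpha"] / (best["alpha"] + best["beta"]))

Under the assumed drift, the per-agent Thompson updates track the environment locally, while the periodic generation step discards agents whose posteriors have gone stale; using a discounted rather than lifetime reward as fitness is what makes selection sensitive to recent performance in nonstationary settings.
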
Related Papers
50 records in total
  • [1] Thompson sampling for multi-armed bandits in big data environments
    Kim, Min Kyong
    Hwang, Beom Seuk
    KOREAN JOURNAL OF APPLIED STATISTICS, 2024, 37 (05)
  • [2] Multi-Armed Bandits for Many-Task Evolutionary Optimization
    Le Tien Thanh
    Le Van Cuong
    Ta Bao Thang
    Huynh Thi Thanh Binh
2021 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC 2021), 2021: 1664 - 1671
  • [3] Multi-armed bandits with dependent arms
    Singh, Rahul
    Liu, Fang
    Sun, Yin
    Shroff, Ness
    MACHINE LEARNING, 2024, 113 (01) : 45 - 71
  • [4] Multi-armed bandits with episode context
    Rosin, Christopher D.
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2011, 61 (03) : 203 - 230
  • [5] Multi-Armed Bandits With Costly Probes
    Elumar, Eray Can
    Tekin, Cem
    Yagan, Osman
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2025, 71 (01) : 618 - 643
  • [6] Multi-Armed Bandits With Correlated Arms
    Gupta, Samarth
    Chaudhari, Shreyas
    Joshi, Gauri
    Yagan, Osman
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2021, 67 (10) : 6711 - 6732
  • [7] LEVY BANDITS: MULTI-ARMED BANDITS DRIVEN BY LEVY PROCESSES
    Kaspi, Haya
    Mandelbaum, Avi
    ANNALS OF APPLIED PROBABILITY, 1995, 5 (02) : 541 - 565
  • [8] Combinatorial Multi-armed Bandits for Resource Allocation
    Zuo, Jinhang
    Joe-Wong, Carlee
2021 55TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2021
  • [9] Quantum greedy algorithms for multi-armed bandits
    Ohno, Hiroshi
    QUANTUM INFORMATION PROCESSING, 22