Quantum greedy algorithms for multi-armed bandits

Cited: 0
Author
Ohno, Hiroshi [1]
Affiliation
[1] Toyota Cent Res & Dev Labs Inc, 41-1 Yokomichi, Nagakute, Aichi 4801192, Japan
Keywords
Multi-armed bandits; ε-greedy algorithm; MovieLens dataset; Quantum amplitude amplification; Regret analysis
DOI
10.1007/s11128-023-03844-2
Chinese Library Classification (CLC)
O4 [Physics]
Subject Classification Code
0702
Abstract
Multi-armed bandits are widely used in machine learning applications such as recommendation systems. Here, we implement two quantum versions of the ε-greedy algorithm, a popular algorithm for multi-armed bandits. One of the quantum greedy algorithms uses a quantum maximization algorithm, and the other is a simple algorithm that uses an amplitude encoding method as a quantum subroutine in place of the argmax operation in the ε-greedy algorithm. For the former algorithm, given a quantum oracle, the query complexity is O(√K) in each round, where K is the number of arms. For the latter algorithm, quantum parallelism is achieved through the quantum superposition of the arms, and the run-time complexity is O(K)/O(log K) in each round. Bernoulli reward distributions and the MovieLens dataset are used to evaluate the algorithms against their classical counterparts. The experimental results show that, for best arm identification, the performance of the quantum greedy algorithm is comparable with that of its classical counterparts.
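As the abstract describes, both quantum variants modify only the exploitation step of ε-greedy: the classical argmax over the empirical arm values is replaced by a quantum maximum-finding routine (about O(√K) oracle queries per round) or by an amplitude-encoding subroutine over the K arms. For reference, below is a minimal classical ε-greedy sketch for Bernoulli bandits with that argmax step made explicit; all function and variable names are illustrative and not taken from the paper.

import numpy as np

def epsilon_greedy(arm_probs, n_rounds=10000, epsilon=0.1, seed=0):
    # Classical epsilon-greedy baseline for Bernoulli bandits.
    # The paper's quantum variants replace only the argmax (exploitation)
    # step below: one with quantum maximum finding (~O(sqrt(K)) queries),
    # the other with amplitude encoding over the K arms. This sketch is
    # the classical reference point, not the paper's quantum algorithm.
    rng = np.random.default_rng(seed)
    K = len(arm_probs)
    counts = np.zeros(K)   # number of pulls per arm
    values = np.zeros(K)   # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = int(rng.integers(K))       # explore: uniform random arm
        else:
            arm = int(np.argmax(values))     # exploit: classical argmax, O(K) scan
        reward = float(rng.random() < arm_probs[arm])   # Bernoulli reward draw
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean update
        total_reward += reward
    return values, total_reward

# Example: five Bernoulli arms; the last arm has the highest mean reward.
est_means, cum_reward = epsilon_greedy([0.1, 0.3, 0.5, 0.7, 0.9])

A linear scan for the argmax costs O(K) per round; the quoted O(√K) query complexity of the quantum maximum-finding variant applies to exactly this step, assuming access to the quantum oracle described in the paper.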
Pages: 20
Related Papers
50 records in total
  • [31] Multi-Armed Bandits With Costly Probes
    Elumar, Eray Can
    Tekin, Cem
    Yagan, Osman
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2025, 71 (01) : 618 - 643
  • [32] On Optimal Foraging and Multi-armed Bandits
    Srivastava, Vaibhav
    Reverdy, Paul
    Leonard, Naomi E.
    2013 51ST ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2013, : 494 - 499
  • [33] Multi-armed Bandits and the Gittins Index
    Whittle, P.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1980, 42 (02) : 143 - 149
  • [34] Multi-armed bandits with episode context
    Rosin, Christopher D.
    Annals of Mathematics and Artificial Intelligence, 2011, 61 : 203 - 230
  • [35] Multi-armed bandits with switching penalties
    Asawa, M
    Teneketzis, D
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1996, 41 (03) : 328 - 348
  • [36] Active Learning in Multi-armed Bandits
    Antos, Andras
    Grover, Varun
    Szepesvari, Csaba
    ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2008, 5254 : 287 - +
  • [37] Multi-Armed Bandits With Correlated Arms
    Gupta, Samarth
    Chaudhari, Shreyas
    Joshi, Gauri
    Yagan, Osman
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2021, 67 (10) : 6711 - 6732
  • [38] Multi-Armed Bandits with Cost Subsidy
    Sinha, Deeksha
    Sankararama, Karthik Abinav
    Kazerouni, Abbas
    Avadhanula, Vashist
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [39] Batched Multi-armed Bandits Problem
    Gao, Zijun
    Han, Yanjun
    Ren, Zhimei
    Zhou, Zhengqing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [40] Are Multi-Armed Bandits Susceptible to Peeking?
    Loecher, Markus
    ZAGREB INTERNATIONAL REVIEW OF ECONOMICS & BUSINESS, 2018, 21 (01) : 95 - 104