Sample complexity of partition identification using multi-armed bandits

被引:0
|
作者
Juneja, Sandeep [1 ]
Krishnasamy, Subhashini [1 ]
机构
[1] TIFR, Mumbai, Maharashtra, India
来源
CONFERENCE ON LEARNING THEORY, VOL 99 | 2019年 / 99卷
关键词
multi-armed bandits; best arm identification; pure exploration; partition identification; ALLOCATION; SIMULATION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Given a vector of probability distributions, or arms, each of which can be sampled independently, we consider the problem of identifying the partition to which this vector belongs from a finitely partitioned universe of such vector of distributions. We study this as a pure exploration problem in multi-armed bandit settings and develop sample complexity bounds on the total mean number of samples required for identifying the correct partition with high probability. This framework subsumes well studied problems such as finding the best arm or the best few arms. We consider distributions belonging to the single parameter exponential family and primarily consider partitions where the vector of means of arms lie either in a given set or its complement. The sets considered correspond to distributions where there exists a mean above a specified threshold, where the set is a half space and where either the set or its complement is a polytope, or more generally, a convex set. In these settings, we characterize the lower bounds on mean number of samples for each arm highlighting their dependence on the problem geometry. Further, inspired by the lower bounds, we propose algorithms that can match these bounds asymptotically with decreasing probability of error. Applications of this framework may be diverse. We briefly discuss a few associated with simulation in finance.
引用
收藏
页数:29
相关论文
共 50 条
  • [21] MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS
    Kim, Dong Woo
    Lai, Tze Leung
    Xu, Huanzhong
    STATISTICA SINICA, 2021, 31 : 2275 - 2287
  • [22] Stochastic Multi-Armed Bandits with Control Variates
    Verma, Arun
    Hanawal, Manjesh K.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [23] Quantum greedy algorithms for multi-armed bandits
    Ohno, Hiroshi
    QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
  • [24] MULTI-ARMED BANDITS IN MULTI-AGENT NETWORKS
    Shahrampour, Shahin
    Rakhlin, Alexander
    Jadbabaie, Ali
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2786 - 2790
  • [25] Multi-Armed Bandits and Quantum Channel Oracles
    Buchholz, Simon
    Kuebler, Jonas M.
    Schoelkopf, Bernhard
    QUANTUM, 2025, 9
  • [26] On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
    Kaufmann, Emilie
    Cappe, Olivier
    Garivier, Aurelien
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [27] On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits
    Barrier, Antoine
    Garivier, Aurelien
    Stoltz, Gilles
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 136 - 181
  • [28] Constructing Test Collections using Multi-armed Bandits and Active Learning
    Rahman, Md Mustafizur
    Kutlu, Mucahid
    Lease, Matthew
    WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019, : 3158 - 3164
  • [29] Efficient crowdsourcing of unknown experts using bounded multi-armed bandits
    Long Tran-Thanh
    Stein, Sebastian
    Rogers, Alex
    Jennings, Nicholas R.
    ARTIFICIAL INTELLIGENCE, 2014, 214 : 89 - 111
  • [30] Decentralized Learning for Multi-player Multi-armed Bandits
    Kalathil, Dileep
    Nayyar, Naumaan
    Jain, Rahul
    2012 IEEE 51ST ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2012, : 3960 - 3965