Sample complexity of partition identification using multi-armed bandits

被引：0

作者：

Juneja, Sandeep ^{[1
]}

Krishnasamy, Subhashini ^{[1
]}

机构：

[1] TIFR, Mumbai, Maharashtra, India

来源：

CONFERENCE ON LEARNING THEORY, VOL 99 | 2019年 / 99卷

关键词：

multi-armed bandits; best arm identification; pure exploration; partition identification; ALLOCATION; SIMULATION;

D O I：

暂无

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Given a vector of probability distributions, or arms, each of which can be sampled independently, we consider the problem of identifying the partition to which this vector belongs from a finitely partitioned universe of such vector of distributions. We study this as a pure exploration problem in multi-armed bandit settings and develop sample complexity bounds on the total mean number of samples required for identifying the correct partition with high probability. This framework subsumes well studied problems such as finding the best arm or the best few arms. We consider distributions belonging to the single parameter exponential family and primarily consider partitions where the vector of means of arms lie either in a given set or its complement. The sets considered correspond to distributions where there exists a mean above a specified threshold, where the set is a half space and where either the set or its complement is a polytope, or more generally, a convex set. In these settings, we characterize the lower bounds on mean number of samples for each arm highlighting their dependence on the problem geometry. Further, inspired by the lower bounds, we propose algorithms that can match these bounds asymptotically with decreasing probability of error. Applications of this framework may be diverse. We briefly discuss a few associated with simulation in finance.

引用

页数：29

共 50 条

[21] MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS
Kim, Dong Woo
Lai, Tze Leung
Xu, Huanzhong
STATISTICA SINICA, 2021, 31 : 2275 - 2287
[22] Stochastic Multi-Armed Bandits with Control Variates
Verma, Arun
Hanawal, Manjesh K.
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
[23] Quantum greedy algorithms for multi-armed bandits
Ohno, Hiroshi
QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
[24] MULTI-ARMED BANDITS IN MULTI-AGENT NETWORKS
Shahrampour, Shahin
Rakhlin, Alexander
Jadbabaie, Ali
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2786 - 2790
[25] Multi-Armed Bandits and Quantum Channel Oracles
Buchholz, Simon
Kuebler, Jonas M.
Schoelkopf, Bernhard
QUANTUM, 2025, 9
[26] On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
Kaufmann, Emilie
Cappe, Olivier
Garivier, Aurelien
JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
[27] On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits
Barrier, Antoine
Garivier, Aurelien
Stoltz, Gilles
INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 136 - 181
[28] Constructing Test Collections using Multi-armed Bandits and Active Learning
Rahman, Md Mustafizur
Kutlu, Mucahid
Lease, Matthew
WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019, : 3158 - 3164
[29] Efficient crowdsourcing of unknown experts using bounded multi-armed bandits
Long Tran-Thanh
Stein, Sebastian
Rogers, Alex
Jennings, Nicholas R.
ARTIFICIAL INTELLIGENCE, 2014, 214 : 89 - 111
[30] Decentralized Learning for Multi-player Multi-armed Bandits
Kalathil, Dileep
Nayyar, Naumaan
Jain, Rahul
2012 IEEE 51ST ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2012, : 3960 - 3965

← 1 2 3 4 5 →