Sample complexity of partition identification using multi-armed bandits

被引:0
|
作者
Juneja, Sandeep [1 ]
Krishnasamy, Subhashini [1 ]
机构
[1] TIFR, Mumbai, Maharashtra, India
来源
CONFERENCE ON LEARNING THEORY, VOL 99 | 2019年 / 99卷
关键词
multi-armed bandits; best arm identification; pure exploration; partition identification; ALLOCATION; SIMULATION;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Given a vector of probability distributions, or arms, each of which can be sampled independently, we consider the problem of identifying the partition to which this vector belongs from a finitely partitioned universe of such vector of distributions. We study this as a pure exploration problem in multi-armed bandit settings and develop sample complexity bounds on the total mean number of samples required for identifying the correct partition with high probability. This framework subsumes well studied problems such as finding the best arm or the best few arms. We consider distributions belonging to the single parameter exponential family and primarily consider partitions where the vector of means of arms lie either in a given set or its complement. The sets considered correspond to distributions where there exists a mean above a specified threshold, where the set is a half space and where either the set or its complement is a polytope, or more generally, a convex set. In these settings, we characterize the lower bounds on mean number of samples for each arm highlighting their dependence on the problem geometry. Further, inspired by the lower bounds, we propose algorithms that can match these bounds asymptotically with decreasing probability of error. Applications of this framework may be diverse. We briefly discuss a few associated with simulation in finance.
引用
收藏
页数:29
相关论文
共 50 条
  • [41] Decentralized AP selection using Multi-Armed Bandits: Opportunistic ε-Greedy with Stickiness
    Carrascosa, Marc
    Bellalta, Boris
    2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2019, : 309 - 315
  • [42] Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
    Audibert, Jean-Yves
    Munos, Remi
    Szepesvari, Csaba
    THEORETICAL COMPUTER SCIENCE, 2009, 410 (19) : 1876 - 1902
  • [43] Personalizing Natural Language Understanding using Multi-armed Bandits and Implicit Feedback
    Moerchen, Fabian
    Ernst, Patrick
    Zappella, Giovanni
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 2661 - 2668
  • [44] Scheduling for Massive MIMO With Hybrid Precoding Using Contextual Multi-Armed Bandits
    Mauricio, Weskley V. F.
    Maciel, Tarcisio Ferreira
    Klein, Anja
    Marques Lima, Francisco Rafael
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (07) : 7397 - 7413
  • [45] PAC models in stochastic multi-objective multi-armed bandits
    Drugan, Madalina M.
    PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017, : 409 - 416
  • [46] Multi-User Multi-Armed Bandits for Uncoordinated Spectrum Access
    Bande, Meghana
    Veeravalli, Venugopal V.
    2019 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS (ICNC), 2019, : 653 - 657
  • [47] Dynamic Estimation of Rater Reliability in Subjective Tasks Using Multi-Armed Bandits
    Tarasov, Alexey
    Delany, Sarah Jane
    Mac Namee, Brian
    Proceedings of 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing (SocialCom/PASSAT 2012), 2012, : 979 - 980
  • [48] Multi-armed Bandits with Generalized Temporally-Partitioned Rewards
    van den Broek, Ronald C.
    Litjens, Rik
    Sagis, Tobias
    Verbeeke, Nina
    Gajane, Pratik
    ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT I, IDA 2024, 2024, 14641 : 41 - 52
  • [49] Robust Risk-Averse Stochastic Multi-armed Bandits
    Maillard, Odalric-Ambrym
    ALGORITHMIC LEARNING THEORY (ALT 2013), 2013, 8139 : 218 - 233
  • [50] Unreliable Multi-Armed Bandits: A Novel Approach to Recommendation Systems
    Ravi, Aditya Narayan
    Poduval, Pranav
    Moharir, Sharayu
    2020 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2020,