Qualitative Multi-Armed Bandits: A Quantile-Based Approach

Cited by: 0
Authors
Szorenyi, Balazs [1, 5, 6]
Busa-Fekete, Robert [2]
Weng, Paul [3, 4]
Huellermeier, Eyke [2]
Affiliations
[1] INRIA Lille Nord Europe, SequeL Project, 40 Ave Halley, F-59650 Villeneuve d'Ascq, France
[2] Univ Paderborn, Dept Comp Sci, D-33098 Paderborn, Germany
[3] SYSU CMU Joint Inst Engn, Guangzhou 510006, Guangdong, Peoples R China
[4] SYSU CMU Shunde Int Joint Res Inst, Shunde 528300, Peoples R China
[5] MTA SZTE Res Grp Artificial Intelligence, H-6720 Szeged, Hungary
[6] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
Keywords
BOUNDS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We formalize and study the multi-armed bandit (MAB) problem in a generalized stochastic setting in which rewards are not assumed to be numerical. Instead, rewards are measured on a qualitative scale that allows for comparison but invalidates arithmetic operations such as averaging. Correspondingly, instead of characterizing an arm by the mean of the underlying distribution, we use a quantile of that distribution as a representative value. We address the problem of quantile-based online learning for both a finite time horizon (pure exploration) and an infinite time horizon (cumulative regret minimization). For both cases, we propose suitable algorithms and analyze their properties. These properties are also illustrated by preliminary experimental studies.
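The idea of representing an arm by a quantile rather than a mean can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: the ordinal scale `SCALE`, the sample lists, and the helper `empirical_quantile` are hypothetical, and the quantile definition used here (the smallest observed label with at least a fraction `tau` of samples at or below it) is an assumption that may differ from the paper's. The point is only that the computation needs comparisons, never sums or averages.

```python
from math import ceil

# Hypothetical ordinal scale, ordered from worst to best (an assumption for illustration).
SCALE = ["poor", "fair", "good", "excellent"]
RANK = {label: i for i, label in enumerate(SCALE)}


def empirical_quantile(samples, tau):
    """Return the smallest observed label x such that at least a
    fraction tau of the samples are <= x on the ordinal scale.

    Only order comparisons are used; the labels are never added or averaged.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < tau <= 1:
        raise ValueError("tau must lie in (0, 1]")
    ordered = sorted(samples, key=RANK.__getitem__)
    k = ceil(tau * len(ordered))  # how many samples must be <= the quantile
    return ordered[k - 1]


# Made-up reward samples for two arms (illustrative data only).
arm_a = ["poor", "good", "good", "excellent", "good"]
arm_b = ["fair", "fair", "excellent", "excellent", "poor"]

median_a = empirical_quantile(arm_a, 0.5)
median_b = empirical_quantile(arm_b, 0.5)

# Arms are compared through the rank of their quantile, not through a mean.
better = "A" if RANK[median_a] > RANK[median_b] else "B"
print(median_a, median_b, better)
```

With these made-up samples, arm A has the higher median even though arm B receives "excellent" more often, which shows how the choice of `tau` determines what counts as the best arm.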
Pages: 1660-1668
Number of pages: 9