Qualitative Multi-Armed Bandits: A Quantile-Based Approach

Cited by: 0

Authors:

Szorenyi, Balazs [1,5,6]

Busa-Fekete, Robert [2]

Weng, Paul [3,4]

Huellermeier, Eyke [2]

Affiliations:

[1] INRIA Lille Nord Europe, SequeL Project, 40 Ave Halley, F-59650 Villeneuve d'Ascq, France

[2] Univ Paderborn, Dept Comp Sci, D-33098 Paderborn, Germany

[3] SYSU CMU Joint Inst Engn, Guangzhou 510006, Guangdong, Peoples R China

[4] SYSU CMU Shunde Int Joint Res Inst, Shunde 528300, Peoples R China

[5] MTA SZTE Res Grp Artificial Intelligence, H-6720 Szeged, Hungary

[6] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37 | 2015 / Vol. 37

Keywords:

BOUNDS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We formalize and study the multi-armed bandit (MAB) problem in a generalized stochastic setting, in which rewards are not assumed to be numerical. Instead, rewards are measured on a qualitative scale that allows for comparison but invalidates arithmetic operations such as averaging. Correspondingly, instead of characterizing an arm in terms of the mean of the underlying distribution, we opt for using a quantile of that distribution as a representative value. We address the problem of quantile-based online learning both for the case of a finite (pure exploration) and infinite time horizon (cumulative regret minimization). For both cases, we propose suitable algorithms and analyze their properties. These properties are also illustrated by means of first experimental studies.
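
To make the quantile idea in the abstract concrete, here is a minimal Python sketch of comparing arms by an empirical quantile when rewards are only ordered, not numeric. It is an illustration under stated assumptions, not the algorithms proposed in the paper: the `Grade` scale, the arm names, the sample data, and the helper functions are all hypothetical, and the quantile convention used (the ceil(tau * n)-th order statistic) is one common choice that may differ from the paper's exact definition.

```python
import math
from enum import IntEnum


class Grade(IntEnum):
    """Hypothetical ordinal reward scale; the integers encode order only."""
    POOR = 0
    FAIR = 1
    GOOD = 2
    EXCELLENT = 3


def empirical_quantile(samples, tau):
    """Return the ceil(tau * n)-th smallest sample (assumed convention).

    Only sorting (pairwise comparison) is used; rewards are never
    summed or averaged.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    ordered = sorted(samples)
    # 1-based rank ceil(tau * n), shifted to a 0-based list index.
    index = math.ceil(tau * len(ordered)) - 1
    return ordered[index]


def best_arm_by_quantile(observations, tau):
    """Pick the arm with the highest empirical tau-quantile."""
    return max(
        observations,
        key=lambda arm: empirical_quantile(observations[arm], tau),
    )


# Hypothetical observations for two arms.
observations = {
    "arm_a": [Grade.POOR, Grade.GOOD, Grade.GOOD, Grade.EXCELLENT],
    "arm_b": [Grade.FAIR, Grade.FAIR, Grade.FAIR, Grade.EXCELLENT],
}
median_a = empirical_quantile(observations["arm_a"], 0.5)
median_b = empirical_quantile(observations["arm_b"], 0.5)
chosen = best_arm_by_quantile(observations, 0.5)
```

With these samples the median of `arm_a` is `GOOD` and the median of `arm_b` is `FAIR`, so the sketch selects `arm_a` without ever averaging the grades.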

Pages: 1660-1668

Number of pages: 9
