An Optimal Algorithm for Adversarial Bandit Problem with Multiple Plays

被引:0
作者
Vural, N. Mert [1 ]
Ozturk, Bugra [1 ,2 ]
Kozat, Suleyman S. [1 ,2 ]
机构
[1] Bilkent Univ, Ankara, Turkey
[2] DataBoss AS, Ankara, Turkey
来源
2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU) | 2020年
关键词
adversarial multi-armed problem; combinatorial optimization; general framework; efficient implementation;
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
In this paper, we study the adversarial bandit problem with multiple plays. We introduce a highly efficient multiple-play bandit algorithm that achieves the minimax optimal performance with high probability. As our performance guarantee holds with high probability, our algorithm provides more robust performance compared to the state-of-the-art. Moreover, since we do not have any statistical assumptions on the bandit arms, the results in the paper are guaranteed to hold in an individual sequence manner. Hence, our algorithm can be used for a wide range of combinatorial applications that require exploration and exploitation at the same time.
引用
收藏
页数:4
相关论文
共 11 条
[1]  
[Anonymous], 2006, PREDICTION LEARNING
[2]  
Auer P, 1995, AN S FDN CO, P322, DOI 10.1109/SFCS.1995.492488
[3]  
Beygelzimer A, 2011, Artificial Intelligence and Statistics (AISTATS), V15, P19
[4]  
Bubeck G., 2012, ABS12044710 CORR
[5]   Dependent rounding and its applications to approximation algorithms [J].
Gandhi, Rajiv ;
Khuller, Samir ;
Parthasarathy, Srinivasan ;
Srinivasan, Aravind .
JOURNAL OF THE ACM, 2006, 53 (03) :324-360
[6]  
György A, 2007, J MACH LEARN RES, V8, P2369
[7]  
Kale Satyen, 2010, P 23 INT C NEUR INF, P1054
[8]  
Neu G., 2015, ABS150603271 CORR
[9]  
Neu G, 2016, J MACH LEARN RES, V17, P1
[10]  
Uchiya T, 2010, LECT NOTES ARTIF INT, V6331, P375, DOI 10.1007/978-3-642-16108-7_30