Identifying Outlier Arms in Multi-Armed Bandit

被引:0
作者
Zhuang, Honglei [1 ,2 ]
Wang, Chi [2 ]
Wang, Yifan [3 ]
机构
[1] Univ Illinois, Champaign, IL 61820 USA
[2] Microsoft Res, Redmond, WA USA
[3] Tsinghua Univ, Beijing, Peoples R China
来源
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) | 2017年 / 30卷
基金
美国国家科学基金会;
关键词
APPROXIMATE;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We study a novel problem lying at the intersection of two areas: multi-armed bandit and outlier detection. Multi-armed bandit is a useful tool to model the process of incrementally collecting data for multiple objects in a decision space. Outlier detection is a powerful method to narrow down the attention to a few objects after the data for them are collected. However, no one has studied how to detect outlier objects while incrementally collecting data for them, which is necessary when data collection is expensive. We formalize this problem as identifying outlier arms in a multi-armed bandit. We propose two sampling strategies with theoretical guarantee, and analyze their sampling efficiency. Our experimental results on both synthetic and real data show that our solution saves 70-99% of data collection cost from baseline while having nearly perfect accuracy.
引用
收藏
页数:10
相关论文
共 30 条
[1]  
Aggarwal C. C., 2008, SDM, P483, DOI 10.1137/1.9781611972788.44
[2]   Approximate is better than "exact" for interval estimation of binomial proportions [J].
Agresti, A ;
Coull, BA .
AMERICAN STATISTICIAN, 1998, 52 (02) :119-126
[3]  
[Anonymous], 2006, Proceedings of the 12th international conference on Knowledge discovery and data mining
[4]  
[Anonymous], 2010, P COLT
[5]  
[Anonymous], 2006, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, DOI 10.1145/1150402.1150501
[6]  
[Anonymous], 2012, Advances in Neural Information Processing Systems
[7]  
[Anonymous], 2012, ICML
[8]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[9]  
Bubeck S., 2013, PMLR, P258
[10]   Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems [J].
Bubeck, Sebastien ;
Cesa-Bianchi, Nicolo .
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2012, 5 (01) :1-122