Semi-Bandit Learning for Monotone Stochastic Optimization

被引:0
作者
Agarwal, Arpit [1 ]
Ghuge, Rohan [2 ]
Nagarajan, Viswanath [3 ]
机构
[1] Indian Inst Technol, Comp Sci & Engn, Mumbai, Maharashtra, India
[2] Georgia Inst Technol, H Milton Stewart Sch Ind & Syst Engn, Atlanta, GA 30332 USA
[3] Univ Michigan, Ind & Operat Engn, Ann Arbor, MI 48109 USA
来源
2024 IEEE 65TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, FOCS 2024 | 2024年
关键词
online learning; online-to-offline; adaptivity gaps; ALGORITHMS; KNAPSACK; BOUNDS;
D O I
10.1109/FOCS61266.2024.00083
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Stochastic optimization is a widely used approach for optimization under uncertainty, where uncertain input parameters are modeled by random variables. Exact or approximation algorithms have been obtained for several fundamental problems in this area. However, a significant limitation of this approach is that it requires full knowledge of the underlying probability distributions. Can we still get good (approximation) algorithms if these distributions are unknown, and the algorithm needs to learn them through repeated interactions? In this paper, we resolve this question for a large class of "monotone" stochastic problems, by providing a generic online learning algorithm with root T log T regret relative to the best approximation algorithm (under known distributions). Importantly, our online algorithm works in a semi-bandit setting, where in each period, the algorithm only observes samples from the random variables that were actually probed. Our framework applies to several fundamental problems in stochastic optimization such as prophet inequality, Pandora's box, stochastic knapsack, stochastic matchings and stochastic submodular optimization.
引用
收藏
页码:1260 / 1274
页数:15
相关论文
共 63 条
[1]   Submodular Stochastic Probing on Matroids [J].
Adamczyk, Marek ;
Sviridenko, Maxim ;
Ward, Justin .
MATHEMATICS OF OPERATIONS RESEARCH, 2016, 41 (03) :1022-1038
[2]   Improved analysis of the greedy algorithm for stochastic matching [J].
Adamczyk, Marek .
INFORMATION PROCESSING LETTERS, 2011, 111 (15) :731-737
[3]  
Agarwal A., 2023, arXiv, DOI arXiv:2312.15427
[4]   BAYESIAN COMBINATORIAL AUCTIONS: EXPANDING SINGLE BUYER MECHANISMS TO MANY BUYERS [J].
Alaei, Saeed .
SIAM JOURNAL ON COMPUTING, 2014, 43 (02) :930-972
[5]   Maximizing Stochastic Monotone Submodular Functions [J].
Asadpour, Arash ;
Nazerzadeh, Hamid .
MANAGEMENT SCIENCE, 2016, 62 (08) :2374-2391
[6]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[7]  
Azar MG, 2017, PR MACH LEARN RES, V70
[8]  
Azar P.D., 2014, P 25 S DISCRETE ALGO, P1358, DOI [DOI 10.1137/1.9781611973402.100, 10.1137/1.9781611973402.100]
[9]   When LP Is the Cure for Your Matching Woes: Improved Bounds for Stochastic Matchings [J].
Bansal, Nikhil ;
Gupta, Anupam ;
Li, Jian ;
Mestre, Julian ;
Nagarajan, Viswanath ;
Rudra, Atri .
ALGORITHMICA, 2012, 63 (04) :733-762
[10]  
Bhalgat A, 2011, PROCEEDINGS OF THE TWENTY-SECOND ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P1647