An Improved Analysis of LP-Based Control for Revenue Management

Cited by: 0
Authors
Chen, Guanting [1 ]
Li, Xiaocheng [2 ]
Ye, Yinyu [3 ]
Affiliations
[1] University of North Carolina at Chapel Hill, Department of Statistics & Operations Research, Chapel Hill, NC 27599, USA
[2] Imperial College London, Imperial College Business School, London SW7 2AZ, England
[3] Stanford University, Department of Management Science & Engineering, Stanford, CA 94305, USA
Keywords
linear programming; online learning; revenue management
DOI
10.1287/opre.2022.2358
Chinese Library Classification
C93 [Management]
Discipline Classification Codes
12; 1201; 1202; 120202
Abstract
In this paper, we study a class of revenue-management problems in which the decision maker aims to maximize total revenue subject to budget constraints on multiple types of resources over a finite horizon. At each time, a new order/customer/bid is revealed with a request for some resource(s) and a reward, and the decision maker must either accept or reject the order. Upon acceptance, the resource request must be satisfied, and the associated revenue (reward) is collected. We consider a stochastic setting where all the orders are sampled independently and identically distributed (i.i.d.); that is, the reward-request pair at each time is drawn from an unknown distribution with finite support. The formulation covers many classic applications, such as the quantity-based network revenue management problem and the AdWords problem. We focus on the classic linear program (LP)-based adaptive algorithm and measure performance by regret, defined as the gap between the optimal objective value of the certainty-equivalent LP and the expected revenue obtained by the online algorithm. Our contribution is twofold: (i) when the underlying LP is nondegenerate, the algorithm achieves a problem-dependent regret upper bound that is independent of the horizon, that is, the number of time periods T; and (ii) when the underlying LP is degenerate, the algorithm achieves a tight regret upper bound that scales on the order of √(T log T) and matches the lower bound up to a logarithmic factor. To our knowledge, both results are new and improve the best existing bounds for the LP-based adaptive algorithm in the corresponding settings. We conclude with numerical experiments that further demonstrate our findings.
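For readers who want a concrete picture of the kind of policy the abstract analyzes, the following is a minimal Python sketch of an LP-based adaptive (re-solving) control on a synthetic instance. The instance parameters (T, m, d, p, r, A, b) and the simple pseudo-count estimator of the arrival distribution are illustrative assumptions, not taken from the paper; the paper's exact algorithm, estimation scheme, and tie-breaking rules differ in detail.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# --- Synthetic instance (hypothetical; for illustration only) ---
T = 1000                                   # horizon length
m = 4                                      # number of order types (finite support)
d = 2                                      # number of resource types
p = np.array([0.3, 0.3, 0.2, 0.2])         # true (unknown) type probabilities
r = np.array([5.0, 3.0, 2.0, 1.0])         # reward of each order type
A = rng.integers(0, 3, size=(d, m)).astype(float)  # resource request per type
b = np.full(d, 0.4 * T)                    # initial resource budgets

def solve_ce_lp(p_hat, budget_rate):
    """Certainty-equivalent LP:
       max  sum_j p_j r_j x_j
       s.t. sum_j p_j A_ij x_j <= budget_rate_i,  0 <= x_j <= 1.
    Returns per-type acceptance probabilities x*."""
    res = linprog(c=-(p_hat * r),          # linprog minimizes, so negate
                  A_ub=A * p_hat,          # broadcasts p_hat over columns
                  b_ub=budget_rate,
                  bounds=[(0.0, 1.0)] * m,
                  method="highs")
    return res.x

# --- LP-based adaptive (re-solving) policy: a minimal sketch ---
remaining = b.copy()
counts = np.ones(m)                        # pseudo-counts to estimate p online
revenue = 0.0
for t in range(T):
    j = rng.choice(m, p=p)                 # an order of type j arrives
    counts[j] += 1
    p_hat = counts / counts.sum()          # empirical distribution estimate
    x = solve_ce_lp(p_hat, remaining / (T - t))  # re-solve with remaining budget
    accept = rng.random() < x[j] and np.all(remaining >= A[:, j])
    if accept:
        remaining -= A[:, j]
        revenue += r[j]
print(f"collected revenue: {revenue:.1f}")

The regret studied in the paper would compare this policy's expected revenue against the optimal value of the certainty-equivalent LP over the full horizon; the re-solving step, which rescales the remaining budget by the remaining time, is what makes the control adaptive.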
Pages: 1124-1138
Page count: 16