Optimal Regularized Online Allocation by Adaptive Re-Solving

被引：0

作者：

Ma, Wanteng ^{[1
]}

Cao, Ying ^{[2
]}

Tsang, Danny H. K. ^{[2
]}

Xia, Dong ^{[1
]}

机构：

[1] Hong Kong Univ Sci & Technol, Dept Math, Kowloon, Clearwater Bay, Hong Kong, Peoples R China

[2] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Kowloon, Clearwater Bay, Hong Kong, Peoples R China

来源：

OPERATIONS RESEARCH | 2024年

关键词：

online allocation problems; regret; re-solving strategy; NETWORK REVENUE MANAGEMENT; MAX-MIN; OPTIMIZATION; ALGORITHMS; FAIRNESS; REGRET; POLICY;

D O I：

10.1287/opre.2022.0486

中图分类号：

C93 [管理学];

学科分类号：

12 ; 1201 ; 1202 ; 120202 ;

摘要：

This paper introduces a dual -based algorithm framework for solving the regularized online resource allocation problems, which have potentially nonconcave cumulative rewards, hard resource constraints, and a nonseparable regularizer. Under a strategy of adaptively updating the resource constraints, the proposed framework only requests approximate solutions to the empirical dual problems up to a certain accuracy and yet delivers an optimal logarithmic regret under a locally second -order growth condition. Surprisingly, a delicate analysis of the dual objective function enables us to eliminate the notorious log -log factor in regret bound. The flexible framework renders renowned and computationally fast algorithms immediately applicable, for example, dual stochastic gradient descent. Additionally, an infrequent re -solving scheme is proposed, which significantly reduces computational demands without compromising the optimal regret performance. A worst -case square -root regret lower bound is established if the resource constraints are not adaptively updated during dual optimization, which underscores the critical role of adaptive dual variable update. Comprehensive numerical experiments demonstrate the merits of the proposed algorithm framework.

引用

页数：19

共 60 条

[1] Agrawal S., 2014, PROC 26 ANN ACM SIAM, P1405
[2] Agrawal S, 2018, PR MACH LEARN RES, V80
[3] A Dynamic Near-Optimal Algorithm for Online Linear Programming
Agrawal, Shipra
Wang, Zizhuo
Ye, Yinyu
[J]. OPERATIONS RESEARCH, 2014, 62 (04) : 876 - 890
[4] Convex Optimization: Algorithms and Complexity
不详
[J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2015, 8 (3-4): : 232 - +
[5] Arlotto A., 2019, STOCHASTIC SYSTEMS, V9, P231
[6] Babaioff M, 2007, LECT NOTES COMPUT SC, V4627, P16
[7] Balseiro S, 2021, PR MACH LEARN RES, V139
[8] Balseiro S, 2020, PR MACH LEARN RES, V119
[9] Survey of Dynamic Resource-Constrained Reward Collection Problems: Unified Model and Analysis
Balseiro, Santiago R.
Besbes, Omar
Pizarro, Dana
[J]. OPERATIONS RESEARCH, 2024, 72 (05) : 2168 - 2189
[10] The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems
Balseiro, Santiago R.
Lu, Haihao
Mirrokni, Vahab
[J]. OPERATIONS RESEARCH, 2023, 71 (01) : 101 - 119

← 1 2 3 4 5 6 →