Unifying offline and online simulation for online decision-making

Cited by: 3
Authors
Liu, Haitao [1]
Liang, Jinpeng [2]
Lee, Loo Hay [1]
Chew, Ek Peng [1]
Affiliations
[1] National University of Singapore, Department of Industrial Systems Engineering and Management, Singapore, Singapore
[2] Dalian Maritime University, School of Transportation Engineering, Dalian, People's Republic of China
Funding
National Science Foundation (United States);
Keywords
Offline-online learning; simulation optimization; online decision-making; ranking and selection with scenarios; Gaussian process; BUDGET ALLOCATION; KNOWLEDGE-GRADIENT; FRAMEWORK; OPTIMIZATION; SELECTION; RANKING;
DOI
10.1080/24725854.2021.2018739
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
Stochastic simulation is typically deployed for offline system design and control; however, the time delay in executing simulation hinders its application in making online decisions. With the rapid growth of computing power, simulation-based online optimization has emerged as an attractive research topic. We consider a problem of ranking and selection via simulation in the context of online decision-making, in which there exists a short time (referred to as online budget) after observing online scenarios. The goal is to select the best alternative conditional on each scenario. We propose a Unified Offline and Online Learning (UOOL) paradigm that exploits offline simulation, online scenarios, and online simulation budget simultaneously. Specifically, we model the mean performance of each alternative as a function of scenarios and learn a predictive model based on offline data. Then, we develop a sequential sampling procedure to generate online simulation data. The predictive model is updated based on offline and online data. Our theoretical result shows that online budget should be allocated to the revealed online scenario. Numerical experiments are conducted to demonstrate the superior performance of the UOOL paradigm and the benefits of offline and online simulation.
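The offline-then-online workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' UOOL procedure: the squared-exponential kernel, the toy `simulate` function, the zero-mean prior, the known noise variance, and the variance-greedy sampling rule are all assumptions chosen for illustration. The sketch fits one Gaussian process per alternative on offline data, spends the whole online budget at the revealed scenario `x0`, and then selects the alternative with the largest posterior mean there.

```python
import numpy as np

def rbf(a, b, length=0.3, var=1.0):
    """Squared-exponential kernel between 1-D scenario arrays (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise_var):
    """Zero-mean GP posterior mean and variance at x_new from noisy observations."""
    K = rbf(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_star = rbf(x_obs, x_new)
    mean = k_star.T @ np.linalg.solve(K, y_obs)
    var = rbf(x_new, x_new).diagonal() - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.maximum(var, 0.0)

def simulate(alt, x, rng, noise_sd):
    """Hypothetical noisy simulator: performance of alternative `alt` in scenario x."""
    true_mean = [np.sin(3 * x), 0.5 * x, 0.8 - x][alt]
    return true_mean + noise_sd * rng.normal(size=np.shape(x))

def select_best(x0, n_alts=3, offline_n=20, budget=30, noise_sd=0.5, seed=0):
    rng = np.random.default_rng(seed)
    noise_var = noise_sd ** 2  # assumed known for simplicity

    # Offline stage: simulate each alternative at randomly drawn scenarios.
    xs, ys = [], []
    for alt in range(n_alts):
        x = rng.uniform(0.0, 1.0, size=offline_n)
        xs.append(x)
        ys.append(simulate(alt, x, rng, noise_sd))

    # Online stage: every new replication is run at the revealed scenario x0.
    x_new = np.array([x0])
    for _ in range(budget):
        variances = [gp_posterior(xs[a], ys[a], x_new, noise_var)[1][0]
                     for a in range(n_alts)]
        a = int(np.argmax(variances))  # assumed rule: sample the most uncertain alternative
        xs[a] = np.append(xs[a], x0)
        ys[a] = np.append(ys[a], simulate(a, x0, rng, noise_sd))

    # Decision: pick the alternative with the largest posterior mean at x0.
    means = [gp_posterior(xs[a], ys[a], x_new, noise_var)[0][0] for a in range(n_alts)]
    return int(np.argmax(means)), means, [len(x) for x in xs]

best, means, counts = select_best(x0=0.5)
print("selected alternative:", best)
print("observations per alternative:", counts)
```

The predictive model here is refit from the pooled offline and online observations at each step, which mirrors the abstract's statement that the model is updated with both data sources. A knowledge-gradient or budget-allocation rule could replace the variance-greedy choice.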
Pages: 923-935
Number of pages: 13