Information theory for ranking and selection

Cited by: 1
Authors
Delshad, Saeid [1]; Khademi, Amin [1]
Affiliation
[1] Clemson Univ, Dept Ind Engn, Clemson, SC 29631 USA
Funding
National Science Foundation (USA)
Keywords
consistency; entropy; information gradient; ranking and selection; BUDGET ALLOCATION; SIMULATION; OPTIMIZATION; DESIGN; POLICY
DOI
10.1002/nav.21903
Chinese Library Classification (CLC)
C93 [Management]; O22 [Operations Research]
Discipline codes
070105; 12; 1201; 1202; 120202
Abstract
We study the classical ranking and selection problem, where the ultimate goal is to find the unknown best alternative in terms of the probability of correct selection or expected opportunity cost. This paper adopts an alternative sampling approach to that goal: sampling decisions are made to maximize information about the unknown best alternative, or equivalently, to minimize its Shannon entropy. This adaptive learning problem is formulated as a Bayesian stochastic dynamic program, from which several properties of the learning problem are derived, including the monotonicity of the optimal value function in an information-seeking setting. Since the state space of the stochastic dynamic program is unbounded in the Gaussian setting, a one-step look-ahead approach is used to develop a policy. The proposed policy seeks to maximize the one-step information gain about the unknown best alternative and is therefore called information gradient (IG). It is also proved that the IG policy is consistent; that is, as the sampling budget grows to infinity, the IG policy finds the true best alternative almost surely. Finally, a computationally efficient approximation of the proposed policy, called approximated information gradient (AIG), is introduced; numerical experiments test its performance against recent benchmarks and include several sensitivity analyses. Results show that AIG performs competitively against other algorithms from the literature.
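The one-step look-ahead idea behind the IG policy can be sketched in a small Gaussian example. The sketch below is an illustrative assumption, not the paper's implementation: it estimates the Shannon entropy of the identity of the best alternative by Monte Carlo over independent Gaussian posteriors, then scores each alternative by the average entropy reduction that a hypothetical ("fantasy") next observation from it would produce under a conjugate normal update. All function names and the fantasy-averaging scheme are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def best_entropy(mu, sigma, n_mc=5000):
    """Shannon entropy of the identity of the best alternative,
    estimated by Monte Carlo over independent Gaussian posteriors."""
    draws = rng.normal(mu, sigma, size=(n_mc, len(mu)))
    counts = np.bincount(draws.argmax(axis=1), minlength=len(mu))
    p = counts / n_mc
    p = p[p > 0]                       # drop zero cells: 0*log(0) = 0
    return -np.sum(p * np.log(p))

def aig_step(mu, sigma2, tau2, n_fantasy=20):
    """One-step look-ahead: pick the alternative whose hypothetical
    next sample would most reduce the entropy of the best-alternative
    distribution, averaged over fantasy observations.

    mu, sigma2 : posterior means and variances (one per alternative)
    tau2       : known sampling (observation) variance
    """
    k = len(mu)
    h_now = best_entropy(mu, np.sqrt(sigma2))
    gains = np.empty(k)
    for i in range(k):
        # posterior-predictive std of a new observation from alternative i
        pred_sd = np.sqrt(sigma2[i] + tau2)
        h_post = 0.0
        for _ in range(n_fantasy):
            y = rng.normal(mu[i], pred_sd)
            # conjugate Gaussian update for alternative i only
            new_s2 = 1.0 / (1.0 / sigma2[i] + 1.0 / tau2)
            new_mu = new_s2 * (mu[i] / sigma2[i] + y / tau2)
            mu2, s2 = mu.copy(), sigma2.copy()
            mu2[i], s2[i] = new_mu, new_s2
            h_post += best_entropy(mu2, np.sqrt(s2))
        gains[i] = h_now - h_post / n_fantasy
    return int(np.argmax(gains))
```

A budget-constrained loop would repeatedly call `aig_step`, draw a real sample from the chosen alternative, apply the same conjugate update, and finally report `argmax(mu)` as the selected system. The Monte Carlo entropy estimator is the crude part; the paper's AIG policy is a more efficient approximation of this quantity.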
Pages: 239-253
Page count: 15