Multi-arme d bandit-base d hyper-heuristics for combinatorial optimization problems

被引:8
作者
Lagos, Felipe [1 ]
Pereira, Jordi [2 ,3 ]
机构
[1] Univ Adolfo Ibanez, Fac Engn & Sci, Santiago 7941169, Chile
[2] Univ Adolfo Ibanez, Fac Engn & Sci, Vina Del Mar, Chile
[3] UPF Barcelona Sch Management, Innovat & Sustainabil Data Lab ISDaLab, C Balmes 132-134, Barcelona, Spain
关键词
Metaheuristics; Combinatorial optimization; Hyper-heuristics; Machine learning; SELECTION; ALGORITHM; FRAMEWORK; DESIGN;
D O I
10.1016/j.ejor.2023.06.016
中图分类号
C93 [管理学];
学科分类号
12 ; 1201 ; 1202 ; 120202 ;
摘要
There are significant research opportunities in the integration of Machine Learning (ML) methods and Combinatorial Optimization Problems (COPs). In this work, we focus on metaheuristics to solve COPs that have an important learning component. These algorithms must explore a solution space and learn from the information they obtain in order to find high-quality solutions. Among the metaheuristics, we study Hyper-Heuristics (HHs), algorithms that, given a number of low-level heuristics, iteratively select and apply heuristics to a solution. The HH we consider has a Markov model to produce sequences of low-level heuristics, which we combine with a Multi-Armed Bandit Problem (MAB)-based method to learn its parameters. This work proposes several improvements to the HH metaheuristic that yields a better learning for solving problem instances. Specifically, this is the first work in HHs to present Exponential Weights for Exploration and Exploitation (EXP3) as a learning method, an algorithm that is able to deal with adversarial settings. We also present a case study for the Vehicle Routing Problem with Time Windows (VRPTW), for which we include a list of low-level heuristics that have been proposed in the literature. We show that our algorithms can handle a large and diverse list of heuristics, illustrating that they can be easily configured to solve COPs of different nature. The computational results indicate that our algorithms are competitive methods for the VRPTW (2.16% gap on average with respect to the best known solutions), demonstrating the potential of these algorithms to solve COPs. Finally, we show how algorithms can even detect low-level heuristics that do not contribute to finding better solutions to the problem.& COPY; 2023 Elsevier B.V. All rights reserved.
引用
收藏
页码:70 / 91
页数:22
相关论文
共 75 条
[21]   THE TRUCK DISPATCHING PROBLEM [J].
DANTZIG, GB ;
RAMSER, JH .
MANAGEMENT SCIENCE, 1959, 6 (01) :80-91
[22]  
Denzinger J, 1997, INT JOINT CONF ARTIF, P102
[23]  
Drake John H., 2012, Parallel Problem Solving from Nature - PPSN XII. Proceedings of the 12th International Conference, P307, DOI 10.1007/978-3-642-32964-7_31
[24]   Recent advances in selection hyper-heuristics [J].
Drake, John H. ;
Kheiri, Ahmed ;
Ozcan, Ender ;
Burke, Edmund K. .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2020, 285 (02) :405-428
[25]   A Case Study of Controlling Crossover in a Selection Hyper-heuristic Framework Using the Multidimensional Knapsack Problem [J].
Drake, John H. ;
Ozcan, Ender ;
Burke, Edmund K. .
EVOLUTIONARY COMPUTATION, 2016, 24 (01) :113-141
[26]   NEW OPTIMIZATION HEURISTICS - THE GREAT DELUGE ALGORITHM AND THE RECORD-TO-RECORD TRAVEL [J].
DUECK, G .
JOURNAL OF COMPUTATIONAL PHYSICS, 1993, 104 (01) :86-92
[27]   Analyzing bandit-based adaptive operator selection mechanisms [J].
Fialho, Alvaro ;
Da Costa, Luis ;
Schoenauer, Marc ;
Sebag, Michele .
ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2010, 60 (1-2) :25-64
[28]  
Fisher H., 1963, IND SCHEDULING, P225, DOI DOI 10.1109/ICAL.2009.5262867
[29]  
Frey B.B., 2022, The SAGE Encyclopedia of Research Design, DOI [DOI 10.4135/9781506326139, 10.4135/9781506326139, DOI 10.4135/9781071812082]
[30]  
Gehring Hermann., 1999, P EUROGEN99 SHORT CO, P57