Learning Variable Ordering Heuristics with Multi-Armed Bandits and Restarts

被引：6

作者：

Wattez, Hugues ^{[1
,2
]}

Koriche, Frederic ^{[1
,2
]}

Lecoutre, Christophe ^{[1
,2
]}

Paparrizou, Anastasia ^{[1
,2
]}

Tabary, Sebastien ^{[1
,2
]}

机构：

[1] Univ Artois, CRIL, Arras, France

[2] CNRS, Paris, France

来源：

ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2020年 / 325卷

关键词：

SOLVER; SEARCH;

D O I：

10.3233/FAIA200115

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

In constraint-based applications, the user is often required to be an expert as, for a given problem instance, many parameters of the used solver must be manually tuned to improve its efficiency. Clearly, this background knowledge burdens the spread of constraint programming technology to non-expert users. In order to alleviate this issue, the idea of "autonomous" constraint solving is to adjust the solver parameters and to efficiently handle any problem instance without manual tuning. Notably, the choice of the variable ordering heuristic can lead to drastically different performances. A key question arises then: how can we find the best variable ordering heuristic for a problem instance, given a set of available heuristics provided by the solver? To answer this question, we propose an algorithmic framework that combines multi-armed bandits and restarts. Each candidate heuristic is viewed as an arm, and the framework learns to estimate the best heuristic using a multi-armed bandit algorithm. The common mechanism of restarts is used to provide feedback for reinforcing the bandit algorithm. Based on a thorough experimental evaluation, we demonstrate that this framework is able to find the best heuristic for most problem instances; notably, it outperforms the state-of-the-art in terms of time and solved instances.

引用

页码：371 / 378

页数：8

共 50 条

[1] Active Learning in Multi-armed Bandits
Antos, Andras
Grover, Varun
Szepesvari, Csaba
ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2008, 5254 : 287 - +
[2] TRANSFER LEARNING FOR CONTEXTUAL MULTI-ARMED BANDITS
Cai, Changxiao
Cai, T. Tony
Li, Hongzhe
ANNALS OF STATISTICS, 2024, 52 (01): : 207 - 232
[3] Quantum Reinforcement Learning for Multi-Armed Bandits
Liu, Yi-Pei
Li, Kuo
Cao, Xi
Jia, Qing-Shan
Wang, Xu
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 5675 - 5680
[4] On Kernelized Multi-armed Bandits
Chowdhury, Sayak Ray
Gopalan, Aditya
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[5] Multi-armed Bandits with Compensation
Wang, Siwei
Huang, Longbo
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[6] Regional Multi-Armed Bandits
Wang, Zhiyang
Zhou, Ruida
Shen, Cong
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
[7] Federated Multi-Armed Bandits
Shi, Chengshuai
Shen, Cong
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9603 - 9611
[8] Multi-armed Bandits with Probing
Elumar, Eray Can
Tekin, Cem
Yagan, Osman
2024 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, ISIT 2024, 2024, : 2080 - 2085
[9] Ballooning multi-armed bandits
Ghalme, Ganesh
Dhamal, Swapnil
Jain, Shweta
Gujar, Sujit
Narahari, Y.
ARTIFICIAL INTELLIGENCE, 2021, 296
[10] Decentralized Learning for Multi-player Multi-armed Bandits
Kalathil, Dileep
Nayyar, Naumaan
Jain, Rahul
2012 IEEE 51ST ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2012, : 3960 - 3965

← 1 2 3 4 5 →